Commit Graph

269 Commits

Author SHA1 Message Date
Huamin Li
43853691bc [Quantization] add an option keep_original_weights in _lower_to_native_backend (#141049)
Differential Revision: D66153809

This diff adds an option, `keep_original_weights`, so we can trace back to the original weight and bias after performing prepare_fx and convert_fx
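A hedged sketch of how this might be used; the exact placement of the `keep_original_weights` kwarg in the public API is an assumption here:

```python
import torch
import torch.nn as nn
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

model = nn.Sequential(nn.Linear(4, 4)).eval()
example_inputs = (torch.randn(1, 4),)
prepared = prepare_fx(model, get_default_qconfig_mapping(), example_inputs)
# Assumption: keep_original_weights is threaded from convert_fx down to
# _lower_to_native_backend, retaining references to the fp32 weight/bias
# so they can be traced back after conversion.
quantized = convert_fx(prepared, keep_original_weights=True)
```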

Pull Request resolved: https://github.com/pytorch/pytorch/pull/141049
Approved by: https://github.com/jerryzh168
2024-12-27 04:02:07 +00:00
Tom Ritchford
d8c8ba2440 Fix unused Python variables in test/[e-z]* (#136964)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136964
Approved by: https://github.com/justinchuby, https://github.com/albanD
2024-12-18 23:02:30 +00:00
Mikayla Gawarecki
d9576c9440 Fix failures when default is flipped for weights_only (#127627)
Tests on the XLA shard are not fixed yet, but there is an issue tracking this: https://github.com/pytorch/xla/issues/7799
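For context, a minimal sketch of the `weights_only` load path being flipped to the default (file name is illustrative):

```python
import torch

torch.save(torch.randn(3), "ckpt.pt")
# weights_only=True restricts unpickling to an allowlist of safe globals;
# this commit fixes tests that break when it becomes the default.
t = torch.load("ckpt.pt", weights_only=True)
```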

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127627
Approved by: https://github.com/albanD
ghstack dependencies: #132349
2024-08-16 00:22:43 +00:00
Oguz Ulgen
221350e3a4 Add None return type to init -- tests (#132352)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132352
Approved by: https://github.com/ezyang
ghstack dependencies: #132335, #132351
2024-08-01 15:44:51 +00:00
Aaron Gokaslan
34910f87f0 [BE]: Update ruff to v0.4.4 (#125031)
Update the ruff version to 0.4.4. This release mostly contains bug fixes for the new parser and also updates the f-string rule to be able to apply more fixes.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/125031
Approved by: https://github.com/albanD, https://github.com/malfet
2024-05-12 20:02:37 +00:00
Mikayla Gawarecki
c82fcb7b30 Add testing and fix weights_only load for quantized types and nn.Parameters with python attrs (#124330)
Adds the following to the allowed globals for the `weights_only` unpickler (a minimal usage sketch follows the lists below):
- [x] `torch._utils._rebuild_qtensor` and qtensor-related types
- [x] `torch._utils._rebuild_parameter_with_state` (used when deserializing a parameter that has user-defined attributes like `Param.foo`)

The remaining rebuild functions that have not been allowlisted are

- [x] `torch._utils._rebuild_wrapper_subclass` (allowlisted in above PR)
- [ ] `torch._utils._rebuild_device_tensor_from_numpy`
- [ ] `torch._utils._rebuild_xla_tensor` (legacy)
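A minimal sketch of what the new allowlist entries enable, assuming a parameter with a user-defined attribute now round-trips under `weights_only=True`:

```python
import torch

p = torch.nn.Parameter(torch.randn(2))
p.foo = "bar"  # user-defined attribute, serialized via _rebuild_parameter_with_state
torch.save(p, "param.pt")

# Previously this failed under the weights_only unpickler; with
# _rebuild_parameter_with_state allowlisted it should load.
loaded = torch.load("param.pt", weights_only=True)
assert loaded.foo == "bar"
```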

Pull Request resolved: https://github.com/pytorch/pytorch/pull/124330
Approved by: https://github.com/albanD
2024-04-23 04:13:26 +00:00
Aaron Gokaslan
1562dae62c [BE]: Apply RUF025 dict.fromkeys preview rule (#118637)
Simplifies and optimizes dict construction using the `fromkeys` classmethod constructor. This also makes it really obvious when all the keys will have the same static value, which could be a bug if unintentional. It is also significantly faster than using a dict comprehension. The rule is in preview, but I am adding a forward fix for when it becomes stable.
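An illustrative before/after of what RUF025 rewrites, with the usual mutable-default caveat:

```python
keys = ["a", "b", "c"]

# Before: a dict comprehension with a constant value.
d1 = {k: 0 for k in keys}

# After: dict.fromkeys is faster and makes the shared static value obvious.
d2 = dict.fromkeys(keys, 0)
assert d1 == d2

# Caveat: a mutable value is shared by every key.
d3 = dict.fromkeys(keys, [])
d3["a"].append(1)
assert d3["b"] == [1]  # all keys alias the same list
```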

Pull Request resolved: https://github.com/pytorch/pytorch/pull/118637
Approved by: https://github.com/albanD
2024-01-30 20:46:54 +00:00
rzou
7b753cc7b8 Skip some slow tests (under Dynamo) (#117389)
Otherwise these may cause timeouts.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/117389
Approved by: https://github.com/jerryzh168, https://github.com/voznesenskym
ghstack dependencies: #117318, #117320
2024-01-12 22:18:07 +00:00
Aaron Gokaslan
bd10fea79a [BE]: Enable F821 and fix bugs (#116579)
Fixes #112371

I tried to fix as many of the bugs as I could; for a few, I could not figure out the proper fix, so I left them with noqas.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/116579
Approved by: https://github.com/ezyang
2024-01-01 08:40:46 +00:00
Aaron Gokaslan
794545c11f [BE]: Enable RUF015 codebase wide (#115507)
Use constant-time access to get the first value in a collection. This replaces converting the collection to a list just to take the first item, which is linear. The rule is turned on, so it automatically autofixes and enforces this.
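An illustrative rewrite the rule performs:

```python
s = {10, 20, 30}

# Before: linear -- materializes the whole collection to take one element.
first = list(s)[0]

# After: constant time -- what RUF015 autofixes this to.
first = next(iter(s))
```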

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115507
Approved by: https://github.com/malfet
2023-12-11 15:51:01 +00:00
HDCharles
18e1a37c4e [ao] updating embedding_bag support for fx and eager (#107623)
Summary: Our docs said the dynamic embedding bag wasn't supported, but it actually is (at least to the same degree that embeddings were); it just wasn't previously tested or listed.
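A minimal FX-mode sketch of quantizing an `nn.EmbeddingBag`; the qconfig choice and shapes are assumptions:

```python
import torch
import torch.nn as nn
from torch.ao.quantization import QConfigMapping, float_qparams_weight_only_qconfig
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

class M(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.EmbeddingBag(num_embeddings=10, embedding_dim=12,
                                   include_last_offset=True)

    def forward(self, indices, offsets):
        return self.emb(indices, offsets)

m = M().eval()
qconfig_mapping = QConfigMapping().set_object_type(
    nn.EmbeddingBag, float_qparams_weight_only_qconfig)
example_inputs = (torch.randint(0, 10, (20,)),
                  torch.tensor([0, 5, 10, 15, 20]))
prepared = prepare_fx(m, qconfig_mapping, example_inputs)
quantized = convert_fx(prepared)  # emb lowered to a quantized EmbeddingBag
```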

Test Plan: python test/test_quantization.py -k "test_embedding"

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107623
Approved by: https://github.com/jerryzh168
2023-11-21 03:54:00 +00:00
Andrew Or
7c72238e4b Back out "Enable pickling model prepared with QAT qconfig" (#110392)
Summary:
D49187352 caused our model conversion and loading of the QAT checkpoint to get stuck with a Thrift timeout.

We are actively checking in the final code and model for the static-quant HTP prod model, and encountered this breakage at head on Thursday.

A Thrift timeout is not a failure, which makes it hard to bisect and find this culprit. It is also hard to set up a unit test, because the job simply times out. A better test is needed to guard downstream model conversion against upstream changes.

Our suspicion of why this diff broke us: we create many modules with QAT (in a recursive manner), but our model is not a QAT-traceable module (it is a graph with many QAT modules and floating-point modules). With functools.partial as in the original diff, we cache modules in memory, eventually taking up the machine's memory completely.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110392
Approved by: https://github.com/junesg, https://github.com/jerryzh168
2023-10-05 14:41:00 +00:00
Sindi Shkodrani
419ec3b229 Enable pickling model prepared with QAT qconfig (#109288)
Summary:
Resolving error:

AttributeError: Can't pickle local object '_add_module_to_qconfig_obs_ctr.<locals>.get_factory_kwargs_based_on_module_device'

by moving the nested function out to the module level
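The underlying pickle behavior, with names mirroring the error message:

```python
import pickle

def _add_module_to_qconfig_obs_ctr():
    def get_factory_kwargs_based_on_module_device():
        return {}
    return get_factory_kwargs_based_on_module_device

try:
    pickle.dumps(_add_module_to_qconfig_obs_ctr())
except AttributeError as e:
    print(e)  # Can't pickle local object '..._<locals>.get_factory_kwargs...'

# The fix pattern: define the function at module scope so pickle can
# reference it by its qualified name.
def get_factory_kwargs_based_on_module_device():
    return {}

pickle.dumps(get_factory_kwargs_based_on_module_device)  # works
```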

Test Plan: Added test to CI

Reviewed By: andrewor14

Differential Revision: D49187352

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109288
Approved by: https://github.com/andrewor14
2023-09-28 09:51:19 +00:00
PyTorch MergeBot
93b2036bef Revert "[quant][pt2e] store scale/zero_point as tensor attributes to support serialization (#105894)"
This reverts commit 3ca71ed735.

Reverted https://github.com/pytorch/pytorch/pull/105894 on behalf of https://github.com/huydhn due to breaking executorch tests internally ([comment](https://github.com/pytorch/pytorch/pull/105894#issuecomment-1654831950))
2023-07-28 01:16:02 +00:00
Jerry Zhang
3ca71ed735 [quant][pt2e] store scale/zero_point as tensor attributes to support serialization (#105894)
Summary:
Currently, scale/zero_point for per-tensor quantization are stored as burned-in literals, which means these values can't be serialized in the state_dict. This PR changes them to buffers/tensors so that they can be serialized.
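A hedged sketch of the buffer-based approach (the module name is hypothetical):

```python
import torch
import torch.nn as nn

class RefQuantizePerTensor(nn.Module):
    """Hypothetical sketch: store qparams as buffers rather than burned-in
    literals so they appear in (and load from) the state_dict."""
    def __init__(self, scale: float, zero_point: int):
        super().__init__()
        self.register_buffer("scale", torch.tensor(scale))
        self.register_buffer("zero_point", torch.tensor(zero_point, dtype=torch.int64))

m = RefQuantizePerTensor(0.1, 128)
assert set(m.state_dict()) == {"scale", "zero_point"}
```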

Test Plan:
python test/test_quantization.py TestQuantizePT2E


Differential Revision: [D47770963](https://our.internmc.facebook.com/intern/diff/D47770963)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105894
Approved by: https://github.com/kimishpatel
2023-07-26 20:15:06 +00:00
Justin Chu
4cc1745b13 [BE] f-stringify torch/ and scripts (#105538)
This PR is a follow-up to the pyupgrade series, converting more strings to f-strings using `flynt`.

- https://docs.python.org/3/reference/lexical_analysis.html#f-strings
- https://pypi.org/project/flynt/

Command used:

```
flynt torch/ -ll 120
flynt scripts/ -ll 120
flynt tools/ -ll 120
```

and excluded `collect_env.py`
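Representative conversions flynt applies (illustrative):

```python
name, count = "world", 3

s1 = "hello %s, count=%d" % (name, count)        # before: %-formatting
s2 = "hello {}, count={}".format(name, count)    # before: str.format
s3 = f"hello {name}, count={count}"              # after: f-string
assert s1 == s2 == s3
```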

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105538
Approved by: https://github.com/ezyang, https://github.com/malfet
2023-07-21 19:35:24 +00:00
Justin Chu
73e1455327 [BE] Enable ruff's UP rules and autoformat test/ (#105434)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105434
Approved by: https://github.com/albanD
2023-07-19 20:36:06 +00:00
Kimish Patel
90ee6a7354 [PT2][Quant] Update op names for decomposed quantized lib (#103251)
Summary:
Dynamo tracing, via dynamo.export with aten_graph, generates a graph whose nodes'
targets are instances of torch._ops.OpOverload. The quantization workflow inserts
quantize/dequantize ops that are sometimes instances of
torch._ops.OpOverload (quantize_per_tensor.tensor) and other times instances
of torch._ops.OpOverloadPacket (quantize_per_tensor), which is inconsistent.

It is also unclear whether a model is a valid exported model if it has nodes
whose target is of type torch._ops.OpOverloadPacket.

Without the op overload name attached to the target, the model fails during ExecuTorch
tracing, because ExecuTorch tracing expects node targets to be
instances of torch._ops.OpOverload and not torch._ops.OpOverloadPacket.

So, for consistency and tracing reasons, this fixes the convert pass to insert ops
that are torch._ops.OpOverload.
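For illustration, the distinction in question:

```python
import torch

packet = torch.ops.aten.add           # OpOverloadPacket: a family of overloads
overload = torch.ops.aten.add.Tensor  # OpOverload: one concrete overload

assert isinstance(packet, torch._ops.OpOverloadPacket)
assert isinstance(overload, torch._ops.OpOverload)
# ExecuTorch tracing expects node.target to be a concrete OpOverload,
# so the convert pass should insert the overload-qualified variants.
```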

Test Plan: CI

Reviewed By: jerryzh168

Differential Revision: D46342822

Pull Request resolved: https://github.com/pytorch/pytorch/pull/103251
Approved by: https://github.com/andrewor14
2023-06-15 04:37:58 +00:00
Riley Dulin
424c930f76 Add quantization lowering for nn.PixelShuffle and nn.PixelUnshuffle (#101926)
Similar to https://github.com/pytorch/pytorch/pull/96160 but for the modules
nn.PixelShuffle and nn.PixelUnshuffle.

torch.nn.PixelShuffle and torch.nn.PixelUnshuffle accept both float and quantized inputs.
However, previously we would unnecessarily dequantize quantized inputs into floats
before passing them to the function. This commit fixes this by lowering the patterns
[dequant - PixelShuffle - quant] and
[dequant - PixelUnshuffle - quant].

Test Plan:

python test/test_quantization.py TestQuantizeFxOps.test_pixel_shuffle_module
python test/test_quantization.py TestQuantizeFxOps.test_pixel_unshuffle_module

Pull Request resolved: https://github.com/pytorch/pytorch/pull/101926
Approved by: https://github.com/jerryzh168
2023-05-24 19:33:26 +00:00
Aaron Gokaslan
e2a3817dfd [BE] Enable C419 rule for any all shortcircuiting (#99890)
Apparently https://github.com/pytorch/pytorch/pull/78142 made TorchScript allow simple generator expressions, which lets us enable rules that replace unnecessary list comprehensions with generators in any/all. This was originally part of #99280, but I split it off into this PR so that it can be easily reverted should anything break.
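An illustrative rewrite C419 enables:

```python
xs = range(10**6)

# Before: the list comprehension is fully built before any() runs.
found = any([x > 5 for x in xs])

# After: the generator lets any() short-circuit at the first match.
found = any(x > 5 for x in xs)
```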

Pull Request resolved: https://github.com/pytorch/pytorch/pull/99890
Approved by: https://github.com/justinchuby, https://github.com/kit1980, https://github.com/malfet
2023-04-25 15:02:13 +00:00
Jerry Zhang
b0df0cd7cc [reland][quant][fix] Compare resnet with quantizer api with the prepare_fx and decomposed convert flow (#99355)
Summary:
Use a decomposed convert to make sure we get an exact match; this means the nodes in ResNet are
annotated correctly. Reland of https://github.com/pytorch/pytorch/pull/98905

Test Plan:
python test/test_quantization.py TestQuantizePT2EModels.test_resnet18_with_quantizer_api


Differential Revision: [D45071168](https://our.internmc.facebook.com/intern/diff/D45071168)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99355
Approved by: https://github.com/kimishpatel
2023-04-19 16:47:15 +00:00
PyTorch MergeBot
20a1788136 Revert "[quant][fix] Compare resnet with quantizer api with the prepare_fx and decomposed convert flow (#98905)"
This reverts commit 9e0df2379b.

Reverted https://github.com/pytorch/pytorch/pull/98905 on behalf of https://github.com/izaitsevfb due to Conflicts with D44918496 landed internally, blocks diff train import
2023-04-17 00:17:10 +00:00
Jerry Zhang
9e0df2379b [quant][fix] Compare resnet with quantizer api with the prepare_fx and decomposed convert flow (#98905)
Summary:
Use a decomposed convert to make sure we get an exact match; this means the nodes in ResNet are
annotated correctly.

Test Plan:
python test/test_quantization.py TestQuantizePT2EModels.test_resnet18_with_quantizer_api

Pull Request resolved: https://github.com/pytorch/pytorch/pull/98905
Approved by: https://github.com/andrewor14
2023-04-14 16:25:15 +00:00
Jerry Zhang
09ebdf44fa [quant][pt2e] Fix a bug in reference quantized module (decomposed mode) (#98903)
Summary:
Fixed quant_min/quant_max for per-channel quantized weight for the reference quantized module in decomposed mode.
This bug was triggered while onboarding an internal model.

Test Plan:
python test/test_quantization.py TestQuantizeFx.test__convert_to_reference_decomposed_fx_per_channel_quant_module

Pull Request resolved: https://github.com/pytorch/pytorch/pull/98903
Approved by: https://github.com/andrewor14
2023-04-13 21:55:45 +00:00
PyTorch MergeBot
46a31e9bab Revert "[quant][pt2e] Fix a bug in reference quantized module (decomposed mode) (#98903)"
This reverts commit a2e809f29b.

Reverted https://github.com/pytorch/pytorch/pull/98903 on behalf of https://github.com/huydhn due to Sorry for reverting your PR, but it breaks Windows tests on trunk a2e809f29b
2023-04-13 01:58:27 +00:00
Jerry Zhang
a2e809f29b [quant][pt2e] Fix a bug in reference quantized module (decomposed mode) (#98903)
Summary:
Fixed quant_min/quant_max for per-channel quantized weight for the reference quantized module in decomposed mode.
This bug was triggered while onboarding an internal model.

Test Plan:
python test/test_quantization.py TestQuantizeFx.test__convert_to_reference_decomposed_fx_per_channel_quant_module

Pull Request resolved: https://github.com/pytorch/pytorch/pull/98903
Approved by: https://github.com/andrewor14
2023-04-12 22:35:24 +00:00
maxren
483fd3351a [Quant] Add get_symmetric_qnnpack_qat_qconfig_mapping (#98569)
Differential Revision: [D44776230](https://our.internmc.facebook.com/intern/diff/D44776230/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98569
Approved by: https://github.com/andrewor14
2023-04-07 17:57:56 +00:00
Xia, Weiwen
e073979794 [Quant][FX] Add test case for lowering conv_transpose with kwargs (#97311)
**Summary**
As the title

**Test plan**
python test/test_quantization.py -k test_lowering_functional_conv_transpose_with_kwargs

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97311
Approved by: https://github.com/jerryzh168
2023-03-31 10:39:29 +00:00
Xia, Weiwen
e61b842001 [Quant][FX] lower functional conv_transpose ops (#97126)
**Summary**
Support quantizing and lowering functional `conv_transpose1d`, `conv_transpose2d` and `conv_transpose3d`.
Please note that
- `conv_transpose + relu` fusion is not supported. Remember to keep the `relu` node in the graph when lowering.
- `conv_transpose` requires the `per-tensor` scheme for weight. Use the default `qconfig_mappings` instead of the deprecated `qconfig_dict` for test cases.

**Test plan**
python test/test_quantization.py -k test_conv_transpose_not_reference
python test/test_quantization.py -k test_conv_transpose_reference
python test/test_quantization.py -k test_conv_transpose_relu_not_reference
python test/test_quantization.py -k test_conv_transpose_relu_reference

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97126
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
2023-03-31 07:17:29 +00:00
Xia, Weiwen
08766b23de [Quant][FX] lower ConvTranspose3d (#97125)
**Summary**
Enable quantization and lowering of `ConvTranspose3d`.
Add test cases for `ConvTranspose1d`, `ConvTranspose2d` and `ConvTranspose3d` since there were no such test cases.

**Test plan**
python test/test_quantization.py -k test_conv_transpose_not_reference
python test/test_quantization.py -k test_conv_transpose_reference

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97125
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
2023-03-28 11:58:29 +00:00
Xia, Weiwen
e8be6d813b [Quant][FX] Fix issue of lowering weighted functional ops with kwargs (#95865)
Fixes #95492

**Summary**
This PR fixes the issue that weighted functional ops with kwargs are not lowered correctly, since the kwargs are ignored.
These kwargs should be moved from the functional op to its corresponding prepack op, e.g., from `F.conv2d` to `quantized.conv2d_prepack`.

**Test plan**
python test/test_quantization.py -k test_lowering_functional_conv_with_kwargs
python test/test_quantization.py -k test_lowering_functional_conv_transpose_with_kwargs
python test/test_quantization.py -k test_lowering_functional_linear_with_kwargs

Pull Request resolved: https://github.com/pytorch/pytorch/pull/95865
Approved by: https://github.com/jgong5, https://github.com/supriyar
2023-03-21 05:29:03 +00:00
Catherine Lee
4519228f60 Reduce pytest blocklist part 2 (#96397)
Enable pytest for a few unique files. pytest runs tests in a different order than unittest (but still in a consistent order with respect to itself), and some tests change global state, causing other tests to fail.

`test_transpose_non_contiguous` in `test_torchinductor.py` is impacted by some other test, but I'm not sure which one, so my solution is to reset the metrics before the rest of the test is run.

`test_register_patterns` in `test_quantize_fx.py` adds extra keys to global variables, so remove them when the test is done via unittest's `addCleanup`, which also works under pytest.

pytest doesn't really have an equivalent for `load_tests`, so change it to be like `test_jit`, which imports all the classes. I also attempted to import them dynamically, but failed.

`test_public_api_surface` in `test_fx.py` checks for a backwards-compatibility classification. There is a different test in test_fx that results in `fuser_utils` being imported. pytest runs this test before `test_public_api_surface` while unittest runs it after, so pytest sees `fuser_utils` when crawling through the modules.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96397
Approved by: https://github.com/huydhn
2023-03-10 19:10:43 +00:00
andrewor14
faa4cb29b2 [Quant][fx] Create new FX-based LSTM reference module (#96343)
Summary: The previous LSTM reference module implementation did
not handle dtypes other than quint8 correctly. This is because
the internal LSTM custom module quantization used eager mode,
which did not insert the q-dq ops properly. E.g., we want the
following reference quantized model:

```
[dq -> linear1_fp32 -> q_to_qint32] -> dq -> q_to_quint8 ->
  [dq - linear2_fp32 -> q_to_quint8] -> dq -> ...
```

This requires two sets of `q - dq` pairs between two adjacent
ops that have different dtypes (linear1 and linear2). However,
these `q - dq` pairs were not inserted in the old flow, because
eager mode required users to insert Quant/DeQuantStubs manually.

This commit changes the internal LSTM custom module quantization
to use FX graph mode quantization, which automatically inserts
the `q - dq` ops that convert the dtypes between adjacent ops
correctly. However, using FX graph mode quantization here comes
with its own set of challenges that required some hacks to get
the end-to-end flow to work. These hacks are detailed in the
comments in the util functions.

Test Plan:
python test/test_quantization.py TestQuantizeFx.test_static_lstm_with_custom_fixed_qparams

This commit also updates the corresponding test to verify the
dtypes as well as the qparams in the reference quantized graph.
This test case should serve as an example for users to set up
their own LSTM reference module flows.

Reviewers: vkuzo, supriyar, jcaip

Subscribers: vkuzo, supriyar, jcaip
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96343
Approved by: https://github.com/vkuzo
2023-03-09 23:23:48 +00:00
andrewor14
e7dd9b1138 [Quant][test] Add test for mixed dtypes in the same model (#96104)
Summary: This commit adds a test for mixing multiple dtypes
for different layers in the same model. The test verifies that
FX graph mode quantization converts the dtypes correctly
between the layers.

Test Plan:
python test/test_quantization.py TestQuantizeFx.test_mixed_dtypes

Reviewers: jcaip, vkuzo, supriyar

Subscribers: jcaip, vkuzo, supriyar
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96104
Approved by: https://github.com/jcaip
2023-03-08 17:08:12 +00:00
Jiaxu Zhu
08fb13db65 [Quant] Add lowering for pixel_unshuffle/narrow (#96160)
Summary:
torch.nn.functional.pixel_unshuffle and torch.narrow accept both float
and quantized inputs. However, previously we would unnecessarily
dequantize quantized inputs into floats before passing them to
the function. This commit fixes this by lowering the patterns
[dequant - pixel_unshuffle - quant] and
[dequant - narrow - quant].

Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps.test_pixel_unshuffle
```

```
python test/test_quantization.py TestQuantizeFxOps.test_narrow
```

Differential Revision: D43858199

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96160
Approved by: https://github.com/andrewor14
2023-03-08 05:25:03 +00:00
andrewor14
a3b505c55e [Quant] Fix setting fixed qparams for inner LSTM ops (#95537)
Summary: The existing util function did not quantize all inner
ops in the quantizable LSTM module, resulting in the error
"Could not run X with arguments from the 'QuantizedCPU' backend."
This commit fixes this by ensuring that all the other ops whose
qparams were not specifically configured are still quantized as
before, as in `torch.ao.nn.quantizable.LSTM.from_float`.

Test Plan: This commit also adds an additional check in the test
to ensure that the final converted model is in fact quantized,
in addition to just checking the qparams in the observers have
the right values.

python test/test_quantization.py TestQuantizeFx.test_static_lstm_with_custom_fixed_qparams

Reviewers: vkuzo

Subscribers: vkuzo, supriyar
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95537
Approved by: https://github.com/vkuzo
2023-02-27 19:08:51 +00:00
andrewor14
4fc277c338 [Quant] Add lowering for pixel_shuffle (#94769)
Summary: `torch.nn.functional.pixel_shuffle` accepts both float
and quantized inputs. However, previously we would unnecessarily
dequantize quantized inputs into floats before passing them to
the function. This commit fixes this by lowering the pattern
[dequant - pixel_shuffle - quant].

Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_pixel_shuffle

Reviewers: vkuzo

Subscribers: vkuzo, supriyar
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94769
Approved by: https://github.com/vkuzo
2023-02-17 23:11:17 +00:00
Xuehai Pan
046e88a291 [BE] [3/3] Rewrite super() calls in test (#94592)
Rewrite calls to the Python built-in `super()`. Only non-semantic changes are applied.

- #94587
- #94588
- #94592

Also, methods with only a `super()` call are removed:

```diff
class MyModule(nn.Module):
-   def __init__(self):
-       super().__init__()
-
    def forward(self, ...):
        ...
```

Some cases where the change would alter semantics are left unchanged, e.g.:

f152a79be9/caffe2/python/net_printer.py (L184-L190)

f152a79be9/test/test_jit_fuser_te.py (L2628-L2635)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94592
Approved by: https://github.com/ezyang, https://github.com/seemethere
2023-02-12 22:20:53 +00:00
Jerry Zhang
2394e6baa9 [quant][fx] Change prepare_fx and convert_fx to preserve the GraphModule type of input (#94412)
Summary:
Previously, prepare_fx returned an ObservedGraphModule and convert_fx returned a QuantizedGraphModule.
This was done to preserve attributes, since torch.fx.GraphModule did not preserve them. After https://github.com/pytorch/pytorch/pull/92062
we preserve `model.meta`, so we can now store the attributes in model.meta to preserve them.

With this, we don't need to create a new type of GraphModule in these functions and can use GraphModule directly. This
is useful for quantization in the PyTorch 2.0 flow: if other transformations use GraphModule as well, the quantization passes will be composable with them.
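A minimal sketch of the resulting behavior (the qconfig and model are illustrative):

```python
import torch
import torch.nn as nn
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

m = nn.Sequential(nn.Linear(4, 4)).eval()
prepared = prepare_fx(m, get_default_qconfig_mapping(), (torch.randn(1, 4),))
quantized = convert_fx(prepared)

# Both are plain torch.fx.GraphModules now; quantization attributes are
# carried in .meta rather than on Observed/Quantized subclasses.
assert isinstance(prepared, torch.fx.GraphModule)
assert isinstance(quantized, torch.fx.GraphModule)
```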

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestQuantizeFxModels
python test/test_quantization.py TestQuantizePT2E

Imported from OSS

Differential Revision: D42979722

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94412
Approved by: https://github.com/vkuzo
2023-02-09 23:03:23 +00:00
Aaron Gokaslan
1e2d82b8e4 [BE] Merge isinstance calls together (#94419)
Simplifies and speeds up isinstance calls by checking for multiple types at the same time.
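An illustrative merge:

```python
x = 3.14

# Before: two separate isinstance calls.
if isinstance(x, int) or isinstance(x, float):
    pass

# After: one call with a tuple of types.
if isinstance(x, (int, float)):
    pass
```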

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94419
Approved by: https://github.com/ezyang
2023-02-09 00:47:26 +00:00
Vasiliy Kuznetsov
f15ab8a7f2 AO migration: replace torch internal callsites (#94170)
Summary:

Do the following renames:
`torch.quantization` -> `torch.ao.quantization`
`torch.nn.quantized` -> `torch.ao.nn.quantized`
`torch.nn.quantizable` -> `torch.ao.nn.quantizable`
`torch.nn.qat` -> `torch.ao.nn.qat`
`torch.nn.intrinsic` -> `torch.ao.nn.intrinsic`

And then, do
`torch.ao.nn.quantized._reference` -> `torch.ao.nn.quantized.reference` to clean up the aftermath of https://github.com/pytorch/pytorch/pull/84974

Then, manually update `test/test_module_init.py` to fix hanging whitespace due to the replace.

Run this script to do the replacements: https://gist.github.com/vkuzo/7f7afebf8c31b9ba48306223e68a1c82
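For illustration, the rename pattern at a call site (a hedged before/after sketch):

```python
# Before the migration:
#   from torch.quantization import quantize_dynamic
#   from torch.nn.quantized import Linear

# After the migration:
from torch.ao.quantization import quantize_dynamic
from torch.ao.nn.quantized import Linear
```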

This is for https://github.com/pytorch/pytorch/issues/81667

Test plan: CI
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94170
Approved by: https://github.com/jerryzh168
2023-02-07 02:32:23 +00:00
leslie-fang-intel
0f802eedc2 [Quant][FX] Lower QConvAddReLU2d for onednn backend (#91155)
**Summary**
Add quantization mappings for QConvAddReLU2d for int8 inference for the onednn backend. The fusion and lowering are supported only in FX mode.

**Test plan**
```
python -m pytest test_quantization.py -k test_fuse_conv_bn_add_relu_onednn
python -m pytest test_quantization.py -k test_fuse_conv_bn_add_relu_by_default
python -m pytest test_quantization.py -k test_fuse_conv_bn_add_relu_lowering
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91155
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
2023-02-01 01:18:52 +00:00
leslie-fang-intel
ef4118e435 [Quant][FX] Lower QConvAdd2d for onednn backend (#91153)
**Summary**
Add quantization mappings for QConvAdd2d for int8 inference for the onednn backend. The fusion and lowering are supported only in FX mode.

**Test plan**
```
python -m pytest test_quantization.py -k test_fuse_conv_bn_add_relu_onednn
python -m pytest test_quantization.py -k test_fuse_conv_bn_add_relu_by_default
python -m pytest test_quantization.py -k test_fuse_conv_bn_add_relu_lowering
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91153
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
2023-02-01 01:14:12 +00:00
Jerry Zhang
61457671a5 [quant][fx][be] Remove _input_output_observed from backend_config (#92589)
Summary:
This is no longer needed; we can use the dtype to decide whether an observer is needed or not.

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Pull Request resolved: https://github.com/pytorch/pytorch/pull/92589
Approved by: https://github.com/jcaip
2023-01-27 22:17:05 +00:00
Xia, Weiwen
6fa84fdea2 [FX][Quant] Enable FX quant for patterns like x.view(x.size(...), ...) (#90001)
**Summary**
This work continues with https://github.com/pytorch/pytorch/pull/83784 by @vkuzo and includes all the changes in that PR.
Quote from https://github.com/pytorch/pytorch/pull/83784:
> Issue #83658 reports that ops followed by a certain pattern of `view` and `size` ops were not quantized correctly by FX graph mode quantization.
Before this PR, the "size" op was in the "op shares qparams with input" category, and the code assumed that the input of this op has the same dtype as its output. This led to incorrectly propagating the `int` dtype as the output of whichever op was preceding the `view` op, which in turn made that op blocklisted from quantization.

> The fix is to create a new category of ops which work on different dtypes of tensors but are not observed. This PR does so for `size`, and also for `shape` since it works the same way.
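A minimal module exhibiting the pattern from the title (shapes are illustrative):

```python
import torch
import torch.nn as nn

class M(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(16, 16)

    def forward(self, x):
        x = self.linear(x)
        # size() yields ints; that int "dtype" must not be propagated back
        # as the linear op's output dtype, or linear is skipped by quantization.
        return x.view(x.size(0), -1)
```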

**Note**: This PR needs https://github.com/pytorch/pytorch/pull/91297 to be landed first otherwise there is a UT failure.

**Test plan**
```
python test/test_quantization.py -k test_linear_size_view
python test/test_quantization.py -k test_linear_shape_view
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90001
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
2023-01-27 07:56:29 +00:00
Xia, Weiwen
1d03a6a901 [Quant][Fx] Fix issue: qconfig_mappings of onednn backend are not correctly set for fused modules (#91297)
**Summary**
For the onednn quantization backend only.
Currently, FX fusion requires that all separate ops in a fused module/op have the same `qconfig`. To support `linear - leaky_relu` and `linear - tanh` fusion with the onednn backend, we previously explicitly set the same `qconfig` on `linear`, `leaky_relu` and `tanh`. However, this brings two problems:
- It breaks fusion of `linear - relu`, since `relu` does not have the same `qconfig` as `linear`. It also does not look good to set a `qconfig` on all these ops; they should use a global `qconfig` by default.
- `Tanh` requires `fixed_qparams_qconfig`, otherwise it is not quantized. So, we cannot set another `qconfig` on `tanh`.

There does not seem to be a straightforward way to solve both problems. This PR fixes them as follows:
- Do not set a `qconfig` on these ops, so that they use the global `qconfig` and `linear - relu` and `linear - leaky_relu` are fused correctly.
- Users manually set the same `qconfig` on `linear` and `tanh` when they want to fuse `linear - tanh` with the onednn backend.

A known issue still exists: users cannot fuse `linear - tanh` and quantize a standalone `tanh` at the same time.

**Test plan**
python test/test_quantization.py -k test_qconfig_dict_with_fused_modules

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91297
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
2023-01-26 09:55:34 +00:00
Jerry Zhang
1464db08b4 [quant][pt2e] Support setting qconfig by module_type (#92355)
Summary:
This PR supports the following feature for QConfigMapping:
```
qconfig_mapping = QConfigMapping().set_object_type(torch.nn.Conv2d, qconfig)
backend_config = get_qnnpack_pt2e_backend_config()
m = prepare_pt2e(m, qconfig_mapping, example_inputs, backend_config)
```
which means users want all calls to `torch.nn.Conv2d` to use `qconfig`. Note this is only verified for the case when the module is broken down to a single aten op right now (e.g., torch.nn.Conv2d becomes a torch.ops.aten.convolution op when traced through). We will need to support more complicated modules that break down into multiple operators later (e.g., MaxPool).

Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_qconfig_module_type


Pull Request resolved: https://github.com/pytorch/pytorch/pull/92355
Approved by: https://github.com/jcaip
2023-01-20 03:18:21 +00:00
Jerry Zhang
ec3941ada6 [quant][fx] Add support for GRU in fx graph mode quantization (#91976)
Summary:
Might be needed by a Meta-internal use case.

Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_rnn

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91976
Approved by: https://github.com/jcaip
2023-01-13 07:00:12 +00:00
Harshit Khaitan
ffbd13b654 Fix for swap_custom_module_to_observer doing duplicate swaps on the same node.target (#91905)
Summary:
This is a fix for the following issue:
"When two nodes in a model have the same dtypes / node.target, the torch quantization prepare_fx flow does not check for duplicates and tries to do a custom module swap twice. When it attempts to swap the same target a second time, swap_custom_module_to_observed detects the observed module instead of the float module class on the target, and fails on an assertion."

The added unit test demonstrates a simple example where it fails in the absence of this fix.

Test Plan: buck test mode/dev //caffe2/test:quantization_fx -- --exact 'caffe2/test:quantization_fx - test_custom_module_class_input_has_duplicate_nodes (quantization.fx.test_quantize_fx.TestQuantizeFx)'

Reviewed By: vkuzo

Differential Revision: D42023273

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91905
Approved by: https://github.com/jerryzh168
2023-01-12 05:24:38 +00:00
Xia, Weiwen
a5eb564ba4 [Quant] lower fused LinearTanh for onednn backend (#89188)
**Summary**
Add a fuser method and quantization mappings for `QLinearTanh` for int8 inference for the onednn backend. The fusion and lowering are supported only in FX mode.

**Test plan**
python test_quantization.py TestFuseFx TestQuantizeFx

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89188
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
2022-12-20 01:30:21 +00:00