Commit Graph

902 Commits

Author SHA1 Message Date
Kazuaki Ishizaki
b5f9696d81 Fix typo under torch directory (#110824)
This PR fixes the typo `the the` in comments and exception messages in files under the `torch` directory.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110824
Approved by: https://github.com/H-Huang
2023-10-09 19:16:43 +00:00
Jeff Daily
e8f1f4ed66 [quant][pt2][ROCm] follow-up PR 109908 for miopen_batch_norm (#110653)
Fixes recently broken unit tests caused by PR #109908; cuDNN and MIOpen have separate batch norm functions.

```
2023-10-05T09:35:01.6606614Z _______________ TestQuantizePT2EQAT.test_qat_conv_bn_fusion_cuda _______________
2023-10-05T09:35:01.6606948Z Traceback (most recent call last):
2023-10-05T09:35:01.6607362Z   File "/var/lib/jenkins/pytorch/test/quantization/pt2e/test_quantize_pt2e_qat.py", line 323, in test_qat_conv_bn_fusion_cuda
2023-10-05T09:35:01.6607767Z     self._verify_symmetric_xnnpack_qat_graph(
2023-10-05T09:35:01.6608217Z   File "/var/lib/jenkins/pytorch/test/quantization/pt2e/test_quantize_pt2e_qat.py", line 130, in _verify_symmetric_xnnpack_qat_graph
2023-10-05T09:35:01.6608658Z     self._verify_symmetric_xnnpack_qat_graph_helper(
2023-10-05T09:35:01.6609105Z   File "/var/lib/jenkins/pytorch/test/quantization/pt2e/test_quantize_pt2e_qat.py", line 173, in _verify_symmetric_xnnpack_qat_graph_helper
2023-10-05T09:35:01.6609623Z     m = prepare_qat_pt2e(m, quantizer)
2023-10-05T09:35:01.6610171Z   File "/opt/conda/envs/py_3.8/lib/python3.8/site-packages/torch/ao/quantization/quantize_pt2e.py", line 178, in prepare_qat_pt2e
2023-10-05T09:35:01.6610561Z     _fuse_conv_bn_qat(model)
2023-10-05T09:35:01.6611072Z   File "/opt/conda/envs/py_3.8/lib/python3.8/site-packages/torch/ao/quantization/pt2e/qat_utils.py", line 501, in _fuse_conv_bn_qat
2023-10-05T09:35:01.6611497Z     m = _fuse_conv_bn_qat_helper(m, is_cuda=True)
2023-10-05T09:35:01.6612065Z   File "/opt/conda/envs/py_3.8/lib/python3.8/site-packages/torch/ao/quantization/pt2e/qat_utils.py", line 575, in _fuse_conv_bn_qat_helper
2023-10-05T09:35:01.6612492Z     _get_conv_bn_getitem_nodes(r.replacements)
2023-10-05T09:35:01.6613058Z   File "/opt/conda/envs/py_3.8/lib/python3.8/site-packages/torch/ao/quantization/pt2e/qat_utils.py", line 383, in _get_conv_bn_getitem_nodes
2023-10-05T09:35:01.6613465Z     assert bn_node is not None
2023-10-05T09:35:01.6613716Z AssertionError
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110653
Approved by: https://github.com/jerryzh168, https://github.com/pruthvistony
2023-10-06 15:30:55 +00:00
Jerry Zhang
7b6042111f [quant][pt2e] Refactor conv related annotation for XNNPACKQuantizer (#110308)
Summary:
Since we changed the IR we are working with to the pre-autograd aten IR, it is now easier
to use plain pattern matching instead of relying on source_matcher_utils. This
PR refactors the annotation for conv to use aten ops directly.

Also fixed the reentrant test after this change.

Test Plan:
python test/test_quantization.py TestQuantizePT2E

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110308
Approved by: https://github.com/kimishpatel
2023-10-05 22:36:18 +00:00
Andrew Or
7c72238e4b Back out "Enable pickling model prepared with QAT qconfig" (#110392)
Summary:
D49187352 caused our model conversion and loading of QAT checkpoints to get stuck with a Thrift timeout.

We are actively checking in the final code and model for the static-quant HTP prod model, and encountered this breakage at head on Thursday.

A Thrift timeout is not a hard failure, which makes it hard to bisect and find the culprit. It is also hard to set up a unit test, because the job simply times out. Better tests are needed to guard downstream model conversion against upstream changes.

Our suspicion for why this diff broke us: we create a lot of modules with QAT (in a recursive manner), but our model is not a QAT-traceable module (it is a graph with many QAT modules and floating-point modules). With functools.partial as in the original diff, we end up caching modules in memory, eventually exhausting the machine's memory.
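
A minimal, hypothetical illustration of the suspected mechanism (all names here are made up, not the actual diff's code): a cached `functools.partial` keeps a strong reference to the module it closes over, so caching one partial per module prevents any of the modules from being freed.

```python
import functools

class FakeModule:
    """Stand-in for a QAT submodule (hypothetical)."""
    def __init__(self, name):
        self.name = name

def make_observer(module, **factory_kwargs):
    return (module.name, factory_kwargs)

cache = []
for i in range(1000):
    m = FakeModule(f"m{i}")
    # Each partial holds a strong reference to `m`; as long as `cache`
    # is alive, none of the 1000 modules can be garbage-collected.
    cache.append(functools.partial(make_observer, m))
```
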
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110392
Approved by: https://github.com/junesg, https://github.com/jerryzh168
2023-10-05 14:41:00 +00:00
andrewor14
62cad5b5b0 [quant][pt2] Support cudnn_batch_norm in QAT fusion (#109908)
Summary: Today, we get different batch norm ops depending on
the device the model is placed on at export time. Exporting
`model.cpu()` gives `_native_batch_norm_legit`, while exporting
`model.cuda()` gives `cudnn_batch_norm`. QAT fusion currently
only supports the former and silently ignores the latter. This
commit fixes this by additionally matching on the latter op
during QAT fusion.
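
A minimal sketch of what such matching can look like (the helper name is illustrative, not the PR's actual code):

```python
import torch

# Which batch norm op appears in the exported graph depends on the device,
# so a QAT fusion pass has to match both variants.
_BN_OPS = (
    torch.ops.aten._native_batch_norm_legit.default,  # model.cpu() export
    torch.ops.aten.cudnn_batch_norm.default,          # model.cuda() export
)

def _is_supported_batch_norm(node) -> bool:
    """True if an FX node calls either batch norm variant."""
    return node.op == "call_function" and node.target in _BN_OPS
```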

Test Plan:
python test/test_quantization.py TestQuantizePT2EQAT.test_qat_conv_bn_fusion
python test/test_quantization.py TestQuantizePT2EQAT.test_qat_conv_bn_relu_fusion

Reviewers: jerryzh168, kimishpatel

Subscribers: jerryzh168, kimishpatel, supriyar

Differential Revision: [D49615145](https://our.internmc.facebook.com/intern/diff/D49615145)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109908
Approved by: https://github.com/jerryzh168
2023-10-05 04:08:44 +00:00
Fabrice Pont
053367b1ed fix: flake8-bugbear code B024 (#107265)
See #106571 item B024

This fix concerns the addition of `abstractmethod` to methods declared inside abstract classes.
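
For context, a minimal before/after sketch of a B024 fix (the class names are hypothetical):

```python
from abc import ABC, abstractmethod

# Before: an ABC with no abstract members trips flake8-bugbear B024
# ("abstract base class has no abstract methods").
class QuantizerBase(ABC):
    def annotate(self, model):
        raise NotImplementedError

# After: marking the method abstract documents the intent and satisfies B024.
class FixedQuantizerBase(ABC):
    @abstractmethod
    def annotate(self, model):
        ...
```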

Should I also include PEP8-compliant reformatting of the files I had to modify?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107265
Approved by: https://github.com/kit1980
2023-10-04 23:52:52 +00:00
Max Ren
08c7dcda65 [pt2e][xnnpack_quantizer] quantize "mul" (#110428)
Adding "mul" to list of partitions that are supported by the quantizer. This shows up in EDSR, where we still want to quantize the mul op

Differential Revision: [D49850151](https://our.internmc.facebook.com/intern/diff/D49850151/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110428
Approved by: https://github.com/jerryzh168
ghstack dependencies: #110427
2023-10-04 05:11:53 +00:00
Max Ren
66202ed29c [pt2e][xnnpack_quantizer] add util function to convert scalars to attrs (#110427)
Jerry provided a notebook solution for converting scalars to attrs so that they may be properly quantized:

https://fburl.com/anp/kzz7tfn1

Adding this pass as a util function in xnnpack_quantizer_utils.py

Differential Revision: [D49850150](https://our.internmc.facebook.com/intern/diff/D49850150/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110427
Approved by: https://github.com/jerryzh168
2023-10-04 05:11:53 +00:00
Jerry Zhang
c9b8e06060 [quant] Enable quantization for wav2letter (#109830)
Summary:
Also added annotation support for conv1d_relu and conv1d in XNNPACKQuantizer. The quantized results still
match the fx quant path (which didn't quantize conv1d), so tests are not disabled.

Test Plan: with-proxy buck2 run executorch/examples/quantization:example -- -m=w2l --verify

Differential Revision: D49479546

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109830
Approved by: https://github.com/kimishpatel
2023-09-29 00:47:34 +00:00
Jerry Zhang
e3eb1d92d8 [quant][docs] Add documentation for prepare_pt2e, prepare_qat_pt2e and convert_pt2e (#110097)
Summary:
As titled.

Test Plan:
.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110097
Approved by: https://github.com/kimishpatel
2023-09-28 18:24:58 +00:00
Sindi Shkodrani
419ec3b229 Enable pickling model prepared with QAT qconfig (#109288)
Summary:
Resolving error:

AttributeError: Can't pickle local object '_add_module_to_qconfig_obs_ctr.<locals>.get_factory_kwargs_based_on_module_device'

by moving the nested function out to the module level.
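
A self-contained illustration of the failure mode and the fix (function names are made up):

```python
import pickle

def broken_factory():
    # Defined inside another function: pickle cannot serialize `local_fn`,
    # which is exactly the "Can't pickle local object" error above.
    def local_fn():
        return 0
    return local_fn

def _module_level_fn():
    return 0

def fixed_factory():
    # Module-level functions are picklable by their qualified name.
    return _module_level_fn

pickle.dumps(fixed_factory())     # works
# pickle.dumps(broken_factory())  # AttributeError: Can't pickle local object
```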

Test Plan: Added test to CI

Reviewed By: andrewor14

Differential Revision: D49187352

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109288
Approved by: https://github.com/andrewor14
2023-09-28 09:51:19 +00:00
Jerry Zhang
1b51d29b66 [quant][pt2e] Enable constant folding for quantize ops (#109343)
Summary:
This PR adds constant folding for quantize ops so that instead of storing an fp32 weight in the
quantized model, we get an int8/int16/etc. weight.
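
A hand-rolled sketch of what the fold computes (the values are illustrative; the actual pass evaluates the quantize op on the graph's constant weight at convert time):

```python
import torch

w_fp32 = torch.randn(8, 4)
scale, zero_point = 0.02, 0

# Folded at convert time: the model stores this int8 tensor instead of
# the fp32 weight plus a quantize op.
w_int8 = torch.clamp(
    torch.round(w_fp32 / scale) + zero_point, min=-128, max=127
).to(torch.int8)

# Only the dequantize of the folded constant remains at runtime.
w_dq = (w_int8.to(torch.float32) - zero_point) * scale
```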

Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_fold_quantize

Will also verify in executorch later.

Differential Revision: [D49399210](https://our.internmc.facebook.com/intern/diff/D49399210)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109343
Approved by: https://github.com/kimishpatel, https://github.com/jgong5
2023-09-27 06:04:45 +00:00
Jiaxu Zhu
595af261b2 [ao] Support Subclasses of FloatFunctional in eager mode prepare (#109646)
Summary: As titled: if a module subclasses `nnq.FloatFunctional`, add observers to it as for `nnq.FloatFunctional` itself.

Test Plan: CI

Differential Revision: D49431968

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109646
Approved by: https://github.com/jerryzh168
2023-09-20 08:09:55 +00:00
Kimish Patel
73ac814148 [Pytorch][quant] Move xnnpack quantizer to use aten.linear (#109254)
Summary:
Now that quantization works on the pre-dispatch aten IR, moving to the full set
of aten ops is OK. Plus, when tracing models like ViT, the linear
projections of k, q, v use functional.linear and not nn.Linear,
which results in not being able to extract the nodes corresponding to linear.
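
A minimal illustration of the gap (the module is hypothetical): there is no nn.Linear submodule to source-match here, but the call still lowers to aten.linear in the traced graph.

```python
import torch
import torch.nn.functional as F

class Projections(torch.nn.Module):
    def __init__(self, dim=8):
        super().__init__()
        self.wq = torch.nn.Parameter(torch.randn(dim, dim))

    def forward(self, x):
        # functional.linear, as in ViT's q/k/v projections: matchable by
        # aten op, but invisible to nn.Linear source matching.
        return F.linear(x, self.wq)
```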

Test Plan:
quant tests

Differential Revision: [D49252194](https://our.internmc.facebook.com/intern/diff/D49252194)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109254
Approved by: https://github.com/jerryzh168
2023-09-18 20:26:44 +00:00
Jerry Zhang
3943afc94e [quant][be] Remove unused APIs (#109342)
Summary:
As titled.

Test Plan:
python test/test_quantization.py TestQuantizePT2E

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109342
Approved by: https://github.com/kimishpatel, https://github.com/andrewor14
2023-09-15 16:07:01 +00:00
Jerry Zhang
41e2189843 [quant] Remove reference representation rewrite for adaptive_avg_pool2d (#108924)
Summary:
Integer adaptive_avg_pool2d is not well defined due to the different possible ways of rounding an fp32 value to an integer value, and
this op isn't too critical for numerics (since it doesn't appear too often), so we'll skip it for now.

We might need to revert the changes that add the integer impl for the adaptive_avg_pool op as well.

Test Plan:
python test/test_quantization.py TestQuantizePT2ERepresentation

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108924
Approved by: https://github.com/kimishpatel
2023-09-14 10:18:36 +00:00
Jerry Zhang
cf26e5575d [quant][be] Reduce warnings in tests (#108922)
Summary:
As titled.

Test Plan:
python test/test_quantization.py TestQuantizePT2E

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108922
Approved by: https://github.com/andrewor14
ghstack dependencies: #108920, #108921
2023-09-12 21:54:33 +00:00
Jerry Zhang
b01b934aca [quant][be] Cleanup xnnpack_quantizer implementation (#108921)
Summary:
As titled.

Test Plan:
python test/test_quantization.py TestQuantizePT2E

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108921
Approved by: https://github.com/andrewor14
2023-09-12 19:28:41 +00:00
Jerry Zhang
241e84bf98 [quant][be] Rewrite xnnpack_quantizer_utils.py to use decorators (#108920)
Summary:
As titled.

Test Plan:
python test/test_quantization.py TestQuantizePT2E

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108920
Approved by: https://github.com/kimishpatel
2023-09-12 00:09:13 +00:00
Andrew Or
e8a402c56e [quant][pt2] Fix and rename move_model_to_eval (#108891)
Summary:
This commit fixes two silent correctness problems with
the current implementation of `move_model_to_eval`:

(1) Previously the user had to manually call `eliminate_dead_code`
before calling `move_model_to_eval`, otherwise the dropout pattern
won't actually get eliminated. This is because subgraph rewriter
complains the match is not self-contained, and so silently does
not do the replacement.

(2) We wish to error when the user calls `model.train()` or
`model.eval()` on an exported model. This error is raised
correctly immediately after export today, but no longer raised
after the user calls prepare or convert.

We fix (1) by moving the `eliminate_dead_code` call into
`move_model_to_eval`, and fix (2) by ensuring the respective
errors are thrown after prepare and convert as well.

Additionally, this commit renames `move_model_to_eval` to
`move_exported_model_to_eval` to be more explicit.

bypass-github-export-checks

Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_disallow_eval_train
python test/test_quantization.py TestQuantizePT2E.test_move_exported_model_to_eval

Imported from OSS

Differential Revision: D49097293

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108891
Approved by: https://github.com/jerryzh168
2023-09-11 15:37:01 +00:00
Jerry Zhang
b0de6a8002 [quant][executorch] Support inception_v4 in examples (#108382)
Summary: Verified that pt2e quant flow matches the fx flow with executorch backend config

Test Plan:
with-proxy buck2 run executorch/examples/quantization:example -- -m=ic4 --verify

```
[INFO 2023-08-31 16:08:06,923 example.py:77] prepare sqnr: inf
[INFO 2023-08-31 16:08:06,932 example.py:81] quant diff max: 0.0
[INFO 2023-08-31 16:08:06,936 example.py:85] quant sqnr: inf
```

full output: https://www.internalfb.com/intern/paste/P818520579/

Differential Revision: D48889075

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108382
Approved by: https://github.com/kimishpatel
2023-09-08 17:39:31 +00:00
Paul Zhang
51c2b587c9 Back out "[PyPer][BE] Fix test_scripted_module in StatCollector" (#108588)
Differential Revision: D48908507

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108588
Approved by: https://github.com/jerryzh168
2023-09-08 14:33:58 +00:00
Kimish Patel
c1877e99c5 [Quant] Move to BFS instead of DFS to check for connectedness (#108572)
Summary:
Using DFS to check whether two nodes are connected was very slow.
Using BFS instead makes it much faster.
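
A minimal sketch of an iterative BFS reachability check (names are illustrative; the real code walks torch.fx nodes), which avoids the deep recursion of the DFS version:

```python
from collections import deque

def is_connected(source, target, successors) -> bool:
    """`successors` maps each node to the nodes it feeds into."""
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for nxt in successors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

# is_connected("conv", "relu", {"conv": ["bn"], "bn": ["relu"]}) -> True
```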

Test Plan:
https://gist.github.com/leslie-fang-intel/9cd828623f567a3afbf41564d3546398

Differential Revision: [D48971710](https://our.internmc.facebook.com/intern/diff/D48971710)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108572
Approved by: https://github.com/jerryzh168, https://github.com/osalpekar
2023-09-07 00:26:28 +00:00
Jerry Zhang
32a16d4999 [quant][pt2e] Support int16 quantization (#108453)
Summary:
Previously we could only use native PyTorch int dtypes that have corresponding quantized dtypes (e.g. quint8, qint8). This
PR removes that assumption in observers/fake_quants so that users can use all PyTorch native dtypes (except int64, which we can add later if needed);
the main addition here is int16.
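
A hedged usage sketch, assuming an observer accepts the new dtype with explicit bounds as described:

```python
import torch
from torch.ao.quantization.observer import MinMaxObserver

# int16 with an explicit quant range; previously only dtypes with a
# corresponding quantized dtype (quint8, qint8, ...) were accepted.
obs = MinMaxObserver(dtype=torch.int16, quant_min=-(2**15), quant_max=2**15 - 1)
obs(torch.randn(4, 4))                      # record min/max statistics
scale, zero_point = obs.calculate_qparams()
```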

Test Plan:
python test/test_quantization.py TestQuantizePT2E

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108453
Approved by: https://github.com/kimishpatel
2023-09-06 19:31:20 +00:00
Kimish Patel
ffc0c46092 [Quantization] Add metadata porting for nodes added by quantization (#107107)
Summary:
This diff adds metadata to Q-DQ nodes by inferring the
quantization intent from node annotations. Annotations on a node are the
way for the user to specify how a node or subgraph is supposed to be
quantized. We use that information to copy metadata onto Q/DQ
nodes from the appropriate nodes.

Differential Revision: [D48488416](https://our.internmc.facebook.com/intern/diff/D48488416)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107107
Approved by: https://github.com/jerryzh168
ghstack dependencies: #107105, #107106, #107899, #107900
2023-09-02 06:38:14 +00:00
Kimish Patel
eb67c452c8 [Quant] Add DQ duplication pass (#107900)
Summary:
During the convert step, observers are first replaced by a Q-DQ pair. In some
scenarios, like the following, the output DQ has a fan-out:

                 ---> OP2 -> Q -> DQ
                /
OP -> Q -> DQ -
                \
                 ---> OP3 -> Q -> DQ

If either OP2 or OP3 is configured to be quantized, then its input
is expected to be quantized. In that case the quantized equivalent of a
pattern that the quantizer asked to be quantized should look like
[DQ -> {pattern} -> Q]. However, in a scenario like the above where the DQ node
is shared between multiple "quantized" patterns, the boundary of a "quantized"
pattern is not clear, because the DQ now belongs to multiple quantized
patterns.

This poses a challenge for:
- Porting metadata: it is unclear which "quantized" partition the DQ node belongs to.
- The quantized representation, which likewise needs to identify a
self-contained quantized pattern that is replaced by an equivalent pattern
capturing the compute in the quantized precision. (See the sketch below.)
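
A minimal sketch of the duplication idea on a torch.fx graph (the function name is illustrative, not the pass's actual API):

```python
import torch.fx

def duplicate_shared_dq(graph: torch.fx.Graph, dq_node: torch.fx.Node) -> None:
    """Give each extra user of a shared DQ node its own copy, so every
    quantized pattern owns a self-contained [DQ -> op -> Q] region."""
    users = list(dq_node.users)
    for user in users[1:]:  # the first user keeps the original DQ
        with graph.inserting_after(dq_node):
            new_dq = graph.node_copy(dq_node)
        user.replace_input_with(dq_node, new_dq)
    graph.lint()
```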

Test Plan:
test_duplicate_dq_pass

Differential Revision: [D48663147](https://our.internmc.facebook.com/intern/diff/D48663147)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107900
Approved by: https://github.com/jerryzh168, https://github.com/andrewor14, https://github.com/leslie-fang-intel
ghstack dependencies: #107105, #107106, #107899
2023-09-02 06:20:03 +00:00
Kimish Patel
f8d1ca9835 [Quant] Bug fix (#107899)
Summary:
When two layers are quantized differently, observer map update updates
map for key (observed_node, node), whereas it should really be
(original_input, node)

Test Plan:
The next diff adds a test that fails without this fix.

Differential Revision: [D48663145](https://our.internmc.facebook.com/intern/diff/D48663145)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107899
Approved by: https://github.com/jerryzh168
ghstack dependencies: #107105, #107106
2023-09-02 06:20:03 +00:00
Kimish Patel
37b0d76e35 [Quantization] Make annotation util functions return annotated nodes (#107106)
Summary:
Having annotation functions return the nodes they annotated is useful,
specifically for adding "quantization_tag" to those nodes.

Test Plan:
CI

Differential Revision: [D48488415](https://our.internmc.facebook.com/intern/diff/D48488415)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107106
Approved by: https://github.com/jerryzh168
ghstack dependencies: #107105
2023-09-02 06:19:55 +00:00
Kimish Patel
99168c1fa9 [Quant] Use input_qspec_map for weight quantization of linear (#107105)
Summary:
In preparation for the metadata porting diff, weight
quant annotation must happen via edge quantization, i.e. input_qspec_map.

Reason: metadata is ported by associating a DQ node's metadata with its
consumer, and a Q node's metadata with its producer.
Furthermore, such porting must be qualified by user intent, i.e. whether
the consumer of the DQ, or the producer of the Q, actually specified an intent
to quantize.

By making the quantization annotation on the linear node's weight via
input_qspec_map, we can associate the DQ of [weight -> Q -> DQ]
with the linear module.

Test Plan:
CI

Differential Revision: [D48488414](https://our.internmc.facebook.com/intern/diff/D48488414)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107105
Approved by: https://github.com/jerryzh168
2023-09-02 06:19:50 +00:00
Paul Zhang
4a9c6f1b73 [PyPer][BE] Fix test_scripted_module in StatCollector (#108232)
Summary: D41985889 removed the cast to int for the inputs to torch.histc below, allowing the inputs to still be tensors. These tensors still have requires_grad set to True, causing issues with the call to torch.histc.

Test Plan: buck2 test 'fbcode//mode/opt' fbcode//dper3/dper3/modules/low_level_modules/tests:stat_collector_test -- --exact 'dper3/dper3/modules/low_level_modules/tests:stat_collector_test - test_scripted_module (dper3.dper3.modules.low_level_modules.tests.stat_collector_test.StatCollectorTest_1)'

Differential Revision: D48800879

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108232
Approved by: https://github.com/jerryzh168
2023-09-01 04:23:57 +00:00
Jerry Zhang
a9fe0b5b74 [quant][pt2e] Move propagate_annotation from quant flow to quantizer (#108320)
Summary:
Previously we ran propagate_annotation by default in the quantization flow to propagate annotations for ops like reshape, view, etc.

Not all quantizers need this, so we moved it to xnnpack_quantizer_utils for now.

Next Step:
* make propagate_annotation function configurable with a custom list of ops
* remove unneeded ops in `_is_share_obs_or_fq_op`

Test Plan:
python test/test_quantization.py TestQuantizePT2E

Differential Revision: [D48856985](https://our.internmc.facebook.com/intern/diff/D48856985)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108320
Approved by: https://github.com/kimishpatel
2023-09-01 01:49:19 +00:00
leslie-fang-intel
6c342ec368 Revert PR-107951 to only support new graph capture API in Quantization (#108317)
**Summary**
Revert the changes in https://github.com/pytorch/pytorch/pull/107951 so that the utils function only supports graphs captured by `capture_pre_autograd_graph`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108317
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
ghstack dependencies: #108214
2023-09-01 00:47:10 +00:00
leslie-fang-intel
fb808c30c7 x86_inductor_quantizer switches to new graph capture API (#108214)
**Summary**
Update `X86InductorQuantizer` and related testcase to the new graph capture API `capture_pre_autograd_graph`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108214
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
2023-09-01 00:43:45 +00:00
andrewor14
057b807178 [quant] Move dropout replacement to move_model_to_eval (#108184)
Summary: This commit adds a public facing
`torch.ao.quantization.move_model_to_eval` util function
for QAT users. Instead of calling model.eval() on an exported
model (which doesn't work, see
https://github.com/pytorch/pytorch/issues/103681), the user
would call this new util function instead. This ensures special
ops such as dropout and batchnorm (not supported yet) will have
the right behavior when the graph is later used for inference.

Note: Support for an equivalent `move_model_to_train` will be
added in the future. This is difficult to do for dropout
currently because the eval pattern of dropout is simply a clone
op, which we cannot just match and replace with a dropout op.

Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_move_model_to_eval

Reviewers: jerryzh168, kimishpatel

Subscribers: jerryzh168, kimishpatel, supriyar

Differential Revision: [D48814735](https://our.internmc.facebook.com/intern/diff/D48814735)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108184
Approved by: https://github.com/jerryzh168
2023-08-30 16:33:17 +00:00
Jerry Zhang
147b3495e2 [quant][pt2e] Add reference representation for dynamic quantized linear (#108073)
Summary: As titled.

Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_representation_dynamic_linear
buck2 test 'fbcode//mode/opt' fbcode//caffe2/test:quantization_pt2e -- 'test_representation_dynamic_linear'

Reviewed By: kimishpatel

Differential Revision: D48703076

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108073
Approved by: https://github.com/andrewor14
2023-08-29 07:12:55 +00:00
Jerry Zhang
9ae3d7ca90 [reland][quant][pt2e][xnnpack_quantizer] Add support for mul and mul_relu (#107930) (#107992)
Summary: As titled.

Test Plan: buck2 run executorch/examples/quantization:example -- -m=mv3 --verify

Differential Revision: D48588121

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107992
Approved by: https://github.com/digantdesai, https://github.com/mcr229
2023-08-27 14:50:03 +00:00
Xia, Weiwen
e9b0f62a19 [Quant][PT2E] Enable linear and linear-unary post-op quant recipe for x86 inductor quantizer (#106781)
**Summary**
Add the linear and linear-unary post-op quantization recipe to the x86 inductor quantizer, for PT2E with Inductor. With this, the quantization path will add the `quant-dequant` pattern for linear and linear-unary post-ops.

**Test plan**
python test/test_quantization.py -k test_linear_with_quantizer_api
python test/test_quantization.py -k test_linear_unary_with_quantizer_api

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106781
Approved by: https://github.com/leslie-fang-intel, https://github.com/jgong5, https://github.com/jerryzh168
ghstack dependencies: #105818
2023-08-27 10:50:17 +00:00
leslie-fang-intel
c85c5954f2 [Quant][PT2E]Make _fuse_conv_bn_ support graph capture by torch._dynamo.export (#107951)
**Summary**
The latest check-in a0cfaf0688 for conv-bn folding assumes the graph is captured by the new graph capture API `torch._export.capture_pre_autograd_graph`. Since we still need to use the original graph capture API `torch._dynamo.export` in the 2.1 release, that check-in heavily hurt workloads' performance. This PR fixes the issue by making the conv-bn folding function work with both the new and the original graph capture APIs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107951
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
ghstack dependencies: #106836, #106838, #106958
2023-08-26 17:19:41 +00:00
leslie-fang-intel
1147a28b0b [Quant][PT2E] Add cat and avg_pool2d recipe into x86InductorQuantizer (#106836)
**Summary**
Add `cat` and `avg_pool2d` quantization recipes, annotated so that input and output share an observer, to `X86InductorQuantizer`.

**Test Plan**
```
clear && python -m pytest test_x86inductor_quantizer.py -k test_cat_recipe
clear && python -m pytest test_x86inductor_quantizer.py -k test_cat_recipe_same_inputs
clear && python -m pytest test_x86inductor_quantizer.py -k test_cat_recipe_single_input
clear && python -m pytest test_x86inductor_quantizer.py -k test_avg_pool2d_recipe
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106836
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
2023-08-26 16:51:13 +00:00
Jerry Zhang
15d4dedbbf [quant][pt2e] Add reference representation rewrite for statically quantized linear (#107994)
Summary: As titled.

Test Plan:
```
python test/test_quantization.py TestQuantizePT2E.test_representation_linear
buck2 test 'fbcode//mode/opt' fbcode//caffe2/test:quantization_pt2e -- 'test_representation_linear'
```

Reviewed By: kimishpatel

Differential Revision: D48674862

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107994
Approved by: https://github.com/mcr229, https://github.com/guangy10
2023-08-26 15:39:52 +00:00
leslie-fang-intel
70ca18f8a0 [Quant][PT2E] Enable X86InductorQuantizer single quantizable op(maxpool2d) (#105639)
**Summary**
In this PR, we mainly enable two things:

- Enable the skeleton of the quantization recipe for single quantizable operators in `X86InductorQuantizer`.
- Add the quantization recipe for `maxpool2d` and annotate it so that input and output share an observer.

**Test Plan**
```
python -m pytest test_x86inductor_quantizer.py -k test_maxpool2d_recipe
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105639
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
ghstack dependencies: #104580, #104581, #104588, #104590, #105455, #105456
2023-08-26 08:34:15 +00:00
andrewor14
240bdbea61 [quant][pt2e] Fix annotation for conv no bias case (#107971)
Summary: This fixes the no bias case for conv annotations.
Previously this would result in an index out of bounds, since
the new aten.conv2d op may not have the bias arg (unlike the
old aten.convolution op). This was not caught because of a lack
of test cases, which are added in this commit.

Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_qat_conv_no_bias
python test/test_quantization.py TestQuantizePT2E.test_qat_conv_bn_relu_fusion_no_conv_bias

Reviewers: jerryzh168, kimishpatel

Subscribers: jerryzh168, kimishpatel

Differential Revision: [D48696874](https://our.internmc.facebook.com/intern/diff/D48696874)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107971
Approved by: https://github.com/jerryzh168
2023-08-26 01:01:54 +00:00
Jerry Zhang
f92f69dbfb [quant][pt2e] Enable testing for reference quant model representations (#107474)
Summary:
Previously these tests were disabled due to a timeout in dynamo export in fbcode;
this might have been resolved, so we're trying to enable them again.

Test Plan:
python test/test_quantization.py TestQuantizePT2E

Differential Revision: [D48619072](https://our.internmc.facebook.com/intern/diff/D48619072)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107474
Approved by: https://github.com/andrewor14
2023-08-26 00:37:45 +00:00
PyTorch MergeBot
8d44b0f5a5 Revert "[quant][pt2e][xnnpack_quantizer] Add support for mul and mul_relu (#107930)"
This reverts commit 1d1739dc6d.

Reverted https://github.com/pytorch/pytorch/pull/107930 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/107930#issuecomment-1694069330))
2023-08-26 00:37:02 +00:00
Jerry Zhang
1d1739dc6d [quant][pt2e][xnnpack_quantizer] Add support for mul and mul_relu (#107930)
Summary: As titled.

Test Plan: buck2 run executorch/examples/quantization:example -- -m=mv3 --verify

Differential Revision: D48588121

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107930
Approved by: https://github.com/kimishpatel
2023-08-25 23:36:19 +00:00
leslie-fang-intel
1374974d60 [Quant][Inductor] Enable quantization conv_binary(add/add_relu) pattern fusion inside inductor (#105456)
**Summary**
Enable the `dequant-conv2d-binary_postop(add)-unary_postop(relu)-quant` pattern fusion and lowering inside inductor.

**Test Plan**
```
clear && python -m pytest test_mkldnn_pattern_matcher.py -k test_qconv2d_binary
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105456
Approved by: https://github.com/jgong5, https://github.com/eellison
ghstack dependencies: #104580, #104581, #104588, #104590, #105455
2023-08-25 21:16:02 +00:00
Jerry Zhang
a0cfaf0688 [quant][pt2e] Make sure XNNPACKQuantizer works with the pre_dispatch=True (#107872)
Summary: As titled.

Test Plan:
```
buck test //executorch/backends/xnnpack/test:test_xnnpack_quantized_models -- test_resnet18

buck2 test 'fbcode//mode/opt' fbcode//caffe2/test:quantization_pt2e
```

Reviewed By: andrewor14, tugsbayasgalan

Differential Revision: D48415977

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107872
Approved by: https://github.com/andrewor14
2023-08-25 05:04:01 +00:00
Kimish Patel
2fbe6ef2f8 [pytorch][Quant] Fix bias quant bug (#107810)
Summary: Bias should be quantized with scale act_scale * weight_scale in conv and linear.
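
A small numeric sketch of the convention (values illustrative): int8 activations times int8 weights accumulate in int32, so the bias must share the accumulator's scale.

```python
import torch

act_scale, weight_scale = 0.1, 0.02
bias_fp32 = torch.randn(16)

# bias_scale = act_scale * weight_scale puts the quantized bias on the
# same scale as the int32 accumulator of (int8 act) x (int8 weight).
bias_scale = act_scale * weight_scale
bias_int32 = torch.round(bias_fp32 / bias_scale).to(torch.int32)
```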

Test Plan: Rewrite tests

Differential Revision: D48606828

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107810
Approved by: https://github.com/jerryzh168
2023-08-24 23:44:19 +00:00
Jerry Zhang
16fcb07846 [quant][pt2e] Add support for channel in DerivedQuantizationSpec (#107833)
Summary:
As titled.

Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_derived_qspec_per_channel

Differential Revision: [D48630535](https://our.internmc.facebook.com/intern/diff/D48630535)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107833
Approved by: https://github.com/andrewor14
2023-08-24 07:45:13 +00:00
Sherlock Huang
ee4b99cc3a Decomp for aten.dropout (#106274)
When exporting dropout with a CPU tensor, we get the following graph module:
```
    class GraphModule(torch.nn.Module):
        def forward(self, arg0_1: f32[512, 10]):
            empty_memory_format: f32[512, 10] = torch.ops.aten.empty.memory_format([512, 10], dtype = torch.float32, layout = torch.strided, device = device(type='cpu'), pin_memory = False, memory_format = torch.contiguous_format)
            bernoulli_p: f32[512, 10] = torch.ops.aten.bernoulli.p(empty_memory_format, 0.9);  empty_memory_format = None
            div_scalar: f32[512, 10] = torch.ops.aten.div.Scalar(bernoulli_p, 0.9);  bernoulli_p = None
            mul_tensor: f32[512, 10] = torch.ops.aten.mul.Tensor(arg0_1, div_scalar);  arg0_1 = div_scalar = None
            return (mul_tensor,)
```

In addition, if we export in eval() mode, we get an empty graph.

However, when exporting with a CUDA tensor, we get:
```
    class GraphModule(torch.nn.Module):
        def forward(self, arg0_1: f32[512, 10]):
            native_dropout_default = torch.ops.aten.native_dropout.default(arg0_1, 0.1, True);  arg0_1 = None
            getitem: f32[512, 10] = native_dropout_default[0];  native_dropout_default = None
            return (getitem,)
```
and exporting in eval() mode still leaves a dropout node in the graph.

This PR makes exporting with a CPU tensor also produce aten.native_dropout.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106274
Approved by: https://github.com/ezyang
2023-08-23 21:12:37 +00:00
Aaron Gokaslan
660e8060ad [BE]: Update ruff to 0.285 (#107519)
This updates ruff to 0.285, which is faster, better, and fixes a bunch of false negatives with regard to f-strings.

I also enabled RUF017, which looks for accidental quadratic list summation. Luckily, it seems there are no instances of it in our codebase, so I'm enabling it so that it stays that way. :)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519
Approved by: https://github.com/ezyang
2023-08-22 23:16:38 +00:00
PyTorch MergeBot
d59a6864fb Revert "[BE]: Update ruff to 0.285 (#107519)"
This reverts commit 88ab3e4322.

Reverted https://github.com/pytorch/pytorch/pull/107519 on behalf of https://github.com/ZainRizvi due to Sorry, but this PR breaks internal tests. @ezyang, can you please help them get unblocked? It seems like one of the strings was prob accidentally modified ([comment](https://github.com/pytorch/pytorch/pull/107519#issuecomment-1688833480))
2023-08-22 19:53:32 +00:00
Aaron Gokaslan
88ab3e4322 [BE]: Update ruff to 0.285 (#107519)
This updates ruff to 0.285, which is faster, better, and fixes a bunch of false negatives with regard to f-strings.

I also enabled RUF017, which looks for accidental quadratic list summation. Luckily, it seems there are no instances of it in our codebase, so I'm enabling it so that it stays that way. :)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519
Approved by: https://github.com/ezyang
2023-08-20 01:36:18 +00:00
Jerry Zhang
28be2c674a [quant][pt2e] Move specific quantizer related things outside of main quant code base (#106806) (#107259)
Summary:

Currently in quantizer/quantize_pt2e we import things from specific quantizers (XNNPACKQuantizer, QuantizationConfig, etc.);
this PR removes those imports so it's clearer that they are not part of the core quantization code base.

This PR also removes get_supported_operators from the main Quantizer, since we haven't seen a clear need for this API.

Test Plan:
CIs

Imported from OSS

Differential Revision: D48340367

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107259
Approved by: https://github.com/kimishpatel
2023-08-18 21:29:09 +00:00
Jerry Zhang
d3c4ec767b [quant][pt2e] Fix handling for SharedQuantizationSpec (#106922)
Summary:
Previously if we have:
```
conv1 -> cat
conv2  /
```
and configure the outputs of conv1/conv2 to be int8 quantized, and cat also int8 quantized with shared inputs,
it will not produce the expected results (the inputs of cat will not be shared).

The problem is that some checks were missing when inserting observers for the inputs of cat.

This PR fixes the problem.

Fixes: https://github.com/pytorch/pytorch/issues/106760
Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_shared_qspec

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106922
Approved by: https://github.com/kimishpatel
2023-08-16 21:16:45 +00:00
Jerry Zhang
4afab40b56 [quant][pt2e] Removed mean/hardtanh annotations and refactored adaptive_avg_pool annotation (#106805)
Summary:
Removed annotations for some ops, since they are handled in torch/ao/quantization/pt2e/_propagate_annotation.py

Test Plan:
CIs

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106805
Approved by: https://github.com/kimishpatel
2023-08-10 04:51:06 +00:00
Jerry Zhang
97ce979e5d [quant][pt2e] Add reference representation for quantized conv2d (#105784)
Summary:
Implementing the reference representation for quantized ops that we decided on in https://docs.google.com/document/d/17h-OEtD4o_hoVuPqUFsdm5uo7psiNMY8ThN03F9ZZwg/edit#heading=h.ov8z39149wy8

Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_representation_quantize_dequantize_per_channel

Although right now it is not really testing things, since there is a problem with dynamo export.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105784
Approved by: https://github.com/kimishpatel
ghstack dependencies: #105783
2023-08-09 22:41:35 +00:00
Tugsbayasgalan (Tugsuu) Manlaibaatar
a44c072c89 Make InternalModel and Resnet work with rexportable flow (#106676)
Summary: The internal model and ResNet use the "re-export" flow now. Also did some refactoring to make the code a little cleaner.

Some changes for OSS:
1. Correctly use the "cached" fake tensors so that static symbols are still resolved to static
2. Change logic in PassBase to allocate static shapes for parameters
3. Add "is_torch_exported" tag to every node to make it survive during various graph transformations.
4. Added experimental wrapper API for quantization team to get pre_dispatch=True graph. Note that it doesn't actually do that right now. But we plan to switch soon.

Test Plan: CI

Differential Revision: D47890878

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106676
Approved by: https://github.com/jerryzh168
2023-08-09 20:10:48 +00:00
Jerry Zhang
e1a1780626 [quant][pt2e] Move annotate functions in XNNPACKQuantizer to utils (#106642)
Summary:
This allows other quantizers to share these annotation functions, making it easier to write a new quantizer.

Note that these annotation functions will be maintained by the XNNPACKQuantizer developers instead of the AO team.

Test Plan:
python test/test_quantization.py TestQuantizePT2E

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106642
Approved by: https://github.com/andrewor14
2023-08-09 18:52:39 +00:00
Jerry Zhang
69ecad6f2b [quant][pt2e] Add reference representation for quantize_per_channel and dequantize_per_channel (#105783)
Summary:
Implementing the reference representation for quantized ops that we decided on in https://docs.google.com/document/d/17h-OEtD4o_hoVuPqUFsdm5uo7psiNMY8ThN03F9ZZwg/edit#heading=h.ov8z39149wy8

Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_representation_quantize_dequantize_per_channel

Although right now it is not really testing things, since there is a problem with dynamo export.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105783
Approved by: https://github.com/kimishpatel
2023-08-09 01:39:52 +00:00
Jiaxu Zhu
9e35df4adc [pytorch][ao] force weight observer/fake_quant to be on the same device as the weight tensor (#106755)
Summary:
As titled.
There's a corner case where both CPU and GPU are available: although the model is moved to CPU, the newly created PTQ weight observer is still on GPU. Therefore, during convert, this line will fail: https://fburl.com/4rhipfvb
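
A minimal sketch of the idea behind the fix (the helper is illustrative, not the diff's actual code):

```python
import torch

def observe_weight(observer: torch.nn.Module, weight: torch.Tensor):
    # Keep the observer on the weight's device so a model moved to CPU
    # never calls into an observer that was created on GPU.
    observer.to(weight.device)
    return observer(weight)
```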

Test Plan: CI

Differential Revision: D48141494

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106755
Approved by: https://github.com/jerryzh168
2023-08-09 00:22:49 +00:00
Jerry Zhang
2156f0434c [quant][pt2e] Add reference representation for quantized adaptive_avg_pool2d (#105709)
Summary:
Implementing the reference representation for quantized ops that we decided on in https://docs.google.com/document/d/17h-OEtD4o_hoVuPqUFsdm5uo7psiNMY8ThN03F9ZZwg/edit#heading=h.ov8z39149wy8

Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_representation_adaptive_avg_pool2d

Although right now it is not really testing things, since there is a problem with dynamo export.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105709
Approved by: https://github.com/andrewor14
ghstack dependencies: #105708
2023-08-04 18:49:14 +00:00
Jerry Zhang
9e301949ec [quant][pt2e] Add reference representation for quantized max_pool2d (#105708)
Summary:
Implementing the reference representation for quantized ops that we decided on in https://docs.google.com/document/d/17h-OEtD4o_hoVuPqUFsdm5uo7psiNMY8ThN03F9ZZwg/edit#heading=h.ov8z39149wy8

Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_representation_maxpool2d

Although right now it is not really testing things, since there is a problem with dynamo export.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105708
Approved by: https://github.com/andrewor14
2023-08-04 08:19:52 +00:00
Jerry Zhang
820e68b58a [quant][pt2e] Add reference representation for quantized add - relu (#105707)
Summary:
Implementing the reference representation for quantized ops that we decided on in https://docs.google.com/document/d/17h-OEtD4o_hoVuPqUFsdm5uo7psiNMY8ThN03F9ZZwg/edit#heading=h.ov8z39149wy8

Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_representation_add_relu

Although right now it is not really testing things, since there is a problem with dynamo export.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105707
Approved by: https://github.com/andrewor14
2023-08-03 00:42:06 +00:00
Jerry Zhang
d528a137e0 [quant][pt2e][quantizer] Support set_module_type in XNNPACKQuantizer (#106094)
Summary:
Added support for users to set configurations based on module type in XNNPACKQuantizer; this can also serve as an example
for implementing new quantizers.
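
A hedged usage sketch (import paths as of the PyTorch source tree at the time):

```python
import torch
from torch.ao.quantization.quantizer.xnnpack_quantizer import (
    XNNPACKQuantizer,
    get_symmetric_quantization_config,
)

quantizer = XNNPACKQuantizer()
quantizer.set_global(get_symmetric_quantization_config())
# Override: quantize every nn.Linear with a per-channel config instead.
quantizer.set_module_type(
    torch.nn.Linear, get_symmetric_quantization_config(is_per_channel=True)
)
```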

Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_xnnpack_quantizer_set_module_type

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106094
Approved by: https://github.com/andrewor14
ghstack dependencies: #106087
2023-08-02 08:33:58 +00:00
Leon
850ad54139 correct spelling mistake (#106309)
Corrects a spelling mistake.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106309
Approved by: https://github.com/kit1980
2023-08-02 04:38:23 +00:00
Jerry Zhang
92a22a8098 [quant][pt2e][quantizer] Support set_module_name in XNNPACKQuantizer (#106087)
Summary:
Added support for users to set configurations based on module name in XNNPACKQuantizer; this can also serve as an example
for implementing new quantizers.
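
A hedged usage sketch, mirroring the set_module_type example above (the module name "sub" is illustrative):

```python
from torch.ao.quantization.quantizer.xnnpack_quantizer import (
    XNNPACKQuantizer,
    get_symmetric_quantization_config,
)

quantizer = XNNPACKQuantizer()
# Only the submodule named "sub" gets this config; other modules are untouched.
quantizer.set_module_name("sub", get_symmetric_quantization_config())
```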

Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_xnnpack_quantizer_set_module_name

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106087
Approved by: https://github.com/andrewor14
2023-08-02 01:19:23 +00:00
PyTorch MergeBot
93b2036bef Revert "[quant][pt2e] store scale/zero_point as tensor attributes to support serialization (#105894)"
This reverts commit 3ca71ed735.

Reverted https://github.com/pytorch/pytorch/pull/105894 on behalf of https://github.com/huydhn due to breaking executorch tests internally ([comment](https://github.com/pytorch/pytorch/pull/105894#issuecomment-1654831950))
2023-07-28 01:16:02 +00:00
Edward Z. Yang
7b9d250f06 Change _dynamo.export to be export(f)(*args, **kwargs) (#106109)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106109
Approved by: https://github.com/voznesenskym
2023-07-27 21:41:13 +00:00
Jerry Zhang
3ca71ed735 [quant][pt2e] store scale/zero_point as tensor attributes to support serialization (#105894)
Summary:
Currently, scale/zero_point for per-tensor quant are stored as burnt-in literals, which means these values can't be serialized in the state_dict; this
PR changes them to buffers/tensors so that they can be serialized.
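
A minimal illustration of why buffers solve this (a hypothetical module, not the generated graph code):

```python
import torch

class QuantParams(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # Registered buffers travel through state_dict; literals burnt
        # into the graph do not.
        self.register_buffer("scale", torch.tensor(0.02))
        self.register_buffer("zero_point", torch.tensor(0, dtype=torch.int64))

m = QuantParams()
assert "scale" in m.state_dict() and "zero_point" in m.state_dict()
```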

Test Plan:
python test/test_quantization.py TestQuantizePT2E

Differential Revision: [D47770963](https://our.internmc.facebook.com/intern/diff/D47770963)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105894
Approved by: https://github.com/kimishpatel
2023-07-26 20:15:06 +00:00
Jerry Zhang
3a77f9aaaf [quant][api] Move torch.ao.quantization.pt2e.quantizer to torch.ao.quantization.quantizer (#105885)
Summary: Moving quantizer to torch.ao.quantization to make it a public API, since pt2e is a folder for implementations.

Test Plan:
CIs

sanity check: "buck test //executorch/backends/xnnpack/test:test_xnnpack_quantized_models -- test_resnet18"

Differential Revision: D47727838

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105885
Approved by: https://github.com/andrewor14
2023-07-26 18:20:09 +00:00
Jerry Zhang
d767cff7c7 [quant][fx] Fix docs for prepare_fx/prepare_qat_fx (#105979)
Summary:
Fixes: https://github.com/pytorch/pytorch/issues/103661

Test Plan:
Visual inspection of the docs: https://pytorch.org/docs/2.0/generated/torch.ao.quantization.quantize_fx.prepare_fx.html#torch.ao.quantization.quantize_fx.prepare_fx

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105979
Approved by: https://github.com/andrewor14
2023-07-26 09:56:18 +00:00
Aaron Gokaslan
6d43c89f37 [BE]: Update Ruff to 0.0.280 (#105724)
Removes unused loop values in Python dictionary iteration. Automated fix from Ruff master.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105724
Approved by: https://github.com/ezyang, https://github.com/janeyx99
2023-07-22 23:03:34 +00:00
Jerry Zhang
143c83d637 [quant][pt2e][be] Remove unneeded code (#105676)
Summary:
As titled.

Test Plan:
CIs

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105676
Approved by: https://github.com/andrewor14
2023-07-21 00:51:22 +00:00
Jerry Zhang
dff4e034b8 [quant][pt2e][be] Rename qnnpack quantizer to xnnpack quantizer (#105551)
Summary: As titled.

Test Plan: sandcastle CI and OSS CI

Reviewed By: andrewor14

Differential Revision: D47422894

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105551
Approved by: https://github.com/andrewor14
2023-07-20 03:52:40 +00:00
Max Ren
bc6bca9d42 [XNNPACK][QS8] torch.slice (#105252)
Differential Revision: [D47487423](https://our.internmc.facebook.com/intern/diff/D47487423/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105252
Approved by: https://github.com/digantdesai
2023-07-19 23:36:02 +00:00
leslie-fang-intel
fa6be2fa6f [Quant][PT2E] Remove x86 inductor pt2e backend config (#105039)
**Summary**
For the quantization PT2E path, we recommend using `X86InductorQuantizer` instead of the `x86_inductor_pt2e_backend_config` backend config. Remove `x86_inductor_pt2e_backend_config` and the relevant testing.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105039
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
2023-07-19 23:18:29 +00:00
Justin Chu
c0d8a4af0a [BE] Enable ruff's UP rules and autoformat ao/ (#105430)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105430
Approved by: https://github.com/albanD, https://github.com/malfet
2023-07-19 13:44:37 +00:00
Jerry Zhang
554052f321 [quant][pt2e][be] Rename prepare_pt2e_quantizer to prepare_pt2e (#105484)
Summary: As titled.

Test Plan: sandcastle and OSS CI

Reviewed By: andrewor14

Differential Revision: D47422892

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105484
Approved by: https://github.com/andrewor14
2023-07-19 04:51:37 +00:00
Tugsbayasgalan (Tugsuu) Manlaibaatar
5666d20bb8 Add unlifting pass under private config (#104897)
Summary: We want to do this little by little. For now, I tried it only on DissectedPartsModel, which needs to use the aot_export version.

Test Plan: CI

Reviewed By: zhxchen17

Differential Revision: D46785735

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104897
Approved by: https://github.com/JacobSzwejbka
2023-07-19 01:16:35 +00:00
maxren
88f1885ec9 [XNNPACK][QS8] torch.cat (#104800)
Differential Revision: [D47304143](https://our.internmc.facebook.com/intern/diff/D47304143/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104800
Approved by: https://github.com/digantdesai
2023-07-19 00:15:05 +00:00
Nikita Shulga
78829d6e07 Fix isinstance check in quat_utils (#105476)
Calling `isinstance(x, Tuple[Node, Node])` would either fail or raise a
type error on more modern Python, as none of the tuples are actually
instances of `Tuple`:

```python
>>> from typing import Tuple
>>> from torch.fx import Node
>>> edge_or_node=(Node(None, "foo", "output", "foo", None, None), Node(None, "bar", "output", "bar", None, None))
>>> isinstance(edge_or_node, tuple) and len(edge_or_node) == 2 and all(isinstance(x, Node) for x in edge_or_node)
True
>>> isinstance(edge_or_node, Tuple[Node, Node])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/malfet/miniconda3/lib/python3.10/typing.py", line 994, in __instancecheck__
    return self.__subclasscheck__(type(obj))
  File "/Users/malfet/miniconda3/lib/python3.10/typing.py", line 997, in __subclasscheck__
    raise TypeError("Subscripted generics cannot be used with"
TypeError: Subscripted generics cannot be used with class and instance checks
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105476
Approved by: https://github.com/jerryzh168
2023-07-18 21:16:05 +00:00
Jerry Zhang
ed2b9f1af1 [quant][pt2e] rename _quantize_pt2e to quantize_pt2e (#105377)
Summary: As titled.

Test Plan: CIs

Reviewed By: andrewor14

Differential Revision: D47234357

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105377
Approved by: https://github.com/andrewor14
2023-07-18 16:46:05 +00:00
Nikita Shulga
5837e95d30 [Reland] Update mypy to 1.4.1 (#105227)
This PR re-lands
- [Typing] Fix PEP 484 Violation (#105022)
- Update mypy to 1.4.1 (#91983)

Those were reverted due to a conflict with the internal source repo.

Mostly fixes for PEP-484 violations (i.e. when a default arg is set to None but the type is not annotated as Optional)
Plus few real fixes:
  - Add missing `_get_upgraders_entry_map` to `torch/_C/__init__.pyi`
  - Add missing return statement to `torch._export.deserialize_graph`
  - Fix error message in `torch.ao.ns.fx.weight_utils.get_lstm_mod_weights`
  - Add an assert in `torch/optim/optimizer.py` that the Optional list is not None
TODO (in followup PR):
  - Fix erroneous `isinstance` check in `torch/ao/quantization/_pt2e/qat_utils.py`

Unrelated, to bypass CI failures due to the gcc9 dependency update in Ubuntu-18.04:
- Add hack to squash older libstdc++ from conda environment in favor one from OS to `.ci/docker/install_conda.sh`
- Update bazel cuda builds to focal, as with libstdc++-6.0.32 bazel builds lose the ability to catch exceptions (probably because they link with cupti statically, but I could not find where it is done)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105227
Approved by: https://github.com/atalman, https://github.com/albanD, https://github.com/Skylion007
2023-07-15 20:30:20 +00:00
Jerry Zhang
7b4d080496 [quant][pt2e] Rename _pt2e to pt2e (#104668)
Summary:
X-link: https://github.com/pytorch/executorch/pull/3

As titled.

Test Plan: Imported from OSS

Differential Revision: D47202807

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104668
Approved by: https://github.com/andrewor14
2023-07-15 06:34:17 +00:00
PyTorch MergeBot
15fd1ea118 Revert "[Reland] Update mypy to 1.4.1 (#105227)"
This reverts commit c9c4f8efc3.

Reverted https://github.com/pytorch/pytorch/pull/105227 on behalf of https://github.com/atalman due to trying to mitigate ci sev #105248 ([comment](https://github.com/pytorch/pytorch/pull/105227#issuecomment-1636510935))
2023-07-14 22:28:35 +00:00
Nikita Shulga
c9c4f8efc3 [Reland] Update mypy to 1.4.1 (#105227)
This PR re-lands
- [Typing] Fix PEP 484 Violation (#105022)
- Update mypy to 1.4.1 (#91983)

Those were reverted due to a conflict with the internal source repo.

Mostly fixes for PEP-484 violations (i.e. when a default arg is set to None but the type is not annotated as Optional)
Plus few real fixes:
  - Add missing `_get_upgraders_entry_map` to `torch/_C/__init__.pyi`
  - Add missing return statement to `torch._export.deserialize_graph`
  - Fix error message in `torch.ao.ns.fx.weight_utils.get_lstm_mod_weights`
  - Add an assert in `torch/optim/optimizer.py` that the Optional list is not None
TODO (in followup PR):
  - Fix erroneous `isinstance` check in `torch/ao/quantization/_pt2e/qat_utils.py`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105227
Approved by: https://github.com/atalman, https://github.com/albanD, https://github.com/Skylion007
2023-07-14 20:45:12 +00:00
PyTorch MergeBot
3c5a494d7a Revert "Update mypy to 1.4.1 (#91983)"
This reverts commit 634659e262.

Reverted https://github.com/pytorch/pytorch/pull/91983 on behalf of https://github.com/malfet due to It's dependent change was reverted, so reverting this one as well, to keep CI clean ([comment](https://github.com/pytorch/pytorch/pull/91983#issuecomment-1636059709))
2023-07-14 15:59:16 +00:00
Jerry Zhang
90b50f0303 [quant][pt2e] change internal code to only import from _quantize_pt2e (#105162)
Summary: This is to make public api clear so that we can make implementation details change easier in the future

Test Plan: CIs

Differential Revision: D47445767

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105162
Approved by: https://github.com/andrewor14
2023-07-14 05:14:29 +00:00
Tuan Tran
85745cd3d9 Fix bug in fuse_modules (#105069)
Summary: This diff fixes the issue reported in https://github.com/pytorch/pytorch/issues/105063, which is also related to an internal caffe2 bug (reproduced error in internal fb pytorch: N3945540).

Test Plan: Wait for sandcastle with the added unit test in caffe2/torch/ao/quantization/eager/test_fuse_eager

Differential Revision: D47402357

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105069
Approved by: https://github.com/jerryzh168
2023-07-13 23:39:59 +00:00
Nikita Shulga
634659e262 Update mypy to 1.4.1 (#91983)
Mostly fixes for PEP-484 violations (i.e. when a default arg is set to None but the type is not annotated as Optional)
Plus few real fixes:
  - Add missing `_get_upgraders_entry_map` to `torch/_C/__init__.pyi`
  - Add missing return statement to `torch._export.deserialize_graph`
  - Fix error message in `torch.ao.ns.fx.weight_utils.get_lstm_mod_weights`
TODO (in followup PR):
  - Fix erroneous `isinstance` check in `torch/ao/quantization/_pt2e/qat_utils.py`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91983
Approved by: https://github.com/kit1980, https://github.com/ZainRizvi, https://github.com/huydhn, https://github.com/thiagocrepaldi, https://github.com/aaronenyeshi
2023-07-13 16:30:36 +00:00
Aaron Gokaslan
96b91ab248 Fix merged lintrunner error (#105005)
Fixes a lintrunner race condition. Follow-up to #104917.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105005
Approved by: https://github.com/malfet, https://github.com/ezyang
2023-07-11 22:04:49 +00:00
Aaron Gokaslan
2f95a3d0fc [BE]: Apply ruff PERF fixes to torch (#104917)
Applies automated ruff fixes for the PERF rules and enables all the automatic ones. I also updated ruff, which applied some additional fixes.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104917
Approved by: https://github.com/ezyang, https://github.com/albanD
2023-07-11 20:45:21 +00:00
Andrew Or
4b29829ece [quant][pt2] Fix QAT convert for mobilenetv2 (#104110)
Summary:
QAT convert for mobilenetv2 was previously not working
because we incorrectly applied dropout during eval as well as
training. This is because, for exported models, model.eval() does
not change the behavior of dropout, unlike models with torch ops.
This commit simulates the effects of model.eval() for exported
models as well by replacing the aten dropout pattern before eval.
As of this commit, end-to-end QAT numerics now match for
mobilenetv2 between FX and PT2.

Test Plan: python test/test_quantization.py TestQuantizePT2EModels.test_qat_mobilenet_v2

Differential Revision: D46750343

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104110
Approved by: https://github.com/jerryzh168
2023-07-11 18:42:42 +00:00
maxren
332f2057df [XNNPACK][QS8] torch.nn.ELU (#104307)
Differential Revision: [D47075933](https://our.internmc.facebook.com/intern/diff/D47075933/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104307
Approved by: https://github.com/digantdesai
2023-07-11 00:35:13 +00:00
maxren
c4e084e3c7 [XNNPACK][QS8] torch.nn.ConstantPad2d (#104306)
Differential Revision: [D47075932](https://our.internmc.facebook.com/intern/diff/D47075932/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104306
Approved by: https://github.com/digantdesai
2023-07-11 00:35:02 +00:00
maxren
2c960c73a3 [XNNPACK][QS8] torch.permute (#104305)
Differential Revision: [D47075934](https://our.internmc.facebook.com/intern/diff/D47075934/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104305
Approved by: https://github.com/digantdesai
2023-07-11 00:34:58 +00:00
maxren
d41c4a8338 [XNNPACK][QS8] torch.clamp (#104304)
Differential Revision: [D47075935](https://our.internmc.facebook.com/intern/diff/D47075935/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104304
Approved by: https://github.com/digantdesai
2023-07-11 00:34:58 +00:00
leslie-fang-intel
2a21469a77 [Quant][PT2E] Enable conv2d unary and binary recipe for x86 inductor quantizer (#98826)
**Summary**

- Recipe to annotate `conv2d_relu` for `X86InductorQuantizer` is added.
- Recipe to annotate `conv2d_add` for `X86InductorQuantizer` is added.
- Recipe to annotate `conv2d_add_relu` for `X86InductorQuantizer` is added.

**Test Plan**
```
python -u -m pytest -s -v test_x86inductor_quantizer.py -k TestQuantizePT2EX86Inductor
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/98826
Approved by: https://github.com/jerryzh168
2023-07-04 00:01:10 +00:00
Kimish Patel
bd0f0f40a1 [PT2][Quant] Enable symbolic shape in linear quantization (#104473)
When tracing with symbolic shapes, arbitrary sym_size nodes can appear in the
graph. Earlier changes did not account for this, and the quantizer failed to annotate
the right nodes. This diff fixes that by not annotating sym_size nodes, which
should not be relevant for quantization.

As next steps, we should a) validate in the quant workflow that sym_int nodes are not
being quantized, and b) add similar support, as in this diff, for generic
annotations.
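
A hedged sketch of the kind of guard this implies (the helper name is illustrative, not the diff's actual code):

```python
import torch

def _is_sym_size_node(node) -> bool:
    # Shape-query nodes carry SymInts, not tensor data, so the quantizer
    # should skip them when annotating.
    return node.op == "call_function" and node.target in (
        torch.ops.aten.sym_size,
        torch.ops.aten.sym_size.int,
    )
```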

Differential Revision: [D47132050](https://our.internmc.facebook.com/intern/diff/D47132050/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104473
Approved by: https://github.com/jerryzh168
2023-07-01 05:14:30 +00:00