Commit Graph

208 Commits

Author SHA1 Message Date
Jerry Zhang
64fd706b21 [quant][pt2e] Add generate_numeric_debug_handle pass (#114315)
Summary:
This is a util for numeric suite in pt2 export so that we can build
a more streamlined UX for numerical debugging in quant + executorch stack

Test Plan:
python test/test_quantization.py TestGenerateNumericDebugHandle

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114315
Approved by: https://github.com/zhxchen17
2023-12-01 03:38:17 +00:00
leslie-fang-intel
65e99357ae [Inductor] [Quant] Enable QConv2d Unary int8-mixed-bf16 Lowering (#112550)
**Summary**
- PR 5 for enabling Int8-Mixed-BF16 PT2E PTQ Quantization with Inductor https://github.com/pytorch/pytorch/issues/111640.
- Enable the QConv2d Unary int8-mixed-bf16 weight prepack and post grad lowering inside inductor.

**TestPlan**
```
python -m pytest test_mkldnn_pattern_matcher.py -k test_qconv2d
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112550
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
2023-11-10 08:59:40 +00:00
Jerry Zhang
501d118255 [quant][pt2e] Add transform_for_annotation method in Quantizer (#113115)
Summary:
Adding the method so that people can do some transformations before annotation to make the graph easier to annotate

Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_transform_for_annotation

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: [D51141080](https://our.internmc.facebook.com/intern/diff/D51141080)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113115
Approved by: https://github.com/kimishpatel
2023-11-09 20:23:29 +00:00
Jerry Zhang
12c257cc00 [qunat][pt2e] Support allow_implicit_sharing flag (#112929)
Summary:
For a Node: node1 and edge: (node1, node2), since they are observing the same
Tensor, we may want to implicitly share observers, this flag allows people to
turn off this behavior for the output of the node

See the test_allow_implicit_sharing test for use case

Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_allow_implicit_sharing

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112929
Approved by: https://github.com/kimishpatel
2023-11-08 23:47:17 +00:00
Jerry Zhang
43c211facb [quant][pt2e] Actually support transitive sharing for SharedQuantizationSpec (#111172)
Summary:
Previously we actually did not really support this, this PR added the support.

Next
* clean up insert observer logic
* add allow_transitive_sharing boolean flag to allow people to turn this op for certain edges

Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_shared_qspec_transitivity

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: [D50250789](https://our.internmc.facebook.com/intern/diff/D50250789)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111172
Approved by: https://github.com/kimishpatel
2023-10-20 23:25:17 +00:00
Jerry Zhang
c9b8e06060 [quant] Enable quantization for wav2letter (#109830)
Summary:
Also added annotation support for conv1d_relu and conv1d in XNNPACKQuantizer, the quantized results still
matches fx quant path (didn't quantize conv1d) so tests are not disabled

Test Plan: with-proxy buck2 run executorch/examples/quantization:example -- -m=w2l --verify

Differential Revision: D49479546

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109830
Approved by: https://github.com/kimishpatel
2023-09-29 00:47:34 +00:00
Jerry Zhang
3de42995e4 [quant][pt2e] Add quant API re-entrant test (#110125)
Summary:
Add the test to make sure we can call the quantize API multiple times

Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_reentrant

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110125
Approved by: https://github.com/kimishpatel
ghstack dependencies: #110097
2023-09-28 22:41:59 +00:00
Jerry Zhang
c914ca7577 [quant][be] Add TestPT2ERepresentation test case (#108923)
Summary:
att

Test Plan:
python test/test_quantization.py TestPT2ERepresentation
Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108923
Approved by: https://github.com/andrewor14
2023-09-14 02:01:38 +00:00
Jiaxu Zhu
152203d3c3 [pytorch][ao] Add torch.matmul in FloatFunctional/QFunctional (#106831)
Summary: As title

Test Plan: new unit tests

Differential Revision: D48172841

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106831
Approved by: https://github.com/jerryzh168
2023-08-10 22:43:36 +00:00
Justin Chu
4cc1745b13 [BE] f-stringify torch/ and scripts (#105538)
This PR is a follow up on the pyupgrade series to convert more strings to use f-strings using `flynt`.

- https://docs.python.org/3/reference/lexical_analysis.html#f-strings
- https://pypi.org/project/flynt/

Command used:

```
flynt torch/ -ll 120
flynt scripts/ -ll 120
flynt tools/ -ll 120
```

and excluded `collect_env.py`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105538
Approved by: https://github.com/ezyang, https://github.com/malfet
2023-07-21 19:35:24 +00:00
Justin Chu
be03a56955 [BE] Enable ruff's UP rules and autoformat testing/ (#105425)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105425
Approved by: https://github.com/malfet
2023-07-18 21:04:39 +00:00
leslie-fang-intel
dbc8eb2a8f [Quant][PT2E]Enable x86 inductor quantizer (#98730)
**Summary**

- Enable `X86InductorQuantizer` basics.
- Recipe to annotate conv2d is added.

**Test Plan**
```
python -u -m pytest -s -v test_x86inductor_quantizer.py -k TestQuantizePT2EX86Inductor
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/98730
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
2023-06-17 06:10:23 +00:00
Xia, Weiwen
08766b23de [Quant][FX] lower ConvTranspose3d (#97125)
**Summary**
Enable quantization and lowering of `ConvTranspose3d`.
Add test cases for `ConvTranspose1d`, `ConvTranspose2d` and `ConvTranspose3d` since there were no such test cases.

**Test plan**
python test/test_quantization.py -k test_conv_transpose_not_reference
python test/test_quantization.py -k test_conv_transpose_reference

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97125
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
2023-03-28 11:58:29 +00:00
Kazuaki Ishizaki
4610ce49f6 Fix typo under torch/testing directory (#97254)
This PR fixes typo in comments and messages under `torch/testing` directory.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97254
Approved by: https://github.com/kit1980, https://github.com/malfet
2023-03-23 01:46:17 +00:00
leslie-fang-intel
a6d8c70933 Init quantization backend config for inductor (#96476)
**Summary**
Init the backend config file with quantization recipes for quantization 2.0 inductor path. In this PR, we only init the recipe for `convolution` and `convolution_relu`.

**Test Plan**
```
clear && python -m pytest test_quantization.py -k test_inductor_backend_config_conv
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96476
Approved by: https://github.com/jgong5, https://github.com/EikanWang, https://github.com/jerryzh168
2023-03-22 07:56:56 +00:00
Xuehai Pan
046e88a291 [BE] [3/3] Rewrite super() calls in test (#94592)
Rewrite Python built-in class `super()` calls. Only non-semantic changes should be applied.

- #94587
- #94588
- #94592

Also, methods with only a `super()` call are removed:

```diff
class MyModule(nn.Module):
-   def __init__(self):
-       super().__init__()
-
    def forward(self, ...):
        ...
```

Some cases that change the semantics should be kept unchanged. E.g.:

f152a79be9/caffe2/python/net_printer.py (L184-L190)

f152a79be9/test/test_jit_fuser_te.py (L2628-L2635)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94592
Approved by: https://github.com/ezyang, https://github.com/seemethere
2023-02-12 22:20:53 +00:00
Xuehai Pan
5b1cedacde [BE] [2/3] Rewrite super() calls in functorch and torch (#94588)
Rewrite Python built-in class `super()` calls. Only non-semantic changes should be applied.

- #94587
- #94588
- #94592

Also, methods with only a `super()` call are removed:

```diff
class MyModule(nn.Module):
-   def __init__(self):
-       super().__init__()
-
    def forward(self, ...):
        ...
```

Some cases that change the semantics should be kept unchanged. E.g.:

f152a79be9/caffe2/python/net_printer.py (L184-L190)

f152a79be9/test/test_jit_fuser_te.py (L2628-L2635)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94588
Approved by: https://github.com/ezyang, https://github.com/albanD
2023-02-10 21:16:33 +00:00
Aaron Gokaslan
8fce9a09cd [BE]: pyupgrade Python to 3.8 - imports and object inheritance only (#94308)
Apply parts of pyupgrade to torch (starting with the safest changes).
This PR only does two things: removes the need to inherit from object and removes unused future imports.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94308
Approved by: https://github.com/ezyang, https://github.com/albanD
2023-02-07 21:10:56 +00:00
Vasiliy Kuznetsov
f15ab8a7f2 AO migration: replace torch internal callsites (#94170)
Summary:

Do the following renames:
`torch.quantization` -> `torch.ao.quantization`
`torch.nn.quantized` -> `torch.ao.nn.quantized`
`torch.nn.quantizable` -> `torch.ao.nn.quantizable`
`torch.nn.qat` -> `torch.ao.nn.qat`
`torch.nn.intrinsic` -> `torch.ao.nn.intrinsic`

And then, do
`torch.ao.nn.quantized._reference` -> `torch.ao.nn.quantized.reference` to clean up the aftermath of https://github.com/pytorch/pytorch/pull/84974

Then, manually update `test/test_module_init.py` to fix hanging whitespace due to the replace.

Run this script to do the replacements: https://gist.github.com/vkuzo/7f7afebf8c31b9ba48306223e68a1c82

This is for https://github.com/pytorch/pytorch/issues/81667

Test plan: CI
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94170
Approved by: https://github.com/jerryzh168
2023-02-07 02:32:23 +00:00
Vasiliy Kuznetsov
56f9475625 ns: change PNP testing to use QNNPACK (#91421)
Summary:

Changes the PNP test cases to use QNNPACK. The only reason is because
I'm switching to Mac M1 as my primary machine, which supports QNNPACK
but not fbgemm, and it's convenient for me to be able to run these
locally.

PNP itself is not backend specific, so it does not matter which backend
the functionality is tested on.

Test plan:

```
python test/test_quantization.py -k NShadows
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91421
Approved by: https://github.com/jerryzh168
2023-02-01 18:34:04 +00:00
leslie-fang-intel
ef4118e435 [Quant][FX] Lower QConvAdd2d for onednn backend (#91153)
**Summary**
Add quantization mappings for QConvAdd2d for int8 inference for onednn backend. The fusion and lowering is supported only in FX mode.

**Test plan**
```
python -m pytest test_quantization.py -k test_fuse_conv_bn_add_relu_onednn
python -m pytest test_quantization.py -k test_fuse_conv_bn_add_relu_by_default
python -m pytest test_quantization.py -k test_fuse_conv_bn_add_relu_lowering
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91153
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
2023-02-01 01:14:12 +00:00
leslie-fang-intel
53c3555a6a [Quant] Add fused ConvAdd2d module for onednn backend (#91152)
**Summary**
Post op fusion can reduce data movement overhead and improve inference performance. This PR adds fused `ConvAdd2d` module for onednn backend, which will be used for int8 inference with onednn backend. Cannot call this module with other quantization backends otherwise an error is thrown.

**Test plan**
```
python -m pytest test_quantization.py -k test_conv2d_add
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91152
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
2023-02-01 01:11:25 +00:00
Xia, Weiwen
a5eb564ba4 [Quant] lower fused LinearTanh for onednn backend (#89188)
**Summary**
Add fuser method and quantization mappings for `QLinearLeakyReLU` for int8 inference for onednn backend. The fusion and lowering are supported only in FX mode.

**Test plan**
python test_quantization.py TestFuseFx TestQuantizeFx

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89188
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
2022-12-20 01:30:21 +00:00
Xia, Weiwen
7b0ec67e34 [Quant][FX] Add backend config for onednn backend and fuse Linear-LeakyReLU (#88665)
**Summary**
Add backend config for onednn backend so that it can support more post op fusion for int8 inference. First `Linear - LeakyReLU` fusion is implemented based on previous PRs.

**Test plan**
python test_quantization.py TestFuseFx

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88665
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
2022-12-17 03:33:08 +00:00
Xia, Weiwen
97e47a52b8 [Quant] Add fused linear-leaky_relu op for onednn backend (#88478)
**Summary**
Post op fusion can reduce data movement overhead and improve inference performance. This PR adds fused `linear-leaky_relu` op for `onednn` backend, which will be used for int8 inference with `onednn` backend. Cannot call this op with other quantization backends otherwise an error is thrown.

**Test Plan**
python test_quantization.py TestQuantizedLinear

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88478
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
2022-12-06 08:32:59 +00:00
dzdang
ea81138bd6 [quant][improvement][better-engineering] Refactored get_supported_device_types into common_quantization.py (#79607)
Summary:
Both test_quantized_tensor.py and test_quantize_fx.py had the same
get_supported_device_types function defined. This PR refactors it into
the common_quantization.py file for common usage

Test Plan:
```
python test/test_quantization.py
```

Differential Revision: [D37173692](https://our.internmc.facebook.com/intern/diff/D37173692)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79607
Approved by: https://github.com/jerryzh168
2022-09-23 23:32:18 +00:00
zaf
d32a762147 [quant][ao_migration] torch.nn.quantized.dynamictorch.ao.nn.quantized.dynamic (#78714)
Context: In order to avoid the cluttering of the `torch.nn` namespace
the quantized modules namespace is moved to `torch.ao.nn`.

The list of the `nn.quantized` files that are being migrated:

- [ ] `torch.nn.quantized` → `torch.ao.nn.quantized`
    - [X] `torch.nn.quantized.functional` → `torch.ao.nn.quantized.functional`
    - [X] `torch.nn.quantized.modules` → `torch.ao.nn.quantized.modules`
    - [X] [Current PR] `torch.nn.quantized.dynamic` → `torch.ao.nn.quantized.dynamic`
    - [ ] `torch.nn.quantized._reference` → `torch.ao.nn.quantized._reference`
- [ ] `torch.nn.quantizable` → `torch.ao.nn.quantizable`
- [ ] `torch.nn.qat` → `torch.ao.nn.qat`
    - [ ] `torch.nn.qat.modules` → `torch.ao.nn.qat.modules`
    - [ ] `torch.nn.qat.dynamic` → `torch.ao.nn.qat.dynamic`
- [ ] `torch.nn.intrinsic` → `torch.ao.nn.intrinsic`
    - [ ] `torch.nn.intrinsic.modules` → `torch.ao.nn.intrinsic.modules`
    - [ ] `torch.nn.intrinsic.qat` → `torch.ao.nn.intrinsic.qat`
    - [ ] `torch.nn.intrinsic.quantized` → `torch.ao.nn.intrinsic.quantized`
        - [ ] `torch.nn.intrinsic.quantized.modules` → `torch.ao.nn.intrinsic.quantized.modules`
        - [ ] `torch.nn.intrinsic.quantized.dynamic` → `torch.ao.nn.intrinsic.quantized.dynamic`

Majority of the files are just moved to the new location.
However, specific files need to be double checked:

- [Documentation](docs/source/quantization-support.rst) @vkuzo
- [Public API test list](test/allowlist_for_publicAPI.json) @peterbell10
- [BC test](test/quantization/bc/test_backward_compatibility.py) @vkuzo
- [IR emitter](torch/csrc/jit/frontend/ir_emitter.cpp) @jamesr66a
- [JIT serialization](torch/csrc/jit/serialization/import_source.cpp) @IvanKobzarev @jamesr66a

Differential Revision: [D36860660](https://our.internmc.facebook.com/intern/diff/D36860660/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D36860660/)!

Differential Revision: [D36860660](https://our.internmc.facebook.com/intern/diff/D36860660)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78714
Approved by: https://github.com/jerryzh168
2022-08-25 16:50:34 +00:00
zaf
c92e5ac95b [quant][ao_migration] torch.nn.quantized.modulestorch.ao.nn.quantized.modules (#78713)
Context: In order to avoid the cluttering of the `torch.nn` namespace
the quantized modules namespace is moved to `torch.ao.nn`.

The list of the `nn.quantized` files that are being migrated:

- [ ] `torch.nn.quantized` → `torch.ao.nn.quantized`
    - [X] `torch.nn.quantized.functional` → `torch.ao.nn.quantized.functional`
    - [X] [Current PR] `torch.nn.quantized.modules` → `torch.ao.nn.quantized.modules`
    - [ ] `torch.nn.quantized.dynamic` → `torch.ao.nn.quantized.dynamic`
    - [ ] `torch.nn.quantized._reference` → `torch.ao.nn.quantized._reference`
- [ ] `torch.nn.quantizable` → `torch.ao.nn.quantizable`
- [ ] `torch.nn.qat` → `torch.ao.nn.qat`
    - [ ] `torch.nn.qat.modules` → `torch.ao.nn.qat.modules`
    - [ ] `torch.nn.qat.dynamic` → `torch.ao.nn.qat.dynamic`
- [ ] `torch.nn.intrinsic` → `torch.ao.nn.intrinsic`
    - [ ] `torch.nn.intrinsic.modules` → `torch.ao.nn.intrinsic.modules`
    - [ ] `torch.nn.intrinsic.qat` → `torch.ao.nn.intrinsic.qat`
    - [ ] `torch.nn.intrinsic.quantized` → `torch.ao.nn.intrinsic.quantized`
        - [ ] `torch.nn.intrinsic.quantized.modules` → `torch.ao.nn.intrinsic.quantized.modules`
        - [ ] `torch.nn.intrinsic.quantized.dynamic` → `torch.ao.nn.intrinsic.quantized.dynamic`

Majority of the files are just moved to the new location.
However, specific files need to be double checked:

- Documentation @vkuzo
  - docs/source/conf.py
  - docs/source/quantization.rst
- [quantize_fx](torch/ao/quantization/quantize_fx.py) @jerryzh168
- [common test routine](test/quantization/ao_migration/common.py) @HDCharles
- JIT stuff @jamesr66a
  - torch/csrc/jit/passes/hoist_conv_packed_params.cpp
  - torch/csrc/jit/passes/quantization/helper.h
  - torch/csrc/jit/serialization/import_source.cpp

Differential Revision: [D38926012](https://our.internmc.facebook.com/intern/diff/D38926012/)

Differential Revision: [D38926012](https://our.internmc.facebook.com/intern/diff/D38926012)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78713
Approved by: https://github.com/jerryzh168
2022-08-25 16:50:33 +00:00
Sergii Dymchenko
591222f5d9 Fix use-dict-literal lint (#83718)
Fix use-dict-literal pylint suggestions by changing `dict()` to `{}`. This PR should do the change for every Python file except test/jit/test_list_dict.py, where I think the intent is to test the constructor.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83718
Approved by: https://github.com/albanD
2022-08-24 00:26:46 +00:00
PyTorch MergeBot
6a9c02339d Revert "[quant][ao_migration] torch.nn.quantized.modulestorch.ao.nn.quantized.modules (#78713)"
This reverts commit 432f037498.

Reverted https://github.com/pytorch/pytorch/pull/78713 on behalf of https://github.com/janeyx99 due to Reverting for breaking (trunk-only) ios build
2022-08-22 07:32:37 +00:00
PyTorch MergeBot
b1a7b67529 Revert "[quant][ao_migration] torch.nn.quantized.dynamictorch.ao.nn.quantized.dynamic (#78714)"
This reverts commit e6fb97d8ae.

Reverted https://github.com/pytorch/pytorch/pull/78714 on behalf of https://github.com/janeyx99 due to sorry, reverting so https://github.com/pytorch/pytorch/pull/78713 could be cleanly reverted
2022-08-22 07:30:48 +00:00
zaf
e6fb97d8ae [quant][ao_migration] torch.nn.quantized.dynamictorch.ao.nn.quantized.dynamic (#78714)
Context: In order to avoid the cluttering of the `torch.nn` namespace
the quantized modules namespace is moved to `torch.ao.nn`.

The list of the `nn.quantized` files that are being migrated:

- [ ] `torch.nn.quantized` → `torch.ao.nn.quantized`
    - [X] `torch.nn.quantized.functional` → `torch.ao.nn.quantized.functional`
    - [X] `torch.nn.quantized.modules` → `torch.ao.nn.quantized.modules`
    - [X] [Current PR] `torch.nn.quantized.dynamic` → `torch.ao.nn.quantized.dynamic`
    - [ ] `torch.nn.quantized._reference` → `torch.ao.nn.quantized._reference`
- [ ] `torch.nn.quantizable` → `torch.ao.nn.quantizable`
- [ ] `torch.nn.qat` → `torch.ao.nn.qat`
    - [ ] `torch.nn.qat.modules` → `torch.ao.nn.qat.modules`
    - [ ] `torch.nn.qat.dynamic` → `torch.ao.nn.qat.dynamic`
- [ ] `torch.nn.intrinsic` → `torch.ao.nn.intrinsic`
    - [ ] `torch.nn.intrinsic.modules` → `torch.ao.nn.intrinsic.modules`
    - [ ] `torch.nn.intrinsic.qat` → `torch.ao.nn.intrinsic.qat`
    - [ ] `torch.nn.intrinsic.quantized` → `torch.ao.nn.intrinsic.quantized`
        - [ ] `torch.nn.intrinsic.quantized.modules` → `torch.ao.nn.intrinsic.quantized.modules`
        - [ ] `torch.nn.intrinsic.quantized.dynamic` → `torch.ao.nn.intrinsic.quantized.dynamic`

Majority of the files are just moved to the new location.
However, specific files need to be double checked:

- [Documentation](docs/source/quantization-support.rst) @vkuzo
- [Public API test list](test/allowlist_for_publicAPI.json) @peterbell10
- [BC test](test/quantization/bc/test_backward_compatibility.py) @vkuzo
- [IR emitter](torch/csrc/jit/frontend/ir_emitter.cpp) @jamesr66a
- [JIT serialization](torch/csrc/jit/serialization/import_source.cpp) @IvanKobzarev @jamesr66a

Differential Revision: [D36860660](https://our.internmc.facebook.com/intern/diff/D36860660/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D36860660/)!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78714
Approved by: https://github.com/jerryzh168
2022-08-22 05:22:00 +00:00
zaf
432f037498 [quant][ao_migration] torch.nn.quantized.modulestorch.ao.nn.quantized.modules (#78713)
Context: In order to avoid the cluttering of the `torch.nn` namespace
the quantized modules namespace is moved to `torch.ao.nn`.

The list of the `nn.quantized` files that are being migrated:

- [ ] `torch.nn.quantized` → `torch.ao.nn.quantized`
    - [X] `torch.nn.quantized.functional` → `torch.ao.nn.quantized.functional`
    - [X] [Current PR] `torch.nn.quantized.modules` → `torch.ao.nn.quantized.modules`
    - [ ] `torch.nn.quantized.dynamic` → `torch.ao.nn.quantized.dynamic`
    - [ ] `torch.nn.quantized._reference` → `torch.ao.nn.quantized._reference`
- [ ] `torch.nn.quantizable` → `torch.ao.nn.quantizable`
- [ ] `torch.nn.qat` → `torch.ao.nn.qat`
    - [ ] `torch.nn.qat.modules` → `torch.ao.nn.qat.modules`
    - [ ] `torch.nn.qat.dynamic` → `torch.ao.nn.qat.dynamic`
- [ ] `torch.nn.intrinsic` → `torch.ao.nn.intrinsic`
    - [ ] `torch.nn.intrinsic.modules` → `torch.ao.nn.intrinsic.modules`
    - [ ] `torch.nn.intrinsic.qat` → `torch.ao.nn.intrinsic.qat`
    - [ ] `torch.nn.intrinsic.quantized` → `torch.ao.nn.intrinsic.quantized`
        - [ ] `torch.nn.intrinsic.quantized.modules` → `torch.ao.nn.intrinsic.quantized.modules`
        - [ ] `torch.nn.intrinsic.quantized.dynamic` → `torch.ao.nn.intrinsic.quantized.dynamic`

Majority of the files are just moved to the new location.
However, specific files need to be double checked:

- Documentation @vkuzo
  - docs/source/conf.py
  - docs/source/quantization.rst
- [quantize_fx](torch/ao/quantization/quantize_fx.py) @jerryzh168
- [common test routine](test/quantization/ao_migration/common.py) @HDCharles
- JIT stuff @jamesr66a
  - torch/csrc/jit/passes/hoist_conv_packed_params.cpp
  - torch/csrc/jit/passes/quantization/helper.h
  - torch/csrc/jit/serialization/import_source.cpp

Differential Revision: [D36860145](https://our.internmc.facebook.com/intern/diff/D36860145/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78713
Approved by: https://github.com/jerryzh168
2022-08-22 01:38:55 +00:00
Andrew Or
782f3489c6 [Quant][fx][bc-breaking] Integrate BackendConfig with quantization flow (part 2) (#82557)
This is part 2 of the effort to replace `backend_config_dict` with
a python config object, a more formal and robust API that leads to
better user experience. This commit integrates the `BackendConfig`
implemented in part 1 (https://github.com/pytorch/pytorch/pull/81469)
with the existing FX graph mode quantization flow.

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

BC-breaking Notes:

Before:
```
import torch
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.backend_config import ObservationType
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

dtype_config = {
    "input_dtype": torch.quint8,
    "output_dtype": torch.quint8
    "weight_dtype": torch.qint8,
    "bias_dtype": torch.float,
}

backend_config_dict = {
    "name": "my_backend",
    "configs": [{
        "pattern": torch.nn.Linear,
        "observation_type": ObservationType.OUTPUT_USE_DIFFERENT_OBSERVER_AS_INPUT,
        "dtype_configs": [dtype_config],
        "root_module": torch.nn.Linear,
        "reference_quantized_module": torch.nn.quantized._reference.Linear,
        "qat_module": torch.nn.qat.Linear,
    }]
}

m = MyModel()
qconfig_mapping = get_default_qconfig_mapping()
example_inputs = (torch.rand(3, 3),)
m = prepare_fx(
    m, qconfig_mapping, example_inputs,
    backend_config_dict=backend_config_dict)
m = convert_fx(m, backend_config_dict=backend_config_dict)
```

After:
```
import torch
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.backend_config import (
    BackendConfig,
    BackendPatternConfig,
    DTypeConfig,
    ObservationType,
)
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

dtype_config = DTypeConfig(
    input_dtype=torch.quint8,
    output_dtype=torch.quint8
    weight_dtype=torch.qint8,
    bias_dtype=torch.float,
)

backend_config = BackendConfig("my_backend").set_backend_pattern_config(
    BackendPatternConfig(torch.nn.Linear)
        .set_observation_type(ObservationType.OUTPUT_USE_DIFFERENT_OBSERVER_AS_INPUT)
        .add_dtype_config(dtype_config)
        .set_root_module(torch.nn.Linear)
        .set_reference_quantized_module(torch.nn.quantized._reference.Linear)
        .set_qat_module(torch.nn.qat.Linear))

m = MyModel()
qconfig_mapping = get_default_qconfig_mapping()
example_inputs = (torch.rand(3, 3),)
m = prepare_fx(m, qconfig_mapping, example_inputs, backend_config=backend_config)
m = convert_fx(m, backend_config=backend_config)
```

Reviewers: jerryzh168

Subscribers: jerryzh168, supriyar

Differential Revision: [D38471932](https://our.internmc.facebook.com/intern/diff/D38471932)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82557
Approved by: https://github.com/jerryzh168
2022-08-08 18:55:50 +00:00
Andrew Or
c657c3d3ab [Quant][fx] Rename convert_to_reference to convert_to_reference_fx (#81326)
Summary: This commit renames the convert_to_reference function to
convert_to_reference_fx, which is more descriptive and matches
prepare_fx and prepare_qat_fx better.

Test Plan:
python test/test_quantization.py TestQuantizeFx

Reviewers: jerryzh168

Subscribers: jerryh168

Differential Revision: [D37787876](https://our.internmc.facebook.com/intern/diff/D37787876)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81326
Approved by: https://github.com/jerryzh168
2022-07-13 22:18:46 +00:00
Andrew Or
17104d3d7f [Quant][fx][bc-breaking] Replace is_reference with convert_to_reference (#80091)
Summary: This PR removes the is_reference flag from the  existing
convert_fx API and replaces it with a new convert_to_reference
function. This separates (1) converting the prepared model to a
reference model from (2) lowering the reference model to a quantized
model, enabling users to call their custom lowering function for
custom backends. For the native fbgemm backend, for example, the
following are equivalent:

```
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

prepared = prepare_fx(model, ...)
quantized = convert_fx(prepared, ...)
```

```
from torch.ao.quantization.fx import lower_to_fbgemm
from torch.ao.quantization.quantize_fx import (
    prepare_fx,
    convert_to_reference
)

prepared = prepare_fx(model, ...)
reference = convert_to_reference(prepared, ...)
quantized = lower_to_fbgemm(reference, ...)
```

Note that currently `lower_to_fbgemm` takes in two other arguments
that are difficult for users to provide. A future commit will remove
these arguments to make the helper function more user friendly.

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Reviewers: jerryzh168, vkuzo

Subscribers: jerryzh168, vkuzo

Differential Revision: [D37359946](https://our.internmc.facebook.com/intern/diff/D37359946)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80091
Approved by: https://github.com/jerryzh168
2022-06-29 23:01:27 +00:00
Andrew Or
78144b9f35 [Quant][fx][bc-breaking] Replace *custom_config_dict with config objects
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79066

Following https://github.com/pytorch/pytorch/pull/78452,
this commit replaces the following config dicts with python objects:

- prepare_custom_config_dict -> PrepareCustomConfig
- convert_custom_config_dict -> ConvertCustomConfig
- fuse_custom_config_dict -> FuseCustomConfig

This leads to better type safety and better user experience in
notebook settings due to improved auto completion. The new APIs
are as follows:

```
from torch.ao.quantization.fx.custom_config import PrepareCustomConfig

prepare_custom_config = PrepareCustomConfig() \
    .set_float_to_observed_mapping(float_class, observed_class) \
    .set_non_traceable_module_names(["mod1", "mod2"]) \
    .set_non_traceable_module_classes([class1, class2]) \
    .set_input_quantized_indexes([0, 1]) \
    .set_output_quantized_indexes([0]) \
    .set_preserved_attributes(["attr1", "attr2"])

convert_custom_config = ConvertCustomConfig() \
    .set_observed_to_quantized_mapping(observed_class, quantized_class) \
    .set_preserved_attributes(["attr1", "attr2"])

model = prepare_fx(
    model,
    qconfig_mapping,
    example_inputs,
    prepare_custom_config=prepare_custom_config)

model(data)

model = convert_fx(model, convert_custom_config=convert_custom_config)
```

For backwards compatibility, prepare_fx, prepare_qat_fx, and
convert_fx will continue to accept Dicts, which will be converted
to the relevant *CustomConfig object internally.

Note that this commit does not modify existing tests to use the
new API; they will continue to pass in Dicts as before, which still
works but triggers a deprecation warning. This will be handled in
a future commit.

Differential Revision: [D37088095](https://our.internmc.facebook.com/intern/diff/D37088095/)

Approved by: https://github.com/jerryzh168
2022-06-16 17:50:07 +00:00
Jerry Zhang
8225f42a8a [quant][fx][equalization] Fix example_inputs follow ups in test_equalize_fx
Summary:
as a followup to https://github.com/pytorch/pytorch/pull/76496, we defined model specific example_inputs
for the test models in common_quantization.py and used these in test_equalize_fx

Test Plan:
python test/test_quantization.py TestEqualizeFx

Reviewers:

Subscribers:

Tasks:

Tags:

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78314

Approved by: https://github.com/vkuzo
2022-05-26 01:42:24 +00:00
Jerry Zhang
416899d1a9 [quant][fx][bc-breaking] Add required example_args argument to prepare_fx and prepare_qat_fx (#249) (#77608)
Summary:
X-link: https://github.com/facebookresearch/d2go/pull/249

X-link: https://github.com/fairinternal/ClassyVision/pull/104

X-link: https://github.com/pytorch/benchmark/pull/916

X-link: https://github.com/facebookresearch/ClassyVision/pull/791

X-link: https://github.com/facebookresearch/mobile-vision/pull/68

FX Graph Mode Quantization needs to know whether an fx node is a floating point Tensor before it can decide whether to
insert observer/fake_quantize module or not, since we only insert observer/fake_quantize module for floating point Tensors.
Currently we have some hacks to support this by defining some rules like NON_OBSERVABLE_ARG_DICT (https://github.com/pytorch/pytorch/blob/master/torch/ao/quantization/fx/utils.py#L496), but this approach is fragile and we do not plan to maintain it long term in the pytorch code base.

As we discussed in the design review, we'd need to ask users to provide sample args and sample keyword args
so that we can infer the type in a more robust way. This PR starts with changing the prepare_fx and prepare_qat_fx api to require user to either provide
example arguments thrugh example_inputs, Note this api doesn't support kwargs, kwargs can make https://github.com/pytorch/pytorch/pull/76496#discussion_r861230047 (comment) simpler, but
it will be rare, and even then we can still workaround with positional arguments, also torch.jit.trace(https://pytorch.org/docs/stable/generated/torch.jit.trace.html) and ShapeProp: https://github.com/pytorch/pytorch/blob/master/torch/fx/passes/shape_prop.py#L140 just have single positional args, we'll just use a single example_inputs argument for now.

If needed, we can extend the api with an optional example_kwargs. e.g. in case when there are a lot of arguments for forward and it makes more sense to
pass the arguments by keyword

BC-breaking Note:
Before:
```python
m = resnet18(...)
m = prepare_fx(m, qconfig_dict)
# or
m = prepare_qat_fx(m, qconfig_dict)
```
After:
```python
m = resnet18(...)
m = prepare_fx(m, qconfig_dict, example_inputs=(torch.randn(1, 3, 224, 224),))
# or
m = prepare_qat_fx(m, qconfig_dict, example_inputs=(torch.randn(1, 3, 224, 224),))
```

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestQuantizeFxModels

Imported from OSS

**Static Docs Preview: classyvision**
|[Full Site](https://our.intern.facebook.com/intern/staticdocs/eph/D35984526/V30/classyvision/)|

|**Modified Pages**|

Reviewed By: vkuzo, andrewor14

Differential Revision: D35984526

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77608
Approved by: https://github.com/dzdang
2022-05-21 21:03:48 +00:00
mingfeima
92a9c0e3e0 add channels last (2d) support for mkldnn_convolution (#55584)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/55584

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D27941368

Pulled By: VitalyFedyunin

fbshipit-source-id: 7dd6f02a5787efa1995f31cdbd3244b25653840c
(cherry picked from commit bb555ed0fedafd529cb552807326384e95c90df9)
2022-04-20 22:34:44 +00:00
Thiago Crepaldi
9bbe1d632e Fix ONNX ATen fallback for non-caffe2 engines
This PR introduces 3 BC changes:

First, this PR propagates `BUILD_CAFFE2` flag to `libtorch` and `libtorch_python`, which is necessary for non-caffe2 ONNX runtimes when using `ONNX_ATEN_FALLBACK` operator export type.

Second, as a complement of https://github.com/pytorch/pytorch/pull/68490, this PR refactors Caffe2's Aten ops symbolics to consider not only the `operator_export_type` (aka `ONNX_ATEN_FALLBACK`) to emit Caffe2 Aten ops, but also whether `BUILD_CAFFE2` (which is called `torch.onnx._CAFFE2_ATEN_FALLBACK` in python binding) is set.

Lastly, it renames `onnx::ATen` to `aten::ATen` for ONNX spec consistency in a BC fashion.
ONNX doesn't have `ATen` op on its spec, but PyTorch ONNX converter emits them. Non-Caffe2 backend engines would be mislead by such operator's name/domain. A non-ideal workaround would be to have Aten ops handled based on its name and ignore the (non-complaint) domain. Moreover, users could incorrectly file bugs to either ONNX or ONNX Runtime when they inspect the model and notice the presence of an unspecified ONNX operator.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73954
Approved by: https://github.com/BowenBao, https://github.com/malfet, https://github.com/garymm, https://github.com/jiafatom
2022-04-14 23:18:45 +00:00
Digant Desai
09f32eba7a [quant] Add default symmetric qat qconfig for qnnpack (#74507)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74507

* This is the default symmetric qat qconfigs for qnnpack.
* Support for symmetric quantization is not available from other backends.
* Observers are similar to symmetric PTQ qconfigs for qnnpack.

Reviewed By: jerryzh168

Differential Revision: D34804808

fbshipit-source-id: 22c11b89242a98f54029ac195f7b984e42809164
(cherry picked from commit ea751ded1174ba2c2f061bafc81573faaf248a9a)
2022-03-24 16:19:28 +00:00
Jerry Zhang
7ddf212f33 [quant][fx] Fully align convert with the reference model design and simplify the implementation (#73863)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73863

This PR fully aligns the convert function with the design: https://github.com/pytorch/rfcs/blob/master/RFC-0019-Extending-PyTorch-Quantization-to-Custom-Backends.md
and simplifies the implementation of convert function by always produce a reference quantized model (with reference patterns) first,
and then lower the model to a quantized model that is runnable with PyTorch native backend (fbgemm/qnnpack).

This PR makes the convert.py much easier to understand than the previous implementation, and we are able to remove majority of code
in quantization_patterns.py as well (in followup PRs).

Test Plan:
```
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
python test/test_quantization.py TestFXNumericSuiteCoreAPIsModels
```
and other internal/oss regression tests

Imported from OSS

Reviewed By: andrewor14

Differential Revision: D34778506

fbshipit-source-id: 0678b66addf736039a8749b352f6f569caca962b
(cherry picked from commit 33ec9caf23f3ab373d827117efbd9db0668b2437)
2022-03-11 17:11:30 +00:00
Andrew Or
e118d6e59f Add lowering path for LinearReLU module (#71427)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71427

This commit adds a lowering path for the LinearReLU modules
in static quantization mode. This includes torch.nn.qat.Linear,
torch.nn.intrinsic.LinearReLU, and torch.nn.intrinsic.qat.LinearReLU.
Future commits will add support for dynamic quantization and functional
LinearReLU.

Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_linear_module

Imported from OSS

Reviewed By: george-qi

Differential Revision: D33694742

fbshipit-source-id: 19af11f82b1ad8ade0c307498971c29a3f776036
(cherry picked from commit b3f607de43)
2022-02-01 19:31:31 +00:00
Jerry Zhang
082ff25f37 [reland][bc-breaking][quant][be] Refactor fuser_method to include is_qat argument" (#71956)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71956

Pull Request resolved: https://github.com/facebookresearch/mobile-vision/pull/59

Original commit changeset: f3912e210e8c

Original Phabricator Diff: D33178977 (ef501e8fed)

Test Plan:
Please see original diff for test plans

**Static Docs Preview: classyvision**
|[Full Site](https://our.intern.facebook.com/intern/staticdocs/eph/D33833203/V3/classyvision/)|

|**Modified Pages**|

Reviewed By: andrewor14

Differential Revision: D33833203

fbshipit-source-id: 74a8f22730b00aafa6a173b208e635c1d696959e
(cherry picked from commit fb88772b18)
2022-01-31 23:02:22 +00:00
Nikita Shulga
56511f859a Revert D33178977: [bc-breaking][quant][be] Refactor fuser_method to include is_qat argument
Test Plan: revert-hammer

Differential Revision:
D33178977 (ef501e8fed)

Original commit changeset: 0c1499c45526

Original Phabricator Diff: D33178977 (ef501e8fed)

fbshipit-source-id: f3912e210e8c588fdbdc9c3c5f4acf2aa8fe6678
(cherry picked from commit cd62183414)
2022-01-27 03:29:40 +00:00
Jerry Zhang
ef501e8fed [bc-breaking][quant][be] Refactor fuser_method to include is_qat argument (#70009)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70009

Currently we rely on module.training to decide whether we'll do a qat fusion or ptq fusion, this is
not ideal since training flag has nothing to do with quantization, this PR introduces an extra flag `is_qat`
to control this

Note: currently we still has the constraint that when `is_qat` is True, the modules must be in training mode, we
can relax this constraint later

Test Plan:
```
python test/test_quantization.py TestFuseFx
python test/test_quantization.py TestFusion
```

Imported from OSS

**Static Docs Preview: classyvision**
|[Full Site](https://our.intern.facebook.com/intern/staticdocs/eph/D33178977/V36/classyvision/)|

|**Modified Pages**|

Reviewed By: mruberry

Differential Revision: D33178977

fbshipit-source-id: 0c1499c45526971140d9ad58e2994d1edf5ad770
(cherry picked from commit 2d51f9fb28)
2022-01-26 23:33:28 +00:00
Terry Chen
ce3215db70 Fix nnq.dropout in vision mobilenetv3 pretrain model (#71438)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71438

Fix issue https://github.com/pytorch/vision/issues/5198
skip observer for nn.dropout to load pretrain model

Test Plan:
python -c "import torchvision; torchvision.models.quantization.mobilenet_v3_large(pretrained=True, quantize=True)"

Imported from OSS

Reviewed By: HDCharles

Differential Revision: D33641707

fbshipit-source-id: 14ea26557c4ff3b942cf46bf06610db0b8f06b05
(cherry picked from commit 0b8b178d26)
2022-01-22 00:02:48 +00:00
Terry Chen
e7c87e8b44 [quant] fix dropout in FX graph mode quantization (#71043)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71043

fix issue #68250
dropout break fx graph model quantization

Test Plan:
python test/test_quantization.py TestStaticQuantizedModule

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D33490176

fbshipit-source-id: 155546505b28ffc635ada65a1464b9d622dbc235
2022-01-13 15:59:59 -08:00
Jon Morton
123be0e5b7 [fusion] Add ConvTranspose+BN fusion support (#70022)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70022

Add support for fusing ConvTranpose{1,2,3}d with BatchNorm{1,2,3}d. This re-uses the existing fusion logic but adds a "transpose" flag to the fusing function which when enabled will use the appropriate reshape for ConTranspose's transposed weights.

Test Plan: `buck test mode/dev //caffe2/test:quantization -- -r quantization.eager.test_fusion.TestFusion`

Reviewed By: jerryzh168

Differential Revision: D33074405

fbshipit-source-id: 5e9eff1a06d8f98d117e7d18e80da8e842e973b7
2021-12-20 18:42:48 -08:00