Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73863
This PR fully aligns the convert function with the design: https://github.com/pytorch/rfcs/blob/master/RFC-0019-Extending-PyTorch-Quantization-to-Custom-Backends.md
and simplifies the implementation of the convert function by always producing a reference quantized model (with reference patterns) first,
and then lowering the model to a quantized model that is runnable with the PyTorch native backends (fbgemm/qnnpack).
This PR makes convert.py much easier to understand than the previous implementation, and we are able to remove the majority of the code
in quantization_patterns.py as well (in followup PRs).
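The two-step flow described above (produce a reference model with quantize/dequantize patterns, then lower recognized patterns to backend ops) can be illustrated with a toy sketch; the node names and the lowering rule here are invented for illustration and do not mirror the real convert.py:

```python
# Toy sketch of the two-phase convert flow: first rewrite observers into
# quantize/dequantize ("reference pattern") nodes, then lower reference
# patterns that the native backend understands into fused quantized ops.
# All names here are illustrative, not the real torch.ao.quantization code.

def to_reference(graph):
    """Replace each 'observer' node with a quantize/dequantize pair."""
    out = []
    for node in graph:
        if node == "observer":
            out += ["quantize", "dequantize"]
        else:
            out.append(node)
    return out

def lower(graph):
    """Lower a dequantize -> float_linear -> quantize reference pattern
    into a single quantized_linear op."""
    out, i = [], 0
    while i < len(graph):
        if graph[i:i + 3] == ["dequantize", "float_linear", "quantize"]:
            out.append("quantized_linear")
            i += 3
        else:
            out.append(graph[i])
            i += 1
    return out

# Observers at the input and output of a float linear op.
observed = ["observer", "float_linear", "observer"]
reference = to_reference(observed)
lowered = lower(reference)
```

The point of the split is that the first step is backend-independent, while all fbgemm/qnnpack-specific knowledge lives in the lowering step.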
Test Plan:
```
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
python test/test_quantization.py TestFXNumericSuiteCoreAPIsModels
```
and other internal/oss regression tests
Imported from OSS
Reviewed By: andrewor14
Differential Revision: D34778506
fbshipit-source-id: 0678b66addf736039a8749b352f6f569caca962b
(cherry picked from commit 33ec9caf23f3ab373d827117efbd9db0668b2437)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73470
As titled; this does not affect user APIs, since we only expose fuse_fx as a public API.
Test Plan:
python test/test_quantization.py TestFuseFx
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D34495260
fbshipit-source-id: 3aa253bc7190e50acc7229186f210901ebc5481b
(cherry picked from commit a88517ff6feff7abbece2234d82fd53e33702237)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72735
We use `get_matched_types` to get the (type) pattern from matched modules,
and we need to use MatchAllNode instead of type(MatchAllNode) to query the fuser_method for the pattern.
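The distinction matters because the fuser-method registry is keyed by classes: MatchAllNode is itself the wildcard key, so calling type() on it yields its metaclass and the lookup misses. A minimal illustration (registry contents are invented):

```python
# MatchAllNode is used directly as a wildcard key in the pattern registry.
# For a module instance we key on type(mod), but the wildcard is already a
# class: type(MatchAllNode) is the metaclass `type`, which is never a key.

class MatchAllNode:
    pass

class Linear:
    pass

# Registry keyed by (type) patterns; contents are invented for illustration.
fuser_registry = {
    (Linear, MatchAllNode): "fuse_linear_any",
}

pattern = (Linear, MatchAllNode)
correct_hit = pattern in fuser_registry                       # found
wrong_miss = (Linear, type(MatchAllNode)) in fuser_registry   # metaclass, not found
```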
Test Plan:
TODO
Imported from OSS
Reviewed By: raghuramank10000
Differential Revision: D34180705
fbshipit-source-id: db9b6e791a9f26b70079fddc95fce033052199ab
(cherry picked from commit 01d38afabcb1bfc207dee7d49ee13df500d32fdf)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70009
Currently we rely on module.training to decide whether to do a QAT fusion or a PTQ fusion. This is
not ideal, since the training flag has nothing to do with quantization; this PR introduces an extra flag `is_qat`
to control this.
Note: currently we still have the constraint that when `is_qat` is True, the modules must be in training mode; we
can relax this constraint later.
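A toy sketch of decoupling the fusion choice from module.training via an explicit is_qat argument; the function and class names are invented for illustration:

```python
# Toy sketch: select a QAT vs PTQ fusion based on an explicit is_qat flag
# rather than on module.training. Names are invented for illustration.

class FakeModule:
    def __init__(self, training):
        self.training = training

def fuse_linear_relu(is_qat, linear, relu):
    if is_qat:
        return ("qat_linear_relu", linear, relu)
    return ("fused_linear_relu", linear, relu)

def fuse(is_qat, linear, relu):
    # The caller decides the fusion mode explicitly; training mode is only
    # validated (this PR keeps the constraint that is_qat implies training).
    if is_qat:
        assert linear.training, "is_qat=True requires modules in training mode"
    return fuse_linear_relu(is_qat, linear, relu)

linear, relu = FakeModule(training=True), FakeModule(training=True)
qat_fused = fuse(True, linear, relu)
ptq_fused = fuse(False, linear, relu)
```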
Test Plan:
```
python test/test_quantization.py TestFuseFx
python test/test_quantization.py TestFusion
```
Imported from OSS
Reviewed By: mruberry
Differential Revision: D33178977
fbshipit-source-id: 0c1499c45526971140d9ad58e2994d1edf5ad770
(cherry picked from commit 2d51f9fb28)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70757
This is an initial PR on a way to preserve stack traces throughout FX
graph mode quantization. It preserves the stack traces for ops
for all of the quantize handlers. A future PR will add stack traces
for dtype transitions.
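The mechanism amounts to copying the stack_trace metadata from the original node onto any node created to replace it; a toy sketch (this Node class is invented, the real one is torch.fx.Node):

```python
# Toy sketch: when a quantize handler replaces an op node, copy the
# original node's stack_trace onto the newly created node, so the
# provenance of the op survives the graph rewrite.

class Node:
    def __init__(self, op, stack_trace=None):
        self.op = op
        self.stack_trace = stack_trace

def quantize_op(node):
    new_node = Node("quantized_" + node.op)
    new_node.stack_trace = node.stack_trace  # preserve provenance
    return new_node

orig = Node("linear", stack_trace='File "model.py", line 12, in forward')
new = quantize_op(orig)
```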
Test Plan:
```
python test/test_quantization.py
TestQuantizeFx.test_stack_trace_preserved
```
Note: the above only tests a single case. In a future PR, once we
expand coverage, we can expand the utility functions to check for stack
traces on all tests.
Imported from OSS
Differential Revision: D33432485
Reviewed By: jerryzh168
Pulled By: vkuzo
fbshipit-source-id: 56c56850393132487430a850fa1def826a9c39c0
(cherry picked from commit c11155b31e)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70151
This supports converting an observed standalone module to a quantized standalone module
in the new convert flow (converting observers to quant-dequant operators).
Test Plan:
```
python test/test_quant_trt.py TestConvertFxDoNotUse
```
Imported from OSS
Reviewed By: supriyar
Differential Revision: D33205163
fbshipit-source-id: 01ea44fb2a8ffe30bec1dd5678e7a72797bafafc
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69959
GraphModule is an implementation detail; we don't want to expose it in quantization APIs.
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_quantized_model_type
Imported from OSS
Reviewed By: supriyar
Differential Revision: D33119103
fbshipit-source-id: d8736ff08b42ee009d6cfd74dcb3f6150f71f3d2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70150
This PR allows users to specify backend_config_dict for standalone modules, in both the prepare and convert steps.
We are adding this now to allow prototyping for some of our customer use cases; a test for the codepath will be added in
a separate PR.
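A sketch of what such a configuration could look like; the key name follows the existing standalone-module convention, but the tuple layout and contents here are hypothetical and may not match the exact API of this PyTorch version:

```python
# Hypothetical sketch: a standalone module entry carrying its own
# backend_config_dict inside prepare_custom_config_dict. The tuple layout
# is illustrative, not a documented API.

qconfig_dict = {"": "default_qconfig_placeholder"}

prepare_custom_config_dict = {
    "standalone_module_name": [
        # (module_name, qconfig_dict, prepare_custom_config_dict,
        #  backend_config_dict) -- the last element is what this PR enables
        ("submodule", qconfig_dict, None, {"configs": []}),
    ],
}

entry = prepare_custom_config_dict["standalone_module_name"][0]
```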
Test Plan:
regression tests
```
python test/test_quantization.py TestQuantizeFx
```
A test that specifies backend_config for a module will be added in a separate PR for the use case we have in mind,
since it requires other features.
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D33205162
fbshipit-source-id: a657cef8e49d99b6a43653141521dc87c33bfd89
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69878
But we'll still verify that model.training is True when the user calls the prepare_qat API.
Relaxing this condition might also mean changing the API for the methods in fuser_method_mapping,
with an additional flag for QAT (currently we just have different fusions for training/eval). I don't think
this is P0; we could revisit if there is a need in the future.
Test Plan:
```
python test/test_quantization.py TestQuantizeFx
```
Imported from OSS
Reviewed By: supriyar
Differential Revision: D33080988
fbshipit-source-id: b13715b91f10454948199323c5d81ef88bb3517f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69335
This PR added support for configuring fusion with:
"pattern", "fuser_method"
This currently only works for a simple sequence of two-op patterns; we will extend it in future PRs.
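An illustrative entry of the fusion configuration described above; stand-in classes replace the torch.nn modules, the key names follow the PR description, and the dispatch loop is a toy:

```python
# Illustrative sketch of configuring fusion via "pattern" and "fuser_method".
# Stand-in classes replace torch.nn modules; matching logic is a toy.

class Linear: pass
class ReLU: pass

def fuse_linear_relu(linear, relu):
    return ("linear_relu", linear, relu)

# A simple two-op sequential pattern, as supported by this PR. FX
# quantization patterns are written in reverse order (consumer first),
# so (ReLU, Linear) means relu(linear(x)).
fusion_config = {
    "pattern": (ReLU, Linear),
    "fuser_method": fuse_linear_relu,
}

linear, relu = Linear(), ReLU()
matched = (type(relu), type(linear)) == fusion_config["pattern"]
fused = fusion_config["fuser_method"](linear, relu) if matched else None
```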
Test Plan:
regression test on linear-relu fusion:
```
python test/fx2trt/test_quant_trt.py TestQuantizeFxTRTOps
```
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D32816164
fbshipit-source-id: f300b7b96b36908cb94a50a8a17e0e15032509eb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67169
Looks like the doc error only appears after the PR is landed.
Test Plan: Imported from OSS
Reviewed By: seemethere
Differential Revision: D31890431
fbshipit-source-id: d40cba082712c4b35704ea15d82fbc4749f85aec
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/67152
Test Plan:
```
cd docs
make html
```
Imported from OSS
Reviewed By: supriyar
Differential Revision: D31884570
fbshipit-source-id: 2b521f617c93f6fa08da3387df2d25497293eee6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67066
We'll add it later when the api is ready
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D31849079
fbshipit-source-id: 0c00d08510166b2d897cf1562c7276527319b05c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66878
Currently convert_fx quantizes all layers that have been prepared, depending on the prepare qconfig_dict.
This PR adds support for accepting a variation of qconfig_dict in convert_fx that can be used to skip quantizing certain layers.
This can help with the flow of preparing/observing all operators and quantizing only a subset of them (based on quantization error), to avoid preparing multiple times.
The qconfig_dict passed to convert_fx can only have its values set to `None`, with the keys being the same as what is allowed in the prepare qconfig_dict.
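A sketch of the convert-time qconfig_dict described above: same key types as the prepare qconfig_dict, but every value is None, meaning "skip quantizing this entry". The checking function is a toy, not the real implementation:

```python
# Toy sketch of the convert-time qconfig_dict. The qconfig values are
# placeholders; the validation function is illustrative only.

prepare_qconfig_dict = {
    "": "global_qconfig_placeholder",
    "module_name": [("sub.fc", "qconfig_placeholder")],
}

# At convert time, skip quantizing sub.fc, keep everything else.
convert_qconfig_dict = {
    "module_name": [("sub.fc", None)],
}

def check_convert_qconfig_dict(d):
    """All qconfig values in the convert qconfig_dict must be None."""
    for key, value in d.items():
        if key == "":
            assert value is None
        else:
            for _name, qconfig in value:
                assert qconfig is None

check_convert_qconfig_dict(convert_qconfig_dict)
skipped = [name for name, q in convert_qconfig_dict["module_name"] if q is None]
```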
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_convert_qconfig_dict
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D31808247
fbshipit-source-id: a4f5dca1090f0083fc3fea14aff56924033eb24f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66955
The new convert function is not meant to be used by users; it is a temporary function that
we use to build up the new convert path. We will bring it to feature parity with the old path
and deprecate the old path after that.
Test Plan: Imported from OSS
Reviewed By: anjali411
Differential Revision: D31810488
fbshipit-source-id: 2f65a110506683123350e619c48df090a15570fc
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66925
The current convert_fx implementation uses "The Interpreter Pattern" from https://pytorch.org/docs/stable/fx.html
Two things have changed that make the approach in this PR possible and needed:
1) the original convert implementation was developed for the initial prototype, when fx did not allow mutations; fx now
supports mutations
2) the original convert needs to handle a lot of fbgemm/qnnpack specific logic, which is not needed for reference patterns
Therefore it makes sense for us to write a new convert function just for reference patterns; the implementation
is significantly easier to understand than the original convert implementation.
Current support:
* we should be able to support all non-weighted ops like relu, add, etc.
Missing:
* linear and conv
* some advanced features like standalone modules, input_quantized_idxs, etc.
We will add linear and conv support and start defining backend_config_dict based on this version of convert.
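The difference graph mutation makes can be sketched as follows; the graph here is a toy list of node names, while the real code mutates a torch.fx.Graph in place:

```python
# Toy contrast with the interpreter approach: instead of re-executing and
# rebuilding the graph node by node, the new convert mutates the existing
# graph in place, swapping observer nodes for quantize/dequantize pairs.

def convert_by_mutation(graph):
    """Mutate the node list in place (possible now that fx allows mutation)."""
    i = 0
    while i < len(graph):
        if graph[i] == "observer":
            graph[i:i + 1] = ["quantize", "dequantize"]
            i += 2
        else:
            i += 1
    return graph

g = ["relu", "observer", "add", "observer"]
convert_by_mutation(g)
```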
Test Plan:
python test/test_quantization.py TestQuantizeFxOpsNew
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D31786241
fbshipit-source-id: 2a32156eb6d3c5271cb44906cd863055785fb5d4
Summary:
- [x] Fix the Pyre type checking errors in `torch/ao/quantization/quantize_fx.py`
```
torch/quantization/quantize_fx.py:41:8 Incompatible variable type [9]: fuse_custom_config_dict is declared to have type `Dict[str, typing.Any]` but is used as type `None`.
torch/quantization/quantize_fx.py:143:16 Incompatible variable type [9]: prepare_custom_config_dict is declared to have type `Dict[str, typing.Any]` but is used as type `None`.
torch/quantization/quantize_fx.py:144:16 Incompatible variable type [9]: equalization_qconfig_dict is declared to have type `Dict[str, typing.Any]` but is used as type `None`.
torch/quantization/quantize_fx.py:206:8 Incompatible variable type [9]: prepare_custom_config_dict is declared to have type `Dict[str, typing.Any]` but is used as type `None`.
torch/quantization/quantize_fx.py:230:12 Incompatible variable type [9]: fuse_custom_config_dict is declared to have type `Dict[str, typing.Any]` but is used as type `None`.
torch/quantization/quantize_fx.py:268:8 Incompatible variable type [9]: prepare_custom_config_dict is declared to have type `Dict[str, typing.Any]` but is used as type `None`.
torch/quantization/quantize_fx.py:269:8 Incompatible variable type [9]: equalization_qconfig_dict is declared to have type `Dict[str, typing.Any]` but is used as type `None`.
torch/quantization/quantize_fx.py:427:8 Incompatible variable type [9]: prepare_custom_config_dict is declared to have type `Dict[str, typing.Any]` but is used as type `None`.
torch/quantization/quantize_fx.py:464:8 Incompatible variable type [9]: convert_custom_config_dict is declared to have type `Dict[str, typing.Any]` but is used as type `None`.
torch/quantization/quantize_fx.py:486:8 Incompatible variable type [9]: convert_custom_config_dict is declared to have type `Dict[str, typing.Any]` but is used as type `None`.
torch/quantization/quantize_fx.py:547:8 Incompatible variable type [9]: convert_custom_config_dict is declared to have type `Dict[str, typing.Any]` but is used as type `None`.
```
Fixes the issue: [MLH-Fellowship/pyre-check/issues/76](https://github.com/MLH-Fellowship/pyre-check/issues/76)
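The standard fix for these errors is to declare the defaulted dict parameters as Optional; a minimal sketch of the pattern (function name taken from the errors above, body simplified to just the normalization):

```python
# The Pyre errors come from annotating a parameter as Dict[str, Any] while
# defaulting it to None. The fix is an Optional annotation plus a None check.
from typing import Any, Dict, Optional

def fuse_fx(model, fuse_custom_config_dict: Optional[Dict[str, Any]] = None):
    # Sketch only: the real function performs fusion; here we just show
    # the None-to-empty-dict normalization that satisfies the type checker.
    if fuse_custom_config_dict is None:
        fuse_custom_config_dict = {}
    return fuse_custom_config_dict

defaulted = fuse_fx(object())
explicit = fuse_fx(object(), {"additional_fuser_method_mapping": {}})
```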
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66804
Reviewed By: onionymous
Differential Revision: D31738171
Pulled By: 0xedward
fbshipit-source-id: 00d4c5749c469aff39a1531365461ced747e52fc
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66379
Description:
Creates a quantization API reference and fixes all the docblock errors.
This is #66122 to #66210 squashed together
Test Plan:
```
cd docs
make html
python -m http.server
// open webpage, inspect it, looks good
```
Reviewed By: ejguan
Differential Revision: D31543172
Pulled By: vkuzo
fbshipit-source-id: 9131363d6528337e9f100759654d3f34f02142a9
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66122
Description:
Adds a documentation page for FX graph mode quantization APIs which
reads from the docstrings in `quantize_fx`, and links it from the main
quantization documentation page.
Also, updates the docstrings in `quantize_fx` to render well with reStructuredText.
Test Plan:
```
cd docs
make html
python -m http.server
// open webpage, inspect it, looks good
```
Reviewed By: dagitses
Differential Revision: D31447612
Pulled By: vkuzo
fbshipit-source-id: 07d0a6137f1537af82dce0a729f9617efaa714a0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66058
After the initial migration from `torch.quantization` to `torch.ao.quantization`, some of the files did not change.
This happened because the migration was done in parallel, and some of the files were landed while the others were still in the original location.
This is the last fix in the AO migration phase 1, which completely enables the ao.quantization namespace.
Test Plan: `python test/test_quantization.py`
Reviewed By: vkuzo
Differential Revision: D31366066
Pulled By: z-a-f
fbshipit-source-id: bf4a74885be89d098df2d87e685795a2a64026c5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65033
1. Move the file:
```
hg mv caffe2/torch/quantization/fx caffe2/torch/ao/quantization/fx
hg mv caffe2/torch/quantization/quantize_fx.py caffe2/torch/ao/quantization/quantize_fx.py
```
2. Create new files
```
touch caffe2/torch/quantization/quantize_fx.py
touch caffe2/torch/quantization/fx/__init__.py
```
3. Import the moved symbols in the new files
4. Add tests to test/quantization/ao_migration/test_quantization_fx.py,
since we have some fx imports in quantize_fx and fx/*.py
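Step 3 amounts to making the old location a thin re-export of the new one, so old import paths keep working. A runnable toy using ModuleType objects to stand in for the two files (the real shim is just an import statement in torch/quantization/quantize_fx.py):

```python
# Toy illustration of step 3: the file left at the old location re-exports
# names from the new torch.ao location. ModuleType objects stand in for
# the two files; prepare_fx here is a placeholder, not the real function.
import types

# "torch/ao/quantization/quantize_fx.py": the real implementation lives here.
new_location = types.ModuleType("torch.ao.quantization.quantize_fx")
def prepare_fx(model):
    return ("prepared", model)
new_location.prepare_fx = prepare_fx

# "torch/quantization/quantize_fx.py": backward-compatibility re-export.
old_location = types.ModuleType("torch.quantization.quantize_fx")
old_location.prepare_fx = new_location.prepare_fx

# Old import paths resolve to the very same function object.
same_object = old_location.prepare_fx is new_location.prepare_fx
```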
Test Plan: buck test mode/dev //caffe2/test:quantization
Reviewed By: vkuzo, z-a-f
Differential Revision: D30949749
fbshipit-source-id: 9e5d4d039c8a0a0820bc9040e224f0d2c26886d3