Summary:
Previously prepare_fx returned an ObservedGraphModule and convert_fx returned a QuantizedGraphModule. These subclasses existed to preserve attributes, since torch.fx.GraphModule did not preserve them. After https://github.com/pytorch/pytorch/pull/92062
we preserve `model.meta`, so we can store the attributes in `model.meta` instead.
With this, we no longer need to create a new type of GraphModule in these functions and can use GraphModule directly. This
is useful for quantization in the PyTorch 2.0 flow: if other transformations also operate on GraphModule, the quantization passes will compose with them.
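A minimal sketch of what this looks like from the user side (assuming the current torch.ao FX APIs; the exact keys stored in `meta` are internal and may change):
```
import torch
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

model = torch.nn.Sequential(torch.nn.Linear(4, 4)).eval()
example_inputs = (torch.randn(1, 4),)

prepared = prepare_fx(model, get_default_qconfig_mapping(), example_inputs)
print(type(prepared))       # plain torch.fx.GraphModule, no longer ObservedGraphModule
prepared(*example_inputs)   # calibrate

quantized = convert_fx(prepared)
print(type(quantized))      # plain torch.fx.GraphModule, no longer QuantizedGraphModule
# quantization-internal attributes now live in the preserved meta dict
print(list(quantized.meta.keys()))
```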
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestQuantizeFxModels
python test/test_quantization.py TestQuantizePT2E
Imported from OSS
Differential Revision: D42979722
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94412
Approved by: https://github.com/vkuzo
Summary:
This is in preparation for the quantize_pt2e API, which gives users programmability over how they want to quantize their model.
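A heavily hedged sketch of the kind of programmability this enables; the names below are from the quantizer-based API that shipped in later releases (the graph capture entry point and quantizer locations changed across versions, so treat the details as illustrative):
```
import torch
from torch.ao.quantization.quantize_pt2e import prepare_pt2e, convert_pt2e
from torch.ao.quantization.quantizer.xnnpack_quantizer import (
    XNNPACKQuantizer,
    get_symmetric_quantization_config,
)

model = torch.nn.Sequential(torch.nn.Linear(4, 4)).eval()
example_inputs = (torch.randn(1, 4),)
# capture the model as a graph first; the capture API has changed across releases
exported = torch.export.export_for_training(model, example_inputs).module()

# users program how to quantize via a Quantizer object instead of a fixed config
quantizer = XNNPACKQuantizer().set_global(get_symmetric_quantization_config())
prepared = prepare_pt2e(exported, quantizer)
prepared(*example_inputs)   # calibration
quantized = convert_pt2e(prepared)
```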
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizePT2E
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92574
Approved by: https://github.com/jcaip
Summary:
_convert_to_reference_decomposed is a private convert function in the FX graph mode quantization flow that converts
a calibrated/trained model to a reference quantized model with decomposed representations of quantized tensors.
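A minimal usage sketch, assuming the private entry point exercised by the test below (`_convert_to_reference_decomposed_fx`); since this is a private API, names and the output representation may change:
```
import torch
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.quantize_fx import (
    prepare_fx,
    _convert_to_reference_decomposed_fx,  # private API, subject to change
)

model = torch.nn.Sequential(torch.nn.Linear(4, 4)).eval()
example_inputs = (torch.randn(1, 4),)
prepared = prepare_fx(model, get_default_qconfig_mapping(), example_inputs)
prepared(*example_inputs)  # calibrate
reference = _convert_to_reference_decomposed_fx(prepared)
# the reference model expresses quantization with decomposed ops such as
# torch.ops.quantized_decomposed.quantize_per_tensor / dequantize_per_tensor
print(reference.graph)
```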
Test Plan:
python test/test_quantization.py TestQuantizeFx.test__convert_to_reference_decomposed_fx
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87094
Approved by: https://github.com/andrewor14
Summary:
Adds some more clarifications for the arguments, including links to the object docs (QConfigMapping, BackendConfig), and adds types
to the docs.
Test Plan:
```
cd docs
make html
```
and visual inspection of the generated docs
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84587
Approved by: https://github.com/vkuzo
Summary: This PR removes the is_reference flag from the existing
convert_fx API and replaces it with a new convert_to_reference
function. This separates (1) converting the prepared model to a
reference model from (2) lowering the reference model to a quantized
model, enabling users to call their custom lowering function for
custom backends. For the native fbgemm backend, for example, the
following are equivalent:
```
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx
prepared = prepare_fx(model, ...)
quantized = convert_fx(prepared, ...)
```
```
from torch.ao.quantization.fx import lower_to_fbgemm
from torch.ao.quantization.quantize_fx import (
    prepare_fx,
    convert_to_reference,
)
prepared = prepare_fx(model, ...)
reference = convert_to_reference(prepared, ...)
quantized = lower_to_fbgemm(reference, ...)
```
Note that currently `lower_to_fbgemm` takes in two other arguments
that are difficult for users to provide. A future commit will remove
these arguments to make the helper function more user friendly.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Reviewers: jerryzh168, vkuzo
Subscribers: jerryzh168, vkuzo
Differential Revision: [D37359946](https://our.internmc.facebook.com/intern/diff/D37359946)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80091
Approved by: https://github.com/jerryzh168
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79066
Following https://github.com/pytorch/pytorch/pull/78452,
this commit replaces the following config dicts with python objects:
- prepare_custom_config_dict -> PrepareCustomConfig
- convert_custom_config_dict -> ConvertCustomConfig
- fuse_custom_config_dict -> FuseCustomConfig
This leads to better type safety and better user experience in
notebook settings due to improved auto completion. The new APIs
are as follows:
```
from torch.ao.quantization.fx.custom_config import (
    ConvertCustomConfig,
    PrepareCustomConfig,
)

prepare_custom_config = PrepareCustomConfig() \
    .set_float_to_observed_mapping(float_class, observed_class) \
    .set_non_traceable_module_names(["mod1", "mod2"]) \
    .set_non_traceable_module_classes([class1, class2]) \
    .set_input_quantized_indexes([0, 1]) \
    .set_output_quantized_indexes([0]) \
    .set_preserved_attributes(["attr1", "attr2"])

convert_custom_config = ConvertCustomConfig() \
    .set_observed_to_quantized_mapping(observed_class, quantized_class) \
    .set_preserved_attributes(["attr1", "attr2"])

model = prepare_fx(
    model,
    qconfig_mapping,
    example_inputs,
    prepare_custom_config=prepare_custom_config)

model(data)
model = convert_fx(model, convert_custom_config=convert_custom_config)
```
For backwards compatibility, prepare_fx, prepare_qat_fx, and
convert_fx will continue to accept Dicts, which will be converted
to the relevant *CustomConfig object internally.
Note that this commit does not modify existing tests to use the
new API; they will continue to pass in Dicts as before, which still
works but triggers a deprecation warning. This will be handled in
a future commit.
Differential Revision: [D37088095](https://our.internmc.facebook.com/intern/diff/D37088095/)
Approved by: https://github.com/jerryzh168
Summary: https://github.com/pytorch/pytorch/pull/78452 replaced
qconfig_dict with QConfigMapping as the default API for prepare_fx,
prepare_qat_fx, and convert_fx. We should update the docs to reflect
this change as well.
Test Plan:
```
cd docs
make html
cd build/html
python -m http.server
```
Reviewers: jerryzh168, vkuzo
Subscribers: jerryzh168, vkuzo
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78533
Approved by: https://github.com/vkuzo
**Summary:** Previously, FX graph mode quantization configurations
were specified through a dictionary of qconfigs. However, this
API was not in line with other core APIs in PyTorch. This commit
replaces this dictionary with a config object that users will
create and pass to prepare and convert. This leads to better
type safety and better user experience in notebook settings
due to improved auto completion.
The new API is as follows:
```
from torch.ao.quantization import QConfigMapping
from torch.ao.quantization.quantize_fx import prepare_fx
qconfig_mapping = (QConfigMapping()
    .set_global(qconfig)
    .set_object_type(torch.nn.Linear, qconfig)
    .set_module_name_regex("foo.*bar", qconfig)
    .set_module_name("mod", qconfig))
prepare_fx(model, qconfig_mapping)
```
For backwards compatibility, `prepare_fx`, `prepare_qat_fx`,
and `convert_fx` will continue to accept qconfig_dicts, which
will be converted to QConfigMapping objects internally.
Note that this commit does not modify existing tests to use the
new API; they will continue to pass in qconfig_dict as before,
which still works but triggers a deprecation warning. This will
be handled in a future commit.
**Test Plan:**
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
**Reviewers:** jerryzh168, vkuzo
**Subscribers:** jerryzh168, vkuzo
Differential Revision: D36747998
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78452
Approved by: https://github.com/jerryzh168
Summary:
Following https://github.com/pytorch/rfcs/blob/master/RFC-0019-Extending-PyTorch-Quantization-to-Custom-Backends.md, we implemented
the backend configuration for the fbgemm/qnnpack backend. Previously it lived under the fx folder, but we'd like to use it for all the different
workflows, including eager mode, FX graph mode, and define-by-run quantization, so this PR moves it to the torch.ao.quantization namespace so that
it can be shared by the different workflows.
It also moves some FX-specific utility functions to fx/backend_config_utils.py; some files are kept in the fx folder (quantize_handler.py and fuse_handler.py).
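A hedged sketch of how the shared config is consumed after the move, using the names from later releases (the dict-style `backend_config_dict` variants from this PR's time were later replaced by BackendConfig objects):
```
import torch
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.backend_config import get_native_backend_config
from torch.ao.quantization.quantize_fx import prepare_fx

model = torch.nn.Sequential(torch.nn.Conv2d(3, 3, 1), torch.nn.ReLU()).eval()
example_inputs = (torch.randn(1, 3, 8, 8),)
prepared = prepare_fx(
    model,
    get_default_qconfig_mapping(),
    example_inputs,
    backend_config=get_native_backend_config(),  # shared fbgemm/qnnpack configuration
)
```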
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestQuantizeFxModels
python test/test_quantization.py TestAOMigrationQuantization
python test/test_quantization.py TestAOMigrationQuantizationFx
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75823
Approved by: https://github.com/vkuzo
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75389
This seems to have been removed before, so we won't mark this PR as bc-breaking; this use case
is now enabled with the backend_config_dict API.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D35451960
fbshipit-source-id: 21a8f19c1968af44bf4fa603f16ee8c6f5080e5a
(cherry picked from commit 2862f17b57f846b55736bc6b5d10df4256567adf)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75388
This is now replaced with backend_config_dict; we don't want to expose this implementation detail to
users. We'll add docs for backend_config_dict later.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D35451958
fbshipit-source-id: 86e482d0782470ea02408836755cfc8531b8f66e
(cherry picked from commit 072541824b454e30df2b48758f465ebd814b436e)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75387
This is now replaced with backend_config_dict; we don't want to expose this implementation detail to
users. We'll add docs for backend_config_dict later.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D35451955
fbshipit-source-id: 77ede61f1d8f169dc1e1e6d847244ba99a97ab76
(cherry picked from commit 953576259fdc8827437acb6f5d04e584e37a7d64)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75386
This is now replaced with backend_config_dict; we don't want to expose this implementation detail to
users. We'll add docs for backend_config_dict later.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: ezyang
Differential Revision: D35451957
fbshipit-source-id: 52ebb5fb20cd96c1f21410b07c3d0c448c58cdba
(cherry picked from commit ccb38026f14644f9eb43335b7a7de5568c556394)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75377
These options are in `prepare_custom_config_dict`, but we never documented them and didn't find use cases internally,
so it should be OK to remove them.
The same use case can now be served with the `backend_config_dict` API.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D35451961
fbshipit-source-id: 8a44c4518eecd50fab7ea2ff06697527b1cdb049
(cherry picked from commit 964183ed26bd8f367a4cf7fcc991eb519dc31a58)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75401
This commit removes the asserts that require the model passed to prepare_fx to
be in eval mode and the model passed to prepare_qat_fx to be in training mode.
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_prepare_mode
Imported from OSS
Reviewed By: vkuzo, jerryzh168
Differential Revision: D35457100
fbshipit-source-id: 13a55b13d9e389991f69c06c6a70bc51cdebba36
(cherry picked from commit fb0685e0873dc8e807da3213be403b51e8b4a687)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73863
This PR fully aligns the convert function with the design in https://github.com/pytorch/rfcs/blob/master/RFC-0019-Extending-PyTorch-Quantization-to-Custom-Backends.md
and simplifies its implementation by always producing a reference quantized model (with reference patterns) first,
and then lowering that model to a quantized model runnable with the PyTorch native backends (fbgemm/qnnpack).
This makes convert.py much easier to understand than the previous implementation, and it lets us remove the majority of the code
in quantization_patterns.py as well (in follow-up PRs).
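For illustration, a hedged sketch of the shape of a reference quantized pattern for linear (not the exact generated code; convert produces this internally and then lowers it):
```
import torch
import torch.nn.functional as F

def reference_quantized_linear(x, weight, bias, scale, zero_point, out_scale, out_zero_point):
    # reference pattern: quantize -> dequantize around a float op
    x = torch.quantize_per_tensor(x, scale, zero_point, torch.quint8).dequantize()
    w = torch.quantize_per_tensor(weight, 0.1, 0, torch.qint8).dequantize()  # illustrative qparams
    out = F.linear(x, w, bias)
    return torch.quantize_per_tensor(out, out_scale, out_zero_point, torch.quint8)
```
The lowering step then pattern-matches these dequantize -> float op -> quantize subgraphs and replaces them with the corresponding fused quantized kernels (e.g. quantized::linear) for fbgemm/qnnpack.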
Test Plan:
```
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
python test/test_quantization.py TestFXNumericSuiteCoreAPIsModels
```
and other internal/oss regression tests
Imported from OSS
Reviewed By: andrewor14
Differential Revision: D34778506
fbshipit-source-id: 0678b66addf736039a8749b352f6f569caca962b
(cherry picked from commit 33ec9caf23f3ab373d827117efbd9db0668b2437)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73470
As titled; this does not affect user APIs, since we only expose fuse_fx as a public API.
Test Plan:
python test/test_quantization.py TestFuseFx
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D34495260
fbshipit-source-id: 3aa253bc7190e50acc7229186f210901ebc5481b
(cherry picked from commit a88517ff6feff7abbece2234d82fd53e33702237)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72735
We use `get_matched_types` to get the (type) pattern from the matched modules,
and we need to use MatchAllNode instead of type(MatchAllNode) when querying the fuser_method for the pattern.
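A minimal sketch of the distinction (assuming the MatchAllNode sentinel from torch.ao.quantization.utils; its module path has moved across releases):
```
import torch.nn as nn
from torch.ao.quantization.utils import MatchAllNode

# fusion patterns use the sentinel class itself as the wildcard entry...
pattern = (nn.BatchNorm2d, MatchAllNode)
# ...so fuser_method lookups must also key on MatchAllNode,
# not on type(MatchAllNode), which is just the builtin `type`
assert type(MatchAllNode) is type
```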
Test Plan:
TODO
Imported from OSS
Reviewed By: raghuramank10000
Differential Revision: D34180705
fbshipit-source-id: db9b6e791a9f26b70079fddc95fce033052199ab
(cherry picked from commit 01d38afabcb1bfc207dee7d49ee13df500d32fdf)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70009
Currently we rely on module.training to decide whether to do a QAT fusion or a PTQ fusion. This is
not ideal, since the training flag has nothing to do with quantization, so this PR introduces an extra flag `is_qat`
to control it.
Note: we still have the constraint that when `is_qat` is True, the modules must be in training mode; we
can relax this constraint later.
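A minimal sketch of the two outcomes the flag selects between, assuming current torch.ao module paths (users reach them through prepare_fx vs prepare_qat_fx; the is_qat flag itself is internal):
```
import torch
from torch.ao.quantization import (
    get_default_qconfig_mapping,
    get_default_qat_qconfig_mapping,
)
from torch.ao.quantization.quantize_fx import prepare_fx, prepare_qat_fx

def make_model():
    return torch.nn.Sequential(
        torch.nn.Conv2d(3, 3, 1), torch.nn.BatchNorm2d(3), torch.nn.ReLU()
    )

example_inputs = (torch.randn(1, 3, 8, 8),)
# PTQ path (is_qat=False internally): BN is folded away, Conv-BN-ReLU becomes a fused ConvReLU2d
ptq = prepare_fx(make_model().eval(), get_default_qconfig_mapping(), example_inputs)
# QAT path (is_qat=True internally): Conv-BN-ReLU becomes an intrinsic qat ConvBnReLU2d,
# which keeps BN statistics so they can be trained under fake quantization
qat = prepare_qat_fx(make_model().train(), get_default_qat_qconfig_mapping(), example_inputs)
```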
Test Plan:
```
python test/test_quantization.py TestFuseFx
python test/test_quantization.py TestFusion
```
Imported from OSS
Reviewed By: mruberry
Differential Revision: D33178977
fbshipit-source-id: 0c1499c45526971140d9ad58e2994d1edf5ad770
(cherry picked from commit 2d51f9fb28)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70757
This is an initial PR toward preserving stack traces throughout FX
graph mode quantization. It preserves the stack traces for ops
handled by all of the quantize handlers. A future PR will add stack traces
for dtype transitions.
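A hedged sketch of what this means for the produced graph (node.stack_trace is the standard FX node attribute):
```
import torch
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

model = torch.nn.Sequential(torch.nn.Linear(4, 4)).eval()
example_inputs = (torch.randn(1, 4),)
prepared = prepare_fx(model, get_default_qconfig_mapping(), example_inputs)
prepared(*example_inputs)
quantized = convert_fx(prepared)
for node in quantized.graph.nodes:
    # nodes created by the quantize handlers should carry the original op's stack trace
    print(node.op, node.target, bool(node.stack_trace))
```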
Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_stack_trace_preserved
```
Note: the above only tests a single case. In a future PR, once we
expand coverage, we can expand the utility functions to check for stack
traces on all tests.
Imported from OSS
Differential Revision: D33432485
Reviewed By: jerryzh168
Pulled By: vkuzo
fbshipit-source-id: 56c56850393132487430a850fa1def826a9c39c0
(cherry picked from commit c11155b31e)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70151
This supports converting an observed standalone module to a quantized standalone module
in the new convert flow (converting observers to quantize-dequantize operators).
Test Plan:
```
python test/test_quant_trt.py TestConvertFxDoNotUse
```
Imported from OSS
Reviewed By: supriyar
Differential Revision: D33205163
fbshipit-source-id: 01ea44fb2a8ffe30bec1dd5678e7a72797bafafc
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69959
GraphModule is an implementation detail; we don't want to expose it in the quantization APIs.
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_quantized_model_type
Imported from OSS
Reviewed By: supriyar
Differential Revision: D33119103
fbshipit-source-id: d8736ff08b42ee009d6cfd74dcb3f6150f71f3d2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70150
This PR allows users to specify backend_config_dict for standalone modules, in both the prepare and convert steps.
We are adding this now to allow prototyping for some of our customer use cases; a test for the codepath will be added in
a separate PR.
Test Plan:
regression tests
```
python test/test_quantization.py TestQuantizeFx
```
A test that specifies backend_config for a module will be added in a separate PR for the use case we have in mind,
since it requires other features.
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D33205162
fbshipit-source-id: a657cef8e49d99b6a43653141521dc87c33bfd89
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69878
We'll still verify that model.training is True when users call the prepare_qat API.
Relaxing this condition might also mean changing the API for the methods in fuser_method_mapping,
with an additional flag for QAT (currently we just have different fusions for training/eval). I don't think
this is P0; we can revisit if there is a need in the future.
Test Plan:
```
python test/test_quantization.py TestQuantizeFx
```
Imported from OSS
Reviewed By: supriyar
Differential Revision: D33080988
fbshipit-source-id: b13715b91f10454948199323c5d81ef88bb3517f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69335
This PR adds support for configuring fusion with "pattern" and "fuser_method".
This currently only works for simple sequences of two-op patterns; we will extend it in future PRs.
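A heavily hedged sketch of what one such fusion entry might look like in the dict-style config of that era; the exact key names, pattern ordering, and fuser_method signature changed across releases, so every detail below is illustrative:
```
import torch.nn as nn
import torch.ao.nn.intrinsic as nni  # torch.nn.intrinsic in older releases

def fuse_linear_relu(linear, relu):
    # illustrative fuser method: pack the two modules into the intrinsic fused module
    # (later releases also pass an is_qat flag as the first argument)
    return nni.LinearReLU(linear, relu)

linear_relu_fusion_config = {
    # two-op patterns were written in reversed "call order" in the dict-style config
    "pattern": (nn.ReLU, nn.Linear),
    "fuser_method": fuse_linear_relu,
}
# this entry would then be included in the backend_config_dict passed to the prepare APIs
```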
Test Plan:
Regression test on linear-relu fusion:
```
python test/fx2trt/test_quant_trt.py TestQuantizeFxTRTOps
```
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D32816164
fbshipit-source-id: f300b7b96b36908cb94a50a8a17e0e15032509eb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67169
Looks like the doc error only appears after the change has landed.
Test Plan: Imported from OSS
Reviewed By: seemethere
Differential Revision: D31890431
fbshipit-source-id: d40cba082712c4b35704ea15d82fbc4749f85aec