Summary:
Fixes: https://github.com/pytorch/pytorch/issues/78117
Fixes: https://github.com/pytorch/pytorch/issues/73463
This PR adds a normalization pass that converts all args to keyword args in positional order, and updates lowering code that previously
read only node.args so that it reads both args and kwargs.
We also tried to add a test for F.conv2d. Because conv2d matches multiple schemas, normalization does an extra schema match, and because `transform`
uses symbolic values, that match finds no schema, so F.conv2d still fails with runtime errors. We can resolve this later when there is a need.
Another option under consideration is to normalize with real inputs instead of symbolic inputs, relying on inspect.signature rather than operator_schemas (which is based on torchscript).
A brief attempt did not get far: it looks like we cannot get the Python signature for `torch._C._nn.linear`. This may be fixable, but it needs follow-up discussion.
The goal of this PR is just to introduce normalization in our codebase so that downstream code can adapt to it, and to fix the F.linear issue.
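As a rough illustration of what normalizing args to keyword args in positional order means, here is a minimal sketch that uses `inspect.signature` on a plain Python function. The `linear` stand-in and the `normalize_to_kwargs` helper are hypothetical names for this example only. The pass in this PR works on FX nodes and operator schemas, not on this helper.

```python
import inspect
from collections import OrderedDict


def linear(input, weight, bias=None):
    """Hypothetical stand-in with the same parameter names as F.linear."""
    return (input, weight, bias)


def normalize_to_kwargs(fn, args, kwargs):
    """Bind args/kwargs to fn's signature and return kwargs in positional order."""
    bound = inspect.signature(fn).bind(*args, **kwargs)
    bound.apply_defaults()  # fill in omitted parameters such as bias=None
    return OrderedDict(bound.arguments)


# Mixed positional and keyword call, normalized to keyword-only form.
normalized = normalize_to_kwargs(linear, ("x", "w"), {"bias": "b"})
print(list(normalized.items()))
```

After normalization, downstream code can look up `bias` by name no matter how the caller passed it.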
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_normalize_args
Differential Revision: [D37163228](https://our.internmc.facebook.com/intern/diff/D37163228)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79095
Approved by: https://github.com/andrewor14
Summary:
Some of the util functions in FX graph mode quantization throw warnings
such as:
```
/Users/vasiliy/pytorch/torch/ao/quantization/fx/utils.py:410: UserWarning: To copy construct from
a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().
requires_grad_(True), rather than torch.tensor(sourceTensor).
```
This PR fixes the warnings by moving the code to the recommended syntax if the
value is a tensor.
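The change can be sketched roughly as below. `to_tensor` is a hypothetical helper name for this example, not the function touched in utils.py. It assumes the value is either an existing tensor or a plain Python number or list.

```python
import warnings

import torch


def to_tensor(value):
    """Return a tensor copy of value without the copy-construct warning."""
    if isinstance(value, torch.Tensor):
        # Recommended way to copy an existing tensor.
        return value.clone().detach()
    # Plain Python numbers and lists can still go through torch.tensor.
    return torch.tensor(value)


scale = torch.tensor([0.5])
with warnings.catch_warnings():
    warnings.simplefilter("error")  # turn any warning into an exception
    copied = to_tensor(scale)
    from_float = to_tensor(0.5)
```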
Test plan:
```
python test/test_quantization.py -k test_conv_linear_reference
// warning appeared before this PR and disappeared after this PR
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80883
Approved by: https://github.com/jerryzh168
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79066
Following https://github.com/pytorch/pytorch/pull/78452,
this commit replaces the following config dicts with python objects:
- prepare_custom_config_dict -> PrepareCustomConfig
- convert_custom_config_dict -> ConvertCustomConfig
- fuse_custom_config_dict -> FuseCustomConfig
This leads to better type safety and better user experience in
notebook settings due to improved auto completion. The new APIs
are as follows:
```
from torch.ao.quantization.fx.custom_config import (
    ConvertCustomConfig,
    PrepareCustomConfig,
)
from torch.ao.quantization.quantize_fx import convert_fx, prepare_fx

prepare_custom_config = PrepareCustomConfig() \
    .set_float_to_observed_mapping(float_class, observed_class) \
    .set_non_traceable_module_names(["mod1", "mod2"]) \
    .set_non_traceable_module_classes([class1, class2]) \
    .set_input_quantized_indexes([0, 1]) \
    .set_output_quantized_indexes([0]) \
    .set_preserved_attributes(["attr1", "attr2"])

convert_custom_config = ConvertCustomConfig() \
    .set_observed_to_quantized_mapping(observed_class, quantized_class) \
    .set_preserved_attributes(["attr1", "attr2"])

model = prepare_fx(
    model,
    qconfig_mapping,
    example_inputs,
    prepare_custom_config=prepare_custom_config)
model(data)  # calibrate
model = convert_fx(model, convert_custom_config=convert_custom_config)
```
For backwards compatibility, prepare_fx, prepare_qat_fx, and
convert_fx will continue to accept Dicts, which will be converted
to the relevant *CustomConfig object internally.
Note that this commit does not modify existing tests to use the
new API; they will continue to pass in Dicts as before, which still
works but triggers a deprecation warning. This will be handled in
a future commit.
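The backwards-compatibility behavior described above can be pictured with a simplified, torch-free sketch. `SketchPrepareConfig`, `coerce_config`, and the dict keys used here are hypothetical stand-ins for illustration. They are not the actual classes or keys in this commit.

```python
import warnings
from typing import Any, Dict, List, Optional, Union


class SketchPrepareConfig:
    """Hypothetical, simplified stand-in for a *CustomConfig object."""

    def __init__(self) -> None:
        self.non_traceable_module_names: List[str] = []
        self.preserved_attributes: List[str] = []

    def set_non_traceable_module_names(self, names: List[str]) -> "SketchPrepareConfig":
        self.non_traceable_module_names = list(names)
        return self  # returning self allows chained setter calls

    def set_preserved_attributes(self, attrs: List[str]) -> "SketchPrepareConfig":
        self.preserved_attributes = list(attrs)
        return self

    @classmethod
    def from_dict(cls, d: Dict[str, Any]) -> "SketchPrepareConfig":
        # Illustrative keys only; a real config may use different names.
        return (
            cls()
            .set_non_traceable_module_names(d.get("non_traceable_module_names", []))
            .set_preserved_attributes(d.get("preserved_attributes", []))
        )


def coerce_config(
    config: Union[SketchPrepareConfig, Dict[str, Any], None],
) -> SketchPrepareConfig:
    """Accept either form; warn on the legacy dict form and convert it."""
    if config is None:
        return SketchPrepareConfig()
    if isinstance(config, dict):
        warnings.warn(
            "Passing a dict is deprecated; pass a config object instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return SketchPrepareConfig.from_dict(config)
    return config


# Legacy dict form: still works, but emits a deprecation warning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy = coerce_config({"preserved_attributes": ["attr1"]})

# New object form: no warning.
modern = coerce_config(SketchPrepareConfig().set_preserved_attributes(["attr1"]))
```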
Differential Revision: [D37088095](https://our.internmc.facebook.com/intern/diff/D37088095/)
Approved by: https://github.com/jerryzh168
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73274
As noticed in https://discuss.pytorch.org/t/calibration-of-model-in-post-training-static-quantization-using-fx-api/143661/6
and related to https://github.com/pytorch/pytorch/issues/72698, when using fx quantization, if an op like view was used in a
model and its index parameters were passed as variables rather than
hard-coded values, fx would mistakenly insert observers for them. This led to an
error when the observer tried to do tensor-only operations on a
non-tensor. To fix this, an API was added to specify non-tensor
arguments for various ops to enable better dtype propagation.
NON_TENSOR_ARG_DICT is a nested dict. Its outer keys are named tuples
that contain matching parameters for ops with non-tensor args. The
inner dict's keys are dtypes and its values are lists of the arg indices that
take those dtypes. Alternatively, instead of a list, an inner dict
value can be a function that takes the node as an argument and
returns the list of arg indices.
Theoretically this API can support arbitrary functions, but the current
implementation is limited to simpler functions, given that the particular
issue this fixes seems to be rare.
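The shape of such a nested dict can be sketched without torch as follows. `NodeInfo`, `FakeNode`, `SKETCH_NON_TENSOR_ARG_DICT`, and the string dtype keys are hypothetical stand-ins for this example. The entries are illustrative and are not copied from the PR.

```python
from collections import namedtuple

# Hypothetical key type: which kind of node and which target it matches.
NodeInfo = namedtuple("NodeInfo", ["op", "target"])
# Hypothetical stand-in for an FX node.
FakeNode = namedtuple("FakeNode", ["op", "target", "args"])


def all_args_after_first(node):
    """Every argument except the tensor at index 0, e.g. the sizes passed to view."""
    return list(range(1, len(node.args)))


# Outer key: matching parameters. Inner key: dtype (strings here instead of
# torch dtypes). Inner value: a list of arg indices, or a function of the node.
SKETCH_NON_TENSOR_ARG_DICT = {
    NodeInfo("call_method", "view"): {"int": all_args_after_first},
    NodeInfo("call_method", "transpose"): {"int": [1, 2]},
}


def non_tensor_arg_indices(node, dtype):
    """Return the arg indices of node that hold non-tensor values of dtype."""
    entry = SKETCH_NON_TENSOR_ARG_DICT.get(NodeInfo(node.op, node.target), {})
    indices = entry.get(dtype, [])
    return indices(node) if callable(indices) else indices


view_node = FakeNode("call_method", "view", ("x", 2, -1))
print(non_tensor_arg_indices(view_node, "int"))
```

A pass that decides where to insert observers could skip the returned indices, since those arguments are not tensors.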
Note: although torch.unsqueeze and torch.transpose are listed in
quantization_patterns.py, those ops appear to be untraceable by fx. I've
included tests for their cases, but fixing this issue is beyond the scope
of this PR.
Test Plan:
python test/test_quantization.py test_non_reference_size
...
python test/test_quantization.py test_non_reference_<op>
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D34410122
fbshipit-source-id: fc09949ca8a2d6473876a4b6c214eb91e9a9dae2
(cherry picked from commit 3a1375d677b7c98d62b1f5c839645698c39b32b9)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73863
This PR fully aligns the convert function with the design: https://github.com/pytorch/rfcs/blob/master/RFC-0019-Extending-PyTorch-Quantization-to-Custom-Backends.md
and simplifies the implementation of the convert function by always producing a reference quantized model (with reference patterns) first,
and then lowering the model to a quantized model that is runnable with the PyTorch native backends (fbgemm/qnnpack).
This PR makes convert.py much easier to understand than the previous implementation, and lets us remove the majority of the code
in quantization_patterns.py as well (in followup PRs).
Test Plan:
```
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
python test/test_quantization.py TestFXNumericSuiteCoreAPIsModels
```
and other internal/oss regression tests
Imported from OSS
Reviewed By: andrewor14
Differential Revision: D34778506
fbshipit-source-id: 0678b66addf736039a8749b352f6f569caca962b
(cherry picked from commit 33ec9caf23f3ab373d827117efbd9db0668b2437)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70757
This is an initial PR on a way to preserve stack traces throughout FX
graph mode quantization. It preserves the stack traces for ops
for all of the quantize handlers. A future PR will add stack traces
for dtype transitions.
Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_stack_trace_preserved
```
Note: the above only tests a single case. In a future PR, once we
expand coverage, we can expand the utility functions to check for stack
traces on all tests.
Imported from OSS
Differential Revision: D33432485
Reviewed By: jerryzh168
Pulled By: vkuzo
fbshipit-source-id: 56c56850393132487430a850fa1def826a9c39c0
(cherry picked from commit c11155b31e)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69720
This function is also useful for DBR quant, so this PR moves it from
FX utils to common utils.
Test Plan:
```
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeDBR
```
Reviewed By: jerryzh168
Differential Revision: D33003473
Pulled By: vkuzo
fbshipit-source-id: 20360682c69d614a645c14fc29d3ee023d6b2623
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67537
This PR adds support for quantizing torch.addmm to produce a reference quantized pattern,
and also adds support in the backend_config_dict API for specifying which argument index holds the input, weight, and bias of an op:
```
addmm_config = {
    "pattern": torch.addmm,
    "observation_type": ObservationType.OUTPUT_USE_DIFFERENT_OBSERVER_AS_INPUT,
    "dtype_configs": [
        weighted_op_qint8_dtype_config,
    ],
    # a map from input type to input index
    "input_type_to_index": {
        "bias": 0,
        "input": 1,
        "weight": 2,
    }
}
```
This requires some changes to how weight_dtype and bias_dtype are obtained in the type inference stage of prepare, which are added in the previous PR.
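To show how an `input_type_to_index` map like the one above might be consumed, here is a minimal sketch. `split_args_by_type` is a hypothetical helper for this example, and plain strings stand in for the node's arguments.

```python
# Same mapping as in the addmm config: torch.addmm(bias, input, weight).
input_type_to_index = {"bias": 0, "input": 1, "weight": 2}


def split_args_by_type(args, type_to_index):
    """Label each positional argument with its role using the index map."""
    return {kind: args[index] for kind, index in type_to_index.items()}


roles = split_args_by_type(("b", "x", "w"), input_type_to_index)
print(roles)
```

With the roles known, weight and bias dtypes can be looked up per argument instead of assuming a fixed argument order.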
Test Plan:
```
python test/fx2trt/test_quant_trt.py TestQuantizeFxTRT.test_addmm
```
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D32014998
fbshipit-source-id: 8d96c1e8b7ebb2ab385c08a5b1e43f2d5a2cbcbe
Summary:
- [x] Fix the Pyre type checking errors in `torch/quantization/fx/utils.py`
```
torch/quantization/fx/utils.py:490:4 Incompatible variable type [9]: target_module_type is declared to have type `Type[nn.modules.module.Module]` but is used as type `None`.
```
Fixes the issue: [MLH-Fellowship/pyre-check/issues/75](https://github.com/MLH-Fellowship/pyre-check/issues/75)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66311
Reviewed By: pradeep90
Differential Revision: D31506399
Pulled By: 0xedward
fbshipit-source-id: 3d866fba6005452378d4a2613b8689fa2d7a8b67
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61647
`prepare_fx` currently assumes that bias is always a positional argument to
convolutions, and only a keyword argument to other functions. This happens to work
today due to a quirk in how `__torch_function__` is handled for python
functions but shouldn't be considered stable.
Instead, we should support `bias` for both positional and keyword forms.
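A minimal sketch of a lookup that handles both forms is shown below. `get_bias` is a hypothetical helper for this example, not the code in `prepare_fx`. The default `bias_index=2` assumes a signature like `F.linear(input, weight, bias=None)` or `F.conv2d(input, weight, bias=None, ...)`, where bias is the third parameter.

```python
def get_bias(args, kwargs, bias_index=2):
    """Return the bias argument whether it was passed by keyword or by position."""
    if "bias" in kwargs:
        return kwargs["bias"]
    if len(args) > bias_index:
        return args[bias_index]
    return None  # bias was omitted, so the default (None) applies


print(get_bias(("x", "w", "b"), {}))
print(get_bias(("x", "w"), {"bias": "b"}))
```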
cc jerryzh168 jianyuh raghuramank100 jamesr66a vkuzo
Test Plan: Imported from OSS
Reviewed By: ngimel
Differential Revision: D31401360
Pulled By: albanD
fbshipit-source-id: 1e2f53d80e2176b870f326dc498e251e2386136e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65033
1. Move the files:
```
hg mv caffe2/torch/quantization/fx caffe2/torch/ao/quantization/fx
hg mv caffe2/torch/quantization/quantize_fx.py caffe2/torch/ao/quantization/quantize_fx.py
```
2. Create new files
```
touch caffe2/torch/quantization/quantize_fx.py
touch caffe2/torch/quantization/fx/__init__.py
```
3. Import things in the new files
4. Add tests to test/quantization/ao_migration/test_quantization_fx.py;
this is because quantize_fx and fx/*.py contain some fx imports
Test Plan: buck test mode/dev //caffe2/test:quantization
Reviewed By: vkuzo, z-a-f
Differential Revision: D30949749
fbshipit-source-id: 9e5d4d039c8a0a0820bc9040e224f0d2c26886d3