Summary: Previously, we automatically moved the model to CPU in
torch.ao.quantization.fx.convert to work around the issue where
certain functions called by convert expect CPU arguments. This
commit pushes this responsibility to the caller, since it is the
user's decision which device to use.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
BC-breaking Notes:
Before:
```
model = resnet18(...)
model = prepare_fx(model, qconfig_mapping, example_inputs)
... # calibrate
model = convert_fx(model)
```
After:
```
model = resnet18(...)
model.cpu()
model = prepare_fx(model, qconfig_mapping, example_inputs)
... # calibrate
model = convert_fx(model)
```
Reviewers: jerryzh168
Subscribers: jerryzh168
Differential Revision: [D37528830](https://our.internmc.facebook.com/intern/diff/D37528830)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80555
Approved by: https://github.com/jerryzh168
Add prelu op and module for quantized CPU backend.
The PR includes (a usage sketch follows the list):
- Quantized version of prelu op
- Native prelu kernel for quantized CPU
- Prelu modules in `nn` and `nn.quantized`
- FX support for prelu
- Unit tests
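A minimal usage sketch, illustrative only; the qconfig-mapping helper and the exact module placement are assumptions based on the torch.ao.quantization APIs described elsewhere in this log, not part of this PR's text:
```
# Illustrative sketch: quantize a tiny model containing nn.PReLU with FX graph
# mode quantization. The qconfig-mapping helper is an assumption, not from this PR.
import torch
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

class Model(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(4, 4)
        self.prelu = torch.nn.PReLU()

    def forward(self, x):
        return self.prelu(self.linear(x))

model = Model().eval()
example_inputs = (torch.randn(1, 4),)
prepared = prepare_fx(model, get_default_qconfig_mapping("fbgemm"), example_inputs)
prepared(*example_inputs)  # calibrate
quantized = convert_fx(prepared)  # PReLU should be swapped for its quantized counterpart
```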
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73491
Approved by: https://github.com/jerryzh168
Summary: This commit adds qconfigs with special observers for fixed
qparams ops in get_default_qconfig_mapping and
get_default_qat_qconfig_mapping. For correctness, we also require
users to use these special observers when we detect these fixed
qparams ops during prepare.
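An illustrative sketch, not from the original commit message; it assumes the QConfigMapping.object_type_qconfigs attribute and that torch.nn.Sigmoid is among the fixed qparams ops:
```
# Hedged sketch: fixed-qparams ops such as torch.nn.Sigmoid should now map to
# qconfigs whose observers pin the expected scale/zero_point.
import torch
from torch.ao.quantization import get_default_qconfig_mapping

qconfig_mapping = get_default_qconfig_mapping("fbgemm")
print(qconfig_mapping.object_type_qconfigs.get(torch.nn.Sigmoid))
```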
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Reviewers: jerryzh168, vkuzo
Subscribers: jerryzh168, vkuzo
Differential Revision: [D37396379](https://our.internmc.facebook.com/intern/diff/D37396379)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80184
Approved by: https://github.com/jerryzh168
Summary: This PR removes the is_reference flag from the existing
convert_fx API and replaces it with a new convert_to_reference
function. This separates (1) converting the prepared model to a
reference model from (2) lowering the reference model to a quantized
model, enabling users to call their custom lowering function for
custom backends. For the native fbgemm backend, for example, the
following are equivalent:
```
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx
prepared = prepare_fx(model, ...)
quantized = convert_fx(prepared, ...)
```
```
from torch.ao.quantization.fx import lower_to_fbgemm
from torch.ao.quantization.quantize_fx import (
    prepare_fx,
    convert_to_reference,
)
prepared = prepare_fx(model, ...)
reference = convert_to_reference(prepared, ...)
quantized = lower_to_fbgemm(reference, ...)
```
Note that currently `lower_to_fbgemm` takes in two other arguments
that are difficult for users to provide. A future commit will remove
these arguments to make the helper function more user friendly.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Reviewers: jerryzh168, vkuzo
Subscribers: jerryzh168, vkuzo
Differential Revision: [D37359946](https://our.internmc.facebook.com/intern/diff/D37359946)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80091
Approved by: https://github.com/jerryzh168
Summary:
In https://github.com/pytorch/pytorch/pull/74137, the MKLDNN
quantized backend was added to PyTorch.
Sometime in the past couple of days, MKLDNN got enabled on my Mac OS
machine. This uncovered issues in FX graph mode quantization testing,
as we were only testing for fbgemm and qnnpack, and some of the tests
that were assuming fbgemm started silently going through the MKLDNN
path. Since the requirements for MKLDNN are different, the tests started
to fail.
This PR unbreaks the minimal set of tests needed to get a clean
test run on my machine.
In the future, it would be great to add testing for MKLDNN specifically,
and also audit all of the current quantization tests which are assuming
fbgemm to set the backend properly.
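A hedged sketch of the kind of explicit backend pinning such an audit would add; the helpers are standard torch.backends.quantized / torch.ao.quantization APIs, not text from this PR:
```
# Hedged sketch: pin the quantized engine explicitly in a test instead of relying
# on whichever backend happens to be compiled in on the current machine.
import torch
from torch.ao.quantization import get_default_qconfig

if "fbgemm" in torch.backends.quantized.supported_engines:
    torch.backends.quantized.engine = "fbgemm"
    qconfig = get_default_qconfig("fbgemm")
```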
Test plan:
```
python test/test_quantization.py -k Fx
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79718
Approved by: https://github.com/jerryzh168
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79066
Following https://github.com/pytorch/pytorch/pull/78452,
this commit replaces the following config dicts with python objects:
- prepare_custom_config_dict -> PrepareCustomConfig
- convert_custom_config_dict -> ConvertCustomConfig
- fuse_custom_config_dict -> FuseCustomConfig
This leads to better type safety and better user experience in
notebook settings due to improved auto completion. The new APIs
are as follows:
```
from torch.ao.quantization.fx.custom_config import (
    ConvertCustomConfig,
    PrepareCustomConfig,
)

prepare_custom_config = PrepareCustomConfig() \
    .set_float_to_observed_mapping(float_class, observed_class) \
    .set_non_traceable_module_names(["mod1", "mod2"]) \
    .set_non_traceable_module_classes([class1, class2]) \
    .set_input_quantized_indexes([0, 1]) \
    .set_output_quantized_indexes([0]) \
    .set_preserved_attributes(["attr1", "attr2"])

convert_custom_config = ConvertCustomConfig() \
    .set_observed_to_quantized_mapping(observed_class, quantized_class) \
    .set_preserved_attributes(["attr1", "attr2"])

model = prepare_fx(
    model,
    qconfig_mapping,
    example_inputs,
    prepare_custom_config=prepare_custom_config)
model(data)
model = convert_fx(model, convert_custom_config=convert_custom_config)
```
For backwards compatibility, prepare_fx, prepare_qat_fx, and
convert_fx will continue to accept Dicts, which will be converted
to the relevant *CustomConfig object internally.
Note that this commit does not modify existing tests to use the
new API; they will continue to pass in Dicts as before, which still
works but triggers a deprecation warning. This will be handled in
a future commit.
Differential Revision: [D37088095](https://our.internmc.facebook.com/intern/diff/D37088095/)
Approved by: https://github.com/jerryzh168
Summary: This follows https://github.com/pytorch/pytorch/pull/78452,
which replaced the qconfig_dict with QConfigMapping. This PR
additionally replaces get_default_*qconfig_dict with
get_default_*qconfig_mapping. For backward compatibility, we
deprecate the old functions instead of removing them.
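A hedged migration sketch (model and example_inputs are placeholders; the helper names follow torch.ao.quantization after this change):
```
# Hedged sketch: the new mapping getter replaces get_default_qconfig_dict.
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.quantize_fx import prepare_fx

qconfig_mapping = get_default_qconfig_mapping("fbgemm")
model = prepare_fx(model, qconfig_mapping, example_inputs)
```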
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Reviewers: jerryzh168, vkuzo
Subscribers: jerryzh168, vkuzo, supriyar
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79618
Approved by: https://github.com/jerryzh168
Summary:
The fbgemm and qnnpack backends mostly support ops with quint8 activations.
Historically, the default backend config has included ops with fp16 activations
for other backends. This PR keeps the old config under a different name to keep
the functionality tested, and makes the default config match fbgemm/qnnpack ops.
Test plan:
```
python test/test_quantization.py -k TestQuantizeFx
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78528
Approved by: https://github.com/andrewor14
**Summary:** Previously, FX graph mode quantization configurations
were specified through a dictionary of qconfigs. However, this
API was not in line with other core APIs in PyTorch. This commit
replaces this dictionary with a config object that users will
create and pass to prepare and convert. This leads to better
type safety and better user experience in notebook settings
due to improved auto completion.
The new API is as follows:
```
import torch
from torch.ao.quantization import QConfigMapping
from torch.ao.quantization.quantize_fx import prepare_fx

qconfig_mapping = QConfigMapping() \
    .set_global(qconfig) \
    .set_object_type(torch.nn.Linear, qconfig) \
    .set_module_name_regex("foo.*bar", qconfig) \
    .set_module_name("mod", qconfig)

prepare_fx(model, qconfig_mapping, example_inputs)
```
For backwards compatibility, `prepare_fx`, `prepare_qat_fx`,
and `convert_fx` will continue to accept qconfig_dicts, which
will be converted to QConfigMapping objects internally.
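For illustration, a hedged sketch of the deprecated dict form that remains accepted; the key names are recalled from the pre-QConfigMapping schema and are not quoted from this commit:
```
# Hedged sketch of the deprecated qconfig_dict equivalent of the mapping above.
qconfig_dict = {
    "": qconfig,                                   # global
    "object_type": [(torch.nn.Linear, qconfig)],
    "module_name_regex": [("foo.*bar", qconfig)],
    "module_name": [("mod", qconfig)],
}
prepare_fx(model, qconfig_dict, example_inputs)  # still accepted, emits a deprecation warning
```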
Note that this commit does not modify existing tests to use the
new API; they will continue to pass in qconfig_dict as before,
which still works but triggers a deprecation warning. This will
be handled in a future commit.
**Test Plan:**
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
**Reviewers:** jerryzh168, vkuzo
**Subscribers:** jerryzh168, vkuzo
Differential Revision: D36747998
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78452
Approved by: https://github.com/jerryzh168
Summary:
The FakeQuantize class has quant_min/quant_max and activation_post_process
attributes, the latter of which already includes quant_min/quant_max. As such,
we can remove quant_min/quant_max from FakeQuantize and use
FakeQuantize.activation_post_process.quant_m* directly.
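A small hedged sketch, not from the commit message, of reading the values through the attached observer after this change:
```
# Hedged sketch: quant_min/quant_max now live on the attached observer and are
# reachable through activation_post_process.
from torch.ao.quantization import FakeQuantize, MovingAverageMinMaxObserver

fq = FakeQuantize(observer=MovingAverageMinMaxObserver, quant_min=0, quant_max=255)
print(fq.activation_post_process.quant_min, fq.activation_post_process.quant_max)
```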
Test plan:
```
python test/test_quantization.py
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76674
Approved by: https://github.com/vkuzo
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76637
The previous names `default_affine_fixed_qparams_observer`
and `default_symmetric_fixed_qparams_observer` were uninformative, and users had to read
the definitions in order to understand what these observers are. The new
naming convention reveals information about the range of the observers.
Analogous changes were also made for
`default_symmetric_fixed_qparams_fake_quant` and
`default_affine_fixed_qparams_fake_quant`.
Test Plan:
```
python test/test_quantization.py
```
Differential Revision: D36054169
Reviewed By: vkuzo
Pulled By: dzdang
fbshipit-source-id: 215f7786a4b7abda7327f17cc61735697ec5cca9
(cherry picked from commit 21a4e6eda4467c8adca7fd534a506a14e975f9cf)
Summary: Calling `prepare_fx` with `get_default_qconfig_dict`
failed for models with fused modules, such as `ConvReLU2d`.
This commit fixes this by adding qconfig entries for ReLU
and BatchNorm as well.
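A hedged sketch of the previously failing setup; the newer qconfig-mapping helper is used here as a stand-in for get_default_qconfig_dict:
```
# Hedged sketch: an eager-fused ConvReLU2d prepared with the default qconfig mapping.
import torch
from torch.ao.quantization import fuse_modules, get_default_qconfig_mapping
from torch.ao.quantization.quantize_fx import prepare_fx

model = torch.nn.Sequential(torch.nn.Conv2d(3, 3, 3), torch.nn.ReLU()).eval()
model = fuse_modules(model, [["0", "1"]])  # produces nn.intrinsic.ConvReLU2d
example_inputs = (torch.randn(1, 3, 8, 8),)
prepared = prepare_fx(model, get_default_qconfig_mapping("fbgemm"), example_inputs)
```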
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_qconfig_dict_with_fused_modules
Reviewers: jerryzh168
Subscribers: jerryzh168, vkuzo
Issue: https://github.com/pytorch/pytorch/issues/75825
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75838
Approved by: https://github.com/jerryzh168
Summary:
Previously the list of QAT modules, fused modules, etc. was hardcoded in the convert code; in this PR we get this information
from backend_config_dict instead.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestQuantizeFxModels
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75520
Approved by: https://github.com/vkuzo
Summary:
Previously we were still relying on the registration mechanism and the default quantize handlers that were registered.
Now that we have moved all registrations to backend_config_dict, we can get all quant patterns from backend_config_dict directly.
This PR enables using the native backend_config_dict everywhere in prepare when backend_config_dict is None; we'll
make similar changes in convert as well.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75469
Approved by: https://github.com/vkuzo
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75389
This seems to have been removed before, so we won't mark this PR as BC-breaking; this use case
is now enabled with the backend_config_dict API.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D35451960
fbshipit-source-id: 21a8f19c1968af44bf4fa603f16ee8c6f5080e5a
(cherry picked from commit 2862f17b57f846b55736bc6b5d10df4256567adf)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75401
This commit removes asserts that require prepare_fx to
be run in eval mode and prepare_qat_fx to be run in training mode.
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_prepare_mode
Imported from OSS
Reviewed By: vkuzo, jerryzh168
Differential Revision: D35457100
fbshipit-source-id: 13a55b13d9e389991f69c06c6a70bc51cdebba36
(cherry picked from commit fb0685e0873dc8e807da3213be403b51e8b4a687)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75314
This is a refactor to use backend_config_dict for operators with fixed quantization parameters.
The API is not final yet; we'll update it after we have moved everything to backend_config_dict.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D35423790
fbshipit-source-id: a69ce19340e2e3c996f1435b887ba122de85f22f
(cherry picked from commit 5d35983a3bac4281f8636f69ffb68adb358e9a5f)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75258
As titled: the remaining registrations are for fp16 ops, which are no longer used.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: andrewor14
Differential Revision: D35403588
fbshipit-source-id: fc328d42f4cb80901ed545a11fdde49ee7ff8b2e
(cherry picked from commit fbe2db090cf8d1221dd37d19636058d8dd44c728)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75241
A previous PR enabled operator.add in backend_config_dict; this
PR moves the rest of the binary ops to backend_config_dict.
There are some ops left that are not needed (previously fp16 ops); we
will move them in a following PR.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
Imported from OSS
Reviewed By: bdhirsh
Differential Revision: D35403589
fbshipit-source-id: 663703b310944a6b7c5ade6d07a4d938a6ca082b
(cherry picked from commit 5a76ce031872c4fed5fcab5bb3c84a9394b01118)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75135
Some operators have fixed quantization parameters; this PR adds support for overriding the
qconfig in backend_config_dict.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D35334279
fbshipit-source-id: 390510bd8fc2d61004c36c54390989583e6519ce
(cherry picked from commit ccf9bcd7eb4564ec97c5e0548b8ee926f640360b)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74843
is_output_quantized is used to check whether we should quantize the op based on the dtype configuration in the qconfig and what
is supported by the backend; we skip inserting an observer if the dtype configuration is not supported by the backend.
This is now handled by backend_config_dict, so we can remove this function.
We also previously supported fp16 static quantization for some ops for one of our internal use cases; it is no longer
required, so we can remove those as well.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: andrewor14
Differential Revision: D35190541
fbshipit-source-id: 623d961810737ec01e1f8b269ec48a6a99bb284a
(cherry picked from commit a405998c60c0146dbd5feef60e2d5cb3b0aa289c)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74277
See issue: https://github.com/pytorch/pytorch/issues/74240.
This fixes that issue by skipping the children of untraceable modules during
propagate_qconfig. This required extending that function to take
prepare_custom_config_dict as an optional argument.
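A hedged sketch using the newer PrepareCustomConfig API shown earlier in this log; model, qconfig_mapping, and example_inputs are placeholders, and the original fix operated on prepare_custom_config_dict:
```
# Hedged sketch: mark a submodule as non-traceable so prepare skips it, and with
# this fix its children, during qconfig propagation.
from torch.ao.quantization.fx.custom_config import PrepareCustomConfig
from torch.ao.quantization.quantize_fx import prepare_qat_fx

prepare_custom_config = PrepareCustomConfig().set_non_traceable_module_names(["untraceable_submodule"])
prepared = prepare_qat_fx(
    model,
    qconfig_mapping,
    example_inputs,
    prepare_custom_config=prepare_custom_config)
```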
Test Plan:
python test/test_quantization.py
python test/test_quantization.py TestQuantizeFx.test_qat_skip_untraced
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D34916074
fbshipit-source-id: 11caba2cbf78566fb51adf698b01bbba0275de28
(cherry picked from commit 5324c48e4c3277bb12a716a4408151c86006ee47)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74600
Following https://github.com/pytorch/pytorch/pull/74210, this PR adds support for some ops
using the DefaultNodeQuantizeHandler in the backend_config_dict definition for the PyTorch native backend.
TODO: there are still a few ops we haven't handled through the backend_config_dict path: gelu and softmax. We need to discuss whether we still need them; if so, we can change the test
to use backend_config_dict and remove the DefaultNodeQuantizeHandler after that.
Test Plan:
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: andrewor14
Differential Revision: D35071437
fbshipit-source-id: 70351d2810ca1ac7dc09d4a9c239f6757ccb51ca
(cherry picked from commit 5e68f755a32ba7d90d6c73db9c2017f9c58d7fa5)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74510
Previously we required the dequantize node before a custom module to have exactly one user. This is because we were removing that dequantize node
while transforming an observed custom module into a quantized custom module. In fact we don't need to remove it;
we can just change the custom module's input to the quantize node instead. If the dequantize node only has one user, it will be removed
by the dead code elimination pass that was added recently.
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_custom_module_class_input_has_multiple_users
Imported from OSS
Reviewed By: dzdang
Differential Revision: D35034626
fbshipit-source-id: eea9fbf9fb34c61f114c6431377be347632ce36d
(cherry picked from commit 2878085a56bc529afef5e533bc5f49079d4adc52)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74364
If an input is used multiple times by modules that are dynamically quantized:
```
x -- linear1
\-- linear2
```
we'll insert quantize_per_tensor_dynamic and dequantize for the input, and we have a duplication pass
that duplicates the dequantize ops for pattern matching:
```
x - quantize_per_tensor_dynamic - dequantize1 - linear1
\----- dequantize2 - linear2
```
But the lowering code also has a check that skips the pattern when quantize_per_tensor_dynamic is used by multiple nodes,
so the pattern is not recognized. We need to duplicate quantize_per_tensor_dynamic as well in this case
to recover both patterns:
```
x - quantize_per_tensor_dynamic1 -- dequantize1 -- linear1
\- quantize_per_tensor_dynamic2 -- dequantize2 -- linear2
```
so that they can be fused into dynamic linear:
```
x - linear_dynamic1
\-- linear_dynamic2
```
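A hedged sketch, not from the commit message, of the model shape this fix targets, using the QConfigMapping API shown earlier in this log:
```
# Hedged sketch: one input feeding two dynamically quantized linear layers.
import torch
from torch.ao.quantization import QConfigMapping, default_dynamic_qconfig
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

class TwoLinears(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear1 = torch.nn.Linear(4, 4)
        self.linear2 = torch.nn.Linear(4, 4)

    def forward(self, x):
        return self.linear1(x), self.linear2(x)

qconfig_mapping = QConfigMapping().set_object_type(torch.nn.Linear, default_dynamic_qconfig)
example_inputs = (torch.randn(1, 4),)
m = prepare_fx(TwoLinears().eval(), qconfig_mapping, example_inputs)
m = convert_fx(m)  # both branches should lower to dynamic quantized linear
```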
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_dynamic_linear_input_multiple_use
Imported from OSS
Reviewed By: yixin94
Differential Revision: D34952755
fbshipit-source-id: a950159fd6a661e84faf0baf1692f6783904cfb3
(cherry picked from commit 8a6896801fdd96a55476faca4ccb7ba0b0bdb058)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74231
Add a check to make sure the weighted module we swap is actually a float fused module,
since a reference fused module, e.g. the reference version of linear - relu, would have the same
fused type as the floating point linear - relu (while the linear submodule will have a different type).
Test Plan: Phabricator diff for now; we can add a test case after we know exactly what the problem is.
Reviewed By: andrewor14
Differential Revision: D34888290
fbshipit-source-id: a7f53368a7c17f7d1a82afaa50d14d569b4923df
(cherry picked from commit 458dac9fdf8b4f0d786bf9c815c2f2fe8df13bb4)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74198
As titled: currently in the (add, X, MatchAllNode) pattern, the node matched with MatchAllNode is regarded as part of the pattern instead of as an input. As a result, patterns ending with that node will not be matched.
For instance, we have two patterns
1. (nn.ReLU, (torch.add, MatchAllNode, (nn.BatchNorm2d, nn.Conv2d)))
2. (nn.ReLU, (nn.BatchNorm2d, nn.Conv2d))
and we want to fuse the following model
Conv2d -> BatchNorm2d -> ReLU +
Conv2d -> BatchNorm2d ------ Add -> ReLU
The pattern in the first row cannot be matched because the end node ReLU is already recorded as a MatchAllNode.
Test Plan:
new unit test
```
[jiaxuzhu@devvm3400.frc0 /data/users/jiaxuzhu/fbsource/fbcode] buck test mode/dev //caffe2/test:quantization_fx -- --exact 'caffe2/test:quantization_fx - test_fusion_pattern_with_matchallnode (quantization.fx.test_quantize_fx.TestFuseFx)'
Parsing buck files: finished in 0.9 sec
Downloaded 0/2 artifacts, 0.00 bytes, 100.0% cache miss (for updated rules)
Building: finished in 12.6 sec (100%) 18546/84011 jobs, 2/84011 updated
Total time: 13.5 sec
More details at https://www.internalfb.com/intern/buck/build/9d2decdb-d01e-4332-84f5-1728a65d4f7b
BUILD SUCCEEDED
Tpx test run coordinator for Facebook. See https://fburl.com/tpx for details.
Running with tpx session id: d92e10b8-9209-4e9e-95a6-2fcac02db251
Trace available for this run at /tmp/tpx-20220314-161230.347672-d92e10b8-9209-4e9e-95a6-2fcac02db251/trace.log
RemoteExecution session id: reSessionID-d92e10b8-9209-4e9e-95a6-2fcac02db251-tpx
Started reporting to test run: https://www.internalfb.com/intern/testinfra/testrun/3377699814955263
✓ ListingSuccess: caffe2/test:quantization_fx : 365 tests discovered (19.275)
✓ Pass: caffe2/test:quantization_fx - test_fusion_pattern_with_matchallnode (quantization.fx.test_quantize_fx.TestFuseFx) (17.760)
Summary
Pass: 1
ListingSuccess: 1
If you need help understanding your runs, please follow the wiki: https://fburl.com/posting_in_tpx_users
Finished test run: https://www.internalfb.com/intern/testinfra/testrun/3377699814955263
```
Reviewed By: jerryzh168
Differential Revision: D34873730
fbshipit-source-id: dc78455c7233ba33e9ab215f50754b1656b7dbc7
(cherry picked from commit 1cc74cadd7dc725be97064f57c910ef9d1bbe1a8)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73274
As noticed in https://discuss.pytorch.org/t/calibration-of-model-in-post-training-static-quantization-using-fx-api/143661/6
and related to https://github.com/pytorch/pytorch/issues/72698, when using FX quantization, if an op like view was used in a
model and its index parameters were passed to the op via a variable rather than
hard-coded, FX would mistakenly insert observers for them, leading to an
error when the observer tried to perform tensor-only operations on a
non-tensor. To fix this, an API was added to specify non-tensor
arguments for various ops to enable better dtype propagation.
NON_TENSOR_ARG_DICT is a nested dict whose first key is a named tuple
that contains matching parameters for ops with non-tensor args; the
inner dict's keys are dtypes and its values are lists of the arg indices that
use such dtypes. Alternatively, instead of a list, the inner dict
value can also be a function that takes the node as an argument and
returns the list of arg indices.
Theoretically this API can support arbitrary functions, but the current
implementation is limited to simpler functions since the particular
issue this fixes seems to be rare.
Note: although torch.unsqueeze and torch.transpose are listed in
quantization_patterns.py, those ops appear to be untraceable by fx. I've
included tests for their cases but fixing this issue is beyond the scope
of this PR
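A hedged sketch, not from the commit message, of the kind of model this fixes; the module and the qconfig-mapping helper are assumptions:
```
# Hedged sketch: view() gets its size from a traced (non-tensor) value rather
# than a literal, which used to attract spurious observers.
import torch
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

class ViewModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(4, 8)

    def forward(self, x):
        y = self.linear(x)
        new_channels = y.shape[-1] // 2  # non-tensor arg produced by traced ops
        return y.view(-1, new_channels)

example_inputs = (torch.randn(2, 4),)
m = prepare_fx(ViewModel().eval(), get_default_qconfig_mapping("fbgemm"), example_inputs)
m(*example_inputs)  # calibrate
m = convert_fx(m)
```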
Test Plan:
python test/test_quantization.py test_non_reference_size
...
python test/test_quantization.py test_non_reference_<op>
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D34410122
fbshipit-source-id: fc09949ca8a2d6473876a4b6c214eb91e9a9dae2
(cherry picked from commit 3a1375d677b7c98d62b1f5c839645698c39b32b9)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74229
Previously we did not successfully remove the dequantize node for `dict` outputs; this PR fixes that. It is tested with
meta-only tests right now, but we should follow up with OSS tests (with dict output).
Since we call the dead code elimination pass, some of the inplace operators are removed in TestQuantizeFx.test_fixed_qparams_ops;
in this PR we also removed the calls to the inplace ops and changed the expected results in the test case.
In a future PR we can remove the support for inplace operators, since they are not really supported in FX, and it's OK
for us to skip them as well.
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D34888140
fbshipit-source-id: 48cea842b49e52baa8eee3ce0f4bfb4a3625ab2a
(cherry picked from commit ef790315ebcf954930deb6b9d1c384992c1f1ec8)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73863
This PR fully aligns the convert function with the design: https://github.com/pytorch/rfcs/blob/master/RFC-0019-Extending-PyTorch-Quantization-to-Custom-Backends.md
and simplifies the implementation of the convert function by always producing a reference quantized model (with reference patterns) first,
and then lowering the model to a quantized model that is runnable with the PyTorch native backend (fbgemm/qnnpack).
This PR makes convert.py much easier to understand than the previous implementation, and we are able to remove the majority of the code
in quantization_patterns.py as well (in follow-up PRs).
Test Plan:
```
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
python test/test_quantization.py TestFXNumericSuiteCoreAPIsModels
```
and other internal/oss regression tests
Imported from OSS
Reviewed By: andrewor14
Differential Revision: D34778506
fbshipit-source-id: 0678b66addf736039a8749b352f6f569caca962b
(cherry picked from commit 33ec9caf23f3ab373d827117efbd9db0668b2437)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73509
This adds functionality to lower reference models
involving the Linear-Bn1d pattern in FX QAT mode. This follows
https://github.com/pytorch/pytorch/pull/72431 and https://github.com/pytorch/pytorch/pull/72796, which add Linear-Bn1d fusion functionality
to eager QAT mode.
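A hedged end-to-end sketch of the Linear -> BatchNorm1d QAT flow this lowering targets; the helper names follow current torch.ao.quantization FX APIs and are not quoted from the commit:
```
# Hedged sketch: FX QAT prepare/convert on a Linear -> BatchNorm1d pair.
import torch
from torch.ao.quantization import get_default_qat_qconfig_mapping
from torch.ao.quantization.quantize_fx import prepare_qat_fx, convert_fx

class LinearBn(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(4, 4)
        self.bn = torch.nn.BatchNorm1d(4)

    def forward(self, x):
        return self.bn(self.linear(x))

model = LinearBn().train()
example_inputs = (torch.randn(8, 4),)
qat_model = prepare_qat_fx(model, get_default_qat_qconfig_mapping("fbgemm"), example_inputs)
# ... training loop ...
quantized = convert_fx(qat_model.eval())
```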
Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_linear_module
Imported from OSS
Reviewed By: dagitses
Differential Revision: D34591251
fbshipit-source-id: 39144485f9954ee1830c8b414e724560fd7e47bf
(cherry picked from commit b97a39b4d9df00e045fab4c01eca88e562ca2c02)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73572
Previously we couldn't specify how to get extra inputs for fused ops in backend_config_dict.
For example, for patterns like:
(torch.add, (nn.BatchNorm2d, nn.Conv2d), MatchAllNode)
where nn.Conv2d is the root node, the extra MatchAllNode (the input to the original torch.add) would be lost.
This PR adds an "extra_inputs_getter" key to backend_config_dict, which allows the user to provide a function
that returns a list of extra input nodes for the fused op given the matched node pattern. In this case,
we need a function that returns the node that matches `MatchAllNode`; it would be something like the following:
```
def extra_inputs_getter(pattern):
    add, conv_bn, extra_input = pattern
    return [extra_input]
```
Test Plan:
python test/test_quantization.py TestFuseFx.test_fusion_pattern_with_multiple_inputs
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D34553210
fbshipit-source-id: 748f8ce20974438458a39dbe9eae75281156c227
(cherry picked from commit be748526480e811874dbca64b1cf3bf4950f0393)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73233
This PR makes CopyNodeQuantizeHandler always produce reference patterns, and we have
custom lowering passes that rewrite the reference quantized patterns to quantized ops.
The lowering passes have been implemented previously; we just need to enable the reference path here
and clean up the previous code that allow-lists some of the ops (`check_node`).
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: mrshenli
Differential Revision: D34469446
fbshipit-source-id: b9d9c5f793fbb735839199056c197ae98969cc4b
(cherry picked from commit af0cf4e79e11e7343d57e6ff7766c80e72ec60f3)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73345
For complex patterns we need to identify which node is the root, so that we can eliminate all other nodes and only preserve the root,
e.g. for (torch.add, MatchAllNode, (torch.nn.ReLU, torch.nn.Conv2d)), we can preserve torch.nn.Conv2d as the root node and remove the other nodes.
Previously we assumed the root_node of a pattern is the "last node" of the pattern, computed by:
```
def default_root_node_getter(node_pattern):
    while not isinstance(node_pattern[-1], Node):
        node_pattern = node_pattern[-1]
    return node_pattern[-1]
```
This PR enables users to configure their own root_node_getter, which means we can define the root_node for patterns like:
(torch.add, (torch.nn.ReLU, torch.nn.Conv2d), MatchAllNode)
Test Plan:
python test/test_quantize_fx.py TestFuseFx.test_root_node_getter
Imported from OSS
Reviewed By: VitalyFedyunin
Differential Revision: D34442193
fbshipit-source-id: 2f6da69a5b6527b49710ae32820e8e2915d9af37
(cherry picked from commit 8b49bf0d7d53cdcf2c9f40f8e25bc843e8814026)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72735
We use `get_matched_types` to get the (type) pattern from matched modules,
and we need to use MatchAllNode instead of type(MatchAllNode) to query the fuser_method for the pattern.
Test Plan:
TODO
Imported from OSS
Reviewed By: raghuramank10000
Differential Revision: D34180705
fbshipit-source-id: db9b6e791a9f26b70079fddc95fce033052199ab
(cherry picked from commit 01d38afabcb1bfc207dee7d49ee13df500d32fdf)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72953
This PR makes BinaryOpQuantizeHandler always produce reference patterns, and we have
custom lowering passes that rewrite the reference quantized patterns to quantized ops.
This includes rewrites for
torch.ops.quantized.add, torch.ops.quantized.mul, and torch.ops.quantized.matmul.
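For reference, a hedged sketch of calling these quantized binary ops directly on quantized tensors (not from the commit; add/mul take an output scale and zero_point, matmul omitted here):
```
# Hedged sketch: quantized add/mul on per-tensor quantized inputs.
import torch

a = torch.quantize_per_tensor(torch.randn(2, 2), scale=0.1, zero_point=0, dtype=torch.quint8)
b = torch.quantize_per_tensor(torch.randn(2, 2), scale=0.1, zero_point=0, dtype=torch.quint8)
out_add = torch.ops.quantized.add(a, b, 0.2, 0)
out_mul = torch.ops.quantized.mul(a, b, 0.2, 0)
```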
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: gchanan
Differential Revision: D34292408
fbshipit-source-id: 9872a5098249bc77db15e9fb614416958e62b9b2
(cherry picked from commit dbdc61ee8b5dde2e54a34a370a3af887e5117398)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72490
This is an effort to move the current implementation towards the reference quantized model design:
https://github.com/pytorch/rfcs/blob/master/RFC-0019-Extending-PyTorch-Quantization-to-Custom-Backends.md
so that we use the reference model in the default fbgemm/qnnpack path.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps.test_qbatch_norm
Imported from OSS
Reviewed By: vkuzo, andrewor14
Differential Revision: D34062365
fbshipit-source-id: ed015c61f5b969554a6477f92cf6be2358cb558c
(cherry picked from commit 9498421ddd)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72444
In https://github.com/pytorch/pytorch/pull/71783 support was added for
quantized matmul.
In this PR, FX graph mode quantization workflow support for this
operator is added for int8 dtypes.
Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps.test_qmatmul
```
Imported from OSS
Reviewed By: andrewor14
Differential Revision: D34047310
fbshipit-source-id: 781219047419ce621a4deb46ea04881818bf4209
(cherry picked from commit 7e039fa3a1)