Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73509
This adds functionality to lower reference models
involving the Linear-Bn1d pattern in FX QAT mode. This follows
https://github.com/pytorch/pytorch/pull/72431 and https://github.com/pytorch/pytorch/pull/72796, which add Linear-Bn1d fusion functionality
to eager QAT mode.
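For context, a minimal sketch of exercising this path (assumptions: the FX QAT APIs of this era, where `prepare_qat_fx` takes a qconfig dict; the module itself is illustrative, not from this PR):
```
import torch
import torch.nn as nn
from torch.ao.quantization import get_default_qat_qconfig
from torch.ao.quantization.quantize_fx import prepare_qat_fx, convert_fx

class M(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 4)
        self.bn = nn.BatchNorm1d(4)

    def forward(self, x):
        return self.bn(self.linear(x))

m = M().train()
# Linear-BN1d is fused into a QAT module during prepare
prepared = prepare_qat_fx(m, {"": get_default_qat_qconfig("fbgemm")})
# ... training loop ...
# convert produces a reference model, then lowers it to a quantized Linear
quantized = convert_fx(prepared.eval())
```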
Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_linear_module
Imported from OSS
Reviewed By: dagitses
Differential Revision: D34591251
fbshipit-source-id: 39144485f9954ee1830c8b414e724560fd7e47bf
(cherry picked from commit b97a39b4d9df00e045fab4c01eca88e562ca2c02)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73572
Previously we couldn't specify how to get extra inputs for fused ops in backend_config_dict.
For example, for patterns like:
(torch.add, (nn.BatchNorm2d, nn.Conv2d), MatchAllNode)
where nn.Conv2d is the root node, the extra MatchAllNode (the input for the original torch.add) would be lost.
This PR adds an "extra_inputs_getter" key to backend_config_dict, which allows the user to provide a function
that returns a list of extra input nodes for the fused op, given the matched node pattern. In this case,
we need a function that returns the node that matches `MatchAllNode`; it would look like the following:
```
def extra_inputs_getter(pattern):
    add, conv_bn, extra_input = pattern
    return [extra_input]
```
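As a hedged sketch of how this getter might be wired up (the surrounding config keys are assumptions based on the dict-style backend config of the time, not taken from this PR; imports are as in the pattern above):
```
config = {
    "pattern": (torch.add, (nn.BatchNorm2d, nn.Conv2d), MatchAllNode),
    "extra_inputs_getter": extra_inputs_getter,
    # plus the usual keys for the pattern, e.g. a fuser method
}
backend_config_dict = {"configs": [config]}
```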
Test Plan:
python test/test_quantization.py TestFuseFx.test_fusion_pattern_with_multiple_inputs
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D34553210
fbshipit-source-id: 748f8ce20974438458a39dbe9eae75281156c227
(cherry picked from commit be748526480e811874dbca64b1cf3bf4950f0393)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73233
This PR makes CopyNodeQuantizeHandler always produce reference patterns, and we have
custom lowering passes that rewrite the reference quantized patterns to quantized ops.
The lowering passes were implemented previously; we just need to enable the reference path here
and clean up the previous code that allowlists some of the ops (`check_node`).
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: mrshenli
Differential Revision: D34469446
fbshipit-source-id: b9d9c5f793fbb735839199056c197ae98969cc4b
(cherry picked from commit af0cf4e79e11e7343d57e6ff7766c80e72ec60f3)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73345
For complex patterns we need to identify which node is the root, so that we can eliminate all other nodes and preserve only the root,
e.g. for (torch.add, MatchAllNode, (torch.nn.ReLU, torch.nn.Conv2d)), we can preserve torch.nn.Conv2d as the root node and remove the other nodes.
Previously we assumed the root_node of a pattern is the "last node" of the pattern, computed by:
```
def default_root_node_getter(node_pattern):
    while not isinstance(node_pattern[-1], Node):
        node_pattern = node_pattern[-1]
    return node_pattern[-1]
```
This PR enables users to define their own root_node_getter, which means we can define the root_node for patterns like:
(torch.add, (torch.nn.ReLU, torch.nn.Conv2d), MatchAllNode)
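A hedged sketch of such a getter for this pattern (the function name is illustrative): here the conv is the root even though it is not the "last node" of the pattern:
```
def my_root_node_getter(node_pattern):
    add, relu_conv, extra_input = node_pattern
    relu, conv = relu_conv
    return conv
```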
Test Plan:
python test/test_quantize_fx.py TestFuseFx.test_root_node_getter
Imported from OSS
Reviewed By: VitalyFedyunin
Differential Revision: D34442193
fbshipit-source-id: 2f6da69a5b6527b49710ae32820e8e2915d9af37
(cherry picked from commit 8b49bf0d7d53cdcf2c9f40f8e25bc843e8814026)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72735
We use `get_matched_types` to get the (type) pattern from matched modules,
and we need to use MatchAllNode instead of type(MatchAllNode) to query the fuser_method for the pattern.
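A hedged reconstruction of the fixed behavior (not the actual implementation): when mapping matched modules to a type pattern, the MatchAllNode sentinel is kept as-is rather than replaced by its type:
```
def get_matched_types(matched_modules):
    types = []
    for m in matched_modules:
        if isinstance(m, tuple):
            types.append(get_matched_types(m))
        elif m is MatchAllNode:
            types.append(MatchAllNode)  # keep the sentinel, not type(MatchAllNode)
        else:
            types.append(type(m))
    return tuple(types)
```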
Test Plan:
TODO
Imported from OSS
Reviewed By: raghuramank10000
Differential Revision: D34180705
fbshipit-source-id: db9b6e791a9f26b70079fddc95fce033052199ab
(cherry picked from commit 01d38afabcb1bfc207dee7d49ee13df500d32fdf)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72953
This PR makes BinaryOpQuantizeHandler always produce reference patterns, and we have
custom lowering passes that rewrite the reference quantized patterns to quantized ops.
It includes rewrites for
torch.ops.quantized.add, torch.ops.quantized.mul, and torch.ops.quantized.matmul.
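As a hedged illustration of the rewrite (the graphs below describe the idea, not the pass implementation):
```
# before (reference pattern):
#   x_dq = x.dequantize(); y_dq = y.dequantize()
#   out_fp = torch.add(x_dq, y_dq)
#   out = torch.quantize_per_tensor(out_fp, scale, zero_point, torch.quint8)
# after (lowered):
#   out = torch.ops.quantized.add(x, y, scale, zero_point)
```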
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: gchanan
Differential Revision: D34292408
fbshipit-source-id: 9872a5098249bc77db15e9fb614416958e62b9b2
(cherry picked from commit dbdc61ee8b5dde2e54a34a370a3af887e5117398)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72490
This is an effort to move the current implementation towards the reference quantized model design:
https://github.com/pytorch/rfcs/blob/master/RFC-0019-Extending-PyTorch-Quantization-to-Custom-Backends.md
so that we use the reference model in the default fbgemm/qnnpack path.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps.test_qbatch_norm
Imported from OSS
Reviewed By: vkuzo, andrewor14
Differential Revision: D34062365
fbshipit-source-id: ed015c61f5b969554a6477f92cf6be2358cb558c
(cherry picked from commit 9498421ddd)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72444
In https://github.com/pytorch/pytorch/pull/71783 support was added for
quantized matmul.
In this PR, FX graph mode quantization workflow support is added for this
operator, for int8 dtypes.
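A minimal sketch of exercising this path (assumptions: the prepare/convert FX APIs of this era; the module and calibration input are illustrative):
```
import torch
from torch.ao.quantization import get_default_qconfig
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

class M(torch.nn.Module):
    def forward(self, x, y):
        return torch.matmul(x, y)

m = M().eval()
prepared = prepare_fx(m, {"": get_default_qconfig("fbgemm")})
prepared(torch.randn(4, 4), torch.randn(4, 4))  # calibration
quantized = convert_fx(prepared)  # matmul lowered to the quantized matmul op
```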
Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps.test_qmatmul
```
Imported from OSS
Reviewed By: andrewor14
Differential Revision: D34047310
fbshipit-source-id: 781219047419ce621a4deb46ea04881818bf4209
(cherry picked from commit 7e039fa3a1)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71780
Adds support for matching operator.add -> torch.relu in FX graph
mode quantization.
It would be nice to support torch.relu better in general, but
we're saving that for a future PR to keep PRs small.
This is useful for DBR quant because we have some test cases in DBR
quant which use add-relu, and we'd like to match them to FX.
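For illustration, a hedged sketch of the kind of module whose traced graph contains this pattern:
```
import torch

class AddRelu(torch.nn.Module):
    def forward(self, x, y):
        z = x + y  # traced as operator.add in the FX graph
        return torch.relu(z)
```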
Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps.test_add_relu
python test/test_quantization.py TestQuantizeFxOps.test_mul_relu
```
Reviewed By: jerryzh168
Differential Revision: D33775096
Pulled By: vkuzo
fbshipit-source-id: 889d9b41d3758ecbbb6d7eab67f64ce3d4892d24
(cherry picked from commit c1f9f38ca1)
Summary:
Original commit changeset: d9c5979efb03
Original Phabricator Diff: D33994546 (a5dad85c4f)
Test Plan: None, this is a revert of a revert
Reviewed By: bigfootjon
Differential Revision: D34007153
fbshipit-source-id: cde321e98dbbfa38fb3873d9b8461ac47129f481
(cherry picked from commit 2e04ef4df4)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71427
This commit adds a lowering path for the LinearReLU modules
in static quantization mode. This includes torch.nn.qat.Linear,
torch.nn.intrinsic.LinearReLU, and torch.nn.intrinsic.qat.LinearReLU.
Future commits will add support for dynamic quantization and functional
LinearReLU.
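A hedged sketch of the flow these modules go through (standard fuse/prepare/convert APIs of the era; details are illustrative):
```
import torch.nn as nn
from torch.ao.quantization import fuse_modules

m = nn.Sequential(nn.Linear(4, 4), nn.ReLU()).eval()
fused = fuse_modules(m, [["0", "1"]])  # -> torch.nn.intrinsic.LinearReLU
# in the QAT flow the fused module becomes torch.nn.intrinsic.qat.LinearReLU,
# and convert lowers either form to the quantized LinearReLU module
```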
Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_linear_module
Imported from OSS
Reviewed By: george-qi
Differential Revision: D33694742
fbshipit-source-id: 19af11f82b1ad8ade0c307498971c29a3f776036
(cherry picked from commit b3f607de43)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71858
Makes the forked subgraph rewriter code path preserve stack traces.
The strategy is pretty simple for now:
1. find any specified stack trace in pattern graph
2. if found, copy this stack trace to every node in replacement graph
If more complicated logic is needed in the future, we can address it
at a later time.
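A hedged sketch of that strategy (function name is illustrative, not the actual implementation):
```
def copy_stack_trace(pattern_graph, replacement_graph):
    # 1. find any specified stack trace in the pattern graph
    stack_trace = None
    for node in pattern_graph.nodes:
        if node.stack_trace is not None:
            stack_trace = node.stack_trace
            break
    # 2. if found, copy it to every node in the replacement graph
    if stack_trace is not None:
        for node in replacement_graph.nodes:
            node.stack_trace = stack_trace
```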
Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_stack_trace_preserved_subgraph_rewriter
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D33791740
fbshipit-source-id: 38bb4885549a9f954278c6c14fa41f58f1d5f7b7
(cherry picked from commit 5cc32a87ce)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70009
Currently we rely on module.training to decide whether we'll do a QAT fusion or a PTQ fusion. This is
not ideal, since the training flag has nothing to do with quantization, so this PR introduces an extra flag `is_qat`
to control this.
Note: currently we still have the constraint that when `is_qat` is True, the modules must be in training mode; we
can relax this constraint later.
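A hedged sketch of the behavior change (the helper names are hypothetical, not the internal API):
```
def fuse(model, is_qat):
    # fusion now keys off an explicit flag instead of module.training
    if is_qat:
        assert model.training  # current constraint, to be relaxed later
        return fuse_qat_patterns(model)  # hypothetical helper
    return fuse_ptq_patterns(model)      # hypothetical helper
```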
Test Plan:
```
python test/test_quantization.py TestFuseFx
python test/test_quantization.py TestFusion
```
Imported from OSS
Reviewed By: mruberry
Differential Revision: D33178977
fbshipit-source-id: 0c1499c45526971140d9ad58e2994d1edf5ad770
(cherry picked from commit 2d51f9fb28)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70757
This is an initial PR on a way to preserve stack traces throughout FX
graph mode quantization. It preserves the stack traces for ops
for all of the quantize handlers. A future PR will add stack traces
for dtype transitions.
Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_stack_trace_preserved
```
Note: the above only tests a single case. In a future PR, once we
expand coverage, we can expand the utility functions to check for stack
traces on all tests.
Imported from OSS
Differential Revision: D33432485
Reviewed By: jerryzh168
Pulled By: vkuzo
fbshipit-source-id: 56c56850393132487430a850fa1def826a9c39c0
(cherry picked from commit c11155b31e)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71519
Remove inplace hardtanh in fx quantized op test case
Test Plan:
python3 test/test_quantization.py TestQuantizeFxOps.test_clamp
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D33675227
fbshipit-source-id: a496150ca4b485f953f68e24ddf9beb8ed1d94c0
(cherry picked from commit f65a888900)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71254
When we configure linear and relu with the same qconfig, we have utility functions to also
generate a qconfig for the fused linear-relu module, but this code was not called in the correct order,
which resulted in unexpected behavior. This PR fixes the issue; please see the test case for more details.
(Test case is from Supriya)
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_fused_module_qat_swap
Imported from OSS
Reviewed By: supriyar
Differential Revision: D33558321
fbshipit-source-id: d95114dc4b77264e603c262c2da02a3de4acba69
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69959
GraphModule is an implementation detail; we don't want to expose it in the quantization APIs.
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_quantized_model_type
Imported from OSS
Reviewed By: supriyar
Differential Revision: D33119103
fbshipit-source-id: d8736ff08b42ee009d6cfd74dcb3f6150f71f3d2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70150
This PR allows users to specify backend_config_dict for standalone modules, in both the prepare and convert steps.
We're adding this now to allow prototyping for some of our customer use cases; a test for the code path will be added in
a separate PR.
Test Plan:
regression tests
```
python test/test_quantization.py TestQuantizeFx
```
A test that specifies backend_config for a module will be added in a separate PR for the use case we have in mind,
since it requires other features.
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D33205162
fbshipit-source-id: a657cef8e49d99b6a43653141521dc87c33bfd89
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70022
Add support for fusing ConvTranspose{1,2,3}d with BatchNorm{1,2,3}d. This re-uses the existing fusion logic but adds a "transpose" flag to the fusing function, which when enabled uses the appropriate reshape for ConvTranspose's transposed weights.
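A hedged sketch of why the flag matters (based on the description above, not the exact code from this diff): ConvTranspose weights are laid out (in_channels, out_channels, ...), so the BN scale must broadcast over dim 1 rather than dim 0:
```
import torch

def scale_conv_weight(conv_w, bn_rv, bn_eps, bn_w, transpose=False):
    scale = bn_w * torch.rsqrt(bn_rv + bn_eps)
    if transpose:
        shape = [1, -1] + [1] * (conv_w.dim() - 2)  # broadcast over dim 1
    else:
        shape = [-1, 1] + [1] * (conv_w.dim() - 2)  # broadcast over dim 0
    return conv_w * scale.reshape(shape)
```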
Test Plan: `buck test mode/dev //caffe2/test:quantization -- -r quantization.eager.test_fusion.TestFusion`
Reviewed By: jerryzh168
Differential Revision: D33074405
fbshipit-source-id: 5e9eff1a06d8f98d117e7d18e80da8e842e973b7
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69846
Test Plan:
In the pytorch main dir, run the added test.
Reviewed By: jbschlosser
Differential Revision: D33152672
Pulled By: dzdang
fbshipit-source-id: 89951fcd23e7061d6c51e9422540b5f584f893aa
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69864
As titled; a follow-up PR will remove QConfigDynamic from the API.
Test Plan:
regression tests
```
python test/test_quantization.py TestPostTrainingStatic
python test/test_quantization.py TestPostTrainingDynamic
python test/test_quantization.py TestQuantizeFx
```
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D33073235
fbshipit-source-id: 6c1a1647032453803c55cdad7c04154502f085db
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69644
This PR cleans up the init of ModuleReLUFuseHandler and moves all `module - relu`
fusion patterns to use this handler.
It also temporarily disables the additional_fuser_method argument; we will re-enable it
after we bring back the simple pattern format.
Test Plan:
```
python test/test_quantize_fx.py TestFuseFx
```
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D32974906
fbshipit-source-id: 23483ea4293d569cb3cec6dadfefd4d9f30921a7
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69334
The original PR #68121 broke due to an incompatible qengine on macOS; this PR re-introduces the changes with a fix.
It adds FX support for the QAT EmbeddingBag operator, which previously had only eager mode support.
Test Plan:
pytest test/quantization/fx/test_quantize_fx.py -v -k "test_qat_embeddingbag_linear"
Imported from OSS
Reviewed By: jingsh
Differential Revision: D32815153
fbshipit-source-id: 33654ce29de6e81920bf3277a75027fe403a1eb2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69333
The original PR was reverted due to a break with an incompatible qengine on macOS; this diff fixes that.
It supports the QAT workflow using the torch.fx QAT API, e.g. `prepare_qat_fx` and `convert_fx`.
Test Plan:
`pytest test/quantization/fx/test_quantize_fx.py -v -k "test_qat_embedding_linear"`
Imported from OSS
Reviewed By: jingsh
Differential Revision: D32814827
fbshipit-source-id: f7a69d2b596f1276dc5860b397c5d5d07e5b9e16
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69249
This PR adds default_replay_qconfig and default_replay_observer, which are used
when we want to configure an operator to reuse the observer from its input: if the input
Tensor for the operator is not observed, we will not observe the output of this operator either;
if the input Tensor is observed, we will observe the output of the operator with the same observer.
e.g.
```
x1 = x0.reshape(-1)
```
if reshape is configured with default_replay_qconfig:
1. if x0 is observed with observer_0, we'll observe x1 with the same observer instance
2. if x0 is not observed, we won't observe x1 either
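A hedged sketch of how this might be configured (the dict-style qconfig config of the era; the method-name key and the import location of default_replay_qconfig are assumptions):
```
from torch.ao.quantization import get_default_qconfig

qconfig_dict = {
    "": get_default_qconfig("fbgemm"),
    # reuse the input observer for the output of Tensor.reshape
    "object_type": [("reshape", default_replay_qconfig)],
}
```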
Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_replay_qconfig
```
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D32774723
fbshipit-source-id: 26862b2bc181d0433e2243daeb3b8f7ec3dd33b2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68229
This PR makes BinaryOpQuantizeHandler always produce reference patterns, and we rely on the
subgraph_rewriter to rewrite the reference quantized patterns to quantized ops.
Test Plan:
```
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
```
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D32537714
fbshipit-source-id: 456086b308c4446840d8d37997daa6f8f8068479
Summary:
**Summary**: FixedQParams operators do not need fake quantization
in the prepare step. This commit introduces FixedQParamsObserver
and makes FixedQParamsFakeQuantize a simple wrapper around this
observer. It also removes the fake quantize logic in forward.
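A hedged sketch of the observer's core behavior (an illustrative class, not the code from this commit): the qparams are fixed at construction, nothing is recorded in forward, and calculate_qparams simply returns the fixed values:
```
import torch
from torch.ao.quantization.observer import ObserverBase

class FixedQParamsObserverSketch(ObserverBase):
    def __init__(self, scale, zero_point, dtype=torch.quint8):
        super().__init__(dtype=dtype)
        self.register_buffer("scale", torch.tensor([scale], dtype=torch.float))
        self.register_buffer("zero_point", torch.tensor([zero_point], dtype=torch.int))

    def forward(self, x):
        return x  # nothing to record: qparams are fixed

    def calculate_qparams(self):
        return self.scale, self.zero_point
```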
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68143
Test Plan:
Added two tests:
python3 test/test_quantization.py TestQuantizeFx.test_fixed_qparams_patterns
python3 test/test_quantization.py TestQuantizeFx.test_register_patterns
**Reviewers**: Jerry Zhang
**Subscribers**: Jerry Zhang, Supriya Rao
**Tasks**: T104942885
**Tags**: pytorch
Reviewed By: albanD
Differential Revision: D32484427
Pulled By: andrewor14
fbshipit-source-id: 5a048b90eb4da79074c5ceffa3c8153f8d8cd662
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68061
The test had a typo and didn't compare the test value against the reference value; this fixes the typo.
Test Plan:
`pytest test/quantization/fx/test_quantize_fx.py -v -k "test_qat_functional_linear"`
Imported from OSS
Reviewed By: HDCharles
Differential Revision: D32280803
fbshipit-source-id: d57a25a0dcdd88df887a39b5117abafaf15125b2
Summary:
TorchVision accidentally included model builders for quantized models without weights; this was an old bug. These builders were largely unusable and caused issues for users; commonly they were filtered out to avoid problems.
We've recently fixed that (https://github.com/pytorch/vision/pull/4854) by either removing the unnecessary builders or by providing quantized weights. This PR removes the no-longer-necessary filtering of the methods.
**It should be merged after TorchVision is synced on FBCode.**
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67836
Reviewed By: jerryzh168
Differential Revision: D32230658
Pulled By: datumbox
fbshipit-source-id: 01cd425b1bda3b4591a25840593b3b5dde3a0f12
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67876
Previously we missed this argument when calling obj.convert, so it would not affect the fusion.
This PR fixes that and adds a test for it.
Test Plan:
python test/test_quantization.py TestFuseFx
Imported from OSS
Reviewed By: malfet
Differential Revision: D32191364
fbshipit-source-id: 566bd39461010d70a21de71f611bb929976fe01d