Commit Graph

112 Commits

Author SHA1 Message Date
Andrew Or
cedce3be20 [Quant][fx] Add lowering for Linear-Bn1d in QAT mode (#73509)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73509

This adds functionality to lower reference models
involving the Linear-Bn1d pattern in FX QAT mode. This follows
https://github.com/pytorch/pytorch/pull/72431 and https://github.com/pytorch/pytorch/pull/72796, which add Linear-Bn1d fusion functionality
to eager QAT mode.
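
A minimal sketch of the flow this enables, assuming the qconfig_dict-style FX API of this era (the model, shapes, and backend choice are illustrative):
```
import torch
import torch.nn as nn
from torch.ao.quantization import get_default_qat_qconfig
from torch.ao.quantization.quantize_fx import prepare_qat_fx, convert_fx

class M(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 4)
        self.bn = nn.BatchNorm1d(4)

    def forward(self, x):
        return self.bn(self.linear(x))

m = prepare_qat_fx(M().train(), {"": get_default_qat_qconfig("fbgemm")})
# ... QAT training loop ...
m = convert_fx(m.eval())  # the reference Linear-Bn1d pattern is lowered here
```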

Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_linear_module

Imported from OSS

Reviewed By: dagitses

Differential Revision: D34591251

fbshipit-source-id: 39144485f9954ee1830c8b414e724560fd7e47bf
(cherry picked from commit b97a39b4d9df00e045fab4c01eca88e562ca2c02)
2022-03-07 15:32:54 +00:00
Terry Chen
5167e9d59d [quant][fix] Fix bug for ave pooling in FX quant (#73054)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73054

Fix a bug for average pooling in FX quant

Test Plan:
python3 test/test_quantization.py TestQuantizeFxOps.test_ave_pool_with_custom_cfg

Imported from OSS

Reviewed By: george-qi

Differential Revision: D34334059

fbshipit-source-id: a2ddad4fa3abf250f5dc20486c966fff3a9098a6
(cherry picked from commit d0f6ea680427a454200735075d557fb0b145a625)
2022-03-04 23:29:18 +00:00
Jerry Zhang
f5c7e5406b [quant][fx] Add lowering support for qat and fused convs (#73527)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73527

This includes:
```
torch.nn.qat.Conv2d,
torch.nn.qat.Conv3d,
torch.nn.intrinsic.qat.ConvBn1d,
torch.nn.intrinsic.qat.ConvBn2d,
torch.nn.intrinsic.qat.ConvBn3d,
torch.nn.intrinsic.qat.ConvBnReLU1d,
torch.nn.intrinsic.qat.ConvBnReLU2d,
torch.nn.intrinsic.qat.ConvBnReLU3d,
torch.nn.intrinsic.qat.ConvReLU2d,
torch.nn.intrinsic.qat.ConvReLU3d,
torch.nn.intrinsic.ConvReLU1d,
torch.nn.intrinsic.ConvReLU2d,
torch.nn.intrinsic.ConvReLU3d,
```
We first produce the reference pattern and then lower it to quantized modules.
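
For illustration, a sketch of what the lowering targets look like; the mapping below names two examples, while the full mapping lives in the FX lowering pass:
```
import torch.nn.quantized as nnq
import torch.nn.intrinsic.qat as nniqat
import torch.nn.intrinsic.quantized as nniq

# after convert, QAT/fused reference modules (left) are replaced by
# the corresponding quantized modules (right)
lowering_examples = {
    nniqat.ConvBn2d: nnq.Conv2d,
    nniqat.ConvBnReLU2d: nniq.ConvReLU2d,
}
```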

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: andrewor14

Differential Revision: D34583206

fbshipit-source-id: d298114d1906ea44c071b0eee52730dadf67fd3e
(cherry picked from commit 6498af35b5aa6104cadb68ca48dff4e443bee7d6)
2022-03-04 06:29:03 +00:00
dzdang
a39e8e8f5e [Quant][fx] Added explicit entries for functional and module conv & linear support into get_default_qconfig_dict & get_default_qat_qconfig_dict (#73528)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/73528

Test Plan: Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D34535572

Pulled By: dzdang

fbshipit-source-id: 883f46e014e47aeba3ea6f9fb401c54e3792b2ac
(cherry picked from commit 66713d518295b2e7306561030aa6b7ca049a708c)
2022-03-04 03:29:20 +00:00
Andrew Or
b7a7cdd00a [Quant][fx] Add lowering for functional linear (#72855)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72855

This adds functionality to lower reference models
involving functional linear in FX.
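
For context, a sketch of the kind of model this covers; in the reference convert path, F.linear ends up between dequantize and quantize nodes before being rewritten to the quantized linear op:
```
import torch
import torch.nn.functional as F

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.w = torch.nn.Parameter(torch.randn(4, 4))
        self.b = torch.nn.Parameter(torch.randn(4))

    def forward(self, x):
        return F.linear(x, self.w, self.b)
```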

Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_functional_linear

Imported from OSS

Reviewed By: albanD

Differential Revision: D34514127

fbshipit-source-id: 7af4f37bdeda710dc7197ede9d46f66227d7932c
(cherry picked from commit a14cbc04dea4e578643c4183f0c8ea43fbdaf5c7)
2022-03-02 18:34:35 +00:00
Jerry Zhang
bea075f305 [quant] Add support for multiple inputs in fusion pattern (#73572)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73572

Previously we couldn't specify how to get extra inputs for fused ops in backend_config_dict,
for example, for patterns like:
(torch.add, (nn.BatchNorm2d, nn.Conv2d), MatchAllNode)

where nn.Conv2d is the root node, the extra MatchAllNode (the input to the original torch.add) would be lost.
This PR adds an "extra_inputs_getter" key to backend_config_dict, which allows the user to provide a function
that returns a list of extra input nodes for the fused op, given the matched node pattern. In this case,
we need a function that returns the node matched by `MatchAllNode`; it would look something like the following:

```
def extra_inputs_getter(pattern):
    add, conv_bn, extra_input = pattern
    return [extra_input]
```
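
A hypothetical wiring of the getter into a pattern config; the key names besides "extra_inputs_getter" are illustrative rather than the exact internal schema, and the MatchAllNode import path assumes the torch.ao layout of this era:
```
import torch
import torch.nn as nn
from torch.ao.quantization.utils import MatchAllNode

pattern_config = {
    "pattern": (torch.add, (nn.BatchNorm2d, nn.Conv2d), MatchAllNode),
    "extra_inputs_getter": extra_inputs_getter,  # the function defined above
}
```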

Test Plan:
python test/test_quantization.py TestFuseFx.test_fusion_pattern_with_multiple_inputs

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D34553210

fbshipit-source-id: 748f8ce20974438458a39dbe9eae75281156c227
(cherry picked from commit be748526480e811874dbca64b1cf3bf4950f0393)
2022-03-02 08:37:07 +00:00
Jerry Zhang
ad1078a21e [quant] Enable reference path by default for CopyNodeQuantizeHandler (#73233)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73233

This PR makes CopyNodeQuantizeHandler always produce reference patterns; we have
custom lowering passes that rewrite the reference quantized patterns to quantized ops.

The lowering passes were implemented previously; we just need to enable the reference path here
and clean up the previous code to allowlist some of the ops (`check_node`).

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: mrshenli

Differential Revision: D34469446

fbshipit-source-id: b9d9c5f793fbb735839199056c197ae98969cc4b
(cherry picked from commit af0cf4e79e11e7343d57e6ff7766c80e72ec60f3)
2022-03-01 01:33:30 +00:00
Jerry Zhang
45a042037f [quant][fx] Add root_node_getter in backend_config_dict (#73345)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73345

For complex patterns we need to identify which node is the root, so that we can eliminate all other nodes and only preserve the root;
e.g. for (torch.add, MatchAllNode, (torch.nn.ReLU, torch.nn.Conv2d)), we preserve torch.nn.Conv2d as the root node and remove the other nodes.

Previously we assumed the root_node of a pattern was the "last node" of the pattern, computed by:
```
def default_root_node_getter(node_pattern):
    while not isinstance(node_pattern[-1], Node):
        node_pattern = node_pattern[-1]
    return node_pattern[-1]
```
This PR enables user configuration to define their own root_node_getter, which means we can define the root_node for patterns like:
(torch.add, (torch.nn.ReLU, torch.nn.Conv2d), MatchAllNode)
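
For the pattern above, a user-defined getter could look like this (a sketch):
```
def root_node_getter(node_pattern):
    add, relu_conv, extra_input = node_pattern
    relu, conv = relu_conv
    return conv  # preserve Conv2d as the root node
```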

Test Plan:
python test/test_quantize_fx.py TestFuseFx.test_root_node_getter

Imported from OSS

Reviewed By: VitalyFedyunin

Differential Revision: D34442193

fbshipit-source-id: 2f6da69a5b6527b49710ae32820e8e2915d9af37
(cherry picked from commit 8b49bf0d7d53cdcf2c9f40f8e25bc843e8814026)
2022-02-26 06:34:22 +00:00
Jerry Zhang
16554bec1b [quant][fx][fix] Fix get_module_type for fusion (#72735)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72735

We use `get_matched_types` to get the (type) pattern from matched modules,
and we need to use MatchAllNode instead of type(MatchAllNode) to query the fuser_method for the pattern.
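
Illustrative of the distinction (not the exact helper): fuser methods are registered under the sentinel class itself, so the lookup key must be the class, not its type:
```
import torch.nn as nn
from torch.ao.quantization.utils import MatchAllNode

wrong_key = (nn.Conv2d, type(MatchAllNode))  # never matches a registration
right_key = (nn.Conv2d, MatchAllNode)        # matches the registered pattern
```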

Test Plan:
TODO

Imported from OSS

Reviewed By: raghuramank10000

Differential Revision: D34180705

fbshipit-source-id: db9b6e791a9f26b70079fddc95fce033052199ab
(cherry picked from commit 01d38afabcb1bfc207dee7d49ee13df500d32fdf)
2022-02-25 18:37:31 +00:00
Jerry Zhang
9db0e0e76e [quant][graphmode] produce reference pattern for binary ops and then rewrite to quantized op (#72953)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72953

This PR makes BinaryOpQuantizeHandler always produce reference patterns, with
custom lowering passes that rewrite the reference quantized patterns to quantized ops.
It includes rewrites for
torch.ops.quantized.add, torch.ops.quantized.mul, and torch.ops.quantized.matmul.
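
A sketch of the lowering target for add: the reference pattern dequantize -> torch.add -> quantize collapses into a single quantized op:
```
import torch

qa = torch.quantize_per_tensor(torch.randn(4), 0.1, 0, torch.quint8)
qb = torch.quantize_per_tensor(torch.randn(4), 0.1, 0, torch.quint8)
out = torch.ops.quantized.add(qa, qb, 0.1, 0)  # (qa, qb, scale, zero_point)
```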

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: gchanan

Differential Revision: D34292408

fbshipit-source-id: 9872a5098249bc77db15e9fb614416958e62b9b2
(cherry picked from commit dbdc61ee8b5dde2e54a34a370a3af887e5117398)
2022-02-25 17:36:14 +00:00
Howard Huang
dadbf43eff Fix asserts in tests (#72864)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72864

Fixes #72860

Test Plan: Imported from OSS

Reviewed By: rohan-varma

Differential Revision: D34246987

Pulled By: H-Huang

fbshipit-source-id: 1ba47585533aff4cff9beec49bdc801f8320ffc8
(cherry picked from commit 03e45ceb89)
2022-02-16 18:35:16 +00:00
Jerry Zhang
3d377fb4a3 [quant][fx][improvement] Add lowering support for BatchNormQuantizeHandler (#72490)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72490

This is an effort to move the current implementation towards the reference quantized model design:
https://github.com/pytorch/rfcs/blob/master/RFC-0019-Extending-PyTorch-Quantization-to-Custom-Backends.md
so that we use the reference model in the default fbgemm/qnnpack path

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps.test_qbatch_norm

Imported from OSS

Reviewed By: vkuzo, andrewor14

Differential Revision: D34062365

fbshipit-source-id: ed015c61f5b969554a6477f92cf6be2358cb558c
(cherry picked from commit 9498421ddd)
2022-02-15 21:34:17 +00:00
Vasiliy Kuznetsov
decc79e541 fx quant: add workflow support for torch.matmul quantization (#72444)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72444

In https://github.com/pytorch/pytorch/pull/71783 support was added for
quantized matmul.

In this PR, the FX graph mode quantization workflow support for this
operator is added, for int8 dtypes.
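
A sketch of what the workflow now produces, assuming the (qa, qb, scale, zero_point) op signature added in #71783:
```
import torch

qa = torch.quantize_per_tensor(torch.randn(2, 3), 0.1, 0, torch.quint8)
qb = torch.quantize_per_tensor(torch.randn(3, 2), 0.1, 0, torch.quint8)
out = torch.ops.quantized.matmul(qa, qb, 0.1, 0)
```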

Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps.test_qmatmul
```

Imported from OSS

Reviewed By: andrewor14

Differential Revision: D34047310

fbshipit-source-id: 781219047419ce621a4deb46ea04881818bf4209
(cherry picked from commit 7e039fa3a1)
2022-02-09 18:43:58 +00:00
Jerry Zhang
ac0cac7724 [quant][fx][devs] Add lowering support for torch.cat (#72487)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72487

This is an effort to move the current implementation towards the reference quantized model design:
https://github.com/pytorch/rfcs/blob/master/RFC-0019-Extending-PyTorch-Quantization-to-Custom-Backends.md
so that we use the reference model in the default fbgemm/qnnpack path

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D34062366

fbshipit-source-id: 86673bead79180a7509b51bd577f328e90f24893
(cherry picked from commit de3e443384)
2022-02-09 06:09:57 +00:00
Jerry Zhang
4b69a2373f [quant][fx] Add lowering support for ops in GeneralTensorShapeOpQuantizeHandler (#72387)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72387

Also make GeneralTensorShapeOpQuantizeHandler produce reference patterns by default

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: albanD, terrychenism

Differential Revision: D34025005

fbshipit-source-id: 01ca62cce727bbf4579ba8fb2b8c40198f327b86
(cherry picked from commit 7f3a9ab4c5)
2022-02-09 02:10:20 +00:00
Vasiliy Kuznetsov
d672bbd0a9 fx quant: add fusion matching for operator.add and torch.relu (#71780)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71780

Adds support for matching operator.add -> torch.relu in FX graph
mode quantization.

It would be nice to support torch.relu better in general, but
saving that for a future PR to keep PRs small.

This is useful for DBR quant because we have some test cases in DBR
quant which use add-relu, and we'd like to match them to FX.

Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps.test_add_relu
python test/test_quantization.py TestQuantizeFxOps.test_mul_relu
```

Reviewed By: jerryzh168

Differential Revision: D33775096

Pulled By: vkuzo

fbshipit-source-id: 889d9b41d3758ecbbb6d7eab67f64ce3d4892d24
(cherry picked from commit c1f9f38ca1)
2022-02-07 14:00:26 +00:00
Nikita Shulga
53acd2fad3 Back out "Revert D33994546: [Quant][fx][improvement] Added test for quint4x2 for fx graph mode quantization (reland PR 69846)"
Summary:
Original commit changeset: d9c5979efb03

Original Phabricator Diff: D33994546 (a5dad85c4f)

Test Plan: None, this is a revert of a revert

Reviewed By: bigfootjon

Differential Revision: D34007153

fbshipit-source-id: cde321e98dbbfa38fb3873d9b8461ac47129f481
(cherry picked from commit 2e04ef4df4)
2022-02-04 18:35:27 +00:00
Nikita Shulga
cd5ed54989 Revert D33994546: [Quant][fx][improvement] Added test for quint4x2 for fx graph mode quantization (reland PR 69846)
Test Plan: revert-hammer

Differential Revision:
D33994546 (a5dad85c4f)

Original commit changeset: 3aa911752389

Original Phabricator Diff: D33994546 (a5dad85c4f)

fbshipit-source-id: d9c5979efb035b227d14bff21f0c31ad8c841bc0
(cherry picked from commit d07fedcf47)
2022-02-04 16:03:23 +00:00
dzdang
a5dad85c4f [Quant][fx][improvement] Added test for quint4x2 for fx graph mode quantization (reland PR 69846) (#72278)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72278

Added an FX quint4x2 test

Test Plan:
In pytorch main dir, execute
```
python test/test_quantization.py TestQuantizeFxOps.test_embedding
```

Reviewed By: jerryzh168

Differential Revision: D33994546

Pulled By: dzdang

fbshipit-source-id: 3aa9117523893a6ce27f05103d421507640c1ab0
(cherry picked from commit 3c6557f36e)
2022-02-04 14:10:31 +00:00
Andrew Or
e118d6e59f Add lowering path for LinearReLU module (#71427)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71427

This commit adds a lowering path for the LinearReLU modules
in static quantization mode. This includes torch.nn.qat.Linear,
torch.nn.intrinsic.LinearReLU, and torch.nn.intrinsic.qat.LinearReLU.
Future commits will add support for dynamic quantization and functional
LinearReLU.
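
The fused float module in question, constructed directly for illustration (fusion normally produces it via fuse_fx/prepare_fx); the QAT variants follow the same lowering path:
```
import torch.nn as nn
import torch.nn.intrinsic as nni

fused = nni.LinearReLU(nn.Linear(8, 8), nn.ReLU())
```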

Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_linear_module

Imported from OSS

Reviewed By: george-qi

Differential Revision: D33694742

fbshipit-source-id: 19af11f82b1ad8ade0c307498971c29a3f776036
(cherry picked from commit b3f607de43)
2022-02-01 19:31:31 +00:00
Jerry Zhang
082ff25f37 [reland][bc-breaking][quant][be] Refactor fuser_method to include is_qat argument (#71956)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71956

Pull Request resolved: https://github.com/facebookresearch/mobile-vision/pull/59

Original commit changeset: f3912e210e8c

Original Phabricator Diff: D33178977 (ef501e8fed)

Test Plan:
Please see original diff for test plans

Reviewed By: andrewor14

Differential Revision: D33833203

fbshipit-source-id: 74a8f22730b00aafa6a173b208e635c1d696959e
(cherry picked from commit fb88772b18)
2022-01-31 23:02:22 +00:00
Vasiliy Kuznetsov
b66f1bc80f fx quant: make forked subgraph rewriter preserve stack trace (#71858)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71858

Makes the forked subgraph rewriter code path preserve stack traces.
The strategy is pretty simple for now:
1. find any specified stack trace in pattern graph
2. if found, copy this stack trace to every node in replacement graph

If more complicated logic is needed in the future, we can address it
at a later time.

Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_stack_trace_preserved_subgraph_rewriter
```

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D33791740

fbshipit-source-id: 38bb4885549a9f954278c6c14fa41f58f1d5f7b7
(cherry picked from commit 5cc32a87ce)
2022-01-27 15:33:58 +00:00
Nikita Shulga
56511f859a Revert D33178977: [bc-breaking][quant][be] Refactor fuser_method to include is_qat argument
Test Plan: revert-hammer

Differential Revision:
D33178977 (ef501e8fed)

Original commit changeset: 0c1499c45526

Original Phabricator Diff: D33178977 (ef501e8fed)

fbshipit-source-id: f3912e210e8c588fdbdc9c3c5f4acf2aa8fe6678
(cherry picked from commit cd62183414)
2022-01-27 03:29:40 +00:00
Jerry Zhang
ef501e8fed [bc-breaking][quant][be] Refactor fuser_method to include is_qat argument (#70009)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70009

Currently we rely on module.training to decide whether to do QAT or PTQ fusion. This is
not ideal since the training flag has nothing to do with quantization, so this PR introduces an extra flag `is_qat`
to control this.

Note: currently we still have the constraint that when `is_qat` is True, the modules must be in training mode; we
can relax this constraint later.
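
A sketch of the new fuser method shape after this refactor, with is_qat passed explicitly; the real implementations (in fuser_method_mappings) also handle other conv dimensions and mode assertions:
```
import torch.nn.intrinsic as nni
from torch.nn.utils.fusion import fuse_conv_bn_eval

def fuse_conv_bn(is_qat, conv, bn):
    if is_qat:
        return nni.ConvBn2d(conv, bn)   # QAT keeps BN for training
    return fuse_conv_bn_eval(conv, bn)  # PTQ folds BN into the conv weights
```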

Test Plan:
```
python test/test_quantization.py TestFuseFx
python test/test_quantization.py TestFusion
```

Imported from OSS

Reviewed By: mruberry

Differential Revision: D33178977

fbshipit-source-id: 0c1499c45526971140d9ad58e2994d1edf5ad770
(cherry picked from commit 2d51f9fb28)
2022-01-26 23:33:28 +00:00
Vasiliy Kuznetsov
c3570fd945 fx quant: preserve node stack trace throughout prepare and convert (#70757)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70757

This is an initial PR on a way to preserve stack traces throughout FX
graph mode quantization. It preserves the stack traces of ops
for all of the quantize handlers. A future PR will add stack traces
for dtype transitions.
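
A sketch of how the preserved traces can be inspected, assuming prepared_model is the GraphModule returned by prepare_fx:
```
# each preserved op node should carry the original Python stack trace
for node in prepared_model.graph.nodes:
    if node.op == "call_module":
        print(node.target, bool(node.stack_trace))
```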

Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_stack_trace_preserved
```

Note: the above only tests a single case. In a future PR, once we
expand coverage, we can expand the utility functions to check for stack
traces on all tests.

Imported from OSS

Differential Revision: D33432485

Reviewed By: jerryzh168

Pulled By: vkuzo

fbshipit-source-id: 56c56850393132487430a850fa1def826a9c39c0
(cherry picked from commit c11155b31e)
2022-01-24 14:15:43 +00:00
Terry Chen
64a3827d4e [Quant] Remove inplace hardtanh in test (#71519)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71519

Remove inplace hardtanh in fx quantized op test case

Test Plan:
python3 test/test_quantization.py TestQuantizeFxOps.test_clamp

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D33675227

fbshipit-source-id: a496150ca4b485f953f68e24ddf9beb8ed1d94c0
(cherry picked from commit f65a888900)
2022-01-21 00:30:41 +00:00
Jerry Zhang
08d8f81704 [quant][fix][fx][graphmode] Fix qconfig setting for fused modules (#71254)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71254

When we configure linear and relu with the same qconfig, we currently have utility functions to also
generate a qconfig for the fused linear-relu module, but this code was not called in the correct order,
which resulted in unexpected behaviors. This PR fixes the issue. Please see the test case for more details.
(Test case is from Supriya)
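
The triggering configuration, sketched with qconfig_dict-era names: Linear and ReLU share a qconfig, so a matching qconfig must be derived for the fused LinearReLU module before the QAT swap:
```
import torch.nn as nn
from torch.ao.quantization import get_default_qat_qconfig

qconfig = get_default_qat_qconfig("fbgemm")
qconfig_dict = {
    "object_type": [
        (nn.Linear, qconfig),
        (nn.ReLU, qconfig),
    ],
}
```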

Test Plan:
python test/test_quantization.py TestQuantizeFx.test_fused_module_qat_swap

Imported from OSS

Reviewed By: supriyar

Differential Revision: D33558321

fbshipit-source-id: d95114dc4b77264e603c262c2da02a3de4acba69
2022-01-14 23:31:11 -08:00
Terry Chen
e7c87e8b44 [quant] fix dropout in FX graph mode quantization (#71043)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71043

Fixes issue #68250: dropout breaks FX graph mode quantization.

Test Plan:
python test/test_quantization.py TestStaticQuantizedModule

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D33490176

fbshipit-source-id: 155546505b28ffc635ada65a1464b9d622dbc235
2022-01-13 15:59:59 -08:00
Terry Chen
0cd474b2ce Fix ops not being scriptable
Summary: Fix torch.sort, min/max, and torch.numel not being scriptable after quantization

Test Plan: python3 test/test_quantization.py TestQuantizeFxOps.test_general_shape_ops

Reviewed By: jerryzh168

Differential Revision: D33467184

Pulled By: terrychenism

fbshipit-source-id: 13775ab36d4007978df48c9af71d83398fce5161
2022-01-07 16:55:28 -08:00
Jerry Zhang
c627211651 [quant][fx][graphmode][be] Change the type for output of convert to be torch.nn.Module (#69959)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69959

GraphModule is an implementation detail; we don't want to expose it in the quantization APIs.

Test Plan:
python test/test_quantization.py TestQuantizeFx.test_quantized_model_type

Imported from OSS

Reviewed By: supriyar

Differential Revision: D33119103

fbshipit-source-id: d8736ff08b42ee009d6cfd74dcb3f6150f71f3d2
2021-12-29 20:33:32 -08:00
Jerry Zhang
656d2a7bf6 [quant][fx][graphmode] Add backend_config_dict for standalone module (#70150)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70150

This PR allows the user to specify backend_config_dict for standalone modules, in both the prepare and convert steps.
We are adding this now to allow prototyping for some of our customer use cases; a test for the code path will be added in
a separate PR.

Test Plan:
regression tests
```
python test/test_quantization.py TestQuantizeFx
```
A test that specifies backend_config for a module will be added in a separate PR, since the use case
we have in mind requires other features.

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D33205162

fbshipit-source-id: a657cef8e49d99b6a43653141521dc87c33bfd89
2021-12-22 21:18:39 -08:00
Digant Desai
a86f9806bc Back out "[Quant][fx] Added test for quint4x2 for fx graph mode quantization" (#70274)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70274

Original commit changeset: 89951fcd23e7

Original Phabricator Diff: D33152672 (de4e7dece9)

Test Plan: CI

Reviewed By: larryliu0820

Differential Revision: D33268165

fbshipit-source-id: d667a761d72b9423407ce4d6617e9b6a04b5c9f8
2021-12-21 21:26:46 -08:00
Jon Morton
123be0e5b7 [fusion] Add ConvTranspose+BN fusion support (#70022)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70022

Add support for fusing ConvTranspose{1,2,3}d with BatchNorm{1,2,3}d. This re-uses the existing fusion logic but adds a "transpose" flag to the fusing function which, when enabled, uses the appropriate reshape for ConvTranspose's transposed weights.
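
A sketch of the eager-mode fusion this enables, assuming fuse_modules picks up the new fuser mapping:
```
import torch
import torch.nn as nn

m = nn.Sequential(nn.ConvTranspose2d(3, 3, 2), nn.BatchNorm2d(3)).eval()
fused = torch.ao.quantization.fuse_modules(m, [["0", "1"]])
```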

Test Plan: `buck test mode/dev //caffe2/test:quantization -- -r quantization.eager.test_fusion.TestFusion`

Reviewed By: jerryzh168

Differential Revision: D33074405

fbshipit-source-id: 5e9eff1a06d8f98d117e7d18e80da8e842e973b7
2021-12-20 18:42:48 -08:00
dzdang
de4e7dece9 [Quant][fx] Added test for quint4x2 for fx graph mode quantization (#69846)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69846

Test Plan:
In pytorch main dir, execute

    python test/test_quantization.py TestQuantizeFxOps.test_embedding

to run the added test

Reviewed By: jbschlosser

Differential Revision: D33152672

Pulled By: dzdang

fbshipit-source-id: 89951fcd23e7061d6c51e9422540b5f584f893aa
2021-12-19 06:15:26 -08:00
Jerry Zhang
5db711f9d3 [quant][be] Replace QConfigDynamic with QConfig in code (#69864)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69864

As titled; a follow-up PR will remove QConfigDynamic from the API.

Test Plan:
regression tests
```
python test/test_quantization.py TestPostTrainingStatic
python test/test_quantization.py TestPostTrainingDynamic
python test/test_quantization.py TestQuantizeFx
```

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D33073235

fbshipit-source-id: 6c1a1647032453803c55cdad7c04154502f085db
2021-12-17 22:30:57 -08:00
Andrew Or
3e43c478a8 [Quant][fx] Lower reference conv[1-3]d module (#69228)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69228

Implement lowering logic for reference conv modules,
similar to https://github.com/pytorch/pytorch/pull/65723.
ghstack-source-id: 145058198

Test Plan:
python test/test_quantization.py TestQuantizeFx.test_conv_lowering

Imported from OSS

Reviewed By: anjali411

Differential Revision: D32890743

fbshipit-source-id: 04f2500628c60b0fbc84d22705164215e190aeba
2021-12-14 11:23:39 -08:00
Jerry Zhang
f575179953 [quant][fx][graphmode] Move more patterns to use ModuleReLU fuse handler (#69644)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69644

This PR cleans up the init of ModuleReLUFuseHandler and moves all `module - relu`
fusion patterns to use this handler.

It also temporarily disables the additional_fuser_method argument; this will be re-enabled
after we bring back the simple pattern format.

Test Plan:
```
python test/test_quantize_fx.py TestFuseFx
```

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D32974906

fbshipit-source-id: 23483ea4293d569cb3cec6dadfefd4d9f30921a7
2021-12-11 22:00:06 -08:00
Ha-nyung Chung
3d32a0c139 Back out "[wip][quant][graphmode] produce reference pattern for binary ops and then rewrite to quantized op" (#69713)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69713

Original commit changeset: 456086b308c4

Original Phabricator Diff: D32537714 (bd8a4a9372)

Reviewed By: jerryzh168

Differential Revision: D32976643

fbshipit-source-id: bea6bf6a2718e42c9efa48a0b0c1dc7fe3893065
2021-12-09 21:55:09 -08:00
Ben Koopman
f3983f9c47 [quant][embedding qat] Re-land Add FX support for QAT EmbeddingBag (#69334)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69334

The original PR #68121 broke due to an incompatible qengine on macOS; this PR re-introduces the changes with a fix.

Add FX support for the QAT EmbeddingBag operator; previously there was only eager mode support.
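
A sketch of opting an EmbeddingBag into QAT through qconfig_dict, assuming the embedding QAT qconfig from the eager-mode support is reused:
```
import torch.nn as nn
from torch.ao.quantization import default_embedding_qat_qconfig
from torch.ao.quantization.quantize_fx import prepare_qat_fx

qconfig_dict = {
    "object_type": [(nn.EmbeddingBag, default_embedding_qat_qconfig)],
}
# model = prepare_qat_fx(model, qconfig_dict)
```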

Test Plan:
pytest test/quantization/fx/test_quantize_fx.py  -v -k "test_qat_embeddingbag_linear"

Imported from OSS

Reviewed By: jingsh

Differential Revision: D32815153

fbshipit-source-id: 33654ce29de6e81920bf3277a75027fe403a1eb2
2021-12-08 05:57:20 -08:00
Ben Koopman
93aa3603ee [quant][embedding qat] Re-Land Support Embedding QAT via FX API (#69333)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69333

The original PR was reverted due to a breakage with an incompatible qengine on macOS; this diff fixes that.

Support the QAT workflow using the torch.fx QAT API, e.g. `prepare_qat_fx` and `convert_fx`.

Test Plan:
`pytest test/quantization/fx/test_quantize_fx.py -v -k "test_qat_embedding_linear"`

Imported from OSS

Reviewed By: jingsh

Differential Revision: D32814827

fbshipit-source-id: f7a69d2b596f1276dc5860b397c5d5d07e5b9e16
2021-12-08 05:28:07 -08:00
Jerry Zhang
ca945d989a [quant][graphmode][fx] Add default_replay_qconfig for ops like reshape (#69249)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69249

This PR adds default_replay_qconfig and default_replay_observer, which are used
when we want to configure an operator to reuse the observer from its input. If the input
Tensor for the operator is not observed, we will not observe the output of this operator either;
if the input Tensor is observed, we will observe the output of the operator with the same observer.

e.g.

```
x1 = x0.reshape()
```
if reshape is configured with default_replay_qconfig:
1. if x0 is observed with observer_0, we'll observe x1 with the same observer instance
2. if x0 is not observed, we won't observe x1 either

Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_replay_qconfig
```

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D32774723

fbshipit-source-id: 26862b2bc181d0433e2243daeb3b8f7ec3dd33b2
2021-12-06 22:56:14 -08:00
Jerry Zhang
bd8a4a9372 [wip][quant][graphmode] produce reference pattern for binary ops and then rewrite to quantized op (#68229)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68229

This PR makes BinaryOpQuantizeHandler always produce reference patterns, and we rely on
the subgraph_rewriter to rewrite the reference quantized patterns to quantized ops.

Test Plan:
```
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
```

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D32537714

fbshipit-source-id: 456086b308c4446840d8d37997daa6f8f8068479
2021-12-06 20:20:15 -08:00
Nikita Shulga
a0367f8980 Revert D32404517: [quant][embedding qat] Support Embedding QAT via FX API
Test Plan: revert-hammer

Differential Revision:
D32404517 (abda069ce2)

Original commit changeset: 0484df8c826b

fbshipit-source-id: 4e7d62b9ccdb84eb4d184cd0b3c9506013fd8336
2021-12-02 14:28:35 -08:00
Nikita Shulga
ec4c749024 Revert D32318435: [quant][embedding qat] Add FX support for QAT EmbeddingBag
Test Plan: revert-hammer

Differential Revision:
D32318435 (4484c04513)

Original commit changeset: 8b5d1a5d5422

fbshipit-source-id: e46d431f92a5c3f86c757695164d1eb5b0041298
2021-12-02 14:27:17 -08:00
Ben Koopman
4484c04513 [quant][embedding qat] Add FX support for QAT EmbeddingBag (#68121)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68121

Add FX support for the QAT EmbeddingBag operator; previously there was only eager mode support.

Test Plan:
pytest test/quantization/fx/test_quantize_fx.py  -v -k "test_qat_embeddingbag_linear"

Imported from OSS

Reviewed By: supriyar

Differential Revision: D32318435

fbshipit-source-id: 8b5d1a5d5422972c49676f9e470d5fbe29dd503b
2021-12-02 09:05:07 -08:00
Ben Koopman
abda069ce2 [quant][embedding qat] Support Embedding QAT via FX API (#68296)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68296

Support the QAT workflow using the torch.fx QAT API, e.g. `prepare_qat_fx` and `convert_fx`.

Test Plan:
`pytest test/quantization/fx/test_quantize_fx.py -v -k "test_qat_embedding_linear"`

Imported from OSS

Reviewed By: jingsh, supriyar

Differential Revision: D32404517

fbshipit-source-id: 0484df8c826b823b60dfecd9def77bf8cffe0527
2021-12-02 08:42:45 -08:00
andrewor
79b67d9a4a [Quant] Refactor handling of FixedQParams operators (#68143)
Summary:
**Summary**: FixedQParams operators do not need fake quantization
in the prepare step. This commit introduces FixedQParamsObserver
and makes FixedQParamsFakeQuantize a simple wrapper around this
observer. It also removes the fake quantize logic in forward.
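
Illustrative, assuming the constructor introduced here: ops like sigmoid have a known output range, so their quantization parameters can be fixed up front instead of observed from data:
```
from torch.ao.quantization.observer import FixedQParamsObserver

# quint8 sigmoid output covers [0, 1]: scale = 1/256, zero_point = 0
obs = FixedQParamsObserver(scale=1.0 / 256.0, zero_point=0)
```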

Pull Request resolved: https://github.com/pytorch/pytorch/pull/68143

Test Plan:
Added two tests:
python3 test/test_quantization.py TestQuantizeFx.test_fixed_qparams_patterns
python3 test/test_quantization.py TestQuantizeFx.test_register_patterns

**Reviewers**: Jerry Zhang

**Subscribers**: Jerry Zhang, Supriya Rao

**Tasks**: T104942885

**Tags**: pytorch

Reviewed By: albanD

Differential Revision: D32484427

Pulled By: andrewor14

fbshipit-source-id: 5a048b90eb4da79074c5ceffa3c8153f8d8cd662
2021-11-23 15:26:10 -08:00
Ben Koopman
45ac6f2b65 [quant] Fix comparison against reference for test_qat_functional_linear (#68061)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68061

The test had a typo that prevented comparing the test value against the reference value; fixed the typo.

Test Plan:
`pytest test/quantization/fx/test_quantize_fx.py  -v -k "test_qat_functional_linear"`

Imported from OSS

Reviewed By: HDCharles

Differential Revision: D32280803

fbshipit-source-id: d57a25a0dcdd88df887a39b5117abafaf15125b2
2021-11-09 13:33:13 -08:00
Vasilis Vryniotis
0a9cd6d461 Removes unnecessary no_pretrained_model from test_quantize_fx.py (#67836)
Summary:
TorchVision accidentally included model builders for quantized models without weights; this was an old bug. These builders were largely unusable and caused issues to the users. Commonly they were filtered out to avoid causing issues.

We've recently fixed that (https://github.com/pytorch/vision/pull/4854) by either removing those unnecessary builders or by providing quantized weights. This PR removes the no-longer necessary filtering of the methods.

**It should be merged after TorchVision is synced on FBCode.**

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67836

Reviewed By: jerryzh168

Differential Revision: D32230658

Pulled By: datumbox

fbshipit-source-id: 01cd425b1bda3b4591a25840593b3b5dde3a0f12
2021-11-09 05:49:27 -08:00
Jerry Zhang
10411e3561 [quant][fusion] Fix the additional_fuser_method argument for fuse_fx (#67876)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67876

Previously we missed it when calling obj.convert, so this argument did not affect the fusion.
This PR fixes it and adds a test for it.

Test Plan:
python test/test_quantization.py TestFuseFx

Imported from OSS

Reviewed By: malfet

Differential Revision: D32191364

fbshipit-source-id: 566bd39461010d70a21de71f611bb929976fe01d
2021-11-05 14:51:15 -07:00