Commit Graph

167 Commits

Author SHA1 Message Date
Jerry Zhang
a89851a0d9 [quant][fx][graphmode] Adding a new convert function that produces reference pattern by default (#66925)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66925

Current convert_fx implementation is using "The Interpreter Pattern" in https://pytorch.org/docs/stable/fx.html
Two things have changed that make the approach in this PR both possible and necessary:
1). the original convert implementation was developed during the initial prototype, when fx did not allow mutations; fx
now supports mutations
2). original convert needs to work for a lot of fbgemm/qnnpack specific logic, which is not needed for reference patterns

Therefore it makes sense for us to write a new convert function just for reference patterns. The implementation
is significantly easier to understand than the original convert implementation.

Current support:
* we should be able to support all non-weighted ops like relu, add etc.

Missing:
* linear and conv
* some advanced features like standalone modules, input_quantized_idxs etc.

will add linear and conv support and start defining the backend_config_dict based on this version of convert

Test Plan:
python test/test_quantization.py TestQuantizeFxOpsNew

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D31786241

fbshipit-source-id: 2a32156eb6d3c5271cb44906cd863055785fb5d4
2021-10-20 18:54:30 -07:00
Vasiliy Kuznetsov
1d9a6862cd fx quant: add a BC test for loading old torch.package models (#65538)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65538

Adds a test which verifies that `prepare_fx` and `convert_fx` work
on models created by `torch.package` in the past.  In detail:

1. (one time) create a model and save it with torch.package. Also save input,
expected output, and names of quantization related get_attrs added by
our passes.
2. (every time) load the model from (1), and verify that expected output
matches current output, and that get_attr targets did not change.

Test Plan:
```
python test/test_quantization.py TestSerialization.test_linear_relu_package_quantization_transforms
```

Imported from OSS

Reviewed By: supriyar

Differential Revision: D31512939

fbshipit-source-id: 718ad5fb66e09b6b31796ebe0dc698186e9a659f
2021-10-11 08:23:38 -07:00
Jerry Zhang
508845f2b5 [quant] AO migration of the torch/quantization/quantize_fx.py and torch/quantization/fx/* (#65033)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65033

1. Move the file:
```
hg mv caffe2/torch/quantization/fx caffe2/torch/ao/quantization/fx
hg mv caffe2/torch/quantization/quantize_fx.py caffe2/torch/ao/quantization/quantize_fx.py
```
2. Create new files
```
touch caffe2/torch/quantization/quantize_fx.py
touch caffe2/torch/quantization/fx/__init__.py
```
3. import things in the new files
4. add tests to test/quantization/ao_migration/test_quantization_fx.py
this is because we have some fx import in quantize_fx and fx/*.py
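
Step 3 above (importing things in the new files) can be pictured as a thin re-export shim at the old path. The following is only a minimal sketch, not the code from this PR: the module names `demo_ao_quantize_fx` and `demo_quantization_quantize_fx` and the toy `prepare_fx` are hypothetical stand-ins, built in memory so the example runs without PyTorch.

```python
import importlib
import sys
import types

# Hypothetical stand-in for the NEW location (think torch/ao/quantization/quantize_fx.py).
new_mod = types.ModuleType("demo_ao_quantize_fx")


def prepare_fx(model):
    # Toy implementation; the real function lives in PyTorch and does far more.
    return ("prepared", model)


new_mod.prepare_fx = prepare_fx
sys.modules[new_mod.__name__] = new_mod

# Hypothetical stand-in for the OLD location, recreated as an empty file (step 2)
# and then filled with a re-export (step 3).
old_mod = types.ModuleType("demo_quantization_quantize_fx")
exec("from demo_ao_quantize_fx import prepare_fx", old_mod.__dict__)
sys.modules[old_mod.__name__] = old_mod

# Existing callsites keep importing from the old path and get the same object.
legacy = importlib.import_module("demo_quantization_quantize_fx")
print(legacy.prepare_fx is new_mod.prepare_fx)
```

A migration test in the spirit of step 4 would then check, name by name, that the object reachable through the old path is the same one defined at the new path.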

Test Plan: buck test mode/dev //caffe2/test:quantization

Reviewed By: vkuzo, z-a-f

Differential Revision: D30949749

fbshipit-source-id: 9e5d4d039c8a0a0820bc9040e224f0d2c26886d3
2021-09-22 09:29:15 -07:00
Zafar Takhirov
425f173f9d [quant][refactor] Change the structure of the ao migration tests (#64912)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64912

The test naming was confusing and ambiguous. The file was changed to reflect the framework that is being migrated ("quantization" instead of "quantize"). Also, the common testing class was extracted out.
ghstack-source-id: 138157450

Test Plan: `buck test mode/dev //caffe2/test:quantization -- TestAOMigrationQuantization`

Reviewed By: vkuzo

Differential Revision: D30898214

fbshipit-source-id: 017f95995271d35bcdf6ff6a1b3974b837543e84
2021-09-15 13:15:43 -07:00
Zafar Takhirov
9cc44aad21 [quant] AO migration of the quantize.py (resubmission) (#64445)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64445

AO Team is migrating the existing torch.quantization into torch.ao.quantization. We are doing it one file at a time to make sure that the internal callsites are updated properly.
This migrates the quantize.py from torch.quantization to torch.ao.quantization.
At this point both locations will be supported. Eventually the torch.quantization will be deprecated.

Test Plan: `buck test mode/dev //caffe2/test:quantization`

Reviewed By: HDCharles

Differential Revision: D30734870

fbshipit-source-id: dc204f3cc46bff2cc81c95159eab9d333b43bb4b
2021-09-08 04:58:47 -07:00
Zafar Takhirov
046ed57a4d Revert D30055886: [quant] AO migration of the quantize.py
Test Plan: revert-hammer

Differential Revision:
D30055886 (44e3ed88c9)

Original commit changeset: 8ef7470f9fa6

fbshipit-source-id: c5bd3ead43a2d44b9e56872ec5bd7a195bdac725
2021-09-02 16:59:59 -07:00
Zafar Takhirov
44e3ed88c9 [quant] AO migration of the quantize.py (#64086)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64086

AO Team is migrating the existing torch.quantization into torch.ao.quantization. We are doing it one file at a time to make sure that the internal callsites are updated properly.

This migrates the `quantize.py` from torch.quantization to `torch.ao.quantization`.

At this point both locations will be supported. Eventually the torch.quantization will be deprecated.

Test Plan: `buck test mode/opt //caffe2/test:quantization`

Reviewed By: jerryzh168, raghuramank100

Differential Revision: D30055886

fbshipit-source-id: 8ef7470f9fa640c0042bef5bb843e7a05ecd0b9f
2021-08-29 20:30:01 -07:00
Shen Li
1022443168 Revert D30279364: [codemod][lint][fbcode/c*] Enable BLACK by default
Test Plan: revert-hammer

Differential Revision:
D30279364 (b004307252)

Original commit changeset: c1ed77dfe43a

fbshipit-source-id: eab50857675c51e0088391af06ec0ecb14e2347e
2021-08-12 11:45:01 -07:00
Zsolt Dollenstein
b004307252 [codemod][lint][fbcode/c*] Enable BLACK by default
Test Plan: manual inspection & sandcastle

Reviewed By: zertosh

Differential Revision: D30279364

fbshipit-source-id: c1ed77dfe43a3bde358f92737cd5535ae5d13c9a
2021-08-12 10:58:35 -07:00
Supriya Rao
b8386f5d72 [quant] Create FusedMovingAvgObsFakeQuantize for QAT (#61691)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61691

Create a new module for QAT that performs a fused MovingAvgMinMaxObserver and FakeQuantize operation.
The module currently supports only per-tensor quantization (affine/symmetric). A follow-up PR will add support for per-channel quantization.

Results on running QAT with MobileNetV2 (Obs enabled/fake_quant enabled)
Original FQ module
PyTorchObserver {"type": "_", "metric": "qnnpack_fp_latency_ms", "unit": "ms", "value": "242.80261993408203"}
PyTorchObserver {"type": "_", "metric": "qnnpack_qat0_latency_ms", "unit": "ms", "value": "505.7964324951172"}
PyTorchObserver {"type": "_", "metric": "fbgemm_fp_latency_ms", "unit": "ms", "value": "235.80145835876465"}
PyTorchObserver {"type": "_", "metric": "fbgemm_qat0_latency_ms", "unit": "ms", "value": "543.8144207000732"}

Fused FakeQuant module (~50% improvement in latency)
PyTorchObserver {"type": "_", "metric": "qnnpack_fp_latency_ms", "unit": "ms", "value": "232.1624755859375"}
PyTorchObserver {"type": "_", "metric": "qnnpack_qat0_latency_ms", "unit": "ms", "value": "263.8866901397705"}
PyTorchObserver {"type": "_", "metric": "fbgemm_fp_latency_ms", "unit": "ms", "value": "236.9832992553711"}
PyTorchObserver {"type": "_", "metric": "fbgemm_qat0_latency_ms", "unit": "ms", "value": "292.1590805053711"}

Individual module benchmark result (>5x improvement in latency)
===> Baseline FakeQuantize module
```
---------------------------------------------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------
                                               Name    Self CPU %      Self CPU   CPU total %     CPU total  CPU time avg     Self CUDA   Self CUDA %    CUDA total  CUDA time avg    # of Calls
---------------------------------------------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------
              aten::fake_quantize_per_tensor_affine         0.77%       1.210ms         4.92%       7.730ms     154.596us     718.528us         0.45%       9.543ms     190.862us            50
    aten::fake_quantize_per_tensor_affine_cachemask         2.41%       3.792ms         4.15%       6.520ms     130.402us       8.825ms         5.58%       8.825ms     176.492us            50
                                     aten::_aminmax         3.25%       5.105ms         4.43%       6.955ms     139.102us       8.193ms         5.18%       8.193ms     163.868us            50
                                   aten::zeros_like         1.87%       2.939ms         6.95%      10.922ms     109.218us       5.992ms         3.79%      10.844ms     108.442us           100
                                        aten::zeros         0.97%       1.527ms         3.11%       4.885ms      97.702us       2.383ms         1.51%       4.800ms      96.010us            50
                                         aten::rsub         1.34%       2.106ms         2.94%       4.614ms      92.277us       2.063ms         1.30%       4.559ms      91.173us            50
                                        aten::clamp         2.79%       4.381ms         5.42%       8.519ms      85.190us       5.385ms         3.41%       8.438ms      84.381us           100
                                           aten::eq        11.70%      18.384ms        21.31%      33.479ms      83.280us      22.465ms        14.21%      33.310ms      82.861us           402
                                         aten::ones         1.05%       1.656ms         2.57%       4.038ms      80.751us       2.494ms         1.58%       3.951ms      79.028us            50
                                           aten::le         2.52%       3.955ms         4.84%       7.607ms      76.071us       4.998ms         3.16%       7.702ms      77.016us           100
                                          aten::min         0.69%       1.087ms         2.32%       3.641ms      72.827us       1.017ms         0.64%       3.603ms      72.055us            50
                                          aten::max         1.40%       2.195ms         4.62%       7.260ms      72.597us       2.008ms         1.27%       7.140ms      71.404us           100
                                   aten::is_nonzero         2.68%       4.207ms        11.35%      17.829ms      71.033us       4.062ms         2.57%      17.225ms      68.625us           251
                                       aten::detach         1.17%       1.831ms         3.65%       5.736ms      57.360us       1.680ms         1.06%       5.634ms      56.340us           100
                                          aten::mul         3.36%       5.278ms         3.36%       5.278ms      53.862us       5.215ms         3.30%       5.215ms      53.216us            98
                                          aten::div         3.42%       5.376ms         3.42%       5.376ms      53.759us       5.320ms         3.36%       5.320ms      53.196us           100
                                          aten::sub         6.79%      10.672ms         6.79%      10.672ms      53.901us      10.504ms         6.64%      10.504ms      53.050us           198
                                         aten::item         4.06%       6.380ms        12.02%      18.883ms      53.798us       6.127ms         3.87%      18.322ms      52.198us           351
                                          aten::add         3.28%       5.147ms         3.28%       5.147ms      52.518us       5.113ms         3.23%       5.113ms      52.171us            98
                                      aten::minimum         1.63%       2.555ms         1.63%       2.555ms      51.092us       2.585ms         1.64%       2.585ms      51.708us            50
                                      aten::maximum         3.22%       5.065ms         3.22%       5.065ms      50.646us       5.133ms         3.25%       5.133ms      51.329us           100
                                        aten::round         1.61%       2.529ms         1.61%       2.529ms      50.578us       2.528ms         1.60%       2.528ms      50.552us            50
                                        aten::zero_         1.99%       3.125ms         4.72%       7.422ms      49.481us       2.835ms         1.79%       7.269ms      48.462us           150
                                        aten::copy_         6.62%      10.394ms         6.62%      10.394ms      41.576us      10.252ms         6.48%      10.252ms      41.010us           250
                                             detach         2.49%       3.905ms         2.49%       3.905ms      39.049us       3.954ms         2.50%       3.954ms      39.539us           100
                                       aten::select         2.01%       3.154ms         2.47%       3.876ms      38.759us       3.866ms         2.44%       3.866ms      38.658us           100
                          aten::_local_scalar_dense         7.96%      12.503ms         7.96%      12.503ms      35.621us      12.195ms         7.71%      12.195ms      34.743us           351
                                           aten::to         2.31%       3.625ms         4.16%       6.530ms      32.650us       4.320ms         2.73%       6.270ms      31.348us           200
                                        aten::fill_         3.70%       5.808ms         3.70%       5.808ms      29.039us       5.892ms         3.73%       5.892ms      29.459us           200
                                   aten::as_strided         0.79%       1.244ms         0.79%       1.244ms       6.221us       0.000us         0.00%       0.000us       0.000us           200
                                        aten::empty         3.55%       5.579ms         3.55%       5.579ms      11.137us       0.000us         0.00%       0.000us       0.000us           501
                                      aten::resize_         2.36%       3.712ms         2.36%       3.712ms      12.332us       0.000us         0.00%       0.000us       0.000us           301
                                   aten::empty_like         1.45%       2.284ms         3.68%       5.776ms      28.878us       0.000us         0.00%       0.000us       0.000us           200
                                aten::empty_strided         2.80%       4.398ms         2.80%       4.398ms      17.592us       0.000us         0.00%       0.000us       0.000us           250
---------------------------------------------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------
Self CPU time total: 157.108ms
Self CUDA time total: 158.122ms
```

===> FusedFakeQuant
```
-------------------------------------------------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------
                                                   Name    Self CPU %      Self CPU   CPU total %     CPU total  CPU time avg     Self CUDA   Self CUDA %    CUDA total  CUDA time avg    # of Calls
-------------------------------------------------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------
                                   fb::fused_fake_quant        23.42%       6.408ms       100.00%      27.361ms     547.215us       7.887ms        27.20%      28.996ms     579.925us            50
                  aten::fake_quantize_per_tensor_affine         4.25%       1.162ms        27.65%       7.565ms     151.298us     686.176us         2.37%      10.217ms     204.336us            50
aten::_fake_quantize_per_tensor_affine_cachemask_ten...        14.11%       3.860ms        23.40%       6.403ms     128.068us       9.531ms        32.87%       9.531ms     190.612us            50
                                         aten::_aminmax        20.57%       5.628ms        27.47%       7.515ms     150.305us       8.218ms        28.34%       8.218ms     164.367us            50
                                             aten::item         3.65%     999.522us        10.27%       2.810ms      56.202us     931.904us         3.21%       2.674ms      53.481us            50
                              aten::_local_scalar_dense         6.62%       1.811ms         6.62%       1.811ms      36.212us       1.742ms         6.01%       1.742ms      34.843us            50
                                            aten::empty        10.85%       2.969ms        10.85%       2.969ms      14.843us       0.000us         0.00%       0.000us       0.000us           200
                                       aten::as_strided         1.92%     524.365us         1.92%     524.365us       5.244us       0.000us         0.00%       0.000us       0.000us           100
                                       aten::empty_like         6.48%       1.774ms        14.62%       4.000ms      26.670us       0.000us         0.00%       0.000us       0.000us           150
                                    aten::empty_strided         8.14%       2.226ms         8.14%       2.226ms      14.842us       0.000us         0.00%       0.000us       0.000us           150
-------------------------------------------------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------
Self CPU time total: 27.361ms
Self CUDA time total: 28.996ms
```
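
The "~50%" and ">5x" figures quoted above can be rechecked from the reported numbers. This small sketch only redoes the arithmetic; the helper names `reduction` and `speedup` are introduced here for illustration and are not part of the PR.

```python
# QAT latencies in ms (original FQ module, fused module), copied from the
# PyTorchObserver lines above.
qat_latency_ms = {
    "qnnpack": (505.7964324951172, 263.8866901397705),
    "fbgemm": (543.8144207000732, 292.1590805053711),
}


def reduction(before, after):
    """Fractional latency reduction going from `before` to `after`."""
    return (before - after) / before


def speedup(before, after):
    """How many times faster `after` is than `before`."""
    return before / after


model_reductions = {name: reduction(*pair) for name, pair in qat_latency_ms.items()}

# Profiler totals in ms (baseline FakeQuantize, FusedFakeQuant) from the two tables.
module_cpu_speedup = speedup(157.108, 27.361)
module_cuda_speedup = speedup(158.122, 28.996)

for name, frac in model_reductions.items():
    print(f"{name}: QAT latency reduced by {frac:.1%}")
print(f"module benchmark: {module_cpu_speedup:.2f}x CPU, {module_cuda_speedup:.2f}x CUDA")
```

With these inputs the whole-model QAT latency drops by roughly 46% to 48%, and the individual module benchmark is roughly 5.5x to 5.7x faster, consistent with the rounded claims in the headings.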

Test Plan:
python test/test_quantization.py TestFusedObsFakeQuantModule

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D29706889

fbshipit-source-id: ae3f9fb1fc559920459bf6e8663e8299bf7d21e1
2021-07-21 10:13:04 -07:00
Supriya Rao
92d3391fb1 [quant] Add a new fused MovingAvg Obs + FakeQuant operator(CPU) (#61570)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61570

Fused operator that computes moving average min/max values (in-place) of the input tensor and fake-quantizes it.
It expects the qmin/qmax values to reflect the range of the quantized tensor (instead of reduce_range)

The motivation for adding this operator is performance: moving the computation from Python to C++/CUDA can speed up QAT.
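
As a rough illustration of what such a fused step computes, here is a pure-Python sketch of a moving-average min/max update followed by per-tensor affine fake quantization. It is not the operator added in this PR: the class name `MovingAvgFakeQuantSketch`, the default `averaging_const`, the choice to widen the range to include zero, and the use of Python's `round` are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class MovingAvgFakeQuantSketch:
    # All fields and defaults are illustrative assumptions, not the operator's schema.
    averaging_const: float = 0.01
    qmin: int = 0
    qmax: int = 255
    running_min: float = float("inf")
    running_max: float = float("-inf")

    def observe(self, xs):
        """Update the running min/max in place with an exponential moving average."""
        cur_min, cur_max = min(xs), max(xs)
        if self.running_min == float("inf"):
            # First batch: initialize directly from the observed range.
            self.running_min, self.running_max = cur_min, cur_max
        else:
            c = self.averaging_const
            self.running_min += c * (cur_min - self.running_min)
            self.running_max += c * (cur_max - self.running_max)

    def qparams(self):
        """Derive scale and zero point from the running range and qmin/qmax."""
        lo = min(self.running_min, 0.0)  # assumption: range always contains zero
        hi = max(self.running_max, 0.0)
        scale = (hi - lo) / (self.qmax - self.qmin)
        if scale == 0.0:
            scale = 1.0  # degenerate all-zero input; avoid division by zero
        zero_point = self.qmin - round(lo / scale)
        zero_point = max(self.qmin, min(self.qmax, zero_point))
        return scale, zero_point

    def __call__(self, xs):
        """Observe, then quantize and immediately dequantize each value."""
        self.observe(xs)
        scale, zero_point = self.qparams()
        out = []
        for x in xs:
            q = max(self.qmin, min(self.qmax, round(x / scale) + zero_point))
            out.append((q - zero_point) * scale)
        return out


fq = MovingAvgFakeQuantSketch(averaging_const=0.5)
first = fq([-1.0, 0.0, 1.0])
scale_after_first, _ = fq.qparams()
second = fq([-3.0, 3.0])
print(first, fq.running_min, fq.running_max)
```

Doing both steps in one call is the point of fusion: the per-batch Python overhead of separate observer and fake-quantize modules is what the C++/CUDA operator is meant to avoid.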

Test Plan:
python test/test_quantization.py TestFusedObsFakeQuant

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D29682762

fbshipit-source-id: 28e4c50e77236d6976fe4b326c9a12103ed95840
2021-07-21 10:11:41 -07:00
Angela Yi
cc03ea2c47 [quant] Implemented InputWeightObserver for Linear inputs
Summary: Implemented two observers (InputEqualObserver and WeightEqualObserver) which will be inserted into the graph during prepare_fx().

Test Plan: python test/test_quantization.py TestEqualizeFx

Reviewed By: supriyar

Differential Revision: D28836954

fbshipit-source-id: 25517dc82ae67698ed8b2dc334e3323286976104
2021-06-07 11:19:43 -07:00
Supriya Rao
89d78851e6 [quant][refactor tests] Move qtensor serialization tests from test_deprecated_jit (#59089)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59089

Move these tests into test_quantized_tensor

Test Plan:
python test/test_quantization.py

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D28750065

fbshipit-source-id: 5c4350d49dd07710b86ba330de80369403c6013c
2021-05-27 17:04:15 -07:00
Supriya Rao
886a2ddc83 [quant][refactor tests] Clean up test_quantization.py (#59088)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59088

Clean up comments and organize the tests better

Test Plan:
python test/test_quantization.py

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D28750064

fbshipit-source-id: 4c36922e25e3adea3aaa8b4d9185dc28b17aa57c
2021-05-27 17:03:01 -07:00
Supriya Rao
74089a0d34 [quant][refactor tests] Move quantization tests into subfolders (#59007)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59007

Create folders for each test category and move the tests.
Will follow-up with a cleanup of test_quantization.py

Test Plan:
python test/test_quantization.py

Imported from OSS

Reviewed By: HDCharles

Differential Revision: D28718742

fbshipit-source-id: 4c2dbbf36db35d289df9708565b7e88e2381ff04
2021-05-26 23:02:12 -07:00
Supriya Rao
e146ed21fb [quant][refactor tests] Move TestModelNumerics to a separate file (#59000)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59000

These tests span both QAT and PTQ APIs so factor them out

Test Plan:
python test/test_quantization.py TestModelNumericsEager

Imported from OSS

Reviewed By: HDCharles

Differential Revision: D28713910

fbshipit-source-id: b2ad27cf59abb7cc0c4e4da705f8c9220410f8ad
2021-05-26 23:02:11 -07:00
Supriya Rao
b6c5c5d90e [quant][refactor tests] Rename test_numeric_suite and equalization tests (#58999)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58999

Rename the test files to be more explicit that they are for eager mode

Test Plan:
python test/test_quantization.py

Imported from OSS

Reviewed By: HDCharles

Differential Revision: D28713909

fbshipit-source-id: b4ccd06c841fe96edf8c065a0bceae15fed260f9
2021-05-26 23:02:09 -07:00
Supriya Rao
82d587f434 [quant][refactor tests] split test_workflow_module into test_workflow_ops and test_workflow_module (#58963)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58963

some tests are used to check the op level numerics of the fake quantize operations

Test Plan:
python test/test_quantization.py

Imported from OSS

Reviewed By: HDCharles

Differential Revision: D28696599

fbshipit-source-id: 98f9b0c993dd43050176125461ddd5288142989b
2021-05-26 23:01:08 -07:00
Supriya Rao
950e67fa43 [quant][refactor tests] Move test_qat_module into test_quantize_eager_qat (#58928)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/58928

Test Plan:
python test/test_quantization.py TestConvBNQATModule

Imported from OSS

Reviewed By: raghuramank100

Differential Revision: D28683925

fbshipit-source-id: 59d240d521c8067a344c9bdf4bec94e82f52e76f
2021-05-26 07:49:59 -07:00
Supriya Rao
cc07825a21 [quant][refactor tests] Split test_quantize into test_quantize_eager_ptq, test_quantize_eager_qat and test_fusion (#58927)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58927

Part of a larger refactor of the quantization tests to make it clearer which test belongs where.

proposed folder structure
```
test/quantization
         - bc/
            - test_backward_compatibility.py
         - core/
            - test_quantized_kernels.py
            - test_quantized_workflow_ops.py
            - test_quantized_tensor.py
            - test_workflow_module.py
         - eager/
            - test_quantize_eager_ptq.py
            - test_quantize_eager_qat.py
            - test_fusion.py
         - equalization/
            - test_equalize_eager.py
            - test_bias_correction_eager.py
         - fx/
           - test_quantize_fx.py
         - jit/
            - test_quantize_jit.py
            - test_fusion_passes.py
         - numeric_suite/
            - test_numeric_suite_fx.py
            - test_numeric_suite_eager.py
```

Test Plan:
python test/test_quantization.py

Imported from OSS

Reviewed By: raghuramank100

Differential Revision: D28683926

fbshipit-source-id: f84a4271c77c418ce9751196241933ea8cc14913
2021-05-26 07:48:28 -07:00
Sam Estep
e3900d2ba5 Add lint for unqualified noqa (#56272)
Summary:
As this diff shows, currently there are a couple hundred instances of raw `noqa` in the codebase, which just ignore all errors on a given line. That isn't great, so this PR changes all existing instances of that antipattern to qualify the `noqa` with respect to a specific error code, and adds a lint to prevent more of this from happening in the future.

Interestingly, some of the examples the `noqa` lint catches are genuine attempts to qualify the `noqa` with a specific error code, such as these two:
```
test/jit/test_misc.py:27:            print(f"{hello + ' ' + test}, I'm a {test}") # noqa E999
test/jit/test_misc.py:28:            print(f"format blank") # noqa F541
```
However, those are still wrong because they are [missing a colon](https://flake8.pycqa.org/en/3.9.1/user/violations.html#in-line-ignoring-errors), which actually causes the error code to be completely ignored:

- If you change them to anything else, the warnings will still be suppressed.
- If you add the necessary colons then it is revealed that `E261` was also being suppressed, unintentionally:
  ```
  test/jit/test_misc.py:27:57: E261 at least two spaces before inline comment
  test/jit/test_misc.py:28:35: E261 at least two spaces before inline comment
  ```
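
The check described above can be sketched with a single regular expression. This is only an illustration of the idea in Python; the PR itself uses a `git grep` based lint, and the pattern and function name below are assumptions made for this example rather than the lint's actual definition.

```python
import re

# Flags "# noqa" unless it is immediately followed by ": " and an error code
# such as E501 or F541. A missing colon ("# noqa F541") is therefore flagged too.
UNQUALIFIED_NOQA = re.compile(r"#\s*noqa(?!: [A-Z]+\d+)")


def find_unqualified_noqa(lines):
    """Return 1-based line numbers whose noqa comment lacks a qualifying code."""
    return [
        number
        for number, line in enumerate(lines, start=1)
        if UNQUALIFIED_NOQA.search(line)
    ]


sample = [
    'print(f"format blank") # noqa F541',  # missing colon: code is ignored
    "x = compute()  # noqa: E501",         # properly qualified
    "y = other()  # noqa",                 # raw noqa, suppresses everything
    "z = 3",
]
flagged = find_unqualified_noqa(sample)
print(flagged)
```

The negative lookahead is what makes the colon mandatory, which matches the observation that `# noqa E999` silently behaves like a bare `# noqa`.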

I did try using [flake8-noqa](https://pypi.org/project/flake8-noqa/) instead of a custom `git grep` lint, but it didn't seem to work. This PR is definitely missing some of the functionality that flake8-noqa is supposed to provide, though, so if someone can figure out how to use it, we should do that instead.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/56272

Test Plan:
CI should pass on the tip of this PR, and we know that the lint works because the following CI run (before this PR was finished) failed:

- https://github.com/pytorch/pytorch/runs/2365189927

Reviewed By: janeyx99

Differential Revision: D27830127

Pulled By: samestep

fbshipit-source-id: d6dcf4f945ebd18cd76c46a07f3b408296864fcb
2021-04-19 13:16:18 -07:00
Vasiliy Kuznetsov
9e8e744efe ns for fx: move shadow lstm test to new API (#53828)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53828

Moves LSTM shadow activations test to new API. In order
to enable this, adds support for passing two args instead
of one arg when copying a subgraph from A to B.

Since this was the last test of the old API, deletes
the old test case.

Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIsModels.test_compare_shadow_activations_lstm_dynamic
```

Imported from OSS

Reviewed By: hx89

Differential Revision: D26982733

fbshipit-source-id: 03f580688dd37f3ccd688d9f444e9e79cfa84734
2021-03-25 22:35:31 -07:00
Vasiliy Kuznetsov
3978ffb37a NS for FX: add test for a simple sparsenn model (#52092)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52092

Adds a very simple toy sparsenn model, and enables
its inspection with the new NS APIs.

Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs.test_sparsenn_compare_activations
python test/test_quantization.py TestFXNumericSuiteCoreAPIs.test_sparsenn_shadow
```

Imported from OSS

Reviewed By: raghuramank100

Differential Revision: D26403095

fbshipit-source-id: 3c3650aca47186deb32f2b3f1d87a0716d1ad9d1
2021-02-18 08:17:57 -08:00
Vasiliy Kuznetsov
bfc7e28188 reland - ns for fx - stubs of the three APIs (compare weights, activations, activations with shadow) (#52302)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52302

Adds the basic functionality for the three Numeric Suite core APIs to work on FX models:
1. comparing weights
2. comparing activations, with same input fed to both models
3. comparing activations, with nodes of A shadowing nodes of B

Note: there are a lot of TODOs in the code, and some/most of the APIs and implementation details may change as we iterate.  This is just the first PR.

Test Plan:
We have unit test coverage for all of the APIs, for now this is with toy models:

```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
```

Reviewed By: raghuramank100

Differential Revision: D26463013

Pulled By: vkuzo

fbshipit-source-id: e454115099ad18e4037d3c54986951cdffcab367
2021-02-16 19:59:32 -08:00
Natalia Gimelshein
eaddadd4f7 Revert D26403094: ns for fx - stubs of the three APIs (compare weights, activations, activations with shadow)
Test Plan: revert-hammer

Differential Revision:
D26403094 (37622db76a)

Original commit changeset: 9752331d4ae0

fbshipit-source-id: f0a32d443a29b25af33d90420dfd1bada40c917c
2021-02-14 15:09:16 -08:00
Vasiliy Kuznetsov
37622db76a ns for fx - stubs of the three APIs (compare weights, activations, activations with shadow) (#51669)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51669

Adds the basic functionality for the three Numeric Suite core APIs to work on FX models:
1. comparing weights
2. comparing activations, with same input fed to both models
3. comparing activations, with nodes of A shadowing nodes of B

Note: there are a lot of TODOs in the code, and some/most of the APIs and implementation details may change as we iterate.  This is just the first PR.

Test Plan:
We have unit test coverage for all of the APIs, for now this is with toy models:

```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
```

Imported from OSS

Reviewed By: raghuramank100

Differential Revision: D26403094

fbshipit-source-id: 9752331d4ae0105346d3da309b13c895b593b450
2021-02-12 17:52:21 -08:00
Vasiliy Kuznetsov
bfe6e23209 Early version of fx graph matcher for NS (#51588)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51588

Early version of a utility to match nodes between graph A and graph B, for Numeric Suite for FX graph mode quantization.

The main goal of this utility is to reliably match the nodes of graph A to the nodes of graph B, and throw an easy to read error message.  This will be used in future PRs to create the APIs for matching activations.  It also could potentially be used to match weights.

Test Plan:
For now, we have bare bones test coverage on some toy models, and a single torchvision model.

```
python test/test_quantization.py TestFXGraphMatcher
```

Future PRs will add more testing.

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D26403093

fbshipit-source-id: 60e318d51e6fefe65265488c4967629d946048ef
2021-02-12 17:50:13 -08:00
yanli924
ada916675f update HistogramObserver to be scriptable (#51081)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51081

Pull Request resolved: https://github.com/pytorch/pytorch/pull/51001

fix tests in TestQuantizeJitOps

Test Plan:
Imported from OSS
python test/test_quantization.py

Reviewed By: raghuramank100

Differential Revision: D26038759

Pulled By: lyoka

fbshipit-source-id: 0977ba7b8b26a9f654f20f5c698a7a20ec078c35
2021-01-27 07:27:03 -08:00
Zafar
04a8412b86 [quant] Quantizable LSTM (#49671)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49671

- Introduces the `torch.nn.quantizable` namespace
- Adds the `torch.nn.quantizable.LSTM` module

The point of the `quantizable` namespace is to separate the purely quantized modules from the modules that can be quantized through a normal quantization flow but do not use the quantized kernels explicitly.
That means the quantizable modules are functionally and numerically equivalent to the FP ones and can be used in their place without any loss.

The main difference between the `torch.nn.LSTM` and the `torch.nn.quantizable.LSTM` is that the former one does not support observation for the linear layers, because all the computation is internal to the `aten` namespace.
The `torch.nn.quantizable.LSTM`, however, uses explicit linear layers that can be observed for further quantization.

Test Plan: Imported from OSS

Differential Revision: D25663870

Reviewed By: vkuzo

Pulled By: z-a-f

fbshipit-source-id: 70ff5463bd759b9a7922571a5712d3409dfdfa06
2020-12-30 15:21:38 -08:00
Raghuraman Krishnamoorthi
f7a085af98 Dynamic GRU quantization support (#49448)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49448

ghstack-source-id: 118982171

Test Plan:
buck test caffe2/test:quantization --  'test_qlstmGRU \(quantization\.test_quantized_op\.TestDynamicQuantizedRNNOp\)' --print-passing-details
buck test caffe2/test:quantization --  'test_quantized_rnn \(quantization\.test_quantize\.TestPostTrainingDynamic\)' --print-passing-details
buck test caffe2/test:quantization --  'test_qrnncell \(quantization\.test_quantized_op\.TestDynamicQuantizedRNNOp\)' --run-disabled --print-passing-details

Reviewed By: vkuzo

Differential Revision: D25579815

fbshipit-source-id: 413cc8888eb8058230b94c9576d2fa54b0ed1416
2020-12-21 12:36:59 -08:00
Xin Guan
f8722825b5 Compare Weights FX Implementation (#48056)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48056

PyTorch FX Quantization API:  Compare weights
ghstack-source-id: 117255311

Test Plan:
buck test mode/dev caffe2/test:quantization -- 'test_remove_qconfig_observer_fx'
buck test mode/dev caffe2/test:quantization -- 'test_compare_weights_linear_dynamic_fx'
buck test mode/dev caffe2/test:quantization -- 'test_compare_weights_linear_static_fx'
buck test mode/dev caffe2/test:quantization -- 'test_compare_weights_conv_static_fx'

Reviewed By: hx89

Differential Revision: D24940516

fbshipit-source-id: 301c1958c0e64ead9072e0fd002e4b21e8cb5b79
2020-11-20 17:17:19 -08:00
Jerry Zhang
085193c291 [quant][graphmode][fx][fusion] Add test for fuse_fx (#47085)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47085

Both in train and eval mode

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D24632457

fbshipit-source-id: 486aee4e073fb87e9da46a344e8dc77e848a60cf
2020-10-30 12:25:54 -07:00
James Reed
9bc8f071a3 [WIP] Move torch.fx into its own target (#46658)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46658

ghstack-source-id: 115213192

Test Plan: waitforsadcastle

Reviewed By: zdevito, vkuzo

Differential Revision: D24374723

fbshipit-source-id: 2b5708001f5df2ffb21ea5e586e26030653ccdcf
2020-10-29 17:03:08 -07:00
Jerry Zhang
6b50ccc41c [quant][graphmode][fx] Support sigmoid/hardsigmoid/tanh in qat (#46738) (#46871)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46871

Test Plan:
Imported from OSS

Reviewed By: vkuzo

Differential Revision: D24547180

fbshipit-source-id: d2eb9aa74c6e5436204376b1a2ebcc6188d3562f
2020-10-26 23:52:07 -07:00
Alban Desmaison
25db74bf5e Revert D24486972: [quant][graphmode][fx] Support sigmoid/hardsigmoid/tanh in qat
Test Plan: revert-hammer

Differential Revision:
D24486972 (e927b62e73)

Original commit changeset: c9f139bfdd54

fbshipit-source-id: 2a75f5ec93d55a62b40d1cdd49adcf65436058f7
2020-10-26 12:47:05 -07:00
Jerry Zhang
e927b62e73 [quant][graphmode][fx] Support sigmoid/hardsigmoid/tanh in qat (#46738)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46738

Test Plan: Imported from OSS

Reviewed By: raghuramank100

Differential Revision: D24486972

fbshipit-source-id: c9f139bfdd54973da1a93a45e32937595dbe67fc
2020-10-26 12:04:42 -07:00
Jerry Zhang
13decddae2 [reland][quant] Add FixedQParamsFakeQuantize module (#45538) (#46657)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46657

This is used to simulate fake quantize operation for ops with fixed quantization parameters
e.g. hardsigmoid
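
A minimal pure-Python sketch of the idea, not the module added by this PR: fake quantization rounds a value to an integer grid and immediately dequantizes it, and for an op with a known output range the scale and zero point can be fixed constants instead of observed statistics. The names `fake_quantize`, `FIXED_SCALE` and `FIXED_ZERO_POINT`, and the values 1/256 and 0 for an 8-bit unsigned range, are assumptions made for illustration.

```python
def fake_quantize(x, scale, zero_point, quant_min=0, quant_max=255):
    """Quantize x to the integer grid, clamp, then dequantize back to float."""
    q = round(x / scale) + zero_point
    q = max(quant_min, min(quant_max, q))
    return (q - zero_point) * scale


# Assumed fixed parameters for an op whose output always lies in [0, 1].
FIXED_SCALE = 1.0 / 256.0
FIXED_ZERO_POINT = 0


def hardsigmoid(x):
    """Piecewise-linear sigmoid approximation: clamp(x / 6 + 0.5, 0, 1)."""
    return max(0.0, min(1.0, x / 6.0 + 0.5))


def fake_quantized_hardsigmoid(x):
    """hardsigmoid followed by fake quantization with the fixed parameters."""
    return fake_quantize(hardsigmoid(x), FIXED_SCALE, FIXED_ZERO_POINT)
```

Because the parameters are constants, no observer has to run before the quantization error of such an op can be simulated during training.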

Test Plan:
Imported from OSS

Reviewed By: vkuzo

Differential Revision: D24451406

fbshipit-source-id: 26cc140c00f12bdec9a8f9dc880f4c425f4d4074
2020-10-21 16:47:11 -07:00
Ashkan Aliabadi
2181449068 Revert D24004795: [quant] Add FixedQParamsFakeQuantize module
Test Plan: revert-hammer

Differential Revision:
D24004795 (253918ec55)

Original commit changeset: fc4797f80842

fbshipit-source-id: 663169e90a2f58e5a89e4d382291ae41c24d0fee
2020-10-20 19:40:21 -07:00
Jerry Zhang
253918ec55 [quant] Add FixedQParamsFakeQuantize module (#45538)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45538

This is used to simulate fake quantize operation for ops with fixed quantization parameters
e.g. hardsigmoid

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D24004795

fbshipit-source-id: fc4797f80842daacd3b3584c5b72035774634edd
2020-10-20 17:43:25 -07:00
Jerry Zhang
0da6730f02 [quant][graphmode][fx][eagermode] Add leaky relu support in quantization workflows (#45712)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45712

Eager mode can still use the functional leaky relu, but it will be less accurate than the
LeakyReLU module.
FX graph mode supports both the functional and the module form of leaky relu.

Test Plan: Imported from OSS

Reviewed By: z-a-f

Differential Revision: D24069961

fbshipit-source-id: 8d91c3c50c0bcd068ba3072378ebb4da9549be3b
2020-10-06 12:16:04 -07:00
Supriya Rao
6013a29fc0 [quant] Support quantization of embedding lookup operators (#44207)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44207

Use existing embedding_bag operator but set offsets to [0, 1, .. len(indices)]
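
A rough pure-Python sketch of why this works, assuming a sum-mode bag operator whose offsets include the final end position: when every bag holds exactly one index, the bag sum is just the corresponding row, so the bag operator behaves like a plain lookup. `embedding_bag_sum` and `embedding_lookup` are hypothetical helpers, not the quantized operators.

```python
def embedding_bag_sum(weight, indices, offsets):
    """Sum rows of `weight` per bag; bag i covers indices[offsets[i]:offsets[i + 1]]."""
    out = []
    for start, end in zip(offsets[:-1], offsets[1:]):
        bag = [0.0] * len(weight[0])
        for idx in indices[start:end]:
            bag = [a + b for a, b in zip(bag, weight[idx])]
        out.append(bag)
    return out


def embedding_lookup(weight, indices):
    """Plain lookup expressed as a bag op with one index per bag."""
    offsets = list(range(len(indices) + 1))  # [0, 1, ..., len(indices)]
    return embedding_bag_sum(weight, indices, offsets)


weight = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
rows = embedding_lookup(weight, [2, 0, 2])
```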

Test Plan:
python test/test_quantization.py TestEmbeddingOps.test_embedding_byte

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D23547385

fbshipit-source-id: ccce348bc192c6a4a65a8eca4c8b90f99f40f1b1
2020-09-08 19:03:59 -07:00
Jerry Zhang
5a1aa0e21e [reland][quant][graphmode][fx] Add e2e test on torchvision (#43587)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43587

Add tests for graph mode quantization on torchvision and make sure it matches
current eager mode quantization

Test Plan:
Imported from OSS

Reviewed By: z-a-f

Differential Revision: D23331253

fbshipit-source-id: 0445a44145d99837a2c975684cd0a0b7d965c8f9
2020-08-27 10:12:07 -07:00
Mikhail Zolotukhin
be637fd5f6 Revert D23306683: [quant][graphmode][fx] Testing torchvision
Test Plan: revert-hammer

Differential Revision:
D23306683 (62dcd253e3)

Original commit changeset: 30d27e225d45

fbshipit-source-id: e661334d187d3d6756facd36f2ebdb3ab2cd2e26
2020-08-25 15:24:02 -07:00
Jerry Zhang
62dcd253e3 [quant][graphmode][fx] Testing torchvision (#43526)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43526

Add tests for graph mode quantization on torchvision and make sure it matches
current eager mode quantization

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D23306683

fbshipit-source-id: 30d27e225d4557bfc1d9aa462086e416aa9a9c0e
2020-08-25 13:02:14 -07:00
Edmund Williams Jr
17f9edda42 Bias Correction Implementation (#41845)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/41845

Test Plan: Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D22661503

Pulled By: edmundw314

fbshipit-source-id: a88c349c6cc15b1c66aa6dee7593ef3df588eb85
2020-08-20 21:40:33 -07:00
Jerry Zhang
b0ec336477 [quant][graphmode][fx][test] Add per op test for graph mode quant on fx (#43229)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/43229

Test Plan: Imported from OSS

Reviewed By: supriyar

Differential Revision: D23201692

fbshipit-source-id: 37fa54dcf0a9d5029f1101e11bfd4ca45b422641
2020-08-20 17:32:02 -07:00
Jerry Zhang
dae2973fae [quant][graphmode][fx] Add graph mode quantization on fx (#43175)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43175

This PR adds graph mode quantization on fx: https://github.com/pytorch/pytorch/pull/42741
Currently it matches eager mode quantization for torchvision with static/dynamic/qat.
The DDP/SyncBN test is still WIP.

Test Plan:
python test/test_quantization.py TestQuantizeFx

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D23178602

fbshipit-source-id: 8e7e0322846fbda2cfa79ad188abd7235326f879
2020-08-20 14:50:09 -07:00
Mike Ruberry
b7a9bc0802 Revert D22217029: Add fake quantize operator that works in backward pass
Test Plan: revert-hammer

Differential Revision:
D22217029 (48e978ba18)

Original commit changeset: 7055a2cdafcf

fbshipit-source-id: f57a27be412c6fbfd5a5b07a26f758ac36be3b67
2020-08-07 23:04:40 -07:00
Presley Graham
48e978ba18 Add fake quantize operator that works in backward pass (#40532)
Summary:
This diff adds FakeQuantizeWithBackward. This works the same way as the regular FakeQuantize module, allowing QAT to occur in the forward pass, except it has an additional quantize_backward parameter. When quantize_backward is enabled, the gradients are fake quantized as well (dynamically, using hard-coded values). This allows the user to see whether there would be a significant loss of accuracy if the gradients were quantized in their model.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/40532

Test Plan: The relevant test for this can be run using `python test/test_quantization.py TestQATBackward.test_forward_and_backward`

Reviewed By: supriyar

Differential Revision: D22217029

Pulled By: durumu

fbshipit-source-id: 7055a2cdafcf022f1ea11c3442721ae146d2b3f2
2020-08-07 17:47:01 -07:00
Edmund Williams Jr
fd62847eb2 cross_layer_equalization (#41685)
Summary:
The goal is to implement cross layer equalization as described in section 4.1 in this paper: https://arxiv.org/pdf/1906.04721.pdf
Given two adjacent submodules A and B in a trained model, quantization might hurt one of the submodules more than the other. The paper proposes that a loss in accuracy from quantization can be due to a difference in channel ranges between the two submodules (the output channel range of A can be small while the input channel range of B is large). To minimize this source of error, we scale the tensors of A and B such that their channel ranges are equal (equal ranges leave no mismatch, which minimizes this source of error).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41685

Test Plan: Imported from OSS

Reviewed By: z-a-f

Differential Revision: D22630219

Pulled By: edmundw314

fbshipit-source-id: ccc91ba12c10b652d7275222da8b85455b8a7cd5
2020-07-22 08:39:23 -07:00