34 Commits

Author SHA1 Message Date
Dmytro Dzhulgakov
128a65e2e0 Use noop observer to pass dtype for dynamic quantization (#26709)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26709

Polishes the implementation from #25975. Primarily, we use NoopObserver to communicate that weights need to be quantized to float16. The top-level API (quantize_dynamic) keeps its `dtype` argument, but the implementation now follows the common flow.

One can argue that dynamic fp16 quantization doesn't really fit into the 'observer' mechanism. It's in fact not ideal, but it's better to have the same flow than branching on both dtype and qconfig.
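
To make the trade-off concrete, here is a minimal, torch-free sketch of the idea: an observer that records nothing and exists only to carry a target dtype, so a single conversion path can look at the observer rather than at a separate argument. The names `NoopObserverSketch`, `MinMaxObserverSketch`, and `convert_weight` are hypothetical stand-ins, not PyTorch's implementation, and float16 rounding is emulated with the standard `struct` module.

```python
import struct


class NoopObserverSketch:
    """Collects no statistics; only carries the requested dtype."""

    def __init__(self, dtype):
        self.dtype = dtype

    def __call__(self, values):
        return values  # pass-through, nothing is observed


class MinMaxObserverSketch:
    """Tracks the running min/max needed to pick an int8 scale."""

    dtype = "qint8"

    def __init__(self):
        self.lo, self.hi = float("inf"), float("-inf")

    def __call__(self, values):
        self.lo = min(self.lo, min(values))
        self.hi = max(self.hi, max(values))
        return values


def to_float16(x):
    # Round-trip through IEEE 754 half precision (struct format 'e').
    return struct.unpack("<e", struct.pack("<e", x))[0]


def convert_weight(weight, observer):
    """Single conversion flow: the observer decides the target format."""
    observer(weight)
    if observer.dtype == "float16":
        return [to_float16(w) for w in weight]
    # Symmetric int8 quantization using the observed range.
    scale = max(abs(observer.lo), abs(observer.hi)) / 127 or 1.0
    return [max(-128, min(127, round(w / scale))) for w in weight]


weights = [0.1, -0.5, 0.25]
fp16_weights = convert_weight(weights, NoopObserverSketch("float16"))
int8_weights = convert_weight(weights, MinMaxObserverSketch())
```

In this sketch the caller never branches on dtype; it only chooses which observer to attach, which mirrors the "same flow" argument made above.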

Test Plan: Imported from OSS

Differential Revision: D17544103

Pulled By: dzhulgakov

fbshipit-source-id: 6af3f18c35929a1a53ea734079c005f656e4925f
2019-09-24 09:24:39 -07:00
Lingyi Liu
11f9fe2433 Fix the API for record observer (#26413)
Summary:
This mainly resolves comments from https://github.com/pytorch/pytorch/pull/25830.

Overall, we want to provide a recording observer that captures the runtime tensor values along the activation path, so that numerical accuracy loss can be debugged offline.

According to the feedback from https://github.com/pytorch/pytorch/issues/25830, it might be better to record all the observers in a dict and query the dict to get the corresponding tensor values. hx89 is working on how to insert the recording observers into the model under debug.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26413
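
The dict-based design mentioned in the summary can be sketched without PyTorch as follows. `RecordingObserverSketch` and `attach_observers` are hypothetical names used only for illustration, and plain Python lists stand in for tensors.

```python
class RecordingObserverSketch:
    """Stores every value passed through it, unchanged, for offline analysis."""

    def __init__(self):
        self.recorded = []

    def __call__(self, value):
        self.recorded.append(value)
        return value


def attach_observers(layer_names):
    """Create one recording observer per layer name, keyed by that name."""
    return {name: RecordingObserverSketch() for name in layer_names}


observers = attach_observers(["conv1", "fc"])

# Simulate two forward passes; each layer's output goes through its observer.
for batch in ([1.0, 2.0], [3.0, 4.0]):
    hidden = observers["conv1"]([2 * x for x in batch])
    observers["fc"](sum(hidden))

# Later, query the dict by layer name to compare against a float reference run.
fc_outputs = observers["fc"].recorded
```

Keying the observers by layer name keeps the recorded values addressable after the run, without walking the model again.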

Differential Revision: D17506502

Pulled By: llyfacebook

fbshipit-source-id: 3ab90dc78920e7ec3fa572c2a07327a9991c530a
2019-09-20 14:27:56 -07:00
Jianyu Huang
f433ee1499 Add the FP16 weight support for LSTM in dynamic_quantize (#25975)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25975

Add FP16 weight support for the dynamically quantized LSTM.

Test Plan:
buck test mode/dev caffe2/test:quantization -- 'test_quantized_rnn \(test_quantization\.PostTrainingDynamicQuantTest\)'  --print-passing-details

```
[jianyuhuang@devvm794.ftw3.facebook.com: ~/fbsource/fbcode/caffe2/test] $ buck test mode/dev caffe2/test:quantization -- 'test_quantized_rnn \(test_quantization\.PostTrainingDynamicQuantTest\)'  --print-passing-details
Building: finished in 13.4 sec (100%) 8134/8134 jobs, 81 updated
  Total time: 13.9 sec
Trace available for this run at /tmp/testpilot.20190910-210241.2092790.log
TestPilot test runner for Facebook. See https://fburl.com/testpilot for details.
Testpilot build revision c86e65add357582accb6ec0be23b92c8a2c510bd fbpkg ca46e8f5b26c451a8b0b2462c11bb61d at Mon Sep  9 22:16:37 2019 by twsvcscm from /usr/local/fbprojects/packages/testinfra.testpilot/696/t.par
Discovering tests
Running 1 tests
Started new test run: https://our.intern.facebook.com/intern/testinfra/testrun/1125900050322971
      ✓ caffe2/test:quantization - test_quantized_rnn (test_quantization.PostTrainingDynamicQuantTest) 0.183 1/1 (passed)
Test output:
> test_quantized_rnn (test_quantization.PostTrainingDynamicQuantTest) ... ok
>
> ----------------------------------------------------------------------
> Ran 1 test in 0.184s
>
> OK
Finished test run: https://our.intern.facebook.com/intern/testinfra/testrun/1125900050322971
Summary (total time 4.35s):
  PASS: 1
  FAIL: 0
  SKIP: 0
  FATAL: 0
  TIMEOUT: 0
  OMIT: 0
```

Differential Revision: D17299116

fbshipit-source-id: 7fe91ece25867f2c0496f1b63fb1041e6b815166
2019-09-19 22:19:22 -07:00
Lingyi Liu
62767077c3 add the tensor_observer to record the runtime tensor for quantization … (#25830)
Summary:
…accuracy analysis
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25830

Differential Revision: D17327147

Pulled By: llyfacebook

fbshipit-source-id: 095d5537a31b8d7541081000eaeb8b8474dfb8d0
2019-09-11 13:36:28 -07:00
Jianyu Huang
9b4f3fd7d3 Add torch.nn.LSTM into the default dynamic quantize mappings (#25954)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25954

Add torch.nn.LSTM into the default dynamic quantize mappings. LSTM modules are now dynamically quantized by default when the quantize_dynamic API is applied.
ghstack-source-id: 89839673

Test Plan: CI

Differential Revision: D17294958

fbshipit-source-id: 824aceef821276b3e28c52ce3bebafaf9b0a0833
2019-09-10 21:03:12 -07:00
Haixin Liu
9c10f729de Add Dropout to blacklist (#25881)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25881

Add Dropout to blacklist to avoid the error in eager mode quantization.
ghstack-source-id: 89759536

Test Plan: Test locally in python notebook.

Reviewed By: jianyuh

Differential Revision: D17270826

fbshipit-source-id: bcf43483976740564d7f407838f25c2dbb67b016
2019-09-10 10:57:38 -07:00
Jianyu Huang
0483d537ab Add the dynamic quantized LSTM module (#25157)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25157

Add the dynamic quantized LSTM module.

TODO (separate PRs):
- Serialization.
- Bias can be Null.

ghstack-source-id: 89443731

Test Plan:
buck test mode/dev caffe2/test:quantization -- 'test_quantized_rnn \(test_quantization\.PostTrainingDynamicQuantTest\)'  --print-passing-details
```
[jianyuhuang@devvm2816.prn3.facebook.com: ~/fbsource/fbcode/caffe2/test] $ buck test mode/dev caffe2/test:quantization -- 'test_quantized_rnn \(test_quantization\.PostTrainingDynamicQuantTest\)'  --print-passing-details
Action graph will be rebuilt because files have been added or removed.
Parsing buck files: finished in 1.4 sec
Building: finished in 4.0 sec (100%) 8122/8122 jobs, 2 updated
  Total time: 5.5 sec
Trace available for this run at /tmp/testpilot.20190902-164918.1275502.log
TestPilot test runner for Facebook. See https://fburl.com/testpilot for details.
Testpilot build revision b61bc0e3b71033578eddfe0a28b0739bc685663f fbpkg 3b1c1aed1c534c0cb161a981eca6e2f0 at Sun Sep  1 20:58:52 2019 by twsvcscm from /usr/local/fbprojects/packages/testinfra.testpilot/690/t.par
Discovering tests
Running 1 tests
Started new test run: https://our.intern.facebook.com/intern/testinfra/testrun/2251799823877227
      ✓ caffe2/test:quantization - test_quantized_rnn (test_quantization.PostTrainingDynamicQuantTest) 1.048 1/1 (passed)
Test output:
> test_quantized_rnn (test_quantization.PostTrainingDynamicQuantTest) ... ok
>
> ----------------------------------------------------------------------
> Ran 1 test in 1.049s
>
> OK
Finished test run: https://our.intern.facebook.com/intern/testinfra/testrun/2251799823877227
Summary (total time 5.53s):
  PASS: 1
  FAIL: 0
  SKIP: 0
  FATAL: 0
  TIMEOUT: 0
  OMIT: 0
```

Differential Revision: D16955662

fbshipit-source-id: 61cf1a74913105fa02e44b3941813eabac0006b5
2019-09-03 19:18:28 -07:00
Zafar Takhirov
e44c09ecae making quant utilities inplace
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/25054

Test Plan: Imported from OSS

Differential Revision: D16974198

Pulled By: zafartahirov

fbshipit-source-id: 54befc8429990adafe746d1255d117fca5f12e11
2019-08-29 16:03:13 -07:00
Raghuraman Krishnamoorthi
f5a3d59254 Handle empty qconfig for functional Modules (#25215)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25215

ghstack-source-id: 89044252

Test Plan: Test implemented in D16879132/

Differential Revision: D17064670

fbshipit-source-id: 08d3d566aa123bedf318ab5a8bc9b71457930ff2
2019-08-27 12:31:26 -07:00
Raghuraman Krishnamoorthi
f622ec8084 Update mapping dictionary to support functionalmodules and pooling operations (#25216)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25216

ghstack-source-id: 89045562

Test Plan: buck test mode/dev caffe2/test:quantization -- 'test_resnet_base\ \(test_quantization.PostTrainingQuantTest\)' --print-passing-details

Differential Revision: D17065029

fbshipit-source-id: b248abf6de162f38e35e6bace17bde1be9e38c57
2019-08-26 23:00:01 -07:00
Raghuraman Krishnamoorthi
17f69eff22 Revert D16879133: Handle empty qconfig for functional Modules
Test Plan: revert-hammer

Differential Revision:
D16879133

Original commit changeset: 230f5204cfbd

fbshipit-source-id: 29b4bfe066b173797f3d9f2fcf7cbf5ee21ff8fb
2019-08-26 16:25:29 -07:00
Raghuraman Krishnamoorthi
a9fdc1923b Revert D16879132: Update mapping dictionary to support functionalmodules and pooling operations
Test Plan: revert-hammer

Differential Revision:
D16879132

Original commit changeset: cd8c10182aa7

fbshipit-source-id: 9b67ccf73f43d15ef50bf0331d3df4d57835931b
2019-08-26 16:25:25 -07:00
Raghuraman Krishnamoorthi
794f63fe92 Update mapping dictionary to support functionalmodules and pooling operations (#24804)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24804

ghstack-source-id: 89003799

Test Plan: buck test mode/dev caffe2/test:quantization -- 'test_resnet_base\ \(test_quantization.PostTrainingQuantTest\)' --print-passing-details

Differential Revision: D16879132

fbshipit-source-id: cd8c10182aa732ddf655bcda17f72ea08033a633
2019-08-26 12:16:49 -07:00
Raghuraman Krishnamoorthi
d7f6ac1dbb Handle empty qconfig for functional Modules (#24803)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24803

ghstack-source-id: 89003797

Test Plan: Test implemented in D16879132/

Differential Revision: D16879133

fbshipit-source-id: 230f5204cfbd149fea1c0985578a2572a0e0f2a8
2019-08-26 12:16:46 -07:00
Zafar Takhirov
1a74bd407d Fixes the adding of the observer to the FloatFunctional (#24418)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24418

Fixes #24394

The observer is not added correctly because one of the conditions is not met.

Test Plan: Imported from OSS

Differential Revision: D16833951

Pulled By: zafartahirov

fbshipit-source-id: bb4699e6a1cf6368c7278272a68e5e7c6d3f59a8
2019-08-15 17:27:00 -07:00
Jerry Zhang
761ae8e9b6 Add intrinsic module mappings (#23753)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23753

Add intrinsic (fused) module mappings in quantize.py to enable mapping fused modules
in both QAT and PTQ

Differential Revision: D16820749

fbshipit-source-id: 07de76a4f09b44bde8b193c103eac02c22b875b6
2019-08-15 09:37:24 -07:00
Jianyu Huang
0f64043b49 Remove the activation observer for default_qconfig (#24299)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24299

As suggested in https://github.com/pytorch/pytorch/pull/24232, we will remove the activation observer for the dynamic quantization path.
ghstack-source-id: 88287094

Differential Revision: D16798590

fbshipit-source-id: 07a245d5584b5b15c6895d9b09deef4a0605073a
2019-08-14 17:21:50 -07:00
Jianyu Huang
e8d2ddc2c4 Make the default qconfig_dict (#24232)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24232

As suggested in https://github.com/pytorch/pytorch/pull/23128#discussion_r306650311, we use `torch.nn.Linear` as the key of default_qconfig_dict. That is, dynamic quantization is applied to `torch.nn.Linear` by default when the user just specifies `torch.quantize_dynamic(model)`.
ghstack-source-id: 88287089

Differential Revision: D16781191

fbshipit-source-id: 991a5e151a9ea32b879d6897cd9862855d747135
2019-08-14 15:12:55 -07:00
Jianyu Huang
584c6986fd Add the type matching rule for qconfig_dict (#23212)
Summary:
We want to use the Module type as the key of the qconfig_dict for module replacement during quantization.

Before this Diff, to dynamically quantize the BERT model, we had to specify each layer:
```
qconfig_dict = {
    'encoder.layer.0.attention.self.query': default_qconfig,
    'encoder.layer.0.attention.self.key': default_qconfig,
    'encoder.layer.0.attention.self.value': default_qconfig,
    'encoder.layer.0.attention.output.dense': default_qconfig,
    'encoder.layer.0.intermediate.dense': default_qconfig,
    'encoder.layer.0.output.dense': default_qconfig,
    'encoder.layer.1.attention.self.query': default_qconfig,
    'encoder.layer.1.attention.self.key': default_qconfig,
    'encoder.layer.1.attention.self.value': default_qconfig,
    'encoder.layer.1.attention.output.dense': default_qconfig,
    'encoder.layer.1.intermediate.dense': default_qconfig,
    'encoder.layer.1.output.dense': default_qconfig,
   ...
}
```
After this Diff, we only need the following:
```
qconfig_dict = {
     torch.nn.Linear : default_qconfig
}
```
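
A rough, torch-free sketch of how such a type matching rule can resolve a qconfig per module is shown below. The classes `Linear` and `LayerNorm`, the helper `resolve_qconfig`, and the string placeholder for `default_qconfig` are hypothetical stand-ins; the precedence (an explicit name entry overrides a type entry) is an assumption made for illustration, not something this commit message states.

```python
class Linear:
    """Stand-in for torch.nn.Linear."""


class LayerNorm:
    """Stand-in for a layer that should stay in floating point."""


default_qconfig = "default_qconfig"  # placeholder for a real qconfig object

# Flattened view of a model: fully qualified name -> module instance.
named_modules = {
    "encoder.layer.0.attention.self.query": Linear(),
    "encoder.layer.0.attention.output.dense": Linear(),
    "encoder.layer.0.attention.output.LayerNorm": LayerNorm(),
    "encoder.layer.1.output.dense": Linear(),
}


def resolve_qconfig(name, module, qconfig_dict):
    """Return the qconfig for one module, or None to leave it in float."""
    if name in qconfig_dict:  # assumed: a name entry wins over a type entry
        return qconfig_dict[name]
    return qconfig_dict.get(type(module))


# One type-keyed entry covers every Linear; a name entry opts one layer out.
qconfig_dict = {Linear: default_qconfig, "encoder.layer.1.output.dense": None}

assigned = {
    name: resolve_qconfig(name, module, qconfig_dict)
    for name, module in named_modules.items()
}
quantized_names = sorted(name for name, qc in assigned.items() if qc is not None)
```

With the type key, the dictionary size no longer grows with the number of layers in the model.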

Pull Request resolved: https://github.com/pytorch/pytorch/pull/23212
ghstack-source-id: 88287091

Reviewed By: zafartahirov

Differential Revision: D16436542

fbshipit-source-id: 11fbe68ee460560c1a7cdded63581eb7a00e5a89
2019-08-14 13:07:36 -07:00
Jianyu Huang
e94ba742b0 Dynamic Quantized Linear Module (#23128)
Summary:
- ~~Add a unit test for the Dynamic Quantized Linear operator (```torch.fbgemm_linear_quantize_weight```, ```torch.fbgemm_pack_quantized_matrix```, and ```torch.fbgemm_linear_int8_weight```) in ```test_quantized.py```.~~ Move this to D16404027 for a separate review.
- Add the Dynamic Quantized Linear module in ```torch/nn/quantized/modules/linear.py```. ~~This is in a rudimentary stage. Will add more functions later~~.
- Add the torch.quantize logic (prepare, eval, convert) for dynamic quantization.
- Add a unit test for the Dynamic Quantized Linear module  in ```test_nn_quantized.py```.
- Add a unit test for the Model-level Quantization API

Pull Request resolved: https://github.com/pytorch/pytorch/pull/23128
ghstack-source-id: 88257232

Differential Revision: D16258664

fbshipit-source-id: 4be3ac39ee27c088b341c741d3f09f51d5a23ef0
2019-08-13 21:01:23 -07:00
Zafar Takhirov
4cc16782f3 Removing the make_module script. (#23635)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23635

Adding new modules through a base class appears to be no more complex than using a generation script.

Test Plan: Imported from OSS

Differential Revision: D16593364

Pulled By: zafartahirov

fbshipit-source-id: 852dcf41f3dfa2a89152042b8e61d0b6defa8feb
2019-08-13 09:58:28 -07:00
Jerry Zhang
89956374c3 Remove qconfig_dict from API (#23465)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23465

We decided not to let users pass qconfig_dict to do quantization,
since that API is not robust.

Differential Revision: D16611504

fbshipit-source-id: b0d1d311b32c990a165c480f50e9ce3d68b785b5
2019-08-02 10:28:48 -07:00
Zafar Takhirov
9c549dfdc1 make_module: First version
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23288

Test Plan: Imported from OSS

Differential Revision: D16455390

Pulled By: zafartahirov

fbshipit-source-id: 4352f0a17cd0382b48502b93e51574cc3acdfdcc
2019-07-30 22:14:44 -07:00
Jerry Zhang
bc64324da9 Change condition in swap module
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23561

Test Plan:
python test/test_quantization.py

Imported from OSS

Differential Revision: D16570928

Pulled By: jerryzh168

fbshipit-source-id: 70f36f577ac657d015f3d7738819867742088e5a
2019-07-30 17:25:02 -07:00
Jerry Zhang
7364aa796d skip nn.Identity in add_observer
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23500

Test Plan:
e2e test in quantizing resnext 101

Imported from OSS

Differential Revision: D16550190

Pulled By: jerryzh168

fbshipit-source-id: 6128d7c3419235152b43739fcc5cade34342ba3d
2019-07-30 11:00:36 -07:00
Jerry Zhang
d7448c7812 quantized conv module (#23178)
Summary:
att

Pull Request resolved: https://github.com/pytorch/pytorch/pull/23178
ghstack-source-id: 86973164

Differential Revision: D16426871

fbshipit-source-id: a2ebb38997acfeb61b7dfd6b11dd8ee9b3a7a8ed
2019-07-22 20:47:40 -07:00
Jerry Zhang
77353636de Conv module (#23084)
Summary:
Added Conv module for qat

Pull Request resolved: https://github.com/pytorch/pytorch/pull/23084
ghstack-source-id: 86862445

Differential Revision: D16379417

fbshipit-source-id: 742cc8b8e0f132070ca4943a1c2e3db60c2b5bdc
2019-07-19 18:49:52 -07:00
Jerry Zhang
7cc029cb75 Quantization aware training in eager mode (#23082)
Summary:
Add support for quantization aware training in eager mode

Modifications to Post training flow:
## Prepare
* Fusion: e.g. (Conv, Bn) → ConvBn (float)
* Swapping: To insert fake_quant to weight, we need to swap the float modules that have weights with the corresponding qat modules, e.g. Conv → torch.nn.qat.Conv, ConvBn → torch.nn._intrinsic.qat.ConvBn
    * Previously we were thinking about modifying the weight in forward_pre hook and changing it back in forward_hook:
```
def forward_pre_hook(self, input):
    # Keep the float weight and run forward on the fake-quantized copy.
    self.float_weight = self.weight
    self.weight = self.fake_quantize(self.float_weight)

def forward_hook(self, input, output):
    # Restore the float weight after forward.
    self.weight = self.float_weight
```

* Assignments to self.weight are needed because we can’t change the forward function, and the forward function reads self.weight.
* But we would need to keep two copies of the weight in this case, so it’s probably better to just swap the module.
* So we just swap Conv to torch.nn.qat.Conv and Linear to torch.nn.qat.Linear.
* qat modules will have fake_quant for output and weights inserted in the forward function.
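
The two options weighed above can be compared in a small torch-free sketch. Everything here is a hypothetical stand-in: `ToyLinear`, `ToyQATLinear`, `call_with_hooks`, and the fixed-scale `fake_quantize` are illustrative toys, not the modules added by this commit.

```python
def fake_quantize(weight, scale=0.1):
    """Quantize-dequantize: snap each value to the nearest multiple of scale."""
    return [round(w / scale) * scale for w in weight]


class ToyLinear:
    """Float module whose forward reads self.weight and cannot be edited."""

    def __init__(self, weight):
        self.weight = weight

    def forward(self, x):
        return sum(w * xi for w, xi in zip(self.weight, x))


def call_with_hooks(module, x):
    """Hook approach: replace the weight before forward, restore it after."""
    module.float_weight = module.weight  # second copy has to be kept around
    module.weight = fake_quantize(module.float_weight)
    try:
        return module.forward(x)
    finally:
        module.weight = module.float_weight  # put the float weight back


class ToyQATLinear(ToyLinear):
    """Swap approach: fake quantization lives inside forward itself."""

    def forward(self, x):
        w = fake_quantize(self.weight)
        return sum(wq * xi for wq, xi in zip(w, x))

    @classmethod
    def from_float(cls, float_module):
        return cls(float_module.weight)


float_linear = ToyLinear([0.26, -0.14])
hooked_out = call_with_hooks(float_linear, [1.0, 1.0])
swapped_out = ToyQATLinear.from_float(float_linear).forward([1.0, 1.0])
```

Both paths compute the same output here; the swapped module simply avoids the extra `float_weight` attribute and the mutate-then-restore step.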

## Convert
* The flow should be identical to ptq, but the swapping dictionary is slightly different since modules are changed in the prepare step.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/23082
ghstack-source-id: 86824650

Differential Revision: D16379374

fbshipit-source-id: 7d16d1acd87025065a24942ff92abf18e9fc8070
2019-07-19 14:57:25 -07:00
Soumith Chintala
84c2c89e2c Revert D16199356: [qat] Quantization aware training in eager mode
Differential Revision:
D16199356

Original commit changeset: 62aeaf47c12c

fbshipit-source-id: d06a96b0a617ae38029ffb246173ec065454b666
2019-07-19 03:18:48 -07:00
Soumith Chintala
f19aa12ae5 Revert D16274792: [qat] Conv module
Differential Revision:
D16274792

Original commit changeset: 1da10194123b

fbshipit-source-id: 71b34774b463f2350289bd39b8cfd798e095ffa5
2019-07-19 03:18:45 -07:00
Jerry Zhang
12d9d768b8 Conv module (#22899)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22899

Added Conv module for qat

Reviewed By: zafartahirov

Differential Revision: D16274792

fbshipit-source-id: 1da10194123b2759a6a35c60d1c2d2c0b569ccdc
2019-07-18 18:58:07 -07:00
Jerry Zhang
65ef671d11 Quantization aware training in eager mode (#22732)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22732

Add support for quantization aware training in eager mode

Modifications to Post training flow:
## Prepare
* Fusion: e.g. (Conv, Bn) → ConvBn (float)
* Swapping: To insert fake_quant to weight, we need to swap the float modules that have weights with the corresponding qat modules, e.g. Conv → torch.nn.qat.Conv, ConvBn → torch.nn._intrinsic.qat.ConvBn
    * Previously we were thinking about modifying the weight in forward_pre hook and changing it back in forward_hook:
```
def forward_pre_hook(self, input):
    # Keep the float weight and run forward on the fake-quantized copy.
    self.float_weight = self.weight
    self.weight = self.fake_quantize(self.float_weight)

def forward_hook(self, input, output):
    # Restore the float weight after forward.
    self.weight = self.float_weight
```

* Assignments to self.weight are needed because we can’t change the forward function, and the forward function reads self.weight.
* But we would need to keep two copies of the weight in this case, so it’s probably better to just swap the module.
* So we just swap Conv to torch.nn.qat.Conv and Linear to torch.nn.qat.Linear.
* qat modules will have fake_quant for output and weights inserted in the forward function.
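
For readers unfamiliar with the term, the fake_quant step named in the last bullet is a quantize, clamp, and dequantize round trip that stays in floating point. The sketch below shows the arithmetic under simplifying assumptions: `fake_quantize` and `ToyFakeQuantLinear` are hypothetical names, and a single scale and zero point are shared by weights and output purely for brevity.

```python
def fake_quantize(x, scale, zero_point, qmin=0, qmax=255):
    """Quantize to an integer grid, clamp to [qmin, qmax], then dequantize."""
    q = round(x / scale) + zero_point
    q = max(qmin, min(qmax, q))
    return (q - zero_point) * scale


class ToyFakeQuantLinear:
    """Toy qat-style layer: fake_quant on the weights and on the output."""

    def __init__(self, weight, scale=0.5, zero_point=128):
        self.weight = weight
        self.scale = scale
        self.zero_point = zero_point

    def forward(self, x):
        w = [fake_quantize(wi, self.scale, self.zero_point) for wi in self.weight]
        out = sum(wi * xi for wi, xi in zip(w, x))
        return fake_quantize(out, self.scale, self.zero_point)


layer = ToyFakeQuantLinear([1.3, -0.7])
y = layer.forward([1.0, 2.0])

# Values far outside the representable range saturate at the clamp bounds.
saturated_high = fake_quantize(1000.0, 0.5, 128)
saturated_low = fake_quantize(-1000.0, 0.5, 128)
```

Because the result is still a float, the surrounding training code is unchanged while the forward pass sees the rounding and saturation that a quantized model would introduce.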

## Convert
* The flow should be identical to ptq, but the swapping dictionary is slightly different since modules are changed in the prepare step.

Reviewed By: zafartahirov

Differential Revision: D16199356

fbshipit-source-id: 62aeaf47c12c62a87d9cac208f25f7592e245d6c
2019-07-18 18:58:03 -07:00
Jerry Zhang
b984b0ab4b fix print (#22689)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22689

att

Reviewed By: Lucaskabela

Differential Revision: D16184260

fbshipit-source-id: 1a6ad51a37918d0c81d6e3baa0ca0baa32cb9673
2019-07-10 11:26:34 -07:00
Jerry Zhang
5040d52a5a torch.quantization conversion utilities, observers for eager mode quantization (#22010)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22010

torch.quantization module with observers and conversion routines

Reviewed By: zafartahirov

Differential Revision: D15554183

fbshipit-source-id: 05a3fabe28dd701978b8ecebf5bfc3a4c044ba5c
2019-07-09 10:51:38 -07:00