Commit Graph

202 Commits

Author SHA1 Message Date
Dmytro Dzhulgakov
764bf826e3 Remove fbgemm_is_cpu_supported in favor of torch.backends.quantized.supported_qengines (#26840)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26840

Cleaning up top-level namespace. Also cosmetic changes to torch.backends.quantized

Test Plan: Imported from OSS

Differential Revision: D17604403

Pulled By: dzhulgakov

fbshipit-source-id: c55af277ea7319d962a82a6120f65ccd47a60abc
2019-09-27 13:45:15 -07:00
Raghuraman Krishnamoorthi
b0a2f6f2f5 Serialization and range reduction support for Fake Quant/Observer (#26519)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26519

ghstack-source-id: 90895631

Test Plan:
buck test caffe2/test:quantization -- 'test_histogram_observer \(test_quantization\.ObserverTest\)' --print-passing-details
and
buck test caffe2/test:fake_quant -- 'test_fq_serializable \(test_fake_quant\.TestFakeQuantizePerTensorAffine\)' --print-passing-details

Differential Revision: D17217408

fbshipit-source-id: 0da7efdcdae0c065dd035c5dd2b6a78231545ece
2019-09-27 10:09:39 -07:00
Dmytro Dzhulgakov
0a8a779abe Add more inplace arguments to quantization top level API (#26782)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26782

At least we should be consistent on top-level APIs and prepare/convert/etc.

Logic is inplace=False by default but top-level APIs take care of doing fewer copies.

Also renames always-inplace methods like add_observer to have underscore in the end.

One fix for MinMaxObserver was triggered by deepcopy surfacing that we were accidentally keeping autograd around

Test Plan: Imported from OSS

Differential Revision: D17595956

Pulled By: dzhulgakov

fbshipit-source-id: 801f9f5536b553f24c7a660064dd6fce685edd65
2019-09-26 00:07:07 -07:00
Dmytro Dzhulgakov
128a65e2e0 Use noop observer to pass dtype for dynamic quantization (#26709)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26709

Polishes implementation from #25975. Primarily, we use NoopObserver to communicate that weights need to be quantized to float16. The very top-level API (quantize_dynamic) stays the same with `dtype` argument but the implementation follows the common flow.

One can argue that dynamic fp16 quantization doesn't really fit into the 'observer' mechanism. It's in fact not ideal, but it's better to have the same flow than branching on both dtype and qconfig.

Test Plan: Imported from OSS

Differential Revision: D17544103

Pulled By: dzhulgakov

fbshipit-source-id: 6af3f18c35929a1a53ea734079c005f656e4925f
2019-09-24 09:24:39 -07:00
Dmytro Dzhulgakov
a79b3685db Simplify observers declaration with functools.partial (#26492)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26492

Previous definition of observers was quite clumsy - with things like `default_observer()()`. This PR strips a way a lot of craft and allows to pass just class names directly. In order to override default arguments either `functools.partial` can be used or convenient wrapper `MyObserver.with_args(x=1)` is provided.

Also rename `QConfig_dynamic` to `QConfigDynamic` because it violates the naming convention.

Test Plan: Imported from OSS

Differential Revision: D17521265

Pulled By: dzhulgakov

fbshipit-source-id: ba9df19b368641acf4093c43df9990796284fd9e
2019-09-23 10:15:59 -07:00
Jerry Zhang
254122dd4e quantize_linear -> quantize_per_tensor (#26574)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26574

Since we also have `quantized::linear`, `quantize_linear` sounds
confusing, so we plan to rename it before the branch cut

Test Plan:
ci

Imported from OSS

Differential Revision: D17514876

fbshipit-source-id: 01d9005e6ec8cb9950b9d8bba122109c389641d3
2019-09-20 21:58:48 -07:00
Lingyi Liu
11f9fe2433 Fix the API for record observer (#26413)
Summary:
Mainly want to resolve comments from https://github.com/pytorch/pytorch/pull/25830.

Overall, we want to provide a recording observer for recording the runtime tensor values of activation path in order to debug the numerical accuracy loss offline.

According to the feedback from https://github.com/pytorch/pytorch/issues/25830, it might be better to record all the observers in a dict and query the dict to get corresponding tensor values. hx89 is working on how to insert the recording observers into model under debug.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26413

Differential Revision: D17506502

Pulled By: llyfacebook

fbshipit-source-id: 3ab90dc78920e7ec3fa572c2a07327a9991c530a
2019-09-20 14:27:56 -07:00
Jianyu Huang
f433ee1499 Add the FP16 weight support for LSTM in dynamic_quantize (#25975)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25975

We would like to add the FP16 weight support for the dynamic quantized LSTM.

Test Plan:
buck test mode/dev caffe2/test:quantization -- 'test_quantized_rnn \(test_quantization\.PostTrainingDynamicQuantTest\)'  --print-passing-details

```
[jianyuhuang@devvm794.ftw3.facebook.com: ~/fbsource/fbcode/caffe2/test] $ buck test mode/dev caffe2/test:quantization
-- 'test_quantized_rnn \(test_quantization\.PostTrainingDynamicQuantTest\)'  --print-passing-details
Building: finished in 13.4 sec (100%) 8134/8134 jobs, 81 updated
  Total time: 13.9 sec
Trace available for this run at /tmp/testpilot.20190910-210241.2092790.log
TestPilot test runner for Facebook. See https://fburl.com/testpilot for details.
Testpilot build revision c86e65add357582accb6ec0be23b92c8a2c510bd fbpkg ca46e8f5b26c451a8b0b2462c11bb61d at Mon Sep  9
22:16:37 2019 by twsvcscm from /usr/local/fbprojects/packages/testinfra.testpilot/696/t.par
Discovering tests
Running 1 tests
Started new test run: https://our.intern.facebook.com/intern/testinfra/testrun/1125900050322971
      ✓ caffe2/test:quantization - test_quantized_rnn (test_quantization.PostTrainingDynamicQuantTest) 0.183 1/1 (passed)
Test output:
> test_quantized_rnn (test_quantization.PostTrainingDynamicQuantTest) ... ok
>
> ----------------------------------------------------------------------
> Ran 1 test in 0.184s
>
> OK
Finished test run: https://our.intern.facebook.com/intern/testinfra/testrun/1125900050322971
Summary (total time 4.35s):
  PASS: 1
  FAIL: 0
  SKIP: 0
  FATAL: 0
  TIMEOUT: 0
  OMIT: 0
```

Differential Revision: D17299116

fbshipit-source-id: 7fe91ece25867f2c0496f1b63fb1041e6b815166
2019-09-19 22:19:22 -07:00
Haixin Liu
dcbfc3bdbf Add per channel observer (#25887)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25887

ghstack-source-id: 90383258

Add per channel observer to compute the qparams for each channel.

Test Plan:
buck test mode/dev caffe2/test:quantization -- 'test_per_channel_minmax_observer'

buck test mode/dev caffe2/test:quantization -- 'test_per_channel_minmax_observer_scriptable'

Differential Revision: D17137226

fbshipit-source-id: 0b1c93e3cbcda86f5c4e30f7cd94c670f2665063
2019-09-18 22:16:45 -07:00
Haixin Liu
f2e9622ed8 Add l2 norm minimization (#24022)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24022

In histogram observer add an approximation for L2 error minimization for selecting min/max.
By selecting new min/max, we filter out outliers in input distribution.

This follows the implementation of NormMinimization::NonlinearQuantizationParamsSearch in caffe2/quantization/server/norm_minimization.cc
ghstack-source-id: 90298789

Test Plan: buck test mode/dev caffe2/test:quantization -- 'test_histogram_observer'

Differential Revision: D16713239

fbshipit-source-id: 82631ba47974e25689c9c66bc3088117090e26d4
2019-09-18 00:07:10 -07:00
Sebastian Messmer
9f6b6b8101 Back out "[quant][observer] Add histogram observer" (#26236)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26236

Original diff broke oss CI. Reverting.

Original commit changeset: 0f047d3349cb
ghstack-source-id: 90125990

Test Plan: testinprod

Reviewed By: hx89

Differential Revision: D17385490

fbshipit-source-id: 4258502bbc0e3a6dd6852c8ce01ed05eee618b1a
2019-09-14 12:48:46 -07:00
Haixin Liu
1563fdb591 Add histogram observer (#23959)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23959

Add histogram observer that records the running histogram of tensor values along with min/max values.
ghstack-source-id: 90076996

Test Plan:
Added a test test_histogram_observer
buck test mode/dev caffe2/test:quantization -- 'test_histogram_observer'

buck test mode/dev caffe2/test:quantization -- 'test_observer_scriptable'

Differential Revision: D16692835

fbshipit-source-id: 0f047d3349cb9770fad4a2b6cb346c51d9e99cd4
2019-09-13 19:24:04 -07:00
James Reed
bdc656da70 TorchScript Serialization for dynamic LSTM
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26084

Test Plan: Imported from OSS

Differential Revision: D17339315

Pulled By: jamesr66a

fbshipit-source-id: 03a2674edcf779becfe3b8ec96f1bae23c74b11c
2019-09-12 11:04:47 -07:00
James Reed
83ecdf76da Revert "TorchScript Serialization for dynamic LSTM module" (#26079)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26079

This reverts commit e3039612d8.

Test Plan: Imported from OSS

Differential Revision: D17337585

Pulled By: jamesr66a

fbshipit-source-id: 4b93a4c5ca2fe491d609da889a42d22be8e52889
2019-09-11 21:23:19 -07:00
Jianyu Huang
ead14a6bd4 Use BytesIO instead of tempfile (#25976)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25976

As recommended in https://github.com/pytorch/pytorch/pull/25877/files#r322956051:

> We should move more of these toward using BytesIO. Using files in tests is generally considered bad practice because it introduces syscalls and dependencies on the execution environment, and thus can cause test flakiness/instability.
ghstack-source-id: 89929947

Test Plan: CI

Differential Revision: D17310441

fbshipit-source-id: ba97cce4224225df45ff44062f1bc8ebefb25922
2019-09-11 19:35:49 -07:00
James Reed
e3039612d8 TorchScript Serialization for dynamic LSTM module
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/25877

Test Plan: Imported from OSS

Reviewed By: jianyuh

Differential Revision: D17275746

Pulled By: jamesr66a

fbshipit-source-id: db2f38ddd99f02ccb4fb754fa1c1e6cad4425fa8
2019-09-11 19:17:25 -07:00
Lingyi Liu
62767077c3 add the tensor_observer to record the runtime tensor for quantization … (#25830)
Summary:
…accuracy analsyis
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25830

Differential Revision: D17327147

Pulled By: llyfacebook

fbshipit-source-id: 095d5537a31b8d7541081000eaeb8b8474dfb8d0
2019-09-11 13:36:28 -07:00
James Reed
79bcf6e5ba Test scripting and tracing for dynamic linear modules
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/25870

Test Plan: Imported from OSS

Differential Revision: D17275747

Pulled By: jamesr66a

fbshipit-source-id: ed8eaf7e9af3127c987e56d17d60b52d039d5ae8
2019-09-09 19:00:35 -07:00
Raghuraman Krishnamoorthi
17c1b2c715 Relax scale to prevent saturation in conv/linear. Add test to verify precision of numerics of quantized model with updated observer. This test catches errors in (#25667)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25667

Relax scale and zero-point for activations to ensure that fbgemm implementations of conv and linear do not saturate due to 16 bit intermediate accumulation.

Add test to verify precision of numerics of quantized model with updated observer. This test catches errors in
handling layouts for quantized ops in addition to saturation/quantization errors.
ghstack-source-id: 89587942

Test Plan:
buck test caffe2/test:quantized -- 'test_float_quant_compare \(test_quantized_models\.ModelNumerics\)' --print-passing-details

Passes when SQNR > 35 dB

buck test caffe2/test:quantization -- 'test_minmax_observer \(test_quantization\.ObserverTest\)' --print-passing-details
Passes with additional coverage for observer changes

Differential Revision: D17140498

fbshipit-source-id: 42c58e726bb0b0f51890590ee2525428f9a8d24e
2019-09-06 17:18:01 -07:00
Jianyu Huang
0483d537ab Add the dynamic quantized LSTM module (#25157)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25157

Add the dynamic quantized LSTM module.

TODO (separate PRs):
- Serialization.
- Bias can be Null.

ghstack-source-id: 89443731

Test Plan:
buck test mode/dev caffe2/test:quantization -- 'test_quantized_rnn \(test_quantization\.PostTrainingDynamicQuantTest\)'  --print-passing-details
```
[jianyuhuang@devvm2816.prn3.facebook.com: ~/fbsource/fbcode/caffe2/test] $ buck test mode/dev caffe2/test:quantization -- 'test_quantized_rnn \(test_q
uantization\.PostTrainingDynamicQuantTest\)'  --print-passing-details
Action graph will be rebuilt because files have been added or removed.
Parsing buck files: finished in 1.4 sec
Building: finished in 4.0 sec (100%) 8122/8122 jobs, 2 updated
  Total time: 5.5 sec
Trace available for this run at /tmp/testpilot.20190902-164918.1275502.log
TestPilot test runner for Facebook. See https://fburl.com/testpilot for details.
Testpilot build revision b61bc0e3b71033578eddfe0a28b0739bc685663f fbpkg 3b1c1aed1c534c0cb161a981eca6e2f0 at Sun Sep  1 20:58:52 2019 by twsvcscm from /usr/local/fbprojects/packages/testinfra.testpilot/690/t.par
Discovering tests
Running 1 tests
Started new test run: https://our.intern.facebook.com/intern/testinfra/testrun/2251799823877227
      ✓ caffe2/test:quantization - test_quantized_rnn (test_quantization.PostTrainingDynamicQuantTest) 1.048 1/1 (passed)
Test output:
> test_quantized_rnn (test_quantization.PostTrainingDynamicQuantTest) ... ok
>
> ----------------------------------------------------------------------
> Ran 1 test in 1.049s
>
> OK
Finished test run: https://our.intern.facebook.com/intern/testinfra/testrun/2251799823877227
Summary (total time 5.53s):
  PASS: 1
  FAIL: 0
  SKIP: 0
  FATAL: 0
  TIMEOUT: 0
  OMIT: 0
```

Differential Revision: D16955662

fbshipit-source-id: 61cf1a74913105fa02e44b3941813eabac0006b5
2019-09-03 19:18:28 -07:00
Zafar Takhirov
e44c09ecae making quant utilities inplace
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/25054

Test Plan: Imported from OSS

Differential Revision: D16974198

Pulled By: zafartahirov

fbshipit-source-id: 54befc8429990adafe746d1255d117fca5f12e11
2019-08-29 16:03:13 -07:00
Zafar Takhirov
e8acc2ebb1 Removing future imports from the test fixtures.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/25296

Test Plan: Imported from OSS

Differential Revision: D17090201

Pulled By: zafartahirov

fbshipit-source-id: 5a4f6ac0ea475b55d2c610e2f9f4f0cef8690e8f
2019-08-29 01:39:59 -07:00
Haixin Liu
06757acb30 Refactor MinMax observer (#23902)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23902

Copied from Daya's diff in pytorch/pytorch #23191

Refactor MinMax observer and create the base observer class to prepare for future observers such as histogram observer.
ghstack-source-id: 89146014

Test Plan:
Added a test test_minmax_observer

buck test mode/dev caffe2/test:quantization -- 'test_minmax_observer'

```
Running 1 tests
Started new test run: https://our.intern.facebook.com/intern/testinfra/testrun/2533274797931635
      ✓ caffe2/test:quantization - test_minmax_observer (test_quantization.ObserverTest) 0.055 1/1 (passed)
Finished test run: https://our.intern.facebook.com/intern/testinfra/testrun/2533274797931635
Summary (total time 4.26s):
  PASS: 1
  FAIL: 0
  SKIP: 0
  FATAL: 0
  TIMEOUT: 0
  OMIT: 0
```

buck test mode/dev caffe2/test:quantization -- 'test_observer_scriptable'
```
Running 1 tests
Started new test run: https://our.intern.facebook.com/intern/testinfra/testrun/5348024563344195
      ✓ caffe2/test:quantization - test_observer_scriptable (test_quantization.ObserverTest) 1.762 1/1 (passed)
Finished test run: https://our.intern.facebook.com/intern/testinfra/testrun/5348024563344195
Summary (total time 6.02s):
  PASS: 1
  FAIL: 0
  SKIP: 0
  FATAL: 0
  TIMEOUT: 0
  OMIT: 0
```

Differential Revision: D16663221

fbshipit-source-id: 3d0e1aa9e4d27808e61b10604782606de067a34a
2019-08-28 13:12:38 -07:00
Raghuraman Krishnamoorthi
9d06a984f8 Serialization for nn.quantized.functional modules (#25220)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25220

Add load_from_state_dict and save_from_state_dict for quantized functional modules
ghstack-source-id: 89070836

Test Plan: buck test mode/dev caffe2/test:quantization -- 'test_scriptability_serialization\ \(test_quantization.ScriptabilityTest\)' --print-passing-details

Differential Revision: D17065243

fbshipit-source-id: 413ce0a95d0c27fedb23894f1483e3da2f60f417
2019-08-27 18:56:10 -07:00
Raghuraman Krishnamoorthi
f622ec8084 Update mapping dictionary to support functionalmodules and pooling operations (#25216)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25216

ghstack-source-id: 89045562

Test Plan: buck test mode/dev caffe2/test:quantization -- 'test_resnet_base\ \(test_quantization.PostTrainingQuantTest\)' --print-passing-details

Differential Revision: D17065029

fbshipit-source-id: b248abf6de162f38e35e6bace17bde1be9e38c57
2019-08-26 23:00:01 -07:00
Raghuraman Krishnamoorthi
a9fdc1923b Revert D16879132: Update mapping dictionary to support functionalmodules and pooling operations
Test Plan: revert-hammer

Differential Revision:
D16879132

Original commit changeset: cd8c10182aa7

fbshipit-source-id: 9b67ccf73f43d15ef50bf0331d3df4d57835931b
2019-08-26 16:25:25 -07:00
Raghuraman Krishnamoorthi
c3c36a5b68 Revert D16923651: Serialization for nn.quantized.functional modules
Test Plan: revert-hammer

Differential Revision:
D16923651

Original commit changeset: eb1234be1941

fbshipit-source-id: c80d0b50db0f949cc293dbc2f825404bbc8cb86c
2019-08-26 15:36:21 -07:00
Raghuraman Krishnamoorthi
95a3ffc2f1 Serialization for nn.quantized.functional modules (#24924)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24924

Add load_from_state_dict and save_from_state_dict for quantized functional modules
ghstack-source-id: 89001576

Test Plan: buck test mode/dev caffe2/test:quantization -- 'test_scriptability_serialization\ \(test_quantization.ScriptabilityTest\)' --print-passing-details

Differential Revision: D16923651

fbshipit-source-id: eb1234be1941ccf268a2fc5b756540ab973f3ffb
2019-08-26 12:16:57 -07:00
Raghuraman Krishnamoorthi
794f63fe92 Update mapping dictionary to support functionalmodules and pooling operations (#24804)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24804

ghstack-source-id: 89003799

Test Plan: buck test mode/dev caffe2/test:quantization -- 'test_resnet_base\ \(test_quantization.PostTrainingQuantTest\)' --print-passing-details

Differential Revision: D16879132

fbshipit-source-id: cd8c10182aa732ddf655bcda17f72ea08033a633
2019-08-26 12:16:49 -07:00
James Reed
049284e14d Make observer scriptable
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/24996

Test Plan: Imported from OSS

Differential Revision: D16952938

Pulled By: jamesr66a

fbshipit-source-id: 3d08e0c746603d0fe090fb3dbf13c5fc9dc022f4
2019-08-22 11:28:45 -07:00
Raghuraman Krishnamoorthi
696cabae9b Baseline observer module, ensuring that (min,max) range includes zero. (#24297)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24297

ghstack-source-id: 88252409

Differential Revision: D16635637

fbshipit-source-id: fcef20b9c88b2c3bd97e311514e5b2d0339ff28a
2019-08-15 15:25:23 -07:00
Jerry Zhang
761ae8e9b6 Add intrinsic module mappings (#23753)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23753

Add intrinsic(fused) module mappings in quantize.py to enable mapping fused modules
in both QAT and post PTQ

Differential Revision: D16820749

fbshipit-source-id: 07de76a4f09b44bde8b193c103eac02c22b875b6
2019-08-15 09:37:24 -07:00
Jianyu Huang
f66c90469b Fix Lint (#24381)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24381

As pointed out in https://github.com/pytorch/pytorch/pull/24299#issuecomment-521471089, the previous PR broke the Lint.
ghstack-source-id: 88339887

Reviewed By: jamesr66a

Differential Revision: D16822443

fbshipit-source-id: 3aed5b9404b0f0fcf453c05b59189974243b0df2
2019-08-14 19:22:09 -07:00
Jianyu Huang
0f64043b49 Remove the activation observer for default_qconfig (#24299)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24299

As suggested in https://github.com/pytorch/pytorch/pull/24232, we will remove the activation observer for dynamic quantization path.
ghstack-source-id: 88287094

Differential Revision: D16798590

fbshipit-source-id: 07a245d5584b5b15c6895d9b09deef4a0605073a
2019-08-14 17:21:50 -07:00
Jianyu Huang
584c6986fd Add the type matching rule for qconfig_dict (#23212)
Summary:
We want to use the Module type as the key for the qconfig_dict for the module replacement during the quantization.

Before this Diff, to dynamic quantize the BERT model, we have to specify each layer:
```
qconfig_dict = {
    'encoder.layer.0.attention.self.query': default_qconfig,
    'encoder.layer.0.attention.self.key': default_qconfig,
    'encoder.layer.0.attention.self.value': default_qconfig,
    'encoder.layer.0.attention.output.dense': default_qconfig,
    'encoder.layer.0.intermediate.dense': default_qconfig,
    'encoder.layer.0.output.dense': default_qconfig,
    'encoder.layer.1.attention.self.query': default_qconfig,
    'encoder.layer.1.attention.self.key': default_qconfig,
    'encoder.layer.1.attention.self.value': default_qconfig,
    'encoder.layer.1.attention.output.dense': default_qconfig,
    'encoder.layer.1.intermediate.dense': default_qconfig,
    'encoder.layer.1.output.dense': default_qconfig,
   ...
}
```
After this Diff, we only need the following
```
qconfig_dict = {
     torch.nn.Linear : default_qconfig
}
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/23212
ghstack-source-id: 88287091

Reviewed By: zafartahirov

Differential Revision: D16436542

fbshipit-source-id: 11fbe68ee460560c1a7cdded63581eb7a00e5a89
2019-08-14 13:07:36 -07:00
Jianyu Huang
e94ba742b0 Dynamic Quantized Linear Module (#23128)
Summary:
- ~~Add a unit test for the Dynamic Quantized Linear operator (```torch.fbgemm_linear_quantize_weight```, ```torch.fbgemm_pack_quantized_matrix```, and ```torch.fbgemm_linear_int8_weight```) in ```test_quantized.py```.~~ Move this to D16404027 for a separate review.
- Add the Dynamic Quantized Linear module in ```torch/nn/quantized/modules/linear.py```. ~~This is in a rudimentary stage. Will add more functions later~~.
- Add the torch.quantize logic (prepare, eval, convert) for dynamic quantization.
- Add a unit test for the Dynamic Quantized Linear module  in ```test_nn_quantized.py```.
- Add a unit test for the Model-level Quantization API

Pull Request resolved: https://github.com/pytorch/pytorch/pull/23128
ghstack-source-id: 88257232

Differential Revision: D16258664

fbshipit-source-id: 4be3ac39ee27c088b341c741d3f09f51d5a23ef0
2019-08-13 21:01:23 -07:00
Zafar Takhirov
4cc16782f3 Removing the make_module script. (#23635)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23635

It appears it is the same complexity to add new modules using a base class and using a generation script.

Test Plan: Imported from OSS

Differential Revision: D16593364

Pulled By: zafartahirov

fbshipit-source-id: 852dcf41f3dfa2a89152042b8e61d0b6defa8feb
2019-08-13 09:58:28 -07:00
Daya Khudia
f510409281 Enable FBGEMM tests under UBSAN as well (#23570)
Summary:
Enabling tests under UBSAN as well
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23570

Test Plan:
buck test mode/dev caffe2/test:quantized
```
Running 29 tests
Started new test run: https://our.intern.facebook.com/intern/testinfra/testrun/3940649677415136
      ✓ caffe2/test:quantized - test_qtensor (test_quantized_tensor.TestQuantizedTensor) 0.536 1/29 (passed)
      ✓ caffe2/test:quantized - test_qtensor_per_channel_affine (test_quantized_tensor.TestQuantizedTensor) 0.453 2/29 (passed)
      ✓ caffe2/test:quantized - test_qtensor_reshape (test_quantized_tensor.TestQuantizedTensor) 0.302 3/29 (passed)
      ✓ caffe2/test:quantized - test_qadd_relu_same_qparams (test_quantized.TestQuantizedOps) 0.332 4/29 (passed)
      ✓ caffe2/test:quantized - test_qtensor_view (test_quantized_tensor.TestQuantizedTensor) 0.351 5/29 (passed)
      ✓ caffe2/test:quantized - test_qadd_relu_different_qparams (test_quantized.TestQuantizedOps) 0.348 6/29 (passed)
      ✓ caffe2/test:quantized - test_qtensor_dequantize_linear (test_quantized_tensor.TestQuantizedTensor) 0.338 7/29 (passed)
      ✓ caffe2/test:quantized - test_qtensor_copy (test_quantized_tensor.TestQuantizedTensor) 0.267 8/29 (passed)
      ✓ caffe2/test:quantized - test_qtensor_clone (test_quantized_tensor.TestQuantizedTensor) 0.330 9/29 (passed)
      ✓ caffe2/test:quantized - test_qrelu (test_quantized.TestQuantizedOps) 1.774 10/29 (passed)
      ✓ caffe2/test:quantized - test_pool_api (test_nn_quantized.ModuleAPITest) 0.418 11/29 (passed)
      ✓ caffe2/test:quantized - test_qtensor_load_save (test_quantized_tensor.TestQuantizedTensor) 0.724 12/29 (passed)
      ✓ caffe2/test:quantized - test_relu_api (test_nn_quantized.FunctionalAPITest) 1.013 13/29 (passed)
      ✓ caffe2/test:quantized - test_qtensor_quant_dequant (test_quantized_tensor.TestQuantizedTensor) 1.055 14/29 (passed)
      ✓ caffe2/test:quantized - test_qtensor_permute (test_quantized_tensor.TestQuantizedTensor) 0.696 15/29 (passed)
      ✓ caffe2/test:quantized - test_qtensor_dtypes (test_quantized_tensor.TestQuantizedTensor) 0.841 16/29 (passed)
      ✓ caffe2/test:quantized - test_quant_dequant_api (test_nn_quantized.ModuleAPITest) 0.616 17/29 (passed)
      ✓ caffe2/test:quantized - test_qtensor_creation (test_quantized_tensor.TestQuantizedTensor) 0.698 18/29 (passed)
      ✓ caffe2/test:quantized - test_qconv (test_quantized.TestQuantizedConv) 4.743 19/29 (passed)
      ✓ caffe2/test:quantized - test_cat (test_quantized.TestQuantizedOps) 6.992 20/29 (passed)
      ✓ caffe2/test:quantized - test_linear_api (test_nn_quantized.ModuleAPITest) 8.970 21/29 (passed)
      ✓ caffe2/test:quantized - test_conv_api (test_quantized_conv.QuantizedConvTest) 9.403 22/29 (passed)
      ↷ caffe2/test:quantized - test_qnnpack_linear (test_quantized.TestQNNPackOps) 0.000 23/29 (skipped)
Test output:
> Skipped: QNNPACK does not play well with UBSAN at the moment, so we skip the test if we are in a UBSAN environment.
> test_qnnpack_linear (test_quantized.TestQNNPackOps) ... skipped 'QNNPACK does not play well with UBSAN at the moment, so we skip the test if we are in a UBSAN environment.'
>
> ----------------------------------------------------------------------
> Ran 1 test in 0.000s
>
> OK (skipped=1)
      ↷ caffe2/test:quantized - test_qnnpack_relu (test_quantized.TestQNNPackOps) 0.000 24/29 (skipped)
Test output:
> Skipped: QNNPACK does not play well with UBSAN at the moment, so we skip the test if we are in a UBSAN environment.
> test_qnnpack_relu (test_quantized.TestQNNPackOps) ... skipped 'QNNPACK does not play well with UBSAN at the moment, so we skip the test if we are in a UBSAN environment.'
>
> ----------------------------------------------------------------------
> Ran 1 test in 0.000s
>
> OK (skipped=1)
      ✓ caffe2/test:quantized - test_max_pool2d (test_quantized.TestQuantizedOps) 8.453 25/29 (passed)
      ✓ caffe2/test:quantized - test_qlinear_unpack (test_quantized.TestQuantizedLinear) 0.664 26/29 (passed)
      ✓ caffe2/test:quantized - test_qconv_unpack (test_quantized.TestQuantizedConv) 2.965 27/29 (passed)
      ✓ caffe2/test:quantized - test_qlinear (test_quantized.TestQuantizedLinear) 1.915 28/29 (passed)
      ✓ caffe2/test:quantized - test_conv_api (test_nn_quantized.ModuleAPITest) 60.804 29/29 (passed)
      ✓ caffe2/test:quantized - main 0.000 (passed)
Finished test run: https://our.intern.facebook.com/intern/testinfra/testrun/3940649677415136
Summary (total time 68.66s):
  PASS: 28
  FAIL: 0
  SKIP: 2
    caffe2/test:quantized - test_qnnpack_linear (test_quantized.TestQNNPackOps)
    caffe2/test:quantized - test_qnnpack_relu (test_quantized.TestQNNPackOps)
  FATAL: 0
  TIMEOUT: 0
  OMIT: 0
```

Reviewed By: jianyuh

Differential Revision: D16569166

Pulled By: dskhudia

fbshipit-source-id: 53522b4162eb1ebb35b408a1503d9664305c85b0
2019-08-12 17:59:22 -07:00
James Reed
a35d2902ef jit.script() testing and fixes (#23891)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23891

This adds an initial set of testing coverage for quantization that checks if the modules can be scripted. Testing for tracing and serialization is forthcoming

Test Plan: Imported from OSS

Differential Revision: D16698045

Pulled By: jamesr66a

fbshipit-source-id: 96d80d938b816220af72359165a7b96d998a30c9
2019-08-08 12:06:18 -07:00
James Reed
40f0b1c844 Enable OSS quantization tests (#23858)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23858

Pull Request resolved: https://github.com/pytorch/pytorch/pull/23718

Changes:

- Enable tests for quantization test files in `run_tests.py`
- Remove `__future__` imports from `torch/nn/qat/modules/__init__.py`, since `unicode_literals` messes up imports on python2 because the elements in `__all__` will be Unicode and not string
- Skip PostTrainingQuantTests if the build doesn't have FBGEMM (only a small subset of targets in tests) or if testing under UBSAN (the suppression file doesn't seem to work)

Test Plan: Imported from OSS

Reviewed By: ZolotukhinM

Differential Revision: D16639467

Pulled By: jamesr66a

fbshipit-source-id: 532766797c216976dd7e07d751f768ff8e0fc207
2019-08-06 11:20:30 -07:00
Jerry Zhang
89956374c3 Remove qconfig_dict from API (#23465)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23465

We decided not to allow user to use qconfig_dict to do quantization
since that API is not robust.

Differential Revision: D16611504

fbshipit-source-id: b0d1d311b32c990a165c480f50e9ce3d68b785b5
2019-08-02 10:28:48 -07:00
Zafar Takhirov
058645acb1 Fusion and _intrinsic modules (#23003)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23003

torch.quantization.fuse_module and torch.nn._intrinsic convRelu and LinearRelu

Fusion function to combine specific modules: (conv,bn) and  (conv,bn,relu).
In all cases, replace modules in place. The first module is replaced with the _intrinsic fused module and the remaining modules are replaced by nn.Identity.
Support both training and eval. For training, the modules are "fused" with a sequential container. This is to allow for further module swaps for quantization aware training.
Also add: torch.nn._intrinsic for convRelu and LinearRelu.

TODO: Add tests for _intrinsic modules.

Conv BN fusion code is based on DsKhudia's implementation

Differential Revision: D16199720

fbshipit-source-id: 95fb9ffe72b361d280313b2ec57de2acd4f9dda2
2019-07-23 14:54:19 -07:00
Jerry Zhang
77353636de Conv module (#23084)
Summary:
Added Conv module for qat

Pull Request resolved: https://github.com/pytorch/pytorch/pull/23084
ghstack-source-id: 86862445

Differential Revision: D16379417

fbshipit-source-id: 742cc8b8e0f132070ca4943a1c2e3db60c2b5bdc
2019-07-19 18:49:52 -07:00
Jerry Zhang
7cc029cb75 Quantization aware training in eager mode (#23082)
Summary:
Add support for quantization aware training in eager mode

Modifications to Post training flow:
## Prepare
* Fusion: e.g. (Conv, Bn) → ConvBn (float)
* Swapping: To insert fake_quant to weight, we need to swap the float modules that has weight with different qat modules, e.g. Conv → torch.nn.qat.Conv , ConvBn → torch.nn._intrinsic.qat.ConvBn
```
    * previously we were thinking about modify the weight in forward_pre hook and change it back in forward_hook:
        * def forward_pre_hook(self, input):
                self.float_weight = self.weight
                self.weight = self.fake_quantize(self.float_weight)

            def forward_hook(self, input):
                self.weight = self.float_weight
```

* Assignments to self.weight are needed because we can’t change forward function and in forward function they are using self.weight.
* But we will need to keep two copies of weight in this case, so it’s probably better to just swap the module
* So we want to just swap Conv to torch.nn.qat.Conv and Linear to torch.nn.qat.Linear
* qat modules will have fake_quant for output and weights inserted in forward function

## Convert
* flow should be identical to ptq, but the swapping dictionary is slightly different since modules are changed in prepare step.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/23082
ghstack-source-id: 86824650

Differential Revision: D16379374

fbshipit-source-id: 7d16d1acd87025065a24942ff92abf18e9fc8070
2019-07-19 14:57:25 -07:00
Soumith Chintala
84c2c89e2c Revert D16199356: [qat] Quantization aware training in eager mode
Differential Revision:
D16199356

Original commit changeset: 62aeaf47c12c

fbshipit-source-id: d06a96b0a617ae38029ffb246173ec065454b666
2019-07-19 03:18:48 -07:00
Soumith Chintala
f19aa12ae5 Revert D16274792: [qat] Conv module
Differential Revision:
D16274792

Original commit changeset: 1da10194123b

fbshipit-source-id: 71b34774b463f2350289bd39b8cfd798e095ffa5
2019-07-19 03:18:45 -07:00
Jerry Zhang
12d9d768b8 Conv module (#22899)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22899

Added Conv module for qat

Reviewed By: zafartahirov

Differential Revision: D16274792

fbshipit-source-id: 1da10194123b2759a6a35c60d1c2d2c0b569ccdc
2019-07-18 18:58:07 -07:00
Jerry Zhang
65ef671d11 Quantization aware training in eager mode (#22732)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22732

Add support for quantization aware training in eager mode

Modifications to Post training flow:
## Prepare
* Fusion: e.g. (Conv, Bn) → ConvBn (float)
* Swapping: To insert fake_quant to weight, we need to swap the float modules that has weight with different qat modules, e.g. Conv → torch.nn.qat.Conv , ConvBn → torch.nn._intrinsic.qat.ConvBn
```
    * previously we were thinking about modify the weight in forward_pre hook and change it back in forward_hook:
        * def forward_pre_hook(self, input):
                self.float_weight = self.weight
                self.weight = self.fake_quantize(self.float_weight)

            def forward_hook(self, input):
                self.weight = self.float_weight
```

* Assignments to self.weight are needed because we can’t change forward function and in forward function they are using self.weight.
* But we will need to keep two copies of weight in this case, so it’s probably better to just swap the module
* So we want to just swap Conv to torch.nn.qat.Conv and Linear to torch.nn.qat.Linear
* qat modules will have fake_quant for output and weights inserted in forward function

## Convert
* flow should be identical to ptq, but the swapping dictionary is slightly different since modules are changed in prepare step.

Reviewed By: zafartahirov

Differential Revision: D16199356

fbshipit-source-id: 62aeaf47c12c62a87d9cac208f25f7592e245d6c
2019-07-18 18:58:03 -07:00
Lucas Kabela
86fc417147 Move Quantization Models to common_quantization (#22706)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22706

Moved the models used for quantization test from the test_quantization.py file to common_quantization.py

Reviewed By: jerryzh168

Differential Revision: D16189865

fbshipit-source-id: 409b43454b6b3fe278ac16b1affb9085d6ed6835
2019-07-10 15:05:49 -07:00
Lucas Kabela
3e3e6ee335 Add common_quantized test case utilities (#22694)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22694

Move quantization and quantized utility functions for testing to common_quantized.py and common_quantization.py.  Addditionally, add a quantized test case base class which contains common methods for checking the results of quantization on modules.  As a consequence of the move, fixed the import at the top of test_quantized.py, and test_quantization to use the new utility

Reviewed By: jerryzh168

Differential Revision: D16172012

fbshipit-source-id: 329166af5555fc829f26bf1383d682c25c01a7d9
2019-07-10 12:23:36 -07:00
Jerry Zhang
164388150a fix lint (#22654)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22654

att

Reviewed By: bddppq

Differential Revision: D16168219

fbshipit-source-id: db1a5e2161e7be70b2f6e6b4beaa27ea91f853f2
2019-07-09 16:41:26 -07:00
Jerry Zhang
5040d52a5a torch.quantization conversion utilities, observers for eager mode quantization (#22010)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22010

torch.quantization module with observers and conversion routines

Reviewed By: zafartahirov

Differential Revision: D15554183

fbshipit-source-id: 05a3fabe28dd701978b8ecebf5bfc3a4c044ba5c
2019-07-09 10:51:38 -07:00