Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40039
Similar to static quant, break the dynamic quant tests up into op-level tests and tests for the JIT passes.
Test Plan:
python test/test_quantization.py TestQuantizeScriptPTDQOps
python test/test_quantization.py TestDynamicQuantizeScriptJitPasses
Imported from OSS
Differential Revision: D22071278
fbshipit-source-id: 54292addcfbc00f7af960fb333921db2ff9fda04
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37032
DataParallel requires all params and buffers of child modules to be updated
in place because of how it implements model replication during the
forward pass (see https://github.com/pytorch/pytorch/pull/12671 for
context). Any params or buffers not updated in place are lost and not
propagated back to the master.
This diff updates some quantized modules (TBD: all quantized modules? determine a good cut
point) to do their parameter updates in-place. This will enable static
quant and QAT to work correctly with DataParallel.
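The in-place requirement can be illustrated with a minimal plain-Python sketch (not the PyTorch API): replication hands the replica a reference to the master's buffer, so in-place writes stay visible through that reference, while rebinding the attribute silently orphans it.

```python
class Module:
    """Stand-in for an nn.Module with one registered buffer."""
    def __init__(self):
        self.scale = [1.0]

master = Module()
replica_buffer = master.scale     # replication keeps a reference to the buffer

master.scale[0] = 0.5             # in-place update: visible through the reference
assert replica_buffer[0] == 0.5

master.scale = [0.25]             # rebinding: the old reference is now stale
assert replica_buffer[0] == 0.5   # the "replica" never sees the new value
```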
TODO: https://github.com/pytorch/pytorch/pull/32684 needs to land before we can fix the graph mode test failures on this PR.
Test Plan:
script failed before and passes after the diff:
https://gist.github.com/vkuzo/78b06c01f23f98ee2aaaeb37e55f8d40
TODO before land: add integration testing
Imported from OSS
Differential Revision: D21206454
fbshipit-source-id: df6b4b04d0ae0f7ef582c82d81418163019e96f7
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37366
- we can put both the fake quant module and observer module tests in test_workflow_module.py
- added test_quantized_functional.py
- moved tests in test_numerics.py to test_quantize.py and removed test_numerics.py
Test Plan:
python test/test_quantization.py
Imported from OSS
Differential Revision: D21282198
fbshipit-source-id: 60107cee7d1ed2cd14a45650e91ec28b8a262c52
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35586
This pass fuses the choose_qparams-quant-dequant sequence.
Fusion for the weight tensor is the same as in static quant.
Test Plan:
python test/test_quantize_script.py
Imported from OSS
Differential Revision: D20755680
fbshipit-source-id: b7443770642b6e6fa0fa9da8a44637e9b2d4df70
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35455
In graph mode we need to observe the activation tensor for dynamic quantization. This observer should behave the same way as the quantization functions called inside the dynamic operator.
Currently qlinear_dynamic calls quant_utils::ChooseQuantizationParams, which has its own logic for calculating scale and zero_point.
We mimic those calculations in the new observer.
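A simplified pure-Python sketch of that math (loosely following quant_utils::ChooseQuantizationParams for affine quint8; the exact rounding and edge-case handling in fbgemm may differ):

```python
def choose_qparams(x_min, x_max, qmin=0, qmax=255):
    # The quantized range must contain zero so real 0.0 is exactly representable.
    x_min = min(x_min, 0.0)
    x_max = max(x_max, 0.0)
    scale = (x_max - x_min) / (qmax - qmin)
    if scale == 0.0:                     # degenerate all-zero input
        scale = 0.1
    # Nudge the zero point to an integer inside [qmin, qmax].
    zero_point = qmin - round(x_min / scale)
    zero_point = max(qmin, min(qmax, int(zero_point)))
    return scale, zero_point

scale, zp = choose_qparams(-1.0, 3.0)    # range [-1, 3] over 256 bins
```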
Test Plan:
python test/test_quantization.py ObserverTest
Imported from OSS
Differential Revision: D20664586
fbshipit-source-id: e987ea71fff777c21e00c498504e6586e92568a2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35265
In graph mode we need to observe the activation tensor for dynamic quantization. This observer should behave the same way as the quantization functions called inside the dynamic operator.
Currently qlinear_dynamic calls quant_utils::ChooseQuantizationParams, which has its own logic for calculating scale and zero_point.
We mimic those calculations in the new observer.
Test Plan:
python test/test_quantization.py ObserverTest
Imported from OSS
Differential Revision: D20630988
fbshipit-source-id: 7e7aca77590f965dcb423a705e68d030aaf98550
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33927
Test Plan:
test will be added in later PRs
Imported from OSS
Differential Revision: D20354879
fbshipit-source-id: 03976f4b86c46dbdc4e45764a1e72f1a3855a404
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33852
This fixes an issue for QAT models. During eval, if we call `prepare_qat` and `convert` before calling `load_state_dict`, it throws an error because the weight info (number of channels) is not updated in the observer module.
This is not an issue for the per-tensor case.
Fixes issue #33830
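The shape dependence can be sketched in plain Python (hypothetical names, not the real observer class): a per-channel observer only learns its buffer length from data, so loading state into a never-calibrated observer has to size the buffers from the incoming state.

```python
class PerChannelObserver:
    """Toy per-channel min/max observer; buffers are sized lazily."""
    def __init__(self):
        self.min_vals = []        # empty until the first observation
        self.max_vals = []

    def observe(self, rows):      # rows: one list of values per channel
        self.min_vals = [min(r) for r in rows]
        self.max_vals = [max(r) for r in rows]

    def load_state(self, state):
        # The fix: adopt the incoming shape rather than demanding a match
        # against buffers that were never sized by a forward pass.
        self.min_vals = list(state["min_vals"])
        self.max_vals = list(state["max_vals"])

obs = PerChannelObserver()        # never calibrated: buffers are empty
obs.load_state({"min_vals": [-1.0, -2.0], "max_vals": [1.0, 2.0]})
```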
Test Plan:
python test/test_quantization.py EagerModePostTrainingQuantTest.test_eval_after_train
python test/test_quantization.py EagerModeQuantizationAwareTrainingTest.test_eval_after_train
Imported from OSS
Differential Revision: D20212996
fbshipit-source-id: a04af8fe4df2e555270ae4d6693f5777d86f8a46
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32757
This PR updates the main quantize_dynamic API to use the QNNPACK backend on mobile.
Test Plan:
python test/test_quantization.py PostTrainingDynamicQuantTest.test_quantized_rnn
Imported from OSS
Differential Revision: D19632220
fbshipit-source-id: b4c51485c281d088524101b97c84dd806438b597
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30445
Create distributed and rpc directories under caffe2/test for better management
of unit tests.
Differential Revision: D18702786
fbshipit-source-id: e9daeed0cfb846ef68806f6decfcb57c0e0e3606
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31970
Now that a ClassType can be shared among different module instances, we
preserve that sharing in clone as well: if the original module has a shared
ClassType, we clone the ClassType once and share it between the cloned
module instances.
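The sharing-preserving clone can be sketched in plain Python (illustrative names, not the JIT implementation): a memo table maps each original type to its single clone, so instances that shared a type before cloning share the one cloned type afterwards.

```python
class ClassType:
    """Stand-in for a JIT ClassType."""
    def __init__(self, name):
        self.name = name

def clone_type(t, memo):
    # Clone each distinct type exactly once; reuse the clone afterwards.
    if id(t) not in memo:
        memo[id(t)] = ClassType(t.name)
    return memo[id(t)]

def clone_types(types):
    memo = {}
    return [clone_type(t, memo) for t in types]

shared = ClassType("Sub")
other = ClassType("Other")
cloned = clone_types([shared, shared, other])
# cloned[0] is cloned[1]: the sharing survives; both are fresh clones.
```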
Test Plan:
build/test/test_jit
Imported from OSS
Differential Revision: D19406251
fbshipit-source-id: 2881c695f6e718e5432040a3817cf187a62017bf
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30892
Fixes all outstanding lints and actually installs a properly configured
flake8
Test Plan: Imported from OSS
Differential Revision: D18862825
Pulled By: suo
fbshipit-source-id: 08e9083338a7309272e17bb803feaa42e348aa85
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30890
We've received way too many complaints about this functionality making tests flaky, and it's not providing value to us anyway. Let's kill deadline testing.
Test Plan: Imported from OSS
Differential Revision: D18857597
Pulled By: jamesr66a
fbshipit-source-id: 67e3412795ef2fb7b7ee896169651084e434d2f6
Summary:
In this PR, we enhance graph-mode quantization for aten::_convolution, which can be generated from the tracing path.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30245
Differential Revision: D18671597
Pulled By: lly-zero-one
fbshipit-source-id: 78a2470fbb0fe0def55d63c6bda7cbb5c89f7848
Summary:
This PR enables per-channel (row-wise) dynamic quantization for the linear operator. Given that we have seen some accuracy drop with per-tensor quantization, we expect per-channel to help improve accuracy.
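The row-wise idea can be sketched in pure Python (a simplified symmetric-int8 illustration, not the fbgemm implementation): each output row of the weight gets its own scale, so one large-magnitude row no longer inflates the quantization error of every other row.

```python
def per_row_qparams(weight_rows, qmax=127):
    """One (scale, zero_point) pair per output row, symmetric int8."""
    params = []
    for row in weight_rows:
        amax = max(abs(v) for v in row) or 1.0   # guard against a zero scale
        scale = amax / qmax
        params.append((scale, 0))                # symmetric: zero_point is 0
    return params

# Rows with very different magnitudes each get an appropriate scale.
w = [[0.5, -1.0], [0.01, 0.02]]
params = per_row_qparams(w)
```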
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30122
Differential Revision: D18630541
Pulled By: lly-zero-one
fbshipit-source-id: d52685deec5e7de46cd686ae649a8c8765b9cacf
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29331
Closes #27954
This fixes the hard-coding of packed parameter values for the dynamic quantized LSTM by orchestrating the following dance:
1) Each variadic parameter on the module has its own Module. That Module defines the `__getstate__` and `__setstate__` methods s.t. packed weights are properly re-packed on model load.
2) Each of these modules is wrapped into a `torch.nn.ModuleList`, s.t. the parameters appear as attributes in the hierarchy. Then, `gatherParametersAndBuffers` (9c43b16df9/torch/csrc/jit/tracer.cpp (L285)) can see these parameters and create a `Value*` for them in the traced graph.
3) In forward, we need to convert from ModuleList -> Module -> Parameter to a simple TensorList of the parameters. We just use a loop here. In tracing, we simply record a `ListConstruct` with each of the proper parameter values. In scripting, the `ModuleList` is const, so it can be unrolled into the graph and a subsequent `ListConstruct` does its business.
The `forward` of the traced LSTM before and after this change are as follows:
Before
```
def forward(self,
input: Tensor,
argument_2: Tuple[Tensor, Tensor]) -> Tuple[Tensor, Tuple[Tensor, Tensor]]:
hx, hx0, = argument_2
_0, _1, _2 = torch.quantized_lstm(input, [hx, hx0], [CONSTANTS.c0, CONSTANTS.c1], True, 1, 0., True, False, False, dtype=12, use_dynamic=True)
return (_0, (_1, _2))
```
After
```
def forward(self,
input: Tensor,
argument_2: Tuple[Tensor, Tensor]) -> Tuple[Tensor, Tuple[Tensor, Tensor]]:
_0 = self.cell._all_weight_values
_1 = getattr(_0, "0").param
_2 = getattr(_0, "1").param
hx, hx0, = argument_2
_3, _4, _5 = torch.quantized_lstm(input, [hx, hx0], [_1, _2], True, 1, 0., True, False, False, dtype=12, use_dynamic=True)
return (_3, (_4, _5))
```
Test Plan: Imported from OSS
Differential Revision: D18374904
Pulled By: jamesr66a
fbshipit-source-id: f1a9b58998bc365b9baad38c21fd4bb510dd639c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29331
Closes #27954
This fixes the hard-coding of packed parameter values for the dynamic quantized LSTM by orchestrating the following dance:
1) Each variadic parameter on the module has its own Module. That Module defines the `__getstate__` and `__setstate__` methods s.t. packed weights are properly re-packed on model load.
2) Each of these modules is wrapped into a `torch.nn.ModuleList`, s.t. the parameters appear as attributes in the hierarchy. Then, `gatherParametersAndBuffers` (9c43b16df9/torch/csrc/jit/tracer.cpp (L285)) can see these parameters and create a `Value*` for them in the traced graph.
3) In forward, we need to convert from ModuleList -> Module -> Parameter to a simple TensorList of the parameters. We just use a loop here. In tracing, we simply record a `ListConstruct` with each of the proper parameter values. In scripting, the `ModuleList` is const, so it can be unrolled into the graph and a subsequent `ListConstruct` does its business.
The `forward` of the traced LSTM before and after this change are as follows:
Before
```
def forward(self,
input: Tensor,
argument_2: Tuple[Tensor, Tensor]) -> Tuple[Tensor, Tuple[Tensor, Tensor]]:
hx, hx0, = argument_2
_0, _1, _2 = torch.quantized_lstm(input, [hx, hx0], [CONSTANTS.c0, CONSTANTS.c1], True, 1, 0., True, False, False, dtype=12, use_dynamic=True)
return (_0, (_1, _2))
```
After
```
def forward(self,
input: Tensor,
argument_2: Tuple[Tensor, Tensor]) -> Tuple[Tensor, Tuple[Tensor, Tensor]]:
_0 = self.cell._all_weight_values
_1 = getattr(_0, "0").param
_2 = getattr(_0, "1").param
hx, hx0, = argument_2
_3, _4, _5 = torch.quantized_lstm(input, [hx, hx0], [_1, _2], True, 1, 0., True, False, False, dtype=12, use_dynamic=True)
return (_3, (_4, _5))
```
Test Plan: Imported from OSS
Differential Revision: D18359880
Pulled By: jamesr66a
fbshipit-source-id: 0ff2cad294a1871123015dfc704eaf73a7ac1d9e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27347
It's already done in the op; we don't need to permute again.
Test Plan:
test_jit.py
we'll test in e2e tests
Imported from OSS
Differential Revision: D18182919
fbshipit-source-id: 04dd2a19a719828fbc7b62e451b81752187e0fcb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27269
Remove `test_quantizer.py`; add a rewritten version of one of the tests from `test_quantizer`
to `test_quantization.py`.
The conv test is removed for now since the conv pattern is still broken; we'll add another test
later.
ghstack-source-id: 92869823
Test Plan:
python test/test_quantization.py
Imported from OSS
Differential Revision: D18182916
fbshipit-source-id: 325b5d8e877228d6a513e3ddf52c974479250d42
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27396
Add an observer that estimates moving averages of the per-batch min and max values. It is better suited for quantization-aware training than the minmax observers, which track extremal values across batches.
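The update rule can be sketched in pure Python (a simplified illustration; the real observer's name is MovingAverageMinMaxObserver, but its exact initialization details may differ): an exponential moving average of the per-batch extremes, so a single outlier batch shifts the range only slightly.

```python
def update(state, batch, c=0.01):
    """EMA of per-batch min/max; c is the averaging constant."""
    b_min, b_max = min(batch), max(batch)
    if state is None:                       # first batch initializes the range
        return (b_min, b_max)
    mn, mx = state
    return (mn + c * (b_min - mn), mx + c * (b_max - mx))

state = None
state = update(state, [-1.0, 2.0])          # initialized to (-1.0, 2.0)
state = update(state, [-100.0, 100.0])      # outlier batch moves the range only slightly
```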
ghstack-source-id: 91369018
Test Plan:
buck test caffe2/test:quantization -- 'test_per_tensor_observers \(test_quantization\.ObserverTest\)' --print-passing-details
buck test caffe2/test:quantization -- 'test_per_channel_observers \(test_quantization\.ObserverTest\)' --print-passing-details
Differential Revision: D17727213
fbshipit-source-id: 024a890bf3dd0bf269d8bfe61f19871d027326f0