Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26992
Run the same tests for both the FBGEMM and QNNPACK backends.
Check that each backend is supported (using supported_qengines) before running a test on it.
Test Plan:
python test/test_quantized.py TestQuantizedLinear
python test/test_quantized.py TestQuantizedConv
python test/test_quantized_models.py
python test/test_quantized_nn_mods.py
Imported from OSS
Differential Revision: D17689171
fbshipit-source-id: e11c0a5e41f5f4e6836a614a5b61e4db3c5e384b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26457
Enhance the fuse module to support sequentials; the fuse list can now name modules the same way state dict keys do.
Also add support for Conv-ReLU and Linear-ReLU fusion.
Also support in-place and out-of-place fusion of models.
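A minimal sketch of how state-dict-style names could address modules nested inside sequentials, with both in-place and out-of-place fusion. Everything here is hypothetical and written in plain Python: the `Module` class, `get_module`, `set_module`, and `fuse` are illustrative helpers, not the functions added in this PR, and replacing the trailing modules with an identity placeholder is an assumption of the sketch.

```python
import copy


class Module:
    """Hypothetical minimal module: a kind label plus named children."""

    def __init__(self, kind, children=None):
        self.kind = kind
        self.children = dict(children or {})


def get_module(root, dotted_name):
    """Resolve a state-dict-style name such as 'features.0' to a submodule."""
    node = root
    for part in dotted_name.split("."):
        node = node.children[part]
    return node


def set_module(root, dotted_name, new_module):
    """Replace the submodule at dotted_name with new_module."""
    *parents, last = dotted_name.split(".")
    node = root
    for part in parents:
        node = node.children[part]
    node.children[last] = new_module


def fuse(root, names, inplace=False):
    """Fuse the named modules into the first slot; the rest become Identity."""
    if not inplace:
        root = copy.deepcopy(root)  # out-of-place: leave the caller's model alone
    kinds = [get_module(root, name).kind for name in names]
    set_module(root, names[0], Module("+".join(kinds)))
    for name in names[1:]:
        set_module(root, name, Module("Identity"))
    return root


model = Module("Net", {
    "features": Module("Sequential", {"0": Module("Conv2d"), "1": Module("ReLU")}),
    "fc": Module("Linear"),
})
fused = fuse(model, ["features.0", "features.1"])
print(get_module(fused, "features.0").kind, get_module(model, "features.0").kind)
```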
ghstack-source-id: 91076386
Test Plan:
buck test caffe2/test:quantization -- 'test_fusion_sequential_model_train \(test_quantization\.FusionTest\)' --print-passing-details
buck test caffe2/test:quantization -- 'test_fusion_sequential_model_eval \(test_quantization\.FusionTest\)' --print-passing-details
Differential Revision: D17466382
fbshipit-source-id: 0a548f8f4c366f3ecc59db693bac725ccd62328e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26516
Integrate per-channel support into conv and linear modules.
ghstack-source-id: 90982010
Test Plan:
The following tests pass:
buck test caffe2/test:quantized -- 'test_linear_api \(test_quantized_nn_mods\.ModuleAPITest\)' --print-passing-details
buck test caffe2/test:quantized -- 'test_conv_api \(test_quantized_nn_mods\.ModuleAPITest\)' --print-passing-details
buck test caffe2/test:quantized -- 'test_float_quant_compare_per_channel \(test_quantized_models\.ModelNumerics\)' --print-passing-details
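As a rough illustration of why per-channel quantization of conv and linear weights can help, the sketch below compares one shared scale against one scale per output channel. It assumes symmetric int8 quantization and uses made-up weights; the helper names are hypothetical and this is not the implementation integrated by the PR.

```python
def symmetric_scale(values, qmax=127):
    """Scale that maps the largest magnitude in values onto qmax."""
    largest = max(abs(v) for v in values)
    return largest / qmax if largest > 0 else 1.0


def quantize_dequantize(values, scale, qmax=127):
    """Round to the int8 grid and map back to floats."""
    out = []
    for v in values:
        q = max(-qmax, min(qmax, round(v / scale)))
        out.append(q * scale)
    return out


def max_abs_error(reference, approx):
    """Largest elementwise difference between two equal-length lists."""
    return max(abs(r - a) for r, a in zip(reference, approx))


# Two output channels with very different weight ranges (illustrative values).
weight = [
    [0.010, -0.020, 0.015],  # small-range channel
    [1.500, -2.000, 0.750],  # large-range channel
]

# Per-tensor: one scale shared by every channel.
tensor_scale = symmetric_scale([v for row in weight for v in row])
per_tensor_err = [
    max_abs_error(row, quantize_dequantize(row, tensor_scale)) for row in weight
]

# Per-channel: one scale per output channel (row).
per_channel_err = [
    max_abs_error(row, quantize_dequantize(row, symmetric_scale(row)))
    for row in weight
]

print(per_tensor_err, per_channel_err)
```

With a shared scale the small-range channel is rounded on a grid sized for the large-range channel, so its error is much larger than with its own scale.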
Differential Revision: D17342622
fbshipit-source-id: f0d618928e3d9348672c589a6b7a47049c372a2e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26782
At a minimum, the inplace behavior should be consistent across the top-level APIs and prepare/convert/etc.
The default is inplace=False, but the top-level APIs take care to make fewer copies.
Also rename always-in-place methods such as add_observer to end with an underscore.
One fix to MinMaxObserver was prompted by deepcopy revealing that we were accidentally keeping autograd state around.
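The convention described above might look roughly like this. The sketch is plain Python with hypothetical names (`Model`, `add_observer_`, `prepare`, `quantize`); it only illustrates the inplace=False default, the trailing-underscore naming for mutating helpers, and a top-level call that copies once and then works in place. It is not the PyTorch implementation.

```python
import copy


class Model:
    """Hypothetical model that just records which observers are attached."""

    def __init__(self):
        self.observers = []


def add_observer_(model):
    """Always mutates its argument; the trailing underscore signals that."""
    model.observers.append("MinMaxObserver")


def prepare(model, inplace=False):
    """Return a model with observers attached, copying unless inplace=True."""
    if not inplace:
        model = copy.deepcopy(model)
    add_observer_(model)
    return model


def quantize(model):
    """Top-level entry point: copy once, then run each stage in place."""
    model = copy.deepcopy(model)
    prepare(model, inplace=True)
    # Calibration and conversion would follow here, also in place.
    return model


original = Model()
prepared = prepare(original)
print(original.observers, prepared.observers)
```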
Test Plan: Imported from OSS
Differential Revision: D17595956
Pulled By: dzhulgakov
fbshipit-source-id: 801f9f5536b553f24c7a660064dd6fce685edd65
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25667
Relax the scale and zero-point for activations so that the FBGEMM implementations of conv and linear do not saturate due to 16-bit intermediate accumulation.
Add a test to verify the numerical precision of the quantized model with the updated observer. This test catches errors in handling layouts for quantized ops in addition to saturation/quantization errors.
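The saturation concern can be checked with simple arithmetic. The sketch below assumes that the kernel adds two adjacent uint8 x int8 products into a signed 16-bit value, and that the relaxation amounts to restricting activations to 7 bits; both are assumptions made for illustration, not details stated in this summary.

```python
INT16_MAX = 2**15 - 1  # 32767
INT16_MIN = -(2**15)   # -32768


def saturate_int16(value):
    """Clamp a value to the signed 16-bit range."""
    return max(INT16_MIN, min(INT16_MAX, value))


def worst_case_pair_sum(activation_max, weight_max=127):
    """Largest possible sum of two adjacent activation * weight products."""
    return 2 * activation_max * weight_max


full_range = worst_case_pair_sum(255)     # activations use all 8 bits
reduced_range = worst_case_pair_sum(127)  # activations restricted to 7 bits

print(full_range, saturate_int16(full_range))
print(reduced_range, saturate_int16(reduced_range))
```

Under these assumptions the full 8-bit activation range can overflow the 16-bit intermediate, while the 7-bit range always fits.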
ghstack-source-id: 89587942
Test Plan:
buck test caffe2/test:quantized -- 'test_float_quant_compare \(test_quantized_models\.ModelNumerics\)' --print-passing-details
Passes when SQNR > 35 dB
buck test caffe2/test:quantization -- 'test_minmax_observer \(test_quantization\.ObserverTest\)' --print-passing-details
Passes with additional coverage for observer changes
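For reference, an SQNR threshold like the 35 dB mentioned above can be evaluated as follows. This sketch uses the standard definition (signal power over quantization-noise power, in decibels); how the test itself computes SQNR is not shown here, and the sample values are made up.

```python
import math


def sqnr_db(reference, approximation):
    """Signal-to-quantization-noise ratio in decibels."""
    signal = sum(r * r for r in reference)
    noise = sum((r - a) ** 2 for r, a in zip(reference, approximation))
    if noise == 0:
        return math.inf  # identical outputs: no quantization noise
    return 10.0 * math.log10(signal / noise)


# Illustrative float outputs and their quantized-model counterparts.
float_output = [0.5, -1.25, 2.0, 0.75]
quantized_output = [0.51, -1.24, 1.99, 0.74]

value = sqnr_db(float_output, quantized_output)
print(value > 35.0)
```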
Differential Revision: D17140498
fbshipit-source-id: 42c58e726bb0b0f51890590ee2525428f9a8d24e