pytorch/torch/quantization
Supriya Rao 6f63126b5c [quant][fx] Add pass in convert to fold quant-dequant sequence (#54860)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54860

Currently we insert a quantize_per_tensor op when we encounter a quantizable input,
so if that input has multiple uses and not all of them are quantizable, we also need
to add a dequantize op before each non-quantizable use.

In this pass, for a quantize_per_tensor - dequantize sequence, we fold the pair away,
since back-to-back quantize and dequantize is a no-op with respect to the original
floating-point value.
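
For illustration, a minimal sketch of the fold (not the exact pass added here),
assuming quantize_per_tensor appears as a call_function node and dequantize as a
call_method node, as in FX convert output:

import torch
import torch.fx as fx

def fold_quant_dequant(gm: fx.GraphModule) -> fx.GraphModule:
    # Sketch only: reroute dequantize(quantize_per_tensor(x, ...)) to x.
    for node in list(gm.graph.nodes):
        if node.op == "call_method" and node.target == "dequantize":
            prev = node.args[0]
            if (isinstance(prev, fx.Node)
                    and prev.op == "call_function"
                    and prev.target == torch.quantize_per_tensor):
                # Back-to-back quantize/dequantize is a no-op w.r.t. the
                # float value, so users can consume the float input directly.
                node.replace_all_uses_with(prev.args[0])
                gm.graph.erase_node(node)
                if len(prev.users) == 0:
                    # The quantize has no remaining consumers; drop it too.
                    gm.graph.erase_node(prev)
    gm.recompile()
    return gm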

[internal only][pyper]

Before this change we had redundant dequantize nodes in the graph
Example 1x inline_cvr graph https://www.internalfb.com/intern/everpaste/?handle=GODBxAlUMzGHD6MSACpHKKu9qjorbsIXAAAz
 FC layers -> 37
 quantize_per_tensor -> 30
 dequantize -> 49

After this change
https://www.internalfb.com/intern/everpaste/?handle=GAl0uQnOlDNmpLoSAB-GZqRxu9wMbsIXAAAz
 FC layers -> 37
 quantize_per_tensor -> 30
 dequantize -> 39

We remove 10 extra dequantize nodes from the graph (49 -> 39).

Test Plan:
python test/test_quantization.py test_fold_quant_dequant

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D27390506

fbshipit-source-id: 56e6fb8496171246eccf4bd45eb8bebd87fcb740
2021-03-30 08:40:24 -07:00
fx [quant][fx] Add pass in convert to fold quant-dequant sequence (#54860) 2021-03-30 08:40:24 -07:00
ns ns for fx: add weight matching for linear fp16 emulation (#54257) 2021-03-25 22:35:38 -07:00
__init__.py [quant][fix] Fix quant type classification for float_qparam qconfig (#48069) 2020-11-18 18:22:08 -08:00
_correct_bias.py Remove py2 compatible future imports (#44735) 2020-09-16 12:55:57 -07:00
_equalize.py Fix type annotation errors in torch.functional (#43446) 2020-08-26 08:27:59 -07:00
_learnable_fake_quantize.py mem-efficient learnable fake quantization (#49315) 2021-02-03 18:57:47 -08:00
_numeric_suite_fx.py compare_model_outputs_fx API implementation (#49266) 2021-02-02 10:43:25 -08:00
_numeric_suite.py ns_eager: rename Logger I/O var names to logger_cls (#51359) 2021-02-09 22:30:44 -08:00
fake_quantize.py memory efficient per-channel fq: use it everywhere, delete old version (#51265) 2021-01-28 19:42:25 -08:00
fuse_modules.py quantization: Linear + BatchNorm1d fusion (#50748) 2021-01-20 12:59:02 -08:00
fuser_method_mappings.py quantization: Linear + BatchNorm1d fusion (#50748) 2021-01-20 12:59:02 -08:00
observer.py update HistogramObserver to be scriptable (#51081) 2021-01-27 07:27:03 -08:00
qconfig.py [quant][graphmode][fx] Add reference option support for linear_static_fp16 (#52650) 2021-02-27 08:25:44 -08:00
quant_type.py [quant][graphmode][fx] custom_module support static/dynamic/weight_only quant (#46786) 2020-10-27 21:41:33 -07:00
quantization_mappings.py [quantization] Add some support for 3d operations (#50003) 2021-03-10 16:40:35 -08:00
quantize_fx.py [quant][fx] add _remove_qconfig flag to convert_fx (#53166) 2021-03-03 12:58:05 -08:00
quantize_jit.py Lazily initialize alias db constant prop (#54640) 2021-03-26 19:44:29 -07:00
quantize.py [quant] Factoring out the list of no_observers (#50459) 2021-02-17 12:38:30 -08:00
stubs.py type check for torch.quantization.stubs (#46475) 2020-10-16 15:34:23 -07:00
utils.py [quant][graphmode][fx] Fix a condition check for CopyNode (#53585) 2021-03-11 09:32:20 -08:00