pytorch/torch/ao/quantization
Jerry Zhang d56adb1b54 [quant][pt2e][refactor] Cleanup the logic for deciding whether to insert observer/fq or not (#99220)
Summary:
Previously there were two places where we needed to decide whether or not to insert an observer or fake quantizer:
(1) the input arguments of a node and (2) the output of a node, and each place had its own separate code.
In this PR, the logic is unified in the `_needs_obs_or_fq` helper function, which takes the target_dtype and is_dynamic from the previous output
and the target_dtype and is_dynamic for the current Tensor we are looking at.
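
A minimal sketch of the unified decision, with an illustrative quantized-dtype list (the exact signature and dtype handling of `_needs_obs_or_fq` in the real code may differ):

```python
import torch

# Dtypes for which an observer/fake quantizer would be inserted (illustrative subset).
_OBS_DTYPES = (torch.quint8, torch.qint8, torch.uint8, torch.int8)

def _needs_obs_or_fq_sketch(prev_dtype, prev_is_dynamic, cur_dtype, cur_is_dynamic):
    # Dynamic quantization: the current Tensor needs a (placeholder) observer on the
    # floating point input so it can later be lowered to choose_qparams -> q -> dq.
    if cur_is_dynamic:
        return cur_dtype in _OBS_DTYPES and not prev_is_dynamic
    # Static quantization: observe only if the target dtype is a quantized dtype and
    # the producer did not already give us a Tensor in that dtype.
    return cur_dtype in _OBS_DTYPES and cur_dtype != prev_dtype
```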

Let's use a conv node as an example:
```
conv = convolution(input, weight, bias, ...)
```

Let's say we have an `input_node` object for the argument `input`, and a `conv_node` object for the `conv` node in the graph.

(1) Input arguments, e.g. `input`
The target_dtype/is_dynamic from the previous output comes from the node that produces `input`; we get this from
input_node.meta["target_dtype_info"]["output_act_obs_or_fq"].

The target_dtype/is_dynamic for the current argument `input` comes from conv_node.meta["target_dtype_info"]["input_act_obs_or_fq"];
similarly, for weight it comes from conv_node.meta["target_dtype_info"]["weight_obs_or_fq"], etc.
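
For illustration, the lookups for case (1) might look roughly like the following, assuming the `*_obs_or_fq` entries are observer/fake-quant constructors whose instances expose `dtype` and `is_dynamic`, and using the `_needs_obs_or_fq_sketch` helper above (`input_node`/`conv_node` are the graph nodes from the example):

```python
def _dtype_and_is_dynamic(obs_or_fq_ctr):
    # Instantiate the observer/fake quantizer and read off its target dtype
    # and whether it performs dynamic quantization.
    obs_or_fq = obs_or_fq_ctr()
    return obs_or_fq.dtype, getattr(obs_or_fq, "is_dynamic", False)

prev_dtype, prev_is_dynamic = _dtype_and_is_dynamic(
    input_node.meta["target_dtype_info"]["output_act_obs_or_fq"])
cur_dtype, cur_is_dynamic = _dtype_and_is_dynamic(
    conv_node.meta["target_dtype_info"]["input_act_obs_or_fq"])
insert_obs_for_input = _needs_obs_or_fq_sketch(
    prev_dtype, prev_is_dynamic, cur_dtype, cur_is_dynamic)
```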

(2) Output of the conv node
The target_dtype/is_dynamic from the previous output is the floating point output of the fp32 convolution operator, so it
is hardcoded to (torch.float, False). Technically we should get this from node.meta["val"], but since the
current code base is shared by fx graph mode quantization and PyTorch 2.0 export quantization, we cannot do that yet; we can revisit
after we decide to deprecate fx graph mode quantization.

The target_dtype/is_dynamic for the current output comes from conv_node.meta["target_dtype_info"]["output_act_obs_or_fq"].
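
Case (2) then differs only in where the two sides come from (again a sketch using the helpers above):

```python
# Previous output: the fp32 result of the convolution itself, hardcoded for now.
prev_dtype, prev_is_dynamic = torch.float, False
# Current output: what the quantization config asks the conv output to be.
cur_dtype, cur_is_dynamic = _dtype_and_is_dynamic(
    conv_node.meta["target_dtype_info"]["output_act_obs_or_fq"])
insert_obs_for_output = _needs_obs_or_fq_sketch(
    prev_dtype, prev_is_dynamic, cur_dtype, cur_is_dynamic)
```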

There is one caveat here about dynamic quantization that is explained in the code comment, so I won't repeat it here.

Note: also fixed some places in `_get_arg_target_dtype_as_input_to_node` and `_get_arg_target_is_dynamic_as_input_to_node` to make sure "not specified" is treated the same as specifying a fp32 placeholder observer.

Next: we can merge the two functions that get the target dtype and is_dynamic to reduce code duplication.

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestQuantizeFxModels
python test/test_quantization.py TestQuantizePT2E
python test/test_quantization.py TestQuantizePT2EModels

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: [D45167585](https://our.internmc.facebook.com/intern/diff/D45167585)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99220
Approved by: https://github.com/kimishpatel
2023-04-21 16:58:35 +00:00
..
_pt2e [quant][pt2e][refactor] Cleanup the logic for deciding whether to insert observer/fq or not (#99220) 2023-04-21 16:58:35 +00:00
backend_config [Quant][pt2e] torch.mean and ReLU6 (#98984) 2023-04-17 18:33:04 +00:00
experimental AO migration: replace torch internal callsites (#94170) 2023-02-07 02:32:23 +00:00
fx [quant][pt2e][refactor] Cleanup the logic for deciding whether to insert observer/fq or not (#99220) 2023-04-21 16:58:35 +00:00
__init__.py [ao] making _is_activation_post_process private with BC (#90554) 2022-12-16 08:09:33 +00:00
_correct_bias.py [BE] [2/3] Rewrite super() calls in functorch and torch (#94588) 2023-02-10 21:16:33 +00:00
_equalize.py Fix typos under torch/ao directory (#97679) 2023-04-10 22:25:15 +00:00
_learnable_fake_quantize.py Fix disbale typos (#95322) 2023-02-23 02:08:45 +00:00
_quantize_pt2e.py [quant][pt2] Add Conv + BN fusion for prepare QAT (#98568) 2023-04-20 20:15:28 +00:00
fake_quantize.py [BE] [2/3] Rewrite super() calls in functorch and torch (#94588) 2023-02-10 21:16:33 +00:00
fuse_modules.py Fix typos under torch/ao directory (#97679) 2023-04-10 22:25:15 +00:00
fuser_method_mappings.py Fix typos under torch/ao directory (#97679) 2023-04-10 22:25:15 +00:00
observer.py [quant][pt2e][refactor] Cleanup the logic for deciding whether to insert observer/fq or not (#99220) 2023-04-21 16:58:35 +00:00
pattern.md [quant][refactor] Move pattern type definition to ao/quantization/utils.py (#68769) 2021-12-07 11:00:22 -08:00
qconfig_mapping.py [Quant] Add get_symmetric_qnnpack_qat_qconfig_mapping (#98569) 2023-04-07 17:57:56 +00:00
qconfig.py [BE] Merge isinstance calls together (#94419) 2023-02-09 00:47:26 +00:00
quant_type.py [ao] quant_type.py fixing public v private (#87519) 2022-11-15 15:42:31 +00:00
quantization_mappings.py [BE] Apply almost all remaining flake8-comprehension checks (#94676) 2023-02-12 01:01:25 +00:00
quantize_fx.py Fix typos under torch/ao directory (#97679) 2023-04-10 22:25:15 +00:00
quantize_jit.py Fix typos under torch/ao directory (#97679) 2023-04-10 22:25:15 +00:00
quantize.py Fix typos under torch/ao directory (#97679) 2023-04-10 22:25:15 +00:00
stubs.py [BE] [2/3] Rewrite super() calls in functorch and torch (#94588) 2023-02-10 21:16:33 +00:00
utils.py Return zero_point from determine_qparams as a int64 (#98746) 2023-04-11 19:01:05 +00:00