pytorch/torch/ao/quantization
Jerry Zhang d56adb1b54 [quant][pt2e][refactor] Cleanup the logic for deciding whether to insert observer/fq or not (#99220)
Summary:
Previously there were two places where we needed to decide whether or not to insert an observer or fake quantizer:
(1) the input arguments of a node and (2) the output of a node, and each place had its own separate code.
In this PR, the logic is unified in the `_needs_obs_or_fq` helper function, which takes the target_dtype and is_dynamic from the previous output
and the target_dtype and is_dynamic for the current Tensor we are looking at.
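
A minimal sketch of the unified decision, with an illustrative quantized-dtype list (the exact signature and dtype handling of `_needs_obs_or_fq` in the real code may differ):

```python
import torch

# Dtypes for which an observer/fake quantizer would be inserted (illustrative subset).
_OBS_DTYPES = (torch.quint8, torch.qint8, torch.uint8, torch.int8)

def _needs_obs_or_fq_sketch(prev_dtype, prev_is_dynamic, cur_dtype, cur_is_dynamic):
    # Dynamic quantization: the current Tensor needs a (placeholder) observer on the
    # floating point input so it can later be lowered to choose_qparams -> q -> dq.
    if cur_is_dynamic:
        return cur_dtype in _OBS_DTYPES and not prev_is_dynamic
    # Static quantization: observe only if the target dtype is a quantized dtype and
    # the producer did not already give us a Tensor in that dtype.
    return cur_dtype in _OBS_DTYPES and cur_dtype != prev_dtype
```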

Let's use a conv node as an example:
```
conv = convolution(input, weight, bias, ...)
```

Let's say we have an `input_node` object for the argument `input`, and a `conv_node` object for the `conv` node in the graph.

(1) Input arguments, e.g. `input`
The target_dtype/is_dynamic from the previous output comes from the node that produces `input`; we get this from
input_node.meta["target_dtype_info"]["output_act_obs_or_fq"].

The target_dtype/is_dynamic for the current argument `input` comes from conv_node.meta["target_dtype_info"]["input_act_obs_or_fq"];
similarly, for weight it comes from conv_node.meta["target_dtype_info"]["weight_obs_or_fq"], etc.
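
For illustration, the lookups for case (1) might look roughly like the following, assuming the `*_obs_or_fq` entries are observer/fake-quant constructors whose instances expose `dtype` and `is_dynamic`, and using the `_needs_obs_or_fq_sketch` helper above (`input_node`/`conv_node` are the graph nodes from the example):

```python
def _dtype_and_is_dynamic(obs_or_fq_ctr):
    # Instantiate the observer/fake quantizer and read off its target dtype
    # and whether it performs dynamic quantization.
    obs_or_fq = obs_or_fq_ctr()
    return obs_or_fq.dtype, getattr(obs_or_fq, "is_dynamic", False)

prev_dtype, prev_is_dynamic = _dtype_and_is_dynamic(
    input_node.meta["target_dtype_info"]["output_act_obs_or_fq"])
cur_dtype, cur_is_dynamic = _dtype_and_is_dynamic(
    conv_node.meta["target_dtype_info"]["input_act_obs_or_fq"])
insert_obs_for_input = _needs_obs_or_fq_sketch(
    prev_dtype, prev_is_dynamic, cur_dtype, cur_is_dynamic)
```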

(2) Output of the conv node
The target_dtype/is_dynamic from the previous output is the floating point output of the fp32 convolution operator, so it
is hardcoded to (torch.float, False). Technically we should get this from node.meta["val"], but since the
current code base is shared by fx graph mode quantization and PyTorch 2.0 export quantization, we cannot do that yet; we can revisit
after we decide to deprecate fx graph mode quantization.

The target_dtype/is_dynamic for the current output comes from conv_node.meta["target_dtype_info"]["output_act_obs_or_fq"].
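
Case (2) then differs only in where the two sides come from (again a sketch using the helpers above):

```python
# Previous output: the fp32 result of the convolution itself, hardcoded for now.
prev_dtype, prev_is_dynamic = torch.float, False
# Current output: what the quantization config asks the conv output to be.
cur_dtype, cur_is_dynamic = _dtype_and_is_dynamic(
    conv_node.meta["target_dtype_info"]["output_act_obs_or_fq"])
insert_obs_for_output = _needs_obs_or_fq_sketch(
    prev_dtype, prev_is_dynamic, cur_dtype, cur_is_dynamic)
```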

There is one caveat here about dynamic quantization that is explained in the code comment, so I won't repeat it here.

Note: also fixed some places in `_get_arg_target_dtype_as_input_to_node` and `_get_arg_target_is_dynamic_as_input_to_node` to make sure "not specified" is treated the same as specifying a fp32 placeholder observer.

Next: we can merge the two functions that get the target dtype and is_dynamic to reduce code duplication.

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestQuantizeFxModels
python test/test_quantization.py TestQuantizePT2E
python test/test_quantization.py TestQuantizePT2EModels

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: [D45167585](https://our.internmc.facebook.com/intern/diff/D45167585)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99220
Approved by: https://github.com/kimishpatel
2023-04-21 16:58:35 +00:00
..
_pt2e [quant][pt2e][refactor] Cleanup the logic for deciding whether to insert observer/fq or not (#99220) 2023-04-21 16:58:35 +00:00
backend_config [Quant][pt2e] torch.mean and ReLU6 (#98984) 2023-04-17 18:33:04 +00:00
experimental AO migration: replace torch internal callsites (#94170) 2023-02-07 02:32:23 +00:00
fx [quant][pt2e][refactor] Cleanup the logic for deciding whether to insert observer/fq or not (#99220) 2023-04-21 16:58:35 +00:00
__init__.py [ao] making _is_activation_post_process private with BC (#90554) 2022-12-16 08:09:33 +00:00
_correct_bias.py [BE] [2/3] Rewrite super() calls in functorch and torch (#94588) 2023-02-10 21:16:33 +00:00
_equalize.py Fix typos under torch/ao directory (#97679) 2023-04-10 22:25:15 +00:00
_learnable_fake_quantize.py Fix disbale typos (#95322) 2023-02-23 02:08:45 +00:00
_quantize_pt2e.py [quant][pt2] Add Conv + BN fusion for prepare QAT (#98568) 2023-04-20 20:15:28 +00:00
fake_quantize.py [BE] [2/3] Rewrite super() calls in functorch and torch (#94588) 2023-02-10 21:16:33 +00:00
fuse_modules.py Fix typos under torch/ao directory (#97679) 2023-04-10 22:25:15 +00:00
fuser_method_mappings.py Fix typos under torch/ao directory (#97679) 2023-04-10 22:25:15 +00:00
observer.py [quant][pt2e][refactor] Cleanup the logic for deciding whether to insert observer/fq or not (#99220) 2023-04-21 16:58:35 +00:00
pattern.md [quant][refactor] Move pattern type definition to ao/quantization/utils.py (#68769) 2021-12-07 11:00:22 -08:00
qconfig_mapping.py [Quant] Add get_symmetric_qnnpack_qat_qconfig_mapping (#98569) 2023-04-07 17:57:56 +00:00
qconfig.py [BE] Merge isinstance calls together (#94419) 2023-02-09 00:47:26 +00:00
quant_type.py [ao] quant_type.py fixing public v private (#87519) 2022-11-15 15:42:31 +00:00
quantization_mappings.py [BE] Apply almost all remaining flake8-comprehension checks (#94676) 2023-02-12 01:01:25 +00:00
quantize_fx.py Fix typos under torch/ao directory (#97679) 2023-04-10 22:25:15 +00:00
quantize_jit.py Fix typos under torch/ao directory (#97679) 2023-04-10 22:25:15 +00:00
quantize.py Fix typos under torch/ao directory (#97679) 2023-04-10 22:25:15 +00:00
stubs.py [BE] [2/3] Rewrite super() calls in functorch and torch (#94588) 2023-02-10 21:16:33 +00:00
utils.py Return zero_point from determine_qparams as a int64 (#98746) 2023-04-11 19:01:05 +00:00