pytorch/torch/ao/quantization/quantizer
Jerry Zhang 1b51d29b66 [quant][pt2e] Enable constant folding for quantize ops (#109343)
Summary:
This PR adds constant folding for quantize ops so that instead of storing fp32 weights in the
quantized model, we get int8/int16 etc. weights.
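Below is a minimal sketch (not taken from the PR) of the PT2E quantization flow this change affects, assuming the 2023-era capture_pre_autograd_graph API and the XNNPACKQuantizer from this directory; the toy module, tensor sizes, and the dtype-inspection loop at the end are illustrative only. With constant folding enabled, the weight constants in the converted graph are expected to show up as int8 rather than fp32.

```python
import torch
from torch._export import capture_pre_autograd_graph
from torch.ao.quantization.quantize_pt2e import prepare_pt2e, convert_pt2e
from torch.ao.quantization.quantizer.xnnpack_quantizer import (
    XNNPACKQuantizer,
    get_symmetric_quantization_config,
)

# Hypothetical toy model, for illustration only.
class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(4, 4)

    def forward(self, x):
        return self.linear(x)

example_inputs = (torch.randn(1, 4),)

# Capture the model into an FX graph (pre-autograd ATen graph).
m = capture_pre_autograd_graph(M().eval(), example_inputs)

# Annotate with the XNNPACK quantizer and insert observers.
quantizer = XNNPACKQuantizer().set_global(get_symmetric_quantization_config())
m = prepare_pt2e(m, quantizer)
m(*example_inputs)  # calibration pass
m = convert_pt2e(m)

# After convert_pt2e with constant folding, the folded weight constants
# are expected to be stored as int8 tensors instead of fp32.
for name, t in m.state_dict().items():
    print(name, t.dtype, tuple(t.shape))
```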

Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_fold_quantize

Will also verify in ExecuTorch later.

Differential Revision: [D49399210](https://our.internmc.facebook.com/intern/diff/D49399210)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109343
Approved by: https://github.com/kimishpatel, https://github.com/jgong5
2023-09-27 06:04:45 +00:00
__init__.py [quant][pt2e] Move specific quantizer related things outside of main quant code base (#106806) (#107259) 2023-08-18 21:29:09 +00:00
composable_quantizer.py [quant][pt2e] Move specific quantizer related things outside of main quant code base (#106806) (#107259) 2023-08-18 21:29:09 +00:00
embedding_quantizer.py [quant][pt2e] Move specific quantizer related things outside of main quant code base (#106806) (#107259) 2023-08-18 21:29:09 +00:00
quantizer.py [quant][pt2e] Support int16 quantization (#108453) 2023-09-06 19:31:20 +00:00
utils.py [Quant] Add DQ duplication pass (#107900) 2023-09-02 06:20:03 +00:00
x86_inductor_quantizer.py x86_inductor_quantizer switches to new graph capture API (#108214) 2023-09-01 00:43:45 +00:00
xnnpack_quantizer_utils.py [Pytorch][quant] Move xnnpack quantizer to use aten.linear (#109254) 2023-09-18 20:26:44 +00:00
xnnpack_quantizer.py [quant][pt2e] Enable constant folding for quantize ops (#109343) 2023-09-27 06:04:45 +00:00