pytorch/torch/ao/quantization/quantizer
Jerry Zhang 1b51d29b66 [quant][pt2e] Enable constant folding for quantize ops (#109343)
Summary:
This PR adds constant folding for quantize ops so that instead of storing fp32 weights in the
quantized model, we get int8/int16 etc. weights.
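Below is a minimal sketch (not taken from the PR) of the PT2E quantization flow this change affects, assuming the 2023-era capture_pre_autograd_graph API and the XNNPACKQuantizer from this directory; the toy module, tensor sizes, and the dtype-inspection loop at the end are illustrative only. With constant folding enabled, the weight constants in the converted graph are expected to show up as int8 rather than fp32.

```python
import torch
from torch._export import capture_pre_autograd_graph
from torch.ao.quantization.quantize_pt2e import prepare_pt2e, convert_pt2e
from torch.ao.quantization.quantizer.xnnpack_quantizer import (
    XNNPACKQuantizer,
    get_symmetric_quantization_config,
)

# Hypothetical toy model, for illustration only.
class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(4, 4)

    def forward(self, x):
        return self.linear(x)

example_inputs = (torch.randn(1, 4),)

# Capture the model into an FX graph (pre-autograd ATen graph).
m = capture_pre_autograd_graph(M().eval(), example_inputs)

# Annotate with the XNNPACK quantizer and insert observers.
quantizer = XNNPACKQuantizer().set_global(get_symmetric_quantization_config())
m = prepare_pt2e(m, quantizer)
m(*example_inputs)  # calibration pass
m = convert_pt2e(m)

# After convert_pt2e with constant folding, the folded weight constants
# are expected to be stored as int8 tensors instead of fp32.
for name, t in m.state_dict().items():
    print(name, t.dtype, tuple(t.shape))
```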

Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_fold_quantize

Will also verify in ExecuTorch later.

Differential Revision: [D49399210](https://our.internmc.facebook.com/intern/diff/D49399210)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109343
Approved by: https://github.com/kimishpatel, https://github.com/jgong5
2023-09-27 06:04:45 +00:00
__init__.py [quant][pt2e] Move specific quantizer related things outside of main quant code base (#106806) (#107259) 2023-08-18 21:29:09 +00:00
composable_quantizer.py [quant][pt2e] Move specific quantizer related things outside of main quant code base (#106806) (#107259) 2023-08-18 21:29:09 +00:00
embedding_quantizer.py [quant][pt2e] Move specific quantizer related things outside of main quant code base (#106806) (#107259) 2023-08-18 21:29:09 +00:00
quantizer.py [quant][pt2e] Support int16 quantization (#108453) 2023-09-06 19:31:20 +00:00
utils.py [Quant] Add DQ duplication pass (#107900) 2023-09-02 06:20:03 +00:00
x86_inductor_quantizer.py x86_inductor_quantizer switches to new graph capture API (#108214) 2023-09-01 00:43:45 +00:00
xnnpack_quantizer_utils.py [Pytorch][quant] Move xnnpack quantizer to use aten.linear (#109254) 2023-09-18 20:26:44 +00:00
xnnpack_quantizer.py [quant][pt2e] Enable constant folding for quantize ops (#109343) 2023-09-27 06:04:45 +00:00