pytorch/torch/ao
Jerry Zhang 1b51d29b66 [quant][pt2e] Enable constant folding for quantize ops (#109343)
Summary:
This PR adds constant folding for quantize ops so that, instead of storing fp32 weights in the
quantized model, we get int8/int16 etc. weights.
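
To illustrate the effect, here is a minimal sketch (not part of this PR) of the PT2E quantization flow as it looked around the time of this commit; `capture_pre_autograd_graph` and `XNNPACKQuantizer` reflect the contemporary API and have since moved or been renamed in later PyTorch releases. After `convert_pt2e`, the weight constant in the graph should be an int8 tensor rather than an fp32 tensor with a quantize op applied at runtime.

```python
# Minimal sketch of the PT2E flow (APIs roughly as of PyTorch 2.1; assumptions,
# not code from this PR — the quantizer has since moved to executorch).
import torch
from torch._export import capture_pre_autograd_graph
from torch.ao.quantization.quantize_pt2e import prepare_pt2e, convert_pt2e
from torch.ao.quantization.quantizer.xnnpack_quantizer import (
    XNNPACKQuantizer,
    get_symmetric_quantization_config,
)

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(4, 4)

    def forward(self, x):
        return self.linear(x)

example_inputs = (torch.randn(2, 4),)
m = capture_pre_autograd_graph(M(), example_inputs)

quantizer = XNNPACKQuantizer()
quantizer.set_global(get_symmetric_quantization_config())
m = prepare_pt2e(m, quantizer)
m(*example_inputs)   # calibration run
m = convert_pt2e(m)  # depending on the version, fold_quantize may need to be passed

# With folding enabled, at least one constant in the graph should now be an
# int8 tensor (the quantized weight) instead of an fp32 tensor feeding a
# quantize_per_tensor op.
for node in m.graph.nodes:
    if node.op == "get_attr":
        t = getattr(m, node.target, None)
        if isinstance(t, torch.Tensor):
            print(node.target, tuple(t.shape), t.dtype)
```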

Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_fold_quantize

Will also verify in executorch later.

Differential Revision: [D49399210](https://our.internmc.facebook.com/intern/diff/D49399210)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109343
Approved by: https://github.com/kimishpatel, https://github.com/jgong5
2023-09-27 06:04:45 +00:00
nn            [BE]: Update ruff to 0.285 (#107519)                                2023-08-22 23:16:38 +00:00
ns            [BE]: Update Ruff to 0.0.280 (#105724)                              2023-07-22 23:03:34 +00:00
pruning       add pruning method: Filter Pruning via Geometric Median for Deep Convolutional Neural Networks Acceleration (#95689)  2023-08-02 16:24:42 +00:00
quantization  [quant][pt2e] Enable constant folding for quantize ops (#109343)    2023-09-27 06:04:45 +00:00
__init__.py   [refactor] Renaming ao.sparsity to ao.pruning (#84867)               2022-10-07 00:58:41 +00:00