pytorch/torch/ao
andrewor14 3eea300680 [quant] Do not decompose choose_qparams_per_token_asymmetric (#124178)
Summary: https://github.com/pytorch/pytorch/pull/123452 added
backward support to this op by making it
CompositeImplicitAutograd, which caused it to be decomposed
during export/compile. That is undesirable in the PTQ case,
where we want to keep the op intact when lowering the model.
This commit enables QAT without breaking PTQ by refactoring the
implementation into a separate op that retains backward support,
so the original op is no longer decomposed.
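
The pattern looks roughly like the sketch below. This is a
minimal illustration, assuming a toy "my_quant" namespace and a
simplified signature; the op names and the int8 qparams math are
illustrative, not the actual PyTorch registrations:

    import torch
    from torch.library import Library, impl

    lib = Library("my_quant", "DEF")
    lib.define(
        "choose_qparams_per_token_asymmetric(Tensor input) -> (Tensor, Tensor)"
    )
    lib.define(
        "_choose_qparams_per_token_asymmetric_impl(Tensor input) -> (Tensor, Tensor)"
    )

    # Inner op: CompositeImplicitAutograd, so autograd traces through
    # the Python body and QAT backward works; the cost is that this op
    # decomposes under export/compile.
    @impl(lib, "_choose_qparams_per_token_asymmetric_impl",
          "CompositeImplicitAutograd")
    def _choose_qparams_impl(input):
        # Toy per-token asymmetric int8 qparams: one scale and
        # zero_point per row of the input.
        qmin, qmax = -128, 127
        min_val = input.amin(dim=-1, keepdim=True)
        max_val = input.amax(dim=-1, keepdim=True)
        scale = (max_val - min_val).clamp(min=1e-7) / float(qmax - qmin)
        zero_point = qmin - torch.round(min_val / scale)
        return scale, zero_point

    # Outer op: CompositeExplicitAutograd, so export/compile keeps it
    # as a single opaque node that PTQ lowering can match; it simply
    # forwards to the differentiable impl.
    @impl(lib, "choose_qparams_per_token_asymmetric",
          "CompositeExplicitAutograd")
    def choose_qparams(input):
        return torch.ops.my_quant._choose_qparams_per_token_asymmetric_impl(input)

With this split, QAT code that needs gradients can call the impl
op directly, while PTQ lowering only ever sees the outer op:

    x = torch.randn(4, 8, requires_grad=True)
    scale, zp = torch.ops.my_quant._choose_qparams_per_token_asymmetric_impl(x)
    (scale.sum() + zp.sum()).backward()  # grads flow through the composite impl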

Test Plan:
python test/test_quantization.py -k test_decomposed_choose_qparams_per_token_asymmetric_backward

Reviewers: jerryzh168, digantdesai, zou3519

Subscribers: jerryzh168, digantdesai, zou3519, supriyar

Differential Revision: [D56192116](https://our.internmc.facebook.com/intern/diff/D56192116)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124178
Approved by: https://github.com/digantdesai
2024-04-16 22:58:48 +00:00
nn Update Quantizable LSTM to support QAT (#121448) 2024-03-08 18:55:50 +00:00
ns Enable possibly-undefined error code (#118533) 2024-01-30 21:07:01 +00:00
pruning Enable possibly-undefined error code (#118533) 2024-01-30 21:07:01 +00:00
quantization [quant] Do not decompose choose_qparams_per_token_asymmetric (#124178) 2024-04-16 22:58:48 +00:00
__init__.py