Mirror of https://github.com/zebrajr/pytorch.git, synced 2025-12-07 00:21:07 +01:00
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23577

This diff fixes a model size issue introduced in #23291: after that PR, the model size after int8 quantization was the same as that of the original unquantized model. The reason is that the original weight was saved for int8 quantization even when it was no longer needed. This diff fixes that by saving the original weight only on the fp16 quantization path.

Reviewed By: llyfacebook

Differential Revision: D16557619

fbshipit-source-id: f924ae8d155a0d525b86a7440b3c7147d5bead0a
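A minimal sketch of the behavior the summary describes, assuming a hypothetical module class (`QuantizedLinearSketch` and its attributes are illustrative names, not the actual torch/jit/quantized.py code): the original fp32 weight is retained only when the quantization target is fp16.

```python
import torch


class QuantizedLinearSketch(torch.nn.Module):
    """Illustrative only: the class name and attributes are assumptions,
    not the actual torch/jit/quantized.py implementation."""

    def __init__(self, original_weight: torch.Tensor, dtype: torch.dtype):
        super().__init__()
        self.dtype = dtype
        # Stand-in for the packed weight the quantized op actually uses.
        self.packed_weight = original_weight.to(torch.int8)
        # Before the fix: the fp32 weight was kept unconditionally, so a
        # serialized int8 model was as large as the unquantized model.
        # After the fix: keep it only on the fp16 path, where it is still
        # needed to reconstruct the packed weight on load.
        if dtype == torch.float16:
            self.original_weight = original_weight
        else:
            self.original_weight = None
```

Under these assumptions, a module constructed with an int8 dtype no longer carries the fp32 tensor when serialized, so the saved size tracks the packed weight rather than the original.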
Directory contents:

- __init__.py
- _logging.py
- _pickle.py
- annotations.py
- frontend.py
- quantized.py
- supported_ops.py