Summary:
As titled: add DeprecationWarnings for the torch.ao.quantization APIs (see the test plan below).
Test Plan:
```
(ao) $ PYTHONWARNINGS='default' python
Python 3.10.14 | packaged by conda-forge | (main, Mar 20 2024, 12:45:18) [GCC 12.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from torch.ao.quantization.quantizer.xnnpack_quantizer import XNNPACKQuantizer
printing warning
*/anaconda3/envs/ao/lib/python3.10/site-packages/torch/ao/quantization/__init__.py:36: DeprecationWarning: torch.ao.quantization is deprecated. Plan is to
1. Remove eager mode quantization (torch.ao.quantization.quantize, torch.ao.quantization.quantize_dynamic), please migrate to use torchao eager mode quantize_ API instead
2. Remove fx graph mode quantization (torch.ao.quantization.quantize_fx.prepare_fx, torch.ao.quantization.quantize_fx.convert_fx, please migrate to use torchao pt2e quantization API instead (prepare_pt2e, convert_pt2e)
3. pt2e quantization has been migrated to torchao (https://github.com/pytorch/ao/tree/main/torchao/quantization/pt2e)
see https://dev-discuss.pytorch.org/t/torch-ao-quantization-migration-plan/2810 for more details
warnings.warn(
>>> a = XNNPACKQuantizer()
*/anaconda3/envs/ao/lib/python3.10/site-packages/torch/ao/quantization/quantizer/xnnpack_quantizer.py:281: DeprecationWarning: XNNPACKQuantizer is deprecated! Please use xnnpack quantizer in ExecuTorch (https://github.com/pytorch/executorch/tree/main/backends/xnnpack/quantizer) instead
warnings.warn(f"{self.__class__.__name__} is deprecated! Please use xnnpack quantizer in ExecuTorch (https://github.com/pytorch/executorch/tree/main/backends/xnnpack/quantizer) instead", DeprecationWarning)
>>>
```
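For context, a minimal sketch of the warning mechanism exercised above (illustrative, not the exact upstream code); DeprecationWarning is suppressed by default, which is why the session sets PYTHONWARNINGS='default':
```
import warnings

# Minimal sketch: emit a DeprecationWarning once when the deprecated package
# is imported. DeprecationWarning is hidden by default outside __main__,
# hence PYTHONWARNINGS='default' in the session above.
warnings.warn(
    "torch.ao.quantization is deprecated; please migrate to the torchao APIs",
    DeprecationWarning,
    stacklevel=2,
)
```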
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/153892
Approved by: https://github.com/Skylion007
Summary:
If a module being quantized contains some meta tensors and some tensors on an actual device, we should not fail quantization.
Quantization should also not fail if the new quantized module is created on a meta device (see the sketch below).
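A minimal sketch of the intended guard (the helper name is hypothetical, not the function changed in this PR): skip observer calls for meta tensors, since they carry no data to calibrate on.
```
import torch

def observe_if_real(observer, tensor):
    # Hypothetical helper: meta tensors have no data to calibrate on, so
    # observing them should be skipped rather than raising an error.
    if tensor.is_meta:
        return tensor
    return observer(tensor)

# A weight materialized on the meta device is passed through untouched.
meta_weight = torch.empty(4, 4, device="meta")
print(observe_if_real(lambda t: t, meta_weight).is_meta)  # True
```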
Differential Revision: D66895899
Pull Request resolved: https://github.com/pytorch/pytorch/pull/142262
Approved by: https://github.com/iamzainhuda
Summary:
### Motivation
In D65283170, we need a subclass of the quantizable LSTM to enable split_gates. This is also required for tests.
### What's the change?
Since the subclass is not part of the no_observer() set, an improper observer is added after the quantizable LSTM module. Here, we change the exact class check to an issubclass check against the no_observer set (see the sketch below).
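A sketch of the change, assuming the no_observer set that mirrors torch.ao.quantization.quantization_mappings.no_observer_set():
```
import torch.ao.nn.quantizable as nnqa

# Mirrors no_observer_set(): modules that should not get observers inserted.
no_observer = {nnqa.LSTM, nnqa.MultiheadAttention}

def skips_observer(module):
    # Before: exact type match, so a subclass of the quantizable LSTM still
    # got an (improper) observer attached after it.
    # After: issubclass check, so subclasses are treated like the base class.
    return issubclass(type(module), tuple(no_observer))
```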
Test Plan:
- N6206576
- CI.
Reviewed By: andrewor14
Differential Revision: D65989314
Pull Request resolved: https://github.com/pytorch/pytorch/pull/140818
Approved by: https://github.com/andrewor14
Summary:
After QAT has completed, or when a pre-tuned weight observer is supplied by a tunable PTQ algorithm, the weight observer should not be overwritten by re-observing the weight; for static QAT this should never happen.
Dynamic QAT likewise does not require re-running the weight observer, by design.
This is a fix for that behavior (a sketch of the intent follows).
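A hedged sketch of the intended behavior; the names below are illustrative, not the code changed in this diff:
```
import torch
from torch.ao.quantization.observer import MinMaxObserver

def quantize_weight(weight, weight_observer, observer_already_calibrated):
    # Illustrative only: if QAT or a tuning algorithm already calibrated the
    # weight observer, reuse its qparams instead of re-observing the weight.
    if not observer_already_calibrated:
        weight_observer(weight)
    scale, zero_point = weight_observer.calculate_qparams()
    return torch.quantize_per_tensor(weight, float(scale), int(zero_point), torch.qint8)

w = torch.randn(8, 8)
obs = MinMaxObserver(dtype=torch.qint8, qscheme=torch.per_tensor_symmetric)
obs(w)  # pre-tuned / QAT-calibrated observer
qw = quantize_weight(w, obs, observer_already_calibrated=True)
```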
Test Plan: Signals
Differential Revision: D57747749
Pull Request resolved: https://github.com/pytorch/pytorch/pull/127309
Approved by: https://github.com/jerryzh168
Summary: Tests are failing because code packaged with trained models calls the now-defunct function name (is_activation_post_process).
This diff maintains backward compatibility temporarily until the cached code can be refreshed.
Test Plan: no functional change
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89260
Approved by: https://github.com/jerryzh168
Summary: The same function existed in both observer and quantize; it has been
consolidated into a single function. Note that the two definitions were slightly
different; I've changed the definition to be maximally inclusive so that the name
of the function is more accurate (see the sketch below).
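The consolidated helper is likely along these lines (a sketch; "maximally inclusive" here means accepting both observers and fake-quantize modules):
```
from torch.ao.quantization.observer import ObserverBase
from torch.ao.quantization.fake_quantize import FakeQuantizeBase

def is_activation_post_process(module):
    # Maximally inclusive: any observer or fake-quantize module counts as an
    # activation post-process, matching what the function name implies.
    return isinstance(module, (ObserverBase, FakeQuantizeBase))
```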
Test Plan: python test/test_public_bindings.py
python test/test_quantization.py
Reviewers:
Subscribers:
Tasks:
Tags:
Differential Revision: [D40709276](https://our.internmc.facebook.com/intern/diff/D40709276)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87520
Approved by: https://github.com/jcaip
Summary: This commit fixes the bug where `non_leaf_module_list`
was not respected for activation modules like `torch.nn.Sigmoid`
and `torch.nn.Tanh`. Today, these modules default to
`default_fixed_qparams_range_0to1_fake_quant`, and there is no
way to configure them to use any other activation_post_process
(e.g. FixedQParamsObserver) (see this [mapping](dc00bb51b8/torch/ao/quantization/quantization_mappings.py (L188-L193))).
`non_leaf_module_list` is a "list of non-leaf modules we want
to add observer" (see prepare docstring). If the user explicitly
asked to insert observers for these modules, we should respect
that instead of continuing to use the default (see the usage sketch below).
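A minimal usage sketch with the eager-mode prepare API (the model and qconfig are illustrative):
```
import torch
import torch.nn as nn
from torch.ao.quantization import prepare, get_default_qconfig

class M(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 4)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        return self.sigmoid(self.linear(x))

model = M().eval()
model.qconfig = get_default_qconfig("fbgemm")
# With this fix, asking for observers on nn.Sigmoid is respected instead of
# silently falling back to the fixed-qparams default.
prepared = prepare(model, observer_non_leaf_module_list=[nn.Sigmoid])
```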
Test Plan:
python test/test_quantization.py TestQuantizeEagerPTQStatic.test_activations_in_non_leaf_module_list
Reviewers: vkuzo, jerryzh168
Subscribers: vkuzo, jerryzh168
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88498
Approved by: https://github.com/jerryzh168
Summary:
Currently we expect users to provide custom modules for LSTM and MHA. However, since we almost always ask users to use those modules in the custom-module context, it is better to make this behavior the default. Here we try to align with the base quantization API: if the user specifies a custom_config_dict, that is used; if the value is left as None, the default is used. A user who wants to both use the default and modify it has to do so manually, but the default is accessible via get_default_custom_config_dict (see below).
Additionally, Numeric Suite (NS), which uses prepare to insert custom observers for
its own purposes, had to be slightly modified to pass in an empty
custom_config_dict in order to avoid modifying the custom modules.
Due to weird CI issues with the previous PR, this is a resubmission;
the previous discussion can be found at: https://github.com/pytorch/pytorch/pull/71192
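A quick sketch of inspecting the new default (the keys shown are the ones used by the eager-mode custom-module config; exact import path may vary by release):
```
from torch.ao.quantization.quantize import get_default_custom_config_dict

# The default now wires up nn.LSTM and nn.MultiheadAttention as custom
# modules; it is used whenever custom_config_dict is left as None.
default_cfg = get_default_custom_config_dict()
print(default_cfg["float_to_observed_custom_module_class"])
print(default_cfg["observed_to_quantized_custom_module_class"])
```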
Test Plan:
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79960
Approved by: https://github.com/z-a-f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74846
This PR primarily allows the PTQ convert function to work with
parametrized modules. Given that the parametrized weight is what is used
by default in convert, as long as sparsifier.step() has already been
called, the converted model will use the sparsified weights. There is
currently no way to handle things if sparsifier.step() has not been
called. Lastly, this adds the is_leaf_or_only_parametrized function, because
parametrized modules no longer look like leaves due to the
parametrizations module attached to them (sketch below).
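A sketch of the idea behind such a check (not necessarily the exact implementation):
```
import torch.nn as nn

def is_leaf_or_only_parametrized(module: nn.Module) -> bool:
    # A parametrized module gains a "parametrizations" child, so it no longer
    # has zero children; treat it as a leaf if that is its only child.
    # (all() over an empty iterator is True, covering ordinary leaves.)
    return all(name == "parametrizations" for name, _ in module.named_children())
```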
Test Plan:
python test/test_ao_sparsity.py TestComposability
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D35240275
fbshipit-source-id: 48529f2a83edfe6d8a2d2dff8ca3d08a3fb0d553
(cherry picked from commit 9d6361482e2885db964e02b0222cd23c9f4d469e)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74845
This PR adds support for the quantization flow to detect
parametrized modules and match them using their original module types.
This mainly involved using the new type_before_parametrizations function rather than
type to check for module matching (illustrated below).
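For illustration, how the two differ on a parametrized module:
```
import torch.nn as nn
from torch.nn.utils import parametrize

class Double(nn.Module):
    def forward(self, w):
        return 2 * w

linear = nn.Linear(4, 4)
parametrize.register_parametrization(linear, "weight", Double())

# type() reports the auto-generated parametrized class, while
# type_before_parametrizations() recovers nn.Linear, which is what the
# quantization flow should match against.
print(type(linear))
print(parametrize.type_before_parametrizations(linear))
```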
Test Plan:
python test/test_ao_sparsity.py TestComposability
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D35240274
fbshipit-source-id: 7294d89c9c2e069e51d8b9bafa45c15f92bed124
(cherry picked from commit ed5cdb7b636c42e040d1b4a67b6b94604d06e1ff)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74277
See issue: https://github.com/pytorch/pytorch/issues/74240
This fixes that issue by skipping the children of untraceable modules during
propagate_qconfig. This required extending said function to take
prepare_custom_config_dict as an optional argument (usage sketch below).
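A sketch against the FX prepare API of that era (dict-style configs; newer releases also require example_inputs and spell this via PrepareCustomConfig.set_non_traceable_module_classes):
```
import torch
import torch.nn as nn
from torch.ao.quantization import get_default_qat_qconfig
from torch.ao.quantization.quantize_fx import prepare_qat_fx

class NonTraceable(nn.Module):
    def forward(self, x):
        # Data-dependent control flow: symbolic tracing cannot handle this.
        return x * 2 if x.sum() > 0 else x

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.untraceable = NonTraceable()
        self.linear = nn.Linear(4, 4)

    def forward(self, x):
        return self.linear(self.untraceable(x))

model = Model().train()
qconfig_dict = {"": get_default_qat_qconfig("fbgemm")}
# With this fix, children of NonTraceable are skipped by propagate_qconfig too.
custom_config_dict = {"non_traceable_module_class": [NonTraceable]}
prepared = prepare_qat_fx(model, qconfig_dict, prepare_custom_config_dict=custom_config_dict)
```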
Test Plan:
python test/test_quantization.py
python test/test_quantization.py TestQuantizeFx.test_qat_skip_untraced
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D34916074
fbshipit-source-id: 11caba2cbf78566fb51adf698b01bbba0275de28
(cherry picked from commit 5324c48e4c3277bb12a716a4408151c86006ee47)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71104
This shouldn't change any functionality, given that those
variables were not used. It should be noted that a similar variable is
used in add_observer, which is why it wasn't removed from there.
ghstack-source-id: 146940043
Test Plan:
python test/test_quantization.py
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D33510352
fbshipit-source-id: c66ed72c2b71a6e1822f9311467adaa1f4b730d0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69864
As titled; a follow-up PR will remove QConfigDynamic from the API.
Test Plan:
regression tests
```
python test/test_quantization.py TestPostTrainingStatic
python test/test_quantization.py TestPostTrainingDynamic
python test/test_quantization.py TestQuantizeFx
```
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D33073235
fbshipit-source-id: 6c1a1647032453803c55cdad7c04154502f085db
Summary:
**Summary:** This commit adds the `torch.nn.qat.dynamic.modules.Linear`
module, the dynamic counterpart to `torch.nn.qat.modules.Linear`.
Functionally these are very similar, except the dynamic version
expects a memoryless observer and is converted into a dynamically
quantized module before inference.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67325
Test Plan:
`python3 test/test_quantization.py TestQuantizationAwareTraining.test_dynamic_qat_linear`
**Reviewers:** Charles David Hernandez, Jerry Zhang
**Subscribers:** Charles David Hernandez, Supriya Rao, Yining Lu
**Tasks:** 99696812
**Tags:** pytorch
Reviewed By: malfet, jerryzh168
Differential Revision: D32178739
Pulled By: andrewor14
fbshipit-source-id: 5051bdd7e06071a011e4e7d9cc7769db8d38fd73
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65674
Before this PR, users had to use the eager mode static quantization APIs to quantize Embedding/EmbeddingBag modules.
With this PR, they can use either the static or dynamic quantization APIs for Embedding quantization.
The only qconfig supported for embedding quantization is float_qparams_weight_only_qconfig, which is currently enforced in the from_float
method of the quantized Embedding/EmbeddingBag modules.
To combine embedding quantization with Linear dynamic quantization, users can use the qconfig_dict to specify a different qconfig for each module type (see the example below).
The prepare/convert APIs can still be used to quantize Embeddings, with the caveat that users need to ensure inputs to Embedding ops are FP32.
Addresses Issue #65185
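For example, combining EmbeddingBag quantization with dynamic Linear quantization through a per-module-type qconfig dict (a minimal sketch using current torch.ao.quantization names; the model is illustrative):
```
import torch
import torch.nn as nn
from torch.ao.quantization import (
    quantize_dynamic,
    float_qparams_weight_only_qconfig,
    default_dynamic_qconfig,
)

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.EmbeddingBag(num_embeddings=10, embedding_dim=12)
        self.fc = nn.Linear(12, 4)

    def forward(self, indices, offsets):
        return self.fc(self.emb(indices, offsets))

model = Model().eval()
# Different qconfig per module type: weight-only float-qparams quantization
# for the EmbeddingBag, standard dynamic quantization for the Linear.
quantized = quantize_dynamic(
    model,
    qconfig_spec={
        nn.EmbeddingBag: float_qparams_weight_only_qconfig,
        nn.Linear: default_dynamic_qconfig,
    },
)
```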
ghstack-source-id: 139935419
Test Plan:
python test/test_quantization.py
Imported from OSS
Reviewed By: gchanan
Differential Revision: D31211199
fbshipit-source-id: 8c747881caee5ccbf8b93c6704b08d132049dea4