Summary:
Currently we expect users to provide custom modules for LSTM and MHA. However, since we almost always ask users to use those modules in the custom context, it is better to make this behavior the default. To align with the base quantization API, a custom_config_dict specified by the user is used as given; if the value is left as None, the default is used. A user who wants to start from the default and modify it has to do so manually; the default is accessible via get_default_custom_config_dict.
Additionally, the NS, which uses prepare to insert custom observers for
its purposes, had to be slightly modified to pass in an empty
custom_config_dict in order to avoid modifying the custom modules.
This PR replaces a previous PR that ran into CI issues;
the previous discussion can be found at: https://github.com/pytorch/pytorch/pull/71192
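The fallback described above can be sketched without PyTorch. This is a minimal illustration under stated assumptions, not the actual implementation: `resolve_custom_config`, the string class names, and the dictionary layout are hypothetical stand-ins, and returning a copy of the default is a choice made for this sketch. Only the name get_default_custom_config_dict comes from this PR.

```python
import copy

# Hypothetical stand-in for the library default; the real default maps
# float module classes (e.g. LSTM, MHA) to their custom counterparts.
_DEFAULT_CUSTOM_CONFIG = {
    "float_to_observed_custom_module_class": {
        "LSTM": "QuantizableLSTM",
        "MultiheadAttention": "QuantizableMultiheadAttention",
    }
}


def get_default_custom_config_dict():
    """Return a copy so callers can modify it without touching the default."""
    return copy.deepcopy(_DEFAULT_CUSTOM_CONFIG)


def resolve_custom_config(custom_config_dict=None):
    """Use the caller's dict as given; fall back to the default only on None."""
    if custom_config_dict is None:
        return get_default_custom_config_dict()
    return custom_config_dict


# None selects the default mapping.
default_cfg = resolve_custom_config()

# An empty dict is a real value: no custom modules are swapped (the NS case).
ns_cfg = resolve_custom_config({})

# "Default plus a modification" has to be assembled by the caller.
modified = get_default_custom_config_dict()
modified["float_to_observed_custom_module_class"]["GRU"] = "MyObservedGRU"
merged_cfg = resolve_custom_config(modified)
```

Note that an empty dict and None behave differently here, which is what lets NS opt out of the default.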
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79960
Approved by: https://github.com/z-a-f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74846
This PR primarily allows the PTQ convert function to work with
parametrized modules. Given that the parametrized weight is what is used
by default in convert, as long as sparsifier.step() has already been
called, the converted model will use the sparsified weights. There is
currently no way to handle the case where sparsifier.step() has not been
called. Lastly, this PR adds the is_leaf_or_only_parametrized function because
parametrized modules no longer look like leaves due to the
parametrizations module attached to them.
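A framework-free sketch of the leaf check follows. `ToyModule` is a hypothetical stand-in for nn.Module, and detecting parametrization through a child named "parametrizations" is an assumption made for illustration; the real function in the PR may differ in detail.

```python
class ToyModule:
    """Hypothetical stand-in for nn.Module: only tracks named children."""

    def __init__(self, **children):
        self.children = dict(children)


def is_parametrized(module):
    # Assumption: parametrization shows up as a child named "parametrizations".
    return "parametrizations" in module.children


def is_leaf_or_only_parametrized(module):
    """True if the module has no children other than its parametrizations."""
    if not module.children:
        return True
    return is_parametrized(module) and len(module.children) == 1


plain_linear = ToyModule()
sparse_linear = ToyModule(parametrizations=ToyModule(weight=ToyModule()))
container = ToyModule(fc=plain_linear)
```

With this check, `sparse_linear` is still treated as a leaf even though it has one child, while `container` is not.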
Test Plan:
python test/test_ao_sparsity.py TestComposability
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D35240275
fbshipit-source-id: 48529f2a83edfe6d8a2d2dff8ca3d08a3fb0d553
(cherry picked from commit 9d6361482e2885db964e02b0222cd23c9f4d469e)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74845
This PR adds support for the quantization flow to detect
parametrized modules and match them using their original module types.
This mainly involved using the new type_before_parametrizations function rather than
type to check for module matching.
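The reason plain type fails can be shown with a small sketch. It assumes that parametrizing a module replaces its class with a dynamically created subclass; `Linear`, `parametrize`, the `_is_parametrized` marker, and `SUPPORTED` are hypothetical stand-ins, and only the name type_before_parametrizations is taken from this PR.

```python
class Linear:
    """Hypothetical stand-in for a float module class."""


def parametrize(module):
    """Swap the instance's class for a dynamic subclass, standing in for
    what registering a parametrization does to a module."""
    cls = type(module)
    module.__class__ = type(
        "Parametrized" + cls.__name__, (cls,), {"_is_parametrized": True}
    )
    return module


def type_before_parametrizations(module):
    """Return the original class, looking through the dynamic subclass."""
    cls = type(module)
    if getattr(cls, "_is_parametrized", False):
        return cls.__bases__[0]
    return cls


# Toy type-keyed mapping, like the ones the quantization flow matches against.
SUPPORTED = {Linear: "QuantizedLinear"}

m = parametrize(Linear())
found_by_type = type(m) in SUPPORTED
found_by_original_type = type_before_parametrizations(m) in SUPPORTED
```

Here `found_by_type` is False because the dynamic subclass is not a key of the mapping, while the lookup through the original type succeeds.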
Test Plan:
python test/test_ao_sparsity.py TestComposability
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D35240274
fbshipit-source-id: 7294d89c9c2e069e51d8b9bafa45c15f92bed124
(cherry picked from commit ed5cdb7b636c42e040d1b4a67b6b94604d06e1ff)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74277
See issue: https://github.com/pytorch/pytorch/issues/74240
This PR fixes that issue by skipping the children of untraceable modules during
propagate_qconfig. This required extending said function to take the
prepare_custom_config_dict as an optional argument.
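A minimal sketch of the skipping logic, using toy classes rather than nn.Module. The key names "non_traceable_module_name" and "non_traceable_module_class" are assumed here, and whether the untraceable module itself receives a qconfig is a choice made for this sketch, not something the description above specifies.

```python
class ToyModule:
    """Hypothetical stand-in for nn.Module with named children."""

    def __init__(self, **children):
        self.children = dict(children)
        self.qconfig = None


class Untraceable(ToyModule):
    """A module type the tracer is told not to look inside."""


def propagate_qconfig(module, qconfig, prepare_custom_config_dict=None, prefix=""):
    """Assign qconfig to module and recurse, skipping the children of
    modules listed as non-traceable by name or by class."""
    cfg = prepare_custom_config_dict or {}
    module.qconfig = qconfig
    skip_children = (
        prefix in cfg.get("non_traceable_module_name", [])
        or type(module) in cfg.get("non_traceable_module_class", [])
    )
    if skip_children:
        return
    for name, child in module.children.items():
        child_prefix = f"{prefix}.{name}" if prefix else name
        propagate_qconfig(child, qconfig, cfg, child_prefix)


inner = ToyModule()
model = ToyModule(block=Untraceable(inner=inner), fc=ToyModule())
propagate_qconfig(model, "qat_qconfig", {"non_traceable_module_class": [Untraceable]})
```

After the call, `model.children["fc"]` carries the qconfig, while `inner` is left untouched because its parent is listed as non-traceable.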
Test Plan:
python test/test_quantization.py
python test/test_quantization.py TestQuantizeFx.test_qat_skip_untraced
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D34916074
fbshipit-source-id: 11caba2cbf78566fb51adf698b01bbba0275de28
(cherry picked from commit 5324c48e4c3277bb12a716a4408151c86006ee47)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71104
This shouldn't change any functionality given that those
variables were not used. It should be noted that a similar variable is
used in add_observer which is why it wasn't removed from there.
ghstack-source-id: 146940043
Test Plan:
python test/test_quantization.py
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D33510352
fbshipit-source-id: c66ed72c2b71a6e1822f9311467adaa1f4b730d0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69864
As the title says; a follow-up PR will remove QConfigDynamic from the API.
Test Plan:
regression tests
```
python test/test_quantization.py TestPostTrainingStatic
python test/test_quantization.py TestPostTrainingDynamic
python test/test_quantization.py TestQuantizeFx
```
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D33073235
fbshipit-source-id: 6c1a1647032453803c55cdad7c04154502f085db
Summary:
This commit adds the `torch.nn.qat.dynamic.modules.Linear`
module, the dynamic counterpart to `torch.nn.qat.modules.Linear`.
Functionally these are very similar, except the dynamic version
expects a memoryless observer and is converted into a dynamically
quantized module before inference.
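To illustrate what "memoryless" means here, the sketch below contrasts a toy accumulating min/max observer with a toy memoryless one. Both classes are hypothetical and written in plain Python; they are not the observers shipped with PyTorch.

```python
class MinMaxObserver:
    """Toy observer: accumulates the running min/max over all batches."""

    def __init__(self):
        self.min_val = float("inf")
        self.max_val = float("-inf")

    def observe(self, values):
        self.min_val = min(self.min_val, min(values))
        self.max_val = max(self.max_val, max(values))


class MemorylessMinMaxObserver(MinMaxObserver):
    """Toy memoryless observer: statistics reflect only the latest batch."""

    def observe(self, values):
        self.min_val = min(values)
        self.max_val = max(values)


static_obs = MinMaxObserver()
dynamic_obs = MemorylessMinMaxObserver()
for batch in ([-1.0, 4.0], [0.5, 2.0]):
    static_obs.observe(batch)
    dynamic_obs.observe(batch)
```

The accumulating observer ends with the range of everything it has seen, whereas the memoryless one only reflects the last batch, which mirrors how dynamic quantization computes activation ranges per input.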
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67325
Test Plan:
`python3 test/test_quantization.py TestQuantizationAwareTraining.test_dynamic_qat_linear`
**Reviewers:** Charles David Hernandez, Jerry Zhang
**Subscribers:** Charles David Hernandez, Supriya Rao, Yining Lu
**Tasks:** 99696812
**Tags:** pytorch
Reviewed By: malfet, jerryzh168
Differential Revision: D32178739
Pulled By: andrewor14
fbshipit-source-id: 5051bdd7e06071a011e4e7d9cc7769db8d38fd73
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65674
Before this PR, users had to use the eager mode static quantization APIs to quantize Embedding/EmbeddingBag modules.
With this PR, they can use either the static or the dynamic quantization APIs for Embedding quantization.
The only qconfig supported for embedding quantization is float_qparams_weight_only_qconfig, which is currently enforced in the from_float
method of the quantized Embedding/EmbeddingBag modules.
To combine embedding quantization with Linear dynamic quantization, users can use the qconfig_dict to specify a different qconfig for each module type.
The prepare/convert APIs can still be used to quantize Embeddings, with the caveat that users need to ensure the inputs to Embedding ops are FP32.
Addresses Issue #65185
ghstack-source-id: 139935419
Test Plan:
python test/test_quantization.py
Imported from OSS
Reviewed By: gchanan
Differential Revision: D31211199
fbshipit-source-id: 8c747881caee5ccbf8b93c6704b08d132049dea4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66058
After the initial migration from `torch.quantization` to `torch.ao.quantization`, some of the files did not change.
This happened because the migration was done in parallel, and some of the files were landed while the others were still in the original location.
This is the last fix in the AO migration phase 1, which completely enables the ao.quantization namespace.
Test Plan: `python test/test_quantization.py`
Reviewed By: vkuzo
Differential Revision: D31366066
Pulled By: z-a-f
fbshipit-source-id: bf4a74885be89d098df2d87e685795a2a64026c5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65119
PyTorch Quantization: allow prepare_qat to include custom modules by passing allow_list into prepare_qat.
When implementing a custom module and custom mapping for Quantization Aware Training (QAT), we need to add the custom module to the mappings and to the allow_list during prepare_qat. The allow_list needs to be surfaced to propagate_qconfig_.
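The interaction between the mapping and the allow_list can be sketched with toy classes. Everything below is a hypothetical stand-in that only borrows the names prepare_qat, propagate_qconfig_, and allow_list from the description; the flat list of modules and the simplified signatures are assumptions made for brevity.

```python
class ToyModule:
    qconfig = None  # no qconfig until one is propagated


class Linear(ToyModule):
    """Stand-in for a module type that is in the default allow list."""


class MyCustomOp(ToyModule):
    """Stand-in for a user-defined float module."""


class MyCustomOpQAT(ToyModule):
    """Stand-in for the QAT counterpart of MyCustomOp."""

    @classmethod
    def from_float(cls, float_module):
        qat_module = cls()
        qat_module.qconfig = float_module.qconfig
        return qat_module


DEFAULT_ALLOW_LIST = {Linear}


def propagate_qconfig_(modules, qconfig, allow_list=None):
    """Attach qconfig only to modules whose type is in the allow list."""
    allowed = DEFAULT_ALLOW_LIST if allow_list is None else allow_list
    for module in modules:
        if type(module) in allowed:
            module.qconfig = qconfig


def prepare_qat(modules, qconfig, mapping, allow_list=None):
    """Forward allow_list to propagate_qconfig_, then swap mapped modules
    that actually received a qconfig."""
    propagate_qconfig_(modules, qconfig, allow_list)
    return [
        mapping[type(m)].from_float(m) if type(m) in mapping and m.qconfig else m
        for m in modules
    ]


mapping = {MyCustomOp: MyCustomOpQAT}

# Without the custom type in the allow list, the custom module is left alone.
skipped = prepare_qat([Linear(), MyCustomOp()], "qat_qconfig", mapping)

# With it, the custom module gets a qconfig and is swapped.
swapped = prepare_qat(
    [Linear(), MyCustomOp()],
    "qat_qconfig",
    mapping,
    allow_list=DEFAULT_ALLOW_LIST | {MyCustomOp},
)
```

The sketch shows why adding the custom module to the mapping alone is not enough: without the allow_list entry the module never receives a qconfig, so it is never swapped.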
Test Plan: relying on general unit test
Reviewed By: supriyar
Differential Revision: D30982060
fbshipit-source-id: 1114115b6a3b853238d33d72b5cbaafc60f463e0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64861
1. move the file
```
hg mv caffe2/torch/quantization/stubs.py caffe2/torch/ao/quantization/
```
2. create a new file in the old location and copy the imports
3. fix all call sites inside `torch`
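The effect of steps 1 and 2 is that the old location becomes a thin re-export of the new one. The sketch below simulates this with in-memory modules so that it runs without PyTorch; the module names `ao_quantization_stubs` and `quantization_stubs` and the empty `QuantStub` class are hypothetical stand-ins for the real files.

```python
import sys
import types

# New location: the implementation lives here after the move.
new_mod = types.ModuleType("ao_quantization_stubs")


class QuantStub:
    """Hypothetical stand-in for a class defined in the moved file."""


new_mod.QuantStub = QuantStub
sys.modules["ao_quantization_stubs"] = new_mod

# Old location: a new file that only copies the imports.
old_mod = types.ModuleType("quantization_stubs")
exec("from ao_quantization_stubs import QuantStub", old_mod.__dict__)
sys.modules["quantization_stubs"] = old_mod

# Existing call sites keep working and see the very same class object.
from quantization_stubs import QuantStub as OldQuantStub
from ao_quantization_stubs import QuantStub as NewQuantStub
```

Because both import paths resolve to the same object, isinstance checks and type-keyed mappings behave identically regardless of which path a call site uses.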
ghstack-source-id: 137885365
Test Plan: buck test mode/dev //caffe2/test:quantization
Reviewed By: jerryzh168
Differential Revision: D30879678
fbshipit-source-id: a2d24f25d01064212aca15e94e8c78240ba48953
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64445
AO Team is migrating the existing torch.quantization into torch.ao.quantization. We are doing it one file at a time to make sure that the internal callsites are updated properly.
This migrates the quantize.py from torch.quantization to torch.ao.quantization.
At this point both locations will be supported. Eventually the torch.quantization will be deprecated.
Test Plan: `buck test mode/dev //caffe2/test:quantization`
Reviewed By: HDCharles
Differential Revision: D30734870
fbshipit-source-id: dc204f3cc46bff2cc81c95159eab9d333b43bb4b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64086
AO Team is migrating the existing torch.quantization into torch.ao.quantization. We are doing it one file at a time to make sure that the internal callsites are updated properly.
This migrates the `quantize.py` from torch.quantization to `torch.ao.quantization`.
At this point both locations will be supported. Eventually the torch.quantization will be deprecated.
Test Plan: `buck test mode/opt //caffe2/test:quantization`
Reviewed By: jerryzh168, raghuramank100
Differential Revision: D30055886
fbshipit-source-id: 8ef7470f9fa640c0042bef5bb843e7a05ecd0b9f