Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73863
This PR fully aligns the convert function with the design: https://github.com/pytorch/rfcs/blob/master/RFC-0019-Extending-PyTorch-Quantization-to-Custom-Backends.md
and simplifies the implementation of the convert function by always producing a reference quantized model (with reference patterns) first,
and then lowering that model to a quantized model that is runnable with the PyTorch native backend (fbgemm/qnnpack).
This PR makes convert.py much easier to understand than the previous implementation, and it lets us remove the majority of the code
in quantization_patterns.py as well (in follow-up PRs).
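As a minimal, runnable illustration of the reference pattern for a linear op (scales and zero_points here are arbitrary; this is not the actual convert.py code):
```
import torch
import torch.nn.functional as F

x = torch.randn(2, 3)
w = torch.randn(4, 3)
# reference pattern: quantize -> dequantize pairs around the original fp32 op
xq = torch.quantize_per_tensor(x, scale=0.1, zero_point=0, dtype=torch.quint8)
wq = torch.quantize_per_tensor(w, scale=0.1, zero_point=0, dtype=torch.qint8)
out = F.linear(xq.dequantize(), wq.dequantize())
# the lowering step pattern-matches this subgraph and replaces it with a
# single quantized::linear op runnable on fbgemm/qnnpack
```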
Test Plan:
```
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
python test/test_quantization.py TestFXNumericSuiteCoreAPIsModels
```
and other internal/oss regression tests
Imported from OSS
Reviewed By: andrewor14
Differential Revision: D34778506
fbshipit-source-id: 0678b66addf736039a8749b352f6f569caca962b
(cherry picked from commit 33ec9caf23f3ab373d827117efbd9db0668b2437)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73436
This PR adds reference module support for Embedding and EmbeddingBag, following https://github.com/pytorch/rfcs/blob/master/RFC-0019-Extending-PyTorch-Quantization-to-Custom-Backends.md
* the reference module inherits from the corresponding float module (e.g. nn.Embedding) and from ReferenceQuantizedModule (which defines some utility functions to store qparams for a single weight)
* in forward, we first quantize and then dequantize the weight (to generate the pattern) and then feed the dequantized weight to the original fp32 op (see the sketch below)
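A minimal sketch of this pattern (simplified to per-tensor qparams; the actual reference modules store per-channel float qparams, and the class names here are illustrative):
```
import torch
import torch.nn as nn
import torch.nn.functional as F

class ReferenceEmbedding(nn.Embedding):
    # inherits from the float module and stores qparams for its single weight
    def __init__(self, num_embeddings, embedding_dim,
                 weight_scale=1.0, weight_zero_point=0,
                 weight_dtype=torch.quint8):
        super().__init__(num_embeddings, embedding_dim)
        self.weight_scale = weight_scale
        self.weight_zero_point = weight_zero_point
        self.weight_dtype = weight_dtype

    def forward(self, input):
        # quantize then immediately dequantize the weight so the lowering
        # pass can pattern-match it, then run the original fp32 op
        wq = torch.quantize_per_tensor(
            self.weight, self.weight_scale, self.weight_zero_point,
            self.weight_dtype)
        return F.embedding(input, wq.dequantize())
```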
We'll connect this with FX graph mode quantization later, in the final PR that deprecates the current convert implementation. Since the current convert doesn't
support emitting quantize_per_tensor_dynamic ops, we don't want to implement that support only to immediately throw away the code, so it might be better to just implement this
in the final flow.
Test Plan:
Will be tested later, in the final PR that deprecates the current convert implementation
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D34480325
fbshipit-source-id: bc353f3be035a364e013fa9132d0422f19120ac3
(cherry picked from commit 1722ec2f8d82e9763ef252fed5796fd09d120e34)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72277
Minor modifications were made to support the 4-bit quantized embedding module in the eager mode quantization flow and to allow for testing of the changes
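A hedged sketch of the eager mode flow this enables (the 4-bit qconfig name follows the one added in this line of work; exact details may differ):
```
import torch
import torch.nn as nn
from torch.ao.quantization import (
    convert, float_qparams_weight_only_qconfig_4bit, prepare)

class M(nn.Module):
    def __init__(self):
        super().__init__()
        self.embedding = nn.Embedding(10, 12)

    def forward(self, indices):
        return self.embedding(indices)

m = M().eval()
# quint4x2 packs two 4-bit values per byte; qparams are per-channel floats
m.embedding.qconfig = float_qparams_weight_only_qconfig_4bit
prepare(m, inplace=True)
convert(m, inplace=True)  # m.embedding is now a 4-bit quantized Embedding
```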
Test Plan:
In pytorch main dir, execute
```
python test/test_quantization.py TestPostTrainingStatic.test_quantized_embedding
```
Reviewed By: jerryzh168
Differential Revision: D33994545
Pulled By: dzdang
fbshipit-source-id: faafad54b7b07fc393904ba55c2b2ac934c276f7
(cherry picked from commit 042ffb2091)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72276
Added 4-bit support and the corresponding test in the module API. Restructured test_quantized_module for both 4- and 8-bit support.
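A hedged sketch of the module API at both bit-widths (shapes and sizes are arbitrary):
```
import torch
import torch.nn.quantized as nnq

indices = torch.randint(0, 10, (5,))
for dtype in (torch.quint8, torch.quint4x2):
    qemb = nnq.Embedding(num_embeddings=10, embedding_dim=12, dtype=dtype)
    out = qemb(indices)  # quantized lookup; output is a float tensor
```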
Test Plan:
In pytorch main dir, execute
```
python test/test_quantization.py TestStaticQuantizedModule.test_embedding_api
```
Reviewed By: dagitses
Differential Revision: D33994544
Pulled By: dzdang
fbshipit-source-id: 49f04f267913e9f3f9649305b233055157c82dee
(cherry picked from commit c8c8e6fb44)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69806
Minor modifications were made to support the 4-bit quantized embedding module in the eager mode quantization flow and to allow for testing of the changes
Test Plan:
In pytorch main dir, execute
```
python test/test_quantization.py TestPostTrainingStatic.test_quantized_embedding
```
to run the series of tests, including the newly added test_embedding_4bit function
Imported from OSS
Reviewed By: jbschlosser
Differential Revision: D33152675
fbshipit-source-id: 5cdaac5aee9b8850e61c99e74033889bcfec5d9f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69769
Added 4-bit support and the corresponding test in the module API. Restructured test_quantized_module for both 4- and 8-bit support.
Test Plan:
In pytorch main dir, execute
```
python test/test_quantization.py TestStaticQuantizedModule.test_embedding_api
```
Imported from OSS
Reviewed By: jbschlosser
Differential Revision: D33152674
fbshipit-source-id: 73e63383cf60994ab34cc7b4eedd8f32a806cf7f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69334
The original PR #68121 broke due to an incompatible qengine on Mac OS; this PR re-introduces the changes with a fix.
Add FX support for the QAT EmbeddingBag operator, which previously had only eager mode support (see the sketch below).
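A hedged sketch of the FX QAT flow this enables, using the qconfig_dict-style API of this era (later releases changed prepare_qat_fx's signature):
```
import torch
import torch.nn as nn
from torch.ao.quantization import default_embedding_qat_qconfig
from torch.ao.quantization.quantize_fx import prepare_qat_fx

class M(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.EmbeddingBag(10, 12)

    def forward(self, indices, offsets):
        return self.emb(indices, offsets)

m = M().train()
# insert fake quant modules for QAT on the EmbeddingBag weight
prepared = prepare_qat_fx(m, {"": default_embedding_qat_qconfig})
```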
Test Plan:
pytest test/quantization/fx/test_quantize_fx.py -v -k "test_qat_embeddingbag_linear"
Imported from OSS
Reviewed By: jingsh
Differential Revision: D32815153
fbshipit-source-id: 33654ce29de6e81920bf3277a75027fe403a1eb2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65900
This changes the imports in `caffe2/torch/nn/quantized` to use the new import locations.
```
codemod -d torch/nn/quantized --extensions py 'torch.quantization' 'torch.ao.quantization'
```
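For illustration, the rewrite amounts to the following (QConfig is just one example of an affected import):
```
# before the codemod
from torch.quantization import QConfig
# after the codemod
from torch.ao.quantization import QConfig
```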
Test Plan: `python test/run_test.py`
Reviewed By: jerryzh168
Differential Revision: D31301193
fbshipit-source-id: 58efb1ad51a8b441e2a3bd5b91af11eab6b9331f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66051
Make the error message clearer when a quantized embedding is converted
with an unsupported dtype. This is helpful when debugging quantization
errors on new models.
Test Plan:
```
import torch
import torch.nn as nn

class M(nn.Module):
    def __init__(self):
        super().__init__()
        self.embedding = nn.Embedding(1, 1)

m = M().eval()
m.qconfig = torch.quantization.QConfig(
    activation=torch.quantization.MinMaxObserver.with_args(dtype=torch.qint8),
    weight=torch.quantization.MinMaxObserver.with_args(dtype=torch.qint8))
m.embedding.qconfig = m.qconfig
mp = torch.quantization.prepare(m)
mq = torch.quantization.convert(mp)
# the error message now includes the incorrect dtype
```
Imported from OSS
Reviewed By: dagitses
Differential Revision: D31472848
fbshipit-source-id: 86f6d90bc0ad611aa9d1bdae24497bc6f3d2acaa
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65674
Before this PR, users had to use the eager mode static quantization APIs to quantize Embedding/EmbeddingBag modules.
With this PR they can use either the static or dynamic quantization APIs for Embedding quantization.
The only qconfig supported for embedding quantization is float_qparams_weight_only_qconfig, which is currently enforced in the from_float
method of the quantized Embedding/EmbeddingBag modules.
To combine embedding quantization with Linear dynamic quantization, users can use the qconfig_dict to specify a different qconfig for each module type (see the sketch below).
The prepare/convert APIs can still be used to quantize Embeddings, with the caveat that users need to ensure the input to Embedding ops is FP32.
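A hedged sketch of combining the two (module names and shapes here are arbitrary):
```
import torch
import torch.nn as nn
from torch.ao.quantization import (
    default_dynamic_qconfig, float_qparams_weight_only_qconfig,
    quantize_dynamic)

class M(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(10, 12)
        self.fc = nn.Linear(12, 4)

    def forward(self, indices):
        return self.fc(self.emb(indices))

# per-module-type qconfigs: weight-only embedding + dynamic linear
qconfig_dict = {
    nn.Embedding: float_qparams_weight_only_qconfig,
    nn.Linear: default_dynamic_qconfig,
}
qm = quantize_dynamic(M().eval(), qconfig_dict)
```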
Addresses Issue #65185
ghstack-source-id: 139935419
Test Plan:
python test/test_quantization.py
Imported from OSS
Reviewed By: gchanan
Differential Revision: D31211199
fbshipit-source-id: 8c747881caee5ccbf8b93c6704b08d132049dea4
Summary:
These unused variables were identified by [pyflakes](https://pypi.org/project/pyflakes/). They can be safely removed to simplify the code and possibly improve performance.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50100
Reviewed By: ezyang
Differential Revision: D25797764
Pulled By: smessmer
fbshipit-source-id: ced341aee692f429d2dcc3a4ef5c46c8ee99cabb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48069
Also renamed float_qparam_dynamic_qconfig to float_qparam_weight_only_qconfig.
It's not used in user code yet, so we only need to update the tests.
Test Plan: Imported from OSS
Reviewed By: supriyar
Differential Revision: D25010175
fbshipit-source-id: caa3eaa5358a8bc5c808bf5f64e6ebff3e0b61e8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46003
`sparse` is confusing because it is used in training for sparse gradients
Test Plan: Imported from OSS
Reviewed By: radkris-git, qizzzh
Differential Revision: D24178248
fbshipit-source-id: 0a2b595f3873d33b2ce25839b6eee31d2bfd3b0d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44217
Move the tests to static ones as well
Test Plan:
python test/test_quantization.py TestStaticQuantizedModule.test_embedding_bag_api
Imported from OSS
Reviewed By: raghuramank100
Differential Revision: D23547386
fbshipit-source-id: 41f81c31e1613098ecf6a7eff601c7dcd4b09c76