Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66379
Description:
Creates a quantization API reference and fixes all the docblock errors.
This is #66122 to #66210 squashed together
Test Plan:
```
cd docs
make html
python -m http.server
// open webpage, inspect it, looks good
```
Reviewed By: ejguan
Differential Revision: D31543172
Pulled By: vkuzo
fbshipit-source-id: 9131363d6528337e9f100759654d3f34f02142a9
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66125
Before this PR, the documentation for observers and fake_quants was inlined in the
Eager mode quantization page. This was hard to discover, especially
since that page is really long, and we now have FX graph mode quantization reusing
all of this code.
This PR moves observers and fake_quants into their own documentation pages. It also
adds docstrings to all user facing module attributes such as the default observers
and fake_quants, so people can discover them from documentation without having
to inspect the source code.
For now, enables autoformatting (which means all public classes, functions, members
with docstrings will get docs). If we need to exclude something in these files from
docs in the future, we can go back to manual docs.
Test Plan:
```
cd docs
make html
python -m server.http
// inspect docs on localhost, renders correctly
```
Reviewed By: dagitses
Differential Revision: D31447613
Pulled By: vkuzo
fbshipit-source-id: 63b4cf518badfb29ede583a5c2ca823f572c8599
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65674
Before this PR user had to use the eager mode static quantization APIs to quantize Embedding/EmbeddingBag modules.
With this PR they can use either the static or dynamic quantization APIs for Embedding quantization
The only qconfig supported for embedding quantization is float_qparams_weight_only_qconfig whcih is currently enforced in the from_float
method of the quantized Embedding/Embedding modules.
To combine embedding quantization with Linear dynamic quantization, user can use the qconfig_dict to specify different qconfig for each module type.
The prepare/convert APIs can still be used to quantize Embeddings, with the caveat that user need to ensure input to Embedding ops are FP32.
Addresses Issue #65185
ghstack-source-id: 139935419
Test Plan:
python test/test_quantization.py
Imported from OSS
Reviewed By: gchanan
Differential Revision: D31211199
fbshipit-source-id: 8c747881caee5ccbf8b93c6704b08d132049dea4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66058
After the initial migration from `torch.quantization` to `torch.ao.quantization`, some of the files did not change.
This happened because the migration was done in parallel, and some of the files were landed while the others were still in the original location.
This is the last fix in the AO migration phase 1, which completely enables the ao.quantization namespace.
Test Plan: `python test/test_quantization.py`
Reviewed By: vkuzo
Differential Revision: D31366066
Pulled By: z-a-f
fbshipit-source-id: bf4a74885be89d098df2d87e685795a2a64026c5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65699
related to: https://github.com/pytorch/pytorch/pull/65443#discussion_r715132425
The QAT and PAT (pruning aware training) support for embedding bags needs a memoryless observer to work properly. This is necessitated by the changing pruned/non-pruned weights during training which can significantly change the quantization parameters.
This PR adds a memoryless flag to the simpler observer classes (not moving average since those explicitly have memory)
In addition to the above, I altered the reset_min_max_vals
function for MinMaxObserver so that it would preserve the device of the
existing self.min_val and self.max_val which was not preserved
previously compared to how it is initialized (using factory_kwargs)
Test Plan:
python test/test_quantization.py TestObserver
(added test_memoryless_minmaxobserver, test_memoryless_per_channel_minmaxobserver, test_memoryless_histogramobserver)
Imported from OSS
Reviewed By: supriyar
Differential Revision: D31209773
fbshipit-source-id: 44a63298e44880fbd3576f49ac568e781f3fd79a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64919
AO Team is migrating the existing torch.quantization into torch.ao.quantization. We are doing it one file at a time to make sure that the internal callsites are updated properly. This migrates the quantization utilities.
ghstack-source-id: 138303325
Test Plan: `buck test mode/dev //caffe2/test:quantization`
Reviewed By: jerryzh168
Differential Revision: D30899082
fbshipit-source-id: 85eb38c419e417147e71758b682cd095308dd0c9