Commit Graph

19 Commits

Jerry Zhang
7ddf212f33 [quant][fx] Fully align convert with the reference model design and simplify the implementation (#73863)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73863

This PR fully aligns the convert function with the design: https://github.com/pytorch/rfcs/blob/master/RFC-0019-Extending-PyTorch-Quantization-to-Custom-Backends.md
and simplifies the implementation of the convert function by always producing a reference quantized model (with reference patterns) first,
and then lowering the model to a quantized model that is runnable with the PyTorch native backend (fbgemm/qnnpack).

This PR makes convert.py much easier to understand than the previous implementation, and we are able to remove the majority of the code
in quantization_patterns.py as well (in follow-up PRs).
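
A minimal sketch of the resulting two-step flow (assuming the FX graph mode APIs of this era; the `is_reference` flag and the dict-style `qconfig_dict` were later reworked):

```
import copy

import torch
from torch.ao.quantization import get_default_qconfig
from torch.ao.quantization.quantize_fx import convert_fx, prepare_fx

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(4, 4)

    def forward(self, x):
        return self.linear(x)

prepared = prepare_fx(M().eval(), {"": get_default_qconfig("fbgemm")})
prepared(torch.randn(2, 4))  # calibrate observers

# Step 1 only: produce the reference quantized model (reference patterns)
reference = convert_fx(copy.deepcopy(prepared), is_reference=True)

# Default path: produce the reference model first, then lower it to a model
# runnable on the PyTorch native backend (fbgemm/qnnpack)
quantized = convert_fx(prepared)
```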

Test Plan:
```
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
python test/test_quantization.py TestFXNumericSuiteCoreAPIsModels
```
and other internal/oss regression tests

Imported from OSS

Reviewed By: andrewor14

Differential Revision: D34778506

fbshipit-source-id: 0678b66addf736039a8749b352f6f569caca962b
(cherry picked from commit 33ec9caf23f3ab373d827117efbd9db0668b2437)
2022-03-11 17:11:30 +00:00
Charles David Hernandez
39605a5632 [ao] Removing memoryless observer args for MovingAverage (#73947)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73947

The original implementation of memoryless observers used MinMaxObservers and
a memoryless argument to manipulate the behavior of the observer such that it wouldn't
keep track of previously observed mins and maxes. It was later pointed
out that this is equivalent to a MovingAverage observer with averaging_constant=1,
which requires less overhead and no one-off args (memoryless), so this PR refactors
away the memoryless arg and uses MovingAverage observers instead. Although the memoryless
adjective is still used, a complete definition was also added to clarify error
messages given these changes.
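
For illustration, a small sketch of the equivalence (the moving-average update is new_val = old_val + c * (batch_val - old_val), so c=1 discards all history):

```
import torch
from torch.ao.quantization.observer import MovingAverageMinMaxObserver

# With averaging_constant=1 every update fully replaces the running stats,
# i.e. the observer is "memoryless"
obs = MovingAverageMinMaxObserver(averaging_constant=1)
obs(torch.tensor([-2.0, 3.0]))
obs(torch.tensor([-0.5, 1.0]))
print(obs.min_val, obs.max_val)  # reflects only the most recent input
```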

Test Plan:
python test/test_quantization.py TestQuantizeEagerQAT
python test/test_quantization.py TestObserver

Test Plan: Imported from OSS

Reviewed By: andrewor14

Differential Revision: D34732080

Pulled By: HDCharles

fbshipit-source-id: 227a1ab29d18adae55093a684ea35ac34523d07a
(cherry picked from commit 5238e70e8f90f3219c36f9c64b647951dcf64b5a)
2022-03-11 00:21:49 +00:00
dzdang
a39e8e8f5e [Quant][fx] Added explicit entries for functional and module conv&linear support into get_default_qconfig_dict & get_default_qat_qconfig_dict (#73528)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/73528

Test Plan: Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D34535572

Pulled By: dzdang

fbshipit-source-id: 883f46e014e47aeba3ea6f9fb401c54e3792b2ac
(cherry picked from commit 66713d518295b2e7306561030aa6b7ca049a708c)
2022-03-04 03:29:20 +00:00
Jerry Zhang
5db711f9d3 [quant][be] Replace QConfigDynamic with QConfig in code (#69864)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69864

As titled; a follow-up PR will remove QConfigDynamic from the API.
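
For context, a sketch of expressing a dynamic qconfig with the plain QConfig namedtuple (assuming the stock observers; this mirrors how default_dynamic_qconfig is spelled):

```
from torch.ao.quantization import QConfig
from torch.ao.quantization.observer import (
    default_dynamic_quant_observer,
    default_weight_observer,
)

# What used to be QConfigDynamic(...) can be written with the plain QConfig
# namedtuple; the placeholder activation observer signals dynamic quantization
my_dynamic_qconfig = QConfig(
    activation=default_dynamic_quant_observer,
    weight=default_weight_observer,
)
```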

Test Plan:
regression tests
```
python test/test_quantization.py TestPostTrainingStatic
python test/test_quantization.py TestPostTrainingDynamic
python test/test_quantization.py TestQuantizeFx
```

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D33073235

fbshipit-source-id: 6c1a1647032453803c55cdad7c04154502f085db
2021-12-17 22:30:57 -08:00
Charles David Hernandez
497ec9d9b8 Getting NS to work with Ferraris (#68908)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68908

See the description on GitHub.

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D32928449

fbshipit-source-id: ba7085b823a0ebcd0d9e40f4ac19ca0a2cac1169
2021-12-08 12:26:00 -08:00
Ben Koopman
93aa3603ee [quant][embedding qat] Re-Land Support Embedding QAT via FX API (#69333)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69333

The original PR was reverted due to a breakage with an incompatible qengine on macOS; this diff fixes that.

Supports the QAT workflow using the torch.fx QAT APIs, e.g. `prepare_qat_fx` and `convert_fx`.
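
A minimal sketch of that workflow (assuming `default_embedding_qat_qconfig` and the dict-style `qconfig_dict` of this era):

```
import torch
from torch.ao.quantization import (
    default_embedding_qat_qconfig,
    get_default_qat_qconfig,
)
from torch.ao.quantization.quantize_fx import convert_fx, prepare_qat_fx

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = torch.nn.Embedding(num_embeddings=10, embedding_dim=4)
        self.linear = torch.nn.Linear(4, 4)

    def forward(self, indices):
        return self.linear(self.emb(indices))

qconfig_dict = {
    "": get_default_qat_qconfig("qnnpack"),
    # embeddings get a weight-only fake-quant qconfig
    "object_type": [(torch.nn.Embedding, default_embedding_qat_qconfig)],
}
prepared = prepare_qat_fx(M().train(), qconfig_dict)
# ... QAT training loop ...
quantized = convert_fx(prepared.eval())
```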

Test Plan:
`pytest test/quantization/fx/test_quantize_fx.py -v -k "test_qat_embedding_linear"`

Imported from OSS

Reviewed By: jingsh

Differential Revision: D32814827

fbshipit-source-id: f7a69d2b596f1276dc5860b397c5d5d07e5b9e16
2021-12-08 05:28:07 -08:00
Jerry Zhang
ca945d989a [quant][graphmode][fx] Add default_replay_qconfig for ops like reshape (#69249)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69249

This PR added default_replay_qconfig and default_replay_observer, which are used
when we want to configure an operator to reuse the observer from its input: if the input
Tensor of the operator is not observed, we will not observe the output of the operator either;
if the input Tensor is observed, we will observe the output of the operator with the same observer.

e.g.

```
x1 = x0.reshape()
```
if reshape is configured with default_replay_qconfig:
1. if x0 is observed with observer_0, we'll observe x1 with the same observer instance
2. if x0 is not observed, we won't observe x1 either
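
A hypothetical configuration sketch (the import path for `default_replay_qconfig` and the method-name `object_type` entry are assumptions based on the qconfig_dict conventions of the time):

```
import torch
from torch.ao.quantization import get_default_qconfig
# introduced by this PR; path assumed, name may have changed later
from torch.ao.quantization.qconfig import default_replay_qconfig
from torch.ao.quantization.quantize_fx import prepare_fx

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(4, 4)

    def forward(self, x):
        return self.linear(x).reshape(-1)

qconfig_dict = {
    "": get_default_qconfig("fbgemm"),
    # reshape reuses its input's observer (or stays unobserved if the
    # input is unobserved) instead of getting a fresh observer
    "object_type": [("reshape", default_replay_qconfig)],
}
prepared = prepare_fx(M().eval(), qconfig_dict)
```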

Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_replay_qconfig
```

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D32774723

fbshipit-source-id: 26862b2bc181d0433e2243daeb3b8f7ec3dd33b2
2021-12-06 22:56:14 -08:00
Nikita Shulga
a0367f8980 Revert D32404517: [quant][embedding qat] Support Embedding QAT via FX API
Test Plan: revert-hammer

Differential Revision:
D32404517 (abda069ce2)

Original commit changeset: 0484df8c826b

fbshipit-source-id: 4e7d62b9ccdb84eb4d184cd0b3c9506013fd8336
2021-12-02 14:28:35 -08:00
Ben Koopman
abda069ce2 [quant][embedding qat] Support Embedding QAT via FX API (#68296)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68296

Supports the QAT workflow using the torch.fx QAT APIs, e.g. `prepare_qat_fx` and `convert_fx`.

Test Plan:
`pytest test/quantization/fx/test_quantize_fx.py -v -k "test_qat_embedding_linear"`

Imported from OSS

Reviewed By: jingsh, supriyar

Differential Revision: D32404517

fbshipit-source-id: 0484df8c826b823b60dfecd9def77bf8cffe0527
2021-12-02 08:42:45 -08:00
Ben Koopman
f6e45102d2 [quant][embedding qat] Support non-partial functions in qconfig comparison (#68067)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68067

Embedding QAT uses a NoopObserver class for activation
and a FakeQuant for weight; this makes sure that qconfig comparison
functions properly for a mix of a partial function and a class in a
qconfig.
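
A small sketch of the case being exercised, assuming `qconfig_equals` as the comparison helper:

```
from torch.ao.quantization import FakeQuantize, QConfig
from torch.ao.quantization.observer import NoopObserver
from torch.ao.quantization.qconfig import qconfig_equals

def make_qconfig():
    # activation is a bare class, weight is a partial-style factory;
    # the comparison has to handle both forms within one qconfig
    return QConfig(activation=NoopObserver, weight=FakeQuantize.with_args())

assert qconfig_equals(make_qconfig(), make_qconfig())
```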

Test Plan:
`pytest test/quantization/eager/test_quantize_eager_qat.py  -v -k "test_embedding_qat_qconfig_equal"`

Imported from OSS

Reviewed By: HDCharles

Differential Revision: D32318434

fbshipit-source-id: c036eef9cbabe7c247745930501328e9c75a8cb0
2021-11-12 12:48:00 -08:00
andrewor
4a8f27445d [Quant] Add dynamic QAT Linear module (#67325)
Summary:
**Summary:** This commit adds the `torch.nn.qat.dynamic.modules.Linear`
module, the dynamic counterpart to `torch.nn.qat.modules.Linear`.
Functionally these are very similar, except the dynamic version
expects a memoryless observer and is converted into a dynamically
quantized module before inference.
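
A sketch of constructing one directly (assuming the module accepts a qconfig at construction, and reading "memoryless" as averaging_constant=1 per the MovingAverage commit above):

```
import torch
from torch.ao.quantization import (
    FakeQuantize,
    MovingAverageMinMaxObserver,
    QConfig,
    default_weight_fake_quant,
)
from torch.nn.qat.dynamic import Linear as DynamicQATLinear

# The activation fake-quant wraps a memoryless observer, since dynamic
# quantization recomputes activation qparams at inference time
dynamic_qat_qconfig = QConfig(
    activation=FakeQuantize.with_args(
        observer=MovingAverageMinMaxObserver, averaging_constant=1
    ),
    weight=default_weight_fake_quant,
)
linear = DynamicQATLinear(4, 4, qconfig=dynamic_qat_qconfig)
```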

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67325

Test Plan:
`python3 test/test_quantization.py TestQuantizationAwareTraining.test_dynamic_qat_linear`

**Reviewers:** Charles David Hernandez, Jerry Zhang

**Subscribers:** Charles David Hernandez, Supriya Rao, Yining Lu

**Tasks:** 99696812

**Tags:** pytorch

Reviewed By: malfet, jerryzh168

Differential Revision: D32178739

Pulled By: andrewor14

fbshipit-source-id: 5051bdd7e06071a011e4e7d9cc7769db8d38fd73
2021-11-08 10:24:25 -08:00
Ben Koopman
aa7da7b09c [quant][embedding qat] Enable quint4 in EmbeddingBag QAT workflow (#66348)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/66348

Test Plan: Imported from OSS

Reviewed By: HDCharles

Differential Revision: D31691300

Pulled By: b-koopman

fbshipit-source-id: 11bd75b608b972394fe9f7c9b7bf034af42f28b5
2021-10-18 08:51:39 -07:00
Vasiliy Kuznetsov
8b1258698e Improve quantization API docs (#66379)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66379

Description:

Creates a quantization API reference and fixes all the docblock errors.

This is #66122 to #66210 squashed together

Test Plan:
```
cd docs
make html
python -m http.server
// open webpage, inspect it, looks good
```

Reviewed By: ejguan

Differential Revision: D31543172

Pulled By: vkuzo

fbshipit-source-id: 9131363d6528337e9f100759654d3f34f02142a9
2021-10-11 18:46:11 -07:00
Mike Ruberry
10633460ce Revert D31447614: Create a documentation page for torch.ao.quantization.QConfig
Test Plan: revert-hammer

Differential Revision:
D31447614 (7332ed13ed)

Original commit changeset: 5d9dd2a4e864

fbshipit-source-id: 6ac15a956222ca61f7fbb75ed36bcc58b23f0f36
2021-10-10 01:51:09 -07:00
Vasiliy Kuznetsov
7332ed13ed Create a documentation page for torch.ao.quantization.QConfig (#66129)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66129

Adds a documentation page for `torch.ao.quantization.QConfig`. It is useful
for this to have a separate page since it is shared between Eager and FX graph
mode quantization.

Also, ensures that all important functions and module attributes in this
module have docstrings, so users can discover these without reading the
source code.

Test Plan:
```
cd docs
make html
python -m http.server
// open webpage, inspect it, renders correctly
```

Reviewed By: jerryzh168

Differential Revision: D31447614

Pulled By: vkuzo

fbshipit-source-id: 5d9dd2a4e8647fa17b96cefbaae5299adede619c
2021-10-09 06:45:58 -07:00
Ben Koopman
a58ff186e8 [quant][embedding qat] Add basic EmbeddingBag QAT fakeQuant workflow (#65443)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/65443

Test Plan: Imported from OSS

Reviewed By: dagitses, supriyar

Differential Revision: D31456445

Pulled By: b-koopman

fbshipit-source-id: 0edda6e272d9005fce65f2ba6a5e6abc831836de
2021-10-07 20:19:29 -07:00
Supriya Rao
8a974a482c [quant] Add support for quantization of Embedding{Bag} in dynamic quant APIs (#65674)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65674

Before this PR, users had to use the eager mode static quantization APIs to quantize Embedding/EmbeddingBag modules.
With this PR they can use either the static or dynamic quantization APIs for Embedding quantization.

The only qconfig supported for embedding quantization is float_qparams_weight_only_qconfig, which is currently enforced in the from_float
method of the quantized Embedding/EmbeddingBag modules.

To combine embedding quantization with Linear dynamic quantization, users can use the qconfig_dict to specify a different qconfig for each module type.

The prepare/convert APIs can still be used to quantize Embeddings, with the caveat that users need to ensure inputs to Embedding ops are FP32.
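
A minimal sketch with the dynamic quantization API (the toy module is illustrative):

```
import torch
from torch.ao.quantization import (
    default_dynamic_qconfig,
    float_qparams_weight_only_qconfig,
    quantize_dynamic,
)

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = torch.nn.EmbeddingBag(num_embeddings=10, embedding_dim=12)
        self.fc = torch.nn.Linear(12, 2)

    def forward(self, indices, offsets):
        return self.fc(self.emb(indices, offsets))

# per-type qconfigs: weight-only for EmbeddingBag, dynamic for Linear
quantized = quantize_dynamic(
    M().eval(),
    qconfig_spec={
        torch.nn.EmbeddingBag: float_qparams_weight_only_qconfig,
        torch.nn.Linear: default_dynamic_qconfig,
    },
)
```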

Addresses Issue #65185
ghstack-source-id: 139935419

Test Plan:
python test/test_quantization.py

Imported from OSS

Reviewed By: gchanan

Differential Revision: D31211199

fbshipit-source-id: 8c747881caee5ccbf8b93c6704b08d132049dea4
2021-10-06 23:19:38 -07:00
Zafar
0d020effab [quant] Fix the parts that were missing after initial migration (#66058)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66058

After the initial migration from `torch.quantization` to `torch.ao.quantization`, some of the files did not change.
This happened because the migration was done in parallel, and some of the files were landed while the others were still in the original location.
This is the last fix in the AO migration phase 1, which completely enables the ao.quantization namespace.

Test Plan: `python test/test_quantization.py`

Reviewed By: vkuzo

Differential Revision: D31366066

Pulled By: z-a-f

fbshipit-source-id: bf4a74885be89d098df2d87e685795a2a64026c5
2021-10-05 11:45:37 -07:00
Charles David Hernandez
f309f8fbd4 [quant] ao migration of observer and qconfig (#64982)
Summary:
(Had to recreate this diff so it wasn't dependent on the stack)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64982

Migration of qconfig.py and observer.py to torch/ao/quantization using the new test format
ghstack-source-id: 138215256

Test Plan:
buck test mode/opt //caffe2/test:quantization

https://www.internalfb.com/intern/testinfra/testconsole/testrun/8444249354294701/

buck test mode/dev //caffe2/test:quantization -- TestAOMigrationQuantization

https://www.internalfb.com/intern/testinfra/testrun/3940649742829796

Reviewed By: z-a-f

Differential Revision: D30982534

fbshipit-source-id: 48d08969b1984311ceb036eac0877c811cd6add9
2021-09-16 10:33:16 -07:00