pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
Jerry Zhang	7ddf212f33	[quant][fx] Fully align convert with the reference model design and simplify the implementation (#73863 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/73863 This PR fully aligns the convert function with the design: https://github.com/pytorch/rfcs/blob/master/RFC-0019-Extending-PyTorch-Quantization-to-Custom-Backends.md and simplifies the implementation of convert function by always produce a reference quantized model (with reference patterns) first, and then lower the model to a quantized model that is runnable with PyTorch native backend (fbgemm/qnnpack). This PR makes the convert.py much easier to understand than the previous implementation, and we are able to remove majority of code in quantization_patterns.py as well (in followup PRs). Test Plan: ``` python test/test_quantization.py TestQuantizeFx python test/test_quantization.py TestQuantizeFxOps python test/test_quantization.py TestFXNumericSuiteCoreAPIs python test/test_quantization.py TestFXNumericSuiteCoreAPIsModels ``` and other internal/oss regression tests Imported from OSS Reviewed By: andrewor14 Differential Revision: D34778506 fbshipit-source-id: 0678b66addf736039a8749b352f6f569caca962b (cherry picked from commit 33ec9caf23f3ab373d827117efbd9db0668b2437)	2022-03-11 17:11:30 +00:00
Charles David Hernandez	39605a5632	[ao] Removing memoryless observer args for MovingAverage (#73947 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/73947 The original implementation of memoryless observers used MinMaxObservers and a memoryless argument to manipulate the behavior of the observer such that it wouldn't keep track of previously observed min and max's. It was later pointed out that this was equivalent to a movingaverageobserver with averaging_constant=1 which is requires less overhead and no 1 off args (memoryless) so this PR refactors the memoryless arg and uses MovingAverage observers instead, although the memoryless adjective is still used, a complete definintion was also added to clarify error messages given these changes. TestPlan python test/test_quantization.py TestQuantizeEagerQAT python test/test_quantization.py TestObserver Test Plan: Imported from OSS Reviewed By: andrewor14 Differential Revision: D34732080 Pulled By: HDCharles fbshipit-source-id: 227a1ab29d18adae55093a684ea35ac34523d07a (cherry picked from commit 5238e70e8f90f3219c36f9c64b647951dcf64b5a)	2022-03-11 00:21:49 +00:00
dzdang	a39e8e8f5e	[Quant][fx] Added explicit entries for for functional and module conv&linear support into get_default_qconfig_dict&get_default_qat_qconfig_dict (#73528 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/73528 Test Plan: Imported from OSS Reviewed By: jerryzh168 Differential Revision: D34535572 Pulled By: dzdang fbshipit-source-id: 883f46e014e47aeba3ea6f9fb401c54e3792b2ac (cherry picked from commit 66713d518295b2e7306561030aa6b7ca049a708c)	2022-03-04 03:29:20 +00:00
Jerry Zhang	5db711f9d3	[quant][be] Replace QConfigDynamic with QConfig in code (#69864 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69864 att, will have a follow up PR that removes QConfigDynamic in the api Test Plan: regression tests ``` python test/test_quantization.py TestPostTrainingStatic python test/test_quantization.py TestPostTrainingDynamic python test/test_quantization.py TestQuantizeFx ``` Imported from OSS Reviewed By: vkuzo Differential Revision: D33073235 fbshipit-source-id: 6c1a1647032453803c55cdad7c04154502f085db	2021-12-17 22:30:57 -08:00
Charles David Hernandez	497ec9d9b8	Getting NS to work with Ferraris (#68908 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/68908 see description in github Test Plan: Imported from OSS Reviewed By: vkuzo Differential Revision: D32928449 fbshipit-source-id: ba7085b823a0ebcd0d9e40f4ac19ca0a2cac1169	2021-12-08 12:26:00 -08:00
Ben Koopman	93aa3603ee	[quant][embedding qat] Re-Land Support Embedding QAT via FX API (#69333 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69333 Original PR reverted due to break with incompatible qengine on Mac OS, this diff fixes that. Support QAT workflow by using torch.fx QAT API. e.g. `prepare_qat_fx` and `convert_fx`. Test Plan: `pytest test/quantization/fx/test_quantize_fx.py -v -k "test_qat_embedding_linear"` Imported from OSS Reviewed By: jingsh Differential Revision: D32814827 fbshipit-source-id: f7a69d2b596f1276dc5860b397c5d5d07e5b9e16	2021-12-08 05:28:07 -08:00
Jerry Zhang	ca945d989a	[quant][graphmode][fx] Add default_replay_qconfig for ops like reshape (#69249 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69249 This PR added default_replay_qconfig and default_replay_observer which is used when we want to configure an operator to reuse the observer from input, if the input Tensor for the operator is not observed, we will not observe the output of this operator either, if the input Tensor is observed, we will observe the output of the operator with the same observer. e.g. ``` x1 = x0.reshape() ``` if reshape is configured with default_replay_qconfig: 1. if x0 is observed with observer_0, we'll observe x1 with the same observer instance 2. if x0 is not observed, we won't observe x1 either Test Plan: ``` python test/test_quantization.py TestQuantizeFx.test_replay_qconfig ``` Imported from OSS Reviewed By: vkuzo Differential Revision: D32774723 fbshipit-source-id: 26862b2bc181d0433e2243daeb3b8f7ec3dd33b2	2021-12-06 22:56:14 -08:00
Nikita Shulga	a0367f8980	Revert D32404517: [quant][embedding qat] Support Embedding QAT via FX API Test Plan: revert-hammer Differential Revision: D32404517 (`abda069ce2`) Original commit changeset: 0484df8c826b fbshipit-source-id: 4e7d62b9ccdb84eb4d184cd0b3c9506013fd8336	2021-12-02 14:28:35 -08:00
Ben Koopman	abda069ce2	[quant][embedding qat] Support Embedding QAT via FX API (#68296 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/68296 Support QAT workflow by using torch.fx QAT API. e.g. `prepare_qat_fx` and `convert_fx`. Test Plan: `pytest test/quantization/fx/test_quantize_fx.py -v -k "test_qat_embedding_linear"` Imported from OSS Reviewed By: jingsh, supriyar Differential Revision: D32404517 fbshipit-source-id: 0484df8c826b823b60dfecd9def77bf8cffe0527	2021-12-02 08:42:45 -08:00
Ben Koopman	f6e45102d2	[quant][embedding qat] Support non-partial functions in qconfig comparison (#68067 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/68067 Embedding QAT uses a NoopObserver class for activation, and a FakeQuant for weight, make sure that qconfig comparison functions properly for a mix of partial function and class in qconfig. Test Plan: `pytest test/quantization/eager/test_quantize_eager_qat.py -v -k "test_embedding_qat_qconfig_equal"` Imported from OSS Reviewed By: HDCharles Differential Revision: D32318434 fbshipit-source-id: c036eef9cbabe7c247745930501328e9c75a8cb0	2021-11-12 12:48:00 -08:00
andrewor	4a8f27445d	[Quant] Add dynamic QAT Linear module (#67325 ) Summary: Summary: This commit adds the `torch.nn.qat.dynamic.modules.Linear` module, the dynamic counterpart to `torch.nn.qat.modules.Linear`. Functionally these are very similar, except the dynamic version expects a memoryless observer and is converted into a dynamically quantized module before inference. Pull Request resolved: https://github.com/pytorch/pytorch/pull/67325 Test Plan: `python3 test/test_quantization.py TestQuantizationAwareTraining.test_dynamic_qat_linear` Reviewers: Charles David Hernandez, Jerry Zhang Subscribers: Charles David Hernandez, Supriya Rao, Yining Lu Tasks: 99696812 Tags: pytorch Reviewed By: malfet, jerryzh168 Differential Revision: D32178739 Pulled By: andrewor14 fbshipit-source-id: 5051bdd7e06071a011e4e7d9cc7769db8d38fd73	2021-11-08 10:24:25 -08:00
Ben Koopman	aa7da7b09c	[quant][embedding qat] Enable quint4 in EmbeddingBag QAT workflow (#66348 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/66348 Test Plan: Imported from OSS Reviewed By: HDCharles Differential Revision: D31691300 Pulled By: b-koopman fbshipit-source-id: 11bd75b608b972394fe9f7c9b7bf034af42f28b5	2021-10-18 08:51:39 -07:00
Vasiliy Kuznetsov	8b1258698e	Improve quantization API docs (#66379 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/66379 Description: Creates a quantization API reference and fixes all the docblock errors. This is #66122 to #66210 squashed together Test Plan: ``` cd docs make html python -m http.server // open webpage, inspect it, looks good ``` Reviewed By: ejguan Differential Revision: D31543172 Pulled By: vkuzo fbshipit-source-id: 9131363d6528337e9f100759654d3f34f02142a9	2021-10-11 18:46:11 -07:00
Mike Ruberry	10633460ce	Revert D31447614: Create a documentation page for `torch.ao.quantization.QConfig` Test Plan: revert-hammer Differential Revision: D31447614 (`7332ed13ed`) Original commit changeset: 5d9dd2a4e864 fbshipit-source-id: 6ac15a956222ca61f7fbb75ed36bcc58b23f0f36	2021-10-10 01:51:09 -07:00
Vasiliy Kuznetsov	7332ed13ed	Create a documentation page for `torch.ao.quantization.QConfig` (#66129 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/66129 Adds a documentation page for `torch.ao.quantization.QConfig`. It is useful for this to have a separate page since it shared between Eager and FX graph mode quantization. Also, ensures that all important functions and module attributes in this module have docstrings, so users can discover these without reading the source code. Test Plan: ``` cd docs make html python -m http.server // open webpage, inspect it, renders correctly ``` Reviewed By: jerryzh168 Differential Revision: D31447614 Pulled By: vkuzo fbshipit-source-id: 5d9dd2a4e8647fa17b96cefbaae5299adede619c	2021-10-09 06:45:58 -07:00
Ben Koopman	a58ff186e8	[quant][embedding qat] Add basic EmbeddingBag QAT fakeQuant workflow (#65443 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/65443 Test Plan: Imported from OSS Reviewed By: dagitses, supriyar Differential Revision: D31456445 Pulled By: b-koopman fbshipit-source-id: 0edda6e272d9005fce65f2ba6a5e6abc831836de	2021-10-07 20:19:29 -07:00
Supriya Rao	8a974a482c	[quant] Add support for quantization of Embedding{Bag} in dynamic quant APIs (#65674 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/65674 Before this PR user had to use the eager mode static quantization APIs to quantize Embedding/EmbeddingBag modules. With this PR they can use either the static or dynamic quantization APIs for Embedding quantization The only qconfig supported for embedding quantization is float_qparams_weight_only_qconfig whcih is currently enforced in the from_float method of the quantized Embedding/Embedding modules. To combine embedding quantization with Linear dynamic quantization, user can use the qconfig_dict to specify different qconfig for each module type. The prepare/convert APIs can still be used to quantize Embeddings, with the caveat that user need to ensure input to Embedding ops are FP32. Addresses Issue #65185 ghstack-source-id: 139935419 Test Plan: python test/test_quantization.py Imported from OSS Reviewed By: gchanan Differential Revision: D31211199 fbshipit-source-id: 8c747881caee5ccbf8b93c6704b08d132049dea4	2021-10-06 23:19:38 -07:00
Zafar	0d020effab	[quant] Fix the parts that were missing after initial migration (#66058 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/66058 After the initial migration from `torch.quantization` to `torch.ao.quantization`, some of the files did not change. This happened because the migration was done in parallel, and some of the files were landed while the others were still in the original location. This is the last fix in the AO migration phase 1, which completely enables the ao.quantization namespace. Test Plan: `python test/test_quantization.py` Reviewed By: vkuzo Differential Revision: D31366066 Pulled By: z-a-f fbshipit-source-id: bf4a74885be89d098df2d87e685795a2a64026c5	2021-10-05 11:45:37 -07:00
Charles David Hernandez	f309f8fbd4	[quant] ao migration of observer and qconfig (#64982 ) Summary: (Had to recreate this diff so it wasn't dependent on the stack) Pull Request resolved: https://github.com/pytorch/pytorch/pull/64982 migration of qconfig.py and observer.py to torch/ao/quantization using new test format ghstack-source-id: 138215256 Test Plan: buck test mode/opt //caffe2/test:quantization https://www.internalfb.com/intern/testinfra/testconsole/testrun/8444249354294701/ buck test mode/dev //caffe2/test:quantization -- TestAOMigrationQuantization https://www.internalfb.com/intern/testinfra/testrun/3940649742829796 Reviewed By: z-a-f Differential Revision: D30982534 fbshipit-source-id: 48d08969b1984311ceb036eac0877c811cd6add9	2021-09-16 10:33:16 -07:00

19 Commits