Summary: Previously, we automatically moved the model to CPU in
torch.ao.quantization.fx.convert to work around the issue where
certain functions called by convert expect CPU arguments. This
commit pushes this responsibility to the caller, since it is the
user's decision which device to use.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
BC-breaking Notes:
Before:
```
model = resnet18(...)
model = prepare_fx(model, qconfig_mapping, example_inputs)
... # calibrate
model = convert_fx(model)
```
After:
```
model = resnet18(...)
model.cpu()
model = prepare_fx(model, qconfig_mapping, example_inputs)
... # calibrate
model = convert_fx(model)
```
Reviewers: jerryzh168
Subscribers: jerryzh168
Differential Revision: [D37528830](https://our.internmc.facebook.com/intern/diff/D37528830)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80555
Approved by: https://github.com/jerryzh168
Add prelu op and module for quantized CPU backend.
The PR includes (a usage sketch follows the list):
- Quantized version of prelu op
- Native prelu kernel for quantized CPU
- Prelu modules in `nn` and `nn.quantized`
- FX support for prelu
- Unit tests
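A minimal usage sketch, illustrative only; the qconfig-mapping helper and the exact module placement are assumptions based on the torch.ao.quantization APIs described elsewhere in this log, not part of this PR's text:
```
# Illustrative sketch: quantize a tiny model containing nn.PReLU with FX graph
# mode quantization. The qconfig-mapping helper is an assumption, not from this PR.
import torch
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

class Model(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(4, 4)
        self.prelu = torch.nn.PReLU()

    def forward(self, x):
        return self.prelu(self.linear(x))

model = Model().eval()
example_inputs = (torch.randn(1, 4),)
prepared = prepare_fx(model, get_default_qconfig_mapping("fbgemm"), example_inputs)
prepared(*example_inputs)  # calibrate
quantized = convert_fx(prepared)  # PReLU should be swapped for its quantized counterpart
```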
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73491
Approved by: https://github.com/jerryzh168
Summary: This commit adds qconfigs with special observers for fixed
qparams ops in get_default_qconfig_mapping and
get_default_qat_qconfig_mapping. For correctness, we also require
users to use these special observers when we detect these fixed
qparams ops during prepare.
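An illustrative sketch, not from the original commit message; it assumes the QConfigMapping.object_type_qconfigs attribute and that torch.nn.Sigmoid is among the fixed qparams ops:
```
# Hedged sketch: fixed-qparams ops such as torch.nn.Sigmoid should now map to
# qconfigs whose observers pin the expected scale/zero_point.
import torch
from torch.ao.quantization import get_default_qconfig_mapping

qconfig_mapping = get_default_qconfig_mapping("fbgemm")
print(qconfig_mapping.object_type_qconfigs.get(torch.nn.Sigmoid))
```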
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Reviewers: jerryzh168, vkuzo
Subscribers: jerryzh168, vkuzo
Differential Revision: [D37396379](https://our.internmc.facebook.com/intern/diff/D37396379)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80184
Approved by: https://github.com/jerryzh168
Summary: This PR removes the is_reference flag from the existing
convert_fx API and replaces it with a new convert_to_reference
function. This separates (1) converting the prepared model to a
reference model from (2) lowering the reference model to a quantized
model, enabling users to call their custom lowering function for
custom backends. For the native fbgemm backend, for example, the
following are equivalent:
```
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx
prepared = prepare_fx(model, ...)
quantized = convert_fx(prepared, ...)
```
```
from torch.ao.quantization.fx import lower_to_fbgemm
from torch.ao.quantization.quantize_fx import (
    prepare_fx,
    convert_to_reference,
)
prepared = prepare_fx(model, ...)
reference = convert_to_reference(prepared, ...)
quantized = lower_to_fbgemm(reference, ...)
```
Note that currently `lower_to_fbgemm` takes in two other arguments
that are difficult for users to provide. A future commit will remove
these arguments to make the helper function more user friendly.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Reviewers: jerryzh168, vkuzo
Subscribers: jerryzh168, vkuzo
Differential Revision: [D37359946](https://our.internmc.facebook.com/intern/diff/D37359946)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80091
Approved by: https://github.com/jerryzh168
Summary:
In https://github.com/pytorch/pytorch/pull/74137, the MKLDNN
quantized backend was added to PyTorch.
Sometime in the past couple of days, MKLDNN got enabled on my Mac OS
machine. This uncovered issues in FX graph mode quantization testing,
as we were only testing for fbgemm and qnnpack, and some of the tests
that were assuming fbgemm started silently going through the MKLDNN
path. Since the requirements for MKLDNN are different, the tests started
to fail.
This PR unbreaks the minimal set of tests needed to get a clean
test run on my machine.
In the future, it would be great to add testing for MKLDNN specifically,
and also audit all of the current quantization tests which are assuming
fbgemm to set the backend properly.
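A hedged sketch of the kind of explicit backend pinning such an audit would add; the helpers are standard torch.backends.quantized / torch.ao.quantization APIs, not text from this PR:
```
# Hedged sketch: pin the quantized engine explicitly in a test instead of relying
# on whichever backend happens to be compiled in on the current machine.
import torch
from torch.ao.quantization import get_default_qconfig

if "fbgemm" in torch.backends.quantized.supported_engines:
    torch.backends.quantized.engine = "fbgemm"
    qconfig = get_default_qconfig("fbgemm")
```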
Test plan:
```
python test/test_quantization.py -k Fx
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79718
Approved by: https://github.com/jerryzh168
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79066
Following https://github.com/pytorch/pytorch/pull/78452,
this commit replaces the following config dicts with python objects:
- prepare_custom_config_dict -> PrepareCustomConfig
- convert_custom_config_dict -> ConvertCustomConfig
- fuse_custom_config_dict -> FuseCustomConfig
This leads to better type safety and better user experience in
notebook settings due to improved auto completion. The new APIs
are as follows:
```
from torch.ao.quantization.fx.custom_config import (
    ConvertCustomConfig,
    PrepareCustomConfig,
)

prepare_custom_config = PrepareCustomConfig() \
    .set_float_to_observed_mapping(float_class, observed_class) \
    .set_non_traceable_module_names(["mod1", "mod2"]) \
    .set_non_traceable_module_classes([class1, class2]) \
    .set_input_quantized_indexes([0, 1]) \
    .set_output_quantized_indexes([0]) \
    .set_preserved_attributes(["attr1", "attr2"])

convert_custom_config = ConvertCustomConfig() \
    .set_observed_to_quantized_mapping(observed_class, quantized_class) \
    .set_preserved_attributes(["attr1", "attr2"])

model = prepare_fx(
    model,
    qconfig_mapping,
    example_inputs,
    prepare_custom_config=prepare_custom_config)
model(data)
model = convert_fx(model, convert_custom_config=convert_custom_config)
```
For backwards compatibility, prepare_fx, prepare_qat_fx, and
convert_fx will continue to accept Dicts, which will be converted
to the relevant *CustomConfig object internally.
Note that this commit does not modify existing tests to use the
new API; they will continue to pass in Dicts as before, which still
works but triggers a deprecation warning. This will be handled in
a future commit.
Differential Revision: [D37088095](https://our.internmc.facebook.com/intern/diff/D37088095/)
Approved by: https://github.com/jerryzh168
Summary: This follows https://github.com/pytorch/pytorch/pull/78452,
which replaced the qconfig_dict with QConfigMapping. This PR
additionally replaces get_default_*qconfig_dict with
get_default_*qconfig_mapping. For backward compatibility, we
deprecate the old functions instead of removing them.
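A hedged migration sketch (model and example_inputs are placeholders; the helper names follow torch.ao.quantization after this change):
```
# Hedged sketch: the new mapping getter replaces get_default_qconfig_dict.
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.quantize_fx import prepare_fx

qconfig_mapping = get_default_qconfig_mapping("fbgemm")
model = prepare_fx(model, qconfig_mapping, example_inputs)
```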
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Reviewers: jerryzh168, vkuzo
Subscribers: jerryzh168, vkuzo, supriyar
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79618
Approved by: https://github.com/jerryzh168
Summary:
The fbgemm and qnnpack backends mostly support ops with quint8 activations.
Historically, the default backend config has included ops with fp16 activations
for other backends. This PR keeps the old config under a different name to keep
the functionality tested, and makes the default config match fbgemm/qnnpack ops.
Test plan:
```
python test/test_quantization.py -k TestQuantizeFx
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78528
Approved by: https://github.com/andrewor14
**Summary:** Previously, FX graph mode quantization configurations
were specified through a dictionary of qconfigs. However, this
API was not in line with other core APIs in PyTorch. This commit
replaces this dictionary with a config object that users will
create and pass to prepare and convert. This leads to better
type safety and better user experience in notebook settings
due to improved auto completion.
The new API is as follows:
```
import torch
from torch.ao.quantization import QConfigMapping
from torch.ao.quantization.quantize_fx import prepare_fx

qconfig_mapping = QConfigMapping() \
    .set_global(qconfig) \
    .set_object_type(torch.nn.Linear, qconfig) \
    .set_module_name_regex("foo.*bar", qconfig) \
    .set_module_name("mod", qconfig)

prepare_fx(model, qconfig_mapping, example_inputs)
```
For backwards compatibility, `prepare_fx`, `prepare_qat_fx`,
and `convert_fx` will continue to accept qconfig_dicts, which
will be converted to QConfigMapping objects internally.
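For illustration, a hedged sketch of the deprecated dict form that remains accepted; the key names are recalled from the pre-QConfigMapping schema and are not quoted from this commit:
```
# Hedged sketch of the deprecated qconfig_dict equivalent of the mapping above.
qconfig_dict = {
    "": qconfig,                                   # global
    "object_type": [(torch.nn.Linear, qconfig)],
    "module_name_regex": [("foo.*bar", qconfig)],
    "module_name": [("mod", qconfig)],
}
prepare_fx(model, qconfig_dict, example_inputs)  # still accepted, emits a deprecation warning
```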
Note that this commit does not modify existing tests to use the
new API; they will continue to pass in qconfig_dict as before,
which still works but triggers a deprecation warning. This will
be handled in a future commit.
**Test Plan:**
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
**Reviewers:** jerryzh168, vkuzo
**Subscribers:** jerryzh168, vkuzo
Differential Revision: D36747998
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78452
Approved by: https://github.com/jerryzh168
Summary:
The FakeQuantize class has quant_min/quant_max and activation_post_process
attributes, the latter of which already includes quant_min/quant_max. As such,
we can remove quant_min/quant_max from FakeQuantize and use
FakeQuantize.activation_post_process.quant_m* directly.
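A small hedged sketch, not from the commit message, of reading the values through the attached observer after this change:
```
# Hedged sketch: quant_min/quant_max now live on the attached observer and are
# reachable through activation_post_process.
from torch.ao.quantization import FakeQuantize, MovingAverageMinMaxObserver

fq = FakeQuantize(observer=MovingAverageMinMaxObserver, quant_min=0, quant_max=255)
print(fq.activation_post_process.quant_min, fq.activation_post_process.quant_max)
```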
Test plan:
```
python test/test_quantization.py
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76674
Approved by: https://github.com/vkuzo
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76637
The previous names `default_affine_fixed_qparams_observer`
and `default_symmetric_fixed_qparams_observer` were uninformative, and users had to read
the definitions in order to understand what these observers are. The new
naming convention reveals information about the range of the observers.
Analogous changes were also made for
`default_symmetric_fixed_qparams_fake_quant` and
`default_affine_fixed_qparams_fake_quant`.
Test Plan:
```
python test/test_quantization.py
```
Differential Revision: D36054169
Reviewed By: vkuzo
Pulled By: dzdang
fbshipit-source-id: 215f7786a4b7abda7327f17cc61735697ec5cca9
(cherry picked from commit 21a4e6eda4467c8adca7fd534a506a14e975f9cf)
Summary: Calling `prepare_fx` with `get_default_qconfig_dict`
failed for models with fused modules, such as `ConvReLU2d`.
This commit fixes this by adding qconfig entries for ReLU
and BatchNorm as well.
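A hedged sketch of the previously failing setup; the newer qconfig-mapping helper is used here as a stand-in for get_default_qconfig_dict:
```
# Hedged sketch: an eager-fused ConvReLU2d prepared with the default qconfig mapping.
import torch
from torch.ao.quantization import fuse_modules, get_default_qconfig_mapping
from torch.ao.quantization.quantize_fx import prepare_fx

model = torch.nn.Sequential(torch.nn.Conv2d(3, 3, 3), torch.nn.ReLU()).eval()
model = fuse_modules(model, [["0", "1"]])  # produces nn.intrinsic.ConvReLU2d
example_inputs = (torch.randn(1, 3, 8, 8),)
prepared = prepare_fx(model, get_default_qconfig_mapping("fbgemm"), example_inputs)
```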
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_qconfig_dict_with_fused_modules
Reviewers: jerryzh168
Subscribers: jerryzh168, vkuzo
Issue: https://github.com/pytorch/pytorch/issues/75825
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75838
Approved by: https://github.com/jerryzh168
Summary:
Previously the list of QAT modules, fused modules, etc. was hardcoded in the convert code; in this PR we get this information
from backend_config_dict instead.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestQuantizeFxModels
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75520
Approved by: https://github.com/vkuzo
Summary:
Previously we were still relying on the registration mechanism and the default quantize handlers that were registered.
Now that we have moved all registrations to backend_config_dict, we can get all quant patterns from backend_config_dict directly.
This PR enables using the native backend_config_dict everywhere in prepare when backend_config_dict is None; we'll
make similar changes in convert as well.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75469
Approved by: https://github.com/vkuzo
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75389
This seems to have been removed before, so we won't mark this PR as BC-breaking; this use case
is now enabled with the backend_config_dict API.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D35451960
fbshipit-source-id: 21a8f19c1968af44bf4fa603f16ee8c6f5080e5a
(cherry picked from commit 2862f17b57f846b55736bc6b5d10df4256567adf)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75401
This commit removes asserts that require prepare_fx to
be run in eval mode and prepare_qat_fx to be run in training mode.
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_prepare_mode
Imported from OSS
Reviewed By: vkuzo, jerryzh168
Differential Revision: D35457100
fbshipit-source-id: 13a55b13d9e389991f69c06c6a70bc51cdebba36
(cherry picked from commit fb0685e0873dc8e807da3213be403b51e8b4a687)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75314
This is a refactor to use backend_config_dict for operators with fixed quantization parameters.
The API is not final yet; we'll update it after we have moved everything to backend_config_dict.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D35423790
fbshipit-source-id: a69ce19340e2e3c996f1435b887ba122de85f22f
(cherry picked from commit 5d35983a3bac4281f8636f69ffb68adb358e9a5f)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75258
As titled: the remaining registrations are for fp16 ops, which are no longer used.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: andrewor14
Differential Revision: D35403588
fbshipit-source-id: fc328d42f4cb80901ed545a11fdde49ee7ff8b2e
(cherry picked from commit fbe2db090cf8d1221dd37d19636058d8dd44c728)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75241
A previous PR enabled operator.add in backend_config_dict; this
PR moves the rest of the binary ops to backend_config_dict.
There are some ops left that are not needed (previously fp16 ops); we
will move them in a following PR.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
Imported from OSS
Reviewed By: bdhirsh
Differential Revision: D35403589
fbshipit-source-id: 663703b310944a6b7c5ade6d07a4d938a6ca082b
(cherry picked from commit 5a76ce031872c4fed5fcab5bb3c84a9394b01118)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75135
Some operators have fixed quantization parameters; this PR adds support for overriding the
qconfig in backend_config_dict.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D35334279
fbshipit-source-id: 390510bd8fc2d61004c36c54390989583e6519ce
(cherry picked from commit ccf9bcd7eb4564ec97c5e0548b8ee926f640360b)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74843
is_output_quantized is used to check whether we should quantize the op based on the dtype configuration in the qconfig and what
is supported by the backend; we skip inserting an observer if the dtype configuration is not supported by the backend.
This is now handled by backend_config_dict, so we can remove this function.
We also previously supported fp16 static quantization for some ops for one of our internal use cases; it is no longer
required, so we can remove those as well.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: andrewor14
Differential Revision: D35190541
fbshipit-source-id: 623d961810737ec01e1f8b269ec48a6a99bb284a
(cherry picked from commit a405998c60c0146dbd5feef60e2d5cb3b0aa289c)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74277
See issue: https://github.com/pytorch/pytorch/issues/74240.
This fixes that issue by skipping the children of untraceable modules during
propagate_qconfig. This required extending that function to take
prepare_custom_config_dict as an optional argument.
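A hedged sketch using the newer PrepareCustomConfig API shown earlier in this log; model, qconfig_mapping, and example_inputs are placeholders, and the original fix operated on prepare_custom_config_dict:
```
# Hedged sketch: mark a submodule as non-traceable so prepare skips it, and with
# this fix its children, during qconfig propagation.
from torch.ao.quantization.fx.custom_config import PrepareCustomConfig
from torch.ao.quantization.quantize_fx import prepare_qat_fx

prepare_custom_config = PrepareCustomConfig().set_non_traceable_module_names(["untraceable_submodule"])
prepared = prepare_qat_fx(
    model,
    qconfig_mapping,
    example_inputs,
    prepare_custom_config=prepare_custom_config)
```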
Test Plan:
python test/test_quantization.py
python test/test_quantization.py TestQuantizeFx.test_qat_skip_untraced
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D34916074
fbshipit-source-id: 11caba2cbf78566fb51adf698b01bbba0275de28
(cherry picked from commit 5324c48e4c3277bb12a716a4408151c86006ee47)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74600
Following https://github.com/pytorch/pytorch/pull/74210, this PR adds support for some ops
using the DefaultNodeQuantizeHandler in the backend_config_dict definition for the PyTorch native backend.
TODO: there are still a few ops we haven't handled through the backend_config_dict path: gelu and softmax. We need to discuss whether we still need them; if so, we can change the test
to use backend_config_dict and remove the DefaultNodeQuantizeHandler after that.
Test Plan:
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: andrewor14
Differential Revision: D35071437
fbshipit-source-id: 70351d2810ca1ac7dc09d4a9c239f6757ccb51ca
(cherry picked from commit 5e68f755a32ba7d90d6c73db9c2017f9c58d7fa5)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74510
Previously we required the dequantize node before a custom module to have exactly one user. This is because we were removing that dequantize node
while transforming an observed custom module into a quantized custom module. In fact we don't need to remove it;
we can just change the custom module's input to the quantize node instead. If the dequantize node only has one user, it will be removed
by the dead code elimination pass that was added recently.
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_custom_module_class_input_has_multiple_users
Imported from OSS
Reviewed By: dzdang
Differential Revision: D35034626
fbshipit-source-id: eea9fbf9fb34c61f114c6431377be347632ce36d
(cherry picked from commit 2878085a56bc529afef5e533bc5f49079d4adc52)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74364
If an input is used multiple times by modules that are dynamically quantized:
```
x -- linear1
\-- linear2
```
we'll insert quantize_per_tensor_dynamic and dequantize for the input, and we have a duplication pass
that duplicates the dequantize ops for pattern matching:
```
x - quantize_per_tensor_dynamic - dequantize1 - linear1
\----- dequantize2 - linear2
```
But the lowering code also has a check that skips the pattern when quantize_per_tensor_dynamic is used by multiple nodes,
so the pattern is not recognized. We need to duplicate quantize_per_tensor_dynamic as well in this case
to recover both patterns:
```
x - quantize_per_tensor_dynamic1 -- dequantize1 -- linear1
\- quantize_per_tensor_dynamic2 -- dequantize2 -- linear2
```
so that they can be fused into dynamic linear:
```
x - linear_dynamic1
\-- linear_dynamic2
```
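A hedged sketch, not from the commit message, of the model shape this fix targets, using the QConfigMapping API shown earlier in this log:
```
# Hedged sketch: one input feeding two dynamically quantized linear layers.
import torch
from torch.ao.quantization import QConfigMapping, default_dynamic_qconfig
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

class TwoLinears(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear1 = torch.nn.Linear(4, 4)
        self.linear2 = torch.nn.Linear(4, 4)

    def forward(self, x):
        return self.linear1(x), self.linear2(x)

qconfig_mapping = QConfigMapping().set_object_type(torch.nn.Linear, default_dynamic_qconfig)
example_inputs = (torch.randn(1, 4),)
m = prepare_fx(TwoLinears().eval(), qconfig_mapping, example_inputs)
m = convert_fx(m)  # both branches should lower to dynamic quantized linear
```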
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_dynamic_linear_input_multiple_use
Imported from OSS
Reviewed By: yixin94
Differential Revision: D34952755
fbshipit-source-id: a950159fd6a661e84faf0baf1692f6783904cfb3
(cherry picked from commit 8a6896801fdd96a55476faca4ccb7ba0b0bdb058)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74231
Add a check to make sure the weighted module we swap is actually a float fused module,
since a reference fused module, e.g. the reference version of linear - relu, would have the same
fused type as the floating point linear - relu (while the linear submodule will have a different type).
Test Plan: Phabricator diff for now; we can add a test case after we know exactly what the problem is.
Reviewed By: andrewor14
Differential Revision: D34888290
fbshipit-source-id: a7f53368a7c17f7d1a82afaa50d14d569b4923df
(cherry picked from commit 458dac9fdf8b4f0d786bf9c815c2f2fe8df13bb4)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74198
As titled: currently in the (add, X, MatchAllNode) pattern, the node matched with MatchAllNode is regarded as part of the pattern instead of as an input. As a result, patterns ending with that node will not be matched.
For instance, we have two patterns
1. (nn.ReLU, (torch.add, MatchAllNode, (nn.BatchNorm2d, nn.Conv2d)))
2. (nn.ReLU, (nn.BatchNorm2d, nn.Conv2d))
and we want to fuse the following model
Conv2d -> BatchNorm2d -> ReLU +
Conv2d -> BatchNorm2d ------ Add -> ReLU
The pattern in the first row cannot be matched because the end node ReLU is already recorded as a MatchAllNode.
Test Plan:
new unit test
```
[jiaxuzhu@devvm3400.frc0 /data/users/jiaxuzhu/fbsource/fbcode] buck test mode/dev //caffe2/test:quantization_fx -- --exact 'caffe2/test:quantization_fx - test_fusion_pattern_with_matchallnode (quantization.fx.test_quantize_fx.TestFuseFx)'
Parsing buck files: finished in 0.9 sec
Downloaded 0/2 artifacts, 0.00 bytes, 100.0% cache miss (for updated rules)
Building: finished in 12.6 sec (100%) 18546/84011 jobs, 2/84011 updated
Total time: 13.5 sec
More details at https://www.internalfb.com/intern/buck/build/9d2decdb-d01e-4332-84f5-1728a65d4f7b
BUILD SUCCEEDED
Tpx test run coordinator for Facebook. See https://fburl.com/tpx for details.
Running with tpx session id: d92e10b8-9209-4e9e-95a6-2fcac02db251
Trace available for this run at /tmp/tpx-20220314-161230.347672-d92e10b8-9209-4e9e-95a6-2fcac02db251/trace.log
RemoteExecution session id: reSessionID-d92e10b8-9209-4e9e-95a6-2fcac02db251-tpx
Started reporting to test run: https://www.internalfb.com/intern/testinfra/testrun/3377699814955263
✓ ListingSuccess: caffe2/test:quantization_fx : 365 tests discovered (19.275)
✓ Pass: caffe2/test:quantization_fx - test_fusion_pattern_with_matchallnode (quantization.fx.test_quantize_fx.TestFuseFx) (17.760)
Summary
Pass: 1
ListingSuccess: 1
If you need help understanding your runs, please follow the wiki: https://fburl.com/posting_in_tpx_users
Finished test run: https://www.internalfb.com/intern/testinfra/testrun/3377699814955263
```
Reviewed By: jerryzh168
Differential Revision: D34873730
fbshipit-source-id: dc78455c7233ba33e9ab215f50754b1656b7dbc7
(cherry picked from commit 1cc74cadd7dc725be97064f57c910ef9d1bbe1a8)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73274
As noticed in https://discuss.pytorch.org/t/calibration-of-model-in-post-training-static-quantization-using-fx-api/143661/6
and related to https://github.com/pytorch/pytorch/issues/72698, when using FX quantization, if an op like view was used in a
model and its index parameters were passed to the op via a variable rather than
hard-coded, FX would mistakenly insert observers for them, leading to an
error when the observer tried to perform tensor-only operations on a
non-tensor. To fix this, an API was added to specify non-tensor
arguments for various ops to enable better dtype propagation.
NON_TENSOR_ARG_DICT is a nested dict whose first key is a named tuple
that contains matching parameters for ops with non-tensor args; the
inner dict's keys are dtypes and its values are lists of the arg indices that
use such dtypes. Alternatively, instead of a list, the inner dict
value can also be a function that takes the node as an argument and
returns the list of arg indices.
Theoretically this API can support arbitrary functions, but the current
implementation is limited to simpler functions since the particular
issue this fixes seems to be rare.
Note: although torch.unsqueeze and torch.transpose are listed in
quantization_patterns.py, those ops appear to be untraceable by fx. I've
included tests for their cases but fixing this issue is beyond the scope
of this PR
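A hedged sketch, not from the commit message, of the kind of model this fixes; the module and the qconfig-mapping helper are assumptions:
```
# Hedged sketch: view() gets its size from a traced (non-tensor) value rather
# than a literal, which used to attract spurious observers.
import torch
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

class ViewModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(4, 8)

    def forward(self, x):
        y = self.linear(x)
        new_channels = y.shape[-1] // 2  # non-tensor arg produced by traced ops
        return y.view(-1, new_channels)

example_inputs = (torch.randn(2, 4),)
m = prepare_fx(ViewModel().eval(), get_default_qconfig_mapping("fbgemm"), example_inputs)
m(*example_inputs)  # calibrate
m = convert_fx(m)
```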
Test Plan:
python test/test_quantization.py test_non_reference_size
...
python test/test_quantization.py test_non_reference_<op>
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D34410122
fbshipit-source-id: fc09949ca8a2d6473876a4b6c214eb91e9a9dae2
(cherry picked from commit 3a1375d677b7c98d62b1f5c839645698c39b32b9)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74229
Previously we did not successfully remove the dequantize node for `dict` outputs; this PR fixes that. It is tested with
meta-only tests right now, but we should follow up with OSS tests (with dict output).
Since we call the dead code elimination pass, some of the inplace operators are removed in TestQuantizeFx.test_fixed_qparams_ops;
in this PR we also removed the calls to the inplace ops and changed the expected results in the test case.
In a future PR we can remove the support for inplace operators, since they are not really supported in FX, and it's OK
for us to skip them as well.
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D34888140
fbshipit-source-id: 48cea842b49e52baa8eee3ce0f4bfb4a3625ab2a
(cherry picked from commit ef790315ebcf954930deb6b9d1c384992c1f1ec8)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73863
This PR fully aligns the convert function with the design: https://github.com/pytorch/rfcs/blob/master/RFC-0019-Extending-PyTorch-Quantization-to-Custom-Backends.md
and simplifies the implementation of the convert function by always producing a reference quantized model (with reference patterns) first,
and then lowering the model to a quantized model that is runnable with the PyTorch native backend (fbgemm/qnnpack).
This PR makes convert.py much easier to understand than the previous implementation, and we are able to remove the majority of the code
in quantization_patterns.py as well (in follow-up PRs).
Test Plan:
```
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
python test/test_quantization.py TestFXNumericSuiteCoreAPIsModels
```
and other internal/oss regression tests
Imported from OSS
Reviewed By: andrewor14
Differential Revision: D34778506
fbshipit-source-id: 0678b66addf736039a8749b352f6f569caca962b
(cherry picked from commit 33ec9caf23f3ab373d827117efbd9db0668b2437)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73509
This adds functionality to lower reference models
involving the Linear-Bn1d pattern in FX QAT mode. This follows
https://github.com/pytorch/pytorch/pull/72431 and https://github.com/pytorch/pytorch/pull/72796, which add Linear-Bn1d fusion functionality
to eager QAT mode.
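A hedged end-to-end sketch of the Linear -> BatchNorm1d QAT flow this lowering targets; the helper names follow current torch.ao.quantization FX APIs and are not quoted from the commit:
```
# Hedged sketch: FX QAT prepare/convert on a Linear -> BatchNorm1d pair.
import torch
from torch.ao.quantization import get_default_qat_qconfig_mapping
from torch.ao.quantization.quantize_fx import prepare_qat_fx, convert_fx

class LinearBn(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(4, 4)
        self.bn = torch.nn.BatchNorm1d(4)

    def forward(self, x):
        return self.bn(self.linear(x))

model = LinearBn().train()
example_inputs = (torch.randn(8, 4),)
qat_model = prepare_qat_fx(model, get_default_qat_qconfig_mapping("fbgemm"), example_inputs)
# ... training loop ...
quantized = convert_fx(qat_model.eval())
```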
Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_linear_module
Imported from OSS
Reviewed By: dagitses
Differential Revision: D34591251
fbshipit-source-id: 39144485f9954ee1830c8b414e724560fd7e47bf
(cherry picked from commit b97a39b4d9df00e045fab4c01eca88e562ca2c02)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73572
Previously we couldn't specify how to get extra inputs for fused ops in backend_config_dict.
For example, for patterns like:
(torch.add, (nn.BatchNorm2d, nn.Conv2d), MatchAllNode)
where nn.Conv2d is the root node, the extra MatchAllNode (the input to the original torch.add) would be lost.
This PR adds an "extra_inputs_getter" key to backend_config_dict, which allows the user to provide a function
that returns a list of extra input nodes for the fused op given the matched node pattern. In this case,
we need a function that returns the node that matches `MatchAllNode`; it would be something like the following:
```
def extra_inputs_getter(pattern):
    add, conv_bn, extra_input = pattern
    return [extra_input]
```
Test Plan:
python test/test_quantization.py TestFuseFx.test_fusion_pattern_with_multiple_inputs
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D34553210
fbshipit-source-id: 748f8ce20974438458a39dbe9eae75281156c227
(cherry picked from commit be748526480e811874dbca64b1cf3bf4950f0393)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73233
This PR makes CopyNodeQuantizeHandler always produce reference patterns, and we have
custom lowering passes that rewrite the reference quantized patterns to quantized ops.
The lowering passes have been implemented previously; we just need to enable the reference path here
and clean up the previous code that allow-lists some of the ops (`check_node`).
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: mrshenli
Differential Revision: D34469446
fbshipit-source-id: b9d9c5f793fbb735839199056c197ae98969cc4b
(cherry picked from commit af0cf4e79e11e7343d57e6ff7766c80e72ec60f3)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73345
For complex patterns we need to identify which node is the root, so that we can eliminate all other nodes and only preserve the root,
e.g. for (torch.add, MatchAllNode, (torch.nn.ReLU, torch.nn.Conv2d)), we can preserve torch.nn.Conv2d as the root node and remove the other nodes.
Previously we assumed the root_node of a pattern is the "last node" of the pattern, computed by:
```
def default_root_node_getter(node_pattern):
    while not isinstance(node_pattern[-1], Node):
        node_pattern = node_pattern[-1]
    return node_pattern[-1]
```
This PR enables users to configure their own root_node_getter, which means we can define the root_node for patterns like:
(torch.add, (torch.nn.ReLU, torch.nn.Conv2d), MatchAllNode)
Test Plan:
python test/test_quantize_fx.py TestFuseFx.test_root_node_getter
Imported from OSS
Reviewed By: VitalyFedyunin
Differential Revision: D34442193
fbshipit-source-id: 2f6da69a5b6527b49710ae32820e8e2915d9af37
(cherry picked from commit 8b49bf0d7d53cdcf2c9f40f8e25bc843e8814026)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72735
We use `get_matched_types` to get the (type) pattern from matched modules,
and we need to use MatchAllNode instead of type(MatchAllNode) to query the fuser_method for the pattern.
Test Plan:
TODO
Imported from OSS
Reviewed By: raghuramank10000
Differential Revision: D34180705
fbshipit-source-id: db9b6e791a9f26b70079fddc95fce033052199ab
(cherry picked from commit 01d38afabcb1bfc207dee7d49ee13df500d32fdf)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72953
This PR makes BinaryOpQuantizeHandler always produce reference patterns, and we have
custom lowering passes that rewrite the reference quantized patterns to quantized ops.
This includes rewrites for
torch.ops.quantized.add, torch.ops.quantized.mul, and torch.ops.quantized.matmul.
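For reference, a hedged sketch of calling these quantized binary ops directly on quantized tensors (not from the commit; add/mul take an output scale and zero_point, matmul omitted here):
```
# Hedged sketch: quantized add/mul on per-tensor quantized inputs.
import torch

a = torch.quantize_per_tensor(torch.randn(2, 2), scale=0.1, zero_point=0, dtype=torch.quint8)
b = torch.quantize_per_tensor(torch.randn(2, 2), scale=0.1, zero_point=0, dtype=torch.quint8)
out_add = torch.ops.quantized.add(a, b, 0.2, 0)
out_mul = torch.ops.quantized.mul(a, b, 0.2, 0)
```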
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: gchanan
Differential Revision: D34292408
fbshipit-source-id: 9872a5098249bc77db15e9fb614416958e62b9b2
(cherry picked from commit dbdc61ee8b5dde2e54a34a370a3af887e5117398)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72490
This is an effort to move the current implementation towards the reference quantized model design:
https://github.com/pytorch/rfcs/blob/master/RFC-0019-Extending-PyTorch-Quantization-to-Custom-Backends.md
so that we use the reference model in the default fbgemm/qnnpack path.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps.test_qbatch_norm
Imported from OSS
Reviewed By: vkuzo, andrewor14
Differential Revision: D34062365
fbshipit-source-id: ed015c61f5b969554a6477f92cf6be2358cb558c
(cherry picked from commit 9498421ddd)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72444
In https://github.com/pytorch/pytorch/pull/71783 support was added for
quantized matmul.
In this PR, FX graph mode quantization workflow support for this
operator is added for int8 dtypes.
Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps.test_qmatmul
```
Imported from OSS
Reviewed By: andrewor14
Differential Revision: D34047310
fbshipit-source-id: 781219047419ce621a4deb46ea04881818bf4209
(cherry picked from commit 7e039fa3a1)