Summary:
Also adds support for a backend_config with relu fusion, since XNNPACK allows it.
We should revisit the relu fusion once we gain more clarity on quantSrcPartition or some other way to do these fusions without having to add all the combinations.
TODO: we should rename the backend config to et_xnnpack.py or something similar.
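For reference, a minimal sketch of what a relu-fusion pattern entry in a BackendConfig can look like, using the forward-order pattern format; the linear + relu choice and the dtype values are illustrative, not the exact entries added in this diff:
```
import torch
import torch.nn as nn
import torch.ao.nn.intrinsic as nni
from torch.ao.quantization.backend_config import (
    BackendPatternConfig,
    DTypeConfig,
    ObservationType,
)

# Illustrative dtype constraint: quint8 activations, qint8 weights.
dtype_config = DTypeConfig(
    input_dtype=torch.quint8,
    output_dtype=torch.quint8,
    weight_dtype=torch.qint8,
    bias_dtype=torch.float,
)

def fuse_linear_relu(is_qat, linear, relu):
    return nni.LinearReLU(linear, relu)

linear_relu_config = (
    BackendPatternConfig((nn.Linear, nn.ReLU))
    .set_observation_type(ObservationType.OUTPUT_USE_DIFFERENT_OBSERVER_AS_INPUT)
    .set_dtype_configs([dtype_config])
    .set_fuser_method(fuse_linear_relu)
    .set_fused_module(nni.LinearReLU)
)
```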
Test Plan: `buck test fbcode//mode/dev-nosan fbcode//executorch/backends/xnnpack/test:`
Differential Revision: D46985169
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104134
Approved by: https://github.com/mcr229, https://github.com/salilsdesai
Summary:
Also adds support for a backend_config with relu fusion, since XNNPACK allows it.
We should revisit the relu fusion once we gain more clarity on quantSrcPartition or some other way to do these fusions without having to add all the combinations.
TODO: we should rename the backend config to et_xnnpack.py or something similar.
Test Plan: `buck test fbcode//mode/dev-nosan fbcode//executorch/backends/xnnpack/test:`
Differential Revision: D46924209
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104090
Approved by: https://github.com/mcr229
Similar to https://github.com/pytorch/pytorch/pull/96160, but for the modules
nn.PixelShuffle and nn.PixelUnshuffle.
torch.nn.PixelShuffle and torch.nn.PixelUnshuffle accept both float and quantized
inputs. However, previously we would unnecessarily dequantize quantized inputs into
floats before passing them to these modules. This commit fixes this by lowering the
patterns [dequant - PixelShuffle - quant] and [dequant - PixelUnshuffle - quant].
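A minimal usage sketch, assuming the standard FX graph mode quantization flow; the module below is hypothetical:
```
import torch
import torch.nn as nn
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

class M(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(8, 8, 1)
        self.shuffle = nn.PixelShuffle(2)

    def forward(self, x):
        # With this change, PixelShuffle stays in the quantized domain
        # instead of being wrapped in a dequant/quant pair.
        return self.shuffle(self.conv(x))

m = M().eval()
example_inputs = (torch.randn(1, 8, 4, 4),)
m = prepare_fx(m, get_default_qconfig_mapping(), example_inputs)
m(*example_inputs)  # calibration
m = convert_fx(m)
```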
Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_pixel_shuffle_module
python test/test_quantization.py TestQuantizeFxOps.test_pixel_unshuffle_module
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101926
Approved by: https://github.com/jerryzh168
Summary:
torch.nn.functional.pixel_unshuffle and torch.narrow accept both float
and quantized inputs. However, previously we would unnecessarily
dequantize quantized inputs into floats before passing them to
these functions. This commit fixes this by lowering the patterns
[dequant - pixel_unshuffle - quant] and [dequant - narrow - quant].
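To illustrate why the lowering is safe (a small sketch, not part of the diff): both ops run directly on a quantized tensor, so the surrounding dequant/quant pair is redundant:
```
import torch
import torch.nn.functional as F

x = torch.randn(1, 2, 6, 6)
qx = torch.quantize_per_tensor(x, scale=0.1, zero_point=0, dtype=torch.quint8)

# Both ops accept the quantized tensor directly -- no dequantize needed.
out1 = F.pixel_unshuffle(qx, 2)   # output is still quint8
out2 = torch.narrow(qx, 1, 0, 1)  # output is still quint8
```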
Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps.test_pixel_unshuffle
python test/test_quantization.py TestQuantizeFxOps.test_narrow
```
Differential Revision: D43858199
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96160
Approved by: https://github.com/andrewor14
**Summary**
Linear is decomposed to `t - addmm/mm` after `dynamo.export`, and the weight's observer is initially inserted between `t` and `addmm/mm`. `_rearrange_weight_observer_for_addmm()` is then called to move the observer between the weight and `t`:
```
before:
weight - t - observer \
input - observer - addmm/mm
after:
weight - observer - t \
input - observer - addmm/mm
```
We found two issues with `_rearrange_weight_observer_for_addmm()`:
- It does not call `m.recompile()` at the end, so it does not function correctly.
- It does not support `aten.mm.default`, which comes from a decomposed linear without bias.

This PR fixes both issues and renames the function to `_rearrange_weight_observer_for_decomposed_linear`.
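For context, a minimal sketch (not the actual implementation) of the graph rewrite, assuming exactly the `weight - t - observer - addmm/mm` shape shown above:
```
import torch
from torch.fx import GraphModule

def rearrange_weight_observer(m: GraphModule):
    for node in m.graph.nodes:
        if node.target not in (torch.ops.aten.addmm.default,
                               torch.ops.aten.mm.default):
            continue
        # The weight is the last argument -- addmm(bias, x, w_t), mm(x, w_t) --
        # and arrives as observer(t(weight)).
        obs_node = node.args[-1]
        t_node = obs_node.args[0]
        weight = t_node.args[0]
        # Rewire to t(observer(weight)).
        obs_node.replace_input_with(t_node, weight)
        t_node.replace_input_with(weight, obs_node)
        node.replace_input_with(obs_node, t_node)
        # Keep the node list topologically sorted: the observer now feeds t.
        t_node.prepend(obs_node)
    m.graph.lint()
    m.recompile()  # the recompile() that was missing before this PR
```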
**Test plan**
python test/test_quantization.py -k test_rearrange_weight_observer_for_decomposed_linear
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94296
Approved by: https://github.com/jgong5, https://github.com/andrewor14
Summary:
Previously we assumed asymmetric quantization for dynamic quantization. This diff adds support for symmetric quantization
of the input in dynamic quantization.
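For context, a rough illustration (not this diff's code) of how the two schemes compute qparams for a dynamically quantized input, assuming a signed 8-bit target and ignoring degenerate ranges:
```
import torch

def dynamic_qparams(x: torch.Tensor, symmetric: bool, qmin=-128, qmax=127):
    if symmetric:
        # Symmetric: the range is centered, so zero_point is pinned at 0
        # (for a signed dtype such as qint8).
        scale = float(x.abs().max()) / ((qmax - qmin) / 2)
        zero_point = 0
    else:
        # Asymmetric: scale and zero_point derived from the actual [min, max].
        scale = float(x.max() - x.min()) / (qmax - qmin)
        zero_point = int(min(max(qmin - round(float(x.min()) / scale), qmin), qmax))
    return scale, zero_point
```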
Test Plan: buck run executorch/exir/tests:quant_lowering_custom_backend_pass -- "executorch.exir.tests.test_quant_lowering_custom_backend_pass.TestQuantLoweringCustomBackendPass.test_quantized_linear_dynamic"
Reviewed By: digantdesai
Differential Revision: D43134794
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94854
Approved by: https://github.com/digantdesai
Summary: `torch.nn.functional.pixel_shuffle` accepts both float
and quantized inputs. However, previously we would unnecessarily
dequantize quantized inputs into floats before passing them to
the function. This commit fixes this by lowering the pattern
[dequant - pixel_shuffle - quant].
Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_pixel_shuffle
Reviewers: vkuzo
Subscribers: vkuzo, supriyar
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94769
Approved by: https://github.com/vkuzo
Summary:
This is no longer needed; we can use the dtype to decide whether an observer is needed or not.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92589
Approved by: https://github.com/jcaip
**Summary**
This work continues with https://github.com/pytorch/pytorch/pull/83784 by @vkuzo and includes all the changes in that PR.
Quote from https://github.com/pytorch/pytorch/pull/83784:
> Issue #83658 reports that ops followed by a certain pattern of `view` and `size` ops were not quantized correctly by FX graph mode quantization.
Before this PR, the "size" op was in the "op shares qparams with input" category, and the code assumed that the input of this op has the same dtype as its output. This led to incorrectly propagating the `int` dtype as the output of whichever op was preceding the `view` op, which in turn made that op blocklisted from quantization.
> The fix is to create a new category of ops which work on different dtypes of tensors but are not observed. This PR does so for `size`, and also for `shape` since it works the same way.
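Roughly the failing shape from the issue, as exercised by the tests below (a hypothetical minimal module):
```
import torch
import torch.nn as nn

class M(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 4)

    def forward(self, x):
        x = self.linear(x)
        # x.size(0) returns an int; before this fix, that int dtype was
        # propagated backwards through `size`, blocklisting the linear
        # from quantization.
        return x.view(x.size(0), -1)
```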
**Note**: This PR needs https://github.com/pytorch/pytorch/pull/91297 to land first, otherwise there is a unit test failure.
**Test plan**
```
python test/test_quantization.py -k test_linear_size_view
python test/test_quantization.py -k test_linear_shape_view
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90001
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
Summary: This commit moves the API specification section of
the BackendConfig tutorial to the docstrings, which is a more
suitable place for this content. This change also reduces some
duplication. There is no new content added in this change.
Reviewers: jerryzh168, vkuzo
Subscribers: jerryzh168, vkuzo
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91999
Approved by: https://github.com/vkuzo, https://github.com/jerryzh168
Summary:
This PR introduces the top-level APIs for quantization support in the PyTorch 2.0 Export stack:
* torch.ao.quantization.quantize_pt2e.prepare_pt2e
Takes a model captured by PyTorch 2.0 export (torchdynamo full graph mode) and prepares it for calibration
for post-training quantization
* torch.ao.quantization.quantize_pt2e.convert_pt2e
Takes a calibrated model and converts it to a reference quantized model that can later be lowered to quantized operator libraries or delegation modules
Also adds a backend config for the qnnpack_pt2e backend:
* torch.ao.quantization.backend_config.get_qnnpack_pt2e_backend_config
Note: everything related to quantize_pt2e is experimental (prototype), and we don't make any BC guarantees
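A sketch of the intended flow; since these APIs are explicitly prototype, the exact prepare_pt2e argument list shown here (qconfig_mapping, example_inputs, backend_config) is an assumption modeled on the FX flow and may not match the landed signatures:
```
import torch
import torch._dynamo as torchdynamo
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.quantize_pt2e import prepare_pt2e, convert_pt2e
from torch.ao.quantization.backend_config import get_qnnpack_pt2e_backend_config

m = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3), torch.nn.ReLU()).eval()
example_inputs = (torch.randn(1, 3, 16, 16),)

# Capture an aten-level full graph with torchdynamo export.
m, guards = torchdynamo.export(m, *example_inputs, aten_graph=True)
m = prepare_pt2e(m, get_default_qconfig_mapping("qnnpack"),
                 example_inputs, get_qnnpack_pt2e_backend_config())
m(*example_inputs)  # calibration
m = convert_pt2e(m)
```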
Test Plan:
python test/test_quantization.py TestQuantizePT2EModels
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91035
Approved by: https://github.com/HDCharles
Summary:
This PR introduces the top-level APIs for quantization support in the PyTorch 2.0 Export stack:
* torch.ao.quantization.quantize_pt2e.prepare_pt2e
Takes a model captured by PyTorch 2.0 export (torchdynamo full graph mode) and prepares it for calibration
for post-training quantization
* torch.ao.quantization.quantize_pt2e.convert_pt2e
Takes a calibrated model and converts it to a reference quantized model that can later be lowered to quantized operator libraries or delegation modules
Also adds a backend config for the qnnpack_pt2e backend:
* torch.ao.quantization.backend_config.get_qnnpack_pt2e_backend_config
Note: everything related to quantize_pt2e is experimental (prototype), and we don't make any BC guarantees
Test Plan:
python test/test_quantization.py TestQuantizePT2EModels
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90971
Approved by: https://github.com/HDCharles
Summary:
This PR introduces the top-level APIs for quantization support in the PyTorch 2.0 Export stack:
* torch.ao.quantization.quantize_pt2e.prepare_pt2e
Takes a model captured by PyTorch 2.0 export (torchdynamo full graph mode) and prepares it for calibration
for post-training quantization
* torch.ao.quantization.quantize_pt2e.convert_pt2e
Takes a calibrated model and converts it to a reference quantized model that can later be lowered to quantized operator libraries or delegation modules
Also adds a backend config for the qnnpack_pt2e backend:
* torch.ao.quantization.backend_config.get_qnnpack_pt2e_backend_config
Note: everything related to quantize_pt2e is experimental (prototype), and we don't make any BC guarantees
Test Plan:
python test/test_quantization.py TestQuantizePT2EModels
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90802
Approved by: https://github.com/qihqi
Summary: The existing BackendConfig fusion pattern
uses a "reversed nested tuple" format that is highly
unintuitive. For example,
```
linear-relu -> (nn.ReLU, nn.Linear)
conv-bn-relu -> (nn.ReLU, (nn.BatchNorm2d, nn.Conv2d))
```
This pattern format also complicates the signatures
of the user-specified "fuser methods", which must
accept arguments in reverse nested order to match
the patterns:
```
def fuse_linear_relu(is_qat, relu, linear):
    ...

def fuse_conv_bn_relu(is_qat, relu, bn_conv):
    (bn, conv) = bn_conv
    ...
```
Instead, this commit introduces a new pattern format that
simply specifies the ops in forward order with no nesting:
```
linear-relu -> (nn.Linear, nn.ReLU)
conv-bn-relu -> (nn.Conv2d, nn.BatchNorm2d, nn.ReLU)
def fuse_linear_relu(is_qat, linear, relu):
    ...

def fuse_conv_bn_relu(is_qat, conv, bn, relu):
    ...
```
Note that the legacy "reversed nested tuple" format is still
used internally, since it is more general. In the
future, we should replace it with the format used in
the subgraph rewriter in `torch.fx`, and simplify the
existing pattern matching code to handle the new
format added in this commit.
BC-breaking Notes:
Before:
```
import torch.nn as nn
import torch.ao.nn.intrinsic as nni
from torch.ao.quantization.backend_config import BackendPatternConfig

def fuse_conv_bn_relu(is_qat, relu, bn_conv):
    (bn, conv) = bn_conv
    return nni.ConvBnReLU2d(conv, bn, relu)

config = BackendPatternConfig((nn.ReLU, (nn.BatchNorm2d, nn.Conv2d))) \
    .set_dtype_configs(...) \
    .set_fuser_method(fuse_conv_bn_relu) \
    .set_fused_module(nni.ConvBnReLU2d)
```
After:
```
def fuse_conv_bn_relu(is_qat, conv, bn, relu):
    return nni.ConvBnReLU2d(conv, bn, relu)

config = BackendPatternConfig((nn.Conv2d, nn.BatchNorm2d, nn.ReLU)) \
    .set_dtype_configs(...) \
    .set_fuser_method(fuse_conv_bn_relu) \
    .set_fused_module(nni.ConvBnReLU2d)
```
OR (for backward-compatibility)
```
def fuse_conv_bn_relu(is_qat, relu, bn_conv):
    (bn, conv) = bn_conv
    return nni.ConvBnReLU2d(conv, bn, relu)

config = BackendPatternConfig() \
    ._set_pattern_complex_format((nn.ReLU, (nn.BatchNorm2d, nn.Conv2d))) \
    .set_dtype_configs(...) \
    .set_fuser_method(fuse_conv_bn_relu) \
    .set_fused_module(nni.ConvBnReLU2d) \
    ._set_use_legacy_pattern_format(True)
```
Before:
```
backend_config.configs # returns Dict[Pattern, BackendPatternConfig]
```
After:
```
backend_config.configs # returns List[BackendPatternConfig]
```
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestBackendConfig
Reviewers: jerryzh168, vkuzo
Subscribers: jerryzh168, vkuzo
Differential Revision: [D41954553](https://our.internmc.facebook.com/intern/diff/D41954553)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90698
Approved by: https://github.com/vkuzo, https://github.com/jerryzh168