**Summary**
Change the QConv2d binary fusion post op name from `add` to `sum`, since we are actually using OneDNN's `post op sum` rather than `Binary_Add` for now.
**Test Plan**
```
python -m pytest test_quantized_op.py -k test_qconv2d_sum_pt2e
python -m pytest test_quantized_op.py -k test_qconv2d_sum_relu_pt2e
python -m pytest test_quantized_op.py -k test_qconv2d_sum_relu_float_output_pt2e
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115329
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
AOTInductor currently relies on export_to_torch_ir to generate a graph, which it passes to Inductor to generate the .so. They would like the FQNs to be consistent so that they can easily find/update the weights in the .so.
Note that since export flattens all modules into a single computational graph, we will change the FQNs in the original module by replacing all periods with underscores. For example, `foo.child1param`, which points to the parameter `child1param` of a submodule named `foo`, will be renamed to `foo_child1param`, since we no longer have the submodule `foo`. This is done simply via `name.replace(".", "_")`.
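A minimal illustration of the renaming (just the string transformation described above, not the exporter's actual code path):
```
# Periods in the original FQNs become underscores after flattening.
original_fqns = ["foo.child1param", "bar.baz.weight"]
print({name: name.replace(".", "_") for name in original_fqns})
# {'foo.child1param': 'foo_child1param', 'bar.baz.weight': 'bar_baz_weight'}
```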
Generated AOTInductor C++ code: https://www.internalfb.com/phabricator/paste/view/P900120950?lines=377-355%2C354
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115462
Approved by: https://github.com/tugsbayasgalan
Constant-time access of the first value in a collection. This is a constant-time operation, unlike converting the collection to a list to get the first item, which is linear. The rule is turned on, so violations are automatically fixed and enforced going forward.
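For example (a sketch of the rewrite the rule performs):
```
items = {"a", "b", "c"}

first = list(items)[0]     # linear: materializes the whole collection
first = next(iter(items))  # constant time: pulls one element from the iterator
```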
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115507
Approved by: https://github.com/malfet
Summary: This PR does 2 things:
1) Previously this would simply error; now it will ignore any
torch.inf values that it receives. Note: the code checks for torch.inf after
aminmax, so that if there are no torch.inf values found, the perf is
relatively unchanged.
2) As mentioned in https://github.com/pytorch/pytorch/issues/100051,
values close to (but not quite at) the maximum/minimum float value could
overflow to infinity in the course of _adjust_min_max() (when such a large
value is multiplied by something in the middle of a calculation
that would otherwise result in a non-inf value). This was fixed by
rearranging the order of operations in the lines in question without
altering the actual equations. Specifically, where the operations in lines
1095, 1098, and 1100 multiply and divide large values,
it is better to divide the two large values before multiplying, rather
than multiplying the two large values together (creating overflow) before dividing, as was done previously.
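A worked illustration of the overflow fix (hypothetical values, not the exact expressions at those lines):
```
import torch

a = torch.tensor(3e38, dtype=torch.float32)  # near float32 max (~3.4e38)
b = torch.tensor(2e38, dtype=torch.float32)
d = torch.tensor(2e38, dtype=torch.float32)

print(a * b / d)  # inf: a * b overflows before the division happens
print(a / d * b)  # 3e38: dividing first keeps every intermediate finite
```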
Test Plan:
python test/test_quantization.py TestObserver.test_histogram_observer_ignore_infinity
python test/test_quantization.py TestObserver.test_histogram_observer_handle_close_to_infinity
Differential Revision: [D51489345](https://our.internmc.facebook.com/intern/diff/D51489345)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103467
Approved by: https://github.com/andrewor14
Summary:
This is to allow easier extension of the quant workflow in the future, as we are seeing more
diverse ways of doing quantization.
Putting this up for feedback first.
Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_observer_callback
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115001
Approved by: https://github.com/kimishpatel
**Summary**
The previous QLinear implementation assumed that inputs have a dimension of 2. This update modifies QLinear to accept inputs with a dimension greater than 2, reshaping the input and output accordingly.
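A sketch of the reshape logic (the generic pattern, not the actual kernel code):
```
import torch

def linear_nd(x, weight, bias):
    # Collapse all leading dims into one batch dim, run the 2-D linear,
    # then restore the original leading dims on the output.
    x2d = x.reshape(-1, x.shape[-1])
    y2d = torch.nn.functional.linear(x2d, weight, bias)
    return y2d.reshape(*x.shape[:-1], weight.shape[0])

x = torch.randn(2, 3, 8)  # 3-D input
w, b = torch.randn(4, 8), torch.randn(4)
print(linear_nd(x, w, b).shape)  # torch.Size([2, 3, 4])
```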
**Test Plan**
```
python -u -m pytest -s -v test_quantized_op.py -k test_qlinear_pt2e
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113733
Approved by: https://github.com/jgong5, https://github.com/eellison
Summary:
The FX graph mode quant workflow and the pt2e flow rely on the `is_dynamic` flag in the observer/QuantizationSpec to
convert an observer to the dynamic quantization pattern (choose_qparams -> q -> dq). This PR adds the `is_dynamic` flag
to all observers so that it is possible to convert these observers to the pattern.
However, this dynamic quantization pattern (choose_qparams -> q -> dq) is actually only valid for MovingAverageMinMaxObserver(averaging_constant=1),
so that the computation before convert and after convert match in the context of QAT. So we add some sanity
checks to the other observers to make sure is_dynamic is False.
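A minimal sketch of the flag in use, assuming it is exposed as an observer constructor kwarg:
```
from torch.ao.quantization.observer import MovingAverageMinMaxObserver

# averaging_constant=1 makes the moving average track only the most recent
# batch, so pre-convert numerics match the post-convert dynamic pattern
# (choose_qparams -> q -> dq).
obs = MovingAverageMinMaxObserver(averaging_constant=1, is_dynamic=True)
```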
Test Plan:
python test/test_quantization.py TestXNNPACKQuantizer.test_qat_dynamic_linear
Differential Revision: [D51124725](https://our.internmc.facebook.com/intern/diff/D51124725)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113288
Approved by: https://github.com/kimishpatel
Summary:
att, this is because the histogram observer does not work for a corner case in MobileBERT (observing a scalar tensor of the float32 max value),
because the histc operator errors out when the value is larger than a certain threshold.
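A rough repro of the underlying failure (illustrative; the exact arguments that trigger the error may differ):
```
import torch

fmax = torch.finfo(torch.float32).max
x = torch.tensor([fmax])
# Histogram binning arithmetic on values this large can overflow
# (e.g. max - min becomes inf), which makes histc error out.
torch.histc(x, bins=2048, min=-fmax, max=fmax)  # errors out here
```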
Test Plan:
python test/test_quantization.py -k test_mul_float32_max
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113405
Approved by: https://github.com/mcr229
Summary: This is a follow-up from D51428979. These tests should be run only from `TestQuantizePT2EQAT_ConvBn1d` and `TestQuantizePT2EQAT_ConvBn2d`. The base class doesn't have the necessary setup to run them, so they are expected to fail there. I previously ignored the failures on D51428979, and those failed tests have been disabled.
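The skip itself follows the standard unittest pattern (a sketch; the real test file may differ in details):
```
import unittest

class BaseTestQuantizePT2EQAT_ConvBn(unittest.TestCase):
    def setUp(self):
        # The base class lacks the per-dimension setup, so only the concrete
        # ConvBn1d/ConvBn2d subclasses should actually run the tests.
        if self.__class__ is BaseTestQuantizePT2EQAT_ConvBn:
            self.skipTest("Skipping test running from BaseTestQuantizePT2EQAT_ConvBn")
```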
Test Plan:
Run an example test there and confirm that two versions from `TestQuantizePT2EQAT_ConvBn1d` and `TestQuantizePT2EQAT_ConvBn2d` are run while the one from `BaseTestQuantizePT2EQAT_ConvBn` is skipped
```
$ buck2 test 'fbcode//mode/opt' fbcode//caffe2/test/quantization:test_quantization -- --run-disabled 'caffe2/test/quantization:test_quantization - test_qat_conv_bn_fusion_literal_args'
File changed: fbcode//caffe2/test/quantization/pt2e/test_quantize_pt2e_qat.py
↷ Skip: caffe2/test/quantization:test_quantization - test_qat_conv_bn_fusion_literal_args (caffe2.test.quantization.pt2e.test_quantize_pt2e_qat.BaseTestQuantizePT2EQAT_ConvBn) (0.0s)
/data/users/huydo/fbsource/buck-out/v2/gen/fbcode/689edf96bfbb5738/caffe2/test/quantization/__test_quantization__/test_quantization#link-tree/torch/_utils_internal.py:230: NCCL_DEBUG env var is set to None
/data/users/huydo/fbsource/buck-out/v2/gen/fbcode/689edf96bfbb5738/caffe2/test/quantization/__test_quantization__/test_quantization#link-tree/torch/_utils_internal.py:239: NCCL_DEBUG is WARN from /etc/nccl.conf
INFO:2023-11-29 19:20:33 3049620:3049620 CuptiActivityProfiler.cpp:225] CUDA versions. CUPTI: 18; Runtime: 12000; Driver: 12000
/data/users/huydo/fbsource/buck-out/v2/gen/fbcode/689edf96bfbb5738/caffe2/test/quantization/__test_quantization__/test_quantization#link-tree/torch/_utils_internal.py:158: DeprecationWarning: This is a NOOP in python >= 3.7, its just too dangerous with how we write code at facebook. Instead we patch os.fork and multiprocessing which can raise exceptions if a deadlock would happen.
threadSafeForkRegisterAtFork()
test_qat_conv_bn_fusion_literal_args (caffe2.test.quantization.pt2e.test_quantize_pt2e_qat.BaseTestQuantizePT2EQAT_ConvBn) ... skipped 'Skipping test running from BaseTestQuantizePT2EQAT_ConvBn'
----------------------------------------------------------------------
Ran 1 test in 0.001s
OK (skipped=1)
Skipped: Skipping test running from BaseTestQuantizePT2EQAT_ConvBn
Buck UI: https://www.internalfb.com/buck2/7b70fb33-44cb-4745-92e1-64031bb413b8
Test UI: https://www.internalfb.com/intern/testinfra/testrun/6473924660765251
Network: Up: 12KiB Down: 0B (reSessionID-0399f0c3-e671-4770-a41c-75c06ae709d5)
Jobs completed: 11. Time elapsed: 1:07.2s.
Cache hits: 0%. Commands: 1 (cached: 0, remote: 0, local: 1)
Tests finished: Pass 2. Fail 0. Fatal 0. Skip 1. Build failure 0
```
Differential Revision: D51694959
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114829
Approved by: https://github.com/clee2000
Summary:
This is a util for the numeric suite in PT2 export, so that we can build
a more streamlined UX for numerical debugging in the quant + ExecuTorch stack.
Test Plan:
python test/test_quantization.py TestGenerateNumericDebugHandle
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114315
Approved by: https://github.com/zhxchen17
**Summary**
When annotating a conv-binary pattern, we should skip the pattern if the conv node has more than one user.
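A minimal sketch of the check using FX's `node.users` (illustrative, not the quantizer's exact code):
```
import torch
import torch.fx as fx

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = torch.nn.Conv2d(3, 3, 1)

    def forward(self, x):
        y = self.conv(x)
        return y + x, torch.relu(y)  # conv output feeds two consumers

gm = fx.symbolic_trace(M())
conv_node = next(n for n in gm.graph.nodes if n.target == "conv")
# Fusing conv into a conv+binary pattern would steal the output from the
# second consumer, so the annotator skips convs with multiple users.
print(len(conv_node.users) > 1)  # True
```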
**Test Plan**
```
python -m pytest test_x86inductor_quantizer.py -k test_conv2d_binary2
python -m pytest test_x86inductor_quantizer.py -k test_qat_conv2d_binary2
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114540
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
Summary: Our docs said dynamic embedding bag wasn't supported, but
it actually is (at least at the same level as embeddings were); it just wasn't previously tested/listed.
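For reference, a hedged sketch of eager-mode dynamic (weight-only) quantization of an `EmbeddingBag`:
```
import torch
from torch.ao.quantization import float_qparams_weight_only_qconfig, quantize_dynamic

model = torch.nn.Sequential(torch.nn.EmbeddingBag(1000, 64))
# Weight-only qconfig: the embedding weights are quantized, activations stay float.
qmodel = quantize_dynamic(
    model, qconfig_spec={torch.nn.EmbeddingBag: float_qparams_weight_only_qconfig}
)
print(qmodel)
```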
Test Plan: python test/test_quantization.py -k "test_embedding"
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107623
Approved by: https://github.com/jerryzh168
Summary: Previously the PT2 QAT code only supported conv2d-bn.
This commit extends all existing QAT fusion support to conv1d-bn,
including support for all variants like relu, no bias, literal
args, CUDA, etc. This commit also refactors the code such that
we can support conv3d-bn easily in the future.
Test Plan:
python test/test_quantization.py TestQuantizePT2EQAT_ConvBn1d
Reviewers: jerryzh168, kimishpatel
Subscribers: jerryzh168, kimishpatel, supriyar
Differential Revision: [D51428979](https://our.internmc.facebook.com/intern/diff/D51428979)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113714
Approved by: https://github.com/jerryzh168
Summary: Currently the QAT tests are very specific to conv-bn-2d.
This makes it difficult to test new patterns like conv-bn-1d if
we want to add them. This commit refactors these tests so we can
add and test future patterns easily.
Test Plan:
python test/test_quantization.py TestQuantizePT2EQAT_ConvBn2d
Reviewers: jerryzh168, kimishpatel
Subscribers: jerryzh168, kimishpatel, supriyar
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113658
Approved by: https://github.com/jerryzh168
The numeric test for round-trip casting of float8 dtypes originally consisted of generating a 100x100 tensor in the range 0..max.
This change refactors the test, adds further edge cases, and fixes multiple issues with the lower-precision simulation against which the results of the round-trip cast were checked.
Set atol=0 and rtol=0 to ensure an exact equality comparison.
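The round-trip under test looks roughly like this (a sketch of the pattern, not the test's actual code):
```
import torch

# Values spanning the representable range of float8_e4m3fn.
x = torch.rand(100, 100) * torch.finfo(torch.float8_e4m3fn).max

# Round-trip cast: float32 -> float8 -> float32.
roundtrip = x.to(torch.float8_e4m3fn).to(torch.float32)

# atol=0, rtol=0 demands bit-exact agreement with the reference (here the
# round-trip is compared against a copy of itself, standing in for the
# test's lower-precision simulation).
torch.testing.assert_close(roundtrip, roundtrip.clone(), atol=0, rtol=0)
```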
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113361
Approved by: https://github.com/malfet, https://github.com/Neilblaze
Summary:
For a node node1 and an edge (node1, node2): since they are observing the same
Tensor, we may want to implicitly share observers. This flag allows people to
turn off this behavior for the output of the node.
See the test_allow_implicit_sharing test for a use case.
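A sketch of how a quantizer might set the flag, assuming it lives on `QuantizationAnnotation` (consistent with the test name):
```
import torch
from torch.ao.quantization.observer import HistogramObserver
from torch.ao.quantization.quantizer.quantizer import (
    QuantizationAnnotation,
    QuantizationSpec,
)

act_qspec = QuantizationSpec(
    dtype=torch.int8,
    quant_min=-128,
    quant_max=127,
    qscheme=torch.per_tensor_affine,
    observer_or_fake_quant_ctr=HistogramObserver,
)

# Opt node1's output out of implicit sharing: the edge (node1, node2) and
# node1's output will then get separate observers.
annotation = QuantizationAnnotation(
    output_qspec=act_qspec,
    allow_implicit_sharing=False,
)
# node1.meta["quantization_annotation"] = annotation  # inside a Quantizer's annotate()
```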
Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_allow_implicit_sharing
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112929
Approved by: https://github.com/kimishpatel
Summary: Previously we only copied over q/dq args for the per
tensor case. This was because the qparams for `quantize_per_tensor`
are literals while the qparams for `quantize_per_channel` are
`get_attr` nodes (tensors), which disappear from the original
nodes in the graph after subgraph rewriting.
However, this is problematic because, in the per channel case,
not all q/dq args are tensors. In particular, the args after
the qparams (axis, qmin, qmax, dtype) are all literals. For
these literal args we simply used the hardcoded ones
(0, -127, 127, torch.int8 respectively), even if the user
explicitly specified to use a different weight dtype. This
commit fixes this by copying over these literal args for the
per channel case as well.
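For orientation, the per-channel op's argument layout (a sketch; tensor qparams come first, then the literal args this fix copies over):
```
import torch

x = torch.randn(2, 4)
scales = torch.tensor([0.1, 0.2])
zero_points = torch.zeros(2, dtype=torch.int64)

# quantize_per_channel(input, scales, zero_points, axis, qmin, qmax, dtype):
# scales/zero_points are tensors (get_attr nodes in the graph), while
# axis/qmin/qmax/dtype are literals -- the ones previously hardcoded to
# (0, -127, 127, torch.int8) instead of being copied from the user's nodes.
q = torch.ops.quantized_decomposed.quantize_per_channel(
    x, scales, zero_points, 0, -127, 127, torch.int8
)
```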
Test Plan:
python test/test_quantization.py TestQuantizePT2EQAT.test_qat_per_channel_weight_custom_dtype
Reviewers: jerryzh168, kimishpatel
Subscribers: jerryzh168, kimishpatel, supriyar
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112612
Approved by: https://github.com/jerryzh168
Summary: Previously QAT fusion assumes bias is not quantized.
This works for the existing XNNPACKQuantizer, but not for custom
quantizers that wish to quantize the bias. This commit supports
this by adding the necessary patterns. This requires refactoring
the code, however, since it previously assumed that there will
only be one pair of q-dq (from conv weight) in the matched
pattern, and this is no longer true.
Test Plan:
python test/test_quantization.py TestQuantizePT2EQAT.test_qat_conv_bn_bias_derived_qspec
Reviewers: jerryzh168, kimishpatel
Subscribers: jerryzh168, kimishpatel, supriyar
Differential Revision: [D50856377](https://our.internmc.facebook.com/intern/diff/D50856377)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112528
Approved by: https://github.com/jerryzh168
Summary:
In order to make sure that quantization_tag is preserved through second-stage
export, this PR adds it as a special metadata field that should be
preserved.
Since quantization in the export path will work on top of the pre-dispatch
graph, subsequent post-dispatch op decomposition will decompose ops
that the quant workflow tagged. In order to make sure that the patterns
identified by the quantizer remain identifiable even after decompositions
are applied, we must preserve "quantization_tag".
This enables backend delegates that quantized a model for a specific
backend to identify the "quantized" patterns.
Test Plan:
metadata porting tests
Differential Revision: [D49056259](https://our.internmc.facebook.com/intern/diff/D49056259)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108764
Approved by: https://github.com/tugsbayasgalan, https://github.com/jerryzh168
Summary: Today, we have special handling for special qspecs like
`SharedQuantizationSpec` or `DerivedQuantizationSpec`, since these
qspecs refer to other nodes in the graph and these node references
need to be updated after replacement (since they referred to nodes
in the original graph that no longer exist in the new graph).
However, we only do the above for special nodes like conv, bn,
getitem, and relu. This doesn't cover the common use case of
having conv bias derive its qparams from those of conv input
activations and conv weight. This commit adds support for this
use case by also replacing the node references for these nodes.
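For context, the bias-derived qspec looks roughly like this (a sketch; the derive function and node names are illustrative placeholders):
```
import torch
from torch.ao.quantization.quantizer import DerivedQuantizationSpec

def derive_bias_qparams(observers):
    # Illustrative convention: bias scale = act scale * weight scale.
    act_obs, weight_obs = observers
    act_scale, _ = act_obs.calculate_qparams()
    weight_scale, _ = weight_obs.calculate_qparams()
    scale = act_scale * weight_scale
    return scale, torch.zeros_like(scale, dtype=torch.int32)

# input_act_node, weight_node, and conv_node stand in for nodes found
# during annotation; the edges below point at conv's inputs.
bias_qspec = DerivedQuantizationSpec(
    derived_from=[(input_act_node, conv_node), (weight_node, conv_node)],
    derive_qparams_fn=derive_bias_qparams,
    dtype=torch.int32,
    quant_min=-(2**31),
    quant_max=2**31 - 1,
    qscheme=torch.per_tensor_symmetric,
)
```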
Test Plan:
python test/test_quantization.py TestQuantizePT2EQAT.test_qat_conv_bn_bias_derived_qspec
Reviewers: jerryzh168, kimishpatel
Subscribers: jerryzh168, kimishpatel, supriyar
Differential Revision: [D50697078](https://our.internmc.facebook.com/intern/diff/D50697078)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112159
Approved by: https://github.com/jerryzh168
Summary: att, after the SharedQuantizationSpec bug fix we do some checks beforehand; this simplifies the logic when we insert observers.
Test Plan:
contbuild & OSS CI, see bf998a2c5d
Test plan from GitHub:
python test/test_quantization.py TestQuantizePT2E
CIs
Differential Revision: D50816224
Pulled By: jerryzh168
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112453
Approved by: https://github.com/andrewor14