pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-06 00:20:18 +01:00

Author	SHA1	Message	Date
Yuanyuan Chen	9d6597b1e9	Correctly use test parameters (#166726 ) This PR uses unused arguments in some tests. Pull Request resolved: https://github.com/pytorch/pytorch/pull/166726 Approved by: https://github.com/rec, https://github.com/albanD, https://github.com/Skylion007	2025-11-01 04:43:31 +00:00
Yuanyuan Chen	d97144d31e	[5/N] Remove unused loop variables in tests (#166716 ) This PR removes unused loop variables in tests. Pull Request resolved: https://github.com/pytorch/pytorch/pull/166716 Approved by: https://github.com/Lucaskabela, https://github.com/Skylion007	2025-10-31 20:47:57 +00:00
Yuanyuan Chen	fc8ac1216c	[4/N] Remove unused loop variables in tests (#166690 ) This PR removes unused loop variables in tests. Pull Request resolved: https://github.com/pytorch/pytorch/pull/166690 Approved by: https://github.com/justinchuby, https://github.com/mlazos	2025-10-31 10:20:48 +00:00
Yuanyuan Chen	8b188647cf	[2/N] Fix unused loop variables (#166500 ) This PR removes unused loop variables. Pull Request resolved: https://github.com/pytorch/pytorch/pull/166500 Approved by: https://github.com/mlazos	2025-10-29 08:30:35 +00:00
linhaifeng	695cb0d342	[2/N][Fix] Fix typo in test folder (#166374 ) Fix typo in test folder. _typos.toml ```bash [default.extend-words] nd = "nd" arange = "arange" Nd = "Nd" GLOBALs = "GLOBALs" hte = "hte" iy = "iy" PN = "PN" Dout = "Dout" optin = "optin" gam = "gam" PTD = "PTD" ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/166374 Approved by: https://github.com/cyyever, https://github.com/ezyang	2025-10-29 03:02:07 +00:00
Yuanyuan Chen	f9953e0f61	Enable PLC0414 on ruff (#165828 ) This PR enables `PLC0414` that fixes redundant import aliases. Pull Request resolved: https://github.com/pytorch/pytorch/pull/165828 Approved by: https://github.com/albanD	2025-10-22 04:56:52 +00:00
Yuanyuan Chen	0e083942cc	Enable PLW0127 in ruff (#165851 ) This PR enables `PLW0127` in ruff, which checks self-assignment of variables with the form `var=var`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/165851 Approved by: https://github.com/Lucaskabela	2025-10-21 03:30:57 +00:00
Yuanyuan Chen	fdab48a7c1	Enable all PIE rules on ruff (#165814 ) This PR enables all PIE rules on ruff, there are already some enabled rules from this family, the new added rules are ``` PIE796 Enum contains duplicate value: {value} PIE808 Unnecessary start argument in range ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/165814 Approved by: https://github.com/ezyang	2025-10-18 07:36:18 +00:00
PyTorch MergeBot	24520b8386	Revert "Enable all PIE rules on ruff (#165814 )" This reverts commit `c79dfdc655`. Reverted https://github.com/pytorch/pytorch/pull/165814 on behalf of https://github.com/cyyever due to Need to cover more files ([comment](https://github.com/pytorch/pytorch/pull/165814#issuecomment-3417931863))	2025-10-18 07:21:08 +00:00
Yuanyuan Chen	c79dfdc655	Enable all PIE rules on ruff (#165814 ) This PR enables all PIE rules on ruff, there are already some enabled rules from this family, the new added rules are ``` PIE796 Enum contains duplicate value: {value} PIE808 Unnecessary start argument in range ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/165814 Approved by: https://github.com/ezyang	2025-10-18 06:40:12 +00:00
Yuanyuan Chen	e595136187	Enable PLC1802 on ruff (#165813 ) This PR enables ruff check `PLC1802`, which detects len calls on sequences in a boolean test context. Pull Request resolved: https://github.com/pytorch/pytorch/pull/165813 Approved by: https://github.com/ezyang	2025-10-18 05:44:14 +00:00
Yuanyuan Chen	e925dfcc6b	Enable all SIM rules except disabled ones (#164645 ) `SIM` rules are useful for simplifying boolean expressions and enhancing code readability. Pull Request resolved: https://github.com/pytorch/pytorch/pull/164645 Approved by: https://github.com/ezyang, https://github.com/mlazos	2025-10-17 07:27:11 +00:00
Angel Li	fe5ccb1a74	bf16 support for per tensor backward (#165362 ) Adding bf16 for the backward pass of `torch._fake_quantize_learnable_per_tensor_affine()`. Note that for testing, we modified the seed to avoid increasing tolerance due to cases where difference in Python vs CPP downcasting causes tensor mismatches. (e.g. 27.87704 vs 27.8408 before downcasting, 27.7500 vs 27.8750 after downcasting for Python vs CPP op) Pull Request resolved: https://github.com/pytorch/pytorch/pull/165362 Approved by: https://github.com/andrewor14	2025-10-16 17:47:01 +00:00
Angel Li	a856a17799	bf16 support for per_channel bwd (#165325 ) Follow up to #165098 - adding bf16 support for the backward pass. To avoid BC breaking changes/losing precision, we upcast the parameters to fp32 after the op gets called, and downcast the gradients to bf16 before returning. For testing, we upcast to fp32 before calling the reference function. We increase the tolerance to 1e-2 for bf16 inputs because of a difference in casting calculations between python's `x.to(torch.bfloat16)` and cpp's `x.to(at::kBFloat16)` (after comparing intermediate tensors, we found that the numerics diverge after the final casting). We don't explicitly cast in the CPP op but rather let autograd/optimizer handle it. Pull Request resolved: https://github.com/pytorch/pytorch/pull/165325 Approved by: https://github.com/andrewor14	2025-10-14 05:47:32 +00:00
Yuanyuan Chen	8de85896e0	Enable ruff rule E721 (#165162 ) `E721` checks for object type comparisons using == and other comparison operators. This is useful because it is recommended to use `is` for type comparisons. Pull Request resolved: https://github.com/pytorch/pytorch/pull/165162 Approved by: https://github.com/Skylion007	2025-10-13 01:48:55 +00:00
PyTorch MergeBot	816fb7f48d	Revert "Enable ruff rule E721 (#165162 )" This reverts commit `9e7c19f72b`. Reverted https://github.com/pytorch/pytorch/pull/165162 on behalf of https://github.com/pytorch-auto-revert due to Reverted automatically by pytorch's autorevert, to avoid this behaviour add the tag autorevert: disable ([comment](https://github.com/pytorch/pytorch/pull/165162#issuecomment-3393328271))	2025-10-11 13:25:40 +00:00
Yuanyuan Chen	9e7c19f72b	Enable ruff rule E721 (#165162 ) `E721` checks for object type comparisons using == and other comparison operators. This is useful because it is recommended to use `is` for type comparisons. Pull Request resolved: https://github.com/pytorch/pytorch/pull/165162 Approved by: https://github.com/Skylion007	2025-10-11 06:43:53 +00:00
Animesh Jain	d73416642f	[test] Skip testing of source_fn_stack in light of export changes (#165176 ) This is in regards to https://github.com/pytorch/pytorch/pull/164691 where we are inlining into nn modules, and therefore it is causing this test to fail. The test here looks for node.name which is quite different with inlining. Pull Request resolved: https://github.com/pytorch/pytorch/pull/165176 Approved by: https://github.com/andrewor14 ghstack dependencies: #165172	2025-10-11 00:16:59 +00:00
Angel Li	253fd765bd	bf16 support for fake_quantize_learnable_per_channel_affine (#165098 ) Adding bf16 support for `torch._fake_quantize_learnable_per_channel_affine()` op by relaxing the type check on scale TODO: need to add bf16 support to `per_tensor_affine_` as `torch._fake_quantize_learnable_per_tensor_affine_backward` gets called in the backward pass Test Modified unit test in `test_workflow_ops.py` Pull Request resolved: https://github.com/pytorch/pytorch/pull/165098 Approved by: https://github.com/jerryzh168, https://github.com/andrewor14	2025-10-10 16:24:52 +00:00
PyTorch MergeBot	5d7360bb03	Revert "Enable all SIM rules except disabled ones (#164645 )" This reverts commit `321e602692`. Reverted https://github.com/pytorch/pytorch/pull/164645 on behalf of https://github.com/izaitsevfb due to causes lint failures ([comment](https://github.com/pytorch/pytorch/pull/164645#issuecomment-3369274351))	2025-10-05 19:32:21 +00:00
Yuanyuan Chen	321e602692	Enable all SIM rules except disabled ones (#164645 ) `SIM` rules are useful for simplifying boolean expressions and enhancing code readability. Pull Request resolved: https://github.com/pytorch/pytorch/pull/164645 Approved by: https://github.com/ezyang	2025-10-05 07:38:25 +00:00
Yuanyuan Chen	5743d731c1	Use torch.testing.test_close instead of torch.testing.test_allclose (#164539 ) Because torch.testing.test_allclose is deprecated. Pull Request resolved: https://github.com/pytorch/pytorch/pull/164539 Approved by: https://github.com/mlazos	2025-10-03 14:39:10 +00:00
Jeff Daily	ffc645c870	half support for fused_moving_avg_obs_fake_quant() op (#164175 ) Follow up to https://github.com/pytorch/pytorch/pull/162620. Add half support, as well. This fixes some failures in inductor benchmarks such as from this log https://github.com/pytorch/pytorch/actions/runs/18051942373/job/51376749459. `NotImplementedError: "aminmax_kernel" not implemented for 'Half'` Pull Request resolved: https://github.com/pytorch/pytorch/pull/164175 Approved by: https://github.com/malfet, https://github.com/jerryzh168	2025-09-30 19:35:17 +00:00
can-gaa-hou	e64dd8c694	[Fix] Adding missing `f` prefixes to formatted strings [4/N] (#164068 ) As stated in the title. * __->__ #164068 * #164067 * #164066 * #164065 Pull Request resolved: https://github.com/pytorch/pytorch/pull/164068 Approved by: https://github.com/Skylion007	2025-09-29 04:07:07 +00:00
Sampath Victor	783a9dcb6d	[6/n] Quantization with min & max bounds support - using fbgemm changes in ATen (#162924 ) Summary: This diff uses the FBGEMM changes made in D78181177 & D81858256 to support using the provided per row min/max values while quantizing float/half to 8-bit, 4-bit & 2-bit in ATen library. Please find more context on this here: https://fburl.com/gdoc/yutf32a0 Test Plan: ``` buck test mode/opt caffe2/torch/fb/model_transform/splitting/tests:split_dispatcher_test ``` https://www.internalfb.com/intern/testinfra/testrun/7881299640979446 Please refer to D80905814's test plan for integration testing. Rollback Plan: Differential Revision: D81327342 Pull Request resolved: https://github.com/pytorch/pytorch/pull/162924 Approved by: https://github.com/jerryzh168	2025-09-25 02:52:04 +00:00
Angel Li	3b73841f43	update test_quantization tests to run weekly (#163077 ) Fixes #162854 Pull Request resolved: https://github.com/pytorch/pytorch/pull/163077 Approved by: https://github.com/huydhn	2025-09-24 11:31:11 +00:00
Angel Li	9494b09549	bf16 support for fused_moving_avg_obs_fake_quant() op (#162620 ) enabling bf16 support for `torch.fused_moving_avg_obs_fake_quant()` op on cuda testing `python test/quantization/pt2e/test_quantize_pt2e.py` Pull Request resolved: https://github.com/pytorch/pytorch/pull/162620 Approved by: https://github.com/andrewor14, https://github.com/jerryzh168	2025-09-16 21:22:44 +00:00
PyTorch MergeBot	468c1f9e9d	Revert "[nn] Assert parsed iterable arguments are an appropriate length (#162340 )" This reverts commit `b5e6e58050`. Reverted https://github.com/pytorch/pytorch/pull/162340 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it seems to break an MPS tests on ExecuTorch ([comment](https://github.com/pytorch/pytorch/pull/162340#issuecomment-3282676242))	2025-09-11 21:22:57 +00:00
Benjamin Glass	b5e6e58050	[nn] Assert parsed iterable arguments are an appropriate length (#162340 ) Fixes #162327 Pull Request resolved: https://github.com/pytorch/pytorch/pull/162340 Approved by: https://github.com/Skylion007	2025-09-10 15:15:49 +00:00
Tugsbayasgalan (Tugsuu) Manlaibaatar	de05dbc39c	Replace export_for_training with export (#162396 ) Summary: replace export_for_training with export Test Plan: CI Rollback Plan: Differential Revision: D81935792 Pull Request resolved: https://github.com/pytorch/pytorch/pull/162396 Approved by: https://github.com/angelayi, https://github.com/jerryzh168	2025-09-10 14:19:34 +00:00
arkadip-maitra	1aa7476885	fix to segmentation fault when empty tensor is passed to choose_qpara… (#161966 ) …ms_optimized Fixes #153326 Minimal code to reproduce error: ``` import torch tensor = torch.tensor([]) torch.choose_qparams_optimized( tensor, 0, 200, 0.16, 8 ) ``` Previous Output: `Segmentation fault` Now Output: ``` Traceback (most recent call last): File "/home/amaitra/work/tests/issue_153326.py", line 5, in <module> torch.choose_qparams_optimized( RuntimeError: input tensor is empty and has no data ``` Caused because `const float* input_row =input_tensor.const_data_ptr<float>();` becomes null Pull Request resolved: https://github.com/pytorch/pytorch/pull/161966 Approved by: https://github.com/Skylion007	2025-09-03 20:26:26 +00:00
Dmitry Nikolaev	b76f6d117a	[ROCm] fix numpy version detection and adjust fudge_factors for MI355 (#161429 ) This PR fixes: - Numpy >= 2.1 version detection (instead of python 3.13 version detection) to skip some tests (numpy 2.1 can be installed for older python versions) ``` test_quantization.py::TestDynamicQuantizedOps::test_qlinear test_quantization.py::TestDynamicQuantizedOps::test_qlinear_legacy test_quantization.py::TestQuantizedLinear::test_qlinear test_quantization.py::TestQuantizedLinear::test_qlinear_leaky_relu test_quantization.py::TestQuantizedLinear::test_qlinear_relu test_quantization.py::TestQuantizedLinear::test_qlinear_tanh test_quantization.py::TestQuantizedLinear::test_qlinear_with_input_q_dq_qweight_dq_output_fp32 ``` - A couple of SDPA tests on MI355 by adjusting fudge_factors: ``` test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32 test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/161429 Approved by: https://github.com/jeffdaily	2025-08-28 19:32:09 +00:00
Xia, Weiwen	a941d7ffe5	[Quant][CPU] Avoid NaN in fp8 output of qlinear and qconv (#160957 ) Summary When output dtype is fp8, oneDNN does not ensure intermediate results in the range of [-448, 448] before converting to fp8. So, we may get NaN in the output, which is a disaster for inference. This PR fixes this issue by clamping the intermediate results by oneDNN's post-op clip. Test plan ``` pytest -sv test/quantization/core/test_quantized_op.py -k "q and fp8" ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/160957 Approved by: https://github.com/Valentine233, https://github.com/CaoE	2025-08-21 08:36:21 +00:00
PyTorch MergeBot	a53d14d5f8	Revert "unskipped mobilenet_v3 quantization and mobilenet_v2 quantization plus tests from https://github.com/pytorch/pytorch/issues/125438 (#157786 )" This reverts commit `3a2c3c8ed3`. Reverted https://github.com/pytorch/pytorch/pull/157786 on behalf of https://github.com/albanD due to Breaks lint ([comment](https://github.com/pytorch/pytorch/pull/157786#issuecomment-3164126250))	2025-08-07 13:09:33 +00:00
christinaburge	3a2c3c8ed3	unskipped mobilenet_v3 quantization and mobilenet_v2 quantization plus tests from https://github.com/pytorch/pytorch/issues/125438 (#157786 ) These tests now pass on AArch64 in our downstream CI. `test_quantization.py::TestNumericSuiteEager::test_mobilenet_v2 <- test/quantization/eager/test_numeric_suite_eager.py PASSED [2.4434s] [ 35%]` Pull Request resolved: https://github.com/pytorch/pytorch/pull/157786 Approved by: https://github.com/jerryzh168, https://github.com/malfet	2025-08-06 22:41:07 +00:00
wengshiy	668d414ae7	[CPU] Fix bias dtype issue for FP8 qlinear (#159125 ) Fixes `RuntimeError: self and mat2 must have the same dtype, but got BFloat16 and Float` With bf16 autocast, the bias is converted into BFloat16, but fp8_qlinear_onednn_ref does not support a bf16 bias. This PR converts the bias into bf16 in fp8_qlinear_onednn_ref. The case is added to the unit tests; to reproduce: `python test/test_quantization.py -k test_qlinear_fp8` Pull Request resolved: https://github.com/pytorch/pytorch/pull/159125 Approved by: https://github.com/Xia-Weiwen, https://github.com/cyyever, https://github.com/CaoE	2025-07-31 01:26:45 +00:00
Xuehai Pan	775788f93b	[BE][PYFMT] migrate PYFMT for `test/[i-z]*/` to `ruff format` (#144556 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/144556 Approved by: https://github.com/ezyang	2025-07-29 03:26:09 +00:00
Xuehai Pan	f5e2de928b	[BE] fix remaining flake8 v7 warnings (#159044 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/159044 Approved by: https://github.com/Skylion007 ghstack dependencies: #159043	2025-07-25 02:56:34 +00:00
Huamin Li	2c37acfd89	[AOTI][CPU] Consider bias=None case for fbgemm_linear_fp16_weight (#158535 ) Test Plan: Rollback Plan: Differential Revision: D78458214 Pull Request resolved: https://github.com/pytorch/pytorch/pull/158535 Approved by: https://github.com/houseroad, https://github.com/henryoier, https://github.com/jingsh	2025-07-21 23:42:44 +00:00
Aaron Gokaslan	7a08755c5f	[BE][Ez]: Update ruff to 0.12.2 (#157937 ) Updates to the latest version of ruff and apply some fixes that it flagged and silence a few new lints Pull Request resolved: https://github.com/pytorch/pytorch/pull/157937 Approved by: https://github.com/ezyang	2025-07-11 15:16:20 +00:00
Xia, Weiwen	e1a20988f3	[Quant][CPU] Enable fp8 qconv (#157076 ) Summary Enable fp8 qconv on CPU. It's part of the plan to enable fp8 static quantization on CPU. This PR only adds FP8 support of the existing int8 qconv op. It does not add a new op nor does it affect frontend or quantization flow. The schema of the qconv op is not changed either. So, the FP8 qconv shares the same op as INT8 qconv and the difference is that src/wei dtype is fp8 instead of int8. The output dtype can be fp8/float32/bfloat16. The implementation uses the oneDNN library. Note: OneDNN does not support quantized fp8 convolution until v3.9 but the version used in PyTorch is v3.7.2. So, the op goes to the reference kernel for now. And we have also updated the oneDNN path so that it's compatible with the fp8 dtype. Once oneDNN is upgraded to v3.9 or newer, minimal changes are needed to enable the oneDNN path. And we have ensured that the behavior of the reference kernel is the same as the new oneDNN's implementation. - oneDNN version < 3.9 (now) - Always go to the reference kernel - oneDNN version >= 3.9 (future) - Go to reference kernel on old platforms (without AMX) - Use oneDNN on new platforms (with AMX) Test plan ``` pytest test/quantization/core/test_quantized_op.py -k "qconv and fp8" ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/157076 Approved by: https://github.com/leslie-fang-intel, https://github.com/jerryzh168	2025-07-11 10:00:57 +00:00
Jerry Zhang	11a86ad2fa	Remove pytorch quant docs since we are moving to torchao (#157766 ) Summary: att Test Plan: doc page generated from CI Reviewers: Subscribers: Tasks: Tags: Pull Request resolved: https://github.com/pytorch/pytorch/pull/157766 Approved by: https://github.com/Skylion007	2025-07-11 03:21:47 +00:00
Abhishek Nandy	548c9d8281	Fix typo: 'paramter' → 'parameter' in quantization model report test (#157646 ) This PR addresses a minor typo in the file `test/quantization/fx/test_model_report_fx.py`: - Corrected the word "paramter" to "parameter" for better readability and accuracy. While it's a small change, correcting such typographical errors contributes to maintaining the overall quality and professionalism of the codebase. Thank you for your time and consideration in reviewing this PR. I'm happy to make any further adjustments if needed. Pull Request resolved: https://github.com/pytorch/pytorch/pull/157646 Approved by: https://github.com/yewentao256, https://github.com/ezyang	2025-07-05 12:28:36 +00:00
Xia, Weiwen	b5bfbba184	[Quant][CPU] fix fake_quantize_per_tensor_affine of inf values (#155109 ) Fixes #154328 Summary Fail reason: The input value is infinity in float and it has undefined behavior to convert it to int64_t. On X86, it will be converted to the min value of int64_t, which is not expected. Fix: Clamping `(input * inv_scale + zero_point)` to `[quant_min, quant_max]` before converting it to int64_t. Test plan ``` pytest test/quantization/core/test_workflow_ops.py -k test_fake_quantize_per_tensor_affine_inf ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/155109 Approved by: https://github.com/leslie-fang-intel, https://github.com/jerryzh168	2025-06-26 01:24:36 +00:00
PyTorch MergeBot	029e2b05c2	Revert "[Quant][CPU] fix fake_quantize_per_tensor_affine of inf values (#155109 )" This reverts commit `19ffb5e6f7`. Reverted https://github.com/pytorch/pytorch/pull/155109 on behalf of https://github.com/albanD due to The corresponding test still breaks on rocm ([comment](https://github.com/pytorch/pytorch/pull/155109#issuecomment-3004698438))	2025-06-25 13:05:40 +00:00
Xia, Weiwen	c2185dc4a5	[Quant][CPU] Enable fp8 qlinear (#155678 ) Summary Enable fp8 qlinear on CPU. It's part of the plan to enable fp8 static quantization on CPU. This PR only adds FP8 support of the existing int8 qlinear op. It does not add a new op nor does it affect frontend or quantization flow. The schema of the qlinear op is not changed either. So, the FP8 qlinear shares the same op as INT8 qlinear and the difference is that src/wei dtype is fp8 instead of int8. The output dtype can be fp8/float32/bfloat16. The implementation uses the oneDNN library. The differences of qlinear from `_scaled_mm` are that - Qlinear supports post op fusion while `_scaled_mm` does not - Weights are prepacked for qlinear Test plan ``` pytest test/quantization/core/test_quantized_op.py -k "qlinear and fp8" ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/155678 Approved by: https://github.com/leslie-fang-intel, https://github.com/jerryzh168	2025-06-25 10:01:08 +00:00
Xia, Weiwen	19ffb5e6f7	[Quant][CPU] fix fake_quantize_per_tensor_affine of inf values (#155109 ) Fixes #154328 Summary Fail reason: The input value is infinity in float and it has undefined behavior to convert it to int64_t. On X86, it will be converted to the min value of int64_t, which is not expected. Fix: Clamping `(input * inv_scale + zero_point)` to `[quant_min, quant_max]` before converting it to int64_t. Test plan ``` pytest test/quantization/core/test_workflow_ops.py -k test_fake_quantize_per_tensor_affine_inf ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/155109 Approved by: https://github.com/leslie-fang-intel, https://github.com/jerryzh168	2025-06-25 09:28:54 +00:00
PyTorch MergeBot	e9fdaf8701	Revert "[Quant][CPU] fix fake_quantize_per_tensor_affine of inf values (#155109 )" This reverts commit `e375d21bb9`. Reverted https://github.com/pytorch/pytorch/pull/155109 on behalf of https://github.com/malfet due to Looks like it broke ROCM tests ([comment](https://github.com/pytorch/pytorch/pull/155109#issuecomment-2977428354))	2025-06-16 17:22:55 +00:00
Xia, Weiwen	d9799a2ee7	Support boolean tensor for torch.fused_moving_avg_obs_fake_quant on CUDA (#153699 ) Fixes #153310 As the title Test plan ``` pytest test/quantization/core/test_workflow_ops.py -k test_fused_obs_fake_quant_moving_avg ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/153699 Approved by: https://github.com/mingfeima, https://github.com/jerryzh168	2025-06-16 07:10:06 +00:00
Xia, Weiwen	e375d21bb9	[Quant][CPU] fix fake_quantize_per_tensor_affine of inf values (#155109 ) Fixes #154328 Summary Fail reason: The input value is infinity in float and it has undefined behavior to convert it to int64_t. On X86, it will be converted to the min value of int64_t, which is not expected. Fix: Clamping `(input * inv_scale + zero_point)` to `[quant_min, quant_max]` before converting it to int64_t. Test plan ``` pytest test/quantization/core/test_workflow_ops.py -k test_fake_quantize_per_tensor_affine_inf ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/155109 Approved by: https://github.com/leslie-fang-intel, https://github.com/jerryzh168	2025-06-14 14:12:38 +00:00
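
The fix described in #155109 (clamp `input * inv_scale + zero_point` to `[quant_min, quant_max]` before the integer conversion) can be illustrated with a minimal pure-Python sketch. This is not the ATen kernel: the function name `fake_quantize_scalar` and its scalar signature are assumptions made for illustration, and Python's `round` stands in for whatever rounding the real kernel applies.

```python
def fake_quantize_scalar(x, scale, zero_point, quant_min, quant_max):
    """Hypothetical scalar sketch of per-tensor affine fake quantization."""
    inv_scale = 1.0 / scale
    # Clamp in the floating-point domain first, so that +/-inf is mapped to
    # a finite bound before any integer conversion happens. In C++ the
    # conversion of an infinite float to int64_t is undefined behavior; in
    # Python, int(float("inf")) raises OverflowError.
    shifted = x * inv_scale + zero_point
    clamped = min(max(shifted, quant_min), quant_max)
    q = int(round(clamped))
    # Dequantize back to a float value.
    return (q - zero_point) * scale


print(fake_quantize_scalar(float("inf"), 0.5, 0, 0, 255))   # 127.5
print(fake_quantize_scalar(float("-inf"), 0.5, 0, 0, 255))  # 0.0
```

With the clamp applied first, infinite inputs saturate at the quantization bounds instead of producing a platform-dependent integer. NaN inputs are not handled by this sketch.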

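Several of the lint rules enabled in the commits listed here (`E721`, `PLC1802`, `PIE808`) target small patterns that are easy to show in code. The sketch below is illustrative only: the function names are hypothetical and are not taken from the PyTorch test suite, and each comment paraphrases the rule description given in the corresponding commit message.

```python
def is_exact_int(value):
    # E721: compare types with `is` rather than `type(value) == int`.
    return type(value) is int


def first_or_none(items):
    # PLC1802: test the sequence directly rather than `if len(items):`.
    if items:
        return items[0]
    return None


def squares(n):
    # PIE808: `range(n)` rather than `range(0, n)`; the start is redundant.
    return [i * i for i in range(n)]


print(is_exact_int(3), is_exact_int(True))
print(first_or_none([]), first_or_none([7, 8]))
print(squares(3))
```

Note that `is_exact_int(True)` is false because `type(True)` is `bool`; code that should accept subclasses would use `isinstance` instead.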

1886 Commits