Summary:
Add deprecation warnings for `torch.ao.quantization` and `XNNPACKQuantizer` (see Test Plan below).
Test Plan:
```
(ao) $ PYTHONWARNINGS='default' python
Python 3.10.14 | packaged by conda-forge | (main, Mar 20 2024, 12:45:18) [GCC 12.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from torch.ao.quantization.quantizer.xnnpack_quantizer import XNNPACKQuantizer
printing warning
*/anaconda3/envs/ao/lib/python3.10/site-packages/torch/ao/quantization/__init__.py:36: DeprecationWarning: torch.ao.quantization is deprecated. Plan is to
1. Remove eager mode quantization (torch.ao.quantization.quantize, torch.ao.quantization.quantize_dynamic), please migrate to use torchao eager mode quantize_ API instead
2. Remove fx graph mode quantization (torch.ao.quantization.quantize_fx.prepare_fx, torch.ao.quantization.quantize_fx.convert_fx, please migrate to use torchao pt2e quantization API instead (prepare_pt2e, convert_pt2e)
3. pt2e quantization has been migrated to torchao (https://github.com/pytorch/ao/tree/main/torchao/quantization/pt2e)
see https://dev-discuss.pytorch.org/t/torch-ao-quantization-migration-plan/2810 for more details
warnings.warn(
>>> a = XNNPACKQuantizer()
*/anaconda3/envs/ao/lib/python3.10/site-packages/torch/ao/quantization/quantizer/xnnpack_quantizer.py:281: DeprecationWarning: XNNPACKQuantizer is deprecated! Please use xnnpack quantizer in ExecuTorch (https://github.com/pytorch/executorch/tree/main/backends/xnnpack/quantizer) instead
warnings.warn(f"{self.__class__.__name__} is deprecated! Please use xnnpack quantizer in ExecuTorch (https://github.com/pytorch/executorch/tree/main/backends/xnnpack/quantizer) instead", DeprecationWarning)
>>>
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/153892
Approved by: https://github.com/Skylion007
# Motivation
This PR enables the quantized fusion `qlinear+add` on the Intel GPU backend.
At the backend level, we register the ops via the schemas `TORCH_SELECTIVE_NAME("onednn::qlinear_pointwise.binary")` and `TORCH_SELECTIVE_NAME("onednn::qlinear_pointwise.binary_tensor")`, which are the ones already defined for `X86InductorQuantizer`.
At the Inductor level, we make a small modification in `torch/_inductor/fx_passes/quantization.py` to allow the signed int8 data type (s8) during op lowering. For pattern matching, we largely reuse the code that already exists for `X86InductorQuantizer`.
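For context, below is a minimal sketch of the PT2E flow this fusion targets. It assumes an XPU-enabled PyTorch build, and `get_default_xpu_inductor_quantization_config` is an assumed helper name mirroring the x86 one; it is not meant as the exact test code.
```python
# Minimal sketch, assuming an XPU-enabled build; the config helper name is
# assumed to mirror get_default_x86_inductor_quantization_config.
import torch
from torch.ao.quantization.quantize_pt2e import prepare_pt2e, convert_pt2e
from torch.ao.quantization.quantizer.xpu_inductor_quantizer import (
    XPUInductorQuantizer,
    get_default_xpu_inductor_quantization_config,  # assumed helper name
)

class LinearAdd(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(4, 4)

    def forward(self, x, other):
        # linear followed by add: the pattern lowered to
        # onednn::qlinear_pointwise.binary after quantization
        return self.linear(x) + other

m = LinearAdd().eval().to("xpu")
example_inputs = (torch.randn(4, 4, device="xpu"), torch.randn(4, 4, device="xpu"))

exported = torch.export.export(m, example_inputs).module()
quantizer = XPUInductorQuantizer()
quantizer.set_global(get_default_xpu_inductor_quantization_config())
prepared = prepare_pt2e(exported, quantizer)
prepared(*example_inputs)              # calibration
converted = convert_pt2e(prepared)
compiled = torch.compile(converted)    # Inductor lowers to the fused qlinear+add
compiled(*example_inputs)
```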
# UT verification
```bash
python test/inductor/test_mkldnn_pattern_matcher.py -v \
-k test_qlinear_add_xpu
```
# Runtime Verification
```bash
onednn_verbose,primitive,exec,gpu:0,matmul,jit:gemm:any,undef,src_s8::blocked:ab::f0 wei_s8::blocked:ab::f0 bia_f32::blocked:ab::f0_mask2 dst_f32::blocked:ab::f0,attr-scratchpad:user attr-scales:src0:0:f32+dst:0:f32+wei:2:f32 attr-zero-points:src0:0:s32 attr-post-ops:eltwise_linear:1:0.654408+sum:0.00511256+eltwise_relu,,4x4:4x4,0.0319824
```
The verbose log is collected from the UT. From the attribute `attr-post-ops:eltwise_linear:1:0.654408+sum:0.00511256+eltwise_relu`, we can see that the post add and ReLU are successfully fused into the GEMM computation.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135337
Approved by: https://github.com/EikanWang, https://github.com/guangyey, https://github.com/liangan1, https://github.com/jerryzh168
ghstack dependencies: #133307, #135189
Co-authored-by: guangyey <guangye.yu@intel.com>
# Motivation
This PR enables the quantized fusions `qconv+add` and `qconv+add+relu` on the Intel GPU backend.
At the backend level, we register the op via the schema `TORCH_SELECTIVE_NAME("onednn::qconv2d_pointwise.binary")`, which is the one already defined for `X86InductorQuantizer`.
At the Inductor level, we make a small modification in `torch/_inductor/fx_passes/quantization.py` to allow the signed int8 data type (s8) during op lowering. For pattern matching, we largely reuse the code that already exists for `X86InductorQuantizer`.
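For illustration only, a minimal module whose graph contains the targeted pattern could look like the sketch below; the PT2E quantization and compile steps mirror the qlinear+add sketch earlier in this stack.
```python
# Illustrative module only: conv -> add -> relu is the chain matched into
# onednn::qconv2d_pointwise.binary (add as the binary post-op, relu as the
# eltwise post-op).
import torch

class ConvAddReLU(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = torch.nn.Conv2d(3, 6, kernel_size=3)

    def forward(self, x, other):
        return torch.relu(self.conv(x) + other)
```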
# UT verification
```bash
python test/inductor/test_mkldnn_pattern_matcher.py -v \
-k test_qconv2d_add_xpu \
-k test_qconv2d_add_relu_xpu 2>&1
```
# Runtime exemplification
The following oneDNN verbose log was collected from the UT:
```bash
onednn_verbose,primitive,exec,gpu:0,convolution,jit:ir,forward_training,src_s8::blocked:acdb::f0 wei_s8::blocked:abcd::f0 bia_f32::blocked:a::f0 dst_s8::blocked:acdb::f0,attr-scratchpad:user attr-scales:src0:0:f32+dst:0:f32+wei:1:f32 attr-zero-points:src0:0:s32+dst:0:s32 attr-post-ops:eltwise_linear:1:0.337704+sum:0.0241217+eltwise_relu,alg:convolution_direct,mb1_ic3oc6_ih8oh6kh3sh1dh0ph0_iw8ow6kw3sw1dw0pw0,0.151123
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135189
Approved by: https://github.com/liangan1, https://github.com/EikanWang, https://github.com/guangyey, https://github.com/jerryzh168
ghstack dependencies: #133307
Co-authored-by: guangyey <guangye.yu@intel.com>
# Motivation
This PR enables `onednn.qlinear` and `onednn.qlinear_unary` on Intel GPU.
We register the qlinear ops in the C++ backend via `TORCH_LIBRARY_IMPL`; the ops this PR registers include `onednn::qlinear_pointwise`, `onednn::qlinear_pointwise.tensor`, and `onednn::qlinear_prepack`. The prepack transposes the weight to fit oneDNN's layout requirement and achieve higher performance.
Also, we remove the limitation in the corresponding annotation method of `XPUInductorQuantizer` (`torch/ao/quantization/quantizer/xpu_inductor_quantizer.py`) to allow GPU linear conversion.
We add the kChar (`torch.int8`) dtype in `torch/_inductor/fx_passes/quantization.py` and `torch/_inductor/mkldnn_ir.py`, as signed int8 is the default INT8 data type on the GPU side.
We verified the ops through UTs and e2e model testing (e.g., ResNet18 and ResNet50).
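As a rough reference for the numerics only (not the actual kernel), the computation performed by the qlinear op on s8 inputs can be sketched as below; per-tensor scales and symmetric s8 weights are assumed for simplicity.
```python
# Numerics sketch only (per-tensor scales, symmetric s8 weights assumed);
# the real op runs a fused oneDNN kernel on prepacked (transposed) weights.
import torch

def qlinear_reference(x_s8, x_scale, x_zp, w_s8, w_scale, bias, y_scale, y_zp):
    x_fp32 = (x_s8.to(torch.float32) - x_zp) * x_scale
    w_fp32 = w_s8.to(torch.float32) * w_scale
    y_fp32 = torch.nn.functional.linear(x_fp32, w_fp32, bias)
    # signed int8 (kChar) output, the default INT8 dtype on the GPU side
    y_s8 = torch.clamp(torch.round(y_fp32 / y_scale) + y_zp, -128, 127).to(torch.int8)
    return y_s8
```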
# UT verification
```
DNNL_VERBOSE=0 TORCH_COMPILE_DEBUG=0 python test/inductor/test_mkldnn_pattern_matcher.py -v \
-k test_qlinear_xpu \
-k test_qlinear_relu_xpu \
-k test_qlinear_gelu_xpu
```
# Runtime exemplification
Here is the oneDNN verbose log collected by running the above UTs:
```
//pure int8 gemm
onednn_verbose,primitive,exec,gpu:0,matmul,jit:gemm:any,undef,src_s8::blocked:ab::f0 wei_s8::blocked:ab::f0 dst_s8::blocked:ab::f0,attr-scratchpad:user attr-scales:src0:0:f32+dst:0:f32+wei:2:f32 attr-zero-points:src0:0:s32+dst:0:s32,,2x4:4x3,0.187988
// post-relu fusion
onednn_verbose,primitive,exec,gpu:0,matmul,jit:gemm:any,undef,src_s8::blocked:ab::f0 wei_s8::blocked:ab::f0 bia_f32::blocked:ab::f0_mask2 dst_f32::blocked:ab::f0,attr-scratchpad:user attr-scales:src0:0:f32+dst:0:f32+wei:2:f32 attr-zero-points:src0:0:s32 attr-post-ops:eltwise_relu,,2x4:4x4,0.115234
// post-gelu fusion
onednn_verbose,primitive,exec,gpu:0,matmul,jit:gemm:any,undef,src_s8::blocked:ab::f0 wei_s8::blocked:ab::f0 dst_f32::blocked:ab::f0,attr-scratchpad:user attr-scales:src0:0:f32+dst:0:f32+wei:2:f32 attr-zero-points:src0:0:s32 attr-post-ops:eltwise_gelu_tanh,,2x4:4x4,0.170898
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/133307
Approved by: https://github.com/liangan1, https://github.com/guangyey, https://github.com/EikanWang, https://github.com/jerryzh168
Co-authored-by: guangyey <guangye.yu@intel.com>
Summary:
This replicates XNNPACKQuantizer from PyTorch to ExecuTorch.
Rationale:
The main motivation is to avoid a PyTorch pin update in OSS after updating XNNPACKQuantizer, which can be rather frequent.
Other impact and considerations:
The PT2E flow (which lives in PyTorch) relies heavily on XNNPACKQuantizer as an "example" quantizer implementation and, more importantly, for tests. For now, we will keep torch.ao.quantization.xnnpack_quantizer as is but mark it as not BC and deprecated, to discourage future new dependencies on it.
Other OSS repositories using XNNPACKQuantizer from PyTorch now have to take an additional dependency on ExecuTorch.
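For downstream users, the migration is essentially an import change; the ExecuTorch module path below is an assumption based on the repository layout linked in the deprecation warning.
```python
# Old (stays in PyTorch for now, but deprecated and not BC-guaranteed):
from torch.ao.quantization.quantizer.xnnpack_quantizer import XNNPACKQuantizer

# New (lives in ExecuTorch; module path assumed from
# https://github.com/pytorch/executorch/tree/main/backends/xnnpack/quantizer):
from executorch.backends.xnnpack.quantizer.xnnpack_quantizer import XNNPACKQuantizer
```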
Differential Revision: D68191752
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144940
Approved by: https://github.com/jerryzh168, https://github.com/mcr229
Currently there are a few type annotations that falsely state that mypy doesn't support recursive types.
Recursive type support has been available in mypy for a few years already. It was officially enabled in [version 0.991](https://mypy-lang.blogspot.com/2022/11/mypy-0990-released.html). Pyright had support for recursive types even earlier (https://github.com/microsoft/pyright/issues/569), so there is probably no reason not to model these types correctly.
This PR models these types properly now. Since this has turned a few implicit `Any` into fully typed variables that are not narrowed cleanly, a small number of type ignores were necessary.
Note that for `Argument` it is desirable to model the type in a covariant way (i.e., using `Sequence` and `Mapping`) instead of making it invariant unnecessarily (using `List` and `Dict`). If it were modeled invariantly, it would for instance mean that a `List[Node]` would not type check as `Argument`, because invariance would require it to literally be a `List[Argument]` (i.e., including all the branches of the union type). Since even the name of the type, "argument", strongly suggests that it is semantically used as an argument, covariance is natural anyway.
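A simplified sketch of the idea (not the exact `torch.fx` definition, which has more branches):
```python
# Recursive alias modeled covariantly with Sequence/Mapping.
from typing import List, Mapping, Optional, Sequence, Union

class Node: ...

Argument = Optional[Union[
    Sequence["Argument"],      # covariant: Sequence[Node] is a Sequence[Argument]
    Mapping[str, "Argument"],
    Node,
    str,
    int,
    float,
]]

def consume(arg: Argument) -> None: ...

nodes: List[Node] = [Node(), Node()]
consume(nodes)  # accepted; with List["Argument"] this would be an invariance error
```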
There are no changes in this PR that affect runtime behavior.
CC @Skylion007
Pull Request resolved: https://github.com/pytorch/pytorch/pull/142300
Approved by: https://github.com/ezyang, https://github.com/Skylion007
Annotate linear node for `linear_dynamic_fp16` with `X86InductorQuantizer`
After `convert_pt2e`, the pattern will be
```
x
|
linear <- to_fp32 <- to_fp16 <- w
```
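As a rough usage sketch: the config helper name and the annotation entry point below are assumptions, not the confirmed API; see the test referenced in the test plan for the exact usage.
```python
# Rough sketch; get_x86_inductor_linear_dynamic_fp16_config is an assumed
# helper name -- consult test_x86inductor_quantizer.py for the exact API.
import torch
import torch.ao.quantization.quantizer.x86_inductor_quantizer as xiq
from torch.ao.quantization.quantize_pt2e import prepare_pt2e, convert_pt2e

m = torch.nn.Sequential(torch.nn.Linear(16, 16)).eval()
example_inputs = (torch.randn(2, 16),)

quantizer = xiq.X86InductorQuantizer()
quantizer.set_module_type_qconfig(
    torch.nn.Linear, xiq.get_x86_inductor_linear_dynamic_fp16_config()  # assumed
)

exported = torch.export.export(m, example_inputs).module()
prepared = prepare_pt2e(exported, quantizer)
converted = convert_pt2e(prepared)  # weight becomes w -> to_fp16 -> to_fp32 -> linear
```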
**Test plan**
```
pytest test/quantization/pt2e/test_x86inductor_quantizer.py -k test_linear_dynamic_fp16
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/141480
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
# Motivation
This PR adds `XPUInductorQuantizer`, which defines the recipe for int8 quantization on the XPU backend.
# Details
`XPUInductorQuantizer` is a class derived from `X86InductorQuantizer`, as both quantizers take advantage of the highly optimized operators in the oneDNN library (qconv, qlinear, qconv/qlinear fusion).
We share the same recipe as `X86InductorQuantizer`, so we have the same `annotate_xxxx` methods. Ideally, `XPUInductorQuantizer` would have no class body at all, as all the implementation can be inherited from the base class.
In this PR, we override the `annotate_xxx` methods for operators that have NOT been implemented. All operators the XPU backend does not implement fall back to the fp32 implementation, as the nodes in the graph remain `dq-op-q` pairs. This helps provide good out-of-the-box usability for the XPU backend. On the other hand, the implemented operators use the `annotate_op` methods implemented in the base class and can be lowered successfully.
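Schematically, the design looks like the sketch below; the specific overridden method name is illustrative, not the exact set of overrides.
```python
# Schematic only: reuse the X86 recipe wholesale and skip annotation for ops
# the XPU backend has not implemented, so they stay as dq-op-q pairs and fall
# back to fp32.
from torch.ao.quantization.quantizer.x86_inductor_quantizer import X86InductorQuantizer

class XPUInductorQuantizerSketch(X86InductorQuantizer):
    def _annotate_maxpool2d(self, node, quantization_config) -> None:
        # Not implemented on XPU yet: do nothing, leaving the dq-op-q pair
        # in the graph so the op runs in fp32.
        return
```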
Pull Request resolved: https://github.com/pytorch/pytorch/pull/139578
Approved by: https://github.com/EikanWang, https://github.com/leslie-fang-intel, https://github.com/CuiYifeng, https://github.com/jerryzh168
ghstack dependencies: #133080
* Automatically applies ruff rule 401. Turns loops into equivalent list comprehensions, which are faster and do not leak the loop variables into the enclosing scope.
* List comprehensions not only often have better typing, but they are also 50+% faster than for loops in terms of overhead. They also preserve length information, etc., and are easier for the interpreter to optimize.
* Manually went back and made mypy happy after the change.
* Also fixed style lints in files covered by flake8 but not by pyfmt
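For illustration, this is the kind of rewrite the rule performs:
```python
# Before: explicit loop; `item` leaks into the enclosing scope.
squares = []
for item in range(10):
    squares.append(item * item)

# After: equivalent list comprehension; less per-element overhead, and
# `item` stays local to the comprehension.
squares = [item * item for item in range(10)]
```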
Pull Request resolved: https://github.com/pytorch/pytorch/pull/140980
Approved by: https://github.com/justinchuby, https://github.com/malfet
Summary:
There is a bug in the quantizer where Conv + ReLU is fused even when the preceding conv has more than one user. Conv and ReLU cannot be fused in this case because the result of Conv must be used elsewhere.
XNNPACK Delegate naturally handles this by inserting a clamp node for ReLU.
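For illustration, a hypothetical module exhibiting the case the fix guards against:
```python
# The conv output has two users (relu and the residual add), so Conv+ReLU
# must not be fused: the pre-activation result is still needed elsewhere.
import torch

class ConvWithTwoUsers(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = torch.nn.Conv2d(3, 3, kernel_size=3, padding=1)

    def forward(self, x):
        y = self.conv(x)
        return torch.relu(y) + y
```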
Test Plan: CI
Reviewed By: digantdesai
Differential Revision: D65989599
Pull Request resolved: https://github.com/pytorch/pytorch/pull/140846
Approved by: https://github.com/digantdesai
Fixes the failure of INT8 DLRM using AOTI.
The previous code calculates `consts_size` directly using `tensor` from `graph.constants`:
```
consts_size = sum(
    get_nbytes_of_tensor(tensor, all_cuda)
    for (name, tensor) in graph.constants.items()
    if name not in graph.folded_constants
)
```
Meanwhile, the actual bytes to serialize (`serialized_weights`) are computed using `graph.get_original_value_of_constant(name)`:
```
serialized_weights = b"".join(
    _to_bytes(graph.get_original_value_of_constant(name), all_cuda)
    for name in graph.constants.keys()
    if name not in graph.folded_constants
)
```
`tensor` from `graph.constants` could be different from `graph.get_original_value_of_constant(name)` thus making the `consts_size` inconsistent with the actual byte size of the `serialized_weights`, resulting in runtime error `weights_offset must be aligned to 16K boundary`, similar to what happened in https://github.com/pytorch/pytorch/pull/135205.
This PR directly gets `consts_size` using `len(serialized_weights)`, which fixes the inconsistency.
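Sketched against the two snippets above, the fix amounts to:
```python
# Compute consts_size from the bytes that are actually serialized, so the
# two can never diverge (names as in the snippets above).
serialized_weights = b"".join(
    _to_bytes(graph.get_original_value_of_constant(name), all_cuda)
    for name in graph.constants.keys()
    if name not in graph.folded_constants
)
consts_size = len(serialized_weights)
```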
We also added a `reduce_range` argument to the `get_default_x86_inductor_quantization_config` function, which is needed in the unit test to avoid an accuracy issue on CI machines (earlier CPUs without VNNI).
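For example, a test can request the reduced-range config like this (a minimal usage sketch, not the exact test code):
```python
# reduce_range=True avoids saturation on pre-VNNI CPUs used by some CI machines.
from torch.ao.quantization.quantizer.x86_inductor_quantizer import (
    X86InductorQuantizer,
    get_default_x86_inductor_quantization_config,
)

quantizer = X86InductorQuantizer()
quantizer.set_global(get_default_x86_inductor_quantization_config(reduce_range=True))
```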
Pull Request resolved: https://github.com/pytorch/pytorch/pull/139054
Approved by: https://github.com/leslie-fang-intel, https://github.com/jgong5, https://github.com/desertfire
**Summary**
Remove the redundant methods of X86 Inductor Quantizer: `get_supported_quantization_configs`, `get_supported_operator_for_quantization_config`, and `get_supported_operators`. They are not must-haves for implementing a customized Quantizer and are not mentioned in the existing documentation on how to use X86 Inductor Quantizer.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/139161
Approved by: https://github.com/jgong5
Part of #134054.
This corresponds to the PyTorch mypy changes from D61493706. Updating takes so long and touches so many files that it's impossible to land it as a whole without conflicting with some other intermediate change.
So we are landing these 'type: ignore's for PyTorch in advance of them actually being needed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/134202
Approved by: https://github.com/Skylion007