Commit Graph

88238 Commits

Author SHA1 Message Date
Thomas Bohnstingl
68034198e5 [HOP] Mutation and alias rework (#146658)
This PR reworks the way the input mutations and various aliases are checked

Pull Request resolved: https://github.com/pytorch/pytorch/pull/146658
Approved by: https://github.com/ydwu4
2025-05-18 08:05:22 +00:00
Justin Chu
0e805aad7f [ONNX] Support float4 (#151069)
- Support exporting float4 models (note: currently we use IR version 10 universally in the exporter, which does not include float 4 support. Eventually when onnx runtime and the ecosystem moves to support the new IR version 11 we should bump our version to 11 in the exporter as well)
- The shape of the type is set according to https://github.com/pytorch/pytorch/pull/148791#discussion_r2038704986 (added last dim with size 2)
- Use ml_dtypes types when converting to numpy for consistency with ONNX IR

Fix https://github.com/pytorch/pytorch/issues/150202

Pull Request resolved: https://github.com/pytorch/pytorch/pull/151069
Approved by: https://github.com/titaiwangms
2025-05-18 03:19:35 +00:00
Tom Ritchford
8568dbce1d [inductor] Clean typing in codegen/common.py and codecache.py (#150767)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/150767
Approved by: https://github.com/aorenste
2025-05-17 13:56:50 +00:00
Xuehai Pan
27f7b65a69 [BE] Ensure generated stub files by gen_pyi are properly formatted (#150730)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/150730
Approved by: https://github.com/aorenste
2025-05-17 12:30:40 +00:00
Michael Lazos
7ebea09986 [Cutlass] Enable fusion with FusedSchedulerNodes (#153588)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/153588
Approved by: https://github.com/eellison
ghstack dependencies: #152815
2025-05-17 12:29:10 +00:00
Michael Lazos
f604732e2e [Cutlass] E2E Tests for EVT (#152815)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/152815
Approved by: https://github.com/henrylhtsang, https://github.com/eellison
2025-05-17 12:29:10 +00:00
Angela Yi
b4fb801b2d [export] Move PT2 constants to torch::_export (#153206)
Test Plan:
`buck2 test //sigmoid/...`
https://www.internalfb.com/intern/testinfra/testrun/1970325119807758

Differential Revision: D74417085

Pull Request resolved: https://github.com/pytorch/pytorch/pull/153206
Approved by: https://github.com/zhxchen17, https://github.com/dolpm
2025-05-17 08:21:59 +00:00
PyTorch MergeBot
40339c1e99 Revert "[CUDA][cuBLAS][cuBLASLt] avoid polluting prefer cuBLAS/Lt setting across tests (#153655)"
This reverts commit 3bde364996.

Reverted https://github.com/pytorch/pytorch/pull/153655 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it seems to fail a test in trunk ([comment](https://github.com/pytorch/pytorch/pull/153655#issuecomment-2888212597))
2025-05-17 08:11:54 +00:00
Xuehai Pan
9b2a45ac7d Refactor torch/utils/data/datapipes/gen_pyi.py with torchgen (#150626)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/150626
Approved by: https://github.com/aorenste
2025-05-17 06:21:41 +00:00
eqy
e802b29ed4 [SDPA][EZ] Abate narrowing conversion warning spam in flash_api.cpp (#153643)
for messages like
```/workspace/pytorch/aten/src/ATen/native/transformers/cuda/flash_attn/flash_api.cpp:1396:38: warning: narrowing conversion of ‘(char)(& q)->at::Tensor::<anonymous>.at::TensorBase::get_device()’ from ‘char’ to ‘c10::DeviceIndex’ {aka ‘signed ```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/153643
Approved by: https://github.com/Skylion007
2025-05-17 02:07:35 +00:00
Sidharth
aac30ef503 [Dynamo] added warning message for tracing lru_cache wrapped functions (#153744)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/153744
Approved by: https://github.com/williamwen42
2025-05-17 00:43:18 +00:00
Aaron Gokaslan
e88c4db302 [BE]: Update ruff linter to 0.11.10 (#153625)
Fixes a bug with #153543 where I forgot to add pyproject.toml to the list of files RUF can scan and also updates it to the latest version (which is just minor bugfixes).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/153625
Approved by: https://github.com/cyyever, https://github.com/atalman
2025-05-17 00:39:47 +00:00
clr
a952f42bdb dynamo: Log if we're using dynamic shapes via set_feature_usage (#153490)
This makes it extremely clear if a specific model didn't use dynamic shapes and
should have (except it had a bad config option).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/153490
Approved by: https://github.com/jansel
2025-05-16 23:59:00 +00:00
Zhe Qu
1e9666b32d Add cudaLaunchKernel to cuda_to_hip_mappings (#153690)
Summary: as $title

Test Plan:
Used in D74789639

Rollback Plan:

Reviewed By: cenzhaometa

Differential Revision: D74789639

Pull Request resolved: https://github.com/pytorch/pytorch/pull/153690
Approved by: https://github.com/Skylion007, https://github.com/malfet
2025-05-16 23:37:11 +00:00
cyy
7ae7324ac4 [submodule] Update google benchmark to v1.9.3 (#153676)
And remove `include_directories`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/153676
Approved by: https://github.com/Skylion007
2025-05-16 23:31:53 +00:00
NikhilAPatel
59c3463653 [Inductor] Fallback bmm to mm when batch == 1 (#153572)
Summary:
This change introduces a fallback path from `bmm` to `mm` when the batch dimension is `1`.
The motivation is to unlock specialized `mm` kernel paths (e.g., `decomposeK`, `persistent+TMA`, etc.) which often don't have `bmm` equivalents.

### Rationale

- **No regression:** On shapes where the fallback triggers, we see no performance loss.

- **Performance wins:** On select shapes (especially with large `K`), we observe measurable speedups by triggering `mm`-specific optimizations.
  For example, on `bmm` shapes of the form `(1, H, K, H)` where `H ∈ {16, 32, 48, 64}` and `K ∈ {4096 ... 32768}`, we see an **average speedup of 10%**.

- **Prevalence in prod:** Internal workloads frequently emit `bmm` ops with `batch=1`, making this fallback broadly useful in practice.

Test Plan:
contbuild & OSS CI

Tests in test/inductor/test_torchinductor.py

Pull Request resolved: https://github.com/pytorch/pytorch/pull/153572
Approved by: https://github.com/PaulZhang12, https://github.com/eellison
2025-05-16 22:35:03 +00:00
henrylhtsang
76f182f8e0 [cutlass backend] Reduce log level for cutlass compilation error (#153397)
Differential Revision: [D74596410](https://our.internmc.facebook.com/intern/diff/D74596410/)

This change should only affect cutlass backend. We realize that we are going to have Cuda compilation errors, and we do a really good job handling them and caching them. So reduce the logging levels there.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/153397
Approved by: https://github.com/ColinPeppler, https://github.com/Skylion007
2025-05-16 21:46:14 +00:00
Eddie Yan
3bde364996 [CUDA][cuBLAS][cuBLASLt] avoid polluting prefer cuBLAS/Lt setting across tests (#153655)
Some tests may not set the preferred backend, which leads to unexpected behavior when multiple tests are run vs. standalone

Tests that should exercise both backends should explicitly parametrize this setting

Pull Request resolved: https://github.com/pytorch/pytorch/pull/153655
Approved by: https://github.com/ngimel
2025-05-16 21:31:13 +00:00
PyTorch MergeBot
084c4aa614 Revert "Reapply "Delete TorchScript based Android demo app and point to ExecuTorch (#153633)" (#153656)"
This reverts commit 7ed377f577.

Reverted https://github.com/pytorch/pytorch/pull/153656 on behalf of https://github.com/larryliu0820 due to Still being used internally so can't remove ([comment](https://github.com/pytorch/pytorch/pull/153656#issuecomment-2887665403))
2025-05-16 21:00:11 +00:00
Ryan Guo
e4a636df80 [dynamo] Make OptimizedModule more robust in attribute reads and writes (#153637)
Fixes #138157.

Differential Revision: [D74834872](https://our.internmc.facebook.com/intern/diff/D74834872)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/153637
Approved by: https://github.com/williamwen42
2025-05-16 20:29:19 +00:00
PyTorch MergeBot
1748fa529a Revert "cleanup, refactor and add missing self._dde_suppressed checks (#152657)"
This reverts commit f7fb2f66e3.

Reverted https://github.com/pytorch/pytorch/pull/152657 on behalf of https://github.com/malfet due to Broke lint ([comment](https://github.com/pytorch/pytorch/pull/152657#issuecomment-2887539146))
2025-05-16 19:42:20 +00:00
Nikita Shulga
62d8e3cb40 [BE][MPS] Cleanup log ops migration (#153727)
Introduced by https://github.com/pytorch/pytorch/pull/153398

Workaround internal compiler error on MacOS-13 by providing boolean specialization

Pull Request resolved: https://github.com/pytorch/pytorch/pull/153727
Approved by: https://github.com/Skylion007
2025-05-16 19:32:17 +00:00
Aaron Gokaslan
cf226cb4d4 [BE]: Enable misc RUF rules and fix pyproject.toml indent (#153624)
Enables a variety of misc ruff rules and fixes some incorrect indentation in the file. Now that we updated ruff recently we can enable this rule lints. Most of these lints I've already applied, but now they are out of preview can apply them as stable lints.

Including:
* Do not bother why typing union with Never as this gets cancelled otu
* Simplify nested Literal into a single Literal
* Properly use packaging to parse version instead of `map(int(`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/153624
Approved by: https://github.com/atalman, https://github.com/malfet
2025-05-16 19:29:16 +00:00
Laith Sakka
f7fb2f66e3 cleanup, refactor and add missing self._dde_suppressed checks (#152657)
so two things other than cleanups and refactoring
1) do not use propagate_real_tensors to resolve eval under guard_or_true/guard_or_false .
2) do not guard for dimensions of type  DimDynamic.OBLIVIOUS_SIZE under guard_or_true/guard_or_false .

Pull Request resolved: https://github.com/pytorch/pytorch/pull/152657
Approved by: https://github.com/pianpwk
2025-05-16 19:10:04 +00:00
PyTorch MergeBot
c2dda47bc5 Revert "[dynamo] Make OptimizedModule more robust in attribute reads and writes (#153637)"
This reverts commit 2ce0b66db8.

Reverted https://github.com/pytorch/pytorch/pull/153637 on behalf of https://github.com/malfet due to Looks like it broke slow tests, see cda572b053/1 ([comment](https://github.com/pytorch/pytorch/pull/153637#issuecomment-2887449037))
2025-05-16 18:49:57 +00:00
Benjamin Glass
cda572b053 codecache: Remove cpp_prefix.h duplication per build, then precompile it (#144293)
Prior to this PR, `_inductor/codegen/cpp_prefix.h` was copied into a new temporary directory on every inductor run utilizing the CPP backend (i.e. CPU-only), then included in the output source code. Instead, this PR puts it in an appropriate place in the torch includes, and includes it from there. This allows us to precompile it in cpp_wrapper and AOT inductor mode, saving significant compilation time.

Due to difficulties getting this to work in FBCode, the precompilation itself is only enabled in OSS PyTorch.

Differential Revision: [D69420620](https://our.internmc.facebook.com/intern/diff/D69420620)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144293
Approved by: https://github.com/desertfire
2025-05-16 17:41:36 +00:00
Pian Pawakapan
befb5bd52a [dynamic shapes] simplify int(x / y) pattern (#153477)
Fixes #138853

Summary: Converts `TruncToInt(IntTrueDiv(x / y))` to `x // y` if divisible, helps detect symint specializations where we didn't previously

Differential Revision: D74664734

Pull Request resolved: https://github.com/pytorch/pytorch/pull/153477
Approved by: https://github.com/bobrenjc93
2025-05-16 17:32:15 +00:00
Yulun Wang
3aa84775e7 [hipify] Replace cuda error cudaErrorContextIsDestroyed (#153576)
Summary: The cuda symbol the cuda symbol cudaErrorContextIsDestroyed is not converted to hipErrorContextIsDestroyed. Add this convertion

Test Plan: CI

Differential Revision: D74542735

Pull Request resolved: https://github.com/pytorch/pytorch/pull/153576
Approved by: https://github.com/xw285cornell, https://github.com/cyyever
2025-05-16 16:19:42 +00:00
soulitzer
a060f3d272 Rewrite autograd producer consumer stream sync logic (#151079)
Also see previous work https://github.com/pytorch/pytorch/pull/142097

Pull Request resolved: https://github.com/pytorch/pytorch/pull/151079
Approved by: https://github.com/albanD
2025-05-16 15:42:22 +00:00
Ryan Guo
2ce0b66db8 [dynamo] Make OptimizedModule more robust in attribute reads and writes (#153637)
Fixes #138157.

Differential Revision: [D74834872](https://our.internmc.facebook.com/intern/diff/D74834872)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/153637
Approved by: https://github.com/williamwen42
2025-05-16 15:17:07 +00:00
Guilherme Leobas
f66a159db5 [Set] Raise TypeError if set is called with the wrong number of arguments (#152990)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/152990
Approved by: https://github.com/anijain2305
ghstack dependencies: #150792, #152987, #152988, #152904, #152901, #152902, #152903, #152905, #152906, #152989, #152907, #152908
2025-05-16 14:28:32 +00:00
Guilherme Leobas
5a0ca65555 [Set] Add correct set/frozenset __init__ behavior (#152908)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/152908
Approved by: https://github.com/anijain2305
ghstack dependencies: #150792, #152987, #152988, #152904, #152901, #152902, #152903, #152905, #152906, #152989, #152907
2025-05-16 14:28:32 +00:00
Guilherme Leobas
053025494f [Set] Raise KeyError on empty set.pop() (#152907)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/152907
Approved by: https://github.com/anijain2305
ghstack dependencies: #150792, #152987, #152988, #152904, #152901, #152902, #152903, #152905, #152906, #152989
2025-05-16 14:28:32 +00:00
Guilherme Leobas
5964cb5eb1 [Set] Update set.union and set.update to support *args (#152989)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/152989
Approved by: https://github.com/anijain2305
ghstack dependencies: #150792, #152987, #152988, #152904, #152901, #152902, #152903, #152905, #152906
2025-05-16 14:28:32 +00:00
Guilherme Leobas
4759922c5e [Set] Add set.intersection(_update) (#152906)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/152906
Approved by: https://github.com/anijain2305
ghstack dependencies: #150792, #152987, #152988, #152904, #152901, #152902, #152903, #152905
2025-05-16 14:28:32 +00:00
Guilherme Leobas
ca96d55322 [Set] Add set.difference(_update) (#152905)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/152905
Approved by: https://github.com/anijain2305
ghstack dependencies: #150792, #152987, #152988, #152904, #152901, #152902, #152903
2025-05-16 14:28:32 +00:00
Guilherme Leobas
5c6830ced0 [Set] Raise KeyError if elem not contained in the set (#152903)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/152903
Approved by: https://github.com/anijain2305
ghstack dependencies: #150792, #152987, #152988, #152904, #152901, #152902
2025-05-16 14:28:32 +00:00
Guilherme Leobas
574f4c507a [Set] Add set.issubset and set.issuperset (#152902)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/152902
Approved by: https://github.com/anijain2305
ghstack dependencies: #150792, #152987, #152988, #152904, #152901
2025-05-16 14:28:32 +00:00
Guilherme Leobas
5926b7a38f [Set] Add set.symmetric_difference(_update) (#152901)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/152901
Approved by: https://github.com/anijain2305
ghstack dependencies: #150792, #152987, #152988, #152904
2025-05-16 14:28:32 +00:00
Guilherme Leobas
fe51ce62ca [Set] Raise TypeError if number of arguments mismatch (#152904)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/152904
Approved by: https://github.com/anijain2305
ghstack dependencies: #150792, #152987, #152988
2025-05-16 14:28:32 +00:00
Guilherme Leobas
481c345f49 [Set] Raise TypeError if argument is unhashable (#152988)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/152988
Approved by: https://github.com/anijain2305
ghstack dependencies: #150792, #152987
2025-05-16 14:28:32 +00:00
Guilherme Leobas
cf7021a0ee [Set] Handle exception in ConstantVariable operation (#152987)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/152987
Approved by: https://github.com/williamwen42, https://github.com/anijain2305
ghstack dependencies: #150792
2025-05-16 14:28:32 +00:00
Guilherme Leobas
477f13c3fb [Set] Add CPython set tests (#150792)
Tests:
* test_set.py

This PR adds test_set.py from the CPython 3.13 branch and ~400 files to test/dynamo_expected_failures. Most of these are expected to be fixed in upcoming PRs. Only minimal changes were made to test_set.py to enable compilation with Dynamo using the PYTORCH_TEST_WITH_DYNAMO=1 environment variable.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/150792
Approved by: https://github.com/anijain2305
2025-05-16 14:28:32 +00:00
Siddharth Kotapati
6592086ac3 Add metal kernel for log ops (#153398)
Move unary log ops to metal kernels
Pull Request resolved: https://github.com/pytorch/pytorch/pull/153398
Approved by: https://github.com/kulinseth, https://github.com/malfet
2025-05-16 14:25:28 +00:00
xinan.lin
8ca985b365 [Break XPU] Skip newly added test case on XPU that failed because torch._C._scatter not implemented. (#153685)
Fixes #153608
Pull Request resolved: https://github.com/pytorch/pytorch/pull/153685
Approved by: https://github.com/malfet
2025-05-16 14:15:50 +00:00
Scott Wolchok
9ccd601a14 [easy] Fix endif comments in functional_base.h (#153696)
The first one of these confused me on #152388. Happened to notice the second.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/153696
Approved by: https://github.com/Skylion007, https://github.com/malfet
2025-05-16 14:08:41 +00:00
PyTorch MergeBot
3443627e07 Revert "[BE]: Enable RUFF TRY400 rule - log.exception (#153473)"
This reverts commit 4f4ecc583e.

Reverted https://github.com/pytorch/pytorch/pull/153473 on behalf of https://github.com/jeanschmidt due to seems to have broken internal signals, @albanD may I count on you to help the author merge his PR? D74837988 ([comment](https://github.com/pytorch/pytorch/pull/153473#issuecomment-2886017075))
2025-05-16 08:29:26 +00:00
PyTorch MergeBot
86c6f71ddb Revert "[Ez][BE]: Remove accidental classvar (#153540)"
This reverts commit e0dece510b.

Reverted https://github.com/pytorch/pytorch/pull/153540 on behalf of https://github.com/jeanschmidt due to Broken internal tests, @albanD may you help the author get his PR merged? D74804063 ([comment](https://github.com/pytorch/pytorch/pull/153540#issuecomment-2886011101))
2025-05-16 08:26:37 +00:00
PyTorch MergeBot
4d073af58c Revert "[inductor][dynamo] Include operator name in size/stride/alignment assertion (#152353)"
This reverts commit 725bbb6b5f.

Reverted https://github.com/pytorch/pytorch/pull/152353 on behalf of https://github.com/jeanschmidt due to seems to have broken a few internal tests, @jansel may you help the author get his PR merged? ([comment](https://github.com/pytorch/pytorch/pull/152353#issuecomment-2885997862))
2025-05-16 08:20:39 +00:00
Robert Burke
741539a790 Split out second pass of LayerNorm for profiler attribution reasons (#153578)
Summary:
Split out second pass of LayerNorm so it's more likely to show up in
profiler output. In my testing with perf, the samples from the lambda in the
current implementation are attributed somewhat haphazardly.

Differential Revision: D74181627

Pull Request resolved: https://github.com/pytorch/pytorch/pull/153578
Approved by: https://github.com/hl475
2025-05-16 08:07:13 +00:00