Nikita Shulga
bb680b5a87
[MPSInductor] Fix masked_fill decomp ( #152268 )
...
By adding `mps` to the list of accelerators that can work with CPU scalars
Fixes `GPUTests.test_masked_fill_promotion_mps`
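For context, a minimal sketch of the pattern the test exercises: `masked_fill` with a 0-dim CPU scalar tensor as the fill value (shown here on CPU, where it already worked; the fix allows the same pattern when the input lives on `mps`):

```python
import torch

x = torch.zeros(4)
mask = torch.tensor([True, False, True, False])
val = torch.tensor(1.5)  # 0-dim CPU scalar tensor

# with the fix, this also works when x is on "mps"
out = x.masked_fill(mask, val)
assert torch.equal(out, torch.tensor([1.5, 0.0, 1.5, 0.0]))
```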
Pull Request resolved: https://github.com/pytorch/pytorch/pull/152268
Approved by: https://github.com/kulinseth , https://github.com/dcci , https://github.com/Skylion007
ghstack dependencies: #152266
2025-04-27 15:50:46 +00:00
Pian Pawakapan
7c97720d16
[dynamic shapes] rewrite expand with guard_or_false ( #150236 )
...
Rewrites the expand decomposition to avoid unbacked errors, assuming the general path where `input shape == output shape or input shape == 1`.
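A plain-Python sketch of the per-dimension rule the rewritten decomposition assumes (the helper name is hypothetical, and only the same-rank case is shown):

```python
def expand_sizes(input_shape, target):
    # per-dim rule: each input size must equal the target or be 1 (broadcast);
    # a target of -1 keeps the input size
    out = []
    for s_in, s_out in zip(input_shape, target):
        if s_out == -1:
            out.append(s_in)
        elif s_in == s_out or s_in == 1:
            out.append(s_out)
        else:
            raise ValueError(f"cannot expand dim of size {s_in} to {s_out}")
    return out

assert expand_sizes([2, 1], [2, 3]) == [2, 3]
assert expand_sizes([2, 1], [-1, 4]) == [2, 4]
```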
Pull Request resolved: https://github.com/pytorch/pytorch/pull/150236
Approved by: https://github.com/laithsakka
2025-04-23 06:11:11 +00:00
Pian Pawakapan
54f736155b
[dynamic shapes] guard_or_false for _reshape_view_helper, utils._infer_size for wildcard dims ( #150127 )
...
For reshape/view: removes the fast paths for 0 elements and for checking which dimensions to skip. Modifies the loop accumulating input elements to raise a UserError if we run out of dimensions, graph-breaking for compile and erroring out for export.
For infer_size: assumes that if the user passes an unbacked size, it's probably not -1.
Will think about changes in https://docs.google.com/document/d/1WYx6EZwVDXtBnWyrzoecgGWdiK0V3XZKftfpWwQ5i3E/edit?tab=t.0#heading=h.22k54zym11qp in a later PR
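A plain-Python sketch of the wildcard resolution in `infer_size` (hypothetical helper; the real utility does more validation). The assumption above means an unbacked size is treated as a concrete dimension, not as the -1 wildcard:

```python
def infer_size(shape, numel):
    # resolve at most one -1 wildcard from the total element count (sketch)
    known = 1
    wildcard = None
    for i, s in enumerate(shape):
        if s == -1:
            assert wildcard is None, "only one wildcard dim allowed"
            wildcard = i
        else:
            known *= s
    out = list(shape)
    if wildcard is not None:
        assert known != 0 and numel % known == 0
        out[wildcard] = numel // known
    return out

assert infer_size([2, -1], 6) == [2, 3]
assert infer_size([3, 4], 12) == [3, 4]
```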
Pull Request resolved: https://github.com/pytorch/pytorch/pull/150127
Approved by: https://github.com/laithsakka
2025-04-23 05:42:30 +00:00
PyTorch MergeBot
e76c0b159a
Revert "[dynamic shapes] guard_or_false for _reshape_view_helper, utils._infer_size for wildcard dims ( #150127 )"
...
This reverts commit a02eae8142 .
Reverted https://github.com/pytorch/pytorch/pull/150127 on behalf of https://github.com/malfet due to Caused TestDynamoTimed.test_dynamo_timed to fail on macOS, see https://github.com/pytorch/pytorch/actions/runs/14584536979/job/40908019050 ([comment](https://github.com/pytorch/pytorch/pull/150127#issuecomment-2820081721 ))
2025-04-22 05:05:50 +00:00
Pian Pawakapan
a02eae8142
[dynamic shapes] guard_or_false for _reshape_view_helper, utils._infer_size for wildcard dims ( #150127 )
...
For reshape/view: removes the fast paths for 0 elements and for checking which dimensions to skip. Modifies the loop accumulating input elements to raise a UserError if we run out of dimensions, graph-breaking for compile and erroring out for export.
For infer_size: assumes that if the user passes an unbacked size, it's probably not -1.
Will think about changes in https://docs.google.com/document/d/1WYx6EZwVDXtBnWyrzoecgGWdiK0V3XZKftfpWwQ5i3E/edit?tab=t.0#heading=h.22k54zym11qp in a later PR
Pull Request resolved: https://github.com/pytorch/pytorch/pull/150127
Approved by: https://github.com/laithsakka
2025-04-22 01:14:15 +00:00
PyTorch MergeBot
97d97aef24
Revert "[dynamic shapes] guard_or_false for _reshape_view_helper, utils._infer_size for wildcard dims ( #150127 )"
...
This reverts commit 1dd2033c0a .
Reverted https://github.com/pytorch/pytorch/pull/150127 on behalf of https://github.com/clee2000 due to maybe caused export test to fail? export/test_draft_export.py::TestDraftExport::test_masked_linear [GH job link](https://github.com/pytorch/pytorch/actions/runs/14538768138/job/40794985504 ) [HUD commit link](1dd2033c0a ), bad TD ([comment](https://github.com/pytorch/pytorch/pull/150127#issuecomment-2816232086 ))
2025-04-18 21:38:47 +00:00
Pian Pawakapan
1dd2033c0a
[dynamic shapes] guard_or_false for _reshape_view_helper, utils._infer_size for wildcard dims ( #150127 )
...
For reshape/view: removes the fast paths for 0 elements and for checking which dimensions to skip. Modifies the loop accumulating input elements to raise a UserError if we run out of dimensions, graph-breaking for compile and erroring out for export.
For infer_size: assumes that if the user passes an unbacked size, it's probably not -1.
Will think about changes in https://docs.google.com/document/d/1WYx6EZwVDXtBnWyrzoecgGWdiK0V3XZKftfpWwQ5i3E/edit?tab=t.0#heading=h.22k54zym11qp in a later PR
Pull Request resolved: https://github.com/pytorch/pytorch/pull/150127
Approved by: https://github.com/laithsakka
2025-04-18 17:05:11 +00:00
Laith Sakka
5471e80fb4
Remove guard_size_oblivious from vector_norm decomposition. ( #148809 )
...
This PR removes the usage of guard_size_oblivious in vector_norm by inlining it into the runtime check.
This prevents any data-dependent error from ever appearing at the locations where guard_size_oblivious
used to exist; before this PR it could potentially break. This is NOT BC-breaking and does not change semantics from eager.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/148809
Approved by: https://github.com/bobrenjc93
2025-04-10 16:19:00 +00:00
FFFrog
3e0038ae85
Fix torch.matmul related out dtype check ( #148174 )
...
----
- torch.matmul -> CompositeImplicitAutograd
  -> dot_out (when left_dim == 1 & right_dim == 1)
  -> mv_out (when left_dim == 2 & right_dim == 1)
  -> mm_out (when left_dim == 1 & right_dim == 2)
  -> ...
- torch.dot
- torch.vdot
- torch.mm
- torch.mv
Related issue:
https://github.com/pytorch/pytorch/issues/138399
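A minimal illustration of the kind of mismatch the check catches, using `torch.mm` on CPU (with the fix, the meta path errors consistently with this):

```python
import torch

a = torch.randn(2, 2)
b = torch.randn(2, 2)
bad_out = torch.empty(2, 2, dtype=torch.int64)  # float result, integer out tensor

try:
    torch.mm(a, b, out=bad_out)
    raised = False
except RuntimeError:
    raised = True
assert raised  # CPU rejects the out-dtype mismatch
```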
Pull Request resolved: https://github.com/pytorch/pytorch/pull/148174
Approved by: https://github.com/jansel
2025-04-08 17:00:28 +00:00
Isuru Fernando
957faaadca
Avoid overflow in vector_norm for scalar input ( #144073 )
...
Fixes https://github.com/pytorch/pytorch/issues/143960 , where torch.dist gave different results from eager because vector_norm overflowed; eager mode avoids the overflow for single-element reductions by not computing the power and then the root.
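The overflow pattern, sketched with plain Python floats (why pow-then-root loses the single-element case):

```python
import math

big = 1e300
# norm computed as sum(|x|**2) ** 0.5 overflows on the squaring step...
naive = math.sqrt(big * big)   # big * big overflows to inf
assert math.isinf(naive)
# ...while for a single element the norm is just |x|, which eager computes directly
assert abs(big) == big
```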
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144073
Approved by: https://github.com/eellison , https://github.com/laithsakka
2025-04-07 17:10:10 +00:00
Yidi Wu
9ec9f4740c
[export] fix stft decomp and making it consistent with cpp impl. ( #149232 )
...
Summary: We change the fake impl of stft to follow its cpp implementation [here](https://github.com/pytorch/pytorch/blob/main/aten/src/ATen/native/SpectralOps.cpp#L951-L963 ) more closely, where `n_frames = 1 + (len - n_fft) / hop_length;` uses integer division.
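The frame-count formula in plain Python, mirroring the cpp expression (note the integer division):

```python
def n_frames(length, n_fft, hop_length):
    # matches the C++ "n_frames = 1 + (len - n_fft) / hop_length" (integer division)
    return 1 + (length - n_fft) // hop_length

assert n_frames(16, 8, 3) == 3   # (16 - 8) // 3 == 2, plus the initial frame
assert n_frames(16, 8, 4) == 3
```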
Test Plan: Existing tests and buck2 build --flagfile fbcode//mode/dev fbcode//executorch/examples/models/fb/llama4:speech_transform.pte
Differential Revision: D71209142
edit: we kept the original path un-changed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/149232
Approved by: https://github.com/jackzhxng
2025-03-19 18:40:35 +00:00
Sun, Jiayi
b2862f1435
optimize the decomposition of aten.native_group_norm ( #144733 )
...
Summary:
Optimize the decomposition of aten.native_group_norm. Reduce unnecessary repeated operations by changing the order of operations for `mean`, `rstd`, `weight`, `bias` and `input`, which can improve performance when `flattened_inner_size` is large.
The original decomposition:
1. compute `mean` and `rstd`,
2. out = (x - mean) * rstd, computed over the range [N, C, *],
3. out = out * weight + bias, computed over the range [N, C, *].
The new decomposition:
1. compute `mean` and `rstd`,
2. new_weight = rstd * weight, new_bias = -mean * rstd * weight + bias, computed over the range [N, C],
3. out = x * new_weight + new_bias, computed over the range [N, C, *].
I tested the Inductor performance benchmark with this PR on both CPU and A100. On CPU, two torchbench models (functorch_dp_cifar10 and opacus_cifar10) show about 25% performance improvement, and two diffusion models (Stable Diffusion and Latent Consistency Model (LCM)) show about 2% improvement. On A100, no performance gains or regressions were seen.
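The rewrite relies on a simple algebraic identity; a scalar sanity check (values are arbitrary):

```python
x, mean, rstd, weight, bias = 2.0, 1.5, 0.5, 3.0, 0.25

# original form: normalize over [N, C, *], then apply the affine over [N, C, *]
old = (x - mean) * rstd * weight + bias

# new form: fold mean/rstd into the affine parameters over [N, C] first
new_weight = rstd * weight
new_bias = -mean * rstd * weight + bias
new = x * new_weight + new_bias

assert abs(old - new) < 1e-12
```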
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144733
Approved by: https://github.com/leslie-fang-intel , https://github.com/jansel
2025-03-17 09:27:01 +00:00
cz2h
05f2cbfe19
Add meta function for out variants of ones,zeros,empty ( #149098 )
...
Opened another PR to fix merge conflicts. Fixes https://github.com/pytorch/pytorch/issues/135832
For aten.ones, aten.zeros, followed this [link](https://docs.google.com/document/d/1GgvOe7C8_NVOMLOCwDaYV1mXXyHMXY7ExoewHqooxrs/edit?tab=t.0#heading=h.64r4npvq0w0 ) to register meta functions.
For aten.empty.out, followed this [part](https://docs.google.com/document/d/1GgvOe7C8_NVOMLOCwDaYV1mXXyHMXY7ExoewHqooxrs/edit?tab=t.0#heading=h.iy9lxhxhtl5v ) to register a decomp for empty that handles the FakeTensor input.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/149098
Approved by: https://github.com/williamwen42
2025-03-14 22:17:30 +00:00
Tugsbayasgalan Manlaibaatar
5ccd659c0e
Fix decomp for linspace ( #147997 )
...
In python decompositions, we shouldn't do any non-functional operations for functional operators. This should go away once we start decomposing before functionalization.
Differential Revision: [D70265200](https://our.internmc.facebook.com/intern/diff/D70265200 )
Pull Request resolved: https://github.com/pytorch/pytorch/pull/147997
Approved by: https://github.com/zou3519
2025-03-05 22:10:08 +00:00
PyTorch MergeBot
644d84d594
Revert "optimize the decomposition of aten.native_group_norm ( #144733 )"
...
This reverts commit b533bb4b13 .
Reverted https://github.com/pytorch/pytorch/pull/144733 on behalf of https://github.com/desertfire due to Cause TIMM pass rate regression on H100, see https://hud.pytorch.org/benchmark/compilers?dashboard=torchinductor&startTime=Thu%2C%2020%20Feb%202025%2020%3A53%3A55%20GMT&stopTime=Thu%2C%2027%20Feb%202025%2020%3A53%3A55%20GMT&granularity=hour&mode=training&dtype=amp&deviceName=cuda%20(h100)&lBranch=main&lCommit=4216478250e08e950fdd090fc23a1b270c520cc4&rBranch=main&rCommit=4986f0f52eb871cdb91b8124ee162cfe622b8688 ([comment](https://github.com/pytorch/pytorch/pull/144733#issuecomment-2689092714 ))
2025-02-27 20:57:25 +00:00
Sun, Jiayi
b533bb4b13
optimize the decomposition of aten.native_group_norm ( #144733 )
...
Summary:
Optimize the decomposition of aten.native_group_norm. Reduce unnecessary repeated operations by changing the order of operations for `mean`, `rstd`, `weight`, `bias` and `input`, which can improve performance when `flattened_inner_size` is large.
The original decomposition:
1. compute `mean` and `rstd`,
2. out = (x - mean) * rstd, computed over the range [N, C, *],
3. out = out * weight + bias, computed over the range [N, C, *].
The new decomposition:
1. compute `mean` and `rstd`,
2. new_weight = rstd * weight, new_bias = -mean * rstd * weight + bias, computed over the range [N, C],
3. out = x * new_weight + new_bias, computed over the range [N, C, *].
I tested the Inductor performance benchmark with this PR on both CPU and A100. On CPU, two torchbench models (functorch_dp_cifar10 and opacus_cifar10) show about 25% performance improvement, and two diffusion models (Stable Diffusion and Latent Consistency Model (LCM)) show about 2% improvement. On A100, no performance gains or regressions were seen.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144733
Approved by: https://github.com/leslie-fang-intel , https://github.com/jansel
2025-02-26 01:42:46 +00:00
FFFrog
b0fa92042b
Fix torch.mean out dtype check ( #147188 )
...
**For CPU**:
Type promotion is supported for torch.mean
**For Meta**:
Not supported for torch.mean
ISSUE related:
https://github.com/pytorch/pytorch/issues/138399
Pull Request resolved: https://github.com/pytorch/pytorch/pull/147188
Approved by: https://github.com/albanD
2025-02-25 02:50:03 +00:00
Aaron Orenstein
db4ce78d46
PEP585: More UP006 fixes ( #146392 )
...
This should be the final PR before we can enable RUFF UP006.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/146392
Approved by: https://github.com/justinchuby , https://github.com/albanD , https://github.com/Skylion007
2025-02-20 06:18:13 +00:00
Jack Zhang
ed309b9156
Re-add stft option to align window for center = false ( #146379 )
...
Skips advancing the fc window from https://github.com/pytorch/pytorch/pull/145437 , since I just found that there were non-trivial efforts to do so a while ago that were eventually reverted: https://github.com/pytorch/pytorch/pull/73434
Works around the issue by keeping the stft sans-center overload.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/146379
Approved by: https://github.com/justinchuby , https://github.com/iseeyuan
2025-02-06 14:07:13 +00:00
Harmen Stoppels
01554c7b5a
fix incorrect literal strings / accidental tuples ( #146037 )
...
* `expr,` is short for `(expr,)`.
* String literals spanning multiple lines need to escape the newline with `\` or use `(...)`.
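Both pitfalls in a few lines:

```python
x = 1,                      # trailing comma: a one-element tuple, not the int 1
assert x == (1,)

s = ("first line "
     "second line")         # parentheses join adjacent string literals
assert s == "first line second line"
```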
Pull Request resolved: https://github.com/pytorch/pytorch/pull/146037
Approved by: https://github.com/Skylion007
2025-02-03 15:08:11 +00:00
leslie-fang-intel
9728e900dc
[Inductor][CPP] fix torch logit decomposition ( #145576 )
...
**Summary**
Fix issue https://github.com/pytorch/pytorch/issues/145379 . The current decomposition uses `self = torch.clamp(self, lo, hi)`, which gives a wrong result when `lo` is larger than `hi` compared to the eager implementation: cd68d54911/aten/src/ATen/native/cpu/UnaryOpsKernel.cpp (L165)
This PR aligns their behavior.
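A plain-Python sketch of the disagreement when `lo > hi` (the conditional form is an approximation of the eager kernel linked above, not its literal code):

```python
def clamp(x, lo, hi):
    # torch.clamp semantics: min(max(x, lo), hi) -> returns hi whenever lo > hi
    return min(max(x, lo), hi)

def eager_style(x, lo, hi):
    # conditional clamping as in the eager kernel (sketch)
    return lo if x < lo else (hi if x > hi else x)

# with lo > hi, the two forms disagree for x below lo
assert clamp(0.1, 0.8, 0.2) == 0.2
assert eager_style(0.1, 0.8, 0.2) == 0.8
```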
**Test Plan**
```
python -u -m pytest -s -v test/inductor/test_cpu_repro.py -k test_torch_logit
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145576
Approved by: https://github.com/jgong5 , https://github.com/eellison
2025-01-27 19:37:51 +00:00
Aaron Orenstein
5b5766665d
PEP585 update - torch/_C torch/_decomp torch/_lazy torch/_library torch/_numpy torch/_prims torch/_refs torch/_strobelight ( #145102 )
...
See #145101 for details.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145102
Approved by: https://github.com/bobrenjc93
ghstack dependencies: #145105
2025-01-18 20:47:12 +00:00
Tom Ritchford
46fbd63405
Fix unbind_copy and add its decomposition ( #134319 )
...
* Fixes https://github.com/pytorch/pytorch/issues/130829
Pull Request resolved: https://github.com/pytorch/pytorch/pull/134319
Approved by: https://github.com/amjames , https://github.com/eellison
2025-01-17 18:21:22 +00:00
zeshengzong
094ca3154d
Fix torch._refs.tensor error with empty list ( #143461 )
...
Fixes #143216
**Test Result**
**Before**
```python
>>> import torch
>>> torch._refs.tensor([])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/zong/code/pytorch/torch/_refs/__init__.py", line 6614, in tensor
new_tensor = _internal_new_from_data(
^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zong/code/pytorch/torch/_refs/__init__.py", line 6596, in _internal_new_from_data
tensor = _recursive_build(inferred_scalar_type, data)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zong/code/pytorch/torch/_refs/__init__.py", line 6545, in _recursive_build
return torch.stack([_recursive_build(scalarType, item) for item in seq])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: stack expects a non-empty TensorList
```
**After**
```python
>>> torch._refs.tensor([])
tensor([])
>>> torch._refs.tensor([], device='cuda')
tensor([], device='cuda:0')
```
```bash
$ pytest test/test_tensor_creation_ops.py -k test_refs_tensor
```
```bash
$ lintrunner
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143461
Approved by: https://github.com/ezyang , https://github.com/soulitzer
2025-01-08 01:29:00 +00:00
Aaron Gokaslan
e4a05dec0f
[BE][Ez]: Fix docs recommending inefficient tensor op order ( #144270 )
...
`detach().clone()` is faster than `.clone().detach()` since the gradients are not cloned. Let's update all the documentation and tests so that users do not use the inefficient op ordering.
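Both orderings produce the same detached values; the difference is that detaching first keeps the copy outside autograd entirely:

```python
import torch

x = torch.randn(3, requires_grad=True)

y = x.detach().clone()   # detach first: the clone happens outside autograd
z = x.clone().detach()   # clone first: the clone is tracked before being detached

assert not y.requires_grad and not z.requires_grad
assert torch.equal(y, z)  # same values; only the autograd bookkeeping differs
```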
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144270
Approved by: https://github.com/awgu , https://github.com/XuehaiPan
2025-01-07 17:31:32 +00:00
Aaron Orenstein
45ef3309e3
[BE] typing for decorators ( #144161 )
...
Summary:
Untyped decorators strip annotations from the decorated items.
- _compile
- _inductor/fx_passes/post_grad
- _inductor/lowering
- _library/custom_ops
- _meta_registrations
- _ops
- _refs/nn/functional
- ao/quantization/quantizer/xnnpack_quantizer_utils
- distributed/_composable/contract
- fx/experimental/graph_gradual_typechecker
- fx/experimental/migrate_gradual_types/constraint_generator
- optim/optimizer
- signal/windows/windows
- testing/_internal/common_device_type
- torch/_inductor/decomposition
- utils/flop_counter
Test Plan: unit tests
Differential Revision: D62302684
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144161
Approved by: https://github.com/Skylion007 , https://github.com/albanD
2025-01-04 16:40:09 +00:00
Tom Ritchford
dc23f1944a
Remove unused Python variables in torch/[_-a]* ( #133492 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/133492
Approved by: https://github.com/albanD
2024-12-12 17:39:14 +00:00
PyTorch MergeBot
5c97ac9721
Revert "Remove unused Python variables in torch/[_-a]* ( #133492 )"
...
This reverts commit fda975a7b3 .
Reverted https://github.com/pytorch/pytorch/pull/133492 on behalf of https://github.com/clee2000 due to Sorry, I need to revert this in order to revert something else. The only thing you need to do is rebase and remerge ([comment](https://github.com/pytorch/pytorch/pull/133492#issuecomment-2536635516 ))
2024-12-11 17:29:12 +00:00
Tom Ritchford
fda975a7b3
Remove unused Python variables in torch/[_-a]* ( #133492 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/133492
Approved by: https://github.com/albanD
2024-12-10 21:48:44 +00:00
Aaron Gokaslan
08db735629
[BE]: Update mypy to 1.13.0 ( #140808 )
...
Update mypy to 1.13.0. Should hopefully reduce linting time. Adds support for orjson cache serialization, which should improve mypy cache perf if orjson is installed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/140808
Approved by: https://github.com/ezyang , https://github.com/malfet
2024-12-03 02:50:10 +00:00
PyTorch MergeBot
daa77f3d9f
Revert "[BE]: Update mypy to 1.13.0 ( #140808 )"
...
This reverts commit 00134d68af .
Reverted https://github.com/pytorch/pytorch/pull/140808 on behalf of https://github.com/huydhn due to This is failing a distributed test in trunk, target determination missed this test and did not run it on PR ([comment](https://github.com/pytorch/pytorch/pull/140808#issuecomment-2512788426 ))
2024-12-02 20:47:43 +00:00
Aaron Gokaslan
00134d68af
[BE]: Update mypy to 1.13.0 ( #140808 )
...
Update mypy to 1.13.0. Should hopefully reduce linting time. Adds support for orjson cache serialization, which should improve mypy cache perf if orjson is installed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/140808
Approved by: https://github.com/ezyang , https://github.com/malfet
2024-12-02 18:47:54 +00:00
isalia20
37fe8015ac
softshrink nan fixes ( #138421 )
...
Fixes #138385 .
Currently contains fixes for CPU and CUDA. Will add fixes for MPS as well soon if my Mac can build it from source (had some issues building it on my Linux PC due to limited memory).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/138421
Approved by: https://github.com/mikaylagawarecki
2024-11-21 23:06:08 +00:00
Aaron Gokaslan
12e95aa4ee
[BE]: Apply PERF401 autofixes from ruff ( #140980 )
...
* Automatically applies ruff rule PERF401, turning loops into equivalent list comprehensions, which are faster and do not leak the loop variables into the enclosing scope.
* List comprehensions not only often have better typing, but also carry 50+% less interpreter overhead than for loops. They preserve length information and are easier for the interpreter to optimize.
* Manually went back and made mypy happy after the change.
* Also fixed style lints in files covered by flake8 but not by pyfmt.
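The shape of the autofix, side by side:

```python
# loop form flagged by PERF401
squares = []
for i in range(5):
    squares.append(i * i)

# equivalent comprehension: less interpreter overhead, no leaked loop variable
squares2 = [i * i for i in range(5)]

assert squares == squares2 == [0, 1, 4, 9, 16]
```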
Pull Request resolved: https://github.com/pytorch/pytorch/pull/140980
Approved by: https://github.com/justinchuby , https://github.com/malfet
2024-11-20 17:52:07 +00:00
Xinran / Allan Rui
f23d034826
[PyTorch Decomp] Allow native_layernorm decomp to align [mean, rstd] output dtypes with input dtype for MTIA backend ( #141025 )
...
Summary: As title
Test Plan: CI
Differential Revision: D66169328
Pull Request resolved: https://github.com/pytorch/pytorch/pull/141025
Approved by: https://github.com/bdhirsh
2024-11-20 01:58:08 +00:00
Brian Hirsh
9ae19ffbed
fix layer_norm decomp precision for cpu ( #140557 )
...
xref: https://fb.workplace.com/groups/1075192433118967/posts/1540519826586223/?comment_id=1543752356262970&reply_comment_id=1544425069529032
the issue is that our decomp needs to branch on device (it only upcasts for cpu), but the device shows up as "meta" because it is registered as a meta tensor rule.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/140557
Approved by: https://github.com/ezyang
2024-11-19 02:31:31 +00:00
Yukio Siraichi
435286e985
Fix unary references' out dtype check. ( #140288 )
...
Tracking issue: #138399
This PR fixes a number of reference implementations (which are also used as meta
functions), making them more consistent with the CPU device. More specifically, it fixes the
operations that use the `_make_elementwise_unary_reference` decorator and that don't error on a
mismatching out-argument dtype, while they do error when using concrete devices (e.g. CPU).
The fixed operations are:
- `abs`
- `ceil`
- `floor`
- `frac`
- `isneginf`
- `isposinf`
- `sgn`
- `sign`
- `signbit`
- `trunc`
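For example, the out-dtype mismatch these refs now reject, shown via CPU behavior (using `abs`, one of the ops in the list above):

```python
import torch

x = torch.randn(3)
bad_out = torch.empty(3, dtype=torch.int64)  # float op, integer out tensor

try:
    torch.abs(x, out=bad_out)
    raised = False
except RuntimeError:
    raised = True
assert raised  # CPU errors on the mismatch; the refs/meta functions now match
```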
Pull Request resolved: https://github.com/pytorch/pytorch/pull/140288
Approved by: https://github.com/ezyang
ghstack dependencies: #140186 , #140286
2024-11-18 23:05:29 +00:00
PyTorch MergeBot
38645e8a3e
Revert "Fix unbind_copy and add its decomposition ( #134319 )"
...
This reverts commit 8aedc649bd .
Reverted https://github.com/pytorch/pytorch/pull/134319 on behalf of https://github.com/huydhn due to Sorry for reverting your PR, but this is still failing the same test on ExecuTorch ([comment](https://github.com/pytorch/pytorch/pull/134319#issuecomment-2443209139 ))
2024-10-29 04:54:37 +00:00
Mwiza Kunda
c2ded9ec0d
Fix dot reference checks ( #138596 )
...
The dot reference implementation should be consistent with the CPU/CUDA implementations since it may be used for meta dispatch, e.g.:
```python
import torch
x = torch.tensor([1,2,3], dtype=torch.float32)
y = torch.tensor([4,5,6], dtype=torch.float16)
x.dot(y)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
RuntimeError: dot : expected both vectors to have same dtype, but found Float and Half
```
However the below does not raise an exception
```python
x.to("meta").dot(y.to("meta"))
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/138596
Approved by: https://github.com/bdhirsh
2024-10-28 19:11:40 +00:00
Tom Ritchford
8aedc649bd
Fix unbind_copy and add its decomposition ( #134319 )
...
* Fixes https://github.com/pytorch/pytorch/issues/130829
Pull Request resolved: https://github.com/pytorch/pytorch/pull/134319
Approved by: https://github.com/amjames , https://github.com/eellison
2024-10-23 19:13:44 +00:00
Tom Ritchford
1bc73f3157
Add decomposition for permute_copy ( #130944 )
...
* Extracted from #129476
Pull Request resolved: https://github.com/pytorch/pytorch/pull/130944
Approved by: https://github.com/amjames , https://github.com/eellison
2024-10-23 17:42:11 +00:00
PyTorch MergeBot
7b39fb5712
Revert "Fix unbind_copy and add its decomposition ( #134319 )"
...
This reverts commit 9f81270d75 .
Reverted https://github.com/pytorch/pytorch/pull/134319 on behalf of https://github.com/clee2000 due to breaking some executorch tests D64568664 ([comment](https://github.com/pytorch/pytorch/pull/134319#issuecomment-2423157700 ))
2024-10-18 20:09:40 +00:00
Tom Ritchford
9f81270d75
Fix unbind_copy and add its decomposition ( #134319 )
...
* Fixes https://github.com/pytorch/pytorch/issues/130829
Pull Request resolved: https://github.com/pytorch/pytorch/pull/134319
Approved by: https://github.com/amjames , https://github.com/eellison
2024-10-17 21:27:35 +00:00
PyTorch MergeBot
4b3035f2fe
Revert "Add decomposition for permute_copy ( #130944 )"
...
This reverts commit e7a4ad3b40 .
Reverted https://github.com/pytorch/pytorch/pull/130944 on behalf of https://github.com/clee2000 due to breaking internal builds D64418214 cc @digantdesai @GregoryComer to help get this fixed and remerged ([comment](https://github.com/pytorch/pytorch/pull/130944#issuecomment-2418125356 ))
2024-10-16 23:18:53 +00:00
angelayi
f1c741dbe9
Fixes GuardOnDataDependentSymNode error in masked_fill ( #137060 )
...
Fixes [P1621441513](https://www.internalfb.com/phabricator/paste/view/P1621441513 ) ([ref to internal post](https://fb.workplace.com/groups/6829516587176185/posts/1051474609896021/?comment_id=1055262166183932&reply_comment_id=1056583932718422 ))
Pull Request resolved: https://github.com/pytorch/pytorch/pull/137060
Approved by: https://github.com/ezyang
2024-10-16 18:16:33 +00:00
Tom Ritchford
e7a4ad3b40
Add decomposition for permute_copy ( #130944 )
...
* Extracted from #129476
Pull Request resolved: https://github.com/pytorch/pytorch/pull/130944
Approved by: https://github.com/amjames , https://github.com/eellison
2024-10-15 13:51:20 +00:00
chilli
2b329d3bf1
Fix typo in _normalize ref ( #137079 )
...
I think this should basically make no difference numerically, but it does have some ramifications on things like CSE.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/137079
Approved by: https://github.com/Skylion007
ghstack dependencies: #136826 , #137043 , #137049 , #137065
2024-10-02 19:06:48 +00:00
Tom Ritchford
b85f21fc1d
Add decomposition for squeeze_copy ( #130941 )
...
* Extracted from #128416
Pull Request resolved: https://github.com/pytorch/pytorch/pull/130941
Approved by: https://github.com/amjames , https://github.com/eellison
ghstack dependencies: #136653
2024-10-01 10:23:22 +00:00
Isuru Fernando
f276da7f98
Remove prims.slice_in_dim and prims.slice ( #136150 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136150
Approved by: https://github.com/ezyang
2024-09-23 01:27:22 +00:00
PyTorch MergeBot
462b727d1e
Revert "Add decomposition for permute_copy ( #130944 )"
...
This reverts commit ab9a7eadd3 .
Reverted https://github.com/pytorch/pytorch/pull/130944 on behalf of https://github.com/jeanschmidt due to Broke internal signal executorch.backends.xnnpack.test.ops.permute.TestPermute, more details on D62737086. @eellison could you please help get this PR merged to main? ([comment](https://github.com/pytorch/pytorch/pull/130944#issuecomment-2355846394 ))
2024-09-17 13:42:55 +00:00