Commit Graph

211 Commits

Author SHA1 Message Date
Xuehai Pan
fc0376e8b1 [BE][2/6] fix typos in test/ (test/test_*.py) (#157636)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/157636
Approved by: https://github.com/yewentao256, https://github.com/mlazos
ghstack dependencies: #156311, #156609
2025-07-09 11:02:23 +00:00
Valentine233
7597988f1b [fake tensor] fix issue of no attribute tags (#156689)
Fixes #156688

Pull Request resolved: https://github.com/pytorch/pytorch/pull/156689
Approved by: https://github.com/leslie-fang-intel, https://github.com/atalman
2025-07-03 01:16:01 +00:00
Ryan Guo
a4b59498c5 Fix fake kernel for the out=... variant of unbind_copy (#156643)
`unbind_copy(..., out=...)` returns None rather than the `out` argument
(see https://github.com/pytorch/pytorch/issues/130829#issuecomment-2283936222),
but the old fake kernel didn't account for that and caused an assertion
failure in `pushPyOutToStack`. This patch fixes that.
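
A minimal sketch of the quirk (shapes arbitrary, not code from the patch):
```
import torch

x = torch.arange(6).reshape(2, 3)
outs = (torch.empty(3, dtype=x.dtype), torch.empty(3, dtype=x.dtype))
# The out= variant fills `outs` in place but, unlike most out= ops,
# returns None rather than the out tensors.
ret = torch.unbind_copy(x, dim=0, out=outs)
assert ret is None
```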

Pull Request resolved: https://github.com/pytorch/pytorch/pull/156643
Approved by: https://github.com/zou3519, https://github.com/jansel, https://github.com/bdhirsh
ghstack dependencies: #156642
2025-06-27 01:34:07 +00:00
Ryan Guo
89aa708b39 [core] Dispatch to at::nansum_out rather than at::native::nansum_out (#156642)
Calling `at::native::nansum_out` causes the fake kernel to dispatch into a
`make_reduction` call and then segfault later, due to the
`mutable_data_ptr` call in `TensorIteratorBase::build`. It also causes a
fake tensor propagation issue in Dynamo. The added tests demonstrate both
issues.

This patch fixes it by dispatching to `at::nansum_out` instead.
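
A hedged sketch of the user-visible repro; the tests added in the PR are the authoritative version:
```
import torch

def f(x):
    out = torch.empty((), dtype=x.dtype)
    # The fake kernel for this call previously routed into
    # at::native::nansum_out, bypassing the dispatcher.
    torch.nansum(x, out=out)
    return out

torch.compile(f)(torch.randn(4))
```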

Pull Request resolved: https://github.com/pytorch/pytorch/pull/156642
Approved by: https://github.com/zou3519
2025-06-27 01:34:07 +00:00
William Wen
1c3f5e902d [dynamo] control one_graph behavior additionally through config (#154283)
`torch.compile` now always goes through `torch._dynamo._optimize`. `fullgraph` is now implemented in `torch.compile` by looking at `config.error_on_graph_break`. Export still goes through `torch._dynamo._optimize_assert`, which uses `tx.one_graph` instead of `config.error_on_graph_break`.
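
For illustration, a sketch of the user-facing contract this wires up (not code from the PR):
```
import torch

def f(x):
    torch._dynamo.graph_break()  # deliberate graph break
    return x + 1

torch.compile(f)(torch.randn(3))  # OK: compilation splits at the break
# With fullgraph=True the break is a hard error, now driven internally
# by config.error_on_graph_break:
torch.compile(f, fullgraph=True)(torch.randn(3))  # raises
```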

Pull Request resolved: https://github.com/pytorch/pytorch/pull/154283
Approved by: https://github.com/jansel, https://github.com/anijain2305
2025-06-26 21:40:38 +00:00
PyTorch MergeBot
b5c8b8d09f Revert "[dynamo] control one_graph behavior additionally through config (#154283)"
This reverts commit b46eb1ccaf.

Reverted https://github.com/pytorch/pytorch/pull/154283 on behalf of https://github.com/ezyang due to All of this is responsible for regression, see https://github.com/pytorch/pytorch/pull/156561 ([comment](https://github.com/pytorch/pytorch/pull/154283#issuecomment-2994242583))
2025-06-22 14:22:07 +00:00
William Wen
b46eb1ccaf [dynamo] control one_graph behavior additionally through config (#154283)
`torch.compile` now always goes through `torch._dynamo._optimize`. `fullgraph` is now implemented in `torch.compile` by looking at `config.error_on_graph_break`. Export still goes through `torch._dynamo._optimize_assert`, which uses `tx.one_graph` instead of `config.error_on_graph_break`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/154283
Approved by: https://github.com/jansel, https://github.com/anijain2305
2025-06-20 07:02:57 +00:00
PyTorch MergeBot
408d9884b0 Revert "[dynamo] fix set_fullgraph for nested calls (#154782)"
This reverts commit 3c8c48f793.

Reverted https://github.com/pytorch/pytorch/pull/154782 on behalf of https://github.com/atalman due to inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_do_not_trigger_dynamic_shapes_on_empty_block_mask_cuda GH job link HUD commit link ([comment](https://github.com/pytorch/pytorch/pull/154782#issuecomment-2984764330))
2025-06-18 15:47:21 +00:00
William Wen
3c8c48f793 [dynamo] fix set_fullgraph for nested calls (#154782)
- Make the `fullgraph` argument of `set_fullgraph` a positional argument
- Fix behavior on nested calls by updating `tracer.error_on_graph_break` in more places. In particular, a tracer's `error_on_graph_break` is set to the inlined tracer's `error_on_graph_break` upon the latter's exit. We also track `error_on_graph_break` in the speculation log now: if we encounter a nested graph break, we restart analysis and need to remember the `error_on_graph_break` setting after attempting to run the nested function (we don't actually trace into it during the restarted analysis).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/154782
Approved by: https://github.com/jansel
ghstack dependencies: #154283, #154289
2025-06-18 07:27:09 +00:00
Mikayla Gawarecki
38bfd462b8 Use swap_tensors path in nn.Module.to for FakeTensor (#152539)
Fixes https://github.com/pytorch/pytorch/issues/148977

Differential Revision: [D76458023](https://our.internmc.facebook.com/intern/diff/D76458023)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/152539
Approved by: https://github.com/albanD
2025-06-12 22:08:21 +00:00
Laith Sakka
43b2716e89 PYFMT lint grandfathered files 1 (#154261)
lint:
-  test/test_fake_tensor.py
-  test/test_flop_counter.py
- torch/_export/verifier.py

with the same rules as other files. It was a nightmare for me to update tests in one of the
skipped files without being able to lint them locally with `lintrunner -a` like other files.
Note that these files see active development; they are not old, untouched files.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/154261
Approved by: https://github.com/angelayi, https://github.com/Skylion007
2025-05-25 17:36:14 +00:00
Aaron Orenstein
4b7abce6a4 Fix fake tensor caching when output has unbacked (#153034)
We handle fake tensor caching in two ways:
1. If the inputs have no symbols (SymInt, etc) then we cache on the FakeTensorMode.
2. If the inputs have symbols then we cache on the ShapeEnv.

This way the symbols in the inputs and outputs are associated with the guards in place at the time of the call.

However - it's possible to have an op where there are no symbols in the inputs but there is an unbacked symbol in the output.  In this case we shouldn't cache at all because what would that really mean?

So this PR changes the caching behavior so that if there's a symbol in the output which doesn't come in some way from the input then we refuse to cache that op.
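
For illustration, an op that manufactures an unbacked symbol from a symbol-free input (a sketch, not from the PR):
```
import torch
from torch._subclasses.fake_tensor import FakeTensorMode
from torch.fx.experimental.symbolic_shapes import ShapeEnv

with FakeTensorMode(shape_env=ShapeEnv()):
    x = torch.randn(8)    # static shape: no symbols in the input
    y = torch.nonzero(x)  # shape (u0, 1): unbacked symbol, so don't cache
```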

Added a test which checks for this case.

While in there I also did a couple other related changes:
1. Added negative caching - if we see that an (op, args) failed to cache previously we don't even bother trying to cache it again.
2. Reworked the inner behavior of _cached_dispatch_impl a little to make it more clear which bits we expect to be able to throw _BypassDispatchCache and add some comments.

The latest version of this also:
1. Addresses the problem that caused #153891.
    The issue was that, with caching, ops are required to support `__eq__`. Unfortunately `_RecordFunction` is minimalistic and doesn't support that, so in the off chance that two keys hashed to the same value, the `__eq__` check would raise an exception.

    Apparently this was much more common on macOS, where memory patterns end up with more reuse (so object IDs are the same and give you the same hash value for objects that use pointer hashing).

Tested locally on macOS, where running
```
python test/inductor/test_torchinductor.py GPUTests
```
was pretty much guaranteed to fail (at least for me) somewhere around test 100-200 and passed all 800 tests after this change.

Another way to test this is to run the inductor tests with `torch._subclasses.fake_tensor._DispatchCacheKey.__hash__` monkey-patched to return a constant (causing all values to hash-collide), but this can't really be checked in since it turns the cache lookup into an O(n) scan, which takes a crazy long time to run through all the tests...

2. Folds in #153780 to ensure that exceptions raised from the op don't include the context from the cache key bypass.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/153034
Approved by: https://github.com/masnesral, https://github.com/tugsbayasgalan
2025-05-23 15:03:31 +00:00
PyTorch MergeBot
1075bb37d3 Revert "Fix fake tensor caching when output has unbacked (#153034)"
This reverts commit cb5f31a4a1.

Reverted https://github.com/pytorch/pytorch/pull/153034 on behalf of https://github.com/malfet due to Seems to have introduced flakiness in MacOS inductor tests, see https://github.com/pytorch/pytorch/issues/153891 ([comment](https://github.com/pytorch/pytorch/pull/153034#issuecomment-2893059329))
2025-05-20 06:02:38 +00:00
Aaron Orenstein
cb5f31a4a1 Fix fake tensor caching when output has unbacked (#153034)
We handle fake tensor caching in two ways:
1. If the inputs have no symbols (SymInt, etc) then we cache on the FakeTensorMode.
2. If the inputs have symbols then we cache on the ShapeEnv.

This way the symbols in the inputs and outputs are associated with the guards in place at the time of the call.

However - it's possible to have an op where there are no symbols in the inputs but there is an unbacked symbol in the output.  In this case we shouldn't cache at all because what would that really mean?

So this PR changes the caching behavior so that if there's a symbol in the output which doesn't come in some way from the input then we refuse to cache that op.

Added a test which checks for this case.

While in there I also did a couple other related changes:
1. Added negative caching - if we see that an (op, args) failed to cache previously we don't even bother trying to cache it again.
2. Reworked the inner behavior of _cached_dispatch_impl a little to make it more clear which bits we expect to be able to throw _BypassDispatchCache and add some comments.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/153034
Approved by: https://github.com/masnesral, https://github.com/tugsbayasgalan
2025-05-15 23:18:52 +00:00
PyTorch MergeBot
e6dccb036e Revert "Fix fake tensor caching when output has unbacked (#153034)"
This reverts commit 4f425a0397.

Reverted https://github.com/pytorch/pytorch/pull/153034 on behalf of https://github.com/malfet due to Broke pr_time_benchmarks, see d07fbd41e3/1 ([comment](https://github.com/pytorch/pytorch/pull/153034#issuecomment-2868100487))
2025-05-09 23:43:56 +00:00
Aaron Orenstein
4f425a0397 Fix fake tensor caching when output has unbacked (#153034)
We handle fake tensor caching in two ways:
1. If the inputs have no symbols (SymInt, etc) then we cache on the FakeTensorMode.
2. If the inputs have symbols then we cache on the ShapeEnv.

This way the symbols in the inputs and outputs are associated with the guards in place at the time of the call.

However - it's possible to have an op where there are no symbols in the inputs but there is an unbacked symbol in the output.  In this case we shouldn't cache at all because what would that really mean?

So this PR changes the caching behavior so that if there's a symbol in the output which doesn't come in some way from the input then we refuse to cache that op.

Added a test which checks for this case.

While in there I also did a couple other related changes:
1. Added negative caching - if we see that an (op, args) failed to cache previously we don't even bother trying to cache it again.
2. Reworked the inner behavior of _cached_dispatch_impl a little to make it more clear which bits we expect to be able to throw _BypassDispatchCache and add some comments.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/153034
Approved by: https://github.com/masnesral, https://github.com/tugsbayasgalan
2025-05-09 21:17:54 +00:00
PyTorch MergeBot
4f9f1abd6d Revert "Use swap_tensors path in nn.Module.to for all subclasses that override __torch_dispatch__ (#152539)"
This reverts commit 037343657e.

Reverted https://github.com/pytorch/pytorch/pull/152539 on behalf of https://github.com/wdvr due to failing internal tests - discussed with author ([comment](https://github.com/pytorch/pytorch/pull/152539#issuecomment-2846484924))
2025-05-02 06:43:35 +00:00
Animesh Jain
4649fd17b0 [invoke_subgraph] Unpacked operands (#152547)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/152547
Approved by: https://github.com/ydwu4, https://github.com/zou3519
2025-05-02 05:44:46 +00:00
Mikayla Gawarecki
037343657e Use swap_tensors path in nn.Module.to for all subclasses that override __torch_dispatch__ (#152539)
Fixes https://github.com/pytorch/pytorch/issues/148977

Pull Request resolved: https://github.com/pytorch/pytorch/pull/152539
Approved by: https://github.com/albanD
2025-05-01 18:04:33 +00:00
Animesh Jain
d743a7bd85 [invoke_subgraph] Cache fake tensor if no unbacked symint in the output (#151957)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/151957
Approved by: https://github.com/zou3519, https://github.com/bdhirsh
ghstack dependencies: #151409, #151633, #151477
2025-04-24 14:17:22 +00:00
Animesh Jain
1d73b644a8 [fake tensor cache] Support index with non bool/int8 indices (#151477)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/151477
Approved by: https://github.com/zou3519, https://github.com/bdhirsh
ghstack dependencies: #151409, #151633
2025-04-24 13:48:18 +00:00
PyTorch MergeBot
9344da8bd1 Revert "[fake tensor cache] Support index with non bool/int8 indices (#151477)"
This reverts commit bdb34f55a0.

Reverted https://github.com/pytorch/pytorch/pull/151477 on behalf of https://github.com/wdvr due to reverting confusing ghstack state ([comment](https://github.com/pytorch/pytorch/pull/151477#issuecomment-2825023953))
2025-04-23 17:30:27 +00:00
Animesh Jain
bdb34f55a0 [fake tensor cache] Support index with non bool/int8 indices (#151477)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/151477
Approved by: https://github.com/zou3519, https://github.com/bdhirsh
ghstack dependencies: #151330, #151256, #151357
2025-04-17 21:51:08 +00:00
Shangdi Yu
cfab04d01b Fix aten.div type promotion for FakeTensor (#150874)
Summary:
When we divide a FakeTensor by an integer using the fast op implementation, the type promotion should be `ELEMENTWISE_TYPE_PROMOTION_KIND.INT_TO_FLOAT` so we get a float when dividing an int FakeTensor by an integer.

```
import torch
from torch._subclasses.fake_impls import get_fast_op_impls  # assumed import path

FAST = get_fast_op_impls()
fast_div = FAST[torch.ops.aten.div.Tensor]
fast_div(fake_tensor, some_int)  # int FakeTensor / int should now yield a float
```

Test Plan:
```
python test/test_fake_tensor.py -k test_fast_div
```

Differential Revision: D72667430

Pull Request resolved: https://github.com/pytorch/pytorch/pull/150874
Approved by: https://github.com/angelayi
2025-04-09 18:52:01 +00:00
Animesh Jain
999fa15ba8 [invoke_subgraph][fake tensor cache] Add a finalizer for id hashed objects (#149667)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/149667
Approved by: https://github.com/zou3519
ghstack dependencies: #149087
2025-03-27 00:01:39 +00:00
Animesh Jain
a7596b4b34 [invoke_subgraph] Fake tensor prop caching (#149087)
Redoing https://github.com/pytorch/pytorch/pull/137808
Pull Request resolved: https://github.com/pytorch/pytorch/pull/149087
Approved by: https://github.com/zou3519
2025-03-27 00:01:39 +00:00
Mikayla Gawarecki
209977e6e5 Add information about checkpoint offset to untyped storages when torch.load under FakeTensorMode (#147787)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/147787
Approved by: https://github.com/albanD
ghstack dependencies: #147786
2025-03-06 12:04:39 +00:00
Mikayla Gawarecki
bdcc1b579b Allow torch.load under FakeTensorMode to load FakeTensors with correct devices (for plain Tensors) (#147786)
This only fixes _rebuild_tensor_v2 and _rebuild_tensor_v3
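
A sketch of the intended usage (`checkpoint.pt` is a hypothetical file):
```
import torch
from torch._subclasses.fake_tensor import FakeTensorMode

with FakeTensorMode():
    # Plain tensors come back as FakeTensors, now placed on the devices
    # recorded in the checkpoint rather than a default.
    state_dict = torch.load("checkpoint.pt")
```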

Pull Request resolved: https://github.com/pytorch/pytorch/pull/147786
Approved by: https://github.com/albanD
2025-03-06 12:04:32 +00:00
Brian Hirsh
5cda021cac support meta_tensor.to(device='cpu') under fake_mode (#146729)
Fixing this is actually a bit annoying:

(1) FakeTensorMode sees a function where all of its inputs are real tensors, so it tries to run the real compute before converting the output to a FakeTensor.

(2) We don't actually want this, because the "real compute" is supposed to error normally when you do `meta_tensor.to(device='cpu')`. Instead, we want FakeTensorMode to skip constant prop and run the normal FakeTensor implementation, which will not error.
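
A sketch of the case in question (`allow_non_fake_inputs` is my assumption here, used so the real meta tensor can enter the mode; not code from the PR):
```
import torch
from torch._subclasses.fake_tensor import FakeTensorMode

m = torch.empty(4, device="meta")
with FakeTensorMode(allow_non_fake_inputs=True):
    # Eagerly this would error (you can't copy out of a meta tensor);
    # under fake mode the FakeTensor implementation runs and succeeds.
    out = m.to(device="cpu")
```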

Pull Request resolved: https://github.com/pytorch/pytorch/pull/146729
Approved by: https://github.com/zou3519, https://github.com/SherlockNoMad, https://github.com/albanD
ghstack dependencies: #146642
2025-02-12 20:57:10 +00:00
Brian Hirsh
ec0b318ddb [poc] force UntypedStorage.from_buffer(buf) to return meta storage under FakeTensorMode (#146642)
context here: https://fb.workplace.com/groups/326136610199609/permalink/495389539940981/

This PR is an attempt to make it such that if you create a tensor from an external buffer (using `UntypedStorage.from_buffer(buf)`), we can generate a proper fake tensor for you out of the box.

The annoying bit is that there aren't any dispatcher ops to interpose on and change behavior. So instead, I took the manual C binding and tweaked the storage device to be "meta" if we see an active fake mode.

Put "poc" in the title since I... think this is hopefully reasonable, but I can be convinced that it's not :)

```
from torch._subclasses.fake_tensor import FakeTensorMode
import pickle
import io
import torch
from contextlib import nullcontext

use_fake_tensor = True
with FakeTensorMode() if use_fake_tensor else nullcontext():
    obj = [1, 2]
    f = io.BytesIO()
    pickle.Pickler(f).dump(obj)
    byte_storage = torch.ByteStorage._from_buffer(f.getvalue())  # type: ignore[attr-defined]

    t = torch.ByteTensor(byte_storage)
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/146642
Approved by: https://github.com/zou3519
2025-02-12 20:57:10 +00:00
Edward Z. Yang
90448f0128 Output of nonzero is transposed, fix fake tensor (#144695)
Needs this companion executorch PR: https://github.com/pytorch/executorch/pull/7657

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/144695
Approved by: https://github.com/bobrenjc93, https://github.com/albanD
2025-01-26 01:07:22 +00:00
Edward Z. Yang
bc62930765 Work around buggy use_const_ref_for_mutable_tensors (#145530)
See https://github.com/pytorch/pytorch/issues/145522 for context

This doesn't fix the problem with `use_const_ref_for_mutable_tensors` and the boxed wrapper; instead, it just gets all of our out kernels off of this flag so that the mutable matching pattern works correctly. I also add a check in torchgen to prevent people from making this mistake in the future.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145530
Approved by: https://github.com/albanD, https://github.com/bdhirsh
2025-01-24 14:38:49 +00:00
PyTorch MergeBot
f0a210bf5d Revert "Output of nonzero is transposed, fix fake tensor (#144695)"
This reverts commit 693d8c7e94.

Reverted https://github.com/pytorch/pytorch/pull/144695 on behalf of https://github.com/izaitsevfb due to breaking internal tests, see D68461259 ([comment](https://github.com/pytorch/pytorch/pull/144695#issuecomment-2608443589))
2025-01-22 23:04:50 +00:00
Edward Z. Yang
693d8c7e94 Output of nonzero is transposed, fix fake tensor (#144695)
Needs this companion executorch PR: https://github.com/pytorch/executorch/pull/7657

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/144695
Approved by: https://github.com/bobrenjc93, https://github.com/albanD
2025-01-21 20:50:09 +00:00
Daniel Vega-Myhre
d02c396fbb add fp8 support to index_cuda (#144747)
Fixes #133605

**Summary**

This PR adds support for FP8 data types to the `index_cuda` op.

It uses `AT_DISPATCH_V2` which is a new macro that can handle arbitrary number of dtypes, as opposed to the old implementations which had a separate macro for each possible number of dtype arguments (e.g. `AT_DISPATCH_ALL_TYPES_AND_COMPLEX_AND{2,3,4,5...}`).
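
A sketch of what this enables (assumes a CUDA device with fp8 support):
```
import torch

x = torch.randn(8, device="cuda").to(torch.float8_e4m3fn)
idx = torch.tensor([0, 3, 5], device="cuda")
y = x[idx]  # index_cuda now includes fp8 dtypes in its dispatch set
```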

**Test plan**

Updated test `index_cuda_with_cpu` in `test/test_fake_tensor.py` to have cases for all dtypes handled by `index_cuda`, including fp8 dtypes.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/144747
Approved by: https://github.com/vkuzo
2025-01-17 22:53:23 +00:00
Brian Hirsh
d7f45fc575 dynamic shape support for interpolate(antialias=True) backward (#141198)
Fixes https://github.com/pytorch/pytorch/issues/141187

Pull Request resolved: https://github.com/pytorch/pytorch/pull/141198
Approved by: https://github.com/ezyang, https://github.com/Chillee
ghstack dependencies: #141161
2025-01-16 00:08:25 +00:00
Yanan Cao (PyTorch)
ba5cacbc17 [Codemod][AddExplicitStrictExportArg] caffe2/test (#143688)
Reviewed By: avikchaudhuri

Differential Revision: D67530154

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143688
Approved by: https://github.com/tugsbayasgalan
2024-12-27 07:58:44 +00:00
Tom Ritchford
d8c8ba2440 Fix unused Python variables in test/[e-z]* (#136964)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136964
Approved by: https://github.com/justinchuby, https://github.com/albanD
2024-12-18 23:02:30 +00:00
soulitzer
e41a0b33ec Allow Fakified subclass to have different device for inner and outer tensor (#141839)
Previously, if a wrapper tensor subclass was fakified, the inner tensors would end up with the same device as the outer tensor. This PR makes it so that inner and outer tensors can have different devices.

See OffloadTensor PR https://github.com/pytorch/pytorch/pull/141840/files#diff-3bc0cf540b694f4ec0a3749f78b047456657a53a5657e495ffb68e5970c5fdaaR1955 for an application. A simpler test has been added in this PR.

This is technically bc-breaking because now the callback passed to MetaConverter needs to accept an extra argument, but no one external should be using this anyway?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/141839
Approved by: https://github.com/bdhirsh
ghstack dependencies: #141166
2024-12-03 00:09:41 +00:00
eellison
fb7148d05d Fix split decomp returning self (#140065)
Previously, the split decomp would return the input itself when there were no splits. This errors in torch.compile (or FakeTensorMode) with:

> RuntimeError: View operation returned a tensor that is the same as the input base tensor.  This is no longer allowed; you must explicitly create a new tensor (e.g., using .detach()). As a user, you could have made a mistake implementing __torch_dispatch__ or a Python operator decomposition or meta registration; if that's not the case, please report a bug to PyTorch or the backend you are using.

Fix for https://github.com/pytorch/pytorch/issues/133394
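
A sketch of the previously failing no-split case (sizes arbitrary, not code from the PR):
```
import torch

def f(x):
    # One chunk covering the whole dimension: the decomp used to return
    # `x` itself here, tripping the view-op check quoted above.
    return torch.split(x, x.size(0))

torch.compile(f)(torch.randn(4))
```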

Differential Revision: [D65635070](https://our.internmc.facebook.com/intern/diff/D65635070)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/140065
Approved by: https://github.com/bdhirsh
2024-11-13 01:58:02 +00:00
PyTorch MergeBot
7eb66173e2 Revert "Fix split decomp returning self (#140065)"
This reverts commit 9d99dceb53.

Reverted https://github.com/pytorch/pytorch/pull/140065 on behalf of https://github.com/ZainRizvi due to Diff been imported internally, but merged externally. And the internal diff has been updated so the diff and PR are now mismatched.  Reverting this PR to get things back into a consistent state. See D65635070 ([comment](https://github.com/pytorch/pytorch/pull/140065#issuecomment-2465928027))
2024-11-09 00:16:26 +00:00
eellison
9d99dceb53 Fix split decomp returning self (#140065)
Previously, the split decomp would return the input itself when there were no splits. This errors in torch.compile (or FakeTensorMode) with:

> RuntimeError: View operation returned a tensor that is the same as the input base tensor.  This is no longer allowed; you must explicitly create a new tensor (e.g., using .detach()). As a user, you could have made a mistake implementing __torch_dispatch__ or a Python operator decomposition or meta registration; if that's not the case, please report a bug to PyTorch or the backend you are using.

Fix for https://github.com/pytorch/pytorch/issues/133394

Differential Revision: [D65635070](https://our.internmc.facebook.com/intern/diff/D65635070)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/140065
Approved by: https://github.com/bdhirsh
2024-11-08 16:53:18 +00:00
Pian Pawakapan
a678eaf1ad check fake/real mismatches during real tensor prop (#137747)
Summary:
While testing exportability for PT2 Inference models, we found various cases of invalid op inputs during tracing, for example errors like `a and b must have same reduction dim`, `expected scalar type Long but found Int`, etc. Looking more closely, these happened due to the same few meta kernels & eager kernels producing mismatched outputs upstream (e.g. different output tensor dtype, int output).

This adds checks to catch mismatched outputs in real tensor prop upstream, so errors are raised at the mismatched op instead of at the downstream ops taking them as inputs. Relies a lot on utils from [CrossRefFakeMode](929797dedb/torch/_subclasses/fake_utils.py (L78))

Follow-ups: we could add more checks, and maybe add a flag to enable these only for cases like draft mode, so perf doesn't suffer.

Test Plan: test_export, test_fake_tensor

Differential Revision: D64210055

Pull Request resolved: https://github.com/pytorch/pytorch/pull/137747
Approved by: https://github.com/zou3519
2024-11-04 23:39:48 +00:00
eellison
d90717e4e2 Add option to save real tensors in TORCH_COMPILE_DEBUG repro (#138110)
This PR adds a utility that tries to construct the corresponding real tensor values of fake tensors by checking whether their meta storage is contained in the meta converter.

Then, we are able to save real tensor values for fx_graph_runnable if `TORCH_COMPILE_DEBUG_SAVE_REAL=1` is set.
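
For illustration, a hypothetical way to drive it (env vars must be set before compiling):
```
import os

os.environ["TORCH_COMPILE_DEBUG"] = "1"
os.environ["TORCH_COMPILE_DEBUG_SAVE_REAL"] = "1"

import torch

# Debug artifacts, including fx_graph_runnable with real tensor values
# where they can be recovered, land under torch_compile_debug/.
torch.compile(lambda x: x.sin())(torch.randn(4))
```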

Differential Revision: [D64502744](https://our.internmc.facebook.com/intern/diff/D64502744)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/138110
Approved by: https://github.com/ezyang
2024-10-28 16:18:22 +00:00
William Wen
92fdea8a39 remove skips due to https://github.com/pytorch/torchdynamo/issues/1991 (#138133)
Closes https://github.com/pytorch/pytorch/issues/93479. A bunch of other dynamo-wrapped tests also exhibit "torch.* returned non-Tensor output unimplemented", making the issue seem less relevant to me. Some tests are marked as xfail, as they fail for other reasons.

If these tests are indeed important, we should create a new issue to track them.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/138133
Approved by: https://github.com/ezyang
2024-10-17 17:42:46 +00:00
Yidi Wu
819d6b139c [scan] flatten subgraph output and make subgraph inputs to be a slice (#135601)
This PR introduces three changes:
1. Before this PR, the subgraph's output was `([], [])`; in this PR, we change it to a flattened list for easier codegen and consistency with other control flow operators.

2. Before this PR, the `combine_fn` of scan took a sliced input but kept the sliced dimension. For example, suppose `xs = torch.randn(3, 4, 5)` and we scan over dim 0; the `combine_fn` looked like:
```
# x.shape = (1, 4, 5) instead of (4, 5)
def combine_fn(carry, x):
    ...
```

In this PR, we fix this and also simplify some of the slicing logic.

3. This diff also makes sure we always stack `ys` on the first dimension.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135601
Approved by: https://github.com/zou3519
ghstack dependencies: #135600
2024-10-16 22:28:03 +00:00
Animesh Jain
19665f4619 [fake_tensor][cache] Supports ops with tuple of output tensors (#137935)
This is needed for invoke_subgraph work.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/137935
Approved by: https://github.com/masnesral
2024-10-15 22:15:07 +00:00
Thomas Bohnstingl
e889252493 Implementation of scan (#134102)
This operation is supposed to be the counterpart to `associative_scan`, but it can operate with non-associative functions.
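
A usage sketch, assuming the prototype's import location and `(combine_fn, init, xs)` signature (both may differ by version):
```
import torch
from torch._higher_order_ops.scan import scan  # assumed import location

def combine_fn(carry, x):
    new_carry = carry + x        # need not be associative
    return new_carry, new_carry  # (next carry, per-step output)

init = torch.zeros(4)
xs = torch.randn(3, 4)  # scanned along the leading dimension
carry, ys = scan(combine_fn, init, xs)
```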

@ydwu4

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134102
Approved by: https://github.com/ydwu4
2024-09-10 04:51:16 +00:00
xinan.lin
5707c6e952 [Fake tensor] Align the appearance of device_put op in fx_graph generated for CUDA and XPU, which is exposed in the issue #130823 (#132479)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132479
Approved by: https://github.com/EikanWang, https://github.com/zou3519, https://github.com/eellison
2024-08-09 05:31:00 +00:00
Oguz Ulgen
221350e3a4 Add None return type to init -- tests (#132352)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132352
Approved by: https://github.com/ezyang
ghstack dependencies: #132335, #132351
2024-08-01 15:44:51 +00:00