Commit Graph

796 Commits

Author SHA1 Message Date
Ryan Guo
4f75f1e80d [dynamo] Use proper item source for NamedTupleVariable (#142437)
Dynamo was generating `GetItemSource(tuple_source, index)` for items of
`NamedTupleVariable`, but that stops working when a user supplied named
tuple has a custom `__getitem__` function with different semantics.

This patch
- fixes the aforementioned issue by using `AttrSource` instead.
- handles named tuple outside `wrap_listlike`, by removing the special
  case of named tuple in `BaseListVariable.cls_for_instance`, since the
  semantics of named tuple is different enough.
- makes sure all constructions of `NamedTupleVariable` have items with
  proper sources.

Fixes #142399.
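
The difference between an index-based and an attribute-based source can be seen in plain Python, without Dynamo. The sketch below is only an illustration: `Reversed` is a hypothetical user class, and it assumes CPython 3.8 or newer, where named tuple fields are read through a C-level getter that does not call `__getitem__`.

```python
from collections import namedtuple

Base = namedtuple("Base", ["x", "y"])


class Reversed(Base):
    """Hypothetical named tuple whose indexing counts from the end."""

    def __getitem__(self, index):
        return tuple.__getitem__(self, len(self) - 1 - index)


p = Reversed(x=1, y=2)
by_index = p[0]  # goes through the custom __getitem__
by_attr = p.x    # field access bypasses __getitem__
print(by_index, by_attr)
```

Indexing and attribute access disagree for this class, which is why a source built from an index can reload the wrong item.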

Pull Request resolved: https://github.com/pytorch/pytorch/pull/142437
Approved by: https://github.com/jansel
2024-12-10 19:23:48 +00:00
Ryan Guo
a45326b649 [dynamo] Support multiple inheritance for custom dict construction (#142416)
This patch applies a local and practical workaround for custom dict
construction when multiple inheritance is involved.

Handling multiple inheritance in general could be a lot more involved,
so I created #142414 to track that.

Fixes #141118.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/142416
Approved by: https://github.com/jansel
2024-12-10 19:22:15 +00:00
Xuehai Pan
0bd7b7ae58 Add version check for C++ pytree availability (#142299)
Resolves #142256

Pull Request resolved: https://github.com/pytorch/pytorch/pull/142299
Approved by: https://github.com/jansel, https://github.com/weifengpy
2024-12-08 06:27:32 +00:00
Ryan Guo
aab0f32ea4 [dynamo] Properly handle != under user-defined __eq__ (#142078)
Previously Dynamo modelled `object.__ne__` as just a comparison over value
identity; however, in CPython the default `!=` dispatches to `__eq__`,
which might have been overridden by the user. This patch fixes the behavior
divergence.

Fixes #142055.
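
The CPython behavior referred to above can be checked without Dynamo. This is a minimal sketch; `Point` is a hypothetical class that overrides `__eq__` only and inherits the default `__ne__`.

```python
class Point:
    """Hypothetical user class that overrides only __eq__."""

    def __init__(self, x):
        self.x = x

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return self.x == other.x


a, b = Point(1), Point(1)
distinct_objects = a is not b  # two separate instances
not_equal = a != b             # default __ne__ inverts the result of __eq__
print(distinct_objects, not_equal)
```

An identity-based model of `!=` would report the two instances as unequal, while CPython consults the user-defined `__eq__` first.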

Pull Request resolved: https://github.com/pytorch/pytorch/pull/142078
Approved by: https://github.com/jansel, https://github.com/zou3519
2024-12-06 08:06:53 +00:00
Yuanhao Ji
3baf8859e6 [Dynamo] Replace torch._dynamo.optimize() with torch.compile() [4/N] (#140253)
related commits:

- #139706
- #140238
- #140247
- #140253
- #140663
- #140688

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140253
Approved by: https://github.com/soulitzer
2024-12-05 00:30:00 +00:00
Bob Ren
9286c21b22 Fix fbcode tests for automatic dynamic unspecialize float (#141975)
Differential Revision: D66708552

Pull Request resolved: https://github.com/pytorch/pytorch/pull/141975
Approved by: https://github.com/bdhirsh, https://github.com/atalman
2024-12-03 23:59:06 +00:00
Xuehai Pan
78543e6002 [dynamo][pytree][1/N] make CXX pytree traceable: tree_iter / tree_leaves (#137397)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/137397
Approved by: https://github.com/jansel
2024-12-03 11:17:39 +00:00
Ryan Guo
0efd184685 [dynamo] Fix side effects for range iterator that escapes the graph (#141716)
`wrap_range_iterator` mistakenly used `ValueMutationNew`, when it
should've used `ValueMutationExisting`, because this code path always
has a source.
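
As a rough illustration of why an iterator that existed before tracing must have its mutation applied to the original object, consider the plain-Python behavior that compiled code has to preserve. The function name below is hypothetical and nothing here uses Dynamo internals.

```python
def advance_and_return(it):
    """Consume one element and hand the same iterator back to the caller."""
    next(it)
    return it


it = iter(range(3))
returned = advance_and_return(it)
remaining = list(it)  # the caller observes that one element is already gone
print(returned is it, remaining)
```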

Pull Request resolved: https://github.com/pytorch/pytorch/pull/141716
Approved by: https://github.com/jansel
ghstack dependencies: #141713, #141714, #141715, #141902
2024-12-03 09:18:06 +00:00
Ryan Guo
7c3c8a662e [dynamo] Add RANGE_ITERATOR_MATCH to properly guard on range iterators (#141902)
A subsequent patch attempts to fix a side-effect issue for range
iterators, which in turn exposed an existing issue on guards for range
iterators -- the following test started failing:
```
PYTORCH_TEST_WITH_DYNAMO=1 python test/test_tensor_creation_ops.py TestTensorCreationCPU.test_hstack_column_stack_cpu_int16
```

This patch adds a `RANGE_ITERATOR_MATCH` guard to make sure that we
properly guard on range iterators, and adds a regression test.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/141902
Approved by: https://github.com/jansel
ghstack dependencies: #141713, #141714, #141715
2024-12-03 09:18:06 +00:00
Ryan Guo
ff3f4a164c [dynamo] Fix aliasing issue for dict.copy that escapes the graph (#141715)
Dynamo accidentally passed the original `ConstDictVariable.source` to
the result of `dict.copy(...)`, which caused an aliasing issue when the
result escapes the graph (e.g., is a return value).

This patch fixes that and adds a regression test.
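
The eager semantics that the fix restores can be sketched in plain Python. The function name is hypothetical; the point is only that the copy returned to the caller must be a separate object from the input.

```python
def copy_and_extend(d):
    """Return a modified copy; the input dict must stay untouched."""
    c = d.copy()
    c["added"] = 1
    return c


original = {"k": 0}
result = copy_and_extend(original)
print(result is original, original, result)
```

If the result were reconstructed from the source of the original dict, the two names would alias and the write would leak into `original`.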

Pull Request resolved: https://github.com/pytorch/pytorch/pull/141715
Approved by: https://github.com/jansel
ghstack dependencies: #141713, #141714
2024-12-03 09:18:06 +00:00
Ryan Guo
9eb0520d75 [dynamo] Fix side-effect handling for pre-existing collections.deque (#141714)
Previously we never replayed side effects to `DequeVariable` with a
source; the bug was already in the `test_deque_input` test, but went
unnoticed because we didn't check the deque objects.

This patch adds limited but practical support for this (see comments in
`side_effects.py` for why limited), and updates the deque tests to check
for this.
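
A check of the kind the updated tests perform might look like the following plain-Python sketch. The function name is hypothetical, and the real tests compare eager and compiled runs rather than asserting on a single call.

```python
from collections import deque


def push_both_ends(dq):
    """Mutate a deque that was created by the caller."""
    dq.append("right")
    dq.appendleft("left")
    return len(dq)


dq = deque(["mid"])
n = push_both_ends(dq)
print(n, list(dq))
```

Checking only the return value would miss a compiled version that forgets to replay the two mutations onto `dq`.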

Pull Request resolved: https://github.com/pytorch/pytorch/pull/141714
Approved by: https://github.com/jansel
ghstack dependencies: #141713
2024-12-03 09:18:06 +00:00
Ryan Guo
e14d8c980f [dynamo][NFC] Rename NewCellVariable to CellVariable (#141628)
It was named `NewCellVariable` because we originally used it to
represent cells created by the code Dynamo is tracing through. However, now we
use it to represent pre-existing cells as well, so this patch renames it
to avoid confusion.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/141628
Approved by: https://github.com/williamwen42, https://github.com/jansel
2024-12-02 19:09:30 +00:00
PyTorch MergeBot
9012e7a62f Revert "[dynamo][pytree][1/N] make CXX pytree traceable: tree_iter / tree_leaves (#137397)"
This reverts commit 07850bb2c1.

Reverted https://github.com/pytorch/pytorch/pull/137397 on behalf of https://github.com/atalman due to Failing internal test ([comment](https://github.com/pytorch/pytorch/pull/137397#issuecomment-2511934283))
2024-12-02 16:05:14 +00:00
Bob Ren
2f72635a5c automatic dynamic unspecialize float (#141647)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/141647
Approved by: https://github.com/ezyang
2024-11-29 22:36:53 +00:00
PyTorch MergeBot
9e98b3d73c Revert "automatic dynamic unspecialize float (#141647)"
This reverts commit 1a32daeb17.

Reverted https://github.com/pytorch/pytorch/pull/141647 on behalf of https://github.com/atalman due to functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_inner_grad [GH job link](https://github.com/pytorch/pytorch/actions/runs/12080983316/job/33697901875) [HUD commit link](1a32daeb17) ([comment](https://github.com/pytorch/pytorch/pull/141647#issuecomment-2507980876))
2024-11-29 15:00:33 +00:00
Bob Ren
1a32daeb17 automatic dynamic unspecialize float (#141647)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/141647
Approved by: https://github.com/ezyang
2024-11-29 07:53:53 +00:00
Ryan Guo
3141e038f0 [dynamo] Fix VariableBuilder._wrap on frozenset and enforce invariants on ConstantVariable (#141504)
Prior to this patch, we were using `ConstantVariable.create` to create a VT
for frozenset objects, and intended yet failed to predicate that on all
items being literals (see https://github.com/pytorch/pytorch/pull/140984#discussion_r1847393736).

The code was from https://github.com/pytorch/torchdynamo/commit/7c03434 and
the original goal was to help DBR quantization, but as the new test in
this patch shows, it could lead to silent incorrectness.

Upon a closer look, this exposes some subtleties in how Dynamo handles
`ConstantVariable` and `LOAD_CONST`, so this patch both fixes the
aforementioned issue and documents, enforces, and makes explicit the
invariants around `ConstantVariable` and `LOAD_CONST` -- only immutable
objects are supported.

Specifically, this patch:
1. refines the checks for wrapping a `frozenset` object, documents why we
   can't just wrap its items directly due to lack of `Source` for set
   items, and uses a safe workaround (`SourcelessBuilder`) to ensure
   soundness while keeping the DBR quantization support.
2. adds more types to `common_constant_types`, thereby making
   `ConstantVariable.is_base_literal` more lenient, and strictly checks
   this property in the constructor of `ConstantVariable`.
3. changes relevant uses of `create_instruction("LOAD_CONST", ...)` to
   `create_load_const`, which checks `is_safe_constant`, and makes
   developer overrides explicit by using `create_load_const_unchecked`
   when needed.
4. in a few places, uses a more specific `VariableTracker`, e.g.,
   `TypingVariable` rather than `ConstantVariable`, and
   `FrozensetVariable` rather than `SetVariable`.

(2) and (3) are mainly to future-proof Dynamo against bugs like (1).
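
To illustrate why "all items are literals" is a meaningful precondition, here is a small plain-Python sketch. `LITERAL_TYPES`, `all_items_are_literals`, and `Box` are hypothetical names for this example only; they are not Dynamo's `common_constant_types` or its actual check.

```python
# Illustrative set of immutable literal types (an assumption for this sketch).
LITERAL_TYPES = (int, float, bool, str, bytes, type(None))


class Box:
    """Hashable by identity, yet its state can change after creation."""

    def __init__(self, value):
        self.value = value


def all_items_are_literals(fs):
    """Return True only if every item is an immutable literal."""
    return all(type(item) in LITERAL_TYPES for item in fs)


plain = frozenset({1, "a", None})
with_object = frozenset({Box(1)})
print(all_items_are_literals(plain), all_items_are_literals(with_object))
```

A frozenset is immutable as a container, but as `Box` shows, its items need not be, so treating every frozenset as a constant can bake in stale state.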

Pull Request resolved: https://github.com/pytorch/pytorch/pull/141504
Approved by: https://github.com/jansel
2024-11-27 21:58:35 +00:00
Xuehai Pan
07850bb2c1 [dynamo][pytree][1/N] make CXX pytree traceable: tree_iter / tree_leaves (#137397)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/137397
Approved by: https://github.com/jansel
ghstack dependencies: #141360
2024-11-27 00:21:58 +00:00
Isuru Fernando
44186a0a4e Move Sympy printers to torch/utils/_sympy/printers.py (#140597)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/140597
Approved by: https://github.com/ezyang, https://github.com/anijain2305
2024-11-26 18:11:00 +00:00
Ryan Guo
99a0e2b1a1 [dynamo] Trace through dataclasses by removing it from BUILTIN_SKIPLIST (#141294)
Fixes #141261.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/141294
Approved by: https://github.com/williamwen42, https://github.com/jansel
2024-11-26 17:05:23 +00:00
Yanbo Liang
dcd16bdc21 [Dynamo][autograd.Function] Use fake tensor prop to infer fwd output (#136184)
Fixes #129963

Pull Request resolved: https://github.com/pytorch/pytorch/pull/136184
Approved by: https://github.com/zou3519
2024-11-26 01:10:08 +00:00
Ryan Guo
583484b726 [dynamo] Fix and simplify handling of Set.update method (#141286)
The old implementation of `SetVariable.call_method("update", ...)` was
incorrect because it wouldn't handle iterable inputs. This patch
removes the input type restriction altogether, and implements the method
as a polyfill (like how most of the other set methods are handled).

Fixes #141283.
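
A polyfill in this sense is an ordinary Python function with the same observable behavior as the built-in method. The following is a minimal sketch of what one for `set.update` could look like; the name `set_update` is hypothetical and this is not necessarily the code added by the patch.

```python
def set_update(target, *others):
    """Add every item of every iterable in `others` to `target`, in place."""
    for iterable in others:
        for item in iterable:
            target.add(item)


s = {1}
set_update(s, [2, 3], "ab", (n for n in range(2)))

# The built-in method, for comparison.
reference = {1}
reference.update([2, 3], "ab", (n for n in range(2)))
print(s == reference)
```

Because the loop only relies on iteration, lists, strings, and generators all work without any type restriction on the inputs.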

Pull Request resolved: https://github.com/pytorch/pytorch/pull/141286
Approved by: https://github.com/anijain2305
2024-11-26 00:41:50 +00:00
PyTorch MergeBot
ad37afd590 Revert "Always unspecialize float in OSS (#138922)"
This reverts commit ba5253da9b.

Reverted https://github.com/pytorch/pytorch/pull/138922 on behalf of https://github.com/yf225 due to perf regression on torchbench ([comment](https://github.com/pytorch/pytorch/pull/138922#issuecomment-2499277511))
2024-11-26 00:03:03 +00:00
Bob Ren
ba5253da9b Always unspecialize float in OSS (#138922)
Fixes https://github.com/pytorch/pytorch/issues/107277

Pull Request resolved: https://github.com/pytorch/pytorch/pull/138922
Approved by: https://github.com/ezyang

Co-authored-by: Edward Z. Yang <ezyang@meta.com>
2024-11-24 01:58:13 +00:00
Jason Ansel
83116ec90c [dynamo] Fix fbcode flakey test from asyncio warning (#141399)
Summary: This was failing with a `/usr/local/fbcode/platform010/lib/python3.10/asyncio/events.py:666: DeprecationWarning` that seems unrelated.

Test Plan:
```
buck2 test 'fbcode//mode/opt' fbcode//caffe2/test/dynamo:test_dynamo -- --exact 'caffe2/test/dynamo:test_dynamo - test_misc.py::InlineInbuiltNNModulesMiscTests::test_numpy_readonly_inline_inbuilt_nn_modules' --run-disabled
```

Differential Revision: D66394773

Pull Request resolved: https://github.com/pytorch/pytorch/pull/141399
Approved by: https://github.com/yanboliang
2024-11-23 18:16:50 +00:00
PyTorch MergeBot
a8c90e5140 Revert "Always unspecialize float in OSS (#138922)"
This reverts commit 6d779d0549.

Reverted https://github.com/pytorch/pytorch/pull/138922 on behalf of https://github.com/huydhn due to Sorry for reverting your change but there is some slow tests failing after this land ([comment](https://github.com/pytorch/pytorch/pull/138922#issuecomment-2495076878))
2024-11-22 23:18:36 +00:00
Bob Ren
6d779d0549 Always unspecialize float in OSS (#138922)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/138922
Approved by: https://github.com/ezyang

Co-authored-by: Edward Z. Yang <ezyang@meta.com>
2024-11-22 17:54:42 +00:00
PyTorch MergeBot
f23621ec56 Revert "Move Sympy printers to torch/utils/_sympy/printers.py (#140597)"
This reverts commit c25b201583.

Reverted https://github.com/pytorch/pytorch/pull/140597 on behalf of https://github.com/huydhn due to Trunk is sad again after this lands, this looks like a landrace this time, so please do a rebase ([comment](https://github.com/pytorch/pytorch/pull/140597#issuecomment-2494052978))
2024-11-22 15:43:39 +00:00
Isuru Fernando
c25b201583 Move Sympy printers to torch/utils/_sympy/printers.py (#140597)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/140597
Approved by: https://github.com/ezyang, https://github.com/anijain2305
2024-11-22 02:04:36 +00:00
Animesh Jain
fa63276691 [user empathy day][dynamo] Support get on subclassed dicts (#141214)
Fixes https://github.com/pytorch/pytorch/issues/141138 but we need to do
a more exhaustive job of going through dict methods and checking each one
of them.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/141214
Approved by: https://github.com/Skylion007, https://github.com/jansel
ghstack dependencies: #141209
2024-11-21 21:18:42 +00:00
PyTorch MergeBot
701e06b643 Revert "Move Sympy printers to torch/utils/_sympy/printers.py (#140597)"
This reverts commit aefcdb3c9f.

Reverted https://github.com/pytorch/pytorch/pull/140597 on behalf of https://github.com/huydhn due to Sorry for reverting your change but I think it fails inductor/test_padding in trunk. This is a target determination miss and that failed test was not run in your PR ([comment](https://github.com/pytorch/pytorch/pull/140597#issuecomment-2489641453))
2024-11-20 22:13:57 +00:00
Isuru Fernando
aefcdb3c9f Move Sympy printers to torch/utils/_sympy/printers.py (#140597)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/140597
Approved by: https://github.com/ezyang, https://github.com/anijain2305
2024-11-20 20:26:49 +00:00
Animesh Jain
f4ce9ac29d [dynamo] Dont erase the cache line on invalidation (#140821)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/140821
Approved by: https://github.com/jansel
2024-11-19 19:11:10 +00:00
Ryan Guo
ac6684ebbc [dynamo] Identify pre-existing captured cells by cell id rather than content id (#140436)
In `match_nested_cell`, Dynamo tried to identify pre-existing captured
cells by `(cell_name, id(cell_contents))`. This works in most cases, but
as the test added in this patch shows, it's not a complete solution.
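
One way the `(cell_name, id(cell_contents))` key can be ambiguous is shown in the plain-Python sketch below. The function names are hypothetical, and the scenario is a simplified assumption rather than the exact case from the regression test.

```python
def make_reader(value):
    x = value

    def read():
        return x  # captures the cell named "x"

    return read


shared = object()
f, g = make_reader(shared), make_reader(shared)
cell_f, cell_g = f.__closure__[0], g.__closure__[0]

# Same variable name and same contents, so the old-style key is identical...
same_key = ("x", id(cell_f.cell_contents)) == ("x", id(cell_g.cell_contents))
# ...even though these are two different cell objects.
distinct_cells = cell_f is not cell_g
print(same_key, distinct_cells)
```

Keying on `id(cell)` instead keeps the two cells apart.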

This patch
1. changes `match_nested_cell` to `lookup_variable_for_captured_cell`,
   and does the lookup based on id of cell objects, not their contents.
   This requires plumbing a tuple of captured cell objects from
   different CPython versions all the way to
   `InstructionTranslator.__init__`, where we store a mapping from the
   ids of these cell objects, and use it later in
   `UserFunctionVariable.bind_args` to look for these unboxed cells.
2. builds off (1) -- rather than using a `VariableTracker` that
   represents the content of the unboxed cells, use `ClosureVariable`,
   which enables codegen in case these cells escape as closure of a
   `NestedUserFunctionVariable`.

The patch adds a regression test for each of the scenarios above:
1. `test_write_to_cells_with_name_shadowing` where Dynamo mistakenly
   thought the program is writing to a cell captured by root frame (which
   it doesn't support atm), which resulted in
```
  File "/Users/ryanguo99/Documents/work/pytorch/torch/_dynamo/symbolic_convert.py", line 3340, in STORE_DEREF
    unimplemented("write to __closure__ while inlining")
  File "/Users/ryanguo99/Documents/work/pytorch/torch/_dynamo/exc.py", line 313, in unimplemented
    raise Unsupported(msg, case_name=case_name)
torch._dynamo.exc.Unsupported: write to __closure__ while inlining
```
2. `test_existing_func_that_creates_capturing_nested_func` where Dynamo
   ended up trying to codegen a `NestedUserFunctionVariable` that
   captures a cell which was also captured by the root frame, so the cell was
   unboxed and Dynamo ended up emitting `LOAD_DEREF` rather than
   `LOAD_FAST/LOAD_CLOSURE` during codegen, resulting in
```
  File "/Users/ryanguo99/Documents/work/pytorch/torch/_dynamo/variables/functions.py", line 105, in _create_nested_fn
    func = FunctionType(code, f_globals, name, defaults, closure)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: arg 5 (closure) expected cell, found int
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140436
Approved by: https://github.com/jansel, https://github.com/williamwen42
ghstack dependencies: #140330, #140152
2024-11-15 17:17:30 +00:00
Yuanhao Ji
8a80cee2f3 [Dynamo] Replace torch._dynamo.optimize() with torch.compile() [3/N] (#140247)
related commits:

- #139706
- #140238
- #140247
- #140253

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140247
Approved by: https://github.com/soulitzer
2024-11-13 05:51:42 +00:00
Ryan Guo
d34d5ccec5 [dynamo] Fix some corner cases for modeling pre-existing cells (#140150)
In `UserFunctionVariable.bind_args`, there's a rare case when the
underlying function satisfies all conditions below
1. The function captures a pre-existing cell
2. The cell isn't captured by root frame
3. `UserFunctionVariable.source` is `None`

In such cases, Dynamo would model the cell as its content (just like
what we do for cells in the root frame). However, this could break in
two cases:
- We could have multiple instances of `UserFunctionVariable`, where some
  have source and others don't. This means sometimes we'll model the
  cell as a `NewCellVariable`, and sometimes as its content. This
  causes issues because writes to the `NewCellVariable` would be
  buffered in `SideEffects` and never get picked up by the other
  modeling.
- Only when `UserFunctionVariable` has a source, do we check whether we
  already had a `NewCellVariable` for the captured cell. This again causes
  Dynamo to potentially have multiple representations for the same cell
  object, resulting in a similar "buffered writes not reflected" issue
  as above.

This patch fixes the above 2 issues by
1. modeling captured cells of sourceless `UserFunctionVariable` as
   immutable `NewCellVariable`, and adds a few lines in `SideEffects` to
   account for its immutability.
2. always checking whether we already had a `NewCellVariable` for the
   captured cell, before constructing a new one.

Tests are added for each aforementioned case.

I also left a TODO to investigate why exactly we would lose source
information for `UserFunctionVariable`. Some cases are easily fixable,
but others not so much.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140150
Approved by: https://github.com/jansel
ghstack dependencies: #140035, #140036, #140149
2024-11-13 03:14:23 +00:00
Ryan Guo
6a821c9e6a [dynamo] Remove cell unboxing/restart optimization (#140149)
We added an unboxing optimization to avoid writes to cells that existed
before Dynamo tracing (such writes interfere with HOPs). However, the
avoided write shouldn't be there in the first place, since we were
basically creating an empty `NewCellVariable`, and then writing the
pre-existing content into the variable.

This patch
1. adds logic to bypass the initial write for pre-existing cells
   without undermining correctness.
2. removes the unboxing optimization and the restart code path.

Fixes #137456, #138491; also see those issues for more historical
context.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140149
Approved by: https://github.com/ezyang, https://github.com/jansel
ghstack dependencies: #140035, #140036
2024-11-13 03:14:23 +00:00
Ryan Guo
698ff07323 [dynamo] Fix name collision bug for captured cells and locals (#140036)
The `export_freevars` method was introduced very early on, for
propagating writes to unboxed cells from child to parent frame, see
https://github.com/pytorch/torchdynamo/commit/d0c10341.

However, it's no longer needed after we started to modify root tracer's
`symbolic_locals` directly for the unboxed cells, see
https://github.com/pytorch/torchdynamo/commit/663e4d92.

As a result, we no longer need `export_freevars`. In fact, it can cause
a very subtle bug when name collision happens across the parent and
child frames during inlining, because the parent frame isn't necessarily
the frame that defined the cell captured by child frame.

In summary, this patch removes the `export_freevars` bits, and adds a
regression test.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140036
Approved by: https://github.com/williamwen42, https://github.com/jansel
ghstack dependencies: #140035
2024-11-13 03:14:23 +00:00
Bob Ren
4488e23763 Fix another item memo loss location + bool specialization bug (#139587)
This fix was a bit more involved:
1) It fixes an item_memo loss location.
2) It updates a test to use eager instead of aot_eager, since the test reveals a very obscure bug related to replacements that's not worth solving: in practice Inductor will regenerate the runtime asserts anyway.
3) It updates tensorify to specialize in more places now that the aforementioned bug is fixed.

Fixes `PYTORCH_OPINFO_SAMPLE_INPUT_INDEX=6 python test/inductor/test_torchinductor_opinfo.py TestInductorOpInfoCPU.test_comprehensive_linalg_norm_cpu_float16` when `specialize_float=False`

while ensuring `python test/dynamo/test_dynamic_shapes.py DynamicShapesMiscTests.test_runtime_assert_replacement_dynamic_shapes` doesn't regress

Pull Request resolved: https://github.com/pytorch/pytorch/pull/139587
Approved by: https://github.com/ezyang
ghstack dependencies: #139569, #139457, #139568, #139572, #139846, #139454, #139896, #139935
2024-11-09 03:11:19 +00:00
Michael Lazos
ea0f60ecfa [Dynamo] allow dynamic callables on tensor variables (#137940)
Fixes https://github.com/pytorch/pytorch/issues/134844

Pull Request resolved: https://github.com/pytorch/pytorch/pull/137940
Approved by: https://github.com/williamwen42
2024-11-08 23:49:34 +00:00
Animesh Jain
738bfff5f9 [dynamo][user-defined] Fix bugs with method descriptors (#139856)
Should fix some problems in https://github.com/pytorch/pytorch/pull/138080

Pull Request resolved: https://github.com/pytorch/pytorch/pull/139856
Approved by: https://github.com/jansel
2024-11-06 23:16:40 +00:00
Michael Lazos
d622b490d6 [Dynamo] Support tensor mro without source (#139838)
Fixes https://github.com/pytorch/pytorch/issues/137743

The issue here is that if `type` was called on a tensor without a source, we wouldn't have a source even for `torch.Tensor`, and the `__mro__` retrieval would fail. Since `torch.Tensor` is an internal torch type, I add handling for it in `call_type` in builtins.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/139838
Approved by: https://github.com/williamwen42
2024-11-06 08:52:53 +00:00
Laith Sakka
a787320d0f Do not try to optimize new implications in get_implications (#139738)
Summary:
Saves around 8% on the torchrec model.
In most cases the new implications are not optimized anyway; in some cases they are,
but optimizing them is useless.

ex:
```
generating implications for Eq(Mod(s0, 3), 0)
adding Eq(Mod(s0, 3), 0)
adding Eq(0, Mod(s0, 3))
adding Ne(Mod(s0, 3), 0)
adding Ne(0, Mod(s0, 3))
adding Mod(s0, 3) <= 0
adding 0 < Mod(s0, 3)
adding True
adding False
```

VS
```
generating implications for Eq(Mod(s0, 3), 0)
adding Eq(Mod(s0, 3), 0)
adding Eq(0, Mod(s0, 3))
adding Ne(Mod(s0, 3), 0)
adding Ne(0, Mod(s0, 3))
adding Mod(s0, 3) <= 0
adding 0 < Mod(s0, 3)
adding 0 <= Mod(s0, 3)
adding Mod(s0, 3) < 0
```
The main difference is that `0 <= Mod(s0, 3)` can be simplified to `True` and `Mod(s0, 3) < 0` to `False`, but with this change
this won't happen. The `True: True` and `False: False` entries are useless anyway, so this should be fine.
```
buck2 run fbcode//mode/opt fbcode//torchrec/distributed/tests:pt2_compile_benchmark -- --num-features=1000
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/139738
Approved by: https://github.com/ezyang
ghstack dependencies: #139703
2024-11-06 00:23:40 +00:00
PyTorch MergeBot
b6b9596607 Revert "[dynamo] Fix constant propagation in builtins and UserClasses (#131354)"
This reverts commit 44257c063e.

Reverted https://github.com/pytorch/pytorch/pull/131354 on behalf of https://github.com/huydhn due to Sorry for reverting your change, but it seems to break some internal tests ([comment](https://github.com/pytorch/pytorch/pull/131354#issuecomment-2451050605))
2024-11-01 00:13:20 +00:00
Laith Sakka
6a1c451479 Don't uselessly recompute axiom dict every static eval call (#138967)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/138967
Approved by: https://github.com/ezyang
2024-10-31 21:16:55 +00:00
PyTorch MergeBot
87f1990697 Revert "Don't uselessly recompute axiom dict every static eval call (#138967)"
This reverts commit 24b695ae2d.

Reverted https://github.com/pytorch/pytorch/pull/138967 on behalf of https://github.com/ZainRizvi due to Sorry, looks like this PR introduced a failure that was incorrectly classified as flaky, and the log classifier didn't identify the right log line either ([comment](https://github.com/pytorch/pytorch/pull/138967#issuecomment-2450228525))
2024-10-31 15:54:18 +00:00
Laith Sakka
24b695ae2d Don't uselessly recompute axiom dict every static eval call (#138967)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/138967
Approved by: https://github.com/ezyang
2024-10-31 07:46:35 +00:00
Tom Ritchford
44257c063e [dynamo] Fix constant propagation in builtins and UserClasses (#131354)
* Fixes https://github.com/pytorch/pytorch/issues/118675
* Replaces https://github.com/pytorch/pytorch/pull/118994

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131354
Approved by: https://github.com/jansel, https://github.com/anijain2305
2024-10-30 12:47:20 +00:00
Xuehai Pan
9bbe4a67ad [dynamo] support maxlen for collections.deque (#138194)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/138194
Approved by: https://github.com/jansel, https://github.com/malfet
2024-10-30 10:08:02 +00:00
Simon Fan
99608ceed6 Scoped extension building for C++ backed custom ops tests (#136695)
FIXES #125579 #131103 #133197 #133283 #134738 #135369 #135685

Tests that create C++ extensions can cause flakiness in CI due to library namespace conflict and test ordering. We can build them in temp dirs to ensure isolation.

An alternative is to build these as part of the build process and have build time errors.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/136695
Approved by: https://github.com/zou3519
2024-10-26 07:41:00 +00:00
Ryan Guo
f14247d5aa [dynamo] Accurately identify mutated cells captured by multiple functions (#138632)
This patch changes `mutated_closure_cell_contents: Set[str]` to
`mutated_closure_cell_ids: Set[int]` so that Dynamo can more accurately
identify closure cells across different instances of
`UserFunctionVariable`. This prevents Dynamo from mistakenly treating a
cell as immutable even though it will be mutated when referenced as a
closure cell from another function.

More context in
https://github.com/pytorch/pytorch/issues/138112#issuecomment-2420580779.

Fixes #138112.
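
The situation described here, one cell referenced by several functions where only some of them write to it, looks like this in plain Python. The names are hypothetical and the sketch does not involve Dynamo.

```python
def make_counter():
    count = 0

    def inc():
        nonlocal count
        count += 1  # writes to the shared cell

    def read():
        return count  # only reads, yet sees writes made through inc

    return inc, read


inc, read = make_counter()
shared_cell = inc.__closure__[0] is read.__closure__[0]
inc()
value = read()
print(shared_cell, value)
```

Looking at `read` alone, the cell appears immutable; identifying it by its id makes it possible to notice that `inc` mutates the same cell.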

Pull Request resolved: https://github.com/pytorch/pytorch/pull/138632
Approved by: https://github.com/jansel
ghstack dependencies: #138639
2024-10-26 02:17:07 +00:00
Ryan Guo
0a4197490c Delay mul/pow expansion for _SympyT to enable more folding (#138235)
Instead of calling `safe_expand` right after symbolic expression construction, we invoke it in `ShapeEnv.simplify`. This enables more simplification with product form, e.g.,
```
(a + b)^2 / (a + b) --> (a + b)
```
which won't happen if we expand eagerly during product construction:
```
(a^2 + 2ab + b^2) / (a + b) --> no change
```

Fixes #136044.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/138235
Approved by: https://github.com/ezyang
2024-10-21 16:38:47 +00:00
Isuru Fernando
4f45a052ad Fix try_solve for s1*s2 == 0 when both symbols are unknown (#137919)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/137919
Approved by: https://github.com/ezyang
2024-10-20 23:33:08 +00:00
Ryan Guo
59158f640c [dynamo] Support equality comparison between Tensor and None (#138289)
This patch updates the `wrap_fx_proxy_cls` function to allow boolean output when the operation is one of
`supported_const_comparison_op_values`.

Fixes #120907.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/138289
Approved by: https://github.com/williamwen42
2024-10-18 17:49:26 +00:00
Adnan Akhundov
809ff3b274 Add host-side Triton TMA support to Dynamo (#137677)
This adds Dynamo tracing support for the host-side Triton TMA API (see `create_2d_tma_descriptor` calls on the host in the [Triton tutorial](https://triton-lang.org/main/getting-started/tutorials/09-persistent-matmul.html#sphx-glr-getting-started-tutorials-09-persistent-matmul-py)). A few notes:

- Here we assume the availability of the host-side TMA API added to upstream Triton in https://github.com/triton-lang/triton/pull/4498. As of the time of writing, this is not a part of the PT2 OSS Triton pin (although back-ported internally). The OSS Triton pin update should be done in December 2024.
- To capture the chain of calls `t.data_ptr() --> create_{1d,2d}_tma_descriptor(ptr, ...) --> kernel[grid](tma_desc, ...)`, we add three new variable trackers: `DataPtrVariable`, `CreateTMADescriptorVariable` (for the function), `TMADescriptorVariable` (for TMA descriptor object). This is to maintain the path back from the Triton kernel to the Tensor from which the TMA descriptor has been created.
- The newly introduced variables have `reconstruct` methods used in case of graph breaks.
- The `tma_descriptor_metadata` extracted from the captured `create_{1d,2d}_tma_descriptor` calls is propagated through the HOPs in Dynamo and AOTAutograd to be used by the downstream compiler (e.g., Inductor). See the unit tests for what the captured HOP arguments look like.
- In the Dynamo-captured fx graph, we replace the TMA descriptor arguments of the Triton kernel by the underlying Tensors, to be able to track the input/output relationships in terms of Tensors.
- In the Triton kernel mutation analysis pass (in AOTAutograd), we use the `tt.experimental_descriptor_store` TTIR op to detect mutations of the underlying tensors via TMA descriptors, so that downstream AOTAutograd can perform functionalizations as required.
- JIT Inductor and AOT Inductor support will be implemented in follow-up PRs.

Differential Revision: [D64404928](https://our.internmc.facebook.com/intern/diff/D64404928)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/137677
Approved by: https://github.com/zou3519
2024-10-16 02:18:48 +00:00
Xuehai Pan
1d6932937e [dynamo] fix NamedTupleVariable for PyStructSequence (torch.return_types.*) support (#137776)
PyStructSequence is the C API equivalent for `collections.namedtuple` in Python. But they have different constructors:

```python
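nt = NamedTupleType(*args)        # namedtuple: fields as positional arguments
nt = NamedTupleType._make(args)   # namedtuple: fields from a single iterable
ss = StructSequenceType(args)     # PyStructSequence: a single sequence argument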
```
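
The constructor difference can be tried with standard-library types. In this sketch `time.struct_time` stands in for a `torch.return_types.*` class, since both are struct sequences; `Point` is a hypothetical named tuple.

```python
import time
from collections import namedtuple

Point = namedtuple("Point", ["x", "y"])
args = (1, 2)
from_positional = Point(*args)     # fields passed as separate arguments
from_iterable = Point._make(args)  # fields passed as one iterable

# A struct sequence takes its fields as one sequence argument.
fields = (2024, 10, 13, 6, 46, 41, 6, 287, 0)
st = time.struct_time(fields)

# Passing the fields positionally, namedtuple-style, is rejected.
try:
    time.struct_time(*fields)
    positional_rejected = False
except TypeError:
    positional_rejected = True

print(from_positional == from_iterable, st.tm_year, positional_rejected)
```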

Pull Request resolved: https://github.com/pytorch/pytorch/pull/137776
Approved by: https://github.com/jansel
2024-10-13 06:46:41 +00:00
William Wen
93bbc8abcc [dynamo, 3.13] use 3.13 multiline traceback in get_instruction_source_311 (#137617)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/137617
Approved by: https://github.com/jansel
2024-10-10 20:19:27 +00:00
Ryan Guo
dd7c2899bd [dynamo] Properly prune dead cell local variables (#136891)
This patch updates the `prune_dead_locals` logic to prune cell local variables slightly more aggressively in the absence of side effects; e.g., a cell variable can be pruned when its user function(s) will never be used again.

See added tests for examples; note that a few tests in `test/dynamo/test_higher_order_ops.py` also got updated because we are no longer returning the unnecessary graph output.

Fixes #127350, #124653

Pull Request resolved: https://github.com/pytorch/pytorch/pull/136891
Approved by: https://github.com/jansel, https://github.com/anijain2305, https://github.com/williamwen42, https://github.com/zou3519
2024-10-10 18:21:24 +00:00
Ryan Guo
394c143e4e [dynamo] Fix error when inlining certain nested closure returned by another function (#137510)
See `test_inline_closure_returned_by_another_function_and_captures` and #136814 for more context.

In #90286, we introduced an optimization so that for captured cells that are unmodified during a Dynamo trace, `UserFunctionVariable` will represent them as a variable holding the cell's actual value, rather than a `NewCellVariable`.

Later on we introduced more mechanisms to model such cells across function calls (#104222), and across function calls where `NestedUserFunctionVariable::bind_args` needs to look up further in the parent frames (#106491) to find these cells' values.

This patch removes `InlinedClosureVariable` in favor of a simpler modelling, which is also more consistent with what was introduced in #90286, i.e., just model these cells as their contents, in `symbolic_locals`.

This fixes #136814 because resolution of `InlinedClosureVariable` to the underlying cell content value happens in `NestedUserFunctionVariable::bind_args`, which requires Dynamo to have the value in scope at the function call site (when Dynamo does inlining), but that is not always the case (as the test case shows). However, if we model the cells in `symbolic_locals`, we never need such resolution, and the values are directly stored into the `NestedUserFunctionVariable::closure` upon the function creation, at which point Dynamo always has the cell value in `symbolic_locals` for lookup.
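
For readers unfamiliar with closure cells, here is a minimal sketch of the plain CPython behaviour the reasoning above relies on. It does not touch Dynamo internals, and `make_adder` and `add` are hypothetical names.

```python
def make_adder(n):
    # `n` is captured by `add`, so CPython stores it in a cell.
    def add(x):
        return x + n

    return add


add_three = make_adder(3)

# The cells are attached to the function object when it is created,
# so the captured value is available without consulting the call site.
cell = add_three.__closure__[0]
captured = cell.cell_contents
result = add_three(4)
```

Because the cell is bound at function creation, the captured value can be read from the function object at any later call site, which matches the point where the patch stores the cell contents.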

Fixes #136814.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/137510
Approved by: https://github.com/williamwen42
2024-10-09 18:13:57 +00:00
William Wen
a6707a7303 [dynamo] log all graph breaks to graph_breaks logging artifact (#137244)
We were previously not logging all graph breaks (e.g. data dependent jumps) to the graph_breaks logging artifact.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/137244
Approved by: https://github.com/jansel
2024-10-07 22:34:27 +00:00
PyTorch MergeBot
af64c44b56 Revert "Don't uselessly recompute axiom dict every static eval call (#135429)"
This reverts commit 1d6e0412f5.

Reverted https://github.com/pytorch/pytorch/pull/135429 on behalf of https://github.com/ezyang due to try again ([comment](https://github.com/pytorch/pytorch/pull/135429#issuecomment-2384288879))
2024-09-30 22:29:13 +00:00
Edward Z. Yang
9dbc6bacff Propagate detailed location information of shape guards to guards/recompiles output (#136917)
To see the payoff, look at test/dynamo/test_logging.py

The general idea is to refactor produce_guards into produce_guards_verbose which also returns verbose code parts, which have our annotations.

The rest of the logic is plumbing around SLocs to the places they need to be so we can print them. Guards are easy; value ranges and duck sizing take more care.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/136917
Approved by: https://github.com/anijain2305
2024-09-30 00:43:12 +00:00
Edward Z. Yang
1d6e0412f5 Don't uselessly recompute axiom dict every static eval call (#135429)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135429
Approved by: https://github.com/isuruf
2024-09-28 20:59:59 +00:00
PyTorch MergeBot
e5228a7771 Revert "Don't uselessly recompute axiom dict every static eval call (#135429)"
This reverts commit 507c69e20f.

Reverted https://github.com/pytorch/pytorch/pull/135429 on behalf of https://github.com/malfet due to It(or it's parent) broke trunk CI, see 507c69e20f ([comment](https://github.com/pytorch/pytorch/pull/135429#issuecomment-2379422971))
2024-09-27 14:33:25 +00:00
Edward Z. Yang
507c69e20f Don't uselessly recompute axiom dict every static eval call (#135429)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135429
Approved by: https://github.com/isuruf
ghstack dependencies: #135137
2024-09-27 04:03:25 +00:00
Edward Z. Yang
11fd55827d Make CLOSURE_VARS construction lazy (#136599)
This makes us less likely to hit import cycle problems with torch

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/136599
Approved by: https://github.com/anijain2305
2024-09-26 16:50:13 +00:00
PyTorch MergeBot
9223c16208 Revert "Fix constant propagation in builtins and UserClasses (#131354)"
This reverts commit dd4a51b39a.

Reverted https://github.com/pytorch/pytorch/pull/131354 on behalf of https://github.com/atalman due to Breaks torchrec tests ([comment](https://github.com/pytorch/pytorch/pull/131354#issuecomment-2375417145))
2024-09-25 23:01:03 +00:00
Tom Ritchford
dd4a51b39a Fix constant propagation in builtins and UserClasses (#131354)
* Fixes https://github.com/pytorch/pytorch/issues/118675
* Replaces https://github.com/pytorch/pytorch/pull/118994

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131354
Approved by: https://github.com/jansel, https://github.com/anijain2305
2024-09-25 13:03:40 +00:00
Tom Ritchford
e3ea5429f2 Implement GetAttrVariable.as_python_constant() (#134216)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/134216
Approved by: https://github.com/amjames, https://github.com/williamwen42
2024-09-20 03:44:43 +00:00
Jan Wieczorek
908a5689eb Return unsafe_view instead of view from matmul when folding occurs (#134568)
When tensor folding occurs during a matmul operation, the returned tensor is a view. This can cause issues when matmul is used inside a custom function and the view is then returned as output: it cannot be modified in place, which causes errors.
This is especially problematic when an in-place allreduce is performed after such a function.
The issue is resolved by returning unsafe_view from matmul instead. This aligns the matmul decomposition with the eager implementation, so that a non-view tensor is returned.

Test included in this PR reproduces the issue.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134568
Approved by: https://github.com/zou3519
2024-09-19 11:52:16 +00:00
Michael Lazos
14cabdf626 [Dynamo] Support thread local setattr (#135443)
In preparation for tracing through DeviceContext (defb515306/torch/utils/_device.py (L66))
This PR adds support for calling the setattr of thread local objects. These objects have a slots impl, and since this doesn't appear to have any side effects, we call this setattr impl when replaying mutations, since calling `object.__setattr__` on these objects results in a type error.
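
The type error mentioned above can be seen outside of Dynamo. A minimal sketch, assuming CPython and using `threading.local` as the thread local object (the attribute name `device` is hypothetical):

```python
import threading

tl = threading.local()

# Ordinary attribute assignment goes through the type's own setattr.
tl.device = "cpu"

# Bypassing it with object.__setattr__ is rejected by CPython,
# because the thread local type overrides setattr at the C level.
try:
    object.__setattr__(tl, "device", "cuda")
    bypass_ok = True
except TypeError:
    bypass_ok = False

# Calling the setattr defined by the object's own type works.
type(tl).__setattr__(tl, "device", "cuda")
```

This is why replaying a mutation on such an object has to call the type's own setattr rather than `object.__setattr__`.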

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135443
Approved by: https://github.com/anijain2305
ghstack dependencies: #134732, #133137
2024-09-14 18:52:22 +00:00
PyTorch MergeBot
46f5037007 Revert "[Dynamo] Support thread local setattr (#135443)"
This reverts commit 149d0b7161.

Reverted https://github.com/pytorch/pytorch/pull/135443 on behalf of https://github.com/mlazos due to broke python test/quantization/pt2e/test_numeric_debugger.py TestNumericDebugger.test_re_export_preserve_handle modified yesterday ([comment](https://github.com/pytorch/pytorch/pull/134732#issuecomment-2350937008))
2024-09-14 10:02:55 +00:00
Michael Lazos
149d0b7161 [Dynamo] Support thread local setattr (#135443)
In preparation for tracing through DeviceContext (defb515306/torch/utils/_device.py (L66))
This PR adds support for calling the setattr of thread local objects. These objects have a slots impl, and since this doesn't appear to have any side effects, we call this setattr impl when replaying mutations, since calling `object.__setattr__` on these objects results in a type error.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135443
Approved by: https://github.com/anijain2305
ghstack dependencies: #134732, #133137
2024-09-14 02:40:52 +00:00
PyTorch MergeBot
3f30360d05 Revert "[Dynamo] Support thread local setattr (#135443)"
This reverts commit 30b007bea3.

Reverted https://github.com/pytorch/pytorch/pull/135443 on behalf of https://github.com/albanD due to Broke tests on main ([comment](https://github.com/pytorch/pytorch/pull/134732#issuecomment-2348886378))
2024-09-13 12:52:58 +00:00
Michael Lazos
30b007bea3 [Dynamo] Support thread local setattr (#135443)
In preparation for tracing through DeviceContext (defb515306/torch/utils/_device.py (L66))
This PR adds support for calling the setattr of thread local objects. These objects have a slots impl, and since this doesn't appear to have any side effects, we call this setattr impl when replaying mutations, since calling `object.__setattr__` on these objects results in a type error.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135443
Approved by: https://github.com/anijain2305
ghstack dependencies: #134732, #133137
2024-09-13 08:41:07 +00:00
Bob Ren
dd47f6f623 Simplify expr before getting implications in _maybe_evaluate_static (#135499)
Fixes #134268

Previously we weren't simplifying these expressions before calling get_implications, resulting in inconsistent application of FloorDiv/CleanDiv. See #134268  for more details.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135499
Approved by: https://github.com/ezyang
2024-09-11 19:48:29 +00:00
PyTorch MergeBot
3ab12e2596 Revert "[Dynamo] Support thread local setattr (#135443)"
This reverts commit 160c228a4b.

Reverted https://github.com/pytorch/pytorch/pull/135443 on behalf of https://github.com/clee2000 due to something in this stack broke functorch/test_control_flow.py::TestControlFlow::test_scan_simple_graph [GH job link](https://github.com/pytorch/pytorch/actions/runs/10804912306/job/29980571390) [HUD commit link](444b52ff40), newly added test yesterday ([comment](https://github.com/pytorch/pytorch/pull/135443#issuecomment-2344042800))
2024-09-11 15:53:55 +00:00
Michael Lazos
160c228a4b [Dynamo] Support thread local setattr (#135443)
In preparation for tracing through DeviceContext (defb515306/torch/utils/_device.py (L66))
This PR adds support for calling the setattr of thread local objects. These objects have a slots impl, and since this doesn't appear to have any side effects, we call this setattr impl when replaying mutations, since calling `object.__setattr__` on these objects results in a type error.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135443
Approved by: https://github.com/anijain2305
ghstack dependencies: #134732, #133137
2024-09-11 04:18:22 +00:00
rzou
82d00acfee Allow cross-device copies for cpu scalars in refs (#135140)
This copies our eager-mode behavior where someone can do torch.add(a, b, out=c)
where a and b are CPU scalar tensors and c is a CUDA tensor.

Fixes https://github.com/pytorch/pytorch/issues/121619 by side effect (we get into a situation where we're writing a CPU scalar into a FakeTensor that is actually a meta tensor)

Test Plan:
- new test

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135140
Approved by: https://github.com/williamwen42, https://github.com/yanboliang
2024-09-05 19:08:48 +00:00
Animesh Jain
32f45f01a9 [dynamo] Retire CompileProfiler (#135133)
Fixes confusion in https://github.com/pytorch/pytorch/issues/113443

We have TORCH_LOGS that supersedes CompileProfiler

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135133
Approved by: https://github.com/ezyang
ghstack dependencies: #135039, #135121, #135129, #135130
2024-09-05 01:08:40 +00:00
Michael Lazos
d9ae92cd6e [Dynamo] Support for proxying frozen dataclasses (#134846)
Fixes https://github.com/pytorch/pytorch/issues/133858

Details: Previously Dynamo would treat dataclasses as UserDefinedVariables. This was undesirable when we want to proxy the value into the graph, which is needed for TensorSubclassMetadata. To rectify this, frozen dataclasses can now be proxied similarly to NamedTuples. We require the object to be frozen, because if arbitrary mutation were allowed, we would need to replay those mutations in the graph after construction of the object.

For tracing construction of the variable, the generated `__init__` for the dataclass uses `object.__setattr__` because frozen dataclasses throw errors on the usual `__setattr__` invocation. With this treatment, no special handling is needed in dynamo for frozen dataclass construction.
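
The behaviour described above can be checked in plain Python. This is a minimal sketch with a hypothetical `Meta` dataclass; the manual construction at the end only imitates what the generated `__init__` does and is not code from the patch.

```python
import dataclasses


@dataclasses.dataclass(frozen=True)
class Meta:  # hypothetical frozen dataclass
    shape: tuple
    requires_grad: bool = False


m = Meta(shape=(2, 3))

# Regular attribute assignment on a frozen instance raises.
try:
    m.shape = (4, 5)
    mutated = True
except dataclasses.FrozenInstanceError:
    mutated = False

# The generated __init__ sidesteps this by using object.__setattr__;
# the same effect can be reproduced by hand.
m2 = object.__new__(Meta)
object.__setattr__(m2, "shape", (2, 3))
object.__setattr__(m2, "requires_grad", False)
```

Here `m2` compares equal to `m`, since both end up with the same field values.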

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134846
Approved by: https://github.com/bdhirsh, https://github.com/anijain2305
2024-09-04 22:17:00 +00:00
rzou
d7b57c4d63 Fix tensor.data access under inference_mode and compile (#134878)
Fixes https://github.com/pytorch/pytorch/issues/134798

In the regular Tensor case, when you call Tensor.data, there's a check
for if inference mode is active. If it is active, then we don't set the
version counter. We replicate this check for Tensor Subclasses (the bug
was we were trying to set the version counter on a FakeTensor in
inference_mode).

Test Plan:
- new test

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134878
Approved by: https://github.com/bdhirsh
2024-09-04 17:55:41 +00:00
Laith Sakka
6c3767452d Move auto functionalize tests in their own test file (#134834)
title + use `with torch.library._scoped_library as lib` when needed.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134834
Approved by: https://github.com/zou3519
ghstack dependencies: #134831
2024-09-03 17:09:03 +00:00
Chen Haifeng
27ffa67984 Support __class__ attr for tuple and list variables (#134099)
Fixes #134086

This adds support for the `__class__` attribute on TupleVariable and ListVariable, and allows constructing a tuple or list through the `__class__` attribute. This patch also fixes a bug in NamedTupleVariable, which was missing a `return` when calling the super `var_getattr`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134099
Approved by: https://github.com/anijain2305, https://github.com/jansel
2024-08-30 01:57:49 +00:00
Xuehai Pan
70853b792a [dynamo][itertools] support itertools.tee (#133771)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/133771
Approved by: https://github.com/jansel
ghstack dependencies: #133801
2024-08-29 13:36:52 +00:00
PyTorch MergeBot
f65df5edae Revert "[dynamo][itertools] support itertools.tee (#133771)"
This reverts commit 1dbd3476de.

Reverted https://github.com/pytorch/pytorch/pull/133771 on behalf of https://github.com/ZainRizvi due to Sorry, have to revert this in order to be able to revert https://github.com/pytorch/pytorch/pull/133769 ([comment](https://github.com/pytorch/pytorch/pull/133771#issuecomment-2316611158))
2024-08-29 02:49:30 +00:00
Yanbo Liang
97c8a0739e [Dynamo] Support inspect.signature.Parameter getattr (#134636)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/134636
Approved by: https://github.com/Chillee, https://github.com/anijain2305
2024-08-28 09:59:41 +00:00
Bob Ren
1ba39ec1d0 Add test case test_arange_length_with_float32_dtype (#134415)
Adding a test as a followup from https://github.com/pytorch/pytorch/pull/134296

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134415
Approved by: https://github.com/ezyang
2024-08-27 21:36:23 +00:00
Xuehai Pan
1dbd3476de [dynamo][itertools] support itertools.tee (#133771)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/133771
Approved by: https://github.com/jansel
2024-08-27 00:08:04 +00:00
PyTorch MergeBot
3d7f3f6a55 Revert "[dynamo][itertools] support itertools.tee (#133771)"
This reverts commit 0e49b2f18e.

Reverted https://github.com/pytorch/pytorch/pull/133771 on behalf of https://github.com/ZainRizvi due to Sorry, but this breaks internal tests because of using functools ([comment](https://github.com/pytorch/pytorch/pull/133778#issuecomment-2310445169))
2024-08-26 15:16:17 +00:00
Xu Han
dc1959e6a7 [inductor] calibration inductor windows uts (7/N) (#134420)
Disable UTs on Windows: `test/dynamo/test_misc.py`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134420
Approved by: https://github.com/jansel
2024-08-25 20:39:54 +00:00
Xu Han
90fb83749e [inductor] fix test torch package working with trace on windows (#134397)
The temporary directory path is currently hard-coded. This is fixed by getting the temporary directory path through an API call.

Reproduce UTs:
```cmd
python test/dynamo/test_dynamic_shapes.py -v -k test_torch_package_working_with_trace_dynamic_shapes
```

Error message:
```cmd
________________________________________________________________________________________________ DynamicShapesMiscTests.test_torch_package_working_with_trace_dynamic_shapes ________________________________________________________________________________________________
Traceback (most recent call last):
  File "D:\xu_git\dnnl_cb\pytorch\test\dynamo\test_misc.py", line 7199, in test_torch_package_working_with_trace
    with package.PackageExporter(path) as exp:
  File "C:\Users\Xuhan\.conda\envs\win_mkl_static\lib\site-packages\torch\package\package_exporter.py", line 237, in __init__
    self.zip_file = torch._C.PyTorchFileWriter(f)
RuntimeError: Parent directory /tmp does not exist.

To execute this test, run the following from the base repo dir:
    python test\dynamo\test_dynamic_shapes.py DynamicShapesMiscTests.test_torch_package_working_with_trace_dynamic_shapes

This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
========================================================================================================================== short test summary info ==========================================================================================================================
FAILED [0.0080s] test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_torch_package_working_with_trace_dynamic_shapes - RuntimeError: Parent directory /tmp does not exist.
==================================================================================================================== 1 failed, 1665 deselected in 4.00s =====================================================================================================================
```
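
A portable way to avoid a hard-coded `/tmp` is to ask the standard library for a temporary directory. This is a minimal sketch of that general technique, not the exact change made in the test; the file name `package.zip` is hypothetical.

```python
import os
import tempfile

# TemporaryDirectory picks a location that exists on the current
# platform (including Windows) and removes it when the block exits.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "package.zip")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(b"placeholder")
    parent_exists = os.path.isdir(os.path.dirname(path))
    file_size = os.path.getsize(path)
```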

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134397
Approved by: https://github.com/ezyang
2024-08-24 20:25:44 +00:00
Xuehai Pan
0e49b2f18e [dynamo][itertools] support itertools.tee (#133771)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/133771
Approved by: https://github.com/jansel
ghstack dependencies: #133769, #133778, #133779
2024-08-23 10:13:12 +00:00
Xuehai Pan
25b2e46573 [dynamo] add max iterator limit while inlining generators (#134233)
Related:

- #133879

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134233
Approved by: https://github.com/jansel
2024-08-23 07:03:31 +00:00
rzou
683609c631 Skip cpp_extension test internally (#134011)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/134011
Approved by: https://github.com/masnesral
2024-08-21 13:51:05 +00:00
Xuehai Pan
c929e1e11f [dynamo] fix polyfill for user defined constructor __new__ (#133822)
In `cls->tp_call`, if `cls->tp_new` does not return an instance of class `cls`, then `cls->tp_init` is not called on the new instance.
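
The same rule is visible from Python. A minimal sketch with a hypothetical `Token` class whose `__new__` sometimes returns an object of another type:

```python
class Token:  # hypothetical class for illustration
    init_calls = 0

    def __new__(cls, value):
        # Returning something that is not an instance of `cls`
        # makes the type call skip `__init__`.
        if isinstance(value, str):
            return value
        return super().__new__(cls)

    def __init__(self, value):
        Token.init_calls += 1
        self.value = value


a = Token(1)      # __new__ returns a Token, so __init__ runs
b = Token("raw")  # __new__ returns a str, so __init__ is skipped
```

After both calls `Token.init_calls` is 1, and `b` is the plain string that `__new__` returned.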

Related PR:

- #132977

Pull Request resolved: https://github.com/pytorch/pytorch/pull/133822
Approved by: https://github.com/jansel
2024-08-21 12:41:19 +00:00
PyTorch MergeBot
2540ee372a Revert "[dynamo][itertools] support itertools.tee (#133771)"
This reverts commit 28ce3c0227.

Reverted https://github.com/pytorch/pytorch/pull/133771 on behalf of https://github.com/ZainRizvi due to breaking main windows cpu tests - this stack still causes that windows test to fail ([comment](https://github.com/pytorch/pytorch/pull/133712#issuecomment-2299776241))
2024-08-20 21:14:44 +00:00
Xuehai Pan
b03381cac2 [dynamo] support cls.__flags__ (#133970)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/133970
Approved by: https://github.com/jansel
ghstack dependencies: #133969
2024-08-20 20:03:31 +00:00
Xuehai Pan
28ce3c0227 [dynamo][itertools] support itertools.tee (#133771)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/133771
Approved by: https://github.com/jansel
ghstack dependencies: #133712, #133769, #133778, #133779
2024-08-20 19:48:57 +00:00
Bob Ren
f08d484702 Add itertools.islice support in dynamo (#133893)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/133893
Approved by: https://github.com/oulgen
2024-08-20 05:55:53 +00:00
Animesh Jain
6ca68357b3 [dynamo] Save class vt in UserDefinedObjectVariable (#133800)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/133800
Approved by: https://github.com/jansel
ghstack dependencies: #133745, #133747, #133746, #133799
2024-08-19 17:21:48 +00:00
Animesh Jain
fed6096e73 [dynamo] Support object.__new__ call (#133746)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/133746
Approved by: https://github.com/Skylion007, https://github.com/jansel
ghstack dependencies: #133745, #133747
2024-08-18 07:18:52 +00:00
Animesh Jain
8a5708ba3d [dynamo] Support object creation of classes with custom __new__ (#132977)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132977
Approved by: https://github.com/jansel
2024-08-16 03:09:23 +00:00
Edward Z. Yang
b5711297a0 Add support for SetVariable.discard (#133317)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/133317
Approved by: https://github.com/Skylion007
2024-08-14 09:10:36 +00:00
rzou
afb73d253c [custom_ops] torch.library.{custom_op, register_kernel} disable Dynamo (#133125)
We promise the user that these custom ops (and their kernels) are black
boxes w.r.t. torch.compile. Unfortunately Dynamo can turn itself back
on in the implementation of the custom operator, so we force it off by
disabling Dynamo

Test Plan:
- new tests

Pull Request resolved: https://github.com/pytorch/pytorch/pull/133125
Approved by: https://github.com/ezyang
2024-08-12 18:29:18 +00:00
Yiming Zhou
c69b2d24e3 [dynamo] Support remove method of set (#132943)
Fixes https://github.com/pytorch/pytorch/issues/132800

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132943
Approved by: https://github.com/anijain2305
2024-08-08 02:43:19 +00:00
Joel Schlosser
fb146fc3c6 Only store necessary tensor_dict fields in node meta (#132805)
Fixes #132290

This PR attempts a more invasive / complete solution than the one from #132338, which removes immediate tensor fields from the `tensor_dict` copy stored in node meta. The approach taken here is to store only those fields of the `tensor_dict` which are absolutely utilized somewhere else.

So far, this appears to be limited to:
* `_dynamo_static_input_type`
* `tag` (at least in the tests). Discussion at #94080 appears to indicate this is depended on for export

(CI may point out more)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132805
Approved by: https://github.com/mlazos
2024-08-07 13:35:16 +00:00
Animesh Jain
06581c277a [dynamo][stable-diffusion] Support dict(obj) on constrained subclasses of dict and OrderedDict (#132558)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132558
Approved by: https://github.com/jansel
2024-08-03 06:31:00 +00:00
William Wen
f379bbd46d [dynamo] support inspect.signature.bind (#132330)
Fixes https://github.com/pytorch/pytorch/issues/93760.

This was not that small of a task...

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132330
Approved by: https://github.com/jansel
ghstack dependencies: #132329
2024-08-02 20:37:05 +00:00
Edward Z. Yang
fc32732596 Don't attempt to compute hints for unbacked expressions (#132060)
This breaks the inference we made that if you cat an N-D tensor with a 1-D tensor of size (u0,), the u0 must be zero, but no one really wanted that anyway...

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132060
Approved by: https://github.com/Skylion007
2024-08-02 16:39:14 +00:00
PyTorch MergeBot
1197550876 Revert "Don't attempt to compute hints for unbacked expressions (#132060)"
This reverts commit d342dc0179.

Reverted https://github.com/pytorch/pytorch/pull/132060 on behalf of https://github.com/ezyang due to test_correct_module_names ([comment](https://github.com/pytorch/pytorch/pull/132407#issuecomment-2265754857))
2024-08-02 16:32:43 +00:00
Edward Z. Yang
d342dc0179 Don't attempt to compute hints for unbacked expressions (#132060)
This breaks the inference we made that if you cat an N-D tensor with a 1-D tensor of size (u0,), the u0 must be zero, but no one really wanted that anyway...

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132060
Approved by: https://github.com/Skylion007
ghstack dependencies: #131649, #132407
2024-08-02 12:09:37 +00:00
Yanbo Liang
5ea0f51187 [Dynamo] Support abc.MutableMapping.get (#132363)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132363
Approved by: https://github.com/anijain2305, https://github.com/mlazos
2024-08-02 04:17:35 +00:00
Chen Haifeng
50ed6ce277 Support built-in id function for TensorVariable on parameters (#130100)
Fixes #130087

This patch tries to provide a built-in id function implementation for TensorVariable when the id function is called on tensors like module parameters. The id function call on intermediate tensors is not supported.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130100
Approved by: https://github.com/anijain2305
2024-08-02 01:19:25 +00:00
Oguz Ulgen
920f0426ae Add None return type to init -- tests rest (#132376)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132376
Approved by: https://github.com/jamesjwu
ghstack dependencies: #132335, #132351, #132352
2024-08-01 15:44:51 +00:00
YangQun1
589aef4bb0 Fix py codegen to delete values that don't have any users (#131028)
Fixes #131025

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131028
Approved by: https://github.com/ezyang
2024-08-01 03:18:37 +00:00
ekamiti
9e473fd868 Make adding Buffers more like adding Parameters (#125971)
Add semantics for creating a buffer object that mirror those for creating a parameter. This is done by introducing a new Buffer class that can be used for type disambiguation. The underlying functionality of registering a buffer remains the same, as the register_buffer method has not been changed. The persistent parameter of the Buffer type indicates whether a buffer object should be persistent. Other non-test changes get the new Buffer type recognized by inductor and dynamo. The remaining changes are test changes that make sure the Buffer type can be used as a drop-in replacement for register_buffer, since it just leads to register_buffer being called. Normal tensors can still be used as buffers, so these changes are intended to be backwards compatible.

Fixes #35735

Co-authored-by: Mikayla Gawarecki <mikaylagawarecki@gmail.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/125971
Approved by: https://github.com/albanD, https://github.com/anijain2305, https://github.com/mlazos
2024-07-31 10:32:40 +00:00
Yidi Wu
32c57e78ed Specialize sym node when used as device kwarg (#131811)
Fixes https://github.com/pytorch/pytorch/issues/131189.

We specialize the symint in python_arg_parser when used as kwarg device.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131811
Approved by: https://github.com/yanboliang, https://github.com/jansel, https://github.com/albanD
2024-07-30 17:11:57 +00:00
Animesh Jain
13457d1da0 [dynamo][log] Suggest to use pytree when graph-break on optree (#131827)
Discovered while working on https://github.com/pytorch/pytorch/issues/121369
On the model above, the log looks like this

~~~
/home/anijain/local/pytorch2/torch/_dynamo/variables/functions.py:698: UserWarning: Graph break for an optree C/C++ function optree._C.PyCapsule.flatten. Consider using torch._utils.pytree - https://github.com/pytorch/pytorch/blob/main/torch/utils/_pytree.py.
  torch._dynamo.utils.warn_once(msg)
/home/anijain/local/pytorch2/torch/_dynamo/variables/functions.py:698: UserWarning: Graph break for an optree C/C++ function optree.PyCapsule.unflatten. Consider using torch._utils.pytree - https://github.com/pytorch/pytorch/blob/main/torch/utils/_pytree.py.
  torch._dynamo.utils.warn_once(msg)
  ~~~

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131827
Approved by: https://github.com/zou3519, https://github.com/mlazos
2024-07-30 05:49:58 +00:00
Xuehai Pan
918ece4f4d [BE][Easy][11/19] enforce style for empty lines in import segments in test/dy*/ (#129762)
See https://github.com/pytorch/pytorch/pull/129751#issue-2380881501. Most changes are auto-generated by linter.

You can review these PRs via:

```bash
git diff --ignore-all-space --ignore-blank-lines HEAD~1
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129762
Approved by: https://github.com/anijain2305
2024-07-27 17:43:53 +00:00
Animesh Jain
13ab92b72d [dynamo][recompile-logs] Suggest force_parameter_static_shapes on the recompile log for parameter-related recomps (#131825)
Discovered in https://github.com/pytorch/pytorch/issues/121369

On the user-empathy-day model, the logs look like these
~~~
W0725 15:33:58.022000 1967777 torch/_dynamo/convert_frame.py:807] [0/8] torch._dynamo hit config.cache_size_limit (8)
W0725 15:33:58.022000 1967777 torch/_dynamo/convert_frame.py:807] [0/8]    function: 'auto_repeat_tensors_for_time' (/home/anijain/local/lumiere-pytorch/lumiere_pytorch/lumiere.py:545)
W0725 15:33:58.022000 1967777 torch/_dynamo/convert_frame.py:807] [0/8]    last reason: 0/0: len(L['args']) == 1
W0725 15:33:58.022000 1967777 torch/_dynamo/convert_frame.py:807] [0/8] To log all recompilation reasons, use TORCH_LOGS="recompiles".
W0725 15:33:58.022000 1967777 torch/_dynamo/convert_frame.py:807] [0/8] To diagnose recompilation issues, see https://pytorch.org/docs/main/torch.compiler_troubleshooting.html.
W0725 15:34:00.282000 1967777 torch/_dynamo/convert_frame.py:807] [11/8] torch._dynamo hit config.cache_size_limit (8)
W0725 15:34:00.282000 1967777 torch/_dynamo/convert_frame.py:807] [11/8]    function: 'forward' (/home/anijain/.conda/envs/pytorch-3.10/lib/python3.10/site-packages/denoising_diffusion_pytorch/karras_unet.py:150)
W0725 15:34:00.282000 1967777 torch/_dynamo/convert_frame.py:807] [11/8]    last reason: 11/0: tensor 'L['x']' size mismatch at index 0. expected 16, actual 8
W0725 15:34:00.282000 1967777 torch/_dynamo/convert_frame.py:807] [11/8] To log all recompilation reasons, use TORCH_LOGS="recompiles".
W0725 15:34:00.282000 1967777 torch/_dynamo/convert_frame.py:807] [11/8] To diagnose recompilation issues, see https://pytorch.org/docs/main/torch.compiler_troubleshooting.html.
W0725 15:34:10.216000 1967777 torch/_dynamo/convert_frame.py:807] [40/8] torch._dynamo hit config.cache_size_limit (8)
W0725 15:34:10.216000 1967777 torch/_dynamo/convert_frame.py:807] [40/8]    function: 'normalize_weight' (/home/anijain/.conda/envs/pytorch-3.10/lib/python3.10/site-packages/denoising_diffusion_pytorch/karras_unet.py:127)
W0725 15:34:10.216000 1967777 torch/_dynamo/convert_frame.py:807] [40/8]    last reason: 40/1: tensor 'L['weight']' size mismatch at index 0. expected 64, actual 16. Guard failed on a parameter, consider using torch._dynamo.config.force_parameter_static_shapes = False to allow dynamism on parameters.
W0725 15:34:10.216000 1967777 torch/_dynamo/convert_frame.py:807] [40/8] To log all recompilation reasons, use TORCH_LOGS="recompiles".
W0725 15:34:10.216000 1967777 torch/_dynamo/convert_frame.py:807] [40/8] To diagnose recompilation issues, see https://pytorch.org/docs/main/torch.compiler_troubleshooting.html.
W0725 15:34:11.643000 1967777 torch/_dynamo/convert_frame.py:807] [58/8] torch._dynamo hit config.cache_size_limit (8)
W0725 15:34:11.643000 1967777 torch/_dynamo/convert_frame.py:807] [58/8]    function: 'pack_one' (/home/anijain/.conda/envs/pytorch-3.10/lib/python3.10/site-packages/denoising_diffusion_pytorch/karras_unet.py:38)
W0725 15:34:11.643000 1967777 torch/_dynamo/convert_frame.py:807] [58/8]    last reason: 58/1: tensor 'L['t']' stride mismatch at index 0. expected 32, actual 8. Guard failed on a parameter, consider using torch._dynamo.config.force_parameter_static_shapes = False to allow dynamism on parameters.
W0725 15:34:11.643000 1967777 torch/_dynamo/convert_frame.py:807] [58/8] To log all recompilation reasons, use TORCH_LOGS="recompiles".
W0725 15:34:11.643000 1967777 torch/_dynamo/convert_frame.py:807] [58/8] To diagnose recompilation issues, see https://pytorch.org/docs/main/torch.compiler_troubleshooting.html.
W0725 15:34:12.029000 1967777 torch/_dynamo/convert_frame.py:807] [62/8] torch._dynamo hit config.cache_size_limit (8)
W0725 15:34:12.029000 1967777 torch/_dynamo/convert_frame.py:807] [62/8]    function: 'torch_dynamo_resume_in_pack_at_70' (/home/anijain/.conda/envs/pytorch-3.10/lib/python3.10/site-packages/einops-0.8.0-py3.10.egg/einops/packing.py:70)
W0725 15:34:12.029000 1967777 torch/_dynamo/convert_frame.py:807] [62/8]    last reason: 62/0: tensor 'L['tensors'][0]' size mismatch at index 0. expected 16, actual 32. Guard failed on a parameter, consider using torch._dynamo.config.force_parameter_static_shapes = False to allow dynamism on parameters.
W0725 15:34:12.029000 1967777 torch/_dynamo/convert_frame.py:807] [62/8] To log all recompilation reasons, use TORCH_LOGS="recompiles".
W0725 15:34:12.029000 1967777 torch/_dynamo/convert_frame.py:807] [62/8] To diagnose recompilation issues, see https://pytorch.org/docs/main/torch.compiler_troubleshooting.html.
W0725 15:34:12.357000 1967777 torch/_dynamo/convert_frame.py:807] [65/8] torch._dynamo hit config.cache_size_limit (8)
W0725 15:34:12.357000 1967777 torch/_dynamo/convert_frame.py:807] [65/8]    function: 'reshape' (/home/anijain/.conda/envs/pytorch-3.10/lib/python3.10/site-packages/einops-0.8.0-py3.10.egg/einops/_backends.py:91)
W0725 15:34:12.357000 1967777 torch/_dynamo/convert_frame.py:807] [65/8]    last reason: 65/0: tensor 'L['x']' size mismatch at index 0. expected 32, actual 8. Guard failed on a parameter, consider using torch._dynamo.config.force_parameter_static_shapes = False to allow dynamism on parameters.
~~~~

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131825
Approved by: https://github.com/ezyang
ghstack dependencies: #131795, #131801, #131804
2024-07-26 16:25:21 +00:00
PyTorch MergeBot
c3679bed35 Revert "Fix py codegen to delete values that don't have any users (#131028)"
This reverts commit 91aba7baac.

Reverted https://github.com/pytorch/pytorch/pull/131028 on behalf of https://github.com/clee2000 due to broke inductor/test_triton_kernels inductor/test_triton_kernels.py::KernelTests::test_triton_kernel_functionalize [GH job link](https://github.com/pytorch/pytorch/actions/runs/10094659640/job/27915271250) [HUD commit link](91aba7baac) ([comment](https://github.com/pytorch/pytorch/pull/131028#issuecomment-2251058374))
2024-07-25 17:42:18 +00:00
Yidi Wu
ffc6bf8149 [dynamo] lazily guard and specialize on the symint when used in f-string. (#131529)
Fixes https://github.com/pytorch/pytorch/issues/103602.

This PR implements the idea of "if someone creates a string and then ends up not using it, we would prefer to NOT have specialized." mentioned in the above issue. Specifically, we create a lazy variable tracker instead of a ConstantVariable when we're in FORMAT_VALUE, and when the lazy variable tracker is realized (i.e. it's going to be used), we create a ConstantVariable and the specialization/guarding happens at the time of realization.
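
The deferral described above can be illustrated with a small, framework-free sketch. `LazyConstant` and the `guards` list are hypothetical stand-ins invented for this illustration; they are not Dynamo's lazy variable tracker or its guard machinery.

```python
class LazyConstant:
    """Hypothetical stand-in for a lazy variable tracker (not Dynamo's class)."""

    def __init__(self, value, guards):
        self._value = value
        self._guards = guards  # shared list that records installed guards
        self._realized = False

    def realize(self):
        # Specialization happens here, on first use, not at construction.
        if not self._realized:
            self._guards.append(f"value == {self._value!r}")
            self._realized = True
        return self._value


guards = []
unused = LazyConstant(8, guards)  # wrapped for a string that is never read
used = LazyConstant(16, guards)   # wrapped for a string that is actually used

message = f"size is {used.realize()}"
print(message)
print(guards)
```

Only the value that was actually formatted ends up with a guard; the unused one never specializes.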

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131529
Approved by: https://github.com/ezyang
2024-07-25 16:16:34 +00:00
YangQun1
91aba7baac Fix py codegen to delete values that don't have any users (#131028)
Fixes #131025

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131028
Approved by: https://github.com/ezyang
2024-07-25 13:04:23 +00:00
PyTorch MergeBot
8ffd109a00 Revert "Fix py codegen to delete values that don't have any users (#131028)"
This reverts commit 466c167b71.

Reverted https://github.com/pytorch/pytorch/pull/131028 on behalf of https://github.com/atalman due to breaks CI ([comment](https://github.com/pytorch/pytorch/pull/131028#issuecomment-2247771530))
2024-07-24 12:21:43 +00:00
YangQun1
466c167b71 Fix py codegen to delete values that don't have any users (#131028)
Fixes #131025

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131028
Approved by: https://github.com/ezyang
2024-07-24 01:03:56 +00:00
Aaron Orenstein
b193894b94 FakeTensor cache SymInt support (#127596)
Adds support for SymInts in the FakeTensor cache.

A couple notes:
1. When a SymInt is present in the input key for a FakeTensor operation we cache on the ShapeEnv instead of using the FakeTensorMode cache. This is necessary so we don't have to remember and check the guards. It reduces the cache hits but there's diminishing return on how much work we can do before the cache becomes more of a burden than a gain.
2. When we cache an output SymInt that is a direct copy of an input, we need to be careful that on a cache hit we copy the SymNode from the input to the output. This is important because the fx-graph building code actually uses SymNode ids in the process of building the graph, so constructing a same-content-but-different-id SymNode will fail.
3. In the cache key we store SymInts as a _PySymInputStub. These represent SymInt (and friends) but support `__hash__` and `__eq__` (which SymInt does not).
4. In the cache entry we store SymInts as a _SymIntOutputStub.
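
The stub idea from note 3 can be sketched in plain Python. `Unhashable` and `KeyStub` are hypothetical names used only for this illustration; the real `_PySymInputStub` is more involved and is not reproduced here.

```python
class Unhashable:
    """Hypothetical stand-in for a symbolic value that cannot be a dict key."""

    __hash__ = None  # instances raise TypeError when hashed

    def __init__(self, expr):
        self.expr = expr


class KeyStub:
    """Hypothetical wrapper that hashes and compares by the wrapped expression."""

    def __init__(self, sym):
        self.expr = sym.expr

    def __hash__(self):
        return hash(self.expr)

    def __eq__(self, other):
        return isinstance(other, KeyStub) and self.expr == other.expr


cache = {}
cache[("add", KeyStub(Unhashable("s0")))] = "cached result"

# A fresh stub around an equal expression finds the same entry.
hit = cache.get(("add", KeyStub(Unhashable("s0"))))
miss = cache.get(("add", KeyStub(Unhashable("s1"))))
print(hit, miss)
```

Wrapping at the cache boundary keeps the symbolic type itself unchanged while still allowing it to take part in a dictionary key.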

Perf example:
```
python benchmarks/dynamo/timm_models.py --ci --accuracy --timing
--explain --inductor --dynamic-shapes --dynamic-batch-only --device cuda
--training --amp --total-partitions 2 --partition-id 0 --output
/tmp/training_timm_models.csv --filter crossvit_9_240
```
fake tensor cache before:
```
INFO: FakeTensor cache stats:
INFO:   cache_hits: 68137
INFO:   cache_misses: 837
INFO:   cache_bypasses:
INFO:     symbolic shape:            48224
INFO:     CompositeImplicitAutograd: 917
INFO:     non-fake tensor:           70
INFO:     non-FakeTensor output:     62
INFO:     non-builtin:               8
INFO:     dynamic output shape:      1
```
and after:
```
INFO: FakeTensor cache stats:
INFO:   cache_hits: 88187
INFO:   cache_misses: 14233
INFO:   cache_bypasses:
INFO:     CompositeImplicitAutograd: 1037
INFO:     non-FakeTensor output:     602
INFO:     non-fake tensor:           70
INFO:     unsafe view:               36
INFO:     non-builtin:               8
INFO:     dynamic output shape:      1
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127596
Approved by: https://github.com/eellison
ghstack dependencies: #131014, #129780
2024-07-21 19:26:38 +00:00
Michael Lazos
1b72cf0b09 Add hasattr for tensor variable (#131008)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/131008
Approved by: https://github.com/anijain2305
ghstack dependencies: #131007
2024-07-19 12:43:27 +00:00
Pian Pawakapan
988ed4d5db [export] clean up allow_complex_guards_as_runtime_asserts flag (#130596)
Summary: removes underscore, cleans up dead code in DimConstraints

Test Plan: existing export tests

Reviewed By: angelayi

Differential Revision: D59612746

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130596
Approved by: https://github.com/angelayi
2024-07-12 17:17:11 +00:00
Michael Lazos
c101c4517a Add python type for list iterators (#130511)
Fixes https://github.com/pytorch/pytorch/issues/117026

Also not sure why this was missing

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130511
Approved by: https://github.com/williamwen42, https://github.com/yanboliang, https://github.com/anijain2305
2024-07-12 01:14:18 +00:00
Xuehai Pan
973037be6a [BE][Easy] apply autofix for ruff rules unnecessary-collection-call (C408): list() / tuple() / dict() (#130199)
This PR changes the empty collection factory call to Python literals:

- `list()` -> `[]`
- `tuple()` -> `()`
- `dict()` -> `{}`

The Python literals are more performant and safer. For example, the bytecode for building an empty dictionary:

```bash
$ python3 -m dis - <<EOS
import collections

d1 = {}
d2 = dict()

dict = collections.OrderedDict
d3 = dict()
EOS
```

```text
  0           0 RESUME                   0

  1           2 LOAD_CONST               0 (0)
              4 LOAD_CONST               1 (None)
              6 IMPORT_NAME              0 (collections)
              8 STORE_NAME               0 (collections)

  3          10 BUILD_MAP                0
             12 STORE_NAME               1 (d1)

  4          14 PUSH_NULL
             16 LOAD_NAME                2 (dict)
             18 CALL                     0
             26 STORE_NAME               3 (d2)

  6          28 LOAD_NAME                0 (collections)
             30 LOAD_ATTR                8 (OrderedDict)
             50 STORE_NAME               2 (dict)

  7          52 PUSH_NULL
             54 LOAD_NAME                2 (dict)
             56 CALL                     0
             64 STORE_NAME               5 (d3)
             66 RETURN_CONST             1 (None)
```

The dict literal `{}` only has one bytecode `BUILD_MAP`, while the factory call `dict()` has three `PUSH_NULL + LOAD_NAME + CALL`. Also, the factory call is not safe if users override the `dict` name in `locals` or `globals` (see the example of replacing with `OrderedDict` above).
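
The same comparison can be made programmatically with the standard `dis` module. This is a minimal sketch; `opnames` is a helper defined here for illustration, and the exact opcode sequence for the call varies between CPython versions.

```python
import collections
import dis


def opnames(source):
    """Return the opcode names that `source` compiles to."""
    code = compile(source, "<example>", "exec")
    return [ins.opname for ins in dis.get_instructions(code)]


literal_ops = opnames("d = {}")
call_ops = opnames("d = dict()")

# The literal builds the map directly; the call has to look up the name first.
print("BUILD_MAP" in literal_ops, "BUILD_MAP" in call_ops)

# Rebinding the name changes what the call returns, but not the literal.
namespace = {"dict": collections.OrderedDict}
exec("a = {}\nb = dict()", namespace)
print(type(namespace["a"]).__name__, type(namespace["b"]).__name__)
```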

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130199
Approved by: https://github.com/malfet
2024-07-11 17:30:28 +00:00
Pian Pawakapan
1b3b4c2fb9 [runtime asserts] deduplicate runtime asserts & CSE (#128599) (#130380)
original PR: https://github.com/pytorch/pytorch/pull/128599 (re-created after revert + poisoned diff train)

Summary:
This PR adds deduplication and CSE for runtime asserts. Existing size computation in the graph is CSE'd along with added runtime asserts, and redundant asserts are removed. Shape calls on intermediate tensors are also turned into compute on input sizes if possible, allowing intermediate tensors to be freed earlier. For example:
```
z = torch.cat([x, x], dim=0)  # 2*s0
w = z.repeat(y.shape[0])  # 2*s0*s1
_w = w.shape[0]
# something with _w ...

# turns into ->
s0 = x.shape[0]
s1 = y.shape[0]
_w0 = 2 * s0
_w = _w0 * s1
```

Additionally, constrain_range calls are deduplicated. Single-symbol bound checks for unbacked symbols (e.g. u0 >= 0, u0 <= 5) and sym_constrain_range.default calls are also removed, since they accumulate range info in the ShapeEnv, and are replaced with two _assert_scalar.default calls that check the min/max bounds. For example:
```
torch.sym_constrain_range_for_size(n, min=2, max=16)
torch.sym_constrain_range(n, min=4, max=20)
torch._check(n >= 0)
torch._check(n >= 3)
torch._check(n <= 14)

# turns into
torch.sym_constrain_range_for_size(n)
torch._check(n >= 4)
torch._check(n <= 14)
```
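
The bounds in the example above are what you get by intersecting all of the individual constraints on `n`. A minimal sketch of that arithmetic, assuming each constraint is reduced to a `(min, max)` pair; `merge_bounds` is a hypothetical helper and not the pass in torch/fx/passes/runtime_assert.py:

```python
def merge_bounds(bounds):
    """Intersect (min, max) pairs; None means unbounded on that side."""
    lowers = [lo for lo, _ in bounds if lo is not None]
    uppers = [hi for _, hi in bounds if hi is not None]
    return (max(lowers) if lowers else None, min(uppers) if uppers else None)


# The five constraints on n from the example, as (min, max) pairs.
constraints = [
    (2, 16),     # sym_constrain_range_for_size(n, min=2, max=16)
    (4, 20),     # sym_constrain_range(n, min=4, max=20)
    (0, None),   # _check(n >= 0)
    (3, None),   # _check(n >= 3)
    (None, 14),  # _check(n <= 14)
]

lower, upper = merge_bounds(constraints)
print(f"n >= {lower}, n <= {upper}")
```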

Test Plan:
contbuild & OSS CI, see 940e4477ab

Original Phabricator Test Plan:
Imported from GitHub, without a `Test Plan:` line.

Differential Revision: D59543603

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130380
Approved by: https://github.com/izaitsevfb
2024-07-10 19:23:37 +00:00
PyTorch MergeBot
9c9744c3ac Revert "[runtime asserts] deduplicate runtime asserts & CSE (#128599)"
This reverts commit 940e4477ab.

Reverted https://github.com/pytorch/pytorch/pull/128599 on behalf of https://github.com/izaitsevfb due to breaking internal APS tests, see D59498864 ([comment](https://github.com/pytorch/pytorch/pull/128599#issuecomment-2218724762))
2024-07-09 21:03:49 +00:00
Yueming Hao
b4cc25f126 [custom_op]Fix self in mutation_args (#130179)
Fixes #124933

## Issue Summary
If a user lists `self` in `mutates_args`, an error occurs: `TypeError: AutoFunctionalized.__call__() got multiple values for argument 'self'`. In the following example, the schema for mutates_args is parsed as {"self": FakeTensor}.  6df963a2c8/torch/_higher_order_ops/auto_functionalize.py (L234)
On the line above, it is unpacked as `self=FakeTensor`, which passes the wrong argument because `self` is the conventional name of the first parameter of a class's methods, such as https://github.com/pytorch/pytorch/compare/main...findhao/fix-self-custom-ops#diff-9453b6b52a54783beec3dd1c60248620f61c3a524d404a188af17bbdf6be3d9eR292 .
```python
import torch

@torch.library.custom_op("mylib::foo", mutates_args={"self"})
def foo(self: torch.Tensor) -> None:
    self.sin_()

x = torch.randn(3)

@torch.compile(backend="inductor", fullgraph=True)
def f(x):
    foo(x)

f(x)
```
## Fix
This PR renames every affected default `self` argument to `self_`, following the existing convention in 6fc771d19b/torch/_ops.py (L667)
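
The name collision can be reproduced without PyTorch. In this sketch `Broken` and `Fixed` are hypothetical classes that only mimic the shape of a `__call__` that forwards keyword arguments; they are not the `AutoFunctionalized` implementation.

```python
class Broken:
    # The receiver is named `self`, so a keyword argument called `self` collides.
    def __call__(self, op, **kwargs):
        return kwargs


class Fixed:
    # Renaming the receiver frees up `self` as a keyword argument.
    def __call__(self_, op, **kwargs):
        return kwargs


try:
    Broken()("mylib::foo", self="tensor")
    error = None
except TypeError as exc:
    error = str(exc)

print(error)
result = Fixed()("mylib::foo", self="tensor")
print(result)
```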

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130179
Approved by: https://github.com/zou3519
2024-07-08 22:55:50 +00:00
Pian Pawakapan
940e4477ab [runtime asserts] deduplicate runtime asserts & CSE (#128599)
This PR adds deduplication and CSE for runtime asserts. Existing size computation in the graph is CSE'd along with added runtime asserts, and redundant asserts are removed. Shape calls on intermediate tensors are also turned into compute on input sizes if possible, allowing intermediate tensors to be freed earlier. For example:
```
z = torch.cat([x, x], dim=0)  # 2*s0
w = z.repeat(y.shape[0])  # 2*s0*s1
_w = w.shape[0]
# something with _w ...

# turns into ->
s0 = x.shape[0]
s1 = y.shape[0]
_w0 = 2 * s0
_w = _w0 * s1
```

Additionally, constrain_range calls are deduplicated. Single-symbol bound checks for unbacked symbols (e.g. u0 >= 0, u0 <= 5) and sym_constrain_range.default calls are also removed, since they accumulate range info in the ShapeEnv, and are replaced with two _assert_scalar.default calls that check the min/max bounds. For example:
```
torch.sym_constrain_range_for_size(n, min=2, max=16)
torch.sym_constrain_range(n, min=4, max=20)
torch._check(n >= 0)
torch._check(n >= 3)
torch._check(n <= 14)

# turns into
torch.sym_constrain_range_for_size(n)
torch._check(n >= 4)
torch._check(n <= 14)
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128599
Approved by: https://github.com/ezyang
2024-07-07 20:10:14 +00:00
PyTorch MergeBot
963f430d13 Revert "[runtime asserts] deduplicate runtime asserts & CSE (#128599)"
This reverts commit 0267b2ddcb.

Reverted https://github.com/pytorch/pytorch/pull/128599 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it seems to cause a landrace and fails inductor/test_cudagraph_trees in trunk 0267b2ddcb ([comment](https://github.com/pytorch/pytorch/pull/128599#issuecomment-2211690518))
2024-07-06 07:20:05 +00:00
Pian Pawakapan
0267b2ddcb [runtime asserts] deduplicate runtime asserts & CSE (#128599)
This PR adds deduplication and CSE for runtime asserts. Existing size computation in the graph is CSE'd along with added runtime asserts, and redundant asserts are removed. Shape calls on intermediate tensors are also turned into compute on input sizes if possible, allowing intermediate tensors to be freed earlier. For example:
```
z = torch.cat([x, x], dim=0)  # 2*s0
w = z.repeat(y.shape[0])  # 2*s0*s1
_w = w.shape[0]
# something with _w ...

# turns into ->
s0 = x.shape[0]
s1 = y.shape[0]
_w0 = 2 * s0
_w = _w0 * s1
```

Additionally, constrain_range calls are deduplicated. Single-symbol bound checks for unbacked symbols (e.g. u0 >= 0, u0 <= 5) and sym_constrain_range.default calls are also removed, since they accumulate range info in the ShapeEnv, and are replaced with two _assert_scalar.default calls that check the min/max bounds. For example:
```
torch.sym_constrain_range_for_size(n, min=2, max=16)
torch.sym_constrain_range(n, min=4, max=20)
torch._check(n >= 0)
torch._check(n >= 3)
torch._check(n <= 14)

# turns into
torch.sym_constrain_range_for_size(n)
torch._check(n >= 4)
torch._check(n <= 14)
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128599
Approved by: https://github.com/ezyang
2024-07-06 03:44:49 +00:00
Animesh Jain
bd0252fb98 [dynamo][user-defined] Support method descriptors (#130159)
Fixes https://github.com/pytorch/pytorch/issues/120650

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130159
Approved by: https://github.com/jansel
ghstack dependencies: #118448
2024-07-06 02:03:09 +00:00
Yanbo Liang
551f3b92b2 [Dynamo] Add assertion for tensor unpack shape mismatch (#130077)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/130077
Approved by: https://github.com/Chillee
2024-07-04 09:25:08 +00:00
Animesh Jain
fa4e489d70 [dynamo][dynamic-shapes] Graph break if out shape changes on out= variants (#130074)
Fixes https://github.com/pytorch/pytorch/issues/130068

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130074
Approved by: https://github.com/ezyang
ghstack dependencies: #129913, #129914
2024-07-04 08:36:12 +00:00
Edward Z. Yang
29c68df600 Stop immediately specializing common constants 0/1 for plain int (#128327)
Fixes https://github.com/pytorch/pytorch/issues/128319

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128327
Approved by: https://github.com/lezcano
ghstack dependencies: #129983
2024-07-03 16:41:51 +00:00
Colin Peppler
39357ba06f [dynamo] don't constrain range on the replacement for a symbol (#129907)
# Error
```
  File "/data/users/colinpeppler/pytorch/torch/_meta_registrations.py", line 704, in sym_constrain_range
    constrain_range(size, min=min, max=max)
  File "/data/users/colinpeppler/pytorch/torch/fx/experimental/symbolic_shapes.py", line 898, in constrain_range
    a.node.shape_env._constrain_range(a.node.expr, min, max)
  File "/data/users/colinpeppler/pytorch/torch/fx/experimental/recording.py", line 245, in wrapper
    return fn(*args, **kwargs)
  File "/data/users/colinpeppler/pytorch/torch/fx/experimental/symbolic_shapes.py", line 2813, in _constrain_range
    assert isinstance(a, sympy.Symbol), f"constraining non-Symbols NYI, {a} is {type(a)}"
torch._dynamo.exc.BackendCompilerFailed: backend='inductor' raised:
AssertionError: constraining non-Symbols NYI, s1 + s2 is <class 'sympy.core.add.Add'>
```

# Context
I ran into the following scenario:
```
getitem = ...
sym_size_int = torch.ops.aten.sym_size.int(getitem, 0) # this is u0 = s0 + s1
_check_is_size = torch._check_is_size(sym_size_int)
# we fail at this guy
sym_constrain_range_default = torch.ops.aten.sym_constrain_range.default(sym_size_int, min = 4, max = 1234)

# runtime assertion
add = sym_size_int + sym_size_int_1
eq = add == sym_size_int
_assert_scalar_default = torch.ops.aten._assert_scalar(eq, "Runtime assertion failed for expression Eq(s0 + s1, u0) on node 'eq'")
```

Everything but getitem was inserted into the FX graph by insert_deferred_runtime_asserts()
7e4329c258/torch/fx/passes/runtime_assert.py (L38-L52)

In the above scenario, we fail trying to constrain the range on `s0 + s1`, which is not a `sympy.Symbol`.

And why exactly are we constraining the range on `s0 + s1`? Because it's the replacement for `u0`.

# Approach
Whenever we try to constrain the range on the replacement of ~~an unbacked symint~~ a non-symbol, just ignore it.

In the scenario above, we'll be okay to ignore it because whenever there's a replacement on an unbacked symint, we will update its range. Hence, no need to constrain the range on `s0 + s1`. We can confirm this with `TORCH_LOGS="+dynamic"`.
```
torch/fx/experimental/symbolic_shapes.py:4737: _update_var_to_range u0 = VR[4, 198] (update)
torch/fx/experimental/symbolic_shapes.py:4856: set_replacement u0 = s1 + s2 (trivial_lhs) VR[4, 198]
```

600bf978ba/torch/fx/experimental/symbolic_shapes.py (L4759-L4764)

Differential Revision: [D59257079](https://our.internmc.facebook.com/intern/diff/D59257079)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129907
Approved by: https://github.com/jingsh
2024-07-02 21:46:40 +00:00
Animesh Jain
e62073d799 [dynamo] Skip FUNCTION_MATCH on method-wrapper objects (#129830)
Fixes https://github.com/pytorch/pytorch/issues/118563

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129830
Approved by: https://github.com/jansel
2024-06-30 20:21:18 +00:00
Chien-Lin Chen
5e7ac69a67 [Dynamic Shapes] fixed dynamic shape inference (#128807)
Dynamic dimensions that are indirectly bound to an integer are now constrained to that integer.
After each ShapeEnv._refine_ranges, check if the new ValueRange is singleton, if it is, replace the symbol.
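
A minimal sketch of the singleton check, assuming ranges are plain `(lower, upper)` tuples. `refine`, `ranges`, and `replacements` are hypothetical names for this illustration and do not mirror the `ShapeEnv` data structures.

```python
def refine(ranges, replacements, symbol, lower=None, upper=None):
    """Tighten the range of `symbol`; substitute a constant if it collapses."""
    lo, hi = ranges[symbol]
    if lower is not None:
        lo = max(lo, lower)
    if upper is not None:
        hi = min(hi, upper)
    ranges[symbol] = (lo, hi)
    if lo == hi:
        # Singleton range: the symbol can only take one value.
        replacements[symbol] = lo


ranges = {"s0": (2, 64)}
replacements = {}
refine(ranges, replacements, "s0", lower=8)  # still a proper range
refine(ranges, replacements, "s0", upper=8)  # collapses to a single value
print(ranges, replacements)
```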

Fixes #122307

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128807
Approved by: https://github.com/ezyang
2024-06-27 22:33:32 +00:00
Brian Hirsh
a4d7aa498b [Traceable FSDP2] Add auto-functionalize support for mutable list[Tensor] (copy from Brian's PR #127347); enable E2E inductor unit test for transformer model (#129502)
Copy of Brian's PR: https://github.com/pytorch/pytorch/pull/127347 with additional changes to support mutable `List[Tensor]` in Inductor. Also enable E2E inductor unit test for Traceable FSDP2 + transformer model.

Test commands:
- `pytest -rA test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_trace_fsdp_set_`
- `pytest -rA test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_simple_mlp_fullgraph_backend_aot_eager`
- `pytest -rA test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_simple_mlp_fullgraph_backend_inductor`
- `pytest -rA test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_transformer_fullgraph_backend_aot_eager`
- `pytest -rA test/dynamo/test_misc.py::MiscTests::test_auto_functionalize_tensorlist`
- `pytest -rA  test/inductor/test_torchinductor.py::GPUTests::test_fallback_mutable_op_list_cuda`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129502
Approved by: https://github.com/zou3519
2024-06-27 17:50:57 +00:00
rzou
08b616281f [custom ops] Switch out references from old landing page to new landing page (#129178)
Test Plan:
- existing tests

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129178
Approved by: https://github.com/albanD
ghstack dependencies: #129177
2024-06-21 13:31:40 +00:00
Brian Hirsh
8c2542623b [Traceable FSDP2] [Dynamo] Add tracing support for out-variant custom ops that return None (#129078)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129078
Approved by: https://github.com/yanboliang
2024-06-20 17:46:13 +00:00
Huy Do
73f5d2b787 Run ET unit tests on PT CI (#128560)
This is the first PR to add all existing ET unit tests into PT CI.  The goal is to improve the coverage there to avoid changes in PT that could break ET.  With this, any future unit tests on ET will automatically be run on PT CI.  The duration of the job is now 40+ minutes, not too bad.

This also fixed the failed ET build in https://github.com/pytorch/pytorch/pull/123043.

Adding model coverage is a bit more involved and requires adding new shards, so I will follow up on that in separate PRs.

[T192117506](https://www.internalfb.com/intern/tasks/?t=192117506), with the failed diffs D58295865 and D58394154

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128560
Approved by: https://github.com/guangy10, https://github.com/digantdesai
2024-06-19 20:08:58 +00:00
chilli
11ff5345d2 Changed colored logging to only be turned on if printing to interactive terminal (#128874)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/128874
Approved by: https://github.com/anijain2305
2024-06-17 23:53:26 +00:00
Oguz Ulgen
472211c97a Make assert_size_stride to return all errors (#128764)
This will help debug some problems I'm encountering, but in general, it is best to show the entire error.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128764
Approved by: https://github.com/jansel
2024-06-15 06:32:40 +00:00
Yueming Hao
73ba432d32 [custom_op]Fix None return schema (#128667)
Fixes #125044

If a user defines a schema that returns `None`, it is parsed as a `torch.NoneType`.  Auto functionalization supports `()` as an empty return but not `None`. So a `None` return fails the check for [`can_auto_functionalize`](https://github.com/pytorch/pytorch/blob/findhao/fix_none_return_functionalize/torch/_higher_order_ops/auto_functionalize.py#L71) even though we can treat it as a `()` return. This PR skips the check for a None return.

I had hoped to fix this at a [deeper level](31e44c72ca), but that fix breaks a lot of existing schemas, so for now it is better to fix the issue in auto_functionalize.py.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128667
Approved by: https://github.com/zou3519
2024-06-15 00:41:37 +00:00
chilli
c486e2ab64 Add coloring to fx graph print out (#128476)
Note: Won't land immediately, at least I'll need to add a color option to the field. But curious if any tests fail.

Old:
<img width="1294" alt="image" src="https://github.com/pytorch/pytorch/assets/6355099/c3a750ed-5e54-4621-b2e4-be5481be15b6">

New:
<img width="1303" alt="image" src="https://github.com/pytorch/pytorch/assets/6355099/3a1f1adc-6f3a-413e-8b87-ee53da9bf4ed">

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128476
Approved by: https://github.com/ezyang
2024-06-13 23:39:04 +00:00
Animesh Jain
865d7b3424 [Reland][dynamo] Enable some inlining inbuilt nn module tests (#128440)
Co-authored-by: Laith Sakka <lsakka@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/128440
Approved by: https://github.com/williamwen42, https://github.com/jansel
2024-06-13 22:39:22 +00:00
Edward Z. Yang
2229884102 Introduce int_oo (#127693)
In a previous life, we used sympy.oo to represent the lower/upper bounds of integer ranges. Later, we changed this to be sys.maxsize - 1 for a few reasons: (1) sometimes we do tests on a value being exactly sys.maxsize, and we wanted to avoid a data dependent guard in this case, (2) sympy.oo corresponds to floating point infinity, so you get incorrect types for value ranges with oo, and (3) you can do slightly better reasoning if you assume that input sizes fall within representable 64-bit integer range.

After working in the sys.maxsize regime for a bit, I've concluded that this was actually a bad idea. Specifically, the problem is that you end up with sys.maxsize in your upper bound, and then whenever you do any sort of size-increasing computation like size * 2, you end up with 2 * sys.maxsize, and you end up doing a ton of arbitrary precision int computation that is totally unnecessary. A symbolic bound is better.

But especially after #126905, we can't go back to using sympy.oo, because that advertises that it's not an integer, and now your ValueRanges is typed incorrectly. So what do we do? We define a new numeric constant `int_oo`, which is like `sympy.oo` but it advertises `is_integer`. **test/test_sympy_utils.py** describes some basic properties of the number, and **torch/utils/_sympy/numbers.py** has the actual implementation.
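
The trade-off can be illustrated in plain Python. `IntInfinity` below is a hypothetical toy written for this illustration; it is not the class in torch/utils/_sympy/numbers.py, and it only handles the few operations used here.

```python
import sys

# A concrete sentinel bound grows into ever larger exact integers.
doubled = 2 * (sys.maxsize - 1)
print(doubled > sys.maxsize)

# Floating-point infinity does not count as an integer.
print(float("inf").is_integer())


class IntInfinity:
    """Hypothetical integer-typed infinity (toy version, positive side only)."""

    is_integer = True

    def __add__(self, other):
        return self

    __radd__ = __add__

    def __mul__(self, other):
        if other <= 0:
            raise ValueError("sketch only handles positive multipliers")
        return self

    __rmul__ = __mul__

    def __gt__(self, other):
        return not isinstance(other, IntInfinity)

    def __repr__(self):
        return "int_oo"


INT_OO = IntInfinity()
print(2 * INT_OO, INT_OO > sys.maxsize, INT_OO.is_integer)
```

A symbolic bound absorbs size-increasing arithmetic instead of producing a new large number each time, while still advertising itself as an integer.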

The rest of the changes of the PR are working out the implications of this change. I'll give more commentary as inline comments.

Fixes https://github.com/pytorch/pytorch/issues/127396

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127693
Approved by: https://github.com/lezcano
ghstack dependencies: #126905
2024-06-13 04:08:20 +00:00
rzou
87072dcfdb Change Dynamo's custom ops warning message to be less spammy (#128456)
This is a short-term fix (for 2.4). In the longer term we should
fix https://github.com/pytorch/pytorch/issues/128430

The problem is that warnings.warn that are inside Dynamo print
all the time. Python warnings are supposed to print once, unless their
cache is reset: Dynamo ends up resetting that cache every time it runs.

As a workaround we provide our own warn_once cache that is keyed on the
warning msg. I am not worried about this increasing memory usage because
that's effectively what python's warnings.warn cache does.

Test Plan:
- fix tests.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128456
Approved by: https://github.com/anijain2305
2024-06-12 21:57:12 +00:00
PyTorch MergeBot
5d8c7f39d4 Revert "Introduce int_oo (#127693)"
This reverts commit 9cab5987bd.

Reverted https://github.com/pytorch/pytorch/pull/127693 on behalf of https://github.com/clee2000 due to sorry executorch CI is a bit weird regarding pins, I'll make a chat with mergen with the choices of what to do and how it'll affect executorch CI, reverting for now to prevent more divergences in the meantime ([comment](https://github.com/pytorch/pytorch/pull/127693#issuecomment-2161775400))
2024-06-11 23:36:08 +00:00
Edward Z. Yang
9cab5987bd Introduce int_oo (#127693)
In a previous life, we used sympy.oo to represent the lower/upper bounds of integer ranges. Later, we changed this to be sys.maxsize - 1 for a few reasons: (1) sometimes we do tests on a value being exactly sys.maxsize, and we wanted to avoid a data dependent guard in this case, (2) sympy.oo corresponds to floating point infinity, so you get incorrect types for value ranges with oo, and (3) you can do slightly better reasoning if you assume that input sizes fall within representable 64-bit integer range.

After working in the sys.maxsize regime for a bit, I've concluded that this was actually a bad idea. Specifically, the problem is that you end up with sys.maxsize in your upper bound, and then whenever you do any sort of size-increasing computation like size * 2, you end up with 2 * sys.maxsize, and you end up doing a ton of arbitrary precision int computation that is totally unnecessary. A symbolic bound is better.

But especially after #126905, we can't go back to using sympy.oo, because that advertises that it's not an integer, and now your ValueRanges is typed incorrectly. So what do we do? We define a new numeric constant `int_oo`, which is like `sympy.oo` but it advertises `is_integer`. **test/test_sympy_utils.py** describes some basic properties of the number, and **torch/utils/_sympy/numbers.py** has the actual implementation.

The rest of the changes of the PR are working out the implications of this change. I'll give more commentary as inline comments.

Fixes https://github.com/pytorch/pytorch/issues/127396

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127693
Approved by: https://github.com/lezcano
ghstack dependencies: #126905
2024-06-10 19:09:53 +00:00
Edward Z. Yang
3964a3ec73 Complete revamp of float/promotion sympy handling (#126905)
At a high level, the idea behind this PR is:

* Make it clearer what the promotion and int/float rules for various Sympy operations are. Operators that previously were polymorphic over int/float are now split into separate operators for clarity. We never do mixed int/float addition/multiplication etc in sympy, instead, we always promote to the appropriate operator. (However, equality is currently not done correctly.)
* Enforce strict typing on ValueRanges: if you have a ValueRange for a float, the lower and upper MUST be floats, and so forth for integers.
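
As a rough illustration of the strict typing rule, the sketch below models a range whose bounds must share one numeric type. `StrictRange` is a hypothetical stand-in written for this note; it is not the actual `ValueRanges` class and makes no claim about how that class performs its checks.

```python
from dataclasses import dataclass
from typing import Union

Number = Union[int, float]


@dataclass(frozen=True)
class StrictRange:
    """Hypothetical stand-in for a strictly typed value range (not ValueRanges itself)."""

    lower: Number
    upper: Number

    def __post_init__(self) -> None:
        # Both bounds must have exactly the same numeric type: int/int or float/float.
        if type(self.lower) is not type(self.upper):
            raise TypeError(
                f"mixed bound types: {type(self.lower).__name__} and {type(self.upper).__name__}"
            )
        if type(self.lower) not in (int, float):
            raise TypeError("bounds must be int or float")
        if self.lower > self.upper:
            raise ValueError("lower bound exceeds upper bound")


int_range = StrictRange(0, 10)        # accepted: both bounds are ints
float_range = StrictRange(0.0, 10.0)  # accepted: both bounds are floats

try:
    StrictRange(0, 10.0)              # rejected: int lower, float upper
    mixed_rejected = False
except TypeError:
    mixed_rejected = True
```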

The story begins in **torch/utils/_sympy/functions.py**. Here, I make some changes to how we represent certain operations in sympy expressions:

* FloorDiv now only supports integer inputs; to do float floor division, do a truediv and then a trunc. Additionally, we remove the divide out addition by gcd optimization, because sympy gcd is over fields and is willing to generate rationals (but rationals are bad for ValueRange strict typing).
* ModularIndexing, LShift, RShift now assert they are given integer inputs.
* Mod only supports integer inputs; eventually we will support FloatMod (left for later work, when we build out Sympy support for floating operations). Unfortunately, I couldn't assert integer inputs here, because of a bad interaction with sympy's inequality solver that is used by the offline solver
* TrueDiv is split into FloatTrueDiv and IntTrueDiv. This allows us to eventually generate accurate code for IntTrueDiv with Python semantics, which is written in a special way to preserve precision when the inputs are >= 2**53, beyond what you get by first coercing the integers to floats and then doing true division.
* Trunc is split to TruncToFloat and TruncToInt.
* Round is updated to return a float, not an int, making it consistent with the round op handler in Inductor. To get Python-style conversion to int, we call TruncToInt on the result.
* RoundDecimal updated to consistently only ever return a float
* Add ToFloat for explicit coercion to float (required so we can enforce strict ValueRanges typing)
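
The precision point behind the TrueDiv split can be seen in plain Python, independent of Sympy or Inductor. The sketch below only shows the language-level behavior that IntTrueDiv is meant to preserve; the chosen operands are an illustrative example and not taken from the PR.

```python
# An integer above 2**53 that is exactly divisible by 3 but is not
# exactly representable as a float.
a = 3 * (2**53 + 1)
b = 3

# Python's int / int computes the correctly rounded quotient of the
# exact integers.
exact = a / b

# Coercing first rounds `a` to the nearest float, and the division then
# rounds again, so the error compounds.
lossy = float(a) / float(b)

print(exact, lossy, exact == lossy)
```

The two results differ, which is why a dedicated integer true division is not the same as casting to floats and calling `truediv`.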

In **torch/__init__.py**, we modify SymInt and SymFloat to appropriately call into new bindings that route to these refined sympy operations. Also, we modify `torch.sym_min` and `torch.sym_max` to have promotion semantics (if one argument is a float, the return result is always a float). This makes them inconsistent with builtins.min/max, but makes it possible to do type analysis without runtime information.
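
The difference from the builtins can be modeled in a few lines of plain Python. `promoting_max` below is a hypothetical helper that mimics only the stated rule (any float argument gives a float result); it is not the implementation of `torch.sym_max`.

```python
from typing import Union

Number = Union[int, float]


def promoting_max(a: Number, b: Number) -> Number:
    """Hypothetical model of the promotion rule: any float input gives a float result."""
    if isinstance(a, float) or isinstance(b, float):
        return max(float(a), float(b))
    return max(a, b)


# builtins.max returns whichever argument wins, so the result type depends
# on the runtime values.
builtin_result = max(2, 1.5)

# With promotion, the result type depends only on the input types.
promoted_result = promoting_max(2, 1.5)

print(type(builtin_result).__name__, type(promoted_result).__name__)
```

Because the result type is a function of the argument types alone, it can be determined before any concrete values are known.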

We also need to introduce some new op handlers in **torch/_inductor/ops_handler.py**:

* `to_int` for truncation to int64, directly corresponding to TruncToInt; this can be implemented by trunc and dtype, but with a dedicated handler it is more convenient for roundtripping in Sympy
* `int_truediv` for Python-style integer true division, which has higher precision than casting to floats and then running `truediv`

These changes have consequences. First, we need to make some administrative changes:

* Actually wire up these Sympy functions from SymInt/SymFloat in **torch/fx/experimental/sym_node.py**, including the new promotion rules (promote2)
* Add support for new Sympy functions in **torch/utils/_sympy/interp.py**, **torch/utils/_sympy/reference.py**
  * In particular, in torch.utils._sympy.reference, we have a strong preference to NOT do nontrivial compute, instead, everything in ops handler should map to a singular sympy function
  * TODO: I chose to roundtrip mod back to our Mod function, but I think I'm going to have to deal with the C/Python inconsistency to fix tests here
* Add printer support for the Sympy functions in **torch/_inductor/codegen/common.py**, **torch/_inductor/codegen/cpp_utils.py**, **torch/_inductor/codegen/triton.py**. `int_truediv` and mixed precision equality are currently not implemented soundly, so we will lose precision in codegen for large values. TODO: The additions here are not exhaustive yet
* Update ValueRanges logic to use new sympy functions in **torch/utils/_sympy/value_ranges.py**. In general, we prefer to use the new Sympy function rather than try to roll things by hand, which is what was done previously for many VR analysis functions.
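
The C/Python inconsistency mentioned in the TODO above presumably refers to the sign convention of the remainder for negative operands. A minimal sketch in plain Python, using `math.fmod` as a stand-in for C-style truncated remainder:

```python
import math

a, b = -7, 3

# Python's % takes the sign of the divisor (floored division).
python_mod = a % b

# math.fmod follows the C convention: the result takes the sign of the
# dividend (truncated division).
c_style_mod = int(math.fmod(a, b))

# For non-negative operands the two conventions agree.
agree_when_nonnegative = (7 % 3) == int(math.fmod(7, 3))

print(python_mod, c_style_mod, agree_when_nonnegative)
```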

In **torch/fx/experimental/symbolic_shapes.py** we need to make some symbolic reasoning adjustments:

* Avoid generation of rational subexpressions by removing simplification of `x // y` into `floor(x / y)`. This simplification then triggers an addition simplification rule `(x + y) / c --> x / c + y / c` which is bad because x / c is a rational number now
* `_assert_bound_is_rational` is no more, we no longer generate rational bounds
* Don't intersect non-int value ranges with the `int_range`
* Support more sympy Functions for guard SYMPY_INTERP
* Assert the type of value range is consistent with the variable type

The new asserts uncovered necessary bug fixes:

* **torch/_inductor/codegen/cpp.py**, **torch/_inductor/select_algorithm.py**, **torch/_inductor/sizevars.py** - Ensure Wild/Symbol manually allocated in Inductor is marked `is_integer` so it's accepted to build expressions
* **torch/_inductor/utils.py** - make sure you actually pass in sympy.Expr to these functions
* **torch/_inductor/ir.py** - make_contiguous_strides_for takes int/SymInt, not sympy.Expr!
* **torch/export/dynamic_shapes.py** - don't use infinity to represent int ranges, instead use sys.maxsize - 1

Because of the removal of some symbolic reasoning that produced rationals, some of our symbolic reasoning has gotten worse and we are unable to simplify some guards. Check the TODO at **test/test_proxy_tensor.py**

**Reland notes.** This requires this internal fbcode diff https://www.internalfb.com/phabricator/paste/view/P1403322587 but I cannot prepare the diff codev due to https://fb.workplace.com/groups/osssupport/posts/26343544518600814/

It also requires this Executorch PR https://github.com/pytorch/executorch/pull/3911 but the ET PR can be landed prior to this landing.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/126905
Approved by: https://github.com/xadupre, https://github.com/lezcano
2024-06-09 06:20:25 +00:00
Edward Z. Yang
73d6ec2db6 Increase verbosity of FX graph dumps (#128042)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128042
Approved by: https://github.com/aorenste
2024-06-08 07:24:58 +00:00
PyTorch MergeBot
ac51f782fe Revert "Complete revamp of float/promotion sympy handling (#126905)"
This reverts commit 2f7cfecd86.

Reverted https://github.com/pytorch/pytorch/pull/126905 on behalf of https://github.com/atalman due to Sorry need to revert - failing internally ([comment](https://github.com/pytorch/pytorch/pull/126905#issuecomment-2155118778))
2024-06-07 16:01:46 +00:00
PyTorch MergeBot
224b4339e5 Revert "Make ValueRange repr less chatty by default (#128043)"
This reverts commit f0dd11df55.

Reverted https://github.com/pytorch/pytorch/pull/128043 on behalf of https://github.com/atalman due to Sorry reverting because in conflict with [#126905](https://github.com/pytorch/pytorch/pull/126905) which needs to be reverted ([comment](https://github.com/pytorch/pytorch/pull/128043#issuecomment-2155091732))
2024-06-07 15:43:39 +00:00
Edward Z. Yang
f0dd11df55 Make ValueRange repr less chatty by default (#128043)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128043
Approved by: https://github.com/lezcano
2024-06-06 16:42:48 +00:00
Edward Z. Yang
2f7cfecd86 Complete revamp of float/promotion sympy handling (#126905)
At a high level, the idea behind this PR is:

* Make it clearer what the promotion and int/float rules for various Sympy operations are. Operators that previously were polymorphic over int/float are now split into separate operators for clarity. We never do mixed int/float addition/multiplication etc in sympy, instead, we always promote to the appropriate operator. (However, equality is currently not done correctly.)
* Enforce strict typing on ValueRanges: if you have a ValueRange for a float, the lower and upper MUST be floats, and so forth for integers.

The story begins in **torch/utils/_sympy/functions.py**. Here, I make some changes to how we represent certain operations in sympy expressions:

* FloorDiv now only supports integer inputs; to do float floor division, do a truediv and then a trunc. Additionally, we remove the divide out addition by gcd optimization, because sympy gcd is over fields and is willing to generate rationals (but rationals are bad for ValueRange strict typing).
* ModularIndexing, LShift, RShift now assert they are given integer inputs.
* Mod only supports integer inputs; eventually we will support FloatMod (left for later work, when we build out Sympy support for floating operations). Unfortunately, I couldn't assert integer inputs here, because of a bad interaction with sympy's inequality solver that is used by the offline solver
* TrueDiv is split into FloatTrueDiv and IntTrueDiv. This allows us to eventually generate accurate code for IntTrueDiv with Python semantics, which is written in a special way to preserve precision when the inputs are >= 2**53, beyond what you get by first coercing the integers to floats and then doing true division.
* Trunc is split to TruncToFloat and TruncToInt.
* Round is updated to return a float, not an int, making it consistent with the round op handler in Inductor. To get Python-style conversion to int, we call TruncToInt on the result.
* RoundDecimal updated to consistently only ever return a float
* Add ToFloat for explicit coercion to float (required so we can enforce strict ValueRanges typing)

In **torch/__init__.py**, we modify SymInt and SymFloat to appropriately call into new bindings that route to these refined sympy operations. Also, we modify `torch.sym_min` and `torch.sym_max` to have promotion semantics (if one argument is a float, the return result is always a float). This makes them inconsistent with builtins.min/max, but makes it possible to do type analysis without runtime information.

We also need to introduce some new op handlers in **torch/_inductor/ops_handler.py**:

* `to_int` for truncation to int64, directly corresponding to TruncToInt; this can be implemented by trunc and dtype, but with a dedicated handler it is more convenient for roundtripping in Sympy
* `int_truediv` for Python-style integer true division, which has higher precision than casting to floats and then running `truediv`

These changes have consequences. First, we need to make some administrative changes:

* Actually wire up these Sympy functions from SymInt/SymFloat in **torch/fx/experimental/sym_node.py**, including the new promotion rules (promote2)
* Add support for new Sympy functions in **torch/utils/_sympy/interp.py**, **torch/utils/_sympy/reference.py**
  * In particular, in torch.utils._sympy.reference, we have a strong preference to NOT do nontrivial compute, instead, everything in ops handler should map to a singular sympy function
  * TODO: I chose to roundtrip mod back to our Mod function, but I think I'm going to have to deal with the C/Python inconsistency to fix tests here
* Add printer support for the Sympy functions in **torch/_inductor/codegen/common.py**, **torch/_inductor/codegen/cpp_utils.py**, **torch/_inductor/codegen/triton.py**. `int_truediv` and mixed precision equality are currently not implemented soundly, so we will lose precision in codegen for large values. TODO: The additions here are not exhaustive yet
* Update ValueRanges logic to use new sympy functions in **torch/utils/_sympy/value_ranges.py**. In general, we prefer to use the new Sympy function rather than try to roll things by hand, which is what was done previously for many VR analysis functions.

In **torch/fx/experimental/symbolic_shapes.py** we need to make some symbolic reasoning adjustments:

* Avoid generation of rational subexpressions by removing simplification of `x // y` into `floor(x / y)`. This simplification then triggers an addition simplification rule `(x + y) / c --> x / c + y / c` which is bad because x / c is a rational number now
* `_assert_bound_is_rational` is no more, we no longer generate rational bounds
* Don't intersect non-int value ranges with the `int_range`
* Support more sympy Functions for guard SYMPY_INTERP
* Assert the type of value range is consistent with the variable type

The new asserts uncovered necessary bug fixes:

* **torch/_inductor/codegen/cpp.py**, **torch/_inductor/select_algorithm.py**, **torch/_inductor/sizevars.py** - Ensure Wild/Symbol manually allocated in Inductor is marked `is_integer` so it's accepted to build expressions
* **torch/_inductor/utils.py** - make sure you actually pass in sympy.Expr to these functions
* **torch/_inductor/ir.py** - make_contiguous_strides_for takes int/SymInt, not sympy.Expr!
* **torch/export/dynamic_shapes.py** - don't use infinity to represent int ranges, instead use sys.maxsize - 1

Because of the removal of some symbolic reasoning that produced rationals, some of our symbolic reasoning has gotten worse and we are unable to simplify some guards. Check the TODO at **test/test_proxy_tensor.py**

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/126905
Approved by: https://github.com/xadupre, https://github.com/lezcano
2024-06-06 02:29:45 +00:00
PyTorch MergeBot
d5cb5d623a Revert "Complete revamp of float/promotion sympy handling (#126905)"
This reverts commit fb696ef3aa.

Reverted https://github.com/pytorch/pytorch/pull/126905 on behalf of https://github.com/ezyang due to internal user reported ceiling equality simplification problem, I have a plan ([comment](https://github.com/pytorch/pytorch/pull/126905#issuecomment-2148805840))
2024-06-05 03:57:58 +00:00
Edward Z. Yang
fb696ef3aa Complete revamp of float/promotion sympy handling (#126905)
At a high level, the idea behind this PR is:

* Make it clearer what the promotion and int/float rules for various Sympy operations are. Operators that previously were polymorphic over int/float are now split into separate operators for clarity. We never do mixed int/float addition/multiplication etc in sympy, instead, we always promote to the appropriate operator. (However, equality is currently not done correctly.)
* Enforce strict typing on ValueRanges: if you have a ValueRange for a float, the lower and upper MUST be floats, and so forth for integers.

The story begins in **torch/utils/_sympy/functions.py**. Here, I make some changes to how we represent certain operations in sympy expressions:

* FloorDiv now only supports integer inputs; to do float floor division, do a truediv and then a trunc. Additionally, we remove the divide out addition by gcd optimization, because sympy gcd is over fields and is willing to generate rationals (but rationals are bad for ValueRange strict typing).
* ModularIndexing, LShift, RShift now assert they are given integer inputs.
* Mod only supports integer inputs; eventually we will support FloatMod (left for later work, when we build out Sympy support for floating operations). Unfortunately, I couldn't assert integer inputs here, because of a bad interaction with sympy's inequality solver that is used by the offline solver
* TrueDiv is split into FloatTrueDiv and IntTrueDiv. This allows us to eventually generate accurate code for IntTrueDiv with Python semantics, which is written in a special way to preserve precision when the inputs are >= 2**53, beyond what you get by first coercing the integers to floats and then doing true division.
* Trunc is split to TruncToFloat and TruncToInt.
* Round is updated to return a float, not an int, making it consistent with the round op handler in Inductor. To get Python-style conversion to int, we call TruncToInt on the result.
* RoundDecimal updated to consistently only ever return a float
* Add ToFloat for explicit coercion to float (required so we can enforce strict ValueRanges typing)

In **torch/__init__.py**, we modify SymInt and SymFloat to appropriately call into new bindings that route to these refined sympy operations. Also, we modify `torch.sym_min` and `torch.sym_max` to have promotion semantics (if one argument is a float, the return result is always a float). This makes them inconsistent with builtins.min/max, but makes it possible to do type analysis without runtime information.

We also need to introduce some new op handlers in **torch/_inductor/ops_handler.py**:

* `to_int` for truncation to int64, directly corresponding to TruncToInt; this can be implemented by trunc and dtype, but with a dedicated handler it is more convenient for roundtripping in Sympy
* `int_truediv` for Python-style integer true division, which has higher precision than casting to floats and then running `truediv`

These changes have consequences. First, we need to make some administrative changes:

* Actually wire up these Sympy functions from SymInt/SymFloat in **torch/fx/experimental/sym_node.py**, including the new promotion rules (promote2)
* Add support for new Sympy functions in **torch/utils/_sympy/interp.py**, **torch/utils/_sympy/reference.py**
  * In particular, in torch.utils._sympy.reference, we have a strong preference to NOT do nontrivial compute, instead, everything in ops handler should map to a singular sympy function
  * TODO: I chose to roundtrip mod back to our Mod function, but I think I'm going to have to deal with the C/Python inconsistency to fix tests here
* Add printer support for the Sympy functions in **torch/_inductor/codegen/common.py**, **torch/_inductor/codegen/cpp_utils.py**, **torch/_inductor/codegen/triton.py**. `int_truediv` and mixed precision equality are currently not implemented soundly, so we will lose precision in codegen for large values. TODO: The additions here are not exhaustive yet
* Update ValueRanges logic to use new sympy functions in **torch/utils/_sympy/value_ranges.py**. In general, we prefer to use the new Sympy function rather than try to roll things by hand, which is what was done previously for many VR analysis functions.

In **torch/fx/experimental/symbolic_shapes.py** we need to make some symbolic reasoning adjustments:

* Avoid generation of rational subexpressions by removing simplification of `x // y` into `floor(x / y)`. This simplification then triggers an addition simplification rule `(x + y) / c --> x / c + y / c` which is bad because x / c is a rational number now
* `_assert_bound_is_rational` is no more, we no longer generate rational bounds
* Don't intersect non-int value ranges with the `int_range`
* Support more sympy Functions for guard SYMPY_INTERP
* Assert the type of value range is consistent with the variable type

The new asserts uncovered necessary bug fixes:

* **torch/_inductor/codegen/cpp.py**, **torch/_inductor/select_algorithm.py**, **torch/_inductor/sizevars.py** - Ensure Wild/Symbol manually allocated in Inductor is marked `is_integer` so it's accepted to build expressions
* **torch/_inductor/utils.py** - make sure you actually pass in sympy.Expr to these functions
* **torch/_inductor/ir.py** - make_contiguous_strides_for takes int/SymInt, not sympy.Expr!
* **torch/export/dynamic_shapes.py** - don't use infinity to represent int ranges, instead use sys.maxsize - 1

Because of the removal of some symbolic reasoning that produced rationals, some of our symbolic reasoning has gotten worse and we are unable to simplify some guards. Check the TODO at **test/test_proxy_tensor.py**

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/126905
Approved by: https://github.com/xadupre, https://github.com/lezcano
2024-06-04 11:47:32 +00:00
Michael Lazos
2129903aa3 Properly detect nested torch function args (#127496)
Dynamo was not detecting nested torch function classes in containers. This was due to pytree compatibility for variable trackers being removed.
Fixes https://github.com/pytorch/pytorch/issues/127174

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127496
Approved by: https://github.com/anijain2305
2024-06-02 03:43:22 +00:00
Animesh Jain
efcea2d2fd [dynamo] Support __getitem__ on NNModuleVariable __dict__ (#126956)
Moves further along (but still fails) for the testcase in https://github.com/pytorch/pytorch/pull/126875

Pull Request resolved: https://github.com/pytorch/pytorch/pull/126956
Approved by: https://github.com/jansel
ghstack dependencies: #126923
2024-06-01 15:22:45 +00:00
Animesh Jain
4aa7a1efcf [dynamo] Initial exception handling support (#126923)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/126923
Approved by: https://github.com/williamwen42, https://github.com/jansel
2024-06-01 13:00:32 +00:00
rzou
ffe506e853 Better graph break msg (and warning) on Dynamo x Python C++ extension (#127301)
Dynamo graph breaks on Python C/C++ extensions (e.g. pybinded
functions). The usual way to handle this is to turn those extensions
into custom ops. This PR adds a nicer graph break message and also
changes it to unconditionally warn on this graph break (because graph
break messages are usually not visible).

Fixes https://github.com/pytorch/pytorch/issues/126799

Test Plan:
- new test

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127301
Approved by: https://github.com/jansel
ghstack dependencies: #127291, #127292, #127400, #127423
2024-05-30 14:54:29 +00:00
laithsakka
5196ef1b59 support builtin id function on user defined object variables. (#127146)
Fix: https://github.com/pytorch/pytorch/pull/127146

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127146
Approved by: https://github.com/anijain2305
ghstack dependencies: #126444
2024-05-29 19:00:37 +00:00
Pian Pawakapan
8a31c2aa84 [export] allow complex guards as runtime asserts (#127129)
With the current state of export's dynamic shapes, we struggle with guards and constraints that are beyond the current dynamic shapes language, expressed with dims and derived dims. While we can compile and guarantee correctness for guards within the current language (e.g. min/max ranges, linear relationships, integer divisibility) we struggle to dynamically compile guards which extend beyond that.

For these "complex" guards, we typically do either of the following: 1) raise a constraint violation error, along the lines of "not all values of <symbol> in the specified range satisfy <guard>", with or without suggested fixes, 2) specialize to the provided static values and suggest removing dynamism, or 3) fail compilation due to some arbitrary unsupported case. Previous [work](https://github.com/pytorch/pytorch/pull/124949) went towards resolving this by disabling forced specializations, instead allowing the user to fail at runtime with incorrect inputs.

In this PR, relying on [hybrid backed-unbacked symints](https://github.com/pytorch/pytorch/issues/121749), [deferred runtime asserts](https://github.com/pytorch/pytorch/blob/main/torch/fx/passes/runtime_assert.py), and the function [_is_supported_equivalence()](d7de4c9d80/torch/fx/experimental/symbolic_shapes.py (L1824)), we add a flag `_allow_complex_guards_as_runtime_asserts` which allows the user to compile exported programs containing these guards and maintain dynamism, while adding correctness checks as runtime assertions in the graph.

Hybrid backed-unbacked symints allow us to easily bypass "implicit" guards emitted from computation - guards that we ~expect to be true. Popular examples revolve around reshapes:
```
# reshape
def forward(self, x, y):  # x: [s0, s1], y: [s2]
    return x.reshape([-1]) + y  # guard s0 * s1 = s2
```

This leads to the following exported program:

```
class GraphModule(torch.nn.Module):
    def forward(self, x: "f32[s0, s1]", y: "f32[s2]"):
        sym_size_int: "Sym(s2)" = torch.ops.aten.sym_size.int(y, 0)
        mul: "Sym(-s2)" = -1 * sym_size_int;  sym_size_int = None
        sym_size_int_1: "Sym(s0)" = torch.ops.aten.sym_size.int(x, 0)
        sym_size_int_2: "Sym(s1)" = torch.ops.aten.sym_size.int(x, 1)
        mul_1: "Sym(s0*s1)" = sym_size_int_1 * sym_size_int_2;  sym_size_int_1 = sym_size_int_2 = None
        add: "Sym(s0*s1 - s2)" = mul + mul_1;  mul = mul_1 = None
        eq: "Sym(Eq(s0*s1 - s2, 0))" = add == 0;  add = None
        _assert_scalar = torch.ops.aten._assert_scalar.default(eq, "Runtime assertion failed for expression Eq(s0*s1 - s2, 0) on node 'eq'");  eq = None

        view: "f32[s0*s1]" = torch.ops.aten.view.default(x, [-1]);  x = None
        add_1: "f32[s0*s1]" = torch.ops.aten.add.Tensor(view, y);  view = y = None
        return (add_1,)
```
Another case is symbol divisibility:
```
def forward(self, x):  # x: [s0, s1]
    return x.reshape([-1, x.shape[0] - 1])  # Eq(Mod(s0 * s1, s0 - 1), 0)
```
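
For concrete sizes, the divisibility guard in the comment above reduces to a simple integer check. The helper below is a hypothetical illustration of the condition `Eq(Mod(s0 * s1, s0 - 1), 0)`; it is not part of the export machinery.

```python
def reshape_guard_holds(s0: int, s1: int) -> bool:
    """Hypothetical check of Eq(Mod(s0 * s1, s0 - 1), 0) for concrete sizes."""
    # reshape([-1, s0 - 1]) needs a positive second dimension that evenly
    # divides the total number of elements s0 * s1.
    return s0 > 1 and (s0 * s1) % (s0 - 1) == 0


print(reshape_guard_holds(3, 4))  # 12 elements, second dimension 2
print(reshape_guard_holds(4, 4))  # 16 elements, second dimension 3
```

Inputs for which the check is false are the ones the runtime assertion is meant to reject.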

Applying deferred runtime asserts also helps dynamic compilation for "explicit" complex guards that typically cause problems for export. For example we can generate runtime asserts for not-equal guards, and complex conditions like the following:
```
class Foo(torch.nn.Module):
    def forward(self, x, y):
        # check that negation of first guard also shows up as runtime assertion
        if x.shape[0] == y.shape[0]:  # False
            return x + y
        elif x.shape[0] == y.shape[0] ** 3:  # False
            return x + 2, y + 3
        elif x.shape[0] ** 2 == y.shape[0] * 3:  # True
            return x * 2.0, y * 3.0
```
For the above graph we will generate 3 runtime assertions: the negation of the first 2, and the 3rd condition as a guard.
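
Spelled out for concrete sizes, the three assertions amount to the following check. This is a sketch with a hypothetical helper name, where `s0` stands for `x.shape[0]` and `s1` for `y.shape[0]`; the real assertions are emitted as graph ops, not as Python like this.

```python
def runtime_asserts_hold(s0: int, s1: int) -> bool:
    """Hypothetical model of the three runtime assertions for concrete sizes."""
    return (
        s0 != s1                 # negation of the first branch condition
        and s0 != s1 ** 3        # negation of the second branch condition
        and s0 ** 2 == s1 * 3    # third branch condition, the path that was traced
    )


print(runtime_asserts_hold(6, 12))  # 6 != 12, 6 != 1728, 36 == 36
print(runtime_asserts_hold(3, 3))   # first branch would have been taken instead
```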

One additional benefit here over the current state of exported programs is that this adds further correctness guarantees - previously with explicit complex guards, if compilation succeeded, the guards would be ignored at runtime, treated as given.

As shown above, the runtime asserts appear as math ops in the graph, generated by the sympy interpreter, resulting in an _assert_scalar call. There is an option to avoid adding these asserts into the graph, by setting `TORCH_DYNAMO_DO_NOT_EMIT_RUNTIME_ASSERTS=1`. This results in the "original" computation graph, with dynamism, and any incorrect inputs will fail on ops during runtime. Further work could go into prettifying the printer, so the majority of the graph isn't guard-related.

Ideally this PR would subsume and remove the recently added [_disable_forced_specializations](https://github.com/pytorch/pytorch/pull/124949) flag, but that flag still handles one additional case of specialization: single-variable equalities where the symbol is solvable for a concrete value: see this [PR](https://github.com/pytorch/pytorch/pull/126925)

This PR doesn't change any behavior around data-dependent errors/unbacked symints yet, that could be further work.

NOTE: will take naming change suggestions for the flag :)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127129
Approved by: https://github.com/avikchaudhuri
2024-05-29 17:15:25 +00:00
William Wen
719589c9bf [dynamo] move bytecode tests from test_misc to new bytecode test file (#127329)
Also merge with bytecode hook test.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127329
Approved by: https://github.com/yanboliang, https://github.com/jansel
2024-05-29 06:10:59 +00:00
Xuehai Pan
26f4f10ac8 [5/N][Easy] fix typo for usort config in pyproject.toml (kown -> known): sort torch (#127126)
The `usort` config in `pyproject.toml` has no effect due to a typo. Fixing the typo makes `usort` do more and generates the changes in the PR. Except for `pyproject.toml`, all changes are generated by `lintrunner -a --take UFMT --all-files`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127126
Approved by: https://github.com/kit1980
2024-05-27 14:49:57 +00:00
PyTorch MergeBot
55c0ab2887 Revert "[5/N][Easy] fix typo for usort config in pyproject.toml (kown -> known): sort torch (#127126)"
This reverts commit 7763c83af6.

Reverted https://github.com/pytorch/pytorch/pull/127126 on behalf of https://github.com/XuehaiPan due to Broken CI ([comment](https://github.com/pytorch/pytorch/pull/127126#issuecomment-2133044286))
2024-05-27 09:22:08 +00:00
Xuehai Pan
7763c83af6 [5/N][Easy] fix typo for usort config in pyproject.toml (kown -> known): sort torch (#127126)
The `usort` config in `pyproject.toml` has no effect due to a typo. Fixing the typo makes `usort` do more and generates the changes in the PR. Except for `pyproject.toml`, all changes are generated by `lintrunner -a --take UFMT --all-files`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127126
Approved by: https://github.com/kit1980
ghstack dependencies: #127122, #127123, #127124, #127125
2024-05-27 04:22:18 +00:00
Animesh Jain
f0366de414 [dynamo] Support __contains__ on obj.__dict__ (#126922)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/126922
Approved by: https://github.com/jansel, https://github.com/yanboliang
2024-05-23 09:01:29 +00:00
laithsakka
b0e849870e Change error message when nn module inlining is enabled for MiscTests.test_map_side_effects (#126444)
#fix https://github.com/pytorch/pytorch/issues/126355

Pull Request resolved: https://github.com/pytorch/pytorch/pull/126444
Approved by: https://github.com/anijain2305
2024-05-22 23:24:03 +00:00
Peter Bell
51c07f9f69 [dynamo] Allow asserts to fail (#126661)
Currently if an assertion is statically known to be false, dynamo converts it to
`_assert_async` which inductor currently ignores. Instead this graph breaks to
raise the original assertion.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/126661
Approved by: https://github.com/ezyang
2024-05-21 02:42:13 +00:00
Animesh Jain
7aa068f350 [dynamo][inline-inbuilt-nn-modules] Change test to not depend on id of mod instance (#126314)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/126314
Approved by: https://github.com/williamwen42
ghstack dependencies: #126303, #126316
2024-05-16 01:35:09 +00:00
Edward Z. Yang
534ddfa619 Move compute unbacked bindings call to track_tensor_tree (#126168)
This ensures we hit it in all the HOP proxy tensor implementations

Fixes https://github.com/pytorch/pytorch/issues/125869

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/126168
Approved by: https://github.com/ydwu4
2024-05-14 21:05:05 +00:00
Edward Z. Yang
db3b38202b Improve dead code elimination of unnecessary int arguments (#126074)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/126074
Approved by: https://github.com/lezcano
ghstack dependencies: #125325, #125915
2024-05-14 17:22:30 +00:00
Edward Z. Yang
2ba102f689 Implement native support for float inputs in Dynamo and ShapeEnv (#125325)
The big idea is that floats are treated as Tensors on input/output to the FX graph, but on the inside, we immediately call item() on the synthetic Tensor and record regular float operations on it. Canonicalization to Tensor operations will happen in a standalone FX pass. This behavior is enabled when the `specialize_float` config variable is set to False.

The generated graph looks like this for the test `test_unspec_float_output`:

```
 def forward(self, L_x_: "f32[3]", L_y_: "f32[]"):
     l_x_ = L_x_
     l_y_ = L_y_

     # File: /data/users/ezyang/a/pytorch/test/dynamo/test_unspec.py:511 in f, code: return x + 1, y * 2
     add: "f32[3]" = l_x_ + 1;  l_x_ = None
     item: "Sym(zf0)" = l_y_.item();  l_y_ = None
     mul: "Sym(2*zf0)" = item * 2;  item = None
     scalar_tensor: "f32[]" = torch.scalar_tensor(mul);  mul = None
     return (add, scalar_tensor)
```

The ingredients:

* **torch/_dynamo/variables/builder.py** When `specialize_float` is False, we wrap float literals with `wrap_symfloat`. This is an unholy mashup of `wrap_symint` and `wrap_unspecialized_primitive`. The overall strategy is that we first generate a tensor argument (because that's what we want to show up into the FX graph), but then immediately call item() on the tensor argument to get a SymNodeVariable, which we will do the rest of the tracing with.  Importantly, this SymNodeVariable is backed with the source of the original float: this means we can guard on the resulting value (something we could NOT do with UnspecializedPythonVariable). This has to be done manually, because if you literally call item() on the tensor, you will end up with an unbacked float. There is a bit of copy paste from wrap_symint and wrap_unspecialized_primitive which we can try to factor out, but this really is its own thing and you should review every line of code in the function.
* **torch/fx/experimental/symbolic_shapes.py** We now can generate guards on float inputs, and these guards are handled inside of ShapeEnv. So we need to be able to allocate (backed!) float symbols, and produce guards for them. Fairly straightforward generalization.
* **torch/_dynamo/codegen.py** I also need to maintain the invariant that there are no float outputs to the FX graph. I chose to do this at codegen time. When we detect a SymNodeVariable on the return stack for a float, we on the fly convert it (via `as_tensor`) to a TensorVariable, which is the true output. We then special case the output bytecode to call item() on it again. The tensor conversion is memoized on SymNodeVariable since we typically run the code generation process twice.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/125325
Approved by: https://github.com/lezcano, https://github.com/jansel
2024-05-14 04:10:01 +00:00
Animesh Jain
a7575e8bd5 [dynamo] Use correct source for custom getattr (#125828)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/125828
Approved by: https://github.com/williamwen42
2024-05-09 20:37:23 +00:00
Edward Z. Yang
1b1d593c8c Don't call item() into torch.scalar_tensor uselessly (#125373)
Fixes https://github.com/pytorch/pytorch/issues/125368

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/125373
Approved by: https://github.com/Skylion007
2024-05-05 22:38:16 +00:00
PyTorch MergeBot
a32ad828dc Revert "Don't call item() into torch.scalar_tensor uselessly (#125373)"
This reverts commit 2b4fe183db.

Reverted https://github.com/pytorch/pytorch/pull/125373 on behalf of https://github.com/huydhn due to Sorry for reverting your change, but there are real failures on the PR that sneak in during the log classifier outage ([comment](https://github.com/pytorch/pytorch/pull/125373#issuecomment-2094464241))
2024-05-04 22:22:36 +00:00
Animesh Jain
5ba777f46e [guards][cpp-guards] Optimize NN module getattr guards (#124522)
Improves the guard overhead of MobileBert model with nn module guards from 92000 units to 20000 units.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/124522
Approved by: https://github.com/jansel
ghstack dependencies: #125439, #125421
2024-05-04 22:08:56 +00:00
Edward Z. Yang
2b4fe183db Don't call item() into torch.scalar_tensor uselessly (#125373)
Fixes https://github.com/pytorch/pytorch/issues/125368

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/125373
Approved by: https://github.com/Skylion007
2024-05-04 08:07:13 +00:00
Edward Z. Yang
e93b57a570 Add propagate_real_tensors mode for unbacked (#125115)
A common complaint when working with data-dependent code in PyTorch is that it's hard to tell how far you are from the finish line: every time a GuardOnDataDependentSymNode error is hit, you have to somehow fix or work around it to see the next one.

This PR adds a new mode `torch._functorch.config.fake_tensor_propagate_real_tensors` which modifies fake tensors to also propagate real tensors. This means that when we try to guard on a data-dependent SymNode, we can actually produce a real result. We also produce a warning which you should consult to figure out what the crux points are.

I ran this on vision_maskrcnn. In the baseline (without this mode), the model has 27 graph breaks, resulting in 40 graphs. With this mode on, the model has only 11 graph breaks, resulting in 15 graphs (the remaining graph breaks are due to missing functionality for item() on float tensor and some other Dynamo missing features.) You get a list of things that would have errored like this:

```
WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Max(1, u1) < 2) -> True
WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Eq(Max(1, u1), 1)) -> True
WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Eq(Max(1, u1), 1)) -> True
WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Ne(Max(1, u1), 1)) -> False
WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Max(1, u0) < 2) -> True
WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Eq(Max(1, u0), 1)) -> True
WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Eq(Max(1, u0), 1)) -> True
WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Ne(Max(1, u0), 1)) -> False
WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Max(1, u1) < 2) -> True
WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Eq(Max(1, u1), 1)) -> True
WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Eq(Max(1, u1), 1)) -> True
WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Ne(Max(1, u1), 1)) -> False
WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Max(1, u0) < 2) -> True
WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Eq(Max(1, u0), 1)) -> True
WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Eq(Max(1, u0), 1)) -> True
WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Ne(Max(1, u0), 1)) -> False
WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Max(1, u1) < 2) -> False
WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Eq(Max(1, u1), 1)) -> False
WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Ne(Max(1, u1), 1)) -> True
WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Max(1, u0) < 2) -> False
WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Eq(Max(1, u0), 1)) -> False
```

Potential later follow ups:

* Improve the warning messages (in particular, should provide user frames)
* GC real tensors when they are no longer needed by tracing. Right now, this will use A LOT of memory, equal to as if your GC was broken and every intermediate tensor was kept live

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/125115
Approved by: https://github.com/IvanKobzarev
2024-05-02 15:28:26 +00:00
Avik Chaudhuri
746da8755c switch tests from constrain_as* to torch._check* (#125253)
To fix data-dependent errors we want to recommend that people use `torch._check*` APIs. The `constrain_as*` APIs should be fully subsumed by them, and in the future we should kill them entirely.

Differential Revision: D56774333

Pull Request resolved: https://github.com/pytorch/pytorch/pull/125253
Approved by: https://github.com/ezyang
2024-05-01 21:01:27 +00:00
Animesh Jain
37c993546d [dynamo][guards] Bug fix for set_export_info (#125275)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/125275
Approved by: https://github.com/yanboliang
2024-05-01 03:46:26 +00:00
Sam Larsen
74e8817311 [inductor] Minor fixes to various tests before enabling fx graph caching in OSS by default (#125258)
Summary: Discovered breakages by enabling codecache by default and doing a CI run. I'll commit these fixes first and eventually enabling caching by default will (hopefully) be a one-liner.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/125258
Approved by: https://github.com/eellison
2024-05-01 02:34:01 +00:00
William Wen
d6c713884a [dynamo, 3.12] xfail refleaking tests due to buggy getattr_static (#125062)
For tracking https://github.com/pytorch/pytorch/issues/124302 so that we can re-enable the test once 3.12 updates with the bug fix for https://github.com/python/cpython/issues/118013.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/125062
Approved by: https://github.com/anijain2305, https://github.com/jansel
2024-04-30 22:40:47 +00:00
Edward Z. Yang
e5e623af4b Codegen runtime asserts in Inductor (#124874)
This completely subsumes https://github.com/pytorch/pytorch/pull/120816

This makes use of the unbacked binding machinery to teach Inductor how to generate deferred runtime asserts directly. There is some back story about why I did it this way, let me explain.

Previously, our strategy for generating runtime asserts was that Dynamo would insert them into the FX graph after finishing tracing, and we would attempt to code generate them based on the FX graph. This is a good strategy for export, where we immediately export the graph. However, this strategy was afflicted by problems in eager, where we reuse the same ShapeEnv as before. In particular, on subsequent graph passes, we would immediately turn all of these assertions into noops, because when we evaluated their expressions, we would see that because we had a deferred runtime assert in the ShapeEnv, we know "oh, of course this expression is True" already. Oops!
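
The failure mode described above can be modeled in a few lines of plain Python. This is only a toy sketch: `ToyShapeEnv`, `statically_known_true` and `insert_asserts` are hypothetical names made up for illustration, and expressions are plain strings rather than sympy expressions. It shows why re-inserting asserts against a reused environment yields nothing on the second pass.

```python
class ToyShapeEnv:
    """Hypothetical stand-in for ShapeEnv: remembers deferred runtime asserts."""

    def __init__(self):
        self.deferred_asserts = set()

    def defer_runtime_assert(self, expr):
        # Record the expression as a fact that will be checked at runtime.
        self.deferred_asserts.add(expr)

    def statically_known_true(self, expr):
        # Anything already deferred is assumed to hold from now on.
        return expr in self.deferred_asserts


def insert_asserts(env, exprs):
    """Return the asserts that survive being re-emitted into a graph."""
    emitted = []
    for expr in exprs:
        if env.statically_known_true(expr):
            continue  # simplified away: the assert silently becomes a no-op
        env.defer_runtime_assert(expr)
        emitted.append(expr)
    return emitted


env = ToyShapeEnv()
first_pass = insert_asserts(env, ["u0 >= 0", "u0 < 10"])
second_pass = insert_asserts(env, ["u0 >= 0", "u0 < 10"])  # same env reused
print(first_pass)
print(second_pass)
```

In this model the second pass emits no asserts at all, which is the "Oops!" case: the environment already believes the expressions, so nothing is left in the graph to check them.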

So, with this PR, we take the attitude that as long as the ShapeEnv sticks around, the ShapeEnv's list of deferred runtime asserts is the source of truth, and we don't put anything in the graph. So we just need to decide when to actually generate asserts, and the place I picked was Inductor lowering, since we already have an AssertScalar buffer concept, and so I just need to insert them at this point. AssertScalar also uses raw sympy.Expr rather than SymInt/Bool, so it is easier to prevent unrestricted simplification at this point.

There are a few things jumbled together in this PR. I can split them if you want; some of the changes predate my change of strategy, but they're useful changes anyway.

**torch/_dynamo/output_graph.py** and **torch/_inductor/lowering.py** - Here, we stop putting deferred runtime asserts in the graph. I also have to make sure we don't DCE unused symbol arguments; we're going to get some goofy graph arguments this way, will be good to restore that optimization eventually. We also just disable codegen for `_assert_scalar`  entirely; we assume that ShapeEnv will be good enough to capture all of these.

**torch/_inductor/codegen/wrapper.py** and **torch/_inductor/ir.py** - Add a way to codegen sizevars without forcing simplification

**torch/_inductor/graph.py** - The main logic. Our strategy is to interpose in the same place we are testing that unbacked SymInts are properly showing up in lowered code. The logic is directly analogous to the logic in the existing insert deferred runtime asserts FX pass, but it's simpler because sympy expressions can be directly stored on inductor IR nodes.

**torch/fx/experimental/symbolic_shapes.py** - For extra safety, we have a way of freezing runtime asserts, so that if you try to add more we error. This prevents us from adding runtime asserts after we've done lowering. There's a funny interaction with backwards which there's a comment for in graph.py

**torch/fx/passes/runtime_assert.py** - This is not really needed in this PR, but I rewrote the runtime assert logic to use unbacked_bindings rather than inferring it by looking for unbacked SymInts. Now, keypaths are translated into FX node accessors. Unfortunately, I couldn't delete the old inference code, because you still need it to find backed SymInts from arguments (as this pass may be used on graphs which don't explicitly bind all their shape variables as arguments). There are some new tests exercising this.

TODO: I think we need to generate asserts for replacements too. This is a preexisting problem that the old FX pass had too.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/124874
Approved by: https://github.com/jansel
ghstack dependencies: #124864
2024-04-29 10:19:29 +00:00
Animesh Jain
0f139b04b3 [dynamo] Fix test (#125107)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/125107
Approved by: https://github.com/jansel
ghstack dependencies: #125097
2024-04-28 15:24:17 +00:00
Animesh Jain
e68d65dae2 [dynamo][cpp-guards] Differentiate dict guards wrt to guarding on key order (#124779)
We guard on key order
1) When a key is a non-constant object
2) When we actually need key order - like .values, .items etc

For dicts/OrderedDicts that do not require key order guarding, we just rely on the usual `GuardManager + DictGetItemGuardAccessor`. This is faster than going through the `list(d.keys())` based design for OrderedDicts.
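
The difference between the two kinds of guard can be sketched in plain Python. This is not the C++ guard machinery: `make_key_order_guard` and `make_per_key_guard` are hypothetical helpers that only mimic the distinction between checking `list(d.keys())` and checking individual entries.

```python
from collections import OrderedDict


def make_key_order_guard(d):
    """Guard that passes only if the keys appear in the same order."""
    expected = list(d.keys())

    def guard(candidate):
        return list(candidate.keys()) == expected

    return guard


def make_per_key_guard(d, used_keys):
    """Guard that checks only the entries that were actually read."""
    expected = {k: d[k] for k in used_keys}

    def guard(candidate):
        return all(k in candidate and candidate[k] == v
                   for k, v in expected.items())

    return guard


traced = OrderedDict(a=1, b=2)
reordered = OrderedDict(b=2, a=1)

order_guard = make_key_order_guard(traced)
value_guard = make_per_key_guard(traced, ["a"])

print(order_guard(traced), order_guard(reordered))
print(value_guard(traced), value_guard(reordered))
```

The per-key guard never materializes the key list and keeps passing when the dict is reordered, which is acceptable only when the traced code did not depend on iteration order.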

Pull Request resolved: https://github.com/pytorch/pytorch/pull/124779
Approved by: https://github.com/jansel
2024-04-25 08:20:35 +00:00
Edward Z. Yang
e0e2d897ed Handle Tensor returns in PropagateUnbackedSymInts (#124297)
This subsumes https://github.com/pytorch/pytorch/pull/124069

In the original PR, my idea was that when we run PropagateUnbackedSymInts, we check that the sizes before and after are exactly the same. This ended up turning up lots of bugs that I didn't feel like fixing. Separately, Ivan let me know that this pass was quite expensive in terms of compile time, since we spent a lot of time thinking about the equalities.

To kill two birds with one stone, we now only check for equality precisely when an unbacked SymInt was bound (thanks to the previous PR in this stack, we now have this information). Specifically, we look to see if `meta["unbacked_bindings"]` is set on the old node, and if it is, we assert the old value is equal to the new value from the repropagation. Note that the pytree key is used to actually extract the new value from the example value, as it may be nested inside an, e.g., tensor size.
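
As a rough illustration of extracting a value by key path, here is a toy sketch in plain Python. The path encoding (`("attr", name)` and `("item", index)` steps), the `key_get` helper and the symbol names are all assumptions made up for this example; the real pass uses pytree key paths stored in `meta["unbacked_bindings"]`.

```python
from types import SimpleNamespace


def key_get(value, path):
    """Walk a key path made of ("attr", name) and ("item", index) steps."""
    for kind, key in path:
        value = getattr(value, key) if kind == "attr" else value[key]
    return value


# Old node: symbol u0 was bound at output[1].shape[0] (hypothetical layout).
old_bindings = {"u0": (("item", 1), ("attr", "shape"), ("item", 0))}

# Repropagated example value: the same position now holds a fresh symbol u2.
new_example_value = (None, SimpleNamespace(shape=("u2", 3)))

# Only bound symbols are compared, instead of every size in the output.
equalities = [(old_sym, key_get(new_example_value, path))
              for old_sym, path in old_bindings.items()]
print(equalities)
```

Each resulting pair is an equality that would be asserted between the old and the repropagated value.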

We do something a bit naughty at the end: we use `defer_runtime_assert` to actually teach ShapeEnv about the equality. This is implementationally equivalent to what we used to do, but we're going to change this soon.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/124297
Approved by: https://github.com/lezcano
ghstack dependencies: #124290
2024-04-24 12:18:33 +00:00
Xuehai Pan
a6f044a490 [dynamo, 3.8-3.9] support dataclass with frozen=True in Python 3.8/3.9 (#124393)
Closes #114966

Frozen field assignment in `__init__` in Python 3.8-3.9:

f5bd65ed37/Lib/dataclasses.py (L402-L411)

```python
import builtins

BUILTINS = builtins

def _field_assign(frozen, name, value, self_name):
    # If we're a frozen class, then assign to our fields in __init__
    # via object.__setattr__.  Otherwise, just use a simple
    # assignment.
    #
    # self_name is what "self" is called in this function: don't
    # hard-code "self", since that might be a field name.
    if frozen:
        return f'BUILTINS.object.__setattr__({self_name},{name!r},{value})'
    return f'{self_name}.{name}={value}'
```

Frozen field assignment in `__init__` in Python 3.10+:

812245ecce/Lib/dataclasses.py (L436-L445)

```python
__dataclass_builtins_object__ = object

def _field_assign(frozen, name, value, self_name):
    # If we're a frozen class, then assign to our fields in __init__
    # via object.__setattr__.  Otherwise, just use a simple
    # assignment.
    #
    # self_name is what "self" is called in this function: don't
    # hard-code "self", since that might be a field name.
    if frozen:
        return f'__dataclass_builtins_object__.__setattr__({self_name},{name!r},{value})'
    return f'{self_name}.{name}={value}'
```
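
Both variants end up calling `object.__setattr__`, which is what lets the generated `__init__` populate fields even though a frozen dataclass rejects ordinary attribute assignment. A minimal standalone sketch of that behavior (the `Point` class is just an example, not code from the PR):

```python
import dataclasses


@dataclasses.dataclass(frozen=True)
class Point:
    x: float
    y: float


p = Point(1.0, 2.0)  # the generated __init__ assigns via object.__setattr__

try:
    p.x = 3.0  # goes through the frozen __setattr__ and is rejected
    blocked = False
except dataclasses.FrozenInstanceError:
    blocked = True

# Calling object.__setattr__ directly bypasses the frozen check,
# which is the same route the generated __init__ takes.
object.__setattr__(p, "x", 3.0)
print(blocked, p)
```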

Pull Request resolved: https://github.com/pytorch/pytorch/pull/124393
Approved by: https://github.com/jansel
2024-04-19 05:10:33 +00:00
Animesh Jain
f213f262af [dynamo][cpp-guards] Improve when to use Dict vs DictSubclassGuardManager (#124237)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124237
Approved by: https://github.com/jansel, https://github.com/mlazos
ghstack dependencies: #124230
2024-04-18 03:33:37 +00:00
William Wen
dca24d70ba [dynamo, test] remove skip for unhandled exception test (#123876)
This test might no longer segfault in CI due to changes to how we allocate and free shadow frames in dynamo.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/123876
Approved by: https://github.com/jansel
2024-04-18 03:02:34 +00:00
William Wen
812bae09be [dynamo] fix 3.11+ refleak (#124238)
Fixes https://github.com/pytorch/pytorch/issues/119607 for 3.11+.

In 3.11+, `_PyFrame_FastToLocalsWithError` could implicitly run `COPY_FREE_VARS` on the original frame, leading to double increfs since the dynamo shadow frame can rerun `COPY_FREE_VARS`. So the solution is to skip the first `COPY_FREE_VARS` instruction in the shadow frame if it was already executed in the original frame.
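
For context, `COPY_FREE_VARS` is the instruction CPython 3.11+ emits at the start of a function that closes over variables. The following sketch uses the standard `dis` module to check whether it is present; `make_counter` is just an example closure and has nothing to do with Dynamo's shadow frames.

```python
import dis
import sys


def make_counter():
    count = 0

    def counter():
        return count + 1  # `count` is a free variable of `counter`

    return counter


opnames = [ins.opname for ins in dis.get_instructions(make_counter())]
has_copy_free_vars = "COPY_FREE_VARS" in opnames
print(sys.version_info[:2], has_copy_free_vars)
```

Because the instruction copies (and increfs) the closure cells into the frame, running it twice for the same frame state is what produces the extra references described above.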

Also move the location for clearing the original frame in 3.12 to handle error cases more thoroughly.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/124238
Approved by: https://github.com/jansel
2024-04-18 03:02:29 +00:00