pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-06 12:20:52 +01:00

Author	SHA1	Message	Date
Ryan Guo	4f75f1e80d	[dynamo] Use proper item source for `NamedTupleVariable` (#142437 ) Dynamo was generating `GetItemSource(tuple_source, index)` for items of `NamedTupleVariable`, but that stops working when a user-supplied named tuple has a custom `__getitem__` function with different semantics. This patch - fixes the aforementioned issue by using `AttrSource` instead. - handles named tuple outside `wrap_listlike`, by removing the special case of named tuple in `BaseListVariable.cls_for_instance`, since the semantics of named tuple are different enough. - makes sure all constructions of `NamedTupleVariable` have items with proper sources. Fixes #142399. Pull Request resolved: https://github.com/pytorch/pytorch/pull/142437 Approved by: https://github.com/jansel	2024-12-10 19:23:48 +00:00
Ryan Guo	a45326b649	[dynamo] Support multiple inheritance for custom dict construction (#142416 ) This patch applies a local and practical workaround for custom dict construction when multiple inheritance is involved. Handling multiple inheritance in general could be a lot more involved, so I created #142414 to track that. Fixes #141118. Pull Request resolved: https://github.com/pytorch/pytorch/pull/142416 Approved by: https://github.com/jansel	2024-12-10 19:22:15 +00:00
Xuehai Pan	0bd7b7ae58	Add version check for C++ pytree availability (#142299 ) Resolves #142256 Pull Request resolved: https://github.com/pytorch/pytorch/pull/142299 Approved by: https://github.com/jansel, https://github.com/weifengpy	2024-12-08 06:27:32 +00:00
Ryan Guo	aab0f32ea4	[dynamo] Properly handle `!=` under user-defined `__eq__` (#142078 ) Previously Dynamo modelled `object.__ne__` as just comparison over value identity; however, in CPython the default `!=` dispatches to `__eq__`, which might've been overridden by user. This patch fixes the behavior divergence. Fixes #142055. Pull Request resolved: https://github.com/pytorch/pytorch/pull/142078 Approved by: https://github.com/jansel, https://github.com/zou3519	2024-12-06 08:06:53 +00:00
Yuanhao Ji	3baf8859e6	[Dynamo] Replace `torch._dynamo.optimize()` with `torch.compile()` [4/N] (#140253 ) related commits: - #139706 - #140238 - #140247 - #140253 - #140663 - #140688 Pull Request resolved: https://github.com/pytorch/pytorch/pull/140253 Approved by: https://github.com/soulitzer	2024-12-05 00:30:00 +00:00
Bob Ren	9286c21b22	Fix fbcode tests for automatic dynamic unspecialize float (#141975 ) Differential Revision: D66708552 Pull Request resolved: https://github.com/pytorch/pytorch/pull/141975 Approved by: https://github.com/bdhirsh, https://github.com/atalman	2024-12-03 23:59:06 +00:00
Xuehai Pan	78543e6002	[dynamo][pytree][1/N] make CXX pytree traceable: `tree_iter` / `tree_leaves` (#137397 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/137397 Approved by: https://github.com/jansel	2024-12-03 11:17:39 +00:00
Ryan Guo	0efd184685	[dynamo] Fix side effects for range iterator that escapes the graph (#141716 ) `wrap_range_iterator` mistakenly used `ValueMutationNew`, when it should've used `ValueMutationExisting`, because this code path always has a source. Pull Request resolved: https://github.com/pytorch/pytorch/pull/141716 Approved by: https://github.com/jansel ghstack dependencies: #141713, #141714, #141715, #141902	2024-12-03 09:18:06 +00:00
Ryan Guo	7c3c8a662e	[dynamo] Add `RANGE_ITERATOR_MATCH` to properly guard on range iterators (#141902 ) A subsequent patch attempts to fix a side-effect issue for range iterators, which in turn exposed an existing issue on guards for range iterators -- the following test started failing: ``` PYTORCH_TEST_WITH_DYNAMO=1 python test/test_tensor_creation_ops.py TestTensorCreationCPU.test_hstack_column_stack_cpu_int16 ``` This patch adds a `RANGE_ITERATOR_MATCH` guard to make sure that we properly guard on range iterators, and adds a regression test. Pull Request resolved: https://github.com/pytorch/pytorch/pull/141902 Approved by: https://github.com/jansel ghstack dependencies: #141713, #141714, #141715	2024-12-03 09:18:06 +00:00
Ryan Guo	ff3f4a164c	[dynamo] Fix aliasing issue for `dict.copy` that escapes the graph (#141715 ) Dynamo accidentally passed the original `ConstDictVariable.source` to the result of `dict.copy(...)`, which caused an aliasing issue when the result escapes the graph (e.g., is a return value). This patch fixes that and adds a regression test. Pull Request resolved: https://github.com/pytorch/pytorch/pull/141715 Approved by: https://github.com/jansel ghstack dependencies: #141713, #141714	2024-12-03 09:18:06 +00:00
Ryan Guo	9eb0520d75	[dynamo] Fix side-effect handling for pre-existing `collections.deque` (#141714 ) Previously we never replayed side effects to `DequeVariable` with a source; the bug was already in the `test_deque_input` test, but went unnoticed because we didn't check the deque objects. This patch adds limited but practical support for this (see comments in `side_effects.py` for why limited), and updates the deque tests to check for this. Pull Request resolved: https://github.com/pytorch/pytorch/pull/141714 Approved by: https://github.com/jansel ghstack dependencies: #141713	2024-12-03 09:18:06 +00:00
Ryan Guo	e14d8c980f	[dynamo][NFC] Rename `NewCellVariable` to `CellVariable` (#141628 ) It was named `NewCellVariable` because we originally used it to represent cells by the code Dynamo is tracing through. However, now we use it to represent pre-existing cells as well, so this patch renames it to avoid confusion. Pull Request resolved: https://github.com/pytorch/pytorch/pull/141628 Approved by: https://github.com/williamwen42, https://github.com/jansel	2024-12-02 19:09:30 +00:00
PyTorch MergeBot	9012e7a62f	Revert "[dynamo][pytree][1/N] make CXX pytree traceable: `tree_iter` / `tree_leaves` (#137397 )" This reverts commit `07850bb2c1`. Reverted https://github.com/pytorch/pytorch/pull/137397 on behalf of https://github.com/atalman due to Failing internal test ([comment](https://github.com/pytorch/pytorch/pull/137397#issuecomment-2511934283))	2024-12-02 16:05:14 +00:00
Bob Ren	2f72635a5c	automatic dynamic unspecialize float (#141647 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/141647 Approved by: https://github.com/ezyang	2024-11-29 22:36:53 +00:00
PyTorch MergeBot	9e98b3d73c	Revert "automatic dynamic unspecialize float (#141647 )" This reverts commit `1a32daeb17`. Reverted https://github.com/pytorch/pytorch/pull/141647 on behalf of https://github.com/atalman due to functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_inner_grad [GH job link](https://github.com/pytorch/pytorch/actions/runs/12080983316/job/33697901875) [HUD commit link](`1a32daeb17`) ([comment](https://github.com/pytorch/pytorch/pull/141647#issuecomment-2507980876))	2024-11-29 15:00:33 +00:00
Bob Ren	1a32daeb17	automatic dynamic unspecialize float (#141647 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/141647 Approved by: https://github.com/ezyang	2024-11-29 07:53:53 +00:00
Ryan Guo	3141e038f0	[dynamo] Fix `VariableBuilder._wrap` on frozenset and enforce invariants on `ConstantVariable` (#141504 ) Prior to this patch, we were using `ConstantVariable.create` to create VT for frozenset objects, and intended yet failed to predicate that on all items being literals (see https://github.com/pytorch/pytorch/pull/140984#discussion_r1847393736). The code was from https://github.com/pytorch/torchdynamo/commit/7c03434 and the original goal was to help DBR quantization, but as the new test in this patch shows, it could lead to silent incorrectness. Upon a closer look, this exposes some subtleties in how Dynamo handles `ConstantVariable` and `LOAD_CONST`, so this patch both fixes the aforementioned issue and documents, enforces, and makes explicit the invariants around `ConstantVariable` and `LOAD_CONST` -- only immutable objects are supported. Specifically, this patch: 1. Refines the checks for wrapping a `frozenset` object, documents why we can't just wrap its items directly due to lack of `Source` for set items, and uses a safe workaround (`SourcelessBuilder`) to ensure soundness while keeping the DBR quantization support. 2. Adds more types to `common_constant_types`, thereby making `ConstantVariable.is_base_literal` more lenient, and strictly checks this property in the constructor of `ConstantVariable`. 3. Changes relevant uses of `create_instruction("LOAD_CONST", ...)` to `create_load_const` which checks `is_safe_constant`, and makes developer overrides explicit by using `create_load_const_unchecked` when needed. 4. In a few places, uses more specific `VariableTracker`, e.g., `TypingVariable` rather than `ConstantVariable`, and `FrozensetVariable` rather than `SetVariable`. (2) and (3) are mainly to future-proof Dynamo against bugs like (1). Pull Request resolved: https://github.com/pytorch/pytorch/pull/141504 Approved by: https://github.com/jansel	2024-11-27 21:58:35 +00:00
Xuehai Pan	07850bb2c1	[dynamo][pytree][1/N] make CXX pytree traceable: `tree_iter` / `tree_leaves` (#137397 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/137397 Approved by: https://github.com/jansel ghstack dependencies: #141360	2024-11-27 00:21:58 +00:00
Isuru Fernando	44186a0a4e	Move Sympy printers to torch/utils/_sympy/printers.py (#140597 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/140597 Approved by: https://github.com/ezyang, https://github.com/anijain2305	2024-11-26 18:11:00 +00:00
Ryan Guo	99a0e2b1a1	[dynamo] Trace through `dataclasses` by removing it from `BUILTIN_SKIPLIST` (#141294 ) Fixes #141261. Pull Request resolved: https://github.com/pytorch/pytorch/pull/141294 Approved by: https://github.com/williamwen42, https://github.com/jansel	2024-11-26 17:05:23 +00:00
Yanbo Liang	dcd16bdc21	[Dynamo][autograd.Function] Use fake tensor prop to infer fwd output (#136184 ) Fixes #129963 Pull Request resolved: https://github.com/pytorch/pytorch/pull/136184 Approved by: https://github.com/zou3519	2024-11-26 01:10:08 +00:00
Ryan Guo	583484b726	[dynamo] Fix and simplify handling of `Set.update` method (#141286 ) The old implementation of `SetVariable.call_method("update", ...)` was incorrect because it wouldn't handle iterable inputs. This patch removes the input type restriction altogether, and implements the method as a polyfill (like how most of the other set methods are handled). Fixes #141283. Pull Request resolved: https://github.com/pytorch/pytorch/pull/141286 Approved by: https://github.com/anijain2305	2024-11-26 00:41:50 +00:00
PyTorch MergeBot	ad37afd590	Revert "Always unspecialize float in OSS (#138922 )" This reverts commit `ba5253da9b`. Reverted https://github.com/pytorch/pytorch/pull/138922 on behalf of https://github.com/yf225 due to perf regression on torchbench ([comment](https://github.com/pytorch/pytorch/pull/138922#issuecomment-2499277511))	2024-11-26 00:03:03 +00:00
Bob Ren	ba5253da9b	Always unspecialize float in OSS (#138922 ) Fixes https://github.com/pytorch/pytorch/issues/107277 Pull Request resolved: https://github.com/pytorch/pytorch/pull/138922 Approved by: https://github.com/ezyang Co-authored-by: Edward Z. Yang <ezyang@meta.com>	2024-11-24 01:58:13 +00:00
Jason Ansel	83116ec90c	[dynamo] Fix fbcode flakey test from asyncio warning (#141399 ) Summary: This was failing with a `/usr/local/fbcode/platform010/lib/python3.10/asyncio/events.py:666: DeprecationWarning` that seems unrelated. Test Plan: ``` buck2 test 'fbcode//mode/opt' fbcode//caffe2/test/dynamo:test_dynamo -- --exact 'caffe2/test/dynamo:test_dynamo - test_misc.py::InlineInbuiltNNModulesMiscTests::test_numpy_readonly_inline_inbuilt_nn_modules' --run-disabled ``` Differential Revision: D66394773 Pull Request resolved: https://github.com/pytorch/pytorch/pull/141399 Approved by: https://github.com/yanboliang	2024-11-23 18:16:50 +00:00
PyTorch MergeBot	a8c90e5140	Revert "Always unspecialize float in OSS (#138922 )" This reverts commit `6d779d0549`. Reverted https://github.com/pytorch/pytorch/pull/138922 on behalf of https://github.com/huydhn due to Sorry for reverting your change but there is some slow tests failing after this land ([comment](https://github.com/pytorch/pytorch/pull/138922#issuecomment-2495076878))	2024-11-22 23:18:36 +00:00
Bob Ren	6d779d0549	Always unspecialize float in OSS (#138922 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/138922 Approved by: https://github.com/ezyang Co-authored-by: Edward Z. Yang <ezyang@meta.com>	2024-11-22 17:54:42 +00:00
PyTorch MergeBot	f23621ec56	Revert "Move Sympy printers to torch/utils/_sympy/printers.py (#140597 )" This reverts commit `c25b201583`. Reverted https://github.com/pytorch/pytorch/pull/140597 on behalf of https://github.com/huydhn due to Trunk is sad again after this lands, this looks like a landrace this time, so please do a rebase ([comment](https://github.com/pytorch/pytorch/pull/140597#issuecomment-2494052978))	2024-11-22 15:43:39 +00:00
Isuru Fernando	c25b201583	Move Sympy printers to torch/utils/_sympy/printers.py (#140597 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/140597 Approved by: https://github.com/ezyang, https://github.com/anijain2305	2024-11-22 02:04:36 +00:00
Animesh Jain	fa63276691	[user empathy day][dynamo] Support get on subclassed dicts (#141214 ) Fixes https://github.com/pytorch/pytorch/issues/141138 but we need to do a more exhaustive job of going through dict methods and checking each one of them. Pull Request resolved: https://github.com/pytorch/pytorch/pull/141214 Approved by: https://github.com/Skylion007, https://github.com/jansel ghstack dependencies: #141209	2024-11-21 21:18:42 +00:00
PyTorch MergeBot	701e06b643	Revert "Move Sympy printers to torch/utils/_sympy/printers.py (#140597 )" This reverts commit `aefcdb3c9f`. Reverted https://github.com/pytorch/pytorch/pull/140597 on behalf of https://github.com/huydhn due to Sorry for reverting your change but I think it fails inductor/test_padding in trunk. This is a target determination miss and that failed test was not run in your PR ([comment](https://github.com/pytorch/pytorch/pull/140597#issuecomment-2489641453))	2024-11-20 22:13:57 +00:00
Isuru Fernando	aefcdb3c9f	Move Sympy printers to torch/utils/_sympy/printers.py (#140597 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/140597 Approved by: https://github.com/ezyang, https://github.com/anijain2305	2024-11-20 20:26:49 +00:00
Animesh Jain	f4ce9ac29d	[dynamo] Dont erase the cache line on invalidation (#140821 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/140821 Approved by: https://github.com/jansel	2024-11-19 19:11:10 +00:00
Ryan Guo	ac6684ebbc	[dynamo] Identify pre-existing captured cells by cell id rather than content id (#140436 ) In `match_nested_cell`, Dynamo tried to identify pre-existing captured cells by `(cell_name, id(cell_contents))`. This works in most cases, but as the test added in this patch shows, it's not a complete solution. This patch 1. changes `match_nested_cell` to `lookup_variable_for_captured_cell`, and does the lookup based on id of cell objects, not their contents. This requires plumbing a tuple of captured cell objects from different CPython versions all the way to `InstructionTranslator.__init__`, where we store a mapping from the ids of these cell objects, and use it later in `UserFunctionVariable.bind_args` to look for these unboxed cells. 2. builds off (1) -- rather than using a `VariableTracker` that represents the content of the unboxed cells, use `ClosureVariable`, which enables codegen in case these cells escape as closure of a `NestedUserFunctionVariable`. The patch adds a regression test for each of the scenarios above: 1. `test_write_to_cells_with_name_shadowing` where Dynamo mistakenly thought the program is writing to a cell captured by root frame (which it doesn't support atm), which resulted in ``` File "/Users/ryanguo99/Documents/work/pytorch/torch/_dynamo/symbolic_convert.py", line 3340, in STORE_DEREF unimplemented("write to __closure__ while inlining") File "/Users/ryanguo99/Documents/work/pytorch/torch/_dynamo/exc.py", line 313, in unimplemented raise Unsupported(msg, case_name=case_name) torch._dynamo.exc.Unsupported: write to __closure__ while inlining ``` 2. `test_existing_func_that_creates_capturing_nested_func` where Dynamo ended up trying to codegen a `NestedUserFunctionVariable` that captures a cell which was also captured by the root frame, so it was unboxed and ends up emitting `LOAD_DEREF` rather than `LOAD_FAST/LOAD_CLOSURE` during codegen, resulting in ``` File "/Users/ryanguo99/Documents/work/pytorch/torch/_dynamo/variables/functions.py", line 105, in _create_nested_fn func = FunctionType(code, f_globals, name, defaults, closure) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ TypeError: arg 5 (closure) expected cell, found int ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/140436 Approved by: https://github.com/jansel, https://github.com/williamwen42 ghstack dependencies: #140330, #140152	2024-11-15 17:17:30 +00:00
Yuanhao Ji	8a80cee2f3	[Dynamo] Replace `torch._dynamo.optimize()` with `torch.compile()` [3/N] (#140247 ) related commits: - #139706 - #140238 - #140247 - #140253 Pull Request resolved: https://github.com/pytorch/pytorch/pull/140247 Approved by: https://github.com/soulitzer	2024-11-13 05:51:42 +00:00
Ryan Guo	d34d5ccec5	[dynamo] Fix some corner cases for modeling pre-existing cells (#140150 ) In `UserFunctionVariable.bind_args`, there's a rare case when the underlying function satisfies all conditions below 1. The function captures a pre-existing cell 2. The cell isn't captured by root frame 3. `UserFunctionVariable.source` is `None` In such cases, Dynamo would model the cell as its content (just like what we do for cells in the root frame). However, this could break in two cases: - We could have multiple instances of `UserFunctionVariable`, where some have source and others don't. This means sometimes we'll model the cell as a `NewCellVariable`, and sometimes as its content. This causes issues because writes to the `NewCellVariable` would be buffered in `SideEffects` and never get picked up by the other modeling. - Only when `UserFunctionVariable` has a source, do we check whether we already had a `NewCellVariable` for the captured cell. This again causes Dynamo to potentially have multiple representations for the same cell object, resulting in a similar "buffered writes not reflected" issue as above. This patch fixes the above 2 issues by 1. modeling captured cells of sourceless `UserFunctionVariable` as immutable `NewCellVariable`, and adds a few lines in `SideEffects` to account for its immutability. 2. always checking whether we already had a `NewCellVariable` for the captured cell, before constructing a new one. Tests are added for each aforementioned case. I also left a TODO to investigate why exactly we would lose source information for `UserFunctionVariable`. Some cases are easily fixable, but others not so much. Pull Request resolved: https://github.com/pytorch/pytorch/pull/140150 Approved by: https://github.com/jansel ghstack dependencies: #140035, #140036, #140149	2024-11-13 03:14:23 +00:00
Ryan Guo	6a821c9e6a	[dynamo] Remove cell unboxing/restart optimization (#140149 ) We added an unboxing optimization to avoid writes to cells that existed before Dynamo tracing (such writes interfere with HOPs). However, the avoided write shouldn't be there in the first place, since we were basically creating an empty `NewCellVariable`, and then writing the pre-existing content into the variable. This patch 1. adds logic to bypass the initial write for pre-existing cells without undermining correctness. 2. removes the unboxing optimization and the restart code path. Fixes #137456, #138491; also see those issues for more historical context. Pull Request resolved: https://github.com/pytorch/pytorch/pull/140149 Approved by: https://github.com/ezyang, https://github.com/jansel ghstack dependencies: #140035, #140036	2024-11-13 03:14:23 +00:00
Ryan Guo	698ff07323	[dynamo] Fix name collision bug for captured cells and locals (#140036 ) The `export_freevars` method was introduced very early on, for propagating writes to unboxed cells from child to parent frame, see https://github.com/pytorch/torchdynamo/commit/d0c10341. However, it's no longer needed after we started to modify root tracer's `symbolic_locals` directly for the unboxed cells, see https://github.com/pytorch/torchdynamo/commit/663e4d92. As a result, we no longer need `export_freevars`. In fact, it can cause a very subtle bug when name collision happens across the parent and child frames during inlining, because the parent frame isn't necessarily the frame that defined the cell captured by child frame. In summary, this patch removes the `export_freevars` bits, and adds a regression test. Pull Request resolved: https://github.com/pytorch/pytorch/pull/140036 Approved by: https://github.com/williamwen42, https://github.com/jansel ghstack dependencies: #140035	2024-11-13 03:14:23 +00:00
Bob Ren	4488e23763	Fix another item memo loss location + bool specialization bug (#139587 ) This fix was a bit more involved: 1) It fixes an item_memo loss location. 2) It updates a test to be eager instead of aot_eager since it reveals a very obscure bug related to replacements that's not worth solving since in practice inductor will regenerate the runtime asserts anyways. 3) It updates tensorify to specialize more places now that the aforementioned bug is fixed. Fixes `PYTORCH_OPINFO_SAMPLE_INPUT_INDEX=6 python test/inductor/test_torchinductor_opinfo.py TestInductorOpInfoCPU.test_comprehensive_linalg_norm_cpu_float16` when `specialize_float=False` while ensuring `python test/dynamo/test_dynamic_shapes.py DynamicShapesMiscTests.test_runtime_assert_replacement_dynamic_shapes` doesn't regress. Pull Request resolved: https://github.com/pytorch/pytorch/pull/139587 Approved by: https://github.com/ezyang ghstack dependencies: #139569, #139457, #139568, #139572, #139846, #139454, #139896, #139935	2024-11-09 03:11:19 +00:00
Michael Lazos	ea0f60ecfa	[Dynamo] allow dynamic callables on tensor variables (#137940 ) Fixes https://github.com/pytorch/pytorch/issues/134844 Pull Request resolved: https://github.com/pytorch/pytorch/pull/137940 Approved by: https://github.com/williamwen42	2024-11-08 23:49:34 +00:00
Animesh Jain	738bfff5f9	[dynamo][user-defined] Fix bugs with method descriptors (#139856 ) Should fix some problems in https://github.com/pytorch/pytorch/pull/138080 Pull Request resolved: https://github.com/pytorch/pytorch/pull/139856 Approved by: https://github.com/jansel	2024-11-06 23:16:40 +00:00
Michael Lazos	d622b490d6	[Dynamo] Support tensor mro without source (#139838 ) Fixes https://github.com/pytorch/pytorch/issues/137743 The issue here is that if `type` was called on a tensor without a source, we wouldn't have a source even for `torch.Tensor`, and the `__mro__` retrieval would fail. Since `torch.Tensor` is an internal torch type, I add handling for it in `call_type` in builtins. Pull Request resolved: https://github.com/pytorch/pytorch/pull/139838 Approved by: https://github.com/williamwen42	2024-11-06 08:52:53 +00:00
Laith Sakka	a787320d0f	Do not try to optimize new implications in get_implications (#139738 ) Summary: save around 8% on the torchrec model. In most cases the new implications cannot be optimized anyway; in some cases they can, but optimizing them is useless. ex: ``` generating implications for Eq(Mod(s0, 3), 0) adding Eq(Mod(s0, 3), 0) adding Eq(0, Mod(s0, 3)) adding Ne(Mod(s0, 3), 0) adding Ne(0, Mod(s0, 3)) adding Mod(s0, 3) <= 0 adding 0 < Mod(s0, 3) adding True adding False ``` VS ``` generating implications for Eq(Mod(s0, 3), 0) adding Eq(Mod(s0, 3), 0) adding Eq(0, Mod(s0, 3)) adding Ne(Mod(s0, 3), 0) adding Ne(0, Mod(s0, 3)) adding Mod(s0, 3) <= 0 adding 0 < Mod(s0, 3) adding 0 <= Mod(s0, 3) adding Mod(s0, 3) < 0 ``` the main difference is that 0 <= Mod(s0, 3) can be simplified to True and Mod(s0, 3) < 0 to False but with this change this won't happen. but True:True and False: False are useless anyway lol. so it's ok i think ``` buck2 run fbcode//mode/opt fbcode//torchrec/distributed/tests:pt2_compile_benchmark -- --num-features=1000 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/139738 Approved by: https://github.com/ezyang ghstack dependencies: #139703	2024-11-06 00:23:40 +00:00
PyTorch MergeBot	b6b9596607	Revert "[dynamo] Fix constant propagation in builtins and UserClasses (#131354 )" This reverts commit `44257c063e`. Reverted https://github.com/pytorch/pytorch/pull/131354 on behalf of https://github.com/huydhn due to Sorry for reverting your change, but it seems to break some internal tests ([comment](https://github.com/pytorch/pytorch/pull/131354#issuecomment-2451050605))	2024-11-01 00:13:20 +00:00
Laith Sakka	6a1c451479	Don't uselessly recompute axiom dict every static eval call (#138967 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/138967 Approved by: https://github.com/ezyang	2024-10-31 21:16:55 +00:00
PyTorch MergeBot	87f1990697	Revert "Don't uselessly recompute axiom dict every static eval call (#138967 )" This reverts commit `24b695ae2d`. Reverted https://github.com/pytorch/pytorch/pull/138967 on behalf of https://github.com/ZainRizvi due to Sorry, looks like this PR introduced a failure that was incorrectly classified as flaky, and the log classifier didn't identify the right log line either ([comment](https://github.com/pytorch/pytorch/pull/138967#issuecomment-2450228525))	2024-10-31 15:54:18 +00:00
Laith Sakka	24b695ae2d	Don't uselessly recompute axiom dict every static eval call (#138967 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/138967 Approved by: https://github.com/ezyang	2024-10-31 07:46:35 +00:00
Tom Ritchford	44257c063e	[dynamo] Fix constant propagation in builtins and UserClasses (#131354 ) * Fixes https://github.com/pytorch/pytorch/issues/118675 * Replaces https://github.com/pytorch/pytorch/pull/118994 Pull Request resolved: https://github.com/pytorch/pytorch/pull/131354 Approved by: https://github.com/jansel, https://github.com/anijain2305	2024-10-30 12:47:20 +00:00
Xuehai Pan	9bbe4a67ad	[dynamo] support `maxlen` for `collections.deque` (#138194 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/138194 Approved by: https://github.com/jansel, https://github.com/malfet	2024-10-30 10:08:02 +00:00
Simon Fan	99608ceed6	Scoped extension building for C++ backed custom ops tests (#136695 ) FIXES #125579 #131103 #133197 #133283 #134738 #135369 #135685 Tests that create C++ extensions can cause flakiness in CI due to library namespace conflict and test ordering. We can build them in temp dirs to ensure isolation. An alternative is to build these as part of the build process and have build time errors. Pull Request resolved: https://github.com/pytorch/pytorch/pull/136695 Approved by: https://github.com/zou3519	2024-10-26 07:41:00 +00:00
Ryan Guo	f14247d5aa	[dynamo] Accurately identify mutated cells captured by multiple functions (#138632 ) This patch changes `mutated_closure_cell_contents: Set[str]` to `mutated_closure_cell_ids: Set[int]` so that Dynamo can more accurately identify closure cells across different instances of `UserFunctionVariable`. This prevents Dynamo from mistakenly treating a cell as immutable, even though it will be mutated when referenced as a closure cell from another function. More context in https://github.com/pytorch/pytorch/issues/138112#issuecomment-2420580779. Fixes #138112. Pull Request resolved: https://github.com/pytorch/pytorch/pull/138632 Approved by: https://github.com/jansel ghstack dependencies: #138639	2024-10-26 02:17:07 +00:00
Ryan Guo	0a4197490c	Delay mul/pow expansion for `_SympyT` to enable more folding (#138235 ) Instead of calling `safe_expand` right after symbolic expression construction, we invoke it in `ShapeEnv.simplify`. This enables more simplification with product form, e.g., ``` (a + b)^2 / (a + b) --> (a + b) ``` which won't happen if we expand eagerly during product construction: ``` (a^2 + 2ab + b^2) / (a + b) --> no change ``` Fixes #136044. Pull Request resolved: https://github.com/pytorch/pytorch/pull/138235 Approved by: https://github.com/ezyang	2024-10-21 16:38:47 +00:00
Isuru Fernando	4f45a052ad	Fix try_solve for s1*s2 == 0 when both symbols are unknown (#137919 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/137919 Approved by: https://github.com/ezyang	2024-10-20 23:33:08 +00:00
Ryan Guo	59158f640c	[dynamo] Support equality comparison between Tensor and `None` (#138289 ) This patch updates the `wrap_fx_proxy_cls` function to allow boolean output when the operation is one of `supported_const_comparison_op_values`. Fixes #120907. Pull Request resolved: https://github.com/pytorch/pytorch/pull/138289 Approved by: https://github.com/williamwen42	2024-10-18 17:49:26 +00:00
Adnan Akhundov	809ff3b274	Add host-side Triton TMA support to Dynamo (#137677 ) This adds Dynamo tracing support for the host-side Triton TMA API (see `create_2d_tma_descriptor` calls on the host in the [Triton tutorial](https://triton-lang.org/main/getting-started/tutorials/09-persistent-matmul.html#sphx-glr-getting-started-tutorials-09-persistent-matmul-py)). A few notes: - Here we assume the availability of the host-side TMA API added to upstream Triton in https://github.com/triton-lang/triton/pull/4498. As of time of writing, this is not a part of the PT2 OSS Triton pin (although back-ported internally). OSS Triton pin update should be done in December 2024. - To capture the chain of calls `t.data_ptr() --> create_{1d,2d}_tma_descriptor(ptr, ...) --> kernel[grid](tma_desc, ...)`, we add three new variable trackers: `DataPtrVariable`, `CreateTMADescriptorVariable` (for the function), `TMADescriptorVariable` (for TMA descriptor object). This is to maintain the path back from the Triton kernel to the Tensor from which the TMA descriptor has been created. - The newly introduced variables have `reconstruct` methods used in case of graph breaks. - The `tma_descriptor_metadata` extracted from the captured `create_{1d,2d}_tma_descriptor` calls is propagated through the HOPs in Dynamo and AOTAutograd to be used by the downstream compiler (e.g., Inductor). See the unit tests for what the captured HOP arguments look like. - In the Dynamo-captured fx graph, we replace the TMA descriptor arguments of the Triton kernel by the underlying Tensors, to be able to track the input/output relationships in terms of Tensors. - In the Triton kernel mutation analysis pass (in AOTAutograd), we use the `tt.experimental_descriptor_store` TTIR op to detect mutations of the underlying tensors via TMA descriptors, so that downstream AOTAutograd can perform functionalization as required. - JIT Inductor and AOT Inductor support will be implemented in follow-up PRs. Differential Revision: [D64404928](https://our.internmc.facebook.com/intern/diff/D64404928) Pull Request resolved: https://github.com/pytorch/pytorch/pull/137677 Approved by: https://github.com/zou3519	2024-10-16 02:18:48 +00:00
Xuehai Pan	1d6932937e	[dynamo] fix `NamedTupleVariable` for PyStructSequence (`torch.return_types.`) support (#137776 ) PyStructSequence is the C API equivalent for `collections.namedtuple` in Python. But they have different constructors: ```python tuple = NamedTupleType(*args) tuple = NamedTupleType._make(args) tuple = StructSequenceType(args) ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/137776 Approved by: https://github.com/jansel	2024-10-13 06:46:41 +00:00
William Wen	93bbc8abcc	[dynamo, 3.13] use 3.13 multiline traceback in get_instruction_source_311 (#137617 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/137617 Approved by: https://github.com/jansel	2024-10-10 20:19:27 +00:00
Ryan Guo	dd7c2899bd	[dynamo] Properly prune dead cell local variables (#136891 ) This patch updates the `prune_dead_locals` logic to do slightly more aggressive pruning for cell local variables, in absence of side-effects, e.g., a cell variable can be pruned when its user function(s) will never be used again. See added tests for examples; note that a few tests in `test/dynamo/test_higher_order_ops.py` also got updated because we are no longer returning the unnecessary graph output. Fixes #127350, #124653 Pull Request resolved: https://github.com/pytorch/pytorch/pull/136891 Approved by: https://github.com/jansel, https://github.com/anijain2305, https://github.com/williamwen42, https://github.com/zou3519	2024-10-10 18:21:24 +00:00
Ryan Guo	394c143e4e	[dynamo] Fix error when inlining certain nested closure returned by another function (#137510 ) See `test_inline_closure_returned_by_another_function_and_captures` and #136814 for more context. In #90286, we introduced an optimization so that for captured cells that are unmodified during a Dynamo trace, `UserFunctionVariable` will represent them as a variable of the cell's actual value, rather than a `NewCellVariable`. Later on we introduced more mechanisms to model such cells across function calls (#104222), and across function calls where `NestedUserFunctionVariable::bind_args` needs to look up further in the parent frames (#106491) to find these cells' values. This patch removes `InlinedClosureVariable` in favor of a simpler modelling, which is also more consistent with what was introduced in #90286, i.e., just model these cells as their contents, in `symbolic_locals`. This fixes #136814 because resolution of `InlinedClosureVariable` to the underlying cell content value happens in `NestedUserFunctionVariable::bind_args`, which requires Dynamo to have the value in scope at the function call site (when Dynamo does inlining), but that's not always the case (as the test case shows). However, if we model the cells in `symbolic_locals`, we never need such resolution, and the values are directly stored into the `NestedUserFunctionVariable::closure` upon the function creation, at which point Dynamo always has the cell value in `symbolic_locals` for lookup. Fixes #136814. Pull Request resolved: https://github.com/pytorch/pytorch/pull/137510 Approved by: https://github.com/williamwen42	2024-10-09 18:13:57 +00:00
William Wen	a6707a7303	[dynamo] log all graph breaks to graph_breaks logging artifact (#137244 ) We were previously not logging all graph breaks (e.g. data dependent jumps) to the graph_breaks logging artifact. Pull Request resolved: https://github.com/pytorch/pytorch/pull/137244 Approved by: https://github.com/jansel	2024-10-07 22:34:27 +00:00
PyTorch MergeBot	af64c44b56	Revert "Don't uselessly recompute axiom dict every static eval call (#135429 )" This reverts commit `1d6e0412f5`. Reverted https://github.com/pytorch/pytorch/pull/135429 on behalf of https://github.com/ezyang due to try again ([comment](https://github.com/pytorch/pytorch/pull/135429#issuecomment-2384288879))	2024-09-30 22:29:13 +00:00
Edward Z. Yang	9dbc6bacff	Propagate detailed location information of shape guards to guards/recompiles output (#136917 ) To see the payoff, look at test/dynamo/test_logging.py. The general idea is to refactor produce_guards into produce_guards_verbose, which also returns verbose code parts that carry our annotations. The rest of the logic is plumbing around SLocs to the places they need to be so we can print them. Guards are easy; value ranges and duck sizing take more care. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/136917 Approved by: https://github.com/anijain2305	2024-09-30 00:43:12 +00:00
Edward Z. Yang	1d6e0412f5	Don't uselessly recompute axiom dict every static eval call (#135429 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/135429 Approved by: https://github.com/isuruf	2024-09-28 20:59:59 +00:00
PyTorch MergeBot	e5228a7771	Revert "Don't uselessly recompute axiom dict every static eval call (#135429 )" This reverts commit `507c69e20f`. Reverted https://github.com/pytorch/pytorch/pull/135429 on behalf of https://github.com/malfet due to It (or its parent) broke trunk CI, see `507c69e20f` ([comment](https://github.com/pytorch/pytorch/pull/135429#issuecomment-2379422971))	2024-09-27 14:33:25 +00:00
Edward Z. Yang	507c69e20f	Don't uselessly recompute axiom dict every static eval call (#135429 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/135429 Approved by: https://github.com/isuruf ghstack dependencies: #135137	2024-09-27 04:03:25 +00:00
Edward Z. Yang	11fd55827d	Make CLOSURE_VARS construction lazy (#136599 ) This makes us less likely to hit import cycle problems with torch Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/136599 Approved by: https://github.com/anijain2305	2024-09-26 16:50:13 +00:00
PyTorch MergeBot	9223c16208	Revert "Fix constant propagation in builtins and UserClasses (#131354 )" This reverts commit `dd4a51b39a`. Reverted https://github.com/pytorch/pytorch/pull/131354 on behalf of https://github.com/atalman due to Breaks torchrec tests ([comment](https://github.com/pytorch/pytorch/pull/131354#issuecomment-2375417145))	2024-09-25 23:01:03 +00:00
Tom Ritchford	dd4a51b39a	Fix constant propagation in builtins and UserClasses (#131354 ) * Fixes https://github.com/pytorch/pytorch/issues/118675 * Replaces https://github.com/pytorch/pytorch/pull/118994 Pull Request resolved: https://github.com/pytorch/pytorch/pull/131354 Approved by: https://github.com/jansel, https://github.com/anijain2305	2024-09-25 13:03:40 +00:00
Tom Ritchford	e3ea5429f2	Implement GetAttrVariable.as_python_constant() (#134216 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/134216 Approved by: https://github.com/amjames, https://github.com/williamwen42	2024-09-20 03:44:43 +00:00
Jan Wieczorek	908a5689eb	Return unsafe_view instead of view from matmul when folding occurs (#134568 ) When tensor folding occurs during a matmul operation, the returned tensor is a view. This can cause issues when matmul is used inside a custom function and such a view is then returned as output: it cannot be modified in place, which causes errors. This is especially problematic when an in-place allreduce is performed after such a function. The issue is resolved when unsafe_view is returned from matmul instead. This solution aligns the matmul decomposition with the eager implementation, so that a non-view tensor is returned. The test included in this PR reproduces the issue. Pull Request resolved: https://github.com/pytorch/pytorch/pull/134568 Approved by: https://github.com/zou3519	2024-09-19 11:52:16 +00:00
Michael Lazos	14cabdf626	[Dynamo] Support thread local setattr (#135443 ) In preparation for tracing through DeviceContext (`defb515306/torch/utils/_device.py (L66)`) This PR adds support for calling the setattr of thread local objects. These objects have a slots impl, and since this doesn't appear to have any side effects, we call this setattr impl when replaying mutations, since calling `object.__setattr__` on these objects results in a type error. Pull Request resolved: https://github.com/pytorch/pytorch/pull/135443 Approved by: https://github.com/anijain2305 ghstack dependencies: #134732, #133137	2024-09-14 18:52:22 +00:00
PyTorch MergeBot	46f5037007	Revert "[Dynamo] Support thread local setattr (#135443 )" This reverts commit `149d0b7161`. Reverted https://github.com/pytorch/pytorch/pull/135443 on behalf of https://github.com/mlazos due to broke python test/quantization/pt2e/test_numeric_debugger.py TestNumericDebugger.test_re_export_preserve_handle modified yesterday ([comment](https://github.com/pytorch/pytorch/pull/134732#issuecomment-2350937008))	2024-09-14 10:02:55 +00:00
Michael Lazos	149d0b7161	[Dynamo] Support thread local setattr (#135443 ) In preparation for tracing through DeviceContext (`defb515306/torch/utils/_device.py (L66)`) This PR adds support for calling the setattr of thread local objects. These objects have a slots impl, and since this doesn't appear to have any side effects, we call this setattr impl when replaying mutations, since calling `object.__setattr__` on these objects results in a type error. Pull Request resolved: https://github.com/pytorch/pytorch/pull/135443 Approved by: https://github.com/anijain2305 ghstack dependencies: #134732, #133137	2024-09-14 02:40:52 +00:00
PyTorch MergeBot	3f30360d05	Revert "[Dynamo] Support thread local setattr (#135443 )" This reverts commit `30b007bea3`. Reverted https://github.com/pytorch/pytorch/pull/135443 on behalf of https://github.com/albanD due to Broke tests on main ([comment](https://github.com/pytorch/pytorch/pull/134732#issuecomment-2348886378))	2024-09-13 12:52:58 +00:00
Michael Lazos	30b007bea3	[Dynamo] Support thread local setattr (#135443 ) In preparation for tracing through DeviceContext (`defb515306/torch/utils/_device.py (L66)`) This PR adds support for calling the setattr of thread local objects. These objects have a slots impl, and since this doesn't appear to have any side effects, we call this setattr impl when replaying mutations, since calling `object.__setattr__` on these objects results in a type error. Pull Request resolved: https://github.com/pytorch/pytorch/pull/135443 Approved by: https://github.com/anijain2305 ghstack dependencies: #134732, #133137	2024-09-13 08:41:07 +00:00
Bob Ren	dd47f6f623	Simplify expr before getting implications in _maybe_evaluate_static (#135499 ) Fixes #134268 Previously we weren't simplifying these expressions before calling get_implications, resulting in inconsistent application of FloorDiv/CleanDiv. See #134268 for more details. Pull Request resolved: https://github.com/pytorch/pytorch/pull/135499 Approved by: https://github.com/ezyang	2024-09-11 19:48:29 +00:00
PyTorch MergeBot	3ab12e2596	Revert "[Dynamo] Support thread local setattr (#135443 )" This reverts commit `160c228a4b`. Reverted https://github.com/pytorch/pytorch/pull/135443 on behalf of https://github.com/clee2000 due to something in this stack broke functorch/test_control_flow.py::TestControlFlow::test_scan_simple_graph [GH job link](https://github.com/pytorch/pytorch/actions/runs/10804912306/job/29980571390) [HUD commit link](`444b52ff40`), newly added test yesterday ([comment](https://github.com/pytorch/pytorch/pull/135443#issuecomment-2344042800))	2024-09-11 15:53:55 +00:00
Michael Lazos	160c228a4b	[Dynamo] Support thread local setattr (#135443 ) In preparation for tracing through DeviceContext (`defb515306/torch/utils/_device.py (L66)`) This PR adds support for calling the setattr of thread local objects. These objects have a slots impl, and since this doesn't appear to have any side effects, we call this setattr impl when replaying mutations, since calling `object.__setattr__` on these objects results in a type error. Pull Request resolved: https://github.com/pytorch/pytorch/pull/135443 Approved by: https://github.com/anijain2305 ghstack dependencies: #134732, #133137	2024-09-11 04:18:22 +00:00
rzou	82d00acfee	Allow cross-device copies for cpu scalars in refs (#135140 ) This copies our eager-mode behavior where someone can do torch.add(a, b, out=c) where a and b are CPU scalar tensors and c is a CUDA tensor. Fixes https://github.com/pytorch/pytorch/issues/121619 by side effect (we get into a situation where we're writing a CPU scalar into a FakeTensor that is actually a meta tensor) Test Plan: - new test Pull Request resolved: https://github.com/pytorch/pytorch/pull/135140 Approved by: https://github.com/williamwen42, https://github.com/yanboliang	2024-09-05 19:08:48 +00:00
Animesh Jain	32f45f01a9	[dynamo] Retire CompileProfiler (#135133 ) Fixes confusion in https://github.com/pytorch/pytorch/issues/113443 We have TORCH_LOGS that supersedes CompileProfiler Pull Request resolved: https://github.com/pytorch/pytorch/pull/135133 Approved by: https://github.com/ezyang ghstack dependencies: #135039, #135121, #135129, #135130	2024-09-05 01:08:40 +00:00
Michael Lazos	d9ae92cd6e	[Dynamo] Support for proxying frozen dataclasses (#134846 ) Fixes https://github.com/pytorch/pytorch/issues/133858 Details: Previously Dynamo would treat dataclasses as UserDefinedVariables. This was undesirable when we want to proxy the value into the graph, which is needed for TensorSubclassMetadata. To rectify this, frozen dataclasses can now be proxied similarly to NamedTuples. We require the object to be frozen, because if arbitrary mutation were allowed, we would need to replay those mutations in the graph after construction of the object. For tracing construction of the variable, the generated `__init__` for the dataclass uses `object.__setattr__` because frozen dataclasses throw errors on the usual `__setattr__` invocation. With this treatment, no special handling is needed in dynamo for frozen dataclass construction. Pull Request resolved: https://github.com/pytorch/pytorch/pull/134846 Approved by: https://github.com/bdhirsh, https://github.com/anijain2305	2024-09-04 22:17:00 +00:00
rzou	d7b57c4d63	Fix tensor.data access under inference_mode and compile (#134878 ) Fixes https://github.com/pytorch/pytorch/issues/134798 In the regular Tensor case, when you call Tensor.data, there's a check for if inference mode is active. If it is active, then we don't set the version counter. We replicate this check for Tensor Subclasses (the bug was we were trying to set the version counter on a FakeTensor in inference_mode). Test Plan: - new test Pull Request resolved: https://github.com/pytorch/pytorch/pull/134878 Approved by: https://github.com/bdhirsh	2024-09-04 17:55:41 +00:00
Laith Sakka	6c3767452d	Move auto functionalize tests in their own test file (#134834 ) title + use `with torch.library._scoped_library as lib` when needed. Pull Request resolved: https://github.com/pytorch/pytorch/pull/134834 Approved by: https://github.com/zou3519 ghstack dependencies: #134831	2024-09-03 17:09:03 +00:00
Chen Haifeng	27ffa67984	Support __class__ attr for tuple and list variables (#134099 ) Fixes #134086 This adds support for the __class__ attribute on TupleVariable and ListVariable, and allows constructing a tuple or list via the __class__ attribute. This patch also fixes a bug in NamedTupleVariable, which was missing a return when calling super's var_getattr. Pull Request resolved: https://github.com/pytorch/pytorch/pull/134099 Approved by: https://github.com/anijain2305, https://github.com/jansel	2024-08-30 01:57:49 +00:00
Xuehai Pan	70853b792a	[dynamo][itertools] support `itertools.tee` (#133771 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/133771 Approved by: https://github.com/jansel ghstack dependencies: #133801	2024-08-29 13:36:52 +00:00
PyTorch MergeBot	f65df5edae	Revert "[dynamo][itertools] support `itertools.tee` (#133771 )" This reverts commit `1dbd3476de`. Reverted https://github.com/pytorch/pytorch/pull/133771 on behalf of https://github.com/ZainRizvi due to Sorry, have to revert this in order to be able to revert https://github.com/pytorch/pytorch/pull/133769 ([comment](https://github.com/pytorch/pytorch/pull/133771#issuecomment-2316611158))	2024-08-29 02:49:30 +00:00
Yanbo Liang	97c8a0739e	[Dynamo] Support inspect.signature.Parameter getattr (#134636 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/134636 Approved by: https://github.com/Chillee, https://github.com/anijain2305	2024-08-28 09:59:41 +00:00
Bob Ren	1ba39ec1d0	Add test case test_arange_length_with_float32_dtype (#134415 ) Adding a test as a followup from https://github.com/pytorch/pytorch/pull/134296 Pull Request resolved: https://github.com/pytorch/pytorch/pull/134415 Approved by: https://github.com/ezyang	2024-08-27 21:36:23 +00:00
Xuehai Pan	1dbd3476de	[dynamo][itertools] support `itertools.tee` (#133771 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/133771 Approved by: https://github.com/jansel	2024-08-27 00:08:04 +00:00
PyTorch MergeBot	3d7f3f6a55	Revert "[dynamo][itertools] support `itertools.tee` (#133771 )" This reverts commit `0e49b2f18e`. Reverted https://github.com/pytorch/pytorch/pull/133771 on behalf of https://github.com/ZainRizvi due to Sorry, but this breaks internal tests because of using functools ([comment](https://github.com/pytorch/pytorch/pull/133778#issuecomment-2310445169))	2024-08-26 15:16:17 +00:00
Xu Han	dc1959e6a7	[inductor] calibration inductor windows uts (7/N) (#134420 ) Disable UTs on Windows: `test/dynamo/test_misc.py` Pull Request resolved: https://github.com/pytorch/pytorch/pull/134420 Approved by: https://github.com/jansel	2024-08-25 20:39:54 +00:00
Xu Han	90fb83749e	[inductor] fix test torch package working with trace on windows (#134397 ) The temporary directory path is currently hard-coded. Fixed by obtaining the temporary directory path via an API. Reproduce UTs: ```cmd python test/dynamo/test_dynamic_shapes.py -v -k test_torch_package_working_with_trace_dynamic_shapes ``` Error message: ```cmd ________________________________________________________________________________________________ DynamicShapesMiscTests.test_torch_package_working_with_trace_dynamic_shapes ________________________________________________________________________________________________ Traceback (most recent call last): File "D:\xu_git\dnnl_cb\pytorch\test\dynamo\test_misc.py", line 7199, in test_torch_package_working_with_trace with package.PackageExporter(path) as exp: File "C:\Users\Xuhan\.conda\envs\win_mkl_static\lib\site-packages\torch\package\package_exporter.py", line 237, in __init__ self.zip_file = torch._C.PyTorchFileWriter(f) RuntimeError: Parent directory /tmp does not exist. To execute this test, run the following from the base repo dir: python test\dynamo\test_dynamic_shapes.py DynamicShapesMiscTests.test_torch_package_working_with_trace_dynamic_shapes This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 ========================================================================================================================== short test summary info ========================================================================================================================== FAILED [0.0080s] test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_torch_package_working_with_trace_dynamic_shapes - RuntimeError: Parent directory /tmp does not exist. 
==================================================================================================================== 1 failed, 1665 deselected in 4.00s ===================================================================================================================== ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/134397 Approved by: https://github.com/ezyang	2024-08-24 20:25:44 +00:00
Xuehai Pan	0e49b2f18e	[dynamo][itertools] support `itertools.tee` (#133771 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/133771 Approved by: https://github.com/jansel ghstack dependencies: #133769, #133778, #133779	2024-08-23 10:13:12 +00:00
Xuehai Pan	25b2e46573	[dynamo] add max iterator limit while inlining generators (#134233 ) Related: - #133879 Pull Request resolved: https://github.com/pytorch/pytorch/pull/134233 Approved by: https://github.com/jansel	2024-08-23 07:03:31 +00:00
rzou	683609c631	Skip cpp_extension test internally (#134011 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/134011 Approved by: https://github.com/masnesral	2024-08-21 13:51:05 +00:00
Xuehai Pan	c929e1e11f	[dynamo] fix polyfill for user defined constructor `__new__` (#133822 ) In `cls->tp_call`, if `cls->tp_new` does not return an instance of class `cls`, then `cls->tp_init` is not called on the new instance. Related PR: - #132977 Pull Request resolved: https://github.com/pytorch/pytorch/pull/133822 Approved by: https://github.com/jansel	2024-08-21 12:41:19 +00:00
PyTorch MergeBot	2540ee372a	Revert "[dynamo][itertools] support `itertools.tee` (#133771 )" This reverts commit `28ce3c0227`. Reverted https://github.com/pytorch/pytorch/pull/133771 on behalf of https://github.com/ZainRizvi due to breaking main windows cpu tests - this stack still causes that windows test to fail ([comment](https://github.com/pytorch/pytorch/pull/133712#issuecomment-2299776241))	2024-08-20 21:14:44 +00:00
Xuehai Pan	b03381cac2	[dynamo] support `cls.__flags__` (#133970 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/133970 Approved by: https://github.com/jansel ghstack dependencies: #133969	2024-08-20 20:03:31 +00:00
Xuehai Pan	28ce3c0227	[dynamo][itertools] support `itertools.tee` (#133771 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/133771 Approved by: https://github.com/jansel ghstack dependencies: #133712, #133769, #133778, #133779	2024-08-20 19:48:57 +00:00
Bob Ren	f08d484702	Add itertools.islice support in dynamo (#133893 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/133893 Approved by: https://github.com/oulgen	2024-08-20 05:55:53 +00:00
Animesh Jain	6ca68357b3	[dynamo] Save class vt in UserDefinedObjectVariable (#133800 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/133800 Approved by: https://github.com/jansel ghstack dependencies: #133745, #133747, #133746, #133799	2024-08-19 17:21:48 +00:00
Animesh Jain	fed6096e73	[dynamo] Support object.__new__ call (#133746 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/133746 Approved by: https://github.com/Skylion007, https://github.com/jansel ghstack dependencies: #133745, #133747	2024-08-18 07:18:52 +00:00
Animesh Jain	8a5708ba3d	[dynamo] Support object creation of classes with custom __new__ (#132977 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/132977 Approved by: https://github.com/jansel	2024-08-16 03:09:23 +00:00
Edward Z. Yang	b5711297a0	Add support for SetVariable.discard (#133317 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/133317 Approved by: https://github.com/Skylion007	2024-08-14 09:10:36 +00:00
rzou	afb73d253c	[custom_ops] torch.library.{custom_op, register_kernel} disable Dynamo (#133125 ) We promise the user that these custom ops (and their kernels) are black boxes w.r.t. torch.compile. Unfortunately Dynamo can turn itself back on in the implementation of the custom operator, so we force it off by disabling Dynamo Test Plan: - new tests Pull Request resolved: https://github.com/pytorch/pytorch/pull/133125 Approved by: https://github.com/ezyang	2024-08-12 18:29:18 +00:00
Yiming Zhou	c69b2d24e3	[dynamo] Support remove method of set (#132943 ) Fixes https://github.com/pytorch/pytorch/issues/132800 Pull Request resolved: https://github.com/pytorch/pytorch/pull/132943 Approved by: https://github.com/anijain2305	2024-08-08 02:43:19 +00:00
Joel Schlosser	fb146fc3c6	Only store necessary tensor_dict fields in node meta (#132805 ) Fixes #132290 This PR attempts a more invasive / complete solution than the one from #132338, which removes immediate tensor fields from the `tensor_dict` copy stored in node meta. The approach taken here is to store only those fields of the `tensor_dict` which are absolutely utilized somewhere else. So far, this appears to be limited to: * `_dynamo_static_input_type` * `tag` (at least in the tests). Discussion at #94080 appears to indicate this is depended on for export (CI may point out more) Pull Request resolved: https://github.com/pytorch/pytorch/pull/132805 Approved by: https://github.com/mlazos	2024-08-07 13:35:16 +00:00
Animesh Jain	06581c277a	[dynamo][stable-diffusion] Support dict(obj) on constrained subclasses of dict and OrderedDict (#132558 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/132558 Approved by: https://github.com/jansel	2024-08-03 06:31:00 +00:00
William Wen	f379bbd46d	[dynamo] support inspect.signature.bind (#132330 ) Fixes https://github.com/pytorch/pytorch/issues/93760. This was not that small of a task... Pull Request resolved: https://github.com/pytorch/pytorch/pull/132330 Approved by: https://github.com/jansel ghstack dependencies: #132329	2024-08-02 20:37:05 +00:00
Edward Z. Yang	fc32732596	Don't attempt to compute hints for unbacked expressions (#132060 ) This breaks the inference we made that if you cat an N-D tensor with a 1-D tensor of size (u0,), the u0 must be zero, but no one really wanted that anyway... Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/132060 Approved by: https://github.com/Skylion007	2024-08-02 16:39:14 +00:00
PyTorch MergeBot	1197550876	Revert "Don't attempt to compute hints for unbacked expressions (#132060 )" This reverts commit `d342dc0179`. Reverted https://github.com/pytorch/pytorch/pull/132060 on behalf of https://github.com/ezyang due to test_correct_module_names ([comment](https://github.com/pytorch/pytorch/pull/132407#issuecomment-2265754857))	2024-08-02 16:32:43 +00:00
Edward Z. Yang	d342dc0179	Don't attempt to compute hints for unbacked expressions (#132060 ) This breaks the inference we made that if you cat an N-D tensor with a 1-D tensor of size (u0,), the u0 must be zero, but no one really wanted that anyway... Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/132060 Approved by: https://github.com/Skylion007 ghstack dependencies: #131649, #132407	2024-08-02 12:09:37 +00:00
Yanbo Liang	5ea0f51187	[Dynamo] Support abc.MutableMapping.get (#132363 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/132363 Approved by: https://github.com/anijain2305, https://github.com/mlazos	2024-08-02 04:17:35 +00:00
Chen Haifeng	50ed6ce277	Support built-in id function for TensorVariable on parameters (#130100 ) Fixes #130087 This patch tries to provide a built-in id function implementation for TensorVariable when the id function is called on tensors like module parameters. The id function call on intermediate tensors is not supported. Pull Request resolved: https://github.com/pytorch/pytorch/pull/130100 Approved by: https://github.com/anijain2305	2024-08-02 01:19:25 +00:00
Oguz Ulgen	920f0426ae	Add None return type to init -- tests rest (#132376 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/132376 Approved by: https://github.com/jamesjwu ghstack dependencies: #132335, #132351, #132352	2024-08-01 15:44:51 +00:00
YangQun1	589aef4bb0	Fix py codegen to delete values that don't have any users (#131028 ) Fixes #131025 Pull Request resolved: https://github.com/pytorch/pytorch/pull/131028 Approved by: https://github.com/ezyang	2024-08-01 03:18:37 +00:00
ekamiti	9e473fd868	Make adding Buffers more like adding Parameters (#125971 ) Add semantics for creating a buffer object similar to those for creating a parameter. This is done by introducing a new Buffer class that can be used for type disambiguation. The underlying functionality of registering a buffer remains the same, as the register_buffer method has not been changed. The persistent parameter in the Buffer type indicates whether a buffer object should be persistent or not. Other non-test changes have to do with getting the new Buffer type recognized by inductor and dynamo. Remaining changes are test changes to make sure that the Buffer type can be used as a drop-in replacement for register_buffer, as it just leads to register_buffer being called. The addition of this new functionality still allows normal tensors to be used as buffers, so these changes are intended to be backwards compatible. Fixes #35735 Co-authored-by: Mikayla Gawarecki <mikaylagawarecki@gmail.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/125971 Approved by: https://github.com/albanD, https://github.com/anijain2305, https://github.com/mlazos	2024-07-31 10:32:40 +00:00
Yidi Wu	32c57e78ed	Specialize sym node when used as device kwarg (#131811 ) Fixes https://github.com/pytorch/pytorch/issues/131189. We specialize the symint in python_arg_parser when used as kwarg device. Pull Request resolved: https://github.com/pytorch/pytorch/pull/131811 Approved by: https://github.com/yanboliang, https://github.com/jansel, https://github.com/albanD	2024-07-30 17:11:57 +00:00
Animesh Jain	13457d1da0	[dynamo][log] Suggest to use pytree when graph-break on optree (#131827 ) Discovered while working on https://github.com/pytorch/pytorch/issues/121369 On the model above, the log looks like this ~~~ /home/anijain/local/pytorch2/torch/_dynamo/variables/functions.py:698: UserWarning: Graph break for an optree C/C++ function optree._C.PyCapsule.flatten. Consider using torch._utils.pytree - https://github.com/pytorch/pytorch/blob/main/torch/utils/_pytree.py. torch._dynamo.utils.warn_once(msg) /home/anijain/local/pytorch2/torch/_dynamo/variables/functions.py:698: UserWarning: Graph break for an optree C/C++ function optree.PyCapsule.unflatten. Consider using torch._utils.pytree - https://github.com/pytorch/pytorch/blob/main/torch/utils/_pytree.py. torch._dynamo.utils.warn_once(msg) ~~~ Pull Request resolved: https://github.com/pytorch/pytorch/pull/131827 Approved by: https://github.com/zou3519, https://github.com/mlazos	2024-07-30 05:49:58 +00:00
Xuehai Pan	918ece4f4d	[BE][Easy][11/19] enforce style for empty lines in import segments in `test/dy*/` (#129762 ) See https://github.com/pytorch/pytorch/pull/129751#issue-2380881501. Most changes are auto-generated by linter. You can review these PRs via: ```bash git diff --ignore-all-space --ignore-blank-lines HEAD~1 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/129762 Approved by: https://github.com/anijain2305	2024-07-27 17:43:53 +00:00
Animesh Jain	13ab92b72d	[dynamo][recompile-logs] Suggest force_parameter_static_shapes on the recompile log for parameter-related recomps (#131825 ) Discovered in https://github.com/pytorch/pytorch/issues/121369 On the user-empathy-day model, the logs look like these ~~~ W0725 15:33:58.022000 1967777 torch/_dynamo/convert_frame.py:807] [0/8] torch._dynamo hit config.cache_size_limit (8) W0725 15:33:58.022000 1967777 torch/_dynamo/convert_frame.py:807] [0/8] function: 'auto_repeat_tensors_for_time' (/home/anijain/local/lumiere-pytorch/lumiere_pytorch/lumiere.py:545) W0725 15:33:58.022000 1967777 torch/_dynamo/convert_frame.py:807] [0/8] last reason: 0/0: len(L['args']) == 1 W0725 15:33:58.022000 1967777 torch/_dynamo/convert_frame.py:807] [0/8] To log all recompilation reasons, use TORCH_LOGS="recompiles". W0725 15:33:58.022000 1967777 torch/_dynamo/convert_frame.py:807] [0/8] To diagnose recompilation issues, see https://pytorch.org/docs/main/torch.compiler_troubleshooting.html. W0725 15:34:00.282000 1967777 torch/_dynamo/convert_frame.py:807] [11/8] torch._dynamo hit config.cache_size_limit (8) W0725 15:34:00.282000 1967777 torch/_dynamo/convert_frame.py:807] [11/8] function: 'forward' (/home/anijain/.conda/envs/pytorch-3.10/lib/python3.10/site-packages/denoising_diffusion_pytorch/karras_unet.py:150) W0725 15:34:00.282000 1967777 torch/_dynamo/convert_frame.py:807] [11/8] last reason: 11/0: tensor 'L['x']' size mismatch at index 0. expected 16, actual 8 W0725 15:34:00.282000 1967777 torch/_dynamo/convert_frame.py:807] [11/8] To log all recompilation reasons, use TORCH_LOGS="recompiles". W0725 15:34:00.282000 1967777 torch/_dynamo/convert_frame.py:807] [11/8] To diagnose recompilation issues, see https://pytorch.org/docs/main/torch.compiler_troubleshooting.html. 
W0725 15:34:10.216000 1967777 torch/_dynamo/convert_frame.py:807] [40/8] torch._dynamo hit config.cache_size_limit (8) W0725 15:34:10.216000 1967777 torch/_dynamo/convert_frame.py:807] [40/8] function: 'normalize_weight' (/home/anijain/.conda/envs/pytorch-3.10/lib/python3.10/site-packages/denoising_diffusion_pytorch/karras_unet.py:127) W0725 15:34:10.216000 1967777 torch/_dynamo/convert_frame.py:807] [40/8] last reason: 40/1: tensor 'L['weight']' size mismatch at index 0. expected 64, actual 16. Guard failed on a parameter, consider using torch._dynamo.config.force_parameter_static_shapes = False to allow dynamism on parameters. W0725 15:34:10.216000 1967777 torch/_dynamo/convert_frame.py:807] [40/8] To log all recompilation reasons, use TORCH_LOGS="recompiles". W0725 15:34:10.216000 1967777 torch/_dynamo/convert_frame.py:807] [40/8] To diagnose recompilation issues, see https://pytorch.org/docs/main/torch.compiler_troubleshooting.html. W0725 15:34:11.643000 1967777 torch/_dynamo/convert_frame.py:807] [58/8] torch._dynamo hit config.cache_size_limit (8) W0725 15:34:11.643000 1967777 torch/_dynamo/convert_frame.py:807] [58/8] function: 'pack_one' (/home/anijain/.conda/envs/pytorch-3.10/lib/python3.10/site-packages/denoising_diffusion_pytorch/karras_unet.py:38) W0725 15:34:11.643000 1967777 torch/_dynamo/convert_frame.py:807] [58/8] last reason: 58/1: tensor 'L['t']' stride mismatch at index 0. expected 32, actual 8. Guard failed on a parameter, consider using torch._dynamo.config.force_parameter_static_shapes = False to allow dynamism on parameters. W0725 15:34:11.643000 1967777 torch/_dynamo/convert_frame.py:807] [58/8] To log all recompilation reasons, use TORCH_LOGS="recompiles". W0725 15:34:11.643000 1967777 torch/_dynamo/convert_frame.py:807] [58/8] To diagnose recompilation issues, see https://pytorch.org/docs/main/torch.compiler_troubleshooting.html. 
W0725 15:34:12.029000 1967777 torch/_dynamo/convert_frame.py:807] [62/8] torch._dynamo hit config.cache_size_limit (8) W0725 15:34:12.029000 1967777 torch/_dynamo/convert_frame.py:807] [62/8] function: 'torch_dynamo_resume_in_pack_at_70' (/home/anijain/.conda/envs/pytorch-3.10/lib/python3.10/site-packages/einops-0.8.0-py3.10.egg/einops/packing.py:70) W0725 15:34:12.029000 1967777 torch/_dynamo/convert_frame.py:807] [62/8] last reason: 62/0: tensor 'L['tensors'][0]' size mismatch at index 0. expected 16, actual 32. Guard failed on a parameter, consider using torch._dynamo.config.force_parameter_static_shapes = False to allow dynamism on parameters. W0725 15:34:12.029000 1967777 torch/_dynamo/convert_frame.py:807] [62/8] To log all recompilation reasons, use TORCH_LOGS="recompiles". W0725 15:34:12.029000 1967777 torch/_dynamo/convert_frame.py:807] [62/8] To diagnose recompilation issues, see https://pytorch.org/docs/main/torch.compiler_troubleshooting.html. W0725 15:34:12.357000 1967777 torch/_dynamo/convert_frame.py:807] [65/8] torch._dynamo hit config.cache_size_limit (8) W0725 15:34:12.357000 1967777 torch/_dynamo/convert_frame.py:807] [65/8] function: 'reshape' (/home/anijain/.conda/envs/pytorch-3.10/lib/python3.10/site-packages/einops-0.8.0-py3.10.egg/einops/_backends.py:91) W0725 15:34:12.357000 1967777 torch/_dynamo/convert_frame.py:807] [65/8] last reason: 65/0: tensor 'L['x']' size mismatch at index 0. expected 32, actual 8. Guard failed on a parameter, consider using torch._dynamo.config.force_parameter_static_shapes = False to allow dynamism on parameters. ~~~~ Pull Request resolved: https://github.com/pytorch/pytorch/pull/131825 Approved by: https://github.com/ezyang ghstack dependencies: #131795, #131801, #131804	2024-07-26 16:25:21 +00:00
PyTorch MergeBot	c3679bed35	Revert "Fix py codegen to delete values that don't have any users (#131028 )" This reverts commit `91aba7baac`. Reverted https://github.com/pytorch/pytorch/pull/131028 on behalf of https://github.com/clee2000 due to broke inductor/test_triton_kernels inductor/test_triton_kernels.py::KernelTests::test_triton_kernel_functionalize [GH job link](https://github.com/pytorch/pytorch/actions/runs/10094659640/job/27915271250) [HUD commit link](`91aba7baac`) ([comment](https://github.com/pytorch/pytorch/pull/131028#issuecomment-2251058374))	2024-07-25 17:42:18 +00:00
Yidi Wu	ffc6bf8149	[dynamo] lazily guard and specialize on the symint when used in f-string. (#131529 ) Fixes https://github.com/pytorch/pytorch/issues/103602. This PR implements the idea of "if someone creates a string and then ends up not using it, we would prefer to NOT have specialized." mentioned in above issue. Specifically, we create a lazy variable tracker instead of ConstantVariable when we're in FORMAT_VALUE, and when the lazy variable tracker is realized (i.e. it's going to be used), we create a ConstantVariable and the specialization/guarding happens at the time of realization. Pull Request resolved: https://github.com/pytorch/pytorch/pull/131529 Approved by: https://github.com/ezyang	2024-07-25 16:16:34 +00:00
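The lazy-specialization idea in the commit above can be illustrated with a toy sketch. `LazySpecialized` and the `guards` list are hypothetical names used only for illustration; they are not Dynamo's actual variable tracker classes. The point is that a guard is recorded only when the value is actually realized.

```python
class LazySpecialized:
    """Hypothetical stand-in for a lazily realized value (not a Dynamo class)."""

    def __init__(self, value, guards):
        self._value = value
        self._guards = guards  # shared list that records installed guards
        self._realized = False

    def realize(self):
        # The guard is installed only the first time the value is used.
        if not self._realized:
            self._guards.append(f"value == {self._value!r}")
            self._realized = True
        return self._value


guards = []
unused = LazySpecialized(8, guards)   # created but never read: no guard
used = LazySpecialized(16, guards)    # read below: one guard
message = f"batch size is {used.realize()}"
```

Only the value that ends up inside the string contributes a guard, which mirrors the "prefer to NOT have specialized" goal quoted in the commit message.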
YangQun1	91aba7baac	Fix py codegen to delete values that don't have any users (#131028 ) Fixes #131025 Pull Request resolved: https://github.com/pytorch/pytorch/pull/131028 Approved by: https://github.com/ezyang	2024-07-25 13:04:23 +00:00
PyTorch MergeBot	8ffd109a00	Revert "Fix py codegen to delete values that don't have any users (#131028 )" This reverts commit `466c167b71`. Reverted https://github.com/pytorch/pytorch/pull/131028 on behalf of https://github.com/atalman due to breaks CI ([comment](https://github.com/pytorch/pytorch/pull/131028#issuecomment-2247771530))	2024-07-24 12:21:43 +00:00
YangQun1	466c167b71	Fix py codegen to delete values that don't have any users (#131028 ) Fixes #131025 Pull Request resolved: https://github.com/pytorch/pytorch/pull/131028 Approved by: https://github.com/ezyang	2024-07-24 01:03:56 +00:00
Aaron Orenstein	b193894b94	FakeTensor cache SymInt support (#127596 ) Adds support for SymInts in the FakeTensor cache. A couple notes: 1. When a SymInt is present in the input key for a FakeTensor operation we cache on the ShapeEnv instead of using the FakeTensorMode cache. This is necessary so we don't have to remember and check the guards. It reduces the cache hits but there's diminishing return on how much work we can do before the cache becomes more of a burden than a gain. 2. We need to be careful that when we cache an output SymInt that is a direct copy from the input that when we have a cache-hit we copy the SymNode from the input to the output. This is important because the fx-graph building code actually uses SymNode ids in the process of building the graph so constructing a same-content-but-different-id SymNode will fail. 3. In the cache key we store SymInts as a _PySymInputStub. These represent SymInt (and friends) but support `__hash__` and `__eq__` (which SymInt do not). 4. In the cache entry we store SymInts as a _SymIntOutputStub. 
Perf example: ``` python benchmarks/dynamo/timm_models.py --ci --accuracy --timing --explain --inductor --dynamic-shapes --dynamic-batch-only --device cuda --training --amp --total-partitions 2 --partition-id 0 --output /tmp/training_timm_models.csv --filter crossvit_9_240 ``` fake tensor cache before: ``` INFO: FakeTensor cache stats: INFO: cache_hits: 68137 INFO: cache_misses: 837 INFO: cache_bypasses: INFO: symbolic shape: 48224 INFO: CompositeImplicitAutograd: 917 INFO: non-fake tensor: 70 INFO: non-FakeTensor output: 62 INFO: non-builtin: 8 INFO: dynamic output shape: 1 ``` and after: ``` INFO: FakeTensor cache stats: INFO: cache_hits: 88187 INFO: cache_misses: 14233 INFO: cache_bypasses: INFO: CompositeImplicitAutograd: 1037 INFO: non-FakeTensor output: 602 INFO: non-fake tensor: 70 INFO: unsafe view: 36 INFO: non-builtin: 8 INFO: dynamic output shape: 1 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/127596 Approved by: https://github.com/eellison ghstack dependencies: #131014, #129780	2024-07-21 19:26:38 +00:00
Michael Lazos	1b72cf0b09	Add hasattr for tensor variable (#131008 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/131008 Approved by: https://github.com/anijain2305 ghstack dependencies: #131007	2024-07-19 12:43:27 +00:00
Pian Pawakapan	988ed4d5db	[export] clean up allow_complex_guards_as_runtime_asserts flag (#130596 ) Summary: removes underscore, cleans up dead code in DimConstraints Test Plan: existing export tests Reviewed By: angelayi Differential Revision: D59612746 Pull Request resolved: https://github.com/pytorch/pytorch/pull/130596 Approved by: https://github.com/angelayi	2024-07-12 17:17:11 +00:00
Michael Lazos	c101c4517a	Add python type for list iterators (#130511 ) Fixes https://github.com/pytorch/pytorch/issues/117026 Also not sure why this was missing Pull Request resolved: https://github.com/pytorch/pytorch/pull/130511 Approved by: https://github.com/williamwen42, https://github.com/yanboliang, https://github.com/anijain2305	2024-07-12 01:14:18 +00:00
Xuehai Pan	973037be6a	[BE][Easy] apply autofix for ruff rules unnecessary-collection-call (C408): `list()` / `tuple()` / `dict()` (#130199 ) This PR changes the empty collection factory call to Python literals: - `list()` -> `[]` - `tuple()` -> `()` - `dict()` -> `{}` The Python literals are more performant and safer. For example, the bytecode for building an empty dictionary: ```bash $ python3 -m dis - <<EOS import collections d1 = {} d2 = dict() dict = collections.OrderedDict d3 = dict() EOS ``` ```text 0 0 RESUME 0 1 2 LOAD_CONST 0 (0) 4 LOAD_CONST 1 (None) 6 IMPORT_NAME 0 (collections) 8 STORE_NAME 0 (collections) 3 10 BUILD_MAP 0 12 STORE_NAME 1 (d1) 4 14 PUSH_NULL 16 LOAD_NAME 2 (dict) 18 CALL 0 26 STORE_NAME 3 (d2) 6 28 LOAD_NAME 0 (collections) 30 LOAD_ATTR 8 (OrderedDict) 50 STORE_NAME 2 (dict) 7 52 PUSH_NULL 54 LOAD_NAME 2 (dict) 56 CALL 0 64 STORE_NAME 5 (d3) 66 RETURN_CONST 1 (None) ``` The dict literal `{}` only has one bytecode `BUILD_MAP`, while the factory call `dict()` has three `PUSH_NULL + LOAD_NAME + CALL`. Also, the factory call is not safe if users override the `dict` name in `locals` or `globals` (see the example of replacing with `OrderedDict` above). Pull Request resolved: https://github.com/pytorch/pytorch/pull/130199 Approved by: https://github.com/malfet	2024-07-11 17:30:28 +00:00
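The two claims in the commit above (fewer bytecodes for the literal, and the factory call depending on name lookup) can be checked with a short standard-library sketch. The helper names `opnames` and `shadowed` are introduced here for illustration; exact opcode names for calls vary between CPython versions, so the sketch only looks for `BUILD_MAP` and for any opcode whose name starts with `CALL`.

```python
import collections
import dis


def opnames(source):
    """Return the opcode names produced by compiling one statement."""
    code = compile(source, "<example>", "exec")
    return [ins.opname for ins in dis.get_instructions(code)]


literal_ops = opnames("d = {}")     # builds the dict directly
call_ops = opnames("d = dict()")    # looks up the name `dict`, then calls it


def shadowed():
    # Rebinding the name changes what dict() builds; the literal is unaffected.
    dict = collections.OrderedDict
    return type(dict()), type({})


print("BUILD_MAP" in literal_ops, "BUILD_MAP" in call_ops)
print(shadowed())
```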
Pian Pawakapan	1b3b4c2fb9	[runtime asserts] deduplicate runtime asserts & CSE (#128599 ) (#130380 ) original PR: https://github.com/pytorch/pytorch/pull/128599 (re-created after revert + poisoned diff train) Summary: This PR adds deduplication and CSE for runtime asserts. Existing size computation in the graph is CSE'd along with added runtime asserts, and redundant asserts are removed. Shape calls on intermediate tensors are also turned into compute on input sizes if possible, allowing intermediate tensors to be freed earlier. For example: ``` z = torch.cat([x, x], dim=0) # 2s0 w = z.repeat(y.shape[0]) # 2s0s1 _w = w.shape[0] s0 = x.shape[0] s1 = y.shape[0] _w0 = 2 * s0 _w = _w0 * s1 ``` Additionally, constrain_range calls are deduplicated. Single-symbol bound checks for unbacked symbols (e.g. u0 >= 0, u0 <= 5) and sym_constrain_range.default calls are also removed, since they accumulate range info in the ShapeEnv, and are replaced with two _assert_scalar.default calls that check the min/max bounds. For example: ``` torch.sym_constrain_range_for_size(n, min=2, max=16) torch.sym_constrain_range(n, min=4, max=20) torch._check(n >= 0) torch._check(n >= 3) torch._check(n <= 14) torch.sym_constrain_range_for_size(n) torch._check(n >= 4) torch._check(n <= 14) ``` Test Plan: contbuild & OSS CI, see `940e4477ab` Original Phabricator Test Plan: Imported from GitHub, without a `Test Plan:` line. Differential Revision: D59543603 Pull Request resolved: https://github.com/pytorch/pytorch/pull/130380 Approved by: https://github.com/izaitsevfb	2024-07-10 19:23:37 +00:00
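The bound deduplication described in the commit above amounts to keeping the tightest lower and upper bound per symbol. A minimal sketch, assuming each constraint is reduced to a `(lower, upper)` pair with `None` for a missing side; `merge_bounds` is a hypothetical helper, not the pass in `torch/fx/passes/runtime_assert.py`:

```python
def merge_bounds(constraints):
    """Collapse (lower, upper) pairs into the single tightest pair.

    None means "no bound on that side".
    """
    lowers = [lo for lo, _ in constraints if lo is not None]
    uppers = [hi for _, hi in constraints if hi is not None]
    lower = max(lowers) if lowers else None
    upper = min(uppers) if uppers else None
    return lower, upper


# The five checks on `n` from the commit message's example, as pairs.
constraints = [(2, 16), (4, 20), (0, None), (3, None), (None, 14)]
tightest = merge_bounds(constraints)
print(tightest)
```

The result corresponds to the two remaining checks in the commit's example, `n >= 4` and `n <= 14`.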
PyTorch MergeBot	9c9744c3ac	Revert "[runtime asserts] deduplicate runtime asserts & CSE (#128599 )" This reverts commit `940e4477ab`. Reverted https://github.com/pytorch/pytorch/pull/128599 on behalf of https://github.com/izaitsevfb due to breaking internal APS tests, see D59498864 ([comment](https://github.com/pytorch/pytorch/pull/128599#issuecomment-2218724762))	2024-07-09 21:03:49 +00:00
Yueming Hao	b4cc25f126	[custom_op]Fix self in mutation_args (#130179 ) Fixes #124933 ## Issue Summary If users list `self` in `mutates_args`, an error occurs: `TypeError: AutoFunctionalized.__call__() got multiple values for argument 'self'`. For the following example, the schema for mutates_args is parsed as {"self": FakeTensor}. `6df963a2c8/torch/_higher_order_ops/auto_functionalize.py (L234)` In the above line, it is unpacked as `self=FakeTensor`, which collides with the method's own `self` parameter, because `self` is the conventional name of the first parameter of methods on a class, such as https://github.com/pytorch/pytorch/compare/main...findhao/fix-self-custom-ops#diff-9453b6b52a54783beec3dd1c60248620f61c3a524d404a188af17bbdf6be3d9eR292 . ```python import torch @torch.library.custom_op("mylib::foo", mutates_args={"self"}) def foo(self: torch.Tensor) -> None: self.sin_() x = torch.randn(3) @torch.compile(backend="inductor", fullgraph=True) def f(x): foo(x) f(x) ``` ## Fix This PR renames every affected `self` parameter to `self_`, following the existing convention in `6fc771d19b/torch/_ops.py (L667)` Pull Request resolved: https://github.com/pytorch/pytorch/pull/130179 Approved by: https://github.com/zou3519	2024-07-08 22:55:50 +00:00
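The keyword collision behind the error in the commit above can be reproduced without PyTorch. The two classes below are hypothetical stand-ins, not `AutoFunctionalized` itself; they only show why a `**kwargs` entry named `self` clashes with a method whose first parameter is called `self`, and why renaming that parameter to `self_` avoids it.

```python
class CollidingCall:
    def __call__(self, op, **kwargs):
        return op, kwargs


class RenamedCall:
    # Same method, but the instance parameter is named self_, so a keyword
    # argument called "self" lands in **kwargs instead of colliding.
    def __call__(self_, op, **kwargs):
        return op, kwargs


mutated = {"self": "tensor"}  # stands in for {"self": FakeTensor}

try:
    CollidingCall()("mylib::foo", **mutated)
    collided = False
except TypeError:
    collided = True  # "got multiple values for argument 'self'"

result = RenamedCall()("mylib::foo", **mutated)
print(collided, result)
```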
Pian Pawakapan	940e4477ab	[runtime asserts] deduplicate runtime asserts & CSE (#128599 ) This PR adds deduplication and CSE for runtime asserts. Existing size computation in the graph is CSE'd along with added runtime asserts, and redundant asserts are removed. Shape calls on intermediate tensors are also turned into compute on input sizes if possible, allowing intermediate tensors to be freed earlier. For example: ``` z = torch.cat([x, x], dim=0) # 2s0 w = z.repeat(y.shape[0]) # 2s0s1 _w = w.shape[0] # something with _w ... # turns into -> s0 = x.shape[0] s1 = y.shape[0] _w0 = 2 * s0 _w = _w0 * s1 ``` Additionally, constrain_range calls are deduplicated. Single-symbol bound checks for unbacked symbols (e.g. u0 >= 0, u0 <= 5) and sym_constrain_range.default calls are also removed, since they accumulate range info in the ShapeEnv, and are replaced with two _assert_scalar.default calls that check the min/max bounds. For example: ``` torch.sym_constrain_range_for_size(n, min=2, max=16) torch.sym_constrain_range(n, min=4, max=20) torch._check(n >= 0) torch._check(n >= 3) torch._check(n <= 14) # turns into torch.sym_constrain_range_for_size(n) torch._check(n >= 4) torch._check(n <= 14) ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/128599 Approved by: https://github.com/ezyang	2024-07-07 20:10:14 +00:00
PyTorch MergeBot	963f430d13	Revert "[runtime asserts] deduplicate runtime asserts & CSE (#128599 )" This reverts commit `0267b2ddcb`. Reverted https://github.com/pytorch/pytorch/pull/128599 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it seems to cause a landrace and fails inductor/test_cudagraph_trees in trunk `0267b2ddcb` ([comment](https://github.com/pytorch/pytorch/pull/128599#issuecomment-2211690518))	2024-07-06 07:20:05 +00:00
Pian Pawakapan	0267b2ddcb	[runtime asserts] deduplicate runtime asserts & CSE (#128599 ) This PR adds deduplication and CSE for runtime asserts. Existing size computation in the graph is CSE'd along with added runtime asserts, and redundant asserts are removed. Shape calls on intermediate tensors are also turned into compute on input sizes if possible, allowing intermediate tensors to be freed earlier. For example: ``` z = torch.cat([x, x], dim=0) # 2s0 w = z.repeat(y.shape[0]) # 2s0s1 _w = w.shape[0] # something with _w ... # turns into -> s0 = x.shape[0] s1 = y.shape[0] _w0 = 2 * s0 _w = _w0 * s1 ``` Additionally, constrain_range calls are deduplicated. Single-symbol bound checks for unbacked symbols (e.g. u0 >= 0, u0 <= 5) and sym_constrain_range.default calls are also removed, since they accumulate range info in the ShapeEnv, and are replaced with two _assert_scalar.default calls that check the min/max bounds. For example: ``` torch.sym_constrain_range_for_size(n, min=2, max=16) torch.sym_constrain_range(n, min=4, max=20) torch._check(n >= 0) torch._check(n >= 3) torch._check(n <= 14) # turns into torch.sym_constrain_range_for_size(n) torch._check(n >= 4) torch._check(n <= 14) ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/128599 Approved by: https://github.com/ezyang	2024-07-06 03:44:49 +00:00
Animesh Jain	bd0252fb98	[dynamo][user-defined] Support method descriptors (#130159 ) Fixes https://github.com/pytorch/pytorch/issues/120650 Pull Request resolved: https://github.com/pytorch/pytorch/pull/130159 Approved by: https://github.com/jansel ghstack dependencies: #118448	2024-07-06 02:03:09 +00:00
Yanbo Liang	551f3b92b2	[Dynamo] Add assertion for tensor unpack shape mismatch (#130077 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/130077 Approved by: https://github.com/Chillee	2024-07-04 09:25:08 +00:00
Animesh Jain	fa4e489d70	[dynamo][dynamic-shapes] Graph break if out shape changes on out= variants (#130074 ) Fixes https://github.com/pytorch/pytorch/issues/130068 Pull Request resolved: https://github.com/pytorch/pytorch/pull/130074 Approved by: https://github.com/ezyang ghstack dependencies: #129913, #129914	2024-07-04 08:36:12 +00:00
Edward Z. Yang	29c68df600	Stop immediately specializing common constants 0/1 for plain int (#128327 ) Fixes https://github.com/pytorch/pytorch/issues/128319 Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/128327 Approved by: https://github.com/lezcano ghstack dependencies: #129983	2024-07-03 16:41:51 +00:00
Colin Peppler	39357ba06f	[dynamo] don't constrain range on the replacement for a symbol (#129907 ) # Error ``` File "/data/users/colinpeppler/pytorch/torch/_meta_registrations.py", line 704, in sym_constrain_range constrain_range(size, min=min, max=max) File "/data/users/colinpeppler/pytorch/torch/fx/experimental/symbolic_shapes.py", line 898, in constrain_range a.node.shape_env._constrain_range(a.node.expr, min, max) File "/data/users/colinpeppler/pytorch/torch/fx/experimental/recording.py", line 245, in wrapper return fn(*args, **kwargs) File "/data/users/colinpeppler/pytorch/torch/fx/experimental/symbolic_shapes.py", line 2813, in _constrain_range assert isinstance(a, sympy.Symbol), f"constraining non-Symbols NYI, {a} is {type(a)}" torch._dynamo.exc.BackendCompilerFailed: backend='inductor' raised: AssertionError: constraining non-Symbols NYI, s1 + s2 is <class 'sympy.core.add.Add'> ``` # Context I ran into the following scenario: ``` getitem = ... sym_size_int = torch.ops.aten.sym_size.int(getitem, 0) # this is u0 = s0 + s1 _check_is_size = torch._check_is_size(sym_size_int) # we fail at this guy sym_constrain_range_default = torch.ops.aten.sym_constrain_range.default(sym_size_int, min = 4, max = 1234) # runtime assertion add = sym_size_int + sym_size_int_1 eq = add == sym_size_int _assert_scalar_default = torch.ops.aten._assert_scalar(eq, "Runtime assertion failed for expression Eq(s0 + s1, u0) on node 'eq'") ``` everything but getitem was inserted into the FX graph by insert_deferred_runtime_asserts() `7e4329c258/torch/fx/passes/runtime_assert.py (L38-L52)` In the above scenario, we fail trying to constrain the range on `s0 + s1`, which is not a `sympy.Symbol`. And why exactly are we constraining the range on `s0 + s1`? Because it's the replacement for `u0`. # Approach Whenever we try to constrain the range on the replacement of ~~an unbacked symint~~ a non-symbol, just ignore it. 
In the scenario above, we'll be okay to ignore it because whenever there's a replacement on an unbacked symint, we will update its range. Hence, no need to constrain the range on `s1 + s1`. We can confirm this with `TORCH_LOGS="+dynamic"`. ``` torch/fx/experimental/symbolic_shapes.py:4737: _update_var_to_range u0 = VR[4, 198] (update) torch/fx/experimental/symbolic_shapes.py:4856: set_replacement u0 = s1 + s2 (trivial_lhs) VR[4, 198] ``` `600bf978ba/torch/fx/experimental/symbolic_shapes.py (L4759-L4764)` Differential Revision: [D59257079](https://our.internmc.facebook.com/intern/diff/D59257079) Pull Request resolved: https://github.com/pytorch/pytorch/pull/129907 Approved by: https://github.com/jingsh	2024-07-02 21:46:40 +00:00
Animesh Jain	e62073d799	[dynamo] Skip FUNCTION_MATCH on method-wrapper objects (#129830 ) Fixes https://github.com/pytorch/pytorch/issues/118563 Pull Request resolved: https://github.com/pytorch/pytorch/pull/129830 Approved by: https://github.com/jansel	2024-06-30 20:21:18 +00:00
Chien-Lin Chen	5e7ac69a67	[Dynamic Shapes] fixed dynamic shape inference (#128807 ) Makes a dynamic dimension that is indirectly bound to an integer constrained. After each ShapeEnv._refine_ranges, check whether the new ValueRange is a singleton; if it is, replace the symbol. Fixes #122307 Pull Request resolved: https://github.com/pytorch/pytorch/pull/128807 Approved by: https://github.com/ezyang	2024-06-27 22:33:32 +00:00
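The singleton check in the commit above can be sketched with plain dictionaries. This is a toy model under stated assumptions: `refine_range`, `ranges`, and `replacements` are hypothetical names, and ranges are closed integer intervals; it is not the `ShapeEnv._refine_ranges` implementation.

```python
def refine_range(ranges, replacements, symbol, lower=None, upper=None):
    """Tighten a symbol's range; if it collapses to one value, record a replacement."""
    lo, hi = ranges[symbol]
    if lower is not None:
        lo = max(lo, lower)
    if upper is not None:
        hi = min(hi, upper)
    ranges[symbol] = (lo, hi)
    if lo == hi:
        # A singleton range means the symbol can only be this constant.
        replacements[symbol] = lo
    return ranges[symbol]


ranges = {"s0": (2, 100)}
replacements = {}
refine_range(ranges, replacements, "s0", lower=8)   # still a real range
refine_range(ranges, replacements, "s0", upper=8)   # collapses to [8, 8]
print(ranges, replacements)
```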
Brian Hirsh	a4d7aa498b	[Traceable FSDP2] Add auto-functionalize support for mutable list[Tensor] (copy from Brian's PR #127347 ); enable E2E inductor unit test for transformer model (#129502 ) Copy of Brian's PR: https://github.com/pytorch/pytorch/pull/127347 with additional changes to support mutable `List[Tensor]` in Inductor. Also enable E2E inductor unit test for Traceable FSDP2 + transformer model. Test commands: - `pytest -rA test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_trace_fsdp_set_` - `pytest -rA test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_simple_mlp_fullgraph_backend_aot_eager` - `pytest -rA test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_simple_mlp_fullgraph_backend_inductor` - `pytest -rA test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_transformer_fullgraph_backend_aot_eager` - `pytest -rA test/dynamo/test_misc.py::MiscTests::test_auto_functionalize_tensorlist` - `pytest -rA test/inductor/test_torchinductor.py::GPUTests::test_fallback_mutable_op_list_cuda` Pull Request resolved: https://github.com/pytorch/pytorch/pull/129502 Approved by: https://github.com/zou3519	2024-06-27 17:50:57 +00:00
rzou	08b616281f	[custom ops] Switch out references from old landing page to new landing page (#129178 ) Test Plan: - existing tests Pull Request resolved: https://github.com/pytorch/pytorch/pull/129178 Approved by: https://github.com/albanD ghstack dependencies: #129177	2024-06-21 13:31:40 +00:00
Brian Hirsh	8c2542623b	[Traceable FSDP2] [Dynamo] Add tracing support for out-variant custom ops that return None (#129078 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/129078 Approved by: https://github.com/yanboliang	2024-06-20 17:46:13 +00:00
Huy Do	73f5d2b787	Run ET unit tests on PT CI (#128560 ) This is the first PR to add all existing ET unit tests into PT CI. The goal is to improve the coverage there to avoid breaking changes from PT that could break ET. With this, any future unit tests on ET will automatically be run on PT CI. The duration of the job is now 40+ minutes, not too bad. This also fixed the failed ET build in https://github.com/pytorch/pytorch/pull/123043. Adding model coverage is a bit more involved and requires adding new shards, so I will follow up on that in separate PRs. [T192117506](https://www.internalfb.com/intern/tasks/?t=192117506), with the failed diffs D58295865 and D58394154 Pull Request resolved: https://github.com/pytorch/pytorch/pull/128560 Approved by: https://github.com/guangy10, https://github.com/digantdesai	2024-06-19 20:08:58 +00:00
chilli	11ff5345d2	Changed colored logging to only be turned on if printing to interactive terminal (#128874 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/128874 Approved by: https://github.com/anijain2305	2024-06-17 23:53:26 +00:00
Oguz Ulgen	472211c97a	Make assert_size_stride to return all errors (#128764 ) This will help debug some problems I'm encountering, but in general, it is best to show the entire error. Pull Request resolved: https://github.com/pytorch/pytorch/pull/128764 Approved by: https://github.com/jansel	2024-06-15 06:32:40 +00:00
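The "show the entire error" behaviour in the commit above can be sketched as collecting every mismatch before raising. `check_size_stride` is a hypothetical pure-Python helper written for illustration; it is not the real `assert_size_stride` and may differ from it in which dimensions it checks and how messages are worded.

```python
def check_size_stride(actual_size, actual_stride, expected_size, expected_stride):
    """Raise one AssertionError that lists every size and stride mismatch."""
    errors = []
    for dim, (actual, expected) in enumerate(zip(actual_size, expected_size)):
        if actual != expected:
            errors.append(f"size mismatch at index {dim}: expected {expected}, actual {actual}")
    for dim, (actual, expected) in enumerate(zip(actual_stride, expected_stride)):
        if actual != expected:
            errors.append(f"stride mismatch at index {dim}: expected {expected}, actual {actual}")
    if errors:
        raise AssertionError("; ".join(errors))


try:
    check_size_stride((8, 4), (4, 1), (16, 4), (8, 1))
    report = ""
except AssertionError as exc:
    report = str(exc)  # contains both the size and the stride mismatch
print(report)
```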
Yueming Hao	73ba432d32	[custom_op]Fix None return schema (#128667 ) Fixes #125044 If users define a schema that returns `None`, it is parsed to a `torch.NoneType`. Auto functionalization supports `()` as an empty return but not `None`. So a `None` return fails the check for [`can_auto_functionalize`](https://github.com/pytorch/pytorch/blob/findhao/fix_none_return_functionalize/torch/_higher_order_ops/auto_functionalize.py#L71) even though we could treat it as a `()` return. This PR fixes that by skipping the check for a None return. I hope it can be fixed at a [deeper level](`31e44c72ca`), but that fix breaks a lot of existing schemas. So it's better to fix this issue in auto_functionalize.py for now. Pull Request resolved: https://github.com/pytorch/pytorch/pull/128667 Approved by: https://github.com/zou3519	2024-06-15 00:41:37 +00:00
chilli	c486e2ab64	Add coloring to fx graph print out (#128476 ) Note: Won't land immediately, at least I'll need to add a color option to the field. But curious if any tests fail. Old: <img width="1294" alt="image" src="https://github.com/pytorch/pytorch/assets/6355099/c3a750ed-5e54-4621-b2e4-be5481be15b6"> New: <img width="1303" alt="image" src="https://github.com/pytorch/pytorch/assets/6355099/3a1f1adc-6f3a-413e-8b87-ee53da9bf4ed"> Pull Request resolved: https://github.com/pytorch/pytorch/pull/128476 Approved by: https://github.com/ezyang	2024-06-13 23:39:04 +00:00
Animesh Jain	865d7b3424	[Reland][dynamo] Enable some inlining inbuilt nn module tests (#128440 ) Co-authored-by: Laith Sakka <lsakka@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/128440 Approved by: https://github.com/williamwen42, https://github.com/jansel	2024-06-13 22:39:22 +00:00
Edward Z. Yang	2229884102	Introduce int_oo (#127693 ) In a previous life, we used sympy.oo to represent the lower/upper bounds of integer ranges. Later, we changed this to be sys.maxsize - 1 for a few reasons: (1) sometimes we do tests on a value being exactly sys.maxsize, and we wanted to avoid a data dependent guard in this case, (2) sympy.oo corresponds to floating point infinity, so you get incorrect types for value ranges with oo, and (3) you can do slightly better reasoning if you assume that input sizes fall within representable 64-bit integer range. After working in the sys.maxsize regime for a bit, I've concluded that this was actually a bad idea. Specifically, the problem is that you end up with sys.maxsize in your upper bound, and then whenever you do any sort of size-increasing computation like size * 2, you end up with 2 * sys.maxsize, and you end up doing a ton of arbitrary precision int computation that is totally unnecessary. A symbolic bound is better. But especially after #126905, we can't go back to using sympy.oo, because that advertises that it's not an integer, and now your ValueRanges is typed incorrectly. So what do we do? We define a new numeric constant `int_oo`, which is like `sympy.oo` but it advertises `is_integer`. test/test_sympy_utils.py describes some basic properties of the number, and torch/utils/_sympy/numbers.py has the actual implementation. The rest of the changes of the PR are working out the implications of this change. I'll give more commentary as inline comments. Fixes https://github.com/pytorch/pytorch/issues/127396 Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/127693 Approved by: https://github.com/lezcano ghstack dependencies: #126905	2024-06-13 04:08:20 +00:00
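The motivation in the commit above (a symbolic upper bound instead of `sys.maxsize`) can be illustrated with a toy class. `ToyIntInfinity` is a hypothetical pure-Python stand-in that only handles positive integer operands; it is not the `int_oo` implementation in torch/utils/_sympy/numbers.py and does not use sympy.

```python
import sys
from functools import total_ordering


@total_ordering
class ToyIntInfinity:
    """Toy positive integer infinity that advertises itself as an integer."""

    is_integer = True

    def __eq__(self, other):
        return isinstance(other, ToyIntInfinity)

    def __hash__(self):
        return hash("ToyIntInfinity")

    def __lt__(self, other):
        return False  # infinity is never smaller than anything

    def __mul__(self, other):
        # Scaling by a positive int leaves the bound symbolic.
        if isinstance(other, int) and other > 0:
            return self
        return NotImplemented

    __rmul__ = __mul__


TOY_INT_OO = ToyIntInfinity()
doubled_maxsize = sys.maxsize * 2   # grows into a wider arbitrary-precision int
doubled_bound = TOY_INT_OO * 2      # stays the same symbolic object
print(doubled_maxsize.bit_length(), doubled_bound is TOY_INT_OO)
```

With a numeric bound, every size-increasing operation produces a larger integer to carry around; with the symbolic bound, the result is still "infinity".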
rzou	87072dcfdb	Change Dynamo's custom ops warning message to be less spammy (#128456 ) This is a short-term fix (for 2.4). In the longer term we should fix https://github.com/pytorch/pytorch/issues/128430 The problem is that warnings.warn calls inside Dynamo print every time. Python warnings are supposed to print once, unless their cache is reset: Dynamo ends up resetting that cache every time it runs. As a workaround we provide our own warn_once cache that is keyed on the warning msg. I am not worried about this increasing memory usage because that's effectively what Python's warnings.warn cache does. Test Plan: - fix tests. Pull Request resolved: https://github.com/pytorch/pytorch/pull/128456 Approved by: https://github.com/anijain2305	2024-06-12 21:57:12 +00:00
PyTorch MergeBot	5d8c7f39d4	Revert "Introduce int_oo (#127693 )" This reverts commit `9cab5987bd`. Reverted https://github.com/pytorch/pytorch/pull/127693 on behalf of https://github.com/clee2000 due to sorry executorch CI is a bit weird regarding pins, I'll make a chat with mergen with the choices of what to do and how it'll affect executorch CI, reverting for now to prevent more divergences in the meantime ([comment](https://github.com/pytorch/pytorch/pull/127693#issuecomment-2161775400))	2024-06-11 23:36:08 +00:00
Edward Z. Yang	9cab5987bd	Introduce int_oo (#127693 ) In a previous life, we used sympy.oo to represent the lower/upper bounds of integer ranges. Later, we changed this to be sys.maxsize - 1 for a few reasons: (1) sometimes we do tests on a value being exactly sys.maxsize, and we wanted to avoid a data dependent guard in this case, (2) sympy.oo corresponds to floating point infinity, so you get incorrect types for value ranges with oo, and (3) you can do slightly better reasoning if you assume that input sizes fall within representable 64-bit integer range. After working in the sys.maxsize regime for a bit, I've concluded that this was actually a bad idea. Specifically, the problem is that you end up with sys.maxsize in your upper bound, and then whenever you do any sort of size-increasing computation like size * 2, you end up with 2 * sys.maxsize, and you end up doing a ton of arbitrary precision int computation that is totally unnecessary. A symbolic bound is better. But especially after #126905, we can't go back to using sympy.oo, because that advertises that it's not an integer, and now your ValueRanges is typed incorrectly. So what do we do? We define a new numeric constant `int_oo`, which is like `sympy.oo` but it advertises `is_integer`. test/test_sympy_utils.py describes some basic properties of the number, and torch/utils/_sympy/numbers.py has the actual implementation. The rest of the changes of the PR are working out the implications of this change. I'll give more commentary as inline comments. Fixes https://github.com/pytorch/pytorch/issues/127396 Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/127693 Approved by: https://github.com/lezcano ghstack dependencies: #126905	2024-06-10 19:09:53 +00:00
Edward Z. Yang	3964a3ec73	Complete revamp of float/promotion sympy handling (#126905 ) At a high level, the idea behind this PR is: * Make it clearer what the promotion and int/float rules for various Sympy operations are. Operators that previously were polymorphic over int/float are now split into separate operators for clarity. We never do mixed int/float addition/multiplication etc in sympy, instead, we always promote to the appropriate operator. (However, equality is currently not done correctly.) * Enforce strict typing on ValueRanges: if you have a ValueRange for a float, the lower and upper MUST be floats, and so forth for integers. The story begins in torch/utils/_sympy/functions.py. Here, I make some changes to how we represent certain operations in sympy expressions: * FloorDiv now only supports integer inputs; to do float floor division, do a truediv and then a trunc. Additionally, we remove the divide out addition by gcd optimization, because sympy gcd is over fields and is willing to generate rationals (but rationals are bad for ValueRange strict typing). * ModularIndexing, LShift, RShift now assert they are given integer inputs. * Mod only supports integer inputs; eventually we will support FloatMod (left for later work, when we build out Sympy support for floating operations). Unfortunately, I couldn't assert integer inputs here, because of a bad interaction with sympy's inequality solver that is used by the offline solver * TrueDiv is split into FloatTrueDiv and IntTrueDiv. This allows for us to eventually generate accurate code for Python semantics IntTrueDiv, which is written in a special way to preserve precision when the inputs are >= 2**53 beyond what first coercing the integer to floats and then doing true division. Trunc is split to TruncToFloat and TruncToInt. * Round is updated to return a float, not an int, making it consistent with the round op handler in Inductor. To get Python-style conversion to int, we call TruncToInt on the result. 
* RoundDecimal updated to consistently only ever return a float * Add ToFloat for explicit coercion to float (required so we can enforce strict ValueRanges typing) In torch/__init__.py, we modify SymInt and SymFloat to appropriately call into new bindings that route to these refined sympy operations. Also, we modify `torch.sym_min` and `torch.sym_max` to have promotion semantics (if one argument is a float, the return result is always a float), making them inconsistent with builtins.min/max, but possible to do type analysis without runtime information. We also need to introduce some new op handlers in torch/_inductor/ops_handler.py: * `to_int` for truncation to int64, directly corresponding to TruncToInt; this can be implemented by trunc and dtype, but with a dedicated handler it is more convenient for roundtripping in Sympy * `int_truediv` for Python-style integer true division, which has higher precision than casting to floats and then running `truediv` These changes have consequences. First, we need to make some administrative changes: * Actually wire up these Sympy functions from SymInt/SymFloat in torch/fx/experimental/sym_node.py, including the new promotion rules (promote2) * Add support for new Sympy functions in torch/utils/_sympy/interp.py, torch/utils/_sympy/reference.py * In particular, in torch.utils._sympy.reference, we have a strong preference to NOT do nontrivial compute, instead, everything in ops handler should map to a singular sympy function * TODO: I chose to roundtrip mod back to our Mod function, but I think I'm going to have to deal with the C/Python inconsistency this to fix tests here * Add printer support for the Sympy functions in torch/_inductor/codegen/common.py, torch/_inductor/codegen/cpp_utils.py, torch/_inductor/codegen/triton.py. `int_truediv` and mixed precision equality is currently not implemented soundly, so we will lose precision in codegen for large values. 
TODO: The additions here are not exhaustive yet * Update ValueRanges logic to use new sympy functions in torch/utils/_sympy/value_ranges.py. In general, we prefer to use the new Sympy function rather than try to roll things by hand, which is what was done previously for many VR analysis functions. In torch/fx/experimental/symbolic_shapes.py we need to make some symbolic reasoning adjustments: * Avoid generation of rational subexpressions by removing simplification of `x // y` into `floor(x / y)`. This simplification then triggers an addition simplification rule `(x + y) / c --> x / c + y / c` which is bad because x / c is a rational number now * `_assert_bound_is_rational` is no more, we no longer generate rational bounds * Don't intersect non-int value ranges with the `int_range` * Support more sympy Functions for guard SYMPY_INTERP * Assert the type of value range is consistent with the variable type The new asserts uncovered necessary bug fixes: * torch/_inductor/codegen/cpp.py, torch/_inductor/select_algorithm.py, torch/_inductor/sizevars.py - Ensure Wild/Symbol manually allocated in Inductor is marked `is_integer` so it's accepted to build expressions * torch/_inductor/utils.py - make sure you actually pass in sympy.Expr to these functions * torch/_inductor/ir.py - make_contiguous_strides_for takes int/SymInt, not sympy.Expr! * torch/export/dynamic_shapes.py - don't use infinity to represent int ranges, instead use sys.maxsize - 1 Because of the removal of some symbolic reasoning that produced rationals, some of our symbolic reasoning has gotten worse and we are unable to simplify some guards. Check the TODO at test/test_proxy_tensor.py Reland notes. 
This requires this internal fbcode diff https://www.internalfb.com/phabricator/paste/view/P1403322587 but I cannot prepare the diff codev due to https://fb.workplace.com/groups/osssupport/posts/26343544518600814/ It also requires this Executorch PR https://github.com/pytorch/executorch/pull/3911 but the ET PR can be landed prior to this landing. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/126905 Approved by: https://github.com/xadupre, https://github.com/lezcano	2024-06-09 06:20:25 +00:00
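The precision remark about IntTrueDiv in the message above can be seen in plain Python, without sympy or torch. The sketch below is only an illustration: the values `a` and `b` are chosen here for the example and do not come from the PR. It compares Python's integer true division, which rounds the exact quotient once, with coercing both operands to float first, which rounds twice.

```python
# Illustrative values (not from the PR): the exact quotient a / b is 2**53 + 1,
# which is not representable as a float.
a = 3 * (2**53 + 1)
b = 3

# Python-style integer true division: the exact rational quotient is rounded once.
exact = a / b

# Coerce-then-divide: `a` is rounded to a float first, then the quotient is rounded again.
coerced = float(a) / float(b)

print(exact, coerced, exact == coerced)
```

The two results differ by one float spacing at this magnitude, which is the kind of loss the message says codegen can still incur for large values.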
Edward Z. Yang	73d6ec2db6	Increase verbosity of FX graph dumps (#128042 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/128042 Approved by: https://github.com/aorenste	2024-06-08 07:24:58 +00:00
PyTorch MergeBot	ac51f782fe	Revert "Complete revamp of float/promotion sympy handling (#126905 )" This reverts commit `2f7cfecd86`. Reverted https://github.com/pytorch/pytorch/pull/126905 on behalf of https://github.com/atalman due to Sorry need to revert - failing internally ([comment](https://github.com/pytorch/pytorch/pull/126905#issuecomment-2155118778))	2024-06-07 16:01:46 +00:00
PyTorch MergeBot	224b4339e5	Revert "Make ValueRange repr less chatty by default (#128043 )" This reverts commit `f0dd11df55`. Reverted https://github.com/pytorch/pytorch/pull/128043 on behalf of https://github.com/atalman due to Sorry reverting because in conflict with [#126905](https://github.com/pytorch/pytorch/pull/126905) which needs to be reverted ([comment](https://github.com/pytorch/pytorch/pull/128043#issuecomment-2155091732))	2024-06-07 15:43:39 +00:00
Edward Z. Yang	f0dd11df55	Make ValueRange repr less chatty by default (#128043 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/128043 Approved by: https://github.com/lezcano	2024-06-06 16:42:48 +00:00
Edward Z. Yang	2f7cfecd86	Complete revamp of float/promotion sympy handling (#126905 ) At a high level, the idea behind this PR is: * Make it clearer what the promotion and int/float rules for various Sympy operations are. Operators that previously were polymorphic over int/float are now split into separate operators for clarity. We never do mixed int/float addition/multiplication etc in sympy, instead, we always promote to the appropriate operator. (However, equality is currently not done correctly.) * Enforce strict typing on ValueRanges: if you have a ValueRange for a float, the lower and upper MUST be floats, and so forth for integers. The story begins in torch/utils/_sympy/functions.py. Here, I make some changes to how we represent certain operations in sympy expressions: * FloorDiv now only supports integer inputs; to do float floor division, do a truediv and then a trunc. Additionally, we remove the divide out addition by gcd optimization, because sympy gcd is over fields and is willing to generate rationals (but rationals are bad for ValueRange strict typing). * ModularIndexing, LShift, RShift now assert they are given integer inputs. * Mod only supports integer inputs; eventually we will support FloatMod (left for later work, when we build out Sympy support for floating operations). Unfortunately, I couldn't assert integer inputs here, because of a bad interaction with sympy's inequality solver that is used by the offline solver * TrueDiv is split into FloatTrueDiv and IntTrueDiv. This allows for us to eventually generate accurate code for Python semantics IntTrueDiv, which is written in a special way to preserve precision when the inputs are >= 2**53 beyond what first coercing the integer to floats and then doing true division. Trunc is split to TruncToFloat and TruncToInt. * Round is updated to return a float, not an int, making it consistent with the round op handler in Inductor. To get Python-style conversion to int, we call TruncToInt on the result.
* RoundDecimal updated to consistently only ever return a float * Add ToFloat for explicit coercion to float (required so we can enforce strict ValueRanges typing) In torch/__init__.py, we modify SymInt and SymFloat to appropriately call into new bindings that route to these refined sympy operations. Also, we modify `torch.sym_min` and `torch.sym_max` to have promotion semantics (if one argument is a float, the return result is always a float), making them inconsistent with builtins.min/max, but possible to do type analysis without runtime information. We also need to introduce some new op handlers in torch/_inductor/ops_handler.py: * `to_int` for truncation to int64, directly corresponding to TruncToInt; this can be implemented by trunc and dtype, but with a dedicated handler it is more convenient for roundtripping in Sympy * `int_truediv` for Python-style integer true division, which has higher precision than casting to floats and then running `truediv` These changes have consequences. First, we need to make some administrative changes: * Actually wire up these Sympy functions from SymInt/SymFloat in torch/fx/experimental/sym_node.py, including the new promotion rules (promote2) * Add support for new Sympy functions in torch/utils/_sympy/interp.py, torch/utils/_sympy/reference.py * In particular, in torch.utils._sympy.reference, we have a strong preference to NOT do nontrivial compute, instead, everything in ops handler should map to a singular sympy function * TODO: I chose to roundtrip mod back to our Mod function, but I think I'm going to have to deal with the C/Python inconsistency this to fix tests here * Add printer support for the Sympy functions in torch/_inductor/codegen/common.py, torch/_inductor/codegen/cpp_utils.py, torch/_inductor/codegen/triton.py. `int_truediv` and mixed precision equality is currently not implemented soundly, so we will lose precision in codegen for large values. 
TODO: The additions here are not exhaustive yet * Update ValueRanges logic to use new sympy functions in torch/utils/_sympy/value_ranges.py. In general, we prefer to use the new Sympy function rather than try to roll things by hand, which is what was done previously for many VR analysis functions. In torch/fx/experimental/symbolic_shapes.py we need to make some symbolic reasoning adjustments: * Avoid generation of rational subexpressions by removing simplification of `x // y` into `floor(x / y)`. This simplification then triggers an addition simplification rule `(x + y) / c --> x / c + y / c` which is bad because x / c is a rational number now * `_assert_bound_is_rational` is no more, we no longer generate rational bounds * Don't intersect non-int value ranges with the `int_range` * Support more sympy Functions for guard SYMPY_INTERP * Assert the type of value range is consistent with the variable type The new asserts uncovered necessary bug fixes: * torch/_inductor/codegen/cpp.py, torch/_inductor/select_algorithm.py, torch/_inductor/sizevars.py - Ensure Wild/Symbol manually allocated in Inductor is marked `is_integer` so it's accepted to build expressions * torch/_inductor/utils.py - make sure you actually pass in sympy.Expr to these functions * torch/_inductor/ir.py - make_contiguous_strides_for takes int/SymInt, not sympy.Expr! * torch/export/dynamic_shapes.py - don't use infinity to represent int ranges, instead use sys.maxsize - 1 Because of the removal of some symbolic reasoning that produced rationals, some of our symbolic reasoning has gotten worse and we are unable to simplify some guards. Check the TODO at test/test_proxy_tensor.py Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/126905 Approved by: https://github.com/xadupre, https://github.com/lezcano	2024-06-06 02:29:45 +00:00
PyTorch MergeBot	d5cb5d623a	Revert "Complete revamp of float/promotion sympy handling (#126905 )" This reverts commit `fb696ef3aa`. Reverted https://github.com/pytorch/pytorch/pull/126905 on behalf of https://github.com/ezyang due to internal user reported ceiling equality simplification problem, I have a plan ([comment](https://github.com/pytorch/pytorch/pull/126905#issuecomment-2148805840))	2024-06-05 03:57:58 +00:00
Edward Z. Yang	fb696ef3aa	Complete revamp of float/promotion sympy handling (#126905 ) At a high level, the idea behind this PR is: * Make it clearer what the promotion and int/float rules for various Sympy operations are. Operators that previously were polymorphic over int/float are now split into separate operators for clarity. We never do mixed int/float addition/multiplication etc in sympy, instead, we always promote to the appropriate operator. (However, equality is currently not done correctly.) * Enforce strict typing on ValueRanges: if you have a ValueRange for a float, the lower and upper MUST be floats, and so forth for integers. The story begins in torch/utils/_sympy/functions.py. Here, I make some changes to how we represent certain operations in sympy expressions: * FloorDiv now only supports integer inputs; to do float floor division, do a truediv and then a trunc. Additionally, we remove the divide out addition by gcd optimization, because sympy gcd is over fields and is willing to generate rationals (but rationals are bad for ValueRange strict typing). * ModularIndexing, LShift, RShift now assert they are given integer inputs. * Mod only supports integer inputs; eventually we will support FloatMod (left for later work, when we build out Sympy support for floating operations). Unfortunately, I couldn't assert integer inputs here, because of a bad interaction with sympy's inequality solver that is used by the offline solver * TrueDiv is split into FloatTrueDiv and IntTrueDiv. This allows for us to eventually generate accurate code for Python semantics IntTrueDiv, which is written in a special way to preserve precision when the inputs are >= 2**53 beyond what first coercing the integer to floats and then doing true division. Trunc is split to TruncToFloat and TruncToInt. * Round is updated to return a float, not an int, making it consistent with the round op handler in Inductor. To get Python-style conversion to int, we call TruncToInt on the result.
* RoundDecimal updated to consistently only ever return a float * Add ToFloat for explicit coercion to float (required so we can enforce strict ValueRanges typing) In torch/__init__.py, we modify SymInt and SymFloat to appropriately call into new bindings that route to these refined sympy operations. Also, we modify `torch.sym_min` and `torch.sym_max` to have promotion semantics (if one argument is a float, the return result is always a float), making them inconsistent with builtins.min/max, but possible to do type analysis without runtime information. We also need to introduce some new op handlers in torch/_inductor/ops_handler.py: * `to_int` for truncation to int64, directly corresponding to TruncToInt; this can be implemented by trunc and dtype, but with a dedicated handler it is more convenient for roundtripping in Sympy * `int_truediv` for Python-style integer true division, which has higher precision than casting to floats and then running `truediv` These changes have consequences. First, we need to make some administrative changes: * Actually wire up these Sympy functions from SymInt/SymFloat in torch/fx/experimental/sym_node.py, including the new promotion rules (promote2) * Add support for new Sympy functions in torch/utils/_sympy/interp.py, torch/utils/_sympy/reference.py * In particular, in torch.utils._sympy.reference, we have a strong preference to NOT do nontrivial compute, instead, everything in ops handler should map to a singular sympy function * TODO: I chose to roundtrip mod back to our Mod function, but I think I'm going to have to deal with the C/Python inconsistency this to fix tests here * Add printer support for the Sympy functions in torch/_inductor/codegen/common.py, torch/_inductor/codegen/cpp_utils.py, torch/_inductor/codegen/triton.py. `int_truediv` and mixed precision equality is currently not implemented soundly, so we will lose precision in codegen for large values. 
TODO: The additions here are not exhaustive yet * Update ValueRanges logic to use new sympy functions in torch/utils/_sympy/value_ranges.py. In general, we prefer to use the new Sympy function rather than try to roll things by hand, which is what was done previously for many VR analysis functions. In torch/fx/experimental/symbolic_shapes.py we need to make some symbolic reasoning adjustments: * Avoid generation of rational subexpressions by removing simplification of `x // y` into `floor(x / y)`. This simplification then triggers an addition simplification rule `(x + y) / c --> x / c + y / c` which is bad because x / c is a rational number now * `_assert_bound_is_rational` is no more, we no longer generate rational bounds * Don't intersect non-int value ranges with the `int_range` * Support more sympy Functions for guard SYMPY_INTERP * Assert the type of value range is consistent with the variable type The new asserts uncovered necessary bug fixes: * torch/_inductor/codegen/cpp.py, torch/_inductor/select_algorithm.py, torch/_inductor/sizevars.py - Ensure Wild/Symbol manually allocated in Inductor is marked `is_integer` so it's accepted to build expressions * torch/_inductor/utils.py - make sure you actually pass in sympy.Expr to these functions * torch/_inductor/ir.py - make_contiguous_strides_for takes int/SymInt, not sympy.Expr! * torch/export/dynamic_shapes.py - don't use infinity to represent int ranges, instead use sys.maxsize - 1 Because of the removal of some symbolic reasoning that produced rationals, some of our symbolic reasoning has gotten worse and we are unable to simplify some guards. Check the TODO at test/test_proxy_tensor.py Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/126905 Approved by: https://github.com/xadupre, https://github.com/lezcano	2024-06-04 11:47:32 +00:00
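The promotion semantics described for `torch.sym_min` and `torch.sym_max` in the message above (a float argument always yields a float result) can be contrasted with `builtins.max` in a small pure-Python sketch. `promoting_max` is a hypothetical stand-in written for this illustration; it is not the PyTorch implementation and ignores symbolic values entirely.

```python
import builtins


def promoting_max(a, b):
    """Hypothetical stand-in: the result is a float if either argument is a float."""
    if isinstance(a, float) or isinstance(b, float):
        # Promote both sides first, so the result type depends only on the input types.
        return builtins.max(float(a), float(b))
    return builtins.max(a, b)


# builtins.max returns whichever argument wins, so its type depends on runtime values.
print(type(builtins.max(2, 1.5)).__name__)

# With promotion, the result type is known from the argument types alone.
print(type(promoting_max(2, 1.5)).__name__)
```

This is the trade-off the message names: the result no longer matches `builtins.max`, but its type can be determined without runtime information.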
Michael Lazos	2129903aa3	Properly detect nested torch function args (#127496 ) Dynamo was not detecting nested torch function classes in containers. This was due to pytree compatibility for variable trackers being removed. Fixes https://github.com/pytorch/pytorch/issues/127174 Pull Request resolved: https://github.com/pytorch/pytorch/pull/127496 Approved by: https://github.com/anijain2305	2024-06-02 03:43:22 +00:00
Animesh Jain	efcea2d2fd	[dynamo] Support __getitem__ on NNModuleVariable __dict__ (#126956 ) Moves further along (but still fails) for the testcase in https://github.com/pytorch/pytorch/pull/126875 Pull Request resolved: https://github.com/pytorch/pytorch/pull/126956 Approved by: https://github.com/jansel ghstack dependencies: #126923	2024-06-01 15:22:45 +00:00
Animesh Jain	4aa7a1efcf	[dynamo] Initial exception handling support (#126923 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/126923 Approved by: https://github.com/williamwen42, https://github.com/jansel	2024-06-01 13:00:32 +00:00
rzou	ffe506e853	Better graph break msg (and warning) on Dynamo x Python C++ extension (#127301 ) Dynamo graph breaks on Python C/C++ extensions (e.g. pybinded functions). The usual way to handle this is to turn those extensions into custom ops. This PR adds a nicer graph break message and also changes it to unconditionally warn on this graph break (because graph break messages are usually not visible). Fixes https://github.com/pytorch/pytorch/issues/126799 Test Plan: - new test Pull Request resolved: https://github.com/pytorch/pytorch/pull/127301 Approved by: https://github.com/jansel ghstack dependencies: #127291, #127292, #127400, #127423	2024-05-30 14:54:29 +00:00
laithsakka	5196ef1b59	support builtin id function on user defined object variables. (#127146 ) Fix: https://github.com/pytorch/pytorch/pull/127146 Pull Request resolved: https://github.com/pytorch/pytorch/pull/127146 Approved by: https://github.com/anijain2305 ghstack dependencies: #126444	2024-05-29 19:00:37 +00:00
Pian Pawakapan	8a31c2aa84	[export] allow complex guards as runtime asserts (#127129 ) With the current state of export's dynamic shapes, we struggle with guards and constraints that are beyond the current dynamic shapes language, expressed with dims and derived dims. While we can compile and guarantee correctness for guards within the current language (e.g. min/max ranges, linear relationships, integer divisibility) we struggle to dynamically compile guards which extend beyond that. For these "complex" guards, we typically do either of the following: 1) raise a constraint violation error, along the lines of "not all values of <symbol> in the specified range satisfy <guard>", with or without suggested fixes, 2) specialize to the provided static values and suggest removing dynamism, or 3) fail compilation due to some arbitrary unsupported case. Previous [work](https://github.com/pytorch/pytorch/pull/124949) went towards resolving this by disabling forced specializations, instead allowing the user to fail at runtime with incorrect inputs. In this PR, relying on [hybrid backed-unbacked symints](https://github.com/pytorch/pytorch/issues/121749), [deferred runtime asserts](https://github.com/pytorch/pytorch/blob/main/torch/fx/passes/runtime_assert.py), and the function [_is_supported_equivalence()](`d7de4c9d80/torch/fx/experimental/symbolic_shapes.py (L1824)`), we add a flag `_allow_complex_guards_as_runtime_asserts` which allows the user to compile exported programs containing these guards and maintain dynamism, while adding correctness checks as runtime assertions in the graph. Hybrid backed-unbacked symints allow us to easily bypass "implicit" guards emitted from computation - guards that we ~expect to be true. 
Popular examples revolve around reshapes:
```
# reshape
def forward(self, x, y):  # x: [s0, s1], y: [s2]
    return x.reshape([-1]) + y  # guard s0 * s1 = s2

# This leads to the following exported program

class GraphModule(torch.nn.Module):
    def forward(self, x: "f32[s0, s1]", y: "f32[s2]"):
        sym_size_int: "Sym(s2)" = torch.ops.aten.sym_size.int(y, 0)
        mul: "Sym(-s2)" = -1 * sym_size_int;  sym_size_int = None
        sym_size_int_1: "Sym(s0)" = torch.ops.aten.sym_size.int(x, 0)
        sym_size_int_2: "Sym(s1)" = torch.ops.aten.sym_size.int(x, 1)
        mul_1: "Sym(s0*s1)" = sym_size_int_1 * sym_size_int_2;  sym_size_int_1 = sym_size_int_2 = None
        add: "Sym(s0*s1 - s2)" = mul + mul_1;  mul = mul_1 = None
        eq: "Sym(Eq(s0*s1 - s2, 0))" = add == 0;  add = None
        _assert_scalar = torch.ops.aten._assert_scalar.default(eq, "Runtime assertion failed for expression Eq(s0*s1 - s2, 0) on node 'eq'");  eq = None
        view: "f32[s0*s1]" = torch.ops.aten.view.default(x, [-1]);  x = None
        add_1: "f32[s0*s1]" = torch.ops.aten.add.Tensor(view, y);  view = y = None
        return (add_1,)
```
Another case is symbol divisibility:
```
def forward(self, x):  # x: [s0, s1]
    return x.reshape([-1, x.shape[0] - 1])  # Eq(Mod(s0*s1, s0 - 1), 0)
```
Applying deferred runtime asserts also helps dynamic compilation for "explicit" complex guards that typically cause problems for export. For example we can generate runtime asserts for not-equal guards, and complex conditions like the following:
```
class Foo(torch.nn.Module):
    def forward(self, x, y):
        # check that negation of first guard also shows up as runtime assertion
        if x.shape[0] == y.shape[0]:  # False
            return x + y
        elif x.shape[0] == y.shape[0] ** 3:  # False
            return x + 2, y + 3
        elif x.shape[0] ** 2 == y.shape[0] * 3:  # True
            return x * 2.0, y * 3.0
```
For the above graph we will generate 3 runtime assertions: the negation of the first 2, and the 3rd condition as a guard.
One additional benefit here over the current state of exported programs is that this adds further correctness guarantees - previously with explicit complex guards, if compilation succeeded, the guards would be ignored at runtime, treated as given. As shown above, the runtime asserts appear as math ops in the graph, generated by the sympy interpreter, resulting in an _assert_scalar call. There is an option to avoid adding these asserts into the graph, by setting `TORCH_DYNAMO_DO_NOT_EMIT_RUNTIME_ASSERTS=1`. This results in the "original" computation graph, with dynamism, and any incorrect inputs will fail on ops during runtime. Further work could go into prettifying the printer, so the majority of the graph isn't guard-related. Ideally this PR would subsume and remove the recently added [_disable_forced_specializations](https://github.com/pytorch/pytorch/pull/124949) flag, but that flag still handles one additional case of specialization: single-variable equalities where the symbol is solvable for a concrete value: see this [PR](https://github.com/pytorch/pytorch/pull/126925) This PR doesn't change any behavior around data-dependent errors/unbacked symints yet, that could be further work. NOTE: will take naming change suggestions for the flag :) Pull Request resolved: https://github.com/pytorch/pytorch/pull/127129 Approved by: https://github.com/avikchaudhuri	2024-05-29 17:15:25 +00:00
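To make the reshape guard from the message above concrete, here is a minimal plain-Python analogue of the runtime assertion, operating on shape tuples rather than tensors. The function name `reshape_add_output_shape` is hypothetical and the sketch does not use `torch.export`; it only mirrors the arithmetic that the `_assert_scalar` node checks for `x.reshape([-1]) + y`.

```python
def reshape_add_output_shape(x_shape, y_shape):
    """Hypothetical helper: check the guard s0 * s1 == s2, then return the result shape."""
    s0, s1 = x_shape
    (s2,) = y_shape
    # Same expression the exported graph evaluates before calling _assert_scalar.
    if s0 * s1 - s2 != 0:
        raise AssertionError(
            f"Runtime assertion failed for expression Eq(s0*s1 - s2, 0): "
            f"s0={s0}, s1={s1}, s2={s2}"
        )
    # x.reshape([-1]) has s0 * s1 elements, and adding y keeps that shape.
    return (s0 * s1,)


print(reshape_add_output_shape((2, 3), (6,)))

try:
    reshape_add_output_shape((2, 3), (5,))
except AssertionError as err:
    print(err)
```

The point of the sketch is that the check is deferred: inputs that violate the guard are rejected when the program runs rather than when it is compiled.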
William Wen	719589c9bf	[dynamo] move bytecode tests from test_misc to new bytecode test file (#127329 ) Also merge with bytecode hook test. Pull Request resolved: https://github.com/pytorch/pytorch/pull/127329 Approved by: https://github.com/yanboliang, https://github.com/jansel	2024-05-29 06:10:59 +00:00
Xuehai Pan	26f4f10ac8	[5/N][Easy] fix typo for `usort` config in `pyproject.toml` (`kown` -> `known`): sort torch (#127126 ) The `usort` config in `pyproject.toml` has no effect due to a typo. Fixing the typo makes `usort` do more and generates the changes in the PR. Except for `pyproject.toml`, all changes are generated by `lintrunner -a --take UFMT --all-files`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/127126 Approved by: https://github.com/kit1980	2024-05-27 14:49:57 +00:00
PyTorch MergeBot	55c0ab2887	Revert "[5/N][Easy] fix typo for `usort` config in `pyproject.toml` (`kown` -> `known`): sort torch (#127126 )" This reverts commit `7763c83af6`. Reverted https://github.com/pytorch/pytorch/pull/127126 on behalf of https://github.com/XuehaiPan due to Broken CI ([comment](https://github.com/pytorch/pytorch/pull/127126#issuecomment-2133044286))	2024-05-27 09:22:08 +00:00
Xuehai Pan	7763c83af6	[5/N][Easy] fix typo for `usort` config in `pyproject.toml` (`kown` -> `known`): sort torch (#127126 ) The `usort` config in `pyproject.toml` has no effect due to a typo. Fixing the typo makes `usort` do more and generates the changes in the PR. Except for `pyproject.toml`, all changes are generated by `lintrunner -a --take UFMT --all-files`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/127126 Approved by: https://github.com/kit1980 ghstack dependencies: #127122, #127123, #127124, #127125	2024-05-27 04:22:18 +00:00
Animesh Jain	f0366de414	[dynamo] Support __contains__ on obj.__dict__ (#126922 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/126922 Approved by: https://github.com/jansel, https://github.com/yanboliang	2024-05-23 09:01:29 +00:00
laithsakka	b0e849870e	Change error message when nn module inlining is enabled for MiscTests.test_map_side_effects (#126444 ) #fix https://github.com/pytorch/pytorch/issues/126355 Pull Request resolved: https://github.com/pytorch/pytorch/pull/126444 Approved by: https://github.com/anijain2305	2024-05-22 23:24:03 +00:00
Peter Bell	51c07f9f69	[dynamo] Allow asserts to fail (#126661 ) Currently if an assertion is statically known to be false, dynamo converts it to `_assert_async` which inductor currently ignores. Instead this graph breaks to raise the original assertion. Pull Request resolved: https://github.com/pytorch/pytorch/pull/126661 Approved by: https://github.com/ezyang	2024-05-21 02:42:13 +00:00
Animesh Jain	7aa068f350	[dynamo][inline-inbuilt-nn-modules] Change test to not depend on id of mod instance (#126314 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/126314 Approved by: https://github.com/williamwen42 ghstack dependencies: #126303, #126316	2024-05-16 01:35:09 +00:00
Edward Z. Yang	534ddfa619	Move compute unbacked bindings call to track_tensor_tree (#126168 ) This ensures we hit it in all the HOP proxy tensor implementations Fixes https://github.com/pytorch/pytorch/issues/125869 Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/126168 Approved by: https://github.com/ydwu4	2024-05-14 21:05:05 +00:00
Edward Z. Yang	db3b38202b	Improve dead code elimination of unnecessary int arguments (#126074 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/126074 Approved by: https://github.com/lezcano ghstack dependencies: #125325, #125915	2024-05-14 17:22:30 +00:00
Edward Z. Yang	2ba102f689	Implement native support for float inputs in Dynamo and ShapeEnv (#125325 ) The big idea is that floats are treated as Tensors on input/output to the FX graph, but on the inside, we immediately call item() on the synthetic Tensor and record regular float operations on it. Canonicalization to Tensor operations will happen in a standalone FX pass. This behavior is enabled by setting the `specialize_float` config variable to False. The generated graph looks like this for the test `test_unspec_float_output`:
```
def forward(self, L_x_: "f32[3]", L_y_: "f32[]"):
    l_x_ = L_x_
    l_y_ = L_y_

    # File: /data/users/ezyang/a/pytorch/test/dynamo/test_unspec.py:511 in f, code: return x + 1, y * 2
    add: "f32[3]" = l_x_ + 1;  l_x_ = None
    item: "Sym(zf0)" = l_y_.item();  l_y_ = None
    mul: "Sym(2*zf0)" = item * 2;  item = None
    scalar_tensor: "f32[]" = torch.scalar_tensor(mul);  mul = None
    return (add, scalar_tensor)
```
The ingredients: * torch/_dynamo/variables/builder.py When `specialize_float` is False, we wrap float literals with `wrap_symfloat`. This is an unholy mashup of `wrap_symint` and `wrap_unspecialized_primitive`. The overall strategy is that we first generate a tensor argument (because that's what we want to show up in the FX graph), but then immediately call item() on the tensor argument to get a SymNodeVariable, which we will do the rest of the tracing with. Importantly, this SymNodeVariable is backed with the source of the original float: this means we can guard on the resulting value (something we could NOT do with UnspecializedPythonVariable). This has to be done manually, because if you literally call item() on the tensor, you will end up with an unbacked float. There is a bit of copy paste from wrap_symint and wrap_unspecialized_primitive which we can try to factor out, but this really is its own thing and you should review every line of code in the function.
* torch/fx/experimental/symbolic_shapes.py We now can generate guards on float inputs, and these guards are handled inside of ShapeEnv. So we need to be able to allocate (backed!) float symbols, and produce guards for them. Fairly straightforward generalization. * torch/_dynamo/codegen.py I also need to maintain the invariant that there are no float outputs to the FX graph. I chose to do this at codegen time. When we detect a SymNodeVariable on the return stack for a float, we on the fly convert it (via `as_tensor`) to a TensorVariable, which is the true output. We then special case the output bytecode to call item() on it again. The tensor conversion is memoized on SymNodeVariable since we typically run the code generation process twice. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/125325 Approved by: https://github.com/lezcano, https://github.com/jansel	2024-05-14 04:10:01 +00:00
Animesh Jain	a7575e8bd5	[dynamo] Use correct source for custom getattr (#125828 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/125828 Approved by: https://github.com/williamwen42	2024-05-09 20:37:23 +00:00
Edward Z. Yang	1b1d593c8c	Don't call item() into torch.scalar_tensor uselessly (#125373 ) Fixes https://github.com/pytorch/pytorch/issues/125368 Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/125373 Approved by: https://github.com/Skylion007	2024-05-05 22:38:16 +00:00
PyTorch MergeBot	a32ad828dc	Revert "Don't call item() into torch.scalar_tensor uselessly (#125373 )" This reverts commit `2b4fe183db`. Reverted https://github.com/pytorch/pytorch/pull/125373 on behalf of https://github.com/huydhn due to Sorry for reverting your change, but there are real failures on the PR that sneak in during the log classifier outage ([comment](https://github.com/pytorch/pytorch/pull/125373#issuecomment-2094464241))	2024-05-04 22:22:36 +00:00
Animesh Jain	5ba777f46e	[guards][cpp-guards] Optimize NN module getattr guards (#124522 ) Improves the guard overhead of MobileBert model with nn module guards from 92000 units to 20000 units. Pull Request resolved: https://github.com/pytorch/pytorch/pull/124522 Approved by: https://github.com/jansel ghstack dependencies: #125439, #125421	2024-05-04 22:08:56 +00:00
Edward Z. Yang	2b4fe183db	Don't call item() into torch.scalar_tensor uselessly (#125373 ) Fixes https://github.com/pytorch/pytorch/issues/125368 Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/125373 Approved by: https://github.com/Skylion007	2024-05-04 08:07:13 +00:00
Edward Z. Yang	e93b57a570	Add propagate_real_tensors mode for unbacked (#125115 ) A common complaint when working with data-dependent code in PyTorch is that it's hard to tell how far you are from the finish line: every time a GuardOnDataDependentSymNode error is hit, you have to somehow fix or workaround it to see the next one. This PR adds a new mode `torch._functorch.config.fake_tensor_propagate_real_tensors` which modifies fake tensors to also propagate real tensors. This means that when we try to guard on a data-dependent SymNode, we can actually produce a real result. We also produce a warning which you should consult to figure out what the crux points are. I ran this on vision_maskrcnn. In the baseline (without this mode), the model has 27 graph breaks, resulting in 40 graphs. With this mode on, the model has only 11 graph breaks, resulting in 15 graphs (the remaining graph breaks are due to missing functionality for item() on float tensor and some other Dynamo missing features.) You get a list of things that would have errored like this: ``` WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Max(1, u1) < 2) -> True WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Eq(Max(1, u1), 1)) -> True WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Eq(Max(1, u1), 1)) -> True WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Ne(Max(1, u1), 1)) -> False WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Max(1, u0) < 2) -> True WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Eq(Max(1, u0), 1)) -> True WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Eq(Max(1, u0), 1)) -> True WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Ne(Max(1, u0), 1)) -> False 
WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Max(1, u1) < 2) -> True WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Eq(Max(1, u1), 1)) -> True WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Eq(Max(1, u1), 1)) -> True WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Ne(Max(1, u1), 1)) -> False WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Max(1, u0) < 2) -> True WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Eq(Max(1, u0), 1)) -> True WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Eq(Max(1, u0), 1)) -> True WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Ne(Max(1, u0), 1)) -> False WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Max(1, u1) < 2) -> False WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Eq(Max(1, u1), 1)) -> False WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Ne(Max(1, u1), 1)) -> True WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Max(1, u0) < 2) -> False WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Eq(Max(1, u0), 1)) -> False ``` Potential later follow ups: * Improve the warning messages (in particular, should provide user frames) * GC real tensors when they are no longer needed by tracing. Right now, this will use A LOT of memory, equal to as if your GC was broken and every intermediate tensor was kept live Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/125115 Approved by: https://github.com/IvanKobzarev	2024-05-02 15:28:26 +00:00
Avik Chaudhuri	746da8755c	switch tests from constrain_as* to torch._check* (#125253 ) To fix data-dependent errors we want to recommend that people use `torch._check` APIs. The `constrain_as` APIs should be fully subsumed by them, and in the future we should kill them entirely. Differential Revision: D56774333 Pull Request resolved: https://github.com/pytorch/pytorch/pull/125253 Approved by: https://github.com/ezyang	2024-05-01 21:01:27 +00:00
Animesh Jain	37c993546d	[dynamo][guards] Bug fix for set_export_info (#125275 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/125275 Approved by: https://github.com/yanboliang	2024-05-01 03:46:26 +00:00
Sam Larsen	74e8817311	[inductor] Minor fixes to various tests before enabling fx graph caching in OSS by default (#125258 ) Summary: Discovered breakages by enabling codecache by default and doing a CI run. I'll commit these fixes first and eventually enabling caching by default will (hopefully) be a one-liner. Pull Request resolved: https://github.com/pytorch/pytorch/pull/125258 Approved by: https://github.com/eellison	2024-05-01 02:34:01 +00:00
William Wen	d6c713884a	[dynamo, 3.12] xfail refleaking tests due to buggy getattr_static (#125062 ) For tracking https://github.com/pytorch/pytorch/issues/124302 so that we can re-enable the test once 3.12 updates with the bug fix for https://github.com/python/cpython/issues/118013. Pull Request resolved: https://github.com/pytorch/pytorch/pull/125062 Approved by: https://github.com/anijain2305, https://github.com/jansel	2024-04-30 22:40:47 +00:00
Edward Z. Yang	e5e623af4b	Codegen runtime asserts in Inductor (#124874 ) This completely subsumes https://github.com/pytorch/pytorch/pull/120816 This makes use of the unbacked binding machinery to teach Inductor how to generate deferred runtime asserts directly. There is some back story about why I did it this way, let me explain. Previously, our strategy for generating runtime asserts was that Dynamo would insert them into the FX graph after finishing tracing, and we would attempt to code generate them based on the FX graph. This is a good strategy for export, where we immediately export the graph. However, this strategy was afflicted by problems in eager, where we reuse the same ShapeEnv as before. In particular, on subsequent graph passes, we would immediately turn all of these assertions into noops, because when we evaluated their expressions, we would see that because we had a deferred runtime assert in the ShapeEnv, we know "oh, of course this expression is True" already. Oops! So, with this PR, we take the attitude that as long as the ShapeEnv sticks around, the ShapeEnv's list of deferred runtime asserts is the source of truth, and we don't put anything in the graph. So we just need to decide when to actually generate asserts, and the place I picked was Inductor lowering, since we already have an AssertScalar buffer concept, and so I just need to insert them at this point. AssertScalar also uses raw sympy.Expr rather than SymInt/Bool, so it is easier to prevent unrestricted simplification at this point. There are a few things jumbled together in this PR. I can split them if you want, but some of the changes are before I changed my strategy, but they're useful changes anyway. torch/_dynamo/output_graph.py and torch/_inductor/lowering.py - Here, we stop putting deferred runtime asserts in the graph. 
I also have to make sure we don't DCE unused symbol arguments; we're going to get some goofy graph arguments this way, will be good to restore that optimization eventually. We also just disable codegen for `_assert_scalar` entirely; we assume that ShapeEnv will be good enough to capture all of these. torch/_inductor/codegen/wrapper.py and torch/_inductor/ir.py - Add a way to codegen sizevars without forcing simplification torch/_inductor/graph.py - The main logic. Our strategy is to interpose in the same place we are testing that unbacked SymInts are properly showing up in lowered code. The logic is directly analogous to the logic in the existing insert deferred runtime asserts FX pass, but it's simpler because sympy expressions can be directly stored on inductor IR nodes. torch/fx/experimental/symbolic_shapes.py - For extra safety, we have a way of freezing runtime asserts, so that if you try to add more we error. This prevents us from adding runtime asserts after we've done lowering. There's a funny interaction with backwards which there's a comment for in graph.py torch/fx/passes/runtime_assert.py - This is not really needed in this PR, but I rewrote the runtime assert logic to use unbacked_bindings rather than inferring it by looking for unbacked SymInts. Now, keypaths are translated into FX node accessors. Unfortunately, I couldn't delete the old inference code, because you still need it to find backed SymInts from arguments (as this pass may be used on graphs which don't explicitly bind all their shape variables as arguments). There are some new tests exercising this. TODO: I think we need to generate asserts for replacements too. This is a preexisting problem that the old FX pass had too. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/124874 Approved by: https://github.com/jansel ghstack dependencies: #124864	2024-04-29 10:19:29 +00:00
Animesh Jain	0f139b04b3	[dynamo] Fix test (#125107 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/125107 Approved by: https://github.com/jansel ghstack dependencies: #125097	2024-04-28 15:24:17 +00:00
Animesh Jain	e68d65dae2	[dynamo][cpp-guards] Differentiate dict guards wrt guarding on key order (#124779 ) We guard on key order 1) When a key is a non-constant object 2) When we actually need key order - like .values, .items etc For dicts/OrderedDicts that do not require key order guarding, we just rely on usual `GuardManager + DictGetItemGuardAccessor`. This is faster than going through the `list(d.keys())` based design for OrderedDicts. Pull Request resolved: https://github.com/pytorch/pytorch/pull/124779 Approved by: https://github.com/jansel	2024-04-25 08:20:35 +00:00
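The distinction drawn in the commit above, between guarding on key order and guarding only on the keys actually looked up, can be illustrated in plain Python. This is a rough sketch and not Dynamo's implementation: `make_dict_guard` and its `guard_key_order` parameter are hypothetical names invented for the illustration, and the real guards live in C++.

```python
def make_dict_guard(example, guard_key_order):
    """Build a toy guard for a dict seen at trace time (hypothetical helper)."""
    expected_keys = list(example.keys())

    def check(candidate):
        if guard_key_order:
            # Order-sensitive: needed when traced code iterated the dict
            # (for example via .values() or .items()).
            return list(candidate.keys()) == expected_keys
        # Order-insensitive: only require that each recorded key is present,
        # loosely mirroring a per-key getitem check.
        return all(key in candidate for key in expected_keys)

    return check


ordered_guard = make_dict_guard({"a": 1, "b": 2}, guard_key_order=True)
loose_guard = make_dict_guard({"a": 1, "b": 2}, guard_key_order=False)

same_order = {"a": 10, "b": 20}
swapped_order = {"b": 20, "a": 10}
```

Here `ordered_guard` rejects `swapped_order` while `loose_guard` accepts it, which is why skipping the order check is both cheaper and less likely to trigger recompilation when order does not matter.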
Edward Z. Yang	e0e2d897ed	Handle Tensor returns in PropagateUnbackedSymInts (#124297 ) This subsumes https://github.com/pytorch/pytorch/pull/124069 In the original PR, my idea was that when we run PropagateUnbackedSymInts, we check that the sizes before and after are exactly the same. This ended up turning up lots of bugs that I didn't feel like fixing. Separately, Ivan let me know that this pass was quite expensive in terms of compile time, since we spent a lot of time thinking about the equalities. To kill two birds with one stone, we now only check for equality precisely when an unbacked SymInt was bound (thanks to the previous PR in this stack, we now have this information). Specifically, we look to see if `meta["unbacked_bindings"]` is set on the old node, and if it is, we assert the old value is equal to the new value from the repropagation. Note that the pytree key is used to actually extract the new value from the example value, as it may be nested inside an, e.g., tensor size. We do something a bit naughty at the end: we use `defer_runtime_assert` to actually teach ShapeEnv about the equality. This is implementationally equivalent to what we used to do, but we're going to change this later soon. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/124297 Approved by: https://github.com/lezcano ghstack dependencies: #124290	2024-04-24 12:18:33 +00:00
Xuehai Pan	a6f044a490	[dynamo, 3.8-3.9] support dataclass with `frozen=True` in Python 3.8/3.9 (#124393 ) Closes #114966 Frozen field assignment in `__init__` in Python 3.8-3.9: `f5bd65ed37/Lib/dataclasses.py (L402-L411)` ```python import builtins BUILTINS = builtins def _field_assign(frozen, name, value, self_name): # If we're a frozen class, then assign to our fields in __init__ # via object.__setattr__. Otherwise, just use a simple # assignment. # # self_name is what "self" is called in this function: don't # hard-code "self", since that might be a field name. if frozen: return f'BUILTINS.object.__setattr__({self_name},{name!r},{value})' return f'{self_name}.{name}={value}' ``` Frozen field assignment in `__init__` in Python 3.10+: `812245ecce/Lib/dataclasses.py (L436-L445)` ```python __dataclass_builtins_object__ = object def _field_assign(frozen, name, value, self_name): # If we're a frozen class, then assign to our fields in __init__ # via object.__setattr__. Otherwise, just use a simple # assignment. # # self_name is what "self" is called in this function: don't # hard-code "self", since that might be a field name. if frozen: return f'__dataclass_builtins_object__.__setattr__({self_name},{name!r},{value})' return f'{self_name}.{name}={value}' ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/124393 Approved by: https://github.com/jansel	2024-04-19 05:10:33 +00:00
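The quoted `_field_assign` snippets show that a frozen dataclass fills its fields in `__init__` through `object.__setattr__` rather than plain assignment. A minimal standard-library sketch of that behavior follows; the `Point` class is a made-up example and is not taken from the PR.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class Point:
    x: int
    y: int


# The generated __init__ cannot use `self.x = x`, because the frozen class
# overrides __setattr__ to raise; it goes through object.__setattr__ instead.
p = Point(1, 2)

try:
    p.x = 10  # ordinary assignment is rejected on a frozen instance
    assignment_blocked = False
except FrozenInstanceError:
    assignment_blocked = True

# The same low-level call the generated __init__ relies on still works.
object.__setattr__(p, "y", 20)
```

After this runs, `assignment_blocked` is `True`, `p.x` is still `1`, and `p.y` is `20`. A tracer therefore has to recognize the `object.__setattr__` call however the generated code spells it, which differs between Python 3.8/3.9 and 3.10+ as the two snippets above show.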
Animesh Jain	f213f262af	[dynamo][cpp-guards] Improve when to use Dict vs DictSubclassGuardManager (#124237 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/124237 Approved by: https://github.com/jansel, https://github.com/mlazos ghstack dependencies: #124230	2024-04-18 03:33:37 +00:00
William Wen	dca24d70ba	[dynamo, test] remove skip for unhandled exception test (#123876 ) This test might no longer segfault in CI due to changes to how we allocate and free shadow frames in dynamo. Pull Request resolved: https://github.com/pytorch/pytorch/pull/123876 Approved by: https://github.com/jansel	2024-04-18 03:02:34 +00:00
William Wen	812bae09be	[dynamo] fix 3.11+ refleak (#124238 ) Fixes https://github.com/pytorch/pytorch/issues/119607 for 3.11+. In 3.11+, `_PyFrame_FastToLocalsWithError` could implicitly run `COPY_FREE_VARS` on the original frame, leading to double increfs since the dynamo shadow frame can rerun `COPY_FREE_VARS`. So the solution is to skip the first `COPY_FREE_VARS` instruction in the shadow frame if it was already executed in the original frame. Also move the location for clearing the original frame in 3.12 to handle error cases more thoroughly. Pull Request resolved: https://github.com/pytorch/pytorch/pull/124238 Approved by: https://github.com/jansel	2024-04-18 03:02:29 +00:00

796 Commits