pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-06 12:20:52 +01:00

Author	SHA1	Message	Date
Laith Sakka	ed313a5ca2	Introduce torch.sym_add, variadic add (#138660 ) Tested internally here: https://www.internalfb.com/diff/D64057744 This is a reland after previous internal failures. main change is ``` if min is None and max is None: torch._check_is_size(size) return ``` Partially addresses https://github.com/pytorch/pytorch/issues/128150 When you have big sums of values, we end up computing long chains of binary addition in our FX graph representation. Not only is this ugly, it also is quadratic, as the sympy.Add constructor is O(N) in number of arguments. Instead, ensure that we maintain the summation as a single FX node so we can do the entire addition all in one go. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/138660 Approved by: https://github.com/ezyang, https://github.com/bobrenjc93	2024-10-23 17:42:41 +00:00
David Berard	bb2e090b7d	[user triton] typing triton_kernel_wrap.py (#138230 ) Remove `# mypy: allow-untyped-defs` from triton_kernel_wrap.py, and fixed all the mypy errors. Pull Request resolved: https://github.com/pytorch/pytorch/pull/138230 Approved by: https://github.com/oulgen, https://github.com/Skylion007	2024-10-21 20:34:49 +00:00
Tom Ritchford	8ad191ae21	[dynamo] Replace __str__ with __repr__ in some places (#136316 ) ## The problem In a typical debugger, `repr()` is used to display variables and not `str()`. Several classes in Dynamo have a `__str__()` method that returns useful information and a `__repr__()` that does not. Having to call `str(x)` or `[str(i) for i in x]` in the debugger all the time is a chore. `str()` should be ["informal, nicely printable"](https://docs.python.org/3/library/stdtypes.html#str) and `repr()` should ["attempt to return a string that would yield an object with the same value when passed to eval()](https://docs.python.org/3/library/functions.html#repr)". ## The solution In the Python object model, if there is no `__str__` method, `__repr__` is used instead (but not the other way around). So renaming `__str__` to `__repr__` in a few cases where no `__repr__` method exists now should not change observable behavior, and should make debugging easier. The specific classes changed were all in `torch._dynamo.variables`: * `builtin.BuiltinVariable` * `constant.ConstantVariable` * `constant.EnumVariable` * `functions.UserMethodVariable` * `lazy.LazyVariableTracker` * `lazy.LazySymNodeFormatString` * `misc.GetAttrVariable` * `misc.NullVariable` * `user_defined.UserDefinedObjectVariable` Pull Request resolved: https://github.com/pytorch/pytorch/pull/136316 Approved by: https://github.com/XuehaiPan, https://github.com/jansel	2024-10-21 19:50:38 +00:00
PyTorch MergeBot	e8b1409dcf	Revert "[user triton] typing triton_kernel_wrap.py (#138230 )" This reverts commit `2f61b69603`. Reverted https://github.com/pytorch/pytorch/pull/138230 on behalf of https://github.com/wdvr due to Reverting this, as it started failing tests on main ([comment](https://github.com/pytorch/pytorch/pull/138230#issuecomment-2423354596))	2024-10-18 23:12:29 +00:00
David Berard	2f61b69603	[user triton] typing triton_kernel_wrap.py (#138230 ) Remove `# mypy: allow-untyped-defs` from triton_kernel_wrap.py, and fixed all the mypy errors. Pull Request resolved: https://github.com/pytorch/pytorch/pull/138230 Approved by: https://github.com/oulgen, https://github.com/Skylion007	2024-10-18 19:29:31 +00:00
Tom Ritchford	e1c4548441	[dynamo] Simplify creation of VariableTrackers (#135714 ) ## `VariableTracker::build()` hides the Builders ### The problem In the current code, creating a `VariableTracker` involves choosing one of two `Builder` classes and either calling a method, or calling a constructor that creates an object that you immediately call, [like this](`083c9149b7/torch/_dynamo/variables/functions.py (L761-L768)`). Variations on this code are repeated in many places. More, the `Builder` classes have a lot of dependencies, so they have to be loaded late in the whole import process to avoid circular imports, so they end up being repeatedly imported at local scope. ### The solution In this commit, the import from `builder` and the logic of choosing and calling the Builder class are hidden in a single static factory method, `VariableTracker.build()`, easier to reason about and to import. This commit net lowers the total lines of code by over 150 lines by removing repetitive logic and unnecessary local imports. CHANGES: Originally the name of the static method was `VariableTracker.create()` but a static method on a derived class, `LazyVariableTracker.create()` now exists with a different signature that's irreconcilable, so the new static method was renamed to `VariableTracker.build()`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/135714 Approved by: https://github.com/jansel	2024-10-18 09:36:46 +00:00
Adnan Akhundov	809ff3b274	Add host-side Triton TMA support to Dynamo (#137677 ) This adds Dynamo tracing support for the host-side Triton TMA API (see `create_2d_tma_descriptor` calls on the host in the [Triton tutorial](https://triton-lang.org/main/getting-started/tutorials/09-persistent-matmul.html#sphx-glr-getting-started-tutorials-09-persistent-matmul-py)). A few notes: - Here we assume the availability of the host-side TMA API added to upstream Triton in https://github.com/triton-lang/triton/pull/4498. As of the time of writing, this is not a part of the PT2 OSS Triton pin (although back-ported internally). OSS Triton pin update should be done in December 2024. - To capture the chain of calls `t.data_ptr() --> create_{1d,2d}_tma_descriptor(ptr, ...) --> kernel[grid](tma_desc, ...)`, we add three new variable trackers: `DataPtrVariable`, `CreateTMADescriptorVariable` (for the function), `TMADescriptorVariable` (for TMA descriptor object). This is to maintain the path back from the Triton kernel to the Tensor from which the TMA descriptor has been created. - The newly introduced variables have `reconstruct` methods used in case of graph breaks. - The `tma_descriptor_metadata` extracted from the captured `create_{1d,2d}_tma_descriptor` calls is propagated through the HOPs in Dynamo and AOTAutograd to be used by the downstream compiler (e.g., Inductor). See the unit tests for what the captured HOP arguments look like. - In the Dynamo-captured fx graph, we replace the TMA descriptor arguments of the Triton kernel by the underlying Tensors, to be able to track the input/output relationships in terms of Tensors. - In the Triton kernel mutation analysis pass (in AOTAutograd), we use the `tt.experimental_descriptor_store` TTIR op to detect mutations of the underlying tensors via TMA descriptors, so that downstream AOTAutograd can perform functionalizations as required. - JIT Inductor and AOT Inductor support will be implemented in follow-up PRs. Differential Revision: [D64404928](https://our.internmc.facebook.com/intern/diff/D64404928) Pull Request resolved: https://github.com/pytorch/pytorch/pull/137677 Approved by: https://github.com/zou3519	2024-10-16 02:18:48 +00:00
PyTorch MergeBot	16a2c2cfd4	Revert "Introduce torch.sym_sum (#136429 )" This reverts commit `90bed32b98`. Reverted https://github.com/pytorch/pytorch/pull/136429 on behalf of https://github.com/ezyang due to fails internal stuff ([comment](https://github.com/pytorch/pytorch/pull/136429#issuecomment-2403335147))	2024-10-09 20:08:01 +00:00
Ryan Guo	394c143e4e	[dynamo] Fix error when inlining certain nested closure returned by another function (#137510 ) See `test_inline_closure_returned_by_another_function_and_captures` and #136814 for more context. In #90286, we introduced an optimization so that for captured cells that are unmodified during a Dynamo trace, `UserFunctionVariable` will represent them as a variable of the cell's actual value, rather than a `NewCellVariable`. Later on we introduced more mechanisms to model such cells across function calls (#104222), and across function calls where `NestedUserFunctionVariable::bind_args` needs to look up further in the parent frames (#106491) to find these cells' values. This patch removes `InlinedClosureVariable` in favor of a simpler modelling, which is also more consistent with what was introduced in #90286, i.e., just model these cells as their contents, in `symbolic_locals`. This fixes #136814 because resolution of `InlinedClosureVariable` to the underlying cell content value happens in `NestedUserFunctionVariable::bind_args`, which requires Dynamo to have the value in scope at the function call site (when Dynamo does inlining), but that is not always the case (as the test case shows). However, if we model the cells in `symbolic_locals`, we never need such resolution, and the values are directly stored into the `NestedUserFunctionVariable::closure` upon the function creation, at which point Dynamo always has the cell value in `symbolic_locals` for lookup. Fixes #136814. Pull Request resolved: https://github.com/pytorch/pytorch/pull/137510 Approved by: https://github.com/williamwen42	2024-10-09 18:13:57 +00:00
Duygu Altinok	2a1829d728	Error message for allow_in_graph decorator and arbitrary function combo (#135972 ) Fixes #103615 Quick error message for non-allowed allow_in_graph decorator and arbitrary function combo. Pull Request resolved: https://github.com/pytorch/pytorch/pull/135972 Approved by: https://github.com/anijain2305	2024-10-08 22:48:38 +00:00
Edward Z. Yang	90bed32b98	Introduce torch.sym_sum (#136429 ) Partially addresses https://github.com/pytorch/pytorch/issues/128150 When you have big sums of values, we end up computing long chains of binary addition in our FX graph representation. Not only is this ugly, it also is quadratic, as the sympy.Add constructor is O(N) in number of arguments. Instead, ensure that we maintain the summation as a single FX node so we can do the entire addition all in one go. update_hint_regression benchmark, before and after: ``` update_hint_regression,compile_time_instruction_count,2648328980 update_hint_regression,compile_time_instruction_count,2563748678 ``` Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/136429 Approved by: https://github.com/isuruf	2024-10-08 18:12:57 +00:00
David Berard	ef3142d2a0	[user triton] Make tl.constexpr specialization work for triton_op & capture_triton (#136686 ) In #136512, we fixed handling for tl.constexpr and dynamic shapes: if a symint is passed to tl.constexpr, you should specialize on it, because tl.constexpr implies needing to know the concrete value at compile time. However, when using triton_op, capture_triton, or non-strict export, the regression remains (and #136512 might technically regress some specific export scenarios) - see [Richard's comment](https://github.com/pytorch/pytorch/pull/136512/files#r1775999871). This fixes these scenarios: implement the handling differently depending on whether we're expecting a SymNodeVariable or a SymInt(/SymBool/SymFloat) Pull Request resolved: https://github.com/pytorch/pytorch/pull/136686 Approved by: https://github.com/zou3519	2024-09-27 23:02:46 +00:00
PyTorch MergeBot	287dc36395	Revert "[user triton] Make tl.constexpr specialization work for triton_op & capture_triton (#136686 )" This reverts commit `9f5b97a006`. Reverted https://github.com/pytorch/pytorch/pull/136686 on behalf of https://github.com/davidberard98 due to breaks lint on main. Please rebase to see and fix the error ([comment](https://github.com/pytorch/pytorch/pull/136686#issuecomment-2379830921))	2024-09-27 18:25:49 +00:00
David Berard	9f5b97a006	[user triton] Make tl.constexpr specialization work for triton_op & capture_triton (#136686 ) In #136512, we fixed handling for tl.constexpr and dynamic shapes: if a symint is passed to tl.constexpr, you should specialize on it, because tl.constexpr implies needing to know the concrete value at compile time. However, when using triton_op, capture_triton, or non-strict export, the regression remains (and #136512 might technically regress some specific export scenarios) - see [Richard's comment](https://github.com/pytorch/pytorch/pull/136512/files#r1775999871). This fixes these scenarios: implement the handling differently depending on whether we're expecting a SymNodeVariable or a SymInt(/SymBool/SymFloat) Pull Request resolved: https://github.com/pytorch/pytorch/pull/136686 Approved by: https://github.com/zou3519	2024-09-27 16:11:02 +00:00
PyTorch MergeBot	9223c16208	Revert "Fix constant propagation in builtins and UserClasses (#131354 )" This reverts commit `dd4a51b39a`. Reverted https://github.com/pytorch/pytorch/pull/131354 on behalf of https://github.com/atalman due to Breaks torchrec tests ([comment](https://github.com/pytorch/pytorch/pull/131354#issuecomment-2375417145))	2024-09-25 23:01:03 +00:00
Tom Ritchford	dd4a51b39a	Fix constant propagation in builtins and UserClasses (#131354 ) * Fixes https://github.com/pytorch/pytorch/issues/118675 * Replaces https://github.com/pytorch/pytorch/pull/118994 Pull Request resolved: https://github.com/pytorch/pytorch/pull/131354 Approved by: https://github.com/jansel, https://github.com/anijain2305	2024-09-25 13:03:40 +00:00
Will Feng	386884e553	[Traceable FSDP2] Ignore FSDP2 forward hook side-effects in AC; Support FSDP2 + AC (#134997 ) > Ignore FSDP2 forward hook side-effects in AC Under AC, FSDP2 does not rely on forward hook to all-gather weights to do recomputation, instead it relies on pre-backward hook to do this job: `451eaf0ff2/torch/distributed/_composable/fsdp/_fsdp_state.py (L219-L220)` So when we use `speculate_subgraph` to trace the utils.checkpoint AC region, we don't actually need to worry about FSDP2 forward hook's side effects and can safely ignore it, because we are not and we don't expect to re-run the FSDP2 forward hook during backward recomputation. ---- Test commands: - `pytest -rA test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_nested_fully_shard_backend_inductor` - `pytest -rA test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_transformer_backend_inductor` Pull Request resolved: https://github.com/pytorch/pytorch/pull/134997 Approved by: https://github.com/zou3519 ghstack dependencies: #135727	2024-09-15 02:00:17 +00:00
Animesh Jain	eaba287adb	[dynamo] Bug fix for _torchdynamo_inline source handling (#135612 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/135612 Approved by: https://github.com/drisspg	2024-09-12 04:05:08 +00:00
PyTorch MergeBot	596e93b506	Revert "[dynamo] Bug fix for _torchdynamo_inline source handling (#135612 )" This reverts commit `5c3d0a2ded`. Reverted https://github.com/pytorch/pytorch/pull/135612 on behalf of https://github.com/clee2000 due to broke inductor/test_cpu_select_algorithm.py::TestSelectAlgorithmCPU::test_linear_input_transpose_bias_True_cpu_float32 [GH job link](https://github.com/pytorch/pytorch/actions/runs/10805518363/job/29982386304) [HUD commit link](`5c3d0a2ded`), bad TD ([comment](https://github.com/pytorch/pytorch/pull/135612#issuecomment-2344039370))	2024-09-11 15:51:12 +00:00
Animesh Jain	5c3d0a2ded	[dynamo] Bug fix for _torchdynamo_inline source handling (#135612 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/135612 Approved by: https://github.com/drisspg ghstack dependencies: #135588	2024-09-11 05:23:42 +00:00
Tom Ritchford	2c99f17a32	Implement VariableTracker.python_type() (#134215 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/134215 Approved by: https://github.com/amjames, https://github.com/jansel	2024-09-05 16:35:47 +00:00
Xuehai Pan	ec660c383e	[dynamo] reduce overhead for `PolyfilledFunctionVariable.call_function` (#134842 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/134842 Approved by: https://github.com/jansel	2024-08-31 09:12:46 +00:00
Xuehai Pan	ebbdeeede1	[dynamo][itertools] refactor `itertools.chain` and `itertools.chain.from_iterable` to use polyfills (#133864 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/133864 Approved by: https://github.com/jansel	2024-08-31 00:11:54 +00:00
PyTorch MergeBot	1ad08c7a5b	Revert "[dynamo][itertools] refactor `itertools.chain` and `itertools.chain.from_iterable` to use polyfills (#133864 )" This reverts commit `1b70366957`. Reverted https://github.com/pytorch/pytorch/pull/133864 on behalf of https://github.com/ZainRizvi due to This is still failing internally with the same error about 'Graph break due to unsupported builtin _functools.reduce' ([comment](https://github.com/pytorch/pytorch/pull/133778#issuecomment-2321787968))	2024-08-30 16:06:10 +00:00
Xuehai Pan	1b70366957	[dynamo][itertools] refactor `itertools.chain` and `itertools.chain.from_iterable` to use polyfills (#133864 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/133864 Approved by: https://github.com/jansel ghstack dependencies: #133769, #133778, #133779	2024-08-29 20:56:16 +00:00
Xuehai Pan	e09324e7da	[dynamo] simplify polyfill registration for `builtins.all` and `builtins.any` (#133769 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/133769 Approved by: https://github.com/jansel	2024-08-29 20:56:16 +00:00
Xuehai Pan	b6abac68ec	[BE][dynamo] reorganize polyfill module hierarchy (#133977 ) Changes: 1. Move `polyfill.py` -> `polyfills/__init__.py`. It can be used as `polyfill.xxx` -> `polyfills.xxx`. 2. Move submodule loading from `polyfills/__init__.py` to `polyfills/loader.py`. Merge `polyfill.py` and `polyfills/` packages. Each polyfill module has its own namespace for better code organization. The ultimate goal is to make `polyfills/__init__.py` empty and to move all polyfill functions into their own namespaces. Pull Request resolved: https://github.com/pytorch/pytorch/pull/133977 Approved by: https://github.com/jansel	2024-08-22 16:42:29 +00:00
Xuehai Pan	022cd7c9aa	[RFC][dynamo] add decorator to register polyfill for unsupported C++ function to avoid graph break (#133712 ) Add decorator `torch.compiler.substitute_in_graph` to register polyfill for unsupported C++ function to avoid graph break. This API provides an official way to add support for dynamo for third-party C extensions. Also, it can be used to simplify our implementation for `torch._dynamo.polyfill`. `5ee070266f/torch/_dynamo/variables/builtin.py (L97-L107)` Example: ```python >>> import operator >>> operator.indexOf([1, 2, 3, 4, 5], 3) 2 >>> torch.compile(operator.indexOf, fullgraph=True)([1, 2, 3, 4, 5], 3) Unsupported: ... >>> @torch.compiler.substitute_in_graph(operator.indexOf) ... def indexOf(sequence, x): ... for i, item in enumerate(sequence): ... if item is x or item == x: ... return i ... raise ValueError("sequence.index(x): x not in sequence") >>> torch.compile(operator.indexOf, fullgraph=True)([1, 2, 3, 4, 5], 3) 2 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/133712 Approved by: https://github.com/jansel	2024-08-21 06:36:41 +00:00
PyTorch MergeBot	15b5a0b67f	Revert "[RFC][dynamo] add decorator to register polyfill for unsupported C++ function to avoid graph break (#133712 )" This reverts commit `71dd52f51a`. Reverted https://github.com/pytorch/pytorch/pull/133712 on behalf of https://github.com/ZainRizvi due to breaking main windows cpu tests - this stack still causes that windows test to fail ([comment](https://github.com/pytorch/pytorch/pull/133712#issuecomment-2299776241))	2024-08-20 21:14:45 +00:00
Xuehai Pan	71dd52f51a	[RFC][dynamo] add decorator to register polyfill for unsupported C++ function to avoid graph break (#133712 ) Add decorator `torch.compiler.substitute_in_graph` to register polyfill for unsupported C++ function to avoid graph break. This API provides an official way to add support for dynamo for third-party C extensions. Also, it can be used to simplify our implementation for `torch._dynamo.polyfill`. `5ee070266f/torch/_dynamo/variables/builtin.py (L97-L107)` Example: ```python >>> import operator >>> operator.indexOf([1, 2, 3, 4, 5], 3) 2 >>> torch.compile(operator.indexOf, fullgraph=True)([1, 2, 3, 4, 5], 3) Unsupported: ... >>> @torch.compiler.substitute_in_graph(operator.indexOf) ... def indexOf(sequence, x): ... for i, item in enumerate(sequence): ... if item is x or item == x: ... return i ... raise ValueError("sequence.index(x): x not in sequence") >>> torch.compile(operator.indexOf, fullgraph=True)([1, 2, 3, 4, 5], 3) 2 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/133712 Approved by: https://github.com/jansel	2024-08-20 19:48:57 +00:00
PyTorch MergeBot	2bd02e0c82	Revert "[RFC][dynamo] add decorator to register polyfill for unsupported C++ function to avoid graph break (#133712 )" This reverts commit `641724ed1d`. Reverted https://github.com/pytorch/pytorch/pull/133712 on behalf of https://github.com/jeanschmidt due to breaking main windows cpu tests - reverting them all, so we can identify the culprit with more calmness ([comment](https://github.com/pytorch/pytorch/pull/133712#issuecomment-2298528797))	2024-08-20 10:34:41 +00:00
Xuehai Pan	641724ed1d	[RFC][dynamo] add decorator to register polyfill for unsupported C++ function to avoid graph break (#133712 ) Add decorator `torch.compiler.substitute_in_graph` to register polyfill for unsupported C++ function to avoid graph break. This API provides an official way to add support for dynamo for third-party C extensions. Also, it can be used to simplify our implementation for `torch._dynamo.polyfill`. `5ee070266f/torch/_dynamo/variables/builtin.py (L97-L107)` Example: ```python >>> import operator >>> operator.indexOf([1, 2, 3, 4, 5], 3) 2 >>> torch.compile(operator.indexOf, fullgraph=True)([1, 2, 3, 4, 5], 3) Unsupported: ... >>> @torch.compiler.substitute_in_graph(operator.indexOf) ... def indexOf(sequence, x): ... for i, item in enumerate(sequence): ... if item is x or item == x: ... return i ... raise ValueError("sequence.index(x): x not in sequence") >>> torch.compile(operator.indexOf, fullgraph=True)([1, 2, 3, 4, 5], 3) 2 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/133712 Approved by: https://github.com/jansel	2024-08-19 22:14:33 +00:00
Oguz Ulgen	6e79932543	Add basic mypy annotations to dynamo (#132415 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/132415 Approved by: https://github.com/XuehaiPan, https://github.com/jamesjwu	2024-08-04 18:43:36 +00:00
PyTorch MergeBot	3558a8cf4a	Revert "Add basic mypy annotations to dynamo (#132415 )" This reverts commit `71e22e0959`. Reverted https://github.com/pytorch/pytorch/pull/132415 on behalf of https://github.com/ZainRizvi due to Sorry, this PR has entered a weird state in the diff train. Trying to revert it to skip it, and then we can try relanding it ([comment](https://github.com/pytorch/pytorch/pull/132415#issuecomment-2267631785))	2024-08-04 18:39:29 +00:00
Oguz Ulgen	71e22e0959	Add basic mypy annotations to dynamo (#132415 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/132415 Approved by: https://github.com/XuehaiPan, https://github.com/jamesjwu	2024-08-01 20:14:25 +00:00
Xuehai Pan	e74ba1b34a	[BE][Easy][15/19] enforce style for empty lines in import segments in `torch/_d*/` (#129767 ) See https://github.com/pytorch/pytorch/pull/129751#issue-2380881501. Most changes are auto-generated by linter. You can review these PRs via: ```bash git diff --ignore-all-space --ignore-blank-lines HEAD~1 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/129767 Approved by: https://github.com/anijain2305	2024-07-31 21:18:11 +00:00
rzou	19db4f6014	[capture_triton] fix special kwargs path (#132143 ) I didn't test this path when creating the orchestrator. This PR fixes that path to work in the capture_triton path. The problem is that we are handling a value that is an int (in the capture_triton path) and a ConstantVariable (in the Dynamo triton path) so we abstract that out in the orchestrator. Test Plan: - new tests Pull Request resolved: https://github.com/pytorch/pytorch/pull/132143 Approved by: https://github.com/oulgen	2024-07-30 20:30:40 +00:00
Animesh Jain	13457d1da0	[dynamo][log] Suggest to use pytree when graph-break on optree (#131827 ) Discovered while working on https://github.com/pytorch/pytorch/issues/121369 On the model above, the log looks like this ~~~ /home/anijain/local/pytorch2/torch/_dynamo/variables/functions.py:698: UserWarning: Graph break for an optree C/C++ function optree._C.PyCapsule.flatten. Consider using torch._utils.pytree - https://github.com/pytorch/pytorch/blob/main/torch/utils/_pytree.py. torch._dynamo.utils.warn_once(msg) /home/anijain/local/pytorch2/torch/_dynamo/variables/functions.py:698: UserWarning: Graph break for an optree C/C++ function optree.PyCapsule.unflatten. Consider using torch._utils.pytree - https://github.com/pytorch/pytorch/blob/main/torch/utils/_pytree.py. torch._dynamo.utils.warn_once(msg) ~~~ Pull Request resolved: https://github.com/pytorch/pytorch/pull/131827 Approved by: https://github.com/zou3519, https://github.com/mlazos	2024-07-30 05:49:58 +00:00
Chengji Yao	d47c470f47	[dynamo] implement `var_getattr` in UserFunctionVariable (#130413 ) This PR addresses the `getattr` of UserFunctionVariable. Although this usage is uncommon, it does appear in [Megatron's code](https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/core/tensor_parallel/layers.py#L635). ``` def linear_with_grad_accumulation_and_async_allreduce(...): .... if not linear_with_grad_accumulation_and_async_allreduce.warned: .... .... linear_with_grad_accumulation_and_async_allreduce.warned = False ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/130413 Approved by: https://github.com/yanboliang	2024-07-29 08:29:59 +00:00
Oguz Ulgen	7a42470bcb	Annotate all InstructionTranslator (#131509 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/131509 Approved by: https://github.com/zou3519	2024-07-24 23:45:53 +00:00
PyTorch MergeBot	5db5865614	Revert "Annotate all InstructionTranslator (#131509 )" This reverts commit `eafbd20f23`. Reverted https://github.com/pytorch/pytorch/pull/131509 on behalf of https://github.com/clee2000 due to sorry need to revert this to revert something else, I think you only need to rebase and remerge ([comment](https://github.com/pytorch/pytorch/pull/131509#issuecomment-2249000843))	2024-07-24 22:29:49 +00:00
Oguz Ulgen	b56939dae1	Annotate more InstructionTranslator (#131680 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/131680 Approved by: https://github.com/zou3519 ghstack dependencies: #131676	2024-07-24 22:14:29 +00:00
Oguz Ulgen	eafbd20f23	Annotate all InstructionTranslator (#131509 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/131509 Approved by: https://github.com/zou3519	2024-07-24 05:31:01 +00:00
Animesh Jain	eab1595ce2	[dynamo] Delete wrong assertion in bind_args (#131405 ) Fix - https://github.com/pytorch/pytorch/issues/130537 Pull Request resolved: https://github.com/pytorch/pytorch/pull/131405 Approved by: https://github.com/williamwen42, https://github.com/yanboliang ghstack dependencies: #131347, #131367, #131378, #131389	2024-07-23 17:28:05 +00:00
Xuehai Pan	973037be6a	[BE][Easy] apply autofix for ruff rules unnecessary-collection-call (C408): `list()` / `tuple()` / `dict()` (#130199 ) This PR changes the empty collection factory call to Python literals: - `list()` -> `[]` - `tuple()` -> `()` - `dict()` -> `{}` The Python literals are more performant and safer. For example, the bytecode for building an empty dictionary: ```bash $ python3 -m dis - <<EOS import collections d1 = {} d2 = dict() dict = collections.OrderedDict d3 = dict() EOS ``` ```text 0 0 RESUME 0 1 2 LOAD_CONST 0 (0) 4 LOAD_CONST 1 (None) 6 IMPORT_NAME 0 (collections) 8 STORE_NAME 0 (collections) 3 10 BUILD_MAP 0 12 STORE_NAME 1 (d1) 4 14 PUSH_NULL 16 LOAD_NAME 2 (dict) 18 CALL 0 26 STORE_NAME 3 (d2) 6 28 LOAD_NAME 0 (collections) 30 LOAD_ATTR 8 (OrderedDict) 50 STORE_NAME 2 (dict) 7 52 PUSH_NULL 54 LOAD_NAME 2 (dict) 56 CALL 0 64 STORE_NAME 5 (d3) 66 RETURN_CONST 1 (None) ``` The dict literal `{}` only has one bytecode `BUILD_MAP`, while the factory call `dict()` has three `PUSH_NULL + LOAD_NAME + CALL`. Also, the factory call is not safe if users override the `dict` name in `locals` or `globals` (see the example of replacing with `OrderedDict` above). Pull Request resolved: https://github.com/pytorch/pytorch/pull/130199 Approved by: https://github.com/malfet	2024-07-11 17:30:28 +00:00
rzou	99c68f7bea	Refactor TritonKernelVariable's logic so it can be shared (#130177 ) TritonKernelVariable's logic tells us how to go from a user-defined triton kernel and a grid to a call to the triton_kernel_wrapper_mutation HOP. We want to re-use this in a setting without Dynamo; in the next PR up, we create a new decorator (capture_triton) that, when applied to a triton kernel, transforms a call to the triton kernel into a call to the triton_kernel_wrapper_mutation HOP. Test Plan: - existing tests Pull Request resolved: https://github.com/pytorch/pytorch/pull/130177 Approved by: https://github.com/oulgen, https://github.com/ydwu4	2024-07-10 03:09:29 +00:00
Animesh Jain	a7a7363be0	[dynamo] Skip side effect tracking for c wrappers/descriptors (#129914 ) Fixes PYTORCH_TEST_WITH_DYNAMO=1 pytest -vs test/test_python_dispatch.py::TestPythonDispatch::test_deepcopy_wrapper_subclass Pull Request resolved: https://github.com/pytorch/pytorch/pull/129914 Approved by: https://github.com/jansel ghstack dependencies: #129913	2024-07-04 03:14:45 +00:00
Animesh Jain	53d67165c0	[dynamo] Skip FUNCTION_MATCH guards for descriptors (#129858 ) Hard to write tests. This PR makes many test pass in the stack such as `PYTORCH_TEST_WITH_DYNAMO=1 pytest test/test_ao_sparsity.py::TestComposability::test_convert_without_squash_mask` Pull Request resolved: https://github.com/pytorch/pytorch/pull/129858 Approved by: https://github.com/mlazos ghstack dependencies: #129830	2024-07-01 20:44:59 +00:00
William Wen	79aabaf626	[3.13, dynamo] codegen PUSH_NULL when callable is codegen'd (#129172 ) Significant bytecode generation API change! The new suggested convention to generating bytecode to call a function is now to wrap instructions that push a callable to the stack with `add_push_null`, then that callable is called with `create_call_function` with `push_null=False` (see diff for examples). In Python 3.13, NULL is now expected to be pushed after the callable. In <=3.12, the NULL was pushed before the callable. This change abstracts away the exact placement of the NULL, but the developer must be aware that a NULL may be needed when codegen'ing a callable. This abstraction also reduces the need for the `push_null=True` option in `create_call_function`, which removes the need to rotate a NULL to the right place on the stack with a sequence of `SWAP` instructions. Pull Request resolved: https://github.com/pytorch/pytorch/pull/129172 Approved by: https://github.com/jansel	2024-06-22 17:25:23 +00:00
rzou	08b616281f	[custom ops] Switch out references from old landing page to new landing page (#129178 ) Test Plan: - existing tests Pull Request resolved: https://github.com/pytorch/pytorch/pull/129178 Approved by: https://github.com/albanD ghstack dependencies: #129177	2024-06-21 13:31:40 +00:00
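
The `__str__` to `__repr__` rename in 8ad191ae21 relies on a fallback in the Python object model: `str()` uses `__repr__` when a class defines no `__str__`, but `repr()` never falls back to `__str__`. The following is a minimal sketch of that behavior only. `OnlyStr` and `OnlyRepr` are hypothetical classes made up for illustration; they are not Dynamo classes.

```python
class OnlyStr:
    """Hypothetical class that defines __str__ but not __repr__."""

    def __init__(self, value):
        self.value = value

    def __str__(self):
        return f"OnlyStr({self.value!r})"


class OnlyRepr:
    """Hypothetical class that defines __repr__ but not __str__."""

    def __init__(self, value):
        self.value = value

    def __repr__(self):
        return f"OnlyRepr({self.value!r})"


a = OnlyStr(1)
b = OnlyRepr(1)

# str() falls back to __repr__ when __str__ is missing ...
str_of_b = str(b)
# ... and containers format their items with repr().
list_of_b = str([b])

# repr() does not fall back to __str__; it uses object.__repr__,
# which produces the generic "<... OnlyStr object at 0x...>" form.
repr_of_a = repr(a)

print(str_of_b, list_of_b, repr_of_a)
```

Because debuggers and container formatting typically go through `repr()`, a class shaped like `OnlyRepr` shows its useful text in both places, which is the effect the commit message describes.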


156 Commits
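
Commit 973037be6a argues that the `{}` literal is both cheaper and safer than the `dict()` factory call. The sketch below is one way to check both claims with the standard `dis` module. The helper `opnames` and the function `shadowed` are names made up for this illustration, and exact opcode sequences vary between CPython versions, so only the presence or absence of `BUILD_MAP` and `LOAD_NAME` is inspected.

```python
import collections
import dis


def opnames(source):
    """Return the set of opcode names in the compiled module-level source."""
    code = compile(source, "<example>", "exec")
    return {ins.opname for ins in dis.get_instructions(code)}


literal_ops = opnames("d = {}")
call_ops = opnames("d = dict()")

# The literal builds the mapping directly with BUILD_MAP;
# the factory call has to look up the name `dict` and then call it.
print("BUILD_MAP" in literal_ops, "BUILD_MAP" in call_ops)


def shadowed():
    # Rebinding the name changes what the factory call returns,
    # while the literal still produces a plain dict.
    dict = collections.OrderedDict
    return type(dict()), type({})


call_type, literal_type = shadowed()
print(call_type.__name__, literal_type.__name__)
```

The second half mirrors the `OrderedDict` override in the commit message: the call result depends on whatever `dict` is bound to at that point, whereas the literal does not consult the name at all.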