pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
Joel Schlosser	fb146fc3c6	Only store necessary tensor_dict fields in node meta (#132805 ) Fixes #132290 This PR attempts a more invasive / complete solution than the one from #132338, which removes immediate tensor fields from the `tensor_dict` copy stored in node meta. The approach taken here is to store only those fields of the `tensor_dict` which are absolutely utilized somewhere else. So far, this appears to be limited to: * `_dynamo_static_input_type` * `tag` (at least in the tests). Discussion at #94080 appears to indicate this is depended on for export (CI may point out more) Pull Request resolved: https://github.com/pytorch/pytorch/pull/132805 Approved by: https://github.com/mlazos	2024-08-07 13:35:16 +00:00
Brian Hirsh	af8b8a47cb	fsdp.set_: convey to functionalization that it mutates storage (#132322 ) Fixes https://github.com/pytorch/pytorch/issues/132197 Pull Request resolved: https://github.com/pytorch/pytorch/pull/132322 Approved by: https://github.com/albanD, https://github.com/yf225 ghstack dependencies: #132243, #132337	2024-08-05 21:28:59 +00:00
Brian Hirsh	4db368a475	make functorch CSE respect mutations as barriers (like fsdp.set_) (#132243 ) Fixes https://github.com/pytorch/pytorch/issues/132200 Pull Request resolved: https://github.com/pytorch/pytorch/pull/132243 Approved by: https://github.com/albanD, https://github.com/zou3519, https://github.com/yf225	2024-08-05 21:28:55 +00:00
William Wen	01cdcbf7c8	[dynamo] revert map/zip iterator related changes (#132528 ) Need to revert due to internal hangs: S437700 This reverts commit `b6c1490cc0`. Revert "[dynamo] implement IteratorVariable and polyfill fallbacks for enumerate (#131725)" This reverts commit `2576dbbc35`. Revert "[dynamo] add itertools repeat/count bytecode reconstruction (#131716)" This reverts commit `35b4de32fa`. Revert "[dynamo] add lazy IteratorVariable implementations for map and zip (#131413)" This reverts commit `7d282d8755`. Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/132528 Approved by: https://github.com/ZainRizvi	2024-08-04 18:46:55 +00:00
PyTorch MergeBot	0a25666f92	Revert "[dynamo] revert map/zip iterator related changes (#132528 )" This reverts commit `e81e74ca6c`. Reverted https://github.com/pytorch/pytorch/pull/132528 on behalf of https://github.com/ZainRizvi due to This stack entered a weird state in the diff train. Reverting and relanding to clean the state ([comment](https://github.com/pytorch/pytorch/pull/132528#issuecomment-2267628475))	2024-08-04 18:26:09 +00:00
Pian Pawakapan	a896fb1b36	check unsupported sympy functions for runtime asserts (#132457 ) Some sympy Functions aren't supported by sympy_interp(); we can't turn them into FX nodes, so currently the runtime asserts CSE pass avoids CSE'ing on any expression containing a sympy Function. https://github.com/pytorch/pytorch/pull/132325 started tracking unsupported functions, so we switch the check to that to be more precise. We also check for and skip unsupported functions when adding asserts - previously we only did the check for CSE, and not adding new expressions. Pull Request resolved: https://github.com/pytorch/pytorch/pull/132457 Approved by: https://github.com/avikchaudhuri	2024-08-03 10:17:25 +00:00
William Wen	e81e74ca6c	[dynamo] revert map/zip iterator related changes (#132528 ) Need to revert due to internal hangs: S437700 This reverts commit `b6c1490cc0`. Revert "[dynamo] implement IteratorVariable and polyfill fallbacks for enumerate (#131725)" This reverts commit `2576dbbc35`. Revert "[dynamo] add itertools repeat/count bytecode reconstruction (#131716)" This reverts commit `35b4de32fa`. Revert "[dynamo] add lazy IteratorVariable implementations for map and zip (#131413)" This reverts commit `7d282d8755`. Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/132528 Approved by: https://github.com/ZainRizvi	2024-08-02 19:40:57 +00:00
Michael Lazos	93979e7063	Skip frame if torch dispatch mode enabled (#131828 ) Fixes https://github.com/pytorch/pytorch/issues/105929 We now skip frames if a dispatch mode is enabled. Pull Request resolved: https://github.com/pytorch/pytorch/pull/131828 Approved by: https://github.com/bdhirsh, https://github.com/anijain2305	2024-08-01 19:06:20 +00:00
Oguz Ulgen	920f0426ae	Add None return type to init -- tests rest (#132376 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/132376 Approved by: https://github.com/jamesjwu ghstack dependencies: #132335, #132351, #132352	2024-08-01 15:44:51 +00:00
YangQun1	589aef4bb0	Fix py codegen to delete values that don't have any users (#131028 ) Fixes #131025 Pull Request resolved: https://github.com/pytorch/pytorch/pull/131028 Approved by: https://github.com/ezyang	2024-08-01 03:18:37 +00:00
ekamiti	9e473fd868	Make adding Buffers more like adding Parameters (#125971 ) Add similar semantics for creating a buffer object similar to creating a parameter. This is done by introducing a new Buffer class that can be used for type disambiguation. The underlying functionality of registering a buffer remains the same as the register_buffer method has not been changed. The persistent parameter in the Buffer type is to indicate whether a buffer object should be persistent or not. Other non-test changes have to do with getting the new Buffer type recognized by inductor and dynamo. Remaining changes are test changes to make sure that the Buffer type can be used as a drop in replacement for register_buffer as it just leads to register_buffer being called. The addition of this new functionality still allows for normal tensors to be used as buffers so these changes are intended to be backwards compatible. Fixes #35735 Co-authored-by: Mikayla Gawarecki <mikaylagawarecki@gmail.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/125971 Approved by: https://github.com/albanD, https://github.com/anijain2305, https://github.com/mlazos	2024-07-31 10:32:40 +00:00
Yiming Zhou	e9d1c26275	fix uniform op in dynamo (#132160 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/132160 Approved by: https://github.com/anijain2305	2024-07-31 06:48:43 +00:00
Xuehai Pan	918ece4f4d	[BE][Easy][11/19] enforce style for empty lines in import segments in `test/dy*/` (#129762 ) See https://github.com/pytorch/pytorch/pull/129751#issue-2380881501. Most changes are auto-generated by linter. You can review these PRs via: ```bash git diff --ignore-all-space --ignore-blank-lines HEAD~1 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/129762 Approved by: https://github.com/anijain2305	2024-07-27 17:43:53 +00:00
Brian Hirsh	e4ace1a396	AOTDispatcher: properly bump version counter on input mutations in inference graphs (#131665 ) This ensures that in an inference setting, we properly bump the VC of mutated graph inputs. Previously, we would only properly bump the VC for training graphs. Pull Request resolved: https://github.com/pytorch/pytorch/pull/131665 Approved by: https://github.com/ezyang, https://github.com/zou3519 ghstack dependencies: #131403, #131482	2024-07-26 14:22:20 +00:00
Brian Hirsh	5570a0da0a	dont dispatch aten.conj(scalar_tensor) back to python (#131482 ) https://github.com/pytorch/pytorch/issues/105290 The problem in the original flow is that: (1) the user calls `torch.mul(complex_tensor, complex_scalar) (2) python arg parser wraps the complex scalar in a `scalar_tensor`, and dispatches to `aten.mul.Tensor(self, scalar_other)` (3) autograd sees `aten.mul.Tensor`, calls `scalar_other.conj()` [here](https://github.com/pytorch/pytorch/blob/main/torch/csrc/autograd/FunctionsManual.cpp#L597) (4) during proxy tensor tracing, this gets dispatched to `aten._conj(scalar_tensor)` (5) when we hit __torch_dispatch__, the scalar_tensor is converted back into a plain python scalar (6) we error during tracing, because in `FunctionalTensorMode.__torch_dispatch__` we try to redispatch on `aten._conj.default(plain_python_scalar)`, and this overload does not accept python scalars. My attempted fix in this PR is to update `TensorBase::conj()` to check if the current tensor is a scalar tensor (wrapped number), and if so, manually: (1) convert the scalar tensor back into a scalar (2) call scalar.conj() directly (3) convert the result back into a wrapped tensor This avoids having to go through python entirely in the tracing case (which is fine, because these scalar tensors are constants that we can const-prop during tracing anyway). Notable, I did not add e.g. a new `aten._conj.Scalar` overload. This would not actually fix the problem, since the bug is that we call `aten._conj.default(python_scalar)` directly. we would also need to muck with all `__torch_dispatch__` call sites to know to convert python scalars back into tensors directly. Pull Request resolved: https://github.com/pytorch/pytorch/pull/131482 Approved by: https://github.com/zou3519, https://github.com/ezyang ghstack dependencies: #131403	2024-07-26 14:22:20 +00:00
Brian Hirsh	8bb9aa93a7	dynamo: mutations on .data should be invisible to autograd (#131403 ) Fixes https://github.com/pytorch/pytorch/issues/121353 our handle for `.data` in dynamo today basically just converts `y = x.data` into `y = x.detach()`. The semantics of these two ops are not quite the same, because: (1) any future mutations on `x.data` will be fully ignored by autograd (2) any mutations on `x.detach()` will bump x's version counter the linked model does a .data mutation that is hidden from autograd in eager, but ends up erroring during AOTDispatcher tracing. I updated dynamo's handling so that: (1) when dynamo sees a call to `getattr(tensor, "data")` and calls `.detach()` we set a flag on the returned `TensorVariable` indicating it came from `.data` (2) on any tensor method that we call with an input `TensorVariable` with this flag turned on, we proxy autograd's `preserve_version_counter` logic into the graph, to properly reset the VC after the op is run. One thing to note is that I don't actually do this on every op that we pass the tensor to: I only do it for tensor methods that appear to be mutations (by checking for a trailing underscore). My thought was that: (1) I didn't want to do this for every op that you pass `y` into, since that will e.g. triple the number of nodes in the graph, and could cause compile time regressions if you use .data (2) this situation is pretty rare in general, and I'm hoping that "tensor method mutations" cover most reasonable mutation cases. If we manage to miss a case, you will get a loud error during tracing anyway, so there is not a safety issue. Pull Request resolved: https://github.com/pytorch/pytorch/pull/131403 Approved by: https://github.com/anijain2305, https://github.com/zou3519	2024-07-26 14:22:20 +00:00
William Wen	7d282d8755	[dynamo] add lazy IteratorVariable implementations for map and zip (#131413 ) Fixes https://github.com/pytorch/pytorch/issues/130750. Repro of lazy/eager `map` discrepancy without `islice`: ```python def fn(a, b): y = 1 def f(x): nonlocal y y += 1 return x l = list(zip([a, b], map(f, [1, 2, 3, 4]))) return a + y ``` The major change is that we implement `MapVariable` and `ZipVariable` based on `IteratorVariable`. Before, `map` and `zip` were being traced by immediately unpacking the result as a `TupleVariable`, which is wrong in cases such as the example above. `MapVariable`s are not allowed to be unpacked while `ZipVariable`s can only be unpacked if all of its iterables can also be unpacked. We also add new `[has_]force_unpack_var_sequence` methods to `VariableTracker` for the case where it is safe to unpack the entire sequence lazily, e.g., when building a list from a map (i.e. `list(map(f, ...))`). Pull Request resolved: https://github.com/pytorch/pytorch/pull/131413 Approved by: https://github.com/anijain2305	2024-07-26 10:47:38 +00:00
Michael Lazos	51f4f87718	[Reland] Ensure staticmethods can be allowed in graph (#131789 ) Fixes https://github.com/pytorch/pytorch/issues/124735 Pull Request resolved: https://github.com/pytorch/pytorch/pull/131789 Approved by: https://github.com/anijain2305	2024-07-25 22:54:18 +00:00
William Wen	2423d89d0c	[dynamo] mirror training flag in OptimizedModule (#131546 ) Fixes https://github.com/pytorch/pytorch/issues/122414. Pull Request resolved: https://github.com/pytorch/pytorch/pull/131546 Approved by: https://github.com/yanboliang, https://github.com/anijain2305	2024-07-25 17:43:09 +00:00
PyTorch MergeBot	c3679bed35	Revert "Fix py codegen to delete values that don't have any users (#131028 )" This reverts commit `91aba7baac`. Reverted https://github.com/pytorch/pytorch/pull/131028 on behalf of https://github.com/clee2000 due to broke inductor/test_triton_kernels inductor/test_triton_kernels.py::KernelTests::test_triton_kernel_functionalize [GH job link](https://github.com/pytorch/pytorch/actions/runs/10094659640/job/27915271250) [HUD commit link](`91aba7baac`) ([comment](https://github.com/pytorch/pytorch/pull/131028#issuecomment-2251058374))	2024-07-25 17:42:18 +00:00
YangQun1	91aba7baac	Fix py codegen to delete values that don't have any users (#131028 ) Fixes #131025 Pull Request resolved: https://github.com/pytorch/pytorch/pull/131028 Approved by: https://github.com/ezyang	2024-07-25 13:04:23 +00:00
PyTorch MergeBot	236e06f9f9	Revert "Ensure staticmethods can be allowed in graph (#130882 )" This reverts commit `93fdd0237d`. Reverted https://github.com/pytorch/pytorch/pull/130882 on behalf of https://github.com/clee2000 due to torchrec test still broken internally D59945836 ([comment](https://github.com/pytorch/pytorch/pull/130882#issuecomment-2249003059))	2024-07-24 22:32:41 +00:00
PyTorch MergeBot	8ffd109a00	Revert "Fix py codegen to delete values that don't have any users (#131028 )" This reverts commit `466c167b71`. Reverted https://github.com/pytorch/pytorch/pull/131028 on behalf of https://github.com/atalman due to breaks CI ([comment](https://github.com/pytorch/pytorch/pull/131028#issuecomment-2247771530))	2024-07-24 12:21:43 +00:00
YangQun1	466c167b71	Fix py codegen to delete values that don't have any users (#131028 ) Fixes #131025 Pull Request resolved: https://github.com/pytorch/pytorch/pull/131028 Approved by: https://github.com/ezyang	2024-07-24 01:03:56 +00:00
Michael Lazos	93fdd0237d	Ensure staticmethods can be allowed in graph (#130882 ) Fixes https://github.com/pytorch/pytorch/issues/124735 Pull Request resolved: https://github.com/pytorch/pytorch/pull/130882 Approved by: https://github.com/anijain2305, https://github.com/williamwen42	2024-07-23 18:59:19 +00:00
Animesh Jain	ddde9dd25c	[dynamo][automatic_dynamic] Trigger dynamism on stride changes (#130232 ) Fixes https://github.com/pytorch/pytorch/issues/129798 Pull Request resolved: https://github.com/pytorch/pytorch/pull/130232 Approved by: https://github.com/ezyang	2024-07-21 03:45:54 +00:00
Animesh Jain	00e54e74ff	[dynamo][cpp-guards] Fix bug in dict tags (#131056 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/131056 Approved by: https://github.com/williamwen42, https://github.com/jansel	2024-07-19 04:42:38 +00:00
PyTorch MergeBot	9f6db5d0e2	Revert "Ensure staticmethods can be allowed in graph (#130882 )" This reverts commit `b0387449db`. Reverted https://github.com/pytorch/pytorch/pull/130882 on behalf of https://github.com/atalman due to failing torchrec tests internally, please fix and reland ([comment](https://github.com/pytorch/pytorch/pull/130882#issuecomment-2236528473))	2024-07-18 13:31:30 +00:00
Michael Lazos	b0387449db	Ensure staticmethods can be allowed in graph (#130882 ) Fixes https://github.com/pytorch/pytorch/issues/124735 Pull Request resolved: https://github.com/pytorch/pytorch/pull/130882 Approved by: https://github.com/anijain2305, https://github.com/williamwen42	2024-07-17 19:18:30 +00:00
Michael Lazos	e4f9d01cd9	Add test for dataclass field accesses (#130848 ) Fixes https://github.com/pytorch/pytorch/issues/120108 Pull Request resolved: https://github.com/pytorch/pytorch/pull/130848 Approved by: https://github.com/williamwen42, https://github.com/anijain2305	2024-07-17 19:14:23 +00:00
William Wen	3928ca2ab6	[dynamo] update call map to allow multiple input parameters (#130748 ) Fixes https://github.com/pytorch/pytorch/issues/128072. Commandeering https://github.com/pytorch/pytorch/pull/128282 since the issue is now hi pri. Pull Request resolved: https://github.com/pytorch/pytorch/pull/130748 Approved by: https://github.com/Skylion007, https://github.com/anijain2305	2024-07-15 22:16:49 +00:00
PyTorch MergeBot	dff9d68f18	Revert "Fix names conflict when lifting (#129817 )" This reverts commit `53cf46b8c6`. Reverted https://github.com/pytorch/pytorch/pull/129817 on behalf of https://github.com/clee2000 due to Failing inductor/test_flex_attention.py https://github.com/pytorch/pytorch/actions/runs/9940532858/job/27478084137 `74da2a467f` Sorry for the churn, possibly a landrace? ([comment](https://github.com/pytorch/pytorch/pull/129817#issuecomment-2229519886))	2024-07-15 22:08:45 +00:00
Zhanghan Wang	53cf46b8c6	Fix names conflict when lifting (#129817 ) ## Bug description When pending args that are potentially to be lift [here](`58f346c874/torch/_dynamo/output_graph.py (L1866)`) having same base name, like `contiguous` and `contiguous_1`, the call into [create_graph_input](`58f346c874/torch/_dynamo/output_graph.py (L2081)`) can finally create a name ([here](`58f346c874/torch/fx/graph.py (L1008)`)) that overwrite args to lift. And thus causing a wrong output of graph. ## Reproducing Below is an reproduceable example, ```python import logging from typing import List import torch from functorch.compile import aot_module_simplified, make_boxed_func @torch.library.custom_op("mylib::somefunc_forward", mutates_args=()) def somefunc_forward( input_: torch.Tensor, weight: torch.Tensor, shape: List[int], ) -> torch.Tensor: return torch.ones_like(input_) @somefunc_forward.register_fake def _(input_, shape, weight): return torch.empty_like(input_) @torch.library.custom_op("mylib::somefunc_backward", mutates_args=()) def somefunc_backward( grad_output: torch.Tensor, input_: torch.Tensor, weight: torch.Tensor, shape: List[int], ) -> torch.Tensor: print(f"backward.{grad_output.shape=}") print(f"backward.{input_.shape=}") print(f"backward.{weight.shape=}") print(f"backward.{shape=}") assert list(weight.shape) == shape return torch.ones_like(weight) @somefunc_backward.register_fake def _(grad_output, input_, weight, shape): return torch.empty_like(weight) def a_func(grad_output, input_, weight_, shape): return torch.ones_like(input_.sum() * weight_) class SomeFunc(torch.autograd.Function): @staticmethod def forward(ctx, input, weight, normalized_shape): ctx.normalized_shape = normalized_shape input_ = input.contiguous() weight_ = weight.contiguous() output = somefunc_forward(input_, weight_, ctx.normalized_shape) ctx.save_for_backward(input_, weight_) return output @staticmethod def backward(ctx, grad_output): input_, weight_ = ctx.saved_tensors # grad_weight = a_func(grad_output, input_, weight_, ctx.normalized_shape) grad_weight = somefunc_backward( grad_output.contiguous(), input_, weight_, ctx.normalized_shape, ) return None, grad_weight, None class MyModel(torch.nn.Module): def __init__(self): super().__init__() self.weight = torch.nn.Parameter(torch.ones(7)) def forward(self, x): return SomeFunc.apply(x, self.weight, [7]) model = MyModel() torch._logging.set_logs(dynamo=logging.DEBUG, aot=logging.DEBUG, graph_code=True) def aot_print_backend(gm, sample_inputs): # Forward compiler capture def fw(gm, sample_inputs): print(f"----- fw") gm.print_readable() return make_boxed_func(gm.forward) # Backward compiler capture def bw(gm, sample_inputs): print(f"----- bw") gm.print_readable() return make_boxed_func(gm.forward) # Call AOTAutograd gm_forward = aot_module_simplified( gm, sample_inputs, fw_compiler=fw, bw_compiler=bw ) return gm_forward model = torch.compile( model, backend=aot_print_backend, dynamic=False, ) out = model(torch.rand((128, 4, 7))) out.mean().backward() ``` I can see log that showing calling into create_graph_input like ```log V0629 02:08:46.839914 8200981504 torch/_dynamo/output_graph.py:2042] [0/0] create_graph_input contiguous (none) V0629 02:08:46.839998 8200981504 torch/_dynamo/output_graph.py:2042] [0/0] create_graph_input contiguous_1 (none) ``` And the backward graph generate will be like ```log class GraphModule(torch.nn.Module): def forward(self, function_ctx, somefunc_forward_default: "f32[128, 4, 7]", contiguous: "f32[128, 4, 7]", contiguous_1: "f32[7]"): contiguous_1 = contiguous contiguous_2 = contiguous_1 # No stacktrace found for following nodes _set_grad_enabled = torch._C._set_grad_enabled(False) # File: /Users/bytedance/testtorch/test_custom_op_bug.py:61 in backward, code: grad_output.contiguous(), contiguous: "f32[128, 4, 7]" = somefunc_forward_default.contiguous(); somefunc_forward_default = None # File: /opt/tiger/pytorch/torch/_library/custom_ops.py:506 in __call__, code: return self._opoverload(args, *kwargs) somefunc_backward_default: "f32[7]" = torch.ops.mylib.somefunc_backward.default(contiguous, contiguous_1, contiguous_2, [7]); contiguous = contiguous_1 = contiguous_2 = None # No stacktrace found for following nodes _set_grad_enabled_1 = torch._C._set_grad_enabled(True) return (None, somefunc_backward_default) ``` The original code of `somefunc_backward` takes a input list of `grad_output`, `input_`, `weight` and `shape`, where `weight` should be shape of `torch.Size([7])`. However, in the graph, `contiguous1` and `contiguous_2` are assigned with `contiguous`, this leads to assertion failure I added in `somefunc_backward`. ## Environment ```log Collecting environment information... PyTorch version: 2.5.0a0+git0b7e8df Is debug build: False CUDA used to build PyTorch: None ROCM used to build PyTorch: N/A OS: macOS 14.5 (arm64) GCC version: Could not collect Clang version: 15.0.0 (clang-1500.3.9.4) CMake version: version 3.26.4 Libc version: N/A Python version: 3.9.19 (main, May 6 2024, 14:39:30) [Clang 14.0.6 ] (64-bit runtime) Python platform: macOS-14.5-arm64-arm-64bit Is CUDA available: False CUDA runtime version: No CUDA CUDA_MODULE_LOADING set to: N/A GPU models and configuration: No CUDA Nvidia driver version: No CUDA cuDNN version: No CUDA HIP runtime version: N/A MIOpen runtime version: N/A Is XNNPACK available: True CPU: Apple M3 Pro Versions of relevant libraries: [pip3] numpy==2.0.0 [pip3] optree==0.11.0 [pip3] torch==2.5.0a0+git0b7e8df [pip3] torchgraph==0.0.1 [conda] numpy 2.0.0 pypi_0 pypi [conda] optree 0.11.0 pypi_0 pypi [conda] torch 2.5.0a0+git0b7e8df dev_0 <develop> [conda] torchgraph 0.0.1 dev_0 <develop> ``` ## How to fix? I put a naive fix that add the potential args to lift into the used_names. This visits private variables, will fix that if this issue makes sense to you. @zou3519 @oulgen Co-authored-by: rzou <zou3519@gmail.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/129817 Approved by: https://github.com/zou3519	2024-07-15 18:49:12 +00:00
PyTorch MergeBot	1e897a0ca4	Revert "Fix names conflict when lifting (#129817 )" This reverts commit `74da2a467f`. Reverted https://github.com/pytorch/pytorch/pull/129817 on behalf of https://github.com/clee2000 due to broke dynamo/test_inline_inbuilt_nn_modules.py https://github.com/pytorch/pytorch/actions/runs/9940532858/job/27461141919 `74da2a467f`. Test passed on PR, possibly a landrace? ([comment](https://github.com/pytorch/pytorch/pull/129817#issuecomment-2228993570))	2024-07-15 17:09:52 +00:00
Zhanghan Wang	74da2a467f	Fix names conflict when lifting (#129817 ) ## Bug description When pending args that are potentially to be lift [here](`58f346c874/torch/_dynamo/output_graph.py (L1866)`) having same base name, like `contiguous` and `contiguous_1`, the call into [create_graph_input](`58f346c874/torch/_dynamo/output_graph.py (L2081)`) can finally create a name ([here](`58f346c874/torch/fx/graph.py (L1008)`)) that overwrite args to lift. And thus causing a wrong output of graph. ## Reproducing Below is an reproduceable example, ```python import logging from typing import List import torch from functorch.compile import aot_module_simplified, make_boxed_func @torch.library.custom_op("mylib::somefunc_forward", mutates_args=()) def somefunc_forward( input_: torch.Tensor, weight: torch.Tensor, shape: List[int], ) -> torch.Tensor: return torch.ones_like(input_) @somefunc_forward.register_fake def _(input_, shape, weight): return torch.empty_like(input_) @torch.library.custom_op("mylib::somefunc_backward", mutates_args=()) def somefunc_backward( grad_output: torch.Tensor, input_: torch.Tensor, weight: torch.Tensor, shape: List[int], ) -> torch.Tensor: print(f"backward.{grad_output.shape=}") print(f"backward.{input_.shape=}") print(f"backward.{weight.shape=}") print(f"backward.{shape=}") assert list(weight.shape) == shape return torch.ones_like(weight) @somefunc_backward.register_fake def _(grad_output, input_, weight, shape): return torch.empty_like(weight) def a_func(grad_output, input_, weight_, shape): return torch.ones_like(input_.sum() * weight_) class SomeFunc(torch.autograd.Function): @staticmethod def forward(ctx, input, weight, normalized_shape): ctx.normalized_shape = normalized_shape input_ = input.contiguous() weight_ = weight.contiguous() output = somefunc_forward(input_, weight_, ctx.normalized_shape) ctx.save_for_backward(input_, weight_) return output @staticmethod def backward(ctx, grad_output): input_, weight_ = ctx.saved_tensors # grad_weight = a_func(grad_output, input_, weight_, ctx.normalized_shape) grad_weight = somefunc_backward( grad_output.contiguous(), input_, weight_, ctx.normalized_shape, ) return None, grad_weight, None class MyModel(torch.nn.Module): def __init__(self): super().__init__() self.weight = torch.nn.Parameter(torch.ones(7)) def forward(self, x): return SomeFunc.apply(x, self.weight, [7]) model = MyModel() torch._logging.set_logs(dynamo=logging.DEBUG, aot=logging.DEBUG, graph_code=True) def aot_print_backend(gm, sample_inputs): # Forward compiler capture def fw(gm, sample_inputs): print(f"----- fw") gm.print_readable() return make_boxed_func(gm.forward) # Backward compiler capture def bw(gm, sample_inputs): print(f"----- bw") gm.print_readable() return make_boxed_func(gm.forward) # Call AOTAutograd gm_forward = aot_module_simplified( gm, sample_inputs, fw_compiler=fw, bw_compiler=bw ) return gm_forward model = torch.compile( model, backend=aot_print_backend, dynamic=False, ) out = model(torch.rand((128, 4, 7))) out.mean().backward() ``` I can see log that showing calling into create_graph_input like ```log V0629 02:08:46.839914 8200981504 torch/_dynamo/output_graph.py:2042] [0/0] create_graph_input contiguous (none) V0629 02:08:46.839998 8200981504 torch/_dynamo/output_graph.py:2042] [0/0] create_graph_input contiguous_1 (none) ``` And the backward graph generate will be like ```log class GraphModule(torch.nn.Module): def forward(self, function_ctx, somefunc_forward_default: "f32[128, 4, 7]", contiguous: "f32[128, 4, 7]", contiguous_1: "f32[7]"): contiguous_1 = contiguous contiguous_2 = contiguous_1 # No stacktrace found for following nodes _set_grad_enabled = torch._C._set_grad_enabled(False) # File: /Users/bytedance/testtorch/test_custom_op_bug.py:61 in backward, code: grad_output.contiguous(), contiguous: "f32[128, 4, 7]" = somefunc_forward_default.contiguous(); somefunc_forward_default = None # File: /opt/tiger/pytorch/torch/_library/custom_ops.py:506 in __call__, code: return self._opoverload(args, *kwargs) somefunc_backward_default: "f32[7]" = torch.ops.mylib.somefunc_backward.default(contiguous, contiguous_1, contiguous_2, [7]); contiguous = contiguous_1 = contiguous_2 = None # No stacktrace found for following nodes _set_grad_enabled_1 = torch._C._set_grad_enabled(True) return (None, somefunc_backward_default) ``` The original code of `somefunc_backward` takes a input list of `grad_output`, `input_`, `weight` and `shape`, where `weight` should be shape of `torch.Size([7])`. However, in the graph, `contiguous1` and `contiguous_2` are assigned with `contiguous`, this leads to assertion failure I added in `somefunc_backward`. ## Environment ```log Collecting environment information... PyTorch version: 2.5.0a0+git0b7e8df Is debug build: False CUDA used to build PyTorch: None ROCM used to build PyTorch: N/A OS: macOS 14.5 (arm64) GCC version: Could not collect Clang version: 15.0.0 (clang-1500.3.9.4) CMake version: version 3.26.4 Libc version: N/A Python version: 3.9.19 (main, May 6 2024, 14:39:30) [Clang 14.0.6 ] (64-bit runtime) Python platform: macOS-14.5-arm64-arm-64bit Is CUDA available: False CUDA runtime version: No CUDA CUDA_MODULE_LOADING set to: N/A GPU models and configuration: No CUDA Nvidia driver version: No CUDA cuDNN version: No CUDA HIP runtime version: N/A MIOpen runtime version: N/A Is XNNPACK available: True CPU: Apple M3 Pro Versions of relevant libraries: [pip3] numpy==2.0.0 [pip3] optree==0.11.0 [pip3] torch==2.5.0a0+git0b7e8df [pip3] torchgraph==0.0.1 [conda] numpy 2.0.0 pypi_0 pypi [conda] optree 0.11.0 pypi_0 pypi [conda] torch 2.5.0a0+git0b7e8df dev_0 <develop> [conda] torchgraph 0.0.1 dev_0 <develop> ``` ## How to fix? I put a naive fix that add the potential args to lift into the used_names. This visits private variables, will fix that if this issue makes sense to you. @zou3519 @oulgen Co-authored-by: rzou <zou3519@gmail.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/129817 Approved by: https://github.com/zou3519	2024-07-15 13:41:46 +00:00
Animesh Jain	8714b7fc69	[dynamo][cpp-guards] Use dict tags to skip guards on immutable dict getitems (#130654 ) Reduces the guard overhead from 3.7k units to 2.1k units. Pull Request resolved: https://github.com/pytorch/pytorch/pull/130654 Approved by: https://github.com/jansel	2024-07-13 15:31:10 +00:00
Pian Pawakapan	1b3b4c2fb9	[runtime asserts] deduplicate runtime asserts & CSE (#128599 ) (#130380 ) original PR: https://github.com/pytorch/pytorch/pull/128599 (re-created after revert + poisoned diff train) Summary: This PR adds deduplication and CSE for runtime asserts. Existing size computation in the graph is CSE'd along with added runtime asserts, and redundant asserts are removed. Shape calls on intermediate tensors are also turned into compute on input sizes if possible, allowing intermediate tensors to be freed earlier. For example: ``` z = torch.cat([x, x], dim=0) # 2s0 w = z.repeat(y.shape[0]) # 2s0s1 _w = w.shape[0] s0 = x.shape[0] s1 = y.shape[0] _w0 = 2 s0 _w = _w0 * s1 ``` Additionally, constrain_range calls are deduplicated. Single-symbol bound checks for unbacked symbols (e.g. u0 >= 0, u0 <= 5) and sym_constrain_range.default calls are also removed, since they accumulate range info in the ShapeEnv, and are replaced with two _assert_scalar.default calls that check the min/max bounds. For example: ``` torch.sym_constrain_range_for_size(n, min=2, max=16) torch.sym_constrain_range(n, min=4, max=20) torch._check(n >= 0) torch._check(n >= 3) torch._check(n <= 14) torch.sym_constrain_range_for_size(n) torch._check(n >= 4) torch._check(n <= 14) ``` Test Plan: contbuild & OSS CI, see `940e4477ab` Original Phabricator Test Plan: Imported from GitHub, without a `Test Plan:` line. Differential Revision: D59543603 Pull Request resolved: https://github.com/pytorch/pytorch/pull/130380 Approved by: https://github.com/izaitsevfb	2024-07-10 19:23:37 +00:00
PyTorch MergeBot	9c9744c3ac	Revert "[runtime asserts] deduplicate runtime asserts & CSE (#128599 )" This reverts commit `940e4477ab`. Reverted https://github.com/pytorch/pytorch/pull/128599 on behalf of https://github.com/izaitsevfb due to breaking internal APS tests, see D59498864 ([comment](https://github.com/pytorch/pytorch/pull/128599#issuecomment-2218724762))	2024-07-09 21:03:49 +00:00
Animesh Jain	f053be2a97	[dynamo] Graph break on random_ op (#130222 ) Fixes https://github.com/pytorch/pytorch/issues/121621 Pull Request resolved: https://github.com/pytorch/pytorch/pull/130222 Approved by: https://github.com/jansel	2024-07-08 06:10:24 +00:00
Pian Pawakapan	940e4477ab	[runtime asserts] deduplicate runtime asserts & CSE (#128599 ) This PR adds deduplication and CSE for runtime asserts. Existing size computation in the graph is CSE'd along with added runtime asserts, and redundant asserts are removed. Shape calls on intermediate tensors are also turned into compute on input sizes if possible, allowing intermediate tensors to be freed earlier. For example: ``` z = torch.cat([x, x], dim=0) # 2s0 w = z.repeat(y.shape[0]) # 2s0s1 _w = w.shape[0] # something with _w ... # turns into -> s0 = x.shape[0] s1 = y.shape[0] _w0 = 2 s0 _w = _w0 * s1 ``` Additionally, constrain_range calls are deduplicated. Single-symbol bound checks for unbacked symbols (e.g. u0 >= 0, u0 <= 5) and sym_constrain_range.default calls are also removed, since they accumulate range info in the ShapeEnv, and are replaced with two _assert_scalar.default calls that check the min/max bounds. For example: ``` torch.sym_constrain_range_for_size(n, min=2, max=16) torch.sym_constrain_range(n, min=4, max=20) torch._check(n >= 0) torch._check(n >= 3) torch._check(n <= 14) # turns into torch.sym_constrain_range_for_size(n) torch._check(n >= 4) torch._check(n <= 14) ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/128599 Approved by: https://github.com/ezyang	2024-07-07 20:10:14 +00:00
Xuehai Pan	a3ce9eddd6	[BE][Easy] apply autofix for ruff rule unnecessary-literal-set (C405) and unnecessary-map (C417) (#130198 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/130198 Approved by: https://github.com/Skylion007	2024-07-07 00:58:22 +00:00
PyTorch MergeBot	963f430d13	Revert "[runtime asserts] deduplicate runtime asserts & CSE (#128599 )" This reverts commit `0267b2ddcb`. Reverted https://github.com/pytorch/pytorch/pull/128599 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it seems to cause a landrace and fails inductor/test_cudagraph_trees in trunk `0267b2ddcb` ([comment](https://github.com/pytorch/pytorch/pull/128599#issuecomment-2211690518))	2024-07-06 07:20:05 +00:00
Pian Pawakapan	0267b2ddcb	[runtime asserts] deduplicate runtime asserts & CSE (#128599 ) This PR adds deduplication and CSE for runtime asserts. Existing size computation in the graph is CSE'd along with added runtime asserts, and redundant asserts are removed. Shape calls on intermediate tensors are also turned into compute on input sizes if possible, allowing intermediate tensors to be freed earlier. For example: ``` z = torch.cat([x, x], dim=0) # 2s0 w = z.repeat(y.shape[0]) # 2s0s1 _w = w.shape[0] # something with _w ... # turns into -> s0 = x.shape[0] s1 = y.shape[0] _w0 = 2 s0 _w = _w0 * s1 ``` Additionally, constrain_range calls are deduplicated. Single-symbol bound checks for unbacked symbols (e.g. u0 >= 0, u0 <= 5) and sym_constrain_range.default calls are also removed, since they accumulate range info in the ShapeEnv, and are replaced with two _assert_scalar.default calls that check the min/max bounds. For example: ``` torch.sym_constrain_range_for_size(n, min=2, max=16) torch.sym_constrain_range(n, min=4, max=20) torch._check(n >= 0) torch._check(n >= 3) torch._check(n <= 14) # turns into torch.sym_constrain_range_for_size(n) torch._check(n >= 4) torch._check(n <= 14) ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/128599 Approved by: https://github.com/ezyang	2024-07-06 03:44:49 +00:00
Edward Z. Yang	29c68df600	Stop immediately specializing common constants 0/1 for plain int (#128327 ) Fixes https://github.com/pytorch/pytorch/issues/128319 Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/128327 Approved by: https://github.com/lezcano ghstack dependencies: #129983	2024-07-03 16:41:51 +00:00
Edward Z. Yang	8af58f66bb	Fix typo in floordiv solver code that affects flipped relation (#129888 ) Fixes https://github.com/pytorch/pytorch/issues/123535 Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/129888 Approved by: https://github.com/lezcano	2024-07-03 04:47:32 +00:00
Simon Fan	be2d79a16b	[dynamic] config to disable duck sizing (#129804 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/129804 Approved by: https://github.com/ezyang	2024-07-03 00:20:54 +00:00
PyTorch MergeBot	c22e66896f	Revert "Fix typo in floordiv solver code that affects flipped relation (#129888 )" This reverts commit `3c6c3b9448`. Reverted https://github.com/pytorch/pytorch/pull/129888 on behalf of https://github.com/huydhn due to Sorry for reverting your change but the updated test starts to fail flakily in trunk somehow, so I am reverting the change to see if it helps ([comment](https://github.com/pytorch/pytorch/pull/129888#issuecomment-2204442653))	2024-07-02 21:16:59 +00:00
Edward Z. Yang	3c6c3b9448	Fix typo in floordiv solver code that affects flipped relation (#129888 ) Fixes https://github.com/pytorch/pytorch/issues/123535 Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/129888 Approved by: https://github.com/lezcano	2024-07-02 11:15:03 +00:00
Brian Hirsh	b91a9dc328	[Brian's PR #128754 ] Use torch.ops.fsdp.set_ for FSDP2 storage resize; dont functionalize resize_, set_, split_with_sizes_copy.out (#129203 ) This is a copy of Brian's PR https://github.com/pytorch/pytorch/pull/128754, with some changes in the test_distributed_patterns.py unit tests to more closely reflect FSDP2 patterns. Also disabled two tests `test_input_mutation_storage_resize_up_down` and `test_input_mutation_storage_resize_not_supported` in test_aotdispatch.py until we figure out the right behavior for them. Pull Request resolved: https://github.com/pytorch/pytorch/pull/129203 Approved by: https://github.com/bdhirsh	2024-06-23 06:07:19 +00:00
Animesh Jain	6b5fbc544e	[dynamo] Use polyfill to trace through the attributes of torch.jit.* and lru_cache_wrapper (#128336 ) Earlier we were taking the vt for `obj` and then monkeypatching that `vt.source` to be `obj._torchdynamo_inline`. If one accesses `obj.attr_a`, this would cause problems because Dynamo would then search it in `obj._torchdynamo_inline.attr_a`. This PR makes it more functional, so that we have different vts for obj and `ob._torchdynamo_inline`. Fixes https://github.com/pytorch/pytorch/issues/93698 Pull Request resolved: https://github.com/pytorch/pytorch/pull/128336 Approved by: https://github.com/jansel, https://github.com/yanboliang ghstack dependencies: #129117	2024-06-21 07:44:44 +00:00
chilli	a2b1673dfb	[Horace's PR #126446 ] Prevent partitioner from ever saving views (#129039 ) Most work is done by Horace in https://github.com/pytorch/pytorch/issues/126446, this PR just additionally adds the config for it. Pull Request resolved: https://github.com/pytorch/pytorch/pull/129039 Approved by: https://github.com/Chillee	2024-06-19 23:21:16 +00:00
Animesh Jain	c5e0b84484	[dynamo][trace_rules] Remove incorrectly classified Ingraph functions (#128428 ) Co-authored-by: Laith Sakka <lsakka@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/128428 Approved by: https://github.com/yanboliang, https://github.com/mlazos	2024-06-19 00:06:46 +00:00
PyTorch MergeBot	5bc9835d64	Revert "[dynamo][trace_rules] Remove incorrectly classified Ingraph functions (#128428 )" This reverts commit `c52eda896e`. Reverted https://github.com/pytorch/pytorch/pull/128428 on behalf of https://github.com/anijain2305 due to luca saw bad compile time ([comment](https://github.com/pytorch/pytorch/pull/128453#issuecomment-2176877667))	2024-06-18 20:09:00 +00:00
Animesh Jain	b0282071c4	[dynamo] override torch.nn.modules.activation._is_make_fx_tracing (#128748 ) Discovered while inlining `MultiHeadAttention` nn Module. Pull Request resolved: https://github.com/pytorch/pytorch/pull/128748 Approved by: https://github.com/jansel ghstack dependencies: #128315	2024-06-17 08:49:29 +00:00
Animesh Jain	7e092a62e6	[dynamo] Support weakref objects (#128533 ) Fixes https://github.com/pytorch/pytorch/issues/125720 I was earlier worried that DELETE_* or STORE_* on referent values should result in a graph break, because they could invalidate the weak ref. But then @zou3519 pointed out that weakref invalidation will happen EVENTUALLY, CPython provides no guarantees when the weakref will be invalidated (even when the user calls del x and x is the last reference). So any code that relies on del x to invalidate the weakref of x right away is BAD code. CPython provide no guarantees. Therefore we can (ab)use this nuance, and can just ignore DELETE_* or STORE_* on the referent objects. The only corner case is when Dynamo is reconstructing the weakref object. Dynamo will have a hard time being correct here, so just SKIP_FRAME on such a case. This is rare. Cpython notes 1) https://docs.python.org/3/library/weakref.html 2) https://docs.python.org/3/reference/datamodel.html#index-2 Pull Request resolved: https://github.com/pytorch/pytorch/pull/128533 Approved by: https://github.com/jansel	2024-06-15 02:16:25 +00:00
Simon Fan	4b96575a09	[dynamo][aot autograd] Silently disable default saved tensor hooks during tracing (#123196 ) FIXES #113263. Same idea as in https://github.com/pytorch/pytorch/pull/113417, but we need a more intrusive C API to silently nop default saved tensor hooks, in order to support user-code that use torch.autograd.disable_saved_tensors_hooks (see test_unpack_hooks_can_be_disabled). We mock the output of get_hooks while leaving push/pop untouched. For compiled autograd, we're firing pack hooks once and unpack hooks twice right now, I'll look into this separately from this issue. Pull Request resolved: https://github.com/pytorch/pytorch/pull/123196 Approved by: https://github.com/soulitzer	2024-06-14 20:28:08 +00:00
Animesh Jain	c52eda896e	[dynamo][trace_rules] Remove incorrectly classified Ingraph functions (#128428 ) Co-authored-by: Laith Sakka <lsakka@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/128428 Approved by: https://github.com/yanboliang, https://github.com/mlazos ghstack dependencies: #126578, #128440, #128470, #128453, #128484	2024-06-13 06:08:56 +00:00
Animesh Jain	05711eece9	[dynamo][inlining inbuilt modules] Ensure BC for nn_module_stack (#128295 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/128295 Approved by: https://github.com/ydwu4	2024-06-10 23:11:04 +00:00
laithsakka	5b3624117a	update test_issue175 to handle inline_inbuilt_nn_modules (#128026 ) with inlining the output graph have more function calls reflecting those on the test that count number of function calls. Pull Request resolved: https://github.com/pytorch/pytorch/pull/128026 Approved by: https://github.com/anijain2305 ghstack dependencies: #127553	2024-06-07 22:07:16 +00:00
laithsakka	78a6b0c479	update test_reformer_train test to handle nn module inlining (#127467 ) number of call nodes increase due to inlining before inlining: ``` class GraphModule(torch.nn.Module): def forward(self, function_ctx, cat: "f32[1, s0, 512]"): # No stacktrace found for following nodes _set_grad_enabled = torch._C._set_grad_enabled(False) # File: /data/users/lsakka/pytorch/pytorch/test/dynamo/test_repros.py:283 in backward, code: grad_attn_output, grad_hidden_states = torch.chunk( chunk = torch.chunk(cat, 2, dim = -1); cat = None getitem: "f32[1, s0, 256]" = chunk[0] getitem_1: "f32[1, s0, 256]" = chunk[1]; chunk = None # No stacktrace found for following nodes _set_grad_enabled_1 = torch._C._set_grad_enabled(True) return (getitem_1, None) ``` after inlining: ``` class GraphModule(torch.nn.Module): def forward(self, s0: "Sym(s0)", L_hidden_states_: "f32[1, s0, 256]", L_self_layers_0_weight: "f32[256, 256]", L_self_layers_0_bias: "f32[256]", L_self_layer_norm_weight: "f32[512]", L_self_layer_norm_bias: "f32[512]", L_self_layer_norm_normalized_shape_0_: "Sym(512)"): l_hidden_states_ = L_hidden_states_ l_self_layers_0_weight = L_self_layers_0_weight l_self_layers_0_bias = L_self_layers_0_bias l_self_layer_norm_weight = L_self_layer_norm_weight l_self_layer_norm_bias = L_self_layer_norm_bias l_self_layer_norm_normalized_shape_0_ = L_self_layer_norm_normalized_shape_0_ # File: /data/users/lsakka/pytorch/pytorch/test/dynamo/test_repros.py:332 in forward, code: hidden_states = torch.cat([hidden_states, hidden_states], dim=-1) hidden_states: "f32[1, s0, 512]" = torch.cat([l_hidden_states_, l_hidden_states_], dim = -1); l_hidden_states_ = None # File: /data/users/lsakka/pytorch/pytorch/test/dynamo/test_repros.py:333 in forward, code: hidden_states = _ReversibleFunction.apply( function_ctx = torch.autograd.function.FunctionCtx() # File: /data/users/lsakka/pytorch/pytorch/test/dynamo/test_repros.py:258 in forward, code: hidden_states, attn_output = torch.chunk(hidden_states, 2, dim=-1) chunk = torch.chunk(hidden_states, 2, dim = -1); hidden_states = None hidden_states_1: "f32[1, s0, 256]" = chunk[0] attn_output: "f32[1, s0, 256]" = chunk[1]; chunk = None # File: /data/users/lsakka/pytorch/pytorch/torch/nn/modules/linear.py:116 in forward, code: return F.linear(input, self.weight, self.bias) attn_output_1: "f32[1, s0, 256]" = torch._C._nn.linear(attn_output, l_self_layers_0_weight, l_self_layers_0_bias); attn_output = l_self_layers_0_weight = l_self_layers_0_bias = None # File: /data/users/lsakka/pytorch/pytorch/test/dynamo/test_repros.py:272 in forward, code: ctx.save_for_backward(attn_output.detach(), hidden_states.detach()) detach: "f32[1, s0, 256]" = attn_output_1.detach() detach_1: "f32[1, s0, 256]" = hidden_states_1.detach() # File: /data/users/lsakka/pytorch/pytorch/test/dynamo/test_repros.py:279 in forward, code: return torch.cat([attn_output, hidden_states], dim=-1) hidden_states_2: "f32[1, s0, 512]" = torch.cat([attn_output_1, hidden_states_1], dim = -1); attn_output_1 = hidden_states_1 = None # File: /data/users/lsakka/pytorch/pytorch/torch/nn/modules/normalization.py:201 in forward, code: return F.layer_norm( hidden_states_3: "f32[1, s0, 512]" = torch.nn.functional.layer_norm(hidden_states_2, (l_self_layer_norm_normalized_shape_0_,), l_self_layer_norm_weight, l_self_layer_norm_bias, 1e-12); hidden_states_2 = l_self_layer_norm_normalized_shape_0_ = l_self_layer_norm_weight = l_self_layer_norm_bias = None # File: /data/users/lsakka/pytorch/pytorch/test/dynamo/test_repros.py:352 in forward, code: hidden_states = torch.nn.functional.dropout( hidden_states_4: "f32[1, s0, 512]" = torch.nn.functional.dropout(hidden_states_3, p = 0.5, training = True); hidden_states_3 = None return (hidden_states_4,) ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/127467 Approved by: https://github.com/anijain2305 ghstack dependencies: #126444, #127146, #127424, #127440	2024-06-06 17:56:36 +00:00
Animesh Jain	c7e936a56a	[dynamo] Tensorvariable - track grad with _grad field (#127785 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/127785 Approved by: https://github.com/jansel	2024-06-04 18:25:46 +00:00
Animesh Jain	4aa7a1efcf	[dynamo] Initial exception handling support (#126923 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/126923 Approved by: https://github.com/williamwen42, https://github.com/jansel	2024-06-01 13:00:32 +00:00
Animesh Jain	159632aecd	[dynamo] Support hasattr on BuiltinVariable (#127372 ) Fixes https://github.com/pytorch/pytorch/issues/127172 Pull Request resolved: https://github.com/pytorch/pytorch/pull/127372 Approved by: https://github.com/williamwen42, https://github.com/yanboliang ghstack dependencies: #127377	2024-05-31 04:23:56 +00:00
Animesh Jain	51b22d9cf2	[dynamo] Support enum construction (#127364 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/127364 Approved by: https://github.com/yanboliang ghstack dependencies: #127263	2024-05-29 08:09:21 +00:00
Michael Hsu	85172fbe84	Back out "Prevent partitioner from ever saving views (#126446 )" (#127316 ) Summary: Revert "Prevent partitioner from ever saving views (#126446)" due to a torchinductor failure on CU Training Framework tests. Reviewed By: Chillee Differential Revision: D57868343 Pull Request resolved: https://github.com/pytorch/pytorch/pull/127316 Approved by: https://github.com/Chillee	2024-05-29 00:29:44 +00:00
Xuehai Pan	26f4f10ac8	[5/N][Easy] fix typo for `usort` config in `pyproject.toml` (`kown` -> `known`): sort torch (#127126 ) The `usort` config in `pyproject.toml` has no effect due to a typo. Fixing the typo make `usort` do more and generate the changes in the PR. Except `pyproject.toml`, all changes are generated by `lintrunner -a --take UFMT --all-files`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/127126 Approved by: https://github.com/kit1980	2024-05-27 14:49:57 +00:00
PyTorch MergeBot	55c0ab2887	Revert "[5/N][Easy] fix typo for `usort` config in `pyproject.toml` (`kown` -> `known`): sort torch (#127126 )" This reverts commit `7763c83af6`. Reverted https://github.com/pytorch/pytorch/pull/127126 on behalf of https://github.com/XuehaiPan due to Broken CI ([comment](https://github.com/pytorch/pytorch/pull/127126#issuecomment-2133044286))	2024-05-27 09:22:08 +00:00
Xuehai Pan	7763c83af6	[5/N][Easy] fix typo for `usort` config in `pyproject.toml` (`kown` -> `known`): sort torch (#127126 ) The `usort` config in `pyproject.toml` has no effect due to a typo. Fixing the typo make `usort` do more and generate the changes in the PR. Except `pyproject.toml`, all changes are generated by `lintrunner -a --take UFMT --all-files`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/127126 Approved by: https://github.com/kit1980 ghstack dependencies: #127122, #127123, #127124, #127125	2024-05-27 04:22:18 +00:00
Xuehai Pan	a28bfb5ed5	[4/N][Easy] fix typo for `usort` config in `pyproject.toml` (`kown` -> `known`): sort functorch (#127125 ) The `usort` config in `pyproject.toml` has no effect due to a typo. Fixing the typo make `usort` do more and generate the changes in the PR. Except `pyproject.toml`, all changes are generated by `lintrunner -a --take UFMT --all-files`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/127125 Approved by: https://github.com/Skylion007 ghstack dependencies: #127122, #127123, #127124	2024-05-25 22:45:38 +00:00
Yidi Wu	2d6d2dbc0b	[dynamo] make callable(nn_module) return True (#127026 ) Before the pr, we have a graph break for `callable(nn_module)`: ```python class M(nn.Module): def forward(self, x): return x.sin() def f(m): return callable(m) res = torch.compile(f, fullgraph=True)(M()) ``` ``` Traceback (most recent call last): File "/data/users/yidi/pytorch/t.py", line 17, in <module> out = torch.compile(f, backend="eager", fullgraph=True)(M()) File "/data/users/yidi/pytorch/torch/_dynamo/eval_frame.py", line 414, in _fn return fn(args, kwargs) File "/data/users/yidi/pytorch/torch/_dynamo/convert_frame.py", line 1077, in catch_errors return callback(frame, cache_entry, hooks, frame_state, skip=1) File "/data/users/yidi/pytorch/torch/_dynamo/convert_frame.py", line 456, in _convert_frame_assert return _compile( File "/data/users/yidi/pytorch/torch/_utils_internal.py", line 74, in wrapper_function return function(args, *kwargs) File "/home/yidi/.conda/envs/pytorch/lib/python3.10/contextlib.py", line 79, in inner return func(args, *kwds) File "/data/users/yidi/pytorch/torch/_dynamo/convert_frame.py", line 799, in _compile guarded_code = compile_inner(code, one_graph, hooks, transform) File "/data/users/yidi/pytorch/torch/_dynamo/utils.py", line 210, in time_wrapper r = func(args, *kwargs) File "/data/users/yidi/pytorch/torch/_dynamo/convert_frame.py", line 618, in compile_inner out_code = transform_code_object(code, transform) File "/data/users/yidi/pytorch/torch/_dynamo/bytecode_transformation.py", line 1167, in transform_code_object transformations(instructions, code_options) File "/data/users/yidi/pytorch/torch/_dynamo/convert_frame.py", line 177, in _fn return fn(args, **kwargs) File "/data/users/yidi/pytorch/torch/_dynamo/convert_frame.py", line 564, in transform tracer.run() File "/data/users/yidi/pytorch/torch/_dynamo/symbolic_convert.py", line 2244, in run super().run() File "/data/users/yidi/pytorch/torch/_dynamo/symbolic_convert.py", line 886, in run while self.step(): File "/data/users/yidi/pytorch/torch/_dynamo/symbolic_convert.py", line 801, in step self.dispatch_table[inst.opcode](self, inst) File "/data/users/yidi/pytorch/torch/_dynamo/symbolic_convert.py", line 496, in wrapper return inner_fn(self, inst) File "/data/users/yidi/pytorch/torch/_dynamo/symbolic_convert.py", line 1255, in CALL_FUNCTION self.call_function(fn, args, {}) File "/data/users/yidi/pytorch/torch/_dynamo/symbolic_convert.py", line 739, in call_function self.push(fn.call_function(self, args, kwargs)) File "/data/users/yidi/pytorch/torch/_dynamo/variables/builtin.py", line 948, in call_function return handler(tx, args, kwargs) File "/data/users/yidi/pytorch/torch/_dynamo/variables/builtin.py", line 711, in <lambda> return lambda tx, args, kwargs: obj.call_function( File "/data/users/yidi/pytorch/torch/_dynamo/variables/builtin.py", line 948, in call_function return handler(tx, args, kwargs) File "/data/users/yidi/pytorch/torch/_dynamo/variables/builtin.py", line 835, in builtin_dipatch unimplemented(error_msg) File "/data/users/yidi/pytorch/torch/_dynamo/exc.py", line 216, in unimplemented raise Unsupported(msg) torch._dynamo.exc.Unsupported: builtin: callable [<class 'torch._dynamo.variables.nn_module.NNModuleVariable'>] False ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/127026 Approved by: https://github.com/jansel	2024-05-24 18:31:43 +00:00
Animesh Jain	f0366de414	[dynamo] Support __contains__ on obj.__dict__ (#126922 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/126922 Approved by: https://github.com/jansel, https://github.com/yanboliang	2024-05-23 09:01:29 +00:00
chilli	d4ec18bdad	Prevent partitioner from ever saving views (#126446 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/126446 Approved by: https://github.com/anijain2305 ghstack dependencies: #126615	2024-05-22 17:28:46 +00:00
PyTorch MergeBot	0f37fd06d9	Revert "Prevent partitioner from ever saving views (#126446 )" This reverts commit `da2292ce6b`. Reverted https://github.com/pytorch/pytorch/pull/126446 on behalf of https://github.com/DanilBaibak due to Break internal build ([comment](https://github.com/pytorch/pytorch/pull/126615#issuecomment-2124169157))	2024-05-22 08:23:40 +00:00
chilli	da2292ce6b	Prevent partitioner from ever saving views (#126446 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/126446 Approved by: https://github.com/anijain2305 ghstack dependencies: #126615	2024-05-20 23:40:56 +00:00
Huy Do	abc4b66124	Forward fix the failed new test from D57474327 (#126596 ) Summary: TSIA. The two looks the same to me, but buck was failing with the following error when `with torch._inductor.utils.fresh_inductor_cache()` is used: ``` _________________________ ReproTests.test_issue126128 __________________________ self = <caffe2.test.dynamo.test_repros.ReproTests testMethod=test_issue126128> def test_issue126128(self): def fn(): x = torch.randn(1, 10) y = torch.randn(10, 1) return torch.mm(x, y).sum() def fn2(): x = torch.randn(10, 100) y = torch.randn(100, 10) return torch.mm(x, y).sum() > with torch._inductor.utils.fresh_inductor_cache(): E AttributeError: module 'torch._inductor' has no attribute 'utils' ``` Test Plan: `buck2 test 'fbcode//mode/opt' fbcode//caffe2/test/dynamo:test_dynamo -- --exact 'caffe2/test/dynamo:test_dynamo - test_repros.py::ReproTests::test_issue126128'` Differential Revision: D57516676 Pull Request resolved: https://github.com/pytorch/pytorch/pull/126596 Approved by: https://github.com/xmfan	2024-05-18 23:56:03 +00:00
Matthew Hoffman	81277baa0c	Remove removed ruff rule TRY200 (#126256 ) My TOML linter is complaining that "TRY200" is not acceptable for the `tool.ruff.lint` schema. From the ruff docs: https://docs.astral.sh/ruff/rules/reraise-no-cause/ > This rule has been removed and its documentation is only available for historical reasons. > > This rule is identical to [B904](https://docs.astral.sh/ruff/rules/raise-without-from-inside-except/) which should be used instead. and we are currently explicitly ignoring B904. Pull Request resolved: https://github.com/pytorch/pytorch/pull/126256 Approved by: https://github.com/Skylion007	2024-05-17 16:31:05 +00:00
Simon Fan	cef7756c9c	[inductor] Clear cache on ctx manager exit (#126146 ) FIXES https://github.com/pytorch/pytorch/issues/126128. Right now, we only clear the cache on ctx manager enter. So state is bad unless we call fresh_inductor_cache again, usually fine in tests. Cue compiled autograd tests when going from TestCompiledAutograd -> TestAutogradWithCompiledAutograd. TestCompiledAutograd uses the ctx manager, but TestAutogradWithCompiledAutograd don't Pull Request resolved: https://github.com/pytorch/pytorch/pull/126146 Approved by: https://github.com/jgong5, https://github.com/oulgen ghstack dependencies: #126144	2024-05-16 22:23:02 +00:00
William Wen	56a89fcc08	[dynamo] graph break on issubclass call with non-const args (#125943 ) Fixes https://github.com/pytorch/pytorch/issues/125942 Pull Request resolved: https://github.com/pytorch/pytorch/pull/125943 Approved by: https://github.com/jansel ghstack dependencies: #125882	2024-05-15 23:22:06 +00:00
William Wen	100e3c1205	[dynamo] graph break on const dict KeyError (#125882 ) Fixes https://github.com/pytorch/pytorch/issues/125866 Pull Request resolved: https://github.com/pytorch/pytorch/pull/125882 Approved by: https://github.com/jansel	2024-05-15 23:22:06 +00:00
William Wen	4a8db9d45b	[dynamo] reset grad state in aotdispatch test, add failing trace functional tensor test to dynamo (#126113 ) Workaround for https://github.com/pytorch/pytorch/issues/125568. We could add additional global state to reset (e.g. autocast?) or move this setup/teardown to a more general place. Also added a minimal repro for the linked issue - will investigate in a followup PR. Pull Request resolved: https://github.com/pytorch/pytorch/pull/126113 Approved by: https://github.com/ezyang, https://github.com/bdhirsh	2024-05-14 20:42:49 +00:00
Edward Z. Yang	db3b38202b	Improve dead code elimination of unnecessary int arguments (#126074 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/126074 Approved by: https://github.com/lezcano ghstack dependencies: #125325, #125915	2024-05-14 17:22:30 +00:00
Brian Hirsh	f25c7c9699	functionalize storage resizing, minimal ppFSDP traceable forward (#122434 ) More details further down, but first a more high-level description of "how do we functionalize storage resizing" Today, dynamo converts `param.untyped_storage().resize_(x)` calls that it sees from fsdp into a custom op, `ops.inductor.resize_storage_bytes_(x)` So given this setup, there are 3 main cases that I think we want to handle: (1) graph input starts with a real storage size, gets resized down to zero in the graph (2) graph input starts with 0 storage size, gets resized up in the graph (3) graph input starts with 0 storage size, gets resized up and used in some compute, then resized back down to 0 For case (1) we need to emit a `resize_storage_bytes_` at the end of the graph, similar to how we emit `copy_()` for data mutations. For case (2), we need to emit a `resize_storage_bytes_` in the graph, and we also need to emit a `copy_()` (the input had its storage resized up, and filled in with data, which is we need to reflect as an input mutation) For case (3), the net effect is that the input had no data on entry and exit of the function, so we don't need to emit any mutable ops in the end of the graph. The main thing to call out is that: we need to write a functionalization rule for `resize_storage_byte_`, (`FunctionalTensorWrapper::storage_resize_()`) and this rule actually does very little. We would like to not emit any new ops in the graph (like say, a functional resize op). Instead, we should expect / rely on the fact that any resize up will be immediately followed by a `copy_()`/`foreach_copy_`/`out=` op, that will fill in the data of the tensor. So `FunctionalTensor` can temporarily live in a state where its data is invalid, until the `x.copy_(y)` "updates" its data with the new tensor. So effectively, all that this rule does is: (1) it stores metadata on the storage, indicating that the tensor was resized, as well as the updated storage size. We need this info in AOTAutograd, so it knows whether to emit a mutable resize_() op in the graph epilogue (2) There is also a corner case: if we are resizing down to zero, but our tensor had previously had a zero size storage, then we update `value_` to point to the original value of the tensor. The reason this seems safe is because if we have a zero storage sized tensor `x`, and we resize it up, use it in some compute, resize it back down to zero, and use it somewhere, we would want the functional version of this code to use the original `x` after the second resize. For FSDP, this is important because we end up saving parameters (graph inputs) for backward, and we want to make sure that the thing we save (and the output to the forward graph) is the original, zero-storage-sized parameter, and not the "version 2" of the parameter after the first resize_() I think a good order to look at changes in this PR would be: (1) `test_aotdispatch.py` shows the 3 main cases I focused on as well as the expected functionalized graphs (2) In `FunctionalStorageImpl.h/cpp`, I had to add a notion of "original base", and "original/curr_size". The first is so I can re-use the zero-size tensor after multiple resizes, and the second is so I can tell in AOTAutograd whether any resizes canceled each other out into a no-op (3) FunctionalTensorWrapper.h/cpp has the new resize functionalizion rule + some extra utils (4) `_functorch/_autograd`: the main changes in this folder were around adding the logic at trace-time to detect when we need to put a resize_() in the graph. I also have some assertions to check that any inputs that experience storage resizing will always be in the graph and not the opaque epilogue, and I also limited the resize_() mutation case so that you can only ever start with zero storage, or end with zero storage (you can't do e.g. `torch.ones(2).storage().resize_(3)`), and banned it on tensor subclasses (5) `fake_tensor.py`/`meta_utils.py`: we now need to be able to fakeify tensors with zero storage, so I added a quick version of it in meta_utils.py. This also.. has ramifications for fake tensor caching that I need to fix (include the storage size on the cache key, maybe?) ------------------ This PR subsumes https://github.com/pytorch/pytorch/pull/120971. This PR is enough to almost get a simple ppFSDP forward pass tracing with a functionalized resize_() properly. It also attempts to do the updated version from @jansel, where we don't have any notion of `resize_()` in the graph at all, post functionalization. It would probably be good to test it with @yf225 's FSDP changes, and see how many of the FX passes it allows us to remove. I think that in theory, it should allow us to remove all FX passes that affect the forward graph / partitioner, except the one that forces views to be recomputed in the backward (more details below). There are a few things worth calling out: (1) failed attempt at functionalizing `aten.copy_()`. I originally wanted to get a version takes these operations: ``` param.storage().resize_(all_gather_size) param.copy_(all_gather_buffer) out = aten.matmul(param, param) ``` and functionalizes them into: ``` out = aten.matmul(all_gather_buffer, all_gather_buffer) ``` This would involve getting functionalization to turn `x.copy_(y)` into a giant no-op that just returns `y`. Unfortunately, we can't actually do this in a reasonable way within functionalization (instead, there's a functional `aten.copy` in the graph - see the test case graph expecttest for details). Why? In order for that transformation to be safe, `x` and `y` need to have the same metadata. However, it's possible for `x` and `y` to be subclasses of different types. This is not something we can easily tell from within functionalization, and would be a layering violation. So for now I'm leaving it to downstream code to optimize away the `aten.copy` (this is already the case today, so I think inductor can handle this) (2) The forward doesn't actually run successfully in this PR (see the `assertRaisesRegex` in the test). Why? The final forward graph looks like this: ``` def forward(self, primals_1, primals_2): _foreach_copy = torch.ops.aten._foreach_copy.default([primals_1], [primals_2]); primals_2 = None getitem = _foreach_copy[0]; _foreach_copy = None mm = torch.ops.aten.mm.default(getitem, getitem); getitem = None t_1 = torch.ops.aten.t.default(primals_1); primals_1 = None return [mm, t_1] ``` Where `primals_1` starts out as a secretly-zero-storage-size parameter, and gets resized up and back down within the forward (these are functionalized away). Importantly, the matmul happy on the result of the `foreach_copy`, but the activation that we save for backward (`t_1`) is the result of transposing the original parameter (the zero-storage-size param). This is exactly the optimization in fsdp that allows us to have good peak memory usage. The problem is that the min-cut partitioner decides to save `t_1` for backward. Running this code in eager breaks, because the kernel for `aten.permute(x)` is not happy when `x` has secretly-zero-sized-storage. The real problem here is that in eager mode the `permute` kernel runs during the backward, after backward hooks have properly resized the saved activation. Here, we are running the transpose in the forward. One option would be to turn off the checks in our view kernels and allow them to work on zero-storage-sized tensors, which feels pretty bad. Another option is to tweak the partitioner (or use one of Will's FX passes) to force the partitioner to not save views for backward, and allow the views to be recomputed in the backward. This seems kind of silly, but is also probably harmless. (3) The backward is still broken. To be fair, this issue is pretty separable from "functionalizing storage resize calls", and can be fixed later (either by a real fix to our tracing infra, or via another hacky FX pass). More description of this problem is described at issue (8) of my PR description in https://github.com/pytorch/pytorch/pull/120971 (4) I only added support for "full graph" resizing: basically, the limited case where a param starts with zero storage size, and gets resized up and back down. I think we can add support for the graph break case, but I think we can keep that add-on separate from this PR unless we need it immediately. I also added asserts so we should fail loudly when we hit this case (5) I have a change to FakeTensor creation when inputs have zero storage size that.. is probably ok. But I also removed FakeTensor caching on view ops, which I probably need to fix before I can land this PR (6) I added a notion of "original_base" to `FunctionalStorageImpl`. More details are in the comments, but my rational for this was that we basically need it to ensure that autograd saves the original, zero-storage-sized param for backward, after resizing up and back down (7) I had to update our eager kernels for `aten.copy` and `aten._foreach_copy`, to handle the case where the `self` argument has secretly-zero-storage. Inductor can probably generate correct code for this case, but we need these ops to work properly in this situation for the `aot_eager` backend to do the right thing Pull Request resolved: https://github.com/pytorch/pytorch/pull/122434 Approved by: https://github.com/jansel	2024-05-10 18:09:10 +00:00
William Wen	ae20f15941	[dynamo] trace through nn parametrize (#125771 ) Fix https://github.com/pytorch/pytorch/issues/120914 Example dynamo output graph (from test_nn_parametrize): ``` V0508 11:16:26.687000 140092517021504 torch/_dynamo/output_graph.py:1272] [0/0] [__graph_code] TRACED GRAPH V0508 11:16:26.687000 140092517021504 torch/_dynamo/output_graph.py:1272] [0/0] [__graph_code] ===== __compiled_fn_1 ===== V0508 11:16:26.687000 140092517021504 torch/_dynamo/output_graph.py:1272] [0/0] [__graph_code] /data/users/williamwen/pytorch2/torch/fx/_lazy_graph_module.py class GraphModule(torch.nn.Module): V0508 11:16:26.687000 140092517021504 torch/_dynamo/output_graph.py:1272] [0/0] [__graph_code] def forward(self, L_x_: "f32[10, 10]"): V0508 11:16:26.687000 140092517021504 torch/_dynamo/output_graph.py:1272] [0/0] [__graph_code] l_x_ = L_x_ V0508 11:16:26.687000 140092517021504 torch/_dynamo/output_graph.py:1272] [0/0] [__graph_code] V0508 11:16:26.687000 140092517021504 torch/_dynamo/output_graph.py:1272] [0/0] [__graph_code] # File: /data/users/williamwen/pytorch2/torch/nn/utils/parametrize.py:275 in forward, code: x = self[0](self.original) V0508 11:16:26.687000 140092517021504 torch/_dynamo/output_graph.py:1272] [0/0] [__graph_code] l__self___parametrizations__param___original: "f32[10, 10]" = self.L__self___parametrizations__param___original V0508 11:16:26.687000 140092517021504 torch/_dynamo/output_graph.py:1272] [0/0] [__graph_code] V0508 11:16:26.687000 140092517021504 torch/_dynamo/output_graph.py:1272] [0/0] [__graph_code] # File: /data/users/williamwen/pytorch2/test/dynamo/test_repros.py:4759 in forward, code: return torch.sin(x) V0508 11:16:26.687000 140092517021504 torch/_dynamo/output_graph.py:1272] [0/0] [__graph_code] x: "f32[10, 10]" = torch.sin(l__self___parametrizations__param___original); l__self___parametrizations__param___original = None V0508 11:16:26.687000 140092517021504 torch/_dynamo/output_graph.py:1272] [0/0] [__graph_code] V0508 11:16:26.687000 140092517021504 torch/_dynamo/output_graph.py:1272] [0/0] [__graph_code] # File: /data/users/williamwen/pytorch2/test/dynamo/test_repros.py:4755 in forward, code: return self.param @ x V0508 11:16:26.687000 140092517021504 torch/_dynamo/output_graph.py:1272] [0/0] [__graph_code] matmul: "f32[10, 10]" = x @ l_x_; x = l_x_ = None V0508 11:16:26.687000 140092517021504 torch/_dynamo/output_graph.py:1272] [0/0] [__graph_code] return (matmul,) V0508 11:16:26.687000 140092517021504 torch/_dynamo/output_graph.py:1272] [0/0] [__graph_code] ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/125771 Approved by: https://github.com/jbschlosser ghstack dependencies: #125710, #125724	2024-05-09 17:43:48 +00:00
William Wen	ff090c6937	[dynamo] support tracing nn.Module @property that accesses closure cells (#125724 ) Fix https://github.com/pytorch/pytorch/issues/125702 Pull Request resolved: https://github.com/pytorch/pytorch/pull/125724 Approved by: https://github.com/jansel, https://github.com/jbschlosser ghstack dependencies: #125710	2024-05-08 23:25:39 +00:00
William Wen	93f3d561f9	[dynamo] don't make nn parametrized Modules unspecialized (#125710 ) Workaround for https://github.com/pytorch/pytorch/issues/125314 and https://github.com/pytorch/pytorch/issues/125478. We no longer make parametrized nn.Modules unspecialized. Instead, when we are about to call a function from the `torch.nn.utils.parametrize` module, we skip the frame. The script from https://github.com/pytorch/pytorch/issues/125314 now outputs ``` parametrize=True: 6587ms parametrize=False: 1729ms parametrize=True: 4497ms parametrize=False: 1539ms ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/125710 Approved by: https://github.com/jansel, https://github.com/jbschlosser	2024-05-08 23:25:39 +00:00
Edward Z. Yang	ecd62746e3	Also pull size/stride info from example_value (#125505 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/125505 Approved by: https://github.com/jansel	2024-05-05 22:27:46 +00:00
Arun Pa	00c5859aeb	[dynamo] Add support for DELETE_SUBSCR (#123526 ) Fixes #123317 Co-authored-by: Jason Ansel <jansel@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/123526 Approved by: https://github.com/jansel	2024-04-25 22:07:24 +00:00
Yanbo Liang	72a34eeb99	Dynamo x autograd.Function supports non-{Tensor, symnode, constant} inputs (#124360 ) Fixes #118395 Pull Request resolved: https://github.com/pytorch/pytorch/pull/124360 Approved by: https://github.com/zou3519	2024-04-22 23:32:54 +00:00
Andrew M. James	64f42bfd52	[dynamo] Support list.reverse (#124210 ) fixes #123974 Pull Request resolved: https://github.com/pytorch/pytorch/pull/124210 Approved by: https://github.com/peterbell10	2024-04-17 23:33:32 +00:00
Xuehai Pan	93e249969b	[BE] enable `ruff` rule `RSE` and remove useless parentheses in `raise` statements (#124261 ) Remove useless parentheses in `raise` statements if the exception type is raised with no argument. Pull Request resolved: https://github.com/pytorch/pytorch/pull/124261 Approved by: https://github.com/albanD	2024-04-17 19:29:34 +00:00
William Wen	2b3594f90e	[dynamo] fix call_finally issue in Python 3.8 (#124122 ) Fix https://github.com/pytorch/pytorch/issues/97811 again... Pull Request resolved: https://github.com/pytorch/pytorch/pull/124122 Approved by: https://github.com/jansel	2024-04-16 08:36:20 +00:00
Edward Z. Yang	e4efa311f1	Refactor test_tensor_set_data to be parametrized (#124105 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/124105 Approved by: https://github.com/albanD	2024-04-16 03:23:41 +00:00
Jason Ansel	f3fd280238	[dynamo] Relax strict_mode for autograd.Function forward inputs (#123910 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/123910 Approved by: https://github.com/oulgen	2024-04-13 19:41:59 +00:00
Animesh Jain	58afcd7b61	[dynamo][dict] Add UnspecializedNNModuleVariable to dict keys (#122812 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/122812 Approved by: https://github.com/jansel ghstack dependencies: #122943, #123877, #123878	2024-04-13 02:07:35 +00:00
Jason Ansel	e3935783f7	[dynamo] Fix @property on user-defined nn.Module (#123804 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/123804 Approved by: https://github.com/anijain2305 ghstack dependencies: #123700, #123705, #123786, #123790, #123803	2024-04-12 19:03:13 +00:00
Brian Hirsh	f9f7ef33c4	AOTAutograd: add config to error when overlapping input checks would cause slow compile / runtimes (#123455 ) We should eventually make the non-overlapping checks faster when dynamic shapes are enabled, but this is pretty difficult to do. So for now this PR adds a config that lets us fail fast when this situation happens, instead of causing compile times to secretly come to a crawl. Pull Request resolved: https://github.com/pytorch/pytorch/pull/123455 Approved by: https://github.com/ezyang	2024-04-12 13:25:33 +00:00
Brian Hirsh	96fe3c5d46	fix correctness for dynamo inlining RangeVariable __contains__ (#122751 ) Fixes https://github.com/pytorch/pytorch/issues/122379 It looks like `iter_contains()` in dynamo expects to take in something like `iter_contains(List[VariableTracker], VariableTracker])`. Previously, when we called this function where the list in question was a `RangeVariable`, we would pass in `RangeVariable.items` as our list. This is wrong, though since `RangeVariable.items` just contains the underlying [start, stop, step]. It looks like `unpack_var_sequence` does the right thing of "materializing" the range into a list of `VariableTrackers`, so I used that instead. Pull Request resolved: https://github.com/pytorch/pytorch/pull/122751 Approved by: https://github.com/anijain2305, https://github.com/jansel ghstack dependencies: #122502	2024-04-12 01:12:23 +00:00
Brian Hirsh	2fe672b146	compile: ban mutations on non-compositional uses of as_strided (#122502 ) Fixes https://github.com/pytorch/pytorch/issues/104505 I was originally going to ban all usages of as_strided + mutation in functionalization. But I'm pretty sure that as_strided + mutation is fine when we are calling as_strided on a base tensor. So in this PR I added a slightly more conservative check: if we see an as_strided + mutation, where the input to an as_strided was another view op, then I error loudly in functionalization and link to the github issue above (in case anyone runs into this in the real world) Pull Request resolved: https://github.com/pytorch/pytorch/pull/122502 Approved by: https://github.com/ezyang, https://github.com/albanD	2024-04-12 01:12:23 +00:00
Jason Ansel	b3feb01910	[dynamo] Update co_names if needed in fix_vars (#123697 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/123697 Approved by: https://github.com/williamwen42	2024-04-11 01:00:01 +00:00
Animesh Jain	7283c37c98	[dynamo] Keep guards on global function (#123423 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/123423 Approved by: https://github.com/jansel	2024-04-09 04:23:11 +00:00
Jason Ansel	d8e0c26e64	[dynamo] Support warnings.catch_warnings (#123511 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/123511 Approved by: https://github.com/anijain2305	2024-04-08 22:27:46 +00:00
Jason Ansel	212e460dce	[dynamo] Support custom __setattr__ on UserDefinedObjectVariable (#123318 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/123318 Approved by: https://github.com/anijain2305	2024-04-07 21:06:52 +00:00
Animesh Jain	5d0ac887b9	[dynamo][higher order ops] Make the subgraph sourceless (#123071 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/123071 Approved by: https://github.com/jansel, https://github.com/zou3519 ghstack dependencies: #123046, #123058, #123059	2024-04-01 21:09:41 +00:00
Jason Ansel	781e8d2201	[dynamo] Support __next__ on UserDefinedObjectVariable (#122565 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/122565 Approved by: https://github.com/yanboliang	2024-03-31 19:00:03 +00:00
Edward Z. Yang	deeeaded1f	Add metas for randint/rand factory functions out overload (#122375 ) Fixes https://github.com/pytorch/pytorch/issues/121897 Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/122375 Approved by: https://github.com/lezcano	2024-03-25 04:01:38 +00:00
Jason Ansel	3e4a4bea12	[dynamo] Graph break on SymNode control flow (#122546 ) Fixes #111918 Pull Request resolved: https://github.com/pytorch/pytorch/pull/122546 Approved by: https://github.com/ezyang	2024-03-24 07:22:02 +00:00
Jason Ansel	5f7e71c411	[dynamo] Add HASATTR guard for UserDefinedObject attrs (#122555 ) Fixes #111522 Pull Request resolved: https://github.com/pytorch/pytorch/pull/122555 Approved by: https://github.com/Skylion007	2024-03-24 03:41:58 +00:00
Edward Z. Yang	c2651a7f0e	Make check_is_size clamp to sys.maxsize - 1, so sys.maxsize comparison returns False (#122372 ) Partially fixes https://github.com/pytorch/pytorch/issues/113002 Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/122372 Approved by: https://github.com/lezcano ghstack dependencies: #122370	2024-03-21 17:14:42 +00:00
albanD	53d5276d69	Improve Dynamo support for torch function and class methods in general (#121365 ) I was originally trying to solve https://github.com/pytorch/pytorch/issues/120799 but got sidetracked along the way. This PR contains a couple fixes. Let me know if you want me to split them up! - Properly handle invalid user code when "super()" is called from non-method/classmethod. It will now properly raise the same error as CPython - Fix base VariableTracker `__str__` method shadowing all `__repr__` methods defined in subclasses - Fix accessing a classmethod on a user object to bind "cls" and not "self" - Fix custom class handling of super() call to properly handle mixed regular/class/static methods Locally , test_repros.py -k test_batch_norm_act still fails where the generated graph module is: ``` Call using an FX-traced Module, line 8 of the traced Module's generated forward function: x = self.forward(l_x_); self = l_x_ = None x_1 = self.L__self___act(x); x = None ``` note that "self" is being unset on the first line even though it is used on the second one. For reference, this is the test `c268ce4a6d/test/dynamo/test_repros.py (L1368-L1369)` I cannot figure out where the generated forward function comes from though, any hint would be welcome! Pull Request resolved: https://github.com/pytorch/pytorch/pull/121365 Approved by: https://github.com/jansel	2024-03-08 20:03:49 +00:00
Tugsbayasgalan Manlaibaatar	f01a23d01b	Don't aggressively rewrite asserts for symbolic expressions (#120564 ) Fixes: https://github.com/pytorch/pytorch/issues/118417 Pull Request resolved: https://github.com/pytorch/pytorch/pull/120564 Approved by: https://github.com/ezyang	2024-03-01 17:46:36 +00:00
Edward Z. Yang	2a08a51738	Add _assert_scalar and teach Inductor to codegen it (#114148 ) Inductor codegen for `_assert_async` is currently disabled because we don't really understand how to codegen `scalar_to_tensor` on a Sympy expression. I initially tried to see if I could get this to work, but I got into some weird problem involving stride sorting, so I decided to fix it properly by not going through a tensor. So we introduce an `_assert_scalar` which takes a scalar as an argument, avoiding needing to turn a SymBool into a tensor before asserting on it. I also add `_functional_assert_scalar` for good luck, although this doesn't do anything right now because https://github.com/pytorch/pytorch/pull/104203 still hasn't been landed. I need to customize the codegen for this operator, so I decide to directly implement it in Inductor, rather than trying to treat it as a generic ExternKernel. This leads to the new AssertScalar IR node. This is written carefully so that it doesn't get DCE'd by Inductor. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/114148 Approved by: https://github.com/jansel ghstack dependencies: #120800	2024-03-01 05:06:36 +00:00
Animesh Jain	e7039e3a0b	[dynamo][easy] Dynamo test changes (#120927 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/120927 Approved by: https://github.com/yanboliang ghstack dependencies: #120864, #120730	2024-02-29 22:05:41 +00:00
Brian Hirsh	cccacf6c8e	add a test that non_overlapping checks dont generate too many guards (#120106 ) Pre-emptive test in OSS to ensure that models relying on the "non-overlapping guards" checks do not suffer drastically w.r.t. guard slowness. Current plan is to follow up on this with a "real" fix, to generate a linear number of these guards. Pull Request resolved: https://github.com/pytorch/pytorch/pull/120106 Approved by: https://github.com/mlazos	2024-02-20 18:38:59 +00:00
Brian Hirsh	6819452a08	fix multiple-fake-modes bug with compile + subclasses (#118191 ) This should fix the "multiple fake modes" errors we've been seeing with both float8 tensor and DTensor. Haven't added a test yet - will add one before landing. I also have a separate PR that would have made the error significantly nicer (the bad error resulted from us returning a FakeTensor at runtime): https://github.com/pytorch/pytorch/pull/118644 Pull Request resolved: https://github.com/pytorch/pytorch/pull/118191 Approved by: https://github.com/drisspg ghstack dependencies: #117667, #117666, #118209	2024-02-20 15:23:41 +00:00
Animesh Jain	80379ef0aa	[dynamo-must-fix] Use ID_MATCH for UserDefinedClass (#119853 ) Fixes https://github.com/pytorch/pytorch/issues/119715 Pull Request resolved: https://github.com/pytorch/pytorch/pull/119853 Approved by: https://github.com/jansel	2024-02-14 03:14:42 +00:00
Sergii Dymchenko	bd9db6a9c7	Update to TorchFix 0.4.0 (#119424 ) `torch.library.Library` updated to `torch.library._scoped_library` in files with many tests where it seems obvious to do, otherwise `noqa: TOR901` added - see https://github.com/pytorch/pytorch/pull/118318 for more context. Pull Request resolved: https://github.com/pytorch/pytorch/pull/119424 Approved by: https://github.com/zou3519	2024-02-12 23:30:12 +00:00
Brian Hirsh	02b60e76c9	make flash_attn_bw impl correct w.r.t. meta when k and v have different strides (#119500 ) `dv = at::empty_like(k)` and `dv = at::empty_like(v)` can be materially different, because `empty_like` tries to preserve the strides of the input when possible. So if `k` is contiguous, but `v`, is transposed, then before this PR, `dv` would be computed to be contiguous. Alternatively, we could change the meta implementation of `aten._scaled_dot_product_flash_attention` to this: ``` grad_q = torch.empty_like(query.transpose(1, 2)).transpose(1, 2) grad_k = torch.empty_like(key.transpose(1, 2)).transpose(1, 2) grad_v = torch.empty_like(key.transpose(1, 2)).transpose(1, 2) return grad_q, grad_k, grad_v ``` But (I think?) the logic in the sdpa backward impl was a typo. I noticed this because changing the meta formula as above was enough to fix the issue with the `aot_eager` backend in this [link](https://github.com/pytorch/pytorch/issues/116935#issuecomment-1914310523). A minimal repro that I made looks like this: ``` import torch # in this repro, "grad_out" and "value" are transposed tensors, # but "key" and "value" are contiguous a = torch.randn(2, 513, 16, 64, dtype=torch.float16, device='cuda').transpose(1, 2) b = torch.randn(2, 16, 513, 64, dtype=torch.float16, device='cuda') c = torch.randn(2, 16, 513, 64, dtype=torch.float16, device='cuda') d = torch.randn(2, 513, 16, 64, dtype=torch.float16, device='cuda').transpose(1, 2) e = torch.randn(2, 16, 513, 64, dtype=torch.float16, device='cuda') f = torch.randn(2, 16, 513, device='cuda') g = None h = None i = 513 j = 513 k = 0.0 l = False m = torch.tensor(1, dtype=torch.int64) n = torch.tensor(1, dtype=torch.int64) out1_ref, out2_ref, out3_ref = torch.ops.aten._scaled_dot_product_flash_attention_backward(a, b, c, d, e, f, g, h, i, j, k, l, m, n, scale=0.125) from torch._meta_registrations import meta__scaled_dot_product_flash_backward out1_test, out2_test, out3_test = meta__scaled_dot_product_flash_backward(a, b, c, d, e, f, g, h, i, j, k, l, m, n, scale=0.125) # prints True True print(out1_ref.is_contiguous()) print(out1_test.is_contiguous()) # prints True True print(out2_ref.is_contiguous()) print(out2_test.is_contiguous()) # prints True False print(out3_ref.is_contiguous()) print(out3_test.is_contiguous()) ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/119500 Approved by: https://github.com/drisspg, https://github.com/ezyang, https://github.com/Skylion007	2024-02-12 22:12:29 +00:00
PyTorch MergeBot	34db6f1b13	Revert "make flash_attn_bw impl correct w.r.t. meta when k and v have different strides (#119500 )" This reverts commit `095f471307`. Reverted https://github.com/pytorch/pytorch/pull/119500 on behalf of https://github.com/DanilBaibak due to Broken trunk ([comment](https://github.com/pytorch/pytorch/pull/119500#issuecomment-1937003082))	2024-02-10 13:06:30 +00:00
Brian Hirsh	095f471307	make flash_attn_bw impl correct w.r.t. meta when k and v have different strides (#119500 ) `dv = at::empty_like(k)` and `dv = at::empty_like(v)` can be materially different, because `empty_like` tries to preserve the strides of the input when possible. So if `k` is contiguous, but `v`, is transposed, then before this PR, `dv` would be computed to be contiguous. Alternatively, we could change the meta implementation of `aten._scaled_dot_product_flash_attention` to this: ``` grad_q = torch.empty_like(query.transpose(1, 2)).transpose(1, 2) grad_k = torch.empty_like(key.transpose(1, 2)).transpose(1, 2) grad_v = torch.empty_like(key.transpose(1, 2)).transpose(1, 2) return grad_q, grad_k, grad_v ``` But (I think?) the logic in the sdpa backward impl was a typo. I noticed this because changing the meta formula as above was enough to fix the issue with the `aot_eager` backend in this [link](https://github.com/pytorch/pytorch/issues/116935#issuecomment-1914310523). A minimal repro that I made looks like this: ``` import torch # in this repro, "grad_out" and "value" are transposed tensors, # but "key" and "value" are contiguous a = torch.randn(2, 513, 16, 64, dtype=torch.float16, device='cuda').transpose(1, 2) b = torch.randn(2, 16, 513, 64, dtype=torch.float16, device='cuda') c = torch.randn(2, 16, 513, 64, dtype=torch.float16, device='cuda') d = torch.randn(2, 513, 16, 64, dtype=torch.float16, device='cuda').transpose(1, 2) e = torch.randn(2, 16, 513, 64, dtype=torch.float16, device='cuda') f = torch.randn(2, 16, 513, device='cuda') g = None h = None i = 513 j = 513 k = 0.0 l = False m = torch.tensor(1, dtype=torch.int64) n = torch.tensor(1, dtype=torch.int64) out1_ref, out2_ref, out3_ref = torch.ops.aten._scaled_dot_product_flash_attention_backward(a, b, c, d, e, f, g, h, i, j, k, l, m, n, scale=0.125) from torch._meta_registrations import meta__scaled_dot_product_flash_backward out1_test, out2_test, out3_test = meta__scaled_dot_product_flash_backward(a, b, c, d, e, f, g, h, i, j, k, l, m, n, scale=0.125) # prints True True print(out1_ref.is_contiguous()) print(out1_test.is_contiguous()) # prints True True print(out2_ref.is_contiguous()) print(out2_test.is_contiguous()) # prints True False print(out3_ref.is_contiguous()) print(out3_test.is_contiguous()) ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/119500 Approved by: https://github.com/drisspg, https://github.com/ezyang, https://github.com/Skylion007	2024-02-10 02:04:56 +00:00
Aaron Orenstein	4dc53f777b	Fix dynamo failure w/ astype (#117952 ) The torch "fake" ndarray had some mismatches vs numpy.ndarray which caused test_sparse_to_sparse_compressed to fail under dynamo. This also fixes (because the test now hits it) a problem where unpacking a sequence with the incorrect number of args would assert in dynamo instead of graph breaking (because it would throw an exception). Added a unit test for this condition. Fixed: - torch._numpy._ndarray.astype() (actually used by the test) - torch._numpy._ndarray.put() (drive-by discovery) - torch._numpy._ndarray.view() (drive-by discovery) (burndown item 7) Pull Request resolved: https://github.com/pytorch/pytorch/pull/117952 Approved by: https://github.com/yanboliang ghstack dependencies: #117951	2024-02-03 08:10:15 +00:00
Alexander Grund	865945cc1f	Convert `requires_cuda` to full decorator (#118281 ) Don't require using it as `@requires_cuda()` -> `@requires_cuda` instead No need for the partial function invoked many times Split out this change from the initial large refactoring in #117741 to hopefully get merged before conflicts arise Pull Request resolved: https://github.com/pytorch/pytorch/pull/118281 Approved by: https://github.com/ezyang	2024-01-25 15:50:21 +00:00
voznesenskym	fed45aee54	Replace invoking self.value if there is a user defined init, avoiding arbitrary code execution (#117818 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/117818 Approved by: https://github.com/ezyang	2024-01-23 03:14:58 +00:00
PyTorch MergeBot	1174e82bde	Revert "Add _assert_scalar and teach Inductor to codegen it (#114148 )" This reverts commit `b6028acfa4`. Reverted https://github.com/pytorch/pytorch/pull/114148 on behalf of https://github.com/osalpekar due to Going to revert this given the broken torchrec PT2 tests internally: [D52648865](https://www.internalfb.com/diff/D52648865). Logs aren't too clear but @dstaay-fb can help debug as well ([comment](https://github.com/pytorch/pytorch/pull/114148#issuecomment-1886100368))	2024-01-11 02:30:22 +00:00
Edward Z. Yang	b6028acfa4	Add _assert_scalar and teach Inductor to codegen it (#114148 ) Inductor codegen for `_assert_async` is currently disabled because we don't really understand how to codegen `scalar_to_tensor` on a Sympy expression. I initially tried to see if I could get this to work, but I got into some weird problem involving stride sorting, so I decided to fix it properly by not going through a tensor. So we introduce an `_assert_scalar` which takes a scalar as an argument, avoiding needing to turn a SymBool into a tensor before asserting on it. I also add `_functional_assert_scalar` for good luck, although this doesn't do anything right now because https://github.com/pytorch/pytorch/pull/104203 still hasn't been landed. I need to customize the codegen for this operator, so I decide to directly implement it in Inductor, rather than trying to treat it as a generic ExternKernel. This leads to the new AssertScalar IR node. This is written carefully so that it doesn't get DCE'd by Inductor. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/114148 Approved by: https://github.com/jansel	2024-01-09 23:21:26 +00:00
voznesenskym	83e8a0721d	Reland #111196 (take 4) "Support tensors as Dict keys" (#116934 ) Fixes #ISSUE_NUMBER See that PR Pull Request resolved: https://github.com/pytorch/pytorch/pull/116934 Approved by: https://github.com/ezyang, https://github.com/huydhn	2024-01-07 01:37:26 +00:00
PyTorch MergeBot	2dca3e99eb	Revert "Support tensors as Dict keys Re-PR of #111196 (#116785 )" This reverts commit `1badad9ce9`. Reverted https://github.com/pytorch/pytorch/pull/116785 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/116785#issuecomment-1879592261))	2024-01-06 08:22:33 +00:00
voznesenskym	1badad9ce9	Support tensors as Dict keys Re-PR of #111196 (#116785 ) This prepares the PR where we implement sets in terms of dicts. To do so, rather than storing internally a dictionary that maps literals to VariableTrackers, it stores (pretty much) a dictionary from VTs to VTs. To do so, keys are wrapped in an opaque internal class _Hashable. The Hashable class is opaque on purpose so that it fails hard if if it inadvertently leaks back into user code. We also found and fixed a number of latent bugs and inconsistencies in the way dynamo checked what can be a dict key. More generally, we make much clearer what are the things that need to be modified to add a new supported key type to Dicts. Fixes [#107595](https://www.internalfb.com/tasks?t=107595) Fixes [#111603](https://www.internalfb.com/tasks?t=111603) Re-PR of https://github.com/pytorch/pytorch/pull/111196 sadly due to reverts, we could not reuse @lezcano's original PR. Pull Request resolved: https://github.com/pytorch/pytorch/pull/116785 Approved by: https://github.com/mlazos	2024-01-06 03:35:35 +00:00
Edward Z. Yang	53f8d17d1e	Specialize SymNodeVariable when used as module index (#114377 ) Fixes #114171 Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/114377 Approved by: https://github.com/Skylion007	2024-01-05 13:51:52 +00:00
Guilherme Leobas	5c9464fb51	add CALL_FINALLY opcode (#116159 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/116159 Approved by: https://github.com/yanboliang	2023-12-27 19:01:08 +00:00
David Berard	7b7f11f230	[dynamo] test number of guards when inputs are views (#115793 ) After # 113734 landed (adding dynamic storage offsets), we found that compilation times increased significantly. The reason: tensors_definitely_do_not_overlap was doing comparisons on storage offsets which were adding guards `626b7dc847/torch/_functorch/_aot_autograd/input_output_analysis.py (L268-L276)` This guard is added on all pairs of tensors which are views of the same source tensor - i.e. it the number of guards can be quadratic in the number of input tensors. This PR adds a test to prevent similar regressions. Pull Request resolved: https://github.com/pytorch/pytorch/pull/115793 Approved by: https://github.com/yanboliang	2023-12-19 16:09:29 +00:00
Peter Bell	0e0dd8f985	[dynamo][BE] Move torchvision import inside of test_multi_import (#115677 ) Currently this skip imports torchvision, so if your torchvision install is broken then the entire file fails at collection time. This instead means only the test itself will fail. Pull Request resolved: https://github.com/pytorch/pytorch/pull/115677 Approved by: https://github.com/lezcano	2023-12-13 14:16:31 +00:00
voznesenskym	76ced0df03	Consider storage_changed for assigning alias_of_input in aot_autograd when computing differentiable outputs that alias each other (#115315 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/115315 Approved by: https://github.com/bdhirsh	2023-12-12 23:21:58 +00:00
Aaron Gokaslan	794545c11f	[BE]: Enable RUF015 codebase wide (#115507 ) Constant time access of first value in collection. This is a constant time operation instead of converting the item to a list to get the first item which is linear. The rule is turned on which automatically autofixes and enforces this. Pull Request resolved: https://github.com/pytorch/pytorch/pull/115507 Approved by: https://github.com/malfet	2023-12-11 15:51:01 +00:00
Jason Ansel	88642d44d9	[dynamo] Add RestrictedListSubclassVariable (#115057 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/115057 Approved by: https://github.com/yanboliang ghstack dependencies: #115095, #115046	2023-12-05 19:01:23 +00:00
Jason Ansel	aa70e31610	[dynamo] Fix MutableSideEffects returning alias (#115095 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/115095 Approved by: https://github.com/yanboliang	2023-12-05 19:01:03 +00:00
Jason Ansel	3d0bbb24a1	[dynamo] Improve support for list subclasses (#115052 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/115052 Approved by: https://github.com/oulgen, https://github.com/eellison ghstack dependencies: #114830, #115047, #115048	2023-12-05 01:31:33 +00:00
Jason Ansel	a70c85ce90	[dynamo] Improve support for inspect.signature().parameters (#115047 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/115047 Approved by: https://github.com/oulgen ghstack dependencies: #114830	2023-12-04 19:08:36 +00:00
voznesenskym	4cfe997490	[dynamo] handle setting .data on a tensor (#113080 ) Dynamo We don't want setattr in the graph. Setting data has interesting implications on both aliasing and on the autograd engine. The safe recipe is: 1) Disable grad 2) Call set_() 3) Manually lower the version counter on the object to hide it from the autograd engine This is effectively the same exact thing as setting .data, and it composes properly with aot_autograd and inductor. aot_autograd For aot_autograd, there's another snag. Specifically, when we invoke aot_autograd, we call `fake_mode.from_tensor()`, relying on memo to get the right tensor out. For .data mutations, this doesn't work, because the memoized fake_tensor is in the state it will be in at the end of the trace, not at the beginning. This means that the .data call is already applied, and the tensor shape (as in the case of these tests) mismatches. aot_autograd produces an invalid graph, with illegal calls like `torch.ops.aten.view.default(primals_2, [0])` where primals is actually sized `([6])` on input. The new plan here is to: 1) Record tensor fakification policy in dynamo 2) provide a fresh fake mode to all backends 3) Invoke from_tensor with the stored policy to get fresh new fake tensors in aot_autograd Pull Request resolved: https://github.com/pytorch/pytorch/pull/113080 Approved by: https://github.com/bdhirsh	2023-12-02 00:35:44 +00:00
David Berard	3fc58a6bbe	Revert "Make offsets dynamic by default (#113734 )" (#114889 ) This reverts commit `7c38b76efe`. if a graph has a lot of inputs which are views (with nonzero storage offset), then the check for overlapping tensor views will add a lot of guards (n^2?) `b35ca2cb94/torch/_functorch/_aot_autograd/input_output_analysis.py (L256-L260)` this was causing very slow compilations on an internal model. Differential Revision: [D51733774](https://our.internmc.facebook.com/intern/diff/D51733774) Pull Request resolved: https://github.com/pytorch/pytorch/pull/114889 Approved by: https://github.com/ckluk2, https://github.com/YuqingJ, https://github.com/aaronenyeshi	2023-12-01 16:49:42 +00:00
Jon Chuang	f66add9b85	[dynamo] graph break on `np.ndarray.tobytes` (#114208 ) We can't model this accurately across np and tnp https://github.com/pytorch/pytorch/issues/114204#issuecomment-1820269949 So let's not even try. Just graph break. Fixes: https://github.com/pytorch/pytorch/issues/114204 Pull Request resolved: https://github.com/pytorch/pytorch/pull/114208 Approved by: https://github.com/lezcano	2023-11-21 18:19:37 +00:00
Edward Z. Yang	59ad51e10a	Insert deferred runtime asserts into Dynamo FX graph (#113958 ) During the course of fake tensor propagation (and, potentially, also Dynamo execution, although I do not believe it is possible to exercise this right now), we may generate deferred runtime asserts, which represent "guards" on unbacked symbols which cannot be immediately checked on entry to a code block; instead, they have to be checked at runtime. However, we currently accumulate these deferred runtime asserts into the ShapeEnv, and don't do anything with them. This PR modifies Dynamo to automatically insert these runtime asserts into the FX graph, before passing it on to the backend compiler. The assert format coincides with the export assert format as practiced in `torch/_export/passes/add_runtime_assertions_for_constraints_pass.py`, but actually these passes are completely disjoint right now as I only handle deferred runtime asserts, while export only handles ranges (which I should probably also handle, but don't in this PR.) The assertions must be inserted by Dynamo, because you could potentially then pass the asserts onto another backend like "eager" which no longer looks at the ShapeEnv before. Thanks to previous work in export, these asserts are preserved in AOTAutograd, but they are dropped by Inductor, which needs to be fixed in future work. This piece will be a bit awkward, as Inductor would have preferred to work with the Sympy expressions directly, ah well. Here is what the Dynamo traced FX graph looks like for the test in question: ``` <eval_with_key>.0 class GraphModule(torch.nn.Module): def forward(self, L_x_ : torch.Tensor): l_x_ = L_x_ # File: /data/users/ezyang/c/pytorch/wu.py:8, code: y = x.item() item = l_x_.item() # No stacktrace found for following nodes ge_1 = item >= 0 scalar_tensor_default = torch.ops.aten.scalar_tensor.default(ge_1); ge_1 = None _assert_async_msg = torch.ops.aten._assert_async.msg(scalar_tensor_default, "Deferred runtime assert failed: i0 >= 0, where i0 was defined by 'item' (for more information, run with TORCH_LOGS=+dynamo,dynamic)"); scalar_tensor_default = None # File: /data/users/ezyang/c/pytorch/wu.py:9, code: torch._check_is_size _check_is_size = torch._check_is_size(item) # File: /data/users/ezyang/c/pytorch/wu.py:10, code: if y >= 0: ge = item >= 0; item = None # File: /data/users/ezyang/c/pytorch/wu.py:11, code: return x * 2 mul = l_x_ * 2; l_x_ = None return (mul,) ``` Note that we actually keep the `_check_is_size` in the graph redundantly. However, assert_async is retained in the graph, whereas _check_is_size ends up getting DCE'ed. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/113958 Approved by: https://github.com/aakhundov, https://github.com/tugsbayasgalan ghstack dependencies: #113978	2023-11-20 21:25:11 +00:00
Jon Chuang	100b9952b1	[dynamo] Fix user defined object sourceless callable (#114066 ) Fixes https://github.com/pytorch/pytorch/issues/114019 We do not need to guard on callable user object defined instantiated in graph Pull Request resolved: https://github.com/pytorch/pytorch/pull/114066 Approved by: https://github.com/ezyang	2023-11-20 18:38:03 +00:00
Edward Z. Yang	caffa44b1c	Correctly use real boolean operators, not bitwise in shape guard prints (#113927 ) Fixes https://github.com/pytorch/pytorch/issues/113875 Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/113927 Approved by: https://github.com/voznesenskym	2023-11-18 04:24:45 +00:00
Peter Bell	9f47580ad7	[BE] Don't mutate torch.compile global config in tests (#113882 ) We should uniformly use `config.patch` so the configuration changes don't effect different tests. Pull Request resolved: https://github.com/pytorch/pytorch/pull/113882 Approved by: https://github.com/lezcano	2023-11-17 16:49:48 +00:00
David Berard	7c38b76efe	Make offsets dynamic by default (#113734 ) Copied from @ezyang 's #113693. The motivation for this change is that we'd like to guard on storage offset in inductor, to make assumptions about data alignment. create_symbolic_sizes_strides_storage_offset() creates the sizes/strides/offset for fake tensors - they can either be integers or symints. This PR changes storage_offset to always be dynamic. In variables/builder.py, we remove a conditional so that all tensors get added to tracked_fakes. This is because the storage offset will be dynamic even if the other logic in builder.py suggests that it will be static; otherwise, we run into this issue: `1e260c851b/torch/fx/experimental/symbolic_shapes.py (L892-L895)` Pull Request resolved: https://github.com/pytorch/pytorch/pull/113734 Approved by: https://github.com/ezyang	2023-11-17 07:57:21 +00:00
Jon Chuang	277229d0c6	[dynamo] Fix incorrectly casting `SymNode` to `int` when input is `bool` (#113871 ) Fixes https://github.com/pytorch/pytorch/issues/113393, https://github.com/pytorch/pytorch/pull/113848#issuecomment-1814624510 Incorrectly casting symnode type will cause it to take the wrong path in symbolic_shapes Pull Request resolved: https://github.com/pytorch/pytorch/pull/113871 Approved by: https://github.com/jansel	2023-11-16 23:24:57 +00:00
PyTorch MergeBot	98df3088c3	Revert "Make offsets dynamic by default (#113734 )" This reverts commit `9efbb4ea73`. Reverted https://github.com/pytorch/pytorch/pull/113734 on behalf of https://github.com/huydhn due to Sorry for reverting your change, but it is causing a memory leak in one of the test `9efbb4ea73` ([comment](https://github.com/pytorch/pytorch/pull/113734#issuecomment-1815297222))	2023-11-16 20:56:27 +00:00
David Berard	9efbb4ea73	Make offsets dynamic by default (#113734 ) Copied from @ezyang 's #113693. The motivation for this change is that we'd like to guard on storage offset in inductor, to make assumptions about data alignment. create_symbolic_sizes_strides_storage_offset() creates the sizes/strides/offset for fake tensors - they can either be integers or symints. This PR changes storage_offset to always be dynamic. In variables/builder.py, we remove a conditional so that all tensors get added to tracked_fakes. This is because the storage offset will be dynamic even if the other logic in builder.py suggests that it will be static; otherwise, we run into this issue: `1e260c851b/torch/fx/experimental/symbolic_shapes.py (L892-L895)` Pull Request resolved: https://github.com/pytorch/pytorch/pull/113734 Approved by: https://github.com/ezyang	2023-11-16 06:49:09 +00:00
Brian Hirsh	cebad9867b	graph break on intermediate leaves that require grad (#113277 ) fixes https://github.com/pytorch/pytorch/issues/90552. This is a simpler fix that just detects the situation where AOTAutograd can't create a proper backward graph for the situation and graph breaks. This was technically a silent correctness issue before. This PR tries to always graph break when we see a factory function that returns a tensor requiring grad. I check this by seeing if the op returned a `TensorVariable` in dynamo, and if one of the input arguments was a `requires_grad=True` kwarg. I think this is high-fidelity enough, and I'm also hoping that this is uncommon enough that a graph break is reasonable here. The fix to avoid the graph break in user land is also pretty easy - just instantiate your tensor outside of the compiled region and plumb it in. Pull Request resolved: https://github.com/pytorch/pytorch/pull/113277 Approved by: https://github.com/eellison ghstack dependencies: #113267, #113416, #113584	2023-11-16 02:47:45 +00:00
PyTorch MergeBot	5d170fce29	Revert "Support tensors as Dict keys (#111196 )" This reverts commit `b0805fa5d0`. Reverted https://github.com/pytorch/pytorch/pull/111196 on behalf of https://github.com/huydhn due to Sorry for reverting your change, but it is failing internally. I will provide the details there ([comment](https://github.com/pytorch/pytorch/pull/111196#issuecomment-1813410149))	2023-11-15 23:08:00 +00:00

1 2 3 4 5 ...

436 Commits