pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
Yanbo Liang	da341d0d48	[Dynamo][6.1/N] Refactor out TorchInGraphFunctionVariable and improve heuristic (#113432 ) This is split from #113009; please check https://github.com/pytorch/pytorch/pull/113009#issuecomment-1804417925 for more details. Pull Request resolved: https://github.com/pytorch/pytorch/pull/113432 Approved by: https://github.com/ezyang, https://github.com/jansel	2023-12-09 05:11:44 +00:00
PyTorch MergeBot	e8e4141773	Revert "[Dynamo][6.1/N] Refactor out TorchInGraphFunctionVariable and improve heuristic (#113432 )" This reverts commit `e61d6b42f0`. Reverted https://github.com/pytorch/pytorch/pull/113432 on behalf of https://github.com/huydhn due to Sorry for reverting your change, but it is failing dynamo tests in trunk `e61d6b42f0`, landrace? ([comment](https://github.com/pytorch/pytorch/pull/113432#issuecomment-1847787981))	2023-12-08 20:15:39 +00:00
Michael Lazos	1c3a4a864c	Remove always restore (#115317 ) Removes always restore, assuming that a HOP will cleanup any leftover state from tracing fwd + bwd This required a minor change to the autograd fn variable higher order op. If we are tracing forward DON'T add the call_function node into the main graph, since we are only tracing it for the purposes of speculation. Instead return the result directly to be passed to the backward for speculation. This was the only observable side effect on the output graph that I found. Test plan: test_smoke_from_test_autograd in test_autograd_function.py Pull Request resolved: https://github.com/pytorch/pytorch/pull/115317 Approved by: https://github.com/voznesenskym, https://github.com/jansel	2023-12-08 18:17:37 +00:00
Yanbo Liang	e61d6b42f0	[Dynamo][6.1/N] Refactor out TorchInGraphFunctionVariable and improve heuristic (#113432 ) This is split from #113009; please check https://github.com/pytorch/pytorch/pull/113009#issuecomment-1804417925 for more details. Pull Request resolved: https://github.com/pytorch/pytorch/pull/113432 Approved by: https://github.com/ezyang, https://github.com/jansel	2023-12-08 17:15:14 +00:00
Iris Zhang (PyTorch)	23fa9621e4	[DeviceMesh] Rename _device_mesh.py to device_mesh.py to prepare for beta (#115099 ) (#115193 ) Summary: Rename _device_mesh.py to device_mesh.py, update all callsites, add documentation. We created stubs for the public class and methods in torch.distributed.device_mesh so that torch.distributed.device_mesh can be imported whether or not distributed is available. Original diff reverted: D51629761 Original PR reverted: https://github.com/pytorch/pytorch/pull/115099 Prior to landing, all CI signals had passed. Shipit added the "ci/trunk" label to the PR, DID NOT wait for it, and went ahead with committing. More context can be found in the reverted PR above. Test Plan: CI. Differential Revision: D51861018 Pull Request resolved: https://github.com/pytorch/pytorch/pull/115193 Approved by: https://github.com/fegin	2023-12-08 08:44:32 +00:00
voznesenskym	2c84616a94	Move the shape env symint cache to a symbol cache, better routing for subclass fakification [re-pr 115227] (#115396 ) * Context: Joel sees that unless he manually writes to the fake tensor memo, fakification seems to produce spurious symbols! Voz (me) objects, saying that not only is directly writing to memo a bad pattern, recursively invoking fakification on tensor subclass elements in dynamo should suffice! Joel says that while he morally agrees, he has a test proving otherwise, a most perplexing situation. Digging in, I figured out that while we were making fake tensors correctly, with properly cached symbols and the like, we were also incorrectly creating spurious symbols, leading the test to fail. Before this PR, we would only cache source->symint. This was generally fine, but meant that you would create a symbol, then potentially throw it out due to symint cache. For example, the cache hit flow was: make a symbol (ex: s2) -> use it to make a symint -> hit the cache (my_source-s1). Now, in this example, you have a symbol in your val_to_var/var_to_val (s2) that is unused. This is sound, but wasteful, and furthermore, misleading. This was causing a test added in a PR in this stack to fail, specifically, because the test was using ``` curr_var_to_val = { str(k): v for k, v in context.fake_mode.shape_env.var_to_val.items() } ``` to validate that no new symbols were being created (that is, that recursively creating fake tensors for subclasses was working). The test is correct, but the implementation of caching would make (by this method of observation) cache hits look like cache misses. So, the fix here is to move the cache up to be a general symbol cache, rather than only a cache for symints. The initial implementation did that! But then, it ran into some interesting errors when it came to replay. When replaying symbol creation, behaviors would diverge in the new shape env! How could that be? 
The answer is that creating a new shape_env resulted in us replaying symbol creation... but with a cache from a different shape env! This was short-circuiting symbol creation, and so adding an extra layer to the cache for id(shape_env) fixes the problem. Pull Request resolved: https://github.com/pytorch/pytorch/pull/115396 Approved by: https://github.com/mlazos	2023-12-08 05:02:21 +00:00
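The two-level cache described in the commit above (keyed first by `id(shape_env)`, then by source) can be sketched in plain Python. This is a minimal illustration and not Dynamo's actual code: `ToyShapeEnv`, `SymbolCache`, and `get_or_create` are hypothetical names, and sources are modelled as plain strings.

```python
from itertools import count


class ToyShapeEnv:
    """Hypothetical stand-in for a shape environment: it only hands out symbols."""

    def __init__(self):
        self._ids = count()
        self.var_to_val = {}

    def create_symbol(self, val):
        name = f"s{next(self._ids)}"
        self.var_to_val[name] = val
        return name


class SymbolCache:
    """Hypothetical cache keyed first by id(shape_env), then by source."""

    def __init__(self):
        self._cache = {}

    def get_or_create(self, shape_env, source, val):
        per_env = self._cache.setdefault(id(shape_env), {})
        if source not in per_env:
            # Allocate a symbol only on a genuine miss, so that no unused
            # symbol is left behind in var_to_val.
            per_env[source] = shape_env.create_symbol(val)
        return per_env[source]


cache = SymbolCache()
env_a, env_b = ToyShapeEnv(), ToyShapeEnv()
first = cache.get_or_create(env_a, "my_source", 4)
again = cache.get_or_create(env_a, "my_source", 4)     # hit: no new symbol
replayed = cache.get_or_create(env_b, "my_source", 4)  # new env: symbol is created
```

In this sketch a second lookup in the same environment leaves `var_to_val` unchanged, while a lookup through a different environment still creates its own symbol instead of short-circuiting on another environment's entry.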
Michael Lazos	18d57dde2d	Remove remaining uses of copy_graphstate (#115321 ) After auditing higher_order_ops.py, the graph checkpoints were only getting used in the event of an exception, so it is safe to remove because we restart analysis in this case now. To make this clearer, the current state is the following: (1) checkpoint side effects; (2) capture subgraph: if graph break, restore as usual, else throw away inlining translator and subgraph tracer; (3) restore side effects. This will change to the following after this change: (1) checkpoint side effects; (2) capture subgraph: if graph break, restart analysis, else throw away inlining translator and subgraph tracer; (3) restore side effects. Pull Request resolved: https://github.com/pytorch/pytorch/pull/115321 Approved by: https://github.com/jansel, https://github.com/zou3519	2023-12-07 22:35:02 +00:00
ydwu4	dd6ae6d3b4	[HigherOrderOp] Remove additional get item calls in MapHigherOrder. (#115207 ) As titled, this PR removes the unnecessary getitem call from the graph that's manipulated in MapHigherOrder. We want to get the first-dim slice of the original tensor for speculation, but using call_method accidentally creates a get_item call in the graph, so we avoid it by calling unpack_var_sequence on the input tensor. Pull Request resolved: https://github.com/pytorch/pytorch/pull/115207 Approved by: https://github.com/yanboliang ghstack dependencies: #115115, #115204, #115205	2023-12-07 17:06:44 +00:00
ydwu4	8b74735878	[HigherOrderOp] make MapHigherOrder create map_impl call_function node instead of map (#115205 ) We want to remove the map_wrapper and replace it with dynamo always on. This is the first step of this plan. In this PR, we make dynamo directly generate map_impl nodes. This hasn't touched the eager logic yet. So the execution path after this PR looks like 1. `dynamo -> map_impl` when torch.compile is on (before this PR, it was `dynamo -> map_wrapper -> map_impl`) and 2. `map_wrapper -> map_impl` (this PR didn't touch the logic here). The added TODO(yidi) is addressed in the following PR. Pull Request resolved: https://github.com/pytorch/pytorch/pull/115205 Approved by: https://github.com/yanboliang ghstack dependencies: #115115, #115204	2023-12-07 17:06:44 +00:00
ydwu4	be3efbebb6	[HigherOrderOp] make MapHigherOrder use should_flatten_output=True (#115204 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/115204 Approved by: https://github.com/yanboliang ghstack dependencies: #115115	2023-12-07 17:06:35 +00:00
ydwu4	998c87f93c	[BE][HigherOrderOp] extract redundant code that unflattens the output (#115115 ) We need this function to unflatten the variable tracker for HOPs that want pytree output support, e.g. map. Pull Request resolved: https://github.com/pytorch/pytorch/pull/115115 Approved by: https://github.com/yanboliang	2023-12-07 17:06:28 +00:00
Michael Lazos	3c882925da	Make subclass type instances constants (like UserDefinedClasses) (#115323 ) As title Pull Request resolved: https://github.com/pytorch/pytorch/pull/115323 Approved by: https://github.com/oulgen	2023-12-07 08:10:59 +00:00
Joel Schlosser	3a18211622	Guard on subclass inner tensors (#114965 ) This PR introduces guarding on subclass inner tensors. Pull Request resolved: https://github.com/pytorch/pytorch/pull/114965 Approved by: https://github.com/voznesenskym ghstack dependencies: #114311, #115212	2023-12-07 01:47:48 +00:00
Jon Chuang	83cb6a75ad	[dynamo] add list iterator contains (#115237 ) Fixes https://github.com/pytorch/pytorch/issues/115236 Pull Request resolved: https://github.com/pytorch/pytorch/pull/115237 Approved by: https://github.com/jansel	2023-12-06 22:26:16 +00:00
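For context on the commit above, this is the plain-Python behaviour of `in` applied to a list iterator. The sketch only shows eager semantics and does not invoke `torch.compile`; the variable names are illustrative.

```python
items = [1, 2, 3, 4]
it = iter(items)

# Membership on an iterator advances it until a match is found.
found = 2 in it
# The iterator resumes right after the matched element.
remaining = list(it)
# A value that is absent exhausts the iterator and yields False.
missing = 10 in iter(items)
```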
rzou	67c8ad7285	Fix autograd.Function x enum input x torch.compile (#115206 ) Fixes https://github.com/pytorch/pytorch/issues/114777. We treat Enums like we do ConstantVariable. Test Plan: New test Pull Request resolved: https://github.com/pytorch/pytorch/pull/115206 Approved by: https://github.com/yanboliang ghstack dependencies: #115185, #115186, #115187	2023-12-06 15:18:25 +00:00
Jason Ansel	f4c67ffff4	[dynamo] Improve support for dynamic shapes str.format and _assert (#115203 ) This removes a graph break in vision_maskrcnn. Pull Request resolved: https://github.com/pytorch/pytorch/pull/115203 Approved by: https://github.com/yanboliang	2023-12-06 04:54:45 +00:00
rzou	b0b190f7c0	More descriptive error message for unsupported inputs to HOP (#115187 ) Test Plan: See updated tests Pull Request resolved: https://github.com/pytorch/pytorch/pull/115187 Approved by: https://github.com/ydwu4, https://github.com/yanboliang ghstack dependencies: #115185, #115186	2023-12-06 01:29:03 +00:00
rzou	b5b011a5cd	Expand input types for HOPs that use manually_set_subgraph_inputs=False (#115186 ) Previously we only supported Tensor, Constants, and SymNode. We lift that restriction (there's not really a good reason for it). HOPs like torch.cond, torch.map already do input validation (those are the ones that can only support Tensor, Constant, and SymNode inputs). Test Plan: New test for `wrap`, which is a HOP that has manually_set_subgraph_inputs=False Pull Request resolved: https://github.com/pytorch/pytorch/pull/115186 Approved by: https://github.com/ydwu4, https://github.com/yanboliang ghstack dependencies: #115185	2023-12-06 01:29:03 +00:00
rzou	bc46347152	Refactor how HOPs create new args to subgraphs (#115185 ) This PR combines the logic for Tensor and SymNode. Test Plan: - Existing tests Pull Request resolved: https://github.com/pytorch/pytorch/pull/115185 Approved by: https://github.com/ydwu4, https://github.com/yanboliang	2023-12-06 01:29:03 +00:00
Yanbo Liang	4620170008	[Dynamo] Revert multiple PRs since they triggered compilation stuck internally (#115126 ) Revert the following PRs to mitigate internal compilation stuck: #113432 #114016 #114507 #114196 #114739 #114669 Pull Request resolved: https://github.com/pytorch/pytorch/pull/115126 Approved by: https://github.com/xush6528	2023-12-05 22:35:37 +00:00
Joel Schlosser	22704426c3	Expand dynamic dims support for traceable subclasses (#114311 ) Continuation of #112185, following the design in this [doc](https://docs.google.com/document/d/1ipSxcTzEMMOAPvxP-YJlD5JBZZmIGgh8Q34ixtOUCRo). Summary: * Introduce `SubclassSymbolicPolicy` containing separate dynamic dim / constraint policies for the outer and inner tensors * Expand the automatic dynamic algorithm to recurse into inner tensors and produce one of these for a subclass instance * Maintain legacy behavior for subclasses by recursively calling `mark_dynamic()` on inner tensors of the same dim as outer when `mark_dynamic(outer, ...)` is called * Addresses this: `6a86cf00ad/torch/_dynamo/variables/builder.py (L1750)` * Add `outer_size` and `outer_stride` arguments to `__tensor_unflatten__()` so that you can find out what symbols were allocated for the outer size / stride (you are expected to return a tensor that compares equal to the outer symbols) * Signatures now: ```python # attrs is a list of inner tensor attributes on x; inner_tensor = getattr(x, attr) # ctx is anything useful for rebuilding the class we want to guard on attrs, ctx = x.__tensor_flatten__() ... 
# inner_tensors is a dict of {attr -> tensor} # ctx is taken unmodified from flattening and (eventually) guarded on # outer_size is the expected size of the output; possibly symbolic # outer_stride is the expected strides of the output; possibly symbolic y = MySubclass.__tensor_unflatten__(inner_tensors, ctx, outer_size, outer_stride) # at the __tensor_unflatten__() call-site in PT2, we assert y.shape == outer_size and y.stride() == outer_stride # the assert simplifies symbols when there are relationships between outer and inner symbols ``` * Size info needed for `NestedTensor` at least, stride info needed for `DTensor` at least * Punting on `outer_storage_offset` because storage_offset handling is horribly broken in PT2 right now * ~~Add new `__tensor_mark_dynamic__()` to allow overriding the behavior of mark_dynamic on a per-subclass basis~~ (booted to future work) * ~~Add guards for tensor subclasses by calling `__tensor_flatten__()` in the guard to test equality on `ctx`~~ * Now handled in #114469 * Next PR: add TENSOR_MATCH guards on inner tensors Pull Request resolved: https://github.com/pytorch/pytorch/pull/114311 Approved by: https://github.com/ezyang, https://github.com/drisspg, https://github.com/voznesenskym, https://github.com/bdhirsh	2023-12-05 21:09:25 +00:00
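The flatten/unflatten round trip described in the commit above can be sketched without PyTorch. Everything here is a toy model under stated assumptions: `ToyTensor` and `ToySubclass` are hypothetical stand-ins (not real `torch.Tensor` subclasses), and only the method names and argument order follow the signatures quoted in the commit message.

```python
from dataclasses import dataclass


@dataclass
class ToyTensor:
    """Hypothetical stand-in for a tensor: tracks only size and stride."""
    size: tuple
    stride: tuple


class ToySubclass:
    """Hypothetical wrapper around one inner tensor plus a scale factor."""

    def __init__(self, inner, scale):
        self.inner = inner
        self.scale = scale
        # In this toy model the outer size/stride simply mirror the inner ones.
        self.size = inner.size
        self.stride = inner.stride

    def __tensor_flatten__(self):
        # attrs: names of inner tensor attributes; ctx: state needed to rebuild.
        return ["inner"], {"scale": self.scale}

    @staticmethod
    def __tensor_unflatten__(inner_tensors, ctx, outer_size, outer_stride):
        # outer_size / outer_stride say what the rebuilt object must match.
        return ToySubclass(inner_tensors["inner"], ctx["scale"])


x = ToySubclass(ToyTensor((2, 3), (3, 1)), scale=2.0)
attrs, ctx = x.__tensor_flatten__()
inner_tensors = {attr: getattr(x, attr) for attr in attrs}
y = ToySubclass.__tensor_unflatten__(inner_tensors, ctx, x.size, x.stride)

# Call-site check analogous to the assert described in the commit message.
assert y.size == x.size and y.stride == x.stride
```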
Jason Ansel	4b8ddbbc7e	[dynamo] Improve graph break message for copy.deepcopy (#115120 ) I was curious what hf_T5_generate was trying to deepcopy, so I updated the error message: Before: ``` STATS graph_break ("'skip function deepcopy in file /home/jansel/conda/envs/pytorch/lib/python3.10/copy.py'', skipped according skipfiles.SKIP_DIRS'", 3) ... ``` After: ``` STATS graph_break ('copy.deepcopy UserDefinedObjectVariable(GenerationConfig)', 3) ... ``` Related issue: #115122 Pull Request resolved: https://github.com/pytorch/pytorch/pull/115120 Approved by: https://github.com/oulgen ghstack dependencies: #115095, #115046, #115057, #115119	2023-12-05 19:01:31 +00:00
Jason Ansel	522bae20df	[dynamo] Support any() on SymNodeVariable (#115119 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/115119 Approved by: https://github.com/yanboliang ghstack dependencies: #115095, #115046, #115057	2023-12-05 19:01:31 +00:00
Jason Ansel	88642d44d9	[dynamo] Add RestrictedListSubclassVariable (#115057 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/115057 Approved by: https://github.com/yanboliang ghstack dependencies: #115095, #115046	2023-12-05 19:01:23 +00:00
Jason Ansel	a97ed2470a	[dynamo] Support hasattr on dataclass (#115046 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/115046 Approved by: https://github.com/yanboliang ghstack dependencies: #115095	2023-12-05 19:01:14 +00:00
Nikita Shulga	a827ac71f2	Revert "[DeviceMesh] Rename _device_mesh.py to device_mesh.py to prepare for beta (#115099 )" This reverts commit `eaa64339d6`.	2023-12-05 08:59:36 -08:00
Iris Zhang (PyTorch)	eaa64339d6	[DeviceMesh] Rename _device_mesh.py to device_mesh.py to prepare for beta (#115099 ) Summary: Rename _device_mesh.py to device_mesh.py, update all callsites, add documentation. Original diff reverted: D51629761 Original PR reverted: https://github.com/pytorch/pytorch/pull/114991 It was failing a public module binding test on macOS, which was due to the change in import order for torch/distributed/fsdp/_common_utils.py. Since the original import would still work, we remove the changes in this file. Test Plan: CI. Differential Revision: D51825114 Pull Request resolved: https://github.com/pytorch/pytorch/pull/115099 Approved by: https://github.com/wanchaol, https://github.com/fegin	2023-12-05 05:44:52 +00:00
Jason Ansel	3d0bbb24a1	[dynamo] Improve support for list subclasses (#115052 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/115052 Approved by: https://github.com/oulgen, https://github.com/eellison ghstack dependencies: #114830, #115047, #115048	2023-12-05 01:31:33 +00:00
Jason Ansel	fe690f430a	[dynamo] Fix dict.get with no default (#115048 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/115048 Approved by: https://github.com/eellison, https://github.com/oulgen ghstack dependencies: #114830, #115047	2023-12-05 01:31:33 +00:00
Yanbo Liang	8ef44e6110	[autograd.Function] Fix torch.compile w/ once_differentiable leads to opaque graph break (#113625 ) Fixes #106893 Pull Request resolved: https://github.com/pytorch/pytorch/pull/113625 Approved by: https://github.com/zou3519	2023-12-04 21:37:06 +00:00
Jason Ansel	a70c85ce90	[dynamo] Improve support for inspect.signature().parameters (#115047 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/115047 Approved by: https://github.com/oulgen ghstack dependencies: #114830	2023-12-04 19:08:36 +00:00
Xuehai Pan	3fbfa8cd0a	[dynamo] support `dict.copy()` / `OrderedDict.copy()` / `defaultdict.copy()` (#115012 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/115012 Approved by: https://github.com/jansel ghstack dependencies: #115010, #115011	2023-12-04 01:50:10 +00:00
Xuehai Pan	917a52d2a2	[dynamo] support `dict.update(seq2)` / `OrderedDict.update(seq2)` / `defaultdict.update(seq2)` (#115011 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/115011 Approved by: https://github.com/jansel ghstack dependencies: #115010	2023-12-04 01:50:10 +00:00
Xuehai Pan	2e8ac5ea93	[dynamo] support `dict.fromkeys()` / `OrderedDict.fromkeys()` / `defaultdict.fromkeys()` (#115010 ) Add support for `dict.fromkeys`, `OrderedDict.fromkeys`, and `defaultdict.fromkeys`. Fixes #114963 - #114963 Pull Request resolved: https://github.com/pytorch/pytorch/pull/115010 Approved by: https://github.com/jansel	2023-12-04 01:49:59 +00:00
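The three commits above (#115010, #115011, #115012) concern standard dictionary methods. As a reminder of the eager Python behaviour involved, here is a short sketch using only the standard library; it does not call `torch.compile`, and the variable names are illustrative.

```python
from collections import OrderedDict, defaultdict

keys = ["a", "b"]

# fromkeys builds a mapping of the calling class with one shared value.
plain = dict.fromkeys(keys, 0)
ordered = OrderedDict.fromkeys(keys, 0)
dd = defaultdict.fromkeys(keys, 0)  # default_factory is left as None

# update() also accepts an iterable of key/value pairs (the "seq2" form).
plain.update([("c", 3), ("a", 1)])

# copy() is shallow: rebinding a key in the copy leaves the original alone.
shallow = plain.copy()
shallow["a"] = 99
```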
Tugsbayasgalan Manlaibaatar	7f49603ed3	Fix https://github.com/pytorch/pytorch/issues/114899 (#114985 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/114985 Approved by: https://github.com/ydwu4	2023-12-03 05:24:02 +00:00
PyTorch MergeBot	3a2e2044cd	Revert "[DeviceMesh] Rename _device_mesh.py to device_mesh.py to prepare for beta (#114710 ) (#114991 )" This reverts commit `729ac7317a`. Reverted https://github.com/pytorch/pytorch/pull/114991 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/114991#issuecomment-1837214567))	2023-12-02 17:55:51 +00:00
Iris Zhang (PyTorch)	729ac7317a	[DeviceMesh] Rename _device_mesh.py to device_mesh.py to prepare for beta (#114710 ) (#114991 ) Summary: Same content of changes as https://github.com/pytorch/pytorch/pull/114710 Rename _device_mesh.py to device_mesh.py, update all callsites, adds documentation. ghstack-source-id: 208980207 exported-using-ghexport Test Plan: CI. Reviewed By: wanchaol Differential Revision: D51629761 Pull Request resolved: https://github.com/pytorch/pytorch/pull/114991 Approved by: https://github.com/wanchaol, https://github.com/fduwjj, https://github.com/fegin	2023-12-02 04:39:41 +00:00
voznesenskym	4cfe997490	[dynamo] handle setting .data on a tensor (#113080 ) Dynamo: We don't want setattr in the graph. Setting data has interesting implications on both aliasing and on the autograd engine. The safe recipe is: 1) Disable grad 2) Call set_() 3) Manually lower the version counter on the object to hide it from the autograd engine. This is effectively the same exact thing as setting .data, and it composes properly with aot_autograd and inductor. aot_autograd: For aot_autograd, there's another snag. Specifically, when we invoke aot_autograd, we call `fake_mode.from_tensor()`, relying on memo to get the right tensor out. For .data mutations, this doesn't work, because the memoized fake_tensor is in the state it will be in at the end of the trace, not at the beginning. This means that the .data call is already applied, and the tensor shape (as in the case of these tests) mismatches. aot_autograd produces an invalid graph, with illegal calls like `torch.ops.aten.view.default(primals_2, [0])` where primals is actually sized `([6])` on input. The new plan here is to: 1) Record tensor fakification policy in dynamo 2) Provide a fresh fake mode to all backends 3) Invoke from_tensor with the stored policy to get fresh new fake tensors in aot_autograd. Pull Request resolved: https://github.com/pytorch/pytorch/pull/113080 Approved by: https://github.com/bdhirsh	2023-12-02 00:35:44 +00:00
David Berard	3fc58a6bbe	Revert "Make offsets dynamic by default (#113734 )" (#114889 ) This reverts commit `7c38b76efe`. if a graph has a lot of inputs which are views (with nonzero storage offset), then the check for overlapping tensor views will add a lot of guards (n^2?) `b35ca2cb94/torch/_functorch/_aot_autograd/input_output_analysis.py (L256-L260)` this was causing very slow compilations on an internal model. Differential Revision: [D51733774](https://our.internmc.facebook.com/intern/diff/D51733774) Pull Request resolved: https://github.com/pytorch/pytorch/pull/114889 Approved by: https://github.com/ckluk2, https://github.com/YuqingJ, https://github.com/aaronenyeshi	2023-12-01 16:49:42 +00:00
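The revert above suspects quadratic growth ("n^2?") in guards from the overlapping-view check. The arithmetic behind that suspicion can be sketched as follows, under the assumption (not confirmed by the commit message) that one check is emitted per unordered pair of view inputs; `count_pairwise_checks` is a hypothetical helper.

```python
from itertools import combinations


def count_pairwise_checks(num_views):
    """Count unordered pairs among num_views inputs, i.e. n * (n - 1) / 2."""
    return sum(1 for _ in combinations(range(num_views), 2))


# Assumed model only: one overlap check per pair of view inputs.
checks = {n: count_pairwise_checks(n) for n in (2, 10, 100)}
```

Under that assumption, going from 10 to 100 view inputs multiplies the number of pairs by roughly a hundred, which is consistent with the slow compilations the commit reports.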
Yanbo Liang	ab5385fc50	[Dynamo][6.3/N] Further cleanup torch.py (#114669 ) A follow-up PR to clean up what I found during the refactor of torch.py Pull Request resolved: https://github.com/pytorch/pytorch/pull/114669 Approved by: https://github.com/jansel	2023-12-01 04:08:29 +00:00
Yanbo Liang	7f40640342	[Dynamo] Support torch.amp.autocast as decorator (#114845 ) Fixes #114818 Pull Request resolved: https://github.com/pytorch/pytorch/pull/114845 Approved by: https://github.com/jansel	2023-11-30 23:54:57 +00:00
vfdev	f93ea14309	[dynamo] Added support for math ops on ints with dynamic shapes (#114507 ) Fixes #114218 ``` import math import torch def func(x, a): b = math.floor(a + 0.5) b = math.radians(a) + b y = x + b return y cfunc = torch.compile(func, dynamic=True, fullgraph=True, backend="eager") x = torch.tensor([0, 1, 2, 3], dtype=torch.float32) a = 12 out = cfunc(x, a) ``` ``` [2023-11-29 18:10:08,385] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG] TRACED GRAPH [2023-11-29 18:10:08,385] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG] ===== __compiled_fn_0 ===== [2023-11-29 18:10:08,385] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG] <eval_with_key>.0 class GraphModule(torch.nn.Module): [2023-11-29 18:10:08,385] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG] def forward(self, L_a_ : torch.SymInt, s1 : torch.SymInt, L_x_ : torch.Tensor): [2023-11-29 18:10:08,385] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG] l_a_ = L_a_ [2023-11-29 18:10:08,385] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG] l_x_ = L_x_ [2023-11-29 18:10:08,385] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG] [2023-11-29 18:10:08,385] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG] # File: check_math_ops.py:7, code: b = math.floor(a + 0.5) [2023-11-29 18:10:08,385] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG] add = l_a_ + 0.5 [2023-11-29 18:10:08,385] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG] floor = math_floor(add); add = None [2023-11-29 18:10:08,385] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG] [2023-11-29 18:10:08,385] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG] # File: /pytorch/torch/_dynamo/polyfill.py:28, code: return math.pi / 180.0 * x [2023-11-29 18:10:08,385] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG] mul = 0.017453292519943295 * l_a_; l_a_ = None [2023-11-29 18:10:08,385] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG] [2023-11-29 18:10:08,385] [0/0] 
torch._dynamo.output_graph.__graph_code: [DEBUG] # File: check_math_ops.py:9, code: b = math.radians(a) + b [2023-11-29 18:10:08,385] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG] add_1 = mul + floor; mul = floor = None [2023-11-29 18:10:08,385] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG] [2023-11-29 18:10:08,385] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG] # File: check_math_ops.py:13, code: y = x + b [2023-11-29 18:10:08,385] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG] y = l_x_ + add_1; l_x_ = add_1 = None [2023-11-29 18:10:08,385] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG] return (y,) [2023-11-29 18:10:08,385] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG] [2023-11-29 18:10:08,385] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG] ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/114507 Approved by: https://github.com/lezcano	2023-11-30 14:11:57 +00:00
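The traced graph above shows `math.radians` being replaced by the polyfill expression `math.pi / 180.0 * x`. The sketch below checks that expression against the standard library in plain Python; `radians_polyfill` and `func_reference` are illustrative names, and a list stands in for the tensor so that no PyTorch install is needed.

```python
import math


def radians_polyfill(x):
    # Same expression as the polyfill line quoted in the log above.
    return math.pi / 180.0 * x


def func_reference(x_values, a):
    # Pure-Python mirror of `func` from the commit message.
    b = math.floor(a + 0.5)
    b = radians_polyfill(a) + b
    return [x + b for x in x_values]


result = func_reference([0.0, 1.0, 2.0, 3.0], 12)
```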
rzou	ce4bff4013	[dynamo] fix functools.wraps on nested functions (#114279 ) Updated version of #108885 addressing the review. In this PR: - We add a VT.can_reconstruct utility that checks if VT.reconstruct() does something. - If functools.wraps(fn) is passed a `fn` that either has a source or has .can_reconstruct() == True, then we stash the source (or the VT) - Later on, we use the source (or VT.reconstruct) to actually reconstruct the object in codegen. Test Plan: - New tests Pull Request resolved: https://github.com/pytorch/pytorch/pull/114279 Approved by: https://github.com/voznesenskym	2023-11-28 22:34:59 +00:00
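As an illustration of the pattern the commit above concerns, here is `functools.wraps` applied to a nested function in plain Python. This sketch shows eager behaviour only; whether this exact shape matches the failing case in #108885 is an assumption, and the names `outer`, `inner`, and `wrapper` are illustrative.

```python
import functools


def outer(scale):
    def inner(x):
        """Multiply x by the captured scale."""
        return x * scale

    # `inner` is a closure created at call time, not a module-level function.
    @functools.wraps(inner)
    def wrapper(*args, **kwargs):
        return inner(*args, **kwargs)

    return wrapper


wrapped = outer(3)
```

`functools.wraps` copies metadata such as `__name__` and `__doc__` onto the wrapper and records the original function as `__wrapped__`.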
voznesenskym	ddf1cb7870	AOTAutograd: handle set_(), detect metadata mutations that cancel out (#111554 ) This should be enough to get @voznesenskym 's FSDP branch to plumb `set_()` through AOTAutograd properly and have everything properly no-op out. Main changes are: (1) graph break on `aten::set_.source_Tensor_storage_offset` (we could support it but it isn't needed, seems safer to graph break) (2) Functionalization: add a "proper" functionalization kernel for `aten::set_.source_Tensor`. The previous one we had was codegen'd and it was wrong (it would just clone() and call set_(), which does not do the right thing). I also manually mark on the `FunctionalTensorWrapper` when a given tensor has been mutated by a `set_()` call. (3) AOTAutograd: I added a new field, `InputAliasInfo.mutates_storage_metadata`, so we can distinguish between "regular" metadata mutations, and metadata mutations due to `set_()` calls. This is mainly because at runtime, one requires calling `as_strided_()` to fix up metadata, while the other requires calling `set_()`. (4) Made AOTAutograd's detection for metadata mutations / set_() mutations smarter and detect no-ops (if the storage and metadata are all the same). I also killed `was_updated()` and `was_metadata_updated()`, and replaced them with (existing) `has_data_mutation() ` and (new) `has_data_mutation()`, which can more accurately distinguish between data-mutation vs. `set_()` calls vs. metadata-mutation This PR is still silently correct in one case though, which I'd like to discuss more. In particular, this example: ``` def f(x): x_view = x.view(-1) x.set_(torch.ones(2)) x_view.mul_(2) return ``` If you have an input that experiences both a data-mutation and a `x_old.set_(x_new)` call, there are two cases: (a) the data mutation happened on the storage of `x_new`. This case should be handled automatically: if x_new is a graph intermediate then we will functionalize the mutation. 
If x_new is a different graph input, then we will perform the usual `copy_()` on that other graph input (b) the data mutation happened on the storage of `x_old`. This is more of a pain to handle, and doesn't currently work. At runtime, the right thing to do is probably something like: ``` def functionalized_f(x): x_view = x.view(-1) # set_() desugars into a no-op; later usages of x will use x_output x_output = torch.ones(2) # functionalize the mutation on x_view x_view_updated = x.mul(2) x_updated = x_view_updated.view(x.shape) # x experienced TWO TYPES of mutations; a data mutation and a metatadata mutation # We need to return both updated tensors in our graph return x_updated, x_output def runtime_wrapper(x): x_data_mutation_result, x_set_mutation_result = compiled_graph(x) # First, perform the data mutation on x's old storage x.copy_(x_data_mutation_result) # Then, swap out the storage of x with the new storage x.set_(x_set_mutation_result) ``` There are two things that make this difficult to do though: (1) Functionalization: the functionalization rule for `set_()` will fully throw away the old `FunctionalStorageImpl` on the graph input. So if there are any mutations to that `FunctionalStorageImpl` later on in the graph, the current graph input won't know about it. Maybe we can have a given `FunctionalTensorWrapper` remember all previous storages that it had, and track mutations on all of them - although this feels pretty complicated. (2) AOTAutograd now needs to know that we might have two graph outputs that correspond to a single "mutated input", which is annoying. It's worth pointing out that this issue is probably extremely unlikely for anyone to run into - can we just detect it and error? This feels slightly easier than solving it, although not significantly easier. We would still need `FunctionalTensorWrapper` to keep track of mutations on any of its "previous" storages, so it can report this info back to AOTAutograd so we can raise an error. 
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111554 Approved by: https://github.com/ezyang ghstack dependencies: #113926	2023-11-28 19:33:35 +00:00
Bin Bao	0bef97fac3	[dynamo] Support itertools.groupby (#114192 ) Summary: for https://github.com/pytorch/pytorch/issues/108698 Pull Request resolved: https://github.com/pytorch/pytorch/pull/114192 Approved by: https://github.com/jansel	2023-11-28 14:58:59 +00:00
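For reference, this is the eager behaviour of `itertools.groupby` that the commit above adds tracing support for. The sketch uses only the standard library and illustrative data; it does not call `torch.compile`.

```python
from itertools import groupby

words = ["apple", "avocado", "banana", "blueberry", "cherry"]

# groupby only merges consecutive items with equal keys, so the input
# is already ordered by first letter here.
grouped = {key: list(group) for key, group in groupby(words, key=lambda w: w[0])}
```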
lezcano	79ee99e6d2	[easy] Dispatch torch.from_numpy to torch.as_tensor (#114609 ) ...rather than detaching the tensor Pull Request resolved: https://github.com/pytorch/pytorch/pull/114609 Approved by: https://github.com/larryliu0820, https://github.com/voznesenskym ghstack dependencies: #114608	2023-11-28 12:04:37 +00:00
lezcano	0bb2600c28	Allow to differentiate through NumPy code (#114608 ) With this PR it is possible to differentiate through NumPy code modulo the usual caveats that apply to differentiation: - That there are no graph breaks - That the decomposition in `torch._numpy` is differentiable. @ev-br and I were somewhat careful to achieve the second point, but it is not tested through and through, so YMMV. Pull Request resolved: https://github.com/pytorch/pytorch/pull/114608 Approved by: https://github.com/voznesenskym	2023-11-28 12:04:37 +00:00
Angela Yi	dffa5f3f23	[dynamo][reland] `ExecutorchCallDelegateHigherOrderVariable` - add sanity check that input and output tensors are disjoint (#114167 ) Summary: Reland of https://github.com/pytorch/pytorch/pull/111960, Fixes https://github.com/pytorch/pytorch/issues/111917 Original PR broke some internal tests which the current diff has resolved. Test Plan: CI Differential Revision: D51473196 Pull Request resolved: https://github.com/pytorch/pytorch/pull/114167 Approved by: https://github.com/jon-chuang, https://github.com/zou3519	2023-11-28 00:27:23 +00:00
ydwu4	2ac0b61e60	[HigherOrderOp] dedup repeated get_attr placeholders in branches of cond (#112874 )

We further de-duplicate the duplicated get_attr nodes. For the code below:

```python
def test_cond_free_variable_in_both_branches(self):
    backend = EagerAndRecordGraphs()
    cnt = CompileCounterWithBackend(backend)

    z = torch.ones(4, 4)

    class Foo(torch.nn.Module):
        def __init__(self):
            super().__init__()
            self.register_buffer("buffer", torch.ones(6, 4))

        def forward(self, x, y):
            def true_fn(x):
                return x.sum() + self.buffer.sum() + z.sum()

            def false_fn(x):
                return x.sum() - z.sum() - self.buffer.sum()

            return control_flow.cond(y, true_fn, false_fn, [x])

    mod_for_compile = torch.compile(
        Foo(), backend=cnt, dynamic=True, fullgraph=True
    )
```

Before de-duplication, we have the following graph module:

```python
class GraphModule(torch.nn.Module):
    def forward(self, L_y_ : torch.Tensor, L_x_ : torch.Tensor, s0 : torch.SymInt, L_z_ : torch.Tensor):
        l_y_ = L_y_
        l_x_ = L_x_
        l_z_ = L_z_

        # File: /home/yidi/local/pytorch/test/dynamo/test_higher_order_ops.py:1243, code: return x.sum() + self.buffer.sum() + z.sum()
        l__self___buffer = self.L__self___buffer

        # File: /home/yidi/local/pytorch/test/dynamo/test_higher_order_ops.py:1246, code: return x.sum() - z.sum() - self.buffer.sum()
        l__self___buffer_1 = self.L__self___buffer

        # File: /home/yidi/local/pytorch/torch/_higher_order_ops/cond.py:118, code: return cond_op(pred, true_fn, false_fn, operands)
        cond_true_0 = self.cond_true_0
        cond_false_0 = self.cond_false_0
        cond = torch.ops.higher_order.cond(l_y_, cond_true_0, cond_false_0, [l_x_, l_z_, l__self___buffer, l__self___buffer_1]);  l_y_ = cond_true_0 = cond_false_0 = l_x_ = l_z_ = l__self___buffer = l__self___buffer_1 = None
        return (cond,)

    class GraphModule(torch.nn.Module):
        def forward(self, l_x_, l_z_, l__self___buffer_true_branch, l__self___buffer_1_false_branch):
            l_x__1 = l_x_
            l_z__1 = l_z_

            # File: /home/yidi/local/pytorch/test/dynamo/test_higher_order_ops.py:1243, code: return x.sum() + self.buffer.sum() + z.sum()
            sum_1 = l_x__1.sum();  l_x__1 = None
            sum_2 = l__self___buffer_true_branch.sum();  l__self___buffer_true_branch = None
            add = sum_1 + sum_2;  sum_1 = sum_2 = None
            sum_3 = l_z__1.sum();  l_z__1 = None
            add_1 = add + sum_3;  add = sum_3 = None
            return add_1

    class GraphModule(torch.nn.Module):
        def forward(self, l_x_, l_z_, l__self___buffer_true_branch, l__self___buffer_1_false_branch):
            l_x__1 = l_x_
            l_z__1 = l_z_

            # File: /home/yidi/local/pytorch/test/dynamo/test_higher_order_ops.py:1246, code: return x.sum() - z.sum() - self.buffer.sum()
            sum_1 = l_x__1.sum();  l_x__1 = None
            sum_2 = l_z__1.sum();  l_z__1 = None
            sub = sum_1 - sum_2;  sum_1 = sum_2 = None
            sum_3 = l__self___buffer_1_false_branch.sum();  l__self___buffer_1_false_branch = None
            sub_1 = sub - sum_3;  sub = sum_3 = None
            return sub_1
```

After de-duplication, we have the following graph module:

```python
class GraphModule(torch.nn.Module):
    def forward(self, L_x_ : torch.Tensor, L_y_ : torch.Tensor, s0 : torch.SymInt, L_z_ : torch.Tensor):
        l_x_ = L_x_
        l_y_ = L_y_
        l_z_ = L_z_

        # File: /home/yidi/local/pytorch/test/dynamo/test_higher_order_ops.py:1232, code: return x.sum() + self.buffer.sum() + z.sum()
        l__self___buffer = self.L__self___buffer

        # File: /home/yidi/local/pytorch/torch/_higher_order_ops/cond.py:118, code: return cond_op(pred, true_fn, false_fn, operands)
        cond_true_0 = self.cond_true_0
        cond_false_0 = self.cond_false_0
        cond = torch.ops.higher_order.cond(l_y_, cond_true_0, cond_false_0, [l__self___buffer, l_x_, l_z_]);  l_y_ = cond_true_0 = cond_false_0 = l__self___buffer = l_x_ = l_z_ = None
        return (cond,)

    class GraphModule(torch.nn.Module):
        def forward(self, l__self___buffer, l_x_, l_z_):
            l__self___buffer_1 = l__self___buffer
            l_x__1 = l_x_
            l_z__1 = l_z_

            # File: /home/yidi/local/pytorch/test/dynamo/test_higher_order_ops.py:1232, code: return x.sum() + self.buffer.sum() + z.sum()
            sum_1 = l_x__1.sum();  l_x__1 = None
            sum_2 = l__self___buffer_1.sum();  l__self___buffer_1 = None
            add = sum_1 + sum_2;  sum_1 = sum_2 = None
            sum_3 = l_z__1.sum();  l_z__1 = None
            add_1 = add + sum_3;  add = sum_3 = None
            return add_1

    class GraphModule(torch.nn.Module):
        def forward(self, l__self___buffer_1, l_x_, l_z_):
            l__self___buffer_2 = l__self___buffer_1
            l_x__1 = l_x_
            l_z__1 = l_z_

            # File: /home/yidi/local/pytorch/test/dynamo/test_higher_order_ops.py:1235, code: return x.sum() - z.sum() - self.buffer.sum()
            sum_1 = l_x__1.sum();  l_x__1 = None
            sum_2 = l_z__1.sum();  l_z__1 = None
            sub = sum_1 - sum_2;  sum_1 = sum_2 = None
            sum_3 = l__self___buffer_2.sum();  l__self___buffer_2 = None
            sub_1 = sub - sum_3;  sub = sum_3 = None
            return sub_1
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112874
Approved by: https://github.com/zou3519	2023-11-27 22:07:42 +00:00
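The de-duplication described in the commit above can be illustrated with a minimal, framework-free sketch. It is not the Dynamo implementation: `dedup_operands` is a hypothetical helper, and it keeps operands in first-seen order, which does not reproduce the operand ordering shown in the commit's "after" graph. The idea is only that operands referring to the same underlying object collapse into one entry, and each original position is remapped to the surviving entry.

```python
def dedup_operands(operands):
    """Collapse operands that are the same object (hypothetical helper).

    Returns (unique, index_map), where index_map[i] is the position in
    `unique` of the object originally at operands[i].
    """
    unique = []
    seen = {}       # id(operand) -> index into `unique`
    index_map = []
    for operand in operands:
        key = id(operand)
        if key not in seen:
            seen[key] = len(unique)
            unique.append(operand)
        index_map.append(seen[key])
    return unique, index_map


# Stand-ins for l_x_, l_z_ and the buffer lifted once per branch.
x, z, buffer = object(), object(), object()
unique, index_map = dedup_operands([x, z, buffer, buffer])
print(len(unique), index_map)
```

Keying on object identity rather than equality mirrors the situation in the commit, where both branches lift the very same `self.buffer`.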
voznesenskym	081c5b3adc	Add Stateful/Stateless symbolic contexts, use fresh fake mode for dynamo backends (#113926 ) (#114526 )

Summary: The primary problem we are setting out to solve here is fake tensor freshness. Before this PR, fake tensors after dynamo represented fake tensors at the end of trace, so subsequent retraces like aot_autograd would start off with fake tensors in the wrong (end result) state, rather than their expected fresh state. The solution here is to start a fresh fake mode, and re-fakify the tensors. The nuance comes from ensuring that symbols are uniformly created for the symbolic sizes and strides of the tensor.

This PR is the result of a lot of back and forth with ezyang and eellison. Initially, the first pass at this was not super different from what we have in the PR - the broad strokes were the same:

1) We cache source->symbol in shape_env
2) We pass policy objects around, stored at dynamo fakification time, and reused for later fakification
3) We create a new fake mode for backends (from https://github.com/pytorch/pytorch/pull/113605/files)

This is ugly, and has some layering violations. We detoured our decision making through a few other alternatives. Immutable/mutable fake tensor mode was the most interesting alternative, https://github.com/pytorch/pytorch/pull/113653, and was struck down on concerns of complexity in fake mode combined with it not covering all edge cases. We also detoured on what to do about tensor memoization returning back potentially different tensors than requested, and if that was an anti pattern (it is) we want to hack in with the symbol cache (we don't).

We went back to the drawing board here, but with a few concessions:

1) The cache for source->symbol must live outside of shape_env, for both lifecycle and layering reasons
2) A good amount of work needs to be done to pipe policy around fake_mode and meta_utils correctly, to cover all the cases (ezyang did this)

cc penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx chenyang78 aakhundov kadeng

imported-using-ghimport

Test Plan: Imported from OSS
Reviewed By: huydhn, Chillee
Differential Revision: D51566250
Pulled By: voznesenskym
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114526
Approved by: https://github.com/Chillee, https://github.com/huydhn	2023-11-26 23:40:32 +00:00
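The first concession in the commit above, keeping the source->symbol cache outside of shape_env, can be pictured with a toy sketch. Everything here is hypothetical and uses plain Python stand-ins (`ToyShapeEnv`, `ToyStatefulContext`, `fakify_sizes`); none of these are PyTorch APIs. It only shows why a cache that outlives a single fakification pass lets a later re-fakification reuse the same symbols instead of allocating new ones.

```python
import itertools
from dataclasses import dataclass, field


class ToyShapeEnv:
    """Hypothetical stand-in: hands out fresh symbol names, keeps no cache."""

    def __init__(self):
        self._counter = itertools.count()

    def create_symbol(self):
        return f"s{next(self._counter)}"


@dataclass
class ToyStatefulContext:
    """Hypothetical per-tensor context owning the source->symbol cache."""

    source_to_symbol: dict = field(default_factory=dict)


def fakify_sizes(source, sizes, shape_env, ctx):
    """Return one symbol per dimension, reusing cached symbols when present."""
    symbols = []
    for dim in range(len(sizes)):
        key = f"{source}.size()[{dim}]"
        if key not in ctx.source_to_symbol:
            ctx.source_to_symbol[key] = shape_env.create_symbol()
        symbols.append(ctx.source_to_symbol[key])
    return symbols


shape_env = ToyShapeEnv()
ctx = ToyStatefulContext()
first = fakify_sizes("L['x']", (4, 4), shape_env, ctx)
second = fakify_sizes("L['x']", (4, 4), shape_env, ctx)  # later re-fakification
print(first, second)
```

Because the cache lives on the context object rather than on the symbol allocator, its lifetime can follow the tensor being fakified, which is the lifecycle and layering point the commit message makes.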

860 Commits