pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 00:21:07 +01:00

Author	SHA1	Message	Date
Yidi Wu	824474cb35	[cond] support output sizes mismatch in front end (#147130 ) This PR finishes https://github.com/pytorch/pytorch/pull/137615 by addressing the TODOs and comments left there. Pull Request resolved: https://github.com/pytorch/pytorch/pull/147130 Approved by: https://github.com/zou3519	2025-02-25 20:28:41 +00:00
eellison	92b7e610ab	[Inductor changes] Invoke Quant (#139102 ) Adds an `invoke_quant` higher order operator as proposed [here](https://docs.google.com/document/d/1s2PfJlq6Q1F8l11CkTIC69BW1rEnGEgs6YmBC7hu8rA/edit?tab=t.0). The primary motivations are - Unifying scattered reasoning for quant operators throughout the code base - Ease of pattern matching - see the very large pattern match expression at `949fdd2997/torch/_inductor/fx_passes/post_grad.py` (L390-L426), compared to the pattern I have in the tests: ``` @register_graph_pattern( CallFunction( torch.ops.aten.mm, CallFunction( torch.ops.higher_order.invoke_quant, Ignored(), Ignored(), Ignored(), scheme="nf4", ), Arg(), ), pass_dict=test_pass, ) ``` - Ability to specify inductor specific logic, like codegen'ing the operators in lower precision, or forcing fusion to a matmul. Example graph: ``` Python ===== AFTER POST GRAD ===== /data/users/eellison/pytorch/torch/fx/_lazy_graph_module.py class <lambda>(torch.nn.Module): def forward(self, arg0_1: "f32[8][1]cpu", arg1_1: "f32[8][1]cpu"): # File: /data/users/eellison/pytorch/torch/_higher_order_ops/invoke_quant.py:87 in __call__, code: return invoke_quant_tracer(*args, **kwargs, quant_options=self) # type: ignore[call-arg] repeated_subgraph0 = self.repeated_subgraph0 invoke_quant: "f32[8][1]cpu" = torch.ops.higher_order.invoke_quant(repeated_subgraph0, arg0_1, arg1_1, scheme = 'nf4'); repeated_subgraph0 = arg0_1 = arg1_1 = None return (invoke_quant,) class repeated_subgraph0(torch.nn.Module): def forward(self, arg0_1: "f32[8][1]cpu", arg1_1: "f32[8][1]cpu"): # File: /data/users/eellison/pytorch/torch/_higher_order_ops/invoke_quant.py:87 in __call__, code: return invoke_quant_tracer(*args, **kwargs, quant_options=self) # type: ignore[call-arg] mul: "f32[8][1]cpu" = torch.ops.aten.mul.Tensor(arg0_1, arg1_1); arg0_1 = None add: "f32[8][1]cpu" = torch.ops.aten.add.Tensor(mul, arg1_1); mul = arg1_1 = None return add ``` The schema for `invoke_quant` is
`torch.ops.higher_order.invoke_quant(subgraph, args, scheme=None)` where the scheme will not always be present. I wasn't sure exactly how the inductor specific configurations like `codegen_low_precision` should be passed through. I didn't want to stuff them all in as kwargs, and I didn't want to have them affect pattern matching, so they are stored as meta on the node itself. Following that, I wanted the invocation of the hop to match how it shows up in the graph, so I decided to have it be an object that is then invoked for the tracing. ``` invoke_quant = InvokeQuant(codegen_low_precision=True) invoke_quant(gn, (x, y), scheme="nf4") ``` Todo - do not require packing the args in a tuple; will do after https://github.com/pytorch/pytorch/pull/139162. Feedback welcome. Pull Request resolved: https://github.com/pytorch/pytorch/pull/139102 Approved by: https://github.com/Chillee	2025-02-08 19:30:19 +00:00
Yanbo Liang	bd8d7b1b74	[Dynamo][Trace PyDispatcher] Remove disable from HigherOrderOperator.__call__ (#146270 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/146270 Approved by: https://github.com/zou3519	2025-02-03 21:47:54 +00:00
Simon Fan	2e197c8a2d	[dynamo][hop] test torch.compiling all HOPs (#145422 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/145422 Approved by: https://github.com/ydwu4, https://github.com/zou3519	2025-01-31 20:45:22 +00:00
Ryan Guo	eaec97ab1f	[dynamo] Properly prune dead input cell object (#145781 ) This patch models input cell object as "newly created" rather than "pre-existing" python object (see added documentation for why this actually captures the semantics more accurately). This enables the `SideEffects.prune_dead_object_new` algorithm to prune away writes to input cell objects which are no longer relevant; this didn't happen prior to this patch because we modelled them as pre-existing objects, which forces us to codegen their attribute mutations. Fixes #145564. Pull Request resolved: https://github.com/pytorch/pytorch/pull/145781 Approved by: https://github.com/williamwen42, https://github.com/jansel	2025-01-28 18:28:13 +00:00
Animesh Jain	19584b28fd	[dynamo][dicts] Consolidate dict(..) construction (#144342 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/144342 Approved by: https://github.com/StrongerXi	2025-01-20 04:42:06 +00:00
PyTorch MergeBot	5e6e6200bf	Revert "[dynamo][dicts] Consolidate dict(..) construction (#144342 )" This reverts commit `a54a784b82`. Reverted https://github.com/pytorch/pytorch/pull/144342 on behalf of https://github.com/kit1980 due to breaking internal builds, see D68125388 ([comment](https://github.com/pytorch/pytorch/pull/144342#issuecomment-2597184167))	2025-01-17 00:32:09 +00:00
Animesh Jain	a54a784b82	[dynamo][dicts] Consolidate dict(..) construction (#144342 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/144342 Approved by: https://github.com/StrongerXi	2025-01-13 22:24:56 +00:00
Yidi Wu	c36f94b373	[while_loop][dynamo] auto-unspecialize int input and output to unbacked symints (#143106 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/143106 Approved by: https://github.com/zou3519 ghstack dependencies: #143105, #143545	2025-01-03 19:01:07 +00:00
Tom Ritchford	d25e6e623f	Fix unused Python variables in test/[a-d]* (#134665 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/134665 Approved by: https://github.com/albanD	2024-12-13 22:13:12 +00:00
Yidi Wu	7111cd6ee0	[hop][BE] add util diff_meta with prettier error message. (#142162 ) The error message changes from: ```python -torch._dynamo.exc.Unsupported: Expected branches to return tensors with same metadata. [(tensor_pair, difference)...]:[('pair0:', TensorMetadata(shape=torch.Size([4, 3]), dtype=torch.float32, requires_grad=False, stride=(3, 1), memory_format=None, is_quantized=False, qparams={}), TensorMetadata(shape=torch.Size([2, 3]), dtype=torch.float32, requires_grad=False, stride=(3, 1), memory_format=None, is_quantized=False, qparams={}))] ``` to ```python +torch._dynamo.exc.Unsupported: Expect branches to return tensors with same metadata but find pair[0] differ in 'shape', where lhs is TensorMetadata(shape=torch.Size([4, 3]), dtype=torch.float32, requires_grad=False, stride=(3, 1), memory_format=None, is_quantized=False, qparams={}) and rhs is TensorMetadata(shape=torch.Size([2, 3]), dtype=torch.float32, requires_grad=False, stride=(3, 1), memory_format=None, is_quantized=False, qparams={}) ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/142162 Approved by: https://github.com/zou3519	2024-12-10 21:54:28 +00:00
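The message change in 7111cd6ee0 boils down to reporting, per output pair, which metadata field differs. A minimal sketch of that idea follows; the `Meta` dataclass and `diff_meta_sketch` helper are hypothetical stand-ins written for illustration, not the `diff_meta` utility added by the PR.

```python
from dataclasses import dataclass, fields
from typing import Tuple


@dataclass(frozen=True)
class Meta:
    # Hypothetical stand-in for a tensor's metadata record.
    shape: Tuple[int, ...]
    dtype: str
    stride: Tuple[int, ...]


def diff_meta_sketch(lhs_list, rhs_list):
    """Return one message per pair whose metadata differs, naming the fields."""
    messages = []
    for i, (lhs, rhs) in enumerate(zip(lhs_list, rhs_list)):
        differing = [
            f.name
            for f in fields(Meta)
            if getattr(lhs, f.name) != getattr(rhs, f.name)
        ]
        if differing:
            names = ", ".join(repr(n) for n in differing)
            messages.append(
                f"pair[{i}] differ in {names}, where lhs is {lhs} and rhs is {rhs}"
            )
    return messages


# Two branches returning tensors whose shapes disagree, as in the commit's example.
true_out = [Meta(shape=(4, 3), dtype="float32", stride=(3, 1))]
false_out = [Meta(shape=(2, 3), dtype="float32", stride=(3, 1))]
report = diff_meta_sketch(true_out, false_out)
print(report[0])
```

Naming only the differing fields keeps the message short even when the metadata record itself is long.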
Yidi Wu	9ced54a51a	[hop] lift free symbols in slice (#142385 ) Before the change, we get an unfound proxy error when linting the subgraph. After the change, we have the following dynamo graph for dynamic_shape test. ```python V1209 11:11:06.187000 4091124 torch/_dynamo/output_graph.py:1346] [0/2] [__graph_code] /data/users/yidi/pytorch/torch/fx/_lazy_graph_module.py class GraphModule(torch.nn.Module): V1209 11:11:06.187000 4091124 torch/_dynamo/output_graph.py:1346] [0/2] [__graph_code] def forward(self, s0: "Sym(s0)", s1: "Sym(s1)", s2: "Sym(s2)", L_x_: "f32[s0, s1, s2][s1s2, s2, 1]cpu"): V1209 11:11:06.187000 4091124 torch/_dynamo/output_graph.py:1346] [0/2] [__graph_code] l_x_ = L_x_ V1209 11:11:06.187000 4091124 torch/_dynamo/output_graph.py:1346] [0/2] [__graph_code] V1209 11:11:06.187000 4091124 torch/_dynamo/output_graph.py:1346] [0/2] [__graph_code] # File: /data/users/yidi/pytorch/test/dynamo/test_higher_order_ops.py:307 in f, code: i = x.size(0) - 2 V1209 11:11:06.187000 4091124 torch/_dynamo/output_graph.py:1346] [0/2] [__graph_code] sub: "Sym(s0 - 2)" = s0 - 2 V1209 11:11:06.187000 4091124 torch/_dynamo/output_graph.py:1346] [0/2] [__graph_code] V1209 11:11:06.187000 4091124 torch/_dynamo/output_graph.py:1346] [0/2] [__graph_code] # File: /data/users/yidi/pytorch/test/dynamo/test_higher_order_ops.py:308 in f, code: j = x.size(1) - 3 V1209 11:11:06.187000 4091124 torch/_dynamo/output_graph.py:1346] [0/2] [__graph_code] sub_1: "Sym(s1 - 3)" = s1 - 3 V1209 11:11:06.187000 4091124 torch/_dynamo/output_graph.py:1346] [0/2] [__graph_code] V1209 11:11:06.187000 4091124 torch/_dynamo/output_graph.py:1346] [0/2] [__graph_code] # File: /data/users/yidi/pytorch/test/dynamo/test_higher_order_ops.py:310 in f, code: return wrap(lambda x: x[:i, :j, k:], x) V1209 11:11:06.187000 4091124 torch/_dynamo/output_graph.py:1346] [0/2] [__graph_code] wrap_body_0 = self.wrap_body_0 V1209 11:11:06.187000 4091124 torch/_dynamo/output_graph.py:1346] [0/2] [__graph_code] wrap = 
torch.ops.higher_order.wrap(wrap_body_0, s0, s1, s2, l_x_, sub, sub_1); wrap_body_0 = s0 = s1 = s2 = l_x_ = sub = sub_1 = None V1209 11:11:06.187000 4091124 torch/_dynamo/output_graph.py:1346] [0/2] [__graph_code] getitem: "f32[s0 - 2, s1 - 3, 0][s1s2, s2, 1]cpu" = wrap[0]; wrap = None V1209 11:11:06.187000 4091124 torch/_dynamo/output_graph.py:1346] [0/2] [__graph_code] return (getitem,) V1209 11:11:06.187000 4091124 torch/_dynamo/output_graph.py:1346] [0/2] [__graph_code] V1209 11:11:06.187000 4091124 torch/_dynamo/output_graph.py:1346] [0/2] [__graph_code] class wrap_body_0(torch.nn.Module): V1209 11:11:06.187000 4091124 torch/_dynamo/output_graph.py:1346] [0/2] [__graph_code] def forward(self, s0: "Sym(s0)", s1: "Sym(s1)", s2: "Sym(s2)", l_x_: "f32[s0, s1, s2][s1s2, s2, 1]cpu", sub: "Sym(s0 - 2)", sub_1: "Sym(s1 - 3)"): V1209 11:11:06.187000 4091124 torch/_dynamo/output_graph.py:1346] [0/2] [__graph_code] # File: /data/users/yidi/pytorch/test/dynamo/test_higher_order_ops.py:310 in <lambda>, code: return wrap(lambda x: x[:i, :j, k:], x) V1209 11:11:06.187000 4091124 torch/_dynamo/output_graph.py:1346] [0/2] [__graph_code] getitem: "f32[s0 - 2, s1 - 3, 0][s1s2, s2, 1]cpu" = l_x_[(slice(None, sub, None), slice(None, sub_1, None), slice(s2, None, None))]; l_x_ = sub = sub_1 = s2 = None V1209 11:11:06.187000 4091124 torch/_dynamo/output_graph.py:1346] [0/2] [__graph_code] return (getitem,) ``` We lift sub, sub_1 because they're compound expressions and are directly used in argument of the getitem node. We lift s0, s1 and s2 because they're basic symbols in the tensor input. Pull Request resolved: https://github.com/pytorch/pytorch/pull/142385 Approved by: https://github.com/zou3519	2024-12-10 21:52:30 +00:00
Ryan Guo	9d54cd1504	[dynamo] Undo some jvp old workarounds in functorch (#142081 ) This basically undoes some workarounds introduced in #119926, the root causes of which have been fixed by #142078 and other changes in Dynamo. Now that Dynamo traces the spec comparison code, the test also needs an update: - removing the `_jvp_treespec_compare` calls in the fx graph Pull Request resolved: https://github.com/pytorch/pytorch/pull/142081 Approved by: https://github.com/zou3519 ghstack dependencies: #142078, #142080	2024-12-06 08:06:53 +00:00
Ryan Guo	59de5e867b	[dynamo] Undo some vjp old workarounds in functorch (#142080 ) This basically undoes most of the workarounds introduced in #119405, the root causes of which have been fixed by #142078 and other changes in Dynamo. Now that Dynamo traces the spec comparison code, the test also needs updates: 1. renaming `o` to `primals_out` 2. removing the `_vjp_treespec_compare` calls in the fx graph Pull Request resolved: https://github.com/pytorch/pytorch/pull/142080 Approved by: https://github.com/zou3519 ghstack dependencies: #142078	2024-12-06 08:06:53 +00:00
PyTorch MergeBot	ad37afd590	Revert "Always unspecialize float in OSS (#138922 )" This reverts commit `ba5253da9b`. Reverted https://github.com/pytorch/pytorch/pull/138922 on behalf of https://github.com/yf225 due to perf regression on torchbench ([comment](https://github.com/pytorch/pytorch/pull/138922#issuecomment-2499277511))	2024-11-26 00:03:03 +00:00
Bob Ren	ba5253da9b	Always unspecialize float in OSS (#138922 ) Fixes https://github.com/pytorch/pytorch/issues/107277 Pull Request resolved: https://github.com/pytorch/pytorch/pull/138922 Approved by: https://github.com/ezyang Co-authored-by: Edward Z. Yang <ezyang@meta.com>	2024-11-24 01:58:13 +00:00
PyTorch MergeBot	a8c90e5140	Revert "Always unspecialize float in OSS (#138922 )" This reverts commit `6d779d0549`. Reverted https://github.com/pytorch/pytorch/pull/138922 on behalf of https://github.com/huydhn due to Sorry for reverting your change but there is some slow tests failing after this land ([comment](https://github.com/pytorch/pytorch/pull/138922#issuecomment-2495076878))	2024-11-22 23:18:36 +00:00
Bob Ren	6d779d0549	Always unspecialize float in OSS (#138922 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/138922 Approved by: https://github.com/ezyang Co-authored-by: Edward Z. Yang <ezyang@meta.com>	2024-11-22 17:54:42 +00:00
Guilherme Leobas	7ced49d2cc	Raise exception if vmap (eager) calls compiled function (#140439 ) Fixes #138422 This is not a proper fix for #140439, but more of a way to prevent a user from seeing a nasty error inside the C++ code. Pull Request resolved: https://github.com/pytorch/pytorch/pull/140439 Approved by: https://github.com/zou3519	2024-11-19 16:27:48 +00:00
Ryan Guo	ea1d11cf74	[dynamo] Represent all cells as `NewCellVariable` (#140153 ) In addition to `NewCellVariable`, Dynamo has 3 ways of modeling cell objects: 1. For cells captured and created by the root frame, represent them as their contents in `root_tx.symbolic_locals`, which `LOAD_DEREF` and `STORE_DEREF` update directly, without going through `SideEffects`. 2. `ClosureVariable`: this is created when cells from (1) are captured by a newly created function Dynamo is about to inline. It's a handle with a name that redirects `LOAD_DEREF` and `STORE_DEREF` back to (1), to keep `root_tx.symbolic_locals` up-to-date. 3. For cells that are captured by both the root frame and some pre-existing function Dynamo is about to inline, represent those cells as contents, and do not allow writes to them. Note that (2) and (3) are mainly to conform with (1) -- to make sure Dynamo has a consistent modeling of cells for the same cell objects. In this patch, we represent all of these cells as `NewCellVariable`. The main new code paths introduced are: - using `NewCellVariable` to model cell objects created by the root frame (the cells are passed in as input to `InstructionTranslator`); this is what allows us to get rid of all 3 legacy paths above. - adding a new `AutoDerefLocalSource` to deal with the python-code level (guards) and bytecode level (codegen) auto-dereferencing behavior, when accessing pre-existing python cells. This also involves a tiny update to guard manager generation. - plumbing some extra info into `LocalSource` and `CellVariable` so that we can still emit `LOAD_DEREF`, `STORE_DEREF`, `LOAD_CLOSURE` (instead of `make_cell`, `cell_contents` attribute access, and `LOAD_FAST`), which is important for readability, performance, and some assumptions `bytecode_transformation.py` makes. As a result, this patch removes a lot of the now-dead code paths and TODOs. 
Notably, it significantly simplified the `prune_dead_locals` function, which was duplicating a lot of the logic from `prune_dead_object_new`; this conveniently closes #137123. Pull Request resolved: https://github.com/pytorch/pytorch/pull/140153 Approved by: https://github.com/jansel ghstack dependencies: #140330, #140152, #140436, #140435	2024-11-15 17:17:30 +00:00
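For readers unfamiliar with the cell objects that ea1d11cf74 models, the plain-Python illustration below shows a captured variable living in a cell that both the outer and inner function share. No Dynamo is involved, and `make_counter` is a name chosen for this example only.

```python
def make_counter():
    count = 0  # captured by the inner function, so CPython stores it in a cell

    def increment():
        nonlocal count  # writes go through the cell (STORE_DEREF)
        count += 1
        return count    # reads go through the cell (LOAD_DEREF)

    return increment


counter = make_counter()
cell = counter.__closure__[0]   # the cell object holding `count`
before = cell.cell_contents
counter()
counter()
after = cell.cell_contents
print(before, after)
```

Because the cell is a separate object from the frame's locals, a tracer has to decide how to represent reads and writes to it, which is the modeling question the commit message discusses.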
zeshengzong	cb71bcc542	Replace clone.detach with detach.clone (#140264 ) Fixes #64532 As stated in the issue, replace `clone.detach` with `detach.clone` Pull Request resolved: https://github.com/pytorch/pytorch/pull/140264 Approved by: https://github.com/soulitzer	2024-11-13 07:01:02 +00:00
Yidi Wu	ab42967238	[hop free symbols] lift free symbols in example_value when create_graph_input (#138363 ) There are 5 parts in this PR (they are hard to break into smaller ones because they're highly coupled): 1. Whenever we call create_graph_input, we try to bind the symbols in the graph input. Because we've enforced the invariant that all create_graph_input calls must provide an example value, we can intercept at the create_graph_input calls (this PR only handles free symbols in tensors). 2. We cache the bound_symbols to avoid lifting the same symbol repeatedly. 3. For lifted symbols, we reuse lifted_freevars, i.e. the mapping from a symbol's proxy in the parent graph to the lifted placeholders (phs) in the current subgraph, which is the same mapping we use to handle lifted tensors. In this way, all hops that support lifted tensors should be able to handle lifted_symints automatically (at least in the dynamo part). 4. For unbacked symbols created during tracing, we also need to bind these symbols to their proxies. This supports the test cases where we want to lift unbacked symbols as inputs: we need the proxy of the unbacked symbol in the parent graph in order to properly create the args to the hop. 5. We update all the tests now that free symbols are lifted in subgraphs, and also support the lifted symbols in existing higher order ops. The interaction of nested tracers: The previous design for lifting tensor closures is as follows: suppose we're in nested tracers; whenever we see a new proxy that's not created by the current tracer, we recursively look for the proxy in the parent tracers until we find the tracer that created it (either as a placeholder or as some intermediate result). More detail is in Note [Nested SubgraphTracer and free_variable handling]. Given the above design, the plan for lifting free symbols is: whenever we lift a free tensor to be an input of the current subgraph, we look at the symbols in it and bind them at the same time. 
For example, suppose we have the following function: ```python def f(x: [s1, s2]): def true_f(): def true_f_inner(): return x.sin() ``` What happens, in time order: 1. We create subtracer 1 and start to speculate the outer cond's true_f. 2. We create another subtracer 2 and start to speculate the inner cond's true_f_inner. 3. Dynamo realizes the tensor input x by calling wrap_tensor at the top level to create graph input x (tracer 0); we bind the symbols s1 and s2 after the ph for x is created. So the graph now looks like: ```python def gm(s1, s2, x): ``` 4. When seeing TensorVariable.call_method of x, tracer 2 wants to create a call_function(sin, proxy_of_x), but it finds that proxy_of_x is not created by the current tracer. So it recursively looks up its parent tracer 1, finds that tracer 1 doesn't track proxy_of_x either, and then finds the root tracer 0, which created it and tracks it as a ph. Tracer 1 then calls create_graph_input to lift the closure to its input ph1 and adds the (proxy_of_x: ph1) key-value pair to lifted_freevars of tracer 1. Now the graph looks like: ```python def gm(s1, s2, x): def true_gm(x): ``` 5. Since there are free symbols inside this new tensor input, tracer 1 also binds the symbols (maybe_bind_symbol), which calls create_graph_input for s1 and s2. Now the graph looks like: ```python def gm(s1, s2, x): def true_gm(s1, s2, x): ``` 6. It then goes back to tracer 2, which calls create_graph_input for x and gets ph2; tracer 2's lifted_freevars records (ph1, ph2), and tracer 2 also binds the symbols in this new tensor input. Now the graph looks like: ```python def gm(s1, s2, x): def true_gm(s1, s2, x): def true_gm_inner(s1, s2, x): ``` 7. Finally the sin call_function node is created by tracer 2. This PR also handles the following cases: - What if we lift two tensors that share the same symbol, e.g. x1 [s1, s2], x2 [s2, s3]? Each subtracer maintains bound_symbols as a cache that maps a symbol.expr to its proxy in the current tracer. 
So when we see x1, we track s1 and s2 as inputs and bind s1 to ph1 and s2 to ph2. When we then try to bind the symbols of x2, s2 is already tracked, so no graph input is created for it. - What if a subgraph closes over a symint? e.g. ```python def f(x): def true_f(): c = x.size(0) def true_fn_inner(): return c ``` When we speculate true_fn_inner, we find proxy_of_c is not tracked by tracer 2, so it recursively looks up its parent. At this point, x and its symbols have been lifted as inputs of true_f (as a result of lifting x while tracing true_f in tracer 1). Specifically, the graph looks like: ```python def gm(s1, s2, x): def true_gm(s1, s2, x): def true_gm_inner(): ``` So tracer 2 is able to find that s1 has been tracked as a ph in tracer 1, so it returns to gm and calls create_graph_input on s1. The graph now looks like: ```python def gm(s1, s2, x): def true_gm(s1, s2, x): def true_gm_inner(s1): return s1 ``` - What if a subgraph closes over an unbacked symint? e.g. ```python def f(x): def true_f(): c = x.item() def true_f_inner(): return c ``` When x.item() is called, proxy_of_c and its symnode variable are created for tracer 1, and we also call track_unbacked_symbols to record this relationship. So when tracer 2 finds proxy_of_c is not created by the current tracer, it recursively looks up its parent tracer and finds that the expression u0 has been tracked as a result of track_unbacked_symbol in tracer 1. So it stops the recursion and calls create_graph_input for u0 in tracer 2. The graph looks like: ```python def f(x): def true_f(s1, s2, x): c = x.item() def true_gm_inner(u0): return u0 cond(pred, true_gm_inner, false_gm_inner, (c,)) ``` - What if a subgraph closes over a tensor with an unbacked symint shape? ```python def f(x): def true_f(): c = x.item() r = torch.randn((c,)) def true_f_inner(): return r + 1 ``` This is the same as the case of closing over tensors with backed shapes:
we first lift r, then bind u0 in it, which recursively calls bind_symint on u0 in its parent and finds that u0 is tracked in the parent tracer as a result of the .item() call. Pull Request resolved: https://github.com/pytorch/pytorch/pull/138363 Approved by: https://github.com/zou3519	2024-11-07 04:44:32 +00:00
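The nested-tracer walkthrough in ab42967238 can be mimicked with a toy model. The sketch below is not Dynamo code: `ToyTracer` and its methods are hypothetical, placeholders are plain strings, and symbols are bound before the tensor placeholder is created purely to reproduce the `(s1, s2, x)` input order shown in the commit message.

```python
class ToyTracer:
    """Toy model of a (sub)graph tracer; names are illustrative, not Dynamo's."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.inputs = []           # graph input names, in creation order
        self.bound_symbols = {}    # symbol -> placeholder in this tracer (cache)
        self.lifted_freevars = {}  # free variable -> placeholder in this tracer

    def create_graph_input(self, label):
        self.inputs.append(label)
        return f"{self.name}:{label}"

    def bind_symbol(self, sym):
        # Cache hit: the symbol is already an input of this tracer.
        if sym in self.bound_symbols:
            return self.bound_symbols[sym]
        # Ensure every ancestor tracks the symbol first, then lift it here.
        if self.parent is not None:
            self.parent.bind_symbol(sym)
        ph = self.create_graph_input(sym)
        self.bound_symbols[sym] = ph
        return ph

    def lift_tensor(self, label, shape_symbols):
        # Already an input of this tracer: reuse the placeholder.
        if label in self.lifted_freevars:
            return self.lifted_freevars[label]
        # Walk up until some ancestor owns the tensor, lifting on the way back.
        if self.parent is not None:
            self.parent.lift_tensor(label, shape_symbols)
        for sym in shape_symbols:
            self.bind_symbol(sym)
        ph = self.create_graph_input(label)
        self.lifted_freevars[label] = ph
        return ph


tracer0 = ToyTracer("gm")
tracer1 = ToyTracer("true_gm", parent=tracer0)
tracer2 = ToyTracer("true_gm_inner", parent=tracer1)

# The innermost tracer touches x with shape [s1, s2]; every level lifts it.
tracer2.lift_tensor("x", ["s1", "s2"])
print(tracer0.inputs, tracer1.inputs, tracer2.inputs)

# A second tensor sharing s2 creates inputs only for s3 and the tensor itself.
tracer2.lift_tensor("x2", ["s2", "s3"])
print(tracer2.inputs)
```

The `bound_symbols` cache is what keeps `s2` from being lifted twice in the second call, matching the shared-symbol case described in the commit message.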
Yidi Wu	dc3a6a9d08	[hop free symbols][refactor] make create_graph_input always take example_value (#138428 ) Code refactoring only. We move the wrap_to_fake_tensor_logic out of wrap_fx_proxy for placeholders to provide the invariant that all graph inputs must set their example values when creating the inputs. This invariant helps us to identify all the free symbols in the graph in top-level and sub-graphs. Pull Request resolved: https://github.com/pytorch/pytorch/pull/138428 Approved by: https://github.com/ezyang, https://github.com/zou3519 ghstack dependencies: #138345	2024-11-04 22:47:49 +00:00
PyTorch MergeBot	b6b9596607	Revert "[dynamo] Fix constant propagation in builtins and UserClasses (#131354 )" This reverts commit `44257c063e`. Reverted https://github.com/pytorch/pytorch/pull/131354 on behalf of https://github.com/huydhn due to Sorry for reverting your change, but it seems to break some internal tests ([comment](https://github.com/pytorch/pytorch/pull/131354#issuecomment-2451050605))	2024-11-01 00:13:20 +00:00
Tom Ritchford	44257c063e	[dynamo] Fix constant propagation in builtins and UserClasses (#131354 ) * Fixes https://github.com/pytorch/pytorch/issues/118675 * Replaces https://github.com/pytorch/pytorch/pull/118994 Pull Request resolved: https://github.com/pytorch/pytorch/pull/131354 Approved by: https://github.com/jansel, https://github.com/anijain2305	2024-10-30 12:47:20 +00:00
Yuanhao Ji	1a2dc89f17	[Dynamo] Allow `torch.cond()` to handle empty arguments (#138190 ) Fixes #138150 ```python import torch @torch.compile(fullgraph=True) def foo(x, y, z): def f(): return y + 2 def g(): return z + 1 return torch.cond(x, f, g) print(foo(torch.zeros(1), torch.ones(1), torch.ones(1))) # tensor([2.]) print(foo(torch.ones(1), torch.ones(1), torch.ones(1))) # tensor([3.]) ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/138190 Approved by: https://github.com/ezyang, https://github.com/zou3519	2024-10-26 15:26:21 +00:00
Ryan Guo	dd7c2899bd	[dynamo] Properly prune dead cell local variables (#136891 ) This patch updates the `prune_dead_locals` logic to do slightly more aggressive pruning for cell local variables, in absence of side-effects, e.g., a cell variable can be pruned when its user function(s) will never be used again. See added tests for examples; note that a few tests in `test/dynamo/test_higher_order_ops.py` also got updated because we are no longer returning the unnecessary graph output. Fixes #127350, #124653 Pull Request resolved: https://github.com/pytorch/pytorch/pull/136891 Approved by: https://github.com/jansel, https://github.com/anijain2305, https://github.com/williamwen42, https://github.com/zou3519	2024-10-10 18:21:24 +00:00
Yanbo Liang	a408cfcbf1	[torch.compile] torch.vmap supports dynamic shapes + enable flex attention create_block_mask dynamic shapes (#137163 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/137163 Approved by: https://github.com/Chillee	2024-10-04 05:16:04 +00:00
Bob Ren	13ec343afe	clean up capture_func_transforms flag (#136960 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/136960 Approved by: https://github.com/guilhermeleobas, https://github.com/jansel	2024-10-04 01:10:52 +00:00
PyTorch MergeBot	9223c16208	Revert "Fix constant propagation in builtins and UserClasses (#131354 )" This reverts commit `dd4a51b39a`. Reverted https://github.com/pytorch/pytorch/pull/131354 on behalf of https://github.com/atalman due to Breaks torchrec tests ([comment](https://github.com/pytorch/pytorch/pull/131354#issuecomment-2375417145))	2024-09-25 23:01:03 +00:00
Tom Ritchford	dd4a51b39a	Fix constant propagation in builtins and UserClasses (#131354 ) * Fixes https://github.com/pytorch/pytorch/issues/118675 * Replaces https://github.com/pytorch/pytorch/pull/118994 Pull Request resolved: https://github.com/pytorch/pytorch/pull/131354 Approved by: https://github.com/jansel, https://github.com/anijain2305	2024-09-25 13:03:40 +00:00
Yidi Wu	b07d0a22f5	[hop] require hops to override __call__. (#134352 ) Fixes https://github.com/pytorch/pytorch/issues/133719 by making `__call__` of hops an abstractmethod. Pull Request resolved: https://github.com/pytorch/pytorch/pull/134352 Approved by: https://github.com/zou3519	2024-08-28 19:56:40 +00:00
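The effect of making `__call__` an abstract method, as b07d0a22f5 describes, can be seen with a toy base class. This is only a sketch using the standard `abc` module: `ToyHigherOrderOp` and `ToyWrap` are hypothetical names, not PyTorch's `HigherOrderOperator` or any of its subclasses.

```python
import abc


class ToyHigherOrderOp(abc.ABC):
    """Toy base class standing in for an operator base; not PyTorch's class."""

    def __init__(self, name):
        self.name = name

    @abc.abstractmethod
    def __call__(self, *args, **kwargs):
        """Subclasses must define how the operator is invoked."""


class ToyWrap(ToyHigherOrderOp):
    def __init__(self):
        super().__init__("toy_wrap")

    def __call__(self, fn, *args):
        # Simply run the wrapped function on the given arguments.
        return fn(*args)


# Instantiating the base class directly fails because __call__ is abstract.
try:
    ToyHigherOrderOp("bare")
    direct_instantiation_failed = False
except TypeError:
    direct_instantiation_failed = True

result = ToyWrap()(lambda a, b: a + b, 1, 2)
print(direct_instantiation_failed, result)
```

With this pattern a subclass that forgets to define `__call__` fails at construction time rather than at first use.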
Wuxun Zhang	1d231ff8ba	[HOO] add hints_wrapper to support passing context hints (#132860 ) Fixes #126393 The implementation code is based on feedback here (https://github.com/pytorch/pytorch/pull/121639#issuecomment-2223948842). Hints are passed as kwargs of hints_wrapper op. It also supports nested hints. Pull Request resolved: https://github.com/pytorch/pytorch/pull/132860 Approved by: https://github.com/ydwu4, https://github.com/zou3519	2024-08-26 18:21:22 +00:00
Xu Han	2ec149cd3e	[inductor] fix test_functional_call_sequential_params_and_buffers expectation on Windows (#134394 ) The actual output of this UT differs between Windows and Linux by only one empty line (between `linear` and `add`); the rest of the content is correct. To reproduce the UT: ```cmd pytest test\dynamo\test_higher_order_ops.py -v -k test_functional_call_sequential_params_and_buffers ``` We can add `empty_line_normalizer` to fix it. ```cmd ______________________________________________________________________________________________ FuncTorchHigherOrderOpTests.test_functional_call_sequential_params_and_buffers _______________________________________________________________________________________________ Traceback (most recent call last): File "D:\xu_git\dnnl_cb\pytorch\test\dynamo\test_higher_order_ops.py", line 3676, in test_functional_call_sequential_params_and_buffers self.assertExpectedInline( File "C:\Users\Xuhan\.conda\envs\win_mkl_static\lib\site-packages\torch\testing\_internal\common_utils.py", line 2871, in assertExpectedInline return super().assertExpectedInline(actual if isinstance(actual, str) else str(actual), expect, skip + 1) File "C:\Users\Xuhan\.conda\envs\win_mkl_static\lib\site-packages\expecttest\__init__.py", line 271, in assertExpectedInline self.assertMultiLineEqualMaybeCppStack(expect, actual, msg=help_text) File "C:\Users\Xuhan\.conda\envs\win_mkl_static\lib\site-packages\expecttest\__init__.py", line 292, in assertMultiLineEqualMaybeCppStack self.assertMultiLineEqual(expect, actual, *args, **kwargs) File "C:\Users\Xuhan\.conda\envs\win_mkl_static\lib\unittest\case.py", line 1226, in assertMultiLineEqual self.fail(self._formatMessage(msg, standardMsg)) File "C:\Users\Xuhan\.conda\envs\win_mkl_static\lib\unittest\case.py", line 675, in fail raise self.failureException(msg) AssertionError: 'clas[509 chars]one\n add: "f32[1, 1]" = linear + l_buf[69 chars],)\n' != 'clas[509 chars]one\n\n add: "f32[1, 1]" = linear + l_b[71 chars],)\n' class 
GraphModule(torch.nn.Module): def forward(self, L_params_l1_weight_: "f32[1, 1]", L_params_l1_bias_: "f32[1]", L_buffers_buffer_: "f32[1]", L_inputs_: "f32[1, 1]"): l_params_l1_weight_ = L_params_l1_weight_ l_params_l1_bias_ = L_params_l1_bias_ l_buffers_buffer_ = L_buffers_buffer_ l_inputs_ = L_inputs_ linear: "f32[1, 1]" = torch._C._nn.linear(l_inputs_, l_params_l1_weight_, l_params_l1_bias_); l_inputs_ = l_params_l1_weight_ = l_params_l1_bias_ = None + <<<< (difference is here ) add: "f32[1, 1]" = linear + l_buffers_buffer_; linear = l_buffers_buffer_ = None return (add,) : To accept the new output, re-run test with envvar EXPECTTEST_ACCEPT=1 (we recommend staging/committing your changes before doing this) To execute this test, run the following from the base repo dir: python test\dynamo\test_higher_order_ops.py FuncTorchHigherOrderOpTests.test_functional_call_sequential_params_and_buffers This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 ========================================================================================================================== short test summary info ========================================================================================================================== FAILED [0.4275s] test/dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_functional_call_sequential_params_and_buffers - AssertionError: 'clas[509 chars]one\n add: "f32[1, 1]" = linear + l_buf[69 chars],)\n' != 'clas[509 chars]one\n\n add: "f32[1, 1]" = linear + l_b[71 chars],)\n' ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/134394 Approved by: https://github.com/jansel Co-authored-by: Jason Ansel <jansel@jansel.net>	2024-08-26 01:41:20 +00:00
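The failure in 2ec149cd3e comes down to two strings that differ only by a blank line. The sketch below shows the general normalization idea with a hypothetical `strip_blank_lines` helper; it is not the `empty_line_normalizer` referenced in the commit, whose exact behavior is not shown here.

```python
def strip_blank_lines(text: str) -> str:
    """Drop lines that are empty or whitespace-only, keeping the rest in order."""
    return "\n".join(line for line in text.splitlines() if line.strip())


# Two renderings of the same graph code; graph_b has one extra blank line.
graph_a = "linear = f(x)\nadd = linear + buf\nreturn (add,)\n"
graph_b = "linear = f(x)\n\nadd = linear + buf\nreturn (add,)\n"

same_raw = graph_a == graph_b
same_after_normalizing = strip_blank_lines(graph_a) == strip_blank_lines(graph_b)
print(same_raw, same_after_normalizing)
```

Normalizing both the expected and the actual string before comparison keeps the test sensitive to real content changes while ignoring platform-dependent blank lines.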
Yidi Wu	a23d86c178	[hop] ban creating hop by directly instantiating HigherOrderOperator. (#133645) Pull Request resolved: https://github.com/pytorch/pytorch/pull/133645 Approved by: https://github.com/zou3519	2024-08-23 17:28:02 +00:00
PyTorch MergeBot	1491a61769	Revert "[hop] ban creating hop by directly instantiating HigherOrderOperator. (#133645)" This reverts commit `696107efcb`. Reverted https://github.com/pytorch/pytorch/pull/133645 on behalf of https://github.com/ydwu4 due to breaking CI, probably due to a land race ([comment](https://github.com/pytorch/pytorch/pull/133645#issuecomment-2302866106))	2024-08-21 19:33:14 +00:00
Yidi Wu	696107efcb	[hop] ban creating hop by directly instantiating HigherOrderOperator. (#133645) Pull Request resolved: https://github.com/pytorch/pytorch/pull/133645 Approved by: https://github.com/zou3519 ghstack dependencies: #133521	2024-08-21 17:34:21 +00:00
Guilherme Leobas	a9954d22f8	Raise exception if torch.func.* calls torch.compile functions (#128736) Pull Request resolved: https://github.com/pytorch/pytorch/pull/128736 Approved by: https://github.com/zou3519	2024-08-08 20:21:44 +00:00
Zhengxu Chen	942ffd1b2d	Make the __module__ name of HOO to be always "torch.ops.higher_order" (#132775) Summary: It seems that we can just make this the default so that in the future all the ops printed in the graph should be like torch.ops.higher_order Test Plan: CI Differential Revision: D60530900 Pull Request resolved: https://github.com/pytorch/pytorch/pull/132775 Approved by: https://github.com/ydwu4, https://github.com/zou3519	2024-08-08 16:55:09 +00:00
angelayi	a270800f0b	[export][reland] Add print_readable to unflattened module (#132817) Reland https://github.com/pytorch/pytorch/pull/128617 Pull Request resolved: https://github.com/pytorch/pytorch/pull/132817 Approved by: https://github.com/pianpwk	2024-08-08 06:05:30 +00:00
rzou	2073ddfd1c	Actually report the HOP and subclass/mode when there isn't a registration (#132550) Test Plan: - tests Pull Request resolved: https://github.com/pytorch/pytorch/pull/132550 Approved by: https://github.com/ydwu4	2024-08-06 21:33:10 +00:00
David Berard	5973aec671	[fx] python_code(verbose=True): show size/strides for all tensors (#132192)

python_code(verbose=True) (or print_readable()) generates a string with the code representing the fx graph, with extra annotations indicating the size or stride of each tensor. Currently, it only shows sizes/strides for FakeTensors provided in metadata. For subclass tensors like NestedTensor, the outer class (provided in the node metadata) will be a non-FakeTensor and the inner tensors will be fake. This PR expands the conditional to show sizes/strides for all tensors, not just FakeTensors.

Testing: I ran the test script below with `TORCH_LOGS=+dynamo` and found the graph shown below in the logs; the input nested tensor has sizes and strides associated with it. I also stacked a diff on top of this one that forces the readable graph to be generated whenever PT2 is in use in tests, which should hopefully find any issues; https://github.com/pytorch/pytorch/pull/132195 shows no significant failures except for preexisting failures.

test script:
```python
import torch

def fn(x):
    return x.cos()

nt = torch.nested.nested_tensor_from_jagged(
    torch.randn(10, 10),
    torch.tensor([0, 1, 3, 6, 10]),
)

torch.compile(fn)(nt)
```

logs excerpt:
```
[0/0] [__graph_code] TRACED GRAPH
[0/0] [__graph_code] ===== __compiled_fn_1 =====
[0/0] [__graph_code] /data/users/dberard/pytorch/torch/fx/_lazy_graph_module.py class GraphModule(torch.nn.M
[0/0] [__graph_code]     def forward(self, L_x_: "f32[4, zf1, 10][10zf1, 10, 1]cpu", zf1: "Sym(zf1)"):
[0/0] [__graph_code]         l_x_ = L_x_
[0/0] [__graph_code]
[0/0] [__graph_code]         # File: /data/users/dberard/scripts/nt_print_graph.py:4 in fn, code: return x.c
[0/0] [__graph_code]         cos: "f32[4, zf1, 10][10zf1, 10, 1]cpu" = l_x_.cos(); l_x_ = None
[0/0] [__graph_code]         return (cos,)
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132192 Approved by: https://github.com/Chillee	2024-08-03 02:54:32 +00:00
PyTorch MergeBot	3855ac5a5d	Revert "[export] Add print_readable to unflattener (#128617)" This reverts commit `ab9791c0e3`. Reverted https://github.com/pytorch/pytorch/pull/128617 on behalf of https://github.com/angelayi because it never got landed internally due to a weird flow... sorry ([comment](https://github.com/pytorch/pytorch/pull/128617#issuecomment-2264224466))	2024-08-01 23:47:29 +00:00
Oguz Ulgen	920f0426ae	Add None return type to init -- tests rest (#132376) Pull Request resolved: https://github.com/pytorch/pytorch/pull/132376 Approved by: https://github.com/jamesjwu ghstack dependencies: #132335, #132351, #132352	2024-08-01 15:44:51 +00:00
YangQun1	589aef4bb0	Fix py codegen to delete values that don't have any users (#131028) Fixes #131025 Pull Request resolved: https://github.com/pytorch/pytorch/pull/131028 Approved by: https://github.com/ezyang	2024-08-01 03:18:37 +00:00
ekamiti	9e473fd868	Make adding Buffers more like adding Parameters (#125971) Add semantics for creating a buffer object that mirror those for creating a parameter. This is done by introducing a new Buffer class that can be used for type disambiguation. The underlying functionality of registering a buffer remains the same, as the register_buffer method has not been changed. The persistent parameter of the Buffer type indicates whether a buffer object should be persistent. Other non-test changes get the new Buffer type recognized by inductor and dynamo. The remaining changes are test changes that make sure the Buffer type can be used as a drop-in replacement for register_buffer, since it simply leads to register_buffer being called. This new functionality still allows normal tensors to be used as buffers, so these changes are intended to be backwards compatible. Fixes #35735 Co-authored-by: Mikayla Gawarecki <mikaylagawarecki@gmail.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/125971 Approved by: https://github.com/albanD, https://github.com/anijain2305, https://github.com/mlazos	2024-07-31 10:32:40 +00:00
Guilherme Leobas	a843178529	Let dynamo inline functional_call (#128646) Pull Request resolved: https://github.com/pytorch/pytorch/pull/128646 Approved by: https://github.com/zou3519	2024-07-30 14:22:23 +00:00
angelayi	ab9791c0e3	[export] Add print_readable to unflattener (#128617)

Taking inspiration from `GraphModule.print_readable` (aka I copied its [code](`17b45e905a/torch/fx/graph_module.py (L824)`)), I added a `print_readable` to the unflattened module, because it's kind of nontrivial to print the contents of this module.

Example print from `python test/export/test_unflatten.py -k test_unflatten_nested`
```
class UnflattenedModule(torch.nn.Module):
    def forward(self, x: "f32[2, 3]"):
        # No stacktrace found for following nodes
        rootparam: "f32[2, 3]" = self.rootparam

        # File: /data/users/angelayi/pytorch2/test/export/test_unflatten.py:99 in forward, code: x = x * self.rootparam
        mul: "f32[2, 3]" = torch.ops.aten.mul.Tensor(x, rootparam); x = rootparam = None

        # No stacktrace found for following nodes
        foo: "f32[2, 3]" = self.foo(mul); mul = None
        bar: "f32[2, 3]" = self.bar(foo); foo = None
        return (bar,)

    class foo(torch.nn.Module):
        def forward(self, mul: "f32[2, 3]"):
            # No stacktrace found for following nodes
            child1param: "f32[2, 3]" = self.child1param
            nested: "f32[2, 3]" = self.nested(mul); mul = None

            # File: /data/users/angelayi/pytorch2/test/export/test_unflatten.py:79 in forward, code: return x + self.child1param
            add: "f32[2, 3]" = torch.ops.aten.add.Tensor(nested, child1param); nested = child1param = None
            return add

        class nested(torch.nn.Module):
            def forward(self, mul: "f32[2, 3]"):
                # File: /data/users/angelayi/pytorch2/test/export/test_unflatten.py:67 in forward, code: return x / x
                div: "f32[2, 3]" = torch.ops.aten.div.Tensor(mul, mul); mul = None
                return div

    class bar(torch.nn.Module):
        def forward(self, add: "f32[2, 3]"):
            # No stacktrace found for following nodes
            child2buffer: "f32[2, 3]" = self.child2buffer

            # File: /data/users/angelayi/pytorch2/test/export/test_unflatten.py:87 in forward, code: return x - self.child2buffer
            sub: "f32[2, 3]" = torch.ops.aten.sub.Tensor(add, child2buffer); add = child2buffer = None
            return sub
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128617 Approved by: https://github.com/zhxchen17, https://github.com/pianpwk	2024-07-30 00:41:44 +00:00
PyTorch MergeBot	f72266ecea	Revert "Let dynamo inline functional_call (#128646)" This reverts commit `5aab1acc84`. Reverted https://github.com/pytorch/pytorch/pull/128646 on behalf of https://github.com/clee2000 because the newly added test dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_functional_call_sequential_params_and_buffers [GH job link](https://github.com/pytorch/pytorch/actions/runs/10147452270/job/28058682000) [HUD commit link](`5aab1acc84`) is broken, probably a land race since it passed on the PR ([comment](https://github.com/pytorch/pytorch/pull/128646#issuecomment-2256375501))	2024-07-29 16:26:50 +00:00
Guilherme Leobas	5aab1acc84	Let dynamo inline functional_call (#128646) Pull Request resolved: https://github.com/pytorch/pytorch/pull/128646 Approved by: https://github.com/zou3519 ghstack dependencies: #129091, #130490	2024-07-29 15:41:03 +00:00

219 Commits