In TorchVision we use the following (simplified) dispatch mechanism:
```python
import torch
def kernel1(tensor):
    return tensor + 2


def dispatcher1(input):
    kernel = get_kernel(dispatcher1, type(input))
    return kernel(input)


def kernel2(tensor):
    return tensor - 2


def dispatcher2(input):
    kernel = get_kernel(dispatcher2, type(input))
    return kernel(input)


# We actually use the function and type as keys, rather than their names.
# However, this is currently not supported, but should be easy to add after
# https://github.com/pytorch/pytorch/pull/111196
REGISTRY = {
    "dispatcher1": {"Tensor": kernel1},
    "dispatcher2": {"Tensor": kernel2},
}


def get_kernel(dispatcher, input_type):
    dispatcher_registry = REGISTRY[dispatcher.__name__]
    for cls in input_type.__mro__:
        kernel = dispatcher_registry[cls.__name__]
        break
    return kernel
```
This can be compiled without graph breaks:
```python
cfn = torch.compile(dispatcher1, fullgraph=True)
torch.testing.assert_close(int(cfn(torch.tensor(3))), 5)
cfn = torch.compile(dispatcher2, fullgraph=True)
torch.testing.assert_close(int(cfn(torch.tensor(3))), 1)
```
However, if we start chaining these calls, we hit some issues:
```python
class Pipeline(torch.nn.Module):
    def forward(self, input):
        input = dispatcher1(input)
        input = dispatcher2(input)
        return input


cfn = torch.compile(Pipeline(), fullgraph=True)
torch.testing.assert_close(int(cfn(torch.tensor(3))), 3)
```
```
Can't access members of type(obj) for a generated custom object. Please use __class__ instead
```
The error message is not really helpful here. What happens is the following: when compiling `dispatcher1`, `get_kernel` gets inlined. That means that when we hit `dispatcher2`, the `type` call no longer happens on an input with a source. Thus, in the first iteration we hit the top branch, while in the second we hit the bottom:
addb8e29cd/torch/_dynamo/variables/builtin.py (L1264-L1268)
The error message I posted above originates from the type being treated as a constant. This PR uses a `SourcelessBuilder` there instead.
With that fix in place, we hit another error, this time pointing to `input_type.__mro__`:
```
AssertionError: Consider SourcelessBuilder for ephemeral objects, usually objects created locally.
```
The fix is similar: instead of using a `VariableBuilder` here, we use a `SourcelessBuilder` when we have no `source`:
addb8e29cd/torch/_dynamo/variables/builtin.py (L1167-L1168)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113340
Approved by: https://github.com/peterbell10, https://github.com/lezcano
This PR fixes two cases where FX-generated code is invalid Python (i.e. a syntax error); a quick standalone syntax check follows the list:
1. multiple type annotations on one line: `var1: annotation1, var2: annotation2 = function_call()`
2. invalid type annotations for scalars, like `var1: f32[] = function_call()`.
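As a sanity check (a standalone sketch, not taken from the PR's tests), both generated forms are rejected by Python's own parser; how the PR actually rewrites them is not shown here:
```python
import ast

# Both forms named above fail to parse as Python.
for src in (
    "var1: annotation1, var2: annotation2 = function_call()",  # case 1
    "var1: f32[] = function_call()",                           # case 2
):
    try:
        ast.parse(src)
        print("parsed:", src)
    except SyntaxError as exc:
        print("SyntaxError:", src, "->", exc.msg)
```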
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113345
Approved by: https://github.com/ezyang
Subsumes https://github.com/pytorch/pytorch/pull/110794
Fixes https://github.com/pytorch/pytorch/issues/110315
This is not really a 100% sound fix, a deeper analysis of the bug can be found at https://docs.google.com/document/d/1y-nRAPdbZEji52MPKYzC0U3VhvW9yEAEDqP5t5GhWZ0/edit
The general idea behind the fix here is that we are going to play fast and loose with user-defined classes: as Dynamo is written today, we are willing to pull out these types and directly manipulate them (e.g., look at their `__mro__`, etc.) without an intervening VariableTracker. As such, whenever I use `python_type` to extract the Python type of a VT, or manually read out the `__bases__` of a type (which may be a user-defined class), and that type is sourceless, all I need to do is use SourcelessBuilder instead of ConstantVariable to make sure I wrap it into the correct VT class.
The approach in https://github.com/pytorch/pytorch/pull/110794 was "more correct", but we'd have to go substantially further to get it all working. So I am doing this to unblock suo for now.
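As an illustration only (the stand-in classes below are hypothetical, not Dynamo's real `VariableBuilder`/`SourcelessBuilder`/`ConstantVariable` APIs), the shape of the decision described above looks roughly like this: values that come with a source go through the source-tracking builder, while ephemeral values, such as a type pulled out during inlining, fall back to a sourceless wrapper.
```python
# Stand-in sketch; names and classes here are hypothetical, not Dynamo's real ones.
class TrackedVariable:
    def __init__(self, value, source):
        self.value, self.source = value, source


class SourcelessVariable:
    def __init__(self, value):
        self.value = value


def wrap_type(type_obj, source=None):
    # The bug: sourceless types used to be wrapped as if they were constants.
    # The fix: fall back to a sourceless wrapper whenever there is no source.
    if source is not None:
        return TrackedVariable(type_obj, source)
    return SourcelessVariable(type_obj)


print(wrap_type(int, source="L['input']").source)  # source-tracked wrapping
print(type(wrap_type(int)).__name__)                # SourcelessVariable
```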
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113390
Approved by: https://github.com/suo
We set manually_set_graph_inputs to False for CondHigherOrder. After that, it became necessary to deduplicate the inputs. We'll add pytree tests in a follow-up PR.
Test Plan:
existing tests.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111611
Approved by: https://github.com/zou3519
ghstack dependencies: #111610
Previously we were generating a graph to add runtime assertions on inputs and then running that graph to check input constraints. This PR checks input constraints directly.
Differential Revision: D50289970
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111262
Approved by: https://github.com/zhxchen17
When mapping between the original signature of a program and the graph-captured signature of its exported program, we emit errors when we see unexpected original or graph-captured inputs or outputs.
These errors can arise because of various reasons, e.g.:
1. some input or output has been lifted because of mutation
2. some type is not pytree-registered for flattening / unflattening
3. some type cannot be realized with graph operations
(This is probably not an exhaustive list.)
Previously we used to emit errors based on a vanilla id-based membership check between the two sides, mostly anticipating (1) as the reason for errors. But this does not do justice to errors because of (2) or (3).
This PR emits a different error when it finds (3) to be a probable cause. Specifically, it considers only Tensor and Sym* types to be "supported": no other type seems to be realizable by graph operations.
When (2) is a probable cause, we sometimes also hit the same error because we would expect the supported types to show through upon registration. But this kind of error may need some more work in the future.
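For intuition, a minimal sketch (not the actual export code) of the kind of "supported type" check described above:
```python
import torch

# Only Tensor and Sym* values are treated as realizable by graph operations.
def is_graph_realizable(value) -> bool:
    return isinstance(value, (torch.Tensor, torch.SymInt, torch.SymFloat, torch.SymBool))

print(is_graph_realizable(torch.ones(2)))   # True
print(is_graph_realizable("some string"))   # False -> would now get the clearer error
```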
Differential Revision: [D49885828](https://our.internmc.facebook.com/intern/diff/D49885828/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110472
Approved by: https://github.com/ydwu4
A resubmit of https://github.com/pytorch/pytorch/pull/108447. Copying over the description:
This is a follow-up of the discussion in https://github.com/pytorch/pytorch/pull/108356, where we want to replace source_fn with source_fn_stack.
Before this PR, for the following example:
```python
backend = EagerAndRecordGraphs()

@torch.compile(backend=backend, fullgraph=True)
def cond_f(pred, pred2, x, y):
    def true_fn(pred2, x, y):
        return x + y

    def false_fn(pred2, x, y):
        def true_fn2(x, y):
            return x.sin() - y.cos()

        def false_fn2(x, y):
            return x.cos() - y.sin()

        return control_flow.cond(pred2, true_fn2, false_fn2, (x, y))

    return control_flow.cond(pred, true_fn, false_fn, (pred2, x, y))
```
The graph captured is shown below:
```python
class GraphModule(torch.nn.Module):
    def forward(self, L_pred_ : torch.Tensor, L_pred2_ : torch.Tensor, L_x_ : torch.Tensor, L_y_ : torch.Tensor):
        l_pred_ = L_pred_
        l_pred2_ = L_pred2_
        l_x_ = L_x_
        l_y_ = L_y_
        cond_true_1 = self.cond_true_1
        cond_false_1 = self.cond_false_1
        cond = torch.ops.higher_order.cond(l_pred_, cond_true_1, cond_false_1, [l_pred2_, l_x_, l_y_]); l_pred_ = cond_true_1 = cond_false_1 = l_pred2_ = l_x_ = l_y_ = None
        return (cond,)

    class GraphModule(torch.nn.Module):
        def forward(self, l_pred2_, l_x_, l_y_):
            add = l_x_ + l_y_; l_x_ = l_y_ = None
            return add

    class GraphModule(torch.nn.Module):
        def forward(self, l_pred2_, l_x_, l_y_):
            cond_true_0 = self.cond_true_0
            cond_false_0 = self.cond_false_0
            cond = torch.ops.higher_order.cond(l_pred2_, cond_true_0, cond_false_0, [l_x_, l_y_]); l_pred2_ = cond_true_0 = cond_false_0 = l_x_ = l_y_ = None
            return cond

        class GraphModule(torch.nn.Module):
            def forward(self, l_x_, l_y_):
                sin = l_x_.sin(); l_x_ = None
                cos = l_y_.cos(); l_y_ = None
                sub = sin - cos; sin = cos = None
                return sub

        class GraphModule(torch.nn.Module):
            def forward(self, l_x_, l_y_):
                cos = l_x_.cos(); l_x_ = None
                sin = l_y_.sin(); l_y_ = None
                sub = cos - sin; cos = sin = None
                return sub
```
the source_fn for the inner cond, sin, cos, and sub nodes will be a (name, target) tuple:
```
('cond', <torch._ops.HigherOrderOperator object at xxx>)
('sin', 'sin')
('cos', 'cos')
('sub', <built-in function sub>)
```
After this PR, the source_fn_stack will be a list of (name, target) tuples; the bottom of the stack is at the end of the list. (A small reading sketch follows the listing below.)
```
[('cond', <torch._ops.HigherOrderOperator object at xxx>), ('cond', <torch._ops.HigherOrderOperator object at xxx>)],
[('cond', <torch._ops.HigherOrderOperator object at xxx>), ('cond', <torch._ops.HigherOrderOperator object at xxx>), ('sin', 'sin')],
[('cond', <torch._ops.HigherOrderOperator object at xxx>), ('cond', <torch._ops.HigherOrderOperator object at xxx>), ('cos', 'cos')]
[('cond', <torch._ops.HigherOrderOperator object at xxx>), ('cond', <torch._ops.HigherOrderOperator object at xxx>), ('sub', <built-in function sub>)]
```
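Continuing the example above (a hypothetical usage sketch, assuming the usual test imports such as `from torch._dynamo.testing import EagerAndRecordGraphs` and `import functorch.experimental.control_flow as control_flow`), the stacked metadata can be inspected roughly like this:
```python
# Run the compiled function, then read the metadata recorded on the captured graphs' nodes.
cond_f(torch.tensor(True), torch.tensor(False), torch.ones(3), torch.ones(3))
for gm in backend.graphs:
    for node in gm.graph.nodes:
        print(node.name, node.meta.get("source_fn_stack"))
```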
Test Plan:
See the added tests in test_higher_order_ops.py and the modifications to an existing test.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108595
Approved by: https://github.com/angelayi, https://github.com/zou3519
Recently we updated the `export` API to take an experimental `dynamic_shapes` argument that was meant to subsume the existing `constraints` argument.
This PR deprecates `constraints` (with a warning on its use, but without actually removing it). Simultaneously it replaces all uses of `constraints` in docs, examples, and tests with corresponding uses of `dynamic_shapes` (preserving behavior). This exercise fortunately revealed some minor bugs in the implementation which have also been fixed in this PR.
Some uses of `constraints` still remain, e.g., when `torch._dynamo.export` is called directly. (Meta-internal uses will be updated in a separate diff.)
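For reference, a minimal sketch of the `dynamic_shapes`-style spec (the module, shapes, and dimension name here are made up for illustration):
```python
import torch
from torch.export import Dim, export

class M(torch.nn.Module):
    def forward(self, x):
        return x * 2

# Mark dim 0 of `x` as dynamic instead of passing the old `constraints` list.
batch = Dim("batch")
ep = export(M(), (torch.randn(4, 3),), dynamic_shapes={"x": {0: batch}})
print(ep)
```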
Differential Revision: D49676049
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110143
Approved by: https://github.com/tugsbayasgalan
Fix: #107315
This PR enables dynamo to trace through the `pytree` API by inlining its functions (a usage sketch follows the summary below). In order to do so, a few details of `pytree` had to be changed.
In summary, this PR:
- Introduces `TreeSpecVariable` for representing `TreeSpec` instances
- Specializes `<type>.__bases__` call, returning a `TupleVariable`
- Enables the call to `id` builtin function for every variable that implements
`as_python_constant` method
- Specializes `ConstantVariable.call_method` for its (un)flatten functions
- Implements `UserDefinedObjectVariable.as_python_constant`
- Modifies `pytree` by:
  - making `SUPPORTED_NODES` a map of ids (instead of types) to `NodeDef`
  - removing the `functools.wraps` wrapper, since it can't be inlined
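A small usage sketch of what this enables (an assumed example, not from the PR's test suite; the dict/tuple containers used here are pytree-registered by default):
```python
import torch
import torch.utils._pytree as pytree

@torch.compile(fullgraph=True)
def f(inputs):
    # tree_flatten / tree_unflatten are inlined by dynamo rather than breaking the graph
    leaves, spec = pytree.tree_flatten(inputs)
    return pytree.tree_unflatten([leaf + 1 for leaf in leaves], spec)

print(f({"a": torch.ones(2), "b": (torch.zeros(2), torch.ones(2))}))
```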
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108533
Approved by: https://github.com/ezyang, https://github.com/voznesenskym
ghstack dependencies: #109201
Requested from @tugsbayasgalan: we want dynamo to preserve some FX node metadata when we trace `GraphModule`s (`nn_module_stack`, `source_fn`, `stack_trace`). This is helpful for the case when we export an aten-level `GraphModule`, add some (possibly non-torch or non-aten) ops, and we want to transform the graph back into an aten-level graph. Without preserving metadata, future passes that look at metadata (e.g. quantization passes) won't work.
This feature also has the additional benefit of being able to preserve origin line of code when `print_readable`'ing a `GraphModule`. This is helpful when debugging graphs that have passed through dynamo several times.
The added unit test demonstrates the added functionality of this PR.
~This PR is currently a proof-of-concept implementation that shows that preserving node metadata across dynamo is possible.~ This PR preserves node metadata across dynamo by doing the following:
- ~inject a counter variable into the `GraphModule` source code, which is incremented every time a node is run~
- Construct a line number -> node index map in `GraphModule` as the source code is being generated.
- pass a list of node metadata and the line number map to dynamo's bytecode analyzer
- ~dynamo traces the counter as a `ConstantVariable`, so when we create a new proxy, we can determine which original node index this proxy corresponds by looking at the value of the traced counter~
- When we create a new proxy, get the current instruction's line number, and get the node index using the line number map
- index into the original node metadata ~using the counter variable's tracked value.~
~Some things that should be addressed off the top of my head:~
- ~Is this feature even desirable? (Do we really want Dynamo to have special behavior for `GraphModules`? Should we expect users to re-export `GraphModules`?)~
- ~Is there a better approach than to use a counter? We considered using node names, line numbers, and assuming that proxies are created in the same order as the nodes, but each of these 3 have shortcomings. For node names, we only have access to new node names, not the old ones. Using line number is fragile. The third is problematic since not all created nodes go through `create_proxy` (e.g. inputs). We currently generate a line number to node index map when the `GraphModule`'s code is generated.~
- ~What's the best way to send data across the "CPython gap"? That is, it is not obvious how to cleanly pass data from dynamo's `eval_frame.py:_TorchDynamoContext.__call__` to `symbolic_convert.py:InstructionTranslatorBase.__init__`. In this PR, we use a global.~
Differential Revision: [D49257108](https://our.internmc.facebook.com/intern/diff/D49257108)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107067
Approved by: https://github.com/jansel
We could have SymBool inputs for torch.compile, e.g. in the following situation:
```
def f(x: torch.Tensor):
    pred = x.size(0) == 3

torch.compile(f)(pred, x)
make_fx(f, tracing_mode="symbolic")(x)
```
The idea of this PR (credit to @ezyang) is to support SymBool by re-using the infra we've already had for SymInt so that we don't need to replicate a lot of stuff.
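For a concrete (hypothetical) flavor of the scenario: the comparison below is a plain `bool` in eager mode but becomes a `SymBool` once the size it derives from is symbolic, which is the kind of value this PR lets flow through the existing SymInt machinery.
```python
import torch

@torch.compile(dynamic=True)
def f(x):
    pred = x.size(0) == 3  # SymBool under symbolic shapes, plain bool eagerly
    return x + 1 if pred else x - 1

print(f(torch.randn(3)))
print(f(torch.randn(5)))
```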
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107850
Approved by: https://github.com/ezyang
ghstack dependencies: #107662
## Exception in cond:
For code below:
```python
import torch
import functorch.experimental.control_flow as control_flow
def true_fn(x):
    return x.sin()

def false_fn(x):
    return x, x

def f(x, y):
    return control_flow.cond(y, true_fn, false_fn, [x])

f(torch.ones(3, 4), torch.tensor(False))
```
The original exception stack trace is:
```python
Traceback (most recent call last):
File "/home/yidi/local/pytorch/test_exc.py", line 33, in <module>
f(torch.ones(3, 4), torch.tensor(False))
File "/home/yidi/local/pytorch/test_exc.py", line 31, in f
return control_flow.cond(y, true_fn, false_fn, [x])
File "/home/yidi/local/pytorch/torch/_higher_order_ops/cond.py", line 154, in cond
return torch.compile(cond_op, backend="eager", fullgraph=True)(
File "/home/yidi/local/pytorch/torch/_dynamo/eval_frame.py", line 365, in _fn
return fn(*args, **kwargs)
File "/home/yidi/local/pytorch/torch/_dynamo/eval_frame.py", line 513, in catch_errors
return callback(frame, cache_entry, hooks, frame_state)
File "/home/yidi/local/pytorch/torch/_dynamo/convert_frame.py", line 140, in _fn
return fn(*args, **kwargs)
File "/home/yidi/local/pytorch/torch/_dynamo/convert_frame.py", line 380, in _convert_frame_assert
return _compile(
File "/home/yidi/local/pytorch/torch/_dynamo/convert_frame.py", line 560, in _compile
guarded_code = compile_inner(code, one_graph, hooks, transform)
File "/home/yidi/local/pytorch/torch/_dynamo/utils.py", line 197, in time_wrapper
r = func(*args, **kwargs)
File "/home/yidi/local/pytorch/torch/_dynamo/convert_frame.py", line 482, in compile_inner
out_code = transform_code_object(code, transform)
File "/home/yidi/local/pytorch/torch/_dynamo/bytecode_transformation.py", line 1028, in transform_code_object
transformations(instructions, code_options)
File "/home/yidi/local/pytorch/torch/_dynamo/convert_frame.py", line 449, in transform
tracer.run()
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 2083, in run
super().run()
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 733, in run
and self.step()
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 696, in step
getattr(self, inst.opname)(inst)
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 397, in wrapper
return inner_fn(self, inst)
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 1164, in CALL_FUNCTION_EX
self.call_function(fn, argsvars.items, kwargsvars.items)
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 570, in call_function
self.push(fn.call_function(self, args, kwargs))
File "/home/yidi/local/pytorch/torch/_dynamo/variables/higher_order_ops.py", line 418, in call_function
(false_r, false_graph, false_lifted_freevars) = speculate_branch(False)
File "/home/yidi/local/pytorch/torch/_dynamo/variables/higher_order_ops.py", line 410, in speculate_branch
raise UncapturedHigherOrderOpError(
torch._dynamo.exc.UncapturedHigherOrderOpError: Expected branch to return a single tensor
from user code:
File "/home/yidi/local/pytorch/torch/_dynamo/external_utils.py", line 17, in inner
return fn(*args, **kwargs)
Set TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information
You can suppress this exception and fall back to eager by setting:
import torch._dynamo
torch._dynamo.config.suppress_errors = True
```
After this PR we get:
```python
Traceback (most recent call last):
File "/home/yidi/local/pytorch/torch/_dynamo/variables/higher_order_ops.py", line 50, in graph_break_as_hard_error
return fn(*args, **kwargs)
File "/home/yidi/local/pytorch/torch/_dynamo/variables/higher_order_ops.py", line 429, in call_function
(false_r, false_graph, false_lifted_freevars) = speculate_branch(False)
File "/home/yidi/local/pytorch/torch/_dynamo/variables/higher_order_ops.py", line 421, in speculate_branch
unimplemented(
File "/home/yidi/local/pytorch/torch/_dynamo/exc.py", line 187, in unimplemented
raise Unsupported(msg)
torch._dynamo.exc.Unsupported: Expected branch to return a single tensor
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/yidi/local/pytorch/test_exc.py", line 33, in <module>
f(torch.ones(3, 4), torch.tensor(False))
File "/home/yidi/local/pytorch/test_exc.py", line 31, in f
return control_flow.cond(y, true_fn, false_fn, [x])
File "/home/yidi/local/pytorch/torch/_higher_order_ops/cond.py", line 154, in cond
return torch.compile(cond_op, backend="eager", fullgraph=True)(
File "/home/yidi/local/pytorch/torch/_dynamo/eval_frame.py", line 338, in _fn
return fn(*args, **kwargs)
File "/home/yidi/local/pytorch/torch/_dynamo/eval_frame.py", line 500, in catch_errors
return callback(frame, cache_entry, hooks, frame_state)
File "/home/yidi/local/pytorch/torch/_dynamo/convert_frame.py", line 140, in _fn
return fn(*args, **kwargs)
File "/home/yidi/local/pytorch/torch/_dynamo/convert_frame.py", line 382, in _convert_frame_assert
return _compile(
File "/home/yidi/local/pytorch/torch/_dynamo/convert_frame.py", line 562, in _compile
guarded_code = compile_inner(code, one_graph, hooks, transform)
File "/home/yidi/local/pytorch/torch/_dynamo/utils.py", line 189, in time_wrapper
r = func(*args, **kwargs)
File "/home/yidi/local/pytorch/torch/_dynamo/convert_frame.py", line 484, in compile_inner
out_code = transform_code_object(code, transform)
File "/home/yidi/local/pytorch/torch/_dynamo/bytecode_transformation.py", line 1028, in transform_code_object
transformations(instructions, code_options)
File "/home/yidi/local/pytorch/torch/_dynamo/convert_frame.py", line 451, in transform
tracer.run()
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 2088, in run
super().run()
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 728, in run
and self.step()
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 691, in step
getattr(self, inst.opname)(inst)
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 392, in wrapper
return inner_fn(self, inst)
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 1159, in CALL_FUNCTION_EX
self.call_function(fn, argsvars.items, kwargsvars.items)
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 565, in call_function
self.push(fn.call_function(self, args, kwargs))
File "/home/yidi/local/pytorch/torch/_dynamo/variables/higher_order_ops.py", line 53, in graph_break_as_hard_error
raise UncapturedHigherOrderOpError(reason + msg) from e
torch._dynamo.exc.UncapturedHigherOrderOpError: Cond doesn't work unless it is captured completely with torch.compile. Scroll up to find out what causes the graph break.
from user code:
File "/home/yidi/local/pytorch/torch/_dynamo/external_utils.py", line 17, in inner
return fn(*args, **kwargs)
Set TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information
You can suppress this exception and fall back to eager by setting:
import torch._dynamo
torch._dynamo.config.suppress_errors = True
```
## Exception during speculating branches
The example code below has an in-place buffer mutation error:
```python
import torch
import functorch.experimental.control_flow as control_flow
class Foo(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.register_buffer("buffer", torch.ones(6, 4))

    def forward(self, x):
        def true_fn(x):
            self.buffer += 1
            return self.buffer.sum() + x.sum()

        def false_fn(x):
            return (x - 1).sum()

        return control_flow.cond(x.shape[0] > 4, true_fn, false_fn, [x])

mod_for_compile = torch.compile(Foo(), backend="eager", dynamic=True)
mod_for_compile(torch.ones(3, 4))
```
Before this PR the exception looks like:
```python
[2023-09-08 15:20:03,332] [0/0] torch._dynamo.variables.higher_order_ops: [WARNING] speculate_subgraph: while introspecting cond, we were unable to trace function `true_fn` into a single graph. This means that Dynamo was unable to prove safety for this API and will fall back to eager-mode PyTorch, which could lead to a slowdown.
[2023-09-08 15:20:03,332] [0/0] torch._dynamo.variables.higher_order_ops: [ERROR] Can't inplace modify module params/buffers inside HigherOrderOp
Traceback (most recent call last):
File "/home/yidi/local/pytorch/torch/_dynamo/variables/higher_order_ops.py", line 163, in speculate_subgraph
output = f.call_function(tx, args, sub_kwargs)
File "/home/yidi/local/pytorch/torch/_dynamo/variables/functions.py", line 90, in call_function
return tx.inline_user_function_return(
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 606, in inline_user_function_return
result = InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 2200, in inline_call
return cls.inline_call_(parent, func, args, kwargs)
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 2316, in inline_call_
tracer.run()
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 733, in run
and self.step()
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 696, in step
getattr(self, inst.opname)(inst)
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 1219, in STORE_ATTR
.call_function(self, [obj, ConstantVariable(inst.argval), val], {})
File "/home/yidi/local/pytorch/torch/_dynamo/variables/builtin.py", line 618, in call_function
result = handler(tx, *args, **kwargs)
File "/home/yidi/local/pytorch/torch/_dynamo/variables/builtin.py", line 1169, in call_setattr
raise AttributeMutationError(
torch._dynamo.exc.AttributeMutationError: Can't inplace modify module params/buffers inside HigherOrderOp
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/yidi/local/pytorch/torch/_dynamo/variables/higher_order_ops.py", line 394, in speculate_branch
ret_val, ret_graph, ret_lifted_freevars = speculate_subgraph(
File "/home/yidi/local/pytorch/torch/_dynamo/variables/higher_order_ops.py", line 222, in speculate_subgraph
raise Unsupported(
torch._dynamo.exc.Unsupported: speculate_subgraph: while introspecting cond, we were unable to trace function `true_fn` into a single graph. This means that Dynamo was unable to prove safety for this API and will fall back to eager-mode PyTorch, which could lead to a slowdown. Scroll up for the stack trace of the initial exception. The reason was: Can't inplace modify module params/buffers inside HigherOrderOp
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/yidi/local/pytorch/test_exc.py", line 20, in <module>
mod_for_compile(torch.ones(3, 4))
File "/home/yidi/local/pytorch/torch/nn/modules/module.py", line 1519, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/yidi/local/pytorch/torch/nn/modules/module.py", line 1528, in _call_impl
return forward_call(*args, **kwargs)
File "/home/yidi/local/pytorch/torch/_dynamo/eval_frame.py", line 365, in _fn
return fn(*args, **kwargs)
File "/home/yidi/local/pytorch/torch/nn/modules/module.py", line 1519, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/yidi/local/pytorch/torch/nn/modules/module.py", line 1528, in _call_impl
return forward_call(*args, **kwargs)
File "/home/yidi/local/pytorch/torch/_dynamo/eval_frame.py", line 513, in catch_errors
return callback(frame, cache_entry, hooks, frame_state)
File "/home/yidi/local/pytorch/torch/_dynamo/convert_frame.py", line 632, in _convert_frame
result = inner_convert(frame, cache_entry, hooks, frame_state)
File "/home/yidi/local/pytorch/torch/_dynamo/convert_frame.py", line 140, in _fn
return fn(*args, **kwargs)
File "/home/yidi/local/pytorch/torch/_dynamo/convert_frame.py", line 380, in _convert_frame_assert
return _compile(
File "/home/yidi/local/pytorch/torch/_dynamo/convert_frame.py", line 560, in _compile
guarded_code = compile_inner(code, one_graph, hooks, transform)
File "/home/yidi/local/pytorch/torch/_dynamo/utils.py", line 197, in time_wrapper
r = func(*args, **kwargs)
File "/home/yidi/local/pytorch/torch/_dynamo/convert_frame.py", line 482, in compile_inner
out_code = transform_code_object(code, transform)
File "/home/yidi/local/pytorch/torch/_dynamo/bytecode_transformation.py", line 1028, in transform_code_object
transformations(instructions, code_options)
File "/home/yidi/local/pytorch/torch/_dynamo/convert_frame.py", line 449, in transform
tracer.run()
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 2083, in run
super().run()
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 733, in run
and self.step()
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 696, in step
getattr(self, inst.opname)(inst)
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 397, in wrapper
return inner_fn(self, inst)
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 1124, in CALL_FUNCTION
self.call_function(fn, args, {})
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 570, in call_function
self.push(fn.call_function(self, args, kwargs))
File "/home/yidi/local/pytorch/torch/_dynamo/variables/functions.py", line 261, in call_function
return super().call_function(tx, args, kwargs)
File "/home/yidi/local/pytorch/torch/_dynamo/variables/functions.py", line 90, in call_function
return tx.inline_user_function_return(
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 606, in inline_user_function_return
result = InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 2200, in inline_call
return cls.inline_call_(parent, func, args, kwargs)
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 2316, in inline_call_
tracer.run()
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 733, in run
and self.step()
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 696, in step
getattr(self, inst.opname)(inst)
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 397, in wrapper
return inner_fn(self, inst)
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 1124, in CALL_FUNCTION
self.call_function(fn, args, {})
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 570, in call_function
self.push(fn.call_function(self, args, kwargs))
File "/home/yidi/local/pytorch/torch/_dynamo/variables/higher_order_ops.py", line 415, in call_function
(true_r, true_graph, true_lifted_freevars) = speculate_branch(True)
File "/home/yidi/local/pytorch/torch/_dynamo/variables/higher_order_ops.py", line 405, in speculate_branch
raise UncapturedHigherOrderOpError(
torch._dynamo.exc.UncapturedHigherOrderOpError: Cond doesn't work unless it is captured completely with torch.compile
from user code:
File "/home/yidi/local/pytorch/test_exc.py", line 16, in forward
return control_flow.cond(x.shape[0] > 4, true_fn, false_fn, [x])
File "/home/yidi/local/pytorch/torch/_higher_order_ops/cond.py", line 127, in cond
return cond_op(pred, true_fn, false_fn, operands)
Set TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information
You can suppress this exception and fall back to eager by setting:
import torch._dynamo
torch._dynamo.config.suppress_errors = True
```
After this PR, the only difference is that the error message of `UncapturedHigherOrderOpError` changes from `Cond doesn't work unless it is captured completely with torch.compile` to `Cond doesn't work unless it is captured completely with torch.compile. Scroll up to find out what causes the graph break`.
```python
[2023-09-08 15:17:02,052] [0/0] torch._dynamo.variables.higher_order_ops: [WARNING] speculate_subgraph: while introspecting cond, we were unable to trace function `true_fn` into a single graph. This means that Dynamo was unable to prove safety for this API and will fall back to eager-mode PyTorch, which could lead to a slowdown.
[2023-09-08 15:17:02,052] [0/0] torch._dynamo.variables.higher_order_ops: [ERROR] Can't inplace modify module params/buffers inside HigherOrderOp
Traceback (most recent call last):
File "/home/yidi/local/pytorch/torch/_dynamo/variables/higher_order_ops.py", line 177, in speculate_subgraph
output = f.call_function(tx, args, sub_kwargs)
File "/home/yidi/local/pytorch/torch/_dynamo/variables/functions.py", line 90, in call_function
return tx.inline_user_function_return(
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 601, in inline_user_function_return
result = InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 2193, in inline_call
return cls.inline_call_(parent, func, args, kwargs)
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 2300, in inline_call_
tracer.run()
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 728, in run
and self.step()
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 691, in step
getattr(self, inst.opname)(inst)
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 1214, in STORE_ATTR
.call_function(self, [obj, ConstantVariable(inst.argval), val], {})
File "/home/yidi/local/pytorch/torch/_dynamo/variables/builtin.py", line 618, in call_function
result = handler(tx, *args, **kwargs)
File "/home/yidi/local/pytorch/torch/_dynamo/variables/builtin.py", line 1169, in call_setattr
raise AttributeMutationError(
torch._dynamo.exc.AttributeMutationError: Can't inplace modify module params/buffers inside HigherOrderOp
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/yidi/local/pytorch/torch/_dynamo/variables/higher_order_ops.py", line 50, in graph_break_as_hard_error
return fn(*args, **kwargs)
File "/home/yidi/local/pytorch/torch/_dynamo/variables/higher_order_ops.py", line 426, in call_function
(true_r, true_graph, true_lifted_freevars) = speculate_branch(True)
File "/home/yidi/local/pytorch/torch/_dynamo/variables/higher_order_ops.py", line 410, in speculate_branch
ret_val, ret_graph, ret_lifted_freevars = speculate_subgraph(
File "/home/yidi/local/pytorch/torch/_dynamo/variables/higher_order_ops.py", line 236, in speculate_subgraph
raise Unsupported(
torch._dynamo.exc.Unsupported: speculate_subgraph: while introspecting cond, we were unable to trace function `true_fn` into a single graph. This means that Dynamo was unable to prove safety for this API and will fall back to eager-mode PyTorch, which could lead to a slowdown. Scroll up for the stack trace of the initial exception. The reason was: Can't inplace modify module params/buffers inside HigherOrderOp
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/yidi/local/pytorch/test_exc.py", line 20, in <module>
mod_for_compile(torch.ones(3, 4))
File "/home/yidi/local/pytorch/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/yidi/local/pytorch/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/home/yidi/local/pytorch/torch/_dynamo/eval_frame.py", line 338, in _fn
return fn(*args, **kwargs)
File "/home/yidi/local/pytorch/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/yidi/local/pytorch/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/home/yidi/local/pytorch/torch/_dynamo/eval_frame.py", line 500, in catch_errors
return callback(frame, cache_entry, hooks, frame_state)
File "/home/yidi/local/pytorch/torch/_dynamo/convert_frame.py", line 634, in _convert_frame
result = inner_convert(frame, cache_entry, hooks, frame_state)
File "/home/yidi/local/pytorch/torch/_dynamo/convert_frame.py", line 140, in _fn
return fn(*args, **kwargs)
File "/home/yidi/local/pytorch/torch/_dynamo/convert_frame.py", line 382, in _convert_frame_assert
return _compile(
File "/home/yidi/local/pytorch/torch/_dynamo/convert_frame.py", line 562, in _compile
guarded_code = compile_inner(code, one_graph, hooks, transform)
File "/home/yidi/local/pytorch/torch/_dynamo/utils.py", line 189, in time_wrapper
r = func(*args, **kwargs)
File "/home/yidi/local/pytorch/torch/_dynamo/convert_frame.py", line 484, in compile_inner
out_code = transform_code_object(code, transform)
File "/home/yidi/local/pytorch/torch/_dynamo/bytecode_transformation.py", line 1028, in transform_code_object
transformations(instructions, code_options)
File "/home/yidi/local/pytorch/torch/_dynamo/convert_frame.py", line 451, in transform
tracer.run()
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 2088, in run
super().run()
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 728, in run
and self.step()
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 691, in step
getattr(self, inst.opname)(inst)
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 392, in wrapper
return inner_fn(self, inst)
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 1119, in CALL_FUNCTION
self.call_function(fn, args, {})
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 565, in call_function
self.push(fn.call_function(self, args, kwargs))
File "/home/yidi/local/pytorch/torch/_dynamo/variables/functions.py", line 261, in call_function
return super().call_function(tx, args, kwargs)
File "/home/yidi/local/pytorch/torch/_dynamo/variables/functions.py", line 90, in call_function
return tx.inline_user_function_return(
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 601, in inline_user_function_return
result = InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 2193, in inline_call
return cls.inline_call_(parent, func, args, kwargs)
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 2300, in inline_call_
tracer.run()
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 728, in run
and self.step()
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 691, in step
getattr(self, inst.opname)(inst)
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 392, in wrapper
return inner_fn(self, inst)
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 1119, in CALL_FUNCTION
self.call_function(fn, args, {})
File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 565, in call_function
self.push(fn.call_function(self, args, kwargs))
File "/home/yidi/local/pytorch/torch/_dynamo/variables/higher_order_ops.py", line 53, in graph_break_as_hard_error
raise UncapturedHigherOrderOpError(reason + msg) from e
torch._dynamo.exc.UncapturedHigherOrderOpError: Cond doesn't work unless it is captured completely with torch.compile. Scroll up to find out what causes the graph break.
from user code:
File "/home/yidi/local/pytorch/test_exc.py", line 16, in forward
return control_flow.cond(x.shape[0] > 4, true_fn, false_fn, [x])
File "/home/yidi/local/pytorch/torch/_higher_order_ops/cond.py", line 127, in cond
return cond_op(pred, true_fn, false_fn, operands)
Set TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information
You can suppress this exception and fall back to eager by setting:
import torch._dynamo
torch._dynamo.config.suppress_errors = True
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108817
Approved by: https://github.com/zou3519
Sometimes one might want to impose equalities that are not required by guards, e.g. say that you only want square images when rectangular images would suffice.
Curiously we never checked that the concrete values passed in example shapes actually satisfy such equality constraints. So, e.g., you could multiply two tensors of shapes MxK and KxN, specify that M and N must be equal, and then pass examples where they are not equal.
Relatedly, the symbolic shape dimensions for inputs in the exported graph were not forced to be equal.
However, runtime assertions still fire because they take into account all equality constraints. This would result in the strange situation where export would succeed but the exported program with the same example inputs would fail.
This PR fixes these issues.
Differential Revision: [D48910918](https://our.internmc.facebook.com/intern/diff/D48910918/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108429
Approved by: https://github.com/zhxchen17
Before this PR, we used get_fake_value to get the fake_sub_args and then called op(*fake_sub_args) to get the example value for out_dtype.
This causes problems when the input proxy's op type is `get_attr`: get_fake_value for a `get_attr` node will actually look at the original param/buffer and **return a real tensor** instead of a fake tensor. This is OK for export, since export's fake_mode allows non_fake_inputs (see [here](https://github.com/pytorch/pytorch/blob/main/torch/_dynamo/output_graph.py#L278)), but it causes problems when nesting cond with out_dtype, where cond uses torch.compile(fullgraph=True) to inspect out_dtype and finds that the inputs to op are a mix of FakeTensors and real tensors.
This PR changes how we get the example values from proxies by directly looking at node.meta["example_value"]. This metadata is guaranteed to exist for all proxies during dynamo tracing, so it's safe to use (it's also used by get_fake_value to get fake tensors from args for general ops; see [here](https://github.com/pytorch/pytorch/blob/main/torch/_dynamo/utils.py#L1318)).
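As a rough illustration of the idea (a standalone sketch using plain FX, not the actual out_dtype/Dynamo code): stash an example value in a node's `meta` and read it back later, instead of re-materializing values for the node.
```python
import torch
import torch.fx
from torch._subclasses.fake_tensor import FakeTensorMode

def f(x):
    return x * 2

gm = torch.fx.symbolic_trace(f)
fake_mode = FakeTensorMode()

# Record a fake example value on each placeholder, roughly what dynamo stores
# under node.meta["example_value"] during tracing.
for node in gm.graph.nodes:
    if node.op == "placeholder":
        node.meta["example_value"] = fake_mode.from_tensor(torch.randn(2, 2))

# Later consumers read the recorded metadata instead of re-running the node.
for node in gm.graph.nodes:
    if "example_value" in node.meta:
        print(node.name, node.meta["example_value"].shape)
```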
Test Plan:
existing tests + remove expected failure for a test.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108715
Approved by: https://github.com/zou3519
Summary: Currently the node metadata "nn_module_stack" is only being used by export. For some exported models, we still want to retain nn_module_stack for unspecialized modules for various purposes. This diff adds a path to also record nn_module_stack when an unspecialized module has a source available.
Test Plan: test_export_nn_module_stack_patched_module
Differential Revision: D48841193
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108281
Approved by: https://github.com/yanboliang, https://github.com/tugsbayasgalan
We want cond to always throw errors regardless of the user's torch.compile mode.
The current implementation is to
1. catch the UserError.GRAPH_BREAK_IN_CONTROL_FLOW and, once we see it, raise directly: once in [break_graph_if_unsupported](bad3f2db40/torch/_dynamo/symbolic_convert.py (L1250)), which catches and raises for call_function (the entry point of higher-order operators) and a few others.
2. The raised exception is caught and raised again in [step](bad3f2db40/torch/_dynamo/symbolic_convert.py (L691)), where all instructions' exceptions are handled.
3. At the top level, we treat it like a hard error and do not suppress it.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108027
Approved by: https://github.com/zou3519
ghstack dependencies: #108025, #108026
Previously, when we found some mismatch between the original args / traced results and the graph-captured inputs / outputs, we emitted a pretty sparse error message. (This might be partly due to the urge to reuse the same code for matching both inputs and outputs.)
With this PR we now point out which input or output is problematic, what its type is, and also present the expected types along with descriptions of what they mean. We don't suggest any fixes, but the idea is that it should be evident what went wrong looking at the error message.
Differential Revision: [D48668059](https://our.internmc.facebook.com/intern/diff/D48668059/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107907
Approved by: https://github.com/gmagogsfm
Currently there are 4 cases where constraint violation errors are raised, but the error messages are (a) inconsistent in their information content and (b) worded in ways that are difficult for the end user to understand.
This diff cuts one of the cases that can never be reached, and makes the other 3
(a) consistent, e.g. they all point out that some values in the given range may not work, citing a reason and asking the user to run with logs to follow up
(b) user-friendly, e.g., compiler-internal info is cut out or replaced with user-facing syntax.
Differential Revision: D48576608
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107790
Approved by: https://github.com/tugsbayasgalan, https://github.com/angelayi
Dynamo currently runs the real graph module with real inputs as a way to match the return result of the graph module with the eager return type. This is unsafe when the graph module is side-effectful. In the long term, we will get rid of this step; in the short term, we just fakify the graph module again and run it.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107271
Approved by: https://github.com/ezyang
All log messages that occur while running Dynamo compilation now have `[X/Y]` added to the beginning of their message. X represents the frame being compiled, while Y says which compilation of the frame. For example, if you are debugging a frame that is repeatedly recompiling, you can look for N/0, N/1, N/2, etc. for the same N. Here is what the logs look like as you transition from one frame to another:
<img width="1372" alt="image" src="https://github.com/pytorch/pytorch/assets/13564/4897e368-1e50-4807-b342-54e911bcf087">
To accurately get this prefix added to all messages, I had to expand the scope of the `tracing` context manager. Its scope now coincides with `log_compilation_event`. To do this, I had to populate fake mode lazily in the TracingContext, since it isn't created until later, inside the OutputGraph.
This subsumes the previous X.Y logging that was solely for dynamic shapes.
Unfortunately I had to reindent some stuff. Review the diff with whitespace off.
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107530
Approved by: https://github.com/anijain2305
ghstack dependencies: #107505, #107516
Some notable changes:
1. `constrain_as_size` allows the min value to be less than 2, as it will unconditionally assume min >= 2 for compiler purposes. Instead, we add an additional check to make sure the max value is always greater than 2.
2. Previously, we used to runtime-assert on the unbacked symint's value range, which would always be between [2, max]. I modified this logic to assert on [0, max] unless the user explicitly specifies the min range.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106591
Approved by: https://github.com/gmagogsfm, https://github.com/ezyang
Previously, you would get an error like
```
Dynamo input and output is a strict subset of traced input/output
```
now you get
```
Cannot export model which references tensors that are neither
buffers/parameters/constants nor are direct inputs. For each tensor, if you'd
like this tensor to be an explicit input, add it as a dummy argument
to the top-level model definition you are exporting; if you would
like its value to be embedded as an exported constant, wrap its access
in a function marked with @assume_constant_result.
G['bulbous_bouffant'], accessed at:
File "test_export.py", line N, in f
return bulbous_bouffant + y
```
This doesn't handle outputs; I'm going to hit that next.
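A sketch of the second remedy the message suggests (hypothetical names; `torch._dynamo.assume_constant_result` is the decorator referred to as `@assume_constant_result`):
```python
import torch
import torch._dynamo

bulbous_bouffant = torch.randn(3)  # a global tensor the model would otherwise reference directly

@torch._dynamo.assume_constant_result
def get_bulbous_bouffant():
    return bulbous_bouffant  # its value gets embedded as a constant

def f(y):
    return get_bulbous_bouffant() + y

print(torch.compile(f)(torch.randn(3)))
```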
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106403
Approved by: https://github.com/tugsbayasgalan
Fixes https://github.com/pytorch/pytorch/issues/103210
Test Plan:
Before the fix:
```
pytest test/dynamo/test_export.py -k suppress_errors
```
got result:
```
File "/data/users/zhxchen17/pytorch/torch/nn/modules/module.py", line 1502, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/users/zhxchen17/pytorch/torch/nn/modules/module.py", line 1511, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/users/zhxchen17/pytorch/torch/_dynamo/eval_frame.py", line 295, in _fn
return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "/data/users/zhxchen17/pytorch/torch/nn/modules/module.py", line 1502, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/users/zhxchen17/pytorch/torch/nn/modules/module.py", line 1511, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/users/zhxchen17/pytorch/torch/_dynamo/eval_frame.py", line 448, in catch_errors
return callback(frame, cache_size, hooks, frame_state)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/users/zhxchen17/pytorch/torch/_dynamo/convert_frame.py", line 127, in _fn
return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "/data/users/zhxchen17/pytorch/torch/_dynamo/convert_frame.py", line 360, in _convert_frame_assert
return _compile(
^^^^^^^^^
File "/data/users/zhxchen17/pytorch/torch/_dynamo/utils.py", line 180, in time_wrapper
r = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/data/users/zhxchen17/pytorch/torch/_dynamo/convert_frame.py", line 511, in _compile
exception_handler(e, code, frame)
File "/data/users/zhxchen17/pytorch/torch/_dynamo/convert_frame.py", line 216, in exception_handler
log.error(format_error_msg(e, code, record_filename, frame))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/users/zhxchen17/pytorch/torch/_dynamo/exc.py", line 248, in format_error_msg
stack_above_dynamo = filter_stack(extract_stack(frame))
^^^^^^^^^^^^^^^^^^^^
File "/home/zhxchen17/miniconda3/envs/dev/lib/python3.11/traceback.py", line 231, in extract_stack
stack = StackSummary.extract(walk_stack(f), limit=limit)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zhxchen17/miniconda3/envs/dev/lib/python3.11/traceback.py", line 393, in extract
return klass._extract_from_extended_frame_gen(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zhxchen17/miniconda3/envs/dev/lib/python3.11/traceback.py", line 416, in _extract_from_extended_frame_gen
for f, (lineno, end_lineno, colno, end_colno) in frame_gen:
File "/home/zhxchen17/miniconda3/envs/dev/lib/python3.11/traceback.py", line 390, in extended_frame_gen
for f, lineno in frame_gen:
File "/home/zhxchen17/miniconda3/envs/dev/lib/python3.11/traceback.py", line 334, in walk_stack
yield f, f.f_lineno
^^^^^^^^^^
AttributeError: 'torch._C.dynamo.eval_frame._PyInterpreterFrame' object has no attribute 'f_lineno'
```
After the fix:
```
pytest test/dynamo/test_export.py -k suppress_errors -s
```
Got Result:
```
File "/data/users/zhxchen17/pytorch/torch/_dynamo/exc.py", line 135, in unimplemented
raise Unsupported(msg)
torch._dynamo.exc.Unsupported: map() operator doesn't support scalar or zero-sized tensors during
tracing.
========== The above exception occurred while processing the following code ==========
File "/data/users/zhxchen17/pytorch/test/dynamo/test_export.py", line 3043, in forward
def forward(self, xs):
File "/data/users/zhxchen17/pytorch/test/dynamo/test_export.py", line 3047, in forward
return map(body, xs)
==========
unimplemented [("map() operator doesn't support scalar or zero-sized tensors during tracing.", 1)]
.
=============================== 1 passed, 133 deselected in 4.60s ================================
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103227
Approved by: https://github.com/williamwen42
Currently, exporting a model to ONNX with fake tensor mode requires the
user to load data and model within `torch.onnx.enable_fake_mode` context,
but the actual call to `torch.onnx.dynamo_export` is done outside such
context.
With this PR, we enable `torch.onnx.dynamo_export` to be called either
within `torch.onnx.enable_fake_mode` or outside of it. This feature
required changes to the core PyTorch Dynamo, which were greatly
supported by @ezyang
In future steps we will determine which scenario we are going to
support, but for now we can use either to explore different options and
scenarios and assess their pros and cons.
This PR also creates a separate suite of tests for fake mode specific
scenarios (`TestFxToOnnxFakeTensorWithOnnxRuntime`).
It was done separately to decrease the test time, but we
could merge it with the default `TestFxToOnnxWithOnnxRuntime`. The
additional parameters are `load_checkpoint_during_init` and
`export_within_fake_mode`
With the newly added supported of nested export within fake mode, the
following scenarios are now supported:
```python
import torch
with torch.onnx.enable_fake_mode() as fake_context:
    fake_args = create_args()
    fake_kwargs = create_kwargs()
    fake_model = create_model()
    fake_model.load_state_dict(torch.load(tmp_checkpoint_file.name))

    export_options = torch.onnx.ExportOptions(fake_context=fake_context)
    # `torch.onnx.dynamo_export` called WITHIN `torch.onnx.enable_fake_mode`
    export_output = torch.onnx.dynamo_export(
        fake_model,
        *fake_args,
        **fake_kwargs,
        export_options=export_options,
    )

export_output.save("/path/to/model.onnx", model_state_dict=create_model())
```
If we decide to only support scenarios in which `torch._dynamo.export` is called within `FakeTensorMode`, then we can remove `fake_mode` argument from `torch._dynamo.export` as a follow-up task
ps: This PR is mostly Edward's https://github.com/pytorch/pytorch/pull/105468 + unit tests after an offline discussion
ps: https://github.com/pytorch/pytorch/issues/105464 tracks pending tasks/limitations from this PR
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105477
Approved by: https://github.com/ezyang, https://github.com/BowenBao
Add semantics for creating a buffer object analogous to creating a parameter. This is done by introducing a new `Buffer` class that can be used for type disambiguation. The underlying functionality of registering a buffer remains the same, as the `register_buffer` method has not been changed. The `persistent` parameter of the `Buffer` type indicates whether a buffer object should be persistent or not. The other non-test changes have to do with getting the new `Buffer` type recognized by inductor and dynamo. The remaining changes are test changes to make sure that the `Buffer` type can be used as a drop-in replacement for `register_buffer`, as it simply leads to `register_buffer` being called. This new functionality still allows normal tensors to be used as buffers, so these changes are intended to be backwards compatible.
Fixes #35735
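A minimal sketch of the new usage (assuming the class is exposed as `torch.nn.Buffer`; the module and shapes here are made up):
```python
import torch
from torch.nn import Buffer  # assumed export location of the new class

class Running(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # Assigning a Buffer registers it, analogous to assigning a Parameter;
        # under the hood this still goes through register_buffer.
        self.mean = Buffer(torch.zeros(4), persistent=False)

    def forward(self, x):
        return x - self.mean

m = Running()
print(list(m.named_buffers()))
```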
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104069
Approved by: https://github.com/mikaylagawarecki
Enables additional inductor UTs on ROCm and un-skips outdated skips.
I have also removed a group of failures in `test_torchinductor_opinfo` which are now passing on CUDA and ROCm:
```
- # The following 3 tests fail on CUDA with AssertionError: expected size 5==5, stride 5==1 at dim=0
- # linalg._svd's return value has different strides on CUDA vs CPU which causes this
- # In test_meta.py there is a mechanism to skipping strides checks for some ops
- # (including _linalg_svd), possibly we should have something similar here
- "linalg.cond": {f32, f64},
- "linalg.svdvals": {f32, f64},
- "linalg.matrix_rank": {f32, f64},
- "linalg.svd": {f32, f64},
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104624
Approved by: https://github.com/malfet
Remove _deprecated_global_ns from cond following #104105.
We change the `__module__` attribute of `HigherOrderOperator` instances in the constructor from `torch.ops` to `torch.ops.higher_order` when `self.namespace` is `"higher_order"`. For subclasses (e.g. customized higher-order operators), we leave their `__module__` unchanged.
Will import this PR to fix internal tests.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104380
Approved by: https://github.com/zhxchen17, https://github.com/zou3519
At a high level, the current implementation of the constraint functions (`constrain_as_*`) will raise an exception for the following code snippet:
```
def f(x):
    a = x.item()
    constrain_as_size(a, 4, 7)
    return torch.empty((a, 4))

inp = torch.tensor([5])
ep = torch._export.export(f, (inp,))
```
The reason is that the current constraint logic has these properties:
1) It is purely Python, so it won't survive AOT export (the full node is gone after AOT export, since AOT export only maintains aten-level ops).
2) It relies on a side effect to add range constraints to the traced symbol's shape env ([code](9591e52880/torch/fx/experimental/symbolic_shapes.py (L370-L372))).
3) If runtime assertions are turned on (the default), [`_AddRuntimeAssertionsForConstraintsPass`](9591e52880/torch/_export/passes/add_runtime_assertions_for_constraints_pass.py (L98-L100)) will try to append assertion nodes based on the range constraints extracted from each symbol's shape env during another interpretation round.
4) However, because of 1), the range-constraint logic won't run for symbols generated during the AOT export round, so no range-constraint information is available for the assertion round, which causes the issue.
5) As a result of the above, it fails at `torch.empty((a, 4))` (there is no constraint saying `a` must be positive).
The fix here is to implement the range-constraint logic as a native aten op (with a no-op CPU implementation) so that it survives AOT export.
**NOTE:**
The [logic](2d745b95d7/torch/fx/experimental/symbolic_shapes.py (L350-L365C15)) within [`constrain_range`](2d745b95d7/torch/fx/experimental/symbolic_shapes.py (LL313C74-L313C74)) is split out as `constrain_range_int` to capture the case when a non-`SymInt` is passed in, and it is reused in the new `_constrain_range`. The reason is that when a non-`SymInt` is provided:
* If it directly called `sym_constrain_range`, the C++ version would be called, which is a no-op.
* So in this case it calls `constrain_range_int` instead, to be able to catch issues such as the user providing an input whose tensor shape is out of range during export, as in the following variation of the code example above:
```
...
inp = torch.tensor([10])
ep = torch._export.export(f, (inp,)) # immediately raise error
```
Differential Revision: [D46734204](https://our.internmc.facebook.com/intern/diff/D46734204)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103346
Approved by: https://github.com/tugsbayasgalan
Fixes #95900
Using the following repro as guide:
```python
import torch
import torch._dynamo
from torch._subclasses import fake_tensor
from torch.fx.experimental.symbolic_shapes import ShapeEnv
from torch._dynamo.output_graph import config

class Model(torch.nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.linear = torch.nn.Linear(2, 2)
        self.linear2 = torch.nn.Linear(2, 2)

    def forward(self, x):
        out = self.linear(x)
        out = self.linear2(out)
        return out

fake_mode = fake_tensor.FakeTensorMode(
    allow_non_fake_inputs=False,
    allow_fallback_kernels=True,
    shape_env=ShapeEnv(
        allow_scalar_outputs=config.capture_scalar_outputs,
        allow_dynamic_output_shape_ops=config.capture_dynamic_output_shape_ops,
        frame_id=0,
    ),
)

# Fakefying input/model before calling torch._dynamo.export
with fake_mode:
    fake_x = torch.rand(5, 2, 2)
    model = Model()

# Calling torch._dynamo.export without active fake mode
graph_module, guards = torch._dynamo.export(
    model,
    fake_x,
    aten_graph=True,
    fake_mode=fake_mode,
)

graph_module.print_readable()
graph_module.graph.print_tabular()
```
Summary of changes:
* Plumb `fake_mode` through the torch.export API. When specified, it replaces the creation of a new `FakeTensorMode` at `InstructionTranslator` on behalf of `OutputGraph`.
* Hack `FakeTensor.__new__` to prevent a `torch.Tensor._make_subclass` call for inputs that are already fakefied by the user. This probably needs to be fixed in a nicer way. Any idea?
* Removed a few asserts that didn't want fake tensors coming from the user script.
* Added `torch._subclasses.fake_tensor.FakeTensor` to the type list of a few assert checks to allow fake inputs.
The changes above allowed symbolic tracing with both static and dynamic shapes.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/100017
Approved by: https://github.com/ezyang
We discussed in a composability meeting a few weeks ago that `pre_autograd` should probably be renamed to `pre_dispatch`.
One question in this PR was: should I re-use a dispatch key? Or should I create a new dispatch key (that yet again corresponds to "top of the dispatcher")?
~~For now, I ended up sticking our proxy mode on the mode stack corresponding to `PythonTLSSnapshot`, because it was simple and it works. It looks like one of the functorch dispatch keys has higher priority though, so it's possible that functorch will end up running first. Open to options, but we can consider adding a new dispatch key later if that becomes a problem~~
Update: I added a dedicated dispatch key, `PreDispatch`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101818
Approved by: https://github.com/ezyang, https://github.com/Neilblaze, https://github.com/albanD, https://github.com/zou3519
This PR adds aot_export_module as the lowering path from the torch-level graph to the aten graph. Some known limitations that need to be addressed in follow-up PRs:
1. Store param/buffer data in ExportedProgram
2. Fully support torch.cond with params/buffers
3. Making sure no duplicated ExportMetaData entry
4. This API will break Executorch if used on PyE, we will figure out a plan internally.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101490
Approved by: https://github.com/avikchaudhuri
When investigating failures in https://github.com/pytorch/pytorch/pull/100017 I realized that we were reentering FakeTensorMode even though there was already one on the stack. Although we have attempted asserts for these cases in the past, e.g., as in https://github.com/pytorch/pytorch/pull/97186, it seems that the existing protections were insufficient.
In this particular case, the reapplication of FakeTensorMode was due to an interaction with NotImplemented multiple dispatch handling. If proxy tensor mode detects an unrecognized tensor type (this includes FakeTensor, if it is not tracked with a proxy), it will return NotImplemented to give this tensor a chance to unpack itself into proxyable operations. However, this is never the right thing for FakeTensor, where no unpacking is possible. Yet today, FakeTensor attempts to reapply the FakeTensorMode, resulting in FakeTensorMode being on the stack twice.
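For readers unfamiliar with the NotImplemented convention referred to above, here is a framework-free sketch of the protocol (plain Python operator dispatch rather than `__torch_dispatch__`, purely as an analogy, not the FakeTensor code itself): a handler that cannot deal with an argument returns `NotImplemented` so the other object's handler gets a chance.
```python
class Meters:
    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if isinstance(other, Meters):
            return Meters(self.value + other.value)
        # Decline: let `other.__radd__` try instead of erroring here.
        return NotImplemented


class Feet:
    def __init__(self, value):
        self.value = value

    def __radd__(self, other):
        if isinstance(other, Meters):
            return Meters(other.value + self.value * 0.3048)
        return NotImplemented


print((Meters(1) + Feet(10)).value)  # 4.048
```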
This PR does a number of things:
* It adds an assert in `FakeTensorMode.__torch_dispatch__` that you must not already have this mode on the stack, this is ALWAYS an error
* It modifies `FakeTensor.__torch_dispatch__` to return `NotImplemented` if the mode is already active. This prevents us from re-adding the mode to the stack
* It adds a new logging artifact `not_implemented` which you can use to get debug logs about all of the times a `__torch_dispatch__` handler returned NotImplemented and why it did so. Your subclass has to manually opt into this logging, but I inserted the necessary logs for ProxyTensorMode and FakeTensor(Mode)
* `with fake_mode` now no-ops if the fake mode is already on the stack, which is what users want anyway
* I am BREAKING pre-autograd tracing, because it is currently doing something weird with the original C++ mode stack. Brian is going to follow up with a fix next week.
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/102091
Approved by: https://github.com/thiagocrepaldi, https://github.com/eellison, https://github.com/wanchaol, https://github.com/bdhirsh
PR to enable default workflow PyTorch 2.0 unit tests for the ROCm stack.
- Enables all the dynamo unit test suites
- Enables some of the inductor unit test suites
- `test_config`
- `test_cpp_wrapper` (cpu only)
- `test_minifier`
- `test_standalone_compile`
- `test_torchinductor_dynamic_shapes`
- `test_torchinductor_opinfo`
- `test_torchinductor`
- `test_triton_wrapper`
- Introduces TEST_WITH_ROCM conditions for unit test skip/fail dictionaries in test_torchinductor_dynamic_shapes.py and test_torchinductor_opinfo.py
Note this PR follows on from the discussions for the previous UT enablement PR https://github.com/pytorch/pytorch/pull/97988, we have opted to only enable a few inductor suites at the moment to ease the upstreaming effort as these files are changing very quickly.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/100981
Approved by: https://github.com/jithunnair-amd, https://github.com/malfet
This PR does the following:
1. Previously, inline constraints were not properly set for tensor-output data-dependent ops such as `a.nonzero()`, because their return value is not a SymInt. This PR just uses all the unbacked symbols, i.e. those that start with "i"/"f" in the `create_unbacked_sym*` functions. Note that these symbols are guaranteed to be a superset of the inline user constraints.
2. Adds inline assertion support by checking these constraints in the graph. Currently, it only deals with tensor, SymInt, SymFloat, and SymBool output data-dependent ops and ignores the rest. This is good enough for now, as we only have a limited number of data-dependent ops (`.item` and `.nonzero` are explicitly tested).
An example of a graph with added assertions is shown below:
```
class ExportGraphModule(torch.nn.Module):
    def forward(self, x):
        arg0: i64[s0], = fx_pytree.tree_flatten_spec(([x], {}), self._in_spec)
        nonzero_default: i64[i0, 1] = torch.ops.aten.nonzero.default(arg0); arg0 = None
        return pytree.tree_unflatten([nonzero_default], self._out_spec)

class GraphModule(torch.nn.Module):
    def forward(self, x):
        arg0: i64[s0], = fx_pytree.tree_flatten_spec(([x], {}), self._in_spec)
        sym_size: Sym(s0) = torch.ops.aten.sym_size(arg0, 0)
        nonzero_default: i64[i1, 1] = torch.ops.aten.nonzero.default(arg0); arg0 = None
        sym_size_1: Sym(i1) = torch.ops.aten.sym_size(nonzero_default, 0)
        ge: Sym(i1 >= 3) = sym_size_1 >= 3
        scalar_tensor_default: f32[] = torch.ops.aten.scalar_tensor.default(ge); ge = None
        _assert_async_msg = torch.ops.aten._assert_async.msg(scalar_tensor_default, 'nonzero_default.shape[0] is outside of inline constraint [3, 5].'); scalar_tensor_default = None
        le: Sym(i1 <= 5) = sym_size_1 <= 5; sym_size_1 = None
        scalar_tensor_default_1: f32[] = torch.ops.aten.scalar_tensor.default(le); le = None
        _assert_async_msg_1 = torch.ops.aten._assert_async.msg(scalar_tensor_default_1, 'nonzero_default.shape[0] is outside of inline constraint [3, 5].'); scalar_tensor_default_1 = None
        return pytree.tree_unflatten([nonzero_default], self._out_spec)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/100763
Approved by: https://github.com/tugsbayasgalan
This diff adds support for dynamic equality constraints of the form `dynamic_dim(x, 0) == dynamic_dim(y, 1)`. The process of constraint discovery can already understand equality guards between dimensions and suggests such equality constraints, so this closes the loop on that. Correspondingly we now raise `ConstraintViolation` when we find that such a guard is added on a dynamic dimension and the user did not specify such a constraint. (NOTE: This is distinct from a dynamic dimension being guarded equal to a constant, which is already an error.)
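A sketch of how such an equality constraint is declared from the user's side. The import path and the `constraints=` argument follow the export API of this era and have since changed, so treat the exact spelling as an assumption:
```python
import torch
from torch._export import dynamic_dim

def f(x, y):
    return x + y.t()

x = torch.randn(3, 4)
y = torch.randn(4, 3)

# Declare that dim 0 of x must always equal dim 1 of y; without this
# declaration, the guard x.size(0) == y.size(1) would now raise a
# ConstraintViolation instead of being silently specialized.
constraints = [dynamic_dim(x, 0) == dynamic_dim(y, 1)]
ep = torch._export.export(f, (x, y), constraints=constraints)
```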
Differential Revision: [D45279437](https://our.internmc.facebook.com/intern/diff/D45279437/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99993
Approved by: https://github.com/tugsbayasgalan
Summary:
Issue:
`torch._dynamo.exc.Unsupported: call_method ListVariable() copy [] {}`
Fix:
Add `copy()` to the `call_method` handling in `_dynamo/variables/lists.py`.
Take it over from #98184. To unblock a meta internal model onboarding to ExecuTorch.
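A hedged sketch of the kind of user code that hits this (an illustrative repro, not the internal model):
```python
import torch

@torch.compile(backend="eager", fullgraph=True)
def fn(x, xs):
    ys = xs.copy()  # previously: Unsupported: call_method ListVariable() copy
    return x + sum(ys)

print(fn(torch.tensor(1.0), [1, 2, 3]))  # tensor(7.)
```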
Differential Revision: D45592416
Pull Request resolved: https://github.com/pytorch/pytorch/pull/100669
Approved by: https://github.com/jansel
pytest rewrites Python assert statements in unit tests to provide more detailed error messages. Unfortunately, this breaks some dynamo tests. Disable AST rewriting in test_export.py so that "pytest test/dynamo/test_export.py" passes.
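One standard way to opt a module out of pytest's assertion rewriting is the `PYTEST_DONT_REWRITE` marker in the module docstring; whether this PR uses exactly that mechanism is an assumption, but the idea looks like:
```python
# test/dynamo/test_export.py (sketch)
"""PYTEST_DONT_REWRITE"""  # tells pytest not to rewrite asserts in this module

def test_plain_assert():
    # With rewriting disabled, this stays a vanilla Python assert,
    # avoiding the rewritten bytecode that breaks some dynamo tests.
    assert 1 + 1 == 2
```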
Fixes #93449
Pull Request resolved: https://github.com/pytorch/pytorch/pull/100484
Approved by: https://github.com/tugsbayasgalan
This PR introduces a new operator called aten._assert_async.msg, which allows passing a tensor value and assertion message as inputs. As part of TorchDynamo, we're replacing the use of torch._assert with this new operator so that make_fx also knows how to handle assertions. This is subset of https://github.com/pytorch/pytorch/pull/98878, refer there for historic reviews.
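A minimal sketch of calling the new op directly, with the call form matching the exported-graph example earlier in this log (a tensor condition plus a message string):
```python
import torch

cond = torch.tensor(True)
# Asynchronously asserts that `cond` is truthy; the message is reported
# if the assertion fails.
torch.ops.aten._assert_async.msg(cond, "expected condition to hold")
```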
Pull Request resolved: https://github.com/pytorch/pytorch/pull/100101
Approved by: https://github.com/jansel
Metadata to store in the GraphModule:
- input shape constraints
- example inputs
- other inline constraints
The saved constraints (in memory) will be used directly after export to convert constraints to runtime assertions, which is a separate pass after export.
The requirements for the saved constraints:
1. Be able to locate where each constraint comes from.
2. Not break serialization of the exported graph module.
Examples of saved constraints:
```
input_shape_constraints:
{'t_id': 140266058179792, 'dim': 0, 'min': 6, 'max': oo}
{'t_id': 140266058179792, 'dim': 0, 'min': 2, 'max': 10}
inline_constraints:
i1: ValueRanges(lower=2, upper=5)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99961
Approved by: https://github.com/tugsbayasgalan
pre_autograd tracing is still early, but it should work for basic cases. This PR changes the API a bit for export to expose pre_autograd tracing. Name bikeshedding is welcome, but it looks like:
```
torch._dynamo.export(..., aten_graph="aten_pre_autograd")
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98031
Approved by: https://github.com/ezyang
It's part of the effort to improve PT2 Export UX. This PR is to improve the usability of `torch.cond()` by separating user errors from the dynamo internal errors. By definition, user error means the usage of `torch.cond()` violates the restrictions of this API therefore needs users to take action and fix the error.
In this notebook N3363227 we discovered a bunch of limitations of using `torch.cond(pred, true_fn, false_fn, operands)`. In summary, the limitations can be categorized as:
- predicate restriction (`pred`)
- operands restriction (`operands`)
- branch restriction (`true_fn` & `false_fn`)
The error message will be more accurate about where the (user) error is from and more actionable for users to fix it.
For example, `operands` must be a list of tensors, and the signatures of `true_fn` and `false_fn` must match the `operands`.
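For contrast, a call that satisfies these restrictions looks roughly like the following (illustrative only; the `cond` import path of this era is an assumption):
```python
import torch
from functorch.experimental.control_flow import cond

def true_fn(x):
    return x.sin()

def false_fn(x):
    return x.cos()

def f(pred, x):
    # pred: a boolean tensor; operands: a list of tensors matching the
    # branch signatures.
    return cond(pred, true_fn, false_fn, [x])

print(f(torch.tensor(True), torch.ones(3)))
```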
If the operands contain non-tensor types, the user will see an error message like:
```
torch._dynamo.exc.UserError: Expected a list of tensors but got ["<class 'torch.Tensor'>", "<class 'float'>"]
from user code:
File "~/pytorch/test/dynamo/test_export.py", line 2504, in f_non_tensor_operands
return cond(True, lambda x, a: x.sin(), lambda x, a: x.cos(), [x, a])
```
If the signature of a branch function doesn't match the `operands`, the user will see an error message like:
```
torch._dynamo.exc.UserError: too many positional arguments.
func = 'false_fn' ~/pytorch/test/dynamo/test_export.py:2514, args = [<class 'torch.Tensor'>, <class 'torch.Tensor'>], kwargs = {}
```
Or if the tensors returned from the user-defined branches have different metadata (e.g. shapes, dtypes), the user will see an error message like:
```
TypeError: Expected each tensor to have same metadata but got:
cond_true_0 returns TensorMetadata(shape=torch.Size([2, 1]), dtype=torch.int64, requires_grad=False, stride=(1, 1), memory_format=torch.contiguous_format, is_quantized=False, qparams={})
cond_false_0 returns TensorMetadata(shape=torch.Size([1]), dtype=torch.float32, requires_grad=False, stride=(1,), memory_format=torch.contiguous_format, is_quantized=False, qparams={})
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98909
Approved by: https://github.com/jansel
Summary: This fixes the case when some of the input tensors were
real tensors and fakified in `validate_and_convert_non_fake_tensors`,
but `flat_arg_fake_tensors` would not contain all the inputs
because it was computed before the fakification. We fix this by
recomputing `flat_arg_fake_tensors` after fakification as well.
Test Plan:
python test/dynamo/test_export.py ExportTests.test_mixed_real_and_fake_inputs
Reviewers: Chillee, voznesenskym
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98769
Approved by: https://github.com/voznesenskym
It's part of the effort to improve PT2 Export UX. This PR improves the usability of `torch.cond()` by allowing the user to pass `pred` as a `ConstantVariable`, since it is not uncommon to see control flow on a tensor's rank or dim size, which is traced as a `ConstantVariable`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98900
Approved by: https://github.com/jansel
This diff adds the ability to specify range constraints on dynamic dimensions. (Previously we only supported declaring a dynamic dimension, which gets the default range `[2, sympy.oo]`.)
One point worth calling out: our initial design called for compound expressions like `lower <= dynamic_dim(x, d) <= upper`. However this seems difficult to support, because of a combination of desugaring and overloading semantics for such compound expressions in Python. Rather than silently doing the wrong thing, we explicitly error in this case and recommend users to specify multiple constraints, which is supported.
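A sketch of declaring a bounded dynamic dimension as two separate constraints, per the recommendation above (same era-specific `dynamic_dim`/`constraints` API assumed as in the earlier sketch):
```python
import torch
from torch._export import dynamic_dim

def f(x):
    return x * 2

x = torch.randn(5, 3)
constraints = [
    dynamic_dim(x, 0) >= 3,   # lower bound
    dynamic_dim(x, 0) <= 10,  # upper bound, written as a separate constraint
]
ep = torch._export.export(f, (x,), constraints=constraints)
```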
Differential Revision: [D44847318](https://our.internmc.facebook.com/intern/diff/D44847318/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98779
Approved by: https://github.com/ezyang
Summary: Add new experimental python op (`torch.nonzero_static`) for export. There is NO cuda impl included in this PR
Example:
Say input tensor is `x = torch.tensor([[1, 0], [3, 2]])`
call regular `nonzero()` on x will give you a tensor `tensor([[0, 0], [1, 0], [1, 1]])`
call `nonzero_static(x, size=4)` on x will give you a tensor `tensor([[0, 0], [1, 0], [1, 1], [fill_value, fill_value]])` (padded)
call `nonzero_static(x, size=2)` on x will give you a tensor `tensor([[0, 0], [1, 0]])` (truncated)
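A runnable version of the example above (assuming the op lands with `size` and `fill_value` as keyword arguments and, per the summary, a CPU-only implementation):
```python
import torch

x = torch.tensor([[1, 0], [3, 2]])
print(torch.nonzero(x))
# tensor([[0, 0], [1, 0], [1, 1]])
print(torch.nonzero_static(x, size=4, fill_value=-1))
# padded with a row of fill_value
print(torch.nonzero_static(x, size=2))
# truncated to the first 2 rows
```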
Test Plan:
**Unit Tests**
```
buck test @mode/dev-nosan //caffe2/test:test_dynamo -- 'caffe2/test:test_dynamo - test_export.py::ExportTests::test_export_with_nonzero_static' -- 'caffe2/test:test_dynamo - test_misc.py::MiscTests::test_nonzero_static'
```
**PT2 Export with `nonzero_static()`**
Example of `GraphModule` in the exported graph
```
def forward(self, x):
    arg0, = fx_pytree.tree_flatten_spec(([x], {}), self._in_spec)
    nonzero_static_default = torch.ops.aten.nonzero_static.default(arg0, size = 4); arg0 = None
    return pytree.tree_unflatten([nonzero_static_default], self._out_spec)
```
Differential Revision: D44324808
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97417
Approved by: https://github.com/ezyang
The purpose of this API is to execute a few large components of work:
1) Refactor all the internals of plumbing dynamic dimension information after dynamo to be stateless
2) Decouple allocation controls around dynamic dimensions from verification
3) For (2), for allocation, create an enum that dictates whether we are in DUCK (default today), STATIC (aka assume_static_default in the past), or DYNAMIC (aka user constrained, do not duck shape)
4) For (2), for verification, we separate out the list of dynamic ranges entirely from allocation. This means the shape_env does no tracking of what we verify on; instead, it is the caller's job to invoke produce_guards() with the various things they want verified, specifically with the valid ranges. We do use constrain ranges to refine value ranges when doing analysis.
5) We have decided, therefore, as an extension of (4), to double down on "late" checks versus "eager" checks, primarily because the mechanism for gathering what actually matters happens during guards, and should be the purview of the caller seeking guards, not the shape env. However, for dynamo, these structures are essentially one and the same.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96699
Approved by: https://github.com/avikchaudhuri, https://github.com/ezyang
Summary:
Verified that the changes catch unspecialized ints/floats being added as additional graphargs in D44037548, prior to PR (https://github.com/pytorch/pytorch/pull/95621).
However, with #95621 the issue originally being solved is no longer valid, because ints & floats in `forward` will always be specialized in export. This PR adds the assertion anyway *(though it should not be hit unless there is a regression)* to immediately catch any attempt to add an unspecialized int/float to additional graphargs.
Test Plan:
Example of the error message would look like:
```
Dynamo attempts to add additional input: value=9.999999747378752e-06, source=NNModuleSource(inner=AttrSource(base=NNModuleSource(inner=AttrSource(base=LocalInputSource(local_name='self', pos=0), member='torch_module')), member='eps'))
```
Passed all export tests
```
Buck UI: https://www.internalfb.com/buck2/fea72653-5549-47e7-a9bf-740eb86a8e26
Test UI: https://www.internalfb.com/intern/testinfra/testrun/8725724422167257
RE: reSessionID-7b3470b1-c293-4c4a-9671-dd0b7a2839b8 Up: 6.0 KiB Down: 0 B
Jobs completed: 101. Time elapsed: 115.7s.
Tests finished: Pass 98. Fail 0. Fatal 0. Skip 0. 0 builds failed
```
Differential Revision: D44075910
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96786
Approved by: https://github.com/tugsbayasgalan, https://github.com/ezyang
`inspect.getfullargspec` does not properly handle functions/methods wrapped by functools.wraps(). As a result, it gets an empty list of `args` in FullArgSpec.
This PR rewrites the logic using `inspect.signature`, which handles functools.wraps() correctly.
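A small sketch of the discrepancy (standard-library behavior, independent of dynamo):
```python
import functools
import inspect

def decorator(fn):
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        return fn(*args, **kwargs)
    return wrapper

@decorator
def add(x, y):
    return x + y

# getfullargspec reports the wrapper's own (empty) positional args ...
print(inspect.getfullargspec(add).args)         # []
# ... while inspect.signature follows __wrapped__ back to the original.
print(list(inspect.signature(add).parameters))  # ['x', 'y']
```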
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96557
Approved by: https://github.com/jansel
OK, so this PR used to be about reducing the number of constants we specialize on, but it turns out that unspecialization was ~essentially never used (because we still constant specialized way too aggressively) and I ended up having to fix a bunch of issues to actually get tests to pass. So this PR is now "make int unspecialization actually work". As part of this, I have to turn off unspecialization by default, as there are still latent bugs in inductor.
The general strategy is that an unspecialized int is represented as a SymInt. Representing it as a 0d tensor (which is what the code used to do) is untenable: (1) we often need unspecialized ints to participate in size computations, but we have no way of propagating sympy expressions through tensor compute, and (2) a lot of APIs work when passed SymInt, but not when passed a Tensor. However, I continue to represent Numpy scalars as Tensors, as they are rarely used for size computation and they have an explicit dtype, so they are more accurately modeled as 0d tensors.
* I folded in the changes from https://github.com/pytorch/pytorch/pull/95099 as I cannot represent unspecialized ints as SymInts without also turning on dynamic shapes. This also eliminates the necessity for test_unspec.py, as toggling specialization without dynamic shapes doesn't do anything. As dynamic shapes defaults to unspecializing, I just deleted this entirely; for the specialization case, I rely on regular static shape tests to catch it. (Hypothetically, we could also rerun all the tests with dynamic shapes, but WITH int/float specialization, but this seems... not that useful? I mean, I guess export wants it, but I'd kind of like our Source heuristic to improve enough that export doesn't have to toggle this either.)
* Only 0/1 integers get specialized by default now
* A hodgepodge of fixes. I'll comment on the PR about them.
Fixes https://github.com/pytorch/pytorch/issues/95469
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95621
Approved by: https://github.com/jansel, https://github.com/Chillee
If the input to operator.not_ is a tensor, I want to convert the operator to a torch.logical_not. This allows the following test case to pass. Beforehand it resulted in the error `NotImplementedError("local_scalar_dense/item NYI for torch.bool")`
```
def test_export_tensor_bool_not(self):
    def true_fn(x, y):
        return x + y

    def false_fn(x, y):
        return x - y

    def f(x, y):
        return cond(not torch.any(x), true_fn, false_fn, [x, y])
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94626
Approved by: https://github.com/voznesenskym
By moving guard string assembly into dynamo's default behavior and letting code_parts do the work, we can have much better shape guard failures.
Before this fix, the guard failure in the test would look like:
```
'x.size()[1] == x.size()[0] and x.stride()[0] == x.[264 chars]!= 1' != 'x.size()[0] < 3'
- x.size()[1] == x.size()[0] and x.stride()[0] == x.size()[0] and x.stride()[1] == 1 and x.storage_offset() == 0 and y.size()[0] == x.size()[0] and y.size()[1] == x.size()[0] and y.stride()[0] == x.size()[0] and y.stride()[1] == 1 and y.storage_offset() == 0 and x.size()[0] < 3 and x.size()[0] != 0 and x.size()[0] != 1
+ x.size()[0] < 3
```
now it is
```
"x.size()[0] < 3"
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/93894
Approved by: https://github.com/ezyang
Two small changes that I'm bundling together because one of them needs to touch fbcode and I'm not sure how to do stacked diffs + internal changes + land before release cut.
Remove allow_meta from the ctor, and allow it by default: we should be able to trace through meta with fake tensors, so in some sense it's a bit weird to expose an option for the user to disallow this. However, it's still useful debug-wise to error from time to time, so I've added an option to the config that restores the previous behavior.
Remove `throw_on_data_dependent_ops=True`: this was intended as a temporary behavior as we were smoothing things turning on the erroring. There are no uses anywhere of `throw_on_data_dependent_ops=False` I could find.
These are technically backward-incompatible, but fake tensor is new since the last release / in a private namespace, and I don't want to release it with baggage that would be hard to remove later.
Fix for https://github.com/pytorch/pytorch/issues/92877.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/93993
Approved by: https://github.com/bdhirsh, https://github.com/ezyang
Previously, Dynamo faked support for item() when `capture_scalar_outputs` was True by representing it internally as a Tensor. With dynamic shapes, this is no longer necessary; we can represent it directly as a SymInt/SymFloat. Do so. Doing this requires you to use dynamic shapes; in principle we could support scalar outputs WITHOUT dynamic shapes but I won't do this unless someone hollers for it.
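A hedged sketch of the behavior being enabled (flag name taken from this era's dynamo config; backend kept at `"eager"` so only tracing is exercised, and exact behavior may differ across versions):
```python
import torch
import torch._dynamo.config as dynamo_config

dynamo_config.capture_scalar_outputs = True

@torch.compile(backend="eager", fullgraph=True)
def scale(weights, x):
    s = x.item()  # traced as a SymFloat rather than a 0d tensor
    return weights * s

print(scale(torch.ones(3), torch.tensor(2.0)))  # tensor([2., 2., 2.])
```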
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Differential Revision: [D42885775](https://our.internmc.facebook.com/intern/diff/D42885775)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/93150
Approved by: https://github.com/voznesenskym
Sample values from the test case `test_export_with_stack_trace`:
node.target | node.meta["source_fn"]
-- | --
aten.randn.default | <built-in method randn of type object at 0x7f8683263108>
aten.t.default | < built-in function linear >
aten.mm.default | < built-in function linear >
aten.cos.default | <built-in method cos of type object at 0x7f8683263108>
aten.relu.default | relu
aten.add.Tensor | < built-in function add >
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92399
Approved by: https://github.com/jerryzh168, https://github.com/yanboliang
The idea is to make ShapeEnv guards less of a one-off special snowflake, and integrate it more closely with the regular builder infrastructure. But it is not so easy: the shape env code has to live after tensor match code, because we need to know that the values in question are tensors before we start matching on them. So we introduce a new `shape_env_code` field to put the special shape env code, so we can add it to the final constructed code after tensor.
Everything else works the obvious way. There's a new ShapeEnvSource for constructing the singleton SHAPE_ENV guard that drives the shape env guard construction. I added some more docs and also made the printed code for guards include the enclosing lambda for more clarity.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91055
Approved by: https://github.com/albanD, https://github.com/voznesenskym
The original implementation of cond() operator support in dynamo operated by recursively calling export() on the inner subgraph. This is problematic for a number of reasons:
* My original motivating reason: the original implementation had to play tricks to feed real tensors to the recursive export call, which means that it doesn't work well with tracing with dynamic shapes (where we MUST stay in fake tensors to accurately track dynamic shapes across the cond invocation)
* If there are pending side effects, the recursive export() call won't see those side effects (as they are only tracked by Dynamo, not actually applied to the Python environment.) You can see an example where dynamo cond tracing does the wrong thing at https://github.com/pytorch/pytorch/pull/90208
* If there were side effects inside the true/false branch, these side effects were silently lost (as the export only returns the graph of tensor operations, and not any of the residual Python bytecodes necessary to reapply any side effects.) This could have substantive effects on the export of subsequent parts of the model, as those parts of the models could rely on the side effects.
* It was not possible to track NN module accesses inside the true/false branches, necessitating a hack where the NN module was explicitly passed in as an input to cond https://github.com/pytorch/pytorch/pull/87020#issuecomment-1338842844 which doesn't really make any sense from a backend compilation perspective
* Guards induced from the inside of the true/false branch were not properly propagated to the top level guards; they were just silently dropped (in fact, the original implementation checked that the true/false branch produce the same guards which... is not useful? Like, I don't think that actually is even necessary for correctness)
This PR replaces the old implementation with a new implementation based on graphstate checkpointing. The basic idea is to process a cond(), we checkpoint the state of our interpreter, run the true branch, rollback to our checkpoint, run the false branch, rollback to our checkpoint and then merge the changes from both of the checkpoints. I require the true/false branches to have exactly the same side effects, but union their guards.
Some of the details:
* Dynamo is too aggressive with tracking side effects when processing closures, c.f. https://github.com/pytorch/torchdynamo/pull/233/files#r1040480078 The basic problem is whenever I define a closure, this immediately counts as a side effect, even if I didn't actually mutate anything. This triggered on the nested cond export example. To prevent this from happening, I optimistically avoid tracking side effects, but if a STORE_DEREF happens, I restart analysis with the relevant Source.name() added to `mutated_closure_cell_contents` so we start tracking on closure allocation. This is enough to fix the relevant test.
* For the most part, I assert that the graph states must be equivalent after applying the true/false branches. During debugging, I found it useful to be able to compare two graph states and give a better description about what the divergence was. You can test this using the `diff()` method I've added to a few structures.
* The implementation now supports NestedUserFunctionVariable, which is nice as it allows the true/false branches to be defined closer to the cond implementation.
* I fixed the naming of the true/false subgraphs; previously they were named `name_0`, `name_1`, now they are named `cond_true_0` and `cond_false_0`
* I added `name_to_input` to the saved graph state. I don't actually know if this is necessary, but it seemed like a good idea.
* I have to play some tricks to get the speculating execution of the true/false branch to record into a subgraph. After a careful read of OutputGraph, I found that what would work is overriding graph with a fresh Graph that we want to write things into, and manually setting up the inputs/outputs. It's a little delicate as you have to make sure you reset the Graph to its original before you restore a checkpoint, as checkpoints don't actually save graph for efficiency, and just undo changes on the graph. This capability may usefully get refactored to OutputGraph but I didn't do it in this PR for simplicity.
There are some further problems with the cond() implementation that I leave for future work. Most of these were preexisting with the original implementation.
* Not a problem per se, but if an NN module is used by both the true/false branch, it will show up in the final graph twice (since it has to be a submodule of the GraphModule that makes use of it.) I hope the export pipeline can deal with this.
* List of tensor output for cond is not supported.
* The true/false return values may not have consistent sizes/dims/etc, and we don't check them for consistency.
* If we modify fake tensors in the true/false branches, we aren't rolling them back, c.f. https://github.com/pytorch/torchdynamo/issues/1840
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90286
Approved by: https://github.com/voznesenskym
Summary:
Today, when we transform the captured graph in the last step of export(aten_graph=True), we construct a new graph which doesn't carry over all the metadata, for example `node.meta["val"]`.
`meta["val"]` is important for writing passes and analysis on the graph later in the pipeline, so we may want to preserve it on placeholder nodes.
Test Plan: test_export.py:test_export_meta_val
Differential Revision: D41110864
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88651
Approved by: https://github.com/tugsbayasgalan, https://github.com/jansel