Before the pr, we have a graph break for `callable(nn_module)`:
```python
class M(nn.Module):
def forward(self, x):
return x.sin()
def f(m):
return callable(m)
res = torch.compile(f, fullgraph=True)(M())
```
```
Traceback (most recent call last):
File "/data/users/yidi/pytorch/t.py", line 17, in <module>
out = torch.compile(f, backend="eager", fullgraph=True)(M())
File "/data/users/yidi/pytorch/torch/_dynamo/eval_frame.py", line 414, in _fn
return fn(*args, **kwargs)
File "/data/users/yidi/pytorch/torch/_dynamo/convert_frame.py", line 1077, in catch_errors
return callback(frame, cache_entry, hooks, frame_state, skip=1)
File "/data/users/yidi/pytorch/torch/_dynamo/convert_frame.py", line 456, in _convert_frame_assert
return _compile(
File "/data/users/yidi/pytorch/torch/_utils_internal.py", line 74, in wrapper_function
return function(*args, **kwargs)
File "/home/yidi/.conda/envs/pytorch/lib/python3.10/contextlib.py", line 79, in inner
return func(*args, **kwds)
File "/data/users/yidi/pytorch/torch/_dynamo/convert_frame.py", line 799, in _compile
guarded_code = compile_inner(code, one_graph, hooks, transform)
File "/data/users/yidi/pytorch/torch/_dynamo/utils.py", line 210, in time_wrapper
r = func(*args, **kwargs)
File "/data/users/yidi/pytorch/torch/_dynamo/convert_frame.py", line 618, in compile_inner
out_code = transform_code_object(code, transform)
File "/data/users/yidi/pytorch/torch/_dynamo/bytecode_transformation.py", line 1167, in transform_code_object
transformations(instructions, code_options)
File "/data/users/yidi/pytorch/torch/_dynamo/convert_frame.py", line 177, in _fn
return fn(*args, **kwargs)
File "/data/users/yidi/pytorch/torch/_dynamo/convert_frame.py", line 564, in transform
tracer.run()
File "/data/users/yidi/pytorch/torch/_dynamo/symbolic_convert.py", line 2244, in run
super().run()
File "/data/users/yidi/pytorch/torch/_dynamo/symbolic_convert.py", line 886, in run
while self.step():
File "/data/users/yidi/pytorch/torch/_dynamo/symbolic_convert.py", line 801, in step
self.dispatch_table[inst.opcode](self, inst)
File "/data/users/yidi/pytorch/torch/_dynamo/symbolic_convert.py", line 496, in wrapper
return inner_fn(self, inst)
File "/data/users/yidi/pytorch/torch/_dynamo/symbolic_convert.py", line 1255, in CALL_FUNCTION
self.call_function(fn, args, {})
File "/data/users/yidi/pytorch/torch/_dynamo/symbolic_convert.py", line 739, in call_function
self.push(fn.call_function(self, args, kwargs))
File "/data/users/yidi/pytorch/torch/_dynamo/variables/builtin.py", line 948, in call_function
return handler(tx, args, kwargs)
File "/data/users/yidi/pytorch/torch/_dynamo/variables/builtin.py", line 711, in <lambda>
return lambda tx, args, kwargs: obj.call_function(
File "/data/users/yidi/pytorch/torch/_dynamo/variables/builtin.py", line 948, in call_function
return handler(tx, args, kwargs)
File "/data/users/yidi/pytorch/torch/_dynamo/variables/builtin.py", line 835, in builtin_dipatch
unimplemented(error_msg)
File "/data/users/yidi/pytorch/torch/_dynamo/exc.py", line 216, in unimplemented
raise Unsupported(msg)
torch._dynamo.exc.Unsupported: builtin: callable [<class 'torch._dynamo.variables.nn_module.NNModuleVariable'>] False
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/127026
Approved by: https://github.com/jansel
We guard on key order
1) When a key is a non-constant object
2) When we actually need key order - like .values, .items etc
For dicts/OrderedDicts that do not require key order guarding, we just rely on usual `GuardManger + DictGetItemGuardAccessor`. This is faster than going through the `list(d.keys())` based design for OrderedDicts.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124779
Approved by: https://github.com/jansel
This removes the duplicate handling of comparison ops between symbolic_convert and bultin and refactors the handling to use the binop infrastructure. This change regresses overheads a bit, but this is fixed in the next PR.
New test skips are variants of `type(e) is np.ndarray` previously falling back to eager.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/122043
Approved by: https://github.com/anijain2305
ghstack dependencies: #122039
Reduces the torch.compile(backend="eager") for this code by 1-2 seconds.
~~~
def fn(x):
for _ in range(10000):
# x = torch.sin(x)
x = torch.ops.aten.sin(x)
# x = sin(x)
return x
~~~
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121053
Approved by: https://github.com/jansel
Fixes#119198.
This PR make dynamo inline `__iter__` of a user defined object instead of creating a graph break. Also added a new test, which shows:
1. the loop is unrolled
2. the length of the loop is guarded when inlining `__iter__`
```python
class Mod:
def __init__(self):
self.a = [torch.randn(2, 2), torch.randn(2, 2)]
def __iter__(self):
return iter(self.a)
def f(mod):
ret = []
for x in mod:
ret.append(x + 1)
return ret
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119243
Approved by: https://github.com/jansel
Finally we have this PR to merge allow_in_graph/inline/skip trace rules into ```trace_rules.lookup_inner```, where we can define and lookup trace rules at both function level and file level. Going forward, this is the central place that we define and consulte Dynamo trace rule for any function.
* ```trace_rules.looup``` is the API can return allow_in_graph, inline or skip.
* ```skipfiles.check``` is the API can return inline or skip, since we have multiple places that only do inline/skip check.
* I'll move ```skipfiles.check``` to ```trace_rules.check``` as one of the follow-ups.
* Both functions consulte ```trace_rules.lookup_inner``` to get the tracing rule.
To avoid a single big PR, I left a few items as the follow-ups:
* Remove ```skipfiles.py``` and merge the code into ```trace_rules.py```.
* We do double check in ```symbolic_convert.check_inlineable```, will refactor and simplify it. We should only do inline/skip check before generating ```SkipFilesVariable``` and ```UserFunctionVariable```.
* Rename ```SkipFilesVariable``` as ```SkipFunctionVariable```, since we only handle functions.
* The inline/skip reasons are not logged for some cases, since the new lookup framework doesn't always return inline/skip reasons. I'll refactor loggings to record the inline/skip reason in next step.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118971
Approved by: https://github.com/jansel
Before the pr, we have a graph break for:
```python
def f():
if torch.cuda.current_stream() is not None:
return torch.randn(2, 2)
torch.compile(f, backend="eager", fullgraph=True)()
```
This pr supports comparson ops of StreamVariable and ConstantVariable by returning a constant.
It's safe to return a constant in this case becuase the StreamVariable is guarded by ID_MATCH when created.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119199
Approved by: https://github.com/yifuwang, https://github.com/anijain2305, https://github.com/jansel