Earlier globals of inlined functions from other files were not handled correctly. We were not tracking mutations on them. They were colliding with the same global name in the parent function etc. This PR overrides the LOAD/STORE_GLOBAL for inline tx and tracks mutation on them separately.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/125002
Approved by: https://github.com/jansel
ghstack dependencies: #125097, #125107
FSDP2 has this pattern of using user-defined object instance method as hook, and it will throw this error under compile:
`torch._dynamo.exc.Unsupported: call_function UserDefinedObjectVariable(_pre_forward) [FSDPManagedNNModuleVariable(), TupleVariable(), ConstDictVariable()] {}`
This PR adds support for it by always allowing to trace into a UserDefinedObjectVariable that's an instance method (i.e. `MethodType`).
Supersedes https://github.com/pytorch/pytorch/pull/123320.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/123399
Approved by: https://github.com/jansel
Fixes#114844
In the linked issue we have
```
compiled_module = torch.compile(module)
compiled_module.x = ...
compiled_module(...) # Mutates self.x
```
Where since the module mutates `self.x` you would expect `compiled_module.x`
to be updated but actually `compiled_module.x = ...` sets an attribute "x"
on the `OptimizedModule` object while the forward method of the module mutates
`module.x`.
This gives the expected behavior by forwarding `compiled_module.__setattr__`
down to `module.__setattr__`. There is already a corresponding `__getattr__`
so now `compiled_module.x` becomes an alias for `module.x`.
Co-authored-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/122098
Approved by: https://github.com/ezyang, https://github.com/lezcano
Fixes#114844
In the linked issue we have
```
compiled_module = torch.compile(module)
compiled_module.x = ...
compiled_module(...) # Mutates self.x
```
Where since the module mutates `self.x` you would expect `compiled_module.x`
to be updated but actually `compiled_module.x = ...` sets an attribute "x"
on the `OptimizedModule` object while the forward method of the module mutates
`module.x`.
This gives the expected behavior by forwarding `compiled_module.__setattr__`
down to `module.__setattr__`. There is already a corresponding `__getattr__`
so now `compiled_module.x` becomes an alias for `module.x`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/122098
Approved by: https://github.com/ezyang, https://github.com/lezcano
Fixes#114844
In the linked issue we have
```
compiled_module = torch.compile(module)
compiled_module.x = ...
compiled_module(...) # Mutates self.x
```
Where since the module mutates `self.x` you would expect `compiled_module.x`
to be updated but actually `compiled_module.x = ...` sets an attribute "x"
on the `OptimizedModule` object while the forward method of the module mutates
`module.x`.
This gives the expected behavior by forwarding `compiled_module.__setattr__`
down to `module.__setattr__`. There is already a corresponding `__getattr__`
so now `compiled_module.x` becomes an alias for `module.x`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/122098
Approved by: https://github.com/ezyang, https://github.com/lezcano
Summary:
Seems like `kwargs` is already support in `_infer_argument`, so we don't need the extra assertion `len(kwargs) == 0`.
This optimization ensures compatibility with torch.compile() for LazyModules with kwargs inputs, preventing graph breaks.
Test Plan: Unit tetst and CI
Differential Revision: D53558778
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119445
Approved by: https://github.com/yanboliang
Make variables in dict lazy and remove DICT_KEYS guard.
We build the keys of a dict depth-first and we rely on the guards of
each element in the dict to create the correct guards. This allows us to
remove the rather buggy DICT_KEYS guard and make the guard lazy.
The guards are not completely lazy yet, as we instantiate them in
`_HashableTracker._eq_impl` but it should be possible to make them
truly lazy.
Also, adding new types to the supported types within keys should be less
error prone.
This is marginally less efficient when we graph break, but in turn we
should graph break much less. It also makes the dicts code easier to maintain
(removes `is_hashable_python_var`).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/117625
Approved by: https://github.com/jansel, https://github.com/peterbell10, https://github.com/anijain2305
ghstack dependencies: #117982, #118098, #117983
Summary:
Original commit changeset: 2a9588cfd51b
Original Phabricator Diff: D52062368
Test Plan: In investigating S386328 and S382826, we found checkpoint loading succeed after backout D52062368: S386328_backout_1220_193648
Differential Revision: D52356011
Pull Request resolved: https://github.com/pytorch/pytorch/pull/116243
Approved by: https://github.com/voznesenskym
Fixes#113422Fixes#94575
This is now possible:
```py
model = Model()
compiled_model = torch.compile(model)
model.load_state_dict(compiled_model.state_dict()) # previously key mismatch!
```
This also makes it much easier to checkpoint and load models that were wrapped like so:
```py
FSDP(torch.compile(model))
# or
DDP(torch.compile(model))
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113423
Approved by: https://github.com/msaroufim
Enables some tests that were incorrectly not being run and enables PIE794 globally. This rule checks if a classvar is defined twice as flags it as it is likely a bug. In fact, we found several cases where it was a bug. It does have a couple of false positives which I flagged upstream and replaced with noqas: https://github.com/astral-sh/ruff/issues/8497
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112989
Approved by: https://github.com/malfet
The strategy for supporting functools partials is relatively straightforward.
There are 2 cases we need to support:
**1) Functools partials as input**
In this case, we are first seeing the functools partial and it is guaranteed to have a source. As such, the args, keywords, and func of the functools partial are passed through VariableBuilder. As this is the first time we are seeing these objects (as it is an input), we re-enter VariableBuilder with a source referencing the args, keywords, and func as attributes of the input to produce:
- func: A callable VariableTracker (UDF, TorchVariable, etc) depending on the value of `func`
- args: List[VariableTracker] - note, not ListVariableTracker!
- keywords: Dict[str, VariableTracker]
A major benefit of this structure is that it very elegantly matches the args to `call_function`.
We then compose a FunctoolsPartialVariable from the VariableTrackers made above.
**2) Functools partials created within compile**
In this case, we already have all the args as known VTs, and thus just compose a FunctoolsPartialVariable as we do for case (1).
For both (1) and (2) - we propagate all guards from the func, args, and keyword VTs to the FunctoolsPartialVariable
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108846
Approved by: https://github.com/ezyang, https://github.com/jansel
Summary: Currently node metadata "nn_module_stack" is only being used by export. For some export model, we still want to retain nn_module_stack for unspecialized module for various purposes. This diff add a path to also record nn_module_stack when unspecialized module has a source available.
Test Plan: test_export_nn_module_stack_patched_module
Differential Revision: D48841193
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108281
Approved by: https://github.com/yanboliang, https://github.com/tugsbayasgalan
**This PR is a 99% copy paste of Sam Gross** (@colesbury) work at https://github.com/pytorch/pytorch/pull/100642. Copied from there
--------
The NN_MODULE guard now subsumes guards on Module attributes. The check_fn will fail if the module attributes are changed (such as Module.training), parameters, submodules, and buffers are added or removed, and if fields are changed on the type itself.
This gives up specificity in the guard check -- if any field is changed the check_fn fails -- for faster overall checks.
-----
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108528
Approved by: https://github.com/ezyang
Add similar semantics for creating a buffer object similar to creating a parameter. This is done by introducing a new `Buffer` class that can be used for type disambiguation. The underlying functionality of registering a buffer remains the same as the `register_buffer` method has not been changed. The `persistent` parameter in the `Buffer` type is to indicate whether a buffer object should be persistent or not. Other non-test changes have to do with getting the new `Buffer` type recognized by inductor and dynamo. Remaining changes are test changes to make sure that the `Buffer` type can be used as a drop in replacement for `register_buffer` as it just leads to `register_buffer` being called. The addition of this new functionality still allows for normal tensors to be used as buffers so these changes are intended to be backwards compatible.
Fixes#35735
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104069
Approved by: https://github.com/mikaylagawarecki
before the PR, running super(MyConv1d, self).forward or super(MyConvTranspose, self).foward, dynamo will create a graph break when executing NNModuleVariable.call_method and raise unimplemented error for name=_conv_forward / _output_padding. see issue for full detail: https://github.com/pytorch/pytorch/issues/101155
after the PR, for torch.nn.conv module with function name _conv_forward / _output_padding, we inline the function with tx.inline_user_function_return
code refactor: added NNModuleVariable._inline_user_function_return_helper to consolidaste tx.inline_user_function_return into 1 place to keep code dry. after factor, there are 2 uncolidated inline_user_function_return with different ```fn``` and ```source``` logic. the code is still dry. For local testing, they are covered by test_modulelist, test_moduledict, test_conv_call_super_forward_directly and test_conv_transpose_call_super_forward_directly in test_modules.py
Differential Revision: [D46494460](https://our.internmc.facebook.com/intern/diff/D46494460)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/102509
Approved by: https://github.com/yanboliang
@wconstab As we discussed last Friday, I added the unit test for explicitly calling __call__ and added comment to explain why we redirecting ```UserMethodVariable.call_function``` to ```NNModuleVariable.call_method``` for a certain case.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/100146
Approved by: https://github.com/wconstab
**TL;DR**: This PR fixes handling for lazy modules where `cls_to_become is None`. In those cases, we should leave the type of the lazy module as the old value.
**Details**:
Lazy modules are intended to be initialized at execution; some of them are also supposed to switch to a different type after they have been initialized. However, not all are supposed to switch; see this logic from `nn/modules/lazy.py`
```python
def _infer_parameters(self, ...):
...
if module.cls_to_become is not None:
module.__class__ = module.cls_to_become
```
i.e., we should leave the module type as the old value if `module.cls_to_become is None`. This PR updates dynamo's handling to match this behavior.
Test `test_lazy_module_no_cls_to_become` added to `test/dynamo/test_module.py`.
Differential Revision: [D45253698](https://our.internmc.facebook.com/intern/diff/D45253698)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99943
Approved by: https://github.com/jansel
Before this PR, if users call ```Conv2d(x)```, dynamo handles it well(no graph break) and puts a ```call_module``` op in the FX graph. However, if users explicitly call ```Conv2d.forward(x)``` in another ```forward``` function, the inlining would be failed(caused graph break). This PR fixed this issue by translating the explicit ```Conv2d.forward(x)``` to ```Conv2d(x)```.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99015
Approved by: https://github.com/jansel, https://github.com/wconstab
Allowed modules are stuck into dynamo's fx graph as call_module
nodes, without dynamo doing any tracing of the module. This means
during AOT trace time, hooks will fire during tracing when the
call_module is executed, but the hooks themselves will disappear
after that and not be present in the compiled program.
(worse, if they performed any tensor operations, those would get
traced so you could end up with part of the hook's functionality).
To circumvent this, there are two options for 'allowed modules' with hooks.
1) don't treat them as 'allowed' - trace into them
2) graph-break, so the module is no longer part of the dynamo trace at all
(1) will fail for users that opted into allowed modules becuase they know
their module has problems being traced by dynamo.
(2) causes graph breaks on common modules such as nn.Linear, just because they
are marked as 'allowed'.
It would help matters if we could differentiate between types of allowed modules
(A) allowed to avoid overheads - used for common ops like nn.Linear
(B) allowed to avoid dynamo graphbreaks caused by unsupported code
Ideally, we'd use method (1) for group (A) and (2) for (B).
For now, graph-break on all cases of allowed modules.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97184
Approved by: https://github.com/jansel
This PR makes basic nnmodule forward hooks work by default, without any overhead. But it leaves silent correctness issues if users modify/remove their hooks later, thus also emits a warning.
- the usual case is to not use hooks, so avoid guard overhead here
- registering any hook before compile will trigger a warning about hook support
- registering a hook later (or removing one) requires user knowledge and opting in,
currently this isn't warnable (but maybe we can observe compiled nnmodules to make it
warnable).
Why skip hook guards by default instead of not tracing __call__/hooks by default?
- avoid having a mode flag that alters dynamo tracing behavior (harder to test both codepaths
in CI with full coverage)
- the most basic hook usecase (registering a hook before compile, and never removing it)
will work by default with this PR, while it would require enablement and incur overhead
in the 'not tracing __call__' proposal.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98371
Approved by: https://github.com/jansel
This lets users that are sure they won't use hooks avoid overhead
related to dynamo guards on (assumedly) empty hook dicts on all
nn modules.
Only enable this flag if you are sure you won't change hook-behavior
after compiling. It is ok to register a hook and then compile, if
you promise never to remove/alter the hook. It is also ok to
not register a hook and compile, if you never register a hook later.
Note- this is not the best we can do, and hopefully in the future
we can avoid the need for this option following some of these paths
- make guards fast enough to not be an issue when guarding on hook
dicts
- make a mode where dynamo actually skips tracing __call__ so
hooks are consistently ignored by compiled programs
- use nnmodule versioning so hook changes can be guarded without
explicit hook dict guards
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97830
Approved by: https://github.com/jansel
Fixes https://github.com/pytorch/pytorch/issues/93485
```python
import torch
from torchvision.models import resnet50
model = resnet50(weights=None)
compile_model = torch.compile(model)
print(type(compile_model))
example_forward_input = torch.rand(1, 3, 224, 224)
c_model_traced = torch.jit.trace(compile_model, example_forward_input) # or torch.jit.script
torch.jit.save(c_model_traced, "c_trace_model.pt")
```
Should I raise a warning if a user tries to compile a scripted or traced model as well? It works just fine now on resnet but not sure if it's that something we want to explicitly discourage
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91681
Approved by: https://github.com/desertfire
This PR optimizes the guards overhead introduced by dynamo tracing module forward hooks.
It can and maybe should be followed by a wider change proposed by @voznesenskym to optimize specialized nnmodules by 'observing' any user mutations and directly invalidating the root guard, obviating the need to install other nnmodule guards. (But this observer change seems more involved...)
Idea: maintain a flag, and keep it up to date whenever adding or
removing hooks. Use the flag rather than dict checks to enter the call fast path.
- need to extend RemovableHandle to keep a ref to nnModule so it can update the flag on removal.
- also need to handle the flag in ScriptModule which still uses the python call impl when called from python.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95931
Approved by: https://github.com/ezyang, https://github.com/voznesenskym
Tweak dynamo behavior in 2 places when calling nn.Modules,
to route the call to __call__ instead of .forward(), since
__call__ is the codepath that eager users hit and will dispatch
to hooks correctly.
(1) inside NNModuleVariable.call_function, which covers the common case
of calling a module from code dynamo is already tracing
(2) at the OptimizedModule layer, which is the entrypoint
into a top-level nn.Module dynamo is about to compile
This exposes a new bug: NNModuleVariable used to special-case calling
module.forward() (which is a method) as a UserFunctionVariable with an extra
'self' arg. After tracing into module.__call__, there is no longer a special
case for the eventual call into .forward, and it gets wrapped in a
UserDefinedObjectVariable following standard behavior of ._wrap(). UDOV can't be
called, so this broke some tests.
- Fix: add a new special case in _wrap() that treats methods as a UserDefinedMethod
instead of UserDefinedObjectVariable. Now, the forward method can be called.
Also, fix NNModuleVar.call_method routing forward back to __call__
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92125
Approved by: https://github.com/ezyang, https://github.com/jansel, https://github.com/voznesenskym
The old code didn't actually fakeify traceable tensor subclasses at the
time they are added as a GraphArg to the module; now we do, by ignoring
the subclass during fakeification and relying on Dynamo to simulate
the subclass on top. See comments for more details.
BTW, this codepath is super broken, see filed issues linked on the
inside.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90009
Approved by: https://github.com/wconstab, https://github.com/voznesenskym