All the changes brought by the original PR have been addressed in alternative ways in the stack. Why the original PR has to be reverted requires more effort because there is some bad interaction with export.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/131058
Approved by: https://github.com/williamwen42
Uses `dict.fromkeys` whenever possible as covered by flake8-comprehensions rule C420. While the ruff rule RUF025 is still in preview, flake8-comprehensions have added a new rule which covers this. Use dict.fromkeys is faster when the value being added to the dictionary is the same at every iteration and is immutable, it also removes an unnecessary dict comprehension.
This rule will be enabled with our current ruleset in RUF in 0.6 as C420.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/130699
Approved by: https://github.com/lezcano, https://github.com/ezyang
# Compile time for eager backend
## AlbertForMaskedLM
No inlining - 3.65 seconds
Inlining on main - 7.48 seconds
Inlining + this PR - 6.70 seconds
## MobileBertForMaskedLM
No inlining - 26.90 seconds
Inlining on main - 48.21 seconds
Inlining + this PR - 43.85 seconds
*Next PR in the stack makes the total compile time better/comparable to no inlining*
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129315
Approved by: https://github.com/jansel
ghstack dependencies: #129316
This code is unused because we just inline the `.parameters` call. The code was also wrong because side-effects only track the first level of mutations. An object might not marked mutated if one of the child objects (like a dict) is mutated.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129316
Approved by: https://github.com/jansel
Tracing through `__init__` is important because it initializes (calls STORE_ATTR) on members. By doing that, we kick in the mutation tracking for these objects. So, things like mutating `_modules` etc is tracked automatically.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/126578
Approved by: https://github.com/jansel
ghstack dependencies: #128001
I'm going to setup some extra behavior when we set example value, so
I need a convenient place to interpose. I cannot easily do it on
meta itself because its a generic dict with no interposition point.
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124176
Approved by: https://github.com/oulgen
ghstack dependencies: #124105, #124059
Previously, `node.meta["nn_module_stack"]` had type `Dict[str, Tuple[str, class]]` when exported, and later `Dict[str, Tuple[str, str]]` after de/serialization. This PR changes it to consistently be `Dict[str, Tuple[str, str]]` for round-trippability, i.e.
```
{..., 'L__self___conv': ('conv', 'torch.nn.modules.conv.Conv2d')}
```
`source_fn_stack` is left untouched in this PR.
note: the `Union[type, str]` type annotations in ONNX are because ONNX goes through both `export.export()` and `_dynamo.export()` (which still has the original `Dict[str, Tuple[str, class]]` format). nn_module_stack from `export.export()` should consistently have the new format, and we verify/test for that in `_trace.py`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/123308
Approved by: https://github.com/zhxchen17, https://github.com/thiagocrepaldi
Fixes#118795
This is a graph breaking partial fix for #120914. We still need -actual- module parametrization tracing support, but at least it doesn't blow up hard now.
**Background**: Module parametrization injects a property as the module parameter attribute that calls a `nn.Module` whose forward takes in a module parameter and returns a reparametrized module parameter.
Example:
```
class MyParametrization(nn.Module):
def forward(X):
# This reparametrization just negates the original parameter value
return -X
m = nn.Linear(...)
p = MyParametrization()
register_parametrization(m, "weight", p)
# Accessing the "weight" attribute will invoke p's forward() on m's original weight and return the output as the new weight.
# m.weight here is now an injected property that does the above instead of an actual Parameter.
# This property is defined in torch/nn/utils/parametrize.py.
m.weight
# NB: Parametrization changes the module type (e.g. torch.nn.utils.parametrize.ParametrizedLinear)
print(type(m))
```
**Problem 1**: Dynamo has special tracing rules for things in `torch.nn`. Parametrizing a module changes the type of the module and the parametrized attribute, so now these rules wrongly affect tracing here. To fix this:
* For parametrized modules, call `convert_to_unspecialized()` to restart analysis where Dynamo starts inlining the module.
**Problem 2**: The issue seen in #118795 is that Dynamo will see a dynamically constructed tensor when `m.weight` is called and introduce that to its `tensor_weakref_to_sizes_strides` cache during fake-ification. This tensor is also made to be a graph input, since it's a module parameter. When guards are created for this module parameter input, the logic calls `m.weight` again and tries to look the result up in the cache, but this is a different tensor now, giving the `KeyError` symptom. To fix this:
* Replace Dynamo's `tensor_weakref_to_sizes_strides` cache with a `input_source_to_sizes_strides` cache.
* This cache was originally introduced in #100128.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121041
Approved by: https://github.com/anijain2305
`unimplemented` is a function that raises an error, so
`raise unimplemented(...)` never reaches the `raise`.
Another related issue is that `raise unimplemented(...) from e`
doesn't attach the exception cause correctly. I fix this by adding
a `from_exc` argument to `unimplemented`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/122136
Approved by: https://github.com/lezcano
Summary:
Seems like `kwargs` is already support in `_infer_argument`, so we don't need the extra assertion `len(kwargs) == 0`.
This optimization ensures compatibility with torch.compile() for LazyModules with kwargs inputs, preventing graph breaks.
Test Plan: Unit tetst and CI
Differential Revision: D53558778
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119445
Approved by: https://github.com/yanboliang
The original motivation for MYPYINDUCTOR was a faster type checking configuration that only checked a subset of files. With the removal of `follow_imports = ignore`, we are now able to use dmypy to do fast incremental typechecking, eliminating the need for this.
Perhaps erroneously, when I tee'ed up this PR I elected to delete the `follow_imports = skip` designations in the mypy-inductor.ini. This lead to a number of extra type error suppressions that I manually edited. You will need to review.
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118432
Approved by: https://github.com/Skylion007
ghstack dependencies: #118414, #118418
After this refactor:
* ```TorchVariable``` definition and all references are removed.
* All ```is_allowed``` references except one are removed.
- The only left one is in ```torch/_dynamo/decorators:_disallow_in_graph_helper```. It was called when users put ```disallow_in_graph``` decorator on a function. Since we use the lists in ```trace_rules``` to decide the function's trace rule, so the decorator would only be used as customer function rather than torch functions. I'll defer this to a separate decorator refactor PR.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/116312
Approved by: https://github.com/jansel