Summary:
As of https://github.com/pytorch/pytorch/pull/103192, dynamo
supports code that creates OrderedDict instances using kwargs
for the key-value pairs rather than passing a dict literal.
But custom dicts (for example subclasses of OrderedDict) follow
a different codepath so that we can check for conditions such
as a custom `__init__` that need to force a graph break.
This commit allows kwargs for custom dict constructors - if the
args are empty and the class is not also a dataclass (which is
the case that, for example, a
`transformers.modeling_outputs.ModelOutput` instance will wind
up hitting) then treat the kwargs as the key-value pairs.
NOTE: For this to behave 100% correctly, we are relying on
the fact that python dicts behave like ordered dicts so that they
preserve the kwargs' ordering. Technically it is not guaranteed that
future versions of Python will respect this; if that behavior changes
we would need to ensure that dynamo uses OrderedDict for kwargs all
the way down in order to handle special cases like OrderedDict where
the kwargs' ordering does matter.
Test Plan:
```
pytest test/dynamo/test_functions.py
```
I also verified that the new test fails without the changes to
`dicts.py`.
Reviewers: yanboliang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112513
Approved by: https://github.com/yanboliang
`sys.modules` is currently treated as a constant dictionary and any reference to
it will result in guards on the full contents of `sys.modules`. This instead
adds a specialized variable tracker which tries to guard only on the modules
referenced by the code. e.g.
```
sys.modules["operator"].add(x, x)
```
will generate the guard
```
___dict_contains('operator', G['sys'].modules)
```
It does this with special support for `__contains__` `__getitem__` and `.get`
which are probably the most commonly used with `sys.modules`. For anything else
we just fall back to building the dict tracker as normal.
While accessing `sys.modules` may seem unusual, it actually comes up when
inlining the `warnings.catch_warnings` context manager which internally accesses
`sys.modules["warnings"]`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110990
Approved by: https://github.com/ezyang
Fix: #107315
This PR enables dynamo to trace through the `pytree` API by inlining its functions. In
order to do so, a few details of `pytree` had to be changed.
In summary, this PR:
- Introduces `TreeSpecVariable` for representing `TreeSpec` instances
- Specializes `<type>.__bases__` call, returning a `TupleVariable`
- Enables the call to `id` builtin function for every variable that implements
`as_python_constant` method
- Specializes `ConstantVariable.call_method` for its (un)flatten functions
- Implements `UserDefinedObjectVariable.as_python_constant`
- Modifies `pytree` by:
- Make `SUPPORTED_NODES` a map of ids (instead of types) to `NodeDef`
- Removed `functools.wraps` function, since it can't be inlined
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108533
Approved by: https://github.com/ezyang, https://github.com/voznesenskym
ghstack dependencies: #109201
before the PR, for HF's ModelOutput class, we use dicts.py/DataClassVariable with our own implementation on __getItem__, __setAttr__, __setItem__. There is a risk that ModelOutput logic may change since it is a user code
after the PR, we inline __getItem__, __setAttr__, __setItem__ using dicts.py/CustomizedDictVariable so the logic always keep AA
unit test
* python test/dynamo/test_model_output.py -k test_HF_bert_model_output
test on HF benchmark
* python benchmarks/dynamo/huggingface.py -d cuda --inference --accuracy --progress --inductor --print-dataframe-summary 2>&1
* all metric are the same before/after the PR, including pass rate, unique_graphs, graph_breaks, unique_graph_breaks
* before the PR: P790393916
* after the PR: P790368991
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105044
Approved by: https://github.com/jansel
Marks all params/optimizer state as static addresses and a finalizer which cleans up the graph attributes when the optimizer goes out of scope.
**Note: this does not mark grads as static because this will increase memory usage significantly
There are two cases:
1. The upstream graph is cudagraphed - this case will work fine OOTB
2. The upstream graph is not cudagraphed - in this case, there will be a lot of copies introduced from the upstream (to copy the grads) into cudagraphed-owned memory, unless the user explicitly marks the grads as static. If the user does this, this will also require not deallocating the grads in zero_grad() (either the mod or optimizer version) by setting them to zero vs None. There is a PR (https://github.com/pytorch/pytorch/pull/107853) in flight to throw an error if zero_grad attempts to set static grads to None.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107504
Approved by: https://github.com/eellison
Fixes https://github.com/pytorch/pytorch/pull/102577#issuecomment-1650905536
Serializing to json is more stable, and renamed the API:
```
# Takes in a treespec and returns the serialized treespec as a string. Also optionally takes in a protocol version number.
def treespec_dumps(treespec: TreeSpec, protocol: Optional[int] = None) -> str:
# Takes in a serialized treespec and outputs a TreeSpec
def treespec_loads(data: str) -> TreeSpec:
```
If users want to register their own serialization format for a given pytree, they can go through the `_register_treespec_serializer` API which optionally takes in a `getstate` and `setstate` function.
```
_register_treespec_serializer(type_, *, getstate, setstate)
# Takes in the context, and outputs a json-dumpable context
def getstate(context: Context) -> DumpableContext:
# Takes in a json-dumpable context, and reconstructs the original context
def setstate(dumpable_context: DumpableContext) -> Context:
```
We will serialize to the following dataclass, and then json.dump this it to string.
```
class TreeSpec
type: Optional[str] # a string name of the type. null for the case of a LeafSpec
context: Optional[Any] # optional, a json dumpable format of the context
children_specs: List[TreeSpec],
}
```
If no getstate/setstate function is registered, we will by default serialize the context using `json.dumps/loads`. We will also serialize the type through `f"{typ.__module__}.{typ.__name__}"`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106116
Approved by: https://github.com/zou3519
Fixes: #101979
This PR adds support for dictionaries with torch object as keys in dynamo.
The main problem was that, for example, the source built for `d[torch.float]` (`d` being a
dictionary) was `ODictGetItemSource(GlobalSource('d'), index=torch.float)`. When
`Source.name` method was called, we got `odict_getitem(G['d'], torch.float)`. Evaluating
that string raised an error, since `torch` was only available in the global dictionary `G`
as `G["torch"]`.
Instead, this PR builds the source:
`ODictGetItemSource(GlobalSource('d'), index=AttrSource(GlobalSource('torch'), 'float'))`.
The to-be-evaluated string is correctly generated as:
`odict_getitem(G['d'], G['torch'].float)`.
Here's a minimal example that reproduces the error, before this PR:
```python
import torch
d = {
torch.float16: torch.float32,
}
@torch.compile
def f():
return torch.randn(3, dtype=d[torch.float16])
f()
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103158
Approved by: https://github.com/mlazos
It's common to call ```dict()``` or ```collections.OrderedDict()``` inside of ```forward``` function, so we should not graph break.
This pattern has been used in many places including:
* The use case in [torchvision](
928b05cad3/torchvision/models/_utils.py (L66-L73)).
* It causes ~100 model failures(nopython=True) in the 14k github models.
* Also it hits several Meta internal use cases.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95250
Approved by: https://github.com/jansel
Make mutation faster to speed up tracing optimizers, helps with https://github.com/pytorch/torchdynamo/issues/1803
`replace_all` no longer iterates over the entire variable tracker data structure every time a mutation is performed
Each variable tracker internally keeps a set of contained mutable variable trackers, to provide a hint to `replace_all`. This is populated with a call to `apply` from `__post_init__` in the base `VariableTracker`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89170
Approved by: https://github.com/jansel