PR does following
* Turns `inference_mode` to False and `no_grad` for `convert_frame`, if the inference_mode is on globally.
* Turns off inference_mode for fake tensor prop. This ensures that converting from real inference tensor to a fake tensor removes the inference-ness.
* Graph breaks on is_inference and is_inference_mode_enabled.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/149321
Approved by: https://github.com/jansel, https://github.com/zou3519
This allows for each device type to check current devices for Triton compatibility and ensure their Triton backend is present.
This PR replaces the `has_triton()` global method which was previously used for this task, and moves the initial check for each Inductor backend on to their associated `BaseScheduler` subclass. This means that other backends, such as Halide, can also implement their own availability checks.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/139171
Approved by: https://github.com/jansel
Recently I've been experimenting with introducing new APIs to delay compile as a way to reduce compile times while improving the ergonomics of using dynamic shapes. The high level idea is to run the first invocation of compile in eager, save the example inputs, and on the second invocation we can derive the dynamism in the inputs so that we don't need to waste our time doing a compile with static shapes (which is the status quo today with automatic dynamic).
Another benefit of this is most users no longer need to annotate their inputs with mark_dynamic and mark_unbaked calls since we can derive the dynamism on the very first call. Additionally we get dynamic ints out of the box in this new regime.
This PR implements this idea through the set_stance APIs. In particular it introduces a new `eager_then_compile` stance.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/147983
Approved by: https://github.com/williamwen42
This PR introduces the ability to whitelist sources as dynamic. This is particularly useful for large models with graph breaks, as you can keep the dynamism across graph breaks since source names stay consistent. Additionally you can use this to mark ints as dynamic.
NB: I intentionally didn't complicate the interface by supporting specification of per dimension dynamism. There is virtue in keeping true to the standard way of representing sources (eg. L['x']). If we find in practice that we need more more fine grained control, we can explore further affordances at that time.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/147979
Approved by: https://github.com/Mingming-Ding
Earlier, with inline flag we were lifting id-guarded tensors to the inputs to the Fx graph. But this offers no benefit. Main idea behind lifting parameters as inputs was to reuse the compilation units across many instances of the nn-module. However, if we are guarding on the `id`, we are explicitly specializing the compiled artifact to the parameter.
This PR installs the parameters back into the graph. The benefit is removal of all pre-graph bytecode to extract the id-guarded tensors from locals/globals. This increases speedup from 1.67x to 1.75x for an internal model that has large number of optimizer parameters.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/147824
Approved by: https://github.com/jansel
Co-authored-by: Jason Ansel <jansel@meta.com>
fixes#145775
This is the first step in introducing a "strict" mode where we don't silent specialize and don't silent graph break. At a high level when we do mark_unbacked(... strict=True), anytime we specialize an unbacked symint we will explicitly error and tell the user their unbacked dimension was specialized to a single value.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/147333
Approved by: https://github.com/laithsakka
This PR adds support for list subclasses. Among other things are
1) Tracking the mutations on internal vts like `_dict_vt` and `_list_vt` using sources. This helps identify if there was a mutation in the underlying data structures, and we need to reconstruct it.
2) `UserDefinedObjectVariable` now has a new method - `is_modified` which `side_effect` infra relies upon to check mutations in the underlying vts (like `_dict_vt`).
3) `reconstruction` logic ensures that we use `dict.__getitem__` and `list.__getitem__` methods. This is super important because we don't want to call the overridden `__getitem__` methods.
If this PR is hard to review, please let me know. I can break it into several small PRs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/146819
Approved by: https://github.com/StrongerXi, https://github.com/jansel
This reintroduces the change backed out by #145393 and fixes the underlying problem.
Although using a BuiltinVariable was better than nothing when we saw a GenericAlias it had problems if there was a graph break and we had to reconstruct the original python code which BuiltinVariable did as a simple `list` instead of a `list[int]`.
This changes it to use a TypingVariable instead and then teaches TypingVariable how to reconstruct.
Original commit changeset: 77b9193acb23
python test/dynamo/test_repros.py ReproTests.test_graph_break_on_jit_isinstance
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145554
Approved by: https://github.com/anijain2305
ghstack dependencies: #145551, #145552, #145553
Dynamo wasn't handling the new PEP585 type annotations:
```
x = list[Foo]
```
Although this worked in py3.9 this was causing an `unimplemented` (Unexpected type in sourceless builder) in py3.12.
This fixes it to treat them as a BuiltinVariable.
Fixes#145226
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145240
Approved by: https://github.com/anijain2305
In hinsight, we never needed a DICT_SUBCLASS_GUARD_MANAGER, because Dynamo would inline through the overridden keys method. In this PR, we ensure that while creating guards and constructing variable trackers, we get the `d.keys()` value by using `dict.keys(d)`. This ensures that we do not call overridden keys method. Therefore, the C++ guard can use `PyDict_Next` directly to check the guards.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143722
Approved by: https://github.com/jansel
In hinsight, we never needed a DICT_SUBCLASS_GUARD_MANAGER, because Dynamo would inline through the overridden keys method. In this PR, we ensure that while creating guards and constructing variable trackers, we get the `d.keys()` value by using `dict.keys(d)`. This ensures that we do not call overridden keys method. Therefore, the C++ guard can use `PyDict_Next` directly to check the guards.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143722
Approved by: https://github.com/jansel
Resolves issue #140464 by adding an option to not specialize int from nn.Modules (False by default to maintain existing behavior).
Test Plan: `buck2 test mode/opt caffe2/test/dynamo:test_dynamo -- test_modules.py::NNModuleTests::test_nn_module_unspec_int_attr`
Differential Revision: D66837042
Pull Request resolved: https://github.com/pytorch/pytorch/pull/142829
Approved by: https://github.com/ezyang, https://github.com/yanboliang