This is the next step in supporting dynamic float arguments in PT2: docs.google.com/document/d/1HswUSp9H6mg8Vg27mhRk8YzC9q_uf63b6wz-gwx65BQ/edit?pli=1#heading=h.xvyiqp8tuje6. To make this more incremental and tractable, we've decided to opt the export path out of this first phase of the rollout.
Fixes `python test/export/test_export.py TestExport.test_export_input_mutation_dynamic_shape` when `specialize_float=False`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/139486
Approved by: https://github.com/ezyang
ghstack dependencies: #139451, #139482, #139484
Previously we'd been raising UserErrors when `Dim()` objects and DimHints (`Dim.AUTO`/`Dim.DYNAMIC`) were both specified in `dynamic_shapes`. This PR stops that and uses the `Dim()` objects to guide the DimHints.
The key to this was making the `EqualityConstraint` class happy when it checks that inferred equivalence relations were specified in the original `dynamic_shapes` spec. This PR introduces a `RelaxedConstraint` object to mark the hinted dimensions, so equality checks between `RelaxedConstraint`s and other constraints are treated as valid.
The current behavior is:
```python
class Foo(torch.nn.Module):
    def forward(self, x, y):
        return x - y

inputs = (torch.randn(4, 4), torch.randn(4, 4))
shapes = {
    "x": (Dim.AUTO, Dim("d1", min=3)),
    "y": (Dim("d0", max=8), Dim.DYNAMIC),
}
ep = export(Foo(), inputs, dynamic_shapes=shapes)
```
Here the dimensions marked `AUTO` and `DYNAMIC` pick up a max of 8 and a min of 3, respectively, from the `Dim()` objects they're matched against. Note that inferred equality between a `Dim()` object and `Dim.STATIC` will still raise an error, since `Dim()` means "don't specialize to a constant".
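For contrast, a hedged sketch of the error case just mentioned, building on the `Foo`/`inputs` example above (the exact error type isn't specified in this description):
```python
# Hypothetical variation: x's first dim is pinned static while y's first dim is a Dim().
# Since `x - y` forces the two dims to be equal, the Dim() would have to specialize to a
# constant, which is still expected to raise.
shapes_bad = {
    "x": (Dim.STATIC, None),
    "y": (Dim("d0"), None),
}
# export(Foo(), inputs, dynamic_shapes=shapes_bad)  # expected to raise
```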
Differential Revision: D64636101
Pull Request resolved: https://github.com/pytorch/pytorch/pull/138490
Approved by: https://github.com/avikchaudhuri
This patch adds logging of all frames Dynamo traced during each invocation of a Dynamo-optimized function.
Example:
```python
import torch

@torch.compile
def foo():
    x = torch.ones([10])

    def bar():
        y = x + x
        torch._dynamo.graph_break()
        z = y * x
        return z

    return bar(), bar

foo()
foo()
```
Running `TORCH_LOGS="dynamo" python` on the above dumps the following near the very end.
```
......
I1003 12:18:31.058000 177 torch/_dynamo/eval_frame.py:486] starting from foo /Users/ryanguo99/Documents/work/scratch/test.py:4, torchdynamo attempted to trace the following frames: [
I1003 12:18:31.058000 177 torch/_dynamo/eval_frame.py:486] * foo /Users/ryanguo99/Documents/work/scratch/test.py:4
I1003 12:18:31.058000 177 torch/_dynamo/eval_frame.py:486] * bar /Users/ryanguo99/Documents/work/scratch/test.py:7
I1003 12:18:31.058000 177 torch/_dynamo/eval_frame.py:486] ]
I1003 12:18:31.064000 177 torch/_dynamo/eval_frame.py:486] starting from foo /Users/ryanguo99/Documents/work/scratch/test.py:4, torchdynamo attempted to trace the following frames: []
......
```
Fixes #118262.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/137297
Approved by: https://github.com/williamwen42
For Traceable FSDP2, the most common use case is `fullgraph=False` for the forward pass (to allow user-level graph breaks) and `fullgraph=True` for the compiled autograd backward pass (required for `queue_callback` support).
With `torch._dynamo.compiled_autograd=True`, previously we were not able to set different `fullgraph` config values for the forward vs. backward pass, since `rebuild_ctx` just reuses the forward compile config as-is. This PR adds the `torch._dynamo.config.compiled_autograd_kwargs_override` config to allow forcing `fullgraph=True` for CA Dynamo tracing.
With this PR, we can remove the standalone compiled autograd ctx manager usage in the Traceable FSDP2 unit tests and consolidate on using `torch._dynamo.compiled_autograd=True`.
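A minimal usage sketch of the new config (the config attribute paths below follow the names in this description and should be treated as assumptions rather than the exact public API):
```python
import torch

# Assumed config paths: route the backward pass through compiled autograd...
torch._dynamo.config.compiled_autograd = True
# ...and force fullgraph=True only for the compiled autograd (CA) Dynamo tracing.
torch._dynamo.config.compiled_autograd_kwargs_override = {"fullgraph": True}

model = torch.nn.Linear(8, 8)
compiled_model = torch.compile(model, fullgraph=False)  # forward may still graph-break

loss = compiled_model(torch.randn(4, 8)).sum()
loss.backward()  # backward traced by compiled autograd with fullgraph=True forced
```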
Test commands:
- `pytest -rA test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_transformer_backend_inductor_fullgraph_True`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136967
Approved by: https://github.com/xmfan
Starter version of automatic dynamic shapes for export.
Creates enums `DIM.AUTO`, `DIM.STATIC`, allowing the user to specify `AUTO` for dims in `dynamic_shapes` specs, meaning that the corresponding dims are treated as dynamic, and relevant guards will do whatever is necessary (e.g. refine ValueRanges, set replacements based on equality, or even set dims static) without raising ConstraintViolationErrors. Basically this allows the user to say, "a bunch of these dims can be dynamic; let export do model analysis and return the program with the maximum possible dynamism, without complaining".
The usage for specifying `dynamic_shapes` is now:
```
AUTO -> dynamic by default, return whatever produce_guards() says, even if it's static
None/int/STATIC -> static
Dim/DerivedDim -> same as before - will complain if the min/max range is invalid, or if dims related to this are unspecified.
```
Caveat 1: specifying `AUTO` for a dim won't guarantee it'll be dynamic:
- specifying `AUTO` for a dim will return the maximum possible dynamism given your program and the other specified constraints, but this can still mean you'll get a static program. For example, in the program below, x is specified dynamic, but it's equal to y, which is specified static; with how we currently do things, we won't promote y to dynamic, but will instead demote x to static. So this can be surprising if you don't fully know your model, and/or missed one of your other inputs when specifying auto-dynamic shapes.
```python
class Foo(torch.nn.Module):
    def forward(self, x, y):
        return x + y

inputs = (torch.randn(6), torch.randn(6))
export(Foo(), inputs, dynamic_shapes={"x": (DIM.AUTO,), "y": None})
```
Caveat 2: specifying `AUTO` and Dims in the same spec is still problematic:
- The way Dims/DerivedDims are currently handled is very strict. A Dim represents a symbol, and we require the user to specify that symbol for all dims governed by it - that's why we've seen errors in the past like `The values of x must always be related to y by ...`, asking the user to spell out the exact relation present in the program. We also require the specified min/max range to be a subset of the valid range from model analysis. None of this composes well with specifying `AUTO` just yet - for example, in the program below, ideal behavior could be to return a dynamic program where `dx = x.size(0) = y.size(0)` has range (3,6). Unfortunately this crashes, and the correct usage is to specify `dx` for both inputs. So currently we raise a UserError and crash if both Dims and `AUTO` are present in the spec.
```python
class Foo(torch.nn.Module):
    def forward(self, x, y):
        return x + y

inputs = (torch.randn(6), torch.randn(6))
export(Foo(), inputs, dynamic_shapes={"x": (DIM.AUTO,), "y": {0: Dim("dx", min=3, max=6)}})  # this doesn't work, because x & y are related
```
Implementation details:
This is done by setting `assume_static_by_default=False` and doing a transform on the `dynamic_shapes` spec to preserve semantics. `assume_static_by_default=False` will treat unspecified dims or Nones as dynamic. This is the opposite of what `export.export()` currently does - unspecified Dims/Nones are treated as static. Historically this static-by-default behavior, where the user deals with fewer guards, has been desirable, and we would like to respect that in this implementation. So an internal spec transformation, `_transform_shapes_for_default_dynamic()`, is added to do the spec conversion necessary to be compatible with dynamic-by-default. Specifically, AUTOs are converted into Nones, and Nones/unspecified dims are filled in with explicitly static constraints.
For example, for a 3-d tensor whose middle dim has size 32, this would look like: `{0: DIM.AUTO, 1: None, 2: Dim("dx")} -> {0: None, 1: 32, 2: Dim("dx")}`
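As a rough sketch of that conversion (this is not the actual `_transform_shapes_for_default_dynamic()` implementation; the helper below is made up for illustration):
```python
# Illustrative only: convert a per-tensor spec written against static-by-default
# semantics into one suitable for assume_static_by_default=False.
def _convert_spec_for_dynamic_by_default(spec, tensor):
    converted = {}
    for i in range(tensor.dim()):
        marker = spec.get(i) if spec else None
        if marker is DIM.AUTO:
            converted[i] = None            # None means dynamic under dynamic-by-default
        elif marker is None:
            converted[i] = tensor.size(i)  # pin unspecified dims to their static size
        else:
            converted[i] = marker          # Dim/DerivedDim constraints pass through unchanged
    return converted
```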
This does seem overly complicated, but it's done to preserve dynamic shapes semantics for `torch._dynamo.export()`, which already uses `assume_static_by_default=False` and follows the same process for generating shape constraints, via `_process_dynamic_shapes`. There the semantics are:
```
None/unspecified: dynamic by default
Dim/DerivedDim: also a strict assertion
```
If we don't care about BC for `_dynamo.export(dynamic_shapes)`, then we can just modify semantics for `_process_dynamic_shapes()` and change all the relevant tests in `test/dynamo/test_export.py`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/133620
Approved by: https://github.com/avikchaudhuri
Part of #134054.
This corresponds to the pytorch mypy changes from D61493706. Updating takes so
long and touches so many files that it's impossible to land as a whole without conflicting with some other intermediate change.
So we're landing these 'type: ignore' comments for pytorch in advance of them actually being needed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/134202
Approved by: https://github.com/Skylion007
Summary: `_ConstraintTarget` is an internal data structure that has some redundancy: tensors are identified by their id but also carry a weak reference. The weak reference was probably useful a year back but everything is done with ids right now, and the lifetime of these tensors ensures that using their ids is OK.
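Roughly, the simplified structure described above looks like the following sketch (field names are illustrative assumptions, not necessarily the actual ones):
```python
from dataclasses import dataclass

# Illustrative only: with the weak tensor reference dropped, a constraint target is
# identified purely by the tensor's id plus the dimension it constrains.
@dataclass
class _ConstraintTarget:
    t_id: int  # id() of the input tensor
    dim: int   # which dimension of that tensor the constraint applies to
```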
Test Plan: existing tests
Differential Revision: D61488816
Pull Request resolved: https://github.com/pytorch/pytorch/pull/133890
Approved by: https://github.com/tugsbayasgalan
Sorryyyyy for another refactor. This splits `_process_dynamic_shapes` into 3 parts:
1. `_combine_args` - mostly the same thing
2. `_check_dynamic_shapes`, which is responsible for raising 99% of UserErrors if the dynamic shapes spec is invalid (minus 1 UserError with DerivedDims)
3. `_process_dynamic_shapes`, which, for now, is the same thing minus the stuff in 2.
This refactor is helpful for the incoming automatic dynamic shapes work, because we're switching to `assume_static_by_default=False`, which is what `_dynamo.export` currently does. This means any unspecified dims are allocated a symbol, in contrast to export today, which keeps unspecified dims static. Historically this has been desirable - export users don't want too much dynamism. So we want to change how the spec is translated into constraints.
This means that when we switch over to automatic dynamic shapes, we want to plug in something between steps 2. and 3. which patches up the spec for `assume_static_by_default=False`, filling in static shapes for any unspecified dims and potentially clearing out the auto-dynamic dims (since they're no-ops). We would do this between 2. and 3. to keep `_process_dynamic_shapes` semantically the same, since it's used with `_dynamo.export`.
We could do this without a refactor, plugging in this transform before `_process_dynamic_shapes`, but since that function is responsible for both spec checking and constraint production, moving spec checking to before we transform the specs helps guarantee we're raising errors on what the user specified, not on an internal export bug. A sketch of the intended composition follows.
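Below is a rough sketch of how the split pieces could fit together (the signatures are illustrative assumptions; the real ones live in the export dynamic shapes utilities and may differ):
```python
# Illustrative only; not the real call sites or signatures.
def export_constraints_sketch(mod, args, kwargs, dynamic_shapes):
    combined_args = _combine_args(mod, args, kwargs)

    # Step 2: validate the user-facing spec; nearly all UserErrors are raised here,
    # against exactly what the user wrote.
    _check_dynamic_shapes(combined_args, dynamic_shapes)

    # (future) a transform step slots in here for assume_static_by_default=False:
    # fill in static sizes for unspecified dims, drop auto-dynamic markers.

    # Step 3: turn the (possibly transformed) spec into constraints.
    return _process_dynamic_shapes(combined_args, dynamic_shapes)
```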
Pull Request resolved: https://github.com/pytorch/pytorch/pull/133391
Approved by: https://github.com/avikchaudhuri
Summary:
- moves logging functionality into the `torch/_export/db/logging.py` file.
- add a check in `_dynamo/eval_frame.py` for optional inputs and error out with `UnsupportedError`
- change the case name of `torch_sym_int` to `unsupported_operator`
- check if the case name is registered in exportdb; if so, we give a link to the case in exportdb.
- TODO: add test
Test Plan:
CI
Running the example in https://pytorch.org/docs/main/generated/exportdb/index.html#optional-input gives the following error logging:
```
E0730 10:53:33.687000 4155538 torch/_dynamo/eval_frame.py:1086] Parameter y is optional with a default value of tensor([[-0.1633, 1.2414, -0.1071],
E0730 10:53:33.687000 4155538 torch/_dynamo/eval_frame.py:1086] [-0.1936, -0.9425, -0.0824]])
E0730 10:53:33.688000 4155538 torch/export/_trace.py:1043] See optional_input in exportdb for unsupported case. https://pytorch.org/docs/main/generated/exportdb/index.html#optional-input
......
File "/data/users/shangdiy/fbsource/buck-out/v2/gen/fbcode/389acaeb40d57230/tutorials/pytorch/nntest/__torchtest__/torchtest#link-tree/torch/_dynamo/eval_frame.py", line 1091, in produce_matching
raise Unsupported(
torch._dynamo.exc.Unsupported: Tracing through optional input is not supported yet
```
It also logs a `export.error.classified` event in Scuba.
Reviewed By: zhxchen17
Differential Revision: D60427208
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132420
Approved by: https://github.com/zhxchen17
Summary: The JK disables dynamo by passing None to set_eval_frame.
Test Plan:
Ran buck test mode/opt caffe2/test/dynamo:test_dynamo
Buck UI: https://www.internalfb.com/buck2/1fec33b4-c95a-4bdf-b47b-7c0b8ab9e24a
Test UI: https://www.internalfb.com/intern/testinfra/testrun/2814750010105363
Network: Up: 0B Down: 0B
Jobs completed: 9596. Time elapsed: 28:54.5s.
Tests finished: Pass 4796. Fail 0. Fatal 0. Skip 17. Build failure 0
Also manually wrote a small local test with torch.compile and toggled the code to see if PT2 can be disabled. Validated by running the test and observing the logs.
PT2 enabled: P1486847242. Can see dynamo logs about graph breaks.
PT2 disabled: P1486847727. No dynamo logs. The newly added warning is printed.
Reviewed By: ezyang
Differential Revision: D59968925
Pull Request resolved: https://github.com/pytorch/pytorch/pull/131258
Approved by: https://github.com/c00w
## save&load support for OptimizedModule
[Issue Description](https://github.com/pytorch/pytorch/pull/101651)
English is not my native language; please excuse typing errors.
This PR is based on commit b9588101c4d3411b107fdc860acfa8a72c642f91\
I'll deal with the merge conflicts later
### test result for test/dynamo
Conclusion:\
It performs the same as before as far as I can see.
ENV(CPU only):\
platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.5.0\
configfile: pytest.ini\
plugins: anyio-3.7.1, cpp-2.3.0, flakefinder-1.1.0, xdist-3.3.1, xdoctest-1.1.0, metadata-3.1.1, html-4.1.1, hypothesis-5.35.1, rerunfailures-14.0
#### before this pr:
[before](https://github.com/pytorch/pytorch/files/15329370/before.md)
#### after this pr:
[after](https://github.com/pytorch/pytorch/files/15329376/after.md)
### some changes
1. add test_save_and_load to test/dynamo/test_modules.py with & without "backend='inductor'"
2. add a \_\_reduce\_\_ function to OptimizedModule and to derived classes of _TorchDynamoContext for pickling & unpickling (see the sketch after this list)
3. change the wrappers into wrapper classes ( including convert_frame_assert, convert_frame, catch_errors_wrapper in torch/_dynamo/convert_frame.py & wrap_backend_debug in torch/_dynamo/repro/after_dynamo.py )
4. change self.output.compiler_fn into innermost_fn(self.output.compiler_fn) in torch/_dynamo/symbolic_convert.py to get the original compiler_fn and avoid the "compiler_fn is not eager" condition
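As a rough illustration of item 2, here's a minimal sketch of the \_\_reduce\_\_-based approach (attribute names like `_orig_mod` and `dynamo_ctx` are assumptions about how the wrapper is constructed, not code copied from this PR):
```python
import torch

# Simplified stand-in for torch._dynamo.eval_frame.OptimizedModule, sketching only
# the pickling hook described above.
class OptimizedModule(torch.nn.Module):
    # ... existing wrapper logic elided ...

    def __reduce__(self):
        # Rebuild the wrapper on unpickle from the wrapped module and the Dynamo
        # context; the compiled artifacts themselves are not pickled.
        return (self.__class__, (self._orig_mod, self.dynamo_ctx))
```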
Pull Request resolved: https://github.com/pytorch/pytorch/pull/126374
Approved by: https://github.com/msaroufim, https://github.com/jansel
Use `typing_extensions.deprecated` for deprecation annotation if possible. Otherwise, add `category=FutureWarning` to `warnings.warn("message")` if the category is missing.
Note that only warnings whose messages contain `[Dd]eprecat(ed|ion)` are updated in this PR.
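For illustration (not taken from the diff), the two patterns described above look like this:
```python
import warnings
from typing_extensions import deprecated

# Preferred: annotate the callable itself.
@deprecated("old_api() is deprecated, use new_api() instead", category=FutureWarning)
def old_api():
    ...

# Fallback: make sure an explicit category is passed to warnings.warn.
def legacy_helper():
    warnings.warn(
        "legacy_helper() is deprecated, use new_helper() instead",
        category=FutureWarning,
        stacklevel=2,
    )
```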
Resolves #126888
- #126888
This PR is split from PR #126898.
- #126898
------
Pull Request resolved: https://github.com/pytorch/pytorch/pull/127689
Approved by: https://github.com/Skylion007
With the current state of export's dynamic shapes, we struggle with guards and constraints that are beyond the current dynamic shapes language, expressed with dims and derived dims. While we can compile and guarantee correctness for guards within the current language (e.g. min/max ranges, linear relationships, integer divisibility), we struggle to dynamically compile guards which extend beyond that.
For these "complex" guards, we typically do one of the following: 1) raise a constraint violation error, along the lines of "not all values of <symbol> in the specified range satisfy <guard>", with or without suggested fixes; 2) specialize to the provided static values and suggest removing dynamism; or 3) fail compilation due to some arbitrary unsupported case. Previous [work](https://github.com/pytorch/pytorch/pull/124949) went towards resolving this by disabling forced specializations, instead allowing the user to fail at runtime with incorrect inputs.
In this PR, relying on [hybrid backed-unbacked symints](https://github.com/pytorch/pytorch/issues/121749), [deferred runtime asserts](https://github.com/pytorch/pytorch/blob/main/torch/fx/passes/runtime_assert.py), and the function [_is_supported_equivalence()](d7de4c9d80/torch/fx/experimental/symbolic_shapes.py (L1824)), we add a flag `_allow_complex_guards_as_runtime_asserts` which allows the user to compile exported programs containing these guards and maintain dynamism, while adding correctness checks as runtime assertions in the graph.
Hybrid backed-unbacked symints allow us to easily bypass "implicit" guards emitted from computation - guards that we ~expect to be true. Popular examples revolve around reshapes:
```python
# reshape
def forward(self, x, y):  # x: [s0, s1], y: [s2]
    return x.reshape([-1]) + y  # guard s0 * s1 = s2
```
This leads to the following exported program:
```
class GraphModule(torch.nn.Module):
    def forward(self, x: "f32[s0, s1]", y: "f32[s2]"):
        sym_size_int: "Sym(s2)" = torch.ops.aten.sym_size.int(y, 0)
        mul: "Sym(-s2)" = -1 * sym_size_int; sym_size_int = None
        sym_size_int_1: "Sym(s0)" = torch.ops.aten.sym_size.int(x, 0)
        sym_size_int_2: "Sym(s1)" = torch.ops.aten.sym_size.int(x, 1)
        mul_1: "Sym(s0*s1)" = sym_size_int_1 * sym_size_int_2; sym_size_int_1 = sym_size_int_2 = None
        add: "Sym(s0*s1 - s2)" = mul + mul_1; mul = mul_1 = None
        eq: "Sym(Eq(s0*s1 - s2, 0))" = add == 0; add = None
        _assert_scalar = torch.ops.aten._assert_scalar.default(eq, "Runtime assertion failed for expression Eq(s0*s1 - s2, 0) on node 'eq'"); eq = None
        view: "f32[s0*s1]" = torch.ops.aten.view.default(x, [-1]); x = None
        add_1: "f32[s0*s1]" = torch.ops.aten.add.Tensor(view, y); view = y = None
        return (add_1,)
```
Another case is symbol divisibility:
```python
def forward(self, x):  # x: [s0, s1]
    return x.reshape([-1, x.shape[0] - 1])  # Eq(Mod(s0 * s1, s0 - 1), 0)
```
Applying deferred runtime asserts also helps dynamic compilation for "explicit" complex guards that typically cause problems for export. For example we can generate runtime asserts for not-equal guards, and complex conditions like the following:
```python
class Foo(torch.nn.Module):
    def forward(self, x, y):
        # check that negation of first guard also shows up as runtime assertion
        if x.shape[0] == y.shape[0]:  # False
            return x + y
        elif x.shape[0] == y.shape[0] ** 3:  # False
            return x + 2, y + 3
        elif x.shape[0] ** 2 == y.shape[0] * 3:  # True
            return x * 2.0, y * 3.0
```
For the above graph we will generate 3 runtime assertions: the negation of the first 2, and the 3rd condition as a guard.
One additional benefit here over the current state of exported programs is that this adds further correctness guarantees - previously with explicit complex guards, if compilation succeeded, the guards would be ignored at runtime, treated as given.
As shown above, the runtime asserts appear as math ops in the graph, generated by the sympy interpreter, resulting in an `_assert_scalar` call. There is an option to avoid adding these asserts into the graph by setting `TORCH_DYNAMO_DO_NOT_EMIT_RUNTIME_ASSERTS=1`. This results in the "original" computation graph, with dynamism, and any incorrect inputs will fail on ops during runtime. Further work could go into prettifying the printer, so the majority of the graph isn't guard-related.
Ideally this PR would subsume and remove the recently added [_disable_forced_specializations](https://github.com/pytorch/pytorch/pull/124949) flag, but that flag still handles one additional case of specialization: single-variable equalities where the symbol is solvable for a concrete value; see this [PR](https://github.com/pytorch/pytorch/pull/126925).
This PR doesn't change any behavior around data-dependent errors/unbacked symints yet; that could be further work.
NOTE: will take naming change suggestions for the flag :)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/127129
Approved by: https://github.com/avikchaudhuri