(draft-export)=

# Draft Export

:::{warning}
This feature is not meant to be used in production and is designed to be
used as a tool for debugging torch.export tracing errors.
:::

Draft-export is a new version of export, designed to consistently produce a
graph (even if there are potential soundness issues) and to generate a report
listing all of the issues export encountered during tracing, along with
additional debugging information. For custom operators that don't have fake
kernels, it will also generate a profile, which you can register to
automatically generate a fake kernel.

Have you ever tried to export a model using {func}`torch.export.export`, only to
encounter a data-dependent issue? You fix it, but then run into a missing fake
kernel problem. And after resolving that, you get hit with another
data-dependent issue. You wonder to yourself: "I wish there was a way I could
just get a graph to play around with, and view all the issues in one place so
that I can fix them later…"

`draft_export` to the rescue!

`draft_export` is a version of export which will always successfully export a
graph, even if there are potential soundness issues. These issues are then
compiled into a report for clearer visualization, so that they can be fixed
later on.

## What sort of errors does it catch?

Draft-export helps to catch and debug the following errors:

- Guard on data-dependent errors
- Constraint violation errors (a sketch of one follows this list)
- Missing fake kernels
- Incorrectly written fake kernels

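As a quick illustration of a constraint violation, consider marking a dimension
as dynamic when the model's own arithmetic pins it to a single value. The
module and inputs below are a hypothetical sketch, not part of the example
later in this page:

```python
import torch

class Broadcasts(torch.nn.Module):
    def forward(self, x):
        # Broadcasting against a fixed-size tensor forces x.shape[0] == 4.
        return x + torch.ones(4)

dim = torch.export.Dim("dim")
# Asking for a dynamic dimension 0 conflicts with the broadcast above, so
# export raises a constraint violation (dim gets specialized to 4):
# torch.export.export(Broadcasts(), (torch.ones(4),),
#                     dynamic_shapes={"x": {0: dim}})
```
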
## How does it work?

In normal export, we convert the sample inputs into FakeTensors and use them to
record operations and trace the program into a graph. Input tensor shapes that
can change (which are marked through `dynamic_shapes`), or values within
tensors (typically from an `.item()` call), are represented as a symbolic shape
(`SymInt`) instead of a concrete integer. However, some issues may occur while
tracing: we may run into guards that we cannot evaluate, such as checking
whether some item in a tensor is nonnegative (`u0 >= 0`). Since the tracer
doesn't know anything about the value of `u0`, it will throw a data-dependent
error. If the model uses a custom operator but a fake kernel hasn't been
defined for it, we will error with `fake_tensor.UnsupportedOperatorException`
because export doesn't know how to apply it to `FakeTensors`. If a custom
operator has an incorrectly implemented fake kernel, export will silently
produce an incorrect graph that doesn't match the eager behavior.

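For instance, the following hypothetical module (not the example used later in
this page) hits such a guard: the `.item()` call produces an unbacked `SymInt`,
and the branch on it cannot be evaluated while tracing:

```python
import torch

class Branchy(torch.nn.Module):
    def forward(self, x):
        u0 = x.sum().item()  # u0 becomes an unbacked SymInt during tracing
        if u0 >= 0:          # the tracer cannot evaluate this guard
            return x + 1
        return x - 1

# torch.export.export(Branchy(), (torch.randint(0, 10, (3,)),)) raises a
# data-dependent error at the `if` above.
```
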
To fix the above errors, draft-export uses *real tensor tracing* to guide us on
how to proceed while tracing. As we trace the model with fake tensors, for
every operation that happens on a fake tensor, draft-export will also run the
operator on stored real tensors, which come from the example inputs passed to
export. This lets us address each of the above errors: when we reach a guard
that we cannot evaluate, like `u0 >= 0`, we use the stored real tensor values
to evaluate it, and runtime asserts are added into the graph to ensure that the
graph asserts the same guard that we assumed while tracing. If we run into a
custom operator without a fake kernel, we run the operator's normal kernel with
the stored real tensors and return a fake tensor with the same rank but
unbacked shapes. And since we have the real tensor output for every operation,
we compare it with the fake tensor output from the fake kernel; if the fake
kernel is implemented incorrectly, we catch this behavior and generate a more
correct fake kernel.

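The cross-check between fake and real outputs can be pictured with the
following simplified sketch. It uses the internal `FakeTensorMode` directly and
is only an illustration of the idea, not draft-export's actual machinery:

```python
import torch
from torch._subclasses.fake_tensor import FakeTensorMode

# Run the same op on a real tensor and on its fake counterpart, then compare
# output metadata, mimicking how draft-export cross-checks fake kernels.
real = torch.ones(3, 3)
fake_mode = FakeTensorMode()
fake = fake_mode.from_tensor(real)

real_out = torch.matmul(real, real)
with fake_mode:
    fake_out = torch.matmul(fake, fake)

# A mismatch here would indicate an incorrectly written fake kernel.
assert fake_out.shape == real_out.shape and fake_out.dtype == real_out.dtype
```
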
## How can I use draft export?

Let's say you're trying to export this piece of code:

```python
import torch

class M(torch.nn.Module):
    def forward(self, x, y, z):
        res = torch.ops.mylib.foo2(x, y)

        a = res.item()
        a = -a
        a = a // 3
        a = a + 5

        z = torch.cat([z, z])

        torch._check_is_size(a)
        torch._check(a < z.shape[0])

        return z[:a]

inp = (torch.tensor(3), torch.tensor(4), torch.ones(3, 3))

ep = torch.export.export(M(), inp)
```

This runs into a “missing fake kernel” error for `mylib.foo2`, and then a
`GuardOnDataDependentExpression` error because of the slicing of `z` with `a`,
an unbacked symint.

To use draft-export, we can replace the `torch.export.export` call with the
following:

```python
ep = torch.export.draft_export(M(), inp)
```

`ep` is a valid `ExportedProgram` which can now be run and passed to downstream
environments!

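As a quick sanity check (continuing the example above, and assuming
`mylib.foo2` is registered), the returned program can be run directly on the
example inputs:

```python
# ep.module() returns a runnable module backed by the exported graph.
res = ep.module()(*inp)
```
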
## Debugging with draft-export

In the terminal output from draft-export, you should see the following message:

```
#########################################################################################
WARNING: 2 issue(s) found during export, and it was not able to soundly produce a graph.
To view the report of failures in an html page, please run the command:
`tlparse /tmp/export_angelayi/dedicated_log_torch_trace_axpofwe2.log --export`
Or, you can view the errors in python by inspecting `print(ep._report)`.
########################################################################################
```

Draft-export automatically dumps logs for `tlparse`. You can view the tracing
errors with `print(ep._report)`, or you can pass the logs to `tlparse` to
generate an HTML report.

Running the `tlparse` command in the terminal will generate a
[tlparse](https://github.com/pytorch/tlparse) HTML report. Here is an example
of the `tlparse` report:

```{image} _static/img/export/draft_export_report.png
```

Clicking into the Data Dependent Error, we will see the following page, which
contains information to help debug this error. Specifically, it contains:

- The stacktrace at which this error occurs
- A list of local variables and their shapes
- Information on how this guard was created

```{image} _static/img/export/draft_export_report_dde.png
```

## The returned Exported Program

Because draft-export specializes on code paths based on the example inputs, the
exported program resulting from draft-export is guaranteed to be runnable and
to return correct results for **at least** the given example inputs. Other
inputs can work too, as long as they satisfy the same guards that were taken
while draft-exporting.

For example, if the model branches on whether a value is greater than 5, and in
draft-export our example input was greater than 5, then the returned
`ExportedProgram` will specialize on that branch and assert that the value is
greater than 5. This means that the program will succeed if you pass in another
value greater than 5, but will fail if you pass in a value of 5 or less. This
is more sound than `torch.jit.trace`, which will silently specialize on the
branch. The proper way for `torch.export` to support both branches is to
rewrite the code using `torch.cond`, which will capture both branches.

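As a sketch of that rewrite (a hypothetical module, not from this page's
example), `torch.cond` takes a predicate, two callables, and their operands,
and records both branches in the exported graph:

```python
import torch

class Branching(torch.nn.Module):
    def forward(self, x):
        return torch.cond(
            x.sum() > 5,      # predicate traced into the graph
            lambda x: x * 2,  # branch taken when the predicate is True
            lambda x: x + 2,  # branch taken when the predicate is False
            (x,),
        )

# Both branches appear in the exported graph; no specialization occurs:
ep = torch.export.export(Branching(), (torch.ones(3),))
```
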
Because of the runtime assertions in the graph, the returned exported program
is also retraceable with `torch.export` or `torch.compile`, with a minor
addition in the case where a custom operator is missing a fake kernel.

## Generating Fake Kernels

If a custom operator does not have a fake implementation, draft-export will
currently use real-tensor propagation to get an output for the operator and
continue tracing. However, if we run the exported program with fake tensors or
retrace the exported model, we will still fail, because there is still no fake
kernel implementation for the operator.

To address this, after draft-export we generate an operator profile for each
custom operator call that we encounter, and store these on the report attached
to the exported program: `ep._report.op_profiles`. Users can then use the
context manager `torch._library.fake_profile.unsafe_generate_fake_kernels` to
generate and register a fake implementation based on these operator profiles.
This way, future fake tensor retracing will work.

The workflow would look something like this:

```python
import torch
from torch._library.fake_profile import save_op_profiles, load_op_profiles

class M(torch.nn.Module):
    def forward(self, a, b):
        res = torch.ops.mylib.foo(a, b)  # no fake impl
        return res

ep = torch.export.draft_export(M(), (torch.ones(3, 4), torch.ones(3, 4)))

with torch._library.fake_profile.unsafe_generate_fake_kernels(ep._report.op_profiles):
    decomp = ep.run_decompositions()

new_inp = (
    torch.ones(2, 3, 4),
    torch.ones(2, 3, 4),
)

# Save the profiles to a yaml file and check it into a codebase
save_op_profiles(ep._report.op_profiles, "op_profile.yaml")
# Load the yaml file
loaded_op_profile = load_op_profiles("op_profile.yaml")
```

The operator profile is a dictionary mapping an operator name to a set of
profiles which describe the inputs and outputs of the operator. Profiles could
be manually written, saved into a yaml file, and checked into a codebase.
Here's an example of a profile for `mylib.foo.default`:

```python
"mylib.foo.default": {
    OpProfile(
        args_profile=(
            TensorMetadata(
                rank=2,
                dtype=torch.float32,
                device=torch.device("cpu"),
                layout=torch.strided,
            ),
            TensorMetadata(
                rank=2,
                dtype=torch.float32,
                device=torch.device("cpu"),
                layout=torch.strided,
            ),
        ),
        out_profile=TensorMetadata(
            rank=2,
            dtype=torch.float32,
            device=torch.device("cpu"),
            layout=torch.strided,
        ),
    )
}
```

`mylib.foo.default`'s entry contains only one profile, which says that for 2
input tensors of rank 2, dtype `torch.float32`, device `cpu`, and layout
`torch.strided`, we will return one tensor of rank 2, dtype `torch.float32`,
device `cpu`, and layout `torch.strided`. Using the context manager will then
generate and register a fake kernel which, given 2 input tensors of rank 2 (and
the other tensor metadata), outputs one tensor of rank 2 (and the other tensor
metadata).

If the operator also supports other input ranks, we can add a profile to this
set, either by manually adding one to the existing profiles or by rerunning
draft-export with new inputs to get new profiles, so that the generated fake
kernel will support more input types. Otherwise, the generated kernel will
error when called with inputs that don't match any profile.

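Since profiles can be written by hand, a hand-written fake kernel is another
option. The sketch below assumes the custom op `mylib::foo` has already been
defined and matches the profile above; it uses `torch.library.register_fake`:

```python
import torch

# A hand-written alternative to the generated fake kernel (a sketch).
@torch.library.register_fake("mylib::foo")
def _(a, b):
    # Two rank-2 float32 CPU inputs produce one rank-2 output with the same
    # metadata, mirroring the profile for mylib.foo.default.
    return torch.empty_like(a)
```
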
## Where to go from here?

Now that we have successfully created an `ExportedProgram` with draft-export,
we can use further compilers such as `AOTInductor` to optimize its performance
and produce a runnable artifact, which can then be used for deployment. In
parallel, we can use the report generated by draft-export to identify and fix
the `torch.export` errors that were encountered, so that the original model can
be traced directly with `torch.export`.

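For instance, one possible next step (a sketch that reuses `ep` and `inp` from
the example above, and assumes any custom operators now have fake kernels) is
to compile and package the program with AOTInductor:

```python
import torch
import torch._inductor

# Compile the exported program into a single deployable .pt2 package,
# then load it back and run it.
pkg_path = torch._inductor.aoti_compile_and_package(ep)
compiled = torch._inductor.aoti_load_package(pkg_path)
out = compiled(*inp)
```
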
```{toctree}
:caption: Additional Links
:maxdepth: 1

torch.compiler_fake_tensor
torch.compiler_dynamic_shapes
torch.compiler_aot_inductor
```