122 Commits

Author SHA1 Message Date
Zhengxu Chen
7feaa73057 [export] Remove deprecated fields from ExportedProgram ctor. (#131697)
Summary: as title.

Test Plan: CI

Reviewed By: SherlockNoMad

Differential Revision: D60078426

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131697
Approved by: https://github.com/ydwu4
2024-07-26 16:19:46 +00:00
Avik Chaudhuri
5b05ad9697 fix non-persistent buffers (#131756)
Summary:
Dynamo doesn't track whether buffers are `persistent`. This led to some ugly code where we would mark buffers as always persistent when creating signatures, then later check whether the buffers were not in the state dict to infer whether they were non-persistent, and use this to fix up the signature.

This PR instead defines a utility to look up all the non-persistent buffers registered inside a module (this information is recorded in a private `_non_persistent_buffers_set` module attribute), and uses it to (a) correctly set the persistent flag on buffers when creating signatures (b) transfer this information to a Dynamo-traced graph module, which then causes non-persistent buffers to (correctly) not show up in the state dict.

Test Plan: existing tests + new case with non-persistent buffer in nested module

Differential Revision: D60224656

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131756
Approved by: https://github.com/zhxchen17, https://github.com/ydwu4
2024-07-26 04:45:30 +00:00
Avik Chaudhuri
83d19620f6 kill tmp _is_executorch flag (#131488)
Test Plan: existing tests

Differential Revision: D60126186

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131488
Approved by: https://github.com/ydwu4
2024-07-24 08:51:37 +00:00
Aaron Orenstein
5a0068cc69 [BE] mypy: disallow untyped decorators (#131428)
Untyped decorators strip the types from their decorated function so even if the underlying function is fully typed then callers to it don't get any benefit from type annotations.

Step 1 - Enable the error and override in all the offending files.

#131429

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131428
Approved by: https://github.com/justinchuby, https://github.com/oulgen
2024-07-23 21:50:55 +00:00
Avik Chaudhuri
94f22eb6b2 refactor post-trace fakification in strict (#131421)
Summary:
Previously it was unclear what `_convert_input_to_fake` actually does (used in strict), and in particular how it is different from `make_fake_inputs` (used in non-strict).

This PR splits that function to work purely on user inputs, then renames it to `extract_fake_inputs` and adds a comment clarifying what it does—namely, it extracts fake inputs from a given graph module instead of "converting inputs to fake inputs" (as suggested by the current name) or "making fake inputs" (as happens in non-strict, where no tracing has taken place yet).

The remainder of that function used to also fakify params and buffers. It turns out that this part is identical to what happens in non-strict, hence we also pull `make_fake_inputs` out from `non_strict_utils` into `_trace`, merge it with another util, and make both modes call it.

Test Plan: existing tests

Differential Revision: D60084442

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131421
Approved by: https://github.com/zhxchen17
2024-07-23 18:23:03 +00:00
Sherlock Huang
c1ef214046 Print ExportedProgram without color by default (#131399)
Summary:
Without a plugin, a colored ExportedProgram printout is not really readable.

![image](https://github.com/user-attachments/assets/319920a9-bb4b-4ad2-bcac-0c4f76973b11)

Test Plan: CI

Differential Revision: D60074481

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131399
Approved by: https://github.com/angelayi
2024-07-23 16:41:55 +00:00
angelayi
26f7dd286b [export] Allow non-CIA ops to be preserved (#131075)
I feel like the semantics of `run_decompositions(preserve_ops,...)` should be that we should always preserve whatever operator is put into `preserve_ops`, even if it's not CIA?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/131075
Approved by: https://github.com/bdhirsh
2024-07-23 00:41:48 +00:00
PyTorch MergeBot
b9912f31ef Revert "[export] fix zero arg export in training_ir (#130990)"
This reverts commit 50436d5bdb.

Reverted https://github.com/pytorch/pytorch/pull/130990 on behalf of https://github.com/clee2000 due to failing some executorch and torchrec tests internally D60006710 ([comment](https://github.com/pytorch/pytorch/pull/130990#issuecomment-2243395316))
2024-07-22 16:49:25 +00:00
Yidi Wu
50436d5bdb [export] fix zero arg export in training_ir (#130990)
Fixed TrainingIRToRunDecomp failures for test_tensor_attribute_zero_args and also a few retraceability failures, because run_decomposition does a retracing.

**edit:** also remove the eliminate_dead_code() in _unlift because of one onnx test failure:
a constant tensor attr was lifted as a constant_tensor input, but it is not used in the graph after aot_autograd due to a shortcut in its decomposition. This causes the setattr to be removed by eliminate_dead_code, but the graph signature still contains the name of that buffer, which causes an inconsistency between the transformed graph and ep's original signature after _unlift. This seems to have happened a few times: some nodes are accidentally removed and we end up in an inconsistent state.

The alternative to removing it would be: every time we call eliminate_dead_code, we verify the consistency of the graph with (1) the graph before the transformation and (2) all the metadata, but I think this deserves a complete design.
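
A rough sketch of the kind of consistency check described above, assuming the graph and the signature can each be reduced to a list of names. The function and the example names are hypothetical and are not part of this PR.

```python
def find_dangling_signature_names(graph_node_names, signature_names):
    """Return signature entries that no longer correspond to a node in the graph."""
    present = set(graph_node_names)
    return [name for name in signature_names if name not in present]


# Hypothetical state after dead-code elimination dropped an unused constant.
nodes_after_dce = ["x", "add", "output"]
signature = ["c_const_tensor", "x"]

dangling = find_dangling_signature_names(nodes_after_dce, signature)
```

A non-empty `dangling` list would flag the inconsistent state before it propagates further.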

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130990
Approved by: https://github.com/pianpwk
2024-07-20 02:35:13 +00:00
Pian Pawakapan
745324e487 [export] turn on hybrid symints by default (#130775)
Sets `prefer_deferred_runtime_asserts_over_guards=True` for export, so any guards emitted from `SymNode.expect_true` (for example, guards that are implicitly required to be true for an op to succeed) won't lead to constraint violations. Instead these should appear in the graph as runtime asserts, or potentially as replacement expressions for placeholder shapes.

For example, this reshape op should emit s0 * s1 = s2, deferred as a runtime assert.
```
x = torch.randn(4, 8)  # [s0, s1]
y = torch.randn(32)  # [s2]
out = x.reshape(-1) + y
# this emits Eq(s0 * s1, s2), and we represent y's shape as [s0*s1] in the graph.
```

However, other complex guards can still cause export to fail. For instance, guards emitted from `SymNode.guard_bool/guard_size_oblivious` (e.g. explicit if-else conditions in user code or lower-level op implementations hit during tracing) can still raise constraint violations. These can be deferred with `allow_complex_guards_as_runtime_asserts=True`. We don't make this the default yet because, while it makes export more likely to succeed, it results in non-trivial asserts being emitted that often represent specialization to a variant of the op, or checks related to 0/1 specialization.

We also remove forced specializations for export and kill the `_disable_forced_specializations` flag. Any guard we can't express with Dims/DerivedDims is now either handled with hybrid SymInts, or should be resolved by rewriting or deferring.

Follow up:
Currently, `ShapeEnv._set_replacement()` is called for complex equality expressions (e.g. s2 -> s0*s1 in the example above), and the ExportedProgram stores `s0*s1` in the input placeholder. This isn't checked for validity when the program is run, so an option is to avoid replacement and/or runtime assert on equality.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130775
Approved by: https://github.com/avikchaudhuri
2024-07-18 17:40:58 +00:00
Zhengxu Chen
5484c86021 [export] Fully support extension op in serialization/deserialization. (#130851)
Summary: Finishing up the mechanism to "register" certain types of operators to a registry so that the serializer can handle them correctly. This is expected to be used first by executorch.

Test Plan: buck run mode/opt caffe2/test:test_export -- -r test_export_with_extension_op_serialization

Differential Revision: D59825148

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130851
Approved by: https://github.com/angelayi
2024-07-18 16:47:53 +00:00
angelayi
6c2c8ee15b [export] Remove preserved ops from decomp list (#130970)
Fixes https://fb.workplace.com/groups/1075192433118967/permalink/1466016147369925/

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130970
Approved by: https://github.com/bdhirsh
2024-07-18 05:15:22 +00:00
Pian Pawakapan
d96c80649f [export] constants & non-persistent buffers for training IR (#130864)
Summary: Uses original ExportedProgram constants and graph signature to inform decompositions, so that constant tensors and non-persistent buffers are respected for training IR. Removes 7 test failures for training IR.

Test Plan: test_export

Differential Revision: D59820909

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130864
Approved by: https://github.com/angelayi
2024-07-17 18:27:53 +00:00
Pian Pawakapan
18b7633bfb [export] fix kwargs in run_decompositions() for training IR (#130553)
Re-exporting GraphModule expects all inputs to be in args, though not in pytree-flattened format. This avoids failing when we run with a fx.Interpreter subclass in [AOTAutograd tracing](973037be6a/torch/_functorch/_aot_autograd/traced_function_transforms.py (L760-L762)).

Removes 7 test failures for training IR export.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/130553
Approved by: https://github.com/zhxchen17, https://github.com/ydwu4
2024-07-11 22:53:18 +00:00
Zhengxu Chen
726a287271 [export] Expand verifier to be multiple on ExportedProgram (#130364)
Summary: This diff updates the ExportedProgram class in PyTorch to allow for multiple verifiers to be attached to it. This is done by adding a new field to the ExportedProgram schema called "verifiers" which is a list of strings representing the names of the verifiers to be attached to the program. The verifiers are loaded using the "load_verifier" function which is defined in the "torch._export.serde.serialize" module. The "exported_program.dialect" field is also deprecated in favor of the "verifiers" field.

Test Plan: CI

Differential Revision: D59408546

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130364
Approved by: https://github.com/angelayi, https://github.com/ydwu4
2024-07-11 20:34:49 +00:00
Pian Pawakapan
1b3b4c2fb9 [runtime asserts] deduplicate runtime asserts & CSE (#128599) (#130380)
original PR: https://github.com/pytorch/pytorch/pull/128599 (re-created after revert + poisoned diff train)

Summary:
This PR adds deduplication and CSE for runtime asserts. Existing size computation in the graph is CSE'd along with added runtime asserts, and redundant asserts are removed. Shape calls on intermediate tensors are also turned into compute on input sizes if possible, allowing intermediate tensors to be freed earlier. For example:
```
z = torch.cat([x, x], dim=0)  # 2*s0
w = z.repeat(y.shape[0])  # 2*s0*s1
_w = w.shape[0]
# something with _w ...

# turns into ->
s0 = x.shape[0]
s1 = y.shape[0]
_w0 = 2 * s0
_w = _w0 * s1
```
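The CSE step can be modeled, in a very simplified form, as deduplication over `(name, op, args)` triples. This is only a sketch under that assumption: the node encoding and the `cse` helper are hypothetical, it does not operate on real FX graphs, and it ignores commutativity.

```python
def cse(nodes):
    """Common-subexpression elimination over (name, op, args) triples.

    Returns the deduplicated node list and a map from removed names to the
    surviving name that computes the same value.
    """
    seen = {}      # (op, rewritten args) -> surviving node name
    replaced = {}  # removed name -> surviving name
    kept = []
    for name, op, args in nodes:
        # Rewrite arguments that refer to nodes already eliminated.
        args = tuple(replaced.get(a, a) for a in args)
        key = (op, args)
        if key in seen:
            replaced[name] = seen[key]
        else:
            seen[key] = name
            kept.append((name, op, args))
    return kept, replaced


nodes = [
    ("s0", "sym_size", ("x", 0)),
    ("s0_dup", "sym_size", ("x", 0)),  # same computation as s0
    ("w0", "mul", (2, "s0")),
    ("w0_dup", "mul", (2, "s0_dup")),  # identical to w0 once s0_dup is rewritten
]
kept, replaced = cse(nodes)
```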

Additionally, constrain_range calls are deduplicated. Single-symbol bound checks for unbacked symbols (e.g. u0 >= 0, u0 <= 5) and sym_constrain_range.default calls are also removed, since they accumulate range info in the ShapeEnv, and are replaced with two _assert_scalar.default calls that check the min/max bounds. For example:
```
torch.sym_constrain_range_for_size(n, min=2, max=16)
torch.sym_constrain_range(n, min=4, max=20)
torch._check(n >= 0)
torch._check(n >= 3)
torch._check(n <= 14)

# turns into
torch.sym_constrain_range_for_size(n)
torch._check(n >= 4)
torch._check(n <= 14)
```
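The bound deduplication in the example above amounts to keeping the tightest lower and upper bound. A minimal pure-Python sketch of that arithmetic follows; the `(kind, value)` encoding and `merge_bounds` are hypothetical and do not reflect how the pass represents checks.

```python
import math


def merge_bounds(checks):
    """Collapse single-symbol bound checks into the tightest (lower, upper) pair.

    Each check is a (kind, value) tuple where kind is "min" or "max".
    """
    lower, upper = -math.inf, math.inf
    for kind, value in checks:
        if kind == "min":
            lower = max(lower, value)
        elif kind == "max":
            upper = min(upper, value)
        else:
            raise ValueError(f"unknown check kind: {kind!r}")
    return lower, upper


checks = [
    ("min", 2), ("max", 16),  # sym_constrain_range_for_size(n, min=2, max=16)
    ("min", 4), ("max", 20),  # sym_constrain_range(n, min=4, max=20)
    ("min", 0),               # torch._check(n >= 0)
    ("min", 3),               # torch._check(n >= 3)
    ("max", 14),              # torch._check(n <= 14)
]
bounds = merge_bounds(checks)
```

The result corresponds to the two remaining checks, `n >= 4` and `n <= 14`.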

Test Plan:
contbuild & OSS CI, see 940e4477ab

Original Phabricator Test Plan:
Imported from GitHub, without a `Test Plan:` line.

Differential Revision: D59543603

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130380
Approved by: https://github.com/izaitsevfb
2024-07-10 19:23:37 +00:00
PyTorch MergeBot
9c9744c3ac Revert "[runtime asserts] deduplicate runtime asserts & CSE (#128599)"
This reverts commit 940e4477ab.

Reverted https://github.com/pytorch/pytorch/pull/128599 on behalf of https://github.com/izaitsevfb due to breaking internal APS tests, see D59498864 ([comment](https://github.com/pytorch/pytorch/pull/128599#issuecomment-2218724762))
2024-07-09 21:03:49 +00:00
Pian Pawakapan
940e4477ab [runtime asserts] deduplicate runtime asserts & CSE (#128599)
This PR adds deduplication and CSE for runtime asserts. Existing size computation in the graph is CSE'd along with added runtime asserts, and redundant asserts are removed. Shape calls on intermediate tensors are also turned into compute on input sizes if possible, allowing intermediate tensors to be freed earlier. For example:
```
z = torch.cat([x, x], dim=0)  # 2*s0
w = z.repeat(y.shape[0])  # 2*s0*s1
_w = w.shape[0]
# something with _w ...

# turns into ->
s0 = x.shape[0]
s1 = y.shape[0]
_w0 = 2 * s0
_w = _w0 * s1
```

Additionally, constrain_range calls are deduplicated. Single-symbol bound checks for unbacked symbols (e.g. u0 >= 0, u0 <= 5) and sym_constrain_range.default calls are also removed, since they accumulate range info in the ShapeEnv, and are replaced with two _assert_scalar.default calls that check the min/max bounds. For example:
```
torch.sym_constrain_range_for_size(n, min=2, max=16)
torch.sym_constrain_range(n, min=4, max=20)
torch._check(n >= 0)
torch._check(n >= 3)
torch._check(n <= 14)

# turns into
torch.sym_constrain_range_for_size(n)
torch._check(n >= 4)
torch._check(n <= 14)
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128599
Approved by: https://github.com/ezyang
2024-07-07 20:10:14 +00:00
PyTorch MergeBot
963f430d13 Revert "[runtime asserts] deduplicate runtime asserts & CSE (#128599)"
This reverts commit 0267b2ddcb.

Reverted https://github.com/pytorch/pytorch/pull/128599 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it seems to cause a landrace and fails inductor/test_cudagraph_trees in trunk 0267b2ddcb ([comment](https://github.com/pytorch/pytorch/pull/128599#issuecomment-2211690518))
2024-07-06 07:20:05 +00:00
Pian Pawakapan
0267b2ddcb [runtime asserts] deduplicate runtime asserts & CSE (#128599)
This PR adds deduplication and CSE for runtime asserts. Existing size computation in the graph is CSE'd along with added runtime asserts, and redundant asserts are removed. Shape calls on intermediate tensors are also turned into compute on input sizes if possible, allowing intermediate tensors to be freed earlier. For example:
```
z = torch.cat([x, x], dim=0)  # 2*s0
w = z.repeat(y.shape[0])  # 2*s0*s1
_w = w.shape[0]
# something with _w ...

# turns into ->
s0 = x.shape[0]
s1 = y.shape[0]
_w0 = 2 * s0
_w = _w0 * s1
```

Additionally, constrain_range calls are deduplicated. Single-symbol bound checks for unbacked symbols (e.g. u0 >= 0, u0 <= 5) and sym_constrain_range.default calls are also removed, since they accumulate range info in the ShapeEnv, and are replaced with two _assert_scalar.default calls that check the min/max bounds. For example:
```
torch.sym_constrain_range_for_size(n, min=2, max=16)
torch.sym_constrain_range(n, min=4, max=20)
torch._check(n >= 0)
torch._check(n >= 3)
torch._check(n <= 14)

# turns into
torch.sym_constrain_range_for_size(n)
torch._check(n >= 4)
torch._check(n <= 14)
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128599
Approved by: https://github.com/ezyang
2024-07-06 03:44:49 +00:00
Tugsbayasgalan Manlaibaatar
dabaebd339 Make run_decomp work (#129249)
In this PR, we implement the first version of the training_ir.run_decomp functionality. Since we don't return the modified buffers as extra outputs in training IR, our previous strategy of reusing the graph signature won't work. In fact, this run_decomp is more similar to retracing, so I reuse some of the export steps here. After this PR:
export_for_training().run_decomp({}, _preserve_ops=[all 183 ops]) == export_for_predispatch() - autograd_manipulating_ops.

Differential Revision: [D59069090](https://our.internmc.facebook.com/intern/diff/D59069090)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129249
Approved by: https://github.com/zhxchen17
ghstack dependencies: #128077, #129092
2024-06-27 19:16:07 +00:00
Tugsbayasgalan Manlaibaatar
90f6043368 Don't decompose functional composite ops in export inference IR (#128077)
Recently we decided to split export IR into two different IRs (training vs. inference). One major change in the inference IR is that we keep the composite ops that the user specified in the IR. This PR does that by overriding the CompositeImplicitAutograd decomp in the export inference path.

Differential Revision: [D58701607](https://our.internmc.facebook.com/intern/diff/D58701607)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/128077
Approved by: https://github.com/bdhirsh
2024-06-26 23:07:55 +00:00
Pian Pawakapan
d02bba519c [export] match fake mode for _decompose_exported_program() (#129421)
Summary:
_decompose_exported_program() ran into an issue with trace_joint, where trace_joint() produces values with mismatching FakeModes. Adding fake mode context to aot_export_module() so this doesn't happen.

#thanks to tugsbayasgalan for the fix!

Test Plan: test_experimental

Differential Revision: D58977694

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129421
Approved by: https://github.com/tugsbayasgalan, https://github.com/zhxchen17
2024-06-26 05:52:31 +00:00
Zhengxu Chen
65286883d4 [export] reland "experimental joint graph API." (#129081)
Summary: the previous diff got reverted despite CI being green.

Test Plan: CI

Differential Revision: D58790048

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129081
Approved by: https://github.com/tugsbayasgalan
2024-06-20 16:50:53 +00:00
PyTorch MergeBot
df94d57c0a Revert "[export] experimental joint graph API. (#128847)"
This reverts commit 0707811286.

Reverted https://github.com/pytorch/pytorch/pull/128847 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/128847#issuecomment-2179326891))
2024-06-19 19:04:36 +00:00
Zhengxu Chen
0707811286 [export] experimental joint graph API. (#128847)
Summary:
WARNING: This API is highly unstable and will be subject to change in the future.

Add a prototype to "decompose" an ExportedProgram into a joint graph form, so that we can compute the gradients on this graph.

Test Plan: buck test mode/opt caffe2/torch/fb/export:test_experimental

Differential Revision: D55657917

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128847
Approved by: https://github.com/tugsbayasgalan
2024-06-19 16:45:27 +00:00
Zhengxu Chen
be0eec9031 [export] Improve static typing in tracer. (#128552)
Summary: as title.

Test Plan: CI

Differential Revision: D58485487

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128552
Approved by: https://github.com/angelayi
2024-06-14 17:57:37 +00:00
chilli
c486e2ab64 Add coloring to fx graph print out (#128476)
Note: Won't land immediately, at least I'll need to add a color option to the field. But curious if any tests fail.

Old:
<img width="1294" alt="image" src="https://github.com/pytorch/pytorch/assets/6355099/c3a750ed-5e54-4621-b2e4-be5481be15b6">

New:
<img width="1303" alt="image" src="https://github.com/pytorch/pytorch/assets/6355099/3a1f1adc-6f3a-413e-8b87-ee53da9bf4ed">

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128476
Approved by: https://github.com/ezyang
2024-06-13 23:39:04 +00:00
Zhengxu Chen
0444e89931 [export] Remove replace_sym_size_ops_pass (#128443)
Summary: Not needed anymore.

Test Plan: CI

Differential Revision: D58429458

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128443
Approved by: https://github.com/angelayi
2024-06-12 21:03:06 +00:00
Aaron Orenstein
038b927590 Flip default value for mypy disallow_untyped_defs [7/11] (#127844)
See #127836 for details.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127844
Approved by: https://github.com/oulgen
ghstack dependencies: #127842, #127843
2024-06-08 18:49:45 +00:00
Pian Pawakapan
e505132797 [export] track TORCH_DYNAMO_DO_NOT_EMIT_RUNTIME_ASSERTS for export runtime asserts (#127554)
Track TORCH_DYNAMO_DO_NOT_EMIT_RUNTIME_ASSERTS=1 in export so it doesn't omit runtime asserts.

Differential Revision: D57978699

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127554
Approved by: https://github.com/tugsbayasgalan
2024-06-05 04:16:54 +00:00
Tugsbayasgalan (Tugsuu) Manlaibaatar
b5e85b8ecc Add deferred_runtime_assertion pass after run_decompositions (#127305)
Summary: We also want to reinsert the deferred runtime assertion pass after run_decompositions.

Test Plan: CI

Reviewed By: zhxchen17

Differential Revision: D57802237

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127305
Approved by: https://github.com/BoyuanFeng
2024-05-31 05:45:28 +00:00
Aaron Gokaslan
3cb16ebf08 [BE]: Update ruff to 0.4.5 (#126979)
Update ruff to 0.4.5 and address some false negatives that have been found in the newer version.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/126979
Approved by: https://github.com/ezyang
2024-05-24 18:38:35 +00:00
Matthew Hoffman
81277baa0c Remove removed ruff rule TRY200 (#126256)
My TOML linter is complaining that "TRY200" is not acceptable for the `tool.ruff.lint` schema.

From the ruff docs: https://docs.astral.sh/ruff/rules/reraise-no-cause/

> This rule has been removed and its documentation is only available for historical reasons.
>
> This rule is identical to [B904](https://docs.astral.sh/ruff/rules/raise-without-from-inside-except/) which should be used instead.

and we are currently explicitly ignoring B904.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/126256
Approved by: https://github.com/Skylion007
2024-05-17 16:31:05 +00:00
Ze Sheng
51e9bb8783 [Export] Allow ExportedProgram to take empty decomp table (#126142)
**As title.**
Still, `ep.run_decompositions()` will use `core_aten_decompositions()` by default. Cases like `ep.run_decompositions(get_decompositions([]))` will use an empty table and go with [`aot_autograd_decompositions`](04877dc430/torch/_functorch/aot_autograd.py (L456-459)) only.
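
One common way to distinguish "no table given" from "an explicitly empty table" is an `is None` check. The sketch below is a generic illustration and is not claimed to be this PR's code; `DEFAULT_TABLE` and both function names are hypothetical stand-ins.

```python
DEFAULT_TABLE = {"aten.addmm": "decomp_addmm"}  # stand-in for the default table


def resolve_decomp_table(decomp_table=None):
    """Fall back to the default only when no table was passed at all."""
    if decomp_table is None:
        return DEFAULT_TABLE
    return decomp_table  # an empty dict is a valid, explicit choice


def resolve_decomp_table_truthy(decomp_table=None):
    """Pitfall: `or` treats an explicitly empty table as if it were missing."""
    return decomp_table or DEFAULT_TABLE
```

With the first helper, passing `{}` yields an empty table; with the second, the caller silently gets the default back.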

**Motivation**
We didn't have a clean way to pass in an empty decomp table. Since we've made `pre_dispatch` export the default and `ep.run_decompositions` still uses `aot_export_module(..., pre_dispatch=False)`, allowing an empty table would help make blank control easier.

**Testing**
CI
Also looked through all the references in fbcode. The only concern I have is whether we should update [this example](04877dc430/torch/onnx/_internal/exporter.py (L817)) or not.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/126142
Approved by: https://github.com/angelayi
2024-05-16 00:31:23 +00:00
Tugsbayasgalan Manlaibaatar
0e419b9146 Fix graph partitioner and make runtime assertion work with submodules in export (#125793)
Summary: This fix does three things:

1. When we add inputs from the partitioner to the top-level graph module, we insert them in the partitioner's order, which is not guaranteed to be the same as the original graph inputs. This PR fixes that.
2. When we replace autograd ops with HOP, we create new submodules and access their outputs via getitem calls. As a result, previous node names associated with getitem get updated, so the graph differs from the produced graph signature. So I just update the graph signature accordingly.
3. We run the runtime_assertion pass before the autograd HOP pass because otherwise the constraints won't be populated correctly.
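
The first item above, restoring the original input order, can be sketched as a sort keyed on the original position. The helper and the input names are hypothetical; the actual fix works on graph nodes rather than strings.

```python
def order_like_original(partition_inputs, original_inputs):
    """Sort partitioner-provided inputs by their position in the original graph."""
    position = {name: i for i, name in enumerate(original_inputs)}
    return sorted(partition_inputs, key=lambda name: position[name])


original = ["p_weight", "b_running_mean", "x", "y"]
from_partitioner = ["y", "p_weight", "x"]  # arbitrary order
ordered = order_like_original(from_partitioner, original)
```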

Differential Revision: [D57130314](https://our.internmc.facebook.com/intern/diff/D57130314)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/125793
Approved by: https://github.com/zhxchen17
2024-05-09 18:13:46 +00:00
angelayi
8be4c1bc2f [export] Add metadata for nodes insert_deferred_runtime_asserts (#125414)
Fixes [internal error](https://fb.workplace.com/groups/1075192433118967/permalink/1416709435633930/).

The issue is that the asserting nodes added in the `insert_deferred_runtime_assertion` pass do not contain metadata that the ExportedProgram requires the graph to have. One solution to fix this is to retrace the entire module, or another solution is to manually add back this metadata.

This diff implements the latter solution (manually add back the metadata) through hooking into fx.graph's `create_node` function, and adding export-specific metadata for every node that is created. The reason I did this is so that the `insert_deferred_runtime_assertion` does not have to know about what metadata export wants.
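
The hooking idea can be illustrated with a toy graph class. This is a sketch only: `Graph`, `Node`, `attach_metadata`, and the metadata key are hypothetical stand-ins, not the `torch.fx` classes or the metadata export actually requires.

```python
import contextlib


class Node:
    def __init__(self, name):
        self.name = name
        self.meta = {}


class Graph:
    def __init__(self):
        self.nodes = []

    def create_node(self, name):
        node = Node(name)
        self.nodes.append(node)
        return node


@contextlib.contextmanager
def attach_metadata(graph, extra_meta):
    """Temporarily wrap graph.create_node so every new node gets extra_meta."""
    original = graph.create_node

    def create_node_with_meta(*args, **kwargs):
        node = original(*args, **kwargs)
        node.meta.update(extra_meta)
        return node

    graph.create_node = create_node_with_meta
    try:
        yield graph
    finally:
        graph.create_node = original  # restore the unhooked method


graph = Graph()
with attach_metadata(graph, {"stack_trace": "<inserted by pass>"}):
    inside = graph.create_node("assert_scalar")
outside = graph.create_node("add")
```

The pass that creates nodes inside the `with` block needs no knowledge of which metadata is attached.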

Pull Request resolved: https://github.com/pytorch/pytorch/pull/125414
Approved by: https://github.com/zhxchen17, https://github.com/BoyuanFeng
2024-05-07 23:15:21 +00:00
Tugsbayasgalan (Tugsuu) Manlaibaatar
3946fa1c12 Fix bug in get_update_constraint (#125194)
Summary: Title

Test Plan: CI

Differential Revision: D56726321

Pull Request resolved: https://github.com/pytorch/pytorch/pull/125194
Approved by: https://github.com/pianpwk
2024-04-30 18:21:29 +00:00
Pian Pawakapan
93a319a4fc [export] kill _process_constraints() (#123985)
The process for populating range_constraints follows separate methods for non-strict (`make_constraints`), and strict (`_process_constraints`). The strict method is somewhat more convoluted, and the analysis that Dynamo performs for strict is already present as part of the non-strict process in make_constraints (produce_guards(), running the export constraint solver).

This PR kills _process_constraints() and replaces calls with make_constraints, without duplicating the work that Dynamo already does.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/123985
Approved by: https://github.com/avikchaudhuri
2024-04-25 16:58:57 +00:00
Pian Pawakapan
e112792a69 [export] refactor _AddRuntimeAssertionsForInlineConstraintsPass (#124503)
Summary:
The current _AddRuntimeAssertionsForInlineConstraintsPass has 2 known issues caused by its use of torch.fx.Interpreter:
1. SymInt-related ops (e.g. item()) are executed, causing new Unbacked SymInts to appear in the graph during the pass.
2. The graph is reconstructed, and node names/indices can be different from before, causing mismatches with `module_call_graph`, and leading to issues during unflattening.

This refactors the pass to use PassBase instead of _ExportPassBaseDeprecatedDoNotUse, only constructing new nodes for assertions.

Test Plan: This pass is called on all strict-mode export calls with range_constraints, test that behavior remains unchanged.

Differential Revision: D56360137

Pull Request resolved: https://github.com/pytorch/pytorch/pull/124503
Approved by: https://github.com/zhxchen17
2024-04-23 20:07:49 +00:00
Pian Pawakapan
10b9d4d19c [export] handle Dim.lower = 0, 1 for ep.run_decompositions() (#123602)
Summary:
With pre-dispatch export and ep.run_decompositions(), range constraints are updated by looking at ShapeEnv.var_to_range. However, the lower bounds on these may be incorrect: analysis on unspecialized symbols is done with a lower bound of 2, which can mismatch user-specified bounds (which may be 0 or 1).

This updates `_get_updated_range_constraints()` to use the old range constraints if possible.
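
A simplified model of "use the old range constraints if possible", assuming constraints are plain `(min, max)` tuples keyed by symbol name. The helper name and data are hypothetical, and the real function works on ShapeEnv data structures.

```python
def updated_range_constraints(old_constraints, inferred_ranges):
    """Prefer user-specified ranges; fall back to inferred ones for new symbols."""
    result = {}
    for symbol, inferred in inferred_ranges.items():
        result[symbol] = old_constraints.get(symbol, inferred)
    return result


old = {"s0": (1, 10)}                      # user asked for a lower bound of 1
inferred = {"s0": (2, 10), "s1": (2, 64)}  # analysis assumed a lower bound of 2
merged = updated_range_constraints(old, inferred)
```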

Test Plan: Existing pre-dispatch/dynamic shapes test case.

Differential Revision: D55899872

Pull Request resolved: https://github.com/pytorch/pytorch/pull/123602
Approved by: https://github.com/tugsbayasgalan
2024-04-19 21:29:36 +00:00
Tugsbayasgalan Manlaibaatar
dd3cea3291 Fix derived dim bugs in ep.run_decomp (#123326)
Differential Revision: [D55730289](https://our.internmc.facebook.com/intern/diff/D55730289)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/123326
Approved by: https://github.com/avikchaudhuri
2024-04-17 04:00:55 +00:00
Pian Pawakapan
d0ccf599cc [export] Restore original placeholder names (part 2: higher-order-op subgraph naming) (#123587)
Summary:
note: breaking the original diff [D55225818](https://www.internalfb.com/diff/D55225818) into 3 parts (top-level renaming, higher-order-op subgraphs, constant input de/serialization) because of its size.

Stacked PR to restore original names to placeholder nodes, replacing the default names arg0_1, arg1_1, ...

This PR propagates node names to higher-order-op subgraph placeholders, retaining the top-level names and handling naming collisions by suffixing other non-placeholder nodes in the subgraph with an index. This is the same handling as in fx.Graph/fx.Node, but implemented separately as a pass.
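
The collision handling can be sketched as follows, assuming node names are plain strings and the non-placeholder names are unique among themselves. `rename_on_collision` is a hypothetical helper, not the pass itself.

```python
def rename_on_collision(placeholder_names, other_names):
    """Keep placeholder names fixed and suffix clashing non-placeholder names."""
    taken = set(placeholder_names)
    renamed = {}
    for name in other_names:
        candidate, index = name, 0
        while candidate in taken:
            index += 1
            candidate = f"{name}_{index}"
        taken.add(candidate)
        renamed[name] = candidate
    return renamed


renamed = rename_on_collision(["x", "y"], ["x", "mul", "y"])
```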

Since the input schemas of HOO subgraphs are very different, they are enumerated in _name_hoo_subgraph_placeholders(). Currently cond, map_impl, and wrap_with_set_grad_enabled are handled, but other ops can be easily added.

Test Plan: verification checks on placeholder names for all export() calls, unit test in test/export/test_export.py

Differential Revision: D55456749

Pull Request resolved: https://github.com/pytorch/pytorch/pull/123587
Approved by: https://github.com/angelayi
2024-04-11 22:40:46 +00:00
PyTorch MergeBot
cf8139b956 Revert "Fix derived dim bugs in ep.run_decomp (#123326)"
This reverts commit 4322874282.

Reverted https://github.com/pytorch/pytorch/pull/123326 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/123326#issuecomment-2048389042))
2024-04-10 20:35:01 +00:00
Tugsbayasgalan Manlaibaatar
4322874282 Fix derived dim bugs in ep.run_decomp (#123326)
Differential Revision: [D55730289](https://our.internmc.facebook.com/intern/diff/D55730289)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/123326
Approved by: https://github.com/avikchaudhuri
2024-04-10 18:54:03 +00:00
Pian Pawakapan
d7f23f6826 [export] Restore original placeholder names (part 1: top-level renaming) (#122904)
Summary:
This PR restores original names to placeholder nodes, replacing the default names arg0_1, arg1_1, and so on.

User inputs now follow the signature of mod.forward(), for example forward(x, y) produces nodes x, y. If the tensors are nested in dictionaries, lists, tuples, or dataclasses, the names are a concatenation of the path to the tensor, e.g. x = {'a': torch.randn(4), 'b': [torch.randn(4), torch.randn(4)]} produces nodes x_a, x_b_0, x_b_1.

Parameters, buffers, constants, and custom objects follow the FQN of the object, prefixed by "p", "b", "c", and "obj" respectively. For example, self.bar.l0.weight gets you p_bar_l0_weight.
Effect tokens are named token_1, token_2, and so on, since they are not grounded in model inputs or named attributes.
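
The naming rules above can be sketched in pure Python, assuming inputs are nested dicts, lists, and tuples. `flatten_names`, `attribute_name`, and `KIND_PREFIX` are hypothetical helpers written for illustration, and dataclass inputs are not covered.

```python
def flatten_names(prefix, value):
    """Yield a placeholder name for every leaf under a nested input."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from flatten_names(f"{prefix}_{key}", child)
    elif isinstance(value, (list, tuple)):
        for index, child in enumerate(value):
            yield from flatten_names(f"{prefix}_{index}", child)
    else:
        yield prefix  # a leaf (a tensor in the real setting)


KIND_PREFIX = {"parameter": "p", "buffer": "b", "constant": "c", "custom_obj": "obj"}


def attribute_name(kind, fqn):
    """Prefix a fully qualified name and replace dots with underscores."""
    return f"{KIND_PREFIX[kind]}_{fqn.replace('.', '_')}"


leaf = object()  # stands in for torch.randn(4)
user_names = list(flatten_names("x", {"a": leaf, "b": [leaf, leaf]}))
param_name = attribute_name("parameter", "bar.l0.weight")
```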

note: breaking the original diff into 3 parts (top-level renaming, higher-order-op subgraphs, constant input de/serialization) because of its size.

Examples:
```python
# params, buffers, constants, inputs, torch.cond

ExportedProgram:
    class GraphModule(torch.nn.Module):
        def forward(self, p_l0_weight: "f32[4, 4]", p_l0_bias: "f32[4]", c_alpha: "f32[4]", b_beta: "f32[4]", x_0_a: "f32[4, 4]", y: "f32[4, 4]"):
            # No stacktrace found for following nodes
            mul: "f32[4, 4]" = torch.ops.aten.mul.Tensor(x_0_a, x_0_a)
            t: "f32[4, 4]" = torch.ops.aten.t.default(p_l0_weight);  p_l0_weight = None
            addmm: "f32[4, 4]" = torch.ops.aten.addmm.default(p_l0_bias, y, t);  p_l0_bias = y = t = None
            return addmm

# model code

class Bar(torch.nn.Module):
    def forward(self, x):
        return x * x
class Foo(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.bar = Bar()
        self.l0 = torch.nn.Linear(4, 4)
        self.alpha = torch.randn(4)
        self.register_buffer('beta', torch.randn(4))
    def forward(self, x, y):
        x = x[0]['a']
        mul = self.bar(x)
        z1 = self.l0(y)
        return z1

# custom objects, dataclasses, tokens, constant inputs

ExportedProgram:
    class GraphModule(torch.nn.Module):
        def forward(self, token_1: "f32[0]", obj_attr, data_x: "f32[4, 4]", data_y: "f32[4, 4]", mode):
            # No stacktrace found for following nodes
            mul: "f32[4, 4]" = torch.ops.aten.mul.Scalar(data_x, 30);  data_x = None
            div: "f32[4, 4]" = torch.ops.aten.div.Tensor_mode(data_y, 1.0, rounding_mode = 'floor');  data_y = None
            add: "f32[4, 4]" = torch.ops.aten.add.Tensor(mul, div);  mul = div = None
            with_effects = torch._higher_order_ops.effects.with_effects(token_1, torch.ops._TorchScriptTesting.takes_foo.default, obj_attr, add);  token_1 = obj_attr = add = None
            getitem: "f32[0]" = with_effects[0]
            getitem_1: "f32[4, 4]" = with_effects[1];  with_effects = None
            return (getitem, getitem_1)

# model code

class Foo(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.attr = torch.classes._TorchScriptTesting._Foo(10, 20)
    def forward(self, data, a=1.0, mode="floor"):
        x = self.attr.add_tensor(data.x) + torch.div(data.y, a, rounding_mode=mode)
        x = torch.ops._TorchScriptTesting.takes_foo(self.attr, x)
        return x

@dataclass
class DataClass:
    x: Tensor
    y: Tensor
register_dataclass_as_pytree_node(
    DataClass,
    serialized_type_name="test.DataClass"
)

args = (DataClass(x=torch.randn(4, 4), y=torch.randn(4, 4)), )
kwargs = {'mode': 'floor'}
ep = torch.export.export(Foo(), args, kwargs, strict=False)

```

Test Plan: verification checks on placeholder names for all export() calls, unit test in test/export/test_export.py

Differential Revision: D55456418

Pull Request resolved: https://github.com/pytorch/pytorch/pull/122904
Approved by: https://github.com/angelayi, https://github.com/thiagocrepaldi
2024-04-05 18:56:00 +00:00
Tugsbayasgalan Manlaibaatar
1ea6d3a9b4 Fix conv decomp when running to core-aten (#123283)
Differential Revision: [D55709374](https://our.internmc.facebook.com/intern/diff/D55709374)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/123283
Approved by: https://github.com/angelayi
2024-04-04 01:14:09 +00:00
Pian Pawakapan
3f99306452 [export] Remove from_export flag (#122500)
Summary: The flag from_export was incorrectly included in a previous diff (https://www.internalfb.com/diff/D54314379) - it was intended for helping with ExportedProgram verification, but was no longer needed in the final implementation.

Test Plan: No functional change; test/export already covers everything

Differential Revision: D55205857

Pull Request resolved: https://github.com/pytorch/pytorch/pull/122500
Approved by: https://github.com/avikchaudhuri, https://github.com/zhxchen17
2024-03-22 22:55:14 +00:00
Pian Pawakapan
3bd38928ba [export] Improve consistency for nn_module_stack metadata, add checks to _trace.py (#120661)
We would like to improve consistency for nn_module_stack metadata in torch.export.

This PR ensures that all tests in test/export/test_export.py satisfy the following constraints:
- Remove nn_module_stack for all placeholder & output nodes, for all modules and submodules
- Ensure nn_module_stack is present for all other node types for the top-level module (there is still an issue with torch.cond submodules having empty fields)
- Add these checks to _export() in _trace.py (we would add this in the Verifier, but downstream apps construct ExportedPrograms separate from _export(), and metadata may not be maintained there)
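
The first two constraints can be sketched as a standalone check. This is an assumption-laden illustration: `Node` is a stand-in dataclass rather than `torch.fx.Node`, `check_nn_module_stack` is a hypothetical name, and the actual checks in _trace.py may differ (for example in how torch.cond submodules are handled).

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """Stand-in for a graph node; only the fields the check needs."""
    name: str
    op: str  # e.g. "placeholder", "call_function", "output"
    meta: Dict[str, object] = field(default_factory=dict)


def check_nn_module_stack(nodes: List[Node]) -> None:
    """Raise ValueError if a node violates the constraints listed above (hypothetical check)."""
    for node in nodes:
        has_stack = "nn_module_stack" in node.meta
        if node.op in ("placeholder", "output"):
            # Placeholder and output nodes must not carry nn_module_stack.
            if has_stack:
                raise ValueError(f"{node.name}: unexpected nn_module_stack")
        elif not has_stack:
            # Every other node type must carry it.
            raise ValueError(f"{node.name}: missing nn_module_stack")


graph = [
    Node("x", "placeholder"),
    Node("mul", "call_function", {"nn_module_stack": {"bar": "Bar"}}),
    Node("output", "output"),
]
check_nn_module_stack(graph)  # passes silently
```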

Pull Request resolved: https://github.com/pytorch/pytorch/pull/120661
Approved by: https://github.com/avikchaudhuri
2024-03-16 21:44:52 +00:00
angelayi
ef25d83a62 [export] Add serialization support for tokens (#121552)
Differential Revision: [D54906766](https://our.internmc.facebook.com/intern/diff/D54906766)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121552
Approved by: https://github.com/zhxchen17
2024-03-15 16:15:11 +00:00