882 Commits

Author SHA1 Message Date
Avik Chaudhuri
0144613e6f move and fix logic to update unbacked bindings (#146115)
Summary:
Previously we were touching up unbacked bindings between Dynamo and AOTAutograd in strict export, but the logic had a bug: if an unbacked symint gets substituted by a backed symint, we would put the backed symint in the unbacked bindings (the check `is_symbol` was not enough here).

This PR fixes this logic, and moreover, moves it into the serializer instead, because we don't need this adjustment outside serde.
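The bug described above can be illustrated with a small sketch. It assumes that unbacked symint names look like `u0` and backed ones like `s0`; the function name `filter_unbacked_bindings` and the plain-string symbols are hypothetical stand-ins, not the serializer's actual code.

```python
import re

# Assumption: unbacked symint names have the form "u<N>", backed ones "s<N>".
_UNBACKED_NAME = re.compile(r"^u\d+$")


def filter_unbacked_bindings(bindings, substitutions):
    """Apply substitutions, then keep only symbols that are still unbacked.

    Checking merely "is this a symbol?" would also keep "s0" below,
    which is the kind of mistake the commit message describes.
    """
    kept = {}
    for name, path in bindings.items():
        new_name = substitutions.get(name, name)
        if _UNBACKED_NAME.match(new_name):
            kept[new_name] = path
    return kept


bindings = {"u0": ("size", 0), "u1": ("size", 1)}
substitutions = {"u1": "s0"}  # u1 was replaced by a backed symint
kept = filter_unbacked_bindings(bindings, substitutions)
```

Here only `u0` survives, because `u1` has been substituted by a backed symbol.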

Test Plan: added test

Differential Revision: D68880766

Pull Request resolved: https://github.com/pytorch/pytorch/pull/146115
Approved by: https://github.com/pianpwk
2025-02-02 10:43:55 +00:00
angelayi
6023684311 [export] Fix symfloat serialization (#146112)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/146112
Approved by: https://github.com/pianpwk
2025-02-01 02:28:44 +00:00
Zhengxu Chen
aad9f44b2e [export] Sync model container types to schema.py (#145959)
Summary: Synced from D68840230

Test Plan: No behavior changes to existing API. Will be tested internally.

Differential Revision: D68846532

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145959
Approved by: https://github.com/yiming0416
2025-01-31 18:17:56 +00:00
Pian Pawakapan
7b07415aaa [export] nested terms in nn_module_stack deserialization (#145901)
Summary: accounting for terms like "getattr(getattr(a[0], b), c)".
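A minimal sketch of why nested terms need care: if a serialized entry is split on commas, a term like the one above must not be split at its inner commas. The entry layout and the helper `split_top_level` are assumptions for illustration, not the deserializer's real format.

```python
def split_top_level(text, sep=","):
    """Split `text` on `sep`, ignoring separators nested inside () or []."""
    parts, current, depth = [], [], 0
    for ch in text:
        if ch in "([":
            depth += 1
        elif ch in ")]":
            depth -= 1
        if ch == sep and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    return parts


# Hypothetical entry: a module path followed by a type name.
entry = "getattr(getattr(a[0], b), c),my.pkg.Layer"
fields = split_top_level(entry)
```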

Test Plan: test_serialize

Differential Revision: D68784736

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145901
Approved by: https://github.com/angelayi
2025-01-31 10:00:13 +00:00
Sherlock Huang
cf2de4e230 Introduce aoti_call_delegate HOP (#145630)
Summary:
Previously, an AOTI-compiled node was represented as a kernel-less custom op in the exported program. The node was not runnable in eager mode, even though running eagerly is a common practice for numerical validation during lowering.

I introduce a new HOP to address this.

The schema is as follows:
```
aoti_call_delegate(lower_module: AOTInductorEPModule, original_gm: fx.GraphModule, weights: List[Tensor], inputs: List[Tensor])
```

The HOP exposes a few problems:
- AOTI expects an FX graph with weights as getattr nodes, i.e. a stateful graph. HOPs expect graph_module arguments to be stateless, and the export serializer also expects a stateless graph. Currently, to make AOTI happy, I am making `original_gm` stateful and bypassing serialization for `original_gm`.
- As a result, the HOP is not re-traceable, as functionalization on a stateful graph module argument will fail.

Test Plan: buck2 test 'fbcode//mode/opt' fbcode//deeplearning/aot_inductor/cpu/test:cpu_lowering_utils_test

Reviewed By: zhxchen17

Differential Revision: D68359391

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145630
Approved by: https://github.com/zou3519
2025-01-31 04:57:36 +00:00
Avik Chaudhuri
1a613c3342 bump counters for unbacked binding names (#145882)
Instead of bumping symint counters when we process unbacked bindings during deserialization, it's better to bump them at the beginning based on what the symbols in the original shape env before serialization were. This allows symbols in unbacked bindings to have "gaps" that bumping alone would not be able to match.

Why is bumping counters important at all? It is because when the shape env coming out of deserialization is used later for propagating symints, say in run_decompositions, we don't want new names to clash with existing names (bad things happen).
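The idea of bumping the counter up front can be sketched as follows. This assumes unbacked symbols are named `u<N>`; `bumped_counter` is a hypothetical helper, not the shape env API.

```python
import itertools
import re


def bumped_counter(existing_names, prefix="u"):
    """Return a counter that starts past every `<prefix><N>` already in use."""
    pattern = re.compile(rf"^{re.escape(prefix)}(\d+)$")
    used = [int(m.group(1)) for name in existing_names if (m := pattern.match(name))]
    start = max(used) + 1 if used else 0
    return itertools.count(start)


# Names from the original shape env; note the "gap" between u0 and u3.
names = ["u0", "u3", "s1"]
counter = bumped_counter(names)
fresh = f"u{next(counter)}"
```

Starting from the maximum index seen, rather than counting symbols as they are processed, means the fresh name cannot collide with `u3` even though `u1` and `u2` never appear.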

Differential Revision: [D68798191](https://our.internmc.facebook.com/intern/diff/D68798191/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145882
Approved by: https://github.com/pianpwk
2025-01-29 17:46:21 +00:00
Colin Peppler
50f834f134 [export] allow bit shift builtin ops (#145802)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145802
Approved by: https://github.com/pianpwk
2025-01-29 03:05:48 +00:00
Pian Pawakapan
15e37e4253 [export] don't always print GM in serdes logging (#145857)
Summary: Didn't realize print_readable() would also print, and not just return a string

Test Plan: .

Differential Revision: D68781525

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145857
Approved by: https://github.com/angelayi, https://github.com/yiming0416
2025-01-29 01:03:02 +00:00
Avik Chaudhuri
45f64e770a relax assertion to warning for unbacked binding names (#145777)
Summary:
Quick fix following up on https://github.com/pytorch/pytorch/pull/144894 to unblock internal tests.

Will keep investigating a more principled fix.

Test Plan: Failures in T213563826 now pass

Differential Revision: D68731710

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145777
Approved by: https://github.com/angelayi
2025-01-28 07:52:40 +00:00
Avik Chaudhuri
42b8e233d9 serde unbacked bindings (#144894)
Adds unbacked bindings during deserialization. These are carried by a node's metadata, and map pending fresh unbacked symbols to paths to such symbols inside the corresponding example value carried by the node's metadata.

Since it is awkward to serialize paths, we only serialize the names of these symbols and reconstruct the paths on deserialization, using a shape env util. We also need to bump counters for unbacked symbols here, because the shape env util we use to create these symbols (when deserializing example values) doesn't do so, and not doing so makes later passes (like `run_decompositions`) crash because new unbacked symbols don't get new names.
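As a toy model of "serialize names, reconstruct paths": given the names of the pending symbols and an example value's shape, the path to each symbol can be recovered by searching the shape. The `("size", index)` path encoding and `reconstruct_bindings` are illustrative assumptions; the real code uses a shape env utility.

```python
def reconstruct_bindings(pending_names, example_shape):
    """Map each pending symbol name to the first place it occurs in the shape."""
    bindings = {}
    for index, dim in enumerate(example_shape):
        if dim in pending_names and dim not in bindings:
            bindings[dim] = ("size", index)
    return bindings


serialized_names = ["u0", "u1"]        # all that was written out
example_shape = ("s0", "u0", "u1")     # shape of the deserialized example value
bindings = reconstruct_bindings(serialized_names, example_shape)
```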

This is enough for non-strict. For strict, the unbacked bindings and example values in node metadata can get out of sync, because of running AOTAutograd as an additional step after Dynamo. So we have to sync those back.

Differential Revision: [D68232274](https://our.internmc.facebook.com/intern/diff/D68232274/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144894
Approved by: https://github.com/pianpwk
2025-01-25 02:34:27 +00:00
Avik Chaudhuri
68a1505985 serde and_ operator (#145506)
Differential Revision: [D68565887](https://our.internmc.facebook.com/intern/diff/D68565887/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145506
Approved by: https://github.com/zhxchen17, https://github.com/Skylion007
2025-01-24 03:48:03 +00:00
Pian Pawakapan
d53f2067fe [BE][export] add "+export" logging to de/serialization (#145283)
adds de/serialization debug logging to `TORCH_LOGS="+dynamic"`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145283
Approved by: https://github.com/ydwu4, https://github.com/angelayi
2025-01-23 19:47:48 +00:00
Aaron Orenstein
97d4d3c40a PEP585 update - torch/_export (#145138)
See #145101 for details.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145138
Approved by: https://github.com/bobrenjc93
ghstack dependencies: #145154
2025-01-19 18:48:35 +00:00
Aaron Orenstein
cd8d0fa20c Tweak schema_check to handle annotated builtin types (#145154)
As of Python 3.9, annotated lists can be written as `list[T]`, and `List[T]` has been deprecated. However, schema_check was converting `list[T]` to simply `list`. This change teaches it to handle `list[T]` the same as `List[T]`.
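One way to treat both spellings uniformly is to go through `typing.get_origin` and `typing.get_args`, which return the same results for `list[T]` and `List[T]`. This is a sketch of the general technique, not schema_check's code; `list_element_type` is a hypothetical helper.

```python
from typing import List, get_args, get_origin


def list_element_type(annotation):
    """Return T for `list[T]` or `List[T]`; return None for anything else."""
    if get_origin(annotation) is list:
        args = get_args(annotation)
        return args[0] if args else None
    return None


builtin_elem = list_element_type(list[int])   # PEP 585 spelling
typing_elem = list_element_type(List[int])    # deprecated typing spelling
bare = list_element_type(list)                # unparameterized list
```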

A couple small drive-by changes I noticed as well:
- Path concatenation should use `os.path.join`, not `+`
- Spelling in error message

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145154
Approved by: https://github.com/bobrenjc93
2025-01-19 18:48:35 +00:00
PyTorch MergeBot
f522502b97 Revert "patch for block-wise quantization + pt2e (#144492)"
This reverts commit 1d43b81508.

Reverted https://github.com/pytorch/pytorch/pull/144492 on behalf of https://github.com/albanD due to Broke a few things in trunk ([comment](https://github.com/pytorch/pytorch/pull/144492#issuecomment-2598485291))
2025-01-17 14:27:53 +00:00
Chen Lai
1d43b81508 patch for block-wise quantization + pt2e (#144492)
Summary: As title, needed to enable the qcom block-wise quantization kernel

Test Plan: local test

Differential Revision: D67985303

Pull Request resolved: https://github.com/pytorch/pytorch/pull/144492
Approved by: https://github.com/angelayi, https://github.com/billmguo
2025-01-17 04:10:49 +00:00
Zhengxu Chen
53256edff9 [export] Support module inputs for non strict mode. (#143925)
Summary:
Add experimental support for torch.nn.Module as input types.

Before this change, we did not support module inputs, but recently we saw some interesting use cases like gpt-fast https://github.com/pytorch-labs/gpt-fast/blob/main/generate.py#L68 where a module is passed in directly as an input for different variants of the same model.

Since we don't really care about non-param or non-buffer state in non-strict mode, we ignore such state on module inputs as well and treat it like plain constants during tracing. We treat any module input like a nested container of tensors, and each time we automatically register a pytree handler for these module types to flatten their state dict into a group of tensors. We inline any module method call during tracing, as we did for the `self` module in export_for_training. This makes input modules behave very similarly to the training module in the typical case, except that we don't record the inputs as parameters or buffers but rather as plain user inputs.

Test Plan: buck run mode/opt caffe2/test:test_export -- -r test_module_input

Differential Revision: D67680827

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143925
Approved by: https://github.com/tugsbayasgalan
2025-01-16 17:30:36 +00:00
Avik Chaudhuri
d812fdd490 fix as_bool serde (#144791)
Differential Revision: [D68167701](https://our.internmc.facebook.com/intern/diff/D68167701/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144791
Approved by: https://github.com/pianpwk
2025-01-15 20:22:26 +00:00
Zhengxu Chen
834086c023 [export] Load side info about pos/kw argument kind for serialization. (#144686)
Summary:
Fixing issue of nodes like
```
torch.ops.aten.linear.default(x, w, b)
```
being deserialized as
```
torch.ops.aten.linear.default(x, w, bias=b)
```
which breaks roundtripping.
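A minimal sketch of how recording the argument kind preserves the call shape across a round trip. `ArgKind`, `SerializedArg`, and the two helper functions are hypothetical names for illustration and are not claimed to match the schema.

```python
from dataclasses import dataclass
from enum import Enum


class ArgKind(Enum):
    POSITIONAL = 1
    KEYWORD = 2


@dataclass
class SerializedArg:
    name: str
    value: object
    kind: ArgKind


def serialize_call(schema_names, args, kwargs):
    """Record every argument together with how it was originally passed."""
    out = [SerializedArg(n, v, ArgKind.POSITIONAL) for n, v in zip(schema_names, args)]
    out += [SerializedArg(n, v, ArgKind.KEYWORD) for n, v in kwargs.items()]
    return out


def deserialize_call(serialized):
    """Rebuild (args, kwargs) using the recorded kind, not the schema default."""
    args = tuple(a.value for a in serialized if a.kind is ArgKind.POSITIONAL)
    kwargs = {a.name: a.value for a in serialized if a.kind is ArgKind.KEYWORD}
    return args, kwargs


names = ["input", "weight", "bias"]
positional = deserialize_call(serialize_call(names, ("x", "w", "b"), {}))
mixed = deserialize_call(serialize_call(names, ("x", "w"), {"bias": "b"}))
```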

Test Plan:
buck test mode/opt caffe2/test:test_export -- -r TestDeserialize
buck test mode/opt caffe2/test:test_export -- -r TestSerialize

Differential Revision: D67991410

Pull Request resolved: https://github.com/pytorch/pytorch/pull/144686
Approved by: https://github.com/angelayi
2025-01-15 19:08:38 +00:00
Aaron Orenstein
d782e46a36 [BE] typing for decorators - library (#138969)
Test Plan: unit tests

Differential Revision: D62302678

Pull Request resolved: https://github.com/pytorch/pytorch/pull/138969
Approved by: https://github.com/zou3519
2025-01-15 17:08:55 +00:00
Yiming Zhou
6d56277682 [export] Fix torchbind constant folding (#144684)
Summary: `CallTorchBind` should not be folded during constant folding

Test Plan:
```
buck2 run mode/dev-nosan sigmoid/inference/test:test_passes -- -r test_const_folding_torchbind
```

Reviewed By: henryoier

Differential Revision: D67721272

Pull Request resolved: https://github.com/pytorch/pytorch/pull/144684
Approved by: https://github.com/zhxchen17
2025-01-14 01:58:44 +00:00
Yiming Zhou
87843ee9ab [export] Unify single and multiple return for hops (#143227)
Summary: Introduce `is_hop_single_tensor_return` field to the `Node` class in serialization so that during deserialization when there is a single return, we know whether it is a tuple of a single element or a single element.
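The ambiguity can be sketched as follows: once outputs are stored as a flat list, a single element and a one-element tuple look the same unless a flag records which it was. The helper names below are hypothetical; only the flag's role comes from the summary above.

```python
def serialize_outputs(result):
    """Return (flat outputs, flag saying the result was a bare single value)."""
    if isinstance(result, tuple):
        return list(result), False
    return [result], True


def deserialize_outputs(outputs, is_single_return):
    """Undo serialize_outputs, using the flag to pick the original form."""
    return outputs[0] if is_single_return else tuple(outputs)


single = deserialize_outputs(*serialize_outputs("t"))
one_tuple = deserialize_outputs(*serialize_outputs(("t",)))
```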

Test Plan:
```
buck2 run @mode/dev-nosan sigmoid/inference/test:e2e_test_cpu -- -r E2ETestCPUCond
buck2 run @mode/dev-nosan sigmoid/inference/test:test_passes -- -r test_const_folding2
```

Differential Revision: D66991624

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143227
Approved by: https://github.com/zhxchen17
2025-01-13 03:31:14 +00:00
angelayi
7a81ba18b9 [export] Add support for serializing symint inputs (#142284)
Fixes https://github.com/pytorch/pytorch/issues/142167
Pull Request resolved: https://github.com/pytorch/pytorch/pull/142284
Approved by: https://github.com/avikchaudhuri
2025-01-10 20:03:26 +00:00
angelayi
10ff6b8894 [export] Add pickle protocol (#142253)
Fixes https://github.com/pytorch/pytorch/issues/142004

Pull Request resolved: https://github.com/pytorch/pytorch/pull/142253
Approved by: https://github.com/avikchaudhuri
2025-01-10 19:49:07 +00:00
Yiming Zhou
d1b64ec326 [export] Fix sym_bool serialization (#144295)
Summary:
When there is a `torch._check()` that checks if a sym_int is equal to some constant, it will generate 3 nodes in the graph with targets `operator.ge`, `operator.le` and `operator.eq`. These operators belong to `_SYM_BOOL_OPS`, but the `meta_val` of these nodes is `bool` instead of `torch.SymBool`.

Similar things can happen to `torch.SymInt`, where a `node.target` belongs to `_SYM_INT_OPS` but `node.meta["val"]` is an `int` instead of `torch.SymInt`.

Therefore, we need to check both `meta_val` type and `node.target` type during serialization.

Test Plan:
```
buck2 run @mode/dev-nosan caffe2/test:test_export -- -r test_sym_bool_torch_check_equal
buck2 run @mode/dev-nosan caffe2/test:test_export -- -r test_sym_int_torch_check_equal
```

Differential Revision: D67883754

Pull Request resolved: https://github.com/pytorch/pytorch/pull/144295
Approved by: https://github.com/avikchaudhuri, https://github.com/angelayi
2025-01-10 02:07:54 +00:00
Avik Chaudhuri
12fdb93ebd fix non-strict placeholder naming with kwargs (#144278)
Fixes https://github.com/pytorch/pytorch/issues/143732

Differential Revision: [D67872055](https://our.internmc.facebook.com/intern/diff/D67872055/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144278
Approved by: https://github.com/yushangdi, https://github.com/pianpwk
2025-01-07 11:22:09 +00:00
Tugsbayasgalan Manlaibaatar
c68c38c673 Support getattr for tensor subclasses in pre-dispatch export via patching tensor.getattr (#143946)
Previous discussion: https://github.com/pytorch/pytorch/pull/143671#issuecomment-2560112499 and https://github.com/pytorch/pytorch/pull/143671

Differential Revision: [D67693609](https://our.internmc.facebook.com/intern/diff/D67693609)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143946
Approved by: https://github.com/bdhirsh
2025-01-06 23:55:50 +00:00
bobrenjc93
d75ffccd0a Migrate from Tuple -> tuple in torch/_export (#144262)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144262
Approved by: https://github.com/avikchaudhuri
2025-01-06 22:20:26 +00:00
bobrenjc93
e9e18a9617 remove allow-untyped-defs from _export/db/logging.py (#144093)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144093
Approved by: https://github.com/Skylion007
2025-01-03 19:36:14 +00:00
bobrenjc93
8506a2af9a remove allow-untyped-defs from _export/pass_infra/proxy_value.py (#143944)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143944
Approved by: https://github.com/aorenste
ghstack dependencies: #143943
2025-01-02 18:17:03 +00:00
Avik Chaudhuri
51eacea8c4 graph module retracing without preserving MCS (#143676)
Retracing while preserving module call signatures used to be a problem because graph modules don't have submodules at given paths. This led to a number of failing retraceability tests. By not trying to wrap modules with export tracepoints we can pass most of these tests; the only exception is where you do module swapping on retraced programs, which is still not possible.

Differential Revision: [D67539304](https://our.internmc.facebook.com/intern/diff/D67539304/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143676
Approved by: https://github.com/zhxchen17, https://github.com/tugsbayasgalan
ghstack dependencies: #143664
2024-12-21 07:57:43 +00:00
Pian Pawakapan
f9f82ca48f [ts converter] use Dim.AUTO for ts -> export converter (#138273)
Switches TS converter to use `Dim.AUTO` by default, exporting models with max dynamism. Adds runtime input tests to `test_converter.py`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/138273
Approved by: https://github.com/avikchaudhuri
2024-12-20 07:48:24 +00:00
Jane Xu
a0cff096bc Improve cond error messaging (#143595)
Discovered by @drisspg and I trying out a simple toy example and being way too confused :')

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143595
Approved by: https://github.com/zou3519, https://github.com/ydwu4
2024-12-20 01:19:20 +00:00
Yidi Wu
1e201422ed [export] add is_exporting flag (#142425)
We added an is_export flag, exposed as torch.compiler.is_exporting. This comes in handy when we need special logic at the user level or the system level (e.g. higher up in the stack).

In increasing-scope:
- `_is_fx_tracing` is set to True when we are under symbolic_trace or make_fx.
- `is_exporting` is set to True when we're doing strict or non-strict export, which internally has a step that calls make_fx and sets _is_fx_tracing to True.
- `is_compiling` is set to True when we're either doing strict, non-strict export or torch.compile.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/142425
Approved by: https://github.com/avikchaudhuri
2024-12-18 21:36:28 +00:00
Shangdi Yu
d8ea4ce631 [reland] Kill capture_pre_autograd_graph API (#143426)
Summary:
Delete the following API:

- capture_pre_autograd_graph()
- capture_pre_autograd_graph_using_training_ir()
- gm_using_training_ir()

Update XLA pin to include https://github.com/pytorch/xla/pull/8398

There are no more call sites of `capture_pre_autograd_graph`, except:
1) two test cases in coreml, guarded by version guard, PR to remove: https://github.com/apple/coremltools/pull/2400
2) a few call sites guarded by version guard (< 2.5.0)

Test Plan: CI

Differential Revision: D67354440

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143426
Approved by: https://github.com/gmagogsfm
2024-12-18 12:07:09 +00:00
Avik Chaudhuri
bceedeec2b fix checking non-trivial input constraints (#143442)
A bunch of auto dynamic shape tests would fail non-strict retraceability because when checking input constraints, we'd compare non-trivial expressions, which would require / affect shape env.
```
... is not tracked with proxy for <torch.fx.experimental.proxy_tensor._ModuleStackTracer object ...
```

I've also observed this bug internally.

This PR does an early check on whether args passed have concrete shapes, and only then proceeds: as before, we
1. try to unify / solve with the arg dim when the corresponding placeholder node dim is symbolic in one symbol
2. check directly if the placeholder node dim is concrete
3. otherwise defer to run time.

Differential Revision: [D67359596](https://our.internmc.facebook.com/intern/diff/D67359596/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143442
Approved by: https://github.com/tugsbayasgalan
2024-12-18 07:29:08 +00:00
Shangdi Yu
c17a07ade3 Add float8 support in serde schema (#143343)
Summary:
Fix https://github.com/pytorch/pytorch/issues/141316

Bump up schema minor version.

as title, add float8 support in serde schema

Test Plan:
```
buck2 run 'fbcode//mode/dev-nosan' fbcode//caffe2/test:test_export -- -r  test_serialize_float8
```

Differential Revision: D67307670

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143343
Approved by: https://github.com/yiming0416
2024-12-18 05:07:21 +00:00
bobrenjc93
17a6d4b882 remove allow-untyped-defs for torch/_export/passes/remove_runtime_assertions.py (#143435)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143435
Approved by: https://github.com/oulgen
2024-12-18 03:05:20 +00:00
Benjamin Glass
37a1b9efcc [export] Serialize all dataclass fields (#142286)
Reverts a change in #121337. All dataclass members must be serialized, including default-valued members, because downstream code often implicitly assumes their presence.

This PR fixes a segfault when running `test_custom_op_all_inputs` from `test/inductor/test_aot_inductor_custom_ops.py`. This segfault was caused by querying for an "index" field for the `Device` type (see `torch/csrc/inductor/aoti_torch/oss_proxy_executor.cpp:136`), which was previously skipped when serializing if the device index was unspecified. A number of other structs which are deserialized in this file also contain optional fields, and presumably could experience the same bug.
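The difference between skipping default-valued fields and writing all fields can be sketched with the standard `dataclasses` module. The `Device` class below is a hypothetical stand-in for the schema type, and `to_dict_skip_defaults` only imitates the old behavior for illustration.

```python
import json
from dataclasses import asdict, dataclass, fields
from typing import Optional


@dataclass
class Device:
    type: str
    index: Optional[int] = None


def to_dict_skip_defaults(obj):
    """Old-style output: drop any field whose value equals its default."""
    return {
        f.name: getattr(obj, f.name)
        for f in fields(obj)
        if getattr(obj, f.name) != f.default
    }


cpu = Device(type="cpu")
sparse = json.loads(json.dumps(to_dict_skip_defaults(cpu)))
full = json.loads(json.dumps(asdict(cpu)))
```

A reader that looks up `"index"` unconditionally fails on `sparse` but works on `full`, where the key is present with a null value.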

Fixes #138955

Fixes #134793
Pull Request resolved: https://github.com/pytorch/pytorch/pull/142286
Approved by: https://github.com/zhxchen17
ghstack dependencies: #142175
2024-12-17 17:21:27 +00:00
PyTorch MergeBot
519d858c31 Revert "Kill capture_pre_autograd_graph API (#143224)"
This reverts commit 4c62275325.

Reverted https://github.com/pytorch/pytorch/pull/143224 on behalf of https://github.com/huydhn due to Sorry for reverting your change but the XLA failure is legit ([comment](https://github.com/pytorch/pytorch/pull/143224#issuecomment-2547264675))
2024-12-17 00:47:24 +00:00
Shangdi Yu
4c62275325 Kill capture_pre_autograd_graph API (#143224)
Summary:
Delete the following API:

- capture_pre_autograd_graph()
- capture_pre_autograd_graph_using_training_ir()
- gm_using_training_ir()

There are no more call sites of `capture_pre_autograd_graph`, except:
1) two test cases in coreml, PR to remove: https://github.com/apple/coremltools/pull/2400
2) XLA: one test case in pytorch/xla, PR to remove: https://github.com/pytorch/xla/pull/8398
3) a few call sites guarded by version guard (< 2.5.0)

Test Plan: CI

Reviewed By: tugsbayasgalan

Differential Revision: D64056353

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143224
Approved by: https://github.com/tugsbayasgalan
2024-12-16 23:06:22 +00:00
Avik Chaudhuri
de484134e4 support slicing with symints in non-strict (#143217)
Differential Revision: [D67215745](https://our.internmc.facebook.com/intern/diff/D67215745/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143217
Approved by: https://github.com/tugsbayasgalan
2024-12-14 10:27:45 +00:00
Zhengxu Chen
fbfc530442 [export][ez] Fix forward D67044185 (#143193)
Summary: Fixing forward D67044185 and T210459833 by adding the missing build file.

Test Plan: buck2 build --flagfile fbcode//mode/opt fbcode//admarket/training_data/augmentation/processors/tests:model_manager_test

Differential Revision: D67200056

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143193
Approved by: https://github.com/tugsbayasgalan
2024-12-13 16:06:42 +00:00
Zhengxu Chen
ee5bceaee6 [sigmoid] Write the new export schema format to archive without breaking compatibility. (#142511)
Summary:
This diff makes it possible to migrate to PyTorch's OSS export schema from sigmoid. Basically, we add a new field called "methods" to ExportedProgram in the Model definition, which contains the thrift schema generated from schema.py in OSS. This way, we can keep writing the old fields while also writing the new format in an equivalent form. Since thrift doesn't support inlining type definitions, we do it manually here, and it shouldn't break on-wire compatibility. As long as every sigmoid user is using sigmoid.frontend.serialization.serialize, the new format is guaranteed to be saved as well.

Eventually we will use JSON deserialization from OSS, so we will only keep this double writing for a couple of months. Eventually, we will migrate every serialization path to the OSS workflow.

Test Plan:
buck test mode/opt sigmoid/frontend:serialization_test
buck test mode/opt sigmoid/frontend/test_gpu:serializer_test

Differential Revision: D67044185

Pull Request resolved: https://github.com/pytorch/pytorch/pull/142511
Approved by: https://github.com/desertfire
2024-12-12 18:41:10 +00:00
Tom Ritchford
dc23f1944a Remove unused Python variables in torch/[_-a]* (#133492)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/133492
Approved by: https://github.com/albanD
2024-12-12 17:39:14 +00:00
PyTorch MergeBot
5c97ac9721 Revert "Remove unused Python variables in torch/[_-a]* (#133492)"
This reverts commit fda975a7b3.

Reverted https://github.com/pytorch/pytorch/pull/133492 on behalf of https://github.com/clee2000 due to Sorry, I need to revert this in order to revert something else.  The only thing you need to do is rebase and remerge ([comment](https://github.com/pytorch/pytorch/pull/133492#issuecomment-2536635516))
2024-12-11 17:29:12 +00:00
Tom Ritchford
fda975a7b3 Remove unused Python variables in torch/[_-a]* (#133492)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/133492
Approved by: https://github.com/albanD
2024-12-10 21:48:44 +00:00
Zhengxu Chen
1986b46d63 [export] Change Tuple[()] to bool in schema to sync with thrift. (#142257)
Summary:
In the thrift schema, we represent every None value as "True/False", while we represent None as () in the OSS schema. This causes some inconsistency between the type systems, and the simplest thing to do here is to change Tuple[()] to bool in the OSS schema.

This change should NOT cause version bump, because on deserializer side we never read the value from as_none fields, as it doesn't have real meaning. Therefore this schema change should be considered as safe.

Test Plan: CI

Reviewed By: SherlockNoMad

Differential Revision: D66888892

Pull Request resolved: https://github.com/pytorch/pytorch/pull/142257
Approved by: https://github.com/yiming0416, https://github.com/hl475
2024-12-10 17:13:35 +00:00
Bin Bao
6680a83e89 [AOTI XPU] Support AOT Inductor for Intel GPU. (#140269)
This PR adds XPU support for AOT Inductor and reuses the corresponding UT.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140269
Approved by: https://github.com/desertfire, https://github.com/EikanWang
ghstack dependencies: #140268

Co-authored-by: Bin Bao <binbao@meta.com>
2024-12-10 05:05:08 +00:00
Fabian Keller
5e8e1d725a Remove some unused type ignores (round 1) (#142325)
Over time, a large number of the existing type ignores have become irrelevant/unused/dead as a result of improvements in annotations and type checking.

Having these `# type: ignore` linger around is not ideal for two reasons:

- They are verbose/ugly syntactically.
- They could hide genuine bugs in the future, if a refactoring would actually introduce a bug but it gets hidden by the ignore.

I'm counting over 1500 unused ignores already. This is a first PR that removes some of them. Note that I haven't touched type ignores that looked "conditional" like the import challenge mentioned in https://github.com/pytorch/pytorch/pull/60006#issuecomment-2480604728. I will address these at a later point, and eventually would enable `warn_unused_ignores = True` in the mypy configuration as discussed in that comment to prevent accumulating more dead ignores going forward.

This PR should have no effect on runtime at all.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/142325
Approved by: https://github.com/Skylion007, https://github.com/janeyx99
2024-12-09 18:23:46 +00:00
PyTorch MergeBot
219e9c83a5 Revert "[AOTI XPU] Support AOT Inductor for Intel GPU. (#140269)"
This reverts commit 854d83133b.

Reverted https://github.com/pytorch/pytorch/pull/140269 on behalf of https://github.com/clee2000 due to breaks forward compatibility?  D66937097 ([comment](https://github.com/pytorch/pytorch/pull/140269#issuecomment-2528828555))
2024-12-09 17:33:28 +00:00
xinan.lin
854d83133b [AOTI XPU] Support AOT Inductor for Intel GPU. (#140269)
This PR adds XPU support for AOT Inductor and reuses the corresponding UT.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140269
Approved by: https://github.com/desertfire, https://github.com/EikanWang
ghstack dependencies: #140268
2024-12-07 19:22:04 +00:00
Bin Bao
660845a1aa [AOTI] Add deprecation warning for torch._export.aot_load (#142212)
Summary: Add deprecation warning for torch._export.aot_load, and encourage users to move to the new torch._inductor.aoti_load_package.
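A generic sketch of the deprecation pattern, using only the standard `warnings` module. The function names `old_load` and `new_load`, the message text, and the choice of `FutureWarning` are all assumptions for illustration; the actual warning category and wording used by the PR are not shown here.

```python
import warnings


def new_load(path):
    """Hypothetical replacement entry point."""
    return f"loaded:{path}"


def old_load(path):
    """Hypothetical deprecated entry point that forwards to the new one."""
    warnings.warn(
        "old_load is deprecated; please use new_load instead.",
        FutureWarning,
        stacklevel=2,  # point the warning at the caller, not at old_load
    )
    return new_load(path)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = old_load("model.pt2")
```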
Pull Request resolved: https://github.com/pytorch/pytorch/pull/142212
Approved by: https://github.com/angelayi
2024-12-06 21:12:34 +00:00
Zhengxu Chen
1a7da6e7e9 [export] Add test to enforce consistency between synced thrift and generated thrift from schema.py (#141989)
Summary:
In this diff we implement a way to ensure the internal thrift schema from cfgr (configerator/structs/caffe2/torch/export/schema.thrift) and the schema in OSS (torch/_export/serde/schema.thrift) are in sync, by adding a unittest to reflect on the type names and fields from each schema and compare them field by field.

When we detect new fields/types from torch/_export/serde/schema.thrift, there'll be a test failure on the trunk and the error message hints people to add the missing field/type to the thrift schema from cfgr, so that they are always in sync in practice.
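The field-by-field comparison can be sketched with dataclasses standing in for the two schemas. This is only an illustration of the reflection idea: the real test compares thrift definitions, and `schema_diff`, `TensorMetaOSS`, and `TensorMetaInternal` are hypothetical names.

```python
from dataclasses import dataclass, fields


def schema_diff(reference, candidate):
    """Map type name -> fields present in `reference` but missing in `candidate`."""
    missing = {}
    for name, ref_cls in reference.items():
        cand_cls = candidate.get(name)
        cand_fields = {f.name for f in fields(cand_cls)} if cand_cls else set()
        gap = [f.name for f in fields(ref_cls) if f.name not in cand_fields]
        if gap:
            missing[name] = gap
    return missing


@dataclass
class TensorMetaOSS:
    dtype: int
    sizes: list
    layout: int  # newly added field


@dataclass
class TensorMetaInternal:
    dtype: int
    sizes: list


diff = schema_diff({"TensorMeta": TensorMetaOSS}, {"TensorMeta": TensorMetaInternal})
```

A non-empty `diff` is what would turn into a test failure whose message names the fields to add.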

Test Plan: buck test mode/opt caffe2/test:test_export -- -r test_thrift_schema_in_sync

Differential Revision: D66716834

Pull Request resolved: https://github.com/pytorch/pytorch/pull/141989
Approved by: https://github.com/yiming0416
2024-12-06 18:42:20 +00:00
bhack
ae9cda0221 Add truediv support in export serializer (#136364)
Fixes #136113

- [x] Initial `truediv` coverage
- [ ] Expand/reduce coverage?
- [x] Add tests
- [x] Re-check docstrings
- [ ] Linting

Pull Request resolved: https://github.com/pytorch/pytorch/pull/136364
Approved by: https://github.com/pianpwk

Co-authored-by: Angela Yi <angelayi@meta.com>
Co-authored-by: Pian Pawakapan <pianpwk@meta.com>
2024-12-05 17:33:33 +00:00
Shangdi Yu
0190d929f2 [BE] Remove unused argument (#141983)
Summary: As title, the `node_filter` argument is not used.

Test Plan: CI

Differential Revision: D66712599

Pull Request resolved: https://github.com/pytorch/pytorch/pull/141983
Approved by: https://github.com/tugsbayasgalan
2024-12-04 00:07:33 +00:00
Fabian Keller
f472b3aee1 improve typings around torch.export (#141829)
This is another follow-up to https://github.com/pytorch/pytorch/pull/115074 / https://github.com/pytorch/pytorch/pull/141240 following the strategy discussed there (https://github.com/pytorch/pytorch/pull/115074#issuecomment-2480992230).

This PR improves the type annotations around `torch._export`. Even though the PR introduces a few runtime type asserts, the runtime behavior should stay equivalent, because the failed assertions should have been immediate crashes anyway.

CC @Skylion007 @ezyang

Pull Request resolved: https://github.com/pytorch/pytorch/pull/141829
Approved by: https://github.com/ezyang
2024-12-03 19:57:21 +00:00
Aaron Gokaslan
08db735629 [BE]: Update mypy to 1.13.0 (#140808)
Update mypy to 1.13.0. Should hopefully reduce linting time. Has support for orjson cache serialization which should improve mypy cache perf if orjson is installed.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140808
Approved by: https://github.com/ezyang, https://github.com/malfet
2024-12-03 02:50:10 +00:00
PyTorch MergeBot
09ce760fef Revert "Add missing data types at torch export serialization (#138561)"
This reverts commit 1ef1b3b391.

Reverted https://github.com/pytorch/pytorch/pull/138561 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/138561#issuecomment-2513343401))
2024-12-03 01:32:50 +00:00
PyTorch MergeBot
daa77f3d9f Revert "[BE]: Update mypy to 1.13.0 (#140808)"
This reverts commit 00134d68af.

Reverted https://github.com/pytorch/pytorch/pull/140808 on behalf of https://github.com/huydhn due to This is failing a distributed test in trunk, target determination missed this test and did not run it on PR ([comment](https://github.com/pytorch/pytorch/pull/140808#issuecomment-2512788426))
2024-12-02 20:47:43 +00:00
Aaron Gokaslan
00134d68af [BE]: Update mypy to 1.13.0 (#140808)
Update mypy to 1.13.0. Should hopefully reduce linting time. Has support for orjson cache serialization which should improve mypy cache perf if orjson is installed.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140808
Approved by: https://github.com/ezyang, https://github.com/malfet
2024-12-02 18:47:54 +00:00
Zhengxu Chen
a8a570512b [export] Generate compatible thrift schema out of schema.py (#141611)
Summary: To make sure schema.py and schema.thrift are kept in sync, we use the int keys from thrift and use Python Annotated type to associate fields between thrift and schema.py. Later we will use this association to build a single source of truth between the schemas.
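A minimal sketch of the association idea described above, assuming integer field keys are attached through `typing.Annotated` metadata. The `TensorMeta` struct, its key values, and the `thrift_field_ids` helper are hypothetical and are not taken from schema.py:

```python
from dataclasses import dataclass, fields
from typing import Annotated, get_type_hints


@dataclass
class TensorMeta:  # hypothetical struct, not the real schema.py definition
    # The integer inside Annotated[...] plays the role of the thrift field key.
    dtype: Annotated[str, 10]
    requires_grad: Annotated[bool, 20]


def thrift_field_ids(cls) -> dict[str, int]:
    """Map each dataclass field name to the integer key attached to its annotation."""
    hints = get_type_hints(cls, include_extras=True)
    return {f.name: hints[f.name].__metadata__[0] for f in fields(cls)}


ids = thrift_field_ids(TensorMeta)
```

A generator could walk such a mapping to emit thrift fields whose keys match the Python definitions.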

Test Plan: CI

Differential Revision: D66253157

Pull Request resolved: https://github.com/pytorch/pytorch/pull/141611
Approved by: https://github.com/yiming0416
2024-11-29 20:09:49 +00:00
yintong-lu
1ef1b3b391 Add missing data types at torch export serialization (#138561)
Related to #131654

Added missing FP8 data types at torch export serialization.
Added test cases of FP8 data types.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/138561
Approved by: https://github.com/jerryzh168, https://github.com/jgong5
2024-11-28 08:35:03 +00:00
PyTorch MergeBot
6e61ff4fd3 Revert "Add truediv support in export serializer (#136364)"
This reverts commit 1df440dc4e.

Reverted https://github.com/pytorch/pytorch/pull/136364 on behalf of https://github.com/huydhn due to Sorry for reverting your change but its doc build failure is legit ([comment](https://github.com/pytorch/pytorch/pull/136364#issuecomment-2502620732))
2024-11-27 03:24:31 +00:00
bhack
1df440dc4e Add truediv support in export serializer (#136364)
Fixes #136113

- [x] Initial `truediv` coverage
- [ ] Expand/reduce coverage?
- [x] Add tests
- [x] Re-check docstrings
- [ ] Linting

Pull Request resolved: https://github.com/pytorch/pytorch/pull/136364
Approved by: https://github.com/pianpwk

Co-authored-by: Angela Yi <angelayi@meta.com>
Co-authored-by: Pian Pawakapan <pianpwk@meta.com>
2024-11-27 00:31:47 +00:00
Tugsbayasgalan Manlaibaatar
11c786dcb5 [BE] Make maybe_aliasing_or_mutating proper tag (#131990)
For better tracking, we need to mark maybe-aliasing/mutating ops with a proper tag. We need to special-case native_batch_norm because it is not a CIA but has a wrong schema. I guess native_batch_norm will be removed at some point, so until then we just keep it around.

D60347117
Pull Request resolved: https://github.com/pytorch/pytorch/pull/131990
Approved by: https://github.com/bdhirsh
2024-11-24 00:12:49 +00:00
angelayi
53df1c11cd [export] Add custom op guards (#141072)
For custom ops that do not have a meta kernel, draft export automatically creates a meta kernel based on the tracing example inputs. To ensure that the assumptions made during tracing are clear to the user, we add assertions into the traced exported program:

An example graph:
```
ExportedProgram:
    class GraphModule(torch.nn.Module):
        def forward(self, a: "f32[s0, s1]", b: "f32[s2, s3]"):
             # File: /data/users/angelayi/pytorch/test/export/test_draft_export.py:172 in forward, code: res1 = torch.ops.mylib.foo4(a, b)
            _assert_tensor_metadata = torch.ops.aten._assert_tensor_metadata(a, dtype = torch.float32, device = device(type='cpu'));  _assert_tensor_metadata = None
            _assert_tensor_metadata_1 = torch.ops.aten._assert_tensor_metadata(b, dtype = torch.float32, device = device(type='cpu'));  _assert_tensor_metadata_1 = None
            foo4: "f32[u2, u3]" = torch.ops.mylib.foo4.default(a, b);  a = b = None
            return (foo4,)
```

Differential Revision: [D66321129](https://our.internmc.facebook.com/intern/diff/D66321129)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/141072
Approved by: https://github.com/pianpwk
ghstack dependencies: #141071
2024-11-22 20:55:04 +00:00
Tugsbayasgalan Manlaibaatar
7c5c38da23 Fix constant lifting pass when there is no user input (#141157)
Differential Revision: [D66253854](https://our.internmc.facebook.com/intern/diff/D66253854/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/141157
Approved by: https://github.com/zhxchen17
2024-11-22 19:08:25 +00:00
Pian Pawakapan
e54538afc8 [export] fix sympy.expr roundtrippability for serialization (#141284)
Summary:
Latest attempt after [136802](https://github.com/pytorch/pytorch/pull/136802) and [140084](https://github.com/pytorch/pytorch/pull/140084) got shelved.

This keeps the string format for `expr_str`, but calls `sympy.printing.repr.srepr(s)` instead of `str(s)`, which prints expressions more explicitly, e.g.
```
((2*x)//(3*y + 4)) -> "FloorDiv(Mul(Integer(2), Symbol('x')), Add(Mul(Integer(3), Symbol('y')), Integer(4)))"
```

This is nice because:
- we have better roundtrippability for deserialization, robust to pretty printing changes like [this](6c9bfd52b6/torch/utils/_sympy/functions.py (L208)) that caused the issue in the first place.
- this preserves the BC surface for both 1) sigmoid thrift serialization, by keeping the string format, and 2) deserialization for old IRs, since `sympy.sympify(...)` still handles the old `str(s)` format.
- more memory efficient than storing ASTs; the [AST attempt](https://github.com/pytorch/pytorch/pull/140084) increased artifact size by 20% on some toy programs.
- doesn't even require a schema version bump.
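The difference between the two string formats can be illustrated with plain sympy. This is only a sketch, not the serializer code: it uses a symbol assumption as the piece of structure that `str()` drops, whereas the issue described above came from pretty-printing changes in custom functions.

```python
import sympy

# A symbol carrying an assumption; str() does not print assumptions.
x = sympy.Symbol("x", positive=True)
expr = 2 * x + 1

as_str = str(expr)            # human-readable form
as_srepr = sympy.srepr(expr)  # explicit constructor form

# Parse both strings back into expressions.
from_str = sympy.sympify(as_str)
from_srepr = sympy.sympify(as_srepr)

# Structural equality against the original expression.
str_roundtrips = from_str == expr
srepr_roundtrips = from_srepr == expr
```

Here only the `srepr` form reconstructs an expression that compares equal to the original, because the parsed `str` form contains a plain `x` without the assumption.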

Additionally to push some test cases over the line, this redoes expression processing (handling ranges, symbol caching) by doing bottom-up processing instead of the current hacky-ish workflow.

Test Plan: test_serdes, test_serialize, internal tests broken by AST PR

Differential Revision: D66283208

Pull Request resolved: https://github.com/pytorch/pytorch/pull/141284
Approved by: https://github.com/zhxchen17
2024-11-22 18:47:04 +00:00
Zhengxu Chen
313dac6c1c [export] Fix name inconsistentcy between thrift and schema.py (#141151)
Summary: The struct type is named "InputToConsantInputSpec" in thrift, which is inconsistent with schema.py. Renaming the type is acceptable because it doesn't change the on-wire format.

Test Plan: CI

Differential Revision: D66240951

Pull Request resolved: https://github.com/pytorch/pytorch/pull/141151
Approved by: https://github.com/yiming0416
2024-11-22 18:04:23 +00:00
Shangdi Yu
0155a112fd [export] avoid name collision when inlining node (#141169)
Summary:
When we have both `set_grad` and `autocast` HOP, name collision might happen when we try to inline a node.

For example, for a GraphModule like this:

```
GraphModule(
  (submod_0): GraphModule(
    (submod_1): GraphModule()
  )
  (submod_1): GraphModule()
  (submod_2): GraphModule()
)

```

when we inline `submod_0`, we might accidentally overwrite `submod_1`.

In this PR, we fix this by checking whether the graph module already has an attribute with the same name; if so, we try the next "submod_{i}" until there is no name collision.
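The renaming strategy can be sketched in isolation as follows. The helper name `next_free_submod_name` and the use of a plain set of attribute names are assumptions for illustration; the actual pass works on GraphModule attributes:

```python
def next_free_submod_name(existing: set[str], start: int = 0) -> str:
    """Return the first 'submod_{i}' (i >= start) that is not already taken."""
    i = start
    while f"submod_{i}" in existing:
        i += 1
    return f"submod_{i}"


# Attribute names already present on the outer module in the example above.
taken = {"submod_0", "submod_1", "submod_2"}

# Inlining the inner submod_1 would collide, so a fresh name is picked instead.
new_name = next_free_submod_name(taken, start=1)
taken.add(new_name)
```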

Partially fixes https://github.com/pytorch/pytorch/issues/140589.

Test Plan:
```
buck2 run 'fbcode//mode/dev-nosan' fbcode//caffe2/test:test_export -- -r  test_predispatch_autocast_and_set_grad
```

Differential Revision: D66200994

Pull Request resolved: https://github.com/pytorch/pytorch/pull/141169
Approved by: https://github.com/angelayi
2024-11-22 01:08:22 +00:00
Edward Z. Yang
612122af8f Fix type-safety of torch.nn.Module instances (#141240)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/141240
Approved by: https://github.com/Skylion007, https://github.com/malfet
2024-11-22 00:05:05 +00:00
PyTorch MergeBot
d3c8f1af8d Revert "[export] serialize sympy.Exprs as ASTs instead of strings (#140084)"
This reverts commit d869344bc0.

Reverted https://github.com/pytorch/pytorch/pull/140084 on behalf of https://github.com/izaitsevfb due to reverted internally in D66253238 ([comment](https://github.com/pytorch/pytorch/pull/140084#issuecomment-2492165667))
2024-11-21 20:09:54 +00:00
Shangdi Yu
5c37b20d13 Fix autocast HOP pass for nested autocast (#141065)
Test Plan:
```
buck2 run 'fbcode//mode/dev-nosan' fbcode//caffe2/test:test_export -- -r "test_predispatch_autocast"
```

Differential Revision: D65970066

@diff-train-skip-merge

Pull Request resolved: https://github.com/pytorch/pytorch/pull/141065
Approved by: https://github.com/angelayi
2024-11-20 21:57:11 +00:00
Aaron Gokaslan
12e95aa4ee [BE]: Apply PERF401 autofixes from ruff (#140980)
* Automatically applies ruff rule 401. Turns loops into equivalent list comprehensions which are faster and do not leak the scope of the loop variables.
* list comprehensions not only often have better typing, but are 50+% faster than for loops on overhead. They also preserve length information etc and are better for the interpreter to optimize.
* Manually went back and made mypy happy after the change.
* Also fixed style lints in files covered by flake8 but not by pyfmt
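As an illustration of the kind of rewrite this rule applies (a toy example, not code from the PR), a loop that only filters and appends is replaced by an equivalent list comprehension:

```python
values = [1, 2, 3, 4, 5, 6]

# Before: the append-in-a-loop pattern that PERF401 flags.
squares_loop = []
for v in values:
    if v % 2 == 0:
        squares_loop.append(v * v)

# After: the equivalent list comprehension.
squares_comp = [n * n for n in values if n % 2 == 0]
```

After the loop, `v` remains bound in the enclosing scope, while the comprehension variable `n` does not.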

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140980
Approved by: https://github.com/justinchuby, https://github.com/malfet
2024-11-20 17:52:07 +00:00
Pian Pawakapan
d869344bc0 [export] serialize sympy.Exprs as ASTs instead of strings (#140084)
Summary: The way we've been de/serializing sympy.Exprs is not roundtrippable in all cases (serialize by calling `str(expr)`, and deserialize by calling `sympy.sympify(expr_str)`). This has led to expressions being mathematically equivalent but structurally different, causing issues in ValueRanges. Example issue: https://github.com/pytorch/pytorch/issues/136797

This starts to deprecate the use of `expr_str` and stores expressions in AST format instead. For BC purposes, `expr_str` deserialization is still supported, but we will always serialize to `expr_ast`. We'll kill this once the serialization upgrader design is finalized and implemented.

Test Plan: test_export

Differential Revision: D65638757

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140084
Approved by: https://github.com/angelayi
2024-11-20 07:44:25 +00:00
Angela Yi
baf756a785 [reland] [aoti] Selectively package AOTI generated files (#140675)
Summary: Reland  https://github.com/pytorch/pytorch/pull/140022

Test Plan: CI

Differential Revision: D65929964

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140675
Approved by: https://github.com/desertfire
2024-11-15 23:48:34 +00:00
Zhengxu Chen
add6bb2e96 [aps] skip version check for export IR. (#140573)
Summary: mitigating potential export compatibility issue for production (temporarily).

Test Plan: CI

Differential Revision: D65890958

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140573
Approved by: https://github.com/desertfire
2024-11-14 17:13:42 +00:00
Zhengxu Chen
3ef2dfc1ba [export] Implement cpp deserializer. (#136398)
Differential Revision: D63206258

This diff introduces a mechanism to generate a json-compatible deserializer in cpp using nlohmann json (already being used by AOTI).

Why do we need this? Because there will be a lot of cases where people don't want to use Python to load the graph (e.g. cpp runtime), and instead they can use this header to deserialize the JSON graph.

Every time we call update_schema.py to update the schema, the header will be auto generated and included into the source files.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136398
Approved by: https://github.com/angelayi
2024-11-14 16:34:59 +00:00
PyTorch MergeBot
b4cc5d38b4 Revert "[aoti] Remove dir after packaging (#140022)"
This reverts commit ba136a78ba.

Reverted https://github.com/pytorch/pytorch/pull/140022 on behalf of https://github.com/angelayi due to sorry I realized I need to land from internal ([comment](https://github.com/pytorch/pytorch/pull/140022#issuecomment-2473814720))
2024-11-13 14:43:15 +00:00
angelayi
ba136a78ba [aoti] Remove dir after packaging (#140022)
Update AOTI to return a list of files that it generates when `aot_inductor.package=True`. Then we will only package the files that are in that list.

This should fix the [caching issue](https://fb.workplace.com/groups/1028545332188949/permalink/1081702043539944/) and hopefully https://github.com/pytorch/pytorch/issues/140053.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140022
Approved by: https://github.com/larryliu0820, https://github.com/desertfire, https://github.com/malfet
2024-11-13 12:17:19 +00:00
zeshengzong
cb71bcc542 Replace clone.detach with detach.clone (#140264)
Fixes #64532

As stated in the issue, replace `clone.detach` with `detach.clone`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140264
Approved by: https://github.com/soulitzer
2024-11-13 07:01:02 +00:00
PyTorch MergeBot
d48ea29b9a Revert "[aoti] Remove dir after packaging (#140022)"
This reverts commit 8c6abe5a8c.

Reverted https://github.com/pytorch/pytorch/pull/140022 on behalf of https://github.com/huydhn due to Sorry for reverting your change but the lint failure is legit ([comment](https://github.com/pytorch/pytorch/pull/140022#issuecomment-2471847439))
2024-11-12 23:35:27 +00:00
angelayi
8c6abe5a8c [aoti] Remove dir after packaging (#140022)
Update AOTI to return a list of files that it generates when `aot_inductor.package=True`. Then we will only package the files that are in that list.

This should fix the [caching issue](https://fb.workplace.com/groups/1028545332188949/permalink/1081702043539944/) and hopefully https://github.com/pytorch/pytorch/issues/140053.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140022
Approved by: https://github.com/larryliu0820, https://github.com/desertfire, https://github.com/malfet
2024-11-12 21:36:24 +00:00
Avik Chaudhuri
9a5175e836 fix shared submodule module call signature (#139438)
Differential Revision: [D65308061](https://our.internmc.facebook.com/intern/diff/D65308061/)

When a shared submodule is called multiple times with different aliases, e.g., `self.a` and `self.b` are both `C()` under the hood and we have calls to both `self.a(...)` and `self.b(...)`, we wrap `C()` to emit as many export tracepoints as there are aliases. This caused us to compute module call signatures that conflated information: we'd add inputs and outputs of one call to inputs and outputs of a different call. Overall preserving module call signatures in the presence of shared submodules was borked because of this bug.

The fix is to pay attention to the nn module stack, which accurately tracks individual calls, thus allowing us to ignore some export tracepoints that get the module correct but not the alias through which the call was made.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/139438
Approved by: https://github.com/zhxchen17
2024-11-12 09:53:40 +00:00
Tugsbayasgalan Manlaibaatar
0af38b1034 Remove temp table to post autograd IR (#140085)
This table is not needed

Differential Revision: [D64553397](https://our.internmc.facebook.com/intern/diff/D64553397/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/140085
Approved by: https://github.com/justinchuby, https://github.com/bdhirsh
2024-11-11 23:59:09 +00:00
Gregory Comer
617b4538f1 Support symbolic builtin round in export (#139549)
Differential Revision: D65380866

Pull Request resolved: https://github.com/pytorch/pytorch/pull/139549
Approved by: https://github.com/digantdesai, https://github.com/angelayi
2024-11-07 02:49:44 +00:00
Henry Tsang
350bc2a166 [export] Add support for symbool to make it usable for torch.cond (#138765)
# Why?

I want the following code to work.

minimal repro:
```
import torch

class M(torch.nn.Module):
    def forward(self, dilate_flag):
        return dilate_flag.item()

input1 = (torch.tensor([1], dtype=torch.bool, device="cuda"),)
model = M().cuda()

ep = torch.export.export(model, input1, strict=True)
path = torch._inductor.aot_compile(ep.module(), input1)
aot_model = torch._export.aot_load(path, device="cuda")
actual_output = aot_model(*input1)
```

error: AssertionError: Encountered an unsupported object of type <class 'torch.SymBool'> while writing the metadata for exported program

second error will be handled by https://github.com/pytorch/pytorch/pull/138760

# Motivation

I could technically bypass it with a torch.int tensor. However, it doesn't work with torch.cond. I want the following to work. It would also require https://github.com/pytorch/pytorch/pull/138760 for aot compile to work.

```
import torch

class M(torch.nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.dilate_flag = 0

    def forward(self, dilate_flag):
        self.dilate_flag = dilate_flag.item()

        def true_fn(dilate_flag):
            return dilate_flag.clone()

        def false_fn(dilate_flag):
            return dilate_flag.clone()

        torch.cond(
            self.dilate_flag,
            true_fn,
            false_fn,
            (dilate_flag,),
        )
        return self.dilate_flag

input1 = (torch.tensor([1], dtype=torch.bool, device="cuda"),)
input2 = (torch.tensor([0], dtype=torch.bool, device="cuda"),)
inputs = (input1, input2)
model = M().cuda()

for input in inputs:
    expected_output = model(*input)

    ep = torch.export.export(model, input, strict=False)
    path = torch._inductor.aot_compile(ep.module(), input)
    aot_model = torch._export.aot_load(path, device="cuda")
    actual_output = aot_model(*input)

    assert (
        expected_output == actual_output
    ), f"henry they are not equal {expected_output} != {actual_output}"
```

Differential Revision: D64867504

Pull Request resolved: https://github.com/pytorch/pytorch/pull/138765
Approved by: https://github.com/ydwu4
2024-11-04 23:31:49 +00:00
Tugsbayasgalan Manlaibaatar
ae0e7042f6 Fix custom obj being input (#139209)
Differential Revision: [D65158939](https://our.internmc.facebook.com/intern/diff/D65158939)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/139209
Approved by: https://github.com/ydwu4
ghstack dependencies: #138658
2024-11-04 18:24:29 +00:00
Tugsbayasgalan Manlaibaatar
e080c89bdc Make test_torchbind.py training IR compatible (#138658)
In this diff, I make the test_torchbind.py tests handle training IR. Today in the training IR, we don't see the effect token and HOP because this happens at the FunctionalTensorMode. Maybe in the future, we should move this logic up to the training IR so that writing passes etc. on training IR is safer. But for migration purposes, I think it is OK for now. I also fixed the following issues:
1. ep.module() doesn't register all aliased constants in the module.
2. When we retrace, we need to fakify the original Torchbind object.
3. We don't run any DCE on training IR so we need to add some more torch ops to verifier.

Differential Revision: [D64853530](https://our.internmc.facebook.com/intern/diff/D64853530)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/138658
Approved by: https://github.com/ydwu4, https://github.com/zhxchen17
2024-11-04 17:43:11 +00:00
Zhengxu Chen
45da80b970 reland D65167805 "[export] Update min_val and max_val to Optional[int] in serialization." (#139394)
Summary:
Had a land race with another diff, D65166035, to fix the schema.

According to export team's discussion, we are upgrading min_val and max_val to optional fields which shouldn't break BC and allows the schema to express infinity.

Test Plan: buck2 test 'fbcode//mode/opt' fbcode//apf/rec/ir/tests:ir_export_deserialize_test

Differential Revision: D65273170

Pull Request resolved: https://github.com/pytorch/pytorch/pull/139394
Approved by: https://github.com/yiming0416
2024-10-31 22:28:32 +00:00
Huy Do
f98bc9a49d Revert D65167805 (#139371)
Summary:
This diff reverts D65167805
broke the release pipeline

Test Plan: NA

Differential Revision: D65245198

@diff-train-skip-merge (to silence facebook-github-bot until I have a stamp to land this)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/139371
Approved by: https://github.com/malfet
2024-10-31 07:25:28 +00:00
cz2h
48854cbfc4 Add missing operator and corresponding unittest (#138309)
Fixes https://github.com/pytorch/pytorch/issues/129690

Add operator.neg and operator.pos into _SYM_BOOL_OPS.

Provide simple unit test under export/test_serialize.py that can reproduce the issue.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/138309
Approved by: https://github.com/ezyang, https://github.com/angelayi
2024-10-30 23:50:24 +00:00
Zhengxu Chen
03ec25053a [export] Update min_val and max_val to Optional[int] in serialization. (#139223)
Summary: According to export team's discussion, we are upgrading min_val and max_val to optional fields which shouldn't break BC and allows the schema to express infinity.
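A rough sketch of how optional bounds can stand in for infinity, assuming that a missing value means "unbounded". The `RangeConstraint` dataclass here is a hypothetical mirror that reuses only the field names `min_val` and `max_val`, and `to_bounds` is an illustrative helper, not part of the serializer:

```python
import json
import math
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class RangeConstraint:  # hypothetical mirror, not the real schema.py class
    min_val: Optional[int] = None  # assumption: None means no lower bound
    max_val: Optional[int] = None  # assumption: None means no upper bound


def to_bounds(rc: RangeConstraint) -> tuple[float, float]:
    """Interpret missing bounds as -inf / +inf."""
    lo = -math.inf if rc.min_val is None else rc.min_val
    hi = math.inf if rc.max_val is None else rc.max_val
    return lo, hi


# JSON has no integer infinity, but null round-trips cleanly.
payload = json.dumps(asdict(RangeConstraint(min_val=2)))
restored = RangeConstraint(**json.loads(payload))
bounds = to_bounds(restored)
```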

Test Plan: buck test mode/opt caffe2/test:test_export -- -r test_serialize_infinite_sym_int

Differential Revision: D65167805

Pull Request resolved: https://github.com/pytorch/pytorch/pull/139223
Approved by: https://github.com/yiming0416
2024-10-30 21:14:17 +00:00
Avik Chaudhuri
9e06b5b5cb fix unflatten with HOPs (#138978)
Summary:
Unflatten was broken for HOPs for a couple of reasons:
(1) we didn't expect `get_attr` nodes in the exported program, but they can occur to hold graph arguments to HOPs; such attributes must be moved from the exported program to the corresponding unflattened submodule containing the HOP call.
(2) we don't record metadata for graph arguments on serialization (there's nothing to hold it in our schema), and accordingly the `get_attr` nodes we create on deserialization don't have `nn_module_stack` metadata, which obviously wrecks unflatten.

Test Plan: added a couple of tests

Differential Revision: D65013647

Pull Request resolved: https://github.com/pytorch/pytorch/pull/138978
Approved by: https://github.com/zhxchen17
2024-10-28 19:30:56 +00:00
Prajesh Praveen Anchalia
3685c630b8 [pytorch] Plumb compile context from dynamo.export to aot_compile (#138793)
Summary:
tlparse shows unknown for certain items when _export.aot_compile() passes the graph obtained from dynamo.export() to inductor.aot_compile(). We also do not have access to the dynamo trace in the GraphModule exported by dynamo.

This change plumbs through the compile_context into aot_compile as a part of GraphModule.meta without a major change to APIs within dynamo.

Addresses issue: https://github.com/pytorch/pytorch/issues/123759?fbclid=IwY2xjawGE0LBleHRuA2FlbQIxMQABHS-PRpxvsrsHCDPdStHpqr1jQvx1YOnrPsRAfYAb-oXkU8MxidkIUENY-Q_aem_MAT2oaOgD03C8ggBNm575Q#issuecomment-2430722505

Test Plan:
```
buck2 test mode/opt //caffe2/test/dynamo:test_dynamo
Buck UI: https://www.internalfb.com/buck2/ad64c267-65be-47cf-a94f-e4b26e6e030b
Test UI: https://www.internalfb.com/intern/testinfra/testrun/9288674286334710
Network: Up: 83KiB  Down: 314KiB  (reSessionID-1dad223b-c91d-4718-97a4-bb2c81e480f0)
Jobs completed: 10750. Time elapsed: 19:18.5s.
Cache hits: 0%. Commands: 3 (cached: 0, remote: 0, local: 3)
Tests finished: Pass 5365. Fail 2. Fatal 0. Skip 4. Build failure 0

buck2 test mode/opt //caffe2/test/dynamo:test_dynamo_fb
Buck UI: https://www.internalfb.com/buck2/179a60bb-34e1-43b3-97ad-91af8a93ab01
Test UI: https://www.internalfb.com/intern/testinfra/testrun/2533275046340687
Network: Up: 201KiB  Down: 1.8GiB  (reSessionID-36f33983-6d78-4ec9-aa1b-34cee80dcb4f)
Jobs completed: 17. Time elapsed: 42.9s.
Cache hits: 0%. Commands: 1 (cached: 0, remote: 0, local: 1)
Tests finished: Pass 6. Fail 0. Fatal 0. Skip 0. Build failure 0
```

https://interncache-all.fbcdn.net/manifold/tlparse_reports/tree/logs/.tmpxZGXf6/index.html
Repro fixed: https://github.com/pytorch/pytorch/issues/123759?fbclid=IwY2xjawGE0LBleHRuA2FlbQIxMQABHS-PRpxvsrsHCDPdStHpqr1jQvx1YOnrPsRAfYAb-oXkU8MxidkIUENY-Q_aem_MAT2oaOgD03C8ggBNm575Q#issuecomment-2430722505

Differential Revision: D64863946

Pull Request resolved: https://github.com/pytorch/pytorch/pull/138793
Approved by: https://github.com/ezyang
2024-10-28 17:07:44 +00:00
chilli
392221b390 Made DDPOptimizer work with HOPs (#138787)
Fixes https://github.com/pytorch/pytorch/issues/137481

Pull Request resolved: https://github.com/pytorch/pytorch/pull/138787
Approved by: https://github.com/yf225
ghstack dependencies: #138733, #138794, #138881
2024-10-25 18:59:01 +00:00
Pian Pawakapan
51045e6251 make DimHints compatible with Dims (#138490)
Previously we'd been raising UserErrors when `Dim()` and DimHints (`Dim.AUTO/Dim.DYNAMIC`) were both specified in `dynamic_shapes`. This PR stops that, and uses `Dim()` objects to guide DimHints.

The key to this was making the `EqualityConstraint` class happy when it checks that inferred equivalence relations were specified in the original `dynamic_shapes` spec, and this introduces a `RelaxedConstraint` object to mark the hinted dimensions, so equality checks between `RelaxedConstraints` and other constraints are treated as valid.

Current behavior is that:
```
import torch
from torch.export import Dim, export

class Foo(torch.nn.Module):
    def forward(self, x, y):
        return x - y

inputs = (torch.randn(4, 4), torch.randn(4, 4))
shapes = {
    "x": (Dim.AUTO, Dim("d1", min=3)),
    "y": (Dim("d0", max=8), Dim.DYNAMIC),
}
ep = export(Foo(), inputs, dynamic_shapes=shapes)
```

The dimensions marked `AUTO` and `DYNAMIC` will have max & min ranges of 8 & 3 respectively. Note that inferred equality between `Dim()` objects & `Dim.STATIC` will still raise errors - `Dim()` suggests not specializing to a constant.

Differential Revision: D64636101

Pull Request resolved: https://github.com/pytorch/pytorch/pull/138490
Approved by: https://github.com/avikchaudhuri
2024-10-22 07:43:48 +00:00
Tugsbayasgalan Manlaibaatar
9f7c26bef3 Fix training IR bug by changing passes order (#138292)
Inserting runtime_assertions causes the gm to have different names, but the graph signature was populated earlier. To avoid this kind of error in the future, I refactored these steps into a helper function.

Differential Revision: [D64576251](https://our.internmc.facebook.com/intern/diff/D64576251)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/138292
Approved by: https://github.com/avikchaudhuri
ghstack dependencies: #138266
2024-10-22 01:24:14 +00:00
Tugsbayasgalan Manlaibaatar
5adc33d3b8 Training IR should preserve custom metadata (#138266)
Differential Revision: [D64576252](https://our.internmc.facebook.com/intern/diff/D64576252)

@diff-train-skip-merge
Pull Request resolved: https://github.com/pytorch/pytorch/pull/138266
Approved by: https://github.com/yushangdi
2024-10-22 01:09:56 +00:00