Summary:
https://docs.google.com/document/d/1QJJEGnj2nHGPODlw38BEG3KLLCOTfdOVjPrNQbz_LM8/edit#bookmark=id.lp80wfshq130
Changes:
* `torch.export` will return a functional ATen graph w/o decompositions
* `exported_program.run_decompositions(decomposition_table)` will optionally take a decomposition table, and run decompositions on the exported program, returning a new exported program. By default we will run the Core ATen decomposition table.
Calling convention for Executorch stays the same:
```
pre_autograd_graph = capture_pre_autograd_graph(f, args, ...)
aten_graph_no_decomps = torch.export.export(pre_autograd_graph, args, ...)
# Within to_edge we decompose to core aten and then convert to edge
edge_graph = exir.to_edge(aten_graph_no_decomps)
```
Test Plan: CI
Differential Revision: D49742989
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110410
Approved by: https://github.com/ydwu4
We want to get to a point where most `UserError`s link to `exportdb` examples. This PR makes passing case names non-optional, to make this intent clearer and to encourage developers who raise `UserError`s to create or point to examples that make fixing such errors more obvious for users.
In addition, sometimes there are multiple examples that are relevant to an error. Thus this PR also enables passing multiple case names.
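For illustration, a hedged sketch of how a `UserError` pointing at `exportdb` might be raised (the exact mechanism for passing multiple case names is internal and may differ):
```python
from torch._dynamo.exc import UserError, UserErrorType

# hypothetical call site; `case_name` links the error to an exportdb example
raise UserError(
    UserErrorType.DYNAMIC_CONTROL_FLOW,
    "Detected data-dependent control flow; consider rewriting with torch.cond.",
    case_name="cond_operands",
)
```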
Retry of #110733, which was reverted due to a landrace.
Differential Revision: [D50087148](https://our.internmc.facebook.com/intern/diff/D50087148/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110878
Approved by: https://github.com/gmagogsfm, https://github.com/tugsbayasgalan
We want to get to a point where most `UserError`s link to `exportdb` examples. This PR makes passing case names non-optional to make this intent clearer and encourage developers who raise `UserError`s to make or point to examples that make fixing such errors more obvious for users.
In addition, sometimes there are multiple examples that are relevant to an error. Thus this PR also enables passing multiple case names.
Differential Revision: [D50020465](https://our.internmc.facebook.com/intern/diff/D50020465/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110733
Approved by: https://github.com/zhxchen17
Previously, `Dim` definitions that shared the same name but had different ranges were allowed to appear in the `dynamic_shapes` argument of an `export` call. They would correspond to the *same* dynamic dimension (identified by the shared name), whose effective range would be the *intersection* of the different ranges.
However, this behavior can be confusing, because different definitions sharing the same name are more likely than not unintentional. Therefore, this PR makes it a user error.
We still allow different definitions with the same name to exist at the same time (no global uniqueness) as long as they are not confused in the same `export` call. Redefinitions with the same bounds are also allowed, in case they are accidentally created by executing the same code multiple times.
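A minimal sketch of what this rule allows and forbids (names and bounds are made up):
```python
from torch.export import Dim

batch = Dim("batch", min=2, max=64)
batch_dup = Dim("batch", min=2, max=64)     # allowed: redefinition with identical bounds
batch_other = Dim("batch", min=1, max=128)  # allowed to exist (no global uniqueness)...

# ...but mixing `batch` and `batch_other` in the `dynamic_shapes` of a single
# export() call now raises a user error instead of silently intersecting the
# two ranges.
```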
Differential Revision: [D49965944](https://our.internmc.facebook.com/intern/diff/D49965944/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110638
Approved by: https://github.com/zhxchen17
Summary: The runtime assertions inserted in `torch._export.export` by the `_AddRuntimeAssertionsForInlineConstraintsPass` lead to errors in AOT Inductor like #109884. In `torch._export.aot_compile`, export and AOT compilation are run consecutively, which would lead to the above issue if any assertions are inserted.
In this PR, we're adding a new parameter / flag to `torch._export.aot_compile`, `remove_runtime_assertions`, to remove the assertions inserted during export before AOT compilation. The flag is set to `False` for BC.
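A hedged sketch of the intended call shape (`torch._export.aot_compile` is a private API, so the exact signature and return value may differ; `model` and `example_args` are placeholders):
```python
import torch

so_path = torch._export.aot_compile(
    model,
    example_args,
    remove_runtime_assertions=True,  # defaults to False for backward compatibility
)
```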
Additionally, we remove the flag `add_runtime_assertions_for_inline_constraints` recently added to `torch._dynamo.config`, as it can lead to undesirable `torch._export` behavior and is no longer required for AOT Inductor testing purposes.
Test Plan: CI
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110710
Approved by: https://github.com/zhxchen17, https://github.com/chenyang78
Ideally all `_dynamo.exc.UserError`s should have "case names", i.e., link to examples in `exportdb`.
This PR adds case names to several instances of `_dynamo.exc.UserError`. In particular, looking at coverage based on `UserErrorType`:
* `DYNAMIC_CONTROL_FLOW`, `ANTI_PATTERN`, and `STANDARD_LIBRARY` are fully covered.
* `CONSTRAINT_VIOLATION` and `DYNAMIC_DIM` have no coverage. We don't seem to have any dedicated examples of specifying dynamic shapes in `exportdb` (although they are used in some other examples without explanation, to avoid some specialization that would make such examples moot).
* `INVALID_INPUT` is only partly covered. Frankly this is tedious to cover via examples.
Differential Revision: [D49928518](https://our.internmc.facebook.com/intern/diff/D49928518/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110555
Approved by: https://github.com/angelayi, https://github.com/ydwu4
Summary:
See `wrapper.codegen_reinterpret_view()`: it returns a temporary handle for a tensor, which has the following problem.
```
# NB, the return handle here represents a temporary tensor, which will be automatically
# released.
# Here's a sample usage in the cpp wrapper code:
# ```
# aoti_torch_addmm_out(
#     buf1,
#     arg1_1,
#     RAIIAtenTensorHandle(tmp_tensor_handle_0),
#     buf0,
#     1L,
#     1L));
# ```
# RAIIAtenTensorHandle(tmp_tensor_handle_0) will be released after the call to addmm_out.
# This could be problematic when it's used in a different pattern, for example:
# ```
# AtenTensorHandle tensor_args[] = {RAIIAtenTensorHandle(tmp_tensor_handle_2), buf5, buf6};
# aoti_torch_proxy_executor_call_function(..., tensor_args);
# ```
# RAIIAtenTensorHandle(tmp_tensor_handle_2) will be invalid when it's used in the latter
# kernel call.
return f"RAIIAtenTensorHandle({tmp_name})"
```
As a result, ProxyExecutor would generate the following code, which causes an invalid memory access.
Before:
```
// Source Nodes: [fn_with_tuple_output], Original ATen: [fb.fn_with_tuple_output]
AtenTensorHandle tmp_tensor_handle_2;
AOTI_TORCH_ERROR_CODE_CHECK(aoti_torch__reinterpret_tensor(buf3, 2, int_array_0, int_array_1, 0L, &tmp_tensor_handle_2));
...
AtenTensorHandle tensor_args[] = {RAIIAtenTensorHandle(tmp_tensor_handle_2), buf5, buf6};
int64_t int_args[] = {1};
aoti_torch_proxy_executor_call_function(proxy_executor, 1, 1, int_args, 3, tensor_args);
buf3.reset();
```
With the fix in this diff, ProxyExecutor generates the following code instead.
After:
```
// Source Nodes: [fn_with_tuple_output], Original ATen: [fb.fn_with_tuple_output]
AtenTensorHandle tmp_tensor_handle_2;
AOTI_TORCH_ERROR_CODE_CHECK(aoti_torch__reinterpret_tensor(buf3, 2, int_array_0, int_array_1, 0L, &tmp_tensor_handle_2));
...
aoti_torch_proxy_executor_call_function(proxy_executor, 1, 1, std::vector<int64_t>{1}.data(), 3, std::vector<AtenTensorHandle>{RAIIAtenTensorHandle(tmp_tensor_handle_2), buf5, buf6}.data());
buf3.reset();
```
I am not exactly a big fan of using `std::vector{...}.data()` to create a temporary array, but I can't think of another fix. Note that the temporary `std::vector` lives until the end of the full expression, so the pointer returned by `.data()` stays valid for the duration of the kernel call.
Test Plan: buck2 run mode/dev-nosan deeplearning/aot_inductor/test:test_custom_ops
Reviewed By: desertfire
Differential Revision: D49758764
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110451
Approved by: https://github.com/desertfire
Summary:
Previously we used export's `constraints` to specify that all batch-size dimensions are dynamic. This is done by creating one constraint `dynamic_dim(inp[0][0], lower, upper)`, followed by `dynamic_dim(inp[0][0]) == dynamic_dim(inp[i][0])` for every input `i`.
Through the new `dynamic_shapes` API, we can instead use `Dim("batch_size")` on every dimension to specify which dimensions are dynamic and equal to each other, and `None` otherwise: `{i: [Dim("batch_size", lower, upper), None] for every input i}`
Note: `dynamic_shapes` and `constraints` utilize the same "constraints" backend so this diff should be idempotent.
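A minimal sketch of the new-style specification (`model`, `example_inputs`, and the bounds are placeholders):
```python
from torch.export import Dim

# one named Dim, reused across all inputs, marks each batch dimension as
# dynamic and equal to the others; None keeps the second dimension static
batch_size = Dim("batch_size", min=1, max=1024)
dynamic_shapes = tuple((batch_size, None) for _ in example_inputs)
ep = torch.export.export(model, example_inputs, dynamic_shapes=dynamic_shapes)
```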
Test Plan: `buck2 run @//mode/dev-nosan //caffe2/torch/fb/model_transform/experimental/benchmark/test/aotinductor:test_aot_inductor_benchmark`
Reviewed By: chenyang78, aakhundov
Differential Revision: D49784351
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110360
Approved by: https://github.com/desertfire
Summary:
https://docs.google.com/document/d/1QJJEGnj2nHGPODlw38BEG3KLLCOTfdOVjPrNQbz_LM8/edit#bookmark=id.lp80wfshq130
`exported_program.run_decompositions(decomposition_table)` will optionally take a decomposition table, and run decompositions on the exported program, returning a new exported program. By default we will run the Core ATen decomposition table.
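A minimal usage sketch (`model` and `example_args` are placeholders):
```python
from torch._decomp import core_aten_decompositions

ep = torch.export.export(model, example_args)
core_ep = ep.run_decompositions()  # defaults to the Core ATen decomposition table
custom_ep = ep.run_decompositions(core_aten_decompositions())  # explicit table
```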
Splitting this diff from the following one (D49742989) to make migrating Executorch easier:
1. Land this diff
2. Wait for a pytorch nightly to include this diff
3. Update executorch's pytorch nightly
4. Land the following diff to have export() return no decomps
Test Plan: Tested in following diff
Differential Revision: D49743208
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110236
Approved by: https://github.com/gmagogsfm
Summary: when the grid is computed in terms of unbacked `SymInt`s, it can happen that the grid is zero-sized. This causes a CUDA error on `cuLaunchKernel` in the AOT Inductor codegen.
In this PR, when the grid contains unbacked `SymInt`s, a check is added around the `launchKernel` call in AOT Inductor's C++ wrapper codegen to make sure that the grid is not zero-sized.
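An illustrative sketch of the generated guard (identifiers are hypothetical):
```
// grid_0..grid_2 are expressions over unbacked SymInts, only known at runtime;
// skip the launch entirely when any grid dimension is zero
if (grid_0 > 0 && grid_1 > 0 && grid_2 > 0) {
    launchKernel(kernel, grid_0, grid_1, grid_2, num_warps, shared_memory, kernel_args, stream);
}
```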
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110312
Approved by: https://github.com/chenyang78
An `ExportedProgram`'s `__call__` signature is different from the original module, so `dynamic_shapes` that follow the original signature would fail when applied to re-export an `ExportedProgram`.
This PR fixes this issue, in other words, the original `dynamic_shapes` should now work when re-exporting.
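A sketch of the now-working pattern (`model` and the input name are placeholders):
```python
from torch.export import Dim, export

batch = Dim("batch")
dynamic_shapes = {"x": {0: batch}}
ep = export(model, (x,), dynamic_shapes=dynamic_shapes)
# the same original-signature dynamic_shapes now also work on re-export, even
# though the ExportedProgram's __call__ signature differs from the module's
ep2 = export(ep, (x,), dynamic_shapes=dynamic_shapes)
```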
Differential Revision: D49764011
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110276
Approved by: https://github.com/tugsbayasgalan
A resubmit of https://github.com/pytorch/pytorch/pull/108447. Copying over the description:
This is a follow-up of the discussion in https://github.com/pytorch/pytorch/pull/108356, where we want to replace `source_fn` with `source_fn_stack`.
Before this PR, for the following example:
```python
backend = EagerAndRecordGraphs()

@torch.compile(backend=backend, fullgraph=True)
def cond_f(pred, pred2, x, y):
    def true_fn(pred2, x, y):
        return x + y

    def false_fn(pred2, x, y):
        def true_fn2(x, y):
            return x.sin() - y.cos()

        def false_fn2(x, y):
            return x.cos() - y.sin()

        return control_flow.cond(pred2, true_fn2, false_fn2, (x, y))

    return control_flow.cond(pred, true_fn, false_fn, (pred2, x, y))
```
The graph captured is shown below:
```python
class GraphModule(torch.nn.Module):
    def forward(self, L_pred_ : torch.Tensor, L_pred2_ : torch.Tensor, L_x_ : torch.Tensor, L_y_ : torch.Tensor):
        l_pred_ = L_pred_
        l_pred2_ = L_pred2_
        l_x_ = L_x_
        l_y_ = L_y_
        cond_true_1 = self.cond_true_1
        cond_false_1 = self.cond_false_1
        cond = torch.ops.higher_order.cond(l_pred_, cond_true_1, cond_false_1, [l_pred2_, l_x_, l_y_]); l_pred_ = cond_true_1 = cond_false_1 = l_pred2_ = l_x_ = l_y_ = None
        return (cond,)

class GraphModule(torch.nn.Module):
    def forward(self, l_pred2_, l_x_, l_y_):
        add = l_x_ + l_y_; l_x_ = l_y_ = None
        return add

class GraphModule(torch.nn.Module):
    def forward(self, l_pred2_, l_x_, l_y_):
        cond_true_0 = self.cond_true_0
        cond_false_0 = self.cond_false_0
        cond = torch.ops.higher_order.cond(l_pred2_, cond_true_0, cond_false_0, [l_x_, l_y_]); l_pred2_ = cond_true_0 = cond_false_0 = l_x_ = l_y_ = None
        return cond

class GraphModule(torch.nn.Module):
    def forward(self, l_x_, l_y_):
        sin = l_x_.sin(); l_x_ = None
        cos = l_y_.cos(); l_y_ = None
        sub = sin - cos; sin = cos = None
        return sub

class GraphModule(torch.nn.Module):
    def forward(self, l_x_, l_y_):
        cos = l_x_.cos(); l_x_ = None
        sin = l_y_.sin(); l_y_ = None
        sub = cos - sin; cos = sin = None
        return sub
```
the `source_fn` for the inner cond, sin, and cos will be a (name, target) tuple:
```
('cond', <torch._ops.HigherOrderOperator object at xxx>)
('sin', 'sin')
('cos', 'cos')
('sub', <built-in function sub>)
```
After this PR, the `source_fn_stack` will be a list of (name, target) tuples. The bottom of the stack is the end of the list.
```
[('cond', <torch._ops.HigherOrderOperator object at xxx>), ('cond', <torch._ops.HigherOrderOperator object at xxx>)],
[('cond', <torch._ops.HigherOrderOperator object at xxx>), ('cond', <torch._ops.HigherOrderOperator object at xxx>), ('sin', 'sin')],
[('cond', <torch._ops.HigherOrderOperator object at xxx>), ('cond', <torch._ops.HigherOrderOperator object at xxx>), ('cos', 'cos')]
[('cond', <torch._ops.HigherOrderOperator object at xxx>), ('cond', <torch._ops.HigherOrderOperator object at xxx>), ('sub', <built-in function sub>)]
```
Test Plan:
See the added tests in test_higher_order_ops.py and the modified existing tests.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108595
Approved by: https://github.com/angelayi, https://github.com/zou3519
Recently we updated the `export` API to take an experimental `dynamic_shapes` argument that was meant to subsume the existing `constraints` argument.
This PR deprecates `constraints` (with a warning on its use, but without actually removing it). Simultaneously it replaces all uses of `constraints` in docs, examples, and tests with corresponding uses of `dynamic_shapes` (preserving behavior). This exercise fortunately revealed some minor bugs in the implementation which have also been fixed in this PR.
Some uses of `constraints` still remain, e.g., when `torch._dynamo.export` is called directly. (Meta-internal uses will be updated in a separate diff.)
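For illustration, a hedged before/after sketch of the migration (`model`, the input name, and the bounds are made up):
```python
import torch
from torch.export import Dim, dynamic_dim, export

x = torch.randn(8, 16)

# before (deprecated): constraints built from dynamic_dim
constraints = [dynamic_dim(x, 0) >= 2, dynamic_dim(x, 0) <= 64]
ep = export(model, (x,), constraints=constraints)

# after: a named Dim passed through dynamic_shapes
batch = Dim("batch", min=2, max=64)
ep = export(model, (x,), dynamic_shapes={"x": {0: batch}})
```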
Differential Revision: D49676049
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110143
Approved by: https://github.com/tugsbayasgalan
Based on William's recent diff on preserving node metadata during retracing, we no longer need to skip dynamo when retracing. This softens our previous restriction of not allowing any new constraints from the user side, because we can now utilize dynamo to analyze through constraints. As a result, re-export can technically happen with any new constraints. This opens up another question: is it OK to use looser constraints on the retrace? If we allow looser constraints, we can technically diverge from eager behaviour because, for example, we could have eliminated unsafe control flow based on the previous assumptions. But we can also argue this is OK, because we treat the exported callable as independent from its original source code.
We could technically ban looser constraints inside export, but my concern is that doing special-case checks on `ExportedProgram` breaks the abstraction.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109476
Approved by: https://github.com/avikchaudhuri, https://github.com/zhxchen17
Our experience using `constraints` / `dynamic_dim` with the existing export API has shown it to be (subjectively) clunky and (objectively) verbose in common cases.
This PR implements a new design for the export API that replaces the use of `constraints` / `dynamic_dim` with a new way of specifying dynamic shapes, involving the following concepts:
* a constructor `Dim` for first-class named dynamic dimensions with ranges (similar to `functorch.dim`, and analogous to internal symbolic sizes)
* a mechanism that uses the above in `export` calls to associate inputs to their dynamic shape specifications (`dynamic_shapes`)
Design doc: https://docs.google.com/presentation/d/168U7XK72C_WSsZpGESP6Cho9udh193fi0gfjxCNcJ4E/edit#slide=id.p (Meta-only). Note that we only implement Option 1 in that doc. An older version of this PR also implemented Option 3, which is an alternative way of specifying dynamic shapes using tensor type annotations on the exported callable; but we have moved that to future work for now.
See docs for these new features in `torch.export`. The existing `torch.export.export` is modified to use the new API, `torch._export.export__RC__`, whenever `constraints=None`. We have not deprecated the existing API yet, but will do so in a follow-up.
Constraint violation errors arising through use of the new API will now contain suggested fixes using the new API. No longer do we need to report all specializations for static dimensions and suggest all constraints over dynamic dimensions to fix such errors. Instead, due to the redesign, the suggested fixes are much more concise, only involving modifying the definitions of relevant `Dim`s.
Differential Revision: [D48919204](https://our.internmc.facebook.com/intern/diff/D48919204/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108448
Approved by: https://github.com/suo, https://github.com/gmagogsfm
We now have two types of functionalization, C++ Functionalization (through the `Functionalize` dispatch key), and python functionalization (through the `FunctionalTensorMode` torch_dispatch mode).
This means that all higher order ops need custom functionalization rules for the python variant too. I added them here, as well as a helper function `dispatch_functionalize()` - equivalent to `torch.func.functionalize()`, except that it uses `FunctionalTensorMode`.
In theory we could have secretly switched `torch.func.functionalize` to use `FunctionalTensorMode`. This would be BC-breaking, though, since `FunctionalTensorMode` isn't composable with the other functorch transforms (the functorch layer-mode stack doesn't know how to re-order torch_dispatch modes arbitrarily).
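For context, a hedged sketch of what functionalization does, using the public `torch.func.functionalize` (the new `dispatch_functionalize()` plays the same role via `FunctionalTensorMode`):
```python
import torch
from torch.func import functionalize
from torch.fx.experimental.proxy_tensor import make_fx

def f(x):
    y = x.clone()
    y.add_(1)          # an in-place mutation...
    return y.view(-1)  # ...and a view

# tracing the functionalized version yields a graph containing only
# out-of-place ops (e.g. aten.add instead of aten.add_)
gm = make_fx(functionalize(f))(torch.randn(2, 2))
```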
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108656
Approved by: https://github.com/zou3519
ghstack dependencies: #109024, #109248
Fix: #107315
This PR enables dynamo to trace through the `pytree` API by inlining its functions. In order to do so, a few details of `pytree` had to be changed (a sketch of a now-traceable pattern follows the list below).
In summary, this PR:
- Introduces `TreeSpecVariable` for representing `TreeSpec` instances
- Specializes `<type>.__bases__` call, returning a `TupleVariable`
- Enables calling the `id` builtin function on every variable that implements the `as_python_constant` method
- Specializes `ConstantVariable.call_method` for its (un)flatten functions
- Implements `UserDefinedObjectVariable.as_python_constant`
- Modifies `pytree` by:
  - Making `SUPPORTED_NODES` a map from ids (instead of types) to `NodeDef`
  - Removing the use of `functools.wraps`, since it can't be inlined
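For illustration, a sketch of the kind of code that dynamo can now trace without graph breaks (assuming the internal `torch.utils._pytree` module):
```python
import torch
import torch.utils._pytree as pytree

@torch.compile(fullgraph=True)
def f(inputs):
    # tree_flatten / tree_unflatten are inlined instead of breaking the graph
    leaves, spec = pytree.tree_flatten(inputs)
    leaves = [leaf * 2 for leaf in leaves]
    return pytree.tree_unflatten(leaves, spec)

out = f({"a": torch.ones(2), "b": (torch.ones(3), torch.ones(4))})
```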
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108533
Approved by: https://github.com/ezyang, https://github.com/voznesenskym
ghstack dependencies: #109201
Summary:
X-link: https://github.com/pytorch/executorch/pull/359
When exporting using enable_aot (through the torch.export path), we want to lift all constant tensors as buffers to the exported program. The ScalarToTensor pass in EXIR's aten_to_edge passes will create some constant tensors in the graph, so we will need to run a lift_constant_tensors pass afterwards.
Note that this only needs to be applied when exporting using the torch.export path because in the original path, nothing is lifted.
Test Plan: CI
Differential Revision: D49207492
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109382
Approved by: https://github.com/cccclai
Previously, the code for passing inputs to the exported program was:
```
if kwargs:
return (args, kwargs)
else:
return args
```
However, this causes an inconsistency: if the original input contains both args and kwargs, the treespec would be a tuple containing a tuple of arguments and a dictionary of keyword arguments; but if the original input only contained args, the treespec would just be a tuple of arguments. This inconsistency causes some inconvenience at runtime.
So I updated the code to always keep the kwargs around.
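A sketch of the updated convention:
```
# always return the (args, kwargs) pair, even when kwargs is empty,
# so the treespec always has the same shape
return (args, kwargs)
```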
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109160
Approved by: https://github.com/zhxchen17, https://github.com/avikchaudhuri
Summary:
If there are no inline constraints added, just return the original graph.
We want to do this because this pass sometimes messes up the node names;
until we actually fix that, we can make the behavior a bit less buggy
by skipping no-op passes.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109395
Approved by: https://github.com/angelayi
Summary:
When we retrace a graph containing constant tensors, the constants get lifted as buffer inputs.
AOTInductor also wants to lift all the constants as inputs.
If we treat constants as a separate category, we add complexity: we would now have to keep track of three kinds of inputs (params, buffers, constants).
Cons: people might care about which buffers are, or are not, actual buffers.
If people want to know specifically which buffers are constants, we can add an additional field in the graph signature to mark this.
Test Plan: CI
Differential Revision: D49153367
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109040
Approved by: https://github.com/zhxchen17
Summary:
Include the constants in the AOTInductor .so file.
Notable differences:
1) Serialize with ctypes instead of torch.storage's native serialization.
2) Use the underlying for_blob instead of from_blob to construct the Tensor.
Test Plan:
Unit tests:
```
test/inductor/test_aot_inductor.py
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108473
Approved by: https://github.com/angelayi
When we retrace a graph containing constant tensors, the constants get lifted as buffer inputs.
AOTInductor also wants to lift all the constants as inputs.
If we treat constants as a separate category, we add complexity: we would now have to keep track of three kinds of inputs (params, buffers, constants).
Cons: people might care about which buffers are, or are not, actual buffers.
If people want to know specifically which buffers are constants, we can add an additional field in the graph signature to mark this.
Differential Revision: [D49017872](https://our.internmc.facebook.com/intern/diff/D49017872)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108592
Approved by: https://github.com/zhxchen17
Summary:
This is a prototype for running extern fallback kernels with a host-side proxy executor.
Sample of generated cpp wrapper call:
```
at::Tensor buf0; // output buffer
void* tensor_args_var_0[] = {&arg0_1, &arg0_1, &arg1_1, &arg0_1, &arg1_1, &buf0};
int64_t int_args_var_1[] = {81, 81, 7, 7, 7, 81};
proxy_executor->call_function("buf0", int_args_var_1, tensor_args_var_0);
```
- In my current implementation, the proxy executor interprets the raw pointers according to the op's schema.
This assumes that a custom op MUST have a valid schema registered with the Dispatcher. (I would like to validate this assumption.)
- I am using the callBoxed() API of the custom kernels. This is inevitable, as we wish to have a single call_function API for all possible custom kernels.
- These are all the input argument types supported so far:
```
union Argument {
  # Bool value does not matter
  1: bool asNone;
  2: TensorArgument asTensor;
  3: list<TensorArgument> asTensors;
  5: i64 asInt;
  7: list<i64> asInts;
  8: double asFloat;
  9: list<double> asFloats;
  10: string asString;
  10.5: list<string> asStrings;
  11: SymIntArgument asSymInt;
  12: list<SymIntArgument> asSymInts;
  13: ScalarType asScalarType;
  14: MemoryFormat asMemoryFormat;
  15: Layout asLayout;
  16: Device asDevice;
  17: bool asBool;
  18: list<bool> asBools;
}
```
- Need a policy for handling unpopulated arguments with default values. Here are the options, which have BC implications:
  1. Require the exported fx graph to explicitly populate default values if the user doesn't specify them.
  2. Require the cpp wrapper to explicitly populate default values if the fx graph doesn't specify them.
  3. Have the proxy executor look up default values from the op schema.
For fixing T162112344
Test Plan:
frontend:
buck2 run mode/dev-sand mode/inplace -c fbcode.enable_gpu_sections=True sigmoid/frontend:export_main
test:
buck2 run mode/dev-sand //deeplearning/aot_inductor/test:test_custom_ops
backend:
buck2 run mode/dev-nosan //deeplearning/aot_inductor/fb:main
buck2 test 'fbcode//mode/opt' fbcode//caffe2/torch/fb/model_transform/experimental/benchmark/test:test_aot_inductor_benchmark -- --exact 'caffe2/torch/fb/model_transform/experimental/benchmark/test:test_aot_inductor_benchmark - test_aot_inductor_benchmark_cmf30x (caffe2.torch.fb.model_transform.experimental.benchmark.test.test_aot_inductor_benchmark.AOTInductorBenchmark)'
Reviewed By: suo
Differential Revision: D48747417
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108350
Approved by: https://github.com/izaitsevfb