pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
Shangdi Yu	a47bb4a393	Fix autocast for non-strict export (#137495 ) Summary: Add testing for autocast and set_grad nodes for export_for_training. In export_for_training, we do not wrap the autocast and set_grad nodes into a HOP, but we should still have the set_grad_enabled/autocast nodes. Add support for autocast in non-strict export. Previously, `_enter_autocast` and `_exit_autocast` nodes didn't show up in the export graph when we used `strict=False`. - In autocast's enter and exit functions, we dispatch to `PreDispatchTorchFunctionMode.__torch_function__`. If we have PreDispatchTorchFunctionMode in our function_mode_stack, the call stack looks like below. This is mostly the same call stack as strict mode, except strict mode enters [here](https://www.internalfb.com/code/fbsource/[0d4f1135cacdb26c6e01d5dce1ce52a15d61ee48]/xplat/caffe2/torch/_dynamo/variables/ctx_manager.py?lines=806). ``` - torch.amp.autocast.__enter__()'s torch.overrides.handle_torch_function - torch.fx.experimental.proxy_tensor.TorchFunctionMetadataMode.__torch_function__ - torch.amp._enter_autocast()'s torch.overrides.handle_torch_function - PreDispatchTorchFunctionMode.__torch_function__ ``` - In `PreDispatchTorchFunctionMode.__torch_function__`, we create the autocast nodes. - To match the strict mode behavior, we let the input node to the `_exit_autocast` node be the corresponding `_enter_autocast` node. This requires us to maintain a stack in `PreDispatchTorchFunctionMode`. Test Plan: ``` buck2 run 'fbcode//mode/dev-nosan' fbcode//caffe2/test:test_export -- -r test_export_with_autocast buck2 run 'fbcode//mode/dev-nosan' fbcode//caffe2/test:test_export -- -r test_export_with_set_grad ``` Differential Revision: D64016023 Pull Request resolved: https://github.com/pytorch/pytorch/pull/137495 Approved by: https://github.com/bdhirsh	2024-10-16 17:39:00 +00:00
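The stack bookkeeping mentioned in the commit above can be pictured with a small pure-Python sketch. It does not use PyTorch and is not the code from the PR: `Node` and `AutocastNodeRecorder` are hypothetical names invented for illustration. It only shows why a stack is enough to make each `_exit_autocast` node take its matching `_enter_autocast` node as input when contexts are nested.

```python
from dataclasses import dataclass


@dataclass
class Node:
    # Hypothetical, simplified stand-in for a graph node.
    op: str
    args: tuple = ()


class AutocastNodeRecorder:
    """Hypothetical recorder: pairs exit nodes with enter nodes via a stack."""

    def __init__(self):
        self.graph = []          # nodes in creation order
        self._enter_stack = []   # currently open _enter_autocast nodes

    def enter_autocast(self, device_type, dtype):
        node = Node("_enter_autocast", (device_type, dtype))
        self.graph.append(node)
        self._enter_stack.append(node)
        return node

    def exit_autocast(self):
        # The most recently opened context is the one being closed.
        enter_node = self._enter_stack.pop()
        node = Node("_exit_autocast", (enter_node,))
        self.graph.append(node)
        return node


rec = AutocastNodeRecorder()
outer = rec.enter_autocast("cpu", "bfloat16")
inner = rec.enter_autocast("cpu", "float16")
inner_exit = rec.exit_autocast()
outer_exit = rec.exit_autocast()
```

Because autocast contexts nest like parentheses, popping the stack on exit always yields the enter node of the context being closed.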
Tugsbayasgalan Manlaibaatar	0a6c40faba	Fix constant returning (#137993 ) When a constant is used twice in the exported graph (the second use being returned as output), the constant-lifting pass doesn't account for the second use being the output. This PR fixes that. Differential Revision: [D64406108](https://our.internmc.facebook.com/intern/diff/D64406108/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/137993 Approved by: https://github.com/avikchaudhuri	2024-10-16 16:42:09 +00:00
Pian Pawakapan	44653895cc	override bool(), is_nonzero for real tensor tracing (#136788 ) Fixes bool() and is_nonzero() calls for real tensor tracing, non-strict export Differential Revision: D63482693 Pull Request resolved: https://github.com/pytorch/pytorch/pull/136788 Approved by: https://github.com/ezyang	2024-10-15 17:13:44 +00:00
Wang, Eikan	fa08e924ad	Skip test export with fake tensor inputs on cuda devices for Intel GPU (#137847 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/137847 Approved by: https://github.com/etaf, https://github.com/jansel	2024-10-13 07:07:48 +00:00
Avik Chaudhuri	ed55d356de	[alt] fix unroll in successive unflatten (#137646 ) We use nn_module_stack in unflatten to recognize when module calls begin and end. However the current format is not sufficient to detect module call boundaries when we have successive calls to the same module, because the successive instructions (end of one call, begin of next call) have the same nn_module_stack. This causes us to effectively "unroll" successive calls to a single call. This can cause problems when preserving module call signatures because the outputs of the successive calls might be concatenated in the single call. Previously we introduced the concept of a "call index" to generate multiple graphs when unflattening, one per call. This PR pushes this concept into nn_module_stack itself. In particular, the keys of nn_module_stack now go from `key` to `key@call_index`. (In a previous attempt, https://github.com/pytorch/pytorch/pull/137457, instead values in nn_module_stack go from (fqn, type) to (fqn, type, call_index), which is BC-breaking.) Note that we still do not have the ability to preserve module call signatures for multiple calls to the same module. But now instead of randomly crashing we give a proper error. OTOH when not preserving module call signatures we simply generate multiple calls, each with its own graph, possibly deduplicated, matching what we would do for non-successive calls. Test Plan: Like D64014936 Differential Revision: D64136277 Pull Request resolved: https://github.com/pytorch/pytorch/pull/137646 Approved by: https://github.com/angelayi	2024-10-12 15:53:52 +00:00
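To see why a call index in the key helps, here is a small pure-Python sketch. It assumes, for illustration only, that each instruction carries a single key string and that a key without an `@` suffix means the first call; `split_call_key` and `count_calls` are hypothetical helpers, not functions from PyTorch.

```python
from itertools import groupby


def split_call_key(key):
    """Split a hypothetical 'key@call_index' string into (key, call_index)."""
    base, sep, index = key.rpartition("@")
    if not sep:
        # No '@' suffix: assumed here to mean the first call.
        return key, 0
    return base, int(index)


def count_calls(keys):
    """Count module-call boundaries as runs of identical consecutive keys."""
    return sum(1 for _ in groupby(keys))


# Two successive calls to the same module `n`, two instructions each.
old_format = ["n", "n", "n", "n"]        # boundary between the calls is invisible
new_format = ["n", "n", "n@1", "n@1"]    # call index makes the boundary visible

calls_old = count_calls(old_format)
calls_new = count_calls(new_format)
```

With the old keys the two calls collapse into one run, which is the "unrolling" described above; with the call index the run changes exactly where the second call begins.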
Tugsbayasgalan Manlaibaatar	5fca2fd365	Try unify training and inference (#136888 ) Previously, inference -> inference IR went through a separate flow from train -> inference decomposition. This diff unifies them so that we always retrace when decomposing. Joint IR decomp still goes through the old flow (inference -> inference), but that seems OK for now since it is still experimental. Differential Revision: [D63062521](https://our.internmc.facebook.com/intern/diff/D63062521/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/136888 Approved by: https://github.com/avikchaudhuri	2024-10-11 20:09:58 +00:00
Tugsbayasgalan Manlaibaatar	bc232e3c08	Fix custom op bug of clearing dir (#137655 ) Previously when we delete a custom op out of context manager, we weren't clearing the dir field of the op namespace. As a result, it was polluting other tests. Differential Revision: [D64141465](https://our.internmc.facebook.com/intern/diff/D64141465/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/137655 Approved by: https://github.com/zou3519, https://github.com/Skylion007	2024-10-11 04:32:40 +00:00
Avik Chaudhuri	8ee361ed13	fix test_retrace_pre_autograd (#137733 ) Test Plan: fixed Differential Revision: D64200918 Pull Request resolved: https://github.com/pytorch/pytorch/pull/137733 Approved by: https://github.com/pianpwk, https://github.com/tugsbayasgalan	2024-10-11 03:46:22 +00:00
Avik Chaudhuri	8262f6d271	fix test_lazy_module_kwargs (#137705 ) Test Plan: fixed Differential Revision: D64185644 Pull Request resolved: https://github.com/pytorch/pytorch/pull/137705 Approved by: https://github.com/tugsbayasgalan	2024-10-11 01:53:10 +00:00
Shangdi Yu	9d4cb0d3eb	Fix param and buffer mapping for state_dict when there are state_dict hooks (#137609 ) Resolve #137540 Summary: We might get different state_dict and named_parameters result when the module has registered custom state_dict_hooks. For exported_program's state_dict, we want the state_dict to reflect the actual module hierarchy at runtime, and it might be different from the model's state_dict() output if the model has state_dict hooks. To do weight swapping, one needs to either re-export or turn-off the hooks when saving model's state_dict(). Previously, ExportedProgram uses nn.Module's state_dict() method to populate its own state_dict, but it doesn't work for some models (e.g. llama3_3_vision) because ExportedProgram's state_dict and an nn.Module's state_dict have some subtle differences semantically. nn.Module's state_dict is about how the state should be serialized, and it reflects the structure of the original user model code. In contrast, export specializes on a “run” of a model, and its state_dict needs to reflect the runtime module hierarchy. One example where these two are different is TorchTune's Llama3_2_vision text decoder. Here, a FusionLayer is added as a local optimization and it is not part of the "static model definition". In runtime, we have mod.layers[3].layer.sa_norm.scale. But in nn.Module's state_dict, the authors of the model added a state_dict hook to remove the "layer" in mod.state_dict() to reflect the static model definition, so we have mod.state_dict()["layers.3.sa_norm.scale"]. In this Diff, we change ExportedProgram to populate its state_dict using named_parameters() and named_buffers() instead. So in ExportedProgram's state_dict, we have "layers.3.layer.sa_norm.scale", which reflects the runtime module hierarchy. Now one problem this presents is weight swapping. Since ExportedProgram's state and the model's state is not the same anymore, weight swapping procedure also needs to change slightly. 
In internal Ads and RecSys model deployment, weight swapping is where they have one model that is currently deployed and serving traffic, and they want to swap out the weights with newly trained model weights without having to redo the whole exporting/lowering process and create a new artifact. So they would move the deployed model’s pointer to the state dict over to the new state dict. Because of this, it was previously a requirement that the FQNs match between the exported and the eager model’s state dict. The new ExportedProgram's state dict still supports weight swapping, but the state_dict to be swapped needs to be obtained from torch.export.exported_program instead of model.state_dict() if the model has state_dict hooks. The new requirement is that the FQNs match between the exported program’s state dict and the state_dict obtained from the `_disabled_load_state_dict_hooks(M)` context manager. One benefit of having this new API is that we are now in full control within export of gathering and updating the model state. If a model doesn't have any state_dict hooks, one can still use model.state_dict() for weight swapping, so it's BC. Test Plan: ``` buck2 run 'fbcode//mode/dev-nosan' fbcode//caffe2/test:test_export -- -r test_export_for_training_with_state_dict_hooks ``` Differential Revision: D64080561 Pull Request resolved: https://github.com/pytorch/pytorch/pull/137609 Approved by: https://github.com/angelayi, https://github.com/pianpwk	2024-10-11 01:33:50 +00:00
Avik Chaudhuri	365722f606	fix test_constant_output (#137547 ) Summary: Fixes a couple of problems: constants didn't have metadata before creating graph signatures, and graph signatures weren't updated when lifting constants. Test Plan: fixed test Differential Revision: D64081786 Pull Request resolved: https://github.com/pytorch/pytorch/pull/137547 Approved by: https://github.com/tugsbayasgalan	2024-10-10 07:48:15 +00:00
Avik Chaudhuri	a02093e824	fix test_export_constraints_error_not_in_range (#137500 ) Test Plan: fixed Differential Revision: D64052011 Pull Request resolved: https://github.com/pytorch/pytorch/pull/137500 Approved by: https://github.com/tugsbayasgalan	2024-10-09 05:48:14 +00:00
Michael Lazos	27dee935af	[Dynamo] Ensure torch function modes are dispatched on builtin ops (#137117 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/137117 Approved by: https://github.com/yanboliang, https://github.com/williamwen42 ghstack dependencies: #137114, #137115, #137116	2024-10-09 02:29:40 +00:00
PyTorch MergeBot	2d18c2d5e7	Revert "[Dynamo] Ensure torch function modes are dispatched on builtin ops (#137117 )" This reverts commit `941be418d8`. Reverted https://github.com/pytorch/pytorch/pull/137117 on behalf of https://github.com/huydhn due to The top of the stack has been reverted but it leaves trunk in a broken state, so I try to revert the rest of the stack ([comment](https://github.com/pytorch/pytorch/pull/137114#issuecomment-2400765603))	2024-10-08 20:33:17 +00:00
Avik Chaudhuri	28493efe6e	fix silly mapping issue with torch.Size (#137465 ) Test Plan: added test Differential Revision: D64022949 Pull Request resolved: https://github.com/pytorch/pytorch/pull/137465 Approved by: https://github.com/yushangdi, https://github.com/angelayi	2024-10-08 16:53:15 +00:00
Michael Lazos	941be418d8	[Dynamo] Ensure torch function modes are dispatched on builtin ops (#137117 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/137117 Approved by: https://github.com/yanboliang, https://github.com/williamwen42 ghstack dependencies: #137114, #137115, #137116	2024-10-07 18:55:26 +00:00
Avik Chaudhuri	17718209ea	fix specialization bug in unflatten + preserve_module_call_signature (#137363 ) Summary: In unflatten, when we generate module calls when their signature has been preserved, we do not pass the original constant args. This can cause strange effects, e.g., if the module is swapped out with itself, we may suddenly go down a different path than the original, or even crash. Test Plan: added a test Reviewed By: angelayi Differential Revision: D63913750 Pull Request resolved: https://github.com/pytorch/pytorch/pull/137363 Approved by: https://github.com/angelayi	2024-10-05 04:26:02 +00:00
Avik Chaudhuri	6a6a8b17b8	handle state tensors in training ir path (#137240 ) Summary: We had attribute assignment detection and handling of registered buffer assignments when using `aot_autograd`, but not when using just `make_fx`. Fixed. Test Plan: expanded coverage of `test_state_tensors` to use `export` instead of `torch.export.export` Differential Revision: D63802576 Pull Request resolved: https://github.com/pytorch/pytorch/pull/137240 Approved by: https://github.com/tugsbayasgalan	2024-10-04 20:23:48 +00:00
Tugsbayasgalan Manlaibaatar	d2d14d14e3	[RELAND] Fix unlift to preserve aliased constants (#137310 ) Differential Revision: [D63864743](https://our.internmc.facebook.com/intern/diff/D63864743) Pull Request resolved: https://github.com/pytorch/pytorch/pull/137310 Approved by: https://github.com/avikchaudhuri	2024-10-04 18:15:52 +00:00
Shangdi Yu	b2979f4382	Allow autocast in training ir export (#137287 ) Summary: hardcode "val" field for autocast (similar to set_grad_enabled), to bypass the verifier check. Test Plan: CI Differential Revision: D63345767 Pull Request resolved: https://github.com/pytorch/pytorch/pull/137287 Approved by: https://github.com/angelayi	2024-10-04 17:38:51 +00:00
Pian Pawakapan	6dcd773c57	[export] clean up dynamic markers from tensors (#137230 ) Summary: When we handle dynamic shapes markers like `Dim.AUTO, Dim.DYNAMIC`, we use dynamo decorators, attaching set attributes to the export input tensors, e.g. `x._dynamo_dynamic_indices = set()`. I thought this was fine, since it's done all the time with torch.compile, but it breaks some PT2Inference tests, specifically because unpickling a set attribute isn't possible with the C++ torch::jit::pickle_load call. We've agreed that the PT2Inference side will clone sample inputs & pickle the original inputs to be safe, but this still establishes a nice invariant that user-facing decorators are both ignored & cleaned out in the lifecycle of an export call. Test Plan: test_export Differential Revision: D63773534 Pull Request resolved: https://github.com/pytorch/pytorch/pull/137230 Approved by: https://github.com/avikchaudhuri	2024-10-04 06:50:45 +00:00
PyTorch MergeBot	525f6715bc	Revert "Fix unlift to unblock training IR + run_decomp on aliasing constants (#137162 )" This reverts commit `f96020c246`. Reverted https://github.com/pytorch/pytorch/pull/137162 on behalf of https://github.com/jovianjaison due to Sorry for reverting your changes but many jobs are failing with NameError: name _recursive_getattr is not defined + a Lint job fails ([comment](https://github.com/pytorch/pytorch/pull/137162#issuecomment-2392036062))	2024-10-03 18:17:56 +00:00
Tugsbayasgalan Manlaibaatar	f96020c246	Fix unlift to unblock training IR + run_decomp on aliasing constants (#137162 ) When we populate unlifted graph module, we actually only "unlift" constant tensor inputs which is problematic because export de-duplicates aliasing constants. As a result, we only register one constant instead of two constants. This PR fixes that by querying ep.constants table instead of ep.graph_signature.lifted_tensor_constants. Differential Revision: [D63743111](https://our.internmc.facebook.com/intern/diff/D63743111) Pull Request resolved: https://github.com/pytorch/pytorch/pull/137162 Approved by: https://github.com/pianpwk	2024-10-03 17:28:53 +00:00
Avik Chaudhuri	cd5d1fe015	unflatten with specialized graphs per submodule call (#137013 ) Previously we were making a fairly restrictive assumption when unflattening an exported program: for any submodule, we would assert that the graph of every call to that submodule must be the same. This assertion is load-bearing, i.e., if we simply remove the assertion then we can get incorrect results, as shown by the following example.
```
import torch

class N(torch.nn.Module):
    def forward(self, x, b):
        if b:
            return x + 1
        else:
            return x + 2

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.n = N()

    def forward(self, x):
        x0 = x + 3
        x1 = self.n(x0, True)
        x2 = x1 + 4
        x3 = self.n(x2, False)
        return x3 + 5

m = M()
inp = (torch.ones(1),)
print(m(*inp))  # tensor([16.])
ep = torch.export.export(m, inp)
print(ep.module()(*inp))  # tensor([16.])
unflattened = torch.export.unflatten(ep)
print(unflattened(*inp))  # tensor([15.])
```
However, this goes against the spirit of specializing graphs when exporting: we should expect that for every call to a submodule we might generate a different graph. The goal of this PR is to fix unflattening to handle multiple specialized graphs corresponding to multiple calls to the same submodule. The idea is simple: for every call to a child module `foo`, we will create potentially different child modules `foo`, `foo@1`, `foo@2`, etc. and use those names as targets in `call_module` instructions in the parent graph. An immediate consequence of this is that the list of fqns in an unflattened module may not be the same as in an exported module. Note that all these variants share the same parameters / buffers, so that multiple calls to the same submodule can share state as expected. However, as described so far this scheme may end up with needlessly many submodules. Thus, between calls to the same submodule, if graphs are equal then we optimize away the extra submodules and reuse call names as much as possible. 
Moreover, when submodules are shared across fqns, we also try to de-duplicate graphs corresponding to their calls as much as possible. Note that no matter what, information about which submodule was called is still preserved, so that if a submodule has to be swapped with another, one can still find all calls to the former submodule and replace them with calls to the latter. A note on the choice of naming scheme for call names: instead of generating "sibling" modules `foo@1`, `foo@2`, etc. for `foo`, we had considered generating "children" modules `foo._1`, `foo._2`, etc. of `foo`. However this can cause spurious cycles when de-duplicating graphs. E.g., suppose that `foo` is an alias for `bar._1` and `foo._1` is an alias for `bar`, then we must either introduce a cycle or drop the opportunity to optimize. Another idea would be to make `foo` a dummy module that contains `foo._0` corresponding to the first call, but this necessitates too many changes to existing tests and hurts the common case. Differential Revision: D63642479 Pull Request resolved: https://github.com/pytorch/pytorch/pull/137013 Approved by: https://github.com/pianpwk	2024-10-03 00:55:44 +00:00
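The naming-plus-deduplication idea in the commit above can be sketched in pure Python. This is an illustration under stated assumptions, not the unflattening code itself: graphs are represented as plain strings, `assign_call_names` is a hypothetical helper, and the way suffix numbers are chosen here (by order of first appearance of a distinct graph) is an assumption that may differ from what the PR does.

```python
def assign_call_names(fqn, call_graphs):
    """Give each call to `fqn` a target name, reusing names for equal graphs.

    `call_graphs` holds one hashable graph representation per call, in call
    order. The first distinct graph keeps the plain name; later distinct
    graphs get `fqn@1`, `fqn@2`, and so on (numbering is an assumption).
    """
    name_for_graph = {}
    names = []
    for graph in call_graphs:
        if graph not in name_for_graph:
            index = len(name_for_graph)
            name_for_graph[graph] = fqn if index == 0 else f"{fqn}@{index}"
        names.append(name_for_graph[graph])
    return names


# The two calls to `n` in the example specialize to different graphs.
specialized = assign_call_names("n", ["x + 1", "x + 2"])

# Equal graphs are deduplicated, so no extra submodule is needed.
identical = assign_call_names("n", ["x + 1", "x + 1"])

# A later call may reuse the name of an earlier, equal graph.
mixed = assign_call_names("n", ["x + 1", "x + 2", "x + 1"])
```

Using sibling names such as `n@1` rather than child names keeps every variant at the same level of the module hierarchy, which is the property the commit relies on to avoid spurious cycles when aliases are de-duplicated.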
Tugsbayasgalan Manlaibaatar	73b07df042	Preserve custom ops via run_decomps (#136882 ) This is a re-apply of https://github.com/pytorch/pytorch/pull/136773. Note that this doesn't completely remove the _preserve_ops list from export, mainly because we want a small change to address failing executorch tests. All the complications included in this PR are deleted in the next PR. Differential Revision: [D63553086](https://our.internmc.facebook.com/intern/diff/D63553086/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/136882 Approved by: https://github.com/bdhirsh	2024-10-01 17:38:00 +00:00
Pian Pawakapan	cc2a66c55e	[export] hook up mark_dynamic to export Dims (#137029 ) Adds Dim.DYNAMIC which calls torch._dynamo.mark_dynamic() in the backend. Similar to Dim.AUTO in that it does automatic inference for ranges & relations, but errors out for specializations. Pull Request resolved: https://github.com/pytorch/pytorch/pull/137029 Approved by: https://github.com/avikchaudhuri	2024-10-01 17:05:09 +00:00
Pian Pawakapan	f0a92541fe	[export] fix lifted constants order for 0-input graphs (#136658 ) Summary: With empty graphs, the `graph.inserting_before(first_user_input = None)` call turns into a `graph.inserting_after(root)` call, inverting the order of constant input nodes being inserted. This fixes the issue by initializing to the first node in the graph (still valid if not a user input - only used for insertion). Test Plan: test_export Differential Revision: D63403514 Pull Request resolved: https://github.com/pytorch/pytorch/pull/136658 Approved by: https://github.com/avikchaudhuri	2024-09-26 17:44:24 +00:00
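The order inversion described in the commit above is easy to reproduce with a plain Python list standing in for the graph's node list. The helpers below are hypothetical and do not use `torch.fx`; they only show why repeatedly inserting after a fixed node reverses the inserted items, while repeatedly inserting before a fixed node preserves their order.

```python
def insert_before(nodes, anchor, new):
    # Place `new` immediately before `anchor`.
    nodes.insert(nodes.index(anchor), new)


def insert_after(nodes, anchor, new):
    # Place `new` immediately after `anchor`.
    nodes.insert(nodes.index(anchor) + 1, new)


constants = ["c0", "c1", "c2"]

# Anchoring on the first real node keeps the constants in order.
ordered = ["root", "output"]
for c in constants:
    insert_before(ordered, "output", c)

# Falling back to "insert after root" puts each new constant in front of
# the previously inserted ones, so the final order is reversed.
inverted = ["root", "output"]
for c in constants:
    insert_after(inverted, "root", c)
```

This matches the fix described in the commit: anchoring on the first node in the graph, even when it is not a user input, keeps the "insert before" behavior.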
Tugsbayasgalan (Tugsuu) Manlaibaatar	e4e83a4ac4	Remove aten.item hack (#136663 ) Summary: Title Test Plan: CI Differential Revision: D63404353 Pull Request resolved: https://github.com/pytorch/pytorch/pull/136663 Approved by: https://github.com/bdhirsh	2024-09-26 17:14:48 +00:00
Tugsbayasgalan (Tugsuu) Manlaibaatar	0b38fa154a	Fix meta registry in export (#136492 ) Summary: Title Test Plan: CI This fixes some breaking tests in executorch. I think the root cause is that when we have aten::matmul, which we are not preserving, we register the meta implementation from the C++ side. It seems like the C++ kernel doesn't work well with a mix of FakeTensor and real tensors. This PR sidesteps the problem by always preferring the Python CIA decomp over the C++ CIA decomp. Differential Revision: D63297050 Pull Request resolved: https://github.com/pytorch/pytorch/pull/136492 Approved by: https://github.com/bdhirsh	2024-09-25 17:53:02 +00:00
Pian Pawakapan	7c6d543a5b	[export] fix _get_non_persistent_buffers for duplicates (#136552 ) Summary: Export's method _get_non_persistent_buffers doesn't check duplicate submodules, so we run into state_dict related issues if non-persistent buffers exist on shared submodules. Test Plan: test_export Differential Revision: D63332976 Pull Request resolved: https://github.com/pytorch/pytorch/pull/136552 Approved by: https://github.com/avikchaudhuri, https://github.com/tugsbayasgalan	2024-09-25 16:46:31 +00:00
Tugsbayasgalan Manlaibaatar	1904b09e61	Create export_for_inference API and expose core_aten as public facing API (#135912 ) Differential Revision: [D62606908](https://our.internmc.facebook.com/intern/diff/D62606908) Pull Request resolved: https://github.com/pytorch/pytorch/pull/135912 Approved by: https://github.com/avikchaudhuri ghstack dependencies: #135080	2024-09-15 17:05:07 +00:00
Tugsbayasgalan Manlaibaatar	382fad58b3	Deprecate _preserve_ops and consolidate with decomp_table (#135080 ) In this PR, we deprecate the _preserve_ops feature in the run_decompositions API. We can't kill this API completely because the Executorch team depends on it. As syncing between the two repos is non-trivial, I just leave this argument as deprecated for now. In the next PR, I will immediately remove it. After this PR, run_decompositions will only decompose what's inside the decomp table and preserve the rest by default. Note that this feature is only rolled out to OSS for now. The old code path is protected under the IS_FBCODE flag. Differential Revision: [D62163161](https://our.internmc.facebook.com/intern/diff/D62163161/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/135080 Approved by: https://github.com/justinchuby, https://github.com/avikchaudhuri, https://github.com/bdhirsh	2024-09-15 17:01:58 +00:00
Pian Pawakapan	6df91b5917	real tensor prop for composite ops (#135717 ) Fixes #135632 Adds real tensor propagation for decompositions, checking any symbols on their outputs Pull Request resolved: https://github.com/pytorch/pytorch/pull/135717 Approved by: https://github.com/ezyang	2024-09-13 03:35:16 +00:00
Aaron Orenstein	8c356ce3da	Fix lint errors in fbcode (#135614 ) Summary: Fixed a bunch of fbcode imports that happened to work but confused autodeps. After this autodeps still suggests "improvements" to TARGETS (which breaks our builds) but at least it can find all the imports. Test Plan: ``` fbpython fbcode/tools/build/buck/linters/lint_autoformat.py --linter=autodeps --default-exec-timeout=1800 -- fbcode/caffe2/TARGETS fbcode/caffe2/test/TARGETS ``` Before: ``` ERROR while processing caffe2/test/TARGETS: Cannot find an owner for "test_export" (from caffe2/test/export/testing.py:229) when processing rule "test_export". Please make sure it's listed in the srcs parameter of another rule. See https://fbur$ ERROR while processing caffe2/test/TARGETS: Cannot find an owner for "testing" (from caffe2/test/export/test_export.py:87) when processing rule "test_export". Please make sure it's listed in the srcs parameter of another rule. See https://fburl$ ERROR while processing caffe2/test/TARGETS: Cannot find an owner for "test_export" (from caffe2/test/export/test_serdes.py:9) when processing rule "test_export". Please make sure it's listed in the srcs parameter of another rule. See https://fb$ ERROR while processing caffe2/test/TARGETS: Cannot find an owner for "testing" (from caffe2/test/export/test_serdes.py:10) when processing rule "test_export". Please make sure it's listed in the srcs parameter of another rule. See https://fburl$ ERROR while processing caffe2/test/TARGETS: Cannot find an owner for "testing" (from caffe2/test/export/test_retraceability.py:7) when processing rule "test_export". Please make sure it's listed in the srcs parameter of another rule. See https:$ ERROR while processing caffe2/test/TARGETS: Cannot find an owner for "test_export" (from caffe2/test/export/test_retraceability.py:6) when processing rule "test_export". Please make sure it's listed in the srcs parameter of another rule. 
See ht$ ERROR while processing caffe2/test/TARGETS: Cannot find an owner for "testing" (from caffe2/test/export/test_export_nonstrict.py:7) when processing rule "test_export". Please make sure it's listed in the srcs parameter of another rule. See http$ ERROR while processing caffe2/test/TARGETS: Cannot find an owner for "test_export" (from caffe2/test/export/test_export_nonstrict.py:6) when processing rule "test_export". Please make sure it's listed in the srcs parameter of another rule. See $ ERROR while processing caffe2/test/TARGETS: Cannot find an owner for "test_export" (from caffe2/test/export/test_export_training_ir_to_run_decomp.py:8) when processing rule "test_export". Please make sure it's listed in the srcs parameter of an$ ERROR while processing caffe2/test/TARGETS: Cannot find an owner for "testing" (from caffe2/test/export/test_export_training_ir_to_run_decomp.py:10) when processing rule "test_export". Please make sure it's listed in the srcs parameter of anoth$ ERROR while processing caffe2/test/TARGETS: Found "//python/typeshed_internal:typeshed_internal_library" owner for "cv2" but it is protected by visibility rules: [] (from caffe2/test/test_bundled_images.py:7) when processing rule "test_bundled_$ ERROR while processing caffe2/test/TARGETS: Cannot find an owner for "caffe2.test.profiler_test_cpp_thread_lib" (from caffe2/test/profiler/test_cpp_thread.py:29) when processing rule "profiler_test_cpp_thread". Please make sure it's listed in t$ ERROR while processing caffe2/test/TARGETS: Cannot find an owner for "torch._utils_internal.get_file_path_2" (from caffe2/test/test_custom_ops.py:23) when processing rule "custom_ops". Please make sure it's listed in the srcs parameter of anoth$ ERROR while processing caffe2/test/TARGETS: Cannot find an owner for "torch._utils_internal.get_file_path_2" (from caffe2/test/test_public_bindings.py:13) when processing rule "public_bindings". 
Please make sure it's listed in the srcs paramete$ ERROR while processing caffe2/test/TARGETS: Cannot find an owner for "torch._C._profiler.symbolize_tracebacks" (from caffe2/test/test_cuda.py:3348) when processing rule "test_cuda". Please make sure it's listed in the srcs parameter of another $ ERROR while processing caffe2/test/TARGETS: Cannot find an owner for "torch._C._profiler.gather_traceback" (from caffe2/test/test_cuda.py:3348) when processing rule "test_cuda". Please make sure it's listed in the srcs parameter of another rule$ ERROR while processing caffe2/test/TARGETS: Cannot find an owner for include <torch/csrc/autograd/profiler_kineto.h> (from caffe2/test/profiler/test_cpp_thread.cpp:2) when processing profiler_test_cpp_thread_lib. Some things to try: ``` Differential Revision: D62049222 Pull Request resolved: https://github.com/pytorch/pytorch/pull/135614 Approved by: https://github.com/oulgen, https://github.com/laithsakka	2024-09-13 02:04:34 +00:00
Pian Pawakapan	b897ab0540	[export] ignore mark_dynamic() in export (#135536 ) Previously we were accommodating `torch._dynamo.mark_dynamic()` for export's dynamic shapes. Here we clean things up and ignore it, requiring users to specify an export input for `dynamic_shapes`. Note: there are 4 decorators relevant to export: `mark_dynamic, maybe_mark_dynamic, mark_static, mark_unbacked`. User calls that involve export have only been `mark_dynamic()`, and we use `maybe_mark_dynamic` under the hood for `Dim.AUTO`, but we could start using others. One reason I decided not to warn and just silently ignore is that these decorators cause the tensors to carry dynamic info, and it'll be hard to tell whether the markers are from export or user calls when re-exporting with the same inputs. Pull Request resolved: https://github.com/pytorch/pytorch/pull/135536 Approved by: https://github.com/avikchaudhuri	2024-09-12 21:22:19 +00:00
Shangdi Yu	ad75b09d89	Replace capture_pre_autograd_graph with export_for_training in torch tests (#135623 ) Summary: as title Test Plan: ``` buck2 run 'fbcode//mode/dev-nosan' fbcode//caffe2/test:test_export -- -r test_conv_dynamic buck2 run 'fbcode//mode/dev-nosan' fbcode//caffe2/test:fx -- -r matcher buck2 run 'fbcode//mode/dev-nosan' fbcode//caffe2/test/quantization:test_quantization -- -r x86 ``` CI Differential Revision: D62448302 Pull Request resolved: https://github.com/pytorch/pytorch/pull/135623 Approved by: https://github.com/tugsbayasgalan	2024-09-11 19:23:08 +00:00
Avik Chaudhuri	6546c6186d	do not raise when flatten_fn_with_keys not found when suggesting fixes (#135518 ) Test Plan: added test Differential Revision: D62395371 Pull Request resolved: https://github.com/pytorch/pytorch/pull/135518 Approved by: https://github.com/zhxchen17	2024-09-10 03:47:36 +00:00
Yidi Wu	993b5647ab	[export] fix placeholder name collision tests by removing map call (#135366 ) The test is failing because map is currently in an unstable state: unlike cond and while_loop, torch.compile and non-strict export take two separate routes. This PR fixes the test itself; we'll fix map in follow-up PRs. Pull Request resolved: https://github.com/pytorch/pytorch/pull/135366 Approved by: https://github.com/angelayi	2024-09-06 22:02:50 +00:00
Pian Pawakapan	177e4f4218	remove _check call on item() for torch.istft (#135234 ) Fixes #135014 Pull Request resolved: https://github.com/pytorch/pytorch/pull/135234 Approved by: https://github.com/tugsbayasgalan	2024-09-06 17:31:25 +00:00
Avik Chaudhuri	de74aafff4	error on exporting ScriptModule (#135302 ) Test Plan: added test Differential Revision: D62279179 Pull Request resolved: https://github.com/pytorch/pytorch/pull/135302 Approved by: https://github.com/yushangdi	2024-09-06 15:12:40 +00:00
Edward Z. Yang	d0591f4658	Ignore fresh unbacked when doing recursive make_fx inside HOPs (#135053 ) Internal xref: https://fb.workplace.com/groups/6829516587176185/posts/7705964779531357/ This now also incorporates a test from https://github.com/pytorch/pytorch/pull/133585 (which it fixes) and the prep PR https://github.com/pytorch/pytorch/pull/134407 Including the PR desc from that: I am trying to fix a problem reported by user in [fb.workplace.com/groups/6829516587176185/permalink/7705964779531357](https://fb.workplace.com/groups/6829516587176185/permalink/7705964779531357/) The summary of this problem is that when we do collect metadata analysis in AOTAutograd, we accumulate pending unbacked symbols which are going to be discarded at the end of the trace. However, if we do a recursive make_fx inside tracing, as occurs with torch.cond, we end up seeing that there are pending unbacked symbols that aren't associated with a binding, even though it's spurious (they've leaked into the inner make_fx call from the outer AOTAutograd analysis). In https://github.com/pytorch/pytorch/pull/133588 I tried to just prevent adding the symbols to the pending list at all in the first place. But this itself caused some problems which were fixed in https://github.com/pytorch/pytorch/pull/124785 . The problem fixed in that PR is that when we allocate tangents that have unbacked size, something prevented them from having correct unbacked SymInts when ignore fresh unbacked SymInts was enabled. So I had patched it at the time by just not suppressing pending symbols and clearing them out some other way. I think... I was wrong in that PR? That is to say, it was OK to avoid putting the fresh unbacked symbols in the pending list; the real problem was suppressing unbacked renamings. But there doesn't seem to be a good reason to suppress these; this PR shows that it doesn't actually fail any tests if you do these anyway. 
Intuitively, this makes sense, because you can't trigger renamings unless you're actually adding unbacked symbols to the pending set. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/135053 Approved by: https://github.com/ydwu4	2024-09-06 13:13:15 +00:00
Avik Chaudhuri	43f4947d44	fix fake tensor tolist implementation (#135131 ) Summary: When exporting for training with `tolist`, we do not hit `FunctionalTensor.tolist` since we do not functionalize. Unfortunately, this means we hit `FakeTensor.tolist`, which creates unbacked symints that are not backed by proxies. Rather than trying to patch up this low-level implementation, we replace it with essentially what `FunctionalTensor.tolist` does, which is higher-level: we essentially desugar to `item()` calls and let it take care of unbacked symints. Test Plan: Some expected failures are gone now. Also found a test for `tolist` that was written when `FunctionalTensor.tolist` was implemented but not really doing much; repurposed it now to exercise more modes. Differential Revision: D62197742 Pull Request resolved: https://github.com/pytorch/pytorch/pull/135131 Approved by: https://github.com/ezyang	2024-09-05 23:20:31 +00:00
Yidi Wu	38fead8f7c	[hop] preserve metadata in re-tracing hop subgraph by running with interpreter (#135159 ) This way, `interpreter.run` correctly preserves the current metadata of the subgraphs while tracing them. Pull Request resolved: https://github.com/pytorch/pytorch/pull/135159 Approved by: https://github.com/tugsbayasgalan	2024-09-05 21:36:56 +00:00
Jack Zhang	8a5c8e5db9	Update unbacked symints in masked_select more precisely (#134899 ) ## Summary At the moment, the fake impl for `masked_select` simply sets the upper range of its size-like SymInt to `sys.maxsize` (9223372036854775807, the max value of a signed int64) if there are any SymInts in the original input tensor shape. This PR constrains the range more intelligently by using the upper ranges of each SymInt in the input tensor shape. This solves an issue where a model being lowered to ExecuTorch errors during memory planning because the memory allocated for `masked_select` ended up exceeding the 64-bit address space (`INT_MAX * size(dtype)`). ## Test plan - Passes existing unit tests (test case where upper bound is inf) - Added unit test to verify upper bound reduction calculation - Tested end-to-end by exporting with TORCH_LOGS="export" and ensuring that the range for `masked_select`'s SymInt size has the correct upper bound Pull Request resolved: https://github.com/pytorch/pytorch/pull/134899 Approved by: https://github.com/ezyang	2024-09-05 09:01:06 +00:00
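The bound described in the `masked_select` commit above can be illustrated with a small standalone sketch. This is not the PyTorch fake impl: the function name `masked_select_upper_bound` and the convention of passing `None` for an unbounded dimension are assumptions made for illustration. It relies only on the fact that `masked_select` can return at most as many elements as the input contains, so the product of the per-dimension upper bounds is a valid upper bound on the output size.

```python
import math
import sys


def masked_select_upper_bound(dim_upper_bounds):
    """Return an upper bound on the number of elements masked_select can yield.

    dim_upper_bounds holds one entry per input dimension: an int upper bound,
    or None when that dimension has no known bound (hypothetical convention).
    """
    # If any dimension is unbounded, fall back to the old behaviour.
    if any(ub is None for ub in dim_upper_bounds):
        return sys.maxsize
    # Otherwise the output cannot exceed the total element count of the input.
    return min(math.prod(dim_upper_bounds), sys.maxsize)


print(masked_select_upper_bound([16, 4]))
print(masked_select_upper_bound([None, 4]) == sys.maxsize)
```

Capping at `sys.maxsize` keeps the result within the range the original fallback used, even when the product of large bounds would overflow it.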
Tugsbayasgalan Manlaibaatar	9d705605dd	Fix decomp behaviour in export training IR (#134801 ) Subset of changes in https://github.com/pytorch/pytorch/pull/132901; the previous PR can't land because it is too complicated. The rest of the change will be implemented as a follow-up after the export design meeting. This part just makes the training IR -> inference IR decomp take the same path as normal export. Differential Revision: [D62000525](https://our.internmc.facebook.com/intern/diff/D62000525) Pull Request resolved: https://github.com/pytorch/pytorch/pull/134801 Approved by: https://github.com/avikchaudhuri, https://github.com/angelayi	2024-09-05 06:37:44 +00:00
Pian Pawakapan	7b280c31ba	[export] dynamic_shapes serialization, load/dump (#134718 ) Adds utility functions `_dump_dynamic_shapes` and `_load_dynamic_shapes`. - `_dump_dynamic_shapes`: dynamic shapes spec -> serialized format: - takes in the `dynamic_shapes` pytree object you'd feed into `export()`, and dumps into serialized format - `_load_dynamic_shapes`: serialized format -> dynamic shapes spec - takes the serialized format, and produces a `dynamic_shapes` object you feed into `export()` For example with dumping: ``` dx = Dim("dx", min=4, max=16) dy = dx + 1 inputs = ( [ torch.randn(4, 4), torch.randn(5, 4), ], torch.randn(4), torch.randn(4, 4), "hello", ) dynamic_shapes = { "a": [ (dx, 4), (dy, 4), ], "b": (Dim.AUTO,), "c": None, "d": None, } out = _dump_dynamic_shapes(dynamic_shapes, inputs) ``` would generate the following output: ``` DynamicShapesSpec( dynamic_shapes=( [ ['dx', 4], ['dx + 1', 4], ], ['_DimHint.STATIC'], ['_DimHint.STATIC', '_DimHint.STATIC'], None, ), dims={ 'dx': RootDim( min=4, max=16, derived=['dx + 1'], ), }, ) ``` The serialized format contains 2 keys, `dynamic_shapes` and `dims`. - `dynamic_shapes` is the pytree structure matching the input to `export()`, with strings in place of Dim names and enums, and ints/Nones otherwise. Each tensor is represented with a list of shapes, non-tensors with Nones. - `dims` contains min/max range and derived dims info for each root dim. The test cases show some roundtrippability guarantees for these functions. Definitely taking naming suggestions for them :) Follow up: utility function to extract serializable format from ExportedProgram. Pull Request resolved: https://github.com/pytorch/pytorch/pull/134718 Approved by: https://github.com/avikchaudhuri	2024-09-05 05:39:44 +00:00
PyTorch MergeBot	7858045491	Revert "Fix set_unbacked_bindings when list of Tensors is returned (#133585 )" This reverts commit `2a49296d75`. Reverted https://github.com/pytorch/pytorch/pull/133585 on behalf of https://github.com/ezyang due to fails torchrec tests ([comment](https://github.com/pytorch/pytorch/pull/133585#issuecomment-2329602983))	2024-09-04 17:21:32 +00:00
PyTorch MergeBot	fc07e6bf56	Revert "Ignore fresh unbacked when doing recursive make_fx inside HOPs (#135053 )" This reverts commit `a178a053ad`. Reverted https://github.com/pytorch/pytorch/pull/135053 on behalf of https://github.com/ezyang due to need to back out https://github.com/pytorch/pytorch/pull/133585 ([comment](https://github.com/pytorch/pytorch/pull/134407#issuecomment-2329597388))	2024-09-04 17:18:21 +00:00
Edward Z. Yang	a178a053ad	Ignore fresh unbacked when doing recursive make_fx inside HOPs (#135053 ) Internal xref: https://fb.workplace.com/groups/6829516587176185/posts/7705964779531357/ I'm not sure this is the right approach though... Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/135053 Approved by: https://github.com/ydwu4 ghstack dependencies: #134407	2024-09-04 13:25:08 +00:00
Avik Chaudhuri	9f00317997	rationalize STATIC vs. None (#134877 ) Summary: A bit of refactoring to prepare to remove `None` as a way to specify static dimensions in dynamic shapes, given we already have `Dim.STATIC` for the same purpose. We will now warn whenever this happens. However no tests were modified because problematic uses of `None` still need to behave as they do today, until we are ready to remove support. It should be easy to port tests by replacing the warning function to raise instead. Note that other uses of `None`, such as for entire values (tensor or non-tensor) remain as is. Moving forward this should be the only purpose of `None` (at least externally). Finally, there's a bit of confusion in our representation now because `AUTO` also internally transforms to `None`. Renamed dynamic_shapes to transformed_dynamic_shapes where this happens. Overall the two forms (pre and post transformation) have different properties so should probably not be represented in the same format in the future. Test Plan: existing Differential Revision: D62040729 Pull Request resolved: https://github.com/pytorch/pytorch/pull/134877 Approved by: https://github.com/pianpwk	2024-09-04 05:34:26 +00:00

461 Commits