pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
Tugsbayasgalan Manlaibaatar	7e7e5698cc	Suppress more warnings (#149833 ) Differential Revision: [D71702307](https://our.internmc.facebook.com/intern/diff/D71702307) Pull Request resolved: https://github.com/pytorch/pytorch/pull/149833 Approved by: https://github.com/malfet, https://github.com/Skylion007	2025-04-01 05:33:04 +00:00
Avik Chaudhuri	6237495fcf	torch.Size input (#149414 ) Summary: Support for `torch.Size` inputs was patchy before because `unflatten_fn` for this type returned a tuple. This PR cleans this up. Fixes #149158 Test Plan: added test Differential Revision: D71403635 Pull Request resolved: https://github.com/pytorch/pytorch/pull/149414 Approved by: https://github.com/yushangdi	2025-03-20 16:23:13 +00:00
Aditya Tiwari	bb9c426024	Typo Errors fixed in multiple files (#148262 ) # Fix typo errors across PyTorch codebase This PR fixes various spelling errors throughout the PyTorch codebase to improve documentation quality and code readability. ## Changes Made ### Documentation Fixes - Changed "seperate" to "separate" in multiple files: - `setup.py`: Build system documentation - `torch/_library/triton.py`: AOT compilation comments - `torch/csrc/dynamo/compiled_autograd.h`: Node compilation documentation - `torch/export/_unlift.py`: Pass population comments - `torch/export/exported_program.py`: Decomposition table notes ### Code Comments and Error Messages - Changed "occured" to "occurred" in: - `test/mobile/test_lite_script_module.py`: Exception handling comments - `torch/export/_draft_export.py`: Error message text - `aten/src/ATen/native/cuda/linalg/BatchLinearAlgebra.cpp`: MAGMA bug comment - `torch/csrc/utils/python_numbers.h`: Overflow handling comment - `torch/csrc/jit/OVERVIEW.md`: Graph compilation documentation - `torch/_dynamo/symbolic_convert.py`: Error explanation ### API Documentation - Changed "fullfill" to "fulfill" in `torch/distributed/checkpoint/state_dict_loader.py` - Changed "accross" to "across" in: - `torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp` - `torch/distributed/distributed_c10d.py` ## Motivation These changes improve code readability and maintain consistent spelling throughout the codebase. No functional changes were made; this is purely a documentation and comment improvement PR. ## Test Plan No testing required as these changes only affect comments and documentation. Pull Request resolved: https://github.com/pytorch/pytorch/pull/148262 Approved by: https://github.com/janeyx99 Co-authored-by: Jane (Yuan) Xu <31798555+janeyx99@users.noreply.github.com>	2025-03-09 12:21:40 +00:00
Aaron Orenstein	b6c5562c1f	PEP585 update - torch/export (#145165 ) See #145101 for details. Pull Request resolved: https://github.com/pytorch/pytorch/pull/145165 Approved by: https://github.com/bobrenjc93	2025-01-19 20:56:55 +00:00
Zhengxu Chen	53256edff9	[export] Support module inputs for non strict mode. (#143925 ) Summary: Add experimental support for torch.nn.Module as input types. Before this change, we don't support module inputs but recently we saw some interesting use cases like gpt-fast https://github.com/pytorch-labs/gpt-fast/blob/main/generate.py#L68 where we directly pass in a module input for different variants of the same models. Since we don't really care about non-param or non-buffer states in non strict mode, we don't care about those either and pretend they are like plain constants during tracing. We treat any module input like a nested container of tensor, and each time we will automatically register a pytree handler for these module types to flatten its state dict into a group of tensors. We will just inline any module method call during tracing like we did for `self` module in export_for_training. This will make input modules' behavior very similar to the training module in typical case, except that we don't record the inputs as parameter or buffers but rather just plain user inputs. Test Plan: buck run mode/opt caffe2/test:test_export -- -r test_module_input Differential Revision: D67680827 Pull Request resolved: https://github.com/pytorch/pytorch/pull/143925 Approved by: https://github.com/tugsbayasgalan	2025-01-16 17:30:36 +00:00
Avik Chaudhuri	db51308d9c	fix output node name (#142506 ) Fixes #142227 Differential Revision: [D67043283](https://our.internmc.facebook.com/intern/diff/D67043283/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/142506 Approved by: https://github.com/ydwu4	2024-12-11 17:28:28 +00:00
Fabian Keller	f472b3aee1	improve typings around torch.export (#141829 ) This is another follow-up to https://github.com/pytorch/pytorch/pull/115074 / https://github.com/pytorch/pytorch/pull/141240 following the strategy discussed there (https://github.com/pytorch/pytorch/pull/115074#issuecomment-2480992230). This PR improves the type annotations around `torch._export`. Even though the PR introduces a few runtime type asserts, the runtime behavior should stay equivalent, because the failed assertions should have been immediate crashes anyway. CC @Skylion007 @ezyang Pull Request resolved: https://github.com/pytorch/pytorch/pull/141829 Approved by: https://github.com/ezyang	2024-12-03 19:57:21 +00:00
angelayi	cb6a21b033	[export] Add setattr for ep.example_inputs (#140990 ) Differential Revision: [D66136725](https://our.internmc.facebook.com/intern/diff/D66136725) Pull Request resolved: https://github.com/pytorch/pytorch/pull/140990 Approved by: https://github.com/yushangdi, https://github.com/ydwu4	2024-11-20 02:49:20 +00:00
Tugsbayasgalan Manlaibaatar	e080c89bdc	Make test_torchbind.py training IR compatible (#138658 ) In this diff, I make the test_torchbind.py tests handle training IR. Today in the training IR, we don't see the effect token and HOP because this happens at the FunctionalTensorMode. Maybe in the future, we should move this logic up to the training IR so that writing passes etc. on training IR is safer. But for migration purposes, I think it is OK for now. I also fixed three issues: 1. ep.module() doesn't register all aliased constants in the module. 2. When we retrace, we need to fakify the original Torchbind object. 3. We don't run any DCE on training IR so we need to add some more torch ops to the verifier. Differential Revision: [D64853530](https://our.internmc.facebook.com/intern/diff/D64853530) Pull Request resolved: https://github.com/pytorch/pytorch/pull/138658 Approved by: https://github.com/ydwu4, https://github.com/zhxchen17	2024-11-04 17:43:11 +00:00
Aaron Orenstein	07cc4bd3e2	typing compile_fx.py (#138033 ) Type annotations for compile_fx. - Some of the stuff here is pretty complicated (functions which return functions that take functions) so I bailed on those and used `Any` just to get the rest landed. - There are also changes to type signatures in other files which I did just to let mypy know more about the types in compile_fx.py. Pull Request resolved: https://github.com/pytorch/pytorch/pull/138033 Approved by: https://github.com/Skylion007	2024-10-21 18:14:59 +00:00
Tugsbayasgalan Manlaibaatar	f3c3f3a3c3	Fix assigning tensor with requires_grad as constant in export (#137997 ) When we insert constants into the unlifted graph, we need to detach them if they require grad, but when we detach we need to preserve the original aliasing information. Differential Revision: [D64406859](https://our.internmc.facebook.com/intern/diff/D64406859/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/137997 Approved by: https://github.com/avikchaudhuri	2024-10-17 06:41:10 +00:00
Tugsbayasgalan Manlaibaatar	bb31e3f57e	Add original forward names to schema so that prettify pass works (#136887 ) When we run_decomp, we retrace if it is training IR. As a result, we need to reliably store the original forward names when we run decomp. Differential Revision: [D63064453](https://our.internmc.facebook.com/intern/diff/D63064453/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/136887 Approved by: https://github.com/angelayi	2024-10-08 04:21:02 +00:00
angelayi	fa9cd46d12	[export] Update swap's forward function (#137102 ) Downstream APS code was failing to run the previously swapped module because of some fx.GraphModule forward function weirdness (P1594789677). So to fix this, I just attached a custom forward function which matches the unflattened module's forward function. Differential Revision: [D63683422](https://our.internmc.facebook.com/intern/diff/D63683422/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/137102 Approved by: https://github.com/avikchaudhuri ghstack dependencies: #136191	2024-10-06 04:25:36 +00:00
Tugsbayasgalan Manlaibaatar	d2d14d14e3	[RELAND] Fix unlift to preserve aliased constants (#137310 ) Differential Revision: [D63864743](https://our.internmc.facebook.com/intern/diff/D63864743) Pull Request resolved: https://github.com/pytorch/pytorch/pull/137310 Approved by: https://github.com/avikchaudhuri	2024-10-04 18:15:52 +00:00
PyTorch MergeBot	525f6715bc	Revert "Fix unlift to unblock training IR + run_decomp on aliasing constants (#137162 )" This reverts commit `f96020c246`. Reverted https://github.com/pytorch/pytorch/pull/137162 on behalf of https://github.com/jovianjaison due to Sorry for reverting your changes but many jobs are failing with NameError: name _recursive_getattr is not defined + a Lint job fails ([comment](https://github.com/pytorch/pytorch/pull/137162#issuecomment-2392036062))	2024-10-03 18:17:56 +00:00
Tugsbayasgalan Manlaibaatar	f96020c246	Fix unlift to unblock training IR + run_decomp on aliasing constants (#137162 ) When we populate unlifted graph module, we actually only "unlift" constant tensor inputs which is problematic because export de-duplicates aliasing constants. As a result, we only register one constant instead of two constants. This PR fixes that by querying ep.constants table instead of ep.graph_signature.lifted_tensor_constants. Differential Revision: [D63743111](https://our.internmc.facebook.com/intern/diff/D63743111) Pull Request resolved: https://github.com/pytorch/pytorch/pull/137162 Approved by: https://github.com/pianpwk	2024-10-03 17:28:53 +00:00
Yidi Wu	c44cb89e06	[export] detach constant tensors when they're not registered as buffer or parameter in unlift (#133031 ) Summary: Fixes T198245910. In the previous diff D60532628, which caused the test failure, we fixed the inconsistency caused by constant tensors being accidentally registered as buffers by deleting the buffer and reassigning them as constants. However, this broke several existing tests in pyspeech: when the exported program is re-traced with torch.jit.trace (which is an anti-pattern we probably should have some alignment on), the jit tracer finds this constant tensor requiring grad and errors out. This PR forces constant attrs to not require grad, which is the correct behavior. A better fix is finding out where the constants are created in user code and why they require grad. But this has low ROI so we warn the user about it. Test Plan: See failures in T198245910. Differential Revision: D60974869 Pull Request resolved: https://github.com/pytorch/pytorch/pull/133031 Approved by: https://github.com/angelayi	2024-08-09 20:33:52 +00:00
Yidi Wu	bbf568aac8	Split of "[reland] [export] fix zero arg export in training_ir and constant tensor handling" (#132307 ) Summary: A re-land of D60006710. Fixed TrainingIRToRunDecomp failures for test_tensor_attribute_zero_args and also a few retraceability failures because run_decomposition does a retracing. edit: also remove the eliminate_dead_code() in _unlift because of one onnx test failure: a constant tensor attr was lifted as constant_tensor input but it's not used in the graph after aot_autograd due to a shortcut in its decomposition. This causes the setattr to be removed by eliminate_dead_code but the graph signature still contains the name of that buffer, which causes an inconsistency between the transformed graph and ep's original signature after _unlift. And it seems that this has happened a few times where some nodes are accidentally removed and we're in an inconsistent state. The alternative to removing it would be: every time we call eliminate_dead_code, we verify the consistency of the graph with 1. the graph before transformation and 2. all the metadata, but I think this deserves a complete design. edit 2: Also fix the inconsistency of graph signatures when param_constant is marked as lifted_tensor_constants but it's registered as parameters in the output of ep.module(). Differential Revision: D60532628 Pull Request resolved: https://github.com/pytorch/pytorch/pull/132307 Approved by: https://github.com/zhxchen17	2024-08-08 01:36:16 +00:00
Shangdi Yu	825002c9c6	[export][fx] More robust DCE pass (#132764 ) Summary: - make default DCE pass check schema, - need to rebase onto https://github.com/pytorch/pytorch/pull/131651 after it's in phabricator (for now the change is manually added). - mark Proxy dump as NotImplemented for better error msg - Remove Proxy from tensors when dumping models, as Proxy cannot be dumped. More details in https://docs.google.com/document/d/1G5vmTXjzxoyVGRI2kpA1gQukK_Glyg2NrE0Oh6Nlg9A/edit?usp=sharing. Test Plan: CI ``` - buck2 run 'fbcode//mode/dev-nosan' fbcode//caffe2/test/quantization:test_quantization -- -r qat_conv2d - test_export.py - buck2 run 'fbcode//mode/dev-nosan' fbcode//modai/test:test_modai -- -r test_qat_stinson_htp_export - buck2 run 'fbcode//mode/dev-nosan' fbcode//vizard_projects/ml_depth/tests:test_model -- -r test_qat_model_et - buck2 run 'fbcode//mode/dev-nosan' fbcode//caffe2/test:fx -- -r dce - buck2 run 'fbcode//mode/dev-nosan' fbcode//bolt/nn/executorch/backends/tests:qnn_test -- -r test_qat_bias=False,use_3d_input=False - buck2 run 'fbcode//mode/dev-nosan' fbcode//bolt/nn/executorch/backends/tests:qnn_test -- -r test_qat_bias=True,use_3d_input=False - buck2 run 'fbcode//mode/dev-nosan' fbcode//caffe2/test/quantization:test_quantization -- -r test_fold_bn_erases_bn_node ``` Reviewed By: angelayi Differential Revision: D60319175 Pull Request resolved: https://github.com/pytorch/pytorch/pull/132764 Approved by: https://github.com/angelayi	2024-08-06 22:27:22 +00:00
Xuehai Pan	f3fce597e9	[BE][Easy][17/19] enforce style for empty lines in import segments in `torch/[a-c]/` and `torch/[e-n]/` (#129769 ) See https://github.com/pytorch/pytorch/pull/129751#issue-2380881501. Most changes are auto-generated by linter. You can review these PRs via: ```bash git diff --ignore-all-space --ignore-blank-lines HEAD~1 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/129769 Approved by: https://github.com/ezyang	2024-08-04 10:24:09 +00:00
Yidi Wu	2c1851f04e	[export] fix output node's meta (#131706 ) Summary: This PR fixes all the places in the strict export stack where the output node's meta is not preserved correctly. However, we're getting a new error for the test we intend to fix: `buck2 run caffe2/test/quantization:test_quantization -- -r "test_re_export_preserve_handle"`: The `get_attr` nodes have wrong metadata. I guess there are more things that need to be fixed to get it working, but that's beyond the scope of this PR. Test Plan: buck2 run caffe2/test/quantization:test_quantization -- -r "test_re_export_preserve_handle" Differential Revision: D60198221 Pull Request resolved: https://github.com/pytorch/pytorch/pull/131706 Approved by: https://github.com/yushangdi	2024-07-25 18:44:21 +00:00
Shangdi Yu	29e2e2afb6	Revert D59561509: Multisect successfully blamed "D59561509: [FX][export] DCE pass, check schema for node impurity (#130395 )" for one test failure (#131341 ) Summary: This diff reverts D59561509 D59561509: [FX][export] DCE pass, check schema for node impurity (#130395) by yushangdi causes the following test failure: Tests affected: - [cogwheel:cogwheel_mtia_cmf_m5_shrunk_test#test_flow_with_verification](https://www.internalfb.com/intern/test/844425041436985/) Here's the Multisect link: https://www.internalfb.com/multisect/6533402 Here are the tasks that are relevant to this breakage: T191383430: 10+ tests unhealthy for ads_mtia_inference The backout may land if someone accepts it. If this diff has been generated in error, you can Commandeer and Abandon it. Test Plan: NA Differential Revision: D60029318 Pull Request resolved: https://github.com/pytorch/pytorch/pull/131341 Approved by: https://github.com/angelayi	2024-07-23 05:23:47 +00:00
PyTorch MergeBot	b9912f31ef	Revert "[export] fix zero arg export in training_ir (#130990 )" This reverts commit `50436d5bdb`. Reverted https://github.com/pytorch/pytorch/pull/130990 on behalf of https://github.com/clee2000 due to failing some executorch and torchrec tests internally D60006710 ([comment](https://github.com/pytorch/pytorch/pull/130990#issuecomment-2243395316))	2024-07-22 16:49:25 +00:00
Yidi Wu	50436d5bdb	[export] fix zero arg export in training_ir (#130990 ) Fixed TrainingIRToRunDecomp failures for test_tensor_attribute_zero_args and also a few retraceability failures because run_decomposition does a retracing. edit: also remove the eliminate_dead_code() in _unlift because of one onnx test failure: a constant tensor attr was lifted as constant_tensor input but it's not used in the graph after aot_autograd due to a shortcut in its decomposition. This causes the setattr to be removed by eliminate_dead_code but the graph signature still contains the name of that buffer, which causes an inconsistency between the transformed graph and ep's original signature after _unlift. And it seems that this has happened a few times where some nodes are accidentally removed and we're in an inconsistent state. The alternative to removing it would be: every time we call eliminate_dead_code, we verify the consistency of the graph with 1. the graph before transformation and 2. all the metadata, but I think this deserves a complete design. Pull Request resolved: https://github.com/pytorch/pytorch/pull/130990 Approved by: https://github.com/pianpwk	2024-07-20 02:35:13 +00:00
Shangdi Yu	27ded03545	[FX][export] DCE pass, check schema for node impurity (#130395 ) Change the default DCE pass to check node schema for impure nodes. Pull Request resolved: https://github.com/pytorch/pytorch/pull/130395 Approved by: https://github.com/angelayi, https://github.com/jgong5	2024-07-18 16:31:40 +00:00
PyTorch MergeBot	433ef4e444	Revert "[FX][export] DCE pass, check schema for node impurity (#130395 )" This reverts commit `e22b0acc76`. Reverted https://github.com/pytorch/pytorch/pull/130395 on behalf of https://github.com/yushangdi due to breaking tests, need to rebase and fix ([comment](https://github.com/pytorch/pytorch/pull/130395#issuecomment-2235192986))	2024-07-18 02:46:03 +00:00
Shangdi Yu	e22b0acc76	[FX][export] DCE pass, check schema for node impurity (#130395 ) Change the default DCE pass to check node schema for impure nodes. Pull Request resolved: https://github.com/pytorch/pytorch/pull/130395 Approved by: https://github.com/angelayi, https://github.com/jgong5	2024-07-18 00:55:20 +00:00
Shangdi Yu	ea4b80e6d6	[FX][export] strict DCE pass, check schema for node impurity (#130552 ) Fixes the failure in `test/export/test_export_training_ir_to_run_decomp.py ` caused by dead code elimination removing node with side effects. For background, in export, we may want to export higher-level IRs that are not functional, so we need to check for side effects more carefully. A call_function node is impure if it has at least one mutable argument. Fixed the tests below: test_to_module_with_mutated_buffer_multiple_update_sub_later test_export_input_mutation_static_shape test_buffer_util Another attempt modifying the original DCE pass is made in PR #130395, but it breaks some other tests, so here we add a flag and use it for export only. Pull Request resolved: https://github.com/pytorch/pytorch/pull/130552 Approved by: https://github.com/pianpwk	2024-07-12 15:43:27 +00:00
Wang, Eikan	1f302d6885	Support aten operations with out tensor (#124926 ) This PR intends to support the aten operations with the `out` tensor. Currently, the AOT compile always does NOT keep input tensor mutations. According to the comments, this is because it has not encountered such a use case. > For now there's no use case involving keeping input mutations in the graph (which we can only do in the inference case anyway). We can add this later if we need to. However, for aten operations, it is popular that the `out` tensor is an input parameter and needs to be mutated. This PR intends to support it by adding a `keep_inference_input_mutations` flag to `aot_inductor.keep_inference_input_mutations`. This flag can provide flexibility to the callee in deciding whether the AOT compile needs to keep input tensor mutations in the graph. Take `clamp` as an example as follows. ```python out_tensor = torch.randn(128, dtype=torch.float, device=device).fill_(-2.0) inp_tensor = torch.randn(128, dtype=torch.float, device=device).fill_(1.0) min_tensor = inp_tensor - 0.05 max_tensor = inp_tensor + 0.05 torch.clamp(input=inp_tensor, min=min_tensor, max=max_tensor, out=out_tensor) ``` W/O this PR ```python def forward(self): arg0_1: "f32[128]"; arg1_1: "f32[128]"; arg2_1: "f32[128]"; arg3_1: "f32[128]"; arg0_1, arg1_1, arg2_1, arg3_1, = fx_pytree.tree_flatten_spec([], self._in_spec) clamp_min: "f32[128]" = torch.ops.aten.clamp_min.Tensor(arg0_1, arg1_1); arg0_1 = arg1_1 = None clamp_max: "f32[128]" = torch.ops.aten.clamp_max.Tensor(clamp_min, arg2_1); clamp_min = arg2_1 = None return (clamp_max, clamp_max) ``` W/ this PR ```python def forward(self): arg0_1: "f32[128]"; arg1_1: "f32[128]"; arg2_1: "f32[128]"; arg3_1: "f32[128]"; arg0_1, arg1_1, arg2_1, arg3_1, = fx_pytree.tree_flatten_spec([], self._in_spec) clamp_min: "f32[128]" = torch.ops.aten.clamp_min.Tensor(arg0_1, arg1_1); arg0_1 = arg1_1 = None clamp_max: "f32[128]" = torch.ops.aten.clamp_max.Tensor(clamp_min, arg2_1); clamp_min = arg2_1 = None copy_: "f32[128]" = torch.ops.aten.copy_.default(arg3_1, clamp_max); arg3_1 = clamp_max = None return (copy_,) ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/124926 Approved by: https://github.com/jgong5, https://github.com/jansel, https://github.com/angelayi	2024-06-12 22:31:59 +00:00
PyTorch MergeBot	81e4e12f02	Revert "Support aten operations with out tensor (#124926 )" This reverts commit `cba195c8ed`. Reverted https://github.com/pytorch/pytorch/pull/124926 on behalf of https://github.com/clee2000 due to newly added test broke in internal D58444103. Test passed in OSS CI though ([comment](https://github.com/pytorch/pytorch/pull/124926#issuecomment-2163441547))	2024-06-12 16:20:04 +00:00
Wang, Eikan	cba195c8ed	Support aten operations with out tensor (#124926 ) This PR intends to support the aten operations with the `out` tensor. Currently, the AOT compile always does NOT keep input tensor mutations. According to the comments, this is because it has not encountered such a use case. > For now there's no use case involving keeping input mutations in the graph (which we can only do in the inference case anyway). We can add this later if we need to. However, for aten operations, it is popular that the `out` tensor is an input parameter and needs to be mutated. This PR intends to support it by adding a `keep_inference_input_mutations` flag to `aot_inductor.keep_inference_input_mutations`. This flag can provide flexibility to the callee in deciding whether the AOT compile needs to keep input tensor mutations in the graph. Take `clamp` as an example as follows. ```python out_tensor = torch.randn(128, dtype=torch.float, device=device).fill_(-2.0) inp_tensor = torch.randn(128, dtype=torch.float, device=device).fill_(1.0) min_tensor = inp_tensor - 0.05 max_tensor = inp_tensor + 0.05 torch.clamp(input=inp_tensor, min=min_tensor, max=max_tensor, out=out_tensor) ``` W/O this PR ```python def forward(self): arg0_1: "f32[128]"; arg1_1: "f32[128]"; arg2_1: "f32[128]"; arg3_1: "f32[128]"; arg0_1, arg1_1, arg2_1, arg3_1, = fx_pytree.tree_flatten_spec([], self._in_spec) clamp_min: "f32[128]" = torch.ops.aten.clamp_min.Tensor(arg0_1, arg1_1); arg0_1 = arg1_1 = None clamp_max: "f32[128]" = torch.ops.aten.clamp_max.Tensor(clamp_min, arg2_1); clamp_min = arg2_1 = None return (clamp_max, clamp_max) ``` W/ this PR ```python def forward(self): arg0_1: "f32[128]"; arg1_1: "f32[128]"; arg2_1: "f32[128]"; arg3_1: "f32[128]"; arg0_1, arg1_1, arg2_1, arg3_1, = fx_pytree.tree_flatten_spec([], self._in_spec) clamp_min: "f32[128]" = torch.ops.aten.clamp_min.Tensor(arg0_1, arg1_1); arg0_1 = arg1_1 = None clamp_max: "f32[128]" = torch.ops.aten.clamp_max.Tensor(clamp_min, arg2_1); clamp_min = arg2_1 = None copy_: "f32[128]" = torch.ops.aten.copy_.default(arg3_1, clamp_max); arg3_1 = clamp_max = None return (copy_,) ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/124926 Approved by: https://github.com/jgong5, https://github.com/jansel, https://github.com/angelayi	2024-06-11 04:35:27 +00:00
Aaron Orenstein	7c12cc7ce4	Flip default value for mypy disallow_untyped_defs [6/11] (#127843 ) See #127836 for details. Pull Request resolved: https://github.com/pytorch/pytorch/pull/127843 Approved by: https://github.com/oulgen ghstack dependencies: #127842	2024-06-08 18:49:29 +00:00
Matthew Hoffman	81277baa0c	Remove removed ruff rule TRY200 (#126256 ) My TOML linter is complaining that "TRY200" is not acceptable for the `tool.ruff.lint` schema. From the ruff docs: https://docs.astral.sh/ruff/rules/reraise-no-cause/ > This rule has been removed and its documentation is only available for historical reasons. > > This rule is identical to [B904](https://docs.astral.sh/ruff/rules/raise-without-from-inside-except/) which should be used instead. and we are currently explicitly ignoring B904. Pull Request resolved: https://github.com/pytorch/pytorch/pull/126256 Approved by: https://github.com/Skylion007	2024-05-17 16:31:05 +00:00
Pian Pawakapan	946e202c07	[export] Restore user input names to unlifted graph modules (#124765 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/122842 Currently, calling ep.module() on an ExportedProgram leads to a GraphModule with a default forward signature (e.g. arg_0, arg_1, ...). This leads to original placeholder names disappearing for retracing/re-exporting. Fixing this issue by creating a forward_arg_names field (will take renaming suggestions for this), that stores the positional & keyword arg names that are used. These names aren't present in the call_spec currently stored, and requires a major version bump for the ExportedProgram schema. Test Plan: Tests exist for export, but names are now changed from generic (e.g. arg_0, arg_1) to follow user inputs (e.g. x, y) Differential Revision: D56484994 Pull Request resolved: https://github.com/pytorch/pytorch/pull/124765 Approved by: https://github.com/zhxchen17	2024-04-29 20:58:17 +00:00
angelayi	e8836759d0	[export] Add effect token to export (#121424 ) Following the creation of effect tokens (https://github.com/pytorch/pytorch/pull/120296), we want to now add support for these tokens in export because the calling/returning convention has changed. The inputs are now `(tokens, params, buffers, constants, user_inputs)` and the outputs are `(tokens, buffer_mutations, user_mutations, user_outputs)`. The graph looks something like: ``` graph(): %arg0_1 : [num_users=1] = placeholder[target=arg0_1] %attr : [num_users=2] = placeholder[target=attr] %arg1_1 : [num_users=2] = placeholder[target=arg1_1] %with_effects : [num_users=2] = call_function[target=torch._higher_order_ops.effects.with_effects](args = (%arg0_1, _TorchScriptTesting.takes_foo.default, %attr, %arg1_1), kwargs = {}) %getitem : [num_users=1] = call_function[target=operator.getitem](args = (%with_effects, 0), kwargs = {}) %getitem_1 : [num_users=1] = call_function[target=operator.getitem](args = (%with_effects, 1), kwargs = {}) %with_effects_1 : [num_users=2] = call_function[target=torch._higher_order_ops.effects.with_effects](args = (%getitem, _TorchScriptTesting.takes_foo.default, %attr, %getitem_1), kwargs = {}) %getitem_2 : [num_users=1] = call_function[target=operator.getitem](args = (%with_effects_1, 0), kwargs = {}) %getitem_3 : [num_users=1] = call_function[target=operator.getitem](args = (%with_effects_1, 1), kwargs = {}) %add : [num_users=1] = call_function[target=torch.ops.aten.add.Tensor](args = (%arg1_1, %getitem_3), kwargs = {}) return (getitem_2, add) ``` During unlifting, we will first remove the tokens and with_effect calls using the `remove_effect_tokens` pass. (cc @SherlockNoMad on the pass to remove tokens). This is so that this won't change the calling conventions when retracing. The graph after unlifting looks something like: ``` graph(): %attr_1 : [num_users=2] = get_attr[target=attr] %arg1_1 : [num_users=2] = placeholder[target=arg1_1] %takes_foo_default_1 : [num_users=1] = call_function[target=torch.ops._TorchScriptTesting.takes_foo.default](args = (%attr_1, %arg1_1), kwargs = {}) %takes_foo_default : [num_users=1] = call_function[target=torch.ops._TorchScriptTesting.takes_foo.default](args = (%attr_1, %takes_foo_default_1), kwargs = {}) %add : [num_users=1] = call_function[target=torch.ops.aten.add.Tensor](args = (%arg1_1, %takes_foo_default), kwargs = {}) return (add,) ``` Serialization support will be added in a followup. Note: tokens only affect custom ops that take in ScriptObjects, not ScriptObject methods yet. Differential Revision: [D54639390](https://our.internmc.facebook.com/intern/diff/D54639390) Pull Request resolved: https://github.com/pytorch/pytorch/pull/121424 Approved by: https://github.com/tugsbayasgalan	2024-03-09 02:43:26 +00:00
Zhenghao Zhao	af93849a3a	[pt2 export] small fix on non_persistent buffer unlift (#120715 ) Summary: Change to get_buffer from the input plain_graph_module instead of the new stateful_gm when restoring non_persistent buffers, since the stateful_gm doesn't contain the buffer yet. Test Plan: Added test case. `buck test caffe2/test:test_export -- test_unlift_nonpersistent_buffer` Differential Revision: D54216772 Pull Request resolved: https://github.com/pytorch/pytorch/pull/120715 Approved by: https://github.com/zhxchen17	2024-03-01 20:20:00 +00:00
Michael Suo	bf4e171539	[export] support non-persistent buffers (#118969 ) Summary: X-link: https://github.com/pytorch/executorch/pull/1817 Basic support for non-persistent buffers, which are buffers that do not show up in the state dict. One weird twist is that most of our other systems (FX, aot_export, dynamo) have completely buggy handling of non-persistent buffers. I tried to go on a wild goose chase to fix them all, but it got to be too much. So I introduced some sad rewrite passes in `_export` make the final state dict correctly align with the original module's state dict. This exposed some bugs/ambiguous handling of parameters/buffers in existing test code. For example, `TestSaveLoad.test_save_buffer` traced over a module that was not in the root module hierarchy and caused some weird behavior. I think we should error explicitly on use cases like this: https://github.com/pytorch/pytorch/issues/118410. For now I just rewrote the tests or skipped them. As a side effect, this diff tightened up quite a few sloppy behaviors around state dict handling: - Tensor attributes were getting promoted to be buffers—bad! - Tracing through a module not in the children of the root module would add its parameters/buffers to the state dict—bad! This behavior is unlikely to show up in user code since the model would be totally broken, but did show up in a bunch of tests. #buildmore Test Plan: unit tests sandcastle Differential Revision: D53340041 Pull Request resolved: https://github.com/pytorch/pytorch/pull/118969 Approved by: https://github.com/guangy10, https://github.com/huydhn, https://github.com/titaiwangms	2024-02-02 19:16:08 +00:00
PyTorch MergeBot	221747507d	Revert "[export] support non-persistent buffers (#118612 ) (#118722 )" This reverts commit `a43c28368c`. Reverted https://github.com/pytorch/pytorch/pull/118722 on behalf of https://github.com/atalman due to broke linux-jammy-py3-clang12-executorch ([comment](https://github.com/pytorch/pytorch/pull/118722#issuecomment-1921484565))	2024-02-01 14:39:29 +00:00
Angela Yi	7e0ea0d5df	[export] Only deepcopy graph in unlift (#118821 ) Summary: We only need to deepcopy the graph because we're modifying the graph by unlifting its parameter/buffer inputs. We don't need to deepcopy the graph module state/contents. This causes an error when the graph module contains an ExecuTorch LoweredModule which stores tensors. Test Plan: Fixes the following diff Differential Revision: D53290077 Pull Request resolved: https://github.com/pytorch/pytorch/pull/118821 Approved by: https://github.com/tugsbayasgalan	2024-02-01 09:00:22 +00:00
Michael Suo	a43c28368c	[export] support non-persistent buffers (#118612 ) (#118722 ) Summary: X-link: https://github.com/pytorch/executorch/pull/1769 Basic support for non-persistent buffers, which are buffers that do not show up in the state dict. One weird twist is that most of our other systems (FX, aot_export, dynamo) have completely buggy handling of non-persistent buffers. I tried to go on a wild goose chase to fix them all, but it got to be too much. So I introduced some sad rewrite passes in `_export` to make the final state dict correctly align with the original module's state dict. This exposed some bugs/ambiguous handling of parameters/buffers in existing test code. For example, `TestSaveLoad.test_save_buffer` traced over a module that was not in the root module hierarchy and caused some weird behavior. I think we should error explicitly on use cases like this: https://github.com/pytorch/pytorch/issues/118410. For now I just rewrote the tests or skipped them. Test Plan: added a unit test Differential Revision: D53253905 Pull Request resolved: https://github.com/pytorch/pytorch/pull/118722 Approved by: https://github.com/SherlockNoMad, https://github.com/angelayi	2024-02-01 00:36:09 +00:00
suo	d0627cc2af	[export] do not rewrite state dict when unlifting (#118611 ) This is Very Bad; changing state dict keys violates one of the key contracts we have, which is "do not mess with the state dict". Change unlift to use a similar `_assign_attr` approach that fx.GraphModule and unflatten do. Also took the opportunity to improve the interface of `_assign_attr` to be more general. Differential Revision: [D53139277](https://our.internmc.facebook.com/intern/diff/D53139277/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/118611 Approved by: https://github.com/zhxchen17 ghstack dependencies: #118607, #118608, #118609, #118610	2024-01-30 19:14:19 +00:00
suo	be90ab7efd	[export] do not unlift cond/map submodules (#118610 ) I don't think we should be unlifting HOO submodules. What is the contract of unlifting? It is: restore the original calling convention of the module, undoing the transformation in which we lift parameters, buffers, and constants to inputs in the graph. Unlifting does not make any guarantees about what's going on inside the module. It's still a flat module. So why should we unlift the cond/map submodules? It doesn't have anything to do with the contract stated above; it's some internal stuff that doesn't affect how the module will be called. Further, this code as written modifies the state dict, adding a new buffer that is actually a duplicate of a previous buffer. Modifying the state dict from the original eager module is never correct. Differential Revision: [D53160713](https://our.internmc.facebook.com/intern/diff/D53160713/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/118610 Approved by: https://github.com/zhxchen17 ghstack dependencies: #118607, #118608, #118609	2024-01-30 19:14:18 +00:00
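The unlifting contract described in the commit above can be illustrated with a toy, pure-Python sketch. This is not the `torch.export` implementation: `lifted_forward`, `Unlifted`, and the `state` dictionary are hypothetical names introduced only for illustration. The sketch shows a function whose state arrives as explicit leading inputs (the "lifted" form) and a wrapper that restores the original calling convention without touching the state dict keys.

```python
from typing import Any, Callable, Dict, List


def lifted_forward(weight: float, bias: float, x: float) -> float:
    # "Lifted" form: state (weight, bias) is passed as explicit leading inputs.
    return weight * x + bias


class Unlifted:
    """Hypothetical wrapper that restores the original calling convention."""

    def __init__(
        self,
        lifted_fn: Callable[..., Any],
        state: Dict[str, Any],
        lifted_names: List[str],
    ) -> None:
        self.lifted_fn = lifted_fn
        self.state = dict(state)  # shallow copy; keys are left exactly as given
        self.lifted_names = list(lifted_names)

    def __call__(self, *user_args: Any) -> Any:
        # Look up the lifted inputs from stored state, then append user inputs.
        lifted_args = [self.state[name] for name in self.lifted_names]
        return self.lifted_fn(*lifted_args, *user_args)


state = {"weight": 2.0, "bias": 1.0}
module = Unlifted(lifted_forward, state, ["weight", "bias"])
result = module(3.0)  # called with user inputs only: 2.0 * 3.0 + 1.0
```

The wrapper only changes how the function is called. It makes no claim about what happens inside `lifted_forward`, which mirrors the point that unlifting says nothing about a module's internals.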
suo	4ee8aa6028	[export] adopt KeyPath API in nonstrict mode (#118609 ) This PR rewrites two paths to use the newly-added keypaths API in pytree: First: we were hand-rolling a tree_map during fakification because we wanted to track sources. This PR uses keypaths instead, which can do the same thing without needing custom code. Second: our constraint error formatting was referencing placeholder names in error messages. These placeholder names are not otherwise user-visible, so they are super confusing to users (e.g. "which input does arg1_3 correspond to?"). This diff uses the `keystr` API to format the error message. This necessitated some small refactors—generating the keystr is expensive so doing it in an f-string was very bad. It can also be further improved—we can inspect the signature so that instead of `*args[0]` we can give people the actual argument name, which would be the ideal UX. But leaving that for later. Differential Revision: [D53139358](https://our.internmc.facebook.com/intern/diff/D53139358/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/118609 Approved by: https://github.com/zhxchen17 ghstack dependencies: #118607, #118608	2024-01-30 19:14:11 +00:00
suo	ca090b2c77	[export] do not use tree_flatten_spec (#118608 ) tree_flatten_spec is bad; it isn't synced up with `register_pytree_node` so it will not handle arbitrary custom pytrees. It's also not really maintained. We only use it for two purposes: - To retain kwarg ordering stability, so that if the user passes in kwargs in a different order things will still work. - To do "structural" checks that ignore types. In both cases, tree_flatten_spec is probably not the ideal way to implement the desired behavior. ## kwargs ordering - tree_flatten_spec overwrites the behavior of ALL dictionaries, not just kwargs. This is not correct: dictionary ordering is meaningful in Python, and it's pretty trivial to write a program that relies on dict ordering. - For kwargs, we do sort of expect that the order in which arguments are passed shouldn't matter. BUT there is one exception: `**kwargs`. In fact, [PEP 468](https://peps.python.org/pep-0468/) was introduced specifically to clarify that ordering does matter when the function being called uses `**kwargs`. In this diff I introduce a utility function that only reorders kwargs. This gets us most of the way to correct: dicts are no longer reordered, but kwargs can be passed in any order. A "fully correct" solution would need to fix the corner case from PEP 468. We could detect whether the top-level fn being traced uses `**kwargs` (via `inspect`), then serialize a flag for it. In ExportedProgram, we would check that flag and only re-order if `**kwargs` was unused; otherwise error if the key order doesn't match. This is a super corner case though, so I'll file it as a followup task. ## structural equivalence checking This is another use case, where again `tree_flatten_spec` is too broad. Generally we want to treat two specific types as the same, not override the behavior of comparison generally. So I introduce an `is_equivalent` util for this purpose. 
Differential Revision: [D53168420](https://our.internmc.facebook.com/intern/diff/D53168420/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/118608 Approved by: https://github.com/zhxchen17 ghstack dependencies: #118607	2024-01-30 19:14:04 +00:00
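The kwargs-only reordering described in the commit above can be sketched as follows. This is a minimal illustration, not the utility the commit adds: `reorder_kwargs` and `traced_keys` are hypothetical names. It assumes the key order recorded at trace time is available as a list. Only the top-level kwargs mapping is rearranged, and dicts nested inside the values keep their own insertion order.

```python
from typing import Any, Dict, List


def reorder_kwargs(user_kwargs: Dict[str, Any], traced_keys: List[str]) -> Dict[str, Any]:
    """Return user_kwargs rearranged to follow the key order seen at trace time.

    Hypothetical helper: only the top-level kwargs dict is reordered; dicts
    nested inside the values are returned untouched.
    """
    if set(user_kwargs) != set(traced_keys):
        raise ValueError(
            f"kwargs keys {sorted(user_kwargs)} do not match "
            f"traced keys {sorted(traced_keys)}"
        )
    return {key: user_kwargs[key] for key in traced_keys}


traced_keys = ["x", "cfg"]
# The caller passes kwargs in a different order than at trace time.
call_kwargs = {"cfg": {"b": 2, "a": 1}, "x": 10}
reordered = reorder_kwargs(call_kwargs, traced_keys)

print(list(reordered))         # ['x', 'cfg']
print(list(reordered["cfg"]))  # ['b', 'a']
```

The nested `cfg` dict keeps its `b`, `a` order, which is the behavior the commit wants: ordinary dictionaries are no longer silently reordered. The PEP 468 corner case, where the traced function itself accepts `**kwargs`, is not handled in this sketch.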
Angela Yi	413a434846	[export] Convert all export tests to .module() (#118425 ) Test Plan: CI Differential Revision: D53075379 Pull Request resolved: https://github.com/pytorch/pytorch/pull/118425 Approved by: https://github.com/suo	2024-01-29 23:06:54 +00:00
Angela Yi	5c56822be2	[export] Various fixes to .module() (#118272 ) Summary: While turning on .module() for all the export tests, I uncovered some bugs with .module() and while fixing them I ended up rewriting some of the code... Some of the bugs were: * bad kwargs support on the unlifted module * no support for user input mutations * (at the commit hash I was working off of) no support for custom objects * there were no tests on unlifting weights from cond/map submodules Test Plan: CI Differential Revision: D53075380 Pull Request resolved: https://github.com/pytorch/pytorch/pull/118272 Approved by: https://github.com/suo	2024-01-26 21:05:07 +00:00
Angela Yi	a93940b5db	[export] Allow constant outputs + None input/outputs (#117894 ) Added support for constant outputs. We will just embed the constant directly into the output, like `return (x, 1)`. Also adds support for None input/outputs. For None inputs we address it the same way we do to constants, which is that a placeholder with no users will be inserted into the graph, and the None will be embedded into whatever operator is using the None. For None outputs, we will also address the same way we do constants, which is that we embed it into the output, like `return (x, None)`. Differential Revision: D52881070 Pull Request resolved: https://github.com/pytorch/pytorch/pull/117894 Approved by: https://github.com/zhxchen17	2024-01-25 23:37:34 +00:00
suo	d84173c025	[export] fix unlifting of custom class constants (#117979 ) we didn't have a test covering this case, add one. Aside: we should invest in actually unit testing the lifting/unlifting passes, both separately and also against each other. I have a diff cooking for that. Differential Revision: [D52962180](https://our.internmc.facebook.com/intern/diff/D52962180/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/117979 Approved by: https://github.com/avikchaudhuri ghstack dependencies: #115222, #117978	2024-01-23 05:51:00 +00:00
Yidi Wu	2bc7da1ab7	[HigherOrderOp] change signature of map_impl (#117161 ) Summary: X-link: https://github.com/pytorch/executorch/pull/1580 This PR changes the schema of map_impl from map_impl(f, num_mapped, *operands) to map_impl(f, mapped_args: Tuple, moperands: Tuple). This is to prepare for turning on dynamo for eager mode map, where we want to get rid of the num_mapped scalar. Test Plan: Existing tests. Differential Revision: D52495413 Pull Request resolved: https://github.com/pytorch/pytorch/pull/117161 Approved by: https://github.com/angelayi, https://github.com/tugsbayasgalan	2024-01-13 02:50:46 +00:00
Angela Yi	6413511713	[export][refactor][4/n] Make equality_constraints optional (#116233 ) Summary: needed to remove equality_constraints eventually :P Test Plan: CI Differential Revision: D52351709 Pull Request resolved: https://github.com/pytorch/pytorch/pull/116233 Approved by: https://github.com/tugsbayasgalan	2024-01-05 00:50:52 +00:00

55 Commits