pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 00:21:07 +01:00

Author	SHA1	Message	Date
Zhengxu Chen	f23dbefe52	[export] Support "custom" metadata field. (#131912 ) Summary: Add a special field in Graph and Node level metadata called "custom" which should be mapped to a json-serializable object, and we guarantee this field should be always preversed across the following transformations: 1. copy/deepcopy 2. run_decompositions() 3. serialization 4. re-exporting Test Plan: :test_export -- -r custom_tag Reviewed By: angelayi Differential Revision: D60291839 Pull Request resolved: https://github.com/pytorch/pytorch/pull/131912 Approved by: https://github.com/angelayi	2024-08-14 01:09:01 +00:00
Shangdi Yu	3c5b246d3c	[export] Remove Proxy from exported programs and modules (#132956 ) Summary: Remove Proxy from exported programs and modules because they cannot be deepcopied or pickeled. Test Plan: CI ``` buck2 run 'fbcode//mode/dev-nosan' fbcode//caffe2/test/quantization:test_quantization -- -r qat_conv2d buck2 run 'fbcode//mode/dev-nosan' fbcode//modai/test:test_modai -- -r test_qat_stinson_htp_export buck2 run 'fbcode//mode/dev-nosan' fbcode//vizard_projects/ml_depth/tests:test_model -- -r test_qat_model_et buck2 run 'fbcode//mode/dev-nosan' fbcode//bolt/nn/executorch/backends/tests:qnn_test -- -r test_qat_bias=False,use_3d_input=False buck2 run 'fbcode//mode/dev-nosan' fbcode//bolt/nn/executorch/backends/tests:qnn_test -- -r test_qat_bias=True,use_3d_input=False buck2 run 'fbcode//mode/dev-nosan' fbcode//caffe2/test/quantization:test_quantization -- -r test_fold_bn_erases_bn_node ``` Differential Revision: D60940832 Pull Request resolved: https://github.com/pytorch/pytorch/pull/132956 Approved by: https://github.com/angelayi	2024-08-09 00:00:20 +00:00
Angela Yi	45d0e90bd3	[export] Allow str outputs (#132808 ) Summary: Fixes https://fb.workplace.com/groups/1075192433118967/permalink/1478413606130179/ Test Plan: CI Differential Revision: D60850712 Pull Request resolved: https://github.com/pytorch/pytorch/pull/132808 Approved by: https://github.com/ydwu4	2024-08-08 02:20:59 +00:00
Yidi Wu	bbf568aac8	Split of "[reland] [export] fix zero arg export in training_ir and constant tensor handling" (#132307 ) Summary: A re-land of D60006710. Fixed TrainingIRToRunDecomp failures for test_tensor_attribute_zero_args and also a few re-tracability failures because run_decomposition does a retracing. edit: also remove the eliminate_dead_code() in _unlift because of one onnx test failure: a constant tensor attr was lifted as constant_tensor input but it's not used in the graph after aot_autograd due to a short cut in its decomposition. This causes the setattr to be removed by eliminate_dead_code but the graph signature still contains the name of that buffer, which causes an inconsitency between the transformed graph and ep's original signature after _unlift. And it seems that this has happened a few times where some nodes are accidentally removed and we're in an inconsistent state. The alternative of removing it would be: every time we call elimiate_dead_code, we verify the consistency of the graph with 1. the graph before transformation and 2. all the meta datas but i think this deserves a complete design edit 2: Also fix the inconsistency of graph signatures when param_constant is marked as lifted_tensor_constants but it's registered as parameters in the output of ep.module(). Differential Revision: D60532628 Pull Request resolved: https://github.com/pytorch/pytorch/pull/132307 Approved by: https://github.com/zhxchen17	2024-08-08 01:36:16 +00:00
Shangdi Yu	825002c9c6	[export][fx] More robust DCE pass (#132764 ) Summary: - make default DCE pass check schema, - need to rebase onto https://github.com/pytorch/pytorch/pull/131651 after it's in phabricator (for now the change is manually added). - mark Proxy dump as NotImplemented for better error msg - Remove Proxy from tensors when dumping models, as Proxy cannot be dumped. More details in https://docs.google.com/document/d/1G5vmTXjzxoyVGRI2kpA1gQukK_Glyg2NrE0Oh6Nlg9A/edit?usp=sharing. Test Plan: CI ``` - buck2 run 'fbcode//mode/dev-nosan' fbcode//caffe2/test/quantization:test_quantization -- -r qat_conv2d - test_export.py - buck2 run 'fbcode//mode/dev-nosan' fbcode//modai/test:test_modai -- -r test_qat_stinson_htp_export - buck2 run 'fbcode//mode/dev-nosan' fbcode//vizard_projects/ml_depth/tests:test_model -- -r test_qat_model_et - buck2 run 'fbcode//mode/dev-nosan' fbcode//caffe2/test:fx -- -r dce - buck2 run 'fbcode//mode/dev-nosan' fbcode//bolt/nn/executorch/backends/tests:qnn_test -- -r test_qat_bias=False,use_3d_input=False - buck2 run 'fbcode//mode/dev-nosan' fbcode//bolt/nn/executorch/backends/tests:qnn_test -- -r test_qat_bias=True,use_3d_input=False - buck2 run 'fbcode//mode/dev-nosan' fbcode//caffe2/test/quantization:test_quantization -- -r test_fold_bn_erases_bn_node ``` Reviewed By: angelayi Differential Revision: D60319175 Pull Request resolved: https://github.com/pytorch/pytorch/pull/132764 Approved by: https://github.com/angelayi	2024-08-06 22:27:22 +00:00
Shangdi Yu	4a2cf50edf	[export][reland] Convert autocast to HOO (#132677 ) Summary: Reland of D60206382. Suggested in https://github.com/pytorch/pytorch/issues/128394. If there's an autocast context manager, the predispatch (strict) graph can look something like: ``` class <lambda>(torch.nn.Module): def forward(self, x: "f32[1]"): ... _enter_autocast = torch.amp.autocast_mode._enter_autocast('cuda', torch.bfloat16, True, None) mm: "f32[8, 8]" = torch.ops.aten.mm.default(rand, rand_1); rand = rand_1 = None _exit_autocast = torch.amp.autocast_mode._exit_autocast(_enter_autocast); _enter_autocast = None return (mm_1,) ``` But the operator `torch.amp.autocast_mode._enter_autocast` is not a valid ATen op. We remove these nodes by turning autocast into a higher order operator and make a submodule for the blocks between `_enter_autocast` and `_exit_autocast`. Some potential followup improvement: 1) Merge some of the duplicated logic with `replace_set_grad_with_hop_pass.py` 2) Check the current autocast status (any enabled? dtype?) and not create a submodule if the autocast args matches current autocast status. Test Plan: CI ``` buck2 run 'fbcode//mode/dev-nosan' fbcode//caffe2/test:test_export -- -r "test_predispatch_autocast" buck2 run 'fbcode//mode/dev-nosan' fbcode//caffe2/test:test_export -- -r "test_predispatch_set_grad" ``` Verified that now we can export the llama model in gh issue 128394 and the gemma model in gh issue 131829 without error. Differential Revision: D60770038 Pull Request resolved: https://github.com/pytorch/pytorch/pull/132677 Approved by: https://github.com/angelayi	2024-08-05 22:34:52 +00:00
PyTorch MergeBot	a3ea96b762	Revert "[export] Convert autocast to HOO (#131914 )" This reverts commit `aec948adfc`. Reverted https://github.com/pytorch/pytorch/pull/131914 on behalf of https://github.com/davidberard98 due to PR shouldn't have been relanded by the bot, phabricator diff did not have any recent changes and is still internally reverted ([comment](https://github.com/pytorch/pytorch/pull/131914#issuecomment-2269797388))	2024-08-05 19:52:09 +00:00
Shangdi Yu	aec948adfc	[export] Convert autocast to HOO (#131914 ) Summary: Suggested in https://github.com/pytorch/pytorch/issues/128394. If there's an autocast context manager, the predispatch (strict) graph can look something like: ``` class <lambda>(torch.nn.Module): def forward(self, x: "f32[1]"): ... _enter_autocast = torch.amp.autocast_mode._enter_autocast('cuda', torch.bfloat16, True, None) mm: "f32[8, 8]" = torch.ops.aten.mm.default(rand, rand_1); rand = rand_1 = None _exit_autocast = torch.amp.autocast_mode._exit_autocast(_enter_autocast); _enter_autocast = None return (mm_1,) ``` But the operator `torch.amp.autocast_mode._enter_autocast` is not a valid ATen op. We remove these nodes by turning autocast into a higher order operator and make a submodule for the blocks between `_enter_autocast` and `_exit_autocast`. Some potential followup improvement: 1) Merge some of the duplicated logic with `replace_set_grad_with_hop_pass.py` 2) Check the current autocast status (any enabled? dtype?) and not create a submodule if the autocast args matches current autocast status. Test Plan: CI ``` parsh --build-flags fbcode//mode/dev-nosan fbcode//caffe2/test:test_export run_tests("test_predispatch_autocast") ``` Reviewed By: angelayi Differential Revision: D60206382 Pull Request resolved: https://github.com/pytorch/pytorch/pull/131914 Approved by: https://github.com/angelayi	2024-08-05 18:52:12 +00:00
Xuehai Pan	f3fce597e9	[BE][Easy][17/19] enforce style for empty lines in import segments in `torch/[a-c]/` and `torch/[e-n]/` (#129769 ) See https://github.com/pytorch/pytorch/pull/129751#issue-2380881501. Most changes are auto-generated by linter. You can review these PRs via: ```bash git diff --ignore-all-space --ignore-blank-lines HEAD~1 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/129769 Approved by: https://github.com/ezyang	2024-08-04 10:24:09 +00:00
PyTorch MergeBot	d984105748	Revert "[export] Convert autocast to HOO (#131914 )" This reverts commit `b28c01d90d`. Reverted https://github.com/pytorch/pytorch/pull/131914 on behalf of https://github.com/ezyang due to Failing lint, but was covered up by master failure on lint ([comment](https://github.com/pytorch/pytorch/pull/131914#issuecomment-2267248773))	2024-08-04 02:10:35 +00:00
Shangdi Yu	b28c01d90d	[export] Convert autocast to HOO (#131914 ) Summary: Suggested in https://github.com/pytorch/pytorch/issues/128394. If there's an autocast context manager, the predispatch (strict) graph can look something like: ``` class <lambda>(torch.nn.Module): def forward(self, x: "f32[1]"): ... _enter_autocast = torch.amp.autocast_mode._enter_autocast('cuda', torch.bfloat16, True, None) mm: "f32[8, 8]" = torch.ops.aten.mm.default(rand, rand_1); rand = rand_1 = None _exit_autocast = torch.amp.autocast_mode._exit_autocast(_enter_autocast); _enter_autocast = None return (mm_1,) ``` But the operator `torch.amp.autocast_mode._enter_autocast` is not a valid ATen op. We remove these nodes by turning autocast into a higher order operator and make a submodule for the blocks between `_enter_autocast` and `_exit_autocast`. Some potential followup improvement: 1) Merge some of the duplicated logic with `replace_set_grad_with_hop_pass.py` 2) Check the current autocast status (any enabled? dtype?) and not create a submodule if the autocast args matches current autocast status. Test Plan: CI ``` parsh --build-flags fbcode//mode/dev-nosan fbcode//caffe2/test:test_export run_tests("test_predispatch_autocast") ``` Reviewed By: angelayi Differential Revision: D60206382 Pull Request resolved: https://github.com/pytorch/pytorch/pull/131914 Approved by: https://github.com/angelayi	2024-08-03 05:48:57 +00:00
Shangdi Yu	a503136583	[export] Detect whether case_name is registered in exportdb (#132420 ) Summary: - moves logging functionalities into `torch/_export/db/logging.py` file. - add a check in `_dynamo/eval_frame.py` to check for optional input and error out with `UnsupportedError` - change the case name of `torch_sym_int` to `unsupported_operator` - Check if the case name is registered in exportdb, if so, we give a link to the case in exportdb. - TODO: add test Test Plan: CI Running the example in https://pytorch.org/docs/main/generated/exportdb/index.html#optional-input gives the following error logging: ``` E0730 10:53:33.687000 4155538 torch/_dynamo/eval_frame.py:1086] Parameter y is optional with a default value of tensor([[-0.1633, 1.2414, -0.1071], E0730 10:53:33.687000 4155538 torch/_dynamo/eval_frame.py:1086] [-0.1936, -0.9425, -0.0824]]) E0730 10:53:33.688000 4155538 torch/export/_trace.py:1043] See optional_input in exportdb for unsupported case. https://pytorch.org/docs/main/generated/exportdb/index.html#optional-input ...... File "/data/users/shangdiy/fbsource/buck-out/v2/gen/fbcode/389acaeb40d57230/tutorials/pytorch/nntest/__torchtest__/torchtest#link-tree/torch/_dynamo/eval_frame.py", line 1091, in produce_matching raise Unsupported( torch._dynamo.exc.Unsupported: Tracing through optional input is not supported yet ``` It also logs a `export.error.classified` event in Scuba. Reviewed By: zhxchen17 Differential Revision: D60427208 Pull Request resolved: https://github.com/pytorch/pytorch/pull/132420 Approved by: https://github.com/zhxchen17	2024-08-03 01:08:48 +00:00
Tugsbayasgalan Manlaibaatar	928adb7cc2	Fix empty fake mode problem (#131995 ) Title Differential Revision: [D60348541](https://our.internmc.facebook.com/intern/diff/D60348541/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/131995 Approved by: https://github.com/angelayi ghstack dependencies: #131988	2024-08-01 04:55:37 +00:00
Tugsbayasgalan Manlaibaatar	073430ebea	Don't check for autograd state when lowering to inference IR (#131988 ) When lowering to inference IR, we shouldn't error on autograd state changes because we will have preserved the autograd state change at the training level. I think the more correct way of implementing it would be to wrap autograd ops in HOP before decomposing, but that seems low ROI. Differential Revision: [D60346235](https://our.internmc.facebook.com/intern/diff/D60346235/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/131988 Approved by: https://github.com/angelayi	2024-08-01 04:15:37 +00:00
Yidi Wu	404a8ae8f6	[export] fix set_grad x tensor constant. (#131787 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/130379. The original error is verifier finds that the placeholder nodes' meta[''val"] are missing in subgraph of WrapSetGradEnabled hop. In this PR, we fixed it by re-ordering the replace_set_grad_with_hop_pass with lift_constant_tensor pass because only after lift_constant_pass, all the constant attrs start to have meta["val"]. Test Plan: buck2 test test:test_export -- -r "test_setgrad_lifted_tensor" Differential Revision: D60244935 Pull Request resolved: https://github.com/pytorch/pytorch/pull/131787 Approved by: https://github.com/yushangdi	2024-07-26 16:41:59 +00:00
Zhengxu Chen	7feaa73057	[export] Remove deprecated fields from ExportedProgram ctor. (#131697 ) Summary: as title. Test Plan: CI Reviewed By: SherlockNoMad Differential Revision: D60078426 Pull Request resolved: https://github.com/pytorch/pytorch/pull/131697 Approved by: https://github.com/ydwu4	2024-07-26 16:19:46 +00:00
Avik Chaudhuri	2bf649f5ae	suggested fix for data-dependent error (#125378 ) Suggests fixes for data-dependent errors in non-strict export. Any data-dependent error has an unresolved condition on unbacked symints. A mechanizable strategy for fixing such errors, which this PR enables, is to "bash" them using `torch._check()`s. For each error we suggest using `torch._check()` on the condition or its negation. The user selects and copy-pastes the suggested fix and continues. For example, here's an existing data-dependent error message with the suffix following `<snip>...</snip>` added by this PR: ``` Could not guard on data-dependent expression Eq(u2, u1) (unhinted: Eq(u2, u1)). (Size-like symbols: u1) <snip>...</snip> User code: File "test/export/test_export.py", line 1944, in forward return r.view(items[0], items[2]) Suggested fixes (please choose one of the following): 1. torch._check(items[2] == r.shape[1]) 2. torch._check(items[2] != r.shape[1])" ``` Tests in this PR illustrate this workflow, by taking common examples of data-dependent errors and bashing them until success, purely based on suggested fixes. In particular, we test this workflow on the "puzzlers" in https://www.internalfb.com/intern/anp/view/?id=5330476 (thanks @ezyang). In terms of implementation, we focus on non-strict mode, where we can intercept torch function calls to install a handler that walks up the stack from the error, finding the closest non-torch frame and inspecting its locals for symints appearing in the error. The suggested fixes then access these symints through the local variables so that they can be (a) easily understood by the user (b) directly added to the code. Implementing this idea in strict mode is follow-up work—we have already investigated what it would take, and decided to separate it out of this PR for reasons described next. It's not too hard to map symints to locals in Dynamo (although it needs to happen elsewhere, i.e., intercepting torch function calls won't work). However, unfortunately this doesn't seem to be enough; the graph modules created by Dynamo when going through AOTAutograd can raise further data-dependent errors in some cases, and thus we need yet another mechanism to map symints to locals for graph modules, via captured source-level metadata and FX node walking. This latter component will require some care to build properly, or we might conclude it is altogether unnecessary and fix Dynamo instead. Differential Revision: D56867432 Pull Request resolved: https://github.com/pytorch/pytorch/pull/125378 Approved by: https://github.com/ezyang	2024-07-26 08:34:50 +00:00
Avik Chaudhuri	5b05ad9697	fix non-persistent buffers (#131756 ) Summary: Dynamo doesn't track whether buffers are `persistent`. This led to some ugly code where we would mark buffers as always persistent when creating signatures, then later check whether the buffers were not in the state dict to infer whether they were non-persistent, and use this to fix up the signature. This PR instead defines a utility to look up all the non-persistent buffers registered inside a module (this information is recorded in a private `_non_persistent_buffers_set` module attribute), and uses it to (a) correctly set the persistent flag on buffers when creating signatures (b) transfer this information to a Dynamo-traced graph module, which then causes non-persistent buffers to (correctly) not show up in the state dict. Test Plan: existing tests + new case with non-persistent buffer in nested module Differential Revision: D60224656 Pull Request resolved: https://github.com/pytorch/pytorch/pull/131756 Approved by: https://github.com/zhxchen17, https://github.com/ydwu4	2024-07-26 04:45:30 +00:00
Yidi Wu	2c1851f04e	[export] fix output node's meta (#131706 ) Summary: This pr fixes all the places in strict export stack where the output node's meta is not preserved correctly. However, we're getting a new error for the test we intend to fix: `buck2 run caffe2/test/quantization:test_quantization -- -r "test_re_export_preserve_handle"`: The `get_attr` nodes has wrong metadata. I guess there are more things need to be fixed to get it working but it's beyond the scope of this PR. Test Plan: buck2 run caffe2/test/quantization:test_quantization -- -r "test_re_export_preserve_handle" Differential Revision: D60198221 Pull Request resolved: https://github.com/pytorch/pytorch/pull/131706 Approved by: https://github.com/yushangdi	2024-07-25 18:44:21 +00:00
Yidi Wu	75c4176b05	[export][BE] consolidate export and export_for_training (#131496 ) Summary: This PR consolidates the implementation of export and export_for_training to maximize code re-use. Also add some type annotations and comments in the code for better readability. Test Plan: Existing tests. Differential Revision: D60130515 Pull Request resolved: https://github.com/pytorch/pytorch/pull/131496 Approved by: https://github.com/avikchaudhuri, https://github.com/pianpwk	2024-07-25 16:35:16 +00:00
Shangdi Yu	6bc8db1d32	Rename is_training flag to have more information (#131618 ) Summary: rename is_training flag into dispatch_tracing_mode = “make_fx” or “aot_export” Test Plan: OSS CI Differential Revision: D60154327 Pull Request resolved: https://github.com/pytorch/pytorch/pull/131618 Approved by: https://github.com/ydwu4	2024-07-25 16:29:55 +00:00
Shangdi Yu	29c9f8c782	[export] Fix `graph_break` log registration error when importing export/_trace.py (#131523 ) Summary: When importing `_trace.py`, put `torch._dynamo.exc.Unsupported` in the global variable ``_ALLOW_LIST`` can cause import to ``export/_trace.py`` to fail with error: ValueError: Artifact name: 'graph_breaks' not registered, please call register_artifact('graph_breaks') in torch._logging.registrations. The error is directly raise on line `graph_breaks_log = torch._logging.getArtifactLogger(__name__, "graph_breaks")` in `_dynamo/exc.py`. I've checked that ``register_artifact('graph_breaks')`` does already exist in torch._logging.registrations. Explicitly call `import torch._logging` doesn't fix the issue. (see T196719676) We move ``_ALLOW_LIST`` to be a local variable. Test Plan: buck2 test 'fbcode//mode/opt' fbcode//aiplatform/modelstore/publish/utils/tests:fc_transform_utils_test -- --exact 'aiplatform/modelstore/publish/utils/tests:fc_transform_utils_test - test_serialized_model_for_disagg_acc (aiplatform.modelstore.publish.utils.tests.fc_transform_utils_test.PrepareSerializedModelTest)' buck2 test 'fbcode//mode/opt' fbcode//aiplatform/modelstore/publish/utils/tests:fc_transform_utils_test -- --exact 'aiplatform/modelstore/publish/utils/tests:fc_transform_utils_test - test_serialized_test_dsnn_module (aiplatform.modelstore.publish.utils.tests.fc_transform_utils_test.PrepareSerializedModelTest)' Differential Revision: D60136706 Pull Request resolved: https://github.com/pytorch/pytorch/pull/131523 Approved by: https://github.com/zhxchen17	2024-07-24 22:40:24 +00:00
Aaron Orenstein	5a0068cc69	[BE] mypy: disallow untyped decorators (#131428 ) Untyped decorators strip the types from their decorated function so even if the underlying function is fully typed then callers to it don't get any benefit from type annotations. Step 1 - Enable the error and override in all the offending files. #131429 Pull Request resolved: https://github.com/pytorch/pytorch/pull/131428 Approved by: https://github.com/justinchuby, https://github.com/oulgen	2024-07-23 21:50:55 +00:00
Avik Chaudhuri	94f22eb6b2	refactor post-trace fakification in strict (#131421 ) Summary: Previously it was unclear what `_convert_input_to_fake` actually does (used in strict), and in particular how it is different from `make_fake_inputs` (used in non-strict). This PR splits that function to work purely on user inputs, then renames it to `extract_fake_inputs` and adds a comment clarifying what it does—namely, it extracts fake inputs from a given graph module instead of "converting inputs to fake inputs" (as suggested by the current name) or "making fake inputs" (as happens in non-strict, where no tracing has taken place yet). The remainder of that function used to also fakify params and buffers. It turns out that this part is identical to what happens in non-strict, hence we also pull `make_fake_inputs` out from `non_strict_utils` into `_trace`, merge it with another util, and make both modes call it. Test Plan: existing tests Differential Revision: D60084442 Pull Request resolved: https://github.com/pytorch/pytorch/pull/131421 Approved by: https://github.com/zhxchen17	2024-07-23 18:23:03 +00:00
Shangdi Yu	cfb9ccab6c	[export] Filter errors by exception type, add case name (#131327 ) Summary: - Log export errors to Scuba and mark them with "classified" and "unclassified" - Classify errors by exception type (ALLOW_LIST) and a `case_name` attribute - Add `case_name` for some exceptions. Test Plan: Running the code below logs a classified error to `torch_export_usage` table in Scuba. ``` import torch from torch._export.db.case import SupportLevel class TorchSymMin(torch.nn.Module): """ torch.sym_min operator is not supported in export. """ def forward(self, x): return x.sum() + torch.sym_min(x.size(0), 100) example_args = (torch.randn(3, 2),) tags = {"torch.operator"} support_level = SupportLevel.NOT_SUPPORTED_YET model = TorchSymMin() torch.export.export(model, example_args) `` Differential Revision: D59981459 Pull Request resolved: https://github.com/pytorch/pytorch/pull/131327 Approved by: https://github.com/zhxchen17	2024-07-23 18:01:13 +00:00
Shangdi Yu	29e2e2afb6	Revert D59561509: Multisect successfully blamed "D59561509: [FX][export] DCE pass, check schema for node impurity (#130395 )" for one test failure (#131341 ) Summary: This diff reverts D59561509 D59561509: [FX][export] DCE pass, check schema for node impurity (#130395) by yushangdi causes the following test failure: Tests affected: - [cogwheel:cogwheel_mtia_cmf_m5_shrunk_test#test_flow_with_verification](https://www.internalfb.com/intern/test/844425041436985/) Here's the Multisect link: https://www.internalfb.com/multisect/6533402 Here are the tasks that are relevant to this breakage: T191383430: 10+ tests unhealthy for ads_mtia_inference The backout may land if someone accepts it. If this diff has been generated in error, you can Commandeer and Abandon it. Test Plan: NA Differential Revision: D60029318 Pull Request resolved: https://github.com/pytorch/pytorch/pull/131341 Approved by: https://github.com/angelayi	2024-07-23 05:23:47 +00:00
PyTorch MergeBot	b9912f31ef	Revert "[export] fix zero arg export in training_ir (#130990 )" This reverts commit `50436d5bdb`. Reverted https://github.com/pytorch/pytorch/pull/130990 on behalf of https://github.com/clee2000 due to failing some executorch and torchrec tests internally D60006710 ([comment](https://github.com/pytorch/pytorch/pull/130990#issuecomment-2243395316))	2024-07-22 16:49:25 +00:00
Yidi Wu	50436d5bdb	[export] fix zero arg export in training_ir (#130990 ) Fixed TrainingIRToRunDecomp failures for test_tensor_attribute_zero_args and also a few re-tracability failures because run_decomposition does a retracing. edit: also remove the eliminate_dead_code() in _unlift because of one onnx test failure: a constant tensor attr was lifted as constant_tensor input but it's not used in the graph after aot_autograd due to a short cut in its decomposition. This causes the setattr to be removed by eliminate_dead_code but the graph signature still contains the name of that buffer, which causes an inconsitency between the transformed graph and ep's original signature after _unlift. And it seems that this has happened a few times where some nodes are accidentally removed and we're in an inconsistent state. The alternative of removing it would be: every time we call elimiate_dead_code, we verify the consistency of the graph with 1. the graph before transformation and 2. all the meta datas but i think this deserves a complete design. Pull Request resolved: https://github.com/pytorch/pytorch/pull/130990 Approved by: https://github.com/pianpwk	2024-07-20 02:35:13 +00:00
Pian Pawakapan	745324e487	[export] turn on hybrid symints by default (#130775 ) Sets `prefer_deferred_runtime_asserts_over_guards=True` for export, so any guards emitted from `SymNode.expect_true` (for example, guards that are implicitly required to be true for an op to succeed) won't lead to constraint violations. Instead these should appear in the graph as runtime asserts, or potentially as replacement expressions for placeholder shapes. For example, this reshape op should emit s0 * s1 = s2, deferred as a runtime assert. ``` x = torch.randn(4, 8) # [s0, s1] y = torch.randn(32) # [s2] out = x.reshape(-1) + y # this emits Eq(s0 * s1, s2), and we represent y's shape as [s0s1] in the graph. ``` However, other complex guards can still cause export to fail, for instance guards emitted from `SymNode.guard_bool/guard_size_oblivious` (e.g. explicit if-else conditions in user code or lower-level op implementations hit during tracing) can still raise constraint violations. These can be deferred with `allow_complex_guards_as_runtime_asserts=True`. We don't yet make this default, because while this makes export more likely to succeed, it results in non-trivial asserts being emitted that often represent specialization to a variant of the op, or checks related to 0/1 specialization. We also remove forced specializations for export and kill the `_disable_forced_specializations` flag - now any guard we can't express with Dims/DerivedDims either are handled with Hybrid SymInts, or should be resolved with rewriting or deferring. Follow up: Currently, `ShapeEnv._set_replacement()` is called for complex equality expressions (e.g. s2 -> s0s1 in the example above), and the ExportedProgram stores `s0*s1` in the input placeholder. This isn't checked for validity when the program is run, so an option is to avoid replacement and/or runtime assert on equality. Pull Request resolved: https://github.com/pytorch/pytorch/pull/130775 Approved by: https://github.com/avikchaudhuri	2024-07-18 17:40:58 +00:00
Shangdi Yu	27ded03545	[FX][export] DCE pass, check schema for node impurity (#130395 ) Change the default DCE pass to check node schema for impure nodes. Pull Request resolved: https://github.com/pytorch/pytorch/pull/130395 Approved by: https://github.com/angelayi, https://github.com/jgong5	2024-07-18 16:31:40 +00:00
Pian Pawakapan	e8998d68c8	[export] add non-strict training IR (#130062 ) Summary: Adds non-strict implementation of training IR export. Any expected non-strict training IR failures are also either existing strict training IR or non-strict failures (no new failures added). 4 strict training IR failures also resolved. Refraining from unifying export/export_for_training, per @ydwu4's feedback :) Test Plan: added test_export_training_ir_to_run_decomp_non_strict.py for non-strict training IR Differential Revision: D59349454 Pull Request resolved: https://github.com/pytorch/pytorch/pull/130062 Approved by: https://github.com/ydwu4, https://github.com/zhxchen17	2024-07-16 17:08:00 +00:00
Aaron Orenstein	567482973d	typing fake_tensor.py (#128041 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/128041 Approved by: https://github.com/eellison ghstack dependencies: #129182	2024-07-13 06:07:40 +00:00
Pian Pawakapan	988ed4d5db	[export] clean up allow_complex_guards_as_runtime_asserts flag (#130596 ) Summary: removes underscore, cleans up dead code in DimConstraints Test Plan: existing export tests Reviewed By: angelayi Differential Revision: D59612746 Pull Request resolved: https://github.com/pytorch/pytorch/pull/130596 Approved by: https://github.com/angelayi	2024-07-12 17:17:11 +00:00
Pian Pawakapan	1b3b4c2fb9	[runtime asserts] deduplicate runtime asserts & CSE (#128599 ) (#130380 ) original PR: https://github.com/pytorch/pytorch/pull/128599 (re-created after revert + poisoned diff train) Summary: This PR adds deduplication and CSE for runtime asserts. Existing size computation in the graph is CSE'd along with added runtime asserts, and redundant asserts are removed. Shape calls on intermediate tensors are also turned into compute on input sizes if possible, allowing intermediate tensors to be freed earlier. For example: ``` z = torch.cat([x, x], dim=0) # 2s0 w = z.repeat(y.shape[0]) # 2s0s1 _w = w.shape[0] s0 = x.shape[0] s1 = y.shape[0] _w0 = 2 s0 _w = _w0 * s1 ``` Additionally, constrain_range calls are deduplicated. Single-symbol bound checks for unbacked symbols (e.g. u0 >= 0, u0 <= 5) and sym_constrain_range.default calls are also removed, since they accumulate range info in the ShapeEnv, and are replaced with two _assert_scalar.default calls that check the min/max bounds. For example: ``` torch.sym_constrain_range_for_size(n, min=2, max=16) torch.sym_constrain_range(n, min=4, max=20) torch._check(n >= 0) torch._check(n >= 3) torch._check(n <= 14) torch.sym_constrain_range_for_size(n) torch._check(n >= 4) torch._check(n <= 14) ``` Test Plan: contbuild & OSS CI, see `940e4477ab` Original Phabricator Test Plan: Imported from GitHub, without a `Test Plan:` line. Differential Revision: D59543603 Pull Request resolved: https://github.com/pytorch/pytorch/pull/130380 Approved by: https://github.com/izaitsevfb	2024-07-10 19:23:37 +00:00
Shangdi Yu	c83b941141	[export] add dynamic shapes argument and infer from graph nodes (#129928 ) Fixes the example in #118304 for `torch._functorch.aot_autograd.aot_export_module` and `torch.export.export`. On a high level, the issue is caused by not detecting fake_mode when there's no input. Change plan: 1) we add a `dynamic_shapes: Union[bool, None] = None` arg to `aot_export_module` and `_aot_export_function`. 2) if the input is not a graph module, then we can only rely on this `dynamic_shapes` input arg. 3) If the input is a graph module, then we can traverse the graph and check. 4) So we check if the input mod is a graph module or just a module, and do 2) or 3) depending on the type. Fixes #129927 Bug source: dynamo's fake_mode is not detected correctly in `_convert_input_to_fake` in `_traced.py` when there’s no input to the graph). So in ` _strict_export_lower_to_aten_ir`, we create another fake_mode. `dynamo_fake_mode` is not the same as the fake_mode used by dynamo. Change plan: check `gm_torch_level` graph's node meta "example_value" for fake mode in addition. Pull Request resolved: https://github.com/pytorch/pytorch/pull/129928 Approved by: https://github.com/angelayi	2024-07-10 15:51:05 +00:00
PyTorch MergeBot	9c9744c3ac	Revert "[runtime asserts] deduplicate runtime asserts & CSE (#128599 )" This reverts commit `940e4477ab`. Reverted https://github.com/pytorch/pytorch/pull/128599 on behalf of https://github.com/izaitsevfb due to breaking internal APS tests, see D59498864 ([comment](https://github.com/pytorch/pytorch/pull/128599#issuecomment-2218724762))	2024-07-09 21:03:49 +00:00
Pian Pawakapan	940e4477ab	[runtime asserts] deduplicate runtime asserts & CSE (#128599 ) This PR adds deduplication and CSE for runtime asserts. Existing size computation in the graph is CSE'd along with added runtime asserts, and redundant asserts are removed. Shape calls on intermediate tensors are also turned into compute on input sizes if possible, allowing intermediate tensors to be freed earlier. For example: ``` z = torch.cat([x, x], dim=0) # 2s0 w = z.repeat(y.shape[0]) # 2s0s1 _w = w.shape[0] # something with _w ... # turns into -> s0 = x.shape[0] s1 = y.shape[0] _w0 = 2 s0 _w = _w0 * s1 ``` Additionally, constrain_range calls are deduplicated. Single-symbol bound checks for unbacked symbols (e.g. u0 >= 0, u0 <= 5) and sym_constrain_range.default calls are also removed, since they accumulate range info in the ShapeEnv, and are replaced with two _assert_scalar.default calls that check the min/max bounds. For example: ``` torch.sym_constrain_range_for_size(n, min=2, max=16) torch.sym_constrain_range(n, min=4, max=20) torch._check(n >= 0) torch._check(n >= 3) torch._check(n <= 14) # turns into torch.sym_constrain_range_for_size(n) torch._check(n >= 4) torch._check(n <= 14) ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/128599 Approved by: https://github.com/ezyang	2024-07-07 20:10:14 +00:00
PyTorch MergeBot	963f430d13	Revert "[runtime asserts] deduplicate runtime asserts & CSE (#128599 )" This reverts commit `0267b2ddcb`. Reverted https://github.com/pytorch/pytorch/pull/128599 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it seems to cause a landrace and fails inductor/test_cudagraph_trees in trunk `0267b2ddcb` ([comment](https://github.com/pytorch/pytorch/pull/128599#issuecomment-2211690518))	2024-07-06 07:20:05 +00:00
Pian Pawakapan	0267b2ddcb	[runtime asserts] deduplicate runtime asserts & CSE (#128599 ) This PR adds deduplication and CSE for runtime asserts. Existing size computation in the graph is CSE'd along with added runtime asserts, and redundant asserts are removed. Shape calls on intermediate tensors are also turned into compute on input sizes if possible, allowing intermediate tensors to be freed earlier. For example: ``` z = torch.cat([x, x], dim=0) # 2s0 w = z.repeat(y.shape[0]) # 2s0s1 _w = w.shape[0] # something with _w ... # turns into -> s0 = x.shape[0] s1 = y.shape[0] _w0 = 2 s0 _w = _w0 * s1 ``` Additionally, constrain_range calls are deduplicated. Single-symbol bound checks for unbacked symbols (e.g. u0 >= 0, u0 <= 5) and sym_constrain_range.default calls are also removed, since they accumulate range info in the ShapeEnv, and are replaced with two _assert_scalar.default calls that check the min/max bounds. For example: ``` torch.sym_constrain_range_for_size(n, min=2, max=16) torch.sym_constrain_range(n, min=4, max=20) torch._check(n >= 0) torch._check(n >= 3) torch._check(n <= 14) # turns into torch.sym_constrain_range_for_size(n) torch._check(n >= 4) torch._check(n <= 14) ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/128599 Approved by: https://github.com/ezyang	2024-07-06 03:44:49 +00:00
Tugsbayasgalan Manlaibaatar	39427288f4	Taskify training IR + run_decomp flow failures (#129547 ) Differential Revision: [D59069088](https://our.internmc.facebook.com/intern/diff/D59069088) Pull Request resolved: https://github.com/pytorch/pytorch/pull/129547 Approved by: https://github.com/avikchaudhuri ghstack dependencies: #128077, #129092, #129249	2024-06-27 20:43:22 +00:00
Tugsbayasgalan Manlaibaatar	ec284d3a74	Prototype for export_for_training (#129092 ) This PR implements export_for_training where the IR is not-functional, pre-dispatch aten IR. The general strategy: 1. Call dynamo to get torch IR 2. Lift param/buffer 3. call make_fx TODO: 1. run_decomp doesn't work 2. not-strict is not supported Differential Revision: [D59069087](https://our.internmc.facebook.com/intern/diff/D59069087) Pull Request resolved: https://github.com/pytorch/pytorch/pull/129092 Approved by: https://github.com/zhxchen17 ghstack dependencies: #128077	2024-06-27 18:27:11 +00:00
Tugsbayasgalan Manlaibaatar	90f6043368	Don't decompose functional composite ops in export inference IR (#128077 ) Recently we decided to split export IR into two different IRs (training vs inference). In the inference IR, one major change we decided to introduce was we wanted to keep the composite ops that user specified in the IR. This PR does that by overriding the CompositeImplicitAutograd decomp in export inference path. Differential Revision: [D58701607](https://our.internmc.facebook.com/intern/diff/D58701607) Pull Request resolved: https://github.com/pytorch/pytorch/pull/128077 Approved by: https://github.com/bdhirsh	2024-06-26 23:07:55 +00:00
Yidi Wu	dd00f5e78d	Fixes T192448049 (#129146 ) Differential Revision: D58767610 Pull Request resolved: https://github.com/pytorch/pytorch/pull/129146 Approved by: https://github.com/angelayi	2024-06-25 17:50:15 +00:00
Zhengxu Chen	bfad0aee44	[export] Preserve requires_grad for export inputs. (#128656 ) Summary: Today meta['val'] on placeholder nodes doesn't preserve the consistent requires_grad information with the original inputs. Seems there's no easy way to fix this directly at proxy tensor layer. This is useful for reexporting joint graph. Test Plan: test_preserve_requires_grad_placeholders Differential Revision: D58555651 Pull Request resolved: https://github.com/pytorch/pytorch/pull/128656 Approved by: https://github.com/tugsbayasgalan	2024-06-17 16:26:08 +00:00
angelayi	e9c6e8369c	Torchbind call method + effects support (#128397 ) Adds effect token support to torchbind method calls by allowing `with_effects` to take in `torch.ops._higher_order_ops.call_torchbind` as an input. Here is the print from `TORCH_LOGS="aot" python test/export/test_torchbind.py -k test_compile_obj_torchbind_op`: ```python def forward(self, arg0_1: "f32[0]", arg1_1: "f32[2]", arg2_1): # File: /data/users/angelayi/pytorch2/test/export/test_torchbind.py:1266 in f, code: torch.ops._TorchScriptTesting.queue_push(tq, x.cos()) cos: "f32[2]" = torch.ops.aten.cos.default(arg1_1) with_effects = torch._higher_order_ops.effects.with_effects(arg0_1, torch.ops._TorchScriptTesting.queue_push.default, arg2_1, cos); arg0_1 = cos = None getitem: "f32[0]" = with_effects[0]; with_effects = None # File: /data/users/angelayi/pytorch2/test/export/test_torchbind.py:1267 in f, code: torch.ops._TorchScriptTesting.queue_push(tq, x.cos() + 1) cos_1: "f32[2]" = torch.ops.aten.cos.default(arg1_1) add: "f32[2]" = torch.ops.aten.add.Tensor(cos_1, 1); cos_1 = None with_effects_1 = torch._higher_order_ops.effects.with_effects(getitem, torch.ops._TorchScriptTesting.queue_push.default, arg2_1, add); getitem = add = None getitem_2: "f32[0]" = with_effects_1[0]; with_effects_1 = None # File: /data/users/angelayi/pytorch2/test/export/test_torchbind.py:1268 in f, code: torch.ops._TorchScriptTesting.queue_pop(tq) with_effects_2 = torch._higher_order_ops.effects.with_effects(getitem_2, torch.ops._TorchScriptTesting.queue_pop.default, arg2_1); getitem_2 = None getitem_4: "f32[0]" = with_effects_2[0]; with_effects_2 = None # File: /data/users/angelayi/pytorch2/test/export/test_torchbind.py:1269 in f, code: torch.ops._TorchScriptTesting.queue_push(tq, x.sin()) sin: "f32[2]" = torch.ops.aten.sin.default(arg1_1); arg1_1 = None with_effects_3 = torch._higher_order_ops.effects.with_effects(getitem_4, torch.ops._TorchScriptTesting.queue_push.default, arg2_1, sin); getitem_4 = sin = None getitem_6: "f32[0]" = with_effects_3[0]; with_effects_3 = None # File: /data/users/angelayi/pytorch2/test/export/test_torchbind.py:1270 in f, code: return tq.pop(), tq.pop() + tq.size(), tq with_effects_4 = torch._higher_order_ops.effects.with_effects(getitem_6, torch.ops._higher_order_ops.call_torchbind, arg2_1, 'pop'); getitem_6 = None getitem_8: "f32[0]" = with_effects_4[0] getitem_9: "f32[2]" = with_effects_4[1]; with_effects_4 = None with_effects_5 = torch._higher_order_ops.effects.with_effects(getitem_8, torch.ops._higher_order_ops.call_torchbind, arg2_1, 'pop'); getitem_8 = None getitem_10: "f32[0]" = with_effects_5[0] getitem_11: "f32[2]" = with_effects_5[1]; with_effects_5 = None with_effects_6 = torch._higher_order_ops.effects.with_effects(getitem_10, torch.ops._higher_order_ops.call_torchbind, arg2_1, 'size'); getitem_10 = arg2_1 = None getitem_12: "f32[0]" = with_effects_6[0]; with_effects_6 = None add_1: "f32[2]" = torch.ops.aten.add.Tensor(getitem_11, 0); getitem_11 = None return (getitem_12, getitem_9, add_1) ``` In order to support this, this PR makes the following changes: * Adds `FakeScriptObject` to `CustomObjArgument`, which will be put on the `meta["val"]` of nodes representing torchbind objects. * Adds pickle/deepcopy support to FunctionSchema. Pull Request resolved: https://github.com/pytorch/pytorch/pull/128397 Approved by: https://github.com/ydwu4, https://github.com/zou3519	2024-06-14 21:28:17 +00:00
Zhengxu Chen	be0eec9031	[export] Improve static typing in tracer. (#128552 ) Summary: as title. Test Plan: CI Differential Revision: D58485487 Pull Request resolved: https://github.com/pytorch/pytorch/pull/128552 Approved by: https://github.com/angelayi	2024-06-14 17:57:37 +00:00
Aaron Orenstein	7c12cc7ce4	Flip default value for mypy disallow_untyped_defs [6/11] (#127843 ) See #127836 for details. Pull Request resolved: https://github.com/pytorch/pytorch/pull/127843 Approved by: https://github.com/oulgen ghstack dependencies: #127842	2024-06-08 18:49:29 +00:00
Pian Pawakapan	e505132797	[export] track TORCH_DYNAMO_DO_NOT_EMIT_RUNTIME_ASSERTS for export runtime asserts (#127554 ) Track TORCH_DYNAMO_DO_NOT_EMIT_RUNTIME_ASSERTS=1 in export so it doesn't omit runtime asserts. Differential Revision: D57978699 Pull Request resolved: https://github.com/pytorch/pytorch/pull/127554 Approved by: https://github.com/tugsbayasgalan	2024-06-05 04:16:54 +00:00
Tugsbayasgalan (Tugsuu) Manlaibaatar	ff32f6c93b	Use freshly traced jit-traced module to be used in export analysis (#127577 ) Summary: When we export already traced module, it seems to be modifying some global state causing the traced modules to fail to run. For now, we are only logging for test cases, so it is probs ok to trace fresh copy to be used in export for now. Test Plan: CI Differential Revision: D57983518 Pull Request resolved: https://github.com/pytorch/pytorch/pull/127577 Approved by: https://github.com/pianpwk	2024-06-04 16:54:23 +00:00
Pian Pawakapan	8a31c2aa84	[export] allow complex guards as runtime asserts (#127129 ) With the current state of export's dynamic shapes, we struggle with guards and constraints that are beyond the current dynamic shapes language, expressed with dims and derived dims. While we can compile and guarantee correctness for guards within the current language (e.g. min/max ranges, linear relationships, integer divisibility) we struggle to dynamically compile guards which extend beyond that. For these "complex" guards, we typically do either of the following: 1) raise a constraint violation error, along the lines of "not all values of <symbol> in the specified range satisfy <guard>", with or without suggested fixes, 2) specialize to the provided static values and suggest removing dynamism, or 3) fail compilation due to some arbitrary unsupported case. Previous [work](https://github.com/pytorch/pytorch/pull/124949) went towards resolving this by disabling forced specializations, instead allowing the user to fail at runtime with incorrect inputs. In this PR, relying on [hybrid backed-unbacked symints](https://github.com/pytorch/pytorch/issues/121749), [deferred runtime asserts](https://github.com/pytorch/pytorch/blob/main/torch/fx/passes/runtime_assert.py), and the function [_is_supported_equivalence()](`d7de4c9d80/torch/fx/experimental/symbolic_shapes.py (L1824)`), we add a flag `_allow_complex_guards_as_runtime_asserts` which allows the user to compile exported programs containing these guards and maintain dynamism, while adding correctness checks as runtime assertions in the graph. Hybrid backed-unbacked symints allow us to easily bypass "implicit" guards emitted from computation - guards that we ~expect to be true. Popular examples revolve around reshapes: ``` # reshape def forward(self, x, y): # x: [s0, s1], y: [s2] return x.reshape([-1]) + y # guard s0 * s1 = s2 This leads to the following exported program class GraphModule(torch.nn.Module): def forward(self, x: "f32[s0, s1]", y: "f32[s2]"): sym_size_int: "Sym(s2)" = torch.ops.aten.sym_size.int(y, 0) mul: "Sym(-s2)" = -1 * sym_size_int; sym_size_int = None sym_size_int_1: "Sym(s0)" = torch.ops.aten.sym_size.int(x, 0) sym_size_int_2: "Sym(s1)" = torch.ops.aten.sym_size.int(x, 1) mul_1: "Sym(s0s1)" = sym_size_int_1 sym_size_int_2; sym_size_int_1 = sym_size_int_2 = None add: "Sym(s0s1 - s2)" = mul + mul_1; mul = mul_1 = None eq: "Sym(Eq(s0s1 - s2, 0))" = add == 0; add = None _assert_scalar = torch.ops.aten._assert_scalar.default(eq, "Runtime assertion failed for expression Eq(s0s1 - s2, 0) on node 'eq'"); eq = None view: "f32[s0s1]" = torch.ops.aten.view.default(x, [-1]); x = None add_1: "f32[s0s1]" = torch.ops.aten.add.Tensor(view, y); view = y = None return (add_1,) ``` Another case is symbol divisibility: ``` def forward(self, x): # x: [s0, s1] return x.reshape([-1, x.shape[0] - 1]) # Eq(Mod(s0 s1, s0 - 1), 0) ``` Applying deferred runtime asserts also helps dynamic compilation for "explicit" complex guards that typically cause problems for export. For example we can generate runtime asserts for not-equal guards, and complex conditions like the following: ``` class Foo(torch.nn.Module): def forward(self, x, y): # check that negation of first guard also shows up as runtime assertion if x.shape[0] == y.shape[0]: # False return x + y elif x.shape[0] == y.shape[0] 3: # False return x + 2, y + 3 elif x.shape[0] 2 == y.shape[0] * 3: # True return x * 2.0, y * 3.0 ``` For the above graph we will generate 3 runtime assertions: the negation of the first 2, and the 3rd condition as a guard. One additional benefit here over the current state of exported programs is that this adds further correctness guarantees - previously with explicit complex guards, if compilation succeeded, the guards would be ignored at runtime, treated as given. As shown above, the runtime asserts appear as math ops in the graph, generated by the sympy interpreter, resulting in an _assert_scalar call. There is an option to avoid adding these asserts into the graph, by setting `TORCH_DYNAMO_DO_NOT_EMIT_RUNTIME_ASSERTS=1`. This results in the "original" computation graph, with dynamism, and any incorrect inputs will fail on ops during runtime. Further work could go into prettifying the printer, so the majority of the graph isn't guard-related. Ideally this PR would subsume and remove the recently added [_disable_forced_specializations](https://github.com/pytorch/pytorch/pull/124949) flag, but that flag still handles one additional case of specialization: single-variable equalities where the symbol is solvable for a concrete value: see this [PR](https://github.com/pytorch/pytorch/pull/126925) This PR doesn't change any behavior around data-dependent errors/unbacked symints yet, that could be further work. NOTE: will take naming change suggestions for the flag :) Pull Request resolved: https://github.com/pytorch/pytorch/pull/127129 Approved by: https://github.com/avikchaudhuri	2024-05-29 17:15:25 +00:00

1 2 3 4

168 Commits