pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 00:21:07 +01:00

Author	SHA1	Message	Date
Deysha Rivera	aec7cc60d7	add graph_code_verbose_log artifact for fx passes (#153775 ) Fixes #153646 This PR refactors the logging behavior in the FX pass insert_deferred_runtime_asserts and runtime_assert.py to separate verbose/intermediate graph logs from the final output graph log. All verbose logs generated during the FX pass are now routed to a new artifact logger, graph_code_verbose, while only the final output graph remains logged to the original graph_code artifact. Changes - Added a new artifact logger: [graph_code_log = torch._logging.getArtifactLogger(__name__, "graph_code_verbose")] - Updated all verbose/intermediate FX pass logs in [insert_deferred_runtime_asserts] to use the new graph_code_verbose artifact. - Ensured that only the final output graph is logged to the original graph_code artifact. - No changes to the FX pass logic or output—only logging behavior is affected. Notes This change is backward-compatible and does not affect the functional behavior of FX passes. No changes to user-facing APIs. Pull Request resolved: https://github.com/pytorch/pytorch/pull/153775 Approved by: https://github.com/williamwen42	2025-05-21 18:31:59 +00:00
Pian Pawakapan	fd3d339e17	[dynamic shapes] be less aggressive with runtime assert CSE for bounds (#151590 ) Fixes #150540 Fixes #147772 Stops trying to CSE bound expressions, only does exact deduplication for runtime asserts. Adds the test cases to check that AOTAutograd doesn't data-dependent error out when retracing due to not seeing the asserts. Pull Request resolved: https://github.com/pytorch/pytorch/pull/151590 Approved by: https://github.com/laithsakka	2025-04-23 23:07:00 +00:00
Aaron Orenstein	0b2a3687b9	PEP585 update - torch/fx (#145166 ) See #145101 for details. Pull Request resolved: https://github.com/pytorch/pytorch/pull/145166 Approved by: https://github.com/bobrenjc93	2025-01-20 18:11:54 +00:00
Joel Schlosser	c5ddf5dd90	Unbacked SymInt fixes for subclasses + data-dependent slice() bounds (non-dynamic) (#143526 ) Lifted non-controversial (non-dynamic) fixes from #142062. See description there for context. Pull Request resolved: https://github.com/pytorch/pytorch/pull/143526 Approved by: https://github.com/ezyang	2024-12-19 18:46:36 +00:00
Laith Sakka	caa3a3e12c	Only compute new_untracked_symbols and new_unbacked_bindings if needed. (#140083 ) Summary: 237s -> 198.. buck2 run fbcode//mode/opt fbcode//torchrec/distributed/tests:pt2_compile_benchmark -- --num-features=2000 Test Plan: NA Differential Revision: D65638637 Pull Request resolved: https://github.com/pytorch/pytorch/pull/140083 Approved by: https://github.com/ezyang, https://github.com/isuruf, https://github.com/anijain2305	2024-11-20 19:28:18 +00:00
Laith Sakka	ed313a5ca2	Introduce torch.sym_add, variadic add (#138660 ) Tested internally here: https://www.internalfb.com/diff/D64057744 This is a reland after previous internal failures. main change is ``` if min is None and max is None: torch._check_is_size(size) return ``` Partially addresses https://github.com/pytorch/pytorch/issues/128150 When you have big sums of values, we end up computing long chains of binary addition in our FX graph representation. Not only is this ugly, it also is quadratic, as the sympy.Add constructor is O(N) in number of arguments. Instead, ensure that we maintain the summation as a single FX node so we can do the entire addition all in one go. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/138660 Approved by: https://github.com/ezyang, https://github.com/bobrenjc93	2024-10-23 17:42:41 +00:00
PyTorch MergeBot	16a2c2cfd4	Revert "Introduce torch.sym_sum (#136429 )" This reverts commit `90bed32b98`. Reverted https://github.com/pytorch/pytorch/pull/136429 on behalf of https://github.com/ezyang due to fails internal stuff ([comment](https://github.com/pytorch/pytorch/pull/136429#issuecomment-2403335147))	2024-10-09 20:08:01 +00:00
Edward Z. Yang	90bed32b98	Introduce torch.sym_sum (#136429 ) Partially addresses https://github.com/pytorch/pytorch/issues/128150 When you have big sums of values, we end up computing long chains of binary addition in our FX graph representation. Not only is this ugly, it also is quadratic, as the sympy.Add constructor is O(N) in number of arguments. Instead, ensure that we maintain the summation as a single FX node so we can do the entire addition all in one go. update_hint_regression benchmark, before and after: ``` update_hint_regression,compile_time_instruction_count,2648328980 update_hint_regression,compile_time_instruction_count,2563748678 ``` Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/136429 Approved by: https://github.com/isuruf	2024-10-08 18:12:57 +00:00
Edward Z. Yang	cc8f1cddd4	Turn on type-checking in torch.fx.experimental.symbolic_shapes (#136972 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/136972 Approved by: https://github.com/Skylion007 ghstack dependencies: #136934, #136935	2024-10-01 13:22:10 +00:00
PyTorch MergeBot	8982906502	Revert "Turn on type-checking in torch.fx.experimental.symbolic_shapes (#136972 )" This reverts commit `3ff2d93d9f`. Reverted https://github.com/pytorch/pytorch/pull/136972 on behalf of https://github.com/ezyang due to need to back out for merge conflict ([comment](https://github.com/pytorch/pytorch/pull/136972#issuecomment-2384182244))	2024-09-30 21:35:08 +00:00
Edward Z. Yang	3ff2d93d9f	Turn on type-checking in torch.fx.experimental.symbolic_shapes (#136972 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/136972 Approved by: https://github.com/Skylion007 ghstack dependencies: #136917, #136934, #136935	2024-09-30 18:04:36 +00:00
Pian Pawakapan	5a0e7a408f	restore CSE'd node metadata in runtime asserts pass (#134516 ) Adds val, and optionally stack_trace & nn_module_stack metadata back to SymInt compute nodes that we CSE, with a hook on `graph.create_node()`. Not sure if there's other metadata we want to populate here? Pull Request resolved: https://github.com/pytorch/pytorch/pull/134516 Approved by: https://github.com/ezyang	2024-09-05 07:50:04 +00:00
PyTorch MergeBot	c88c19c6de	Revert "restore CSE'd node metadata in runtime asserts pass (#134516 )" This reverts commit `1dfb105239`. Reverted https://github.com/pytorch/pytorch/pull/134516 on behalf of https://github.com/pianpwk due to breaking NestedTensor test ([comment](https://github.com/pytorch/pytorch/pull/134516#issuecomment-2329738450))	2024-09-04 18:41:21 +00:00
Pian Pawakapan	1dfb105239	restore CSE'd node metadata in runtime asserts pass (#134516 ) Adds val, and optionally stack_trace & nn_module_stack metadata back to SymInt compute nodes that we CSE, with a hook on `graph.create_node()`. Not sure if there's other metadata we want to populate here? Pull Request resolved: https://github.com/pytorch/pytorch/pull/134516 Approved by: https://github.com/ezyang	2024-09-04 05:56:28 +00:00
Pian Pawakapan	0c7856973b	[export] enumerate unsupported sympy.Functions (#134271 ) (#134598 ) Summary: There's 2 concepts of unsupported sympy.Functions in symbolic_shapes: 1) unsupported by the export solver, meaning the solver doesn't know how to provide useful fixes for those functions 2) unsupported by the sympy interpreter - meaning we can't reify them into FX nodes because the functions aren't present in PythonReferenceAnalysis This splits the current call into a call for each version, with the Export solver the only user of 1). For 1), we enumerate the functions in _sympy/functions.py, and subtract the functions we know we can support. For 2) there's only 3 functions we've seen pop up in test cases. cc jgong5 mingfeima XiaobingSuper sanchitintel ashokei jingxu10 Differential Revision: D61863394 Pulled By: pianpwk Pull Request resolved: https://github.com/pytorch/pytorch/pull/134598 Approved by: https://github.com/angelayi	2024-08-28 00:34:38 +00:00
PyTorch MergeBot	141a9c7204	Revert "[export] enumerate unsupported sympy.Functions (#134271 )" This reverts commit `ddd71e3479`. Reverted https://github.com/pytorch/pytorch/pull/134271 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/134271#issuecomment-2311353460))	2024-08-27 00:45:00 +00:00
Pian Pawakapan	ddd71e3479	[export] enumerate unsupported sympy.Functions (#134271 ) There's 2 concepts of unsupported sympy.Functions in symbolic_shapes: 1) unsupported by the export solver, meaning the solver doesn't know how to provide useful fixes for those functions 2) unsupported by the sympy interpreter - meaning we can't reify them into FX nodes because the functions aren't present in PythonReferenceAnalysis This splits the current call into a call for each version, with the Export solver the only user of 1). For 1), we enumerate the functions in _sympy/functions.py, and subtract the functions we know we can support. For 2) there's only 3 functions we've seen pop up in test cases. Differential Revision: D61677956 Pull Request resolved: https://github.com/pytorch/pytorch/pull/134271 Approved by: https://github.com/avikchaudhuri	2024-08-26 22:44:12 +00:00
Aaron Orenstein	ed86ac2f25	[BE] typing for decorators - fx/_compatibility (#134054 ) Summary: See #131429 Test Plan: unit tests pass Differential Revision: D61493706 Pull Request resolved: https://github.com/pytorch/pytorch/pull/134054 Approved by: https://github.com/oulgen	2024-08-26 04:00:27 +00:00
Avik Chaudhuri	5ed3b70d09	remove redundant upper bound check at runtime (#133627 ) Summary: Some symbols (unbacked symints?) can have upper bound that is `sys.maxsize - 1` but our code for runtime assertions assumes that such upper bounds would come in as `sympy.oo` (like backed symints?) in order to drop them. So we weren't dropping them, which this PR fixes. Test Plan: added test Differential Revision: D61352056 Pull Request resolved: https://github.com/pytorch/pytorch/pull/133627 Approved by: https://github.com/SherlockNoMad	2024-08-16 06:57:12 +00:00
Aart Bik	a8490a0762	[traced-graph][sparse] propagate sparsity in fx graph (#131920 ) This PR proceeds with implementing the feature request #117188 by generalizing more cases that already work with COO to work with the compressed sparse formats as well. Feature request: https://github.com/pytorch/pytorch/issues/117188 Rebranch of older PRs (for history): https://github.com/pytorch/pytorch/pull/131474 https://github.com/pytorch/pytorch/pull/128549 Pull Request resolved: https://github.com/pytorch/pytorch/pull/131920 Approved by: https://github.com/ezyang	2024-08-05 15:49:53 +00:00
Xuehai Pan	f3fce597e9	[BE][Easy][17/19] enforce style for empty lines in import segments in `torch/[a-c]/` and `torch/[e-n]/` (#129769 ) See https://github.com/pytorch/pytorch/pull/129751#issue-2380881501. Most changes are auto-generated by linter. You can review these PRs via: ```bash git diff --ignore-all-space --ignore-blank-lines HEAD~1 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/129769 Approved by: https://github.com/ezyang	2024-08-04 10:24:09 +00:00
Pian Pawakapan	a896fb1b36	check unsupported sympy functions for runtime asserts (#132457 ) Some sympy Functions aren't supported by sympy_interp(); we can't turn them into FX nodes, so currently the runtime asserts CSE pass avoids CSE'ing on any expression containing a sympy Function. https://github.com/pytorch/pytorch/pull/132325 started tracking unsupported functions, so we switch the check to that to be more precise. We also check for and skip unsupported functions when adding asserts - previously we only did the check for CSE, and not adding new expressions. Pull Request resolved: https://github.com/pytorch/pytorch/pull/132457 Approved by: https://github.com/avikchaudhuri	2024-08-03 10:17:25 +00:00
PyTorch MergeBot	945bf78894	Revert "[BE] typing for decorators - fx/_compatibility (#131568 )" This reverts commit `193f62fde9`. Reverted https://github.com/pytorch/pytorch/pull/131568 on behalf of https://github.com/clee2000 due to same as https://github.com/pytorch/pytorch/pull/131572#issuecomment-2254328359 but I clicked the wrong link by accident. This is where it actually starts ([comment](https://github.com/pytorch/pytorch/pull/131568#issuecomment-2254330781))	2024-07-28 03:43:39 +00:00
Aaron Orenstein	193f62fde9	[BE] typing for decorators - fx/_compatibility (#131568 ) See #131429 Pull Request resolved: https://github.com/pytorch/pytorch/pull/131568 Approved by: https://github.com/justinchuby, https://github.com/oulgen, https://github.com/zou3519	2024-07-25 22:24:19 +00:00
Aaron Orenstein	5a0068cc69	[BE] mypy: disallow untyped decorators (#131428 ) Untyped decorators strip the types from their decorated function so even if the underlying function is fully typed then callers to it don't get any benefit from type annotations. Step 1 - Enable the error and override in all the offending files. #131429 Pull Request resolved: https://github.com/pytorch/pytorch/pull/131428 Approved by: https://github.com/justinchuby, https://github.com/oulgen	2024-07-23 21:50:55 +00:00
Pian Pawakapan	1b3b4c2fb9	[runtime asserts] deduplicate runtime asserts & CSE (#128599 ) (#130380 ) original PR: https://github.com/pytorch/pytorch/pull/128599 (re-created after revert + poisoned diff train) Summary: This PR adds deduplication and CSE for runtime asserts. Existing size computation in the graph is CSE'd along with added runtime asserts, and redundant asserts are removed. Shape calls on intermediate tensors are also turned into compute on input sizes if possible, allowing intermediate tensors to be freed earlier. For example: ``` z = torch.cat([x, x], dim=0) # 2s0 w = z.repeat(y.shape[0]) # 2s0s1 _w = w.shape[0] s0 = x.shape[0] s1 = y.shape[0] _w0 = 2 s0 _w = _w0 * s1 ``` Additionally, constrain_range calls are deduplicated. Single-symbol bound checks for unbacked symbols (e.g. u0 >= 0, u0 <= 5) and sym_constrain_range.default calls are also removed, since they accumulate range info in the ShapeEnv, and are replaced with two _assert_scalar.default calls that check the min/max bounds. For example: ``` torch.sym_constrain_range_for_size(n, min=2, max=16) torch.sym_constrain_range(n, min=4, max=20) torch._check(n >= 0) torch._check(n >= 3) torch._check(n <= 14) torch.sym_constrain_range_for_size(n) torch._check(n >= 4) torch._check(n <= 14) ``` Test Plan: contbuild & OSS CI, see `940e4477ab` Original Phabricator Test Plan: Imported from GitHub, without a `Test Plan:` line. Differential Revision: D59543603 Pull Request resolved: https://github.com/pytorch/pytorch/pull/130380 Approved by: https://github.com/izaitsevfb	2024-07-10 19:23:37 +00:00
PyTorch MergeBot	9c9744c3ac	Revert "[runtime asserts] deduplicate runtime asserts & CSE (#128599 )" This reverts commit `940e4477ab`. Reverted https://github.com/pytorch/pytorch/pull/128599 on behalf of https://github.com/izaitsevfb due to breaking internal APS tests, see D59498864 ([comment](https://github.com/pytorch/pytorch/pull/128599#issuecomment-2218724762))	2024-07-09 21:03:49 +00:00
Pian Pawakapan	940e4477ab	[runtime asserts] deduplicate runtime asserts & CSE (#128599 ) This PR adds deduplication and CSE for runtime asserts. Existing size computation in the graph is CSE'd along with added runtime asserts, and redundant asserts are removed. Shape calls on intermediate tensors are also turned into compute on input sizes if possible, allowing intermediate tensors to be freed earlier. For example: ``` z = torch.cat([x, x], dim=0) # 2s0 w = z.repeat(y.shape[0]) # 2s0s1 _w = w.shape[0] # something with _w ... # turns into -> s0 = x.shape[0] s1 = y.shape[0] _w0 = 2 s0 _w = _w0 * s1 ``` Additionally, constrain_range calls are deduplicated. Single-symbol bound checks for unbacked symbols (e.g. u0 >= 0, u0 <= 5) and sym_constrain_range.default calls are also removed, since they accumulate range info in the ShapeEnv, and are replaced with two _assert_scalar.default calls that check the min/max bounds. For example: ``` torch.sym_constrain_range_for_size(n, min=2, max=16) torch.sym_constrain_range(n, min=4, max=20) torch._check(n >= 0) torch._check(n >= 3) torch._check(n <= 14) # turns into torch.sym_constrain_range_for_size(n) torch._check(n >= 4) torch._check(n <= 14) ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/128599 Approved by: https://github.com/ezyang	2024-07-07 20:10:14 +00:00
PyTorch MergeBot	963f430d13	Revert "[runtime asserts] deduplicate runtime asserts & CSE (#128599 )" This reverts commit `0267b2ddcb`. Reverted https://github.com/pytorch/pytorch/pull/128599 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it seems to cause a landrace and fails inductor/test_cudagraph_trees in trunk `0267b2ddcb` ([comment](https://github.com/pytorch/pytorch/pull/128599#issuecomment-2211690518))	2024-07-06 07:20:05 +00:00
Pian Pawakapan	0267b2ddcb	[runtime asserts] deduplicate runtime asserts & CSE (#128599 ) This PR adds deduplication and CSE for runtime asserts. Existing size computation in the graph is CSE'd along with added runtime asserts, and redundant asserts are removed. Shape calls on intermediate tensors are also turned into compute on input sizes if possible, allowing intermediate tensors to be freed earlier. For example: ``` z = torch.cat([x, x], dim=0) # 2s0 w = z.repeat(y.shape[0]) # 2s0s1 _w = w.shape[0] # something with _w ... # turns into -> s0 = x.shape[0] s1 = y.shape[0] _w0 = 2 s0 _w = _w0 * s1 ``` Additionally, constrain_range calls are deduplicated. Single-symbol bound checks for unbacked symbols (e.g. u0 >= 0, u0 <= 5) and sym_constrain_range.default calls are also removed, since they accumulate range info in the ShapeEnv, and are replaced with two _assert_scalar.default calls that check the min/max bounds. For example: ``` torch.sym_constrain_range_for_size(n, min=2, max=16) torch.sym_constrain_range(n, min=4, max=20) torch._check(n >= 0) torch._check(n >= 3) torch._check(n <= 14) # turns into torch.sym_constrain_range_for_size(n) torch._check(n >= 4) torch._check(n <= 14) ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/128599 Approved by: https://github.com/ezyang	2024-07-06 03:44:49 +00:00
chilli	c486e2ab64	Add coloring to fx graph print out (#128476 ) Note: Won't land immediately, at least I'll need to add a color option to the field. But curious if any tests fail. Old: <img width="1294" alt="image" src="https://github.com/pytorch/pytorch/assets/6355099/c3a750ed-5e54-4621-b2e4-be5481be15b6"> New: <img width="1303" alt="image" src="https://github.com/pytorch/pytorch/assets/6355099/3a1f1adc-6f3a-413e-8b87-ee53da9bf4ed"> Pull Request resolved: https://github.com/pytorch/pytorch/pull/128476 Approved by: https://github.com/ezyang	2024-06-13 23:39:04 +00:00
Edward Z. Yang	2229884102	Introduce int_oo (#127693 ) In a previous life, we used sympy.oo to represent the lower/upper bounds of integer ranges. Later, we changed this to be sys.maxsize - 1 for a few reasons: (1) sometimes we do tests on a value being exactly sys.maxsize, and we wanted to avoid a data dependent guard in this case, (2) sympy.oo corresponds to floating point infinity, so you get incorrect types for value ranges with oo, and (3) you can do slightly better reasoning if you assume that input sizes fall within representable 64-bit integer range. After working in the sys.maxsize regime for a bit, I've concluded that this was actually a bad idea. Specifically, the problem is that you end up with sys.maxsize in your upper bound, and then whenever you do any sort of size-increasing computation like size * 2, you end up with 2 * sys.maxsize, and you end up doing a ton of arbitrary precision int computation that is totally unnecessary. A symbolic bound is better. But especially after #126905, we can't go back to using sympy.oo, because that advertises that it's not an integer, and now your ValueRanges is typed incorrectly. So what do we do? We define a new numeric constant `int_oo`, which is like `sympy.oo` but it advertises `is_integer`. test/test_sympy_utils.py describes some basic properties of the number, and torch/utils/_sympy/numbers.py has the actual implementation. The rest of the changes of the PR are working out the implications of this change. I'll give more commentary as inline comments. Fixes https://github.com/pytorch/pytorch/issues/127396 Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/127693 Approved by: https://github.com/lezcano ghstack dependencies: #126905	2024-06-13 04:08:20 +00:00
PyTorch MergeBot	5d8c7f39d4	Revert "Introduce int_oo (#127693 )" This reverts commit `9cab5987bd`. Reverted https://github.com/pytorch/pytorch/pull/127693 on behalf of https://github.com/clee2000 due to sorry executorch CI is a bit weird regarding pins, I'll make a chat with mergen with the choices of what to do and how it'll affect executorch CI, reverting for now to prevent more divergences in the meantime ([comment](https://github.com/pytorch/pytorch/pull/127693#issuecomment-2161775400))	2024-06-11 23:36:08 +00:00
Edward Z. Yang	9cab5987bd	Introduce int_oo (#127693 ) In a previous life, we used sympy.oo to represent the lower/upper bounds of integer ranges. Later, we changed this to be sys.maxsize - 1 for a few reasons: (1) sometimes we do tests on a value being exactly sys.maxsize, and we wanted to avoid a data dependent guard in this case, (2) sympy.oo corresponds to floating point infinity, so you get incorrect types for value ranges with oo, and (3) you can do slightly better reasoning if you assume that input sizes fall within representable 64-bit integer range. After working in the sys.maxsize regime for a bit, I've concluded that this was actually a bad idea. Specifically, the problem is that you end up with sys.maxsize in your upper bound, and then whenever you do any sort of size-increasing computation like size * 2, you end up with 2 * sys.maxsize, and you end up doing a ton of arbitrary precision int computation that is totally unnecessary. A symbolic bound is better. But especially after #126905, we can't go back to using sympy.oo, because that advertises that it's not an integer, and now your ValueRanges is typed incorrectly. So what do we do? We define a new numeric constant `int_oo`, which is like `sympy.oo` but it advertises `is_integer`. test/test_sympy_utils.py describes some basic properties of the number, and torch/utils/_sympy/numbers.py has the actual implementation. The rest of the changes of the PR are working out the implications of this change. I'll give more commentary as inline comments. Fixes https://github.com/pytorch/pytorch/issues/127396 Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/127693 Approved by: https://github.com/lezcano ghstack dependencies: #126905	2024-06-10 19:09:53 +00:00
Aaron Orenstein	038b927590	Flip default value for mypy disallow_untyped_defs [7/11] (#127844 ) See #127836 for details. Pull Request resolved: https://github.com/pytorch/pytorch/pull/127844 Approved by: https://github.com/oulgen ghstack dependencies: #127842, #127843	2024-06-08 18:49:45 +00:00
Tugsbayasgalan (Tugsuu) Manlaibaatar	b5e85b8ecc	Add deferred_runtime_assertion pass after run_decompositions (#127305 ) Summary: We also want to reinsert the deferred_runtime passes after run_decompositions as well Test Plan: CI Reviewed By: zhxchen17 Differential Revision: D57802237 Pull Request resolved: https://github.com/pytorch/pytorch/pull/127305 Approved by: https://github.com/BoyuanFeng	2024-05-31 05:45:28 +00:00
Pian Pawakapan	8a31c2aa84	[export] allow complex guards as runtime asserts (#127129 ) With the current state of export's dynamic shapes, we struggle with guards and constraints that are beyond the current dynamic shapes language, expressed with dims and derived dims. While we can compile and guarantee correctness for guards within the current language (e.g. min/max ranges, linear relationships, integer divisibility) we struggle to dynamically compile guards which extend beyond that. For these "complex" guards, we typically do either of the following: 1) raise a constraint violation error, along the lines of "not all values of <symbol> in the specified range satisfy <guard>", with or without suggested fixes, 2) specialize to the provided static values and suggest removing dynamism, or 3) fail compilation due to some arbitrary unsupported case. Previous [work](https://github.com/pytorch/pytorch/pull/124949) went towards resolving this by disabling forced specializations, instead allowing the user to fail at runtime with incorrect inputs. In this PR, relying on [hybrid backed-unbacked symints](https://github.com/pytorch/pytorch/issues/121749), [deferred runtime asserts](https://github.com/pytorch/pytorch/blob/main/torch/fx/passes/runtime_assert.py), and the function [_is_supported_equivalence()](`d7de4c9d80/torch/fx/experimental/symbolic_shapes.py (L1824)`), we add a flag `_allow_complex_guards_as_runtime_asserts` which allows the user to compile exported programs containing these guards and maintain dynamism, while adding correctness checks as runtime assertions in the graph. Hybrid backed-unbacked symints allow us to easily bypass "implicit" guards emitted from computation - guards that we ~expect to be true. Popular examples revolve around reshapes: ``` # reshape def forward(self, x, y): # x: [s0, s1], y: [s2] return x.reshape([-1]) + y # guard s0 * s1 = s2 This leads to the following exported program class GraphModule(torch.nn.Module): def forward(self, x: "f32[s0, s1]", y: "f32[s2]"): sym_size_int: "Sym(s2)" = torch.ops.aten.sym_size.int(y, 0) mul: "Sym(-s2)" = -1 * sym_size_int; sym_size_int = None sym_size_int_1: "Sym(s0)" = torch.ops.aten.sym_size.int(x, 0) sym_size_int_2: "Sym(s1)" = torch.ops.aten.sym_size.int(x, 1) mul_1: "Sym(s0s1)" = sym_size_int_1 sym_size_int_2; sym_size_int_1 = sym_size_int_2 = None add: "Sym(s0s1 - s2)" = mul + mul_1; mul = mul_1 = None eq: "Sym(Eq(s0s1 - s2, 0))" = add == 0; add = None _assert_scalar = torch.ops.aten._assert_scalar.default(eq, "Runtime assertion failed for expression Eq(s0s1 - s2, 0) on node 'eq'"); eq = None view: "f32[s0s1]" = torch.ops.aten.view.default(x, [-1]); x = None add_1: "f32[s0s1]" = torch.ops.aten.add.Tensor(view, y); view = y = None return (add_1,) ``` Another case is symbol divisibility: ``` def forward(self, x): # x: [s0, s1] return x.reshape([-1, x.shape[0] - 1]) # Eq(Mod(s0 s1, s0 - 1), 0) ``` Applying deferred runtime asserts also helps dynamic compilation for "explicit" complex guards that typically cause problems for export. For example we can generate runtime asserts for not-equal guards, and complex conditions like the following: ``` class Foo(torch.nn.Module): def forward(self, x, y): # check that negation of first guard also shows up as runtime assertion if x.shape[0] == y.shape[0]: # False return x + y elif x.shape[0] == y.shape[0] 3: # False return x + 2, y + 3 elif x.shape[0] 2 == y.shape[0] * 3: # True return x * 2.0, y * 3.0 ``` For the above graph we will generate 3 runtime assertions: the negation of the first 2, and the 3rd condition as a guard. One additional benefit here over the current state of exported programs is that this adds further correctness guarantees - previously with explicit complex guards, if compilation succeeded, the guards would be ignored at runtime, treated as given. As shown above, the runtime asserts appear as math ops in the graph, generated by the sympy interpreter, resulting in an _assert_scalar call. There is an option to avoid adding these asserts into the graph, by setting `TORCH_DYNAMO_DO_NOT_EMIT_RUNTIME_ASSERTS=1`. This results in the "original" computation graph, with dynamism, and any incorrect inputs will fail on ops during runtime. Further work could go into prettifying the printer, so the majority of the graph isn't guard-related. Ideally this PR would subsume and remove the recently added [_disable_forced_specializations](https://github.com/pytorch/pytorch/pull/124949) flag, but that flag still handles one additional case of specialization: single-variable equalities where the symbol is solvable for a concrete value: see this [PR](https://github.com/pytorch/pytorch/pull/126925) This PR doesn't change any behavior around data-dependent errors/unbacked symints yet, that could be further work. NOTE: will take naming change suggestions for the flag :) Pull Request resolved: https://github.com/pytorch/pytorch/pull/127129 Approved by: https://github.com/avikchaudhuri	2024-05-29 17:15:25 +00:00
Tugsbayasgalan (Tugsuu) Manlaibaatar	246311c944	Unconditionally add asserts after export (#127132 ) Summary: Today AOTAutograd drops some of assert nodes so we reapply it after strict export. Test Plan: CI Reviewed By: angelayi Differential Revision: D57786907 Pull Request resolved: https://github.com/pytorch/pytorch/pull/127132 Approved by: https://github.com/zhxchen17	2024-05-28 06:31:39 +00:00
Tugsbayasgalan (Tugsuu) Manlaibaatar	72f0bdcc22	Remove torch._constrain_as_value (#127103 ) Summary: This API doesn't do anything useful and should be subsumed by torch._check. Test Plan: CI Differential Revision: D57786740 Pull Request resolved: https://github.com/pytorch/pytorch/pull/127103 Approved by: https://github.com/angelayi	2024-05-24 22:49:46 +00:00
Joel Schlosser	31ba6ee49b	Traceable wrapper subclass support for deferred runtime asserts (#126198 ) The padded dense -> jagged conversion op has the signature: ``` _fbgemm_dense_to_jagged_forward(Tensor dense, Tensor[] offsets, SymInt? total_L=None) -> Tensor ``` when `total_L` is not specified, the meta registration has a data-dependent output shape (based on `offsets[0][-1]`). Returning an unbacked SymInt here should work in theory, but traceable wrapper subclass support is missing in later code to handle deferred runtime asserts. This PR fixes this. Pull Request resolved: https://github.com/pytorch/pytorch/pull/126198 Approved by: https://github.com/ezyang	2024-05-21 01:21:46 +00:00
angelayi	c712b0f8a3	[export] Fix runtime assertions to add call_function (#125878 ) Fixes [internal issue](https://www.internalfb.com/intern/everpaste/?handle=GJCK9xUNpYXovnEBAHfuJ7vQLxZnbsIXAAAB) Pull Request resolved: https://github.com/pytorch/pytorch/pull/125878 Approved by: https://github.com/ezyang	2024-05-14 00:57:50 +00:00
Aaron Orenstein	e71207b729	Fix infinite recursion in API BC test (#125706 ) ``` python test/test_fx.py -k test_public_api_surface ``` was failing with a complaint about infinite recursion. Fixed that and then marked the two API changes from #123681 as private (for `get_example_value`) and backward compatible (for `insert_deferred_runtime_asserts`). Fixes #104012 Pull Request resolved: https://github.com/pytorch/pytorch/pull/125706 Approved by: https://github.com/BoyuanFeng	2024-05-08 23:07:16 +00:00
angelayi	8be4c1bc2f	[export] Add metadata for nodes insert_deferred_runtime_asserts (#125414 ) Fixes [internal error](https://fb.workplace.com/groups/1075192433118967/permalink/1416709435633930/). The issue is that the asserting nodes added in the `insert_deferred_runtime_assertion` pass do not contain metadata that the ExportedProgram requires the graph to have. One solution to fix this is to retrace the entire module, or another solution is to manually add back this metadata. This diff implements the latter solution (manually add back the metadata) through hooking into fx.graph's `create_node` function, and adding export-specific metadata for every node that is created. The reason I did this is so that the `insert_deferred_runtime_assertion` does not have to know about what metadata export wants. Pull Request resolved: https://github.com/pytorch/pytorch/pull/125414 Approved by: https://github.com/zhxchen17, https://github.com/BoyuanFeng	2024-05-07 23:15:21 +00:00
Edward Z. Yang	e5e623af4b	Codegen runtime asserts in Inductor (#124874 ) This completely subsumes https://github.com/pytorch/pytorch/pull/120816 This makes use of the unbacked binding machinery to teach Inductor how to generate deferred runtime asserts directly. There is some back story about why I did it this way, let me explain. Previously, our strategy for generating runtime asserts was that Dynamo would insert them into the FX graph after finishing tracing, and we would attempt to code generate them based on the FX graph. This is a good strategy for export, where we immediately export the graph. However, this strategy was afflicted by problems in eager, where we reuse the same ShapeEnv as before. In particular, on subsequent graph passes, we would immediately turn all of these assertions into noops, because when we evaluated their expressions, we would see that because we had a deferred runtime assert in the ShapeEnv, we know "oh, of course this expression is True" already. Oops! So, with this PR, we take the attitude that as long as the ShapeEnv sticks around, the ShapeEnv's list of deferred runtime asserts is the source of truth, and we don't put anything in the graph. So we just need to decide when to actually generate asserts, and the place I picked was Inductor lowering, since we already have an AssertScalar buffer concept, and so I just need to insert them at this point. AssertScalar also uses raw sympy.Expr rather than SymInt/Bool, so it is easier to prevent unrestricted simplification at this point. There are a few things jumbled together in this PR. I can split them if you want, but some of the changes are before I changed my strategy, but they're useful changes anyway. torch/_dynamo/output_graph.py and torch/_inductor/lowering.py - Here, we stop putting deferred runtime asserts in the graph. I also have to make sure we don't DCE unused symbol arguments; we're going to get some goofy graph arguments this way, will be good to restore that optimization eventually. We also just disable codegen for `_assert_scalar` entirely; we assume that ShapeEnv will be good enough to capture all of these. torch/_inductor/codegen/wrapper.py and torch/_inductor/ir.py - Add a way to codegen sizevars without forcing simplification torch/_inductor/graph.py - The main logic. Our strategy is to interpose in the same place we are testing that unbacked SymInts are properly showing up in lowered code. The logic is directly analogous to the logic in the existing insert deferred runtime asserts FX pass, but it's simpler because sympy expressions can be directly stored on inductor IR nodes. torch/fx/experimental/symbolic_shapes.py - For extra safety, we have a way of freezing runtime asserts, so that if you try to add more we error. This prevents us from adding runtime asserts after we've done lowering. There's a funny interaction with backwards which there's a comment for in graph.py torch/fx/passes/runtime_assert.py - This is not really needed in this PR, but I rewrote the runtime assert logic to use unbacked_bindings rather than inferring it by looking for unbacked SymInts. Now, keypaths are translated into FX node acessors. Unfortunately, I couldn't delete the old inference code, because you still need it to find backed SymInts from arguments (as this pass may be used on graphs which don't explicitly bind all their shape variables as argments). There are some new tests exercising this. TODO: I think we need to generate asserts for replacements too. This is a preexisting problem that the old FX pass had too. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/124874 Approved by: https://github.com/jansel ghstack dependencies: #124864	2024-04-29 10:19:29 +00:00
Aaron Gokaslan	40cf38fd15	[BE]: Apply ruff rule FURB192 (#124742 ) Apply RUFF rule FURB192 to remove unnecessary sorts and replace them with min / max. Pull Request resolved: https://github.com/pytorch/pytorch/pull/124742 Approved by: https://github.com/albanD, https://github.com/malfet	2024-04-24 18:44:08 +00:00
Boyuan Feng	aa2da0cdd2	[Export] Add runtime assert to non-strict export (#123681 ) This PR moves insert_deferred_runtime_asserts from dynamo to torch.fx.passes and uses it to add runtime assertion for non-strict export. Differential Revision: D55944267 Pull Request resolved: https://github.com/pytorch/pytorch/pull/123681 Approved by: https://github.com/tugsbayasgalan, https://github.com/angelayi	2024-04-18 16:13:27 +00:00

46 Commits