Adds option `torch.fx.experimental._config.backed_size_oblivious = True` to allocate `[0, inf]` ranges instead of `[2, inf]` for backed size symbols, opting them into size-oblivious semantics.
This helps in a number of cases, for example:
- Keeps `[0, inf]` bounds for unbacked symbols when we make an unbacked -> backed replacement
- More sound handling of 0/1 inputs at runtime when we lower from export
- Avoids end-of-bounds / sys.maxsize constraint violations when exporting with named Dims (https://github.com/pytorch/pytorch/issues/146315, https://github.com/pytorch/pytorch/issues/146046)
We may look into turning this on globally for export.
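A rough usage sketch, assuming a build where this experimental config exists (the toy module and the `Dim` name below are illustrative, not part of the PR):
```
import torch
import torch.fx.experimental._config as fx_config

class M(torch.nn.Module):
    def forward(self, x):
        return x + 1

# Opt backed size symbols into [0, inf] ranges / size-oblivious semantics.
fx_config.backed_size_oblivious = True

dim = torch.export.Dim("batch")
ep = torch.export.export(M(), (torch.randn(3, 4),), dynamic_shapes=({0: dim},))
```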
Pull Request resolved: https://github.com/pytorch/pytorch/pull/148696
Approved by: https://github.com/bobrenjc93
Recently I've been experimenting with introducing new APIs to delay compilation as a way to reduce compile times while improving the ergonomics of using dynamic shapes. The high-level idea is to run the first invocation of compile in eager, save the example inputs, and on the second invocation derive the dynamism in the inputs, so that we don't waste time doing a compile with static shapes (which is the status quo today with automatic dynamic).
Another benefit is that most users no longer need to annotate their inputs with mark_dynamic and mark_unbacked calls, since we can derive the dynamism on the very first call. Additionally, we get dynamic ints out of the box in this new regime.
This PR implements this idea through the set_stance APIs. In particular it introduces a new `eager_then_compile` stance.
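A minimal usage sketch (assuming a build where `torch.compiler.set_stance` accepts this new stance; the toy function is illustrative):
```
import torch

torch.compiler.set_stance("eager_then_compile")

@torch.compile
def f(x):
    return x * 2

f(torch.randn(4))  # first call runs in eager and records example inputs
f(torch.randn(8))  # second call compiles, with dynamism derived from what was seen
```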
Pull Request resolved: https://github.com/pytorch/pytorch/pull/147983
Approved by: https://github.com/williamwen42
Fix: #118214
This PR replaces the guards introduced by running `_tensors_definitely_do_not_overlap` at
compile time with a single `___check_overlapping` guard. When evaluated, this function calls
the original `_tensors_definitely_do_not_overlap` to check whether the current state
of the inputs is consistent, i.e. tensors that should overlap do overlap, and those that
shouldn't don't.
In summary, the changes are:
- Introduce a `StorageOverlap` class derived from `GuardEnvExpr`
- Plumb `AOTConfig` to the `compute_overlapping_inputs` function so that it has access to the AOTAutograd input sources
- Suppress the guards generated by the `_tensors_definitely_do_not_overlap` function at runtime
- Issue a `StorageOverlap` AOTAutograd guard, specifying the sources that should and shouldn't overlap
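For intuition, here is a hedged, conceptual sketch of what the single runtime guard has to verify; the helper names below are hypothetical and simplified (they assume non-negative strides), not the internal implementation:
```
import torch

def definitely_do_not_overlap(a: torch.Tensor, b: torch.Tensor) -> bool:
    # Different storages cannot overlap; on the same storage, compare the
    # element ranges the two views actually touch.
    if a.untyped_storage().data_ptr() != b.untyped_storage().data_ptr():
        return True
    def extent(t):
        start = t.storage_offset()
        end = start + 1 + sum((s - 1) * st for s, st in zip(t.shape, t.stride()))
        return start, end
    a0, a1 = extent(a)
    b0, b1 = extent(b)
    return a1 <= b0 or b1 <= a0

def check_overlapping(overlapping, non_overlapping) -> bool:
    # The partition recorded at compile time must still hold at runtime.
    return (all(not definitely_do_not_overlap(a, b) for a, b in overlapping)
            and all(definitely_do_not_overlap(a, b) for a, b in non_overlapping))
```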
Pull Request resolved: https://github.com/pytorch/pytorch/pull/139555
Approved by: https://github.com/bdhirsh
ghstack dependencies: #139554
Fixes https://github.com/pytorch/pytorch/issues/138715
It looks like we were previously ignoring guards on FSDP module parameters. In the issue linked above, this was causing inductor size/stride asserts to fire. The root cause is that for some code like this:
```
m = FSDP(
    torch.nn.Sequential(
        torch.compile(torch.nn.Linear(1024, 1024)),
        torch.compile(torch.nn.Linear(1024, 4096)),
    )
)
```
We need to generate two different graphs for the two linear layers, and it looks like without a `TENSOR_MATCH` guard on the linear parameters, dynamo would think that it could re-use the same graph across both layers.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/138819
Approved by: https://github.com/anijain2305
To see the payoff, look at test/dynamo/test_logging.py
The general idea is to refactor produce_guards into produce_guards_verbose, which also returns verbose code parts that carry our annotations.
The rest of the logic is plumbing SLocs around to the places they need to be so we can print them. Guards are easy; value ranges and duck sizing take more care.
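A hedged sketch of the refactor pattern (heavily simplified; the real methods live on ShapeEnv and the guard/SLoc types are richer than plain strings):
```
class ShapeEnvSketch:
    def produce_guards_verbose(self):
        code_parts = ["s0 <= 64", "s0 == s1"]
        # Verbose parts carry provenance annotations (SLocs), so logs can point
        # each guard back to the user code that created the symbol.
        verbose_code_parts = [f"{g}  # e.g. from example_input.size(0)" for g in code_parts]
        return code_parts, verbose_code_parts

    def produce_guards(self):
        # The existing entry point stays, implemented on top of the verbose variant.
        return self.produce_guards_verbose()[0]
```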
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136917
Approved by: https://github.com/anijain2305
Original issue:
https://github.com/pytorch/ao/issues/890
The problem:
TracingContext.flat_params contains the original params, with subclasses not yet desugared, while the inductor freezing API works on AOT graphs, where subclasses have already been desugared.
flat_params are used only for this logic, so storing the desugared subclasses in them fixes the issue.
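For context, a hedged sketch of what "desugaring" a subclass parameter means here; `desugar_param` is a hypothetical helper, and only the `__tensor_flatten__` protocol used by traceable tensor subclasses is assumed:
```
import torch

def desugar_param(p):
    # Traceable tensor subclasses expose __tensor_flatten__, which returns the
    # attribute names of their inner dense tensors; plain tensors pass through.
    if hasattr(type(p), "__tensor_flatten__"):
        inner_attrs, _ctx = p.__tensor_flatten__()
        return [getattr(p, attr) for attr in inner_attrs]
    return [p]
```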
Testing:
```
python test/functorch/test_aotdispatch.py -k test_inductor_freezing_with_subclasses
```
Torch AO original failure:
```
python test/integration/test_integration.py -k test_int8_weight_only_quant_with_freeze
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136265
Approved by: https://github.com/bdhirsh
Summary:
We want to add compile IDs and frames to each Torch-Compiled Region in order to help users cross-reference the section they are checking alongside data obtained from tools such as tlparse.
This diff operates on the assumption that each graph section will enter and exit a CompileContext before it is run, either to compile the graph or to look it up in the cache. Based on this assumption, we can save the value of the graph section from the exited CompileContext in eval_frame.c using the Python C API. After this, we can create a new interface in the cpp shim to wrap record_function so we can pass in the new keyword argument for "context".
Test Plan:
Enhanced test_profiler_dynamo_compiled_region to look for kwinputs as well as a name, to verify that the context is now labeled. Also changed the test to run a graph with more contexts so that we exercise a wider range of profiling.
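A hedged usage sketch of how this surfaces (the toy function is illustrative and exact labels may differ by version):
```
import torch

@torch.compile
def fn(x):
    return torch.relu(x) + 1

x = torch.randn(16)
fn(x)  # compile outside the profiled region
with torch.profiler.profile() as prof:
    fn(x)
# The "Torch-Compiled Region" event should now carry the compile id/frame as a
# keyword argument ("context") visible in the trace / key_averages output.
print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=10))
```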
Differential Revision: D60803317
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132765
Approved by: https://github.com/anijain2305
When caching is enabled, an internal model fails with
```
assert_size_stride(bmm_9, (17, s0, 512), (54784, 512, 1))
AssertionError: expected size 17==17, stride 57344==54784 at dim=0
```
Looking at this model, the exact problem is that when the cache is hit on the forward graph, the generated code for the backward fails, since the strides of the forward's outputs, passed to the backward as inputs, are not what we expected.
This PR changes the evaluation logic so that we defer evaluation of output stride expressions to the load path, as opposed to eagerly doing it on the save path.
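Conceptually (a hedged sketch using plain sympy, not the cache's actual data structures), the change is to store the symbolic stride expressions with the cache entry and substitute the current hints only on the load path:
```
import sympy

s0 = sympy.Symbol("s0", positive=True, integer=True)
saved_output_strides = [(s0 * 512, 512, 1)]  # save path: keep strides symbolic

def strides_at_load(hints):
    # load path: evaluate against the hints of the *current* compilation
    return [tuple(int(sympy.sympify(e).subs(hints)) for e in st)
            for st in saved_output_strides]

print(strides_at_load({s0: 107}))  # [(54784, 512, 1)]
```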
I have not been able to come up with a unit test repro for this problem.
Differential Revision: [D58796503](https://our.internmc.facebook.com/intern/diff/D58796503)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/128997
Approved by: https://github.com/ezyang
I found it helpful to be able to see, given some inductor output code, which AOT graph it came from. When you have large models with multiple graphs floating around this can be difficult, so I added the aot_config.aot_id to the printed inductor output.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118647
Approved by: https://github.com/ezyang
`python benchmarks/dynamo/microbenchmarks/dynamo_microbenchmarks.py`
- Before: `symbolic_convert_overhead_stress_test: 10.7s`
- After: `symbolic_convert_overhead_stress_test: 8.6s`
`tx.step()` is a small part of that benchmark, so the speedup in that isolated function is likely larger than the top-line number.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121790
Approved by: https://github.com/oulgen
Context: view fake-ification should handle closed-over state in ViewFuncs for use in view replay by:
* fake-ifying tensors
* symbolicizing SymInts
This avoids invalid specialization during view replay. However, the symbols / tensors created as intermediates in the view chain should not stick around or be guarded on. This PR introduces an `EphemeralSource` intended to be used as a source for this purpose. It has the following properties:
* Considered first to be simplified out in symbol simplification logic
* Errors if guarded on
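An illustrative, hedged sketch of those two properties (a hypothetical class, not the real `EphemeralSource`):
```
from dataclasses import dataclass

@dataclass(frozen=True)
class EphemeralSourceSketch:
    desc: str = "intermediate in view replay"

    def is_ephemeral(self) -> bool:
        # Symbol simplification prefers to eliminate symbols from ephemeral
        # sources first, so they never leak out of the view chain.
        return True

    def name(self) -> str:
        return f"ephemeral({self.desc})"

def guard_on(source) -> None:
    # Guarding on an ephemeral source is an error, matching the second property.
    if getattr(source, "is_ephemeral", lambda: False)():
        raise AssertionError(f"cannot guard on {source.name()}")
```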
Differential Revision: [D54561597](https://our.internmc.facebook.com/intern/diff/D54561597)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120948
Approved by: https://github.com/ezyang
Note: Returning Fake Tensors on First AOT Autograd Call

Inductor will optimize strides of outputs when it deems it profitable, for instance converting to channels last. When we split the graph here into multiple inductor compilations, we need to make sure that the output strides of one compilation are appropriately passed to the subsequent compilations. However, the mapping from inductor output to dynamo output is non-trivial due to aot_autograd's deduping, de-aliasing, mutation, re-writing, subclass handling, etc. In order to replay all of this logic, we set a flag such that the first invocation of inductor in aot_autograd will return fake tensors with appropriate strides. Then, all of aot autograd's runtime logic is replayed. This gives us the appropriately strided outputs here, which will reflect runtime strides.
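A hedged illustration of why the strides matter across split compilations (plain eager code; no inductor internals assumed):
```
import torch

x = torch.randn(2, 3, 8, 8)
out1 = x.to(memory_format=torch.channels_last)  # suppose the first compilation chose channels-last
print(out1.stride())  # (192, 1, 24, 3) -- not the contiguous (192, 64, 8, 1)
# If the next compilation assumed contiguous inputs here, its size/stride
# asserts (or performance assumptions) would be wrong; returning fake tensors
# with the real output strides from the first AOT call avoids this.
```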
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120523
Approved by: https://github.com/yf225, https://github.com/bdhirsh
This adds support for backwards hooks that are *both*:
1) Interior to the graph; and
2) Dynamically generated (e.g. lambdas)
We do this by creating a BackwardState object that is used to register the hooks in the forward, then populated by dynamo *after* the forward runs.
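A hedged example of the pattern this enables (whether this exact snippet compiles end-to-end depends on backend and version):
```
import torch

def fn(x, scale):
    y = x * 2
    # Dynamically generated hook (a lambda closing over `scale`), registered on
    # a tensor that is interior to the compiled graph.
    y.register_hook(lambda grad: grad * scale)
    return y.sum()

compiled = torch.compile(fn, backend="aot_eager")
x = torch.randn(4, requires_grad=True)
compiled(x, 3.0).backward()
print(x.grad)  # d(sum(2x))/dx scaled by the hook: 2 * 3 = 6 per element
```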
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120382
Approved by: https://github.com/xmfan
Reduces backend=eager compile time from 33 to 19 seconds for `MobileBertForQuestionAnswering`. This also helps an internal model where the guards.add function was taking 124 seconds.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120520
Approved by: https://github.com/mlazos
There is no need to make a copy of `frame_summary_stack` when it is not modified. The proposed change uses a copy-on-write, functional approach that is easy to understand and more efficient when `self.loc_in_frame` is `None`.
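A hedged sketch of the copy-on-write idea (hypothetical free function; the real code is a method on the tracing state):
```
def effective_frame_summary_stack(frame_summary_stack, loc_in_frame):
    if loc_in_frame is None:
        return frame_summary_stack               # unchanged: no copy needed
    return frame_summary_stack + [loc_in_frame]  # copy only when appending
```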
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119115
Approved by: https://github.com/Skylion007