Commit Graph

24 Commits

Author SHA1 Message Date
Aaron Orenstein
be0df96b50 Fix c++ implementation of strip_function_call (#147436)
#143063 was missing handling a couple UCS cases as well as had some bugs in the way it dealt with errors.

- Fix all the UCS handling (and make some of the common code more common)
- Make sure all the error paths return `nullptr`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/147436
Approved by: https://github.com/jansel
2025-02-20 20:41:21 +00:00
William Wen
63e8ad49b8 [dynamo] replace hardcoded eval frame control flags skip_code_recursive_flag/cache_limit_hit_flag (#146355)
This PR and the previous:
- Moves parts of `eval_frame.c` to C++.
- Reduces code duplication in `dynamo__custom_eval_frame` and makes the control flow more clear.
- Enables `convert_frame` to signal to `eval_frame.cpp` in a general manner how to evaluate this frame, recursive frames, and future frames with the same code object (default/compile, skip, run-only). e.g. this will allow us to change skipping/cache limit hit eval_frame behavior directly from convert_frame without requiring changes to C/C++.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/146355
Approved by: https://github.com/jansel
ghstack dependencies: #145603
2025-02-18 21:37:12 +00:00
Aaron Orenstein
06b4b96b34 dynamo tracing perf: no re in arg_ref: 33.9 -> 33.7 (#143069)
See #143056 for overall docs.

This PR: Avoid use of python re and move valid varname check in
`GuardBuilder.arg_ref()` into C++

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143069
Approved by: https://github.com/jansel
2024-12-23 05:32:09 +00:00
Aaron Orenstein
9bf4b1c2e9 dynamo tracing perf: c++ strip_function_call: 49.12 -> 47.77 (#143063)
See #143056 for overall docs.

This PR: Convert `strip_function_call()` into C++

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143063
Approved by: https://github.com/jansel
ghstack dependencies: #143057, #143062
2024-12-22 06:38:46 +00:00
William Wen
18261e9f39 [dynamo] implement framelocals mapping as c++ object (#140063)
Implements https://github.com/pytorch/pytorch/issues/93753 - move frame local guard accessors to C++.

Before, we used dict accessors on a Python dict representing the frame's fastlocals that we manually build. We move this accessor to C++ and additionally use the fastlocal index whenever possible.

Some implementation notes:
- `FrameLocalsMapping` is now initialized as a C++ vector of `PyObject`s. We do not just use the frame's localsplus/fastlocals buffer because we also unbox cells.
- `FrameLocalsMapping` can still be converted into a Python dict representing the frame's fastlocals, but it is done lazily.
- We update `LeafGuard`, `GuardAccessor`, and `GuardManager`'s `check_nopybind` methods to accept `FrameLocalsMapping`. By default, we convert the `FrameLocalsMapping` to a Python dict and run the original `check_nopybind` on it, but in some cases, conversion is not needed.
- We add a new guard accessor `FrameLocalsGuardAccessor`, which is similar to `DictGetItemGuardAccessor` but has special handling for `FrameLocalsMapping`. We create a separate class to emphasize different use cases, but we could probably combine these two (can do in a follow up)

dynamo_guard_eval.py microbenchmark update:
- 713.2us -> 630.0us (3.10)
- 598.8us -> 530.7us (3.12)

Other followups:
- Add `FrameLocalsMapping` version for `check_verbose_nopybind` in order to match behavior between `check_nopybind` and `check_verbose_nopybind`. This can prevent difficult debugging situations where guards fail (`check_nopybind` returns false) but no guard error message is generated (`check_verbose_nopybind` succeeds).
- Rewrite the `SHAPE_ENV` guard into C++ - it is a fairly common guard that results in `FrameLocalsMapping` needing to convert to a dict

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140063
Approved by: https://github.com/jansel
ghstack dependencies: #142117, #142430
2024-12-17 18:54:27 +00:00
Animesh Jain
fb529c2c84 [dynamo] skip_guard_eval_unsafe stance for power users (#140251)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/140251
Approved by: https://github.com/jansel
ghstack dependencies: #140223, #140250
2024-11-21 06:28:58 +00:00
cyy
40fb738197 Use Wextra-semi (#140236)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140236
Approved by: https://github.com/ezyang
2024-11-13 02:15:16 +00:00
Animesh Jain
dba6887dc6 [dynamo][refactor][config-cleanp] Use guard_manager consistently instead of check_fn (#138896)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/138896
Approved by: https://github.com/williamwen42, https://github.com/jansel
ghstack dependencies: #138512
2024-10-26 15:14:46 +00:00
albanD
88e54de219 More nogil unsafe API fix (#137142)
Cover the PyDict APIs and confirms no update needed for PyModule one.
The rest was already covered in https://github.com/pytorch/pytorch/pull/136899

Pull Request resolved: https://github.com/pytorch/pytorch/pull/137142
Approved by: https://github.com/eqy, https://github.com/Skylion007
2024-10-04 21:56:34 +00:00
PyTorch MergeBot
9670e9e5b0 Revert "Mark PyTorch module as no-gil valid and pythoncapi_compat.h (#136899)"
This reverts commit 4f93de8951.

Reverted https://github.com/pytorch/pytorch/pull/136899 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/136899#issuecomment-2392721534))
2024-10-04 03:28:31 +00:00
albanD
4f93de8951 Mark PyTorch module as no-gil valid and pythoncapi_compat.h (#136899)
PyList_GetItem are audited but not other APIs yet (they will be done in a follow up PR to keep this one small enough).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/136899
Approved by: https://github.com/colesbury, https://github.com/atalman
2024-10-01 22:05:35 +00:00
Shivam Raikundalia
9e4f24f8e5 Fix PT2 Source Code Annotations (#136460)
Summary: In D60803317, we added CompileContext (trace_id) information to Kineto traces using caching when a CompileContext exits. As pointed out by some users, this gives innaccurate IDs because we are not getting the context that we is being looked up within the eval_frame. For this reason, we decided to revert that change, and go with an approach that involves getting the trace_id associated with a given CacheEntry. To do this, we add a trace_id to the GuardedCode so that it can be passed onto a CacheEntry. Then, we change the lookup function to return said trace_id alongside the code so that we can pass both into our eval function. Once we get to a Torch-Compiled Region, we can just append the context information to the name of the annotation thus bypassing any need for kwargs.

Test Plan: Added more comprehensive unit test. Saw that all the trace_ids appeared within the graph.

Differential Revision: D63138786

Pull Request resolved: https://github.com/pytorch/pytorch/pull/136460
Approved by: https://github.com/ezyang
2024-09-28 03:54:43 +00:00
Animesh Jain
594162f7ab [dynamo] Support reading attributes from pybind objects (#134630)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/134630
Approved by: https://github.com/jansel
2024-08-29 15:06:52 +00:00
Edward Z. Yang
e836ee1955 Enhancements to recompiles logs (#130043)
----

- We now record on CacheEntry what the compile id that populated it was, so now we can say why a specific frame was rejected
- Add structured log for recompiles under name artifact "recompile_reasons". As it stands, it's not terribly structured, but this was the easiest thing I could do to start
- Slightly reformat multi-reason printing; since we only report one guard failure seems better to have it as a single line

Example output:

```
V0703 10:34:13.273000 140345997743104 torch/_dynamo/guards.py:2590] [0/1] [__recompiles] Recompiling function f in /data/users/ezyang/a/pytorch/b.py:3
V0703 10:34:13.273000 140345997743104 torch/_dynamo/guards.py:2590] [0/1] [__recompiles]     triggered by the following guard failure(s):
V0703 10:34:13.273000 140345997743104 torch/_dynamo/guards.py:2590] [0/1] [__recompiles]     - 0/0: tensor 'L['x']' size mismatch at index 0. expected 4, actual 5
```

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130043
Approved by: https://github.com/anijain2305
2024-07-09 03:40:56 +00:00
William Wen
35382f0573 [dynamo, 3.12] Use CPython internal _PyOpcode_Caches instead of hardcoding (#122335)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/122335
Approved by: https://github.com/jansel
ghstack dependencies: #122146
2024-03-27 20:39:39 +00:00
cyy
c2eedb7f8a [Dynamo][1/N] Fix clang-tidy warnings in torch/csrc/dynamo/* (#122259)
This PR begins a series of works to ensure dynamo C++ code is clang-tidy clean.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/122259
Approved by: https://github.com/ezyang
2024-03-21 00:43:25 +00:00
William Wen
ee1c2449f7 [dynamo] delete dynamo cache entry when guard function is invalidated [attempt 2] (#119107)
Attempt #2 for https://github.com/pytorch/pytorch/pull/117875 to fix https://github.com/pytorch/pytorch/issues/112090.

Summary of changes:
- ~Changed CacheEntry linked list into a doubly-linked list structure to support deletion.~ (done by C++ refactor)
- Added CacheEntry and ExtraState borrowed references to GuardFn so that GuardFn can tell ExtraState to delete CacheEntry when the GuardFn is invalidated.
- ~Added ExtraState raw reference to CacheEntry so that we can get ExtraState to correctly point to the first CacheEntry if it gets deleted.~ (done by C++ refactor)
- CacheEntry destructor needs to reset GuardFn refs to ExtraState/CacheEntry in order to prevent use-after-free.
- code_context values that are nn.GraphModules need to be weakrefs in order to prevent circular references.
- Added tests that check for memory leaks and cache deletion operations.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/119107
Approved by: https://github.com/jansel
2024-02-07 03:32:42 +00:00
William Wen
ae4e866bba [dynamo] refactor CacheEntry and ExtraState to eval_frame.c to C++ (#118438)
Part of implementing CacheEntry invalidation to fix https://github.com/pytorch/pytorch/issues/112090.

Changes:
- Move CacheEntry and ExtraState to C++
- Use pybind to control reference counting
- Use std::list instead of manually implementing a linked list

Pull Request resolved: https://github.com/pytorch/pytorch/pull/118438
Approved by: https://github.com/jansel
2024-02-06 20:48:11 +00:00
cyy
054f3f1d8f [3/N] fix clang-tidy warnings in torch/csrc (#108024)
Apply fixes to some found issues by clang-tidy in torch/csrc.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108024
Approved by: https://github.com/Skylion007, https://github.com/albanD, https://github.com/malfet
2023-08-28 18:00:00 +00:00
Jason Ansel
5a114f72bf [Compiled Autograd] Move to torch::dynamo::autograd namespace (#105854)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105854
Approved by: https://github.com/albanD
2023-07-27 00:36:47 +00:00
PyTorch MergeBot
e60af5c8e4 Revert "[Compiled Autograd] Move to torch::dynamo::autograd namespace (#105854)"
This reverts commit 26e3b4020f.

Reverted https://github.com/pytorch/pytorch/pull/105854 on behalf of https://github.com/PaliC due to breaking internal embedded device tests (details shared with author) ([comment](https://github.com/pytorch/pytorch/pull/105854#issuecomment-1650559375))
2023-07-25 21:09:18 +00:00
Jason Ansel
26e3b4020f [Compiled Autograd] Move to torch::dynamo::autograd namespace (#105854)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105854
Approved by: https://github.com/albanD
2023-07-25 01:14:04 +00:00
Jason Ansel
c902b84e0b Compiled autograd (#103822)
This branch:
1) converts the autograd tape into an FX graph
2) caches that conversion using a "shadow" graph
3) compiles and runs the generated FX graph instead of the normal autograd

What works currently:
1) Caching, capture, and initial integration
2) Backwards hooks
3) Inlining AotAutograd generated subgraphs
4) torch.compiling the generated FX graph
5) Auto-detecting dynamic shapes based on changes

Future work
1) Larger scale testing
1) Boxed calling convention, so memory can be freed incrementally
1) Support hooks on SavedTensor
1) Additional testing by running eager autograd tests under compiled_autograd.enable()

Pull Request resolved: https://github.com/pytorch/pytorch/pull/103822
Approved by: https://github.com/ezyang, https://github.com/albanD
2023-07-24 21:12:05 +00:00
Jason Ansel
f1fdb6efbd Manual changes for moving dynamo to core (#86621)
This is the subset of the changes in #86461 not auto-generated by `copy_to_core.sh`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86621
Approved by: https://github.com/albanD
2022-10-11 23:01:21 +00:00