Commit Graph

184 Commits

Author SHA1 Message Date
Animesh Jain
f213f262af [dynamo][cpp-guards] Improve when to use Dict vs DictSubclassGuardManager (#124237)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124237
Approved by: https://github.com/jansel, https://github.com/mlazos
ghstack dependencies: #124230
2024-04-18 03:33:37 +00:00
William Wen
812bae09be [dynamo] fix 3.11+ refleak (#124238)
Fixes https://github.com/pytorch/pytorch/issues/119607 for 3.11+.

In 3.11+, `_PyFrame_FastToLocalsWithError` could implicitly run `COPY_FREE_VARS` on the original frame, leading to double increfs, since the dynamo shadow frame can rerun `COPY_FREE_VARS`. The solution is to skip the first `COPY_FREE_VARS` instruction in the shadow frame if it was already executed in the original frame.

Also move the location for clearing the original frame in 3.12 to handle error cases more thoroughly.
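For illustration, the affected case involves frames with free variables, e.g. a compiled closure like this minimal hypothetical repro (not taken from the PR):

~~~
import torch

def make_fn(y):
    # `y` is a free variable of `inner`, so executing `inner`'s frame runs
    # COPY_FREE_VARS to copy the closure cells into the fast locals. Before
    # this fix, the dynamo shadow frame could rerun COPY_FREE_VARS after
    # _PyFrame_FastToLocalsWithError had already done so on the original
    # frame, leaking references on 3.11+.
    @torch.compile(backend="eager")
    def inner(x):
        return x + y
    return inner

fn = make_fn(torch.ones(3))
print(fn(torch.randn(3)))
~~~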

Pull Request resolved: https://github.com/pytorch/pytorch/pull/124238
Approved by: https://github.com/jansel
2024-04-18 03:02:29 +00:00
Animesh Jain
51cc808ac7 [dynamo][cpp-guards] Missing decref on early returns in DictSubclassGuardManager (#124230)
I am sad that I missed this earlier. The good thing is that CI caught it. I will be more careful next time.

This was the reason https://github.com/pytorch/pytorch/pull/123547 was reverted: https://github.com/pytorch/pytorch/pull/123547#issuecomment-2058350245

Pull Request resolved: https://github.com/pytorch/pytorch/pull/124230
Approved by: https://github.com/mlazos
2024-04-17 02:49:07 +00:00
Animesh Jain
2e6871f924 [dynamo][guards-cpp] Early return in DictGuardManager for empty dicts (#123787)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/123787
Approved by: https://github.com/jansel
ghstack dependencies: #123773
2024-04-11 22:23:28 +00:00
Animesh Jain
b0b7aa201c [dynamo][cpp-guards] Introduce DictSubclassGuardManager (#123773)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/123773
Approved by: https://github.com/jansel
2024-04-11 22:23:28 +00:00
Jason Ansel
e3ea316623 [dynamo] Save/restore cublas_allow_tf32 in convert_frame (#123509)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/123509
Approved by: https://github.com/anijain2305
2024-04-07 03:37:47 +00:00
Animesh Jain
d3596cf004 [dynamo][cpp-guards] Fix missing decref in GradGuardAccessor (#123488)
Found that there was a peak memory increase while running the HF suite.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/123488
Approved by: https://github.com/jansel
ghstack dependencies: #123485
2024-04-06 03:09:29 +00:00
Animesh Jain
22b9987144 [dynamo][cpp-guards] ListGetItemGuardAccessor and TupleGetItemGuardAccessor (#123396)
Speeds up the guard-overhead microbenchmark below by around 10%, normalized to main-branch CPP guards:

~~~
import torch

@torch.compile(backend="eager")
def fn(x, lst):
    for l in lst:
        x = x + l
    return x

n = 1000
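# A list of 1000 ints makes guard evaluation the dominant cost here; the
# ListGetItemGuardAccessor added in this PR is meant to speed up the
# per-element guards installed on `lst`.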

lst = [i for i in range(n)]

x = torch.randn(4)
print(fn(x, lst))
print("Sucess")
~~~

Pull Request resolved: https://github.com/pytorch/pytorch/pull/123396
Approved by: https://github.com/jansel
ghstack dependencies: #123285, #123302, #123303
2024-04-05 22:10:04 +00:00
William Wen
5c7e2fd270 [dynamo, 3.12] use pymalloc allocator instead of malloc/free for frames (#123299)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/123299
Approved by: https://github.com/jansel
ghstack dependencies: #123216
2024-04-04 20:00:54 +00:00
Animesh Jain
fb7664d5bf [dynamo][optimizer][guard-overhead] NOT_NONE guard for param.grad instead of TENSOR_MATCH (#123285)
For optimizers, we do a DATA_PTR match for parameters. For param.grad, we were doing TENSOR_MATCH, but what we really need to guard on is whether param.grad is None or not. Therefore, this PR adds a new guard called NOT_NONE.
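Roughly, the new guard checks only presence rather than full tensor properties; a Python sketch of the difference (hypothetical helper names, not the actual guard implementation):

~~~
def tensor_match_guard(expected, value):
    # TENSOR_MATCH checks type, dtype, device, sizes, strides, and so on.
    return (
        type(value) is type(expected)
        and value.dtype == expected.dtype
        and value.device == expected.device
        and value.size() == expected.size()
        and value.stride() == expected.stride()
    )

def not_none_guard(value):
    # NOT_NONE: for param.grad we only need to know whether a gradient exists.
    return value is not None
~~~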

This further improves the guard overhead:

![image](https://github.com/pytorch/pytorch/assets/13822661/574598ac-ca71-4e5e-9e75-8774577cd58f)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/123285
Approved by: https://github.com/mlazos, https://github.com/jansel
2024-04-04 03:52:47 +00:00
Animesh Jain
3eb84b6343 [dynamo][cpp-guards] Init LocalState only when TENSOR_MATCH guard present (#123152)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/123152
Approved by: https://github.com/jansel
2024-04-03 08:04:39 +00:00
Animesh Jain
d91db70295 [dynamo][cpp-guards] Optimize tensor.grad accessor (#123226)
For the LayoutLM model, this reduces C++ guard overhead by 1.48x. These are the numbers:

![image](https://github.com/pytorch/pytorch/assets/13822661/25cfc35b-b67d-4903-8403-71fa931dacdd)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/123226
Approved by: https://github.com/jansel
2024-04-03 05:32:13 +00:00
William Wen
d17eea9c0f [dynamo] fix broken 3.11+ windows build failure (#123104)
e.g. https://github.com/pytorch/pytorch/actions/runs/8478510063/job/23230951466#step:12:23296

Caused by https://github.com/pytorch/pytorch/pull/122335

Pull Request resolved: https://github.com/pytorch/pytorch/pull/123104
Approved by: https://github.com/atalman
2024-04-02 17:52:14 +00:00
Animesh Jain
ffd1e4e9ba [dynamo][cpp-guards] Always Reset relational guards (#123046)
Reset relational guards at the end of RootGuardManager, even if the result is true. Earlier we reset them only when the result was False, but that required extra bookkeeping in each guard. This PR gives a small improvement.
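A rough Python sketch of the control-flow change (hypothetical names; the real implementation is in C++):

~~~
class RootGuardManager:
    def check(self, value):
        try:
            # Evaluate child guards as before.
            return all(g.check(value) for g in self.child_guards)
        finally:
            # New behavior: always reset relational guards here, even when the
            # check succeeds, so individual guards no longer need bookkeeping
            # to decide whether a reset is still pending.
            for g in self.relational_guards:
                g.reset_state()
~~~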

Pull Request resolved: https://github.com/pytorch/pytorch/pull/123046
Approved by: https://github.com/jansel
2024-04-01 21:09:35 +00:00
William Wen
7b13228038 [dynamo, 3.12] fix DICT_VERSION C++ guards (#122449)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/122449
Approved by: https://github.com/jansel
ghstack dependencies: #122146, #122335, #122354, #122355, #122356
2024-03-27 20:39:39 +00:00
William Wen
35382f0573 [dynamo, 3.12] Use CPython internal _PyOpcode_Caches instead of hardcoding (#122335)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/122335
Approved by: https://github.com/jansel
ghstack dependencies: #122146
2024-03-27 20:39:39 +00:00
William Wen
2564f6cf0e [dynamo, 3.12] Allocate Dynamo shadow frames by mimicking CPython (#122146)
Python 3.12 changed a few things with how `_PyInterpreterFrame`s are allocated and freed:
- Frames are now required to be placed on the Python frame stack. In 3.11, we could allocate frames anywhere in memory. In 3.12, we now need to use `THP_PyThreadState_BumpFramePointerSlow`/`push_chunk`/`allocate_chunk`. This method of allocating/freeing frames is also compatible with 3.11.
- The eval frame function is now responsible for clearing the frame (see https://docs.python.org/3/whatsnew/changelog.html#id128, the point about "...which now clear the frame.")

Pull Request resolved: https://github.com/pytorch/pytorch/pull/122146
Approved by: https://github.com/jansel
2024-03-27 20:39:39 +00:00
cyy
808a035658 [Dynamo][4/N] Enable clang-tidy coverage on torch/csrc/dynamo/* (#122534)
This PR enables clang-tidy coverage on torch/csrc/dynamo/* and also contains other small improvements.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/122534
Approved by: https://github.com/Skylion007
2024-03-24 05:26:32 +00:00
cyy
482f6c4693 [Dynamo][3/N] Fix clang-tidy warnings in torch/csrc/dynamo/* (#122392)
This PR continues to clean clang-tidy warnings in torch/csrc/dynamo/*, following #122362

Pull Request resolved: https://github.com/pytorch/pytorch/pull/122392
Approved by: https://github.com/ezyang
2024-03-22 22:57:41 +00:00
cyy
7f8bb1de83 [Dynamo][2/N] Fix clang-tidy warnings in torch/csrc/dynamo/* (#122362)
This PR continues to clean clang-tidy warnings in torch/csrc/dynamo/*, following #122259

Pull Request resolved: https://github.com/pytorch/pytorch/pull/122362
Approved by: https://github.com/ezyang
2024-03-21 09:41:41 +00:00
cyy
c2eedb7f8a [Dynamo][1/N] Fix clang-tidy warnings in torch/csrc/dynamo/* (#122259)
This PR begins a series of changes to ensure dynamo C++ code is clang-tidy clean.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/122259
Approved by: https://github.com/ezyang
2024-03-21 00:43:25 +00:00
cyy
1dd1899fd6 Add missing throw of std::runtime_error in dynamo/guards.cpp (#122306)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/122306
Approved by: https://github.com/Skylion007, https://github.com/ezyang
2024-03-20 20:50:01 +00:00
Animesh Jain
c568b84794 [dynamo][guards] Move backend match to eval_frame (#121954)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121954
Approved by: https://github.com/jansel
2024-03-17 06:52:10 +00:00
Simon Fan
e25054b248 [compiled autograd] free stack objects before calling compiled graph (#121707)
Moved compilation code into _compiled_autograd_impl, which frees stack-allocated objects (e.g. AutogradCompilerCall) before the compiled graph is called.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/121707
Approved by: https://github.com/jansel
2024-03-15 07:12:38 +00:00
PyTorch MergeBot
45a835cef2 Revert "[compiled autograd] free stack objects before calling compiled graph (#121707)"
This reverts commit 5b90074540.

Reverted https://github.com/pytorch/pytorch/pull/121707 on behalf of https://github.com/huydhn due to Sorry for reverting your change but I think it breaks inductor CPU tests 5b90074540 ([comment](https://github.com/pytorch/pytorch/pull/121698#issuecomment-1995868090))
2024-03-13 21:23:42 +00:00
Simon Fan
8b1b61bc70 [compiled autograd] support custom ops backed by c++ autograd::Function (#120681)
- Adds support for custom ops backed by c++ custom autograd functions, e.g. fbgemm
- Include files more granularly to avoid namespace pollution and circular imports

limitations:
- requires the user to audit their code and opt in their custom autograd::Function via autograd::Function::is_traceable, and possibly implement additional compiled_args + apply_with_saved methods. This was the only way I could think of to keep this sound.
- will throw if we can't hash the saved_data, i.e. for any type other than list and dict that is not implemented in at::IValue::hash b0cfa96e82/aten/src/ATen/core/ivalue.cpp (L364)
- can technically fail silently if both the typeid hash and the typeid string name of the custom autograd::Function collide at the same time, and an otherwise identical autograd graph is called that contains a different custom autograd::Function with an identical implementation. This case seems extremely unlikely, and the only alternative to relying on the hash that I can think of is compiling with reflection
- tensors not saved via save_variables are not lifted; they are specialized on the TensorImpl*'s hash (treated as a memory address). If needed, we can lift them.

Differential Revision: [D54818488](https://our.internmc.facebook.com/intern/diff/D54818488)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120681
Approved by: https://github.com/jansel
2024-03-13 21:13:21 +00:00
Simon Fan
5b90074540 [compiled autograd] free stack objects before calling compiled graph (#121707)
Moved compilation code into _compiled_autograd_impl, which frees stack-allocated objects (e.g. AutogradCompilerCall) before the compiled graph is called.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/121707
Approved by: https://github.com/jansel
ghstack dependencies: #121698
2024-03-13 19:31:44 +00:00
Animesh Jain
22489bfe70 [dynamo][guards-cpp-refactor] Directly call root guard manager in eval_frame (#121622)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121622
Approved by: https://github.com/jansel
ghstack dependencies: #121614
2024-03-12 17:09:11 +00:00
Animesh Jain
2348e8e4e7 [dynamo][guards-cpp-refactor] Simplify DYNAMIC_INDICES guard (#121614)
Use the NO_HASATTR guard for the common part.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/121614
Approved by: https://github.com/jansel
2024-03-12 17:08:56 +00:00
PyTorch MergeBot
b2f09c1859 Revert "[compiled autograd] support custom ops backed by c++ autograd::Function (#120681)"
This reverts commit d27509c384.

Reverted https://github.com/pytorch/pytorch/pull/120681 on behalf of https://github.com/xmfan due to breaking internal builds, see D54707287 ([comment](https://github.com/pytorch/pytorch/pull/120681#issuecomment-1989542344))
2024-03-11 22:18:36 +00:00
Simon Fan
d27509c384 [compiled autograd] support custom ops backed by c++ autograd::Function (#120681)
- Adds support for custom ops backed by c++ custom autograd functions, e.g. fbgemm
- Include files more granularly to avoid namespace pollution and circular imports

limitations:
- requires the user to audit their code and opt in their custom autograd::Function via autograd::Function::is_traceable, and possibly implement additional compiled_args + apply_with_saved methods. This was the only way I could think of to keep this sound.
- will throw if we can't hash the saved_data, i.e. for any type other than list and dict that is not implemented in at::IValue::hash b0cfa96e82/aten/src/ATen/core/ivalue.cpp (L364)
- can technically fail silently if both the typeid hash and the typeid string name of the custom autograd::Function collide at the same time, and an otherwise identical autograd graph is called that contains a different custom autograd::Function with an identical implementation. This case seems extremely unlikely, and the only alternative to relying on the hash that I can think of is compiling with reflection
- tensors not saved via save_variables are not lifted; they are specialized on the TensorImpl*'s hash (treated as a memory address). If needed, we can lift them.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/120681
Approved by: https://github.com/jansel
2024-03-08 20:43:29 +00:00
Animesh Jain
c86a1ce125 [dynamo][guards-cpp-refactor] Func defaults and kwdefaults accessor (#121338)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121338
Approved by: https://github.com/jansel
ghstack dependencies: #121327
2024-03-08 01:24:00 +00:00
Animesh Jain
79a04f2df9 [dynamo][guards-cpp-refactor] Permit dict version guard in DictGuardManager (#121327)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121327
Approved by: https://github.com/jansel
2024-03-08 01:24:00 +00:00
PyTorch MergeBot
2b1661c7a0 Revert "[compiled autograd] support custom ops backed by c++ autograd::Function (#120681)"
This reverts commit 05c256849b.

Reverted https://github.com/pytorch/pytorch/pull/120681 on behalf of https://github.com/izaitsevfb due to breaking internal builds, see D54617701 ([comment](https://github.com/pytorch/pytorch/pull/120681#issuecomment-1984214079))
2024-03-07 18:53:51 +00:00
Simon Fan
05c256849b [compiled autograd] support custom ops backed by c++ autograd::Function (#120681)
- Adds support for custom ops backed by c++ custom autograd functions, e.g. fbgemm
- Include files more granularly to avoid namespace pollution and circular imports

limitations:
- requires the user to audit their code and opt in their custom autograd::Function via autograd::Function::is_traceable, and possibly implement additional compiled_args + apply_with_saved methods. This was the only way I could think of to keep this sound.
- will throw if we can't hash the saved_data, i.e. for any type other than list and dict that is not implemented in at::IValue::hash b0cfa96e82/aten/src/ATen/core/ivalue.cpp (L364)
- can technically fail silently if both the typeid hash and the typeid string name of the custom autograd::Function collide at the same time, and an otherwise identical autograd graph is called that contains a different custom autograd::Function with an identical implementation. This case seems extremely unlikely, and the only alternative to relying on the hash that I can think of is compiling with reflection
- tensors not saved via save_variables are not lifted; they are specialized on the TensorImpl*'s hash (treated as a memory address). If needed, we can lift them.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/120681
Approved by: https://github.com/jansel
2024-03-06 18:01:56 +00:00
Animesh Jain
e3bd6efe72 [dynamo][guards-cpp-refactor] Prevent duplication of leaf guards (#121164)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121164
Approved by: https://github.com/jansel
ghstack dependencies: #121121, #121147, #121154
2024-03-06 08:36:45 +00:00
Animesh Jain
b6b2d5b00a [dynamo][guards-cpp-refactor] Pass source name for debug ease (#121154)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121154
Approved by: https://github.com/jansel
ghstack dependencies: #121121, #121147
2024-03-06 08:36:45 +00:00
Animesh Jain
52d89d8491 [dynamo][guards-cpp-refactor] Simplify DictGuardManager by removing KeyValueDictGuardManager (#121147)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121147
Approved by: https://github.com/jansel
ghstack dependencies: #121121
2024-03-06 08:36:45 +00:00
Animesh Jain
af7f55ffc8 [dynamo][guards-cpp-refactor] Add argnames in pybind'ings (#121121)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121121
Approved by: https://github.com/jansel
2024-03-06 08:36:45 +00:00
Animesh Jain
7f81563e5e [dynamo][guards-cpp-refactor] Skip type and length check guard for DictGuardManager (#120739)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120739
Approved by: https://github.com/jansel
ghstack dependencies: #120673
2024-03-02 13:15:53 +00:00
Animesh Jain
82d1465d8d [dynamo][guards-cpp-refactor] DICT_CONTAINS guard (#120673)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120673
Approved by: https://github.com/jansel
2024-03-02 13:15:53 +00:00
Simon Fan
82b356193d Move VariableInfo into its own file to avoid circular dependency (#120732)
VariableInfo is used by both `custom_function.h` (in a templated class) and `compiled_autograd.h` (in a class with some templated methods). Another option would have been to make a `compiled_autograd.cpp` and forward-declare VariableInfo, but VariableInfo was also being used in other nodes like PyNode, so it felt cleaner to do it this way.

Differential Revision: [D54287007](https://our.internmc.facebook.com/intern/diff/D54287007)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120732
Approved by: https://github.com/jansel
2024-03-01 08:48:13 +00:00
Animesh Jain
82cbd9b131 [dynamo][guards-cpp-refactor] PythonLambdaGuardAccessor (#120730)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120730
Approved by: https://github.com/jansel
ghstack dependencies: #120864
2024-02-29 07:25:13 +00:00
Jason Ansel
01ec8df6d8 [Compiled Autograd] Introduce BackwardState capture (#120382)
This adds support for backward hooks that are *both*:
1) Interior to the graph; and
2) Dynamically generated (e.g. lambdas)

We do this by creating a BackwardState object that is used to register the hooks in the forward, then populated by dynamo *after* the forward runs.
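As a hypothetical illustration of the pattern this enables (the example and backend choice are assumptions, not taken from the PR):

~~~
import torch

@torch.compile(backend="aot_eager")
def fn(x, scale):
    y = x.sin()
    # The hook is attached to an intermediate tensor (interior to the graph)
    # and is dynamically generated (a lambda closing over `scale`), which is
    # the combination that BackwardState capture is designed to support.
    y.register_hook(lambda grad: grad * scale)
    return y.cos()

out = fn(torch.randn(4, requires_grad=True), 3.0)
out.sum().backward()
~~~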

Pull Request resolved: https://github.com/pytorch/pytorch/pull/120382
Approved by: https://github.com/xmfan
2024-02-28 20:36:47 +00:00
Animesh Jain
63f874b476 [dynamo][guards-cpp-refactor] DictGetItemGuardAccessor for f_locals (#120593)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120593
Approved by: https://github.com/jansel
2024-02-27 03:13:55 +00:00
William Wen
ecb3f33a1a [dynamo] fix segfault in _debug_get_cache_entry_list (#120635)
Fix https://github.com/pytorch/pytorch/issues/120607.
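For context, a minimal sketch of how this internal debug helper is typically used (the import path and usage below are assumptions, not taken from this PR):

~~~
import torch
# Internal debug API; the import path is an assumption.
from torch._dynamo.eval_frame import _debug_get_cache_entry_list

def fn(x):
    return x + 1

opt_fn = torch.compile(fn, backend="eager")
opt_fn(torch.randn(3))

# Dynamo attaches its cache entries to the original function's code object.
entries = _debug_get_cache_entry_list(fn.__code__)
print(entries)
~~~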

Pull Request resolved: https://github.com/pytorch/pytorch/pull/120635
Approved by: https://github.com/jansel
2024-02-26 23:31:09 +00:00
Animesh Jain
a299db2983 [dynamo][guards-cpp-refactor] NO_HASATTR guard (#120469)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120469
Approved by: https://github.com/jansel
2024-02-26 04:37:40 +00:00
Animesh Jain
4328e772bf [dynamo][guards-cpp-refactor] DICT_VERSION guard (#120416)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120416
Approved by: https://github.com/jansel
ghstack dependencies: #119822, #119827, #119833, #120060, #120061, #120062, #120064, #120065, #120067, #120068, #120089, #120091, #120119, #120123, #120093, #120096, #120342, #120344, #120359
2024-02-25 23:24:24 +00:00
Animesh Jain
c269e48af0 [dynamo][guards-cpp-refactor] DictGuardManager (#120359)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120359
Approved by: https://github.com/jansel
ghstack dependencies: #119822, #119827, #119833, #120060, #120061, #120062, #120064, #120065, #120067, #120068, #120089, #120091, #120119, #120123, #120093, #120096, #120342, #120344
2024-02-25 23:24:24 +00:00
Animesh Jain
775a4388d9 [dynamo][guards-cpp-refactor] WEAKREF_ALIVE guard (#120344)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120344
Approved by: https://github.com/jansel
ghstack dependencies: #119822, #119827, #119833, #120060, #120061, #120062, #120064, #120065, #120067, #120068, #120089, #120091, #120119, #120123, #120093, #120096, #120342
2024-02-25 23:24:04 +00:00