Commit Graph

197 Commits

Author SHA1 Message Date
Animesh Jain
3de78a1b48 [dynamo][cpp-guards] EQUALS MATCH - Cache first passing value (#124627)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124627
Approved by: https://github.com/jansel
ghstack dependencies: #124779
2024-04-25 15:24:12 +00:00
Animesh Jain
e68d65dae2 [dynamo][cpp-guards] Differentiate dict guards wrt to guarding on key order (#124779)
We guard on key order:
1) when a key is a non-constant object
2) when we actually need the key order - e.g. for .values(), .items(), etc.

For dicts/OrderedDicts that do not require key-order guarding, we just rely on the usual `GuardManager + DictGetItemGuardAccessor`. This is faster than going through the `list(d.keys())`-based design for OrderedDicts.
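
As a rough illustration (the functions and inputs below are made up for this example, not code from this PR), iterating a dict inside the compiled region is the kind of access that needs key-order guarding, while plain key lookups can take the `DictGetItemGuardAccessor` path:

```python
import torch

@torch.compile(backend="eager")
def needs_key_order(x, d):
    # .items() makes the result depend on iteration order,
    # so Dynamo has to guard on the dict's key order.
    for _, v in d.items():
        x = x + v
    return x

@torch.compile(backend="eager")
def plain_lookup(x, d):
    # Only d[key] accesses: the usual GuardManager + DictGetItemGuardAccessor suffices.
    return x + d["a"] + d["b"]

d = {"a": torch.randn(3), "b": torch.randn(3)}
x = torch.randn(3)
print(needs_key_order(x, d))
print(plain_lookup(x, d))
```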

Pull Request resolved: https://github.com/pytorch/pytorch/pull/124779
Approved by: https://github.com/jansel
2024-04-25 08:20:35 +00:00
Animesh Jain
f213f262af [dynamo][cpp-guards] Improve when to use Dict vs DictSubclassGuardManager (#124237)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124237
Approved by: https://github.com/jansel, https://github.com/mlazos
ghstack dependencies: #124230
2024-04-18 03:33:37 +00:00
Animesh Jain
51cc808ac7 [dynamo][cpp-guards] Missing decref on early returns in DictSubclassGuardManager (#124230)
I am sad that I missed this earlier. The good thing is that CI caught it. I will be more careful next time.

This was the reason https://github.com/pytorch/pytorch/pull/123547 was reverted - https://github.com/pytorch/pytorch/pull/123547#issuecomment-2058350245

Pull Request resolved: https://github.com/pytorch/pytorch/pull/124230
Approved by: https://github.com/mlazos
2024-04-17 02:49:07 +00:00
Animesh Jain
2e6871f924 [dynamo][guards-cpp] Early return in DictGuardManager for empty dicts (#123787)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/123787
Approved by: https://github.com/jansel
ghstack dependencies: #123773
2024-04-11 22:23:28 +00:00
Animesh Jain
b0b7aa201c [dynamo][cpp-guards] Introduce DictSubclassGuardManager (#123773)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/123773
Approved by: https://github.com/jansel
2024-04-11 22:23:28 +00:00
Jason Ansel
e3ea316623 [dynamo] Save/restore cublas_allow_tf32 in convert_frame (#123509)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/123509
Approved by: https://github.com/anijain2305
2024-04-07 03:37:47 +00:00
Animesh Jain
d3596cf004 [dynamo][cpp-guards] Fix missing decref in GradGuardAccessor (#123488)
Found that there was a peak memory increase while running the HF suite.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/123488
Approved by: https://github.com/jansel
ghstack dependencies: #123485
2024-04-06 03:09:29 +00:00
Animesh Jain
22b9987144 [dynamo][cpp-guards] ListGetItemGuardAccessor and TupleGetItemGuardAccessor (#123396)
Speeds up the guard-overhead microbenchmark below by around 10%, normalized to the main-branch C++ guards.

~~~
import torch

@torch.compile(backend="eager")
def fn(x, lst):
    for l in lst:
        x = x + l
    return x

n = 1000

lst = [i for i in range(n)]

x = torch.randn(4)
print(fn(x, lst))
print("Sucess")
~~~

Pull Request resolved: https://github.com/pytorch/pytorch/pull/123396
Approved by: https://github.com/jansel
ghstack dependencies: #123285, #123302, #123303
2024-04-05 22:10:04 +00:00
Animesh Jain
fb7664d5bf [dynamo][optimizer][guard-overhead] NOT_NONE guard for param.grad instead of TENSOR_MATCH (#123285)
For optimizers, we do a DATA_PTR match for parameters. For param.grad, we were doing a TENSOR_MATCH, but what we really need to guard on is whether param.grad is None. Therefore, this adds a new guard called NOT_NONE.
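
As a rough sketch of the difference (the attribute checks below are illustrative, not the actual TENSOR_MATCH implementation):

```python
import torch

param = torch.nn.Parameter(torch.randn(4))
param.sum().backward()  # populate param.grad

# Roughly what a TENSOR_MATCH-style check has to look at for param.grad:
tensor_match = (
    isinstance(param.grad, torch.Tensor)
    and param.grad.dtype == torch.float32
    and param.grad.shape == (4,)
    and param.grad.device.type == "cpu"
)

# All the optimizer path actually needs (the new NOT_NONE guard):
not_none = param.grad is not None

print(tensor_match, not_none)
```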

This further reduces the guard overhead:

![image](https://github.com/pytorch/pytorch/assets/13822661/574598ac-ca71-4e5e-9e75-8774577cd58f)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/123285
Approved by: https://github.com/mlazos, https://github.com/jansel
2024-04-04 03:52:47 +00:00
Animesh Jain
3eb84b6343 [dynamo][cpp-guards] Init LocalState only when TENSOR_MATCH guard present (#123152)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/123152
Approved by: https://github.com/jansel
2024-04-03 08:04:39 +00:00
Animesh Jain
d91db70295 [dynamo][cpp-guards] Optimize tensor.grad accessor (#123226)
For the LayoutLM model, this reduces C++ guard overhead by 1.48x. These are the numbers:

![image](https://github.com/pytorch/pytorch/assets/13822661/25cfc35b-b67d-4903-8403-71fa931dacdd)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/123226
Approved by: https://github.com/jansel
2024-04-03 05:32:13 +00:00
Animesh Jain
ffd1e4e9ba [dynamo][cpp-guards] Always Reset relational guards (#123046)
Reset relational guards at the end of RootGuardManager, even if the result is True. Earlier we reset them only when the result was False, but that required extra bookkeeping in each guard. This PR gives a tiny improvement.
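
A minimal sketch of the control flow this describes (class and method names here are simplified stand-ins, not the actual C++ API):

```python
class RootGuardManagerSketch:
    def __init__(self, relational_guards, child_managers):
        self.relational_guards = relational_guards
        self.child_managers = child_managers

    def check(self, value):
        try:
            # Run all child guards; relational guards accumulate state
            # (e.g. the tensors seen so far for aliasing checks) as they run.
            return all(m.check(value) for m in self.child_managers)
        finally:
            # Reset unconditionally, whether the result was True or False,
            # so individual guards need no bookkeeping to decide when to
            # clear their state.
            for g in self.relational_guards:
                g.reset_state()
```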

Pull Request resolved: https://github.com/pytorch/pytorch/pull/123046
Approved by: https://github.com/jansel
2024-04-01 21:09:35 +00:00
William Wen
7b13228038 [dynamo, 3.12] fix DICT_VERSION C++ guards (#122449)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/122449
Approved by: https://github.com/jansel
ghstack dependencies: #122146, #122335, #122354, #122355, #122356
2024-03-27 20:39:39 +00:00
cyy
482f6c4693 [Dynamo][3/N] Fix clang-tidy warnings in torch/csrc/dynamo/* (#122392)
This PR continues cleaning up clang-tidy warnings in torch/csrc/dynamo/*, following #122362.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/122392
Approved by: https://github.com/ezyang
2024-03-22 22:57:41 +00:00
cyy
c2eedb7f8a [Dynamo][1/N] Fix clang-tidy warnings in torch/csrc/dynamo/* (#122259)
This PR begins a series of changes to ensure the dynamo C++ code is clang-tidy clean.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/122259
Approved by: https://github.com/ezyang
2024-03-21 00:43:25 +00:00
cyy
1dd1899fd6 Add missing throw of std::runtime_error in dynamo/guards.cpp (#122306)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/122306
Approved by: https://github.com/Skylion007, https://github.com/ezyang
2024-03-20 20:50:01 +00:00
Animesh Jain
22489bfe70 [dynamo][guards-cpp-refactor] Directly call root guard manager in eval_frame (#121622)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121622
Approved by: https://github.com/jansel
ghstack dependencies: #121614
2024-03-12 17:09:11 +00:00
Animesh Jain
2348e8e4e7 [dynamo][guards-cpp-refactor] Simplify DYNAMIC_INDICES guard (#121614)
Use a NO_HASATTR guard for the common part.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/121614
Approved by: https://github.com/jansel
2024-03-12 17:08:56 +00:00
Animesh Jain
c86a1ce125 [dynamo][guards-cpp-refactor] Func defaults and kwdefaults accessor (#121338)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121338
Approved by: https://github.com/jansel
ghstack dependencies: #121327
2024-03-08 01:24:00 +00:00
Animesh Jain
79a04f2df9 [dynamo][guards-cpp-refactor] Permit dict version guard in DictGuardManager (#121327)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121327
Approved by: https://github.com/jansel
2024-03-08 01:24:00 +00:00
Animesh Jain
e3bd6efe72 [dynamo][guards-cpp-refactor] Prevent duplication of leaf guards (#121164)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121164
Approved by: https://github.com/jansel
ghstack dependencies: #121121, #121147, #121154
2024-03-06 08:36:45 +00:00
Animesh Jain
b6b2d5b00a [dynamo][guards-cpp-refactor] Pass source name for debug ease (#121154)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121154
Approved by: https://github.com/jansel
ghstack dependencies: #121121, #121147
2024-03-06 08:36:45 +00:00
Animesh Jain
52d89d8491 [dynamo][guards-cpp-refactor] Simplify DictGuardManager by removing KeyValueDictGuardManager (#121147)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121147
Approved by: https://github.com/jansel
ghstack dependencies: #121121
2024-03-06 08:36:45 +00:00
Animesh Jain
af7f55ffc8 [dynamo][guards-cpp-refactor] Add argnames in pybind'ings (#121121)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121121
Approved by: https://github.com/jansel
2024-03-06 08:36:45 +00:00
Animesh Jain
7f81563e5e [dynamo][guards-cpp-refactor] Skip type and length check guard for DictGuardManager (#120739)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120739
Approved by: https://github.com/jansel
ghstack dependencies: #120673
2024-03-02 13:15:53 +00:00
Animesh Jain
82d1465d8d [dynamo][guards-cpp-refactor] DICT_CONTAINS guard (#120673)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120673
Approved by: https://github.com/jansel
2024-03-02 13:15:53 +00:00
Animesh Jain
82cbd9b131 [dynamo][guards-cpp-refactor] PythonLambdaGuardAccessor (#120730)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120730
Approved by: https://github.com/jansel
ghstack dependencies: #120864
2024-02-29 07:25:13 +00:00
Animesh Jain
63f874b476 [dynamo][guards-cpp-refactor] DictGetItemGuardAccessor for f_locals (#120593)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120593
Approved by: https://github.com/jansel
2024-02-27 03:13:55 +00:00
Animesh Jain
a299db2983 [dynamo][guards-cpp-refactor] NO_HASATTR guard (#120469)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120469
Approved by: https://github.com/jansel
2024-02-26 04:37:40 +00:00
Animesh Jain
4328e772bf [dynamo][guards-cpp-refactor] DICT_VERSION guard (#120416)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120416
Approved by: https://github.com/jansel
ghstack dependencies: #119822, #119827, #119833, #120060, #120061, #120062, #120064, #120065, #120067, #120068, #120089, #120091, #120119, #120123, #120093, #120096, #120342, #120344, #120359
2024-02-25 23:24:24 +00:00
Animesh Jain
c269e48af0 [dynamo][guards-cpp-refactor] DictGuardManager (#120359)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120359
Approved by: https://github.com/jansel
ghstack dependencies: #119822, #119827, #119833, #120060, #120061, #120062, #120064, #120065, #120067, #120068, #120089, #120091, #120119, #120123, #120093, #120096, #120342, #120344
2024-02-25 23:24:24 +00:00
Animesh Jain
775a4388d9 [dynamo][guards-cpp-refactor] WEAKREF_ALIVE guard (#120344)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120344
Approved by: https://github.com/jansel
ghstack dependencies: #119822, #119827, #119833, #120060, #120061, #120062, #120064, #120065, #120067, #120068, #120089, #120091, #120119, #120123, #120093, #120096, #120342
2024-02-25 23:24:04 +00:00
Animesh Jain
007606e520 [dynamo][guards-cpp-refactor] TENSOR_MATCH guard (#120342)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120342
Approved by: https://github.com/jansel
ghstack dependencies: #119822, #119827, #119833, #120060, #120061, #120062, #120064, #120065, #120067, #120068, #120089, #120091, #120119, #120123, #120093, #120096
2024-02-23 20:10:09 +00:00
Animesh Jain
4b65d192f0 [dynamo][guards-cpp-refactor] DYNAMIC_INDICES guard (#120096)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120096
Approved by: https://github.com/jansel
ghstack dependencies: #119822, #119827, #119833, #120060, #120061, #120062, #120064, #120065, #120067, #120068, #120089, #120091, #120119, #120123, #120093
2024-02-23 20:10:09 +00:00
Animesh Jain
a92ce46dc3 [dynamo][guards-cpp-refactor] GlobalWeakRefGuardAccessor (#120093)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120093
Approved by: https://github.com/jansel
ghstack dependencies: #119822, #119827, #119833, #120060, #120061, #120062, #120064, #120065, #120067, #120068, #120089, #120091, #120119, #120123
2024-02-23 20:10:01 +00:00
Animesh Jain
bb331b1eb5 [dynamo][guards-cpp-refactor] LENGTH_CHECK guard (#120123)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120123
Approved by: https://github.com/jansel
ghstack dependencies: #119822, #119827, #119833, #120060, #120061, #120062, #120064, #120065, #120067, #120068, #120089, #120091, #120119
2024-02-23 20:09:52 +00:00
Animesh Jain
2eac593ffd [dynamo][guards-cpp-refactor] TUPLE_ITERATOR_LEN guard (#120119)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120119
Approved by: https://github.com/jansel
ghstack dependencies: #119822, #119827, #119833, #120060, #120061, #120062, #120064, #120065, #120067, #120068, #120089, #120091
2024-02-23 20:09:43 +00:00
Animesh Jain
da95421f05 [dynamo][guards-cpp-refactor] TupleIteratorGetItemAccessor (#120091)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120091
Approved by: https://github.com/jansel
ghstack dependencies: #119822, #119827, #119833, #120060, #120061, #120062, #120064, #120065, #120067, #120068, #120089
2024-02-23 20:09:34 +00:00
Animesh Jain
9c64068ef8 [dynamo][guards-cpp-refactor] TypeGuardAccessor (#120089)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120089
Approved by: https://github.com/jansel
ghstack dependencies: #119822, #119827, #119833, #120060, #120061, #120062, #120064, #120065, #120067, #120068
2024-02-21 17:56:48 +00:00
Animesh Jain
ec6783990a [dynamo][guards-cpp-refactor] GlobalsGuardAccessor (#120068)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120068
Approved by: https://github.com/jansel
ghstack dependencies: #119822, #119827, #119833, #120060, #120061, #120062, #120064, #120065, #120067
2024-02-21 17:56:48 +00:00
Animesh Jain
66c52d678f [dynamo][guards-cpp-refactor] GetItemGuardAccessor (#120067)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120067
Approved by: https://github.com/jansel
ghstack dependencies: #119822, #119827, #119833, #120060, #120061, #120062, #120064, #120065
2024-02-21 17:56:36 +00:00
Animesh Jain
7a0c2a9d0a [dynamo][guards-cpp-refactor] NO_TENSOR_ALIASING guard (#120065)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120065
Approved by: https://github.com/jansel
ghstack dependencies: #119822, #119827, #119833, #120060, #120061, #120062, #120064
2024-02-21 17:56:18 +00:00
Animesh Jain
8d5ae8c0b3 [dynamo][guards-cpp-refactor] TENSOR_ALIASING guard (#120064)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120064
Approved by: https://github.com/jansel
ghstack dependencies: #119822, #119827, #119833, #120060, #120061, #120062
2024-02-21 17:56:05 +00:00
Animesh Jain
034955b2fc [dynamo][guards-cpp-refactor] DATA_PTR_MATCH guard (#120062)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120062
Approved by: https://github.com/jansel
ghstack dependencies: #119822, #119827, #119833, #120060, #120061
2024-02-21 17:55:46 +00:00
Animesh Jain
cc6cf89c30 [dynamo][guards-cpp-refactor] GLOBAL_STATE guard (#120061)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120061
Approved by: https://github.com/jansel
ghstack dependencies: #119822, #119827, #119833, #120060
2024-02-21 17:55:32 +00:00
Animesh Jain
5066bec743 [dynamo][guards-cpp-refactor] DEFAULT_DEVICE guard (#120060)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120060
Approved by: https://github.com/jansel
ghstack dependencies: #119822, #119827, #119833
2024-02-21 17:55:17 +00:00
Animesh Jain
389b56b4c4 [dynamo][guards-cpp-refactor] GetAttrGuardAccessor (#119833)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119833
Approved by: https://github.com/jansel
ghstack dependencies: #119822, #119827
2024-02-20 05:33:08 +00:00
Animesh Jain
96f45d15d8 [dynamo][guards-c++-refactor] EQUALS_MATCH guard (#119827)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119827
Approved by: https://github.com/jansel
ghstack dependencies: #119822
2024-02-20 05:33:08 +00:00
Animesh Jain
0802951081 [dynamo][guards-c++-refactor] Introduce LeafGuard, GuardManager and GuardAccessor classes (#119822)
The full-blown implementation is in this stack - https://github.com/pytorch/pytorch/pull/110590 - which passes all the test cases on CI. That stack is hard to review, so it is being broken apart.
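
A hedged Python sketch of how the three pieces fit together (simplified; the real classes live in C++, and this mirrors only the described structure, not the actual API):

```python
class LeafGuard:
    """Checks a single property of one value, e.g. its type or length."""
    def check(self, value) -> bool:
        raise NotImplementedError

class TypeMatchGuard(LeafGuard):
    def __init__(self, expected_type):
        self.expected_type = expected_type

    def check(self, value) -> bool:
        return type(value) is self.expected_type

class GuardAccessor:
    """Navigates from a parent value to a child value (here: an attribute)."""
    def __init__(self, attr_name, child_manager):
        self.attr_name = attr_name
        self.child_manager = child_manager

    def access(self, value):
        return getattr(value, self.attr_name)

class GuardManager:
    """Holds the leaf guards for one value plus accessors to its children,
    forming a tree that mirrors the structure of the guarded inputs."""
    def __init__(self):
        self.leaf_guards = []
        self.accessors = []

    def check(self, value) -> bool:
        return all(g.check(value) for g in self.leaf_guards) and all(
            a.child_manager.check(a.access(value)) for a in self.accessors
        )
```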

Pull Request resolved: https://github.com/pytorch/pytorch/pull/119822
Approved by: https://github.com/jansel
2024-02-20 05:33:08 +00:00
atalman
244b124bb8 Add linux cpu test for 3.12 (#117853)
This is continuation of work: https://github.com/pytorch/pytorch/pull/113987

Co-authored-by: albanD <desmaison.alban@gmail.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/117853
Approved by: https://github.com/albanD
2024-02-14 20:52:23 +00:00
Jason Ansel
2de24c11f6 [inductor] Slightly faster memory allocation on CUDA (#118255)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118255
Approved by: https://github.com/peterbell10
ghstack dependencies: #118065, #118070, #118171
2024-01-25 20:49:14 +00:00
Jason Ansel
817debeb89 [inductor] Slightly faster memory allocation on CPU (#118171)
Based on `python benchmarks/dynamo/microbenchmarks/overheads.py`:
- Before `12.2us`
- After `10.5us`

This is inspired by a2c17a2b00 -- but in Python rather than C++
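
For context, a crude stand-in for that measurement (the real numbers above come from the repository's overheads.py; this sketch only shows the shape of a call-overhead benchmark and uses an arbitrary trivial function):

```python
import time
import torch

@torch.compile(backend="eager")
def fn(x):
    return x + 1

x = torch.randn(1)
fn(x)  # warm up so compilation time is excluded

iters = 10_000
start = time.perf_counter()
for _ in range(iters):
    fn(x)
per_call_us = (time.perf_counter() - start) / iters * 1e6
print(f"{per_call_us:.1f}us per call")
```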

Pull Request resolved: https://github.com/pytorch/pytorch/pull/118171
Approved by: https://github.com/jgong5, https://github.com/peterbell10
ghstack dependencies: #118065, #118070
2024-01-25 16:54:57 +00:00
Jason Ansel
a669319450 [inductor] Faster C++ kernel python bindings (#117500)
Calling C++ from Python via ctypes is notoriously slow.  This switches to generating our own C++ bindings directly, which is a >5x speedup on this kernel-launch-bound microbenchmark:
```python
from ctypes import c_void_p
import torch
from torch import empty
from torch._inductor.codecache import AsyncCompile
from torch._dynamo.testing import rand_strided
from torch._inductor.utils import print_performance
from torch._inductor.wrapper_benchmark import compiled_module_main

async_compile = AsyncCompile()

src = '''
#include "/tmp/torchinductor_jansel/gb/cgbau5vlj6cetmcjbjbtw6x4rrivaln6f45s5d72gy2bfx5foz3k.h"
extern "C" void kernel(const float* in_ptr0,
                       float* out_ptr0)
{
    {
        auto tmp0 = in_ptr0[static_cast<long>(0L)];
        auto tmp1 = static_cast<float>(1.0);
        auto tmp2 = decltype(tmp0)(tmp0 + tmp1);
        out_ptr0[static_cast<long>(0L)] = tmp2;
    }
}
'''

cpp_fused_add_ctypes = async_compile.cpp(src)
cpp_fused_add_cpython = async_compile.cpp_pybinding(["const float*", "float*"], src)

async_compile.wait(globals())
del async_compile

def call(arg0_1):
    buf0 = empty((1,), device='cpu', dtype=torch.float32)
    if use_ctypes:
        for _ in range(100):
            cpp_fused_add_ctypes(c_void_p(arg0_1.data_ptr()), c_void_p(buf0.data_ptr()))
    else:
        for _ in range(100):
            cpp_fused_add_cpython(arg0_1, buf0)
    del arg0_1
    return (buf0,)

def benchmark_compiled_module(times=1000, repeat=100):
    arg0_1 = rand_strided((1,), (1,), device='cpu', dtype=torch.float32)
    return print_performance(lambda: call(arg0_1), times=times, repeat=repeat)

print("old ctypes bindings: ", end='')
use_ctypes = True
compiled_module_main('None', benchmark_compiled_module)
print("new bindings:        ", end='')
use_ctypes = False
compiled_module_main('None', benchmark_compiled_module)
```
Output:
```
old ctypes bindings: 0.000073
new bindings:        0.000013
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/117500
Approved by: https://github.com/desertfire
2024-01-18 16:20:12 +00:00
Nikita Shulga
a1afd1b195 Revert "[inductor] Faster C++ kernel python bindings (#117500)"
It should never have been landed, but was landed again due to ghstack grafting/ungrafting; see the discussion on https://github.com/pytorch/pytorch/pull/116910.

This reverts commit e457b6fb18.
2024-01-17 17:06:32 -08:00
titaiwangms
e457b6fb18 [inductor] Faster C++ kernel python bindings (#117500)
Calling C++ from Python via ctypes is notoriously slow.  This switches to generating our own C++ bindings directly, which is a >5x speedup on this kernel-launch-bound microbenchmark:
```python
from ctypes import c_void_p
import torch
from torch import empty
from torch._inductor.codecache import AsyncCompile
from torch._dynamo.testing import rand_strided
from torch._inductor.utils import print_performance
from torch._inductor.wrapper_benchmark import compiled_module_main

async_compile = AsyncCompile()

src = '''
#include "/tmp/torchinductor_jansel/gb/cgbau5vlj6cetmcjbjbtw6x4rrivaln6f45s5d72gy2bfx5foz3k.h"
extern "C" void kernel(const float* in_ptr0,
                       float* out_ptr0)
{
    {
        auto tmp0 = in_ptr0[static_cast<long>(0L)];
        auto tmp1 = static_cast<float>(1.0);
        auto tmp2 = decltype(tmp0)(tmp0 + tmp1);
        out_ptr0[static_cast<long>(0L)] = tmp2;
    }
}
'''

cpp_fused_add_ctypes = async_compile.cpp(src)
cpp_fused_add_cpython = async_compile.cpp_pybinding(["const float*", "float*"], src)

async_compile.wait(globals())
del async_compile

def call(arg0_1):
    buf0 = empty((1,), device='cpu', dtype=torch.float32)
    if use_ctypes:
        for _ in range(100):
            cpp_fused_add_ctypes(c_void_p(arg0_1.data_ptr()), c_void_p(buf0.data_ptr()))
    else:
        for _ in range(100):
            cpp_fused_add_cpython(arg0_1, buf0)
    del arg0_1
    return (buf0,)

def benchmark_compiled_module(times=1000, repeat=100):
    arg0_1 = rand_strided((1,), (1,), device='cpu', dtype=torch.float32)
    return print_performance(lambda: call(arg0_1), times=times, repeat=repeat)

print("old ctypes bindings: ", end='')
use_ctypes = True
compiled_module_main('None', benchmark_compiled_module)
print("new bindings:        ", end='')
use_ctypes = False
compiled_module_main('None', benchmark_compiled_module)
```
Output:
```
old ctypes bindings: 0.000073
new bindings:        0.000013
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/117500
Approved by: https://github.com/desertfire
ghstack dependencies: #117409, #116667, #117591
2024-01-17 23:03:15 +00:00
PyTorch MergeBot
da6abaeeac Revert "[inductor] Faster C++ kernel python bindings (#117500)"
This reverts commit bb0fd1bd3c.

Reverted https://github.com/pytorch/pytorch/pull/117500 on behalf of https://github.com/PaliC due to breaking internal discussed with author offline ([comment](https://github.com/pytorch/pytorch/pull/117500#issuecomment-1896516512))
2024-01-17 19:34:26 +00:00
titaiwangms
bb0fd1bd3c [inductor] Faster C++ kernel python bindings (#117500)
Calling C++ from Python via ctypes is notoriously slow.  This switches to generating our own C++ bindings directly, which is a >5x speedup on this kernel-launch-bound microbenchmark:
```python
from ctypes import c_void_p
import torch
from torch import empty
from torch._inductor.codecache import AsyncCompile
from torch._dynamo.testing import rand_strided
from torch._inductor.utils import print_performance
from torch._inductor.wrapper_benchmark import compiled_module_main

async_compile = AsyncCompile()

src = '''
#include "/tmp/torchinductor_jansel/gb/cgbau5vlj6cetmcjbjbtw6x4rrivaln6f45s5d72gy2bfx5foz3k.h"
extern "C" void kernel(const float* in_ptr0,
                       float* out_ptr0)
{
    {
        auto tmp0 = in_ptr0[static_cast<long>(0L)];
        auto tmp1 = static_cast<float>(1.0);
        auto tmp2 = decltype(tmp0)(tmp0 + tmp1);
        out_ptr0[static_cast<long>(0L)] = tmp2;
    }
}
'''

cpp_fused_add_ctypes = async_compile.cpp(src)
cpp_fused_add_cpython = async_compile.cpp_pybinding(["const float*", "float*"], src)

async_compile.wait(globals())
del async_compile

def call(arg0_1):
    buf0 = empty((1,), device='cpu', dtype=torch.float32)
    if use_ctypes:
        for _ in range(100):
            cpp_fused_add_ctypes(c_void_p(arg0_1.data_ptr()), c_void_p(buf0.data_ptr()))
    else:
        for _ in range(100):
            cpp_fused_add_cpython(arg0_1, buf0)
    del arg0_1
    return (buf0,)

def benchmark_compiled_module(times=1000, repeat=100):
    arg0_1 = rand_strided((1,), (1,), device='cpu', dtype=torch.float32)
    return print_performance(lambda: call(arg0_1), times=times, repeat=repeat)

print("old ctypes bindings: ", end='')
use_ctypes = True
compiled_module_main('None', benchmark_compiled_module)
print("new bindings:        ", end='')
use_ctypes = False
compiled_module_main('None', benchmark_compiled_module)
```
Output:
```
old ctypes bindings: 0.000073
new bindings:        0.000013
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/117500
Approved by: https://github.com/desertfire
ghstack dependencies: #117409, #116667, #117591
2024-01-17 19:12:24 +00:00
PyTorch MergeBot
9da01affd3 Revert "[inductor] Faster C++ kernel python bindings (#117500)"
This reverts commit 3a52147cc5.

Reverted https://github.com/pytorch/pytorch/pull/117500 on behalf of https://github.com/PaliC due to breaking internal discussed with author offline ([comment](https://github.com/pytorch/pytorch/pull/117500#issuecomment-1896426304))
2024-01-17 18:42:39 +00:00
Jason Ansel
3a52147cc5 [inductor] Faster C++ kernel python bindings (#117500)
Calling C++ from Python via ctypes is notoriously slow.  This switches to generating our own C++ bindings directly, which is a >5x speedup on this kernel-launch-bound microbenchmark:
```python
from ctypes import c_void_p
import torch
from torch import empty
from torch._inductor.codecache import AsyncCompile
from torch._dynamo.testing import rand_strided
from torch._inductor.utils import print_performance
from torch._inductor.wrapper_benchmark import compiled_module_main

async_compile = AsyncCompile()

src = '''
#include "/tmp/torchinductor_jansel/gb/cgbau5vlj6cetmcjbjbtw6x4rrivaln6f45s5d72gy2bfx5foz3k.h"
extern "C" void kernel(const float* in_ptr0,
                       float* out_ptr0)
{
    {
        auto tmp0 = in_ptr0[static_cast<long>(0L)];
        auto tmp1 = static_cast<float>(1.0);
        auto tmp2 = decltype(tmp0)(tmp0 + tmp1);
        out_ptr0[static_cast<long>(0L)] = tmp2;
    }
}
'''

cpp_fused_add_ctypes = async_compile.cpp(src)
cpp_fused_add_cpython = async_compile.cpp_pybinding(["const float*", "float*"], src)

async_compile.wait(globals())
del async_compile

def call(arg0_1):
    buf0 = empty((1,), device='cpu', dtype=torch.float32)
    if use_ctypes:
        for _ in range(100):
            cpp_fused_add_ctypes(c_void_p(arg0_1.data_ptr()), c_void_p(buf0.data_ptr()))
    else:
        for _ in range(100):
            cpp_fused_add_cpython(arg0_1, buf0)
    del arg0_1
    return (buf0,)

def benchmark_compiled_module(times=1000, repeat=100):
    arg0_1 = rand_strided((1,), (1,), device='cpu', dtype=torch.float32)
    return print_performance(lambda: call(arg0_1), times=times, repeat=repeat)

print("old ctypes bindings: ", end='')
use_ctypes = True
compiled_module_main('None', benchmark_compiled_module)
print("new bindings:        ", end='')
use_ctypes = False
compiled_module_main('None', benchmark_compiled_module)
```
Output:
```
old ctypes bindings: 0.000073
new bindings:        0.000013
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/117500
Approved by: https://github.com/desertfire
2024-01-16 22:30:04 +00:00
Michael Voznesensky
1e7947b3e0 Revert "Reland 3rd try [finishing colesbury's PR 100642] Guard on nn.Module dicts and type (#109323)" + Forward fixes + test (#110964)
This reverts commit f786fbdebd.

Forward fixes

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110964
Approved by: https://github.com/ezyang, https://github.com/anijain2305
2023-10-11 05:16:47 +00:00
soulitzer
df9a6bcaef [reland] Symintify guards.cpp (#110675)
reland of https://github.com/pytorch/pytorch/pull/110371

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110675
Approved by: https://github.com/ezyang
ghstack dependencies: #110673, #110674
2023-10-10 19:37:17 +00:00
PyTorch MergeBot
585e2bd818 Revert "Symintify guards.cpp (#110371)"
This reverts commit e1cfcdfa06.

Reverted https://github.com/pytorch/pytorch/pull/110371 on behalf of https://github.com/PaliC due to bottom diff is causing a plethora of internal failures ([comment](https://github.com/pytorch/pytorch/pull/110371#issuecomment-1749798063))
2023-10-05 23:42:35 +00:00
soulitzer
e1cfcdfa06 Symintify guards.cpp (#110371)
Separating this out so we can check perf more easily

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110371
Approved by: https://github.com/ezyang
ghstack dependencies: #110044, #110369, #110370
2023-10-04 22:56:42 +00:00
Yu Guo
2bf3ca1be7 [torchdynamo] preserve deterministic_algorithms_warn_only in convert_context (#110457)
Summary: preserve deterministic_algorithms_warn_only  in dynamo context

Test Plan: modified unit tests to test warn_only

Differential Revision: D49872622

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110457
Approved by: https://github.com/jansel
2023-10-04 07:12:32 +00:00
Animesh Jain
0673aa3d28 [dynamo][guards-log] Print nn module guard saved dict versions for debugging (#110028)
This is the output for nn module guards

~~~
[DEBUG] GUARDS:
[DEBUG] hasattr(L['x'], '_dynamo_dynamic_indices') == False           # _dynamo/variables/builder.py:1356 in wrap_fx_proxy_cls
[DEBUG] ___check_obj_id(L['self'], 139820807110912)                   # for mod in self.mods:  # examples/graph_break.py:35 in forward
[DEBUG] __nn_module_guard_0(L['self']) # versions(mod=9998, _parameters=1194395, _buffers=1194397, _modules=1194423, _forward_hooks=1194405, _forward_pre_hooks=1194411, _backward_hooks=1194402, _backward_pre_hooks=1194400)  # for mod in self.mods:  # examples/graph_break.py:35 in forward
[DEBUG] ___check_obj_id(L['self'].mods[0], 139817945727568)           # for mod in self.mods:  # examples/graph_break.py:35 in forward
[DEBUG] __nn_module_guard_1(L['self'].mods[0]) # versions(mod=10001, _parameters=1194428, _buffers=1194430, _modules=1194522, _forward_hooks=1194438, _forward_pre_hooks=1194444, _backward_hooks=1194435, _backward_pre_hooks=1194433)  # for mod in self.mods:  # examples/graph_break.py:35 in forward
[DEBUG] ___check_obj_id(L['self'].mods[1], 139817945560640)           # for mod in self.mods:  # examples/graph_break.py:35 in forward
[DEBUG] __nn_module_guard_2(L['self'].mods[1]) # versions(mod=10001, _parameters=1194660, _buffers=1194662, _modules=1194753, _forward_hooks=1194670, _forward_pre_hooks=1194676, _backward_hooks=1194667, _backward_pre_hooks=1194665)  # for mod in self.mods:  # examples/graph_break.py:35 in forward
[DEBUG] ___check_obj_id(L['self'].mods[0].linear, 139817945727856)    # return self.linear(a)  # examples/graph_break.py:24 in helper
[DEBUG] __nn_module_guard_3(L['self'].mods[0].linear) # versions(mod=10004, _parameters=1470004, _buffers=1194467, _modules=1194493, _forward_hooks=1194475, _forward_pre_hooks=1194481, _backward_hooks=1194472, _backward_pre_hooks=1194470)  # return self.linear(a)  # examples/graph_break.py:24 in helper
[DEBUG] ___check_obj_id(L['self'].mods[1].linear, 139817945561120)    # return self.linear(a)  # examples/graph_break.py:24 in helper
[DEBUG] __nn_module_guard_4(L['self'].mods[1].linear) # versions(mod=10004, _parameters=1470008, _buffers=1194699, _modules=1194725, _forward_hooks=1194707, _forward_pre_hooks=1194713, _backward_hooks=1194704, _backward_pre_hooks=1194702)  # return self.linear(a)  # examples/graph_break.py:24 in helper
[DEBUG] utils_device.CURRENT_DEVICE == None                           # _dynamo/output_graph.py:373 in init_ambient_guards
~~~

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110028
Approved by: https://github.com/ezyang
ghstack dependencies: #110023, #110039
2023-09-26 08:53:07 +00:00
Edward Z. Yang
518308a740 Trace through pytree API with dynamo. (#108533)
Fix: #107315

This PR enables dynamo to trace through the `pytree` API by inlining its functions. In
order to do so, a few details of `pytree` had to be changed.

In summary, this PR:

- Introduces `TreeSpecVariable` for representing `TreeSpec` instances
- Specializes `<type>.__bases__` call, returning a `TupleVariable`
- Enables the call to `id` builtin function for every variable that implements
  `as_python_constant` method
- Specializes `ConstantVariable.call_method` for its (un)flatten functions
- Implements `UserDefinedObjectVariable.as_python_constant`
- Modifies `pytree` by:
    - Making `SUPPORTED_NODES` a map of ids (instead of types) to `NodeDef`
    - Removing the `functools.wraps` call, since it can't be inlined
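
A small hedged example of the kind of code this lets Dynamo trace (the function and input structure are made up for illustration):

```python
import torch
import torch.utils._pytree as pytree

@torch.compile(backend="eager")
def fn(inputs):
    # tree_flatten / tree_unflatten are now inlined by Dynamo rather than
    # causing a graph break.
    leaves, spec = pytree.tree_flatten(inputs)
    doubled = [leaf * 2 for leaf in leaves]
    return pytree.tree_unflatten(doubled, spec)

inputs = {"a": torch.randn(2), "b": (torch.randn(2), torch.randn(2))}
print(fn(inputs))
```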

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108533
Approved by: https://github.com/ezyang, https://github.com/voznesenskym
ghstack dependencies: #109201
2023-09-20 00:04:56 +00:00
Ken Jin
f9e72acc8f Guard default dtype in torchdynamo (#109459)
Fixes https://github.com/pytorch/pytorch/issues/109458

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109459
Approved by: https://github.com/ezyang
2023-09-17 22:51:33 +00:00
Animesh Jain
f786fbdebd Reland 3rd try [finishing colesbury's PR 100642] Guard on nn.Module dicts and type (#109323)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109323
Approved by: https://github.com/huydhn, https://github.com/voznesenskym
2023-09-15 08:44:14 +00:00
PyTorch MergeBot
56c2386157 Revert "reland [finishing colesbury's PR 100642] Guard on nn.Module dicts and type (#108883)"
This reverts commit d4230e5574.

Reverted https://github.com/pytorch/pytorch/pull/108883 on behalf of https://github.com/huydhn due to Per the discussion thread on D49122208, reverting this change ([comment](https://github.com/pytorch/pytorch/pull/108883#issuecomment-1712707853))
2023-09-10 04:40:02 +00:00
Animesh Jain
d4230e5574 reland [finishing colesbury's PR 100642] Guard on nn.Module dicts and type (#108883)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108883
Approved by: https://github.com/voznesenskym, https://github.com/huydhn
2023-09-09 03:12:31 +00:00
Jason Ansel
4965fffeda [dynamo] Move global state guards to C++ (#108624)
This combines a bunch of python global state guards into a single C++ guard and switches to checking them 100% of the time.  It also adds a few new guards for things that change inductor's behavior.   Even though we are checking more things, I expect this to be much faster.
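
A hedged illustration of the kind of state this covers (assuming grad mode is among the guarded globals): flipping it between calls should fail the combined C++ guard and trigger a recompile.

```python
import torch

@torch.compile(backend="eager")
def fn(x):
    return x @ x.t()

x = torch.randn(8, 8)
fn(x)           # compiled under the current global state (grad enabled, etc.)

with torch.no_grad():
    fn(x)       # a guarded piece of global state changed -> the global-state
                # guard fails and the frame is recompiled
```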

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108624
Approved by: https://github.com/anijain2305
2023-09-08 04:07:08 +00:00
PyTorch MergeBot
72f24d0001 Revert "[dynamo][finishing colesbury's PR 100642] Guard on nn.Module dicts and type (#108528)"
This reverts commit 34bb74c4cf.

Reverted https://github.com/pytorch/pytorch/pull/108528 on behalf of https://github.com/huydhn due to Sorry for reverting your change, but it has some nasty merge conflicts after the revert of D48910794. I need to revert this so the conflict could be resolved. Please help rebase this tomorrow and reland the change ([comment](https://github.com/pytorch/pytorch/pull/108528#issuecomment-1711034781))
2023-09-08 03:49:41 +00:00
Animesh Jain
34bb74c4cf [dynamo][finishing colesbury's PR 100642] Guard on nn.Module dicts and type (#108528)
**This PR is a 99% copy-paste of Sam Gross's** (@colesbury) work at https://github.com/pytorch/pytorch/pull/100642. The description below is copied from there.

--------
The NN_MODULE guard now subsumes guards on Module attributes. The check_fn will fail if module attributes are changed (such as Module.training), if parameters, submodules, or buffers are added or removed, or if fields are changed on the type itself.

This gives up specificity in the guard check -- if any field is changed the check_fn fails -- for faster overall checks.
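
A hedged example of what the coarser check means in practice (the module below is just for illustration): any attribute change on the module, not only the ones the graph reads, should now fail the check and force a recompile.

```python
import torch
import torch.nn as nn

class M(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 4)

    def forward(self, x):
        return self.linear(x)

m = M()
opt = torch.compile(m, backend="eager")
x = torch.randn(2, 4)
opt(x)    # compiles; the NN_MODULE guard captures m's dicts and type

m.eval()  # changes m.training -> the NN_MODULE check_fn fails on the next call
opt(x)    # recompiles rather than reusing the stale graph
```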

-----

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108528
Approved by: https://github.com/ezyang
2023-09-07 01:45:47 +00:00
cyy
054f3f1d8f [3/N] fix clang-tidy warnings in torch/csrc (#108024)
Apply fixes to some found issues by clang-tidy in torch/csrc.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108024
Approved by: https://github.com/Skylion007, https://github.com/albanD, https://github.com/malfet
2023-08-28 18:00:00 +00:00
ydwu4
31b0445702 Fix torch.compile with FakeTensor that has SymInt sizes (#107662)
**Motivation:**
When the input FakeTensor to torch.compile has SymInt sizes (e.g. from make_fx(opt_f, tracing_mode="symbolic")):
1. We cannot create a FakeTensor from that input in dynamo due to the SymInts.
2. We cannot check input tensors in the guard check function; it aborts because the tensor check calls sizes/strides.

For 1, we specialize the FakeTensor's SymInts using their hints. This is mostly safe since inputs usually have concrete shapes and are not computed from DynamicOutputShape ops. We throw a data-dependent error if the SymInt is unbacked.

For 2, we replace size/stride calls with the sym_* variants in the TENSOR_CHECK guards' check function.
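
A hedged sketch of the motivating setup (the function, shapes, and backend are placeholders):

```python
import torch
from torch.fx.experimental.proxy_tensor import make_fx

def f(x):
    return x * 2 + 1

opt_f = torch.compile(f, backend="eager")

# Tracing the compiled function symbolically hands Dynamo FakeTensor inputs
# whose sizes are SymInts -- the case handled above by specializing to their
# hints and by using the sym_* size/stride calls in the guard check function.
gm = make_fx(opt_f, tracing_mode="symbolic")(torch.randn(4, 8))
print(gm.graph)
```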

**Test Plan:**
See added tests.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107662
Approved by: https://github.com/ezyang
2023-08-23 05:27:57 +00:00
Edward Z. Yang
68b9bf9671 Simplify verbose error guard printing (#107516)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107516
Approved by: https://github.com/anijain2305
ghstack dependencies: #107505
2023-08-20 06:50:27 +00:00
cyy
1157b4393b Add const reference and std::move in opportunities detected by clang-tidy (#105815)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105815
Approved by: https://github.com/Skylion007
2023-07-25 12:28:14 +00:00
Michael Voznesensky
485cad4a86 Dynamo tensor aliasing guards, dedup graphargs (#104921)
The story here is relatively simple: when we go to wrap a tensor, we (1) ensure that it is a real, not fake, tensor, and (2) check whether we have seen it before. (3) If we have, we create a positive aliasing guard and return the associated variable; if not, we proceed.

By short-circuiting here, we avoid lifting it to a graph input and guarantee that the names passed to tensors are unique. This allows us to guard on the uniqueness relationships (PyObject addresses, aka IDs, cannot match), giving us guards for negative aliases.
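
A hedged example of the two situations these guards distinguish (function and tensors are made up for illustration):

```python
import torch

@torch.compile(backend="eager")
def fn(a, b):
    return a + b

x = torch.randn(3)
y = torch.randn(3)

fn(x, x)  # a and b alias: deduped to one graph input plus a positive aliasing guard
fn(x, y)  # distinct tensors: the aliasing guard on the first entry fails, so a
          # second entry is compiled, guarded on the object ids *not* matching
```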

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104921
Approved by: https://github.com/jansel, https://github.com/ezyang
2023-07-13 22:18:08 +00:00
PyTorch MergeBot
bfd995f0d6 Revert "Specialize storage_offset - Does not cover automatic dynamic (#104204)"
This reverts commit 803c14490b.

Reverted https://github.com/pytorch/pytorch/pull/104204 on behalf of https://github.com/ezyang due to also due to https://github.com/pytorch/pytorch/issues/104563 ([comment](https://github.com/pytorch/pytorch/pull/104204#issuecomment-1620653507))
2023-07-04 19:41:32 +00:00
Michael Voznesensky
803c14490b Specialize storage_offset - Does not cover automatic dynamic (#104204)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104204
Approved by: https://github.com/wconstab
2023-06-27 05:51:42 +00:00
Brian Hirsh
98ab11a2c3 separate out dynamo .requires_grad and .is_grad_enabled guards (#100570)
Fixes https://github.com/pytorch/pytorch/issues/100977

This will hopefully fix this error (from [issue](https://github.com/pytorch/pytorch/issues/99616))

This PR fixes an internal model: we were running an inductor inference graph, but `torch.is_grad_enabled()` was True, causing us to error inside the inference graph when we encountered an out= operator.

I haven't been able to create a smaller repro yet; before landing this, I want one so I can convince myself of why we need to separate out these guards.
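
A hedged sketch of the failure mode being guarded against (illustrative only; the original report came from an internal model):

```python
import torch

def fn(x):
    return torch.relu(x)

opt_fn = torch.compile(fn, backend="eager")
x = torch.randn(4)  # requires_grad is False

with torch.no_grad():
    opt_fn(x)   # first entry compiled while grad mode is off

# Same .requires_grad on the input, but torch.is_grad_enabled() now differs.
# With the two guards separated, this should recompile instead of reusing
# the entry specialized for inference.
opt_fn(x)
```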

Pull Request resolved: https://github.com/pytorch/pytorch/pull/100570
Approved by: https://github.com/ezyang
2023-05-24 14:58:40 +00:00
PyTorch MergeBot
7f3fed125e Revert "separate out dynamo .requires_grad and .is_grad_enabled guards (#100570)"
This reverts commit 1fabee399d.

Reverted https://github.com/pytorch/pytorch/pull/100570 on behalf of https://github.com/PaliC due to breaking inductor tests along with #101219 ([comment](https://github.com/pytorch/pytorch/pull/100570#issuecomment-1555271267))
2023-05-19 21:29:09 +00:00
Brian Hirsh
1fabee399d separate out dynamo .requires_grad and .is_grad_enabled guards (#100570)
Fixes https://github.com/pytorch/pytorch/issues/100977

This will hopefully fix this error (from [issue](https://github.com/pytorch/pytorch/issues/99616))

This PR fixes an internal model: we were running an inductor inference graph, but `torch.is_grad_enabled()` was True, causing us to error inside the inference graph when we encountered an out= operator.

I haven't been able to create a smaller repro yet; before landing this, I want one so I can convince myself of why we need to separate out these guards.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/100570
Approved by: https://github.com/ezyang
2023-05-19 16:14:56 +00:00
Michael Voznesensky
4c2892944f Guard static shapes alongside tensors, instead of from shape_env, in dynamic_shapes=True (#99566)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99566
Approved by: https://github.com/ezyang
2023-04-22 16:46:52 +00:00
Joel Schlosser
35be579701 Refactor TENSOR_MATCH guards to check dim (for NT support) (#97896)
Tweaks the TENSOR_MATCH guard logic to avoid saving sizes / strides for the case of dynamic shapes. Instead, the dim() is stored, which is enough for both dense tensors and NTs.
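
A hedged example of the resulting contract under dynamic shapes (exact recompile behavior can vary between versions):

```python
import torch

@torch.compile(backend="eager", dynamic=True)
def fn(x):
    return x.sum()

fn(torch.randn(4))     # compiles for a 1-D tensor
fn(torch.randn(100))   # same dim() -> the guard still passes, no recompile
fn(torch.randn(4, 4))  # dim changed from 1 to 2 -> guard fails, recompile
```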

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97896
Approved by: https://github.com/ezyang
2023-03-29 23:08:03 +00:00
cyy
f8ad64d5eb [dynamo] avoid truncation of python pointers (#95619)
This PR is separated from #94927. It aims to fix the MSVC warnings that passed Python pointers are truncated to a smaller integer type.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/95619
Approved by: https://github.com/Skylion007
2023-02-28 19:38:34 +00:00
Michael Voznesensky
9ded087bac During export, generate Python TENSOR_MATCH guards (#94970)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94970
Approved by: https://github.com/ezyang
2023-02-24 05:37:31 +00:00
PyTorch MergeBot
254b161def Revert "During export, generate Python TENSOR_MATCH guards (#94970)"
This reverts commit 5a8092f058.

Reverted https://github.com/pytorch/pytorch/pull/94970 on behalf of https://github.com/voznesenskym due to Clowny comparison bug on edge cases for devices
2023-02-23 17:47:59 +00:00
Michael Voznesensky
5a8092f058 During export, generate Python TENSOR_MATCH guards (#94970)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94970
Approved by: https://github.com/ezyang
2023-02-22 17:28:17 +00:00
PyTorch MergeBot
6ae60b19b7 Revert "During export, generate Python TENSOR_MATCH guards (#94970)"
This reverts commit 5d2eb6d636.

Reverted https://github.com/pytorch/pytorch/pull/94970 on behalf of https://github.com/jeanschmidt due to Requires codev to land internal test changes
2023-02-22 16:49:37 +00:00
Michael Voznesensky
5d2eb6d636 During export, generate Python TENSOR_MATCH guards (#94970)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94970
Approved by: https://github.com/ezyang
2023-02-21 19:12:57 +00:00
Michael Voznesensky
eb39d990ce Guard on at::Tensor device index (#91779)
Fixes #91777

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91779
Approved by: https://github.com/ngimel
2023-01-19 00:58:04 +00:00
Aaron Gokaslan
3916d7a575 Apply modernize-use-emplace to aten, c10, torch (#91077)
Apply the clang-tidy check modernize-use-emplace. This is slightly more efficient because it uses an in-place constructor, and it is the recommended style in the parts of the codebase covered by clang-tidy. This just manually applies the check to the rest of the codebase. Pinging @ezyang as this is related to my other PRs he reviewed, like #89000.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91077
Approved by: https://github.com/ezyang
2022-12-19 07:49:56 +00:00
Jason Ansel
8b0cc9c752 [inductor] Fix copysign issue in old msvc build (#87117)
Should fix https://github.com/pytorch/pytorch/pull/87028#issuecomment-1281066036
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87117
Approved by: https://github.com/DanilBaibak
2022-10-18 06:06:31 +00:00
Jason Ansel
30f6f6903c [inductor] Move size asserts to C++, fix bug (#87028)
Inductor internally models any `size=1` dimension as having `stride=0` to simplify indexing formulas (sympy will remove these terms from the expression).

This caused a bug in our generated stride assert in detectron2_maskrcnn_r_50_fpn, where we asserted the wrong stride for a size==1 dimension.

This fixes that bug and moves the size/stride assert logic to C++, which should be a small perf gain.
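
To see why the stride of a size-1 dimension is irrelevant (and can therefore be modeled as 0), a small self-contained check:

```python
import torch

base = torch.arange(8.)
# The stride of a size-1 dimension never multiplies a non-zero index,
# so any stride value there addresses exactly the same elements.
a = base.as_strided((1, 4), (4, 1))
b = base.as_strided((1, 4), (0, 1))
print(torch.equal(a, b))  # True
```
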
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87028
Approved by: https://github.com/anijain2305
2022-10-16 20:17:22 +00:00
Jason Ansel
f1fdb6efbd Manual changes for moving dynamo to core (#86621)
This is the subset of the changes in #86461 not auto-generated by `copy_to_core.sh`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86621
Approved by: https://github.com/albanD
2022-10-11 23:01:21 +00:00