pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-08 07:39:33 +01:00

Author	SHA1	Message	Date
atalman	244b124bb8	Add linux cpu test for 3.12 (#117853 ) This is a continuation of work: https://github.com/pytorch/pytorch/pull/113987 Co-authored-by: albanD <desmaison.alban@gmail.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/117853 Approved by: https://github.com/albanD	2024-02-14 20:52:23 +00:00
Jason Ansel	2de24c11f6	[inductor] Slightly faster memory allocation on CUDA (#118255 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/118255 Approved by: https://github.com/peterbell10 ghstack dependencies: #118065, #118070, #118171	2024-01-25 20:49:14 +00:00
Jason Ansel	817debeb89	[inductor] Slightly faster memory allocation on CPU (#118171 ) Based on `python benchmarks/dynamo/microbenchmarks/overheads.py`: - Before `12.2us` - After `10.5us` This is inspired by `a2c17a2b00` -- but in Python rather than C++ Pull Request resolved: https://github.com/pytorch/pytorch/pull/118171 Approved by: https://github.com/jgong5, https://github.com/peterbell10 ghstack dependencies: #118065, #118070	2024-01-25 16:54:57 +00:00
Jason Ansel	a669319450	[inductor] Faster C++ kernel python bindings (#117500 ) Calling C++ from Python via ctypes is notoriously slow. This switches to generating our own C++ bindings directly, which is a >5x speedup on this kernel-launch-bound microbenchmark: ```python from ctypes import c_void_p import torch from torch import empty from torch._inductor.codecache import AsyncCompile from torch._dynamo.testing import rand_strided from torch._inductor.utils import print_performance from torch._inductor.wrapper_benchmark import compiled_module_main async_compile = AsyncCompile() src = ''' #include "/tmp/torchinductor_jansel/gb/cgbau5vlj6cetmcjbjbtw6x4rrivaln6f45s5d72gy2bfx5foz3k.h" extern "C" void kernel(const float* in_ptr0, float* out_ptr0) { { auto tmp0 = in_ptr0[static_cast<long>(0L)]; auto tmp1 = static_cast<float>(1.0); auto tmp2 = decltype(tmp0)(tmp0 + tmp1); out_ptr0[static_cast<long>(0L)] = tmp2; } } ''' cpp_fused_add_ctypes = async_compile.cpp(src) cpp_fused_add_cpython = async_compile.cpp_pybinding(["const float", "float"], src) async_compile.wait(globals()) del async_compile def call(arg0_1): buf0 = empty((1,), device='cpu', dtype=torch.float32) if use_ctypes: for _ in range(100): cpp_fused_add_ctypes(c_void_p(arg0_1.data_ptr()), c_void_p(buf0.data_ptr())) else: for _ in range(100): cpp_fused_add_cpython(arg0_1, buf0) del arg0_1 return (buf0,) def benchmark_compiled_module(times=1000, repeat=100): arg0_1 = rand_strided((1,), (1,), device='cpu', dtype=torch.float32) return print_performance(lambda: call(arg0_1), times=times, repeat=repeat) print("old ctypes bindings: ", end='') use_ctypes = True compiled_module_main('None', benchmark_compiled_module) print("new bindings: ", end='') use_ctypes = False compiled_module_main('None', benchmark_compiled_module) ``` Output: ``` old ctypes bindings: 0.000073 new bindings: 0.000013 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/117500 Approved by: https://github.com/desertfire	2024-01-18 16:20:12 +00:00
Nikita Shulga	a1afd1b195	Revert "[inductor] Faster C++ kernel python bindings (#117500 )" It should have never been landed, but was landed again, thanks to ghstack grafting/ungrafting see discussion on https://github.com/pytorch/pytorch/pull/116910 This reverts commit `e457b6fb18`.	2024-01-17 17:06:32 -08:00
titaiwangms	e457b6fb18	[inductor] Faster C++ kernel python bindings (#117500 ) Calling C++ from Python via ctypes is notoriously slow. This switches to generating our own C++ bindings directly, which is a >5x speedup on this kernel-launch-bound microbenchmark: ```python from ctypes import c_void_p import torch from torch import empty from torch._inductor.codecache import AsyncCompile from torch._dynamo.testing import rand_strided from torch._inductor.utils import print_performance from torch._inductor.wrapper_benchmark import compiled_module_main async_compile = AsyncCompile() src = ''' #include "/tmp/torchinductor_jansel/gb/cgbau5vlj6cetmcjbjbtw6x4rrivaln6f45s5d72gy2bfx5foz3k.h" extern "C" void kernel(const float* in_ptr0, float* out_ptr0) { { auto tmp0 = in_ptr0[static_cast<long>(0L)]; auto tmp1 = static_cast<float>(1.0); auto tmp2 = decltype(tmp0)(tmp0 + tmp1); out_ptr0[static_cast<long>(0L)] = tmp2; } } ''' cpp_fused_add_ctypes = async_compile.cpp(src) cpp_fused_add_cpython = async_compile.cpp_pybinding(["const float", "float"], src) async_compile.wait(globals()) del async_compile def call(arg0_1): buf0 = empty((1,), device='cpu', dtype=torch.float32) if use_ctypes: for _ in range(100): cpp_fused_add_ctypes(c_void_p(arg0_1.data_ptr()), c_void_p(buf0.data_ptr())) else: for _ in range(100): cpp_fused_add_cpython(arg0_1, buf0) del arg0_1 return (buf0,) def benchmark_compiled_module(times=1000, repeat=100): arg0_1 = rand_strided((1,), (1,), device='cpu', dtype=torch.float32) return print_performance(lambda: call(arg0_1), times=times, repeat=repeat) print("old ctypes bindings: ", end='') use_ctypes = True compiled_module_main('None', benchmark_compiled_module) print("new bindings: ", end='') use_ctypes = False compiled_module_main('None', benchmark_compiled_module) ``` Output: ``` old ctypes bindings: 0.000073 new bindings: 0.000013 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/117500 Approved by: https://github.com/desertfire ghstack dependencies: #117409, #116667, #117591	2024-01-17 23:03:15 +00:00
PyTorch MergeBot	da6abaeeac	Revert "[inductor] Faster C++ kernel python bindings (#117500 )" This reverts commit `bb0fd1bd3c`. Reverted https://github.com/pytorch/pytorch/pull/117500 on behalf of https://github.com/PaliC due to breaking internal discussed with author offline ([comment](https://github.com/pytorch/pytorch/pull/117500#issuecomment-1896516512))	2024-01-17 19:34:26 +00:00
titaiwangms	bb0fd1bd3c	[inductor] Faster C++ kernel python bindings (#117500 ) Calling C++ from Python via ctypes is notoriously slow. This switches to generating our own C++ bindings directly, which is a >5x speedup on this kernel-launch-bound microbenchmark: ```python from ctypes import c_void_p import torch from torch import empty from torch._inductor.codecache import AsyncCompile from torch._dynamo.testing import rand_strided from torch._inductor.utils import print_performance from torch._inductor.wrapper_benchmark import compiled_module_main async_compile = AsyncCompile() src = ''' #include "/tmp/torchinductor_jansel/gb/cgbau5vlj6cetmcjbjbtw6x4rrivaln6f45s5d72gy2bfx5foz3k.h" extern "C" void kernel(const float* in_ptr0, float* out_ptr0) { { auto tmp0 = in_ptr0[static_cast<long>(0L)]; auto tmp1 = static_cast<float>(1.0); auto tmp2 = decltype(tmp0)(tmp0 + tmp1); out_ptr0[static_cast<long>(0L)] = tmp2; } } ''' cpp_fused_add_ctypes = async_compile.cpp(src) cpp_fused_add_cpython = async_compile.cpp_pybinding(["const float", "float"], src) async_compile.wait(globals()) del async_compile def call(arg0_1): buf0 = empty((1,), device='cpu', dtype=torch.float32) if use_ctypes: for _ in range(100): cpp_fused_add_ctypes(c_void_p(arg0_1.data_ptr()), c_void_p(buf0.data_ptr())) else: for _ in range(100): cpp_fused_add_cpython(arg0_1, buf0) del arg0_1 return (buf0,) def benchmark_compiled_module(times=1000, repeat=100): arg0_1 = rand_strided((1,), (1,), device='cpu', dtype=torch.float32) return print_performance(lambda: call(arg0_1), times=times, repeat=repeat) print("old ctypes bindings: ", end='') use_ctypes = True compiled_module_main('None', benchmark_compiled_module) print("new bindings: ", end='') use_ctypes = False compiled_module_main('None', benchmark_compiled_module) ``` Output: ``` old ctypes bindings: 0.000073 new bindings: 0.000013 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/117500 Approved by: https://github.com/desertfire ghstack dependencies: #117409, #116667, #117591	2024-01-17 19:12:24 +00:00
PyTorch MergeBot	9da01affd3	Revert "[inductor] Faster C++ kernel python bindings (#117500 )" This reverts commit `3a52147cc5`. Reverted https://github.com/pytorch/pytorch/pull/117500 on behalf of https://github.com/PaliC due to breaking internal discussed with author offline ([comment](https://github.com/pytorch/pytorch/pull/117500#issuecomment-1896426304))	2024-01-17 18:42:39 +00:00
Jason Ansel	3a52147cc5	[inductor] Faster C++ kernel python bindings (#117500 ) Calling C++ from Python via ctypes is notoriously slow. This switches to generating our own C++ bindings directly, which is a >5x speedup on this kernel-launch-bound microbenchmark: ```python from ctypes import c_void_p import torch from torch import empty from torch._inductor.codecache import AsyncCompile from torch._dynamo.testing import rand_strided from torch._inductor.utils import print_performance from torch._inductor.wrapper_benchmark import compiled_module_main async_compile = AsyncCompile() src = ''' #include "/tmp/torchinductor_jansel/gb/cgbau5vlj6cetmcjbjbtw6x4rrivaln6f45s5d72gy2bfx5foz3k.h" extern "C" void kernel(const float* in_ptr0, float* out_ptr0) { { auto tmp0 = in_ptr0[static_cast<long>(0L)]; auto tmp1 = static_cast<float>(1.0); auto tmp2 = decltype(tmp0)(tmp0 + tmp1); out_ptr0[static_cast<long>(0L)] = tmp2; } } ''' cpp_fused_add_ctypes = async_compile.cpp(src) cpp_fused_add_cpython = async_compile.cpp_pybinding(["const float", "float"], src) async_compile.wait(globals()) del async_compile def call(arg0_1): buf0 = empty((1,), device='cpu', dtype=torch.float32) if use_ctypes: for _ in range(100): cpp_fused_add_ctypes(c_void_p(arg0_1.data_ptr()), c_void_p(buf0.data_ptr())) else: for _ in range(100): cpp_fused_add_cpython(arg0_1, buf0) del arg0_1 return (buf0,) def benchmark_compiled_module(times=1000, repeat=100): arg0_1 = rand_strided((1,), (1,), device='cpu', dtype=torch.float32) return print_performance(lambda: call(arg0_1), times=times, repeat=repeat) print("old ctypes bindings: ", end='') use_ctypes = True compiled_module_main('None', benchmark_compiled_module) print("new bindings: ", end='') use_ctypes = False compiled_module_main('None', benchmark_compiled_module) ``` Output: ``` old ctypes bindings: 0.000073 new bindings: 0.000013 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/117500 Approved by: https://github.com/desertfire	2024-01-16 22:30:04 +00:00
Michael Voznesensky	1e7947b3e0	Revert "Reland 3rd try [finishing colesbury's PR 100642] Guard on nn.Module dicts and type (#109323 )" + Forward fixes + test (#110964 ) This reverts commit `f786fbdebd`. Forward fixes Pull Request resolved: https://github.com/pytorch/pytorch/pull/110964 Approved by: https://github.com/ezyang, https://github.com/anijain2305	2023-10-11 05:16:47 +00:00
soulitzer	df9a6bcaef	[reland] Symintify guards.cpp (#110675 ) reland of https://github.com/pytorch/pytorch/pull/110371 Pull Request resolved: https://github.com/pytorch/pytorch/pull/110675 Approved by: https://github.com/ezyang ghstack dependencies: #110673, #110674	2023-10-10 19:37:17 +00:00
PyTorch MergeBot	585e2bd818	Revert "Symintify guards.cpp (#110371 )" This reverts commit `e1cfcdfa06`. Reverted https://github.com/pytorch/pytorch/pull/110371 on behalf of https://github.com/PaliC due to bottom diff is causing a plethora of internal failures ([comment](https://github.com/pytorch/pytorch/pull/110371#issuecomment-1749798063))	2023-10-05 23:42:35 +00:00
soulitzer	e1cfcdfa06	Symintify guards.cpp (#110371 ) Separating this out so we can check perf more easily Pull Request resolved: https://github.com/pytorch/pytorch/pull/110371 Approved by: https://github.com/ezyang ghstack dependencies: #110044, #110369, #110370	2023-10-04 22:56:42 +00:00
Yu Guo	2bf3ca1be7	[torchdynamo] preserve deterministic_algorithms_warn_only in convert_context (#110457 ) Summary: preserve deterministic_algorithms_warn_only in dynamo context Test Plan: modified unit tests to test warn_only Differential Revision: D49872622 Pull Request resolved: https://github.com/pytorch/pytorch/pull/110457 Approved by: https://github.com/jansel	2023-10-04 07:12:32 +00:00
Animesh Jain	0673aa3d28	[dynamo][guards-log] Print nn module guard saved dict versions for debugging (#110028 ) This is the output for nn module guards ~~~ [DEBUG] GUARDS: [DEBUG] hasattr(L['x'], '_dynamo_dynamic_indices') == False # _dynamo/variables/builder.py:1356 in wrap_fx_proxy_cls [DEBUG] ___check_obj_id(L['self'], 139820807110912) # for mod in self.mods: # examples/graph_break.py:35 in forward [DEBUG] __nn_module_guard_0(L['self']) # versions(mod=9998, _parameters=1194395, _buffers=1194397, _modules=1194423, _forward_hooks=1194405, _forward_pre_hooks=1194411, _backward_hooks=1194402, _backward_pre_hooks=1194400) # for mod in self.mods: # examples/graph_break.py:35 in forward [DEBUG] ___check_obj_id(L['self'].mods[0], 139817945727568) # for mod in self.mods: # examples/graph_break.py:35 in forward [DEBUG] __nn_module_guard_1(L['self'].mods[0]) # versions(mod=10001, _parameters=1194428, _buffers=1194430, _modules=1194522, _forward_hooks=1194438, _forward_pre_hooks=1194444, _backward_hooks=1194435, _backward_pre_hooks=1194433) # for mod in self.mods: # examples/graph_break.py:35 in forward [DEBUG] ___check_obj_id(L['self'].mods[1], 139817945560640) # for mod in self.mods: # examples/graph_break.py:35 in forward [DEBUG] __nn_module_guard_2(L['self'].mods[1]) # versions(mod=10001, _parameters=1194660, _buffers=1194662, _modules=1194753, _forward_hooks=1194670, _forward_pre_hooks=1194676, _backward_hooks=1194667, _backward_pre_hooks=1194665) # for mod in self.mods: # examples/graph_break.py:35 in forward [DEBUG] ___check_obj_id(L['self'].mods[0].linear, 139817945727856) # return self.linear(a) # examples/graph_break.py:24 in helper [DEBUG] __nn_module_guard_3(L['self'].mods[0].linear) # versions(mod=10004, _parameters=1470004, _buffers=1194467, _modules=1194493, _forward_hooks=1194475, _forward_pre_hooks=1194481, _backward_hooks=1194472, _backward_pre_hooks=1194470) # return self.linear(a) # examples/graph_break.py:24 in helper [DEBUG] ___check_obj_id(L['self'].mods[1].linear, 139817945561120) # return self.linear(a) # examples/graph_break.py:24 in helper [DEBUG] __nn_module_guard_4(L['self'].mods[1].linear) # versions(mod=10004, _parameters=1470008, _buffers=1194699, _modules=1194725, _forward_hooks=1194707, _forward_pre_hooks=1194713, _backward_hooks=1194704, _backward_pre_hooks=1194702) # return self.linear(a) # examples/graph_break.py:24 in helper [DEBUG] utils_device.CURRENT_DEVICE == None # _dynamo/output_graph.py:373 in init_ambient_guards ~~~ Pull Request resolved: https://github.com/pytorch/pytorch/pull/110028 Approved by: https://github.com/ezyang ghstack dependencies: #110023, #110039	2023-09-26 08:53:07 +00:00
Edward Z. Yang	518308a740	Trace through `pytree` API with dynamo. (#108533 ) Fix: #107315 This PR enables dynamo to trace through the `pytree` API by inlining its functions. In order to do so, a few details of `pytree` had to be changed. In summary, this PR: - Introduces `TreeSpecVariable` for representing `TreeSpec` instances - Specializes `<type>.__bases__` call, returning a `TupleVariable` - Enables the call to `id` builtin function for every variable that implements `as_python_constant` method - Specializes `ConstantVariable.call_method` for its (un)flatten functions - Implements `UserDefinedObjectVariable.as_python_constant` - Modifies `pytree` by: - Make `SUPPORTED_NODES` a map of ids (instead of types) to `NodeDef` - Removed `functools.wraps` function, since it can't be inlined Pull Request resolved: https://github.com/pytorch/pytorch/pull/108533 Approved by: https://github.com/ezyang, https://github.com/voznesenskym ghstack dependencies: #109201	2023-09-20 00:04:56 +00:00
Ken Jin	f9e72acc8f	Guard default dtype in torchdynamo (#109459 ) Fixes https://github.com/pytorch/pytorch/issues/109458 Pull Request resolved: https://github.com/pytorch/pytorch/pull/109459 Approved by: https://github.com/ezyang	2023-09-17 22:51:33 +00:00
Animesh Jain	f786fbdebd	Reland 3rd try [finishing colesbury's PR 100642] Guard on nn.Module dicts and type (#109323 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/109323 Approved by: https://github.com/huydhn, https://github.com/voznesenskym	2023-09-15 08:44:14 +00:00
PyTorch MergeBot	56c2386157	Revert "reland [finishing colesbury's PR 100642] Guard on nn.Module dicts and type (#108883 )" This reverts commit `d4230e5574`. Reverted https://github.com/pytorch/pytorch/pull/108883 on behalf of https://github.com/huydhn due to Per the discussion thread on D49122208, reverting this change ([comment](https://github.com/pytorch/pytorch/pull/108883#issuecomment-1712707853))	2023-09-10 04:40:02 +00:00
Animesh Jain	d4230e5574	reland [finishing colesbury's PR 100642] Guard on nn.Module dicts and type (#108883 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/108883 Approved by: https://github.com/voznesenskym, https://github.com/huydhn	2023-09-09 03:12:31 +00:00
Jason Ansel	4965fffeda	[dynamo] Move global state guards to C++ (#108624 ) This combines a bunch of python global state guards into a single C++ guard and switches to checking them 100% of the time. It also adds a few new guards for things that change inductor's behavior. Even though we are checking more things, I expect this to be much faster. Pull Request resolved: https://github.com/pytorch/pytorch/pull/108624 Approved by: https://github.com/anijain2305	2023-09-08 04:07:08 +00:00
PyTorch MergeBot	72f24d0001	Revert "[dynamo][finishing colesbury's PR 100642] Guard on nn.Module dicts and type (#108528 )" This reverts commit `34bb74c4cf`. Reverted https://github.com/pytorch/pytorch/pull/108528 on behalf of https://github.com/huydhn due to Sorry for reverting your change, but it has some nasty merge conflicts after the revert of D48910794. I need to revert this so the conflict could be resolved. Please help rebase this tomorrow and reland the change ([comment](https://github.com/pytorch/pytorch/pull/108528#issuecomment-1711034781))	2023-09-08 03:49:41 +00:00
Animesh Jain	34bb74c4cf	[dynamo][finishing colesbury's PR 100642] Guard on nn.Module dicts and type (#108528 ) This PR is a 99% copy paste of Sam Gross (@colesbury) work at https://github.com/pytorch/pytorch/pull/100642. Copied from there -------- The NN_MODULE guard now subsumes guards on Module attributes. The check_fn will fail if the module attributes are changed (such as Module.training), parameters, submodules, and buffers are added or removed, and if fields are changed on the type itself. This gives up specificity in the guard check -- if any field is changed the check_fn fails -- for faster overall checks. ----- Pull Request resolved: https://github.com/pytorch/pytorch/pull/108528 Approved by: https://github.com/ezyang	2023-09-07 01:45:47 +00:00
cyy	054f3f1d8f	[3/N] fix clang-tidy warnings in torch/csrc (#108024 ) Apply fixes to some found issues by clang-tidy in torch/csrc. Pull Request resolved: https://github.com/pytorch/pytorch/pull/108024 Approved by: https://github.com/Skylion007, https://github.com/albanD, https://github.com/malfet	2023-08-28 18:00:00 +00:00
ydwu4	31b0445702	Fix torch.compile with FakeTensor that has SymInt sizes (#107662 ) Motivation: When the input FakeTensor to torch.compile has SymInt sizes (e.g. make_fx(opt_f, tracing_mode="symbolic")): 1. We cannot create a FakeTensor from that input in dynamo due to the SymInts. 2. We cannot check input tensors in the guard check function and will abort because the tensor check calls sizes/strides. For 1, we specialize the FakeTensor's SymInts using their hints. This is mostly safe since inputs mostly have concrete shapes and are not computed from some DynamicOutputShape ops. We'll throw a data dependent error if the symint is unbacked. For 2, we replace size/stride calls with the sym_* variants in the TENSOR_CHECK guards' check function. Test Plan: See added tests. Pull Request resolved: https://github.com/pytorch/pytorch/pull/107662 Approved by: https://github.com/ezyang	2023-08-23 05:27:57 +00:00
Edward Z. Yang	68b9bf9671	Simplify verbose error guard printing (#107516 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/107516 Approved by: https://github.com/anijain2305 ghstack dependencies: #107505	2023-08-20 06:50:27 +00:00
cyy	1157b4393b	Add const reference and std::move in opportunities detected by clang-tidy (#105815 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/105815 Approved by: https://github.com/Skylion007	2023-07-25 12:28:14 +00:00
Michael Voznesensky	485cad4a86	Dynamo tensor aliasing guards, dedup graphargs (#104921 ) The story here is relatively simple - when we go to wrap a tensor, we (1) ensure that it is a real, not fake tensor (2) check if we have seen it before. (3) If we have seen it, we create a positive alias guard and return the associated variable. If not, we proceed. By short circuiting here, we avoid lifting it to a graph input, and guarantee that the only names passed to tensors are unique. This allows us to guard on the unique relationships (pyobject addresses, aka IDs, cannot match) to give us guards for negative aliases. Pull Request resolved: https://github.com/pytorch/pytorch/pull/104921 Approved by: https://github.com/jansel, https://github.com/ezyang	2023-07-13 22:18:08 +00:00
PyTorch MergeBot	bfd995f0d6	Revert "Specialize storage_offset - Does not cover automatic dynamic (#104204 )" This reverts commit `803c14490b`. Reverted https://github.com/pytorch/pytorch/pull/104204 on behalf of https://github.com/ezyang due to also due to https://github.com/pytorch/pytorch/issues/104563 ([comment](https://github.com/pytorch/pytorch/pull/104204#issuecomment-1620653507))	2023-07-04 19:41:32 +00:00
Michael Voznesensky	803c14490b	Specialize storage_offset - Does not cover automatic dynamic (#104204 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/104204 Approved by: https://github.com/wconstab	2023-06-27 05:51:42 +00:00
Brian Hirsh	98ab11a2c3	separate out dynamo .requires_grad and .is_grad_enabled guards (#100570 ) Fixes https://github.com/pytorch/pytorch/issues/100977 This will hopefully fix this error (from [issue](https://github.com/pytorch/pytorch/issues/99616)) This PR fixes an internal model: we were running an inductor inference graph, but `torch.is_grad_enabled()` was True, causing us to error inside of the inference graph when we encountered an out= operator. I haven't been able to create a smaller repro - before landing this, I want to create a smaller repro to convince myself of why we need to separate out these guards. Pull Request resolved: https://github.com/pytorch/pytorch/pull/100570 Approved by: https://github.com/ezyang	2023-05-24 14:58:40 +00:00
PyTorch MergeBot	7f3fed125e	Revert "separate out dynamo .requires_grad and .is_grad_enabled guards (#100570 )" This reverts commit `1fabee399d`. Reverted https://github.com/pytorch/pytorch/pull/100570 on behalf of https://github.com/PaliC due to breaking inductor tests along with #101219 ([comment](https://github.com/pytorch/pytorch/pull/100570#issuecomment-1555271267))	2023-05-19 21:29:09 +00:00
Brian Hirsh	1fabee399d	separate out dynamo .requires_grad and .is_grad_enabled guards (#100570 ) Fixes https://github.com/pytorch/pytorch/issues/100977 This will hopefully fix this error (from [issue](https://github.com/pytorch/pytorch/issues/99616)) This PR fixes an internal model: we were running an inductor inference graph, but `torch.is_grad_enabled()` was True, causing us to error inside of the inference graph when we encountered an out= operator. I haven't been able to create a smaller repro - before landing this, I want to create a smaller repro to convince myself of why we need to separate out these guards. Pull Request resolved: https://github.com/pytorch/pytorch/pull/100570 Approved by: https://github.com/ezyang	2023-05-19 16:14:56 +00:00
Michael Voznesensky	4c2892944f	Guard static shapes alongside tensors, instead of from shape_env, in dynamic_shapes=True (#99566 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/99566 Approved by: https://github.com/ezyang	2023-04-22 16:46:52 +00:00
Joel Schlosser	35be579701	Refactor TENSOR_MATCH guards to check dim (for NT support) (#97896 ) Tweaks the TENSOR_MATCH guard logic to avoid saving sizes / strides for the case of dynamic shapes. Instead, the dim() is stored, which is enough for both dense tensors and NTs. Pull Request resolved: https://github.com/pytorch/pytorch/pull/97896 Approved by: https://github.com/ezyang	2023-03-29 23:08:03 +00:00
cyy	f8ad64d5eb	[dynamo] avoid truncation of python pointers (#95619 ) This PR is separated from #94927. It aims to fix the MSVC warnings that passed python pointers are truncated to a smaller integer type. Pull Request resolved: https://github.com/pytorch/pytorch/pull/95619 Approved by: https://github.com/Skylion007	2023-02-28 19:38:34 +00:00
Michael Voznesensky	9ded087bac	During export, generate Python TENSOR_MATCH guards (#94970 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/94970 Approved by: https://github.com/ezyang	2023-02-24 05:37:31 +00:00
PyTorch MergeBot	254b161def	Revert "During export, generate Python TENSOR_MATCH guards (#94970 )" This reverts commit `5a8092f058`. Reverted https://github.com/pytorch/pytorch/pull/94970 on behalf of https://github.com/voznesenskym due to Clowny comparison bug on edge cases for devices	2023-02-23 17:47:59 +00:00
Michael Voznesensky	5a8092f058	During export, generate Python TENSOR_MATCH guards (#94970 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/94970 Approved by: https://github.com/ezyang	2023-02-22 17:28:17 +00:00
PyTorch MergeBot	6ae60b19b7	Revert "During export, generate Python TENSOR_MATCH guards (#94970 )" This reverts commit `5d2eb6d636`. Reverted https://github.com/pytorch/pytorch/pull/94970 on behalf of https://github.com/jeanschmidt due to Requires codev to land internal test changes	2023-02-22 16:49:37 +00:00
Michael Voznesensky	5d2eb6d636	During export, generate Python TENSOR_MATCH guards (#94970 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/94970 Approved by: https://github.com/ezyang	2023-02-21 19:12:57 +00:00
Michael Voznesensky	eb39d990ce	Guard on at::Tensor device index (#91779 ) Fixes #91777 Pull Request resolved: https://github.com/pytorch/pytorch/pull/91779 Approved by: https://github.com/ngimel	2023-01-19 00:58:04 +00:00
Aaron Gokaslan	3916d7a575	Apply modernize-use-emplace to aten, c10, torch (#91077 ) Apply clang-tidy check modernize-use-emplace. This is slightly more efficient by using an inplace constructor and is the recommended style in parts of the codebase covered by clang-tidy. This just manually applies the check to rest of the codebase. Pinging @ezyang as this is related to my other PRs he reviewed like #89000 Pull Request resolved: https://github.com/pytorch/pytorch/pull/91077 Approved by: https://github.com/ezyang	2022-12-19 07:49:56 +00:00
Jason Ansel	8b0cc9c752	[inductor] Fix copysign issue in old msvc build (#87117 ) Should fix https://github.com/pytorch/pytorch/pull/87028#issuecomment-1281066036 Pull Request resolved: https://github.com/pytorch/pytorch/pull/87117 Approved by: https://github.com/DanilBaibak	2022-10-18 06:06:31 +00:00
Jason Ansel	30f6f6903c	[inductor] Move size asserts to C++, fix bug (#87028 ) Inductor internally models any `size=1` dimension as having `stride=0` to simplify indexing formulas (sympy will remove these terms from the expression). This caused a bug in our generated stride assert in detectron2_maskrcnn_r_50_fpn, where we asserted the wrong stride of a size==1 dimension. This fixes that bug, and moves size/stride assert logic to C++ which should be a small perf gain. Pull Request resolved: https://github.com/pytorch/pytorch/pull/87028 Approved by: https://github.com/anijain2305	2022-10-16 20:17:22 +00:00
Jason Ansel	f1fdb6efbd	Manual changes for moving dynamo to core (#86621 ) This is the subset of the changes in #86461 not auto-generated by `copy_to_core.sh`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/86621 Approved by: https://github.com/albanD	2022-10-11 23:01:21 +00:00
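
The stride-assert bug described in commit `30f6f6903c` above comes down to one fact: the stride of a dimension whose size is 1 never affects addressing, because the only valid index along it is 0. The following is a minimal sketch of a stride comparison that skips such dimensions. It is not Inductor's actual assert (which that commit moved to C++); the function name `strides_match` and its signature are hypothetical, chosen only for illustration.

```python
def strides_match(size, actual_stride, expected_stride):
    """Return True if two stride tuples describe the same layout for `size`.

    Hypothetical helper, not part of PyTorch. A dimension of size <= 1 can
    only be indexed at 0, so its stride is ignored in the comparison.
    """
    if not (len(size) == len(actual_stride) == len(expected_stride)):
        return False
    return all(
        dim <= 1 or got == want
        for dim, got, want in zip(size, actual_stride, expected_stride)
    )


# A contiguous (2, 1, 3) layout has strides (3, 3, 1). A model that records
# stride 0 for the size-1 dimension describes the same memory layout.
same_layout = strides_match((2, 1, 3), (3, 3, 1), (3, 0, 1))

# Strides that differ on a dimension of size > 1 are a real mismatch.
different_layout = strides_match((2, 1, 3), (3, 3, 1), (1, 0, 2))

print(same_layout, different_layout)
```

A strict element-wise comparison would reject the first case even though both stride tuples address identical memory, which is the kind of false failure the commit message describes.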

197 Commits