pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-06 12:20:52 +01:00

Author	SHA1	Message	Date
Scott Wolchok	6a5a436624	DTensor: C++ compute_global_tensor_info (#162990 ) compute_global_tensor_info is on the hot path for DTensor.{from,to}_local. More incremental progress toward C++. Pull Request resolved: https://github.com/pytorch/pytorch/pull/162990 Approved by: https://github.com/ezyang	2025-10-30 15:10:54 +00:00
Yuanyuan Chen	36871622f1	[2/N] Mark unused parameters in C++ code (#165121 ) This is follow-up of #164912 to mark unused C++ parameters to improve code readability. Pull Request resolved: https://github.com/pytorch/pytorch/pull/165121 Approved by: https://github.com/Skylion007	2025-10-15 03:04:39 +00:00
Yuanyuan Chen	ecb53078fa	Turn some const strings into constexpr in C++ code (#165203 ) This PR turns more const strings into constexpr. Pull Request resolved: https://github.com/pytorch/pytorch/pull/165203 Approved by: https://github.com/Skylion007	2025-10-13 20:25:20 +00:00
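The commit above does not show which strings were changed, so the following is only a minimal sketch of the general pattern, assuming C++17. The names `kPrefix` and `has_prefix` are hypothetical and are not taken from the PyTorch sources. A `constexpr std::string_view` is a compile-time constant, so code that uses it can be checked with `static_assert`.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical constant for illustration only. As a `const std::string`
// global, this would be initialized at program startup; as a
// `constexpr std::string_view` it is a compile-time constant.
constexpr std::string_view kPrefix = "torch.";

// Hypothetical helper: true if `s` begins with kPrefix.
// substr clamps the count, so inputs shorter than kPrefix are safe.
constexpr bool has_prefix(std::string_view s) {
  return s.substr(0, kPrefix.size()) == kPrefix;
}

// Because everything is constexpr, the check can run at compile time.
static_assert(has_prefix("torch.Tensor"));

// The same helper also works at runtime.
inline void runtime_check() {
  assert(has_prefix("torch.nn"));
  assert(!has_prefix("numpy"));
}
```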
soulitzer	71aefd5595	[reland] Allow setting grad_dtype on leaf tensors (#164751 ) ghstack-source-id: e44b3941530be83a630ec93f1478eec741ffca2e Pull-Request-resolved: https://github.com/pytorch/pytorch/pull/162815 Fixes #ISSUE_NUMBER Relanding due to internal weirdness. Separate PR to codev w/o ghstack. Pull Request resolved: https://github.com/pytorch/pytorch/pull/164751 Approved by: https://github.com/albanD	2025-10-08 20:23:13 +00:00
PyTorch MergeBot	331191ce4b	Revert "[BE] Make PyObjectSlot use a global PyInterpreter (#162659 )" This reverts commit `29cbcbac42`. Reverted https://github.com/pytorch/pytorch/pull/162659 on behalf of https://github.com/izaitsevfb due to reverted internally, see [D83214133](https://www.internalfb.com/diff/D83214133) ([comment](https://github.com/pytorch/pytorch/pull/162659#issuecomment-3369348172))	2025-10-05 21:39:57 +00:00
PyTorch MergeBot	3ddf2018d0	Revert "Support setting grad_dtype on leaf tensors (#162815 )" This reverts commit `dca73982c5`. Reverted https://github.com/pytorch/pytorch/pull/162815 on behalf of https://github.com/yangw-dev due to break internal test D83850533, see more details below ([comment](https://github.com/pytorch/pytorch/pull/162815#issuecomment-3367498501))	2025-10-03 23:14:28 +00:00
soulitzer	dca73982c5	Support setting grad_dtype on leaf tensors (#162815 ) `grad_dtype` is a new attribute on Tensor to control gradient dtype:
- Access/setting is leaf-only.
- grad_dtype is respected (1) when assigning to .grad, and (2) in the engine after the previous node produces incoming gradients for AccumulateGrad. (See table below for details)
- Not setting grad_dtype preserves the current behavior. Accessing it returns `t.dtype`
- `grad_dtype` cannot be set when there is already a `.grad` present and the dtypes conflict.

| `grad_dtype` setting | Setting `.grad` manually | Incoming gradient from autograd engine |
|-----------------------|--------------------------|-----------------------------------------|
| Default (tensor’s dtype) | `.grad` must match tensor’s dtype | Engine casts incoming grad to tensor’s dtype |
| Set to specific dtype | `.grad` must match that dtype | Engine casts incoming grad to the specified dtype |
| Set to `None` | `.grad` may be any dtype | Engine does not cast; accepts incoming grad dtype as-is |

Pull Request resolved: https://github.com/pytorch/pytorch/pull/162815 Approved by: https://github.com/albanD	2025-10-02 23:09:07 +00:00
PyTorch MergeBot	cc5d74c366	Revert "[BE] Remove HermeticPyObjectTLS and Simplify PythonOpRegistrationTrampoline (#163464 )" This reverts commit `94195a37ae`. Reverted https://github.com/pytorch/pytorch/pull/163464 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/163464#issuecomment-3353307034))	2025-09-30 18:20:20 +00:00
Yuanyuan Chen	46ec0664e3	Remove unused PyIntXXX, THPUtils_newReal_BOOL, THPQXXX macros (#164056 ) The removed macros are not used in other places of the `pytorch` GitHub org. Pull Request resolved: https://github.com/pytorch/pytorch/pull/164056 Approved by: https://github.com/albanD	2025-09-30 13:48:25 +00:00
PaliC	94195a37ae	[BE] Remove HermeticPyObjectTLS and Simplify PythonOpRegistrationTrampoline (#163464 ) Removes HermeticPyObjectTLS, which is no longer needed since torch deploy is no longer supported. PythonOpRegistrationTrampoline is also drastically simplified and is being prepped for removal in a future PR. Pull Request resolved: https://github.com/pytorch/pytorch/pull/163464 Approved by: https://github.com/albanD, https://github.com/Skylion007	2025-09-25 23:30:50 +00:00
PaliC	29cbcbac42	[BE] Make PyObjectSlot use a global PyInterpreter (#162659 ) This pr gets rid of the pyobj_interpreter_ variable from PyObjectSlot and saves a word in the process Gonna ask for review from @huydhn as there are some changes to CI. Testing: imported internally and the failed android build seems to work now! Pull Request resolved: https://github.com/pytorch/pytorch/pull/162659 Approved by: https://github.com/albanD, https://github.com/huydhn	2025-09-25 08:53:19 +00:00
PyTorch MergeBot	edafc902d7	Revert "[BE] Make PyObjectSlot use a global PyInterpreter (#162659 )" This reverts commit `d1993c27ae`. Reverted https://github.com/pytorch/pytorch/pull/162659 on behalf of https://github.com/wdvr due to reverted internally, please see D82771705 @PaliC ([comment](https://github.com/pytorch/pytorch/pull/162659#issuecomment-3317110247))	2025-09-22 06:22:37 +00:00
Scott Wolchok	5599f487ef	Fully native DTensor.__new__ (#162508 ) Move the entirety of `__new__` into C++, saving a layer of disable_dynamo and making progress toward all-C++. Pull Request resolved: https://github.com/pytorch/pytorch/pull/162508 Approved by: https://github.com/ezyang ghstack dependencies: #161695	2025-09-21 18:36:05 +00:00
Scott Wolchok	76a841fd47	Port OpSchema.__post_init__ and OpSchema._recompute_comparison_key to C++ (#161695 ) I initially didn't see good results porting this, but it was apparently because of pybind11 function calling overhead. (pybind11's object-handling primitives seem fine enough.) I'm interested in setting up nanobind, but this demonstrates it's not blocking. Differential Revision: [D81530102](https://our.internmc.facebook.com/intern/diff/D81530102) Pull Request resolved: https://github.com/pytorch/pytorch/pull/161695 Approved by: https://github.com/ezyang	2025-09-19 04:07:30 +00:00
Sahan Paliskara	d1993c27ae	[BE] Make PyObjectSlot use a global PyInterpreter (#162659 ) This pr gets rid of the pyobj_interpreter_ variable from PyObjectSlot and saves a word in the process Gonna ask for review from @huydhn as there are some changes to CI. Pull Request resolved: https://github.com/pytorch/pytorch/pull/162659 Approved by: https://github.com/albanD, https://github.com/huydhn	2025-09-17 16:40:55 +00:00
Scott Wolchok	a63221a335	Fix TODO in make_tensor_for_subclass_helper (#162336 ) The constructor does accept a DataPtr (had to fix the DataPtr variant not accepting a SymInt, though). Pull Request resolved: https://github.com/pytorch/pytorch/pull/162336 Approved by: https://github.com/ezyang ghstack dependencies: #162298	2025-09-17 06:46:34 +00:00
PyTorch MergeBot	4db203f875	Revert "[BE] Make PyObjectSlot use a global PyInterpreter (#162659 )" This reverts commit `05ee8114f8`. Reverted https://github.com/pytorch/pytorch/pull/162659 on behalf of https://github.com/jeanschmidt due to seems to have introduced errors in linting see https://github.com/pytorch/pytorch/actions/runs/17750689989/job/50444910643 ([comment](https://github.com/pytorch/pytorch/pull/162659#issuecomment-3298626136))	2025-09-16 12:52:57 +00:00
PaliC	05ee8114f8	[BE] Make PyObjectSlot use a global PyInterpreter (#162659 ) This pr gets rid of the pyobj_interpreter_ variable from PyObjectSlot and saves a word in the process Gonna ask for review from @huydhn as there are some changes to CI. Pull Request resolved: https://github.com/pytorch/pytorch/pull/162659 Approved by: https://github.com/albanD, https://github.com/huydhn	2025-09-16 00:37:09 +00:00
Scott Wolchok	1274297e06	Remove __torch_dispatch__ check in THPVariable_make_dtensor (#162337 ) We control DTensor, so we can just guarantee there isn't a programming error with __torch_dispatch__. (The guard is already less-than-perfect; see the note that the deleted comment refers to.) Pull Request resolved: https://github.com/pytorch/pytorch/pull/162337 Approved by: https://github.com/Skylion007 ghstack dependencies: #161591, #161595, #161633, #161634, #161692, #162219, #162220, #162218, #161596	2025-09-11 06:58:35 +00:00
Scott Wolchok	eab2afeff7	fastpath type Tensor in THPVariable_NewWithVar (#161634 ) It is cheap to do an exact check against Tensor and much faster when it works (PyType_IsSubtype does not have this fastpath, I checked [source](`9ee0214b5d/Objects/typeobject.c (L2889)`)). Spot-checked in perf on detach-DTensor-in-a-loop benchmark; small win but clear. Differential Revision: [D81530101](https://our.internmc.facebook.com/intern/diff/D81530101) Pull Request resolved: https://github.com/pytorch/pytorch/pull/161634 Approved by: https://github.com/Skylion007, https://github.com/albanD ghstack dependencies: #161591, #161595, #161633	2025-09-09 01:10:06 +00:00
Scott Wolchok	8e076d889c	Don't call check_has_torch_dispatch in THPVariable_NewWithVar if we already know (#161591 ) We already know when we're called from make_wrapper_subclass or make_dtensor. The check isn't particularly cheap. Differential Revision: [D81530099](https://our.internmc.facebook.com/intern/diff/D81530099) Pull Request resolved: https://github.com/pytorch/pytorch/pull/161591 Approved by: https://github.com/ezyang ghstack dependencies: #161466, #161586, #161590	2025-09-08 16:28:08 +00:00
Scott Wolchok	88d94d17e8	Add torch.Tensor._make_dtensor to accelerate DTensor.__new__ further (#161590 ) This seems to be a (very very roughly) ~8% improvement on DTensor benchmark very similar to the benchmark from #160580 (120ish usec -> 110ish usec) Differential Revision: [D81530105](https://our.internmc.facebook.com/intern/diff/D81530105) Pull Request resolved: https://github.com/pytorch/pytorch/pull/161590 Approved by: https://github.com/albanD ghstack dependencies: #161466, #161586	2025-09-05 18:43:41 +00:00
Scott Wolchok	0ee8a4e281	Fix accidental copy in pushPyOutToStack (#161329 ) `auto` forces a copy. Confirmed this made a noticeable difference in perf. Pull Request resolved: https://github.com/pytorch/pytorch/pull/161329 Approved by: https://github.com/zpcore, https://github.com/fduwjj, https://github.com/Skylion007, https://github.com/bdhirsh ghstack dependencies: #161301, #161292, #161304, #161308, #161315, #161317, #161328	2025-08-30 06:55:43 +00:00
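The commit above says that `auto` forces a copy. The listing does not include the actual `pushPyOutToStack` change, so the following is a generic sketch of that C++ behavior rather than the code from the PR. `Holder` and both helper functions are hypothetical names. Plain `auto` deduces a value type even when the initializer is a reference, whereas `const auto&` binds to the existing object.

```cpp
#include <cassert>
#include <vector>

// Hypothetical type for illustration: exposes its data by const reference.
struct Holder {
  std::vector<int> values{1, 2, 3};
  const std::vector<int>& get() const { return values; }
};

// True if binding with plain `auto` produced a separate vector.
inline bool plain_auto_copies(const Holder& h) {
  auto copied = h.get();  // reference is dropped: the vector is copied
  return copied.data() != h.values.data();
}

// True if binding with `const auto&` refers to the original vector.
inline bool auto_ref_aliases(const Holder& h) {
  const auto& aliased = h.get();  // no copy: same underlying buffer
  return aliased.data() == h.values.data();
}

// Small self-check exercising both helpers.
inline void demo() {
  Holder h;
  assert(plain_auto_copies(h));
  assert(auto_ref_aliases(h));
}
```

On a hot path, the copy in the first helper costs an allocation plus an element-wise copy on every call, which is the kind of overhead the commit message refers to.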
PaliC	1b99c1859c	[BE] Make PyObjectSlot use a global PyInterpreter and remove (#158427 ) This PR is a bit more involved but effectively works to drastically simplify PyObjectSlot and PyInterpreter.
1) For PyObjectSlot we now use a global pyinterpreter since there only is one. From here we change all of the call sites to rely on this assumption.
2) We also remove the "tags" of the PyInterpreter by deprecating `PyInterpreterStatus`.

For the reviewer, sadly it seems like `functorch/csrc/dim/dim.cpp` needed to get linted, so there is an unreadable amount of changes there. Fortunately, the only actual change in the file is as follows which just removes `getPyInterpreter()` from the `check_pyobj` call.
```
mpy::handle handle_from_tensor(Arena& A, TensorRef t) {
- // fast case: tensor is live in python
- std::optional<PyObject> mb_obj =
- t->unsafeGetTensorImpl()->pyobj_slot()->check_pyobj(getPyInterpreter(), /*ignore_hermetic_tls=*/false);
- if (mb_obj.has_value() && !t->unsafeGetTensorImpl()->pyobj_slot()->owns_pyobj()) {
- return mb_obj;
- }
- return A.autorelease(mpy::object::checked_steal(THPVariable_Wrap(t)));
-}
-}
+ // fast case: tensor is live in python
+ std::optional<PyObject> mb_obj =
+ t->unsafeGetTensorImpl()->pyobj_slot()->check_pyobj(
+ /*ignore_hermetic_tls=*/false);
+ if (mb_obj.has_value() &&
+ !t->unsafeGetTensorImpl()->pyobj_slot()->owns_pyobj()) {
+ return mb_obj;
+ }
+ return A.autorelease(mpy::object::checked_steal(THPVariable_Wrap(t)));
+}
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/158427 Approved by: https://github.com/albanD	2025-07-30 17:29:43 +00:00
PyTorch MergeBot	15a50dcf1c	Revert "[BE] Make PyObjectSlot use a global PyInterpreter and remove (#158427 )" This reverts commit `eb73650723`. Reverted https://github.com/pytorch/pytorch/pull/158427 on behalf of https://github.com/ZainRizvi due to Reverting this as part of reverting the stack for https://github.com/pytorch/pytorch/pull/158288 ([comment](https://github.com/pytorch/pytorch/pull/158427#issuecomment-3099815367))	2025-07-21 23:14:57 +00:00
PaliC	eb73650723	[BE] Make PyObjectSlot use a global PyInterpreter and remove (#158427 ) This PR is a bit more involved but effectively works to drastically simplify PyObjectSlot and PyInterpreter.
1) For PyObjectSlot we now use a global pyinterpreter since there only is one. From here we change all of the call sites to rely on this assumption.
2) We also remove the "tags" of the PyInterpreter by deprecating `PyInterpreterStatus`.

For the reviewer, sadly it seems like `functorch/csrc/dim/dim.cpp` needed to get linted, so there is an unreadable amount of changes there. Fortunately, the only actual change in the file is as follows which just removes `getPyInterpreter()` from the `check_pyobj` call.
```
mpy::handle handle_from_tensor(Arena& A, TensorRef t) {
- // fast case: tensor is live in python
- std::optional<PyObject> mb_obj =
- t->unsafeGetTensorImpl()->pyobj_slot()->check_pyobj(getPyInterpreter(), /*ignore_hermetic_tls=*/false);
- if (mb_obj.has_value() && !t->unsafeGetTensorImpl()->pyobj_slot()->owns_pyobj()) {
- return mb_obj;
- }
- return A.autorelease(mpy::object::checked_steal(THPVariable_Wrap(t)));
-}
-}
+ // fast case: tensor is live in python
+ std::optional<PyObject> mb_obj =
+ t->unsafeGetTensorImpl()->pyobj_slot()->check_pyobj(
+ /*ignore_hermetic_tls=*/false);
+ if (mb_obj.has_value() &&
+ !t->unsafeGetTensorImpl()->pyobj_slot()->owns_pyobj()) {
+ return mb_obj;
+ }
+ return A.autorelease(mpy::object::checked_steal(THPVariable_Wrap(t)));
+}
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/158427 Approved by: https://github.com/albanD	2025-07-18 05:23:00 +00:00
Yuanyuan Chen	07bb097698	Fix clang-tidy bugprone* warnings (#148529 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/148529 Approved by: https://github.com/ezyang	2025-06-23 23:09:56 +00:00
Sean McGovern	297805fd8f	Typo fixes for "overridden" in comments and function names (#155944 ) This word appears often in class descriptions and is not consistently spelled. Update comments and some function names to use the correct spelling consistently. Facilitates searching the codebase. Pull Request resolved: https://github.com/pytorch/pytorch/pull/155944 Approved by: https://github.com/Skylion007	2025-06-14 03:37:38 +00:00
cyy	8fa81a6066	Enable misc-use-internal-linkage check and apply fixes (#148948 ) Enables clang-tidy rule [`misc-use-internal-linkage`](https://clang.llvm.org/extra/clang-tidy/checks/misc/use-internal-linkage.html). This new check was introduced in Clang-Tidy 18 and is available due to recent update of Clang-Tidy 19. The check marks functions and variables used only in the translation unit as static. Therefore undesired symbols are not leaked into other units, more link time optimisations are possible and the resulting binaries may be smaller. The detected violations were mostly fixed by using static. In other cases, the symbols were indeed consumed by others files, then their declaring headers were included. Still some declarations were wrong and have been fixed. Pull Request resolved: https://github.com/pytorch/pytorch/pull/148948 Approved by: https://github.com/Skylion007	2025-03-12 14:22:56 +00:00
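The commit above describes marking symbols that are used only within one translation unit as `static`. The sketch below shows that pattern in isolation. All function names are hypothetical, and the unnamed-namespace form is shown only as the standard C++ alternative that also gives internal linkage; the commit message itself mentions `static`.

```cpp
#include <cassert>

// Used only in this file: `static` gives the function internal linkage,
// so its symbol is not visible to other translation units.
static int add_one(int x) {
  return x + 1;
}

// An unnamed namespace has the same effect for everything declared inside.
namespace {
int square(int x) {
  return x * x;
}
}  // namespace

// Meant to be called from other files, so it keeps external linkage
// and would normally be declared in a header.
int public_api(int x) {
  return add_one(square(x));
}

// Small self-check: 3 squared is 9, plus one is 10.
inline void demo_linkage() {
  assert(public_api(3) == 10);
}
```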
cyy	8df99b6a6e	Remove unneeded std::make_optional (#143575 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/143575 Approved by: https://github.com/Skylion007	2024-12-31 03:08:47 +00:00
albanD	70be7900bb	Fix Tensor clear to properly clear slots (#143203 ) Fixes a bug introduced in https://github.com/pytorch/pytorch/pull/137267 While the test ensures the finalizer did run to make sure things are cleared, the objects are not properly collected by the gc due to the faulty tp_clear implementation. So, while the finalizer did run, the object was still alive. Fixing this by giving tp_clear the same treatment as tp_traverse and tp_dealloc on Tensor: make it a unique function that handles the full subclass hierarchy in one place. Pull Request resolved: https://github.com/pytorch/pytorch/pull/143203 Approved by: https://github.com/ezyang, https://github.com/colesbury ghstack dependencies: #143202	2024-12-14 00:17:07 +00:00
albanD	8741d72e3c	move function before modifying it (#143202 ) This is a no-op. Just to make the diff in the next PR easier to read Pull Request resolved: https://github.com/pytorch/pytorch/pull/143202 Approved by: https://github.com/ezyang, https://github.com/janeyx99	2024-12-14 00:17:07 +00:00
Richard Barnes	882b6af219	c10::string_view -> std::string_view in autograd (#142354 ) Differential Revision: D66939966 Pull Request resolved: https://github.com/pytorch/pytorch/pull/142354 Approved by: https://github.com/Skylion007	2024-12-10 15:43:41 +00:00
cyy	45ed7c13fa	Remove unneeded std::make_optional (#141567 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/141567 Approved by: https://github.com/albanD	2024-11-28 00:05:21 +00:00
cyy	4a2da52137	[1/N] Replace c10::sv with std::sv (#139453 ) Picks some safe replacements. Pull Request resolved: https://github.com/pytorch/pytorch/pull/139453 Approved by: https://github.com/Skylion007	2024-11-01 05:39:37 +00:00
cyy	1605d4aeb8	Fix object slice (#138880 ) To avoid casting Tensor to Tensorbase Pull Request resolved: https://github.com/pytorch/pytorch/pull/138880 Approved by: https://github.com/Skylion007	2024-10-26 00:13:19 +00:00
albanD	69ba89da11	Fix cuda sanitizer and as_subclass calls (#138218 ) This fixes 4 main issues:
- The way the cuda sanitizer handles its state is weird. In particular, because the lifetime of the Mode is linked to the submodule, it might outlive the python runtime and other loaded modules. On my current version, it even outlives the "sys" module. Given that I'm not sure of the impact of changing this lifetime handling, I'm making the exit handler a no-op when python is already dying and there is thus no point in cleaning up.
- Adds a "disable" method to be able to test after the mode is enabled.
- Fix `Tensor.as_subclass()` to properly disable modes when creating the new Tensor object just like we already do in `make_subclass` and `make_wrapper_subclass`. The change here is just to apply the exact same treatment to it.
- ~Fix `Tensor.as_subclass()` not to propagate autograd as there is no valid backward associated here.~ We have tests that check that this behavior happens, so I guess this is not an obvious bugfix and is expected behavior. Reverted that change.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/138218 Approved by: https://github.com/ngimel	2024-10-17 21:18:32 +00:00
albanD	c0deec120f	Fix resurrection logic to trigger early enough (#137267 ) Fixes https://github.com/pytorch/pytorch/issues/136358 The bug here is that the Tensor object is actually 2 classes: `Tensor` from `_tensor.py` and `TensorBase` from c++. Before this PR, they have the following gc methods:

Tensor:
- tp_clear subtype_clear
- tp_traverse THPVariable_subclass_traverse
- tp_dealloc THPVariable_subclass_dealloc

TensorBase:
- tp_clear THPVariable_clear
- tp_traverse THPFunction_traverse (fake function that is just an error)
- tp_dealloc object_dealloc

The problem is that when clear is called on the Tensor, subtype_clear is going to clear the things owned by the "Tensor" type, in particular, its `__dict__` attribute, before delegating to the TensorBase clear where we detect that resurrection needs to happen and skip it. Pull Request resolved: https://github.com/pytorch/pytorch/pull/137267 Approved by: https://github.com/ezyang, https://github.com/kshitij12345	2024-10-04 21:13:54 +00:00
albanD	adc48a5b52	Python CAPI cleanup (#137266 ) This is unrelated to anything else, but as I was going through the code, fixing bad patterns and a refcount bug (which is unlikely to cause any real issue tbh) Pull Request resolved: https://github.com/pytorch/pytorch/pull/137266 Approved by: https://github.com/Skylion007	2024-10-03 17:55:48 +00:00
Xuehai Pan	8962610247	[BE][clang-format] make macro `PyObject_HEAD_INIT(type)` and `PyVarObject_HEAD_INIT(type, size)` have its own line (#136949 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/136949 Approved by: https://github.com/albanD, https://github.com/eqy ghstack dependencies: #136945	2024-10-02 18:39:22 +00:00
albanD	cf31724db7	Fix and improvements to toward 3.13t (#136319 ) Small part of https://github.com/pytorch/pytorch/pull/130689 Pull Request resolved: https://github.com/pytorch/pytorch/pull/136319 Approved by: https://github.com/malfet, https://github.com/Skylion007	2024-09-20 04:22:18 +00:00
cyy	929d2f8253	[3/N] Fix clang-tidy warnings in torch/csrc/autograd (#133389 ) Follows #133295 Pull Request resolved: https://github.com/pytorch/pytorch/pull/133389 Approved by: https://github.com/Skylion007	2024-08-16 00:57:54 +00:00
cyy	29861779ce	[2/N] Change #include <c10/util/Optional.h> to #include <optional> (#130236 ) Follows #128301. The changes were made by grep and sed Pull Request resolved: https://github.com/pytorch/pytorch/pull/130236 Approved by: https://github.com/ezyang	2024-07-09 03:17:24 +00:00
cyy	f4dcf2ae93	[1/N] Change #include <c10/util/Optional.h> to #include <optional> (#128301 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/128301 Approved by: https://github.com/ezyang, https://github.com/r-barnes	2024-07-08 07:03:53 +00:00
garfield1997	27a14405d3	enable device index check for all device types (#126767 ) enable device index check for all device types for grad setter. Pull Request resolved: https://github.com/pytorch/pytorch/pull/126767 Approved by: https://github.com/albanD	2024-06-27 01:09:53 +00:00
PyTorch MergeBot	846bb30e13	Revert "[1/N] Change #include <c10/util/Optional.h> to #include <optional> (#128301 )" This reverts commit `bd72e28314`. Reverted https://github.com/pytorch/pytorch/pull/128301 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it fails XLA build `bd72e28314`. Please rebase your PR before relanding because I think the failure is hidden by an unrelated broken trunk XLA failure from your current base commit ([comment](https://github.com/pytorch/pytorch/pull/128301#issuecomment-2169035822))	2024-06-15 01:58:20 +00:00
cyy	bd72e28314	[1/N] Change #include <c10/util/Optional.h> to #include <optional> (#128301 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/128301 Approved by: https://github.com/ezyang	2024-06-14 23:21:01 +00:00
Mikayla Gawarecki	cd06ae0cb8	Relax use_count constraints for swap_tensors when AccumulateGrad holds a reference (#127313 )

### Before this PR:
`torch.utils.swap_tensors(a, b)` required the `use_count` of `a` and `b` to be 1
```python
a = torch.randn(2, 3, requires_grad=True)
b = torch.randn(2, 4)
out = a * 2
out.sum().backward()
# Calling swap_tensors here would fail due to the reference held by AccumulateGrad node, which is not cleaned up after backward
# torch.utils.swap_tensors(a, b)
del out
# Calling swap_tensors here would pass
torch.utils.swap_tensors(a, b)
```

### After this PR:
`torch.utils.swap_tensors(a, b)` requires the `use_count` of `a` and `b` to be 1 or 2 IF the second reference is held by `AccumulateGrad`

A pre-hook will be registered on the `AccumulateGrad` node so that it will fail if it is called (i.e. if user attempts to backward through the graph).
```python
a = torch.randn(2, 3, requires_grad=True)
b = torch.randn(2, 4)
out = a * 2
out.sum().backward()
# Calling swap_tensors here is ok
torch.utils.swap_tensors(a, b)
# If we ever backward to the AccumulateGrad node it will error that it was poisoned by swap_tensors
```

### Application to `nn.Module`
This issue is especially pertinent in context of `nn.Module` where parameters will have `AccumulateGrad` nodes initialized after forward. Specifically, this is intended to address https://github.com/pytorch/pytorch/pull/126814#issuecomment-2127777866. Previously, this would fail at the `m.cpu()` but we want users to be able to do something like the following, and instead raise an error if the user ever attempts to backward through the poisoned `AccumulateGrad` node
```python
import torch
import torch.nn as nn
m = nn.Linear(3, 5)
inp = torch.randn(2, 3)
out = m(inp)
out.sum().backward()
m.cpu()
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/127313 Approved by: https://github.com/soulitzer	2024-05-30 07:06:55 +00:00
Richard Barnes	ed327876f5	[codemod] `c10:optional` -> `std::optional` (#126135 ) Generated by running the following from PyTorch root: ``` find . -regex ".*\.$cpp\\|h\\|cu\\|hpp\\|cc\\|cxx$$" \| grep -v "build/" \| xargs -n 50 -P 4 perl -pi -e 's/c10::optional/std::optional/' ``` `c10::optional` is just an alias for `std::optional`. This removes usages of that alias in preparation for eliminating it entirely. Pull Request resolved: https://github.com/pytorch/pytorch/pull/126135 Approved by: https://github.com/Skylion007, https://github.com/malfet, https://github.com/albanD, https://github.com/aaronenyeshi	2024-05-14 19:35:51 +00:00
albanD	71467abc44	Changes to compile with 3.13 (#126033 ) This is mainly: - Fix refcount access macro - Hide all the Dynamo code that needs update as usual - Add _PyWeakref_ClearRef as an extern provided by CPython. Including the pycore header that defines it would require raw c include shenanigans that I don't think are worth it. This allows to build both with regular and nogil version of cpython. Both Note that this requires the 3.13 branch at least past [d3094744d40de2deefbda9b1996d5029c9ebf0b0](`d3094744d4`) which we need for mimalloc include and weakref function being exposed. debug-only issues in pybind11 with PyMem_MALLOC vs PyObject_MALLOC being should be synced either by updating pybind or cpython. @colesbury I can send a PR to ifdef the proper use in pybind if you think that this is the best solution here? Pull Request resolved: https://github.com/pytorch/pytorch/pull/126033 Approved by: https://github.com/colesbury	2024-05-14 02:14:57 +00:00

398 Commits