pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 00:21:07 +01:00

Author	SHA1	Message	Date
linhaifeng	369f2d6951	[3/N] fix typo in other folders (#166606 ) fix typo in other folders #166374 #166126 _typos.toml ```bash [files] extend-exclude = ["tools/linter/dictionary.txt"] [default.extend-words] nd = "nd" arange = "arange" Nd = "Nd" GLOBALs = "GLOBALs" hte = "hte" iy = "iy" PN = "PN" Dout = "Dout" optin = "optin" gam = "gam" PTD = "PTD" Sur = "Sur" nin = "nin" tme = "tme" inpt = "inpt" mis = "mis" Raison = "Raison" ouput = "ouput" nto = "nto" Onwer = "Onwer" callibrate = "callibrate" ser = "ser" Metdata = "Metdata" ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/166606 Approved by: https://github.com/ezyang	2025-10-30 10:30:40 +00:00
PyTorch MergeBot	8580112682	Revert "[dynamo][DebugMode] mask python keys in dispatch_key_set guard checks (#164992 )" This reverts commit `306b344a18`. Reverted https://github.com/pytorch/pytorch/pull/164992 on behalf of https://github.com/jeffdaily due to broke ROCm CI test/inductor/test_inductor_scheduler.py::TestSchedulerCUDA::test_flop_counter_op_options0_cuda_float32 [GH job link](https://github.com/pytorch/pytorch/actions/runs/18417066364/job/52485636942) [HUD commit link](`306b344a18`) ([comment](https://github.com/pytorch/pytorch/pull/164992#issuecomment-3397927142))	2025-10-13 15:14:34 +00:00
William Wen	5dbca58bd0	[dynamo] fix potential 3.12+ THP_PyOpcode_Caches init error seen internally (#165200 ) Another attempt at merging https://github.com/pytorch/pytorch/pull/164597 due to CLA signing failure. Differential Revision: [D84397377](https://our.internmc.facebook.com/intern/diff/D84397377) Pull Request resolved: https://github.com/pytorch/pytorch/pull/165200 Approved by: https://github.com/anijain2305, https://github.com/mlazos	2025-10-12 05:29:04 +00:00
Pian Pawakapan	306b344a18	[dynamo][DebugMode] mask python keys in dispatch_key_set guard checks (#164992 ) I found that running any compiled function under DebugMode more than once will trigger recompilations, e.g. with the really simple modified test case in `test_compile`: ``` [0/1] [__recompiles] Recompiling function f in /data/users/pianpwk/ptclone/pytorch/test/distributed/tensor/debug/test_debug_mode.py:268 [0/1] [__recompiles] triggered by the following guard failure(s): [0/1] [__recompiles] - 0/0: [0/2] [__recompiles] Recompiling function f in /data/users/pianpwk/ptclone/pytorch/test/distributed/tensor/debug/test_debug_mode.py:268 [0/2] [__recompiles] triggered by the following guard failure(s): [0/2] [__recompiles] - 0/1: [0/2] [__recompiles] - 0/0: ``` Digging deeper, the guard failures were due to TENSOR_MATCH guards failing on dispatch key set checks (seemingly on the Python dispatch key): `5a1fbf45ad/torch/csrc/dynamo/guards.cpp (L199-L203)` This seems to due to the `ignore_compile_internals=True` flag on custom dispatch modes being on, which causes these modes to "hide" themselves during compilation, making dynamo guard on the Python dispatch key being off. The (maybe imperfect) solution is to mask out the Python keys for guard comparisons. This might be fine because custom dispatch modes won't appear here during compilation - `ignore_compile_internals=True` hides them, and `ignore_compile_internals=False` disables compile entirely? Pull Request resolved: https://github.com/pytorch/pytorch/pull/164992 Approved by: https://github.com/williamwen42	2025-10-10 20:00:28 +00:00
soulitzer	71aefd5595	[reland] Allow setting grad_dtype on leaf tensors (#164751 ) ghstack-source-id: e44b3941530be83a630ec93f1478eec741ffca2e Pull-Request-resolved: https://github.com/pytorch/pytorch/pull/162815 Fixes #ISSUE_NUMBER Relanding due to internal weirdness. Separate PR to codev w/o ghstack. Pull Request resolved: https://github.com/pytorch/pytorch/pull/164751 Approved by: https://github.com/albanD	2025-10-08 20:23:13 +00:00
Natalia Gimelshein	37c6087334	Add split-K control to cuBLAS reduced-precision settings (#164766 ) ## Summary - add a CuBLASReductionOption enum so the CUDA context can track reduced-precision and split-K options - extend the Python bindings, backend helpers, and docs to accept an optional allow_splitk argument for fp16/bf16 matmul controls - update cuBLAS/cuBLASLt call sites plus dynamo guards and tests to respect the new combinations ## Testing - python test/test_cuda.py TestCuda.test_cublas_allow_fp16_reduced_precision_reduction_get_set -v (fails: ModuleNotFoundError: No module named 'psutil') ------ https://chatgpt.com/codex/tasks/task_e_68e404623178832f8a3e1d34e1e175da Pull Request resolved: https://github.com/pytorch/pytorch/pull/164766 Approved by: https://github.com/malfet, https://github.com/albanD	2025-10-08 18:48:45 +00:00
William Wen	409aece3f9	[dynamo, 3.14] prevent StackRef compilation in 3.14 Windows (#164400 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/164400 Approved by: https://github.com/Camyll, https://github.com/atalman	2025-10-04 18:38:08 +00:00
Yuanyuan Chen	5103ecc5d8	[1/N] Fix clang-tidy readability checks (#164561 ) Check all `.cpp` files except `jit` files for readability thoroughly. Pull Request resolved: https://github.com/pytorch/pytorch/pull/164561 Approved by: https://github.com/Skylion007	2025-10-04 09:40:38 +00:00
PyTorch MergeBot	3ddf2018d0	Revert "Support setting grad_dtype on leaf tensors (#162815 )" This reverts commit `dca73982c5`. Reverted https://github.com/pytorch/pytorch/pull/162815 on behalf of https://github.com/yangw-dev due to break internal test D83850533, see more details below ([comment](https://github.com/pytorch/pytorch/pull/162815#issuecomment-3367498501))	2025-10-03 23:14:28 +00:00
Lakshay Garg	f006aee601	Speed up FP precision lookup (#164044 ) This commit simplifies the precision lookup and setting logic by reducing the number of branches and using a custom hash function. Fixes #161822. The issue described in #163709 still persists. This is meant as a short term fix. Pull Request resolved: https://github.com/pytorch/pytorch/pull/164044 Approved by: https://github.com/ngimel, https://github.com/eqy	2025-10-03 21:35:20 +00:00
soulitzer	dca73982c5	Support setting grad_dtype on leaf tensors (#162815 ) `grad_dtype` is a new attribute on Tensor to control gradient dtype: - Access/setting is leaf-only. - grad_dtype is respected when (1) when assigning to .grad, and (2) in the engine after the previous node produces incoming gradients for AccumulateGrad. (See table below for details) - Not setting grad_dtype preserves the current behavior. Accessing it returns `t.dtype` - `grad_dtype` cannot be set when there is already a `.grad` present and the dtypes conflict. \| `grad_dtype` setting \| Setting `.grad` manually \| Incoming gradient from autograd engine \| \|-----------------------\|--------------------------\|-----------------------------------------\| \| Default (tensor’s dtype) \| `.grad` must match tensor’s dtype \| Engine casts incoming grad to tensor’s dtype \| \| Set to specific dtype \| `.grad` must match that dtype \| Engine casts incoming grad to the specified dtype \| \| Set to `None` \| `.grad` may be any dtype \| Engine does not cast; accepts incoming grad dtype as-is \| Pull Request resolved: https://github.com/pytorch/pytorch/pull/162815 Approved by: https://github.com/albanD	2025-10-02 23:09:07 +00:00
PyTorch MergeBot	2a7c486750	Revert "Speed up FP precision lookup (#164044 )" This reverts commit `723ba21393`. Reverted https://github.com/pytorch/pytorch/pull/164044 on behalf of https://github.com/yangw-dev due to broke internal build In file included from xplat/caffe2/aten/src/ATen/DeviceAccelerator.cpp:1: xplat/caffe2/aten/src/ATen/Context.h:502:38: error: shift count >= width of type [-Werror,-Wshift-count-overflow] 502 \| return std::hash<size_t>{}((k1 << 32) \| k2); ([comment](https://github.com/pytorch/pytorch/pull/164044#issuecomment-3363016702))	2025-10-02 21:00:44 +00:00
Yu, Guangye	5f775bdfb7	Fix THP_PyObject_VirtualFree return type (#163763 ) # Motivation `void THP_PyObject_VirtualFree` should have no return value; otherwise, it would raise a build warning ```bash C:\Users\guangyey\pytorch\torch\csrc\dynamo\cpython_defs.c(264): warning C4098: 'THP_PyObject_VirtualFree': 'void' function returning a value ``` # Additional Context Refer to `c4f21d7c7c/Include/cpython/objimpl.h (L59-L68)` PyObjectArenaAllocator::free is defined with `void` return type. Pull Request resolved: https://github.com/pytorch/pytorch/pull/163763 Approved by: https://github.com/albanD, https://github.com/williamwen42	2025-10-02 20:21:53 +00:00
PyTorch MergeBot	c6329524d8	Revert "Add magic TORCH_MAKE_PYBIND_ENUM_FASTER macro (#163527 )" This reverts commit `50c0550f5a`. Reverted https://github.com/pytorch/pytorch/pull/163527 on behalf of https://github.com/swolchok due to breaking import torch in debug builds, see #164297 ([comment](https://github.com/pytorch/pytorch/pull/163527#issuecomment-3361919142))	2025-10-02 15:42:42 +00:00
Lakshay Garg	723ba21393	Speed up FP precision lookup (#164044 ) This commit simplifies the precision lookup and setting logic by reducing the number of branches and using a custom hash function. Fixes #161822. The issue described in #163709 still persists. This is meant as a short term fix. Pull Request resolved: https://github.com/pytorch/pytorch/pull/164044 Approved by: https://github.com/ngimel, https://github.com/eqy	2025-10-02 00:59:19 +00:00
William Wen	ac1bc51608	[dynamo] do not pop from framelocals dict in Python 3.10 (#164316 ) Followup to https://github.com/pytorch/pytorch/pull/164038 Pull Request resolved: https://github.com/pytorch/pytorch/pull/164316 Approved by: https://github.com/anijain2305	2025-10-01 10:20:46 +00:00
William Wen	d4b785a6a7	[dynamo, 3.14] fix stack ref copy error (#163796 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/163796 Approved by: https://github.com/anijain2305 ghstack dependencies: #161838, #161555, #161839, #163009, #163109, #163110, #163191, #163292	2025-09-30 17:42:33 +00:00
William Wen	763ab2a6ed	[dynamo, 3.14] compile actual code in C dynamo (#161555 ) No 3.14 CI tests enabled yet, but this was enough to get Dynamo compiling locally and Python Dynamo is at least being called. Pull Request resolved: https://github.com/pytorch/pytorch/pull/161555 Approved by: https://github.com/anijain2305 ghstack dependencies: #161838	2025-09-30 17:41:42 +00:00
William Wen	4b8fe795f8	[dynamo] format cpython_defs.c (#161838 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/161838 Approved by: https://github.com/Skylion007, https://github.com/anijain2305	2025-09-30 17:41:35 +00:00
Yuanyuan Chen	3766513d25	Remove C++ workarounds for Python < 3.10 (#164055 ) Remove two unnecessary `PY_VERSION_HEX` branches. Pull Request resolved: https://github.com/pytorch/pytorch/pull/164055 Approved by: https://github.com/ezyang	2025-09-28 20:00:02 +00:00
Animesh Jain	991e3d0d16	[dynamo][guards] Revert introduction of different types of lambda_guards (#163385 ) With https://fb.workplace.com/groups/260102303573409/permalink/787294574187510/ issue, it might be a better idea to just speedup _realize_dict and keep the changes very local. So reverting this PR as well, to return to clean slate. Pull Request resolved: https://github.com/pytorch/pytorch/pull/163385 Approved by: https://github.com/jansel	2025-09-27 18:20:48 +00:00
Scott Wolchok	50c0550f5a	Add magic TORCH_MAKE_PYBIND_ENUM_FASTER macro (#163527 ) See comment on the macro definition. In short, pybind11 3.x added `py::native_enum`, and also had to add overhead for that new way to bind enums on the critical path for calling functions that take regular old `py::enum_`s as arguments (for example, `__eq__`). Differential Revision: [D82873169](https://our.internmc.facebook.com/intern/diff/D82873169/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/163527 Approved by: https://github.com/ezyang	2025-09-26 17:59:22 +00:00
Karhou Tam	39df24fe04	[Code Clean] Replace `std::runtime_error` with `TORCH_CHECK` (#163610 ) Including: - `torch/csrc/instruction_counter` - `torch/csrc/lazy` - `torch/csrc/monitor` - `torch/csrc/profiler` - `torch/csrc/dynamo` Fixes part of #148114 Personal mistake about (PR #163317), this PR does the same thing and PR #163317 has already been approved by @albanD. This is a personal mistake on my part, and I'm so sorry about that. Hope you won't mind @albanD. 🥹 Pull Request resolved: https://github.com/pytorch/pytorch/pull/163610 Approved by: https://github.com/albanD, https://github.com/Skylion007	2025-09-26 04:52:48 +00:00
PyTorch MergeBot	32ad29b72a	Revert "[dynamo][guards] Fail on an unknown framelocals to dict conversion (#162695 )" This reverts commit `a8432bcaad`. Reverted https://github.com/pytorch/pytorch/pull/162695 on behalf of https://github.com/anijain2305 due to internal failure at https://fburl.com/workplace/qiitdlp6 ([comment](https://github.com/pytorch/pytorch/pull/162695#issuecomment-3310757225))	2025-09-19 06:18:27 +00:00
PyTorch MergeBot	1302637a23	Revert "[dynamo][guards] Do not construct entire framelocals dict for LAMBDA_GUARD (#162525 )" This reverts commit `5f630d28d7`. Reverted https://github.com/pytorch/pytorch/pull/162525 on behalf of https://github.com/anijain2305 due to internal tests fail ([comment](https://github.com/pytorch/pytorch/pull/162525#issuecomment-3310748980))	2025-09-19 06:15:28 +00:00
joshuamarkovic	559e8d1c20	[doc]: Small typos (#162982 ) Small typo fixes Pull Request resolved: https://github.com/pytorch/pytorch/pull/162982 Approved by: https://github.com/ezyang, https://github.com/zou3519	2025-09-16 17:42:19 +00:00
Isuru Fernando	79d2418b5a	[inductor] Add FLOAT_IS_NAN and COMPLEX_IS_NAN guards (#162537 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/162537 Approved by: https://github.com/anijain2305, https://github.com/mlazos ghstack dependencies: #162528	2025-09-12 04:32:46 +00:00
Isuru Fernando	5dd84559a5	[dynamo] Add DUAL_LEVEL_MATCH C++ guard (#162528 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/162528 Approved by: https://github.com/anijain2305	2025-09-12 04:32:46 +00:00
Animesh Jain	a8432bcaad	[dynamo][guards] Fail on an unknown framelocals to dict conversion (#162695 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/162695 Approved by: https://github.com/williamwen42 ghstack dependencies: #162694	2025-09-11 15:01:00 +00:00
Animesh Jain	a3a40cb741	[dynamo][guards] Do not consturct framelocals to dict on GlobalsGuardAccessor (#162694 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/162694 Approved by: https://github.com/williamwen42	2025-09-11 15:01:00 +00:00
Animesh Jain	5f630d28d7	[dynamo][guards] Do not construct entire framelocals dict for LAMBDA_GUARD (#162525 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/162525 Approved by: https://github.com/williamwen42 ghstack dependencies: #162509	2025-09-10 18:52:15 +00:00
Animesh Jain	a67e798cb7	[dynamo][guards] Prevent framelocals to dict conversion for not required LAMBDA_GUARD (#162509 ) This is a smaller PR to reduce framelocals to dict conversion. Pull Request resolved: https://github.com/pytorch/pytorch/pull/162509 Approved by: https://github.com/williamwen42	2025-09-10 18:52:15 +00:00
Scott Wolchok	e025c0f459	Dynamo: set_eval_frame microoptimization (#162220 ) Optimize for common case and remove a pair of refcount operations (see new comments.) Pull Request resolved: https://github.com/pytorch/pytorch/pull/162220 Approved by: https://github.com/jansel, https://github.com/williamwen42 ghstack dependencies: #161591, #161595, #161633, #161634, #161692, #162219	2025-09-09 01:10:06 +00:00
Alex Malyshev	c5b8a10be5	Fix compiler errors in 3.14 stub definitions (#161792 ) The functions here expect to return pointers, but currently aren't returning anything. Make them return NULL. The properties array wants an extra set of braces. One pair for the array, another for the first item in the array. Pull Request resolved: https://github.com/pytorch/pytorch/pull/161792 Approved by: https://github.com/Skylion007	2025-09-03 00:58:41 +00:00
atalman	62db8ec391	windows python 3.14 nightly builds (#159869 ) Related to https://github.com/pytorch/pytorch/issues/156856 Pull Request resolved: https://github.com/pytorch/pytorch/pull/159869 Approved by: https://github.com/malfet, https://github.com/williamwen42	2025-08-19 18:36:16 +00:00
Animesh Jain	4d5b3f2d5a	[dynamo][guards] Install dict watchers for recrusive dict tag optimization (#159796 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/159796 Approved by: https://github.com/jansel	2025-08-12 09:49:11 +00:00
Markus Hoehnerbach	e167c7d0f3	[inductor] allocate non-blocking copy destinations in pinned memory (#155121 ) (#158758 ) Fixes #155121 Pull Request resolved: https://github.com/pytorch/pytorch/pull/158758 Approved by: https://github.com/EikanWang, https://github.com/eellison	2025-08-07 17:07:26 +00:00
PyTorch MergeBot	83ba3f1101	Revert "[inductor] allocate non-blocking copy destinations in pinned memory (#155121 ) (#158758 )" This reverts commit `6085bf7565`. Reverted https://github.com/pytorch/pytorch/pull/158758 on behalf of https://github.com/davidberard98 due to I need to revert #158462 (it causes device-side asserts), and this PR causes a merge conflict in the test file. Sorry about that! ([comment](https://github.com/pytorch/pytorch/pull/158758#issuecomment-3152490371))	2025-08-04 21:47:11 +00:00
Markus Hoehnerbach	6085bf7565	[inductor] allocate non-blocking copy destinations in pinned memory (#155121 ) (#158758 ) Fixes #155121 Pull Request resolved: https://github.com/pytorch/pytorch/pull/158758 Approved by: https://github.com/EikanWang, https://github.com/eellison	2025-08-04 21:22:11 +00:00
Animesh Jain	53e47af0f7	[dynamo][guards] Read the attr name from GetAttrGuardAccessor (#159754 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/159754 Approved by: https://github.com/jansel ghstack dependencies: #159752	2025-08-04 16:51:27 +00:00
Animesh Jain	66ad881fc7	[dynamo][guards][refactor] Simplify type extraction from GuardManager (#159752 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/159752 Approved by: https://github.com/jansel	2025-08-04 16:51:27 +00:00
Animesh Jain	64cbaa876c	[dynamo][guards] Make class members go through obj.__class__.__dict__ (#159534 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/159534 Approved by: https://github.com/jansel	2025-08-04 05:12:44 +00:00
Animesh Jain	4516c59f5f	[dynamo][source] Add special source for __code__ and __closure__ (#159722 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/159722 Approved by: https://github.com/jansel	2025-08-04 05:02:05 +00:00
PyTorch MergeBot	805a102beb	Revert "[dynamo][guards] Make class members go through obj.__class__.__dict__ (#159534 )" This reverts commit `1616777cd2`. Reverted https://github.com/pytorch/pytorch/pull/159534 on behalf of https://github.com/malfet due to Broke some inductor test and lint among other things, see `9c18901bfd/1` ([comment](https://github.com/pytorch/pytorch/pull/159534#issuecomment-3146983186))	2025-08-03 04:58:32 +00:00
Animesh Jain	1616777cd2	[dynamo][guards] Make class members go through obj.__class__.__dict__ (#159534 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/159534 Approved by: https://github.com/jansel ghstack dependencies: #159186	2025-08-02 18:04:35 +00:00
yewentao256	c44efc3755	[Refactor] Fix Compile Warning: `possibly dangling reference to a temporary` (#159517 ) ```bash DEBUG pytorch/torch/csrc/dynamo/compiled_autograd.h:1388:25: warning: possibly dangling reference to a temporary [-Wdangling-reference] DEBUG 1388 \| for (const at::IValue& elt : lst) { DEBUG \| ^~~ DEBUG pytorch/torch/csrc/dynamo/compiled_autograd.h:1388:1: note: the temporary was destroyed at the end of the full expression ‘__for_begin .c10::impl::ListIterator<c10::IValue, __gnu_cxx::__normal_iterator<c10::IValue, std::vector<c10::IValue> > >::operator().c10::impl::ListElementReference<c10::IValue, __gnu_cxx::__normal_iterator<c10::IValue*, std::vector<c10::IValue> > >::operator std::conditional_t<true, const c10::IValue&, c10::IValue>()’ DEBUG 1388 \| for (const at::IValue& elt : lst) { DEBUG \| ^ ``` This PR fixes this warning Pull Request resolved: https://github.com/pytorch/pytorch/pull/159517 Approved by: https://github.com/xmfan	2025-07-31 04:49:43 +00:00
William Wen	3a65ff84b6	[dynamo, easy] add comment on skipping sys.monitoring frames (#159493 ) Add a comment so we know why we're doing this code (followup to https://github.com/pytorch/pytorch/pull/159369) Pull Request resolved: https://github.com/pytorch/pytorch/pull/159493 Approved by: https://github.com/azahed98, https://github.com/Lucaskabela, https://github.com/zou3519, https://github.com/jingsh ghstack dependencies: #159369	2025-07-30 21:54:38 +00:00
Animesh Jain	7eb5fdb358	[dynamo][guards] Recursive dict tag optimization (#159183 ) Design doc here - https://docs.google.com/document/d/1W29DrWID5miGWlZXspsQVN5U0zydE3kjZpziOXrhuaY/edit?tab=t.0#bookmark=id.sba04iw9sp68 Pull Request resolved: https://github.com/pytorch/pytorch/pull/159183 Approved by: https://github.com/jansel	2025-07-30 06:01:32 +00:00
William Wen	477c2273e1	[dynamo] better way to skip tracing sys.monitoring callables (#159369 ) Better approach to https://github.com/pytorch/pytorch/pull/158171, according to https://github.com/python/cpython/issues/137178#issuecomment-3131617493. Pull Request resolved: https://github.com/pytorch/pytorch/pull/159369 Approved by: https://github.com/Skylion007	2025-07-29 21:54:58 +00:00
Animesh Jain	f7d6e9f500	[dynamo][guards] More small guard optimizations (#159345 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/159345 Approved by: https://github.com/williamwen42 ghstack dependencies: #159288	2025-07-29 18:36:49 +00:00

1 2 3 4 5 ...

506 Commits