pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Natalia Gimelshein 37c6087334 Add split-K control to cuBLAS reduced-precision settings (#164766)	2025-10-08 18:48:45 +00:00

## Summary
- add a CuBLASReductionOption enum so the CUDA context can track reduced-precision and split-K options
- extend the Python bindings, backend helpers, and docs to accept an optional allow_splitk argument for fp16/bf16 matmul controls
- update cuBLAS/cuBLASLt call sites plus dynamo guards and tests to respect the new combinations

## Testing
- python test/test_cuda.py TestCuda.test_cublas_allow_fp16_reduced_precision_reduction_get_set -v (fails: ModuleNotFoundError: No module named 'psutil')

------
https://chatgpt.com/codex/tasks/task_e_68e404623178832f8a3e1d34e1e175da

Pull Request resolved: https://github.com/pytorch/pytorch/pull/164766
Approved by: https://github.com/malfet, https://github.com/albanD

cache_entry.cpp	[dynamo] fix _torchdynamo_orig_callable naming issues (#156901)	2025-07-02 09:53:55 +00:00
cache_entry.h	[dynamo] fix segfault due to dangling CacheEntry backend pointer (#156527)	2025-06-26 23:51:08 +00:00
compiled_autograd.cpp	[1/N] Use internal linkage in torch/csrc C++ files. (#150930)	2025-04-11 02:19:31 +00:00
compiled_autograd.h	Revert "Support setting grad_dtype on leaf tensors (#162815)"	2025-10-03 23:14:28 +00:00
cpp_shim.cpp
cpp_shim.h
cpython_defs.c	[dynamo, 3.14] prevent StackRef compilation in 3.14 Windows (#164400)	2025-10-04 18:38:08 +00:00
cpython_defs.h
cpython_includes.h	[dynamo, 3.14] prevent StackRef compilation in 3.14 Windows (#164400)	2025-10-04 18:38:08 +00:00
debug_macros.h
eval_frame_cpp.cpp	[Code Clean] Replace `std::runtime_error` with `TORCH_CHECK` (#163610)	2025-09-26 04:52:48 +00:00
eval_frame_cpp.h	[dynamo] skip tracing functions registered in sys.monitoring (#158171)	2025-07-22 18:02:30 +00:00
eval_frame.c	[dynamo, 3.14] prevent StackRef compilation in 3.14 Windows (#164400)	2025-10-04 18:38:08 +00:00
eval_frame.h	[dynamo] better way to skip tracing sys.monitoring callables (#159369)	2025-07-29 21:54:58 +00:00
extra_state.cpp	[Code Clean] Replace `std::runtime_error` with `TORCH_CHECK` (#163610)	2025-09-26 04:52:48 +00:00
extra_state.h	[dynamo] show frame information when recompilation is triggered on fail_on_recompile (#156433)	2025-07-01 15:15:58 +00:00
framelocals_mapping.cpp	[dynamo, 3.14] prevent StackRef compilation in 3.14 Windows (#164400)	2025-10-04 18:38:08 +00:00
framelocals_mapping.h
guards.cpp	Add split-K control to cuBLAS reduced-precision settings (#164766)	2025-10-08 18:48:45 +00:00
guards.h	[dynamo] Add guard serialization for tensor matches. (#151318)	2025-04-25 14:16:23 +00:00
init.cpp	Revert "Add magic TORCH_MAKE_PYBIND_ENUM_FASTER macro (#163527)"	2025-10-02 15:42:42 +00:00
init.h
python_compiled_autograd.cpp	[ca] Support TorchDispatchMode via pass through (#156516)	2025-06-21 18:33:47 +00:00
python_compiled_autograd.h
utils.cpp	[1/N] Use internal linkage in torch/csrc C++ files. (#150930)	2025-04-11 02:19:31 +00:00
utils.h