pytorch/torch/csrc/dynamo
Natalia Gimelshein 37c6087334 Add split-K control to cuBLAS reduced-precision settings (#164766)
## Summary
- add a CuBLASReductionOption enum so the CUDA context can track reduced-precision and split-K options
- extend the Python bindings, backend helpers, and docs to accept an optional allow_splitk argument for fp16/bf16 matmul controls
- update cuBLAS/cuBLASLt call sites plus dynamo guards and tests to respect the new combinations

## Testing
- python test/test_cuda.py TestCuda.test_cublas_allow_fp16_reduced_precision_reduction_get_set -v *(fails: ModuleNotFoundError: No module named 'psutil')*

------
https://chatgpt.com/codex/tasks/task_e_68e404623178832f8a3e1d34e1e175da

Pull Request resolved: https://github.com/pytorch/pytorch/pull/164766
Approved by: https://github.com/malfet, https://github.com/albanD
2025-10-08 18:48:45 +00:00
cache_entry.cpp [dynamo] fix _torchdynamo_orig_callable naming issues (#156901) 2025-07-02 09:53:55 +00:00
cache_entry.h [dynamo] fix segfault due to dangling CacheEntry backend pointer (#156527) 2025-06-26 23:51:08 +00:00
compiled_autograd.cpp [1/N] Use internal linkage in torch/csrc C++ files. (#150930) 2025-04-11 02:19:31 +00:00
compiled_autograd.h Revert "Support setting grad_dtype on leaf tensors (#162815)" 2025-10-03 23:14:28 +00:00
cpp_shim.cpp
cpp_shim.h
cpython_defs.c [dynamo, 3.14] prevent StackRef compilation in 3.14 Windows (#164400) 2025-10-04 18:38:08 +00:00
cpython_defs.h
cpython_includes.h [dynamo, 3.14] prevent StackRef compilation in 3.14 Windows (#164400) 2025-10-04 18:38:08 +00:00
debug_macros.h
eval_frame_cpp.cpp [Code Clean] Replace std::runtime_error with TORCH_CHECK (#163610) 2025-09-26 04:52:48 +00:00
eval_frame_cpp.h [dynamo] skip tracing functions registered in sys.monitoring (#158171) 2025-07-22 18:02:30 +00:00
eval_frame.c [dynamo, 3.14] prevent StackRef compilation in 3.14 Windows (#164400) 2025-10-04 18:38:08 +00:00
eval_frame.h [dynamo] better way to skip tracing sys.monitoring callables (#159369) 2025-07-29 21:54:58 +00:00
extra_state.cpp [Code Clean] Replace std::runtime_error with TORCH_CHECK (#163610) 2025-09-26 04:52:48 +00:00
extra_state.h [dynamo] show frame information when recompilation is triggered on fail_on_recompile (#156433) 2025-07-01 15:15:58 +00:00
framelocals_mapping.cpp [dynamo, 3.14] prevent StackRef compilation in 3.14 Windows (#164400) 2025-10-04 18:38:08 +00:00
framelocals_mapping.h
guards.cpp Add split-K control to cuBLAS reduced-precision settings (#164766) 2025-10-08 18:48:45 +00:00
guards.h [dynamo] Add guard serialization for tensor matches. (#151318) 2025-04-25 14:16:23 +00:00
init.cpp Revert "Add magic TORCH_MAKE_PYBIND_ENUM_FASTER macro (#163527)" 2025-10-02 15:42:42 +00:00
init.h
python_compiled_autograd.cpp [ca] Support TorchDispatchMode via pass through (#156516) 2025-06-21 18:33:47 +00:00
python_compiled_autograd.h
utils.cpp [1/N] Use internal linkage in torch/csrc C++ files. (#150930) 2025-04-11 02:19:31 +00:00
utils.h