## Summary

- Add a `CuBLASReductionOption` enum so the CUDA context can track reduced-precision and split-K options.
- Extend the Python bindings, backend helpers, and docs to accept an optional `allow_splitk` argument for fp16/bf16 matmul controls.
- Update cuBLAS/cuBLASLt call sites plus dynamo guards and tests to respect the new combinations.

## Testing

- `python test/test_cuda.py TestCuda.test_cublas_allow_fp16_reduced_precision_reduction_get_set -v` *(fails: ModuleNotFoundError: No module named 'psutil')*

------

https://chatgpt.com/codex/tasks/task_e_68e404623178832f8a3e1d34e1e175da

Pull Request resolved: https://github.com/pytorch/pytorch/pull/164766
Approved by: https://github.com/malfet, https://github.com/albanD
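For context, a minimal sketch of how these reduction controls are driven from Python. The two public backend flags shown below already exist in PyTorch; the `allow_splitk` keyword on the private setter is an assumption based on the summary above, not a confirmed signature (the exact shape is in the PR diff).

```python
import torch

# Public, documented flags that gate reduced-precision accumulation
# for fp16/bf16 matmuls on CUDA (these exist prior to this PR):
torch.backends.cuda.matmul.allow_fp16_reduced_precision_reduction = True
torch.backends.cuda.matmul.allow_bf16_reduced_precision_reduction = True

# This PR extends the underlying binding to also track a split-K option.
# Hypothetical call shape, assuming the optional argument described in the
# summary is exposed as a keyword; verify against the actual diff:
torch._C._set_cublas_allow_fp16_reduced_precision_reduction(
    True,                # allow reduced-precision accumulation
    allow_splitk=True,   # new optional split-K control (assumed keyword)
)
```

The dynamo guard and test updates mentioned in the summary follow from this: once the setting is a combination of options rather than a single bool, anything that specializes on it has to key on the full combination.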
| Name |
|---|
| _acc |
| _dynamo |
| _export |
| __init__.pyi.in |
| _aoti.pyi |
| _autograd.pyi |
| _cpu.pyi |
| _cudnn.pyi |
| _cusparselt.pyi |
| _distributed_autograd.pyi |
| _distributed_c10d.pyi |
| _distributed_rpc_testing.pyi |
| _distributed_rpc.pyi |
| _functionalization.pyi |
| _functions.pyi |
| _functorch.pyi |
| _instruction_counter.pyi |
| _itt.pyi |
| _jit_tree_views.pyi |
| _lazy_ts_backend.pyi |
| _lazy.pyi |
| _monitor.pyi |
| _nn.pyi.in |
| _nvtx.pyi |
| _onnx.pyi |
| _profiler.pyi |
| _VariableFunctions.pyi.in |
| _verbose.pyi |
| build.bzl |
| return_types.pyi.in |