pytorch/aten
Simon Layton 5ebf74a655 [2/2] Move scaled_mm routines to their own file (#166314)
Summary:

* Further simplify `ATen/native/cuda/Blas.cpp` by moving `_scaled_mm`,
  `_scaled_mm_v2`, and their supporting methods to a new file,
  `ATen/native/cuda/ScaledBlas.cpp`.

Test Plan:

```
pytest -svv test/test_matmul_cuda.py
pytest -svv test/test_scaled_matmul_cuda.py
```

Signed-off-by: Simon Layton <simonlayton@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/166314
Approved by: https://github.com/eqy
ghstack dependencies: #166313
2025-10-28 16:35:32 +00:00
| Name | Last commit | Date |
|---|---|---|
| conda | | |
| src | [2/2] Move scaled_mm routines to their own file (#166314) | 2025-10-28 16:35:32 +00:00 |
| tools | Adds Issue#153109 as a test for CUDAPluggableAllocator (#163575) | 2025-10-01 09:07:48 +00:00 |
| CMakeLists.txt | Revert "Use official CUDAToolkit module in CMake (#154595)" | 2025-06-23 21:15:31 +00:00 |