pytorch/docs
Nichols A. Romero a99332eb25 [ROCM] Support Multi-GPU offline tuning in TunableOp (#139673)
This PR extends offline tuning to support multiple GPUs.

High-level description of algorithm:
- Duplicate GEMMs are eliminated first
- The remaining GEMMs are distributed across multiple GPUs for tuning
- Results are gathered into a file with `_full` in the filename
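
The three steps above can be sketched in plain Python. This is only an illustrative outline, not the TunableOp implementation: the helper names (`dedupe`, `distribute`, `gather`), the tuple layout of a GEMM signature, and the round-robin assignment policy are all assumptions made for the example, and the tuning step itself is replaced by a placeholder.

```python
def dedupe(gemms):
    """Drop duplicate GEMM signatures, keeping first-seen order."""
    return list(dict.fromkeys(gemms))


def distribute(gemms, num_gpus):
    """Split GEMMs across GPUs round-robin (assumed policy, for illustration)."""
    shards = [[] for _ in range(num_gpus)]
    for i, gemm in enumerate(gemms):
        shards[i % num_gpus].append(gemm)
    return shards


def gather(per_gpu_results):
    """Concatenate the per-GPU result lists into one combined list."""
    merged = []
    for results in per_gpu_results:
        merged.extend(results)
    return merged


# Hypothetical GEMM signatures: (transA, transB, M, N, K).
gemms = [
    ("N", "T", 1024, 1024, 512),
    ("N", "N", 256, 256, 256),
    ("N", "T", 1024, 1024, 512),  # duplicate of the first entry
    ("T", "N", 64, 128, 32),
]

unique = dedupe(gemms)
shards = distribute(unique, num_gpus=2)

# Placeholder for tuning: tag each GEMM with the rank that would tune it.
per_gpu_results = [
    [(gemm, f"gpu{rank}") for gemm in shard]
    for rank, shard in enumerate(shards)
]
full_results = gather(per_gpu_results)
```

In the real feature the gathered results are written to a file with `_full` in its name; the sketch stops at the in-memory list because the exact file format is not described here.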

Also adds support for GemmAndBias and ScaledGemm.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/139673
Approved by: https://github.com/jeffdaily, https://github.com/hongxiayang
2024-11-26 19:07:41 +00:00
cpp Update copyrights to 2024 (#138638) 2024-10-22 21:00:58 +00:00
source [ROCM] Support Multi-GPU offline tuning in TunableOp (#139673) 2024-11-26 19:07:41 +00:00
.gitignore
libtorch.rst Add ROCm documentation to libtorch (C++) reST. (#136378) 2024-09-25 02:30:56 +00:00
make.bat
Makefile
README.md
requirements.txt

Please see the "Writing documentation" section of CONTRIBUTING.md for details on both writing and building the docs.