pytorch/torch/cuda
Nichols A. Romero ca2ffc23ab [ROCm][TunableOp] Stricter unit tests for online and offline tuning (#150142)
Improvements to unit tests and warnings for unsupported cases in offline tuning. Here are more details:
- Previously we only compared the OpSig for the untuned vs. tuned entries. This was not strict enough, so we now compare OpSig+ParamSig.
- The main offline and online UTs are now stricter to make sure we exercise the code paths for the four combinations of transA and transB.
- Offline tuning does not support some tensor shapes. For those shapes, emit a warning and skip tuning.
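
The stricter comparison in the first bullet can be illustrated with a minimal sketch. It is not the code from this PR: the helper name `read_signatures`, the sample rows, and the assumed CSV layout (OpSig in the first column, ParamSig in the second, metadata rows starting with `Validator`) are all hypothetical and chosen only for illustration.

```python
import csv
import io


def read_signatures(text):
    """Return the set of (OpSig, ParamSig) pairs found in CSV text.

    Assumption: each data row starts with an op signature followed by a
    parameter signature; rows beginning with "Validator" are metadata.
    """
    pairs = set()
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 2 or row[0] == "Validator":
            continue
        pairs.add((row[0].strip(), row[1].strip()))
    return pairs


# Hypothetical file contents, for illustration only.
untuned = (
    "GemmTunableOp_float_NT,nt_64_32_16\n"
    "GemmTunableOp_float_NT,nt_128_32_16\n"
)
tuned_complete = (
    "Validator,PT_VERSION,0.0.0\n"
    "GemmTunableOp_float_NT,nt_64_32_16,Default,0.01\n"
    "GemmTunableOp_float_NT,nt_128_32_16,Default,0.02\n"
)
# Same op signature, but one parameter signature was never tuned.
tuned_missing = (
    "Validator,PT_VERSION,0.0.0\n"
    "GemmTunableOp_float_NT,nt_64_32_16,Default,0.01\n"
)


def ops_only(pairs):
    """Project (OpSig, ParamSig) pairs down to the OpSig alone."""
    return {op for op, _ in pairs}


untuned_pairs = read_signatures(untuned)
complete_pairs = read_signatures(tuned_complete)
missing_pairs = read_signatures(tuned_missing)

# The OpSig-only check cannot tell the two tuned files apart.
loose_ok_complete = ops_only(untuned_pairs) == ops_only(complete_pairs)
loose_ok_missing = ops_only(untuned_pairs) == ops_only(missing_pairs)

# The OpSig+ParamSig check flags the file with the missing entry.
strict_ok_complete = untuned_pairs == complete_pairs
strict_ok_missing = untuned_pairs == missing_pairs
```

In this sketch, the OpSig-only comparison accepts both tuned files, while the OpSig+ParamSig comparison rejects the one that lacks an entry for the second shape.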

Pull Request resolved: https://github.com/pytorch/pytorch/pull/150142
Approved by: https://github.com/jeffdaily

Co-authored-by: Jeff Daily <jeff.daily@amd.com>
2025-03-31 04:12:08 +00:00
amp
__init__.py Initial implementation of host memory stats (#147660) 2025-03-05 16:13:19 +00:00
_gpu_trace.py [BE][Easy] enable PYFMT for torch/[a-s]*/ (#138447) 2024-12-23 14:04:00 +00:00
_memory_viz.py [BE][Easy] enable PYFMT for torch/[a-s]*/ (#138447) 2024-12-23 14:04:00 +00:00
_sanitizer.py PEP585 update - torch/_higher_order_ops torch/_subclasses torch/backends torch/compiler torch/cuda torch/masked torch/mtia torch/nested (#145202) 2025-01-20 22:37:26 +00:00
_utils.py
comm.py
error.py
gds.py [BE] Upgrade to mypy 1.14 (#145966) 2025-03-04 20:58:26 +00:00
graphs.py Revert "Implement cuda graphs implementation of torch.cond and torch.while_loop (#140979)" 2025-02-13 18:04:26 +00:00
jiterator.py PEP585 update - torch/_higher_order_ops torch/_subclasses torch/backends torch/compiler torch/cuda torch/masked torch/mtia torch/nested (#145202) 2025-01-20 22:37:26 +00:00
memory.py [GPU Snapshot] Add Clear History Flag (#149352) 2025-03-19 21:44:20 +00:00
nccl.py PEP585 update - torch/_higher_order_ops torch/_subclasses torch/backends torch/compiler torch/cuda torch/masked torch/mtia torch/nested (#145202) 2025-01-20 22:37:26 +00:00
nvtx.py Inductor annotations (#130429) 2024-12-10 08:53:39 +00:00
profiler.py
random.py Avoid unnecessary clone in torch.cuda.set_rng_state (#149283) 2025-03-18 20:47:57 +00:00
sparse.py
streams.py Support with statement on torch.Stream (#140138) 2025-01-10 02:05:19 +00:00
tunable.py [ROCm][TunableOp] Stricter unit tests for online and offline tuning (#150142) 2025-03-31 04:12:08 +00:00