pytorch/torch/cuda
Jeff Daily d401e4e70a [ROCm][CUDA] add unit test utility busy_wait_for_flag (#166218)
torch.cuda._busy_wait_for_flag() launches a kernel that spins until a flag is set by a corresponding torch.cuda._clear_flag(). The two calls **must** be run on separate streams or they will deadlock.

When used correctly, these kernels put work on the GPU that is more predictable than torch.cuda._sleep() in cases where a unit test depends on the GPU being busy.
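
A minimal sketch of how a test might pair the two calls, based only on the description above. The helper name `demo_busy_wait` and the stream names are hypothetical. The sketch assumes both private functions take no arguments, as written in this message, and it returns early when torch, a GPU, or the helpers are missing.

```python
import importlib.util


def demo_busy_wait() -> bool:
    """Hypothetical helper: return True if the demo ran, False if skipped."""
    # Skip cleanly when torch is not installed at all.
    if importlib.util.find_spec("torch") is None:
        return False
    import torch

    # Skip when no GPU is usable from this process.
    if not torch.cuda.is_available():
        return False
    # Assumption: the private helpers exist only in builds containing this change.
    if not (
        hasattr(torch.cuda, "_busy_wait_for_flag")
        and hasattr(torch.cuda, "_clear_flag")
    ):
        return False

    # Two distinct streams: using a single stream would deadlock, because the
    # clearing work would be queued behind the spinning kernel.
    work_stream = torch.cuda.Stream()
    signal_stream = torch.cuda.Stream()

    with torch.cuda.stream(work_stream):
        # Assumed zero-argument call; the kernel spins until the flag changes.
        torch.cuda._busy_wait_for_flag()

    # ... the code under test that needs a busy GPU would run here ...

    with torch.cuda.stream(signal_stream):
        # Assumed zero-argument call; releases the spinning kernel.
        torch.cuda._clear_flag()

    # Wait for both streams to drain before returning.
    torch.cuda.synchronize()
    return True


if __name__ == "__main__":
    print("demo ran" if demo_busy_wait() else "demo skipped")
```

The important design point is that the release call is issued from a different stream than the spin kernel, so it is never serialized behind the work it is meant to unblock.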

Fixes #120318.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/166218
Approved by: https://github.com/jeffdaily

Co-authored-by: Jeff Daily <jeff.daily@amd.com>
2025-10-29 22:40:23 +00:00
amp Fix pyrefly error syntax (2/n) (#166448) 2025-10-29 00:36:40 +00:00
__init__.py [ROCm][CUDA] add unit test utility busy_wait_for_flag (#166218) 2025-10-29 22:40:23 +00:00
_device_limits.py [torch][cuda][device_limits] Library for querying device hardware limits for flops and bandwidth (#162942) 2025-09-23 04:48:19 +00:00
_gpu_trace.py [4/N] Apply ruff UP035 rule to python code (#164206) 2025-10-01 19:05:53 +00:00
_memory_viz.py filter out alloc-free pairs from trace plot (#165752) 2025-10-29 12:44:54 +00:00
_pin_memory_utils.py [dcp] add new checkpoint staging to preserve storage sharing and support mutable state_dicts (#155192) 2025-06-19 02:04:21 +00:00
_sanitizer.py [2/N] Fix ruff warnings (#164460) 2025-10-04 03:40:32 +00:00
_utils.py Fix pyrefly error syntax (2/n) (#166448) 2025-10-29 00:36:40 +00:00
comm.py
gds.py [4/N] Apply ruff UP035 rule to python code (#164206) 2025-10-01 19:05:53 +00:00
graphs.py Fix pyrefly error syntax (2/n) (#166448) 2025-10-29 00:36:40 +00:00
green_contexts.py Fix pyrefly error syntax (2/n) (#166448) 2025-10-29 00:36:40 +00:00
jiterator.py [4/N] Apply ruff UP035 rule to python code (#164206) 2025-10-01 19:05:53 +00:00
memory.py Fix pyrefly error syntax (2/n) (#166448) 2025-10-29 00:36:40 +00:00
nccl.py Fix flake8 B028 warnings (#166224) 2025-10-26 06:18:55 +00:00
nvtx.py Fix pyrefly error syntax (2/n) (#166448) 2025-10-29 00:36:40 +00:00
profiler.py
random.py Avoid unnecessary clone in torch.cuda.set_rng_state (#149283) 2025-03-18 20:47:57 +00:00
sparse.py
streams.py error message for instantiating CUDA Stream if CUDA not available (#159868) 2025-10-11 23:21:35 +00:00
tunable.py Fix flake8 B028 warnings (#166224) 2025-10-26 06:18:55 +00:00