Mirror of https://github.com/zebrajr/pytorch.git, synced 2025-12-06 12:20:52 +01:00
torch.cuda._busy_wait_for_flag() launches a kernel that spins until a flag is set by a corresponding torch.cuda._clear_flag(). These **must** be run on separate streams, or they will deadlock. When used correctly, these kernels put work on the GPU that is more predictable than torch.cuda._sleep() in cases where a unit test depends on the GPU being busy.

Fixes #120318.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/166218
Approved by: https://github.com/jeffdaily
Co-authored-by: Jeff Daily <jeff.daily@amd.com>
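The pattern above can be sketched as follows. This is a minimal illustration, not the PR's test code: it assumes the private helpers take no arguments (as the description implies), guards against them being absent (they are internal utilities and may change without notice), and uses the public `torch.cuda.stream` context manager to place each kernel on its own stream.

```python
try:
    import torch
except ImportError:  # torch may not be installed in every environment
    torch = None

def busy_wait_demo():
    """Sketch of the busy-wait pattern: spin kernel on one stream,
    flag-clear on another (private APIs from pytorch/pytorch#166218)."""
    if (torch is None or not torch.cuda.is_available()
            or not hasattr(torch.cuda, "_busy_wait_for_flag")):
        print("demo skipped")
        return
    spin_stream = torch.cuda.Stream()
    clear_stream = torch.cuda.Stream()
    # The spin kernel and the flag-clear MUST be on separate streams:
    # on a single stream they would serialize, the clear would queue
    # behind the spin, and the spin would never be released (deadlock).
    with torch.cuda.stream(spin_stream):
        torch.cuda._busy_wait_for_flag()  # kernel spins until flag is set
    with torch.cuda.stream(clear_stream):
        torch.cuda._clear_flag()  # sets the flag, releasing the spin kernel
    torch.cuda.synchronize()
    print("demo done")

busy_wait_demo()
```

Unlike `torch.cuda._sleep()`, which spins for a fixed cycle count, this keeps the GPU busy until the test explicitly releases it, so the "GPU is busy" window is deterministic.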
Directory listing:

- amp/
- __init__.py
- _device_limits.py
- _gpu_trace.py
- _memory_viz.py
- _pin_memory_utils.py
- _sanitizer.py
- _utils.py
- comm.py
- gds.py
- graphs.py
- green_contexts.py
- jiterator.py
- memory.py
- nccl.py
- nvtx.py
- profiler.py
- random.py
- sparse.py
- streams.py
- tunable.py