mirror of
https://github.com/zebrajr/pytorch.git
synced 2025-12-07 12:21:27 +01:00
Summary:
This adds a `torch.cuda.set_warn_on_synchronization()` function that warns or errors when a synchronizing operation is performed. We could wrap it in a context manager for ease of use, but that would be misleading, because it sets global, not thread-local, state. Since it's intended for debugging, maybe that's OK though.
Like all `torch.cuda.*` functions, it goes through CPython, not pybind, so the argument is converted to a long before being passed to the c10 function. I'll make the Python argument a Python enum class, but without pybind it will still have to go through the long conversion.
For a test script
```
import torch
torch.cuda.set_warn_on_synchronization(1)
x = torch.randn(10, device="cuda")
x.nonzero()
y = torch.randn((), device="cuda")
if y:
    print("something")
torch.multinomial(x.abs(), 10, replacement=False)
torch.randperm(20000, device="cuda")
ind = torch.randint(10, (3,), device="cuda")
mask = torch.randint(2, (10,), device="cuda", dtype=torch.bool)
val = torch.randn((), device="cuda")
x[mask] = 1.
x[mask] = val
torch.cuda.synchronize()
```
the output is
```
/../playground/sync_warn_test.py:4: UserWarning: called a synchronizing operation (Triggered internally at ../c10/cuda/CUDAFunctions.cpp:145.)
x.nonzero()
/../playground/sync_warn_test.py:7: UserWarning: called a synchronizing operation (Triggered internally at ../c10/cuda/CUDAFunctions.cpp:145.)
if y:
something
/../playground/sync_warn_test.py:9: UserWarning: called a synchronizing operation (Triggered internally at ../c10/cuda/CUDAFunctions.cpp:145.)
torch.multinomial(x.abs(), 10, replacement=False)
/../playground/sync_warn_test.py:15: UserWarning: called a synchronizing operation (Triggered internally at ../c10/cuda/CUDAFunctions.cpp:145.)
x[mask] = val
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62092
Reviewed By: mruberry
Differential Revision: D29968792
Pulled By: ngimel
fbshipit-source-id: cc6f817212c164727ed99ecf6ab050dc29631b9e
109 lines
2.0 KiB
ReStructuredText
torch.cuda
===================================

.. automodule:: torch.cuda
.. currentmodule:: torch.cuda

.. autosummary::
    :toctree: generated
    :nosignatures:

    StreamContext
    can_device_access_peer
    current_blas_handle
    current_device
    current_stream
    default_stream
    device
    device_count
    device_of
    get_arch_list
    get_device_capability
    get_device_name
    get_device_properties
    get_gencode_flags
    get_sync_debug_mode
    init
    ipc_collect
    is_available
    is_initialized
    set_device
    set_stream
    set_sync_debug_mode
    stream
    synchronize

Random Number Generator
-------------------------
.. autosummary::
    :toctree: generated
    :nosignatures:

    get_rng_state
    get_rng_state_all
    set_rng_state
    set_rng_state_all
    manual_seed
    manual_seed_all
    seed
    seed_all
    initial_seed

Communication collectives
-------------------------

.. autosummary::
    :toctree: generated
    :nosignatures:

    comm.broadcast
    comm.broadcast_coalesced
    comm.reduce_add
    comm.scatter
    comm.gather

Streams and events
------------------
.. autosummary::
    :toctree: generated
    :nosignatures:

    Stream
    Event

Memory management
-----------------
.. autosummary::
    :toctree: generated
    :nosignatures:

    empty_cache
    list_gpu_processes
    memory_stats
    memory_summary
    memory_snapshot
    memory_allocated
    max_memory_allocated
    reset_max_memory_allocated
    memory_reserved
    max_memory_reserved
    set_per_process_memory_fraction
    memory_cached
    max_memory_cached
    reset_max_memory_cached
    reset_peak_memory_stats

.. FIXME The following doesn't seem to exist. Is it supposed to?
   https://github.com/pytorch/pytorch/issues/27785
   .. autofunction:: reset_max_memory_reserved

NVIDIA Tools Extension (NVTX)
-----------------------------

.. autosummary::
    :toctree: generated
    :nosignatures:

    nvtx.mark
    nvtx.range_push
    nvtx.range_pop