[rocm] Add device guard when initializing a single stream (#154433)

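Summary: AMD streams are lazily initialized. In some paths (e.g. when we only want to record an event on the stream), the device guard may not be set while the stream is being initialized, which leads to an invalid configuration error. Add a device guard for the target device in initSingleStream so that initialization always runs on the stream's own device.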
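
The failure mode can be sketched with a small mock. This is only an illustration under stated assumptions: the `mock` namespace, `Stream`, `DeviceGuard`, `initStreamUnguarded`, and `initStreamGuarded` are hypothetical stand-ins, not the c10 or HIP API, and the mock models only the device mismatch, not the runtime's actual invalid configuration error.

```cpp
#include <cassert>

// Hypothetical stand-ins for the runtime; not the real c10 or HIP API.
namespace mock {

constexpr int kNumDevices = 2;
constexpr int kUninitialized = -1;

// Mimics the runtime's per-thread "current device".
thread_local int current_device = 0;

// A stream remembers which device was current when it was created.
struct Stream {
  int owner_device = kUninitialized;
};

// RAII guard: switch to `device`, restore the previous device on scope exit.
class DeviceGuard {
 public:
  explicit DeviceGuard(int device) : prev_(current_device) {
    current_device = device;
  }
  ~DeviceGuard() { current_device = prev_; }
  DeviceGuard(const DeviceGuard&) = delete;
  DeviceGuard& operator=(const DeviceGuard&) = delete;

 private:
  int prev_;
};

// Lazy init without a guard: the stream is created on whatever device
// happens to be current, regardless of the requested device_index.
void initStreamUnguarded(Stream& s, int device_index) {
  assert(device_index >= 0 && device_index < kNumDevices);
  (void)device_index;
  if (s.owner_device == kUninitialized) {
    s.owner_device = current_device;
  }
}

// Lazy init with a guard: the stream is always created on device_index.
void initStreamGuarded(Stream& s, int device_index) {
  assert(device_index >= 0 && device_index < kNumDevices);
  DeviceGuard guard(device_index);
  if (s.owner_device == kUninitialized) {
    s.owner_device = current_device;
  }
}

// Returns true when only the guarded path creates the stream on device 1
// and the caller's current device is left unchanged afterwards.
bool demoMismatch() {
  current_device = 0;  // the caller never switched to device 1
  Stream unguarded;
  Stream guarded;
  initStreamUnguarded(unguarded, 1);
  initStreamGuarded(guarded, 1);
  return unguarded.owner_device == 0 && guarded.owner_device == 1 &&
      current_device == 0;
}

}  // namespace mock
```

In this sketch, a caller sitting on device 0 that lazily initializes a stream for device 1 gets a stream bound to device 0 unless the guard is in place. The guard also restores the caller's device on exit, so the fix does not leak a device switch to the caller.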

Differential Revision: D75456460

Pull Request resolved: https://github.com/pytorch/pytorch/pull/154433
Approved by: https://github.com/jeffdaily
Shiyan Deng 2025-05-29 19:42:08 +00:00 committed by PyTorch MergeBot
parent 20ec61a02f
commit e8f5c24d17

@@ -200,6 +200,7 @@ static void initGlobalStreamState() {
 // Init a single CUDA or HIP stream
 // See Note [HIP Lazy Streams]
 static void initSingleStream(int p, DeviceIndex device_index, int i) {
+  CUDAGuard device_guard(device_index);
   auto& stream = streams[p][device_index][i];
   auto pri = -p; // lower number is higher priority