[torch][cuda] fix race condition in cuda initialization (#143238)

The access to lazy init callbacks (`_lazy_seed_tracker` and `_queued_calls`) is not synchronized with the initialization lock. This exposes us to the following race:

1. start `_lazy_init`
2. take `_initialization_lock`
3. flush `_queued_calls` and run them all
4. another thread comes in and uses `_lazy_call` to put something on the queue (in our case, the `manual_seed`)
5. original thread finishes initializing, but never runs that call

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143238
Approved by: https://github.com/ngimel
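The race above, and the effect of doing the check-and-enqueue under the initialization lock, can be illustrated with a small standalone sketch. This is not the torch implementation: `LazyQueue`, `lazy_call`, `lazy_init`, and `run_demo` are hypothetical names made up for illustration, and the toy callables never re-enter `lazy_call`, so a plain `threading.Lock` is enough here.

```python
import threading


class LazyQueue:
    """Toy stand-in for lazy-init bookkeeping (hypothetical, not torch code)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._initialized = False
        self._queued = []

    def lazy_call(self, fn):
        # The "initialized?" check and the enqueue happen under the same lock
        # that lazy_init holds, so a call cannot land on the queue after the
        # queue has already been flushed.
        with self._lock:
            if self._initialized:
                fn()
            else:
                self._queued.append(fn)

    def lazy_init(self):
        with self._lock:
            if self._initialized:
                return
            # Flush everything queued so far, then mark initialization done.
            pending, self._queued = self._queued, []
            for fn in pending:
                fn()
            self._initialized = True


def run_demo(n_callers=8):
    """Race n_callers lazy_call threads against one lazy_init thread."""
    queue = LazyQueue()
    ran = []
    ran_lock = threading.Lock()

    def make_callback(i):
        def callback():
            with ran_lock:
                ran.append(i)
        return callback

    threads = [
        threading.Thread(target=queue.lazy_call, args=(make_callback(i),))
        for i in range(n_callers)
    ]
    threads.append(threading.Thread(target=queue.lazy_init))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Each callback ran exactly once, whether it was queued or run immediately.
    return sorted(ran)
```

Whichever thread wins the lock, a callback is either queued before the flush and run by `lazy_init`, or it observes the initialized state and runs immediately. Without the lock in `lazy_call`, a callback could be appended between the flush and the end of initialization and never run, which is step 4 and step 5 of the list above.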
2025-12-06 12:20:52 +01:00 · 2024-12-14 07:41:22 +00:00 · 2024-12-14 07:41:22 +00:00 · 9933e59c2b
commit 9933e59c2b
parent 28d8297712
1 changed file with 14 additions and 13 deletions
--- a/torch/cuda/__init__.py
+++ b/torch/cuda/__init__.py
@@ -245,20 +245,21 @@ def is_initialized():


 def _lazy_call(callable, **kwargs):
-    if is_initialized():
-        callable()
-    else:
-        # TODO(torch_deploy): this accesses linecache, which attempts to read the
-        # file system to get traceback info. Patch linecache or do something
-        # else here if this ends up being important.
-        global _lazy_seed_tracker
-        if kwargs.get("seed_all", False):
-            _lazy_seed_tracker.queue_seed_all(callable, traceback.format_stack())
-        elif kwargs.get("seed", False):
-            _lazy_seed_tracker.queue_seed(callable, traceback.format_stack())
+    with _initialization_lock:
+        if is_initialized():
+            callable()
        else:
-            # Don't store the actual traceback to avoid memory cycle
-            _queued_calls.append((callable, traceback.format_stack()))
+            # TODO(torch_deploy): this accesses linecache, which attempts to read the
+            # file system to get traceback info. Patch linecache or do something
+            # else here if this ends up being important.
+            global _lazy_seed_tracker
+            if kwargs.get("seed_all", False):
+                _lazy_seed_tracker.queue_seed_all(callable, traceback.format_stack())
+            elif kwargs.get("seed", False):
+                _lazy_seed_tracker.queue_seed(callable, traceback.format_stack())
+            else:
+                # Don't store the actual traceback to avoid memory cycle
+                _queued_calls.append((callable, traceback.format_stack()))


 _lazy_call(_check_capability)