Fix test TestCuda.test_streams_multi_gpu_query (#23912)

Summary:
This is an issue similar to the one in TestCuda.test_events_wait.

The PyTorch test harness wraps each test in assertLeaksNoCudaTensors.
Whenever a test runs, assertLeaksNoCudaTensors is invoked, which in
turn calls CudaMemoryLeakCheck, which in turn calls
initialize_cuda_context_rng, which executes torch.randn on each
device, launching a kernel on each device.

Since that kernel may not have finished on device 0 when the test body
starts, the first assertion self.assertTrue(s0.query()) can fail.

    The fix is to insert

            torch.cuda.synchronize(d0)
            torch.cuda.synchronize(d1)

    at the beginning of the test so that previously launched kernels finish before the real
    test begins.
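The race described above can be sketched without a GPU. The snippet below is a toy model, not PyTorch code: FakeStream, launch, and the sleep duration are hypothetical stand-ins for a CUDA stream, an asynchronous kernel launch, and kernel runtime, used only to show why query() returns False until the pending work completes and why synchronizing first fixes the assertion.

```python
import threading
import time

class FakeStream:
    """Toy stand-in for a CUDA stream: query() reports True once all
    work submitted to the stream has completed (hypothetical model)."""

    def __init__(self):
        self._done = threading.Event()
        self._done.set()  # an idle stream reports query() == True

    def launch(self, seconds):
        """Model an async kernel launch that takes `seconds` to finish."""
        self._done.clear()

        def work():
            time.sleep(seconds)
            self._done.set()

        threading.Thread(target=work, daemon=True).start()

    def query(self):
        """Non-blocking: is all submitted work complete?"""
        return self._done.is_set()

    def synchronize(self):
        """Block until all submitted work completes (the fix)."""
        self._done.wait()

s0 = FakeStream()
s0.launch(0.2)           # e.g. the torch.randn launched by the leak checker
print(s0.query())        # False: the "kernel" is still in flight
s0.synchronize()         # synchronize before asserting, as the patch does
print(s0.query())        # True
```

The fixed test follows the same pattern: torch.cuda.synchronize(d0) and torch.cuda.synchronize(d1) drain any kernels left over from test setup, so the subsequent s0.query() assertion sees an idle stream.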
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23912

Differential Revision: D16688599

Pulled By: ezyang

fbshipit-source-id: 3de2b555e99f5bbd05727835b9d7c93a026a0519
Author: Yaxun (Sam) Liu
Date: 2019-08-07 07:41:04 -07:00
Committed by: Facebook Github Bot
parent fc82ec298b
commit 13a684d50b

@@ -1729,6 +1729,8 @@ class TestCuda(TestCase):
     def test_streams_multi_gpu_query(self):
         d0 = torch.device('cuda:0')
         d1 = torch.device('cuda:1')
+        torch.cuda.synchronize(d0)
+        torch.cuda.synchronize(d1)
         with torch.cuda.device(d0):
             s0 = torch.cuda.current_stream()