Mirror of https://github.com/zebrajr/pytorch.git, synced 2025-12-07 12:21:27 +01:00
[RPC tests] Fix test_init_(rpc|pg)_then_(rpc|pg) not shutting down RPC (#41558)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41558

The problem was due to the non-deterministic destruction order of two global static variables: the mutexes used by glog and the RPC agent (which was still set because we didn't call `rpc.shutdown()`). When the TensorPipe RPC agent shuts down, some callbacks may fire with an error and thus attempt to log something. If the mutexes have already been destroyed, this causes a SIGABRT.

Fixes https://github.com/pytorch/pytorch/issues/41474

ghstack-source-id: 108231453

Test Plan: Verified in https://github.com/pytorch/pytorch/issues/41474.

Reviewed By: fmassa

Differential Revision: D22582779

fbshipit-source-id: 63e34d8a020c4af996ef079cfb7041b2474e27c9
parent e17e55831d
commit fced54aa67
@@ -3359,6 +3359,8 @@ class RpcTest(RpcAgentTestFixture):
         # Test PG
         dist.barrier()
 
+        rpc.shutdown()
+
     @dist_init(setup_rpc=False)
     def test_init_rpc_then_pg(self):
         rpc.init_rpc(
@@ -3384,6 +3386,8 @@ class RpcTest(RpcAgentTestFixture):
         # Test PG
         dist.barrier()
 
+        rpc.shutdown()
+
     @dist_init
     def test_wait_all_with_exception(self):
         futs = []