[RPC tests] Fix test_init_(rpc|pg)_then_(rpc|pg) not shutting down RPC (#41558)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41558

The crash was caused by the non-deterministic destruction order of two global static variables: glog's mutexes and the RPC agent (which was still set because the tests never called `rpc.shutdown()`). When the TensorPipe RPC agent shuts down, some callbacks may fire with an error and therefore attempt to log something. If glog's mutexes have already been destroyed by then, this causes a SIGABRT.

Fixes https://github.com/pytorch/pytorch/issues/41474
ghstack-source-id: 108231453

Test Plan: Verified in https://github.com/pytorch/pytorch/issues/41474.

Reviewed By: fmassa

Differential Revision: D22582779

fbshipit-source-id: 63e34d8a020c4af996ef079cfb7041b2474e27c9
Luca Wehrstedt 2020-07-22 06:30:17 -07:00 committed by Facebook GitHub Bot
parent e17e55831d
commit fced54aa67

@@ -3359,6 +3359,8 @@ class RpcTest(RpcAgentTestFixture):
         # Test PG
         dist.barrier()
 
+        rpc.shutdown()
+
     @dist_init(setup_rpc=False)
     def test_init_rpc_then_pg(self):
         rpc.init_rpc(
@@ -3384,6 +3386,8 @@ class RpcTest(RpcAgentTestFixture):
         # Test PG
         dist.barrier()
 
+        rpc.shutdown()
+
     @dist_init
     def test_wait_all_with_exception(self):
         futs = []