Mirror of https://github.com/zebrajr/pytorch.git, synced 2025-12-07 00:21:07 +01:00
When one process fails, the others are killed immediately. This prevents them from performing necessary cleanup or dumping debug information (in particular, the NCCL flight recorder). This PR adds a grace period. Default behavior is unchanged.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131278
Approved by: https://github.com/albanD

| File |
|---|
| `__init__.py` |
| `_atfork.py` |
| `cuda_multiprocessing.md` |
| `pool.py` |
| `queue.py` |
| `reductions.py` |
| `spawn.py` |
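
The grace period described in the commit message above can be illustrated with a minimal standard-library sketch. It is not the code from the pull request: the function name `shut_down_survivors` and the parameters `grace_period` and `kill_after` are hypothetical, chosen only for this example. The sketch assumes each worker object exposes the `multiprocessing.Process` methods `is_alive`, `join`, `terminate`, and `kill`.

```python
import time


def shut_down_survivors(processes, grace_period=None, kill_after=5.0):
    """Stop the remaining workers after one of them has failed.

    Hypothetical helper for illustration only. `processes` holds objects
    with the multiprocessing.Process interface. With grace_period=None the
    survivors are terminated at once; otherwise they first get up to
    `grace_period` seconds in total to exit on their own.
    Returns the list of processes that had to be terminated.
    """
    # Phase 1 (only with a grace period): wait for survivors to finish
    # cleanup or dump debug information. The deadline is shared, so the
    # total wait is bounded by grace_period, not grace_period per process.
    if grace_period is not None:
        deadline = time.monotonic() + grace_period
        for p in processes:
            p.join(max(0.0, deadline - time.monotonic()))

    # Phase 2: terminate whatever is still running.
    forced = [p for p in processes if p.is_alive()]
    for p in forced:
        p.terminate()

    # Phase 3: escalate to kill() if terminate() was not enough.
    # kill_after is an assumed timeout, not a value taken from the PR.
    for p in forced:
        p.join(kill_after)
        if p.is_alive():
            p.kill()
            p.join()
    return forced
```

Called with `grace_period=None`, the helper goes straight to termination, which mirrors the statement that the default behavior is unchanged. Passing a positive value gives the surviving processes a bounded window to exit before they are terminated.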