mirror of https://github.com/zebrajr/pytorch.git, synced 2025-12-06 12:20:52 +01:00
Summary: Adding function to log additional debug information before killing the expired watchdog timers. Additional information like stack trace can be added in the debug function using worker process IDs from expired timers.

Test Plan: buck test mode/opt caffe2/test/distributed/elastic/timer:file_based_timer_test

Differential Revision: D56044153

Pull Request resolved: https://github.com/pytorch/pytorch/pull/123883

Approved by: https://github.com/kurman

| Name | ||
|---|---|---|
| agent_diagram.jpg | ||
| agent.rst | ||
| customization.rst | ||
| errors.rst | ||
| etcd_rdzv_diagram.png | ||
| events.rst | ||
| examples.rst | ||
| kubernetes.rst | ||
| metrics.rst | ||
| multiprocessing.rst | ||
| quickstart.rst | ||
| rendezvous.rst | ||
| run.rst | ||
| subprocess_handler.rst | ||
| timer.rst | ||
| train_script.rst | ||
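
The commit summary above describes logging debug information for expired watchdog timers before their worker processes are killed. The ordering can be sketched roughly as follows. This is a minimal illustration and not PyTorch's implementation: the names `log_debug_info` and `reap_expired`, the dict-of-lists input shape, and the injected `kill` callback are all hypothetical and chosen only for this sketch.

```python
import logging
from typing import Callable, Dict, List

logger = logging.getLogger(__name__)


def log_debug_info(expired: Dict[int, List[str]]) -> List[str]:
    """Hypothetical debug hook: report which timers expired for each worker PID.

    A fuller version could use each PID to collect extra details
    (for example a stack trace) before the worker goes away.
    """
    lines = []
    for pid, timer_names in sorted(expired.items()):
        line = f"worker pid={pid} expired timers: {', '.join(timer_names)}"
        logger.warning(line)
        lines.append(line)
    return lines


def reap_expired(
    expired: Dict[int, List[str]], kill: Callable[[int], None]
) -> List[int]:
    """Log debug info first, then kill every worker that owns an expired timer.

    `kill` is injected (an assumption of this sketch) so the ordering can be
    exercised without signalling real processes.
    """
    log_debug_info(expired)  # must run before any worker is killed
    killed = []
    for pid in sorted(expired):
        kill(pid)
        killed.append(pid)
    return killed
```

In a real setting the `kill` argument might be something like `lambda pid: os.kill(pid, signal.SIGKILL)`. Passing a plain callback keeps the sketch safe to run and makes the log-then-kill ordering easy to check.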