mirror of
https://github.com/zebrajr/pytorch.git
synced 2025-12-08 07:39:33 +01:00
Summary: **Problem Statement** Currently, torch distributed elastic does not support an option to specify the destination for event logging from torch.distributed.run. *Recording events to the default destination:* https://fburl.com/code/7f9b0993 The default destination is "null". ***Solution*** Add an option to torch.distributed.run to specify event_logging_destination. The default value remains "null", which is the current default, so it won't affect users unless they specify it via the command line. Test Plan: https://www.internalfb.com/mlhub/pipelines/runs/mast/f738408681-TrainingApplication_torch_distributed_run_3?job_attempt=0&version=0&tab=execution_details&env=PRODUCTION Rollback Plan: Reviewed By: kiukchung Differential Revision: D75183591 Pull Request resolved: https://github.com/pytorch/pytorch/pull/155268 Approved by: https://github.com/d4l3k |
||
|---|---|---|
| agent | ||
| events | ||
| metrics | ||
| multiprocessing | ||
| rendezvous | ||
| timer | ||
| utils | ||
| __init__.py | ||
| control_plane.py | ||
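
The commit summary above describes a command-line option whose default, "null", preserves existing behavior. A minimal standard-library sketch of that pattern follows. It is not the code from the pull request: the flag spelling `--event-logging-destination`, the `HANDLERS` registry, the "console" destination, and the helper names are all hypothetical and chosen only for illustration.

```python
import argparse
import logging

# Hypothetical registry mapping a destination name to a handler class.
# "null" discards events; "console" writes them to stderr.
HANDLERS = {
    "null": logging.NullHandler,
    "console": logging.StreamHandler,
}


def parse_args(argv):
    """Parse a hypothetical destination flag that defaults to "null"."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--event-logging-destination",
        default="null",  # unchanged behavior unless the user opts in
        choices=sorted(HANDLERS),
    )
    return parser.parse_args(argv)


def build_event_logger(destination):
    """Return a logger whose only handler matches the chosen destination."""
    logger = logging.getLogger(f"events.{destination}")
    logger.propagate = False  # keep events out of the root logger
    if not logger.handlers:
        logger.addHandler(HANDLERS[destination]())
    logger.setLevel(logging.INFO)
    return logger


if __name__ == "__main__":
    args = parse_args(["--event-logging-destination", "console"])
    build_event_logger(args.event_logging_destination).info("worker started")
```

Because the default is "null", callers who never pass the flag get a logger that drops every event, which matches the backward-compatibility goal stated in the summary.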