pytorch/test/distributed/elastic
Kurman Karabukaev 360761f7d0 [Torchelasic] Create root log directory by default (#121257)
Summary:
After the refactoring in https://github.com/pytorch/pytorch/pull/120691, the default behavior was unintentionally changed from creating a tempdir for logging to the TorchElastic agent not capturing any logs.

Reverting the behavior to:
- making tempdir when log dir is not specified
- allowing non-empty root log dir
    - Note: if the attempt folder already exists, it is pruned here: https://github.com/pytorch/pytorch/blob/main/torch/distributed/elastic/multiprocessing/api.py#L294
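
The restored behavior can be sketched roughly as follows. This is only an illustration built on the standard library, not the actual code in `torch/distributed/elastic/multiprocessing/api.py`. The helper names `resolve_root_log_dir` and `make_attempt_dir`, the `torchelastic_` prefix, and the `attempt_<n>` folder layout are all assumptions made for the example.

```python
import os
import shutil
import tempfile
from typing import Optional


def resolve_root_log_dir(log_dir: Optional[str] = None,
                         prefix: str = "torchelastic_") -> str:
    """Return a usable root log directory (hypothetical helper, not the PyTorch API)."""
    if not log_dir:
        # No log dir specified: fall back to a fresh temporary directory.
        return tempfile.mkdtemp(prefix=prefix)
    # A log dir was specified: create it if missing. Existing contents are allowed.
    os.makedirs(log_dir, exist_ok=True)
    return log_dir


def make_attempt_dir(root_log_dir: str, attempt: int) -> str:
    """Create a clean per-attempt folder, pruning a stale one if it exists."""
    # The "attempt_<n>" naming is an assumption for this sketch.
    attempt_dir = os.path.join(root_log_dir, f"attempt_{attempt}")
    if os.path.isdir(attempt_dir):
        # Only the attempt folder is removed; the rest of the root dir is untouched.
        shutil.rmtree(attempt_dir)
    os.makedirs(attempt_dir)
    return attempt_dir
```

In this sketch, a caller that passes no directory still gets logs captured under a tempdir, and a caller that reuses a non-empty root directory only loses the contents of the attempt folder being recreated.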

Differential Revision: D54531851

Pull Request resolved: https://github.com/pytorch/pytorch/pull/121257
Approved by: https://github.com/d4l3k
2024-03-06 18:50:38 +00:00
agent/server/test [TorchElastic] Refactoring to support non-default logging strategy (#120691) 2024-02-29 20:59:17 +00:00
events Run tests in USE_PYTEST_LIST through run_tests (#95659) 2023-02-28 22:09:01 +00:00
metrics
multiprocessing [Torchelasic] Create root log directory by default (#121257) 2024-03-06 18:50:38 +00:00
rendezvous [torchelastic][rendezvous] Add option to enable libuv for TCPStore based rendezvous backend (#118944) 2024-02-04 23:11:32 +00:00
timer Fix typo under test directory (#112346) 2023-11-03 07:53:33 +00:00
utils Add timeout for master store if clients do not join (#111805) 2023-10-27 14:44:43 +00:00