mirror of
https://github.com/zebrajr/pytorch.git
synced 2025-12-06 12:20:52 +01:00
Summary: **Overview:** This removes the preceding `_` from `_Join`, `_Joinable`, and `_JoinHook` in preparation for adding the generic join context manager tutorial (see [here](https://github.com/pytorch/tutorials/pull/1610)). This also adds a docs page, which can be linked from the tutorial. [Here](https://github.com/pytorch/pytorch/files/6919475/render.pdf) is a render of the docs page. Pull Request resolved: https://github.com/pytorch/pytorch/pull/62605 Test Plan: `DistributedDataParallel.join()`: ``` touch /tmp/barrier && TEMP_DIR="/tmp" BACKEND="nccl" WORLD_SIZE="2" gpurun python test/distributed/test_distributed_fork.py -- TestDistBackendWithFork.test_ddp_uneven_inputs TestDistBackendWithFork.test_ddp_uneven_inputs_stop_iteration_sync_bn TestDistBackendWithFork.test_ddp_grad_div_uneven_inputs TestDistBackendWithFork.test_ddp_uneven_input_join_disable TestDistBackendWithFork.test_ddp_uneven_input_exception ``` `ZeroRedundancyOptimizer`: ``` gpurun4 python test/distributed/optim/test_zero_redundancy_optimizer.py ``` NOTE: DDP overlap tests are failing due to a landing race. See https://github.com/pytorch/pytorch/pull/62592. Once the fix is landed, I will rebase, and tests should be passing. `Join`: ``` gpurun4 python test/distributed/algorithms/test_join.py ``` Reviewed By: mrshenli Differential Revision: D30055544 Pulled By: andwgu fbshipit-source-id: a5ce1f1d9f1904de3bdd4edd0b31b0a612d87026
18 lines
496 B
ReStructuredText
18 lines
496 B
ReStructuredText
.. role:: hidden
|
|
:class: hidden-section
|
|
|
|
Generic Join Context Manager
|
|
============================
|
|
The generic join context manager facilitates distributed training on uneven
|
|
inputs. This page outlines the API of the relevant classes: :class:`Join`,
|
|
:class:`Joinable`, and :class:`JoinHook`.
|
|
|
|
.. autoclass:: torch.distributed.algorithms.Join
|
|
:members:
|
|
|
|
.. autoclass:: torch.distributed.algorithms.Joinable
|
|
:members:
|
|
|
|
.. autoclass:: torch.distributed.algorithms.JoinHook
|
|
:members:
|