.. _pipeline-parallelism:

Pipeline Parallelism
====================

Pipeline parallelism was originally introduced in the
`GPipe <https://arxiv.org/abs/1811.06965>`__ paper and is an efficient
technique to train large models on multiple GPUs.

.. warning::
    Pipeline Parallelism is experimental and subject to change.

Model Parallelism using multiple GPUs
-------------------------------------

Typically for large models which don't fit on a single GPU, model parallelism
is employed where certain parts of the model are placed on different GPUs.
However, if this is done naively for sequential models, the training process
suffers from GPU under-utilization since only one GPU is active at a time, as
shown in the figure below:

.. figure:: _static/img/pipeline_parallelism/no_pipe.png

    The figure represents a model with 4 layers placed on 4 different GPUs
    (vertical axis). The horizontal axis represents training this model through
    time demonstrating that only 1 GPU is utilized at a time
    (`image source <https://arxiv.org/abs/1811.06965>`__).
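
For concreteness, such naive model parallelism might look like the following
sketch (the ``TwoGPUModel`` name is just a placeholder and the sketch assumes
a host with two GPUs), where each GPU waits on the other::

    import torch
    import torch.nn as nn

    class TwoGPUModel(nn.Module):
        """Each stage lives on its own GPU; activations are moved by hand."""

        def __init__(self):
            super().__init__()
            self.stage0 = nn.Linear(16, 8).to('cuda:0')
            self.stage1 = nn.Linear(8, 4).to('cuda:1')

        def forward(self, x):
            x = self.stage0(x.to('cuda:0'))
            # While stage1 runs on cuda:1, cuda:0 sits idle (and vice versa).
            return self.stage1(x.to('cuda:1'))

    model = TwoGPUModel()
    out = model(torch.rand(16, 16))  # shape (16, 4), lives on cuda:1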

Pipelined Execution
-------------------

To alleviate this problem, pipeline parallelism splits the input minibatch into
multiple microbatches and pipelines the execution of these microbatches across
multiple GPUs. This is outlined in the figure below:

.. figure:: _static/img/pipeline_parallelism/pipe.png

    The figure represents a model with 4 layers placed on 4 different GPUs
    (vertical axis). The horizontal axis represents training this model through
    time demonstrating that the GPUs are utilized much more efficiently.
    However, there still exists a bubble (as demonstrated in the figure) where
    certain GPUs are not utilized.
    (`image source <https://arxiv.org/abs/1811.06965>`__).
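
At its core the split is simply along the batch dimension; the rough idea
(a sketch, not part of the API) is::

    import torch

    minibatch = torch.randn(32, 16)           # 32 samples per minibatch
    microbatches = torch.chunk(minibatch, 8)  # 8 microbatches of 4 samples each
    # Stage k can start working on microbatch i as soon as stage k - 1 has
    # finished it, instead of waiting for the entire minibatch.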

Pipe APIs in PyTorch
--------------------

.. autoclass:: torch.distributed.pipeline.sync.Pipe
    :members: forward
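
A minimal single-process sketch of using ``Pipe`` (assuming a host with at
least two GPUs; the sketch also assumes ``forward`` returns an ``RRef`` to
the output, which is fetched with ``to_here()``)::

    import os

    import torch
    import torch.nn as nn
    from torch.distributed import rpc
    from torch.distributed.pipeline.sync import Pipe

    # Pipe requires the RPC framework to be initialized, even when running in
    # a single process.
    os.environ.setdefault('MASTER_ADDR', 'localhost')
    os.environ.setdefault('MASTER_PORT', '29500')
    rpc.init_rpc('worker', rank=0, world_size=1)

    # Place consecutive stages of an nn.Sequential on different GPUs and split
    # each minibatch into 8 microbatches.
    fc1 = nn.Linear(16, 8).cuda(0)
    fc2 = nn.Linear(8, 4).cuda(1)
    model = Pipe(nn.Sequential(fc1, fc2), chunks=8)

    output_rref = model(torch.rand(16, 16).cuda(0))
    output = output_rref.to_here()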

Skip connections
^^^^^^^^^^^^^^^^

Certain models like ResNeXt are not completely sequential and have skip
connections between layers. Naively implementing them as part of pipeline
parallelism would imply that we need to copy outputs for certain layers through
multiple GPUs until we eventually reach the GPU where the layer for the skip
connection resides. To avoid this copy overhead, we provide APIs below to stash
and pop Tensors in different layers of the model.

.. autofunction:: torch.distributed.pipeline.sync.skip.skippable.skippable

.. autoclass:: torch.distributed.pipeline.sync.skip.skippable.stash

.. autoclass:: torch.distributed.pipeline.sync.skip.skippable.pop

.. autofunction:: torch.distributed.pipeline.sync.skip.skippable.verify_skippables
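
As a sketch of how these APIs fit together (the class names and the skip name
``'1to3'`` below are just placeholders), one layer stashes a tensor under a
name and a later layer, possibly on a different GPU, pops it::

    import torch.nn as nn
    from torch.distributed.pipeline.sync.skip.skippable import pop, skippable, stash

    @skippable(stash=['1to3'])
    class Layer1(nn.Module):
        def forward(self, input):
            # Stash the input so a later layer can reuse it without it being
            # copied through every intermediate GPU.
            yield stash('1to3', input)
            return input

    @skippable(pop=['1to3'])
    class Layer3(nn.Module):
        def forward(self, input):
            # Pop the stashed tensor and add it back (the skip connection).
            skip = yield pop('1to3')
            return input + skip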

Acknowledgements
----------------

The implementation for pipeline parallelism is based on `fairscale's pipe implementation <https://github.com/facebookresearch/fairscale/tree/master/fairscale/nn/pipe>`__ and
`torchgpipe <https://github.com/kakaobrain/torchgpipe>`__. We would like to
thank both teams for their contributions and guidance towards bringing pipeline
parallelism into PyTorch.