pytorch/torch/distributed/pipelining
Howard Huang 4ee7d0de86 Add generate_stage_to_rank_mapping utility (#146193)
We use `stage_index_to_group_rank` in the stage to determine which send/recv ops to issue, and in the schedule for IR generation. However, we don't need to expose this as an argument of our schedule class, so this stack of PRs removes it.

This PR creates a `stage_index_to_group_rank` utility function and removes the argument from the ZBV schedule. In a following PR I will add code to infer `stage_index_to_group_rank` for the CSV schedule path, and we will be able to remove this argument from our classes entirely.

Related comment from @wconstab https://github.com/pytorch/torchtitan/issues/774#issuecomment-2619793741
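To illustrate the kind of mapping such a utility produces, here is a minimal pure-Python sketch. The function name, signature, and `style` flag are illustrative assumptions, not the actual `generate_stage_to_rank_mapping` API: it shows a round-robin ("looped") placement, where stage `i` lives on rank `i % pp_group_size`, and a V-shaped placement as used by ZBV-style schedules, where consecutive rounds of stages sweep down and then back up the ranks so each rank hosts stages from both halves.

```python
def stage_to_rank_mapping(pp_group_size, num_stages, style="loop"):
    """Map each pipeline stage index to the group rank hosting it.

    Hypothetical sketch (names and flags are not the real PyTorch API):
    - "loop": round-robin, stage i -> rank i % pp_group_size
    - "v":    alternate rounds reverse direction, e.g. pp=4, stages=8
              gives stages 0..3 on ranks 0,1,2,3 and stages 4..7 on
              ranks 3,2,1,0, so rank 0 hosts stages 0 and 7.
    """
    if num_stages % pp_group_size != 0:
        raise ValueError("num_stages must be divisible by pp_group_size")
    mapping = {}
    if style == "loop":
        for stage in range(num_stages):
            mapping[stage] = stage % pp_group_size
    elif style == "v":
        rounds = num_stages // pp_group_size
        for r in range(rounds):
            for local in range(pp_group_size):
                stage = r * pp_group_size + local
                # Even rounds go down the ranks, odd rounds come back up.
                mapping[stage] = (
                    local if r % 2 == 0 else pp_group_size - 1 - local
                )
    else:
        raise ValueError(f"unknown style: {style}")
    return mapping
```

With the mapping computed internally from the schedule type, callers no longer need to pass it in explicitly, which is the motivation for this stack of PRs.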

Pull Request resolved: https://github.com/pytorch/pytorch/pull/146193
Approved by: https://github.com/wconstab
2025-02-05 21:26:45 +00:00
__init__.py [pipelining] Add ZBV schedule (#142084) 2024-12-11 02:00:57 +00:00
_backward.py [NFC] Fix some minor typos. (#145599) 2025-01-24 18:58:59 +00:00
_debug.py remove allow-untyped-defs from torch/distributed/pipelining/_debug.py (#143871) 2024-12-27 01:20:26 +00:00
_IR.py PEP585 update - torch/distributed (#145164) 2025-01-21 04:23:29 +00:00
_unflatten.py PEP585 update - torch/distributed (#145164) 2025-01-21 04:23:29 +00:00
_utils.py Add generate_stage_to_rank_mapping utility (#146193) 2025-02-05 21:26:45 +00:00
microbatch.py PEP585 update - torch/distributed (#145164) 2025-01-21 04:23:29 +00:00
README.md
schedules.py Add generate_stage_to_rank_mapping utility (#146193) 2025-02-05 21:26:45 +00:00
stage.py PEP585 update - torch/distributed (#145164) 2025-01-21 04:23:29 +00:00

Pipeline Parallelism for PyTorch

torch.distributed.pipelining is a package for implementing pipeline parallelism on your model.
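The core idea can be sketched in a few lines of plain Python: split the model into stages, split each batch into microbatches, and stream the microbatches through the stages. This is a conceptual toy only; the real package wraps `torch.nn` modules, runs stages on separate ranks, and overlaps microbatch execution across ranks, none of which this single-process sketch attempts.

```python
def run_pipeline(stages, batch, n_microbatches):
    """Toy pipeline: push microbatches of `batch` through each stage in order."""
    size = len(batch) // n_microbatches
    microbatches = [batch[i * size:(i + 1) * size] for i in range(n_microbatches)]
    outputs = []
    for mb in microbatches:
        for stage in stages:  # each stage is just a callable here
            mb = [stage(x) for x in mb]
        outputs.extend(mb)
    return outputs

# Two "stages" acting on a batch of four items, two microbatches:
stages = [lambda x: x + 1, lambda x: x * 2]
print(run_pipeline(stages, [1, 2, 3, 4], 2))  # -> [4, 6, 8, 10]
```

In the real package, each stage would hold a shard of the model's layers on its own device, and the schedule decides how microbatch forward/backward passes interleave across ranks.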

Our documentation is available at https://pytorch.org/docs/stable/distributed.pipelining.html.

(Figure: pipeline parallelism diagram)