Commit Graph

24 Commits

Author SHA1 Message Date
Rodrigo Kumpera
38192f63cd Add __all__ for a few distributed modules plus a little typing (reland) (#84872)
This handles distributed_c10d (which is massive) and ddp_comm_hooks.
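
For reference, a minimal sketch of the `__all__` pattern being applied (the names below are illustrative, not the actual list added to distributed_c10d):

```
# Declaring a module's public surface with __all__ so that
# `from module import *` and documentation tooling only see the intended names.
__all__ = [
    "init_process_group",
    "all_reduce",
    "broadcast",
]
```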

This relands #84119 with the required fixes.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84872
Approved by: https://github.com/rohan-varma
2022-09-13 21:57:49 +00:00
PyTorch MergeBot
219ff26172 Revert "Add __all__ for a few distributed modules plus a little typing (#84119)"
This reverts commit 6f21680563.

Reverted https://github.com/pytorch/pytorch/pull/84119 on behalf of https://github.com/izaitsevfb due to breaking internal builds, see D39386448
2022-09-09 20:01:07 +00:00
Rodrigo Kumpera
6f21680563 Add __all__ for a few distributed modules plus a little typing (#84119)
This handles distributed_c10d (which is massive) and ddp_comm_hooks.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/84119
Approved by: https://github.com/rohan-varma
2022-09-08 23:28:31 +00:00
Pritam Damania
64670e414e [reland] Create torch.distributed._shard package. (#72141)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72141

We have many sharding components currently:
torch.distributed._sharded_tensor, torch.distributed._sharding_spec,
torch.distributed._sharded_optimizer and more coming.

As a result, this organizes all of it under the `torch.distributed._shard`
package. For BC reasons, I'm still keeping the old packages and having them just
reference the new package.
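
A minimal sketch of the backward-compatibility shim described above, assuming the old `torch.distributed._sharded_tensor` package simply re-exports from its new home (the exact shim and warning text in the PR may differ):

```
# torch/distributed/_sharded_tensor/__init__.py (hypothetical BC shim)
import warnings

# Forward everything to the new location under torch.distributed._shard.
from torch.distributed._shard.sharded_tensor import *  # noqa: F401,F403

warnings.warn(
    "torch.distributed._sharded_tensor will be deprecated; "
    "use torch.distributed._shard.sharded_tensor instead",
    DeprecationWarning,
)
```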
ghstack-source-id: 148150861

Test Plan: waitforbuildbot

Reviewed By: fduwjj

Differential Revision: D33904585

fbshipit-source-id: 057e847eb7521b536a3ee4e0f94871aacc752062
(cherry picked from commit 29a70dd7af)
2022-02-02 06:58:20 +00:00
Nikita Shulga
34494e6252 Back out "Create torch.distributed.shard package." (#72062)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72062

Original commit changeset: dc692b31e260

Original Phabricator Diff: D33755913 (87bbcf70f7)

Test Plan: CI

Reviewed By: pbelevich

Differential Revision: D33891115

fbshipit-source-id: 37286e03d743d8691319f07c95e9561d54f3d6d0
(cherry picked from commit 0c1b3fe008)
2022-01-31 18:29:27 +00:00
Pritam Damania
87bbcf70f7 Create torch.distributed.shard package. (#71742)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71742

We have many sharding components currently:
torch.distributed._sharded_tensor, torch.distributed._sharding_spec,
torch.distributed._sharded_optimizer and more coming.

As a result, this organizes all of it under the `torch.distributed.shard`
package. For BC reasons, I'm still keeping the old packages and having them just
reference the new package.
ghstack-source-id: 147899768

Test Plan: waitforbuildbot

Reviewed By: fduwjj, wanchaol

Differential Revision: D33755913

fbshipit-source-id: dc692b31e2607063d55dfcb3db33ec53961d5a5b
(cherry picked from commit 5b6885f358)
2022-01-29 00:48:06 +00:00
Pritam Damania
c41d8290b3 Rename shard_lengths to shard_sizes to be more inline with Tensor sizes. (#66464)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66464

Dimension sizes are referred to as `size` in general in PyTorch and
hence rename shard_lengths to shard_sizes.
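
A small before/after sketch of the rename (import path and values are assumed, purely illustrative):

```
from torch.distributed._sharding_spec import ShardMetadata  # import path assumed

# Previously the second field was named shard_lengths; after this rename it
# matches Tensor size terminology (values are illustrative).
md = ShardMetadata(
    shard_offsets=[0, 0],
    shard_sizes=[5, 10],
    placement="rank:0/cuda:0",
)
```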

Closes: https://github.com/pytorch/pytorch/issues/65794
ghstack-source-id: 143866449

Test Plan: waitforbuildbot

Reviewed By: fduwjj, wanchaol

Differential Revision: D31564153

fbshipit-source-id: 6273426c4b0e079358806070d0d9644740adb257
2021-11-19 16:30:00 -08:00
Wanchao Liang
35712a8eb4 [reland] simplify init_from_local_shards API (#68021)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68021

Reland of https://github.com/pytorch/pytorch/pull/64481, as the previous one had some internal failures that weren't caught when it first landed.

This simplifies the `init_from_local_shards` API in sharded tensor to only require the user to pass in a list of `Shard`s and an `overall_size`, instead of a ShardedTensorMetadata. We do an all_gather internally to form a valid ShardedTensorMetadata instead.
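
A rough sketch of the simplified call on one rank (module path, `Shard` construction and metadata field names are assumed from the surrounding commits and may not match this revision exactly):

```
import torch
import torch.distributed._sharded_tensor as sharded_tensor  # path assumed
from torch.distributed._sharding_spec import ShardMetadata   # path assumed

# Rank 0 owns rows 0-4 of a 10 x 20 tensor; every rank builds its own Shard.
local_shard = sharded_tensor.Shard(
    tensor=torch.rand(5, 20),
    metadata=ShardMetadata(
        shard_offsets=[0, 0],
        shard_lengths=[5, 20],
        placement="rank:0/cuda:0",
    ),
)

# Only the local shards and the overall size are passed; the global
# ShardedTensorMetadata is assembled internally via an all_gather.
st = sharded_tensor.init_from_local_shards([local_shard], 10, 20)
```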

TODO: add more test cases to improve coverage.
ghstack-source-id: 143661119

Test Plan: TestShardedTensorFromLocalShards

Reviewed By: pritamdamania87

Differential Revision: D32147888

fbshipit-source-id: 897128b75224f4b9644471a04a64079f51e0d5fe
2021-11-17 23:20:37 -08:00
Junjie Wang
2766662ca9 [PyTorch][2/N] Basic implementation of ShardedEmbeddingBag using ShardedTensor. (#67188)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67188

This diff/PR implements the ShardedEmbeddingBag using the ShardedTensor.

We support both row-wise and column-wise sharding of the embedding bag. The detailed logic can be found in the code comments.

Several caveats:
1. Only the sharding of one weight is supported for now.
2. We support a limited set of input params for the op; support for more params is on the way.
3. We only support chunk sharding for now.
4. We only support a single local shard per rank for now.

Some other changes include:
1. Refactor the ShardedEmbedding code so that the common logic can be reused.
2. Fix tiny typos and corner cases in the `get_chunked_dim_size` API, where it would return -1 for dim_size = 5, split_size = 2, idx = 3. (This is a valid case because when chunks = 4 and dim_size = 5, the split_size is 2.) See the sketch below.
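
A hypothetical sketch of the corner case called out above (not the library's exact implementation):

```
# With dim_size=5 and split_size=2 there are 4 chunks of sizes 2, 2, 1, 0;
# the naive formula for the last chunk (idx=3) gives 5 - 2*3 = -1, so the
# result has to be clamped to 0.
def get_chunked_dim_size(dim_size, split_size, idx):
    return max(min(dim_size, split_size * (idx + 1)) - split_size * idx, 0)

assert get_chunked_dim_size(5, 2, 3) == 0
```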
ghstack-source-id: 142325915

Test Plan: Unit test and CI

Reviewed By: pritamdamania87

Differential Revision: D31749458

fbshipit-source-id: ed77e05e4ec94ef1a01b1feda8bbf32dc5d5da1b
2021-11-03 17:39:18 -07:00
Dmytro Ivchenko
ba74b03b0d Back out "[sharded_tensor] simplify init_from_local_shards API"
Summary: Original commit changeset: 6e97d95ffafd

Test Plan: unit test

Reviewed By: wanchaol

Differential Revision: D32023341

fbshipit-source-id: 2a9f7b637c0ff18700bcc3e44466fffcff861698
2021-10-29 14:01:07 -07:00
Wanchao Liang
71a67d0ce9 [sharded_tensor] simplify init_from_local_shards API (#64481)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64481

This simplifies the `init_from_local_shards` API in sharded tensor to only require the user to pass in a list of `Shard`s and an `overall_size`, instead of a ShardedTensorMetadata. We do an all_gather internally to form a valid ShardedTensorMetadata instead.

TODO: add more test cases to improve coverage.
ghstack-source-id: 141742350

Test Plan: TestShardedTensorFromLocalShards

Reviewed By: pritamdamania87

Differential Revision: D30748504

fbshipit-source-id: 6e97d95ffafde6b5f3970e2c2ba33b76cabd8d8a
2021-10-27 22:19:20 -07:00
Masaki Kozuki
768cfaa8f8 fix typo in _sharded_tensor (#65511)
Summary:
per title

cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse SciPioneer H-Huang gcramer23

Pull Request resolved: https://github.com/pytorch/pytorch/pull/65511

Reviewed By: albanD

Differential Revision: D31239269

Pulled By: cbalioglu

fbshipit-source-id: 602c0bf7ef96a930606d68b15a5b3cadda9d9437
2021-09-29 18:00:47 -07:00
Xing Liu
600df80296 [PT/ShardedTensor]Allow zero size local shard (#65007)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65007

Relax the shard size check in ShardMetadata to allow zero-size local shards.

When sharding a tensor across N ranks, some ranks may have an empty shard allocated. Since we assume SPMD, the ranks with empty shards still need to participate in all collectives, and we need to allow this in ShardMetadata.
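
A hedged sketch of the kind of metadata the relaxed check now accepts (import path, field names and placement format are assumed from neighboring commits):

```
from torch.distributed._sharding_spec import ShardMetadata  # path assumed

# A 4 x 10 tensor sharded row-wise across 8 ranks leaves ranks 4-7 with no
# rows; those ranks still declare a zero-size shard so that they can take
# part in the SPMD collectives.
empty_shard_md = ShardMetadata(
    shard_offsets=[4, 0],   # at the end of dim 0
    shard_lengths=[0, 10],  # zero rows on this rank
    placement="rank:7/cuda:7",
)
```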

Test Plan: Unit tests and CLI

Reviewed By: jiaqizhai, wanchaol

Differential Revision: D30926566

fbshipit-source-id: afa562c94ffa8f8d91d65ddb4c348156d871dc36
2021-09-21 09:58:54 -07:00
Pritam Damania
0dc98728bc Basic implementation of ShardedLinear using ShardedTensor. (#64128)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64128

This PR implements a sharded nn.Linear layer using ShardedTensors with
the following limitations:

1) Works only for ChunkShardingSpec.
2) The implementation is only aimed at demonstrating functionality and is most
likely not performant at all.

The PR also introduces a `shard_parameter` API to easily shard parameters of
`nn.Modules`. This also has the following limitations:

1) Works only for ChunkShardingSpec.
2) Is not performant since it uses broadcast instead of scatter, because
ProcessGroupNCCL doesn't yet support scatter.

Overall user API for running a sharded linear would be something like this:

```
# Import paths below are assumed for the pre-_shard layout used at the time of this PR.
import torch
import torch.nn as nn
from torch.distributed._sharded_tensor import shard_parameter
from torch.distributed._sharding_spec import ChunkShardingSpec

# SPMD programming paradigm running same code on all nodes.
fc = nn.Linear(10, 10)

# Setup sharding.
sharding_spec = ChunkShardingSpec(...)
shard_parameter(fc, 'weight', sharding_spec, src_rank=0)

# Run as a normal linear layer.
inp = torch.rand(10, 10)
output = fc(inp)
```
ghstack-source-id: 138500985

Test Plan:
1) unit tests.
2) waitforbuildbot

Reviewed By: wanchaol, bowangbj

Differential Revision: D30621215

fbshipit-source-id: 1aa7478568c18a4572f6c3462fdf24a4cbde01d6
2021-09-20 18:31:11 -07:00
Wanchao Liang
d431c77d76 [sharded_tensor] fix typing issue for placement (#63426)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63426

placement should be either a string or a _remote_device; this fixes the type to match the behavior
ghstack-source-id: 136041125

Reviewed By: pritamdamania87

Differential Revision: D30379702

fbshipit-source-id: 34e226494240923b433e3a39cc08c84d42cdad6b
2021-08-17 23:11:48 -07:00
Pritam Damania
b8e6144e0a Add a _RemoteDevice structure for ShardedTensor/ShardingSpec. (#62927)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62927

As part of the ShardedTensor work, we realized we do need some sort of
_RemoteDevice structure that deals with our format of "workername/device" so
that users don't have to worry about parsing this string directly.

Right now this structure is just the bare minimum and is mostly a container for
describing a remote device. It is currently only used in ShardedTensor,
ShardingSpec and RemoteModule.

Once we actually have a consolidated remote device proposal, this class can be
extended appropriately if needed.
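
A small sketch of the "workername/device" handling this structure wraps (class name, accepted string formats and methods are assumed from this and the neighboring commits):

```
from torch.distributed import _remote_device  # name as used in the later typing fix

# Parse the "workername/device" (or "rank:<n>/device") format instead of
# splitting the string by hand.
d = _remote_device("worker0/cuda:0")
print(d.worker_name(), d.device())  # worker0 cuda:0

r = _remote_device("rank:1/cpu")
print(r.rank(), r.device())         # 1 cpu
```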
ghstack-source-id: 135534086

Test Plan:
1) unit tests
2) waitforbuildbot

Reviewed By: SciPioneer

Differential Revision: D30170689

fbshipit-source-id: 1ac2e81c7a597dc40bf3fbf2c1168c382c66649f
2021-08-11 11:27:32 -07:00
Wanchao Liang
d92301dd02 [sharded_tensor] add new init_from_local_shards API (#60479)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60479

This adds an `init_from_local_shards` API to construct a ShardedTensor from local_shards and a global sharded_tensor_metadata. It also refactors the utils in ShardingSpec so they can be used by sharded_tensor for sanity checks.

Test Plan:
test_init_from_local_shards
test_init_from_local_shards_invalid_sharding

Reviewed By: pritamdamania87

Differential Revision: D29276777

fbshipit-source-id: 011c1d70426bc560a59b8d858c68f1aa12db8481
2021-07-29 22:04:13 -07:00
Pritam Damania
0222291544 Fix docs for ShardMetadata. (#61388)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61388

The doc for `placement` argument was outdated and is now fixed.
ghstack-source-id: 133184441

Test Plan: waitforbuildbot

Reviewed By: wanchaol

Differential Revision: D29601316

fbshipit-source-id: a0817f799382bf91a5192c54dfeea4d253eb0d56
2021-07-07 21:27:30 -07:00
Pritam Damania
a8430f1076 Remove PlacementSpec from ShardingSpecs. (#59990)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59990

ShardingSpecs accepted a Device/PlacementSpec and were initially
written this way for flexibility. However, it is slightly confusing given
there is no general use case for this. As a result, to keep things simple, I've
ensured that both specs only accept devices for now.

We can always extend this to include a general PlacementSpec later on.
ghstack-source-id: 131842525

Test Plan: waitforbuildbot

Reviewed By: SciPioneer, rohan-varma

Differential Revision: D29116463

fbshipit-source-id: a6f2b3f1346ac6afab91c9595d4cae4f4da04fda
2021-06-18 17:37:43 -07:00
Pritam Damania
f11120967e Support EnumerableShardingSpec in ShardedTensor. (#59061)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59061

Overall Design: https://github.com/pytorch/pytorch/issues/55207

This PR builds upon https://github.com/pytorch/pytorch/pull/58517 and
https://github.com/pytorch/pytorch/pull/57409 to support creating a
ShardedTensor using EnumerableShardingSpec.
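
A rough sketch of an enumerated layout of the kind this enables (import path and field names are assumed for this era of the code; a later commit renames shard_lengths to shard_sizes):

```
from torch.distributed._sharding_spec import (  # path assumed
    EnumerableShardingSpec,
    ShardMetadata,
)

# Explicitly enumerate a 2 x 2 grid of shards for a 10 x 10 tensor.
spec = EnumerableShardingSpec([
    ShardMetadata(shard_offsets=[0, 0], shard_lengths=[5, 5], placement="rank:0/cuda:0"),
    ShardMetadata(shard_offsets=[0, 5], shard_lengths=[5, 5], placement="rank:1/cuda:1"),
    ShardMetadata(shard_offsets=[5, 0], shard_lengths=[5, 5], placement="rank:2/cuda:2"),
    ShardMetadata(shard_offsets=[5, 5], shard_lengths=[5, 5], placement="rank:3/cuda:3"),
])
```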
ghstack-source-id: 130780376

Test Plan:
1) unit tests
2) waitforbuildbot

Reviewed By: SciPioneer

Differential Revision: D28734551

fbshipit-source-id: 656f5f2b22041dae071bc475f19fe94c969716e8
2021-06-09 23:21:14 -07:00
Pritam Damania
40f851c53e Use dataclasses to simplify ShardingSpec (#58893)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58893

Leverage dataclasses to simplify some of the ShardingSpec classes.
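
Illustrative only: the kind of dataclass-based definition this refactor moves toward, not the exact classes from the PR:

```
from dataclasses import dataclass, field
from typing import List

@dataclass
class ShardMetadata:
    """Offsets and lengths of a single shard, plus where it is placed."""
    shard_offsets: List[int] = field(default_factory=list)
    shard_lengths: List[int] = field(default_factory=list)
    placement: str = ""
```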
ghstack-source-id: 130041687

Test Plan: waitforbuildbot

Reviewed By: SciPioneer

Differential Revision: D28665137

fbshipit-source-id: da37517cf2bd8c65d4a5b7cae171fa460e6b0946
2021-05-27 17:33:28 -07:00
Pritam Damania
b420ded66f ShardedTensor framework for ChunkedShardingSpec (#58517)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58517

Building upon the sharding specifications, this PR introduces the
initial skeleton of ShardedTensor and allows building a ShardedTensor by
specifying ChunkedShardingSpec.

In follow up PRs, I'll add further support for GenericShardingSpec.
ghstack-source-id: 129917841

Test Plan:
1) unit tests.
2) waitforbuildbot

Reviewed By: SciPioneer

Differential Revision: D28526012

fbshipit-source-id: 8e62847b58957d284e40f57a644302c171289138
2021-05-26 13:24:23 -07:00
Pritam Damania
4709fdb117 Add GenericShardingSpec for generic tensor sharding. (#57409)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57409

Full design: https://github.com/pytorch/pytorch/issues/55207

In https://github.com/pytorch/pytorch/issues/55207, we proposed
`MeshShardingSpec` as a generic sharding mechanism. However, that proposal does
not provide the flexibility to specify shards which have uneven
sizes/partitions and assumes even partitioning. Uneven partitioning is one of
the requirements of an internal use case.

As a result, we instead introduce a `GenericShardingSpec` which allows
specifying any arbitrary partitioning of a multi-dimensional tensor. Basically,
it specifies the start offsets of each shard and the length of each dim of the
shard, allowing for greater flexibility.
ghstack-source-id: 129604155

Test Plan:
1) unit tests
2) waitforbuildbot

Reviewed By: SciPioneer

Differential Revision: D28137616

fbshipit-source-id: 61255762485fb8fa3ec3a43c27bbb222ca25abff
2021-05-23 16:06:05 -07:00
Pritam Damania
0d6fa1adc5 Introduce ChunkShardingSpec as a model sharding specification. (#55728)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55728

Full design: https://github.com/pytorch/pytorch/issues/55207

This PR introduces ChunkShardingSpec (SingleShardingSpec in the design). Used
the name ChunkShardingSpec since it is very similar to `torch.chunk` in terms
of how a Tensor is split up, and it feels clearer than SingleShardingSpec.
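
A short sketch of the spec (placement strings and import path are assumed from the surrounding commits):

```
from torch.distributed._sharding_spec import ChunkShardingSpec  # path assumed

# Chunk dimension 0 across four GPUs, analogous to torch.chunk(tensor, 4, dim=0).
spec = ChunkShardingSpec(
    dim=0,
    placements=[
        "rank:0/cuda:0",
        "rank:1/cuda:1",
        "rank:2/cuda:2",
        "rank:3/cuda:3",
    ],
)
```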
ghstack-source-id: 129603318

Test Plan: waitforbuildbot

Reviewed By: SciPioneer

Differential Revision: D27694108

fbshipit-source-id: c8764abe6a4d5fc56d023fda29b74b5af2a73b49
2021-05-23 16:04:57 -07:00