`gather_object` is problematic when used with Tensors: they can be unpickled onto the wrong device, leading to deadlocks or spurious failures.
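To illustrate the failure mode, here is a torch-free sketch of why pickled tensors resurrect on their original device. `FakeTensor` is a hypothetical stand-in for `torch.Tensor`; the usual caller-side mitigation is to move tensors to CPU before passing them to `gather_object`, or to use tensor collectives instead of object collectives.

```python
import pickle

# FakeTensor mimics the relevant behavior: the pickled payload carries
# the sender's device string, so the receiver reconstructs the object
# pointing at a device it may not have.
class FakeTensor:
    def __init__(self, data, device):
        self.data = data
        self.device = device

# Rank 3 pickles a tensor living on its local GPU.
payload = pickle.dumps(FakeTensor([1.0, 2.0], device="cuda:3"))

# Rank 0 unpickles it: the object still claims device "cuda:3", which
# may not exist on rank 0 -> wrong-device access, hang, or failure.
received = pickle.loads(payload)
print(received.device)  # "cuda:3", regardless of the receiver's device

# Mitigation sketch: normalize to CPU before pickling/gathering.
safe_payload = pickle.dumps(FakeTensor([1.0, 2.0], device="cpu"))
```

This is only a model of the pickling mechanism, not the actual `torch.distributed` code path.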
This change introduces an RPC workaround for EFA when initializing TensorPipe, until the issue is properly addressed upstream.
Fixes #73935
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77272
Approved by: https://github.com/pritamdamania87
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72130
1. Refactor ShardingSpec: decouple PlacementSpec and ShardingSpec, as they are essentially two separate concepts.
2. Introduce a customizable ShardingSpec. With the help of two APIs, users can inherit from ShardingSpec and define their own customized sharding spec:
* `ShardingSpec.build_metadata`: takes a tensor shape, defines how a tensor of that shape is sharded across ranks, and returns a `ShardedTensorMetadata` describing the layout.
* `ShardingSpec.shard`: defines how to shard a tensor into a `ShardedTensor`.
3. Refactor `ShardedTensor.__init__` and `shard_parameter` to take the new ShardingSpec, enabling both APIs to support ChunkShardingSpec and EnumerableShardingSpec.
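The two-method contract above can be sketched without torch. `ShardMetadata`, `ShardedTensorMetadata`, and `RowwiseShardingSpec` below are simplified, hypothetical stand-ins for the real `torch.distributed` classes (the tensor is modeled as a nested list); the point is only the shape of `build_metadata` and `shard` that a custom spec must provide.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class ShardMetadata:
    shard_offsets: List[int]  # start index of this shard in each dim
    shard_sizes: List[int]    # extent of this shard in each dim
    placement: str            # e.g. "rank:0/cpu"

@dataclass
class ShardedTensorMetadata:
    shards_metadata: List[ShardMetadata]
    size: Tuple[int, ...]     # global tensor shape

class RowwiseShardingSpec:
    """Hypothetical custom spec: split dim 0 evenly across ranks."""

    def __init__(self, world_size: int):
        self.world_size = world_size

    def build_metadata(self, tensor_shape: Tuple[int, ...]) -> ShardedTensorMetadata:
        # Describe, without touching data, where each shard lives.
        rows = tensor_shape[0]
        chunk = (rows + self.world_size - 1) // self.world_size
        shards = []
        for rank in range(self.world_size):
            start = rank * chunk
            size = max(0, min(chunk, rows - start))
            shards.append(ShardMetadata(
                shard_offsets=[start] + [0] * (len(tensor_shape) - 1),
                shard_sizes=[size] + list(tensor_shape[1:]),
                placement=f"rank:{rank}/cpu",
            ))
        return ShardedTensorMetadata(shards, tensor_shape)

    def shard(self, tensor, rank: int):
        # Materialize this rank's local shard from the metadata.
        meta = self.build_metadata((len(tensor), len(tensor[0])))
        sm = meta.shards_metadata[rank]
        start = sm.shard_offsets[0]
        return tensor[start:start + sm.shard_sizes[0]]

spec = RowwiseShardingSpec(world_size=2)
meta = spec.build_metadata((4, 3))
```

A user-defined spec in the real API follows the same split: `build_metadata` plans the layout, `shard` performs it.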
ghstack-source-id: 149788833
Test Plan: wait for ci
Reviewed By: fduwjj
Differential Revision: D33923403
fbshipit-source-id: 3236beec8543da651dfd89c32b6968745c59301e
(cherry picked from commit 5994b33a7a6ad96b1fad2e121c6bdd83a877346e)