Summary:
Rename _device_mesh.py to device_mesh.py, update all callsites, add documentation.
We created stubs for the public class and methods in torch.distributed.device_mesh so that torch.distributed.device_mesh can be imported regardless of whether distributed is_available().
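A rough sketch of the guard pattern (not the actual PyTorch source; the stub body below is made up, only to illustrate why the module stays importable either way):

```python
# Sketch of the guard inside a module like torch/distributed/device_mesh.py:
# when distributed is not available, the module defines importable stubs
# instead of the real implementation, so the import itself never fails.
import torch

if not torch.distributed.is_available():
    class DeviceMesh:  # stub: importable, but unusable without distributed
        def __init__(self, *args, **kwargs):
            raise RuntimeError("DeviceMesh requires torch.distributed to be available")
else:
    ...  # real DeviceMesh implementation goes here
```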
Original diff reverted: D51629761
Original PR reverted: https://github.com/pytorch/pytorch/pull/115099
Prior to landing, all CI signals had passed. Shipit added the "ci/trunk" label to the PR but DID NOT wait for it and went ahead with the commit. More context can be found in the reverted PR above.
Test Plan: CI.
Differential Revision: D51861018
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115193
Approved by: https://github.com/fegin
Summary:
Rename _device_mesh.py to device_mesh.py, update all callsites, add documentation.
Original diff reverted: D51629761
Original PR reverted: https://github.com/pytorch/pytorch/pull/114991
It was failing a public module binding test on macOS, due to the change in import order in torch/distributed/fsdp/_common_utils.py. Since the original import would still work, we removed the changes to this file.
Test Plan: CI.
Differential Revision: D51825114
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115099
Approved by: https://github.com/wanchaol, https://github.com/fegin
This PR removes four usages of compute_local_offset() in the PyTorch repo and replaces them with the new API compute_local_shape_and_global_offset().
We will remove the compute_local_offset() API in the next diff, as there are still usages of it internally.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108547
Approved by: https://github.com/wanchaol
The compute_local_shape_and_global_offset API does the following:
1) Calculates both local_shape and global_offset in one call, replacing two separate API calls (compute_local_size and compute_local_shape).
2) Generates the correct global_offset for checkpointing purposes. We are currently using compute_local_offset for downstream checkpoint components, which could lead to incorrect results. For checkpointing, we need global_offset instead of local_offset, and the two are not equal when a dimension is sharded multiple times on different mesh dimensions (e.g. placements = [Shard(0), Shard(0)]); see the sketch below.
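To illustrate point 2, here is a plain-Python sketch (no process group needed) of the offsets for a hypothetical 1-D tensor of length 8 on a 2x2 mesh with placements = [Shard(0), Shard(0)]; the sizes are made up, chosen only to show the mismatch:

```python
global_len = 8
mesh_shape = (2, 2)  # mesh dim 0 and mesh dim 1 both shard tensor dim 0

for i in range(mesh_shape[0]):      # coordinate on mesh dim 0
    for j in range(mesh_shape[1]):  # coordinate on mesh dim 1
        after_first = global_len // mesh_shape[0]        # 4 elements after the first shard
        local_len = after_first // mesh_shape[1]         # 2 elements per rank
        local_offset = j * local_len                     # offset within the 4-element chunk
        global_offset = i * after_first + j * local_len  # offset within the full tensor
        print(f"mesh coord ({i},{j}): local_shape=({local_len},), "
              f"local_offset={local_offset}, global_offset={global_offset}")
```

On mesh coordinate (1, 1) the shard starts at global index 6, while the offset produced by the second sharding alone is 2, so a checkpoint written with the local offset would land in the wrong slice of the global tensor.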
Follow-up PRs:
1) Replace related downstream components to use compute_local_shape_and_global_offset instead of compute_local_size and compute_local_offset.
2) Audit the existing code base to see whether compute_local_size and compute_local_offset can be removed, since they are still in use.
cc. @wanchaol
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107996
Approved by: https://github.com/wanchaol
As functional collectives are being updated, using tensor_split() as the underlying sharding algorithm would require padding and unpadding on multiple ranks. Therefore, we are changing the sharding algorithm to be in line with ``torch.chunk()``, so that in most scenarios padding is only needed on the last two ranks.
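For reference, a small example of the two split behaviors (the tensor length and number of ranks below are arbitrary, chosen only to show the difference):

```python
import torch

t = torch.arange(10)
# tensor_split balances the remainder across ranks: sizes [3, 3, 2, 2]
print([c.numel() for c in torch.tensor_split(t, 4)])
# chunk keeps full-size chunks up front, so only the trailing ranks are
# short and only they would need padding: sizes [3, 3, 3, 1]
print([c.numel() for c in torch.chunk(t, 4)])
```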
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98722
Approved by: https://github.com/wanchaol
Summary:
Implement a zeros() function in the DTensor API.
- the user specifies the shape of the zeros tensor, and the function creates the local zero tensor on each rank given the placement information (see the usage sketch below)
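A minimal usage sketch, assuming the factory is exposed as torch.distributed._tensor.zeros and that the script is launched with torchrun (module path and argument names may differ by version):

```python
# Hedged sketch: run with e.g. `torchrun --nproc-per-node 4 this_script.py`.
import torch.distributed as dist
from torch.distributed._tensor import DeviceMesh, Shard, zeros

dist.init_process_group("gloo")
mesh = DeviceMesh("cpu", list(range(dist.get_world_size())))

# Each rank only allocates its local shard of the global (8, 4) zeros tensor.
dt = zeros(8, 4, device_mesh=mesh, placements=[Shard(0)])
print(dist.get_rank(), dt.to_local().shape)  # e.g. torch.Size([2, 4]) on 4 ranks

dist.destroy_process_group()
```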
Test Plan:
- unit test for the util function compute_local_tensor_size
- unit test for _tensor.zeros
Reviewed By: wanchaol
Differential Revision: D43630718
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95863
Approved by: https://github.com/wanchaol