pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
Tom Ritchford	d25e6e623f	Fix unused Python variables in test/[a-d]* (#134665 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/134665 Approved by: https://github.com/albanD	2024-12-13 22:13:12 +00:00
hippocookie	9529d018e9	Refactor offset logic and work for nD (#135861 ) Optimize TODO task in code in distributed test files. - TODO: make this test cleaner and work for nD - TODO: add comments for create_plan/TestDedupTensor Pull Request resolved: https://github.com/pytorch/pytorch/pull/135861 Approved by: https://github.com/wz337	2024-09-27 06:13:06 +00:00
Xilun Wu	de8a8653c0	[dtensor][BE] replace compute_local_shape with compute_local_shape_and_global_offset (#135554 ) Summary 1. This PR removes the public API `compute_local_shape` and replace its use with the more general API `compute_local_shape_and_global_offset`. 2. To keep `compute_local_shape_and_global_offset` consistent with `compute_local_shape` on empty shards, it now returns local tensor shape `(0,)` for empty shards which is more aligned with DTensor's semantics on non-participating ranks. Test `pytest test/distributed/_tensor/test_dtensor.py` `pytest test/distributed/_tensor/test_init.py` `pytest test/distributed/_tensor/test_tensor_ops.py` Differential Revision: [D62415591](https://our.internmc.facebook.com/intern/diff/D62415591) Pull Request resolved: https://github.com/pytorch/pytorch/pull/135554 Approved by: https://github.com/tianyu-l, https://github.com/wz337	2024-09-12 06:30:09 +00:00
Wanchao Liang	cfc227ad43	[reland][dtensor] move DTensor to public namespace (#134203 ) reland of https://github.com/pytorch/pytorch/pull/133113 I have to create a new PR because the previous reverted PR could not either be rebased, or imported successfully :( ---- Moving DTensor to be in the public namespace, to formally add the documentation page that includes all the public APIs. This includes: * many path renames and path import fixes * a dedicated doc page without too much content yet (adding in the next PRs) * To preserve the BC for users still using the torch.distributed._tensor, I added a shim script to redirect old path calls to the new module The BC preserving is evidented by the fact that all DTensor tests are still working without changing the public imports. So it's safe to land the changes Pull Request resolved: https://github.com/pytorch/pytorch/pull/134203 Approved by: https://github.com/tianyu-l	2024-09-08 17:08:40 +00:00
Xilun Wu	ad0ce89050	[3/N][dtensor] Strided Sharding offset calculation util (#132391 ) Summary 1. change `compute_local_shape_and_global_offset` to correctly compute shape and offset for strided sharding placement (currently it only handles 2D and some 3D+ sharding). 2. Add a new property `num_shards_map` to `DTensorSpec` denoting how many shards each tensor dimension has. This is necessary for constructing `_StridedShard` placement when we call `distribute_tensor(dtensor_tp, dp_device_mesh, [Shard(0)])` and the `split_factor` argument will just be the number of shards on that sharding tensor dim. Test `test/distributed/_tensor/test_utils.py` Pull Request resolved: https://github.com/pytorch/pytorch/pull/132391 Approved by: https://github.com/wanchaol ghstack dependencies: #126697, #130239	2024-08-07 18:17:12 +00:00
Xilun Wu	0b0c660c02	[2/N][dtensor] Strided Sharding shard_to_replicate (#130239 ) Summary This PR adds the necessary util function to `_StridedShard` for correct shard-to-replicate resharding. Test `pytest test/distributed/_tensor/test_utils.py -s -k strided_sharding` `pytest test/distributed/_tensor/test_utils.py -s -k test_fsdp2_tp_2d_dtensor_local_shards_and_offsets` Pull Request resolved: https://github.com/pytorch/pytorch/pull/130239 Approved by: https://github.com/wanchaol ghstack dependencies: #126697	2024-08-07 18:17:06 +00:00
Xilun Wu	92a17f454a	[1/N][dtensor] introduce StridedShard placement type and _split_tensor() logic (#126697 ) Summary This PR adds a new private placement type `_StridedShard` for FSDP2 + TP style tensor sharding. The previously used `Shard` placement type cannot produce correct `full_tensor()` result because it assumes the tensor to be first sharded over `dp` mesh dimension then `tp` mesh dimension which does not hold true in FSDP2 + TP case. Test `pytest test/distributed/_tensor/test_utils.py -s -k strided_sharding` Pull Request resolved: https://github.com/pytorch/pytorch/pull/126697 Approved by: https://github.com/wanchaol	2024-08-07 18:17:02 +00:00
Xuehai Pan	db3290846e	[BE][Easy][10/19] enforce style for empty lines in import segments in `test/d*/` (#129761 ) See https://github.com/pytorch/pytorch/pull/129751#issue-2380881501. Most changes are auto-generated by linter. You can review these PRs via: ```bash git diff --ignore-all-space --ignore-blank-lines HEAD~1 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/129761 Approved by: https://github.com/fegin	2024-07-17 16:57:39 +00:00
Wanchao Liang	4f87f47ea1	[dtensor] reuse DTensorSpec as much as possible (#128112 ) as titled, given that our DTensorSpec is immutable, we can always reuse the spec if the input/output have the same tensor metadata. this helps two fold: 1. We don't need to re-calculate the hash everytime we produce a DTensorSpec, reduce runtime operator overhead 2. reduce the DTensor construction overhead. Some local benchmark on a 800 parameter clip_grad_norm shows that for foreach_norm the CPU overhead reduces from 11ms -> 7.8ms (around 30% improvement) Pull Request resolved: https://github.com/pytorch/pytorch/pull/128112 Approved by: https://github.com/awgu	2024-06-06 16:55:50 +00:00
Wanchao Liang	ff061baa94	[comm_mode] adding some initial c10d ops to CommDebugMode (#125475 ) looks like we can make it work :) Pull Request resolved: https://github.com/pytorch/pytorch/pull/125475 Approved by: https://github.com/awgu	2024-05-04 04:20:46 +00:00
Iris Z	9fed2e826b	[DTensor][Test] Add unit tests to keep track of DTensor sharding for 2D (#123687 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/123687 Approved by: https://github.com/wanchaol	2024-04-18 03:29:16 +00:00
Iris Zhang (PyTorch)	23fa9621e4	[DeviceMesh] Rename _device_mesh.py to device_mesh.py to prepare for beta (#115099 ) (#115193 ) Summary: Rename _device_mesh.py to device_mesh.py, update all callsites, add documentation. We created stubs for public class and methods in torch.distributed.device_mesh so that torch.distributed.device_mesh can be imported with or without distributed is available(). Original diff reverted: D51629761 Original PR reverted: https://github.com/pytorch/pytorch/pull/115099 Prior to landing, CI signals are all passed. Shipit added the "ci/trunk" label to the PR and DID NOT wait for it and went ahead committing. More context can be found in the reverted PR above. Test Plan: CI. Differential Revision: D51861018 Pull Request resolved: https://github.com/pytorch/pytorch/pull/115193 Approved by: https://github.com/fegin	2023-12-08 08:44:32 +00:00
wz337	dacf5d6e92	[DTensor] Remove assert to allow tensor sharding dimension < Shard(x).ndim (#115114 ) Consolidated by changes made by @yoyoyocmu. https://www.internalfb.com/diff/D51821717 Remove assert to allow tensor dimension < Shard(x).ndim. With the current padding, we do support this already. Follow up: we will still need to fix the size mismatch and `full_tensor()` hang when tensor is uneven-sharded. Created issue here: https://github.com/pytorch/pytorch/issues/115310 Pull Request resolved: https://github.com/pytorch/pytorch/pull/115114 Approved by: https://github.com/yoyoyocmu, https://github.com/wanchaol	2023-12-07 21:57:30 +00:00
Nikita Shulga	a827ac71f2	Revert "[DeviceMesh] Rename _device_mesh.py to device_mesh.py to prepare for beta (#115099 )" This reverts commit `eaa64339d6`.	2023-12-05 08:59:36 -08:00
Iris Zhang (PyTorch)	eaa64339d6	[DeviceMesh] Rename _device_mesh.py to device_mesh.py to prepare for beta (#115099 ) Summary: Rename _device_mesh.py to device_mesh.py, update all callsites, adds documentation. Original diff reverted: D51629761 Original PR reverted: https://github.com/pytorch/pytorch/pull/114991 It was failing because failing a public module binding tests in MacOS, and this is due to the change in import order for torch/distributed/fsdp/_common_utils.py. Since this original import would still work, we remove the changes in this file. Test Plan: CI. Differential Revision: D51825114 Pull Request resolved: https://github.com/pytorch/pytorch/pull/115099 Approved by: https://github.com/wanchaol, https://github.com/fegin	2023-12-05 05:44:52 +00:00
PyTorch MergeBot	3a2e2044cd	Revert "[DeviceMesh] Rename _device_mesh.py to device_mesh.py to prepare for beta (#114710 ) (#114991 )" This reverts commit `729ac7317a`. Reverted https://github.com/pytorch/pytorch/pull/114991 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/114991#issuecomment-1837214567))	2023-12-02 17:55:51 +00:00
Iris Zhang (PyTorch)	729ac7317a	[DeviceMesh] Rename _device_mesh.py to device_mesh.py to prepare for beta (#114710 ) (#114991 ) Summary: Same content of changes as https://github.com/pytorch/pytorch/pull/114710 Rename _device_mesh.py to device_mesh.py, update all callsites, adds documentation. ghstack-source-id: 208980207 exported-using-ghexport Test Plan: CI. Reviewed By: wanchaol Differential Revision: D51629761 Pull Request resolved: https://github.com/pytorch/pytorch/pull/114991 Approved by: https://github.com/wanchaol, https://github.com/fduwjj, https://github.com/fegin	2023-12-02 04:39:41 +00:00
wz337	49aa8d19dd	[DTensor] Replace usage of compute_local_offset by compute_local_shape_and_global_offset (#108547 ) This PR removes four usages of compute_local_offset() in PyTorch repo and replaces it with the new API compute_local_shape_and_global_offset(). We will be removing compute_local_offset() API in the next diff, as there are usages internally. Pull Request resolved: https://github.com/pytorch/pytorch/pull/108547 Approved by: https://github.com/wanchaol	2023-09-06 04:53:44 +00:00
wz337	13e4cce83c	[DTensor] Add util API to compute_local_shape_and_global_offset for checkpointing purpose (#107996 ) The compute_local_shape_and_global_offset API does the following: 1) Calculate both local_shape and global_offset in one API to replace two API calls (compute_local_size and compute_local_shape). 2) Generate the correct global_offset for checkpointing purposes. We are currently using compute_local_offset for downstream checkpoint components, which could lead to incorrect results. For checkpointing, we need global_offset instead of local_offset. In some cases, global_offset does not equal to local_offset, when a dimension is sharded multipe times on different mesh dimension (e.g. placements = [Shard(0), Shard(0)]). Follow-up PRs: 1) Replace related downstream components to use compute_local_shape_and_global_offset instead of compute_local_size and compute_local_offset. 2) Audit existing code base to see if we can remove compute_local_size and compute_local_offset, since they are currently being used. cc. @wanchaol Pull Request resolved: https://github.com/pytorch/pytorch/pull/107996 Approved by: https://github.com/wanchaol	2023-08-30 02:46:50 +00:00
Iris	0d2b55c459	[DTensor] Change Sharding algorithm to be in line with ``torch.chunk()`` (#98722 ) As functional collective being updated, using tensor_split() as the underlying sharding algorithm would require padding and unpadding on multiple ranks. Therefore, we are changing the sharding algorithm to be in line with ``torch.chunk()`` to allow padding on the last two ranks in most of the scenarios. Pull Request resolved: https://github.com/pytorch/pytorch/pull/98722 Approved by: https://github.com/wanchaol	2023-04-21 02:05:22 +00:00
Shen Li	02179827cb	[Easy] Include SPMD and DTensor files in UFMT checks (#98148 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/98148 Approved by: https://github.com/fegin	2023-04-02 15:34:49 +00:00
Wanchao Liang	789fc4c292	[dtensor] refactor shape/offset calculation (#95923 ) Shape offset calculation is commonly used and extract them into a separate util Pull Request resolved: https://github.com/pytorch/pytorch/pull/95923 Approved by: https://github.com/fduwjj	2023-03-05 06:33:32 +00:00
Ke Sang	6c061e5145	[DTensor][Shampoo] add _tenso.zero function (#95863 ) Summary: implement zeros function inside DTensor API - user specify the zeros tensor shape, and the function will create local zero tensor given the placement information Test Plan: {F889157756} - unit test for util function for compute_local_tensor_size - unit test for _tensor.zeros Reviewed By: wanchaol Differential Revision: D43630718 Pull Request resolved: https://github.com/pytorch/pytorch/pull/95863 Approved by: https://github.com/wanchaol	2023-03-03 19:36:44 +00:00

23 Commits