Commit Graph

49 Commits

Author SHA1 Message Date
wz337
8140494afd [3/N][2D] Enable training with new 2D flow (#110034)
Replaces https://github.com/pytorch/pytorch/pull/109553, which was reverted.

This PR enables training with the new 2D flow and adds an associated test. In addition, it moves the FSDP-specific parts of tensor/parallel/_data_parallel_utils.py back to tensor/parallel/fsdp.py to avoid a circular dependency for ddp.py and test/distributed/tensor/parallel/test_ddp_2d_parallel.py.

state_dict-related changes will come in later PRs.

cc. @fegin, @fduwjj, @wanchaol, @awgu
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110034
Approved by: https://github.com/fduwjj
2023-09-26 09:14:15 +00:00
PyTorch MergeBot
f5886bf352 Revert "[3/N][2D] Enable training with new 2D flow (#109553)"
This reverts commit 217b37c023.

Reverted https://github.com/pytorch/pytorch/pull/109553 on behalf of https://github.com/huydhn due to Sorry for reverting your change, but those distributed failures look legit and they are failing in trunk https://hud.pytorch.org/pr/109553 ([comment](https://github.com/pytorch/pytorch/pull/109553#issuecomment-1734100546))
2023-09-25 16:37:19 +00:00
wz337
217b37c023 [3/N][2D] Enable training with new 2D flow (#109553)
This PR enables training with the new 2D flow and adds an associated test.

state_dict-related changes will come in later PRs.

cc. @fegin, @fduwjj, @wanchaol, @awgu
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109553
Approved by: https://github.com/fegin, https://github.com/awgu
2023-09-25 05:32:07 +00:00
Chien-Chin Huang
1b3e5b53f3 [FSDP][optim_state_dict] Add device to _shard_utils.py to explicitly use the device from fsdp_state (#109631)
_get_pg_default_device does not always return the device we want. This PR lets the caller explicitly pass the correct device.

Differential Revision: [D49425743](https://our.internmc.facebook.com/intern/diff/D49425743/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109631
Approved by: https://github.com/awgu, https://github.com/fduwjj, https://github.com/wz337
2023-09-20 01:59:38 +00:00
Wanchao Liang
a29b9101fa [dynamo] fix dynamo + DTensor to work with 2d (#108329)
Pair-debugged with @wconstab, and we found issues on both the dynamo side and the TP FSDP-extension side. This PR fixes the dynamo + DTensor integration so that the current graph-break FSDP can work with tensor parallel, by moving torch.compile after the FSDP wrapping.
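
As a rough sketch of the ordering this relies on (illustrative only; the function name and the already-parallelized `tp_model` are assumptions, not code from the PR's test):

```python
import torch
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def wrap_then_compile(tp_model: torch.nn.Module) -> torch.nn.Module:
    # `tp_model` is assumed to already be tensor-parallelized.
    # The key point: wrap with FSDP first, then apply torch.compile,
    # so dynamo traces the FSDP-wrapped module.
    fsdp_model = FSDP(tp_model)
    return torch.compile(fsdp_model)
```
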

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108329
Approved by: https://github.com/Skylion007, https://github.com/wconstab
2023-08-31 22:46:26 +00:00
fduwjj
3828cd4b79 [TP][EZ] Update doc for TP parallel style (#107819)
We need to update the docs for PairwiseParallel and SequenceParallel so that users don't get the wrong impression that these work for ``nn.Transformer``.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107819
Approved by: https://github.com/awgu, https://github.com/wanchaol
2023-08-24 00:13:52 +00:00
Wanchao Liang
da765995fb [2d] remove ShardedTensor from fsdp extension (#107472)
2D Parallel won't use ShardedTensor, and it is a headache for dynamo to recognize it, so remove it from the runtime flatten/unflatten path.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107472
Approved by: https://github.com/fduwjj
2023-08-21 17:16:07 +00:00
Wanchao Liang
d8f2ef10a6 [dtensor][1/n] refactor op dispatch logic to reduce overhead (#107305)
This PR is the first change in a series of refactors to the op dispatch logic to:
1. remove the redundant logic in op dispatch and simplify the error checking
2. reduce the number of tree_map/tree_flatten/unflatten calls, cutting the overhead coming from those operations
3. remove the CachedShardingPropagator by using lru_cache from functools directly; this not only helps TP, but also makes general DTensor operations faster (see the sketch after this list)
4. change the view ops' behavior of in-place mutating the op_schema, which is dangerous for sharding-prop caching; model view ops as one type of resharding too
5. enrich the output sharding to include whether the op needs a redistribute, so that we don't need an explicit op schema comparison to know it
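
A minimal sketch of the lru_cache idea from point 3 (a hypothetical, simplified propagator, not the actual DTensor internals):

```python
from functools import lru_cache

class ToyShardingPropagator:
    """Hypothetical stand-in for DTensor's sharding propagator."""

    @lru_cache(maxsize=None)
    def propagate_op_sharding(self, op_schema: tuple) -> tuple:
        # lru_cache requires hashable arguments, which is why the refactor also
        # needs immutable (tuple-based) op schemas. A real implementation would
        # run the op's sharding rule here; this sketch just echoes the schema.
        return op_schema
```
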

This should help further reduce the CPU overhead. Benchmark results:
before (without this change), aten.addmm latency: 0.476ms
![Screenshot 2023-08-16 at 10 46 26 AM](https://github.com/pytorch/pytorch/assets/9443650/7692e6c1-1936-4c7f-bf9c-6c8c9b8f6c76)

after (with this change), aten.addmm latency: 0.341ms
![Screenshot 2023-08-16 at 11 05 49 AM](https://github.com/pytorch/pytorch/assets/9443650/15a53f0b-7a95-444e-ab2f-3ee0ad2fa47f)

Overall, the time for one MLP layer dropped from 13.535 ms to 9.665 ms.

Apart from the overhead reduction, this PR simplifies the op dispatching logic and the resharding logic (more refactoring is needed to make things cleaner, which will be done in later PRs).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107305
Approved by: https://github.com/fduwjj
2023-08-18 18:30:46 +00:00
fduwjj
983fd5ba79 [2D][TP] Enable DDP TP integration with unit test (#106583)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106583
Approved by: https://github.com/kumpera, https://github.com/fegin, https://github.com/wanchaol
ghstack dependencies: #107313
2023-08-17 02:54:17 +00:00
fduwjj
f3b0d83fe3 [EZ][TP] Refactor FSDP 2D integration extension code so that it can re-used (#107313)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107313
Approved by: https://github.com/wz337
2023-08-16 22:01:17 +00:00
fduwjj
4a6ca4cc05 [TP][DTensor Perf] Some perf improvement to reduce DTensor CPU overhead (#106524)
By inspecting a small TP benchmark, we found a couple of things we can optimize:
1. We call deep_copy many times when we initialize DTensor.
2. Some sharding_prop results are not cached successfully.
3. We still call redistribute when it is not necessary.

![image](https://github.com/pytorch/pytorch/assets/6937752/b847d110-eea1-45df-9298-066d0ba07dd7)

![image](https://github.com/pytorch/pytorch/assets/6937752/fc08f564-caed-496b-80d7-275c1dba3806)

![image](https://github.com/pytorch/pytorch/assets/6937752/fdc06cc4-a4ba-48e8-a118-c041bbd04f5e)

So we want to:
1. Remove the deep_copy, and make placements a tuple so we are sure it is immutable.
2. Somehow the op_schema gets changed during sharding propagation, so we store a hashed version of it before passing it to sharding_prop. Ideally we want to figure out why `op_schema` gets changed; it looks like both the index and the detach/view ops change it, so it will take more time to debug.
3. Also, when we hash the op_schema, we want to hash the entire args_schema, not just the args_spec, which only contains the DTensorSpec of the args that are DTensors.
4. It turns out that DTensor sometimes has mem_format set to None (not contiguous), and this triggers a redistribute, so we only compare type/shape and stride in the metadata.

Also, we need to ensure _Partial and Shard have different hash values in the DTensorSpec.
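
A rough sketch of the immutability/hashing idea (a toy frozen dataclass, not the real DTensorSpec):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ToySpec:
    # Frozen dataclass: placements stored as a tuple, so no defensive deep_copy
    # is needed, and the spec is hashable for sharding-prop caching.
    placements: tuple   # e.g. ("Shard(0)",) vs ("_Partial",) must hash differently
    shape: tuple
    stride: tuple
    dtype: str

a = ToySpec(("Shard(0)",), (4, 4), (4, 1), "float32")
b = ToySpec(("Shard(0)",), (4, 4), (4, 1), "float32")
assert hash(a) == hash(b)  # equal specs hash equally, so cached results are reused
```
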

![image](https://github.com/pytorch/pytorch/assets/6937752/321e6890-1ab6-4975-adc9-524c6ef9a76b)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106524
Approved by: https://github.com/wanchaol
2023-08-14 20:03:19 +00:00
alanhe151220037
1afbc985fe Make RNGStateTracker support cuda-like device (#106771)
Replace `CudaRNGStateTracker` with `RNGStateTracker` by rewriting some CUDA-binding code to go through a `device_handle`.
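
A minimal sketch of the device-handle idea (class and method names here are assumptions, not the actual implementation):

```python
import torch

class ToyRNGStateTracker:
    """Go through a device module handle instead of hard-coding torch.cuda."""

    def __init__(self, device_type: str = "cuda"):
        self._device_type = device_type
        # getattr keeps this generic across cuda-like backends (cuda, xpu, ...).
        self._device_handle = getattr(torch, device_type, None)

    def get_rng_state(self):
        assert self._device_handle is not None, f"torch.{self._device_type} not available"
        return self._device_handle.get_rng_state()
```
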

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106771
Approved by: https://github.com/wanchaol
2023-08-10 19:14:33 +00:00
fduwjj
487ebcac3b Clean up unused MHA code to avoid confusion (#105956)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105956
Approved by: https://github.com/wz337, https://github.com/ezyang, https://github.com/wanchaol
2023-07-27 17:10:17 +00:00
FFFrog
9a1cdcb8a0 Format: fixing multiple string concatenation in single line (#106013)
Fix multiple string concatenations written on a single line.
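
For illustration, the pattern being normalized looks roughly like this (the example strings are made up, not from the PR):

```python
# Before: two adjacent string literals implicitly concatenated on one line.
msg = "tensor parallel style must be " "colwise or rowwise"

# After: a single literal, which is what this formatting pass produces.
msg = "tensor parallel style must be colwise or rowwise"
```
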
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106013
Approved by: https://github.com/albanD
2023-07-26 18:39:18 +00:00
Nikita Shulga
5837e95d30 [Reland] Update mypy to 1.4.1 (#105227)
This PR re-lands
- [Typing] Fix PEP 484 Violation (#105022)
- Update mypy to 1.4.1 (#91983)

Those were reverted due to a conflict with the internal source repo.

Mostly fixes for PEP-484 violation (i.e. when default arg is set to None, but type is not annotated as optional)
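
For context, the implicit-Optional pattern being fixed looks like this (illustrative example, not a line from the PR):

```python
from typing import Optional

# Before: default is None but the annotation says plain int
# (a PEP 484 violation that newer mypy flags).
def scale(value: int = None):
    return 0 if value is None else value * 2

# After: the Optional is spelled out explicitly.
def scale_fixed(value: Optional[int] = None) -> int:
    return 0 if value is None else value * 2
```
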
Plus a few real fixes:
  - Add missing `_get_upgraders_entry_map` to `torch/_C/__init__.pyi`
  - Add missing return statement to `torch._export.deserialize_graph`
  - Fix error message in `torch.ao.ns.fx.weight_utils.get_lstm_mod_weights`
  - Add assert in `torch/optim/optimizer.py` that the Optional list is not None
TODO (in followup PR):
  - Fix erroneous `isinstance` check in `torch/ao/quantization/_pt2e/qat_utils.py`

Unrelated, to bypass CI failures due to the gcc9 dependency update in Ubuntu-18.04:
- Add a hack to squash the older libstdc++ from the conda environment in favor of the one from the OS in `.ci/docker/install_conda.sh`
- Update bazel cuda builds to focal, as with libstdc++-6.0.32 bazel builds lose the ability to catch exceptions (probably because they link with cupti statically, but I could not find where that is done)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105227
Approved by: https://github.com/atalman, https://github.com/albanD, https://github.com/Skylion007
2023-07-15 20:30:20 +00:00
PyTorch MergeBot
15fd1ea118 Revert "[Reland] Update mypy to 1.4.1 (#105227)"
This reverts commit c9c4f8efc3.

Reverted https://github.com/pytorch/pytorch/pull/105227 on behalf of https://github.com/atalman due to trying to mitigate ci sev #105248 ([comment](https://github.com/pytorch/pytorch/pull/105227#issuecomment-1636510935))
2023-07-14 22:28:35 +00:00
Nikita Shulga
c9c4f8efc3 [Reland] Update mypy to 1.4.1 (#105227)
This PR re-lands
- [Typing] Fix PEP 484 Violation (#105022)
- Update mypy to 1.4.1 (#91983)

Those were reverted due to a conflict with the internal source repo.

Mostly fixes for PEP-484 violation (i.e. when default arg is set to None, but type is not annotated as optional)
Plus a few real fixes:
  - Add missing `_get_upgraders_entry_map` to `torch/_C/__init__.pyi`
  - Add missing return statement to `torch._export.deserialize_graph`
  - Fix error message in `torch.ao.ns.fx.weight_utils.get_lstm_mod_weights`
  - Add assert in `torch/optim/optimizer.py` that the Optional list is not None
TODO (in followup PR):
  - Fix erroneous `isinstance` check in `torch/ao/quantization/_pt2e/qat_utils.py`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105227
Approved by: https://github.com/atalman, https://github.com/albanD, https://github.com/Skylion007
2023-07-14 20:45:12 +00:00
PyTorch MergeBot
3c5a494d7a Revert "Update mypy to 1.4.1 (#91983)"
This reverts commit 634659e262.

Reverted https://github.com/pytorch/pytorch/pull/91983 on behalf of https://github.com/malfet due to It's dependent change was reverted, so reverting this one as well, to keep CI clean ([comment](https://github.com/pytorch/pytorch/pull/91983#issuecomment-1636059709))
2023-07-14 15:59:16 +00:00
Nikita Shulga
634659e262 Update mypy to 1.4.1 (#91983)
Mostly fixes for PEP-484 violation (i.e. when default arg is set to None, but type is not annotated as optional)
Plus a few real fixes:
  - Add missing `_get_upgraders_entry_map` to `torch/_C/__init__.pyi`
  - Add missing return statement to `torch._export.deserialize_graph`
  - Fix error message in `torch.ao.ns.fx.weight_utils.get_lstm_mod_weights`
TODO (in followup PR):
  - Fix erroneous `isinstance` check in `torch/ao/quantization/_pt2e/qat_utils.py`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91983
Approved by: https://github.com/kit1980, https://github.com/ZainRizvi, https://github.com/huydhn, https://github.com/thiagocrepaldi, https://github.com/aaronenyeshi
2023-07-13 16:30:36 +00:00
Xilun Wu
e799f565eb [DTensor][TP][Random] Introduce TensorParallelRNGTracker to integrate parallel RNG state with Tensor Parallel (#103910)
This PR enables the automatic use of `TensorParallelRNGTracker` in the Tensor Parallel API. Unit tests will be added to cover it.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103910
Approved by: https://github.com/wanchaol, https://github.com/fduwjj
2023-06-30 08:06:41 +00:00
fduwjj
23b7035b3c [TP] Add an input resharding wrapper for TP and unit test for 2D + AC (#103334)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103334
Approved by: https://github.com/kumpera
2023-06-23 04:05:01 +00:00
Wanchao Liang
d31707a257 Get rid of dim_groups attribute from DeviceMesh (#103105)
This PR gets rid of the dim_groups attribute from DeviceMesh. The main motivation is that c10d, not DeviceMesh, should store the process groups during their creation; DeviceMesh should just handle ranks correctly.

This could enable DTensor to become picklable (torch.save/load could be possible), which I will try in the next PR.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103105
Approved by: https://github.com/XilunWu, https://github.com/fduwjj
2023-06-09 04:11:15 +00:00
fduwjj
d4380edb9b [TP] Add API logging for TP high level API (#102209)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/102209
Approved by: https://github.com/wz337, https://github.com/wanchaol
2023-05-25 03:33:00 +00:00
Wanchao Liang
a1aa32e204 [dtensor] tensor ops to use strategy based sharding prop (#100607)
This is the first in a series of PRs that adapts operator impls to a strategy-based approach: each op uses OpStrategy and PlacementStrategy to generate its own strategy. By using the strategy-based approach along with the op graph, we can enable more advanced op implementations (decomposition becomes possible) and turn sharding prop into something closer to a constraint-satisfaction problem.

This PR alone only adds some basic tensor op strategies, and it works directly on the op graph that was used for metadata propagation. The tensor ops added in this PR mainly follow one of the argument strategies. The next set of PRs will add strategies for more ops.
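
Very roughly, "strategy based" means something like the toy sketch below (class and field names are illustrative assumptions, not the actual OpStrategy/PlacementStrategy definitions):

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class ToyPlacementStrategy:
    # One candidate placement for the op's output and inputs across the mesh.
    output_placements: Tuple[str, ...]
    input_placements: List[Tuple[str, ...]] = field(default_factory=list)

@dataclass
class ToyOpStrategy:
    # All candidate placement strategies for one operator; sharding prop picks one.
    strategies: List[ToyPlacementStrategy]

def follow_arg_strategy(arg: ToyOpStrategy) -> ToyOpStrategy:
    # Many of the tensor ops added in this PR simply follow an argument's strategy.
    return ToyOpStrategy(list(arg.strategies))
```
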
Pull Request resolved: https://github.com/pytorch/pytorch/pull/100607
Approved by: https://github.com/XilunWu
2023-05-11 02:47:20 +00:00
fduwjj
953aa6d90e [TP] Enable more generic attn in Tensor Parallelism (#100508)
To make TP more generic for attention modules, we came up with this new col/rowwise parallel style.

The idea behind it: we only use DTensor ops for the col/rowwise-sharded part; for the rest of the ATen ops, we leave them as plain tensor ops.

We set this behavior as the default for the Colwise and Rowwise parallel styles. If people want to customize it, they can always pass in a different prepare_input or prepare_output.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/100508
Approved by: https://github.com/wanchaol
2023-05-07 18:15:49 +00:00
fduwjj
89b1e67d0a [Tensor Parallel] Add a new Colwise Parallel style when Pairwise cannot directly used (#100137)
In some use cases, users cannot directly use `PairwiseParallelStyle` and might need to specify colwise and rowwise parallel styles separately.
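
A rough usage sketch of specifying the styles per submodule (the toy model and FQNs are hypothetical, and the import paths reflect the layout of this era and may have moved since):

```python
import torch
import torch.nn as nn
from torch.distributed._tensor import DeviceMesh
from torch.distributed.tensor.parallel import (
    ColwiseParallel,
    RowwiseParallel,
    parallelize_module,
)

class ToyMLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.net1 = nn.Linear(16, 32)
        self.net2 = nn.Linear(32, 16)

    def forward(self, x):
        return self.net2(torch.relu(self.net1(x)))

# Assumes torch.distributed is already initialized on every rank.
mesh = DeviceMesh("cuda", torch.arange(torch.cuda.device_count()))
model = parallelize_module(
    ToyMLP().cuda(),
    mesh,
    {"net1": ColwiseParallel(), "net2": RowwiseParallel()},
)
```
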
Pull Request resolved: https://github.com/pytorch/pytorch/pull/100137
Approved by: https://github.com/wz337
2023-04-28 03:27:51 +00:00
Rohan Varma
be8c7c06b6 [Tensor Parallel] Simplify distribute for MHA (#100046)
This function is only called for nn.MHA or the custom MHA we use, and
if it is the former it is converted to the latter. So this check can actually
be an assert.

Differential Revision: [D45300396](https://our.internmc.facebook.com/intern/diff/D45300396/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/100046
Approved by: https://github.com/wanchaol
2023-04-27 00:54:21 +00:00
Xilun Wu
ce60997376 [BE][DTensor] validate the mesh argument in DeviceMesh construction (#99094)
## What's in this PR
DeviceMesh's __init__ function now requires all calling ranks to pass the same `mesh` argument.

## Why
We want to enforce an SPMD style of programming with DTensor. Before this PR, the 2-D parallel API (e.g. _create_1d_device_mesh) defined a different DeviceMesh on different ranks. After this PR, it defines all the sub-meshes, and each rank simply performs communications on the one it is associated with.
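
A small illustration of the SPMD requirement (assuming 4 ranks and an already-initialized process group; the import path reflects this era of the code):

```python
import torch
from torch.distributed._tensor import DeviceMesh

# After this PR, every rank must pass the *same* mesh tensor; rank-dependent
# mesh arguments are rejected by the added validation.
mesh_2d = DeviceMesh("cuda", torch.arange(4).reshape(2, 2))
```
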

Differential Revision: [D45165511](https://our.internmc.facebook.com/intern/diff/D45165511)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99094
Approved by: https://github.com/wanchaol
2023-04-21 23:47:51 +00:00
Kazuaki Ishizaki
35fd5c548e Fix typos under torch/distributed directory (#95638)
This PR fixes typos in comments and messages of `.py` files under the torch/distributed directory.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/95638
Approved by: https://github.com/usamah1, https://github.com/H-Huang, https://github.com/kit1980
2023-03-27 21:13:44 +00:00
Wanchao Liang
16e7e5a24b [dtensor] lazy init process groups in device mesh (#96700)
This PR adds a private flag to allow lazy initialization of process groups; it replaces the previous `dim_groups` arg, as no one is using that now.

This can help avoid creating process groups when they are not necessary.

Differential Revision: [D44044664](https://our.internmc.facebook.com/intern/diff/D44044664)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96700
Approved by: https://github.com/fduwjj, https://github.com/XilunWu
2023-03-20 17:50:04 +00:00
Wanchao Liang
261eb46ddd [dtensor] refactor get_coordinate (#95457)
This refactors get_coordinate to return an Optional[list] instead of the coordinate on a dim directly, so that we can easily check whether the rank is inside the mesh.
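
A sketch of how the Optional return is meant to be consumed (hypothetical helper, not code from the PR):

```python
from typing import List, Optional

def rank_index_on_dim(mesh, dim: int) -> Optional[int]:
    # get_coordinate() returns None when the calling rank is not part of the
    # mesh, which is exactly the check this refactor makes easy.
    coord: Optional[List[int]] = mesh.get_coordinate()
    return None if coord is None else coord[dim]
```
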

Differential Revision: [D43643579](https://our.internmc.facebook.com/intern/diff/D43643579)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95457
Approved by: https://github.com/XilunWu
2023-02-28 17:54:26 +00:00
Wanchao Liang
bb9a05b116 [dtensor] use tracing for metadata prop (#95456)
This PR uses tracing for metadata prop, so that we can get correct shape/stride metadata without calculating it manually ourselves.

The follow-up PR will adopt tracing for the sharding prop itself.

Differential Revision: [D43643578](https://our.internmc.facebook.com/intern/diff/D43643578)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95456
Approved by: https://github.com/XilunWu
2023-02-28 17:54:22 +00:00
fduwjj
b209d8fa0d [PT-D][Sequence Parallelism] Enable DTensor based Naive sequence parallelism (#94369)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94369
Approved by: https://github.com/wanchaol
2023-02-16 21:21:00 +00:00
Wanchao Liang
cd9ca4c73f [tp] additional doc fixes (#94786)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94786
Approved by: https://github.com/fduwjj
2023-02-15 21:25:26 +00:00
fduwjj
39511697d4 [PT-D][BE] Update 2D parallelism API name and docs (#94771)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94771
Approved by: https://github.com/wanchaol
2023-02-14 08:13:15 +00:00
PyTorch MergeBot
28ed0bdb37 Revert "[tp] additional doc fixes (#94786)"
This reverts commit 7522ca55f1.

Reverted https://github.com/pytorch/pytorch/pull/94786 on behalf of https://github.com/huydhn due to Sorry for reverting your PR, but the doc failure looks related and they are also failing in trunk 7522ca55f1
2023-02-14 05:43:37 +00:00
Wanchao Liang
7522ca55f1 [tp] additional doc fixes (#94786)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94786
Approved by: https://github.com/fduwjj
2023-02-14 04:52:04 +00:00
Wanchao Liang
2db12e3844 [tp] minor update to TP docs (#94748)
minor update to TP docs for beta release
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94748
Approved by: https://github.com/fduwjj
2023-02-13 21:54:19 +00:00
Xuehai Pan
5b1cedacde [BE] [2/3] Rewrite super() calls in functorch and torch (#94588)
Rewrite Python built-in class `super()` calls. Only non-semantic changes should be applied.

- #94587
- #94588
- #94592

Also, methods with only a `super()` call are removed:

```diff
class MyModule(nn.Module):
-   def __init__(self):
-       super().__init__()
-
    def forward(self, ...):
        ...
```

Cases where the rewrite would change the semantics are kept unchanged. E.g.:

f152a79be9/caffe2/python/net_printer.py (L184-L190)

f152a79be9/test/test_jit_fuser_te.py (L2628-L2635)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94588
Approved by: https://github.com/ezyang, https://github.com/albanD
2023-02-10 21:16:33 +00:00
fduwjj
41e3189222 [PT-D][Tensor parallelism] Add documentations for TP (#94421)
This is far from complete, and we will definitely polish it down the road.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94421
Approved by: https://github.com/wz337
2023-02-09 02:31:06 +00:00
Aaron Gokaslan
1e2d82b8e4 [BE] Merge isinstance calls together (#94419)
Simplifies and speeds up isinstance calls by checking for multiple types at the same time.
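
The pattern in question, for illustration:

```python
# Before: separate isinstance calls.
def is_number(x):
    return isinstance(x, int) or isinstance(x, float)

# After: one call with a tuple of types -- simpler and slightly faster.
def is_number_merged(x):
    return isinstance(x, (int, float))
```
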

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94419
Approved by: https://github.com/ezyang
2023-02-09 00:47:26 +00:00
fduwjj
3fb6e119e2 [PT-D][TP] Fix the module registration in TP API (#93412)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/93412
Approved by: https://github.com/XilunWu
2023-02-01 21:03:56 +00:00
Wanchao Liang
9a56997fe1 [dtensor][5/N] add cached propagator for TP (#90734)
This PR adds a cached propagator for TP use; it caches the sharding-prop decision for the same input sharding on an operator. This can improve eager-mode performance.

Differential Revision: [D42876249](https://our.internmc.facebook.com/intern/diff/D42876249)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90734
Approved by: https://github.com/XilunWu, https://github.com/fduwjj
2023-02-01 05:04:08 +00:00
fduwjj
913866efbf [PT-D][TP] Fix TP API for FQN path based parallelization (#93029)
We had not tested dict-based parallelize_module, and it turns out we had mistakes here.

1. Fix the error.
2. Add unit test cases for it.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/93029
Approved by: https://github.com/wz337
2023-01-26 09:10:21 +00:00
joncrall
ad782ff7df Enable xdoctest runner in CI for real this time (#83816)
Builds on #83317 and enables running the doctests. Just need to figure out what is causing the failures.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/83816
Approved by: https://github.com/ezyang, https://github.com/malfet
2022-12-29 05:32:42 +00:00
PyTorch MergeBot
cba96366a2 Revert "remove torch.equal usages (#89527)"
This reverts commit 4095ef8b80.

Reverted https://github.com/pytorch/pytorch/pull/89527 on behalf of https://github.com/clee2000 due to broke periodic multigpu tests 4095ef8b80 https://github.com/pytorch/pytorch/actions/runs/3592806602/jobs/6049368502
2022-12-02 21:36:13 +00:00
Wanchao Liang
9b5e6b029f [tp] ufmt distributed.tensor.parallel (#89969)
cmd: `ufmt format torch/distributed/tensor`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89969
Approved by: https://github.com/fduwjj
2022-12-01 20:58:16 +00:00
Philip Meier
4095ef8b80 remove torch.equal usages (#89527)
Preparation for the next PR in this stack: #89559.

I replaced

- `self.assertTrue(torch.equal(...))` with `self.assertEqual(..., rtol=0, atol=0, exact_device=True)`,
- the same for `self.assertFalse(...)` with `self.assertNotEqual(...)`, and
- `assert torch.equal(...)` with `torch.testing.assert_close(..., rtol=0, atol=0)` (note that we don't need to set `check_device=True` here since that is the default).

There were a few instances where the result of `torch.equal` is used directly. In those cases I've replaced it with `(... == ...).all().item()`, sometimes also dropping the `.item()` depending on the context.
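
For example (illustrative snippets only, shown with the standalone helper rather than the TestCase methods):

```python
import torch
from torch.testing import assert_close

a = torch.arange(4.0)
b = torch.arange(4.0)

# Before: assert torch.equal(a, b)
# After: bitwise-exact comparison with zero tolerances.
assert_close(a, b, rtol=0, atol=0)

# Where the boolean result is used directly:
same = (a == b).all().item()
```
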

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89527
Approved by: https://github.com/mruberry
2022-12-01 11:22:52 +00:00
Wanchao Liang
4451eb24e6 Move tensor_parallel out to distributed.tensor folder (#89878)
This PR moves tensor parallel from torch.distributed._tensor.parallel to torch.distributed.tensor.parallel, to prepare for the beta release.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89878
Approved by: https://github.com/fduwjj
2022-11-30 22:13:10 +00:00