Commit Graph

22 Commits

Aaron Orenstein
5a0068cc69 [BE] mypy: disallow untyped decorators (#131428)
Untyped decorators strip the types from the functions they decorate, so even if the underlying function is fully typed, callers get no benefit from its type annotations.

Step 1 - Enable the error and override in all the offending files.

#131429
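For illustration only (not code from this PR), here is a minimal sketch of the problem and the usual fix: an untyped decorator turns the decorated function into `Any` for mypy, while a `ParamSpec`-typed decorator (Python 3.10+) preserves the signature.

```
from typing import Callable, ParamSpec, TypeVar

P = ParamSpec("P")  # requires Python 3.10+
R = TypeVar("R")


def untyped_log(func):
    # mypy treats the decorated function as Any, so callers lose all checking
    def wrapper(*args, **kwargs):
        print(f"calling {func.__name__}")
        return func(*args, **kwargs)
    return wrapper


def typed_log(func: Callable[P, R]) -> Callable[P, R]:
    # the original signature flows through the decorator unchanged
    def wrapper(*args: P.args, **kwargs: P.kwargs) -> R:
        print(f"calling {func.__name__}")
        return func(*args, **kwargs)
    return wrapper


@typed_log
def add(x: int, y: int) -> int:
    return x + y


add(1, 2)  # mypy still checks argument and return types here
```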

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131428
Approved by: https://github.com/justinchuby, https://github.com/oulgen
2024-07-23 21:50:55 +00:00
Xuehai Pan
22d258427b [BE][Easy] enable UFMT for torch/distributed/_shard/ (#128867)
Part of #123062

- #123062

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128867
Approved by: https://github.com/fegin
ghstack dependencies: #128866
2024-06-18 14:39:25 +00:00
Aaron Orenstein
3a0d088517 Flip default value for mypy disallow_untyped_defs [5/11] (#127842)
See #127836 for details.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127842
Approved by: https://github.com/oulgen
2024-06-08 18:49:18 +00:00
Wanchao Liang
2c9a420da3 [dtensor] move some modules to private namespace (#127339)
As titled: move some modules that are mainly for DTensor-internal use into a private namespace.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127339
Approved by: https://github.com/awgu
ghstack dependencies: #127338
2024-05-29 05:18:47 +00:00
Wanchao Liang
afee5bea92 [dtensor] refactor schema suggestions in output sharding (#122929)
This PR refactors schema_suggestions in OutputSharding to be a single
OpSchema instead of a list of schemas. In practice there is only ever one
suggestion, and the multiple-resharding case has moved to OpStrategy, so
no case needs it to be a list.
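A hedged sketch of the shape of this change (field and class names are illustrative, not the exact DTensor definitions):

```
from dataclasses import dataclass
from typing import Optional


@dataclass
class OpSchema:
    op_name: str
    args_schema: tuple = ()


@dataclass
class OutputSharding:
    output_spec: object = None
    # before: schema_suggestions: Optional[List[OpSchema]] = None  (always length 1 in practice)
    # after: a single optional suggestion; multi-resharding goes through OpStrategy instead
    schema_suggestion: Optional[OpSchema] = None
```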

Pull Request resolved: https://github.com/pytorch/pytorch/pull/122929
Approved by: https://github.com/tianyu-l
2024-04-01 17:39:39 +00:00
Adrian Wälchli
866457e746 Fix pydocstyle errors in fully_sharded_data_parallel.py, api.py, graph_utils.py, distribute.py, iter_graph_module.py, comm_tensor.py, experimental_ops.py, batch_dim_utils.py, data_parallel.py, graph_optimization.py (#113216)
Fixes #113191

```
pydocstyle torch/distributed/fsdp/fully_sharded_data_parallel.py --count
```

On master: 80
After my changes on this PR: 3

```
pydocstyle torch/distributed/_spmd/comm_tensor.py --count
```
On master: 5
After my changes on this PR: 3

```
pydocstyle torch/distributed/_spmd/experimental_ops.py --count
```
On master: 3
After my changes on this PR: 1

```
pydocstyle torch/distributed/_spmd/iter_graph_module.py --count
```
On master: 39
After my changes on this PR: 27

```
pydocstyle torch/distributed/_spmd/graph_utils.py --count
```
On master: 16
After my changes on this PR: 4

```
pydocstyle torch/distributed/_spmd/distribute.py --count
```
On master: 19
After my changes on this PR: 10

```
pydocstyle torch/distributed/_spmd/api.py --count
```
On master: 10
After my changes on this PR: 3

```
pydocstyle torch/distributed/_spmd/batch_dim_utils.py  --count
```
On master: 14
After my changes on this PR: 3

```
pydocstyle torch/distributed/_spmd/data_parallel.py --count
```
On master: 34
After my changes on this PR: 2

```
pydocstyle torch/distributed/_spmd/graph_optimization.py --count
```
On master: 35
After my changes on this PR: 13
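For context, the fixes are mostly of this form (this example is illustrative and not taken from the PR):

```
def gather_bad(tensor):
    """gathers tensors from all ranks
    and concatenates them"""  # D400: no period, D205: no blank line, D403: capitalization


def gather_good(tensor):
    """Gather tensors from all ranks and concatenate them.

    The summary line is capitalized, imperative, ends with a period, and is
    separated from the extended description by a blank line.
    """
```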

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113216
Approved by: https://github.com/ezyang
2023-11-10 03:08:32 +00:00
Wanchao Liang
09f3e08bcc [dtensor][3/n] use dedicated TensorMeta instead of the fx one (#108261)
This PR switches from fx's shape-prop TensorMetadata to DTensor's own
dedicated TensorMeta. DTensor only cares about three fields (shape, stride,
and dtype); all other fields are unnecessary and can be inferred from the
local tensor directly. This significantly simplifies how we deal with tensor
metadata, since we no longer track the extra fields.
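A hedged sketch of the idea (simplified relative to the real definition): a small named tuple carrying only the three fields DTensor needs, with everything else read off the local tensor when required.

```
from typing import NamedTuple, Tuple

import torch


class TensorMeta(NamedTuple):
    shape: torch.Size
    stride: Tuple[int, ...]
    dtype: torch.dtype


def meta_from_local(local_tensor: torch.Tensor) -> TensorMeta:
    # device, layout, requires_grad, etc. are intentionally omitted; they can
    # be inferred from the local tensor directly when needed
    return TensorMeta(local_tensor.shape, local_tensor.stride(), local_tensor.dtype)


print(meta_from_local(torch.randn(4, 8)))
```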
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108261
Approved by: https://github.com/fduwjj
ghstack dependencies: #107306
2023-09-13 04:08:02 +00:00
Wanchao Liang
fc1dcfb9ab [dtensor][2/n] use op overload instead of function schema (#107306)
The function schema doesn't give us anything extra, since we can also get it from `op._schema`. Including the op directly in op_schema makes it easier for sharding prop to do fake execution, and in principle it should also make hash comparisons faster: instead of hashing the function schema we just hash `id(op)`, which is constant.

This PR is just a refactor to include the op in OpSchema instead of the function schema; there are no other logic changes.
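A hedged, simplified sketch of what this buys us (not the exact OpSchema definition):

```
import torch


class OpSchema:
    def __init__(self, op, args_schema=()):
        self.op = op                  # an OpOverload, e.g. torch.ops.aten.mm.default
        self.args_schema = args_schema

    def __hash__(self):
        # id(op) is constant for the lifetime of the process and cheap to hash,
        # unlike hashing the full function schema
        return hash((id(self.op), self.args_schema))


schema = OpSchema(torch.ops.aten.mm.default)
print(schema.op._schema)              # the FunctionSchema is still reachable on demand
out = schema.op(torch.randn(2, 3), torch.randn(3, 4))  # sharding prop can invoke the op directly
```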
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107306
Approved by: https://github.com/fduwjj
2023-09-13 04:08:02 +00:00
fduwjj
4a6ca4cc05 [TP][DTensor Perf] Some perf improvement to reduce DTensor CPU overhead (#106524)
By inspecting a small TP benchmark, we found a couple of things we can optimize:
1. We call deep_copy many times when we initialize DTensor.
2. Some sharding_prop results are not cached successfully.
3. We are still calling redistribute when it is not necessary.

![image](https://github.com/pytorch/pytorch/assets/6937752/b847d110-eea1-45df-9298-066d0ba07dd7)

![image](https://github.com/pytorch/pytorch/assets/6937752/fc08f564-caed-496b-80d7-275c1dba3806)

![image](https://github.com/pytorch/pytorch/assets/6937752/fdc06cc4-a4ba-48e8-a118-c041bbd04f5e)

So we want to:
1. Remove the deep_copy, and make placements a tuple so we are sure it's immutable.
2. Somehow the op_schema gets changed during sharding op propagation, so we store a hashed version of it before passing it to sharding_prop. Ideally we want to figure out why `op_schema` gets changed, but it looks like it changes for index as well as detach/view ops, so it will take more time to debug.
3. When hashing op_schema, hash the entire args_schema, not just the args_spec, which only contains the DTensorSpecs from args that are DTensors.
4. It turns out that DTensor sometimes has mem_format set to None (not contiguous), which triggers redistribute unnecessarily, so we now only compare type, shape, and stride in the metadata.

Also, we need to ensure that _Partial and Shard have different hash values in the DTensorSpec (see the sketch below).

![image](https://github.com/pytorch/pytorch/assets/6937752/321e6890-1ab6-4975-adc9-524c6ef9a76b)
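A hedged illustration of two of the points above (simplified, not the real DTensor placement classes): placements become an immutable tuple so no deep copy is needed, and the placement type participates in the hash so _Partial and Shard cannot collide.

```
from dataclasses import dataclass


@dataclass(frozen=True)
class Shard:
    dim: int

    def __hash__(self):
        return hash((type(self).__name__, self.dim))


@dataclass(frozen=True)
class _Partial:
    reduce_op: str = "sum"

    def __hash__(self):
        return hash((type(self).__name__, self.reduce_op))


placements = (Shard(0), _Partial())        # a tuple, not a list: safe to share without deepcopy
assert hash(Shard(0)) != hash(_Partial())  # distinct placement types hash differently
```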

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106524
Approved by: https://github.com/wanchaol
2023-08-14 20:03:19 +00:00
Aaron Gokaslan
e2a3817dfd [BE] Enable C419 rule for any all shortcircuiting (#99890)
Apparently https://github.com/pytorch/pytorch/pull/78142 made torch.JIT accept simple generator expressions, which allows us to enable rules that replace unnecessary list comprehensions with generators in any()/all(). This was originally part of #99280, but I split it off into this PR so that it can be easily reverted should anything break.
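A small illustration of the C419 rewrite (not taken from the PR): pass a generator expression to any()/all() instead of materializing a list, so evaluation can short-circuit.

```
values = [1, 2, 0, 3]

found_list = any([v == 0 for v in values])  # builds the whole list first (flagged by C419)
found_gen = any(v == 0 for v in values)     # stops at the first match

assert found_list == found_gen
```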

Pull Request resolved: https://github.com/pytorch/pytorch/pull/99890
Approved by: https://github.com/justinchuby, https://github.com/kit1980, https://github.com/malfet
2023-04-25 15:02:13 +00:00
Shen Li
ca89e7942a [SPMD][Easy] switch to tree_map_only to simplify code (#99547)
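A hedged usage sketch (assuming the torch.utils._pytree API of this era): tree_map_only applies a function only to leaves of a given type, replacing a tree_map wrapped around a manual isinstance check.

```
import torch
from torch.utils._pytree import tree_map_only

tree = {"weight": torch.ones(2, 2), "step": 3, "grads": [torch.zeros(2)]}
# only tensor leaves are transformed; the int passes through untouched
doubled = tree_map_only(torch.Tensor, lambda t: t * 2, tree)
print(doubled["step"], doubled["weight"])
```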
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99547
Approved by: https://github.com/fegin
2023-04-19 20:40:09 +00:00
Shen Li
7c0c663a4c [SPMD] Add aten.stack and aten.select to DTensor prop (#99417)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99417
Approved by: https://github.com/fegin
2023-04-19 14:55:34 +00:00
Shen Li
54b168484d Support LayerNorm without weight or bias parameters (#98687)
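A hedged example of the configuration this commit adds support for: a LayerNorm constructed with no learnable weight or bias.

```
import torch

ln = torch.nn.LayerNorm(64, elementwise_affine=False)
print(ln.weight, ln.bias)      # both None: no parameters to shard
out = ln(torch.randn(4, 64))
```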
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98687
Approved by: https://github.com/yifuwang
2023-04-09 02:13:10 +00:00
Shen Li
d255c8e1ad Add NLLLoss to DTensor prop rule (#98512)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98512
Approved by: https://github.com/wanchaol
2023-04-08 01:22:36 +00:00
Shen Li
02179827cb [Easy] Include SPMD and DTensor files in UFMT checks (#98148)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98148
Approved by: https://github.com/fegin
2023-04-02 15:34:49 +00:00
Shen Li
e8d39606eb [SPMD] Enable fused Adam in full train step tracing (#98113)
Differential Revision: [](https://our.internmc.facebook.com/intern/diff/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98113
Approved by: https://github.com/yifuwang, https://github.com/fegin
2023-04-01 15:54:13 +00:00
Shen Li
9ec6fdb29b Enable adam foreach in full train step tracing (#97897)
Main changes:

1. Registered several foreach ops with both meta and DTensor.
2. Skip redundant getitem nodes when expanding foreach ops with DTensor (the sketch below shows where these getitem nodes come from).
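A hedged sketch (not the PR's code) of where those getitem nodes come from: tracing a list-returning foreach op with make_fx records one operator.getitem node per accessed element.

```
import torch
from torch.fx.experimental.proxy_tensor import make_fx


def step(params):
    outs = torch._foreach_add(params, 1.0)  # a list-valued aten op
    return outs[0] + outs[1]                # element accesses become getitem nodes


gm = make_fx(step)([torch.ones(2), torch.ones(2)])
print(gm.graph)  # expect _foreach_add followed by getitem nodes
```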
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97897
Approved by: https://github.com/wanchaol, https://github.com/fegin
2023-03-30 16:47:10 +00:00
Shen Li
379fb47654 [SPMD] Support foreach optimizers with functionalization (#97853)
My first attempt was to apply the same solution proxy_tensor.py uses for
other in-place ops. However, foreach is different: its schema in
`native_functions.yaml` does not return anything, whereas ops like
`addcmul_` and `addcdiv_` do return Tensors (thanks bdhirsh for teaching
me this!). As a result, the proxy output during tracing does not wrap
anything, and hence we cannot correctly connect it with subsequent
operators. Modifying `native_functions.yaml` is not a preferred solution.
After discussing with bdhirsh, the temporary solution is to do foreach
functionalization as a graph pass for now. Later, when
https://github.com/pytorch/pytorch/issues/97852 is addressed, we will
switch to default functionalization.

Edit: the latest version follows @bdhirsh 's suggestion to use
`make_fx`'s `decomposition_table` instead of implementing manual
fx.Graph transforms to functionalize `_foreach_add_`.
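A hedged illustration of the schema difference described above (plain eager calls, no tracing): the in-place foreach variant returns nothing, so there is no output proxy to connect to later operators, while the out-of-place variant produces new tensors that can be tracked.

```
import torch

params = [torch.ones(2), torch.ones(3)]
grads = [torch.full((2,), 0.1), torch.full((3,), 0.1)]

new_params = torch._foreach_add(params, grads)  # out-of-place: returns new tensors
torch._foreach_add_(params, grads)              # in-place: mutates params, returns nothing to track
```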
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97853
Approved by: https://github.com/fegin, https://github.com/wanchaol
2023-03-30 11:27:10 +00:00
Shen Li
021de486ff [Easy] Apply black to format _spmd files (#97534)
No real changes. Format code to prepare for the PR on top.

Differential Revision: [D44376380](https://our.internmc.facebook.com/intern/diff/D44376380)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97534
Approved by: https://github.com/wanchaol
2023-03-25 01:09:41 +00:00
Wanchao Liang
738beaa6b8 [dtensor] fix experimental_op slice_scatter (#95894)
Test Plan: test with spmd e2e flow

Differential Revision: D43740349

Pull Request resolved: https://github.com/pytorch/pytorch/pull/95894
Approved by: https://github.com/fegin
2023-03-03 08:41:22 +00:00
Wanchao Liang
bb9a05b116 [dtensor] use tracing for metadata prop (#95456)
This PR uses tracing for metadata prop, so that we get correct
shape/stride metadata without computing it by hand ourselves.

The follow-up PR will adopt tracing for the sharding prop itself.
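A hedged sketch of the general idea (FakeTensorMode stands in here for the tracing machinery; this is not the PR's code): let the op itself propagate output shape/stride/dtype instead of computing them manually per operator.

```
import torch
from torch._subclasses.fake_tensor import FakeTensorMode

with FakeTensorMode():
    a = torch.empty(8, 16)
    b = torch.empty(16, 32)
    out = torch.mm(a, b)  # no real compute; only metadata is propagated
    print(out.shape, out.stride(), out.dtype)
```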

Differential Revision: [D43643578](https://our.internmc.facebook.com/intern/diff/D43643578)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95456
Approved by: https://github.com/XilunWu
2023-02-28 17:54:22 +00:00
Chien-Chin Huang
250c054bdd [SPMD] Pull the minimal working distribute API and SPMD module to PyTorch (#94802)
Pull the minimal working distribute API and SPMD module to PyTorch. The original code is on https://github.com/pytorch/tau/tree/main/spmd/compiler.

Other main contributors to the original code base: @anj-s, @lessw2020, @wanchaol @aazzolini

Differential Revision: [D43197230](https://our.internmc.facebook.com/intern/diff/D43197230/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94802
Approved by: https://github.com/anj-s, https://github.com/wanchaol
2023-02-16 00:36:16 +00:00