pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-06 12:20:52 +01:00

Author	SHA1	Message	Date
Aaron Gokaslan	9c3fbe7475	[BE] Enable flake8-simplify checks (#97984) Enable some sensible flake8-simplify rules. Mainly wanted to enable the SIM101, and `yield from` SIM103 checks. @kit1980 since you wanted to be tagged on this CI check. Enabling this check also helped flag one logical bug so it's definitely beneficial (also fixed in this PR). Pull Request resolved: https://github.com/pytorch/pytorch/pull/97984 Approved by: https://github.com/ezyang	2023-03-31 03:40:21 +00:00
Shen Li	9ec6fdb29b	Enable adam foreach in full train step tracing (#97897) Main changes: 1. Registered several foreach ops to both meta and DTensor 2. Skip redundant getitem node when expanding foreach ops with DTensor Pull Request resolved: https://github.com/pytorch/pytorch/pull/97897 Approved by: https://github.com/wanchaol, https://github.com/fegin	2023-03-30 16:47:10 +00:00
Shen Li	379fb47654	[SPMD] Support foreach optimizers with functionalization (#97853) My first attempt was to apply the same solution that proxy_tensor.py uses for other inplace ops. However, foreach differs in that its schema in `native_functions.yaml` does not return anything, whereas ops like `addcmul_` and `addcdiv_` do return Tensors (Thanks bdhirsh for teaching me this!). As a result, the proxy output during tracing does not wrap anything, and hence we cannot correctly connect it with subsequent operators. Modifying `native_functions.yaml` is not a preferred solution. After discussing with bdhirsh, the temporary solution is to do foreach functionalization as a graph pass for now. Later, when https://github.com/pytorch/pytorch/issues/97852 is addressed, we will switch to default functionalization. Edit: the latest version follows @bdhirsh's suggestion on using `make_fx` `decomposition_table` instead of implementing manual fx.Graph transforms to functionalize `_foreach_add_`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/97853 Approved by: https://github.com/fegin, https://github.com/wanchaol	2023-03-30 11:27:10 +00:00
Chien-Chin Huang	942e587d40	[SPMD] Make compile cache the compilation result and add option to perform transformation (#97836) This PR changes the ``compile()`` decorator to cache the compilation result so that the compilation is done once. A gm_transformation option is also added to ``compile()`` so that after the compilation is done, users can perform any graph transformation with the compiled graph module. Differential Revision: [D44484033](https://our.internmc.facebook.com/intern/diff/D44484033/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/97836 Approved by: https://github.com/mrshenli, https://github.com/wconstab	2023-03-29 20:51:22 +00:00
Shen Li	c39f1c1490	Allow DTensor to trigger collectives before inplace ops (#97787) Mainly two fixes: 1. `make_fx` seems to trace through DeviceMesh operations. This commit removes that from the DTensor expanded graph 2. During DTensor expansion, autograd complains about inplace changes on a leaf node. This commit wraps the entire DTensor expansion code with `torch.no_grad()` Pull Request resolved: https://github.com/pytorch/pytorch/pull/97787 Approved by: https://github.com/wanchaol	2023-03-28 21:06:51 +00:00
Shen Li	75fb0b6c9f	Enable full train_step tracing and customizable dist graph expansion (#97416) This commit adds an entry point for full `train_step` tracing and expansion. Model forward, backward, and optimizer step will be included in one graph. DTensor expansion will be applied on top to insert collective communications. Users can also provide an `Override` implementation to skip non-traceable submodules and directly install submodule logic to the DTensor-expanded graph by inserting `fx.Nodes`. Differential Revision: [D44325177](https://our.internmc.facebook.com/intern/diff/D44325177) Pull Request resolved: https://github.com/pytorch/pytorch/pull/97416 Approved by: https://github.com/yifuwang, https://github.com/wanchaol	2023-03-25 09:24:21 +00:00
Shen Li	021de486ff	[Easy] Apply black to format _spmd files (#97534) No real changes. Format code to prepare for the PR on top. Differential Revision: [D44376380](https://our.internmc.facebook.com/intern/diff/D44376380) Pull Request resolved: https://github.com/pytorch/pytorch/pull/97534 Approved by: https://github.com/wanchaol	2023-03-25 01:09:41 +00:00
Chien-Chin Huang	250c054bdd	[SPMD] Pull the minimal working distribute API and SPMD module to PyTorch (#94802) Pull the minimal working distribute API and SPMD module to PyTorch. The original code is on https://github.com/pytorch/tau/tree/main/spmd/compiler. Other main contributors to the original code base: @anj-s, @lessw2020, @wanchaol, @aazzolini Differential Revision: [D43197230](https://our.internmc.facebook.com/intern/diff/D43197230/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/94802 Approved by: https://github.com/anj-s, https://github.com/wanchaol	2023-02-16 00:36:16 +00:00
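
Commit 942e587d40 describes a ``compile()`` decorator that caches its compilation result and accepts a `gm_transformation` hook applied once compilation is done. The following is only a minimal pure-Python sketch of that caching pattern, not the code from the PR: `compile_once`, `fake_compile`, `COMPILE_LOG`, and `double_output` are hypothetical names, and a plain callable stands in for the compiled graph module.

```python
import functools
from typing import Any, Callable, List, Optional

Fn = Callable[..., Any]

# Hypothetical log used to show how often "compilation" actually runs.
COMPILE_LOG: List[str] = []


def fake_compile(fn: Fn) -> Fn:
    """Stand-in for an expensive tracing/compilation step (assumption)."""
    COMPILE_LOG.append(fn.__name__)
    return fn


def compile_once(gm_transformation: Optional[Callable[[Fn], Fn]] = None):
    """Decorator that compiles on the first call and reuses the result."""

    def decorator(fn: Fn) -> Fn:
        compiled: Optional[Fn] = None

        @functools.wraps(fn)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            nonlocal compiled
            if compiled is None:
                # First call: compile, then let the user transform the result.
                compiled = fake_compile(fn)
                if gm_transformation is not None:
                    compiled = gm_transformation(compiled)
            # Later calls skip compilation and reuse the cached callable.
            return compiled(*args, **kwargs)

        return wrapper

    return decorator


def double_output(compiled: Fn) -> Fn:
    """Hypothetical transformation: wrap the compiled callable."""

    def transformed(*args: Any, **kwargs: Any) -> Any:
        return 2 * compiled(*args, **kwargs)

    return transformed


@compile_once(gm_transformation=double_output)
def train_step(x: int) -> int:
    return x + 1
```

Calling `train_step` repeatedly in this sketch runs `fake_compile` only on the first call; the transformation is likewise applied once and its result is what gets cached.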

8 Commits
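
Commit 9c3fbe7475 enables flake8-simplify checks, naming SIM101 and a `yield from` check. As an illustration only, and not code taken from the PR, the sketch below shows the kind of rewrite these checks typically suggest: merging repeated `isinstance` calls (SIM101) and replacing a loop that only re-yields items with `yield from`. All function names are hypothetical.

```python
from typing import Any, Iterable, Iterator


def is_number_before(value: Any) -> bool:
    # Pattern flagged by SIM101: the same object is tested twice.
    return isinstance(value, int) or isinstance(value, float)


def is_number_after(value: Any) -> bool:
    # Suggested form: one isinstance call with a tuple of types.
    return isinstance(value, (int, float))


def walk_before(items: Iterable[Any]) -> Iterator[Any]:
    # Loop that only re-yields each item.
    for item in items:
        yield item


def walk_after(items: Iterable[Any]) -> Iterator[Any]:
    # Equivalent generator using delegation.
    yield from items
```

Both pairs behave identically for the inputs they accept; the rewritten forms are simply shorter and leave less room for the copy-paste mistakes that such lint rules tend to surface.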