Commit Graph

15 Commits

Author SHA1 Message Date
Edward Z. Yang
4f13f69a45 Enable possibly-undefined error code (#118533)
Fixes https://github.com/pytorch/pytorch/issues/118129

Suppressions automatically added with

```
import re

# Read the mypy report; each line looks like
# "path/to/file.py:LINE:COL: error: <message>  [error-code]".
with open("error_file.txt", "r") as f:
    errors = f.readlines()

# Collect the reported errors as {file_path: {line_number: error_code}}.
error_lines = {}
for error in errors:
    match = re.match(r"(.*):(\d+):\d+: error:.*\[(.*)\]", error)
    if match:
        file_path, line_number, error_type = match.groups()
        if file_path not in error_lines:
            error_lines[file_path] = {}
        error_lines[file_path][int(line_number)] = error_type

# Append a "# type: ignore[...]" suppression to each offending line.
for file_path, lines in error_lines.items():
    with open(file_path, "r") as f:
        code = f.readlines()
    for line_number, error_type in sorted(lines.items(), reverse=True):
        code[line_number - 1] = (
            code[line_number - 1].rstrip() + f"  # type: ignore[{error_type}]\n"
        )
    with open(file_path, "w") as f:
        f.writelines(code)
```
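
The script assumes standard `mypy` diagnostic output. A quick sanity check of the regex against a representative line (the path and message here are illustrative, not from the actual report):

```
import re

# A representative mypy diagnostic line (path and message are made up).
sample = 'torch/distributed/_spmd/api.py:42:9: error: Name "x" may be undefined  [possibly-undefined]'

match = re.match(r"(.*):(\d+):\d+: error:.*\[(.*)\]", sample)
assert match is not None
file_path, line_number, error_type = match.groups()
print(file_path, line_number, error_type)
# -> torch/distributed/_spmd/api.py 42 possibly-undefined
```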

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/118533
Approved by: https://github.com/Skylion007, https://github.com/zou3519
2024-01-30 05:08:10 +00:00
Adrian Wälchli
866457e746 Fix pydocstyle errors in fully_sharded_data_parallel.py, api.py, graph_utils.py, distribute.py, iter_graph_module.py, comm_tensor.py, experimental_ops.py, batch_dim_utils.py, data_parallel.py, graph_optimization.py (#113216)
Fixes #113191

```
pydocstyle torch/distributed/fsdp/fully_sharded_data_parallel.py --count
```

On master: 80
After my changes on this PR: 3

```
pydocstyle torch/distributed/_spmd/comm_tensor.py --count
```
On master: 5
After my changes on this PR: 3

```
pydocstyle torch/distributed/_spmd/experimental_ops.py --count
```
On master: 3
After my changes on this PR: 1

```
pydocstyle torch/distributed/_spmd/iter_graph_module.py --count
```
On master: 39
After my changes on this PR: 27

```
pydocstyle torch/distributed/_spmd/graph_utils.py --count
```
On master: 16
After my changes on this PR: 4

```
pydocstyle torch/distributed/_spmd/distribute.py --count
```
On master: 19
After my changes on this PR: 10

```
pydocstyle torch/distributed/_spmd/api.py --count
```
On master: 10
After my changes on this PR: 3

```
pydocstyle torch/distributed/_spmd/batch_dim_utils.py  --count
```
On master: 14
After my changes on this PR: 3

```
pydocstyle torch/distributed/_spmd/data_parallel.py --count
```
On master: 34
After my changes on this PR: 2

```
pydocstyle torch/distributed/_spmd/graph_optimization.py --count
```
On master: 35
After my changes on this PR: 13
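
The fixes themselves are mechanical docstring cleanups. As an illustration (the function below is made up, not taken from the diff), a docstring satisfying common pydocstyle checks such as D400 (summary ends with a period) and D205 (blank line between summary and description) looks like:

```
def shard_parameters(module):
    """Shard the parameters of ``module`` across ranks.

    The summary line above ends with a period and is separated from
    this description by a blank line, which is what pydocstyle's D400
    and D205 checks enforce.
    """
    return module

# First docstring line ends with a period, followed by a blank line.
print(shard_parameters.__doc__.splitlines()[0])
```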

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113216
Approved by: https://github.com/ezyang
2023-11-10 03:08:32 +00:00
Peter Bell
66c32d099a Use pytree.arg_tree_leaves everywhere (#112394)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112394
Approved by: https://github.com/lezcano
ghstack dependencies: #112391, #112392, #112393
2023-10-31 15:57:06 +00:00
Peter Bell
bbd5b935e4 Use pytree.tree_leaves everywhere (#112324)
This changes all the instances I could find of `tree_flatten(...)[0]` or
`x, _ = tree_flatten(...)` to use `tree_leaves` instead.
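
Conceptually, `tree_leaves(x)` is shorthand for the first element of `tree_flatten(x)`. A minimal pure-Python sketch of the equivalence (a toy pytree over lists and dicts, not PyTorch's actual `torch.utils._pytree` implementation):

```
def tree_flatten(tree):
    """Flatten nested lists/dicts into (leaves, structure) -- a toy pytree."""
    if isinstance(tree, list):
        leaves = []
        for item in tree:
            leaves.extend(tree_flatten(item)[0])
        return leaves, "list"
    if isinstance(tree, dict):
        leaves = []
        for value in tree.values():
            leaves.extend(tree_flatten(value)[0])
        return leaves, "dict"
    return [tree], "leaf"

def tree_leaves(tree):
    """Equivalent to tree_flatten(tree)[0], without keeping the spec."""
    return tree_flatten(tree)[0]

print(tree_leaves([1, {"a": 2, "b": [3, 4]}]))  # -> [1, 2, 3, 4]
```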

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112324
Approved by: https://github.com/lezcano
ghstack dependencies: #112327, #112323
2023-10-30 03:39:04 +00:00
Kazuaki Ishizaki
b5f9696d81 Fix typo under torch directory (#110824)
This PR fixes typo `the the` of comments and exception messages in files under `torch` directory.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110824
Approved by: https://github.com/H-Huang
2023-10-09 19:16:43 +00:00
Yeonju Ro
06f656c5d1 [distributed] implemented find_all_descendants (#102138)
Fixes #100397

Implemented the `find_all_descendants` function, which identifies the list of nodes that need to be moved, and added a unit test.
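
The actual function operates on FX graph nodes; the core idea — collect every node reachable through `users` edges — can be sketched with a toy node class standing in for FX nodes (names are illustrative):

```
from collections import deque

class Node:
    """Toy stand-in for an FX node: a name plus its downstream users."""
    def __init__(self, name):
        self.name = name
        self.users = []

def find_all_descendants(roots):
    """Return every node reachable from ``roots`` via users edges (BFS)."""
    seen = set()
    queue = deque(roots)
    while queue:
        node = queue.popleft()
        for user in node.users:
            if user not in seen:
                seen.add(user)
                queue.append(user)
    return seen

a, b, c, d = Node("a"), Node("b"), Node("c"), Node("d")
a.users = [b]
b.users = [c]
# d has no edges from a, so it is not a descendant of a.
print(sorted(n.name for n in find_all_descendants([a])))  # -> ['b', 'c']
```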
Pull Request resolved: https://github.com/pytorch/pytorch/pull/102138
Approved by: https://github.com/fegin
2023-05-24 21:47:59 +00:00
Chien-Chin Huang
e0a2b49f0b [SPMD] Introduce prerequisites to graph_optimization_pass (#99970)
Some optimization passes require prerequisite passes. When a prerequisite condition is not met, it is hard to debug why an optimization pass fails. Adding this check makes such errors easier to discover.
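
A sketch of how such a prerequisite check could work (the decorator name and signature here are illustrative, not the actual `graph_optimization_pass` API):

```
# Track which passes have already been applied (illustrative global state).
_applied_passes = set()

def graph_pass(prerequisites=()):
    """Mark a function as a graph pass that requires earlier passes."""
    def decorator(func):
        def wrapper(*args, **kwargs):
            missing = [p for p in prerequisites if p not in _applied_passes]
            if missing:
                raise RuntimeError(
                    f"{func.__name__} requires passes {missing} to run first"
                )
            result = func(*args, **kwargs)
            _applied_passes.add(func.__name__)
            return result
        return wrapper
    return decorator

@graph_pass()
def fuse_comm(graph):
    return graph

@graph_pass(prerequisites=("fuse_comm",))
def schedule_waits(graph):
    return graph

fuse_comm("g")       # ok: no prerequisites
schedule_waits("g")  # ok: fuse_comm already ran
```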

Differential Revision: [D45255377](https://our.internmc.facebook.com/intern/diff/D45255377/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99970
Approved by: https://github.com/lessw2020
2023-04-28 18:38:01 +00:00
Chien-Chin Huang
01de8ee845 [SPMD][Easy] Add time counter in graph_optimization_pass (#99969)
This can give an idea of how expensive each pass is.
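
A minimal sketch of such a time counter (illustrative, not the actual implementation):

```
import functools
import time

def timed_pass(func):
    """Wrap a graph pass and report how long it took."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        print(f"{func.__name__} took {elapsed:.4f}s")
        return result
    return wrapper

@timed_pass
def dummy_pass(graph):
    return graph

dummy_pass("graph")
```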

Differential Revision: [D45255366](https://our.internmc.facebook.com/intern/diff/D45255366/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99969
Approved by: https://github.com/lessw2020
2023-04-27 17:56:07 +00:00
Chien-Chin Huang
41d7969590 [SPMD] Upstream iter_move_grads_and_optimizers (#98785)
This PR upstreams `iter_move_grads_and_optimizers`, which delays some of the gradients and the corresponding optimizer steps to the next iteration. D44512863 (credit to @lessw2020) is the internal implementation, which only works with the old _SPMD expansion. This PR changes the implementation to use the new APIs.

Differential Revision: [D44836486](https://our.internmc.facebook.com/intern/diff/D44836486/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98785
Approved by: https://github.com/mrshenli
2023-04-19 06:40:33 +00:00
Shen Li
19c2804614 [SPMD][EASY] Remove unnecessary torch.ops prefix (#99331)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99331
Approved by: https://github.com/dracifer
2023-04-17 19:33:45 +00:00
Chien-Chin Huang
148d49260a [SPMD] Implement split_fused_optimizer to split one fused_optimizer node to two (#98784)
Several optimization passes require the ability to split a fused_optimizer node. This PR adds the API to support these use cases.

Differential Revision: [D44806450](https://our.internmc.facebook.com/intern/diff/D44806450/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98784
Approved by: https://github.com/mrshenli
2023-04-17 10:02:07 +00:00
Chien-Chin Huang
99aacf5c68 [SPMD] Expedite the allreduce call before doing comm_fusion (#98922)
The allreduce call order and the gradient order may differ, which can interfere with the benefit of comm_fusion. This PR reorders the graph so that each allreduce call happens right after its last input.
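
The reordering can be sketched on a flat node list (toy string nodes; the real pass works on the FX graph):

```
def move_after_last_input(order, node, inputs):
    """Reorder ``order`` so ``node`` sits immediately after its last input."""
    order = [n for n in order if n != node]
    last_idx = max(order.index(i) for i in inputs)
    order.insert(last_idx + 1, node)
    return order

# allreduce depends on grad1 and grad2 but currently runs at the very end;
# moving it up lets the communication overlap with compute3/compute4.
schedule = ["grad1", "grad2", "compute3", "compute4", "allreduce"]
print(move_after_last_input(schedule, "allreduce", ["grad1", "grad2"]))
# -> ['grad1', 'grad2', 'allreduce', 'compute3', 'compute4']
```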

Differential Revision: [D44900738](https://our.internmc.facebook.com/intern/diff/D44900738/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98922
Approved by: https://github.com/mrshenli
2023-04-12 23:26:37 +00:00
Chien-Chin Huang
f3080997e5 [SPMD] Introduce remove_copy_for_optimizer optimization (#98580)
This PR adds the ability to remove unused `copy_` nodes (`len(node.users) == 0`) that are generated by tracing the optimizer.
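
The core check is simply dropping `copy_` nodes that nothing consumes; a toy sketch (made-up node class, not the FX API):

```
class Node:
    """Toy graph node: a name, an op target, and its consumers."""
    def __init__(self, name, target, users=None):
        self.name = name
        self.target = target
        self.users = users if users is not None else []

def remove_unused_copies(nodes):
    """Drop copy_ nodes that no other node consumes."""
    return [
        n for n in nodes
        if not (n.target == "copy_" and len(n.users) == 0)
    ]

live = Node("c1", "copy_", users=["consumer"])   # kept: has a user
dead = Node("c2", "copy_")                       # dropped: no users
add = Node("a1", "add")                          # kept: not a copy_
kept = remove_unused_copies([live, dead, add])
print([n.name for n in kept])  # -> ['c1', 'a1']
```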

Differential Revision: [D44761556](https://our.internmc.facebook.com/intern/diff/D44761556/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98580
Approved by: https://github.com/mrshenli
2023-04-12 00:51:22 +00:00
Chien-Chin Huang
07a1378f52 [SPMD] Introduce schedule_comm_wait (#98578)
`schedule_comm_wait` delays the wait_tensor ops as late as possible. Note that this optimization currently does not reorder the computation ops. For `foreach`-based optimizers, we observe that reordering the computation ops is required to achieve good performance.
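
The delay can be sketched as moving each wait node to just before its first consumer (toy list-based sketch, not the actual pass):

```
def delay_wait(order, wait, first_user):
    """Move ``wait`` so it runs immediately before its first consumer."""
    order = [n for n in order if n != wait]
    idx = order.index(first_user)
    order.insert(idx, wait)
    return order

# wait_tensor currently runs right after the collective; pushing it down
# to just before the op that needs the result lets mm1/mm2 overlap with
# the in-flight communication.
schedule = ["allreduce", "wait_tensor", "mm1", "mm2", "use_result"]
print(delay_wait(schedule, "wait_tensor", "use_result"))
# -> ['allreduce', 'mm1', 'mm2', 'wait_tensor', 'use_result']
```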

Differential Revision: [D44761487](https://our.internmc.facebook.com/intern/diff/D44761487/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98578
Approved by: https://github.com/mrshenli
2023-04-12 00:51:19 +00:00
Chien-Chin Huang
dd3e2ddc0a [SPMD] Introduce graph_optimization_pass and comm_fusion_with_cat (#98285)
This PR adds the `graph_optimization_pass` decorator, which should wrap all graph optimization passes. It also introduces the first graph optimization, `comm_fusion_with_cat`, as the first use case of `graph_optimization_pass`.
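
The fusion idea behind `comm_fusion_with_cat` — concatenate many small gradient buffers, issue one collective, then split the result back — can be sketched without torch as (buffers are plain lists, and the allreduce is a stand-in function):

```
def fused_allreduce(buffers, allreduce):
    """Fuse many small allreduces into one over a concatenated buffer."""
    sizes = [len(b) for b in buffers]
    flat = [x for b in buffers for x in b]   # the "cat" step
    reduced = allreduce(flat)                # one collective instead of many
    out, offset = [], 0
    for size in sizes:                        # split back to original shapes
        out.append(reduced[offset:offset + size])
        offset += size
    return out

def double(xs):
    # Toy allreduce: pretend two ranks hold identical data, so sum = 2x.
    return [2 * x for x in xs]

print(fused_allreduce([[1, 2], [3], [4, 5, 6]], double))
# -> [[2, 4], [6], [8, 10, 12]]
```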

Differential Revision: [D44661608](https://our.internmc.facebook.com/intern/diff/D44661608/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98285
Approved by: https://github.com/yifuwang
2023-04-12 00:51:16 +00:00