MInor, adds a linter that ensures that all jobs run on pull_request, schedule, push etc have a `if: github.repository_owner == 'pytorch'` or are dependent on a job that has that check
There is also a setting in Github repos that can disable all workflows for that repo
A lot of these are unnecessary because many jobs use reusable workflows that have that check. However, this is a one time change so I'm not that bothered
Unfortunately I can't put this at the workflow level, which would make this better
Lots of weird string parsing
Pull Request resolved: https://github.com/pytorch/pytorch/pull/138849
Approved by: https://github.com/malfet
Add a new option `--cuda` to `tools/nightly.py` to pull the nightly packages with CUDA support.
```bash
# installs pytorch-nightly with cpuonly
tools/nightly.py pull
# The following only available on Linux and Windows
# installs pytorch-nightly with latest CUDA we support
tools/nightly.py pull --cuda
# installs pytorch-nightly with CUDA 12.1
tools/nightly.py pull --cuda 12.1
```
Also add targets in `Makefile` and instructions in constribution guidelines.
```bash
# setup conda environment with pytorch-nightly
make setup-env
# setup conda environment with pytorch-nightly with CUDA support
make setup-env-cuda
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/131133
Approved by: https://github.com/ezyang
This PR follows https://github.com/pytorch/pytorch/pull/129374#pullrequestreview-2136555775 cc @malfet:
> Lots of formatting changes unrelated to PR goal, please keep them as part of separate PR (and please add lint rule if you want to enforce those, or at least cite one)
`usort` allows empty lines within import segments. For example, `usort` do not change the following code:
```python
import torch.aaa
import torch.bbb
import torch.ccc
x = ... # some code
```
```python
import torch.aaa
import torch.bbb
import torch.ccc
x = ... # some code
```
```python
import torch.aaa
import torch.bbb
import torch.ccc
x = ... # some code
```
This PR first sort imports via `isort`, then re-sort the file using `ufmt` (`usort` + `black`). This enforces the following import style:
1. no empty lines within segments.
2. single empty line between segments.
3. two spaces after import statements.
All the code snippets above will be formatted to:
```python
import torch.aaa
import torch.bbb
import torch.ccc
x = ... # some code
```
which produces a consistent code style.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129751
Approved by: https://github.com/malfet
Update ruff to 0.5.0 so we can enable all the some of the new checks I've been wanting to add to the codebase. First just updating the code to comply with some rule changes and a couple minor API changes / deprecations.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129744
Approved by: https://github.com/ezyang
This PR follows https://github.com/pytorch/pytorch/pull/129374#pullrequestreview-2136555775 cc @malfet:
> Lots of formatting changes unrelated to PR goal, please keep them as part of separate PR (and please add lint rule if you want to enforce those, or at least cite one)
`usort` allows empty lines within import segments. For example, `usort` do not change the following code:
```python
import torch.aaa
import torch.bbb
import torch.ccc
x = ... # some code
```
```python
import torch.aaa
import torch.bbb
import torch.ccc
x = ... # some code
```
```python
import torch.aaa
import torch.bbb
import torch.ccc
x = ... # some code
```
This PR first sort imports via `isort`, then re-sort the file using `ufmt` (`usort` + `black`). This enforces the following import style:
1. no empty lines within segments.
2. single empty line between segments.
3. two spaces after import statements.
All the code snippets above will be formatted to:
```python
import torch.aaa
import torch.bbb
import torch.ccc
x = ... # some code
```
which produces a consistent code style.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129751
Approved by: https://github.com/malfet
Changes by apply order:
1. Replace all `".."` and `os.pardir` usage with `os.path.dirname(...)`.
2. Replace nested `os.path.dirname(os.path.dirname(...))` call with `str(Path(...).parent.parent)`.
3. Reorder `.absolute()` ~/ `.resolve()`~ and `.parent`: always resolve the path first.
`.parent{...}.absolute()` -> `.absolute().parent{...}`
4. Replace chained `.parent x N` with `.parents[${N - 1}]`: the code is easier to read (see 5.)
`.parent.parent.parent.parent` -> `.parents[3]`
5. ~Replace `.parents[${N - 1}]` with `.parents[${N} - 1]`: the code is easier to read and does not introduce any runtime overhead.~
~`.parents[3]` -> `.parents[4 - 1]`~
6. ~Replace `.parents[2 - 1]` with `.parent.parent`: because the code is shorter and easier to read.~
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129374
Approved by: https://github.com/justinchuby, https://github.com/malfet
Changes by apply order:
1. Replace all `".."` and `os.pardir` usage with `os.path.dirname(...)`.
2. Replace nested `os.path.dirname(os.path.dirname(...))` call with `str(Path(...).parent.parent)`.
3. Reorder `.absolute()` ~/ `.resolve()`~ and `.parent`: always resolve the path first.
`.parent{...}.absolute()` -> `.absolute().parent{...}`
4. Replace chained `.parent x N` with `.parents[${N - 1}]`: the code is easier to read (see 5.)
`.parent.parent.parent.parent` -> `.parents[3]`
5. ~Replace `.parents[${N - 1}]` with `.parents[${N} - 1]`: the code is easier to read and does not introduce any runtime overhead.~
~`.parents[3]` -> `.parents[4 - 1]`~
6. ~Replace `.parents[2 - 1]` with `.parent.parent`: because the code is shorter and easier to read.~
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129374
Approved by: https://github.com/justinchuby, https://github.com/malfet
## Problem this PR resolves
Today, most of distributed tests are arranged like this:
```
def test_allreduce(self):
pg = self._create_process_group_nccl(store, self.opts())
pg.allreduce(tensor)
...
```
Thus, we are paying PG creation time **per test**. That's bad. But why were we doing that? Is there a constraint?
If we look deeper, we would find that most of our test cases inherit from `torch.testing._internal.common_distributed.MultiProcessTestCase`. From the name, nothing seems wrong, and probably fits distributed well. But a "problem" exists in its `setUp()` and `tearDown()` methods, which basically do the following:
```
def setUp(self):
self._spawn_processes()
def tearDown(self):
for p in self.processes:
p.terminate()
```
Since `setUp` and `tearDown` are "**test-scope fixtures"**, meaning, they are called per test, each test will have brand new processes. Of course we'd have to recreate ProcessGroup every time.
## How we are fixing it
First, obviously, we need to put a PG's lifetime into a longer scope. Python `unittest` provides such a helper, called **"class-scope fixtures."** It is embodied by a `setUpClass` method and a `tearDownClass` method (note the name difference), which are called only once for all tests in the same test class. Therefore, we would do:
```
@classmethod
def setUpClass(self):
dist.init_process_group(...)
@classmethod
def tearDownClass(self):
dist.destroy_process_group()
```
**In this PR, we create a new test template for distributed: `MultiProcContinousTest`, to hold this class-scope fixture.**
Second, we'd need to avoid per-test process spawn and terminate. That's easy, we can either:
1. launch the whole test file with `torchrun --nproc-per-node=...` or
2. use `mp.spawn()` under `if __name__ == "__main__":`.
Point is, launch the processes only once.
## Result
We moved the "positive tests" from test_c10d_nccl.py to test_c10d_ops_nccl.py.
Before this PR:
```
$ python test_c10d_nccl.py -k ProcessGroupNCCLTest
Ran 24 tests in 174.457s
```
After this PR:
```
$ torchrun --nproc-per-node 2 test_c10d_ops_nccl.py
or
$ python test_c10d_ops_nccl.py
Ran 24 tests in 16.247s
```
10X speedup.
## Limitation
For tests intended to test destroy or abort of PGs, we'd need to go back to the old style. So it would make sense to divide our tests into two classes: one for positive tests where we would reuse the PGs, and the other one for abort/destroy and negative tests like watchdog timeout.
## Next step
Migrate the tests of distributed that would fit with this test style!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/125648
Approved by: https://github.com/wconstab
When you make a git worktree, the .git "folder" in the worktree
is not a directory, it's a file pointing at the actual .git directory.
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/123060
Approved by: https://github.com/albanD
We can use Sapling (hg) with the pytorch repo but there are a couple minor issues to teach our scripting to be happier with having either a git or hg repo.
This change fixes some issues in:
- setup.py
- lintrunner
Pull Request resolved: https://github.com/pytorch/pytorch/pull/122072
Approved by: https://github.com/ezyang