Commit Graph

23 Commits

Author SHA1 Message Date
PyTorch MergeBot
99f2491af9 Revert "Use absolute path path.resolve() -> path.absolute() (#129409)"
This reverts commit 45411d1fc9.

Reverted https://github.com/pytorch/pytorch/pull/129409 on behalf of https://github.com/jeanschmidt due to Breaking internal CI, @albanD please help get this PR merged ([comment](https://github.com/pytorch/pytorch/pull/129409#issuecomment-2571316444))
2025-01-04 14:17:20 +00:00
Xuehai Pan
45411d1fc9 Use absolute path path.resolve() -> path.absolute() (#129409)
Changes:

1. Always explicit `.absolute()`: `Path(__file__)` -> `Path(__file__).absolute()`
2. Replace `path.resolve()` with `path.absolute()` if the code is resolving the PyTorch repo root directory.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129409
Approved by: https://github.com/albanD
2025-01-03 20:03:40 +00:00
Xuehai Pan
b6bdb67f82 [BE][Easy] use pathlib.Path instead of dirname / ".." / pardir (#129374)
Changes by apply order:

1. Replace all `".."` and `os.pardir` usage with `os.path.dirname(...)`.
2. Replace nested `os.path.dirname(os.path.dirname(...))` call with `str(Path(...).parent.parent)`.
3. Reorder `.absolute()` ~/ `.resolve()`~ and `.parent`: always resolve the path first.

    `.parent{...}.absolute()` -> `.absolute().parent{...}`

4. Replace chained `.parent x N` with `.parents[${N - 1}]`: the code is easier to read (see 5.)

    `.parent.parent.parent.parent` -> `.parents[3]`

5. ~Replace `.parents[${N - 1}]` with `.parents[${N} - 1]`: the code is easier to read and does not introduce any runtime overhead.~

    ~`.parents[3]` -> `.parents[4 - 1]`~

6. ~Replace `.parents[2 - 1]` with `.parent.parent`: because the code is shorter and easier to read.~

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129374
Approved by: https://github.com/justinchuby, https://github.com/malfet
2024-12-29 17:23:13 +00:00
PyTorch MergeBot
475656fd9c Revert "[BE][Easy] use pathlib.Path instead of dirname / ".." / pardir (#129374)"
This reverts commit 2293fe1024.

Reverted https://github.com/pytorch/pytorch/pull/129374 on behalf of https://github.com/malfet due to failing internal ROCM builds with error: ModuleNotFoundError: No module named hipify ([comment](https://github.com/pytorch/pytorch/pull/129374#issuecomment-2562973920))
2024-12-26 17:32:23 +00:00
PyTorch MergeBot
cc4e70b7c3 Revert "Use absolute path path.resolve() -> path.absolute() (#129409)"
This reverts commit 135c7db99d.

Reverted https://github.com/pytorch/pytorch/pull/129409 on behalf of https://github.com/malfet due to need to revert to as dependency of https://github.com/pytorch/pytorch/pull/129374 ([comment](https://github.com/pytorch/pytorch/pull/129409#issuecomment-2562969825))
2024-12-26 17:26:06 +00:00
Xuehai Pan
135c7db99d Use absolute path path.resolve() -> path.absolute() (#129409)
Changes:

1. Always explicit `.absolute()`: `Path(__file__)` -> `Path(__file__).absolute()`
2. Replace `path.resolve()` with `path.absolute()` if the code is resolving the PyTorch repo root directory.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129409
Approved by: https://github.com/albanD
2024-12-24 08:33:08 +00:00
Xuehai Pan
2293fe1024 [BE][Easy] use pathlib.Path instead of dirname / ".." / pardir (#129374)
Changes by apply order:

1. Replace all `".."` and `os.pardir` usage with `os.path.dirname(...)`.
2. Replace nested `os.path.dirname(os.path.dirname(...))` call with `str(Path(...).parent.parent)`.
3. Reorder `.absolute()` ~/ `.resolve()`~ and `.parent`: always resolve the path first.

    `.parent{...}.absolute()` -> `.absolute().parent{...}`

4. Replace chained `.parent x N` with `.parents[${N - 1}]`: the code is easier to read (see 5.)

    `.parent.parent.parent.parent` -> `.parents[3]`

5. ~Replace `.parents[${N - 1}]` with `.parents[${N} - 1]`: the code is easier to read and does not introduce any runtime overhead.~

    ~`.parents[3]` -> `.parents[4 - 1]`~

6. ~Replace `.parents[2 - 1]` with `.parent.parent`: because the code is shorter and easier to read.~

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129374
Approved by: https://github.com/justinchuby, https://github.com/malfet
2024-12-21 22:08:01 +00:00
Xuehai Pan
f6838d521a [BE][Easy][5/19] enforce style for empty lines in import segments in tools/ and torchgen/ (#129756)
See https://github.com/pytorch/pytorch/pull/129751#issue-2380881501. Most changes are auto-generated by linter.

You can review these PRs via:

```bash
git diff --ignore-all-space --ignore-blank-lines HEAD~1
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129756
Approved by: https://github.com/ezyang
2024-07-17 06:44:35 +00:00
Xuehai Pan
4ee1cb9b95 [BE][Easy] replace import pathlib with from pathlib import Path (#129426)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129426
Approved by: https://github.com/malfet
2024-06-30 01:36:07 +00:00
PyTorch MergeBot
2effbcfcd8 Revert "[BE][Easy] replace import pathlib with from pathlib import Path (#129426)"
This reverts commit 6d75604ef1.

Reverted https://github.com/pytorch/pytorch/pull/129426 on behalf of https://github.com/XuehaiPan due to recognize `Path` as new exported API ([comment](https://github.com/pytorch/pytorch/pull/129426#issuecomment-2198371625))
2024-06-29 23:24:06 +00:00
Xuehai Pan
6d75604ef1 [BE][Easy] replace import pathlib with from pathlib import Path (#129426)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129426
Approved by: https://github.com/malfet
2024-06-29 15:42:09 +00:00
PyTorch MergeBot
3d96217891 Revert "[BE][Easy] use pathlib.Path instead of dirname / ".." / pardir (#129374)"
This reverts commit 9e1f3ecaa7.

Reverted https://github.com/pytorch/pytorch/pull/129374 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it is still failing with the same error ([comment](https://github.com/pytorch/pytorch/pull/129374#issuecomment-2197801405))
2024-06-29 00:47:15 +00:00
Xuehai Pan
9e1f3ecaa7 [BE][Easy] use pathlib.Path instead of dirname / ".." / pardir (#129374)
Changes by apply order:

1. Replace all `".."` and `os.pardir` usage with `os.path.dirname(...)`.
2. Replace nested `os.path.dirname(os.path.dirname(...))` call with `str(Path(...).parent.parent)`.
3. Reorder `.absolute()` ~/ `.resolve()`~ and `.parent`: always resolve the path first.

    `.parent{...}.absolute()` -> `.absolute().parent{...}`

4. Replace chained `.parent x N` with `.parents[${N - 1}]`: the code is easier to read (see 5.)

    `.parent.parent.parent.parent` -> `.parents[3]`

5. ~Replace `.parents[${N - 1}]` with `.parents[${N} - 1]`: the code is easier to read and does not introduce any runtime overhead.~

    ~`.parents[3]` -> `.parents[4 - 1]`~

6. ~Replace `.parents[2 - 1]` with `.parent.parent`: because the code is shorter and easier to read.~

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129374
Approved by: https://github.com/justinchuby, https://github.com/malfet
2024-06-28 00:35:15 +00:00
PyTorch MergeBot
895316119d Revert "[BE][Easy] use pathlib.Path instead of dirname / ".." / pardir (#129374)"
This reverts commit 0314c4c101.

Reverted https://github.com/pytorch/pytorch/pull/129374 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it causes lots of internal build failures where they fail to find hipify module ([comment](https://github.com/pytorch/pytorch/pull/129374#issuecomment-2192437052))
2024-06-26 19:03:57 +00:00
Xuehai Pan
0314c4c101 [BE][Easy] use pathlib.Path instead of dirname / ".." / pardir (#129374)
Changes by apply order:

1. Replace all `".."` and `os.pardir` usage with `os.path.dirname(...)`.
2. Replace nested `os.path.dirname(os.path.dirname(...))` call with `str(Path(...).parent.parent)`.
3. Reorder `.absolute()` ~/ `.resolve()`~ and `.parent`: always resolve the path first.

    `.parent{...}.absolute()` -> `.absolute().parent{...}`

4. Replace chained `.parent x N` with `.parents[${N - 1}]`: the code is easier to read (see 5.)

    `.parent.parent.parent.parent` -> `.parents[3]`

5. ~Replace `.parents[${N - 1}]` with `.parents[${N} - 1]`: the code is easier to read and does not introduce any runtime overhead.~

    ~`.parents[3]` -> `.parents[4 - 1]`~

6. ~Replace `.parents[2 - 1]` with `.parent.parent`: because the code is shorter and easier to read.~

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129374
Approved by: https://github.com/justinchuby, https://github.com/malfet
2024-06-25 08:28:38 +00:00
Catherine Lee
06b52dd103 TD outside of test job (#118250)
Give TD it's own job so that each shard can get the results from this one job artifact and they will always be in sync with each other/no longer need to worry about consistently issues

* Move test discovery to its own file that is not dependent on torch so it can be run without building torch
  * Cannot do cpp test discovery before building pytorch
* Move TD calculation to own file that will create a json file with the final results
* TD is now job/build env agnostic
* TD will rank all tests, including those that test jobs may not want to run (ex it will rank distributed tests along with default tests, even though these tests are never run on the same machine together)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118250
Approved by: https://github.com/huydhn
2024-03-01 23:08:10 +00:00
Catherine Lee
dab272eed8 [td] Consistent pytest cache (#113804)
Move the pytest cache downloading into the build step and store it in additional ci files so that it stays consistent during sharding.

Only build env is taken into account now instead of also test config since we might not have the test config during build time, making it less specific, but I also think this might be better since tests are likely to fail across the same test config (I also think it might be worth not even looking at build env but thats a different topic)

Each cache upload should only include information from the current run.  Do not merge current cache with downloaded cache during upload (shouldn't matter anyways since the downloaded cache won't exist at the time)

From what I cant tell of the s3 retention policy, pytest cache files will be deleted after 30 days (cc @ZainRizvi to confirm), so we never have to worry about space or pulling old versions.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113804
Approved by: https://github.com/ZainRizvi
2023-11-17 23:45:47 +00:00
Catherine Lee
b33220063d [TD] Historical edited files and profiling heuristics (#111510)
Adds files for the heuristics and run them in trial mode.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111510
Approved by: https://github.com/ZainRizvi
2023-10-25 19:54:17 +00:00
Zain Rizvi
b600aed237 [TD] Make test class times available during CI (#111836)
Makes the test class durations uploaded by https://github.com/pytorch/test-infra/pull/4670 available during CI.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111836
Approved by: https://github.com/clee2000
2023-10-24 21:40:10 +00:00
Catherine Lee
3b5b7ebd09 [ci] Save various json files from test infra into folder (#111516)
We pull a lot of files from https://github.com/pytorch/test-infra/blob/generated-stats/stats and name them separately when we add them to the artifacts in the build, so stick them in a folder and just add that instead.

Slow test and disabled test jsons remain as they were since they are pulled during the test step and do not need to be included in the artifacts during build since they are not used for sharding.

Sanity checked that test times could be found for linux, mac, windows, and rocm.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111516
Approved by: https://github.com/huydhn, https://github.com/ZainRizvi
2023-10-23 20:38:25 +00:00
Zain Rizvi
2c313880fc [TD] Make test class correlation scores available to heuristics. (#111229)
https://github.com/pytorch/test-infra/pull/4617 generates `file_test_class_rating.json`. Now we ensure it's available for heuristics to use during the test step.

(Actual heuristics will come in a separate PR)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111229
Approved by: https://github.com/huydhn
2023-10-17 16:29:30 +00:00
Catherine Lee
f16be5e0d4 Reordering tests experiment (#106347)
Companion with https://github.com/pytorch/test-infra/pull/4424

Uses the file rating generated by the test infra PR to re order tests.  For each test file, sum the file ratings from the changed files in the PR, and put the tests in order of sum.

A lot of tests are probably going to end up as "prioritized" since it takes anything with a rating > 0 right now.

Sharding is done twice, once on the prioritized tests, and once on the general/non prioritized tests.  Prioritized tests have an order, so they should be sharded according to that order, while general tests don't have an order and are sharded by test time, which should result in more balanced shards.

I'll change the metric name before I merge, i want to quarantine my testing stuff from actual results

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106347
Approved by: https://github.com/ZainRizvi
2023-08-16 18:23:09 +00:00
Catherine Lee
86f038dd56 download test times during build to avoid race conditions (#81915)
After https://github.com/pytorch/pytorch/pull/81116, we started pulling test times straight from the source instead of first downloading them in the build job and then having the test job take the build jobs version.  This can cause an issues where different shards pull different versions of the file, leading to incorrect sharding (ex two shards running the same tests file on accident).  This generally happens if the test jobs happen while the test times file is being updated (unlikely, but not impossible) or if someone reruns a test job the next day.

In this PR, I return to the old method of downloading the test times file during the build job and having the test jobs pull from the build jobs uploaded artifacts.  If there is no test times file in the build job's artifacts, we fall back to the default sharding plan.

Notes:
* script moved to a new file to avoid needing to import torch, which would require torch to be built, which can cause issues with asan
* I got errors with asan (`ASan runtime does not come first in initial library list; you should either link runtime to your application or manually preload it with LD_PRELOAD.`), so I put the script at the beginning of the build

### Test Plan
Verified that the number of tests ran in the pull and trunk workflows are similar to workflows run on master.  Checked logs to see if artifacts were being used for sharding.  Spot checked a few test configs to check that their lists of selected tests didn't overlap.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81915
Approved by: https://github.com/huydhn
2022-07-28 16:35:01 +00:00