Commit Graph

244 Commits

Author SHA1 Message Date
Aleksei Nikiforov
d5b1d99f78 Enable more nightly tests on s390x (#148452)
Also enable some tests which probably were accidentally disabled.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148452
Approved by: https://github.com/seemethere, https://github.com/malfet
2025-03-18 16:09:39 +00:00
cyy
b0dfd242fa Remove NO_MULTIPROCESSING_SPAWN checks (#146705)
py 3.9 has spawn.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/146705
Approved by: https://github.com/colesbury
2025-02-28 05:53:19 +00:00
PyTorch MergeBot
926b7b5027 Revert "Remove NO_MULTIPROCESSING_SPAWN checks (#146705)"
This reverts commit 40ad5e01df.

Reverted https://github.com/pytorch/pytorch/pull/146705 on behalf of https://github.com/cyyever due to Broke lint?, I guess land race with rufff update ([comment](https://github.com/pytorch/pytorch/pull/146705#issuecomment-2689603077))
2025-02-28 03:04:38 +00:00
cyyever
40ad5e01df Remove NO_MULTIPROCESSING_SPAWN checks (#146705)
py 3.9 has spawn.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/146705
Approved by: https://github.com/colesbury
2025-02-28 00:15:32 +00:00
Xuehai Pan
c73a92fbf5 [BE][CI] bump ruff to 0.9.2: multiline assert statements (#144546)
Reference: https://docs.astral.sh/ruff/formatter/black/#assert-statements

> Unlike Black, Ruff prefers breaking the message over breaking the assertion, similar to how both Ruff and Black prefer breaking the assignment value over breaking the assignment target:
>
> ```python
> # Input
> assert (
>     len(policy_types) >= priority + num_duplicates
> ), f"This tests needs at least {priority+num_duplicates} many types."
>
>
> # Black
> assert (
>     len(policy_types) >= priority + num_duplicates
> ), f"This tests needs at least {priority+num_duplicates} many types."
>
> # Ruff
> assert len(policy_types) >= priority + num_duplicates, (
>     f"This tests needs at least {priority + num_duplicates} many types."
> )
> ```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/144546
Approved by: https://github.com/malfet
2025-02-27 20:46:16 +00:00
cyy
7b512095ef Enable some tests on MacOS (#146268)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/146268
Approved by: https://github.com/Skylion007, https://github.com/malfet
2025-02-03 05:04:24 +00:00
wizzniu
c07dc64017 Update pin memory related APIs to not pass 'device' argument (#131858)
Based on https://github.com/pytorch/pytorch/pull/126376, this PR tries to update all PT callers (e.g., `Tensor.is_pinned()`, `Tensor.pin_memory()`) to not pass `device` argument.
As for `storage/untyped_storage.is_pinned()/pin_memory()`, we keep the `device` argument but passing `device` is discouraged. And if not given, the default `device` is still 'cuda' for BC.
Additionally, based on device-agnostic pin_memory, `pin_memory_device` argument of `torch.utils.data.DataLoader` is discouraged  now. For BC, explictly passing this argument is still effective. If not given, the default `device` will be the current accelerator.

Fixes #124908
Relates https://github.com/pytorch/pytorch/pull/126376

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131858
Approved by: https://github.com/albanD

Co-authored-by: albanD <desmaison.alban@gmail.com>
2025-01-15 17:23:35 +00:00
cyy
df458be4e5 [4/N] Apply py39 ruff and pyupgrade fixes (#143257)
```torch/fx/passes/annotate_getitem_nodes.py``` was changed to support the new type hinting annotations.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143257
Approved by: https://github.com/justinchuby, https://github.com/albanD
2025-01-04 10:47:51 +00:00
Michael Diggin
55dc61dd52 Dataloader distribute tasks to workers when in_order is False (#142324)
Fixes #105203 and is a follow up PR to #141833

When `in_order` is True (the default), tasks are given out to workers in a round robin fashion. When `in_order` is False this is no longer needed, as we give up guarantees of reproducibility, and instead tasks should be given to workers that are able to perform work.
In this PR I've added tracking of the number of outstanding tasks for each worker (updated when tasks are added to their queue, and when data is returned to the main thread). When finding the next queue to add a task to, if `in_order` is False it will only add the task to the workers queue if it has fewer than `_prefetch_factor` tasks outstanding.
The current default behaviour is left as is.

Tests are also updated to assert on the worker IDs for each sample of data returned.
I've run the following to confirm they aren't flaky
```bash
for i in {1..20}; do python test/test_dataloader.py TestOutOfOrderDataLoader; done
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/142324
Approved by: https://github.com/andrewkho
2025-01-03 12:57:04 +00:00
PyTorch MergeBot
f6801ba4b3 Revert "Use random64 in Fischer-Yates algorithm for large N (#143682)"
This reverts commit 7013be0094.

Reverted https://github.com/pytorch/pytorch/pull/143682 on behalf of https://github.com/wdvr due to failing Meta internal tests that need to be updated ([comment](https://github.com/pytorch/pytorch/pull/143682#issuecomment-2563487675))
2024-12-27 09:09:33 +00:00
Natalia Gimelshein
7013be0094 Use random64 in Fischer-Yates algorithm for large N (#143682)
Fixes bug in randperm https://nbsanity.com/static/a4774194938414dedcec7d6e99727d31/Shuffling_20in_20torch_20vs_20numpy-public.html

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143682
Approved by: https://github.com/eqy, https://github.com/albanD
2024-12-25 01:19:19 +00:00
Sergii Dymchenko
c042c8a475 Use default_collate from public API (#143616)
Codemodded via `torchfix . --select=TOR104 --fix`.
This is a step to unblock https://github.com/pytorch/pytorch/pull/141076
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143616
Approved by: https://github.com/malfet
2024-12-23 17:38:43 +00:00
Tom Ritchford
d8c8ba2440 Fix unused Python variables in test/[e-z]* (#136964)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136964
Approved by: https://github.com/justinchuby, https://github.com/albanD
2024-12-18 23:02:30 +00:00
andrewkho
4e7056d94d Fixes in-order test flakiness (#142389)
Fixes #142343

Pull Request resolved: https://github.com/pytorch/pytorch/pull/142389
Approved by: https://github.com/michael-diggin, https://github.com/divyanshk
2024-12-10 04:19:20 +00:00
Michael Diggin
18ef3a09cd Add option in data loader for out of order data (#141833)
Fixes #105203

Facing a similar problem to the linked issue, where variable sized input data can mean that a handful of slow to process samples holds up smaller and faster to process samples from being used. This also leads to lower GPU utilization as well. In certain cases, e.g. evaluation epochs, inference pipelines or other cases where reproducibility isn't important, this can bring significant speed ups.

This PR adds an `allow_out_of_order` bool input to the `DataLoader` class, defaulting to `false`, which when set to `true` will returning data from workers in whatever order they are ready/processed in, rather in the strict index order.
Instead of storing data that was returned out of order, it is passed directly to the main thread and the entry in `_task_info` is deleted. The main changes are they to check that an entry in `_task_info` does exist, and only increasing `self._rcvd_idx` when the lowest index remaining gets returned.

Two tests are added to test this for iterable type datasets and index type datasets.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/141833
Approved by: https://github.com/andrewkho
2024-12-06 19:55:58 +00:00
Aaron Gokaslan
c05eff278a [BE][Ez]: Update ruff to 0.7.4 (#140806)
Updates ruff to 0.7.4, mainly updates false pos/negatives for rules and fixes some bad autofixes.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/140806
Approved by: https://github.com/cyyever, https://github.com/malfet
2024-11-15 17:04:32 +00:00
cyy
f7dc13806e [2/N] Don't skip ASAN on some tests (#138663)
Follows #138571
Pull Request resolved: https://github.com/pytorch/pytorch/pull/138663
Approved by: https://github.com/ezyang
2024-10-28 03:35:57 +00:00
xinan.lin
190e09d8b6 [Inductor UT] Generalize device-bias code introduced from #134874 and (#136596)
[Inductor UT] Generalize device-bias code introduced from #134874 and fix unexpected success test cases.
Fix #136595

Pull Request resolved: https://github.com/pytorch/pytorch/pull/136596
Approved by: https://github.com/EikanWang, https://github.com/jansel

Co-authored-by: Yu, Guangye <guangye.yu@intel.com>
2024-09-26 02:56:59 +00:00
Huy Do
db80b98ec4 XFAIL test_segfault (#136252)
Fixes https://github.com/pytorch/pytorch/issues/128551

As this has been failing in trunk for a while and there is no owner yet to fix it properly.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/136252
Approved by: https://github.com/andrewkho
2024-09-19 04:17:06 +00:00
zdevito
8d9c3a71f6 Support IPC for Expandable Segments (#130890)
This reapplication commit is the same as before except it resolves a build error in an internal build where `handle` was shadowed.

Differential Revision: [D60547506](https://our.internmc.facebook.com/intern/diff/D60547506)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/130890
Approved by: https://github.com/dsjohns2
2024-08-05 18:48:13 +00:00
Xuehai Pan
4226ed1585 [BE] Format uncategorized Python files with ruff format (#132576)
Remove patterns `**`, `test/**`, and `torch/**` in `tools/linter/adapters/pyfmt_linter.py` and run `lintrunner`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132576
Approved by: https://github.com/ezyang, https://github.com/Skylion007
ghstack dependencies: #132574
2024-08-04 17:13:31 +00:00
Oguz Ulgen
221350e3a4 Add None return type to init -- tests (#132352)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132352
Approved by: https://github.com/ezyang
ghstack dependencies: #132335, #132351
2024-08-01 15:44:51 +00:00
PyTorch MergeBot
49a8e061b6 Revert "Support IPC for Expandable Segments (#130890)"
This reverts commit 0e71a88f9b.

Reverted https://github.com/pytorch/pytorch/pull/130890 on behalf of https://github.com/zdevito due to some internal tests show shutdown issues with the change to the table that holds ipc handles ([comment](https://github.com/pytorch/pytorch/pull/130890#issuecomment-2250767280))
2024-07-25 15:54:57 +00:00
zdevito
0e71a88f9b Support IPC for Expandable Segments (#130890)
This reapplication commit is the same as before except it resolves a build error in an internal build where `handle` was shadowed.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130890
Approved by: https://github.com/dsjohns2
2024-07-24 15:45:40 +00:00
PyTorch MergeBot
1e86387871 Revert "Support IPC for Expandable Segments (#130890)"
This reverts commit 32c2f84e34.

Reverted https://github.com/pytorch/pytorch/pull/130890 on behalf of https://github.com/zdevito due to variable shadowing broke internal tests ([comment](https://github.com/pytorch/pytorch/pull/130890#issuecomment-2245456085))
2024-07-23 14:46:28 +00:00
zdevito
32c2f84e34 Support IPC for Expandable Segments (#130890)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/130890
Approved by: https://github.com/dsjohns2
ghstack dependencies: #130888, #130889
2024-07-22 16:15:01 +00:00
Xuehai Pan
d80939e5e9 [BE] enable UFMT for torch/storage.py (#127706)
Part of #123062

- #123062

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127706
Approved by: https://github.com/ezyang
2024-06-27 23:16:24 +00:00
Fuzzkatt
3f47c72268 add multiprocessing checks in test_dataloader.py (#128244)
Add multiprocessing checks in test_dataloader.py for tests requiring multiprocessing similar to test_multiprocessing.py: https://github.com/pytorch/pytorch/blob/main/test/test_multiprocessing.py#L41-L52. Change all Jetson skips to TEST_CUDA_IPC checks since that is the root cause of the failures on Jetson in the first place.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/128244
Approved by: https://github.com/eqy, https://github.com/malfet
2024-06-15 01:32:55 +00:00
Yuanhao Ji
a05b2ae302 Enable UFMT on test/test_dataloader.py (#124710)
Part of: #123062

Ran lintrunner on:

- test/test_custom_op_testing.py (already deleted)
- test/test_dataloader.py

Detail:

```bash
$ lintrunner -a --take UFMT --all-files
ok No lint issues.
Successfully applied all patches.
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/124710
Approved by: https://github.com/soulitzer
2024-04-28 21:21:51 +00:00
Aaron Gokaslan
29cc293725 [BE]: FURB142 - Remove set mutations. Use set update (#124551)
Uses set mutation methods instead of manually reimplementing (update, set_difference etc).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/124551
Approved by: https://github.com/ezyang
2024-04-21 14:12:33 +00:00
Aaron Gokaslan
5a1216bb2e [BE]: Update ruff to 0.4.1 (#124549)
Update ruff to 0.4.1 .
This version fixes a lot false negatives/false positives, is 20-40% faster, and has various other bug fixes.

Below is a before and after table showing the execution time of ruff lint and ruff format in milliseconds courtesy of https://astral.sh/blog/ruff-v0.4.0

| Repository                                         | Linter (v0.3) | Linter (v0.4) | Formatter (v0.3) | Formatter (v0.4) |
|----------------------------------------------------|---------------|---------------|------------------|------------------|
| [pytorch/pytorch](https://github.com/pytorch/pytorch) | 328.7         | 251.8         | 351.1            | 274.9            |

Pull Request resolved: https://github.com/pytorch/pytorch/pull/124549
Approved by: https://github.com/ezyang
2024-04-21 14:06:23 +00:00
Aaron Gokaslan
1562dae62c [BE]: Apply RUF025 dict.fromkeys preview rule (#118637)
Simplifies and optimizes dict construction using the `fromkeys` classmethod ctor. This also makes it really obvious when all the keys will have the same static value, which could be a bug if unintentional. It is also significantly faster than using a dict comprehension. The rule is in preview, but I am adding a forward fix for when it becomes stable.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/118637
Approved by: https://github.com/albanD
2024-01-30 20:46:54 +00:00
Peter Bell
7c33ce7702 [CI] Install dill in ci (#116214)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/116214
Approved by: https://github.com/malfet
ghstack dependencies: #116230
2024-01-24 23:42:35 +00:00
Aaron Gokaslan
ee5d981249 [BE]: Enable RUFF PERF402 and apply fixes (#115505)
* Enable PERF402. Makes code more efficient and succinct by removing useless list copies that could be accomplished either via a list constructor or extend call. All test cases have noqa added since performance is not as sensitive in that folder.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115505
Approved by: https://github.com/malfet
2023-12-20 18:01:24 +00:00
Aaron Gokaslan
376217cc0b [BE]: Apply FURB145 to make code more readable and idiomatic. (#112990)
Testing out some new rules that are in beta, I think I will apply this one codebase wide once it's out of preview. Replaces the hack of using `[:]` to do copies of list with the proper copy method. More efficient and more readable.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112990
Approved by: https://github.com/ezyang
2023-11-06 13:15:04 +00:00
Ramil Nugmanov
91eeb77260 StackDataset batched sampling (#110694)
Optimization of loading minibatches

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110694
Approved by: https://github.com/ejguan
2023-10-10 22:05:51 +00:00
Joel Schlosser
ac01304e22 pin_memory support for NT (#110404)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110404
Approved by: https://github.com/cpuhrsch, https://github.com/albanD
ghstack dependencies: #110292
2023-10-10 21:58:19 +00:00
Joel Schlosser
43ea782af3 Multiprocessing support for NT (#110292)
Fixes #110161

Allows NTs to be used in DataLoaders with `num_workers > 1`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110292
Approved by: https://github.com/cpuhrsch, https://github.com/albanD
2023-10-10 21:58:19 +00:00
PyTorch MergeBot
dac895c10a Revert "Multiprocessing support for NT (#110292)"
This reverts commit f17fe89e14.

Reverted https://github.com/pytorch/pytorch/pull/110292 on behalf of https://github.com/kit1980 due to Causes CUDA memory leaks ([comment](https://github.com/pytorch/pytorch/pull/110292#issuecomment-1749852095))
2023-10-06 01:07:40 +00:00
PyTorch MergeBot
81ce5d5725 Revert "pin_memory support for NT (#110404)"
This reverts commit 3597325bc7.

Reverted https://github.com/pytorch/pytorch/pull/110404 on behalf of https://github.com/kit1980 due to Previous PR in the stack caused CUDA memory leaks ([comment](https://github.com/pytorch/pytorch/pull/110404#issuecomment-1749850211))
2023-10-06 01:03:17 +00:00
Joel Schlosser
3597325bc7 pin_memory support for NT (#110404)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110404
Approved by: https://github.com/cpuhrsch, https://github.com/albanD
ghstack dependencies: #110292
2023-10-05 16:33:22 +00:00
Joel Schlosser
f17fe89e14 Multiprocessing support for NT (#110292)
Fixes #110161

Allows NTs to be used in DataLoaders with `num_workers > 1`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110292
Approved by: https://github.com/cpuhrsch, https://github.com/albanD
2023-10-05 15:04:48 +00:00
PyTorch MergeBot
7e6cf04a84 Revert "Multiprocessing support for NT (#110292)"
This reverts commit 881e7304d6.

Reverted https://github.com/pytorch/pytorch/pull/110292 on behalf of https://github.com/jbschlosser due to Address review comments ([comment](https://github.com/pytorch/pytorch/pull/110292#issuecomment-1743524901))
2023-10-02 18:27:13 +00:00
Joel Schlosser
881e7304d6 Multiprocessing support for NT (#110292)
Fixes #110161

Allows NTs to be used in DataLoaders with `num_workers > 1`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110292
Approved by: https://github.com/cpuhrsch
ghstack dependencies: #110219
2023-10-02 18:14:34 +00:00
Aaron Gokaslan
3488837ec1 Update ruff to v0.0.286 (#108058)
Updates ruff to v0.0.286 and fixes one false negative.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108058
Approved by: https://github.com/albanD
2023-08-28 22:55:56 +00:00
PyTorch MergeBot
ecde622649 Revert "reseed all Generators in Dataloader's _worker_loop() -- via GC (#107131)"
This reverts commit 42625da5e1.

Reverted https://github.com/pytorch/pytorch/pull/107131 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/107131#issuecomment-1690325745))
2023-08-23 17:08:07 +00:00
Nicolas Hug
42625da5e1 reseed all Generators in Dataloader's _worker_loop() -- via GC (#107131)
Alternative to https://github.com/pytorch/pytorch/pull/107034, implements @ezyang 's suggestion from https://github.com/pytorch/pytorch/pull/107034#discussion_r1292857201.

This PR addresses https://fb.workplace.com/groups/pytorch.oss.dev/posts/1699944830430051 and does a bunch of stacked changes:

- Make `Generator` class support GC;this makes all `Generator` instances tracked and accessile through Python's GC.
- Use the GC to retrieve all existing Generator instances in Dataloader's `_worker_loop` and re-seed them: this extends what is already applied to the global/default Generator, which is already re-seeded.

~TODO: a bit of docs and justification, which I'll do if this PR is mergeable.~ -- Done

CC @albanD @ezyang  as previously discussed

BC-Breaking Note
-------------------

We now re-seed all `Generator` instances within the `Dataloader` workers' loop to ensure that their RNG is different across workers.
Previously, the RNG of user-defined `Generators` would be the same across workers, which could lead to wrong training procedures. This only affects user-defined `Generators`, not the default `Generator` (which was already re-seeded).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107131
Approved by: https://github.com/ezyang
2023-08-18 10:23:23 +00:00
shibo19
bb2fcc7659 unify TEST_CUDA (#106685)
Fixes #ISSUE_NUMBER
as title, unify TEST_CUDA
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106685
Approved by: https://github.com/zou3519
2023-08-10 09:01:36 +00:00
Justin Chu
4cc1745b13 [BE] f-stringify torch/ and scripts (#105538)
This PR is a follow up on the pyupgrade series to convert more strings to use f-strings using `flynt`.

- https://docs.python.org/3/reference/lexical_analysis.html#f-strings
- https://pypi.org/project/flynt/

Command used:

```
flynt torch/ -ll 120
flynt scripts/ -ll 120
flynt tools/ -ll 120
```

and excluded `collect_env.py`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105538
Approved by: https://github.com/ezyang, https://github.com/malfet
2023-07-21 19:35:24 +00:00
Justin Chu
73e1455327 [BE] Enable ruff's UP rules and autoformat test/ (#105434)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105434
Approved by: https://github.com/albanD
2023-07-19 20:36:06 +00:00