Fixes #79828
In a distributed environment, before this PR, the DataLoader would create a Tensor holding the shared seed on RANK 0 and send the Tensor to the other processes. However, when `NCCL` is used as the distributed backend, the Tensor must be moved to CUDA before it is broadcast from RANK 0 to the other RANKs. This causes the reported issue, because the DataLoader does not move the Tensor to CUDA before sharing it via `NCCL`.
After offline discussion with @mrshenli, we think the distributed Store is a better solution, as the shared seed is just an integer value. This also removes the dependency on NCCL and CUDA when sharing info between distributed processes for the DataLoader.
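A minimal sketch of the store-based flow (the helper name `_share_seed_via_store` is hypothetical; in practice the store could be a `TCPStore`/`FileStore` or the default store created by `init_process_group`):
```py
import torch
import torch.distributed as dist

def _share_seed_via_store(store: dist.Store, rank: int) -> int:
    # Hypothetical helper: RANK 0 writes the seed as a string key and every
    # other rank reads it back. No tensor, no NCCL broadcast, no CUDA copy.
    key = "_dl_shared_seed"
    if rank == 0:
        seed = int(torch.empty((), dtype=torch.int64).random_().item())
        store.set(key, str(seed))
        return seed
    return int(store.get(key))
```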
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79829
Approved by: https://github.com/VitalyFedyunin, https://github.com/NivekT
1. Change the sharding strategy from sharding by worker first and then by rank to sharding by rank first and then by worker.
2. Fetch the rank and world size in the main process, for the sake of `spawn`.
For change 1:
Before this PR, when the dataset cannot be evenly divided by `num_workers * world_size`, the workers on the first RANKs retrieve more data.
Using the following example (a small simulation reproducing these counts is sketched below):
- dataset size: 100
- world_size: 4
- num_worker: 2
The number of samples retrieved by each rank before this PR:
- Rank 0: 26
- Rank 1: 26
- Rank 2: 24
- Rank 3: 24
The number of samples retrieved by each rank after this PR:
- Rank 0: 25
- Rank 1: 25
- Rank 2: 25
- Rank 3: 25
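A minimal sketch (not the actual implementation) that simulates both nested sharding orders and reproduces the counts above:
```py
def shard_counts(dataset_size, world_size, num_workers, rank_first):
    # Count how many samples each rank receives when the dataset is sharded
    # round-robin, either by rank first (new) or by worker first (old).
    indices = list(range(dataset_size))
    counts = [0] * world_size
    for rank in range(world_size):
        for worker in range(num_workers):
            if rank_first:
                shard = indices[rank::world_size][worker::num_workers]
            else:
                shard = indices[worker::num_workers][rank::world_size]
            counts[rank] += len(shard)
    return counts

print(shard_counts(100, 4, 2, rank_first=False))  # [26, 26, 24, 24]
print(shard_counts(100, 4, 2, rank_first=True))   # [25, 25, 25, 25]
```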
For change 2:
Before this PR, `dist` functions were invoked inside the worker processes. That is fine when the worker processes are forked from the parent process: all environment variables are inherited and visible to those `dist` functions. However, when the worker processes are spawned, they cannot access these environment variables, so the dataset is not sharded by rank.
After this PR, `_sharding_worker_init_fn` should work for both the `spawn` and the `fork` case.
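A minimal sketch of the idea; `sharding_worker_init_fn` and `make_loader` are hypothetical illustrations of what `_sharding_worker_init_fn` does: the rank and world size are read in the main process and baked into the init function, so nothing inside the worker depends on environment variables.
```py
import functools

import torch.distributed as dist
from torch.utils.data import DataLoader, get_worker_info

def sharding_worker_init_fn(world_size, rank, worker_id):
    # world_size/rank were captured in the main process, so this also works
    # when workers are started with `spawn` and do not inherit the
    # distributed environment variables.
    info = get_worker_info()
    num_shards = world_size * info.num_workers
    shard_id = rank * info.num_workers + worker_id  # rank first, then worker
    # ...apply num_shards / shard_id to the datapipe graph here...

def make_loader(dp, num_workers=2):
    # Rank and world size are read once in the main process and attached to
    # the worker_init_fn via functools.partial.
    init_fn = functools.partial(sharding_worker_init_fn,
                                dist.get_world_size(), dist.get_rank())
    return DataLoader(dp, num_workers=num_workers, worker_init_fn=init_fn)
```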
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79041
Approved by: https://github.com/VitalyFedyunin, https://github.com/NivekT
Fixes https://github.com/pytorch/data/issues/426
This PR introduces two main changes:
- It ensures the `ShufflerDataPipe` shares the same seed across distributed processes.
- Users can reset `shuffle` for persistent workers per epoch.
Detail:
- `shared_seed` is shared across distributed and worker processes. It seeds a `shared_rng` that provides seeds to each `ShufflerDataPipe` in the pipeline (see the sketch after this list).
- `worker_loop` now accepts a new `shared_seed` argument to receive this shared seed.
- The `shared_seed` is attached to `_ResumeIteration` so the seed can be reset per epoch for persistent workers.
- I chose not to touch `base_seed`, simply to avoid BC issues.
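A minimal sketch of the seeding flow described above (`set_seed` on the shuffler is a stand-in for the hook the worker loop actually uses):
```py
import torch

def seed_shufflers(shufflers, shared_seed):
    # Every rank and worker receives the same shared_seed, so shared_rng
    # produces the same sequence everywhere and every ShufflerDataPipe in
    # the pipeline ends up seeded identically across processes.
    shared_rng = torch.Generator()
    shared_rng.manual_seed(shared_seed)
    for shuffler in shufflers:
        seed = int(torch.empty((), dtype=torch.int64).random_(generator=shared_rng).item())
        shuffler.set_seed(seed)  # stand-in for the real per-shuffler seeding hook
```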
I used this [script](https://gist.github.com/ejguan/d88f75fa822cb696ab1bc5bc25844f47) to test the result with `world_size=4`. Please check the result in: https://gist.github.com/ejguan/6ee2d2de12ca57f9eb4b97ef5a0e300b
You can see there isn't any duplicated or missing element in any epoch. And, with the same seed, the order of data remains the same across epochs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78765
Approved by: https://github.com/VitalyFedyunin
This is the first PR to make DataPipe deterministic.
Users should be able to use `torch.manual_seed(seed)` to control the shuffle order for the following cases:
- Directly over a `DataPipe`
- With a single-process DataLoader
- With a multiprocessing DataLoader
Unfortunately, for distributed training, users have to run `apply_shuffle_seed` manually to make sure all distributed processes have the same shuffle order.
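For example (the import path and signature of `apply_shuffle_seed` are assumptions based on the datapipes graph utilities; check the PR for the exact API):
```py
import torch
from torch.utils.data.datapipes.iter import IterableWrapper
from torch.utils.data.graph_settings import apply_shuffle_seed  # path assumed

dp = IterableWrapper(range(10)).shuffle()

# Directly over the DataPipe, or via a single-/multi-process DataLoader:
# torch.manual_seed controls the shuffle order.
torch.manual_seed(0)
print(list(dp))

# Distributed training: every rank must apply the same seed to the graph manually.
rng = torch.Generator()
rng.manual_seed(0)  # the seed all ranks agreed on
dp = apply_shuffle_seed(dp, rng)
```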
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77741
Approved by: https://github.com/VitalyFedyunin, https://github.com/NivekT
Summary:
X-link: https://github.com/pytorch/data/pull/368
This PR aims to expose the right data-related APIs.
There are two more changes in this PR that convert public APIs to private ones:
`check_lambda_fn` -> `_check_lambda_fn`
`deprecation_warning` -> `_deprecation_warning`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76143
Reviewed By: albanD, NivekT
Differential Revision: D35798311
Pulled By: ejguan
fbshipit-source-id: b13fded5c88a533c706702fb2070c918c839dca4
(cherry picked from commit 0b534b829a2e90e1e533951c6d334fdeaa9358b9)
Summary:
`pin_memory` has an optional device parameter to specify which device you want to pin memory for. With only that change, the DataLoader would work only for the CUDA backend. To add support for other backends that support pinned memory, the DataLoader is updated to take the device as an optional parameter.
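A minimal sketch of how this surfaces on the `DataLoader`; the keyword name `pin_memory_device` follows the eventually released API and is an assumption here, as is the example device string:
```py
import torch
from torch.utils.data import DataLoader, TensorDataset

ds = TensorDataset(torch.arange(100, dtype=torch.float32))

# Pin host memory for a specific backend instead of implicitly assuming CUDA.
loader = DataLoader(ds, batch_size=32, pin_memory=True,
                    pin_memory_device="cuda")  # or another backend with pinned-memory support
```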
Fixes #{issue number}
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65402
Reviewed By: zou3519
Differential Revision: D32282204
Pulled By: VitalyFedyunin
fbshipit-source-id: e2e09876969af108d0db38af7c2d1b2f1cfa9858
(cherry picked from commit 3b76e151964fce442e27fe8fb5c37af930da4fa1)
Without this, `DataLoader2` will just add a `Shuffler` to the end of the datapipe if `shuffle=True`:
```py
from torch.utils.data.dataloader_experimental import DataLoader2
from torchdata.datapipes.iter import IterableWrapper, IterDataPipe, Shuffler
class Sorter(IterDataPipe):
    def __init__(self, datapipe):
        self.datapipe = datapipe

    def __iter__(self):
        return iter(sorted(self.datapipe))

data = list(range(1000))
dp = IterableWrapper(data)
dp = Shuffler(dp).set_shuffle(False)
dp = Sorter(dp)

dl2 = DataLoader2(dp, shuffle=True, batch_size=None)
assert list(dl2) == data  # fails unless you hit a lucky random seed
```
This example is somewhat nonsensical, but it demonstrates that we cannot simply add a `Shuffler`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75014
Approved by: https://github.com/ejguan
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71579
Fixes #1551
As the comment in the code explains, register a function to terminate persistent workers.
By keeping a reference to these workers in `atexit`, we prevent the Python interpreter from killing these persistent worker processes before `pin_memory_thread` exits.
And if users explicitly shut down the DataLoader iterator, this `atexit` function becomes a no-op.
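A minimal sketch of the pattern (the names are hypothetical, not the DataLoader internals):
```py
import atexit

def register_persistent_worker_cleanup(workers, shutdown_fn):
    # Keep a reference to the worker handles inside an atexit hook so the
    # interpreter cannot tear them down before pin_memory_thread has exited.
    # If the iterator was already shut down explicitly, the hook is a no-op.
    def _clean_up():
        if any(w.is_alive() for w in workers):
            shutdown_fn()
    atexit.register(_clean_up)
```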
Test Plan: Imported from OSS
Reviewed By: VitalyFedyunin
Differential Revision: D33896537
Pulled By: ejguan
fbshipit-source-id: 36b57eac7523d8aa180180c2b61fc693ea4638ae
(cherry picked from commit 05add2ae0f)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71579
Fixes #1551
As the comment in the code explains, register a function to terminate persistent workers. Using `atexit` makes sure termination of persistent workers always happens at the end (after `pin_memory_thread` exits).
We need such a mechanism because, in some rare cases, the Python interpreter would clean up the worker processes before the DataLoader iterator.
Test Plan: Imported from OSS
Reviewed By: VitalyFedyunin
Differential Revision: D33694867
Pulled By: ejguan
fbshipit-source-id: 0847f4d424a0cd6b3c0be8235d505415970254e8
(cherry picked from commit 18ad4621af)
Summary:
This is to fix Pyre errors in our applications:
* calling `tensor.cos()` etc.
* creating a data loader with a batch sampler that is `List[List[int]]`.
Test Plan: TODO: rebase the diffs and run Pyre.
Reviewed By: ejguan
Differential Revision: D31309564
fbshipit-source-id: 1c6f3070d7570260de170e2fe2153d277b246745
Summary:
I'm not entirely sure how to use TypeVar, so if someone could give me a hint it would be appreciated. Also, let me know if you want me to add tests so we can make sure non-integer samplers actually work. It seems like `test/test_dataloader.py` is the correct location, but that's a big file.
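For context, this is the kind of non-integer sampler the loosened typing is meant to permit (a minimal sketch, not part of the PR):
```py
from typing import Iterator, List

from torch.utils.data import Sampler

class PairSampler(Sampler[List[int]]):
    # Yields lists of indices instead of single ints, which the relaxed
    # typing on Sampler is intended to allow.
    def __init__(self, n: int) -> None:
        self.n = n

    def __iter__(self) -> Iterator[List[int]]:
        return iter([[i, (i + 1) % self.n] for i in range(self.n)])

    def __len__(self) -> int:
        return self.n
```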
Fixes https://github.com/pytorch/pytorch/issues/63483
ejguan
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63500
Reviewed By: mruberry
Differential Revision: D30403689
Pulled By: ejguan
fbshipit-source-id: 464e09e5aad3215b94a29cc5e21cb4b10ec136e3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60959
Add TorchVitals for the DataLoader; this indicates that the DataLoader was enabled.
This is a no-op if the TORCH_VITALS environment variable is not set.
Test Plan: buck test mode/dbg caffe2/test:torch -- --regex vitals
Reviewed By: VitalyFedyunin
Differential Revision: D29445146
fbshipit-source-id: d5778fff3dafb3c0463fec7a498bff4905597518
Summary:
During development it is common practice to put `type: ignore` comments on lines that are correct, but that `mypy` doesn't recognize. This often stems from the fact that the `mypy` version in use wasn't able to handle the pattern.
With every new release `mypy` gets better at handling complex code. In addition to fixing all the previously accepted but now failing patterns, we should also revisit all `type: ignore` comments to see whether they are still needed. Fortunately, we don't have to do that manually: by adding `warn_unused_ignores = True` to the configuration, `mypy` will error out whenever it encounters a `type: ignore` that is no longer needed.
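For reference, the relevant setting in a `mypy` configuration file (a minimal sketch; the real `mypy.ini` carries many more options):
```ini
[mypy]
# Error out on `type: ignore` comments that are no longer needed.
warn_unused_ignores = True
```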
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60006
Reviewed By: jbschlosser, malfet
Differential Revision: D29133237
Pulled By: albanD
fbshipit-source-id: 41e82edc5cd5affa7ccedad044b59b94dad4425a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56528
I searched across internal and external usages of DataLoader; people haven't started to use `generator` for `DataLoader`.
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D27908487
Pulled By: ejguan
fbshipit-source-id: 14c83ed40d4ba4dc988b121968a78c2732d8eb93
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56488
Considering the number of requests for this feature, introduce NumPy seeding as the default within each DataLoader worker.
## BC-breaking Note:
- By introducing a default numpy.random seeding strategy for DataLoader workers, users no longer need to manually set the seed for workers via `worker_init_fn`. This PR does not affect users who currently use `worker_init_fn` to set a customized seed for workers (a sketch of that manual pattern follows this list).
- DataLoader preserves reproducibility for users who use numpy.random within their Dataset.
- Multiprocessing (without a `worker_init_fn` that defines the seed for NumPy):
  - Start method `spawn`: each worker now has its own seed for numpy.random, rather than a seed derived from the import time of the NumPy module, which made the DataLoader lose reproducibility.
  - Start method `fork`: each worker not only gets the same benefit as with `spawn`, but also has a different NumPy seed by default, rather than inheriting the same seed.
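For reference, this is roughly the manual pattern the new default replaces: deriving a per-worker NumPy seed from the per-worker torch seed inside `worker_init_fn` (a sketch, not the internal implementation):
```py
import numpy as np
import torch

def numpy_seed_worker_init(worker_id):
    # torch.initial_seed() differs per worker (base_seed + worker_id), so
    # truncating it to 32 bits gives each worker its own NumPy seed.
    np.random.seed(torch.initial_seed() % 2**32)
```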
Using the following Dataset and script as an example:
```py
import multiprocessing as mp
import numpy as np
import torch
from torch.utils.data import DataLoader, Dataset

class RandomDataset(Dataset):
    def __getitem__(self, ind):
        item = [ind, np.random.randint(1, 10000)]
        return item

    def __len__(self):
        return 20

if __name__ == '__main__':
    ctx = mp.get_context('fork')
    ds = RandomDataset()
    g = torch.Generator()
    g.manual_seed(0)
    dl = DataLoader(ds, 2, shuffle=False, num_workers=4, multiprocessing_context=ctx, generator=g)
    epochs = 2
    for _ in range(epochs):
        for batch in dl:
            print(batch)
        print("====" * 10)
```
### 1.8.1:
Each worker generates the same random result per iteration, and the seed is reset to the same value at each epoch.
```py
tensor([[ 0, 7449],
[ 1, 1519]])
tensor([[ 2, 7449],
[ 3, 1519]])
tensor([[ 4, 9645],
[ 5, 2387]])
tensor([[ 6, 9645],
[ 7, 2387]])
tensor([[ 8, 3118],
[ 9, 4552]])
=========================
tensor([[ 0, 7449],
[ 1, 1519]])
tensor([[ 2, 7449],
[ 3, 1519]])
tensor([[ 4, 9645],
[ 5, 2387]])
tensor([[ 6, 9645],
[ 7, 2387]])
tensor([[ 8, 3118],
[ 9, 4552]])
=========================
```
### This PR:
Each worker has a different seed at the beginning and is re-seeded at each epoch.
```py
tensor([[ 0, 8715],
[ 1, 5555]])
tensor([[ 2, 6379],
[ 3, 1432]])
tensor([[ 4, 3271],
[ 5, 5132]])
tensor([[ 6, 4287],
[ 7, 1104]])
tensor([[ 8, 8682],
[ 9, 1699]])
=========================
tensor([[ 0, 1374],
[ 1, 996]])
tensor([[ 2, 143],
[ 3, 3507]])
tensor([[ 4, 5887],
[ 5, 4730]])
tensor([[ 6, 7274],
[ 7, 738]])
tensor([[ 8, 6374],
[ 9, 1572]])
=========================
```
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D27908486
Pulled By: ejguan
fbshipit-source-id: 5f313a30563bedeb88be214fa4beca0cefe9e4f4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49486
Remove code for Python 3.5 and lower.
There's more that can be removed/modernised, but sticking mainly to redundant version checks here, to keep the diff/PR smaller.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46579
Reviewed By: zou3519
Differential Revision: D24453571
Pulled By: ezyang
fbshipit-source-id: c2cfcf05d6c5f65df64d89c331692c9aec09248e
Summary:
This small PR fixes a one character typo in the docstring for `DataLoader`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49437
Reviewed By: ngimel
Differential Revision: D25665971
Pulled By: mrshenli
fbshipit-source-id: b60f975f1e3bf0bb8f88e39f490f716c602f087e
Summary:
Fixes https://github.com/pytorch/pytorch/issues/47441
To give users more information about Python-level functions in profiler traces, we propose to instrument the following functions:
```
_BaseDataLoaderIter.__next__
Optimizer.step
Optimizer.zero_grad
```
Because `record_function` already uses `if (!active)` to check whether the profiler is enabled, we don't explicitly call `torch.autograd._profiler_enabled()` before each instrumentation.
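The instrumentation pattern looks roughly like this (a sketch, not the actual DataLoader code):
```py
from torch.autograd.profiler import record_function

class _InstrumentedIter:
    # Wrap __next__ in a record_function block so it shows up in profiler
    # traces; when the profiler is inactive the overhead is negligible.
    def __init__(self, it):
        self._it = it

    def __iter__(self):
        return self

    def __next__(self):
        with record_function("_InstrumentedIter.__next__"):
            return next(self._it)
```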
Acknowledgement: nbcsm, guotuofeng, gunandrose4u, guyang3532, mszhanyi
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47655
Reviewed By: smessmer
Differential Revision: D24960386
Pulled By: ilia-cher
fbshipit-source-id: 2eb655789e2e2f506e1b8f95ad3d470c83281102
Summary:
This PR needs discussion, as it changes the behavior of `DataLoader`. It can be closed if it's not considered a good practice.
Currently, the `DataLoader` spawns a new `_BaseDataLoaderIter` object every epoch.
In the case of the multiprocess DataLoader, the worker processes are re-created every epoch, and they make a copy of the original `Dataset` object.
If users want to cache data or do some tracking on their datasets, all their data is wiped out every epoch. Notice that this doesn't happen when the number of workers is 0, giving some inconsistencies between the multiprocess and serial data loaders.
This PR keeps the `_BaseDataLoaderIter` object alive and just resets it between epochs, so the workers remain active and so do their own `Dataset` objects. People seem to file issues about this often.
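The behavior later surfaced as the `persistent_workers` flag on `DataLoader`; a minimal sketch of a dataset whose per-worker cache survives across epochs when workers are kept alive (the flag name refers to the released API, not necessarily this exact PR):
```py
import torch
from torch.utils.data import DataLoader, Dataset

class CachingDataset(Dataset):
    def __init__(self):
        self.cache = {}

    def __len__(self):
        return 8

    def __getitem__(self, i):
        # With freshly created workers every epoch, this cache is rebuilt each
        # time, because the new worker processes copy a pristine Dataset.
        if i not in self.cache:
            self.cache[i] = torch.randn(3)
        return self.cache[i]

loader = DataLoader(CachingDataset(), num_workers=2, persistent_workers=True)
for _ in range(2):
    for batch in loader:
        pass  # the worker-side caches are reused in the second epoch
```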
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35795
Reviewed By: ailzhang
Differential Revision: D23426612
Pulled By: VitalyFedyunin
fbshipit-source-id: e16950036bae35548cd0cfa78faa06b6c232a2ea