pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
Ramil Nugmanov	28098cae6b	[DataLoader] Adding `StackDataset` (#101338 ) Torch's list of wrapping datasets has: `TensorDataset`, `ConcatDataset`, `ChainDataset`. `TensorDataset` is useful for stacking sets of tensors but can't work with objects without a `.size()` method. This PR proposes `StackDataset`, similar to `TensorDataset` but for the general case, like `ConcatDataset`. Possible uses of `StackDataset` are multimodal networks with different inputs such as image+text, or stacking non-tensor inputs with the property to predict. Pull Request resolved: https://github.com/pytorch/pytorch/pull/101338 Approved by: https://github.com/ejguan, https://github.com/NivekT	2023-05-18 00:57:12 +00:00
Kevin Tse	be8d88f8d0	[DataLoader] Removing DataLoader2 related code (#88848 ) Removing these lines of code as `DataLoader2` has been added to [TorchData](https://github.com/pytorch/data). I'm importing this to confirm it will not impact internal codes. Differential Revision: [D41201578](https://our.internmc.facebook.com/intern/diff/D41201578) Pull Request resolved: https://github.com/pytorch/pytorch/pull/88848 Approved by: https://github.com/ejguan	2022-11-11 22:27:01 +00:00
erjia	b13b10a8fa	Extend collate function that can register collate functions to handle specific types (#85748 ) As per request from Vision team, adding `collate` function with an extra argument of `collate_fn_map` to dispatch custom collate functions for non-collection objects and specific objects. If the type of batch element is not present in `collate_fn_map`, it will go through all keys in the insertion order to check if the type is a subclass of the key. If so, it will invoke the corresponding collate function. And, `default_collate` will utilize the `collate` function with a few default collate functions for `int`, `float`, `str` and `numpy object`. Benefit: - Domain teams can register their own `collate` function to handle their specific type of objects - Easier for users to extend from the `collate` function. Pull Request resolved: https://github.com/pytorch/pytorch/pull/85748 Approved by: https://github.com/NivekT, https://github.com/pmeier	2022-09-30 13:30:18 +00:00
erjia	365ce350cb	Make ShufflerDataPipe deterministic for SP & MP DataLoader (#77741 ) This is the first PR to make DataPipe deterministic. Users should be able to use `torch.manual_seed(seed)` to control the shuffle order for the following cases: - Directly over `DataPipe` - For single-process DataLoader - Multiprocessing DataLoader Unfortunately, for distributed training, users have to run `apply_shuffle_seed` manually to make sure all distributed processes have the same shuffle order. Pull Request resolved: https://github.com/pytorch/pytorch/pull/77741 Approved by: https://github.com/VitalyFedyunin, https://github.com/NivekT	2022-05-18 23:32:07 +00:00
Kevin Tse	eec994fc16	[DataPipe] Separating DataPipes from Dataset into different files (#73396 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/73396 Separating DataPipes from Dataset into different files. This makes the code more maintainable and simplifies some of the code generation. I have also tried to move `datapipe.py` into `torch.utils.data.datapipes`, but that will lead to circular import and rewriting many import statements. Should I put more time and go down that path some more? Fixes https://github.com/pytorch/data/issues/213 Test Plan: Imported from OSS Reviewed By: ejguan Differential Revision: D34481962 Pulled By: NivekT fbshipit-source-id: 42fb26fe7fc334636852cfd8719fc807bdaa7912 (cherry picked from commit 81e76a64e297cb5c58caa951c554e49526173936)	2022-03-15 14:46:34 +00:00
Erjia Guan	0721fc6474	Decouple MapDataPipe from Dataset (#70991 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70991 Test Plan: Imported from OSS Reviewed By: dagitses Differential Revision: D33477680 Pulled By: ejguan fbshipit-source-id: d3e89492e921a96791319f35052a229684ddf7cf	2022-01-07 14:28:41 -08:00
Kevin Tse	b67eaec853	[DataLoader] more clearly expose 'default_collate' and 'default_convert' to users (#69862 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69862 Fixes #69445 cc SsnL VitalyFedyunin ejguan NivekT Test Plan: Imported from OSS Reviewed By: ejguan, ngimel Differential Revision: D33068792 Pulled By: NivekT fbshipit-source-id: ef9791acdc23d014b8761fa7420062d454ce8969	2021-12-14 11:18:26 -08:00
Vitaly Fedyunin	d90012689f	[DataPipe] Control shuffle settings from DataLoader2 (#65756 ) Summary: Makes `shuffle` DataPipe sensitive to DataLoader(2) `shuffle` kwarg. Pull Request resolved: https://github.com/pytorch/pytorch/pull/65756 Reviewed By: albanD Differential Revision: D31344867 Pulled By: VitalyFedyunin fbshipit-source-id: e0084e0ac193ac784d6298328ca1222745681347	2021-12-14 07:35:26 -08:00
Vitaly Fedyunin	ab5e1c69a7	[WIP] Example of DataPipes and DataFrames integration (#60840 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/60840 Test Plan: Imported from OSS Reviewed By: wenleix, ejguan Differential Revision: D29461080 Pulled By: VitalyFedyunin fbshipit-source-id: 4909394dcd39e97ee49b699fda542b311b7e0d82	2021-09-13 18:50:15 -07:00
Vitaly Fedyunin	82174330d0	[DataLoader2] Adding Messages, Protocols, Loop wrappers (#63882 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63882 Test Plan: Imported from OSS Reviewed By: ejguan Differential Revision: D30627452 Pulled By: VitalyFedyunin fbshipit-source-id: 561ea2df07f3572e04401171946154024126387b	2021-08-30 07:57:20 -07:00
Vitaly Fedyunin	e1bdebf685	Adding DataLoader2 class as future replacement of DataLoader (#63742 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63742 Supports sharding and batching on loader level Test Plan: Imported from OSS Reviewed By: ejguan Differential Revision: D30494506 Pulled By: VitalyFedyunin fbshipit-source-id: 6648e09d955055ac38e3a4e3973f701acefca762	2021-08-23 18:09:07 -07:00
Alban Desmaison	71da114412	Revert D30426527: Adding DataLoader2 class as future replacement of DataLoader Test Plan: revert-hammer Differential Revision: D30426527 (`5a7133b87f`) Original commit changeset: e5905d3364c4 fbshipit-source-id: 794d8a4e9256ccff8cf894aee10eff6adc30d502	2021-08-20 12:06:52 -07:00
Vitaly Fedyunin	5a7133b87f	Adding DataLoader2 class as future replacement of DataLoader (#63523 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63523 * #63522 Adding IterableAsDataPipe IterDataPipe useful for tests and simple cases Supports sharding and batching on loader level Test Plan: Imported from OSS Reviewed By: ejguan Differential Revision: D30426527 Pulled By: VitalyFedyunin fbshipit-source-id: e5905d3364c4880e720dd62fb066f08881c71a6e	2021-08-20 09:01:55 -07:00
Vitaly Fedyunin	d3bdf345cb	Introducing DataChunk for DataPipes batching (#62768 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/62768 This is part of TorchArrow DF support preparation, separating it to multiple PRs to simplify review process. Test Plan: Imported from OSS Reviewed By: ejguan Differential Revision: D30149090 Pulled By: VitalyFedyunin fbshipit-source-id: a36b5ff56e2ac6b06060014d4cd41b487754acb8	2021-08-06 08:38:33 -07:00
Vitaly Fedyunin	d6899fe492	[Refactoring] Reordering imports in utils/data/__init__.py (#61324 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/61324 Test Plan: Imported from OSS Reviewed By: ejguan Differential Revision: D29588895 Pulled By: VitalyFedyunin fbshipit-source-id: 5e719c80f9cb5630c65187ac89773831777f368d	2021-07-21 21:38:28 -07:00
Marcio Porto	4942fe0290	[DataLoader] Introduce MapMapDataPipe functional datapipe (#58258 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/58258 As part of https://github.com/pytorch/pytorch/issues/57031, this PR adds the `MapMapDataPipe` functional datapipe for the `MapDataPipe` class. Usage: ``` def fn(x): return x * 10 dp = CountingDataset(n=10) dp.map(fn) ``` Reviewed By: ejguan Differential Revision: D28394510 fbshipit-source-id: 8d71b1f5723dff52385c3ce753944304896af678	2021-05-20 09:00:21 -07:00
Erjia Guan	3b977b3b4d	[DataLoader] Add context manager for runtime type validation (#55936 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/55936 Test Plan: Imported from OSS Reviewed By: VitalyFedyunin Differential Revision: D27743476 Pulled By: ejguan fbshipit-source-id: 8f0454ccf3ec37807598056433bff91013fa9bb9	2021-05-12 11:59:16 -07:00
Erjia Guan	5c696443c7	[DataLoader] Modify construct_time_validation to argument_validation (#55836 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/55836 Change construct_time_validation to argument_validation as we should provide users the flexibility to use this decorator over all different functions that require type validation. It can also work as a construct-time validation: ```py class ExampleDataPipe(IterDataPipe): @argument_validation def __init__(self, dp: IterDataPipe[int]): self.dp = dp ... ``` Notebook is also updated. Test Plan: Imported from OSS Reviewed By: VitalyFedyunin Differential Revision: D27743478 Pulled By: ejguan fbshipit-source-id: 49743152d121028cd7d72d89dc7df5c7c7b94c41	2021-05-12 11:58:05 -07:00
Erjia Guan	c549a147a9	[DataLoader] Typing Doc (#54773 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/54773 Test Plan: Imported from OSS Reviewed By: VitalyFedyunin Differential Revision: D27364246 Pulled By: ejguan fbshipit-source-id: 48908555853c364d2d3cc173e3b73a6bec2e19f1	2021-04-02 15:22:35 -07:00
Erjia Guan	0b1c3dfae4	[DataLoader] Typing Enforcement for DataPipe at runtime (#54544 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/54544 ## Feature - Add `subinstance(data, type)` to check that `data` is a subtype instance of `type` - Add a `runtime_validation` decorator to validate that the data returned from `__iter__` is a subtype instance of the hint. Test Plan: Imported from OSS Reviewed By: VitalyFedyunin Differential Revision: D27327234 Pulled By: ejguan fbshipit-source-id: fb6a332762b0fe75284bb2b52a13ed171b42558c	2021-04-02 15:22:32 -07:00
Erjia Guan	1535520f08	[DataLoader] Typing Enforcement for DataPipe at construct-time (#54066 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/54066 ## Feature - Add a decorator `construct_time_validation` to validate each input datapipe according to the corresponding type hint. Test Plan: Imported from OSS Reviewed By: VitalyFedyunin Differential Revision: D27327236 Pulled By: ejguan fbshipit-source-id: a9d4c6edb5b05090bd5a369eee50a6fb4d7cf957	2021-04-02 15:22:29 -07:00
Erjia Guan	e87ab2ac4d	[DataLoader] Switch to guaranteed determinism & add option to non_deterministic (#53532 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/53532 Test Plan: Imported from OSS Reviewed By: VitalyFedyunin Differential Revision: D26888825 Pulled By: ejguan fbshipit-source-id: 1e8c266146aa802a43e8c23c4f0b3b02134c8b50	2021-03-15 14:47:16 -07:00
Erjia Guan	dbbe0a2105	[DataLoader] Introduce deterministic context (#53271 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/53271 - [x] Add `set_determinism` context manager - [x] Add `non_deterministic` decorator for `DataPipe` - Raise error at the construction time for non-deterministic DataPipe when `determinism` is set to `True` - [ ] Support `non_deterministic` with option - When `GreedyJoin` only contains one datapipe, it should still be deterministic. Note: Test is in the [PR](https://github.com/facebookexternal/torchdata/pull/15). As the main repo doesn't have non-deterministic DataPipe yet. Test Plan: Imported from OSS Reviewed By: anjali411 Differential Revision: D26823023 Pulled By: ejguan fbshipit-source-id: 51bb92fc3d18d1fc9536c1229363c536ad120876	2021-03-06 07:37:39 -08:00
Erjia Guan	89b1053413	[DataLoader] Move BufferedShuffle from Dataset to DataPipe (#52141 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/52141 Remove BufferShuffleDataSet, as it's not being used anywhere within PyTorch (no usage on Github based on a search) and it's not included in the release of PyTorch 1.7.1. Test Plan: Imported from OSS Reviewed By: H-Huang Differential Revision: D26710940 Pulled By: ejguan fbshipit-source-id: 90023b4bfb105d6aa392753082100f9181ecebd0	2021-03-01 12:54:44 -08:00
Vitaly Fedyunin	9a03e65456	Adding functional way of stacking DataPipes with fixed mypy (#52885 ) Summary: Readding reverted PR with MyPY fixed Pull Request resolved: https://github.com/pytorch/pytorch/pull/52885 Reviewed By: ejguan Differential Revision: D26676405 Pulled By: VitalyFedyunin fbshipit-source-id: 020216c5522d21a4994cd896ae778c0b77f6444b	2021-02-25 19:37:35 -08:00
Howard Huang	da732c76c4	Revert D26644079: [pytorch][PR] Adding functional way of stacking DataPipes Test Plan: revert-hammer Differential Revision: D26644079 (`7972036bbb`) Original commit changeset: dcf464637b4f fbshipit-source-id: a12a06d7e7fb3821a0990bbc6305d02721ead82c	2021-02-25 14:30:49 -08:00
Vitaly Fedyunin	7972036bbb	Adding functional way of stacking DataPipes (#52507 ) Summary: Allows using the functional API to stack datapipes: ```python numbers_dp = NumbersDataset(size=10).filter(filter_fn = lambda x: x % 2 == 1).map(fn = lambda x: x * 10) ``` DataPipes have to be decorated with: ```python @functional_datapipe('map') class MapIterDataPipe(IterDataPipe[T_co]): ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/52507 Reviewed By: ailzhang Differential Revision: D26644079 Pulled By: VitalyFedyunin fbshipit-source-id: dcf464637b4fcf9ea1eb8e84c2a0cd4dfd58b43d	2021-02-25 11:22:01 -08:00
Erjia Guan	059c564ba4	[DataLoader] Fix module import (#52224 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/52224 Test Plan: Imported from OSS Reviewed By: glaringlee Differential Revision: D26429871 Pulled By: ejguan fbshipit-source-id: fcf2e5435658ecb92af1079def953b08cebb1f7f	2021-02-16 16:12:33 -08:00
Erjia Guan	52de407b4b	[DataLoader] Rename Functional DataSet to DataPipe (#51488 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/51488 Test Plan: Imported from OSS Reviewed By: H-Huang Differential Revision: D26209888 Pulled By: ejguan fbshipit-source-id: cb8bc852b1e4d72be81e0297308a43954cd95332	2021-02-03 07:01:09 -08:00
Erjia Guan	bea0519b0b	[WIP][DataLoader] Implement BucketBatchIterableDataset (#51126 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/51126 BucketBatch: Get a chunk of data as a bucket, sort the bucket by the specified key, then batch. If the sort key is not specified, directly use batchIterableDS. 1. Implement BucketBatch for bucket sampler 2. Improve BatchDS tests Test Plan: Imported from OSS Reviewed By: H-Huang Differential Revision: D26209890 Pulled By: ejguan fbshipit-source-id: 8519e2e49da158b3fe32913c8f3cadfa6f3ff1fc	2021-02-03 07:01:05 -08:00
Erjia Guan	14ee63f7e6	[WIP][DataLoader] Implement CallableIterableDataset (#50045 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/50045 Add CallableIterableDataset Modify CollateIterableDataset as another callable Test Plan: Imported from OSS Reviewed By: H-Huang Differential Revision: D26209889 Pulled By: ejguan fbshipit-source-id: d4773026c1269e43b29a3efb16e36e1865fdd024	2021-02-03 06:54:48 -08:00
lixinyu	c0d58bce0d	move Tar Dataset to Tar DataPipe (#51398 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/51398 Test Plan: Imported from OSS Reviewed By: ejguan Differential Revision: D26162319 Pulled By: glaringlee fbshipit-source-id: a84879fe4ca044e34238d5e1d31a245d4b80ae8e	2021-02-02 07:46:53 -08:00
Erjia Guan	7ed140a1a0	[WIP][DataLoader] Prototype of SamplerIterableDataset (#49363 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49363 Test Plan: Imported from OSS Reviewed By: mrshenli Differential Revision: D25623637 Pulled By: ejguan fbshipit-source-id: 9155d27d1fc91996b74110795cc73f1da0eedd44	2020-12-21 07:09:34 -08:00
Erjia Guan	554f79acb9	[WIP][DataLoader] Prototype of BatchIterableDataset (#49186 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49186 Test Plan: Imported from OSS Reviewed By: mrshenli Differential Revision: D25623636 Pulled By: ejguan fbshipit-source-id: 01a08cccb69301481c55b46358203354b9b4f5fa	2020-12-21 07:09:31 -08:00
Erjia Guan	1b6fc1fd42	[WIP][DataLoader] CollateIterableDataset prototype (#48933 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/48933 Prototype for CollateIterableDataset. Move `collate_batch_fn` to BatchIterableDataset - CollateIterableDataset - [x] Prototype - [x] Tests - BatchIterableDataset - [x] Prototype - [x] Tests - SamplerIterableDataset - [x] Prototype - [x] Tests Test Plan: Imported from OSS Reviewed By: mrshenli Differential Revision: D25623635 Pulled By: ejguan fbshipit-source-id: 99ba077619f672551ac15367baaba985db35a9c2	2020-12-21 07:04:25 -08:00
Erjia Guan	96540e918c	Add ShuffleDataset with buffer (#45290 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45290 Test Plan: Imported from OSS Reviewed By: gchanan Differential Revision: D24001084 Pulled By: erjia-guan fbshipit-source-id: d8a7455cf3f18e1f8c1edc53c42c1a99c8573c51	2020-09-30 07:58:15 -07:00
Ralf Gommers	da32bf4cc6	Move type annotations for remaining torch.utils stub files inline (#43406 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/43406 Reviewed By: mruberry Differential Revision: D23319736 Pulled By: malfet fbshipit-source-id: e25fbb49f27aa4893590b022441303d6d98263a9	2020-08-31 18:44:09 -07:00
なるみ	d83389d327	Ignore F401 in all __init__.py without putting noqa (#25823 ) Summary: By adding `per-file-ignores = __init__.py: F401` into `.flake8` with `flake8>=3.7`, we can ignore F401 in all `__init__.py` without putting `# noqa: F401` line by line. http://flake8.pycqa.org/en/latest/user/options.html?highlight=per-file-ignores#cmdoption-flake8-per-file-ignores Pull Request resolved: https://github.com/pytorch/pytorch/pull/25823 Differential Revision: D17252182 Pulled By: soumith fbshipit-source-id: 87b174075b79e4078953a7521bd1a8f82405646b	2019-10-23 15:28:13 -07:00
Tongzhou Wang	058beae411	Add IterableDataset (#19228 ) Summary: This is a modified version of https://github.com/pytorch/pytorch/pull/14705 since the commit structure for that PR is quite messy. 1. Add `IterableDataset`. 2. So we have 2 data loader modes: `Iterable` and `Map`. `Iterable` if the `dataset` is an instance of `IterableDataset`, `Map` otherwise. 3. Add better support for non-batch loading (i.e., `batch_size=None` and `batch_sampler=None`). This is useful in doing things like bulk loading. 4. Refactor `DataLoaderIter` into two classes, `_SingleProcessDataLoaderIter` and `_MultiProcessingDataLoaderIter`. Rename some methods to be more generic, e.g., `get_batch` -> `get_data`. 5. Add `torch.utils.data.get_worker_info` which returns worker information in a worker proc (e.g., worker id, dataset obj copy, etc.) and can be used in `IterableDataset.__iter__` and `worker_init_fn` to do per-worker configuration. 6. Add `ChainDataset`, which is the analog of `ConcatDataset` for `IterableDataset`. 7. Import torch.utils.data in `torch/__init__.py` 8. Data loader examples and documentation 9. Use `get_worker_info` to detect whether we are in a worker process in `default_collate` Closes https://github.com/pytorch/pytorch/issues/17909, https://github.com/pytorch/pytorch/issues/18096, https://github.com/pytorch/pytorch/issues/19946, and some of https://github.com/pytorch/pytorch/issues/13023 Pull Request resolved: https://github.com/pytorch/pytorch/pull/19228 Reviewed By: bddppq Differential Revision: D15058152 fbshipit-source-id: 9e081a901a071d7e4502b88054a34b450ab5ddde	2019-06-20 20:12:44 -07:00
Edward Yang	173f224570	Turn on F401: Unused import warning. (#18598 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18598 ghimport-source-id: c74597e5e7437e94a43c163cee0639b20d0d0c6a Stack from [ghstack](https://github.com/ezyang/ghstack): * #18598 Turn on F401: Unused import warning. This was requested by someone at Facebook; this lint is turned on for Facebook by default. "Sure, why not." I had to noqa a number of imports in __init__. Hypothetically we're supposed to use __all__ in this case, but I was too lazy to fix it. Left for future work. Be careful! flake8-2 and flake8-3 behave differently with respect to import resolution for # type: comments. flake8-3 will report an import unused; flake8-2 will not. For now, I just noqa'd all these sites. All the changes were done by hand. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Differential Revision: D14687478 fbshipit-source-id: 30d532381e914091aadfa0d2a5a89404819663e3	2019-03-30 09:01:17 -07:00
SsnL	9217bde807	Refactor dataloader.py (#15331 ) Summary: Same as #14668, and was approved there. ailzhang , please apply this patch to Horizon's `data_streamer.py`: https://gist.github.com/SsnL/020fdb3d6b7016d81b6ba1d04cc41459 Thank you! Below is the original description at #14668: As I am working on tasks in https://github.com/pytorch/pytorch/issues/13023, I realized how unreadable the code is because all functions to be run in multiprocessing must be at top global level. Adding more functionalities to `dataloader.py` will only make things worse. So in this PR, I refactor `dataloader.py` and move much of it into `data._utils`. E.g., the `_worker_loop` and related methods are now in `data._utils.worker`, signal handling code in `data._utils.signal_handling`, collating code in `data._utils.collate`, etc. This split, IMHO, makes code much clearer. I will base my future changes to DataLoader on top of this. No functionality is changed, except that I added `torch._six.queue`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/15331 Reviewed By: yf225 Differential Revision: D13503120 Pulled By: ailzhang fbshipit-source-id: 94df16b4d80ad1102c437cde0d5a2e62cffe1f8e	2018-12-19 12:36:03 -08:00
Ailing Zhang	38eb1beff5	Revert D13289919: [pytorch][PR] [DataLoader] Refactor dataloader.py Differential Revision: D13289919 Original commit changeset: d701bc7bb48f fbshipit-source-id: c350c491fefa98a0a7c0cf22cb832e78aeb15c3d	2018-12-04 20:25:16 -08:00
SsnL	16558a1e9d	Refactor dataloader.py (#14668 ) Summary: As I am working on tasks in https://github.com/pytorch/pytorch/issues/13023, I realized how unreadable the code is because all functions to be run in multiprocessing must be at top global level. Adding more functionalities to `dataloader.py` will only make things worse. So in this PR, I refactor `dataloader.py` and move much of it into `data._utils`. E.g., the `_worker_loop` and related methods are now in `data._utils.worker`, signal handling code in `data._utils.signal_handling`, collating code in `data._utils.collate`, etc. This split, IMHO, makes code much clearer. I will base my future changes to DataLoader on top of this. No functionality is changed, except that I added `torch._six.queue`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14668 Reviewed By: soumith Differential Revision: D13289919 Pulled By: ailzhang fbshipit-source-id: d701bc7bb48f5dd7b163b5be941a9d27eb277a4c	2018-12-04 09:53:41 -08:00
Tongzhou Wang	108b657159	Import DistributedSampler in utils/data/__init__ (#10671 ) Summary: There is no reason that user should do an extra import to use DistributedSampler. Pull Request resolved: https://github.com/pytorch/pytorch/pull/10671 Differential Revision: D9395189 Pulled By: SsnL fbshipit-source-id: 8f41d93813c8fb52fe012f76980c6a261a8db9b2	2018-08-19 16:55:13 -07:00
Gao, Xiang	d7c32df67f	move Subset, random_split to data, use sequence at some places. (#7816 )	2018-05-25 12:50:50 +02:00
Thomas Viehmann	1b0ad8678b	import *Sampler to utils.data (Better fix than #6982 ) (#7007 )	2018-04-27 10:18:29 +02:00
Valentin Haenel	d592e188f7	port of ConcatDataset (#1902 )	2017-06-27 12:31:56 -04:00
Adam Lerer	a1f5fe6a8f	Add multiprocess data loader + improvements to torch.utils.data	2016-09-30 16:23:43 -04:00
Adam Paszke	ee85fe1a9c	Initial utils implementation	2016-09-08 18:49:48 -07:00

49 Commits
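
The lookup rule described in the `collate_fn_map` commit (#85748) can be sketched in plain Python. This is only an illustration of the rule as worded in that commit message (exact type match first, then a walk over the keys in insertion order looking for a superclass), not PyTorch's actual implementation. The names `dispatch_collate` and `tagged` are hypothetical, and the collate functions here take only the batch, which is an assumption made to keep the sketch small.

```python
from typing import Any, Callable, Dict, List

# Hypothetical alias: a collate function turns a list of samples into one batch.
CollateFn = Callable[[List[Any]], Any]


def dispatch_collate(batch: List[Any], collate_fn_map: Dict[type, CollateFn]) -> Any:
    """Pick a collate function for `batch` based on the type of its first element."""
    elem_type = type(batch[0])

    # 1. An exact type match wins.
    if elem_type in collate_fn_map:
        return collate_fn_map[elem_type](batch)

    # 2. Otherwise walk the keys in insertion order (dicts preserve it in
    #    Python 3.7+) and use the first key that is a superclass of the type.
    for key_type, fn in collate_fn_map.items():
        if issubclass(elem_type, key_type):
            return fn(batch)

    raise TypeError(f"no collate function registered for {elem_type.__name__}")


def tagged(name: str) -> CollateFn:
    """Hypothetical helper: a collate function that labels which entry handled the batch."""
    return lambda batch: (name, list(batch))


fn_map = {int: tagged("int"), float: tagged("float"), str: tagged("str")}

print(dispatch_collate([1, 2], fn_map))         # exact match on int
print(dispatch_collate([True, False], fn_map))  # bool is a subclass of int
```

Because the fallback follows insertion order, a broad key such as `object` registered ahead of `int` would capture `bool` batches before the `int` entry is considered, so more specific types are best registered first.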