pytorch/torch/utils
Tongzhou Wang 058beae411 Add IterableDataset (#19228)
Summary:
This is a modified version of https://github.com/pytorch/pytorch/pull/14705, since the commit structure of that PR is quite messy.

1. Add `IterableDataset`.
2. So we have 2 data loader modes: `Iterable` and `Map`.

    1. `Iterable` if the `dataset` is an instance of `IterableDataset`
    2. `Map` otherwise.

3. Add better support for non-batch loading (i.e., `batch_size=None` and `batch_sampler=None`). This is useful for things like bulk loading.
4. Refactor `DataLoaderIter` into two classes, `_SingleProcessDataLoaderIter` and `_MultiProcessingDataLoaderIter`. Rename some methods to be more generic, e.g., `get_batch` -> `get_data`.
5. Add `torch.utils.data.get_worker_info`, which returns worker information in a worker process (e.g., worker id, dataset object copy, etc.) and can be used in `IterableDataset.__iter__` and `worker_init_fn` to do per-worker configuration.
6. Add `ChainDataset`, which is the analog of `ConcatDataset` for `IterableDataset`.
7. Import `torch.utils.data` in `torch/__init__.py`.
8. Add data loader examples and documentation.
9. Use `get_worker_info` to detect whether we are in a worker process in `default_collate`.
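
The pieces listed above can be sketched together roughly as follows. This is a minimal illustration, not code from the PR: `RangeDataset` and `shard_bounds` are hypothetical names invented here, and the example assumes a PyTorch build that already includes this change. It runs in the main process (`num_workers=0`), where `get_worker_info()` returns `None`.

```python
import math

from torch.utils.data import ChainDataset, DataLoader, IterableDataset, get_worker_info


def shard_bounds(start, end, worker_id, num_workers):
    """Hypothetical helper: the [lo, hi) slice of [start, end) owned by one worker."""
    per_worker = int(math.ceil((end - start) / float(num_workers)))
    lo = start + worker_id * per_worker
    return lo, min(lo + per_worker, end)


class RangeDataset(IterableDataset):
    """Hypothetical example dataset that streams the integers in [start, end)."""

    def __init__(self, start, end):
        super().__init__()
        self.start = start
        self.end = end

    def __iter__(self):
        info = get_worker_info()
        if info is None:
            # Main process (num_workers=0): yield the full range.
            lo, hi = self.start, self.end
        else:
            # Worker process: yield only this worker's shard to avoid duplicates.
            lo, hi = shard_bounds(self.start, self.end, info.id, info.num_workers)
        return iter(range(lo, hi))


# batch_size=None disables automatic batching, so samples come out one by one.
loader = DataLoader(RangeDataset(3, 7), batch_size=None)
single = [int(x) for x in loader]

# ChainDataset streams its member datasets back to back.
chained = ChainDataset([RangeDataset(0, 2), RangeDataset(10, 12)])
chained_items = [int(x) for x in DataLoader(chained, batch_size=None)]
```

With `num_workers > 0`, each worker process iterates its own copy of the dataset, so without the `get_worker_info` branch every sample would be yielded once per worker.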

Closes https://github.com/pytorch/pytorch/issues/17909, https://github.com/pytorch/pytorch/issues/18096, https://github.com/pytorch/pytorch/issues/19946, and some of https://github.com/pytorch/pytorch/issues/13023
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19228

Reviewed By: bddppq

Differential Revision: D15058152

fbshipit-source-id: 9e081a901a071d7e4502b88054a34b450ab5ddde
2019-06-20 20:12:44 -07:00

| Name | Last commit | Date |
|---|---|---|
| backcompat | Simplify python warning settings and cleanup tests. | 2017-06-11 05:37:59 -04:00 |
| bottleneck | Turn on F401: Unused import warning. (#18598) | 2019-03-30 09:01:17 -07:00 |
| data | Add IterableDataset (#19228) | 2019-06-20 20:12:44 -07:00 |
| ffi | remove support for c extensions (#12122) | 2018-10-01 13:55:28 -07:00 |
| tensorboard | replace LegacyTracedModule with torchscript used in add_graph (#21339) | 2019-06-07 10:43:08 -07:00 |
| \_\_init\_\_.py | arc lint torch/utils (#13141) | 2018-10-25 14:59:03 -07:00 |
| _cpp_extension_versioner.py | arc lint torch/utils (#13141) | 2018-10-25 14:59:03 -07:00 |
| checkpoint.py | Deprecate variadic inputs of checkpoint_sequential (#21006) | 2019-05-28 21:33:45 -07:00 |
| collect_env.py | Turn on F401: Unused import warning. (#18598) | 2019-03-30 09:01:17 -07:00 |
| cpp_extension.py | Torch rename (#20774) | 2019-06-12 20:12:34 -07:00 |
| dlpack.py | arc lint torch/utils (#13141) | 2018-10-25 14:59:03 -07:00 |
| file_baton.py | Fix python2 and python 3 compatibility found by lint. (#13140) | 2018-10-25 17:20:11 -07:00 |
| hooks.py | arc lint torch/utils (#13141) | 2018-10-25 14:59:03 -07:00 |
| mkldnn.py | Add support for save and load mkldnn modules | 2019-05-23 12:51:50 -07:00 |
| model_zoo.py | add/move a few apis in torch.hub (#18758) | 2019-04-10 23:10:39 -07:00 |