Summary:
This PR implements a DataLoader API for the C++ frontend.
The components present in this API largely match the Python API. It consists of:
- `Dataset`s: Conceptually a function from a set of indices to a batch of examples;
- `Transform`s: A functional transformation of a dataset. A `Map<D, T>` for Dataset `D` and transform `T` is itself a dataset;
- `Sampler`s: Specify a strategy for generating indices for a new batch;
- A `DataLoader`, with the ability to automatically parallelize fetching of samples across multiple worker threads;
Note that collation functions fall naturally out of the `Map<Dataset, Transform>` abstraction.
Things that are missing right now that maybe should be added:
- Memory pinning for CUDA tensors
The API was designed to be generalizable to almost any kind of dataset, transform or sampling strategy, while providing a convenient API out of the box. To achieve this, it is quite heavily templatized on various possible input types.
There are many parts to this PR! Right now, I would like feedback on:
- Your impression of the general usability of the API;
- Your impression of which parts seem too complex or overthought;
- The implementation of the parallelization aspects of the DataLoader. I've followed the Python implementation in some matters, but also differ in others. I think my implementation is a little cleaner and decouples components slightly better than the Python dataloader.
I haven't added too many comments yet, as this is fresh out of the oven. Let me know if anything is unclear from the code itself.
There also aren't any tests yet. I will write a comprehensive test suite once we agree on the API and implementation.
apaszke ezyang The controller you requested could not be found. pietern
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11918
Reviewed By: ezyang
Differential Revision: D9998881
Pulled By: goldsborough
fbshipit-source-id: 22cf357b63692bea42ddb1cc2abc71dae5030aea