Summary:
Xray sampler (originally by ajtulloch) and prigoyal's resnet trainer use variants of the threaded data input where worker threads put stuff into a python queue that is drained by an enqueuer thread that dumps those batches to a Caffe2 queue, that is then drained by the net's DequeueBlobs operator.
There is a lot of boilerplate, which is also quite complicated.
This diff is an attempt to generalize that general stuff under a new module "data_workers" (name could be improved). Basically you pass it a function that is able to return chunks of data (usually data + labels).
I also created a module 'everstore_data_input' which generalizes everstore-origin data input with preprocessing function (image augmentation , for example). See how I refactored sampler.py for the usage.
Next we could create fetcher function for Laser data.
Differential Revision: D4297667
fbshipit-source-id: 8d8a863b177784ae13940730a27dc76cd1dd3dac