Commit Graph

185 Commits

Author SHA1 Message Date
Tongzhou Wang
04461fa289 Prefix DataLoaderIter with underscore to discourage subclassing (#5619) 2018-03-08 11:09:51 +01:00
Will Feng
a90b695590 Disallow num_workers > 0 for DataLoader on Windows (#5591)
Using DataLoader with num_workers > 0 is known to cause a CUDA out-of-memory issue on Windows.

This issue has already been noted in #4092.
2018-03-07 10:21:03 -05:00
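For context on the restriction above, here is a minimal sketch of how user code typically works around it; the dataset, sizes, and the platform guard are illustrative, not part of the commit:

```python
import sys
import torch
from torch.utils.data import DataLoader, TensorDataset

# Made-up dataset for illustration.
dataset = TensorDataset(torch.randn(100, 3), torch.randint(0, 10, (100,)))

# Hypothetical guard: fall back to single-process loading on Windows,
# where num_workers > 0 was disallowed by this commit.
num_workers = 0 if sys.platform == "win32" else 4

if __name__ == "__main__":
    loader = DataLoader(dataset, batch_size=16, num_workers=num_workers)
    for inputs, targets in loader:
        pass  # training step would go here
```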
Tongzhou Wang
392fc8885c add FAQ on CUDA memory management and dataloader (#5378) 2018-02-27 18:35:30 -05:00
Achal Dave
8327982904 Set python random seed in workers (#5415)
* Set python random seed in workers

* Import random
2018-02-27 03:16:10 -05:00
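The commit above seeds Python's random module inside each worker. A small sketch (names are illustrative) of extending the same per-worker idea to an RNG the loader does not seed, such as NumPy, via worker_init_fn:

```python
import numpy as np
import torch

# Illustrative only: inside a worker, torch.initial_seed() returns that
# worker's seed, so it can be reused to seed other libraries' RNGs.
def seed_numpy(worker_id):
    np.random.seed(torch.initial_seed() % 2**32)

# Usage sketch:
# loader = DataLoader(dataset, num_workers=4, worker_init_fn=seed_numpy)
```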
Tongzhou Wang
1ff537ca71 Ignore FileNotFoundError when shutting down in data_queue.get (#5380)
* Ignore FileNotFoundError when shutting down in data_queue.get

* Address @apaszke comments
2018-02-24 13:32:13 -05:00
Tongzhou Wang
64a9ecae02 Dataloader issues (#4643)
* EINTR and kill by loader fix

* addressed @apaszke's comments

* remove EINTR handling and add a test that we are in the main thread before setting SIGCHLD
2018-01-29 01:18:17 +01:00
Tongzhou Wang
0ac58d53b8 ATen conv param expansion; InstanceNorm use_running_stats fix (#4544)
* fix instancenorm and aten conv param expansion

* addressed @colesbury's comments

* improve conv input shape check
2018-01-10 17:36:26 -05:00
Christian Sarofeen
bc6bd62bd6 Fix distributed dataloader so it pins memory to current GPU not GPU 0. 2017-12-19 13:39:06 +01:00
Tongzhou Wang
5cc26c0c90 Add default PyTorch seeding and worker_init_fn to DataLoader (#4018)
* Add default PyTorch seeding and worker_init_fn to DataLoader

* generate seed using current RNG each time

* worker_seed <- main_proc_RNG_generated_seed + worker_id
2017-12-18 02:19:08 -05:00
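The commit above introduces default per-worker seeding (worker_seed = seed generated by the main process RNG + worker_id) plus a worker_init_fn hook. A minimal sketch of using the hook for per-worker setup; the ToyDataset name and sizes are made up:

```python
import torch
from torch.utils.data import DataLoader, Dataset

class ToyDataset(Dataset):           # made-up dataset for illustration
    def __len__(self):
        return 8
    def __getitem__(self, idx):
        return torch.tensor(idx)

def init_fn(worker_id):
    # Called once in each worker process, after that worker's seed
    # (base seed + worker_id) has already been set.
    print(f"worker {worker_id} uses seed {torch.initial_seed()}")

if __name__ == "__main__":
    loader = DataLoader(ToyDataset(), batch_size=2, num_workers=2,
                        worker_init_fn=init_fn)
    for batch in loader:
        pass
```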
Jon Crall
5c13c6962c Raise errors when num_workers == 0 in DataLoader (#4019) 2017-12-05 11:07:43 -08:00
Alykhan Tejani
5571d0187e Accept longs in default_collate for dataloader in python 2 (#4001) 2017-12-04 09:50:57 -08:00
SsnL
1661370ac5 Signal handling in DataLoader workers; Timeout option (#3474) 2017-11-29 23:52:14 +01:00
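The timeout option added above bounds how long the main process waits for a batch from a worker; a sketch under the assumption of a deliberately slow, made-up dataset:

```python
import time
import torch
from torch.utils.data import DataLoader, Dataset

class SlowDataset(Dataset):          # illustrative dataset that stalls
    def __len__(self):
        return 4
    def __getitem__(self, idx):
        time.sleep(10)               # simulate a hung worker
        return torch.tensor(idx)

if __name__ == "__main__":
    # timeout is in seconds; the default of 0 means wait forever.
    loader = DataLoader(SlowDataset(), num_workers=1, timeout=2.0)
    try:
        next(iter(loader))
    except RuntimeError as e:
        print(e)                     # loader reports it timed out
```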
Ozan Çağlayan
dd6d04ddf2 doc: Normalize all true/false in docstrings to `True|False` (#3593)
* doc: Normalize all true/false in docstrings to ``True|False``

This makes them more apparent in the documentation.

* doc: fix flake8
2017-11-09 08:12:29 -05:00
Richard Zou
e579ae75b5 Fix error when default_collate is passed a collection of numpy.str_ (#3404)
* Fix error when default_collate is passed a collection of numpy.str_

* Error if default_collate input is nested nparray containing non-numbers
2017-11-08 10:02:08 -05:00
Sam Gross
8e58135a26 Fix E722 ('do not use bare except') (#3239)
The new version of flake8 includes a check for not using bare except. We
should avoid this since it catches things like KeyboardInterrupt.
2017-10-23 23:03:37 -04:00
Adam Paszke
411e1469e0 Add tools for autograd profiling 2017-09-25 23:21:30 -04:00
Sam Gross
f09027bc29 Add batch sampler to DataLoader (#1867) 2017-06-22 20:18:31 +02:00
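A short sketch of the batch_sampler argument added above, using a sequential sampler; the tiny dataset is illustrative:

```python
import torch
from torch.utils.data import (BatchSampler, DataLoader, SequentialSampler,
                              TensorDataset)

dataset = TensorDataset(torch.arange(10).float())

# batch_sampler yields lists of indices per batch; it is used instead of
# batch_size / shuffle / sampler / drop_last.
batch_sampler = BatchSampler(SequentialSampler(dataset),
                             batch_size=4, drop_last=False)
loader = DataLoader(dataset, batch_sampler=batch_sampler)

for (batch,) in loader:
    print(batch.shape[0])    # 4, 4, 2
```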
Sasank Chilamkurthy
94b147fd41 Allow dict batches in dataloader. (#1354)
* Allow dicts in Dataloader

* use collections.Sequence instead of collections.Iterable in dataloader
2017-04-28 19:14:52 +02:00
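With the change above, a dataset can return a dict per sample and the default collate function batches each key separately. A minimal sketch with a made-up dataset:

```python
import torch
from torch.utils.data import DataLoader, Dataset

class DictDataset(Dataset):          # illustrative dataset
    def __len__(self):
        return 6
    def __getitem__(self, idx):
        # Each sample is a dict; default_collate batches per key.
        return {"image": torch.randn(3, 8, 8), "label": idx}

loader = DataLoader(DictDataset(), batch_size=2)
batch = next(iter(loader))
print(batch["image"].shape)   # torch.Size([2, 3, 8, 8])
print(batch["label"])         # tensor([0, 1])
```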
Sam Gross
24d92b5d9f Concatenate directly into shared memory when constructing batches (#1323)
This saves an extra memory copy, which speeds up data loading a bit
(5-10% with accimage).

As part of this change:

 * torch.cat accepts keyword argument out
 * specifying out=None is treated like not specifying out
2017-04-22 03:40:30 -04:00
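A quick sketch of the torch.cat out= behavior described above; the loader uses this to concatenate samples directly into a preallocated (possibly shared-memory) tensor:

```python
import torch

a = torch.ones(2, 3)
b = torch.zeros(2, 3)

# out= writes the result into a preallocated tensor instead of allocating
# a new one, saving a copy when that tensor lives in shared memory.
out = torch.empty(4, 3)
torch.cat([a, b], dim=0, out=out)

# Per the commit above, out=None behaves the same as omitting out.
c = torch.cat([a, b], dim=0, out=None)
```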
Adam Paszke
605b3c86ce Retain the type of numpy scalars in collate_fn 2017-04-11 14:48:54 -07:00
Xingdong Zuo
9f2a5d804d Add a flag to fix when dataset size is not divisible by batch size. (#1133) 2017-04-06 00:18:43 -04:00
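The flag referred to above is drop_last in today's DataLoader API; a small sketch of its effect when the dataset size is not divisible by the batch size (the tiny dataset is illustrative):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.arange(10).float())   # 10 samples

# drop_last=True drops the final, smaller batch of 2.
for (batch,) in DataLoader(dataset, batch_size=4, drop_last=True):
    print(batch.shape[0])   # 4, 4

# drop_last=False (the default) keeps it.
for (batch,) in DataLoader(dataset, batch_size=4, drop_last=False):
    print(batch.shape[0])   # 4, 4, 2
```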
Xingdong Zuo
476d85dd3f DataLoader: Fix batch data type for numpy array (#1074) 2017-03-24 11:34:24 -04:00
Eli Stevens
e216f557fd Fixes issue returning strings from a Dataloader with pin_memory=True (#908) 2017-03-13 10:11:07 +01:00
yunjey
3330287dc7 Update dataloader.py (#837) 2017-02-23 14:38:41 -05:00
Adam Paszke
7ea6ae57c8 Support numpy arrays in default_collate 2017-02-20 23:28:31 -08:00
Adam Paszke
4cc11066b2 Add torch.utils.data docs and improve notes (#460)
* Add torch.utils.data docs and improve notes
2017-01-17 14:51:05 -05:00
Sergey Zagoruyko
a0c614ece3 unsqueeze instead of view in dataloader 2017-01-01 23:38:54 +01:00
Sam Gross
24af02154c Use ForkingPickler for sharing tensor/storages across processes (#344)
This hooks into the (internal) ForkingPickler class in multiprocessing
to reduce tensors, storages, and CUDA events instead of our queue from
joblib. This makes it easier to use the standard multiprocessing classes
in later versions of Python.

This also exposes:

 - Tensor/Storage.share_memory_()
 - Module.share_memory()

These methods move the CPU tensors and storages to shared memory. If
you're using the "fork" method of multiprocessing, these objects can be
directly inherited instead of serialized through a queue.
2016-12-28 20:34:23 -05:00
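A minimal sketch of the share_memory_() method exposed above; start method and exact sharing mechanism vary by platform, but the in-place update made in the child is visible to the parent:

```python
import torch
import torch.multiprocessing as mp

def worker(t):
    t.add_(1)                      # mutation is visible in the parent

if __name__ == "__main__":
    t = torch.zeros(3)
    t.share_memory_()              # move the underlying storage to shared memory
    p = mp.Process(target=worker, args=(t,))
    p.start()
    p.join()
    print(t)                       # tensor([1., 1., 1.])
```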
Sam Gross
be3276fcdd Account for batch_size in DataLoader.__len__() (#277) 2016-12-02 01:21:36 -05:00
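After the change above, len() of a DataLoader counts batches rather than samples; a one-line sketch with made-up sizes:

```python
import math
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.zeros(101))
loader = DataLoader(dataset, batch_size=8)

# Number of batches, not samples (13 here, since 101 / 8 rounds up).
assert len(loader) == math.ceil(len(dataset) / 8)
```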
Sam Gross
aea6ba4bcd Support pinned memory in the DataLoader (#265)
DataLoader now supports the constructor argument 'pin_memory'. When set
to true, tensors in the sample are copied to pinned memory. This happens
in a background thread when num_workers > 1.
2016-11-29 12:35:03 -05:00
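A short sketch of the pin_memory option described above, assuming a CUDA-capable machine; the dataset is illustrative:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.randn(64, 3), torch.randint(0, 2, (64,)))
loader = DataLoader(dataset, batch_size=8, pin_memory=True)

inputs, targets = next(iter(loader))
# Batches land in page-locked (pinned) memory, which speeds up
# host-to-GPU copies.
print(inputs.is_pinned())   # True
```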
Sam Gross
6db721b5dd Make DataLoader preserve the ordering of the dataset (#135) 2016-10-21 23:54:16 -04:00
Sam Gross
3931beee81 Use THSetNumThreads instead of omp_set_num_threads
Set OMP num threads to one in the data loader.

Fixes #81
Fixes #82
2016-10-17 15:15:00 -04:00
Sam Gross
112df5f664 Fixes to trainer and data loading
1. Wrap target in a Variable in trainer
2. Collate numbers into torch.Long/DoubleTensors
2016-10-01 13:21:16 -07:00
soumith
c813e93d85 fixing python 3 compat 2016-09-30 16:44:00 -07:00
Adam Lerer
a1f5fe6a8f Add multiprocess data loader + improvements to torch.utils.data 2016-09-30 16:23:43 -04:00