Commit Graph

31 Commits

Author SHA1 Message Date
ShawnZhong
c8c53c802e Add generator= kwarg for DataLoader & random samplers (#39737)
Summary:
Fix https://github.com/pytorch/pytorch/issues/39572

Add `generator=` kwarg for DataLoader & random samplers

cc: SsnL, deeppatel4557, albanD, mitar
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39737

Differential Revision: D22019132

Pulled By: albanD

fbshipit-source-id: 835e08b86c5396bc0b0e41057661306b15394d6e
2020-06-15 07:01:20 -07:00
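A minimal usage sketch of the new kwarg (assuming PyTorch 1.6+, where this landed): seeding a `torch.Generator` makes the sampling order reproducible.

```python
import torch
from torch.utils.data import DataLoader, RandomSampler

data = list(range(10))

# A seeded generator makes the shuffling order reproducible across runs.
g = torch.Generator()
g.manual_seed(0)

loader = DataLoader(data, batch_size=4, sampler=RandomSampler(data, generator=g))
for batch in loader:
    print(batch)  # same order every run
```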
Hong Xu
283a3ff16d The exception raised when RandomSampler.replacement is non-boolean should be TypeError (#36547)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/36547

Differential Revision: D21818752

Pulled By: ezyang

fbshipit-source-id: 7502a24a0df134c44ac72959ba992777c873f8e9
2020-06-02 06:54:02 -07:00
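In other words, a non-boolean `replacement` now fails fast with a `TypeError`; a quick sketch:

```python
from torch.utils.data import RandomSampler

try:
    RandomSampler(range(10), replacement="yes")  # not a bool
except TypeError as e:
    # Message along the lines of:
    # replacement should be a boolean value, but got replacement=yes
    print(e)
```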
SsnL
b5868b2833 Relax sampler check in BatchSampler (#38403)
Summary:
Since the check was added in https://github.com/pytorch/pytorch/pull/6249, one cannot pass an iterable as a sampler to the data loader anymore, which was a very handy feature (e.g., https://github.com/pytorch/pytorch/issues/1337). I think the check should be removed for two reasons:
1. It is too strict. There is no reason that the sampler should not be a general iterable.
2. It is inconsistent. In `DataLoader` (the main place where people use samplers), you can pass a general iterable as `batch_sampler` but not as `sampler`, due to this check.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38403

Differential Revision: D21555958

Pulled By: soumith

fbshipit-source-id: c7267bb99a31edd8f2750689205d6edc5dab5cff
2020-05-13 22:24:29 -07:00
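With the check relaxed, any iterable of indices can serve as `sampler`; a small sketch (the string dataset is illustrative):

```python
from torch.utils.data import DataLoader

dataset = [chr(ord("a") + i) for i in range(6)]  # ['a', ..., 'f']

# A plain list of indices, not a Sampler instance.
indices = [5, 3, 1]
loader = DataLoader(dataset, sampler=indices, batch_size=2)

for batch in loader:
    print(batch)  # ['f', 'd'], then ['b']
```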
vfdev
c6e0360812 Minor change of docstring example of WeightedRandomSampler (#30846)
Summary:
Previous example
```python
>>> list(WeightedRandomSampler([0.1, 0.9, 0.4, 0.7, 3.0, 0.6], 5, replacement=True))
[0, 0, 0, 1, 0]
```
may seem misleading given the provided weights: index 4 carries the largest weight (3.0) yet never appears, while index 0, whose weight is only 0.1, dominates the output.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30846

Differential Revision: D19697367

Pulled By: ezyang

fbshipit-source-id: 3d6e3cd0cecb5272a368707ba35bc7acdbd82c30
2020-02-12 07:46:39 -08:00
Tongzhou Wang
058beae411 Add IterableDataset (#19228)
Summary:
This is a modified version of https://github.com/pytorch/pytorch/pull/14705, since the commit structure of that PR is quite messy.

1. Add `IterableDataset`.
2. So we have two data loading modes, `Iterable` and `Map`:

    1. `Iterable` if the `dataset` is an instance of `IterableDataset`
    2. `Map` otherwise

3. Add better support for non-batch loading (i.e., `batch_size=None` and `batch_sampler=None`). This is useful for things like bulk loading.
4. Refactor `DataLoaderIter` into two classes, `_SingleProcessDataLoaderIter` and `_MultiProcessingDataLoaderIter`. Rename some methods to be more generic, e.g., `get_batch` -> `get_data`.
5. Add `torch.utils.data.get_worker_info`, which returns worker information in a worker process (e.g., worker id, dataset object copy, etc.) and can be used in `IterableDataset.__iter__` and `worker_init_fn` to do per-worker configuration.
6. Add `ChainDataset`, the analog of `ConcatDataset` for `IterableDataset`.
7. Import torch.utils.data in `torch/__init__.py`.
8. Add data loader examples and documentation.
9. Use `get_worker_info` to detect whether we are in a worker process in `default_collate`.

Closes https://github.com/pytorch/pytorch/issues/17909, https://github.com/pytorch/pytorch/issues/18096, https://github.com/pytorch/pytorch/issues/19946, and some of https://github.com/pytorch/pytorch/issues/13023
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19228

Reviewed By: bddppq

Differential Revision: D15058152

fbshipit-source-id: 9e081a901a071d7e4502b88054a34b450ab5ddde
2019-06-20 20:12:44 -07:00
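A sketch of the `Iterable` mode, combining `get_worker_info` (item 5) with non-batch loading (item 3); the dataset class here is illustrative:

```python
import math
import torch
from torch.utils.data import DataLoader, IterableDataset, get_worker_info

class RangeDataset(IterableDataset):
    """Streams integers in [start, end), sharded across workers."""

    def __init__(self, start, end):
        self.start, self.end = start, end

    def __iter__(self):
        info = get_worker_info()
        if info is None:
            # Single-process loading: yield the full range.
            return iter(range(self.start, self.end))
        # Worker process: carve out this worker's shard.
        per_worker = int(math.ceil((self.end - self.start) / info.num_workers))
        start = self.start + info.id * per_worker
        return iter(range(start, min(start + per_worker, self.end)))

if __name__ == "__main__":
    # batch_size=None enables the new non-batch loading path.
    loader = DataLoader(RangeDataset(0, 8), num_workers=2, batch_size=None)
    print(sorted(loader))  # [0, 1, 2, 3, 4, 5, 6, 7]
```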
Soumith Chintala
6480d3f140 Revert D15511921: [pytorch][PR] BatchSampler now uses list.clear() instead of creating new objects
Differential Revision:
D15511921

Original commit changeset: e943d21e75e1

fbshipit-source-id: 933b7ef74c7a530f0a2cc087c8ee6f0455cf9239
2019-05-27 10:51:24 -07:00
Tongzhou Wang
482ae8e6b2 BatchSampler now uses list.clear() instead of creating new objects
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20976

Differential Revision: D15511921

Pulled By: soumith

fbshipit-source-id: e943d21e75e19f9154a0570f3188cc3ce174083e
2019-05-26 23:45:26 -07:00
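For context, the change reused a single list across batches. One hazard of that pattern, and a plausible reason for the revert above (the log does not say), is that every yielded batch is the same object, mutated in place:

```python
def reusing_batch_list(indices, batch_size):
    batch = []
    for idx in indices:
        batch.append(idx)
        if len(batch) == batch_size:
            yield batch
            batch.clear()  # mutates the list the consumer may still hold

batches = list(reusing_batch_list(range(4), 2))
print(batches)  # [[], []] -- every entry is the same, now-cleared list
```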
crcrpar
bb05f70724 fix the docstring of RandomSampler (#19113)
Summary:
Fix:
- the order of `Arguments` in the `RandomSampler` doc
- the meaningless check of `replacement`'s type.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19113

Differential Revision: D15013081

Pulled By: ezyang

fbshipit-source-id: 39e367f42841de6814b1214eb9df7b75f14f747e
2019-04-23 09:54:20 -07:00
Krishna Kalyan
d80f0a1f3a Add example to WeightedRandomSampler doc string (#17432)
Summary: An example for the weighted random sampler is missing [here](https://pytorch.org/docs/stable/data.html#torch.utils.data.WeightedRandomSampler)

Differential Revision: D14198642

Pulled By: soumith

fbshipit-source-id: af6d8445d31304011002dd4308faaf40b0c1b609
2019-02-23 20:29:06 -08:00
Olen ANDONI
be4ad3fe30 fix(typo): Change 'integeral' to 'integer'
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17396

Differential Revision: D14195023

Pulled By: soumith

fbshipit-source-id: 300ab68c24bfbf10768fefac44fad64784463c8f
2019-02-23 08:22:01 -08:00
ptrblck
8abfd28f58 #16627 convert weights using torch.as_tensor to avoid warning (#17067)
Summary:
Minor change which fixes #16627
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17067

Differential Revision: D14078726

Pulled By: soumith

fbshipit-source-id: c04a5f1eff44e4a4b04b981f0ae8de6ff018515b
2019-02-13 20:54:29 -08:00
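The warning came from re-wrapping a tensor with `torch.tensor`, which always copies; `torch.as_tensor` reuses the existing storage when dtype and device already match. Roughly:

```python
import torch

weights = torch.tensor([0.1, 0.9, 0.4], dtype=torch.double)

# torch.tensor(weights) would copy and warn, recommending clone().detach();
# as_tensor returns the input unchanged when no conversion is needed.
w = torch.as_tensor(weights, dtype=torch.double)
print(w.data_ptr() == weights.data_ptr())  # True: no copy was made
```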
kyryl
a7415787ac fix RandomSampler length (#15991)
Summary:
Hi!

This PR addresses #15537  issue.
Please review.

Thanks!

Differential Revision: D13649890

Pulled By: soumith

fbshipit-source-id: 166212ae383331345423236dfc4fa2ea907d265d
2019-01-13 23:09:51 -08:00
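The gist of the fix, sketched rather than quoted from the source: `__len__` should report the number of samples actually drawn, which with `replacement=True` is `num_samples`, not the dataset size.

```python
class RandomSamplerSketch:
    """Illustrative sketch of the corrected length logic."""

    def __init__(self, data_source, replacement=False, num_samples=None):
        self.data_source = data_source
        self.replacement = replacement
        # Default to one full pass over the dataset when unspecified.
        self.num_samples = len(data_source) if num_samples is None else num_samples

    def __len__(self):
        # Before the fix, this reported len(self.data_source) even when a
        # different num_samples was requested with replacement=True.
        return self.num_samples
```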
Wei Yang
7f9fd1cc26 allow RandomSampler to sample with replacement (#9911)
Summary:
fixes #7908
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9911

Reviewed By: yf225

Differential Revision: D9023223

Pulled By: weiyangfb

fbshipit-source-id: 68b199bef3940b7205d0fdad75e7c46e6fe65ba7
2018-08-28 10:52:25 -07:00
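Usage sketch: with `replacement=True`, `num_samples` may exceed the dataset size.

```python
from torch.utils.data import RandomSampler

data = ["a", "b", "c"]

# Draw 8 indices with replacement from a 3-element dataset.
sampler = RandomSampler(data, replacement=True, num_samples=8)
print(len(sampler))   # 8
print(list(sampler))  # e.g. [2, 0, 2, 1, 1, 0, 2, 2]
```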
Chetter2
5ca2713a8b Fix performance of WeightedRandomSampler (#10636)
Summary:
Since https://github.com/pytorch/pytorch/pull/8958 was merged, the BatchSampler samples 0d tensors from WeightedRandomSampler instead of integers, which significantly reduces performance. This PR fixes it the same way https://github.com/pytorch/pytorch/pull/10361 fixed DistributedSampler.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10636

Differential Revision: D9423869

Pulled By: zou3519

fbshipit-source-id: f94da2d4cccf70e63beea6cfc3d1230b5610ae44
2018-08-22 13:15:48 -07:00
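The fix, as in https://github.com/pytorch/pytorch/pull/10361, converts the sampled tensor to a plain Python list up front, so downstream code sees ints rather than 0d tensors; the core of `__iter__` becomes roughly:

```python
import torch

weights = torch.tensor([0.1, 0.9, 0.4, 0.7], dtype=torch.double)

# .tolist() yields Python ints; iterating the tensor directly would yield
# 0d tensors, which are far more expensive for BatchSampler to accumulate.
indices = torch.multinomial(weights, 6, replacement=True).tolist()
print(indices)  # e.g. [1, 3, 1, 1, 2, 3]
```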
Dmitriy Serdyuk
ba8e133844 Refactor batch sampler (#8958)
Summary:
Fixes #8652, fixes #8957
Closes https://github.com/pytorch/pytorch/pull/8958

Reviewed By: ezyang

Differential Revision: D8668253

Pulled By: soumith

fbshipit-source-id: 663d461621511166f29cfcc902e6c2a71befa647
2018-06-27 16:06:47 -07:00
Gao, Xiang
d7c32df67f move Subset, random_split to data, use sequence at some places. (#7816) 2018-05-25 12:50:50 +02:00
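After the move, both utilities live in `torch.utils.data`; for example:

```python
from torch.utils.data import Subset, random_split

dataset = list(range(10))

train, val = random_split(dataset, [8, 2])    # random 8/2 split
first_half = Subset(dataset, list(range(5)))  # fixed subset by indices
print(len(train), len(val), len(first_half))  # 8 2 5
```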
Gao, Xiang
42e5e12750 make BatchSampler subclass of Sampler, and expose (#7707) 2018-05-19 21:29:03 +02:00
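Once exposed, `BatchSampler` can be used directly; an example along the lines of its docstring:

```python
from torch.utils.data import BatchSampler, SequentialSampler

print(list(BatchSampler(SequentialSampler(range(10)), batch_size=3, drop_last=False)))
# [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]]
print(list(BatchSampler(SequentialSampler(range(10)), batch_size=3, drop_last=True)))
# [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
```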
Thomas Viehmann
1b0ad8678b import *Sampler to utils.data (Better fix than #6982) (#7007) 2018-04-27 10:18:29 +02:00
Thomas Viehmann
5dc5a71d74 Improve error message (Sampler location) Fixes #6917 (#6982)
Thank you @ruotianluo for reporting!
2018-04-26 10:58:27 -04:00
Tongzhou Wang
1c01eabd3c
Codemod to update our codebase to 0.4 standard (#6641)
* Codemod to update our codebase to 0.4 standard

* Update some of the test scripts

* remove Variable in test_clip_grad_value

* fix _symbolic_override_wrapper_maker
2018-04-17 22:06:54 -04:00
Tongzhou Wang
1f0b07cddc fix typos in sampler.py (#6525) 2018-04-11 17:27:25 -04:00
Tongzhou Wang
efc91d8c6d Add arg checks in torch.utils.data.Sampler classes (#6249)
Fixes #6168

* add arg checks in torch.utils.data.Sampler

* add check for positivity
2018-04-04 23:07:31 -04:00
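The checks reject invalid sizes up front instead of failing obscurely later; for instance:

```python
from torch.utils.data import BatchSampler, SequentialSampler

try:
    BatchSampler(SequentialSampler(range(10)), batch_size=0, drop_last=False)
except ValueError as e:
    # Message along the lines of:
    # batch_size should be a positive integer value, but got batch_size=0
    print(e)
```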
Sam Gross
30ec06c140
Merge Variable and Tensor classes (#5225)
This replaces the torch.Tensor constructors with factories that produce
Variables. Similarly, functions on the torch module (e.g. torch.randn)
now return Variables.

To keep the PR to a reasonable size, I've left most of the unused tensor
code. Subsequent PRs will remove the dead code, clean up calls to
torch.autograd.Variable, and rename Variable to Tensor everywhere.

There are some breaking changes because Variable and Tensors had
slightly different semantics. There's a list of those changes here:

 https://github.com/pytorch/pytorch/wiki/Breaking-Changes-from-Variable-and-Tensor-merge
2018-02-23 18:03:31 -05:00
vfdev
bf5932fb15 Add missing documentation for replacement in WeightedRandomSampler (#3579)
* Update sampler.py

* fix lint
2017-11-08 20:23:42 -05:00
Sasank Chilamkurthy
bbf2c6a084 Fix ConcatDataset docs (#2355)
* Fix ConcatDataset docs

so that sphinx-napoleon parses it right.

* Fix WeightedRandomSampler docs
2017-08-23 09:47:57 -04:00
Sam Gross
f09027bc29 Add batch sampler to DataLoader (#1867) 2017-06-22 20:18:31 +02:00
Isac Arnekvist
156fe28666 dataloader can now handle growing datasets (#1575) 2017-05-17 19:23:15 -04:00
Dmitry Ulyanov
997312c233 Add WeightedRandomSampler (#980)
Samples elements from `[0,..,len(weights)-1]` with the given probabilities (weights). So far there is no means to introduce sample weights either in loss functions or when sampling from a dataset. This is an attempt to add the functionality for the latter.
2017-03-13 00:27:05 -04:00
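Semantics sketch: each draw picks index `i` with probability proportional to `weights[i]` (the weights need not sum to one):

```python
from torch.utils.data import WeightedRandomSampler

weights = [0.1, 0.9, 0.4, 0.7]

# Index 1 is nine times as likely as index 0 on each draw.
sampler = WeightedRandomSampler(weights, num_samples=6, replacement=True)
print(list(sampler))  # e.g. [1, 3, 1, 2, 1, 3]
```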
felixgwu
5e7f5db332 add subset samplers (#888) 2017-03-02 09:26:10 -05:00
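`SubsetRandomSampler` draws, without replacement, from a fixed list of indices, which is handy for train/validation splits; a sketch:

```python
from torch.utils.data import DataLoader, SubsetRandomSampler

dataset = list(range(100))
train_idx, val_idx = list(range(80)), list(range(80, 100))

# Each loader shuffles within its own index subset.
train_loader = DataLoader(dataset, batch_size=16, sampler=SubsetRandomSampler(train_idx))
val_loader = DataLoader(dataset, batch_size=16, sampler=SubsetRandomSampler(val_idx))
```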
Adam Paszke
4cc11066b2 Add torch.utils.data docs and improve notes (#460)
* Add torch.utils.data docs and improve notes
2017-01-17 14:51:05 -05:00
Adam Paszke
ee85fe1a9c Initial utils implementation 2016-09-08 18:49:48 -07:00