Commit Graph

123 Commits

Author SHA1 Message Date
Nathan Goldbaum
ad2825a2c9 Add API for listing functions overridable by __torch_function__ (#33791)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/33182

This adds private API functions that developers of types that implement `__torch_function__` can use to ensure full coverage of the subset of the PyTorch API that can be overrided.

I've refactored some of the code in the tests into a new `torch._overrides.get_overridable_functions` function. I've also changed `TENSOR_LIKE_TORCH_OVERRIDES` into `torch._overrides.get_testing_overrides` and `IGNORED_TORCH_FUNCTIONS` into `torch._overrides.get_ignored_functions`. Making these two static global variables in the tests into functions should allow rewriting their implementation to construct their return values instead of just statically defining the return value as is done here. Currently that is blocked on not being able to inspect function signatures of compiled kernels in PyTorch (see https://github.com/pytorch/pytorch/issues/28233). See the docs I've added for usage examples of these new functions. I also refactored the existing override tests to make use of these new functions, which should be a good forcing function to make sure they're kept up-to-date.

Finally, while working on this I discovered that `TestTorchFunctionOverrides.test_mean` and `TestTorchFunctionOverrides.test_mm` weren't ever being run because they were getting clobbered by the other dynamically generated override tests. I fixed that by renaming the tests and then fixing the actual test code. I've verified that all the subclassing semantics is correct and that the updated test answers are correct. I'm happy to put the fixes to the existing tests in as a separate pull request if that would be easier to review.

ping cpuhrsch since the feature request originally came from them.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33791

Differential Revision: D20195053

Pulled By: cpuhrsch

fbshipit-source-id: 1585f4e405f5223932b410eae03a288dc8eb627e
2020-03-03 12:40:34 -08:00
Omkar Salpekar
24dd800e6a [Dist Autograd] Functional API for Dist Autograd and Dist Optimizer (#33711)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33711

Fixed #33480

This makes `dist_autograd.backward` and `dist_optimizer.step` functional by making the user explicitly pass in the `context_id` as opposed to relying on the confusing thread_local context_id.

This diff incorporates these API changes and all places where these functions are called.

More concretely, this code:

```
with dist_autograd.context():
    # Forward pass.
    dist_autograd.backward([loss.sum()])
    dist_optim.step()
```

should now be written as follows:

```
with dist_autograd.context() as context_id:
    # Forward pass.
    dist_autograd.backward(context_id, [loss.sum()])
    dist_optim.step(context_id)
```

Test Plan: Ensuring all existing dist_autograd and dist_optimizer tests pass with the new API. Also added a new test case for input checking.

Differential Revision: D20011710

fbshipit-source-id: 216e12207934a2a79c7223332b97c558d89d4d65
2020-02-26 19:08:28 -08:00
Michael Carilli
fc6a153688 [WIP] Reanimate gradient scaling API with original scale update heuristic (#33366)
Summary:
Also, windows memory failures responsible for the earlier reversion have been fixed.

This PR (initially) contains 2 commits:
* a revert of the revert
* all changes to implement the original Apex scale update heuristic, squashed into a single commit for easier diff review
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33366

Differential Revision: D20099026

Pulled By: ngimel

fbshipit-source-id: 339b9b6bd5134bf055057492cd1eedb7e4461529
2020-02-25 19:00:34 -08:00
peter
adbe289870 Update MKL to 2020.0.166 for Windows (#33690)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33690

Differential Revision: D20089300

Pulled By: ezyang

fbshipit-source-id: 887c006fbdb2c837f0a1c607a196811f44f1fb35
2020-02-24 22:43:34 -08:00
Edward Yang
ae53f8dd25 Revert D19859905: [pytorch][PR] Gradient scaling API
Test Plan: revert-hammer

Differential Revision:
D19859905

Original commit changeset: bb8ae6966214

fbshipit-source-id: 28f1c93e8a00e3a4bbe8cc981499b15468f0b970
2020-02-14 11:03:27 -08:00
Michael Carilli
40246fa63c Gradient scaling API (#26512)
Summary:
This PR implements the gradient scaling API that mruberry, jjsjann123, ngimel, zdevito, gchanan and I have been discussing.  Relevant issue/RFC: https://github.com/pytorch/pytorch/issues/25081.

Volume-wise, this PR is mostly documentation and tests.  The Python API (found entirely in `torch/cuda/amp/amp_scaler.py`) is lightweight .  The exposed functions are intended to make the implementation and control flow of gradient scaling convenient, intuitive, and performant.

The API is probably easiest to digest by looking at the documentation and examples. `docs/source/amp.rst` is the homepage for the Automatic Mixed Precision package.  `docs/source/notes/amp_examples.rst` includes several examples demonstrating common but not-immediately-obvious use cases.  Examples are backed by tests in `test_cuda.py` (and thankfully the tests pass :P).

Two small utility kernels have been added in `native/cuda/AmpKernels.cu` to improve performance and avoid host-device synchronizations wherever possible.

Existing optimizers, both in the wild and in Pytorch core, do not need to change to use the scaling API.

However, the API was also designed to establish a contract between user scripts and optimizers such that writers of _new_ custom optimizers have the control points they need to implement fast, optionally sync-free updates.  User scripts that obey the scaling API can drop such custom optimizers in and reap performance benefits without having to change anything aside from the optimizer constructor itself.  [I know what the contract with custom optimizers should be](35829f24ef/torch/cuda/amp/amp_scaler.py (L179-L184)), but I'm waiting for review on the rest of the API before I go about documenting it (it will be given a dedicated section in `docs/source/notes/amp_examples.rst`.

Currently, the gradient scaling examples do not include the auto-casting API as discussed in https://github.com/pytorch/pytorch/issues/25081.  The gradient scaling API is intended to be orthogonal/modular relative to autocasting.  Without auto-casting the gradient scaling API is fully use-_able_, but not terribly use-_ful_, so it's up to you guys whether you want to wait until auto-casting is ready before merging the scaling API as well.

### Todo
- [ ] How do I get c10 registered status for my two custom kernels?  They're very simple.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26512

Differential Revision: D19859905

Pulled By: mruberry

fbshipit-source-id: bb8ae6966214718dfee11345db824389e4286923
2020-02-13 11:06:06 -08:00
Ilia Cherniavskii
04829e924a Update CPU threading doc (#33083)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33083

Added more recommendations, some notes and warning

Test Plan: cd docs ; make html

Differential Revision: D19829133

Pulled By: ilia-cher

fbshipit-source-id: b9fbd89f5875b3ce35cc42ba75a3b44bb132c506
2020-02-11 14:13:51 -08:00
Brian Wignall
f326045b37 Fix typos, via a Levenshtein-type corrector (#31523)
Summary:
Should be non-semantic.

Uses https://en.wikipedia.org/wiki/Wikipedia:Lists_of_common_misspellings/For_machines to find likely typos, with https://github.com/bwignall/typochecker to help automate the checking.

Uses an updated version of the tool used in https://github.com/pytorch/pytorch/pull/30606 .
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31523

Differential Revision: D19216749

Pulled By: mrshenli

fbshipit-source-id: 7fd489cb9a77cd7e4950c1046f925d57524960ea
2020-01-17 16:03:19 -08:00
Shen Li
322f34b245 Adding DDP Design Note
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/32158

Test Plan: Imported from OSS

Differential Revision: D19405980

Pulled By: mrshenli

fbshipit-source-id: 808ef1c71b637546f8872375bf1828967b1a5a60
2020-01-15 14:10:45 -08:00
Vamshi Chowdary
05088da8e9 [pytorch][PR] Fixed error in sample code of documentation (#31682)
Summary:
"in_features" and "out_features" are not defined. Possibly a typo. They should be "input_features" and "output_features" instead
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31682

Differential Revision: D19251685

Pulled By: zou3519

fbshipit-source-id: ac9e524e792a1853a16e8876d76b908495d8f35e
2020-01-15 10:34:07 -08:00
Rohan Varma
a561a8448b minor doc tweak to use mp.spawn in example (#30381)
Summary:
Per pietern's comment in https://github.com/pytorch/pytorch/issues/30022, we can make this example launcher a bit simpler by using `torch.multiprocessing`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30381

Differential Revision: D19292080

Pulled By: rohan-varma

fbshipit-source-id: 018ace945601166ef3af05d8c3e69d900bd77c3b
2020-01-06 22:19:01 -08:00
Nathan Goldbaum
9d3402e4cb Add the __torch_function__ API override mechanism (#30730)
Summary:
This is a re-do of https://github.com/pytorch/pytorch/issues/27064, which was reverted (b8792c0438). This was landed at the same time as other work that added new operators to the `torch` namespace so the check for whether the `torch` namespace is exhaustively checked for overridability was triggering test failures.

I've temporarily disabled that check and added an explanatory comment that the check will be re-enabled in a future PR that will be merged during a time when the commit velocity on PyTorch is lower.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30730

Differential Revision: D18813270

Pulled By: ezyang

fbshipit-source-id: 70477c4656dca8fea6e7bc59259555041fcfbf68
2019-12-04 13:19:07 -08:00
Edward Yang
b8792c0438 Revert D18645954: add __torch_function__ API override mechanism
Test Plan: revert-hammer

Differential Revision:
D18645954

Original commit changeset: 54b5e4344d7a

fbshipit-source-id: 4a7aebb483e6b001130d6f384ccc53c5a808ab13
2019-12-04 07:41:47 -08:00
Prasun Anand
d12786b24f add __torch_function__ API override mechanism (#27064)
Summary:
Closes https://github.com/pytorch/pytorch/issues/24015 (see description of that issue for more details).

For a toy example, see the `DiagonalTensor` and `SubDiagonalTensor` class in test/test_overrides.py.

This PR currently contains:

* tests for `__torch_function__` behavior
* modification to `gen_python_functions` and `parse` function signatures and dispatched to correct overloaded argument.

This feature is inspired by and analogous to NumPy's `__array_function__` protocol ([see NumPy Enhancement Proposal 18](https://numpy.org/neps/nep-0018-array-function-protocol.html#trying-array-function-methods-until-the-right-one-works)).

### Benchmarks:
See Nathan's comment below: https://github.com/pytorch/pytorch/pull/27064#issuecomment-554601189
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27064

Differential Revision: D18645954

Pulled By: ezyang

fbshipit-source-id: 54b5e4344d7afdbcf996bb57191b0bdadc7b1767
2019-12-04 05:56:46 -08:00
peterjc123
6deb41c88d Update magma to 2.5.1 for Windows and switch CUDA in CI to 9.2
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30513

Differential Revision: D18764184

Pulled By: ezyang

fbshipit-source-id: 4992869fd6a89471a5d25eb6a9b44ad8eceb480f
2019-12-02 11:56:10 -08:00
Rohan Varma
1350b99de4 Add local shutdown to process group agent (#30330)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30330

This is now possible due to previous changes made in `gloo` and `ProcessGroupGloo`. We `abort` the listener thread that is waiting for a message, and join all other threads. The API is changed so that the previous `wait_all_workers` does not destroy the agent, and this is now done in a new `shutdown` method. All callsites are updated appropriately.

ghstack-source-id: 94673884
ghstack-source-id: 94673884

Test Plan: Unit tests pass.

Reviewed By: mrshenli

Differential Revision: D18661775

fbshipit-source-id: 5aaa7c14603e18253394224994f6cd43234301c2
2019-11-27 22:34:08 -08:00
Rohan Varma
5c6705e62c add default arg for init_method (#30208)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30208

Adds default arg for init_method so users don't have to pass this in,
and moves it to `RpcBackendOptions` struct. Removes `init_method` arg from rpc.init_rpc. Also fixes some docs.
ghstack-source-id: 94500475

Test Plan: Unit tests pass.

Reviewed By: mrshenli

Differential Revision: D18630074

fbshipit-source-id: 04b7dd7ec96f4c4da311b71d250233f1f262135a
2019-11-25 14:52:48 -08:00
Shen Li
a9f3f48f88 Revert D5578006: Add local shutdown to process group agent
Test Plan: revert-hammer

Differential Revision:
D5578006

Original commit changeset: 6258879fb44c

fbshipit-source-id: 11b893b3a280a8383eeb20a0548626811616dca1
2019-11-22 11:31:04 -08:00
Rohan Varma
c478a92b93 Add local shutdown to process group agent (#30020)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30020
This is now possible due to previous changes made in `gloo` and `ProcessGroupGloo`. We `abort` the listener thread that is waiting for a message, and join all other threads. The destructor calls this same `localShutdown` method, but we ensure this is not called multiple times.

ghstack-source-id: 94415336

Test Plan: Unit tests pass.

Differential Revision: D5578006

fbshipit-source-id: 6258879fb44c9fca97fdfad64468c1488c16ac02
2019-11-22 10:03:00 -08:00
Shen Li
063e22b7c2 Fix RRef design doc warning (#30240)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30240

Get rid of the following warning when build docs:

```
/Users/shenli/Project/pytorch/docs/source/notes/rref.rst:184: WARNING: Error in "code" directive:
maximum 1 argument(s) allowed, 6 supplied.

.. code::
  import torch
  import torch.distributed.rpc as rpc

  # on worker A
  rref = rpc.remote('B', torch.add, args=(torch.ones(2), 1))
  # say the rref has RRefId 100 and ForkId 1
  rref.to_here()
```

Test Plan: Imported from OSS

Differential Revision: D18640016

Pulled By: mrshenli

fbshipit-source-id: d527827f01183411d4b4c73e0a976bdd7fccbf49
2019-11-21 16:22:39 -08:00
Shen Li
e0325011e4 Add link to RRef protocol in RPC doc
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30218

Test Plan: Imported from OSS

Differential Revision: D18638881

Pulled By: mrshenli

fbshipit-source-id: ca6fae6f8cea8cdcc33d275dd71a347fbb5dd45c
2019-11-21 16:22:35 -08:00
Alban Desmaison
a78e7eadbd Fix typo in extending doc
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30159

Differential Revision: D18619060

Pulled By: albanD

fbshipit-source-id: 1109c8da6242dffd6315b0c9de0f8ca34df0b276
2019-11-21 08:12:32 -08:00
Shen Li
73cf4d468f Design doc for Remote Reference (#30066)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30066

This commit adds design reasoning and walks through four scenarios
for RRef.

Test Plan: Imported from OSS

Reviewed By: rohan-varma

Differential Revision: D18595094

Pulled By: mrshenli

fbshipit-source-id: 134102901ce515a44a2e7cd013b62143a6158120
2019-11-20 12:42:28 -08:00
Rohan Varma
f304bd5062 rename join_rpc to wait_all_workers in public api (#30050)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30050

Renames this API to wait_all_workers as discussed.
ghstack-source-id: 94273005

Test Plan: Unit tests pass

Differential Revision: D18581466

fbshipit-source-id: 4ff5d5fb2d528f17252d5b5f30c3047d2efb92bf
2019-11-20 12:38:35 -08:00
Pritam Damania
88ef402cb5 Add distributed optimizer section to distributed autograd design doc. (#30068)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30068

ghstack-source-id: 94228719

Test Plan: waitforbuildbot

Differential Revision: D18556536

fbshipit-source-id: decd6927bfdd1ee3c81fef7430aa7095d7f38d33
2019-11-19 22:43:03 -08:00
Pritam Damania
ab93b3df60 Polish distributed autograd docs. (#29942)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29942

1) Added links to the design.
2) Fixed function signautres.
3) Expanded examples
ghstack-source-id: 94162372

Test Plan: waitforbuildbot

Differential Revision: D18547103

fbshipit-source-id: 067ba166c107ed14085af8ee3306d3f8a9dcebe7
2019-11-18 18:13:08 -08:00
Pritam Damania
eb29276623 Update distributed autograd design doc with appropriate links. (#29927)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29927

With the docs page now up, we can update the links in the design doc
to point to the docs page.
ghstack-source-id: 94055423

Test Plan: waitforbuildbot

Differential Revision: D18541878

fbshipit-source-id: f44702d9a8296ccc0a5d58d56c3b6dc8a822b520
2019-11-15 21:10:53 -08:00
Pritam Damania
c3b2c2e353 Design doc for distributed autograd. (#29175)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29175

Updates our docs to include a design doc for distributed autograd.
Currently, this doc only covers the FAST mode algorithm. The Smart mode
algorithm section just refers to the original RFC.

There is a section for Distributed Optimizer that we can complete once we've
finalized the API for the same.
ghstack-source-id: 93701129

Test Plan: look at docs.

Differential Revision: D18318949

fbshipit-source-id: 670ea1b6bb84692f07facee26946bbc6ce8c650c
2019-11-12 15:04:23 -08:00
Alban Desmaison
f5edb62a7f Clean extending autograd doc for output size 1 (#28860)
Summary:
Fix https://github.com/pytorch/pytorch/issues/28583
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28860

Differential Revision: D18224497

Pulled By: albanD

fbshipit-source-id: 0fa4eacce6f6092d555e509dc23bd75206f78d41
2019-10-30 13:57:10 -07:00
Prasun Anand
4230132baf Added docs for context method mixins. Fixes issue #27365 (#28643)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/27365 .

This PR:
1. Makes Context method docs available.
2. Links [Extending torch autograd](https://pytorch.org/docs/stable/notes/extending.html#extending-torch-autograd) notes to Context method docs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28643

Differential Revision: D18170089

Pulled By: albanD

fbshipit-source-id: a1119ea8e2f8a71f0d1aadf416f2f98343aa9b7b
2019-10-28 08:31:35 -07:00
zou3519
e5d6b75319 Bag of documentation fixes; fix more sphinx warnings (#27850)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27850

Many of these are real problems in the documentation (i.e., link or
bullet point doesn't display correctly).

Test Plan: - built and viewed the documentation for each change locally.

Differential Revision: D17908123

Pulled By: zou3519

fbshipit-source-id: 65c92a352c89b90fb6b508c388b0874233a3817a
2019-10-15 07:31:14 -07:00
Jerry Ma
1610ea8ef8 Comprehensive-ish instrumentation for CUDA memory allocator (#27361)
Summary:
Adds comprehensive memory instrumentation to the CUDA caching memory allocator.

# Counters

Added comprehensive instrumentation for the following stats:
  - Allocation requests (`allocation`)
  - Allocated memory (`allocated_bytes`)
  - Reserved segments from cudaMalloc (`segment`)
  - Reserved memory (`reserved_bytes`)
  - Active memory blocks (`active`)
  - Active memory (`active_bytes`)
  - Inactive, non-releasable blocks (`inactive_split`)
  - Inactive, non-releasable memory (`inactive_split_bytes`)
  - Number of failed cudaMalloc calls that result in a cache flush and retry (`cuda_malloc_retries`)
  - Number of OOMs (`num_ooms`)

Except for the last two, these stats are segmented between all memory, large blocks, and small blocks. Along with the current value of each stat, historical counts of allocs/frees as well as peak usage are tracked by the allocator.

# Snapshots

Added the capability to get a "memory snapshot" – that is, to generate a complete dump of the allocator block/segment state.

# Implementation: major changes

- Added `torch.cuda.memory_stats()` (and associated C++ changes) which returns all instrumented stats as a dictionary.
- Added `torch.cuda.snapshot()` (and associated C++ changes) which returns a complete dump of the allocator block/segment state as a list of segments.
- Added memory summary generator in `torch.cuda.memory_summary()` for ease of client access to the instrumentation stats. Potentially useful to dump when catching OOMs. Sample output here: https://pastebin.com/uKZjtupq

# Implementation: minor changes

- Add error-checking helper functions for Python dicts and lists in `torch/csrc/utils/`.
- Existing memory management functions in `torch.cuda` moved from `__init__.py` to `memory.py` and star-imported to the main CUDA module.
- Add various helper functions to `torch.cuda` to return individual items from `torch.cuda.memory_stats()`.
- `torch.cuda.reset_max_memory_cached()` and `torch.cuda.reset_max_memory_allocated()` are deprecated in favor of `reset_peak_stats`. It's a bit difficult to think of a case where only one of those stats should be reset, and IMO this makes the peak stats collectively more consistent.
- `torch.cuda.memory_cached()` and `torch.cuda.max_memory_cached()` are deprecated in favor of `*memory_reserved()`.
- Style (add access modifiers in the allocator class, random nit fixes, etc.)

# Testing

- Added consistency check for stats in `test_cuda.py`. This verifies that the data from `memory_stats()` is faithful to the data from `snapshot()`.
- Ran on various basic workflows (toy example, CIFAR)

# Performance

Running the following speed benchmark: https://pastebin.com/UNndQg50

- Before this PR: 45.98 microseconds per tensor creation
- After this PR: 46.65 microseconds per tensor creation
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27361

Differential Revision: D17758747

Pulled By: jma127

fbshipit-source-id: 5a84e82d696c40c505646b9a1b4e0c3bba38aeb6
2019-10-08 15:42:48 -07:00
Tongzhou Wang
98d3d1659e Document benchmarking practice for CUDA
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23910

Differential Revision: D16732365

Pulled By: ezyang

fbshipit-source-id: 24e055602d479293da3e00a7143bba8f92bb7c4a
2019-08-13 15:07:23 -07:00
Ilia Cherniavskii
936632b120 Thread local debug info
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/22365

Test Plan:
USE_CUDA=0 python setup.py develop
./build/bin/test_jit

Imported from OSS

Reviewed By: ajyu

Differential Revision: D16065129

Pulled By: ilia-cher

fbshipit-source-id: f300985459a83c2c1049ed8c4fefd23a3144047a
2019-08-12 14:53:57 -07:00
Ilia Cherniavskii
41dfe7204b Threading and CPU Inference note
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23417

Test Plan:
cd docs; make html

Imported from OSS

Differential Revision: D16523781

Pulled By: ilia-cher

fbshipit-source-id: d6c09e8a85d39e6185bbdc4b312fea44fcdfff06
2019-07-29 15:45:49 -07:00
Dmytro Dzhulgakov
d6dcec37b6 Add docs about prod ecosystem features (#23010)
Summary:
Covering fleet-wide profiling, api logging, etc.

It's my first time writing rst, so suggestions are definitely welcomed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23010

Differential Revision: D16456721

Pulled By: dzhulgakov

fbshipit-source-id: 3d3018f41499d04db0dca865bb3a9652d8cdf90a
2019-07-24 14:15:33 -07:00
Tongzhou Wang
058beae411 Add IterableDataset (#19228)
Summary:
This is a modified version of https://github.com/pytorch/pytorch/pull/14705 since commit structure for that PR is quite messy.

1. Add `IterableDataset`.
3. So we have 2 data loader mods: `Iterable` and `Map`.

    1. `Iterable` if the `dataset` is an instance of `IterableDataset`
    2. `Map` o.w.

3. Add better support for non-batch loading (i.e., `batch_size=None` and `batch_sampler=None`). This is useful in doing things like bulk loading.
3. Refactor `DataLoaderIter` into two classes, `_SingleProcessDataLoaderIter` and `_MultiProcessingDataLoaderIter`. Rename some methods to be more generic, e.g., `get_batch` -> `get_data`.
4. Add `torch.utils.data.get_worker_info` which returns worker information in a worker proc (e.g., worker id, dataset obj copy, etc.) and can be used in `IterableDataset.__iter__` and `worker_init_fn` to do per-worker configuration.
5. Add `ChainDataset`, which is the analog of `ConcatDataset` for `IterableDataset`.
7. Import torch.utils.data in `torch/__init__.py`
9. data loader examples and documentations
10. Use `get_worker_info` to detect whether we are in a worker process in `default_collate`

Closes https://github.com/pytorch/pytorch/issues/17909, https://github.com/pytorch/pytorch/issues/18096, https://github.com/pytorch/pytorch/issues/19946, and some of https://github.com/pytorch/pytorch/issues/13023
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19228

Reviewed By: bddppq

Differential Revision: D15058152

fbshipit-source-id: 9e081a901a071d7e4502b88054a34b450ab5ddde
2019-06-20 20:12:44 -07:00
Bram Vanroy
38d68ad803 Update randomness.rst (#21337)
Summary:
Following [this question on the forums](https://discuss.pytorch.org/t/reproducibility-and-performance/46504), I propose the following doc change. It clarifies that 'performance reduction' concerns the processing speed (and not the training accuracy).

Related website commit: https://github.com/pytorch/pytorch.github.io/pull/211
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21337

Differential Revision: D15622151

Pulled By: soumith

fbshipit-source-id: f0edeb20049f2ee715c400e7c57abb966864d621
2019-06-04 07:38:00 -07:00
Tongzhou Wang
bb89827e1d Update cuda pinned memory note to include tensor.to (#20977)
Summary:
separate bits of changes from #19228
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20977

Differential Revision: D15511919

Pulled By: soumith

fbshipit-source-id: 5015a29cdac6d6e160388c493182c330f0da63ec
2019-05-26 22:22:06 -07:00
Tongzhou Wang
83fe92870d Update multiprocessing note now that shared CUDA tensors are refcounted (#19904)
Summary:
The mp notes are not updated after https://github.com/pytorch/pytorch/pull/16854. (The torch.multiprocessing page is.)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19904

Differential Revision: D15509661

Pulled By: soumith

fbshipit-source-id: 7c11e14a6c804498dda3adbf19710e63e6a564a0
2019-05-25 17:40:42 -07:00
Sergii Dymchenko
a5c90aaf47 Use "length of the RNN input" instead of "length of the RNN"
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20873

Differential Revision: D15495570

Pulled By: ezyang

fbshipit-source-id: e3b4cd67ccf97d0053ac053c3bcb74415b928c0a
2019-05-24 09:03:50 -07:00
peter
39b885cbbf Add magma for CUDA 10.1 to Windows docs
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19914

Differential Revision: D15123029

Pulled By: soumith

fbshipit-source-id: a3d4b498a763e1a9829896d44211be00400ec39d
2019-04-29 10:13:21 -07:00
Tongzhou Wang
6d307db5b4 Move cuFFT plan cache note outside Best Practices (#19538)
Summary:
I mistakenly put it there.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19538

Differential Revision: D15026500

Pulled By: soumith

fbshipit-source-id: 0c13499571fdfd789c3bd1c4b58abd870725d422
2019-04-20 21:39:59 -07:00
Tongzhou Wang
973d51079b Add device-specific cuFFT plan caches (#19300)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/19224
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19300

Differential Revision: D14986967

Pulled By: soumith

fbshipit-source-id: 8c31237db50d6924bba1472434c10326610d9255
2019-04-18 06:39:35 -07:00
Edward Yang
48a35135fb Convert all tabs to spaces, add CI. (#18959)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18959
ghimport-source-id: a934163fa34cb2019732d5f49dc7290c376bf156

Differential Revision: D14831246

Pulled By: ezyang

fbshipit-source-id: beb92dc4ee8c82f4c8259c081dd72e477fe7a9d0
2019-04-09 08:12:26 -07:00
peter
9af6564060 Add magma debug version for Windows
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18008

Differential Revision: D14455117

Pulled By: soumith

fbshipit-source-id: 29d9a2e0b36d72bece0bb1870bbdc740c4d1f9d6
2019-03-14 10:15:57 -07:00
Tongzhou Wang
b6313d74e1 fix faq typo
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17851

Differential Revision: D14401791

Pulled By: soumith

fbshipit-source-id: ed6d64d6f5985e7ce76dca1e9e376782736b90f9
2019-03-10 15:33:52 -07:00
peter
c78da0c6ed Enable using CMD when building cpp extensions on Windows
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17706

Differential Revision: D14346482

Pulled By: ezyang

fbshipit-source-id: 7c85e51c701f6c0947ad324ef19fafda40ae1cb9
2019-03-06 14:45:31 -08:00
peter
81f2bdf9c2 Update magma to 2.5.0 for Windows
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17607

Differential Revision: D14281291

Pulled By: yf225

fbshipit-source-id: 51209c5540932871e45e54ba6d61b3b7d264aa8c
2019-03-01 09:53:56 -08:00
SsnL
300dcc3b96 Add cuda.reset_max_memory_* (#15985)
Summary:
Addresses #15968
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15985

Differential Revision: D13649916

Pulled By: soumith

fbshipit-source-id: a207aea5709a79dba7a6fc541d0a70103f49efff
2019-01-14 07:31:51 -08:00