97 Commits

Author SHA1 Message Date
Pritam Damania
eb29276623 Update distributed autograd design doc with appropriate links. (#29927)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29927

With the docs page now up, we can update the links in the design doc
to point to the docs page.
ghstack-source-id: 94055423

Test Plan: waitforbuildbot

Differential Revision: D18541878

fbshipit-source-id: f44702d9a8296ccc0a5d58d56c3b6dc8a822b520
2019-11-15 21:10:53 -08:00
Pritam Damania
c3b2c2e353 Design doc for distributed autograd. (#29175)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29175

Updates our docs to include a design doc for distributed autograd.
Currently, this doc only covers the FAST mode algorithm. The Smart mode
algorithm section just refers to the original RFC.

There is a section for Distributed Optimizer that we can complete once we've
finalized the API for the same.
ghstack-source-id: 93701129

Test Plan: look at docs.

Differential Revision: D18318949

fbshipit-source-id: 670ea1b6bb84692f07facee26946bbc6ce8c650c
2019-11-12 15:04:23 -08:00
Alban Desmaison
f5edb62a7f Clean extending autograd doc for output size 1 (#28860)
Summary:
Fix https://github.com/pytorch/pytorch/issues/28583
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28860

Differential Revision: D18224497

Pulled By: albanD

fbshipit-source-id: 0fa4eacce6f6092d555e509dc23bd75206f78d41
2019-10-30 13:57:10 -07:00
Prasun Anand
4230132baf Added docs for context method mixins. Fixes issue #27365 (#28643)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/27365 .

This PR:
1. Makes Context method docs available.
2. Links [Extending torch autograd](https://pytorch.org/docs/stable/notes/extending.html#extending-torch-autograd) notes to Context method docs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28643

Differential Revision: D18170089

Pulled By: albanD

fbshipit-source-id: a1119ea8e2f8a71f0d1aadf416f2f98343aa9b7b
2019-10-28 08:31:35 -07:00
zou3519
e5d6b75319 Bag of documentation fixes; fix more sphinx warnings (#27850)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27850

Many of these are real problems in the documentation (i.e., link or
bullet point doesn't display correctly).

Test Plan: - built and viewed the documentation for each change locally.

Differential Revision: D17908123

Pulled By: zou3519

fbshipit-source-id: 65c92a352c89b90fb6b508c388b0874233a3817a
2019-10-15 07:31:14 -07:00
Jerry Ma
1610ea8ef8 Comprehensive-ish instrumentation for CUDA memory allocator (#27361)
Summary:
Adds comprehensive memory instrumentation to the CUDA caching memory allocator.

# Counters

Added comprehensive instrumentation for the following stats:
  - Allocation requests (`allocation`)
  - Allocated memory (`allocated_bytes`)
  - Reserved segments from cudaMalloc (`segment`)
  - Reserved memory (`reserved_bytes`)
  - Active memory blocks (`active`)
  - Active memory (`active_bytes`)
  - Inactive, non-releasable blocks (`inactive_split`)
  - Inactive, non-releasable memory (`inactive_split_bytes`)
  - Number of failed cudaMalloc calls that result in a cache flush and retry (`cuda_malloc_retries`)
  - Number of OOMs (`num_ooms`)

Except for the last two, these stats are segmented between all memory, large blocks, and small blocks. Along with the current value of each stat, historical counts of allocs/frees as well as peak usage are tracked by the allocator.

# Snapshots

Added the capability to get a "memory snapshot" – that is, to generate a complete dump of the allocator block/segment state.

# Implementation: major changes

- Added `torch.cuda.memory_stats()` (and associated C++ changes) which returns all instrumented stats as a dictionary.
- Added `torch.cuda.snapshot()` (and associated C++ changes) which returns a complete dump of the allocator block/segment state as a list of segments.
- Added memory summary generator in `torch.cuda.memory_summary()` for ease of client access to the instrumentation stats. Potentially useful to dump when catching OOMs. Sample output here: https://pastebin.com/uKZjtupq
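
As a rough illustration of how a client might consume the dictionary returned by `torch.cuda.memory_stats()`, the sketch below pulls the peak allocated memory for each pool. It assumes the dictionary is flat and keyed as `<stat>.<pool>.<metric>` (for example `allocated_bytes.all.peak`); that key layout and the helper name `peak_allocated_mib` are assumptions for illustration, not something this commit message specifies.

```python
def peak_allocated_mib(stats, pools=("all", "large_pool", "small_pool")):
    """Return peak allocated memory per pool, in MiB.

    `stats` is expected to look like the dict from torch.cuda.memory_stats().
    The "allocated_bytes.<pool>.peak" key layout is an assumption.
    """
    mib = 1024 * 1024
    return {
        pool: stats.get(f"allocated_bytes.{pool}.peak", 0) / mib
        for pool in pools
    }


# Only query the real allocator when PyTorch and a CUDA device are present.
try:
    import torch

    if torch.cuda.is_available():
        print(peak_allocated_mib(torch.cuda.memory_stats()))
except ImportError:
    pass
```

Missing keys fall back to 0, so the helper also works on a partial or hand-built dictionary.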

# Implementation: minor changes

- Add error-checking helper functions for Python dicts and lists in `torch/csrc/utils/`.
- Existing memory management functions in `torch.cuda` moved from `__init__.py` to `memory.py` and star-imported to the main CUDA module.
- Add various helper functions to `torch.cuda` to return individual items from `torch.cuda.memory_stats()`.
- `torch.cuda.reset_max_memory_cached()` and `torch.cuda.reset_max_memory_allocated()` are deprecated in favor of `reset_peak_stats`. It's a bit difficult to think of a case where only one of those stats should be reset, and IMO this makes the peak stats collectively more consistent.
- `torch.cuda.memory_cached()` and `torch.cuda.max_memory_cached()` are deprecated in favor of `*memory_reserved()`.
- Style (add access modifiers in the allocator class, random nit fixes, etc.)

# Testing

- Added consistency check for stats in `test_cuda.py`. This verifies that the data from `memory_stats()` is faithful to the data from `snapshot()`.
- Ran on various basic workflows (toy example, CIFAR)

# Performance

Running the following speed benchmark: https://pastebin.com/UNndQg50

- Before this PR: 45.98 microseconds per tensor creation
- After this PR: 46.65 microseconds per tensor creation
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27361

Differential Revision: D17758747

Pulled By: jma127

fbshipit-source-id: 5a84e82d696c40c505646b9a1b4e0c3bba38aeb6
2019-10-08 15:42:48 -07:00
Tongzhou Wang
98d3d1659e Document benchmarking practice for CUDA
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23910

Differential Revision: D16732365

Pulled By: ezyang

fbshipit-source-id: 24e055602d479293da3e00a7143bba8f92bb7c4a
2019-08-13 15:07:23 -07:00
Ilia Cherniavskii
936632b120 Thread local debug info
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/22365

Test Plan:
USE_CUDA=0 python setup.py develop
./build/bin/test_jit

Imported from OSS

Reviewed By: ajyu

Differential Revision: D16065129

Pulled By: ilia-cher

fbshipit-source-id: f300985459a83c2c1049ed8c4fefd23a3144047a
2019-08-12 14:53:57 -07:00
Ilia Cherniavskii
41dfe7204b Threading and CPU Inference note
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23417

Test Plan:
cd docs; make html

Imported from OSS

Differential Revision: D16523781

Pulled By: ilia-cher

fbshipit-source-id: d6c09e8a85d39e6185bbdc4b312fea44fcdfff06
2019-07-29 15:45:49 -07:00
Dmytro Dzhulgakov
d6dcec37b6 Add docs about prod ecosystem features (#23010)
Summary:
Covering fleet-wide profiling, api logging, etc.

It's my first time writing rst, so suggestions are definitely welcomed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23010

Differential Revision: D16456721

Pulled By: dzhulgakov

fbshipit-source-id: 3d3018f41499d04db0dca865bb3a9652d8cdf90a
2019-07-24 14:15:33 -07:00
Tongzhou Wang
058beae411 Add IterableDataset (#19228)
Summary:
This is a modified version of https://github.com/pytorch/pytorch/pull/14705, since the commit structure of that PR is quite messy.

1. Add `IterableDataset`.
2. So we have 2 data loader modes: `Iterable` and `Map`.

    1. `Iterable` if the `dataset` is an instance of `IterableDataset`
    2. `Map` otherwise

3. Add better support for non-batch loading (i.e., `batch_size=None` and `batch_sampler=None`). This is useful in doing things like bulk loading.
4. Refactor `DataLoaderIter` into two classes, `_SingleProcessDataLoaderIter` and `_MultiProcessingDataLoaderIter`. Rename some methods to be more generic, e.g., `get_batch` -> `get_data`.
5. Add `torch.utils.data.get_worker_info`, which returns worker information in a worker process (e.g., worker id, dataset object copy, etc.) and can be used in `IterableDataset.__iter__` and `worker_init_fn` to do per-worker configuration.
6. Add `ChainDataset`, which is the analog of `ConcatDataset` for `IterableDataset`.
7. Import `torch.utils.data` in `torch/__init__.py`
8. Data loader examples and documentation
9. Use `get_worker_info` to detect whether we are in a worker process in `default_collate`
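
The per-worker configuration mentioned above can be sketched as follows: each worker calls `get_worker_info()` inside `IterableDataset.__iter__` and iterates only its own slice, so that samples are not duplicated across workers. The names `worker_shard` and `RangeDataset` are hypothetical, and the even ceil-division split is just one possible policy.

```python
import math


def worker_shard(start, end, worker_id, num_workers):
    """Return the [lo, hi) sub-range that one worker should iterate over."""
    per_worker = math.ceil((end - start) / num_workers)
    lo = min(start + worker_id * per_worker, end)
    hi = min(lo + per_worker, end)
    return lo, hi


# The dataset class needs PyTorch; the sharding helper above does not.
try:
    from torch.utils.data import IterableDataset, get_worker_info
except ImportError:
    IterableDataset = None

if IterableDataset is not None:

    class RangeDataset(IterableDataset):
        """Hypothetical dataset that yields the integers in [start, end)."""

        def __init__(self, start, end):
            self.start, self.end = start, end

        def __iter__(self):
            info = get_worker_info()  # None when running in the main process
            if info is None:
                lo, hi = self.start, self.end
            else:
                lo, hi = worker_shard(
                    self.start, self.end, info.id, info.num_workers
                )
            return iter(range(lo, hi))
```

With single-process loading the dataset yields the full range; with several workers each one yields a disjoint slice, and trailing workers may get an empty slice when the range is short.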

Closes https://github.com/pytorch/pytorch/issues/17909, https://github.com/pytorch/pytorch/issues/18096, https://github.com/pytorch/pytorch/issues/19946, and some of https://github.com/pytorch/pytorch/issues/13023
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19228

Reviewed By: bddppq

Differential Revision: D15058152

fbshipit-source-id: 9e081a901a071d7e4502b88054a34b450ab5ddde
2019-06-20 20:12:44 -07:00
Bram Vanroy
38d68ad803 Update randomness.rst (#21337)
Summary:
Following [this question on the forums](https://discuss.pytorch.org/t/reproducibility-and-performance/46504), I propose the following doc change. It clarifies that 'performance reduction' concerns the processing speed (and not the training accuracy).

Related website commit: https://github.com/pytorch/pytorch.github.io/pull/211
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21337

Differential Revision: D15622151

Pulled By: soumith

fbshipit-source-id: f0edeb20049f2ee715c400e7c57abb966864d621
2019-06-04 07:38:00 -07:00
Tongzhou Wang
bb89827e1d Update cuda pinned memory note to include tensor.to (#20977)
Summary:
separate bits of changes from #19228
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20977

Differential Revision: D15511919

Pulled By: soumith

fbshipit-source-id: 5015a29cdac6d6e160388c493182c330f0da63ec
2019-05-26 22:22:06 -07:00
Tongzhou Wang
83fe92870d Update multiprocessing note now that shared CUDA tensors are refcounted (#19904)
Summary:
The multiprocessing notes were not updated after https://github.com/pytorch/pytorch/pull/16854. (The torch.multiprocessing page was.)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19904

Differential Revision: D15509661

Pulled By: soumith

fbshipit-source-id: 7c11e14a6c804498dda3adbf19710e63e6a564a0
2019-05-25 17:40:42 -07:00
Sergii Dymchenko
a5c90aaf47 Use "length of the RNN input" instead of "length of the RNN"
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20873

Differential Revision: D15495570

Pulled By: ezyang

fbshipit-source-id: e3b4cd67ccf97d0053ac053c3bcb74415b928c0a
2019-05-24 09:03:50 -07:00
peter
39b885cbbf Add magma for CUDA 10.1 to Windows docs
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19914

Differential Revision: D15123029

Pulled By: soumith

fbshipit-source-id: a3d4b498a763e1a9829896d44211be00400ec39d
2019-04-29 10:13:21 -07:00
Tongzhou Wang
6d307db5b4 Move cuFFT plan cache note outside Best Practices (#19538)
Summary:
I mistakenly put it there.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19538

Differential Revision: D15026500

Pulled By: soumith

fbshipit-source-id: 0c13499571fdfd789c3bd1c4b58abd870725d422
2019-04-20 21:39:59 -07:00
Tongzhou Wang
973d51079b Add device-specific cuFFT plan caches (#19300)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/19224
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19300

Differential Revision: D14986967

Pulled By: soumith

fbshipit-source-id: 8c31237db50d6924bba1472434c10326610d9255
2019-04-18 06:39:35 -07:00
Edward Yang
48a35135fb Convert all tabs to spaces, add CI. (#18959)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18959
ghimport-source-id: a934163fa34cb2019732d5f49dc7290c376bf156

Differential Revision: D14831246

Pulled By: ezyang

fbshipit-source-id: beb92dc4ee8c82f4c8259c081dd72e477fe7a9d0
2019-04-09 08:12:26 -07:00
peter
9af6564060 Add magma debug version for Windows
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18008

Differential Revision: D14455117

Pulled By: soumith

fbshipit-source-id: 29d9a2e0b36d72bece0bb1870bbdc740c4d1f9d6
2019-03-14 10:15:57 -07:00
Tongzhou Wang
b6313d74e1 fix faq typo
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17851

Differential Revision: D14401791

Pulled By: soumith

fbshipit-source-id: ed6d64d6f5985e7ce76dca1e9e376782736b90f9
2019-03-10 15:33:52 -07:00
peter
c78da0c6ed Enable using CMD when building cpp extensions on Windows
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17706

Differential Revision: D14346482

Pulled By: ezyang

fbshipit-source-id: 7c85e51c701f6c0947ad324ef19fafda40ae1cb9
2019-03-06 14:45:31 -08:00
peter
81f2bdf9c2 Update magma to 2.5.0 for Windows
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17607

Differential Revision: D14281291

Pulled By: yf225

fbshipit-source-id: 51209c5540932871e45e54ba6d61b3b7d264aa8c
2019-03-01 09:53:56 -08:00
SsnL
300dcc3b96 Add cuda.reset_max_memory_* (#15985)
Summary:
Addresses #15968
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15985

Differential Revision: D13649916

Pulled By: soumith

fbshipit-source-id: a207aea5709a79dba7a6fc541d0a70103f49efff
2019-01-14 07:31:51 -08:00
Alexander Rodin
a0d22b6965 Fix typo in documentation
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/15628

Differential Revision: D13562685

Pulled By: soumith

fbshipit-source-id: 1621fcff465b029142313f717035e935e9159513
2018-12-30 18:07:57 -08:00
peterjc123
e1eb32d9f1 Update magma to 2.4.0 for Windows
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14738

Differential Revision: D13341611

Pulled By: soumith

fbshipit-source-id: 39a49fc60e710cc32a463858c9cee57c182330e2
2018-12-05 09:53:39 -08:00
Tongzhou Wang
8a35aafca6 Try to fix randomness.rst formatting again
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12853

Differential Revision: D10458439

Pulled By: SsnL

fbshipit-source-id: ebd259e598327b0c5d63de6b7c182781fe361fbd
2018-10-18 19:18:49 -07:00
Tongzhou Wang
a85174b46a Fix randomness.rst formatting
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12850

Differential Revision: D10457694

Pulled By: SsnL

fbshipit-source-id: fa64964ff6d41625d9383ca96393017230e4ee0f
2018-10-18 18:26:26 -07:00
Thomas Viehmann
0521c47c91 Amend nondeterminism notes (#12217)
Summary:
include atomicAdd commentary as this is less well known

There is some discussion in #12207

Unfortunately, I cannot seem to get the ..include working in `_tensor_docs.py` and `_torch_docs.py`. I could use a hint for that.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12217

Differential Revision: D10419739

Pulled By: SsnL

fbshipit-source-id: eecd04fb7486bd9c6ee64cd34859d61a0a97ec4e
2018-10-16 23:59:26 -07:00
cclauss
b0248df72a Docs: Change cuda(async) —> cuda(non_blocking) (#12158)
Summary:
goldsborough Modify the docs to match the changes made in #4999
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12158

Differential Revision: D10103964

Pulled By: SsnL

fbshipit-source-id: 1b8692da86aca1a52e8d2e6cea76a5ad1f71e058
2018-09-28 08:39:27 -07:00
Tongzhou Wang
c30790797f Minor data loader doc improvements
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11821

Differential Revision: D9948292

Pulled By: SsnL

fbshipit-source-id: 01c21c129423c0f7844b403e665a8fe021a9c820
2018-09-19 15:33:25 -07:00
Rasmus Diederichsen
6fc18a7541 Typo fix in randomness.rst (#11571)
Summary:
"need to be" -> "need not be"
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11571

Differential Revision: D9786001

Pulled By: soumith

fbshipit-source-id: 7cc408f5c8bfcc56d4b5c153646f30e1cec37539
2018-09-12 08:25:46 -07:00
Rasmus Diederichsen
8aa8ad8b01 WIP: Reproducibility note (#11329)
Summary:
This adds a Note on making experiments reproducible.

It also adds Instructions for building the Documentation to `README.md`. Please ping if I missed any requirements.

I'm not sure what to do about the submodule changes. Please advise.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11329

Differential Revision: D9784939

Pulled By: ezyang

fbshipit-source-id: 5c5acbe343d1fffb15bdcb84c6d8d925c2ffcc5e
2018-09-11 21:09:51 -07:00
Thomas Viehmann
3799b10c44 various documentation formatting (#9359)
Summary:
This is a grab-bag of documentation formatting fixes.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9359

Differential Revision: D8831400

Pulled By: soumith

fbshipit-source-id: 8dac02303168b2ea365e23938ee528d8e8c9f9b7
2018-07-13 02:48:25 -07:00
Tongzhou Wang
e8536c08a1 Update extension docs, fix Fold/Unfold docs (#9239)
Summary:
Commits:
1. In the extension doc, get rid of all references to `Variable`s (closes #6947)
    + also add minor improvements
    + also add a section with links to cpp extension :) goldsborough
    + remove mentions of `autograd.Function.requires_grad`, as it's not used anywhere and is hardcoded to `return_Py_True`.
2. Fix several sphinx warnings
3. Change `*` in equations in `module/conv.py` to `\times`
4. Fix docs for `Fold` and `Unfold`.
    + Add a better shape check for `Fold` (it could previously give bogus results when there were not enough blocks). Add a test for the checks.
5. Fix doc saying `trtrs` is not available for CUDA (#9247)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9239

Reviewed By: soumith

Differential Revision: D8762492

Pulled By: SsnL

fbshipit-source-id: 13cd91128981a94493d5efdf250c40465f84346a
2018-07-08 19:09:39 -07:00
Richard Zou
b4cd9f2fc9 Clarify mp note about sharing a tensor's grad field. (#8688)
* Clarify mp note about sharing a tensor's grad field.

* Address comments

* Address comments
2018-06-20 14:22:38 -04:00
Kaiyu Shi
0169ac5936 Fix sample code for cuda stream (#8319)
2018-06-10 11:41:50 -04:00
Tongzhou Wang
9af3a80cff Docs for gradcheck and gradgradcheck; expose gradgradcheck (#8166)
* Docs for gradcheck and gradgradcheck; expose gradgradcheck

* address comments
2018-06-06 13:59:55 -04:00
peterjc123
108fb1c2c9 Fix the import part of the windows doc (#7979)
2018-05-30 21:51:30 -04:00
peterjc123
267fc43a96 Fix Windows doc for import error (#7704)
* Fix Windows doc for import error

* Fix doc again

* Fix wrong format
2018-05-29 22:07:00 +01:00
braincodercn
5ee5537b98 Fix typo in document (#7725)
2018-05-21 11:10:24 -04:00
Richard Zou
0430bfe40b [docs] Update broadcasting and cuda semantics notes (#6904)
* [docs] Update broadcasting and cuda semantics notes

* Update multiprocessing.rst

* address comments

* Address comments
2018-04-24 13:41:24 -04:00
peterjc123
a4dbd37403 [doc] Minor fixes for Windows docs (#6853)
2018-04-23 13:15:33 +02:00
peterjc123
56567fe47d Add documents for Windows (#6653)
* Add Windows doc

* some minor fixes

* Fix typo

* more minor fixes

* Fixes on dataloader
2018-04-22 15:18:02 -04:00
Richard Zou
2acc247517 [docs] Update autograd notes (#6769)
2018-04-19 13:34:14 -04:00
Tongzhou Wang
6b7ec95abb Link relevant FAQ section in DataLoader docs (#6476)
* Link FAQ section on workers returning same random numbers in DataLoader docs

* explicitly mention section names
2018-04-11 13:41:46 -04:00
Tongzhou Wang
4d15442ebc Add total_length option to pad_packed_sequence (#6327)
* add total_length to pad_packed_sequence; add example on how to use pack->rnn->unpack with DP

* address comments

* fix typo
2018-04-08 20:25:48 -04:00
Kento NOZAWA
c00ee6da8f Fix typos (#6348)
* Fix typo

* Fix typo

* Update faq.rst
2018-04-06 11:06:42 -04:00
Kaiyu Shi
605307f8f3 Add support for printing extra information in Module and refactor redundant codes (#5936)
This PR enables users to print extra information about their subclassed nn.Module.
For now I simply insert the user-defined string at the end of the module name, which should be discussed in this PR.

Before this PR, users had to redefine __repr__ and copy and paste the source code from Module.
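
The idea can be sketched without PyTorch as a small hook pattern: the base class owns the `__repr__` format and asks subclasses only for their extra string. The `Module` and `Linear` classes below are simplified stand-ins written for illustration, not the actual nn.Module code, and the exact output format is an assumption.

```python
class Module:
    """Simplified stand-in for nn.Module; not PyTorch's implementation."""

    def extra_repr(self):
        # Hook: subclasses return a one-line description of their settings.
        return ""

    def __repr__(self):
        # The base class owns the overall format, so subclasses never copy it.
        return f"{type(self).__name__}({self.extra_repr()})"


class Linear(Module):
    def __init__(self, in_features, out_features):
        self.in_features = in_features
        self.out_features = out_features

    def extra_repr(self):
        return f"in_features={self.in_features}, out_features={self.out_features}"


print(Linear(3, 4))  # prints: Linear(in_features=3, out_features=4)
```

A subclass that does not override `extra_repr` still prints sensibly, because the default hook returns an empty string.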

* Add support for extra information on Module

* Rewrite the repr method of Module

* Fix flake8

* Change the __repr__ to get_extra_repr in Linear

* Fix extra new-line for empty line

* Add test for __repr__ method

* Fix bug of block string indent

* Add indent for multi-line repr test.

* Address review comments

* Update tutorial for creating nn.Module

* Fix flake8, add extra_repr of bilinear

* Refactor DropoutNd

* Change to extra_repr in some Modules

* Fix flake8

* Refactor padding modules

* Refactor pooling module

* Fix typo

* Change to extra_repr

* Fix bug for GroupNorm

* Fix bug for LayerNorm
2018-04-02 13:52:33 -04:00
Peter Goldsborough
47f31cb1e6 Update FAQ to make more sense after tensor/variable merge (#6017)
2018-03-27 07:48:25 -07:00