Commit Graph

235 Commits

Jithun Nair
730ce29baf Add note on ifdefing based on CUDA_VERSION for ROCm path (#62850)
Summary:
CUDA_VERSION and HIP_VERSION follow entirely unrelated versioning schemes, so it does not make sense to use CUDA_VERSION to gate the ROCm code path. This note addresses that explicitly.
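
For illustration, the Python-level analogue of this guidance (the note itself concerns C++ preprocessor guards; `torch.version.hip` reports the HIP version on ROCm builds and is `None` otherwise):
```python
import torch

if torch.version.hip is not None:
    print("ROCm build, HIP version:", torch.version.hip)
elif torch.version.cuda is not None:
    print("CUDA build, CUDA version:", torch.version.cuda)
else:
    print("CPU-only build")
```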

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62850

Reviewed By: mruberry

Differential Revision: D30547562

Pulled By: malfet

fbshipit-source-id: 02990fa66a88466c2330ab85f446b25b78545150
2021-08-25 15:02:03 -07:00
Victor Quach
b95ce1591d Add docs describing saved tensor hooks (#62362)
Summary:
Add a section to the Autograd mechanics docs describing the recently
exposed saved tensors (https://github.com/pytorch/pytorch/issues/52451), how to register packing / unpacking
hooks (https://github.com/pytorch/pytorch/issues/60975), and how to use default hooks (https://github.com/pytorch/pytorch/issues/61834).
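
For illustration, a minimal sketch of the pack/unpack hook API this section documents (`torch.autograd.graph.saved_tensors_hooks`):
```python
import torch

def pack(x):
    # Called when autograd saves a tensor for backward; may return anything.
    return x.clone()

def unpack(packed):
    # Called when backward needs the tensor back; must invert pack().
    return packed

a = torch.randn(3, requires_grad=True)
with torch.autograd.graph.saved_tensors_hooks(pack, unpack):
    b = (a * a).sum()  # a is saved for backward through the hooks
b.backward()
```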

Sister PR: https://github.com/pytorch/pytorch/issues/62361 (will add a link from autograd.rst to notes/autograd in whatever PR does not land first)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62362

Reviewed By: soulitzer

Differential Revision: D30453177

Pulled By: Varal7

fbshipit-source-id: f5759977b069ff0ef36a47b08856d297691a6caa
2021-08-20 11:10:51 -07:00
soulitzer
2f615f6313 Improve custom function docs (#60312)
Summary:
- Adds code examples for the `ctx` methods and makes the requirements on their arguments clearer (see the sketch after this list)
- Adds type annotations for `save_for_backward`, `mark_dirty`, `mark_non_differentiable`, and `set_materialize_grads` (BC-breaking?)
- Refactors the `torch.autograd.Function` doc
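
A minimal sketch of a custom Function exercising these `ctx` methods (illustrative, not the doc's own example):
```python
import torch

class Square(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.set_materialize_grads(True)  # the default: grads arrive as real tensors
        ctx.save_for_backward(x)         # stash tensors needed in backward
        return x * x

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        return grad_output * 2 * x

x = torch.randn(4, requires_grad=True)
Square.apply(x).sum().backward()
print(x.grad)  # equals 2 * x
```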

Pull Request resolved: https://github.com/pytorch/pytorch/pull/60312

Reviewed By: VitalyFedyunin

Differential Revision: D30314961

Pulled By: soulitzer

fbshipit-source-id: a284314b65662e26390417bd2b6b12cd85e68dc8
2021-08-18 11:31:31 -07:00
kyshel
e75ed4a4b5 add comma to prevent syntax errors (#62492)
Summary:
Fixes #{issue number}

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62492

Reviewed By: VitalyFedyunin

Differential Revision: D30304684

Pulled By: ezyang

fbshipit-source-id: db08ca39bcecbfd79ea50df18536bf4e87f51e15
2021-08-16 12:27:31 -07:00
cpatru
6d896cb545 Update faq.rst so OOM section mentions checkpoint (#62709)
Summary:
This FAQ has a section on CUDA OOMs that lists a lot of don'ts, which limits modeling options. Deep nets can blow up memory during training because intermediate outputs are cached for the backward pass.
It's a known problem with a known solution: trade compute for memory via checkpointing.
The FAQ should mention it.
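
A minimal sketch of that trade-off with `torch.utils.checkpoint` (newer PyTorch may also want an explicit `use_reentrant` argument):
```python
import torch
from torch.utils.checkpoint import checkpoint

block = torch.nn.Sequential(torch.nn.Linear(512, 512), torch.nn.ReLU())
x = torch.randn(8, 512, requires_grad=True)

# Activations inside `block` are not kept alive; they are recomputed
# during backward, trading extra compute for lower peak memory.
y = checkpoint(block, x)
y.sum().backward()
```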

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62709

Reviewed By: nairbv

Differential Revision: D30103326

Pulled By: ezyang

fbshipit-source-id: 3a8b465a7fbe19aae88f83cc50fe82ebafcb56c9
2021-08-05 07:40:08 -07:00
Victor Quach
5830f122f1 Add docstrings for save_on_cpu hooks (#62410)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62410

This PR adds docstrings for CPU hooks introduced in #61928.

Also uncomments the warning about pinned memory in CUDA semantics docs.

Depends on: #62361.

For now the docstrings are an orphan page at https://docs-preview.pytorch.org/62410/generated/torch.autograd.graph.set_save_on_cpu_hooks.html#torch-autograd-graph-set-save-on-cpu-hooks
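
In released PyTorch the hook surfaced as the `torch.autograd.graph.save_on_cpu` context manager; a minimal sketch:
```python
import torch

a = torch.randn(5, requires_grad=True)
# Tensors saved for backward are offloaded to CPU during forward and
# copied back (optionally through pinned memory) when backward needs them.
with torch.autograd.graph.save_on_cpu(pin_memory=False):
    b = (a ** 2).sum()
b.backward()
```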

Test Plan: Imported from OSS

Reviewed By: soulitzer

Differential Revision: D29990129

Pulled By: Varal7

fbshipit-source-id: 7a98eeee6a0abb11e2c2d9169cd1aa35ad7ba3f4
2021-08-03 17:53:45 -07:00
Michael Dagitses
58df01c3b8 clarify default value of requires_grad for tensors (#61038)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/61038
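
The documented default, in brief:
```python
import torch

x = torch.ones(3)
assert x.requires_grad is False        # the default for new tensors
y = torch.ones(3, requires_grad=True)  # opting in explicitly
```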

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D29491984

Pulled By: dagitses

fbshipit-source-id: 7e6b7f8e81d77f38c881b86a68c17d3cf5483dad
2021-07-12 12:57:37 -07:00
Jithun Nair
336970c03e Add note on torch.distributed backends on ROCm (#58975)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/58975

Reviewed By: soulitzer

Differential Revision: D29595510

Pulled By: rohan-varma

fbshipit-source-id: 384bb67fcd003d65b76e957a474406b2a38099b9
2021-07-10 03:51:19 -07:00
Michael Carilli
2fa6c7627e [CUDA graphs][BC-breaking] Removes post-backward syncs on default stream (#60421)
Summary:
Before https://github.com/pytorch/pytorch/pull/57833, calls to backward() or grad() synced only the calling thread's default stream with autograd leaf streams at the end of backward. This made the following weird pattern safe:
```python
with torch.cuda.stream(s):
    # imagine forward used many streams, so backward leaf nodes may run on many streams
    loss.backward()
# no sync
optimizer.step()  # "use grads"
```

but a more benign-looking pattern was unsafe:
```python
with torch.cuda.stream(s):
    # imagine forward used a lot of streams, so backward leaf nodes may run on many streams
    loss.backward()
    # backward() syncs the default stream with all the leaf streams, but does not sync s with anything,
    # so counterintuitively (even though we're in the same stream context as backward()!)
    # it is NOT SAFE to use grads here, and there's no easy way to make it safe,
    # unless you manually sync on all the streams you used in forward,
    # or move "use grads" back to default stream outside the context.
    optimizer.step()  # "use grads"
```
mruberry, ngimel, and I decided backward() should have the [same user-facing stream semantics as any cuda op](https://pytorch.org/docs/master/notes/cuda.html#stream-semantics-of-backward-passes).** In other words, the weird pattern should be unsafe, and the benign-looking pattern should be safe. Implementation-wise, this meant backward() should sync its calling thread's current stream, not the default stream, with the leaf streams.

After https://github.com/pytorch/pytorch/pull/57833, backward syncs the calling thread's current stream AND default stream with all leaf streams at the end of backward. The default stream syncs were retained for temporary backward compatibility.

This PR finishes https://github.com/pytorch/pytorch/pull/57833's work by deleting syncs on the default stream.

With this PR, graph-capturing an entire backward() call should be possible (see the [test_graph_grad_scaling diffs](https://github.com/pytorch/pytorch/compare/master...mcarilli:streaming_backwards_remove_default_syncs?expand=1#diff-893b1eea27352f336f4cd832919e48d721e4e90186e63400b8596db6b82e7450R3641-R3642)).

** The first paragraph has a formatting error, which this PR should also fix.
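
Under the new semantics the benign-looking pattern is safe; a runnable sketch (assumes a CUDA device; `model` and `s` are illustrative):
```python
import torch

model = torch.nn.Linear(8, 8).cuda()
s = torch.cuda.Stream()
with torch.cuda.stream(s):
    loss = model(torch.randn(4, 8, device="cuda")).sum()
    loss.backward()
    grad = model.weight.grad  # safe: backward synced stream s with the leaf streams
# Before consuming the grads on another stream, sync as for any pair of ops:
torch.cuda.current_stream().wait_stream(s)
```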

Pull Request resolved: https://github.com/pytorch/pytorch/pull/60421

Reviewed By: albanD

Differential Revision: D29370344

Pulled By: ngimel

fbshipit-source-id: 3248bc5fb92fc517db0c15c897e5d7250f67d7fe
2021-06-24 17:34:02 -07:00
Luca Wehrstedt
bb9e1150ea Revert D29342234: [pytorch][PR] [CUDA graphs][BC-breaking] Removes post-backward syncs on default stream
Test Plan: revert-hammer

Differential Revision:
D29342234 (675cea1adb)

Original commit changeset: 98e6be7fdd85

fbshipit-source-id: 84022973248b2254210eee57402df2c4f4bc43c6
2021-06-24 04:49:28 -07:00
Michael Carilli
675cea1adb [CUDA graphs][BC-breaking] Removes post-backward syncs on default stream (#60421)
Summary:
Before https://github.com/pytorch/pytorch/pull/57833, calls to backward() or grad() synced only the calling thread's default stream with autograd leaf streams at the end of backward. This made the following weird pattern safe:
```python
with torch.cuda.stream(s):
    # imagine forward used many streams, so backward leaf nodes may run on many streams
    loss.backward()
# no sync
optimizer.step()  # "use grads"
```

but a more benign-looking pattern was unsafe:
```python
with torch.cuda.stream(s):
    # imagine forward used a lot of streams, so backward leaf nodes may run on many streams
    loss.backward()
    # backward() syncs the default stream with all the leaf streams, but does not sync s with anything,
    # so counterintuitively (even though we're in the same stream context as backward()!)
    # it is NOT SAFE to use grads here, and there's no easy way to make it safe,
    # unless you manually sync on all the streams you used in forward,
    # or move "use grads" back to default stream outside the context.
    optimizer.step()  # "use grads"
```
mruberry, ngimel, and I decided backward() should have the [same user-facing stream semantics as any cuda op](https://pytorch.org/docs/master/notes/cuda.html#stream-semantics-of-backward-passes).** In other words, the weird pattern should be unsafe, and the benign-looking pattern should be safe. Implementation-wise, this meant backward() should sync its calling thread's current stream, not the default stream, with the leaf streams.

After https://github.com/pytorch/pytorch/pull/57833, backward syncs the calling thread's current stream AND default stream with all leaf streams at the end of backward. The default stream syncs were retained for temporary backward compatibility.

This PR finishes https://github.com/pytorch/pytorch/pull/57833's work by deleting syncs on the default stream.

With this PR, graph-capturing an entire backward() call should be possible (see the [test_graph_grad_scaling diffs](https://github.com/pytorch/pytorch/compare/master...mcarilli:streaming_backwards_remove_default_syncs?expand=1#diff-893b1eea27352f336f4cd832919e48d721e4e90186e63400b8596db6b82e7450R3641-R3642)).

** The first paragraph has a formatting error, which this PR should also fix.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/60421

Reviewed By: VitalyFedyunin, albanD

Differential Revision: D29342234

Pulled By: ngimel

fbshipit-source-id: 98e6be7fdd8550872f0a78f9a66cb8dfe75abf63
2021-06-23 23:35:24 -07:00
Michael Wootton
2f3be2735f Don't split oversize cached blocks (#44742)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/35901

This change is designed to prevent fragmentation in the Caching Allocator. Permissive block splitting in the allocator allows very large blocks to be split into many pieces. Once a block is split too finely, it is unlikely that all the pieces will be free at the same time, so the original allocation can never be returned. Anecdotally, we've seen a model run out of memory failing to allocate a 50 MB block on a 32 GB card while the caching allocator was holding 13 GB of 'split free blocks'.

Approach:

- Large blocks above a certain size are designated "oversize". This limit is currently set one order of magnitude above the "large" threshold, at 200 MB
- Oversize blocks cannot be split
- Oversize blocks must closely match the requested size (e.g. a 200 MB request will match an existing 205 MB block, but not a 300 MB block)
- In lieu of splitting oversize blocks, there is a mechanism to quickly free a single oversize block (back to the system allocator) so that an appropriately sized block can be allocated. This activates under memory pressure and prevents `release_cached_blocks()` from triggering

Initial performance tests show this is similar to or quicker than the original strategy. Additional tests are ongoing.
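
For reference, the landed form of this work is, to my knowledge, exposed through the caching allocator's config variable; the value below is purely illustrative:
```python
import os

# Must be set before the process first touches CUDA.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:200"

import torch  # the allocator reads the config on first CUDA use
```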

Pull Request resolved: https://github.com/pytorch/pytorch/pull/44742

Reviewed By: zou3519

Differential Revision: D29186394

Pulled By: ezyang

fbshipit-source-id: c88918836db3f51df59de6d1b3e03602ebe306a9
2021-06-21 11:46:08 -07:00
Michael Carilli
be038d8989 [CUDA graphs] Make stream semantics of backward calls consistent with other cuda ops (ci-all edition) (#57833)
Summary:
ci-all resubmit of https://github.com/pytorch/pytorch/pull/54227.

Tests look good except for a few distributed autograd failures (pytorch_linux_xenial_cuda10_2_cudnn7_py3_multigpu_test) and rocm failures (pr/pytorch-linux-bionic-rocm4.1-py3.6).

The common denominator in the ROCm failures appears to be multi-GPU activity: some [multiprocess DDP failures](https://ci.pytorch.org/jenkins/job/pytorch-builds/job/pytorch-linux-bionic-rocm4.1-py3.6-test1/8115/console), some [single-process failures](https://ci.pytorch.org/jenkins/job/pytorch-builds/job/pytorch-linux-bionic-rocm4.1-py3.6-test2/8115/console) where the single process has autograd ops that span devices. jeffdaily, jithunnair-amd, sunway513: could one of you take a look? The streaming backward change should also be beneficial to ROCm, I expect.

For debugging rocm failures, I think we should ignore the multiprocess/DDP tests and focus on the single process cases. The root cause is probably the same and the single process cases are simpler.

----------------------------------

Update: ROCm failures are due to https://github.com/pytorch/pytorch/issues/59750.
2718a54032 is a workaround, to be updated once https://github.com/pytorch/pytorch/issues/59750 is fixed.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/57833

Reviewed By: mruberry

Differential Revision: D28942391

Pulled By: ngimel

fbshipit-source-id: d6047e971c5f1c6386334bf3641402a92f12e2f8
2021-06-13 12:09:56 -07:00
Jeffrey Wan
a7a5992d7d Add no-grad inference mode note (#58513)
Summary:
Adds a note explaining the difference between several often-conflated mechanisms (no-grad mode, inference mode, etc.) to the autograd notes.
Also adds links to this note from the docs in `grad_mode` and `nn.module`.
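
The mechanisms in question, side by side (a minimal sketch):
```python
import torch

x = torch.ones(3, requires_grad=True)

with torch.no_grad():
    y = x * 2   # grad not tracked, but y is an ordinary tensor
assert not y.requires_grad

with torch.inference_mode():
    z = x * 2   # stronger: z is an inference tensor, unusable in autograd
```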

Pull Request resolved: https://github.com/pytorch/pytorch/pull/58513

Reviewed By: gchanan

Differential Revision: D28651129

Pulled By: soulitzer

fbshipit-source-id: af9eb1749b641fc1b632815634eea36bf7979156
2021-05-25 13:06:54 -07:00
Jithun Nair
ab6b5fa036 Add HIP (ROCm) semantics doc (#57871)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/57871

Reviewed By: agolynski

Differential Revision: D28385510

Pulled By: malfet

fbshipit-source-id: 9cf69e52d026a1cf74cc12d8727ca17ae026235e
2021-05-12 12:34:07 -07:00
albanD
d16ed1ee8a Add first draft of gradcheck note (#55966)
Summary:
You can find the latest rendered version in the `python_doc_build` CI job, in the artifacts tab of that build on CircleCI.
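
The API the note covers, in its simplest form:
```python
import torch
from torch.autograd import gradcheck

# gradcheck compares analytical autograd gradients against finite
# differences; double-precision inputs are strongly recommended.
inp = torch.randn(3, dtype=torch.double, requires_grad=True)
assert gradcheck(torch.sin, (inp,), eps=1e-6, atol=1e-4)
```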

Pull Request resolved: https://github.com/pytorch/pytorch/pull/55966

Reviewed By: H-Huang

Differential Revision: D28032446

Pulled By: albanD

fbshipit-source-id: 227ad37b03d39894d736c19cae3195b4d56fc62f
2021-04-27 14:33:42 -07:00
Ilqar Ramazanli
70d9be0f42 Replace duplicative s with alpha (#56804)
Summary:
A document is always easier to read when different objects and concepts are denoted by different variables and representations.
This PR makes sure that, in the [complex autograd](https://pytorch.org/docs/master/notes/autograd.html#autograd-for-complex-numbers) documentation, the variables for the output and the step size no longer coincide.
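
Schematically (an assumed before/after; the exact formula lives in the linked note):
```latex
% before: s denotes both the iterate and the step size
s_{n+1} = s_n - s \cdot \nabla L(s_n)
% after: the step size gets its own symbol \alpha
s_{n+1} = s_n - \alpha \cdot \nabla L(s_n)
```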

Fixes https://github.com/pytorch/pytorch/issues/53633

Pull Request resolved: https://github.com/pytorch/pytorch/pull/56804

Reviewed By: anjali411

Differential Revision: D27989959

Pulled By: iramazanli

fbshipit-source-id: c271590ee744c8aeeff62bfaa2295429765ef64e
2021-04-25 16:27:09 -07:00
Erjia Guan
8cf85a1152 [DataLoader][doc] Randomness for base_seed generator and NumPy seed (#56528)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56528

Searched across internal and external usage of DataLoader; people haven't started to use `generator` with `DataLoader` yet.
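
Roughly the recipe the doc describes (a sketch: derive per-worker NumPy seeds from the torch base seed and pass an explicit `generator`):
```python
import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset

def seed_worker(worker_id):
    # In a worker, torch.initial_seed() is base_seed + worker_id.
    worker_seed = torch.initial_seed() % 2**32
    np.random.seed(worker_seed)

g = torch.Generator()
g.manual_seed(0)

dataset = TensorDataset(torch.arange(10))
loader = DataLoader(dataset, batch_size=2, num_workers=2,
                    worker_init_fn=seed_worker, generator=g)
```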

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D27908487

Pulled By: ejguan

fbshipit-source-id: 14c83ed40d4ba4dc988b121968a78c2732d8eb93
2021-04-22 09:40:45 -07:00
Natalia Gimelshein
f94c95a2dd Revert D23752058: [pytorch][PR] Don't split oversize cached blocks
Test Plan: revert-hammer

Differential Revision:
D23752058 (67dcd62310)

Original commit changeset: ccb7c13e3cf8

fbshipit-source-id: 12ae9702135ea510e9714ed97fb75ca3b9f97c27
2021-04-14 09:24:08 -07:00
Michael Wootton
67dcd62310 Don't split oversize cached blocks (#44742)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/35901

This change is designed to prevent fragmentation in the Caching Allocator. Permissive block splitting in the allocator allows very large blocks to be split into many pieces. Once a block is split too finely, it is unlikely that all the pieces will be free at the same time, so the original allocation can never be returned. Anecdotally, we've seen a model run out of memory failing to allocate a 50 MB block on a 32 GB card while the caching allocator was holding 13 GB of 'split free blocks'.

Approach:

- Large blocks above a certain size are designated "oversize". This limit is currently set one order of magnitude above the "large" threshold, at 200 MB
- Oversize blocks cannot be split
- Oversize blocks must closely match the requested size (e.g. a 200 MB request will match an existing 205 MB block, but not a 300 MB block)
- In lieu of splitting oversize blocks, there is a mechanism to quickly free a single oversize block (back to the system allocator) so that an appropriately sized block can be allocated. This activates under memory pressure and prevents `release_cached_blocks()` from triggering

Initial performance tests show this is similar to or quicker than the original strategy. Additional tests are ongoing.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/44742

Reviewed By: ngimel

Differential Revision: D23752058

Pulled By: ezyang

fbshipit-source-id: ccb7c13e3cf8ef2707706726ac9aaac3a5e3d5c8
2021-04-14 03:04:41 -07:00
Yukio Siraichi
93bf0ae6fc Remove legacy constructor calls from pytorch codebase. (#54142)
Summary:
Follow up from https://github.com/pytorch/pytorch/issues/53889
Related to https://github.com/pytorch/pytorch/issues/47112

Removing every occurrence of the legacy constructor call (illustrated after the list below) present in PyTorch in:
- _docs_
- _benchmarks_
- _test_
- _caffe2_
- _CONTRIBUTING.md_
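
The kind of rewrite applied throughout, for illustration (assuming the legacy form is the `torch.Tensor(...)` type constructor):
```python
import torch

bad = torch.Tensor([1, 2, 3])          # legacy type constructor
good = torch.tensor([1.0, 2.0, 3.0])   # preferred factory function
empty = torch.empty(3)                 # for uninitialized storage
```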

Pull Request resolved: https://github.com/pytorch/pytorch/pull/54142

Reviewed By: ngimel

Differential Revision: D27699450

Pulled By: mruberry

fbshipit-source-id: 530aa3f5746cc8bc1407d5d51b2bbd8075e30546
2021-04-11 15:45:17 -07:00
James Reed
ec38dda1cc Remove extra close bracket in extending.rst (#55409)
Summary:
Small typo

Pull Request resolved: https://github.com/pytorch/pytorch/pull/55409

Reviewed By: pbelevich

Differential Revision: D27611177

Pulled By: jamesr66a

fbshipit-source-id: 8a5ff702e4ab8a7eb2403432889f8b7a5a69484b
2021-04-07 21:15:46 -07:00
Peter Bell
8ac0619784 Avoid infinite recursion in __torch_function__ example (#55391)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/55284

This gets the example to run, though it probably doesn't help its readability.

Thoughts?
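
For context, a hedged sketch of one delegation pattern that avoids the recursion (not necessarily the exact fix this PR chose):
```python
import torch

class LoggingTensor(torch.Tensor):
    @classmethod
    def __torch_function__(cls, func, types, args=(), kwargs=None):
        kwargs = kwargs or {}
        print(f"intercepted {func.__name__}")
        # Re-dispatch with __torch_function__ disabled so the call does
        # not land back here and recurse forever.
        with torch._C.DisableTorchFunction():
            return func(*args, **kwargs)

t = torch.ones(2).as_subclass(LoggingTensor)
t + 1  # prints once instead of recursing
```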

Pull Request resolved: https://github.com/pytorch/pytorch/pull/55391

Reviewed By: mrshenli

Differential Revision: D27621096

Pulled By: ezyang

fbshipit-source-id: d02c4fb0001e54139a167b477fd3b4a229e4dc8c
2021-04-07 20:31:46 -07:00
James Reed
c96f076248 Fix typo in extending.rst (#55408)
Summary:
Small typo in docs

Pull Request resolved: https://github.com/pytorch/pytorch/pull/55408

Reviewed By: pbelevich

Differential Revision: D27611175

Pulled By: jamesr66a

fbshipit-source-id: a83a6220054c0411329792c7ac6afceb2b699f44
2021-04-07 03:46:01 -07:00
Sam Estep
5bcbbf5373 Lint trailing newlines (#54737)
Summary:
*Context:* https://github.com/pytorch/pytorch/issues/53406 added a lint for trailing whitespace at the ends of lines. However, in order to pass FB-internal lints, that PR also had to normalize the trailing newlines in four of the files it touched. This PR adds an OSS lint to normalize trailing newlines.

The changes to the following files (made in 54847d0adb9be71be4979cead3d9d4c02160e4cd) are the only manually-written parts of this PR:

- `.github/workflows/lint.yml`
- `mypy-strict.ini`
- `tools/README.md`
- `tools/test/test_trailing_newlines.py`
- `tools/trailing_newlines.py`

I would have liked to make this just a shell one-liner like the other three similar lints, but nothing I could find quite fit the bill (a sketch of the check the Python tool performs follows the list below). Specifically, all the answers I tried from the following Stack Overflow questions were far too slow (at least a minute and a half to run on this entire repository):

- [How to detect file ends in newline?](https://stackoverflow.com/q/38746)
- [How do I find files that do not end with a newline/linefeed?](https://stackoverflow.com/q/4631068)
- [How to list all files in the Git index without newline at end of file](https://stackoverflow.com/q/27624800)
- [Linux - check if there is an empty line at the end of a file [duplicate]](https://stackoverflow.com/q/34943632)
- [git ensure newline at end of each file](https://stackoverflow.com/q/57770972)
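
Roughly the check the Python tool performs (an assumed sketch, not the file's actual source):
```python
import sys

def ends_with_single_newline(path: str) -> bool:
    # A well-formed text file ends with exactly one '\n'.
    with open(path, "rb") as f:
        data = f.read()
    return data == b"" or (data.endswith(b"\n") and not data.endswith(b"\n\n"))

if __name__ == "__main__":
    bad = [p for p in sys.argv[1:] if not ends_with_single_newline(p)]
    for p in bad:
        print(p)
    sys.exit(1 if bad else 0)
```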

To avoid giving false positives during the few days after this PR is merged, we should probably only merge it after https://github.com/pytorch/pytorch/issues/54967.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/54737

Test Plan:
Running the shell script from the "Ensure correct trailing newlines" step in the `quick-checks` job of `.github/workflows/lint.yml` should print no output and exit in a fraction of a second with a status of 0. That was not the case prior to this PR, as shown by this failing GHA workflow run on an earlier draft of this PR:

- https://github.com/pytorch/pytorch/runs/2197446987?check_suite_focus=true

In contrast, this run (after correcting the trailing newlines in this PR) succeeded:

- https://github.com/pytorch/pytorch/pull/54737/checks?check_run_id=2197553241

To unit-test `tools/trailing_newlines.py` itself (this is run as part of our "Test tools" GitHub Actions workflow):
```
python tools/test/test_trailing_newlines.py
```

Reviewed By: malfet

Differential Revision: D27409736

Pulled By: samestep

fbshipit-source-id: 46f565227046b39f68349bbd5633105b2d2e9b19
2021-03-30 13:09:52 -07:00
Jeff Yang
475251631b docs: reference links to serialization.html (#54659)
Summary:
fixes https://github.com/pytorch/pytorch/issues/54311
https://11811979-65600975-gh.circle-artifacts.com/0/docs/generated/torch.save.html

Pull Request resolved: https://github.com/pytorch/pytorch/pull/54659

Reviewed By: ailzhang

Differential Revision: D27328281

Pulled By: zou3519

fbshipit-source-id: b88d02e5407238a338d537d013a297ae9cdf922b
2021-03-29 10:15:07 -07:00
Eric Jang
c2ccb3578e Fix inport -> import typo in documentation (#53589)
Summary:
Fixes a small documentation typo

Pull Request resolved: https://github.com/pytorch/pytorch/pull/53589

Reviewed By: ngimel

Differential Revision: D26907045

Pulled By: Chillee

fbshipit-source-id: 15c35bec8d75dd897fe8886d0e0e1b889df65b24
2021-03-08 23:56:42 -08:00
Sam Estep
8c798e0622 Forbid trailing whitespace (#53406)
Summary:
Context: https://github.com/pytorch/pytorch/pull/53299#discussion_r587882857

These are the only hand-written parts of this diff:
- the addition to `.github/workflows/lint.yml`
- the file endings changed in these four files (to appease FB-internal land-blocking lints):
  - `GLOSSARY.md`
  - `aten/src/ATen/core/op_registration/README.md`
  - `scripts/README.md`
  - `torch/csrc/jit/codegen/fuser/README.md`

The rest was generated by running this command (on macOS):
```
git grep -I -l ' $' -- . ':(exclude)**/contrib/**' ':(exclude)third_party' | xargs gsed -i 's/ *$//'
```

I looked over the auto-generated changes and didn't see anything that looked problematic.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/53406

Test Plan:
This run (after adding the lint but before removing existing trailing spaces) failed:
- https://github.com/pytorch/pytorch/runs/2043032377

This run (on the tip of this PR) succeeded:
- https://github.com/pytorch/pytorch/runs/2043296348

Reviewed By: walterddr, seemethere

Differential Revision: D26856620

Pulled By: samestep

fbshipit-source-id: 3f0de7f7c2e4b0f1c089eac9b5085a58dd7e0d97
2021-03-05 17:22:55 -08:00
peter
8870c391e9 Update mkl to 2020.2.254 (#52964)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/52907

Pull Request resolved: https://github.com/pytorch/pytorch/pull/52964

Reviewed By: H-Huang

Differential Revision: D26726464

Pulled By: seemethere

fbshipit-source-id: 8f3067292e6416e299b4b040c8fb73510134f02e
2021-03-01 11:13:57 -08:00
Joel Schlosser
a0137808a7 Note on Modules for 1.8 docs (#51536)
Summary:
A new note on Modules for the 1.8 documentation.

Rendered form can be seen here: https://alband.github.io/doc_view/notes/modules.html
(thanks Alban!)
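
The kind of minimal module the note builds from (illustrative):
```python
import torch
from torch import nn

class MyModule(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 2)  # submodule; parameters auto-registered

    def forward(self, x):
        return self.linear(x)

m = MyModule()
out = m(torch.randn(3, 4))  # __call__ runs forward (plus any hooks)
print([name for name, _ in m.named_parameters()])
```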

Pull Request resolved: https://github.com/pytorch/pytorch/pull/51536

Reviewed By: albanD

Differential Revision: D26254282

Pulled By: jbschlosser

fbshipit-source-id: 09cbd46aa268a29b6f54fd48ffe1d6b98db0ff31
2021-02-04 11:28:11 -08:00
anjali411
34d4d79966 Autograd doc note fix (#51661)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/51661

Test Plan: Imported from OSS

Reviewed By: ezyang

Differential Revision: D26230912

Pulled By: anjali411

fbshipit-source-id: 94323d7bce631a4c5781020e9650495461119ede
2021-02-03 15:08:35 -08:00
Mike Ruberry
40c0fffb4b Fixes docs (#51439)
Summary:
pytorch_python_doc_build is failing with:

```
Jan 31 04:30:45 /var/lib/jenkins/workspace/docs/source/notes/broadcasting.rst:6: WARNING: 'any' reference target not found: numpy.doc.broadcasting
```

This removes the incorrect reference and adds an updated link.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/51439

Reviewed By: ngimel

Differential Revision: D26170232

Pulled By: mruberry

fbshipit-source-id: 829999db52e1e860d36d626d0d9f26e31283d14b
2021-01-31 22:00:26 -08:00
anjali411
fd9a85d21b Doc update for complex numbers (#51129)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/51129

Test Plan: Imported from OSS

Reviewed By: pbelevich

Differential Revision: D26094947

Pulled By: anjali411

fbshipit-source-id: 4e1cdf8915a8c6a86ac3462685cdce881e1bcffa
2021-01-27 07:32:26 -08:00
Hameer Abbasi
f7b339d11c Clarify wording around overrides subclasses. (#51031)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/47117

Pull Request resolved: https://github.com/pytorch/pytorch/pull/51031

Reviewed By: bdhirsh

Differential Revision: D26047498

Pulled By: albanD

fbshipit-source-id: dd0a7d9f97c0f6469b3050d2e3b4473f1bee3820
2021-01-25 08:19:13 -08:00
Kurt Mohler
8ab1a1495d Rename set_deterministic to use_deterministic_algorithms (#49904)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/49100
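
The rename, at a glance:
```python
import torch

# New name (this PR); replaces the old torch.set_deterministic(True).
torch.use_deterministic_algorithms(True)
print(torch.are_deterministic_algorithms_enabled())  # True
torch.use_deterministic_algorithms(False)
```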

Pull Request resolved: https://github.com/pytorch/pytorch/pull/49904

Reviewed By: ezyang, mrshenli

Differential Revision: D25956761

Pulled By: mruberry

fbshipit-source-id: 86a59289d50825a0ebbd7c358b483c8d8039ffa6
2021-01-22 11:27:07 -08:00
Jeffrey Wan
1833009202 Fix typo in complex autograd docs (#49755)
Summary:
Update complex autograd docs to fix a typo

Pull Request resolved: https://github.com/pytorch/pytorch/pull/49755

Reviewed By: mruberry

Differential Revision: D25692649

Pulled By: soulitzer

fbshipit-source-id: 43c2113b4c8f2d1828880102189a5a9b887dc784
2020-12-23 14:42:34 -08:00
pbialecki
1451d84766 Minor doc fix: change truncating to rounding in TF32 docs (#49625)
Summary:
Minor doc fix clarifying that the input data is rounded, not truncated.

CC zasdfgbnm ngimel
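
For reference, the Python-side switches the note documents:
```python
import torch

# When enabled, fp32 inputs are rounded to TF32 (10-bit mantissa) on
# Ampere tensor cores rather than truncated.
torch.backends.cuda.matmul.allow_tf32 = True   # matmuls
torch.backends.cudnn.allow_tf32 = True         # cuDNN convolutions
```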

Pull Request resolved: https://github.com/pytorch/pytorch/pull/49625

Reviewed By: mruberry

Differential Revision: D25668244

Pulled By: ngimel

fbshipit-source-id: ac97e41e0ca296276544f9e9f85b2cf1790d9985
2020-12-22 13:46:33 -08:00
Peter Bell
5180caeeb4 Remove deprecated spectral ops from torch namespace (#48594)
Summary:
Ref https://github.com/pytorch/pytorch/issues/42175

This removes the four deprecated spectral functions: `torch.{fft,rfft,ifft,irfft}`. The `torch.fft` module is also now imported by default.

The actual `at::native` functions are still used in `torch.stft`, so they can't be fully removed yet. But they will be once https://github.com/pytorch/pytorch/issues/47601 has been merged.
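
What replaces the removed functions, for illustration:
```python
import torch

x = torch.randn(8)
X = torch.fft.fft(x)                 # replaces the removed torch.fft(...)
x2 = torch.fft.ifft(X)               # replaces torch.ifft(...)
r = torch.fft.rfft(torch.randn(8))   # replaces torch.rfft(...)
```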

Pull Request resolved: https://github.com/pytorch/pytorch/pull/48594

Reviewed By: heitorschueroff

Differential Revision: D25298929

Pulled By: mruberry

fbshipit-source-id: e36737fe8192fcd16f7e6310f8b49de478e63bf0
2020-12-05 04:12:32 -08:00
peter
3c5db30eaa Update magma to 2.5.4 for Windows (#48656)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/48527

Pull Request resolved: https://github.com/pytorch/pytorch/pull/48656

Reviewed By: zhangguanheng66

Differential Revision: D25261601

Pulled By: malfet

fbshipit-source-id: 4ba0036ca882bccd1990108d13596455d179d06e
2020-12-02 09:45:21 -08:00
Hameer Abbasi
4e15877d5c Add documentation for torch.overrides submodule. (#48170)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/48087
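
A small taste of the newly documented helpers (illustrative):
```python
import torch
import torch.overrides

print(torch.overrides.is_tensor_like(torch.ones(2)))  # True
funcs = torch.overrides.get_overridable_functions()   # namespace -> functions
print(len(funcs))
```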

Pull Request resolved: https://github.com/pytorch/pytorch/pull/48170

Reviewed By: ejguan

Differential Revision: D25220942

Pulled By: ezyang

fbshipit-source-id: a2b7f7b565f5e77173d8ce2fe9676a8131f929b6
2020-11-30 11:25:31 -08:00
Hameer Abbasi
3a2aad9314 Fix documentation to point to torch.overrides instead of _overrides. (#47842)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/47697

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47842

Reviewed By: smessmer

Differential Revision: D24951750

Pulled By: ezyang

fbshipit-source-id: df62ec2e52f1c561c864a50bac4abf4a55e4f8e6
2020-11-16 08:28:53 -08:00
Masaki Kozuki
2eb1e866e8 Update links in DDP note (#47663)
Summary:
Update the links in https://pytorch.org/docs/stable/notes/ddp.html#.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47663

Reviewed By: smessmer

Differential Revision: D24951684

Pulled By: ezyang

fbshipit-source-id: c1c104d76cf0292a7fc75a627bf76bb56fea72d0
2020-11-13 21:26:28 -08:00
Xiang Gao
4a7de2746f Add docs on how to toggle TF32 flags on C++ (#47331)
Summary:
I have been asked several times how to toggle this flag in libtorch. I think it would be good to mention it in the docs.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47331

Reviewed By: glaringlee

Differential Revision: D24777576

Pulled By: mruberry

fbshipit-source-id: cc2a338c477bb57e0bb74b8960c47fde99665e41
2020-11-08 01:29:24 -08:00
anjali411
ac245f6b45 Complex autograd doc fix (#46258)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46258

Test Plan: Imported from OSS

Reviewed By: ailzhang

Differential Revision: D24286512

Pulled By: anjali411

fbshipit-source-id: 60bc98d69336101c0d8fe5ab542b9757b5e7faac
2020-10-13 14:36:50 -07:00
Vitaly Fedyunin
31ee5d8d8b Adding information how to control randomness with DataLoader (#45749)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45749

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D24088407

Pulled By: VitalyFedyunin

fbshipit-source-id: 398b73ec5e8c83000ebc692001da847fc0aaa48f
2020-10-12 16:57:58 -07:00
anjali411
89256611b5 Doc note update for complex autograd (#45270)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45270

<img width="1679" alt="Screen Shot 2020-10-07 at 1 45 59 PM" src="https://user-images.githubusercontent.com/20081078/95368324-fa7b2d00-08a3-11eb-9066-2e659a4085a2.png">
<img width="1673" alt="Screen Shot 2020-10-07 at 1 46 10 PM" src="https://user-images.githubusercontent.com/20081078/95368332-fbac5a00-08a3-11eb-9be5-77ce6deb8967.png">
<img width="1667" alt="Screen Shot 2020-10-07 at 1 46 30 PM" src="https://user-images.githubusercontent.com/20081078/95368337-fe0eb400-08a3-11eb-80a2-5ad23feeeb83.png">
<img width="1679" alt="Screen Shot 2020-10-07 at 1 46 48 PM" src="https://user-images.githubusercontent.com/20081078/95368345-00710e00-08a4-11eb-96d9-e2d544554a4b.png">
<img width="1680" alt="Screen Shot 2020-10-07 at 1 47 03 PM" src="https://user-images.githubusercontent.com/20081078/95368350-023ad180-08a4-11eb-89b3-f079480741f4.png">
<img width="1680" alt="Screen Shot 2020-10-07 at 1 47 12 PM" src="https://user-images.githubusercontent.com/20081078/95368364-0535c200-08a4-11eb-82fc-9435a046e4ca.png">

Test Plan: Imported from OSS

Reviewed By: navahgar

Differential Revision: D24203257

Pulled By: anjali411

fbshipit-source-id: cd637dade5fb40cecf5d9f4bd03d508d36e26fcd
2020-10-08 15:04:52 -07:00
Michael Carilli
5640b79bf8 Allow consumer ops to sync on GraphRoot's gradient (#45787)
Summary:
Currently, a GraphRoot instance doesn't have an associated stream.  Streaming backward synchronization logic assumes the instance ran on the default stream, and tells consumer ops to sync with the default stream.  If the gradient the GraphRoot instance passes to consumer backward ops was populated on a non-default stream, we have a race condition.

The race condition can exist even if the user doesn't give a manually populated gradient:
```python
with torch.cuda.stream(side_stream):
    # loss.backward() implicitly synthesizes a one-element 1.0 tensor on side_stream
    # GraphRoot passes it to consumers, but consumers first sync on default stream, not side_stream.
    loss.backward()

    # Internally to backward(), streaming-backward logic takes over, stuff executes on the same stream it ran on in forward,
    # and the side_stream context is irrelevant.  GraphRoot's interaction with its first consumer(s) is the spot where
    # the side_stream context causes a problem.
```

This PR fixes the race condition by associating a GraphRoot instance, at construction time, with the current stream(s) on the device(s) of the grads it will pass to consumers. (I think this relies on GraphRoot executing in the main thread, before the backward thread(s) fork, because the grads were populated on the main thread.)

The test demonstrates the race condition. It fails reliably without the PR's GraphRoot diffs and passes with the GraphRoot diffs.

With the GraphRoot diffs, manually populating an incoming-gradient arg for `backward` (or `torch.autograd.grad`) and the actual call to `autograd.backward` will have the same stream-semantics relationship as any other pair of ops:
```python
# implicit population is safe
with torch.cuda.stream(side_stream):
    loss.backward()

# explicit population in side stream then backward in side stream is safe
with torch.cuda.stream(side_stream):
    kickoff_grad = torch.ones_like(loss)
    loss.backward(gradient=kickoff_grad)

# explicit population in one stream then backward kickoff in another stream
# is NOT safe, even with this PR's diffs, but that unsafety is consistent with
# stream-semantics relationship of any pair of ops
kickoff_grad = torch.ones_like(loss)
with torch.cuda.stream(side_stream):
    loss.backward(gradient=kickoff_grad)

# Safe, as you'd expect for any pair of ops
kickoff_grad = torch.ones_like(loss)
side_stream.wait_stream(torch.cuda.current_stream())
with torch.cuda.stream(side_stream):
    loss.backward(gradient=kickoff_grad)
```
This PR also adds the last three examples above to cuda docs and references them from autograd docstrings.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/45787

Reviewed By: nairbv

Differential Revision: D24138376

Pulled By: albanD

fbshipit-source-id: bc4cd9390f9f0358633db530b1b09f9c1080d2a3
2020-10-07 08:53:53 -07:00
Bert Maher
03342af3a3 Add env variable to bypass CUDACachingAllocator for debugging (#45294)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45294

While tracking down a recent memory corruption bug we found that
cuda-memcheck wasn't finding the bad accesses, and ngimel pointed out that
it's because we use a caching allocator so a lot of "out of bounds" accesses
land in a valid slab.

This PR adds a runtime knob (`PYTORCH_NO_CUDA_MEMORY_CACHING`) that, when set,
bypasses the caching allocator's caching logic so that allocations go straight
to cudaMalloc.  This way, cuda-memcheck will actually work.

Test Plan:
Insert some memory errors and run a test under cuda-memcheck;
observe that cuda-memcheck flags an error where expected.

Specifically I removed the output-masking logic here:
https://github.com/pytorch/pytorch/blob/master/torch/csrc/jit/tensorexpr/cuda_codegen.cpp#L819-L826

And ran:
```
PYTORCH_NO_CUDA_MEMORY_CACHING=1 cuda-memcheck pytest -k test_superslomo test_jit_fuser_te.py
```

Reviewed By: ngimel

Differential Revision: D23964734

Pulled By: bertmaher

fbshipit-source-id: 04efd11e8aff037b9edde80c70585cb820ee6e39
2020-09-28 11:40:04 -07:00
Michael Carilli
3e6bb5233f Reference amp tutorial (recipe) from core amp docs (#44725)
Summary:
https://pytorch.org/tutorials/recipes/recipes/amp_recipe.html is live.  Core amp docs should reference it.

Also, I fixed some typos in the `zero_grad` docs that we ignored when git was behaving weirdly during ngimel's merge of https://github.com/pytorch/pytorch/pull/44423.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/44725

Reviewed By: mruberry

Differential Revision: D23723807

Pulled By: ngimel

fbshipit-source-id: ca0b76365f8ca908bd978e3b38bf81857fa6c2a3
2020-09-16 11:37:58 -07:00
Michael Carilli
2fd142a2ef Small clarification to amp gradient penalty example (#44667)
Summary:
Requested by https://discuss.pytorch.org/t/what-is-the-correct-way-of-computing-a-grad-penalty-using-amp/95827/3
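
For context, a sketch mirroring the structure of the amp gradient-penalty example (assumes a CUDA device; names are illustrative):
```python
import torch

scaler = torch.cuda.amp.GradScaler()
model = torch.nn.Linear(4, 4).cuda()
x = torch.randn(8, 4, device="cuda")

with torch.cuda.amp.autocast():
    loss = model(x).sum()

# Take grads of the *scaled* loss, then unscale manually before the penalty.
scaled_grads = torch.autograd.grad(scaler.scale(loss),
                                   model.parameters(), create_graph=True)
inv_scale = 1.0 / scaler.get_scale()
with torch.cuda.amp.autocast():
    penalty = sum((g * inv_scale).pow(2).sum() for g in scaled_grads)
    total = loss + penalty

scaler.scale(total).backward()
```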

Pull Request resolved: https://github.com/pytorch/pytorch/pull/44667

Reviewed By: mruberry

Differential Revision: D23692768

Pulled By: ngimel

fbshipit-source-id: 83c61b94e79ef9f86abed2cc066f188dce0c8456
2020-09-14 21:56:09 -07:00