pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-06 12:20:52 +01:00

Author	SHA1	Message	Date
Jithun Nair	730ce29baf	Add note on ifdefing based on CUDA_VERSION for ROCm path (#62850 ) Summary: CUDA_VERSION and HIP_VERSION follow very unrelated versioning schemes, so it does not make sense to use CUDA_VERSION to determine the ROCm path. This note explicitly addresses it. Pull Request resolved: https://github.com/pytorch/pytorch/pull/62850 Reviewed By: mruberry Differential Revision: D30547562 Pulled By: malfet fbshipit-source-id: 02990fa66a88466c2330ab85f446b25b78545150	2021-08-25 15:02:03 -07:00
Victor Quach	b95ce1591d	Add docs describing saved tensor hooks (#62362 ) Summary: Add section to the Autograd mechanics docs to describe the recently exposed saved tensors (https://github.com/pytorch/pytorch/issues/52451), how to register packing / unpacking hooks (https://github.com/pytorch/pytorch/issues/60975) and how to use default hooks (https://github.com/pytorch/pytorch/issues/61834) Sister PR: https://github.com/pytorch/pytorch/issues/62361 (will add a link from autograd.rst to notes/autograd in whatever PR does not land first) Pull Request resolved: https://github.com/pytorch/pytorch/pull/62362 Reviewed By: soulitzer Differential Revision: D30453177 Pulled By: Varal7 fbshipit-source-id: f5759977b069ff0ef36a47b08856d297691a6caa	2021-08-20 11:10:51 -07:00
soulitzer	2f615f6313	Improve custom function docs (#60312 ) Summary: - Adds some code examples for `ctx` methods and make requirements of arguments more clear - Type annotations for `save_for_backward`, `mark_dirty`, `mark_non_differentiable`, and `set_materialize_grads` (BC-breaking?) - Refactor `torch.autograd.Function` doc Pull Request resolved: https://github.com/pytorch/pytorch/pull/60312 Reviewed By: VitalyFedyunin Differential Revision: D30314961 Pulled By: soulitzer fbshipit-source-id: a284314b65662e26390417bd2b6b12cd85e68dc8	2021-08-18 11:31:31 -07:00
kyshel	e75ed4a4b5	add comma to prevent syntax errors (#62492 ) Summary: Fixes #{issue number} Pull Request resolved: https://github.com/pytorch/pytorch/pull/62492 Reviewed By: VitalyFedyunin Differential Revision: D30304684 Pulled By: ezyang fbshipit-source-id: db08ca39bcecbfd79ea50df18536bf4e87f51e15	2021-08-16 12:27:31 -07:00
cpatru	6d896cb545	Update faq.rst so OOM section mentions checkpoint (#62709 ) Summary: This FAQ has a section for CUDA OOMs where there are lots of don'ts. This limits modeling solution. Deep nets can blow up memory due to output caching during training. It's a known problem with a known solution: to trade-off compute for memory via checkpointing. FAQ should mention it. Pull Request resolved: https://github.com/pytorch/pytorch/pull/62709 Reviewed By: nairbv Differential Revision: D30103326 Pulled By: ezyang fbshipit-source-id: 3a8b465a7fbe19aae88f83cc50fe82ebafcb56c9	2021-08-05 07:40:08 -07:00
Victor Quach	5830f122f1	Add docstrings for save_on_cpu hooks (#62410 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/62410 This PR adds docstrings for CPU hooks introduced in #61928. Also uncomments the warning about pinned memory in CUDA semantics docs. Depends on: #62361. For now docstrings are an orphan page at https://docs-preview.pytorch.org/62410/generated/torch.autograd.graph.set_save_on_cpu_hooks.html#torch-autograd-graph-set-save-on-cpu-hooks Test Plan: Imported from OSS Reviewed By: soulitzer Differential Revision: D29990129 Pulled By: Varal7 fbshipit-source-id: 7a98eeee6a0abb11e2c2d9169cd1aa35ad7ba3f4	2021-08-03 17:53:45 -07:00
Michael Dagitses	58df01c3b8	clarify default value of requires_grad for tensors (#61038 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/61038 Test Plan: Imported from OSS Reviewed By: albanD Differential Revision: D29491984 Pulled By: dagitses fbshipit-source-id: 7e6b7f8e81d77f38c881b86a68c17d3cf5483dad	2021-07-12 12:57:37 -07:00
Jithun Nair	336970c03e	Add note on torch.distributed backends on ROCm (#58975 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/58975 Reviewed By: soulitzer Differential Revision: D29595510 Pulled By: rohan-varma fbshipit-source-id: 384bb67fcd003d65b76e957a474406b2a38099b9	2021-07-10 03:51:19 -07:00
Michael Carilli	2fa6c7627e	[CUDA graphs][BC-breaking] Removes post-backward syncs on default stream (#60421 ) Summary: Before https://github.com/pytorch/pytorch/pull/57833, calls to backward() or grad() synced only the calling thread's default stream with autograd leaf streams at the end of backward. This made the following weird pattern safe: ```python with torch.cuda.stream(s): # imagine forward used many streams, so backward leaf nodes may run on many streams loss.backward() # no sync use grads ``` but a more benign-looking pattern was unsafe: ```python with torch.cuda.stream(s): # imagine forward used a lot of streams, so backward leaf nodes may run on many streams loss.backward() # backward() syncs the default stream with all the leaf streams, but does not sync s with anything, # so counterintuitively (even though we're in the same stream context as backward()!) # it is NOT SAFE to use grads here, and there's no easy way to make it safe, # unless you manually sync on all the streams you used in forward, # or move "use grads" back to default stream outside the context. use grads ``` mruberry ngimel and I decided backward() should have the [same user-facing stream semantics as any cuda op](https://pytorch.org/docs/master/notes/cuda.html#stream-semantics-of-backward-passes). In other words, the weird pattern should be unsafe, and the benign-looking pattern should be safe. Implementationwise, this meant backward() should sync its calling thread's current stream, not default stream, with the leaf streams. After https://github.com/pytorch/pytorch/pull/57833, backward syncs the calling thread's current stream AND default stream with all leaf streams at the end of backward. The default stream syncs were retained for temporary backward compatibility. This PR finishes https://github.com/pytorch/pytorch/pull/57833's work by deleting syncs on the default stream. With this PR, graph-capturing an entire backward() call should be possible (see the [test_graph_grad_scaling diffs](https://github.com/pytorch/pytorch/compare/master...mcarilli:streaming_backwards_remove_default_syncs?expand=1#diff-893b1eea27352f336f4cd832919e48d721e4e90186e63400b8596db6b82e7450R3641-R3642)). first paragraph has a formatting error which this PR should also fix. Pull Request resolved: https://github.com/pytorch/pytorch/pull/60421 Reviewed By: albanD Differential Revision: D29370344 Pulled By: ngimel fbshipit-source-id: 3248bc5fb92fc517db0c15c897e5d7250f67d7fe	2021-06-24 17:34:02 -07:00
Luca Wehrstedt	bb9e1150ea	Revert D29342234: [pytorch][PR] [CUDA graphs][BC-breaking] Removes post-backward syncs on default stream Test Plan: revert-hammer Differential Revision: D29342234 (`675cea1adb`) Original commit changeset: 98e6be7fdd85 fbshipit-source-id: 84022973248b2254210eee57402df2c4f4bc43c6	2021-06-24 04:49:28 -07:00
Michael Carilli	675cea1adb	[CUDA graphs][BC-breaking] Removes post-backward syncs on default stream (#60421 ) Summary: Before https://github.com/pytorch/pytorch/pull/57833, calls to backward() or grad() synced only the calling thread's default stream with autograd leaf streams at the end of backward. This made the following weird pattern safe: ```python with torch.cuda.stream(s): # imagine forward used many streams, so backward leaf nodes may run on many streams loss.backward() # no sync use grads ``` but a more benign-looking pattern was unsafe: ```python with torch.cuda.stream(s): # imagine forward used a lot of streams, so backward leaf nodes may run on many streams loss.backward() # backward() syncs the default stream with all the leaf streams, but does not sync s with anything, # so counterintuitively (even though we're in the same stream context as backward()!) # it is NOT SAFE to use grads here, and there's no easy way to make it safe, # unless you manually sync on all the streams you used in forward, # or move "use grads" back to default stream outside the context. use grads ``` mruberry ngimel and I decided backward() should have the [same user-facing stream semantics as any cuda op](https://pytorch.org/docs/master/notes/cuda.html#stream-semantics-of-backward-passes). In other words, the weird pattern should be unsafe, and the benign-looking pattern should be safe. Implementationwise, this meant backward() should sync its calling thread's current stream, not default stream, with the leaf streams. After https://github.com/pytorch/pytorch/pull/57833, backward syncs the calling thread's current stream AND default stream with all leaf streams at the end of backward. The default stream syncs were retained for temporary backward compatibility. This PR finishes https://github.com/pytorch/pytorch/pull/57833's work by deleting syncs on the default stream. With this PR, graph-capturing an entire backward() call should be possible (see the [test_graph_grad_scaling diffs](https://github.com/pytorch/pytorch/compare/master...mcarilli:streaming_backwards_remove_default_syncs?expand=1#diff-893b1eea27352f336f4cd832919e48d721e4e90186e63400b8596db6b82e7450R3641-R3642)). first paragraph has a formatting error which this PR should also fix. Pull Request resolved: https://github.com/pytorch/pytorch/pull/60421 Reviewed By: VitalyFedyunin, albanD Differential Revision: D29342234 Pulled By: ngimel fbshipit-source-id: 98e6be7fdd8550872f0a78f9a66cb8dfe75abf63	2021-06-23 23:35:24 -07:00
Michael Wootton	2f3be2735f	Don't split oversize cached blocks (#44742 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/35901 This change is designed to prevent fragmentation in the Caching Allocator. Permissive block splitting in the allocator allows very large blocks to be split into many pieces. Once split too finely it is unlikely all pieces will be 'free' at that same time so the original allocation can never be returned. Anecdotally, we've seen a model run out of memory failing to alloc a 50 MB block on a 32 GB card while the caching allocator is holding 13 GB of 'split free blocks' Approach: - Large blocks above a certain size are designated "oversize". This limit is currently set 1 decade above large, 200 MB - Oversize blocks can not be split - Oversize blocks must closely match the requested size (e.g. a 200 MB request will match an existing 205 MB block, but not a 300 MB block) - In lieu of splitting oversize blocks there is a mechanism to quickly free a single oversize block (to the system allocator) to allow an appropriate size block to be allocated. This will be activated under memory pressure and will prevent _release_cached_blocks()_ from triggering Initial performance tests show this is similar or quicker than the original strategy. Additional tests are ongoing. Pull Request resolved: https://github.com/pytorch/pytorch/pull/44742 Reviewed By: zou3519 Differential Revision: D29186394 Pulled By: ezyang fbshipit-source-id: c88918836db3f51df59de6d1b3e03602ebe306a9	2021-06-21 11:46:08 -07:00
Michael Carilli	be038d8989	[CUDA graphs] Make stream semantics of backward calls consistent with other cuda ops (ci-all edition) (#57833 ) Summary: ci-all resubmit of https://github.com/pytorch/pytorch/pull/54227. Tests look good except for a few distributed autograd failures (pytorch_linux_xenial_cuda10_2_cudnn7_py3_multigpu_test) and rocm failures (pr/pytorch-linux-bionic-rocm4.1-py3.6). The common denominator in rocm failures appears to be multi-gpu activity: some [multiprocess DDP failures](https://ci.pytorch.org/jenkins/job/pytorch-builds/job/pytorch-linux-bionic-rocm4.1-py3.6-test1/8115/console), some [single-process failures](https://ci.pytorch.org/jenkins/job/pytorch-builds/job/pytorch-linux-bionic-rocm4.1-py3.6-test2/8115/console) where the single process has autograd ops that span devices. jeffdaily jithunnair-amd sunway513, could one of you take a look? The streaming backward change is also beneficial to rocm, I expect. For debugging rocm failures, I think we should ignore the multiprocess/DDP tests and focus on the single process cases. The root cause is probably the same and the single process cases are simpler. ---------------------------------- Update: Rocm failures are due to https://github.com/pytorch/pytorch/issues/59750. `2718a54032` is a workaround, to be updated once https://github.com/pytorch/pytorch/issues/59750 is fixed. Pull Request resolved: https://github.com/pytorch/pytorch/pull/57833 Reviewed By: mruberry Differential Revision: D28942391 Pulled By: ngimel fbshipit-source-id: d6047e971c5f1c6386334bf3641402a92f12e2f8	2021-06-13 12:09:56 -07:00
Jeffrey Wan	a7a5992d7d	Add no-grad inference mode note (#58513 ) Summary: Adds a note explaining the difference between several often conflated mechanisms in the autograd note Also adds a link to this note from the docs in `grad_mode` and `nn.module`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/58513 Reviewed By: gchanan Differential Revision: D28651129 Pulled By: soulitzer fbshipit-source-id: af9eb1749b641fc1b632815634eea36bf7979156	2021-05-25 13:06:54 -07:00
Jithun Nair	ab6b5fa036	Add HIP (ROCm) semantics doc (#57871 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/57871 Reviewed By: agolynski Differential Revision: D28385510 Pulled By: malfet fbshipit-source-id: 9cf69e52d026a1cf74cc12d8727ca17ae026235e	2021-05-12 12:34:07 -07:00
albanD	d16ed1ee8a	Add first draft of gradcheck note (#55966 ) Summary: You can find the latest rendered version in the `python_doc_build` CI job below, in the artifact tab of that build on circle CI Pull Request resolved: https://github.com/pytorch/pytorch/pull/55966 Reviewed By: H-Huang Differential Revision: D28032446 Pulled By: albanD fbshipit-source-id: 227ad37b03d39894d736c19cae3195b4d56fc62f	2021-04-27 14:33:42 -07:00
Ilqar Ramazanli	70d9be0f42	Replace duplicative s with alpha (#56804 ) Summary: It is always easier to read a document when different objects / concepts denoted with different variables / representations. In this PR we make sure the [complex autograd](https://pytorch.org/docs/master/notes/autograd.html#autograd-for-complex-numbers) documentation, the variable of output and step size diverge. Fixes https://github.com/pytorch/pytorch/issues/53633 Pull Request resolved: https://github.com/pytorch/pytorch/pull/56804 Reviewed By: anjali411 Differential Revision: D27989959 Pulled By: iramazanli fbshipit-source-id: c271590ee744c8aeeff62bfaa2295429765ef64e	2021-04-25 16:27:09 -07:00
Erjia Guan	8cf85a1152	[DataLoader][doc] Randomness for base_seed generator and NumPy seed (#56528 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/56528 Tried to search across internal and external usage of DataLoader. People haven't started to use `generator` for `DataLoader`. Test Plan: Imported from OSS Reviewed By: albanD Differential Revision: D27908487 Pulled By: ejguan fbshipit-source-id: 14c83ed40d4ba4dc988b121968a78c2732d8eb93	2021-04-22 09:40:45 -07:00
Natalia Gimelshein	f94c95a2dd	Revert D23752058: [pytorch][PR] Don't split oversize cached blocks Test Plan: revert-hammer Differential Revision: D23752058 (`67dcd62310`) Original commit changeset: ccb7c13e3cf8 fbshipit-source-id: 12ae9702135ea510e9714ed97fb75ca3b9f97c27	2021-04-14 09:24:08 -07:00
Michael Wootton	67dcd62310	Don't split oversize cached blocks (#44742 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/35901 This change is designed to prevent fragmentation in the Caching Allocator. Permissive block splitting in the allocator allows very large blocks to be split into many pieces. Once split too finely it is unlikely all pieces will be 'free' at that same time so the original allocation can never be returned. Anecdotally, we've seen a model run out of memory failing to alloc a 50 MB block on a 32 GB card while the caching allocator is holding 13 GB of 'split free blocks' Approach: - Large blocks above a certain size are designated "oversize". This limit is currently set 1 decade above large, 200 MB - Oversize blocks can not be split - Oversize blocks must closely match the requested size (e.g. a 200 MB request will match an existing 205 MB block, but not a 300 MB block) - In lieu of splitting oversize blocks there is a mechanism to quickly free a single oversize block (to the system allocator) to allow an appropriate size block to be allocated. This will be activated under memory pressure and will prevent _release_cached_blocks()_ from triggering Initial performance tests show this is similar or quicker than the original strategy. Additional tests are ongoing. Pull Request resolved: https://github.com/pytorch/pytorch/pull/44742 Reviewed By: ngimel Differential Revision: D23752058 Pulled By: ezyang fbshipit-source-id: ccb7c13e3cf8ef2707706726ac9aaac3a5e3d5c8	2021-04-14 03:04:41 -07:00
Yukio Siraichi	93bf0ae6fc	Remove legacy constructor calls from pytorch codebase. (#54142 ) Summary: Follow up from https://github.com/pytorch/pytorch/issues/53889 Related to https://github.com/pytorch/pytorch/issues/47112 Removing every occurrence of the legacy constructor call present in PyTorch at: - _docs_ - _benchmarks_ - _test_ - _caffe2_ - _CONTRIBUTING.md_ Pull Request resolved: https://github.com/pytorch/pytorch/pull/54142 Reviewed By: ngimel Differential Revision: D27699450 Pulled By: mruberry fbshipit-source-id: 530aa3f5746cc8bc1407d5d51b2bbd8075e30546	2021-04-11 15:45:17 -07:00
James Reed	ec38dda1cc	Remove extra close bracket in extending.rst (#55409 ) Summary: Small typo Pull Request resolved: https://github.com/pytorch/pytorch/pull/55409 Reviewed By: pbelevich Differential Revision: D27611177 Pulled By: jamesr66a fbshipit-source-id: 8a5ff702e4ab8a7eb2403432889f8b7a5a69484b	2021-04-07 21:15:46 -07:00
Peter Bell	8ac0619784	Avoid infinite recursion in __torch_function__ example (#55391 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/55284 This gets the example to run but probably doesn't help the readability of the example. Thoughts? Pull Request resolved: https://github.com/pytorch/pytorch/pull/55391 Reviewed By: mrshenli Differential Revision: D27621096 Pulled By: ezyang fbshipit-source-id: d02c4fb0001e54139a167b477fd3b4a229e4dc8c	2021-04-07 20:31:46 -07:00
James Reed	c96f076248	Fix typo in extending.rst (#55408 ) Summary: Small typo in docs Pull Request resolved: https://github.com/pytorch/pytorch/pull/55408 Reviewed By: pbelevich Differential Revision: D27611175 Pulled By: jamesr66a fbshipit-source-id: a83a6220054c0411329792c7ac6afceb2b699f44	2021-04-07 03:46:01 -07:00
Sam Estep	5bcbbf5373	Lint trailing newlines (#54737 ) Summary: Context: https://github.com/pytorch/pytorch/issues/53406 added a lint for trailing whitespace at the ends of lines. However, in order to pass FB-internal lints, that PR also had to normalize the trailing newlines in four of the files it touched. This PR adds an OSS lint to normalize trailing newlines. The changes to the following files (made in 54847d0adb9be71be4979cead3d9d4c02160e4cd) are the only manually-written parts of this PR: - `.github/workflows/lint.yml` - `mypy-strict.ini` - `tools/README.md` - `tools/test/test_trailing_newlines.py` - `tools/trailing_newlines.py` I would have liked to make this just a shell one-liner like the other three similar lints, but nothing I could find quite fit the bill. Specifically, all the answers I tried from the following Stack Overflow questions were far too slow (at least a minute and a half to run on this entire repository): - [How to detect file ends in newline?](https://stackoverflow.com/q/38746) - [How do I find files that do not end with a newline/linefeed?](https://stackoverflow.com/q/4631068) - [How to list all files in the Git index without newline at end of file](https://stackoverflow.com/q/27624800) - [Linux - check if there is an empty line at the end of a file [duplicate]](https://stackoverflow.com/q/34943632) - [git ensure newline at end of each file](https://stackoverflow.com/q/57770972) To avoid giving false positives during the few days after this PR is merged, we should probably only merge it after https://github.com/pytorch/pytorch/issues/54967. Pull Request resolved: https://github.com/pytorch/pytorch/pull/54737 Test Plan: Running the shell script from the "Ensure correct trailing newlines" step in the `quick-checks` job of `.github/workflows/lint.yml` should print no output and exit in a fraction of a second with a status of 0. That was not the case prior to this PR, as shown by this failing GHA workflow run on an earlier draft of this PR: - https://github.com/pytorch/pytorch/runs/2197446987?check_suite_focus=true In contrast, this run (after correcting the trailing newlines in this PR) succeeded: - https://github.com/pytorch/pytorch/pull/54737/checks?check_run_id=2197553241 To unit-test `tools/trailing_newlines.py` itself (this is run as part of our "Test tools" GitHub Actions workflow): ``` python tools/test/test_trailing_newlines.py ``` Reviewed By: malfet Differential Revision: D27409736 Pulled By: samestep fbshipit-source-id: 46f565227046b39f68349bbd5633105b2d2e9b19	2021-03-30 13:09:52 -07:00
Jeff Yang	475251631b	docs: reference links to serialization.html (#54659 ) Summary: fixes https://github.com/pytorch/pytorch/issues/54311 https://11811979-65600975-gh.circle-artifacts.com/0/docs/generated/torch.save.html Pull Request resolved: https://github.com/pytorch/pytorch/pull/54659 Reviewed By: ailzhang Differential Revision: D27328281 Pulled By: zou3519 fbshipit-source-id: b88d02e5407238a338d537d013a297ae9cdf922b	2021-03-29 10:15:07 -07:00
Eric Jang	c2ccb3578e	Fix inport -> import typo in documentation (#53589 ) Summary: Fixes a small documentation typo Pull Request resolved: https://github.com/pytorch/pytorch/pull/53589 Reviewed By: ngimel Differential Revision: D26907045 Pulled By: Chillee fbshipit-source-id: 15c35bec8d75dd897fe8886d0e0e1b889df65b24	2021-03-08 23:56:42 -08:00
Sam Estep	8c798e0622	Forbid trailing whitespace (#53406 ) Summary: Context: https://github.com/pytorch/pytorch/pull/53299#discussion_r587882857 These are the only hand-written parts of this diff: - the addition to `.github/workflows/lint.yml` - the file endings changed in these four files (to appease FB-internal land-blocking lints): - `GLOSSARY.md` - `aten/src/ATen/core/op_registration/README.md` - `scripts/README.md` - `torch/csrc/jit/codegen/fuser/README.md` The rest was generated by running this command (on macOS): ``` git grep -I -l ' $' -- . ':(exclude)/contrib/' ':(exclude)third_party' \| xargs gsed -i 's/ *$//' ``` I looked over the auto-generated changes and didn't see anything that looked problematic. Pull Request resolved: https://github.com/pytorch/pytorch/pull/53406 Test Plan: This run (after adding the lint but before removing existing trailing spaces) failed: - https://github.com/pytorch/pytorch/runs/2043032377 This run (on the tip of this PR) succeeded: - https://github.com/pytorch/pytorch/runs/2043296348 Reviewed By: walterddr, seemethere Differential Revision: D26856620 Pulled By: samestep fbshipit-source-id: 3f0de7f7c2e4b0f1c089eac9b5085a58dd7e0d97	2021-03-05 17:22:55 -08:00
peter	8870c391e9	Update mkl to 2020.2.254 (#52964 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/52907 Pull Request resolved: https://github.com/pytorch/pytorch/pull/52964 Reviewed By: H-Huang Differential Revision: D26726464 Pulled By: seemethere fbshipit-source-id: 8f3067292e6416e299b4b040c8fb73510134f02e	2021-03-01 11:13:57 -08:00
Joel Schlosser	a0137808a7	Note on Modules for 1.8 docs (#51536 ) Summary: A new note on Modules for 1.8 documentation. Rendered form can be seen here: https://alband.github.io/doc_view/notes/modules.html (thanks Alban!) Pull Request resolved: https://github.com/pytorch/pytorch/pull/51536 Reviewed By: albanD Differential Revision: D26254282 Pulled By: jbschlosser fbshipit-source-id: 09cbd46aa268a29b6f54fd48ffe1d6b98db0ff31	2021-02-04 11:28:11 -08:00
anjali411	34d4d79966	Autograd doc note fix (#51661 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/51661 Test Plan: Imported from OSS Reviewed By: ezyang Differential Revision: D26230912 Pulled By: anjali411 fbshipit-source-id: 94323d7bce631a4c5781020e9650495461119ede	2021-02-03 15:08:35 -08:00
Mike Ruberry	40c0fffb4b	Fixes docs (#51439 ) Summary: pytorch_python_doc_build is failing with: ``` Jan 31 04:30:45 /var/lib/jenkins/workspace/docs/source/notes/broadcasting.rst:6: WARNING: 'any' reference target not found: numpy.doc.broadcasting ``` this removes the incorrect reference and adds an updated link. Pull Request resolved: https://github.com/pytorch/pytorch/pull/51439 Reviewed By: ngimel Differential Revision: D26170232 Pulled By: mruberry fbshipit-source-id: 829999db52e1e860d36d626d0d9f26e31283d14b	2021-01-31 22:00:26 -08:00
anjali411	fd9a85d21b	Doc update for complex numbers (#51129 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/51129 Test Plan: Imported from OSS Reviewed By: pbelevich Differential Revision: D26094947 Pulled By: anjali411 fbshipit-source-id: 4e1cdf8915a8c6a86ac3462685cdce881e1bcffa	2021-01-27 07:32:26 -08:00
Hameer Abbasi	f7b339d11c	Clarify wording around overrides subclasses. (#51031 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/47117 Pull Request resolved: https://github.com/pytorch/pytorch/pull/51031 Reviewed By: bdhirsh Differential Revision: D26047498 Pulled By: albanD fbshipit-source-id: dd0a7d9f97c0f6469b3050d2e3b4473f1bee3820	2021-01-25 08:19:13 -08:00
Kurt Mohler	8ab1a1495d	Rename `set_deterministic` to `use_deterministic_algorithms` (#49904 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/49100 Pull Request resolved: https://github.com/pytorch/pytorch/pull/49904 Reviewed By: ezyang, mrshenli Differential Revision: D25956761 Pulled By: mruberry fbshipit-source-id: 86a59289d50825a0ebbd7c358b483c8d8039ffa6	2021-01-22 11:27:07 -08:00
Jeffrey Wan	1833009202	Fix typo in complex autograd docs (#49755 ) Summary: Update complex autograd docs to fix a typo Pull Request resolved: https://github.com/pytorch/pytorch/pull/49755 Reviewed By: mruberry Differential Revision: D25692649 Pulled By: soulitzer fbshipit-source-id: 43c2113b4c8f2d1828880102189a5a9b887dc784	2020-12-23 14:42:34 -08:00
pbialecki	1451d84766	Minor doc fix: change truncating to rounding in TF32 docs (#49625 ) Summary: Minor doc fix in clarifying that the input data is rounded not truncated. CC zasdfgbnm ngimel Pull Request resolved: https://github.com/pytorch/pytorch/pull/49625 Reviewed By: mruberry Differential Revision: D25668244 Pulled By: ngimel fbshipit-source-id: ac97e41e0ca296276544f9e9f85b2cf1790d9985	2020-12-22 13:46:33 -08:00
Peter Bell	5180caeeb4	Remove deprecated spectral ops from torch namespace (#48594 ) Summary: Ref https://github.com/pytorch/pytorch/issues/42175 This removes the 4 deprecated spectral functions: `torch.{fft,rfft,ifft,irfft}`. `torch.fft` is also now imported by by default. The actual `at::native` functions are still used in `torch.stft` so can't be full removed yet. But will once https://github.com/pytorch/pytorch/issues/47601 has been merged. Pull Request resolved: https://github.com/pytorch/pytorch/pull/48594 Reviewed By: heitorschueroff Differential Revision: D25298929 Pulled By: mruberry fbshipit-source-id: e36737fe8192fcd16f7e6310f8b49de478e63bf0	2020-12-05 04:12:32 -08:00
peter	3c5db30eaa	Update magma to 2.5.4 for Windows (#48656 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/48527 Pull Request resolved: https://github.com/pytorch/pytorch/pull/48656 Reviewed By: zhangguanheng66 Differential Revision: D25261601 Pulled By: malfet fbshipit-source-id: 4ba0036ca882bccd1990108d13596455d179d06e	2020-12-02 09:45:21 -08:00
Hameer Abbasi	4e15877d5c	Add documentation for torch.overrides submodule. (#48170 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/48087 Pull Request resolved: https://github.com/pytorch/pytorch/pull/48170 Reviewed By: ejguan Differential Revision: D25220942 Pulled By: ezyang fbshipit-source-id: a2b7f7b565f5e77173d8ce2fe9676a8131f929b6	2020-11-30 11:25:31 -08:00
Hameer Abbasi	3a2aad9314	Fix documentation to point to torch.overrides instead of _overrides. (#47842 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/47697 Pull Request resolved: https://github.com/pytorch/pytorch/pull/47842 Reviewed By: smessmer Differential Revision: D24951750 Pulled By: ezyang fbshipit-source-id: df62ec2e52f1c561c864a50bac4abf4a55e4f8e6	2020-11-16 08:28:53 -08:00
Masaki Kozuki	2eb1e866e8	Update links in DDP note (#47663 ) Summary: Update the links in https://pytorch.org/docs/stable/notes/ddp.html#. Pull Request resolved: https://github.com/pytorch/pytorch/pull/47663 Reviewed By: smessmer Differential Revision: D24951684 Pulled By: ezyang fbshipit-source-id: c1c104d76cf0292a7fc75a627bf76bb56fea72d0	2020-11-13 21:26:28 -08:00
Xiang Gao	4a7de2746f	Add docs on how to toggle TF32 flags on C++ (#47331 ) Summary: I have been asked several times how to toggle this flag on libtorch. I think it would be good to mention it in the docs. Pull Request resolved: https://github.com/pytorch/pytorch/pull/47331 Reviewed By: glaringlee Differential Revision: D24777576 Pulled By: mruberry fbshipit-source-id: cc2a338c477bb57e0bb74b8960c47fde99665e41	2020-11-08 01:29:24 -08:00
anjali411	ac245f6b45	Complex autograd doc fix (#46258 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46258 Test Plan: Imported from OSS Reviewed By: ailzhang Differential Revision: D24286512 Pulled By: anjali411 fbshipit-source-id: 60bc98d69336101c0d8fe5ab542b9757b5e7faac	2020-10-13 14:36:50 -07:00
Vitaly Fedyunin	31ee5d8d8b	Adding information how to control randomness with DataLoader (#45749 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45749 Test Plan: Imported from OSS Reviewed By: albanD Differential Revision: D24088407 Pulled By: VitalyFedyunin fbshipit-source-id: 398b73ec5e8c83000ebc692001da847fc0aaa48f	2020-10-12 16:57:58 -07:00
anjali411	89256611b5	Doc note update for complex autograd (#45270 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45270 <img width="1679" alt="Screen Shot 2020-10-07 at 1 45 59 PM" src="https://user-images.githubusercontent.com/20081078/95368324-fa7b2d00-08a3-11eb-9066-2e659a4085a2.png"> <img width="1673" alt="Screen Shot 2020-10-07 at 1 46 10 PM" src="https://user-images.githubusercontent.com/20081078/95368332-fbac5a00-08a3-11eb-9be5-77ce6deb8967.png"> <img width="1667" alt="Screen Shot 2020-10-07 at 1 46 30 PM" src="https://user-images.githubusercontent.com/20081078/95368337-fe0eb400-08a3-11eb-80a2-5ad23feeeb83.png"> <img width="1679" alt="Screen Shot 2020-10-07 at 1 46 48 PM" src="https://user-images.githubusercontent.com/20081078/95368345-00710e00-08a4-11eb-96d9-e2d544554a4b.png"> <img width="1680" alt="Screen Shot 2020-10-07 at 1 47 03 PM" src="https://user-images.githubusercontent.com/20081078/95368350-023ad180-08a4-11eb-89b3-f079480741f4.png"> <img width="1680" alt="Screen Shot 2020-10-07 at 1 47 12 PM" src="https://user-images.githubusercontent.com/20081078/95368364-0535c200-08a4-11eb-82fc-9435a046e4ca.png"> Test Plan: Imported from OSS Reviewed By: navahgar Differential Revision: D24203257 Pulled By: anjali411 fbshipit-source-id: cd637dade5fb40cecf5d9f4bd03d508d36e26fcd	2020-10-08 15:04:52 -07:00
Michael Carilli	5640b79bf8	Allow consumer ops to sync on GraphRoot's gradient (#45787 ) Summary: Currently, a GraphRoot instance doesn't have an associated stream. Streaming backward synchronization logic assumes the instance ran on the default stream, and tells consumer ops to sync with the default stream. If the gradient the GraphRoot instance passes to consumer backward ops was populated on a non-default stream, we have a race condition. The race condition can exist even if the user doesn't give a manually populated gradient: ```python with torch.cuda.stream(side_stream): # loss.backward() implicitly synthesizes a one-element 1.0 tensor on side_stream # GraphRoot passes it to consumers, but consumers first sync on default stream, not side_stream. loss.backward() # Internally to backward(), streaming-backward logic takes over, stuff executes on the same stream it ran on in forward, # and the side_stream context is irrelevant. GraphRoot's interaction with its first consumer(s) is the spot where # the side_stream context causes a problem. ``` This PR fixes the race condition by associating a GraphRoot instance, at construction time, with the current stream(s) on the device(s) of the grads it will pass to consumers. (i think this relies on GraphRoot executing in the main thread, before backward thread(s) fork, because the grads were populated on the main thread.) The test demonstrates the race condition. It fails reliably without the PR's GraphRoot diffs and passes with the GraphRoot diffs. With the GraphRoot diffs, manually populating an incoming-gradient arg for `backward` (or `torch.autograd.grad`) and the actual call to `autograd.backward` will have the same stream-semantics relationship as any other pair of ops: ```python # implicit population is safe with torch.cuda.stream(side_stream): loss.backward() # explicit population in side stream then backward in side stream is safe with torch.cuda.stream(side_stream): kickoff_grad = torch.ones_like(loss) loss.backward(gradient=kickoff_grad) # explicit population in one stream then backward kickoff in another stream # is NOT safe, even with this PR's diffs, but that unsafety is consistent with # stream-semantics relationship of any pair of ops kickoff_grad = torch.ones_like(loss) with torch.cuda.stream(side_stream): loss.backward(gradient=kickoff_grad) # Safe, as you'd expect for any pair of ops kickoff_grad = torch.ones_like(loss) side_stream.wait_stream(torch.cuda.current_stream()) with torch.cuda.stream(side_stream): loss.backward(gradient=kickoff_grad) ``` This PR also adds the last three examples above to cuda docs and references them from autograd docstrings. Pull Request resolved: https://github.com/pytorch/pytorch/pull/45787 Reviewed By: nairbv Differential Revision: D24138376 Pulled By: albanD fbshipit-source-id: bc4cd9390f9f0358633db530b1b09f9c1080d2a3	2020-10-07 08:53:53 -07:00
Bert Maher	03342af3a3	Add env variable to bypass CUDACachingAllocator for debugging (#45294 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45294 While tracking down a recent memory corruption bug we found that cuda-memcheck wasn't finding the bad accesses, and ngimel pointed out that it's because we use a caching allocator so a lot of "out of bounds" accesses land in a valid slab. This PR adds a runtime knob (`PYTORCH_NO_CUDA_MEMORY_CACHING`) that, when set, bypasses the caching allocator's caching logic so that allocations go straight to cudaMalloc. This way, cuda-memcheck will actually work. Test Plan: Insert some memory errors and run a test under cuda-memcheck; observe that cuda-memcheck flags an error where expected. Specifically I removed the output-masking logic here: https://github.com/pytorch/pytorch/blob/master/torch/csrc/jit/tensorexpr/cuda_codegen.cpp#L819-L826 And ran: ``` PYTORCH_NO_CUDA_MEMORY_CACHING=1 cuda-memcheck pytest -k test_superslomo test_jit_fuser_te.py ``` Reviewed By: ngimel Differential Revision: D23964734 Pulled By: bertmaher fbshipit-source-id: 04efd11e8aff037b9edde80c70585cb820ee6e39	2020-09-28 11:40:04 -07:00
Michael Carilli	3e6bb5233f	Reference amp tutorial (recipe) from core amp docs (#44725 ) Summary: https://pytorch.org/tutorials/recipes/recipes/amp_recipe.html is live. Core amp docs should reference it. Also i fixed some typos in the `zero_grad` docs we ignored when git was behaving weirdly during ngimel 's merge of https://github.com/pytorch/pytorch/pull/44423. Pull Request resolved: https://github.com/pytorch/pytorch/pull/44725 Reviewed By: mruberry Differential Revision: D23723807 Pulled By: ngimel fbshipit-source-id: ca0b76365f8ca908bd978e3b38bf81857fa6c2a3	2020-09-16 11:37:58 -07:00
Michael Carilli	2fd142a2ef	Small clarification to amp gradient penalty example (#44667 ) Summary: requested by https://discuss.pytorch.org/t/what-is-the-correct-way-of-computing-a-grad-penalty-using-amp/95827/3 Pull Request resolved: https://github.com/pytorch/pytorch/pull/44667 Reviewed By: mruberry Differential Revision: D23692768 Pulled By: ngimel fbshipit-source-id: 83c61b94e79ef9f86abed2cc066f188dce0c8456	2020-09-14 21:56:09 -07:00

1 2 3 4 5

235 Commits