Summary:
ci-all resubmit of https://github.com/pytorch/pytorch/pull/54227.
Tests look good except for a few distributed autograd failures (pytorch_linux_xenial_cuda10_2_cudnn7_py3_multigpu_test) and ROCm failures (pr/pytorch-linux-bionic-rocm4.1-py3.6).
The common denominator in the ROCm failures appears to be multi-GPU activity: some [multiprocess DDP failures](https://ci.pytorch.org/jenkins/job/pytorch-builds/job/pytorch-linux-bionic-rocm4.1-py3.6-test1/8115/console), and some [single-process failures](https://ci.pytorch.org/jenkins/job/pytorch-builds/job/pytorch-linux-bionic-rocm4.1-py3.6-test2/8115/console) where the single process runs autograd ops that span devices. jeffdaily jithunnair-amd sunway513, could one of you take a look? I expect the streaming backward change will also benefit ROCm.
For debugging the ROCm failures, I think we should ignore the multiprocess/DDP tests and focus on the single-process cases: the root cause is probably the same, and the single-process cases are simpler. A minimal sketch of such a case is shown below.
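For context, here is a minimal sketch of the single-process pattern these tests exercise: an autograd graph whose ops span two devices, so the streaming backward has to coordinate across them. This is illustrative only, assuming a machine with at least two visible GPUs; the shapes and device indices are made up, not taken from the failing tests.

```python
import torch

def cross_device_backward():
    # Forward graph spans cuda:0 -> cuda:1; .to() is differentiable,
    # so backward must propagate gradients back across devices.
    a = torch.randn(4, 4, device="cuda:0", requires_grad=True)
    b = a.to("cuda:1")
    loss = (b * b).sum()   # compute on the second device
    loss.backward()        # streaming backward runs on both devices
    torch.cuda.synchronize()  # surface any async errors here
    print(a.grad.norm())

if __name__ == "__main__":
    assert torch.cuda.device_count() >= 2, "needs two GPUs"
    cross_device_backward()
```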
----------------------------------
Update: ROCm failures are due to https://github.com/pytorch/pytorch/issues/59750.