pytorch/docs/source
Michael Carilli · 675cea1adb · [CUDA graphs][BC-breaking] Removes post-backward syncs on default stream (#60421)
Summary:
Before https://github.com/pytorch/pytorch/pull/57833, calls to backward() or grad() synced only the calling thread's default stream with autograd leaf streams at the end of backward. This made the following weird pattern safe:
```python
with torch.cuda.stream(s):
    # imagine forward used many streams, so backward leaf nodes may run on many streams
    loss.backward()
# no sync
optimizer.step()  # "use grads": stand-in for any op that reads the .grad attributes
```

but a more benign-looking pattern was unsafe:
```python
with torch.cuda.stream(s):
    # imagine forward used a lot of streams, so backward leaf nodes may run on many streams
    loss.backward()
    # backward() syncs the default stream with all the leaf streams, but does not sync s with anything,
    # so counterintuitively (even though we're in the same stream context as backward()!)
    # it is NOT SAFE to use grads here, and there's no easy way to make it safe,
    # unless you manually sync on all the streams you used in forward,
    # or move "use grads" back to default stream outside the context.
    optimizer.step()  # "use grads": NOT safe here under the old semantics
```
mruberry, ngimel, and I decided backward() should have the [same user-facing stream semantics as any CUDA op](https://pytorch.org/docs/master/notes/cuda.html#stream-semantics-of-backward-passes).** In other words, the weird pattern should be unsafe, and the benign-looking pattern should be safe. Implementation-wise, this meant backward() should sync its calling thread's current stream, not the default stream, with the leaf streams.
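Under the new semantics, the benign-looking pattern becomes safe with no extra synchronization. A minimal sketch (reusing the `s` and `loss` from the snippets above; `optimizer` is a stand-in for any consumer of the grads):
```python
with torch.cuda.stream(s):
    # forward may still use many streams internally
    loss.backward()
    # backward() now syncs s (the calling thread's current stream) with all leaf streams,
    # so consuming grads inside the same stream context is safe
    optimizer.step()
```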

After https://github.com/pytorch/pytorch/pull/57833, backward() syncs the calling thread's current stream AND the default stream with all leaf streams at the end of backward. The default-stream syncs were retained temporarily for backward compatibility.

This PR finishes https://github.com/pytorch/pytorch/pull/57833's work by deleting syncs on the default stream.
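Correspondingly, the formerly safe weird pattern now requires an explicit sync before grads are consumed on a different stream. A hedged sketch, with the same assumed `s`, `loss`, and `optimizer` as above:
```python
with torch.cuda.stream(s):
    loss.backward()
# the default-stream syncs are gone, so the consumer stream must wait on s explicitly
torch.cuda.current_stream().wait_stream(s)
optimizer.step()  # "use grads" on the current (e.g. default) stream
```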

With this PR, graph-capturing an entire backward() call should be possible (see the [test_graph_grad_scaling diffs](https://github.com/pytorch/pytorch/compare/master...mcarilli:streaming_backwards_remove_default_syncs?expand=1#diff-893b1eea27352f336f4cd832919e48d721e4e90186e63400b8596db6b82e7450R3641-R3642)).
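For illustration, a sketch of whole-backward capture. This uses the `torch.cuda.CUDAGraph` / `torch.cuda.graph` convenience APIs rather than the raw `capture_begin()`/`capture_end()` calls the linked test drives, and the model and input here are made up:
```python
import torch

model = torch.nn.Linear(8, 8).cuda()
static_input = torch.randn(4, 8, device="cuda")

# warmup on a side stream, as capture requires work off the default stream
s = torch.cuda.Stream()
s.wait_stream(torch.cuda.current_stream())
with torch.cuda.stream(s):
    for _ in range(3):
        model(static_input).sum().backward()
torch.cuda.current_stream().wait_stream(s)

# reset grads to None so backward() allocates .grad from the graph's private pool
for p in model.parameters():
    p.grad = None

g = torch.cuda.CUDAGraph()
with torch.cuda.graph(g):
    static_loss = model(static_input).sum()
    static_loss.backward()  # the whole backward is captured: no default-stream syncs remain

g.replay()  # reruns the captured forward + backward
```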

** The first paragraph has a formatting error, which this PR should also fix.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/60421

Reviewed By: VitalyFedyunin, albanD

Differential Revision: D29342234

Pulled By: ngimel

fbshipit-source-id: 98e6be7fdd8550872f0a78f9a66cb8dfe75abf63
2021-06-23 23:35:24 -07:00
| Name | Last commit | Last commit date |
| --- | --- | --- |
| `_static` | DOC Improve documentation for LayerNorm (#59178) | 2021-06-07 14:34:10 -07:00 |
| `_templates` | Remove master documentation from being indexable by search engines (#58056) | 2021-05-18 06:20:09 -07:00 |
| `community` | Lint trailing newlines (#54737) | 2021-03-30 13:09:52 -07:00 |
| `elastic` | [torch/elastic] Update the rendezvous docs (#58160) | 2021-05-12 16:54:28 -07:00 |
| `notes` | [CUDA graphs][BC-breaking] Removes post-backward syncs on default stream (#60421) | 2021-06-23 23:35:24 -07:00 |
| `rpc` | Forbid trailing whitespace (#53406) | 2021-03-05 17:22:55 -08:00 |
| `scripts` | Add mish activation function (#58648) | 2021-05-25 10:36:21 -07:00 |
| `__config__.rst` | Fix `__config__` docs (#48557) | 2020-11-29 23:57:06 -08:00 |
| `amp.rst` | Moves grid_sampler to autocast promote list (#58618) | 2021-06-21 10:22:36 -07:00 |
| `autograd.rst` | Add no-grad inference mode note (#58513) | 2021-05-25 13:06:54 -07:00 |
| `backends.rst` | Forbid trailing whitespace (#53406) | 2021-03-05 17:22:55 -08:00 |
| `benchmark_utils.rst` | Expand benchmark utils docs (#51664) | 2021-02-04 00:22:41 -08:00 |
| `bottleneck.rst` | [docs] Clarify more CUDA profiling gotchas in bottleneck docs (#6763) | 2018-04-19 13:15:27 -04:00 |
| `checkpoint.rst` | Stashing checkpointing RNG states based on devices of arg tensors (#14518) | 2018-12-11 09:48:45 -08:00 |
| `complex_numbers.rst` | Abladawood patch 1 (#58496) | 2021-05-20 10:32:18 -07:00 |
| `conf.py` | Use proper Google Analytics id (#56578) | 2021-05-04 13:23:16 -07:00 |
| `cpp_extension.rst` | correct some cpp extension code usages and documents (#39766) | 2020-06-10 08:31:22 -07:00 |
| `cpp_index.rst` | Add C++ Landing Page (#38450) | 2020-05-14 16:02:01 -07:00 |
| `cuda.rst` | breakup optim, cuda documentation (#55673) | 2021-04-14 12:44:00 -07:00 |
| `cudnn_persistent_rnn.rst` | Forbid trailing whitespace (#53406) | 2021-03-05 17:22:55 -08:00 |
| `cudnn_rnn_determinism.rst` | Forbid trailing whitespace (#53406) | 2021-03-05 17:22:55 -08:00 |
| `data.rst` | [DataLoader][doc] Randomness for base_seed generator and NumPy seed (#56528) | 2021-04-22 09:40:45 -07:00 |
| `ddp_comm_hooks.rst` | [Gradient Compression] Remove unnecessary warning on the rst file and the check on C++ version (#58170) | 2021-05-12 14:15:10 -07:00 |
| `distributed.elastic.rst` | [1/n][torch/elastic] Move torchelastic docs *.rst (#148) | 2021-05-04 00:57:56 -07:00 |
| `distributed.optim.rst` | [Reland] Update and expose ZeroRedundancyOptimizer docs (#53112) | 2021-03-02 14:16:12 -08:00 |
| `distributed.rst` | [reland] Document debugability features in torch.distributed (#59726) | 2021-06-09 16:40:11 -07:00 |
| `distributions.rst` | Add sample validation for LKJCholesky.log_prob (#52763) | 2021-02-25 16:12:29 -08:00 |
| `dlpack.rst` | Lint trailing newlines (#54737) | 2021-03-30 13:09:52 -07:00 |
| `docutils.conf` | Revert "Revert D21337640: [pytorch][PR] Split up documentation into subpages and clean up some warnings" (#37778) | 2020-05-04 14:32:35 -07:00 |
| `fft.rst` | Use autosummary on torch.fft, torch.linalg (#55748) | 2021-04-13 12:02:36 -07:00 |
| `futures.rst` | Update docs to mention CUDA support for Future (#50048) | 2021-05-11 08:26:33 -07:00 |
| `fx.rst` | [FX][docs][EZ] Fix link to fuser example (#59670) | 2021-06-08 17:32:55 -07:00 |
| `hub.rst` | Add a torch.hub.load_local() function that can load models from any local directory with a hubconf.py (#44204) | 2020-09-21 14:17:21 -07:00 |
| `index.rst` | add torch.testing to docs (#57247) | 2021-05-07 09:16:39 -07:00 |
| `jit_builtin_functions.rst` | Lint trailing newlines (#54737) | 2021-03-30 13:09:52 -07:00 |
| `jit_language_reference_v2.rst` | Fix hasattr support type (#57950) | 2021-05-10 12:21:56 -07:00 |
| `jit_language_reference.rst` | add type annotations to torch.nn.modules.conv (#49564) | 2021-01-15 11:16:11 -08:00 |
| `jit_python_reference.rst` | [JIT] improve documentation (#57991) | 2021-05-19 11:47:32 -07:00 |
| `jit_unsupported.rst` | [JIT] Update docs for recently added features (#45232) | 2020-09-28 18:17:42 -07:00 |
| `jit.rst` | Remove caption for Lang Reference (#56526) | 2021-04-20 14:33:42 -07:00 |
| `linalg.rst` | Add torch.linalg.inv_ex without checking for errors by default (#58039) | 2021-05-13 09:42:15 -07:00 |
| `math-quantizer-equation.png` | adding quantization.rst file for quantization feature (#27559) | 2019-10-09 16:45:09 -07:00 |
| `mobile_optimizer.rst` | Mod lists to neutral+descriptive terms in caffe2/docs (#49803) | 2020-12-23 11:37:11 -08:00 |
| `model_zoo.rst` | add/move a few apis in torch.hub (#18758) | 2019-04-10 23:10:39 -07:00 |
| `multiprocessing.rst` | Forbid trailing whitespace (#53406) | 2021-03-05 17:22:55 -08:00 |
| `name_inference.rst` | Abladawood patch 1 (#58496) | 2021-05-20 10:32:18 -07:00 |
| `named_tensor.rst` | Forbid trailing whitespace (#53406) | 2021-03-05 17:22:55 -08:00 |
| `nn.functional.rst` | Add mish activation function (#58648) | 2021-05-25 10:36:21 -07:00 |
| `nn.init.rst` | Bag of documentation fixes; fix more sphinx warnings (#27850) | 2019-10-15 07:31:14 -07:00 |
| `nn.rst` | ENH Adds nn.ReflectionPad3d (#59791) | 2021-06-21 10:53:14 -07:00 |
| `onnx.rst` | Update Autograd Export Docs (#56594) (#59534) | 2021-06-15 12:23:00 -07:00 |
| `optim.rst` | To add Rectified Adam Algorithm to Optimizers (#58968) | 2021-06-23 18:27:57 -07:00 |
| `package.rst` | [package] fix tutorial link (#60113) | 2021-06-16 11:27:25 -07:00 |
| `pipeline.rst` | Add tutorials to pipeline docs. (#55209) | 2021-04-05 20:01:00 -07:00 |
| `profiler.rst` | docs: fix profiler docstring (#55750) | 2021-04-13 00:23:14 -07:00 |
| `quantization-support.rst` | [docs][quant] Add fx graph mode quant api doc (#55306) | 2021-04-05 13:56:23 -07:00 |
| `quantization.rst` | quantization: improve documentation on natively supported backends (#58925) | 2021-06-07 17:29:03 -07:00 |
| `random.rst` | Remove duplicated entries in random.rst (#39725) | 2020-06-10 16:51:15 -07:00 |
| `rpc.rst` | Add a disclaimer about limited CUDA support in RPC (#58023) | 2021-05-12 00:11:22 -07:00 |
| `sparse.rst` | Add CSR (compressed sparse row) layout for sparse tensors (#50937) | 2021-04-12 10:09:12 -07:00 |
| `special.rst` | Implement erfcx() (#58194) | 2021-06-22 12:38:38 -07:00 |
| `storage.rst` | Lint trailing newlines (#54737) | 2021-03-30 13:09:52 -07:00 |
| `tensor_attributes.rst` | Remove legacy constructor calls from pytorch codebase. (#54142) | 2021-04-11 15:45:17 -07:00 |
| `tensor_view.rst` | Conjugate View (#54987) | 2021-06-04 14:12:41 -07:00 |
| `tensorboard.rst` | Add method add_hparams to API doc (#27344) | 2019-10-03 17:07:45 -07:00 |
| `tensors.rst` | Implement histogram operator on CPU (#58780) | 2021-06-22 10:06:04 -07:00 |
| `testing.rst` | add torch.testing to docs (#57247) | 2021-05-07 09:16:39 -07:00 |
| `torch.nn.intrinsic.qat.rst` | [quantization] Add some support for 3d operations (#50003) | 2021-03-10 16:40:35 -08:00 |
| `torch.nn.intrinsic.quantized.rst` | Lint trailing newlines (#54737) | 2021-03-30 13:09:52 -07:00 |
| `torch.nn.intrinsic.rst` | [quantization] Add some support for 3d operations (#50003) | 2021-03-10 16:40:35 -08:00 |
| `torch.nn.qat.rst` | Lint trailing newlines (#54737) | 2021-03-30 13:09:52 -07:00 |
| `torch.nn.quantized.dynamic.rst` | Forbid trailing whitespace (#53406) | 2021-03-05 17:22:55 -08:00 |
| `torch.nn.quantized.rst` | [quant] add docs for embedding/embedding_bag (#51770) | 2021-02-05 11:43:15 -08:00 |
| `torch.overrides.rst` | Add documentation for torch.overrides submodule. (#48170) | 2020-11-30 11:25:31 -08:00 |
| `torch.quantization.rst` | Lint trailing newlines (#54737) | 2021-03-30 13:09:52 -07:00 |
| `torch.rst` | Implement histogram operator on CPU (#58780) | 2021-06-22 10:06:04 -07:00 |
| `type_info.rst` | DOC: split quantization.rst into smaller pieces (#41321) | 2020-07-25 23:59:40 -07:00 |