add docstring for adam differentiable parameter (#91881)

Fixes #90467

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91881
Approved by: https://github.com/janeyx99
Nouran Ali 2023-01-13 17:08:27 +00:00 committed by PyTorch MergeBot
parent 8f1c3c68d3
commit a60125e298


@@ -105,6 +105,10 @@ class Adam(Optimizer):
capturable (bool, optional): whether this instance is safe to capture in a CUDA graph.
Passing True can impair ungraphed performance, so if you don't intend to
graph capture this instance, leave it False (default: False)
differentiable (bool, optional): whether autograd should occur through the optimizer step
in training. Otherwise, the step() function runs in a torch.no_grad() context.
Setting to True can impair performance, so leave it False if you don't intend to run
autograd through this instance (default: False)
fused (bool, optional): whether the fused implementation (CUDA only) is used.
Currently, `torch.float64`, `torch.float32`, `torch.float16`, and `torch.bfloat16`
are supported. Since the fused implementation is usually significantly faster than
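
For context, here is a minimal sketch of how the `differentiable` flag documented above can be exercised; it is not part of this PR. The tensor values, learning rate, and toy loss are illustrative, and the pattern (updating a non-leaf clone whose `.grad` is itself a tensor that requires grad) roughly mirrors how PyTorch's differentiable-optimizer tests drive `step()`.

```python
import torch

# Quantities we want gradients for: the initial parameter value and the
# gradient fed to the optimizer (both require grad).
theta = torch.tensor([1.0, 2.0], requires_grad=True)
g = torch.tensor([0.5, -1.0], requires_grad=True)

# Apply the update to a non-leaf clone so the in-place step stays on the
# autograd graph (non-leaf params are accepted when differentiable=True).
w = theta.clone()
w.grad = g

# differentiable=True: step() is not wrapped in torch.no_grad(), so the
# Adam update itself is recorded by autograd.
opt = torch.optim.Adam([w], lr=0.1, differentiable=True)
opt.step()

# A loss measured after the update; backward flows through the Adam step
# back to both theta and g.
loss_after_step = (w ** 2).sum()
loss_after_step.backward()
print(theta.grad, g.grad)
```

With the default `differentiable=False`, `step()` runs under `torch.no_grad()`, so no gradient would flow back through the update.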