Fixes #128796

This PR adds documentation about the behavior of division-by-zero operations in PyTorch's autograd system. The documentation explains:

1. How division by zero produces `inf` values following IEEE-754 floating-point arithmetic
2. How autograd handles these cases and why masking after division can lead to `nan` gradients
3. Concrete examples showing the issue
4. Two recommended solutions:
   - Masking before division
   - Using MaskedTensor (an experimental API)

The documentation is added to the autograd notes section, making it easy to discover for users who encounter this common issue. This addresses issue #128796, which requested better documentation of this behavior to help users avoid common pitfalls when dealing with division by zero in their models.

Additional changes:
- Fixed formatting consistency by replacing curly apostrophes with straight apostrophes in the existing documentation

Pull Request resolved: https://github.com/pytorch/pytorch/pull/155987
Approved by: https://github.com/soulitzer
Co-authored-by: sekyondaMeta <127536312+sekyondaMeta@users.noreply.github.com>
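A minimal sketch of the pitfall the new documentation describes and of the mask-before-division fix it recommends. The tensor values and variable names below are illustrative only and are not taken from the PR:

```python
import torch

# Pitfall: divide first, mask afterwards.
x = torch.tensor([0.0, 1.0, 2.0], requires_grad=True)
y = 1.0 / x                                   # y[0] is inf (IEEE-754 division by zero)
masked = torch.where(x == 0, torch.zeros_like(y), y)
masked.sum().backward()
print(x.grad)                                 # tensor([nan, -1.0000, -0.2500]) -- nan leaks in

# Fix: mask *before* dividing, so the division never sees a zero denominator.
x = torch.tensor([0.0, 1.0, 2.0], requires_grad=True)
safe_x = torch.where(x == 0, torch.ones_like(x), x)        # placeholder value where x == 0
y = torch.where(x == 0, torch.zeros_like(x), 1.0 / safe_x)
y.sum().backward()
print(x.grad)                                 # tensor([0.0000, -1.0000, -0.2500]) -- no nan
```

In the first version the zero-gradient branch of `torch.where` is still multiplied against the `inf` produced by `1.0 / 0.0` during the backward pass, yielding `0 * inf = nan`; in the second version the division never sees a zero denominator, so no `inf` (and hence no `nan`) can appear.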
- amp_examples.rst
- autograd.rst
- broadcasting.rst
- cpu_threading_runtimes.svg
- cpu_threading_torchscript_inference.rst
- cpu_threading_torchscript_inference.svg
- cuda.rst
- custom_operators.rst
- ddp.rst
- extending.func.rst
- extending.rst
- faq.rst
- fsdp.rst
- get_start_xpu.rst
- gradcheck.rst
- hip.rst
- large_scale_deployments.rst
- libtorch_stable_abi.md
- modules.rst
- mps.rst
- multiprocessing.rst
- numerical_accuracy.rst
- out.rst
- randomness.rst
- serialization.rst
- windows.rst