Preview: https://docs-preview.pytorch.org/pytorch/pytorch/157750/export.html
Changes:
* Renamed draft_export.md -> export.draft_export.md for consistency.
* Removed the non-strict section in export and pointed to the programming model doc instead.
* Extended the "Expressing Dynamism" section to include Dim hints, ShapesCollection, and AdditionalInputs (illustrated below).
* Removed the Specialization section in favor of the programming model doc.
* Added a pt2 archive doc.
* Cleaned up the sidebar.
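For illustration, a hedged sketch of the Dim hint and ShapesCollection usage now covered in that section (the module `M` is a made-up example; `AdditionalInputs` is also covered in the doc but not shown here):
```python
import torch
from torch.export import Dim, ShapesCollection, export

class M(torch.nn.Module):
    def forward(self, x):
        return x * 2

x = torch.randn(4, 8)

# Dim hint: let export infer a dynamic batch dimension.
ep = export(M(), (x,), dynamic_shapes={"x": {0: Dim.AUTO}})

# ShapesCollection: attach dynamic-shape specs to concrete example tensors
# (None marks a dimension that stays static).
shapes = ShapesCollection()
shapes[x] = (Dim.AUTO, None)
ep = export(M(), (x,), dynamic_shapes=shapes)
```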
Pull Request resolved: https://github.com/pytorch/pytorch/pull/157750
Approved by: https://github.com/pianpwk
Summary: In PyTorch 2.5 we added source code attribution to PT2 traces; each Torch-Compiled Region now has its frame id and frame compile id associated with it. This PR updates the image in the doc and adds a description of this behavior to the doc itself.
Test Plan:
{F1980179183}
Rollback Plan:
Differential Revision: D78118228
Pull Request resolved: https://github.com/pytorch/pytorch/pull/158066
Approved by: https://github.com/aaronenyeshi
This PR adds a new config, `backward_pass_autocast`, to set the autocast
behavior of the backward pass. It does not change the existing default behavior.
The reason we need this is that torch.compile captures both the forward and
backward graphs at the time of the forward pass. Implemented naively, this
means that any context managers active around the call to torch.compile (such
as autocast) also apply to the captured backward graph, even though the eager
backward would run outside those contexts. This PR gives users a way to
tweak the autocast behavior of the backward pass.
Please see torch._functorch.config for the options to the
`backward_pass_autocast` config.
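A minimal, hedged sketch of how this might be used; the value assigned below is illustrative only, and the actual accepted option names/values are the ones documented in `torch._functorch.config`:
```python
import torch
import torch._functorch.config as functorch_config

# Illustrative only: choose the backward-pass autocast policy before the graphs
# are captured. "same_as_forward" is an assumed value, not a confirmed option name.
functorch_config.backward_pass_autocast = "same_as_forward"

@torch.compile
def f(x):
    return (x.sin() * x).sum()

x = torch.randn(8, requires_grad=True)

# Both the forward and backward graphs are captured during this forward call,
# so the ambient autocast context can affect the backward graph as well.
with torch.autocast("cpu", dtype=torch.bfloat16):
    out = f(x)

out.backward()
```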
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156356
Approved by: https://github.com/bdhirsh
ghstack dependencies: #155354
The dtype documentation has not been updated in a while; let's do a revamp.
1. combine the duplicated docs for dtypes from `tensors.rst` and `tensor_attributes.rst` to live in `tensor_attributes.rst`, and link to that page from `tensors.rst`
2. split the dtype table into floating point and integer dtypes
3. add the definition of shell dtype
4. add the float8 and MX dtypes as shell dtypes to the dtype table (see the brief example after this list)
5. remove legacy quantized dtypes from the table
6. add the definition of various dtype suffixes ("fn", etc)
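For illustration, a hedged float8 example (assuming a build where `torch.float8_e4m3fn` is available):
```python
import torch

# Convert an fp32 tensor to the float8 "e4m3fn" dtype; the "fn" suffix marks a
# variant with no infinity representation, as defined in the updated docs.
x = torch.randn(4, dtype=torch.float32)
x_fp8 = x.to(torch.float8_e4m3fn)
print(x_fp8.dtype)  # torch.float8_e4m3fn
```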
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156087
Approved by: https://github.com/albanD
Based on the [conversation](https://github.com/pytorch/pytorch/issues/121791), we plan to drop "highest, high, medium" as the way to represent fp32 internal computation data types. Instead, we will directly name the algorithm used.
### Design Choice: Directly use algorithm names like "TF32" and "BF16".
#### Pros
- The names are more informative: 'tf32' says more than a generic "high".
- Easier to extend to new algorithms like `tf32x3`.
#### Cons
- "HIGHEST, HIGH, MEDIUM" indicated the relative precision between different algorithms. However, we can have more documents to discuss them.
### We provide a layered structure for backends/operators.
('f32' is short for 'fp32_precision')

### Three fp32 compute precisions can be set (plus the unset value "none"):
- **"ieee"**: Not allowed to use any other internal computation data type.
- **"tf32"**: Allowed to use tf32 as the internal computation data type.
- **"bf16"**: Allowed to use bf16 as the internal computation data type.
- **"none"**: Precision is not set; it can be overridden by its parent node.
### Overriding Precision Settings
A child node is overridden by its parent node if the child is set to the default ("none").
The current default settings are:
```
backend = generic, op = all, precision setting = none
backend = cuda, op = all, precision setting = none
backend = cuda, op = conv, precision setting = tf32
backend = cuda, op = rnn, precision setting = tf32
backend = cuda, op = matmul, precision setting = none
backend = mkldnn, op = all, precision setting = none
backend = mkldnn, op = conv, precision setting = none
backend = mkldnn, op = rnn, precision setting = none
backend = mkldnn, op = matmul, precision setting = none
```
- If the user sets `torch.backends.mkldnn.fp32_precision="bf16"`, its child nodes `torch.backends.mkldnn.matmul.fp32_precision` / `torch.backends.mkldnn.conv.fp32_precision` / `torch.backends.mkldnn.rnn.fp32_precision` will also be overridden to "bf16".
- If the user sets `torch.backends.fp32_precision="bf16"`, `torch.backends.mkldnn.fp32_precision` and its child nodes will also be overridden to "bf16" (see the sketch below).
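A minimal sketch of this override behavior, using the attribute names mentioned above (the resolved values shown are what the description implies; availability may depend on backend support in a given build):
```python
import torch

# Setting the parent node for the mkldnn backend...
torch.backends.mkldnn.fp32_precision = "bf16"

# ...children left at the default "none" are expected to resolve to the parent's value.
print(torch.backends.mkldnn.matmul.fp32_precision)  # expected: "bf16"
print(torch.backends.mkldnn.conv.fp32_precision)    # expected: "bf16"

# A child that is explicitly set keeps its own value and is not overridden.
torch.backends.mkldnn.rnn.fp32_precision = "ieee"
print(torch.backends.mkldnn.rnn.fp32_precision)     # "ieee"
```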
### Backward Compatibility
Since the new API allows more fine-grained control, it can express states the old API cannot. For example, the previous `torch.backends.cudnn.allow_tf32` flag cannot represent the combination `torch.backends.cudnn.rnn.fp32_precision="ieee"` and `torch.backends.cudnn.conv.fp32_precision="tf32"`. Therefore, our backward-compatibility goals are:
- If the user only uses the previous APIs, they behave as before.
- If the user uses the **new** API to reach a state that the **old** API cannot represent and then reads that state through the **old** API, we raise a RuntimeError that points the user to the documentation (sketched below).
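A hedged sketch of that rule; the exact error type and message are assumptions based on this description:
```python
import torch

# Put cuDNN ops into a state the legacy boolean flag cannot summarize.
torch.backends.cudnn.conv.fp32_precision = "tf32"
torch.backends.cudnn.rnn.fp32_precision = "ieee"

try:
    _ = torch.backends.cudnn.allow_tf32  # legacy read of an unrepresentable state
except RuntimeError as e:
    print("legacy API raised:", e)
```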
### Test Plan
```
python test/test_cuda.py -k test_fp32_precision_with_tf32
python test/test_cuda.py -k test_fp32_precision_with_float32_matmul_precision
python test/test_cuda.py -k test_invalid_status_for_legacy_api
python test/test_mkldnn.py -k test_mlkdnn_get_set
python test/test_mkldnn.py -k test_generic_precision
python test/test_mkldnn.py -k test_invalid
python test/test_mkldnn.py -k test_default_use_parent
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/125888
Approved by: https://github.com/jgong5, https://github.com/albanD
Co-authored-by: Jiang, Yanbing <yanbing.jiang@intel.com>
# Motivation
Update the doc to make `torch.device`'s constructor officially support the following calling conventions (see the sketch after this list):
- A device string, which is a string representation of the device type and optionally the device ordinal.
- A device type and a device ordinal.
- A bare device ordinal, which is interpreted as a device of the current accelerator type.
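For illustration, the three forms look like this ("cuda" is just an example device type):
```python
import torch

d1 = torch.device("cuda:0")   # device string: type plus optional ordinal
d2 = torch.device("cuda", 0)  # device type and device ordinal
d3 = torch.device(0)          # bare ordinal, interpreted on the current accelerator type
```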
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156686
Approved by: https://github.com/albanD
This is a remix of https://github.com/pytorch/pytorch/pull/155558
Instead of mediating guard collectives via a config option, this PR does it via a `set_stance`-like API. The motivation is that checking the config value on every entry into torch.compile is apparently quite expensive, according to functorch_maml_omniglot, so this approach makes it a bit cheaper.
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156562
Approved by: https://github.com/Microve
Implement Attention-23 using sdpa and flexattention.
- I used copilot for this.
- Also updated the conversion logic to remove trailing None inputs.
@gramalingam @kunal-vaishnavi @titaiwangms
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156431
Approved by: https://github.com/titaiwangms
Co-authored-by: kunal-vaishnavi <115581922+kunal-vaishnavi@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Fixes #128796
This PR adds documentation about the behavior of division by zero operations in PyTorch's autograd system. The documentation explains:
1. How division by zero produces `inf` values following IEEE-754 floating point arithmetic
2. How autograd handles these cases and why masking after division can lead to `nan` gradients
3. Provides concrete examples showing the issue
4. Recommends two solutions (sketched after this list):
- Masking before division
- Using MaskedTensor (experimental API)
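A hedged sketch of the pitfall and the mask-before-division fix (the exact examples in the notes may differ):
```python
import torch

x = torch.tensor([0.0, 1.0], requires_grad=True)

# Masking *after* division: 1/x still produces inf at x == 0, and the backward
# pass turns the masked-out entry's gradient into nan (0 * inf).
y = torch.where(x == 0, torch.zeros_like(x), 1.0 / x)
y.sum().backward()
print(x.grad)  # tensor([nan, -1.])

# Masking *before* division avoids ever evaluating 1/0.
x.grad = None
safe_x = torch.where(x == 0, torch.ones_like(x), x)
y = torch.where(x == 0, torch.zeros_like(x), 1.0 / safe_x)
y.sum().backward()
print(x.grad)  # tensor([0., -1.])
```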
The documentation is added to the autograd notes section, making it easily discoverable for users who encounter this common issue.
This addresses the original issue #128796, which requested better documentation of this behavior to help users avoid common pitfalls when dealing with division by zero in their models.
Additional changes:
- Fixed formatting consistency by replacing curly apostrophes with straight apostrophes in the existing documentation
Pull Request resolved: https://github.com/pytorch/pytorch/pull/155987
Approved by: https://github.com/soulitzer
Co-authored-by: sekyondaMeta <127536312+sekyondaMeta@users.noreply.github.com>