The `AttrsDescriptor` class has been present in Triton for almost a year now (introduced [here](72c9833927)), so we should be able to rely on it existing. I am in the process of supporting the new `AttrsDescriptor` class and @jansel suggested I split changes to the existing class out separately to make sure nothing breaks removing the legacy attribute descriptor attributes.
Initially I attempted to remove the branching around detecting whether `AttrsDescriptor` exists but that breaks because PyTorch must build without Triton. So, I went back and updated for the naming introduced in the commit linked above, and also removed two unused attributes `divisible_by_8` and `ids_to_fold` which were removed in Feb 2024 (https://github.com/triton-lang/triton/pull/3122 and https://github.com/triton-lang/triton/pull/3080 respectively).
With these changes only the internal workings of the `AttrsDescriptor` class will differ between supported Triton versions, but the data stored will remain consistent.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/137458
Approved by: https://github.com/jansel
Summary: Previous PR forgets to change two other places that also create `constants` and `signature`.
Test Plan:
Imported from GitHub, without a `Test Plan:` line.
{F1884584338}
Differential Revision: D63027728
Pulled By: Myrthan
Co-authored-by: Jokeren <robinho364@gmail.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136514
Approved by: https://github.com/jansel
Co-authored-by: Jokeren <robinho364@gmail.com>
For a masked `tl.load` operation, the Triton language specifies that values masked out (i.e. where the mask evaluates to false) are undefined in the output of the load. Triton provides an optional `other` parameter which, when included, provides an explicit value to use for masked out values from the load. If the output from a masked load without the `other` parameter is used in a conditional, unexpected behavior can occur.
Despite the language specification, all Triton backends currently in use by PyTorch Inductor (NVIDIA, AMD, and Intel) 0-initialize masked loads if `other` is not present (we recently changed the Intel backend behavior to match NVIDIA and AMD because that's what our users expect, even if we are not following the Triton spec to the tee). This PR attempts to "future-proof" Inductor for new backends (or perhaps changes in the current backends? - we did not see any performance change from 0-initializing in the Intel XPU backend but one could imagine compiler optimizations to remove paths that depend on undefined) to add an explicit `other` in instances where later conditionals depend on the `tl.load` output. I also removed an exception to `other` behavior for boolean loads, which was put in place for a Triton bug that should be fixed. I added `other` to the getting started documentation as a clue that masked load behavior requires explicit initialization if, even though I don't expect `undef` values to cause the example code to fail if the underlying output is not 0-initialized. Finally, I added other to the `make_load` function in `select_algorithm.py`, though I wasn't able to determine if that function was actually being called.
Fixes#126535
Pull Request resolved: https://github.com/pytorch/pytorch/pull/127311
Approved by: https://github.com/jansel
triton_meta is intended to be passed directly to triton. Previous we were also putting other metadata into triton_meta; but we should split out the other metadata into a separate dict to avoid possible conficts in the future.
This PR splits out triton_meta and inductor_meta so we have a place to put additional metadata that isn't intended to be passed to triton.
Tests - wait for CI
Differential Revision: [D50864493](https://our.internmc.facebook.com/intern/diff/D50864493)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112351
Approved by: https://github.com/eellison
- Text says `Next, let’s try a real model like resnet50 from the PyTorch` but the code example uses `resnet18`. Fixed code to use `resnet50` for consistency.
- One of the examples in TorchDynamo Overview uses uncompiled model - fixed it - now it uses compiled model.
- Removed unused import to `_dynamo` in one of the examples
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107267
Approved by: https://github.com/soulitzer
Current torch.compile docs have become a bit of a mess with the docs expanded in the left nav. This PR moves them under the torch.compiler menu item in the left nav. A bunch of rewrites were made in collaboration with @msaroufim to address formatting issues, latest updates that moved some of the APIs to the public torch.compiler namespace were addressed as well. The documentation is broken down in three categories that address three main audiences: PyTorch users, Pytorch Developers and PyTorch backend vendors. While, the user-facing documentation was significantly rewritten, dev docs and vendor docs kept mostly untouched. This can be addressed in the follow up PRs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105376
Approved by: https://github.com/msaroufim