pytorch/torch/utils
David Berard c3ecabf059 [inductor][triton pin] add support for new TMA API for mm.py templates (#155723)
Triton 3.4 will remove the experimental TMA APIs: https://github.com/triton-lang/triton/pull/6488

For mm.py templates, this PR adds support for using the new APIs when they are available (and otherwise falls back to the experimental APIs).

For flex_attention, we'll remove TMA support for Triton 3.2 and 3.3 (versions of Triton that don't have the new API).

For mm_scaled_grouped.py, https://github.com/pytorch/pytorch/pull/150944 will remove TMA support for Triton 3.2.

Note: we attempted this earlier with https://github.com/pytorch/pytorch/pull/154858, but this broke TMA usage in Triton 3.2.
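The "use the new API when available, otherwise fall back" approach amounts to runtime feature detection against the installed Triton. A minimal sketch of that pattern is below; the attribute names (`make_tensor_descriptor`, `_experimental_descriptor_load`) are illustrative assumptions about the two API generations, not necessarily the exact symbols the PR gates on:

```python
def detect_tma_api() -> str:
    """Report which Triton TMA API generation is importable.

    Returns "new" for the stable tensor-descriptor API, "experimental"
    for the older experimental API, or "none" if neither (or Triton
    itself) is available. Attribute names are assumptions for
    illustration, not the exact checks used in torch/utils/_triton.py.
    """
    try:
        import triton.language as tl
    except ImportError:
        return "none"  # Triton not installed at all
    if hasattr(tl, "make_tensor_descriptor"):
        return "new"  # assumed name for the stable TMA descriptor API
    if hasattr(tl, "_experimental_descriptor_load"):
        return "experimental"  # assumed name for the pre-3.4 experimental API
    return "none"


# A template would then pick one code path up front, e.g.:
# use_new_tma = detect_tma_api() == "new"
```

Gating on feature presence rather than on a version string is what lets the same template source work across the Triton pin range without breaking older installs (the failure mode the earlier attempt in #154858 hit).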

Differential Revision: [D76444471](https://our.internmc.facebook.com/intern/diff/D76444471)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/155723
Approved by: https://github.com/NikhilAPatel
2025-06-12 06:25:47 +00:00
| Name | Last commit message | Last commit date |
|------|---------------------|------------------|
| _strobelight | [BE]: Enable ruff rule SIM113 (#147290) | 2025-02-16 22:41:16 +00:00 |
| _sympy | Migrate from lru_cache to cache (#155613) | 2025-06-11 19:44:18 +00:00 |
| backcompat | | |
| benchmark | [BE]: Backport runtime_checkable perf improvements/behavior from 3.12 (#155130) | 2025-06-06 13:28:05 +00:00 |
| bottleneck | | |
| data | remove allow-untyped-defs from torch/utils/data/datapipes/iter/filelister.py (#154624) | 2025-05-30 08:38:05 +00:00 |
| hipify | [BE] Delete pre-CUDA-10.1 code from SparseCUDABlas (#155079) | 2025-06-04 03:29:24 +00:00 |
| jit | | |
| model_dump | PEP585: More UP006 fixes (#146392) | 2025-02-20 06:18:13 +00:00 |
| serialization | Make record/storage alignment in torch.save configurable (#147788) | 2025-03-06 12:04:46 +00:00 |
| tensorboard | Fix broken URLs (#152237) | 2025-04-27 09:56:42 +00:00 |
| viz | [Visualizer] Start at index with most events (#154571) | 2025-05-29 20:49:33 +00:00 |
| __init__.py | | |
| _appending_byte_serializer.py | Check integrity of bytes in AppendingByteSerializer (#152139) | 2025-04-26 18:10:58 +00:00 |
| _backport_slots.py | | |
| _config_module.py | inductor codecache: include private inductor configs in cache key (#153672) | 2025-06-11 01:33:24 +00:00 |
| _config_typing.pyi | inductor codecache: include private inductor configs in cache key (#153672) | 2025-06-11 01:33:24 +00:00 |
| _content_store.py | Revert "Use the device interface for detecting Triton availability (#139171)" | 2025-03-11 18:49:21 +00:00 |
| _contextlib.py | | |
| _cpp_embed_headers.py | [BE] Strip #pragma once when embedding the headers (#146871) | 2025-02-11 16:49:00 +00:00 |
| _cpp_extension_versioner.py | xpu: support sycl with torch.utils.cpp_extension APIs (#132945) | 2025-02-16 16:50:59 +00:00 |
| _cxx_pytree.py | [BE] detect CXX pytree requirement with TorchVersion (#151102) | 2025-05-01 18:55:57 +00:00 |
| _device.py | Remove torch functions that do not support device arguments from _device_constructor (#150290) | 2025-04-08 15:13:55 +00:00 |
| _dtype_abbrs.py | [BE] Migrate dtype_abbrs into one location (#152229) | 2025-04-28 03:52:47 +00:00 |
| _exposed_in.py | | |
| _filelock.py | | |
| _foreach_utils.py | [HPU] Add hpu to fused kernels supported devices (#148666) | 2025-03-07 04:28:33 +00:00 |
| _freeze.py | | |
| _functools.py | | |
| _get_clean_triton.py | [BE]Enhance _get_clean_triton.py to auto-generate launch_params if missing (#154666) | 2025-05-31 19:27:56 +00:00 |
| _helion.py | Migrate from lru_cache to cache (#155613) | 2025-06-11 19:44:18 +00:00 |
| _import_utils.py | | |
| _mode_utils.py | | |
| _ordered_set.py | [BE]: Make OrderedSet reversible (#146904) | 2025-02-13 15:11:48 +00:00 |
| _python_dispatch.py | | |
| _pytree.py | Preserve Enum types during torch.export serialization and deserialization (#154821) | 2025-06-08 17:30:31 +00:00 |
| _stats.py | | |
| _thunk.py | | |
| _traceback.py | | |
| _triton.py | [inductor][triton pin] add support for new TMA API for mm.py templates (#155723) | 2025-06-12 06:25:47 +00:00 |
| _typing_utils.py | Revert "Fix type annotation of Linear.bias (#142326)" | 2025-01-26 03:41:00 +00:00 |
| _zip.py | | |
| backend_registration.py | | |
| bundled_inputs.py | | |
| checkpoint.py | [BE] Mention debug=True in AC error messages (#155593) | 2025-06-11 00:32:41 +00:00 |
| collect_env.py | Revert "Add Intel GPU info collection to the collect env script (#137846)" | 2025-06-11 15:18:47 +00:00 |
| cpp_backtrace.py | | |
| cpp_extension.py | [ROCm] cpp_extension allow user to override default flags (#152432) | 2025-05-15 21:06:18 +00:00 |
| deterministic.py | | |
| dlpack.py | Add __all__ for torch.utils.dlpack (#149026) | 2025-04-11 22:03:24 +00:00 |
| file_baton.py | Warn user of existing lock file to avoid infinite waiting (#149382) | 2025-04-15 20:25:29 +00:00 |
| flop_counter.py | Revert "Inductor logging + analysis of torch.profile (#149697)" | 2025-06-10 15:38:40 +00:00 |
| hooks.py | Add warning for module full backward hook when no input requires gradient (#155339) | 2025-06-10 04:42:06 +00:00 |
| mkldnn.py | | |
| mobile_optimizer.py | | |
| model_zoo.py | | |
| module_tracker.py | | |
| show_pickle.py | Use typing.IO[bytes] instead of io.BytesIO in annotations (#144994) | 2025-01-27 18:08:07 +00:00 |
| throughput_benchmark.py | | |
| weak.py | pymft lint torch/utils/weak.py (#154484) | 2025-05-28 17:06:58 +00:00 |