pytorch/torch
David Berard 62bac07981 [inductor][triton] support profile_scratch launcher arg (#159772)
This adds support for Triton after https://github.com/triton-lang/triton/pull/7258 landed. That PR adds a new argument to every Triton kernel: a profile_scratch argument, similar to global_scratch. This PR updates the static CUDA launcher and the AOTI kernel callers to pass these arguments when calling Triton kernels.
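
For illustration only, here is a minimal sketch of the version-gating idea: append the extra scratch pointers only when the compiled kernel's metadata declares them, so older Triton builds keep the old calling convention. The metadata field names (`global_scratch_size`, `profile_scratch_size`) and the flat pointer-style argument list are assumptions made for this sketch, not the actual static_cuda_launcher or AOTI code:

```python
import torch

def build_launch_args(kernel_metadata, user_args):
    """Return a flat argument list with optional scratch pointers appended.

    Hypothetical sketch: real launchers also need to keep the scratch buffers
    alive until the kernel launch completes.
    """
    args = list(user_args)
    # Kernels compiled by newer Triton declare scratch requirements in their
    # metadata; older Triton versions lack the corresponding field, so nothing
    # extra is appended and the previous calling convention is preserved.
    for field in ("global_scratch_size", "profile_scratch_size"):
        if hasattr(kernel_metadata, field):
            size = getattr(kernel_metadata, field) or 0
            if size > 0:
                scratch = torch.empty(size, dtype=torch.uint8, device="cuda")
                args.append(scratch.data_ptr())
            else:
                args.append(0)  # null pointer stands in for unused scratch
    return args
```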

Tests: https://github.com/pytorch/pytorch/pull/159158. I also verified these tests locally with Triton 3.2, 3.3, and 3.4.

Fixes:
* static_cuda_launcher (test/repro: `python tools/dynamo/verify_dynamo.py`)
* AOTI calling logic (test/repro: `TORCHINDUCTOR_CPP_WRAPPER=1 python test/inductor/test_torchinductor_opinfo.py -k test_comprehensive_linalg_vander_cuda_float32`)

Differential Revision: [D79825121](https://our.internmc.facebook.com/intern/diff/D79825121)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/159772
Approved by: https://github.com/NikhilAPatel, https://github.com/eellison
2025-08-08 14:27:38 +00:00
_awaits
_C Revert "Add unified memory APIs for torch.accelerator (#152932)" 2025-08-07 16:34:36 +00:00
_C_flatbuffer
_custom_op [BE]: ruff PLC0207 - use maxsplit kwarg (#160107) 2025-08-08 03:14:59 +00:00
_decomp (should_fold) gso to guard_or_false when checking whether to fold 3d bmm into 2d mm (#159184) 2025-07-30 03:12:14 +00:00
_dispatch Improve torch.ops typing (#154555) 2025-06-22 15:52:27 +00:00
_dynamo Fix infinite loop when iterating over an empty zip (#159673) 2025-08-08 02:50:21 +00:00
_export [Export Schema] Remove deviceAllocationMap field (#159653) 2025-08-07 07:31:42 +00:00
_functorch [MTIA] Allow users who know what they are doing to ignore all device mismatches in tracing and take a preferred device. (#159931) 2025-08-07 22:37:15 +00:00
_higher_order_ops [HOP, map] Rework of map autograd to the new interface (#153343) 2025-08-06 23:02:42 +00:00
_inductor [inductor][triton] support profile_scratch launcher arg (#159772) 2025-08-08 14:27:38 +00:00
_lazy [BE][2/16] fix typos in torch/ (torch/_*/) (#156312) 2025-07-12 05:47:06 +00:00
_library [inductor] respect layout tags for ops with registered lowerings (#159134) 2025-07-31 21:29:40 +00:00
_logging fix logging setup issue for Windows (#159887) 2025-08-05 23:44:38 +00:00
_numpy Fix torch._numpy to match NumPy when empty ellipsis causes advanced indexing separation (#158297) 2025-07-16 08:11:53 +00:00
_prims [BE]: ruff PLC0207 - use maxsplit kwarg (#160107) 2025-08-08 03:14:59 +00:00
_prims_common [Dynamo][Better Engineering] Add typing annotations to guard and source (#158397) (#159491) 2025-07-30 22:57:50 +00:00
_refs [BE][2/16] fix typos in torch/ (torch/_*/) (#156312) 2025-07-12 05:47:06 +00:00
_strobelight [BE][2/16] fix typos in torch/ (torch/_*/) (#156312) 2025-07-12 05:47:06 +00:00
_subclasses [MTIA] Allow users who know what they are doing to ignore all device mismatches in tracing and take a preferred device. (#159931) 2025-08-07 22:37:15 +00:00
_vendor
accelerator Revert "Add unified memory APIs for torch.accelerator (#152932)" 2025-08-07 16:34:36 +00:00
amp Fix autocast context manager when there is exception (#159565) 2025-08-01 02:12:24 +00:00
ao [BE]: ruff PLC0207 - use maxsplit kwarg (#160107) 2025-08-08 03:14:59 +00:00
autograd Fix types in graphs.py (#158192) 2025-07-15 19:49:38 +00:00
backends fixed typo (#159451) 2025-07-30 17:41:30 +00:00
compiler Add torch compile force disable caches alias (#158072) 2025-08-02 23:23:17 +00:00
contrib
cpu
csrc Extend torch function support to ALL arguments, not just scalar type (but not insides of list) (#145089) 2025-08-07 23:43:53 +00:00
cuda Revert "Add unified memory APIs for torch.accelerator (#152932)" 2025-08-07 16:34:36 +00:00
distributed [SymmMem] Send tensors with unerased type information to NVSHMEM Triton kernels (#159788) 2025-08-08 05:20:42 +00:00
distributions [BE][1/16] fix typos in torch/ (#156311) 2025-07-09 11:02:22 +00:00
export [export] Apply move_to_device_pass to all submodules (#159992) 2025-08-07 18:51:15 +00:00
fft [BE][PYFMT] migrate PYFMT for torch/[e-n]*/ to ruff format (#144553) 2025-06-17 08:18:47 +00:00
func
futures Simplify the base classes of _PyFutureMeta (#157757) 2025-07-08 15:39:56 +00:00
fx [BE]: ruff PLC0207 - use maxsplit kwarg (#160107) 2025-08-08 03:14:59 +00:00
headeronly [Reland] Migrate ScalarType to headeronly (#159911) 2025-08-06 07:36:37 +00:00
jit [4/n] Remove references to TorchScript in PyTorch docs (#158317) 2025-07-16 20:01:34 +00:00
legacy
lib [2/N] Fix cppcoreguidelines-init-variables suppression (#146237) 2025-06-19 23:26:42 +00:00
linalg
masked Fix MaskedTensor to device ignored mask (#151205) 2025-07-21 21:44:49 +00:00
monitor
mps [BE][12/16] fix typos in torch/ (#156602) 2025-07-02 22:55:29 +00:00
mtia [Re-land][Inductor] Support native Inductor as backend for MTIA (#159211) 2025-07-29 17:03:24 +00:00
multiprocessing [BE][12/16] fix typos in torch/ (#156602) 2025-07-02 22:55:29 +00:00
nativert turn on execution frame cleanup by default (#160110) 2025-08-08 02:13:48 +00:00
nested Add minimal nn.functional.log_softmax support for NestedTensor (#159662) 2025-08-06 20:34:02 +00:00
nn Allow register_buffer with Tensor-like object (#159455) 2025-08-01 15:31:38 +00:00
onnx Make onnx export SDPA match aten behavior (#159973) 2025-08-07 04:06:07 +00:00
optim Detach tensor before clone in SGD optimiser and other code (#159204) 2025-07-27 03:31:12 +00:00
package [BE][PYFMT] migrate PYFMT for torch/[p-z]*/ to ruff format (#144552) 2025-08-07 00:09:56 +00:00
profiler [BE][PYFMT] migrate PYFMT for torch/[p-z]*/ to ruff format (#144552) 2025-08-07 00:09:56 +00:00
quantization [BE][PYFMT] migrate PYFMT for torch/[p-z]*/ to ruff format (#144552) 2025-08-07 00:09:56 +00:00
signal [BE][PYFMT] migrate PYFMT for torch/[p-z]*/ to ruff format (#144552) 2025-08-07 00:09:56 +00:00
sparse [BE][PYFMT] migrate PYFMT for torch/[p-z]*/ to ruff format (#144552) 2025-08-07 00:09:56 +00:00
special [BE][PYFMT] migrate PYFMT for torch/[p-z]*/ to ruff format (#144552) 2025-08-07 00:09:56 +00:00
testing [BE]: ruff PLC0207 - use maxsplit kwarg (#160107) 2025-08-08 03:14:59 +00:00
utils dataclass pytree fix (#159916) 2025-08-07 08:22:41 +00:00
xpu [BE][PYFMT] migrate PYFMT for torch/[p-z]*/ to ruff format (#144552) 2025-08-07 00:09:56 +00:00
__config__.py
__future__.py
__init__.py [BE] remove torch deploy - conditionals (#158288) 2025-07-29 17:40:49 +00:00
_appdirs.py
_classes.py remove allow-untyped-defs from torch/_classes.py (#157231) 2025-07-08 00:11:52 +00:00
_compile.py [precompile] Ensure @disable()-ed function won't trigger recompile from precompile bytecode. (#155363) 2025-06-10 16:13:38 +00:00
_custom_ops.py
_environment.py
_guards.py [Dynamo][Better Engineering] Typing torch/_dynamo/guards.py (#159315) 2025-08-06 21:52:14 +00:00
_jit_internal.py [BE][1/16] fix typos in torch/ (#156311) 2025-07-09 11:02:22 +00:00
_linalg_utils.py Update is_sparse doc to mention that it is sparse_coo specific (#157378) 2025-07-09 18:22:14 +00:00
_lobpcg.py [BE][1/16] fix typos in torch/ (#156311) 2025-07-09 11:02:22 +00:00
_lowrank.py [BE][1/16] fix typos in torch/ (#156311) 2025-07-09 11:02:22 +00:00
_meta_registrations.py Add meta kernel for sdpa_math_for_mps (#159695) 2025-08-05 22:27:06 +00:00
_namedtensor_internals.py
_ops.py [BE] remove torch deploy - conditionals (#158288) 2025-07-29 17:40:49 +00:00
_python_dispatcher.py Typo fixes for "overridden" in comments and function names (#155944) 2025-06-14 03:37:38 +00:00
_size_docs.py
_sources.py
_storage_docs.py Fix docstring for torch.UntypedStorage.from_file (#155067) 2025-06-05 14:30:49 +00:00
_streambase.py
_tensor_docs.py Add missing optional for tensor ops (#159028) 2025-07-25 04:36:55 +00:00
_tensor_str.py Fix max_width computation in _tensor_str._Formatter (#126859) 2025-08-01 15:05:41 +00:00
_tensor.py [MPS] Enable dlpack integration (#158888) 2025-07-24 18:05:41 +00:00
_thread_safe_fork.py
_torch_docs.py Update the signature and test of torch.hamming_window() (#152682) 2025-08-04 17:50:42 +00:00
_utils_internal.py Wire in pt2_triton_builds (#159897) 2025-08-06 07:39:51 +00:00
_utils.py [BE][1/16] fix typos in torch/ (#156311) 2025-07-09 11:02:22 +00:00
_VF.py
_vmap_internals.py
_weights_only_unpickler.py
CMakeLists.txt Migrate c10/macros/cmake_macros.h.in to torch/headeronly (#158035) 2025-07-15 19:52:59 +00:00
custom_class_detail.h
custom_class.h [BE][1/16] fix typos in torch/ (#156311) 2025-07-09 11:02:22 +00:00
extension.h
functional.py Fix atleast_{1,2,3}d() with no arguments description (#156042) 2025-07-28 06:25:23 +00:00
header_only_apis.txt [Reland] Migrate ScalarType to headeronly (#159911) 2025-08-06 07:36:37 +00:00
hub.py [BE][1/16] fix typos in torch/ (#156311) 2025-07-09 11:02:22 +00:00
library.h [BE][1/16] fix typos in torch/ (#156311) 2025-07-09 11:02:22 +00:00
library.py [BE] remove torch deploy - conditionals (#158288) 2025-07-29 17:40:49 +00:00
overrides.py Add basic torch.hash_tensor op (#154149) 2025-07-23 22:28:03 +00:00
py.typed
quasirandom.py
random.py
return_types.py
script.h
serialization.py Reduce random reads for offset metadata when calling torch.load under FakeTensorMode (#157931) 2025-07-17 22:17:52 +00:00
storage.py mypy 1.16.0 (#155821) 2025-06-14 18:18:43 +00:00
torch_version.py
types.py
version.py.tpl