pytorch/torch/_inductor/codegen
Pian Pawakapan abadea70f3 [inductor] thread hint_override in more kernel args (#164494)
ensure hint_override is threaded in benchmarking args

Pull Request resolved: https://github.com/pytorch/pytorch/pull/164494
Approved by: https://github.com/bobrenjc93
2025-10-03 22:07:12 +00:00
aoti_runtime [AOTI win] Add ABI stable method for updating constant buffer (#163819) 2025-10-02 18:31:00 +00:00
cuda [cutlass-4][take 2] upgrade to cutlass 4.2.1 (#164159) 2025-10-03 03:47:59 +00:00
cutedsl Refactor Provenance Tracking (#163378) 2025-09-25 22:55:59 +00:00
mtia [Re-land][Inductor] Support native Inductor as backend for MTIA (#159211) 2025-07-29 17:03:24 +00:00
rocm Refactor Provenance Tracking (#163378) 2025-09-25 22:55:59 +00:00
xpu [Inductor] Update Intel Triton for PyTorch 2.9. (#161050) 2025-08-25 17:18:19 +00:00
__init__.py
aoti_hipify_utils.py
block_analysis.py
common.py [inductor] add a runtime assert for triton shapes (#164242) 2025-10-01 18:55:33 +00:00
cpp_bmm_template.py
cpp_flex_attention_template.py [FlexAttn] Fix Paged Attention Accuracy via Upper Mask Mod and Prevent Invalid Memory Access (#160861) 2025-08-30 04:50:23 +00:00
cpp_gemm_template.py [CPU][GEMM Template] Improve A16W8 performance (#162479) 2025-09-18 01:28:37 +00:00
cpp_grouped_gemm_template.py
cpp_micro_gemm.py Add SVE128 ISA (#158932) 2025-09-29 14:49:19 +00:00
cpp_template_kernel.py [AOTInductor] ABI-Compatibility for RecordFunction. (#159842) 2025-08-15 21:45:47 +00:00
cpp_template.py [AOTInductor] ABI-Compatibility for RecordFunction. (#159842) 2025-08-15 21:45:47 +00:00
cpp_utils.py [doc]: Small typos (#162982) 2025-09-16 17:42:19 +00:00
cpp_wrapper_cpu_array_ref.py [Inductor-FX] Support IndexPutFallback (#162863) 2025-09-16 08:52:47 +00:00
cpp_wrapper_cpu.py [aoti] AOTI mingw cross compilation (#163188) 2025-10-01 02:22:06 +00:00
cpp_wrapper_gpu.py [aoti] AOTI mingw cross compilation (#163188) 2025-10-01 02:22:06 +00:00
cpp_wrapper_mps.py [aoti][mps] Initialize mps kernels first (#159753) 2025-08-06 07:54:29 +00:00
cpp.py [1/N] Fix ruff warnings (#164333) 2025-10-01 16:48:32 +00:00
cpu_device_op_overrides.py [Inductor][CPP] Reuse the pre-existing kernel for the same kernels (#158404) 2025-09-16 01:54:24 +00:00
cuda_combined_scheduling.py Add cutedsl template support to compile (#160108) 2025-08-18 04:37:15 +00:00
debug_utils.py
halide.py [Inductor] Add DeviceAssert op to enable device-side assertion in torch.compile (#160677) 2025-08-28 18:57:34 +00:00
memory_planning.py Fix unbacked symint and memory leak in inductor memory planning (#159839) 2025-08-11 17:16:15 +00:00
mps_device_op_overrides.py
mps.py [MPS] Add igamma/igammac ops (#161927) 2025-09-02 20:52:02 +00:00
multi_kernel.py [multi-kernel] shape-similarity kernel selection (#163090) 2025-09-23 21:00:47 +00:00
python_wrapper_mtia.py [Re-land][Inductor] Support native Inductor as backend for MTIA (#159211) 2025-07-29 17:03:24 +00:00
segmented_tree.py [inductor] dont reuse buffers if it affects peak (#145883) (#159530) 2025-08-19 19:02:56 +00:00
simd_kernel_features.py skip non memory deps in memory estimator (#164294) 2025-10-01 02:44:58 +00:00
simd.py [inductor][templates] Template hooks should be finalised inside a kernel context (#164229) 2025-09-30 17:50:59 +00:00
subgraph.py [inductor][mm] restructure decompose k (#161026) 2025-08-28 20:14:41 +00:00
triton_combo_kernel.py [Inductor] Fix ComboKernels failing due to missing helper functions (#162759) 2025-09-12 20:01:06 +00:00
triton_split_scan.py [inductor] propagate shapes in CSEVariable (#152198) 2025-08-19 16:46:38 +00:00
triton_utils.py Revert "[inductor] Fix issue with scalar arg handling" (#163737) 2025-09-24 07:33:12 +00:00
triton.py [inductor] thread hint_override in more kernel args (#164494) 2025-10-03 22:07:12 +00:00
wrapper_fxir.py Improved support for autotuning in wrapper_fxir (#164132) 2025-10-02 22:54:22 +00:00
wrapper.py [dynamic shapes] unbacked-safe slicing (#161414) 2025-09-30 01:15:19 +00:00