pytorch/torch/_inductor/codegen
eellison fd35be2fd3 TritonTemplate dtype fixes (#141991)
- Set the dtype of "acc" appropriately so that epilogue fusion will have args with dtype
- Update dtype propagation to use `type_to_dtype` instead of instantiating tensor
- Throw if we have a string arg where we should have a proper CSEVariable, unless we're in the Modification Subgraph path, which is not yet implemented (nyi). Everything else is appropriately typed (cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang @aakhundov @drisspg).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/141991
Approved by: https://github.com/drisspg
ghstack dependencies: #139945, #140057, #141495, #141882
2024-12-04 17:24:23 +00:00
aoti_runtime [AOTI][refactor] Separate header codegen (#138882) 2024-10-27 14:14:27 +00:00
cuda [CUTLASS] Lift shape & stride information as kernel args (#138611) 2024-11-25 17:52:33 +00:00
rocm [ROCm][Inductor][CK] Enable scaled mm with bias in gemm max autotune with CK backend (#140674) 2024-11-15 22:08:38 +00:00
xpu [AOTI XPU] Enable Cpp wraper for Intel GPU. (#135318) 2024-11-26 11:51:32 +00:00
__init__.py
aoti_hipify_utils.py [BE][Easy][16/19] enforce style for empty lines in import segments in torch/_i*/ (#129768) 2024-07-20 16:20:58 +00:00
block_analysis.py [Inductor] move block pointer analysis to a new module (#141733) 2024-11-30 23:21:24 +00:00
common.py TritonTemplate dtype fixes (#141991) 2024-12-04 17:24:23 +00:00
cpp_gemm_template.py [BE]: Update mypy to 1.13.0 (#140808) 2024-12-03 02:50:10 +00:00
cpp_micro_gemm.py Simplify & rectify dequantized B buffer loading for AMX GEMM micro-kernel for WoQ int8 case (#140258) 2024-11-22 01:34:06 +00:00
cpp_prefix.h std::value/std::type -> std::_v/std::_t (#138746) 2024-10-26 20:59:24 +00:00
cpp_template_kernel.py [inductor] Add typing to ir.py 2 (#140915) 2024-11-22 04:56:54 +00:00
cpp_template.py [AOTI] Remove the non-ABI-compatible mode (part 1) (#138009) 2024-10-17 02:48:26 +00:00
cpp_utils.py Move Sympy printers to torch/utils/_sympy/printers.py (#140597) 2024-11-26 18:11:00 +00:00
cpp_wrapper_cpu_array_ref.py [AOTI][refactor] Move stack allocation related configs (#139093) 2024-12-04 00:15:19 +00:00
cpp_wrapper_cpu.py Only write predicate once when there are multiple torch.cond (#141528) 2024-12-04 01:56:10 +00:00
cpp_wrapper_gpu.py [inductor] Refactor MutableBox to make IRNode typing easier (#140895) 2024-11-20 19:50:46 +00:00
cpp.py Broadcast constants on vectorised stores in CppTile2DKernel (#140262) 2024-12-03 09:15:17 +00:00
cpu_device_op_overrides.py Add Triton CPU as an Inductor backend (#133408) 2024-09-30 20:24:52 +00:00
cuda_combined_scheduling.py [BE]: Update mypy to 1.11.2 (#133816) 2024-09-16 19:44:11 +00:00
debug_utils.py [BE]: Apply PERF401 autofixes from ruff (#140980) 2024-11-20 17:52:07 +00:00
halide.py Revert "[Inductor] Represent tiling as a dict (#141751)" 2024-12-04 15:43:16 +00:00
memory_planning.py [inductor] Generalize WorkspaceArg for graph-level semaphores (#138170) 2024-10-18 23:05:54 +00:00
multi_kernel.py [Inductor] Use a helper function to tell if a tree or prefix is a reduction (#141738) 2024-11-30 22:38:13 +00:00
simd_kernel_features.py [inductor] Refactor reduction type choices into V.choices (#139585) 2024-11-17 16:10:37 +00:00
simd.py Revert "[Inductor] Represent tiling as a dict (#141751)" 2024-12-04 15:43:16 +00:00
triton_combo_kernel.py Revert "[Inductor] Represent tiling as a dict (#141751)" 2024-12-04 15:43:16 +00:00
triton_split_scan.py Revert "[Inductor] Represent tiling as a dict (#141751)" 2024-12-04 15:43:16 +00:00
triton_utils.py [inductor] Move V.graph.scheduler.current_device to V.graph.current_device (#138252) 2024-10-18 23:05:54 +00:00
triton.py TritonTemplate dtype fixes (#141991) 2024-12-04 17:24:23 +00:00
wrapper.py [user triton] Fix grid codegen for configs with empty kwargs (#141824) 2024-12-02 04:17:21 +00:00