pytorch/torch/_inductor/codegen
Bin Bao 0b151f260f [AOTI] Add an option to skip optimizing generated wrapper code (#144866)
Summary: In some cases, compiling the generated wrapper code with the C++ compiler takes a long time. To mitigate this, this PR adds an option to skip C++ compiler optimizations for the body of the generated main wrapper function.

D68174038

Pull Request resolved: https://github.com/pytorch/pytorch/pull/144866
Approved by: https://github.com/chenyang78, https://github.com/hl475
2025-01-18 01:44:21 +00:00
aoti_runtime Revert "cpp_wrapper: Move #includes to per-device header files (#143909)" 2025-01-17 00:36:38 +00:00
cuda Revert "cpp_wrapper: Move #includes to per-device header files (#143909)" 2025-01-17 00:36:38 +00:00
rocm [ROCm][Inductor][CK] hackfix for segfault in addmm op (#144519) 2025-01-10 19:29:14 +00:00
xpu Revert "cpp_wrapper: Move #includes to per-device header files (#143909)" 2025-01-17 00:36:38 +00:00
__init__.py
aoti_hipify_utils.py remove allow-untyped-defs from _inductor/codegen/aoti_hipify_utils.py (#143916) 2024-12-27 23:25:37 +00:00
block_analysis.py Add heuristic to fail block pointer match early (#144681) 2025-01-16 21:57:30 +00:00
common.py Revert "cpp_wrapper: Move #includes to per-device header files (#143909)" 2025-01-17 00:36:38 +00:00
cpp_bmm_template.py [inductor][cpu] Fix bmm b_index for dynamic expressions in inductor autotuner (#143141) 2025-01-05 18:02:37 +00:00
cpp_flex_attention_template.py Remove is_reduced_floating_point from namespace std (#144502) 2025-01-10 03:24:10 +00:00
cpp_gemm_template.py [Inductor][CPP] Enable Grouped GEMM Template (#143796) 2025-01-14 05:59:07 +00:00
cpp_grouped_gemm_template.py [Inductor][CPP] Enable Epilogue Fusion for Grouped GEMM Template (#143897) 2025-01-14 06:07:50 +00:00
cpp_micro_gemm.py [Fix]: Enable support for Arm Neon & SVE support for FP32 Gemm Wrapper (#144327) 2025-01-14 17:52:00 +00:00
cpp_prefix.h Remove is_reduced_floating_point from namespace std (#144502) 2025-01-10 03:24:10 +00:00
cpp_template_kernel.py [Inductor][CPP] Enable Epilogue Fusion for Grouped GEMM Template (#143897) 2025-01-14 06:07:50 +00:00
cpp_template.py [Inductor][CPP] Enable Grouped GEMM Template (#143796) 2025-01-14 05:59:07 +00:00
cpp_utils.py Migrate from Tuple -> tuple in torch/_inductor (#144264) 2025-01-07 03:27:27 +00:00
cpp_wrapper_cpu_array_ref.py [AOTI] Add an option to skip optimizing generated wrapper code (#144866) 2025-01-18 01:44:21 +00:00
cpp_wrapper_cpu.py [AOTI] Add an option to skip optimizing generated wrapper code (#144866) 2025-01-18 01:44:21 +00:00
cpp_wrapper_gpu.py Revert "cpp_wrapper: Move #includes to per-device header files (#143909)" 2025-01-17 00:36:38 +00:00
cpp.py [Inductor][CPP] Enable Epilogue Fusion for Grouped GEMM Template (#143897) 2025-01-14 06:07:50 +00:00
cpu_device_op_overrides.py remove allow-untyped-defs from _inductor/codegen/cpu_device_op_overrides.py (#143881) 2024-12-27 04:10:47 +00:00
cuda_combined_scheduling.py Prologue Fusion (#134532) 2024-12-13 04:18:25 +00:00
debug_utils.py Revert "cpp_wrapper: Move #includes to per-device header files (#143909)" 2025-01-17 00:36:38 +00:00
halide.py Migrate from Tuple -> tuple in torch/_inductor (#144264) 2025-01-07 03:27:27 +00:00
memory_planning.py [inductor] Replace set by OrderedSet (#138466) 2024-12-13 16:08:45 +00:00
mps_device_op_overrides.py [Inductor] Add MPS device op overrides (#143892) 2024-12-28 02:11:45 +00:00
mps.py [MPSInductor] Fix codegen regression (#144924) 2025-01-16 02:12:42 +00:00
multi_kernel.py [inductor] Refactor CachingAutotuner so that it can pickle (#144044) 2025-01-18 01:44:16 +00:00
simd_kernel_features.py Skip L1 cache for single-use buffers (#143115) 2025-01-07 19:35:40 +00:00
simd.py [Inductor] Restrict ND tiling analysis to MemoryDeps (#144497) 2025-01-11 05:16:47 +00:00
triton_combo_kernel.py Migrate from Tuple -> tuple in torch/_inductor (#144264) 2025-01-07 03:27:27 +00:00
triton_split_scan.py [inductor] Replace set by OrderedSet (#138466) 2024-12-13 16:08:45 +00:00
triton_utils.py [inductor] Move V.graph.scheduler.current_device to V.graph.current_device (#138252) 2024-10-18 23:05:54 +00:00
triton.py Add heuristic to fail block pointer match early (#144681) 2025-01-16 21:57:30 +00:00
wrapper.py Revert "cpp_wrapper: Move #includes to per-device header files (#143909)" 2025-01-17 00:36:38 +00:00