pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Boyuan Feng b6fe28ff02 [Inductor] Graph Partition (#147038)	2025-02-27 04:50:43 +00:00

This PR implements Inductor graph partition. Previously, one Dynamo graph was mapped to one Inductor graph, which was in turn mapped to one call function. This PR allows one Dynamo graph to be mapped to multiple Inductor graphs and to multiple `graph_partition` functions in the generated code, so that different further optimizations can be applied to different `graph_partition` functions.

Design Doc: [link](https://docs.google.com/document/d/1qPgOfy25l7SIYnrQrvU-TO1mdHMslCwv_SLmeXID6tM/edit?usp=sharing)

Example: [Generated code before and after this diff](https://www.internalfb.com/intern/diffing/?paste_number=1737334601)

A follow-up PR will extend this work to cudagraph, which allows applying cudagraph to parts of the generated code (#125864).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/147038
Approved by: https://github.com/eellison
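The partitioning idea in the commit message can be illustrated with a toy sketch. This is not Inductor's implementation: the `Op` tuple, the `splits_here` flag, and the helpers `partition_ops`, `make_partition_fn`, and `build_call` are all hypothetical names chosen for illustration. The sketch assumes a linear list of ops in which some ops force a partition boundary, and it shows one op list becoming several partition functions that a single `call` runs in order.

```python
from typing import Callable, List, Tuple

# Hypothetical toy op: (name, function, splits_here). Not an Inductor type.
Op = Tuple[str, Callable[[float], float], bool]


def partition_ops(ops: List[Op]) -> List[List[Op]]:
    """Split a linear op list into contiguous partitions.

    An op with splits_here=True is placed alone in its own partition,
    so the ops before and after it end up in separate partitions.
    """
    partitions: List[List[Op]] = []
    current: List[Op] = []
    for op in ops:
        if op[2]:
            if current:
                partitions.append(current)
                current = []
            partitions.append([op])
        else:
            current.append(op)
    if current:
        partitions.append(current)
    return partitions


def make_partition_fn(partition: List[Op]) -> Callable[[float], float]:
    """Build one function per partition (a stand-in for a generated
    `graph_partition` function)."""
    def run(x: float) -> float:
        for _name, fn, _splits in partition:
            x = fn(x)
        return x
    return run


def build_call(ops: List[Op]):
    """Return a single `call` that chains all partition functions in order,
    together with the list of partition functions."""
    fns = [make_partition_fn(p) for p in partition_ops(ops)]

    def call(x: float) -> float:
        for fn in fns:
            x = fn(x)
        return x

    return call, fns


# Assumed example graph: the "sync" op forces a partition boundary.
ops: List[Op] = [
    ("add1", lambda x: x + 1.0, False),
    ("mul2", lambda x: x * 2.0, False),
    ("sync", lambda x: x, True),
    ("sub3", lambda x: x - 3.0, False),
]
call, partition_fns = build_call(ops)
```

With this input the four ops fall into three partitions (`add1` and `mul2`, then `sync`, then `sub3`). Because each partition is a separate function, a caller could in principle wrap or optimize each one differently, which is the motivation the commit message gives.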
aoti_runtime	cpp_wrapper: Move #includes to per-device header files (#145932)	2025-01-29 21:08:45 +00:00
cuda	[cutlass backend] turn autotuning logs off by default + rename log to autotuning log (#147922)	2025-02-26 21:02:04 +00:00
rocm	[inductor][ck] kBatch parametrized (#147885)	2025-02-26 07:28:19 +00:00
xpu	[inductor] Add types to DeviceOpOverrides (#145913)	2025-02-01 16:33:49 +00:00
__init__.py
aoti_hipify_utils.py	remove allow-untyped-defs from _inductor/codegen/aoti_hipify_utils.py (#143916)	2024-12-27 23:25:37 +00:00
block_analysis.py	[Inductor] Expand Identity ops prior to block pattern matching (#146000)	2025-02-08 18:11:53 +00:00
common.py	[inductor][triton] Ignore block ptr advances for removed buffers (#147193)	2025-02-27 03:37:33 +00:00
cpp_bmm_template.py	[inductor][cpu] Move VNNI weight packing into AMX GEMM kernel for contiguous BMM weights (#146843)	2025-02-21 21:46:00 +00:00
cpp_flex_attention_template.py	[inductor][cpu] Move VNNI weight packing into AMX GEMM kernel for contiguous BMM weights (#146843)	2025-02-21 21:46:00 +00:00
cpp_gemm_template.py	[inductor][cpu] Move VNNI weight packing into AMX GEMM kernel for contiguous BMM weights (#146843)	2025-02-21 21:46:00 +00:00
cpp_grouped_gemm_template.py	[inductor] Finish typing common.py (#146225)	2025-02-04 23:35:33 +00:00
cpp_micro_gemm.py	cpp_wrapper: fix inductor triton tests (#146109)	2025-02-25 19:50:37 +00:00
cpp_prefix.h	[Inductor][CPP] fix store mode atomic add (#147961)	2025-02-26 14:04:34 +00:00
cpp_template_kernel.py	[inductor] [cpp] Support vectorization for score and mask in FlexAttention CPU (#143638)	2025-02-14 05:26:18 +00:00
cpp_template.py	Fix assertion failure in gemm template lowering (#146353)	2025-02-08 01:52:20 +00:00
cpp_utils.py	cpp_wrapper: enable all CPU repro tests (#145655)	2025-02-04 22:05:59 +00:00
cpp_wrapper_cpu_array_ref.py	[Inductor] Graph Partition (#147038)	2025-02-27 04:50:43 +00:00
cpp_wrapper_cpu.py	[Inductor] Graph Partition (#147038)	2025-02-27 04:50:43 +00:00
cpp_wrapper_gpu.py	[Inductor] Graph Partition (#147038)	2025-02-27 04:50:43 +00:00
cpp.py	[Inductor][CPP] fix store mode atomic add (#147961)	2025-02-26 14:04:34 +00:00
cpu_device_op_overrides.py	[inductor] Add types to DeviceOpOverrides (#145913)	2025-02-01 16:33:49 +00:00
cuda_combined_scheduling.py	PEP585: More UP006 fixes (#146392)	2025-02-20 06:18:13 +00:00
debug_utils.py	fix intermediate debug information with cpp_wrapper (#145527)	2025-02-10 22:24:26 +00:00
halide.py	[inductor] Add type annotations to _inductor/utils.py (#144108)	2025-02-15 23:13:41 +00:00
memory_planning.py	PEP585 update - torch/_inductor/codegen (#145106)	2025-01-18 06:56:03 +00:00
mps_device_op_overrides.py	[inductor] Add types to DeviceOpOverrides (#145913)	2025-02-01 16:33:49 +00:00
mps.py	[MPS/Inductor] Add support for xlog1py. (#147709)	2025-02-24 05:28:52 +00:00
multi_kernel.py	[inductor][4/N] triton support post-#5512, fix constexpr signatures (#145583)	2025-01-29 05:46:05 +00:00
simd_kernel_features.py	PEP585: More UP006 fixes (#146392)	2025-02-20 06:18:13 +00:00
simd.py	PEP585: More UP006 fixes (#146392)	2025-02-20 06:18:13 +00:00
triton_combo_kernel.py	[inductor] Add typing to common.KernelArgs (#145916)	2025-02-04 16:05:39 +00:00
triton_split_scan.py	[inductor] Remove _get_grid_fn_str (#146800)	2025-02-10 23:14:30 +00:00
triton_utils.py	[inductor][5/N] triton support post-#5512, fix 1 and None handling (#145515)	2025-02-01 02:11:48 +00:00
triton.py	[inductor][triton] Ignore block ptr advances for removed buffers (#147193)	2025-02-27 03:37:33 +00:00
wrapper.py	[Inductor] Graph Partition (#147038)	2025-02-27 04:50:43 +00:00