autoheuristic
codegen
introduce definitely_contiguous and use it for reshape and tensor meta data computation. ( #153432 )
2025-05-28 03:41:26 +00:00
compile_worker
torch.compile: Suppress stdout / stderr output from subprocesses when local ( #153837 )
2025-05-22 05:49:43 +00:00
fx_passes
[EASY] used guard_or_false instead of guard_sizes_oblivious in pointless_view ( #154154 )
2025-05-26 21:59:21 +00:00
kernel
[Cutlass] Support float8_e4m3fn GEMM ( #153890 )
2025-05-22 08:37:33 +00:00
package
[export] Move PT2ArchiveWriter/Reader to torch/export ( #153795 )
2025-05-23 19:04:36 +00:00
runtime
[AOTI] Add a multi_arch_kernel_binary option ( #154413 )
2025-05-28 01:20:38 +00:00
__autotune_main__.py
Improve subproc autotuning implementation ( #149700 )
2025-03-28 01:06:39 +00:00
__init__.py
Add optional device index to AOTIModelPackageLoader ( #152093 )
2025-05-04 11:40:12 +00:00
analyze_preserves_zero_mask.py
Revert two recent prologue prs ( #151013 )
2025-04-10 23:48:41 +00:00
aoti_eager.py
async_compile.py
Pass inductor config for static cuda launcher to workers ( #153382 )
2025-05-14 20:01:32 +00:00
autotune_process.py
[inductor][cutlass backend] Add 2 stage autotuning aka prescreening ( #153335 )
2025-05-23 17:12:25 +00:00
bounds.py
[inductor] Refactor op handlers part 5 ( #146257 )
2025-02-08 18:00:30 +00:00
choices.py
Reland "Introduce new template heuristic for triton autotune configs" ( #147452 )
2025-03-26 15:47:06 +00:00
codecache.py
[AOTI] Support multi-arch when using package_cpp_only ( #154414 )
2025-05-28 01:20:38 +00:00
comm_analysis.py
comm_lowering.py
Fix an issue where functional collectives don't force fx stride on inputs when compiled ( #146467 )
2025-02-10 19:15:49 +00:00
comms.py
Make assertion about pass callable print the bad pass ( #152654 )
2025-05-05 18:07:43 +00:00
compile_fx_async.py
Use correct boxed_forward_device_index when running CompiledFxGraph.post_compile ( #148130 )
2025-03-23 02:57:58 +00:00
compile_fx_ext.py
Revert "Re-enable FakeTensor caching for SymInts ( #152662 )"
2025-05-26 17:13:22 +00:00
compile_fx_subproc.py
async fx compile ( #146135 )
2025-03-19 14:07:51 +00:00
compile_fx.py
Update provenance tracking doc ( #154062 )
2025-05-23 17:09:52 +00:00
compiler_bisector.py
Add a couple config options to compiler bisector ( #148450 )
2025-03-04 23:23:21 +00:00
config.py
[AOTI] Add a multi_arch_kernel_binary option ( #154413 )
2025-05-28 01:20:38 +00:00
constant_folding.py
Fix constant folding cloning constants ( #152273 )
2025-05-01 17:34:39 +00:00
cpp_builder.py
[AOTI] Support multi-arch when using package_cpp_only ( #154414 )
2025-05-28 01:20:38 +00:00
cpu_vec_isa.py
Allow to set custom PYTHONPATH for torch.inductor ( #152832 )
2025-05-15 06:35:41 +00:00
cudagraph_trees.py
[BE]: Update ruff to 0.11.8 ( #153249 )
2025-05-12 18:30:52 +00:00
cudagraph_utils.py
[CUDAGraph] support meta tensor ( #150478 )
2025-04-02 07:21:50 +00:00
custom_graph_pass.py
debug.py
Rename the provenance tracing artifact name for kernel <-> post_grad nodes mapping ( #154046 )
2025-05-22 19:20:56 +00:00
decomposition.py
Revert "Improve torch.ops typing ( #153558 )"
2025-05-19 23:32:36 +00:00
dependencies.py
[Graph Partition] Support symbol inputs ( #149458 )
2025-03-26 17:21:30 +00:00
dtype_propagation.py
Remove libdevice ops in inductor ( #151562 )
2025-04-17 22:18:00 +00:00
exc.py
extern_node_serializer.py
Back out "[AOTI] Always use oss schema for ExternKernelNodes serialization" ( #151026 )
2025-04-10 22:36:35 +00:00
freezing_utils.py
PEP585: More UP006 fixes ( #146392 )
2025-02-20 06:18:13 +00:00
freezing.py
[cudagraphs] Fix issue in collecting static_input_idxs ( #152287 )
2025-04-30 03:24:05 +00:00
fuzzer.py
[AOTI][reland] Add an option to specify custom op C shim ( #153968 )
2025-05-21 15:57:57 +00:00
fx_utils.py
Scheduler Flops refactor ( #152708 )
2025-05-09 19:01:43 +00:00
graph.py
cpp_wrapper: build non-performance-sensitive code at O1 ( #148773 )
2025-05-23 00:51:20 +00:00
hooks.py
index_propagation.py
[BE][PYFMT] migrate PYFMT for torch._inductor to ruff format ( #144550 )
2025-02-28 13:33:19 +00:00
inductor_prims.py
[inductor] lowering for fractional_max_pool3d ( #148630 )
2025-05-22 16:06:29 +00:00
ir.py
Revert "[Inductor] Improve typing, and prepare for ABI-compatible AOTI C-shim dispatching ( #154371 )"
2025-05-27 20:39:09 +00:00
jagged_lowerings.py
loop_body.py
[ez] fix typo in comment ( #151755 )
2025-04-21 14:52:39 +00:00
lowering.py
[Inductor] Allow passing in custom lowering dict to register_lowering() ( #154344 )
2025-05-27 01:35:26 +00:00
memory.py
[Graph Partition] reorder for minimal number of partitions ( #151968 )
2025-04-29 17:17:16 +00:00
metrics.py
[Inductor] Support parallel reduction for GroupNorm ( #144020 )
2025-03-01 17:11:50 +00:00
mkldnn_ir.py
Revert "[Inductor] Improve typing, and prepare for ABI-compatible AOTI C-shim dispatching ( #154371 )"
2025-05-27 20:39:09 +00:00
mkldnn_lowerings.py
[BE]: Update ruff to 0.11.8 ( #153249 )
2025-05-12 18:30:52 +00:00
mock_cache.py
ops_handler.py
Remove libdevice ops in inductor ( #151562 )
2025-04-17 22:18:00 +00:00
optimize_indexing.py
output_code.py
codecache: Remove cpp_prefix.h duplication per build, then precompile it ( #144293 )
2025-05-16 17:41:36 +00:00
pattern_matcher.py
Rename node.meta["arg_kwarg_vals"] to node.meta["eager_input_vals"] ( #148092 )
2025-04-02 13:18:04 +00:00
quantized_lowerings.py
Add AOTI shim for _weight_int4pack_mm_cpu_tensor ( #149031 )
2025-03-18 01:33:13 +00:00
remote_cache.py
[Inductor Remote Cache] Raise an exception if redis module is required but not available ( #151779 )
2025-04-26 11:21:54 +00:00
scheduler.py
update mutation renames ( #153895 )
2025-05-22 14:54:39 +00:00
script.ld
select_algorithm.py
Make inductor UT to be generic ( #154196 )
2025-05-24 02:47:46 +00:00
sizevars.py
[aoti] fix corner case in unbacked replacements for atomically_apply_size_hint ( #153768 )
2025-05-22 02:05:37 +00:00
standalone_compile.py
Add logging for guard miss failure ( #153125 )
2025-05-09 16:51:04 +00:00
subgraph_lowering.py
[inductor] Refactor op handlers part 5 ( #146257 )
2025-02-08 18:00:30 +00:00
template_heuristics.py
[Inductor] Add Additional Configs for persistent+TMA version of Triton mm and addmm ( #150587 )
2025-04-23 18:21:35 +00:00
test_case.py
[Inductor] be able to disable cache for test ( #141195 )
2025-01-24 19:15:55 +00:00
test_operators.py
[CI] Fix GPUTests.test_scheduler_vertical_fusion1 ( #151166 )
2025-04-13 00:41:51 +00:00
triton_bundler.py
Keep raw cubin file around in case it gets deleted underneath us ( #153064 )
2025-05-08 14:29:19 +00:00
utils.py
[aoti] Initial Metal support ( #153959 )
2025-05-23 05:45:35 +00:00
virtualized.py
[inductor] Add a helper for convert index_dtype to torch dtype ( #149531 )
2025-03-20 21:33:29 +00:00
wrapper_benchmark.py
[Inductor][NCU] Add kernel name filtering, and allow custom metrics ( #150872 )
2025-05-04 20:49:19 +00:00