pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00


eellison 566ceb3e7e Refactor dtype propagation (#139945)	2024-11-27 16:57:02 +00:00

A couple of changes:
- Tries to reuse dtype propagation rules that were already registered in inductor. These were present both with `pointwise_overrides_data` and the `boolean_ops` list. Additionally, the registration of pointwise ops already specified dtype propagation rules. Saves those registrations and reuses them later.
- Factors out `get_promoted_dtype`, which uses functools.lru_cache and takes only non-CSEVariable args, because CSEVariable args will not work with the functools cache.

Tests get added later in the stack when everything is implemented.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/139945
Approved by: https://github.com/blaine-rister, https://github.com/arui-meta, https://github.com/ezyang
aoti_runtime	[AOTI][refactor] Separate header codegen (#138882)	2024-10-27 14:14:27 +00:00
cuda	[CUTLASS] Lift shape & stride information as kernel args (#138611)	2024-11-25 17:52:33 +00:00
rocm	[ROCm][Inductor][CK] Enable scaled mm with bias in gemm max autotune with CK backend (#140674)	2024-11-15 22:08:38 +00:00
xpu	[AOTI XPU] Enable Cpp wraper for Intel GPU. (#135318)	2024-11-26 11:51:32 +00:00
__init__.py
aoti_hipify_utils.py	[BE][Easy][16/19] enforce style for empty lines in import segments in `torch/_i*/` (#129768)	2024-07-20 16:20:58 +00:00
common.py	Refactor dtype propagation (#139945)	2024-11-27 16:57:02 +00:00
cpp_gemm_template.py	[Inductor][CPP] Extract common functions to be reused in other CPP Template (#141554)	2024-11-27 09:52:18 +00:00
cpp_micro_gemm.py	Simplify & rectify dequantized B buffer loading for AMX GEMM micro-kernel for WoQ int8 case (#140258)	2024-11-22 01:34:06 +00:00
cpp_prefix.h	std::value/std::type -> std::_v/std::_t (#138746)	2024-10-26 20:59:24 +00:00
cpp_template_kernel.py	[inductor] Add typing to ir.py 2 (#140915)	2024-11-22 04:56:54 +00:00
cpp_template.py	[AOTI] Remove the non-ABI-compatible mode (part 1) (#138009)	2024-10-17 02:48:26 +00:00
cpp_utils.py	Move Sympy printers to torch/utils/_sympy/printers.py (#140597)	2024-11-26 18:11:00 +00:00
cpp_wrapper_cpu_array_ref.py	[inductor] Refactor ir.Layout into ir.OutputSpec (#140910)	2024-11-21 20:01:57 +00:00
cpp_wrapper_cpu.py	[AOTI XPU] Enable Cpp wraper for Intel GPU. (#135318)	2024-11-26 11:51:32 +00:00
cpp_wrapper_gpu.py	[inductor] Refactor MutableBox to make IRNode typing easier (#140895)	2024-11-20 19:50:46 +00:00
cpp.py	[inductor] modify the heuristic for loop split optimization (#137550)	2024-11-25 09:16:30 +00:00
cpu_device_op_overrides.py	Add Triton CPU as an Inductor backend (#133408)	2024-09-30 20:24:52 +00:00
cuda_combined_scheduling.py	[BE]: Update mypy to 1.11.2 (#133816)	2024-09-16 19:44:11 +00:00
debug_utils.py	[BE]: Apply PERF401 autofixes from ruff (#140980)	2024-11-20 17:52:07 +00:00
halide.py	Move Sympy printers to torch/utils/_sympy/printers.py (#140597)	2024-11-26 18:11:00 +00:00
memory_planning.py	[inductor] Generalize WorkspaceArg for graph-level semaphores (#138170)	2024-10-18 23:05:54 +00:00
multi_kernel.py	[BE]: Apply PERF401 autofixes from ruff (#140980)	2024-11-20 17:52:07 +00:00
simd_kernel_features.py	[inductor] Refactor reduction type choices into V.choices (#139585)	2024-11-17 16:10:37 +00:00
simd.py	[inductor] Refactor ir.Layout into ir.OutputSpec (#140910)	2024-11-21 20:01:57 +00:00
triton_combo_kernel.py	[BE]: Apply PERF401 autofixes from ruff (#140980)	2024-11-20 17:52:07 +00:00
triton_split_scan.py	[inductor] Support fixed triton configs defined at compile time (#140217)	2024-11-17 16:10:37 +00:00
triton_utils.py	[inductor] Move V.graph.scheduler.current_device to V.graph.current_device (#138252)	2024-10-18 23:05:54 +00:00
triton.py	Refactor dtype propagation (#139945)	2024-11-27 16:57:02 +00:00
wrapper.py	Revert "[Inductor] Inplacing with Donated Buffer (#140113)"	2024-11-26 21:20:59 +00:00