pytorch/torch/_inductor/codegen
leslie-fang-intel 2bffbe06bd [Inductor][CPP] Support vectorization of load_seed and randn (#130317)
**Summary**
Enable vectorization of `load_seed` and `randn`. For now, `randn` uses the reference implementation.
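
For context, a minimal sketch of the kind of CPU graph this path targets; the shapes, dtype, and function below are illustrative assumptions, not taken from the PR or its test:

```python
# Illustrative sketch only: a CPU function containing a random op, compiled
# with Inductor. Graphs like this are expected to produce load_seed/randn ops
# in the generated CPP kernel, which this PR allows to be vectorized.
import torch


def fn(x):
    # hypothetical example body; randn_like is expected to lower to
    # load_seed + randn in Inductor's CPP codegen
    return x + torch.randn_like(x)


compiled = torch.compile(fn)
out = compiled(torch.ones(1024, dtype=torch.float32))
```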

**Test Plan**
```
python -u -m pytest -s -v test/inductor/test_cpu_repro.py -k test_vec_randn
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130317
Approved by: https://github.com/jgong5
ghstack dependencies: #122961
2024-08-21 13:20:43 +00:00

| Name | Last commit message | Last commit date |
| --- | --- | --- |
| aoti_runtime | [AOTI] Fix bfloat16 in CPU (#132150) | 2024-08-01 22:26:30 +00:00 |
| cuda | Revert "[CUDA][CUTLASS][submodule] Fixes for CUTLASS upgrade (#131493)" | 2024-08-16 18:09:33 +00:00 |
| rocm | [ROCm][CK][Inductor] enable dynamic shapes for CK backend to gemm max autotune (#133285) | 2024-08-16 06:05:23 +00:00 |
| xpu | Flip default value for mypy disallow_untyped_defs [2/11] (#127839) | 2024-06-08 18:23:08 +00:00 |
| __init__.py | | |
| aoti_hipify_utils.py | [BE][Easy][16/19] enforce style for empty lines in import segments in torch/_i*/ (#129768) | 2024-07-20 16:20:58 +00:00 |
| codegen_device_driver.py | [BE][Easy][16/19] enforce style for empty lines in import segments in torch/_i*/ (#129768) | 2024-07-20 16:20:58 +00:00 |
| common.py | Minor type annotation updates following up D60954888 (#133382) | 2024-08-14 21:36:42 +00:00 |
| cpp_gemm_template.py | [inductor] prune unused constants in graph scheduling (#132208) | 2024-08-20 23:40:11 +00:00 |
| cpp_micro_gemm.py | [inductor][cpp][gemm] easy: adjust indentation of template, var renaming etc. (#133312) | 2024-08-17 05:49:14 +00:00 |
| cpp_prefix.h | [inductor][cpp][gemm] easy: adjust indentation of template, var renaming etc. (#133312) | 2024-08-17 05:49:14 +00:00 |
| cpp_template_kernel.py | [inductor] [cpp] fix accuracy when template_buffer has users other than the epilogue nodes (#133073) | 2024-08-16 12:13:10 +00:00 |
| cpp_template.py | [inductor] [cpp] fix accuracy when template_buffer has users other than the epilogue nodes (#133073) | 2024-08-16 12:13:10 +00:00 |
| cpp_utils.py | [Inductor][CPP] Support vectorization of load_seed and randn (#130317) | 2024-08-21 13:20:43 +00:00 |
| cpp_wrapper_cpu.py | [AOTI][Tooling] A couple fixes / minor updates for initial debug printer (#133016) | 2024-08-13 23:00:29 +00:00 |
| cpp_wrapper_cuda.py | [AOTI] Introduce DeferredCudaKernelLine for cuda cpp wrapper (#129135) | 2024-08-20 02:15:44 +00:00 |
| cpp.py | [Inductor][CPP] Support vectorization of load_seed and randn (#130317) | 2024-08-21 13:20:43 +00:00 |
| cuda_combined_scheduling.py | Reland "[2/2] PT2 Inductor ComboKernels - automatic horizontal fusing (#131675)" (#133291) | 2024-08-13 18:18:12 +00:00 |
| debug_utils.py | Minor type annotation updates following up D60954888 (#133382) | 2024-08-14 21:36:42 +00:00 |
| halide.py | Fix triton codegen with math.trunc (#133354) | 2024-08-15 16:38:26 +00:00 |
| memory_planning.py | [BE][Easy][16/19] enforce style for empty lines in import segments in torch/_i*/ (#129768) | 2024-07-20 16:20:58 +00:00 |
| multi_kernel.py | [BC breaking] move benchmarking + prefer inductor path (#132827) | 2024-08-08 00:47:45 +00:00 |
| simd.py | [BE] Fix MYPY issues (#133872) | 2024-08-20 16:12:04 +00:00 |
| triton_combo_kernel.py | Reland "[2/2] PT2 Inductor ComboKernels - automatic horizontal fusing (#131675)" (#133291) | 2024-08-13 18:18:12 +00:00 |
| triton_split_scan.py | Add basic mypy annotations to inductor (#132416) | 2024-08-04 18:43:37 +00:00 |
| triton_utils.py | Disable unwrapping scalar tensors when used as outputs (#132859) | 2024-08-16 21:40:45 +00:00 |
| triton.py | Disable unwrapping scalar tensors when used as outputs (#132859) | 2024-08-16 21:40:45 +00:00 |
| wrapper.py | Disable unwrapping scalar tensors when used as outputs (#132859) | 2024-08-16 21:40:45 +00:00 |