pytorch/torch/_inductor/codegen
leslie-fang-intel 2bffbe06bd [Inductor][CPP] Support vectorization of load_seed and randn (#130317)
**Summary**
Enable vectorization of `load_seed` and `randn`. For now, `randn` uses the reference implementation.
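
For context, a minimal sketch of the kind of CPU graph this path targets; the shapes, dtype, and function below are illustrative assumptions, not taken from the PR or its test:

```python
# Illustrative sketch only: a CPU function containing a random op, compiled
# with Inductor. Graphs like this are expected to produce load_seed/randn ops
# in the generated CPP kernel, which this PR allows to be vectorized.
import torch


def fn(x):
    # hypothetical example body; randn_like is expected to lower to
    # load_seed + randn in Inductor's CPP codegen
    return x + torch.randn_like(x)


compiled = torch.compile(fn)
out = compiled(torch.ones(1024, dtype=torch.float32))
```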

**Test Plan**
```
python -u -m pytest -s -v test/inductor/test_cpu_repro.py -k test_vec_randn
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130317
Approved by: https://github.com/jgong5
ghstack dependencies: #122961
2024-08-21 13:20:43 +00:00

| Name | Last commit message | Last commit date |
| --- | --- | --- |
| aoti_runtime | [AOTI] Fix bfloat16 in CPU (#132150) | 2024-08-01 22:26:30 +00:00 |
| cuda | Revert "[CUDA][CUTLASS][submodule] Fixes for CUTLASS upgrade (#131493)" | 2024-08-16 18:09:33 +00:00 |
| rocm | [ROCm][CK][Inductor] enable dynamic shapes for CK backend to gemm max autotune (#133285) | 2024-08-16 06:05:23 +00:00 |
| xpu | Flip default value for mypy disallow_untyped_defs [2/11] (#127839) | 2024-06-08 18:23:08 +00:00 |
| __init__.py | | |
| aoti_hipify_utils.py | [BE][Easy][16/19] enforce style for empty lines in import segments in torch/_i*/ (#129768) | 2024-07-20 16:20:58 +00:00 |
| codegen_device_driver.py | [BE][Easy][16/19] enforce style for empty lines in import segments in torch/_i*/ (#129768) | 2024-07-20 16:20:58 +00:00 |
| common.py | Minor type annotation updates following up D60954888 (#133382) | 2024-08-14 21:36:42 +00:00 |
| cpp_gemm_template.py | [inductor] prune unused constants in graph scheduling (#132208) | 2024-08-20 23:40:11 +00:00 |
| cpp_micro_gemm.py | [inductor][cpp][gemm] easy: adjust indentation of template, var renaming etc. (#133312) | 2024-08-17 05:49:14 +00:00 |
| cpp_prefix.h | [inductor][cpp][gemm] easy: adjust indentation of template, var renaming etc. (#133312) | 2024-08-17 05:49:14 +00:00 |
| cpp_template_kernel.py | [inductor] [cpp] fix accuracy when template_buffer has users other than the epilogue nodes (#133073) | 2024-08-16 12:13:10 +00:00 |
| cpp_template.py | [inductor] [cpp] fix accuracy when template_buffer has users other than the epilogue nodes (#133073) | 2024-08-16 12:13:10 +00:00 |
| cpp_utils.py | [Inductor][CPP] Support vectorization of load_seed and randn (#130317) | 2024-08-21 13:20:43 +00:00 |
| cpp_wrapper_cpu.py | [AOTI][Tooling] A couple fixes / minor updates for initial debug printer (#133016) | 2024-08-13 23:00:29 +00:00 |
| cpp_wrapper_cuda.py | [AOTI] Introduce DeferredCudaKernelLine for cuda cpp wrapper (#129135) | 2024-08-20 02:15:44 +00:00 |
| cpp.py | [Inductor][CPP] Support vectorization of load_seed and randn (#130317) | 2024-08-21 13:20:43 +00:00 |
| cuda_combined_scheduling.py | Reland "[2/2] PT2 Inductor ComboKernels - automatic horizontal fusing (#131675)" (#133291) | 2024-08-13 18:18:12 +00:00 |
| debug_utils.py | Minor type annotation updates following up D60954888 (#133382) | 2024-08-14 21:36:42 +00:00 |
| halide.py | Fix triton codegen with math.trunc (#133354) | 2024-08-15 16:38:26 +00:00 |
| memory_planning.py | [BE][Easy][16/19] enforce style for empty lines in import segments in torch/_i*/ (#129768) | 2024-07-20 16:20:58 +00:00 |
| multi_kernel.py | [BC breaking] move benchmarking + prefer inductor path (#132827) | 2024-08-08 00:47:45 +00:00 |
| simd.py | [BE] Fix MYPY issues (#133872) | 2024-08-20 16:12:04 +00:00 |
| triton_combo_kernel.py | Reland "[2/2] PT2 Inductor ComboKernels - automatic horizontal fusing (#131675)" (#133291) | 2024-08-13 18:18:12 +00:00 |
| triton_split_scan.py | Add basic mypy annotations to inductor (#132416) | 2024-08-04 18:43:37 +00:00 |
| triton_utils.py | Disable unwrapping scalar tensors when used as outputs (#132859) | 2024-08-16 21:40:45 +00:00 |
| triton.py | Disable unwrapping scalar tensors when used as outputs (#132859) | 2024-08-16 21:40:45 +00:00 |
| wrapper.py | Disable unwrapping scalar tensors when used as outputs (#132859) | 2024-08-16 21:40:45 +00:00 |