pytorch/torch/_inductor/codegen
Bin Bao b0816e4714 [inductor] Fix AOTInductor output issues (#105773)
Summary: This is a follow-up to https://github.com/pytorch/pytorch/pull/105496. There were several issues with the previous fix:
1) It explicitly copied every output at the end of the main function;
2) When an output is ReinterpretView, no as_strided was generated for it;
3) There can be duplicated buffer declarations.

This PR fixes these issues by making sure can_reuse behaves consistently between the two AOTInductor passes, so that both passes always generate the same set of kernels. It also adds handling of ReinterpretView.
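
For readers unfamiliar with why a ReinterpretView output needs an as_strided call, the following is a minimal pure-Python sketch of the size/stride/offset semantics that `torch.as_strided` exposes. It is not the code that wrapper.py generates; the helper name `as_strided_view` and the sample buffer are hypothetical and exist only for illustration.

```python
from itertools import product


def as_strided_view(storage, size, stride, offset=0):
    """Read a flat buffer through (size, stride, offset).

    Hypothetical helper: mimics the indexing rule of a strided view and
    returns the viewed elements in row-major order of the view's indices.
    """
    out = []
    for idx in product(*(range(n) for n in size)):
        # Flat position = offset + sum(index_i * stride_i)
        flat = offset + sum(i * s for i, s in zip(idx, stride))
        out.append(storage[flat])
    return out


# A contiguous 2x3 buffer stored flat (assumed sample data).
buf = [0, 1, 2, 3, 4, 5]

# The same storage reinterpreted as its 3x2 transpose: strides (1, 3).
transposed = as_strided_view(buf, size=(3, 2), stride=(1, 3))
```

Without the reinterpretation step, a caller would see the underlying buffer in its original layout rather than the view's layout, which is the kind of mismatch issue 2) describes.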

Differential Revision: [D47692214](https://our.internmc.facebook.com/intern/diff/D47692214)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105773
Approved by: https://github.com/jansel
2023-07-24 01:58:49 +00:00
..
__init__.py
aot_inductor_interface.cpp Add aot_inductor as a test backend for benchmarking (#105221) 2023-07-18 13:16:36 +00:00
common.py ValueRange analysis for indirect indexing (#102611) 2023-07-14 13:43:05 +00:00
cpp_prefix.h [inductor] Add single pass "var_unnormalized" reduction_type (#102486) 2023-07-08 20:48:29 +00:00
cpp.py Revert "inductor: promote half/bfloat16 constant to float for cpu vectorization path (#105440)" 2023-07-20 03:56:44 +00:00
triton_foreach.py Allow fusion of epilogue copies with upstream foreach ops (#104018) 2023-06-23 21:39:59 +00:00
triton_utils.py Foreach kernel codegen in inductor (#99975) 2023-05-25 21:48:41 +00:00
triton.py [Inductor] Provenance tracking for wrapper code (#105717) 2023-07-21 23:06:43 +00:00
wrapper.py [inductor] Fix AOTInductor output issues (#105773) 2023-07-24 01:58:49 +00:00