mirror of
https://github.com/zebrajr/pytorch.git
synced 2025-12-07 12:21:27 +01:00
Summary: This is a follow-up to https://github.com/pytorch/pytorch/pull/105496. There are several issues with the previous fix:

1) It explicitly copies every output at the end of the main function;
2) When an output is a ReinterpretView, no as_strided is generated for it;
3) Buffer declarations can be duplicated.

This PR fixes these issues by making sure can_reuse behaves consistently between the two AOTInductor passes, so that both passes always generate the same set of kernels. It also adds handling of ReinterpretView.

Differential Revision: [D47692214](https://our.internmc.facebook.com/intern/diff/D47692214)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105773

Approved by: https://github.com/jansel

| Name |
|---|
| `__init__.py` |
| `aot_inductor_interface.cpp` |
| `common.py` |
| `cpp_prefix.h` |
| `cpp.py` |
| `triton_foreach.py` |
| `triton_utils.py` |
| `triton.py` |
| `wrapper.py` |
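
The commit summary notes that an output which is a ReinterpretView needs an as_strided call generated for it. As a rough illustration of what a strided reinterpretation of a buffer means, here is a minimal pure-Python sketch. It is not the Inductor code from this commit: the helper name `as_strided_view` is hypothetical, and it only imitates the (size, stride, storage offset) indexing convention used by `torch.as_strided` on a flat list.

```python
def as_strided_view(storage, size, stride, storage_offset=0):
    """Hypothetical helper: read a flat `storage` list through a strided view.

    Element [i0, i1, ...] of the view is
    storage[storage_offset + i0 * stride[0] + i1 * stride[1] + ...].
    The result is returned as nested lists (a copy, purely for illustration).
    """
    def build(dim, offset):
        # Past the last dimension: the accumulated offset names one element.
        if dim == len(size):
            return storage[offset]
        # Otherwise step through this dimension using its stride.
        return [
            build(dim + 1, offset + i * stride[dim])
            for i in range(size[dim])
        ]

    return build(0, storage_offset)


# A 2x3 row-major buffer stored as a flat list.
buf = [0, 1, 2, 3, 4, 5]

# Same buffer read with its natural strides.
original = as_strided_view(buf, (2, 3), (3, 1))

# Same buffer reinterpreted as its 3x2 transpose, without moving any data.
transposed = as_strided_view(buf, (3, 2), (1, 3))

# Same buffer reinterpreted as the last two elements of the second row.
tail = as_strided_view(buf, (2,), (1,), storage_offset=4)
```

The point of the sketch is that a reinterpreted view shares storage with its base buffer and differs only in size, stride, and offset. If the generated code returned the base buffer without applying that reinterpretation, the caller would see the wrong shape or layout, which is the kind of problem item 2) of the summary describes.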