mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Pavel Belevich 34b32ca914 Remove operator-> from at::Generator (#36027) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/36027 Differential Revision: D20856462 Pulled By: pbelevich fbshipit-source-id: 156fc23d51d8125d41e96b36b3b1312f13040588		2020-04-07 08:07:07 -07:00
cpu	[JIT] clang-format JIT code (#35115)	2020-03-26 11:24:51 -07:00
cuda	Remove operator-> from at::Generator (#36027)	2020-04-07 08:07:07 -07:00
arg_spec.h	[jit] do the code reorg (#33851)	2020-02-27 13:02:51 -08:00
codegen.cpp	[JIT] clang-format JIT code (#35115)	2020-03-26 11:24:51 -07:00
codegen.h	[JIT] clang-format JIT code (#35115)	2020-03-26 11:24:51 -07:00
compiler.cpp	[JIT] clang-format JIT code (#35115)	2020-03-26 11:24:51 -07:00
compiler.h	[JIT] clang-format JIT code (#35115)	2020-03-26 11:24:51 -07:00
executor.cpp	[JIT] clang-format JIT code (#35115)	2020-03-26 11:24:51 -07:00
executor.h	[jit] do the code reorg (#33851)	2020-02-27 13:02:51 -08:00
fallback.cpp	[JIT] clang-format JIT code (#35115)	2020-03-26 11:24:51 -07:00
fallback.h	[jit] do the code reorg (#33851)	2020-02-27 13:02:51 -08:00
fused_kernel.h	[jit] do the code reorg (#33851)	2020-02-27 13:02:51 -08:00
interface.cpp	[JIT] clang-format JIT code (#35115)	2020-03-26 11:24:51 -07:00
interface.h	[JIT] clang-format JIT code (#35115)	2020-03-26 11:24:51 -07:00
kernel_cache.cpp	[jit] do the code reorg (#33851)	2020-02-27 13:02:51 -08:00
kernel_cache.h	[jit] do the code reorg (#33851)	2020-02-27 13:02:51 -08:00
kernel_spec.h	[JIT] clang-format JIT code (#35115)	2020-03-26 11:24:51 -07:00
partition_desc.h	[JIT] clang-format JIT code (#35115)	2020-03-26 11:24:51 -07:00
README.md	[jit] do the code reorg (#33851)	2020-02-27 13:02:51 -08:00
tensor_desc.h	[JIT] clang-format JIT code (#35115)	2020-03-26 11:24:51 -07:00
tensor_info.h	[jit] do the code reorg (#33851)	2020-02-27 13:02:51 -08:00

README.md

PyTorch Fuser

The fuser accepts subgraphs wrapped in "fusion nodes" and tries to execute them by just-in-time (JIT) compiling kernels that run all the graph operations.
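
To make the idea of fusion concrete, here is a minimal sketch in plain C++. It is not code the fuser generates; the function names `unfused` and `fused` and the example expression `max(a * b + c, 0)` are assumptions chosen for illustration. The unfused version runs one pass per operation and materializes a temporary for each, while the fused version computes the whole expression in a single pass.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical example: evaluates max(a * b + c, 0) one operation at a time.
// Each loop stands in for a separate kernel that writes a full temporary.
// Assumes a, b, and c all have the same length.
std::vector<float> unfused(const std::vector<float>& a,
                           const std::vector<float>& b,
                           const std::vector<float>& c) {
  std::vector<float> mul(a.size()), add(a.size()), out(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) mul[i] = a[i] * b[i];
  for (std::size_t i = 0; i < a.size(); ++i) add[i] = mul[i] + c[i];
  for (std::size_t i = 0; i < a.size(); ++i) out[i] = std::max(add[i], 0.0f);
  return out;
}

// Hypothetical example: the same expression as a single "fused" pass.
// No intermediate buffers are allocated and each input is read once.
std::vector<float> fused(const std::vector<float>& a,
                         const std::vector<float>& b,
                         const std::vector<float>& c) {
  std::vector<float> out(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) {
    out[i] = std::max(a[i] * b[i] + c[i], 0.0f);
  }
  return out;
}
```

Both functions return the same values for the same inputs; the fused form simply avoids the extra memory traffic, which is the benefit a fused kernel is meant to provide.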

Code Organization

The fuser is designed hierarchically, with device-independent logic eventually deferring to device-specific logic and implementation. The device-specific code is (mostly) found in each device's subdirectory. The device-independent logic has six components:

The Interface (interface.h/cpp) has functions to register and run fusions, interrogate fusion functionality, and perform debugging.
The Compiler (compiler.h/cpp) performs "upfront" and "runtime" compilation. When fusions are registered, upfront compilation produces fallback code and performs some shape inference. When a fusion is run, runtime compilation invokes code generation and the device-specific compilation logic.
The Code Generator (codegen.h/cpp) produces the string to be compiled on the device.
The Executor (executor.h/cpp) runs requested fusions. It performs shape inference, expands tensors as necessary, determines the device to run on, acquires a cached compiled kernel or requests that the Compiler produce a new one, invokes device-specific code to launch the kernel, and updates the stack.
The Fallback (fallback.h/cpp) runs subgraphs that can't be fused, either because shape inference didn't determine a common tensor size or because the device the tensors are on doesn't support fusion.
The Kernel Specification Cache (kernel_cache.h/cpp) is a thread-safe cache holding the device-independent specifications produced during upfront compilation. These specifications each have their own thread-safe stores of compiled kernels that the Executor checks before requesting runtime compilation.
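
The two-level caching described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the contents of kernel_cache.h or executor.cpp: every name here (`CompiledKernelSketch`, `KernelSpecSketch`, `SpecCacheSketch`, `getOrCompile`) is hypothetical, and the string `arg_key` stands in for whatever argument description the real code uses to identify a compiled kernel.

```cpp
#include <cstdint>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for a compiled, device-specific kernel.
struct CompiledKernelSketch {
  std::string source;
};

// Hypothetical stand-in for a device-independent specification.
// It owns its own mutex-guarded store of compiled kernels.
class KernelSpecSketch {
 public:
  // Returns the cached kernel for arg_key, or nullptr on a miss.
  std::shared_ptr<CompiledKernelSketch> find(const std::string& arg_key) const {
    std::lock_guard<std::mutex> guard(mutex_);
    auto it = kernels_.find(arg_key);
    if (it == kernels_.end()) return nullptr;
    return it->second;
  }

  // Stores a kernel; if another thread stored one first, that one is kept.
  void store(const std::string& arg_key,
             std::shared_ptr<CompiledKernelSketch> kernel) {
    std::lock_guard<std::mutex> guard(mutex_);
    kernels_.emplace(arg_key, std::move(kernel));
  }

 private:
  mutable std::mutex mutex_;
  std::map<std::string, std::shared_ptr<CompiledKernelSketch>> kernels_;
};

// Hypothetical outer cache: maps an integer key to a specification.
class SpecCacheSketch {
 public:
  // Called at registration time ("upfront"); returns the new spec's key.
  std::int64_t registerSpec() {
    std::lock_guard<std::mutex> guard(mutex_);
    const std::int64_t key = next_key_++;
    specs_.emplace(key, std::make_shared<KernelSpecSketch>());
    return key;
  }

  // Returns the specification for key, or nullptr if it was never registered.
  std::shared_ptr<KernelSpecSketch> retrieve(std::int64_t key) const {
    std::lock_guard<std::mutex> guard(mutex_);
    auto it = specs_.find(key);
    if (it == specs_.end()) return nullptr;
    return it->second;
  }

 private:
  mutable std::mutex mutex_;
  std::int64_t next_key_ = 0;
  std::map<std::int64_t, std::shared_ptr<KernelSpecSketch>> specs_;
};

// Executor-style lookup: check the spec's kernel store first and only
// "compile" (here, build a placeholder) on a miss. compile_count is an
// illustrative counter so callers can observe how often compilation ran.
std::shared_ptr<CompiledKernelSketch> getOrCompile(KernelSpecSketch& spec,
                                                   const std::string& arg_key,
                                                   int& compile_count) {
  if (auto cached = spec.find(arg_key)) return cached;
  ++compile_count;
  spec.store(arg_key, std::make_shared<CompiledKernelSketch>(
                          CompiledKernelSketch{"kernel for " + arg_key}));
  // Re-read so every caller sees the kernel that actually won the store.
  return spec.find(arg_key);
}
```

In this sketch, two threads that miss at the same time may both compile, but only the first stored kernel is kept and returned. Whether the real implementation tolerates or prevents duplicate compilation is not stated in this README.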

The device-specific components have logic for compiling and running code in FusedKernelCPU (cpu/fused_kernel.h/cpp) and FusedKernelCUDA (cuda/fused_kernel.h/cpp).
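
One way to picture how device-independent code can defer to device-specific code is a common interface with one implementation per device. The sketch below is an assumption for illustration only: `FusedKernelSketch`, `CpuKernelSketch`, `CudaKernelSketch`, `DeviceKind`, and `makeKernel` are hypothetical names, and the `launch` method returns a string instead of running real code. It does not reproduce the actual declarations in fused_kernel.h.

```cpp
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical device tag used only by this sketch.
enum class DeviceKind { CPU, CUDA };

// Hypothetical common interface that device-independent code would call.
struct FusedKernelSketch {
  virtual ~FusedKernelSketch() = default;
  virtual std::string launch() const = 0;
};

// Hypothetical CPU implementation; a real one would run compiled host code.
struct CpuKernelSketch : FusedKernelSketch {
  std::string launch() const override { return "ran on cpu"; }
};

// Hypothetical CUDA implementation; a real one would launch a GPU kernel.
struct CudaKernelSketch : FusedKernelSketch {
  std::string launch() const override { return "ran on cuda"; }
};

// Picks the device-specific implementation behind the shared interface.
std::unique_ptr<FusedKernelSketch> makeKernel(DeviceKind kind) {
  switch (kind) {
    case DeviceKind::CPU:
      return std::make_unique<CpuKernelSketch>();
    case DeviceKind::CUDA:
      return std::make_unique<CudaKernelSketch>();
  }
  throw std::invalid_argument("unknown device kind");
}
```

With this shape, a caller such as an executor only needs the device kind to obtain a kernel and can then call `launch()` without knowing which device-specific class it holds.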