pytorch/torch/csrc/jit/codegen/fuser

PyTorch Fuser

The fuser accepts subgraphs wrapped in "fusion nodes" and tries to execute them by just-in-time (JIT) compiling kernels that run all the graph operations.

Code Organization

The fuser is designed hierarchically, with device-independent logic eventually deferring to device-specific logic and implementation. The device-specific code is (mostly) found in each device's subdirectory. The device-independent logic has six components:

  • The Interface (interface.h/cpp) has functions to register and run fusions, interrogate fusion functionality, and perform debugging.
  • The Compiler (compiler.h/cpp) performs "upfront" and "runtime" compilation. When fusions are registered, upfront compilation produces fallback code and performs some shape inference. When a fusion is run, runtime compilation invokes code generation and the device-specific compilation logic.
  • The Code Generator (codegen.h/cpp) produces the string to be compiled on the device.
  • The Executor (executor.h/cpp) runs requested fusions. It performs shape inference, expands tensors as necessary, determines the device to run on, acquires a cached compiled kernel or requests that the Compiler produce a new one, invokes device-specific code to launch the kernel, and updates the stack.
  • The Fallback (fallback.h/cpp) runs subgraphs that can't be fused, either because shape inference didn't determine a common tensor size or because the device the tensors are on doesn't support fusion.
  • The Kernel Specification Cache (kernel_cache.h/cpp) is a thread-safe cache holding the device-independent specifications produced during upfront compilation. These specifications each have their own thread-safe stores of compiled kernels that the Executor checks before requesting runtime compilation.
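The two-level caching described in the last bullet can be illustrated with a minimal, self-contained sketch. This is not the fuser's actual code: every name here (SketchKernel, SketchKernelSpec, SketchSpecCache, getOrCompile) is hypothetical, the graph is keyed by a plain string, and the "compiler" is a caller-supplied callback. It only shows the general pattern of a mutex-guarded specification cache whose entries each guard their own store of compiled kernels.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a compiled kernel (not a real fuser type).
struct SketchKernel {
  std::string source;
};

// Hypothetical device-independent specification. It owns a mutex-guarded
// store of kernels keyed by a runtime argument signature.
class SketchKernelSpec {
 public:
  // Returns the cached kernel, or nullptr on a miss.
  std::shared_ptr<SketchKernel> findKernel(const std::string& arg_sig) const {
    std::lock_guard<std::mutex> guard(mutex_);
    auto it = kernels_.find(arg_sig);
    if (it == kernels_.end()) return nullptr;
    return it->second;
  }

  // Stores a kernel; if one is already present for arg_sig, it is kept.
  void cacheKernel(const std::string& arg_sig,
                   std::shared_ptr<SketchKernel> kernel) {
    assert(kernel != nullptr);
    std::lock_guard<std::mutex> guard(mutex_);
    kernels_.emplace(arg_sig, std::move(kernel));
  }

 private:
  mutable std::mutex mutex_;
  std::unordered_map<std::string, std::shared_ptr<SketchKernel>> kernels_;
};

// Hypothetical specification cache, filled during "upfront" registration.
class SketchSpecCache {
 public:
  // Registers a graph (here just its text) and returns its key.
  // Registering the same text twice returns the same key.
  int64_t store(const std::string& graph_text) {
    std::lock_guard<std::mutex> guard(mutex_);
    auto it = graph_to_key_.find(graph_text);
    if (it != graph_to_key_.end()) return it->second;
    const int64_t key = next_key_++;
    graph_to_key_.emplace(graph_text, key);
    specs_.emplace(key, std::make_shared<SketchKernelSpec>());
    return key;
  }

  // Returns the specification for key, or nullptr if it was never stored.
  std::shared_ptr<SketchKernelSpec> retrieve(int64_t key) const {
    std::lock_guard<std::mutex> guard(mutex_);
    auto it = specs_.find(key);
    if (it == specs_.end()) return nullptr;
    return it->second;
  }

 private:
  mutable std::mutex mutex_;
  int64_t next_key_ = 0;
  std::unordered_map<std::string, int64_t> graph_to_key_;
  std::unordered_map<int64_t, std::shared_ptr<SketchKernelSpec>> specs_;
};

using SketchCompileFn = std::function<std::shared_ptr<SketchKernel>()>;

// Executor-side lookup: check the spec's kernel store first and only call
// the (assumed) compile callback on a miss. Two racing threads may both
// compile; the first stored kernel wins and is returned to both.
std::shared_ptr<SketchKernel> getOrCompile(SketchKernelSpec& spec,
                                           const std::string& arg_sig,
                                           const SketchCompileFn& compile) {
  if (auto cached = spec.findKernel(arg_sig)) return cached;
  spec.cacheKernel(arg_sig, compile());
  return spec.findKernel(arg_sig);
}
```

In this sketch the compile callback runs outside any lock, so a slow compilation does not block lookups for other argument signatures. The cost is that two threads may occasionally compile the same kernel. Whether the real fuser makes the same trade-off is not stated above.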

The device-specific components have logic for compiling and running code in FusedKernelCPU (cpu/fused_kernel.h/cpp) and FusedKernelCUDA (cuda/fused_kernel.h/cpp).
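As a rough illustration of how device-independent code can defer to device-specific kernels and drop to a fallback when a device has no fusion support, here is a minimal sketch. All names (SketchDevice, SketchFusedKernel, SketchFusedKernelCPU, makeKernel, runSubgraph) are hypothetical and do not mirror the real classes' signatures. The "fused" computation is hard-coded as a * b + a rather than generated and compiled, and only a CPU subclass is shown so the example runs without a GPU.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical device tag; "Other" stands for any device without fusion
// support in this sketch.
enum class SketchDevice { CPU, Other };

// Hypothetical common interface that device-specific kernels implement.
struct SketchFusedKernel {
  virtual ~SketchFusedKernel() = default;
  virtual void launch(const std::vector<float>& a,
                      const std::vector<float>& b,
                      std::vector<float>& out) const = 0;
};

// CPU flavor: computes a * b + a in a single pass over the data.
// A CUDA flavor would be a second subclass behind the same interface.
struct SketchFusedKernelCPU final : SketchFusedKernel {
  void launch(const std::vector<float>& a,
              const std::vector<float>& b,
              std::vector<float>& out) const override {
    out.resize(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
      out[i] = a[i] * b[i] + a[i];
    }
  }
};

// Returns a kernel for the device, or nullptr if fusion is unsupported.
std::unique_ptr<SketchFusedKernel> makeKernel(SketchDevice device) {
  if (device == SketchDevice::CPU) {
    return std::unique_ptr<SketchFusedKernel>(new SketchFusedKernelCPU());
  }
  return nullptr;
}

struct SketchResult {
  std::vector<float> values;
  std::string path;  // "fused" or "fallback"
};

// Runs the subgraph fused when possible, otherwise one operation at a time.
// Assumes a and b have the same length.
SketchResult runSubgraph(SketchDevice device,
                         const std::vector<float>& a,
                         const std::vector<float>& b) {
  SketchResult result;
  if (auto kernel = makeKernel(device)) {
    kernel->launch(a, b, result.values);
    result.path = "fused";
    return result;
  }
  // Fallback: materialize the intermediate product, then add.
  std::vector<float> product(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) product[i] = a[i] * b[i];
  result.values.resize(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) {
    result.values[i] = product[i] + a[i];
  }
  result.path = "fallback";
  return result;
}
```

Both paths produce the same values. The fused path simply avoids the intermediate buffer, which is the kind of saving a fused kernel is meant to provide.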