mirror of
https://github.com/zebrajr/pytorch.git
synced 2025-12-07 12:21:27 +01:00
Re-landing https://github.com/pytorch/pytorch/pull/68111 ## Description Preview4 PR of this [RFC](https://github.com/pytorch/pytorch/issues/49444). On the basis of https://github.com/pytorch/pytorch/pull/50256, the below improvements are included: - The [preview4 release branch](https://github.com/oneapi-src/oneDNN/releases/tag/graph-v0.4.1) of the oneDNN Graph API is used - The fuser now works with the profiling graph executor. We have inserted type check nodes to guard the profiled tensor properties. ### User API: The optimization pass is disabled by default. Users could enable it by: ``` torch.jit.enable_onednn_fusion(True) ``` ### Performance: [pytorch/benchmark](https://github.com/pytorch/benchmark) tool is used to compare the performance: - SkyLake 8180 (1 socket of 28 cores):  - SkyLake 8180 (single thread):  \* By mapping hardswish to oneDNN Graph, it’s 8% faster than PyTorch JIT (NNC + OFI) \** We expect performance gain after mapping transpose, contiguous & view to oneDNN graph ops ### Directory structure of the integration code Fuser-related code are placed under: ``` torch/csrc/jit/codegen/onednn/ ``` Optimization pass registration is done in: ``` torch/csrc/jit/passes/onednn_graph_fuser.h ``` CMake for the integration code is: ``` caffe2/CMakeLists.txt ``` ## Limitations - In this PR, we have only supported the optimization on Linux platform. The support on Windows and MacOS will be enabled as the next step. - We have only optimized the inference use case. Pull Request resolved: https://github.com/pytorch/pytorch/pull/74596 Approved by: https://github.com/malfet |
||
|---|---|---|
| .. | ||
| init.cpp | ||
| init.h | ||
| module_python.h | ||
| pybind_utils.cpp | ||
| pybind_utils.h | ||
| pybind.h | ||
| python_arg_flatten.cpp | ||
| python_arg_flatten.h | ||
| python_custom_class.cpp | ||
| python_custom_class.h | ||
| python_dict.cpp | ||
| python_dict.h | ||
| python_interpreter.cpp | ||
| python_ir.cpp | ||
| python_ir.h | ||
| python_ivalue.h | ||
| python_list.cpp | ||
| python_list.h | ||
| python_sugared_value.cpp | ||
| python_sugared_value.h | ||
| python_tracer.cpp | ||
| python_tracer.h | ||
| python_tree_views.cpp | ||
| python_tree_views.h | ||
| script_init.cpp | ||
| script_init.h | ||
| update_graph_executor_opt.cpp | ||
| update_graph_executor_opt.h | ||