pytorch/torch/csrc/jit/python
chunyuan 8b11d81058 [Re-landing 68111] Add JIT graph fuser for oneDNN Graph API (Preview4.1)
Re-landing https://github.com/pytorch/pytorch/pull/68111

## Description
Preview4 PR of this [RFC](https://github.com/pytorch/pytorch/issues/49444).

Building on https://github.com/pytorch/pytorch/pull/50256, this PR includes the following improvements:

- The [preview4 release branch](https://github.com/oneapi-src/oneDNN/releases/tag/graph-v0.4.1) of the oneDNN Graph API is used
- The fuser now works with the profiling graph executor. We have inserted type check nodes to guard the profiled tensor properties.

### User API:
The optimization pass is disabled by default. Users can enable it with:
```
torch.jit.enable_onednn_fusion(True)
```
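
As a minimal sketch of how the call above might be used for inference, the snippet below enables the fuser, traces and freezes a small model, and runs a couple of warm-up iterations so the profiling graph executor can record tensor properties before fusion applies. The toy model, the input shape, the number of warm-up runs, and the `hasattr` guard are all assumptions made for illustration; they are not part of this PR.

```python
import torch
import torch.nn as nn

# Opt in to the oneDNN Graph fuser; it is disabled by default.
# The hasattr guard is a defensive assumption for builds that predate this API.
if hasattr(torch.jit, "enable_onednn_fusion"):
    torch.jit.enable_onednn_fusion(True)

# Hypothetical toy model, chosen only for illustration.
model = nn.Sequential(nn.Conv2d(3, 8, kernel_size=3), nn.ReLU()).eval()
example = torch.randn(1, 3, 32, 32)

with torch.no_grad():
    expected = model(example)            # eager-mode reference result
    traced = torch.jit.trace(model, example)
    traced = torch.jit.freeze(traced)    # only the inference use case is optimized
    # Warm-up runs let the profiling executor observe tensor properties
    # before the optimized graph is used.
    for _ in range(2):
        traced(example)
    out = traced(example)

# Compare the JIT result with eager mode within a small tolerance.
print(torch.allclose(out, expected, atol=1e-4))
```

On platforms or builds where the fuser is unavailable, the traced module should still run through the regular JIT path, so the comparison with eager mode remains a useful sanity check.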

### Performance:
The [pytorch/benchmark](https://github.com/pytorch/benchmark) tool is used to compare performance:
- SkyLake 8180 (1 socket of 28 cores):

  ![image](https://user-images.githubusercontent.com/65992142/151162305-05e44425-a24e-4d5e-94e1-743b40b87a8c.png)

- SkyLake 8180 (single thread):

  ![image](https://user-images.githubusercontent.com/65992142/151162528-69f90b79-d08d-46b8-8775-d80a6ccbce8a.png)

  \* With hardswish mapped to oneDNN Graph, it’s 8% faster than PyTorch JIT (NNC + OFI)

  \** We expect a performance gain after mapping transpose, contiguous & view to oneDNN Graph ops

### Directory structure of the integration code
Fuser-related code is placed under:
```
torch/csrc/jit/codegen/onednn/
```

Optimization pass registration is done in:
```
torch/csrc/jit/passes/onednn_graph_fuser.h
```

The CMake configuration for the integration code is in:
```
caffe2/CMakeLists.txt
```

## Limitations

- This PR supports the optimization only on Linux. Support for Windows and macOS will be enabled as a next step.
- Only the inference use case is optimized.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/74596
Approved by: https://github.com/malfet
2022-04-29 01:01:33 +00:00
init.cpp [Re-landing 68111] Add JIT graph fuser for oneDNN Graph API (Preview4.1) 2022-04-29 01:01:33 +00:00
init.h
module_python.h
pybind_utils.cpp Introducing SymInt to Pytorch (for tracing size arithmetic) (master rebase) (#74861) 2022-03-31 21:59:59 +00:00
pybind_utils.h Get rid of TorchScript sparse tensor is experimental warning. (#73874) 2022-03-09 15:45:24 +00:00
pybind.h
python_arg_flatten.cpp
python_arg_flatten.h
python_custom_class.cpp
python_custom_class.h
python_dict.cpp
python_dict.h
python_interpreter.cpp
python_ir.cpp Introducing SymInt to Pytorch (for tracing size arithmetic) (master rebase) (#74861) 2022-03-31 21:59:59 +00:00
python_ir.h
python_ivalue.h
python_list.cpp [BE] Fix pybind deprecation warnings (#72376) 2022-02-07 18:33:32 +00:00
python_list.h Fix sign-compare violations in python_list.h 2022-04-01 19:15:51 +00:00
python_sugared_value.cpp [JIT] support parameterlist iteration 2022-04-21 18:51:27 +00:00
python_sugared_value.h sup torch script parameterlist 2022-04-20 20:53:07 +00:00
python_tracer.cpp
python_tracer.h
python_tree_views.cpp Reland "Make debug_pkl smaller by only emitting unique traces." (#73368) 2022-04-18 22:34:21 +00:00
python_tree_views.h
script_init.cpp sup torch script parameterlist 2022-04-20 20:53:07 +00:00
script_init.h
update_graph_executor_opt.cpp
update_graph_executor_opt.h