Commit Graph

1 Commits

Author SHA1 Message Date
Evgeny Fiksman
e372f42110 [caffe2] Explicit vectorization of LSTM operator (#35556)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35556

Pull Request resolved: https://github.com/pytorch/pytorch/pull/35542

Apply explicit vectorization to lstm_unit operator.
Enabled by -DENABLE_VECTORIZATION=1

This optimization requires vector library support and was tested with Intel SVML & clang.
However, compiler which support OpenMP4.5 with omp simd extention should also benefit.

After the code changes
In file included from caffe2/caffe2/operators/lstm_unit_op.cc:1:
caffe2/caffe2/operators/lstm_unit_op.h:60:1: remark: vectorized loop (vectorization width: 8, interleaved count: 1) [-Rpass=loop-vectorize]
VECTOR_LOOP for (int d = 0; d < D; ++d) {

caffe2/caffe2/operators/lstm_unit_op.h:60:1: remark: vectorized loop (vectorization width: 8, interleaved count: 1) [-Rpass=loop-vectorize]
caffe2/caffe2/operators/lstm_unit_op.h:112:1: remark: vectorized loop (vectorization width: 8, interleaved count: 1) [-Rpass=loop-vectorize]
VECTOR_LOOP for (int d = 0; d < D; ++d) {

Test Plan:
Check failures at OSS CI
- No build failures related to this change
- Failing tests are:
  - py3.6-clang7-rocmdeb-ubuntu16.04-test2
>RuntimeError: fft: ATen not compiled with MKL support
  - caffe2_onnx_ort2_py3_6_clang7_ubuntu16_04_test -
>gradient_check_test.py::TestMakeTwo
Exited with code exit status 1
- pytorch_macos_10_13_py3_test , Test errors like:
> ERROR [0.014s]: test_boolean_indexing_weirdness_cpu (__main__.NumpyTestsCPU)
RuntimeError: shape mismatch: indexing tensors could not be broadcast together with shapes [0], [2]
- caffe2_onnx_ort1_py3_6_clang7_ubuntu16_04_test
  - No failure info

Reviewed By: jspark1105

Differential Revision: D20484640

fbshipit-source-id: 8fb82dbd6698c8de3e0bbbc0b48d15b70e36ca94
2020-04-01 17:19:56 -07:00