pytorch/android/pytorch_android
Jiakai Liu d6e3aed032 add eigen blas for mobile build (#26508)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26508

Enable BLAS for the PyTorch mobile build using Eigen BLAS.
It's not the juiciest optimization for typical mobile CV models, as we already
use NNPACK/QNNPACK for most ops there. But it's nice to have a good fallback
implementation for other ops.

Test Plan:
- Create a simple matrix multiplication script model:
```
import torch

class Net(torch.nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.weights = torch.ones(1000, 1000)

    def forward(self, x):
        return torch.mm(x, self.weights)

n = Net()
module = torch.jit.trace_module(n, {'forward': torch.ones(1000, 1000)})
module.save('mm.pk')
```

- Before integrating Eigen BLAS:
```
adb shell 'cd /data/local/tmp; \
./speed_benchmark_torch \
--model=mm.pk \
--input_dims="1000,1000" \
--input_type=float \
--warmup=5 \
--iter=5'

Milliseconds per iter: 2218.52.
```

- After integrating Eigen BLAS:
```
adb shell 'cd /data/local/tmp; \
./speed_benchmark_torch_eigen \
--model=mm.pk \
--input_dims="1000,1000" \
--input_type=float \
--warmup=5 \
--iter=5'

Milliseconds per iter: 314.535.
```
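
For reference, the two timings above work out to roughly a 7x speedup on this matmul. A minimal sketch of the arithmetic, assuming the usual 2*M*N*K FLOP count for a dense matrix multiply; the helper names are illustrative and not part of `speed_benchmark_torch`:
```python
# Timings copied from the two benchmark runs above (milliseconds per iteration).
BEFORE_MS = 2218.52
AFTER_MS = 314.535

# Assumption: a dense (M x K) @ (K x N) multiply costs about 2*M*N*K FLOPs.
M = N = K = 1000
FLOPS_PER_MM = 2 * M * N * K


def speedup(before_ms: float, after_ms: float) -> float:
    """Return how many times faster the 'after' run is than the 'before' run."""
    return before_ms / after_ms


def gflops(ms_per_iter: float, flops: int = FLOPS_PER_MM) -> float:
    """Return the achieved throughput in GFLOP/s for one iteration."""
    seconds = ms_per_iter / 1000.0
    return flops / seconds / 1e9


if __name__ == "__main__":
    print(f"speedup: {speedup(BEFORE_MS, AFTER_MS):.2f}x")
    print(f"before:  {gflops(BEFORE_MS):.2f} GFLOP/s")
    print(f"after:   {gflops(AFTER_MS):.2f} GFLOP/s")
```
The GFLOP/s figures are only a rough estimate, since the per-iteration time also includes whatever overhead the benchmark binary adds around the `mm` call.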

- Improves MobileNetV2 single-thread performance by ~5%:
```
adb shell 'cd /data/local/tmp; \
./speed_benchmark_torch \
--model=mobilenetv2.pk \
--input_dims="1,3,224,224" \
--input_type=float \
--warmup=5 \
--iter=20 \
--print_output=false \
--caffe2_threadpool_force_inline=true'

Milliseconds per iter: 367.055.

adb shell 'cd /data/local/tmp; \
./speed_benchmark_torch_eigen \
--model=mobilenetv2.pk \
--input_dims="1,3,224,224" \
--input_type=float \
--warmup=5 \
--iter=20 \
--print_output=false \
--caffe2_threadpool_force_inline=true'

Milliseconds per iter: 348.77.
```
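
The ~5% figure follows directly from the two MobileNetV2 timings above. A small sketch of the calculation; the function name is hypothetical and chosen only for illustration:
```python
# Timings copied from the two MobileNetV2 runs above (milliseconds per iteration).
BASELINE_MS = 367.055  # speed_benchmark_torch
EIGEN_MS = 348.77      # speed_benchmark_torch_eigen


def percent_improvement(before_ms: float, after_ms: float) -> float:
    """Return the reduction in time per iteration as a percentage of 'before'."""
    return (before_ms - after_ms) / before_ms * 100.0


if __name__ == "__main__":
    saved_ms = BASELINE_MS - EIGEN_MS
    pct = percent_improvement(BASELINE_MS, EIGEN_MS)
    print(f"saved {saved_ms:.3f} ms per iteration ({pct:.2f}% less time)")
```
With only 20 iterations per run, a difference of this size may be close to run-to-run noise on a device, so the number is best read as indicative.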

Differential Revision: D17489587

fbshipit-source-id: efe542db810a900f680da7ec7e60f215f58db66e
2019-09-20 15:45:11 -07:00
| Name | Last commit | Date |
| --- | --- | --- |
| src | turn off autograd mode in android JNI wrapper (#26477) | 2019-09-19 21:25:39 -07:00 |
| build.gradle | Exclude libfbjni.so from pytorch_android not to have its duplicating (#26382) | 2019-09-18 18:40:48 -07:00 |
| CMakeLists.txt | add eigen blas for mobile build (#26508) | 2019-09-20 15:45:11 -07:00 |
| generate_test_torchscripts.py | Fix python lints for generate_test_torchscripts.py (#25107) | 2019-08-23 11:37:23 -07:00 |
| gradle.properties | Gradle tasks for publishing to bintray, jcenter, mavencentral etc. (#25351) | 2019-08-30 17:52:34 -07:00 |