131 Commits

Author SHA1 Message Date
Will Wilson
f6496229a5 Fixes xcode 10 beta 4 compile error (#9748)
Summary:
When building iOS apps with a caffe2 dependency, we were seeing the error `caffe2/caffe2/mobile/contrib/ios/mpscnn/mpscnn.mm:33:17: error: method 'copyWithZone:' in protocol 'NSCopying' not implemented [-Werror,-Wprotocol]`. This fixes it by implementing that method as a shallow copy.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9748

Reviewed By: jerryzh168

Differential Revision: D8954332

Pulled By: williamtwilson

fbshipit-source-id: 0cd44408257c0bd3f4ffb80312ea9d13d13e5ff3
2018-07-24 11:11:35 -07:00
Orion Reblitz-Richardson
7f33ec55b2 Fix Eigen issue on OS X with CUDA and nvcc compile (#9350)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9350

Re-apply #9270

Breaking this out of #8338

This takes care of the Eigen failure we saw on Mac CUDA builds when BUILD_CAFFE2 and BUILD_ATEN were removed. Fix is to isolate Eigen from headers included by cu files and processed by nvcc. This was worked on with smessmer.

Reviewed By: mingzhe09088

Differential Revision: D8794431

fbshipit-source-id: de656334af46c697802073f8e8d9a6aeb9ca65a7
2018-07-11 14:00:05 -07:00
Keren Zhou
ea1869244f Change depthwise convolution bandwidth formula (#9317)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9317

Change depthwise convolution bandwidth formula

Reviewed By: hlu1

Differential Revision: D8786684

fbshipit-source-id: ba76fea94a6d2fda8d87f40dd626b3dfd90770ed
2018-07-10 14:24:10 -07:00
Mike Kelley
8e6e8098ce Revert D8768025: [pytorch][PR] Fix Eigen issue on OS X with CUDA and nvcc compile
Differential Revision:
D8768025

Original commit changeset: 5b34017aeb67

fbshipit-source-id: 6ec892ff483bb9d966eb7138eadc77443972c8f8
2018-07-10 10:24:43 -07:00
Orion Reblitz-Richardson
bbeae24145 Fix Eigen issue on OS X with CUDA and nvcc compile (#9270)
Summary:
Breaking this out of #8338

This takes care of the Eigen failure we saw on Mac CUDA builds when BUILD_CAFFE2 and BUILD_ATEN were removed. Fix is to isolate Eigen from headers included by cu files and processed by nvcc. This was worked on with smessmer.

cc mingzhe09088 smessmer BIT-silence Yangqing
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9270

Reviewed By: mingzhe09088

Differential Revision: D8768025

Pulled By: orionr

fbshipit-source-id: 5b34017aeb67e35a1b5938d962181ccd4cd37591
2018-07-10 09:25:42 -07:00
Orion Reblitz-Richardson
9ec0a2aef4 fbshipit-source-id: ba600fcd2b5cefc7621357bdeb05e24cea02e5af 2018-06-27 04:50:56 -07:00
Sebastian Meßmer
49f8581745
Update from facebook (#7855)
* [mpscnn] MPSCNNChannelShuffle

att

* [Easy] Adding tags as an argument to the functional layer

Without it "tags" would be added as an argument to the operator.

The change here is based on the assumption that there is no operator that takes "tags" as an argument.

* Fix locally_connected_op schema check.

Fix locally_connected_op schema check.

* [C2] Add TypeAndShape inference for few more operators

As desc

* [c2] Shape inference should support 0 as dimension

Tensors can have 0 in their dimension.

* Make MockHiveReader loop over and support max_examples

Replace DatasetReader with RandomDatasetReader.

So that Mock Hive Reader can simulate a large data input using a small sample file as source.

* Utility function to wipe cache between benchmark runs

Caffe2 benchmark does not wipe out cache between runs, and this potentially creates an unrealistically optimistic picture of performance. This diff adds utility function to wipe out the cache.

* Allow caffe2 GlobalInit to be invoked multiple times

Allow caffe2 GlobalInit to be invoked multiple times. Will re-parse gflags and update logging levels on successive invocations, but will not re-run init functions or perform other one-time initialization.

* Add Caffe2 GlobalInitIsCalledGuard to base net and operator classes

Warn if caffe2's GlobalInit function has not been invoked before creating an operator or net object. This is based on discussion here: https://fb.quip.com/kqGIAbmK7vNG

* Rethrow current exception on failure

Rethrow current exception instead of copy constructing a new one on op failure.

* Make `clone()` return subclass of List/Struct

`clone()` is not working correctly when we subclass those classes

* Wipe the cache before the net run

the util function is copied from D7409424
will rebase once D7409424 is landed.

* [Caffe2] [Mobile] Support utils/cast.h::GetCastDataType with LITE_PROTO builds

* Correct includes

async_polling include -> async_base include

* Prepare execution flags for executor migration

Making async_scheduling aware of underlying net type to prepare for executor
migration

* Add operator level observers into async executor

Adding operator level observers into RunAsync operators' calls

* Cleanup TEST_Benchmark

Remove duplicate code and provide default implementation in NetBase

* [C2] Fix type and shape inference for binary comparison ops

As desc.

* Add GlobalInit to predictor to ensure initialization is always done before prediction

FACEBOOK:

Redo D7651453 the correct way.

Now use a static variable for the arguments passed to GLog

* Remove spammy log message

This method is currently used in various places inside Caffe itself.

* Disable events for operators inside a chain

We don't need to use events in operators within a chain because the chain is
always scheduled on a single stream, keeping only first and last event for
scheduling purposes

* Ensure correct finish run order

In rare cases we might call finishRun and trigger net's destruction while
another worker is still holding shared_ptr to a thread pool, that can cause
thread pool destruction from within a worker thread in case no other nets are
using the pool. This diff fixes the order of calling finishRun and also changes
pool() to return raw pointer to keep pool's ownership within the net

* Reduce unnecessary polling

Make sure we don't waste CPU by polling operators that we can set efficient
callbacks on

* Squash commit of syncing 9506eeb from github to fbcode

Patch xplat buck fix

add virtual destructor to OptimizationPass

build fixes for sync

* Fix net tracing

Fix net tracing from async_scheduling

* Fix logging
2018-05-29 11:38:02 -07:00
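The "Rethrow current exception on failure" item above can be illustrated with a minimal sketch. It assumes the point of the change is that a bare `throw;` keeps the original exception object, while copy constructing a new one from a base-class reference loses the derived type. `OpError`, `run_and_rethrow`, and `rethrow_preserves_type` are hypothetical names for illustration, not Caffe2 code.

```cpp
#include <stdexcept>

// Hypothetical exception type standing in for an operator failure;
// it is not a Caffe2 class.
struct OpError : std::runtime_error {
  using std::runtime_error::runtime_error;
};

// Runs `op` and, on failure, rethrows the in-flight exception.
template <typename Op>
void run_and_rethrow(Op op) {
  try {
    op();
  } catch (const std::exception&) {
    // A bare `throw;` rethrows the original exception object, so its
    // dynamic type survives. `throw e;` on a caught base-class reference
    // would copy-construct a new std::exception and slice off OpError.
    throw;
  }
}

// Returns true if the exception escaping run_and_rethrow is still an OpError.
bool rethrow_preserves_type() {
  try {
    run_and_rethrow([] { throw OpError("op failed"); });
  } catch (const OpError&) {
    return true;
  } catch (...) {
    return false;
  }
  return false;
}
```

Callers further up the stack can then still catch the specific derived type rather than only `std::exception`.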
Marat Dukhan
61a69c2492
[caffe2] Use both __ARM_NEON__ and __ARM_NEON macros (#6697)
ARM64 clang from Android NDK doesn't define __ARM_NEON__, which results in a perf regression on some models. I figured that some compilers define __ARM_NEON__ while others define __ARM_NEON. This patch changes all NEON-specific parts in Caffe2 to check both macros.
2018-04-18 17:45:47 -04:00
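A minimal sketch of the two-macro check described in the commit above. The `EXAMPLE_HAS_NEON` macro and the `add_arrays` function are hypothetical and only illustrate the pattern; the actual call sites in Caffe2 may look different.

```cpp
#include <cstddef>

// Treat NEON as available if either spelling of the macro is defined.
// EXAMPLE_HAS_NEON is an illustrative name, not a Caffe2 macro.
#if defined(__ARM_NEON__) || defined(__ARM_NEON)
#define EXAMPLE_HAS_NEON 1
#include <arm_neon.h>
#else
#define EXAMPLE_HAS_NEON 0
#endif

// Adds two float arrays element-wise. Uses NEON intrinsics four lanes at a
// time when available and falls back to scalar code for the remainder.
void add_arrays(const float* a, const float* b, float* out, std::size_t n) {
  std::size_t i = 0;
#if EXAMPLE_HAS_NEON
  for (; i + 4 <= n; i += 4) {
    vst1q_f32(out + i, vaddq_f32(vld1q_f32(a + i), vld1q_f32(b + i)));
  }
#endif
  for (; i < n; ++i) {
    out[i] = a[i] + b[i];
  }
}
```

Checking only one spelling would silently select the scalar path on compilers that define the other, which matches the kind of regression the commit describes.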
Jerry Zhang
711343f981
Gltensor fix (#6647)
Fix getGLTensor
2018-04-17 16:25:38 -07:00
Jerry Zhang
63472bcf29
Sync current changes in ACL backend (#6484)
* Sync changes in ACL backend
2018-04-10 17:32:22 -07:00
Orion Reblitz-Richardson
0ac4d19a29 Linter changes. 2018-03-30 21:00:44 -07:00
Orion Reblitz-Richardson
02786a3819 Linter changes. 2018-03-30 21:00:44 -07:00
Orion Reblitz-Richardson
1d5780d42c Remove Apache headers from source.
* LICENSE file contains details, so removing from individual source files.
2018-03-27 13:10:18 -07:00
Marat Dukhan
9123fcc857 Use std::cout instead of LOG(INFO) in TEST_Benchmark implementation
LOG(INFO) can be stripped out at compile-time or disabled at run-time,
but there are hardly any use cases where we want to call TEST_Benchmark
but don't want to see the result. Additionally, on Android, LOG(INFO)
writes to logcat, which is OK for errors/warnings, but inconvenient
for benchmarking results, as on new phones logcat spawns logs like crazy.
2018-03-20 15:31:03 -04:00
Jerry Zhang
27cb06ae22 Adding rewrite_net for ACL backend (#2186)
Add rewrite_net for ACL backend
2018-03-09 09:00:21 -08:00
Jerry Zhang
ab8498e5c8 Acl copy ops (#2158)
Copy op for ACL backend
2018-03-07 13:34:08 -08:00
Dmytro Dzhulgakov
80d0f5de93 [mobile][mpscnn] iOS11.3 interface update
data source change for MPSCNNConvolution
2018-03-06 00:33:11 -08:00
Marat Dukhan
e07083f00a Cleanup CMake files and build scripts for Android (#2067)
- Remove USE_ARM64 option because it doesn't do what is expected
- Disable ARM ComputeLibrary for non-ARM/ARM64 builds
- Remove analysis of CMake options from scripts/build_android.sh
- Add user-specified CMake options at the end of command line to allow overriding defaults
- Update README for ARM ComputeLibrary integration and no longer require disabling NNPACK for ARM64 builds with ARM ComputeLibrary
2018-02-27 16:05:21 -08:00
Jerry Zhang
12a477b12e Update README.md 2018-02-26 18:21:24 -08:00
Jerry Zhang
679232657d Update README.md 2018-02-26 18:02:28 -08:00
Jerry Zhang
ec194f2468 Fix typos in README 2018-02-26 18:01:27 -08:00
Jerry Zhang
c0866e45c7 Caffe2 ARM ComputeLibrary integration (#2015)
Caffe2 ARM Compute Library Integration
2018-02-23 18:09:05 -08:00
Hao Lu
6df58dac1d Make NNApi build
Summary:
To build with tests and benchmarks
`./scripts/build_android.sh -G Ninja -DBUILD_TEST=ON -DUSE_NNAPI=ON`
To run unit test
`adb push build_android/bin/nnapi_test data/local/tmp`
`adb shell "cd data/local/tmp && ./nnapi_test"`
To run benchmark
`adb push build_android/bin/nnapi_benchmark data/local/tmp`
`adb shell "cd data/local/tmp && ./nnapi_benchmark"`
Tested on Google Pixel 2 XL with Android 8.1
Closes https://github.com/caffe2/caffe2/pull/1918

Reviewed By: Maratyszcza

Differential Revision: D6944604

Pulled By: hlu1

fbshipit-source-id: 462f010117ae4628b23bef506c41397de3817ad4
2018-02-08 19:02:18 -08:00
Hao Lu
de2a708187 Rename test.cc
Reviewed By: jerryzh168

Differential Revision: D6941693

fbshipit-source-id: ced6063b1776464953b445a0bc907d18baf4b172
2018-02-08 15:48:56 -08:00
Hao Lu
99cdf7f91c Integrate android nn api
Summary: Integrate android nn api into Caffe2. Supported ops include averagepool, maxpool, conv, relu, and softmax

Reviewed By: Maratyszcza

Differential Revision: D6560366

fbshipit-source-id: 2879a99c01acb050e711d9d7d5bde022ef95888d
2018-02-07 16:53:58 -08:00
Fei Sun
849b0a0e0e Update SNPE readme. Indicate libgnustl_shared.so is also needed to run snpe binaries
Summary:
Closes https://github.com/caffe2/caffe2/pull/1776

Reviewed By: bwasti

Differential Revision: D6777970

Pulled By: sf-wind

fbshipit-source-id: 86a863536afadb2f22303b065e1dfcd3896f1152
2018-01-25 16:35:23 -08:00
Hao Lu
cb7350fc8d Add vulkanSymbolWrapperReset function
Reviewed By: Maratyszcza

Differential Revision: D6707702

fbshipit-source-id: 140c4be7884a307953684a13202c668cb2c1a927
2018-01-12 21:18:06 -08:00
Pieter Noordhuis
944f9aa826 Move Android.mk 2018-01-10 11:32:34 -08:00
Marat Dukhan
2435d22782 Move NNPACK integration to share/contrib/nnpack
Summary:
we are going to deprecate NNPACK bindings in caffe2/contrib/nnpack.
The first step is to move modern NNPACK bindings from caffe2/mobile/contrib/ios/ to
caffe2/share/contrib/nnpack/, and is implemented in this diff.

Reviewed By: sf-wind

Differential Revision: D6687454

fbshipit-source-id: 458614bade92ab5ba5d2ab7f0691071043198b57
2018-01-09 17:22:24 -08:00
Yangqing Jia
bf37548ccc Properly include the generate proposal headers.
The header files will be committed separately from fbcode.
2018-01-02 21:05:19 -08:00
Hao Lu
b132187014 Add vulkan stub
Summary:
Imported and modified from https://github.com/ARM-software/vulkan-sdk
I changed libvulkan-stub.cpp to libvulkan-stub.c

Reviewed By: Maratyszcza

Differential Revision: D6641092

fbshipit-source-id: 1a7fbf745d58b6111a06a983910c583912365357
2017-12-28 17:37:07 -08:00
Hao Lu
e996157a5c Check if dlopen() return handle is NULL in open_libopencl_so()
Reviewed By: Maratyszcza

Differential Revision: D6616616

fbshipit-source-id: 36aab05ec38ca1b843b05f36433dcd90ca476122
2017-12-20 17:36:19 -08:00
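The NULL check on the `dlopen()` handle mentioned in the commit above could look roughly like the sketch below. `open_library_or_null` is a hypothetical wrapper, not the actual `open_libopencl_so()` from the source tree; only the POSIX calls `dlopen` and `dlerror` are real.

```cpp
#include <dlfcn.h>

#include <cstdio>

// Hypothetical wrapper: tries to load a shared library and returns nullptr
// (after logging the loader's message) when dlopen() fails.
void* open_library_or_null(const char* path) {
  void* handle = dlopen(path, RTLD_LAZY | RTLD_LOCAL);
  if (handle == nullptr) {
    // dlerror() describes the most recent failure; it may itself be null.
    const char* err = dlerror();
    std::fprintf(stderr, "dlopen(%s) failed: %s\n", path,
                 err != nullptr ? err : "unknown error");
    return nullptr;
  }
  return handle;
}
```

Without the check, a later `dlsym()` on the failed handle would not look up symbols in the intended library. On older glibc versions this needs `-ldl` at link time.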
Hao Lu
038fb70455 Remove dlopen() in get_libopencl_path()
Reviewed By: Maratyszcza

Differential Revision: D6584697

fbshipit-source-id: bdf5c6c6dc75eb0d7d46b1eba9852a9814f57373
2017-12-15 19:18:17 -08:00
Jerry Zhang
1766e27324 Add DepthwiseConv in iOS11+
Summary: Use MPSCNNDepthwiseConv when groups == input_channels

Reviewed By: ajtulloch

Differential Revision: D6541561

fbshipit-source-id: 7164f26b8f3a101c0ab5c3e6c02ed855397d2750
2017-12-15 16:47:36 -08:00
Peter Goldsborough
95b3c7edad Fix undefined behavior in GLFilter
Summary: Ran into some issues where these values seemed to be initialized to 0 and caused some trouble. Initializing to 1 is safe and well defined.

Reviewed By: hlu1

Differential Revision: D6582774

fbshipit-source-id: 088ec4e782d9680a1d9b4d2d42523d06cbc7dd72
2017-12-15 15:38:44 -08:00
Jerry Zhang
0365640d7e Fix ConvTranspose
Summary: Turns out that similar to RoIWarp, col2im in custom ConvTranspose implementation is also missing a bound check for image.

Reviewed By: ajtulloch

Differential Revision: D6494061

fbshipit-source-id: 1fadbdd05f360b20343df49b70d2be65eab128ac
2017-12-06 12:20:57 -08:00
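To show what a bound check in col2im guards against, here is a CPU sketch for a single channel. The actual fix lives in the MPSCNN ConvTranspose implementation and is not reproduced here; `col2im_1ch` and its column layout are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scatter-adds a column buffer back into an h x w single-channel image.
// Assumed layout (illustrative): col is [kh * kw][out_h * out_w], row-major.
std::vector<float> col2im_1ch(const std::vector<float>& col, int h, int w,
                              int kh, int kw, int pad, int stride) {
  const int out_h = (h + 2 * pad - kh) / stride + 1;
  const int out_w = (w + 2 * pad - kw) / stride + 1;
  assert(col.size() == static_cast<std::size_t>(kh * kw * out_h * out_w));
  std::vector<float> im(static_cast<std::size_t>(h * w), 0.0f);
  for (int ky = 0; ky < kh; ++ky) {
    for (int kx = 0; kx < kw; ++kx) {
      for (int oy = 0; oy < out_h; ++oy) {
        for (int ox = 0; ox < out_w; ++ox) {
          const int iy = oy * stride - pad + ky;
          const int ix = ox * stride - pad + kx;
          // Bound check: taps that fall into the padding region have no
          // image pixel to write to and must be skipped.
          if (iy < 0 || iy >= h || ix < 0 || ix >= w) {
            continue;
          }
          im[iy * w + ix] += col[((ky * kw + kx) * out_h + oy) * out_w + ox];
        }
      }
    }
  }
  return im;
}
```

With padding, some kernel taps map to coordinates outside the image; dropping the check would turn those into out-of-range writes.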
Jerry Zhang
3c1932c35f Fix RoIWarp
Summary: Fix MPSCNNRoIWarp and made it more general to channels

Reviewed By: ajtulloch

Differential Revision: D6493869

fbshipit-source-id: 77cfa2e2f3bd80efc6e69a0774793e0162d9942a
2017-12-06 11:02:07 -08:00
Hao Lu
76d7bace47 Add opencl logging part I
Reviewed By: Maratyszcza

Differential Revision: D6441192

fbshipit-source-id: 453580e6bf5abceb00667e1045e316ffe30764cb
2017-12-03 13:16:57 -08:00
Jerry Zhang
0512597f86 Switching to MPSCNNConvolutionTranspose for iOS11 and above
Summary: att.

Reviewed By: ajtulloch

Differential Revision: D6420049

fbshipit-source-id: 30262dfefe8c400285bcaaab50de3a5d3ff68858
2017-12-01 17:49:09 -08:00
Jerry Zhang
a00d7a1bec ushort2(gid.x, gid.y) -> gid.xy
Summary: att

Reviewed By: ajtulloch

Differential Revision: D6442939

fbshipit-source-id: 57da10b7249769e8e03d5f505ed3b6ddd3314c98
2017-11-30 16:48:20 -08:00
Jerry Zhang
6e9bb93a71 Handle MPSCNNConcat edge case
Summary:
Handle cases where the channels in an output image are filled by multiple input images,
e.g.
Input1: 1 channel, Input2: 1 channel, Output: 2 channels

Reviewed By: ajtulloch

Differential Revision: D6432909

fbshipit-source-id: b7a8e9be51010e6aef0c50d93f9a7ec5558c74a4
2017-11-29 17:03:15 -08:00
Jerry Zhang
eba0af4d5d Enable sampling ratio = 0 in RoIWarp
Summary: The case when sampling_ratio = 0 was skipped before; this diff enables that setting.

Reviewed By: ajtulloch

Differential Revision: D6366669

fbshipit-source-id: 4f3b9eaf47eb9dc20823935428d3d886ea32a5fc
2017-11-29 11:04:41 -08:00
Hao Lu
1ef54e3dab Fix OpenGL 3.0
Summary: Make OpenGL build

Reviewed By: bwasti

Differential Revision: D6415848

fbshipit-source-id: 0b78c90d8b0faf30c342ddbe5ccf91a9ac63ef8b
2017-11-27 11:48:56 -08:00
Andrew Tulloch
09b008f155 Fix BUCK for caffe2_test
Differential Revision: D6402763

fbshipit-source-id: c8fe2f84c1cac92eab9bb8f612278957cbfe042f
2017-11-22 20:56:38 -08:00
Andrew Tulloch
eb4344d6e6 Depthwise F(2x2, 3x3) convolution
Reviewed By: Maratyszcza

Differential Revision: D5117325

fbshipit-source-id: 21de84f8836bad142465eb02405a2f867fa09f85
2017-11-22 20:56:34 -08:00
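The F(2x2, 3x3) notation in the commit title above is the one commonly used for Winograd minimal filtering; assuming that is what is meant, the sketch below shows the one-dimensional building block F(2, 3). It is a textbook formulation, not the code from this commit, and the function names are illustrative.

```cpp
#include <array>

// 1D Winograd F(2, 3): two outputs of a 3-tap filter over four inputs,
// using 4 multiplications in the product stage instead of 6.
std::array<float, 2> winograd_f2_3(const std::array<float, 4>& d,
                                   const std::array<float, 3>& g) {
  // Filter transform (can be precomputed once per filter).
  const float u0 = g[0];
  const float u1 = 0.5f * (g[0] + g[1] + g[2]);
  const float u2 = 0.5f * (g[0] - g[1] + g[2]);
  const float u3 = g[2];
  // Input transform followed by element-wise products.
  const float m0 = (d[0] - d[2]) * u0;
  const float m1 = (d[1] + d[2]) * u1;
  const float m2 = (d[2] - d[1]) * u2;
  const float m3 = (d[1] - d[3]) * u3;
  // Output transform.
  return {m0 + m1 + m2, m1 - m2 - m3};
}

// Direct evaluation of the same two outputs, for comparison.
std::array<float, 2> direct_f2_3(const std::array<float, 4>& d,
                                 const std::array<float, 3>& g) {
  return {d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
          d[1] * g[0] + d[2] * g[1] + d[3] * g[2]};
}
```

The 2D variant nests this transform along both axes, producing a 2x2 output tile from a 4x4 input tile with 16 multiplications instead of 36. In a depthwise convolution each channel is filtered independently, so the transform applies per channel.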
Yangqing Jia
59b2654544 reapply header change after xplat move
Summary: This is a reapplication of the earlier PR due to xplat move. Original author is Christoph Conrads <christoph.conrads@fluent.ai> christoph-conrads .

Reviewed By: houseroad

Differential Revision: D6379736

fbshipit-source-id: b7482ecf3b9487a528c15e92976e915791210002
2017-11-22 13:04:37 -08:00
Marat Dukhan
fec5631513 Updated nnpack code. original author is @Maratyszcza 2017-11-13 11:28:15 -08:00
Fei Sun
b2bbc7c091 Enable building mobile directory files in OSS
Summary:
The source files are not exposed to the parent directory in mobile. Expose them now so that the files are built in OSS.
Closes https://github.com/caffe2/caffe2/pull/1435

Reviewed By: akyrola

Differential Revision: D6274056

Pulled By: sf-wind

fbshipit-source-id: 6b54645bc9a42b4329d8aa20051abeb5fc6b1c37
2017-11-08 12:34:14 -08:00
Skotch Vail
0ce65ede86 Revert D6224054: [xplat] Switch to open-source NNPACK
Summary:
This reverts commit 4dbe02b4da97648a663586414550c2d4e23c7221

bypass-lint

Differential Revision: D6224054

fbshipit-source-id: 6be2e5a129928650ddfe8baa1b309068d90bea69
2017-11-04 00:31:33 -07:00
Marat Dukhan
5616d41421 Switch to open-source NNPACK
Summary:
replaces FB-internal NNPACK fork with open-source version.
Important FB features are already upstreamed to the GitHub repo.

Reviewed By: ajtulloch

Differential Revision: D6224054

fbshipit-source-id: 4dbe02b4da97648a663586414550c2d4e23c7221
2017-11-03 19:01:53 -07:00
Jerry Zhang
4ac8ecb76e Some bug fixes in mpscnn backend
Summary: att

Reviewed By: ajtulloch

Differential Revision: D6037723

fbshipit-source-id: d7405b27089210abfd48a33ecee47a87f67ae9a0
2017-10-16 18:33:28 -07:00
Jerry Zhang
3c144e3872 Relax CopyToMPSCNN dimension requirement
Summary: Enable CopyToMPSCNN to accept 1 <= ndim <= 4.

Reviewed By: ajtulloch

Differential Revision: D6021320

fbshipit-source-id: e76222b41a0c7b19b38df2ef8be5a4bb24843419
2017-10-16 12:18:05 -07:00
Marat Dukhan
c0c3162c1a Support NVIDIA Tegra
Summary:
makes the necessary changes to support Caffe2 OpenGL ES backend on NVIDIA Tegra devices
- Remove no_bounds global because Tegra GLES driver doesn't recognize it as a constant. Define BOUNDS_CHECK_MODE macro instead.
- Recognize "NVIDIA Tegra" as a supported GL_RENDERER

Reviewed By: hlu1

Differential Revision: D6030760

fbshipit-source-id: e3655467612469d69c70b3fee35edb2d6774a793
2017-10-15 10:18:52 -07:00
Yanghan Wang
30dac012e0 change header
Differential Revision: D5887857

fbshipit-source-id: 994002cb1a72d123035667e4b809d6cea1950a5e
2017-10-09 15:41:57 -07:00
Andrew Tulloch
5e38345d4a Fix break
Differential Revision: D5997998

fbshipit-source-id: a3937539fe331107f4d2917a2e44e187fa14a8c1
2017-10-06 11:34:54 -07:00
Hao Lu
50208c9fd6 Refactor GLConvolution
Summary:
Separate class definition into header file
Remove uniform buffer initialization in the constructor because it's not necessary
Separate tiling and batching code

Reviewed By: jerryzh168

Differential Revision: D5960502

fbshipit-source-id: 5e3bce5192ce6dc69868be1722f490f690d87076
2017-10-05 22:31:47 -07:00
Jerry Zhang
0710a90fa1 Tiled Softmax
Summary: Add tiling support for GLSoftmax

Reviewed By: hlu1

Differential Revision: D5891341

fbshipit-source-id: 38db5f64b3363852b4b650fed0ee1ee425d041a5
2017-10-05 12:47:15 -07:00
Hao Lu
17a92389b3 Remove metal remnants
Summary: Clean up the metal remnants in BUCK now that the metal code has been removed

Reviewed By: bwasti

Differential Revision: D5966095

fbshipit-source-id: 6b022624fe91a6728549d93d2954328c6b4e059e
2017-10-03 15:43:58 -07:00
Yangqing Jia
8286ce1e3a Re-license to Apache
Summary: Closes https://github.com/caffe2/caffe2/pull/1260

Differential Revision: D5906739

Pulled By: Yangqing

fbshipit-source-id: e482ba9ba60b5337d9165f28f7ec68d4518a0902
2017-09-28 16:22:00 -07:00
Hao Lu
08b3140827 Back out D5772847 and D5908415
Summary:
D5772847 is breaking real time style transfer on android and conv unit tests on iPhone 7 upgraded to iOS 11.

The temporary fix in D5908415 only fixes android. iPhone 7 is still crashing.

I think these two diffs should be backed out until D5772847 is fully debugged

Reviewed By: fricc33

Differential Revision: D5913834

fbshipit-source-id: b8072c59c83adfed8a0b0ab0f42c39bc4398c7a0
2017-09-26 15:47:49 -07:00
Hao Lu
59be3da3bc Make GLContext unique_ptr
Reviewed By: fricc33

Differential Revision: D5908793

fbshipit-source-id: 281f9ae9baac737fb8fafd79948d0804724087bc
2017-09-26 14:33:10 -07:00
Hao Lu
44b45a1d73 Fix real time style transfer on android
Reviewed By: fricc33

Differential Revision: D5908415

fbshipit-source-id: 27af70baf7a953566cc64dab040f669784c4224b
2017-09-26 14:33:08 -07:00
Junjie Bai
d9b0bcd7a4 Make all existing (except in RoIPool) "is_test" arguments required
Reviewed By: akyrola

Differential Revision: D5830168

fbshipit-source-id: 8634e9cfe308ba0ee90cd8a5c4b09a47b0b5f015
2017-09-25 23:46:12 -07:00
Andrew Tulloch
7750b8db36 Remove NNPACK MaxPool wrapper
Reviewed By: Maratyszcza

Differential Revision: D5879495

fbshipit-source-id: e2020f7e32d64ed9318ab8d09ea63ce6f12a94a3
2017-09-21 12:05:47 -07:00
Jerry Zhang
5d6a41b8aa MPSCNNMul(scalar only)
Summary:
Implementation of MPSCNNMul that only supports multiplying a tensor with a scalar value for now.

Benchmark runtime for CPU, OpenGL and MPSCNN:
```
I0919 21:15:17.942468 3068398464 net_simple.cc:103] Main run finished. Milliseconds per iter: 527.795. Iters per second: 1.89467
I0919 21:15:21.043023 3068398464 opengl_test.cc:2293] Main run finished. Milliseconds per iter: 249.766. Iters per second: 4.00374
I0919 21:15:23.182369 3068398464 net_simple.cc:103] Main run finished. Milliseconds per iter: 175.548. Iters per second: 5.69644
```

Reviewed By: hlu1

Differential Revision: D5870100

fbshipit-source-id: 2aadd5d134f3b8b40a41f638040cbef35a0086df
2017-09-20 19:22:01 -07:00
Andrew Tulloch
aff1370974 AndroidGLContext can lazily allocate static map
Reviewed By: fricc33

Differential Revision: D5867975

fbshipit-source-id: 0cc9159c27e3f667a001b4cd7768098c36d9550f
2017-09-19 19:06:48 -07:00
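One common way to allocate a static map lazily in C++ is a function-local static, sketched below. Whether the commit above uses this exact pattern is an assumption; `lazy_registry` and its key and value types are hypothetical and do not mirror AndroidGLContext.

```cpp
#include <map>
#include <string>

// Hypothetical registry accessor. The map is constructed on the first call
// rather than during static initialization at program start.
std::map<std::string, int>& lazy_registry() {
  // Function-local statics are initialized once, on first use, and that
  // initialization is thread-safe since C++11.
  static std::map<std::string, int> registry;
  return registry;
}
```

Every call returns the same map instance, and code paths that never touch it never pay for its construction.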
Hao Lu
ddf6ad83aa Add tiling support to GLConcat
Reviewed By: fricc33

Differential Revision: D5864131

fbshipit-source-id: 63894f5082fbfc64cd078a8f781b4db1b00a69dc
2017-09-19 13:32:12 -07:00
Hao Lu
0bbf8a7a4c Fix squareFactors in opengl_test.cc
Summary: Remove the caffe2 namespace {} because all the code inside opengl_test.cc is wrapped inside the caffe2 namespace

Reviewed By: Maratyszcza

Differential Revision: D5829458

fbshipit-source-id: e68dde08a1c3dc4c41260f5f028ca7efe8d34fbd
2017-09-14 20:16:55 -07:00
Hao Lu
0b89eb7592 Make seg ios run with OpenGL
Summary: Trying to reland D5803411

Reviewed By: fricc33

Differential Revision: D5819829

fbshipit-source-id: 96cb29c7699df625d30853f91844153ed76505d5
2017-09-12 18:16:23 -07:00
Hao Lu
63829695c6 Make android segmentation net run with MPSCNN
Summary: Trying to reland D5803245

Reviewed By: fricc33

Differential Revision: D5818735

fbshipit-source-id: 252fd3c68ce8731b5c96e2f0678128ba9b668581
2017-09-12 18:16:22 -07:00
Fabio Riccardi
8860fb7fe0 Implemented uniform buffer batching
Summary: Kernel data and other shader parameters are now cached directly into uniform buffer blocks, and the blocks are dynamically attached at run time.

Reviewed By: hlu1

Differential Revision: D5772847

fbshipit-source-id: 746448c2d5db12e38fb883874ede3acfccb9f6ef
2017-09-12 17:51:39 -07:00
Hao Lu
d52404779f Revert D5803245: [caffe2][MPSCNN][segmentation] Make android segmentation net run with MPSCNN
Summary:
This reverts commit 6808e9c3504389c113c7a16504d6554e83bdcc3e

bypass-lint

Differential Revision: D5803245

fbshipit-source-id: e6e2e90dd196ae958d729af2e19942e922207a2a
2017-09-11 18:33:53 -07:00
Hao Lu
f09fb7735e Revert D5803411: [caffe2][segmentation]Make iOS segmentation net run with OpenGL
Summary:
This reverts commit d208771d59f99b4f95ce67849baf369c14e66b37

bypass-lint

Differential Revision: D5803411

fbshipit-source-id: b120583dca6b885e91c92993ab3cc18f7e2c8a48
2017-09-11 18:33:52 -07:00
Fei Sun
670cbf0350 Remove the files added by PR 1203
Reviewed By: pietern

Differential Revision: D5809970

fbshipit-source-id: 011b635ca9d1c285543b88cb021df5ba8f4b2a5a
2017-09-11 17:02:00 -07:00
Hao Lu
98173850b2 Make iOS segmentation net run with OpenGL
Reviewed By: fricc33

Differential Revision: D5803411

fbshipit-source-id: d208771d59f99b4f95ce67849baf369c14e66b37
2017-09-11 16:32:41 -07:00
Hao Lu
ebf7784840 Make android segmentation net run with MPSCNN
Summary: The android segmentation net was failing with MPSCNN because some fused MPSCNNConvRelu ops become in-place after fusion.

Reviewed By: fricc33

Differential Revision: D5803245

fbshipit-source-id: 6808e9c3504389c113c7a16504d6554e83bdcc3e
2017-09-11 16:32:40 -07:00
Luke Yeager
944115c915 Bugfix for concat frontend
Summary:
When breaking out pooyadavoodi's change to `brew.concat` from https://github.com/caffe2/caffe2/pull/1151 to https://github.com/caffe2/caffe2/pull/1184, I made it throw an error instead of silently removing `order`. But `order` is always present because of [this](https://github.com/caffe2/caffe2/blob/v0.8.1/caffe2/python/model_helper.py#L118), so the frontend can never be used to set `axis`. That's bad. This PR changes the behavior back to Pooya's original implementation.
Closes https://github.com/caffe2/caffe2/pull/1202

Reviewed By: akyrola

Differential Revision: D5806488

Pulled By: pietern

fbshipit-source-id: ceaea77469688a66b269b8ed2944f0d3fe873940
2017-09-11 13:02:59 -07:00
Pieter Noordhuis
84167faf0f Enable use of GPUDirect through argument to Gloo AllreduceOp
Summary:
If the Gloo InfiniBand transport is used, the Gloo algorithms can use
GPUDirect to DMA directly from/to GPU memory. This is done through the
CudaDeviceWorkspace. This change adds a "gpu_direct" option to the
Allreduce operator that makes it use GPUDirect if the transport
supports it.
Closes https://github.com/caffe2/caffe2/pull/1203

Reviewed By: wesolwsk

Differential Revision: D5806366

Pulled By: pietern

fbshipit-source-id: 9e9a78f059f2b5c6e4fbf6574b7db4776a94696c
2017-09-11 13:02:58 -07:00
Hao Lu
c11755e559 Add checks for input texture slice for tiling
Summary: The convolution should not run with input texture slices > 1 with tiling

Differential Revision: D5774187

fbshipit-source-id: 5e94f82cd65e0d4425a7a0090a61a33bef2a14fc
2017-09-08 12:52:22 -07:00
Fei Sun
c087a60026 The CMakeLists.txt name is wrong
Summary: Fix the CMakeLists.txt file name

Reviewed By: Yangqing

Differential Revision: D5790555

fbshipit-source-id: 7c5cc36e6154a2708dc290a336da2204a387c416
2017-09-07 18:16:57 -07:00
Fei Sun
0f1a61cf80 @allow-large-files [Caffe2] [Folded diff] Move mobile files to mobile directory
Reviewed By: Yangqing

Differential Revision: D5752229

fbshipit-source-id: bc6e3ec3e4b06ae4b09f94b141a106420664d9ea
2017-09-07 15:06:43 -07:00