Commit Graph

75 Commits

Author SHA1 Message Date
Ashkan Aliabadi
c8deca8ea8 Update pthreadpool to pthreadpool:029c88620802e1361ccf41d1970bd5b07fd6b7bb. (#40524)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/40524

Reviewed By: ezyang

Differential Revision: D22215742

Pulled By: AshkanAliabadi

fbshipit-source-id: ef594e0901337a92b21ddd44e554da66c723eb7c
2020-07-09 10:00:36 -07:00
peter
45c9ed825a Formatting cmake (to lowercase without space for if/elseif/else/endif) (#35521)
Summary:
Running commands:
```bash
shopt -s globstar

sed -e 's/IF (/if(/g' -e 's/IF(/if(/g' -e 's/if (/if(/g' -e 's/ELSE (/else(/g' -e 's/ELSE(/else(/g' -e 's/else (/else(/g' -e 's/ENDif(/endif(/g' -e 's/ELSEif(/elseif(/g' -i CMakeLists.txt
sed -e 's/IF (/if(/g' -e 's/IF(/if(/g' -e 's/if (/if(/g' -e 's/ELSE (/else(/g' -e 's/ELSE(/else(/g' -e 's/else (/else(/g' -e 's/ENDif(/endif(/g' -e 's/ELSEif(/elseif(/g' -i caffe2/**/CMakeLists.txt
sed -e 's/IF (/if(/g' -e 's/IF(/if(/g' -e 's/if (/if(/g' -e 's/ELSE (/else(/g' -e 's/ELSE(/else(/g' -e 's/else (/else(/g' -e 's/ENDif(/endif(/g' -e 's/ELSEif(/elseif(/g' -i torch/**/CMakeLists.txt
sed -e 's/IF (/if(/g' -e 's/IF(/if(/g' -e 's/if (/if(/g' -e 's/ELSE (/else(/g' -e 's/ELSE(/else(/g' -e 's/else (/else(/g' -e 's/ENDif(/endif(/g' -e 's/ELSEif(/elseif(/g' -i c10/**/CMakeLists.txt
sed -e 's/IF (/if(/g' -e 's/IF(/if(/g' -e 's/if (/if(/g' -e 's/ELSE (/else(/g' -e 's/ELSE(/else(/g' -e 's/else (/else(/g' -e 's/ENDif(/endif(/g' -e 's/ELSEif(/elseif(/g' -i cmake/**/*.cmake
sed -e 's/IF (/if(/g' -e 's/IF(/if(/g' -e 's/if (/if(/g' -e 's/ELSE (/else(/g' -e 's/ELSE(/else(/g' -e 's/else (/else(/g' -e 's/ENDif(/endif(/g' -e 's/ELSEif(/elseif(/g' -i cmake/**/*.cmake.in
```
We may further convert all the commands into lowercase according to the following issue: 77543bde41.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35521

Differential Revision: D20704382

Pulled By: malfet

fbshipit-source-id: 42186b9b1660c34428ab7ceb8d3f7a0ced5d2e80
2020-03-27 14:25:17 -07:00
Gaurav Singh
e5a02aa2fe [caffe2] simplify relative error expr (#32999)
Summary:
simplify relative error expr
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32999

Differential Revision: D19739382

Pulled By: jerryzh168

fbshipit-source-id: 95e0c68f6d9cb6708f400cc1cdb311af83b0621e
2020-02-19 16:35:44 -08:00
Jerry Zhang
ac87488bd3 Change ConvPoolOp<Context>::SetOutputSize to ConvPoolOp<Context>::GetOutputSize (#17764)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17764

Original commit changeset: f1923fdca4a1

reverted int8 ops fixes the original runtime regression.
We'll ignore the memory regression since it is flaky, see D14228484

Reviewed By: dzhulgakov

Differential Revision: D13885233

fbshipit-source-id: ccbe4b94acb44b7b4cb3ae4d73e3f6091e1e1195
2019-03-07 18:38:53 -08:00
Sebastian Messmer
6706e9af19 Make C10_MOBILE consistent with how feature macros are usually used (#17481)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17481

Usually, feature macros are either defined or undefined and checked accordingly.
C10_MOBILE was a weird special case that was always defined but either defined to 1 or to 0.

This caused a lot of confusion for me when trying to disable something from mobile build and it also disabled it
from the server build (because I was using ifdef). Also, I found a place in the existing code base that made
that wrong assumption and used the macro wrongly, see https://fburl.com/y4icohts

Reviewed By: dzhulgakov

Differential Revision: D14214825

fbshipit-source-id: f3a155b6d43d334e8839e2b2e3c40ed2c773eab6
2019-02-27 17:57:51 -08:00
Jerry Zhang
2af95d8e3e Back out "[pt1][tensor] Change ConvPoolOp<Context>::SetOutputSize to ConvPoolOp<Context>::GetOutputSize" (#16516)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16516

Original commit changeset: 64abce3dbaed

Reviewed By: dzhulgakov

Differential Revision: D13863715

fbshipit-source-id: f1923fdca4a1a82768d9c280a8493ff15a7eb2ba
2019-01-30 12:50:38 -08:00
Jerry Zhang
ff963d4b9f Change ConvPoolOp<Context>::SetOutputSize to ConvPoolOp<Context>::GetOutputSize (#16273)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16273

Previously we have SetOutputSize which accept a partially initialized Output Tensor and set it to the correct size,
the diff change this to GetOutputSize that returns the correct size instead.
e.g.
```
auto* Y = Output(0);
ConvPoolOp<Context>::SetOutputSize(X, Y, channels);
...
Y->mutable_data<T>...
```
-->
```
auto sizes = ConvPoolOp<Context>::GetOutputSize(X, channels);
auto* Y = Output(0, sizes, at::dtype<T>());
```

Reviewed By: dzhulgakov

Differential Revision: D13736281

fbshipit-source-id: 64abce3dbaed0b375098463333dfd0ea5a3b1945
2019-01-28 15:56:34 -08:00
Jerry Zhang
0c32e1b43e use C10_MOBILE/ANDROID/IOS (#15363)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15363

Didn't define C10_MOBILE in the numa file move diff: D13380559
move CAFFE2_MOBILE/ANDROID/IOS to c10

```
codemod -m -d caffe2 --extensions h,hpp,cc,cpp,mm "CAFFE2_MOBILE" "C10_MOBILE"
codemod -m -d caffe2 --extensions h,hpp,cc,cpp,mm "CAFFE2_ANDROID" "C10_ANDROID"
codemod -m -d caffe2 --extensions h,hpp,cc,cpp,mm "CAFFE2_IOS" "C10_IOS"

```

i-am-not-moving-c2-to-c10

Reviewed By: marcinkwiatkowski

Differential Revision: D13490020

fbshipit-source-id: c4f01cacbefc0f16d5de94155c26c92fd5d780e4
2019-01-09 15:08:20 -08:00
ArutyunovG
8e91da4cb3 Windows shared build (#13550)
Summary:
Hi guys,

I'd like to build Caffe2 with more supported options in Windows with Microsoft Visual Studios.
This is the first pull request.
Running scripts/build_windows_shared.bat is able to build Caffe2 with both CMAKE_BUILD_TYPE=Debug and CMAKE_BUILD_TYPE=Release with Visual Studio 14 2015.
CUDA is 9.0, cudnn is 7.0.5, glog, gflags and lmdb are supported on my system.
Python is 3.5, Detectron works from python interface as well.
It was even possible to debug detectron code and step into caffe2_gpu.dll with pdbs built.

What is disappointing, that c10/experimental ops don't build with this Visual Studio generator, I added special option INCLUDE_EXPERIMENTAL_C10_OPS (default ON) to deal with it in build_windows_shared.bat.

After this pull request the next step is to add Visual Studio 2017 support in the script.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13550

Reviewed By: ezyang

Differential Revision: D13042597

Pulled By: orionr

fbshipit-source-id: f313f909f599cd582a1d000eff766eef3a9fc4fc
2018-11-16 12:16:28 -08:00
Jerry Zhang
57ec8f111f Rename ndim() -> dim() - 6/6
Summary:
Codemod generated with clangr shard mode, 50 files per diff,
clangr code(ndim()->dim()): diffusion/FBS/browse/master/fbcode/caffe2/caffe2/fb/codemods/TensorMethodRename.cpp

Reviewed By: ezyang

Differential Revision: D12935827

fbshipit-source-id: 80ecb034c243dbfd267b9f131cee9d7afd5ef063
2018-11-07 07:27:45 -08:00
Jerry Zhang
13b9fd3e05 Renaming meta() to dtype() - 2/2 (#13334)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13334

Codemod generated with clangr shard mode, 50 files per diff,
clangr code(meta->dtype): diffusion/FBS/browse/master/fbcode/caffe2/caffe2/fb/codemods/TensorMethodRename.cpp

i-am-not-moving-c2-to-c10

Reviewed By: ezyang

Differential Revision: D12845197

fbshipit-source-id: f87eb575d3c31593ca76b70780cc4fca888e706b
2018-10-30 18:24:30 -07:00
Jerry Zhang
91e87c0395 Renaming size() to numel() - 2/2
Summary:
Codemod generated with clangr shard mode, 50 files per diff,
clangr code(size->numel): diffusion/FBS/browse/master/fbcode/caffe2/caffe2/fb/codemods/TensorMethodRename.cpp

i-am-not-moving-c2-to-c10

Reviewed By: ezyang

Differential Revision: D12833748

fbshipit-source-id: 98dc2d3abc23c177c2c9e457b81499952d4b690c
2018-10-29 18:59:29 -07:00
Edward Yang
8c514627a4 Add C10_LIKELY/C10_UNLIKELY macros (#12932)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12932

I was looking at some assembly for some code I was working on,
and felt a desire to have likely()/unlikely() macros.  I checked
if we already had them, and we didn't.  This commit adds them,
and fixes up all known use sites to make use of it.

Reviewed By: Maratyszcza

Differential Revision: D10488399

fbshipit-source-id: 7476da208907480d49f02b37c7345c17d85c3db7
2018-10-22 16:26:19 -07:00
Yangqing Jia
7d5f7ed270 Using c10 namespace across caffe2. (#12714)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12714

This is a short change to enable c10 namespace in caffe2. We did not enable
it before due to gflags global variable confusion, but it should have been
mostly cleaned now. Right now, the plan on record is that namespace caffe2 and
namespace aten will fully be supersets of namespace c10.

Most of the diff is codemod, and only two places of non-codemod is in caffe2/core/common.h, where

```
using namespace c10;
```

is added, and in Flags.h, where instead of creating aliasing variables in c10 namespace, we directly put it in the global namespace to match gflags (and same behavior if gflags is not being built with).

Reviewed By: dzhulgakov

Differential Revision: D10390486

fbshipit-source-id: 5e2df730e28e29a052f513bddc558d9f78a23b9b
2018-10-17 12:57:19 -07:00
Yangqing Jia
38f3d1fc40 move flags to c10 (#12144)
Summary:
still influx.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12144

Reviewed By: smessmer

Differential Revision: D10140176

Pulled By: Yangqing

fbshipit-source-id: 1a313abed022039333e3925d19f8b3ef2d95306c
2018-10-04 02:09:56 -07:00
Sebastian Messmer
8f0db9bbbb Removing some dependency edges from Blob to other caffe2 (#12043)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12043

Re-trying D9979976, this time with all call sites fixed.

D9979976 got reverted because there was a call site that wasn't covered by sandcastle it seems.
I fixed it and used 'grep' to ensure there aren't any more call sites in fbsource.

Reviewed By: ezyang

Differential Revision: D10026392

fbshipit-source-id: cd341514a8e53a40147ea0ee3e52f63bb6444157
2018-09-25 11:40:24 -07:00
Maciej Bargiel
2cdf98a74d Back out "Removing some dependency edges from Blob to other caffe2"
Summary: The controller you requested could not be found. Original commit changeset: 2ea17724e223

Differential Revision:
D10026321
Ninja: stable broken

fbshipit-source-id: faf87cb7cc0f78c2c10d4aa6fceea279cd27acd6
2018-09-25 01:11:14 -07:00
Sebastian Messmer
17a65bf9b6 Removing some dependency edges from Blob to other caffe2 (#11923)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11923

This is pre-work to allow moving Blob to ATen/core, which cannot depend on caffe2 anymore.
(1) Removing the Blob -> Tensor dependency allows us to move Blob to ATen/core and use it inside IValue without having to wait for the Tensor merge to be complete.
(2) In the final Blob design, we want it to be a very small class that doesn't have any special treatment for Tensor (or to be more correct, doesn't allow storing Tensor anymore), so this is anyhow the direction we want to go.

This changes call sites that will have to be moved to IValue later, but they cannot be moved to IValue directly, because for that, IValue first needs to be able to store Blob, which in turn first needs this diff and some other changes coming up in future diffs.

Codemods:
$ codemod --extensions h,hpp,c,cpp,cc "([a-zA-Z0-9_]+)\\.IsTensorType\\(" "BlobIsTensorType(\\1, "
$ codemod --extensions h,hpp,c,cpp,cc "([a-zA-Z0-9_]+)->IsTensorType\\(" "BlobIsTensorType(*\\1, "
$ codemod --extensions h,hpp,c,cpp,cc "([a-zA-Z0-9_]+)\\.GetMutableTensor\\(" "BlobGetMutableTensor(\\1, "
$ codemod --extensions h,hpp,c,cpp,cc "([a-zA-Z0-9_]+)->GetMutableTensor\\(" "BlobGetMutableTensor(*\\1, "

It is, however, not only these codemods because regex based refactoring was only able to match a small amount of the call sites. To catch more, I wouldn've needed a AST aware tool like clangr, which I didn't figure out how to use.

Reviewed By: ezyang

Differential Revision: D9979976

fbshipit-source-id: 2ea17724e223b5b73b44f99362727759ca689e61
2018-09-24 22:57:05 -07:00
Christian Puhrsch
a6630e25af Remove many caffe2::TIndex and replace them with int64_t (#11943)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11943

See title

Reviewed By: ezyang

Differential Revision: D9992645

fbshipit-source-id: e8f80d6ea762971513e5e8072975ceea53e1f11a
2018-09-22 18:11:04 -07:00
Orion Reblitz-Richardson
a175282776 Flags for LMDB, LevelDB, and Caffe2 ops (#11462)
Summary:
Add flags for LMDB and LevelDB, default `OFF`. These can be enabled with

```
USE_LMDB=1 USE_LEVELDB=1 python setup.py build_deps
```

Also add a flag to build Caffe2 ops, which is default `ON`. Disable with

```
NO_CAFFE2_OPS=1 python setup.py build_deps
```

cc Yangqing soumith pjh5 mingzhe09088
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11462

Reviewed By: soumith

Differential Revision: D9758156

Pulled By: orionr

fbshipit-source-id: 95fd206d72fdf44df54fc5d0aeab598bff900c63
2018-09-10 17:27:50 -07:00
Edward Yang
91797c0672 Replace direct include of caffe2.pb.h with an intermediary header caffe2_pb.h (#10946)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10946

```
codemod -d . --extensions cc,cpp,cu,cuh,h caffe2/proto/caffe2.pb.h caffe2/proto/caffe2_pb.h
```

Reviewed By: houseroad

Differential Revision: D9539945

fbshipit-source-id: 497d04720e8e7e61c05ffe1b23733d0cb774de7e
2018-08-28 11:57:08 -07:00
Jerry Zhang
aebf3b47ae Remove template parameter from Tensor (#9939)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9939

Pull Request resolved: https://github.com/facebookresearch/weakly-supervised-action-detection/pull/13

Pull Request resolved: https://github.com/pytorch/translate/pull/166

Pull Request resolved: https://github.com/pytorch/pytorch/pull/9125

Closes https://github.com/pytorch/pytorch/pull/9125

Use inheritance for polymorphism, and remove template parameter
This is to change the templating in call sites, the core implementations will change later

Before Caffe2 Tensor class was compile-time fixed to bind to a particular device/context. With this change, we're making it a runtime property (stored inside the tensor), but preserve the same semantics. For example, one has to specify device type in order to create a Tensor - there are no uninitialized tensors. More specifically the changes are:

1. We added an extra argument *DeviceType* to most of the constructors of the tensor, e.g. (Tensor(DeviceType type)),
2. Semantics of constructor Tensor(const Tensor<SrcContext>& src, ContextForCopy* context); is changed, in this constructor, the second context is passed in to enable us to call the templated Copy function, it could be in a different context as source and target previously, now we'll enforce that the context should have same device type as src, if it is provided.
3. To preserve 'get-or-construct' semantics of Blob, we added specialized getter Blob::GetMutableTensor that verifies both that Blob contains a Tensor and that it's of a correct type
4. Specifically, Tensor type is not default-constructible any more (as we don't have unknown device tensors) and thus some of the code handling STL containers needs to change

Note: Some changes are postponed just to keep this diff a bit smaller. Please see `TODO`s.

Reviewed By: ezyang, houseroad

Differential Revision: D9024330

fbshipit-source-id: e0b8295d2dc6ebe2963383ded5af799ad17164ba
2018-07-27 10:56:39 -07:00
Jerry Zhang
969b62f276 Revert D8121878: Remove template parameter from Tensor
Differential Revision:
D8121878

Original commit changeset: 4a5e9a677ba4

fbshipit-source-id: d8e2c0bb145b52fbcca323b22d1d3346f0b3249e
2018-07-26 14:02:04 -07:00
Jerry Zhang
cd5adc7b5f Remove template parameter from Tensor (#13)
Summary:
Pull Request resolved: https://github.com/facebookresearch/weakly-supervised-action-detection/pull/13

Pull Request resolved: https://github.com/pytorch/translate/pull/166

Pull Request resolved: https://github.com/pytorch/pytorch/pull/9125

Closes https://github.com/pytorch/pytorch/pull/9125

Use inheritance for polymorphism, and remove template parameter
This is to change the templating in call sites, the core implementations will change later

Before Caffe2 Tensor class was compile-time fixed to bind to a particular device/context. With this change, we're making it a runtime property (stored inside the tensor), but preserve the same semantics. For example, one has to specify device type in order to create a Tensor - there are no uninitialized tensors. More specifically the changes are:

1. We added an extra argument *DeviceType* to most of the constructors of the tensor, e.g. (Tensor(DeviceType type)),
2. Semantics of constructor Tensor(const Tensor<SrcContext>& src, ContextForCopy* context); is changed, in this constructor, the second context is passed in to enable us to call the templated Copy function, it could be in a different context as source and target previously, now we'll enforce that the context should have same device type as src, if it is provided.
3. To preserve 'get-or-construct' semantics of Blob, we added specialized getter Blob::GetMutableTensor that verifies both that Blob contains a Tensor and that it's of a correct type
4. Specifically, Tensor type is not default-constructible any more (as we don't have unknown device tensors) and thus some of the code handling STL containers needs to change

Note: Some changes are postponed just to keep this diff a bit smaller. Please see `TODO`s.

Reviewed By: xw285cornell

Differential Revision: D8121878

fbshipit-source-id: 4a5e9a677ba4ac82095df959851a054c81eccf81
2018-07-26 10:25:23 -07:00
Marat Dukhan
d3651585b8
Simplify pthreadpool implementation on top of Caffe2 thread pool (#7666)
Remove one layer of pointer dereference when calling the thread pool.
2018-06-18 19:06:50 -07:00
bddppq
db5bc71562
Fix and ignore some warnings (#8081) 2018-06-04 01:01:59 -07:00
Paul Jesse Hellemn
b875fb281c
Update from facebook (#7451)
* [bootcamp] Improve "Shape" operator to support axes specification

To improve .shape operator of Caffe2 to support x.shape(tensor, axes), which takes an optional int array "axes" as input. For example, x.shape(tensor, [1, 0]) will return the dimension for axis 1 and 0 following the specified order. For current version, "axes" input allows duplications and can have arbitrary length.

* Back out "Add barrier net that runs before training nets"

Original commit changeset: b373fdc9c30f. Need additional changes to some callers to support barrier failures.

* Change warning to verbose log to reduce log spam

The `LOG(WARNING)` was a bit spammy for regular use so lets just make it a `VLOG`.

* Extract the shared code from different caffe2_benchmark binaries

The OSS benchmark and Internal benchmark will share most functions in the benchmark.

* Support MFR in sequence training

As titled.

* Make knowledge distillation work with using logged prediction feature as teacher label.

1) Add loading raw dense feature as teacher label.
2) Optional calibration function for teacher label
3) Add teacher label into generic unit test
4) Deprecated TTSN workflow version using feature_options to config teacher label

* [C2/CUDA]: unjoined cross entropy sigmoid

as desc

* Add async_scheduling executor into deferrable_net_exec_test

Add async_scheduling into tests and fix some exception cases

* Fix Event disabled error

When disabling event in RNN ops make sure we don't call Finish on disabled
event from op's RunAsync

* cuda ensure cpu output op can handle both TensorCPU and TensorCUDA

as desc.

* [C2 Core] Infer input device option in C2 hypothesis_test checkers

Improve how we default input blob device options.
Previously it defaults as where op lives but it is not necessarily the case.

For example:
CopyCPUToGPU

* [C2 Op]SplitByLengthsOp CPU/GPU implementation

[C2 Op]SplitByLengthsOp CPU/GPU implementation

* fix undefined symbol error

not sure why we're getting undefined symbol even with link_whole = True
Need to figure out why but need this workaround for now

* Add tools in DAIPlayground platform to help debugging models

Add additional tools to allow Plauground override individual method defined in AnyExp.  This will allow user to create module that specificly change certain default method behavior.  An example included in this diff is deactivating test model and checkpointing.  When debugging any model problems, switching off components helps me quickly narrow down the location of the bug.  The technique is extensively used in task T27038712 (Steady memory increase in EDPM, eventually resulting in gloo/cuda.cu:34: out of memory)

* add shape and type inference for int8 conversion operator

* Fix flaky test for group_norm

Fix flaky test for group_norm

* Fix group_norm_op_test flaky

Fix group_norm_op_test flaky

* Implementation of composite learning rate policy

In many state-of-the-arts deep learning works, people use a simple trick to
schedule the learning rate: use a fixed learning rate until error plateaus
and then switch to a different fixed learning rate, and so on. In this diff,
we implemented a simple version of the composite learning rate. The user gives
a set of learning rates policies and corresponding iteration nums, and the
optimizer will change the learning rate policy based on the number of iterations so far.

For example, the user give two learning rate policies, one is FixedLearningRate
and PolyLearningRate, with an iteration number of 1k. Then the first 1k iteration,
we use FixedLearningRate. For the following iterations, we use PolyLearningRate.

* Split two use cases of CachedReader into two classes, DBFileReader and CachedReader

# Use Cases:

1). input: DB file -> output: DatasetReader.

Use DBFileReader.

2). input: Reader -> build cache DB file -> output: DatasetReader.

Use CachedReader.

# Changes to CachedReader:

1). Move db_path to the constructor.
Because in mock reader. cache will always be built ahead.

# Changes to tests:

1). Make a separate TestCase class for CachedReader and DBFileReader.

2). Make it possible to add more test functions by adding setUp, tearDown and _make_temp_path.

3). Make delete db_path more general. `db_path` could be a file for `log_file_db`, but could also be a directory for `leveldb`.

* Back out "On Mobile phones, call GlobalInit with no arguments in predictor in case we need to perform initialization"

Original commit changeset: 4489c6133f11

* Fix LARS bug

Fixed a bug in the LARS implementation which caused all subsequent blobs not using LARS to have the LARS learning rate multiplier applied to them.

* [tum] support sparse init & add uniformFill option

as title

* Propagate exception for async nets

Capture the exception when an exception is thrown in async nets and re-throw it after wait().  This allows exceptions to be propagated up to the caller.

This diff was a part of D7752068.  We split the diff so that C2 core files changes are in a separate diff.

* Automatic update of fbcode/onnx to 69894f207dfcd72d1e70497d387201cec327efbc

Previous import was 403ccfbd0161c38f0834413d790bad0874afbf9a

Included changes:
- **[69894f2](https://github.com/onnx/onnx/commit/69894f2)**: Use op schema.all tensor types in random like definitions (#865) <Scott McKay>
- **[b9d6b90](https://github.com/onnx/onnx/commit/b9d6b90)**: Clarify random like operators (#846) <Scott McKay>
- **[fc6b5fb](https://github.com/onnx/onnx/commit/fc6b5fb)**: Refactor shape inference implementation (#855) <anderspapitto>
- **[b7d8dc8](https://github.com/onnx/onnx/commit/b7d8dc8)**: fix cmake warning message (#863) <Eric S. Yu>
- **[f585c5d](https://github.com/onnx/onnx/commit/f585c5d)**: add pytorch-operator test for tile (#831) <Wenhao Hu>
- **[993fe70](https://github.com/onnx/onnx/commit/993fe70)**: add install step (#832) <Eric S. Yu>
- **[68bc26c](https://github.com/onnx/onnx/commit/68bc26c)**: add type inference for traditional ml ops except classifier ops. (#857) <Ke Zhang>
- **[9cc0cda](https://github.com/onnx/onnx/commit/9cc0cda)**: fix string representation of scalar types (#858) <G. Ramalingam>
- **[1078925](https://github.com/onnx/onnx/commit/1078925)**: fix y in pow test case to scalar (#852) <Wenhao Hu>
- **[c66fb6f](https://github.com/onnx/onnx/commit/c66fb6f)**: Add some math function shape inference (#845) <anderspapitto>
- **[ff667d1](https://github.com/onnx/onnx/commit/ff667d1)**: Refactor return type and docs for ONNXIFI_BACKEND_DIRECTX_ID (#853) <Marat Dukhan>
- **[11c6876](https://github.com/onnx/onnx/commit/11c6876)**: clear initializer names when clear initializer (#849) <Wenhao Hu>
- **[73c34ae](https://github.com/onnx/onnx/commit/73c34ae)**: Clarify FeatureVectorizer description. (#843) <Scott McKay>
- **[1befb9b](https://github.com/onnx/onnx/commit/1befb9b)**: Remove useless text in docs (#850) <Lu Fang>
- **[e84788f](https://github.com/onnx/onnx/commit/e84788f)**: Fix SELU attributes' default values (#839) <Lu Fang>
- **[ebac046](https://github.com/onnx/onnx/commit/ebac046)**: Add tile test case (#823) <Wenhao Hu>
- **[8b7a925](https://github.com/onnx/onnx/commit/8b7a925)**: a few more shape inference functions (#772) <anderspapitto>
- **[9718f42](https://github.com/onnx/onnx/commit/9718f42)**: Make the coefficient non optional for LinearClassifier (#836) <Jaliya Ekanayake>
- **[ef083d0](https://github.com/onnx/onnx/commit/ef083d0)**: Add save_tensor and load_tensor functions for Protos (#770) <Lu Fang>
- **[45ceb55](https://github.com/onnx/onnx/commit/45ceb55)**: Check if CMAKE_BUILD_TYPE set before project(). (#812) <Sergii Dymchenko>
- **[4b3d2b0](https://github.com/onnx/onnx/commit/4b3d2b0)**: [WIP] reenable shape inference tests (#834) <anderspapitto>
- **[22d17ee](https://github.com/onnx/onnx/commit/22d17ee)**: RNN tests: LSTM, GRU, SimpleRNN (#739) <Peyman Manikashani>
- **[de65b95](https://github.com/onnx/onnx/commit/de65b95)**: dimension denotation (#443) <Tian Jin>
- **[eccc76e](https://github.com/onnx/onnx/commit/eccc76e)**: fix field number issue in onnx operator proto and enable its build (#829) <Ke Zhang>
- **[d582beb](https://github.com/onnx/onnx/commit/d582beb)**: disable shape inference test to unbreak ci (#830) <Lu Fang>
- **[485b787](https://github.com/onnx/onnx/commit/485b787)**: function proto for composite op. (#802) <Ke Zhang>
- **[cd58928](https://github.com/onnx/onnx/commit/cd58928)**: specify defaults for attributes of Affine op (#820) <G. Ramalingam>
- **[7ee2cf9](https://github.com/onnx/onnx/commit/7ee2cf9)**: merge the dummy backend back into the main one (#743) <anderspapitto>
- **[1c03a5a](https://github.com/onnx/onnx/commit/1c03a5a)**: [Proposal] ONNX Interface for Framework Integration (previously ONNX Backend API) header and docs (#551) <Marat Dukhan>
- **[3769a98](https://github.com/onnx/onnx/commit/3769a98)**: Rename real model test case from VGG-16 to ZFNet (#821) <Lu Fang>

* [C2]ReluN Op

relu n op.

tf reference: https://www.tensorflow.org/api_docs/python/tf/nn/relu6

* Call destructor when assigning a blob value

* Add executor overrides

Add executor overrides flag to enable migration to async_scheduling executor

* Add barrier net that runs before training nets - attempt #2

Add a synchonize barrier net that is run before training nets.  With this net, shards that are faster will wait for other shards before start training.  This reduce chances of the faster shards timing out during GLOO AllReduce.
Removed explicit data_parallel_model.py.synchronize call in holmes workflow.

This change was landed previously but caused errors for some EDPM workflows - See https://fb.facebook.com/groups/1426530000692545/permalink/1906766366002237/ - because EDPM assumes any call to CreateOrCloneCommonWorld and Gloo ops are wrapped in exception handlers but in this case exception thrown in the barrier init net is not handled.

To address this issue, we add _CreateOrCloneCommonWorld to the param_init_net instead of a new barrier init net.  Since errors for param_init_net run is handled gracefully and re-rendezvous, it should fixes the problem.

* Handle empty nets in async_scheduling

Make sure we don't get stuck on empty nets

* use CUDA_ARCH for conditional compile

* [C2 fix] infer function for ensure_cpu_output_op

* Update group_norm test to reduce flaky test

* Fix lr_multiplier for GPU
2018-05-10 23:14:27 -07:00
Marat Dukhan
24d05662ea
[caffe2] Open-source DEPTHWISE_3x3 engine (#6601)
DEPTHWISE_3x3 engine provides an optimized implementation of depthwise 3x3 convolution, e.g. for ShuffleNet, MobileNets
Implementations exist for CPU (generic), ARM CPU, and CUDA GPU.

Originally developed by @ajtulloch
2018-04-26 02:30:51 -04:00
Marat Dukhan
6ce6c0ed65
[caffe2] Fix bug in NNPACK bindings for convolution in precomputed transform (#6555)
Caffe2-NNPACK integration created blobs for precomputed kernel transorms based on the name of Conv operator.
When Conv operators have the same name (e.g. empty string), or the blobs for precomputed transforms get the same name and overwrite each other.
This patch ensures that blobs for all precomputed transforms in the network get a unique name.
2018-04-12 15:20:26 -04:00
Marat Dukhan
e83dd716ec
[caffe2] Support fused Conv+Relu with NNPACK (#6375)
Enable the use of fused Convolution+ReLU functionality from NNPACK
2018-04-09 15:39:31 -04:00
Orion Reblitz-Richardson
1d5780d42c Remove Apache headers from source.
* LICENSE file contains details, so removing from individual source files.
2018-03-27 13:10:18 -07:00
Marat Dukhan
3aa393f7e2 Log NNPACK profile to std::cout instead of LOG(INFO)
Similar to #2333, but for NNPACK bindings
2018-03-21 18:46:11 -04:00
Yangqing Jia
dd1564b061 Caffe2 module update: move observers as well as binaries. (#2145)
* Caffe2 module update: move observers as well as binaries.

* Add threads linkage

* Add Threads dependency to public interface
2018-03-06 14:45:21 -08:00
Yangqing Jia
fdfe1d09a0 Explicitly require listing additional libraries if a binary needs things beyond Caffe2_MAIN_LIBS (#2110) 2018-03-02 16:29:13 -08:00
Koan-Sin Tan
fab2c07af9 make -DUSE_OBSERVERS=ON work
```
scripts/build_android.sh -DBUILD_TEST=ON -DBUILD_BINARY=ON  -DBUILD_OBSERVERS=On -DUSE_OBSERVERS=ON
```
2018-03-02 20:31:33 +01:00
sf-wind
b7ab3ff5e3 Change caffe_add_linker_flag to caffe2_interface_library (#2109)
* Remove OpenGL code from benchmark

* Update function name since it is renamed in other places
2018-03-01 21:45:44 -08:00
Yangqing Jia
178c4be295 [wip] Cmake modernization (#2066)
* cmake target - work in progress

* wip cmake public targets

* Add missing INTERFACE keyword

* Add cuda public dependencies

* Add dependency for test targets
2018-02-27 20:42:37 -08:00
Marat Dukhan
4da3ce720f Support convolution without bias in NNPACK bindings 2018-02-22 14:25:17 -05:00
sf-wind
0074dc7fa8 Allow more backends in caffe2_benchmark (#1979)
* Remove OpenGL code from benchmark

* Make it possible to print plot in the ipython notbook

* Create the blob if the blob is not specified in the init net

* Do not use gf library for MKL. Even after I install the entire MKL library it is still not found. After removing it, the MKL code can still run

* Support more backends in Caffe2 Benchmark

* Revert "Do not use gf library for MKL. Even after I install the entire MKL library it is still not found. After removing it, the MKL code can still run"

This reverts commit 981b6693a94cbf63ad78d51bd806c7a0d7a5a2d3.

* Build caffe2_benchmark using shared or static library depending on the flag
2018-02-20 14:45:53 -08:00
Fei Sun
38f2cd16ee If a blob is not specified in the init net, create the blob
Summary:
In Caffe2 Benchmark, if a blob is not specified in the init net, but only specified in the predict net (e.g. input), the blob cannot be retrieved from the workspace. In some cases, it results some errors.

Create the Blob before using it if it doesn't exist.
Closes https://github.com/caffe2/caffe2/pull/1948

Reviewed By: orionr

Differential Revision: D6970316

Pulled By: sf-wind

fbshipit-source-id: 3e317403de0b5cf7568c7bda69a0ebe9d59d4a1f
2018-02-12 16:24:18 -08:00
Marco Zandonadi
3708914bd5 Give NetObserverReporter a virtual destructor for correct destruction
Summary:
Future-clang is stricter about some things. We need to address deletes on non-virtual destructors.

For reference, the compiler error in question can be identified by: "delete called on 'ClassName' that is abstract but has non-virtual destructor [-Werror,-Wdelete-non-virtual-dtor]" for a given ClassName.

Reviewed By: smeenai

Differential Revision: D6853479

fbshipit-source-id: a40c8e83da7c1b44da48e887cc029e98e40d6737
2018-02-02 17:32:26 -08:00
Fei Sun
4a528cefac Remove OpenGL code from benchmark
Summary:
OpenGL is no longer built by default. Even after setting flag -DUSE_MOBILE_OPENGL, the build fails. Remove it in the benchmark code so that the benchmark can still be built.
Closes https://github.com/caffe2/caffe2/pull/1822

Reviewed By: Maratyszcza

Differential Revision: D6824777

Pulled By: sf-wind

fbshipit-source-id: 5af8b669a36adcd6a98b0a11237b9e03c146bb9d
2018-01-26 16:17:53 -08:00
Marat Dukhan
224493d9ce NNPACK: Use new bindings and custom thread pool
Summary:
This change should dramatically (~10X) improve performance of convolution with NNPACK engine
Closes https://github.com/caffe2/caffe2/pull/1730

Reviewed By: sf-wind

Differential Revision: D6695895

Pulled By: Maratyszcza

fbshipit-source-id: 26291916811ef4cb819a59aec848c4e23668e568
2018-01-11 10:48:12 -08:00
Marat Dukhan
2435d22782 Move NNPACK integration to share/contrib/nnpack
Summary:
we are going to deprecate NNPACK bindings in caffe2/contrib/nnpack.
The first step is to move modern NNPACK bindings from caffe2/mobile/contrib/ios/ to
caffe2/share/contrib/nnpack/, and is implemented in this diff.

Reviewed By: sf-wind

Differential Revision: D6687454

fbshipit-source-id: 458614bade92ab5ba5d2ab7f0691071043198b57
2018-01-09 17:22:24 -08:00
Fei Sun
5d6dacaafe Enable building operator QuantDecompZstd
Summary:
Make operator QuantDecompZstd buildable in open source. The operator is not built by default. Need to specify -DBUILD_SHARE_DIR=ON -DUSE_ZSTD=ON to build it.

Test plans: Build android caffe2 with the change without issue. Run a model with the operator successfully.
Closes https://github.com/caffe2/caffe2/pull/1613

Reviewed By: Yangqing

Differential Revision: D6556723

Pulled By: sf-wind

fbshipit-source-id: 453a7d787a55928f2dea1ed2b99f2df011aa8d26
2017-12-21 17:47:05 -08:00
Martin Schatz
f19ae690c3 Update observer when attached to RNN ops
Summary: Observer passed to RNN step net cloned with RecurrentOperator as subject instead of internal Operator.  This diff applies adds the internal operator as the subject

Reviewed By: enosair

Differential Revision: D6560996

fbshipit-source-id: 7af4fb0ff8c19795b5c994c5fc6876f3d2ba7bf4
2017-12-14 10:04:20 -08:00
Alexander Sidorov
913a9a736c Backed out changeset 4e1241fe65cd (revert a revert :) )
Summary:
This fixes the issue but I haven't figured out yet why is it
happening.

Reviewed By: bwasti

Differential Revision: D6437378

fbshipit-source-id: bf983c9b6f57647423423ec6b22e0f9d2b170e74
2017-11-29 13:33:15 -08:00
Fei Sun
8495503c6f Set a default input type
Summary:
Set a default input type so that users do not need to always specify one.

Test Plans: run caffe2_benchmark without the input_type argument, the default one is used.
Closes https://github.com/caffe2/caffe2/pull/1513

Reviewed By: hlu1

Differential Revision: D6401820

Pulled By: sf-wind

fbshipit-source-id: bc8406ca000b3f65fb9aeb1c9c80eb766d625758
2017-11-22 18:36:57 -08:00
Fei Sun
9468a1e24f Add input type to caffe2_benchmark
Summary:
Allow inputs to be either float of uint8_t type.
Closes https://github.com/caffe2/caffe2/pull/1508

Reviewed By: hlu1

Differential Revision: D6392742

Pulled By: sf-wind

fbshipit-source-id: d83f1602b366907405108ce37fa35c1a0f68551a
2017-11-22 11:09:08 -08:00
Aapo Kyrola
e0c8c539e7 Backed out changeset 119623addbbd
Summary: Unlanding D6327460 because seems to be causing unstability.

Differential Revision: D6377117

fbshipit-source-id: 4e1241fe65cd4c7a127fa6fa724f60b75965a096
2017-11-20 16:17:52 -08:00