Commit Graph

1554 Commits

Author SHA1 Message Date
Andrey Malevich
01de4e40d6 Fix a bug in nested parameter sharing logic.
Summary:
It appears that my initial implementation did not really work once
nesting is involved. This diff fixes this by replacing itertools with
something that is really easy to reason about.

Reviewed By: idning

Differential Revision: D6933763

fbshipit-source-id: f7a1de996d878a41bac2b2acd9d87a7c4b416778
2018-02-08 13:32:53 -08:00
Giri Anantharaman
6aaa701c9c Adding ThresholdedRelu Op support.
Summary: Core operator and python operator changes for adding ThresholdedRelu Op support.

Reviewed By: houseroad

Differential Revision: D6900660

fbshipit-source-id: 9b17ede13ccb3264286389c7fc633ab9c1a7bbbf
2018-02-08 12:18:40 -08:00
Alexander Sidorov
e0e124e617 Fix RNN scoping situation
Summary:
There is a long-standing scoping problem that was introduced in the original Python wrappers early in H1. Each RNNCell implementation has to manually scope the outputs of each of its operators. If somebody forgets, this can cause weird bugs with layers, etc.

The approach is the following: the user has to explicitly specify the current scope when using the apply_over_sequence function (and others) if the function is going to be called several times, e.g. for stacking layers. This way we use Caffe2's native scoping approach instead of inventing an extra API that people have to use (i.e. passing the scope name as an argument to the RNNCell constructor).
Closes https://github.com/caffe2/caffe2/pull/1681

Differential Revision: D6777536

Pulled By: salexspb

fbshipit-source-id: 73d860b8d4857589e04bdea5a6fcd3080d68427c
2018-02-07 17:35:29 -08:00
James Reed
a68e224219 Fix ONNX While test for CUDA
Summary: We should not be trying to instantiate this op on GPU at this point

Reviewed By: pietern

Differential Revision: D6915576

fbshipit-source-id: 6bdbc93ad12fc67e3001fce1b506fe2895d7b0ba
2018-02-06 14:35:34 -08:00
Qinqing Zheng
c028bcd466 Fix input of Reduce{Front/Back}{Sum/Mean}Gradient ops
Summary: The previous refactor of these four ops changed their input semantics, which made them backward incompatible with old models. This diff fixes the problem by checking the input and defining the follow-up behavior case by case, so that old models can be accommodated.

Reviewed By: dzhulgakov

Differential Revision: D6905840

fbshipit-source-id: fc37baec407fd5eae64fc9c2b61aba3c492a90f3
2018-02-05 23:33:07 -08:00
James Reed
f383600625 ONNX While Operator
Summary:
Special While loop operator that follows the semantics of While in ONNX: https://github.com/jamesr66a/onnx/blob/controlflow/docs/Operators.md#experimental-loop

Stuff that's missing:

- Lexical scoping enforced via child workspaces
- Double-buffering on forward

Further possible enhancements:
- Full parallelism when there are no loop-carried dependencies
- Diagonal execution
- More optimized scan_outputs shaping via static shape inference provided in ONNX (coming sometime)
- GPU support (probably just some tensor value management stuff)
- Gradient support (likely low-pri right now)
Closes https://github.com/caffe2/caffe2/pull/1848

Reviewed By: dzhulgakov

Differential Revision: D6907524

Pulled By: jamesr66a

fbshipit-source-id: 4938108733e168b8c027035091104712a18c992a
2018-02-05 21:05:52 -08:00
Anders Papitto
6a02cb2844 implement sequence length support for BasicRNN
Summary: Closes https://github.com/caffe2/caffe2/pull/1843

Differential Revision: D6839575

Pulled By: anderspapitto

fbshipit-source-id: efdf00f1c5cfb0d63f1992028a796c8277b76688
2018-02-05 21:05:51 -08:00
Aarti Basant
28f42cc8e7 separating set_params and init() for checkpoint managers.
Summary: separating set_params and init() for checkpoint managers.

Reviewed By: anshulverma

Differential Revision: D6852255

fbshipit-source-id: 061f16ce0c49953ca8a5fe9546af5c9945a3be48
2018-02-05 18:03:21 -08:00
Evgeny Kharitonov
7c7e09fe2d Adding the Percentile op & UT
Reviewed By: MisterTea

Differential Revision: D6879507

fbshipit-source-id: 7ca4165a42c073e384d3a6138ef033ca384afd49
2018-02-05 16:08:00 -08:00
Anders Papitto
d8748a9d53 GRU sequence lengths: allow unspecified sequence lengths
Summary:
modeled after the earlier change for LSTM
Closes https://github.com/caffe2/caffe2/pull/1841

Differential Revision: D6837461

Pulled By: anderspapitto

fbshipit-source-id: de4e787019fa30f813a4b29f14b7000ce9d22d8e
2018-02-05 13:20:05 -08:00
Orion Reblitz-Richardson
d3ea7e260b Allow for all of the names we have in our model zoo.
Summary:
* We now allow subdirectories as well as numbers in the name.
* Also fixed an error case.
Closes https://github.com/caffe2/caffe2/pull/1875

Reviewed By: pjh5

Differential Revision: D6894401

Pulled By: orionr

fbshipit-source-id: 6a9938bc7d2ba6b8f094ed7b8a02664120a10626
2018-02-05 08:52:55 -08:00
Lin Yang
3acce3e4a7 assert global_constant name as string
Reviewed By: kennyhorror

Differential Revision: D6895157

fbshipit-source-id: 9844ab6176d22c6d05a5a0f83b731f734ef9853d
2018-02-04 01:02:30 -08:00
Lin Yang
95626737d0 enforce global_constant name should be a string
Reviewed By: kennyhorror

Differential Revision: D6880114

fbshipit-source-id: 2c9bd27b01cedb469f19843163b04a613fda5904
2018-02-04 01:02:27 -08:00
Yan Shang
e816c777eb Add regularization for sparse features
Reviewed By: xianjiec

Differential Revision: D5767997

fbshipit-source-id: b9b7c47d11417fbe67d861a2a6b4daa38adbe57b
2018-02-02 16:03:32 -08:00
Yan Shang
dabddd65f4 Add sparse normalization operator
Reviewed By: xianjiec

Differential Revision: D6735673

fbshipit-source-id: 870b38d5175cb2d2dcad43c0e9fa4746e4dd15dd
2018-02-02 15:05:59 -08:00
Lin Yang
e138203d8f add sparse_to_dense_test
Summary: hypothesis_test was introduced in D4508879; add a plain test, which is more straightforward.

Reviewed By: kennyhorror

Differential Revision: D6835334

fbshipit-source-id: d05a2cd199b2de56ac0cc0319f19fcd7978647d5
2018-02-01 08:14:37 -08:00
Xue Feng
f652f20f73 change ModOp to support output sign configurations
Summary: enable ModOp to control the output sign to follow dividend or divisor.
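
The two sign conventions mentioned above can be illustrated with a small standalone sketch. The function name `mod_with_sign` and the flag `sign_follows_divisor` are hypothetical and chosen only for illustration; they are not the operator's actual argument names.

```python
def mod_with_sign(dividend, divisor, sign_follows_divisor=False):
    """Integer remainder whose sign follows the dividend or the divisor.

    Hypothetical helper for illustration only; not the Caffe2 ModOp API.
    """
    if divisor == 0:
        raise ZeroDivisionError("divisor must be non-zero")
    # Python's % operator yields a result with the sign of the divisor.
    remainder = dividend % divisor
    signs_differ = (dividend < 0) != (divisor < 0)
    if not sign_follows_divisor and remainder != 0 and signs_differ:
        # Shift the result so that its sign matches the dividend
        # (the convention used by the C/C++ % operator).
        remainder -= divisor
    return remainder
```

For example, with a dividend of -7 and a divisor of 3, the result is -1 when the sign follows the dividend and 2 when it follows the divisor.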

Reviewed By: xianjiec

Differential Revision: D6852457

fbshipit-source-id: 62dbb66cacecb8e0a0f81f63f2b7b378efbd6ee2
2018-01-31 18:03:16 -08:00
Jerry Pan
eee42748d9 Caffe2: serialize init for parallel workers
Summary: Caffe2: serialize init for parallel workers

Reviewed By: kevinwilfong

Differential Revision: D6862119

fbshipit-source-id: 805b2971eca4501977950420565bd9ea37dc0f6c
2018-01-31 17:50:10 -08:00
Qinqing Zheng
90a3363f29 Return an empty TaskGroup if node managers exist in MultiNodeCheckpointManager
Summary: Currently MultiNodeCheckpointManager returns None in this case, yet in JobRunner we assume this function returns a valid task group, i.e. we call session.run(self.checkpoint_manager.init(...)) directly. This fails when we use LocalHostScheduler and reuse a MultiNodeCheckpointManager.
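
Why an empty task group is safer than None can be sketched with simplified stand-ins. The classes `FakeTaskGroup` and `FakeSession` and the function `init_checkpoint_manager` below are hypothetical; they are not the Caffe2 classes and only mimic the calling pattern described above.

```python
class FakeTaskGroup:
    """Hypothetical stand-in for a task group: just a list of callables."""

    def __init__(self, tasks=None):
        self.tasks = list(tasks or [])


class FakeSession:
    """Hypothetical stand-in for a session that runs every task in a group."""

    def run(self, task_group):
        # This attribute access would raise AttributeError if None were passed.
        return [task() for task in task_group.tasks]


def init_checkpoint_manager(node_managers_exist):
    """Always return a task group, so callers can pass the result to run()."""
    if node_managers_exist:
        # Nothing left to initialize: return an empty group instead of None.
        return FakeTaskGroup()
    return FakeTaskGroup([lambda: "initialized"])
```

With this shape, the caller can write `FakeSession().run(init_checkpoint_manager(...))` without a special case for an already initialized manager.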

Reviewed By: azzolini

Differential Revision: D6843450

fbshipit-source-id: a7ec942cfe692f19e8751b0078ae6a6108f29e54
2018-01-30 19:20:50 -08:00
Alexander Sidorov
98a4c3f9b2 Enable rnn_cell_test in jenkins
Summary: Closes https://github.com/caffe2/caffe2/pull/1839

Differential Revision: D6847623

Pulled By: salexspb

fbshipit-source-id: b8a32cb39a8063b8938c89556e5d42606735238d
2018-01-30 11:48:35 -08:00
Lu Fang
560e5c94bd Change default value of LeakyRelu's alpha from 0 to 0.01
Summary: To match the semantics in ONNX, change the default value of LeakyRelu's alpha to 0.01.
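
As a rough scalar illustration of what the default affects (a sketch, not the operator's implementation; the function name `leaky_relu` is chosen here for illustration):

```python
def leaky_relu(x, alpha=0.01):
    """Return x for positive inputs and alpha * x otherwise."""
    return x if x > 0 else alpha * x
```

With an alpha of 0 the function reduces to a plain ReLU, so negative inputs that used to map to 0 now map to a small negative value unless alpha is set explicitly.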

Reviewed By: dzhulgakov

Differential Revision: D6840975

fbshipit-source-id: 08543f80fd86cbe96a0eee8d725ef137a5bf4ab8
2018-01-29 22:31:12 -08:00
Xiaomeng Yang
6b1f848df6 Adds gpu implementation for FCTransposed
Summary: Adds gpu implementation for FCTransposed.

Reviewed By: salexspb

Differential Revision: D6572785

fbshipit-source-id: a7cd0f7364ace286942c46b91e0287307cbfea83
2018-01-29 19:03:24 -08:00
mdschatz
3c952426fb Add operator attaching net observer
Summary:
Commonly, net observers attach operator observers at construction. This diff separates the logic into a base class to inherit from.
Closes https://github.com/caffe2/caffe2/pull/1806

Reviewed By: salexspb

Differential Revision: D6808623

Pulled By: mdschatz

fbshipit-source-id: 75ef0eea913ef30943541c829c0a976965f42736
2018-01-29 14:34:34 -08:00
Xiaolong Wang
f8575f6d68 Breakdown Dispatcher
Summary: dispatch by Ngram breakdown

Differential Revision: D6794082

fbshipit-source-id: 7f6e8fa3a0abe0dc6d0d466c95e8c4fc865e3abb
2018-01-26 17:47:54 -08:00
Anders Papitto
33d2212751 LSTM sequence lengths: allow unspecified sequence lengths
Summary:
In this case, each sequence is treated as having a length equal to the
first dimension of the input tensor. This matches the semantics of
ONNX when the sequence length input is left out.
Closes https://github.com/caffe2/caffe2/pull/1764
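
The default described above can be sketched as follows. The helper name `resolve_sequence_lengths` is hypothetical, and the input layout (time steps, batch size, feature size) is an assumption made for this illustration.

```python
def resolve_sequence_lengths(input_shape, seq_lengths=None):
    """Return one sequence length per batch entry.

    Hypothetical helper. Assumes input_shape is laid out as
    (time_steps, batch_size, feature_dim).
    """
    time_steps, batch_size = input_shape[0], input_shape[1]
    if seq_lengths is None:
        # No lengths given: every sequence spans the full first dimension.
        return [time_steps] * batch_size
    if len(seq_lengths) != batch_size:
        raise ValueError("expected one sequence length per batch entry")
    return list(seq_lengths)
```

For an input of shape (5, 3, 8) with no lengths supplied, this yields a length of 5 for each of the 3 sequences.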

Reviewed By: dzhulgakov

Differential Revision: D6751219

Pulled By: anderspapitto

fbshipit-source-id: 89e0efd12339157627494e2b8c83e952bdd8a9f8
2018-01-26 16:32:56 -08:00
Lin Yang
252211b001 testPairwiseDotProduct
Summary: as title.

Reviewed By: kennyhorror

Differential Revision: D6793829

fbshipit-source-id: f803e0400635ca37184f1dd5bb711bfe0e4bea21
2018-01-26 11:33:08 -08:00
Alexander Sidorov
a3b8c459d4 Revamp MNIST tutorial
Summary:
Main changes:

1. Move reader creation to Brew in order to be consistent and avoid a wild use of param_init_net
2. Use optimizers for training function, avoid manual optimizer construction
3. Add MLP mode (a default)
4. Trim a bunch of overly verbose comments and add some new explanations
Closes https://github.com/caffe2/caffe2/pull/1760

Differential Revision: D6749059

Pulled By: salexspb

fbshipit-source-id: 9dfbbb2d9772a74a0300c2e404a92e791f7cc593
2018-01-26 09:17:31 -08:00
Peter Goldsborough
0fd41a63a1 Integrate Fused8BitRowwise ops with DPER
Summary: Updates `sparse_lookup.py` for the new fused 8-bit rowwise quantization. Mostly just changing the same files as the original diffs (D5753626 and D5761202). I know very little about this code here so please let me know if this is safe, also in terms of migration away from the non-fused storage.

Reviewed By: kennyhorror

Differential Revision: D6710784

fbshipit-source-id: 185f147af52a094a937ba631b0351225e660d205
2018-01-25 15:02:42 -08:00
Frank Jiang
304e607b70 Fix adam test
Reviewed By: pietern

Differential Revision: D6787780

fbshipit-source-id: a2d1428b0e028d6f3d8f7c312c90f3fa411cd0a2
2018-01-25 12:59:54 -08:00
Xiaolong Wang
b2cfc5ea53 add KeySplitOp
Summary:
as titled

After converting categorical to Ngram keys, use this op to extract eids

Differential Revision: D6794020

fbshipit-source-id: 4f9251a22d7a129da30b92845e312876e6510e7e
2018-01-25 10:50:53 -08:00
Xiaomeng Yang
d695027300 Adds cuda support for LC op
Summary: Adds cuda support for LC Op

Reviewed By: QueryConnectionException

Differential Revision: D6803659

fbshipit-source-id: 538bbf6fd202c79154132fda0e90e175eb09d025
2018-01-25 10:19:48 -08:00
Huazhong Ning
90543ff13a weighted sampling reader dequeue outputs table index
Summary: Weighted sampling reader dequeue randomly chooses a hive reader to read a mini-batch. This diff allows dequeue to output the index of the randomly chosen table to a specific blob.
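
The behavior can be sketched in a few lines. The function `dequeue` and its arguments are hypothetical stand-ins (readers are modeled as plain callables), not the actual reader API.

```python
import random


def dequeue(readers, weights, rng=None):
    """Pick one reader at random by weight; return its batch and its index.

    Hypothetical sketch: each reader is a zero-argument callable that
    returns a mini-batch.
    """
    rng = rng or random.Random()
    # random.choices draws with replacement according to the given weights.
    index = rng.choices(range(len(readers)), weights=weights, k=1)[0]
    batch = readers[index]()
    return batch, index
```

Returning the index alongside the batch lets downstream code tell which table a given mini-batch came from.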

Reviewed By: kennyhorror

Differential Revision: D6621070

fbshipit-source-id: 754b981fc2bcfdb0146d2a0a5b677e7cfe74211b
2018-01-24 19:06:25 -08:00
Huan Gui
c261b9ce70 Fix NGram from categorical test
Summary: Fix the flaky NGramFromCategorical test

Reviewed By: dragonxlwang

Differential Revision: D6801152

fbshipit-source-id: dcbae17b1d3737a41fb2f5c794c1146a02c542bb
2018-01-24 18:51:16 -08:00
Xiaomeng Yang
afafe8a466 Add LC Layer
Summary: Add the 1st version of LC layer.

Reviewed By: Yangqing

Differential Revision: D6788647

fbshipit-source-id: ebee9215a1d6e1e567548a0fef771802851682a3
2018-01-24 16:51:17 -08:00
Aarti Basant
fc56e86c7d Introduce init API for the optional Checkpoint Metadata Handler object
Summary:
Every call to the checkpoint_metadata_handler write() API requires us to pass all params such as db_prefix, db_type, etc.
This diff introduces an init API in the checkpoint_metadata_handler so that such params can be saved and need not be passed in every API call.
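
The design can be sketched with a toy handler. The class `InMemoryMetadataHandler`, its `records` attribute, and the `epoch` argument to `write()` are hypothetical; only the `init()`/`write()` split and the `db_prefix`/`db_type` parameter names come from the description above.

```python
class InMemoryMetadataHandler:
    """Hypothetical handler that records metadata writes in a list."""

    def __init__(self):
        self._db_prefix = None
        self._db_type = None
        self.records = []

    def init(self, db_prefix, db_type):
        # Store the shared parameters once, up front.
        self._db_prefix = db_prefix
        self._db_type = db_type

    def write(self, epoch):
        # Later calls only pass what actually changes between calls.
        if self._db_prefix is None:
            raise RuntimeError("init() must be called before write()")
        self.records.append((self._db_prefix, self._db_type, epoch))
```

After a single `init("/tmp/ckpt", "minidb")` call (example values), each `write(epoch)` reuses the stored parameters.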

Reviewed By: mraway, anshulverma

Differential Revision: D6792651

fbshipit-source-id: 059fa4309e8fce1ee5ab009af3e0570573c24245
2018-01-24 15:19:55 -08:00
Lukasz Wesolowski
29a4c942fe Add support for multi-device batch normalization through an option to data_parallel_model
Summary: Stage 3 in stack of diffs for supporting multi-device batch normalization. Adds input parameter to data_parallel_model to enable multi-device batch normalization. Depends on D6699258.

Reviewed By: pietern

Differential Revision: D6700387

fbshipit-source-id: 24ed62915483fa4da9b1760eec0c1ab9a64b94f8
2018-01-24 13:24:06 -08:00
Lukasz Wesolowski
9414072159 Add operators to support batch normalization across multiple devices on the same node
Summary: This is the first in a series of diffs to enable batch normalization across multiple devices on the same node with data parallel model. The diff contains the ops for computing the per-channel statistics required to obtain the mean and variance across multiple devices on the same node on the forward pass, and the gradient of the bias and scale during backpropagation. The actual modifications to SpatialBN and SpatialBNGradient to make use of these results will be in a separate diff.
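
One common way to obtain a mean and variance across devices is to combine per-device counts, sums, and sums of squares. The sketch below assumes those are the statistics exchanged, which the summary does not state; the function name `combine_channel_stats` is hypothetical.

```python
def combine_channel_stats(per_device_stats):
    """Combine per-device statistics for a single channel.

    Hypothetical sketch. per_device_stats is a list of
    (count, sum, sum_of_squares) tuples, one per device.
    Returns the mean and the (biased) variance over all devices.
    """
    total_count = sum(stats[0] for stats in per_device_stats)
    total_sum = sum(stats[1] for stats in per_device_stats)
    total_sumsq = sum(stats[2] for stats in per_device_stats)
    mean = total_sum / total_count
    # Var[x] = E[x^2] - (E[x])^2
    variance = total_sumsq / total_count - mean * mean
    return mean, variance
```

For example, one device holding the values 1 and 2 and another holding 3 and 4 contribute (2, 3.0, 5.0) and (2, 7.0, 25.0), which combine to the same mean and variance as computing them over all four values at once.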

Reviewed By: rbgirshick

Differential Revision: D6697336

fbshipit-source-id: 0de2750fe7e851795f238d9f625aeb4d74023dc2
2018-01-24 13:24:04 -08:00
Pieter Noordhuis
7a232aae49 Add random seed to NGramFromCategorical test
Summary: TSIA

Reviewed By: Yangqing, Maratyszcza, dzhulgakov

Differential Revision: D6797213

fbshipit-source-id: e1132229cda09d1fbde63686aaec81b995989c03
2018-01-24 13:05:28 -08:00
Xiaolong Wang
29c7c682d8 add NGramFromCategorical Op
Summary: as titled

Differential Revision: D6783763

fbshipit-source-id: 78280cf15c2cdc3c308562d3f27a81b61ef8d662
2018-01-23 15:08:25 -08:00
Xue Feng
0e9b0cf779 add error msg in fc input_record
Summary: as titled

Reviewed By: xianjiec

Differential Revision: D6787879

fbshipit-source-id: 4bbdd11455480b25fa18121fa4527a9f0a03addc
2018-01-23 14:48:15 -08:00
Anders Papitto
0aa1a6387e Add a seed to the gru unit test
Summary:
as it calls np.random and sometimes fails unreproducibly
Closes https://github.com/caffe2/caffe2/pull/1779

Reviewed By: pietern

Differential Revision: D6779802

Pulled By: anderspapitto

fbshipit-source-id: 2ad069f8a15f70a8110b1a6bdb06f81577c53ad4
2018-01-23 13:47:43 -08:00
Xianjie Chen
76a141f016 add error msg in get_key
Summary: as title

Differential Revision: D6782896

fbshipit-source-id: bd29f6d085e56f51deb4bf6ad81771787fd85a5a
2018-01-23 11:04:05 -08:00
Dániel Simig
2dd79eb53a Visualize distribution of activation functions
Summary:
This is a first attempt at completing bootcamp task T24449916. This diff contains 3 major changes:
1) Change LayerModelHelper to allow exposing the output and parameters of any layer to metrics
2) Add a runner that allows metrics to draw arbitrary plots to a matplotlib axes object
3) Implement a metric that aggregates distributions of values in a blob over the course of training, and try it out in a notebook

Reviewed By: kennyhorror

Differential Revision: D6671273

fbshipit-source-id: b8961837395e89c957edbf5c7c862bdb845ccf4b
2018-01-23 10:36:40 -08:00
Lin Yang
8e0177255e Test for PositionWeighted
Summary: add Test for SparseLookup with PositionWeighted.

Reviewed By: kennyhorror

Differential Revision: D6771612

fbshipit-source-id: b4b3bfd514f366f579b4192643330ae73843d4f9
2018-01-22 19:20:46 -08:00
Viswanath Sivakumar
231d6f7b09 Add SqueezeOp in MKLDNN
Summary:
SqueezeOp support to drop dims of size 1. MKLMemory now supports Reshape()
if the buffer is in plain layout, in which case just the dims and layouts are
modified similar to caffe2::Tensor. SqueezeOp takes care of converting the
input to plain layout if needed via an intermediate buffer before calling
Reshape().
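
At the level of shapes, squeezing only changes the dims, which is why a reshape is sufficient for a plain-layout buffer. A minimal sketch, assuming the dims to drop are given as a list of indices (the function name `squeezed_shape` is hypothetical):

```python
def squeezed_shape(shape, dims):
    """Return shape with the listed size-1 dimensions removed.

    Hypothetical helper for illustration; it only computes the new dims
    and does not touch any data.
    """
    to_drop = set(dims)
    for d in to_drop:
        if shape[d] != 1:
            raise ValueError("can only squeeze dimensions of size 1")
    return [size for i, size in enumerate(shape) if i not in to_drop]
```

For instance, squeezing dims 0 and 2 of a shape [1, 3, 1, 2] gives [3, 2], with the same number of elements as before.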

Differential Revision: D6735656

fbshipit-source-id: 953309498370e1b8986e8c593bc6963f38036255
2018-01-22 18:39:42 -08:00
Wei Zhang
1d4e996b87 Separate parameter downloading tasks from training tasks and run them in a different group
Summary:
At the end of distributed training, the trainer needs to download the parameters back from the parameter servers in order to save the model. Currently, this parameter download happens at the end of the job's epoch task group, which creates several problems when checkpointing is enabled for distributed training:

1. When checkpointing is enabled, we run multiple training epochs. At the end of each epoch, the model download tasks run to collect parameters, but we won't save the model until the true end of training, so this is a big waste of resources.
2. After trainer0 downloads the parameters, these parameters take up a lot of memory, so trainer0 can easily run out of memory in the next epoch of training.

Our solution is to insert a parameter download task group between the job's training epoch_group and the job's exit_group.

Reviewed By: azzolini

Differential Revision: D6765393

fbshipit-source-id: 5a4f556fc3c1cd7834a7c406a3c0de3fccd50c49
2018-01-22 14:04:12 -08:00
Pieter Noordhuis
d618c05174 Increase lower bound of values in div test
Summary:
This should translate to a 1% error margin. The gradient checker uses a 0.5% threshold.
Closes https://github.com/caffe2/caffe2/pull/1766

Differential Revision: D6774077

Pulled By: pietern

fbshipit-source-id: f97c7ffb2ef34fdd71d69320a7fdcf4a6a457715
2018-01-22 09:06:12 -08:00
Viswanath Sivakumar
b5d513b1f9 Add op in MKLDNN
Summary:
Just redirects to MKLSumOp. Doesn't support broadcast though since dnnSumCreate
expects identical dims.

Differential Revision: D6729788

fbshipit-source-id: 3e189465ad9d026bec4954648562ffe4e67fc393
2018-01-21 08:21:43 -08:00
James Cross
91066559a8 truthy check for empty string in NameScope()
Summary:
As in name. The LATTE translation team, while moving some code from Python 2 to 3, uncovered a case where a comparison between unicode and str types leads NameScope('') to prepend a separator to the beginning of blob names. This fixes it.

Thank you so much to dzhulgakov for tracking down the cause of this so quickly!
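
The idea behind a truthiness check can be sketched as follows. The function `scoped_name` and the constant `SEPARATOR` are hypothetical, and the "/" separator is an assumption made for this illustration; this is not the NameScope implementation.

```python
SEPARATOR = "/"  # assumed separator, for illustration only


def scoped_name(prefix, blob_name):
    """Prepend prefix and a separator only when the prefix is non-empty.

    Hypothetical sketch. A truthiness test treats every empty string the
    same way, so it does not depend on comparing values of different
    string types.
    """
    if prefix:
        return prefix + SEPARATOR + blob_name
    return blob_name
```

With an empty prefix the blob name is returned unchanged, so no leading separator appears.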

Reviewed By: dzhulgakov

Differential Revision: D6766866

fbshipit-source-id: fbe46cff581f425ba10e8668400915ea40baab94
2018-01-19 21:34:09 -08:00
Ilia Cherniavskii
4ce4bc5c7f Fix occasional test timeouts
Summary: Make test less computationally expensive

Reviewed By: Yangqing, dzhulgakov

Differential Revision: D6766236

fbshipit-source-id: 59e51faa1331d804b11da9f7237ee9ce0cb27df8
2018-01-19 20:08:58 -08:00