Commit Graph

811 Commits

Author SHA1 Message Date
Hao Shi
c095b3f67f Deprecate CNNModelHelper - MLP()
Summary: This diff deprecates `CNNModelHelper` in the `MLP()` function.

Reviewed By: harouwu

Differential Revision: D5241738

fbshipit-source-id: 03669a4166a02257aa5779860d06b40d7496104d
2017-06-15 14:03:23 -07:00
Luke Yeager
8ef12951e0 Fix for protobuf with unicode_literals
Summary:
Python 2.7, Protobuf 2.6

    >                   op.ClearField('uuid')
    E                   TypeError: field name must be a string

Fix: http://python-future.org/imports.html#should-i-import-unicode-literals
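
The failure mode and the python-future workaround can be sketched as follows; the `str()` wrapper is a no-op on Python 3 and restores a native byte string on Python 2 (an illustration of the linked advice, not the exact patch):

```python
from __future__ import unicode_literals

# With unicode_literals, the literal 'uuid' is a unicode object on
# Python 2, which protobuf 2.6's ClearField() rejects with
# "TypeError: field name must be a string". Wrapping the field name in
# str() restores a native str on Python 2 and is a no-op on Python 3.
field_name = str('uuid')

assert isinstance(field_name, str)
```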

/cc salexspb tomdz
Closes https://github.com/caffe2/caffe2/pull/804

Differential Revision: D5258494

Pulled By: akyrola

fbshipit-source-id: 04c473c1e55bf8caac0bfde7d86171c9f95e71a1
2017-06-15 13:22:57 -07:00
Aapo Kyrola
7ffd76db51 check operator schema before calling gradient creator
Summary: Hard-to-debug problems arise when a gradient creator fails because the forward op is itself incorrect. Add checking of the schema before calling the creator. Also clarify the error messages.

Reviewed By: Yangqing

Differential Revision: D5256016

fbshipit-source-id: 78550f7e2ce5b88e26b69fdae4be0eece52edfea
2017-06-15 13:04:58 -07:00
Mehdi Drissi
6500d7f307 Fixing a small bug in schema where the number of default arguments doesn't match the number of fields
Summary:
The current version of schema.py has a Metadata class with three fields, but its default is set to
four Nones. This changes it to three Nones so that the number of default values matches the number
of actual fields.
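
The shape of the fix can be sketched with a namedtuple; the field names here are illustrative, not necessarily the actual `Metadata` fields:

```python
from collections import namedtuple

# A three-field Metadata paired with exactly three default Nones; the
# bug was a four-entry defaults tuple, which does not line up
# one-to-one with the fields.
Metadata = namedtuple('Metadata',
                      ['categorical_limit', 'expected_value', 'feature_specs'])
Metadata.__new__.__defaults__ = (None, None, None)  # one default per field

m = Metadata()  # all fields default to None
assert m == Metadata(None, None, None)
```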

Reviewed By: kennyhorror

Differential Revision: D5250463

fbshipit-source-id: 42e5650d270f5f63662614d8445b4819ed370dec
2017-06-15 10:31:56 -07:00
Junjie Bai
be7c336626 Deprecate CNNModelHelper in python/memonger_test.py
Summary: Also fixed a small bug in ModelHelper constructor

Reviewed By: harouwu

Differential Revision: D5246799

fbshipit-source-id: 3719ca078f0e2b5e463fc93da9c8215f5583bd9a
2017-06-15 10:06:57 -07:00
Aapo Kyrola
7bf4c0e0fb support RNNs in ExtractPredictorNet
Summary:
We need to support RNNs explicitly in ExtractPredictorNet, because they store sub-nets as strings in special arguments. Once netdef-typed arguments arrive, we can generalize this a bit.

Added a test under rnn_cell_test to test that extracting an LSTM predictor net works correctly and sets the device option properly for the step net ops.

Reviewed By: yqwangustc

Differential Revision: D5236334

fbshipit-source-id: cd653427f8c440a14d94195a532d18276f94749a
2017-06-14 22:32:29 -07:00
haracejacob
2ec294a8bb Fix a few typos and grammatical errors in comments
Summary:
Fix a few typos and grammatical errors in comments

by using language-check, python library
spell_checker source code is here : https://github.com/17-1-SKKU-OSS/011A/blob/master/spell_checker/spell_checker.py
here is the text file which indicates what things should be fixed :  https://github.com/17-1-SKKU-OSS/011A/tree/master/spell_checker/fix/caffe2
Closes https://github.com/caffe2/caffe2/pull/719

Differential Revision: D5165118

Pulled By: aaronmarkham

fbshipit-source-id: 7fb8ef7a99d03cd5fd2f9ebdb01b9865e90fc37b
2017-06-14 18:22:39 -07:00
Aapo Kyrola
46a95cf420 Allow specifying device to load_from_db()
Summary: A quite common problem is that it is hard to load blobs with pe.load_from_db to a specific device. One must set the device options of the returned init_net and predict_init_net, which is quite magical. So I made load_from_db() able to set these device options automatically, based on the device scope or a device_option parameter. Added a unit test.

Reviewed By: asaadaldien

Differential Revision: D5249202

fbshipit-source-id: 7b9d91476cb8d1b0ec0d9772e50b9148b8b184fa
2017-06-14 14:32:24 -07:00
Simon Layton
eaacfc7e25 Fix multi-precision SGD outputs
Summary:
salexspb This fixes a major perf issue (40% boost on alexnet end-to-end perf) in the multi-precision SGD optimizer - it was causing repeated cudaMalloc / cudaFree calls during training iterations due to the changing size of the `grad` blob as it moved from fp16 <-> fp32.
Closes https://github.com/caffe2/caffe2/pull/797

Differential Revision: D5246978

Pulled By: salexspb

fbshipit-source-id: ec3d7ef18445e19eaf5aac908d0a7bcd5957eb60
2017-06-14 11:36:43 -07:00
Ahmed Taei
94d42b03fb MaxReduction ops GPU implementation.
Summary:
Move the rowwise-max kernel from Softmax to the math_util library, and implement
a colwise-max kernel and the MaxReduction ops.
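
The semantics of the two reductions (independent of the GPU kernels) can be sketched as:

```python
# Row-wise max: one value per row; column-wise max: one value per column.
def rowwise_max(m):
    return [max(row) for row in m]

def colwise_max(m):
    return [max(col) for col in zip(*m)]

m = [[1, 5, 3],
     [4, 2, 6]]
assert rowwise_max(m) == [5, 6]
assert colwise_max(m) == [4, 5, 6]
```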

Reviewed By: akyrola

Differential Revision: D5240329

fbshipit-source-id: a07281a877324de459aace33ff21175a68cfd8f6
2017-06-14 11:02:46 -07:00
Junjie Bai
c1f974aa9f Deprecate CNNModelHelper in python/crf.py
Reviewed By: harouwu

Differential Revision: D5241631

fbshipit-source-id: 3dc448355bc2a766ae9eda1dc579e501743b35cf
2017-06-14 08:49:27 -07:00
Alisson Gusatti Azzolini
d03ffb211c Remove WORKER_INIT_CALLS
Summary: This was only needed in order to initialize stateful PythonOps. Now PythonOp has support for initialization at Op creation time, so this is not used anymore.

Reviewed By: dzhulgakov

Differential Revision: D5242908

fbshipit-source-id: dbaa249466dd0f37f25d204d387b1f99c6dd4fed
2017-06-13 20:18:48 -07:00
Alexander Sidorov
eebda50b79 Operator python traceback
Summary: This shows a Python Caffe2 user where a failed operator was created. The motivation for keeping this information out of the protobuf is to avoid making it too verbose and to keep the protobuf of a net readable after a simple print() call.
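
The idea of keeping the creation site next to, but not inside, the serialized op can be sketched like this (the names are illustrative, not the Caffe2 API):

```python
import traceback

# Side table mapping each op to the Python stack at creation time; the
# op definition itself (what print() would show) stays unchanged.
creation_sites = {}

def create_op(op_type, name):
    # Record where the op was created, dropping create_op's own frame.
    creation_sites[name] = traceback.format_stack()[:-1]
    return {'type': op_type, 'name': name}

op = create_op('FC', 'fc1')
assert 'traceback' not in op      # the op definition stays lean
assert 'fc1' in creation_sites    # but the creation site is recorded
```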

Reviewed By: jamesr66a

Differential Revision: D5226047

fbshipit-source-id: 7edfe850e05a2ec209577142aa3368664a57a108
2017-06-13 18:50:02 -07:00
Alisson Gusatti Azzolini
d3ec6e8f55 Run python op builder at op creation time
Summary:
This allows constructing a python op by passing a pickled "builder function call" as an argument to the op.
The builder function is called at PythonOp construction time and returns a function that will be called when the op is run.

This way we can drop the dependency on 'tokens', which didn't work properly for protobufs that get distributed to other processes. Now the PythonOp definition is self-contained: as long as the build dependencies are right, shipping the protobuf is enough to execute the net remotely.
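
The two-stage pattern can be sketched as follows (a hypothetical stand-in, not the actual PythonOp machinery):

```python
import pickle

# The builder runs once at op-construction time and returns the
# callable that the op invokes on every run.
def make_scaler(factor):
    def run(inputs):                 # called each time the op runs
        return [x * factor for x in inputs]
    return run

# In the real flow the builder call (function plus arguments) is pickled
# into the op definition; here we pickle just the arguments for brevity.
builder_args = pickle.loads(pickle.dumps((3,)))
op_fn = make_scaler(*builder_args)   # construction-time call

assert op_fn([1, 2]) == [3, 6]
```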

Reviewed By: dzhulgakov

Differential Revision: D5080833

fbshipit-source-id: a5deaca5d3143024cdb121519689224e9dbec5ce
2017-06-13 16:29:22 -07:00
Thomas Dudziak
b877d4b5f8 Misc fixes for Python 3
Summary: As title

Differential Revision: D5216942

fbshipit-source-id: def5563f1b259efefab3a829d8a78d8d3297ffc7
2017-06-13 12:18:43 -07:00
Xianjie Chen
795e7e64e8 add truncation for sparse feature
Summary:
Truncate the id list using the max length computed in compute meta, so that it has a fixed length,
which is useful for the position-weighted pooling method.

Reviewed By: sunwael

Differential Revision: D5233739

fbshipit-source-id: f73deec1bb50144ba14c4f8cfa545e1ced5071ce
2017-06-13 10:46:53 -07:00
Yiming Wu
406d748423 better engineering for core_test.TestInferDevice
Summary: Recently people have found that this test is too strict because of proto string matching. I changed it to compare fields, so the test will not complain even if the protobuf changes in the future.

Reviewed By: dzhulgakov

Differential Revision: D5229855

fbshipit-source-id: 54efcd7a0f9e5dbba1ddeb480801abcb859e07bd
2017-06-12 15:23:00 -07:00
Bokai Cao
0f787a01bc map operator (move maptrait def out of class)
Summary: added an operator that converts key/value blobs into a blob containing a map pointer, unittest passed.

Differential Revision: D5224449

fbshipit-source-id: 2f60754ed3ba6ed16039c09019117ae3c3646ab2
2017-06-12 14:52:04 -07:00
Thomas Dudziak
c7f5bf282b Revert py::bytes -> std::string
Summary: As title

Reviewed By: salexspb

Differential Revision: D5229338

fbshipit-source-id: 3bc9442c76061436db8f3217c1ba8edfd9581f8b
2017-06-12 14:11:37 -07:00
Bor-Yiing Su
c1420330b2 Fixes the checkpoint test.
Summary:
Diff D5224410 initializes the should_stop_blob explicitly. With that, we will
have one more blob when executing the job. Adjusts the check accordingly.

Reviewed By: azzolini

Differential Revision: D5228398

fbshipit-source-id: 439b186c30b0b1d0e41e513babbcccd85e7a1b4a
2017-06-12 12:19:14 -07:00
Alexander Sidorov
7f1385e70c Improve gradient accumulation of the framework: 1.5x - 2x
Summary:
We waste extra memory by creating two autosplit gradient
blobs and then accumulating them into the main one. Sometimes, when Sum
/ Sub ops are involved, we can avoid wasting extra memory at all.

Ideally we would not waste any memory and would make ops add to the same
blob rather than computing separate results and then merging
them. But that would require a substantial change to the framework and
rewriting a lot of operators.
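
The memory contrast can be sketched numerically (a pure-Python stand-in, not the Caffe2 implementation):

```python
# Wasteful scheme: materialize each autosplit gradient blob, then sum.
splits = [[1.0, 2.0], [0.5, 0.5]]
accumulated = [sum(vals) for vals in zip(*splits)]

# Leaner scheme: each producer adds into the main gradient blob in
# place, so no per-producer buffers are kept alive.
grad = [0.0, 0.0]
for split in splits:
    for i, g in enumerate(split):
        grad[i] += g

assert grad == accumulated == [1.5, 2.5]
```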

Reviewed By: dzhulgakov

Differential Revision: D5157667

fbshipit-source-id: 8293824d6cdd971d8853ae90aee68e4a6d1e132b
2017-06-11 22:02:30 -07:00
Dmytro Dzhulgakov
638fe804dc Implement recover_input_schema_by_prefix
Summary:
It's very useful for simple cases like benchmarking nets where we want to encode input/output record in the net and don't want to go through the hurdles of storing input/output record in MetaNetDef.

For those cases I propose remapping the input/output record before saving to 'input_record/{field_name}'. Then we can recover input/output record back just based on the names of the blobs.

Differential Revision: D5170473

fbshipit-source-id: ac5daa60051605ed93022aec1377a49f08f15663
2017-06-11 15:37:12 -07:00
Xiaolong Wang
b133c214ce fix potential bug in task.py
Summary: as titled

Differential Revision: D5225166

fbshipit-source-id: 9247fe44922c097752c6996ee9192ec72b7e7d88
2017-06-11 10:40:47 -07:00
Xiaolong Wang
827a0ac2fe Fix comment mistakes in task.py
Summary: as titled

Reviewed By: kennyhorror

Differential Revision: D5225154

fbshipit-source-id: 99a9547e15e0d5a4c81b6339ce75406160a7fc07
2017-06-11 10:17:07 -07:00
Artem Volkhin
4102a79da4 Explicitly set should_stop_blob to False in pipeline init
Summary: This diff fixes an issue with running the same reader in the same workspace multiple times. In order to achieve correct behavior of execution step we have to explicitly initialize should_stop_blob with False.

Reviewed By: kennyhorror

Differential Revision: D5224410

fbshipit-source-id: 4ad2740e187b62b0a1f5612ea3eef223dcc8a799
2017-06-11 02:33:42 -07:00
Bokai Cao
e01769ece5 map operator
Summary: added an operator that converts key/value blobs into a blob containing a map pointer, unittest passed.

Differential Revision: D5166513

fbshipit-source-id: 748527c423a163fe55f914c08fff3adfc74a540c
2017-06-09 15:17:29 -07:00
Jiyan Yang
c7aa8e142d Add gradient to SparseToDense op
Summary: As desc.

Differential Revision: D5169423

fbshipit-source-id: 64c72933c14c3caabfbe0bf85912194a479c24fa
2017-06-09 13:47:21 -07:00
Jiyan Yang
c822e89956 Rename SparseToDense layer
Summary:
The SparseToDense layer is essentially calling the SparseToDenseMask op.
This makes it impossible to call the functional layer with the true SparseToDense op.
This diff is to rename the layer.

Please let me know if I missed anything or you have a better name suggestion.

Differential Revision: D5169353

fbshipit-source-id: 724d3c6dba81448a6db054f044176ffc7f708bdb
2017-06-09 12:48:27 -07:00
Yiming Wu
072f4dbefc net_printer_quick_fix
Summary: To deal with encode failure

Reviewed By: azzolini

Differential Revision: D5215897

fbshipit-source-id: cf8687706f7e4deaee05b61cd2bfeaff88672fcc
2017-06-08 19:34:50 -07:00
Wael Abdelghani
c291c97494 Add integration test for pos_w
Summary: Title

Reviewed By: azzolini

Differential Revision: D5197307

fbshipit-source-id: 425bf8e7c5068ea544e5b2709b6bb27eef140bf3
2017-06-08 18:04:53 -07:00
Alexander Sidorov
df72826ead Static RNN
Summary:
Static RNN allows unrolling an RNN into a Caffe2 graph using all existing cell abstractions. In this diff I introduce several new tests that already caught a few bugs in our RecurrentNetworkOp gradient accumulation logic by comparing it to an unrolled version.

Another use case is perf: potentially we can run an unrolled net faster because DAGNet will have access to the whole graph. The same goes for memonger. But that work is not part of this diff.
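
The core of unrolling, replacing the recurrent step net with one explicit call per timestep, can be sketched as:

```python
# Unroll a recurrent cell over T timesteps into a static sequence of
# calls; in Caffe2 terms, each call becomes ordinary ops in the graph.
def unroll(cell, inputs, h0):
    h, outputs = h0, []
    for x in inputs:
        h = cell(x, h)
        outputs.append(h)
    return outputs

cell = lambda x, h: x + h        # trivial stand-in for an LSTM cell
assert unroll(cell, [1, 2, 3], 0) == [1, 3, 6]
```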

Reviewed By: akyrola

Differential Revision: D5200943

fbshipit-source-id: 20f16fc1b2ca500d06ccc60c4cec6e81839149dc
2017-06-08 17:48:48 -07:00
Alexander Sidorov
bb9077a6cd Network forward / backward equality checker
Summary:
In some cases you have an optimized network and a normal
one, and you would like to make sure they produce the same results. If
the math under the hood is the same, you can do this with very high
precision compared to a traditional numerical gradient check. One
application is RNNs: there we can unroll an RNN into a Caffe2 graph and
make sure the result is the same as in the optimized version using
RecurrentNetworkOp.

Another possible application is graph transformations. We can verify
that the transformed nets produce the same gradients (cc akyrola on memonger,
bwasti on other transformation ideas)
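
A minimal version of such a checker, comparing two nets' outputs element-by-element with a tight tolerance, might look like this (a sketch, not the actual checker):

```python
import math

# Compare corresponding outputs of two nets that should compute the
# same math; the tolerance is tight because, unlike a numerical
# gradient check, the computation is supposed to be identical.
def nets_match(outputs_a, outputs_b, tol=1e-9):
    return len(outputs_a) == len(outputs_b) and all(
        math.isclose(a, b, rel_tol=0.0, abs_tol=tol)
        for a, b in zip(outputs_a, outputs_b))

assert nets_match([0.25, 1.5], [0.25, 1.5])
assert not nets_match([0.25, 1.5], [0.25, 1.6])
```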

Reviewed By: bwasti

Differential Revision: D5200855

fbshipit-source-id: 0196af187f0c2feb33de4778ea08d0d288fe1017
2017-06-08 17:48:47 -07:00
Alexander Sidorov
264f75fdd0 ZeroGradient op
Summary:
When building a multi-layer static RNN, the last timestep of
the first layer (and of every layer except the last) doesn't get a
gradient for the cell state, since the user normally consumes results
only from the last layer and the cell state doesn't flow upward either.

ZeroGradient provides a general solution for injecting zero-gradient
blobs. It is somewhat similar to the StopGradient operator, which is
also special-cased.

Reviewed By: bwasti

Differential Revision: D5198375

fbshipit-source-id: a21d0cfb3676a77fac72e5897a200d0bd25fc6de
2017-06-08 16:02:38 -07:00
Luke Yeager
52ee7697f4 Fixing broken Python tests
Summary:
`brew_test.py` is just plain broken. `core_test.py` doesn't work with pytest. `apmeter_test.py` and `top_k_test.py` don't work for CUDA builds.
Closes https://github.com/caffe2/caffe2/pull/765

Differential Revision: D5211817

Pulled By: Yangqing

fbshipit-source-id: 78ec5af35a3fa870978e4c9590210ade9e3bc5ac
2017-06-08 13:34:46 -07:00
Luke Yeager
75f1da327d Skip Python tests which require opencv or lmdb
Summary:
Neither dependency is required by the core Python modules.

OpenCV, in particular, is a pain to install (no pip package). Conditionally skipping this test will make TravisCI integration easier.
Closes https://github.com/caffe2/caffe2/pull/739

Differential Revision: D5211799

Pulled By: Yangqing

fbshipit-source-id: c6bdc8a17977f64f34e968fd9ab8c65161d2624d
2017-06-08 13:34:43 -07:00
Aapo Kyrola
27e01744b2 Probably fixed memonger
Summary:
This diff fixes various issues with memonger, and works at least with rbgirshick's failure case, Resnet-50, and a new, harder unit test. I will still create a proper resnet50-test.

1) Introduce the concept of "tokens". These are passed down the dependency chains, and a blob can be used for recycling only if it owns all the tokens currently in possession. Tokens are added when branching, and tokens are redeemed after all inputs are satisfied. A bit hard to explain.
2) There were various bugs due to bad code: the free_blobs data structure is of a different type depending on whether we have blob sizes. I plan to rewrite this soon, but there were some bugs.
3) Added a harder unit test that failed before.
4) Added a test for resnet50 + memonger

Reviewed By: asaadaldien

Differential Revision: D5193393

fbshipit-source-id: bc2a714877aa1201c32a5ba8ade862865e455711
2017-06-08 09:19:24 -07:00
Aapo Kyrola
feba1eed00 resnet50: fetch right lr
Summary: I broke resnet50 when switching to use the optimizer, which uses an LR per parameter. This only happens after each epoch, and I did not test patiently enough. As a stop-gap, while asaadaldien works on a better solution, just fetch the lr of the conv1_w param.

Reviewed By: asaadaldien

Differential Revision: D5207552

fbshipit-source-id: f3474cd5eb0e291a59880e2834375491883fddfc
2017-06-07 21:46:35 -07:00
Yiming Wu
4fefff0bbb Auto injecting device copy for single net and several nets
Summary:
This diff plans to attack the problem where we want to just annotate device options for operators and let Caffe2 inject the cross-device copy functions for us. This feature is useful for mixed-device training and multi-device training with several nets, where previously we did the heavy lifting of adding copy functions ourselves.

Ideally, this feature will happen like this:

      //construct your nets first
      core.InjectDeviceCopyAmongNets([train_init, train_net, ...])

My ideas are written in comments. I will update them here as well later.

Reviewed By: dzhulgakov

Differential Revision: D5134103

fbshipit-source-id: 173f7da9d1773d1c50ccdc27f1b5cd3067b04af5
2017-06-07 20:03:18 -07:00
Peizhao Zhang
87a12dd355 Caught exception when fetching uninitialized blobs when collecting blob sizes in workspace.
Summary: Catch the exception when fetching uninitialized blobs while collecting blob sizes in the workspace. Some output blobs (like the mask output of DropOut when is_test=1) may be nullptr, and FetchBlob will fail.

Differential Revision: D5198641

fbshipit-source-id: 45ee26c4cb1c25cc48904e9f7d7c007224c97418
2017-06-07 15:35:32 -07:00
Ran Xian
4316fb4876 Implement APMeter op
Summary: Implements an APMeter operator (APMeterOp) to calculate AP for multiclass classification given prediction scores and labels. The op takes a score tensor [nsamples x nclasses] and a label tensor [nsamples x nclasses], and outputs a float tensor of size nclasses with the AP for each class.
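
Average precision for a single class can be sketched as follows (the standard AP formula, assumed to match the op's intent):

```python
# AP for one class: average the precision at each rank where a positive
# example occurs, scanning predictions from highest to lowest score.
def average_precision(scores, labels):
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits, precisions = 0, []
    for rank, i in enumerate(order, start=1):
        if labels[i]:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / hits if hits else 0.0

assert average_precision([0.9, 0.1, 0.8], [1, 0, 1]) == 1.0
```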

Reviewed By: akyrola

Differential Revision: D5082565

fbshipit-source-id: ae7304bc8fc999c361245b9aec38eb9a5f5eef4b
2017-06-07 15:03:04 -07:00
Zhicheng Yan
ee3727db00 add_helper_function_ElementwiseLinear_op
Summary:
Add a helper function for the parametric op ElementwiseLinear.
The typical syntax is model.ElementwiseLinear(input, output, dimension).
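
The op the helper wraps computes, per feature dimension d, y[n][d] = x[n][d] * w[d] + b[d]; a sketch of those semantics:

```python
# ElementwiseLinear: scale and shift each feature dimension with its own
# weight and bias (w and b have one entry per dimension).
def elementwise_linear(x, w, b):
    return [[xi * wi + bi for xi, wi, bi in zip(row, w, b)] for row in x]

x = [[1.0, 2.0],
     [3.0, 4.0]]
assert elementwise_linear(x, [2.0, 3.0], [0.5, -1.0]) == [[2.5, 5.0],
                                                          [6.5, 11.0]]
```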

Reviewed By: harouwu, akyrola

Differential Revision: D5114152

fbshipit-source-id: 8e8c691f824f518ae510a72ab0c12de1b018f3b5
2017-06-07 13:49:48 -07:00
James Cross
98825d1323 guard against special case of in-place operation
Summary:
There is an edge case where internal gradient blobs of the backward step net should not be considered internally calculated if the only "internal" calculation is in-place.

In the case of the failing attention unit tests, the offending blob was attention_weighted_encoder_context_grad, which was incorrectly considered internal because it was the output (as well as input) of a Reshape on the step net's edge. The caveat here is that the results may be unpredictable if a non-pass-through in-place operation is applied to a blob within a step net which is also consumed both internally and is a recurrent state/output. (This is an extreme edge case, and difficult to explicitly enforce, but it's worth noting.)

Reviewed By: salexspb

Differential Revision: D5198328

fbshipit-source-id: 0cfa8f903fd767fc50e727f238ac3d8cdca03fe0
2017-06-07 12:33:31 -07:00
Thomas Dudziak
d524d5b481 Fixes zip/izip for Python 3
Summary: As title

Reviewed By: salexspb

Differential Revision: D5154186

fbshipit-source-id: 2ef24557d82ae16d3bdfbc90a4cc96be8e2dc6c3
2017-06-07 00:04:26 -07:00
Thomas Dudziak
60c78d6160 Fixes range/xrange for Python 3
Summary: As title

Differential Revision: D5151894

fbshipit-source-id: 7badce5d3122e8f2526a7170fbdcf0d0b66e2638
2017-06-07 00:04:26 -07:00
Ahmed Taei
4c5d101caf Implement ColwiseMax and RowwiseMax reduction ops.
Differential Revision: D5192949

fbshipit-source-id: e7e877b4bea19dd1be94449d45d2733f4858b8e7
2017-06-06 21:17:29 -07:00
Aarti Basant
93ac6a9837 checkpointing for distributed hive reader.
Summary:
The goal of this diff is:
1) Enable checkpointing to honor batches_per_epoch
2) Resume hive_readers mid-split

Reviewed By: azzolini

Differential Revision: D5004212

fbshipit-source-id: 2ff5df30ba946eefadd109d80056cde67398a080
2017-06-06 14:20:06 -07:00
Wenyi Huang
7723129d14 Add gradient for topK op
Summary:
Input of topK op: X (dense)
Output of topK op: Value and Indices (sparse representation)
Value will have gradient in some cases,

We backprop (copy) the gradient from sparse (d Value) to dense (d X)
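
The backprop described above, scattering the sparse value-gradient back into the dense input gradient, can be sketched as:

```python
# d_values: gradient w.r.t. the top-k values; indices: their positions
# in the dense input X. Everything outside the top k gets zero gradient.
def topk_gradient(d_values, indices, input_size):
    d_x = [0.0] * input_size
    for grad, idx in zip(d_values, indices):
        d_x[idx] = grad
    return d_x

assert topk_gradient([0.5, -0.2], [3, 0], 5) == [-0.2, 0.0, 0.0, 0.5, 0.0]
```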

Differential Revision: D5133461

fbshipit-source-id: 7bad55b60e8a22dfe0e51357ce2099d7f752c133
2017-06-06 14:20:06 -07:00
Xiangyu Wang
c9c862fa8f 16117716 [Caffe2 OSS] make char-rnn example use build_sgd
Summary: replace hand-made SGD with build_sgd

Reviewed By: salexspb

Differential Revision: D5186331

fbshipit-source-id: 3c7b4b370e29a1344b95819766463bae3812c9a6
2017-06-06 13:54:59 -07:00
Dmytro Dzhulgakov
80fe2e5caf Fix from_column_list
Summary: Previous implementation relied on the order of fields for some reason.

Reviewed By: azzolini

Differential Revision: D5164478

fbshipit-source-id: 12717310860584e18ce4ca67d0bd5048354cdc0a
2017-06-06 01:17:02 -07:00
Yiming Wu
8cd208ad6f Infer input and output device from OperatorDef through OperatorSchema
Summary: Infer input and output device from OperatorDef through OperatorSchema. This is inspired by shape inference. With this feature, we can easily analyze device information for all blobs in the net in a generic way. It is really helpful for automatic cross-device execution.
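
Analogous to shape inference, per-blob device inference can be sketched by walking the net and asking each op where its outputs live (schema-provided in Caffe2; here a simple dict stands in for the op definition):

```python
# Walk ops in order; each op reports the device of the blobs it writes,
# yielding a generic per-blob device map for the whole net.
def infer_blob_devices(ops):
    device_of = {}
    for op in ops:
        for blob in op['outputs']:
            device_of[blob] = op['device']
    return device_of

ops = [{'outputs': ['fc1_w', 'fc1_b'], 'device': 'CPU'},
       {'outputs': ['pred'], 'device': 'CUDA:0'}]
assert infer_blob_devices(ops) == {'fc1_w': 'CPU',
                                   'fc1_b': 'CPU',
                                   'pred': 'CUDA:0'}
```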

Reviewed By: akyrola, dzhulgakov

Differential Revision: D5161065

fbshipit-source-id: ee656123112171a4ca00f2fb3f6940f32ddf3135
2017-06-05 23:47:33 -07:00