836 Commits

Author SHA1 Message Date
Luke Yeager
31e700910d Fix entropy error coming from test_div
Summary:
Working towards https://github.com/caffe2/caffe2/pull/817.

`E           InvalidArgument: Insufficient bytes of entropy to draw requested array.  shape=(4, 2, 5, 1, 3, 5, 5, 1), dtype=float32.  Can you reduce the size or dimensions of the array?  What about using a smaller dtype?  If slow test runs and minimisation are acceptable, you  could increase settings().buffer_size from 8192 to at least 24576000.`

https://travis-ci.org/caffe2/caffe2/jobs/243867951
Closes https://github.com/caffe2/caffe2/pull/828

Differential Revision: D5276723

Pulled By: akyrola

fbshipit-source-id: f7d0e2dd8ef8b6a2354bd4ff7c7446c377c954b4
2017-06-19 13:47:29 -07:00
Xiaoti Hu
969831ea33 Deprecate CNNModelHelper in lmdb_create_example
Reviewed By: akyrola

Differential Revision: D5233793

fbshipit-source-id: bae745791f071bc36fd45bd81145ce86c8ba9ed0
2017-06-19 13:04:02 -07:00
Wael Abdelghani
4b4022ded7 Make test_lstm_main more stable
Summary: Title

Reviewed By: Yangqing

Differential Revision: D5268569

fbshipit-source-id: f79c38376ef2dd0684fd438668b0762341d982cf
2017-06-19 12:36:29 -07:00
Luke Yeager
2579be1227 Skip fp16 initializer test for CPU-only builds
Summary:
Working towards https://github.com/caffe2/caffe2/pull/817.
```
E           AttributeError: Method FloatToHalf is not a registered operator. Did you mean: []
```
https://travis-ci.org/caffe2/caffe2/jobs/243867951

/cc slayton58
Closes https://github.com/caffe2/caffe2/pull/829

Differential Revision: D5276796

Pulled By: akyrola

fbshipit-source-id: 34edca6090a9ce7ab39ae1fdc0e83b5c3b7e4f49
2017-06-19 12:21:25 -07:00
Luke Yeager
90a52c3904 Skip TestInferDevice if no GPU support
Summary:
Working towards https://github.com/caffe2/caffe2/pull/817.
```
E           AttributeError: Method CopyCPUToGPU is not a registered operator. Did you mean: []
```
https://travis-ci.org/caffe2/caffe2/jobs/243867951
Closes https://github.com/caffe2/caffe2/pull/818

Differential Revision: D5276735

Pulled By: akyrola

fbshipit-source-id: 35d9df19330ae522037e8a5d721d83dc2e5aa4dc
2017-06-19 12:21:24 -07:00
Luke Yeager
932cf9eb92 Fix entropy error coming from utility_ops_test
Summary:
Working towards https://github.com/caffe2/caffe2/pull/817.

`E           InvalidArgument: Insufficient bytes of entropy to draw requested array.  shape=(20, 12, 22), dtype=float32.  Can you reduce the size or dimensions of the array?  What about using a smaller dtype?  If slow test runs and minimisation are acceptable, you  could increase settings().buffer_size from 8192 to at least 43253760.`

https://travis-ci.org/caffe2/caffe2/jobs/243867951

/cc kittipatv
Closes https://github.com/caffe2/caffe2/pull/830

Differential Revision: D5276639

Pulled By: akyrola

fbshipit-source-id: 0c21be25ecd931837dc8b0c2cc17048f531350d1
2017-06-19 12:09:32 -07:00
Ben Zhang
1ec0b89361 Memonger Graph Verifier
Summary:
We want to make sure that a graph optimized by memonger doesn't have any possibility of two threads writing into the same output blob at the same time, when blobs are renamed.

Creates a graph where edges are built such that a parent node's output blob is a child node's input blob, and no node between the parent and child writes to the same blob. If two nets generate the same such graph, then the "path" of data is the same.
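
A minimal sketch of the comparison described above, not the verifier added in this diff. It assumes a net can be modelled as an ordered list of `(inputs, outputs)` pairs; `build_data_graph` and the example nets are hypothetical names used only for illustration.

```python
def build_data_graph(ops):
    """Return the set of (parent_index, child_index) edges for a net.

    ops is a list of (inputs, outputs) tuples in execution order. An edge is
    added when the child reads a blob whose most recent writer is the parent,
    so no op in between has written to that blob.
    """
    last_writer = {}
    edges = set()
    for idx, (inputs, outputs) in enumerate(ops):
        for blob in inputs:
            if blob in last_writer:
                edges.add((last_writer[blob], idx))
        for blob in outputs:
            last_writer[blob] = idx
    return edges


original = [
    (["data"], ["a"]),
    (["a"], ["b"]),
    (["b"], ["c"]),
    (["a", "c"], ["d"]),
]
# "c" renamed to "b": "b" is dead after op 2, so the data path is unchanged.
safe_rename = [
    (["data"], ["a"]),
    (["a"], ["b"]),
    (["b"], ["b"]),
    (["a", "b"], ["d"]),
]
# "b" renamed to "a": op 3 now reads a value op 1 overwrote.
unsafe_rename = [
    (["data"], ["a"]),
    (["a"], ["a"]),
    (["a"], ["c"]),
    (["a", "c"], ["d"]),
]

same_path = build_data_graph(original) == build_data_graph(safe_rename)
broken_path = build_data_graph(original) != build_data_graph(unsafe_rename)
```

Because the edges are keyed by op position rather than blob name, a renaming that preserves the data path yields an identical edge set.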

Reviewed By: akyrola

Differential Revision: D5210385

fbshipit-source-id: 6317fc4e16289339b50c2dcd86ec8b32d2d544a5
2017-06-19 00:46:32 -07:00
Jeff Johnson
3f860af050 Implement TopKOp for GPU
Summary:
This is a real implementation (not GPUFallbackOp) of the TopKOp for GPU.

There are two algorithm implementations:

- for k <= 512, it maps to a warp-wide min-heap implementation, which requires only a single scan of the input data.
- for k > 512, it maps to a multi-pass radix selection algorithm that I originally wrote in cutorch. I took the recent cutorch code and removed some cutorch-specific things as it made sense.
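
The single-scan idea behind the small-k path can be sketched on the CPU with Python's `heapq`. This is only an analogy, not the warp-wide CUDA kernel from this diff; `top_k_single_scan` is a hypothetical helper name.

```python
import heapq


def top_k_single_scan(values, k):
    """Return the k largest (value, index) pairs, largest first.

    A min-heap of size k holds the best candidates seen so far, so the input
    is read exactly once and each element costs at most O(log k).
    """
    heap = []
    for idx, value in enumerate(values):
        if len(heap) < k:
            heapq.heappush(heap, (value, idx))
        elif value > heap[0][0]:
            # The new value beats the smallest kept candidate: swap it in.
            heapq.heapreplace(heap, (value, idx))
    return sorted(heap, reverse=True)


result = top_k_single_scan([3, 9, 1, 7, 5], 2)
```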

Also added several utility files that one or the other implementations use, some from the Faiss library and some from the cutorch library.

Reviewed By: jamesr66a

Differential Revision: D5248206

fbshipit-source-id: ae5fa3451473264293516c2838f1f40688781cf3
2017-06-17 08:47:38 -07:00
James Reed
21dc425e07 Optimize SumSqrElementsOp for CUDA
Summary: The old version used one block with 128 threads. Throughput was too low for the NMT use case (calculating squared gradient norms for every parameter), so this increases the throughput. Shaves 7% off CNN model training time per step

Reviewed By: wickedfoo

Differential Revision: D5263748

fbshipit-source-id: adc3bacd11e49ea00c60381d613d993050e899be
2017-06-16 17:03:38 -07:00
Dmytro Dzhulgakov
12094b5114 Add random shuffle through the data to the benchmark workflow
Reviewed By: kdub0

Differential Revision: D5171727

fbshipit-source-id: 1d9182bb820224b479682fc0ca5014f909ba19d5
2017-06-16 13:22:46 -07:00
Alexander Sidorov
eefd4b0bb2 Static RNN: gpu support and lstm_benchmark integration
Summary:
While this is not intended to be the most performant and
general solution, we can see from the test plan that in some cases static DAG RNN could
perform better than our own implementation. Hopefully we will get
dynamic RNN DAG execution at least as fast as this one. Then we will
not need this one in production, only for testing.

Still putting it into our benchmark for comparison purposes

Reviewed By: akyrola

Differential Revision: D5210038

fbshipit-source-id: fa44baf51c455872abd6ec5f5d151cf06e15b1fa
2017-06-16 11:31:43 -07:00
Aapo Kyrola
2a9cb7d4a9 use brew for Transpose --> major perf regression fix
Summary: I accidentally noticed that we were calling the non-CUDNN version of Transpose with attention, and it is super slow. This broke when rnn_cell was changed to use ModelHelper instead of CNNModelHelper in D5062963, but calls to transpose were not "brewed".

Reviewed By: jamesr66a

Differential Revision: D5264248

fbshipit-source-id: b61494ae210f34597245f1195d20547f5b5cd8b5
2017-06-16 11:02:48 -07:00
Aapo Kyrola
96f19fefc0 add warning if data parallel model is created for gpus that we don't have
Summary: Don't want to assert since it can be useful to sometimes create models that are not run (for example, unit tests).

Reviewed By: pietern

Differential Revision: D5258905

fbshipit-source-id: f1beee0605bfef235ed0f23f7e78259109720254
2017-06-16 07:02:37 -07:00
Simon Layton
176a841087 Fixes for CuDNNDropoutOp
Summary: Closes https://github.com/caffe2/caffe2/pull/809

Differential Revision: D5263514

Pulled By: akyrola

fbshipit-source-id: 1f1e5bdb6fa551cb1f9beb3e5d3ad9c0c8813ed0
2017-06-15 22:51:12 -07:00
Kittipat Virochsiri
fc2a8d045c adding flatten indices output to TopK
Summary: This makes it easier to gather top-K by group of rows. This is useful when we want to pick the top-K from a batch of fixed-length sessions. Let `N` be the number of sessions, and `M` be the number of examples in a session. We would have a batch of `N * M` rows. We can reshape the score blob to `N x M`, and use it as input to `TopK` to select the top scores for each session. However, without the new output, it would be inconvenient to gather the rows corresponding to the top scores, because the existing indices are per-row, in the `[0, M - 1]` range. The new output can be used directly as input to `Gather`.
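
A small pure-Python sketch of the relationship between the per-row indices and the flattened ones, assuming row-major layout of the `N x M` score blob. `top_k_with_flat_indices` and the sample data are hypothetical and stand in for the `TopK` and `Gather` operators.

```python
def top_k_with_flat_indices(scores, k):
    """scores is a list of N rows (sessions), each holding M scores.

    Returns (values, indices, flat_indices), where flat_indices address rows
    of the original N * M batch.
    """
    m = len(scores[0])
    values, indices, flat_indices = [], [], []
    for row_id, row in enumerate(scores):
        order = sorted(range(m), key=lambda j: row[j], reverse=True)[:k]
        values.append([row[j] for j in order])
        indices.append(order)
        # Offset each per-row index by the start of its session.
        flat_indices.extend(row_id * m + j for j in order)
    return values, indices, flat_indices


scores = [[0.1, 0.9, 0.4], [0.8, 0.2, 0.5]]
batch_rows = ["s0e0", "s0e1", "s0e2", "s1e0", "s1e1", "s1e2"]
values, indices, flat_indices = top_k_with_flat_indices(scores, 2)
gathered = [batch_rows[i] for i in flat_indices]  # plays the role of Gather
```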

Reviewed By: chocjy

Differential Revision: D5171459

fbshipit-source-id: 69f7b41456c3f9670650ae07afc8fef8328485e9
2017-06-15 15:32:29 -07:00
Luke Yeager
84cc82cf3f Fix stats_ops_test
Summary:
The global StatRegistry doesn't get reset when the workspace is reset.
```
>       self.assertTrue(len(workspace.FetchBlob('k3')) == 2)
E       AssertionError: False is not true
```
https://travis-ci.org/lukeyeager/caffe2/jobs/240162665

/cc azzolini

NOTE: this error doesn't show up if you just run `stats_ops_test.py` directly. It shows up when you run other tests in the same session before this test:
```
pytest -v caffe2/python/
```
Closes https://github.com/caffe2/caffe2/pull/788

Differential Revision: D5259232

Pulled By: salexspb

fbshipit-source-id: 3c72633af6bb61c4fda62195298b1e9574b4cbef
2017-06-15 15:07:57 -07:00
Po-Yen Chou
5ce9cbae70 Upgrades python/hypothesis_test.py to use brew instead of CNNModelHelper
Summary: Upgrades this file to use brew instead of CNNModelHelper

Reviewed By: harouwu

Differential Revision: D5252089

fbshipit-source-id: 6df4350717c1d42bc4bcc63d255cd422f085ee05
2017-06-15 15:07:56 -07:00
Dmytro Dzhulgakov
e9cba7e69f Option to read from dataset indefinitely.
Summary: Useful for benchmarking

Reviewed By: kdub0

Differential Revision: D5226758

fbshipit-source-id: 6f3e6dd256f2c40ab71e598a7ce47cd06099adff
2017-06-15 15:07:53 -07:00
James Reed
d9d89b191d implement SliceOp for GPU
Summary: Implementation of the SliceOp for CUDA

Reviewed By: akyrola

Differential Revision: D5254287

fbshipit-source-id: 0a1660e1aa161fd088a2d8f886e019c05a1919a2
2017-06-15 14:34:34 -07:00
Luke Yeager
f61e4ca070 Fixes in tests to support numpy >= 1.12
Summary:
```
  File "/data/caffe2/install/caffe2/python/hypothesis_test.py", line 1911, in test_batch_to_space
    (w + 2 * pad) / block_size).astype(np.float32)
  File "mtrand.pyx", line 1404, in mtrand.RandomState.randn (numpy/random/mtrand/mtrand.c:19843)
  File "mtrand.pyx", line 1534, in mtrand.RandomState.standard_normal (numpy/random/mtrand/mtrand.c:20368)
  File "mtrand.pyx", line 167, in mtrand.cont0_array (numpy/random/mtrand/mtrand.c:6127)
TypeError: 'float' object cannot be interpreted as an index
```
```
  File "/data/caffe2/install/caffe2/python/operator_test/tile_op_test.py", line 101, in tile_ref
    tiled_data = np.tile(X, tuple(dims))
  File "/data/caffe2/venv/local/lib/python2.7/site-packages/numpy/lib/shape_base.py", line 881, in tile
    return c.reshape(shape_out)
TypeError: only integer scalar arrays can be converted to a scalar index
```
I also tested to make sure this still works with 1.11.
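
Both tracebacks come from a float reaching a place that needs an integer dimension. The sketch below reproduces the same class of error with the standard library only and shows one common remedy, floor division. It is an assumption about the kind of fix involved, not a copy of this change; `padded_extent` and `as_shape` are hypothetical helpers.

```python
import operator


def padded_extent(size, pad, block_size):
    """Number of blocks along one padded dimension, as an int."""
    # Floor division keeps the result an int even under true division.
    return (size + 2 * pad) // block_size


def as_shape(dims):
    """Validate dims the way an index-taking API would: only real integers pass."""
    return tuple(operator.index(d) for d in dims)


good_shape = as_shape((2, padded_extent(6, 1, 2)))

try:
    as_shape((2, 8 / 2))  # 8 / 2 is the float 4.0 under true division
    float_rejected = False
except TypeError:
    float_rejected = True
```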
Closes https://github.com/caffe2/caffe2/pull/787

Differential Revision: D5248087

Pulled By: salexspb

fbshipit-source-id: eff69482a8eabb8ace330003fa326c832b53865f
2017-06-15 14:17:20 -07:00
Sen Li
9d8a194cef Deprecate CNNModelHelper in python/workspace_test.py
Summary: Deprecate CNNModelHelper in python/workspace_test.py to use ModelHelper instead of CNNModelHelper

Reviewed By: harouwu

Differential Revision: D5251778

fbshipit-source-id: d634f1c76e41a95b0247ebf5d5a48aef6f8e232e
2017-06-15 14:17:18 -07:00
Hao Shi
c4c3797b0d Deprecate CNNModelHelper - Inception()
Summary:
This diff deprecates `CNNModelHelper` in `Inception()` only.

Depends on D5248848

Reviewed By: harouwu

Differential Revision: D5249312

fbshipit-source-id: 2818fb54bbae203956ed5cd5fb547508923c52a6
2017-06-15 14:03:27 -07:00
Hao Shi
b0625ff566 Deprecate CNNModelHelper - VGGA()
Summary:
This diff deprecates `CNNModelHelper` in `VGGA()` function.

Depends on D5247946

Reviewed By: harouwu

Differential Revision: D5248848

fbshipit-source-id: ede9113edb2024e4db8f0048f812050233e3fb40
2017-06-15 14:03:26 -07:00
Hao Shi
4aff677d3d Deprecate CNNModelHelper - OverFeat()
Summary:
This diff deprecates `CNNModelHelper` in `OverFeat()` function.

Depends on D5247004

Reviewed By: harouwu

Differential Revision: D5247946

fbshipit-source-id: 6a5299ec71f78e0b81a43212a028651522ab8f4b
2017-06-15 14:03:25 -07:00
Hao Shi
078589d7c6 Deprecate CNNModelHelper - AlexNet()
Summary:
This diff deprecates `CNNModelHelper` in the `AlexNet()` function. More diffs will be coming to deprecate the helper in other functions.

Depends on D5241738

Reviewed By: harouwu

Differential Revision: D5247004

fbshipit-source-id: eec5c5ef916a85de8289cb92d2174a6a4b8075bf
2017-06-15 14:03:24 -07:00
Hao Shi
c095b3f67f Deprecate CNNModelHelper - MLP()
Summary: This diff deprecates `CNNModelHelper` in `MLP()` function.

Reviewed By: harouwu

Differential Revision: D5241738

fbshipit-source-id: 03669a4166a02257aa5779860d06b40d7496104d
2017-06-15 14:03:23 -07:00
Luke Yeager
8ef12951e0 Fix for protobuf with unicode_literals
Summary:
Python 2.7, Protobuf 2.6

    >                   op.ClearField('uuid')
    E                   TypeError: field name must be a string

Fix: http://python-future.org/imports.html#should-i-import-unicode-literals

/cc salexspb tomdz
Closes https://github.com/caffe2/caffe2/pull/804

Differential Revision: D5258494

Pulled By: akyrola

fbshipit-source-id: 04c473c1e55bf8caac0bfde7d86171c9f95e71a1
2017-06-15 13:22:57 -07:00
Aapo Kyrola
7ffd76db51 check operator schema before calling gradient creator
Summary: Hard-to-debug problems arise when a gradient creator fails because the forward op itself is incorrect. Add checking of the schema before calling the creator. Also clarify the error messages.

Reviewed By: Yangqing

Differential Revision: D5256016

fbshipit-source-id: 78550f7e2ce5b88e26b69fdae4be0eece52edfea
2017-06-15 13:04:58 -07:00
Mehdi Drissi
6500d7f307 Fixing a small bug in schema where the number of default arguments doesn't match the number of fields
Summary:
The current version of schema.py has a Metadata class with three fields. The default for it is set to
four Nones. This is just changing that to three Nones so that the number of default values matches the number
of actual fields.
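
A sketch of the pattern involved, assuming `Metadata` is a `namedtuple` whose defaults are set through `__new__.__defaults__`. The field names below are illustrative assumptions, not taken from this diff. Deriving the defaults from `_fields` keeps the two counts from drifting apart.

```python
from collections import namedtuple

# Field names are assumptions for illustration only.
Metadata = namedtuple(
    "Metadata", ["categorical_limit", "expected_value", "feature_specs"]
)

# One None per field, computed from the field list instead of hand-counted.
Metadata.__new__.__defaults__ = (None,) * len(Metadata._fields)

empty = Metadata()
partial = Metadata(categorical_limit=10)
```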

Reviewed By: kennyhorror

Differential Revision: D5250463

fbshipit-source-id: 42e5650d270f5f63662614d8445b4819ed370dec
2017-06-15 10:31:56 -07:00
Junjie Bai
be7c336626 Deprecate CNNModelHelper in python/memonger_test.py
Summary: Also fixed a small bug in ModelHelper constructor

Reviewed By: harouwu

Differential Revision: D5246799

fbshipit-source-id: 3719ca078f0e2b5e463fc93da9c8215f5583bd9a
2017-06-15 10:06:57 -07:00
Aapo Kyrola
7bf4c0e0fb support RNNs in ExtractPredictorNet
Summary:
We need to support RNNs explicitly in ExtractPredictorNet, because they store sub-nets as strings in special arguments. When netdef arguments arrive, we can generalize this a bit.

Added a test under rnn_cell_test to test that extracting an LSTM predictor net works correctly and sets the device option properly for the step net ops.

Reviewed By: yqwangustc

Differential Revision: D5236334

fbshipit-source-id: cd653427f8c440a14d94195a532d18276f94749a
2017-06-14 22:32:29 -07:00
haracejacob
2ec294a8bb Fix a few typos and grammar errors in comments
Summary:
Fix a few typos and grammar errors in comments

using language-check, a Python library.
The spell_checker source code is here: https://github.com/17-1-SKKU-OSS/011A/blob/master/spell_checker/spell_checker.py
This text file indicates what should be fixed: https://github.com/17-1-SKKU-OSS/011A/tree/master/spell_checker/fix/caffe2
Closes https://github.com/caffe2/caffe2/pull/719

Differential Revision: D5165118

Pulled By: aaronmarkham

fbshipit-source-id: 7fb8ef7a99d03cd5fd2f9ebdb01b9865e90fc37b
2017-06-14 18:22:39 -07:00
Aapo Kyrola
46a95cf420 Allow specifying device to load_from_db()
Summary: A quite common problem is that it is hard to load blobs with pe.load_from_db to a specific device. One must set the device options of the returned init_net and predict_init_net, which is quite magical. So I made load_from_db() able to set these device options automatically, based on device scope or device_option parameter. Added a unit test.

Reviewed By: asaadaldien

Differential Revision: D5249202

fbshipit-source-id: 7b9d91476cb8d1b0ec0d9772e50b9148b8b184fa
2017-06-14 14:32:24 -07:00
Simon Layton
eaacfc7e25 Fix multi-precision SGD outputs
Summary:
salexspb This fixes a major perf issue (40% boost on alexnet end-to-end perf) in the multi-precision SGD optimizer - it was causing repeated cudaMalloc / cudaFree calls during training iterations due to the changing size of the `grad` blob as it moved from fp16 <-> fp32.
Closes https://github.com/caffe2/caffe2/pull/797

Differential Revision: D5246978

Pulled By: salexspb

fbshipit-source-id: ec3d7ef18445e19eaf5aac908d0a7bcd5957eb60
2017-06-14 11:36:43 -07:00
Ahmed Taei
94d42b03fb MaxReduction ops GPU implementation.
Summary:
Move rowwise-max kernel from Softmax to math_util library and implement
colwise-max kernel and MaxReduction ops.

Reviewed By: akyrola

Differential Revision: D5240329

fbshipit-source-id: a07281a877324de459aace33ff21175a68cfd8f6
2017-06-14 11:02:46 -07:00
Junjie Bai
c1f974aa9f Deprecate CNNModelHelper in python/crf.py
Reviewed By: harouwu

Differential Revision: D5241631

fbshipit-source-id: 3dc448355bc2a766ae9eda1dc579e501743b35cf
2017-06-14 08:49:27 -07:00
Alisson Gusatti Azzolini
d03ffb211c Remove WORKER_INIT_CALLS
Summary: This was only needed in order to initialize stateful PythonOps. Now PythonOp has support for initialization at Op creation time, so this is not used anymore.

Reviewed By: dzhulgakov

Differential Revision: D5242908

fbshipit-source-id: dbaa249466dd0f37f25d204d387b1f99c6dd4fed
2017-06-13 20:18:48 -07:00
Alexander Sidorov
eebda50b79 Operator python traceback
Summary: This shows a Python Caffe2 user where a failed operator was created. The motivation for not storing this information directly in the protobuf is to avoid making it too verbose and to keep the ability to read the protobuf of a net after a simple print() call.

Reviewed By: jamesr66a

Differential Revision: D5226047

fbshipit-source-id: 7edfe850e05a2ec209577142aa3368664a57a108
2017-06-13 18:50:02 -07:00
Alisson Gusatti Azzolini
d3ec6e8f55 Run python op builder at op creation time
Summary:
This allows constructing a python op by passing a pickled "builder function call" as an argument to the op.
The builder function is called at PythonOp construction time and returns a function that will be called when the op is run.

This way we can drop the dependency on 'tokens', which didn't work properly for protobufs that get distributed to other processes. Now, the PythonOp definition is self-contained: as long as the build dependencies are right, sharding the protobuf is enough to execute the net remotely.
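
A minimal sketch of the "pickled builder function call" idea using only the standard library. It is not the PythonOp implementation: `serialize_builder_call` and `SketchPythonOp` are hypothetical names, and `operator.itemgetter` stands in for a user-supplied builder because it is importable on any process that unpickles the payload.

```python
import operator
import pickle


def serialize_builder_call(builder, *args, **kwargs):
    """Pack the builder and its arguments into bytes that can travel with an op."""
    return pickle.dumps((builder, args, kwargs))


class SketchPythonOp:
    """Hypothetical stand-in for an op that owns a Python callable."""

    def __init__(self, pickled_builder_call):
        builder, args, kwargs = pickle.loads(pickled_builder_call)
        # The builder runs once, at construction time, and returns the
        # function that is invoked on every run.
        self._run_fn = builder(*args, **kwargs)

    def run(self, *inputs):
        return self._run_fn(*inputs)


payload = serialize_builder_call(operator.itemgetter, 1)
op = SketchPythonOp(payload)
result = op.run(["a", "b", "c"])
```

The payload carries a reference to the builder plus its arguments, so a receiving process needs only the same importable code, not any per-process registry.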

Reviewed By: dzhulgakov

Differential Revision: D5080833

fbshipit-source-id: a5deaca5d3143024cdb121519689224e9dbec5ce
2017-06-13 16:29:22 -07:00
Thomas Dudziak
b877d4b5f8 Misc fixes for Python 3
Summary: As title

Differential Revision: D5216942

fbshipit-source-id: def5563f1b259efefab3a829d8a78d8d3297ffc7
2017-06-13 12:18:43 -07:00
Xianjie Chen
795e7e64e8 add truncation for sparse feature
Summary:
Truncate the id list using the max length computed in compute meta, so that it has a determined length,
which is useful for the position-weighted pooling method.
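
A sketch of the truncation step, assuming the sparse feature is stored as a `lengths` list plus a flat `values` list. That layout and the helper name `truncate_id_lists` are assumptions for illustration.

```python
def truncate_id_lists(lengths, values, max_length):
    """Keep at most max_length ids from each example's id list."""
    new_lengths, new_values = [], []
    offset = 0
    for n in lengths:
        kept = min(n, max_length)
        new_values.extend(values[offset:offset + kept])
        new_lengths.append(kept)
        offset += n  # advance past the full original list, kept or not
    return new_lengths, new_values


lengths = [3, 1, 4]
values = [10, 11, 12, 20, 30, 31, 32, 33]
new_lengths, new_values = truncate_id_lists(lengths, values, 2)
```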

Reviewed By: sunwael

Differential Revision: D5233739

fbshipit-source-id: f73deec1bb50144ba14c4f8cfa545e1ced5071ce
2017-06-13 10:46:53 -07:00
Yiming Wu
406d748423 better engineering for core_test.TestInferDevice
Summary: Recently people have found that this test is too strict because of proto string matching. Thus, I changed it to compare fields so that this test will not complain even if protobuf changes in the future.

Reviewed By: dzhulgakov

Differential Revision: D5229855

fbshipit-source-id: 54efcd7a0f9e5dbba1ddeb480801abcb859e07bd
2017-06-12 15:23:00 -07:00
Bokai Cao
0f787a01bc map operator (move maptrait def out of class)
Summary: added an operator that converts key/value blobs into a blob containing a map pointer, unittest passed.

Differential Revision: D5224449

fbshipit-source-id: 2f60754ed3ba6ed16039c09019117ae3c3646ab2
2017-06-12 14:52:04 -07:00
Thomas Dudziak
c7f5bf282b Revert py::bytes -> std::string
Summary: As title

Reviewed By: salexspb

Differential Revision: D5229338

fbshipit-source-id: 3bc9442c76061436db8f3217c1ba8edfd9581f8b
2017-06-12 14:11:37 -07:00
Bor-Yiing Su
c1420330b2 Fixes the checkpoint test.
Summary:
Diff D5224410 initializes the should_stop_blob explicitly. With that, we will
have one more blob when executing the job. Adjusts the check accordingly.

Reviewed By: azzolini

Differential Revision: D5228398

fbshipit-source-id: 439b186c30b0b1d0e41e513babbcccd85e7a1b4a
2017-06-12 12:19:14 -07:00
Alexander Sidorov
7f1385e70c Improve gradient accumulation of the framework: 1.5x - 2x
Summary:
We waste extra memory by creating two autosplit gradient
blobs and then accumulating them into the main one. Sometimes, when Sum
/ Sub ops are involved, we can avoid wasting extra memory at all.

Ideally we would not waste any memory and make ops add to the same
blob rather than calculating separate results and then merging
them. But it would require a substantial change to the framework and
rewriting a lot of operators.

Reviewed By: dzhulgakov

Differential Revision: D5157667

fbshipit-source-id: 8293824d6cdd971d8853ae90aee68e4a6d1e132b
2017-06-11 22:02:30 -07:00
Dmytro Dzhulgakov
638fe804dc Implement recover_input_schema_by_prefix
Summary:
It's very useful for simple cases like benchmarking nets where we want to encode input/output record in the net and don't want to go through the hurdles of storing input/output record in MetaNetDef.

For those cases I propose remapping the input/output record before saving to 'input_record/{field_name}'. Then we can recover input/output record back just based on the names of the blobs.
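
The naming convention can be sketched with plain dictionaries. This is an illustration of the idea only: `remap_record` and `recover_record_by_prefix` are hypothetical helpers, and a record is modelled here as a flat mapping from field name to blob name.

```python
def remap_record(record, prefix):
    """Rename each field's blob to '<prefix>/<field_name>' before saving."""
    return {field: "{}/{}".format(prefix, field) for field in record}


def recover_record_by_prefix(blob_names, prefix):
    """Rebuild the field -> blob mapping from blob names alone."""
    marker = prefix + "/"
    return {
        name[len(marker):]: name
        for name in blob_names
        if name.startswith(marker)
    }


record = {"dense": "blob_7", "label": "blob_9"}
saved = remap_record(record, "input_record")

# Only the blob names survive in the saved net; the record is rebuilt from them.
net_blobs = list(saved.values()) + ["fc_w", "fc_b"]
recovered = recover_record_by_prefix(net_blobs, "input_record")
```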

Differential Revision: D5170473

fbshipit-source-id: ac5daa60051605ed93022aec1377a49f08f15663
2017-06-11 15:37:12 -07:00
Xiaolong Wang
b133c214ce fix potential bug in task.py
Summary: as titled

Differential Revision: D5225166

fbshipit-source-id: 9247fe44922c097752c6996ee9192ec72b7e7d88
2017-06-11 10:40:47 -07:00
Xiaolong Wang
827a0ac2fe Fix comment mistakes in task.py
Summary: as titled

Reviewed By: kennyhorror

Differential Revision: D5225154

fbshipit-source-id: 99a9547e15e0d5a4c81b6339ce75406160a7fc07
2017-06-11 10:17:07 -07:00
Artem Volkhin
4102a79da4 Explicitly set should_stop_blob to False in pipeline init
Summary: This diff fixes an issue with running the same reader in the same workspace multiple times. In order to achieve correct behavior of execution step we have to explicitly initialize should_stop_blob with False.

Reviewed By: kennyhorror

Differential Revision: D5224410

fbshipit-source-id: 4ad2740e187b62b0a1f5612ea3eef223dcc8a799
2017-06-11 02:33:42 -07:00