Summary: This is the continuation of T20872698 (Implement the gradient operator for element-wise Logit)
Reviewed By: asaadaldien
Differential Revision: D5969487
fbshipit-source-id: c9bb4222529f9fd9085aa9048b90eb70a63f41f4
Summary: Implemented the logit gradient with eps as an argument. Added a unit test for it and explored the optimal parameters for running the test.
Reviewed By: asaadaldien
Differential Revision: D5910655
fbshipit-source-id: 44898b784a57c7ad45519b202b1eaf95c1c4d460
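For reference, a minimal NumPy sketch of the intended semantics (the exact handling of the gradient at the clamped boundaries is an assumption, not taken from the diff):
```
import numpy as np

def logit_ref(x, eps=1e-6):
    # Clamp the input away from {0, 1} so the log stays finite.
    xc = np.clip(x, eps, 1.0 - eps)
    return np.log(xc / (1.0 - xc))

def logit_grad_ref(x, dy, eps=1e-6):
    # d/dx log(x / (1 - x)) = 1 / (x * (1 - x)); where the input was
    # clamped, the forward output is constant, so the gradient is zero.
    xc = np.clip(x, eps, 1.0 - eps)
    inside = (x >= eps) & (x <= 1.0 - eps)
    return np.where(inside, dy / (xc * (1.0 - xc)), 0.0)
```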
Summary: Make a CUDA version of SparseToDense, and register EnsureDense (which is trivial) on CUDA. We need to use atomics because indices can be duplicated. We can later add an option to indicate that the indices are unique, and use a faster path in that case.
Reviewed By: jhcross
Differential Revision: D5750893
fbshipit-source-id: 005d1675b127a571aac8474fca62d9633f0c7bff
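A NumPy sketch of why duplicated indices force accumulation (and hence atomic adds in the CUDA kernel); names here are illustrative:
```
import numpy as np

def sparse_to_dense_ref(indices, values, first_dim):
    # Rows with the same index must be summed, not overwritten;
    # this is what the atomic adds in the CUDA kernel guarantee.
    out = np.zeros((first_dim,) + values.shape[1:], dtype=values.dtype)
    np.add.at(out, indices, values)
    return out

# Index 1 appears twice, so its rows accumulate to [3.].
print(sparse_to_dense_ref(np.array([1, 1, 3]),
                          np.array([[1.], [2.], [4.]], dtype=np.float32), 5))
```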
Summary: Moved the distance op tests from hypothesis_test to distance_op_test and refactored them.
Reviewed By: akyrola, asaadaldien
Differential Revision: D5495104
fbshipit-source-id: 4a90c75eabeb380ae9d150d6258e9b5b0fbfc5ca
Summary: As title. This helps with (quite common) cases where the data input is stuck for one reason or another, and net execution never proceeds, hanging forever.
Reviewed By: andrewwdye
Differential Revision: D5409885
fbshipit-source-id: 840261fd5964408f788fc0f50ece0d74193694ac
Summary:
/cc akyrola is it possible this test has been broken ever since 5614816fce?
More generally, why do we still have `hypothesis_test.py` at all? In the case of this test, surely one of these files already does more than this one old test:
* `operator_test/cudnn_recurrent_test.py`
* `operator_test/recurrent_network_test.py`
* `operator_test/rnn_cell_test.py`
Closes https://github.com/caffe2/caffe2/pull/843
Differential Revision: D5292109
Pulled By: akyrola
fbshipit-source-id: 6df5df6353a9741d1ae1b796adaab98382857527
Summary:
Working towards https://github.com/caffe2/caffe2/pull/817.
`E InvalidArgument: Insufficient bytes of entropy to draw requested array. shape=(4, 2, 5, 1, 3, 5, 5, 1), dtype=float32. Can you reduce the size or dimensions of the array? What about using a smaller dtype? If slow test runs and minimisation are acceptable, you could increase settings().buffer_size from 8192 to at least 24576000.`
https://travis-ci.org/caffe2/caffe2/jobs/243867951
Closes https://github.com/caffe2/caffe2/pull/828
Differential Revision: D5276723
Pulled By: akyrola
fbshipit-source-id: f7d0e2dd8ef8b6a2354bd4ff7c7446c377c954b4
Summary: Upgrades this file to use brew instead of CNNModelHelper
Reviewed By: harouwu
Differential Revision: D5252089
fbshipit-source-id: 6df4350717c1d42bc4bcc63d255cd422f085ee05
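Roughly, the change swaps the old helper-class methods for brew helper functions; a minimal sketch (layer names and shapes are made up for illustration):
```
from caffe2.python import brew, model_helper

# Old style: model = cnn.CNNModelHelper(...); model.FC("data", "fc1", 64, 10)
# New style: a plain ModelHelper plus brew helper functions.
model = model_helper.ModelHelper(name="example")
fc1 = brew.fc(model, "data", "fc1", dim_in=64, dim_out=10)
relu1 = brew.relu(model, fc1, "relu1")
```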
Summary:
```
File "/data/caffe2/install/caffe2/python/hypothesis_test.py", line 1911, in test_batch_to_space
(w + 2 * pad) / block_size).astype(np.float32)
File "mtrand.pyx", line 1404, in mtrand.RandomState.randn (numpy/random/mtrand/mtrand.c:19843)
File "mtrand.pyx", line 1534, in mtrand.RandomState.standard_normal (numpy/random/mtrand/mtrand.c:20368)
File "mtrand.pyx", line 167, in mtrand.cont0_array (numpy/random/mtrand/mtrand.c:6127)
TypeError: 'float' object cannot be interpreted as an index
```
```
File "/data/caffe2/install/caffe2/python/operator_test/tile_op_test.py", line 101, in tile_ref
tiled_data = np.tile(X, tuple(dims))
File "/data/caffe2/venv/local/lib/python2.7/site-packages/numpy/lib/shape_base.py", line 881, in tile
return c.reshape(shape_out)
TypeError: only integer scalar arrays can be converted to a scalar index
```
I also tested to make sure this still works with 0.11.
Closes https://github.com/caffe2/caffe2/pull/787
Differential Revision: D5248087
Pulled By: salexspb
fbshipit-source-id: eff69482a8eabb8ace330003fa326c832b53865f
Summary: Use `CopyItems` so that it accepts any type of tensor. Also, move the cursor to the input blob so that it's checkpoint-friendly. The output is now also part of the input so that inference can work correctly.
Reviewed By: xianjiec
Differential Revision: D4920987
fbshipit-source-id: da532736225ec27f409ff763ff69a0629235151c
Summary:
Implement NormalizeOp for GPU using CUDA, and rewrite the gradient to be a function of the output
so it's more efficient, especially for the CUDA implementation.
Reviewed By: akyrola
Differential Revision: D4971300
fbshipit-source-id: e0ab66462000988aaf1f26010ea550533d107167
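The math behind expressing the gradient through the output, sketched in NumPy for a single vector (the real op normalizes along an axis):
```
import numpy as np

def normalize_grad_ref(x, dy):
    # Forward: y = x / ||x||. The backward pass can be written in
    # terms of y, which is cheap to reuse on the GPU:
    #   dx = (dy - y * <y, dy>) / ||x||
    norm = np.linalg.norm(x)
    y = x / norm
    return (dy - y * np.dot(y, dy)) / norm
```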
Summary: Both SquaredL2Distance and SquaredL2DistanceGradient had bad CUDA implementations. Use proper reductions and batched kernels.
Reviewed By: asaadaldien
Differential Revision: D4968527
fbshipit-source-id: f7cf82072d38bc127c757c5751863a9439aca8b5
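For reference, the batched semantics in NumPy (the 0.5 factor matches my reading of the op's docs; treat it as an assumption):
```
import numpy as np

def squared_l2_distance_ref(X, Y):
    # One distance per batch row, computed as a per-row reduction.
    D = X - Y
    return 0.5 * np.sum(D * D, axis=1)
```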
Summary:
Similar to SafeDequeueBlobsOp, but adds weight-based sampling for reading from multiple input BlobsQueues.
WeightedSampleDequeueBlobsOp takes a vector of weights (each weight is mapped to one input blob queue).
Based on these probabilities, we choose which BlobsQueue to fetch from.
WeightedSampleDequeueBlobsOp stops when any of the input BlobsQueues is empty.
Reviewed By: dzhulgakov
Differential Revision: D4905160
fbshipit-source-id: 5b1551e2250569f933a6c01ed04442843c5e0cb6
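The sampling logic in isolation (this sketches only the weight-to-queue choice, not the op's schema):
```
import numpy as np

def pick_queue(weights, rng):
    # Normalize the per-queue weights into a categorical
    # distribution and draw the queue index to dequeue from.
    p = np.asarray(weights, dtype=np.float64)
    p /= p.sum()
    return rng.choice(len(p), p=p)

print(pick_queue([1.0, 3.0], np.random.RandomState(0)))  # usually 1
```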
Summary:
add necessary ops for feature processing
* logit op
* replace nan
* batch one hot op
Reviewed By: kittipatv
Differential Revision: D4840869
fbshipit-source-id: 197123ea5608d54f0b5ac7899973a077a6a86775
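For two of the ops above, a minimal NumPy sketch of the expected behavior (the batch one-hot op's real schema, with per-feature value lists, is richer than this):
```
import numpy as np

def replace_nan_ref(x, value=0.0):
    # ReplaceNaN: substitute a fixed value wherever the input is NaN.
    return np.where(np.isnan(x), np.float32(value), x)

def one_hot_ref(indices, index_size):
    # One-hot encode integer indices into an (N, index_size) matrix.
    out = np.zeros((len(indices), index_size), dtype=np.float32)
    out[np.arange(len(indices)), indices] = 1.0
    return out
```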
Summary:
Quite a large diff to make cuDNN LSTM and our LSTM produce the same results, and to provide a Python API for the cuDNN LSTM.
* Added operators RecurrentParamGet and RecurrentParamSet to access weights and biases for the different gates, input/recurrent.
* Removed RecurrentInit as not needed
* recurrent.cudnn_LSTM() returns a special net and mapping that can be used to retrieve the parameters from the LSTM
* recurrent.cudnn_LSTM() can be passed blobs that have the parameters for the individual gate weights and biases
* recurrent.InitFromLSTMParams() can be used to initialize our own LSTM from cuDNN params. This way we can test whether cuDNN and our own implementation produce the same result.
recurrent_test.py tests for equivalence
Reviewed By: salexspb
Differential Revision: D4654988
fbshipit-source-id: 6c1547d873cadcf33e03b0e0110248f0a7ab8cb0
Summary:
Actually adds values together on duplicated indices. I didn't use UnorderedSegmentSum because it would need more modifications to figure out the first dimension, and I don't want to make that function more complex than it already is :)
We could theoretically have a version that does CopyItems and fails on duplicate indices as a fallback, but I haven't implemented it yet as it wouldn't be that useful for now.
Also fixes the hypothesis test: calling rand() inside the test body is not cool, as it makes Hypothesis run forever.
Differential Revision: D4814574
fbshipit-source-id: 1851ec5f5df8fc4bf4844585076b8af23a06b0b2
Summary:
Uses the cudnnTransformTensor function. It works by shuffling the strides according to the transpose axes. Significant speedup over the current GPU version.
+ moves the transpose test under utility_ops, because hypothesis_test is too big
Reviewed By: jamesr66a
Differential Revision: D4810993
fbshipit-source-id: 82577c4ced1389e70bd5992820ae4d8297a3817f
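The stride-shuffling idea, demonstrated in NumPy: permuting shape and strides the same way as the axes yields the transpose, and the kernel materializes a copy from such a strided description (a sketch of the idea, not the kernel):
```
import numpy as np

x = np.arange(24, dtype=np.float32).reshape(2, 3, 4)
axes = (2, 0, 1)

# A transpose is just a permutation of shape and strides; copying
# through the permuted strides produces the transposed tensor.
view = np.lib.stride_tricks.as_strided(
    x,
    shape=tuple(x.shape[a] for a in axes),
    strides=tuple(x.strides[a] for a in axes))
assert np.array_equal(view, np.transpose(x, axes))
```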
Summary:
All of these tests fail with some variant of `Cannot create operator of type 'X' on the device 'CUDA'` (see commit messages).
Closes https://github.com/caffe2/caffe2/pull/227
Differential Revision: D4797060
Pulled By: Yangqing
fbshipit-source-id: 5feaa8e949098bfc1254d4c7449a2744e552f925
Summary: PadImage has no kernel parameters, which resulted in the pads_ parameters not being set (0). I added a test case too.
Differential Revision: D4785230
fbshipit-source-id: fd475e7c41208e07fa7a363def9a45c6f82cddfe
Summary: Creates SparseMomentumSGDUpdate, a sparse version of MomentumSGDUpdate, to make that optimization method (via in-place updating operator) compatible with GradientSlices.
Differential Revision: D4784973
fbshipit-source-id: e6330f471a4d5f53589a6ac245e38f256ca7f354
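A NumPy sketch of the update rule applied only to the rows named by the GradientSlice (based on my reading of the dense MomentumSGDUpdate rule; variants such as Nesterov are out of scope here):
```
import numpy as np

def sparse_momentum_sgd_ref(param, moment, indices, grad, lr, momentum):
    # In-place momentum SGD, but only on the sliced rows.
    adjusted = momentum * moment[indices] + lr * grad
    moment[indices] = adjusted
    param[indices] -= adjusted
    return param, moment
```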
Summary: AccumulateHistogramOp, for computing the histogram of all values in input tensors
Differential Revision: D4654417
fbshipit-source-id: dea92346004c772af16e1eb41306287d81dc5a02
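What I understand the op to compute, sketched in NumPy (the real op may also keep overflow buckets for out-of-range values; that detail is left out here):
```
import numpy as np

class HistogramAccumulator(object):
    # Accumulates a histogram of every value seen across calls.
    def __init__(self, lower, upper, num_buckets):
        self.edges = np.linspace(lower, upper, num_buckets + 1)
        self.counts = np.zeros(num_buckets, dtype=np.int64)

    def update(self, values):
        hist, _ = np.histogram(values, bins=self.edges)
        self.counts += hist
        return self.counts
```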
Summary:
1. Allow the EnsureDense op to do either an in-place pass-through or a copy
2. In MTML, add an EnsureDense op before gather
3. Change the unit test values (adding another operator changes the random seed,
which also changes the model initialization)
Reviewed By: xianjiec
Differential Revision: D4625219
fbshipit-source-id: b3c748c3651d1dedd75420912a9698b7e46187c5
Summary:
Update the cuDNN RNN interface (mostly fixing the ordering of arguments). Set the seed so that the test passes consistently.
Closes https://github.com/caffe2/caffe2/pull/62
Reviewed By: Yangqing
Differential Revision: D4348966
fbshipit-source-id: f9b56be37739e5bffabec130e3407492b2aef656
Summary:
Add two arguments to the DotProductOp operator: `force_same_dim` (1 if we want
DotProductOp to only accept two tensors of equal dimension, 0 otherwise) and
`pad_value` (only used when force_same_dim = 0; pads the smaller tensor
to the same size as the other one).
Differential Revision: D4502619
fbshipit-source-id: 46f7da710c6f6365f76a7af6234c34c7f656be62
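A NumPy reference for the padded case (a sketch of the described semantics, not the op's implementation):
```
import numpy as np

def dot_product_ref(X, Y, force_same_dim=0, pad_value=0.0):
    # Row-wise dot product; with force_same_dim = 0, the narrower
    # input is padded with pad_value up to the wider one's width.
    if X.shape[1] != Y.shape[1]:
        assert not force_same_dim, "inputs must have equal dimension"
        width = max(X.shape[1], Y.shape[1])
        X = np.pad(X, ((0, 0), (0, width - X.shape[1])),
                   'constant', constant_values=pad_value)
        Y = np.pad(Y, ((0, 0), (0, width - Y.shape[1])),
                   'constant', constant_values=pad_value)
    return np.sum(X * Y, axis=1)
```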
Summary:
(Caffe2) Modified RecurrentNetworkGradient operator so that training is possible with any of the output blob(s) receiving gradient during the backward pass. This is realized through a new argument for the RecurrentNetwork op, outputs_with_grads, which takes a list of the indices of the output blobs which will receive gradient. The default case (only receiving gradient from the first output blob) remains the default.
New unit test covers the case where outputs_with_grads = [1, 2] using Python LSTM wrapper.
Reviewed By: urikz
Differential Revision: D4518516
fbshipit-source-id: 5c531582b20f3cf727d1aa91239b4d5a2b8a7c1f
Summary:
The existing op transforms the input in a general way: it needs M transform mappings to transform an NxM input tensor.
But for binary predictions X (an Nx2 tensor), we know that X[:, 0] = 1 - X[:, 1].
So we just need one mapping for X[:, 1]; after it is transformed, we can compute X[:, 0].
This diff handles that case.
Differential Revision: D4550441
fbshipit-source-id: 42d8c6e88d830c97628ee930b543740a32acf904
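The trick in NumPy, with a hypothetical mapping callable standing in for the learned transform:
```
import numpy as np

def transform_binary_ref(X, mapping):
    # X is Nx2 with X[:, 0] == 1 - X[:, 1]: transform only the
    # positive column and recover the negative one as its complement.
    pos = mapping(X[:, 1])
    return np.column_stack([1.0 - pos, pos])
```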
Summary:
1. The existing Gather op outputs gradients in sparse format. We add GatherDense that does the same thing
as Gather but outputs gradients in dense format. This relies on the SparseToDenseOp.
2. SparseToDenseOp converts sparse representation (indices, values) into a dense format (missing values are
filled with zeros). There is an existing SparseToDenseMaskOp, but it is mainly for converting sparse features
into dense format; modifying it for our purpose would be too complicated and messy, so it is better to create a new one.
Reviewed By: dzhulgakov
Differential Revision: D4508879
fbshipit-source-id: f4a50efa1c08586d94040f93195661c41cd414da
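For intuition, Gather's forward pass and its dense gradient in NumPy (the scatter-with-accumulation is exactly the SparseToDense step described above):
```
import numpy as np

def gather_ref(data, indices):
    return data[indices]

def gather_dense_grad_ref(indices, grad_out, first_dim):
    # Dense gradient: scatter the output-gradient rows back into a
    # zero tensor of the data's shape; duplicate indices accumulate.
    grad = np.zeros((first_dim,) + grad_out.shape[1:], dtype=grad_out.dtype)
    np.add.at(grad, indices, grad_out)
    return grad
```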
Summary:
I had forgotten to remove this one. The rest of the switch to indexing
instead of string names is coming after D4446813 lands, as scratches
aren't inputs or outputs and thus can't be indexed.
Reviewed By: urikz
Differential Revision: D4465748
fbshipit-source-id: 2ccbedfb35541ef4a2231d1480eef59025bd5290
Summary: Looks like we don't do a good job with initial recurrent input gradients yet. Here is a partial fix; the gradient check still doesn't pass, but the shape is correct now.
Reviewed By: salexspb
Differential Revision: D4475447
fbshipit-source-id: 280f1f59f19e487fd0dce0d440609c50ddce294a
Summary: This diff uses stack workspaces in RecurrentNetwork, which allows us to simplify the implementation and get rid of scratches.
Reviewed By: salexspb
Differential Revision: D4446813
fbshipit-source-id: 514eec7e4300bdf492a9cb192b40cf4f89acf656
Summary:
It's broken because it relies on add_sparse_bias, and it's not easy to
use add_sparse_bias after the switch to loader_param.
DPA would like to try it out :)
Differential Revision: D4447275
fbshipit-source-id: 631cb4995f35383070e44387dc86692ba64b91eb
Summary: Remove usage of recurrent_sizes, so recurrent states' sizes can depend on input (in case of attention matrix for beam decoder). I removed recurrent_sizes from forward and backward steps.
Reviewed By: salexspb
Differential Revision: D4427688
fbshipit-source-id: 580420a294d309c86ec5cb4e677058623b7228e1
Summary:
In this diff I stop passing parameters by name and also remove hardcoded output ids, which were there specifically for LSTM to work. It also allows us to avoid using recurrent_sizes in the backward pass (for the forward pass this was done in D4427688).
Using a similar technique, it should be simple enough to eliminate blob name passing entirely. Then we can fix scoping. These can be done in a follow-up diff.
Reviewed By: urikz
Differential Revision: D4444614
fbshipit-source-id: 3580a76365502b9f2f09e3d8b7e78084ca739f00
Summary:
A new operator is added for model calibration. Given a piecewise linear function and raw predictions as input, it generates the mapped predictions as output.
Details can be found in the operator doc.
Differential Revision: D4418640
fbshipit-source-id: f8ff3ea786b0fe233a4ddcb709e5dbf0861ca484
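A NumPy sketch of a piecewise linear mapping of the kind described (the parameter layout of bounds, slopes, and intercepts is my assumption, not the op's actual schema):
```
import numpy as np

def piecewise_linear_ref(raw, bounds, slopes, intercepts):
    # Within segment i (bounds[i] <= x < bounds[i+1]):
    #   y = slopes[i] * x + intercepts[i]
    # Inputs are clipped to the overall bounds first.
    bounds = np.asarray(bounds)
    slopes = np.asarray(slopes)
    intercepts = np.asarray(intercepts)
    x = np.clip(raw, bounds[0], bounds[-1])
    seg = np.clip(np.searchsorted(bounds, x, side='right') - 1,
                  0, len(slopes) - 1)
    return slopes[seg] * x + intercepts[seg]
```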