* Exported AtomicIterOp count
* Add axis to top_k_op. (#2416)
* Revert update on top_k_op
* Add axis to top_k_op
* [auto] Update onnx to a8e4648 - Adjust link flags when built in Windows Debug mode (#647)
a8e4648a7d
* [auto] Update onnx to f4acf28 - Remove allowconsumed enforceconsumed from op schema. (#617)
f4acf281ef
* Initialize cpuinfo in the thread pool
The thread pool called cpuinfo_get_processors_count() without initializing cpuinfo. Only by luck did this not make Caffe2 single-threaded: the thread pool is initialized after NNPACK, and NNPACK initializes cpuinfo itself.
This commit also updates cpuinfo to a version that aborts with a fatal error if it's used uninitialized.
* Updated Python Op and Image Pre-Processing Pipeline tutorials && Added CIFAR-10 Part 1 tutorial (#2286)
* Updated Basics tutorial: (1) Added Python 3 support with __future__ statements; (2) Various grammatical/typo fixes and minor refactoring of Markdown
* Added Python 3 support and made minor typo fixes
* Added Python 3 support with future imports, refactored and corrected errors in Markdown, added comments
* Added Python 3 support with future imports, Added use of caffe_translator.py to translate downloaded .caffemodel file to .pb files
* Upgrades to Image Pre-Processing Pipeline tutorial
* Updated Python Op tutorial
* removed markdown with empty links
* Added Part 1 of an end-to-end CIFAR-10 tutorial
* Updated MNIST Dataset and Databases tutorial with python3 support and markdown fixes
* Tweaks to markup, fewer training iterations
* changed permissions of CIFAR10_Part1; typo corrections in Image_Pre-Processing_Pipeline
* Typo corrections in Multi-GPU Training tutorial
* sync Python_Op py_gen with the IPython notebook
* nit typo correction
* [auto] Update onnx to 5cb999d - Minor cleanups to shape inference (#653)
5cb999ddc1
* [auto] Update onnx to ecac1c1 - Merge Rel 1.1.0 branch into master (#657)
ecac1c1624
* Strip down onnx to only pb definitions in mobile build (#2426)
* Revert "Use -DCMAKE_BUILD_TYPE=Release for local build by default"
This reverts commit 035c62081f6420405b9f1380cc5d21b4c6ae78f6.
* Revert "Export number of iterations of AtomicIterOp (#2338)"
This reverts commit 91b7a0cb48c6b079e2ca8fd5c26819a003937d76.
Summary:
This reverts commit 30f614beea6f859fee25ce4f85573142885dde45
bypass-lint
An infra SEV is better than not reverting this diff.
If you copy this password, see you in SEV Review!
cause_a_sev_many_files
Differential Revision: D6893040
Original commit changeset: 30f614beea6f
fbshipit-source-id: 5e98a24699088283f864efe31234874bdacbe3c3
Summary: The old pow operator has been deleted from math_ops.cc, math_ops.cu, and math_ops.h, while the new operator supporting scalar and tensor exponents has been added in pow_op.cc, pow_op.h, and elementwise_op.cu.
Reviewed By: houseroad
Differential Revision: D6893040
fbshipit-source-id: 30f614beea6f859fee25ce4f85573142885dde45
Summary: Weighted sampling reader dequeue randomly chooses a hive reader to read a mini-batch. This diff allows dequeue to output the index of the randomly chosen table to a specific blob.
Reviewed By: kennyhorror
Differential Revision: D6621070
fbshipit-source-id: 754b981fc2bcfdb0146d2a0a5b677e7cfe74211b
Summary:
This should translate to a 1% error margin. The gradient checker uses a .5% threshold.
Closes https://github.com/caffe2/caffe2/pull/1766
Differential Revision: D6774077
Pulled By: pietern
fbshipit-source-id: f97c7ffb2ef34fdd71d69320a7fdcf4a6a457715
Summary: Before this diff, RNNOp was using TextFormat for representing steps. This diff changes RNNOp to prefer a NetDef argument instead. To be backward compatible it still supports TextFormat for existing models, though we can now compile RNNs without TextFormat as well.
Reviewed By: salexspb
Differential Revision: D5949330
fbshipit-source-id: 9336a8f5ccf30ad8d8e3a7067b9437e1704b1c9f
Summary: This is the continuation of T20872698: implement the gradient operator for element-wise Logit.
Reviewed By: asaadaldien
Differential Revision: D5969487
fbshipit-source-id: c9bb4222529f9fd9085aa9048b90eb70a63f41f4
Summary: Implemented the logit gradient with eps as an argument. Added a unit test for it and explored the optimal parameters to run the test.
Reviewed By: asaadaldien
Differential Revision: D5910655
fbshipit-source-id: 44898b784a57c7ad45519b202b1eaf95c1c4d460
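For reference, the eps-clamped logit and its gradient described here can be sketched in NumPy (a rough illustration with hypothetical names, not the Caffe2 operator):

```python
import numpy as np

def logit(x, eps=1e-6):
    # Clamp the input away from 0 and 1 so the log stays finite.
    x = np.clip(x, eps, 1.0 - eps)
    return np.log(x / (1.0 - x))

def logit_grad(x, g, eps=1e-6):
    # d/dx log(x / (1 - x)) = 1 / (x * (1 - x)); the clamp mirrors the forward pass.
    x = np.clip(x, eps, 1.0 - eps)
    return g / (x * (1.0 - x))
```

Near 0 or 1 the clamp bounds the gradient at roughly 1/eps, which is what makes a numerical gradient check tractable.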
Summary: Make a CUDA version of SparseToDense and register EnsureDense (which is trivial) on CUDA. We need to use atomics because indices can be duplicated. We can later add an option to indicate that the indices are unique, and use a faster path in that case.
Reviewed By: jhcross
Differential Revision: D5750893
fbshipit-source-id: 005d1675b127a571aac8474fca62d9633f0c7bff
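The duplicate-index accumulation that forces the CUDA kernel to use atomics can be sketched serially with NumPy's unbuffered `np.add.at` (an illustration under assumed semantics, not the actual operator):

```python
import numpy as np

def sparse_to_dense(indices, values, first_dim):
    # Duplicate indices must accumulate, which is why the CUDA kernel
    # needs atomics; np.add.at is the serial equivalent (plain fancy-index
    # assignment would silently drop the repeated writes).
    out = np.zeros((first_dim,) + values.shape[1:], dtype=values.dtype)
    np.add.at(out, indices, values)
    return out
```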
Summary:
Moved the distance op tests from hypothesis_test to distance_op_test and
refactored them.
Reviewed By: akyrola, asaadaldien
Differential Revision: D5495104
fbshipit-source-id: 4a90c75eabeb380ae9d150d6258e9b5b0fbfc5ca
Summary: As title. This helps with (quite common) cases where data input is stuck for one reason or another, and the net execution never proceeds and is stuck forever.
Reviewed By: andrewwdye
Differential Revision: D5409885
fbshipit-source-id: 840261fd5964408f788fc0f50ece0d74193694ac
Summary:
/cc akyrola is it possible this test has been broken ever since 5614816fce?
More generally, why do we still have `hypothesis_test.py` at all? In the case of this test, surely one of these files does more than this one old test:
* `operator_test/cudnn_recurrent_test.py`
* `operator_test/recurrent_network_test.py`
* `operator_test/rnn_cell_test.py`
Closes https://github.com/caffe2/caffe2/pull/843
Differential Revision: D5292109
Pulled By: akyrola
fbshipit-source-id: 6df5df6353a9741d1ae1b796adaab98382857527
Summary:
Working towards https://github.com/caffe2/caffe2/pull/817.
`E InvalidArgument: Insufficient bytes of entropy to draw requested array. shape=(4, 2, 5, 1, 3, 5, 5, 1), dtype=float32. Can you reduce the size or dimensions of the array? What about using a smaller dtype? If slow test runs and minimisation are acceptable, you could increase settings().buffer_size from 8192 to at least 24576000.`
https://travis-ci.org/caffe2/caffe2/jobs/243867951
Closes https://github.com/caffe2/caffe2/pull/828
Differential Revision: D5276723
Pulled By: akyrola
fbshipit-source-id: f7d0e2dd8ef8b6a2354bd4ff7c7446c377c954b4
Summary: Upgrades this file to use brew instead of CNNHelperModel
Reviewed By: harouwu
Differential Revision: D5252089
fbshipit-source-id: 6df4350717c1d42bc4bcc63d255cd422f085ee05
Summary:
```
File "/data/caffe2/install/caffe2/python/hypothesis_test.py", line 1911, in test_batch_to_space
(w + 2 * pad) / block_size).astype(np.float32)
File "mtrand.pyx", line 1404, in mtrand.RandomState.randn (numpy/random/mtrand/mtrand.c:19843)
File "mtrand.pyx", line 1534, in mtrand.RandomState.standard_normal (numpy/random/mtrand/mtrand.c:20368)
File "mtrand.pyx", line 167, in mtrand.cont0_array (numpy/random/mtrand/mtrand.c:6127)
TypeError: 'float' object cannot be interpreted as an index
```
```
File "/data/caffe2/install/caffe2/python/operator_test/tile_op_test.py", line 101, in tile_ref
tiled_data = np.tile(X, tuple(dims))
File "/data/caffe2/venv/local/lib/python2.7/site-packages/numpy/lib/shape_base.py", line 881, in tile
return c.reshape(shape_out)
TypeError: only integer scalar arrays can be converted to a scalar index
```
I also tested to make sure this still works with 0.11.
Closes https://github.com/caffe2/caffe2/pull/787
Differential Revision: D5248087
Pulled By: salexspb
fbshipit-source-id: eff69482a8eabb8ace330003fa326c832b53865f
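The underlying cause in both tracebacks is that Python's `/` can yield a float where newer NumPy requires an integral shape; floor division is the fix (a minimal sketch with made-up sizes):

```python
import numpy as np

w, pad, block_size = 4, 2, 2
# Old code: np.random.randn(..., (w + 2 * pad) / block_size) fails on newer
# NumPy because '/' yields a float; floor division keeps the shape integral.
n = (w + 2 * pad) // block_size
x = np.random.randn(3, n).astype(np.float32)
```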
Summary: Use `CopyItems` so that it accepts any type of tensor. Also, move the cursor to input blob so that it's checkpoint friendly. Output is now also part of input so that inference can work correctly.
Reviewed By: xianjiec
Differential Revision: D4920987
fbshipit-source-id: da532736225ec27f409ff763ff69a0629235151c
Summary:
Implement NormalizeOp for GPU using CUDA, and rewrite the gradient to be a function of the output
so it is more efficient, especially for the CUDA implementation.
Reviewed By: akyrola
Differential Revision: D4971300
fbshipit-source-id: e0ab66462000988aaf1f26010ea550533d107167
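The output-based gradient mentioned here follows from y = x / ||x||, which gives dL/dx = (g - y (y . g)) / ||x||. A NumPy sketch of that identity (hypothetical names, not the Caffe2 kernel):

```python
import numpy as np

def normalize(x):
    return x / np.linalg.norm(x)

def normalize_grad(x, g):
    # For y = x / ||x||, dL/dx = (g - y * (y . g)) / ||x||, so the backward
    # pass can reuse the already-computed output y instead of recomputing
    # the normalization from x.
    n = np.linalg.norm(x)
    y = x / n
    return (g - y * np.dot(y, g)) / n
```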
Summary: Both SquaredL2Distance and SquaredL2DistanceGradient had bad CUDA implementations. Use proper reductions and batched kernels.
Reviewed By: asaadaldien
Differential Revision: D4968527
fbshipit-source-id: f7cf82072d38bc127c757c5751863a9439aca8b5
Summary:
Similar to SafeDequeueBlobsOp, but adds weight-based sampling for reading from multiple input BlobsQueues.
WeightedSampleDequeueBlobsOp takes a vector of weights (each weight is mapped to one input blob queue).
Based on probability, we choose which BlobsQueue to fetch from.
WeightedSampleDequeueBlobsOp shall stop when any of the input BlobsQueues is empty.
Reviewed By: dzhulgakov
Differential Revision: D4905160
fbshipit-source-id: 5b1551e2250569f933a6c01ed04442843c5e0cb6
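The weight-based queue choice can be sketched as follows (a rough illustration; `weighted_pick` is a hypothetical name, not the operator's API):

```python
import numpy as np

def weighted_pick(weights, rng=np.random):
    # Normalize the weights into a probability distribution and sample one
    # queue index, mirroring how the dequeue chooses which BlobsQueue to
    # read the next mini-batch from.
    p = np.asarray(weights, dtype=np.float64)
    p = p / p.sum()
    return rng.choice(len(p), p=p)
```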
Summary:
add necessary ops for feature processing
* logit op
* replace nan
* batch one hot op
Reviewed By: kittipatv
Differential Revision: D4840869
fbshipit-source-id: 197123ea5608d54f0b5ac7899973a077a6a86775
Summary:
Quite large diff to make cuDNN LSTM and our LSTM produce same results and provide python API for the cuDNN LSTM.
* Added operators RecurrentParamGet and RecurrentParamSet to access weights and biases for the different gates, input/recurrent.
* Removed RecurrentInit as not needed
* recurrent.cudnn_LSTM() returns a special net and mapping that can be used to retrieve the parameters from the LSTM
* recurrent.cudnn_LSTM() can be passed blobs that have the parameters for the individual gate weights and biases
* recurrent.InitFromLSTMParams() can be used to initialize our own LSTM from cuDNN params. This way we can test whether cuDNN and our own implementation produce the same result.
recurrent_test.py tests for the equivalency
Reviewed By: salexspb
Differential Revision: D4654988
fbshipit-source-id: 6c1547d873cadcf33e03b0e0110248f0a7ab8cb0
Summary:
Actually adds stuff on duplicated indices. I didn't use UnorderedSegmentSum because it'd need more modifications for figuring out the first dimension, and I don't want to make that function more complex than it already is :)
We theoretically can have a version that does CopyItems and fails on duplicate indices as a fallback. But I haven't implemented it yet as it wouldn't be that useful for now.
Also fixes hypothesis test - doing rand() inside the body is not cool as it makes hypothesis run forever
Differential Revision: D4814574
fbshipit-source-id: 1851ec5f5df8fc4bf4844585076b8af23a06b0b2
Summary:
Uses the cudnnTransformTensor function. It works by shuffling the strides according to the transpose axes. Significant speedup over the current GPU version.
+ moves the transpose test under utility_ops, because hypothesis_test is too big
Reviewed By: jamesr66a
Differential Revision: D4810993
fbshipit-source-id: 82577c4ced1389e70bd5992820ae4d8297a3817f
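The stride-shuffling idea can be illustrated in NumPy: permuting the shape and strides reinterprets the same buffer in transposed order, and a copy then materializes it (a sketch of the concept, not the cuDNN call):

```python
import numpy as np

def transpose_via_strides(x, axes):
    # Permute shape and strides according to the transpose axes; the view
    # walks the original buffer in the new order without moving data, and
    # the final copy materializes the transposed layout.
    shape = tuple(x.shape[a] for a in axes)
    strides = tuple(x.strides[a] for a in axes)
    view = np.lib.stride_tricks.as_strided(x, shape=shape, strides=strides)
    return view.copy()
```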
Summary:
All of these tests fail with some variant of `Cannot create operator of type 'X' on the device 'CUDA'` (see commit messages).
Closes https://github.com/caffe2/caffe2/pull/227
Differential Revision: D4797060
Pulled By: Yangqing
fbshipit-source-id: 5feaa8e949098bfc1254d4c7449a2744e552f925
Summary: PadImage has no kernel parameters, which resulted in the pads_ parameters not being set (left at 0). I added a test case too.
Differential Revision: D4785230
fbshipit-source-id: fd475e7c41208e07fa7a363def9a45c6f82cddfe
Summary: Creates SparseMomentumSGDUpdate, a sparse version of MomentumSGDUpdate, to make that optimization method (via in-place updating operator) compatible with GradientSlices.
Differential Revision: D4784973
fbshipit-source-id: e6330f471a4d5f53589a6ac245e38f256ca7f354
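A plain (non-Nesterov) sketch of a sparse momentum update over gradient slices (hypothetical names and assumed semantics, for illustration only):

```python
import numpy as np

def sparse_momentum_sgd_update(grad, indices, momentum_buf, param, lr, momentum=0.9):
    # Only the rows named by `indices` are touched, which is what makes the
    # in-place update compatible with GradientSlices.
    for g, i in zip(grad, indices):
        adjusted = lr * g + momentum * momentum_buf[i]
        momentum_buf[i] = adjusted
        param[i] -= adjusted
    return param
```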
Summary: AccumulateHistogramOp, for computing the histogram of all values in input tensors
Differential Revision: D4654417
fbshipit-source-id: dea92346004c772af16e1eb41306287d81dc5a02
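One plausible reading of the accumulation, assuming two extra under/overflow buckets at either end of the range (hypothetical name and semantics, for illustration only, not the Caffe2 op's exact contract):

```python
import numpy as np

def accumulate_histogram(tensors, lower, upper, num_buckets, counts=None):
    # Values below `lower` or at/above `upper` fall into overflow buckets at
    # either end, so the running counts have num_buckets + 2 entries and can
    # be fed back in to accumulate across calls.
    if counts is None:
        counts = np.zeros(num_buckets + 2, dtype=np.int64)
    width = (upper - lower) / num_buckets
    for t in tensors:
        for v in np.ravel(t):
            if v < lower:
                counts[0] += 1
            elif v >= upper:
                counts[-1] += 1
            else:
                counts[1 + int((v - lower) // width)] += 1
    return counts
```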