Commit Graph

1158 Commits

Author SHA1 Message Date
Wojciech Glogowski
a1992e81b3 Replaced std::copysign(x) with (x > 0 ? 1 : -1)
Summary:
Replaced std::copysign(x) with (x > 0 ? 1 : -1).
std::copysign is not available on some Android platforms, which was detected in the Travis CI tests on GitHub:
"/home/travis/build/caffe2/caffe2/caffe2/sgd/yellowfin_op.cc:57:23: error: 'copysign' is not a member of 'std'"
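For illustration, a minimal Python sketch (the actual change is in C++) of how the replacement compares to copysign. One subtlety worth noting: for nonzero x the two agree on the sign, but at exactly zero the ternary maps to -1 while copysign follows the sign bit:

```python
import math

def sign_expr(x):
    # Mirrors the C++ replacement (x > 0 ? 1 : -1)
    return 1.0 if x > 0 else -1.0

# For nonzero x the two agree:
for x in (2.5, -3.0, 1e-9, -1e-9):
    assert sign_expr(x) == math.copysign(1.0, x)

# At exactly zero they can differ: copysign follows the sign bit,
# while the ternary maps 0 to -1.
assert sign_expr(0.0) == -1.0
assert math.copysign(1.0, 0.0) == 1.0
```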

Reviewed By: akyrola

Differential Revision: D5756384

fbshipit-source-id: 56bc220d2c6216ff45b9cc47ed02aebf6ad439a5
2017-09-01 11:52:44 -07:00
Wojciech Glogowski
925cfc0d90 Disabling test for YellowFin
Summary: Disabling a test for YellowFin that does not pass in Travis. The difference comes from numerical reasons; the test passes on my CPU / math libraries. Decide whether to merge it.

Reviewed By: Yangqing

Differential Revision: D5754144

fbshipit-source-id: b6ed6628f962d6904a8d522f0cf4080d7878acad
2017-09-01 10:35:48 -07:00
Aapo Kyrola
bb08f261f1 EnsureDense/SparseToDense for CUDA
Summary: Make a CUDA version of SparseToDense and register EnsureDense (which is trivial) on CUDA. We need to use atomics because indices can be duplicated. We can later add an option to indicate that the indices are unique, and use a faster path in that case.
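Why atomics matter here can be sketched in numpy (names and shapes are illustrative, not the Caffe2 code): a scatter-add must accumulate duplicated indices, which is what atomicAdd guarantees on the GPU, whereas a naive buffered write silently drops contributions:

```python
import numpy as np

indices = np.array([0, 2, 2])            # note the duplicate index 2
values = np.array([[1., 1.], [2., 2.], [3., 3.]])
out = np.zeros((4, 2))

# Unbuffered scatter-add: duplicates accumulate, matching what
# atomicAdd provides on the GPU.
np.add.at(out, indices, values)
assert out[2].tolist() == [5.0, 5.0]

# A naive buffered assignment loses one of the duplicate contributions:
naive = np.zeros((4, 2))
naive[indices] += values                 # last write to index 2 wins
assert naive[2].tolist() == [3.0, 3.0]
```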

Reviewed By: jhcross

Differential Revision: D5750893

fbshipit-source-id: 005d1675b127a571aac8474fca62d9633f0c7bff
2017-09-01 09:33:05 -07:00
James Cross
53ccbd9a6e soft-coverage attention
Summary:
Implementation of a new variant of attention module, which contains a recurrent decoder state with vectors corresponding to each source-side word and strictly increasing values, thus enabling it to model the degree to which source words have been translated.

The approach is a variant of the approaches described in https://arxiv.org/pdf/1601.04811.pdf. We simply include the sum of all previous attention weights for encoder words as a new recurrent state (coverage_t). A new linear transform on encoder_outputs is used to produce coverage_weights, which has the same dimensionality as encoder_outputs and implicitly models the fertility of source-side words (putting this extra informational strain on the encoder network).

Thus the encoder output, the decoder state, and the coverage weights have the same dimensionality for a given source word, and attention logits are calculated as v *  tanh(coverage * coverage_weights + encoder_output + decoder_state).

Note: the entire coverage state for each translation instance is of shape (encoder_length, coverage_units), but the states for the RecurrentNetwork operator, used to train the decoder, must be flat in the data dimension. This state is therefore initialized with shape (encoder_length * coverage_units) [not shown in the open-source library] and reshaped appropriately within the apply_soft_coverage_attention() function.
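The logit computation described above can be sketched in numpy. This is an illustrative sketch, not the Caffe2 implementation: the shapes, variable names, and the softmax normalization are assumptions made for the example:

```python
import numpy as np

rng = np.random.default_rng(0)
src_len, d = 5, 8                        # source length, hidden size

encoder_output = rng.standard_normal((src_len, d))
decoder_state = rng.standard_normal(d)
v = rng.standard_normal(d)

# coverage: running sum of past attention weights, one value per source word
coverage = np.zeros(src_len)
# coverage_weights: per-word linear transform of encoder_outputs (same dim d)
coverage_weights = rng.standard_normal((src_len, d))

for _ in range(3):                       # a few decoder steps
    # logits_i = v . tanh(coverage_i * coverage_weights_i + enc_i + dec_state)
    pre = np.tanh(coverage[:, None] * coverage_weights
                  + encoder_output + decoder_state)
    logits = pre @ v
    attn = np.exp(logits - logits.max())
    attn /= attn.sum()
    coverage += attn                     # strictly increasing coverage state

# Coverage grows by exactly 1 (the attention mass) per step.
assert np.isclose(coverage.sum(), 3.0)
```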

Differential Revision: D5593617

fbshipit-source-id: 7d0522b5eb0b26f22e8429e4461a459f2f16ed46
2017-08-31 21:21:54 -07:00
Bram Wasti
50b5c76ea9 A benchmark generator for individual ops
Summary: A basic little op benchmark generator -- outputs init_net.pb and predict_net.pb for use with speed_benchmark or mobile_speed_benchmark.

Reviewed By: Maratyszcza

Differential Revision: D5728534

fbshipit-source-id: 3e912fa63548497ca65ab34c8bb967694c46815b
2017-08-31 17:33:21 -07:00
Andrey Malevich
03711e9ab8 Handle bool's correctly in net.Const
Summary: As desc.

Reviewed By: volkhin

Differential Revision: D5745310

fbshipit-source-id: 66c3da37a42cf98bae05cead58f3f694eae19e0d
2017-08-31 12:02:58 -07:00
Jerry Zhang
debceaff02 Support new arguments in ConvTranspose
Summary: Adding support to use kernels, strides, pads etc. as arguments.

Reviewed By: houseroad

Differential Revision: D5710699

fbshipit-source-id: 8b63af4c4a76cd06b637a376aeb29a34c659be2e
2017-08-31 11:17:32 -07:00
Alisson Gusatti Azzolini
b4b89e1bd5 Ability to dequeue and concat multiple records in a single QueueDequeue op
Summary: This will allow data to be read in small batches and the batches to be concatenated later on.

Reviewed By: kennyhorror

Differential Revision: D5739129

fbshipit-source-id: 66a8087e5f9d10d654e367c6111ac90cbf54224e
2017-08-31 10:48:59 -07:00
Kittipat Virochsiri
4ec26d23a7 TensorInference function for LengthsSum and such
Summary: Adding missing tensor inference function

Reviewed By: kennyhorror

Differential Revision: D5735119

fbshipit-source-id: 1602b5aeec95f13a3c3c6d3e5417af2712a4dfbb
2017-08-31 09:32:48 -07:00
Wojciech Glogowski
fefd5479a3 Initial implementation of YellowFin algorithm
Summary:
Added YellowFin optimizer to Caffe2.
This implementation differs from the original: it has a separate alpha and mu for each parameter, and it uses a different version of momentum SGD.
Tests / benchmarks for the optimizer are still to be done, and some refactoring of the code is needed before pushing. This is still a working version.
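As a rough illustration of the "separate alpha and mu for each parameter" idea, here is a generic per-parameter momentum-SGD step in numpy. This is a hypothetical sketch: it omits YellowFin's defining feature, the automatic on-the-fly tuning of alpha and mu, and the names are invented for the example:

```python
import numpy as np

def momentum_sgd_step(w, g, v, alpha, mu):
    """One momentum-SGD step where alpha (learning rate) and mu
    (momentum) are per-parameter arrays rather than scalars."""
    v_new = mu * v + g            # momentum buffer update
    w_new = w - alpha * v_new     # parameter update
    return w_new, v_new

w = np.array([1.0, 2.0])
v = np.zeros(2)
g = np.array([0.5, -0.5])
alpha = np.array([0.1, 0.01])     # separate alpha per parameter
mu = np.array([0.9, 0.5])         # separate mu per parameter
w, v = momentum_sgd_step(w, g, v, alpha, mu)
```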

Reviewed By: akyrola

Differential Revision: D5652689

fbshipit-source-id: c10dc0424f47c3051b454aede1d121902cb759a8
2017-08-30 18:53:46 -07:00
Wojciech Glogowski
5ed5be71b1 YellowFin GPU class and Python optimizer
Summary: YellowFin GPU in .cu file, Python operator in optimizer.py

Reviewed By: asaadaldien, akyrola

Differential Revision: D5727450

fbshipit-source-id: 42a878e5fd35e288e0e6eeaa0bf980a9db96e5a7
2017-08-30 18:32:24 -07:00
Hassan Eslami
0f3a5d3180 Tuning number of parameter servers based on performance estimation job
Summary:
1) Adds monitoring of CPU utilization in trainers and PS's, and report the utilization to global statistics
2) Adds the plan execution time to global stats
3) Uses the CPU utilization and network utilization observed from the performance estimation job to calculate the optimal number of parameter servers needed for the actual job. The optimal number of parameter servers is the minimum number of servers needed such that the parameter servers are not the bottleneck in execution.

//Note: The calculation assumes that parameter shards are assigned to PS's in a uniform way and accesses to the shards follow a uniform access pattern. In reality, shards' access pattern may be skewed. As a next step, we should monitor shard access pattern in performance estimation job and distribute the shards in the optimal way.//
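The sizing rule in (3) can be sketched with hypothetical numbers, under the same uniform-access assumption the note states. The function name, inputs, and figures below are illustrative, not the actual job's API:

```python
import math

def min_parameter_servers(ps_cpu_util, ps_net_util, num_ps_measured):
    """Smallest PS count that keeps per-server utilization under 100%,
    assuming load spreads uniformly across shards (hypothetical model).
    Utilizations are fractions of one server's capacity (1.0 == 100%)."""
    total_load = max(ps_cpu_util, ps_net_util) * num_ps_measured
    return max(1, math.ceil(total_load))

# E.g. the estimation job ran with 1 PS at 320% CPU-equivalent load,
# so at least 4 servers are needed for CPU not to be the bottleneck:
assert min_parameter_servers(3.2, 1.5, 1) == 4
```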

Reviewed By: sf-wind

Differential Revision: D5674398

fbshipit-source-id: 67a07cb9ed4e4d61ff5e81a0ecfe519b8feb2352
2017-08-30 18:03:59 -07:00
Jiyan Yang
33ef5f38a0 Fixed cuda loss op
Summary:
Currently the loss ops are still not on the GPU even though the ALL strategy is selected.
This diff enables that.

Reviewed By: xianjiec

Differential Revision: D5671255

fbshipit-source-id: 033863f171e1f89c8d75430d3af6a1e6d0d2eff2
2017-08-30 17:02:23 -07:00
Misha Smelyanskiy
080fab8f6c Code generator for high-performance embedding look-up kernels
Summary:
Code generator for high-performance embedding look-up kernels, supporting
Sum, WeightedSum, and Mean reducers.
Achieves at least 1.5x speedup on float and over 2x speedup on float16, compared to the existing code.
These are results on Broadwell, using the sparse_lengths_sum_benchmark.par benchmark:

Old
==============
[root@fblearner001.01.ftw1 /home/msmelyan]# numactl -m 0 -C 0 ./sparse_lengths_sum_benchmark.par  --iteration 10000
Preparing lookup table. 2017-08-08 00:10:23.101848
Preparation finished. 2017-08-08 00:10:27.955680
I0808 00:10:27.955732 30700 net.cc:177] Starting benchmark.
I0808 00:10:27.955759 30700 net.cc:178] Running warmup runs.
I0808 00:10:27.956367 30700 net.cc:188] Main runs.
I0808 00:10:31.839035 30700 net.cc:199] Main run finished. Milliseconds per iter: 0.388264. Iters per second: 2575.56
I0808 00:10:35.704169 30700 net.cc:233] Operator #0 (indices, Python) 0.0583264 ms/iter
I0808 00:10:35.704210 30700 net.cc:233] Operator #1 (Y, SparseLengthsSum) 0.327694 ms/iter
I0808 00:10:35.704213 30700 net.cc:237] Time per operator type:
I0808 00:10:35.704217 30700 net.cc:246]        0.327694 SparseLengthsSum
I0808 00:10:35.704221 30700 net.cc:246]       0.0583264 Python
[root@fblearner001.01.ftw1 /home/msmelyan]# numactl -m 0 -C 0 ./sparse_lengths_sum_benchmark.par  --iteration 10000 --dtype float16
Preparing lookup table. 2017-08-08 00:10:59.047159
Preparation finished. 2017-08-08 00:11:05.140565
I0808 00:11:05.140612 31725 net.cc:177] Starting benchmark.
I0808 00:11:05.140635 31725 net.cc:178] Running warmup runs.
I0808 00:11:05.141104 31725 net.cc:188] Main runs.
I0808 00:11:08.371510 31725 net.cc:199] Main run finished. Milliseconds per iter: 0.323039. Iters per second: 3095.6
I0808 00:11:11.671450 31725 net.cc:233] Operator #0 (indices, Python) 0.0609876 ms/iter
I0808 00:11:11.671489 31725 net.cc:233] Operator #1 (Y, SparseLengthsSum) 0.26856 ms/iter
I0808 00:11:11.671494 31725 net.cc:237] Time per operator type:
I0808 00:11:11.671497 31725 net.cc:246]         0.26856 SparseLengthsSum
I0808 00:11:11.671500 31725 net.cc:246]       0.0609876 Python

New (Misha's)
==============
[root@fblearner001.01.ftw1 /home/msmelyan]# numactl -m 0 -C 0 ./sparse_lengths_sum_benchmark.par  --iteration 10000
Preparing lookup table. 2017-08-07 23:44:55.897748
Preparation finished. 2017-08-07 23:45:00.708896
I0807 23:45:00.708945 4178361 net.cc:177] Starting benchmark.
I0807 23:45:00.708971 4178361 net.cc:178] Running warmup runs.
I0807 23:45:00.709444 4178361 net.cc:188] Main runs.
I0807 23:45:03.608551 4178361 net.cc:199] Main run finished. Milliseconds per iter: 0.289909. Iters per second: 3449.36
I0807 23:45:06.536182 4178361 net.cc:233] Operator #0 (indices, Python) 0.0572399 ms/iter
I0807 23:45:06.536224 4178361 net.cc:233] Operator #1 (Y, SparseLengthsSum) 0.23512 ms/iter
I0807 23:45:06.536228 4178361 net.cc:237] Time per operator type:
I0807 23:45:06.536232 4178361 net.cc:246]         0.23512 SparseLengthsSum
I0807 23:45:06.536236 4178361 net.cc:246]       0.0572399 Python
[root@fblearner001.01.ftw1 /home/msmelyan]# numactl -m 0 -C 0 ./sparse_lengths_sum_benchmark.par  --iteration 10000 --dtype float16
Preparing lookup table. 2017-08-07 23:45:17.191579
Preparation finished. 2017-08-07 23:45:23.173668
I0807 23:45:23.173715 4179316 net.cc:177] Starting benchmark.
I0807 23:45:23.173743 4179316 net.cc:178] Running warmup runs.
I0807 23:45:23.174090 4179316 net.cc:188] Main runs.
I0807 23:45:24.939749 4179316 net.cc:199] Main run finished. Milliseconds per iter: 0.176564. Iters per second: 5663.67
I0807 23:45:26.698885 4179316 net.cc:233] Operator #0 (indices, Python) 0.0557303 ms/iter
I0807 23:45:26.698923 4179316 net.cc:233] Operator #1 (Y, SparseLengthsSum) 0.119794 ms/iter
I0807 23:45:26.698927 4179316 net.cc:237] Time per operator type:
I0807 23:45:26.698931 4179316 net.cc:246]        0.119794 SparseLengthsSum
I0807 23:45:26.698935 4179316 net.cc:246]       0.0557303 Python
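For context, the semantics of the SparseLengthsSum op being accelerated can be sketched in numpy. This is an illustrative reference implementation, not the generated kernel:

```python
import numpy as np

def sparse_lengths_sum(data, indices, lengths):
    """Gather data[indices] and sum within consecutive segments
    whose sizes are given by lengths (numpy sketch of the op)."""
    out = np.zeros((len(lengths), data.shape[1]), dtype=data.dtype)
    offset = 0
    for i, n in enumerate(lengths):
        out[i] = data[indices[offset:offset + n]].sum(axis=0)
        offset += n
    return out

table = np.arange(12, dtype=np.float64).reshape(6, 2)  # 6 embeddings, dim 2
y = sparse_lengths_sum(table, indices=np.array([0, 5, 1]),
                       lengths=np.array([2, 1]))
# Segment 1 sums rows 0 and 5; segment 2 is just row 1.
assert y.tolist() == [[10.0, 12.0], [2.0, 3.0]]
```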

Reviewed By: salexspb

Differential Revision: D5582172

fbshipit-source-id: d71f5a55580b734a51b8f30852b75f379acfdaf2
2017-08-30 16:22:11 -07:00
Wojciech Glogowski
a7ec5def7b data_parallel_model names fix
Summary: Updated usage of deprecated functions in data_parallel_model.py

Reviewed By: akyrola

Differential Revision: D5738512

fbshipit-source-id: a7767e518da777ece058bcad480e5df1d91e9b42
2017-08-30 12:47:14 -07:00
Junjie Bai
41adebe974 Clear the operator default engines before running operator tests
Reviewed By: akyrola

Differential Revision: D5729024

fbshipit-source-id: f2850d5cf53537b22298b39a07f64dfcc2753c75
2017-08-29 17:47:20 -07:00
Ahmed Taei
5315669bd8 Add ShapeInference for ConcatOp (Fixed)
Reviewed By: akyrola

Differential Revision: D5721442

fbshipit-source-id: 64ed35cb4c40f32a5cca29fe9cd04e18a340db4b
2017-08-29 12:18:03 -07:00
Aapo Kyrola
488abdcd6c slice op shape inference
Summary: As titled + test

Reviewed By: jamesr66a

Differential Revision: D5720637

fbshipit-source-id: eae76e587808139fcf06abc0f8345152979815ec
2017-08-29 11:05:24 -07:00
Ilia Cherniavskii
a0204331a8 Control flow operators
Summary:
This diff adds control flow operators in Caffe2 (starting with If, While):
 - Added If operator that executes then/else subnet
 - Branch subnet is executed in a separate isolated workspace, with some of the blobs transparently forwarded from the outer workspace
 - Adding a new NetBuilder subclass to construct nets using new operator
 - NetBuilder also keeps track of outer blob names and automatically sets blob bindings between outer and inner workspace, implementing generic convention on handling local/global variables in blocks

Reviewed By: volkhin

Differential Revision: D5720644

fbshipit-source-id: a674cde0c789f6a6ffdcd9d80159d1e42e49133f
2017-08-28 20:04:43 -07:00
Aapo Kyrola
7c7603a60e fix FC shape inference
Summary: FC shape inference was broken for non-default axis. Add test.

Reviewed By: asaadaldien

Differential Revision: D5720146

fbshipit-source-id: f36f9cc8477dc61c3b07eeea8ea0702562045c88
2017-08-28 16:08:07 -07:00
Artem Volkhin
d3c8e68004 Revert D5641588: [caffe2] Control flow operators
Summary:
This reverts commit f9e04429961c3da7da4ebca3e8163bfcc2a09ec9

bypass-lint

Differential Revision: D5641588

fbshipit-source-id: bb23b213d08e9c3ea509216fce9367625943d007
2017-08-26 00:07:58 -07:00
Yangqing Jia
9f693b39aa Revert D5711951: [caffe2] Add shape inference for ConcatOp
Summary:
This reverts commit 9173ef0f18af25326ec18e66f6ce29eecfa5ceea

bypass-lint

Differential Revision: D5711951

fbshipit-source-id: 9bbb872eafcbd3c470b782a5ddb2a1c894888101
2017-08-25 23:37:38 -07:00
Christopher Hay
cc3662e939 Added support for scaling learning rate of Caffe2 optimizers during training
Summary: While there is currently support for scaling the base learning rate when loading a model, there is no support for scaling the base learning rate during training. This is needed for LATTE's seq2seq translation models, as the learning schedule is not predefined and is modified at runtime.

Reviewed By: jhcross

Differential Revision: D5701391

fbshipit-source-id: ae3bec45f238db1a2be7af9c04d720067e9095d5
2017-08-25 19:04:47 -07:00
Ahmed Taei
da418f5744 Add shape inference for ConcatOp
Reviewed By: akyrola

Differential Revision: D5711951

fbshipit-source-id: 9173ef0f18af25326ec18e66f6ce29eecfa5ceea
2017-08-25 18:09:35 -07:00
Jerry Zhang
3c180ba317 Opensourcing channel shuffle
Summary: att

Reviewed By: Yangqing

Differential Revision: D5662540

fbshipit-source-id: 474d7d808841ff8f7ce97b55df836b9d2f4a7629
2017-08-25 16:46:31 -07:00
Aapo Kyrola
885d9a7796 fix memonger for RecurrentNetworks
Summary: When we ported memonger to C++ in D5544219, we forgot to include the special handling of RecurrentNetwork ops. This fixes that and adds a test.

Reviewed By: asaadaldien

Differential Revision: D5692407

fbshipit-source-id: 4e739b5dd6c7298303eee9bfa1aa4d19359eb7b5
2017-08-25 16:01:25 -07:00
Bor-Yiing Su
5bc52c3223 Adds the master setup plan to the model exporter.
Reviewed By: rayleichen

Differential Revision: D5697246

fbshipit-source-id: d1775e0de3b3080f398350f98659436b3dfbd7b8
2017-08-25 16:01:24 -07:00
Lei Chen
432cba6c05 Set up run_every_ms when constructing ExecutionStep
Summary: same as title.

Differential Revision: D5709274

fbshipit-source-id: f88b1325f3e6b948b836cc90f4d9c38a27be28ab
2017-08-25 15:58:29 -07:00
Alisson Gusatti Azzolini
ae0c4c8e66 Respect inplace blobs in InjectCrossDeviceCopies
Summary:
Before this diff, we were not respecting in-place blobs. E.g. if we had:

  with DeviceOption(CPU):
      blob = net.MyOpA([])
  with DeviceOption(CUDA):
      net.MyOpB([blob], [blob])

After the InjectCrossDevicesCopies we would have:

  blob = net.MyOpA([], device=CPU)
  blob_cuda0 = net.Copy([blob], [blob_cuda0], device=CUDA)
  net.MyOpB([blob_cuda0], [blob], device=CUDA)

After this diff, we keep the in-place blob.

Reviewed By: harouwu

Differential Revision: D5671867

fbshipit-source-id: 6ad68c612dae19d7e1f45f4988d929644100b4d5
2017-08-25 14:57:58 -07:00
Aapo Kyrola
cffbbfa9e3 Revert D5655753: [Caffe2] better straggler exit procedure
Summary:
This reverts commit ad0c998feeb03bcb0cf4e5127fb3cc7bb00dcedb

bypass-lint

Differential Revision: D5655753

fbshipit-source-id: 2f1d350286d2ee31e8045c9bd03ef1235f1a93ec
2017-08-25 14:23:09 -07:00
Ilia Cherniavskii
86cc7ace93 Control flow operators
Summary:
This diff adds control flow operators in Caffe2 (starting with If, While):
 - Added If operator that executes then/else subnet
 - Branch subnet is executed in a separate isolated workspace, with some of the
   blobs transparently forwarded from the outer workspace
 - Adding a new NetBuilder subclass to construct nets using new operator
 - NetBuilder also keeps track of outer blob names and automatically sets
   blob bindings between outer and inner workspace, implementing generic
   convention on handling local/global variables in blocks

Reviewed By: azzolini

Differential Revision: D5641588

fbshipit-source-id: f9e04429961c3da7da4ebca3e8163bfcc2a09ec9
2017-08-25 12:31:14 -07:00
Alexander Sidorov
7eba614503 RNNCell: Initializers interface, simplify _LSTM helper
Summary:
_LSTM helper is a legacy piece we had before all the RNNCell awesomeness landed. Now we need to pull it apart and create separate building blocks that people can use for any RNNs.

Please note the changes to a test with double scoping. That should go away once we change the RNNCell scoping logic such that each cell adds its own name to the scope for all of its outputs (see another diff: D5613139).

Reviewed By: jhcross

Differential Revision: D5632276

fbshipit-source-id: 1cb568ab995c4c0b3dd1b4bad2d028e34bded9c1
2017-08-25 12:01:24 -07:00
Aapo Kyrola
82360d8cba shape inference for ReduceFront/Back/Sum/Mean, Gather and Dropout
Summary: These were missing and required for some seq2seq models. Unit tested. The previous implementation of ReduceBackMean shape inference was incorrect, so it was removed.

Reviewed By: asaadaldien

Differential Revision: D5691262

fbshipit-source-id: 76f868b298440f988635966a410f0232301ca6c4
2017-08-25 11:31:17 -07:00
Alisson Gusatti Azzolini
5e0b28e7bd PrependDimOp
Summary:
Splits the first dimension of a tensor into two, the first of which is fixed and given as an argument.
This is used to split a batch into smaller batches and distribute them across workers.
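The described semantics can be sketched as a numpy reshape (an illustrative sketch, not the Caffe2 operator; the function name is invented here):

```python
import numpy as np

def prepend_dim(t, dim_size):
    """Split the first dimension into (dim_size, first_dim // dim_size),
    with dim_size fixed and given as the argument (numpy sketch)."""
    assert t.shape[0] % dim_size == 0, "first dim must be divisible"
    return t.reshape((dim_size, t.shape[0] // dim_size) + t.shape[1:])

batch = np.arange(24).reshape(6, 4)      # a batch of 6 rows
chunks = prepend_dim(batch, 3)           # 3 sub-batches of 2 rows each
assert chunks.shape == (3, 2, 4)
assert (chunks.reshape(6, 4) == batch).all()   # data is unchanged
```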

Reviewed By: harouwu

Differential Revision: D5702175

fbshipit-source-id: 02bb93e49bf9db411b516e149c8e647301dd2ca5
2017-08-24 18:52:05 -07:00
Jiyan Yang
20c854d43c Make FC op work with empty batch in cuda
Reviewed By: xianjiec

Differential Revision: D5673458

fbshipit-source-id: d1c950c94173843670ae1fae0e15ff61ca7d6761
2017-08-24 18:52:04 -07:00
Aapo Kyrola
4c9eff807b better straggler exit procedure
Differential Revision: D5655753

fbshipit-source-id: ad0c998feeb03bcb0cf4e5127fb3cc7bb00dcedb
2017-08-24 12:33:30 -07:00
Aapo Kyrola
23209152a9 fix memonger test for open source by checking for cuda support
Summary: This test was failing on non-GPU builds because it refers to operator CopyGPUToCPU. Thanks pietern for catching this.

Reviewed By: asaadaldien

Differential Revision: D5698763

fbshipit-source-id: 0bde0f3e99c58647dba2ea6da4d51938e763d10c
2017-08-24 12:02:38 -07:00
Jerry Zhang
7f4ceb83e3 Relax dimension constraints for weight matrix in FC
Summary: att

Reviewed By: Yangqing

Differential Revision: D5662265

fbshipit-source-id: 893ee2f92debab06117725beeca3199cba565f1e
2017-08-24 11:16:39 -07:00
Christopher Hay
ad07f5f05d Added norm-based gradient clipping to optimizer library
Summary: Moved code for global norm-based gradient clipping from fb specific workflows (seq2seq) to the open-source caffe2 optimizer library
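Global norm-based clipping as commonly formulated can be sketched in numpy; this is a generic sketch, and the actual Caffe2 optimizer API may differ:

```python
import numpy as np

def clip_by_global_norm(grads, max_norm):
    """Scale every gradient by max_norm / global_norm when the L2 norm
    over all tensors exceeds max_norm; otherwise leave them unchanged."""
    global_norm = np.sqrt(sum(float((g ** 2).sum()) for g in grads))
    if global_norm > max_norm:
        scale = max_norm / global_norm
        grads = [g * scale for g in grads]
    return grads, global_norm

grads = [np.array([3.0, 0.0]), np.array([0.0, 4.0])]   # global norm = 5
clipped, norm = clip_by_global_norm(grads, 1.0)
assert norm == 5.0
# After clipping, the global norm equals max_norm:
assert np.isclose(np.sqrt(sum((g ** 2).sum() for g in clipped)), 1.0)
```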

Reviewed By: jhcross

Differential Revision: D5637453

fbshipit-source-id: 7e73c9a1c97c28a152c188467b27a6449f79242e
2017-08-24 10:17:50 -07:00
Long Jin
3faeb621d3 support id_score_list for Feed
Reviewed By: xianjiec

Differential Revision: D5624894

fbshipit-source-id: 1b2caba9ffcce68f346020485cb1f4edb01ca5e7
2017-08-24 00:32:05 -07:00
Kittipat Virochsiri
d368b59177 logging the blob that has type error
Summary: Currently, it's not easy to track down which tensor is missing type and shape info. Print it out for easier debugging.

Reviewed By: volkhin, xianjiec

Differential Revision: D5695223

fbshipit-source-id: 7f0be0be777a35bb5a71b3799b29b91f0763c159
2017-08-23 21:21:27 -07:00
Devesh Agrawal
16549ed92b Scaled training and fetching from the PS
Summary:
Today, the PS's weirdly store the entire embedding and not just their
subsection of it. This was simply an oversight on the part of the original
author and this diff fixes that.

1. The sparse params are sharded to the PS's, and each PS stores just its section
of the embedding. The trainer requests the ids as-is from the PS, but the PS
divides the id by the num_of_shards before looking it up in the embedding table
blob. This happens on both the backward and the forward pass. However, during the
model download part, the PS multiplies the embeddings by the num_of_shards before
returning them to the trainer. The upshot is that the trainer does not know
anything about how the embeddings are scaled on the PS; the PS adds the extra
divide and multiply steps to achieve that.

2. During estimation time, we allocate just one PS. So, in order to make all of
the embeddings fit on the single PS, we simply additionally scale the hash table
sizes (proportionally and equally for all the sparse params) such that they fit.
This scaling is handled analogously to (1).
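The divide step described in (1) suggests a round-robin sharding of ids; a hypothetical sketch follows. The modulo-for-shard choice is an assumption made for the example and is not stated in the summary:

```python
def to_local_row(global_id, num_shards):
    """Hypothetical id mapping implied by the summary: the PS owning a
    row divides the global id by num_shards to index its local slice of
    the embedding table (round-robin assignment of ids to shards)."""
    shard = global_id % num_shards       # which PS owns this id (assumed)
    local_row = global_id // num_shards  # row within that PS's local table
    return shard, local_row

# With 4 shards, ids 0..7 map to local rows 0..1 spread across shards:
assert to_local_row(6, 4) == (2, 1)
assert to_local_row(3, 4) == (3, 0)
```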

Reviewed By: boryiingsu

Differential Revision: D5664093

fbshipit-source-id: 92f501f61566f939c41ce0b614a1b499669f978a
2017-08-23 18:16:03 -07:00
Catherine Dong
1955d0797e Added fast path for CUDNN global max pooling
Summary:
This adds a fast path for global max pooling with NCHW. Compared to equivalent ReduceBackMean, this is about 3.5x faster.

Based on D5533059.

Reviewed By: akyrola

Differential Revision: D5681122

fbshipit-source-id: 7a4df934044c7dd01888f095f7dd46654aaf4eae
2017-08-23 16:33:06 -07:00
Jeonghee Yi
98da4e3a04 pairwise dot product with dot_groups support
Summary: Extending the pairwise dot product to be computed only within dot_groups

Differential Revision: D5527060

fbshipit-source-id: be5d3178c332e122853a2f9d8da12a880608b0ab
2017-08-23 15:23:36 -07:00
Jeonghee Yi
d675c101e9 extend pairwise dot product for non-equal x & y dimension size
Summary: Extend the pairwise dot product to support a different number of embeddings on the x & y dimensions

Differential Revision: D5663553

fbshipit-source-id: 1743a2c101cb8c0fc1f0f3d89c19530802400ec6
2017-08-23 02:08:20 -07:00
Ilia Cherniavskii
e33dfe93e4 Update proto definition
Summary: Update Argument's definition to allow direct passing of NetDef

Reviewed By: azzolini

Differential Revision: D5681837

fbshipit-source-id: e6c618bff051f9bbc56075c796aeba0094fa97dd
2017-08-22 19:01:18 -07:00
Kittipat Virochsiri
058815955d Add default implementation of __call__ for context manager
Summary: Making it more convenient to wrap code in a context

Reviewed By: boryiingsu

Differential Revision: D5680991

fbshipit-source-id: 07b7e4d5aa657184039a7d18192b68fe11c1a570
2017-08-22 17:46:22 -07:00
Badri Narayan Bhaskar
9507cae9e0 Create MergeIdListsLayer
Summary: We create a layer for MergeIdListsOp

Differential Revision: D5531348

fbshipit-source-id: a2e227e1abda05cefa893fd41a2c3ca997851e25
2017-08-22 17:00:55 -07:00
Alisson Gusatti Azzolini
930acc8e85 CUDA SparseLengthsWeightedSum
Summary: title.

Reviewed By: harouwu

Differential Revision: D5665776

fbshipit-source-id: a8ae1a71a9a21e68172662f38b5f799870b9dcd1
2017-08-22 15:42:02 -07:00
Junjie Bai
5748e7140f Strip Operator Schema in mobile build
Reviewed By: Yangqing

Differential Revision: D5677792

fbshipit-source-id: d29edb26a36b24a46821e13e2d77af0f21571fcd
2017-08-22 13:31:08 -07:00