Commit Graph

375 Commits

Kittipat Virochsiri
cea16ff7cd BatchSigmoidCrossEntropyLoss
Summary: To support the feed interest team

Reviewed By: kdub0

Differential Revision: D4719213

fbshipit-source-id: 8deb3544377fb06593399b101de66f3f845f93b5
2017-03-17 09:35:51 -07:00
James Cross
79c3a3af54 add gpu support for caffe2-seq2seq
Summary: Adds synchronous optimization on GPUs to the translation training pipeline via data_parallel_model.Parallelize_GPU, which still needs to be updated so there is some way of performing sparse parameter updates (e.g., on embedding tables), whether on GPU or CPU.

Reviewed By: urikz

Differential Revision: D4631914

fbshipit-source-id: 9cdd655f7dbda3f9b2733d459228b3e097892441
2017-03-17 05:19:14 -07:00
Jon Morton
1513b1de6b Add ResizeNearest operator
Summary: This adds a nearest neighbor interpolation resizing operator to caffe2. CPU only, NCHW only, no gradients. Also adds torch2caffe support. This is probably not optimal in terms of performance, but it works.
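
A hedged sketch of exercising the new operator from Python; the `width_scale`/`height_scale` argument names are assumptions based on similar resize ops and are not confirmed by this commit.
```
import numpy as np
from caffe2.python import core, workspace

# NCHW input, CPU only per this commit.
X = np.random.rand(1, 3, 8, 8).astype(np.float32)
workspace.FeedBlob("X", X)

# Argument names here are assumptions; check the operator schema.
op = core.CreateOperator(
    "ResizeNearest", ["X"], ["Y"],
    width_scale=2.0, height_scale=2.0,
)
workspace.RunOperatorOnce(op)
print(workspace.FetchBlob("Y").shape)  # expected (1, 3, 16, 16)
```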

Reviewed By: ajtulloch

Differential Revision: D4724244

fbshipit-source-id: b8295061141fb513da84acf91fdfd67264119059
2017-03-16 18:49:01 -07:00
Huazhong Ning
ad4ae4528f migrate mtml to dper2
Summary:
1. migrate the basic mtml model to dper 2
2. test dper 2 mtml model
3. test all optimizers

Reviewed By: kittipatv

Differential Revision: D4680215

fbshipit-source-id: 7aac5c59bdac22fcad8ed869b98e9e62dca1d337
2017-03-16 17:48:05 -07:00
James Reed
cc2e915461 Implement TopK op in caffe2
Reviewed By: salexspb, urikz

Differential Revision: D4718439

fbshipit-source-id: e6866eb7bb586f2716662cd4b65961bdd9914525
2017-03-16 17:32:20 -07:00
Kevin Waugh
2c8bf2525b added BatchL2Loss layer
Summary: A layer that takes a (label, prediction) pair and outputs the L2 loss.
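
Conceptually, such a layer computes a squared-error loss over each (label, prediction) pair; a minimal numpy sketch of that computation (whether the layer sums or averages over the batch is an assumption, and this is not the layer's actual code):
```
import numpy as np

def batch_l2_loss(prediction, label):
    # Squared error per example, then averaged over the batch (assumed).
    per_example = np.sum((prediction - label) ** 2, axis=1)
    return per_example.mean()

prediction = np.array([[0.2, 0.8], [0.9, 0.1]], dtype=np.float32)
label = np.array([[0.0, 1.0], [1.0, 0.0]], dtype=np.float32)
print(batch_l2_loss(prediction, label))
```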

Reviewed By: kittipatv

Differential Revision: D4702111

fbshipit-source-id: 09f2ede44d1b548e61096de741f1b2aa0b66bbcb
2017-03-16 17:32:20 -07:00
Alexander Sidorov
d85ed5d5d6 fix external_loggers
Summary:
It was broken in trunk and I fixed it locally, but then had a
wrong merge in D4672026. This is just a revert of those changes.

Reviewed By: ajtulloch

Differential Revision: D4723138

fbshipit-source-id: 14757d9c8ae5135bd7c084003a64e25efc74b54f
2017-03-16 13:47:58 -07:00
James Reed
10d95bd0f0 Remove batch_size parameter from attention and LSTMWithAttention interfaces
Summary: Reshape based on tensor shapes in the graph rather than based on a passed-in batch_size parameter
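
One common way to avoid a hard-coded batch size is to take the target shape from a tensor already in the graph; a hedged sketch of that pattern (blob names are hypothetical, and this is not necessarily the exact approach used in this diff). Caffe2's Reshape can take the new shape as a second input blob:
```
from caffe2.python import core

net = core.Net("attention_sketch")
# Derive the target shape from a tensor already in the graph instead of a
# passed-in batch_size; Reshape accepts the new shape as a second input blob.
target_shape = net.Shape("decoder_hidden", "decoder_hidden_shape")
reshaped, _old_shape = net.Reshape(
    ["some_blob", target_shape], ["some_blob_reshaped", "some_blob_old_shape"]
)
```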

Reviewed By: urikz

Differential Revision: D4702086

fbshipit-source-id: c4c1d8425cd36c1e86695918eaba2667c27e9601
2017-03-16 11:47:52 -07:00
Xianjie Chen
b2ab7365be fix for special case when dense dim is 1
Summary: otherwise it will fail here: https://fburl.com/puy5x2dq

Reviewed By: kittipatv

Differential Revision: D4719212

fbshipit-source-id: e0d8211f64dca00ee48df3235d2bc030ea30f208
2017-03-16 05:19:10 -07:00
Luke Yeager
7773a2d643 Bugfix: type not being set when inferring types+shapes
Summary:
/cc akyrola

I basically just copied all the `ShapeCall` stuff as `TypeCall`. Is there a better way?
Closes https://github.com/caffe2/caffe2/pull/187

Differential Revision: D4699312

Pulled By: Yangqing

fbshipit-source-id: 92f736ffe4127b00b5821acb1eb359771975fdd7
2017-03-15 18:48:40 -07:00
Alexander Sidorov
56f324d191 Added predictor bindings to python interface
Summary: `from caffe2.python import workspace; p = workspace.Predictor(init_net, predict_net); outputs = p.run(inputs)`
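
Expanded into a hedged, self-contained sketch of that usage; the file names and input shape are placeholders, and the exact input format expected by run() (a list of numpy arrays here) is an assumption:
```
import numpy as np
from caffe2.python import workspace

# Serialized NetDef protobufs, e.g. previously written out by an exporter.
with open("init_net.pb", "rb") as f:
    init_net = f.read()
with open("predict_net.pb", "rb") as f:
    predict_net = f.read()

p = workspace.Predictor(init_net, predict_net)
# Placeholder input; the shape depends on the model being run.
inputs = [np.random.rand(1, 3, 224, 224).astype(np.float32)]
outputs = p.run(inputs)
```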

Reviewed By: Yangqing

Differential Revision: D4576793

fbshipit-source-id: b829bbcaf2e7c34dad85024177433207bd96a234
2017-03-15 11:17:54 -07:00
Kittipat Virochsiri
61dd35f1d6 FCWithoutBias layer
Summary: For some embedding tasks, we don't want to include a bias term in the embedding computation.
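
For illustration only, a minimal numpy sketch of the difference (the layer itself wraps the corresponding Caffe2 ops, not this code):
```
import numpy as np

X = np.random.rand(4, 16).astype(np.float32)  # batch of inputs
W = np.random.rand(8, 16).astype(np.float32)  # weight matrix
b = np.random.rand(8).astype(np.float32)      # bias

fc_output = X.dot(W.T) + b           # regular FC layer
fc_without_bias_output = X.dot(W.T)  # FCWithoutBias: bias term dropped
```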

Reviewed By: xianjiec

Differential Revision: D4689620

fbshipit-source-id: 4168584681d30c0eaa1d17ceaf68edda11924644
2017-03-15 11:03:37 -07:00
Pieter Noordhuis
92101aa87a Update resnet50 example
Summary:
Make it use Gloo and optionally use Redis for rendezvous (where a
shared filesystem is not available).

Differential Revision: D4709943

fbshipit-source-id: 59cc7a14316c7b634417ea5161a75fab3c19f2fa
2017-03-15 08:18:50 -07:00
ezineo
518d36d34b Add PReLU translator
Summary: Closes https://github.com/caffe2/caffe2/pull/171

Differential Revision: D4711877

Pulled By: Yangqing

fbshipit-source-id: 555f733e6eabf351480b7d4398aa05755cc26599
2017-03-15 02:47:03 -07:00
Huazhong Ning
bb58074332 support get/add a field by nested name
Summary:
We have more and more nested Struct schemas. There is an increasing need to get/add a field by nested name, e.g., for the following nested Struct schema:

st = Struct(
  ('a', Scalar()),
  ('b', Struct(
     ('c', Scalar()),
  )),
)

We may want to get the field "b:c" and/or insert a new field "b:x". The immediate need is for dper2 metrics.

This diff achieves that.
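
A hedged sketch of how the nested lookup might be used; the colon-separated indexing shown below follows the description above, but the exact accessor API is an assumption:
```
from caffe2.python import schema

st = schema.Struct(
    ('a', schema.Scalar()),
    ('b', schema.Struct(
        ('c', schema.Scalar()),
    )),
)

# Assumed usage of the nested-name accessor described in this diff.
inner = st['b:c']  # fetch the nested field "b:c"
```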

Reviewed By: kittipatv

Differential Revision: D4690225

fbshipit-source-id: 71d4a74b36bd1228a2fefd901db2f200602152b7
2017-03-15 02:00:57 -07:00
Aapo Kyrola
26628d10ff Fix workspace clashes
Summary: For example, test and train nets could have shared workspaces, leading to a race condition. This adds an assertion and a running counter to the workspace-blob name.

Reviewed By: jhcross

Differential Revision: D4712152

fbshipit-source-id: 808d7069095bac24ebfe0c9d31ebd134f4cf0956
2017-03-14 23:33:28 -07:00
Pieter Noordhuis
9e6fd02c28 Use Gloo ops in data_parallel_model
Summary:
GPU-to-CPU copies are no longer needed. The allreduce operator no longer
uses the 'local allreduce - global allreduce - local broadcast' sequence
when Gloo is used, but passes all input blobs directly.

Depends on D4708860.

Differential Revision: D4709897

fbshipit-source-id: 4d745d5d8bac9c2fcca081dd5d812c902808c3b6
2017-03-14 22:34:51 -07:00
Alexander Sidorov
4d7451399b XRay mobile quantized model
Summary:
This allows experimenting with various training-from-scratch / fine-tuning techniques. The code for the new model is not intended to be used as is. Instead, one could train a full-precision model first, then add quantization for the last layer, then for the next one, and so on.

In my experiments I took a pretrained model and then quantized all inception layers with 4 bits. This restored the original accuracy after several dozen iterations.

This diff also adds a common prefix to the model checkpoint and adds this prefix to the git/hg ignore files.
It also adds some extra logs which are useful for quickly seeing how things changed right after enabling quantization.

Differential Revision: D4672026

fbshipit-source-id: b022c8ccf11dd8a2af1a7b2e92673483bc741a11
2017-03-14 22:18:40 -07:00
Luke Yeager
014d1fe5c4 Allow test discovery in caffe2/python/
Summary:
These are all essentially no-op changes which allow for nose-style (or pytest-style) test discovery.

With this patch, you can use any of these methods to discover and run tests under `caffe2/python`:
```
python -m unittest discover -p '*test*.py' caffe2/python/
python -m nose caffe2/python/
python -m pytest caffe2/python/
```

Future work:

* Get all of the tests to pass
  * Some seem to be testing operations which don't have GPU implementations
  * I get a segfault unless I set `CUDA_VISIBLE_DEVICES=0`
  * Some tests are flaky
* Allow test discovery throughout the whole project (e.g. the `experiments/` dir)
Closes https://github.com/caffe2/caffe2/pull/199

Reviewed By: pietern

Differential Revision: D4704504

Pulled By: Yangqing

fbshipit-source-id: 8f5687ec9c8aa873dfaff30dbf44272bc38a206b
2017-03-14 18:16:41 -07:00
Aapo Kyrola
91f468b15c fixes to make data parallel model work for RecurrentNet + test case
Summary:
First, this diff includes a full test of data-parallel LSTM, which confirms it works correctly. To make it work, some changes had to be made:
 - cell net/step net external inputs must be namespace scoped
 - prevent double-namescoping of cellnet inputs
 - make data parallel model understand recurrentnets so the device-mapping works

Reviewed By: salexspb

Differential Revision: D4708840

fbshipit-source-id: 4b0ddc43642d449076a2b6f67ad1c47f84138ff4
2017-03-14 15:48:07 -07:00
Kittipat Virochsiri
25b1221579 Allow scalar output in functional layer
Summary: Some operators, e.g., SoftmaxWithLoss, return scalar-typed tensors. This allows us to use those ops without having to write a layer manually.

Reviewed By: xianjiec, kennyhorror

Differential Revision: D4703982

fbshipit-source-id: f33969971c57fc037c9b44adb37af1caba4084b6
2017-03-14 15:32:47 -07:00
Aapo Kyrola
783e40e806 Fix lengths-remapping again + better errors
Summary: When cloning a recurrent net op, we remap the lengths blobs. But if they don't exist (as with CRF), we should not do that.

Differential Revision: D4702123

fbshipit-source-id: 37a22d11e709011b8b98b2cc3d9f08eb9fda06c4
2017-03-14 11:04:45 -07:00
Alexander Sidorov
1fac027d0e Quantized Training API
Summary: These Python helpers provide the necessary bookkeeping when adding quantization to conv layers.

Reviewed By: Yangqing

Differential Revision: D4671478

fbshipit-source-id: 292e2f633dd30969c0afbe7a8075b340ce9a6d12
2017-03-13 22:17:58 -07:00
Deepak Gopinath
a1d63da6af Adding UNK to vocab | Changing default params
Summary: UNK needs to be indexed in the vocabulary for validation to work. Default args now result in decreasing training loss.

Reviewed By: urikz

Differential Revision: D4703393

fbshipit-source-id: e4d6ad100daf8392f8ba1e502f9ecf39bb8ce24a
2017-03-13 22:17:48 -07:00
Aapo Kyrola
fc7939c25b add model_helper.ExtractPredictorNet()
Summary:
It has been a pain to save predictor-compatible models from Caffe2. This diff adds a function, ExtractPredictorNet, that takes a training model and outputs a predictor model by removing all operators that are not relevant for prediction, such as the backward pass and the dequeue ops for input loading (in a predictor, the input data is an external input).

We can also consider including this directly in the predictor exporter for FB usage.
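
A hedged usage sketch; the keyword names and the return value shown below are assumptions based on the description above, not the confirmed signature:
```
from caffe2.python import core, model_helper

# A toy "training" net; in practice this would be the full training model.
net = core.Net("train_net")
fc = net.FC(["data", "fc_w", "fc_b"], "fc1")
softmax = net.Softmax(fc, "softmax")

# Assumed signature and return value; check model_helper for the exact API.
predict_net, export_blobs = model_helper.ExtractPredictorNet(
    net.Proto(),
    input_blobs=["data"],
    output_blobs=["softmax"],
)
```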

Reviewed By: rpenggithub

Differential Revision: D4693264

fbshipit-source-id: e81abbbec0bd4d717159cf36488d0baaf0130090
2017-03-13 16:32:04 -07:00
Ahmed Taei
a745981c94 ReduceBack{Sum|Mean}Op CPU & GPU implementation
Summary:
Implement ReduceBackSum & ReduceBackMean with gradients for CPU & GPU contexts.
The reduction happens over the last dimensions; for example, if the input is an
M x N matrix, ReduceBackSum will produce a vector of dimension M x 1 containing
the row-wise sums.
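
A hedged sketch of the M x N row-wise sum described above, run through the new operator; reducing a single trailing dimension by default is an assumption:
```
import numpy as np
from caffe2.python import core, workspace

X = np.arange(12, dtype=np.float32).reshape(3, 4)  # M=3, N=4
workspace.FeedBlob("X", X)

op = core.CreateOperator("ReduceBackSum", ["X"], ["Y"])
workspace.RunOperatorOnce(op)

Y = workspace.FetchBlob("Y")
print(Y)                                        # row-wise sums
print(np.allclose(Y.flatten(), X.sum(axis=1)))  # expected: True
```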

Differential Revision: D4689768

fbshipit-source-id: 5b0482d4341867ecf23526dc6c4d544420e7d8f7
2017-03-13 16:19:58 -07:00
Kairan Sun
ee2bc06926 Add Shape Inference for Reshape Operator
Summary: Add shape inference for Reshape. Because shape inference cannot be done when the reshaped tensor's shape comes from runtime tensor data, set `out[0].set_unknown_shape(true)` if no `shape` argument is used.

Differential Revision: D4671125

fbshipit-source-id: 685a9198f9b08e3336014c792f20051b381d8619
2017-03-13 14:31:27 -07:00
Deepak Gopinath
001ac5d751 Fix to use appropriate corpus and vocab in eval
Summary: We should be using the vocabulary built on the training data, and corpus_eval as data for the evaluation phase.

Reviewed By: urikz

Differential Revision: D4700382

fbshipit-source-id: ca1dd043a28f9bb585faad050c82fb12c1cdf6cc
2017-03-13 14:31:27 -07:00
Peizhao Zhang
a5a5d00b87 Fixed a bug: 'ModelTrainerLog instance has no attribute 'external_loggers''
Summary: Fixed a bug (AttributeError: ModelTrainerLog instance has no attribute 'external_loggers', at File "caffe2/python/experiment_util.py", line 101) when no external_loggers is passed to ModelTrainerLog().

Differential Revision: D4697197

fbshipit-source-id: 1c770c366d87ea474bcf40ab289b67c76648d48b
2017-03-13 12:32:36 -07:00
Xianjie Chen
e5858485ca small change to concat layer to make tensor board vis nicer
Summary:
Otherwise the blob will be in a different namescope, e.g., `_nested`: https://fburl.com/ntlsaezv.
This makes the TensorBoard visualization ugly.

Reviewed By: dzhulgakov

Differential Revision: D4696946

fbshipit-source-id: 73627feccd7c4896964e6c549b7241bcce4f49a7
2017-03-12 23:01:18 -07:00
Pieter Noordhuis
6729d81418 Specify which GPUs to use in resnet50 example
Summary:
TSIA

This change also fixes an undefined attribute error after running 20
iterations of the resnet50 example trainer.

Differential Revision: D4692794

fbshipit-source-id: b98efdfeb078c5ba89d2a86837f3c672e1eade5f
2017-03-12 22:33:15 -07:00
Dmytro Dzhulgakov
43b6fcba7d Improve error message from LogFileDB on missing file
Summary: A lot of people get confused if the file can't be loaded.

Reviewed By: rpenggithub

Differential Revision: D4686572

fbshipit-source-id: 519ff68a3d4f04cf8ce893f255f7814e043383b6
2017-03-10 23:31:28 -08:00
Aapo Kyrola
3f682ca699 Fix to data parallel model blob_to_device mapping
Summary: We needed the InferToDeviceMapping too early; we should have done it also after running the parameter update function, since that can create new blobs like the momentum blobs. This fix is maybe not optimal, but it works and is fast enough.

Differential Revision: D4693450

fbshipit-source-id: 4c4cc2396dad371b3fbcd1d8da51133ea09a57e0
2017-03-10 18:03:58 -08:00
Dmytro Dzhulgakov
b61aaa90b6 Stop multi_reader if we run out of data before max_examples
Summary:
Previously we didn't propagate the 'out-of-data' signal if splits_per_epoch wasn't specified.

Right now it's a hacky fix (just reuse ReaderWithLimit). azzolini - any suggestions for a more elegant solution? I could create an extra reader that just exports an "is empty" signal.

Overall, I guess we need to turn global_queue into a more sustainable unittest that verifies all possible combinations - I'm still not sure it's correct :-\

Reviewed By: xianjiec

Differential Revision: D4665677

fbshipit-source-id: fe44d10ee82c3383145635e67dea1d9b666e061f
2017-03-10 18:03:57 -08:00
Wenyi Huang
0308910c58 Enable use of Print for LayerModelHelper
Summary: When debugging using LayerModelHelper, adding Print to the model will trigger this assert.

Reviewed By: xianjiec

Differential Revision: D4687859

fbshipit-source-id: 6932e38f8dd17ba0b80da18a20943ecdb2e8af0a
2017-03-10 15:26:16 -08:00
Aapo Kyrola
a109cbdfb6 fix bug in data_parallel_model stripParams()
Summary: Thanks to shenpan for detecting this bug. The problem is that FinalizeAfterCheckpoint() can be passed a list of strings, not blob references, and that fails in stripParams() after the assertion I added in D4649208. It is OK to pass strings to that function as well.

Reviewed By: jhcross

Differential Revision: D4691028

fbshipit-source-id: 0bca80d44a5ab641438cc5b26482bca0b1527d69
2017-03-10 13:17:11 -08:00
Aapo Kyrola
adb3f0ec22 add exception for empty shape param
Summary: Following krp's suggestion, check if the shape parameter is empty.

Reviewed By: dzhulgakov

Differential Revision: D4686698

fbshipit-source-id: 3f9fb1e3215dd2a4a726442531201eeb18224bc6
2017-03-10 00:33:59 -08:00
Karthik Prasad
965a7daf9b Implement MILSTM in caffe2
Summary:
Created a new function with the specifics of the MI-LSTM implementation in Caffe2.
See https://arxiv.org/pdf/1606.06630.pdf for details.
See D4478877 for the TensorFlow implementation of the same.
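
For context, the multiplicative-integration building block from the cited paper replaces the usual additive pre-activation Wx + Uh + b with a form that includes a Hadamard product of the two projections; a minimal numpy sketch (not the Caffe2 kernel):
```
import numpy as np

def mi_preactivation(Wx, Uh, alpha, beta1, beta2, b):
    # Standard LSTM gate pre-activation: Wx + Uh + b
    # MI variant (arXiv:1606.06630): alpha*Wx*Uh + beta1*Uh + beta2*Wx + b
    return alpha * Wx * Uh + beta1 * Uh + beta2 * Wx + b

d = 4
Wx, Uh = np.random.rand(d), np.random.rand(d)
alpha, beta1, beta2, b = np.ones(d), np.ones(d), np.ones(d), np.zeros(d)
print(mi_preactivation(Wx, Uh, alpha, beta1, beta2, b))
```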

Reviewed By: jhcross

Differential Revision: D4669882

fbshipit-source-id: 095bbcf187dbdac2cd79558ff0c8f9f67d8af639
2017-03-09 16:32:47 -08:00
Jerry Pan
bde53f61af Caffe2: add scuba logging to benchmark
Summary: Caffe2: add scuba logging to benchmark

Differential Revision: D4667194

fbshipit-source-id: 8e9fca5517d7d40a6bc3e55cd00161e7482cd6f4
2017-03-09 16:32:47 -08:00
Deepak Gopinath
57ecd20197 seq2seq open source implementation
Summary:
OSS implementation of the seq2seq model in Caffe2. The script uses the Seq2SeqModelCaffe2 class to build and run the model. It takes training data in the form of a text file with one sentence per line, builds a vocabulary, generates batches based on the batch size, and runs the net for a configurable number of epochs. It prints the total scalar loss at the end of each epoch.

All FBLearner and neural_mt type-system dependencies have been removed. Unimplemented and unnecessary methods have been removed to make the script simpler.
fblearner/flow/projects/langtech/translation/neural_mt/model_util_caffe2.py has been moved to caffe2/caffe2/python/examples/seq2seq_util.py and remains unchanged.

Potential TODOs:
  - Get the model running on GPU. Only GatherOp lacks a corresponding GPU implementation; try adding CopyGPUToCPU before and CopyCPUToGPU after Gather, and use a CUDA DeviceOption.
  - Add evaluation on test data with a suitable metric (perplexity? BLEU?)

Reviewed By: urikz

Differential Revision: D4653333

fbshipit-source-id: 1c7d970ebc86afe23fad4d48854296bf54eb0f77
2017-03-09 16:18:08 -08:00
James Cross
c5621ded31 Allow use of ReversePackedSegs operator in CUDA context
Summary: ReversePackedSegs operator for CUDA. The "lengths" input (static integers) is required to be in CPU memory.

Differential Revision: D4661281

fbshipit-source-id: c800c316c34015ba8e732dcbcaa8c4edaffdfeab
2017-03-09 15:03:55 -08:00
Aapo Kyrola
89c08334bb data_parallel_model support for sparse gradients and CPU ops
Summary:
Data parallel model did not support sparse operations, nor gradients computed on CPU ops.

Currently sparse operations are done on CPU, so there is no point in "data parallelizing" them. I had to make a few changes to data_parallel_model to support this:
 1. The model can have params that are added prior to adding the data-parallel part. For example, a lookup table of word vectors would be a non-parallel parameter.
 2. Thus, when the data parallel model is called, it separates out the non-parallel params and avoids working on them. Note: when we add the distributed version, we need to explicitly handle them with AllGather!

This works nicely since Caffe2 automatically adds the backward concat operator when multiple ops gather from the same blob.

I also added support for data-parallel CPU ops, which might be necessary when we don't have a GPU implementation of some ops.

A test in data_parallel_model_test validates the correctness of the code by running the same trainer on different numbers of GPUs and checking that the end result is the same.
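
A hedged sketch of how the data-parallel API is driven with builder callbacks; the keyword names below are assumptions based on the description above and may not match the exact signature:
```
from caffe2.python import cnn, data_parallel_model

train_model = cnn.CNNModelHelper(order="NCHW", name="sketch")

def add_input(model):
    pass        # e.g. read a batch from a DB into "data"/"label"

def add_forward_pass(model, loss_scale=1.0):
    return []   # build the forward pass and return the loss blobs

def add_param_update(model):
    pass        # e.g. apply an SGD update to each device's gradients

# Keyword names are assumptions; see data_parallel_model for the real ones.
data_parallel_model.Parallelize_GPU(
    train_model,
    input_builder_fun=add_input,
    forward_pass_builder_fun=add_forward_pass,
    param_update_builder_fun=add_param_update,
    devices=[0, 1],
)
```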

Reviewed By: jhcross

Differential Revision: D4649208

fbshipit-source-id: e3b7ae701ead468dc94c52a976eafec5c9831097
2017-03-09 13:48:41 -08:00
Andrey Malevich
84e742ded7 Migrate realtime training workflows to use new metrics.
Summary: This diff gets rid of the old metrics interface in realtime training.

Reviewed By: xianjiec

Differential Revision: D4649734

fbshipit-source-id: de4af85eb5476df9790ebd3915625bf8beee65af
2017-03-08 23:49:41 -08:00
Xianjie Chen
95501a0165 clean old unit test, add sum processor and sqrt pooling
Summary: The sum processor and sqrt pooling are there to mimic the DoubleHelix model.

Differential Revision: D4678413

fbshipit-source-id: fc1ccfe3c92c540ce5914dfd8ff1a040805c48db
2017-03-08 23:04:19 -08:00
Chonglin Sun
581e57c244 add AccumulateHistogramOp
Summary: AccumulateHistogramOp, for computing the histogram of all values in input tensors
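
Conceptually, the op keeps a running histogram over every value it sees across calls; a minimal numpy sketch of that accumulation (illustration only, not the operator's interface):
```
import numpy as np

class HistogramAccumulator(object):
    def __init__(self, lower, upper, num_buckets):
        self.edges = np.linspace(lower, upper, num_buckets + 1)
        self.counts = np.zeros(num_buckets, dtype=np.int64)

    def accumulate(self, tensor):
        # Add this tensor's values into the running histogram.
        hist, _ = np.histogram(tensor, bins=self.edges)
        self.counts += hist
        return self.counts

acc = HistogramAccumulator(lower=0.0, upper=1.0, num_buckets=4)
acc.accumulate(np.array([0.1, 0.2, 0.7]))
print(acc.accumulate(np.array([0.9, 0.3])))
```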

Differential Revision: D4654417

fbshipit-source-id: dea92346004c772af16e1eb41306287d81dc5a02
2017-03-08 19:37:32 -08:00
Minsuk (Brian) Kahng
c6a9d7f188 User input (Conv out, etc.)
Summary: Take user inputs for the introspection visualization: convolution output-layer activations, filters selected by containing phrases, and the number of samples.

Reviewed By: Mortimerp9

Differential Revision: D4603797

fbshipit-source-id: dc972dcb8ad36e30defab266d710e047b11cff73
2017-03-08 13:49:45 -08:00
Ahmed Taei
4f0e7730a9 Distributed Multi-GPU resnet50
Summary: Use filesystem rendezvous for distributed multi-GPU training.

Differential Revision: D4664945

fbshipit-source-id: 7b6767323e94bc4e7fa25ef3eba65b38abb79341
2017-03-08 11:39:29 -08:00
James Reed
8de1db9eb6 Implement recurrent attention in C2
Summary: Super rough implementation of recurrent attention. Planning to factor out the common code between the two functions as well as train and eval. I want to get this out and get eyes on it sooner rather than later

Differential Revision: D4647837

fbshipit-source-id: 54bc4e8ed0df6f04c86c425926decbe89f73b068
2017-03-08 11:21:28 -08:00
Kittipat Virochsiri
f0d78753ae Make ModelExporter.load_from_db() load to specific workspace
Summary: In the case of a distributed task, load_from_db() loads to the wrong workspace (when used inside a Python op). Pass which workspace to use explicitly so that it loads to the one the Python op is being run in.

Reviewed By: kennyhorror

Differential Revision: D4653692

fbshipit-source-id: 94585c012b05ee38b9ce5e8ef0efdd50aa41dd2b
2017-03-08 09:31:42 -08:00
Jiyan Yang
e75221e316 Add eval net to two tower workflow
Summary: The evaluation part of the two-tower workflow is missing; this diff completes it. Some of the newly added functions can be used for other workflows, e.g., feed. As the eval workflows of different workflows will overlap, a generic eval workflow will be added in a separate diff.

Reviewed By: kennyhorror

Differential Revision: D4646880

fbshipit-source-id: 4d6eb35df10f6f613533d442f2a04dc0332386f8
2017-03-07 21:03:00 -08:00