Commit Graph

746 Commits

Pooya Davoodi
2c97c98ca7 Enable testing the GPU implementations of Adagrad and Adam
Summary:
Enable testing the GPU implementations of Adagrad and Adam, including sparse versions.
Closes https://github.com/caffe2/caffe2/pull/607
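
A minimal sketch of what enabling GPU coverage for such an operator test can look like, assuming the usual caffe2 `hypothesis_test_util` device strategies (`hu.gcs` instead of `hu.gcs_cpu_only`); the test body here is illustrative, not the actual test:
```
import numpy as np
from hypothesis import given, strategies as st
import caffe2.python.hypothesis_test_util as hu
from caffe2.python import core


class TestAdagrad(hu.HypothesisTestCase):
    # hu.gcs draws both CPU and GPU device options, so the same test
    # exercises the CUDA implementation when a GPU is available.
    @given(inputs=hu.tensors(n=3),
           lr=st.floats(min_value=0.01, max_value=0.99),
           epsilon=st.floats(min_value=0.01, max_value=0.99),
           **hu.gcs)
    def test_adagrad(self, inputs, lr, epsilon, gc, dc):
        param, moment, grad = inputs
        moment = np.abs(moment)  # Adagrad's accumulator must be non-negative
        op = core.CreateOperator(
            "Adagrad", ["param", "moment", "grad", "lr"],
            ["param", "moment"], epsilon=epsilon, device_option=gc)
        self.assertDeviceChecks(
            dc, op,
            [param, moment, grad, np.array([lr], dtype=np.float32)],
            [0, 1])
```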

Reviewed By: dzhulgakov

Differential Revision: D5121552

Pulled By: Yangqing

fbshipit-source-id: da6b7dde456237c94cf74d00860e7327b2267eab
2017-06-01 18:10:57 -07:00
Kun Han
fc4d118e6b Caffe2 MemNN Production Model Saving
Summary:
Split the Caffe2 memory-based model into two parts:
- Dimension reduction MLP
- DNN with concatenation of memory and obj feature

Currently only a simple mean is implemented.

Differential Revision: D4866825

fbshipit-source-id: d2f6813402513ec9af30dbe29a50593e2d3cdb3b
2017-06-01 14:31:53 -07:00
Ahmed Taei
299f293cb2 Add initializer classes to conv_nd.
Summary: Fix parameters passed to _ConvBase

Reviewed By: sunwael

Differential Revision: D5166836

fbshipit-source-id: 6c2a9fa73cf1199a5f861900554f3075a49104fc
2017-06-01 14:17:55 -07:00
Simon Layton
58874ad5bf Fp16 training initializers
Summary:
Re-open for re-importing :)
Closes https://github.com/caffe2/caffe2/pull/721

Differential Revision: D5164345

Pulled By: akyrola

fbshipit-source-id: e80b32556cd25610602df91a4225b93edc0ca40b
2017-06-01 08:34:46 -07:00
Aapo Kyrola
ffbba0fae7 add model_helper Validate() + sprinkle around
Summary:
Recent diff introduced a duplicate parameter to the model, which would hurt performance and also affect correctness (duplicate momentum updates, for example). We unfortunately had no checks for duplicate params outside of data_parallel_model, which fortunately brought this to our attention.

But it is better to have a Validate() function in model_helper, and call that before adding gradient ops and querying for parameters. Added to brew_test calls as well.
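
A minimal sketch of what such a duplicate-parameter check boils down to (assumed, not the actual Validate() implementation; `params` is taken to be the list of blob names or BlobReferences the model helper keeps):
```
from collections import Counter

def validate_params(params):
    """Raise if the same parameter blob appears more than once."""
    counts = Counter(str(p) for p in params)
    duplicates = [name for name, c in counts.items() if c > 1]
    assert not duplicates, (
        "Duplicate params would cause e.g. duplicate momentum updates: %s"
        % duplicates)

# Example: validate_params(["fc_w", "fc_b", "fc_w"]) raises, naming "fc_w".
```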

Reviewed By: kennyhorror

Differential Revision: D5163458

fbshipit-source-id: 35692e8bfcc359d4e8bc73e6f2358659f6e45ceb
2017-06-01 02:36:47 -07:00
Aapo Kyrola
0f8c8f37a8 Revert D5159712: [caffe2][PR] Fp16 training initializers
Summary: This reverts commit 60a889494d2e2f4df1d720331e19f638c5eb95cc

Differential Revision: D5159712

fbshipit-source-id: 16040c911b260648857f656f92b165f92c2daae0
2017-06-01 00:17:14 -07:00
Aapo Kyrola
076376f4f6 Revert D5119830: [C2] Refactoring of the parameters step 0. Add simple tags and unify interface for params and computed_params
Summary: This reverts commit 2001090a37346eb12abbb234e13e727c288eb8a7

Differential Revision: D5119830

fbshipit-source-id: bf321868338f0db85dff3237af7eaf74212dbdf6
2017-06-01 00:02:21 -07:00
Andrey Malevich
ff61ed358e Refactoring of the parameters step 0. Add simple tags and unify interface for params and computed_params
Summary:
This diff is the first step in the effort to refactor all parameters. As a
first step, I'm merging the concepts of params and computed_params, which will
be based on tags instead (in the first version it still uses the old data
structs to store all the BlobReferences).

Renaming computed_params to non-trainable/non-backprop params should be done in
some other diff.

Reviewed By: salexspb

Differential Revision: D5119830

fbshipit-source-id: 2001090a37346eb12abbb234e13e727c288eb8a7
2017-05-31 22:36:36 -07:00
Luke Yeager
d8d1cd1064 Test smaller tensors in segment_ops_test
Summary:
It's causing problems inside docker containers:

`InvalidArgument: Insufficient bytes of entropy to draw requested array.  shape=(5, 9, 10, 5), dtype=float32.  Can you reduce the size or dimensions of the array?  What about using a smaller dtype? If slow test runs and minimisation are acceptable, you  could increase settings().buffer_size from 8192 to at least 18432000.`
Closes https://github.com/caffe2/caffe2/pull/707
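
The fix amounts to asking hypothesis for smaller arrays so the entropy budget isn't exhausted; a generic sketch (not the exact caffe2 strategy) using hypothesis.extra.numpy:
```
import numpy as np
from hypothesis import given
import hypothesis.extra.numpy as hnp

# Drawing (5, 9, 10, 5) float32 arrays needs more entropy than the default
# buffer allows in the Docker setup; capping dims/sides keeps draws small.
small_float_arrays = hnp.arrays(
    dtype=np.float32,
    shape=hnp.array_shapes(min_dims=1, max_dims=3, min_side=1, max_side=4),
)

@given(data=small_float_arrays)
def test_segment_like_op(data):
    # Placeholder body: a real test would run the segment op on `data`.
    assert data.dtype == np.float32
```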

Differential Revision: D5162621

Pulled By: Yangqing

fbshipit-source-id: 55544210961cbc80828dca2cbeba6a5ace8cf8d1
2017-05-31 20:17:31 -07:00
Luke Yeager
e2cf007dc8 Avoid numpy VisibleDeprecationWarning in test
Summary:
This warning becomes an error with https://github.com/numpy/numpy/pull/6271 (`>=1.12.0`).

```
caffe2/python/operator_test/tile_op_test.py::TestTile::test_tilewinput
  /opt/caffe2/caffe2/python/operator_test/tile_op_test.py:100: VisibleDeprecationWarning: converting an array with ndim > 0 to an index will result in an error in the future
    dims[axis] = tiles
  /usr/lib/python2.7/dist-packages/numpy/lib/shape_base.py:873: VisibleDeprecationWarning: converting an array with ndim > 0 to an index will result in an error in the future
    return c.reshape(shape_out)
```
Closes https://github.com/caffe2/caffe2/pull/710
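
The warning comes from storing a small ndarray where numpy later expects a plain integer; the usual fix is an explicit conversion, e.g.:
```
import numpy as np

tiles = np.array([3])          # e.g. a value read back from a 1-element blob
dims = [2, 4, 5]
axis = 1

# Deprecated (an error in newer numpy): dims[axis] = tiles, because numpy
# must coerce an ndarray with ndim > 0 to an index when dims is used later.
dims[axis] = int(tiles[0])     # or int(tiles.item()); store a real Python int
np.zeros(dims)                 # shape handling now sees only plain ints
```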

Differential Revision: D5160776

Pulled By: Yangqing

fbshipit-source-id: b264e0e389de5817a289db878c15e655f9fa2f09
2017-05-31 20:01:30 -07:00
Simon Layton
2bfacff426 Fp16 training initializers
Summary:
Adds support for generating and training fp16 models. Added an SGD optimizer for multi-precision trainers and a new callback to data_parallel_model to help multi-precision models keep their different copies of parameters in sync during training.
Closes https://github.com/caffe2/caffe2/pull/697
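
A minimal numpy sketch of the multi-precision idea described above (assumed, not the actual Caffe2 optimizer): keep an fp32 master copy of each parameter, apply the update there, then re-sync the fp16 copy used in the forward/backward pass.
```
import numpy as np

class MultiPrecisionSGD(object):
    """Sketch of an SGD step that keeps fp32 master weights for fp16 params."""

    def __init__(self, w_fp16, lr=0.01, momentum=0.9):
        self.w_fp16 = w_fp16
        self.w_master = w_fp16.astype(np.float32)   # fp32 master copy
        self.velocity = np.zeros_like(self.w_master)
        self.lr, self.momentum = lr, momentum

    def step(self, grad_fp16):
        grad = grad_fp16.astype(np.float32)         # accumulate in fp32
        self.velocity = self.momentum * self.velocity - self.lr * grad
        self.w_master += self.velocity
        self.w_fp16[:] = self.w_master.astype(np.float16)  # keep copies in sync
        return self.w_fp16
```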

Differential Revision: D5159712

Pulled By: salexspb

fbshipit-source-id: 60a889494d2e2f4df1d720331e19f638c5eb95cc
2017-05-31 17:46:58 -07:00
Ahmed Taei
f0f4c2fc5d Increase the number of DAG execution worker threads.
Reviewed By: akyrola

Differential Revision: D5158414

fbshipit-source-id: add377aec5588076db881a2a3750101710f29732
2017-05-31 15:19:19 -07:00
Aapo Kyrola
73a8a49c7e synchronize re-rendezvousing on node changes + support num_shards=1 rendezvous
Summary:
Currently we can get into broken situations when some nodes run detectChanges() faster than others, so only some of the nodes start the next iteration of training. This is an inconsistent state. To prevent this from happening, each node now sets a "re-rendezvous flag" that is allreduced after each iteration. Once all nodes agree, re-rendezvous is done.

Also noticed that num_shards=1 does not work because data_parallel_model assumed num_shards>1 when rendezvous is not None. Fixed that.
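
The scheme is easy to express abstractly: each node contributes a local "wants to re-rendezvous" flag, the flags are allreduced, and every node acts on the combined value, so nobody runs ahead. A sketch with a hypothetical `allreduce_max` collective and node/rendezvous objects:
```
def training_loop(node, allreduce_max, rendezvous):
    """Sketch of the synchronized re-rendezvous check (names hypothetical)."""
    while True:
        node.run_one_iteration()

        # 1 if *this* node noticed a membership change, 0 otherwise.
        local_flag = 1 if node.detect_changes() else 0

        # Allreduce so every node sees whether *any* node saw a change;
        # all nodes therefore make the same decision on the same iteration.
        global_flag = allreduce_max(local_flag)
        if global_flag:
            rendezvous.rejoin()   # all nodes re-rendezvous together
```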

Reviewed By: andrewwdye

Differential Revision: D5156282

fbshipit-source-id: f2ccbd8ad13ed37f7813ff8ad1080d963d0d17e3
2017-05-31 15:19:13 -07:00
Ahmed Taei
f2d9d97008 Add an option to reset momentum-sgd params every time between successive block updates.
Reviewed By: akyrola

Differential Revision: D5149263

fbshipit-source-id: c0a3637a1b48f74ec55c9d13c8fab3456dab809c
2017-05-31 00:32:11 -07:00
Aapo Kyrola
ccdf2d99e1 Add description to assert in model_helper
Summary: Add information about the offending param when assertion fires.

Reviewed By: kennyhorror

Differential Revision: D5153625

fbshipit-source-id: 9f5a02bf64ccbdef9d93d346f79e589dfe3ec5be
2017-05-31 00:02:18 -07:00
Aapo Kyrola
ce7ce46ca1 fix secondary device check by gradient, if it is sparse
Summary: Fix an issue with the fallback path used when a parameter is not created in param_init_net or net: in that case we look at which device the op that outputs the gradient runs on. This did not work if the gradient was a GradientSlice.

Reviewed By: harouwu

Differential Revision: D5153102

fbshipit-source-id: 20eae660ea32e5a9ea484bf93c04c8f8c71a51ed
2017-05-30 20:47:17 -07:00
Aapo Kyrola
96d8ae2163 Make fills work with input_shape when run in CUDAContext
Summary: If ConstantFill (or another fill op) is used in CUDAContext with input_as_shape, the code crashes because it expects the shape to be in CUDAContext but accesses the array in host code... We could fix this by copying the values from the CUDA tensor, but it is probably best to enforce that the shape input is in CPU context. This is what this diff does.
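
In practice this means that when a fill op runs on a GPU with `input_as_shape`, the shape blob it reads should be produced in CPU context; a rough sketch of that placement (blob names hypothetical, argument names assumed to match the fill ops):
```
from caffe2.python import core
from caffe2.proto import caffe2_pb2

net = core.Net("fill_with_input_shape")

# The shape tensor lives on the CPU even though the fill runs on the GPU.
with core.DeviceScope(core.DeviceOption(caffe2_pb2.CPU)):
    net.GivenTensorIntFill([], "shape", values=[4, 3], shape=[2])

with core.DeviceScope(core.DeviceOption(caffe2_pb2.CUDA, 0)):
    net.ConstantFill(["shape"], "filled", value=1.0, input_as_shape=1)
```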

Differential Revision: D5152766

fbshipit-source-id: 0629a189bd1d800c0b7c9dbc324b78d279efac0b
2017-05-30 20:47:16 -07:00
Alexander Sidorov
846240a340 Caffe2 gradient generator bug fix
Summary:
Bug repro is in a test. Generally speaking, accumulation was
not happening when len(ys) >= 2 (the list of blobs we compute gradients
from) and some blob in the net was both in the ys list and also received
a gradient propagated from another element of ys.

Reviewed By: akyrola

Differential Revision: D5121695

fbshipit-source-id: 282d88f2f4f6e27dadae311964f40246a2739130
2017-05-30 18:47:08 -07:00
Andrey Malevich
aa59b217a9 Relax requirement on the outputs of the predictor.
Summary: This requirement looks a bit too restrictive. Let's remove it.

Reviewed By: volkhin

Differential Revision: D5150968

fbshipit-source-id: 9e38574edc6542c5ce3c7f25a01afe8f5ff9b507
2017-05-30 17:23:18 -07:00
Simon Layton
1aa6300696 Option to use NCCL for broadcast
Summary:
Fixes some performance issues when `broadcast_computed_params=True` is passed to Parallelize_GPU. Enabled via the same `use_nccl` flag as AllReduce.
Closes https://github.com/caffe2/caffe2/pull/630
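
A hedged sketch of passing the flag from caller code (argument names assumed to match data_parallel_model.Parallelize_GPU; the builder callbacks are the usual user-supplied functions):
```
from caffe2.python import data_parallel_model

def parallelize(model, add_inputs, build_forward, add_parameter_update, gpus):
    # Sketch only: keyword names are assumptions about Parallelize_GPU's API.
    data_parallel_model.Parallelize_GPU(
        model,
        input_builder_fun=add_inputs,
        forward_pass_builder_fun=build_forward,
        param_update_builder_fun=add_parameter_update,
        devices=gpus,
        broadcast_computed_params=True,  # e.g. computed (non-trained) blobs
        use_nccl=True,                   # same flag now also selects NCCL
    )                                    # for the parameter broadcast
```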

Differential Revision: D5149828

Pulled By: akyrola

fbshipit-source-id: 12c9714c7fa078811f1cde61c8523dca8f7f968f
2017-05-30 16:46:38 -07:00
Thomas Dudziak
47e921ba49 Remove map() and filter() in favor of comprehensions
Summary: These return lazy iterators in Python 3, which would not do anything in a lot of the usages currently present in Caffe2. This diff simply removes (almost) all usages of these two in Caffe2 and subprojects in favor of comprehensions, which are also easier to read/understand.
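
A small before/after example of the pattern being replaced:
```
blob_names = ["fc_w", "fc_b", "conv_w"]

# Python 2 habit: relies on map/filter producing a list eagerly.
# upper_names = map(lambda b: b.upper(), blob_names)
# weights = filter(lambda b: b.endswith("_w"), blob_names)

# Works the same in Python 2 and 3, and is easier to read:
upper_names = [b.upper() for b in blob_names]
weights = [b for b in blob_names if b.endswith("_w")]
```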

Reviewed By: akyrola

Differential Revision: D5142049

fbshipit-source-id: e800631d2df7d0823fed698cae46c486038007dc
2017-05-30 15:32:58 -07:00
Aapo Kyrola
acb2ad12e5 fix race condition at terminate
Summary:
Looking at one segfault at exit (https://our.intern.facebook.com/intern/chronos/jobinstance/?jobinstanceid=911625597&smc=chronos_gp_admin_client&log_type=stderr&offset=0&pretty_logs=false) and its coredump, the only thing I can see is that a FreeBlob() operator is called concurrently while a cudaMemcpyAsync (on thread 1) is crashing. FreeBlobOp is only called at data_workers _stop() (via utils.ResetBlobs()), and the only code that could run a cudaMemcpyAsync at that time is the fetcher thread of data_workers that is enqueuing blobs.

Here are the stacks: P57455299

This is clearly a bug since we should only clear the scratch blobs after all threads are terminated, which happens at wait_for_finish().

I am not 100% sure this fixes all the segfaults, but at least this one was most likely caused by this.

Reviewed By: andrewwdye

Differential Revision: D5146278

fbshipit-source-id: ae00796706bfc4fee6823caf6529b62ab20c1cd3
2017-05-30 13:47:10 -07:00
Aapo Kyrola
cdb50fbf2b add optimizer support to data_parallel_model; Use MomentumSGDUpdate
Summary:
This diff does two things:
- add support for optimizer to data_parallel_model. The user can supply optimizer_builder_fun instead of param_update_builder_fun. The latter is called for each GPU separately with the proper namescope and devicescope, while the optimizer builder is called only once and adds optimizers to the whole model (see the sketch below).

- use MomentumSGDUpdate instead of MomentumSGD + WeightedSum. This brings major perf benefits.

Changes resnet50 trainer to use optimizer.

This relies on D5133652
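
A hedged sketch of the difference in caller code (function and keyword names assumed to mirror data_parallel_model and the optimizer module, not copied from this diff):
```
from caffe2.python import data_parallel_model, optimizer

def add_optimizer(model):
    # Called once for the whole model; build_sgd attaches the parameter
    # updates (MomentumSGDUpdate-style) instead of per-GPU hand-written ops.
    optimizer.build_sgd(model, base_learning_rate=0.1,
                        policy="fixed", momentum=0.9)

def parallelize(model, add_inputs, build_forward, gpus):
    # Sketch only: optimizer_builder_fun replaces param_update_builder_fun.
    data_parallel_model.Parallelize_GPU(
        model,
        input_builder_fun=add_inputs,
        forward_pass_builder_fun=build_forward,
        optimizer_builder_fun=add_optimizer,
        devices=gpus,
    )
```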

Reviewed By: dzhulgakov

Differential Revision: D5142973

fbshipit-source-id: 98e1114f5fae6c657314b3296841ae2dad0dc0e2
2017-05-30 12:49:57 -07:00
Luke Yeager
0a9684c3b9 Mark in-place GPU dropout as broken, add test
Summary:
I'll let y'all decide how you want to fix this (probably need a persistent curand buffer). Here's a test to verify the fix.
Closes https://github.com/caffe2/caffe2/pull/495

Differential Revision: D5148815

Pulled By: akyrola

fbshipit-source-id: e80dabe65230ddd32340f2d872cd8786ac960bf8
2017-05-30 12:35:22 -07:00
Aapo Kyrola
44257ea5ed automatically infer device scope for param
Summary:
hankun is using the optimizer, but with a mixed set of GPU and CPU operators. Currently this won't work with the optimizer, since it adds optimizers for all parameters in the current device scope. But we can actually infer the device that a param belongs to by looking at the device option in the param_init_net.

Added a test as well.
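
The inference amounts to scanning param_init_net for the op that creates each parameter and reading its device_option; a rough sketch (not the actual helper):
```
from caffe2.proto import caffe2_pb2

def infer_param_device(param_init_net, param_name):
    """Return the DeviceOption of the op in param_init_net that outputs
    `param_name`, falling back to CPU if no such op is found (sketch only)."""
    for op in param_init_net.Proto().op:
        if param_name in op.output:
            return op.device_option
    return caffe2_pb2.DeviceOption(device_type=caffe2_pb2.CPU)
```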

Reviewed By: salexspb

Differential Revision: D5133652

fbshipit-source-id: ad8689d75ac1f5c78981bae1b6978fe91e40ef0f
2017-05-30 12:02:19 -07:00
Luke Yeager
6b1cf26380 Fix for dpm when GPUs don't have p2p access
Summary:
See discussion at https://github.com/caffe2/caffe2/pull/633#issuecomment-303536902

Tested with a TitanX (Pascal) and a TitanZ (Kepler) with this access pattern.
```
Checking GPU(s) for support of peer to peer memory access...
> Peer access from TITAN X (Pascal) (GPU0) -> GeForce GTX TITAN Z (GPU1) : No
> Peer access from TITAN X (Pascal) (GPU0) -> GeForce GTX TITAN Z (GPU2) : No
> Peer access from GeForce GTX TITAN Z (GPU1) -> TITAN X (Pascal) (GPU0) : No
> Peer access from GeForce GTX TITAN Z (GPU1) -> GeForce GTX TITAN Z (GPU2) : Yes
> Peer access from GeForce GTX TITAN Z (GPU2) -> TITAN X (Pascal) (GPU0) : No
> Peer access from GeForce GTX TITAN Z (GPU2) -> GeForce GTX TITAN Z (GPU1) : Yes
```
All combinations pass:
* `0,1`
* `0,2`
* `1,2`
* `0,1,2`
Closes https://github.com/caffe2/caffe2/pull/659

Differential Revision: D5148779

Pulled By: akyrola

fbshipit-source-id: 6263edfe8b36623983f1946b5c3f4a3fef415a45
2017-05-30 12:02:19 -07:00
Luke Yeager
a47652379f Fix SparseAdagrad for indices.ndim>1
Summary:
Same fix as https://github.com/caffe2/caffe2/pull/249, but for SparseAdagrad.

Also update the tests for both ops to test this functionality.
Closes https://github.com/caffe2/caffe2/pull/675
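
A numpy sketch of why ndim>1 indices need care: the index array has to be flattened (and the gradient reshaped to one row per index) before the per-row update. This is a reference illustration, not the operator code:
```
import numpy as np

def sparse_adagrad_reference(param, moment, indices, grad, lr, epsilon=1e-6):
    """Reference update that accepts indices of any ndim (sketch only)."""
    flat_idx = indices.reshape(-1)               # e.g. (2, 3) -> (6,)
    flat_grad = grad.reshape(flat_idx.size, -1)  # one gradient row per index
    for i, row in zip(flat_idx, flat_grad):
        moment[i] += row * row
        param[i] -= lr * row / (np.sqrt(moment[i]) + epsilon)
    return param, moment
```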

Differential Revision: D5148750

Pulled By: akyrola

fbshipit-source-id: d30b722429bc547fd53400c1a29e4ee9e2e6ed18
2017-05-30 12:02:18 -07:00
Luke Yeager
16b240145a Fixing some tests
Summary:
As dzhulgakov said at https://github.com/caffe2/caffe2/pull/227#issuecomment-295084443, it would be nice to avoid this stream of CPU-only test fixes.

The second fix could have been avoided if tests were run on TravisCI. I think the TravisCI infra could be greatly improved if we used ccache like your colleagues at PyTorch: https://github.com/pytorch/pytorch/pull/614. Would you be interested in a PR which does this?
Closes https://github.com/caffe2/caffe2/pull/547

Differential Revision: D5147405

Pulled By: akyrola

fbshipit-source-id: 5e9a4571d364c5f0ed8a5e216c9b6136dd4d10be
2017-05-30 09:16:48 -07:00
Luke Yeager
dc517b6c42 Change hypothesis settings for slow memonger test
Summary:
Failure mode:
```
  - 7 passing examples, 0 failing examples, 0 invalid examples
  - Typical runtimes: 12-14987 ms
  - Stopped because settings.timeout=60
```
After this change:
```
  - 5 passing examples, 0 failing examples, 0 invalid examples
  - Typical runtimes: 12-15475 ms
  - Stopped because settings.max_examples=5
```
Obviously, the `DYNAMIC_PROGRAMMING` tests are the troublemakers. An alternate solution would be to make separate tests for the two assignment algorithms (one fast, one slow).
Closes https://github.com/caffe2/caffe2/pull/676
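
The change boils down to bounding the test by example count rather than wall-clock time; a generic hypothesis sketch (values illustrative):
```
from hypothesis import given, settings, strategies as st

# Capping max_examples keeps slow cases (like the DYNAMIC_PROGRAMMING
# assignment) from being cut off mid-run by settings.timeout.
@settings(max_examples=5)
@given(n=st.integers(min_value=1, max_value=10))
def test_slow_assignment(n):
    assert n >= 1
```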

Differential Revision: D5147363

Pulled By: akyrola

fbshipit-source-id: 85d9f8198e53c10de2a8d6645e2b0eb7953c96e0
2017-05-30 09:16:48 -07:00
Simon Layton
2c3071fc4e Rework initializers to pass a class not object
Summary:
Changed tests
Moved to WeightInitializer, BiasInitializer keywords
Closes https://github.com/caffe2/caffe2/pull/682

Reviewed By: Yangqing

Differential Revision: D5138769

Pulled By: salexspb

fbshipit-source-id: 81d266100b2a95c64c0196c16670dfd34ea03e02
2017-05-30 09:06:56 -07:00
Huazhong Ning
660dd58022 fix for realtime training.
Reviewed By: kennyhorror

Differential Revision: D5068298

fbshipit-source-id: 0dc3580c9c8123368a3625fb654c6eaf1dc4a950
2017-05-26 23:49:40 -07:00
Jiyan Yang
6aff754dbc Add batch normalization layer
Summary: As desc.

Reviewed By: xianjiec

Differential Revision: D5077230

fbshipit-source-id: f73cdedac6d9a3542f8ef829b54fb4c713dcafd0
2017-05-26 16:46:52 -07:00
Thomas Dudziak
ec19b4bd7b Import fixes for Python 3
Summary: As title

Differential Revision: D5135990

fbshipit-source-id: 88cb15bb2fb97dd21faf3ea5ddb8d4dbff7fad93
2017-05-26 16:31:50 -07:00
Thomas Dudziak
3ccbf23132 String-related fixes for Python 3
Summary: This diff is one step towards enabling the Python 3 build by making Caffe2 more diligent in its handling of strings.

Reviewed By: salexspb

Differential Revision: D4893083

fbshipit-source-id: 28b8adf3280e8d1f0a7dc9b0fee5ad53f2fada57
2017-05-26 16:04:32 -07:00
Anmol Kalia
7f98dc28cb Refactored spatial softmax
Summary: Refactored SoftmaxWithLoss by removing the code for spatial=1 mode and created a new op SpatialSoftmaxWithLoss that has the spatial mode implemented.

Reviewed By: viswanathgs

Differential Revision: D5104120

fbshipit-source-id: 8ab999e32c916b2a39a670a7b2a3365401535f24
2017-05-26 14:50:43 -07:00
Ahmed Taei
75a6f909c5 Add option to enable memonger for gradients and add param_names for save_model.
Reviewed By: akyrola

Differential Revision: D5131493

fbshipit-source-id: 7c159ccffa30eb064c157e559f1d8f0350f03ccb
2017-05-26 11:31:35 -07:00
Dmytro Dzhulgakov
35eaf444c0 Quickly hack sparsenn_benchmarks to also do BenchmarkNet
Summary:
Makes benchmark a bit hacky, but it's a benchmark after all :)

Specifically ports functionality of proper BenchmarkNet run from the ads_benchmarks so that we can see training net perf.

Also adds --report_interval parameter to print stats more often when running in hogwild mode

kdub0 - hopefully, if you have time, you can integrate it properly with Flow's workflow

harouwu - shouldn't conflict too much with your current diff

Reviewed By: rayleichen

Differential Revision: D5125183

fbshipit-source-id: 9c6f1663bc85e26d6609f0f2f23aa280731939db
2017-05-26 10:48:45 -07:00
Aapo Kyrola
d60a2e3c58 UnsortedSegmentSum/Mean for CUDA
Summary:
To make optimizer for sparse gradients work with CUDA, we need UnsortedSegmentSum and Mean implemented for CUDA. Unique was already implemented by harouwu.

Pretty straightforward implementations, should be fast enough -- and I don't know a faster way anyway.

Added some tests as well.
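
A numpy reference of the two reductions (the CUDA kernels compute the same thing); sketch only:
```
import numpy as np

def unsorted_segment_sum(data, segment_ids, num_segments):
    """Sum rows of `data` into output row segment_ids[i]; ids need not be sorted."""
    out = np.zeros((num_segments,) + data.shape[1:], dtype=data.dtype)
    np.add.at(out, segment_ids, data)      # scatter-add; duplicates accumulate
    return out

def unsorted_segment_mean(data, segment_ids, num_segments):
    sums = unsorted_segment_sum(data, segment_ids, num_segments)
    counts = np.bincount(segment_ids, minlength=num_segments).reshape(
        (-1,) + (1,) * (data.ndim - 1))
    return sums / np.maximum(counts, 1)    # empty segments stay zero
```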

Reviewed By: asaadaldien

Differential Revision: D5124548

fbshipit-source-id: 63ae72f45fc2f07470603f7b2de12f34635dbb3d
2017-05-26 09:33:49 -07:00
Luke Yeager
97159810c9 Restore compatibility with protobuf2
Summary:
Addresses an issue with 417f74509e.
```
>               operators.append(proto.op.pop())
E               AttributeError: 'RepeatedCompositeFieldContainer' object has no attribute 'pop'
```
/cc jhcross
Closes https://github.com/caffe2/caffe2/pull/658
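
Repeated fields in the protobuf 2 Python bindings don't have .pop(); a portable pattern (a sketch, not necessarily the exact patch) is to copy the element and delete it by index:
```
from caffe2.proto import caffe2_pb2

def pop_last_op(net_proto):
    """Remove and return the last operator from a NetDef; works with both the
    protobuf 2 and protobuf 3 Python bindings."""
    last_op = caffe2_pb2.OperatorDef()
    last_op.CopyFrom(net_proto.op[-1])  # copy before mutating the container
    del net_proto.op[-1]                # repeated fields support `del`, not .pop()
    return last_op
```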

Reviewed By: dzhulgakov

Differential Revision: D5130382

Pulled By: salexspb

fbshipit-source-id: 34e0c39aad5f339c1aaa1506af3e7495193565f4
2017-05-26 08:47:24 -07:00
Alexander Sidorov
016f72537a ModelHelper.create_param, Initializer abstraction and ParameterInfo for optimizers
Summary:
This is going to unblock Nvidia in their work on adding fp16
support to Caffe2. I discussed this with kennyhorror before to make
sure this fits into his work on parameter sharing.

Reviewed By: kennyhorror

Differential Revision: D5127797

fbshipit-source-id: 4db155d320b1862570c23b77c4252bdacbf2296f
2017-05-25 22:03:15 -07:00
Andrey Malevich
6c12df3003 Fix export of SparseToDense layer.
Summary:
If there are 2 SparseToDense layers densifying the same IdList feature, we
might end up exporting invalid input for the prediction in the input specs.
This diff changes the behavior to use an Alias to a new blob instead of passing
things through directly.

Reviewed By: dzhulgakov

Differential Revision: D5093754

fbshipit-source-id: ef4fa4ac3722331d6e72716bd0c6363b3a629cf7
2017-05-25 21:46:28 -07:00
Jiyan Yang
9bf1f16255 Add bias to cosine distance for two tower models
Summary: Currently, using two-tower models with cosine distance results in bad calibration. Adding a bias to the output of the cosine term solves the problem.

Reviewed By: xianjiec

Differential Revision: D5132606

fbshipit-source-id: eb4fa75acf908db89954eeee67627b4a00572f61
2017-05-25 19:50:20 -07:00
Zhicheng Yan
2002018603 memory_leak_data_worker
Summary: A memory leak happens when new BlobReferences are constantly added to the set _scratch_blobs.

Reviewed By: panshen1

Differential Revision: D5134945

fbshipit-source-id: 3ce4d482153bb89de065f20cd91411178085caad
2017-05-25 19:22:03 -07:00
Pieter Noordhuis
a9b5efe3c2 Expose max collective concurrency
Summary:
This was hardcoded at 4 before but should be made
configurable. Can be kept low for big MLPs and higher for convnets.

Reviewed By: akyrola

Differential Revision: D5126138

fbshipit-source-id: 713ee8bbeb243b7de1479808fd6398d397e0b49a
2017-05-25 13:32:40 -07:00
Mohamed Fawzy
e35a4fe5cc Implement SizeOp as requested in GitHub issue #583
Summary:
Implement SizeOp that returns the number of elements in the input
tensor.

Output is a 1D tensor that contains the number of elements.
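
A hedged usage example from Python (blob names are placeholders; the op type is assumed to be registered as "Size"):
```
import numpy as np
from caffe2.python import core, workspace

workspace.FeedBlob("X", np.random.rand(2, 3, 4).astype(np.float32))
workspace.RunOperatorOnce(core.CreateOperator("Size", ["X"], ["num_elements"]))

# A 1D tensor holding the element count: here 2 * 3 * 4 = 24.
print(workspace.FetchBlob("num_elements"))
```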

Reviewed By: akyrola

Differential Revision: D5101061

fbshipit-source-id: d1c56053b6f3b41c65ac574dd748482775d1ea0d
2017-05-25 11:07:35 -07:00
Artem Volkhin
55d293f730 remove non-existing blobs from output_schema in layer_model_instantiator
Summary: In some cases (for example, when include_tags option is used) output_schema contains blobs that aren't produced by the generated net. In this case we want to filter them from output_schema as well.

Differential Revision: D5120115

fbshipit-source-id: f98ea3f747589390b039d1e1987becec3980634c
2017-05-25 00:36:19 -07:00
Aapo Kyrola
da6b82b810 fix another bug related to in-place ops --> treat in-place ops like any other
Summary:
D5116828 changed how in-place ops were handled in memonger and fixed a crash in NeuralMT. However, it still produced incorrect memongerization, because an op with one in-place input-output but another non-in-place output would still be handled incorrectly, as the other output's branch would not be followed properly.

This is fixed by removing the whole in-place-op special handling. It is not needed anymore; it was leftover from an older version of memonger that used a topological sort of the ops.

Reviewed By: asaadaldien

Differential Revision: D5128142

fbshipit-source-id: b551b0faebdde410e6bd7516958c63cf610cc065
2017-05-24 23:32:03 -07:00
Deepak Gopinath
33c40e8a6e Handling shared indices in sparse gradient updates
Summary: When two or more blobs are gathered by the same indices blob in a data parallel model, we used to concatenate multiple times and re-write to the same indices blob. This leads to illegal memory access at times because the gradientslice indices blob is longer than its corresponding gradientslice values blob. This diff adds a check in order to avoid this.

Reviewed By: akyrola

Differential Revision: D5116817

fbshipit-source-id: 1c086d092eb6d48926d600f9408f578f5ddc41c7
2017-05-24 22:47:00 -07:00
Aapo Kyrola
f2303ccb77 fix tileop test
Summary: The gradient test for the tile op was flaky because I had made the dimensions too large. This caused push-blocking errors. Also, I noticed my test_grad_tile was incorrect.

Reviewed By: asaadaldien

Differential Revision: D5126476

fbshipit-source-id: ae9ce5d9041648d7a4535fc88d4013e669bd6f02
2017-05-24 18:32:01 -07:00
Bram Ton
4da076d3e9 Fixed typo caffe_translator.py, fixes bug #397
Summary:
Fixed minor typo in python/caffe_translator.py. Fixes #397.
Closes https://github.com/caffe2/caffe2/pull/412

Differential Revision: D4950875

Pulled By: aaronmarkham

fbshipit-source-id: 07183c6d6e8e97451bb5ee5ff01a88553d6bdb82
2017-05-24 12:18:32 -07:00