Commit Graph

80 Commits

Newsha Ardalani
0fb58d76a1 Support ArgMin in c2_pt_converter
Summary:
+ Add ArgMin support to the Caffe2-to-PyTorch converter
+ Use Hypothesis to parameterize the test across different conditions (see the sketch below)
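
A minimal sketch of the equivalence such a test asserts, assuming the public Caffe2 Python bindings and the `axis`/`keepdims` arguments the operator catalogue lists for ArgMin; the converter's internal API is not shown:

```
import numpy as np
import torch
from caffe2.python import core, workspace
from hypothesis import given, strategies as st
import hypothesis.extra.numpy as hnp

@given(
    x=hnp.arrays(np.float32, hnp.array_shapes(min_dims=2, max_dims=2),
                 elements=st.floats(-1e3, 1e3, width=32),
                 unique=True),  # unique elements avoid argmin tie-breaking
    axis=st.integers(0, 1),
)
def test_argmin_matches_torch(x, axis):
    # Run the Caffe2 op, then compare against the PyTorch equivalent.
    workspace.FeedBlob("X", x)
    workspace.RunOperatorOnce(
        core.CreateOperator("ArgMin", ["X"], ["idx"], axis=axis, keepdims=0))
    np.testing.assert_array_equal(
        workspace.FetchBlob("idx"),
        torch.argmin(torch.from_numpy(x), dim=axis).numpy())
```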

Test Plan: buck test //caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test

Reviewed By: houseroad

Differential Revision: D25016203

fbshipit-source-id: 94489fcf1ed3183ec96f9796a5b4fb348fbde5bc
2020-12-05 16:35:34 -08:00
Rahul Manghwani
142b21fd44 Add SparseLengthsSum4BitRowwiseSparse in c2_pt_converter (#48240)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48240

Adds support for converting the SparseLengthsSum4BitRowwiseSparse operator from Caffe2 to PyTorch as part of c2_pt_converter
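
For reference, a plain-NumPy sketch of the dense semantics this operator family implements; the 4-bit rowwise quantization and the pruned/compressed index mapping are deliberately omitted:

```
import numpy as np

def sparse_lengths_sum(data, indices, lengths):
    # Segment-sum rows of `data` gathered by `indices`:
    # lengths[i] consecutive gathered rows form output segment i.
    out, pos = [], 0
    for n in lengths:
        out.append(data[indices[pos:pos + n]].sum(axis=0))
        pos += n
    return np.stack(out)

data = np.arange(12, dtype=np.float32).reshape(4, 3)
print(sparse_lengths_sum(data, indices=[0, 2, 3, 1], lengths=[2, 2]))
# [[ 6.  8. 10.]   rows 0 + 2
#  [12. 14. 16.]]  rows 3 + 1
```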

Test Plan:
Added a unit test

buck test //caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test

Tests passed:
https://our.intern.facebook.com/intern/testinfra/testrun/2251799856412296

Reviewed By: houseroad

Differential Revision: D25067833

fbshipit-source-id: 45cbc331ca35bee27e083714e65a1e87a2a2d2e0
2020-12-04 14:16:25 -08:00
Frank Seide
29f0e1e2ce Fused8BitRowwiseQuantizedToFloat operator support (#48407)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48407

T79817692: Fused8BitRowwiseQuantizedToFloat operator support for c2_pt_converter.

Also refactored some repeated code out of the existing test functions. (The initial commit contains only the refactoring.)
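
As background, a NumPy sketch of the fused 8-bit rowwise scheme this op decodes (assumption: each row carries a float32 scale and bias, and dequantization is x ≈ q * scale + bias):

```
import numpy as np

def fused_8bit_rowwise_quantize(x):
    # Per-row affine quantization; scale and bias (= row min) travel with the row.
    mn, mx = x.min(axis=1), x.max(axis=1)
    scale = np.maximum((mx - mn) / 255.0, 1e-8)  # guard constant rows
    q = np.round((x - mn[:, None]) / scale[:, None]).astype(np.uint8)
    return q, scale.astype(np.float32), mn.astype(np.float32)

def fused_8bit_rowwise_dequantize(q, scale, bias):
    # The direction this op implements: x ~= q * scale + bias.
    return q.astype(np.float32) * scale[:, None] + bias[:, None]
```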

Test Plan: buck test //caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test

Reviewed By: bugra

Differential Revision: D25069936

fbshipit-source-id: 72f6a845a1b4639b9542c6b230c8cd74b06bc5a0
2020-11-30 17:11:39 -08:00
Jonathan Kwok
a3e08e5344 Support ReduceSum in c2_pt_converter (#47889)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47889

Adds support for converting the [caffe2 ReduceSum](https://caffe2.ai/docs/operators-catalogue#reducesum) operator to torch.
ghstack-source-id: 116580127
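
The conversion contract, sketched against the public bindings (ReduceSum takes `axes` and `keepdims`, per the linked catalogue entry):

```
import numpy as np
import torch
from caffe2.python import core, workspace

x = np.random.rand(2, 3, 4).astype(np.float32)
workspace.FeedBlob("X", x)
workspace.RunOperatorOnce(
    core.CreateOperator("ReduceSum", ["X"], ["Y"], axes=[1], keepdims=0))
# The converted torch graph must compute the same reduction.
np.testing.assert_allclose(
    workspace.FetchBlob("Y"),
    torch.sum(torch.from_numpy(x), dim=1).numpy(),
    rtol=1e-6)
```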

Test Plan:
buck test //caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test : [results](https://our.intern.facebook.com/intern/testinfra/testrun/6755399466095119)

    ✓ ListingSuccess: caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test - main (60.273)
    ✓ Pass: caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test - test_sub_op (caffe2.torch.fb.model_transform.c2_convert.c2_pt_converter_test.C2PTConverterTest) (101.119)
    ✓ Pass: caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test - test_layer_norm_conversion (caffe2.torch.fb.model_transform.c2_convert.c2_pt_converter_test.C2PTConverterTest) (101.404)
    ✓ Pass: caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test - test_local_model_conversion (caffe2.torch.fb.model_transform.c2_convert.c2_pt_converter_test.C2PTConverterTest) (101.966)
    ✓ Pass: caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test - test_reduce_sum (caffe2.torch.fb.model_transform.c2_convert.c2_pt_converter_test.C2PTConverterTest) (114.896)

Reviewed By: bugra

Differential Revision: D24925318

fbshipit-source-id: 3f3b791eff1b03e8f5adee744560fe8bc811c659
2020-11-13 12:02:58 -08:00
Alberto Alfarano
59e96c55f7 Support MatMul in c2_pt_converter
Summary: Added conversion support for the Caffe2 MatMul operator

Test Plan: buck test //caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test

Reviewed By: bugra

Differential Revision: D24920937

fbshipit-source-id: 7ba09ba0439cb9bd15d6a41fd8ff1a86d8d11437
2020-11-12 20:56:58 -08:00
Bugra Akyildiz
c26c4690fe Add sub operator
Summary: Add the Sub operator for Caffe2

Test Plan:
```
buck test //caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test
```

Reviewed By: houseroad

Differential Revision: D24685090

fbshipit-source-id: 60d745065d01b634ebd3087e533d8b9ddab77a1f
2020-11-06 12:31:17 -08:00
Bugra Akyildiz
27c7158166 Remove __future__ imports for legacy Python2 supports (#45033)
Summary:
The `2to3` tool has a fixer called `future` which you can target specifically to remove these; the `caffe2` directory has the most redundant imports:

```2to3 -f future -w caffe2```
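
The `future` fixer only deletes the legacy compatibility imports; for example:

```
# Before: a typical module header kept for Python 2 support
from __future__ import absolute_import, division, print_function, unicode_literals

# After `2to3 -f future -w <file>`: the line above is removed. Nothing else
# changes, since Python 3 has these behaviors by default.
```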

Pull Request resolved: https://github.com/pytorch/pytorch/pull/45033

Reviewed By: seemethere

Differential Revision: D23808648

Pulled By: bugra

fbshipit-source-id: 38971900f0fe43ab44a9168e57f2307580d36a38
2020-09-23 17:57:02 -07:00
Stanislau Hlebik
b774ce54f8 remediation of S205607
fbshipit-source-id: 798decc90db4f13770e97cdce3c0df7d5421b2a3
2020-07-17 17:19:47 -07:00
Stanislau Hlebik
8fdea489af remediation of S205607
fbshipit-source-id: 5113fe0c527595e4227ff827253b7414abbdf7ac
2020-07-17 17:17:03 -07:00
Brian Wignall
f326045b37 Fix typos, via a Levenshtein-type corrector (#31523)
Summary:
Should be non-semantic.

Uses https://en.wikipedia.org/wiki/Wikipedia:Lists_of_common_misspellings/For_machines to find likely typos, with https://github.com/bwignall/typochecker to help automate the checking.

Uses an updated version of the tool used in https://github.com/pytorch/pytorch/pull/30606 .
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31523

Differential Revision: D19216749

Pulled By: mrshenli

fbshipit-source-id: 7fd489cb9a77cd7e4950c1046f925d57524960ea
2020-01-17 16:03:19 -08:00
Xiaomeng Yang
271f005eeb Add elementwise_affine for LayerNormGradientOp (#19982)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19982

Add elementwise_affine for LayerNormGradientOp

Reviewed By: houseroad

Differential Revision: D15157493

fbshipit-source-id: 7465f2c1d4df4649b4903b93483c4861e9c7afa9
2019-05-03 15:33:46 -07:00
Jerry Zhang
ff0a7ae43f Testing for folded conv_bn_relu (#19298)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19298

Proper testing for conv_bn_relu folding

Differential Revision: D13998891

fbshipit-source-id: ceb58ccec19885cbbf38964ee0d0db070e098b4a
2019-04-16 19:04:06 -07:00
Xiaomeng Yang
49f87320ba
[Caffe2] Add full impl of GroupNorm (#7058)
* Add full impl of GroupNorm

* Fix comments in math.h

* Remove unused buffers

* Add #include <array> in gpu version

* Remove unused moments_buffer_

* Make inverse std a template.

* Add detailed comments
2018-04-29 11:26:40 -07:00
Yinghai Lu
ef8f556212
[Caffe2] Changes done inside Facebook (#6378)
* fix unit test for sqrt op

From the error logging:

[idx, grad, grad_estimate] are:
[[ 146.            0.5           0.45776367]
 [ 147.            0.5           0.45776367]

The gradient == 0.5 is correct, which means the SqrtOp and its gradient are doing the right job. (Because y = sqrt(x), loss = y^2/2 = x/2, and then d(loss)/dx = 1/2 = 0.5.)

The test failed because of a numerical problem in grad_estimate (in the unit test). This can happen because the step_size is small and float precision is limited (when there are multiple elements in the tensor, we compute sum(y^2) as the loss).

This diff
- increases the step size, and moves the test cases further away from 0 (where the gradient of sqrt(x) is not well defined) to be safe :)
- also cleans up and merges the test cases for inplace vs. non-inplace

Tested with:

`CAFFE2_HYPOTHESIS_PROFILE=debug ai_bt caffe2/caffe2/python/operator_test:elementwise_ops_test -- "test_sqrt"`
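
For illustration, a sketch of the kind of central-difference estimate the checker computes, showing why a larger step and inputs away from 0 reduce float32 noise:

```
import numpy as np

def grad_estimate(f, x, step):
    # Central-difference numerical gradient (sketch of the checker's idea).
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e.flat[i] = step
        g.flat[i] = (f(x + e) - f(x - e)) / (2 * step)
    return g

loss = lambda x: (np.sqrt(x) ** 2).sum() / 2.0  # = sum(x)/2, so d/dx = 0.5
x = np.full(2, 5.0, dtype=np.float32)
print(grad_estimate(loss, x, step=1e-2))  # ~[0.5, 0.5]
```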

* CompositeReader & CompositeReaderBuilder

A new type of reader gluing multiple readers together.

* Back out "Revert D7394363: [GanH]: Log D Trick for Cross Entropy with Sigmoid"

Original commit changeset: 9325a4356dbe

* [dai][WIP] convert params to int8 on ps before sending to trainer

Add float->uint8 conversion in addition to float->fp16 conversion in model_saver.

* [easy] improve unit test for sparse length sum ops

as desc.

#accept2ship

* Update GitHub upstream to 771fcb3455

* move sparse hash unique ops to OOS and add unit tests

- move the SparseHash version to OOS, since 'sparsehash' is already a dependency of caffe2 OOS: https://fburl.com/arssw4n1
- the 'SparseHash' engine is also being used in OOS, so the SparseHash version should live in OOS to reduce confusion: https://fburl.com/o5ea7ah2

- fix the CUDA UniqueOp for the case when batch is empty.
- add unit test

* group_norm_op for caffe2

This is the cuda op for Group Normalization (GN): https://arxiv.org/abs/1803.08494

This code implements GN in one op that computes Y = gamma * (X - mu) / sigma + beta and also its gradients. It is expected to have minimal memory consumption (similar to the BN op), without creating the extra blobs that would be needed if GN were implemented as several ops (e.g., reshape, norm_mean/std, affine_channel).
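
For reference, a plain-NumPy sketch of the forward computation, assuming NCHW input split into G groups:

```
import numpy as np

def group_norm(x, gamma, beta, G, eps=1e-5):
    # x: (N, C, H, W); gamma, beta: (C,). Normalize each group of C//G channels.
    N, C, H, W = x.shape
    g = x.reshape(N, G, (C // G) * H * W)
    mu = g.mean(axis=2, keepdims=True)
    sigma = np.sqrt(g.var(axis=2, keepdims=True) + eps)
    y = ((g - mu) / sigma).reshape(N, C, H, W)
    return gamma[None, :, None, None] * y + beta[None, :, None, None]
```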

* Resubmit D7405233: disappeared in D7464958

The OOS publish caused the op to go missing -- however, the test was still there

* [c2] add sparse hash engine for cuda unique op

The SparseHash version of UniqueOp copies the input tensor to CPU, uses a sparse hash map to compute the unique output, and then copies back to GPU.

* [dper][gpu] enable unit testing gpu trainer for sparse nn

To debug the GPU trainer using mock data in a unit test.

This makes it easier to develop the GPU trainer for new models.

* Reuse Gloo context for Synchronize() calls

Previously we were creating (and leaking) the Gloo context on each call to Synchronize(). Now we only run the common world op and create the barrier net once, then run the barrier net on each Synchronize() call. Since the timeout is associated with the Gloo context, assert that the timeout is fixed instead of trying to handle the complexity of multiple timeouts (and associated contexts).

* [GanH/WGAN][1/n]: add FC param clipping

as titled

* [mobile] minimizing changes between caffe2_benchmark and speed_benchmark

* [GanH]: enable diagnose within model

avoid having to find blob names; instead, enable diagnosis directly inside the model

* Add `net_transformer_fun` option to DPM

This callback allows for various transformations to be made to the
model after gradient operators have been added. The immediate motivation for
this is to allow transformations such as "checkpoint-and-recompute" which
allow trading off memory for additional compute.

Adding several callbacks like this has made DPM's API less than ideal at this
stage. However, I could not find any reasonable alternative.

* [DT] [33/n] Compile flow task groups

task groups need to be compiled in order to pickle the object in fblearner. I also changed the Job's compile function, since creating a new object is not necessary.

* Initial commit for sparse_normalize vectorization and benchmark

* [GanH]: LB Calibration for JSD

as titled

* Tracing event in async executor

Adding event tracing through TRACE_EVENT macro in async executor

* [Resubmit] D7409751 Resetting book-keeping blobs when the reservoir is reset

D7409751 got lost in D7464958

* Visualizing realtime weights values

We want to visualize the weight values as the optimizer iterates. This diff supports visualizing the weights at an assigned index.
Currently, we assume the blob is 2-dimensional.

* [GanH][Easy]: Fix Homotopy Weighting

apparently, there was a bug in the homotopy weight (alpha, beta) update

* [c2] move sparse hash unique op out of oss

so that OSS does not need to depend on the Google hash map.

* Get rid of std::round as it's not supported on Android

* Revert changes on setup.py

* Skip shaky test on Dataio

* fix
2018-04-10 21:11:43 -07:00
Orion Reblitz-Richardson
1d5780d42c Remove Apache headers from source.
* LICENSE file contains details, so removing from individual source files.
2018-03-27 13:10:18 -07:00
Kutta Srinivasan
b4b2f0d2cc Work on fp16 conv op 2018-03-05 21:13:03 -08:00
Junjie Bai
b11ba65204 Experimental support for setup.py develop mode install
Summary:
`python setup.py develop` / `pip install -e .`
Closes https://github.com/caffe2/caffe2/pull/1926

Reviewed By: orionr

Differential Revision: D6951780

Pulled By: bddppq

fbshipit-source-id: 01249cbca90ec5326ea4107d4e500ae95a9dbd7b
2018-02-12 23:36:18 -08:00
Zhicheng Yan
d79a31761e rectangle_cropping_multi_cropping_color_jittering_lighting
Summary:
Change log
- Support rectangle cropping, where the height and width of the clip crop can be set separately. This is useful when most video resolutions are non-square, such as 240p, 360p and 480p, where width is significantly larger than height.
  - Comparisons of training on ucf101 between using 112x112 croppings and using 112x144 cropping.
  - https://fburl.com/i0rw6y1k
- Support 14 croppings per video clip at the testing stage to improve classification accuracy: take the left-top, central-top, right-top, left-bottom, central-bottom, right-bottom and central-central croppings as well as their mirrorings; 7 positions x 2 = 14 croppings in total.
   - Comparisons on the same model trained on UCF-101. Use 1 clip per video
      - RGB. f41014306, w/o Vs f41014868, w/ multi-cropping: `0.64099 Vs 0.65796`
      - OF. f41014889, w/o Vs f41014913, w/ multi-cropping: `0.65796 Vs 0.67624`

- Support color jittering and color lighting on RGB data for training data augmentation.
  - Comparisons of training on ucf101 from scratch with and without color jittering and lighting:
  - https://fburl.com/k69zatul

Reviewed By: HengCV

Differential Revision: D6962620

fbshipit-source-id: 9b43478945874142727fea351ee04417218e6606
2018-02-12 16:39:06 -08:00
Alexander Sidorov
a3b8c459d4 Revamp MNIST tutorial
Summary:
Main changes:

1. Move reader creation to Brew in order to be consistent and avoid a wild use of param_init_net
2. Use optimizers for training function, avoid manual optimizer construction
3. Add MLP mode (a default)
4. Fix a bunch of overly verbose comments and add a bit of new explanation
Closes https://github.com/caffe2/caffe2/pull/1760

Differential Revision: D6749059

Pulled By: salexspb

fbshipit-source-id: 9dfbbb2d9772a74a0300c2e404a92e791f7cc593
2018-01-26 09:17:31 -08:00
Ilia Cherniavskii
79ac146808 Add if and while ops to brew
Summary:
Adding if and while control ops to brew, also adding unit tests
Note: unlike net_builder, where we can figure out which blobs are external and which ones are local to subnets, here in brew we need to use the external_blobs param explicitly to point at external blobs

Reviewed By: harouwu

Differential Revision: D6440508

fbshipit-source-id: c920f0af84b77ccb2d8462ffc7567bb1908c844a
2017-12-05 17:33:34 -08:00
James Cross
2c190d2f05 update transformer code for layer_norm() API change
Summary: Quick fix for unit test broken by D6454290. This is my fault for approving while the tests covering the single callsite were broken.

Reviewed By: goldsborough

Differential Revision: D6466566

fbshipit-source-id: 2683be3d6bb184286e64fbde3e572946e39030c7
2017-12-01 20:19:31 -08:00
Peter Goldsborough
b43c1b2bed Fix and upgrade brew.layer_norm
Summary:
While working on layer normalization for LSTMs I encountered an issue where the layer norm parameters (which are the scale/gain and bias/shift from the paper) were not registered in the model for `brew.layer_norm`. salexspb explained that this is because it was using the `init_net_param` API instead of `create_param`. This diff fixes this.

While fixing this, I noticed that `brew.layer_norm` actually had a bug where it was multiplying by the bias instead of adding it. Another issue was that the function gave the scale and bias a shape of `[1]`; however, the paper (https://arxiv.org/pdf/1607.06450.pdf) specifies that, as for batch norm, there is one scale and bias parameter per neuron, i.e. the shape should be `[1, axis_dimension]`. The API now takes an explicit `dim_in` parameter (also more consistent with the other normalization functions in that module) so that this can be specified. See the tests for how this now looks.
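
A NumPy sketch of the corrected math, with per-neuron gain and bias of shape `[1, dim_in]` as the paper specifies:

```
import numpy as np

def layer_norm(x, gain, bias, eps=1e-5):
    # x: (batch, dim_in); gain, bias: (1, dim_in) - one parameter per neuron.
    mu = x.mean(axis=1, keepdims=True)
    sigma = np.sqrt(x.var(axis=1, keepdims=True) + eps)
    return gain * (x - mu) / sigma + bias  # bias is added, not multiplied
```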

Reviewed By: jhcross

Differential Revision: D6454290

fbshipit-source-id: fc00ca614de3190c40ab743e8984bec9e85fb58c
2017-12-01 14:18:28 -08:00
Aapo Kyrola
14f95c2782 Updated brew SpatialBN to use initializers
Summary: Updated brew SpatialBN to use initializers, similar to other brew ops such as conv and fc, instead of initializing all of its parameters itself within the brew call.

Reviewed By: asaadaldien

Differential Revision: D5840359

fbshipit-source-id: 9f3d688d4957605eaf7ecd2488bc26bfb1da3f78
2017-11-02 11:25:45 -07:00
Aapo Kyrola
669ec0ccba Added FP16 compute support to FC Op
Summary: Allow the GEMMs in the FC/FCGradient Op to do FP16 compute instead of FP32 if the appropriate op flag is set.

Reviewed By: asaadaldien

Differential Revision: D5839777

fbshipit-source-id: 8051daedadf72bf56c298c1cf830b019b7019f43
2017-10-30 17:03:51 -07:00
Junjie Bai
d894a6362f Add missing is_test argument in ImageInput ops
Summary: reported in Github Issue https://github.com/caffe2/caffe2/issues/1269

Reviewed By: salexspb

Differential Revision: D6004461

fbshipit-source-id: 03f4bccfe085010b30109ab7b6fe7325caa160ef
2017-10-10 10:03:13 -07:00
James Reed
995c83f945 Disable cudnn dropout
Summary: The cudnn version of the DropoutOp was taking a significant (and unwarranted) amount of time in our RNN training. Further investigation showed that setting the cudnn dropout descriptors was an extremely expensive operation (https://pxl.cl/99nT), much more so than the dropout operation itself. This diff adds to the DropoutCell the option to disable cudnn. The non-cudnn version uses a raw curand call that elides all of the expensive descriptor setting.

Reviewed By: jmp84, akyrola

Differential Revision: D5972022

fbshipit-source-id: 6325ec5d6569f8b94d776cbb2554cc8ddb28f699
2017-10-04 17:24:09 -07:00
Yangqing Jia
8286ce1e3a Re-license to Apache
Summary: Closes https://github.com/caffe2/caffe2/pull/1260

Differential Revision: D5906739

Pulled By: Yangqing

fbshipit-source-id: e482ba9ba60b5337d9165f28f7ec68d4518a0902
2017-09-28 16:22:00 -07:00
Junjie Bai
d9b0bcd7a4 Make all existing (except in RoIPool) "is_test" arguments required
Reviewed By: akyrola

Differential Revision: D5830168

fbshipit-source-id: 8634e9cfe308ba0ee90cd8a5c4b09a47b0b5f015
2017-09-25 23:46:12 -07:00
Aapo Kyrola
fb45383ed6 resubmission of PR1175: fp16 BatchMatMul
Summary: PR 1175 caused a build error because gemmBatched was only under a specific #ifdef. It is now outside the #ifdef, and things work.

Reviewed By: asaadaldien

Differential Revision: D5834868

fbshipit-source-id: 072a64c8f4b259ff7504104121766115b46b8aa0
2017-09-14 21:46:05 -07:00
Yangqing Jia
f0d0361609 Revert D5794634: [caffe2][PR] fp16: BatchMatMul
Summary:
This reverts commit 911c462824edec3de529a5a4385a4c437e24bf59

bypass-lint

Differential Revision: D5794634

fbshipit-source-id: 1863b02282329cbee6b10e5870f03051b4bb6c58
2017-09-13 18:46:47 -07:00
Luke Yeager
3cfc6f26e7 fp16: BatchMatMul
Summary:
Was https://github.com/caffe2/caffe2/pull/1151
Closes https://github.com/caffe2/caffe2/pull/1175

Reviewed By: Yangqing

Differential Revision: D5794634

Pulled By: akyrola

fbshipit-source-id: 911c462824edec3de529a5a4385a4c437e24bf59
2017-09-13 14:35:25 -07:00
Luke Yeager
944115c915 Bugfix for concat frontend
Summary:
When breaking out pooyadavoodi's change to `brew.concat` from https://github.com/caffe2/caffe2/pull/1151 to https://github.com/caffe2/caffe2/pull/1184, I made it throw an error instead of silently removing `order`. But `order` is always present because of [this](https://github.com/caffe2/caffe2/blob/v0.8.1/caffe2/python/model_helper.py#L118), so the frontend can never be used to set `axis`. That's bad. This PR changes the behavior back to Pooya's original implementation.
Closes https://github.com/caffe2/caffe2/pull/1202

Reviewed By: akyrola

Differential Revision: D5806488

Pulled By: pietern

fbshipit-source-id: ceaea77469688a66b269b8ed2944f0d3fe873940
2017-09-11 13:02:59 -07:00
Luke Yeager
03de05229e brew.concat: don't set both order and axis
Summary:
Was https://github.com/caffe2/caffe2/pull/1151.

pooyadavoodi says this was causing problems for him. I don't remember the details.
Closes https://github.com/caffe2/caffe2/pull/1184

Differential Revision: D5794711

Pulled By: akyrola

fbshipit-source-id: 4d75f2a9b30881ba662141c352ac556cb5d3cce6
2017-09-08 10:34:34 -07:00
James Reed
f388135d3f Layer norm brew wrapper
Summary: Implement a brew wrapper for the LayerNorm op. This adds the scalar weight and bias terms to the op.

Reviewed By: jmp84

Differential Revision: D5595836

fbshipit-source-id: 467b2e1158b0c454a149d4b26c47719826e98752
2017-08-17 11:17:47 -07:00
Simon Layton
85788a0f65 Add TensorCore support
Summary:
Add support for TensorCore convolution and gemm on Volta hardware.

Currently built on top of #1055
Closes https://github.com/caffe2/caffe2/pull/1056

Differential Revision: D5604068

Pulled By: Yangqing

fbshipit-source-id: 100f67e26ed5fabb1dbb31dcd77f7ecb84de4ee7
2017-08-10 20:16:48 -07:00
Ahmed Taei
5bb1e6b817 Allow passing unsymmetric 2d kernels to brew.conv.
Reviewed By: jay-mahadeokar

Differential Revision: D5598523

fbshipit-source-id: 47135a8562f7c720badb2be677cb79730dc417a0
2017-08-10 15:27:16 -07:00
Simon Layton
ded2a5899e Option to set BN scale and bias initial values
Summary:
Necessary to reproduce the setup from the 1-hour ImageNet paper
Closes https://github.com/caffe2/caffe2/pull/995

Differential Revision: D5547666

Pulled By: akyrola

fbshipit-source-id: cbd4396888b02f32c67e1fe7e53636329de64f1b
2017-08-02 11:38:57 -07:00
Kevin Wilfong
60cb55461e Caffe2: Support additional outputs in ImageInputOp
Summary: This allows users to add an arbitrary number of additional outputs to ImageInputOp. These are populated by reading additional TensorProto values from the TensorProtos supplied by the DBReader and converting them into Tensors. As with labels, only ints and floats are supported, and multiple values are supported.

Reviewed By: panshen1

Differential Revision: D5502019

fbshipit-source-id: 5a8b61b3a8549272a112e8e02cd613d8f9a271ba
2017-08-01 14:36:05 -07:00
Mitchell Wortsman
823869ba79 Adding tanh to brew
Summary: Added tanh to brew.

Reviewed By: harouwu

Differential Revision: D5395358

fbshipit-source-id: 8eb5303f503e10aec4c59b42055933198d67e9b3
2017-07-11 18:17:52 -07:00
Luke Yeager
dfd745a4d1 Conv frontend: checking engine and use_cudnn
Summary:
*Fixes https://github.com/caffe2/caffe2/issues/860*

Raise an exception when the user specifies conflicting values for `engine` and `use_cudnn` in the conv frontend.
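
A hypothetical sketch of such a check (names are illustrative, not the actual frontend code):

```
def _check_conv_engine(use_cudnn, engine):
    # Illustrative only: reject contradictory settings up front instead of
    # letting one silently override the other.
    if use_cudnn and engine not in ('', 'CUDNN'):
        raise ValueError("use_cudnn=True conflicts with engine=%r" % engine)
    if not use_cudnn and engine == 'CUDNN':
        raise ValueError("use_cudnn=False conflicts with engine='CUDNN'")
```
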
Closes https://github.com/caffe2/caffe2/pull/861

Differential Revision: D5329587

Pulled By: akyrola

fbshipit-source-id: 0f1ced9a88c9c6c5a7cb30a070e5bf60129082f0
2017-06-27 09:47:48 -07:00
Davin Wang
dd1525d346 fix #790 so model.init_params = False takes effect
Summary:
Given the parameter init_params=False, the weight blob (*_w) and bias blob (*_b) should be suppressed in model.param_init_net. Without this fix, init_params=False doesn't take effect in brew.conv as it does in brew.fc or other ops. This issue is the root cause of #790 [https://github.com/pytorch/pytorch/pull/790].
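
A minimal sketch of the behavior the fix restores, assuming the public model_helper/brew API:

```
from caffe2.python import brew, model_helper

# With init_params=False, brew.conv must not emit conv1_w / conv1_b
# initializer ops into param_init_net (the params come from a checkpoint).
model = model_helper.ModelHelper(name="m", init_params=False)
brew.conv(model, "data", "conv1", dim_in=3, dim_out=8, kernel=3)
assert len(model.param_init_net.Proto().op) == 0
```
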
Closes https://github.com/caffe2/caffe2/pull/824

Reviewed By: harouwu

Differential Revision: D5276676

Pulled By: akyrola

fbshipit-source-id: 8f7088a8e1976658f67e027223e555375b3a2392
2017-06-20 14:08:35 -07:00
Zhicheng Yan
ee3727db00 add_helper_function_ElementwiseLinear_op
Summary:
Add a helper function for the parametric op ElementwiseLinear
The typical syntax is model.ElementwiseLinear(input, output, dimension)
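
Its semantics in NumPy terms (a sketch): a per-feature scale and shift over a 2-D input:

```
import numpy as np

def elementwise_linear(x, w, b):
    # x: (N, D); w, b: (D,). Y[n, d] = X[n, d] * w[d] + b[d].
    return x * w[None, :] + b[None, :]
```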

Reviewed By: harouwu, akyrola

Differential Revision: D5114152

fbshipit-source-id: 8e8c691f824f518ae510a72ab0c12de1b018f3b5
2017-06-07 13:49:48 -07:00
Andrey Malevich
e05173a476 Create ExternalInitializer to simplify logic around init_params = False
Summary:
This diff creates a new type of Initializer - ExternalInitializer. This initializer is supposed to be used in cases where the parameter blob is already expected to exist in the workspace.
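
A sketch of the intended usage, assuming the brew helpers' WeightInitializer/BiasInitializer keyword arguments:

```
from caffe2.python import brew, model_helper
from caffe2.python.modeling.initializers import ExternalInitializer

# fc1_w / fc1_b are expected to already exist in the workspace, so no
# initializer ops are added to param_init_net for them.
model = model_helper.ModelHelper(name="m")
brew.fc(model, "data", "fc1", dim_in=16, dim_out=8,
        WeightInitializer=ExternalInitializer(),
        BiasInitializer=ExternalInitializer())
```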

Reviewed By: dzhulgakov

Differential Revision: D5171322

fbshipit-source-id: d27861f0f80afdea93c235d49f63da19adccc92c
2017-06-02 18:22:50 -07:00
Andrey Malevich
a8fb85797c Refactoring of the parameters step 0. Add simple tags and unify interface for params and computed_params.
Summary:
This diff is the first step in the effort to refactor all parameters. As a first step, I'm merging the concepts of params and computed_params, which are going to be based on tags instead (in the first version it's still using the old data structs to store all the BlobReferences).

Renaming computed_params to non-trainable/non-backprop params should be done in some other diff.

Reviewed By: salexspb

Differential Revision: D5171159

fbshipit-source-id: 68031ca779f053fb266a7c4a2e5b482a3bd9c832
2017-06-02 17:17:57 -07:00
Ahmed Taei
299f293cb2 Add initializer classes to conv_nd.
Summary: Fix parameters passed to _ConvBase

Reviewed By: sunwael

Differential Revision: D5166836

fbshipit-source-id: 6c2a9fa73cf1199a5f861900554f3075a49104fc
2017-06-01 14:17:55 -07:00
Simon Layton
58874ad5bf Fp16 training initializers
Summary:
Re-open for re-importing :)
Closes https://github.com/caffe2/caffe2/pull/721

Differential Revision: D5164345

Pulled By: akyrola

fbshipit-source-id: e80b32556cd25610602df91a4225b93edc0ca40b
2017-06-01 08:34:46 -07:00
Aapo Kyrola
0f8c8f37a8 Revert D5159712: [caffe2][PR] Fp16 training initializers
Summary: This reverts commit 60a889494d2e2f4df1d720331e19f638c5eb95cc

Differential Revision: D5159712

fbshipit-source-id: 16040c911b260648857f656f92b165f92c2daae0
2017-06-01 00:17:14 -07:00
Aapo Kyrola
076376f4f6 Revert D5119830: [C2] Refactoring of the parameters step 0. Add simple tags and unify interface for params and computed_params
Summary: This reverts commit 2001090a37346eb12abbb234e13e727c288eb8a7

Differential Revision: D5119830

fbshipit-source-id: bf321868338f0db85dff3237af7eaf74212dbdf6
2017-06-01 00:02:21 -07:00
Andrey Malevich
ff61ed358e Refactoring of the parameters step 0. Add simple tags and unify interface for params and computed_params
Summary:
This diff is the first step in the effort to refactor all parameters. As a first step, I'm merging the concepts of params and computed_params, which are going to be based on tags instead (in the first version it's still using the old data structs to store all the BlobReferences).

Renaming computed_params to non-trainable/non-backprop params should be done in some other diff.

Reviewed By: salexspb

Differential Revision: D5119830

fbshipit-source-id: 2001090a37346eb12abbb234e13e727c288eb8a7
2017-05-31 22:36:36 -07:00
Simon Layton
2bfacff426 Fp16 training initializers
Summary:
Adds support for generating and training pfp16 models. Adds an SGD optimizer for multi-precision trainers and a new callback to data_parallel_model to help multi-precision models keep their different copies of parameters in sync during training.
Closes https://github.com/caffe2/caffe2/pull/697

Differential Revision: D5159712

Pulled By: salexspb

fbshipit-source-id: 60a889494d2e2f4df1d720331e19f638c5eb95cc
2017-05-31 17:46:58 -07:00