Commit Graph

80 Commits

Newsha Ardalani
0fb58d76a1 Support ArgMin in c2_pt_converter
Summary:
+ Add ArgMin support to the Caffe2-to-PyTorch converter
+ Use Hypothesis to parameterize the test across different conditions (see the sketch below)
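
A minimal sketch of the equivalence such a test asserts, assuming the public Caffe2 Python bindings and the `axis`/`keepdims` arguments the operator catalogue lists for ArgMin; the converter's internal API is not shown:

```
import numpy as np
import torch
from caffe2.python import core, workspace
from hypothesis import given, strategies as st
import hypothesis.extra.numpy as hnp

@given(
    x=hnp.arrays(np.float32, hnp.array_shapes(min_dims=2, max_dims=2),
                 elements=st.floats(-1e3, 1e3, width=32),
                 unique=True),  # unique elements avoid argmin tie-breaking
    axis=st.integers(0, 1),
)
def test_argmin_matches_torch(x, axis):
    # Run the Caffe2 op, then compare against the PyTorch equivalent.
    workspace.FeedBlob("X", x)
    workspace.RunOperatorOnce(
        core.CreateOperator("ArgMin", ["X"], ["idx"], axis=axis, keepdims=0))
    np.testing.assert_array_equal(
        workspace.FetchBlob("idx"),
        torch.argmin(torch.from_numpy(x), dim=axis).numpy())
```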

Test Plan: buck test //caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test

Reviewed By: houseroad

Differential Revision: D25016203

fbshipit-source-id: 94489fcf1ed3183ec96f9796a5b4fb348fbde5bc
2020-12-05 16:35:34 -08:00
Rahul Manghwani
142b21fd44 Add SparseLengthsSum4BitRowwiseSparse in c2_pt_converter (#48240)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48240

Adds support for converting the SparseLengthsSum4BitRowwiseSparse operator from Caffe2 to PyTorch as part of c2_pt_converter
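
For reference, a plain-NumPy sketch of the dense semantics this operator family implements; the 4-bit rowwise quantization and the pruned/compressed index mapping are deliberately omitted:

```
import numpy as np

def sparse_lengths_sum(data, indices, lengths):
    # Segment-sum rows of `data` gathered by `indices`:
    # lengths[i] consecutive gathered rows form output segment i.
    out, pos = [], 0
    for n in lengths:
        out.append(data[indices[pos:pos + n]].sum(axis=0))
        pos += n
    return np.stack(out)

data = np.arange(12, dtype=np.float32).reshape(4, 3)
print(sparse_lengths_sum(data, indices=[0, 2, 3, 1], lengths=[2, 2]))
# [[ 6.  8. 10.]   rows 0 + 2
#  [12. 14. 16.]]  rows 3 + 1
```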

Test Plan:
Added a unit test

buck test //caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test

Tests passed:
https://our.intern.facebook.com/intern/testinfra/testrun/2251799856412296

Reviewed By: houseroad

Differential Revision: D25067833

fbshipit-source-id: 45cbc331ca35bee27e083714e65a1e87a2a2d2e0
2020-12-04 14:16:25 -08:00
Frank Seide
29f0e1e2ce Fused8BitRowwiseQuantizedToFloat operator support (#48407)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48407

T79817692: Fused8BitRowwiseQuantizedToFloat operator support for c2_pt_converter.

Also refactored some repeated code out of the existing test functions. (The initial commit contains only the refactoring.)
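
As background, a NumPy sketch of the fused 8-bit rowwise scheme this op decodes (assumption: each row carries a float32 scale and bias, and dequantization is x ≈ q * scale + bias):

```
import numpy as np

def fused_8bit_rowwise_quantize(x):
    # Per-row affine quantization; scale and bias (= row min) travel with the row.
    mn, mx = x.min(axis=1), x.max(axis=1)
    scale = np.maximum((mx - mn) / 255.0, 1e-8)  # guard constant rows
    q = np.round((x - mn[:, None]) / scale[:, None]).astype(np.uint8)
    return q, scale.astype(np.float32), mn.astype(np.float32)

def fused_8bit_rowwise_dequantize(q, scale, bias):
    # The direction this op implements: x ~= q * scale + bias.
    return q.astype(np.float32) * scale[:, None] + bias[:, None]
```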

Test Plan: buck test //caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test

Reviewed By: bugra

Differential Revision: D25069936

fbshipit-source-id: 72f6a845a1b4639b9542c6b230c8cd74b06bc5a0
2020-11-30 17:11:39 -08:00
Jonathan Kwok
a3e08e5344 Support ReduceSum in c2_pt_converter (#47889)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47889

Adds support for converting the [caffe2 ReduceSum](https://caffe2.ai/docs/operators-catalogue#reducesum) operator to torch.
ghstack-source-id: 116580127
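
The conversion contract, sketched against the public bindings (ReduceSum takes `axes` and `keepdims`, per the linked catalogue entry):

```
import numpy as np
import torch
from caffe2.python import core, workspace

x = np.random.rand(2, 3, 4).astype(np.float32)
workspace.FeedBlob("X", x)
workspace.RunOperatorOnce(
    core.CreateOperator("ReduceSum", ["X"], ["Y"], axes=[1], keepdims=0))
# The converted torch graph must compute the same reduction.
np.testing.assert_allclose(
    workspace.FetchBlob("Y"),
    torch.sum(torch.from_numpy(x), dim=1).numpy(),
    rtol=1e-6)
```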

Test Plan:
buck test //caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test : [results](https://our.intern.facebook.com/intern/testinfra/testrun/6755399466095119)

    ✓ ListingSuccess: caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test - main (60.273)
    ✓ Pass: caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test - test_sub_op (caffe2.torch.fb.model_transform.c2_convert.c2_pt_converter_test.C2PTConverterTest) (101.119)
    ✓ Pass: caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test - test_layer_norm_conversion (caffe2.torch.fb.model_transform.c2_convert.c2_pt_converter_test.C2PTConverterTest) (101.404)
    ✓ Pass: caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test - test_local_model_conversion (caffe2.torch.fb.model_transform.c2_convert.c2_pt_converter_test.C2PTConverterTest) (101.966)
    ✓ Pass: caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test - test_reduce_sum (caffe2.torch.fb.model_transform.c2_convert.c2_pt_converter_test.C2PTConverterTest) (114.896)

Reviewed By: bugra

Differential Revision: D24925318

fbshipit-source-id: 3f3b791eff1b03e8f5adee744560fe8bc811c659
2020-11-13 12:02:58 -08:00
Alberto Alfarano
59e96c55f7 Support MatMul in c2_pt_converter
Summary: Added conversion support for the Caffe2 MatMul operator

Test Plan: buck test //caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test

Reviewed By: bugra

Differential Revision: D24920937

fbshipit-source-id: 7ba09ba0439cb9bd15d6a41fd8ff1a86d8d11437
2020-11-12 20:56:58 -08:00
Bugra Akyildiz
c26c4690fe Add sub operator
Summary: Add the Sub operator for Caffe2

Test Plan:
```
buck test //caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test
```

Reviewed By: houseroad

Differential Revision: D24685090

fbshipit-source-id: 60d745065d01b634ebd3087e533d8b9ddab77a1f
2020-11-06 12:31:17 -08:00
Bugra Akyildiz
27c7158166 Remove __future__ imports for legacy Python2 supports (#45033)
Summary:
The `2to3` tool has a fixer called `future` which you can target specifically to remove these; the `caffe2` directory has the most redundant imports:

```2to3 -f future -w caffe2```
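
The `future` fixer only deletes the legacy compatibility imports; for example:

```
# Before: a typical module header kept for Python 2 support
from __future__ import absolute_import, division, print_function, unicode_literals

# After `2to3 -f future -w <file>`: the line above is removed. Nothing else
# changes, since Python 3 has these behaviors by default.
```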

Pull Request resolved: https://github.com/pytorch/pytorch/pull/45033

Reviewed By: seemethere

Differential Revision: D23808648

Pulled By: bugra

fbshipit-source-id: 38971900f0fe43ab44a9168e57f2307580d36a38
2020-09-23 17:57:02 -07:00
Stanislau Hlebik
b774ce54f8 remediation of S205607
fbshipit-source-id: 798decc90db4f13770e97cdce3c0df7d5421b2a3
2020-07-17 17:19:47 -07:00
Stanislau Hlebik
8fdea489af remediation of S205607
fbshipit-source-id: 5113fe0c527595e4227ff827253b7414abbdf7ac
2020-07-17 17:17:03 -07:00
Brian Wignall
f326045b37 Fix typos, via a Levenshtein-type corrector (#31523)
Summary:
Should be non-semantic.

Uses https://en.wikipedia.org/wiki/Wikipedia:Lists_of_common_misspellings/For_machines to find likely typos, with https://github.com/bwignall/typochecker to help automate the checking.

Uses an updated version of the tool used in https://github.com/pytorch/pytorch/pull/30606 .
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31523

Differential Revision: D19216749

Pulled By: mrshenli

fbshipit-source-id: 7fd489cb9a77cd7e4950c1046f925d57524960ea
2020-01-17 16:03:19 -08:00
Xiaomeng Yang
271f005eeb Add elementwise_affine for LayerNormGradientOp (#19982)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19982

Add elementwise_affine for LayerNormGradientOp

Reviewed By: houseroad

Differential Revision: D15157493

fbshipit-source-id: 7465f2c1d4df4649b4903b93483c4861e9c7afa9
2019-05-03 15:33:46 -07:00
Jerry Zhang
ff0a7ae43f Testing for folded conv_bn_relu (#19298)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19298

Proper testing for conv_bn_relu folding

Differential Revision: D13998891

fbshipit-source-id: ceb58ccec19885cbbf38964ee0d0db070e098b4a
2019-04-16 19:04:06 -07:00
Xiaomeng Yang
49f87320ba
[Caffe2] Add full impl of GroupNorm (#7058)
* Add full impl of GroupNorm

* Fix comments in math.h

* Remove unused buffers

* Add #include <array> in gpu version

* Remove unused moments_buffer_

* Make inverse std a template.

* Add detailed comments
2018-04-29 11:26:40 -07:00
Yinghai Lu
ef8f556212
[Caffe2] Changes done inside Facebook (#6378)
* fix unit test for sqrt op

From the error logging:

[idx, grad, grad_estimate] are:
[[ 146.            0.5           0.45776367]
 [ 147.            0.5           0.45776367]

The gradient == 0.5 is correct, which means the SqrtOp and its gradient are doing the right job. (Because y = sqrt(x), loss = y^2/2 = x/2, and then d(loss)/dx = 1/2 = 0.5.)

The test failed because of a numerical problem in grad_estimate (in the unit test). This can happen because the step_size is small and float precision is limited (when there are multiple elements in the tensor, we compute sum(y^2) as the loss).

This diff
- increases the step size, and moves the test cases further away from 0 (where the gradient of sqrt(x) is not well defined) to be safe :)
- also cleans up and merges the test cases for inplace vs. non-inplace

Tested with:

`CAFFE2_HYPOTHESIS_PROFILE=debug ai_bt caffe2/caffe2/python/operator_test:elementwise_ops_test -- "test_sqrt"`
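
For illustration, a sketch of the kind of central-difference estimate the checker computes, showing why a larger step and inputs away from 0 reduce float32 noise:

```
import numpy as np

def grad_estimate(f, x, step):
    # Central-difference numerical gradient (sketch of the checker's idea).
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e.flat[i] = step
        g.flat[i] = (f(x + e) - f(x - e)) / (2 * step)
    return g

loss = lambda x: (np.sqrt(x) ** 2).sum() / 2.0  # = sum(x)/2, so d/dx = 0.5
x = np.full(2, 5.0, dtype=np.float32)
print(grad_estimate(loss, x, step=1e-2))  # ~[0.5, 0.5]
```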

* CompositeReader & CompositeReaderBuilder

A new type of reader gluing multiple readers together.

* Back out "Revert D7394363: [GanH]: Log D Trick for Cross Entropy with Sigmoid"

Original commit changeset: 9325a4356dbe

* [dai][WIP] convert params to int8 on ps before sending to trainer

Add float->uint8 conversion in addition to float->fp16 conversion in model_saver.

* [easy] improve unit test for sparse length sum ops

as desc.

#accept2ship

* Update GitHub upstream to 771fcb3455

* move sparse hash unique ops to OOS and add unit tests

- move the SparseHash version to OOS, since 'sparsehash' is already a dependency of caffe2 OOS: https://fburl.com/arssw4n1
- the 'SparseHash' engine is also being used in OOS, so the SparseHash version should live in OOS to reduce confusion: https://fburl.com/o5ea7ah2

- fix the CUDA UniqueOp for the case when batch is empty.
- add unit test

* group_norm_op for caffe2

This is the cuda op for Group Normalization (GN): https://arxiv.org/abs/1803.08494

This code implements GN in one op that computes Y = gamma * (X - mu) / sigma + beta and also its gradients. It is expected to have minimal memory consumption (similar to the BN op), without creating the extra blobs that would be needed if GN were implemented as several ops (e.g., reshape, norm_mean/std, affine_channel).
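
For reference, a plain-NumPy sketch of the forward computation, assuming NCHW input split into G groups:

```
import numpy as np

def group_norm(x, gamma, beta, G, eps=1e-5):
    # x: (N, C, H, W); gamma, beta: (C,). Normalize each group of C//G channels.
    N, C, H, W = x.shape
    g = x.reshape(N, G, (C // G) * H * W)
    mu = g.mean(axis=2, keepdims=True)
    sigma = np.sqrt(g.var(axis=2, keepdims=True) + eps)
    y = ((g - mu) / sigma).reshape(N, C, H, W)
    return gamma[None, :, None, None] * y + beta[None, :, None, None]
```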

* Resubmit D7405233: disappeared in D7464958

The OOS publish caused the op to go missing -- however, the test was still there

* [c2] add sparse hash engine for cuda unique op

The SparseHash version of UniqueOp copies the input tensor to CPU, uses a sparse hash map to compute the unique output, and then copies back to GPU.

* [dper][gpu] enable unit testing gpu trainer for sparse nn

To debug the GPU trainer using mock data in a unit test.

This makes it easier to develop the GPU trainer for new models.

* Reuse Gloo context for Synchronize() calls

Previously we were creating (and leaking) the Gloo context on each call to Synchronize(). Now we only run the common world op and create the barrier net once, then run the barrier net on each Synchronize() call. Since the timeout is associated with the Gloo context, assert that the timeout is fixed instead of trying to handle the complexity of multiple timeouts (and associated contexts).

* [GanH/WGAN][1/n]: add FC param clipping

as titled

* [mobile] minimizing changes between caffe2_benchmark and speed_benchmark

* [GanH]: enable diagnose within model

avoid having to find blob names; instead, enable diagnosis directly inside the model

* Add `net_transformer_fun` option to DPM

This callback allows for various transformations to be made to the
model after gradient operators have been added. The immediate motivation for
this is to allow transformations such as "checkpoint-and-recompute" which
allow trading off memory for additional compute.

Adding several callbacks like this has made DPM's API less than ideal at this
stage. However, I could not find any reasonable alternative.

* [DT] [33/n] Compile flow task groups

task groups need to be compiled in order to pickle the object in fblearner. I also changed the Job's compile function, since creating a new object is not necessary.

* Initial commit for sparse_normalize vectorization and benchmark

* [GanH]: LB Calibration for JSD

as titled

* Tracing event in async executor

Adding event tracing through TRACE_EVENT macro in async executor

* [Resubmit] D7409751 Resetting book-keeping blobs when the reservoir is reset

D7409751 got lost in D7464958

* Visualizing realtime weights values

We want to visualize the weight values as the optimizer iterates. This diff supports visualizing the weights at an assigned index.
Currently, we assume the blob is 2-dimensional.

* [GanH][Easy]: Fix Homotopy Weighting

apparently, there was a bug in the homotopy weight (alpha, beta) update

* [c2] move sparse hash unique op out of oss

so that OSS does not need to depend on the Google hash map.

* Get rid of std::round as it's not supported on Android

* Revert changes on setup.py

* Skip shaky test on Dataio

* fix
2018-04-10 21:11:43 -07:00
Orion Reblitz-Richardson
1d5780d42c Remove Apache headers from source.
* LICENSE file contains details, so removing from individual source files.
2018-03-27 13:10:18 -07:00
Kutta Srinivasan
b4b2f0d2cc Work on fp16 conv op 2018-03-05 21:13:03 -08:00
Junjie Bai
b11ba65204 Experimental support for setup.py develop mode install
Summary:
`python setup.py develop` / `pip install -e .`
Closes https://github.com/caffe2/caffe2/pull/1926

Reviewed By: orionr

Differential Revision: D6951780

Pulled By: bddppq

fbshipit-source-id: 01249cbca90ec5326ea4107d4e500ae95a9dbd7b
2018-02-12 23:36:18 -08:00
Zhicheng Yan
d79a31761e rectangle_cropping_multi_cropping_color_jittering_lighting
Summary:
Change log
- Support rectangle cropping, where the height and width of the clip crop can be set separately. This is useful when most video resolutions are non-square, such as 240p, 360p and 480p, where width is significantly larger than height.
  - Comparisons of training on ucf101 between using 112x112 croppings and using 112x144 cropping.
  - https://fburl.com/i0rw6y1k
- Support 14 croppings per video clip at the testing stage to improve classification accuracy: take the left-top, central-top, right-top, left-bottom, central-bottom, right-bottom and central-central croppings as well as their mirrorings; 7 positions x 2 = 14 croppings in total.
   - Comparisons on the same model trained on UCF-101. Use 1 clip per video
      - RGB. f41014306, w/o Vs f41014868, w/ multi-cropping: `0.64099 Vs 0.65796`
      - OF. f41014889, w/o Vs f41014913, w/ multi-cropping: `0.65796 Vs 0.67624`

- Support color jittering and color lighting on RGB data for training data augmentation.
  - Comparisons of training on ucf101 from scratch with and without color jittering and lighting:
  - https://fburl.com/k69zatul

Reviewed By: HengCV

Differential Revision: D6962620

fbshipit-source-id: 9b43478945874142727fea351ee04417218e6606
2018-02-12 16:39:06 -08:00
Alexander Sidorov
a3b8c459d4 Revamp MNIST tutorial
Summary:
Main changes:

1. Move reader creation to Brew in order to be consistent and avoid a wild use of param_init_net
2. Use optimizers for training function, avoid manual optimizer construction
3. Add MLP mode (a default)
4. Fix a bunch of overly verbose comments and add a bit of new explanation
Closes https://github.com/caffe2/caffe2/pull/1760

Differential Revision: D6749059

Pulled By: salexspb

fbshipit-source-id: 9dfbbb2d9772a74a0300c2e404a92e791f7cc593
2018-01-26 09:17:31 -08:00
Ilia Cherniavskii
79ac146808 Add if and while ops to brew
Summary:
Adding if and while control ops to brew, also adding unit tests
Note: unlike net_builder, where we can figure out which blobs are external and which ones are local to subnets, here in brew we need to use the external_blobs param explicitly to point at external blobs

Reviewed By: harouwu

Differential Revision: D6440508

fbshipit-source-id: c920f0af84b77ccb2d8462ffc7567bb1908c844a
2017-12-05 17:33:34 -08:00
James Cross
2c190d2f05 update transformer code for layer_norm() API change
Summary: Quick fix for unit test broken by D6454290. This is my fault for approving while the tests covering the single callsite were broken.

Reviewed By: goldsborough

Differential Revision: D6466566

fbshipit-source-id: 2683be3d6bb184286e64fbde3e572946e39030c7
2017-12-01 20:19:31 -08:00
Peter Goldsborough
b43c1b2bed Fix and upgrade brew.layer_norm
Summary:
While working on layer normalization for LSTMs I encountered an issue where the layer norm parameters (which are the scale/gain and bias/shift from the paper) were not registered in the model for `brew.layer_norm`. salexspb explained that this is because it was using the `init_net_param` API instead of `create_param`. This diff fixes this.

While fixing this, I noticed that `brew.layer_norm` actually had a bug where it was multiplying by the bias instead of adding it. Another issue was that the function gave the scale and bias a shape of `[1]`; however, the paper (https://arxiv.org/pdf/1607.06450.pdf) specifies that, as for batch norm, there is one scale and bias parameter per neuron, i.e. the shape should be `[1, axis_dimension]`. The API now takes an explicit `dim_in` parameter (also more consistent with the other normalization functions in that module) so that this can be specified. See the tests for how this now looks.
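
A NumPy sketch of the corrected math, with per-neuron gain and bias of shape `[1, dim_in]` as the paper specifies:

```
import numpy as np

def layer_norm(x, gain, bias, eps=1e-5):
    # x: (batch, dim_in); gain, bias: (1, dim_in) - one parameter per neuron.
    mu = x.mean(axis=1, keepdims=True)
    sigma = np.sqrt(x.var(axis=1, keepdims=True) + eps)
    return gain * (x - mu) / sigma + bias  # bias is added, not multiplied
```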

Reviewed By: jhcross

Differential Revision: D6454290

fbshipit-source-id: fc00ca614de3190c40ab743e8984bec9e85fb58c
2017-12-01 14:18:28 -08:00
Aapo Kyrola
14f95c2782 Updated brew SpatialBN to use initializers
Summary: Updated brew SpatialBN to use initializers, similar to other brew ops such as conv and fc, instead of initializing all of its parameters itself within the brew call.

Reviewed By: asaadaldien

Differential Revision: D5840359

fbshipit-source-id: 9f3d688d4957605eaf7ecd2488bc26bfb1da3f78
2017-11-02 11:25:45 -07:00
Aapo Kyrola
669ec0ccba Added FP16 compute support to FC Op
Summary: Allow the GEMMs in the FC/FCGradient Op to do FP16 compute instead of FP32 if the appropriate op flag is set.

Reviewed By: asaadaldien

Differential Revision: D5839777

fbshipit-source-id: 8051daedadf72bf56c298c1cf830b019b7019f43
2017-10-30 17:03:51 -07:00
Junjie Bai
d894a6362f Add missing is_test argument in ImageInput ops
Summary: reported in Github Issue https://github.com/caffe2/caffe2/issues/1269

Reviewed By: salexspb

Differential Revision: D6004461

fbshipit-source-id: 03f4bccfe085010b30109ab7b6fe7325caa160ef
2017-10-10 10:03:13 -07:00
James Reed
995c83f945 Disable cudnn dropout
Summary: The cudnn version of the DropoutOp was taking a significant (and unwarranted) amount of time in our RNN training. Further investigation showed that setting the cudnn dropout descriptors was an extremely expensive operation (https://pxl.cl/99nT), much more so than the dropout operation itself. This diff adds to the DropoutCell the option to disable cudnn. The non-cudnn version uses a raw curand call that elides all of the expensive descriptor setting.

Reviewed By: jmp84, akyrola

Differential Revision: D5972022

fbshipit-source-id: 6325ec5d6569f8b94d776cbb2554cc8ddb28f699
2017-10-04 17:24:09 -07:00
Yangqing Jia
8286ce1e3a Re-license to Apache
Summary: Closes https://github.com/caffe2/caffe2/pull/1260

Differential Revision: D5906739

Pulled By: Yangqing

fbshipit-source-id: e482ba9ba60b5337d9165f28f7ec68d4518a0902
2017-09-28 16:22:00 -07:00
Junjie Bai
d9b0bcd7a4 Make all existing (except in RoIPool) "is_test" arguments required
Reviewed By: akyrola

Differential Revision: D5830168

fbshipit-source-id: 8634e9cfe308ba0ee90cd8a5c4b09a47b0b5f015
2017-09-25 23:46:12 -07:00
Aapo Kyrola
fb45383ed6 resubmission of PR1175: fp16 BatchMatMul
Summary: PR 1175 caused a build error because gemmBatched was only under a specific #ifdef. It is now outside the #ifdef, and things work.

Reviewed By: asaadaldien

Differential Revision: D5834868

fbshipit-source-id: 072a64c8f4b259ff7504104121766115b46b8aa0
2017-09-14 21:46:05 -07:00
Yangqing Jia
f0d0361609 Revert D5794634: [caffe2][PR] fp16: BatchMatMul
Summary:
This reverts commit 911c462824edec3de529a5a4385a4c437e24bf59

bypass-lint

Differential Revision: D5794634

fbshipit-source-id: 1863b02282329cbee6b10e5870f03051b4bb6c58
2017-09-13 18:46:47 -07:00
Luke Yeager
3cfc6f26e7 fp16: BatchMatMul
Summary:
Was https://github.com/caffe2/caffe2/pull/1151
Closes https://github.com/caffe2/caffe2/pull/1175

Reviewed By: Yangqing

Differential Revision: D5794634

Pulled By: akyrola

fbshipit-source-id: 911c462824edec3de529a5a4385a4c437e24bf59
2017-09-13 14:35:25 -07:00
Luke Yeager
944115c915 Bugfix for concat frontend
Summary:
When breaking out pooyadavoodi's change to `brew.concat` from https://github.com/caffe2/caffe2/pull/1151 to https://github.com/caffe2/caffe2/pull/1184, I made it throw an error instead of silently removing `order`. But `order` is always present because of [this](https://github.com/caffe2/caffe2/blob/v0.8.1/caffe2/python/model_helper.py#L118), so the frontend can never be used to set `axis`. That's bad. This PR changes the behavior back to Pooya's original implementation.
Closes https://github.com/caffe2/caffe2/pull/1202

Reviewed By: akyrola

Differential Revision: D5806488

Pulled By: pietern

fbshipit-source-id: ceaea77469688a66b269b8ed2944f0d3fe873940
2017-09-11 13:02:59 -07:00
Luke Yeager
03de05229e brew.concat: don't set both order and axis
Summary:
Was https://github.com/caffe2/caffe2/pull/1151.

pooyadavoodi says this was causing problems for him. I don't remember the details.
Closes https://github.com/caffe2/caffe2/pull/1184

Differential Revision: D5794711

Pulled By: akyrola

fbshipit-source-id: 4d75f2a9b30881ba662141c352ac556cb5d3cce6
2017-09-08 10:34:34 -07:00
James Reed
f388135d3f Layer norm brew wrapper
Summary: Implement a brew wrapper for the LayerNorm op. This adds the scalar weight and bias terms to the op.

Reviewed By: jmp84

Differential Revision: D5595836

fbshipit-source-id: 467b2e1158b0c454a149d4b26c47719826e98752
2017-08-17 11:17:47 -07:00
Simon Layton
85788a0f65 Add TensorCore support
Summary:
Add support for TensorCore convolution and gemm on Volta hardware.

Currently built on top of #1055
Closes https://github.com/caffe2/caffe2/pull/1056

Differential Revision: D5604068

Pulled By: Yangqing

fbshipit-source-id: 100f67e26ed5fabb1dbb31dcd77f7ecb84de4ee7
2017-08-10 20:16:48 -07:00
Ahmed Taei
5bb1e6b817 Allow passing unsymmetric 2d kernels to brew.conv.
Reviewed By: jay-mahadeokar

Differential Revision: D5598523

fbshipit-source-id: 47135a8562f7c720badb2be677cb79730dc417a0
2017-08-10 15:27:16 -07:00
Simon Layton
ded2a5899e Option to set BN scale and bias initial values
Summary:
Necessary to reproduce the setup from the 1-hour ImageNet paper
Closes https://github.com/caffe2/caffe2/pull/995

Differential Revision: D5547666

Pulled By: akyrola

fbshipit-source-id: cbd4396888b02f32c67e1fe7e53636329de64f1b
2017-08-02 11:38:57 -07:00
Kevin Wilfong
60cb55461e Caffe2: Support additional outputs in ImageInputOp
Summary: This allows users to add an arbitrary number of additional outputs to ImageInputOp. These are populated by reading additional TensorProto values from the TensorProtos supplied by the DBReader and converting them into Tensors. As with labels, only ints and floats are supported, and multiple values are supported.

Reviewed By: panshen1

Differential Revision: D5502019

fbshipit-source-id: 5a8b61b3a8549272a112e8e02cd613d8f9a271ba
2017-08-01 14:36:05 -07:00
Mitchell Wortsman
823869ba79 Adding tanh to brew
Summary: Added tanh to brew.

Reviewed By: harouwu

Differential Revision: D5395358

fbshipit-source-id: 8eb5303f503e10aec4c59b42055933198d67e9b3
2017-07-11 18:17:52 -07:00
Luke Yeager
dfd745a4d1 Conv frontend: checking engine and use_cudnn
Summary:
*Fixes https://github.com/caffe2/caffe2/issues/860*

Raise an exception when the user specifies conflicting values for `engine` and `use_cudnn` in the conv frontend.
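
A hypothetical sketch of such a check (names are illustrative, not the actual frontend code):

```
def _check_conv_engine(use_cudnn, engine):
    # Illustrative only: reject contradictory settings up front instead of
    # letting one silently override the other.
    if use_cudnn and engine not in ('', 'CUDNN'):
        raise ValueError("use_cudnn=True conflicts with engine=%r" % engine)
    if not use_cudnn and engine == 'CUDNN':
        raise ValueError("use_cudnn=False conflicts with engine='CUDNN'")
```
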
Closes https://github.com/caffe2/caffe2/pull/861

Differential Revision: D5329587

Pulled By: akyrola

fbshipit-source-id: 0f1ced9a88c9c6c5a7cb30a070e5bf60129082f0
2017-06-27 09:47:48 -07:00
Davin Wang
dd1525d346 fix #790 so model.init_params = False takes effect
Summary:
Given the parameter init_params=False, the weight blob (*_w) and bias blob (*_b) should be suppressed in model.param_init_net. Without this fix, init_params=False doesn't take effect in brew.conv as it does in brew.fc or other ops. This issue is the root cause of #790 [https://github.com/pytorch/pytorch/pull/790].
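
A minimal sketch of the behavior the fix restores, assuming the public model_helper/brew API:

```
from caffe2.python import brew, model_helper

# With init_params=False, brew.conv must not emit conv1_w / conv1_b
# initializer ops into param_init_net (the params come from a checkpoint).
model = model_helper.ModelHelper(name="m", init_params=False)
brew.conv(model, "data", "conv1", dim_in=3, dim_out=8, kernel=3)
assert len(model.param_init_net.Proto().op) == 0
```
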
Closes https://github.com/caffe2/caffe2/pull/824

Reviewed By: harouwu

Differential Revision: D5276676

Pulled By: akyrola

fbshipit-source-id: 8f7088a8e1976658f67e027223e555375b3a2392
2017-06-20 14:08:35 -07:00
Zhicheng Yan
ee3727db00 add_helper_function_ElementwiseLinear_op
Summary:
Add a helper function for the parametric op ElementwiseLinear
The typical syntax is model.ElementwiseLinear(input, output, dimension)
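
Its semantics in NumPy terms (a sketch): a per-feature scale and shift over a 2-D input:

```
import numpy as np

def elementwise_linear(x, w, b):
    # x: (N, D); w, b: (D,). Y[n, d] = X[n, d] * w[d] + b[d].
    return x * w[None, :] + b[None, :]
```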

Reviewed By: harouwu, akyrola

Differential Revision: D5114152

fbshipit-source-id: 8e8c691f824f518ae510a72ab0c12de1b018f3b5
2017-06-07 13:49:48 -07:00
Andrey Malevich
e05173a476 Create ExternalInitializer to simplify logic around init_params = False
Summary:
This diff creates a new type of Initializer - ExternalInitializer. This initializer is supposed to be used in cases where the parameter blob is already expected to exist in the workspace.
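
A sketch of the intended usage, assuming the brew helpers' WeightInitializer/BiasInitializer keyword arguments:

```
from caffe2.python import brew, model_helper
from caffe2.python.modeling.initializers import ExternalInitializer

# fc1_w / fc1_b are expected to already exist in the workspace, so no
# initializer ops are added to param_init_net for them.
model = model_helper.ModelHelper(name="m")
brew.fc(model, "data", "fc1", dim_in=16, dim_out=8,
        WeightInitializer=ExternalInitializer(),
        BiasInitializer=ExternalInitializer())
```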

Reviewed By: dzhulgakov

Differential Revision: D5171322

fbshipit-source-id: d27861f0f80afdea93c235d49f63da19adccc92c
2017-06-02 18:22:50 -07:00
Andrey Malevich
a8fb85797c Refactoring of the parameters step 0. Add simple tags and unify interface for params and computed_params.
Summary:
This diff is the first step in the effort to refactor all parameters. As a first step, I'm merging the concepts of params and computed_params, which are going to be based on tags instead (in the first version it's still using the old data structs to store all the BlobReferences).

Renaming computed_params to non-trainable/non-backprop params should be done in some other diff.

Reviewed By: salexspb

Differential Revision: D5171159

fbshipit-source-id: 68031ca779f053fb266a7c4a2e5b482a3bd9c832
2017-06-02 17:17:57 -07:00
Ahmed Taei
299f293cb2 Add initializer classes to conv_nd.
Summary: Fix parameters passed to _ConvBase

Reviewed By: sunwael

Differential Revision: D5166836

fbshipit-source-id: 6c2a9fa73cf1199a5f861900554f3075a49104fc
2017-06-01 14:17:55 -07:00
Simon Layton
58874ad5bf Fp16 training initializers
Summary:
Re-open for re-importing :)
Closes https://github.com/caffe2/caffe2/pull/721

Differential Revision: D5164345

Pulled By: akyrola

fbshipit-source-id: e80b32556cd25610602df91a4225b93edc0ca40b
2017-06-01 08:34:46 -07:00
Aapo Kyrola
0f8c8f37a8 Revert D5159712: [caffe2][PR] Fp16 training initializers
Summary: This reverts commit 60a889494d2e2f4df1d720331e19f638c5eb95cc

Differential Revision: D5159712

fbshipit-source-id: 16040c911b260648857f656f92b165f92c2daae0
2017-06-01 00:17:14 -07:00
Aapo Kyrola
076376f4f6 Revert D5119830: [C2] Refactoring of the parameters step 0. Add simple tags and unify interface for params and computed_params
Summary: This reverts commit 2001090a37346eb12abbb234e13e727c288eb8a7

Differential Revision: D5119830

fbshipit-source-id: bf321868338f0db85dff3237af7eaf74212dbdf6
2017-06-01 00:02:21 -07:00
Andrey Malevich
ff61ed358e Refactoring of the parameters step 0. Add simple tags and unify interface for params and computed_params
Summary:
This diff is the first step in the effort to refactor all parameters. As a first step, I'm merging the concepts of params and computed_params, which are going to be based on tags instead (in the first version it's still using the old data structs to store all the BlobReferences).

Renaming computed_params to non-trainable/non-backprop params should be done in some other diff.

Reviewed By: salexspb

Differential Revision: D5119830

fbshipit-source-id: 2001090a37346eb12abbb234e13e727c288eb8a7
2017-05-31 22:36:36 -07:00
Simon Layton
2bfacff426 Fp16 training initializers
Summary:
Adds support for generating and training pfp16 models. Adds an SGD optimizer for multi-precision trainers and a new callback to data_parallel_model to help multi-precision models keep their different copies of parameters in sync during training.
Closes https://github.com/caffe2/caffe2/pull/697

Differential Revision: D5159712

Pulled By: salexspb

fbshipit-source-id: 60a889494d2e2f4df1d720331e19f638c5eb95cc
2017-05-31 17:46:58 -07:00