Summary:
My initial implementation did not really work once nesting was involved. This
diff fixes that by replacing the itertools-based logic with something that is
much easier to reason about.
Reviewed By: idning
Differential Revision: D6933763
fbshipit-source-id: f7a1de996d878a41bac2b2acd9d87a7c4b416778
Summary: Added an initializer which sets up the ParameterInfo object in the opposite format to the pFP16Initializer. This is needed when the op requires the initialized blob to be FP32 but an FP16 copy of the weights is also needed.
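A minimal sketch of the idea (illustrative only: the blob names and fill op
below are assumptions, not the new initializer class itself). The op consumes
an FP32 master blob, while an FP16 shadow copy of the same weights is kept
alongside it:

    from caffe2.python import core, workspace

    init_net = core.Net("init")
    # FP32 master blob that the op reads directly.
    init_net.XavierFill([], "fc_w", shape=[16, 8])
    # FP16 shadow copy of the same weights, kept for low-precision use.
    init_net.FloatToHalf("fc_w", "fc_w_fp16")

    workspace.RunNetOnce(init_net)
    print(workspace.FetchBlob("fc_w").dtype)       # float32
    print(workspace.FetchBlob("fc_w_fp16").dtype)  # float16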
Reviewed By: wesolwsk
Differential Revision: D5840832
fbshipit-source-id: 439e87f41a1dbc58bf63a5c0e7f7fc4cb00b4d65
Summary:
Today, the PS's weirdly store the entire embedding and not just their
subsection of it. This was simply an oversight on the part of the original
author, and this diff fixes it.
1. The sparse params are sharded across the PS's, and each PS stores only its
section of the embedding. The trainer requests the ids as-is from the PS, but
the PS divides each id by num_of_shards before looking it up in the embedding
table blob. This happens on both the forward and the backward pass. During the
model download step, however, the PS multiplies by num_of_shards before
returning the embeddings to the trainer. The upshot is that the trainer knows
nothing about how the embeddings are scaled on the PS; the PS adds the extra
divide and multiply steps to achieve that (see the sketch below).
2. During estimation time, we allocate just one PS. So, in order to make all
of the embeddings fit on that single PS, we additionally scale the hash table
sizes (proportionally and equally for all the sparse params) such that they
fit. This scaling is handled analogously to (1).
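Illustrative sketch of the id arithmetic in (1), in plain Python rather than
the actual PS code; adding the shard id back in the inverse mapping is my
assumption so the round trip works:

    num_shards = 4

    def to_local_row(global_id):
        # The trainer sends global ids as-is; the owning PS divides by
        # num_of_shards to find the row in its local embedding slice.
        return global_id // num_shards

    def to_global_id(local_row, shard_id):
        # On model download the mapping is inverted (multiply back by
        # num_of_shards), so the trainer never sees the scaling.
        return local_row * num_shards + shard_id

    for gid in [0, 5, 6, 11]:
        shard = gid % num_shards
        assert to_global_id(to_local_row(gid), shard) == gid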
Reviewed By: boryiingsu
Differential Revision: D5664093
fbshipit-source-id: 92f501f61566f939c41ce0b614a1b499669f978a
Summary: This diff adds an optimizer to param_info, along with the associated implementations for model_helper and brew to set an optimizer for each individual parameter.
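A rough sketch of what setting a per-parameter optimizer could look like; the
exact create_param keyword and constructor arguments here are assumptions, not
a confirmed API:

    from caffe2.python import model_helper, optimizer
    from caffe2.python.modeling.initializers import Initializer

    model = model_helper.ModelHelper(name="per_param_optim")
    # Attach a dedicated SGD optimizer to this one parameter; other params
    # can keep whatever the model-level optimizer builder assigns.
    model.create_param(
        param_name="fc_w",
        shape=[16, 8],
        initializer=Initializer("XavierFill"),
        optimizer=optimizer.SgdOptimizer(base_learning_rate=0.01),
    )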
Reviewed By: kennyhorror
Differential Revision: D5385432
fbshipit-source-id: 5d682f9d1ab077e04a5d76a24d71470f4e64fc92
Summary:
This diff introduces abstractions for parameter sharing for all parameters
that are created through the new create_param syntax.
Possible use cases of this parameter sharing (see the sketch after this list):
1. Sharing params within the RNN interface.
2. Complicated models that might share some of their branches.
3. TODO (next diff): cross-model parameter sharing.
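A minimal usage sketch, assuming the ParameterSharing context manager from
this stack together with create_param-based registration (the scope names and
the exact resolution rule are assumptions):

    from caffe2.python import model_helper, scope
    from caffe2.python.modeling.initializers import Initializer
    from caffe2.python.modeling.parameter_sharing import ParameterSharing

    model = model_helper.ModelHelper(name="sharing_demo")

    # Ask that params created under 'branch_b' resolve to 'branch_a', so
    # both branches read the same underlying blob.
    with ParameterSharing({'branch_b': 'branch_a'}):
        with scope.NameScope('branch_a'):
            w_a = model.create_param('w', shape=[8],
                                     initializer=Initializer("ConstantFill"))
        with scope.NameScope('branch_b'):
            w_b = model.create_param('w', shape=[8],
                                     initializer=Initializer("ConstantFill"))

    # Both references should point at the same parameter blob.
    print(w_a, w_b)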
Reviewed By: salexspb
Differential Revision: D5160935
fbshipit-source-id: c6d40a5ed7ead240cd7db0eb69de6dc5f505b05a
Summary:
This diff creates a new type of Initializer - ExternalInitializer. This
initializer is meant to be used in cases where the parameter blob is already
expected to exist in the workspace.
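A short sketch of the intended use (the blob name and shape are made up for
illustration): the blob already exists in the workspace, so the initializer
only registers it as a parameter instead of emitting a fill op into the init
net:

    import numpy as np
    from caffe2.python import model_helper, workspace
    from caffe2.python.modeling.initializers import ExternalInitializer

    # The blob is already present, e.g. loaded from a checkpoint.
    workspace.FeedBlob("pretrained_w",
                       np.random.randn(16, 8).astype(np.float32))

    model = model_helper.ModelHelper(name="external_init_demo")
    model.create_param(
        param_name="pretrained_w",
        shape=[16, 8],
        initializer=ExternalInitializer(),
    )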
Reviewed By: dzhulgakov
Differential Revision: D5171322
fbshipit-source-id: d27861f0f80afdea93c235d49f63da19adccc92c
Summary:
This diff is the first step in the effort to refactor all parameters. As a
first step, I'm merging the concepts of params and computed_params, which will
be distinguished by tags instead (in this first version it still uses the old
data structs to store all the BlobReferences).
Renaming computed_params to non-trainable/non-backprop params should be done
in some other diff.
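A small sketch of the tag-based split, assuming ParameterTags stays the public
surface for this (the specific COMPUTED_PARAM tag and blob names below are my
reading of the diff, not confirmed):

    from caffe2.python import model_helper
    from caffe2.python.modeling.initializers import Initializer
    from caffe2.python.modeling.parameter_info import ParameterTags

    model = model_helper.ModelHelper(name="tags_demo")

    # Trainable parameter: participates in backprop.
    model.create_param('fc_w', shape=[16, 8],
                       initializer=Initializer("XavierFill"))

    # Computed (non-backprop) parameter, e.g. a batch-norm running mean,
    # distinguished by a tag rather than a separate computed_params list.
    model.create_param('bn_rm', shape=[8],
                       initializer=Initializer("ConstantFill"),
                       tags=ParameterTags.COMPUTED_PARAM)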
Reviewed By: salexspb
Differential Revision: D5171159
fbshipit-source-id: 68031ca779f053fb266a7c4a2e5b482a3bd9c832
Summary:
This diff is the first step in the effort to refactor all parameters. As a
first step, I'm merging the concepts of params and computed_params, which will
be distinguished by tags instead (in this first version it still uses the old
data structs to store all the BlobReferences).
Renaming computed_params to non-trainable/non-backprop params should be done
in some other diff.
Reviewed By: salexspb
Differential Revision: D5119830
fbshipit-source-id: 2001090a37346eb12abbb234e13e727c288eb8a7
Summary:
Adds support for generating and training pfp16 models. Also adds an SGD optimizer for multi-precision trainers and a new callback to data_parallel_model that helps multi-precision models keep their different copies of parameters in sync during training.
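This is not the actual callback API added here, just a sketch of the sync
idea: after the FP32 master weight is updated, the FP16 copy used by the
forward pass is re-derived from it (the blob names and the plain WeightedSum
update are illustrative):

    import numpy as np
    from caffe2.python import core, workspace

    # FP32 master weight, its gradient, and scalars for a plain SGD step.
    workspace.FeedBlob("fc_w", np.random.randn(4, 4).astype(np.float32))
    workspace.FeedBlob("fc_w_grad", np.random.randn(4, 4).astype(np.float32))
    workspace.FeedBlob("ONE", np.array([1.0], dtype=np.float32))
    workspace.FeedBlob("NEG_LR", np.array([-0.01], dtype=np.float32))

    train_net = core.Net("train")
    # SGD update on the FP32 master: w = 1 * w + (-lr) * grad.
    train_net.WeightedSum(["fc_w", "ONE", "fc_w_grad", "NEG_LR"], "fc_w")
    # Keep the FP16 copy in sync with the freshly updated master.
    train_net.FloatToHalf("fc_w", "fc_w_fp16")

    workspace.RunNetOnce(train_net)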
Closes https://github.com/caffe2/caffe2/pull/697
Differential Revision: D5159712
Pulled By: salexspb
fbshipit-source-id: 60a889494d2e2f4df1d720331e19f638c5eb95cc
Summary:
This is going to unblock Nvidia in their work on adding fp16 support to
Caffe2. I discussed this with kennyhorror beforehand to make sure it fits into
his work on parameter sharing.
Reviewed By: kennyhorror
Differential Revision: D5127797
fbshipit-source-id: 4db155d320b1862570c23b77c4252bdacbf2296f