Summary: This diff refactors the parameter initialization logic, moving it from model manipulation into the layers.
Reviewed By: azzolini
Differential Revision: D5920225
fbshipit-source-id: 50d230e406bc9ce0b00bdd164802c504cf32ea46
Summary: When parameter sharing is used, the model may not own the parameters. Emptying out the initializer ensures that the shared model doesn't overwrite the initialization.
Reviewed By: chocjy
Differential Revision: D5870362
fbshipit-source-id: f8587b84c3a13f331a3251973e8206563939606a
Summary:
To achieve this, I modified the blob naming scheme defined in a layer.
Before, it was scope/fc_w and scope/fc_w_auto_0 (if there is another fc
within the same scope).
Now I've changed it to scope/fc/w and scope/fc_auto_0/w.
That is, we rely on the uniqueness of the scoped layer name to define
names for blobs.
I also overrode the create_param method in LayerModelHelper to let it
use the resolved blob names given the parameter sharing context.
There are some details, such as making the initializer more structured,
that I still need to finalize.
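A small, self-contained illustration of the naming change (the scope, layer, and helper names here are only for illustration):

    # Before: blob names were derived from the layer type, so a second FC in the
    # same scope had to de-duplicate the parameter name itself:
    #   scope/fc_w, scope/fc_w_auto_0
    # After: blob names hang off the unique scoped layer name, so only the layer
    # name carries the de-duplicating suffix:
    #   scope/fc/w, scope/fc_auto_0/w

    def resolve_param_name(scoped_layer_name, param_name):
        # Hypothetical helper: rely on the uniqueness of the scoped layer name.
        return "{}/{}".format(scoped_layer_name, param_name)

    assert resolve_param_name("scope/fc", "w") == "scope/fc/w"
    assert resolve_param_name("scope/fc_auto_0", "w") == "scope/fc_auto_0/w"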
Reviewed By: kennyhorror
Differential Revision: D5435132
fbshipit-source-id: a0525f5ea0977e255dd5ea765b38913f5951d455
Summary:
Feed team uses distributed training and wants to also use transfer learning.
Currently, transfer learning is implemented by overwriting the layer parameter
initializer, so the PS builder can't correctly infer the parameter shape.
To fix this, add a 'shape' field to `layer_parameter` and set the shape whenever
we overwrite its initializer.
We also enforce a shape check between the original initializer
and the loaded blob (this adds extra cost).
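A minimal sketch of the kind of shape check this adds (function and argument names are illustrative, not the actual DPER code):

    import numpy as np

    def check_loaded_blob_shape(declared_shape, loaded_blob):
        # Hypothetical check: the shape recorded on the layer parameter must
        # match the blob loaded from the pretrained model.
        loaded_shape = tuple(np.asarray(loaded_blob).shape)
        if tuple(declared_shape) != loaded_shape:
            raise ValueError("Parameter shape mismatch: declared {} but loaded {}"
                             .format(tuple(declared_shape), loaded_shape))

    check_loaded_blob_shape((16, 8), np.zeros((16, 8), dtype=np.float32))  # passes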
Differential Revision: D5520541
fbshipit-source-id: 80547dbd328b3f6cbfcea0b2daaf4004703dfe81
Summary:
As described in T19378176 by kittipatv, this diff fixes an issue with __getitem__() of schema.List.
For example, given Map(int32, float) (Map is a special List), field_names() returns "lengths", "values:keys", and "values:values". However, "values:keys" and "values:values" are not accessible via __getitem__(), which bypasses the "values" prefix and directly accesses the fields in the map. Other APIs (e.g., _SchemaNode and dataset_ops) expect "values:keys" and "values:values", as it simplifies the traversal logic. Therefore, we keep field_names() as is and fix __getitem__().
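A minimal illustration of the intended behavior, assuming the caffe2.python.schema API:

    import numpy as np
    from caffe2.python import schema

    m = schema.Map(schema.Scalar(np.int32), schema.Scalar(np.float32))
    print(m.field_names())  # ['lengths', 'values:keys', 'values:values']

    # Before this diff, the colon-qualified names returned by field_names()
    # could not be passed back into __getitem__(); after the fix they can:
    keys = m['values:keys']
    values = m['values:values']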
Reviewed By: kittipatv
Differential Revision: D5251657
fbshipit-source-id: 1acfb8d6e53e286eb866cf5ddab01d2dce97e1d2
Summary:
In the Dper utility, add a function `load_parameters_from_model_init_options` to
allow initializing parameters from pretrained models.
Reviewed By: xianjiec
Differential Revision: D4926075
fbshipit-source-id: 5ab563140b5b072c9ed076bbba1aca43e71c6ac5
Summary: added a new context to layers.py
Reviewed By: kennyhorror
Differential Revision: D4817124
fbshipit-source-id: 36f08964b86092e81df24c1b9d4b167293a7ffb8
Summary:
The basic idea of bucket-based calibration:
1. given a model and a calibration data set
2. apply the model to the calibration data set and sort the prediction scores
3. bucketize the prediction scores
4. for the samples in each bucket, compute the proportion of positive samples
5. build a set of piecewise linear functions that map from the bucket range to that proportion
6. append a piecewise linear transform operator to the prediction net, which calibrates the raw predictions
7. to support calibration in realtime training, create a new type of Net -- a bucket calibration net. This requires a new Context for add_calibration_ops() and for exporting and loading the new Net.
This is part of a series of diffs.
This diff implements a layer that adds different operators for train/calibration/eval for bucket-based calibration; a sketch of the core idea follows.
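A minimal numpy sketch of steps 2-5, assuming binary labels; this only illustrates the idea and is not the actual layer implementation:

    import numpy as np

    def build_bucket_calibration(scores, labels, num_buckets=10):
        # Steps 2-3: sort the prediction scores and split them into equal-sized buckets.
        order = np.argsort(scores)
        scores, labels = np.asarray(scores)[order], np.asarray(labels)[order]
        buckets = np.array_split(np.arange(len(scores)), num_buckets)

        # Step 4: proportion of positive samples per bucket, plus each bucket's score range.
        bounds, proportions = [], []
        for idx in buckets:
            bounds.append((scores[idx[0]], scores[idx[-1]]))
            proportions.append(labels[idx].mean())

        # Step 5: (bounds, proportions) defines a piecewise linear map from raw
        # score ranges to calibrated probabilities (interpolation omitted here).
        return bounds, proportions

    bounds, props = build_bucket_calibration(
        np.random.rand(1000), np.random.randint(0, 2, size=1000), num_buckets=5)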
Reviewed By: dragonxlwang
Differential Revision: D4817119
fbshipit-source-id: 44f8fcad2a94f40f7439cc1ad47e7bae5e17397d
Summary: Somehow, feed non-ranking training data usually has this type of column. Add an option to support it.
Reviewed By: xianjiec, kennyhorror
Differential Revision: D4773960
fbshipit-source-id: 5a7ef4618a070e04f3cd8ddfcbf2b7441c00d92d
Summary:
Add distributed training to dper2 while keeping dper1 working.
* Created a ModelDelegator to wrap ModelHelper and LayerModelHelper and mitigate the differences between them (a rough sketch of the idea is below).
* To get the average length for sparse features, I extracted some information in feature_processor. There should be a better way to do this once we have the new compute_meta.
* Metrics currently run only on the first trainer.
* The model is saved correctly for evaluation, but I'm still not sure how to handle the weights for adagrad.
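A rough, hypothetical sketch of the delegation idea (the real ModelDelegator smooths over many more API differences):

    class ModelDelegator(object):
        # Hypothetical sketch: forward attribute access to whichever helper
        # (ModelHelper or LayerModelHelper) the workflow was built with, so the
        # distributed-training code can treat dper1 and dper2 models uniformly.
        def __init__(self, model):
            self._model = model

        def __getattr__(self, name):
            return getattr(self._model, name)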
Reviewed By: kennyhorror
Differential Revision: D4767745
fbshipit-source-id: 0559d264827a7fd9327071e8367d1e84a936bea9
Summary:
This diff adds eval nets to the layer model helper. It should be useful for
cases where the train/eval nets need some extra input (usually some supervision),
for example various sampled layers, etc.
Differential Revision: D4769453
fbshipit-source-id: 7a8ec7024051eab73b8869ec21e20b5f10fd9acb
Summary:
`SamplingTrain` is a wrapper around another layer subclassing `SamplingTrainableMixin`. When instantiated in the training context, `SamplingTrain` produces the sparse output of the wrapped layer, which can be paired with `indices` to create a Map schema. When instantiated in the prediction context, the full output of the wrapped layer is produced.
This is like the SampledFC function in the model helper, https://fburl.com/gi9g1awh, with the ability to be instantiated in both training and prediction contexts.
I'd like to get consensus on whether we should introduce the `SamplingTrain` layer and the accompanying mixin. This could probably be accomplished some other way, but I think this is not too bad.
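A hypothetical sketch of the wrapper pattern described above (class and method names are illustrative, not the actual layer API):

    class SamplingTrainableMixin(object):
        # Hypothetical: the wrapped layer exposes both a full output and a
        # sampled (sparse) output restricted to the supplied indices.
        def full_output(self):
            raise NotImplementedError

        def sampled_output(self, indices):
            raise NotImplementedError

    class SamplingTrain(object):
        # Hypothetical wrapper: use the sampled output when training and the
        # full output when predicting.
        def __init__(self, wrapped, is_training):
            self.wrapped = wrapped
            self.is_training = is_training

        def output(self, indices=None):
            if self.is_training:
                return self.wrapped.sampled_output(indices)
            return self.wrapped.full_output()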
Reviewed By: xianjiec
Differential Revision: D4689887
fbshipit-source-id: 7be8a52d82f3a09a053378146262df1047ab26a8
Summary: Add a SparseNN workflow for feed. I haven't fully thought about the changes needed for ads; I added a property called 'preproc_output_schema' to LayerModelHelper.
Reviewed By: xianjiec
Differential Revision: D4585796
fbshipit-source-id: 060d08f4beb928e7e7863f2e563f612c358951fb
Summary:
This diff tries to address one of the concerns Xianjie has had - the requirement to create a layer for every operator and to pass shapes and other info around.
The basic idea of the diff:
1. Try to create a layer with the given name; if one isn't available, fall back to an operator with that name (which is expected to have no parameters).
2. For all operators added through this functional style of creation, try to use the C2 shape/type inference logic to get the output type. If that fails, just return an untyped record and expect the user to annotate it when it's really needed.
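A hypothetical sketch of the fallback logic, with made-up registry and helper names; the real implementation hooks into the layer registry and Caffe2's shape/type inference:

    # Hypothetical stand-ins for the real layer registry and C2 type inference.
    LAYER_REGISTRY = {}

    def try_infer_output_type(net, output):
        # Placeholder for Caffe2's shape/type inference; None means "could not infer".
        return None

    def untyped_record(output):
        # Placeholder for an untyped schema record wrapping the raw output blobs.
        return ('untyped', output)

    def add_functional(model, name, input_record, **kwargs):
        # 1. Prefer a registered layer with this name.
        if name in LAYER_REGISTRY:
            return model.add_layer(name, input_record, **kwargs)
        # 2. Otherwise fall back to a plain operator (expected to have no parameters).
        output = model.add_operator(name, input_record, **kwargs)
        # Try shape/type inference on the resulting output.
        inferred = try_infer_output_type(model.net, output)
        if inferred is None:
            # Inference failed: return an untyped record; the user annotates it later.
            return untyped_record(output)
        return inferred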
Reviewed By: xianjiec
Differential Revision: D4408771
fbshipit-source-id: aced7487571940d726424269970df0eb62670c39
Summary:
First part of adding half-float support to DPER 2.0. Let's add an option use_half_floats to enable converting some of the model's weights from fp32 to fp16 before saving them to the predictor model parts. For now this applies to the SparseLookup layer's embeddings. All conversion is done after training finishes, and the saved models are ready to be used on remote predictors as-is (they will be stored compacted in memory). The new fp16 blobs are saved to the model in place of the original ones, under the same names, so we don't modify MetaNetDef at all.
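A minimal numpy sketch of the conversion step, assuming a simple name-to-array blob map; the real diff converts the workspace blobs right before saving:

    import numpy as np

    def halve_embedding_blobs(blobs, embedding_blob_names):
        # Convert the selected fp32 blobs to fp16 and keep them under the same
        # names, so the saved MetaNetDef does not need to change.
        for name in embedding_blob_names:
            blobs[name] = blobs[name].astype(np.float16)
        return blobs

    blobs = {"sparse_lookup/w": np.random.rand(100, 32).astype(np.float32)}
    halve_embedding_blobs(blobs, ["sparse_lookup/w"])
    assert blobs["sparse_lookup/w"].dtype == np.float16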
Next steps:
1) support on the delivery side -- operators working with these blobs should support both float and float16 input types
2) benchmark performance to make sure there is no regression
a) of serialization
b) of delivery
3) support realtime training (I'm thinking about adding a new pre-publishing net which will be executed each time the realtime trainer stops to publish a new snapshot)
Depends on D4567304
Reviewed By: kennyhorror
Differential Revision: D4571710
fbshipit-source-id: 19967a17d3bd84878d66e8c0ed8c5342bf38d979
Summary:
Until now, the DPer example has been creating multiple copies of the transform
config in the net definition, which caused me to hit the ProtoBuf limit (64 MB)
for certain Task requests (especially visible because of the ValidationPipeline
I was adding).
After this diff we store SigridTransforms in one instance per machine for
training (or one instance per reading).
The difference in plan sizes for a simple SparseNN model is ~30 MB (even though the second model has a validation plan as well).
TODO: Do similar logic for NNPreProc as well (it's also pretty large).
Reviewed By: dzhulgakov
Differential Revision: D4441441
fbshipit-source-id: 4452dd86a4dc49b2c7f5b7642f443aed5720b047
Summary: As title. We want to have a request_only net which runs on user_only sparse features. Submitting to get early feedback.
Reviewed By: dzhulgakov
Differential Revision: D4282783
fbshipit-source-id: 71241bf5444550075884c788c2da4783659bc1e0
Summary:
We want to implement a request-only net, and to do this we decided to split the work into two parts. The first part will propagate the required metadata and the second part will cut the nets properly.
This diff propagates the request_only metadata across the layers.
A few notes about the implementation:
- Each layer contains a request_only field which can be set based on the input_record. If all the scalars from the input_record are marked request_only, we mark the layer as request_only (see the sketch after this list);
- The Sparse-To-Dense layer sets the request_only metadata;
- The SigridTransformation and SparseLookup layers propagate the request_only status;
- For now we join request_only and other sparse features together in the input_record, but ideally we may want to separate them, because request_only features should be served separately;
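A hypothetical sketch of the propagation rule in the first note (the metadata attribute here is illustrative, not the actual schema API):

    def is_request_only(input_record):
        # A layer is marked request_only iff every scalar in its input_record
        # carries request_only metadata.
        scalars = input_record.all_scalars()
        return bool(scalars) and all(
            getattr(scalar.metadata, 'request_only', False) for scalar in scalars)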
Reviewed By: xianjiec
Differential Revision: D4259505
fbshipit-source-id: db8a30ef92cba84f1a843981b9dde3a8b9633608