Commit Graph

6 Commits

Author SHA1 Message Date
Orion Reblitz-Richardson
1d5780d42c Remove Apache headers from source.
* LICENSE file contains details, so removing from individual source files.
2018-03-27 13:10:18 -07:00
Yangqing Jia
8286ce1e3a Re-license to Apache
Summary: Closes https://github.com/caffe2/caffe2/pull/1260

Differential Revision: D5906739

Pulled By: Yangqing

fbshipit-source-id: e482ba9ba60b5337d9165f28f7ec68d4518a0902
2017-09-28 16:22:00 -07:00
Jiyan Yang
a8695178aa Adding parameter sharing API to Dper2
Summary:
To achieve this, I modified the blob naming scheme defined in a layer.
Before, the names were scope/fc_w and scope/fc_w_auto_0 (if there is another fc
within the same scope). Now I have changed them to scope/fc/w and scope/fc_auto_0/w.
That is, we rely on the uniqueness of the scoped layer name to define
the names for blobs.

I also overrode the create_param method in LayerModelHelper to let it
use the resolved blob names given the parameter sharing context.

There are some details, such as making the initializer more structured,
that I still need to finalize.
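
A rough, hypothetical sketch of the naming scheme described above (the helper names here are illustrative only, not the actual LayerModelHelper code):

    def unique_layer_name(scope, layer_type, used):
        """Return 'scope/fc', then 'scope/fc_auto_0', ... so scoped layer names stay unique."""
        name = "%s/%s" % (scope, layer_type)
        if name not in used:
            return name
        i = 0
        while "%s_auto_%d" % (name, i) in used:
            i += 1
        return "%s_auto_%d" % (name, i)

    def blob_name(layer_name, param):
        """Blob names hang off the unique layer name, e.g. 'scope/fc/w'."""
        return "%s/%s" % (layer_name, param)

    used = set()
    first = unique_layer_name("scope", "fc", used); used.add(first)
    second = unique_layer_name("scope", "fc", used); used.add(second)
    print(blob_name(first, "w"))   # scope/fc/w
    print(blob_name(second, "w"))  # scope/fc_auto_0/w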

Reviewed By: kennyhorror

Differential Revision: D5435132

fbshipit-source-id: a0525f5ea0977e255dd5ea765b38913f5951d455
2017-08-03 00:33:18 -07:00
Honghao Wei
290acab2c7 Implement DRelu and unit tests
Summary:
In this revision, I mainly implemented the DRelu activation. See https://arxiv.org/pdf/1706.06978v1.pdf for details.
To sum up, unlike standard relu and prelu, which divide the input into two parts with the boundary at zero, DRelu computes another value p to divide the activation into two parts. p is the softmax value of the output of Batch Normalization. For the f(x) = x part of relu, you can find a similar pattern in f(x) = px, and for the f(x) = 0 part of relu, you can find a similar pattern in f(x) = a(1-p)x, in which a is a parameter to tune. The DRelu activation result is the sum of these two parts: f(x) = a(1-p)x + px.
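
As a hedged numeric illustration only (not the actual Caffe2 layer): alpha stands in for the tunable parameter a, and p is approximated with a sigmoid of the batch-normalized input.

    import numpy as np

    def drelu(x, alpha=0.1, eps=1e-5):
        # Normalize per feature as BatchNormalization would (batch statistics).
        x_hat = (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)
        # p squashes the normalized value into (0, 1); the commit describes it as
        # a softmax of the BatchNormalization output (a sigmoid is the two-way case).
        p = 1.0 / (1.0 + np.exp(-x_hat))
        # f(x) = p*x + a*(1-p)*x, the sum of the two parts described above.
        return p * x + alpha * (1.0 - p) * x

    x = np.array([[-2.0, -0.5], [0.5, 2.0]])
    print(drelu(x))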

To implement DRelu, I take BatchNormalization as the super class and then use the above formula for the computation. To allow users to choose the activation method, which usually takes place when calling the add_mlp function in processor_util.py, I pass the parameter through model_option from the UI down to the implementation, just as dropout does. Currently I place it in extra_option, but I can move it if the AML team needs to redesign the UI.

I also added unit tests for DRelu. We check the shape of the output and also run numeric unit tests.
For the unit tests, I first check the numeric values of BatchNormalization, since there was no similar test before. I then compute the values of the DRelu outputs and compare the results with the current DRelu layer.

Reviewed By: chocjy

Differential Revision: D5341464

fbshipit-source-id: 896b4dcc49cfd5493d97a8b448401b19e9c80630
2017-07-20 11:50:08 -07:00
Honghao Wei
34f7acbedf Report bug in BatchNormalization: the dimension is wrong for second order
Summary: The input channel dimension for NHWC should be the last dimension, C. Since the batch size is omitted, its index should be 2 instead of 3.
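
For illustration only (not the layer code itself): with the batch dimension omitted, an NHWC blob has shape (H, W, C), so the channel axis index is 2 rather than 3.

    import numpy as np

    x = np.zeros((4, 4, 3))        # NHWC with the batch size omitted: (H, W, C)
    channel_axis = x.ndim - 1      # the channel axis is the last dimension
    assert channel_axis == 2       # not 3, since there is no leading N dimension
    print(x.shape[channel_axis])   # 3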

Reviewed By: chocjy

Differential Revision: D5418538

fbshipit-source-id: a6939a863817b7566198ea2a665a1d236a2cf63d
2017-07-13 18:31:18 -07:00
Jiyan Yang
6aff754dbc Add batch normalization layer
Summary: As desc.

Reviewed By: xianjiec

Differential Revision: D5077230

fbshipit-source-id: f73cdedac6d9a3542f8ef829b54fb4c713dcafd0
2017-05-26 16:46:52 -07:00