pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 00:21:07 +01:00

Author	SHA1	Message	Date
Bugra Akyildiz	27c7158166	Remove __future__ imports for legacy Python2 supports (#45033 ) Summary: There is a module called `2to3` which you can target for future specifically to remove these, the directory of `caffe2` has the most redundant imports: ```2to3 -f future -w caffe2``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/45033 Reviewed By: seemethere Differential Revision: D23808648 Pulled By: bugra fbshipit-source-id: 38971900f0fe43ab44a9168e57f2307580d36a38	2020-09-23 17:57:02 -07:00
Brian Wignall	f326045b37	Fix typos, via a Levenshtein-type corrector (#31523 ) Summary: Should be non-semantic. Uses https://en.wikipedia.org/wiki/Wikipedia:Lists_of_common_misspellings/For_machines to find likely typos, with https://github.com/bwignall/typochecker to help automate the checking. Uses an updated version of the tool used in https://github.com/pytorch/pytorch/pull/30606 . Pull Request resolved: https://github.com/pytorch/pytorch/pull/31523 Differential Revision: D19216749 Pulled By: mrshenli fbshipit-source-id: 7fd489cb9a77cd7e4950c1046f925d57524960ea	2020-01-17 16:03:19 -08:00
Silun Wang	28c1258f18	Scale init for batch-norm and layer-norm (#31983 ) Summary: Per discussion with Fei Tian, we need to add a `scale_init_value` to scale down the output of normalization such as batch-norm and layer-norm. Currently we have `sparse_normalization_options` to normalize embedding pooling output. By default, scale = 1.0, we found it's better to set scale from 0.025 to 0.1 https://fb.quip.com/MiKUAibEaYhH Besides, I am removing the tags from normalizers because it makes more sense to calculate norm ops in distributed trainers, not ps. Pull Request resolved: https://github.com/pytorch/pytorch/pull/31983 Test Plan: Testing LN and BN after sum-pooling -- baseline f160348514 LN: f160348609 BN: f160348710 {F226106518} Layer norm after sum-pooling fwd_net https://fburl.com/sa4j207n Layer norm after dot-prod fwd_net https://fburl.com/twggwyvb ## Unit Tests Testing normalization after pooling ``` buck test caffe2/caffe2/fb/dper/layer_models/tests/split_1:sparse_nn_test_4 -- test_sparse_pooling_batch_normalization buck test caffe2/caffe2/fb/dper/layer_models/tests/split_1:sparse_nn_test_4 -- test_dense_sparse_pooling_batch_normalization buck test caffe2/caffe2/fb/dper/layer_models/tests/split_1:sparse_nn_test_4 -- test_sparse_pooling_layer_normalization buck test caffe2/caffe2/fb/dper/layer_models/tests/split_1:sparse_nn_test_4 -- test_dense_sparse_pooling_layer_normalization ``` Testing normalization after dot-prod ``` buck test caffe2/caffe2/fb/dper/layer_models/tests/split_1:sparse_nn_test -- test_last_layer_use_batch_norm buck test caffe2/caffe2/fb/dper/layer_models/tests/split_1:sparse_nn_test -- test_last_layer_use_layer_norm ``` Differential Revision: D19277618 Pulled By: SilunWang fbshipit-source-id: ea323e33e3647ba55d2e808ef09d94ad7b45b934	2020-01-10 11:55:56 -08:00
Orion Reblitz-Richardson	1d5780d42c	Remove Apache headers from source. * LICENSE file contains details, so removing from individual source files.	2018-03-27 13:10:18 -07:00
Yangqing Jia	8286ce1e3a	Re-license to Apache Summary: Closes https://github.com/caffe2/caffe2/pull/1260 Differential Revision: D5906739 Pulled By: Yangqing fbshipit-source-id: e482ba9ba60b5337d9165f28f7ec68d4518a0902	2017-09-28 16:22:00 -07:00
Jiyan Yang	a8695178aa	Adding parameter sharing API to Dper2 Summary: To achive this, I modified the blob name scheme defined in a layer. Before it was scope/fc_w and scope/fc_w_auto_0 (if there is another fc within the same scope). Now I change it to scope/fc/w and scope/fc_auto_0/w. That is, we rely on the uniqueness of the scoped layer name to define names for blobs. I also overwrote the create_param method in LayerModelHelper to let it use the resolved name for blobs given the sharingparameter context. There are some details such as making the initializer more structured that I need to finalize. Reviewed By: kennyhorror Differential Revision: D5435132 fbshipit-source-id: a0525f5ea0977e255dd5ea765b38913f5951d455	2017-08-03 00:33:18 -07:00
Honghao Wei	290acab2c7	implement drelu and unittest Summary: In this revision, I mainly implemented the DRelu activation. See https://arxiv.org/pdf/1706.06978v1.pdf for details. To sum up, different from standard relu and purely, which divide the scope into two parts with boundary at zero, DRelu calculate another value p to divide the activation into two part. P is the softmax value of the output of Batch Normalization. For f(x)=x part in relu, you can find similar patten in f(x)=px, and for f(x)=0 part in rely, you can find similar pattern in f(x)=a(1-p)x, in which a is a parameter to tune. Drelu activation result is the sum of these two parts, f(x) = a(1-p)x + px. To implement DRelu, I take BatchNormalization as super class and then use the above formula for computation. In order to allow users to choose activation methods, which usually takes place when calling add_mlp function in processor_util.py, I pass the parameter transfer in model_option from UI to the details, just as what dropout do. Currently, I place it in extra_option, but can modify it if AML team needs to redesign the UI. I also add units test for DRelu. We check the shape of output and also do the numeric unit tests. For Unit test, I first check the numeric value of BatchNormalization, since there is no similar test before. I then compute the value of DRelu outputs and compare the results with current DRelu layer. Reviewed By: chocjy Differential Revision: D5341464 fbshipit-source-id: 896b4dcc49cfd5493d97a8b448401b19e9c80630	2017-07-20 11:50:08 -07:00
Honghao Wei	34f7acbedf	Report bugs in BatchNormalization, the dimension is wrong for second order Summary: The number input dimension for NHWC should be the last dimension C. Since batch size is omitted, it should be 2 instead of 3. Reviewed By: chocjy Differential Revision: D5418538 fbshipit-source-id: a6939a863817b7566198ea2a665a1d236a2cf63d	2017-07-13 18:31:18 -07:00
Jiyan Yang	6aff754dbc	Add batch normalization layer Summary: As desc. Reviewed By: xianjiec Differential Revision: D5077230 fbshipit-source-id: f73cdedac6d9a3542f8ef829b54fb4c713dcafd0	2017-05-26 16:46:52 -07:00

9 Commits