Summary:
* Deletes all weak script decorators / associated data structures / methods
* In order to keep supporting the standard library in script, this enables recursive scripting for any function defined in `torch.nn`
* Most changes in `torch/nn` are the result of `ag -Q "weak" torch/nn/ -l | xargs sed -i '/weak/d'`; only `rnn.py` needed manual editing, using `ignore` and `export` to keep supporting the overloaded `forward` methods (a sketch of these decorators follows below)
* `Sequential`/`ModuleList` no longer need to be added to `__constants__` since they are compiled on demand
This should also fix https://github.com/pytorch/pytorch/issues/22212
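For reference, a minimal sketch (not from this PR) of how `torch.jit.ignore` and `torch.jit.export` interact with recursive scripting; the module and method names are illustrative only:
```
import torch
import torch.nn as nn

class MyModule(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 4)

    @torch.jit.ignore
    def debug_log(self, x):
        # Left as plain Python; the compiler skips this method entirely.
        print("mean:", x.mean().item())

    @torch.jit.export
    def encode(self, x):
        # Compiled even though it is not reachable from forward().
        return torch.relu(self.linear(x))

    def forward(self, x):
        return self.linear(x)

# Recursive scripting compiles the module and its submodules on demand;
# no weak-script decorators are needed.
scripted = torch.jit.script(MyModule())
print(scripted.encode(torch.randn(2, 4)))
```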
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22212
Differential Revision: D15988346
Pulled By: driazati
fbshipit-source-id: af223e3ad0580be895377312949997a70e988e4f
Summary:
The changes include:
1. Allow key/value to have a different number of features than the query, supporting the case where the key and value feature dimensions differ from the query's.
2. Support three separate proj_weights in addition to a single in_proj_weight. The projection weights for key and value may have dimensions different from the query's, so three separate proj_weights are necessary in that case. When key and value have the same dimension as the query, a single large proj_weight is preferred for performance; using a single large weight versus three separate weights is therefore a size-dependent decision (see the sketch after this list).
3. Give an option to use static k and v in the multihead_attn operator (see saved_k and saved_v). Those static key/value tensors can now be re-used when training the model.
4. Add more test cases to cover the arguments.
Note: current users should not be affected by these changes.
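As a usage sketch (the keyword names below follow the current `torch.nn.MultiheadAttention` signature and are assumptions, not text from the PR):
```
import torch
import torch.nn as nn

embed_dim, num_heads = 16, 4
L, S, N = 5, 7, 2                    # target length, source length, batch size

# Keys/values with feature sizes different from the query; separate
# projection weights are used internally in this case.
mha = nn.MultiheadAttention(embed_dim, num_heads, kdim=32, vdim=24)
query = torch.randn(L, N, embed_dim)
key   = torch.randn(S, N, 32)
value = torch.randn(S, N, 24)
attn_output, attn_weights = mha(query, key, value)
print(attn_output.shape)             # torch.Size([5, 2, 16])
print(attn_weights.shape)            # torch.Size([2, 5, 7])
```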
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21288
Differential Revision: D15738808
Pulled By: zhangguanheng66
fbshipit-source-id: 288b995787ad55fba374184b3d15b5c6fe9abb5c
Summary:
A bunch of modules were missing entries in `__constants__`, which was breaking their `__repr__`s. Others had `__constants__` entries that were unnecessary because they were already provided by a parent class.
Fixes #20978
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21071
Pulled By: driazati
Differential Revision: D15539518
fbshipit-source-id: 24bdd1ef41ef636eefd5d2bad4ab2d79646ed4f0
Summary:
Remove the internal functions in multi_head_attention_forward. Those internal functions cause a 10-15% performance regression, and there is possibly a related JIT issue.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20653
Differential Revision: D15398888
Pulled By: cpuhrsch
fbshipit-source-id: 0a3f053a4ade5009e73d3974fa6733c2bff9d929
Summary:
Attention mask should be of shape `(L, S)` since it is added to `attn_output_weights`.
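A small illustrative example (not part of the fix) of passing a mask of the expected shape:
```
import torch
import torch.nn as nn

L, S, N, E = 4, 6, 2, 8                      # target len, source len, batch, embed dim
mha = nn.MultiheadAttention(E, num_heads=2)
query = torch.randn(L, N, E)
key = value = torch.randn(S, N, E)

# attn_mask is (L, S): one additive entry per (target, source) position pair.
attn_mask = torch.zeros(L, S)
attn_mask[:, -1] = float('-inf')             # e.g. forbid attending to the last source position
out, weights = mha(query, key, value, attn_mask=attn_mask)
print(weights.shape)                         # torch.Size([2, 4, 6]) == (N, L, S)
```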
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20850
Differential Revision: D15495587
Pulled By: ezyang
fbshipit-source-id: 61d6801da5291df960daab273e874df28aedbf6e
Summary:
Fix a typo in the doc.
Add an assertion check back to the MultiheadAttention module.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20492
Differential Revision: D15349008
Pulled By: cpuhrsch
fbshipit-source-id: 2d898345f03787c713e537673613a748ad826b34
Summary:
Moving functions from torch/nn/modules/activation.py to torch/nn/functional.py. For functions not implemented (_get_input_buffer and _set_input_buffer), a TODO is added.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20415
Differential Revision: D15318078
Pulled By: jamarshon
fbshipit-source-id: 5ca698e2913821442cf8609cc61ac8190496a3c6
Summary:
To fully support the incremental_state function, several additional utils available in fairseq are required. However, we lack a proper unit test for it, so the incremental_state function is disabled for now. If it is needed in the future, a feature request can be created. Fixes #20132
Add some unit tests to cover the arguments of the MultiheadAttention module, including bias, add_bias_kv, add_zero_attn, key_padding_mask, need_weights, and attn_mask.
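A quick sketch exercising a few of these arguments (illustrative only; mask dtype conventions have varied across releases, and the bool mask used here works on recent versions):
```
import torch
import torch.nn as nn

S, N, E = 5, 3, 8
mha = nn.MultiheadAttention(E, num_heads=2, add_bias_kv=True, add_zero_attn=True)
query = key = value = torch.randn(S, N, E)

# key_padding_mask is (N, S); True marks source positions to be ignored.
key_padding_mask = torch.zeros(N, S, dtype=torch.bool)
key_padding_mask[:, -1] = True

out, weights = mha(query, key, value,
                   key_padding_mask=key_padding_mask,
                   need_weights=False)
print(out.shape)                             # torch.Size([5, 3, 8])
print(weights)                               # None when need_weights=False
```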
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20177
Differential Revision: D15304575
Pulled By: cpuhrsch
fbshipit-source-id: ebd8cc0f11a4da0c0998bf0c7e4e341585e5685a
Summary:
See Issue #20301.
Specify dim in the docstring example to prevent a UserWarning.
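For context, the kind of call the docstring example uses now (a sketch, not the exact docstring text):
```
import torch
import torch.nn as nn

m = nn.Softmax(dim=1)      # an explicit dim avoids the implicit-dim UserWarning
x = torch.randn(2, 3)
print(m(x).sum(dim=1))     # each row sums to 1
```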
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20310
Differential Revision: D15277734
Pulled By: ezyang
fbshipit-source-id: 2e8b748dbe743675a5a538ccbe97713aad02e8ac
Summary:
Import MultiheadAttention into the core PyTorch framework.
Users can now import MultiheadAttention directly from torch.nn.
See "Attention Is All You Need" for more details on the MultiheadAttention function.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18334
Differential Revision: D14577966
Pulled By: zhangguanheng66
fbshipit-source-id: 756c0deff623f3780651d9f9a70ce84516c806d3
Summary:
The documentation for LogSigmoid says:
> Applies the element-wise function:
> \<blank\>
Now the documentation properly displays the math string.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16900
Differential Revision: D14020097
Pulled By: ezyang
fbshipit-source-id: 41e229d0fcc6b9bb53367be548bf85286dc13546
Summary:
PR to update the shape notation for all of the torch.nn modules to take a unified form. The goal is to make these definitions machine-readable and thus checkable by unifying the style across all of the different modules.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15741
Differential Revision: D13709601
Pulled By: ezyang
fbshipit-source-id: fb89a03903fdf0cd0dcf76f3e469b8582b2f3634
Summary:
This PR:
1. add tests for batchnorm/dropout for train/eval parameter mutation
2. remove training constants from all our standard library
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14780
Differential Revision: D13331578
Pulled By: wanchaol
fbshipit-source-id: d92ca3ce38cc2888688d50fe015e3e22539a20a5
Summary:
This PR adds weak modules for all activation modules and uses `test_nn` module tests to test weak modules that have been annotated with `weak_module` and therefore are in `torch._jit_internal._weak_types`
Also depends on #14379
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14238
Differential Revision: D13252887
Pulled By: driazati
fbshipit-source-id: e9638cf74089884a32b8f0f38396cf432c02c988
Summary:
This PR adds weak modules for all activation modules and uses `test_nn` module tests to test weak modules that have been annotated with `weak_module` and therefore are in `torch._jit_internal._weak_types`
Also depends on #14379
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14238
Differential Revision: D13192230
Pulled By: driazati
fbshipit-source-id: 36488960b6c91448b38c0fa65422539a93af8c5e
Summary:
- fixes https://github.com/pytorch/pytorch/issues/10723
- migrate PReLU to ATen and deprecate legacy PReLU
- performance:
CPU with weight.numel() = 1
```
>>> m = nn.PReLU()
>>> x = torch.randn(100, 100, 100, requires_grad=True)
>>> %timeit -r 100 y = m(x)
100 loops, best of 100: 9.43 ms per loop
>>> y = m(x).sum()
>>> %timeit -r 100 y.backward(retain_graph=True)
10 loops, best of 100: 24.4 ms per loop
>>> m = nn.PReLU()
>>> x = torch.randn(100, 100, 100, requires_grad=True)
>>> %timeit -r 100 y = m(x)
1000 loops, best of 100: 695 µs per loop
>>> y = m(x).sum()
>>> %timeit -r 100 y.backward(retain_graph=True)
100 loops, best of 100: 2.47 ms per loop
```
CPU with weight.numel() = channels
```
>>> m = nn.PReLU(100)
>>> x = torch.randn(100, 100, 100, requires_grad=True)
>>> %timeit -r 100 y = m(x)
1000 loops, best of 100: 603 µs per loop
>>> y = m(x).sum()
>>> %timeit -r 100 y.backward(retain_graph=True)
100 loops, best of 100: 13.3 ms per loop
>>> m = nn.PReLU(100)
>>> x = torch.randn(100, 100, 100, requires_grad=True)
>>> %timeit -r 100 y = m(x)
1000 loops, best of 100: 655 µs per loop
>>> y = m(x).sum()
>>> %timeit -r 100 y.backward(retain_graph=True)
100 loops, best of 100: 2.45 ms per loop
```
CUDA with weight.numel() = 1
```
>>> m = nn.PReLU().cuda()
>>> x = torch.randn(100, 100, 100, requires_grad=True).cuda()
>>> %timeit -r 100 torch.cuda.synchronize(); y = m(x); torch.cuda.synchronize();
10000 loops, best of 100: 187 µs per loop
>>> y = m(x).sum()
>>> %timeit -r 100 torch.cuda.synchronize(); y.backward(retain_graph=True); torch.cuda.synchronize();
100 loops, best of 100: 2.01 ms per loop
>>> m = nn.PReLU().cuda()
>>> x = torch.randn(100, 100, 100, requires_grad=True).cuda()
>>> %timeit -r 100 torch.cuda.synchronize(); y = m(x); torch.cuda.synchronize();
1000 loops, best of 100: 195 µs per loop
>>> y = m(x).sum()
>>> %timeit -r 100 torch.cuda.synchronize(); y.backward(retain_graph=True); torch.cuda.synchronize();
100 loops, best of 100: 2.28 ms per loop
```
CUDA with weight.numel() = channel
```
>>> m = nn.PReLU(100).cuda()
>>> x = torch.randn(100, 100, 100, requires_grad=True).cuda()
>>> %timeit -r 100 torch.cuda.synchronize(); y = m(x); torch.cuda.synchronize();
1000 loops, best of 100: 174 µs per loop
>>> y = m(x).sum()
>>> %timeit -r 100 torch.cuda.synchronize(); y.backward(retain_graph=True); torch.cuda.synchronize();
100 loops, best of 100: 2.27 ms per loop
>>> m = nn.PReLU(100).cuda()
>>> x = torch.randn(100, 100, 100, requires_grad=True).cuda()
>>> %timeit -r 100 torch.cuda.synchronize(); y = m(x); torch.cuda.synchronize();
10000 loops, best of 100: 181 µs per loop
>>> y = m(x).sum()
>>> %timeit -r 100 torch.cuda.synchronize(); y.backward(retain_graph=True); torch.cuda.synchronize();
100 loops, best of 100: 2.26 ms per loop
```
The huge CPU performance regression when weight.numel() = 1 is addressed by replacing at::CPU_tensor_apply* with parallelized kernels.
ezyang SsnL zou3519 soumith
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11758
Differential Revision: D9995799
Pulled By: weiyangfb
fbshipit-source-id: d289937c78075f46a54dafbde92fab0cc4b5b86e
This PR enables users to print extra information from their subclassed nn.Module.
Currently the user-defined string is simply inserted at the end of the module name; the exact formatting should be discussed in this PR.
Before this PR, users had to redefine `__repr__` and copy & paste the source code from Module.
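A sketch of what a subclass looks like with the new hook (the method name was later settled as `extra_repr`, per the bullets below; the class here is illustrative):
```
import torch.nn as nn

class MyLayer(nn.Module):
    def __init__(self, in_features, out_features, bias=True):
        super().__init__()
        self.in_features = in_features
        self.out_features = out_features
        self.use_bias = bias

    def extra_repr(self):
        # nn.Module inserts this string into the module's repr.
        return 'in_features={}, out_features={}, bias={}'.format(
            self.in_features, self.out_features, self.use_bias)

print(MyLayer(3, 4))
# MyLayer(in_features=3, out_features=4, bias=True)
```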
* Add support for extra information on Module
* Rewrite the repr method of Module
* Fix flake8
* Change the __repr__ to get_extra_repr in Linear
* Fix extra new-line for empty line
* Add test for __repr__ method
* Fix bug of block string indent
* Add indent for multi-line repr test.
* Address review comments
* Update tutorial for creating nn.Module
* Fix flake8, add extra_repr of bilinear
* Refactor DropoutNd
* Change to extra_repr in some Modules
* Fix flake8
* Refactor padding modules
* Refactor pooling module
* Fix typo
* Change to extra_repr
* Fix bug for GroupNorm
* Fix bug for LayerNorm
* Fix some minor errors in existing docs.
* Fix Convolution and Pooling docs in torch.nn.functional
* Cleaned up torch.nn.functional docs
* Address @SsnL 's comments
* Add multiplication sign missing in docs
* Fix more typos, and clear some warnings
* Change infinity symbol in LPPool2d
* Revert some changes in torch.nn.functional
* Few more minor changes
* Improve documentation
1. Add formula for erf, erfinv
2. Make exp, expm1 similar to log, log1p
3. Symbol change in ge, le, ne, isnan
* Fix minor nit in the docstring
* More doc improvements
1. Added some formulae
2. Complete scanning till "Other Operations" in Tensor docs
* Add more changes
1. Modify all torch.Tensor wherever required
* Fix Conv docs
1. Fix minor nits in the references for LAPACK routines
* Improve Pooling docs
1. Fix lint error
* Improve docs for RNN, Normalization and Padding
1. Fix flake8 error for pooling
* Final fixes for torch.nn.* docs.
1. Improve Loss Function documentation
2. Improve Vision Layers documentation
* Fix lint error
* Improve docstrings in torch.nn.init
* Fix lint error
* Fix minor error in torch.nn.init.sparse
* Fix Activation and Utils Docs
1. Fix Math Errors
2. Add explicit clean to Makefile in docs to prevent running graph generation script
while cleaning
3. Fix utils docs
* Make PYCMD a Makefile argument, clear up prints in the build_activation_images.py
* Fix batch norm doc error
The nn.* counterpart of #5443. Mostly removed the Variable wrapper. Also added doc for nn.RReLU.
Notice that torch.randn(*, requires_grad=True) isn't documented until #5462 is done.
- Cleaned up THNN and THCUNN code and kernels
- Improved THCUNN kernel performance 5x, making it match cuDNN performance
- Added support for computing softmax over arbitrary dims
NOTE: The default dim for 3D inputs is now 1 (used to be 0); see the sketch after this list
- Both functions now accept inputs with arbitrarily many dimensions
- Autograd functions no longer save the input (it's unnecessary)
- Added cuDNN bindings for softmax, but they are unused as THCUNN
matches or even exceeds cuDNN performance
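An illustrative call with an explicit `dim` argument (relying on the implicit default described above was later deprecated in favor of specifying `dim`):
```
import torch
import torch.nn.functional as F

x = torch.randn(2, 3, 4)
# Softmax can be taken over any dimension; for a 3D input the implicit
# default described above is dim=1, so be explicit:
probs = F.softmax(x, dim=1)
print(probs.sum(dim=1))   # all ones
```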