Commit Graph

213 Commits

Author SHA1 Message Date
Guanheng Zhang
4b20fc826d Import MultiheadAttention to PyTorch (#18334)
Summary:
Import MultiheadAttention into the core pytorch framework.
Users now can import MultiheadAttention directly from torch.nn.
See "Attention Is All You Need" for more details related to MultiheadAttention function.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18334

Differential Revision: D14577966

Pulled By: zhangguanheng66

fbshipit-source-id: 756c0deff623f3780651d9f9a70ce84516c806d3
2019-04-11 08:07:30 -07:00
ZhuBaohe
e81878e0a9 Correct padding and activations docstrings in nn module
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17197

Differential Revision: D14131284

Pulled By: soumith

fbshipit-source-id: 6edd225b47b1dde81b5ad0a23c588c6621987a69
2019-02-19 08:16:52 -08:00
Travis Johnston
a1a330bd6e fixed LogSigmoid math string that wasn't rendering in documentation (#16900)
Summary:
The documentation for LogSigmoid says:

> Applies the element-wise function:
> \<blank\>

Now the documentation properly displays the math string.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16900

Differential Revision: D14020097

Pulled By: ezyang

fbshipit-source-id: 41e229d0fcc6b9bb53367be548bf85286dc13546
2019-02-10 11:47:56 -08:00
Sasha Rush
dbe6a7a9ff Unify the shape notation for all of the pytorch modules (#15741)
Summary:
PR to update the shape notation for all of the torch.nn modules to take a unified form. The goal is to make these definitions machine-readable and those checkable by unifying the style across all of the different modules.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15741

Differential Revision: D13709601

Pulled By: ezyang

fbshipit-source-id: fb89a03903fdf0cd0dcf76f3e469b8582b2f3634
2019-01-17 10:32:14 -08:00
surgan12
ac206a95f5 crelu mentioned (#15825)
Summary:
Mentioning crelu near relu in the docs .
fixes #15730  .
cc : ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15825

Differential Revision: D13605782

Pulled By: soumith

fbshipit-source-id: d34932cf82e5407c48548dbdfc1c61b596669a0b
2019-01-08 22:55:49 -08:00
Wanchao Liang
d872af9282 Add tests for dropout/batchnorm train/eval, remove training constants (#14780)
Summary:
This PR:

1. add tests for batchnorm/dropout for train/eval parameter mutatino
2. remove training constants from all our standard library
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14780

Differential Revision: D13331578

Pulled By: wanchaol

fbshipit-source-id: d92ca3ce38cc2888688d50fe015e3e22539a20a5
2018-12-04 18:17:43 -08:00
David Riazati
1f6d9f44fc Add InstanceNorm, Distance modules to Script
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14551

Differential Revision: D13272741

Pulled By: driazati

fbshipit-source-id: 3e4fe870d0e268903757f3ae8a56100606906bce
2018-11-29 22:18:55 -08:00
David Riazati
67e3905bc6 Revert D13268293: [pytorch][PR] [jit] Add InstanceNorm, Distance modules to Script
Differential Revision:
D13268293

Original commit changeset: cb33c6dcdadd

fbshipit-source-id: 214a29b74c85b7b25df0eb48e3fdb81539049130
2018-11-29 19:19:35 -08:00
David Riazati
75eccffdfe Add InstanceNorm, Distance modules to Script
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14551

Differential Revision: D13268293

Pulled By: driazati

fbshipit-source-id: cb33c6dcdaddf8c7a49b3535894d77bf5d771ddd
2018-11-29 17:26:29 -08:00
David Riazati
9e93a02624 Use nn module tests in test_jit (#14238)
Summary:
This PR adds weak modules for all activation modules and uses `test_nn` module tests to test weak modules that have been annotated with `weak_module` and therefore are in `torch._jit_internal._weak_types`

Also depends on #14379
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14238

Differential Revision: D13252887

Pulled By: driazati

fbshipit-source-id: e9638cf74089884a32b8f0f38396cf432c02c988
2018-11-28 23:31:25 -08:00
David Riazati
3d98810fbd Revert D13192230: [pytorch][PR] [jit] Use nn module tests in test_jit
Differential Revision:
D13192230

Original commit changeset: 36488960b6c9

fbshipit-source-id: 63b68bd909b9ef0548f52c986c84f549aecb8909
2018-11-28 00:23:09 -08:00
David Riazati
4cdcbbf410 Use nn module tests in test_jit (#14238)
Summary:
This PR adds weak modules for all activation modules and uses `test_nn` module tests to test weak modules that have been annotated with `weak_module` and therefore are in `torch._jit_internal._weak_types`

Also depends on #14379
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14238

Differential Revision: D13192230

Pulled By: driazati

fbshipit-source-id: 36488960b6c91448b38c0fa65422539a93af8c5e
2018-11-27 21:19:51 -08:00
David Riazati
dbc467545f Update weak script modules to match fns (#13631)
Summary:
Add weak modules for those that use weak script functions
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13631

Differential Revision: D12945328

Pulled By: driazati

fbshipit-source-id: 6cb235763bf5ab35c7b32e0f734f08d22418594f
2018-11-06 21:22:52 -08:00
David Riazati
14ea4bf0d1 Make 7 nn modules into weak modules (#12966)
Summary:
Depends on #12682 ([stacked diff](https://github.com/driazati/pytorch/compare/weak_mod...driazati:mod_conv1))

* Adds tests for weak module conversion that creates a `ScriptModule` that uses the weak module and checks its graph
* Adds `torch._jit_internal.weak_module` tags to modules that already work
  * `Sigmoid`
  * `Tanh`
  * `Hardshrink`
  * `PReLU`
  * `Softsign`
  * `Tanhshrink`
  * `PairwiseDistance`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12966

Differential Revision: D10559557

Pulled By: driazati

fbshipit-source-id: dc4bea3aa744b3c44d4fa7dceefd97e951f824d0
2018-10-25 13:59:34 -07:00
Wei Yang
de11fe0c83 migrate PReLU to ATen (#11758)
Summary:
- fixes https://github.com/pytorch/pytorch/issues/10723
- migrate PReLU to ATen and deprecate legacy PReLU
- performance:

CPU with weight.numel() = 1
```
>>> m = nn.PReLU()
>>> x = torch.randn(100, 100, 100, requires_grad=True)
>>> %timeit -r 100 y = m(x)
100 loops, best of 100: 9.43 ms per loop

>>> y = m(x).sum()
>>> %timeit -r 100 y.backward(retain_graph=True)
10 loops, best of 100: 24.4 ms per loop

>>> m = nn.PReLU()
>>> x = torch.randn(100, 100, 100, requires_grad=True)
>>> %timeit -r 100 y = m(x)
1000 loops, best of 100: 695 µs per loop

>>> y = m(x).sum()
>>> %timeit -r 100 y.backward(retain_graph=True)
100 loops, best of 100: 2.47 ms per loop
```

CPU with weight.numel() = channels
```
>>> m = nn.PReLU(100)
>>> x = torch.randn(100, 100, 100, requires_grad=True)
>>> %timeit -r 100 y = m(x)
1000 loops, best of 100: 603 µs per loop

>>> y = m(x).sum()
>>> %timeit -r 100 y.backward(retain_graph=True)
100 loops, best of 100: 13.3 ms per loop

>>> m = nn.PReLU(100)
>>> x = torch.randn(100, 100, 100, requires_grad=True)
>>> %timeit -r 100 y = m(x)
1000 loops, best of 100: 655 µs per loop

>>> y = m(x).sum()
>>> %timeit -r 100 y.backward(retain_graph=True)
100 loops, best of 100: 2.45 ms per loop
```

CUDA with weight.numel() = 1
```
>>> m = nn.PReLU().cuda()
>>> x = torch.randn(100, 100, 100, requires_grad=True).cuda()
>>> %timeit -r 100 torch.cuda.synchronize(); y = m(x); torch.cuda.synchronize();
10000 loops, best of 100: 187 µs per loop

>>> y = m(x).sum()
>>> %timeit -r 100 torch.cuda.synchronize(); y.backward(retain_graph=True); torch.cuda.synchronize();
100 loops, best of 100: 2.01 ms per loop

>>> m = nn.PReLU().cuda()
>>> x = torch.randn(100, 100, 100, requires_grad=True).cuda()
>>> %timeit -r 100 torch.cuda.synchronize(); y = m(x); torch.cuda.synchronize();
1000 loops, best of 100: 195 µs per loop

>>> y = m(x).sum()
>>> %timeit -r 100 torch.cuda.synchronize(); y.backward(retain_graph=True); torch.cuda.synchronize();
100 loops, best of 100: 2.28 ms per loop
```

CUDA with weight.numel() = channel
```
>>> m = nn.PReLU(100).cuda()
>>> x = torch.randn(100, 100, 100, requires_grad=True).cuda()
>>> %timeit -r 100 torch.cuda.synchronize(); y = m(x); torch.cuda.synchronize();
1000 loops, best of 100: 174 µs per loop

>>> y = m(x).sum()
>>> %timeit -r 100 torch.cuda.synchronize(); y.backward(retain_graph=True); torch.cuda.synchronize();
100 loops, best of 100: 2.27 ms per loop

>>> m = nn.PReLU(100).cuda()
>>> x = torch.randn(100, 100, 100, requires_grad=True).cuda()
>>> %timeit -r 100 torch.cuda.synchronize(); y = m(x); torch.cuda.synchronize();
10000 loops, best of 100: 181 µs per loop

>>> y = m(x).sum()
>>> %timeit -r 100 torch.cuda.synchronize(); y.backward(retain_graph=True); torch.cuda.synchronize();
100 loops, best of 100: 2.26 ms per loop
```

The huge performance regression in CPU when weight.numel() = 1 is addressed by replacing at::CPU_tensor_apply* with parallelized kernels.

ezyang SsnL zou3519  soumith
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11758

Differential Revision: D9995799

Pulled By: weiyangfb

fbshipit-source-id: d289937c78075f46a54dafbde92fab0cc4b5b86e
2018-09-21 16:26:04 -07:00
Rob Kunkle
6e85112f12 Adding katex rendering of equations, and required edits to equations. (#8848)
Summary:
This fixes issue #8529.

- Adds Katex extension to conf.py and requirements.txt
- Fixes syntax differences in docs
- Should allow documentation pages to render faster
Pull Request resolved: https://github.com/pytorch/pytorch/pull/8848

Reviewed By: soumith

Differential Revision: D8677702

Pulled By: goodlux

fbshipit-source-id: c4a832c5879e0eebcb14763b35a41663331ba23f
2018-08-02 12:25:17 -07:00
Xiang Gao
6fc75eadf0 Add CELU activation to pytorch (#8551)
Summary:
Also fuse input scale multiplication into ELU

Paper:
https://arxiv.org/pdf/1704.07483.pdf
Pull Request resolved: https://github.com/pytorch/pytorch/pull/8551

Differential Revision: D9088477

Pulled By: SsnL

fbshipit-source-id: 877771bee251b27154058f2b67d747c9812c696b
2018-08-01 07:54:44 -07:00
Anton
56daed0a85 copy paste documentation error fixed in Softmin (#7324) 2018-05-06 21:50:46 +02:00
Tongzhou Wang
f9d3c3f4fd fix typo in link to sigmoid activation image (#6429) 2018-04-09 14:48:26 -04:00
Tongzhou Wang
e0f3e5dc77 fix activation images not showing up on official website (#6367) 2018-04-07 11:06:24 -04:00
Kaiyu Shi
605307f8f3 Add support for printing extra information in Module and refactor redundant codes (#5936)
This PR enables users to print extra information of their subclassed nn.Module.
Now I simply insert the user-defined string at the ending of module name, which should be discussed in this PR.

Before this PR, users should redefine the __repr__ and copy&paste the source code from Module.

* Add support for extra information on Module

* Rewrite the repr method of Module

* Fix flake8

* Change the __repr__ to get_extra_repr in Linear

* Fix extra new-line for empty line

* Add test for __repr__ method

* Fix bug of block string indent

* Add indent for multi-line repr test.

* Address review comments

* Update tutorial for creating nn.Module

* Fix flake8, add extra_repr of bilinear

* Refactor DropoutNd

* Change to extra_repr in some Modules

* Fix flake8

* Refactor padding modules

* Refactor pooling module

* Fix typo

* Change to extra_repr

* Fix bug for GroupNorm

* Fix bug for LayerNorm
2018-04-02 13:52:33 -04:00
Vishwak Srinivasan
76a283db40 [ready] General Documentation Improvements - 2 (#5685)
* Fix some minor errors in existing docs.

* Fix Convolution and Pooling docs in torch.nn.functional

* Cleaned up torch.nn.functional docs

* Address @SsnL 's comments

* Add multiplication sign missing in docs

* Fix more typos, and clear some warnings

* Change infinity symbol in LPPool2d

* Revert some changes in torch.nn.functional

* Few more minor changes
2018-03-13 09:47:43 -04:00
Richard Zou
582d045092 Fix rrelu docs (#5678) 2018-03-09 23:33:20 +01:00
Richard Zou
50770f0bc4 Fix Hardshrink equation in docs (#5679) 2018-03-09 23:32:52 +01:00
Vishwak Srinivasan
32b3841553 [ready] General documentation improvements (#5450)
* Improvize documentation
1. Add formula for erf, erfinv
2. Make exp, expm1 similar to log, log1p
3. Symbol change in ge, le, ne, isnan

* Fix minor nit in the docstring

* More doc improvements
1. Added some formulae
2. Complete scanning till "Other Operations" in Tensor docs

* Add more changes
1. Modify all torch.Tensor wherever required

* Fix Conv docs
1. Fix minor nits in the references for LAPACK routines

* Improve Pooling docs
1. Fix lint error

* Improve docs for RNN, Normalization and Padding
1. Fix flake8 error for pooling

* Final fixes for torch.nn.* docs.
1. Improve Loss Function documentation
2. Improve Vision Layers documentation

* Fix lint error

* Improve docstrings in torch.nn.init

* Fix lint error

* Fix minor error in torch.nn.init.sparse

* Fix Activation and Utils Docs
1. Fix Math Errors
2. Add explicit clean to Makefile in docs to prevent running graph generation script
while cleaning
3. Fix utils docs

* Make PYCMD a Makefile argument, clear up prints in the build_activation_images.py

* Fix batch norm doc error
2018-03-08 13:21:12 -05:00
Tongzhou Wang
27265503ad nn.* doc update after Variable/Tensor merge (#5459)
The nn.* counterpart of #5443 . Mostly removed Variable wrapper. Also added doc for nn.RReLU.

Notice that torch.randn(*, requires_grad=True) isn't documented until #5462 is done.
2018-03-01 18:11:39 -05:00
Piotr Mitros
7b33ef4cff Documentation cleanup for activation functions (#5457) 2018-03-01 14:53:11 +01:00
Hugh Perkins
ef70db09dd fix some more mathjax (#3352) 2017-11-24 11:14:37 +01:00
Ozan Çağlayan
dd6d04ddf2 doc: Normalize all true/false in docstrings to `True|False` (#3593)
* doc: Normalize all true/false in docstrings to ``True|False``

This makes them more apparent in the documentation.

* doc: fix flake8
2017-11-09 08:12:29 -05:00
Ace
3f6fccd1a8 fixes for torch.nn.Hardtanh (examples and CPU implementation) (#3391) 2017-10-31 14:29:42 +01:00
vfdev
acb73c729b Space is missing in __repr___ of conv (#3229)
* - Remove spaces in `__repr__` of layers
- Replace `size` by `kernel_size` in `__repr__` of a pooling layer

* Fix flake8 errors
2017-10-30 13:45:37 -04:00
Hugh Perkins
820ac0df2b fix mathjax notation on softmax/softmin (#3338) 2017-10-28 18:10:35 +05:30
SsnL
de1f4e69dd raw text (#3327) 2017-10-28 01:24:02 +05:30
Adam Paszke
98e67448fa Large Softmax and LogSoftmax refactor
- Cleaned up THNN and THCUNN code and kernels
- Improved THCUNN kernel performance 5x, making it match cuDNN performance
- Added support for computing softmax over arbitrary dims
  NOTE: The default dim for 3D inputs is now 1 (used to be 0)
- Both functions now accept inputs with arbitrarily many dimensions
- Autograd functions no longer save the input (it's unnecessary)
- Added cuDNN bindings for softmax, but they are unused as THCUNN
  matches or even exceeds cuDNN performance
2017-10-19 19:51:10 +02:00
Ofir Press
40ca356d36 make logsoftmax documenatation readable (#2606) 2017-09-04 00:23:26 -04:00
Gregory Chanan
9a243abe5c Implement Softmin double backwards. 2017-08-14 16:19:10 -04:00
Gregory Chanan
61c873cc7d Implement SoftMax and NLLLoss double backwards. 2017-08-01 14:29:33 +05:30
yunjey
ba544aa0ad Add comments in nn.ELU (#2111) 2017-07-16 23:04:11 -04:00
Soumith Chintala
a5a8ab10b0 fix Hardtanh argument names to be consistent between functional and Module 2017-07-13 22:46:51 -04:00
Leonid Vlasenkov
f536c662bf fix op in docs (#2048) 2017-07-11 10:36:19 -04:00
Leonid Vlasenkov
46a868dab7 [Ready] Limit docs line length (#1900)
* some docs are ready

* docs

* docs

* fix some more

* fix some more
2017-07-10 10:24:54 -04:00
Francisco Massa
76ee014d10 Add documentation to SELU and AlphaDropout 2017-06-19 18:18:01 -04:00
Sam Gross
38b9598685 Added GLU (gated linear unit)
From https://arxiv.org/abs/1612.08083
2017-06-13 20:48:19 -04:00
Francisco Massa
a24db91a38 Add SELU activation function (#1769)
* Add SELU activation function

* Remove unnecessary case

* Add Function for SELU + tests and fix RReLU inplace

* Fix extra line in doc

* Fix tests

Remove in-place tests for RReLU. For some reason they fail on legacy nn, but passes on nn

* SELU in new-style Function

It also supports double backprop, verifyed with gradgradcheck

* Fix flake8
2017-06-11 10:07:48 +03:00
Luke Yeager
e7c1e6a8e3 [pep8] Fix most lint automatically with autopep8
Here's the command I used to invoke autopep8 (in parallel!):

    git ls-files | grep '\.py$' | xargs -n1 -P`nproc` autopep8 -i

Several rules are ignored in setup.cfg. The goal is to let autopep8
handle everything which it can handle safely, and to disable any rules
which are tricky or controversial to address. We may want to come back
and re-enable some of these rules later, but I'm trying to make this
patch as safe as possible.

Also configures flake8 to match pep8's behavior.

Also configures TravisCI to check the whole project for lint.
2017-01-28 01:15:51 +01:00
Soumith Chintala
46bc43a80f fixing loss layer docs 2017-01-04 18:40:51 -05:00
Soumith Chintala
7fa60b2e44 fixing docs of activations, pixelshuffle, sparse for rst 2017-01-04 18:40:51 -05:00
Soumith Chintala
c78893f912 removing Image: references in nn activation docs 2017-01-04 13:51:56 -05:00
Sergey Zagoruyko
62af45d99f Basic functional interface (#354) 2016-12-29 22:53:57 +01:00
Sam Gross
ffcc38cf05 Deterministic ordering of parameters and buffers. (#317)
Uses the assignment syntax to get deterministic ordering of parameters.
The ordering of parameters using the constructor syntax is
non-deterministic because kwargs use dict() in Python 3.5 and earlier.
2016-12-16 14:45:56 -05:00
Soumith Chintala
513d902df1 adding __repr__ for nn 2016-11-07 16:17:40 -05:00
Adam Paszke
b4f4cca875 Rename training and evaluation methods 2016-10-30 00:16:06 +02:00
Adam Paszke
3cbe66ba8c Change requires_grad default to False 2016-10-05 08:46:34 -07:00
soumith
d92b7da733 fix documentation to not use forward 2016-09-30 09:49:30 -07:00
Adam Paszke
c8a4734b97 Add RReLU to both nn packages 2016-09-29 11:33:34 -07:00
Adam Paszke
f9d25e8e72 Refactor nn (require specifying parameters explicitly) 2016-09-27 15:22:26 -07:00
Soumith Chintala
b5f7720ab9 docstrings for container and batchnorm 2016-09-16 05:31:36 -04:00
soumith
a0a2d9885a adding docstrings for activation functions 2016-09-16 03:40:24 -04:00
Adam Paszke
fb39971464 Add more modules to nn 2016-09-14 11:05:56 -07:00
Sam Gross
b738b09606 Clean up Module forward and __call__ (#14)
* _forward is renamed forward since users should override it

 * some __call__ overrides are changed to forward

 * function which return a single variable are changed to return that
   variable instead of a one-element tuple
2016-09-07 15:41:39 -04:00
Adam Paszke
ea93fb7ac0 Add more nn modules 2016-08-23 19:15:21 -07:00
Adam Paszke
2bf68e72d5 Add hook system to autograd and nn 2016-08-23 13:51:34 -07:00
Adam Paszke
e055ffbdc7 Add nn 2016-08-19 14:56:55 -07:00