Commit Graph

356 Commits

Author SHA1 Message Date
David Riazati
10c4b98ade Remove weak script (#22212)
Summary:
* Deletes all weak script decorators / associated data structures / methods
   * In order to keep supporting the standard library in script, this enables recursive script on any function defined in `torch.nn`
   * Most changes in `torch/nn` are the result of `ag -Q "weak" torch/nn/ -l | xargs sed -i '/weak/d'`; only `rnn.py` needed manual editing, using `ignore` and `export` to keep supporting the overloaded `forward` methods
* `Sequential`/`ModuleList` no longer need to be added to constants since they are compiled on demand

This should also fix https://github.com/pytorch/pytorch/issues/22212
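A minimal sketch (not part of this PR) of what recursive scripting with the `torch.jit.export` / `torch.jit.ignore` decorators looks like; the module and method names here are made up for illustration:

```python
import torch
import torch.nn.functional as F

class Net(torch.nn.Module):
    def forward(self, x):
        # F.relu is defined in torch.nn and is compiled recursively on demand.
        return F.relu(x) + 1

    @torch.jit.export
    def extra(self, x):
        # Explicitly exported methods are compiled even if forward never calls them.
        return x * 2

    @torch.jit.ignore
    def debug(self, x):
        # Ignored methods are left as Python and skipped by the compiler.
        print(x.shape)

scripted = torch.jit.script(Net())
```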
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22212

Differential Revision: D15988346

Pulled By: driazati

fbshipit-source-id: af223e3ad0580be895377312949997a70e988e4f
2019-07-03 17:28:25 -07:00
Guanheng Zhang
bb0f299f27 Update MultiheadAttention module support key/value with different number of features and allow static key/value (#21288)
Summary:
The changes include:

1. Allow key/value to have a different number of features than query. This supports the case where key and value have different feature dimensions.
2. Support three separate proj_weights, in addition to a single in_proj_weight. The proj_weight of key and value may have a different dimension from that of query, so three separate proj_weights are necessary. When key and value have the same dimension as query, a single large proj_weight is preferred for performance reasons. Note that using a single large weight versus three separate weights is a size-dependent decision.
3. Add an option to use static k and v in the multihead_attn operator (see saved_k and saved_v). Those static key/value tensors can now be re-used when training the model.
4. Add more test cases to cover the arguments.

Note: current users should not be affected by the changes.
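A short usage sketch of the new constructor arguments (tensor sizes are arbitrary, chosen for illustration):

```python
import torch
import torch.nn as nn

embed_dim, num_heads = 16, 4
# Key and value may now have feature dimensions different from the query.
attn = nn.MultiheadAttention(embed_dim, num_heads, kdim=8, vdim=24)

L, S, N = 5, 7, 2                       # target length, source length, batch size
query = torch.randn(L, N, embed_dim)
key = torch.randn(S, N, 8)
value = torch.randn(S, N, 24)

out, weights = attn(query, key, value)  # out: (L, N, embed_dim)
```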
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21288

Differential Revision: D15738808

Pulled By: zhangguanheng66

fbshipit-source-id: 288b995787ad55fba374184b3d15b5c6fe9abb5c
2019-07-02 18:06:25 -07:00
Lara
34aee933f9 ONNX Export Interpolate (Resize) for opset version 10
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/21434
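A hedged export sketch (the model and file name are illustrative, not from the PR); `F.interpolate` maps to the ONNX Resize op, which needs `opset_version=10`:

```python
import torch
import torch.nn.functional as F

class Upscale(torch.nn.Module):
    def forward(self, x):
        return F.interpolate(x, scale_factor=2, mode="nearest")

x = torch.randn(1, 3, 32, 32)
torch.onnx.export(Upscale(), x, "upscale.onnx", opset_version=10)
```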

Reviewed By: zrphercule

Differential Revision: D15777197

Pulled By: houseroad

fbshipit-source-id: 517b06a54a234ffdb762401e83f5a732023ed259
2019-06-19 13:40:27 -07:00
Ivan Ogasawara
0f675f9cbc Port im2col and vol2col (#21769)
Summary:
resolves partially https://github.com/pytorch/pytorch/issues/18353
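The ported kernels back the user-facing `F.unfold`/`F.fold` (im2col/col2im) ops; a small illustration, not taken from the PR:

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 3, 8, 8)
# im2col: extract sliding 3x3 patches as columns -> shape (1, 3*3*3, L)
cols = F.unfold(x, kernel_size=3)
# col2im: fold the columns back to an (8, 8) image; overlapping values are summed
y = F.fold(cols, output_size=(8, 8), kernel_size=3)
```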
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21769

Differential Revision: D15854530

Pulled By: ezyang

fbshipit-source-id: 574853c068010d1b7588047d2ab7450077471447
2019-06-17 10:06:26 -07:00
Natalia Gimelshein
efd20de276 fix multihead attention for half (#21658)
Summary:
Currently multihead attention for half type is broken
```
  File "/home/ngimel/pytorch/torch/nn/functional.py", line 3279, in multi_head_attention_forward
    attn_output = torch.bmm(attn_output_weights, v)
RuntimeError: Expected object of scalar type Float but got scalar type Half for argument #2 'mat2'
```
because softmax converts half inputs into fp32 inputs. This is unnecessary - all the computations in softmax will be done in fp32 anyway, and the results need to be converted into fp16 for the subsequent batch matrix multiply, so nothing is gained by writing them out in fp32. This PR gets rid of type casting in softmax, so that half works.
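A rough illustration of the fixed path (assumes a CUDA device; shapes are arbitrary): `F.softmax` now keeps the fp16 dtype, so the following `bmm` no longer sees mismatched scalar types.

```python
import torch
import torch.nn.functional as F

attn_scores = torch.randn(8, 5, 5, device="cuda", dtype=torch.half)
v = torch.randn(8, 5, 16, device="cuda", dtype=torch.half)

attn_weights = F.softmax(attn_scores, dim=-1)  # stays fp16 after the fix
out = torch.bmm(attn_weights, v)               # fp16 x fp16 works
```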
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21658

Differential Revision: D15807487

Pulled By: zhangguanheng66

fbshipit-source-id: 4709ec71a36383d0d35a8f01021e12e22b94992d
2019-06-13 15:17:04 -07:00
Kabir Kwatra
26bcadcc61 Gumbel-Softmax Arxiv Docs Link Fix (#21376)
Summary:
Links separated #20297
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21376

Differential Revision: D15696413

Pulled By: ezyang

fbshipit-source-id: 513bd430e41c109aa2d0fbaa9a242acb2a12059b
2019-06-06 10:11:18 -07:00
Xiaomeng Yang
0c6efbd410 Fix gelu documents (#21265)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21265

Fix gelu documents

Reviewed By: hl475

Differential Revision: D15598958

fbshipit-source-id: 483040069102daada705401c36c8990598142d3d
2019-06-02 20:17:56 -07:00
Xiaomeng Yang
93ae040ff0 Add gelu activation in pytorch (#20665)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20665

Add gelu activation forward on CPU in pytorch

Compared to the current Python implementation of gelu used in BERT-style models, e.g.

  def gelu(self, x):
      return x * 0.5 * (1.0 + torch.erf(x / self.sqrt_two))

The torch.nn.functional.gelu function can reduce the forward time from 333ms to 109ms (with MKL) / 112ms (without MKL) for input size = [64, 128, 56, 56] on a devvm.
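A quick correctness check (not a benchmark) comparing the new op against the Python reference; the tolerance is an assumption:

```python
import math
import torch
import torch.nn.functional as F

x = torch.randn(64, 128, 56, 56)

# Python gelu as used in BERT-style models
ref = x * 0.5 * (1.0 + torch.erf(x / math.sqrt(2.0)))

out = F.gelu(x)
print(torch.allclose(out, ref, atol=1e-5))
```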

Reviewed By: zheng-xq

Differential Revision: D15400974

fbshipit-source-id: f606b43d1dd64e3c42a12c4991411d47551a8121
2019-06-02 09:08:47 -07:00
Guanheng Zhang
8e3311c5e2 Remove functionality unsupported by the JIT from multi_head_attention_forward. (#20653)
Summary:
Remove the internal functions in multi_head_attention_forward. Those internal functions cause a 10-15% performance regression, and there is possibly a JIT issue.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20653

Differential Revision: D15398888

Pulled By: cpuhrsch

fbshipit-source-id: 0a3f053a4ade5009e73d3974fa6733c2bff9d929
2019-05-27 15:12:58 -07:00
daquexian
a3a458ed30 Fix align corner docs (#20961)
Summary:
I believe the `True` and `False` in the doc are reversed :)
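For context, a small sketch showing the two settings side by side; the exact numbers are not the point, only that the sampling grids differ:

```python
import torch
import torch.nn.functional as F

x = torch.arange(4.0).reshape(1, 1, 4)

# align_corners=True aligns the corner input and output samples;
# align_corners=False treats samples as pixel areas, shifting the grid.
up_true = F.interpolate(x, size=8, mode="linear", align_corners=True)
up_false = F.interpolate(x, size=8, mode="linear", align_corners=False)
print(up_true)
print(up_false)
```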
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20961

Differential Revision: D15510806

Pulled By: soumith

fbshipit-source-id: 62566bb595e187506b23dedc24892e48f35b1147
2019-05-26 14:57:37 -07:00
Yifu Wang
5e69e76aba Remove padding_mode from torch.nn.functional.conv{1,2,3}d's docstr (#20891)
Summary:
Fixes #20694
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20891

Differential Revision: D15510790

Pulled By: soumith

fbshipit-source-id: aa3630693c7446bf18a390cb49c4df9bc9c59eea
2019-05-26 14:52:51 -07:00
Josef Lindman Hörnlund
87040af498 Fix documentation for attention mask shape (#20850)
Summary:
Attention mask should be of shape `(L, S)` since it is added to `attn_output_weights`.
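A small illustrative example of an additive mask of shape `(L, S)` (values chosen arbitrarily):

```python
import torch
import torch.nn as nn

embed_dim, num_heads = 16, 4
L, S, N = 5, 7, 2  # target length, source length, batch size
attn = nn.MultiheadAttention(embed_dim, num_heads)

query = torch.randn(L, N, embed_dim)
key = torch.randn(S, N, embed_dim)
value = torch.randn(S, N, embed_dim)

# The mask is added to attn_output_weights, so -inf entries block attention
# to the corresponding source positions.
attn_mask = torch.zeros(L, S)
attn_mask[:, -1] = float("-inf")

out, weights = attn(query, key, value, attn_mask=attn_mask)
```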
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20850

Differential Revision: D15495587

Pulled By: ezyang

fbshipit-source-id: 61d6801da5291df960daab273e874df28aedbf6e
2019-05-24 09:10:11 -07:00
Guanheng Zhang
3caf4e6985 Remove weak_script in MultiheadAttention function. (#20563)
Summary:
Remove weak_script. After recently splitting the forward() function in the MultiheadAttention module, we noticed a memory leak on GPU. Fix the problem by removing the "weak_script" decorators.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20563

Differential Revision: D15368262

Pulled By: zhangguanheng66

fbshipit-source-id: 475db93c9ee0dbaea8fb914c004e7d1e0d419bc2
2019-05-15 20:10:39 -07:00
Jason Lian
6e82b1c77d Split nn.MultiHeadAttention into Module + functional (#20415)
Summary:
Moving functions from torch/nn/modules/activation.py to torch/nn/functional.py. For functions not implemented (_get_input_buffer and _set_input_buffer), a TODO is added.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20415

Differential Revision: D15318078

Pulled By: jamarshon

fbshipit-source-id: 5ca698e2913821442cf8609cc61ac8190496a3c6
2019-05-14 08:41:28 -07:00
interesaaat
35fed93b1e Adding Poisson NLL loss to libtorch (#19316)
Summary:
This PR adds Poisson NLL loss to ATen and substitutes the Python implementation with a call to the C++ one.

Fixes #19186.
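The user-facing call is unchanged; a minimal usage sketch (data generated at random):

```python
import torch
import torch.nn.functional as F

log_rate = torch.randn(8)                     # model output = log of the Poisson rate
target = torch.poisson(torch.ones(8) * 3.0)   # non-negative counts

# log_input=True: the loss is exp(input) - target * input (up to the 'full' term)
loss = F.poisson_nll_loss(log_rate, target, log_input=True, reduction='mean')
```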
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19316

Differential Revision: D15012957

Pulled By: ezyang

fbshipit-source-id: 0a3f56e8307969c2f9cc321b5357a496c3d1784e
2019-05-10 11:57:49 -07:00
Ailing Zhang
899bddeeb6 fix typo in adaptive methods annotation (#20306)
Summary:
fixes #20215
The confusing behavior was caused by typos in the type annotations :(
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20306

Differential Revision: D15276216

Pulled By: ailzhang

fbshipit-source-id: 1b0c9635a72a05c9b537f80d85b117b5077fbec7
2019-05-09 09:29:37 -07:00
Mikhail Zolotukhin
3a0727e58b Fix flake8. (#19832)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19832
ghimport-source-id: 7360a52dbcf83458797c27002afc1fd53ee5907f

Differential Revision: D15115620

Pulled By: ZolotukhinM

fbshipit-source-id: aa62b04facc1e1824a8889a32dace5804daa21df
2019-04-30 12:09:10 -07:00
Tongzhou Wang
42fbeef5d7 update F.grid_sample doc for clarity (#19754)
Summary:
https://github.com/pytorch/pytorch/issues/19717
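For reference, a small sketch of the grid convention the updated docs describe (shapes and values chosen for illustration):

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 3, 8, 8)

# grid has shape (N, H_out, W_out, 2); the last dim holds (x, y) coordinates
# normalized to [-1, 1], where (-1, -1) is the top-left corner of the input.
ys, xs = torch.meshgrid(torch.linspace(-1, 1, 4), torch.linspace(-1, 1, 4))
grid = torch.stack((xs, ys), dim=-1).unsqueeze(0)   # identity-like sampling grid

out = F.grid_sample(x, grid)   # (1, 3, 4, 4)
```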
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19754

Differential Revision: D15085449

Pulled By: soumith

fbshipit-source-id: 0dda05bd395d58a496bf397ca7f1c50a239b0ed1
2019-04-26 16:01:24 -07:00
Wanchao Liang
e9c8f372c4 dispatch max_pools with no indices, expose max_pools to torch namespace (#19449)
Summary:
In the functional interfaces we do boolean dispatch, but both branches go to max_pool\*d_with_indices. This changes it to emit the max_pool\*d op when indices are not needed, so it's not necessary to expose the with_indices ops to different backends (for the JIT).

It also binds max_pool\*d to the torch namespace, matching the behavior of avg_pool\*d.
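Roughly, the dispatch now behaves as sketched below (illustrative only):

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 3, 8, 8)

# No indices requested: dispatches to the plain max_pool2d op...
y = F.max_pool2d(x, kernel_size=2)

# ...which is also bound on the torch namespace, like avg_pool2d.
y2 = torch.max_pool2d(x, 2)

# Requesting indices still goes to max_pool2d_with_indices.
y3, idx = F.max_pool2d(x, 2, return_indices=True)
```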
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19449

Differential Revision: D15016839

Pulled By: wanchaol

fbshipit-source-id: f77cd5f0bcd6d8534c1296d89b061023a8288a2c
2019-04-23 11:20:05 -07:00
Richard Zou
2a2007e5ac EmbeddingBag CPU forward with per_sample_weights. (#18735)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18735
ghimport-source-id: d81bef54dafd7167d2451250d7be478d3c013920
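A brief usage sketch of the new argument (sizes and weights are arbitrary); per_sample_weights is only supported with mode='sum':

```python
import torch
import torch.nn as nn

bag = nn.EmbeddingBag(num_embeddings=10, embedding_dim=4, mode='sum')

input = torch.tensor([1, 2, 4, 5, 4, 3, 2, 9])
offsets = torch.tensor([0, 4])          # two bags: indices [0:4) and [4:8)
weights = torch.rand(8)                 # one weight per looked-up index

out = bag(input, offsets, per_sample_weights=weights)   # shape (2, 4)
```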

Reviewed By: cpuhrsch

Differential Revision: D14851415

Pulled By: zou3519

fbshipit-source-id: cea6039e760ad571b90f0a536e420498f34be325
2019-04-09 18:12:55 -07:00
Zachary DeVito
09c19e1068 Fix interpolate tracing (#19034)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19034
ghimport-source-id: 874e0b0a8685184416152a77fc1850d9a06516ae

Differential Revision: D14837282

Pulled By: zdevito

fbshipit-source-id: b0ed82b607c288a54eecec3d6ed62c4626e5a563
2019-04-08 14:59:26 -07:00
Elias Ellison
e6bbbb017e Fix interpolate trace (#18875)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/10654

The issue is that in tracing `.size` returns an int tensor, and when an int tensor is multiplied by a scalar, the int type dominates and the scalar gets cast to 0.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18875

Differential Revision: D14814441

Pulled By: eellison

fbshipit-source-id: a4e96a2698f2fcbf3ec4b2bb4c43a30250f30ad9
2019-04-05 17:55:23 -07:00
Joakim Rishaug
b90cbb841d Method is supposed to be in-place (#18684)
Summary:
Tracing models that attempt to return this in-place value doesn't turn out well.

I haven't run any tests to confirm the results to be honest, but regardless of the outcome, the operation happens in-place, so it should work as before.

Sample output from traced model attempting to set `max_norm` on `Embedding`:
```
a leaf Variable that requires grad has been used in an in-place operation. (check_inplace at /pytorch/torch/csrc/autograd/VariableTypeUtils.h:49)
frame #0: std::function<std::string ()>::operator()() const + 0x11 (0x7f0ecc5cc021 in /usr/local/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x2a (0x7f0ecc5cb8ea in /usr/local/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #2: <unknown function> + 0x38ab2f (0x7f0ecb55ab2f in /usr/local/lib/python3.7/site-packages/torch/lib/libtorch.so.1)
frame #3: torch::autograd::VariableType::embedding_renorm_(at::Tensor&, at::Tensor const&, double, double) const + 0x76 (0x7f0ecb5b5966 in /usr/local/lib/python3.7/site-packages/torch/lib/libtorch.so.1)
frame #4: <unknown function> + 0x56c958 (0x7f0ecb73c958 in /usr/local/lib/python3.7/site-packages/torch/lib/libtorch.so.1)
frame #5: <unknown function> + 0x672286 (0x7f0ecb842286 in /usr/local/lib/python3.7/site-packages/torch/lib/libtorch.so.1)
frame #6: torch::jit::InterpreterState::run(std::vector<c10::IValue, std::allocator<c10::IValue> >&) + 0x22 (0x7f0ecb83d842 in /usr/local/lib/python3.7/site-packages/torch/lib/libtorch.so.1)
frame #7: <unknown function> + 0x65c6ac (0x7f0ecb82c6ac in /usr/local/lib/python3.7/site-packages/torch/lib/libtorch.so.1)
frame #8: <unknown function> + 0x3c8ab4 (0x7f0f06bc0ab4 in /usr/local/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #9: <unknown function> + 0x3ad2c3 (0x7f0f06ba52c3 in /usr/local/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #10: <unknown function> + 0x11663e (0x7f0f0690e63e in /usr/local/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
<omitting python frames>
frame #39: python_call + 0x11 (0x5563c3c521c1 in uwsgi)
frame #40: uwsgi_request_wsgi + 0x100 (0x5563c3c54410 in uwsgi)
frame #41: wsgi_req_recv + 0xac (0x5563c3becabc in uwsgi)
frame #42: simple_loop_run + 0xc4 (0x5563c3c35be4 in uwsgi)
frame #43: simple_loop + 0x10 (0x5563c3c35a00 in uwsgi)
frame #44: uwsgi_ignition + 0x241 (0x5563c3c3a3a1 in uwsgi)
frame #45: uwsgi_worker_run + 0x275 (0x5563c3c3ec35 in uwsgi)
frame #46: <unknown function> + 0x8f22c (0x5563c3c3f22c in uwsgi)
frame #47: <unknown function> + 0x3c13e (0x5563c3bec13e in uwsgi)
frame #48: __libc_start_main + 0xf1 (0x7f0f138922e1 in /lib/x86_64-linux-gnu/libc.so.6)
frame #49: _start + 0x2a (0x5563c3bec16a in uwsgi)
:
operation failed in interpreter:
op_version_set = 0
def forward(self,
    input_1: Tensor) -> Tensor:
  _0 = torch.norm(self.item_embedding.weight, 2, 1, True)
  _1 = torch.div(self.item_embedding.weight, _0)
  m_weight = torch.t(_1)
  input_2 = torch.contiguous(input_1)
  weight_1 = torch.embedding_renorm_(self.item_embedding.weight, input_2, 1., 2.)
             ~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
  x = torch.embedding(weight_1, input_2, -1, False, False)
  input_3 = torch.div(x, torch.norm(x, 2, 2, True))
  max_batch_size = ops.prim.NumToTensor(torch.size(input_3, 0))
  hx = torch.zeros([2, int(max_batch_size), 70], dtype=6, layout=0, device=torch.device("cpu"))
  _2 = [self.lstm_layer.weight_ih_l0, self.lstm_layer.weight_hh_l0, self.lstm_layer.weight_ih_l1, self.lstm_layer.weight_hh_l1]
  input_4, _3, _4 = torch.lstm(input_3, [hx, hx], _2, False, 2, 0.10000000000000001, False, False, True)
  input = torch.matmul(input_4, torch.t(self.rnn2item.weight))
  tastevec = torch.div(input, torch.norm(input, 2, 2, True))
  outputs = torch.matmul(tastevec, m_weight)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18684

Differential Revision: D14782041

Pulled By: ezyang

fbshipit-source-id: 7b2fc19b7d5b6600263644498bb728319a19f39d
2019-04-05 13:00:29 -07:00
Soumith Chintala
cb39bd9c2f pad_circular -> _pad_circular (#18608)
Summary:
pad_circular is really private, as circular padding is exposed via `F.pad`
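The public entry point, sketched briefly (tensor contents arbitrary):

```python
import torch
import torch.nn.functional as F

x = torch.arange(9.0).reshape(1, 1, 3, 3)

# Circular padding wraps values around: the left padding is filled from the
# right edge of the input, and so on.
y = F.pad(x, (1, 1, 1, 1), mode='circular')   # -> (1, 1, 5, 5)
```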
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18608

Differential Revision: D14691704

Pulled By: soumith

fbshipit-source-id: 8c2f90596feed670976115041efed3ca071e8306
2019-03-30 13:27:04 -07:00
Edward Yang
173f224570 Turn on F401: Unused import warning. (#18598)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18598
ghimport-source-id: c74597e5e7437e94a43c163cee0639b20d0d0c6a

Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18598 Turn on F401: Unused import warning.**

This was requested by someone at Facebook; this lint is turned
on for Facebook by default.  "Sure, why not."

I had to noqa a number of imports in __init__.  Hypothetically
we're supposed to use __all__ in this case, but I was too lazy
to fix it.  Left for future work.

Be careful!  flake8-2 and flake8-3 behave differently with
respect to import resolution for # type: comments.  flake8-3 will
report an import unused; flake8-2 will not.  For now, I just
noqa'd all these sites.

All the changes were done by hand.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Differential Revision: D14687478

fbshipit-source-id: 30d532381e914091aadfa0d2a5a89404819663e3
2019-03-30 09:01:17 -07:00
Aurélien Roy
12abc8a99a Target and input sizes mismatch warning in L1 Loss / L1 Smooth Loss (#18565)
Summary:
Adding the same warning message already present in the mse_loss function to the L1 losses when input and target sizes are different.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18565

Differential Revision: D14671415

Pulled By: soumith

fbshipit-source-id: 01f5e1fb1ea119dbb2aecf1d94d0cb462f284982
2019-03-28 20:49:51 -07:00
mc-robinson
8bc5b86709 Added tensor size warning to F.mse_loss() (#18349)
Summary:
To address the issue of broadcasting giving the wrong result in `nn.MSELoss()` as mentioned here https://github.com/pytorch/pytorch/issues/16045 . In particular, the issue often arises when computing the loss between tensors with shapes (n, 1) and (n,)
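A sketch of the pitfall the warning targets (the shapes are the interesting part, not the values):

```python
import torch
import torch.nn.functional as F

pred = torch.randn(4, 1)   # e.g. output of a Linear(..., 1) layer
target = torch.randn(4)

# (4, 1) and (4,) broadcast to (4, 4), which is almost never intended;
# this is the case the new warning flags.
wrong = F.mse_loss(pred, target)
right = F.mse_loss(pred.squeeze(1), target)
```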
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18349

Differential Revision: D14594176

Pulled By: soumith

fbshipit-source-id: f23ae68a4bf42f3554ad7678a314ba2c7532a6db
2019-03-24 19:22:14 -07:00
Narine Kokhlikyan
670f509984 Circular Convolution Function via circular padding (#17240)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17240

Added circular padding in addition to zero padding to Conv1D, Conv2D and Conv3D based on the solution suggested in: https://github.com/pytorch/pytorch/issues/3858
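A small usage sketch of the new padding mode (layer sizes are arbitrary):

```python
import torch
import torch.nn as nn

# padding_mode='circular' wraps the input around before convolving,
# which suits periodic signals and domains.
conv = nn.Conv2d(3, 8, kernel_size=3, padding=1, padding_mode='circular')
y = conv(torch.randn(1, 3, 16, 16))   # spatial size preserved: (1, 8, 16, 16)
```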

Reviewed By: ezyang

Differential Revision: D14126416

fbshipit-source-id: a2f1587503ee0cfff98d5cb0d5b0a600ef8aaeb4
2019-03-18 12:33:20 -07:00
ZhuBaohe
75f88d4da6 Correct loss docstrings (#17300)
Summary:
In the loss doc descriptions, replace the deprecated 'reduce' and 'size_average' parameters with the 'reduction' parameter.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17300

Differential Revision: D14195789

Pulled By: soumith

fbshipit-source-id: 625e650ec20f13b2d22153a4a535656cf9c8f0eb
2019-03-10 11:56:41 -07:00
zou3519
68c5c66800 Warn about memory overlaps on expanded tensors (#17576)
Summary:
Eventually we should remove these when we're certain that all our ops
handle memory overlaps correctly.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17576

Differential Revision: D14349990

Pulled By: zou3519

fbshipit-source-id: c3a09f6113b9b1bf93e7f13c0b426c45b2cdf21f
2019-03-06 17:44:04 -08:00
ZhuBaohe
19a6de328f Correct docstring of vision/init functions
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17351

Differential Revision: D14276355

Pulled By: soumith

fbshipit-source-id: 9b572b6a04eeb1e44cd93961edac76ed10f7b24e
2019-03-01 11:40:23 -08:00
vishwakftw
724c7e76c6 Fix reduction='none' in poisson_nll_loss (#17358)
Summary:
Changelog:
- Modify `if` to `elif` in reduction mode comparison
- Add error checking for reduction mode
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17358

Differential Revision: D14190523

Pulled By: zou3519

fbshipit-source-id: 2b734d284dc4c40679923606a1aa148e6a0abeb8
2019-02-25 10:35:33 -08:00
ZhuBaohe
e81878e0a9 Correct padding and activations docstrings in nn module
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17197

Differential Revision: D14131284

Pulled By: soumith

fbshipit-source-id: 6edd225b47b1dde81b5ad0a23c588c6621987a69
2019-02-19 08:16:52 -08:00
ZhuBaohe
8852e21245 Correct recurrent/linear/dropout/sparse layers docstrings
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17238

Differential Revision: D14130811

Pulled By: soumith

fbshipit-source-id: d3998ca7da46aec5a59220c6af489f71f3d60735
2019-02-19 05:23:04 -08:00
Krishna
b892f69440 one_hot docs missing (#17142)
Summary:
one_hot docs are missing [here](https://pytorch.org/docs/master/nn.html#one-hot).

I dug around and could not find a way to get this working properly.
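For reference, the op itself is straightforward (example values are illustrative):

```python
import torch
import torch.nn.functional as F

labels = torch.tensor([0, 2, 1])
onehot = F.one_hot(labels, num_classes=3)
# tensor([[1, 0, 0],
#         [0, 0, 1],
#         [0, 1, 0]])
```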

Differential Revision: D14104414

Pulled By: zou3519

fbshipit-source-id: 3f45c8a0878409d218da167f13b253772f5cc963
2019-02-15 10:48:18 -08:00
ZhuBaohe
acf5ec07af Correct conv and pooling docstrings in nn module (#17052)
Summary:
This PR fixes the conv and pooling docstrings in the nn module.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17052

Differential Revision: D14068566

Pulled By: ezyang

fbshipit-source-id: 3ec1de232ff6334b6a544dadefbb0ee6193d443a
2019-02-15 06:58:02 -08:00
David Riazati
48943c3b7a Update Upsample docs to match nn.interpolate
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17134

Reviewed By: ezyang

Differential Revision: D14095694

Pulled By: driazati

fbshipit-source-id: 79afec9ddd50b3b8ce39acf98c2543cf1a3d1127
2019-02-15 06:38:41 -08:00
Ailing Zhang
b0545aa85f maskrcnn & bert AD coverage part 1 (#16689)
Summary:
- Moved a few functions from the `autograd` namespace to the `aten` namespace to be visible from the JIT nativeResolver.
- Added a hack to look up keyword-only arguments. Proper support for keyword-only arguments will be added later.
- Simulate function overloads in aten using a `_<number>` suffix on the function name.
- Even when `forward` returns multiple outputs, as in `kthvalue`, we currently support at most one output that requires grad.
- Removed the `TensorList`-related ops here since partial `TensorList` support is prone to bugs. Our symbolic diff for `cat` was never tested with autodiff, and it seems broken. We need to find a proper way to support these ops (either by properly supporting `TensorList` or something like `prim::ConstantChunk`) and leave them for the next PR.

Ops supported in this PR:
```
erf
expand_as
index
kthvalue
mean
permute
pow
rsub
select
sqrt
squeeze
t
to
topk
transpose
view
var
embedding
logsumexp
// grad is None
_dim_arange
contiguous
nonzero
ones_like
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16689

Differential Revision: D14020806

Pulled By: ailzhang

fbshipit-source-id: a5e2c144a7be5a0d39d7ac5f93cb402ec12503a5
2019-02-14 15:36:39 -08:00
Theo
3618b52c74 Add module and name to func created with _jit_internal.boolean_dispatch (#16922)
Summary:
The use case for this PR is the following bug (with F = torch.nn.functional):
`F.max_pool2d.__module__` is `torch._jit_internal`
`F.max_pool2d.__name__` is `fn`

With this PR you get:
`F.max_pool2d.__module__` is `torch.nn.functional`
`F.max_pool2d.__name__` is `max_pool2d`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16922

Differential Revision: D14020053

Pulled By: driazati

fbshipit-source-id: c109c1f04640f3b2b69bc4790b16fef7714025dd
2019-02-12 09:38:48 -08:00
Thomas Viehmann
29f096cc70 optionally zero infinite losses in CTCLoss (#16199)
Summary:
Here is a stab at implementing an option to zero out infinite losses (and NaN gradients).
It might be nicer to move the zeroing to the respective kernels.
The default is currently `False` to mimic the old behaviour, but I'd be half inclined to set the default to `True`, because the behaviour wasn't consistent between CuDNN and Native anyways and the NaN gradients aren't terribly useful.

This topic seems to come up regularly, e.g. in  #14335
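A minimal sketch of the new flag (the lengths are deliberately chosen so the loss would otherwise be infinite):

```python
import torch
import torch.nn as nn

T, N, C = 10, 2, 5                          # input length, batch, classes (incl. blank)
log_probs = torch.randn(T, N, C).log_softmax(-1).requires_grad_()
targets = torch.randint(1, C, (N, 12))      # targets longer than the input -> inf loss
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.full((N,), 12, dtype=torch.long)

ctc = nn.CTCLoss(blank=0, zero_infinity=True)   # zero_infinity zeros inf losses/grads
loss = ctc(log_probs, targets, input_lengths, target_lengths)
```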
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16199

Differential Revision: D14020462

Pulled By: ezyang

fbshipit-source-id: 5ba8936c66ec6e61530aaf01175dc49f389ae428
2019-02-11 13:12:55 -08:00
Wanchao Liang
ac00e85e36 Remove undefined tensor in jit script (#16379)
Summary:
This PR is a follow-up to #15460; it does the following things:

* remove the undefined tensor semantic in jit script/tracing mode
* change ATen/JIT schema for at::index and other index related ops with `Tensor?[]` to align with what at::index is really doing and to adopt `optional[tensor]` in JIT
* change python_print to correctly print the exported script
* register both TensorList and ListOfOptionalTensor in JIT ATen ops to support both
* Backward compatibility for `torch.jit.annotate(Tensor, None)`

List of follow ups:

* remove the undefined tensor semantic in jit autograd, autodiff and grad_of
* remove prim::Undefined fully

For easy reviews, please turn on `hide white space changes` in diff settings.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16379

Differential Revision: D13855677

Pulled By: wanchaol

fbshipit-source-id: 0e21c14d7de250c62731227c81bfbfb7b7da20ab
2019-02-07 11:02:14 -08:00
vishwakftw
34b43baeec Allow list and tuples to be passed as output_size to max_unpool1d (#16489)
Summary:
Changelog:
- Modify concatenation of [1] to a tuple by using cases for list and non-list types.
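A short sketch of the call this enables (sizes arbitrary):

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 1, 9)
pooled, indices = F.max_pool1d(x, kernel_size=2, return_indices=True)

# output_size may now be a plain list (or tuple), handy when the original
# length is odd and cannot be recovered from kernel_size alone.
restored = F.max_unpool1d(pooled, indices, kernel_size=2, output_size=[9])
```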
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16489

Differential Revision: D13875838

Pulled By: soumith

fbshipit-source-id: fade65cc47385986b773b9bde9b4601ab93fe1cf
2019-01-30 11:00:34 -08:00
Lu Fang
b1b00f329e Fix the flake8 linter
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16549

Reviewed By: bddppq

Differential Revision: D13877435

Pulled By: houseroad

fbshipit-source-id: dbe575ba3f6dd30d27ac6aa5eec2eea025063540
2019-01-30 09:36:00 -08:00
Elias Ellison
c2be9f1487 Remove unneeded manual unwrap optionals (#16245)
Summary:
Remove calls to torch.jit._unwrap_optional that are no longer needed.

The remaining instances would require control flow logic for exceptions.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16245

Differential Revision: D13804292

Pulled By: eellison

fbshipit-source-id: 08c5cbe4b956519be2333de5cf4e202488aff626
2019-01-24 15:48:01 -08:00
Egil Martinsson
d6a8dd9538 Cleanup gumbel_softmax (#13339)
Summary:
Fixes #12643, amends #3341.

- Allow multidimensional input ~~(but apply softmax over `dim=-1`)~~ with `dim` argument
- Cleaner: Less lines of code
- Faster (1.32x speedup vs original, 2x speedup vs using `torch.Distributions`)
- Small fixes in docstring
- Remove some references in the docstring. Was the linked (excellent) ipynb the first to do the straight-through trick? Instead, I propose changing the reference to the two papers best known for it.
- Add a DeprecationWarning for `eps`. It's not needed anymore.
- Initial commit keeps some code alternatives commented to exploit CI

- As discussed when `gumbel_softmax` was added (#3341), this was merged into `torch.nn.functional` before all the work on `Distributions` and `Pyro`, and there will probably be multiple other best practices for this in the future.
I've tested building it with the `Distributions` API, but it was too slow; see below.

I therefore propose not using `Distributions`, to keep it fast and simple, but adding a comment in the docstring that `gumbel_softmax` may be deprecated in the future.

```
dist = torch.distributions.RelaxedOneHotCategorical(temperature=tau, logits=logits, validate_args=False)
y_soft = dist.rsample()
```

Pros:
* Built using tricks like `logsumexp` etc
* Explicitly uses `torch.distributions.utils._finfo` to avoid overflow (old implementation had an `eps` flag)
* Maintained for this exact purpose.

Cons:
* Very slow. Construction of distribution adds overhead see timings below. May be solved in future with speedups of `TransformedDistribution` and `Distribution`.
* Assumes which `dim` to apply softmax over.

```
    y_soft = logits.new(logits.shape)
    y_soft = (logits - y_soft.exponential_().log()) / tau  # Gumbel noise
    y_soft = y_soft.softmax(dim)  # Gumbel softmax noise
```
Pros:
* Faster

```
    import time
    start = time.time()
    num_draws = 1000000
    logits = torch.randn(1, 3)
    counts = torch.zeros_like(logits)

    for draw in range(num_draws):
        y_draw = gumbel_softmax(logits, hard=True)
        counts = counts + y_draw
    end = time.time()
    print(end - start)

>> 12.995795965194702

>> 7.658372640609741

>> 20.3382670879364
```

Decide on which path to chose. I'll commit in changes to the unit tests in a while to show that it passes both old tests and new tests. I'll also remove the commented code about `RelaxedOneHotCategorical`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13339

Differential Revision: D13092434

Pulled By: ezyang

fbshipit-source-id: 4c21788df336f4e9c2ac289022e395b261227b4b
2019-01-17 12:56:35 -08:00
Gregory Chanan
595f767880 Revert batched pdist, improve existing kernel, add test (#15901)
Summary:
1) Reverts https://github.com/pytorch/pytorch/pull/12302 which added support for batched pdist. Except I kept the (non-batched) test improvements that came with that PR, because they are nice to have.  Motivation: https://github.com/pytorch/pytorch/issues/15511
2) For the non-batched pdist, improved the existing kernel by forcing fp64 math and properly checking cuda launch errors
3) Added a 'large tensor' test that at least on my machine, fails on the batch pdist implementation.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15901

Reviewed By: ezyang

Differential Revision: D13616730

Pulled By: gchanan

fbshipit-source-id: 620d3f9b9acd492dc131bad9d2ff618d69fc2954
2019-01-17 10:44:43 -08:00
Chandler Zuo
237c0c3c7a Port the backend of FractionalMaxPool3d from TH to ATen (#15575)
Summary:
1. Port the FractionalMaxPool3d implementation from THNN/THCUNN to ATen.
2. Expose this function to Python module nn.
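The exposed module, in a brief sketch (sizes are arbitrary):

```python
import torch
import torch.nn as nn

# Fractional max pooling to roughly half the size in each spatial dimension.
pool = nn.FractionalMaxPool3d(kernel_size=3, output_ratio=(0.5, 0.5, 0.5))
out = pool(torch.randn(1, 4, 16, 16, 16))   # -> (1, 4, 8, 8, 8)
```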
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15575

Differential Revision: D13612848

Pulled By: chandlerzuo

fbshipit-source-id: 5f474b39005efa7788e984e8a805456dcdc43f6c
2019-01-16 14:16:30 -08:00
Elias Ellison
7d601715e5 Constant prop prim::None (#15979)
Summary:
Previously we were only constant propping prim::Constants, but we should be constant propping prim::None as well.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15979

Differential Revision: D13664692

Pulled By: eellison

fbshipit-source-id: 01839403576c21fc030c427e49275b8e1210fa8f
2019-01-15 11:34:51 -08:00
Derek Kim
abdaa477e5 Improved the documentation for torch.nn.functional.pad (#15984)
Summary:
- Fixed a few typos and grammar errors.
- Changed the sentences a bit.
- Changed the format of the tuples to be consistent with padding notations in the other places. For example, `ReflectionPad2d`'s docstring contains :math:`H_{out} = H_{in} + \text{padding\_top} + \text{padding\_bottom}`.

I also made sure that the generated html doesn't break.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15984

Differential Revision: D13649939

Pulled By: soumith

fbshipit-source-id: 0abfa22a7bf1cbc6546ac4859652ce8741d41232
2019-01-14 04:12:45 -08:00
Derek Kim
da753b7ccf Trivial typo fixings in nn.functional dropout* docstrings (#15951)
Summary:
Defualt -> Default
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15951

Differential Revision: D13633875

Pulled By: soumith

fbshipit-source-id: 0da823ef235418396e9322089f6610b592e6990f
2019-01-10 22:42:52 -08:00