Commit Graph

534 Commits

Edward Yang
173f224570 Turn on F401: Unused import warning. (#18598)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18598
ghimport-source-id: c74597e5e7437e94a43c163cee0639b20d0d0c6a

Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18598 Turn on F401: Unused import warning.**

This was requested by someone at Facebook; this lint is turned
on for Facebook by default.  "Sure, why not."

I had to noqa a number of imports in __init__.  Hypothetically
we're supposed to use __all__ in this case, but I was too lazy
to fix it.  Left for future work.

Be careful!  flake8-2 and flake8-3 behave differently with
respect to import resolution for # type: comments.  flake8-3 will
report an import as unused; flake8-2 will not.  For now, I just
noqa'd all these sites (a sketch follows).
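
A minimal sketch (not from the diff) of the kind of suppression applied for an import that only appears in a `# type:` comment:

```python
# flake8-3 flags `Tensor` as unused because it is only referenced inside a
# type comment; the noqa keeps the import while silencing F401.
from torch import Tensor  # noqa: F401


def scale(x, factor):
    # type: (Tensor, float) -> Tensor
    return x * factor
```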

All the changes were done by hand.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Differential Revision: D14687478

fbshipit-source-id: 30d532381e914091aadfa0d2a5a89404819663e3
2019-03-30 09:01:17 -07:00
jithunnair-amd
fdedc62c26 enable more unit tests (#18537)
Summary:
Enable unit tests working with ROCm 2.3. In particular, these are unit tests that were previously skipped for double data types, plus some tests for multi-GPU setups.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18537

Differential Revision: D14651822

Pulled By: ezyang

fbshipit-source-id: 7dd575504ebe235a91489866c91000e9754b1235
2019-03-27 14:27:23 -07:00
mc-robinson
8bc5b86709 Added tensor size warning to F.mse_loss() (#18349)
Summary:
This addresses the issue of broadcasting giving the wrong result in `nn.MSELoss()`, as mentioned in https://github.com/pytorch/pytorch/issues/16045. In particular, the issue often arises when computing the loss between tensors with shapes (n, 1) and (n,), as sketched below.
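
A minimal sketch of the pitfall the new warning targets (shapes and values here are illustrative):

```python
import torch
import torch.nn.functional as F

pred = torch.randn(4, 1)    # shape (n, 1)
target = torch.randn(4)     # shape (n,)

# Broadcasting silently expands both tensors to (n, n), so the loss is
# averaged over n*n element pairs instead of n; this now emits a warning.
loss = F.mse_loss(pred, target)

# The intended computation: make the shapes match first.
loss_ok = F.mse_loss(pred.squeeze(1), target)
```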
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18349

Differential Revision: D14594176

Pulled By: soumith

fbshipit-source-id: f23ae68a4bf42f3554ad7678a314ba2c7532a6db
2019-03-24 19:22:14 -07:00
Will Feng
7be05b822c Fix incorrect sparse add behavior when the sparse tensor has non-contiguous values (#18179)
Summary:
Currently, this code gives an incorrect result:
```python
import torch

indices = torch.tensor([[7, 1, 3]])
values = torch.tensor([[1., 1., 1.],
                       [1., 1., 1.],
                       [1., 1., 1.]])
x = torch.sparse_coo_tensor(indices, values, size=(10, 3))

# expand() produces non-contiguous values
values = torch.tensor(1.).expand(3, 3)
y = torch.sparse_coo_tensor(indices, values, size=(10, 3))

z = x + y
print(z)
# Incorrect output (every entry of `values` should be 2.):
# tensor(indices=tensor([[7, 1, 3]]),
#        values=tensor([[2., 1., 1.],
#                       [1., 1., 1.],
#                       [1., 1., 1.]]),
#        size=(10, 3), nnz=3, layout=torch.sparse_coo)
```

This PR fixes the bug by adding special handling for sparse tensors with non-contiguous values in the addition function (specifically, by cat'ing the indices and values together).

This PR closes https://github.com/pytorch/pytorch/issues/17950 and https://github.com/pytorch/pytorch/issues/17919.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18179

Reviewed By: ezyang

Differential Revision: D14569591

Pulled By: yf225

fbshipit-source-id: f5a14c4a31337fc95eab64596212066b4fb18b1a
2019-03-22 19:35:14 -07:00
Ailing Zhang
fe5d23cf4a Using sqrt for better precision in cosine_similarity (#18250)
Summary:
Addresses the comment in #18168.
Testing in CI...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18250

Differential Revision: D14568601

Pulled By: ailzhang

fbshipit-source-id: 39fbbdb08743b53fa665c7e88e4750cbe0976ec7
2019-03-22 13:33:30 -07:00
Edward Yang
ba81074c40 Fix B902 lint error: invalid first argument. (#18181)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18181
ghimport-source-id: 9c23551584a1a1b0b7ac246367f3a7ae1c50b315

Stack from [ghstack](https://github.com/ezyang/ghstack):
* #18184 Fix B903 lint: save memory for data classes with slots/namedtuple
* **#18181 Fix B902 lint error: invalid first argument.**
* #18178 Fix B006 lint errors: using mutable structure in default argument.
* #18177 Fix lstrip bug revealed by B005 lint

A variety of sins were committed (a couple of them are sketched below):
- Some code was dead
- Some code was actually a staticmethod
- Some code simply gave the first argument the wrong name
- Some code was purposely testing the omitted case
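
A hypothetical sketch (not taken from the diff) of the patterns B902 flags and how they were fixed:

```python
class Counter:
    def __init__(self):
        self.n = 0

    # Before: `def increment(counter):` -- B902, the first argument of an
    # instance method must be named `self`.
    def increment(self):
        self.n += 1

    # Before: an "instance method" that never touched the instance;
    # the fix is to make it an actual staticmethod.
    @staticmethod
    def description():
        return "a simple counter"
```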

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Differential Revision: D14530876

fbshipit-source-id: 292a371d9a76ddc7bfcfd38b6f0da9165290a58e
2019-03-21 09:10:28 -07:00
Ailing Zhang
8895bfba6a fix cosine_similarity (#18168)
Summary:
Fixes #18057 according to colesbury's suggestion. Thanks!
cc: ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18168

Differential Revision: D14520953

Pulled By: ailzhang

fbshipit-source-id: 970e6cfb482d857a81721ec1d0ee4a4df84a0450
2019-03-19 20:09:17 -07:00
Edward Yang
3fe7bdb2ff Fix lint in test_nn.py (#18006)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18006
ghimport-source-id: e267ece1ac03e0d17e01dddf4a77f52421859435

Stack:
* **#18006 Fix lint in test_nn.py**

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Reviewed By: eellison

Differential Revision: D14458108

fbshipit-source-id: 18ee6199447efed55a922cff5b3ad940a21c0536
2019-03-14 08:59:24 -07:00
bhushan
16e50c78e7 Report convolution size mismatch (#17436)
Summary:
1. The kernel size is larger than the input (sketched below)
2. The expected output size would be less than zero
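
A hedged sketch of the first case (the exact error text may differ):

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 1, 3)    # input of length 3
w = torch.randn(1, 1, 5)    # kernel of length 5, larger than the input

try:
    F.conv1d(x, w)
except RuntimeError as e:
    # Now reports the size mismatch instead of failing with a cryptic error.
    print(e)
```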

Test case added:
- invalid_conv1d
- Relevant test cases for conv2d and conv3d exists

Fixes #17247
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17436

Reviewed By: mrshenli

Differential Revision: D14354272

Pulled By: fmassa

fbshipit-source-id: 94b98621aa03b1f60d151ef9399ed3da55d41b42
2019-03-14 06:35:29 -07:00
Soumith Chintala
a478d41620 Fix nll_loss crash on cpu where ignore_index is out of bounds (#17328)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/15508
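
A minimal sketch of the now-safe pattern, assuming the standard ignore_index convention:

```python
import torch
import torch.nn.functional as F

logits = torch.randn(3, 5)
target = torch.tensor([1, 4, -100])   # -100 marks padded positions

# Targets equal to ignore_index are skipped instead of triggering an
# out-of-bounds access on CPU.
loss = F.nll_loss(F.log_softmax(logits, dim=1), target, ignore_index=-100)
```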
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17328

Differential Revision: D14322629

Pulled By: soumith

fbshipit-source-id: 7d02f372be78794782c18affcfc109ce30b1e91c
2019-03-05 14:35:05 -08:00
Tongzhou Wang
44a607b90c Fix autograd with buffers requiring grad in DataParallel (#13352)
Summary:
This was causing a problem with spectral norm, although SN won't use such buffers anymore after #13350.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13352

Differential Revision: D14209562

Pulled By: ezyang

fbshipit-source-id: f5e3183e1e7050ac5a66d203de6f8cf56e775134
2019-02-26 20:53:19 -08:00
vishwakftw
724c7e76c6 Fix reduction='none' in poisson_nll_loss (#17358)
Summary:
Changelog:
- Modify `if` to `elif` in the reduction mode comparison (usage sketched below)
- Add error checking for reduction mode
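
A minimal usage sketch of the fixed mode:

```python
import torch
import torch.nn.functional as F

log_input = torch.randn(4)
target = torch.rand(4) * 5

# With the fix, reduction='none' returns one loss value per element
# instead of falling through to the wrong reduction branch.
per_element = F.poisson_nll_loss(log_input, target, reduction='none')
assert per_element.shape == target.shape

# An unknown reduction string is now rejected with an error.
```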
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17358

Differential Revision: D14190523

Pulled By: zou3519

fbshipit-source-id: 2b734d284dc4c40679923606a1aa148e6a0abeb8
2019-02-25 10:35:33 -08:00
Tongzhou Wang
3d5968d366 Fix DataParallel(cpu_m).cuda() not working by checking at forward (#17363)
Summary:
Fixes #17362
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17363

Differential Revision: D14175151

Pulled By: soumith

fbshipit-source-id: 7b7e2335d553ed2133287deeaca3f6b6254aea4a
2019-02-22 08:31:36 -08:00
Natalia Gimelshein
5fa78303ed fix double backward for half softmax/logsoftmax (#17330)
Summary:
Fix for #17261. SsnL, do you have tests for it in your other PR? If not, I'll add them to this one. The example from #17261 now does not error out (and the same goes for log_softmax); a minimal sketch follows.
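
A minimal sketch of the now-working pattern (requires a CUDA device; not the exact example from #17261):

```python
import torch

x = torch.randn(4, 5, device='cuda', dtype=torch.half, requires_grad=True)
y = torch.softmax(x, dim=1)

# Backward once with create_graph=True, then backward through the gradient:
# this double backward previously errored out for half inputs.
g, = torch.autograd.grad(y.sum(), x, create_graph=True)
g.sum().backward()
```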
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17330

Differential Revision: D14171529

Pulled By: soumith

fbshipit-source-id: ee925233feb1b44ef9f1d757db59ca3601aadef2
2019-02-21 14:58:45 -08:00
Soumith Chintala
c63af8837d remove nn.Upsample deprecation warnings from tests (#17352)
Differential Revision: D14168481

Pulled By: soumith

fbshipit-source-id: 63c37c5f04d2529abd4f42558a3d5e81993eecec
2019-02-21 11:27:24 -08:00
Tongzhou Wang
2b57bdb7ab Fix cuda softmax backward with empty input (#17259)
Summary:
Fixes #17256
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17259

Differential Revision: D14142196

Pulled By: soumith

fbshipit-source-id: 1f2dc202951b59b43da27684f9f924314bcd3040
2019-02-19 16:41:52 -08:00
Jie
594a4d7b55 at::native batch norm kernel launch config update (#17047)
Summary:
Limit the block dimension to avoid a configuration error on the batch norm kernel launch.

This should resolve #16998
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17047

Differential Revision: D14142132

Pulled By: soumith

fbshipit-source-id: 9c8c52dcd1d108cda1f65f5227e625b8fe6e12a0
2019-02-19 16:41:51 -08:00
Sergei Nikolaev
6455d91e4d False alarm about leak in TestNN.test_variable_sequence_cuda (#17242)
Summary:
`TestNN.test_variable_sequence_cuda` sometimes breaks due to a reported CUDA leak.
The cause appears to be a too-small tolerance breaking the float16 sub-test of the test above.
When it breaks, it calls abort, disrupting the correct tear-down of the test
and raising a false alarm about the leak.

~~Also, removed annoying **Upsample** module warning.
IMHO this warning is wrong because the module **Upsample** is not deprecated. Seems like it's been mixed
with `nn.functional.upsample` function which is indeed deprecated in favor of `nn.functional.interpolate`, see `torch/nn/functional.py:2387` for details (this replacement is also performed in `test_nn.py`).~~
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17242

Differential Revision: D14141686

Pulled By: soumith

fbshipit-source-id: faa8f87440d94bdc6ab0ff00be6dad82353115c4
2019-02-19 15:59:30 -08:00
Shen Li
472cfc0f2c Enforce module device at DataParallel construction time (#17129)
Summary:
closes #17065

CC douwekiela
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17129

Differential Revision: D14093353

Pulled By: mrshenli

fbshipit-source-id: 9a5a10f16e392337a7f7073223541cf69b402f82
2019-02-15 11:14:46 -08:00
wbydo
6c67dcfb05 Fix AdaptiveLogSoftmaxWithLoss's constructor (#16694)
Summary:
t-ken1 and I are members of the same team.
I have added test code for pull request https://github.com/pytorch/pytorch/pull/16656.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16694

Differential Revision: D14070106

Pulled By: ezyang

fbshipit-source-id: ff784dbf45e96a6bcf9a4b5cb9544a661a8acad2
2019-02-15 06:58:00 -08:00
Junjie Bai
9b7f3da74b Skip test_cudnn_multiple_threads_same_device on ROCm (flaky) (#17061)
Summary:
cc iotamudelta
https://ci.pytorch.org/jenkins/job/pytorch-builds/job/py2-clang7-rocmdeb-ubuntu16.04-test/10722//console
https://ci.pytorch.org/jenkins/job/pytorch-builds/job/py2-clang7-rocmdeb-ubuntu16.04-test/10710//console
https://ci.pytorch.org/jenkins/job/pytorch-builds/job/py2-clang7-rocmdeb-ubuntu16.04-test/10753//console
https://ci.pytorch.org/jenkins/job/pytorch-builds/job/py2-devtoolset7-rocmrpm-centos7.5-test/1756//console
```
19:07:18 ======================================================================
19:07:18 FAIL: test_cudnn_multiple_threads_same_device (test_nn.TestNN)
19:07:18 ----------------------------------------------------------------------
19:07:18 Traceback (most recent call last):
19:07:18   File "/var/lib/jenkins/workspace/test/test_nn.py", line 3905, in test_cudnn_multiple_threads_same_device
19:07:18     (2048 - test_iters) * (2048 - test_iters))
19:07:18   File "/var/lib/jenkins/workspace/test/common_utils.py", line 453, in assertEqual
19:07:18     super(TestCase, self).assertLessEqual(abs(x - y), prec, message)
19:07:18 AssertionError: 3794704.0 not less than or equal to 1e-05 :
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17061

Differential Revision: D14069324

Pulled By: bddppq

fbshipit-source-id: e33b09abca217a62a8b577f9c332ea22985ef4ff
2019-02-13 17:18:47 -08:00
Johannes M Dieterich
9d01be1a5a enable more unit tests in test_nn (#16994)
Summary:
These tests work with ROCm 2.1.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16994

Differential Revision: D14059802

Pulled By: bddppq

fbshipit-source-id: 8e2cbb13196c2e0283d3e02b7f761374bc580751
2019-02-12 17:58:44 -08:00
Thomas Viehmann
29f096cc70 optionally zero infinite losses in CTCLoss (#16199)
Summary:
Here is a stab at implementing an option to zero out infinite losses (and NaN gradients).
It might be nicer to move the zeroing to the respective kernels.
The default is currently `False` to mimic the old behaviour, but I'd be half inclined to set the default to `True`, because the behaviour wasn't consistent between CuDNN and Native anyways and the NaN gradients aren't terribly useful.

This topic seems to come up regularly, e.g. in  #14335
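
A minimal usage sketch of the new option, assuming it is exposed as `zero_infinity` on `nn.CTCLoss`:

```python
import torch
import torch.nn as nn

T, N, C, S = 10, 2, 20, 15   # target length S > input length T -> infinite loss
log_probs = torch.randn(T, N, C).log_softmax(2).requires_grad_()
targets = torch.randint(1, C, (N, S), dtype=torch.long)
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.full((N,), S, dtype=torch.long)

ctc = nn.CTCLoss(zero_infinity=True)
loss = ctc(log_probs, targets, input_lengths, target_lengths)
loss.backward()   # infinite losses (and their NaN gradients) are zeroed out
```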
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16199

Differential Revision: D14020462

Pulled By: ezyang

fbshipit-source-id: 5ba8936c66ec6e61530aaf01175dc49f389ae428
2019-02-11 13:12:55 -08:00
Johannes M Dieterich
23e1c55cc0 enable unit tests working on ROCm 2.1 (#16871)
Summary:
This is the first round of enabling unit tests that, in my testing, work on ROCm 2.1.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16871

Differential Revision: D13997662

Pulled By: bddppq

fbshipit-source-id: d909a3f7dd5fc8f85f126bf0613751c8e4ef949f
2019-02-09 00:30:50 -08:00
Asher Mancinelli
7078b2baf5 Better bounds checks in ctcloss (#16269)
Summary:
Adds better bounds checks for target lengths in CTC loss, checks for integral types for target and prediction lengths, and adds tests for each, according to #15946
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16269

Differential Revision: D13847567

Pulled By: ezyang

fbshipit-source-id: 5d7a975565e02baf78fe388813a1d1ef56dfb212
2019-02-01 08:02:54 -08:00
vishwakftw
34b43baeec Allow list and tuples to be passed as output_size to max_unpool1d (#16489)
Summary:
Changelog:
- Modify the concatenation of [1] with `output_size` to handle list and non-list types separately (usage sketched below).
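
A minimal usage sketch:

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 1, 8)
pooled, indices = F.max_pool1d(x, kernel_size=2, return_indices=True)

# output_size may now be a plain list (or tuple), not just a torch.Size.
restored = F.max_unpool1d(pooled, indices, kernel_size=2, output_size=[1, 1, 8])
```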
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16489

Differential Revision: D13875838

Pulled By: soumith

fbshipit-source-id: fade65cc47385986b773b9bde9b4601ab93fe1cf
2019-01-30 11:00:34 -08:00
Lu Fang
b1b00f329e Fix the flake8 linter
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16549

Reviewed By: bddppq

Differential Revision: D13877435

Pulled By: houseroad

fbshipit-source-id: dbe575ba3f6dd30d27ac6aa5eec2eea025063540
2019-01-30 09:36:00 -08:00
Gregory Chanan
0cb24098c7 Handle non-contiguous inputs with mkldnn convolution. (#16300)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/16018.

Backwards appears to be fine because the derivative is written in terms of mkldnn_convolution itself.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16300

Differential Revision: D13797776

Pulled By: gchanan

fbshipit-source-id: 68a990b8a3c186412a99d176931314806c9ed7bf
2019-01-24 07:39:31 -08:00
Thomas Viehmann
b662a9b66a add back NNPACK in PyTorch (#15924)
Summary:
This tests the water for adding back NNPACK in PyTorch, it's a lot better than the fallback THNN versions.

In #6151, we (ezyang and soumith) removed NNPACK support from PyTorch. Of course Maratyszcza might have advice, too. (Or an opinion on the CMake changes.)

The only functional changes are to use NNPack more aggressively on mobile and a .contiguous() to match NNPack's assumption (I stumbled over that while using NNPack for style transfer.)
The CMake changes try to use the NNPack we already have in git.

In terms of lines of code this is a large part of the diff of https://lernapparat.de/pytorch-jit-android/. As far as I can tell, we don't have MKLDNN on mobile and the native THNN implementations are prohibitively expensive in terms of both CPU and memory.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15924

Differential Revision: D13709576

Pulled By: ezyang

fbshipit-source-id: f2e287739909451c173abf046588209a7450ca2c
2019-01-18 15:34:35 -08:00
Richard Zou
ed0a761c82 Improve pack_sequence and pack_padded_sequence error message (#16084)
Summary:
Mention in the error message that the user can set
enforce_sorted=False. This is a new flag that is probably hard to
discover unless one thoroughly reads the docs.

Fixes #15567
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16084

Differential Revision: D13701118

Pulled By: zou3519

fbshipit-source-id: c9aeb47ae9769d28b0051bcedb8f2f51a5a5c260
2019-01-18 07:58:54 -08:00
Egil Martinsson
d6a8dd9538 Cleanup gumbel_softmax (#13339)
Summary:
Fixes #12643, amends to #3341.

- Allow multidimensional input ~~(but apply softmax over `dim=-1`)~~ with `dim` argument
- Cleaner: fewer lines of code
- Faster (1.32x speedup vs original, 2x speedup vs using `torch.Distributions`)
- Small fixes in docstring
- Remove some references in docstring. Was the linked (excellent) ipynb the first to do the straight-through trick? Instead, I propose changing the reference to the two papers best known for it.
- Add a DeprecationWarning for `eps`. It's not needed anymore.
- The initial commit keeps some code alternatives commented out to make use of CI.

- As per the discussion when `gumbel_softmax` was added (#3341), this was merged into `torch.nn.functional` before all the work on `Distributions` and `Pyro`, and there will probably be multiple other best practices for this in the future.
I've tested building using the `Distributions`-api, but it was too slow, see below.

I therefore propose not using `Distributions` to keep it fast and simple, but adding a comment in docstring that `gumbel_softmax` may be deprecated in the future.

```
dist = torch.distributions.RelaxedOneHotCategorical(temperature=tau, logits=logits, validate_args=False)
y_soft = dist.rsample()
```

Pros:
* Built using tricks like `logsumexp` etc
* Explicitly uses `torch.distributions.utils._finfo` to avoid overflow (old implementation had an `eps` flag)
* Maintained for this exact purpose.

Cons:
* Very slow. Construction of the distribution adds overhead; see timings below. May be solved in the future with speedups of `TransformedDistribution` and `Distribution`.
* Assumes which `dim` to apply softmax over.

```
    y_soft = logits.new(logits.shape)
    y_soft = (logits - y_soft.exponential_().log()) / tau  # Gumbel noise
    y_soft = y_soft.softmax(dim)  # Gumbel softmax noise
```
Pros:
* Faster

```
    import time

    import torch
    # `gumbel_softmax` here refers to the implementation being timed
    # (e.g. torch.nn.functional.gumbel_softmax).
    from torch.nn.functional import gumbel_softmax

    start = time.time()
    num_draws = 1000000
    logits = torch.randn(1, 3)
    counts = torch.zeros(1, 3)

    for draw in range(num_draws):
        y_draw = gumbel_softmax(logits, hard=True)
        counts = counts + y_draw
    end = time.time()
    print(end - start)

>> 12.995795965194702

>> 7.658372640609741

>> 20.3382670879364
```

Decide on which path to choose. I'll commit changes to the unit tests in a while to show that it passes both the old and new tests. I'll also remove the commented code about `RelaxedOneHotCategorical`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13339

Differential Revision: D13092434

Pulled By: ezyang

fbshipit-source-id: 4c21788df336f4e9c2ac289022e395b261227b4b
2019-01-17 12:56:35 -08:00
Gregory Chanan
595f767880 Revert batched pdist, improve existing kernel, add test (#15901)
Summary:
1) Reverts https://github.com/pytorch/pytorch/pull/12302 which added support for batched pdist. Except I kept the (non-batched) test improvements that came with that PR, because they are nice to have.  Motivation: https://github.com/pytorch/pytorch/issues/15511
2) For the non-batched pdist, improved the existing kernel by forcing fp64 math and properly checking cuda launch errors
3) Added a 'large tensor' test that at least on my machine, fails on the batch pdist implementation.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15901

Reviewed By: ezyang

Differential Revision: D13616730

Pulled By: gchanan

fbshipit-source-id: 620d3f9b9acd492dc131bad9d2ff618d69fc2954
2019-01-17 10:44:43 -08:00
Natalia Gimelshein
461dc9a28b use all_weights instead of _parameters in _flat_weights in rnn (#15766)
Summary:
Fixes #15749
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15766

Differential Revision: D13592320

Pulled By: soumith

fbshipit-source-id: 6c3805f576c3df5a2da8bef1e4305eda379718df
2019-01-08 09:48:36 -08:00
Michael Carilli
e313f1a7bf Cudnn Handle Pool 3: At Wit's End (#15668)
Summary:
ezyang Here's a freshly rebased version of https://github.com/pytorch/pytorch/pull/15080 with the if statement that relieved the hangs that occasionally, nondeterministically, occurred on cudnnCreate on a particular windows build ([example w/debug statements](https://ci.pytorch.org/jenkins/job/pytorch-builds/job/pytorch-win-ws2016-cuda9-cudnn7-py3-test2/19238/console))  in https://github.com/pytorch/pytorch/pull/15280.

I'd like to run the CI over this several times before it's considered mergeable.  Sometimes the windows hang doesn't manifest for 2 or 3 consecutive trials.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15668

Differential Revision: D13579291

Pulled By: soumith

fbshipit-source-id: 3972eb98bad6ece933ca5e67a10fc4bc2ed06068
2019-01-04 06:28:21 -08:00
Ailing Zhang
78442f04fc Add mkldnn conv double backward (#15686)
Summary:
Fixes #15353 .

Like the cudnn conv implementation, mkldnn also falls back to the default `_convolution_double_backward` for double backward.

This bug wasn't caught by CI before because mkldnn is only used when input scalar type is float, but our tests are all using double as default.

Adding test for float inputs, but mkldnn seems to have imprecision issues similar to cudnn implementation, so here I only check if double backward exists instead of calling `gradgradcheck`. Please correct me if the precision should actually be checked.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15686

Differential Revision: D13571682

Pulled By: ailzhang

fbshipit-source-id: f1762439762370f276cfd59e8b8b8a4dee960a4b
2019-01-03 10:50:00 -08:00
Will Feng
7b87ecae37 Move autograd metadata from VariableImpl to TensorImpl (#13827)
Summary:
Changes originally in this PR:
1. Move Variable::Impl data members into TensorImpl as `AutogradMeta` struct
2. Change Variable::Impl functions to use data members in `AutogradMeta` struct
3. Add `shallow_copy_and_detach()` function to each subclass of TensorImpl
4. Do shallow copy when the user calls `make_variable(tensor)` / `make_variable_view(tensor)` / `variable.set_data(tensor)` / `variable.detach()`

Changes moved from https://github.com/pytorch/pytorch/pull/13645:
1. Add a flag to Variable to disallow size/stride/storage_ptr changes from in-place operations such as `resize_` / `resize_as_` / `set_` / `transpose_`, and set this flag to true when people call `tensor.data` in Python.
2. Write text in the docs to actively discourage changing the shape or storage of `tensor_detached` and expecting `tensor` to also be updated.

This is the 1st+2nd PR mentioned in https://github.com/pytorch/pytorch/issues/13638.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13827

Differential Revision: D13507173

Pulled By: yf225

fbshipit-source-id: b177b08438d534a8197e34e1ad4a837e2db0ed6a
2018-12-26 16:34:24 -08:00
David Pollack
cdb8edce75 add from_pretrained method to EmbeddingBag (#15273)
Summary:
The `EmbeddingBag` module does not include a `from_pretrained` method like the `Embedding` module.  I added it for consistency between the two modules.
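
A minimal usage sketch mirroring `Embedding.from_pretrained`:

```python
import torch
import torch.nn as nn

weights = torch.tensor([[1.0, 2.0, 3.0],
                        [4.0, 5.0, 6.0]])
bag = nn.EmbeddingBag.from_pretrained(weights)   # frozen by default

# One bag containing rows 0 and 1; the default 'mean' mode averages them.
out = bag(torch.tensor([[0, 1]]))
# tensor([[2.5000, 3.5000, 4.5000]])
```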
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15273

Differential Revision: D13547842

Pulled By: soumith

fbshipit-source-id: 8ffde51ff0c1e8fc8310263b6f375da88089ff7d
2018-12-26 08:35:39 -08:00
Richard Zou
3353064060 Add option to automatically handle unsorted variable-length sequences in RNNs (#15225)
Summary:
Fixes #3584.

Motivation: manually sorting sequences, packing them, and then unsorting them
is something a lot of users have complained about doing, especially when we can
offer library support for them.

Overview: we internally sort sequences before packing them and store a list of
`unsorted_indices` that represent how to unsort the sequences inside
PackedSequence. The packing helper functions return PackedSequence with the
`permutation` field and the unpacking helper functions use it to unsort.

To implement this, the following changes were made:
- PackedSequence now keeps `sorted_indices` and `unsorted_indices`.
  These two can be thought of as permutations and are inverses of each other.
  `sorted_indices` is how the sequences were sorted; `unsorted_indices` is how
  to unsort the sequences.
- Added an `enforce_sorted` argument to pack_sequence and pack_padded_sequence
  that maintains the legacy behavior of error-ing out on unsorted-sequences.
  When `enforce_sorted=True`, these functions maintain their ONNX exportability.
- pack_sequence(sequences, enforce_sorted) takes in unsorted sequences.
- pack_padded_sequence can take in a padded tensor that represents padded,
  unsorted sequences.
- pad_packed_sequence unsorts the PackedSequence such that it is still the
  inverse operation of packed_padded_sequence.
- RNNs apply `sort_indices` to their input hidden state and apply
  `unsort_indices` to their output hidden state. This is to ensure that the
  hidden state batches correspond to the user's ordering of input sequences.

NOT BC-Breaking
- The default for pack_sequence and pack_padded_sequence is
  `enforce_sorted=True` to avoid breaking ONNX export. To use the new
  functionality, pass in `enforce_sorted=False` (see the sketch after this list).
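
A minimal sketch of the new unsorted-input path:

```python
import torch
from torch.nn.utils.rnn import pack_sequence, pad_packed_sequence

# Variable-length sequences in arbitrary (unsorted) order.
seqs = [torch.tensor([1., 2.]),
        torch.tensor([3., 4., 5.]),
        torch.tensor([6.])]

packed = pack_sequence(seqs, enforce_sorted=False)   # no manual sorting needed
padded, lengths = pad_packed_sequence(packed, batch_first=True)
# Rows of `padded` come back in the original order of `seqs`.
```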

Testing Plan
- Modified TestNN.test_pack_sequence, TestNN.test_packed_padded_sequence,
  and TestNN.test_variable_sequence (RNN test) to check the behavior
  of unsorted sequences, sorted sequences, and sorted sequences with
  enforce_sorted=True
- test/test_jit.py has a test to see if RNNs are exportable with
  enforce_sorted=True

cc colesbury
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15225

Reviewed By: soumith

Differential Revision: D13507138

Pulled By: zou3519

fbshipit-source-id: b871dccd6abefffca81bc4e3efef1873faa242ef
2018-12-20 17:37:18 -08:00
Gao, Xiang
a47749cb28 Add at::one_hot (#15208)
Summary: Closes: https://github.com/pytorch/pytorch/issues/15060
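
A minimal usage sketch, assuming the op is exposed in Python as `torch.nn.functional.one_hot`:

```python
import torch
import torch.nn.functional as F

labels = torch.tensor([0, 2, 1])
onehot = F.one_hot(labels, num_classes=3)
# tensor([[1, 0, 0],
#         [0, 0, 1],
#         [0, 1, 0]])
```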

Differential Revision: D13528014

Pulled By: ezyang

fbshipit-source-id: 5a18689a4c5638d92f9390c91517f741e5396293
2018-12-20 14:24:58 -08:00
Erik Brinkman
8db44eda01 Add support for batched pdist (#12302)
Summary:
This updates pdist to work for batched inputs, and updates the
documentation to reflect issues raised.

closes #9406
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12302

Reviewed By: ezyang

Differential Revision: D13528485

Pulled By: erikbrinkman

fbshipit-source-id: 63d93a6e1cc95b483fb58e9ff021758b341cd4de
2018-12-20 09:41:08 -08:00
Peter Goldsborough
aec9fdf0a4 Fix _apply in nn.Module (#15305)
Summary:
Fixes an issue that arose from https://github.com/pytorch/pytorch/pull/13481 where `.shared_memory()` couldn't be called. Effectively undoes all changes to `nn.Module` from that PR and solves the relevant problem in a different way (the goal was to be able to call `._apply()` on the Python wrapper for a C++ module).

soumith
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15305

Differential Revision: D13493937

Pulled By: goldsborough

fbshipit-source-id: 4cb8687f90fc8709a536c5e7eacd0dc8edf6f750
2018-12-17 16:22:21 -08:00
David Riazati
59d71b9664 Bicubic interpolation for nn.functional.interpolate (#9849)
Summary:
Addresses #918; interpolation results should be similar to TensorFlow's.

* Adds bicubic interpolation operator to `nn.functional.interpolate`
* Corresponding test in `test_nn.py`

The operator is added in legacy `TH` to be aligned with the other upsampling operators; they can be refactored/moved to ATen all at once when #10482 is resolved
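
A minimal usage sketch:

```python
import torch
import torch.nn.functional as F

img = torch.randn(1, 3, 8, 8)
up = F.interpolate(img, scale_factor=2, mode='bicubic', align_corners=False)
# up.shape == torch.Size([1, 3, 16, 16])
```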
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9849

Differential Revision: D9007525

Pulled By: driazati

fbshipit-source-id: 93ef49a34ce4e5ffd4bda94cd9a6ddc939f0a4cc
2018-12-17 15:31:48 -08:00
Edward Yang
dcd1685282 Revert D13440858: [pytorch][PR] Use a pool of per-thread cudnn handles for each device, updated
Differential Revision:
D13440858

Original commit changeset: 1c6af5c53538

fbshipit-source-id: fda42ea75000d4a4e9c4a8eeaaa5518f7ad9c298
2018-12-14 14:35:01 -08:00
Chaitanya Sri Krishna Lolla
9f1d8f2eeb enabled tests in test_nn, test_cuda and test_sparse (#15232)
Summary:
tests work on ROCm 1.9.2 as present on CI (fp16 bringup, hipMemset and sparse improvements)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15232

Differential Revision: D13470991

Pulled By: bddppq

fbshipit-source-id: 45acc4f9ea5baaaf7672b86eb022948055779925
2018-12-14 14:27:57 -08:00
Michael Carilli
ca4358c8f5 Use a pool of per-thread cudnn handles for each device, updated (#15080)
Summary:
Rebased version of https://github.com/pytorch/pytorch/pull/14861, hopefully addressing ezyang's comments.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15080

Differential Revision: D13440858

Pulled By: ezyang

fbshipit-source-id: 1c6af5c53538b81c6b92cf1dda231ed333f28035
2018-12-13 10:24:06 -08:00
Johannes M Dieterich
6610ace28b use ROCm 1.9.2 fp16 capabilities in rocBLAS and MIOpen interfaces (#14994)
Summary:
* relax MIOpen if statement to allow fp16/fp32 mixed precision training now supported by ROCm 1.9.2
* use gemm_ex API of rocBLAS in ROCm 1.9.2 instead of the previous hgemm API
* with this: enable all but one half test in test_nn

While there, fix also:
* a group convolution issue with MIOpen, pertaining to properly initializing MIOpen on multi-GPU systems, that we detected while working on this
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14994

Differential Revision: D13439869

Pulled By: bddppq

fbshipit-source-id: 75e4eb51a59488882e64b5eabdc30555b25be25e
2018-12-12 16:16:47 -08:00
Natalia Gimelshein
27d5ae7afb use datatype dependent tolerance in data parallel tests
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14856

Differential Revision: D13413560

Pulled By: soumith

fbshipit-source-id: b3a0cfe93477ed332e6eaa2e39ef5f4cc8b36481
2018-12-10 22:50:27 -08:00
Johannes M Dieterich
52942e1f09 Enable unit tests known to work on ROCm (#14011)
Summary:
* Enable unit tests known to work on ROCm.
* Disable a few that are known to be flaky for the time being.
* Use std::abs for Half
* No more special casing for ROCm in TensorMathReduce
* Document an important detail for a hardcoded block size w.r.t. ROCm in TensorMathReduce

ezyang bddppq for awareness
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14011

Differential Revision: D13387679

Pulled By: bddppq

fbshipit-source-id: 4177f2a57b09d866ccbb82a24318f273e3292f71
2018-12-07 18:57:32 -08:00
Ailing Zhang
ef91cfd68b Add new reduction mode in kl_div (#14457)
Summary:
Fixes #6622 .
We used to average over all elements for KL divergence, which is not aligned with its mathematical definition.
This PR corrects the default reduction behavior of KL divergence so that it now averages over the batch dimension.

- In KL, the default behavior `reduction=mean` averages over the batch dimension, while for most other loss functions `reduction=mean` averages over all elements.
- We used to support scalar tensors as well. For BC purposes, we still support them; no reduction is performed on a scalar tensor.
- Added a new reduction mode called `batchmean` which has the correct behavior for KL (usage sketched below). Added a warning to make `batchmean` the default for KL instead of `mean` in the next major release.
- [deprecated] I chose not to add a new reduction option, since "mean over batch dimension" is kinda special, and it only makes sense in a few cases like KL. We don't want to explain why there's an option "batchmean" that's not applicable to all other functions. I'm open to discussion on this one, as I cannot think of a perfect solution for it.
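
A minimal usage sketch of the new mode:

```python
import torch
import torch.nn.functional as F

log_probs = F.log_softmax(torch.randn(4, 5), dim=1)
target = F.softmax(torch.randn(4, 5), dim=1)

# 'batchmean' divides the summed KL divergence by the batch size, matching
# the mathematical definition; plain 'mean' averages over all elements.
loss = F.kl_div(log_probs, target, reduction='batchmean')
```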
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14457

Differential Revision: D13236016

Pulled By: ailzhang

fbshipit-source-id: 905cc7b3bfc35a11d7cf098b1ebc382170a087a7
2018-12-04 12:24:28 -08:00
David Riazati
814b5715ba Move module tests to common_nn (#14578)
Summary:
This moves `new_module_tests` from `test_nn.py` to `common_nn.py` so
that they can be used in `test_jit.py` without running any of
`test_nn.py`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14578

Differential Revision: D13268286

Pulled By: driazati

fbshipit-source-id: 6e8654a4c29ab754d656ac83820c14d1c1843e03
2018-11-30 12:14:59 -08:00