Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30768
The behavior didn't match the documentation, because the documentation (for 'none' reduction) reads:
input X target -> output
(N, C) X (N, C) -> (N,)
(C,) X (C,) -> ()
but the latter case would output (1,). This PR also changes the case to:
() X (C,) -> ()
from:
() X (C,) -> (C,)
which is more consistent with the formulas above.
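The summary doesn't name the specific loss function being fixed, so the following is only an illustrative sketch using `multilabel_margin_loss`, a loss with the `(N, C) X (N, C)` signature; the shape comments reflect the documented (post-fix) behavior:
```py
import torch
import torch.nn.functional as F

# (N, C) X (N, C) -> (N,)
x = torch.randn(4, 3)
t = torch.tensor([[0, 1, -1], [2, -1, -1], [0, 2, -1], [1, -1, -1]])
print(F.multilabel_margin_loss(x, t, reduction='none').shape)  # torch.Size([4])

# (C,) X (C,) -> ()   (previously this produced shape (1,))
x1 = torch.randn(3)
t1 = torch.tensor([0, 2, -1])
print(F.multilabel_margin_loss(x1, t1, reduction='none').shape)  # torch.Size([])
```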
Restacked version of: https://github.com/pytorch/pytorch/pull/30748
Test Plan: Imported from OSS
Differential Revision: D18821554
Pulled By: gchanan
fbshipit-source-id: 3df77c51cf25648cb5fab62a68b09f49c91dab4e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30765
It is already supported on CPU and is easy to add for consistency.
Restacked version of: https://github.com/pytorch/pytorch/pull/30727
Test Plan: Imported from OSS
Differential Revision: D18821557
Pulled By: gchanan
fbshipit-source-id: e6aa3e91000ff3fd63941defc7d30aef58ae2f82
Summary:
This fixes https://github.com/pytorch/pytorch/issues/28575.
It seems `poisson_nll_loss` was implemented with an incorrect assumption about `masked_select`, which does not actually return a tensor sharing storage with the input, so the in-place operation used there didn't work as intended.
Here I used `masked_fill` instead.
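A small illustration of the difference (not the actual `poisson_nll_loss` code; the tensors are made up):
```py
import torch

x = torch.tensor([1.0, -2.0, 3.0])
mask = x < 0

# masked_select returns a copy, so writing into the result never updates x
sel = torch.masked_select(x, mask)
sel.zero_()
print(x)   # tensor([ 1., -2.,  3.]) -- unchanged, which is why the in-place trick failed

# masked_fill returns a tensor with the masked positions replaced
y = x.masked_fill(mask, 0.0)
print(y)   # tensor([1., 0., 3.])
```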
Also, the existing test didn't have a `reference_fn`, so I added one (although it's of limited use, since the current C++ `poisson_nll_loss` implements exactly the same algorithm as `reference_fn`).
Thanks in advance for reviewing this PR.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28637
Differential Revision: D18299724
Pulled By: albanD
fbshipit-source-id: 1aac5b20e77bf54874b79018207ba8f743766232
Summary:
Handling of an empty example was giving a CUDA error.
Added a getLastError check to make sure CUDA errors are attributed to the
correct function (previously the error was attributed to the next
CUDA operator).
Added a special case for batch size zero; the same special case was added to the CPU
path to keep things consistent.
Resubmit of D18085429 without stacked commits
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28614
Test Plan: test included
Differential Revision: D18122212
Pulled By: ggoossen
fbshipit-source-id: 8c6741a157a9fbbc82685d81a6f8021452b650d4
Summary:
Currently, `reshape` does an `as_strided` when the geometry is viewable. However, `as_strided` backward is not very optimized and cannot always detect such cases. Improvements are planned at https://github.com/pytorch/pytorch/pull/8965, and I will finish it some day. But as things stand, backward through `reshape` copies the gradient in these cases while a simple `view` does not. This copy is unnecessary.
Notably this affects `flatten` and a whole bunch of other ops implemented on top of `reshape`.
```py
In [15]: x = torch.randn(3, 4, requires_grad=True)
In [16]: y = x.reshape(x.shape)
In [17]: assert y._base is not None
In [18]: gy = torch.randn_like(y)
In [20]: gx = torch.autograd.grad(y, x, gy)[0]
In [21]: gx
Out[21]:
tensor([[ 0.2189, 0.3396, -0.1108, 1.7703],
[ 1.0737, -0.1222, 1.0765, -1.3363],
[-1.3798, -0.2950, 0.0800, 0.2501]])
In [22]: gx._base # not gy
Out[22]:
tensor([ 0.2189, 0.3396, -0.1108, 1.7703, 1.0737, -0.1222, 1.0765, -1.3363,
-1.3798, -0.2950, 0.0800, 0.2501])
In [23]: gy.zero_()
Out[23]:
tensor([[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.]])
In [24]: gx # not sharing storage with gy
Out[24]:
tensor([[ 0.2189, 0.3396, -0.1108, 1.7703],
[ 1.0737, -0.1222, 1.0765, -1.3363],
[-1.3798, -0.2950, 0.0800, 0.2501]])
# but everything is optimized with view, which should be equivalent with reshape in this case
In [25]: y = x.view(x.shape)
In [26]: assert y._base is not None
In [27]: gy = torch.randn_like(y)
In [28]: gx = torch.autograd.grad(y, x, gy)[0]
In [29]: gx
Out[29]:
tensor([[-2.4463, 1.1446, 0.1501, 0.1212],
[-1.1125, 1.4661, 0.9092, -0.2153],
[-0.1937, -0.3381, -1.3883, -0.7329]])
In [30]: gy.zero_()
Out[30]:
tensor([[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.]])
In [31]: gx # sharing storage with gy
Out[31]:
tensor([[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.]])
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28901
Differential Revision: D18240868
Pulled By: ezyang
fbshipit-source-id: 28fdaa0c7014a9dae6731dfe8b67784d38fc27f0
Summary:
Currently nn.Linear (and its internal functional code) will
fail in THBlas:
RuntimeError: invalid argument 8: lda should be at least max(1, 0), but have 0 at caffe2/aten/src/TH/generic/THBlas.cpp:363
This diff fixes that bug.
So far I have identified two possible places where changes need to be made, based on the current dispatcher logic:
1. The file touched in this diff
2. caffe2/aten/src/THC/generic/THCTensorMathBlas.cu
At the moment I didn't find a better place than injecting logic into those files:
this is the only non-generated function on the forward pass, plus the mm_mat2_backward function family on the backward pass.
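For reference, a hypothetical repro of the zero-size case suggested by the error message above (the exact failing shapes are an assumption, not stated in this summary):
```py
import torch

linear = torch.nn.Linear(0, 3)   # in_features == 0
x = torch.randn(4, 0)            # batch of 4 samples with no features
out = linear(x)                  # previously hit the THBlas lda error on some paths
print(out.shape)                 # torch.Size([4, 3])
```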
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27211
Test Plan: New unit-tests are passing. Code that was failing earlier works. Need to test other backends.
Differential Revision: D17599915
Pulled By: kennyhorror
fbshipit-source-id: 78894ce602d96aac2d6bf8c16a3fab43973e2d53
Summary:
This PR adds the Average Pool module to the C++ front-end.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25800
Differential Revision: D17318094
Pulled By: yf225
fbshipit-source-id: c914c0e802bbe5f1d1f0a21a669c28bc956899db
Summary:
yf225 This is the L1Loss module. I don't think the ```_Loss``` and ```_WeightedLoss``` Python base classes do much: the first one sets the reduction type and also takes in the ```reduce``` parameter, which is deprecated, and the second one only registers the ```weight``` parameter. I don't think we should keep this structure. What do you think?
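For context, a rough paraphrase of what those two base classes amount to (simplified, not the verbatim torch.nn.modules.loss source):
```py
import torch
from torch.nn import Module

class _Loss(Module):
    def __init__(self, size_average=None, reduce=None, reduction='mean'):
        super(_Loss, self).__init__()
        # The real class translates the deprecated size_average/reduce flags
        # into the equivalent reduction string; here we just store reduction.
        self.reduction = reduction

class _WeightedLoss(_Loss):
    def __init__(self, weight=None, size_average=None, reduce=None, reduction='mean'):
        super(_WeightedLoss, self).__init__(size_average, reduce, reduction)
        # The only extra thing this class does: register the optional weight buffer.
        self.register_buffer('weight', weight)
```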
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25902
Differential Revision: D17307045
Pulled By: yf225
fbshipit-source-id: ad3eda2ee8dcf4465054b376c1be89b39d11532f
Summary:
This PR adds Python/C++ API parity tracker at `test/cpp_api_parity/parity-tracker.md`, which currently shows parity status for `torch.nn` modules.
A good portion of the line changes here comes from moving `new_criterion_tests` from `test_nn.py` to `common_nn.py`, so that it can be used in `test_cpp_api_parity.py`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25289
Differential Revision: D17188085
Pulled By: yf225
fbshipit-source-id: 33d12fb1a4de2d9147ed09380973f361a3981fdf
Summary:
Moving so that `new_criterion_tests` can be used from `test_cpp_api_parity.py`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25333
Differential Revision: D17097188
Pulled By: yf225
fbshipit-source-id: 7f7905cc6799bca8dc6b3c9cc43995313c6bc058
Summary:
1. Restrict block.z to at most 64, compliant with the CUDA maximum z-dimension of a block;
2. clang-format
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22602
Differential Revision: D16203857
Pulled By: ezyang
fbshipit-source-id: 567719ae175681a48eb0f818ca0aba409dca2550
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18598
ghimport-source-id: c74597e5e7437e94a43c163cee0639b20d0d0c6a
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18598 Turn on F401: Unused import warning.**
This was requested by someone at Facebook; this lint is turned
on for Facebook by default. "Sure, why not."
I had to noqa a number of imports in __init__. Hypothetically
we're supposed to use __all__ in this case, but I was too lazy
to fix it. Left for future work.
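For illustration, the two styles look like this (`somepkg` and `helper` are hypothetical names, not files in this repo):
```py
# 1) What this commit does: keep the re-export and silence the lint per import.
from somepkg.internals import helper  # noqa: F401

# 2) The __all__ approach mentioned as future work: listing the name in __all__
#    marks the import as an intentional re-export, which flake8 also accepts.
from somepkg.internals import helper
__all__ = ['helper']
```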
Be careful! flake8-2 and flake8-3 behave differently with
respect to import resolution for # type: comments. flake8-3 will
report the import as unused; flake8-2 will not. For now, I just
noqa'd all these sites.
All the changes were done by hand.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Differential Revision: D14687478
fbshipit-source-id: 30d532381e914091aadfa0d2a5a89404819663e3
Summary:
Enable unit tests that work with ROCm 2.3. In particular, these are unit tests that were previously skipped for double data types, plus some tests for multi-GPU setups.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18537
Differential Revision: D14651822
Pulled By: ezyang
fbshipit-source-id: 7dd575504ebe235a91489866c91000e9754b1235
Summary:
When adaptive pooling has to produce a single-pixel feature map, it is faster to do so by calling .mean(). The backward pass calls a pretty inefficient CUDA kernel with atomics, which becomes extremely slow for half precision. For half, this PR provides an approx. 30x speed-up for adaptive average pooling, which results in a 30% end-to-end speed-up on SENet. Improvements are smaller for float, but still significant (approx. 5x).
Also this PR unifies handling of 3d (no batch dimension) and 4d tensors, using negative dimension indices.
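A quick sketch of the equivalence being exploited (this is the observable behavior, not the kernel code itself):
```py
import torch
import torch.nn.functional as F

x = torch.randn(2, 16, 7, 7)
a = F.adaptive_avg_pool2d(x, 1)             # single-pixel output: (2, 16, 1, 1)
b = x.mean(dim=(-2, -1), keepdim=True)      # same values computed as a plain mean
print(torch.allclose(a, b))                 # True
```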
cc ezyang for review.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17011
Reviewed By: ailzhang
Differential Revision: D14078747
Pulled By: soumith
fbshipit-source-id: 0eb9255da2351190a6bcaf68c30e2ae2402a2dd9
Summary:
1. Port the FractionalMaxPool3d implementation from THNN/THCUNN to ATen.
2. Expose this function in the Python nn module.
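A minimal usage sketch of the newly exposed module (shapes chosen arbitrarily):
```py
import torch
import torch.nn as nn

m = nn.FractionalMaxPool3d(kernel_size=3, output_size=(4, 4, 4))
x = torch.randn(2, 8, 16, 16, 16)   # (N, C, T, H, W)
print(m(x).shape)                    # torch.Size([2, 8, 4, 4, 4])
```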
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15575
Differential Revision: D13612848
Pulled By: chandlerzuo
fbshipit-source-id: 5f474b39005efa7788e984e8a805456dcdc43f6c
Summary:
Addresses #918; interpolation results should be similar to TensorFlow's.
* Adds bicubic interpolation operator to `nn.functional.interpolate`
* Corresponding test in `test_nn.py`
The operator is added in legacy `TH` to be aligned with the other upsampling operators; they can be refactored/moved to ATen all at once when #10482 is resolved
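A minimal usage sketch of the new mode (parameters are illustrative):
```py
import torch
import torch.nn.functional as F

x = torch.randn(1, 3, 8, 8)
y = F.interpolate(x, scale_factor=2, mode='bicubic', align_corners=False)
print(y.shape)   # torch.Size([1, 3, 16, 16])
```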
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9849
Differential Revision: D9007525
Pulled By: driazati
fbshipit-source-id: 93ef49a34ce4e5ffd4bda94cd9a6ddc939f0a4cc
Summary:
This PR adds `None` buffers as parameters (similarly to #14715). It also cleans up a bunch of the `test_jit.py` tests that should be covered by `common_nn.py` and brings in `criterion_tests` to test loss functions.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14778
Differential Revision: D13330849
Pulled By: driazati
fbshipit-source-id: 924cc4cf94e0dcd11e811a55222fd2ebc42a9e76
Summary:
Fixes #6622.
We used to average over all elements for kl divergence, which is not aligned with its math definition.
This PR corrects the default reduction behavior of KL divergence so that it now averages over the batch dimension.
- In KL, the default behavior `reduction=mean` averages over the batch dimension, while for most other loss functions `reduction=mean` averages over all elements.
- We used to support scalar tensors as well. For BC purposes we still do; no reduction is performed on a scalar tensor.
- Added a new reduction mode called `batchmean` which has the correct behavior for KL (see the sketch after this list). Added a warning that `batchmean` will become the default for KL instead of `mean` in the next major release.
- [deprecated] I chose not to add a new reduction option, since "mean over batch dimension" is kind of special and only makes sense in a few cases like KL. We don't want to have to explain why there's an option "batchmean" that isn't applicable to all the other functions. I'm open to discussion on this one, as I cannot think of a perfect solution for this.
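Sketch of the difference between the two reduction modes (tensors are illustrative):
```py
import torch
import torch.nn.functional as F

inp = torch.randn(4, 10).log_softmax(dim=-1)   # log-probabilities
tgt = torch.randn(4, 10).softmax(dim=-1)       # probabilities

F.kl_div(inp, tgt, reduction='mean')       # sum of pointwise terms / (4 * 10) elements
F.kl_div(inp, tgt, reduction='batchmean')  # sum of pointwise terms / 4 (batch size),
                                           # matching the mathematical definition of KL
```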
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14457
Differential Revision: D13236016
Pulled By: ailzhang
fbshipit-source-id: 905cc7b3bfc35a11d7cf098b1ebc382170a087a7
Summary:
Add support for interpolate and upsampling in weak_script mode.
Because the function parameters are overloaded, I had to add it as a builtin op. For interpolate:
size can be int? | int[]?, and scale_factor can be float? | float[]?. Every combination of the two parameters needs to be supported.
The same logic applies for upsample_nearest, upsample_bilinear, and upsample.
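From the Python side, the overload combinations described above look like this (shapes are illustrative):
```py
import torch
import torch.nn.functional as F

x = torch.randn(1, 3, 8, 8)
F.interpolate(x, size=16)                  # size as a single int
F.interpolate(x, size=[16, 16])            # size as an int list
F.interpolate(x, scale_factor=2.0)         # scale_factor as a single float
F.interpolate(x, scale_factor=[2.0, 2.0])  # scale_factor as a float list
```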
Along the way I made a few related fixes.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14123
Differential Revision: D13278923
Pulled By: eellison
fbshipit-source-id: e59729034369be4ce4b747291a3d1c74e135b869
Summary:
This moves `new_module_tests` from `test_nn.py` to `common_nn.py` so
that they can be used in `test_jit.py` without running any of
`test_nn.py`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14578
Differential Revision: D13268286
Pulled By: driazati
fbshipit-source-id: 6e8654a4c29ab754d656ac83820c14d1c1843e03
Summary:
This PR adds weak modules for all activation modules and uses `test_nn` module tests to test weak modules that have been annotated with `weak_module` and therefore are in `torch._jit_internal._weak_types`
Also depends on #14379
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14238
Differential Revision: D13252887
Pulled By: driazati
fbshipit-source-id: e9638cf74089884a32b8f0f38396cf432c02c988
Summary:
This PR adds weak modules for all activation modules and uses `test_nn` module tests to test weak modules that have been annotated with `weak_module` and therefore are in `torch._jit_internal._weak_types`
Also depends on #14379
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14238
Differential Revision: D13192230
Pulled By: driazati
fbshipit-source-id: 36488960b6c91448b38c0fa65422539a93af8c5e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12794
common.py is used as a base module by almost all tests in test/. The
name of this file is so common that it can easily conflict with other dependencies
if they happen to have another common.py in the base module. Rename the file to
avoid conflicts.
Reviewed By: orionr
Differential Revision: D10438204
fbshipit-source-id: 6a996c14980722330be0a9fd3a54c20af4b3d380
Summary:
* improve Docker packages (install OpenBLAS to have compile-time LAPACK functionality with optimizations for both Intel and AMD CPUs)
* integrate rocFFT (i.e., enable Fourier functionality)
* fix bugs in ROCm caused by wrong warp size
* enable more test sets, skip the tests that don't work on ROCm yet
* no longer disable asserts during hipification
* small improvements
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10893
Differential Revision: D9615053
Pulled By: ezyang
fbshipit-source-id: 864b4d27bf089421f7dfd8065e5017f9ea2f7b3b