Summary:
Currently `cumsum` crashes in its backward pass for tensors that have at least one dimension but zero elements, which happens when some dimension has size zero. This commit fixes the error by checking both `dim()` and `numel()` in cumsum backward.
Fixes https://github.com/pytorch/pytorch/issues/31515
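A minimal repro sketch (shapes chosen here for illustration, not taken from the issue):
```python
import torch

# A tensor with dim() == 2 but numel() == 0 (one dimension has size zero).
x = torch.randn(0, 5, requires_grad=True)
y = x.cumsum(dim=1)
# Backward through cumsum on the zero-element tensor previously crashed;
# with this fix it produces an (empty) gradient of the same shape as x.
y.sum().backward()
print(x.grad.shape)  # torch.Size([0, 5])
```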
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31694
Reviewed By: mrshenli
Differential Revision: D19266613
Pulled By: leedtan
fbshipit-source-id: 9407e0aa55440fed911c01a3580bb6c5eab62a16
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31517
This is going to be used by upsample (which currently uses magic values to represent optionals).
For now, we just introduce a fake function for testing (torch._test_optional_float(x)).
Test Plan: Imported from OSS
Differential Revision: D19198721
Pulled By: gchanan
fbshipit-source-id: 0a1382fde0927c5d277d02d62bfb31fb574b8c74
Summary:
Reference: https://github.com/pytorch/pytorch/issues/23159
Currently we don't support reduction operations for tensors with dim >= 64, and we should raise a descriptive RuntimeError saying so.
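A minimal sketch of the intended behavior, assuming a 65-dimensional tensor can be constructed and that the exact message text may differ:
```python
import torch

x = torch.zeros((1,) * 65)  # 65-dimensional tensor with a single element
try:
    x.sum(dim=0)
except RuntimeError as e:
    # Previously this failed with an unhelpful internal error; now it should
    # raise a descriptive RuntimeError about the 64-dimension limit.
    print(e)
```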
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31476
Differential Revision: D19179039
Pulled By: anjali411
fbshipit-source-id: 58568f64627bf3df6b3e00a1498544c030e74a0e
Summary:
Make the following changes:
- When there are more than 10k errors, cuda-memcheck only shows the first 10k; in this case we shouldn't raise an exception
- Add an `UNDER_CUDA_MEMCHECK` environment variable to allow disabling `pin_memory` tests when running under cuda-memcheck
- Add a `--ci` command option; when turned on, the script writes its output to stdout instead of to a file and exits with an error if cuda-memcheck fails
- Add a `--nohang` command option; when turned on, a hang is treated as a pass instead of an error
- Do simple filtering on the tests to run: skip a test if `'cpu'` is in the test name but `'cuda'` is not
- Add `--split` and `--rank` options to allow splitting the work (NVIDIA CI has a 3-hour limit, so we have to split the work to stay within it)
- The error summary can be `ERROR SUMMARY: 1 error` or `ERROR SUMMARY: 2 errors`; the tail can be `error` or `errors`, so the line does not have a fixed length. The script is fixed to handle this case (see the sketch after this list)
- Ignore errors from `cufft`
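For the error-summary point above, a sketch (not the script's actual code) of a parse that tolerates both the singular and plural tail:
```python
import re

# cuda-memcheck may print "ERROR SUMMARY: 1 error" or "ERROR SUMMARY: 2 errors",
# so match an optional trailing "s" instead of assuming a fixed-length tail.
def num_errors(line):
    m = re.search(r"ERROR SUMMARY: (\d+) errors?", line)
    return int(m.group(1)) if m else None

print(num_errors("========= ERROR SUMMARY: 1 error"))    # 1
print(num_errors("========= ERROR SUMMARY: 12 errors"))  # 12
```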
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29243
Differential Revision: D18941701
Pulled By: mruberry
fbshipit-source-id: 2048428f32b66ef50c67444c03ce4dd9491179d2
Summary:
Tests for unique_dim will be refactored in a separate PR.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31211
Differential Revision: D19034968
Pulled By: ngimel
fbshipit-source-id: 855d326b37638b5944f11fbbce03394cf000daf9
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30892
Fixes all outstanding lints and actually installs a properly configured
flake8
Test Plan: Imported from OSS
Differential Revision: D18862825
Pulled By: suo
fbshipit-source-id: 08e9083338a7309272e17bb803feaa42e348aa85
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30826
Previously the scalar_check for the reduction None case was `input.dim() <= 1`, but it should be target-based, i.e. `target.dim() == 0`. This follows from the "correct cases", i.e.:
(N, C) X (N,) -> (N,)
(C,) X () -> ()
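A sketch of the shape contract using nll_loss as an illustrative stand-in, since the summary doesn't name the op:
```python
import torch
import torch.nn.functional as F

# (N, C) X (N,) -> (N,)
out = F.nll_loss(torch.randn(4, 3).log_softmax(dim=1),
                 torch.tensor([0, 2, 1, 0]), reduction='none')
print(out.shape)  # torch.Size([4])

# (C,) X () -> ()  -- the 0-dim output follows from target.dim() == 0
out = F.nll_loss(torch.randn(3).log_softmax(dim=0),
                 torch.tensor(1), reduction='none')
print(out.shape)  # torch.Size([])
```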
Test Plan: Imported from OSS
Differential Revision: D18833660
Pulled By: gchanan
fbshipit-source-id: 26338b842a8311718c4b89da3e2f1b726d5409b8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30768
The behavior didn't match the documentation, because the documentation (for 'none' reduction) reads:
input X target -> output
(N, C) X (N, C) -> (N,)
(C,) X (C,) -> ()
but the latter case would output (1,). This also changes the case to:
() X (C,) -> ()
from:
() X (C,) -> (C,)
which makes more sense with the above formulas.
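For illustration only (the summary doesn't name the op), multilabel_margin_loss documents the same shape rule:
```python
import torch
import torch.nn.functional as F

# (N, C) X (N, C) -> (N,)
inp = torch.randn(4, 3)
tgt = torch.tensor([[1, 0, -1], [2, -1, -1], [0, 2, -1], [1, -1, -1]])
print(F.multilabel_margin_loss(inp, tgt, reduction='none').shape)  # torch.Size([4])

# (C,) X (C,) -> ()  -- expected: a 0-dim result, not (1,)
print(F.multilabel_margin_loss(torch.randn(3), torch.tensor([1, -1, -1]),
                               reduction='none').shape)  # torch.Size([])
```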
Restacked version of: https://github.com/pytorch/pytorch/pull/30748
Test Plan: Imported from OSS
Differential Revision: D18821554
Pulled By: gchanan
fbshipit-source-id: 3df77c51cf25648cb5fab62a68b09f49c91dab4e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30670
Also turn off scalar_check for grad_input: it isn't necessary because the input can't be 0-dimensional.
Test Plan: Imported from OSS
Differential Revision: D18784523
Pulled By: gchanan
fbshipit-source-id: 246d30970457075a0403dd0089317659a2cd2dd4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30669
The inputs can't be 0-d, so we don't need that check in the scalar_check.
Test Plan: Imported from OSS
Differential Revision: D18784524
Pulled By: gchanan
fbshipit-source-id: d44222dffc91880a6e8c7be69e6e146e60040d43
Summary:
With the CI failure caused in 8bbafa0b32 fixed (the lambdas in the CUDA kernels had an incorrect return type).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30521
Differential Revision: D18770151
Pulled By: ailzhang
fbshipit-source-id: 02f0fe1d5718c34d24da6dbb5884ee8b247ce39a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30527
When we introduced `dtype.is_signed` we allowed it to support quantized types, but we're not sure what the correct result should be.
See discussion at https://github.com/pytorch/pytorch/pull/29511
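For context, a quick sketch of the property on regular dtypes; the quantized dtypes are the open question:
```python
import torch

# is_signed is well-defined for regular dtypes...
print(torch.float32.is_signed)  # True
print(torch.int8.is_signed)     # True
print(torch.uint8.is_signed)    # False
# ...but for quantized dtypes such as torch.qint8 / torch.quint8 it is unclear
# what the result should be (see the discussion linked above).
```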
Test Plan: Imported from OSS
Differential Revision: D18765410
Pulled By: nairbv
fbshipit-source-id: c87cfe999b604cfcbbafa561e04d0d5cdbf41e6d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30434
These are all pointwise ops that are implemented correctly wrt shapes in THC.
Test Plan: Imported from OSS
Differential Revision: D18699087
Pulled By: gchanan
fbshipit-source-id: 82cb91b00c77bfaca75be497c87fc7ae52daf46c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29953
The underlying function handles it correctly.
Test Plan: Imported from OSS
Differential Revision: D18548055
Pulled By: gchanan
fbshipit-source-id: cc2d0ae37d9689423363d115c6a653cb64840528
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29952
The underlying op handles the check correctly.
Test Plan: Imported from OSS
Differential Revision: D18548048
Pulled By: gchanan
fbshipit-source-id: 9ac6fde743408e59ccdfc61bd574ebe6e2862238
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29923
Note that this changes the behavior of masked_select when both "self" and "mask" are 0-dimensional.
In previous versions of PyTorch, this would return a 0-dimensional tensor. But the documentation reads:
"Returns a new 1-D tensor which indexes the input tensor according to the boolean mask mask which is a BoolTensor."
Test Plan: Imported from OSS
Differential Revision: D18539560
Pulled By: gchanan
fbshipit-source-id: 1637ed2c434fcf8ceead0073aa610581f4a19d21
Summary:
Migrate `index_add` CPU from TH to ATen.
I couldn't find a replacement for `get1d` and `set1d`, so the pointer arithmetic is done directly instead.
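For reference, a small usage sketch of the op being migrated (user-facing behavior should be unchanged by the port):
```python
import torch

x = torch.zeros(5, 3)
index = torch.tensor([0, 2, 4])
source = torch.ones(3, 3)
# Accumulate rows of `source` into rows 0, 2 and 4 of `x` along dim 0.
x.index_add_(0, index, source)
print(x)
```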
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28421
Test Plan: existing tests
Differential Revision: D18060971
Pulled By: ggoossen
fbshipit-source-id: 413719990cdb2fe578964cde14e93577e48a4342