Summary:
This PR adds functionality to skip a test based on CUDA version.
This way, we can be more specific when skipping a test, such as when the test only fails for a particular CUDA version.
This allows us to add back the tests currently skipped for CUDA 11.2 so they still run on other CUDA versions, such as 10.1 and 11.1.
I tested this locally (by using 11.0 instead of 11.2), but will run the full CI to make sure it works.
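For illustration, a minimal sketch of what a CUDA-version-based skip can look like (the decorator name and signature here are hypothetical, not necessarily the exact helper added by this PR):
```
import unittest
import torch

def skip_if_cuda_version_in(versions):
    # Hypothetical helper: skip a test when the active CUDA version is in `versions`.
    def decorator(fn):
        cuda_version = torch.version.cuda  # e.g. "11.2", or None for CPU-only builds
        should_skip = cuda_version is not None and cuda_version in versions
        return unittest.skipIf(should_skip, f"skipped on CUDA {cuda_version}")(fn)
    return decorator

class ExampleCudaTests(unittest.TestCase):
    @skip_if_cuda_version_in(["11.2"])
    def test_only_broken_on_11_2(self):
        self.assertTrue(torch.cuda.is_available())
```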
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52359
Reviewed By: walterddr
Differential Revision: D26487951
Pulled By: janeyx99
fbshipit-source-id: 45c71cc6105ffd9985054880009cf68ea5ef3f6a
Summary:
Fixes https://github.com/pytorch/pytorch/issues/51719, https://github.com/pytorch/pytorch/issues/28142
**Change**
- Update `torch.Tensor.unflatten` to support passing `-1` as the inferred size, for both tensors and named tensors.
- Examples of using `-1` in the `unflatten` function are added to the docs (see the sketch below).
- Fix a rendering issue in the original `unflatten` docs by removing a stray blank line in its example section.
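For reference, a short sketch of the new behavior, mirroring the examples added to the docs (the names in the named-tensor case are illustrative, and the named form follows the docs' (name, size) pairs):
```
import torch

x = torch.randn(2, 12)
y = x.unflatten(1, (3, -1))    # -1 is inferred as 4
print(y.shape)                 # torch.Size([2, 3, 4])

# Named-tensor variant
n = torch.randn(2, 12, names=('A', 'B'))
m = n.unflatten('B', (('B1', 3), ('B2', -1)))
print(m.names, m.shape)        # ('A', 'B1', 'B2') torch.Size([2, 3, 4])
```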
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51955
Reviewed By: agolynski
Differential Revision: D26467198
Pulled By: zou3519
fbshipit-source-id: 6a3ede25561223187273796427ad0cb63f125364
Summary:
Reference: https://github.com/pytorch/pytorch/issues/50006
We should probably add aliases for these operators to be consistent with the NumPy names, i.e. `np.degrees` and `np.radians`.
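For context, a quick sketch of the NumPy names alongside the existing PyTorch functions they would alias (only the existing functions are shown; the proposed alias names are not assumed to exist yet):
```
import numpy as np
import torch

deg = np.array([0.0, 90.0, 180.0])
np.radians(deg)                        # array([0.        , 1.57079633, 3.14159265])
torch.deg2rad(torch.from_numpy(deg))   # tensor([0.0000, 1.5708, 3.1416], dtype=torch.float64)

rad = np.array([0.0, np.pi / 2, np.pi])
np.degrees(rad)                        # array([  0.,  90., 180.])
torch.rad2deg(torch.from_numpy(rad))   # tensor([  0.,  90., 180.], dtype=torch.float64)
```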
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51283
Reviewed By: ngimel
Differential Revision: D26171163
Pulled By: mruberry
fbshipit-source-id: 1869604ed400820d95f6ff50a0e3cba1de1ffa84
Summary:
Adding CUDA 11.2 to Windows CI.
Disabled tests:
The following ran into `CUDA error: misaligned address` for CUDA 11.2: (issue linked below)
`test_where_scalar_valid_combination_cuda_complex128` in test_torch.py
`test_sgn_complex_cuda` in test_autograd.py
The following ran into `CUDA error: too many resources requested for launch` for CUDA 11.2: (https://github.com/pytorch/pytorch/issues/52002)
`test_EmbeddingBag_per_sample_weights_and_new_offsets_cuda_int64_float64`
`test_EmbeddingBag_per_sample_weights_and_offsets_cuda_int64_float64`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51598
Reviewed By: mrshenli
Differential Revision: D26344965
Pulled By: janeyx99
fbshipit-source-id: 3c9a4ed16d748969e96593220ec0a9f33e1ffcef
Summary:
Toward fixing https://github.com/pytorch/pytorch/issues/47624
~Step 1: add `TORCH_WARN_MAYBE` which can either warn once or every time in c++, and add a c++ function to toggle the value.
Step 2 will be to expose this to python for tests. Should I continue in this PR or should we take a different approach: add the python level exposure without changing any c++ code and then over a series of PRs change each call site to use the new macro and change the tests to make sure it is being checked?~
Step 1: add a python and c++ toggle to convert TORCH_WARN_ONCE into TORCH_WARN so the warnings can be caught in tests
Step 2: add a python-level decorator to use this toggle in tests
Step 3: (in future PRs): use the decorator to catch the warnings instead of `maybeWarnsRegex`
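For illustration, a sketch of how such a toggle could be used from a test, assuming it is exposed on the `torch` module as a warn-always switch (the exact Python name and the decorator from step 2 may differ):
```
import warnings
import torch

def assert_always_warns(fn, pattern):
    # Hypothetical test helper: force TORCH_WARN_ONCE to behave like TORCH_WARN
    # so the warning can be caught even if it already fired earlier in the run.
    torch.set_warn_always(True)
    try:
        with warnings.catch_warnings(record=True) as caught:
            warnings.simplefilter("always")
            fn()
        assert any(pattern in str(w.message) for w in caught)
    finally:
        torch.set_warn_always(False)
```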
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48560
Reviewed By: ngimel
Differential Revision: D26171175
Pulled By: mruberry
fbshipit-source-id: d83c18f131d282474a24c50f70a6eee82687158f
Summary:
Implements `np.diff` for single order differences only:
- method and function variants for `diff` and function variant for `diff_out`
- supports out variant, but not in-place since shape changes
- adds OpInfo entry, and test in `test_torch`
- automatic autograd because we are using the `Math` dispatch
_Update: we only support Tensors for prepend and append in this PR. See discussion below and comments for more details._
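For reference, a short sketch of the Python API this adds (tensor-only prepend/append, per the note above):
```
import torch

a = torch.tensor([1, 3, 6, 10])
torch.diff(a)                                    # tensor([2, 3, 4])

b = torch.tensor([[1, 2, 4], [1, 4, 9]])
torch.diff(b, dim=1)                             # tensor([[1, 2], [3, 5]])

# prepend/append accept Tensors in this PR (scalar support is discussed below)
torch.diff(a, prepend=torch.tensor([0]), append=torch.tensor([15]))
# tensor([1, 2, 3, 4, 5])
```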
Currently there is a quirk in the C++ API based on how this is implemented: it is not possible to specify scalar prepend and append values without also specifying all 4 arguments.
That is because the goal is to match NumPy's diff signature of `diff(int n=1, int dim=-1, Union[Scalar, Tensor] prepend=None, Union[Scalar, Tensor] append=None)`, where all arguments are optional, positional, and in the correct order.
There are a couple blockers. One is c++ ambiguity. This prevents us from simply doing `diff(int n=1, int dim=-1, Scalar? prepend=None, Tensor? append=None)` etc for all combinations of {Tensor, Scalar} x {Tensor, Scalar}.
One might ask: why not drop the default args for prepend and append and write out the whole power set of {Tensor, Scalar, omitted} x {Tensor, Scalar, omitted}? Aside from having to write 18 overloads, this is actually illegal because arguments with defaults must come after arguments without defaults. This would mean having to write `diff(prepend, append, n, dim)`, which is not desired. Finally, writing out the entire power set of all arguments n, dim, prepend, append is out of the question because that would involve 2 * 2 * 3 * 3 = 36 combinations. And if we include the out variant, that would be 72 overloads!
With this in mind, the current way this is implemented is actually to still do `diff(int n=1, int dim=-1, Scalar? prepend=None, Tensor? append=None)`, but also make use of `cpp_no_default_args`. The idea is to have only one of the 4 {Tensor, Scalar} x {Tensor, Scalar} overloads provide default arguments for the C++ API, and add `cpp_no_default_args` for the remaining 3 overloads. With this, the Python API works as expected, but some calls such as `diff(prepend=1)` won't work in the C++ API.
We can optionally add 18 more overloads that cover the {dim, n, no-args} x {scalar-tensor, tensor-scalar, scalar-scalar} x {out, non-out} cases for the C++ API. _[edit: counting is hard - just realized this number is still wrong. We should try to count the cases we do cover instead and subtract that from the total: (2 * 2 * 3 * 3) - (3 + 2^4) = 17. The 3 comes from the 3 of 4 combinations of {tensor, scalar}^2 that we declare to be `cpp_no_default_args`, and the one remaining case that has default arguments covers 2^4 cases. So the actual count is 34 additional overloads to support all possible calls.]_
_[edit: thanks to https://github.com/pytorch/pytorch/issues/50767 hacky_wrapper is no longer necessary; it is removed in the latest commit]_
hacky_wrapper was also necessary here because `Tensor?` will cause dispatch to look for the `const optional<Tensor>&` schema but also generate a `const Tensor&` declaration in Functions.h. hacky_wrapper allows us to define our function as `const Tensor&` but wraps it in optional for us, so this avoids errors at both link and load time.
_[edit: rewrote the above to improve clarity and correct the fact that we actually need 18 more overloads (26 total), not 18 in total to complete the c++ api]_
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50569
Reviewed By: H-Huang
Differential Revision: D26176105
Pulled By: soulitzer
fbshipit-source-id: cd8e77cc2de1117c876cd71c29b312887daca33f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51578
https://github.com/pytorch/pytorch/pull/49710 introduced an edge case in which drawing a single sample resulted in ignoring the `dtype` arg to `draw`. This PR fixes that and adds a unit test to cover the behavior.
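A minimal repro of the covered behavior (sketch):
```
import torch
from torch.quasirandom import SobolEngine

sobol = SobolEngine(dimension=3)
# A single-sample draw should honor the requested dtype rather than fall back to the default.
sample = sobol.draw(1, dtype=torch.float64)
assert sample.dtype == torch.float64 and sample.shape == (1, 3)
```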
Test Plan: Unit tests
Reviewed By: danielrjiang
Differential Revision: D26204393
fbshipit-source-id: 441a44dc035002e7bbe6b662bf6d1af0e2cd88f4
Summary:
Performs the update that was suggested in https://github.com/pytorch/pytorch/issues/41489
Adjust the functionality to largely match that of the scipy companion PR https://github.com/scipy/scipy/pull/10844/, including
- a new `draw_base2` method
- include zero as the first point in the (unscrambled) Sobol sequence
The scipy PR is also quite opinionated when the `draw` method isn't called with a power-of-2 number of points (for which the resulting sequence has nice properties; see the scipy PR for a comprehensive discussion of this).
Note that this update is a **breaking change**: sequences generated with the same parameters will not be identical before and after this change! They will have the same (arguably better) distributional properties, but calling the engine with the same seed will result in different numbers in the sequence.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49710
Test Plan:
```
from torch.quasirandom import SobolEngine
sobol = SobolEngine(3)
sobol.draw(4)
sobol = SobolEngine(4, scramble=True)
sobol.draw(5)
sobol = SobolEngine(4, scramble=True)
sobol.draw_base2(2)
```
Reviewed By: malfet
Differential Revision: D25657233
Pulled By: Balandat
fbshipit-source-id: 9df50a14631092b176cc692b6024aa62a639ef61
Summary:
Reference: https://github.com/pytorch/pytorch/issues/33152
Changes
* Enable complex support for masked_scatter
* Enable half support for masked_scatter CPU
* Enable complex autograd support for masked_scatter CPU and masked_select (both CPU and CUDA).
**Note**:
Complex support for masked_scatter on CUDA is disabled, as it depends on `masked_fill`, which is yet to be ported to ATen.
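For illustration, a small sketch of the newly enabled paths (CPU complex `masked_scatter` and complex autograd through `masked_select`):
```
import torch

x = torch.zeros(4, dtype=torch.complex64)
mask = torch.tensor([True, False, True, False])
src = torch.tensor([1 + 1j, 2 + 2j], dtype=torch.complex64)
x.masked_scatter_(mask, src)                   # tensor([1.+1.j, 0.+0.j, 2.+2.j, 0.+0.j])

y = torch.randn(4, dtype=torch.complex64, requires_grad=True)
y.masked_select(mask).abs().sum().backward()   # complex autograd (CPU and CUDA)
```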
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51281
Reviewed By: ailzhang
Differential Revision: D26127561
Pulled By: anjali411
fbshipit-source-id: 6284926b934942213c5dfc24b5bcc8538d0231af
Summary:
Fixes https://github.com/pytorch/pytorch/issues/3307
Previously, `self.grad` was not ~cloned~ deepcopied to the returned tensor in `deepcopy`. Added a test and an implementation.
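A minimal check of the fixed behavior (sketch along the lines of the added test):
```
import copy
import torch

x = torch.randn(3, requires_grad=True)
x.sum().backward()                 # populate x.grad

y = copy.deepcopy(x)
assert y.grad is not None          # previously the gradient was dropped
assert torch.equal(y.grad, x.grad) and y.grad is not x.grad
```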
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50663
Reviewed By: heitorschueroff
Differential Revision: D26074811
Pulled By: albanD
fbshipit-source-id: 536dad36415f1d03714b4ce57453f406ad802b8c
Summary:
All of these unary operators now have an entry in the OpInfo DB.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50096
Reviewed By: zhangguanheng66
Differential Revision: D25870048
Pulled By: mruberry
fbshipit-source-id: b64e06d5b9ab5a03a202cda8c22fdb7e4ae8adf8
Summary:
Based on ngimel's feedback (thank you!), CPU half support was only accidental, so I'm removing it.
This lets us ditch the old without-replacement codepath in favour of the new, better one.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50063
Reviewed By: mruberry
Differential Revision: D25772449
Pulled By: ngimel
fbshipit-source-id: 608729c32237de4ee6d1acf7e316a6e878dac7f0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49552
This PR:
1. Migrates independent autograd tests for `hstack`, `dstack`, `vstack`, `movedim`, `moveaxis` from `test_autograd.py` to the new `OpInfo` based tests.
2. Migrates autograd tests for `gather`, `index_select` from the method_tests to the new `OpInfo` based tests.
3. Enables complex backward for `stack`, `gather`, `index_select`, `index_add_` and adds tests for complex autograd for all the above mentioned ops (see the sketch below).
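For illustration, a sketch of the kind of complex-backward coverage this enables:
```
import torch

x = torch.randn(3, 4, dtype=torch.complex128, requires_grad=True)
idx = torch.tensor([[0, 2], [1, 3], [0, 1]])
torch.gather(x, 1, idx).abs().sum().backward()      # complex backward through gather

y = torch.randn(5, dtype=torch.complex128, requires_grad=True)
torch.index_select(y, 0, torch.tensor([0, 2, 4])).abs().sum().backward()
```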
Test Plan: Imported from OSS
Reviewed By: mruberry
Differential Revision: D25682511
Pulled By: anjali411
fbshipit-source-id: 5d8f89db4a9ec340ab99a6196987d44a23e2c6c6
Summary:
**BC-breaking Note:**
This PR updates PyTorch's digamma function to be consistent with SciPy's special.digamma function. This changes the result of the digamma function on the nonpositive integers, where the gamma function is not defined. Since the gamma function is undefined at these points, the (typical) derivative of the logarithm of the gamma function is also undefined at these points, and for negative integers this PR updates digamma to return NaN. For zero, however, it returns -inf to be consistent with SciPy.
Interestingly, SciPy made a similar change, which was noticed by at least one user: https://github.com/scipy/scipy/issues/9663#issue-396587679.
SciPy's returning of negative infinity at zero is intentional:
59347ae8b8/scipy/special/cephes/psi.c (L163)
This change is consistent with the C++ standard for the gamma function:
https://en.cppreference.com/w/cpp/numeric/math/tgamma
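For reference, a short sketch of the updated behavior at the nonpositive integers (the finite value shown is approximate):
```
import torch

torch.digamma(torch.tensor([0., -1., -2., 1.]))
# tensor([   -inf,     nan,     nan, -0.5772])
```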
**PR Summary:**
Reference https://github.com/pytorch/pytorch/issues/42515
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48302
Reviewed By: ngimel
Differential Revision: D25664087
Pulled By: mruberry
fbshipit-source-id: 1168e81e218bf9fe5b849db0e07e7b22e590cf73
Summary:
**BC-Breaking Note:**
This PR updates PyTorch's angle operator to be consistent with NumPy's. Previously angle would return zero for all floating point values (including NaN). Now angle returns `pi` for negative floating point values, zero for non-negative floating point values, and propagates NaNs.
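For reference, a sketch of the updated behavior on real inputs:
```
import torch

torch.angle(torch.tensor([-2.0, 0.0, 3.0, float('nan')]))
# tensor([3.1416, 0.0000, 0.0000,    nan])
```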
**PR Summary:**
Reference: https://github.com/pytorch/pytorch/issues/42515
TODO:
* [x] Add BC-Breaking Note (previously all real numbers returned `0`, even `nan`) -> fixed to match the behavior of NumPy.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49163
Reviewed By: ngimel
Differential Revision: D25681758
Pulled By: mruberry
fbshipit-source-id: 54143fe6bccbae044427ff15d8daaed3596f9685
Summary:
This replaces the narrow character set APIs with the wide character set ones in `THAllocator.cpp`. This fixes the potential crashes caused by passing non-ASCII characters in `torch::from_file` on Windows.
See: https://github.com/pytorch/pytorch/issues/47422
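For illustration, a sketch of the scenario this fixes (the file name is hypothetical; the point is the non-ASCII path):
```
import torch

path = "数据_данные.bin"                # hypothetical non-ASCII file name
torch.zeros(8).numpy().tofile(path)     # create a small backing file
t = torch.from_file(path, shared=True, size=8, dtype=torch.float32)
print(t.shape)                          # torch.Size([8]); previously this could crash on Windows
```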
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47905
Reviewed By: zhangguanheng66
Differential Revision: D25399146
Pulled By: ezyang
fbshipit-source-id: 0a183b65de171c48ed1718fa71e773224eaf196f
Summary:
Fixes https://github.com/pytorch/pytorch/issues/45964
Indexing operators, e.g. `scatter`/`gather`, use tensor restriding, so `TensorIterator`'s built-in overlap checking needs to be disabled. This adds the missing overlap checks for these operators.
In addition, some indexing operators don't work well with `MemOverlapStatus::FULL`, which is explicitly allowed by `assert_no_partial_overlap`. So, I've introduced `assert_no_overlap`, which raises an error on partial _or_ full overlap.
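For illustration, a sketch of the kind of aliasing these checks are meant to reject (exact error behavior and messages aside):
```
import torch

buf = torch.randn(10)
dst = buf[:6]
src = buf[4:10]                  # partially overlaps dst in memory
idx = torch.arange(6)

try:
    dst.scatter_(0, idx, src)    # writing through partially overlapping views
except RuntimeError as e:
    print("rejected:", e)
```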
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48651
Reviewed By: zhangguanheng66
Differential Revision: D25401047
Pulled By: ngimel
fbshipit-source-id: 53abb41ac63c4283f3f1b10a0abb037169f20b89