Commit Graph

546 Commits

Author SHA1 Message Date
Jinku Cui
27eecf32bd Remove redundant dummy overrides (#103992)
# Tidy the code in [overrides.py](https://github.com/pytorch/pytorch/blob/main/torch/overrides.py)

## Duplicate APIs in the [get_testing_overrides()](https://github.com/pytorch/pytorch/blob/main/torch/overrides.py#L335) function:

| APIs  | Line number|
|-------|-------|
| torch.fft.fft| L544 L564 |
| torch.logsumexp | L670 L672
| torch.narrow_copy | L733 L1126 |
| torch.native_norm | L740 L741 L742 |
| torch.nn.init.constant_ | L885 L887 |
| torch.squeeze_copy | L1134 L1135 |
| torch.view_copy | L1148 L1149 |
| Tensor.\_coalesced\_ | L1236 L1261 |

## Testing script

```Python

import torch
import inspect
import functools
from typing import Dict, Set, Callable

"""
@functools.lru_cache(None)
def get_testing_overrides() -> Dict[Callable, Callable]:
    ...
    Tensor = torch.Tensor
    ret: Dict[Callable, Callable] = {
        # ...
        torch.fft.fft: lambda input, n=None, dim=-1, norm=None: -1,                         # L544
        torch.fft.fft: lambda input, n=None, dim=-1, norm=None: -1,                         # L564
        torch.logsumexp: lambda input, names, keepdim=False, out=None: -1,                  # L670
        torch.logsumexp: lambda input, names, keepdim=False, out=None: -1,                  # L672
        torch.narrow_copy: lambda input, dim, start, length: -1,                            # L733
        torch.narrow_copy: lambda self, dim, start, length: -1,                             # L1126
        torch.native_norm: lambda input, p=2: -1,                                           # L740
        torch.native_norm: lambda input, p=2: -1,                                           # L741
        torch.native_norm: lambda input, p=2, dim=None, keepdim=False, dtype=None: -1,      # L742
        torch.squeeze_copy: lambda self: -1,                                                # L1134
        torch.squeeze_copy: lambda self, dim: -1,                                           # L1135
        torch.view_copy: lambda self, size: -1,                                             # L1148
        torch.view_copy: lambda self, dtype: -1,                                            # L1149
        Tensor._coalesced_: lambda self: -1,                                                # L1236
        Tensor._coalesced_: lambda self, coalesced: -1,                                     # L1261
        # ...
    }
    ...
"""

if __name__ == "__main__":
    ret = torch.overrides.get_testing_overrides()

    Tensor = torch.Tensor
    dups = {"torch.fft.fft": torch.fft.fft,
            "torch.logsumexp": torch.logsumexp,
            "torch.narrow_copy": torch.narrow_copy,
            "torch.native_norm": torch.native_norm,
            "torch.squeeze_copy": torch.squeeze_copy,
            "torch.view_copy": torch.view_copy,
            "Tensor._coalesced_": Tensor._coalesced_}

    for k,v in dups.items():
        print(f"{k:18} {inspect.signature(ret[v])}")

```

## Testing output

```Shell
torch.fft.fft      (input, n=None, dim=-1, norm=None)
torch.logsumexp    (input, names, keepdim=False, out=None)
torch.narrow_copy  (self, dim, start, length)
torch.native_norm  (input, p=2, dim=None, keepdim=False, dtype=None)
torch.squeeze_copy (self, dim)
torch.view_copy    (self, dtype)
Tensor._coalesced_ (self, coalesced)

```

## Explanation:
The function `get_testing_overrides()` returns a `Dict[Callable, Callable]`. The later dummy overrides will cover the previous dummy overrides in the returned `Dict`. Therefore, removing the dummy overrides with homonym API names can tidy the code and increase the readability of the code.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/103992
Approved by: https://github.com/kit1980
2023-06-28 01:59:56 +00:00
Meghan
6ff4548b6e [AMP] Support XLA:TPU (#96370)
With https://github.com/pytorch/xla/pull/5148, https://github.com/pytorch/xla/pull/4740

With these changes
XLA:GPU users should use `torch.cuda.amp.autocast()` for AMP with float16
XLA:TPU users should use `torch.amp.autocast('xla')` for AMP with bfloat16

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96370
Approved by: https://github.com/bdhirsh, https://github.com/malfet
2023-06-23 19:46:42 +00:00
PyTorch MergeBot
7274582390 Revert "sparse_mask: backward support for sparse lhs (#95165)"
This reverts commit f090fdf3b4.

Reverted https://github.com/pytorch/pytorch/pull/95165 on behalf of https://github.com/huydhn due to Sorry for reverting this. I think one of the tests test_sparse.py::TestSparseCUDA::test_sparse_mask_backward_cuda_complex128 is failing on slow gradcheck f090fdf3b4 ([comment](https://github.com/pytorch/pytorch/pull/95165#issuecomment-1604696109))
2023-06-23 18:40:15 +00:00
Nikita Vedeneev
f090fdf3b4 sparse_mask: backward support for sparse lhs (#95165)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95165
Approved by: https://github.com/pearu, https://github.com/cpuhrsch
2023-06-23 12:27:27 +00:00
xuanqi
a152b3e3b8 [RFC] Create functional aten assertion ops (#103751)
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom):

* #103887
* #103757
* __->__ #103751

Prep PR to create functional version of assertions. Concrete logic will be implemented in future PRs.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/103751
Approved by: https://github.com/tugsbayasgalan
2023-06-23 06:20:42 +00:00
Muralidhar Andoorveedu
4e204ff87b Added is_xla (#103100)
This change creates `is_xla` which is congruent with `is_cuda` and `is_cpu`. Useful in situations like: https://github.com/pytorch/pytorch/pull/102858

```
>>> x = torch.tensor([1], device=xm.xla_device())
>>> x.is_xla
True
>>> x.is_cpu
False
>>> x = torch.tensor([1])
>>> x.is_cpu
True
>>> x.is_xla
False
```

Attn: @albanD
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103100
Approved by: https://github.com/albanD
2023-06-22 23:31:04 +00:00
Charlie West-Taylor
5eb7325bc7 Add autocast support for IPU (#103890)
As part of this, a new `AutocastIPU` dispatch key has been added.

There's an existing PR, #85043, to make `Autocast` a proper per-backend functionality key, but it ran into issues with layering with other functionality keys and went stale.

This has been tested in the out-of-tree IPU PyTorch backend.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/103890
Approved by: https://github.com/albanD
2023-06-22 15:38:45 +00:00
Aleksandar Samardžić
09fdea8564 Fix autograd issue with identity conversions (#92022)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92022
Approved by: https://github.com/pearu, https://github.com/mtaaooby, https://github.com/amjames, https://github.com/cpuhrsch
2023-06-21 21:23:03 +00:00
xuanqi
b27c3558a4 [RFC]: Create aten native op for constrain_range (#103346)
At high current implementation of constrains functions (constrain_as_**) will raise exception for the following code snippets:
```
def f(x):
    a = x.item()
    constrain_as_size(a, 4, 7)
    return torch.empty((a, 4))

inp = torch.tensor([5])
ep = torch._export.export(f, (inp,))
```

The reason is because current constrain logic is:
1) Purely python so it won't survive AOT export (the full node is gone after AOT export since AOT export only maintains aten level op).
2) Utilize side effect to add range constraints for traced symbol's shape env ([code](9591e52880/torch/fx/experimental/symbolic_shapes.py (L370-L372))).
3) If runtime assertion is turned on (by default). [`_AddRuntimeAssertionsForConstraintsPass`](9591e52880/torch/_export/passes/add_runtime_assertions_for_constraints_pass.py (L98-L100)) will try to append assertion node based on range constrains extracted from shape env of symbol during another interpretation round.
4). However, since 1), in the round of AOT export, range constraints logic won't run for symbols generated during this round. And later there is no range constrains information available for assertion round and caused issue.
5) As a result of above, it will failure at `torch.empty((a, 4))` (there is no constrains for `a` that it must be positive).

The fix here is just to implement range constrain logic as a native aten op (CPU implementation as no-op) to make it be able to survive AOT export.

**NOTE:**
[Logic](2d745b95d7/torch/fx/experimental/symbolic_shapes.py (L350-L365C15)) within [`constrain_range`](2d745b95d7/torch/fx/experimental/symbolic_shapes.py (LL313C74-L313C74)) is split out as `constrain_range_int` to capture case when non `SymInt` is passed in and reused in the new `_constrain_range`. The reason is when non `SymInt` is provided:
* If it directly calls `sym_constrain_range`, the C++ version will be called which will be no-op.
* So in this case it calls `constrain_range_int` instead to be able to capture issue like user provides a input whose tensor's shape could be out of range during exporting, like the following for above code example:
```
...
inp = torch.tensor([10])
ep = torch._export.export(f, (inp,)) # immediately raise error
```

Differential Revision: [D46734204](https://our.internmc.facebook.com/intern/diff/D46734204)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/103346
Approved by: https://github.com/tugsbayasgalan
2023-06-16 14:55:40 +00:00
Nikita Vedeneev
056d92e2a0 sparse.mm backward: performance improvements (#94991)
`torch.sparse.mm` - faster and without syncs in "most" cases.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94991
Approved by: https://github.com/Skylion007, https://github.com/pearu, https://github.com/cpuhrsch
2023-06-12 20:57:29 +00:00
Richard Zou
74f10b9ea5 Switch most Python RAII guard usages to context manager (#102642)
There are some I can't easily switch due to reasons like:
- Dynamo modelling the guard
- BC concerns (for torch.autograd.set_multithreading_enabled)

Test Plan:
- existing tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/102642
Approved by: https://github.com/albanD
2023-06-01 16:28:37 +00:00
leslie-fang-intel
488a4303a5 Enable quantized_max_pool3d (#101654)
**Summary**
Enable `quantized_max_pool3d` kernel to fix the issue https://github.com/pytorch/pytorch/issues/101386.

**Test Plan**
```
clear && python -u -m pytest -s -v test_quantized_op.py -k test_max_pool3d
clear && python -u -m pytest -s -v test_quantized_op.py -k test_max_pool3d_nhwc
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/101654
Approved by: https://github.com/albanD, https://github.com/jgong5, https://github.com/mingfeima
2023-05-23 00:45:38 +00:00
Tugsbayasgalan Manlaibaatar
d4bf76c2a4 Persist torch.assert in aten graph (#100101)
This PR introduces a new operator called aten._assert_async.msg, which allows passing a tensor value and assertion message as inputs. As part of TorchDynamo, we're replacing the use of torch._assert with this new operator so that make_fx also knows how to handle assertions. This is subset of https://github.com/pytorch/pytorch/pull/98878, refer there for historic reviews.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/100101
Approved by: https://github.com/jansel
2023-04-28 07:31:43 +00:00
Justin Chu
6e3cdcad08 Fix flake8 lint errors - part 2 - manual fixes (#99799)
<!--
copilot:all
-->
### <samp>🤖 Generated by Copilot at 8aef78f</samp>

### Summary
📝🚀🛠️

<!--
1.  📝 for modifying the logging format and style
2.  🚀 for improving performance and avoiding unnecessary string creation
3.  🛠️ for fixing flake8 issues
-->
This pull request updates some logging calls to use old-style string formatting with `%s` placeholders instead of f-strings in `torch/_dynamo/logging.py`, `torch/_functorch/compilers.py`, and `torch/fx/passes/pass_manager.py` as part of a logging standardization effort. It also adds a `# noqa: F404` comment to the `import __future__` statement in `torch/overrides.py` to fix a flake8 warning.

> _`log` uses old style_
> _formatting strings with `%s`_
> _logging is faster_

### Walkthrough
*  Standardize logging format and style to use old-style string formatting with `%s` placeholders instead of f-string syntax for performance and consistency ([link](https://github.com/pytorch/pytorch/pull/99799/files?diff=unified&w=0#diff-18807f7fd187b8bc8e69e93722566195b36d5bf269099b415a6f90b552228d6bL55-R55), [link](https://github.com/pytorch/pytorch/pull/99799/files?diff=unified&w=0#diff-fae8a66564055743ec031edb87eb22edeebf7fdebef9d21660d5e6a6252e5222L370-R373), [link](https://github.com/pytorch/pytorch/pull/99799/files?diff=unified&w=0#diff-5f3e37ded032f24e247dcf4a3be4b73ea0cf21382e342631742e5a04550202e1L72-R72))
*  Suppress flake8 warning for `import __future__` statement in `torch/overrides.py` with `# noqa: F404` comment ([link](https://github.com/pytorch/pytorch/pull/99799/files?diff=unified&w=0#diff-4f601fe7f31e875ee4354882c0bb490bc35e51d3d413d058cc5fda3be8ca9f15L23-R23))

Pull Request resolved: https://github.com/pytorch/pytorch/pull/99799
Approved by: https://github.com/Skylion007
2023-04-24 06:03:26 +00:00
Guang Yang
c377a8590b Add nonzero_static() op to pytorch to unblock export (#97417)
Summary: Add new experimental python op (`torch.nonzero_static`) for export. There is NO cuda impl included in this PR

Example:

Say input tensor is `x = torch.tensor([[1, 0], [3, 2]])`

call regular `nonzero()` on x will give you a tensor `tensor([[0, 0], [1, 0], [1, 1])`
call `nonzero_static(x, size=4)` on x will give you a tensor `tensor([[0, 0], [1, 0], [1, 1], [fill_value, fill_value])` (padded)
call `nonzero_static(x, size=2)` on x will give you a tensor `tensor([[0, 0], [1, 0])` (truncated)

Test Plan:
**Unit Tests**
```
buck test @mode/dev-nosan //caffe2/test:test_dynamo -- 'caffe2/test:test_dynamo - test_export.py::ExportTests::test_export_with_nonzero_static' -- 'caffe2/test:test_dynamo - test_misc.py::MiscTests::test_nonzero_static'
```

**PT2 Export with `nonzero_static()`**
Example of `GraphModule` in the exported graph
```
def forward(self, x):
    arg0, = fx_pytree.tree_flatten_spec(([x], {}), self._in_spec)
    nonzero_static_default = torch.ops.aten.nonzero_static.default(arg0, size = 4);  arg0 = None
    return pytree.tree_unflatten([nonzero_static_default], self._out_spec)
```

Differential Revision: D44324808

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97417
Approved by: https://github.com/ezyang
2023-04-11 05:13:36 +00:00
BJ Hargrave
555ab310dc Add itemsize and nbytes properties to Tensor (#98322)
Adds properties for itemsize and nbytes to Tensor matching the properties in NumPy.

Fixes https://github.com/pytorch/pytorch/issues/12728

Pull Request resolved: https://github.com/pytorch/pytorch/pull/98322
Approved by: https://github.com/ezyang
2023-04-05 12:11:55 +00:00
Joel Schlosser
77e73b9b7a Refactor NT offsets metadata to be a Tensor (#96909)
It's tedious work, but somebody's gotta do it.

Benefits:
* Enable access to offsets metadata from Python via private API (for validation, etc.)
* Consistency with nested sizes / strides metadata
* Needed for SymInt-ifying offsets metadata
* more TBD

Bonus:
* Remove `_tensor` suffixes from metadata / getter names
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96909
Approved by: https://github.com/drisspg
2023-03-21 18:51:35 +00:00
Pearu Peterson
2abcafcfd8 Add masked_grad kw argument to to_dense (#96095)
As in the title.

The `masked_grad` kw argument is required for `to_dense` backward to distinguish the expected semantics of sparse tensors. `masked_grad=True` means that the `to_dense` backward will apply a mask to the returned gradient where the mask is defined by the input indices. The default semantics implies `masked_grad==True` for BC but see the [comment](https://github.com/pytorch/pytorch/pull/96095/files#diff-d4df180433a09071e891d552426911c227b30ae9b8a8e56da31046e7ecb1afbeR501-R513) in `to_dense_backward`.

As a consequence, existing code that is run through autograd engine must replace `.to_dense()` calls with `.to_dense(masked_grad=False)`. For example,
```python
torch.autograd.gradcheck(lambda x: torch.sum(x, [0]).to_dense())
torch.autograd.gradcheck(lambda x: torch.sparse.sum(x, [0]).to_dense())
```
(recall, gradcheck has `masked=False` as default) must be updated to
```python
torch.autograd.gradcheck(lambda x: torch.sum(x, [0]).to_dense(masked_grad=False))
torch.autograd.gradcheck(lambda x: torch.sparse.sum(x, [0]).to_dense(masked_grad=True), masked=True)
```

Fixes https://github.com/pytorch/pytorch/issues/95550

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96095
Approved by: https://github.com/cpuhrsch
2023-03-16 21:38:11 +00:00
Edward Z. Yang
ce950b412f Reland "Add torch.empty_permuted (#95069)" (#95208)
This reverts commit 92e03cd583.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/95208
Approved by: https://github.com/albanD
2023-02-21 18:02:48 +00:00
PyTorch MergeBot
92e03cd583 Revert "Add torch.empty_permuted (#95069)"
This reverts commit bedeb1f014.

Reverted https://github.com/pytorch/pytorch/pull/95069 on behalf of https://github.com/jeanschmidt due to Breaking internal builds. More in https://fburl.com/phabricator/ztrxrroq
2023-02-21 12:05:20 +00:00
Edward Z. Yang
bedeb1f014 Add torch.empty_permuted (#95069)
torch.empty_permuted is a generalized version of torch.empty(memory_format=...), where you can pass an arbitrary physical layout as a tuple of dims to allow you to setup dense, non-overlapping tensors with non-standard memory format. Check the docblock for a full description of semantics.

The initial motivation for this PR is with guard-less unbacked SymInts. Traditionally, the way we allocate dense tensors with arbitrary layout is with `empty_strided`. However, `empty_strided` does not know that the given strides are actually contiguous, and must test this manually to find out if it is the case. With `empty_permuted`, this is known statically to be the case and helps us skip some 0/1 guards.

However, I also think torch.empty_permuted is a useful API in its own right. It is technically possible to simulate this with an empty and a permute; however, there are some downsides:

* The manual incant is tricky to work out. To allocate an NHWC tensor, the invocation is `torch.empty(N, H, W, C).permute(0, 3, 1, 2)`; the permute call has to take NHWC to NCHW, and is the *inverse* of the permutation people are typically thinking of when they talk about NHWC (0, 2, 3, 1). Instead, torch.empty_permuted lets you say `torch.empty_permuted((N, C, H, W), (0, 2, 3, 1))`, letting you provide the intuitive permutation. It can be literally be read off as NHWC if you assign N=0, C=1, H=2, W=3.
* An empty(requires_grad=True).permute() is no longer a leaf tensor. You can force it to be a leaf with a detach(), but it is more straightforward and less error prone to allow directly allocating a tensor with the correct permutation.

It is also technically possible to simulate this with empty_strided. However, this requires the user to manually compute the contiguous output strides and is bad from a reduction of guards perspective. For what it's worth, this is one of the more common uses of as_strided in the wild, and it would be nice to get rid of it.

A nice enhancement of this feature would be to accept `physical_layout` anywhere `memory_format` is accepted. However, this would be a pretty involved change, so I'm doing the easy thing instead.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/95069
Approved by: https://github.com/malfet, https://github.com/ngimel, https://github.com/albanD, https://github.com/dagitses
2023-02-20 00:23:10 +00:00
Mikayla Gawarecki
c7c7238976 Fix bug in unsqueeze_nested stride calculation (#88688)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88688
Approved by: https://github.com/cpuhrsch
2023-02-10 17:00:04 +00:00
Aaron Gokaslan
1e2d82b8e4 [BE] Merge isinstance calls together (#94419)
Simplify and speeds up isinstance calls by checking for multiple types at the same time.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94419
Approved by: https://github.com/ezyang
2023-02-09 00:47:26 +00:00
albanD
496c0a207b Make segment_reduce properly private. (#93166)
I am attempting not to change the aten function to reduce the amount of BC issues on the torchscript side.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/93166
Approved by: https://github.com/ngimel
2023-02-06 18:32:23 +00:00
Ivan Yashchuk
fba13d94a1 Remove deprecated torch.symeig (#70988)
The time has come to remove deprecated linear algebra related functions. This PR removes `torch.symeig`.

- [x] XLA PR: https://github.com/pytorch/xla/pull/4498

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70988
Approved by: https://github.com/lezcano, https://github.com/kit1980, https://github.com/malfet
2023-01-31 11:59:11 +00:00
PyTorch MergeBot
acdd462b1a Revert "Remove deprecated torch.symeig (#70988)"
This reverts commit d70ed68162.

Reverted https://github.com/pytorch/pytorch/pull/70988 on behalf of https://github.com/kit1980 due to Failing XLA tests, forward fix unsuccessful
2023-01-24 19:03:40 +00:00
Michael Gschwind
7265f60ad0 Regularize mask handling for attn_mask and key_padding_mask (#92733)
Summary:
Regularize mask handling for attn_mask and key_padding_mask
* Update documentation to remove reference to byte masks (which were deprecated long ago)
* Introduce check and warn about deprecation if attn_mask and key_padding_mask types mismatch
* Convert all masks to float before combining
* Combine by adding

Test Plan: sandcastle & github CI

Differential Revision: D42653215

Pull Request resolved: https://github.com/pytorch/pytorch/pull/92733
Approved by: https://github.com/ngimel, https://github.com/drisspg
2023-01-24 14:12:05 +00:00
Ivan Yashchuk
d70ed68162 Remove deprecated torch.symeig (#70988)
The time has come to remove deprecated linear algebra related functions. This PR removes `torch.symeig`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70988
Approved by: https://github.com/lezcano, https://github.com/kit1980
2023-01-23 22:51:40 +00:00
Driss Guessous
df14650f0b [SDPA] Update SDPA API and make function Public (#92189)
# Summary
In preparation for pt 2.0 launch this PR updates SDPA's API and makes the function a nn.funcitonal public function.

## Changes
### API
Previously the the function signature was:
`scaled_dot_product_attention(query, key, value, attn_mask=None, need_attn_weights=False, dropout_p=0.0, is_causal=False) -> (Tensor, Tensor)`
Updated signature:
`scaled_dot_product_attention(query, key, value, attn_mask=None, dropout_p=0.0, is_causal=False) -> Tensor`

This PR removes the need_attn_weights optional boolean variable and updates the return type to a singular tensor.

#### Reasoning:
The main goal of this function is to provide an easy interface for users to call into fused attention kernels e.g.  (FlashAttention). The fused kernels do not currently support arbitrary attn_mask or dropout but there is a PR to mem-efficient attention to enable these. We want to have the API surface ready for when the backing kernels get updated.

The fused kernels save on memory usage by not materializing the weights and it is unlikely that a fast fused implementation will enable this feature so we are removing.

Discussed with folks at FAIR/Xformers and +1 this API change.

#### Make function Public
In preparation for the pt 2.0 launch we make the function public to start to generate user feedback

Pull Request resolved: https://github.com/pytorch/pytorch/pull/92189
Approved by: https://github.com/cpuhrsch
2023-01-23 20:50:46 +00:00
kshitij12345
d2728bb6a7 [functorch] add is_any_true (#92686)
Adds `is_any_true` similar to `is_all_true` (https://github.com/pytorch/pytorch/pull/89097/files)

This would unblock https://github.com/pytorch/functorch/issues/1049
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92686
Approved by: https://github.com/Chillee
2023-01-21 03:36:05 +00:00
Edward Z. Yang
5c6f5439b7 Implement SymBool (#92149)
We have known for a while that we should in principle support SymBool as a separate concept from SymInt and SymFloat ( in particular, every distinct numeric type should get its own API). However, recent work with unbacked SymInts in, e.g., https://github.com/pytorch/pytorch/pull/90985 have made this a priority to implement. The essential problem is that our logic for computing the contiguity of tensors performs branches on the passed in input sizes, and this causes us to require guards when constructing tensors from unbacked SymInts. Morally, this should not be a big deal because, we only really care about the regular (non-channels-last) contiguity of the tensor, which should be guaranteed since most people aren't calling `empty_strided` on the tensor, however, because we store a bool (not a SymBool, prior to this PR it doesn't exist) on TensorImpl, we are forced to *immediately* compute these values, even if the value ends up not being used at all. In particular, even when a user allocates a contiguous tensor, we still must compute channels-last contiguity (as some contiguous tensors are also channels-last contiguous, but others are not.)

This PR implements SymBool, and makes TensorImpl use SymBool to store the contiguity information in ExtraMeta. There are a number of knock on effects, which I now discuss below.

* I introduce a new C++ type SymBool, analogous to SymInt and SymFloat. This type supports logical and, logical or and logical negation. I support the bitwise operations on this class (but not the conventional logic operators) to make it clear that logical operations on SymBool are NOT short-circuiting. I also, for now, do NOT support implicit conversion of SymBool to bool (creating a guard in this case). This does matter too much in practice, as in this PR I did not modify the equality operations (e.g., `==` on SymInt) to return SymBool, so all preexisting implicit guards did not need to be changed. I also introduced symbolic comparison functions `sym_eq`, etc. on SymInt to make it possible to create SymBool. The current implementation of comparison functions makes it unfortunately easy to accidentally introduce guards when you do not mean to (as both `s0 == s1` and `s0.sym_eq(s1)` are valid spellings of equality operation); in the short term, I intend to prevent excess guarding in this situation by unit testing; in the long term making the equality operators return SymBool is probably the correct fix.
* ~~I modify TensorImpl to store SymBool for the `is_contiguous` fields and friends on `ExtraMeta`. In practice, this essentially meant reverting most of the changes from https://github.com/pytorch/pytorch/pull/85936 . In particular, the fields on ExtraMeta are no longer strongly typed; at the time I was particularly concerned about the giant lambda I was using as the setter getting a desynchronized argument order, but now that I have individual setters for each field the only "big list" of boolean arguments is in the constructor of ExtraMeta, which seems like an acceptable risk. The semantics of TensorImpl are now that we guard only when you actually attempt to access the contiguity of the tensor via, e.g., `is_contiguous`. By in large, the contiguity calculation in the implementations now needs to be duplicated (as the boolean version can short circuit, but the SymBool version cannot); you should carefully review the duplicate new implementations. I typically use the `identity` template to disambiguate which version of the function I need, and rely on overloading to allow for implementation sharing. The changes to the `compute_` functions are particularly interesting; for most of the functions, I preserved their original non-symbolic implementation, and then introduce a new symbolic implementation that is branch-less (making use of our new SymBool operations). However, `compute_non_overlapping_and_dense` is special, see next bullet.~~ This appears to cause performance problems, so I am leaving this to an update PR.
* (Update: the Python side pieces for this are still in this PR, but they are not wired up until later PRs.) While the contiguity calculations are relatively easy to write in a branch-free way, `compute_non_overlapping_and_dense` is not: it involves a sort on the strides. While in principle we can still make it go through by using a data oblivious sorting network, this seems like too much complication for a field that is likely never used (because typically, it will be obvious that a tensor is non overlapping and dense, because the tensor is contiguous.) So we take a different approach: instead of trying to trace through the logic computation of non-overlapping and dense, we instead introduce a new opaque operator IsNonOverlappingAndDenseIndicator which represents all of the compute that would have been done here. This function returns an integer 0 if `is_non_overlapping_and_dense` would have returned `False`, and an integer 1 otherwise, for technical reasons (Sympy does not easily allow defining custom functions that return booleans). The function itself only knows how to evaluate itself if all of its arguments are integers; otherwise it is left unevaluated. This means we can always guard on it (as `size_hint` will always be able to evaluate through it), but otherwise its insides are left a black box. We typically do NOT expect this custom function to show up in actual boolean expressions, because we will typically shortcut it due to the tensor being contiguous. It's possible we should apply this treatment to all of the other `compute_` operations, more investigation necessary. As a technical note, because this operator takes a pair of a list of SymInts, we need to support converting `ArrayRef<SymNode>` to Python, and I also unpack the pair of lists into a single list because I don't know if Sympy operations can actually validly take lists of Sympy expressions as inputs. See for example `_make_node_sizes_strides`
* On the Python side, we also introduce a SymBool class, and update SymNode to track bool as a valid pytype. There is some subtlety here: bool is a subclass of int, so one has to be careful about `isinstance` checks (in fact, in most cases I replaced `isinstance(x, int)` with `type(x) is int` for expressly this reason.) Additionally, unlike, C++, I do NOT define bitwise inverse on SymBool, because it does not do the correct thing when run on booleans, e.g., `~True` is `-2`. (For that matter, they don't do the right thing in C++ either, but at least in principle the compiler can warn you about it with `-Wbool-operation`, and so the rule is simple in C++; only use logical operations if the types are statically known to be SymBool). Alas, logical negation is not overrideable, so we have to introduce `sym_not` which must be used in place of `not` whenever a SymBool can turn up. To avoid confusion with `__not__` which may imply that `operators.__not__` might be acceptable to use (it isn't), our magic method is called `__sym_not__`. The other bitwise operators `&` and `|` do the right thing with booleans and are acceptable to use.
* There is some annoyance working with booleans in Sympy. Unlike int and float, booleans live in their own algebra and they support less operations than regular numbers. In particular, `sympy.expand` does not work on them. To get around this, I introduce `safe_expand` which only calls expand on operations which are known to be expandable.

TODO: this PR appears to greatly regress performance of symbolic reasoning. In particular, `python test/functorch/test_aotdispatch.py -k max_pool2d` performs really poorly with these changes. Need to investigate.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92149
Approved by: https://github.com/albanD, https://github.com/Skylion007
2023-01-21 02:21:56 +00:00
Kurt Mohler
647b8f8e3e Add TORCH_CHECK_TENSOR_ALL (#89097)
`TORCH_CHECK_TENSOR_ALL(cond, ...)` is a wrapper around `TORCH_CHECK` which allows the condition argument to be a tensor, batched or unbatched. `cond` can be a boolean tensor of any size. If any element is False, or if `cond.numel() == 0`, then `TORCH_CHECK_TENSOR_ALL` raises an error

Part of #72948
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89097
Approved by: https://github.com/zou3519
2023-01-19 21:04:09 +00:00
Edward Z. Yang
6420fecdc4 Introduce sym_min and sym_max (#92107)
It turns out our old max/min implementation didn't do anything, because `__max__` and `__min__` are not actually magic methods in Python. So I give 'em the `sym_` treatment, similar to the other non-overrideable builtins.

NB: I would like to use `sym_max` when computing contiguous strides but this appears to make `python test/functorch/test_aotdispatch.py -v -k test_aot_autograd_symbolic_exhaustive_nn_functional_max_pool2d_cpu_float32` run extremely slowly. Needs investigating.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/92107
Approved by: https://github.com/albanD, https://github.com/voznesenskym, https://github.com/Skylion007
2023-01-18 20:57:27 +00:00
yanbing-j
94a7c01159 Enable oneDNN implementation in LSTM op (#91158)
### Description
This PR is to enable oneDNN implementation in LSTM op to improve the performance of it. Both FP32 and BF16 are supported.

### Performance improvement
In CPX 28C, with setting iomp and jemalloc.
We choose 8 LSTM input options (including input_size, hidden_size, num_layers, bidirectional, bias, batch_first, dropout, batch_size, seq_len), and the final option is a real input from train-clean-100 in LibriSpeech dataset. The performance improvements are shown in the following figures. We can see that LSTM with oneDNN implementation can perform better than the original.

In single socket:
![image](https://user-images.githubusercontent.com/61222868/211182994-833debec-518a-4b35-8504-6b0fadb17930.png)

![image](https://user-images.githubusercontent.com/61222868/211183012-31e1253f-2c60-4c92-a656-c239a971b453.png)

In single core:
![image](https://user-images.githubusercontent.com/61222868/211183017-186e5d47-cb9a-4c1e-914f-fa718e769f1c.png)

![image](https://user-images.githubusercontent.com/61222868/211183022-53266857-5a9e-4a95-b300-33fa34811d08.png)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91158
Approved by: https://github.com/jgong5, https://github.com/malfet
2023-01-18 04:41:18 +00:00
Edward Z. Yang
333540a458 Reland "Add torch.utils.device_mode" (#91796)
Original PR https://github.com/pytorch/pytorch/pull/91525

Signed-off-by: Edward Z. Yang <ezyangfb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91796
Approved by: https://github.com/albanD
2023-01-09 20:57:12 +00:00
PyTorch MergeBot
9b415240d4 Revert "Reland "Add torch.utils.device_mode" (#91796)"
This reverts commit 81b5eff3c3.

Reverted https://github.com/pytorch/pytorch/pull/91796 on behalf of https://github.com/huydhn due to This breaks trunk with the following failed test https://hud.pytorch.org/failure/test_jit_save%2CTestTracer
2023-01-09 04:45:47 +00:00
Edward Z. Yang
81b5eff3c3 Reland "Add torch.utils.device_mode" (#91796)
Original PR https://github.com/pytorch/pytorch/pull/91525

Signed-off-by: Edward Z. Yang <ezyangfb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91796
Approved by: https://github.com/albanD
2023-01-08 03:44:56 +00:00
PyTorch MergeBot
f571ae4fdb Revert "Make torch.device usable as a context manager (#91525)"
This reverts commit 619d52a5d2.

Reverted https://github.com/pytorch/pytorch/pull/91525 on behalf of https://github.com/mehtanirav due to Internal breakages
2023-01-05 21:34:50 +00:00
Edward Z. Yang
619d52a5d2 Make torch.device usable as a context manager (#91525)
Fixes https://github.com/pytorch/pytorch/issues/82296
Fixes https://github.com/pytorch/pytorch/issues/27878
Fixes https://github.com/pytorch/pytorch/issues/260

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91525
Approved by: https://github.com/albanD
2023-01-04 01:32:00 +00:00
Kurt Mohler
08a47549af Rename Tensor._storage to Tensor.untyped_storage and update docs (#91414)
Fixes #89224

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91414
Approved by: https://github.com/ezyang
2022-12-28 19:21:34 +00:00
Joel Schlosser
8b55b86dbd Move sym_int and sym_float alongside SymInt / SymFloat in base torch package (#91317)
This PR moves the definitions for:
* `sym_int`
* `sym_ceil` (used only for `sym_int`)
* `sym_floor` (used only for `sym_int`)
* `sym_float`

from `torch/fx/experimental/symbolic_shapes.py` to `torch/__init__.py`, where `SymInt` and `SymFloat` are already defined.

This removes the need for several in-line imports, and enables proper JIT script gating for #91318. I'm very open to doing this in a better way!

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91317
Approved by: https://github.com/ezyang, https://github.com/anijain2305
2022-12-28 16:08:16 +00:00
Richard Zou
fb2e1878cb [torch.func] alias torch.func.vmap as torch.vmap (#91026)
This PR also redirects torch.vmap to torch.func.vmap instead of the old
vmap prototype.

Test Plan:
- tests
- view docs preview
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91026
Approved by: https://github.com/albanD, https://github.com/samdow
2022-12-21 20:51:49 +00:00
Michael Gschwind
512ec181ec Introduce causal mask (#90508)
Summary: Introduce causal mask

This PR introduces a causal mask option _causal_mask (as well as causal mask detection if attn_mask is provided), since current custom kernels do not support arbitrary masks.

Test Plan: sandcastle & github ci/cd

Differential Revision: D41723137

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90508
Approved by: https://github.com/albanD
2022-12-16 21:39:42 +00:00
Sergii Dymchenko
f51f6aa387 Fix non-existing parameters in docstrings (#90505)
Continuation after https://github.com/pytorch/pytorch/pull/90163.

Here is a script I used to find all the non-existing arguments in the docstrings (the script can give false positives in presence of *args/**kwargs or decorators):

_Edit:_
I've realized that the indentation is wrong for the last `break` in the script, so the script only gives output for a function if the first docstring argument is wrong. I'll create a separate PR if I find more issues with corrected script.

``` python
import ast
import os
import docstring_parser

for root, dirs, files in os.walk('.'):
    for name in files:
        if root.startswith("./.git/") or root.startswith("./third_party/"):
            continue
        if name.endswith(".py"):
            full_name = os.path.join(root, name)
            with open(full_name, "r") as source:
                tree = ast.parse(source.read())
                for node in ast.walk(tree):
                    if isinstance(node, ast.FunctionDef):
                        all_node_args = node.args.args
                        if node.args.vararg is not None:
                            all_node_args.append(node.args.vararg)
                        if node.args.kwarg is not None:
                            all_node_args.append(node.args.kwarg)
                        if node.args.posonlyargs is not None:
                            all_node_args.extend(node.args.posonlyargs)
                        if node.args.kwonlyargs is not None:
                            all_node_args.extend(node.args.kwonlyargs)
                        args = [a.arg for a in all_node_args]
                        docstring = docstring_parser.parse(ast.get_docstring(node))
                        doc_args = [a.arg_name for a in docstring.params]
                        clean_doc_args = []
                        for a in doc_args:
                            clean_a = ""
                            for c in a.split()[0]:
                                if c.isalnum() or c == '_':
                                    clean_a += c
                            if clean_a:
                                clean_doc_args.append(clean_a)
                        doc_args = clean_doc_args
                        for a in doc_args:
                            if a not in args:
                                print(full_name, node.lineno, args, doc_args)
                            break

```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90505
Approved by: https://github.com/malfet, https://github.com/ZainRizvi
2022-12-09 21:43:09 +00:00
Nikita Shulga
768bd3fb4a Add torch.compile implementation (#89607)
`torch.compile` can be used either as decorator or to optimize model directly, for example:
```
@torch.compile
def foo(x):
  return torch.sin(x) + x.max()
```
or
```
mod = torch.nn.ReLU()
optimized_mod = torch.compile(mod, mode="max-autotune")
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89607
Approved by: https://github.com/soumith
2022-12-01 20:17:52 +00:00
albanD
c79489c8e6 Expose to python the backward AD view_func (#89586)
This will be useful for other systems (AOTAutograd) that want to replay autograd views.

FYI @bdhirsh
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89586
Approved by: https://github.com/soulitzer
2022-11-24 03:39:58 +00:00
Jane Xu
8695f0cced Rectify native_batch_norm schema by splitting it into two legit schemas (#88697)
Using the same repro from the issue (but with BatchNorm2D)

Rectifies native_batch_norm schema by splitting the schema into 2:
1. one will have NON-optional alias-able running_mean and running_var inputs
2. the other will just not have those parameters at all (no_stats variation)

**Calling for name suggestions!**

## test plan
I've added tests in test_functionalization.py as well as an entry in common_method_invocations.py for `native_batch_norm_legit`
CI should pass.

## next steps
Because of bc/fc reasons, we reroute native_batch_norm to call our new schemas ONLY through the python dispatcher, but in 2 weeks or so, we should make `native_batch_norm_legit` the official batch_norm.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88697
Approved by: https://github.com/albanD
2022-11-23 23:23:17 +00:00
Kurt Mohler
ee28b865ee Deprecate TypedStorage, its derived classes, and all of their public methods (#85303)
Part of #85302

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85303
Approved by: https://github.com/ezyang
2022-11-08 18:11:01 +00:00
Kazuaki Ishizaki
2ddefbdc3c Fix typos used in documents under torch directory (#88300)
This PR fixes typos, in comments of Python files, that are found from a search box at https://pytorch.org/docs/master/search.html

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88300
Approved by: https://github.com/lezcano
2022-11-02 09:38:13 +00:00
Kazuaki Ishizaki
0fab8df0b6 Fix incorrect param names in get_testing_overrides (#87625)
This PR fixes incorrect parameter names for lambda in `get_testing_overrides()`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87625
Approved by: https://github.com/kit1980
2022-10-25 02:49:14 +00:00
samdow
169ec120ef [Modes] refactor modes to only use a stack in cpp (#86458)
Refactors the mode code to only have the C++ mode stack and not the "C++ mode" like we originally had. This also simplifies the mode logic in a number of places
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86458
Approved by: https://github.com/zou3519
2022-10-21 19:18:23 +00:00
Antoni Viros i Martin
cdbffa7f66 🦊 [AI Accelerators] Consolidate native_layer_norm for nested tensor (#86295)
Summary: In order to make the layer normalization implementation for nested tensors public, it needs to be generalized to accept a normalized_shape argument instead of assuming it to be the last dimension of the nested_tensor. This commit does that, as well as adding extra unit tests to ensure the implementation is correct.

Test Plan:
All unit tests designed to test different ways of using the function work:

`buck test //caffe2/test:nested -- test_layer_norm`

Differential Revision: D40105207

Pull Request resolved: https://github.com/pytorch/pytorch/pull/86295
Approved by: https://github.com/drisspg
2022-10-06 13:10:25 +00:00
Ivan Yashchuk
b00a5359f7 Add a way to skip lowering to nvprims (#85811)
This PR adds `skip_ops` argument to `TorchRefsNvfuserCapabilityMode` and `NvfuserPrimsMode` which is an iterable of function names to be skipped in the translation to nvprims process.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85811
Approved by: https://github.com/mruberry, https://github.com/jjsjann123
2022-09-30 12:01:45 +00:00
Mikayla Gawarecki
afaee00fec Add python nested_tensor and as_nested_tensor constructors in torch.nested (#85593)
Remove `torch.nested_tensor` which has erroneous behavior wrt gradients (could be either leaf or not leaf). Introduce `torch.nested.nested_tensor` and `torch.nested.as_nested_tensor` in the vein of `torch.tensor` and `torch.as_tensor`. Done in nested `__init__.py` for now but can move to pybind in future (when we want to load from numpy/nested lists ).

Discussed offline with @cpuhrsch and pybind constructor (https://github.com/pytorch/pytorch/pull/85536) was more gnarly than expected, so we can move to that when we do need loading from numpy etc.

Differential Revision: [D39806622](https://our.internmc.facebook.com/intern/diff/D39806622)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85593
Approved by: https://github.com/drisspg, https://github.com/cpuhrsch
2022-09-28 20:15:02 +00:00
Edward Z. Yang
24a268143d Directly access has_symbolic_sizes_strides, avoid expensive test (#85754)
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85754
Approved by: https://github.com/albanD
2022-09-28 00:26:11 +00:00
samdow
a106611055 [Modes] fix handle_torch_funcion logic (#85707)
Fixes #85696. I didn't totally get what was happening in handle_torch_function and so was trying to recreate the original logic instead of follow what the C++ is doing. This fixes that
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85707
Approved by: https://github.com/ezyang
2022-09-27 18:35:51 +00:00
samdow
18d8c548f4 [Modes] remove enable and rewrite mode stack (squashed) (#84774)
Based on @ezyang's suggestion, mode stack now has "one true mode" which is the _only_ mode that can ever be active at the C++ level. That mode's torch dispatch is just to take the top mode in the stack, reenable itself (if we aren't at the end of the mode stack), and run the top mode's torch_{dispatch|function}

This maintains that in the middle of a mode's torch dispatch, the mode itself will not be active. It changes the function the user has to call to see what the current mode is (no longer queries the C++, it's python only) but allows the user to also see the entire mode stack easily

Removes `enable_torch_dispatch_mode` and `.restore()` since neither makes sense in this new setup

### Background
Why do we want this? Well, a pretty common pattern that was coming up was that users had to do something like

```python
## PRE-PR UX
def f(mode):
  with mode.restore():  # user needs to understand this restore thing?
    ...

with Mode() as m:
  pass
f(m)
```

Many users were getting error from forgetting to call `.restore` or from forgetting to add the (tbh weird) "mode instantiation"  step where they use the mode as a context manager with an empty body. Really, they wanted to treat modes like context managers and just write
```python
## FROM FEEDBACK, USER DESIRED CODE. POSSIBLE POST-PR
def f(mode):
  with mode:
    ...
f(Mode())
```

** Technical Details **
With the old mode stack, we basically had a linked list so the mode itself could only be used once and had a fixed parent. In this new design, the mode stack is just a python list that we're pushing to and popping from. There's only one mode that's ever active at the C++ level and it runs the next mode in the Python list. The modes don't have state on them anymore
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84774
Approved by: https://github.com/ezyang, https://github.com/zou3519
2022-09-27 01:04:35 +00:00
Ivan Yashchuk
539076e2c2 Remove deprecated torch.lstsq (#70980)
The time has come to remove deprecated linear algebra related functions. This PR removes `torch.lstsq`.

There's a note in `tools/codegen/gen.py` about `lstsq` schema in `native_function.yaml` that I will not remove:
87139d8532/tools/codegen/gen.py (L734-L770)

cc @jianyuh @nikitaved @pearu @mruberry @walterddr @IvanYashchuk @xwang233 @Lezcano
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70980
Approved by: https://github.com/lezcano, https://github.com/kit1980
2022-09-23 00:16:55 +00:00
Ivan Yashchuk
bcf93181a0 Remove deprecated torch.matrix_rank (#70981)
The time has come to remove deprecated linear algebra related functions. This PR removes `torch.matrix_rank`.

cc @jianyuh @nikitaved @pearu @mruberry @walterddr @IvanYashchuk @xwang233 @Lezcano
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70981
Approved by: https://github.com/lezcano, https://github.com/kit1980
2022-09-22 17:40:46 +00:00
Mikayla Gawarecki
77f1f98479 Re-introduce torch.Tensor.to_padded_tensor (#85293)
Differential Revision: [D39629004](https://our.internmc.facebook.com/intern/diff/D39629004)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85293
Approved by: https://github.com/cpuhrsch
2022-09-21 18:45:56 +00:00
Khushi Agrawal
2386cd2945 [reland] [numpy] add torch.concatenate, alias of torch.cat (#85073)
Previous PR: #82946

Fixes #81161

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85073
Approved by: https://github.com/mruberry
2022-09-15 19:34:44 +00:00
PyTorch MergeBot
fa7bf3e2dc Revert "[numpy] add torch.concatenate, alias of torch.cat (#82946)"
This reverts commit 270e5e519d.

Reverted https://github.com/pytorch/pytorch/pull/82946 on behalf of https://github.com/malfet due to Broke M1 tests, see 270e5e519d
2022-09-14 21:32:11 +00:00
Khushi Agrawal
270e5e519d [numpy] add torch.concatenate, alias of torch.cat (#82946)
As per the title. Fixes: #81161

- [x] add ErrorInputs
- ~[ ] dtype argument?~
- ~[ ] casting argument?~

As discussed offline with @kshitij12345, we can currently ignore `dtype` and `casting` arguments.

cc: @kshitij12345!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82946
Approved by: https://github.com/mruberry
2022-09-14 19:28:43 +00:00
Mikayla Gawarecki
e217b30b0f Add torch.nested namespace (#84102)
First step towards #83775
- only `to_padded_tensor` is moved to the nested namespace for now
- following the schema used for `special`, `fft`, `linalg` and other namespaces, nested functions are registered in native_functions.yaml as `nested_{function_name}` and are bound to the desired Python name in
`torch/nested/__init__.py`, and the desired C++ name in `torch/csrc/api/include/torch/nested.h`.

~~**Question**: should we keep the documentation for `Tensor.to_padded_tensor` or can this deleted since it is shared by `torch.nested.to_padded_tensor`?~~

[generated nested docs](https://docs-preview.pytorch.org/84102/nested.html?highlight=nested#module-torch.nested)

Differential Revision: [D39361148](https://our.internmc.facebook.com/intern/diff/D39361148)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84102
Approved by: https://github.com/drisspg
2022-09-12 16:31:05 +00:00
Ivan Yashchuk
01c54ad6de Remove deprecated torch.eig (#70982)
The time has come to remove deprecated linear algebra related functions. This PR removes `torch.eig`.

cc @jianyuh @nikitaved @pearu @mruberry @walterddr @IvanYashchuk @xwang233 @Lezcano
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70982
Approved by: https://github.com/Lezcano, https://github.com/malfet
2022-09-09 21:31:57 +00:00
samdow
7532d5b125 [Modes] remove inner constructor kwarg (#83925)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83925
Approved by: https://github.com/ezyang, https://github.com/zou3519
2022-08-31 00:05:56 +00:00
Michael Gschwind
cf2c94e6de NestedTensor Softmax (#83435)
Summary: Simple mask compute and softmax

Test Plan: unit test

Differential Revision: D38711915

Pull Request resolved: https://github.com/pytorch/pytorch/pull/83435
Approved by: https://github.com/erichan1, https://github.com/huydhn
2022-08-17 21:57:42 +00:00
PyTorch MergeBot
0061e67629 Revert "NestedTensor Softmax (#83435)"
This reverts commit d7fc76a1ed.

Reverted https://github.com/pytorch/pytorch/pull/83435 on behalf of https://github.com/huydhn due to This is suspected to break functorch tests in trunk d7fc76a1ed
2022-08-17 16:19:38 +00:00
Michael Gschwind
d7fc76a1ed NestedTensor Softmax (#83435)
Summary: Simple mask compute and softmax

Test Plan: unit test

Differential Revision: D38711915

Pull Request resolved: https://github.com/pytorch/pytorch/pull/83435
Approved by: https://github.com/erichan1
2022-08-17 04:19:23 +00:00
soulitzer
31fad3926a Add option to run anomaly mode without nan checking (#83481)
Fixes https://github.com/pytorch/pytorch/issues/83117

Pull Request resolved: https://github.com/pytorch/pytorch/pull/83481
Approved by: https://github.com/albanD
2022-08-16 22:56:23 +00:00
Jeff Daily
d52d2bd5a9 [ROCm] MIOpen fused convolution relu (#82002)
Adds MIOpen fused convolution relu for fp32 and contiguous memory format.  Adds fallbacks for conv + z + bias + relu, fp16, and channels last until MIOpen adds these features.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82002
Approved by: https://github.com/ngimel, https://github.com/malfet
2022-08-16 20:49:33 +00:00
albanD
e4ea751810 Fix hash for Tensor subclasses (#83174)
Fixes https://github.com/pytorch/pytorch/issues/82832
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83174
Approved by: https://github.com/ezyang
2022-08-10 19:23:56 +00:00
Fabio Rocha
fd84c458f4 Add torch.unflatten and improve its docs (#81399)
unflatten now has a free function version in torch.flatten in addition to
    the method in torch.Tensor.flatten.

    Updated docs to reflect this and polished them a little.
    For consistency, changed the signature of the int version of unflatten in
    native_functions.yaml.

    Some override tests were failing because unflatten has unusual
    characteristics in terms of the .int and .Dimname versions having
    different number of arguments so this required some changes
    to test/test_override.py

    Removed support for using mix of integer and string arguments
    when specifying dimensions in unflatten.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81399
Approved by: https://github.com/Lezcano, https://github.com/ngimel
2022-07-29 15:02:42 +00:00
samdow
2ac24675cc get rid of push_torch_{dispatch, function}_mode (#78215)
Currently we have 2 ways of doing the same thing for torch dispatch and function modes:
`with push_torch_dispatch_mode(X)` or `with X.push(...)`
is now the equivalent of doing
`with X()`

This removes the first API (which is older and private so we don't need to go through a deprecation cycle)

There is some risk here that this might land race with a PR that uses the old API but in general it seems like most are using the `with X()` API or `enable_torch_dispatch_mode(X())` which isn't getting removed.

EDIT: left the `with X.push(...)` API since there were ~3 land races with that over the past day or so. But made it give a warning and ask users to use the other API
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78215
Approved by: https://github.com/ezyang
2022-07-22 18:56:37 +00:00
Edward Z. Yang
d4f065d261 Return mode object from __enter__ (#80998)
This makes `with Mode() as m:` work.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80998
Approved by: https://github.com/samdow
2022-07-12 23:22:26 +00:00
lezcano
e505796a2c [Array API] Add linalg.vecdot (#70542)
This PR adds the function `linalg.vecdot` specified by the [Array
API](https://data-apis.org/array-api/latest/API_specification/linear_algebra_functions.html#function-vecdot)

For the complex case, it chooses to implement \sum x_i y_i. See the
discussion in https://github.com/data-apis/array-api/issues/356

Edit. When it comes to testing, this function is not quite a binopt, nor a reduction opt. As such, we're this close to be able to get the extra testing, but we don't quite make it. Now, it's such a simple op that I think we'll make it without this.

Resolves https://github.com/pytorch/pytorch/issues/18027.

cc @mruberry @rgommers @pmeier @asmeurer @leofang @AnirudhDagar @asi1024 @emcastillo @kmaehashi
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70542
Approved by: https://github.com/IvanYashchuk, https://github.com/mruberry
2022-07-12 14:28:54 +00:00
PyTorch MergeBot
39f659c3ba Revert "[Array API] Add linalg.vecdot (#70542)"
This reverts commit 74208a9c68.

Reverted https://github.com/pytorch/pytorch/pull/70542 on behalf of https://github.com/malfet due to Broke CUDA-10.2 for vecdot_bfloat16, see 74208a9c68
2022-07-08 22:56:51 +00:00
lezcano
74208a9c68 [Array API] Add linalg.vecdot (#70542)
This PR adds the function `linalg.vecdot` specified by the [Array
API](https://data-apis.org/array-api/latest/API_specification/linear_algebra_functions.html#function-vecdot)

For the complex case, it chooses to implement \sum x_i y_i. See the
discussion in https://github.com/data-apis/array-api/issues/356

Edit. When it comes to testing, this function is not quite a binopt, nor a reduction opt. As such, we're this close to be able to get the extra testing, but we don't quite make it. Now, it's such a simple op that I think we'll make it without this.

Resolves https://github.com/pytorch/pytorch/issues/18027.

cc @mruberry @rgommers @pmeier @asmeurer @leofang @AnirudhDagar @asi1024 @emcastillo @kmaehashi
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70542
Approved by: https://github.com/IvanYashchuk, https://github.com/mruberry
2022-07-08 15:37:58 +00:00
Nikolay Korovaiko
8389ccbcd8 reinstate size and shape returning symints (#79560)
This PR redirects `size` and `.shape` to call `sym_sizes`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79560
Approved by: https://github.com/Chillee
2022-07-08 01:17:33 +00:00
lezcano
19f3d4d795 Expose linalg.solve_ex (#80073)
This prepares for making `linalg.inv_ex` just a call into this function
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80073
Approved by: https://github.com/IvanYashchuk, https://github.com/albanD
2022-07-01 16:09:23 +00:00
Allen Goodman
63ef2a03e5 torch.special.scaled_modified_bessel_k0 (#78900)
```Python
scaled_modified_bessel_k0(input, *, out=None) -> Tensor
```

Scaled modified Bessel function of the second kind of order $0$.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78900
Approved by: https://github.com/mruberry
2022-06-29 14:53:37 +00:00
Nikolay Korovaiko
7e34edf12d adding sym_size override (#80357)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/80357
Approved by: https://github.com/ezyang
2022-06-29 00:53:45 +00:00
PyTorch MergeBot
602c38ff63 Revert "torch.special.gamma (#78904)"
This reverts commit f563f25efd.

Reverted https://github.com/pytorch/pytorch/pull/78904 on behalf of https://github.com/suo due to This PR appears to have broken mac tests on master f563f25efd
2022-06-28 00:54:22 +00:00
Allen Goodman
ab8797d69b torch.special.spherical_bessel_j0 (#78912)
```Python
spherical_bessel_j0(input, *, out=None) -> Tensor
```

Spherical Bessel function of the first kind of order $0$.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78912
Approved by: https://github.com/mruberry
2022-06-27 20:14:46 +00:00
Allen Goodman
f563f25efd torch.special.gamma (#78904)
```Python
gamma(input, *, out=None) -> Tensor
```

Gamma function $\Gamma\left(\text{input}\right)$.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78904
Approved by: https://github.com/mruberry
2022-06-27 19:36:17 +00:00
Allen Goodman
b3ca3638be torch.special.scaled_modified_bessel_k1 (#78901)
```Python
scaled_modified_bessel_k1(input, *, out=None) -> Tensor
```

Scaled modified Bessel function of the second kind of order $1$.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78901
Approved by: https://github.com/mruberry
2022-06-24 20:57:38 +00:00
Allen Goodman
b3308e21bf torch.special.airy_ai (#78902)
```Python
airy_ai(input, *, out=None) -> Tensor
```

Airy function $\text{Ai}\left(\text{input}\right)$.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78902
Approved by: https://github.com/mruberry, https://github.com/linbinyu, https://github.com/seemethere
2022-06-23 19:33:40 +00:00
Edward Z. Yang
f7ee061638 Wconstab/reland pysymint (#79795)
rebased https://github.com/pytorch/pytorch/pull/79617/ to see if issues are reproducible.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/79795
Approved by: https://github.com/malfet
2022-06-20 22:55:06 +00:00
Mikayla Gawarecki
7360b53ff3 reland Add offsets-based reduction to segment_reduce (CPU, CUDA)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79725

Approved by: https://github.com/george-qi
2022-06-17 15:49:31 +00:00
PyTorch MergeBot
44436947bc Revert "Reland PySymInt (#79617)"
This reverts commit 8ef6356f26.

Reverted https://github.com/pytorch/pytorch/pull/79617 on behalf of https://github.com/zengk95 due to this is breaking periodic jobs (and maybe pull) on trunk
2022-06-16 19:40:27 +00:00
Nikolay Korovaiko
8ef6356f26 Reland PySymInt (#79617)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/79617
Approved by: https://github.com/Chillee
2022-06-16 04:18:06 +00:00
drisspg
b9f83cb737 use is_same_size in autograd init (#79553)
Broke: #79446 into a smaller commit that just adds is_same_size to the the autograd __init_file. This function is_same_size will be dispatched to the original behavior for regular tensors
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79553
Approved by: https://github.com/soulitzer
2022-06-15 19:49:42 +00:00
Joel Benjamin Schlosser
2d73c8e6e0 Add Dropout1d module
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79545

Approved by: https://github.com/ngimel, https://github.com/albanD
2022-06-15 14:39:07 +00:00
PyTorch MergeBot
b8db0a0475 Revert "Python Bindings for SymInts (#78135)"
This reverts commit d332724071.

Reverted https://github.com/pytorch/pytorch/pull/78135 on behalf of https://github.com/ezyang due to broke torchvision tests
2022-06-15 13:52:14 +00:00
Nikolay Korovaiko
d332724071 Python Bindings for SymInts (#78135)
This PR adds support for `SymInt`s in python. Namely,
* `THPVariable_size` now returns `sym_sizes()`
* python arg parser is modified to parse PyObjects into ints and `SymbolicIntNode`s
* pybind11 bindings for `SymbolicIntNode` are added, so size expressions can be traced
* a large number of tests added to demonstrate how to implement python symints.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78135
Approved by: https://github.com/ezyang
2022-06-14 02:17:59 +00:00
George Qi
05624bcf7b add sizes to slowpath
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79295

Approved by: https://github.com/ezyang
2022-06-14 01:19:59 +00:00
PyTorch MergeBot
3b194fd532 Revert "Add offsets-based reduction to segment_reduce (CPU, CUDA)"
This reverts commit 1ec30a6647.

Reverted https://github.com/pytorch/pytorch/pull/78907 on behalf of https://github.com/osalpekar due to Caused Typecasting errors in PT Distributed and fx2trt builds internally
2022-06-13 22:37:25 +00:00
Mikayla Gawarecki
1ec30a6647 Add offsets-based reduction to segment_reduce (CPU, CUDA)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78907

Approved by: https://github.com/cpuhrsch
2022-06-11 17:43:42 +00:00
lezcano
54949a5abc Simplify and optimize linalg.solve
This PR heavily simplifies the code of `linalg.solve`. At the same time,
this implementation saves quite a few copies of the input data in some
cases (e.g. A is contiguous)

We also implement it in such a way that the derivative goes from
computing two LU decompositions and two LU solves to no LU
decompositions and one LU solves. It also avoids a number of unnecessary
copies the derivative was unnecessarily performing (at least the copy of
two matrices).

On top of this, we add a `left` kw-only arg that allows the user to
solve `XA = B` rather concisely.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/74046

Approved by: https://github.com/nikitaved, https://github.com/IvanYashchuk, https://github.com/mruberry
2022-06-11 04:06:40 +00:00
samdow
3734fcc8f8 add ability to push a mode if the current mode is an ancestor
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78822

Approved by: https://github.com/ezyang, https://github.com/zou3519
2022-06-10 18:27:04 +00:00
George Qi
a90f006fe5 add strides to slow path
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78610

Approved by: https://github.com/ezyang
2022-06-10 16:59:14 +00:00
lezcano
c7d6cec078 Add linalg.lu_solve
This PR adds `linalg.lu_solve`. While doing so, I found a bug in MAGMA
when calling the batched MAGMA backend with trans=True. We work around
that by solving the system solving two triangular systems.

We also update the heuristics for this function, as they were fairly
updated. We found that cuSolver is king, so luckily we do not need to
rely on the buggy backend from magma for this function.

We added tests testing this function left and right. We also added tests
for the different backends. We also activated the tests for AMD, as
those should work as well.

Fixes https://github.com/pytorch/pytorch/issues/61657

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77634

Approved by: https://github.com/malfet
2022-06-07 22:28:28 +00:00
vitrioil
ebb7f424b8 Add Tensor.is_cpu (#78887)
Fixes #76872

Not sure if this is also required.
ac8c6d09d1/torch/csrc/tensor/python_tensor.cpp (L146)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78887
Approved by: https://github.com/ezyang
2022-06-06 22:01:12 +00:00
samdow
184e0065b3 add better error message for class method
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78821

Approved by: https://github.com/ezyang
2022-06-06 13:31:32 +00:00
Allen Goodman
bc84143152 Orthogonal Polynomials (#78304)
```Python
chebyshev_polynomial_v(input, n, *, out=None) -> Tensor
```

Chebyshev polynomial of the third kind $V_{n}(\text{input})$.

```Python
chebyshev_polynomial_w(input, n, *, out=None) -> Tensor
```

Chebyshev polynomial of the fourth kind $W_{n}(\text{input})$.

```Python
legendre_polynomial_p(input, n, *, out=None) -> Tensor
```

Legendre polynomial $P_{n}(\text{input})$.

```Python
shifted_chebyshev_polynomial_t(input, n, *, out=None) -> Tensor
```

Shifted Chebyshev polynomial of the first kind $T_{n}^{\ast}(\text{input})$.

```Python
shifted_chebyshev_polynomial_u(input, n, *, out=None) -> Tensor
```

Shifted Chebyshev polynomial of the second kind $U_{n}^{\ast}(\text{input})$.

```Python
shifted_chebyshev_polynomial_v(input, n, *, out=None) -> Tensor
```

Shifted Chebyshev polynomial of the third kind $V_{n}^{\ast}(\text{input})$.

```Python
shifted_chebyshev_polynomial_w(input, n, *, out=None) -> Tensor
```

Shifted Chebyshev polynomial of the fourth kind $W_{n}^{\ast}(\text{input})$.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78304
Approved by: https://github.com/mruberry
2022-06-03 22:38:56 +00:00
Allen Goodman
4a5381ab40 Bessel functions (#78451)
Adds:

```Python
bessel_j0(input, *, out=None) -> Tensor
```

Bessel function of the first kind of order $0$, $J_{0}(\text{input})$.

```Python
bessel_j1(input, *, out=None) -> Tensor
```

Bessel function of the first kind of order $1$, $J_{1}(\text{input})$.

```Python
bessel_j0(input, *, out=None) -> Tensor
```

Bessel function of the second kind of order $0$, $Y_{0}(\text{input})$.

```Python
bessel_j1(input, *, out=None) -> Tensor
```

Bessel function of the second kind of order $1$, $Y_{1}(\text{input})$.

```Python
modified_bessel_i0(input, *, out=None) -> Tensor
```

Modified Bessel function of the first kind of order $0$, $I_{0}(\text{input})$.

```Python
modified_bessel_i1(input, *, out=None) -> Tensor
```

Modified Bessel function of the first kind of order $1$, $I_{1}(\text{input})$.

```Python
modified_bessel_k0(input, *, out=None) -> Tensor
```

Modified Bessel function of the second kind of order $0$, $K_{0}(\text{input})$.

```Python
modified_bessel_k1(input, *, out=None) -> Tensor
```

Modified Bessel function of the second kind of order $1$, $K_{1}(\text{input})$.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78451
Approved by: https://github.com/mruberry
2022-06-02 14:06:20 +00:00
samdow
aa06d05297 enable with semantics
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78214

Approved by: https://github.com/ezyang, https://github.com/zou3519
2022-06-01 21:14:45 +00:00
Allen Goodman
64e0d0c4fe Laguerre polynomial (#78366)
Adds:

```Python
laguerre_polynomial_l(input, n, *, out=None) -> Tensor
```

Laguerre polynomial $L_{n}(\text{input})$.

## Derivatives

Recommended $k$-derivative formula with respect to $\text{input}$:

$$\frac{d^{k}}{d \times \text{input}^{k}} L_{n}(\text{input}) = -1^{k} \times L_{-k + n}^{k}(\text{input})$$

where $L_{n}^{\alpha}$ is the associated Laguerre polynomial.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78366
Approved by: https://github.com/mruberry
2022-05-30 17:24:00 +00:00
Allen Goodman
9dc6d42c18 Probabilist’s Hermite polynomial (#78357)
Adds:

```Python
hermite_polynomial_he(input, n, *, out=None) -> Tensor
```
Physicist’s Hermite polynomial $He_{n}(\text{input})$.

If $n = 0$, $1$ is returned. If $n = 1$, $\text{input}$ is returned. Otherwise, the recursion:

$$He_{n + 1}(\text{input}) = 2 \times \text{input} \times He_{n}(\text{input}) - He_{n - 1}(\text{input})$$

is evaluated.

## Derivatives

Recommended $k$-derivative formula with respect to $\text{input}$:

$$\frac{d^{k}}{d \times \text{input}^{k}} He_{n}^{(k)} = \frac{n!}{(n - k)!}He_{n - k}(\text{input}).$$
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78357
Approved by: https://github.com/mruberry
2022-05-28 13:56:12 +00:00
Allen Goodman
18273c39da Physicist’s Hermite polynomial (#78352)
Adds:

```Python
hermite_polynomial_h(input, n, *, out=None) -> Tensor
```
Physicist’s Hermite polynomial $H_{n}(\text{input})$.

If $n = 0$, $1$ is returned. If $n = 1$, $\text{input}$ is returned. Otherwise, the recursion:

$$H_{n + 1}(\text{input}) = 2 \times \text{input} \times H_{n}(\text{input}) - H_{n - 1}(\text{input})$$

is evaluated.

## Derivatives

Recommended $k$-derivative formula with respect to $\text{input}$:

$$\frac{d^{k}}{d \times \text{input}^{k}} H_{n}^{(k)} = 2^{k} \times \frac{n!}{(n - k)!}H_{n - k}(\text{input})$$
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78352
Approved by: https://github.com/mruberry
2022-05-28 02:26:30 +00:00
Allen Goodman
40a6cc6cc6 Chebyshev polynomial of the second kind (#78293)
Adds:

```Python
chebyshev_polynomial_u(input, n, *, out=None) -> Tensor
```

Chebyshev polynomial of the second kind $U_{n}(\text{input})$.

If $n = 0$, $1$ is returned. If $n = 1$, $2 \times \text{input}$ is returned. If $n < 6$ or $|\text{input}| > 1$ the recursion:

$$T_{n + 1}(\text{input}) = 2 \times \text{input} \times T_{n}(\text{input}) - T_{n - 1}(\text{input})$$

is evaluated. Otherwise, the explicit trigonometric formula:

$$\frac{\text{sin}((n + 1) \times \text{arccos}(\text{input}))}{\text{sin}(\text{arccos}(\text{input}))}$$

is evaluated.

## Derivatives

Recommended first derivative formula with respect to $\text{input}$:

$$\frac{(-1 - n)\times U_{-1 + n}(\text{input}) + n \times \text{input} \times U_{n}(x)}{-1 + \text{input}^{2}}.$$

Recommended $k$-derivative formula with respect to $\text{n}$:

$$\frac{\text{arccos}(\text{input})^{k} \times \text{sin}(\frac{k \times \pi}{2} + (1 + n) \times \text{arccos}(\text{input}))}{\sqrt{1 - \text{input}^{2}}}.$$

## Example

```Python
x = torch.linspace(-1.0, 1.0, 256)

matplotlib.pyplot.plot(x, torch.special.chebyshev_polynomial_u(x, 10))
```

![image](https://user-images.githubusercontent.com/315821/170352780-12af63d3-ce31-4948-8b68-8ecc37c71ac5.png)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78293
Approved by: https://github.com/mruberry
2022-05-27 18:32:11 +00:00
Allen Goodman
029bbe4995 Chebyshev polynomial of the first kind (#78196)
Adds:

```Python
chebyshev_polynomial_t(input, n, *, out=None) -> Tensor
```

Chebyshev polynomial of the first kind $T_{n}(\text{input})$.

If $n = 0$, $1$ is returned. If $n = 1$, $\text{input}$ is returned. If $n < 6$ or $|\text{input}| > 1$ the recursion:

$$T_{n + 1}(\text{input}) = 2 \times \text{input} \times T_{n}(\text{input}) - T_{n - 1}(\text{input})$$

is evaluated. Otherwise, the explicit trigonometric formula:

$$T_{n}(\text{input}) = \text{cos}(n \times \text{arccos}(x))$$

is evaluated.

## Derivatives

Recommended $k$-derivative formula with respect to $\text{input}$:

$$2^{-1 + k} \times n \times \Gamma(k) \times C_{-k + n}^{k}(\text{input})$$

where $C$ is the Gegenbauer polynomial.

Recommended $k$-derivative formula with respect to $\text{n}$:

$$\text{arccos}(\text{input})^{k} \times \text{cos}(\frac{k \times \pi}{2} + n \times \text{arccos}(\text{input})).$$

## Example

```Python
x = torch.linspace(-1, 1, 256)

matplotlib.pyplot.plot(x, torch.special.chebyshev_polynomial_t(x, 10))
```

![image](https://user-images.githubusercontent.com/315821/170125525-60415735-4d49-4cbd-9278-26286413f635.png)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78196
Approved by: https://github.com/mruberry
2022-05-26 21:06:44 +00:00
PyTorch MergeBot
d450034f24 Revert "Beta function (#78031)"
This reverts commit da16450360.

Reverted https://github.com/pytorch/pytorch/pull/78031 on behalf of https://github.com/suo due to broke trunk, see the above message
2022-05-24 22:55:06 +00:00
Brian Hirsh
07e4533403 reland of as_strided support for functionalization; introduce as_strided_scatter
This reverts commit a95f1edd85.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78199

Approved by: https://github.com/ezyang
2022-05-24 22:40:44 +00:00
Allen Goodman
da16450360 Beta function (#78031)
Euler beta function:

```Python
torch.special.beta(input, other, *, out=None) → Tensor
```

`reentrant_gamma` and `reentrant_ln_gamma` implementations (using Stirling’s approximation) are provided. I started working on this before I realized we were missing a gamma implementation (despite providing incomplete gamma implementations). Uses the coefficients computed by Steve Moshier to replicate SciPy’s implementation. Likewise, it mimics SciPy’s behavior (instead of the behavior in Cephes).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78031
Approved by: https://github.com/mruberry
2022-05-24 21:07:25 +00:00
PyTorch MergeBot
a95f1edd85 Revert "as_strided support for functionalization; introduce as_strided_scatter"
This reverts commit 3a921f2d26.

Reverted https://github.com/pytorch/pytorch/pull/77128 on behalf of https://github.com/suo due to This broke rocm tests on master 3a921f2d26. rocm tests are no longer run on PRs, you should add a `ciflow/trunk` label if you want to run them
2022-05-24 20:19:12 +00:00
Brian Hirsh
3a921f2d26 as_strided support for functionalization; introduce as_strided_scatter
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77128

Approved by: https://github.com/ezyang
2022-05-24 18:20:31 +00:00
Edward Z. Yang
4941e72e40 Revert "Revert "Implement sym_sizes to create proper IR for sym ints representing tensor sizes (#76836)""
This reverts commit c35bd8d423.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77719

Approved by: https://github.com/Chillee, https://github.com/malfet
2022-05-18 18:40:57 +00:00
PyTorch MergeBot
48581d74ad Revert "Add dispatch mode testing for meta tensors and other stuff"
This reverts commit c1cdb1216b.

Reverted https://github.com/pytorch/pytorch/pull/77477 on behalf of https://github.com/malfet
2022-05-18 02:56:48 +00:00
Edward Z. Yang
c1cdb1216b Add dispatch mode testing for meta tensors and other stuff
We don't have any coverage for meta tensor correctness for backwards
because torch function mode can only allow us to interpose on
Python torch API calls, but backwards invocations happen from C++.
To make this possible, I add torch_dispatch_meta test which runs the
tests with __torch_dispatch__

While doing this, I needed to generate fresh expected failure / skip
lists for the new test suite, and I discovered that my original
scaffolding for this purpose was woefully insufficient.  So I rewrote
how the test framework worked, and at the same time rewrote the
__torch_function__ code to also use the new logic.  Here's whats
new:

- Expected failure / skip is now done on a per function call basis,
  rather than the entire test.  This means that separate OpInfo
  samples for a function don't affect each other.

- There are now only two lists: expect failure list (where the test
  consistently fails on all runs) and skip list (where the test
  sometimes passes and fails.

- We explicitly notate the dtype that failed.  I considered detecting
  when something failed on all dtypes, but this was complicated and
  listing everything out seemed to be nice and simple.  To keep the
  dtypes short, I introduce a shorthand notation for dtypes.

- Conversion to meta tensors is factored into its own class
  MetaConverter

- To regenerate the expected failure / skip lists, just run with
  PYTORCH_COLLECT_EXPECT and filter on a specific test type
  (test_meta or test_dispatch_meta) for whichever you want to update.

Other misc fixes:

- Fix max_pool1d to work with BFloat16 in all circumstances, by making
  it dispatch and then fixing a minor compile error (constexpr doesn't
  work with BFloat16)

- Add resolve_name for turning random torch API functions into string
  names

- Add push classmethod to the Mode classes, so that you can more easily
  push a mode onto the mode stack

- Add some more skips for missing LAPACK

- Added an API to let you query if there's already a registration for
  a function, added a test to check that we register_meta for all
  decompositions (except detach, that decomp is wrong lol), and then
  update all the necessary sites to make the test pass.

Signed-off-by: Edward Z. Yang <ezyangfb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77477

Approved by: https://github.com/zou3519
2022-05-18 00:18:34 +00:00
Christian Puhrsch
8c608a79b4 Compressed sparse layout conversion stubs (#77489)
This PR unifies sparse layout conversions into a single location and adds stubs to raise a Runtime error for unsupported conversions.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77489
Approved by: https://github.com/pearu, https://github.com/mruberry
2022-05-16 18:37:42 +00:00
Pearu Peterson
88205886d7 Add ccol_indices and row_indices methods.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77503

Approved by: https://github.com/cpuhrsch
2022-05-16 00:23:54 +00:00
Christian Puhrsch
289192199a Add to_sparse_bsr (#77366)
Conversion function of CSR to BSR.

Follow up work includes
- Conversion from strided, COO, CSC, BSC
- autograd
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77366
Approved by: https://github.com/IvanYashchuk, https://github.com/mikaylagawarecki
2022-05-13 20:16:03 +00:00
Mikayla Gawarecki
841c65f499 Unprivate _index_reduce and add documentation
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76997

Approved by: https://github.com/cpuhrsch
2022-05-13 19:48:38 +00:00
Ivan Yashchuk
890bdf13e1 Remove deprecated torch.solve (#70986)
The time has come to remove deprecated linear algebra related functions. This PR removes `torch.solve`.

cc @jianyuh @nikitaved @pearu @mruberry @walterddr @IvanYashchuk @xwang233 @Lezcano
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70986
Approved by: https://github.com/Lezcano, https://github.com/albanD
2022-05-10 13:44:07 +00:00
PyTorch MergeBot
4ebc4890dd Revert "Add linalg.lu_solve"
This reverts commit fc5b4a5a33.

Reverted https://github.com/pytorch/pytorch/pull/72935 on behalf of https://github.com/malfet
2022-05-09 19:12:30 +00:00
lezcano
621ff0f973 Add linalg.vander
This PR adds `linalg.vander`, the linalg version of `torch.vander`.

We add autograd support and support for batched inputs.

We also take this chance to improve the docs (TODO: Check that they
render correctly!) and add an OpInfo.

**Discussion**: The current default for the `increasing` kwargs is extremely
odd as it is the opposite of the classical definition (see
[wiki](https://en.wikipedia.org/wiki/Vandermonde_matrix)). This is
reflected in the docs, where I explicit both the odd defaults that we
use and the classical definition. See also [this stackoverflow
post](https://stackoverflow.com/a/71758047/5280578), which shows how
people are confused by this defaults.

My take on this would be to correct the default to be `increasing=True`
and document the divergence with NumPy (as we do for other `linalg`
functions) as:

- It is what people expect
- It gives the correct determinant called "the Vandermonde determinant" rather than (-1)^{n-1} times the Vandermonde det (ugh).
- [Minor] It is more efficient (no `flip` needed)
- Since it's under `linalg.vander`, it's strictly not a drop-in replacement for `np.vander`.

We will deprecate `torch.vander` in a PR after this one in this stack
(once we settle on what's the correct default).

Thoughts? mruberry

cc kgryte rgommers as they might have some context for the defaults of
NumPy.

Fixes https://github.com/pytorch/pytorch/issues/60197

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76303

Approved by: https://github.com/albanD, https://github.com/mruberry
2022-05-06 08:44:14 +00:00
lezcano
fc5b4a5a33 Add linalg.lu_solve
This PR adds `linalg.lu_solve`. While doing so, I found a bug in MAGMA
when calling the batched MAGMA backend with trans=True. We work around
that by solving the system solving two triangular systems.

We also update the heuristics for this function, as they were fairly
updated. We found that cuSolver is king, so luckily we do not need to
rely on the buggy backend from magma for this function.

We added tests testing this function left and right. We also added tests
for the different backends. We also activated the tests for AMD, as
those should work as well.

Fixes https://github.com/pytorch/pytorch/issues/61657

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72935

Approved by: https://github.com/IvanYashchuk, https://github.com/mruberry
2022-05-05 19:02:13 +00:00
lezcano
7cb7cd5802 Add linalg.lu
This PR modifies `lu_unpack` by:
- Using less memory when unpacking `L` and `U`
- Fuse the subtraction by `-1` with `unpack_pivots_stub`
- Define tensors of the correct types to avoid copies
- Port `lu_unpack` to be a strucutred kernel so that its `_out` version
does not incur on extra copies

Then we implement `linalg.lu` as a structured kernel, as we want to
compute its derivative manually. We do so because composing the
derivatives of `torch.lu_factor` and `torch.lu_unpack` would be less efficient.

This new function and `lu_unpack` comes with all the things it can come:
forward and backward ad, decent docs, correctness tests, OpInfo, complex support,
support for metatensors and support for vmap and vmap over the gradients.

I really hope we don't continue adding more features.

This PR also avoids saving some of the tensors that were previously
saved unnecessarily for the backward in `lu_factor_ex_backward` and
`lu_backward` and does some other general improvements here and there
to the forward and backward AD formulae of other related functions.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67833

Approved by: https://github.com/IvanYashchuk, https://github.com/nikitaved, https://github.com/mruberry
2022-05-05 09:17:05 +00:00
Edward Z. Yang
48eb8d6aad Use TorchFunctionMode to implement PrimTorch tracing context
Signed-off-by: Edward Z. Yang <ezyangfb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76735

Approved by: https://github.com/mruberry
2022-05-04 23:49:46 +00:00
Eddie Yan
e838137b3e Add high level control of fp32 matmul precision; disable TF32 for matmuls by default
#76440

CC @mruberry @ptrblck

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76509
Approved by: https://github.com/ngimel
2022-05-04 20:40:13 +00:00
samdow
6779366f27 add nested mode to python mode
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75965

Approved by: https://github.com/albanD, https://github.com/ezyang, https://github.com/zou3519
2022-05-04 13:01:06 +00:00
Pearu Peterson
436a7be059 Factory functions for sparse CSC, BSR, and BSC tensors
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76634

Tests for Sparse Compressed factory functions

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76746

Approved by: https://github.com/cpuhrsch
2022-05-04 03:30:41 +00:00
PyTorch MergeBot
bc5307347f Revert "Add linalg.vander"
This reverts commit 1ea49c68d0.

Reverted https://github.com/pytorch/pytorch/pull/76303 on behalf of https://github.com/malfet
2022-05-02 18:50:08 +00:00
Pearu Peterson
e6b4d77c3e Sparse Compressed tensor factory function 2
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76623

Approved by: https://github.com/cpuhrsch
2022-05-02 17:38:30 +00:00
lezcano
1ea49c68d0 Add linalg.vander
This PR adds `linalg.vander`, the linalg version of `torch.vander`.

We add autograd support and support for batched inputs.

We also take this chance to improve the docs (TODO: Check that they
render correctly!) and add an OpInfo.

**Discussion**: The current default for the `increasing` kwargs is extremely
odd as it is the opposite of the classical definition (see
[wiki](https://en.wikipedia.org/wiki/Vandermonde_matrix)). This is
reflected in the docs, where I explicit both the odd defaults that we
use and the classical definition. See also [this stackoverflow
post](https://stackoverflow.com/a/71758047/5280578), which shows how
people are confused by this defaults.

My take on this would be to correct the default to be `increasing=True`
and document the divergence with NumPy (as we do for other `linalg`
functions) as:

- It is what people expect
- It gives the correct determinant called "the Vandermonde determinant" rather than (-1)^{n-1} times the Vandermonde det (ugh).
- [Minor] It is more efficient (no `flip` needed)
- Since it's under `linalg.vander`, it's strictly not a drop-in replacement for `np.vander`.

We will deprecate `torch.vander` in a PR after this one in this stack
(once we settle on what's the correct default).

Thoughts? mruberry

cc kgryte rgommers as they might have some context for the defaults of
NumPy.

Fixes https://github.com/pytorch/pytorch/issues/60197

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76303

Approved by: https://github.com/albanD
2022-05-02 15:26:44 +00:00
Ivan Yashchuk
8bb7203049 Add torch.linalg.ldl_factor_ex and torch.linalg.ldl_solve
This PR adds a function for computing the LDL decomposition and a function that can solve systems of linear equations using this decomposition. The result of `torch.linalg.ldl_factor_ex` is in a compact form and it's required to use it only through `torch.linalg.ldl_solve`. In the future, we could provide `ldl_unpack` function that transforms the compact representation into explicit matrices.

Fixes https://github.com/pytorch/pytorch/issues/54847.

cc @jianyuh @nikitaved @pearu @mruberry @walterddr @IvanYashchuk @xwang233 @Lezcano
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69828
Approved by: https://github.com/Lezcano, https://github.com/mruberry, https://github.com/albanD
2022-04-28 19:23:37 +00:00
Mikayla Gawarecki
676a4a3969 Prototype _index_reduce (CPU-only)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75981

Approved by: https://github.com/cpuhrsch
2022-04-27 23:01:00 +00:00
Joel Benjamin Schlosser
bc34cf5fe4 Support for tensor subclasses as parameters
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73459

Approved by: https://github.com/ezyang, https://github.com/albanD
2022-04-27 19:28:55 +00:00
Kulin Seth
54c75e1e8f Add "mps" device to PyTorch framework.
Remove the "mlc" device for Mac platforms.

This commit will be followed up with:

* adding MPS runtime components
* PyTorch ops for MPS device

Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76291
Approved by: https://github.com/albanD
2022-04-27 19:21:57 +00:00
Brian Hirsh
ea5209c9fd functionalization: add native fill() op
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76084

Approved by: https://github.com/ezyang
2022-04-25 21:34:16 +00:00
kshitij12345
aa51704ce5 [complex32] add chalf alias for complex32 and chalf method
Reference: https://github.com/pytorch/pytorch/issues/74537

Adds chalf alias for complex32 and also adds method `chalf` similar to `cfloat, cdouble`

TODO:
* [x] Add docs
* [x] Add override
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75320
Approved by: https://github.com/anjali411
2022-04-20 23:44:47 +00:00
albanD
cd0591dff3 Change default TLS behavior in dispatch to favor is-a style
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75827

Approved by: https://github.com/ezyang
2022-04-20 17:32:29 +00:00
Edward Z. Yang
ee955b8bb9 Cannibalize noarch CI job into crossref CI job
crossref is a new strategy for performing tests when you want
to run a normal PyTorch API call, separately run some variation of
the API call (e.g., same thing but all the arguments are meta tensors)
and then cross-reference the results to see that they are consistent.
Any logic you add to CrossRefMode will get run on *every* PyTorch API
call that is called in the course of PyTorch's test suite.  This can
be a good choice for correctness testing if OpInfo testing is not
exhaustive enough.

For now, the crossref test doesn't do anything except verify that
we can validly push a mode onto the torch function mode stack for all
functions.

Signed-off-by: Edward Z. Yang <ezyangfb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75988

Approved by: https://github.com/seemethere
2022-04-20 11:56:25 +00:00
Edward Z. Yang
d9219d2944 Add torch.nn.init to list of overridable functions
Signed-off-by: Edward Z. Yang <ezyangfb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76014

Approved by: https://github.com/zou3519
2022-04-20 11:55:56 +00:00
Alban Desmaison
3467f3fa80 Remove spurious warning when using disabled torch function
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75826

Approved by: https://github.com/ezyang
2022-04-15 17:08:45 +00:00
Scott Wolchok
97c993ca7a [PyTorch] Add NestedTensor support functions for transformers
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75491

Here are the NestedTensor kernels we'll need for the improved transformer implementation.

Differential Revision: [D35409275](https://our.internmc.facebook.com/intern/diff/D35409275/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D35409275/)!

Approved by: https://github.com/cpuhrsch
2022-04-14 16:30:23 +00:00
Brian Hirsh
23b8414391 code-generate non-aliasing {view}_copy kernels (#73442)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/73442

Test Plan: Imported from OSS

Reviewed By: ezyang

Differential Revision: D35016025

Pulled By: bdhirsh

fbshipit-source-id: 2a7f303ec76f5913b744c7822a531d55a57589c9
(cherry picked from commit 3abe13c2a787bcbe9c41b0a335c96e5a3d3642fb)
2022-04-11 19:48:55 +00:00
Edward Z. Yang
0a1bc5f501 Miscellaneous __torch_function__ fixes
I figured these out by unconditionally turning on a no-op torch function
mode on the test suite and then fixing errors as they showed up.  Here's
what I found:

- _parse_to failed internal assert when __torch_function__'ed because it
  claims its name is "to" to the argument parser; added a name override
  so we know how to find the correct name

- Infix operator magic methods on Tensor did not uniformly handle
  __torch_function__ and TypeError to NotImplemented.  Now, we always
  do the __torch_function__ handling in
  _wrap_type_error_to_not_implemented and your implementation of
  __torch_function__ gets its TypeErrors converted to NotImplemented
  (for better or for worse; see
  https://github.com/pytorch/pytorch/issues/75462 )

- A few cases where code was incorrectly testing if a Tensor was
  Tensor-like in the wrong way, now use is_tensor_like (in grad
  and in distributions).  Also update docs for has_torch_function to
  push people to use is_tensor_like.

- is_grads_batched was dropped from grad in handle_torch_function, now
  fixed

- Report that you have a torch function even if torch function is
  disabled if a mode is enabled.  This makes it possible for a mode
  to return NotImplemented, pass to a subclass which does some
  processing and then pass back to the mode even after the subclass
  disables __torch_function__ (so the tensors are treated "as if"
  they are regular Tensors).  This brings the C++ handling behavior
  in line with the Python behavior.

- Make the Python implementation of overloaded types computation match
  the C++ version: when torch function is disabled, there are no
  overloaded types (because they all report they are not overloaded).

Signed-off-by: Edward Z. Yang <ezyangfb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75484

Approved by: https://github.com/zou3519
2022-04-11 16:52:16 +00:00
Scott Wolchok
48147675f2 [PyTorch] _addm_activation native function for matmul/bias/activation fusion
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74490

Here's an extended version of addmm that takes advantage of cublasLt's fused addmm + relu/gelu support.

Differential Revision: [D35019612](https://our.internmc.facebook.com/intern/diff/D35019612/)

Approved by: https://github.com/ngimel
2022-04-08 17:54:09 +00:00