Siddharth Kotapati
e27c0048db
Enable additional tests for MPS CI runs ( #134356 )
...
As a follow-up to https://github.com/pytorch/pytorch/issues/133520 , this adapts existing unused tests for use in MPS CI runs, focusing on NHWC and other memory-format tests.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/134356
Approved by: https://github.com/malfet , https://github.com/eqy , https://github.com/huydhn
2024-10-04 21:52:38 +00:00
Xuehai Pan
4226ed1585
[BE] Format uncategorized Python files with ruff format ( #132576 )
...
Remove patterns `**`, `test/**`, and `torch/**` in `tools/linter/adapters/pyfmt_linter.py` and run `lintrunner`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132576
Approved by: https://github.com/ezyang , https://github.com/Skylion007
ghstack dependencies: #132574
2024-08-04 17:13:31 +00:00
Xuehai Pan
ba48cf6535
[BE][Easy][6/19] enforce style for empty lines in import segments in test/ ( #129757 )
...
See https://github.com/pytorch/pytorch/pull/129751#issue-2380881501 . Most changes are auto-generated by the linter.
You can review these PRs via:
```bash
git diff --ignore-all-space --ignore-blank-lines HEAD~1
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129757
Approved by: https://github.com/ezyang
2024-07-17 06:42:37 +00:00
Xuehai Pan
26f4f10ac8
[5/N][Easy] fix typo for usort config in pyproject.toml (kown -> known): sort torch ( #127126 )
...
The `usort` config in `pyproject.toml` has no effect due to a typo. Fixing the typo makes `usort` do more and generates the changes in this PR. Except for `pyproject.toml`, all changes were generated by `lintrunner -a --take UFMT --all-files`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/127126
Approved by: https://github.com/kit1980
2024-05-27 14:49:57 +00:00
PyTorch MergeBot
55c0ab2887
Revert "[5/N][Easy] fix typo for usort config in pyproject.toml (kown -> known): sort torch ( #127126 )"
...
This reverts commit 7763c83af6 .
Reverted https://github.com/pytorch/pytorch/pull/127126 on behalf of https://github.com/XuehaiPan due to Broken CI ([comment](https://github.com/pytorch/pytorch/pull/127126#issuecomment-2133044286 ))
2024-05-27 09:22:08 +00:00
Xuehai Pan
7763c83af6
[5/N][Easy] fix typo for usort config in pyproject.toml (kown -> known): sort torch ( #127126 )
...
The `usort` config in `pyproject.toml` has no effect due to a typo. Fixing the typo makes `usort` do more and generates the changes in this PR. Except for `pyproject.toml`, all changes were generated by `lintrunner -a --take UFMT --all-files`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/127126
Approved by: https://github.com/kit1980
ghstack dependencies: #127122 , #127123 , #127124 , #127125
2024-05-27 04:22:18 +00:00
Jianping Wu
c281d3a0cb
Enable UFMT on test_indexing&test_view_ops ( #125112 )
...
Part of https://github.com/pytorch/pytorch/issues/123062
Pull Request resolved: https://github.com/pytorch/pytorch/pull/125112
Approved by: https://github.com/ezyang
2024-05-01 23:44:53 +00:00
lezcano
8597d37536
Implement numpy(force=True) ( #109636 )
...
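For illustration, a rough behavior sketch of what `force=True` permits (not taken from this change; see the `Tensor.numpy` docs for the exact semantics):
```python
import torch

x = torch.randn(3, requires_grad=True)
# x.numpy()              # errors: numpy() is not allowed on a tensor that requires grad
a = x.numpy(force=True)  # force=True detaches (and copies across device/conj/neg) first
print(a.shape)           # (3,)
```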
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109636
Approved by: https://github.com/ezyang
ghstack dependencies: #109634
2023-09-20 20:06:13 +00:00
Manuele Sigona
a711679527
Add skipLazy marker for tests and use it for tests not working with LazyTensor ( #107382 )
...
[This PR](https://github.com/pytorch/pytorch/pull/80251/files#diff-87e1d4e98eab994c977a57be29c716d3dc0f76d5b5e98cbf23cfcbd48ae625a4 ) marked some tests in `test/test_view_ops.py` with `@onlyNativeDeviceTypes`, because they'd fail if run on the `'lazy'` device type.
However, that marker is overly restrictive, because it prevents all devices outside of the native ones from running those tests.
This PR adds a `@skipLazy` marker (analogous to the existing ones for the other devices), and marks the tests from the mentioned PR so that they're skipped only for the `'lazy'` device type.
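A usage sketch of the new marker (illustrative; it assumes `skipLazy` is exported from `torch.testing._internal.common_device_type` alongside the existing `skipXLA`/`skipMeta` markers):
```python
import torch
from torch.testing._internal.common_device_type import (
    instantiate_device_type_tests,
    skipLazy,
)
from torch.testing._internal.common_utils import TestCase, run_tests

class TestViewOpsSketch(TestCase):
    @skipLazy  # skipped only for the 'lazy' device type; all other devices still run it
    def test_view_shares_storage(self, device):
        x = torch.randn(4, device=device)
        self.assertTrue(x.view(2, 2)._base is x)

instantiate_device_type_tests(TestViewOpsSketch, globals())

if __name__ == "__main__":
    run_tests()
```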
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107382
Approved by: https://github.com/ezyang
2023-08-22 22:34:36 +00:00
ecao
fdb04c6a86
Add overflow check for stride calculation ( #94900 )
...
Fixes #94120 and #94128 .
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94900
Approved by: https://github.com/ezyang , https://github.com/jgong5
2023-04-09 01:30:55 +00:00
lezcano
46a81c8db7
Deprecate .mT,.T,.mH,.H on 0D tensors ( #92143 )
...
As discussed with @ngimel, this is not only not documented,
but also an unnecessary edge case. See https://github.com/pytorch/pytorch/pull/90463#discussion_r1064807197
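For context, the edge case being deprecated (illustrative): on a 0-D tensor these properties were silent no-ops.
```python
import torch

s = torch.tensor(3.0)
print(s.T)  # tensor(3.) -- "transposing" a scalar does nothing, hence the deprecation
```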
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92143
Approved by: https://github.com/ngimel
2023-01-17 16:54:35 +00:00
Kurt Mohler
08a47549af
Rename Tensor._storage to Tensor.untyped_storage and update docs ( #91414 )
...
Fixes #89224
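A quick usage sketch of the renamed accessor (illustrative, not from the PR):
```python
import torch

x = torch.arange(4, dtype=torch.uint8)
s = x.untyped_storage()   # replaces the private Tensor._storage()
print(s.nbytes(), s[0])   # 4 0
```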
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91414
Approved by: https://github.com/ezyang
2022-12-28 19:21:34 +00:00
mikey dagitses
279dcce702
disable test that fails in fbcode ( #88786 )
...
Summary:
caffe2/test:torch_cuda - test_advanced_indexing_assignment_lazy (test_view_ops.TestViewOpsLAZY)
RuntimeError: TorchScript backend not yet supported in FBCODE/OVRSOURCE builds
File "/usr/local/fbcode/platform010/lib/python3.8/unittest/suite.py", line 163, in _handleClassSetUp
setUpClass()
File "/re_cwd/fbcode/buck-out/opt/gen/caffe2/test/torch_cuda#binary,link-tree/torch/testing/_internal/common_device_type.py", line 506, in setUpClass
torch._lazy.ts_backend.init()
File "/re_cwd/fbcode/buck-out/opt/gen/caffe2/test/torch_cuda#binary,link-tree/torch/_lazy/ts_backend.py", line 6, in init
torch._C._lazy_ts_backend._init()
Test Plan: Rely on CI.
Differential Revision: D41170545
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88786
Approved by: https://github.com/zou3519
2022-11-15 19:08:31 +00:00
Kurt Mohler
ee28b865ee
Deprecate TypedStorage, its derived classes, and all of their public methods ( #85303 )
...
Part of #85302
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85303
Approved by: https://github.com/ezyang
2022-11-08 18:11:01 +00:00
Brian Hirsh
9ad1659b17
functionalization: make view_copy outputs always contiguous ( #85747 )
...
This fixes an issue with mobile: The output of view_copy ops should always be contiguous.
Later, we can consider adding optional arguments to the `view_copy()` functions to let callers explicitly specify what the contiguity of the output should be (e.g. channels_last).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85747
Approved by: https://github.com/ezyang
2022-10-21 17:42:02 +00:00
Animesh Jain
1d90d6ee60
Setup for running PyTorch tests with TorchDynamo and skips for known failing tests ( #80106 )
...
@ezyang I am going to keep adding more skips in this PR for now. Once we have the CI running, I will replace them with the appropriate decorators.
cc @mlazos , we should add those tests in test_ops.py in this PR as well
cc @jansel
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80106
Approved by: https://github.com/ezyang , https://github.com/jansel
2022-07-07 18:57:33 +00:00
Brian Hirsh
c2d395cf8e
functionalization <> LTC integration (take 3) ( #80251 )
...
new PR for https://github.com/pytorch/pytorch/pull/75527 .
It looks like there's a bug in the Windows CI scripts that was causing flaky failures, which disappear when I create a new PR. Example failure:
https://github.com/pytorch/pytorch/runs/6999272635?check_suite_focus=true
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80251
Approved by: https://github.com/wconstab
2022-06-26 23:10:21 +00:00
jjsjann123
fea909b43e
[primTorch] Adds broadcast_shapes reference ( #78612 )
...
1. Added references `_refs.broadcast_shapes`
2. Added OpInfo test for `torch.broadcast_shapes`
A few minor changes:
- `test_python_ref_meta` and `_ref_test_helper` update to avoid non-tensor outputs
- type annotation update for `_resize_meta`
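For reference, the eager behavior the new reference and OpInfo exercise (illustrative only; `_refs.broadcast_shapes` is expected to match it):
```python
import torch

print(torch.broadcast_shapes((2, 1), (1, 3)))     # torch.Size([2, 3])
print(torch.broadcast_shapes((5, 1, 4), (3, 1)))  # torch.Size([5, 3, 4])
```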
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78612
Approved by: https://github.com/mruberry
2022-06-02 08:56:37 +00:00
Edward Z. Yang
4941e72e40
Revert "Revert "Implement sym_sizes to create proper IR for sym ints representing tensor sizes ( #76836 )""
...
This reverts commit c35bd8d423 .
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77719
Approved by: https://github.com/Chillee , https://github.com/malfet
2022-05-18 18:40:57 +00:00
Brian Hirsh
edc904d6ba
add native view_copy.out ops, teach codegen about tensorlist out=
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76126
Approved by: https://github.com/ezyang
2022-05-18 14:23:43 +00:00
PyTorch MergeBot
48581d74ad
Revert "Add dispatch mode testing for meta tensors and other stuff"
...
This reverts commit c1cdb1216b .
Reverted https://github.com/pytorch/pytorch/pull/77477 on behalf of https://github.com/malfet
2022-05-18 02:56:48 +00:00
Edward Z. Yang
c1cdb1216b
Add dispatch mode testing for meta tensors and other stuff
...
We don't have any coverage for meta tensor correctness for backwards
because torch function mode can only allow us to interpose on
Python torch API calls, but backwards invocations happen from C++.
To make this possible, I add a torch_dispatch_meta test which runs the tests with __torch_dispatch__.
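For readers unfamiliar with the mechanism, a minimal sketch of interposing at the dispatcher level (not part of this change; it assumes the `TorchDispatchMode` helper in `torch.utils._python_dispatch`):
```python
import torch
from torch.utils._python_dispatch import TorchDispatchMode

class OpLogger(TorchDispatchMode):
    def __torch_dispatch__(self, func, types, args=(), kwargs=None):
        print("dispatch:", func)              # sees ATen ops, including those from backward
        return func(*args, **(kwargs or {}))

with OpLogger():
    x = torch.randn(2, 2, requires_grad=True)
    (x * 2).sum().backward()                  # backward ops are logged too, unlike __torch_function__
```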
While doing this, I needed to generate fresh expected failure / skip
lists for the new test suite, and I discovered that my original
scaffolding for this purpose was woefully insufficient. So I rewrote
how the test framework worked, and at the same time rewrote the
__torch_function__ code to also use the new logic. Here's whats
new:
- Expected failure / skip is now done on a per function call basis,
rather than the entire test. This means that separate OpInfo
samples for a function don't affect each other.
- There are now only two lists: an expected failure list (where the test consistently fails on all runs) and a skip list (where the test sometimes passes and sometimes fails).
- We explicitly notate the dtype that failed. I considered detecting
when something failed on all dtypes, but this was complicated and
listing everything out seemed to be nice and simple. To keep the
dtypes short, I introduce a shorthand notation for dtypes.
- Conversion to meta tensors is factored into its own class
MetaConverter
- To regenerate the expected failure / skip lists, just run with
PYTORCH_COLLECT_EXPECT and filter on a specific test type
(test_meta or test_dispatch_meta) for whichever you want to update.
Other misc fixes:
- Fix max_pool1d to work with BFloat16 in all circumstances, by making
it dispatch and then fixing a minor compile error (constexpr doesn't
work with BFloat16)
- Add resolve_name for turning random torch API functions into string
names
- Add push classmethod to the Mode classes, so that you can more easily
push a mode onto the mode stack
- Add some more skips for missing LAPACK
- Added an API to let you query if there's already a registration for
a function, added a test to check that we register_meta for all
decompositions (except detach, that decomp is wrong lol), and then
update all the necessary sites to make the test pass.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77477
Approved by: https://github.com/zou3519
2022-05-18 00:18:34 +00:00
Mike Ruberry
f6bbecf8b5
Adds python ref consistency test, elementwise unary reference inputs, and formats test files
...
Per title.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76626
Approved by: https://github.com/ngimel
2022-05-01 22:42:46 +00:00
Brian Hirsh
23b8414391
code-generate non-aliasing {view}_copy kernels ( #73442 )
...
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/73442
Test Plan: Imported from OSS
Reviewed By: ezyang
Differential Revision: D35016025
Pulled By: bdhirsh
fbshipit-source-id: 2a7f303ec76f5913b744c7822a531d55a57589c9
(cherry picked from commit 3abe13c2a787bcbe9c41b0a335c96e5a3d3642fb)
2022-04-11 19:48:55 +00:00
Peter Bell
13a3e5c70c
Catch overflows in calculating storage byte size
...
Fixes #73184
In the issue the output tensor's shape is `[2, 4, 536870912, 536870912]` which results in a `numel()` slightly below the point of overflow. When the storage is created it does `numel() * 8` which overflows and a much smaller storage is allocated than required.
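For reference, a quick check of that arithmetic (Python ints don't overflow; the wraparound happens in the C++ int64 computation):
```python
numel = 2 * 4 * 536870912 * 536870912
print(numel)      # 2305843009213693952 == 2**61, still below the int64 max
print(numel * 8)  # 2**64, past 2**63 - 1, so the C++ byte-size computation wraps around
print(2**63 - 1)  # 9223372036854775807
```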
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73719
Approved by: https://github.com/ezyang , https://github.com/malfet
2022-03-31 16:16:03 +00:00
Nikita Shulga
bfac65dfe5
[testing] Update dispatch macros ( #74977 )
...
This PR is a reland of #74289.
Co-authored-by: Khushi Agrawal <khushiagrawal411@gmail.com>
2022-03-30 14:13:21 -07:00
PyTorch MergeBot
2e4152b118
Revert "[testing] Update dispatch macros"
...
This reverts commit eed19a0f38 .
Reverted https://github.com/pytorch/pytorch/pull/74289 on behalf of https://github.com/malfet
2022-03-30 19:52:37 +00:00
Khushi Agrawal
eed19a0f38
[testing] Update dispatch macros
...
Hi,
This PR is the follow-up to #71561 (the previous PR had a couple of merge conflicts and was reverted; this PR resolves that).
Please take a look. Thanks!
cc: @pmeier @mruberry @kshitij12345
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74289
Approved by: https://github.com/pmeier , https://github.com/mruberry
2022-03-30 16:10:16 +00:00
Khushi Agrawal
f1af4dbed0
[fix] Contiguity of torch.ravel!
...
Hi!
The PR aims to fix #70657 . The objective was to ensure that `torch.ravel()` returns contiguous outputs for non-contiguous inputs. It also adds the previously missing test verifying the contiguity of `torch.ravel`'s output.
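A condensed version of the behavior the new test checks (illustrative):
```python
import torch

x = torch.arange(9).reshape(3, 3).t()  # non-contiguous input
y = x.ravel()
print(y.is_contiguous())               # True: ravel copies when a view is impossible
```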
I am looking forward to your viewpoints. Thanks :)
Thank you so much, @kshitij12345, for helping me clear up the concepts! :)
cc: @mruberry @kshitij12345
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71771
Approved by: https://github.com/mruberry
2022-03-28 16:41:39 +00:00
Nikita Shulga
ef066f0832
Revert D34856571: [pytorch][PR] Replace get_all_ type macros with the ATen dispatch macros.
...
Test Plan: revert-hammer
Differential Revision:
D34856571 (3ded7b1da3 )
Original commit changeset: 0dca038bcad5
Original Phabricator Diff: D34856571 (3ded7b1da3 )
fbshipit-source-id: 594553fa0b710d78beba59d5d2b646f1f1270386
(cherry picked from commit 8090eb9b12dcf452a9e7dc01792a66fb91b563b6)
2022-03-15 22:07:11 +00:00
Khushi Agrawal
3ded7b1da3
Replace get_all_ type macros with the ATen dispatch macros. ( #71561 )
...
Summary:
Hi, Team!
The PR is motivated by https://github.com/pytorch/pytorch/pull/71153#discussion_r782446738 . It aims to replace the `get_all_*` type macros with the ATen dispatch macros (a rough mapping sketch follows the file list below).
The files it iterates over are: (Thanks, Lezcano, for the idea!!)
<details>
<summary>
`test/test_autograd.py`</summary>
<p>
```python
43:from torch.testing._internal.common_dtype import get_all_dtypes
8506: floating_dt = [dt for dt in get_all_dtypes() if dt.is_floating_point]
```
</p>
</details>
<details>
<summary>
`test/test_binary_ufuncs.py`</summary>
<p>
```python
26: all_types_and_complex_and, integral_types_and, get_all_dtypes, get_all_int_dtypes, get_all_math_dtypes,
27: get_all_complex_dtypes, get_all_fp_dtypes,
935: dtypes(*get_all_dtypes(include_bool=False, include_complex=False))
1035: dtypes(*get_all_dtypes(
1488: dtypes(*(get_all_dtypes(include_bool=False, include_bfloat16=False)))
1879: dtypes(*product(get_all_dtypes(include_complex=False), get_all_dtypes(include_complex=False)))
1887: dtypes(*(get_all_int_dtypes() + [torch.bool]))
1913: dtypes(*(get_all_fp_dtypes()))
1941: dtypes(*(get_all_fp_dtypes()))
1977: dtypes(*product(get_all_complex_dtypes(), get_all_dtypes()))
2019: dtypes(*product(get_all_fp_dtypes(), get_all_fp_dtypes()))
2048: dtypes(*get_all_dtypes())
2110: dtypes(*product(get_all_dtypes(include_complex=False),
2111: get_all_dtypes(include_complex=False)))
2128: types = [torch.bool, torch.bfloat16] + get_all_int_dtypes()
2173: if dtypes[1] in get_all_fp_dtypes():
2178: dtypes(*product(get_all_fp_dtypes(),
2179: get_all_fp_dtypes()))
2260: dtypesIfCUDA(*set(get_all_math_dtypes('cuda')) - {torch.complex64, torch.complex128})
2261: dtypes(*set(get_all_math_dtypes('cpu')) - {torch.complex64, torch.complex128})
2273: dtypesIfCUDA(*set(get_all_math_dtypes('cuda')) - {torch.complex64, torch.complex128})
2274: dtypes(*set(get_all_math_dtypes('cpu')) - {torch.complex64, torch.complex128})
2307: dtypes(*get_all_math_dtypes('cpu'))
2319: dtypes(*get_all_fp_dtypes(include_bfloat16=False))
2331: dtypes(*get_all_int_dtypes())
2356: dtypes(*get_all_dtypes(include_bfloat16=False, include_bool=False, include_complex=False))
2393: if dtype in get_all_int_dtypes():
2614: dtypes(*get_all_dtypes())
2624: dtypes(*tuple(itertools.combinations_with_replacement(get_all_dtypes(), 2)))
2806: dtypes(*list(product(get_all_dtypes(include_complex=False),
2807: get_all_dtypes(include_complex=False))))
2866: dtypes(*list(product(get_all_complex_dtypes(),
2867: get_all_complex_dtypes())))
2902: dtypes(*product(get_all_dtypes(), get_all_dtypes()))
2906: dtypes(*product(get_all_dtypes(), get_all_dtypes()))
2910: dtypes(*product(get_all_dtypes(), get_all_dtypes()))
3019: dtypes = [torch.float, torch.double] + get_all_complex_dtypes()
3221: dtypes(*get_all_dtypes(include_complex=False))
3407: dtypes(*list(product(get_all_dtypes(include_bool=False),
3408: get_all_dtypes(include_bool=False))))
3504: dtypes(*product(get_all_dtypes(include_complex=False, include_bfloat16=False),
3505: get_all_dtypes(include_complex=False, include_bfloat16=False)))
3516: if x.dtype in get_all_int_dtypes() + [torch.bool]:
3643: dtypes(*product(get_all_dtypes(include_complex=False,
3645: get_all_dtypes(include_complex=False,
```
</p>
</details>
<details>
<summary>
`test/test_complex.py`</summary>
<p>
```python
6:from torch.testing._internal.common_dtype import get_all_complex_dtypes
11: dtypes(*get_all_complex_dtypes())
```
</p>
</details>
<details>
<summary>
`test/test_foreach.py`</summary>
<p>
```python
18: get_all_dtypes, get_all_int_dtypes, get_all_complex_dtypes, get_all_fp_dtypes,
142: if dtype in get_all_int_dtypes():
179: disable_fastpath = op.ref == torch.div and dtype in get_all_int_dtypes() + [torch.bool]
201: disable_fastpath = op.ref == torch.div and dtype in get_all_int_dtypes() + [torch.bool]
205: disable_fastpath |= dtype in get_all_int_dtypes() + [torch.bool]
211: disable_fastpath |= dtype not in get_all_complex_dtypes()
241: bool_int_div = op.ref == torch.div and dtype in get_all_int_dtypes() + [torch.bool]
246: disable_fastpath |= dtype in get_all_int_dtypes() + [torch.bool]
248: disable_fastpath |= dtype not in get_all_complex_dtypes()
250: disable_fastpath |= True and dtype not in get_all_complex_dtypes()
307: disable_fastpath = dtype in get_all_int_dtypes() + [torch.bool]
365: if opinfo.name == "_foreach_abs" and dtype in get_all_complex_dtypes():
376: ops(foreach_unary_op_db, dtypes=get_all_dtypes())
393: dtypes=get_all_dtypes(include_half=True, include_bfloat16=True, include_complex=False))
401: ops(foreach_minmax_op_db, dtypes=get_all_fp_dtypes(include_bfloat16=True, include_half=True))
426: if ord in (1, 2) and dtype in torch.testing.get_all_fp_dtypes():
439: dtypes(*get_all_dtypes())
449: ops(foreach_binary_op_db, dtypes=get_all_dtypes())
481: ops(foreach_binary_op_db, dtypes=get_all_dtypes())
536: if dtype in get_all_int_dtypes() + [torch.bool] and foreach_op == torch._foreach_div:
545: ops(foreach_binary_op_db, dtypes=get_all_dtypes())
637: ops(foreach_pointwise_op_db, allowed_dtypes=get_all_fp_dtypes(include_half=False, include_bfloat16=False))
```
</p>
</details>
<details>
<summary>
`test/test_linalg.py`</summary>
<p>
```python
29: all_types, floating_types, floating_and_complex_types, get_all_dtypes, get_all_int_dtypes, get_all_complex_dtypes,
30: get_all_fp_dtypes,
111: dtypes(*(get_all_dtypes()))
794: float_and_complex_dtypes = get_all_fp_dtypes() + get_all_complex_dtypes()
807: dtypes(*(get_all_int_dtypes()))
828: dtypes(*(get_all_fp_dtypes() + get_all_complex_dtypes()))
841: if dtype in get_all_complex_dtypes():
844: dtypes(*itertools.product(get_all_dtypes(),
845: get_all_dtypes()))
855: for dtypes0, dtypes1, dtypes2 in product(get_all_dtypes(), repeat=3):
5607: *get_all_fp_dtypes(include_half=not CUDA9, include_bfloat16=(CUDA11OrLater and SM53OrLater)))
5608: dtypes(*(set(get_all_dtypes()) - {torch.half, torch.bool}))
5644: dtypes(*(get_all_complex_dtypes() + get_all_fp_dtypes()))
6255: dtypesIfCUDA(*get_all_complex_dtypes(),
6256: *get_all_fp_dtypes(include_bfloat16=(TEST_WITH_ROCM or (CUDA11OrLater and SM53OrLater)),
6292: dtypesIfCUDA(*get_all_fp_dtypes(include_bfloat16=(TEST_WITH_ROCM or (CUDA11OrLater and SM53OrLater))))
6323: dtypesIfCUDA(*get_all_complex_dtypes(),
6324: *get_all_fp_dtypes(include_bfloat16=(TEST_WITH_ROCM or (CUDA11OrLater and SM53OrLater))))
6325: dtypes(*get_all_complex_dtypes(), *get_all_fp_dtypes())
6358: dtypesIfCUDA(*([torch.float, torch.double] + get_all_complex_dtypes()))
6556: dtypes(*get_all_fp_dtypes(), *get_all_complex_dtypes())
6668: dtypes(*get_all_fp_dtypes(), *get_all_complex_dtypes())
6741: dtypes(*get_all_fp_dtypes(), *get_all_complex_dtypes())
```
</p>
</details>
<details>
<summary>
`test/test_nn.py`</summary>
<p>
```python
37:from torch.testing._internal.common_dtype import integral_types, get_all_fp_dtypes, get_all_math_dtypes
50: onlyNativeDeviceTypes, deviceCountAtLeast, largeTensorTest, expectedFailureMeta, skipMeta, get_all_device_types, \
8862: for device in get_all_device_types():
9629: for dt1 in get_all_math_dtypes(device):
9630: for dt2 in get_all_math_dtypes(device):
9631: for dt3 in get_all_math_dtypes(device):
9648: for input_dtype in get_all_math_dtypes(device):
9664: for input_dtype in get_all_math_dtypes(device):
13015: dtypes(*get_all_fp_dtypes(include_bfloat16=AMPERE_OR_ROCM))
13034: dtypes(*get_all_fp_dtypes(include_bfloat16=AMPERE_OR_ROCM))
13159: dtypes(*get_all_fp_dtypes(include_bfloat16=AMPERE_OR_ROCM))
17400: dtypesIfCUDA(*get_all_fp_dtypes(include_bfloat16=AMPERE_OR_ROCM))
17768: dtypesIfCUDA(*get_all_fp_dtypes())
17773: dtypesIfCUDA(*get_all_fp_dtypes())
17778: dtypesIfCUDA(*get_all_fp_dtypes())
17783: dtypesIfCUDA(*get_all_fp_dtypes())
17788: dtypesIfCUDA(*get_all_fp_dtypes())
17793: dtypesIfCUDA(*get_all_fp_dtypes())
17798: dtypesIfCUDA(*get_all_fp_dtypes())
17963: dtypesIfCUDA(*get_all_fp_dtypes())
17977: dtypesIfCUDA(*get_all_fp_dtypes())
18684: def test_cross_entropy_loss_prob_target_all_reductions(self, device):
```
</p>
</details>
<details>
<summary>
`test/test_numpy_interop.py`</summary>
<p>
```python
12:from torch.testing._internal.common_dtype import get_all_dtypes
399: dtypes(*get_all_dtypes())
```
</p>
</details>
<details>
<summary>
`test/test_ops.py`</summary>
<p>
```python
12:from torch.testing._internal.common_dtype import floating_and_complex_types_and, get_all_dtypes
86: for dtype in get_all_dtypes():
```
</p>
</details>
<details>
<summary>
`test/test_reductions.py`</summary>
<p>
```python
16: get_all_dtypes, get_all_math_dtypes, get_all_int_dtypes, get_all_complex_dtypes, get_all_fp_dtypes,
360: allowed_dtypes=get_all_dtypes(include_bfloat16=False))
366: allowed_dtypes=get_all_dtypes(include_bfloat16=False))
394: allowed_dtypes=get_all_dtypes(include_bfloat16=False))
750: for dtype in [dtype for dtype in get_all_math_dtypes('cpu') if dtype != torch.float16]:
1404: dtypes(*get_all_dtypes(include_bool=False, include_complex=False))
1457: dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False) +
1458: get_all_complex_dtypes()))
1465: return dtype in get_all_int_dtypes()
1494: dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False)))
1501: dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False)))
1507: dtypes(*(get_all_complex_dtypes()))
1514: dtypes = list(get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False))
1523: dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False)))
1531: if dtype in get_all_fp_dtypes():
1608: dtypes(*(get_all_dtypes(include_half=True, include_bfloat16=False,
1837: dtypes(*get_all_dtypes(include_bool=False, include_complex=False))
1855: dtypes(*(set(get_all_dtypes(include_bool=False, include_complex=False)) - {torch.uint8}))
3219: for dtype in get_all_dtypes(include_half=True, include_bfloat16=False,
```
</p>
</details>
<details>
<summary>
`test/test_serialization.py`</summary>
<p>
```python
26:from torch.testing._internal.common_dtype import get_all_dtypes
586: for device, dtype in product(devices, get_all_dtypes()):
589: for other_dtype in get_all_dtypes():
```
</p>
</details>
<details>
<summary>
`test/test_shape_ops.py`</summary>
<p>
```python
18:from torch.testing._internal.common_dtype import get_all_dtypes
230: dtypes(*get_all_dtypes(include_complex=False, include_bool=False, include_half=False,
232: dtypesIfCUDA(*get_all_dtypes(include_complex=False, include_bool=False, include_bfloat16=False))
344: dtypes(*get_all_dtypes())
443: dtypes(*get_all_dtypes())
461: dtypes(*get_all_dtypes())
570: dtypes(*get_all_dtypes(include_complex=False))
```
</p>
</details>
<details>
<summary>
`test/test_sort_and_select.py`</summary>
<p>
```python
12: all_types, all_types_and, floating_types_and, get_all_dtypes, get_all_int_dtypes, get_all_fp_dtypes,
136: dtypes(*set(get_all_dtypes()) - {torch.bool, torch.complex64, torch.complex128})
231: dtypes(*set(get_all_dtypes()) - {torch.bool, torch.complex64, torch.complex128})
296: dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes()))
647: dtypesIfCUDA(*get_all_fp_dtypes())
678: dtypesIfCUDA(*(get_all_dtypes(include_complex=False,
682: dtypes(*(get_all_dtypes(include_complex=False, include_bool=False, include_half=False, include_bfloat16=False)))
739: dtypesIfCPU(*set(get_all_dtypes()) - {torch.complex64, torch.complex128})
740: dtypes(*set(get_all_dtypes()) - {torch.bfloat16, torch.complex64, torch.complex128})
799: dtypesIfCPU(*set(get_all_dtypes()) - {torch.complex64, torch.complex128})
800: dtypes(*set(get_all_dtypes()) - {torch.bfloat16, torch.complex64, torch.complex128})
```
</p>
</details>
<details>
<summary>
`test/test_sparse.py`</summary>
<p>
```python
20:from torch.testing import get_all_complex_dtypes, get_all_fp_dtypes
29: floating_and_complex_types, floating_and_complex_types_and, get_all_dtypes, get_all_int_dtypes,
1963: return dtype in get_all_int_dtypes()
1994: dtypes(*get_all_dtypes(include_bool=False, include_half=False,
2103: return dtype in get_all_int_dtypes()
2138: dtypes(*get_all_dtypes(include_bool=False, include_half=False,
2626: all_sparse_dtypes = get_all_dtypes(include_complex=True)
2633: all_sparse_dtypes = get_all_dtypes(include_complex=True)
3230: dtypes(*get_all_complex_dtypes(),
3231: *get_all_fp_dtypes(include_half=False, include_bfloat16=False))
3234: *get_all_fp_dtypes(
```
</p>
</details>
<details>
<summary>
`test/test_sparse_csr.py`</summary>
<p>
```python
7:from torch.testing import get_all_complex_dtypes, get_all_fp_dtypes, floating_and_complex_types, make_tensor
17:from torch.testing._internal.common_dtype import floating_types, get_all_dtypes
120: dtypes(*get_all_dtypes())
133: dtypes(*get_all_dtypes())
150: dtypes(*get_all_dtypes())
180: dtypes(*get_all_dtypes())
201: dtypes(*get_all_dtypes())
210: dtypes(*get_all_dtypes())
225: dtypes(*get_all_dtypes())
244: dtypes(*get_all_dtypes())
263: dtypes(*get_all_dtypes())
285: dtypes(*get_all_dtypes())
411: dtypes(*get_all_dtypes())
482: dtypes(*get_all_dtypes())
502: dtypes(*get_all_dtypes())
562: dtypes(*get_all_dtypes())
588: dtypesIfCUDA(*get_all_complex_dtypes(),
589: *get_all_fp_dtypes(include_half=SM53OrLater, include_bfloat16=SM80OrLater))
745: dtypesIfCUDA(*get_all_complex_dtypes(),
746: *get_all_fp_dtypes(include_half=SM53OrLater and TEST_CUSPARSE_GENERIC,
765: dtypesIfCUDA(*get_all_complex_dtypes(),
766: *get_all_fp_dtypes(include_half=SM53OrLater and TEST_CUSPARSE_GENERIC,
801: *torch.testing.get_all_fp_dtypes(include_bfloat16=SM80OrLater,
841: *torch.testing.get_all_fp_dtypes(include_bfloat16=SM80OrLater,
1182: dtypes(*get_all_dtypes())
1276: dtypes(*get_all_dtypes(include_bool=False, include_half=False, include_bfloat16=False))
1286: dtypes(*get_all_dtypes())
```
</p>
</details>
<details>
<summary>
`test/test_tensor_creation_ops.py`</summary>
<p>
```python
21: onlyCUDA, skipCPUIf, dtypesIfCUDA, skipMeta, get_all_device_types)
23: get_all_dtypes, get_all_math_dtypes, get_all_int_dtypes, get_all_fp_dtypes, get_all_complex_dtypes
150: for dt in get_all_dtypes():
160: for dt in get_all_dtypes():
314: dtypes = [dtype for dtype in get_all_dtypes() if dtype != torch.bfloat16]
1012: dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False) +
1013: get_all_complex_dtypes()))
1032: dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False) +
1033: get_all_complex_dtypes()))
1050: dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False) +
1051: get_all_complex_dtypes()))
1745: dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes()))
1779: dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes()))
1868: dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes()))
1926: dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes()))
1954: do_test_empty_full(self, get_all_math_dtypes('cpu'), torch.strided, torch_device)
1956: do_test_empty_full(self, get_all_math_dtypes('cpu'), torch.strided, None)
1957: do_test_empty_full(self, get_all_math_dtypes('cpu'), torch.strided, torch_device)
2538: for device in get_all_device_types():
2645: for dtype in get_all_dtypes():
2678: dtypes(*(get_all_fp_dtypes(include_half=False, include_bfloat16=False) +
2679: get_all_complex_dtypes()))
2716: dtypes(*get_all_fp_dtypes(include_half=False, include_bfloat16=False))
2827: for dt in get_all_dtypes():
2913: dtypes(*get_all_dtypes(include_bool=False, include_half=False))
2914: dtypesIfCUDA(*get_all_dtypes(include_bool=False, include_half=True))
3028: dtypes(*(get_all_fp_dtypes() + get_all_complex_dtypes()))
3033: dtypes(*(get_all_fp_dtypes() + get_all_complex_dtypes()))
3074: dtypes(*get_all_dtypes(include_bool=False, include_half=False, include_complex=False))
3075: dtypesIfCUDA(*((get_all_int_dtypes() + [torch.float32, torch.float16, torch.bfloat16])
3077: else get_all_dtypes(include_bool=False, include_half=True, include_complex=False)))
3873: dtypes(*get_all_dtypes())
3884: dtypes(*get_all_dtypes(include_bool=False))
3916: for other in get_all_dtypes():
3922: dtypes(*get_all_dtypes())
3932: dtypes(*get_all_dtypes(include_bool=False))
3955: dtypes(*get_all_dtypes(include_bool=False))
3961: dtypes(*get_all_dtypes(include_bool=False))
3965: dtypes(*get_all_dtypes())
```
</p>
</details>
<details>
<summary>
`test/test_testing.py`</summary>
<p>
```python
25:from torch.testing._internal.common_dtype import get_all_dtypes
31: dtypes(*(get_all_dtypes(include_half=True, include_bfloat16=False,
```
</p>
</details>
<details>
<summary>
`test/test_torch.py`</summary>
<p>
```python
51: expectedAlertNondeterministic, get_all_device_types, skipXLA)
57: get_all_fp_dtypes, get_all_int_dtypes, get_all_math_dtypes, get_all_dtypes, get_all_complex_dtypes
296: for d in get_all_device_types():
323: for device in get_all_device_types():
324: for dt1 in get_all_dtypes():
325: for dt2 in get_all_dtypes():
343: all_dtypes = get_all_dtypes()
350: all_dtypes = get_all_dtypes()
781: for dtype in get_all_dtypes():
986: for device in get_all_device_types():
1017: for device in get_all_device_types():
1018: for dtype in get_all_math_dtypes(device):
2792: for device in get_all_device_types():
3186: dtypes(*get_all_dtypes())
3195: for error_dtype in get_all_dtypes():
3203: dtypes(*get_all_dtypes())
3212: for error_dtype in get_all_dtypes():
4539: dtypes(*get_all_fp_dtypes())
4545: dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes()))
4577: dtypes(*get_all_fp_dtypes(include_half=False, include_bfloat16=False))
4578: dtypesIfCPU(*(get_all_fp_dtypes(include_half=False, include_bfloat16=True)))
4579: dtypesIfCUDA(*(get_all_fp_dtypes(include_bfloat16=False)))
4599: dtypes(*(get_all_fp_dtypes(include_half=False, include_bfloat16=False)))
4600: dtypesIfCPU(*(get_all_dtypes(include_half=False, include_bfloat16=False, include_complex=False)))
4601: dtypesIfCUDA(*(get_all_dtypes(include_bfloat16=False, include_complex=False)))
4613: for p_dtype in get_all_fp_dtypes(include_half=device.startswith('cuda'), include_bfloat16=False):
4628: dtypes(*(get_all_fp_dtypes(include_half=False, include_bfloat16=False)))
4629: dtypesIfCUDA(*(get_all_fp_dtypes(include_bfloat16=False)))
4640: dtypes(*get_all_fp_dtypes())
4723: dtypes(*get_all_fp_dtypes())
4735: dtypes(*get_all_fp_dtypes(include_bfloat16=False))
4736: dtypesIfCUDA(*get_all_fp_dtypes())
4747: dtypes(*get_all_fp_dtypes())
4761: dtypes(*get_all_fp_dtypes())
4771: dtypes(*get_all_fp_dtypes())
4792: dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes()))
5302: dtypes(*get_all_dtypes(include_bfloat16=False))
5322: dtypes(*get_all_dtypes(include_half=False, include_bfloat16=False))
5323: dtypesIfCPU(*get_all_dtypes(include_bfloat16=False))
5324: dtypesIfCUDA(*get_all_dtypes(include_bfloat16=False))
5591: for dt in get_all_dtypes():
5611: for dt in get_all_dtypes():
5678: for dt in get_all_dtypes():
5696: dtypesIfCUDA(*set(get_all_math_dtypes('cuda')))
5697: dtypes(*set(get_all_math_dtypes('cpu')))
5746: dtypes(*get_all_dtypes())
5780: dtypes(*get_all_dtypes())
5885: dtypes(*get_all_dtypes())
5902: dtypes(*get_all_dtypes())
5945: dtypes(*get_all_dtypes())
5979: dtypes(*get_all_dtypes(include_bool=False))
6049: dtypes(*get_all_dtypes(include_bool=False))
6092: dtypes(*(get_all_fp_dtypes(include_bfloat16=False, include_half=False) +
6093: get_all_complex_dtypes()))
6094: dtypesIfCPU(*get_all_dtypes())
6095: dtypesIfCUDA(*get_all_dtypes())
6122: dtypes(*(get_all_fp_dtypes(include_bfloat16=False, include_half=False) +
6123: get_all_complex_dtypes()))
6124: dtypesIfCPU(*get_all_dtypes())
6125: dtypesIfCUDA(*get_all_dtypes())
6163: dtypes(*(get_all_fp_dtypes(include_bfloat16=False, include_half=False) +
6164: get_all_complex_dtypes()))
6165: dtypesIfCPU(*get_all_dtypes())
6166: dtypesIfCUDA(*get_all_dtypes())
6190: dtypes(*(get_all_complex_dtypes() +
6191: get_all_int_dtypes()))
6238: dtypes(*get_all_dtypes())
6323: dtypes(*get_all_dtypes())
6389: dtypes(*product(get_all_dtypes(), (torch.uint8, torch.bool)))
6699: dtypesIfCUDA(*set(get_all_math_dtypes('cuda')))
6700: dtypes(*set(get_all_math_dtypes('cpu')))
7452: dtypes(*get_all_dtypes(include_bool=False))
7461: dtypes(*get_all_dtypes(include_bool=False))
7477: dtypes(*get_all_dtypes(include_bool=False))
7496: dtypes(*get_all_dtypes(include_bool=False))
7538: dtypes(*get_all_dtypes(include_bool=False))
8162: dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes() +
8163: get_all_complex_dtypes()))
8175: dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes() +
8176: get_all_complex_dtypes()))
```
</p>
</details>
<details>
<summary>
`test/test_type_promotion.py`</summary>
<p>
```python
14: get_all_dtypes, get_all_math_dtypes, get_all_int_dtypes, get_all_fp_dtypes
187: for dtype in get_all_dtypes():
262: dtypes1 = get_all_math_dtypes('cuda')
263: dtypes2 = get_all_math_dtypes(device)
339: dtypes(*itertools.product(get_all_dtypes(), get_all_dtypes()))
468: for dt1 in get_all_math_dtypes(device):
469: for dt2 in get_all_math_dtypes(device):
519: for dt1 in get_all_math_dtypes(device):
520: for dt2 in get_all_math_dtypes(device):
528: for dt in get_all_math_dtypes(device):
561: for dtype in get_all_dtypes():
766: dtypes=get_all_math_dtypes(device))
771: dtypes=get_all_math_dtypes(device))
782: dtypes=get_all_math_dtypes(device))
879: dtypes = get_all_dtypes(include_bfloat16=False)
898: dtypes = get_all_dtypes(include_bfloat16=False, include_bool=False)
965: dtypesIfCUDA(*itertools.product(get_all_dtypes(include_bfloat16=False, include_complex=False),
966: get_all_dtypes(include_bfloat16=False, include_complex=False)))
967: dtypes(*itertools.product(get_all_dtypes(include_half=False, include_bfloat16=False,
969: get_all_dtypes(include_half=False, include_bfloat16=False,
976: return dtype in get_all_int_dtypes() + [torch.bool]
979: return dtype in get_all_fp_dtypes(include_half=True, include_bfloat16=False)
```
</p>
</details>
<details>
<summary>
`test/test_unary_ufuncs.py`</summary>
<p>
```python
24: floating_types_and, all_types_and_complex_and, floating_and_complex_types_and, get_all_dtypes, get_all_math_dtypes,
25: get_all_int_dtypes, get_all_fp_dtypes, get_all_complex_dtypes
517: dtypes(*(get_all_int_dtypes() + [torch.bool] +
518: get_all_fp_dtypes(include_bfloat16=False)))
596: dtypes(*get_all_fp_dtypes(include_half=True, include_bfloat16=False))
611: invalid_input_dtypes = get_all_int_dtypes() + \
612: get_all_complex_dtypes() + \
619: for dtype in get_all_fp_dtypes(include_half=True, include_bfloat16=False):
1048: dtypes(*get_all_math_dtypes('cpu'))
1182: dtypesIfCUDA(*get_all_fp_dtypes())
1190: dtypesIfCUDA(*get_all_fp_dtypes())
1205: dtypesIfCUDA(*get_all_fp_dtypes())
1215: dtypesIfCUDA(*get_all_fp_dtypes())
1307: dtypes(*(get_all_dtypes(include_bool=False)))
1349: dtypes(*(get_all_fp_dtypes(include_half=False) +
1350: get_all_complex_dtypes()))
1351: dtypesIfCUDA(*(get_all_fp_dtypes(include_half=True) +
1352: get_all_complex_dtypes()))
```
</p>
</details>
<details>
<summary>
`test/test_view_ops.py`</summary>
<p>
```python
19: get_all_dtypes, get_all_int_dtypes, get_all_fp_dtypes, get_all_complex_dtypes
124: dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes()))
131: dtypes(*get_all_dtypes(include_bfloat16=False))
213: for view_dtype in [*get_all_fp_dtypes(), *get_all_complex_dtypes()]:
220: dtypes(*get_all_dtypes())
224: for view_dtype in get_all_dtypes():
305: dtypes(*get_all_complex_dtypes(include_complex32=True))
343: dtypes(*get_all_dtypes())
354: dtypes(*get_all_dtypes())
364: dtypes(*get_all_dtypes())
374: dtypes(*get_all_dtypes())
384: dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes()))
395: dtypes(*get_all_complex_dtypes())
426: dtypes(*get_all_complex_dtypes())
451: dtypes(*product(get_all_complex_dtypes(), get_all_dtypes()))
1263: dtypes(*(torch.testing.get_all_dtypes()))
1279: dtypes(*(torch.testing.get_all_dtypes()))
1405: dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False) +
1406: get_all_complex_dtypes()))
1471: dtypes(*get_all_dtypes(include_bfloat16=False))
1574: dtypes(*get_all_dtypes())
1601: dtypes(*get_all_dtypes(include_bfloat16=False))
1632: dtypes(*get_all_dtypes(include_bfloat16=False))
1711: for dt in get_all_dtypes():
1717: for dt in get_all_dtypes():
1724: for dt in get_all_dtypes():
```
</p>
</details>
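For orientation, a rough correspondence between the deprecated getters and the `torch.testing._internal.common_dtype` helpers used as replacements (my sketch; the exact substitution varies per call site):
```python
import torch
from torch.testing._internal.common_dtype import (
    all_types_and_complex_and,
    floating_types_and,
    integral_types,
)

# get_all_dtypes()     ~ all_types_and_complex_and(torch.half, torch.bfloat16, torch.bool)
# get_all_fp_dtypes()  ~ floating_types_and(torch.half, torch.bfloat16)
# get_all_int_dtypes() ~ integral_types()
print(all_types_and_complex_and(torch.half, torch.bfloat16, torch.bool))
```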
I'm looking forward to your viewpoints. Thanks :)
cc: mruberry kshitij12345 anjali411
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71561
Reviewed By: samdow
Differential Revision: D34856571
Pulled By: mruberry
fbshipit-source-id: 0dca038bcad5cf69906245c496d2e61ac3876335
(cherry picked from commit b058f67b4313143efa714ab105f36e74083131b9)
2022-03-15 20:31:41 +00:00
Khushi Agrawal
905efa82ff
[fix] torch.broadcast_shapes should not handle shapes with negative dimensions. ( #72999 )
...
Summary:
Hi,
The PR fixes https://github.com/pytorch/pytorch/issues/68957 . It aims to include the following:
- Fixes the code in `torch/functional.py`.
- Add the missing tests for negative input values and non-iterable inputs.
~#### TODO~
~- [x] Add OpInfo~
EDIT: `broadcast_shapes` doesn't take any tensor inputs, so we don't need an OpInfo here. Thanks, kshitij12345, for the guidance.
#### Earlier
```python
>>> shapes = [1, -12]
>>> torch.broadcast_shapes(*shapes)
torch.Size([-12]) # MUST RAISE ERROR
```
#### Now
```python
>>> shapes = [1, -12]
>>> torch.broadcast_shapes(*shapes)
RuntimeError: Trying to create tensor with negative dimension -12: [-12]
```
#### NumPy's Output
```python
>>> shapes = [1, -12]
>>> numpy.broadcast_shapes(*shapes)
ValueError: negative dimensions are not allowed
```
#### `torch.broadcast_tensor()` Output
As mentioned in the [doc](https://pytorch.org/docs/stable/generated/torch.broadcast_shapes.html ):
```python
>>> shapes = [1, -12]
>>> torch.broadcast_tensors(*map(torch.empty, shapes))[0].shape
RuntimeError: Trying to create tensor with negative dimension -12: [-12]
```
Looking forward to your feedback and questions. Thanks! :)
cc: mruberry kshitij12345
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72999
Reviewed By: albanD
Differential Revision: D34543995
Pulled By: ngimel
fbshipit-source-id: e32b1f266500a5e002c8f353b1e02f44c23d4f6e
(cherry picked from commit a6253ce6bb8455a3c89398f12b7d790a0b7e8d95)
2022-03-03 18:33:06 +00:00
Philip Meier
0973c5a1cc
align signature of make_tensor with other creation ops ( #72702 )
...
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72702
Test Plan: Imported from OSS
Reviewed By: mrshenli
Differential Revision: D34457729
Pulled By: mruberry
fbshipit-source-id: 83d580c4201eef946dc9cf4b9e28a3d36be55609
(cherry picked from commit aa4cf20fbeb4b795595729b8ac2e6ba7707d8283)
2022-02-25 06:30:31 +00:00
Philip Meier
1f74e082e2
only compare attributes for meta tensors ( #72508 )
...
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72508
Todo:
- [x] document this behavior
- [x] add tests
Test Plan: Imported from OSS
Reviewed By: zou3519
Differential Revision: D34262452
Pulled By: ezyang
fbshipit-source-id: bc5c9653d5c3ad5c6efccc9c8e0efc0d28e15104
(cherry picked from commit 233142c88e )
2022-02-17 02:33:08 +00:00
anjali411
de8d0203e9
Allow torch.Tensor.real on real-valued tensors ( #71718 )
...
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71718
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D33770668
Pulled By: anjali411
fbshipit-source-id: bad21ebe72220b9017a0b8efa71eaeab84bd9e9f
(cherry picked from commit aa0a922757 )
2022-01-25 22:30:48 +00:00
soulitzer
6078e12ad6
Add forward AD support for as_strided ( #68629 )
...
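For context, a minimal forward-mode AD sketch (not from this change) showing a tangent flowing through `as_strided` via the dual-number API:
```python
import torch
import torch.autograd.forward_ad as fwAD

primal = torch.randn(4)
tangent = torch.ones(4)
with fwAD.dual_level():
    dual = fwAD.make_dual(primal, tangent)
    out = dual.as_strided((2, 2), (2, 1))
    jvp = fwAD.unpack_dual(out).tangent
    print(jvp)  # the all-ones tangent, viewed with the same size and strides
```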
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/68629
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D32899680
Pulled By: soulitzer
fbshipit-source-id: b80ba4483c06108938923f17dc67278b854515ef
2021-12-14 04:33:05 -08:00
soulitzer
0dcbd73eee
Add some forward AD formulas ( #69384 )
...
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69384
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D33020602
Pulled By: soulitzer
fbshipit-source-id: a92dd243f2b5b21fe277b0bb17bcd61dfe5a0d67
2021-12-12 00:11:11 -08:00
Kurt Mohler
0420545639
Enable all dtype combinations in torch.Tensor.view(dtype) ( #66493 )
...
Summary:
Fixes https://github.com/pytorch/pytorch/issues/29013
Note: This PR does not enable autograd. This can be done in a future PR.
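An example of a newly allowed combination (illustrative; per the `Tensor.view(dtype)` docs, the last dimension scales with the element-size ratio):
```python
import torch

x = torch.zeros(2, 2, dtype=torch.float32)
y = x.view(torch.int16)  # 4-byte elements reinterpreted as 2-byte elements
print(y.shape, y.dtype)  # torch.Size([2, 4]) torch.int16
```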
cc mruberry rgommers
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66493
Reviewed By: gchanan
Differential Revision: D32314680
Pulled By: mruberry
fbshipit-source-id: 69d325573b2331f32b83c05c91ffbe80571e7ae2
2021-11-11 13:55:21 -08:00
soulitzer
83e8612d11
Clean up test autograd ( #67413 )
...
Summary:
Partially fixes https://github.com/pytorch/pytorch/issues/66066
This PR:
- cleans up op-specific testing from test_autograd. test_autograd should be reserved for testing generic autograd functionality
- tests related to an operator are better colocated
- see the tracker for details
What to think about when moving tests to their correct test suite:
- naming: make sure it's not too generic
- how the test is parametrized, sometimes we need to add/remove a device/dtype parameter
- can this be merged with existing tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67413
Reviewed By: jbschlosser, albanD
Differential Revision: D32031480
Pulled By: soulitzer
fbshipit-source-id: 8e13da1e58a38d5cecbfdfd4fe2b4fe6f816897f
2021-11-03 15:26:09 -07:00
kshitij12345
885a8e53ba
replace onlyOnCPUAndCUDA with onlyNativeDeviceTypes ( #65201 )
...
Summary:
Reference https://github.com/pytorch/pytorch/issues/53849
Replace `onlyOnCPUandCUDA` with `onlyNativeDeviceTypes`, which includes `cpu`, `cuda`, and `meta`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65201
Reviewed By: mrshenli
Differential Revision: D31299718
Pulled By: mruberry
fbshipit-source-id: 2d8356450c035d6a314209ab51b2c237583920fd
2021-11-01 09:22:34 -07:00
Jane Xu
8a65047acc
[skip ci] Set test owners for everything considered with module: tests ( #66865 )
...
Summary:
Action following https://github.com/pytorch/pytorch/issues/66232
cc mruberry
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66865
Reviewed By: anjali411
Differential Revision: D31771147
Pulled By: janeyx99
fbshipit-source-id: 8bebe5ac2098364ef1ee93b590abb5f4455b0f89
2021-10-20 09:37:03 -07:00
anjali411
035310c574
Handle shared memory cases in MathBitFallback ( #63602 )
...
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63602
This PR fixes the case when a read and write is performed on a memory shared between mutable and (or) non-mutable arguments. Example:
```
a=torch.tensor([1+1j])
b=a.conj()
b.add_(a) # should return tensor([2]) but returns tensor([2-2j])
```
The issue here is that in the conjugate fallback, we resolve the conjugation in-place for mutable arguments which can be a problem as shown above in the case when other input arguments share memory with the mutable argument(s).
This PR fixes this issue by:
1. First scanning through the operator's input arguments and creating a vector of the mutable arguments that have the conj bit set to `True` (and accordingly setting the flag `check_for_alias_with_mut_arg` to `True` or `False`).
2. Iterating through all the arguments. At this point we only look at the non-mutable arguments. If `check_for_alias_with_mut_arg` is set to `True`, then we iterate through `mutable_inputs` to check whether the current arg tensor aliases any of its entries. If it does, we clone the non-mutable tensor arg; otherwise we resolve the conjugation as before.
3. Now we look through the mutable_inputs vector (which contains only mutable input tensors with conj bit set to `True`). We in-place conjugate each of the entries in the vector.
4. Do the computation.
5. Re-conjugate the mutable argument tensors.
NOTE: `TensorLists` are not fully handled in ConjugateFallback. Please see the in-line comment for more details.
Fixes https://github.com/pytorch/pytorch/issues/59943
Test Plan: Imported from OSS
Reviewed By: gmagogsfm
Differential Revision: D30466905
Pulled By: anjali411
fbshipit-source-id: 58058e5e6481da04a12d03f743c1491942a6cc9b
2021-10-13 13:39:31 -07:00
lezcano
82a216c45b
Add tensor.{adjoint(),H,mT,mH} methods and properties ( #64179 )
...
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64179
This PR follows the discussion in https://github.com/pytorch/pytorch/issues/45063#issuecomment-904431478
Fixes https://github.com/pytorch/pytorch/issues/45063
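The equivalences in a nutshell (my summary, not from the PR):
```python
import torch

x = torch.randn(2, 3, 4, dtype=torch.complex64)
print(torch.equal(x.mT, x.transpose(-2, -1)))         # True: batched matrix transpose
print(torch.equal(x.mH, x.transpose(-2, -1).conj()))  # True: batched conjugate transpose
print(torch.equal(x.adjoint(), x.mH))                 # True: adjoint() is the method form
```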
cc ezyang anjali411 dylanbespalko mruberry Lezcano nikitaved rgommers pmeier asmeurer leofang AnirudhDagar asi1024 emcastillo kmaehashi heitorschueroff
Test Plan: Imported from OSS
Reviewed By: bertmaher
Differential Revision: D30730483
Pulled By: anjali411
fbshipit-source-id: 821d25083f5f682450f6812bf852dc96a1cdf9f2
2021-10-13 07:44:43 -07:00
=
b7adb3350a
Add crow_/col_indices to view types ( #63176 )
...
Summary:
Fixes https://github.com/pytorch/pytorch/issues/61103
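For illustration (a sketch, not from this change), the accessors being registered as view ops:
```python
import torch

crow = torch.tensor([0, 2, 3])
col = torch.tensor([0, 1, 1])
vals = torch.tensor([1.0, 2.0, 3.0])
csr = torch.sparse_csr_tensor(crow, col, vals, size=(2, 2))
print(csr.crow_indices())  # tensor([0, 2, 3])
print(csr.col_indices())   # tensor([0, 1, 1])
```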
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63176
Reviewed By: malfet, albanD
Differential Revision: D30315882
Pulled By: cpuhrsch
fbshipit-source-id: eedae5265a757ed68fd69e4f9d07070b05de4bd8
2021-09-20 14:35:58 -07:00
Philip Meier
26b7ff5aea
deprecate dtype getters from torch.testing namespace ( #63554 )
...
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63554
Following https://github.com/pytorch/pytorch/pull/61840#issuecomment-884087809 , this deprecates all the dtype getters publicly exposed in the `torch.testing` namespace. The reason for this is twofold:
1. If someone is not familiar with the C++ dispatch macros PyTorch uses, the names are misleading. For example, `torch.testing.floating_types()` will only give you `float32` and `float64`, skipping `float16` and `bfloat16`.
2. The dtype getters provide very minimal functionality that can be easily emulated by downstream libraries.
We thought about [providing a replacement](https://gist.github.com/pmeier/3dfd2e105842ad0de4505068a1a0270a ), but ultimately decided against it. The major problem is BC: by keeping it, either the namespace gets messy again after a new dtype is added, or we need to somehow version the return values of the getters.
Test Plan: Imported from OSS
Reviewed By: H-Huang
Differential Revision: D30662206
Pulled By: mruberry
fbshipit-source-id: a2bdb10ab02ae665df1b5b76e8afa9af043bbf56
2021-09-07 08:58:51 -07:00
Kushashwa Ravi Shrimali
d37636901e
[Doc] make_tensor to torch.testing module ( #63925 )
...
Summary:
This PR aims to add `make_tensor` to the `torch.testing` module in PyTorch docs.
TODOs:
* [x] Add examples
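A usage sketch of the documented helper (illustrative):
```python
import torch
from torch.testing import make_tensor

t = make_tensor((2, 3), dtype=torch.float32, device="cpu", low=-1, high=1)
print(t.shape, t.dtype)  # torch.Size([2, 3]) torch.float32
```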
cc: pmeier mruberry brianjo
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63925
Reviewed By: ngimel
Differential Revision: D30633487
Pulled By: mruberry
fbshipit-source-id: 8e5a1f880c6ece5925b4039fee8122bd739538af
2021-08-30 12:25:40 -07:00
Shen Li
1022443168
Revert D30279364: [codemod][lint][fbcode/c*] Enable BLACK by default
...
Test Plan: revert-hammer
Differential Revision:
D30279364 (b004307252 )
Original commit changeset: c1ed77dfe43a
fbshipit-source-id: eab50857675c51e0088391af06ec0ecb14e2347e
2021-08-12 11:45:01 -07:00
Zsolt Dollenstein
b004307252
[codemod][lint][fbcode/c*] Enable BLACK by default
...
Test Plan: manual inspection & sandcastle
Reviewed By: zertosh
Differential Revision: D30279364
fbshipit-source-id: c1ed77dfe43a3bde358f92737cd5535ae5d13c9a
2021-08-12 10:58:35 -07:00
anjali411
143ef016ee
Throw RuntimeError when numpy() is called on a tensor with conjugate or negative bit set ( #61925 )
...
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61925
Resolves https://github.com/pytorch/pytorch/issues/59945 and https://github.com/pytorch/pytorch/issues/59946
BC-breaking note: Unlike before, complex_tensor.conj().numpy(), complex_float_tensor.conj().view(torch.float64), and complex_float_tensor.conj().imag.view(torch.int32) now don't return views but instead error out
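A condensed repro of the new behavior (illustrative):
```python
import torch

z = torch.tensor([1 + 1j])
try:
    z.conj().numpy()
except RuntimeError:
    print("conjugated view: numpy() now raises")
print(z.conj().resolve_conj().numpy())  # [1.-1.j] -- materialize first, then convert
```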
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D29819288
Pulled By: anjali411
fbshipit-source-id: 4bebec721eb535f44ef4b728bdc75fa444e05d16
2021-07-23 11:28:36 -07:00
Laurence Rouesnel
adb73d3dcf
Removed overhead from reshape() call if tensor doesn't need to be changed ( #61466 )
...
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61466
## Goal
Per #55126 the performance of `reshape` is worse than `alias` in cases where they are performing the same operation (i.e. where reshape is returning a view) because `reshape` delegates to `view` and duplicates some of the operations (specifically `infer_size_dv` and `computeStride`).
The goal of this pull-request is to reduce or remove the additional overhead that `reshape` has.
### Proposed Implementation
Instead of using `view`, we implement a private/internal operator (`_reshape_alias`) that `reshape` dispatches to, which skips the relevant checks. This is functionally equivalent to `as_strided`; however, it is a lot simpler because it's specialized to this use case, and importantly the `backward` implementation is a lot faster.
Note that we have to dispatch (`reshape` is a composite operator) because `reshape` can return either a view or a copy of the Tensor depending on the parameters, and this complicates implementing a derivative/backward for `reshape`.
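To make the view-vs-copy point concrete, a small sketch (mine; `_base` is an internal attribute, used here only for inspection):
```python
import torch

x = torch.randn(2, 2)
print(x.reshape(-1)._base is x)      # True: contiguous input, reshape returns a view
print(x.t().reshape(-1)._base is x)  # False: strided input, reshape has to copy
```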
### Why not `as_strided`?
Using `as_strided` directly slows down autograd. If we use a custom function equivalent to `_reshape_alias` but with a simpler backward function then `view` has the same performance as `reshape`. If we delegate to `as_strided` it is about 56% slower (and this holds against our custom function).
This is also the reason we make an internal operator named `_reshape_alias` instead of exposing a new operator since this should only be used in the `reshape` case and it is effectively a more limited version of `view`, `alias`, and `as_strided`.
## Benchmarks
In a micro-benchmark for `backward` running:
```cpp
// Setup
at::Tensor x=torch::empty({2,2}, torch::requires_grad(true));
// Benchmark loop
// `reshape(-1)` replaced with a call to view(-1) for view baseline
x.pow(4).reshape(-1).mean().backward();
```
I also benchmarked simple operations without gradients using:
```cpp
// Setup
at::Tensor x=torch::empty({2,2}, torch::requires_grad(true));
// Benchmark loop
x.reshape(-1) // replaced with a call to view(-1) for view baseline
```
Baselined to `view`:
* Original `reshape`: `+3.3%` (without gradients `+20.8%`)
* Using `as_strided`: `+55.1%` (without gradients `+1.0%`)
* Using custom `_reshape_view`: `-1.0%` (without gradients `+6.2%`)
In absolute terms (note the percentages above were generated comparing between runs/tests rather than to a single baseline):
* Original `view`: `53.66 us` (without gradients `582.78 ns`)
* Original `reshape`: `55.46 us` (without gradients `704.24 ns`)
* Using `as_strided`: `83.24 us` (without gradients `576.49 ns`)
* Using custom `_reshape_view`: `53.13 us` (without gradients `536.01 ns`)
Note that these benchmarks perform a backwards operation as well. When compared without any gradient computation, the performance differences are more pronounced, as the reshape itself takes up more of the time.
### Original performance
<details>
<summary>Benchmark results</summary>
```
[<torch.utils.benchmark.utils.common.Measurement object at 0x7f0e4d393160>
x.pow(4).view(-1).mean().backward();
setup: at::Tensor x=torch::empty({2,2}, torch::requires_grad(true));
Median: 53.66 us
IQR: 2.70 us (52.54 to 55.24)
884 measurements, 100 runs per measurement, 1 thread]
[<torch.utils.benchmark.utils.common.Measurement object at 0x7f0e2ebd4fa0>
x.pow(4).reshape(-1).mean().backward();
setup: at::Tensor x=torch::empty({2,2}, torch::requires_grad(true));
Median: 55.46 us
IQR: 2.61 us (54.39 to 57.01)
889 measurements, 100 runs per measurement, 1 thread]
2276116
2286256
<torch.utils.benchmark.utils.valgrind_wrapper.timer_interface.FunctionCounts object at 0x7f0e5b2e3e20>
2640 ???:at::detail::computeStride(c10::ArrayRef<long>, c10::ArrayRef<long>, c10::SmallVector<long, 5u> const&)
1920 ???:at::native::reshape(at::Tensor const&, c10::ArrayRef<long>)
1520 ???:at::_ops::reshape::call(at::Tensor const&, c10::ArrayRef<long>)
1040 ???:c10::SmallVectorImpl<long>::operator=(c10::SmallVectorImpl<long>&&)
980 ???:void at::infer_size_impl<c10::SmallVector<long, 5u> >(c10::ArrayRef<long>, long, c10::SmallVector<long, 5u>&)
720 ???:__tls_get_addr
520 ???:at::shouldRunRecordFunction(bool*)
520 ???:__memcpy_avx_unaligned_erms
200 ???:c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoFunctor_<c10:: ... g>)>::call(c10::OperatorKernel*, c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>)
100 ???:c10::TensorImpl::strides() const
100 ???:c10::TensorImpl::sizes() const
100 ???:at::(anonymous namespace)::manager()
77 /tmp/benchmark_utils_jit_build__1626465284__8a34e7ff-cd37-4a82-be28-7f19e081e771/timer_cpp_7815557938202456331/timer_src.cpp:main
40 ???:c10::TensorImpl::numel() const
-77 /tmp/benchmark_utils_jit_build__1626465284__8a34e7ff-cd37-4a82-be28-7f19e081e771/timer_cpp_8055217880649990171/timer_src.cpp:main
-260 ???:at::native::view(at::Tensor const&, c10::ArrayRef<long>)
Total: 10140
```
```
[<torch.utils.benchmark.utils.common.Measurement object at 0x7f850dd66c10>
x.view(-1);
setup: at::Tensor x=torch::empty({2,2});
Median: 582.78 ns
IQR: 33.80 ns (573.80 to 607.61)
833 measurements, 10000 runs per measurement, 1 thread]
[<torch.utils.benchmark.utils.common.Measurement object at 0x7f850de31e20>
x.reshape(-1);
setup: at::Tensor x=torch::empty({2,2});
Median: 704.24 ns
IQR: 24.42 ns (697.20 to 721.62)
679 measurements, 10000 runs per measurement, 1 thread]
56896
67036
<torch.utils.benchmark.utils.valgrind_wrapper.timer_interface.FunctionCounts object at 0x7f84e1930bb0>
2640 ???:at::detail::computeStride(c10::ArrayRef<long>, c10::ArrayRef<long>, c10::SmallVector<long, 5u> const&)
1920 ???:at::native::reshape(at::Tensor const&, c10::ArrayRef<long>)
1520 ???:at::_ops::reshape::call(at::Tensor const&, c10::ArrayRef<long>)
1040 ???:c10::SmallVectorImpl<long>::operator=(c10::SmallVectorImpl<long>&&)
980 ???:void at::infer_size_impl<c10::SmallVector<long, 5u> >(c10::ArrayRef<long>, long, c10::SmallVector<long, 5u>&)
720 ???:__tls_get_addr
520 ???:at::shouldRunRecordFunction(bool*)
520 ???:__memcpy_avx_unaligned_erms
200 ???:c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoFunctor_<c10:: ... g>)>::call(c10::OperatorKernel*, c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>)
100 ???:c10::TensorImpl::strides() const
100 ???:c10::TensorImpl::sizes() const
100 ???:at::(anonymous namespace)::manager()
76 /tmp/benchmark_utils_jit_build__1626466038__15fbbac0-2072-4459-8f8e-08121a905b99/timer_cpp_547407365342278353/timer_src.cpp:main
40 ???:c10::TensorImpl::numel() const
-76 /tmp/benchmark_utils_jit_build__1626466038__15fbbac0-2072-4459-8f8e-08121a905b99/timer_cpp_3457873755756181226/timer_src.cpp:main
-260 ???:at::native::view(at::Tensor const&, c10::ArrayRef<long>)
Total: 10140
```
</details>
### Using `as_strided`
<details>
<summary>Benchmark results</summary>
```
[<torch.utils.benchmark.utils.common.Measurement object at 0x7f8b13bb5b50>
x.pow(4).view(-1).mean().backward();
setup: at::Tensor x=torch::empty({2,2}, torch::requires_grad(true));
Median: 53.37 us
IQR: 3.15 us (51.73 to 54.88)
936 measurements, 100 runs per measurement, 1 thread]
[<torch.utils.benchmark.utils.common.Measurement object at 0x7f8af55f8490>
x.pow(4).reshape(-1).mean().backward();
setup: at::Tensor x=torch::empty({2,2}, torch::requires_grad(true));
Median: 83.24 us
IQR: 4.05 us (81.20 to 85.25)
609 measurements, 100 runs per measurement, 1 thread]
2267916
2525061
<torch.utils.benchmark.utils.valgrind_wrapper.timer_interface.FunctionCounts object at 0x7f8af55f8e50>
31930 ???:_int_free
15940 ???:malloc
11595 ???:_int_malloc
10100 ???:torch::autograd::generated::details::as_strided_backward(at::Tensor, at::TensorGeometry, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::optional<long>)
9360 ???:__tls_get_addr
8280 ???:free
8100 ???:torch::autograd::VariableType::(anonymous namespace)::as_strided(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::optional<long>)
4520 ???:c10::intrusive_ptr<c10::TensorImpl, c10::UndefinedTensorImpl>::reset_()
4080 ???:operator new(unsigned long)
...
-780 ???:at::_ops::view::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>)
-920 ???:c10::SmallVectorImpl<long>::operator=(c10::SmallVectorImpl<long> const&)
-1220 ???:torch::autograd::generated::ViewBackward::apply(std::vector<at::Tensor, std::allocator<at::Tensor> >&&)
-1520 ???:at::_ops::view::call(at::Tensor const&, c10::ArrayRef<long>)
-1580 ???:torch::ADInplaceOrView::(anonymous namespace)::view(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>)
-1680 ???:at::Tensor at::native::alias_with_sizes_and_strides<c10::SmallVector<long, 5u> >(at::Tensor const&, c10::SmallVector<long, 5u> const&, c10::SmallVector<long, 5u> const&)
-2560 ???:at::detail::computeStride(c10::ArrayRef<long>, c10::ArrayRef<long>, c10::SmallVector<long, 5u> const&)
-2640 ???:at::native::view(at::Tensor const&, c10::ArrayRef<long>)
-4860 ???:torch::autograd::VariableType::(anonymous namespace)::view(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>)
Total: 257145
```
```
[<torch.utils.benchmark.utils.common.Measurement object at 0x7f93176a0160>
x.view(-1);
setup: at::Tensor x=torch::empty({2,2});
Median: 570.55 ns
IQR: 32.69 ns (552.87 to 585.56)
874 measurements, 10000 runs per measurement, 1 thread]
[<torch.utils.benchmark.utils.common.Measurement object at 0x7f92f8f29490>
x.reshape(-1);
setup: at::Tensor x=torch::empty({2,2});
Median: 576.49 ns
IQR: 37.95 ns (559.51 to 597.46)
861 measurements, 10000 runs per measurement, 1 thread]
56896
58556
<torch.utils.benchmark.utils.valgrind_wrapper.timer_interface.FunctionCounts object at 0x7f932556ca60>
2140 ???:at::native::reshape(at::Tensor const&, c10::ArrayRef<long>)
1940 ???:torch::autograd::VariableType::(anonymous namespace)::as_strided(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::optional<long>)
1880 ???:torch::ADInplaceOrView::(anonymous namespace)::as_strided(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::optional<long>)
1720 ???:at::_ops::as_strided::call(at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::optional<long>)
1520 ???:at::_ops::reshape::call(at::Tensor const&, c10::ArrayRef<long>)
1400 ???:at::native::as_strided_tensorimpl(at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::optional<long>)
1260 ???:at::_ops::as_strided::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::optional<long>)'2
1260 ???:at::_ops::as_strided::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::optional<long>)
980 ???:void at::infer_size_impl<c10::SmallVector<long, 5u> >(c10::ArrayRef<long>, long, c10::SmallVector<long, 5u>&)
...
-620 ???:at::Tensor c10::Dispatcher::redispatch<at::Tensor, at::Tensor const&, c10::ArrayRef<long ... ::ArrayRef<long>)> const&, c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>) const
-780 ???:at::_ops::view::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>)'2
-780 ???:at::_ops::view::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>)
-920 ???:c10::SmallVectorImpl<long>::operator=(c10::SmallVectorImpl<long> const&)
-1520 ???:at::_ops::view::call(at::Tensor const&, c10::ArrayRef<long>)
-1580 ???:torch::ADInplaceOrView::(anonymous namespace)::view(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>)
-1680 ???:at::Tensor at::native::alias_with_sizes_and_strides<c10::SmallVector<long, 5u> >(at::Tensor const&, c10::SmallVector<long, 5u> const&, c10::SmallVector<long, 5u> const&)
-1740 ???:torch::autograd::VariableType::(anonymous namespace)::view(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>)
-2640 ???:at::native::view(at::Tensor const&, c10::ArrayRef<long>)
Total: 1660
```
</details>
### Using custom function (`_reshape_alias`)
<details>
<summary>Benchmark results</summary>
```
[<torch.utils.benchmark.utils.common.Measurement object at 0x7f16861d6b50>
x.pow(4).view(-1).mean().backward();
setup: at::Tensor x=torch::empty({2,2}, torch::requires_grad(true));
Median: 53.50 us
IQR: 2.64 us (52.32 to 54.96)
906 measurements, 100 runs per measurement, 1 thread]
[<torch.utils.benchmark.utils.common.Measurement object at 0x7f1667b2ed60>
x.pow(4).reshape(-1).mean().backward();
setup: at::Tensor x=torch::empty({2,2}, torch::requires_grad(true));
Median: 53.13 us
IQR: 3.40 us (51.72 to 55.13)
914 measurements, 100 runs per measurement, 1 thread]
2269736
2273236
<torch.utils.benchmark.utils.valgrind_wrapper.timer_interface.FunctionCounts object at 0x7f1693f8dc10>
5060 ???:torch::autograd::VariableType::(anonymous namespace)::_reshape_alias(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>)
2000 ???:at::native::reshape(at::Tensor const&, c10::ArrayRef<long>)
1780 ???:torch::ADInplaceOrView::(anonymous namespace)::_reshape_alias(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>)
1660 ???:at::_ops::_reshape_alias::call(at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>)
1600 ???:at::Tensor at::native::alias_with_sizes_and_strides<c10::ArrayRef<long> >(at::Tensor const&, c10::ArrayRef<long> const&, c10::ArrayRef<long> const&)
1520 ???:at::_ops::reshape::call(at::Tensor const&, c10::ArrayRef<long>)
1240 ???:at::_ops::_reshape_alias::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>)'2
1240 ???:at::_ops::_reshape_alias::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>)
1220 ???:torch::autograd::generated::AliasToShapeBackward::apply(std::vector<at::Tensor, std::allocator<at::Tensor> >&&)
...
-780 ???:at::_ops::view::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>)'2
-780 ???:at::_ops::view::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>)
-920 ???:c10::SmallVectorImpl<long>::operator=(c10::SmallVectorImpl<long> const&)
-1220 ???:torch::autograd::generated::ViewBackward::apply(std::vector<at::Tensor, std::allocator<at::Tensor> >&&)
-1520 ???:at::_ops::view::call(at::Tensor const&, c10::ArrayRef<long>)
-1580 ???:torch::ADInplaceOrView::(anonymous namespace)::view(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>)
-1680 ???:at::Tensor at::native::alias_with_sizes_and_strides<c10::SmallVector<long, 5u> >(at::Tensor const&, c10::SmallVector<long, 5u> const&, c10::SmallVector<long, 5u> const&)
-2640 ???:at::native::view(at::Tensor const&, c10::ArrayRef<long>)
-4860 ???:torch::autograd::VariableType::(anonymous namespace)::view(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>)
Total: 3500
```
```
[<torch.utils.benchmark.utils.common.Measurement object at 0x7f5287adfb20>
x.view(-1);
setup: at::Tensor x=torch::empty({2,2});
Median: 505.10 ns
IQR: 20.04 ns (500.41 to 520.45)
944 measurements, 10000 runs per measurement, 1 thread]
[<torch.utils.benchmark.utils.common.Measurement object at 0x7f526951b430>
x.reshape(-1);
setup: at::Tensor x=torch::empty({2,2});
Median: 536.01 ns
IQR: 17.81 ns (531.34 to 549.16)
916 measurements, 10000 runs per measurement, 1 thread]
56896
60376
<torch.utils.benchmark.utils.valgrind_wrapper.timer_interface.FunctionCounts object at 0x7f5295896c10>
2000 ???:at::native::reshape(at::Tensor const&, c10::ArrayRef<long>)
1860 ???:torch::autograd::VariableType::(anonymous namespace)::_reshape_alias(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>)
1780 ???:torch::ADInplaceOrView::(anonymous namespace)::_reshape_alias(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>)
1660 ???:at::_ops::_reshape_alias::call(at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>)
1600 ???:at::Tensor at::native::alias_with_sizes_and_strides<c10::ArrayRef<long> >(at::Tensor const&, c10::ArrayRef<long> const&, c10::ArrayRef<long> const&)
1520 ???:at::_ops::reshape::call(at::Tensor const&, c10::ArrayRef<long>)
1240 ???:at::_ops::_reshape_alias::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>)'2
1240 ???:at::_ops::_reshape_alias::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>)
980 ???:void at::infer_size_impl<c10::SmallVector<long, 5u> >(c10::ArrayRef<long>, long, c10::SmallVector<long, 5u>&)
...
-620 ???:at::Tensor c10::Dispatcher::redispatch<at::Tensor, at::Tensor const&, c10::ArrayRef<long ... ::ArrayRef<long>)> const&, c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>) const
-780 ???:at::_ops::view::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>)'2
-780 ???:at::_ops::view::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>)
-920 ???:c10::SmallVectorImpl<long>::operator=(c10::SmallVectorImpl<long> const&)
-1520 ???:at::_ops::view::call(at::Tensor const&, c10::ArrayRef<long>)
-1580 ???:torch::ADInplaceOrView::(anonymous namespace)::view(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>)
-1680 ???:at::Tensor at::native::alias_with_sizes_and_strides<c10::SmallVector<long, 5u> >(at::Tensor const&, c10::SmallVector<long, 5u> const&, c10::SmallVector<long, 5u> const&)
-1740 ???:torch::autograd::VariableType::(anonymous namespace)::view(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>)
-2640 ???:at::native::view(at::Tensor const&, c10::ArrayRef<long>)
Total: 3480
```
</details>
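To summarize what the three variants above alias, here is a rough Python-level illustration; this is a hedged sketch only (the real implementations are ATen/C++ internals, and `_reshape_alias` is an internal op not meant to be called directly, so plain `reshape` stands in for it here).
```python
import torch

x = torch.empty(2, 2, requires_grad=True)
y = x.pow(4)  # contiguous, non-leaf tensor, as in the benchmarks

# view: re-interprets the existing storage; backward is ViewBackward.
v = y.view(-1)

# as_strided: aliases the storage with explicit sizes/strides; its generic
# as_strided_backward is what shows up as the large regression above.
s = y.as_strided((y.numel(),), (1,))

# reshape on a contiguous tensor: with the custom op it aliases the storage
# like view, reusing the sizes/strides already computed in reshape instead
# of re-deriving them (see the computeStride deltas in the counts above).
r = y.reshape(-1)

# All three alias the same storage; no data is copied.
assert v.data_ptr() == s.data_ptr() == r.data_ptr() == y.data_ptr()
```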
Test Plan: Imported from OSS
Reviewed By: ejguan
Differential Revision: D29792126
Pulled By: laurencer
fbshipit-source-id: f0519b45b65f868aa3e8651679354558bd761dfd
2021-07-21 14:05:35 -07:00