Commit Graph

120 Commits

Author SHA1 Message Date
Anthony Shoumikhin
e2f9759bd0 Fix broken URLs (#152237)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/152237
Approved by: https://github.com/huydhn, https://github.com/malfet
2025-04-27 09:56:42 +00:00
cyy
df458be4e5 [4/N] Apply py39 ruff and pyupgrade fixes (#143257)
`torch/fx/passes/annotate_getitem_nodes.py` was changed to support the new type hinting annotations.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143257
Approved by: https://github.com/justinchuby, https://github.com/albanD
2025-01-04 10:47:51 +00:00
Tom Ritchford
d8c8ba2440 Fix unused Python variables in test/[e-z]* (#136964)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136964
Approved by: https://github.com/justinchuby, https://github.com/albanD
2024-12-18 23:02:30 +00:00
Natalia Gimelshein
05c3330893 use more elements per thread for narrow dtypes (#139449)
Fix a perf issue for narrow dtypes by accessing more elements per thread

Pull Request resolved: https://github.com/pytorch/pytorch/pull/139449
Approved by: https://github.com/Chillee, https://github.com/eqy
2024-11-14 22:50:16 +00:00
PyTorch MergeBot
adcff4bff0 Revert "use more elements per thread for narrow dtypes (#139449)"
This reverts commit d3fc13a9dd.

Reverted https://github.com/pytorch/pytorch/pull/139449 on behalf of https://github.com/ngimel due to breaks tests ([comment](https://github.com/pytorch/pytorch/pull/139449#issuecomment-2477012582))
2024-11-14 17:28:32 +00:00
zeshengzong
ffb7a08921 Fix torch.histc not checking min > max on cuda for int8 tensors (#139372)
Fixes #139360

86e6513c86/aten/src/ATen/native/cuda/SummaryOps.cu (L323-L324)

Assigning `min` and `max` to the low-precision `input_t` variables `minvalue` and `maxvalue` causes a wrong comparison result in the following check:

86e6513c86/aten/src/ATen/native/cuda/SummaryOps.cu (L353)

![image](https://github.com/user-attachments/assets/0d5c87f4-3dc6-48bb-bcc8-b1803e7cd487)

Change the type of `minvalue` and `maxvalue` to fix it, similar to these lines:

86e6513c86/aten/src/ATen/native/cuda/SummaryOps.cu (L280-L282)
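
A hypothetical repro sketch (not from the PR) of the failure mode, assuming range arguments outside int8 as in the linked issue:

```python
import torch

# Before the fix, casting `min`/`max` into low-precision int8 locals could wrap
# around, so the min > max sanity check could silently pass on CUDA.
if torch.cuda.is_available():
    x = torch.tensor([1, 2, 3], dtype=torch.int8, device="cuda")
    try:
        # min=200 does not fit in int8; after the fix, min > max raises a
        # RuntimeError instead of silently binning with wrapped values
        torch.histc(x, bins=4, min=200, max=100)
    except RuntimeError as e:
        print(e)
```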

**Test Result**
```bash
$ pytest test/test_reductions.py -vv
```
![image](https://github.com/user-attachments/assets/6b5d0d48-ebc2-4a8c-85f4-dbad147c086c)

```bash
$ lintrunner
```
![image](https://github.com/user-attachments/assets/f97c2d6d-78ea-4439-a1ba-907bc9defad7)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/139372
Approved by: https://github.com/eqy
2024-11-05 08:42:38 +00:00
Natalia Gimelshein
d3fc13a9dd use more elements per thread for narrow dtypes (#139449)
Fix a perf issue for narrow dtypes by accessing more elements per thread

Pull Request resolved: https://github.com/pytorch/pytorch/pull/139449
Approved by: https://github.com/Chillee, https://github.com/eqy
2024-11-04 16:43:33 +00:00
axel
f6e5d09682 Raise error for int64 and bool dtypes in nanmean, even for empty tensors (#138745)
This PR ensures that the `nanmean()` function raises a `RuntimeError` when using `int64` or `bool` dtypes, even for empty tensors. Previously, non-empty tensors correctly raised errors for unsupported dtypes, while empty tensors did not. This change brings consistent error handling for both cases.

Addresses the need raised in an issue by @hyperkai (Issue [#131043](https://github.com/pytorch/pytorch/issues/131043)).

### Changes

- Added checks in `nanmean_out()` to raise errors for `int64` and `bool` dtypes regardless of tensor size (see the sketch below).
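
A minimal sketch of the consistent behavior this PR enforces, via the public `torch.nanmean` entry point:

```python
import torch

# Integral and bool dtypes are rejected by nanmean() even when the tensor is
# empty; previously only the non-empty call raised.
for t in (torch.tensor([], dtype=torch.int64), torch.tensor([1, 2], dtype=torch.int64)):
    try:
        torch.nanmean(t)
    except RuntimeError as e:
        print(tuple(t.shape), "->", e)
```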

Pull Request resolved: https://github.com/pytorch/pytorch/pull/138745
Approved by: https://github.com/ezyang
2024-11-02 22:52:40 +00:00
Animesh Jain
b1acd0978e [dynamo] Support range_iterator as a function input (#138657)
Fixes https://github.com/pytorch/pytorch/issues/138654
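
A minimal sketch (with an assumed toy function, not from the PR) of the newly supported pattern:

```python
import torch

# Passing a live range iterator into a compiled function used to fail;
# after this change dynamo can handle it as a function input.
@torch.compile(backend="eager")
def f(x, it):
    return x + next(it)

it = iter(range(10))
print(f(torch.ones(3), it))  # consumes 0 from the iterator
```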

Pull Request resolved: https://github.com/pytorch/pytorch/pull/138657
Approved by: https://github.com/williamwen42, https://github.com/jansel
2024-10-24 03:49:26 +00:00
Benjamin Glass
f984b88718 Ensure noncontiguous tensor creation tests offsetting (#136396)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136396
Approved by: https://github.com/amjames, https://github.com/eellison
ghstack dependencies: #136055
2024-10-02 00:40:43 +00:00
PyTorch MergeBot
f21b471978 Revert "Fix numerical instability for norm (#129352)"
This reverts commit 66340e6751.

Reverted https://github.com/pytorch/pytorch/pull/129352 on behalf of https://github.com/atalman due to Breaks Internal CI ([comment](https://github.com/pytorch/pytorch/pull/129352#issuecomment-2379989485))
2024-09-27 20:18:47 +00:00
CaoE
66340e6751 Fix numerical instability for norm (#129352)
Fixes #123645
When the reduction size is large, reducing directly may exceed the range that FP32 can represent, resulting in incorrect results.
Reducing in groups and using double as the intermediate cumulative type avoids exceeding the representable range.
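
A hedged sketch (plain eager Python, not the actual kernel) of the strategy described above:

```python
import torch

# Reduce in groups and accumulate the partial results in double precision,
# instead of one long float32 accumulation that can lose precision or overflow.
x = torch.full((10_000_000,), 1e-4, dtype=torch.float32)
acc = torch.zeros((), dtype=torch.float64)
for chunk in torch.split(x, 65536):      # group-wise reduction
    acc += (chunk.double() ** 2).sum()   # double as the intermediate cumulative type
print(acc.sqrt())                        # compare against torch.linalg.norm(x)
```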

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129352
Approved by: https://github.com/jgong5, https://github.com/peterbell10
2024-09-27 00:51:31 +00:00
DavidGu-Datong
fb4670a1f9 fix mean_out: op does not update parameter out for BF16/FP16 dtype on CPU (#135174)
Fixes #134848

For BF16/FP16, when a tensor is specified in the `out` parameter of `mean`, the mean kernel should use its storage for the output, but that doesn't happen: an `at::to` in the current code allocates storage again, and the `out` tensor's storage never gets updated, so it does not hold the mean output.
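
A minimal check of the fixed contract (shapes here are illustrative):

```python
import torch

# The tensor passed via `out=` must end up holding the result for BF16 on CPU;
# before the fix its storage was bypassed by a fresh allocation.
x = torch.randn(4, dtype=torch.bfloat16)
out = torch.empty((), dtype=torch.bfloat16)
torch.mean(x, dim=0, out=out)
print(out, x.mean())  # both should agree now
```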

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135174
Approved by: https://github.com/soulitzer
2024-09-21 14:21:42 +00:00
Tobias Ringwald
758d787901 Added complex support for torch.logsumexp (#133187)
Added complex support for `torch.logsumexp`. Implemented complex backward pass for `torch.logsumexp`.

Fixes #133047
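
A minimal usage sketch of the added support:

```python
import torch

# Forward logsumexp on a complex tensor, plus the new complex backward pass.
z = torch.randn(5, dtype=torch.complex64, requires_grad=True)
out = torch.logsumexp(z, dim=0)
out.abs().backward()   # reduce to a real scalar so backward() is well-defined
print(out, z.grad)
```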

Pull Request resolved: https://github.com/pytorch/pytorch/pull/133187
Approved by: https://github.com/amjames, https://github.com/lezcano
2024-09-03 17:28:36 +00:00
haozhe.zhu
2ba60a1618 fix torch.prod vectorized path for bool (#128009)
Fix https://github.com/pytorch/pytorch/issues/127866.
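
A quick sketch of the fixed behavior (the tensor length is an assumption, chosen to be long enough to hit the vectorized path):

```python
import torch

# torch.prod on bool acts as a logical-AND reduction.
x = torch.tensor([True] * 64 + [False])
print(torch.prod(x, dtype=torch.bool))        # tensor(False)
print(torch.prod(x[:-1], dtype=torch.bool))   # tensor(True)
```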

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128009
Approved by: https://github.com/jgong5, https://github.com/albanD
2024-08-28 05:27:50 +00:00
ajbrent
30bfdf1afc Errors when 0-dim tensor of complex or bool type passed to aminmax. (#128404)
Fixes #126742

Added errors for the case of 0-dim tensors of complex or bool types passed to aminmax.
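
A minimal sketch of the new error behavior:

```python
import torch

# 0-dim complex and bool inputs now raise instead of silently returning.
for t in (torch.tensor(True), torch.tensor(1 + 2j)):
    try:
        torch.aminmax(t)
    except RuntimeError as e:
        print(e)
```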

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128404
Approved by: https://github.com/janeyx99
2024-06-24 21:46:49 +00:00
cyy
c6b36ec2f9 Remove calls of deprecated _aminmax (#127182)
While #125995 is pending, the calls should be removed.
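
A hedged sketch of the replacement pattern this commit applies in-tree:

```python
import torch

# The public torch.aminmax supersedes the deprecated internal torch._aminmax.
x = torch.tensor([3, 1, 4, 1, 5])
min_val, max_val = torch.aminmax(x)   # instead of torch._aminmax(x)
print(min_val, max_val)
```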

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127182
Approved by: https://github.com/ezyang
2024-05-28 03:51:45 +00:00
PyTorch MergeBot
315389bfed Revert "Remove deprecated _aminmax operator (#125995)"
This reverts commit 0116ffae7f.

Reverted https://github.com/pytorch/pytorch/pull/125995 on behalf of https://github.com/huydhn due to Sorry for reverting your change but we need to reland this after I get rid of all usage of _aminmax internally in Meta ([comment](https://github.com/pytorch/pytorch/pull/125995#issuecomment-2113769497))
2024-05-16 01:45:37 +00:00
Edward Z. Yang
b659506d82 Parametrize test_dim_reduction (#126292)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/126292
Approved by: https://github.com/Skylion007
2024-05-15 19:55:37 +00:00
cyy
0116ffae7f Remove deprecated _aminmax operator (#125995)
It has been deprecated for a long time.

Co-authored-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/125995
Approved by: https://github.com/ezyang
2024-05-12 17:50:17 +00:00
Jeff Daily
be0bdf111c relax tol for flaky nansum_out_dtype_cuda_float32 test (#121550)
TestReductionsCUDA.test_nansum_out_dtype_cuda_float32 would fail or pass depending on the random inputs. Observed by ROCm internal QA testing, but the same problematic random inputs break the test for CUDA as well, verified on V100.

There is precedent in another test within the same file to relax tolerance.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121550
Approved by: https://github.com/albanD
2024-03-14 15:28:45 +00:00
CaoE
412c687e2e Fix permuted sum precision issue for lower precision on CPU (#108559)
Fixes #83149
There is a limitation of `TensorIterator` reductions:
a non-permuted input tensor will be coalesced down to a 2-d tensor by `TensorIterator`, whereas the permuted case may become a >2-d operation (for example, two reduced dimensions and a non-reduced one).
Since the CPU reduction loop of `TensorIterator` only operates on two dimensions at a time, the intermediate sums get truncated to lower precision.
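
A hedged repro sketch (shapes and dtype are assumptions, not from the PR) of the precision gap described above:

```python
import torch

# Summing a permuted bfloat16 tensor over two dimensions yields a >2-d
# reduction; before the fix the intermediate sums were truncated to bf16.
x = torch.rand(64, 64, 64, dtype=torch.bfloat16)
xp = x.permute(2, 0, 1)
ref = xp.double().sum(dim=(1, 2))                        # high-precision reference
print((xp.sum(dim=(1, 2)).double() - ref).abs().max())   # small after the fix
```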

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108559
Approved by: https://github.com/mingfeima, https://github.com/peterbell10
2024-03-06 01:01:35 +00:00
Edward Z. Yang
a359afbc3f Make and/or on uint8 tensors properly return 0x00 or 0x01 (#117827)
Fixes https://github.com/pytorch/pytorch/issues/117215
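
A minimal sketch of the fixed semantics, reading "and/or" as the `all()`/`any()` reductions exercised in this test file:

```python
import torch

# For uint8 inputs, any()/all() keep the legacy uint8 result dtype but now
# normalize the value to 0x00 or 0x01 rather than an arbitrary byte.
x = torch.tensor([0, 2, 4], dtype=torch.uint8)
print(x.any(), x.all())   # tensor(1, dtype=torch.uint8) tensor(0, dtype=torch.uint8)
```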

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/117827
Approved by: https://github.com/albanD
2024-01-22 17:30:22 +00:00
Edward Z. Yang
5b24877663 Improve uint{16,32,64} dlpack/numpy compatibility (#116808)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/116808
Approved by: https://github.com/malfet, https://github.com/albanD
2024-01-11 17:01:54 +00:00
Aaron Gokaslan
3fe437b24b [BE]: Update flake8 to v6.1.0 and fix lints (#116591)
Updates flake8 to v6.1.0 and fixes a few lints using sed and some ruff tooling.
- Replace `assert(0)` with `raise AssertionError()`
- Remove extraneous parenthesis i.e.
  - `assert(a == b)` -> `assert a == b`
  - `if(x > y or y < z):`->`if x > y or y < z:`
  - And `return('...')` -> `return '...'`

Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/116591
Approved by: https://github.com/albanD, https://github.com/malfet
2024-01-03 06:04:44 +00:00
CaoE
26b5e27ace Add Half support for cummax, cummin, cumprod, logcumsumexp, and prod on CPU (#112132)
Add Half support for cummax, cummin, cumprod, logcumsumexp, and prod on CPU.
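
A minimal usage sketch of the newly supported CPU Half paths:

```python
import torch

x = torch.randn(6, dtype=torch.half)   # CPU float16
print(torch.cumprod(x, dim=0))
print(torch.cummax(x, dim=0).values)
print(torch.logcumsumexp(x, dim=0))
print(torch.prod(x))
```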

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112132
Approved by: https://github.com/cpuhrsch
2023-11-05 12:31:38 +00:00
CaoE
a310cc8968 Add Half support for kthvalue, cross, hist, and logit on CPU (#112135)
Add Half support for kthvalue, cross, hist, and logit on CPU.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112135
Approved by: https://github.com/cpuhrsch
2023-10-31 09:12:47 +00:00
CaoE
4b324a8717 Add Half support for aminmax on CPU (#106853)
Add Half support for aminmax on CPU.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106853
Approved by: https://github.com/cpuhrsch
2023-10-23 17:43:47 +00:00
Kazuaki Ishizaki
a603dcc307 Fix typo under test directory (#110826)
This PR fixes the typo `the the` in comments in files under the `test` directory.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110826
Approved by: https://github.com/Skylion007
2023-10-08 20:52:38 +00:00
Tobias Ringwald
555c83d097 Added a UserWarning when using torch.{std,var,std_mean,std_var} with dof<=0 (#109824)
Fixes #109696.

This PR adds a `UserWarning` when calling
- `torch.var`
- `torch.var_mean`
- `torch.std`
- `torch.std_mean`

with an effective `dof<=0`. Until now, only `torch.cov` warned about this. The code also handles edge cases, such as `torch.empty`
```
>>> import torch; torch.std_mean(torch.empty(0), correction=0)
<stdin>:1: UserWarning: std_mean(): degrees of freedom is <= 0 (Triggered internally at /app/aten/src/ATen/native/ReduceOps.cpp:1671.)
(tensor(nan), tensor(nan))
```

multi-dim reductions

```
>>> import torch; torch.std_mean(torch.empty(10, 30, 20, 50), correction=600, dim=(1, 2))
<stdin>:1: UserWarning: std_mean(): degrees of freedom is <= 0 (Triggered internally at /app/aten/src/ATen/native/ReduceOps.cpp:1671.)
[... snip ...]
```

and a negative `correction`.

```
>>> import torch; torch.std_mean(torch.randn(0), correction=-5)
(tensor(nan), tensor(nan))
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109824
Approved by: https://github.com/soulitzer
2023-10-06 01:03:47 +00:00
Aaron Gokaslan
660e8060ad [BE]: Update ruff to 0.285 (#107519)
This updates ruff to 0.285, which is faster, better, and fixes a bunch of false negatives with regard to f-strings.

I also enabled RUF017, which looks for accidental quadratic list summation. Luckily, there seem to be no instances of it in our codebase, so I'm enabling it to keep it that way. :)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519
Approved by: https://github.com/ezyang
2023-08-22 23:16:38 +00:00
PyTorch MergeBot
d59a6864fb Revert "[BE]: Update ruff to 0.285 (#107519)"
This reverts commit 88ab3e4322.

Reverted https://github.com/pytorch/pytorch/pull/107519 on behalf of https://github.com/ZainRizvi due to Sorry, but this PR breaks internal tests. @ezyang, can you please help them get unblocked? It seems like one of the strings was probably accidentally modified ([comment](https://github.com/pytorch/pytorch/pull/107519#issuecomment-1688833480))
2023-08-22 19:53:32 +00:00
Aaron Gokaslan
88ab3e4322 [BE]: Update ruff to 0.285 (#107519)
This updates ruff to 0.285, which is faster, better, and fixes a bunch of false negatives with regard to f-strings.

I also enabled RUF017, which looks for accidental quadratic list summation. Luckily, there seem to be no instances of it in our codebase, so I'm enabling it to keep it that way. :)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519
Approved by: https://github.com/ezyang
2023-08-20 01:36:18 +00:00
ekamiti
017499b078 Update reduction_ops groupings to include primtorch types (#107338)
Fixes https://github.com/pytorch/pytorch/issues/107335. The skips were updated for the _ref ops to match those for eager mode where necessary. Part of the breakdown of https://github.com/pytorch/pytorch/pull/104489.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107338
Approved by: https://github.com/ezyang
2023-08-19 02:09:11 +00:00
Justin Chu
73e1455327 [BE] Enable ruff's UP rules and autoformat test/ (#105434)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105434
Approved by: https://github.com/albanD
2023-07-19 20:36:06 +00:00
ganler
3dcf8b6140 [Fix] Inbound check of sorter indices in searchsorted (#95109)
Fixes https://github.com/pytorch/pytorch/issues/91606, but in C++14 style.

Prior fix (https://github.com/pytorch/pytorch/pull/94863) was in C++17, which might break some builds.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95109
Approved by: https://github.com/ngimel
2023-02-20 04:59:11 +00:00
Yuxin Wu
9bb2fe3eae fix numpy1.24 deprecations in unittests (#93997)
Fixes https://github.com/pytorch/pytorch/issues/91329
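
A hedged sketch of the kind of change such fixes typically involve (assuming the NumPy 1.24 removal of long-deprecated aliases; the actual edits are in the PR):

```python
import numpy as np

# NumPy 1.24 removed aliases such as np.float/np.int/np.bool, so test code
# using them fails with AttributeError and must switch to builtins or
# explicit dtypes.
x = np.array([1.5, 2.5])
y = x.astype(float)   # instead of the removed np.float
print(y.dtype)        # float64
```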

Pull Request resolved: https://github.com/pytorch/pytorch/pull/93997
Approved by: https://github.com/ngimel, https://github.com/jerryzh168
2023-02-18 00:59:09 +00:00
Wei Wang
c16b2916f1 Back out "fix: make sure sorter indices are inbound in searchsorted (#94863)" (#95086)
Summary:
Original commit changeset: 96a2200d1fd8

Original Phabricator Diff: D43342962

Test Plan: Sandcastle and land castle as well as buck2 build mode/opt //frl/et/projects/Masquerade/stable/datasets/masquerade/c6p7:post_processing

Reviewed By: seemethere, bigfootjon

Differential Revision: D43402398

@bypass-github-export-checks
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95086
Approved by: https://github.com/bigfootjon
2023-02-17 22:48:22 +00:00
Khushi
a0389681c2 [complex] nansum & nanmean (#93199)
Follows: #71472
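
A minimal usage sketch of the added complex support:

```python
import torch

# A complex value counts as NaN when either component is NaN (torch.isnan
# semantics), and nansum/nanmean skip it.
z = torch.tensor([1 + 1j, complex(float("nan"), 0.0), 2 + 2j])
print(torch.nansum(z), torch.nanmean(z))
```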

Pull Request resolved: https://github.com/pytorch/pytorch/pull/93199
Approved by: https://github.com/Skylion007, https://github.com/malfet, https://github.com/kshitij12345
2023-02-16 06:13:42 +00:00
ganler
5e1de31548 fix: make sure sorter indices are inbound in searchsorted (#94863)
Fixes #91606

Add a checker to `sorter` to make sure indices are in bounds (as in NumPy).
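
A minimal sketch of the new check:

```python
import torch

# Out-of-bound indices in `sorter` now raise (matching NumPy) instead of
# reading arbitrary memory.
seq = torch.tensor([3.0, 1.0, 2.0])
bad_sorter = torch.tensor([1, 2, 99])   # 99 is out of bounds for a length-3 sequence
try:
    torch.searchsorted(seq, torch.tensor([1.5]), sorter=bad_sorter)
except RuntimeError as e:
    print(e)
```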
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94863
Approved by: https://github.com/Skylion007, https://github.com/malfet
2023-02-16 04:28:39 +00:00
Xuehai Pan
b005ec62b9 [BE] Remove dependency on six and future (#94709)
Remove the Python 2/3 compatibility libraries [six](https://pypi.org/project/six) and [future](https://pypi.org/project/future), along with `torch._six`. We only support Python 3.8+ now; it's time to retire them.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94709
Approved by: https://github.com/malfet, https://github.com/Skylion007
2023-02-14 09:14:14 +00:00
mfkasim1
2acac8a83a Logcumsumexp for CUDA (build-time optimized) (#94310)
Hopefully fixes #89205.
This is another version of #90847 where it was reverted because it increases the compile-time significantly.
From my discussion with @ngimel in https://github.com/pytorch/pytorch/pull/93153#issuecomment-1409051528, it seems the jiterator option would be very tricky, if not impossible.
So what I did was to optimize the compile-time in my computer.

To optimize the build time, I first compiled pytorch as a whole, then changed only the `LogcumsumexpKernel.cu` file to see how that affects the compile time.
Here are my results for the compilation time of only the `LogcumsumexpKernel.cu` file on my machine:

- Original version (without any complex implementations): 56s (about 1 minute)
- The previous PR (#90847): 13m 57s (about 14 minutes)
- This PR: 3m 35s (about 3.5 minutes)

If the previous PR increased the build time by 30 minutes on pytorch's build machines, then this PR reduces the build-time increase to about 6 minutes. Hopefully this is an acceptable level of increase.

What I did (sorted from most to least significant build-time reduction):

- Substituting `log(x)` with `log1p(x - 1)`. This is applied in the infinite case, so we don't really care about precision (a small numerical check of this identity follows the list).
- Implementing complex exponential manually
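
A small numerical check of the identity behind the first item (plain eager code, unrelated to the CUDA kernel itself):

```python
import torch

# log1p(x - 1) == log(x); the kernel uses the log1p form in the infinite-case
# branch where precision is not a concern.
x = torch.tensor([0.5, 1.0, 2.0, 10.0])
print(torch.log(x))
print(torch.log1p(x - 1))
```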

tag: @malfet, @albanD
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94310
Approved by: https://github.com/Skylion007, https://github.com/malfet
2023-02-13 16:00:52 +00:00
Aaron Gokaslan
67d9790985 [BE] Apply almost all remaining flake8-comprehension checks (#94676)
Applies the remaining flake8-comprehension fixes and checks. This change replaces all remaining unnecessary generator expressions with list/dict/set comprehensions, which are more succinct, performant, and better supported by our torch.jit compiler. It also removes useless generators such as `set(a for a in b)`, resolving them into just the set call.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94676
Approved by: https://github.com/ezyang
2023-02-12 01:01:25 +00:00
mfkasim1
75cfc0be21 Logcumsumexp for CPU (#93153)
Partial work from #90847, in the direction of solving #89205.
Most of the content is from #90847, but this is only for CPU, so hopefully it does not increase the build time by a lot.

tag: @albanD, @malfet

Pull Request resolved: https://github.com/pytorch/pytorch/pull/93153
Approved by: https://github.com/malfet, https://github.com/Skylion007
2023-01-27 22:29:33 +00:00
PyTorch MergeBot
9b23fd378f Revert "Logcumsumexp for complex in CPU and CUDA (#90847)"
This reverts commit 64985123e4.

Reverted https://github.com/pytorch/pytorch/pull/90847 on behalf of https://github.com/malfet due to Reverting to decrease build time, let's discuss the alternatives here
2023-01-24 20:49:08 +00:00
mfkasim1
64985123e4 Logcumsumexp for complex in CPU and CUDA (#90847)
Another PR towards solving #89205.
What's in this PR:

* The implementation of forward `logcumsumexp` for complex numbers in CPU & CUDA
* The tests on forward call of `logcumsumexp` for complex numbers
* The implementation of backward `logcumsumexp` for complex numbers

What's missing:

* The test on the backward gradient of `logcumsumexp` (it complains `RuntimeError: logcumsumexp does not support automatic differentiation for outputs with complex dtype.`, and I don't know how to solve the error or where to put the test for the backward computation). If possible, I'd like this to be done in this PR.

It's really tricky to handle the edge cases here (i.e. the ones involving `inf`), but I've tried my best to put some comments explaining the reasoning behind my decisions in this PR.
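
A minimal usage sketch of the feature (as eventually relanded for CPU and CUDA):

```python
import torch

# Forward logcumsumexp over a complex tensor.
z = torch.randn(5, dtype=torch.complex64)
print(torch.logcumsumexp(z, dim=0))
```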

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90847
Approved by: https://github.com/albanD
2023-01-20 15:10:50 +00:00
lezcano
1d6a188d08 Reland Dispatch torch.norm to linalg.vector_norm and linalg.matrix_norm (#81761) (#84624)
Reland https://github.com/pytorch/pytorch/pull/81761

Differential Revision: [D39332292](https://our.internmc.facebook.com/intern/diff/D39332292)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/84624
Approved by: https://github.com/kit1980
2022-11-22 07:53:24 +00:00
Aidyn-A
3bc78295c2 Fix consistency of histc on CPU and CUDA (#87832)
Fixes #87657

The main reason why `histc` returns slightly different outputs is the difference in how the bin position is calculated.
The CPU calculates it as: 449778a939/aten/src/ATen/native/cpu/HistogramKernel.cpp (L168-L170)
which is basically `(i - a) / (b - a) * N`, while the CUDA code 449778a939/aten/src/ATen/native/cuda/SummaryOps.cu (L41)
uses `(i - a) * N / (b - a)`.

For some cases, like in #87657, the order of arithmetic operations matters due to floating-point round-off.
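
A small illustration (values are arbitrary, not from the issue) of how the two orderings can disagree under float32 round-off:

```python
import torch

# Two roundings happen in a different order, so for some inputs the results
# differ in the last ULP, which can move a value across a bin boundary.
vals = torch.rand(100_000, dtype=torch.float32) * 0.3
a, b, N = 0.0, 0.3, 100.0
left = (vals - a) / (b - a) * N    # CPU-style ordering
right = (vals - a) * N / (b - a)   # CUDA-style ordering
print((left != right).sum())       # typically nonzero
```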

________________

Not sure where the most appropriate place for the unit test would be; hopefully `test_reductions::test_histc` will do.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/87832
Approved by: https://github.com/soumith
2022-11-18 05:08:47 +00:00
Nikita Shulga
7fa601b1a7 Skip chalf.mean in test_reductions_large_half_tensors (#86747)
As `mean_reduce` is not implemented for complex half

Fixes https://github.com/pytorch/pytorch/issues/86743 and unblock A10G testing

Pull Request resolved: https://github.com/pytorch/pytorch/pull/86747
Approved by: https://github.com/ngimel
2022-10-11 23:27:30 +00:00
kshitij12345
88a8a900b9 fix: half reduction with multiple sub-iterators (#85596)
Fixes #74438 (a repro sketch follows the TODO list below).

TODO:
* [x] Add test
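
A repro sketch of the fixed failure mode (sizes are assumptions; this allocates several GiB, so it is illustrative rather than a unit test):

```python
import torch

# Half reductions large enough that TensorIterator splits the work into
# multiple sub-iterators; before the fix the partial results were combined
# incorrectly.
if torch.cuda.is_available():
    x = torch.ones(2**31 + 1, dtype=torch.half, device="cuda")
    print(x.mean())   # should be ~1.0
```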

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85596
Approved by: https://github.com/ngimel
2022-10-11 05:40:12 +00:00