Commit Graph

993 Commits

Author SHA1 Message Date
Iurii Zdebskyi
58d2dd5b73 Enabled flip for bool tensors (#31267)
Summary:
Fix this [issue](https://github.com/pytorch/pytorch/issues/31213)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31267

Differential Revision: D19047249

Pulled By: izdeby

fbshipit-source-id: f58ca3ac88aab28742b8d345400270f7d31c3856
2019-12-18 09:01:32 -08:00
Kurt Mohler
3694749cd1 Detect dill version in torch.save/load (#30985)
Summary:
Fix for issue https://github.com/pytorch/pytorch/issues/28313
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30985

Differential Revision: D19142947

Pulled By: zou3519

fbshipit-source-id: 10e3a182a99e80ca8c9c8328b6f8764b27d78eb3
2019-12-18 08:05:08 -08:00
Xiang Gao
ffe0c1ae4d Make test_torch.py pass cuda-memcheck (#29243)
Summary:
Make the following changes:
- When there are more than 10k errors, cuda-memcheck only shows 10k errors, in this case we shouldn't raise an Exception
- Add UNDER_CUDA_MEMCHECK environment to allow disabling `pin_memory` tests when running cuda-memcheck.
- Add a `--ci` command option, when turned on, then this script would run output to stdout instead of writing a file, and exit with an error if cuda-memcheck fails
- Add a `--nohang` command option. When turned on, then hang would be treated as pass instead of error
- Do simple filtering on the test to run: if `'cpu'` in the test name but not `'cuda'` is not in the test name
- Add `--split` and `--rank` to allowing splitting the work (NVIDIA CI has a limitation of 3 hours, we have to split the work to satisfy this limitation)
- The error summary could be `ERROR SUMMARY: 1 error`, or `ERROR SUMMARY: 2 errors`, the tail could be `error` or `errors`, it is not of the same length. The script is fixed to handle this case.
- Ignore errors from `cufft`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29243

Differential Revision: D18941701

Pulled By: mruberry

fbshipit-source-id: 2048428f32b66ef50c67444c03ce4dd9491179d2
2019-12-14 20:29:58 -08:00
Vitaly Fedyunin
c35cddb306 Switch default memory format of clone operator to Preserve
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30089

Test Plan: Imported from OSS

Differential Revision: D18624985

Pulled By: VitalyFedyunin

fbshipit-source-id: 8d315b08b7b5858fd0a81d3375b44ccb94787ad4
2019-12-14 20:29:06 -08:00
Vitaly Fedyunin
fde3d707ad Switch default memory format of to (and similar) operators to Preserve
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30088

Test Plan: Imported from OSS

Differential Revision: D18624984

Pulled By: VitalyFedyunin

fbshipit-source-id: 54901786d7496c7dce785140b0585ac9093b1d86
2019-12-14 20:29:01 -08:00
Vitaly Fedyunin
927588df8e Switch default memory format of _like operators to Preserve
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30087

Test Plan: Imported from OSS

Differential Revision: D18624986

Pulled By: VitalyFedyunin

fbshipit-source-id: 8e434966f872ffaddf1249248ea445cbbab300ce
2019-12-14 20:28:57 -08:00
Xiang Gao
9954739956 Refactor test for unique and unique_consecutive and fix some bugs (#31211)
Summary:
Tests for unique_dim will be refactored in a separate PR.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31211

Differential Revision: D19034968

Pulled By: ngimel

fbshipit-source-id: 855d326b37638b5944f11fbbce03394cf000daf9
2019-12-14 20:28:38 -08:00
Iurii Zdebskyi
f6c31f61c5 Enabled roll for bool tensor (#31194)
Summary:
Fixed this [issue](https://github.com/pytorch/pytorch/issues/31079).
Tested via unit test
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31194

Differential Revision: D18958141

Pulled By: izdeby

fbshipit-source-id: 119bf4d31df10ee02c277f5a4663038470cf7780
2019-12-12 13:48:14 -08:00
Brian Vaughan
945ce71b18 Correctly handle scalar types, fix parse of numpy ints (#30486)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30486

Fixes: https://github.com/pytorch/pytorch/issues/29252

There is some incorrect code in the handling of parsing python numbers that led to issue #29252:

When we allow interpretation of a zero-dim numpy integer value
as a scalar in pytorch, we incorrectly parse the int as a float.

This PR also fixes the issue described in the "FIXME" here:
https://github.com/pytorch/pytorch/pull/27628/files#diff-f539198dd366265fb8dc2d661bc5d5bcR1487

Test Plan: Added a unit test based on the example given in the issue.

Differential Revision: D18932520

Pulled By: nairbv

fbshipit-source-id: f6416f28dfd73ac72c1042042851d76beb5fcf65
2019-12-11 15:35:57 -08:00
Alban Desmaison
717274c001 Add useful warnings for t.grad when it won't be populated for known reasons (#30531)
Summary:
Fix https://github.com/pytorch/pytorch/issues/2362 and https://github.com/pytorch/pytorch/issues/19778

To avoid issues with frozen model, we only consider warning for Tensors that require gradients and are neither leafs nor retain gradients.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30531

Differential Revision: D18832767

Pulled By: albanD

fbshipit-source-id: 743e863dc14ab57713e66da78b2e4d759dfba0ff
2019-12-11 09:47:18 -08:00
Michael Suo
62b10721fb Actually make flake8 do something (#30892)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30892

Fixes all outstanding lints and actually installs a properly configured
flake8

Test Plan: Imported from OSS

Differential Revision: D18862825

Pulled By: suo

fbshipit-source-id: 08e9083338a7309272e17bb803feaa42e348aa85
2019-12-06 17:50:50 -08:00
Gregory Chanan
377131b0eb MultiMarginCriterion: fix scalar_check in the case where reduction == None. (#30826)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30826

Previously the scalar_check for the reduction None case was:
input.dim() <= 1, but it should be target based, i.e.:
target.dim() == 0.  This follows from the "correct cases", i.e.
(N, C) X (N,) -> (N,)
(C,) X () -> ()

Test Plan: Imported from OSS

Differential Revision: D18833660

Pulled By: gchanan

fbshipit-source-id: 26338b842a8311718c4b89da3e2f1b726d5409b8
2019-12-06 09:04:38 -08:00
Gregory Chanan
e5d571ae25 Remove scalar_check from topk, move it to the THC implementation.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30852

Test Plan: Imported from OSS

Differential Revision: D18842662

Pulled By: gchanan

fbshipit-source-id: b5e8a4367fce9441be2ddbd026495f1911038221
2019-12-06 07:50:20 -08:00
Edward Yang
6e38d50352 Revert D18117070: Migrate max and min (binary) from TH to ATen.
Test Plan: revert-hammer

Differential Revision:
D18117070

Original commit changeset: e06d37a8a140

fbshipit-source-id: 49dd33f52e7e3ffcaafc02109a0a0a67545ec7e8
2019-12-05 14:43:29 -08:00
Edward Yang
2ced81f289 Revert "Default to not build Caffe2 operators on Windows. (#29061)" (#30740)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30740

This reverts commit 7102aceaf8.

Test Plan: Imported from OSS

Differential Revision: D18834315

Pulled By: ezyang

fbshipit-source-id: 2dbd1cf686864b9840365083182cd6188a285399
2019-12-05 14:01:59 -08:00
Hong Xu
1578a28692 Migrate max and min (binary) from TH to ATen. (#27185)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27185

TH implementation will be removed after the unary max and min are migrated.

Benchmark: (Debian 10, Release build, gcc 7.4, no turbo)

```python
import timeit
for device in ('cpu', 'cuda'):
    print(f'device: {device}')
    for op in ('max', 'min'):
        for dtype in ('torch.double', 'torch.float', 'torch.int16', 'torch.int32', 'torch.int64'):
            for n, t in [(10_000, 200000),
                        (100_000, 20000)]:
                print(f'torch.{op}(a, b), numel() == {n} for {t} times, dtype={dtype}')
                print(timeit.timeit(f'torch.{op}(a)' + (';torch.cuda.synchronize()' if device == 'cuda' else ''),
                                    setup=f'import torch; a = torch.arange({n}, dtype={dtype}); b = torch.ones({n}, 0, dtype={dtype}) * ({n} / 2)', number=t))
    print()
```

Before:

```
device: cpu
torch.max(a, b), numel() == 10000 for 200000 times, dtype=torch.double
2.241763713000182
torch.max(a, b), numel() == 100000 for 20000 times, dtype=torch.double
1.7138833169992722
torch.max(a, b), numel() == 10000 for 200000 times, dtype=torch.float
2.2183356810000987
torch.max(a, b), numel() == 100000 for 20000 times, dtype=torch.float
1.7031846980007685
torch.max(a, b), numel() == 10000 for 200000 times, dtype=torch.int16
1.7704679510006827
torch.max(a, b), numel() == 100000 for 20000 times, dtype=torch.int16
1.289198366999699
torch.max(a, b), numel() == 10000 for 200000 times, dtype=torch.int32
1.7937613740014058
torch.max(a, b), numel() == 100000 for 20000 times, dtype=torch.int32
1.2930124340000475
torch.max(a, b), numel() == 10000 for 200000 times, dtype=torch.int64
1.8032857640009752
torch.max(a, b), numel() == 100000 for 20000 times, dtype=torch.int64
1.2908709189996443
torch.min(a, b), numel() == 10000 for 200000 times, dtype=torch.double
1.8829010000008566
torch.min(a, b), numel() == 100000 for 20000 times, dtype=torch.double
1.2994690759987861
torch.min(a, b), numel() == 10000 for 200000 times, dtype=torch.float
1.8037853410005482
torch.min(a, b), numel() == 100000 for 20000 times, dtype=torch.float
1.2929310759991495
torch.min(a, b), numel() == 10000 for 200000 times, dtype=torch.int16
1.8075240359994496
torch.min(a, b), numel() == 100000 for 20000 times, dtype=torch.int16
1.2932477679987642
torch.min(a, b), numel() == 10000 for 200000 times, dtype=torch.int32
1.7868400779989315
torch.min(a, b), numel() == 100000 for 20000 times, dtype=torch.int32
1.2885970789993735
torch.min(a, b), numel() == 10000 for 200000 times, dtype=torch.int64
1.8389664830010588
torch.min(a, b), numel() == 100000 for 20000 times, dtype=torch.int64
1.29402057399966

device: cuda
torch.max(a, b), numel() == 10000 for 200000 times, dtype=torch.double
4.787109836999662
torch.max(a, b), numel() == 100000 for 20000 times, dtype=torch.double
1.842438002999188
torch.max(a, b), numel() == 10000 for 200000 times, dtype=torch.float
3.429616614999759
torch.max(a, b), numel() == 100000 for 20000 times, dtype=torch.float
1.835390076999829
torch.max(a, b), numel() == 10000 for 200000 times, dtype=torch.int16
2.940423873000327
torch.max(a, b), numel() == 100000 for 20000 times, dtype=torch.int16
1.4108991760003846
torch.max(a, b), numel() == 10000 for 200000 times, dtype=torch.int32
2.9318018840003788
torch.max(a, b), numel() == 100000 for 20000 times, dtype=torch.int32
1.4168134739993548
torch.max(a, b), numel() == 10000 for 200000 times, dtype=torch.int64
2.9610764919998473
torch.max(a, b), numel() == 100000 for 20000 times, dtype=torch.int64
1.4189234130008117
torch.min(a, b), numel() == 10000 for 200000 times, dtype=torch.double
2.960172712999338
torch.min(a, b), numel() == 100000 for 20000 times, dtype=torch.double
1.4162539499993727
torch.min(a, b), numel() == 10000 for 200000 times, dtype=torch.float
2.8985912560001452
torch.min(a, b), numel() == 100000 for 20000 times, dtype=torch.float
1.4113489299998037
torch.min(a, b), numel() == 10000 for 200000 times, dtype=torch.int16
2.9160250799995993
torch.min(a, b), numel() == 100000 for 20000 times, dtype=torch.int16
1.4128787690005993
torch.min(a, b), numel() == 10000 for 200000 times, dtype=torch.int32
2.8806865219994506
torch.min(a, b), numel() == 100000 for 20000 times, dtype=torch.int32
1.4086357010000938
torch.min(a, b), numel() == 10000 for 200000 times, dtype=torch.int64
2.9362181240012433
torch.min(a, b), numel() == 100000 for 20000 times, dtype=torch.int64
1.4151225870009512

```

After:

```
device: cpu
torch.max(a, b), numel() == 10000 for 200000 times, dtype=torch.double
2.2685823729998447
torch.max(a, b), numel() == 100000 for 20000 times, dtype=torch.double
1.72004808300062
torch.max(a, b), numel() == 10000 for 200000 times, dtype=torch.float
2.212242640000113
torch.max(a, b), numel() == 100000 for 20000 times, dtype=torch.float
1.7089235590001408
torch.max(a, b), numel() == 10000 for 200000 times, dtype=torch.int16
1.7767087259999244
torch.max(a, b), numel() == 100000 for 20000 times, dtype=torch.int16
1.2916517639996528
torch.max(a, b), numel() == 10000 for 200000 times, dtype=torch.int32
1.8265984959998605
torch.max(a, b), numel() == 100000 for 20000 times, dtype=torch.int32
1.3002885240002797
torch.max(a, b), numel() == 10000 for 200000 times, dtype=torch.int64
1.8084679720004715
torch.max(a, b), numel() == 100000 for 20000 times, dtype=torch.int64
1.3012119999993956
torch.min(a, b), numel() == 10000 for 200000 times, dtype=torch.double
1.8800218449996464
torch.min(a, b), numel() == 100000 for 20000 times, dtype=torch.double
1.3060645710002063
torch.min(a, b), numel() == 10000 for 200000 times, dtype=torch.float
2.4905043950002437
torch.min(a, b), numel() == 100000 for 20000 times, dtype=torch.float
1.9126290209997023
torch.min(a, b), numel() == 10000 for 200000 times, dtype=torch.int16
1.7972335520007618
torch.min(a, b), numel() == 100000 for 20000 times, dtype=torch.int16
1.2918074379995232
torch.min(a, b), numel() == 10000 for 200000 times, dtype=torch.int32
1.8047651860006226
torch.min(a, b), numel() == 100000 for 20000 times, dtype=torch.int32
1.2992197730000044
torch.min(a, b), numel() == 10000 for 200000 times, dtype=torch.int64
1.8526509560006161
torch.min(a, b), numel() == 100000 for 20000 times, dtype=torch.int64
1.3030709570002728

device: cuda
torch.max(a, b), numel() == 10000 for 200000 times, dtype=torch.double
4.700986622000528
torch.max(a, b), numel() == 100000 for 20000 times, dtype=torch.double
1.8415469050005413
torch.max(a, b), numel() == 10000 for 200000 times, dtype=torch.float
3.3051693249999516
torch.max(a, b), numel() == 100000 for 20000 times, dtype=torch.float
1.8321999460004008
torch.max(a, b), numel() == 10000 for 200000 times, dtype=torch.int16
2.8086475109994353
torch.max(a, b), numel() == 100000 for 20000 times, dtype=torch.int16
1.405110773999695
torch.max(a, b), numel() == 10000 for 200000 times, dtype=torch.int32
2.913458047999484
torch.max(a, b), numel() == 100000 for 20000 times, dtype=torch.int32
1.4236377289998927
torch.max(a, b), numel() == 10000 for 200000 times, dtype=torch.int64
2.9386842409994642
torch.max(a, b), numel() == 100000 for 20000 times, dtype=torch.int64
1.4230227469997772
torch.min(a, b), numel() == 10000 for 200000 times, dtype=torch.double
3.0341797270002644
torch.min(a, b), numel() == 100000 for 20000 times, dtype=torch.double
1.4289592409995748
torch.min(a, b), numel() == 10000 for 200000 times, dtype=torch.float
3.6091147850002017
torch.min(a, b), numel() == 100000 for 20000 times, dtype=torch.float
2.036691903999781
torch.min(a, b), numel() == 10000 for 200000 times, dtype=torch.int16
2.8256167649997224
torch.min(a, b), numel() == 100000 for 20000 times, dtype=torch.int16
1.4078955400000268
torch.min(a, b), numel() == 10000 for 200000 times, dtype=torch.int32
2.8631781489993955
torch.min(a, b), numel() == 100000 for 20000 times, dtype=torch.int32
1.4210130069996012
torch.min(a, b), numel() == 10000 for 200000 times, dtype=torch.int64
3.0112479260005784
torch.min(a, b), numel() == 100000 for 20000 times, dtype=torch.int64
1.4297719679998409

```

Solve partly #24594 #24595

Close #25016

Test Plan: Imported from OSS

Differential Revision: D18117070

Pulled By: VitalyFedyunin

fbshipit-source-id: e06d37a8a1405848ba0b9e398870a77eb52bae8b
2019-12-05 09:55:56 -08:00
Gregory Chanan
2607772959 Turn off scalar_checks for SpatialDepthwiseConvolution and SpatialConvolutionMM. (#30789)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30789

The input(s) can't be 0-dimensional, so its irrelevant.

Restacked version of: https://github.com/pytorch/pytorch/pull/30438

Test Plan: Imported from OSS

Differential Revision: D18825716

Pulled By: gchanan

fbshipit-source-id: a4883b795163efcb9d8dba6166d0f2102b6728a2
2019-12-05 08:07:31 -08:00
Gregory Chanan
50625798df Fix scalar check of MultiLabelMarginLoss. (#30768)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30768

The behavior didn't match the documentation, because the documentation (for 'none' reduction) reads:
input X target -> output
(N, C) X (N, C) -> (N,)
(C,) X (C,) -> ()

but the later case would output (1,).  This also changes the case to:
() X (C,) -> ()
from:
() X (C,) -> (C,)
which makes more sense with the above formulas.

Restacked version of: https://github.com/pytorch/pytorch/pull/30748

Test Plan: Imported from OSS

Differential Revision: D18821554

Pulled By: gchanan

fbshipit-source-id: 3df77c51cf25648cb5fab62a68b09f49c91dab4e
2019-12-05 08:07:20 -08:00
Gregory Chanan
473a044835 Fix a CUDA memory leak in MultiLabelMarginCriterion error checking. (#30767)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30767

Restacked version of: https://github.com/pytorch/pytorch/pull/30733

Test Plan: Imported from OSS

Differential Revision: D18821553

Pulled By: gchanan

fbshipit-source-id: 8bf0365ce54dd2f07a5d6d0937332d0baf75b350
2019-12-05 08:07:15 -08:00
Gregory Chanan
786de33832 Move scalar_check logic from codegen to code in NLLLoss. (#30670)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30670

Also turn off scalar_check for grad_input: it isn't necessary because the input can't be 0-dimensional.

Test Plan: Imported from OSS

Differential Revision: D18784523

Pulled By: gchanan

fbshipit-source-id: 246d30970457075a0403dd0089317659a2cd2dd4
2019-12-04 12:30:23 -08:00
Gregory Chanan
fa2aa245cf Simplify scalar_check of nll_loss. (#30669)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30669

The inputs can't be 0-d, so we don't need that check in the scalar_check.

Test Plan: Imported from OSS

Differential Revision: D18784524

Pulled By: gchanan

fbshipit-source-id: d44222dffc91880a6e8c7be69e6e146e60040d43
2019-12-04 12:30:19 -08:00
Hong Xu
bb5dcaf24f Add logical_and and logical_or (#30521)
Summary:
With the CI failure caused in 8bbafa0b32 fixed (incorrect return type of the lambdas in CUDA kernels)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30521

Differential Revision: D18770151

Pulled By: ailzhang

fbshipit-source-id: 02f0fe1d5718c34d24da6dbb5884ee8b247ce39a
2019-12-03 18:24:54 -08:00
Hong Xu
4ac614191a Remove exp10 in TH (unused)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30422

Test Plan: Imported from OSS

Differential Revision: D18764186

Pulled By: VitalyFedyunin

fbshipit-source-id: 9343a5a7e4edf61ba3b85eaf846b2e149ed6529a
2019-12-03 18:17:15 -08:00
Brian Vaughan
a376dd344c Added check for torch.where on CPU that both arguments have same dtype (#30662)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30662

Cherry picked from: https://github.com/pytorch/pytorch/pull/29081

Test Plan: Imported from OSS

Differential Revision: D18782295

Pulled By: nairbv

fbshipit-source-id: 897ab25ddf8819ca34f5e86c5d3f41debb56cb04

Co-authored-by: ifedan
2019-12-03 15:19:52 -08:00
Gregory Chanan
8b29701ae5 Turn off scalar_checks for _th_reciprocal. (#30436)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30436

The underlying TH implementation is correct.

Test Plan: Imported from OSS

Differential Revision: D18699088

Pulled By: gchanan

fbshipit-source-id: e75a588ae4afb0506922ba98208546d5c0de623a
2019-12-03 07:04:53 -08:00
Gregory Chanan
61798865e3 Turn off scalar_checks for torch.clamp. (#30435)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30435

The underlying THC implementations are correct.

Test Plan: Imported from OSS

Differential Revision: D18699089

Pulled By: gchanan

fbshipit-source-id: f5d1319bf48eae36903296dad0b98ed80661f732
2019-12-03 07:04:47 -08:00
Brian Vaughan
e5b947a3a8 Raise an error for is_signed on quantized types (#30527)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30527

When we introduced dtype.is_signed we allowed for support of
quantized types, but we're not sure what the correct result should be.

See discussion at https://github.com/pytorch/pytorch/pull/29511

Test Plan: Imported from OSS

Differential Revision: D18765410

Pulled By: nairbv

fbshipit-source-id: c87cfe999b604cfcbbafa561e04d0d5cdbf41e6d
2019-12-03 06:34:53 -08:00
Gregory Chanan
569729527b Turn off scalar_checks for exp, cos, cosh, tan, atan, tanh, erf, erfc. (#30434)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30434

These are all pointwise ops that are implemented correctly wrt shapes in THC.

Test Plan: Imported from OSS

Differential Revision: D18699087

Pulled By: gchanan

fbshipit-source-id: 82cb91b00c77bfaca75be497c87fc7ae52daf46c
2019-12-02 16:10:25 -08:00
Gregory Chanan
0b25371f5d Turn off scalar_check for _th_normal.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/29955

Test Plan: Imported from OSS

Differential Revision: D18548051

Pulled By: gchanan

fbshipit-source-id: c652999ac9e37d2592aa85ef022040fe0700b5cf
2019-11-27 14:52:06 -08:00
Richard Zou
ec5c08de74 Revert D18580867: Add logical_and and logical_or
Test Plan: revert-hammer

Differential Revision:
D18580867

Original commit changeset: 7e4d7c37da4d

fbshipit-source-id: 81fb604c7aef8d847f518f5faa016e7bd0423016
2019-11-27 09:27:00 -08:00
Hong Xu
8bbafa0b32 Add logical_and and logical_or (#28162)
Summary:
Superseding https://github.com/pytorch/pytorch/issues/24379 as type promotion has been implemented.

Close https://github.com/pytorch/pytorch/issues/24379
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28162

Differential Revision: D18580867

Pulled By: ailzhang

fbshipit-source-id: 7e4d7c37da4dc8df87314bd4f1f6a7539e46586a
2019-11-26 17:38:22 -08:00
Gregory Chanan
dbce53fe32 Turn off scalar_check for _th_gather. (#29954)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29954

The underlying op handles scalar_check correctly.

Test Plan: Imported from OSS

Differential Revision: D18548054

Pulled By: gchanan

fbshipit-source-id: a1b44afa80c2928b78abbfba8b8b5d3608ac0fd3
2019-11-26 10:23:42 -08:00
Gregory Chanan
72ac45662b Turn off scalar_checks for torch.take. (#29953)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29953

The underlying function handles it correctly.

Test Plan: Imported from OSS

Differential Revision: D18548055

Pulled By: gchanan

fbshipit-source-id: cc2d0ae37d9689423363d115c6a653cb64840528
2019-11-26 10:23:37 -08:00
Gregory Chanan
79a830af56 Turn off scalar_check for Tensor.set_(Tensor) (#29952)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29952

The underlying op handles the check correctly.

Test Plan: Imported from OSS

Differential Revision: D18548048

Pulled By: gchanan

fbshipit-source-id: 9ac6fde743408e59ccdfc61bd574ebe6e2862238
2019-11-26 10:23:33 -08:00
Gregory Chanan
0c67311878 Turn off scalar_check for set_(Storage, ...) (#29950)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29950

The underlying code handles it correctly.

Test Plan: Imported from OSS

Differential Revision: D18548052

Pulled By: gchanan

fbshipit-source-id: 88b737572c816fb0026ac5e66da7e3f4ab686773
2019-11-25 14:52:22 -08:00
Gregory Chanan
7160300638 Turn off scalar_check for reductions _th_max, _th_min. (#29949)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29949

The underlying functions handle this already.

Test Plan: Imported from OSS

Differential Revision: D18548047

Pulled By: gchanan

fbshipit-source-id: 123c9297db4e4315da9b1d996ac8b41aa1b4c7bc
2019-11-25 14:52:17 -08:00
Gregory Chanan
16606e1725 Turn off scalar_check for mode; the underlying code is correct.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/29948

Test Plan: Imported from OSS

Differential Revision: D18548053

Pulled By: gchanan

fbshipit-source-id: 15cdfc24d3e5123497c72dc09c5e6b28cb5e1f88
2019-11-25 14:52:12 -08:00
Gregory Chanan
b8eba7aca9 Turn off scalar_check for ormqr. (#29947)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29947

It requires > 0-dimensional tensors.

Test Plan: Imported from OSS

Differential Revision: D18548049

Pulled By: gchanan

fbshipit-source-id: ce80a42515b59513a0e5ef2b32e2c2b90b4d64f5
2019-11-25 14:52:07 -08:00
Gregory Chanan
7c6cc1d6d4 Turn off scalar_checks for _th_multinomial_alias_draw. (#29946)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29946

it requires > 0-dimensional tensors.

Test Plan: Imported from OSS

Differential Revision: D18548050

Pulled By: gchanan

fbshipit-source-id: 4d1e3b53bd701137cc2cb674f95627a5e064a274
2019-11-25 14:52:02 -08:00
Gregory Chanan
ce5f1a1b25 Turn off scalar_check for masked_select. (#29923)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29923

Note that this changes the behavior of masked_select when both "self" and "mask" are 0-dimensional.

In previous versions of PyTorch, this would return a 0-dimensional tensor.  But the documentation reads:
"Returns a new 1-D tensor which indexes the input tensor according to the boolean mask mask which is a BoolTensor."

Test Plan: Imported from OSS

Differential Revision: D18539560

Pulled By: gchanan

fbshipit-source-id: 1637ed2c434fcf8ceead0073aa610581f4a19d21
2019-11-25 14:51:51 -08:00
Gregory Chanan
0c9c62ba6e Turn off scalar_checks for __and__ and clone.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/29880

Test Plan: Imported from OSS

Differential Revision: D18521732

Pulled By: gchanan

fbshipit-source-id: 7fdf5d8a7b93b43ac32067222cb8df5e790900de
2019-11-25 14:51:46 -08:00
Gregory Chanan
94ad7544ae Turn off scalar_check for __or__
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/29879

Test Plan: Imported from OSS

Differential Revision: D18521745

Pulled By: gchanan

fbshipit-source-id: 93d17d5e9cad5dd6d2c20221d87408c838d74eca
2019-11-25 14:51:40 -08:00
Gregory Chanan
f994377d28 Turn off scalar_check for lshift, rshift.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/29878

Test Plan: Imported from OSS

Differential Revision: D18521746

Pulled By: gchanan

fbshipit-source-id: 11fd7db79ac8ae76b1a5df25fb0ff59d81fcf394
2019-11-25 14:51:34 -08:00
Gerard Goossen
faacbfa8bf Migrate index_add cpu from TH to ATen (#28421)
Summary:
Migrate index_add cpu from TH to ATen.

I couldn't find replacement for get1d and set1d, so doing pointer arithmetic inplace.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28421

Test Plan: existing tests

Differential Revision: D18060971

Pulled By: ggoossen

fbshipit-source-id: 413719990cdb2fe578964cde14e93577e48a4342
2019-11-22 06:25:13 -08:00
Johannes M Dieterich
48b943960e Add bfloat16 support in linear algebra on ROCm (#27719)
Summary:
This adds support for gemm-style matrix multiplications with data and output in bf16 to PyTorch on ROCm to the backend (i.e., bgemm).

Enable operators depending on bgemm.

With this change, bf16 matrices on ROCm can be multiplied on the GPU.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27719

Differential Revision: D18653514

Pulled By: bddppq

fbshipit-source-id: 805db923579bec6fc8fd1c51eeb5b1ef85a96758
2019-11-21 23:54:03 -08:00
Prasun Anand
0fdbb762d1 Warn user when resizing out Tensor after arange() (#29195)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/28347

gchanan , I am generating a warning as follows:
```
(torch_new) prasun@prasun-xps:~/dev/explore-array-computing$ python arange_test.py
Trying 45...
  Before arange shape is torch.Size([1, 45])
  After arange shape is torch.Size([1, 45])
Trying 46...
  Before arange shape is torch.Size([1, 46])
  After arange shape is torch.Size([1, 46])
Trying 47...
  Before arange shape is torch.Size([1, 47])
  After arange shape is torch.Size([1, 47])
Trying 48...
  Before arange shape is torch.Size([1, 48])
  After arange shape is torch.Size([1, 48])
Trying 49...
  Before arange shape is torch.Size([1, 49])
../aten/src/ATen/native/RangeFactories.cpp:163: UserWarning: Size of out Tensor does not match the result Tensor. The output Tensor will be resized!
  After arange shape is torch.Size([50])
Traceback (most recent call last):
  File "arange_test.py", line 10, in <module>
    assert len(line.shape) == 2
AssertionError
```

Is this alright ?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29195

Differential Revision: D18638184

Pulled By: ezyang

fbshipit-source-id: a93e4ce615b5a315570f9951021ef74fc1d895a6
2019-11-21 13:06:14 -08:00
vishwakftw
ae6af8d55f Enable multinomial for torch.half (#29266)
Summary:
Changelog:
- Re-enable multinomial sampling when the probability tensor has `dtype == torch.half`.

It seems to have been missed in https://github.com/pytorch/pytorch/issues/28481.

Fixes https://github.com/pytorch/pytorch/issues/29211
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29266

Differential Revision: D18619105

Pulled By: ezyang

fbshipit-source-id: 1f87e5183e75de5c5e0ffde862fc72d040b32864
2019-11-20 13:06:46 -08:00
Iurii Zdebskyi
36a47d71e1 Enabled bfloat16 for cuda (#27259)
Summary:
Enabled basic support for bfloat16 on cuda
Tested via unit tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27259

Differential Revision: D17728661

Pulled By: izdeby

fbshipit-source-id: 99efb6bc4aec029fe6bbc8a68963dca9c9dc5810
2019-11-20 08:49:56 -08:00
David Riazati
dca123e76d Add zipfile serialization (#29232)
Summary:
Stacked PRs
 * https://github.com/pytorch/pytorch/issues/29244 - Use custom CRC
 * **https://github.com/pytorch/pytorch/issues/29232 - Add zipfile serialization**

This adds a serialization method that uses a zipfile (https://github.com/pytorch/pytorch/issues/26567). Right now it is
guarded behind a flag `_use_new_zipfile_serialization`. In release mode it seems to have performance about the same / slightly better than the current serialization in some simple benchmarks for large/small tensors.

Follow ups:
* Flip the `_use_new_zipfile_serialization` flag
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29232

Differential Revision: D18332036

Pulled By: driazati

fbshipit-source-id: 1bac0847c4d599612cba905f2cac8248783be2f4
2019-11-19 10:17:32 -08:00
Brian Vaughan
adfb8a4888 Fix bug in atomicAdd for int16_t (#29231)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29231

Fixes: https://github.com/pytorch/pytorch/issues/29153

Bug is that atomicAdd doesn't correctly add values for some dtypes due to incorrect casting. Was returning zeros.

Incorrect behavior before this PR:

```
In [23]: sparse=torch.sparse_coo_tensor(indices=torch.tensor([[0,0],[1,1]]), values=torch.tensor([5, 6], dtype=torch.int16), size=(2,2), device='cuda', dtype=torch.int16 )

In [24]: sparse
Out[24]:
tensor(indices=tensor([[0, 0],
                       [1, 1]]),
       values=tensor([5, 6]),
       device='cuda:0', size=(2, 2), nnz=2, dtype=torch.int16,
       layout=torch.sparse_coo)

In [25]: sparse.coalesce()
Out[25]:
tensor(indices=tensor([[0],
                       [1]]),
       values=tensor([11]),
       device='cuda:0', size=(2, 2), nnz=1, dtype=torch.int16,
       layout=torch.sparse_coo)

In [26]: sparse.to_dense()
Out[26]:
tensor([[0, 0],
        [0, 0]], device='cuda:0', dtype=torch.int16)

In [27]: sparse.coalesce().to_dense()
Out[27]:
tensor([[ 0, 11],
        [ 0,  0]], device='cuda:0', dtype=torch.int16)

In [30]: torch.add(torch.zeros([2,2],dtype=torch.int16, device='cuda'), sparse)
Out[30]:
tensor([[0, 0],
        [0, 0]], device='cuda:0', dtype=torch.int16)
```

Test Plan: Imported from OSS

Differential Revision: D18575666

Pulled By: nairbv

fbshipit-source-id: 9b193b386bf4a9615014aa890d2e9f4f694940ac
2019-11-18 12:42:02 -08:00