Commit Graph

999 Commits

xiaobing.zhang
b47e9b97a2 Add op bitwise_and (#31104)
Summary:
Following https://github.com/pytorch/pytorch/pull/25665, this adds the `bitwise_and` operator.
Benchmark script:
```
import timeit
#for __and__
for n, t in [(10, 100000),(1000, 10000)]:
    print('__and__ (a.numel() == {}) for {} times'.format(n, t))
    for device in ('cpu', 'cuda'):
        for dtype in ('torch.int8', 'torch.uint8', 'torch.int16', 'torch.int32', 'torch.int64'):
            print(f'device: {device}, dtype: {dtype}, {t} times', end='\t\t')
            print(timeit.timeit(f'a & b\nif "{device}" == "cuda": torch.cuda.synchronize()', setup=f'import torch; a = torch.randint(0, 10, ({n},), dtype = {dtype}, device="{device}"); b = torch.randint(0, 10, ({n},), dtype = {dtype}, device="{device}")', number=t))
#for __iand__
for n, t in [(10, 100000),(1000, 10000)]:
    print('__iand__ (a.numel() == {}) for {} times'.format(n, t))
    for device in ('cpu', 'cuda'):
        for dtype in ('torch.int8', 'torch.uint8', 'torch.int16', 'torch.int32', 'torch.int64'):
            print(f'device: {device}, dtype: {dtype}, {t} times', end='\t\t')
            print(timeit.timeit(f'a &= b\nif "{device}" == "cuda": torch.cuda.synchronize()', setup=f'import torch; a = torch.randint(0, 10, ({n},), dtype = {dtype}, device="{device}"); b = torch.tensor(5, dtype = {dtype}, device="{device}")', number=t))
```
Device: **Tesla P100, skx-8180**
CUDA version: **9.0.176**

Before:
```
__and__ (a.numel() == 10) for 100000 times
device: cpu, dtype: torch.int8, 100000 times            0.1766007635742426
device: cpu, dtype: torch.uint8, 100000 times           0.17322628945112228
device: cpu, dtype: torch.int16, 100000 times           0.17650844901800156
device: cpu, dtype: torch.int32, 100000 times           0.17711848113685846
device: cpu, dtype: torch.int64, 100000 times           0.18240160401910543
device: cuda, dtype: torch.int8, 100000 times           1.273967768996954
device: cuda, dtype: torch.uint8, 100000 times          1.2778537990525365
device: cuda, dtype: torch.int16, 100000 times          1.2753686187788844
device: cuda, dtype: torch.int32, 100000 times          1.2797665279358625
device: cuda, dtype: torch.int64, 100000 times          1.2933144550770521
__and__ (a.numel() == 1000) for 10000 times
device: cpu, dtype: torch.int8, 10000 times             0.031139614060521126
device: cpu, dtype: torch.uint8, 10000 times            0.03091452084481716
device: cpu, dtype: torch.int16, 10000 times            0.022756479680538177
device: cpu, dtype: torch.int32, 10000 times            0.025045674294233322
device: cpu, dtype: torch.int64, 10000 times            0.024164282716810703
device: cuda, dtype: torch.int8, 10000 times            0.12820732593536377
device: cuda, dtype: torch.uint8, 10000 times           0.12775669433176517
device: cuda, dtype: torch.int16, 10000 times           0.12697868794202805
device: cuda, dtype: torch.int32, 10000 times           0.12832533661276102
device: cuda, dtype: torch.int64, 10000 times           0.1280576130375266
__iand__ (a.numel() == 10) for 100000 times
device: cpu, dtype: torch.int8, 100000 times            0.3687064303085208
device: cpu, dtype: torch.uint8, 100000 times           0.36253443732857704
device: cpu, dtype: torch.int16, 100000 times           0.362891579978168
device: cpu, dtype: torch.int32, 100000 times           0.37680106051266193
device: cpu, dtype: torch.int64, 100000 times           0.3689364707097411
device: cuda, dtype: torch.int8, 100000 times           1.419940729625523
device: cuda, dtype: torch.uint8, 100000 times          1.4247053815051913
device: cuda, dtype: torch.int16, 100000 times          1.4191444097086787
device: cuda, dtype: torch.int32, 100000 times          1.4305962566286325
device: cuda, dtype: torch.int64, 100000 times          1.4567416654899716
__iand__ (a.numel() == 1000) for 10000 times
device: cpu, dtype: torch.int8, 10000 times             0.06224383972585201
device: cpu, dtype: torch.uint8, 10000 times            0.06205617543309927
device: cpu, dtype: torch.int16, 10000 times            0.05016433447599411
device: cpu, dtype: torch.int32, 10000 times            0.05216377507895231
device: cpu, dtype: torch.int64, 10000 times            0.06139362137764692
device: cuda, dtype: torch.int8, 10000 times            0.14827249851077795
device: cuda, dtype: torch.uint8, 10000 times           0.14801877550780773
device: cuda, dtype: torch.int16, 10000 times           0.14952312968671322
device: cuda, dtype: torch.int32, 10000 times           0.14999118447303772
device: cuda, dtype: torch.int64, 10000 times           0.14951884001493454
```
After:
```
__and__ (a.numel() == 10) for 100000 times
device: cpu, dtype: torch.int8, 100000 times            0.23157884553074837
device: cpu, dtype: torch.uint8, 100000 times           0.23063660878688097
device: cpu, dtype: torch.int16, 100000 times           0.23005440644919872
device: cpu, dtype: torch.int32, 100000 times           0.23748818412423134
device: cpu, dtype: torch.int64, 100000 times           0.24106105230748653
device: cuda, dtype: torch.int8, 100000 times           1.4394256137311459
device: cuda, dtype: torch.uint8, 100000 times          1.4436759827658534
device: cuda, dtype: torch.int16, 100000 times          1.4631587155163288
device: cuda, dtype: torch.int32, 100000 times          1.459101552143693
device: cuda, dtype: torch.int64, 100000 times          1.4784048134461045
__and__ (a.numel() == 1000) for 10000 times
device: cpu, dtype: torch.int8, 10000 times             0.028442862443625927
device: cpu, dtype: torch.uint8, 10000 times            0.028130197897553444
device: cpu, dtype: torch.int16, 10000 times            0.025318274274468422
device: cpu, dtype: torch.int32, 10000 times            0.02519288007169962
device: cpu, dtype: torch.int64, 10000 times            0.028299466706812382
device: cuda, dtype: torch.int8, 10000 times            0.14342594426125288
device: cuda, dtype: torch.uint8, 10000 times           0.145280827768147
device: cuda, dtype: torch.int16, 10000 times           0.14673697855323553
device: cuda, dtype: torch.int32, 10000 times           0.14499565307050943
device: cuda, dtype: torch.int64, 10000 times           0.14582364354282618
__iand__ (a.numel() == 10) for 100000 times
device: cpu, dtype: torch.int8, 100000 times            0.25548241566866636
device: cpu, dtype: torch.uint8, 100000 times           0.2552562616765499
device: cpu, dtype: torch.int16, 100000 times           0.25905191246420145
device: cpu, dtype: torch.int32, 100000 times           0.26635489892214537
device: cpu, dtype: torch.int64, 100000 times           0.26269810926169157
device: cuda, dtype: torch.int8, 100000 times           1.485458506271243
device: cuda, dtype: torch.uint8, 100000 times          1.4742380809038877
device: cuda, dtype: torch.int16, 100000 times          1.507783885113895
device: cuda, dtype: torch.int32, 100000 times          1.4926990242674947
device: cuda, dtype: torch.int64, 100000 times          1.519851053133607
__iand__ (a.numel() == 1000) for 10000 times
device: cpu, dtype: torch.int8, 10000 times             0.03425929415971041
device: cpu, dtype: torch.uint8, 10000 times            0.03293587639927864
device: cpu, dtype: torch.int16, 10000 times            0.029559112153947353
device: cpu, dtype: torch.int32, 10000 times            0.030915481969714165
device: cpu, dtype: torch.int64, 10000 times            0.03292469773441553
device: cuda, dtype: torch.int8, 10000 times            0.15792148280888796
device: cuda, dtype: torch.uint8, 10000 times           0.16000914946198463
device: cuda, dtype: torch.int16, 10000 times           0.1600684942677617
device: cuda, dtype: torch.int32, 10000 times           0.16162546630948782
device: cuda, dtype: torch.int64, 10000 times           0.1629159888252616
```
Fixes https://github.com/pytorch/pytorch/issues/24508, https://github.com/pytorch/pytorch/issues/24509, https://github.com/pytorch/pytorch/issues/24655, https://github.com/pytorch/pytorch/issues/24656.
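For context, here is a minimal usage sketch of the new operator (illustrative only; see the PR for the full set of overloads):
```python
import torch

a = torch.tensor([0b1100, 0b1010], dtype=torch.int32)
b = torch.tensor([0b1010, 0b0110], dtype=torch.int32)

# Functional, operator, and in-place forms of the new op.
print(torch.bitwise_and(a, b))  # tensor([8, 2], dtype=torch.int32)
print(a & b)                    # `&` dispatches to bitwise_and for integral/bool tensors
a &= b                          # __iand__ / bitwise_and_
print(a)                        # tensor([8, 2], dtype=torch.int32)
```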
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31104

Differential Revision: D18938930

Pulled By: VitalyFedyunin

fbshipit-source-id: a77e805a0b84e8ace16c6e648c2f67dad44f2e44
2020-01-03 10:32:36 -08:00
leetanenbaum
0b9cd410a9 Fix cumsum error for tensors with zero elements (#31694)
Summary:
Currently `cumsum` crashes for tensors that have at least one dimension but zero elements, which happens when some dimension has size zero. This commit fixes the error by checking both `dim()` and `numel()` in the cumsum backward pass.

Fixes https://github.com/pytorch/pytorch/issues/31515
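A minimal sketch of the kind of case this guards against (a tensor with dimensions but no elements going through the backward pass); the exact repro from the linked issue may differ:
```python
import torch

# dim() == 2 but numel() == 0, because one dimension has size zero.
x = torch.zeros(0, 5, requires_grad=True)

# Backward through cumsum on the empty tensor used to crash; it should be a no-op.
x.cumsum(dim=0).sum().backward()
print(x.grad.shape)  # torch.Size([0, 5])
```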
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31694

Reviewed By: mrshenli

Differential Revision: D19266613

Pulled By: leedtan

fbshipit-source-id: 9407e0aa55440fed911c01a3580bb6c5eab62a16
2020-01-03 10:16:46 -08:00
BowenBao
c4f10e0fe7 Renaming scales parameter for interpolate (#31526)
Summary:
PR separated from https://github.com/pytorch/pytorch/pull/31274.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31526

Reviewed By: zou3519

Differential Revision: D19221931

Pulled By: gchanan

fbshipit-source-id: 81958a9910867ac9d62f2b47abc49384526c4e51
2020-01-02 08:19:30 -08:00
anjali411
ae214f67a5 updated code to ensure error check for negative dims
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/31636

Differential Revision: D19233031

Pulled By: anjali411

fbshipit-source-id: c29265ddd1f887f1a0b98aca56a2691d7584353d
2019-12-27 14:39:57 -08:00
Gregory Chanan
68e5172382 Support optional float parameters (float?, optional<double>). (#31517)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31517

This is going to be used by upsample (which currently uses magic values to represent optionals).

For now, we just introduce a fake function for testing (torch._test_optional_float(x)).

Test Plan: Imported from OSS

Differential Revision: D19198721

Pulled By: gchanan

fbshipit-source-id: 0a1382fde0927c5d277d02d62bfb31fb574b8c74
2019-12-23 08:33:39 -08:00
anjali411
9d9bc93bfb Added error message to indicate that reduction operations are not supported for dim>=64 (#31476)
Summary:
Reference: https://github.com/pytorch/pytorch/issues/23159
Reduction operations are currently unsupported for tensors with dim >= 64, so we should raise a descriptive RuntimeError saying so.
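An illustrative sketch of the behavior (assuming a tensor of that rank can be constructed at all; the exact error text is defined in the PR):
```python
import torch

try:
    # 65 dimensions, size 1 along each; reductions on dim() >= 64 are unsupported
    # and should now fail with a descriptive RuntimeError.
    x = torch.zeros([1] * 65)
    x.sum()
except RuntimeError as e:
    print("RuntimeError:", e)
```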
Diff: D19179039
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31476

Differential Revision: D19179039

Pulled By: anjali411

fbshipit-source-id: 58568f64627bf3df6b3e00a1498544c030e74a0e
2019-12-19 13:00:53 -08:00
Iurii Zdebskyi
58d2dd5b73 Enabled flip for bool tensors (#31267)
Summary:
Fix this [issue](https://github.com/pytorch/pytorch/issues/31213)
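A quick sketch of what this enables:
```python
import torch

mask = torch.tensor([True, False, False, True, True])
# flip on a bool tensor previously raised; it now reverses along the given dims.
print(torch.flip(mask, dims=[0]))  # tensor([ True,  True, False, False,  True])
```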
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31267

Differential Revision: D19047249

Pulled By: izdeby

fbshipit-source-id: f58ca3ac88aab28742b8d345400270f7d31c3856
2019-12-18 09:01:32 -08:00
Kurt Mohler
3694749cd1 Detect dill version in torch.save/load (#30985)
Summary:
Fix for issue https://github.com/pytorch/pytorch/issues/28313
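For reference, the workflow this affects is roughly the following (dill is a third-party pickling library passed in via the existing `pickle_module` argument; the minimum dill version enforced is specified in the PR):
```python
import torch
import dill  # third-party pickle replacement

obj = {"weights": torch.randn(3)}

# With this change, an incompatible dill version is detected up front
# instead of failing in a confusing way during save/load.
torch.save(obj, "checkpoint.pt", pickle_module=dill)
loaded = torch.load("checkpoint.pt", pickle_module=dill)
print(loaded["weights"])
```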
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30985

Differential Revision: D19142947

Pulled By: zou3519

fbshipit-source-id: 10e3a182a99e80ca8c9c8328b6f8764b27d78eb3
2019-12-18 08:05:08 -08:00
Xiang Gao
ffe0c1ae4d Make test_torch.py pass cuda-memcheck (#29243)
Summary:
Make the following changes:
- When there are more than 10k errors, cuda-memcheck only shows the first 10k; in this case we shouldn't raise an Exception
- Add an UNDER_CUDA_MEMCHECK environment variable to allow disabling `pin_memory` tests when running cuda-memcheck
- Add a `--ci` command option; when turned on, the script writes output to stdout instead of to a file and exits with an error if cuda-memcheck fails
- Add a `--nohang` command option; when turned on, a hang is treated as a pass instead of an error
- Do simple filtering on the tests to run: skip a test if `'cpu'` is in its name but `'cuda'` is not
- Add `--split` and `--rank` to allow splitting the work (NVIDIA CI has a 3-hour limit, so we have to split the work to stay under it)
- The error summary could be `ERROR SUMMARY: 1 error` or `ERROR SUMMARY: 2 errors`; the tail is `error` or `errors`, so the lines differ in length. The script is fixed to handle this.
- Ignore errors from `cufft`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29243

Differential Revision: D18941701

Pulled By: mruberry

fbshipit-source-id: 2048428f32b66ef50c67444c03ce4dd9491179d2
2019-12-14 20:29:58 -08:00
Vitaly Fedyunin
c35cddb306 Switch default memory format of clone operator to Preserve
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30089
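A short sketch of the behavior change, using a channels-last input (illustrative; current API names):
```python
import torch

x = torch.randn(2, 3, 8, 8).contiguous(memory_format=torch.channels_last)

# With the default switched to preserve, the clone keeps the channels-last
# strides instead of silently becoming contiguous.
print(x.clone().is_contiguous(memory_format=torch.channels_last))  # True

# The old behavior remains available explicitly.
print(x.clone(memory_format=torch.contiguous_format).is_contiguous())  # True
```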

Test Plan: Imported from OSS

Differential Revision: D18624985

Pulled By: VitalyFedyunin

fbshipit-source-id: 8d315b08b7b5858fd0a81d3375b44ccb94787ad4
2019-12-14 20:29:06 -08:00
Vitaly Fedyunin
fde3d707ad Switch default memory format of to (and similar) operators to Preserve
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30088

Test Plan: Imported from OSS

Differential Revision: D18624984

Pulled By: VitalyFedyunin

fbshipit-source-id: 54901786d7496c7dce785140b0585ac9093b1d86
2019-12-14 20:29:01 -08:00
Vitaly Fedyunin
927588df8e Switch default memory format of _like operators to Preserve
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30087

Test Plan: Imported from OSS

Differential Revision: D18624986

Pulled By: VitalyFedyunin

fbshipit-source-id: 8e434966f872ffaddf1249248ea445cbbab300ce
2019-12-14 20:28:57 -08:00
Xiang Gao
9954739956 Refactor test for unique and unique_consecutive and fix some bugs (#31211)
Summary:
Tests for unique_dim will be refactored in a separate PR.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31211

Differential Revision: D19034968

Pulled By: ngimel

fbshipit-source-id: 855d326b37638b5944f11fbbce03394cf000daf9
2019-12-14 20:28:38 -08:00
Iurii Zdebskyi
f6c31f61c5 Enabled roll for bool tensor (#31194)
Summary:
Fixed this [issue](https://github.com/pytorch/pytorch/issues/31079).
Tested via unit test
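A quick sketch of what this enables:
```python
import torch

flags = torch.tensor([True, False, False, True])
# roll on a bool tensor previously failed; it now shifts elements circularly.
print(torch.roll(flags, shifts=1))  # tensor([ True,  True, False, False])
```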
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31194

Differential Revision: D18958141

Pulled By: izdeby

fbshipit-source-id: 119bf4d31df10ee02c277f5a4663038470cf7780
2019-12-12 13:48:14 -08:00
Brian Vaughan
945ce71b18 Correctly handle scalar types, fix parse of numpy ints (#30486)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30486

Fixes: https://github.com/pytorch/pytorch/issues/29252

There is some incorrect code in the handling of parsing Python numbers that led to issue #29252:

When we allow a zero-dim numpy integer value to be interpreted as a scalar in PyTorch, we incorrectly parse the int as a float.

This PR also fixes the issue described in the "FIXME" here:
https://github.com/pytorch/pytorch/pull/27628/files#diff-f539198dd366265fb8dc2d661bc5d5bcR1487

Test Plan: Added a unit test based on the example given in the issue.
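A sketch of the kind of check such a test presumably makes (the zero-dim numpy integer should be treated as an integral scalar, not a float):
```python
import numpy as np
import torch

t = torch.arange(3)   # int64 tensor
s = np.int64(2)       # zero-dim numpy integer

# With the fix, s is parsed as an integral Scalar, so the result keeps an
# integer dtype instead of going through an incorrect float parse.
print((t + s).dtype)  # torch.int64
```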

Differential Revision: D18932520

Pulled By: nairbv

fbshipit-source-id: f6416f28dfd73ac72c1042042851d76beb5fcf65
2019-12-11 15:35:57 -08:00
Alban Desmaison
717274c001 Add useful warnings for t.grad when it won't be populated for known reasons (#30531)
Summary:
Fix https://github.com/pytorch/pytorch/issues/2362 and https://github.com/pytorch/pytorch/issues/19778

To avoid issues with frozen models, we only warn for Tensors that require gradients, are not leaves, and do not retain gradients.
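An illustrative example of when the warning fires and how to opt out of it:
```python
import torch

x = torch.randn(3, requires_grad=True)  # leaf
y = x * 2                               # non-leaf, does not retain grad
y.sum().backward()

print(x.grad)  # populated: x is a leaf
print(y.grad)  # None, and accessing it now emits a UserWarning explaining why

z = x * 2
z.retain_grad()                         # explicit opt-in silences the warning
z.sum().backward()
print(z.grad)                           # tensor([1., 1., 1.])
```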
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30531

Differential Revision: D18832767

Pulled By: albanD

fbshipit-source-id: 743e863dc14ab57713e66da78b2e4d759dfba0ff
2019-12-11 09:47:18 -08:00
Michael Suo
62b10721fb Actually make flake8 do something (#30892)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30892

Fixes all outstanding lints and actually installs a properly configured
flake8

Test Plan: Imported from OSS

Differential Revision: D18862825

Pulled By: suo

fbshipit-source-id: 08e9083338a7309272e17bb803feaa42e348aa85
2019-12-06 17:50:50 -08:00
Gregory Chanan
377131b0eb MultiMarginCriterion: fix scalar_check in the case where reduction == None. (#30826)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30826

Previously the scalar_check for the reduction None case was:
input.dim() <= 1, but it should be target based, i.e.:
target.dim() == 0.  This follows from the "correct cases", i.e.
(N, C) X (N,) -> (N,)
(C,) X () -> ()
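A shape-level sketch of the two cases via the functional API (illustrative only):
```python
import torch
import torch.nn.functional as F

# (N, C) input with (N,) target -> (N,) output under reduction='none'
out = F.multi_margin_loss(torch.randn(4, 3), torch.tensor([0, 2, 1, 2]), reduction='none')
print(out.shape)  # torch.Size([4])

# (C,) input with () target -> () output: the 0-dim result follows the target
out = F.multi_margin_loss(torch.randn(3), torch.tensor(1), reduction='none')
print(out.shape)  # torch.Size([])
```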

Test Plan: Imported from OSS

Differential Revision: D18833660

Pulled By: gchanan

fbshipit-source-id: 26338b842a8311718c4b89da3e2f1b726d5409b8
2019-12-06 09:04:38 -08:00
Gregory Chanan
e5d571ae25 Remove scalar_check from topk, move it to the THC implementation.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30852

Test Plan: Imported from OSS

Differential Revision: D18842662

Pulled By: gchanan

fbshipit-source-id: b5e8a4367fce9441be2ddbd026495f1911038221
2019-12-06 07:50:20 -08:00
Edward Yang
6e38d50352 Revert D18117070: Migrate max and min (binary) from TH to ATen.
Test Plan: revert-hammer

Differential Revision:
D18117070

Original commit changeset: e06d37a8a140

fbshipit-source-id: 49dd33f52e7e3ffcaafc02109a0a0a67545ec7e8
2019-12-05 14:43:29 -08:00
Edward Yang
2ced81f289 Revert "Default to not build Caffe2 operators on Windows. (#29061)" (#30740)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30740

This reverts commit 7102aceaf8.

Test Plan: Imported from OSS

Differential Revision: D18834315

Pulled By: ezyang

fbshipit-source-id: 2dbd1cf686864b9840365083182cd6188a285399
2019-12-05 14:01:59 -08:00
Hong Xu
1578a28692 Migrate max and min (binary) from TH to ATen. (#27185)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27185

The TH implementation will be removed after the unary max and min are migrated.

Benchmark: (Debian 10, Release build, gcc 7.4, no turbo)

```python
import timeit
for device in ('cpu', 'cuda'):
    print(f'device: {device}')
    for op in ('max', 'min'):
        for dtype in ('torch.double', 'torch.float', 'torch.int16', 'torch.int32', 'torch.int64'):
            for n, t in [(10_000, 200000),
                         (100_000, 20000)]:
                print(f'torch.{op}(a, b), numel() == {n} for {t} times, dtype={dtype}')
                # a and b are created on the benchmarked device; b holds the constant n // 2
                print(timeit.timeit(
                    f'torch.{op}(a, b)' + (';torch.cuda.synchronize()' if device == 'cuda' else ''),
                    setup=f'import torch; '
                          f'a = torch.arange({n}, dtype={dtype}, device="{device}"); '
                          f'b = torch.full(({n},), {n} // 2, dtype={dtype}, device="{device}")',
                    number=t))
    print()
```

Before:

```
device: cpu
torch.max(a, b), numel() == 10000 for 200000 times, dtype=torch.double
2.241763713000182
torch.max(a, b), numel() == 100000 for 20000 times, dtype=torch.double
1.7138833169992722
torch.max(a, b), numel() == 10000 for 200000 times, dtype=torch.float
2.2183356810000987
torch.max(a, b), numel() == 100000 for 20000 times, dtype=torch.float
1.7031846980007685
torch.max(a, b), numel() == 10000 for 200000 times, dtype=torch.int16
1.7704679510006827
torch.max(a, b), numel() == 100000 for 20000 times, dtype=torch.int16
1.289198366999699
torch.max(a, b), numel() == 10000 for 200000 times, dtype=torch.int32
1.7937613740014058
torch.max(a, b), numel() == 100000 for 20000 times, dtype=torch.int32
1.2930124340000475
torch.max(a, b), numel() == 10000 for 200000 times, dtype=torch.int64
1.8032857640009752
torch.max(a, b), numel() == 100000 for 20000 times, dtype=torch.int64
1.2908709189996443
torch.min(a, b), numel() == 10000 for 200000 times, dtype=torch.double
1.8829010000008566
torch.min(a, b), numel() == 100000 for 20000 times, dtype=torch.double
1.2994690759987861
torch.min(a, b), numel() == 10000 for 200000 times, dtype=torch.float
1.8037853410005482
torch.min(a, b), numel() == 100000 for 20000 times, dtype=torch.float
1.2929310759991495
torch.min(a, b), numel() == 10000 for 200000 times, dtype=torch.int16
1.8075240359994496
torch.min(a, b), numel() == 100000 for 20000 times, dtype=torch.int16
1.2932477679987642
torch.min(a, b), numel() == 10000 for 200000 times, dtype=torch.int32
1.7868400779989315
torch.min(a, b), numel() == 100000 for 20000 times, dtype=torch.int32
1.2885970789993735
torch.min(a, b), numel() == 10000 for 200000 times, dtype=torch.int64
1.8389664830010588
torch.min(a, b), numel() == 100000 for 20000 times, dtype=torch.int64
1.29402057399966

device: cuda
torch.max(a, b), numel() == 10000 for 200000 times, dtype=torch.double
4.787109836999662
torch.max(a, b), numel() == 100000 for 20000 times, dtype=torch.double
1.842438002999188
torch.max(a, b), numel() == 10000 for 200000 times, dtype=torch.float
3.429616614999759
torch.max(a, b), numel() == 100000 for 20000 times, dtype=torch.float
1.835390076999829
torch.max(a, b), numel() == 10000 for 200000 times, dtype=torch.int16
2.940423873000327
torch.max(a, b), numel() == 100000 for 20000 times, dtype=torch.int16
1.4108991760003846
torch.max(a, b), numel() == 10000 for 200000 times, dtype=torch.int32
2.9318018840003788
torch.max(a, b), numel() == 100000 for 20000 times, dtype=torch.int32
1.4168134739993548
torch.max(a, b), numel() == 10000 for 200000 times, dtype=torch.int64
2.9610764919998473
torch.max(a, b), numel() == 100000 for 20000 times, dtype=torch.int64
1.4189234130008117
torch.min(a, b), numel() == 10000 for 200000 times, dtype=torch.double
2.960172712999338
torch.min(a, b), numel() == 100000 for 20000 times, dtype=torch.double
1.4162539499993727
torch.min(a, b), numel() == 10000 for 200000 times, dtype=torch.float
2.8985912560001452
torch.min(a, b), numel() == 100000 for 20000 times, dtype=torch.float
1.4113489299998037
torch.min(a, b), numel() == 10000 for 200000 times, dtype=torch.int16
2.9160250799995993
torch.min(a, b), numel() == 100000 for 20000 times, dtype=torch.int16
1.4128787690005993
torch.min(a, b), numel() == 10000 for 200000 times, dtype=torch.int32
2.8806865219994506
torch.min(a, b), numel() == 100000 for 20000 times, dtype=torch.int32
1.4086357010000938
torch.min(a, b), numel() == 10000 for 200000 times, dtype=torch.int64
2.9362181240012433
torch.min(a, b), numel() == 100000 for 20000 times, dtype=torch.int64
1.4151225870009512

```

After:

```
device: cpu
torch.max(a, b), numel() == 10000 for 200000 times, dtype=torch.double
2.2685823729998447
torch.max(a, b), numel() == 100000 for 20000 times, dtype=torch.double
1.72004808300062
torch.max(a, b), numel() == 10000 for 200000 times, dtype=torch.float
2.212242640000113
torch.max(a, b), numel() == 100000 for 20000 times, dtype=torch.float
1.7089235590001408
torch.max(a, b), numel() == 10000 for 200000 times, dtype=torch.int16
1.7767087259999244
torch.max(a, b), numel() == 100000 for 20000 times, dtype=torch.int16
1.2916517639996528
torch.max(a, b), numel() == 10000 for 200000 times, dtype=torch.int32
1.8265984959998605
torch.max(a, b), numel() == 100000 for 20000 times, dtype=torch.int32
1.3002885240002797
torch.max(a, b), numel() == 10000 for 200000 times, dtype=torch.int64
1.8084679720004715
torch.max(a, b), numel() == 100000 for 20000 times, dtype=torch.int64
1.3012119999993956
torch.min(a, b), numel() == 10000 for 200000 times, dtype=torch.double
1.8800218449996464
torch.min(a, b), numel() == 100000 for 20000 times, dtype=torch.double
1.3060645710002063
torch.min(a, b), numel() == 10000 for 200000 times, dtype=torch.float
2.4905043950002437
torch.min(a, b), numel() == 100000 for 20000 times, dtype=torch.float
1.9126290209997023
torch.min(a, b), numel() == 10000 for 200000 times, dtype=torch.int16
1.7972335520007618
torch.min(a, b), numel() == 100000 for 20000 times, dtype=torch.int16
1.2918074379995232
torch.min(a, b), numel() == 10000 for 200000 times, dtype=torch.int32
1.8047651860006226
torch.min(a, b), numel() == 100000 for 20000 times, dtype=torch.int32
1.2992197730000044
torch.min(a, b), numel() == 10000 for 200000 times, dtype=torch.int64
1.8526509560006161
torch.min(a, b), numel() == 100000 for 20000 times, dtype=torch.int64
1.3030709570002728

device: cuda
torch.max(a, b), numel() == 10000 for 200000 times, dtype=torch.double
4.700986622000528
torch.max(a, b), numel() == 100000 for 20000 times, dtype=torch.double
1.8415469050005413
torch.max(a, b), numel() == 10000 for 200000 times, dtype=torch.float
3.3051693249999516
torch.max(a, b), numel() == 100000 for 20000 times, dtype=torch.float
1.8321999460004008
torch.max(a, b), numel() == 10000 for 200000 times, dtype=torch.int16
2.8086475109994353
torch.max(a, b), numel() == 100000 for 20000 times, dtype=torch.int16
1.405110773999695
torch.max(a, b), numel() == 10000 for 200000 times, dtype=torch.int32
2.913458047999484
torch.max(a, b), numel() == 100000 for 20000 times, dtype=torch.int32
1.4236377289998927
torch.max(a, b), numel() == 10000 for 200000 times, dtype=torch.int64
2.9386842409994642
torch.max(a, b), numel() == 100000 for 20000 times, dtype=torch.int64
1.4230227469997772
torch.min(a, b), numel() == 10000 for 200000 times, dtype=torch.double
3.0341797270002644
torch.min(a, b), numel() == 100000 for 20000 times, dtype=torch.double
1.4289592409995748
torch.min(a, b), numel() == 10000 for 200000 times, dtype=torch.float
3.6091147850002017
torch.min(a, b), numel() == 100000 for 20000 times, dtype=torch.float
2.036691903999781
torch.min(a, b), numel() == 10000 for 200000 times, dtype=torch.int16
2.8256167649997224
torch.min(a, b), numel() == 100000 for 20000 times, dtype=torch.int16
1.4078955400000268
torch.min(a, b), numel() == 10000 for 200000 times, dtype=torch.int32
2.8631781489993955
torch.min(a, b), numel() == 100000 for 20000 times, dtype=torch.int32
1.4210130069996012
torch.min(a, b), numel() == 10000 for 200000 times, dtype=torch.int64
3.0112479260005784
torch.min(a, b), numel() == 100000 for 20000 times, dtype=torch.int64
1.4297719679998409

```

Partly solves #24594 and #24595

Close #25016

Test Plan: Imported from OSS

Differential Revision: D18117070

Pulled By: VitalyFedyunin

fbshipit-source-id: e06d37a8a1405848ba0b9e398870a77eb52bae8b
2019-12-05 09:55:56 -08:00
Gregory Chanan
2607772959 Turn off scalar_checks for SpatialDepthwiseConvolution and SpatialConvolutionMM. (#30789)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30789

The input(s) can't be 0-dimensional, so it's irrelevant.

Restacked version of: https://github.com/pytorch/pytorch/pull/30438

Test Plan: Imported from OSS

Differential Revision: D18825716

Pulled By: gchanan

fbshipit-source-id: a4883b795163efcb9d8dba6166d0f2102b6728a2
2019-12-05 08:07:31 -08:00
Gregory Chanan
50625798df Fix scalar check of MultiLabelMarginLoss. (#30768)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30768

The behavior didn't match the documentation, because the documentation (for 'none' reduction) reads:
input X target -> output
(N, C) X (N, C) -> (N,)
(C,) X (C,) -> ()

but the latter case would output (1,).  This also changes the case to:
() X (C,) -> ()
from:
() X (C,) -> (C,)
which makes more sense with the above formulas.
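
A shape-level sketch of the documented cases via the functional API (illustrative; targets are class indices padded with -1):
```python
import torch
import torch.nn.functional as F

# (N, C) input with (N, C) target -> (N,) under reduction='none'
inp = torch.randn(2, 4)
tgt = torch.tensor([[3, 0, -1, -1],
                    [1, 2, 3, -1]])
print(F.multilabel_margin_loss(inp, tgt, reduction='none').shape)  # torch.Size([2])

# (C,) input with (C,) target -> () after this change (previously (1,))
out = F.multilabel_margin_loss(torch.randn(4), torch.tensor([0, 2, -1, -1]), reduction='none')
print(out.shape)  # torch.Size([])
```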

Restacked version of: https://github.com/pytorch/pytorch/pull/30748

Test Plan: Imported from OSS

Differential Revision: D18821554

Pulled By: gchanan

fbshipit-source-id: 3df77c51cf25648cb5fab62a68b09f49c91dab4e
2019-12-05 08:07:20 -08:00
Gregory Chanan
473a044835 Fix a CUDA memory leak in MultiLabelMarginCriterion error checking. (#30767)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30767

Restacked version of: https://github.com/pytorch/pytorch/pull/30733

Test Plan: Imported from OSS

Differential Revision: D18821553

Pulled By: gchanan

fbshipit-source-id: 8bf0365ce54dd2f07a5d6d0937332d0baf75b350
2019-12-05 08:07:15 -08:00
Gregory Chanan
786de33832 Move scalar_check logic from codegen to code in NLLLoss. (#30670)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30670

Also turn off scalar_check for grad_input: it isn't necessary because the input can't be 0-dimensional.

Test Plan: Imported from OSS

Differential Revision: D18784523

Pulled By: gchanan

fbshipit-source-id: 246d30970457075a0403dd0089317659a2cd2dd4
2019-12-04 12:30:23 -08:00
Gregory Chanan
fa2aa245cf Simplify scalar_check of nll_loss. (#30669)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30669

The inputs can't be 0-d, so we don't need that check in the scalar_check.

Test Plan: Imported from OSS

Differential Revision: D18784524

Pulled By: gchanan

fbshipit-source-id: d44222dffc91880a6e8c7be69e6e146e60040d43
2019-12-04 12:30:19 -08:00
Hong Xu
bb5dcaf24f Add logical_and and logical_or (#30521)
Summary:
This relands 8bbafa0b32 with the CI failure it caused fixed (the lambdas in the CUDA kernels had an incorrect return type).
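Basic usage of the two new ops:
```python
import torch

a = torch.tensor([True, False, True, False])
b = torch.tensor([True, True, False, False])

print(torch.logical_and(a, b))  # tensor([ True, False, False, False])
print(torch.logical_or(a, b))   # tensor([ True,  True,  True, False])

# Non-bool inputs are treated as "nonzero means True" and yield a bool result.
print(torch.logical_and(torch.tensor([0, 1, 2]), torch.tensor([1, 0, 5])))
# tensor([False, False,  True])
```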
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30521

Differential Revision: D18770151

Pulled By: ailzhang

fbshipit-source-id: 02f0fe1d5718c34d24da6dbb5884ee8b247ce39a
2019-12-03 18:24:54 -08:00
Hong Xu
4ac614191a Remove exp10 in TH (unused)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30422

Test Plan: Imported from OSS

Differential Revision: D18764186

Pulled By: VitalyFedyunin

fbshipit-source-id: 9343a5a7e4edf61ba3b85eaf846b2e149ed6529a
2019-12-03 18:17:15 -08:00
Brian Vaughan
a376dd344c Added check for torch.where on CPU that both arguments have same dtype (#30662)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30662

Cherry picked from: https://github.com/pytorch/pytorch/pull/29081

Test Plan: Imported from OSS

Differential Revision: D18782295

Pulled By: nairbv

fbshipit-source-id: 897ab25ddf8819ca34f5e86c5d3f41debb56cb04

Co-authored-by: ifedan
2019-12-03 15:19:52 -08:00
Gregory Chanan
8b29701ae5 Turn off scalar_checks for _th_reciprocal. (#30436)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30436

The underlying TH implementation is correct.

Test Plan: Imported from OSS

Differential Revision: D18699088

Pulled By: gchanan

fbshipit-source-id: e75a588ae4afb0506922ba98208546d5c0de623a
2019-12-03 07:04:53 -08:00
Gregory Chanan
61798865e3 Turn off scalar_checks for torch.clamp. (#30435)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30435

The underlying THC implementations are correct.

Test Plan: Imported from OSS

Differential Revision: D18699089

Pulled By: gchanan

fbshipit-source-id: f5d1319bf48eae36903296dad0b98ed80661f732
2019-12-03 07:04:47 -08:00
Brian Vaughan
e5b947a3a8 Raise an error for is_signed on quantized types (#30527)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30527

When we introduced dtype.is_signed we allowed for support of
quantized types, but we're not sure what the correct result should be.

See discussion at https://github.com/pytorch/pytorch/pull/29511
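
A small sketch of the resulting behavior (the quantized case is guarded with try/except since, per this change, it raises rather than answering; later releases may define it):
```python
import torch

# Regular dtypes report signedness as before.
print(torch.int8.is_signed)     # True
print(torch.uint8.is_signed)    # False
print(torch.float32.is_signed)  # True

# As of this change, quantized dtypes refuse to answer instead of guessing.
try:
    print(torch.qint8.is_signed)
except RuntimeError as e:
    print("RuntimeError:", e)
```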

Test Plan: Imported from OSS

Differential Revision: D18765410

Pulled By: nairbv

fbshipit-source-id: c87cfe999b604cfcbbafa561e04d0d5cdbf41e6d
2019-12-03 06:34:53 -08:00
Gregory Chanan
569729527b Turn off scalar_checks for exp, cos, cosh, tan, atan, tanh, erf, erfc. (#30434)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30434

These are all pointwise ops that are implemented correctly wrt shapes in THC.

Test Plan: Imported from OSS

Differential Revision: D18699087

Pulled By: gchanan

fbshipit-source-id: 82cb91b00c77bfaca75be497c87fc7ae52daf46c
2019-12-02 16:10:25 -08:00
Gregory Chanan
0b25371f5d Turn off scalar_check for _th_normal.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/29955

Test Plan: Imported from OSS

Differential Revision: D18548051

Pulled By: gchanan

fbshipit-source-id: c652999ac9e37d2592aa85ef022040fe0700b5cf
2019-11-27 14:52:06 -08:00
Richard Zou
ec5c08de74 Revert D18580867: Add logical_and and logical_or
Test Plan: revert-hammer

Differential Revision:
D18580867

Original commit changeset: 7e4d7c37da4d

fbshipit-source-id: 81fb604c7aef8d847f518f5faa016e7bd0423016
2019-11-27 09:27:00 -08:00
Hong Xu
8bbafa0b32 Add logical_and and logical_or (#28162)
Summary:
Superseding https://github.com/pytorch/pytorch/issues/24379 as type promotion has been implemented.

Close https://github.com/pytorch/pytorch/issues/24379
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28162

Differential Revision: D18580867

Pulled By: ailzhang

fbshipit-source-id: 7e4d7c37da4dc8df87314bd4f1f6a7539e46586a
2019-11-26 17:38:22 -08:00
Gregory Chanan
dbce53fe32 Turn off scalar_check for _th_gather. (#29954)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29954

The underlying op handles scalar_check correctly.

Test Plan: Imported from OSS

Differential Revision: D18548054

Pulled By: gchanan

fbshipit-source-id: a1b44afa80c2928b78abbfba8b8b5d3608ac0fd3
2019-11-26 10:23:42 -08:00
Gregory Chanan
72ac45662b Turn off scalar_checks for torch.take. (#29953)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29953

The underlying function handles it correctly.

Test Plan: Imported from OSS

Differential Revision: D18548055

Pulled By: gchanan

fbshipit-source-id: cc2d0ae37d9689423363d115c6a653cb64840528
2019-11-26 10:23:37 -08:00
Gregory Chanan
79a830af56 Turn off scalar_check for Tensor.set_(Tensor) (#29952)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29952

The underlying op handles the check correctly.

Test Plan: Imported from OSS

Differential Revision: D18548048

Pulled By: gchanan

fbshipit-source-id: 9ac6fde743408e59ccdfc61bd574ebe6e2862238
2019-11-26 10:23:33 -08:00
Gregory Chanan
0c67311878 Turn off scalar_check for set_(Storage, ...) (#29950)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29950

The underlying code handles it correctly.

Test Plan: Imported from OSS

Differential Revision: D18548052

Pulled By: gchanan

fbshipit-source-id: 88b737572c816fb0026ac5e66da7e3f4ab686773
2019-11-25 14:52:22 -08:00
Gregory Chanan
7160300638 Turn off scalar_check for reductions _th_max, _th_min. (#29949)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29949

The underlying functions handle this already.

Test Plan: Imported from OSS

Differential Revision: D18548047

Pulled By: gchanan

fbshipit-source-id: 123c9297db4e4315da9b1d996ac8b41aa1b4c7bc
2019-11-25 14:52:17 -08:00
Gregory Chanan
16606e1725 Turn off scalar_check for mode; the underlying code is correct.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/29948

Test Plan: Imported from OSS

Differential Revision: D18548053

Pulled By: gchanan

fbshipit-source-id: 15cdfc24d3e5123497c72dc09c5e6b28cb5e1f88
2019-11-25 14:52:12 -08:00
Gregory Chanan
b8eba7aca9 Turn off scalar_check for ormqr. (#29947)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29947

It requires > 0-dimensional tensors.

Test Plan: Imported from OSS

Differential Revision: D18548049

Pulled By: gchanan

fbshipit-source-id: ce80a42515b59513a0e5ef2b32e2c2b90b4d64f5
2019-11-25 14:52:07 -08:00
Gregory Chanan
7c6cc1d6d4 Turn off scalar_checks for _th_multinomial_alias_draw. (#29946)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29946

It requires > 0-dimensional tensors.

Test Plan: Imported from OSS

Differential Revision: D18548050

Pulled By: gchanan

fbshipit-source-id: 4d1e3b53bd701137cc2cb674f95627a5e064a274
2019-11-25 14:52:02 -08:00
Gregory Chanan
ce5f1a1b25 Turn off scalar_check for masked_select. (#29923)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29923

Note that this changes the behavior of masked_select when both "self" and "mask" are 0-dimensional.

In previous versions of PyTorch, this would return a 0-dimensional tensor.  But the documentation reads:
"Returns a new 1-D tensor which indexes the input tensor according to the boolean mask mask which is a BoolTensor."

Test Plan: Imported from OSS

Differential Revision: D18539560

Pulled By: gchanan

fbshipit-source-id: 1637ed2c434fcf8ceead0073aa610581f4a19d21
2019-11-25 14:51:51 -08:00
Gregory Chanan
0c9c62ba6e Turn off scalar_checks for __and__ and clone.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/29880

Test Plan: Imported from OSS

Differential Revision: D18521732

Pulled By: gchanan

fbshipit-source-id: 7fdf5d8a7b93b43ac32067222cb8df5e790900de
2019-11-25 14:51:46 -08:00
Gregory Chanan
94ad7544ae Turn off scalar_check for __or__
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/29879

Test Plan: Imported from OSS

Differential Revision: D18521745

Pulled By: gchanan

fbshipit-source-id: 93d17d5e9cad5dd6d2c20221d87408c838d74eca
2019-11-25 14:51:40 -08:00
Gregory Chanan
f994377d28 Turn off scalar_check for lshift, rshift.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/29878

Test Plan: Imported from OSS

Differential Revision: D18521746

Pulled By: gchanan

fbshipit-source-id: 11fd7db79ac8ae76b1a5df25fb0ff59d81fcf394
2019-11-25 14:51:34 -08:00
Gerard Goossen
faacbfa8bf Migrate index_add cpu from TH to ATen (#28421)
Summary:
Migrate index_add cpu from TH to ATen.

I couldn't find a replacement for get1d and set1d, so I'm doing the pointer arithmetic inline.
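The operator being migrated, for reference (behavior is unchanged by the port):
```python
import torch

x = torch.zeros(5, 3)
index = torch.tensor([0, 4, 2])
source = torch.ones(3, 3)

# Rows of `source` are accumulated into rows 0, 4 and 2 of `x` along dim 0.
x.index_add_(0, index, source)
print(x)
```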
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28421

Test Plan: existing tests

Differential Revision: D18060971

Pulled By: ggoossen

fbshipit-source-id: 413719990cdb2fe578964cde14e93577e48a4342
2019-11-22 06:25:13 -08:00