pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
Kyle Chen	16faabe7f0	[ROCm] re-enable tests (#50691 ) Summary: Signed-off-by: Kyle Chen <kylechen@amd.com> cc: jeffdaily re-enable test_torch.py and test_unary_ufuncs.py tests Pull Request resolved: https://github.com/pytorch/pytorch/pull/50691 Reviewed By: mruberry Differential Revision: D25967842 Pulled By: ngimel fbshipit-source-id: dc0f6cb68fe4d151c2719bdf67ead96e1396acf2	2021-01-20 11:23:39 -08:00
Xinyu Li	7526e38cd3	Revert "Stable sort for CPU (#50052 )" (#50752 ) Summary: This reverts commit `c99f356051`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/50752 Reviewed By: zou3519 Differential Revision: D25958146 Pulled By: glaringlee fbshipit-source-id: f4068d038f9bd337bac8b673eaeb46a4646f6c77	2021-01-19 18:21:25 -08:00
kshitij12345	316f0b89c3	[testing] Port `torch.{repeat, tile}` tests to use OpInfo machinery (#50199 ) Summary: Reference: https://github.com/pytorch/pytorch/issues/50013 Pull Request resolved: https://github.com/pytorch/pytorch/pull/50199 Reviewed By: ngimel Differential Revision: D25949791 Pulled By: mruberry fbshipit-source-id: 10eaf2d749fac8c08847f50461e72ad1c75c61e3	2021-01-19 06:02:27 -08:00
nikitaved	c458558334	kill `multinomial_alias_setup/draw` (#50489 ) Summary: As per title. Partially Fixes https://github.com/pytorch/pytorch/issues/49421. These functions appear to be dead code. Pull Request resolved: https://github.com/pytorch/pytorch/pull/50489 Reviewed By: mruberry Differential Revision: D25948912 Pulled By: ngimel fbshipit-source-id: 108723bd4c76cbc3535eba902d6f74597bfdfa58	2021-01-19 00:23:58 -08:00
76181208+imaginary-person@users.noreply.github.com	3f052ba07b	Remove unnecessary dtype checks for complex types & disable complex dispatch for CPU min/max pointwise ops (#50465 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/50064 PROBLEM DESCRIPTION: 1. Had not removed dtype checks for complex types in the previous PR (https://github.com/pytorch/pytorch/issues/50347) for this issue. These type-checks were added in https://github.com/pytorch/pytorch/issues/36377, but are no longer necessary, as we now rely upon dispatch macros to produce error messages. 2. dtype checks in `clamp_max()` and `clamp_min()` for complex inputs had not been removed either. 3. For min/max pointwise ops in TensorCompareKernel.cpp, complex dispatch had not been removed for min/max functions. ### FIX DESCRIPTION: FIX SUMMARY: 1. Removed dtype checks added in https://github.com/pytorch/pytorch/issues/36377, and added 3 more in TensorCompare.cpp. 2. Removed dtype checks for complex inputs in `clamp_max()` and `clamp_min()`. 3. Disabled complex dispatch for min/max pointwise ops in TensorCompareKernel.cpp. 4. Error messages in the exceptions raised due to min/max ops not being implemented are now checked for containing the text _not support_ (which can also be present in _not supported_), or _not implemented_, so one of them should be a part of error messages, in order for them to be informative. REASON FOR NOT CHANGING DISPATCH FOR CUDA AND CLAMP OPS: As for the CUDA min/max operations, their kernels do not seem to be compiled & dispatched for complex types anyway, so no further changes seem to be required. Basically, the dispatch macros currently being used don't have cases for complex types. For example, 1. 
the reduce CUDA ops use [AT_DISPATCH_ALL_TYPES_AND2 (`678fe9f077`)](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/Dispatch.h#L548-L575) in [ReduceMinMaxKernel.cu](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/cuda/ReduceMinMaxKernel.cu), and that macro doesn't allow complex types. 2. In [MaxMinElementwiseKernel.cu](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/cuda/MaxMinElementwiseKernel.cu), the CUDA pointwise ops use [`AT_DISPATCH_FLOATING_TYPES_AND2 (`678fe9f077`)`](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/Dispatch.h#L240-L263) for non-integral & non-boolean types, and this macro doesn't have a case for complex types either. 3. [clamp CUDA ops](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/cuda/UnaryOpsKernel.cu#L170-L211) use `AT_DISPATCH_ALL_TYPES_AND2 (`678fe9f077`)`, which doesn't have a case for complex types. Similarly, [CPU clamp min/max ops](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/cpu/UnaryOpsKernel.cpp#L428-L458) use the `AT_DISPATCH_ALL_TYPES_AND` dispatch macro, which doesn't have a case for complex types. REASON FOR ADDING 3 dtype CHECKS: There are a few cases in which the methods corresponding to `min_stub()` or `max_stub()` are not called, so dispatch macros don't get invoked, resulting in no exceptions being raised. Hence, `dtype` checks are necessary at 3 places to raise exceptions: 1. `52dcc72999/aten/src/ATen/native/TensorCompare.cpp (L342)` 2. `52dcc72999/aten/src/ATen/native/TensorCompare.cpp (L422)` 3. 
`52dcc72999/aten/src/ATen/native/TensorCompare.cpp (L389)` The first dtype check requirement can be verified from the following example Python code based on `test_complex_unsupported()`: ``` import unittest import torch class MyTestCase(unittest.TestCase): def test_1(self): t = torch.tensor((1 + 1j), device='cpu', dtype=torch.complex128) with self.assertRaises(Exception): torch.max(t, dim=0) if __name__ == '__main__': unittest.main() ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/50465 Reviewed By: mruberry Differential Revision: D25938106 Pulled By: ngimel fbshipit-source-id: 95e2df02ba8583fa3ce87d4a2fdcd60b912dda46	2021-01-17 22:00:05 -08:00
nikitaved	c99f356051	Stable sort for CPU (#50052 ) Summary: Fixes [https://github.com/pytorch/pytorch/issues/38681](https://github.com/pytorch/pytorch/issues/38681) for the CPU. Pull Request resolved: https://github.com/pytorch/pytorch/pull/50052 Reviewed By: mrshenli Differential Revision: D25900823 Pulled By: glaringlee fbshipit-source-id: 1a3fa336037d0aa2344d79f46dcacfd478a353d1	2021-01-15 19:34:27 -08:00
kshitij12345	5546a12fe3	remove redundant tests from tensor_op_tests (#50096 ) Summary: All these Unary operators have been an entry in OpInfo DB. Pull Request resolved: https://github.com/pytorch/pytorch/pull/50096 Reviewed By: zhangguanheng66 Differential Revision: D25870048 Pulled By: mruberry fbshipit-source-id: b64e06d5b9ab5a03a202cda8c22fdb7e4ae8adf8	2021-01-12 04:53:12 -08:00
kshitij12345	9f832c8d3e	[numpy] torch.exp: promote integer inputs to float (#50093 ) Summary: Reference: https://github.com/pytorch/pytorch/issues/42515 Pull Request resolved: https://github.com/pytorch/pytorch/pull/50093 Reviewed By: H-Huang Differential Revision: D25803549 Pulled By: mruberry fbshipit-source-id: e6f245b5e728f2dca6072f8c359f03dff63aa14d	2021-01-08 06:30:18 -08:00
Thomas Viehmann	def8aa5499	Remove cpu half and dead code from multinomial (#50063 ) Summary: Based on ngimel's (Thank you!) feedback, cpu half was only accidental, so I'm removing it. This lets us ditch the old codepath for without replacement in favour of the new, better one. Pull Request resolved: https://github.com/pytorch/pytorch/pull/50063 Reviewed By: mruberry Differential Revision: D25772449 Pulled By: ngimel fbshipit-source-id: 608729c32237de4ee6d1acf7e316a6e878dac7f0	2021-01-05 19:46:33 -08:00
anjali411	8fb5f16931	Complex backward for indexing, slicing, joining, and mutating ops (#49552 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49552 This PR: 1. Migrates independent autograd test for `hstack`, `dstack`, `vstack`, `movedim`, `moveaxis` from `test_autograd.py` to the new `OpInfo` based tests. 2. Migrates autograd test for `gather`, `index_select` from the method_tests to the new `OpInfo` based tests. 3. Enables complex backward for `stack, gather, index_select, index_add_` and adds tests for complex autograd for all the above mentioned ops. Test Plan: Imported from OSS Reviewed By: mruberry Differential Revision: D25682511 Pulled By: anjali411 fbshipit-source-id: 5d8f89db4a9ec340ab99a6196987d44a23e2c6c6	2021-01-04 19:44:15 -08:00
kshitij12345	42d2e31cd6	[numpy] `torch.rsqrt` : promote integer inputs to float (#47909 ) Summary: Reference https://github.com/pytorch/pytorch/issues/42515 Pull Request resolved: https://github.com/pytorch/pytorch/pull/47909 Reviewed By: ngimel Differential Revision: D25730876 Pulled By: mruberry fbshipit-source-id: c87a8f686e1dd64e511640e0278021c4a584ccf2	2020-12-30 10:33:14 -08:00
kshitij12345	963f7629b5	[numpy] `torch.digamma` : promote integer inputs to float (#48302 ) Summary: BC-breaking Note: This PR updates PyTorch's digamma function to be consistent with SciPy's special.digamma function. This changes the result of the digamma function on the nonpositive integers, where the gamma function is not defined. Since the gamma function is undefined at these points, the (typical) derivative of the logarithm of the gamma function is also undefined at these points, and for negative integers this PR updates digamma to return NaN. For zero, however, it returns -inf to be consistent with SciPy. Interestingly, SciPy made a similar change, which was noticed by at least one user: https://github.com/scipy/scipy/issues/9663#issue-396587679. SciPy's returning of negative infinity at zero is intentional: `59347ae8b8/scipy/special/cephes/psi.c (L163)` This change is consistent with the C++ standard for the gamma function: https://en.cppreference.com/w/cpp/numeric/math/tgamma PR Summary: Reference https://github.com/pytorch/pytorch/issues/42515 Pull Request resolved: https://github.com/pytorch/pytorch/pull/48302 Reviewed By: ngimel Differential Revision: D25664087 Pulled By: mruberry fbshipit-source-id: 1168e81e218bf9fe5b849db0e07e7b22e590cf73	2020-12-24 22:42:55 -08:00
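The special values described in the BC-breaking note above can be sketched in pure Python. This is not PyTorch's kernel: `digamma_sketch` is a hypothetical helper built on the standard recurrence and asymptotic series, and only the values at zero and at the negative integers are taken from the commit message.

```python
import math

def digamma_sketch(x: float) -> float:
    """Illustrative digamma (hypothetical helper, not torch.digamma)."""
    if x == 0:
        return -math.inf      # zero: -inf, as the note says SciPy returns
    if x < 0 and x == math.floor(x):
        return math.nan       # negative integers: NaN
    result = 0.0
    # Recurrence psi(x) = psi(x + 1) - 1/x moves x up to where the series is accurate.
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    # Asymptotic series: ln x - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6)
    result += math.log(x) - 0.5 / x - inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result
```

The two early returns are the only part that mirrors the change in this commit; the rest exists so the helper returns sensible values elsewhere.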
Kshiteej K	3f4b98d568	[numpy] `torch.erfinv`: promote integer inputs to float (#49155 ) Summary: Reference: https://github.com/pytorch/pytorch/issues/42515 Pull Request resolved: https://github.com/pytorch/pytorch/pull/49155 Reviewed By: ngimel Differential Revision: D25664234 Pulled By: mruberry fbshipit-source-id: 630fd1d334567d78c8130236a67dda0f5ec02560	2020-12-23 14:22:03 -08:00
Kshiteej K	461aafe389	[numpy] `torch.angle`: promote integer inputs to float (#49163 ) Summary: BC-Breaking Note: This PR updates PyTorch's angle operator to be consistent with NumPy's. Previously angle would return zero for all floating point values (including NaN). Now angle returns `pi` for negative floating point values, zero for non-negative floating point values, and propagates NaNs. PR Summary: Reference: https://github.com/pytorch/pytorch/issues/42515 TODO: * [x] Add BC-Breaking Note (Prev all real numbers returned `0` (even `nan`)) -> Fixed to match the correct behavior of NumPy. Pull Request resolved: https://github.com/pytorch/pytorch/pull/49163 Reviewed By: ngimel Differential Revision: D25681758 Pulled By: mruberry fbshipit-source-id: 54143fe6bccbae044427ff15d8daaed3596f9685	2020-12-22 18:43:14 -08:00
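The rule for real inputs stated in the BC-breaking note above can be written as a small pure-Python sketch. `angle_real_sketch` is a hypothetical name used only for illustration; it is not the PyTorch implementation.

```python
import math

def angle_real_sketch(x: float) -> float:
    """Angle of a real number under the rule in the note (hypothetical helper)."""
    if math.isnan(x):
        return math.nan                  # NaNs propagate
    return math.pi if x < 0 else 0.0     # pi for negatives, zero otherwise
```

Under the previous behavior described in the note, all three branches would have returned zero.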
Xiang Gao	50b361a821	Enable BF16 for indexing on CUDA (#48801 ) Summary: Fixes #{issue number} Pull Request resolved: https://github.com/pytorch/pytorch/pull/48801 Reviewed By: glaringlee Differential Revision: D25542914 Pulled By: ngimel fbshipit-source-id: 4113eb2729d15b40a89268172cc37122b5213624	2020-12-14 17:24:31 -08:00
Chester Liu	3a943e9f82	Use Unicode friendly API on Win32 in THAllocator (#47905 ) Summary: This replaces the narrow character set APIs with the wide character set ones in `THAllocator.cpp`. This fixes the potential crashes caused by passing non-ASCII characters in `torch::from_file` on Windows. See: https://github.com/pytorch/pytorch/issues/47422 Pull Request resolved: https://github.com/pytorch/pytorch/pull/47905 Reviewed By: zhangguanheng66 Differential Revision: D25399146 Pulled By: ezyang fbshipit-source-id: 0a183b65de171c48ed1718fa71e773224eaf196f	2020-12-14 14:24:20 -08:00
Brian Hirsh	f54ab8fbfe	Revert "Revert D25003113: make validate debug-only in Device copy ctr" (#49123 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49123 This reverts commit `7a4a2df225`. Test Plan: Imported from OSS Reviewed By: ezyang Differential Revision: D25463531 Pulled By: bdhirsh fbshipit-source-id: 7c7ecdc1d63ffd137b84a129887c424b2083a958	2020-12-14 07:33:37 -08:00
kiyosora	15200e385a	Enable torch.where() to support Float16 & BFloat16 type inputs (#49004 ) Summary: Fixed https://github.com/pytorch/pytorch/issues/49075 Pull Request resolved: https://github.com/pytorch/pytorch/pull/49004 Reviewed By: zou3519 Differential Revision: D25495225 Pulled By: H-Huang fbshipit-source-id: 09418ee5503f65c8862e40119c5802779505a4db	2020-12-11 13:36:41 -08:00
kshitij12345	eb9516eaa4	[numpy] `torch.exp{2, m1}`: promote integer inputs to float (#48926 ) Summary: Reference: https://github.com/pytorch/pytorch/issues/42515 Pull Request resolved: https://github.com/pytorch/pytorch/pull/48926 Reviewed By: zhangguanheng66 Differential Revision: D25392344 Pulled By: mruberry fbshipit-source-id: ddbabcfd58cc4c944153b1a224cc232efa022104	2020-12-10 00:14:22 -08:00
Kurt Mohler	27f7d1c286	Port `eig` CPU from TH to ATen (#43215 ) Summary: Also consolidates shared logic between `eig` CPU and CUDA implementations Fixes https://github.com/pytorch/pytorch/issues/24693 Pull Request resolved: https://github.com/pytorch/pytorch/pull/43215 Reviewed By: VitalyFedyunin, zhangguanheng66 Differential Revision: D23862622 Pulled By: ngimel fbshipit-source-id: ca1002428850520cd74cd5b7ed8cb4d12dbd9c52	2020-12-09 23:27:35 -08:00
Peter Bell	5765bbd78c	Review memory overlap checks for advanced indexing operations (#48651 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/45964 Indexing operators e.g. `scatter`/`gather` use tensor restriding so the `TensorIterator` built-in overlap checking needs to be disabled. This adds the missing overlap checks for these operators. In addition, some indexing operators don't work well with `MemOverlapStatus::FULL` which is explicitly allowed by `assert_no_partial_overlap`. So, I've introduced `assert_no_overlap` that will raise an error on partial _or_ full overlap. Pull Request resolved: https://github.com/pytorch/pytorch/pull/48651 Reviewed By: zhangguanheng66 Differential Revision: D25401047 Pulled By: ngimel fbshipit-source-id: 53abb41ac63c4283f3f1b10a0abb037169f20b89	2020-12-09 15:10:52 -08:00
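The difference between the two assertions named above can be illustrated with a simplified Python model. This sketch assumes each tensor is a contiguous `(start, length)` range with positive length, which is far simpler than the strided tensors the real checks handle; every name ending in `_sketch`, as well as `Overlap` and `overlap_status`, is hypothetical.

```python
from enum import Enum

class Overlap(Enum):
    NO = "no"
    PARTIAL = "partial"
    FULL = "full"

def overlap_status(a, b):
    """Classify two contiguous (start, length) memory ranges."""
    a_start, a_len = a
    b_start, b_len = b
    if (a_start, a_len) == (b_start, b_len):
        return Overlap.FULL          # same memory exactly
    if a_start < b_start + b_len and b_start < a_start + a_len:
        return Overlap.PARTIAL       # ranges intersect but are not identical
    return Overlap.NO

def assert_no_partial_overlap_sketch(a, b):
    # Full overlap is allowed here; only partial overlap is rejected.
    if overlap_status(a, b) is Overlap.PARTIAL:
        raise ValueError("partial overlap")

def assert_no_overlap_sketch(a, b):
    # Stricter variant: rejects partial and full overlap alike.
    if overlap_status(a, b) is not Overlap.NO:
        raise ValueError("overlap")
```

With identical ranges such as `(0, 4)` and `(0, 4)`, the first assertion passes and the second raises, which is the gap the commit message describes.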
Supriya Rao	7a4a2df225	Revert D25003113: make validate debug-only in Device copy ctr Test Plan: revert-hammer Differential Revision: D25003113 (`4b26cafb8f`) Original commit changeset: e17e6495db65 fbshipit-source-id: fd636c954a97bd80892464feb974a11b9dd96899	2020-12-09 13:58:11 -08:00
Brian Hirsh	4b26cafb8f	make validate debug-only in Device copy ctr (#47854 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47854 Test Plan: Imported from OSS Reviewed By: ezyang Differential Revision: D25003113 Pulled By: bdhirsh fbshipit-source-id: e17e6495db65c48c7daf3429acbd86742286a1f3	2020-12-09 08:11:24 -08:00
Rong Rong	58c13cf685	Back out "Revert D25375885: [pytorch][PR] Reenable some BF16 tests on CUDA" Summary: Revert D25397144 69829f3fff4d4a2d1a71bb52e90d3c7f16b27fa3 Test Plan: Revert Hammer Reviewed By: janeyx99 Differential Revision: D25397572 fbshipit-source-id: 625ca2a32e4558ae4582a15697b6e1cc57cc1573	2020-12-08 07:52:59 -08:00
Rong Rong	39445f718c	Revert D25375885: [pytorch][PR] Reenable some BF16 tests on CUDA Test Plan: revert-hammer Differential Revision: D25375885 (`e3893b867f`) Original commit changeset: 2e19fe725ae9 fbshipit-source-id: 69829f3fff4d4a2d1a71bb52e90d3c7f16b27fa3	2020-12-08 07:05:33 -08:00
Xiang Gao	e3893b867f	Reenable some BF16 tests on CUDA (#48805 ) Summary: Fixes #{issue number} Pull Request resolved: https://github.com/pytorch/pytorch/pull/48805 Reviewed By: agolynski Differential Revision: D25375885 Pulled By: ailzhang fbshipit-source-id: 2e19fe725ae9450bd1a2bc4e2d308c59b9f94fac	2020-12-07 16:16:07 -08:00
Gao, Xiang	a39398b9e5	CUDA BF16 norm (#48806 ) Summary: Fixes #{issue number} Pull Request resolved: https://github.com/pytorch/pytorch/pull/48806 Reviewed By: mruberry Differential Revision: D25358465 Pulled By: ngimel fbshipit-source-id: 1a2afd86f39e96db0754d04bf81de045b1e1235c	2020-12-06 23:41:05 -08:00
Kurt Mohler	2cb9204159	Add nondeterministic alert to index_copy, median CUDA and kthvalue CUDA (#46942 ) Summary: Also fixes issue where skipped tests did not properly restore deterministic flag. Fixes https://github.com/pytorch/pytorch/issues/46743 Pull Request resolved: https://github.com/pytorch/pytorch/pull/46942 Reviewed By: heitorschueroff Differential Revision: D25298020 Pulled By: mruberry fbshipit-source-id: 14b1680e1fa536ec72018d0cdb0a3cf83b098767	2020-12-03 11:03:07 -08:00
Edward Yang	f9a0abfc43	Fix code review from #48659 and #48116 (#48731 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/48731 Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Reviewed By: bhosmer Differential Revision: D25278034 Pulled By: ezyang fbshipit-source-id: 73652311b48d8d80c06e9385b7ff18ef3a158ae8	2020-12-03 08:26:17 -08:00
kshitij12345	90a3049a9a	[fix] repr(torch.device) (#48655 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/48585 In the following commit `4c9eb57914`, type of `DeviceIndex` was changed from `uint16_t` to `uint8_t`. `uint8_t` is treated as ascii chars by std::cout and other stream operators. Hence the broken `repr` Stackoverflow Reference: https://stackoverflow.com/questions/19562103/uint8-t-cant-be-printed-with-cout Pull Request resolved: https://github.com/pytorch/pytorch/pull/48655 Reviewed By: bdhirsh Differential Revision: D25272289 Pulled By: ezyang fbshipit-source-id: a1549f5f8d417138cf38795e4c373e3a487d3691	2020-12-02 15:48:17 -08:00
Erjia Guan	c98c98d77d	Migrate `fmod` and `fmod_` from TH to ATen (CUDA) (#47323 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47323 Fixes #24565 Test Plan: Imported from OSS Reviewed By: zou3519 Differential Revision: D24763086 Pulled By: ejguan fbshipit-source-id: fa004baea19bbbdbeb44814903db29226805ef0e	2020-12-02 09:38:29 -08:00
Edward Yang	b4f5efa7b2	Structured kernels generate Meta registrations (#48116 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/48116 If you port kernels to be structured, you get Meta kernels automatically generated for you. This is one payoff of structured kernels. Code generation was mercifully really simple, although at risk of "swiss cheese" syndrome: there's two new conditionals in the codegen to tweak behavior when generating for meta keys. It's not too bad right now but there's a risk of things getting out of hand. One way to rationalize the logic here would be to transmit "TensorMeta-ness" inside the TensorOptions (so tensor_from_meta can deal with it); then the "Meta" kernel magic would literally just be generating empty out_impls to call after all the scaffolding is done. But I didn't do this because it seemed like it would be more annoying short term. Also had to teach resize_ to work on meta tensors, since we use them to implement the out kernels. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Reviewed By: bhosmer, ailzhang Differential Revision: D25056640 Pulled By: ezyang fbshipit-source-id: f8fcfa0dbb58a94d9b4196748f56e155f83b1521	2020-12-02 07:54:48 -08:00
kshitij12345	bcc85a363e	[numpy] `torch.sigmoid` : promote integer inputs to float (#47551 ) Summary: Reference https://github.com/pytorch/pytorch/issues/42515 Pull Request resolved: https://github.com/pytorch/pytorch/pull/47551 Reviewed By: ngimel Differential Revision: D25211953 Pulled By: mruberry fbshipit-source-id: 9174cda401aeba0fd585a4c9bda166dbcf64f42f	2020-12-01 23:28:57 -08:00
Taylor Robie	27905dfe9c	Expose CXX_FLAGS through __config__ (#47861 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47861 Test Plan: Imported from OSS Reviewed By: ngimel Differential Revision: D25199263 Pulled By: robieta fbshipit-source-id: 3cfdb0485d686a03a68dd0907d1733634857963f	2020-12-01 19:58:29 -08:00
Mike Ruberry	36c87f1243	Refactors test_torch.py to be fewer than 10k lines (#47356 ) Summary: Creates multiple new test suites to have fewer tests in test_torch.py, consistent with previous test suite creation like test_unary_ufuncs.py and test_linalg.py. Pull Request resolved: https://github.com/pytorch/pytorch/pull/47356 Reviewed By: ngimel Differential Revision: D25202268 Pulled By: mruberry fbshipit-source-id: 75fde3ca76545d1b32b86d432a5cb7a5ba8f5bb6	2020-11-28 20:11:40 -08:00
kiyosora	272f4db043	Implement NumPy-like function torch.float_power() (#44937 ) Summary: - Related with https://github.com/pytorch/pytorch/issues/38349 - Implementing the NumPy-like function `torch.float_power()` . Pull Request resolved: https://github.com/pytorch/pytorch/pull/44937 Reviewed By: ngimel Differential Revision: D25192119 Pulled By: mruberry fbshipit-source-id: 2e446b8e0c2825f045fe057e30c9419335557a05	2020-11-27 18:01:42 -08:00
Antonio Cuni	344918576c	Migrate `eig` from the TH to Aten (CUDA) (#44105 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/24553 Pull Request resolved: https://github.com/pytorch/pytorch/pull/44105 Reviewed By: ngimel Differential Revision: D25192116 Pulled By: mruberry fbshipit-source-id: 87f1ba4924b9174bfe0d9e2ab14bbe1c6bae879c	2020-11-27 15:15:48 -08:00
elfringham	db1b0b06c4	Flake8 fixes (#48453 ) Summary: Quiet errors from flake8. Only a couple of code changes for deprecated Python syntax from before 2.4. The rest is just adding noqa markers. Pull Request resolved: https://github.com/pytorch/pytorch/pull/48453 Reviewed By: mruberry Differential Revision: D25181871 Pulled By: ngimel fbshipit-source-id: f8d7298aae783b1bce2a46827b088fc390970641	2020-11-25 19:09:50 -08:00
Xiao Wang	4ab2055857	Re-enable only cuda tests wrongly disabled before (#48429 ) Summary: Close https://github.com/pytorch/pytorch/issues/46536 Re-enable only cuda tests wrongly disabled in https://github.com/pytorch/pytorch/pull/45332 See discussions https://github.com/pytorch/pytorch/issues/46536#issuecomment-721386038 and https://github.com/pytorch/pytorch/pull/45332#issuecomment-721350987 ~~See also https://github.com/pytorch/pytorch/pull/47237 and https://github.com/pytorch/pytorch/pull/47642~~ Pull Request resolved: https://github.com/pytorch/pytorch/pull/48429 Reviewed By: ngimel Differential Revision: D25176368 Pulled By: mruberry fbshipit-source-id: 3822f5a45e58c0e387624e70ea272d16218901a9	2020-11-25 13:26:35 -08:00
kshitij12345	9ecaeb0962	[numpy] Add unary-ufunc tests for `erf` variants (#47155 ) Summary: Adding Unary Ufunc Test entry for `erf` variants. We use scipy functions for reference implementation. We can later update the tests once these functions promote integer inputs to float. Pull Request resolved: https://github.com/pytorch/pytorch/pull/47155 Reviewed By: ngimel Differential Revision: D25176654 Pulled By: mruberry fbshipit-source-id: cb08efed1468b27650cec4f87a9a34e999ebd810	2020-11-25 13:20:14 -08:00
Fayçal Arbai	2e0a8b75d8	An implementation of torch.tile as requested in pytorch/pytorch#38349 (#47974 ) Summary: The approach is to simply reuse `torch.repeat` while adding one more piece of functionality to tile, which is to prepend 1's to the reps array if the tensor has more dimensions than the reps given in input. Thus, for a tensor of shape (64, 3, 24, 24), reps of (2, 2) will become (1, 1, 2, 2), which is what NumPy does. I've encountered some instability with the test on my end, where I could get a random failure of the test (due to, sometimes, random value of `self.dim()`, and sometimes, segfaults). I'd appreciate any feedback on the test or an explanation for this instability so I can fix this. Pull Request resolved: https://github.com/pytorch/pytorch/pull/47974 Reviewed By: ngimel Differential Revision: D25148963 Pulled By: mruberry fbshipit-source-id: bf63b72c6fe3d3998a682822e669666f7cc97c58	2020-11-24 18:07:25 -08:00
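The reps-padding rule from the summary above can be sketched without PyTorch. `pad_reps` and `tiled_shape` are hypothetical helpers that operate on shape tuples only, and the sketch assumes `reps` is no longer than the tensor's shape, which is the only case the commit message describes.

```python
def pad_reps(shape, reps):
    """Prepend 1's to reps until it has one entry per tensor dimension."""
    shape, reps = tuple(shape), tuple(reps)
    if len(reps) < len(shape):
        reps = (1,) * (len(shape) - len(reps)) + reps
    return reps

def tiled_shape(shape, reps):
    """Shape produced by repeating each dimension by the padded reps."""
    padded = pad_reps(shape, reps)
    return tuple(dim * rep for dim, rep in zip(shape, padded))

# Example from the commit message: shape (64, 3, 24, 24) with reps (2, 2).
padded = pad_reps((64, 3, 24, 24), (2, 2))       # (1, 1, 2, 2)
result = tiled_shape((64, 3, 24, 24), (2, 2))    # (64, 3, 48, 48)
```

Once the reps are padded, the operation reduces to a plain per-dimension repeat, which is why reusing `torch.repeat` is sufficient.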
Kurt Mohler	b6654906c7	Fix assertEqual's handling of numpy array inputs (#48217 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/47948 Pull Request resolved: https://github.com/pytorch/pytorch/pull/48217 Reviewed By: mrshenli Differential Revision: D25119607 Pulled By: mruberry fbshipit-source-id: efe84380d3797d242c2aa7d43d2209bcba89cee0	2020-11-22 00:13:42 -08:00
Nikita Shulga	dc843fe197	Fix test_ldexp on Windows (#48335 ) Summary: Force `torch.randint` to generate tensor of int32 rather than tensor of int64 Delete unneeded copies Pull Request resolved: https://github.com/pytorch/pytorch/pull/48335 Reviewed By: ranman Differential Revision: D25133312 Pulled By: malfet fbshipit-source-id: 70bfcb6b7ff3bea611c4277e6634dc7473541288	2020-11-20 15:41:59 -08:00
Randall Hunt	562d4c3bc5	Add basic ldexp operator for numpy compatibility (#45370 ) Summary: Adds ldexp operator for https://github.com/pytorch/pytorch/issues/38349 I'm not entirely sure the changes to `NamedRegistrations.cpp` were needed but I saw other operators in there so I added it. Normally the ldexp operator is used along with the frexp to construct and deconstruct floating point values. This is useful for performing operations on either the mantissa or exponent portions of floating point values. Sleef, std math.h, and cuda support both ldexp and frexp but not for all data types. I wasn't able to figure out how to get the iterators to play nicely with a vectorized kernel so I have left this with just the normal CPU kernel for now. This is the first operator I'm adding so please review with an eye for errors. Pull Request resolved: https://github.com/pytorch/pytorch/pull/45370 Reviewed By: mruberry Differential Revision: D24333516 Pulled By: ranman fbshipit-source-id: 2df78088f00aa9789aae1124eda399771e120d3f	2020-11-20 04:09:39 -08:00
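The frexp/ldexp pairing mentioned above can be seen with Python's standard `math` module, used here as a stand-in for the operator added in this commit. The variable names are illustrative only.

```python
import math

# Deconstruct: 8.0 == 0.5 * 2**4, with the mantissa in [0.5, 1).
mantissa, exponent = math.frexp(8.0)

# Reconstruct the original value from its two parts.
rebuilt = math.ldexp(mantissa, exponent)

# Operate on the exponent alone: adding 3 multiplies the value by 2**3.
scaled = math.ldexp(mantissa, exponent + 3)
```

Here `rebuilt` equals the original 8.0 and `scaled` equals 64.0, with the mantissa left untouched.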
kiyosora	008f840e7a	Implement in-place method torch.cumsum_ and torch.cumprod_ (#47651 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/47193 Pull Request resolved: https://github.com/pytorch/pytorch/pull/47651 Reviewed By: zou3519 Differential Revision: D24992438 Pulled By: ezyang fbshipit-source-id: c38bea55f4af1fc92be780eaa8e1d462316e6192	2020-11-19 11:20:12 -08:00
mfkasim91	8819bad86c	Implement igammac (3rd PR) (#48171 ) Summary: Related: https://github.com/pytorch/pytorch/issues/46183 (torch.igamma) This is the regularized upper incomplete gamma function. This is supposed to be exactly the same as https://github.com/pytorch/pytorch/issues/47463, but after rebasing the `viable/strict` branch. cc: mruberry Pull Request resolved: https://github.com/pytorch/pytorch/pull/48171 Reviewed By: zhangguanheng66 Differential Revision: D25060107 Pulled By: mruberry fbshipit-source-id: 89780dea21dbb2141cbc4f7f18192cb78a769b17	2020-11-18 23:44:32 -08:00
Edward Yang	a97d059614	Get TestTorch.test_empty_meta working again (#48113 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/48113 Fix is simple: just treat Meta as a backend covered by AutogradOther. This semantically makes sense, since meta kernels are just like regular CPU/CUDA kernels, they just don't do any compute. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Reviewed By: zhangguanheng66 Differential Revision: D25056641 Pulled By: ezyang fbshipit-source-id: 7b68911982352b3e0ee8616b38cd9c70bd58a740	2020-11-18 19:50:27 -08:00
Scott Wolchok	4c9eb57914	[PyTorch] Narrow Device to 2 bytes by narrowing DeviceType and DeviceIndex (#47023 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47023 DeviceType pretty clearly only needs 1 byte. DeviceIndex only needs 1 byte given that machines don't have anywhere near 255 GPUs in them as far as I know. ghstack-source-id: 116901430 Test Plan: Existing tests, added assertion to catch if my assumption about DeviceIndex is incorrect Reviewed By: dzhulgakov Differential Revision: D24605460 fbshipit-source-id: 7c9a89027fcf8eebd623b7cdbf6302162c981cd2	2020-11-18 19:39:40 -08:00
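The two-byte layout described above can be mimicked with `ctypes` as a rough illustration. `DeviceSketch` is a hypothetical structure, and the signed one-byte field types are an assumption made for this sketch; the commit message only states that each field needs one byte.

```python
import ctypes

class DeviceSketch(ctypes.Structure):
    """Hypothetical stand-in for a device made of two one-byte fields."""
    _fields_ = [
        ("type", ctypes.c_int8),    # assumed one-byte device type
        ("index", ctypes.c_int8),   # assumed one-byte device index
    ]

size_in_bytes = ctypes.sizeof(DeviceSketch)
```

Two one-byte members need no padding between them, so `size_in_bytes` comes out to 2.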
Mike Ruberry	ea1e78a0c5	Revert D24853669: [pytorch][PR] Migrate `eig` from the TH to Aten (CUDA) Test Plan: revert-hammer Differential Revision: D24853669 (`866f8591be`) Original commit changeset: a513242dc7f4 fbshipit-source-id: a0c8c424b61b1e627d9102de6b4c6d0717a6c06d	2020-11-18 16:53:18 -08:00
Antonio Cuni	866f8591be	Migrate `eig` from the TH to Aten (CUDA) (#44105 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/24553 Pull Request resolved: https://github.com/pytorch/pytorch/pull/44105 Reviewed By: heitorschueroff Differential Revision: D24853669 Pulled By: mruberry fbshipit-source-id: a513242dc7f49f55dbc6046c18d8a9d9aa2aaf8d	2020-11-18 12:10:18 -08:00
kshitij12345	68a3a3f3b5	Add `torch.swapdims` and `torch.swapaxes` (#46041 ) Summary: Reference https://github.com/pytorch/pytorch/issues/38349 Delegates to `torch.transpose` (not sure what is the best way to alias) TODO: * [x] Add test * [x] Add documentation Pull Request resolved: https://github.com/pytorch/pytorch/pull/46041 Reviewed By: gchanan Differential Revision: D25022816 Pulled By: mruberry fbshipit-source-id: c80223d081cef84f523ef9b23fbedeb2f8c1efc5	2020-11-18 11:35:53 -08:00
Ivan Yashchuk	81b1673a21	Enable complex tests that depend on batched matmul on CUDA (#47910 ) Summary: Now that https://github.com/pytorch/pytorch/pull/42553 is merged we can delete a bit of code from the tests and enable some of the skipped complex tests. Unfortunately, `test_pinverse_complex_xfailed` and `test_symeig_complex_xfailed` had bugs and it wasn't caught automatically that these tests xpass. Need to be careful next time with `unittest.expectedFailure`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/47910 Reviewed By: zhangguanheng66 Differential Revision: D25052130 Pulled By: mruberry fbshipit-source-id: 29512995c024b882f9cb78b7bede77733d5762d0	2020-11-18 10:44:47 -08:00
Heitor Schueroff	2ff748a680	Move kthvalue scalar test to separate method for XLA (#48042 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/48042 Moving scalar test to a separate method so the XLA team can continue to test for the other cases without failing. Requested here https://github.com/pytorch/xla/issues/2620#issuecomment-725696108 Test Plan: Imported from OSS Reviewed By: zhangguanheng66 Differential Revision: D25055677 Pulled By: heitorschueroff fbshipit-source-id: 5da66bac78ea197821fee0b9b8a213ff2dc19c67	2020-11-18 07:49:14 -08:00
Xiang Gao	d293413b3e	Batched matmul dtypes (#47873 ) Summary: Fixes #{issue number} Pull Request resolved: https://github.com/pytorch/pytorch/pull/47873 Reviewed By: navahgar Differential Revision: D24928256 Pulled By: anjali411 fbshipit-source-id: a26aef7a15a13fc0b5716e905971265d8b1cea61	2020-11-14 22:45:48 -08:00
anjali411	db1f217d8d	Add complex support for torch.addcmul and torch.addcdiv (#46639 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46639 Resolves: https://github.com/pytorch/pytorch/issues/46546#issuecomment-713122245 Test Plan: Imported from OSS Reviewed By: izdeby, ansley Differential Revision: D24879099 Pulled By: anjali411 fbshipit-source-id: 76131dc68ac964e67a633f62e07f7c799df4463e	2020-11-14 21:27:34 -08:00
Ivan Yashchuk	260daf088d	Added linalg.cholesky (#46083 ) Summary: This PR adds `torch.linalg.cholesky` function that matches `numpy.linalg.cholesky`. Fixed `lda` argument to `lapackCholesky` calls. Added `random_hermitian_pd_matrix` helper function for tests. Ref https://github.com/pytorch/pytorch/issues/42666. Pull Request resolved: https://github.com/pytorch/pytorch/pull/46083 Reviewed By: ailzhang Differential Revision: D24861752 Pulled By: mruberry fbshipit-source-id: 214dbceb4e8a2c589df209493efd843962d25593	2020-11-13 16:50:40 -08:00
Richard Zou	1c7c612af0	Revert D24543682: [pytorch][PR] Added support for complex input for torch.lu_solve Test Plan: revert-hammer Differential Revision: D24543682 (`ffd0003022`) Original commit changeset: 165bde39ef95 fbshipit-source-id: 790b4157fdbc7149aaf0748555efe6daed7e1a23	2020-11-13 08:24:53 -08:00
Ivan Yashchuk	ffd0003022	Added support for complex input for torch.lu_solve (#46862 ) Summary: `torch.lu_solve` now works for complex inputs both on CPU and GPU. I moved the existing tests to `test_linalg.py` and modified them to test complex dtypes, but I didn't modify/improve the body of the tests. Ref. https://github.com/pytorch/pytorch/issues/33152 Pull Request resolved: https://github.com/pytorch/pytorch/pull/46862 Reviewed By: nikithamalgifb Differential Revision: D24543682 Pulled By: anjali411 fbshipit-source-id: 165bde39ef95cafebf976c5ba4b487297efe8433	2020-11-13 02:35:31 -08:00
Gao, Xiang	0652d755d3	Fix some flaky tests in test_torch.py and test_nn.py (#46941 ) Summary: Fixed test: - `test_is_nonzero`, this is asserting exact match, which is flaky when `TORCH_SHOW_CPP_STACKTRACES=1`, I changed this to non-exact assert - `test_pinverse` TF32 - `test_symeig` TF32 - `test_triangular_solve_batched_many_batches_cpu_float64` precision on CPU BLAS - `test_qr` TF32, as well as the tensor factory forgets a `dtype=dtype` - `test_lu` TF32 - `ConvTranspose2d` TF32 - `Conv3d_1x1x1_no_bias` TF32 - `Transformer*` TF32 Pull Request resolved: https://github.com/pytorch/pytorch/pull/46941 Reviewed By: heitorschueroff Differential Revision: D24852725 Pulled By: mruberry fbshipit-source-id: ccd4740cc643476178d81059d1c78da34e5082ed	2020-11-12 22:35:42 -08:00
kshitij12345	3649a2c170	[numpy] `torch.sqrt` : promote integer inputs to float (#47293 ) Summary: Reference https://github.com/pytorch/pytorch/issues/42515 Pull Request resolved: https://github.com/pytorch/pytorch/pull/47293 Reviewed By: malfet Differential Revision: D24855994 Pulled By: mruberry fbshipit-source-id: 1e6752f2eeba6d638dea0bdea0c650cf722718c9	2020-11-12 16:16:09 -08:00
Ivan Yashchuk	149190c014	Added CUDA support for complex input for torch.solve (#47045 ) Summary: `torch.solve` now works for complex inputs on GPU. I moved the existing tests to `test_linalg.py` and modified them to test complex and float32 dtypes. Differentiation also works correctly with complex inputs. Fixes https://github.com/pytorch/pytorch/issues/41084 Ref. https://github.com/pytorch/pytorch/issues/33152 anjali411 I hope you don't mind that I took over https://github.com/pytorch/pytorch/pull/42737 Pull Request resolved: https://github.com/pytorch/pytorch/pull/47045 Reviewed By: nikithamalgifb Differential Revision: D24921503 Pulled By: anjali411 fbshipit-source-id: 4c3fc4f193a84b6e28c43c08672d480715000923	2020-11-12 12:22:59 -08:00
Gregory Chanan	b6cb2caa68	Revert "Fixed einsum compatibility/performance issues (#46398 )" (#47821 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47821 This reverts commit `a5c65b86ce`. Conflicts: test/test_linalg.py Test Plan: Imported from OSS Reviewed By: mruberry Differential Revision: D24909923 Pulled By: gchanan fbshipit-source-id: 9dcf98e7c4a3c7e5aaffe475867fa086f3bb6ff2	2020-11-12 08:11:40 -08:00
anjali411	e1ee3bfc0e	Port bmm and baddbmm from TH to ATen (#42553 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/42553 Ports `torch.bmm` and `torch.baddbmm` from TH to ATen, as well as adds support for complex dtypes. Also removes dead TH code for Level 2 functions. Closes #24539 Test Plan: Imported from OSS Reviewed By: ansley Differential Revision: D24893511 Pulled By: anjali411 fbshipit-source-id: 0eba3f2aec99c48b3018a5264ee7789279cfab58	2020-11-12 07:57:42 -08:00
Ivan Yashchuk	52ec8b9340	Added CUDA support for complex input for torch.triangular_solve (#46916 ) Summary: `torch.triangular_solve` now works for complex inputs on GPU. I moved the existing tests to `test_linalg.py` and modified them to test complex and float32 dtypes. Ref. https://github.com/pytorch/pytorch/issues/33152 Pull Request resolved: https://github.com/pytorch/pytorch/pull/46916 Reviewed By: navahgar, agolynski Differential Revision: D24706647 Pulled By: anjali411 fbshipit-source-id: fe780eac93d2ae1b2549539bb385e5fac25213b3	2020-11-11 16:08:11 -08:00
Ivan Yashchuk	a1db5b0f2b	Added CUDA support for complex input for torch.inverse #2 (#47595 ) Summary: `torch.inverse` now works for complex inputs on GPU. Opening a new PR here. The previous PR was merged and reverted due to a bug in tests marked with `slowTest`. Previous PR https://github.com/pytorch/pytorch/pull/45034 Ref. https://github.com/pytorch/pytorch/issues/33152 Pull Request resolved: https://github.com/pytorch/pytorch/pull/47595 Reviewed By: navahgar Differential Revision: D24840955 Pulled By: anjali411 fbshipit-source-id: ec49fffdc4b3cb4ae7507270fa24e127be14f59b	2020-11-11 11:06:08 -08:00
Heitor Schueroff	a5c65b86ce	Fixed einsum compatibility/performance issues (#46398 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46398 This PR makes torch.einsum compatible with numpy.einsum except for the sublist input option as requested here https://github.com/pytorch/pytorch/issues/21412. It also fixed 2 performance issues linked below and adds a check for reducing to torch.dot instead of torch.bmm which is faster in some cases. fixes #45854, #37628, #30194, #15671 fixes #41467 with benchmark below ```python import torch from torch.utils.benchmark import Timer a = torch.randn(10000, 100, 101, device='cuda') b = torch.randn(10000, 101, 3, device='cuda') c = torch.randn(10000, 100, 1, device='cuda') d = torch.randn(10000, 100, 1, 3, device='cuda') print(Timer( stmt='torch.einsum("bij,bjf->bif", a, b)', globals={'a': a, 'b': b} ).blocked_autorange()) print() print(Timer( stmt='torch.einsum("bic,bicf->bif", c, d)', globals={'c': c, 'd': d} ).blocked_autorange()) ``` ``` <torch.utils.benchmark.utils.common.Measurement object at 0x7fa37c413850> torch.einsum("bij,bjf->bif", a, b) Median: 4.53 ms IQR: 0.00 ms (4.53 to 4.53) 45 measurements, 1 runs per measurement, 1 thread <torch.utils.benchmark.utils.common.Measurement object at 0x7fa37c413700> torch.einsum("bic,bicf->bif", c, d) Median: 63.86 us IQR: 1.52 us (63.22 to 64.73) 4 measurements, 1000 runs per measurement, 1 thread ``` fixes #32591 with benchmark below ```python import torch from torch.utils.benchmark import Timer a = torch.rand(1, 1, 16, 2, 16, 2, 16, 2, 2, 2, 2, device="cuda") b = torch.rand(729, 1, 1, 2, 1, 2, 1, 2, 2, 2, 2, device="cuda") print(Timer( stmt='(a * b).sum(dim = (-3, -2, -1))', globals={'a': a, 'b': b} ).blocked_autorange()) print() print(Timer( stmt='torch.einsum("...ijk, ...ijk -> ...", a, b)', globals={'a': a, 'b': b} ).blocked_autorange()) ``` ``` <torch.utils.benchmark.utils.common.Measurement object at 0x7efe0de28850> (a * b).sum(dim = (-3, -2, -1)) Median: 17.86 ms 2 measurements, 10 runs per measurement, 1 thread <torch.utils.benchmark.utils.common.Measurement object at 0x7efe0de286a0> torch.einsum("...ijk, ...ijk -> ...", a, b) Median: 296.11 us IQR: 1.38 us (295.42 to 296.81) 662 measurements, 1 runs per measurement, 1 thread ``` TODO - [x] add support for ellipsis broadcasting - [x] fix corner case issues with sumproduct_pair - [x] update docs and add more comments - [x] add tests for error cases Test Plan: Imported from OSS Reviewed By: malfet Differential Revision: D24860367 Pulled By: heitorschueroff fbshipit-source-id: 31110ee598fd598a43acccf07929b67daee160f9	2020-11-10 19:38:43 -08:00
Heitor Schueroff	bf6a156f64	Fix kthvalue error for scalar input (#47600 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47600 fixes https://github.com/pytorch/pytorch/issues/30818 Note that the median case was already fixed by https://github.com/pytorch/pytorch/pull/45847 Test Plan: Imported from OSS Reviewed By: malfet Differential Revision: D24860337 Pulled By: heitorschueroff fbshipit-source-id: 69ccbbb6c7c86671e5712b1c2056c012d898b4f2	2020-11-10 17:21:52 -08:00
kshitij12345	6575e674ce	[numpy] torch.{all, any} : Extend Dtype Support (#44790 ) Summary: Reference https://github.com/pytorch/pytorch/issues/44779 Pull Request resolved: https://github.com/pytorch/pytorch/pull/44790 Reviewed By: bdhirsh Differential Revision: D24393119 Pulled By: heitorschueroff fbshipit-source-id: a9b88e9d06b3c282f2e5360b6eaea4ae8ef77c1d	2020-11-10 17:11:39 -08:00
Natalia Gimelshein	c9d37675b2	Back out "[pytorch][PR] The dimension being reduced should not be coalesced by TensorIterator" (#47642 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47642 Original commit changeset: 02bb2b15694c Test Plan: Covered by CI tests Reviewed By: anjali411 Differential Revision: D24849072 fbshipit-source-id: a8790cbf46936aee7a6f504dac8595997175fc65	2020-11-10 16:31:33 -08:00
Radhakrishnan Venkataramani	163adb9fa7	Add HalfToFloat + FloatToHalf operators to PyTorch (#45092 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45092 Adding two operators 1. at::float_to_half -> Converts FP32 tensor to FP16 tensor 2. at::half_to_float -> Converts FP16 tensor to FP32 tensor. These operators internally use the kernel provided by FBGeMM. Both C2 and PT will use the same FBGeMM kernel underneath. Test Plan: buck test //caffe2/test:torch -- .test_half_tensor. Run benchmark locally using ``` buck run //caffe2/benchmarks/operator_benchmark/pt:tensor_to_test ``` AI Bench results are pending. I expect that not to finish as we have large queue with jobs pending for 2+ days. Benchmark for 512x512 tensor with FbGeMM implementation ``` # ---------------------------------------- # PyTorch/Caffe2 Operator Micro-benchmarks # ---------------------------------------- # Tag : short # Benchmarking PyTorch: FloatToHalfTensorConversionBenchmark # Mode: Eager # Name: FloatToHalfTensorConversionBenchmark_M512_N512_cpu # Input: M: 512, N: 512, device: cpu Forward Execution Time (us) : 1246.332 # Benchmarking PyTorch: HalfToFloatTensorConversionBenchmark # Mode: Eager # Name: HalfToFloatTensorConversionBenchmark_M512_N512_cpu # Input: M: 512, N: 512, device: cpu Forward Execution Time (us) : 1734.304 ``` Benchmark for 512x512 tensor trunk with no FbGeMM integration. 
``` # ---------------------------------------- # PyTorch/Caffe2 Operator Micro-benchmarks # ---------------------------------------- # Tag : short # Benchmarking PyTorch: FloatToHalfTensorConversionBenchmark # Mode: Eager # Name: FloatToHalfTensorConversionBenchmark_M512_N512_cpu # Input: M: 512, N: 512, device: cpu Forward Execution Time (us) : 169045.724 # Benchmarking PyTorch: HalfToFloatTensorConversionBenchmark # Mode: Eager # Name: HalfToFloatTensorConversionBenchmark_M512_N512_cpu # Input: M: 512, N: 512, device: cpu Forward Execution Time (us) : 152382.494 ``` Reviewed By: ngimel Differential Revision: D23824869 fbshipit-source-id: ef044459b6c8c6e5ddded72080204c6a0ab4582c	2020-11-10 12:00:53 -08:00
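Note on the FP32/FP16 conversion in commit 163adb9fa7: as a rough numerical illustration only (this is not the FBGeMM kernel and not the `at::float_to_half` / `at::half_to_float` operators), the sketch below round-trips values through IEEE 754 half precision with Python's standard `struct` module. The helper names `float_to_half_bits` and `half_bits_to_float` are hypothetical, and Python floats are doubles rather than FP32, so this only shows the rounding that a half-precision store introduces.

```python
import struct


def float_to_half_bits(value: float) -> int:
    """Round a Python float to IEEE 754 binary16 and return the raw 16 bits.

    Hypothetical helper for illustration; struct raises OverflowError for
    values outside the finite half-precision range.
    """
    return struct.unpack("<H", struct.pack("<e", value))[0]


def half_bits_to_float(bits: int) -> float:
    """Widen raw binary16 bits back to a Python float (always exact)."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


if __name__ == "__main__":
    for x in (1.0, 0.1, 65504.0):
        bits = float_to_half_bits(x)
        print(f"{x!r} -> 0x{bits:04x} -> {half_bits_to_float(bits)!r}")
```

Widening from half to float is lossless, while narrowing rounds to the nearest representable half value, which is why 0.1 does not survive the round trip exactly.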
Gregory Chanan	65a72cae2c	Fix type promotion for trace on CPU. (#47305 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47305 Fixes https://github.com/pytorch/pytorch/issues/47127. Ideally this would just use diag and sum (as the CUDA implementation does), but that seems to have performance problems, which I'll link in the github PR. Test Plan: Imported from OSS Reviewed By: zou3519 Differential Revision: D24729627 Pulled By: gchanan fbshipit-source-id: 151b786b53e7b958f0929c803dbf8e95981c6884	2020-11-10 07:46:03 -08:00
John Kilpatrick	8aca85dbcd	Add diagflat complex support (#47564 ) Summary: Adds complex numbers support for `torch.diag` ``` python >>> import torch >>> a = torch.ones(2, dtype=torch.complex128) >>> torch.diagflat(a) tensor([[1.+0.j, 0.+0.j], [0.+0.j, 1.+0.j]], dtype=torch.complex128) >>> b = a.cuda() >>> torch.diagflat(b) tensor([[1.+0.j, 0.+0.j], [0.+0.j, 1.+0.j]], device='cuda:0', dtype=torch.complex128) ``` Note that automatic differentiation isn't implemented: ``` python >>> d = torch.ones(1, dtype=torch.complex128, requires_grad=True) >>> torch.diagflat(d) Traceback (most recent call last): File "<stdin>", line 1, in <module> RuntimeError: diag does not support automatic differentiation for outputs with complex dtype. ``` Fixes https://github.com/pytorch/pytorch/issues/47499 Pull Request resolved: https://github.com/pytorch/pytorch/pull/47564 Reviewed By: heitorschueroff Differential Revision: D24844467 Pulled By: anjali411 fbshipit-source-id: 9c8cb795d52880b7dcffab0c059b0f6c2e5ef151	2020-11-09 20:28:23 -08:00
Xiang Gao	f23a2a1115	The dimension being reduced should not be coalesced by TensorIterator (#47237 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/37583#issuecomment-720172838 Also add overload of `<<` for convenience of debugging. This PR is tested by `test_reduction_split_cuda` which was added in https://github.com/pytorch/pytorch/pull/37788. Reproduce ```python import torch a = torch.zeros(8, 1, 128, 1024, 1024) a.cuda().sum(1) ``` Before ``` TensorIterator @ 0x7ffd05b10ba0 { ntensors() = 2 noutputs() = 1 shape() = [1073741824] strides() = { (0) = [4] (1) = [4] } dtype() = { (0) = Float (1) = Float } is_reduction_ = 1 } ``` After ``` TensorIterator @ 0x7fffc9051010 { ntensors() = 2 noutputs() = 1 shape() = [1, 1073741824] strides() = { (0) = [0, 4] (1) = [536870912, 4] } dtype() = { (0) = Float (1) = Float } is_reduction_ = 1 } ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/47237 Reviewed By: ejguan Differential Revision: D24734763 Pulled By: ngimel fbshipit-source-id: 02bb2b15694c68f96434f55033b63b6e5ff7085b	2020-11-07 01:30:24 -08:00
Xiong Wei	f90da88d8f	Add complex support for torch.mean [CUDA] (#47048 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/46982 Pull Request resolved: https://github.com/pytorch/pytorch/pull/47048 Reviewed By: heitorschueroff Differential Revision: D24729895 Pulled By: anjali411 fbshipit-source-id: 8e948480eb87c37de810207edf909375c0380772	2020-11-06 21:29:19 -08:00
Howard Huang	451e7d3db4	Enable diag for bool Tensors (#47455 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47455 Test Plan: Imported from OSS Reviewed By: bdhirsh Differential Revision: D24772483 Pulled By: H-Huang fbshipit-source-id: 08ea4af4352972617db3c6475943b326f36b3049	2020-11-06 21:29:17 -08:00
Howard Huang	3253ccbd9f	Add bool tensor support for where (#47454 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47454 Test Plan: Imported from OSS Reviewed By: bdhirsh Differential Revision: D24772482 Pulled By: H-Huang fbshipit-source-id: ea488aae5bf64ac20f7a5d001e8edf55eed16eaf	2020-11-06 21:26:24 -08:00
Rong Rong	5614f72534	Suppres test issues in test_torch running in sandcastle (#47474 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47474 After enabling GPU/Re, some issues were specific to those runs Test Plan: ``` buck test -c test.external_runner=tpx mode/opt //caffe2/test:torch_cuda -- --use-remote-execution --force-tpx --run-disabled ``` Reviewed By: malfet, janeyx99 Differential Revision: D24771578 fbshipit-source-id: 1ada79dae12c8cb6f795a0d261c60f038eee2dfb	2020-11-06 10:34:28 -08:00
Edward Yang	1aeefcdaa6	Revert D24730264: [pytorch][PR] Added CUDA support for complex input for torch.inverse Test Plan: revert-hammer Differential Revision: D24730264 (`33acbedace`) Original commit changeset: b9c94ec46301 fbshipit-source-id: beb9263700e9bc92685f74c37c46aa33f3b595b9	2020-11-06 07:28:14 -08:00
Ivan Yashchuk	33acbedace	Added CUDA support for complex input for torch.inverse (#45034 ) Summary: `torch.inverse` now works for complex inputs on GPU. Test cases with complex matrices are xfailed for now. For example, batched matmul does not work with complex yet. Ref. https://github.com/pytorch/pytorch/issues/33152 Pull Request resolved: https://github.com/pytorch/pytorch/pull/45034 Reviewed By: zou3519 Differential Revision: D24730264 Pulled By: anjali411 fbshipit-source-id: b9c94ec463012913c117278a884adeee96ea02aa	2020-11-05 16:30:11 -08:00
Heitor Schueroff	a4ba018e57	Updated docs/test for dot and vdot (#47242 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47242 Test Plan: Imported from OSS Reviewed By: ejguan Differential Revision: D24733771 Pulled By: heitorschueroff fbshipit-source-id: 92e3b0e28e0565918335fa85d52abe5db9eeff57	2020-11-05 06:27:50 -08:00
Xiang Gao	f19637e6ee	Expand the test of torch.addbmm and torch.baddbmm (#47079 ) Summary: This is to satisfy the request at https://github.com/pytorch/pytorch/pull/42553#issuecomment-673673914. See also https://github.com/pytorch/pytorch/pull/47124 Pull Request resolved: https://github.com/pytorch/pytorch/pull/47079 Reviewed By: ejguan Differential Revision: D24735356 Pulled By: ngimel fbshipit-source-id: 122fceb4902658f350c2fd6f92455adadd0ec2a4	2020-11-04 21:11:26 -08:00
Xiang Gao	030caa190f	Expand the test of torch.bmm on CUDA (#47124 ) Summary: basically https://github.com/pytorch/pytorch/pull/47070, enabled on all CI with `ci-all` Pull Request resolved: https://github.com/pytorch/pytorch/pull/47124 Reviewed By: ejguan Differential Revision: D24735130 Pulled By: ngimel fbshipit-source-id: c2124562a9f9d1caf24686e5d8a1106c79366233	2020-11-04 17:29:34 -08:00
Brian Hirsh	fe17269e75	Revert "Revert D24335982: explicitly error out in comparison ops when the types don't match" (#47288 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47288 This reverts commit `b3eb0c86cf`. Test Plan: Imported from OSS Reviewed By: mruberry Differential Revision: D24706531 Pulled By: bdhirsh fbshipit-source-id: f3bf34ddba7882932155819251b6c7dcb5c6b56c	2020-11-04 09:27:47 -08:00
Erjia Guan	f1ac63d324	Implement copysign (#46396 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46396 Related #38349 [numpy](https://numpy.org/doc/stable/reference/generated/numpy.copysign.html?highlight=copysign#numpy.copysign) - No in-place function - No method - Optional output - Available: byte, char, bool, int, short, long, float, double, half - Integral promoted to float - Not available: float/double complex `c = np.copysign(a, b)` \| a \| b \| c \| a.grad \| \| -1 \| -1 \| -1 \| 1 \| \| -0 \| -1 \| -0 \| 0 \| \| 0 \| -1 \| -0 \| 0 \| \| 1 \| -1 \| -1 \| -1 \| \| -1 \| -0 \| -1 \| 1 \| \| -0 \| -0 \| 0 \| 0 \| \| 0 \| -0 \| 0 \| 0 \| \| 1 \| -0 \| -1 \| -1 \| \| -1 \| 0 \| 1 \| -1 \| \| -0 \| 0 \| 0 \| 0 \| \| 0 \| 0 \| 0 \| 0 \| \| 1 \| 0 \| 1 \| 1 \| \| -1 \| 1 \| 1 \| -1 \| \| -0 \| 1 \| 0 \| 0 \| \| 0 \| 1 \| 0 \| 0 \| \| 1 \| 1 \| 1 \| 1 \| This function becomes non-differentiable at `a=0` for any `b`. So, in my opinion, we may set the gradient for `a=0` to 0. TODO: - [x] test (cpu/gpu) - [x] doc - [x] ~kernel_vec~ Test Plan: Imported from OSS Reviewed By: mruberry Differential Revision: D24401366 Pulled By: ejguan fbshipit-source-id: 3621c5ff74b185376a3705589983bb5197ab896d	2020-11-04 08:08:57 -08:00
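Note on the gradient convention in commit f1ac63d324: the rows of the table with nonzero `a`, and the proposed zero gradient at `a=0`, can be reproduced with a small pure-Python sketch. It uses `math.copysign` from the standard library rather than `torch.copysign`, and the function name `copysign_with_grad` is hypothetical; it is only meant to show the rule the message describes, assuming the gradient with respect to `a` is `sign(a) * sign(b)` away from zero.

```python
import math


def copysign_with_grad(a: float, b: float):
    """Return (c, dc/da) for c = copysign(a, b).

    Hypothetical illustration of the convention in the commit message:
    the gradient at a == 0 is defined to be 0.
    """
    c = math.copysign(a, b)
    if a == 0.0:  # also true for -0.0
        grad = 0.0
    else:
        # c = |a| * sign(b), so dc/da = sign(a) * sign(b);
        # copysign(1.0, b) honours the sign bit of -0.0.
        grad = math.copysign(1.0, a) * math.copysign(1.0, b)
    return c, grad


if __name__ == "__main__":
    for a in (-1.0, 1.0):
        for b in (-1.0, -0.0, 0.0, 1.0):
            print(a, b, copysign_with_grad(a, b))
```

`math.copysign` also returns a float for integer arguments, which mirrors the "Integral promoted to float" item in the list above.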
Qi Zhou	0ec717c830	Support int32 indices and offsets in nn.EmbeddingBag (#46758 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46758 It's in general helpful to support int32 indices and offsets, especially when such tensors are large and need to be transferred to accelerator backends. Since it may not be very useful to support the combination of int32 indices and int64 offsets, here we enforce that these two must have the same type. Test Plan: unit tests Reviewed By: ngimel Differential Revision: D24470808 fbshipit-source-id: 94b8a1d0b7fc9fe3d128247aa042c04d7c227f0b	2020-11-03 23:33:50 -08:00
Howard Huang	a8ef4d3f0b	Provide 'out' parameter for 'tensordot' (#47278 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/42102 Added an optional out parameter to the tensordot operation to allow using buffers. Pull Request resolved: https://github.com/pytorch/pytorch/pull/47278 Test Plan: pytest test/test_torch.py -k tensordot -v Reviewed By: agolynski Differential Revision: D24706258 Pulled By: H-Huang fbshipit-source-id: eb4bcd114795f67de3a670291034107d2826ea69	2020-11-03 15:56:00 -08:00
Xiao Wang	774b638eb6	Change largeCUDATensorTest to largeTensorTest+onlyCUDA; add a buffer to large cuda tensor test (#45332 ) Summary: Effectively, `largeCUDATensorTest` = `largeTensorTest` + `onlyCUDA`. There was this problem where a user got OOM for a `largeCUDATensorTest('16GB')` on a 16GB V100. This decorator was checking total memory for a GPU device, however in most cases, we can't allocate all of the memory that a GPU has. So, it would be beneficial that we have a buffer on this `largeTensorTest` check for CUDA. I added a 10% buffer to it. Definition of `largeTensorTest` `d22dd80128/torch/testing/_internal/common_device_type.py (L560-L578)` `_has_sufficient_memory` `d22dd80128/torch/testing/_internal/common_device_type.py (L535-L557)` `largeCUDATensorTest` `d22dd80128/torch/testing/_internal/common_device_type.py (L526-L532)` Pull Request resolved: https://github.com/pytorch/pytorch/pull/45332 Reviewed By: ngimel Differential Revision: D24698690 Pulled By: mruberry fbshipit-source-id: a77544478e45ce271f6639ea04e87700574ae307	2020-11-03 11:43:49 -08:00
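Note on the 10% buffer in commit 774b638eb6: the commit message does not show how the buffer is applied, so the following is only a sketch under the assumption that the requested size is inflated by 10% before being compared with the device's total memory. The function name `has_sufficient_memory` and its parameters are hypothetical and are not the `_has_sufficient_memory` helper in `common_device_type.py`.

```python
GIB = 1024 ** 3


def has_sufficient_memory(total_bytes: int, required_bytes: int,
                          buffer_fraction: float = 0.10) -> bool:
    """Hypothetical check: require headroom on top of the requested size.

    Assumption: the buffer scales the request, i.e. the test runs only if
    required * (1 + buffer_fraction) fits into the device's total memory.
    """
    return required_bytes * (1.0 + buffer_fraction) <= total_bytes


if __name__ == "__main__":
    # A '16GB' test on a 16 GiB device is skipped once a buffer is required.
    print(has_sufficient_memory(16 * GIB, 16 * GIB))
    # A 14 GiB request still fits with 10% headroom.
    print(has_sufficient_memory(16 * GIB, 14 * GIB))
```

Under this assumption, the OOM scenario described in the message (a 16GB test on a 16GB V100) would be skipped instead of attempted.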
Richard Zou	86151da19e	Port CPU Trace from TH to ATen (#47126 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47126 Context ------- This PR is a rebase of shihongzhi's https://github.com/pytorch/pytorch/pull/35360. I forgot to merge it back when it was submitted so I rebased it and ran new benchmarks on it. Benchmarks ---------- TL;DR: The op has more overhead than the TH version but for larger shapes the overhead disappears. ``` import torch shapes = [ [1, 1], [100, 100], [1000, 1000], [10000, 10000], [100000, 100000], ] for shape in shapes: x = torch.ones(shape) %timeit x.trace() Before: 1.83 µs ± 42.4 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each) 1.98 µs ± 48.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each) 3.19 µs ± 10.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each) 85.2 µs ± 700 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each) 1.23 ms ± 4.34 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each) After: 2.16 µs ± 325 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each) 2.08 µs ± 275 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each) 4.45 µs ± 19.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each) 81.8 µs ± 766 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each) 1.27 ms ± 6.75 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each) ``` Future work ----------- Things that can be done after this PR: - add complex tensor support - Fix the type promotion discrepancy between CPU and CUDA Test Plan: Imported from OSS Reviewed By: mrshenli Differential Revision: D24683259 Pulled By: zou3519 fbshipit-source-id: f92b566ad0d58b72663ab64899d209c96edb78eb	2020-11-02 16:03:22 -08:00
Richard Zou	8054ae3e77	Add test for trace (#47125 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47125 We didn't actually have any tests for torch.trace. The tests expose a discrepancy between the behavior of torch.trace on CPU and CUDA that I'll file an issue for. Test Plan: Imported from OSS Reviewed By: mruberry Differential Revision: D24683260 Pulled By: zou3519 fbshipit-source-id: 71dd3af62bc98c6b9b0ba2bf2923cb6d44daa640	2020-11-02 16:00:33 -08:00
Brian Hirsh	b3eb0c86cf	Revert D24335982: explicitly error out in comparison ops when the types don't match Test Plan: revert-hammer Differential Revision: D24335982 (`60fea510a1`) Original commit changeset: 3dfb02bcb403 fbshipit-source-id: 00072f1b00e228bbbe295053091cf4a7a46f4668	2020-11-02 14:08:01 -08:00
Xiong Wei	22b3d414de	Enhance the torch.pow testcase for the complex scalar base (#47101 ) Summary: Related https://github.com/pytorch/pytorch/issues/45259 This PR is to address the https://github.com/pytorch/pytorch/pull/45259#discussion_r514390664 - leverage the `make_tensor` function to generate a random tensor as the exponent, preventing the full zeros for the integer exponent. - add some special cases for the zero exponents and the `1 + 0j` base. Pull Request resolved: https://github.com/pytorch/pytorch/pull/47101 Reviewed By: mruberry Differential Revision: D24682430 Pulled By: zou3519 fbshipit-source-id: f559dc0ba08f37ae070036fb25a52ede17a24149	2020-11-02 13:13:15 -08:00
Brian Hirsh	60fea510a1	explicitly error out in comparison ops when the types don't match (#46399 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46399 Explicitly error out in comparison/logical ops when the dtypes of the various input/output tensors don't match. See [this comment](https://github.com/pytorch/pytorch/pull/46399#discussion_r505686406) for more details. fixes #42660 Test Plan: Imported from OSS Reviewed By: mruberry Differential Revision: D24335982 Pulled By: bdhirsh fbshipit-source-id: 3dfb02bcb403dda5bcbf5ed3eae543354ad698b2	2020-11-02 11:42:32 -08:00
Nikita Shulga	edac4060d7	Fix mul cuda for bool (#47031 ) Summary: Also, add tests for tensor by scalar multiplication / division Fixes https://github.com/pytorch/pytorch/issues/47007 Pull Request resolved: https://github.com/pytorch/pytorch/pull/47031 Reviewed By: walterddr Differential Revision: D24608874 Pulled By: malfet fbshipit-source-id: 4e15179904814d6e67228276d3d11ff1b5d15d0d	2020-10-30 10:38:32 -07:00
Heitor Schueroff	ddeacf1565	Fix median bug on discontigous tensors (#46917 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46917 fixes https://github.com/pytorch/pytorch/issues/46814 Test Plan: Imported from OSS Reviewed By: albanD Differential Revision: D24633412 Pulled By: heitorschueroff fbshipit-source-id: 54732671b298bdc2b04b13ab3a373892ee0933c3	2020-10-29 17:12:22 -07:00
Xiong Wei	74d730c0b5	implement NumPy-like functionality column_stack, row_stack (#46313 ) Summary: Related https://github.com/pytorch/pytorch/issues/38349 This PR implements `column_stack` as the composite ops of `torch.reshape` and `torch.hstack`, and makes `row_stack` as the alias of `torch.vstack`. Todo - [x] docs - [x] alias pattern for `row_stack` Pull Request resolved: https://github.com/pytorch/pytorch/pull/46313 Reviewed By: ngimel Differential Revision: D24585471 Pulled By: mruberry fbshipit-source-id: 62fc0ffd43d051dc3ecf386a3e9c0b89086c1d1c	2020-10-29 12:14:39 -07:00
mfkasim91	6eaa324c9f	Implement torch.igamma (#46183 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/41637 This is regularized lower incomplete gamma function, equivalent to scipy's `gammainc` and tensorflow `igamma`. cc fritzo mruberry Pull Request resolved: https://github.com/pytorch/pytorch/pull/46183 Reviewed By: gchanan Differential Revision: D24479126 Pulled By: mruberry fbshipit-source-id: fdf8ea289fe4ca1b408810732192411e948fcdfe	2020-10-29 11:40:18 -07:00
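Note on the function added in commit 6eaa324c9f: the regularized lower incomplete gamma function is P(a, x) = γ(a, x) / Γ(a). The sketch below evaluates it with the standard power series in pure Python; it is not the kernel used by `torch.igamma`, the name `igamma_series` is hypothetical, and the series converges slowly for large `x`, so it should be read as a reference for the definition rather than a production implementation.

```python
import math


def igamma_series(a: float, x: float, tol: float = 1e-15,
                  max_terms: int = 10_000) -> float:
    """Regularized lower incomplete gamma P(a, x) via its power series.

    Uses gamma(a, x) = x^a * e^(-x) * sum_{n>=0} x^n / (a (a+1) ... (a+n)).
    Hypothetical reference helper; valid for a > 0 and x >= 0.
    """
    if a <= 0.0 or x < 0.0:
        raise ValueError("requires a > 0 and x >= 0")
    if x == 0.0:
        return 0.0
    term = 1.0 / a          # n = 0 term of the series
    total = term
    for n in range(1, max_terms):
        term *= x / (a + n)  # next term: multiply by x / (a + n)
        total += term
        if abs(term) < abs(total) * tol:
            break
    # Work in log space for the prefactor x^a * e^(-x) / Gamma(a).
    return total * math.exp(a * math.log(x) - x - math.lgamma(a))


if __name__ == "__main__":
    # Closed forms for comparison: P(1, x) = 1 - e^(-x), P(1/2, x) = erf(sqrt(x)).
    print(igamma_series(1.0, 2.0), 1.0 - math.exp(-2.0))
    print(igamma_series(0.5, 1.0), math.erf(1.0))
```

The two closed forms give a quick sanity check for any implementation of the same function, including SciPy's `gammainc` mentioned in the message.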
Sameer Deshmukh	2249a293b7	Fix segfault with torch.orgqr. (#46700 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/41768 The fault was that a NULL `tau` would get passed to LAPACK function. This PR fixes that by checking whether the `tau` contains 0 elements at the beginning of the function. Pull Request resolved: https://github.com/pytorch/pytorch/pull/46700 Reviewed By: albanD Differential Revision: D24616427 Pulled By: mruberry fbshipit-source-id: 92e8f1489b113c0ceeca6e54dea8b810a51a63c3	2020-10-29 10:34:39 -07:00
Kurt Mohler	b75b961934	Fix `requires_grad` arg for `new_full`, `new_empty`, `new_zeros` (#46486 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/36455 Pull Request resolved: https://github.com/pytorch/pytorch/pull/46486 Reviewed By: gchanan Differential Revision: D24497034 Pulled By: ezyang fbshipit-source-id: 769a7f00f9a8f7cb77273a1193173a837ae7e32f	2020-10-28 09:34:53 -07:00
kiyosora	53839ac9d7	Fix internal assert for torch.heaviside with cuda tensor and cpu scalar tensor (#46831 ) Summary: Fixed https://github.com/pytorch/pytorch/issues/46681 ``` >>> x = torch.randn(10, device='cuda') >>> y = torch.tensor(1.) >>> torch.heaviside(x, y) tensor([0., 1., 0., 1., 1., 0., 1., 1., 1., 0.], device='cuda:0') ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/46831 Reviewed By: navahgar Differential Revision: D24567953 Pulled By: izdeby fbshipit-source-id: e5fcf4355b27ce0bdf434963d01863d3b24d0bea	2020-10-27 16:47:33 -07:00
Hong Xu	bcbb6baccf	Add a warning message that torch.sign would not support complex numbers (#43280 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/43280 Test Plan: Imported from OSS Reviewed By: ansley Differential Revision: D24538769 Pulled By: anjali411 fbshipit-source-id: ab2d5283501e4c1d7d401d508e32f685add7ebb1	2020-10-26 21:13:12 -07:00

1649 Commits