Summary:
Issue https://github.com/pytorch/pytorch/issues/24596
This PR moves `mm` cuda to ATen. The internal `addmmImpl` that was used as the base of the old TH version of `mm` cuda is also ported.
This PR also sets up `addmm` cuda to be fairly easily ported to ATen in a future PR, since TH `mm` and `addmm` used the same `addmmImpl` function at their core.
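For context, a minimal sketch of the relationship the shared `addmmImpl` exploits (assuming a CUDA device is available): `mm` is just `addmm` with `beta=0`, so both can sit on top of a single GEMM implementation.
```
import torch

a = torch.randn(2, 3, device="cuda")
b = torch.randn(3, 4, device="cuda")
out = torch.empty(2, 4, device="cuda")
# With beta=0 the values in `out` are ignored, so addmm reduces to mm.
assert torch.allclose(torch.mm(a, b), torch.addmm(out, a, b, beta=0))
```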
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34891
Differential Revision: D20650713
Pulled By: ngimel
fbshipit-source-id: 692aba1bbae65a18d23855b5e101446082d64c66
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35167
The purpose of this PR is to move `normal`/`normal_`/`normal_out` to `native/DistributionTemplates.h`, `native/cpu/DistributionTemplates.h`, and `native/cuda/DistributionTemplates.h` to make them reusable for custom RNGs; see cpu_rng_test.cpp for an example of a custom RNG.
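For reference, a usage sketch of the three public entry points being templatized, as they appear in current PyTorch (not the template internals):
```
import torch

x = torch.empty(3)
x.normal_(mean=0.0, std=1.0)               # in-place normal_
y = torch.normal(0.0, 1.0, size=(3,))      # out-of-place normal
torch.normal(0.0, 1.0, size=(3,), out=x)   # normal with out=
```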
Test Plan: Imported from OSS
Differential Revision: D20588248
Pulled By: pbelevich
fbshipit-source-id: 7ee60be97f81522cd68894ff1389007c05130a60
Summary:
Fixes https://github.com/pytorch/pytorch/issues/34191
`at::native::radixSelect` essentially uses integer comparison, which imposes a defined ordering on non-finite float values. This isn't compatible with IEEE float comparison, so mixing the two leads to values that are never written to the output.
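To illustrate, here is a minimal sketch of the kind of monotone bit transform radix select relies on (the helper name is illustrative, not the THC one): reinterpreting float bits as unsigned integers gives every value, including NaN, a fixed position in the ordering, whereas IEEE comparison leaves NaN unordered.
```
import struct

def float_to_ordered_int(x):
    # Reinterpret the IEEE-754 bits of a float32 as an unsigned int.
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    # Flip all bits for negatives, flip just the sign bit for non-negatives,
    # so unsigned integer order matches IEEE order for finite values.
    return bits ^ 0xFFFFFFFF if bits & 0x80000000 else bits | 0x80000000

vals = [float("nan"), float("inf"), -0.0, 0.0, 1.5, -1.5]
# NaN lands at a defined (here: largest) position under the integer ordering.
print(sorted(vals, key=float_to_ordered_int))
```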
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35253
Differential Revision: D20645554
Pulled By: ezyang
fbshipit-source-id: 651bcb1742ed67086ec89cc318d862caae65b981
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34747
Adds the hardswish FP operator from MobileNetV3 to PyTorch. This is for
common operator coverage, since this is widely used. A future PR will
add the quantized version. CUDA is saved for a future PR as well.
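For reference, the operator's semantics from the MobileNetV3 paper (a sketch, not the PR's kernel):
```
import torch

def hardswish_ref(x):
    # hardswish(x) = x * relu6(x + 3) / 6
    return x * torch.clamp(x + 3, 0, 6) / 6
```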
Test Plan:
tests pass:
```
python test/test_torch.py TestTorchDeviceTypeCPU.test_hardswish_cpu_float32
```
microbenchmark:
https://gist.github.com/vkuzo/b10d3b238f24e58c585314e8b5385aca
(batch_size == 1: 11.5GiB/s, batch_size == 4: 11.9GiB/s)
Imported from OSS
Differential Revision: D20451404
fbshipit-source-id: c7e13c9ab1a83e27a1ba18182947c82c896efae2
Summary:
Per title. See related https://github.com/pytorch/pytorch/pull/34570.
In PyTorch 1.7 the plan is for torch.div and Python's division operator to perform "true" division, like Python 3, JAX, and NumPy. To facilitate this change, this PR expands true_divide to be a method so it can cover all of torch.div's use cases.
New true_divide tests are added to test_torch.py, test_type_promotion.py, and test_sparse.py.
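A quick sketch of the expanded surface (values illustrative):
```
import torch

a = torch.tensor([1, 2, 3])
b = torch.tensor([2, 2, 2])
torch.true_divide(a, b)   # function form, as before
a.true_divide(b)          # new method form, mirroring Tensor.div
```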
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34794
Differential Revision: D20545507
Pulled By: mruberry
fbshipit-source-id: 55286f819716c8823d1930441a69008560ac2bd5
Summary:
In C++, casting a floating point value to an integer dtype is undefined when the value is outside the dtype's dynamic range. For example, casting 300.5 to Int8 is undefined behavior because the maximum representable Int8 value is 127, and 300.5 > 127.
PyTorch, like NumPy, deliberately allows and performs these casts, however, and doing so triggers undefined behavior that causes our sanitizers to (correctly) complain. I propose skipping this sanitization on our cast function.
The history of this PR demonstrates the issue, showing a single CI failure in the ASAN build when a test is added that converts a large float value to an integral value. The current PR shows a green CI after the sanitization is skipped.
There are alternatives to skipping this sanitization:
- Clamping or otherwise converting floats to the dynamic range of integral types they're cast to
- Throwing a runtime error if a float value is outside the dynamic range of the integral type it's cast to (this would not be NumPy compatible)
- Declaring programs that perform these casts to be in error (which is technically true)
- Preventing this happening in PyTorch proper so the ASAN build doesn't fail
None of these alternatives seems particularly appealing, and I think it's appropriate to skip the sanitization because our behavior is deliberate.
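For concreteness, a small demonstration of the deliberately-permitted cast; the printed values are platform dependent, since in C++ terms the underlying conversion is undefined:
```
import numpy as np
import torch

# Both libraries permit the out-of-range cast; the result is unspecified.
print(np.array(300.5, dtype=np.float32).astype(np.int8))
print(torch.tensor(300.5).to(torch.int8))
```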
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35086
Differential Revision: D20591163
Pulled By: mruberry
fbshipit-source-id: fa7a90609c73c4c627bd39726a7dcbaeeffa1d1b
Summary:
Per title.
In the future we want to make div(), the division operator, and addcdiv perform true division as in Python 3, NumPy, and JAX. To do this without silently breaking users we plan to:
- Warn (once) in 1.5 when a user performs integer division using div or addcdiv
- RuntimeError in 1.6 when a user attempts to perform integer division using div or addcdiv
- Always perform true division in 1.7 using div, /, and addcdiv
Users can use true_divide or floor_divide today to explicitly specify the type of division they like.
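For example, the two explicit forms side by side (output values shown for illustration):
```
import torch

a = torch.tensor([3, 4])
b = torch.tensor([2, 2])
a.true_divide(b)    # tensor([1.5000, 2.0000]) -- the future div behavior
a.floor_divide(b)   # tensor([1, 2])           -- today's integer division
```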
A test for this behavior is added to test_type_promotion. Unfortunately, because we are only warning once (to avoid a deluge), the test only uses maybeWarnsRegex.
The XLA failure is real but will be solved by https://github.com/pytorch/pytorch/pull/34552. I'll be sure to land that PR first to avoid temporarily breaking the XLA build.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34570
Differential Revision: D20529211
Pulled By: mruberry
fbshipit-source-id: 65af5a9641c5825175d029e8413c9e1730c661d0
Summary:
(Updated per review feedback)
`torch.floor_divide` is currently a function that can operate on two tensors or a tensor and a scalar (scalar x scalar floor division is handled natively by Python and the JIT has a builtin function for it). This PR updates it to:
- have an out variant: `floor_divide(x, y, out=z)`
- be a method on a tensor: `x.floor_divide(y)`
- have an in-place variant: `x.floor_divide_(y)`
- work with sparse tensors
Tests are added to test_sparse.py and test_torch.py for these new behaviors.
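A minimal sketch of the variants listed above:
```
import torch

x = torch.tensor([5., 7.])
y = torch.tensor([2., 2.])
z = torch.empty(2)
torch.floor_divide(x, y, out=z)   # out variant
x.floor_divide(y)                 # method
x.floor_divide_(y)                # in-place
```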
In addition, this PR:
- cleans up the existing sparse division and true_division code and improves their error message
- adds testing of sparse true_division to test_sparse.py
- extends existing floor_divide testing in test_torch to run on CUDA, too, not just the CPU
Unfortunately, making floor_divide a method requires breaking backwards compatibility, and floor_divide has been added to the BC whitelist since this break is intentional. The BC issue is that the first parameter name to torch.floor_divide is changing from input to self. If you previously called torch.floor_divide with keyword arguments, e.g. torch.floor_divide(input=x, other=y), you will need to update to torch.floor_divide(self=x, other=y), or the more common torch.floor_divide(x, y).
The intent of this PR is to allow floor_divide to be substituted for division (torch.div, /) wherever division was previously used. In 1.6 we expect torch.div to perform true_division, and floor_divide is how users can continue to perform integer division with tensors.
There are two potential follow-up issues suggested by this PR:
- the test framework might benefit from additional tensor construction classes, like one to create dividends and divisors for multiple dtypes
- the test framework might benefit from a universal function test class. While methods have reasonable coverage as part of test_torch.py's TestTensorOp tests, function coverage is spotty. Universal functions are similar enough that it should be possible to generate tests for them.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34552
Differential Revision: D20509850
Pulled By: mruberry
fbshipit-source-id: 2cd3c828aad67191c77f2ed8470411e246f604f8
Summary:
Per title.
Currently torch.full will always (attempt to) produce a float tensor. This is inconsistent with NumPy in (at least) two cases:
- When integral fill values (including bool) are given
- When complex fill values are given
For example:
```
np.full((1, 2), 1).dtype
: dtype('int64')
np.full((1, 2), (1 + 1j)).dtype
: dtype('complex128')
```
Whereas in PyTorch
```
torch.full((1, 2), 1).dtype
: torch.float32
torch.full((1, 2), (1 + 1j)).dtype
: RuntimeError: value cannot be converted to type float without overflow: (1,1)
```
This PR begins deprecating our current behavior of returning float tensors (by default) when given integer fill values. It warns the user that integer fill values will require explicitly specifying the dtype or out kwargs in 1.6, and that in 1.7 the behavior will change to return a LongTensor by default (a BoolTensor for bool values). The intermediate 1.6 release prevents the behavior from changing silently and unexpectedly.
The PR also implements inference for complex types. So that with it:
```
torch.full((1, 2), (1 + 1j)).dtype
: torch.complex64
```
The complex type inference returns a ComplexFloat tensor when given a complex fill value (and no dtype or out kwarg is specified), unless the default dtype is Double, in which case a ComplexDouble tensor is returned.
A test for these behaviors is added to test_torch.py.
Implementation note:
This PR required customizing full's dispatch because currently in eager codegen the TensorOptions object passed to functions improperly sets has_dtype() to true, even if the user did not explicitly provide a dtype. torch.arange already worked around this issue with its own custom implementation. The JIT, however, does pass a properly constructed TensorOptions object.
Future Work:
This PR does not extend torch.full's complex type inference to ONNX. This seems unlikely to come up and will be a clear error if it does. When integer type inference is added to torch.full, however, then porting the behavior to ONNX may be warranted. torch.arange ported its complex type promotion logic to ONNX, for example.
Additionally, this PR mostly leaves intact existing call sites in PyTorch that would trigger this warning. This keeps the PR more minimal (since it is already BC-breaking). I will submit a separate PR fixing PyTorch's call sites.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34709
Differential Revision: D20509387
Pulled By: mruberry
fbshipit-source-id: 129593ba06a1662032bbbf8056975eaa59baf933
Summary:
(Updated per review feedback)
`torch.floor_divide` is currently a function that can operate on two tensors or a tensor and a scalar (scalar x scalar floor division is handled natively by Python and the JIT has a builtin function for it). This PR updates it to:
- have an out variant: `floor_divide(x, y, out=z)`
- be a method on a tensor: `x.floor_divide(y)`
- have an in-place variant: `x.floor_divide_(y)`
- work with sparse tensors
Tests are added to test_sparse.py and test_torch.py for these new behaviors.
In addition, this PR:
- cleans up the existing sparse division and true_division code and improves their error message
- adds testing of sparse true_division to test_sparse.py
- extends existing floor_divide testing in test_torch to run on CUDA, too, not just the CPU
Unfortunately, making floor_divide a method requires breaking backwards compatibility, and floor_divide has been added to the BC whitelist since this break is intentional. The BC issue is that the first parameter name to torch.floor_divide is changing from input to self. If you previously called torch.floor_divide with keyword arguments, e.g. torch.floor_divide(input=x, other=y), you will need to update to torch.floor_divide(self=x, other=y), or the more common torch.floor_divide(x, y).
The intent of this PR is to allow floor_divide to be substituted for division (torch.div, /) wherever division was previously used. In 1.6 we expect torch.div to perform true_division, and floor_divide is how users can continue to perform integer division with tensors.
There are two potential follow-up issues suggested by this PR:
- the test framework might benefit from additional tensor construction classes, like one to create dividends and divisors for multiple dtypes
- the test framework might benefit from a universal function test class. While methods have reasonable coverage as part of test_torch.py's TestTensorOp tests, function coverage is spotty. Universal functions are similar enough that it should be possible to generate tests for them.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34552
Differential Revision: D20497453
Pulled By: mruberry
fbshipit-source-id: ac326f2007d8894f730d1278fef84d63bcb07b5d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34545
This is for common operator coverage, since this is widely used. A future PR
will add the quantized version.
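For reference, the operator's semantics (a sketch, not the PR's kernel):
```
import torch

def hardsigmoid_ref(x):
    # hardsigmoid(x) = relu6(x + 3) / 6 = clamp(x / 6 + 1 / 2, 0, 1)
    return torch.clamp(x + 3, 0, 6) / 6
```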
Some initial questions for reviewers, since it's my first FP operator
diff:
* do we need a backwards.out method for this?
* do we need CUDA? If yes, should it be in this PR, or is it OK to split it out into a future PR?
Test Plan:
```
// test
python test/test_torch.py TestTorchDeviceTypeCPU.test_hardsigmoid_cpu_float32
// benchmark
python -m pt.hardsigmoid_test
...
Forward Execution Time (us) : 40.315
Forward Execution Time (us) : 42.603
```
Imported from OSS
Differential Revision: D20371692
fbshipit-source-id: 95668400da9577fd1002ce3f76b9777c6f96c327
Summary:
This test is flaky on my computer; the error is:
```
AssertionError: tensor(1.3351e-05) not less than or equal to 1e-05
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34764
Differential Revision: D20476006
Pulled By: ezyang
fbshipit-source-id: dad7e702275346070552c8a98765c37e6ca2c197
Summary:
This PR enables bfloat16 type for
- Embedding, Index, Sigmoid Ops used in [DLRM](https://github.com/facebookresearch/dlrm)
- Miscellaneous ops like comparison ops and the arange op used in unit tests
- Renames type lists with the pattern `*_with_bfloat16` in `test_torch.py` to avoid confusion
iotamudelta ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34630
Differential Revision: D20405093
Pulled By: ezyang
fbshipit-source-id: aa9538acf81b3a5a9a46ce5014529707fdf25687
Summary:
This PR implements the following linear algebra algorithms for low-rank matrices (a short usage sketch follows the list):
- [x] Approximate `A` as `Q Q^H A` - using Algorithm 4.4 from [Halko et al, 2009](http://arxiv.org/abs/0909.4061).
+ exposed as `torch.lowrank.get_approximate_basis(A, q, niter=2, M=None) -> Q`
+ [x] dense matrices
+ [x] batches of dense matrices
+ [x] sparse matrices
+ [x] documentation
- [x] SVD - using Algorithm 5.1 from [Halko et al, 2009](http://arxiv.org/abs/0909.4061).
+ uses `torch.lowrank.get_approximate_basis`
+ exposed as `torch.svd_lowrank(A, q=6, niter=2, M=None) -> (U, S, V)`
+ [x] dense matrices
+ [x] batches of dense matrices
+ [x] sparse matrices
+ [x] documentation
- [x] PCA - using `torch.svd_lowrank`
+ uses `torch.svd_lowrank`
+ exposed as `torch.pca_lowrank(A, center=True, q=None, niter=2) -> (U, S, V)`
+ [x] dense matrices
+ [x] batches of dense matrices
+ [x] sparse matrices, uses non-centered sparse matrix algorithm
+ [x] documentation
- [x] generalized eigenvalue solver using the original LOBPCG algorithm [Knyazev, 2001](https://epubs.siam.org/doi/abs/10.1137/S1064827500366124)
+ exposed as `torch.lobpcg(A, B=None, k=1, method="basic", ...)`
+ [x] dense matrices
+ [x] batches of dense matrices
+ [x] sparse matrices
+ [x] documentation
- [x] generalized eigenvalue solver using robust LOBPCG with orthogonal basis selection [Stathopoulos, 2002](https://epubs.siam.org/doi/10.1137/S1064827500370883)
+ exposed as `torch.lobpcg(A, B=None, k=1, method="ortho", ...)`
+ [x] dense matrices
+ [x] batches of dense matrices
+ [x] sparse matrices
+ [x] documentation
- [x] generalized eigenvalue solver using the robust and efficient LOBPCG Algorithm 8 from [Duersch et al, 2018](https://epubs.siam.org/doi/abs/10.1137/17M1129830) that switches to orthogonal basis selection automatically
+ the "ortho" method improves iterations so rapidly that in the current test cases it does not make sense to use the basic iterations at all. If users will have matrices for which basic iterations could improve convergence then the `tracker` argument allows breaking the iteration process at user choice so that the user can switch to the orthogonal basis selection if needed. In conclusion, there is no need to implement Algorithm 8 at this point.
- [x] benchmarks
+ [x] `torch.svd` vs `torch.svd_lowrank`, see notebook [Low-rank SVD](https://github.com/Quansight/pearu-sandbox/blob/master/pytorch/Low-rank%20SVD.ipynb). In conclusion, the low-rank SVD is going to be useful only for large sparse matrices where the full-rank SVD will fail due to memory limitations.
+ [x] `torch.lobpcg` vs `scipy.sparse.linalg.lobpcg`, see notebook [LOBPCG - pytorch vs scipy](https://github.com/Quansight/pearu-sandbox/blob/master/pytorch/LOBPCG%20-%20pytorch%20vs%20scipy.ipynb). In conclusion, both implementations give the same results (up to numerical errors from different methods); the scipy lobpcg implementation is generally faster.
+ [x] On very small tolerance cases, `torch.lobpcg` is more robust than `scipy.sparse.linalg.lobpcg` (see `test_lobpcg_scipy` results)
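A short usage sketch of the new entry points (shapes and arguments illustrative):
```
import torch

A = torch.randn(100, 20)
U, S, V = torch.svd_lowrank(A, q=6, niter=2)         # approximate rank-6 SVD
U2, S2, V2 = torch.pca_lowrank(A, q=6, center=True)  # PCA via svd_lowrank
M = A.t() @ A                                        # symmetric positive semi-definite
E, X = torch.lobpcg(M, k=3, method="ortho")          # 3 largest eigenpairs
```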
Resolves https://github.com/pytorch/pytorch/issues/8049.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29488
Differential Revision: D20193196
Pulled By: vincentqb
fbshipit-source-id: 78a4879912424595e6ea95a95e483a37487a907e
Summary:
This PR enables bfloat16 type for loss criterion ops (and the ops they depend on) and a few miscellaneous ops required to train resnet50.
iotamudelta ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34469
Differential Revision: D20348856
Pulled By: ezyang
fbshipit-source-id: 0a8f06c2169cfa3c9cf319120e27150170095f6c
Summary:
This allows us to enable some double-based pdist tests that previously ran into accumulated error from casting down to float.
Addresses https://github.com/pytorch/pytorch/issues/33128
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34103
Differential Revision: D20343279
Pulled By: ezyang
fbshipit-source-id: a2da768259fab34ef326976283b7a15bebbbb979
Summary:
Addresses https://github.com/pytorch/pytorch/issues/5442.
Per title (and see issue). A test is added to test_torch.py to verify the behavior.
Update (with new behavior):
NumPy arrays can be non-writeable (read-only). When converting a NumPy array to a Torch tensor the storage is shared, but the tensor is always writable (PyTorch doesn't have a read-only tensor). Thus, when a non-writeable NumPy array is converted to a PyTorch tensor, it can be written to.
In the past, PyTorch would silently copy non-writeable NumPy arrays and then convert those copies into tensors. This behavior violates the from_numpy contract, however, which promises that the tensor and the array share memory.
This PR adds a warning when a non-writeable NumPy array is converted into a Torch tensor. This will not break any networks, but will make end users aware of the behavior. They can work around the warning by marking their NumPy arrays as writeable.
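A sketch of how the warning surfaces and the workaround (array contents illustrative):
```
import numpy as np
import torch

a = np.arange(3.0)
a.setflags(write=False)    # a read-only array...
t = torch.from_numpy(a)    # ...now warns: the tensor will still be writable
a.setflags(write=True)     # workaround: mark the array writeable first
t = torch.from_numpy(a)    # no warning; storage is shared as promised
```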
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33615
Differential Revision: D20289894
Pulled By: mruberry
fbshipit-source-id: b76df0077399eb91038b12a6bf1917ef38c2cafd
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33825
Partially addresses #20376
I do this by overriding assertEqual in classes that opt into
this. This means I have to fix #33821. The fix is a little
unsatisfactory, as idiomatic Python 2 super() calls don't work
(since the class is no longer in scope); hopefully this will just
work when we go to Python 3.
General approach taken:
- A lot of dtype mismatches are because we specified tensor constants
that infer to some dtype, but the actual dtype needed is something else.
Those are easy, just annotate the tensor() constructor (often a legacy
Tensor/FloatTensor call) with dtype
- There are a few cases where the promotion rules are nontrivial. Some of them
I just typed out the expected promotion rules manually (based on trial
and error)
- There are some more complex cases; if it gets too hairy I just
set exact_dtype=False and nope the fuck out
I don't have time to do it for all the other classes. But the setup
should work if people just incrementally add the overrides to classes,
and then eventually flip the default.
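A sketch of the most common fix described above (names illustrative):
```
import torch

# Before: a legacy constructor infers float32, tripping exact-dtype checks
# when the value under test is float64.
expected = torch.FloatTensor([1, 2])
# After: annotate the constructor with the dtype the comparison needs.
expected = torch.tensor([1, 2], dtype=torch.float64)
```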
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Differential Revision: D20125791
Pulled By: ezyang
fbshipit-source-id: 389c2d1efbd93172af02f13e38ac5e92fe730c57
Summary:
1. randn and normal_ methods will work for complex tensors after this PR
2. added an internal function for viewing complex tensors as float tensors, which enables us to reuse functions defined for float tensors for complex tensors, with a change in the arguments passed (like size, or the standard deviation in the case of normal_). Currently the resultant float tensor doesn't share storage with the input complex tensor, which means the version counter wouldn't be updated if any function is called on this resultant tensor; once the dtype entry is removed from the storage class, this issue will be resolved.
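Conceptually (a sketch using today's public torch.view_as_real, which, unlike the internal helper here, does share storage): a complex tensor of shape (n,) maps to a float tensor of shape (n, 2) holding (real, imag) pairs.
```
import torch

z = torch.randn(4, dtype=torch.complex64)
f = torch.view_as_real(z)   # shape (4, 2); last dim is (real, imag)
```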
Side notes:
1. didn't add a separate header for the util functions because of this issue https://github.com/pytorch/pytorch/issues/20686#issuecomment-593002293
2. we should eventually have a public API method view_complex_as_float once the storage-sharing issue mentioned in (2) above is resolved
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34037
Differential Revision: D20221793
Pulled By: anjali411
fbshipit-source-id: a78f5e83d6104e2f55e0b250c4ec32e8d29a14eb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33901
After this change, the pytest profile looks like:
4.83s call test/test_torch.py::TestTorch::test_fft_ifft_rfft_irfft
4.23s call test/test_torch.py::TestTorch::test_var_dim
4.22s call test/test_torch.py::TestTorch::test_std_dim
4.19s call test/test_torch.py::TestTorch::test_max
4.06s call test/test_torch.py::TestTorch::test_min
3.60s call test/test_torch.py::TestTorchDeviceTypeCPU::test_cdist_norm_batch_cpu
2.62s call test/test_torch.py::TestTorchDeviceTypeCPU::test_pow_cpu
2.60s call test/test_torch.py::TestTorch::test_matmul_small_brute_force_1d_Nd
And the entire CPU-only test suite can be run in 88s on my Intel(R) Xeon(R) CPU
E5-2650 v4 @ 2.20GHz
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Differential Revision: D20222288
Pulled By: ezyang
fbshipit-source-id: 4224a9117f42566e290ae202881d76f1545cebec
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33819
These conditions are for the specific implementation; the fallback implementation works without these checks, so we use the fallback if any of these checks fails.
Resubmit of https://github.com/pytorch/pytorch/pull/33419 (which got reverted due to a problem with XLA, but which now has been fixed)
ghstack-source-id: 99333280
Test Plan: Test included
Differential Revision: D20121460
fbshipit-source-id: c1056b8e26751e24078bbe80c7cb4b223bcca7cb
Summary:
- Modified assertEqual to handle complex tensors
- Added a test in test_torch.py to test torch.zeros
- Added dispatch for complex for index_kernel, index_put_kernel
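For example, a minimal illustration of the new coverage:
```
import torch

z = torch.zeros(2, dtype=torch.complex64)  # constructible and comparable in tests
```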
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33773
Differential Revision: D20135553
Pulled By: anjali411
fbshipit-source-id: f716604535c0447ecffa335b0fc843431397c988