pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 00:21:07 +01:00

Author	SHA1	Message	Date
Rohan Varma	63e545e0fe	Revert D21717199: [pytorch][PR] Updates assertEqual to require atol and rtol, removes positional atol Test Plan: revert-hammer Differential Revision: D21717199 Original commit changeset: 9feb856f94ee fbshipit-source-id: bfde9c39a5ce99f0ca6183a7dde703c65b7c8259	2020-05-26 18:23:59 -07:00
mattip	2e6ee853ab	make onnx expect tests resilient to producer_version changes (#39002 ) Summary: closes gh-32561 closes gh-38545. As part of the fallout from gh-36797, this PR - replaces the producer_version: "1.6" in onnx expect tests with `producer_version: "XXX"` - adapts `testing/_internal/common_utils.py` with a regex to change the onnx producer_version so tests still pass. The consistency of the torch version and the onnx `producer_version` is tested in gh-36797, so there is no reason to test it again in the expect tests. xref gh-38629 which documented how to run the onnx tests and at the same time refactored the Community documentation. Pull Request resolved: https://github.com/pytorch/pytorch/pull/39002 Differential Revision: D21723062 Pulled By: ezyang fbshipit-source-id: 1bd6a8ed37d5383e69d017226dc09c0645a69aff	2020-05-26 16:11:21 -07:00
Mike Ruberry	6ddca30b2d	Updates assertEqual to require atol and rtol, removes positional atol (#38872 ) Summary: This updates assertEqual and assertEqual-like functions to either require both or neither of atol and rtol be specified. This should improve clarity around handling precision in the test suite, and it allows us to remove the legacy positional atol argument from assertEqual. In addition, the "message" kwarg is replaced with a kwarg-only "msg" argument whose name is consistent with unittest's assertEqual argument. In the future we could make "msg" an optional third positional argument to be more consistent with unittest's assertEqual, but requiring it be specified should be clear, and we can easily update the signature to make "msg" an optional positional argument in the future, too. Pull Request resolved: https://github.com/pytorch/pytorch/pull/38872 Differential Revision: D21717199 Pulled By: mruberry fbshipit-source-id: 9feb856f94eee911b44f6c7140a1d07c1b026d3a	2020-05-26 08:30:23 -07:00
Mike Ruberry	9cfc10d52e	Updates assertEqual to use torch.isclose-like logic (#37294 ) Summary: Edit: this has been updated to reflect the PR's current status, which has changed after review. This PR updates the behavior of the assertEqual, assertNotEqual, and assert_allclose to be consistent with each other and torch.isclose. It corrects several additional bugs in the current implementations and adds extensive testing and comments, too. These updates follow from changes to assertEqual like https://github.com/pytorch/pytorch/pull/34258 and https://github.com/pytorch/pytorch/pull/37069, and from our discussion of torch.isclose for complex tensors (see https://github.com/pytorch/pytorch/issues/36462), where we decided to implement a NumPy-compatible mathematical notion of "closeness" for complex tensors that is not a great fit for our testing framework. The detailed changelist is: - New test framework functions for comparing tensors and scalars - Tensors are compared using isclose; the real and imaginary parts of complex tensors are compared independently - Scalars are compared using the same algorithm - assertEqual and assert_allclose now use this common comparison function, instead of each implementing their own with divergent behavior - assertEqual-like debug messages are now available for all tensor and scalar comparisons, with additional context when comparing the components of sparse, quantized, and complex tensors - Extensive testing of the comparison behavior and debug messages - Small Updates - assertEqual now takes an "exact_device" argument, analogous to "exact_dtype", which should be useful in multidevice tests - assertEqual now takes an "equal_nan" argument for argument consistency with torch.isclose - assertEqual no longer takes the "allow_inf" keyword, which misleadingly only applied to scalar comparisons, was only ever set (rarely) to true, and is not supported by torch.isclose - Bug fixes: - the exact_dtype attribute has been removed (no longer needed 
after https://github.com/pytorch/pytorch/pull/38103) - message arguments passed to assertEqual are now handled correctly - bool x other dtype comparisons are now supported - uint8 and int8 tensor comparisons now function properly - rtol for integer comparisons is now supported (default is zero) - rtol and atol for scalar comparisons are now supported - complex scalar comparisons are now supported, analogous to complex tensor comparisons - assertNotEqual is now equivalent to the logical negation of assertEqual Pull Request resolved: https://github.com/pytorch/pytorch/pull/37294 Differential Revision: D21596830 Pulled By: mruberry fbshipit-source-id: f2576669f7113a06f82581fc71883e6b772de19b	2020-05-15 16:24:03 -07:00
David Reiss	1f87f15ba3	Remove _reset_warning_registry (#38485 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/38485 Python 2 has reached end-of-life and is no longer supported by PyTorch. This class does nothing in Python 3. Test Plan: CI Reviewed By: ailzhang Differential Revision: D21575260 Pulled By: dreiss fbshipit-source-id: 184696c9fa501e8d2517950b47cdbc90b2ae8053	2020-05-14 15:03:30 -07:00
Nikolay Korovaiko	96885f73ed	make test_jit infer the profiling mode, add a job for simple executor (#38374 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/38374 Differential Revision: D21567658 Pulled By: Krovatkin fbshipit-source-id: c0eb44cf6c842d5feebabf8c7d99c1b4aa6c4960	2020-05-13 23:55:40 -07:00
Pavel Belevich	4f08bdddfc	Add skipIfNoSciPy/get_all_int_dtypes/get_all_fp_dtypes to common_utils (#38299 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/38299 Test Plan: Imported from OSS Differential Revision: D21534876 Pulled By: pbelevich fbshipit-source-id: 864881b3be899aea3660039128d9bc2e94edab95	2020-05-12 19:11:31 -07:00
Vitaly Fedyunin	48ad9f5a30	assertEqual now requires matching dtypes (#38103 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/38103 Test Plan: Imported from OSS Differential Revision: D21477062 Pulled By: VitalyFedyunin fbshipit-source-id: 9592fed336214dd97eb8e9d6b3e16f21ff6f072d	2020-05-09 14:49:01 -07:00
Vitaly Fedyunin	e3414c1ef1	AssertEqual now checks tensors dtype (#34154 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34154 Temporarily replaces `AssertEqual` with `assertEqualIgnoreType` in all cases where `AssertEqual` fails. Test Plan: Imported from OSS Differential Revision: D20251131 Pulled By: VitalyFedyunin fbshipit-source-id: fa69c6e2b3a7963912af5b0fa42bec9eded323d3	2020-05-09 14:47:01 -07:00
Ailing Zhang	9232356e5f	remove uses of type() and type_as() part 1. (#38029 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/38029 Differential Revision: D21468523 Pulled By: ailzhang fbshipit-source-id: 14b7185d43eb03f630cfaa2d70e02d637ff8551b	2020-05-08 08:16:24 -07:00
Nikita Shulga	53aa7d8bc5	Add option to skip tests after retries (#38079 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/38079 Differential Revision: D21470238 Pulled By: malfet fbshipit-source-id: b2e63be34090c6f61acad8b6530658a835c68870	2020-05-07 21:56:29 -07:00
Nikita Shulga	72e5b7ae5b	Add option to run python unittests in parallel (#37180 ) Summary: So far results look quite promising: test_nn consists purely of sequential tests and can be accelerated 3x. Pull Request resolved: https://github.com/pytorch/pytorch/pull/37180 Differential Revision: D21437871 Pulled By: malfet fbshipit-source-id: 8679a8af355f839f2c9dae3bf36d2e102af05425	2020-05-06 22:14:11 -07:00
Elias Ellison	0e3a05ec00	[JIT] rename enable_profiling_mode to enable_profiling_mode_for_profiling_tests (#37825 ) Summary: The existing context manager only conditionally enabled profiling mode, which was counterintuitive. As a result, changing the default executor broke internal benchmarking. Pull Request resolved: https://github.com/pytorch/pytorch/pull/37825 Differential Revision: D21404611 Pulled By: eellison fbshipit-source-id: 306b3c333ef4eb44ab6a6e5ab4e0682e5ce312ce	2020-05-06 11:30:02 -07:00
Nikita Shulga	2c6aed0d61	[Testing] Add `--save-xml` option (#37840 ) Summary: Passing `--save-xml` option to common test runner would have the same effect as setting up `IN_CIRCLECI` environment variable, but also would allow one to specify folder to save results Pull Request resolved: https://github.com/pytorch/pytorch/pull/37840 Differential Revision: D21410250 Pulled By: malfet fbshipit-source-id: ae5855fafdc8c66b550d42b683d547c88b4e55d9	2020-05-05 14:57:50 -07:00
Nikolay Korovaiko	edc5ef1afb	run the simple executor for jit tests by default, add profiling jobs … (#37017 ) Summary: …for fusion tests fix flake8 warnings fix ci failures fix test_determination.py Pull Request resolved: https://github.com/pytorch/pytorch/pull/37017 Differential Revision: D21238446 Pulled By: Krovatkin fbshipit-source-id: 393e6135883dc5ac57bdff580de96c66829d454c	2020-04-28 19:16:52 -07:00
Nikita Shulga	ea741f829e	Add `--repeat` option to python unit-test (#37281 ) Summary: This runs the same test suite (or an individual test) multiple times, which is useful for detecting flaky tests. Example usage: `python test_autograd.py TestAutograd.test_profiler -v --repeat=100` Pull Request resolved: https://github.com/pytorch/pytorch/pull/37281 Differential Revision: D21244442 Pulled By: malfet fbshipit-source-id: 3ecafec7ae87bc1e418aa28151bbc472ef37a713	2020-04-25 13:56:58 -07:00
Brian Vaughan	a50a1fb4c3	Enforce kw-only args now that py2 is unsupported (#37069 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/37069 Test Plan: Imported from OSS Differential Revision: D21204729 Pulled By: nairbv fbshipit-source-id: 8e93decae59e753706fa288bcdc3bf6278b8eeb5	2020-04-24 07:08:24 -07:00
David Reiss	e75fb4356b	Remove (most) Python 2 support from Python code (#35615 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35615 Python 2 has reached end-of-life and is no longer supported by PyTorch. Now we can clean up a lot of cruft that we put in place to support it. These changes were all done manually, and I skipped anything that seemed like it would take more than a few seconds, so I think it makes sense to review it manually as well (though using side-by-side view and ignoring whitespace change might be helpful). Test Plan: CI Differential Revision: D20842886 Pulled By: dreiss fbshipit-source-id: 8cad4e87c45895e7ce3938a88e61157a79504aed	2020-04-22 09:23:14 -07:00
Nikita Shulga	3b832ee2bf	Use Python3 `super()` throughout `torch.testing.` (#37024 ) Summary: Hattip to ezyang Pull Request resolved: https://github.com/pytorch/pytorch/pull/37024 Differential Revision: D21173244 Pulled By: malfet fbshipit-source-id: 7079703e28777d873f69bf9fd4dcbad8d53a2682	2020-04-22 09:00:28 -07:00
Brian Vaughan	54ed6fd3ee	Use both absolute and relative tolerance in testing (#34258 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34258 This PR allows both atol and rtol to be specified, uses defaults based on the prior analysis (spreadsheet attached to https://github.com/pytorch/pytorch/pull/32538), but retains the absolute tolerance behavior in cases where precision was previously specified explicitly. Test Plan: Imported from OSS Differential Revision: D21110255 Pulled By: nairbv fbshipit-source-id: 57b3a004c7d5ac1be80ee765f03668b1b13f4a7e	2020-04-19 06:16:49 -07:00
Elias Ellison	54a575c9bd	[JIT] fix torch.tensor jit dtype (#36587 ) Summary: Previously we were always creating a double tensor from `torch.tensor(1.)`, whereas python eager uses the current default dtype. Fix for https://github.com/pytorch/pytorch/issues/36369 Pull Request resolved: https://github.com/pytorch/pytorch/pull/36587 Differential Revision: D21043617 Pulled By: eellison fbshipit-source-id: 38da303594f52e06941d86b6e57c4a06e7d36938	2020-04-16 10:55:49 -07:00
Mike Ruberry	d0c925f1c7	Returns float tensors for complex inputs to abs (#35871 ) Summary: Per title. A test is added to test_type_promotion for the behavior. This behavior is consistent with NumPy's. For complex inputs to `abs` the result is cast to float after the computation since the computation of abs must be performed on the original complex tensor. While `std::abs` returns a float value when called on complex inputs, returning a FloatTensor directly would require additional loop instantiations in TensorIterator. This may be worthwhile to pursue in the future. Pull Request resolved: https://github.com/pytorch/pytorch/pull/35871 Differential Revision: D20984456 Pulled By: mruberry fbshipit-source-id: 226445178f92f2b0292e92578656d98674a6aa20	2020-04-16 09:03:17 -07:00
Natalia Gimelshein	f3f640d479	move test_abs to device-generic tests (#36465 ) Summary: Per title. test_abs used to be marked as slow_test and run on cpu only. Conceptually similar tests are done in TestTorchMathOps, so it's a matter of adding `abs` test there. 2 remaining checks (correct abs for large-valued long tensors, and correct abs for signed zeros) are factored into separate tests. Pull Request resolved: https://github.com/pytorch/pytorch/pull/36465 Differential Revision: D21000248 Pulled By: ngimel fbshipit-source-id: 8bc8b0da936b1c10fe016ff2f0dbb5ea428e7e61	2020-04-14 09:48:08 -07:00
Wanchao Liang	3526627f46	Use unittest assertWarns instead (#36411 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/36411 This PR removes the PyTorch-specific assertWarns and uses the unittest one; it also formats some tests. Test Plan: Imported from OSS Differential Revision: D20998159 Pulled By: wanchaol fbshipit-source-id: 1280ecff2dd293b95a639d13cc7417fc819c2201	2020-04-13 15:56:42 -07:00
Mike Ruberry	254be6a201	Adds NumPy array x Torch tensor binary ufunc interaction test (#35945 ) Summary: Adds test for behavior reported in https://github.com/pytorch/pytorch/issues/35257 to ensure it doesn't regress. The test was extended to reveal three additional issues: - https://github.com/pytorch/pytorch/issues/36363 - https://github.com/pytorch/pytorch/issues/36058 - https://github.com/pytorch/pytorch/issues/36057 Pull Request resolved: https://github.com/pytorch/pytorch/pull/35945 Differential Revision: D20984429 Pulled By: mruberry fbshipit-source-id: a15be9455afba9c77e40c337a860f9be348bf8d5	2020-04-11 21:56:38 -07:00
Lu Fang	742c77971a	Revert D20961711: [pytorch][PR] Returns float tensors for complex inputs to abs Test Plan: revert-hammer Differential Revision: D20961711 Original commit changeset: 232f62cf64ca fbshipit-source-id: 7b2a537d2effe6b2449f192dc42e375062058995	2020-04-11 02:55:41 -07:00
Mike Ruberry	3aeb2b1562	Returns float tensors for complex inputs to abs (#35871 ) Summary: Per title. A test is added to test_type_promotion for the behavior. This behavior is consistent with NumPy's. For complex inputs to `abs` the result is cast to float after the computation since the computation of abs must be performed on the original complex tensor. While `std::abs` returns a float value when called on complex inputs, returning a FloatTensor directly would require additional loop instantiations in TensorIterator. This may be worthwhile to pursue in the future. Pull Request resolved: https://github.com/pytorch/pytorch/pull/35871 Differential Revision: D20961711 Pulled By: mruberry fbshipit-source-id: 232f62cf64caa4154eb2194969efa51d2082d842	2020-04-10 09:08:45 -07:00
Nikita Shulga	bb32e123e6	Report results of python unit tests during Windows test runs (#35687 ) Summary: Define `store_test_results` attribute in CircleCI yamls. Install `unittest-xml-reporting` and define `IN_CIRCLECI` environment variable to trigger test runners to save results to XML. Pull Request resolved: https://github.com/pytorch/pytorch/pull/35687 Differential Revision: D20739831 Pulled By: malfet fbshipit-source-id: 6a7bbf19f93c32766963f5edad191ad8ca316ff8	2020-03-30 12:33:03 -07:00
Mike Ruberry	683246e5ea	Improves precision of linspace, logspace (#35461 ) Summary: The Torch algorithms for linspace and logspace conceptually compute each of their values using: `start_value + step_value * idx` [And NumPy does the same,](`cef4dc9d91/numpy/core/function_base.py (L24)`) except NumPy then [sets the last value in its array directly.](`cef4dc9d91/numpy/core/function_base.py (L162)`) This is because the above computation is unstable when using floats, and NumPy's contract, like PyTorch's, is that the last element in the array is the stop value. In PyTorch there can be a divergence between the computed last value and the actual value. One user reported case was: `torch.linspace(-0.031608279794, 0.031531572342, 257, dtype=torch.float32)` Which causes a difference of 3.7253e-09 between the last value as set by NumPy and computed by PyTorch. After this PR the difference is zero. Instead of simply setting the last element of the tensor, this PR updates the kernels with a "symmetric" algorithm that sets the first and last array elements without requiring an additional kernel launch on CUDA. The performance impact of this change seems small. I tested with a step sizes of 2^8 and 2^22, and all timing differences were imperceptible except for 2^22 on CPU, which appears to have suffered ~5% slowdown. I think that's an acceptable performance hit for the improved precision when we consider the context of linspace. An alternative would be to simply set the last element, as NumPy does, on CPU. But I think it's preferable to keep the CPU and CUDA algorithms aligned and keep the algorithm symmetric. In current PyTorch, for example, torch.linspace starts generating values very similar to NumPy, but as the index increases so do the errors, giving our current implementation a "left bias." Two tests are added to test_torch.py for this behavior. 
The linspace test will fail on current PyTorch, but the logspace test will succeed since its more complex computation needs wider error bars. Pull Request resolved: https://github.com/pytorch/pytorch/pull/35461 Differential Revision: D20712539 Pulled By: mruberry fbshipit-source-id: 2c1257c8706f4cdf080ff0331bbf2f7041ab9adf	2020-03-27 23:50:39 -07:00
Alban Desmaison	181da12126	Revert D20687652: [pytorch][PR] Report results from cpp unittests on Windows and Linux Test Plan: revert-hammer Differential Revision: D20687652 Original commit changeset: fc370b7e2614 fbshipit-source-id: 8153815c8ed8f3d4f472caa95eda76180b038a42	2020-03-27 06:56:53 -07:00
Nikita Shulga	d2d40c45b6	Report results from cpp unittests on Windows and Linux (#35500 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35500 Test Plan: Test in production :) Results should eventually be published to: https://circleci.com/build-insights/gh/pytorch/pytorch/master Differential Revision: D20687652 Pulled By: malfet fbshipit-source-id: fc370b7e261402e14b427f42038ecb2d95bad059	2020-03-26 23:00:33 -07:00
Nikita Shulga	6fa0b3df2e	[testing] Pass verbosity settings to `XMLTestRunner` (#35224 ) Summary: When `unittest.main()` is invoked with custom testRunner, verbosity settings for the runner must be set manually Pull Request resolved: https://github.com/pytorch/pytorch/pull/35224 Test Plan: CI Differential Revision: D20605896 Pulled By: malfet fbshipit-source-id: 79fc6f55911189b6d8a4bc83bd2390c94bd69e5e	2020-03-23 16:37:52 -07:00
Ailing Zhang	471ddacd8b	Add retry decorator and use it for Hub tests. (#34829 ) Summary: fix https://github.com/pytorch/pytorch/issues/34751 Pull Request resolved: https://github.com/pytorch/pytorch/pull/34829 Differential Revision: D20476231 Pulled By: ailzhang fbshipit-source-id: eb38ee655e28250352b15e8e37b3b39310a7c378	2020-03-16 20:19:45 -07:00
Pearu Peterson	8bae1ed144	PCA and SVD for low-rank matrices, LOBPCG for positive-defined generalized eigenvalue problem - copy (#34721 ) Summary: This is a copy of PR https://github.com/pytorch/pytorch/issues/29488 to help the merging process. Pull Request resolved: https://github.com/pytorch/pytorch/pull/34721 Differential Revision: D20444270 Pulled By: vincentqb fbshipit-source-id: 042c56c8c0dae37834f52b4aee2deae7dd6fa659	2020-03-16 14:13:30 -07:00
Edward Yang	4b929e5466	Revert D20193196: [pytorch][PR] PCA and SVD for low-rank matrices, LOBPCG for positive-defined generalized eigenvalue problem Test Plan: revert-hammer Differential Revision: D20193196 Original commit changeset: 78a487991242 fbshipit-source-id: 8da4f8cb17c45af41e8c0ce80bc72581eb10dbb8	2020-03-11 09:24:34 -07:00
Pearu Peterson	2ec779d46c	PCA and SVD for low-rank matrices, LOBPCG for positive-defined generalized eigenvalue problem (#29488 ) Summary: This PR implements the following linear algebra algorithms for low-rank matrices: - [x] Approximate `A` as `Q Q^H A` - using Algorithm 4.4 from [Halko et al, 2009](http://arxiv.org/abs/0909.4061). + exposed as `torch.lowrank.get_approximate_basis(A, q, niter=2, M=None) -> Q` + [x] dense matrices + [x] batches of dense matrices + [x] sparse matrices + [x] documentation - [x] SVD - using Algorithm 5.1 from [Halko et al, 2009](http://arxiv.org/abs/0909.4061). + uses `torch.lowrank.get_approximate_basis` + exposed as `torch.svd_lowrank(A, q=6, niter=2, M=None) -> (U, S, V)` + [x] dense matrices + [x] batches of dense matrices + [x] sparse matrices + [x] documentation - [x] PCA - using `torch.svd_lowrank` + uses `torch.svd_lowrank` + exposed as `torch.pca_lowrank(A, center=True, q=None, niter=2) -> (U, S, V)` + [x] dense matrices + [x] batches of dense matrices + [x] sparse matrices, uses non-centered sparse matrix algorithm + [x] documentation - [x] generalized eigenvalue solver using the original LOBPCG algorithm [Knyazev, 2001](https://epubs.siam.org/doi/abs/10.1137/S1064827500366124) + exposed as `torch.lobpcg(A, B=None, k=1, method="basic", ...)` + [x] dense matrices + [x] batches of dense matrices + [x] sparse matrices + [x] documentation - [x] generalized eigenvalue solver using robust LOBPCG with orthogonal basis selection [Stathopoulos, 2002](https://epubs.siam.org/doi/10.1137/S1064827500370883) + exposed as `torch.lobpcg(A, B=None, k=1, method="ortho", ...)` + [x] dense matrices + [x] batches of dense matrices + [x] sparse matrices + [x] documentation - [x] generalized eigenvalue solver using the robust and efficient LOBPCG Algorithm 8 from [Duersch et al, 2018](https://epubs.siam.org/doi/abs/10.1137/17M1129830) that switches to orthogonal basis selection automatically + the "ortho" method improves iterations so rapidly 
that in the current test cases it does not make sense to use the basic iterations at all. If users have matrices for which basic iterations could improve convergence, then the `tracker` argument allows breaking the iteration process at the user's choice so that the user can switch to the orthogonal basis selection if needed. In conclusion, there is no need to implement Algorithm 8 at this point. - [x] benchmarks + [x] `torch.svd` vs `torch.svd_lowrank`, see notebook [Low-rank SVD](https://github.com/Quansight/pearu-sandbox/blob/master/pytorch/Low-rank%20SVD.ipynb). In conclusion, the low-rank SVD is going to be useful only for large sparse matrices where the full-rank SVD will fail due to memory limitations. + [x] `torch.lobpcg` vs `scipy.sparse.linalg.lobpcg`, see notebook [LOBPCG - pytorch vs scipy](https://github.com/Quansight/pearu-sandbox/blob/master/pytorch/LOBPCG%20-%20pytorch%20vs%20scipy.ipynb). In conclusion, both implementations give the same results (up to numerical errors from different methods), but the scipy lobpcg implementation is generally faster. + [x] On very small tolerance cases, `torch.lobpcg` is more robust than `scipy.sparse.linalg.lobpcg` (see `test_lobpcg_scipy` results) Resolves https://github.com/pytorch/pytorch/issues/8049. Pull Request resolved: https://github.com/pytorch/pytorch/pull/29488 Differential Revision: D20193196 Pulled By: vincentqb fbshipit-source-id: 78a4879912424595e6ea95a95e483a37487a907e	2020-03-11 07:33:49 -07:00
Edward Yang	ba1bd41767	Turn on strict dtype checking for test_torch.py (#33825 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33825 Partially addresses #20376 I do this by overriding assertEqual in classes that opt into this. This means I have to fix #33821. The fix is a little unsatisfactory as idiomatic Python 2 super() calls don't work (since the class is no longer in scope); hopefully this will just work when we go to Python 3. General approach taken: - A lot of dtype mismatches are because we specified tensor constants that infer to some dtype, but the actual dtype needed is something else. Those are easy, just annotate the tensor() constructor (often a legacy Tensor/FloatTensor call) with dtype - There are a few cases where the promotion rules are nontrivial. Some of them I just typed out the expected promotion rules manually (based on trial and error) - There are some more complex cases; if it gets too hairy I just set exact_dtype=False and nope the fuck out I don't have time to do it for all the other classes. But the setup should work if people just incrementally add the overrides to classes, and then eventually flip the default. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Differential Revision: D20125791 Pulled By: ezyang fbshipit-source-id: 389c2d1efbd93172af02f13e38ac5e92fe730c57	2020-03-03 14:45:53 -08:00
anjali411	dece155335	Modified assertEqual to handle complex tensors (#33773 ) Summary: - Modified assertEqual to handle complex tensors - added a test in test_torch.py to test torch.zeros - added dispatch for complex for index_kernel, index_put_kernel Pull Request resolved: https://github.com/pytorch/pytorch/pull/33773 Differential Revision: D20135553 Pulled By: anjali411 fbshipit-source-id: f716604535c0447ecffa335b0fc843431397c988	2020-02-28 08:43:28 -08:00
Nikolay Korovaiko	a7e22b4c6a	add bailout checks to checkScript (#32802 ) Summary: this adds enough infrastructure to run bailout checks in `checkScript`. I'll need to figure out the best way to enable it for nightly builds now. Pull Request resolved: https://github.com/pytorch/pytorch/pull/32802 Differential Revision: D19974718 Pulled By: Krovatkin fbshipit-source-id: 40485503f6d3ae14edcce98e1eec1f0559f3ad08	2020-02-21 21:18:54 -08:00
Rohan Varma	6cb9e6b015	Back out "Revert D19871946: [distributed] pass in timeout to TCP store when initializing" (#33434 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33434 Reland of https://github.com/pytorch/pytorch/pull/33325, since the unit test was flaky and failed on land. To ensure that the test is not flaky, I bumped the timeout so the rendezvous does not timeout (timing out the rendezvous in 1s led to the flakiness). I also generalized our mechanism for retrying on errors to include retrying on errors due to timeout in rendezvous. ghstack-source-id: 98558377 Test Plan: Added UT test_tcp_store_timeout_set Differential Revision: D19935390 fbshipit-source-id: 56ccf8c333dd2f954a33614d35cd1642d4e9473a	2020-02-19 17:17:17 -08:00
ptrblck	1e3664b6ef	Remove c/pdist tests from _internal/common_utils.py (#33409 ) Summary: * remove brute_test from `torch/testing/_internal/common_utils.py` * add these tests as internal tests to `test_torch.py` CC ailzhang Pull Request resolved: https://github.com/pytorch/pytorch/pull/33409 Differential Revision: D19951729 Pulled By: ailzhang fbshipit-source-id: b1126aaf26fa64a0f17cbb582dc8038b79cfe3eb	2020-02-19 10:27:30 -08:00
Pritam Damania	fd684cc312	Use torch.set_default_dtype in test_data_parallel and rename dtype2prec (#32962 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/32962 As per gchanan's comments on https://github.com/pytorch/pytorch/pull/30445, I've used `torch.set_default_dtype` in test_data_parallel instead of specifying dtype=torch.double everywhere. Also, renamed dtype2prec to dtype2prec_DONTUSE ghstack-source-id: 98388429 Test Plan: waitforbuildbot Differential Revision: D19714374 fbshipit-source-id: eb55bbca33881625636ba9ea6dd4cb692f25668e	2020-02-15 14:07:54 -08:00
ptrblck	a64d0ffe81	Use int64 in pdist kernel to handle batches >= 46342 #30583 (#31593 ) Summary: Currently `torch.pdist` yields an illegal CUDA memory access for batch sizes >= 46342 as reported by SsnL in https://github.com/pytorch/pytorch/issues/30583. Thanks for the minimal code reproduction, btw! ;) Reason for this bug: The calculation of `i` in the [`pdist_kernel_cuda_impl`](`46ad80c839/aten/src/ATen/native/cuda/DistanceKernel.cu (L112)`) might overflow if a tensor with a `batch size >= 46342` is passed to `torch.pdist`. Detailed description: * `result` is resized to `n * (n - 1) / 2 = 1073767311` ([line of code](`46ad80c839/aten/src/ATen/native/Distance.cpp (L140)`)) * `grid` is initialized as `result.numel()` ([line of code](`46ad80c839/aten/src/ATen/native/cuda/DistanceKernel.cu (L246)`)) * `k` is assigned `blockIdx.x` as an `int32` ([line of code](`46ad80c839/aten/src/ATen/native/cuda/DistanceKernel.cu (L108)`)) * `i` is calculated using `2 * k >= 2147534622` ([line of code](`46ad80c839/aten/src/ATen/native/cuda/DistanceKernel.cu (L112)`)), which overflows, since `2147534622 > 2147483647 (int32_max)`. Using `const int64_t k = blockIdx.x;` would solve the illegal memory access. This also seems to be done for [`cdist_kernel_cuda_impl`](`46ad80c839/aten/src/ATen/native/cuda/DistanceKernel.cu (L198-L201)`). However, we might expect a slowdown, so I've timed the current PyTorch master vs. 
this PR: (tested with `x = torch.randn(x.size(0), 128)` on a V100) \|x.size(0) \| int32 idx \| int64 idx \| slowdown \| \|----------\|-----------\|-----------\|----------\| \| 50000 \| - \| 4.4460 \| - \| \| 25000 \| 1.02522 \| 1.10869 \| 7.53% \| \| 12500 \| 0.25182 \| 0.27277 \| 7.68% \| \| 6250 \| 0.06291 \| 0.06817 \| 7.72% \| \| 3125 \| 0.01573 \| 0.01704 \| 7.69% \| \| 1562 \| 0.00393 \| 0.00426 \| 7.75% \| While checking the backward kernel, it seems I'm triggering another error with a size limit of ```python x = torch.randn(1449, 1, device='cuda', requires_grad=True) out = torch.pdist(x) out.mean().backward() > RuntimeError: CUDA error: invalid configuration argument ``` , while `[<=1448, 1]` works. I'll take another look at this issue. Let me know, if the potential fix should go into this PR or if I should open a new issue. CC ngimel, csarofeen Pull Request resolved: https://github.com/pytorch/pytorch/pull/31593 Differential Revision: D19825571 Pulled By: ngimel fbshipit-source-id: ace9ccab49f3cf0ce894cdb6daef0795e2e8ec03	2020-02-11 12:00:39 -08:00
George Guanheng Zhang	f4fbe9549d	Revert D19800021: [pytorch][PR] Improve error message for assertWarnsRegex Test Plan: revert-hammer Differential Revision: D19800021 Original commit changeset: 1c31ae785c8f fbshipit-source-id: d7b340d678562c25a84d48be66c576075000b50d	2020-02-10 12:17:52 -08:00
Peter Bell	c917a247a8	Improve error message for assertWarnsRegex (#33099 ) Summary: `assertWarnsRegex` now prints out any warnings that it caught while failing to find a matching warning. This makes it easier to debug tests by just looking at the CI logs. Pull Request resolved: https://github.com/pytorch/pytorch/pull/33099 Differential Revision: D19800021 Pulled By: ezyang fbshipit-source-id: 1c31ae785c8ffc5d47619aff6597e479263be2de	2020-02-10 07:27:59 -08:00
Richard Zou	6209412647	Add option to use ninja to compile ahead-of-time cpp_extensions (#32495 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/32495 Background ------------------------------ Previously, ninja was used to compile+link inline cpp_extensions and ahead-of-time cpp_extensions were compiled with distutils. This PR adds the ability to compile (but not link) ahead-of-time cpp_extensions with ninja. The main motivation for this is to speed up cpp_extension builds: distutils does not make use of parallelism. With this PR, using the new option, on my machine, - torchvision compilation goes from 3m43s to 49s - nestedtensor compilation goes from 2m0s to 28s. User-facing changes ------------------------------ I added a `use_ninja` flag to BuildExtension. This defaults to `True`. When `use_ninja` is True: - it will attempt to use ninja. - If we cannot use ninja, then this throws a warning and falls back to distutils. - Situations where we cannot use ninja: Windows (NYI, I'll open a new issue for this), or if ninja cannot be found on the system. Implementation Details ------------------------------ This PR makes this change in two steps. Please let me know if it would be easier to review this if I split this up into a stacked diff. Those changes are: 1) refactor _write_ninja_file to separate the policy (what compiler flags to pass) from the mechanism (how to write the ninja file and do compilation). 2) call _write_ninja_file and _run_ninja_build while building ahead-of-time cpp_extensions. These are only used to compile objects; distutils still handles the linking. Change 1: refactor _write_ninja_file to separate policy from mechanism - I split _write_ninja_file into: _write_ninja_file and _write_ninja_file_to_build_library - I renamed _build_extension_module to _run_ninja_build Change 2: Call _write_ninja_file while building ahead-of-time cpp_extensions - _write_ninja_file_and_compile_objects calls _write_ninja_file to only build object files. 
- We monkey-patch distutils.CCompiler.compile to call _write_ninja_files_and_compile_objects - distutils still handles the linking step. The linking step is not a bottleneck so it was not a concern. - This change only works on unix-based systems. Our code for windows goes down a different codepath and I did not want to mess with that. - If a system does not support ninja, we raise a warning and fall back to the original compilation path. Test Plan ------------------------------ Adhoc testing - I built torchvision using pytorch master and printed out the build commands. Next, I used this branch to build torchvision and looked at the ninja file. I compared the ninja file with the build commands and asserted that they were functionally the same. - I repeated the above for pytorch/nestedtensor. PyTorch test suite - I split `test_cpp_extensions` into `test_cpp_extensions_aot` and `test_cpp_extensions_jit`. The AOT (ahead-of-time) version tests ahead-of-time and the JIT version tests just-in-time (not to be confused with TorchScript) - `test_cpp_extensions_aot` gets run TWICE by run_test.py, once with a module that was built with ninja, and once with a module that was built without ninja. - run_test.py asserts that when we are building with use_ninja=True, ninja is actually available on the system. Test Plan: Imported from OSS Differential Revision: D19730432 Pulled By: zou3519 fbshipit-source-id: 819590d01cf65e8da5a1e8019b8b3084792fee90	2020-02-05 18:49:29 -08:00
davidriazati	2060e0a9dd	Split serialization tests to their own file (#32241 ) Summary: Stacked PRs * #32244 - Make zip serialization the default * #32241 - Split serialization tests to their own file This makes them all easier to run as a batch. This PR is just a code move / fixing up imports. There are still some serialization tests in `test_torch.py` as part of `TestDeviceType`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/32241 Pulled By: driazati Differential Revision: D19415826 fbshipit-source-id: a3f6cfe1626ff2f9b9631c409bf525bd32e4639b	2020-01-28 15:04:05 -08:00
Pritam Damania	f050b16dd9	Move pytorch distributed tests to separate folder for contbuild. (#30445 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30445 Create distributed and rpc directories under caffe/test for better management of unit tests. Differential Revision: D18702786 fbshipit-source-id: e9daeed0cfb846ef68806f6decfcb57c0e0e3606	2020-01-22 21:16:59 -08:00

48 Commits