Commit Graph

48 Commits

Author SHA1 Message Date
Rohan Varma
63e545e0fe Revert D21717199: [pytorch][PR] Updates assertEqual to require atol and rtol, removes positional atol
Test Plan: revert-hammer

Differential Revision:
D21717199

Original commit changeset: 9feb856f94ee

fbshipit-source-id: bfde9c39a5ce99f0ca6183a7dde703c65b7c8259
2020-05-26 18:23:59 -07:00
mattip
2e6ee853ab make onnx expect tests resilient to producer_version changes (#39002)
Summary:
closes gh-32561, closes gh-38545. As part of the fallout from gh-36797, this PR
- replaces the `producer_version: "1.6"` in onnx expect tests with `producer_version: "XXX"`
- adapts `testing/_internal/common_utils.py` with a regex to change the onnx producer_version so tests still pass (as sketched below)

The consistency of the torch version and the onnx `producer_version` is tested in gh-36797, so there is no reason to test it again in the expect tests.
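A minimal sketch of the kind of normalization described above; the pattern and helper name here are assumptions, not the code actually added to `common_utils.py`:

```python
import re

def normalize_producer_version(expect_text):
    # Replace whatever producer_version the exporter emitted with a fixed
    # placeholder so expect files do not churn on every torch version bump.
    return re.sub(r'producer_version: "[^"]*"', 'producer_version: "XXX"', expect_text)
```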

xref gh-38629 which documented how to run the onnx tests and at the same time refactored the Community documentation.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39002

Differential Revision: D21723062

Pulled By: ezyang

fbshipit-source-id: 1bd6a8ed37d5383e69d017226dc09c0645a69aff
2020-05-26 16:11:21 -07:00
Mike Ruberry
6ddca30b2d Updates assertEqual to require atol and rtol, removes positional atol (#38872)
Summary:
This updates assertEqual and assertEqual-like functions to require that either both or neither of atol and rtol be specified. This should improve clarity around handling precision in the test suite, and it allows us to remove the legacy positional atol argument from assertEqual. In addition, the "message" kwarg is replaced with a kwarg-only "msg" argument whose name is consistent with unittest's assertEqual argument.

In the future we could make "msg" an optional third positional argument to be more consistent with unittest's assertEqual, but requiring it be specified should be clear, and we can easily update the signature to make "msg" an optional positional argument in the future, too.
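A minimal sketch of the updated calling convention; the test and values below are illustrative only:

```python
import torch
from torch.testing._internal.common_utils import TestCase, run_tests

class TestToleranceKwargs(TestCase):
    def test_close_values(self):
        a = torch.tensor([1.0, 2.0])
        b = a + 1e-7
        # atol and rtol must now be specified together (or not at all), and the
        # failure message is the keyword-only "msg" argument.
        self.assertEqual(a, b, atol=1e-5, rtol=1.3e-6, msg="tensors diverged")

if __name__ == "__main__":
    run_tests()
```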
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38872

Differential Revision: D21717199

Pulled By: mruberry

fbshipit-source-id: 9feb856f94eee911b44f6c7140a1d07c1b026d3a
2020-05-26 08:30:23 -07:00
Mike Ruberry
9cfc10d52e Updates assertEqual to use torch.isclose-like logic (#37294)
Summary:
Edit: this has been updated to reflect the PR's current status, which has changed after review.

This PR updates the behavior of assertEqual, assertNotEqual, and assert_allclose to be consistent with each other and with torch.isclose. It corrects several additional bugs in the current implementations and adds extensive testing and comments, too.

These updates follow from changes to assertEqual like https://github.com/pytorch/pytorch/pull/34258 and https://github.com/pytorch/pytorch/pull/37069, and from our discussion of torch.isclose for complex tensors (see https://github.com/pytorch/pytorch/issues/36462), where we decided to implement a NumPy-compatible mathematical notion of "closeness" for complex tensors that is not a great fit for our testing framework.

The detailed changelist is:

- New test framework functions for comparing tensors and scalars
  - Tensors are compared using isclose; the real and imaginary parts of complex tensors are compared independently
  - Scalars are compared using the same algorithm
  - assertEqual and assert_allclose now use this common comparison function, instead of each implementing their own with divergent behavior
  - assertEqual-like debug messages are now available for all tensor and scalar comparisons, with additional context when comparing the components of sparse, quantized, and complex tensors
- Extensive testing of the comparison behavior and debug messages
- Small Updates
  - assertEqual now takes an "exact_device" argument, analogous to "exact_dtype", which should be useful in multidevice tests
  - assertEqual now takes an "equal_nan" argument for argument consistency with torch.isclose
  - assertEqual no longer takes the "allow_inf" keyword, which misleadingly only applied to scalar comparisons, was only ever set (rarely) to true, and is not supported by torch.isclose
- Bug fixes:
  - the exact_dtype attribute has been removed (no longer needed after https://github.com/pytorch/pytorch/pull/38103)
  - message arguments passed to assertEqual are now handled correctly
  - bool x other dtype comparisons are now supported
  - uint8 and int8 tensor comparisons now function properly
  - rtol for integer comparisons is now supported (default is zero)
  - rtol and atol for scalar comparisons are now supported
  - complex scalar comparisons are now supported, analogous to complex tensor comparisons
  - assertNotEqual is now equivalent to the logical negation of assertEqual
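A rough sketch of the comparison logic described in the changelist above, written as a hypothetical helper; the real test-framework implementation is considerably more involved (sparse/quantized handling, debug messages):

```python
import torch

def tensors_close(actual, expected, rtol=1.3e-6, atol=1e-5, equal_nan=True):
    # Complex tensors: compare real and imaginary parts independently,
    # rather than using a single complex "closeness" metric.
    if actual.is_complex() and expected.is_complex():
        return (tensors_close(actual.real, expected.real, rtol, atol, equal_nan)
                and tensors_close(actual.imag, expected.imag, rtol, atol, equal_nan))
    return bool(torch.isclose(actual, expected, rtol=rtol, atol=atol,
                              equal_nan=equal_nan).all())
```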
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37294

Differential Revision: D21596830

Pulled By: mruberry

fbshipit-source-id: f2576669f7113a06f82581fc71883e6b772de19b
2020-05-15 16:24:03 -07:00
David Reiss
1f87f15ba3 Remove _reset_warning_registry (#38485)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38485

Python 2 has reached end-of-life and is no longer supported by PyTorch.
This class does nothing in Python 3.

Test Plan: CI

Reviewed By: ailzhang

Differential Revision: D21575260

Pulled By: dreiss

fbshipit-source-id: 184696c9fa501e8d2517950b47cdbc90b2ae8053
2020-05-14 15:03:30 -07:00
Nikolay Korovaiko
96885f73ed make test_jit infer the profiling mode, add a job for simple executor (#38374)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/38374

Differential Revision: D21567658

Pulled By: Krovatkin

fbshipit-source-id: c0eb44cf6c842d5feebabf8c7d99c1b4aa6c4960
2020-05-13 23:55:40 -07:00
Pavel Belevich
4f08bdddfc Add skipIfNoSciPy/get_all_int_dtypes/get_all_fp_dtypes to common_utils (#38299)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/38299

Test Plan: Imported from OSS

Differential Revision: D21534876

Pulled By: pbelevich

fbshipit-source-id: 864881b3be899aea3660039128d9bc2e94edab95
2020-05-12 19:11:31 -07:00
Vitaly Fedyunin
48ad9f5a30 assertEqual now requires matching dtypes (#38103)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/38103

Test Plan: Imported from OSS

Differential Revision: D21477062

Pulled By: VitalyFedyunin

fbshipit-source-id: 9592fed336214dd97eb8e9d6b3e16f21ff6f072d
2020-05-09 14:49:01 -07:00
Vitaly Fedyunin
e3414c1ef1 AssertEqual now checks tensors dtype (#34154)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34154

Temporarily replacing all cases where `assertEqual` fails with `assertEqualIgnoreType`.

Test Plan: Imported from OSS

Differential Revision: D20251131

Pulled By: VitalyFedyunin

fbshipit-source-id: fa69c6e2b3a7963912af5b0fa42bec9eded323d3
2020-05-09 14:47:01 -07:00
Ailing Zhang
9232356e5f remove uses of type() and type_as() part 1. (#38029)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/38029

Differential Revision: D21468523

Pulled By: ailzhang

fbshipit-source-id: 14b7185d43eb03f630cfaa2d70e02d637ff8551b
2020-05-08 08:16:24 -07:00
Nikita Shulga
53aa7d8bc5 Add option to skip tests after retries (#38079)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/38079

Differential Revision: D21470238

Pulled By: malfet

fbshipit-source-id: b2e63be34090c6f61acad8b6530658a835c68870
2020-05-07 21:56:29 -07:00
Nikita Shulga
72e5b7ae5b Add option to run python unittests in parallel (#37180)
Summary:
So far the results look quite promising: test_nn runs purely sequential tests and can be accelerated 3x
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37180

Differential Revision: D21437871

Pulled By: malfet

fbshipit-source-id: 8679a8af355f839f2c9dae3bf36d2e102af05425
2020-05-06 22:14:11 -07:00
Elias Ellison
0e3a05ec00 [JIT] rename enable_profiling_mode to enable_profiling_mode_for_profiling_tests (#37825)
Summary:
The existing context manager only conditionally enabled profiling mode, which was counterintuitive. When we changed the default executor, it broke internal benchmarking as a result.
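A sketch of the conditional behavior the new name makes explicit; the executor flag and the `_jit_set_profiling_mode` plumbing below are assumptions, not the actual helper in `torch.testing._internal`:

```python
from contextlib import contextmanager
import torch

RUNNING_PROFILING_TESTS = True  # placeholder for the suite's executor setting

@contextmanager
def enable_profiling_mode_for_profiling_tests():
    # Only flip profiling on when the suite is actually exercising the
    # profiling executor; otherwise leave the current executor untouched.
    if not RUNNING_PROFILING_TESTS:
        yield
        return
    old = torch._C._jit_set_profiling_mode(True)
    try:
        yield
    finally:
        torch._C._jit_set_profiling_mode(old)
```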
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37825

Differential Revision: D21404611

Pulled By: eellison

fbshipit-source-id: 306b3c333ef4eb44ab6a6e5ab4e0682e5ce312ce
2020-05-06 11:30:02 -07:00
Nikita Shulga
2c6aed0d61 [Testing] Add --save-xml option (#37840)
Summary:
Passing the `--save-xml` option to the common test runner has the same effect as setting the `IN_CIRCLECI` environment variable, but also allows specifying the folder in which to save results
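A rough sketch of how such an option could be wired up; apart from `--save-xml` itself, the names and the default folder are assumptions:

```python
import argparse

parser = argparse.ArgumentParser(add_help=False)
parser.add_argument("--save-xml", nargs="?", const="test-reports", default=None,
                    help="save test results as XML into the given folder")
args, remaining = parser.parse_known_args()

if args.save_xml:
    import xmlrunner  # provided by the unittest-xml-reporting package
    test_runner = xmlrunner.XMLTestRunner(output=args.save_xml)
```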
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37840

Differential Revision: D21410250

Pulled By: malfet

fbshipit-source-id: ae5855fafdc8c66b550d42b683d547c88b4e55d9
2020-05-05 14:57:50 -07:00
Nikolay Korovaiko
edc5ef1afb run the simple executor for jit tests by default, add profiling jobs … (#37017)
Summary:
…for fusion tests

fix flake8 warnings

fix ci failures

fix test_determination.py
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37017

Differential Revision: D21238446

Pulled By: Krovatkin

fbshipit-source-id: 393e6135883dc5ac57bdff580de96c66829d454c
2020-04-28 19:16:52 -07:00
Nikita Shulga
ea741f829e Add --repeat option to python unit-test (#37281)
Summary:
This runs the same test suite (or an individual test) multiple times, which is useful for detecting flaky tests.

Example usage: `python test_autograd.py TestAutograd.test_profiler -v --repeat=100`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37281

Differential Revision: D21244442

Pulled By: malfet

fbshipit-source-id: 3ecafec7ae87bc1e418aa28151bbc472ef37a713
2020-04-25 13:56:58 -07:00
Brian Vaughan
a50a1fb4c3 Enforce kw-only args now that py2 is unsupported (#37069)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/37069

Test Plan: Imported from OSS

Differential Revision: D21204729

Pulled By: nairbv

fbshipit-source-id: 8e93decae59e753706fa288bcdc3bf6278b8eeb5
2020-04-24 07:08:24 -07:00
David Reiss
e75fb4356b Remove (most) Python 2 support from Python code (#35615)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35615

Python 2 has reached end-of-life and is no longer supported by PyTorch.
Now we can clean up a lot of cruft that we put in place to support it.
These changes were all done manually, and I skipped anything that seemed
like it would take more than a few seconds, so I think it makes sense to
review it manually as well (though using side-by-side view and ignoring
whitespace change might be helpful).

Test Plan: CI

Differential Revision: D20842886

Pulled By: dreiss

fbshipit-source-id: 8cad4e87c45895e7ce3938a88e61157a79504aed
2020-04-22 09:23:14 -07:00
Nikita Shulga
3b832ee2bf Use Python3 super() throughout torch.testing. (#37024)
Summary:
Hat tip to ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37024

Differential Revision: D21173244

Pulled By: malfet

fbshipit-source-id: 7079703e28777d873f69bf9fd4dcbad8d53a2682
2020-04-22 09:00:28 -07:00
Brian Vaughan
54ed6fd3ee Use both absolute and relative tolerance in testing (#34258)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34258

This PR allows both atol and rtol to be specified, uses defaults based on the prior analysis (spreadsheet attached to https://github.com/pytorch/pytorch/pull/32538), but retains the absolute tolerance behavior in cases where precision was previously specified explicitly.
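The combined check follows the usual isclose-style criterion; a minimal sketch (the default values below are illustrative, not the exact defaults chosen from the spreadsheet):

```python
import torch

def is_close_enough(actual, expected, atol=1e-5, rtol=1.3e-6):
    # A value passes if it is within atol of the expected value or within
    # rtol of its magnitude; the two tolerances combine additively.
    return bool((torch.abs(actual - expected)
                 <= atol + rtol * torch.abs(expected)).all())
```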

Test Plan: Imported from OSS

Differential Revision: D21110255

Pulled By: nairbv

fbshipit-source-id: 57b3a004c7d5ac1be80ee765f03668b1b13f4a7e
2020-04-19 06:16:49 -07:00
Elias Ellison
54a575c9bd [JIT] fix torch.tensor jit dtype (#36587)
Summary:
Previously we were always creating a double tensor from `torch.tensor(1.)`, whereas Python eager mode uses the current default dtype. Fix for https://github.com/pytorch/pytorch/issues/36369
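An illustrative check of the behavior being fixed (not taken from the PR's tests):

```python
import torch

torch.set_default_dtype(torch.float32)
assert torch.tensor(1.).dtype == torch.float32  # eager follows the default dtype

@torch.jit.script
def make_scalar():
    return torch.tensor(1.)

# After the fix, the scripted version should agree with eager instead of
# unconditionally producing a float64 (double) tensor.
assert make_scalar().dtype == torch.get_default_dtype()
```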
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36587

Differential Revision: D21043617

Pulled By: eellison

fbshipit-source-id: 38da303594f52e06941d86b6e57c4a06e7d36938
2020-04-16 10:55:49 -07:00
Mike Ruberry
d0c925f1c7 Returns float tensors for complex inputs to abs (#35871)
Summary:
Per title. A test is added to test_type_promotion for the behavior. This behavior is consistent with NumPy's.

For complex inputs to `abs` the result is cast to float after the computation since the computation of abs must be performed on the original complex tensor. While `std::abs` returns a float value when called on complex inputs, returning a FloatTensor directly would require additional loop instantiations in TensorIterator. This may be worthwhile to pursue in the future.
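An illustrative example of the resulting behavior (values chosen arbitrarily):

```python
import torch

z = torch.tensor([3 + 4j, 1 - 1j], dtype=torch.complex64)
r = torch.abs(z)
assert r.dtype == torch.float32  # magnitude of a complex64 tensor comes back as float32
print(r)                         # tensor([5.0000, 1.4142])
```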
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35871

Differential Revision: D20984456

Pulled By: mruberry

fbshipit-source-id: 226445178f92f2b0292e92578656d98674a6aa20
2020-04-16 09:03:17 -07:00
Natalia Gimelshein
f3f640d479 move test_abs to device-generic tests (#36465)
Summary:
Per title. test_abs used to be marked as slow_test and run on cpu only. Conceptually similar tests are done in TestTorchMathOps, so it's a matter of adding `abs` test there. 2 remaining checks (correct abs for large-valued long tensors, and correct abs for signed zeros) are factored into separate tests.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36465

Differential Revision: D21000248

Pulled By: ngimel

fbshipit-source-id: 8bc8b0da936b1c10fe016ff2f0dbb5ea428e7e61
2020-04-14 09:48:08 -07:00
Wanchao Liang
3526627f46 Use unittest assertWarns instead (#36411)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36411

This PR removes the PyTorch-specific assertWarns and uses the unittest one; it also formats some tests
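A minimal example of the standard-library replacement (illustrative test, not from the PR):

```python
import unittest
import warnings

class TestWarningHelpers(unittest.TestCase):
    def test_warns(self):
        # unittest's built-in assertWarns replaces the PyTorch-specific helper.
        with self.assertWarns(UserWarning):
            warnings.warn("deprecated behavior", UserWarning)

if __name__ == "__main__":
    unittest.main()
```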

Test Plan: Imported from OSS

Differential Revision: D20998159

Pulled By: wanchaol

fbshipit-source-id: 1280ecff2dd293b95a639d13cc7417fc819c2201
2020-04-13 15:56:42 -07:00
Mike Ruberry
254be6a201 Adds NumPy array x Torch tensor binary ufunc interaction test (#35945)
Summary:
Adds test for behavior reported in https://github.com/pytorch/pytorch/issues/35257 to ensure it doesn't regress. The test was extended to reveal three additional issues:

- https://github.com/pytorch/pytorch/issues/36363
- https://github.com/pytorch/pytorch/issues/36058
- https://github.com/pytorch/pytorch/issues/36057
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35945

Differential Revision: D20984429

Pulled By: mruberry

fbshipit-source-id: a15be9455afba9c77e40c337a860f9be348bf8d5
2020-04-11 21:56:38 -07:00
Lu Fang
742c77971a Revert D20961711: [pytorch][PR] Returns float tensors for complex inputs to abs
Test Plan: revert-hammer

Differential Revision:
D20961711

Original commit changeset: 232f62cf64ca

fbshipit-source-id: 7b2a537d2effe6b2449f192dc42e375062058995
2020-04-11 02:55:41 -07:00
Mike Ruberry
3aeb2b1562 Returns float tensors for complex inputs to abs (#35871)
Summary:
Per title. A test is added to test_type_promotion for the behavior. This behavior is consistent with NumPy's.

For complex inputs to `abs` the result is cast to float after the computation since the computation of abs must be performed on the original complex tensor. While `std::abs` returns a float value when called on complex inputs, returning a FloatTensor directly would require additional loop instantiations in TensorIterator. This may be worthwhile to pursue in the future.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35871

Differential Revision: D20961711

Pulled By: mruberry

fbshipit-source-id: 232f62cf64caa4154eb2194969efa51d2082d842
2020-04-10 09:08:45 -07:00
Nikita Shulga
bb32e123e6 Report results of python unit tests during Windows test runs (#35687)
Summary:
Define the `store_test_results` attribute in the CircleCI YAMLs.
Install `unittest-xml-reporting` and define the `IN_CIRCLECI` environment variable to trigger test runners to save results as XML.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35687

Differential Revision: D20739831

Pulled By: malfet

fbshipit-source-id: 6a7bbf19f93c32766963f5edad191ad8ca316ff8
2020-03-30 12:33:03 -07:00
Mike Ruberry
683246e5ea Improves precision of linspace, logspace (#35461)
Summary:
The Torch algorithms for linspace and logspace conceptually compute each of their values using:

`start_value + step_value * idx`

[And NumPy does the same,](cef4dc9d91/numpy/core/function_base.py (L24)) except NumPy then [sets the last value in its array directly.](cef4dc9d91/numpy/core/function_base.py (L162)) This is because the above computation is unstable when using floats, and NumPy's contract, like PyTorch's, is that the last element in the array is the stop value.

In PyTorch there can be a divergence between the computed last value and the actual value. One user reported case was:

`torch.linspace(-0.031608279794, 0.031531572342, 257, dtype=torch.float32)`

Which causes a difference of 3.7253e-09 between the last value as set by NumPy and computed by PyTorch. After this PR the difference is zero.

Instead of simply setting the last element of the tensor, this PR updates the kernels with a "symmetric" algorithm that sets the first and last array elements without requiring an additional kernel launch on CUDA. The performance impact of this change seems small. I tested with step counts of 2^8 and 2^22, and all timing differences were imperceptible except for 2^22 on CPU, which appears to have suffered a ~5% slowdown. I think that's an acceptable performance hit for the improved precision when we consider the context of linspace.
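A rough Python sketch of the symmetric indexing idea; the actual change lives in the C++/CUDA kernels, so this is only illustrative:

```python
import torch

def symmetric_linspace(start, end, steps, dtype=torch.float64):
    idx = torch.arange(steps, dtype=dtype)
    step = (end - start) / (steps - 1)
    # Compute the first half forward from start and the second half backward
    # from end, so both endpoints are exact without a separate fix-up write.
    forward = start + idx * step
    backward = end - (steps - 1 - idx) * step
    return torch.where(idx < steps / 2, forward, backward)
```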

An alternative would be to simply set the last element, as NumPy does, on CPU. But I think it's preferable to keep the CPU and CUDA algorithms aligned and keep the algorithm symmetric. In current PyTorch, for example, torch.linspace starts generating values very similar to NumPy, but as the index increases so do the errors, giving our current implementation a "left bias."

Two tests are added to test_torch.py for this behavior. The linspace test will fail on current PyTorch, but the logspace test will succeed since its more complex computation needs wider error bars.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35461

Differential Revision: D20712539

Pulled By: mruberry

fbshipit-source-id: 2c1257c8706f4cdf080ff0331bbf2f7041ab9adf
2020-03-27 23:50:39 -07:00
Alban Desmaison
181da12126 Revert D20687652: [pytorch][PR] Report results from cpp unittests on Windows and Linux
Test Plan: revert-hammer

Differential Revision:
D20687652

Original commit changeset: fc370b7e2614

fbshipit-source-id: 8153815c8ed8f3d4f472caa95eda76180b038a42
2020-03-27 06:56:53 -07:00
Nikita Shulga
d2d40c45b6 Report results from cpp unittests on Windows and Linux (#35500)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35500

Test Plan:
Test in production :)
Results should eventually be published to: https://circleci.com/build-insights/gh/pytorch/pytorch/master

Differential Revision: D20687652

Pulled By: malfet

fbshipit-source-id: fc370b7e261402e14b427f42038ecb2d95bad059
2020-03-26 23:00:33 -07:00
Nikita Shulga
6fa0b3df2e [testing] Pass verbosity settings to XMLTestRunner (#35224)
Summary:
When `unittest.main()` is invoked with a custom testRunner, verbosity settings for the runner must be set manually
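A minimal sketch of the fix described above; the argument handling is simplified, and the real runner wiring lives in `common_utils.py`:

```python
import sys
import unittest
import xmlrunner  # from the unittest-xml-reporting package

# unittest.main() does not forward -v/--verbose to a custom testRunner,
# so the verbosity has to be passed to the runner explicitly.
verbosity = 2 if ("-v" in sys.argv or "--verbose" in sys.argv) else 1
unittest.main(testRunner=xmlrunner.XMLTestRunner(output="test-reports",
                                                 verbosity=verbosity))
```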
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35224

Test Plan: CI

Differential Revision: D20605896

Pulled By: malfet

fbshipit-source-id: 79fc6f55911189b6d8a4bc83bd2390c94bd69e5e
2020-03-23 16:37:52 -07:00
Ailing Zhang
471ddacd8b Add retry decorator and use it for Hub tests. (#34829)
Summary:
fix https://github.com/pytorch/pytorch/issues/34751
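A generic sketch of such a retry decorator; the helper actually added lives in `torch.testing._internal.common_utils`, and the name, signature, and backoff here are assumptions:

```python
import functools
import time

def retry(exc_type, tries=3, delay=1):
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(tries):
                try:
                    return fn(*args, **kwargs)
                except exc_type:
                    if attempt == tries - 1:
                        raise  # out of retries, surface the flaky failure
                    time.sleep(delay)
        return wrapper
    return decorator
```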
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34829

Differential Revision: D20476231

Pulled By: ailzhang

fbshipit-source-id: eb38ee655e28250352b15e8e37b3b39310a7c378
2020-03-16 20:19:45 -07:00
Pearu Peterson
8bae1ed144 PCA and SVD for low-rank matrices, LOBPCG for positive-defined generalized eigenvalue problem - copy (#34721)
Summary:
This is a copy of PR https://github.com/pytorch/pytorch/issues/29488 to help the merging process.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34721

Differential Revision: D20444270

Pulled By: vincentqb

fbshipit-source-id: 042c56c8c0dae37834f52b4aee2deae7dd6fa659
2020-03-16 14:13:30 -07:00
Edward Yang
4b929e5466 Revert D20193196: [pytorch][PR] PCA and SVD for low-rank matrices, LOBPCG for positive-defined generalized eigenvalue problem
Test Plan: revert-hammer

Differential Revision:
D20193196

Original commit changeset: 78a487991242

fbshipit-source-id: 8da4f8cb17c45af41e8c0ce80bc72581eb10dbb8
2020-03-11 09:24:34 -07:00
Pearu Peterson
2ec779d46c PCA and SVD for low-rank matrices, LOBPCG for positive-defined generalized eigenvalue problem (#29488)
Summary:
This PR implements the following linear algebra algorithms for low-rank matrices:
- [x] Approximate `A` as `Q Q^H A` - using Algorithm 4.4 from [Halko et al, 2009](http://arxiv.org/abs/0909.4061).
  + exposed as `torch.lowrank.get_approximate_basis(A, q, niter=2, M=None) -> Q`
  + [x] dense matrices
  + [x] batches of dense matrices
  + [x] sparse matrices
  + [x] documentation
- [x] SVD - using Algorithm 5.1 from [Halko et al, 2009](http://arxiv.org/abs/0909.4061).
  + uses `torch.lowrank.get_approximate_basis`
  + exposed as `torch.svd_lowrank(A, q=6, niter=2, M=None) -> (U, S, V)`
  + [x] dense matrices
  + [x] batches of dense matrices
  + [x] sparse matrices
  + [x] documentation
- [x] PCA - using `torch.svd_lowrank`
  + uses `torch.svd_lowrank`
  + exposed as `torch.pca_lowrank(A, center=True, q=None, niter=2) -> (U, S, V)`
  + [x] dense matrices
  + [x] batches of dense matrices
  + [x] sparse matrices, uses non-centered sparse matrix algorithm
  + [x] documentation
- [x] generalized eigenvalue solver using the original LOBPCG algorithm [Knyazev, 2001](https://epubs.siam.org/doi/abs/10.1137/S1064827500366124)
  + exposed as `torch.lobpcg(A, B=None, k=1, method="basic", ...)`
  + [x] dense matrices
  + [x] batches of dense matrices
  + [x] sparse matrices
  + [x] documentation
- [x] generalized eigenvalue solver using robust LOBPCG with orthogonal basis selection [Stathopoulos, 2002](https://epubs.siam.org/doi/10.1137/S1064827500370883)
  + exposed as `torch.lobpcg(A, B=None, k=1, method="ortho", ...)`
  + [x] dense matrices
  + [x] batches of dense matrices
  + [x] sparse matrices
  + [x] documentation
- [x] generalized eigenvalue solver using the robust and efficient LOBPCG Algorithm 8 from [Duersch et al, 2018](https://epubs.siam.org/doi/abs/10.1137/17M1129830) that switches to orthogonal basis selection automatically
  + the "ortho" method improves iterations so rapidly that in the current test cases it does not make sense to use the basic iterations at all. If users will have matrices for which basic iterations could improve convergence then the `tracker` argument allows breaking the iteration process at user choice so that the user can switch to the orthogonal basis selection if needed. In conclusion, there is no need to implement Algorithm 8 at this point.
- [x] benchmarks
  + [x] `torch.svd` vs `torch.svd_lowrank`, see notebook [Low-rank SVD](https://github.com/Quansight/pearu-sandbox/blob/master/pytorch/Low-rank%20SVD.ipynb). In conclusion, the low-rank SVD is going to be useful only for large sparse matrices where the full-rank SVD will fail due to memory limitations.
  + [x] `torch.lobpcg` vs `scipy.sparse.linalg.lobpcg`, see notebook [LOBPCG - pytorch vs scipy](https://github.com/Quansight/pearu-sandbox/blob/master/pytorch/LOBPCG%20-%20pytorch%20vs%20scipy.ipynb). In conclusion, both implementations give the same results (up to numerical errors from different methods); the scipy lobpcg implementation is generally faster.
  + [x] On very small tolerance cases, `torch.lobpcg` is more robust than `scipy.sparse.linalg.lobpcg` (see `test_lobpcg_scipy` results)

Resolves https://github.com/pytorch/pytorch/issues/8049.
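A quick usage sketch of the new entry points listed above; shapes and arguments are illustrative:

```python
import torch

A = torch.randn(100, 20)
U, S, V = torch.svd_lowrank(A, q=6, niter=2)         # approximate rank-6 SVD
U2, S2, V2 = torch.pca_lowrank(A, q=6, center=True)  # PCA built on the same machinery

# LOBPCG on a symmetric positive-definite matrix: a few extreme eigenpairs.
M = A.t() @ A + 20 * torch.eye(20)
eigenvalues, eigenvectors = torch.lobpcg(M, k=3, largest=True)
```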
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29488

Differential Revision: D20193196

Pulled By: vincentqb

fbshipit-source-id: 78a4879912424595e6ea95a95e483a37487a907e
2020-03-11 07:33:49 -07:00
Edward Yang
ba1bd41767 Turn on strict dtype checking for test_torch.py (#33825)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33825

Partially addresses #20376

I do this by overriding assertEqual in classes that opt into
this.  This means I have to fix #33821.  The fix is a little
unsatisfactory as idiomatic Python 2 super() calls don't work
(since the class is no longer in scope); hopefully this will just
work when we go to Python 3.

General approach taken:
- A lot of dtype mismatches are because we specified tensor constants
  that infer to some dtype, but the actual dtype needed is something else.
  Those are easy, just annotate the tensor() constructor (often a legacy
  Tensor/FloatTensor call) with dtype
- There are a few cases where the promotion rules are nontrivial.  Some of them
  I just typed out the expected promotion rules manually (based on trial
  and error)
- There are some more complex cases; if it gets too hairy I just
  set exact_dtype=False and nope the fuck out

I don't have time to do it for all the other classes.  But the setup
should work if people just incrementally add the overrides to classes,
and then eventually flip the default.
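An illustrative example of the most common fix described above: a legacy constructor whose inferred dtype no longer matches what it is compared against.

```python
import torch

expected = torch.arange(3)                      # int64
# Before: torch.Tensor([0, 1, 2]) infers float32 and now trips the strict check.
actual = torch.tensor([0, 1, 2], dtype=torch.int64)
assert actual.dtype == expected.dtype
```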

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Differential Revision: D20125791

Pulled By: ezyang

fbshipit-source-id: 389c2d1efbd93172af02f13e38ac5e92fe730c57
2020-03-03 14:45:53 -08:00
anjali411
dece155335 Modified assertEqual to handle complex tensors (#33773)
Summary:
- Modified assertEqual to handle complex tensors
- added a test in test_torch.py to test torch.zeros
- added dispatch for complex for index_kernel, index_put_kernel
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33773

Differential Revision: D20135553

Pulled By: anjali411

fbshipit-source-id: f716604535c0447ecffa335b0fc843431397c988
2020-02-28 08:43:28 -08:00
Nikolay Korovaiko
a7e22b4c6a add bailout checks to checkScript (#32802)
Summary:
This adds enough infrastructure to run bailout checks in `checkScript`. I'll need to figure out the best way to enable it for nightly builds now.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32802

Differential Revision: D19974718

Pulled By: Krovatkin

fbshipit-source-id: 40485503f6d3ae14edcce98e1eec1f0559f3ad08
2020-02-21 21:18:54 -08:00
Rohan Varma
6cb9e6b015 Back out "Revert D19871946: [distributed] pass in timeout to TCP store when initializing" (#33434)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33434

Reland of https://github.com/pytorch/pytorch/pull/33325, since the
unit test was flaky and failed on land.
To ensure that the test is not flaky, I bumped the timeout so the rendezvous
does not timeout (timing out the rendezvous in 1s led to the flakiness). I also
generalized our mechanism for retrying on errors to include retrying on errors
due to timeout in rendezvous.
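An illustrative single-process call showing where the timeout enters; addresses and values are placeholders, not taken from the test:

```python
from datetime import timedelta
import torch.distributed as dist

# The timeout passed here should now also bound the TCP-store rendezvous
# rather than only the collective operations.
dist.init_process_group(
    backend="gloo",
    init_method="tcp://127.0.0.1:29500",
    rank=0,
    world_size=1,
    timeout=timedelta(seconds=30),
)
dist.destroy_process_group()
```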
ghstack-source-id: 98558377

Test Plan: Added UT test_tcp_store_timeout_set

Differential Revision: D19935390

fbshipit-source-id: 56ccf8c333dd2f954a33614d35cd1642d4e9473a
2020-02-19 17:17:17 -08:00
ptrblck
1e3664b6ef Remove c/pdist tests from _internal/common_utils.py (#33409)
Summary:
* remove brute_test from `torch/testing/_internal/common_utils.py`
* add these tests as internal tests to `test_torch.py`

CC ailzhang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33409

Differential Revision: D19951729

Pulled By: ailzhang

fbshipit-source-id: b1126aaf26fa64a0f17cbb582dc8038b79cfe3eb
2020-02-19 10:27:30 -08:00
Pritam Damania
fd684cc312 Use torch.set_default_dtype in test_data_parallel and rename dtype2prec (#32962)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32962

As per gchanan's comments on
https://github.com/pytorch/pytorch/pull/30445, I've used
`torch.set_default_dtype` in test_data_parallel instead of specifying
dtype=torch.double everywhere. Also, renamed dtype2prec to dtype2prec_DONTUSE
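An illustrative version of the pattern; the exact wrapping used in the tests (setUp/tearDown or a decorator) may differ:

```python
import torch

old_dtype = torch.get_default_dtype()
torch.set_default_dtype(torch.double)
try:
    x = torch.randn(4, 4)              # defaults to float64 now
    assert x.dtype == torch.float64    # no dtype=torch.double needed at each call site
finally:
    torch.set_default_dtype(old_dtype)
```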
ghstack-source-id: 98388429

Test Plan: waitforbuildbot

Differential Revision: D19714374

fbshipit-source-id: eb55bbca33881625636ba9ea6dd4cb692f25668e
2020-02-15 14:07:54 -08:00
ptrblck
a64d0ffe81 Use int64 in pdist kernel to handle batches >= 46342 #30583 (#31593)
Summary:
Currently `torch.pdist` yields an illegal CUDA memory access for batch sizes >= 46342 as reported by SsnL in https://github.com/pytorch/pytorch/issues/30583.
Thanks for the minimal code reproduction, btw! ;)

Reason for this bug:
The calculation of `i` in the [`pdist_kernel_cuda_impl`](46ad80c839/aten/src/ATen/native/cuda/DistanceKernel.cu (L112)) might overflow if a tensor with a batch size >= 46342 is passed to `torch.pdist`.

Detailed description:
* `result` is resized to `n * (n - 1) / 2 = 1073767311` elements ([line of code](46ad80c839/aten/src/ATen/native/Distance.cpp (L140)))
* `grid` is initialized as `result.numel()` ([line of code](46ad80c839/aten/src/ATen/native/cuda/DistanceKernel.cu (L246)))
* `k` is assigned to the `blockIdx.x` as an `int32` ([line of code](46ad80c839/aten/src/ATen/native/cuda/DistanceKernel.cu (L108)))
* `i` is calculated using `2 * k >= 2147534622` ([line of code](46ad80c839/aten/src/ATen/native/cuda/DistanceKernel.cu (L112))), which overflows, since `2147534622 > 2147483647 (int32_max)`.

Using `const int64_t k = blockIdx.x;` would solve the illegal memory access. This seems also be done for [`cdist_kernel_cuda_impl`](46ad80c839/aten/src/ATen/native/cuda/DistanceKernel.cu (L198-L201)).
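The overflow can be reproduced with plain Python arithmetic (index names follow the kernel description above):

```python
INT32_MAX = 2**31 - 1            # 2147483647
n = 46342                        # smallest failing batch size
numel = n * (n - 1) // 2         # 1073767311 result elements, one CUDA block each
k = numel - 1                    # largest block index (blockIdx.x)
print(2 * k)                     # 2147534620, already past INT32_MAX
assert 2 * k > INT32_MAX         # so computing `i` from a 32-bit `k` overflows
```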

However, we might expect a slowdown, so I've timed the current PyTorch master vs. this PR:
(tested with `x = torch.randn(x.size(0), 128)` on a V100)

| x.size(0) | int32 idx | int64 idx | slowdown |
|-----------|-----------|-----------|----------|
| 50000     | -         | 4.4460    | -        |
| 25000     | 1.02522   | 1.10869   | 7.53%    |
| 12500     | 0.25182   | 0.27277   | 7.68%    |
| 6250      | 0.06291   | 0.06817   | 7.72%    |
| 3125      | 0.01573   | 0.01704   | 7.69%    |
| 1562      | 0.00393   | 0.00426   | 7.75%    |

While checking the backward kernel, it seems I'm triggering another error with a size limit of
```python
x = torch.randn(1449, 1, device='cuda', requires_grad=True)
out = torch.pdist(x)
out.mean().backward()
> RuntimeError: CUDA error: invalid configuration argument
```
, while `[<=1448, 1]` works.

I'll take another look at this issue. Let me know, if the potential fix should go into this PR or if I should open a new issue.

CC ngimel, csarofeen
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31593

Differential Revision: D19825571

Pulled By: ngimel

fbshipit-source-id: ace9ccab49f3cf0ce894cdb6daef0795e2e8ec03
2020-02-11 12:00:39 -08:00
George Guanheng Zhang
f4fbe9549d Revert D19800021: [pytorch][PR] Improve error message for assertWarnsRegex
Test Plan: revert-hammer

Differential Revision:
D19800021

Original commit changeset: 1c31ae785c8f

fbshipit-source-id: d7b340d678562c25a84d48be66c576075000b50d
2020-02-10 12:17:52 -08:00
Peter Bell
c917a247a8 Improve error message for assertWarnsRegex (#33099)
Summary:
`assertWarnsRegex` now prints out any warnings that it caught while failing to find a matching warning. This makes it easier to debug tests by just looking at the CI logs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33099

Differential Revision: D19800021

Pulled By: ezyang

fbshipit-source-id: 1c31ae785c8ffc5d47619aff6597e479263be2de
2020-02-10 07:27:59 -08:00
Richard Zou
6209412647 Add option to use ninja to compile ahead-of-time cpp_extensions (#32495)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32495

Background
------------------------------
Previously, ninja was used to compile+link inline cpp_extensions and
ahead-of-time cpp_extensions were compiled with distutils. This PR adds
the ability to compile (but not link) ahead-of-time cpp_extensions with ninja.

The main motivation for this is to speed up cpp_extension builds: distutils
does not make use of parallelism. With this PR, using the new option, on my machine,
- torchvision compilation goes from 3m43s to 49s
- nestedtensor compilation goes from 2m0s to 28s.

User-facing changes
------------------------------

I added a `use_ninja` flag to BuildExtension. This defaults to
`True`. When `use_ninja` is True:
- it will attempt to use ninja.
- If we cannot use ninja, then this throws a warning and falls back to
distutils.
- Situations we cannot use ninja: Windows (NYI, I'll open a new issue
for this), if ninja cannot be found on the system.
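A setup.py sketch for an ahead-of-time extension using the flag; the module and source names are placeholders, and threading the flag through `BuildExtension.with_options` is an assumption about the exact API:

```python
from setuptools import setup
from torch.utils.cpp_extension import BuildExtension, CppExtension

setup(
    name="my_extension",
    ext_modules=[CppExtension("my_extension", ["my_extension.cpp"])],
    # use_ninja defaults to True; set it to False to force the distutils path.
    cmdclass={"build_ext": BuildExtension.with_options(use_ninja=True)},
)
```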

Implementation Details
------------------------------

This PR makes this change in two steps. Please let me know if it would be
easier to review this if I split it up into a stacked diff.
Those changes are:
1) refactor _write_ninja_file to separate the policy (what compiler flags
to pass) from the mechanism (how to write the ninja file and do compilation).
2) call _write_ninja_file and _run_ninja_build while building
ahead-of-time cpp_extensions. These are only used to compile objects;
distutils still handles the linking.

Change 1: refactor _write_ninja_file to separate policy from mechanism
- I split _write_ninja_file into: _write_ninja_file and
_write_ninja_file_to_build_library
- I renamed _build_extension_module to _run_ninja_build

Change 2: Call _write_ninja_file while building ahead-of-time
cpp_extensions
- _write_ninja_file_and_compile_objects calls _write_ninja_file to only
build object files.
- We monkey-patch distutils.CCompiler.compile to call
_write_ninja_files_and_compile_objects
- distutils still handles the linking step. The linking step is not a
bottleneck so it was not a concern.
- This change only works on unix-based systems. Our code for windows
goes down a different codepath and I did not want to mess with that.
- If a system does not support ninja, we raise a warning and fall back
to the original compilation path.

Test Plan
------------------------------

Adhoc testing
- I built torchvision using pytorch master and printed out the build
commands. Next, I used this branch to build torchvision and looked at
the ninja file. I compared the ninja file with the build commands and
asserted that they were functionally the same.
- I repeated the above for pytorch/nestedtensor.

PyTorch test suite
- I split `test_cpp_extensions` into `test_cpp_extensions_aot` and
`test_cpp_extensions_jit`. The AOT (ahead-of-time) version tests
ahead-of-time and the JIT version tests just-in-time (not to be confused
with TorchScript)
- `test_cpp_extensions_aot` gets run TWICE by run_test.py, once with
a module that was built with ninja, and once with a module that was
built without ninja.
- run_test.py asserts that when we are building with use_ninja=True,
ninja is actually available on the system.

Test Plan: Imported from OSS

Differential Revision: D19730432

Pulled By: zou3519

fbshipit-source-id: 819590d01cf65e8da5a1e8019b8b3084792fee90
2020-02-05 18:49:29 -08:00
davidriazati
2060e0a9dd Split serialization tests to their own file (#32241)
Summary:
Stacked PRs
 * #32244 - Make zip serialization the default
 * **#32241 - Split serialization tests to their own file**

This makes them all easier to run as a batch. This PR is just a code move / fixing up imports. There are still some serialization tests in `test_torch.py` as part of `TestDeviceType`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32241

Pulled By: driazati

Differential Revision: D19415826

fbshipit-source-id: a3f6cfe1626ff2f9b9631c409bf525bd32e4639b
2020-01-28 15:04:05 -08:00
Pritam Damania
f050b16dd9 Move pytorch distributed tests to separate folder for contbuild. (#30445)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30445

Create distributed and rpc directories under caffe/test for better management
of unit tests.

Differential Revision: D18702786

fbshipit-source-id: e9daeed0cfb846ef68806f6decfcb57c0e0e3606
2020-01-22 21:16:59 -08:00