Commit Graph

55 Commits

Author SHA1 Message Date
Mike Ruberry
9ed5efda47 Adds TestCase.compare_with_numpy (#39179)
Summary:
Cut from https://github.com/pytorch/pytorch/pull/38994.

This is a helper function for comparing torch and NumPy behavior. It updates the existing and increasingly popular _np_compare function and moves it to be a method on TestCase.
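As a rough illustration, a helper of this kind might look like the following (a minimal sketch; the actual signature and options added by the PR may differ):

```python
# Hypothetical sketch of a compare_with_numpy-style TestCase method.
import numpy as np
import torch

def compare_with_numpy(self, torch_fn, np_fn, tensor):
    expected = np_fn(tensor.cpu().numpy())    # NumPy reference result
    actual = torch_fn(tensor).cpu()           # result under test
    self.assertEqual(actual, torch.from_numpy(np.asarray(expected)))
```

A test would then call something like `self.compare_with_numpy(torch.exp, np.exp, t)`.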
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39179

Differential Revision: D21855082

Pulled By: mruberry

fbshipit-source-id: edca3b78ae392d32243b02bf61960898b6ba590f
2020-06-03 15:27:32 -07:00
Nikita Shulga
86f46ac9ca Fix assertNotEqual error reporting (#39217)
Summary:
The `msg` argument must be passed to `assertRaises`, because its exception is propagated upstream (with the custom error message) if `assertEqual` succeeds.
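For context, the pattern in question looks roughly like this (an assumed sketch, not the actual implementation):

```python
# Sketch: msg must be forwarded to assertRaises, because assertRaises is what
# raises (with the custom message) when assertEqual unexpectedly succeeds.
def assertNotEqual(self, x, y, msg=None, **kwargs):
    with self.assertRaises(AssertionError, msg=msg):
        self.assertEqual(x, y, **kwargs)
```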
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39217

Differential Revision: D21786141

Pulled By: malfet

fbshipit-source-id: f8c3d4f30f474fe269e50252a06eade76d575a68
2020-05-29 10:35:56 -07:00
Jeff Daily
7e16dd299a [ROCm] enable mem leak check for rocm (#35953)
Summary:
CC iotamudelta
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35953

Differential Revision: D21742926

Pulled By: zou3519

fbshipit-source-id: f18534dbb88a84fe98b8d85ce8fde652916a72d5
2020-05-28 07:05:47 -07:00
Natalia Gimelshein
d92ef9268d Revert D21728402: Simplify precision-specification in tests.
Test Plan: revert-hammer

Differential Revision:
D21728402

Original commit changeset: 85f3daf63f1b

fbshipit-source-id: 4e2a36aca15cd8d842985173395b4e1cac7135d8
2020-05-27 17:34:28 -07:00
Brian
df4066bbb6 Simplify precision-specification in tests. (#37181)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37181

Now that assertEqual considers dtypes when determining tolerance, most
tests don't need an explicitly set precision.

Those that do are a few half precision tests on cuda. In this PR, those
are broken out to be handled explicitly, though we may also want to
consider further loosening the tolerance on half-precision.

Test Plan: Imported from OSS

Differential Revision: D21728402

Pulled By: nairbv

fbshipit-source-id: 85f3daf63f1bdbb5101e8dea8c125f13448ca228
2020-05-27 12:05:33 -07:00
Mike Ruberry
13120bf677 Updates assertEqual to require atol and rtol, removes positional atol (#38872)
Summary:
This updates assertEqual and assertEqual-like functions to either require both or neither of atol and rtol be specified. This should improve clarity around handling precision in the test suite, and it allows us to remove the legacy positional atol argument from assertEqual. In addition, the "message" kwarg is replaced with a kwarg-only "msg" argument whose name is consistent with unittest's assertEqual argument.

In the future we could make "msg" an optional third positional argument to be more consistent with unittest's assertEqual, but requiring it be specified should be clear, and we can easily update the signature to make "msg" an optional positional argument in the future, too.
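Under the new contract, call sites look roughly like this (illustrative only):

```python
# Specify both tolerances or neither; a lone atol is now rejected.
self.assertEqual(actual, expected)                        # dtype-based defaults
self.assertEqual(actual, expected, atol=1e-5, rtol=1e-3)  # explicit pair
self.assertEqual(actual, expected, msg="values diverged") # kwarg-only msg
# self.assertEqual(actual, expected, atol=1e-5)           # error: rtol missing
```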
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38872

Differential Revision: D21740237

Pulled By: mruberry

fbshipit-source-id: acbc027aa1d7877a49664d94db9a5fff91a07042
2020-05-27 06:31:07 -07:00
Nikolay Korovaiko
9b95f757af move num_profiled_runs to common_utils (#38687)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/38687

Differential Revision: D21634080

Pulled By: Krovatkin

fbshipit-source-id: 55513124caf3885e475ffecd9d9f3dbc4729a573
2020-05-27 01:14:01 -07:00
Rohan Varma
63e545e0fe Revert D21717199: [pytorch][PR] Updates assertEqual to require atol and rtol, removes positional atol
Test Plan: revert-hammer

Differential Revision:
D21717199

Original commit changeset: 9feb856f94ee

fbshipit-source-id: bfde9c39a5ce99f0ca6183a7dde703c65b7c8259
2020-05-26 18:23:59 -07:00
mattip
2e6ee853ab make onnx expect tests resiliant to producer_version changes (#39002)
Summary:
closes gh-32561, closes gh-38545. As part of the fallout from gh-36797, this PR
- replaces `producer_version: "1.6"` in onnx expect tests with `producer_version: "XXX"`
- adapts `testing/_internal/common_utils.py` with a regex to change the onnx producer_version so tests still pass

The consistency of the torch version and the onnx `producer_version` is tested in gh-36797, so there is no reason to test it again in the expect tests.
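The normalization described above amounts to something like this (a sketch; the exact regex in `common_utils.py` may differ):

```python
import re

def normalize_producer_version(onnx_text):
    # Mask the producer_version field so version bumps don't break expect tests.
    return re.sub(r'producer_version: "[^"]*"',
                  'producer_version: "XXX"', onnx_text)
```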

xref gh-38629 which documented how to run the onnx tests and at the same time refactored the Community documentation.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39002

Differential Revision: D21723062

Pulled By: ezyang

fbshipit-source-id: 1bd6a8ed37d5383e69d017226dc09c0645a69aff
2020-05-26 16:11:21 -07:00
Mike Ruberry
6ddca30b2d Updates assertEqual to require atol and rtol, removes positional atol (#38872)
Summary:
This updates assertEqual and assertEqual-like functions to either require both or neither of atol and rtol be specified. This should improve clarity around handling precision in the test suite, and it allows us to remove the legacy positional atol argument from assertEqual. In addition, the "message" kwarg is replaced with a kwarg-only "msg" argument whose name is consistent with unittest's assertEqual argument.

In the future we could make "msg" an optional third positional argument to be more consistent with unittest's assertEqual, but requiring it be specified should be clear, and we can easily update the signature to make "msg" an optional positional argument in the future, too.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38872

Differential Revision: D21717199

Pulled By: mruberry

fbshipit-source-id: 9feb856f94eee911b44f6c7140a1d07c1b026d3a
2020-05-26 08:30:23 -07:00
Mike Ruberry
9cfc10d52e Updates assertEqual to use torch.isclose-like logic (#37294)
Summary:
Edit: this has been updated to reflect the PR's current status, which has changed after review.

This PR updates the behavior of the assertEqual, assertNotEqual, and assert_allclose to be consistent with each other and torch.isclose. It corrects several additional bugs in the current implementations and adds extensive testing and comments, too.

These updates follow from changes to assertEqual like https://github.com/pytorch/pytorch/pull/34258 and https://github.com/pytorch/pytorch/pull/37069, and from our discussion of torch.isclose for complex tensors (see https://github.com/pytorch/pytorch/issues/36462), where we decided to implement a NumPy-compatible mathematical notion of "closeness" for complex tensors that is not a great fit for our testing framework.

The detailed changelist is (a sketch of the core comparison logic follows the list):

- New test framework functions for comparing tensors and scalars
  - Tensors are compared using isclose; the real and imaginary parts of complex tensors are compared independently
  - Scalars are compared using the same algorithm
  - assertEqual and assert_allclose now use this common comparison function, instead of each implementing their own with divergent behavior
  - assertEqual-like debug messages are now available for all tensor and scalar comparisons, with additional context when comparing the components of sparse, quantized, and complex tensors
- Extensive testing of the comparison behavior and debug messages
- Small Updates
  - assertEqual now takes an "exact_device" argument, analogous to "exact_dtype", which should be useful in multidevice tests
  - assertEqual now takes an "equal_nan" argument for argument consistency with torch.isclose
  - assertEqual no longer takes the "allow_inf" keyword, which misleadingly only applied to scalar comparisons, was only ever set (rarely) to true, and is not supported by torch.isclose
- Bug fixes:
  - the exact_dtype attribute has been removed (no longer needed after https://github.com/pytorch/pytorch/pull/38103)
  - message arguments passed to assertEqual are now handled correctly
  - bool x other dtype comparisons are now supported
  - uint8 and int8 tensor comparisons now function properly
  - rtol for integer comparisons is now supported (default is zero)
  - rtol and atol for scalar comparisons are now supported
  - complex scalar comparisons are now supported, analogous to complex tensor comparisons
  - assertNotEqual is now equivalent to the logical negation of assertEqual
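
As referenced above, the core comparison logic can be sketched as follows (assumed helper name; real and imaginary parts of complex tensors are checked independently, per the changelist):

```python
import torch

def tensors_are_close(a, b, rtol, atol, equal_nan=True):
    # Complex tensors: compare real and imaginary components independently.
    if a.is_complex():
        return (tensors_are_close(a.real, b.real, rtol, atol, equal_nan) and
                tensors_are_close(a.imag, b.imag, rtol, atol, equal_nan))
    return bool(torch.isclose(a, b, rtol=rtol, atol=atol,
                              equal_nan=equal_nan).all())
```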
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37294

Differential Revision: D21596830

Pulled By: mruberry

fbshipit-source-id: f2576669f7113a06f82581fc71883e6b772de19b
2020-05-15 16:24:03 -07:00
David Reiss
1f87f15ba3 Remove _reset_warning_registry (#38485)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38485

Python 2 has reached end-of-life and is no longer supported by PyTorch.
This class does nothing in Python 3.

Test Plan: CI

Reviewed By: ailzhang

Differential Revision: D21575260

Pulled By: dreiss

fbshipit-source-id: 184696c9fa501e8d2517950b47cdbc90b2ae8053
2020-05-14 15:03:30 -07:00
Nikolay Korovaiko
96885f73ed make test_jit infer the profiling mode, add a job for simple executor (#38374)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/38374

Differential Revision: D21567658

Pulled By: Krovatkin

fbshipit-source-id: c0eb44cf6c842d5feebabf8c7d99c1b4aa6c4960
2020-05-13 23:55:40 -07:00
Pavel Belevich
4f08bdddfc Add skipIfNoSciPy/get_all_int_dtypes/get_all_fp_dtypes to common_utils (#38299)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/38299

Test Plan: Imported from OSS

Differential Revision: D21534876

Pulled By: pbelevich

fbshipit-source-id: 864881b3be899aea3660039128d9bc2e94edab95
2020-05-12 19:11:31 -07:00
Vitaly Fedyunin
48ad9f5a30 assertEqual now requires matching dtypes (#38103)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/38103

Test Plan: Imported from OSS

Differential Revision: D21477062

Pulled By: VitalyFedyunin

fbshipit-source-id: 9592fed336214dd97eb8e9d6b3e16f21ff6f072d
2020-05-09 14:49:01 -07:00
Vitaly Fedyunin
e3414c1ef1 AssertEqual now checks tensors dtype (#34154)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34154

Temporarily replacing all cases where `assertEqual` fails with `assertEqualIgnoreType`.
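The distinction, sketched (illustrative lines as they would appear inside a test method):

```python
# After this change a dtype mismatch fails even when the values agree.
self.assertEqual(torch.tensor([1.0]), torch.tensor([1]))    # fails: float32 vs int64
# Not-yet-fixed call sites were switched to the escape hatch:
self.assertEqualIgnoreType(torch.tensor([1.0]), torch.tensor([1]))  # values only
```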

Test Plan: Imported from OSS

Differential Revision: D20251131

Pulled By: VitalyFedyunin

fbshipit-source-id: fa69c6e2b3a7963912af5b0fa42bec9eded323d3
2020-05-09 14:47:01 -07:00
Ailing Zhang
9232356e5f remove uses of type() and type_as() part 1. (#38029)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/38029

Differential Revision: D21468523

Pulled By: ailzhang

fbshipit-source-id: 14b7185d43eb03f630cfaa2d70e02d637ff8551b
2020-05-08 08:16:24 -07:00
Nikita Shulga
53aa7d8bc5 Add option to skip tests after retries (#38079)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/38079

Differential Revision: D21470238

Pulled By: malfet

fbshipit-source-id: b2e63be34090c6f61acad8b6530658a835c68870
2020-05-07 21:56:29 -07:00
Nikita Shulga
72e5b7ae5b Add option to run python unittests in parallel (#37180)
Summary:
So far the results look quite promising: test_nn consists of purely sequential tests and can be accelerated 3x
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37180

Differential Revision: D21437871

Pulled By: malfet

fbshipit-source-id: 8679a8af355f839f2c9dae3bf36d2e102af05425
2020-05-06 22:14:11 -07:00
Elias Ellison
0e3a05ec00 [JIT] rename enable_profiling_mode to enable_profiling_mode_for_profiling_tests (#37825)
Summary:
The existing context manager only conditionally enabled profiling mode, which was counterintuitive. When we changed the default executor, it broke internal benchmarking as a result.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37825

Differential Revision: D21404611

Pulled By: eellison

fbshipit-source-id: 306b3c333ef4eb44ab6a6e5ab4e0682e5ce312ce
2020-05-06 11:30:02 -07:00
Nikita Shulga
2c6aed0d61 [Testing] Add --save-xml option (#37840)
Summary:
Passing the `--save-xml` option to the common test runner has the same effect as setting the `IN_CIRCLECI` environment variable, but also allows one to specify the folder where results are saved
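Example usage (assumed flag syntax): `python test_torch.py --save-xml=test-reports`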
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37840

Differential Revision: D21410250

Pulled By: malfet

fbshipit-source-id: ae5855fafdc8c66b550d42b683d547c88b4e55d9
2020-05-05 14:57:50 -07:00
Nikolay Korovaiko
edc5ef1afb run the simple executor for jit tests by default, add profiling jobs for fusion tests (#37017)
Summary:
fix flake8 warnings

fix ci failures

fix test_determination.py
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37017

Differential Revision: D21238446

Pulled By: Krovatkin

fbshipit-source-id: 393e6135883dc5ac57bdff580de96c66829d454c
2020-04-28 19:16:52 -07:00
Nikita Shulga
ea741f829e Add --repeat option to python unit-test (#37281)
Summary:
This runs the same test suite (or an individual test) multiple times.
Useful for detecting flaky tests.

Example usage: `python test_autograd.py TestAutograd.test_profiler -v --repeat=100`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37281

Differential Revision: D21244442

Pulled By: malfet

fbshipit-source-id: 3ecafec7ae87bc1e418aa28151bbc472ef37a713
2020-04-25 13:56:58 -07:00
Brian Vaughan
a50a1fb4c3 Enforce kw-only args now that py2 is unsupported (#37069)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/37069

Test Plan: Imported from OSS

Differential Revision: D21204729

Pulled By: nairbv

fbshipit-source-id: 8e93decae59e753706fa288bcdc3bf6278b8eeb5
2020-04-24 07:08:24 -07:00
David Reiss
e75fb4356b Remove (most) Python 2 support from Python code (#35615)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35615

Python 2 has reached end-of-life and is no longer supported by PyTorch.
Now we can clean up a lot of cruft that we put in place to support it.
These changes were all done manually, and I skipped anything that seemed
like it would take more than a few seconds, so I think it makes sense to
review it manually as well (though using side-by-side view and ignoring
whitespace change might be helpful).

Test Plan: CI

Differential Revision: D20842886

Pulled By: dreiss

fbshipit-source-id: 8cad4e87c45895e7ce3938a88e61157a79504aed
2020-04-22 09:23:14 -07:00
Nikita Shulga
3b832ee2bf Use Python3 super() throughout torch.testing. (#37024)
Summary:
Hattip to ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37024

Differential Revision: D21173244

Pulled By: malfet

fbshipit-source-id: 7079703e28777d873f69bf9fd4dcbad8d53a2682
2020-04-22 09:00:28 -07:00
Brian Vaughan
54ed6fd3ee Use both absolute and relative tolerance in testing (#34258)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34258

This PR allows both atol and rtol to be specified, uses defaults based on the prior analysis (spreadsheet attached to https://github.com/pytorch/pytorch/pull/32538), and retains the absolute-tolerance behavior in cases where precision was previously specified explicitly.
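The combined check follows the usual `isclose`-style rule; as a sketch:

```python
import torch

def within_tolerance(actual, expected, atol, rtol):
    # Passes when |actual - expected| <= atol + rtol * |expected| elementwise.
    return bool(((actual - expected).abs()
                 <= atol + rtol * expected.abs()).all())
```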

Test Plan: Imported from OSS

Differential Revision: D21110255

Pulled By: nairbv

fbshipit-source-id: 57b3a004c7d5ac1be80ee765f03668b1b13f4a7e
2020-04-19 06:16:49 -07:00
Elias Ellison
54a575c9bd [JIT] fix torch.tensor jit dtype (#36587)
Summary:
Previously we were always creating a double tensor from `torch.tensor(1.)`, whereas Python eager mode uses the current default dtype. Fix for https://github.com/pytorch/pytorch/issues/36369
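A usage sketch of the behavior being fixed:

```python
import torch

torch.set_default_dtype(torch.float32)
print(torch.tensor(1.).dtype)   # torch.float32 in eager mode

@torch.jit.script
def make_scalar():
    return torch.tensor(1.)

print(make_scalar().dtype)      # torch.float32 after the fix (previously float64)
```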
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36587

Differential Revision: D21043617

Pulled By: eellison

fbshipit-source-id: 38da303594f52e06941d86b6e57c4a06e7d36938
2020-04-16 10:55:49 -07:00
Mike Ruberry
d0c925f1c7 Returns float tensors for complex inputs to abs (#35871)
Summary:
Per title. A test is added to test_type_promotion for the behavior. This behavior is consistent with NumPy's.

For complex inputs to `abs` the result is cast to float after the computation since the computation of abs must be performed on the original complex tensor. While `std::abs` returns a float value when called on complex inputs, returning a FloatTensor directly would require additional loop instantiations in TensorIterator. This may be worthwhile to pursue in the future.
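A usage sketch of the resulting behavior:

```python
import torch

z = torch.tensor([3 + 4j], dtype=torch.complex64)
r = torch.abs(z)
print(r)        # tensor([5.])
print(r.dtype)  # torch.float32, not torch.complex64
```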
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35871

Differential Revision: D20984456

Pulled By: mruberry

fbshipit-source-id: 226445178f92f2b0292e92578656d98674a6aa20
2020-04-16 09:03:17 -07:00
Natalia Gimelshein
f3f640d479 move test_abs to device-generic tests (#36465)
Summary:
Per title. test_abs used to be marked as a slow_test and run on CPU only. Conceptually similar tests are done in TestTorchMathOps, so it's a matter of adding an `abs` test there. Two remaining checks (correct abs for large-valued long tensors, and correct abs for signed zeros) are factored into separate tests.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36465

Differential Revision: D21000248

Pulled By: ngimel

fbshipit-source-id: 8bc8b0da936b1c10fe016ff2f0dbb5ea428e7e61
2020-04-14 09:48:08 -07:00
Wanchao Liang
3526627f46 Use unittest assertWarns instead (#36411)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36411

This PR removes the PyTorch-specific assertWarns definition and uses the
unittest one; it also formats some tests

Test Plan: Imported from OSS

Differential Revision: D20998159

Pulled By: wanchaol

fbshipit-source-id: 1280ecff2dd293b95a639d13cc7417fc819c2201
2020-04-13 15:56:42 -07:00
Mike Ruberry
254be6a201 Adds NumPy array x Torch tensor binary ufunc interaction test (#35945)
Summary:
Adds test for behavior reported in https://github.com/pytorch/pytorch/issues/35257 to ensure it doesn't regress. The test was extended to reveal three additional issues:

- https://github.com/pytorch/pytorch/issues/36363
- https://github.com/pytorch/pytorch/issues/36058
- https://github.com/pytorch/pytorch/issues/36057
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35945

Differential Revision: D20984429

Pulled By: mruberry

fbshipit-source-id: a15be9455afba9c77e40c337a860f9be348bf8d5
2020-04-11 21:56:38 -07:00
Lu Fang
742c77971a Revert D20961711: [pytorch][PR] Returns float tensors for complex inputs to abs
Test Plan: revert-hammer

Differential Revision:
D20961711

Original commit changeset: 232f62cf64ca

fbshipit-source-id: 7b2a537d2effe6b2449f192dc42e375062058995
2020-04-11 02:55:41 -07:00
Mike Ruberry
3aeb2b1562 Returns float tensors for complex inputs to abs (#35871)
Summary:
Per title. A test is added to test_type_promotion for the behavior. This behavior is consistent with NumPy's.

For complex inputs to `abs` the result is cast to float after the computation since the computation of abs must be performed on the original complex tensor. While `std::abs` returns a float value when called on complex inputs, returning a FloatTensor directly would require additional loop instantiations in TensorIterator. This may be worthwhile to pursue in the future.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35871

Differential Revision: D20961711

Pulled By: mruberry

fbshipit-source-id: 232f62cf64caa4154eb2194969efa51d2082d842
2020-04-10 09:08:45 -07:00
Nikita Shulga
bb32e123e6 Report results of python unit tests during window test runs (#35687)
Summary:
Define `store_test_results` attribute in CircleCI yamls
Install `unittest-xml-reporting` and define `IN_CIRCLECI` environment variable to trigger test runners to save results to XML
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35687

Differential Revision: D20739831

Pulled By: malfet

fbshipit-source-id: 6a7bbf19f93c32766963f5edad191ad8ca316ff8
2020-03-30 12:33:03 -07:00
Mike Ruberry
683246e5ea Improves precision of linspace, logspace (#35461)
Summary:
The Torch algorithms for linspace and logspace conceptually compute each of their values using:

`start_value + step_value * idx`

[And NumPy does the same,](cef4dc9d91/numpy/core/function_base.py (L24)) except NumPy then [sets the last value in its array directly.](cef4dc9d91/numpy/core/function_base.py (L162)) This is because the above computation is unstable when using floats, and NumPy's contract, like PyTorch's, is that the last element in the array is the stop value.

In PyTorch there can be a divergence between the computed last value and the actual value. One user reported case was:

`torch.linspace(-0.031608279794, 0.031531572342, 257, dtype=torch.float32)`

This causes a difference of 3.7253e-09 between the last value as set by NumPy and as computed by PyTorch. After this PR the difference is zero.

Instead of simply setting the last element of the tensor, this PR updates the kernels with a "symmetric" algorithm that sets the first and last array elements without requiring an additional kernel launch on CUDA. The performance impact of this change seems small. I tested with step sizes of 2^8 and 2^22, and all timing differences were imperceptible except for 2^22 on CPU, which appears to have suffered ~5% slowdown. I think that's an acceptable performance hit for the improved precision when we consider the context of linspace.
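The symmetric fill can be sketched as follows (illustrative Python, not the actual kernel; assumes `steps > 1`):

```python
import torch

def linspace_symmetric(start, stop, steps, dtype=torch.float32):
    # First half indexes forward from start, second half backward from stop,
    # so both endpoints are exact without a second kernel launch.
    out = torch.empty(steps, dtype=dtype)
    step = (stop - start) / (steps - 1)
    for i in range(steps):
        if i < (steps + 1) // 2:
            out[i] = start + i * step
        else:
            out[i] = stop - (steps - 1 - i) * step
    return out
```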

An alternative would be to simply set the last element, as NumPy does, on CPU. But I think it's preferable to keep the CPU and CUDA algorithms aligned and keep the algorithm symmetric. In current PyTorch, for example, torch.linspace starts generating values very similar to NumPy, but as the index increases so do the errors, giving our current implementation a "left bias."

Two tests are added to test_torch.py for this behavior. The linspace test will fail on current PyTorch, but the logspace test will succeed since its more complex computation needs wider error bars.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35461

Differential Revision: D20712539

Pulled By: mruberry

fbshipit-source-id: 2c1257c8706f4cdf080ff0331bbf2f7041ab9adf
2020-03-27 23:50:39 -07:00
Alban Desmaison
181da12126 Revert D20687652: [pytorch][PR] Report results from cpp unittests on Windows and Linux
Test Plan: revert-hammer

Differential Revision:
D20687652

Original commit changeset: fc370b7e2614

fbshipit-source-id: 8153815c8ed8f3d4f472caa95eda76180b038a42
2020-03-27 06:56:53 -07:00
Nikita Shulga
d2d40c45b6 Report results from cpp unittests on Windows and Linux (#35500)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35500

Test Plan:
Test in production :)
Results should eventually be published to: https://circleci.com/build-insights/gh/pytorch/pytorch/master

Differential Revision: D20687652

Pulled By: malfet

fbshipit-source-id: fc370b7e261402e14b427f42038ecb2d95bad059
2020-03-26 23:00:33 -07:00
Nikita Shulga
6fa0b3df2e [testing] Pass verbosity settings to XMLTestRunner (#35224)
Summary:
When `unittest.main()` is invoked with a custom testRunner, verbosity settings for the runner must be set manually
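A sketch of the wiring (assumed; the actual plumbing in the test runner differs):

```python
import sys
import unittest
import xmlrunner  # provided by unittest-xml-reporting

verbosity = 2 if "-v" in sys.argv else 1
unittest.main(testRunner=xmlrunner.XMLTestRunner(output="test-reports",
                                                 verbosity=verbosity))
```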
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35224

Test Plan: CI

Differential Revision: D20605896

Pulled By: malfet

fbshipit-source-id: 79fc6f55911189b6d8a4bc83bd2390c94bd69e5e
2020-03-23 16:37:52 -07:00
Ailing Zhang
471ddacd8b Add retry decorator and use it for Hub tests. (#34829)
Summary:
fix https://github.com/pytorch/pytorch/issues/34751
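A minimal sketch of such a retry decorator (names and defaults assumed):

```python
import functools
import time

def retry(exc, tries=3, delay=1):
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(tries):
                try:
                    return fn(*args, **kwargs)
                except exc:
                    if attempt == tries - 1:
                        raise
                    time.sleep(delay)
        return wrapper
    return decorator
```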
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34829

Differential Revision: D20476231

Pulled By: ailzhang

fbshipit-source-id: eb38ee655e28250352b15e8e37b3b39310a7c378
2020-03-16 20:19:45 -07:00
Pearu Peterson
8bae1ed144 PCA and SVD for low-rank matrices, LOBPCG for positive-definite generalized eigenvalue problem - copy (#34721)
Summary:
This is a copy of PR https://github.com/pytorch/pytorch/issues/29488 to help the merging process.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34721

Differential Revision: D20444270

Pulled By: vincentqb

fbshipit-source-id: 042c56c8c0dae37834f52b4aee2deae7dd6fa659
2020-03-16 14:13:30 -07:00
Edward Yang
4b929e5466 Revert D20193196: [pytorch][PR] PCA and SVD for low-rank matrices, LOBPCG for positive-definite generalized eigenvalue problem
Test Plan: revert-hammer

Differential Revision:
D20193196

Original commit changeset: 78a487991242

fbshipit-source-id: 8da4f8cb17c45af41e8c0ce80bc72581eb10dbb8
2020-03-11 09:24:34 -07:00
Pearu Peterson
2ec779d46c PCA and SVD for low-rank matrices, LOBPCG for positive-definite generalized eigenvalue problem (#29488)
Summary:
This PR implements the following linear algebra algorithms for low-rank matrices (a brief usage sketch follows the list):
- [x] Approximate `A` as `Q Q^H A` - using Algorithm 4.4 from [Halko et al, 2009](http://arxiv.org/abs/0909.4061).
  + exposed as `torch.lowrank.get_approximate_basis(A, q, niter=2, M=None) -> Q`
  + [x] dense matrices
  + [x] batches of dense matrices
  + [x] sparse matrices
  + [x] documentation
- [x] SVD - using Algorithm 5.1 from [Halko et al, 2009](http://arxiv.org/abs/0909.4061).
  + uses `torch.lowrank.get_approximate_basis`
  + exposed as `torch.svd_lowrank(A, q=6, niter=2, M=None) -> (U, S, V)`
  + [x] dense matrices
  + [x] batches of dense matrices
  + [x] sparse matrices
  + [x] documentation
- [x] PCA - using `torch.svd_lowrank`
  + uses `torch.svd_lowrank`
  + exposed as `torch.pca_lowrank(A, center=True, q=None, niter=2) -> (U, S, V)`
  + [x] dense matrices
  + [x] batches of dense matrices
  + [x] sparse matrices, uses non-centered sparse matrix algorithm
  + [x] documentation
- [x] generalized eigenvalue solver using the original LOBPCG algorithm [Knyazev, 2001](https://epubs.siam.org/doi/abs/10.1137/S1064827500366124)
  + exposed as `torch.lobpcg(A, B=None, k=1, method="basic", ...)`
  + [x] dense matrices
  + [x] batches of dense matrices
  + [x] sparse matrices
  + [x] documentation
- [x] generalized eigenvalue solver using robust LOBPCG with orthogonal basis selection [Stathopoulos, 2002](https://epubs.siam.org/doi/10.1137/S1064827500370883)
  + exposed as `torch.lobpcg(A, B=None, k=1, method="ortho", ...)`
  + [x] dense matrices
  + [x] batches of dense matrices
  + [x] sparse matrices
  + [x] documentation
- [x] generalized eigenvalue solver using the robust and efficient LOBPCG Algorithm 8 from [Duersch et al, 2018](https://epubs.siam.org/doi/abs/10.1137/17M1129830) that switches to orthogonal basis selection automatically
  + the "ortho" method improves iterations so rapidly that in the current test cases it does not make sense to use the basic iterations at all. If users will have matrices for which basic iterations could improve convergence then the `tracker` argument allows breaking the iteration process at user choice so that the user can switch to the orthogonal basis selection if needed. In conclusion, there is no need to implement Algorithm 8 at this point.
- [x] benchmarks
  + [x] `torch.svd` vs `torch.svd_lowrank`, see notebook [Low-rank SVD](https://github.com/Quansight/pearu-sandbox/blob/master/pytorch/Low-rank%20SVD.ipynb). In conclusion, the low-rank SVD is going to be useful only for large sparse matrices where the full-rank SVD will fail due to memory limitations.
  + [x] `torch.lobpcg` vs `scipy.sparse.linalg.lobpcg`, see notebook [LOBPCG - pytorch vs scipy](https://github.com/Quansight/pearu-sandbox/blob/master/pytorch/LOBPCG%20-%20pytorch%20vs%20scipy.ipynb). In conclusion, both implementations give the same results (up to numerical errors from different methods); the scipy lobpcg implementation is generally faster.
  + [x] On very small tolerance cases, `torch.lobpcg` is more robust than `scipy.sparse.linalg.lobpcg` (see `test_lobpcg_scipy` results)
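
As referenced above, a brief usage sketch of the new APIs (shapes chosen for illustration):

```python
import torch

A = torch.randn(100, 40)
U, S, V = torch.svd_lowrank(A, q=6)            # rank-6 approximate SVD
U2, S2, V2 = torch.pca_lowrank(A, q=6)         # PCA built on svd_lowrank
M = A @ A.t() + 100 * torch.eye(100)           # symmetric positive definite
E, X = torch.lobpcg(M, k=3, method="ortho")    # 3 extreme eigenpairs
```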

Resolves https://github.com/pytorch/pytorch/issues/8049.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29488

Differential Revision: D20193196

Pulled By: vincentqb

fbshipit-source-id: 78a4879912424595e6ea95a95e483a37487a907e
2020-03-11 07:33:49 -07:00
Edward Yang
ba1bd41767 Turn on strict dtype checking for test_torch.py (#33825)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33825

Partially addresses #20376

I do this by overriding assertEqual in classes that opt into
this.  This means I have to fix #33821.  The fix is a little
unsatisfactory as idiomatic Python 2 super() calls don't work
(since the class is no longer in scope); hopefully this will just
work when we go to Python 3.

General approach taken:
- A lot of dtype mismatches are because we specified tensor constants
  that infer to some dtype, but the actual dtype needed is something else.
  Those are easy, just annotate the tensor() constructor (often a legacy
  Tensor/FloatTensor call) with dtype
- There are a few cases where the promotion rules are nontrivial.  Some of them
  I just typed out the expected promotion rules manually (based on trial
  and error)
- There are some more complex cases; if it gets too hairy I just
  set exact_dtype=False and nope the fuck out

I don't have time to do it for all the other classes.  But the setup
should work if people just incrementally add the overrides to classes,
and then eventually flip the default.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Differential Revision: D20125791

Pulled By: ezyang

fbshipit-source-id: 389c2d1efbd93172af02f13e38ac5e92fe730c57
2020-03-03 14:45:53 -08:00
anjali411
dece155335 Modified assertEqual to handle complex tensors (#33773)
Summary:
- Modified assertEqual to handle complex tensors
- added a test in test_torch.py to test torch.zeros
- added dispatch for complex for index_kernel, index_put_kernel
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33773

Differential Revision: D20135553

Pulled By: anjali411

fbshipit-source-id: f716604535c0447ecffa335b0fc843431397c988
2020-02-28 08:43:28 -08:00
Nikolay Korovaiko
a7e22b4c6a add bailout checks to checkScript (#32802)
Summary:
This adds enough infrastructure to run bailout checks in `checkScript`. I'll need to figure out the best way to enable it for nightly builds now.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32802

Differential Revision: D19974718

Pulled By: Krovatkin

fbshipit-source-id: 40485503f6d3ae14edcce98e1eec1f0559f3ad08
2020-02-21 21:18:54 -08:00
Rohan Varma
6cb9e6b015 Back out "Revert D19871946: [distributed] pass in timeout to TCP store when initializing" (#33434)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33434

Reland of https://github.com/pytorch/pytorch/pull/33325, since the
unit test was flaky and failed on land.
To ensure that the test is not flaky, I bumped the timeout so the rendezvous
does not time out (timing out the rendezvous in 1s led to the flakiness). I also
generalized our mechanism for retrying on errors to include retrying on errors
due to timeout in rendezvous.
ghstack-source-id: 98558377

Test Plan: Added UT test_tcp_store_timeout_set

Differential Revision: D19935390

fbshipit-source-id: 56ccf8c333dd2f954a33614d35cd1642d4e9473a
2020-02-19 17:17:17 -08:00
ptrblck
1e3664b6ef Remove c/pdist tests from _internal/common_utils.py (#33409)
Summary:
* remove brute_test from `torch/testing/_internal/common_utils.py`
* add these tests as internal tests to `test_torch.py`

CC ailzhang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33409

Differential Revision: D19951729

Pulled By: ailzhang

fbshipit-source-id: b1126aaf26fa64a0f17cbb582dc8038b79cfe3eb
2020-02-19 10:27:30 -08:00
Pritam Damania
fd684cc312 Use torch.set_default_dtype in test_data_parallel and rename dtype2prec (#32962)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32962

As per gchanan's comments on
https://github.com/pytorch/pytorch/pull/30445, I've used
`torch.set_default_dtype` in test_data_parallel instead of specifying
dtype=torch.double everywhere. Also, renamed dtype2prec to dtype2prec_DONTUSE
ghstack-source-id: 98388429

Test Plan: waitforbuildbot

Differential Revision: D19714374

fbshipit-source-id: eb55bbca33881625636ba9ea6dd4cb692f25668e
2020-02-15 14:07:54 -08:00
ptrblck
a64d0ffe81 Use int64 in pdist kernel to handle batches >= 46342 #30583 (#31593)
Summary:
Currently `torch.pdist` yields an illegal CUDA memory access for batch sizes >= 46342 as reported by SsnL in https://github.com/pytorch/pytorch/issues/30583.
Thanks for the minimal code reproduction, btw! ;)

Reason for this bug:
The calculation of `i` in the [`pdist_kernel_cuda_impl`](46ad80c839/aten/src/ATen/native/cuda/DistanceKernel.cu (L112)) might overflow if a tensor with a `batch size >= 46342` is passed to `torch.pdist`.

Detailed description:
* `result` is resized to `n * (n - 1) / 2 = 1073767311` elements ([line of code](46ad80c839/aten/src/ATen/native/Distance.cpp (L140)))
* `grid` is initialized as `result.numel()` ([line of code](46ad80c839/aten/src/ATen/native/cuda/DistanceKernel.cu (L246)))
* `k` is assigned to the `blockIdx.x` as an `int32` ([line of code](46ad80c839/aten/src/ATen/native/cuda/DistanceKernel.cu (L108)))
* `i` is calculated using `2 * k >= 2147534622` ([line of code](46ad80c839/aten/src/ATen/native/cuda/DistanceKernel.cu (L112))), which overflows, since `2147534622 > 2147483647 (int32_max)`.

Using `const int64_t k = blockIdx.x;` would solve the illegal memory access. This seems also be done for [`cdist_kernel_cuda_impl`](46ad80c839/aten/src/ATen/native/cuda/DistanceKernel.cu (L198-L201)).
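
The arithmetic behind the overflow, for reference:

```python
n = 46342
numel = n * (n - 1) // 2   # 1073767311 output elements, one CUDA block each
print(2 * numel)           # 2147534622
print(2**31 - 1)           # 2147483647 (int32 max) -- so 2 * k overflows
```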

However, we might expect a slowdown, so I've timed the current PyTorch master vs. this PR:
(tested with `x = torch.randn(x.size(0), 128)` on a V100)

| x.size(0) | int32 idx | int64 idx | slowdown |
|-----------|-----------|-----------|----------|
| 50000     | -         | 4.4460    | -        |
| 25000     | 1.02522   | 1.10869   | 7.53%    |
| 12500     | 0.25182   | 0.27277   | 7.68%    |
| 6250      | 0.06291   | 0.06817   | 7.72%    |
| 3125      | 0.01573   | 0.01704   | 7.69%    |
| 1562      | 0.00393   | 0.00426   | 7.75%    |

While checking the backward kernel, it seems I'm triggering another error with a size limit of
```python
x = torch.randn(1449, 1, device='cuda', requires_grad=True)
out = torch.pdist(x)
out.mean().backward()
> RuntimeError: CUDA error: invalid configuration argument
```
, while `[<=1448, 1]` works.

I'll take another look at this issue. Let me know, if the potential fix should go into this PR or if I should open a new issue.

CC ngimel, csarofeen
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31593

Differential Revision: D19825571

Pulled By: ngimel

fbshipit-source-id: ace9ccab49f3cf0ce894cdb6daef0795e2e8ec03
2020-02-11 12:00:39 -08:00