Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60536
`torch.isclose` does not do this bool tensors, which results in a test failure since subtraction (`abs(actual - expected)`) is not supported for them (see #58981). Since the `dtype` is already checked at this point, we can safely move the upcasting before `torch.isclose` is invoked.
Test Plan: Imported from OSS
Reviewed By: ngimel
Differential Revision: D29556356
Pulled By: mruberry
fbshipit-source-id: 4c65fad4f06cf402d6aab9dde5b127235766d5e0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60254
Before we only tested that the correct error message is returned if `msg` is passed as callable. This adds tests that make sure that
- the inputs passed to the callable are the same inputs passed to `torch.assert_close` and
- the `diagnostics` namespace has the same attributes and types as documented.
Test Plan: Imported from OSS
Reviewed By: ngimel
Differential Revision: D29556354
Pulled By: mruberry
fbshipit-source-id: 9793c6d86fda842b6329381fc03b945eee878464
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60163
Changes to the default error message in case of mismatching values need to be reflected in the examples given in the docstring. Normally this should be enforced by a [`doctest`](https://docs.python.org/3/library/doctest.html). mruberry do you know why we don't have such a check?
Test Plan: Imported from OSS
Reviewed By: ngimel
Differential Revision: D29556353
Pulled By: mruberry
fbshipit-source-id: 8dbc3f566f429618811b542a059d9abde9a6530b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60091Closes#58383. (1) and (2) are implemented. (3) was rejected. No consensus was reached on (4) and (5).
Improvements:
- Instead of calling everything "Tensors" we now use "Scalars" and "Tensor-likes" depending on the shape. Plus, we now internally have the option to adapt this identifier for example to report "Imaginary components of complex tensor-likes", which is even more expressive.
- The reported conditions "not close" and "not equal" are now determined based on `rtol` and `atol`.
- The number of mismatched elements and the offending indices are only reported in case the inputs are not scalar
- The allowed `rtol` and `atol` is only reported if `> 0`
**Example 1**
```python
torch.testing.assert_close(1, 3, rtol=0, atol=1)
```
Before:
```
AssertionError: Tensors are not close!
Mismatched elements: 1 / 1 (100.0%)
Greatest absolute difference: 2 at 0 (up to 1 allowed)
Greatest relative difference: 0.6666666865348816 at 0 (up to 0 allowed)
```
After:
```
AssertionError: Scalars are not close!
Absolute difference: 2 (up to 1 allowed)
Relative difference: 0.6666666865348816
```
**Example 2**
```python
torch.manual_seed(0)
t = torch.rand((2, 2), dtype=torch.complex64)
torch.testing.assert_close(t, t + complex(0, 1))
```
Before:
```
AssertionError: Tensors are not close!
Mismatched elements: 4 / 4 (100.0%)
Greatest absolute difference: 1.0000000596046448 at (0, 0) (up to 1e-05 allowed)
Greatest relative difference: 0.8833684352411922 at (0, 1) (up to 1.3e-06 allowed)
The failure occurred for the imaginary part.
```
After:
```
AssertionError: Imaginary components of tensor-likes are not close!
Mismatched elements: 4 / 4 (100.0%)
Greatest absolute difference: 1.0000000596046448 at index (0, 0) (up to 1e-05 allowed)
Greatest relative difference: 0.8833684352411922 at index (0, 1) (up to 1.3e-06 allowed)
```
Test Plan: Imported from OSS
Reviewed By: ngimel
Differential Revision: D29556357
Pulled By: mruberry
fbshipit-source-id: 559d4a19ad4fc069b2b4f8cb5fc2f6058621e33d
Summary:
This adds support for quantized tensors the same way torch.testing._internal.common_utils.TestCase.assertEqual does:
bf269fdc98/torch/testing/_internal/common_utils.py (L1314-L1341)
- `.qscheme()` is checked for equality
- `.q_scale` and `q_zero_point` are checked for equality (see comment below) for `.qscheme() == torch.per_tensor_affine`
- `.q_per_channel_scales`, `q_per_channel_zero_points`, and `q_per_channel_axis` are checked for equality (see comment below) for `.qscheme() == torch.per_tensor_affine`
- values are checked with the default checks after a `.int_repr().to(torch.int32)` call
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58926
Reviewed By: jerryzh168
Differential Revision: D29483532
Pulled By: mruberry
fbshipit-source-id: 003fde7e21cf844778a879c3de0a7c84d13877bd
Summary:
We need to resolve the conjugate bit for complex tensors, because otherwise we may not be able to access the imaginary component:
```python
>>> torch.tensor(complex(1, 1)).conj().imag
RuntimeError: view_as_real doesn't work on unresolved conjugated tensors. To resolve the conjugate tensor so you can view it as real, use self.resolve_conj(); however, be warned that the resulting tensor will NOT alias the original.
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60522
Reviewed By: ngimel
Differential Revision: D29353095
Pulled By: mruberry
fbshipit-source-id: c36eaf883dd55041166f692f7b1d35cd2a34acfb
Summary:
Changes including:
- introduced `linter/`, `testing/`, `stats/` folders in `tools/`
- move appropriate scripts into these folders
- change grepped references in the pytorch/pytorch repo
Next step
- introduce `build/` folder for build scripts
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60473
Test Plan:
- CI (this is important b/c pytorch/test-infra also rely on some script reference.
- tools/tests/
Reviewed By: albanD
Differential Revision: D29352716
Pulled By: walterddr
fbshipit-source-id: bad40b5ce130b35dfd9e59b8af34f9025f3285fd
Summary:
This adds support for sparse tensors the same way `torch.testing._internal.common_utils.TestCase.assertEqual` does:
5c7dace309/torch/testing/_internal/common_utils.py (L1287-L1313)
- Tensors are coalesced before comparison.
- Indices and values are compared individually.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58844
Reviewed By: zou3519
Differential Revision: D29160250
Pulled By: mruberry
fbshipit-source-id: b0955656c2c7ff3db37a1367427ca54ca14f2e87
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58919
Instead of having one large TestAsserts test case, we split of tests for
self-contained functionality like container or complex checking into
separate test cases. That makes it a lot easier to keep an overview over
what is tested.
Test Plan: Imported from OSS
Reviewed By: anjali411
Differential Revision: D29259407
Pulled By: mruberry
fbshipit-source-id: 9769cb6d56c1a3790280542db398cb247986b09a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58918
~Instead of a distinct `torch.testing.assert_close` and `torch.testing.assert_equal`, this makes `torch.testing.assert_equal` a special case of `torch.testing.assert_close` for `rtol=atol=0`. In this case the closeness definition `abs(actual - expected) <= atol + rtol * abs(expected)` boils down to `abs(actual - expected) <= 0`. Since `abs(x)` can never be `<0`, this is equivalent to `abs(a - b) == 0` and this again boils down to `a == b`.~
Following https://github.com/pytorch/pytorch/pull/58918#issuecomment-860642057 and some offline discussions, we opted to use `assert_equal` as an example how to `partial` it.
This makes maintaing the module a lot easier, because we don't need to keep two functions in sync.
Test Plan: Imported from OSS
Reviewed By: anjali411
Differential Revision: D29259404
Pulled By: mruberry
fbshipit-source-id: fa1a1fa93672a7ed1c5f0e4beb0dcd45b5c14fce
Summary:
Some machines don't have a versionless `python` on their PATH, which breaks these existing shebangs.
I'm assuming that all the existing versionless `python` shebangs are meant to be `python3` and not `python2`; please let me know if my assumption was incorrect for any of these.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58275
Test Plan: CI.
Reviewed By: zhouzhuojie
Differential Revision: D28428143
Pulled By: samestep
fbshipit-source-id: 6562be3d12924db72a92a0207b060ef740f61ebf
Summary:
In contrast to the initial opinion in https://github.com/pytorch/pytorch/issues/55385, there are legitimate use cases for nested containers. One such example is the [output of `LSTM`'s](https://pytorch.org/docs/stable/generated/torch.nn.LSTM):
```python
output: Tuple[torch.Tensor, Tuple[torch.Tensor, torch.Tensor]] = torch.nn.LSTM()(input)
assert_close(output, expected)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57270
Reviewed By: albanD
Differential Revision: D28249303
Pulled By: mruberry
fbshipit-source-id: 75caa4414cc184ff0ce4cfc0dd5aafddfad42bcf
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56365
Follow-up to https://github.com/pytorch/pytorch/pull/54784#discussion_r614156172. Instead of having one large testcase where most methods are decorated with `onlyCPU`, this factors out all tests that actually need another device into a separate test case.
Test Plan: Imported from OSS
Reviewed By: walterddr, albanD
Differential Revision: D28247529
Pulled By: mruberry
fbshipit-source-id: 946e7694b70e736941565f29b5dd459ed7fbca4e
Summary:
Redo of https://github.com/pytorch/pytorch/issues/57135 out of stack
---
Currently all values are used for the reported absolute and relative differences. This usually works fine, but breaks down for the extremals:
```python
torch.testing.assert_close(torch.tensor([1.0, 0.0]), torch.tensor([2.0, 0.0]))
```
```
[...]
Greatest absolute difference: 1.0 at 0 (up to 1e-05 allowed)
Greatest relative difference: nan at 1 (up to 1.3e-06 allowed)
```
Although the second element is matching it is listed as offender for the greatest relative difference. The `NaN` stems from the `0 / 0` division.
To overcome this, we should only use the values that were considered a mismatch for the reported stats.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57923
Reviewed By: ngimel
Differential Revision: D28317316
Pulled By: mruberry
fbshipit-source-id: 4c604493bbe13b37f41225ea9af9e839a7304161
Summary:
Redo of https://github.com/pytorch/pytorch/issues/56373 out of stack.
---
To reviewers: **please be nitpicky**. I've read this so often that I probably missed some typos and inconsistencies.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57247
Reviewed By: albanD
Differential Revision: D28247402
Pulled By: mruberry
fbshipit-source-id: 71142678ee5c82cc8c0ecc1dad6a0b2b9236d3e6
Summary:
Currently we require type equality for `torch.testing.assert_(equal|close)`:
3db45bcb91/torch/testing/_asserts.py (L509-L513)
That means `assert_equal(1, 1.0)` will correctly fail. Although the type of a scalar is similiar to a dtype of a tensor, `assert_equal(1, 1.0, check_dtype=False)` will also fail while `assert_equal(torch.as_tensor(1), torch.as_tensor(1.0), check_dtype=False)` will pass.
To make the interface more consistent, this PR relaxes the type equality constraint, by disabling it in case both inputs are scalars.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57532
Reviewed By: ngimel
Differential Revision: D28242428
Pulled By: mruberry
fbshipit-source-id: b643c77f48b64fc2c8a43925120d2b634ec336b5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55890
Proof-of-concept for https://github.com/pytorch/pytorch/pull/55145#issuecomment-817297273
With this the user is able to pass a custom error message to `assert_(equal|close)` which will be used in case the values mismatch. Optionally, a callable can be passed which will be called with mismatch diagnostics and should return an error message:
```python
def make_msg(a, b, info):
return (
f"Argh, we found {info.total_mismatches} mismatches! "
f"That is {info.mismatch_ratio:.1%}!"
)
torch.testing.assert_equal(torch.tensor(1), torch.tensor(2), msg=make_msg)
```
If you imagine `a` and `b` as the outputs of binary ufuncs, the error message could look like this:
```python
def make_msg(input, torch_output, numpy_output, info):
return (
f"For input {input} torch.binary_op() and np.binary_op() do not match: "
f"{torch_output} != {numpy_output}"
)
torch.testing.assert_equal(
torch.binary_op(input),
numpy.binary_op(input),
msg=lambda a, b, info: make_msg(input, a, b, info),
)
```
This should make it much easier for developers to find out what is actually going wrong.
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D27903842
Pulled By: mruberry
fbshipit-source-id: 4c82e3d969e9a621789018018bec6399724cf388
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55786
Add support to compare scalars as well as `np.ndarray`'s with torch.testing. We are reusing the mathcing functionality that is already in place for tensors, by casting the inputs. The approach can easily extended if we want to support other input types as long as they can be cast to a tensor.
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D27903814
Pulled By: mruberry
fbshipit-source-id: fe3d063d0c9513cbd8b3408a2023e94c490c817e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55385
This renames `assert_tensors_(equal|close)` to `_check_tensors_(equal|close)` and exposes two new functions: `assert_(equal|close)`. In addition to tensor pairs, the newly added functions also support the comparison of tensors in sequences or mappings. Otherwise their signature stays the same.
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D27903805
Pulled By: mruberry
fbshipit-source-id: 719d19a1d26de8d14cb25846e3d22a6ac828c80a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55145
Repeating the discussion from https://github.com/pytorch/pytorch/pull/54784#issuecomment-811792089
The error messages for mismatched values are directly adapted from the old `_compare_tensors_internal`:
50cb75edce/torch/testing/__init__.py (L104-L111)
A sample error message right now looks like this
```
With rtol=1.3e-06 and atol=1e-05, found 1 different element(s) out of 12 (8.3%). The greatest difference of 4.0 (5.0 vs. 9.0) occurred at index (2, 3)
```
Using the same data with `numpy.testing.assert_equal` gives the following output:
```
Not equal to tolerance rtol=1.3e-06, atol=1e-05
Mismatched elements: 1 / 12 (8.33%)
Max absolute difference: 4.
Max relative difference: 0.44444445
x: array([[5., 5., 5., 5.],
[5., 5., 5., 5.],
[5., 5., 5., 5.]], dtype=float32)
y: array([[5., 5., 5., 5.],
[5., 5., 5., 5.],
[5., 5., 5., 9.]], dtype=float32)
```
Pros:
- The info is presented in a list instead of a sentence. IMO this makes it more readable
- The maximum relative difference is reported, which is beneficial in case a comparison fails due to the `rtol`
Cons:
- The values of the inputs are reported (this can be disabled by passing `verbose=False`, but lets face it: most users will use the default setting). In case the inputs are large, the output gets truncated with `...`. Not only is it hard to visually find the mismatching values, they could also live within the truncated part, making the output completely useless.
- Even when visually find the offending values it is hard to parse this back to the index in the inputs.
This implements a mix of both to get a short but expressive message:
```
Tensors are not close according to rtol=1.3e-6 and atol=1e-05:
Mismatched elements: 1 / 12 (8.3%)
Max. rel. diff.: 4.44e-1 at (2, 3)
Max. abs. diff.: 4.0 at (2, 3)
```
Test Plan: Imported from OSS
Reviewed By: heitorschueroff
Differential Revision: D27877157
Pulled By: mruberry
fbshipit-source-id: 6898a995f116f127e3ae8ed0bcb1ada63eadc45a
Summary:
This PR
- moves `torch/testing/_internal/mypy_wrapper.py` (and its accompanying tests from `test/test_testing.py`) to `tools`,
- removes the now-unused `test_run_mypy` from `test/test_type_hints.py`, and
- replaces the hardcoded list of `mypy` configs (previously duplicated across `mypy_wrapper.py` and `.github/workflows/lint.yml`) with a simpler glob
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54268
Test Plan:
Should also be run in the "Test tools" GHA workflow in CI:
```
python tools/test/test_mypy_wrapper.py
```
Reviewed By: janeyx99
Differential Revision: D27168095
Pulled By: samestep
fbshipit-source-id: a8dc18407b5e4c103ace23a636b0a8534951905a
Summary:
Step 2 to fixing https://github.com/pytorch/pytorch/issues/53882 :)
This changes TARGET_DET_LIST and sharding automation by checking if there's already cached data from the commit in `.pytorch-test-times`. If not, it pulls data from S3 and updates the file to have the stats. This way, S3 pulling does not need to happen more than once for the same commit.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54210
Test Plan:
the following methods should run the same set of tests.
First `export CIRCLE_JOB=pytorch_linux_xenial_cuda10_2_cudnn7_py3_gcc7_test2` or your favorite CIRCLE JOB.
1. Pull data first and use it:
Download the data from S3 and write it to the cache file with `python test/run_test.py --export-historic-test-times .pytorch-test-times`
Now run `python test/run_test.py --shard 1 10`
2. Make the sharding job pull data:
Delete the file you just created: `rm .pytorch-test-times`
Now run `python test/run_test.py --shard 1 10`
Reviewed By: walterddr
Differential Revision: D27136849
Pulled By: janeyx99
fbshipit-source-id: 51a42c4e2fa3f8cf15e682679dd3eb6130aad927
Summary:
This is an initial attempt in refactoring and consolidating our S3 read logic for print_test_stats.py, test_history.py, and run_test.py. This way, boto3 and botocore do not need to be imported in various places throughout the code base, and duplicated logic (such as the many type definitions) can exist in one place: `tools/stat_utils/s3_stat_parser.py`. walterddr contributed to this PR by moving print_test_stats.py to the tools folder and the corresponding tests a subfolder within tools.
**NOTE: this removes those tests from CI as the new `tools/test/test_stats.py` is not in the test/ directory as the other tests in TESTS in run_test.py.**
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53755
Test Plan:
This refactoring change should not break anything, so running the files as before should work as they did previously.
To make sure that print_test_stats.py still functions: run `python tools/test/test_stats.py` and make sure all tests pass.
To make sure that test_history.py works, run the example commands from `tools/test_history.py --help` and check that their output matches that shown. Note that the script will continue printing for a while, so don't be alarmed.
Some next steps:
- Actually coming up with similarities among the three current use cases and further refactoring/consolidating of functions (e.g., combining simplify and get_cases)
- Moving more parsing logic to s3_stat_parser.py to have better abstraction between our files
- Adding tests for s3_stat_parser.py when there is more functionality in it
Reviewed By: agolynski, samestep
Differential Revision: D27030285
Pulled By: janeyx99
fbshipit-source-id: e664781324ef7c0c30943bfd7f17c895075ef7a7
Summary:
This PR disables the bulk of the output for test time regression reporting, since it's obscuring more important signal (especially in cases where shards are shifting around).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54078
Test Plan:
```
python test/test_testing.py
```
Reviewed By: ezyang, walterddr
Differential Revision: D27088987
Pulled By: samestep
fbshipit-source-id: 06a4eeb75641552bad2ab4b9154a8c70c57b0d68
Summary:
This PR:
1. moves sharding algorithm from run_test.py to framework_utils.py (let me know if you have a better place for it)
2. adds tests for the algorithm in test_testing.py
3. fixes the algorithm so that it doesn't tack on the unknown jobs all to the shard with the minimum time, but instead distributes them around the shards.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53942
Test Plan: python test/test_testing.py -k TestFrameworkUtils
Reviewed By: samestep
Differential Revision: D27047223
Pulled By: janeyx99
fbshipit-source-id: 824b20009c0bb707aa5361de445cdec795d5e3f1
Summary:
We want to store the file names that triggers each test suite so that we can use this data for categorizing those test files.
~~After considering several solutions, this one is the most backwards compatible, and the current test cases in test_testing.py for print test stats don't break.~~
The previous plan did not work, as there are multiple Python test jobs that spawn the same suites. Instead, the new S3 format will store test files (e.g., `test_nn` and `distributed/test_distributed_fork`) which will contain the suites they spawn, which will contain the test cases run within the suite. (Currently, there is no top layer of test files.)
Because of this major structural change, a lot of changes have now been made (thank you samestep!) to test_history.py and print_test_stats.py to make this new format backwards compatible.
Old test plan:
Make sure that the data is as expected in S3 after https://github.com/pytorch/pytorch/pull/52873 finishes.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52869
Test Plan: Added tests to test_testing.py which pass, and CI.
Reviewed By: samestep
Differential Revision: D26672561
Pulled By: janeyx99
fbshipit-source-id: f46b91e16c1d9de5e0cb9bfa648b6448d979257e
Summary:
Take 2 of https://github.com/pytorch/pytorch/issues/50914
This change moves the early termination logic into common_utils.TestCase class.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52126
Test Plan: CI with ci-all tag
Reviewed By: malfet
Differential Revision: D26391762
Pulled By: walterddr
fbshipit-source-id: a149ecc47ccda7f2795e107fb95915506ae060b4
Summary:
This fixes an issue (currently blocking https://github.com/pytorch/pytorch/issues/51905) where the test time regression reporting step will fail if none of the most recent `master` ancestors have any reports in S3 (e.g. if a new job is added).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52054
Test Plan:
```
python test/test_testing.py
```
Reviewed By: walterddr
Differential Revision: D26369507
Pulled By: samestep
fbshipit-source-id: 4c4e1e290cb943ce8fcdadacbf51d66b31c3262a
Summary:
This is a follow up on https://github.com/pytorch/pytorch/issues/49869.
Previously CUDA early termination only happens for generic test classes that extends from `DeviceTypeTestBase`. However, JIT test cases which extends from common_utils.TestCase cannot benefit from the early termination.
This change moves the early termination logic into common_utils.TestCase class.
- all tests extended from common_utils.TestCase now should early terminate if CUDA assert occurs.
- For TestCases that extends from common_device_type.DeviceTypeTestBase, still only do torch.cuda.synchronize() when RTE is thrown.
- For TestCases extends common_utils.TestCase, regardless of whether a test case uses GPU or not, it will always synchronize CUDA as long as `torch.cuda.is_initialize()` returns true.
- Disabling this on common_distributed.py
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50914
Reviewed By: malfet
Differential Revision: D26019289
Pulled By: walterddr
fbshipit-source-id: ddc7c1c0d00db4d073a6c8bc5b7733637a7e77d1
Summary:
This is a followup to https://github.com/pytorch/pytorch/issues/49190. Vaguely speaking, the goals are to make it easy to identify test time regressions introduced by PRs. Eventually the hope is to use this information to edit Dr CI comments, but this particular PR just does the analysis and prints it to stdout, so a followup PR would be needed to edit the actual comments on GitHub.
**Important:** for uninteresting reasons, this PR moves the `print_test_stats.py` file.
- *Before:* `test/print_test_stats.py`
- *After:* `torch/testing/_internal/print_test_stats.py`
Notes on the approach:
- Just getting the mean and stdev for the total job time of the last _N_ commits isn't sufficient, because e.g. if `master` was broken 5 commits ago, then a lot of those job times will be much shorter, breaking the statistics.
- We use the commit history to make better estimates for the mean and stdev of individual test (and suite) times, but only when the test in that historical commit is present and its status matches that of the base commit.
- We list all the tests that were removed or added, or whose status changed (e.g. skipped to not skipped, or vice versa), along with time (estimate) info for that test case and its containing suite.
- We don't list tests whose time changed a lot if their status didn't change, because there's a lot of noise and it's unclear how to do that well without too many false positives.
- We show a human-readable commit graph that indicates exactly how many commits are in the pool of commits that could be causing regressions (e.g. if a PR has multiple commits in it, or if the base commit on `master` doesn't have a report in S3).
- We don't show an overall estimate of whether the PR increased or decreased the total test job time, because it's noisy and it's a bit tricky to aggregate stdevs up from individual tests to the whole job level. This might change in a followup PR.
- Instead, we simply show a summary at the bottom which says how many tests were removed/added/modified (where "modified" means that the status changed), and our best estimates of the mean times (and stdevs) of those changes.
- Importantly, the summary at the bottom is only for the test cases that were already shown in the more verbose diff report, and does not include any information about tests whose status didn't change but whose running time got much longer.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50171
Test Plan:
To run the unit tests:
```
$ python test/test_testing.py
$ python test/print_test_stats.py
```
To verify that this works, check the [CircleCI logs](https://app.circleci.com/pipelines/github/pytorch/pytorch/258628/workflows/9cfadc34-e042-485e-b3b3-dc251f160307) for a test job run on this PR; for example:
- pytorch_linux_bionic_py3_6_clang9_test
To test locally, use the following steps.
First run an arbitrary test suite (you need to have some XML reports so that `test/print_test_stats.py` runs, but we'll be ignoring them here via the `--use-json` CLI option):
```
$ DATA_DIR=/tmp
$ ARBITRARY_TEST=testing
$ python test/test_$ARBITRARY_TEST.py --save-xml=$DATA_DIR/test/test_$ARBITRARY_TEST
```
Now choose a commit and a test job (it has to be on `master` since we're going to grab the test time data from S3, and [we only upload test times to S3 on the `master`, `nightly`, and `release` branches](https://github.com/pytorch/pytorch/pull/49645)):
```
$ export CIRCLE_SHA1=c39fb9771d89632c5c3a163d3c00af3bef1bd489
$ export CIRCLE_JOB=pytorch_linux_bionic_py3_6_clang9_test
```
Download the `*.json.bz2` file(s) for that commit/job pair:
```
$ aws s3 cp s3://ossci-metrics/test_time/$CIRCLE_SHA1/$CIRCLE_JOB/ $DATA_DIR/ossci-metrics/test_time/$CIRCLE_SHA1/$CIRCLE_JOB --recursive
```
And feed everything into `test/print_test_stats.py`:
```
$ bzip2 -kdc $DATA_DIR/ossci-metrics/test_time/$CIRCLE_SHA1/$CIRCLE_JOB/*Z.json.bz2 | torch/testing/_internal/print_test_stats.py --compare-with-s3 --use-json=/dev/stdin $DATA_DIR/test/test_$ARBITRARY_TEST
```
The first part of the output should be the same as before this PR; here is the new part, at the end of the output:
- https://pastebin.com/Jj1svhAn
Reviewed By: malfet, izdeby
Differential Revision: D26317769
Pulled By: samestep
fbshipit-source-id: 1ba06cec0fafac77f9e7341d57079543052d73db
Summary:
This is a followup to https://github.com/pytorch/pytorch/issues/49190. Vaguely speaking, the goals are to make it easy to identify test time regressions introduced by PRs. Eventually the hope is to use this information to edit Dr CI comments, but this particular PR just does the analysis and prints it to stdout, so a followup PR would be needed to edit the actual comments on GitHub.
**Important:** for uninteresting reasons, this PR moves the `print_test_stats.py` file.
- *Before:* `test/print_test_stats.py`
- *After:* `torch/testing/_internal/print_test_stats.py`
Notes on the approach:
- Just getting the mean and stdev for the total job time of the last _N_ commits isn't sufficient, because e.g. if `master` was broken 5 commits ago, then a lot of those job times will be much shorter, breaking the statistics.
- We use the commit history to make better estimates for the mean and stdev of individual test (and suite) times, but only when the test in that historical commit is present and its status matches that of the base commit.
- We list all the tests that were removed or added, or whose status changed (e.g. skipped to not skipped, or vice versa), along with time (estimate) info for that test case and its containing suite.
- We don't list tests whose time changed a lot if their status didn't change, because there's a lot of noise and it's unclear how to do that well without too many false positives.
- We show a human-readable commit graph that indicates exactly how many commits are in the pool of commits that could be causing regressions (e.g. if a PR has multiple commits in it, or if the base commit on `master` doesn't have a report in S3).
- We don't show an overall estimate of whether the PR increased or decreased the total test job time, because it's noisy and it's a bit tricky to aggregate stdevs up from individual tests to the whole job level. This might change in a followup PR.
- Instead, we simply show a summary at the bottom which says how many tests were removed/added/modified (where "modified" means that the status changed), and our best estimates of the mean times (and stdevs) of those changes.
- Importantly, the summary at the bottom is only for the test cases that were already shown in the more verbose diff report, and does not include any information about tests whose status didn't change but whose running time got much longer.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50171
Test Plan:
To run the unit tests:
```
$ python test/test_testing.py
$ python test/print_test_stats.py
```
To verify that this works, check the [CircleCI logs](https://app.circleci.com/pipelines/github/pytorch/pytorch/258628/workflows/9cfadc34-e042-485e-b3b3-dc251f160307) for a test job run on this PR; for example:
- pytorch_linux_bionic_py3_6_clang9_test
To test locally, use the following steps.
First run an arbitrary test suite (you need to have some XML reports so that `test/print_test_stats.py` runs, but we'll be ignoring them here via the `--use-json` CLI option):
```
$ DATA_DIR=/tmp
$ ARBITRARY_TEST=testing
$ python test/test_$ARBITRARY_TEST.py --save-xml=$DATA_DIR/test/test_$ARBITRARY_TEST
```
Now choose a commit and a test job (it has to be on `master` since we're going to grab the test time data from S3, and [we only upload test times to S3 on the `master`, `nightly`, and `release` branches](https://github.com/pytorch/pytorch/pull/49645)):
```
$ export CIRCLE_SHA1=c39fb9771d89632c5c3a163d3c00af3bef1bd489
$ export CIRCLE_JOB=pytorch_linux_bionic_py3_6_clang9_test
```
Download the `*.json.bz2` file(s) for that commit/job pair:
```
$ aws s3 cp s3://ossci-metrics/test_time/$CIRCLE_SHA1/$CIRCLE_JOB/ $DATA_DIR/ossci-metrics/test_time/$CIRCLE_SHA1/$CIRCLE_JOB --recursive
```
And feed everything into `test/print_test_stats.py`:
```
$ bzip2 -kdc $DATA_DIR/ossci-metrics/test_time/$CIRCLE_SHA1/$CIRCLE_JOB/*Z.json.bz2 | torch/testing/_internal/print_test_stats.py --compare-with-s3 --use-json=/dev/stdin $DATA_DIR/test/test_$ARBITRARY_TEST
```
The first part of the output should be the same as before this PR; here is the new part, at the end of the output:
- https://pastebin.com/Jj1svhAn
Reviewed By: walterddr
Differential Revision: D26232345
Pulled By: samestep
fbshipit-source-id: b687b1737519d2eed68fbd591a667e4e029de509
Summary:
Closes https://github.com/pytorch/pytorch/issues/50513 by resolving all four checkboxes. If this PR is merged, I will also modify one or both of the following wiki pages to add instructions on how to use this `mypy` wrapper for VS Code editor integration:
- [Guide for adding type annotations to PyTorch](https://github.com/pytorch/pytorch/wiki/Guide-for-adding-type-annotations-to-PyTorch)
- [Lint as you type](https://github.com/pytorch/pytorch/wiki/Lint-as-you-type)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50826
Test Plan:
Unit tests for globbing function:
```
python test/test_testing.py TestMypyWrapper -v
```
Manual checks:
- Uninstall `mypy` and run `python test/test_type_hints.py` to verify that it still works when `mypy` is absent.
- Reinstall `mypy` and run `python test/test_type_hints.py` to verify that this didn't break the `TestTypeHints` suite.
- Run `python test/test_type_hints.py` again (should finish quickly) to verify that this didn't break `mypy` caching.
- Run `torch/testing/_internal/mypy_wrapper.py` on a few Python files in this repo to verify that it doesn't give any additional warnings when the `TestTypeHints` suite passes. Some examples (compare with the behavior of just running `mypy` on these files):
```sh
torch/testing/_internal/mypy_wrapper.py $PWD/README.md
torch/testing/_internal/mypy_wrapper.py $PWD/tools/fast_nvcc/fast_nvcc.py
torch/testing/_internal/mypy_wrapper.py $PWD/test/test_type_hints.py
torch/testing/_internal/mypy_wrapper.py $PWD/torch/random.py
torch/testing/_internal/mypy_wrapper.py $PWD/torch/testing/_internal/mypy_wrapper.py
```
- Remove type hints from `torch.testing._internal.mypy_wrapper` and verify that running `mypy_wrapper.py` on that file gives type errors.
- Remove the path to `mypy_wrapper.py` from the `files` setting in `mypy-strict.ini` and verify that running it again on itself no longer gives type errors.
- Add `test/test_type_hints.py` to the `files` setting in `mypy-strict.ini` and verify that running the `mypy` wrapper on it again now gives type errors.
- Change a return type in `torch/random.py` and verify that running the `mypy` wrapper on it again now gives type errors.
- Add the suggested JSON from the docstring of `torch.testing._internal.mypy_wrapper.main` to your `.vscode/settings.json` and verify that VS Code gives the same results (inline, while editing any Python file in the repo) as running the `mypy` wrapper on the command line, in all the above cases.
Reviewed By: walterddr
Differential Revision: D26049052
Pulled By: samestep
fbshipit-source-id: 0b35162fc78976452b5ea20d4ab63937b3c7695d
Summary:
Closes https://github.com/pytorch/pytorch/issues/50513 by resolving the first three checkboxes. If this PR is merged, I will also modify one or both of the following wiki pages to add instructions on how to use this `mypy` wrapper for VS Code editor integration:
- [Guide for adding type annotations to PyTorch](https://github.com/pytorch/pytorch/wiki/Guide-for-adding-type-annotations-to-PyTorch)
- [Lint as you type](https://github.com/pytorch/pytorch/wiki/Lint-as-you-type)
The test plan below is fairly manual, so let me know if I should add more automated tests to this PR.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50826
Test Plan:
Unit tests for globbing function:
```
python test/test_testing.py TestMypyWrapper -v
```
Manual checks:
- Uninstall `mypy` and run `python test/test_type_hints.py` to verify that it still works when `mypy` is absent.
- Reinstall `mypy` and run `python test/test_type_hints.py` to verify that this didn't break the `TestTypeHints` suite.
- Run `python test/test_type_hints.py` again (should finish quickly) to verify that this didn't break `mypy` caching.
- Run `torch/testing/_internal/mypy_wrapper.py` on a few Python files in this repo to verify that it doesn't give any additional warnings when the `TestTypeHints` suite passes. Some examples (compare with the behavior of just running `mypy` on these files):
```sh
torch/testing/_internal/mypy_wrapper.py README.md
torch/testing/_internal/mypy_wrapper.py tools/fast_nvcc/fast_nvcc.py
torch/testing/_internal/mypy_wrapper.py test/test_type_hints.py
torch/testing/_internal/mypy_wrapper.py torch/random.py
torch/testing/_internal/mypy_wrapper.py torch/testing/_internal/mypy_wrapper.py
```
- Remove type hints from `torch.testing._internal.mypy_wrapper` and verify that running `mypy_wrapper.py` on that file gives type errors.
- Remove the path to `mypy_wrapper.py` from the `files` setting in `mypy-strict.ini` and verify that running it again on itself no longer gives type errors.
- Add `test/test_type_hints.py` to the `files` setting in `mypy-strict.ini` and verify that running the `mypy` wrapper on it again now gives type errors.
- Remove type hints from `torch/random.py` and verify that running the `mypy` wrapper on it again now gives type errors.
- Add the suggested JSON from the docstring of `torch.testing._internal.mypy_wrapper.main` to your `.vscode/settings.json` and verify that VS Code gives the same results (inline, while editing any Python file in the repo) as running the `mypy` wrapper on the command line, in all the above cases.
Reviewed By: glaringlee, walterddr
Differential Revision: D25977352
Pulled By: samestep
fbshipit-source-id: 4b3a5e8a9071fcad65a19f193bf3dc7dc3ba1b96
Summary:
This is follow up on https://github.com/pytorch/pytorch/issues/49799.
* uses `torch.cuda.synchronize()` to validate CUDA assert instead of inspecting error message.
* remove non CUDA tests.
hopefully can reproduce why slow_tests fails but not normal test. since the test still runs for >1min.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49869
Reviewed By: mruberry
Differential Revision: D25714385
Pulled By: walterddr
fbshipit-source-id: 04f8ccb50d8c9ee42826a216c49baf90285b247f
Summary:
this is a reland of https://github.com/pytorch/pytorch/issues/49527.
fixed slow test not running properly in py36 because capture_output is introduced in py37.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49799
Reviewed By: janeyx99
Differential Revision: D25692616
Pulled By: walterddr
fbshipit-source-id: 9c5352220d632ec8d7464e5f162ffb468a0f30df
Summary:
Fixes https://github.com/pytorch/pytorch/issues/49019
I marked the test_testing function as slow since it took ~1 minute to finish the subprocess test suite.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49527
Reviewed By: malfet
Differential Revision: D25623219
Pulled By: walterddr
fbshipit-source-id: 1b414623ecce14aace5e0996d5e4768a40e12e06
Summary:
should fixes https://github.com/pytorch/pytorch/issues/48879.
To test the effect of the messages: make test break, such as add `self.assertEqual(1, 2, "user_msg")` to any test
* Before:
```
AssertionError: False is not true : user_msg
```
* After
```
AssertionError: False is not true : Scalars failed to compare as equal! Comparing 1 and 2 gives a difference of 1, but the allowed difference with rtol=0 and atol=0 is only 0!
user_msg;
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48935
Reviewed By: samestep
Differential Revision: D25382153
Pulled By: walterddr
fbshipit-source-id: 95633a9f664f4b05a28801786b12a10bd21ff431
Summary:
Creates multiple new test suites to have fewer tests in test_torch.py, consistent with previous test suite creation like test_unary_ufuncs.py and test_linalg.py.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47356
Reviewed By: ngimel
Differential Revision: D25202268
Pulled By: mruberry
fbshipit-source-id: 75fde3ca76545d1b32b86d432a5cb7a5ba8f5bb6