381 Commits

Author SHA1 Message Date
Catherine Lee
eea0733045 Reduce pytest blocklist (#96016)
`TestCase = object` or variations of it get switched to `TestCase = NoTest`.

unittest collects tests based on subclassing unittest.TestCase, so setting TestCase = object removes it from unittest test collection.  pytest collects based on name (https://docs.pytest.org/en/7.1.x/reference/reference.html#confval-python_classes) but can be told to ignore a class (bottom of https://docs.pytest.org/en/7.1.x/example/pythoncollection.html#changing-naming-conventions)
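
A minimal sketch of the difference between the two collectors. It assumes a placeholder class named `NoTest` that carries pytest's `__test__ = False` opt-out; the class body and the `DEPENDENCY_AVAILABLE` guard are illustrative, not copied from the PR:

```python
import unittest


class NoTest:
    # pytest does not collect a class whose `__test__` attribute is False.
    __test__ = False


# Hypothetical guard: pretend an optional dependency is missing.
DEPENDENCY_AVAILABLE = False
TestCase = unittest.TestCase if DEPENDENCY_AVAILABLE else NoTest


class TestFeature(TestCase):
    def test_something(self):
        self.assertTrue(True)


# unittest only collects subclasses of unittest.TestCase, so this class is ignored.
is_unittest_case = issubclass(TestFeature, unittest.TestCase)
# pytest would match the `Test*` name, but the inherited flag opts the class out.
pytest_opt_out = TestFeature.__test__ is False
print(is_unittest_case, pytest_opt_out)
```

With `TestCase = object` only the unittest side would be covered, since pytest would still pick the class up by name.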
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96016
Approved by: https://github.com/ZainRizvi, https://github.com/huydhn
2023-03-07 18:30:27 +00:00
Catherine Lee
d21577f28c Run more tests through pytest (#95844)
Run more tests through pytest.

Use a block list for tests that shouldn't run through pytest.  As far as I can tell, the number of tests run, skipped, and xfailed for those not on the blocklist are the same.

Regarding the main module:

Usually, when tests are run in CI, we call `python <test file>`, which causes the file to be imported under the module name `__main__`.  However, pytest looks for the module under its file name, so the file will be reimported.  This can cause issues for tests that run module level code and change global state, like test_nn, which modifies lists imported from another file, or tests in test/lazy, which initialize a backend that cannot coexist with a second copy of itself.

My workaround for this is to run tests from the `__main__` module.  However, this results in pytest being unable to rewrite assertions (and possibly other things but I don't know what other things pytest does right now).  A better solution might be to call `pytest <test file>` directly and move all the code in run_tests(argv) to be module level code or put it in a hook in conftest.py.
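
The double import can be reproduced with the standard library alone. This is only a sketch: `shared_registry` and `test_example.py` are made-up names standing in for a shared module and a test file, and `runpy`/`importlib` stand in for `python <test file>` and pytest's import step:

```python
import importlib.util
import runpy
import sys
import tempfile
import types
from pathlib import Path

# Hypothetical shared module whose state the test file mutates at import time.
registry = types.ModuleType("shared_registry")
registry.entries = []
sys.modules["shared_registry"] = registry

TEST_FILE_SOURCE = (
    "import shared_registry\n"
    "shared_registry.entries.append(__name__)\n"
)

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "test_example.py"
    path.write_text(TEST_FILE_SOURCE)

    # `python test_example.py` executes the file as the `__main__` module.
    runpy.run_path(str(path), run_name="__main__")

    # A collector that imports the file under its own name runs the code again.
    spec = importlib.util.spec_from_file_location("test_example", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)

# The module-level side effect happened twice, once per module name.
print(registry.entries)
```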
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95844
Approved by: https://github.com/huydhn
2023-03-03 17:32:26 +00:00
Sergii Dymchenko
35bf5bac26 Fix "sandcastle_skip_if decorator name is confusing" (#95649)
Fixes https://github.com/pytorch/pytorch/issues/89473
See the issue https://github.com/pytorch/pytorch/issues/89473

Pull Request resolved: https://github.com/pytorch/pytorch/pull/95649
Approved by: https://github.com/atalman, https://github.com/malfet
2023-03-03 09:29:40 +00:00
Bin Bao
9835c93aba [CI] Change the way tests are triggered with dynamo and inductor (#94539)
Summary: Currently running PyTorch tests with dynamo and inductor is
controlled by environment variables, and CI sets them based on test
config name matching. Change them to use options of run_test.py.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94539
Approved by: https://github.com/huydhn
2023-03-01 13:06:23 +00:00
puririshi98
8aa34602f7 Jetson Update for CI Redo (#94549)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94549
Approved by: https://github.com/ezyang, https://github.com/malfet
2023-02-21 17:13:38 +00:00
William Wen
5cdedab0cc Raise error if torch.compile is called from windows or py 3.11 (#94940)
For https://github.com/pytorch/pytorch/issues/94914

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94940
Approved by: https://github.com/albanD
2023-02-16 23:34:52 +00:00
Xuehai Pan
b005ec62b9 [BE] Remove dependency on six and future (#94709)
Remove the Python 2 and 3 compatibility library [six](https://pypi.org/project/six) and [future](https://pypi.org/project/future) and `torch._six`. We only support Python 3.8+ now. It's time to retire them.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94709
Approved by: https://github.com/malfet, https://github.com/Skylion007
2023-02-14 09:14:14 +00:00
Alexander Grund
a0d1dbc446 Fix pytest arguments when --save-xml is not passed (#94589)
The expression `argv + [f'--junit-xml-reruns={test_report_path}'] if TEST_SAVE_XML else []` evaluates to the empty list when `TEST_SAVE_XML` is false and would need parentheses.

Instead simplify the code by appending the argument when required directly where `test_report_path` is set.
Note that `.append()` must not be used, as that would modify `argv` and in turn `UNITTEST_ARGS`, which might have undesired side effects.

Without this patch, `pytest.main()` would be called with an empty argument list, which makes pytest try to discover all tests in the current working directory and ultimately leads to (many) failures.
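
The precedence issue is easy to reproduce in isolation. In this sketch the values of `argv` and `test_report_path` are made up for illustration:

```python
TEST_SAVE_XML = False
argv = ["test_example.py", "-v"]
test_report_path = "test-reports/example.xml"  # hypothetical path

# Parsed as `(argv + [...]) if TEST_SAVE_XML else []`, so argv is dropped entirely.
buggy_args = argv + [f"--junit-xml-reruns={test_report_path}"] if TEST_SAVE_XML else []

# Parenthesizing the conditional keeps argv in both branches.
fixed_args = argv + ([f"--junit-xml-reruns={test_report_path}"] if TEST_SAVE_XML else [])

# `+` builds a new list, so the original argv is left untouched (unlike .append()).
extended = argv + ["--extra-flag"]

print(buggy_args)
print(fixed_args)
print(argv)
```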

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94589
Approved by: https://github.com/clee2000, https://github.com/Neilblaze
2023-02-13 22:19:51 +00:00
Aaron Gokaslan
3d82d8d0ed [BE] Enable more flake8-comprehensions checks (#94601)
I applied some flake8 fixes and enabled checking for them in the linter. I also enabled some checks for my previous comprehensions PR.

This is a follow up to #94323 where I enable the flake8 checkers for the fixes I made and fix a few more of them.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94601
Approved by: https://github.com/ezyang
2023-02-10 23:40:29 +00:00
Jeff Daily
66bfcd32fd [ROCm] Remove PYTORCH_MIOPEN_SUGGEST_NHWC flag (#90725)
Fixes #64427.  MIOpen supports ChannelsLast.  No longer need to opt-in with env var.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90725
Approved by: https://github.com/malfet
2023-02-09 22:26:24 +00:00
Xuehai Pan
a229b4526f [BE] Prefer dash over underscore in command-line options (#94505)
Preferring dash over underscore in command-line options. Add `--command-arg-name` to the argument parser. The old arguments with underscores `--command_arg_name` are kept for backward compatibility.

Both dashes and underscores are used in the PyTorch codebase. Some argument parsers only have dashes or only have underscores in arguments. For example, the `torchrun` utility for distributed training only accepts underscore arguments (e.g., `--master_port`). The dashes are more common in other command-line tools. And it looks to be the default choice in the Python standard library:

`argparse.BooleanOptionalAction`: 4a9dff0e5a/Lib/argparse.py (L893-L895)

```python
class BooleanOptionalAction(Action):
    def __init__(...):
            if option_string.startswith('--'):
                option_string = '--no-' + option_string[2:]
                _option_strings.append(option_string)
```

It adds `--no-argname`, not `--no_argname`. Also, typing `_` requires an extra key press (Shift or Caps Lock) compared with `-`.
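
One way to accept both spellings with `argparse` is to register them as aliases of the same option. This is a sketch of the pattern using the placeholder option name from above, not necessarily how each parser in the codebase was changed:

```python
import argparse

parser = argparse.ArgumentParser()
# Both spellings are registered as option strings for the same destination.
parser.add_argument("--command-arg-name", "--command_arg_name", default=None)

new_style = parser.parse_args(["--command-arg-name", "value"])
old_style = parser.parse_args(["--command_arg_name", "value"])

# argparse derives the attribute name from the first long option,
# replacing dashes with underscores.
print(new_style.command_arg_name, old_style.command_arg_name)
```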

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94505
Approved by: https://github.com/ezyang, https://github.com/seemethere
2023-02-09 20:16:49 +00:00
Philip Meier
6f543e0d0a add not_close_error_metas for internal comparison machinery (#90004)
While discussing a possible addition of `assert_not_close` to the API (See #90005 later in the stack), it became clear that we should have an intermediate function that returns a bool-ish value that one can assert on. This PR introduces this function as `are_equal` as replacement for `assert_equal`. Interface is the same, but instead of raising in case a comparison failed, we return the `ErrorMeta`'s of all failures and leave it to the caller to handle. Note that this only applies to errors raised during the comparison stage. Everything else, e.g. only setting `atol` *or* `rtol`, will raise just as before.

We decided to keep this private for now unless there is user demand. The largest issue that needs to be solved before this can become public is the return type: if we have something like `torch.testing.are_close` we are targeting two use cases:

1. Using it to branch inside code like `if are_close(...):`
2. Using it to assert closeness inside a test like `assert are_close(...)`. This is the default way to assert something with `pytest`

To do that, the return type has to be bool-ish, i.e. being an instance of `bool` or implementing `__bool__`. Plus, `bool(are_close())` needs to be `True` if the inputs are close and `False` otherwise. The current logic of `are_close` satisfies the former, but violates the latter. In case everything is close, we return an empty list, but `bool([])` is `False`.
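
The inverted truthiness can be seen with a toy comparison. `mismatch_messages` is a hypothetical stand-in that returns plain strings; it is not the internal function and does not use `ErrorMeta`:

```python
from typing import List


def mismatch_messages(actual: List[float], expected: List[float], atol: float = 1e-8) -> List[str]:
    """Hypothetical stand-in: return one message per mismatching element."""
    return [
        f"index {i}: {a} != {e}"
        for i, (a, e) in enumerate(zip(actual, expected))
        if abs(a - e) > atol
    ]


close_result = mismatch_messages([1.0, 2.0], [1.0, 2.0])
far_result = mismatch_messages([1.0, 2.0], [1.0, 3.0])

# An empty list means "everything is close", yet it is falsy ...
print(bool(close_result))
# ... while a list of failures is truthy, the opposite of what `assert are_close(...)` needs.
print(bool(far_result))
# Callers of a list-returning helper therefore have to invert the check themselves.
assert not close_result
```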

Directly using an instance of `bool` would work for the requirements above, but then we would have no option to add diagnostics to the error. Meaning `assert are_close()` would work, but would be non-descriptive.

Using `Tuple[bool, str]` would work in general, but is quite dangerous and unexpected: since all non-empty tuples evaluate to `True`, this can easily hide bugs if the user is not super careful:

```pycon
>>> close = (False, "error message with diagnostics")
>>> assert close[0], close[1]
AssertionError: error message with diagnostics
>>> assert close
```

One possible solution here would be a thin custom object:

```py
class Close:
    def __init__(self, flag: bool, msg: str = "") -> None:
        self._flag = flag
        self._msg = msg

    def __bool__(self):
        return self._flag

    def __str__(self):
        return self._msg
```

Now we can do something like

```pycon
>>> close = Close(False, "error message with diagnostics")  # coming from are_close
>>> if not close:
...     print("It works!")
It works!
>>> assert close
AssertionError
>>> assert close, close  # This looks weird, but does its job
AssertionError: error message with diagnostics
```

But this means we introduce another abstraction that the user has to deal with.

To reiterate, we are not going to make `are_close` public until there is user demand, since none of the options above is without flaws.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90004
Approved by: https://github.com/mruberry, https://github.com/malfet
2023-02-08 11:22:55 +00:00
Philip Meier
566eb49ed2 minor internal cleanup in assert_close (#90003)
Per title. I'm going to highlight them with inline comments.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90003
Approved by: https://github.com/mruberry, https://github.com/malfet
2023-02-08 11:22:55 +00:00
Aaron Gokaslan
8fce9a09cd [BE]: pyupgrade Python to 3.8 - imports and object inheritance only (#94308)
Apply parts of pyupgrade to torch (starting with the safest changes).
This PR only does two things: removes the need to inherit from object and removes unused future imports.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94308
Approved by: https://github.com/ezyang, https://github.com/albanD
2023-02-07 21:10:56 +00:00
albanD
0b2dc3b3ac [Py-3.11] Skip dynamo related tests (#94187)
The quantization test fails to import Dynamo as expected.
The traceback tool looks a lot more tricky, opened https://github.com/pytorch/pytorch/issues/94189 to investigate further.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94187
Approved by: https://github.com/malfet
2023-02-07 16:40:55 +00:00
Nikita Shulga
5976f0bdfe Set min supported Python version to 3.8 (#93155)
Also greps for `if sys.version_info .cond. (3, 8)` checks and replaces them with the appropriate action.

This is the last in a series of PRs that moved CI/CD away from testing PyTorch behavior against Python-3.7.

Fixes https://github.com/pytorch/pytorch/issues/80513

Pull Request resolved: https://github.com/pytorch/pytorch/pull/93155
Approved by: https://github.com/huydhn
2023-01-29 18:28:46 +00:00
Pearu Peterson
b3e4f5029b Add check-sparse-tensor-invariants flag to Context - 2nd try. (#92094)
This PR is a copy of https://github.com/pytorch/pytorch/pull/90849, whose merge was reverted.

The PR adds "check sparse tensor invariants" flag to Context that when enabled will trigger sparse tensor data invariants checks in unsafe methods of constructing sparse COO/CSR/CSC/BSR/BSC tensors. The feature includes the following changes to UI:

`torch.sparse.check_sparse_tensor_invariants` class provides different ways to enable/disable the invariant checking.

`torch.sparse_coo/csr/csc/bsr/bsc/compressed_tensor` functions have a new optional argument `check_invariants` to enable/disable the invariant checks explicitly. When the `check_invariants` argument is specified, the global state of the feature is temporarily overridden.

The PR fixes https://github.com/pytorch/pytorch/issues/90833

Pull Request resolved: https://github.com/pytorch/pytorch/pull/92094
Approved by: https://github.com/cpuhrsch
2023-01-13 14:50:33 +00:00
Edward Z. Yang
7078ad5b8c Reland "AOT Autograd refactor + cleanup, handle intermediate views of bases, use view replay, fix non-tensor input handling" (#92076)
Original PR: https://github.com/pytorch/pytorch/pull/89532

Pull Request resolved: https://github.com/pytorch/pytorch/pull/92076
Approved by: https://github.com/janeyx99, https://github.com/albanD
2023-01-12 21:32:05 +00:00
Eddie Yan
e096d2db5a [BC-Breaking] Separate stream_id, device_index, and device_type in pack and unpack for Streams (#81596)
#75854

A naive attempt at working around the limitations of using a single 64-bit integer to pack `stream_id`, `device_index`, and `device_type`.

Still needs sanity checks, testing, and minimization of BC-breaking changes.

Currently a Holder for the `StreamData3` struct is used for `IValue` compatibility. While doing this seems to work for `ivalue.h` and `ivalue_inl.h`, this doesn't seem to be naively working for the JIT CUDA stream wrapper? (Something about ambiguous calls if an `intrusive_ptr` to `c10::ivalue::StreamData3Holder` is used as the return type for `pack()`.) It turns out that the methods required to access the fields for rematerializing a CUDA Stream are basically already present anyway, so `pack` is simply removed in the wrapper for now and the methods to access the required fields are called directly.

CC @ptrblck

Pull Request resolved: https://github.com/pytorch/pytorch/pull/81596
Approved by: https://github.com/ezyang
2023-01-12 14:16:49 +00:00
PyTorch MergeBot
c7a22bb7c7 Revert "Add check-sparse-tensor-invariants flag to Context. (#90849)"
This reverts commit b9a035c1c5.

Reverted https://github.com/pytorch/pytorch/pull/90849 on behalf of https://github.com/DanilBaibak due to Break internal build
2023-01-12 09:58:16 +00:00
Aleksandar Samardžić
8612ec5b90 Implement hybrid sparse to/from dense conversions. (#90177)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90177
Approved by: https://github.com/cpuhrsch, https://github.com/pearu
2023-01-12 03:31:30 +00:00
Pearu Peterson
b9a035c1c5 Add check-sparse-tensor-invariants flag to Context. (#90849)
This PR adds "check sparse tensor invariants" flag to Context that when enabled will trigger sparse tensor data invariants checks in unsafe methods of constructing sparse COO/CSR/CSC/BSR/BSC tensors. The feature includes the following changes to UI:

- `torch.enable_check_sparse_tensor_invariants` and `torch.is_check_sparse_tensor_invariants_enabled` functions to globally enable/disable the invariant checks and to retrieve the state of the feature, respectively
- `torch.sparse_coo/csr/csc/bsr/bsc/compressed_tensor` functions have a new optional argument `check_invariants` to enable/disable the invariant checks explicitly. When the `check_invariants` argument is specified, the global state of the feature is temporarily overridden.

The PR also fixes https://github.com/pytorch/pytorch/issues/90833

# Main issue

*The following content is outdated after merging the PRs in this ghstack but kept for the record.*

The importance of this feature is that when enabling the invariants checks by default, say, via

<details>

```
$ git diff
diff --git a/torch/__init__.py b/torch/__init__.py
index c8543057c7..19a91d0482 100644
--- a/torch/__init__.py
+++ b/torch/__init__.py
@@ -1239,3 +1239,8 @@ if 'TORCH_CUDA_SANITIZER' in os.environ:

 # Populate magic methods on SymInt and SymFloat
 import torch.fx.experimental.symbolic_shapes
+
+# temporarily enable sparse tensor arguments validation in unsafe
+# constructors:
+
+torch._C._set_check_sparse_tensor_invariants(True)
```

</details>

a massive number of test failures/errors occur in test_sparse_csr.py tests:
```
$ pytest -sv test/test_sparse_csr.py
<snip>
==== 4293 failed, 1557 passed, 237 skipped, 2744 errors in 69.71s (0:01:09) ====
```
that means that we are silently constructing sparse compressed tensors that do not satisfy the sparse tensor invariants. In particular, the following errors are raised:

```
AssertionError: "resize_as_sparse_compressed_tensor_: self and src must have the same layout" does not match "expected values to be a strided and contiguous tensor"

RuntimeError: CUDA error: device-side assert triggered

RuntimeError: `col_indices[..., crow_indices[..., i - 1]:crow_indices[..., i]] for all i = 1, ..., nrows are sorted and distinct along the last dimension values` is not satisfied.

RuntimeError: expected col_indices to be a strided and contiguous tensor

RuntimeError: expected row_indices to be a strided and contiguous tensor

RuntimeError: expected values to be a strided and contiguous tensor

RuntimeError: for_each: failed to synchronize: cudaErrorAssert: device-side assert triggered

RuntimeError: tensor dimensionality must be sum of batch, base, and dense dimensionalities (=0 + 2 + 0) but got 3
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90849
Approved by: https://github.com/amjames, https://github.com/cpuhrsch
2023-01-11 01:05:14 +00:00
Joel Schlosser
1effabe257 Support per-parameter test decoration (#91658)
Continuation of #79979.

Fixes #79161

This PR does the following:
* Expands the `parametrize_fn()` signature from returning a 3-tuple of `(test, test_name, param_kwargs)` to returning a 4-tuple of `(test, test_name, param_kwargs, decorator_fn)`. Expected signature for the addition is `decorator_fn(param_kwargs) -> List[decorator]` i.e. given the full set of test params, return a list of decorators to apply.
    * `modules`, `ops`, and `parametrize` now fit the new signature, returning `decorator_fn`s instead of applying decorators themselves.
    * `instantiate_parametrized_tests()` and `instantiate_device_type_tests()` now call the returned `decorator_fn`, passing in the full set of `param_kwargs` (after composition + `device` / `dtype` additions) and applying the returned decorators.
    * Composing multiple `parametrize_fn`s also composes the corresponding `decorator_fn`s; the composed `decorator_fn` simply concatenates the decorator lists returned by the constituents.
* Expands `DecorateInfo.is_active` to support callables:
```python
DecorateInfo(
    unittest.expectedFailure, "TestOps", "test_python_ref_executor",
    device_type='cuda', active_if=lambda params: params['executor'] == 'nvfuser'
),
```
* Adds several tests to `test/test_testing.py` ensuring proper decoration using `@parametrize`, `@modules`, and `@ops`.
* (minor) Fixes a couple `ModuleInfo` naming oddities uncovered during testing.
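
A rough sketch of how composing `decorator_fn`s by concatenation can work. The helper names (`skip_on_cpu`, `xfail_nvfuser`, `compose_decorator_fns`, `apply_decorators`) are hypothetical and only illustrate the `decorator_fn(param_kwargs) -> List[decorator]` contract described above:

```python
import unittest
from typing import Any, Callable, Dict, List

Decorator = Callable[[Callable], Callable]
DecoratorFn = Callable[[Dict[str, Any]], List[Decorator]]


def skip_on_cpu(param_kwargs: Dict[str, Any]) -> List[Decorator]:
    # Hypothetical decorator_fn: skip only when the test runs on CPU.
    if param_kwargs.get("device") == "cpu":
        return [unittest.skip("not supported on CPU")]
    return []


def xfail_nvfuser(param_kwargs: Dict[str, Any]) -> List[Decorator]:
    # Hypothetical decorator_fn: expect failure for one executor value.
    if param_kwargs.get("executor") == "nvfuser":
        return [unittest.expectedFailure]
    return []


def compose_decorator_fns(*fns: DecoratorFn) -> DecoratorFn:
    # The composed function concatenates the lists returned by its parts.
    def composed(param_kwargs: Dict[str, Any]) -> List[Decorator]:
        decorators: List[Decorator] = []
        for fn in fns:
            decorators.extend(fn(param_kwargs))
        return decorators

    return composed


def apply_decorators(test: Callable, decorators: List[Decorator]) -> Callable:
    # Apply each returned decorator to the instantiated test in turn.
    for decorator in decorators:
        test = decorator(test)
    return test


def test_body(self=None):
    pass


composed = compose_decorator_fns(skip_on_cpu, xfail_nvfuser)
params = {"device": "cpu", "executor": "nvfuser"}
decorated = apply_decorators(test_body, composed(params))
print(len(composed(params)), len(composed({"device": "cuda", "executor": "aten"})))
```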
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91658
Approved by: https://github.com/malfet
2023-01-04 21:08:32 +00:00
Nikita Vedeneev
7ef7c57ae7 CSC/BSC -> COO coalesce fix (#91440)
Fixes https://github.com/pytorch/pytorch/issues/91010.

CSC and BSC sparse formats are not inherently `coalesced`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91440
Approved by: https://github.com/pearu, https://github.com/amjames, https://github.com/cpuhrsch
2023-01-03 18:42:39 +00:00
Pearu Peterson
b797a24259 Support indices contiguity per batch and non-contiguous values in sparse compressed tensors (#91243)
Fixes https://github.com/pytorch/pytorch/issues/91062

With this PR, all reported failures in https://github.com/pytorch/pytorch/pull/90849 are resolved (modulo test_bmm that uses an unorthodox way to construct a batch CSR tensor).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91243
Approved by: https://github.com/nikitaved, https://github.com/amjames, https://github.com/lezcano
2023-01-02 18:08:46 +00:00
Michael Voznesensky
b72caf311d Introduce guardexpr, aot autograd guarding of duplicates into torch._guards (#90955)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90955
Approved by: https://github.com/ezyang
2022-12-18 03:05:47 +00:00
Huy Do
4e6455163f Fix unittest rerun logic when checking for skipped tests (#90888)
I made an important mistake here in thinking that `not result.skipped` means that the current test wasn't skipped.

Similar to `result.failures` and `result.errors`, `result.skipped` is a list that accumulates every skip recorded so far in the test suite (https://docs.python.org/3/library/unittest.html#unittest.TestResult).  As such, the correct way to check if the current test was skipped is to compare `skipped_before` and `len(result.skipped)` after running the test in the same way as failures and errors are handled.  If they are the same, the test isn't skipped.
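
The difference between the two checks can be shown with plain `unittest`. The test class and the `was_skipped` helper are illustrative only and are not the code in `common_utils`:

```python
import unittest


class Example(unittest.TestCase):
    @unittest.skip("demo skip")
    def test_skipped(self):
        pass

    def test_passes(self):
        pass


def was_skipped(test: unittest.TestCase, result: unittest.TestResult) -> bool:
    # Compare the list length before and after, as is done for failures and errors.
    skipped_before = len(result.skipped)
    test.run(result)
    return len(result.skipped) > skipped_before


result = unittest.TestResult()
first = was_skipped(Example("test_skipped"), result)
second = was_skipped(Example("test_passes"), result)

# `result.skipped` still holds the earlier skip, so a truthiness check
# would wrongly report the second (passing) test as skipped.
naive_says_skipped = bool(result.skipped)
print(first, second, naive_says_skipped)
```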

### Testing

`python test/run_test.py -i test_autograd --verbose` to confirm that the disabled test `test_profiler_seq_nr` is run 50 times always in rerun mode
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90888
Approved by: https://github.com/clee2000
2022-12-15 05:13:59 +00:00
Bin Bao
7035bcdd0f [inductor] Enable test_torch (#90518)
Summary: Skipping failures in those tests so that CI can guard other
passing cases.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90518
Approved by: https://github.com/jansel
2022-12-13 16:21:35 +00:00
Philip Meier
7bb97c4ca4 move TypedStorage handling to assertEqual (#89557)
#85303 added a patch to `torch.testing.assert_close` to handle `torch.storage.TypedStorage`'s. This change is not reflected in the docs and is not intended for the public API. This PR removes the patch once again and moves the behavior to `TestCase.assertEqual` instead. Meaning, `TypedStorage`'s are again not supported by the public API, but the behavior is the same for all internal use cases.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89557
Approved by: https://github.com/kurtamohler, https://github.com/mruberry
2022-12-12 23:26:00 +00:00
Ram Rachum
351d73b97f Fix exception causes all over the codebase (#90271)
This is the continuation to #90134 and hopefully the final PR in this series.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90271
Approved by: https://github.com/kit1980
2022-12-07 04:29:00 +00:00
Pearu Peterson
76c6dfeaa6 Add layout and blocksize arguments to Tensor.to_sparse method (#89502)
This PR extends the `Tensor.to_sparse()` method to `Tensor.to_sparse(layout=None, blocksize=None)` in a BC manner (`layout=None` means `layout=torch.sparse_coo`).

In addition, the PR adds support for the following conversions:
- non-hybrid/hybrid COO tensor to CSR or CSC or a COO tensor
- short, bool, byte, char, bfloat16, int, long, half CSR tensor to a BSR tensor

and fixes the following conversions:
- hybrid COO to COO tensor
- non-batch/batch hybrid BSR to BSR or BSC tensor

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89502
Approved by: https://github.com/amjames, https://github.com/cpuhrsch
2022-11-30 20:21:10 +00:00
Pearu Peterson
296e1ba4d0 Row and column select support for block compressed sparse tensors (#88733)
As in the title:

- Support `select` and `select_copy` on block sparse compressed tensors
- Fixes incorrect results when selecting dense dimensions

The PR also improves the performance of indexing sparse compressed tensors considerably:

<details>

Before:

```python
In [3]: a=torch.rand((1000, 1000)).to_sparse_csr()

In [4]: %timeit a.select(0, 0)
606 µs ± 4.27 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [5]: %timeit a.select(1, 0)
527 µs ± 57.7 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [6]: %timeit a[0, 0]
617 µs ± 3.74 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [7]: a = a.cuda()

In [8]: %timeit a.select(0, 0); torch.cuda.synchronize();
1.19 ms ± 137 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [9]: %timeit a.select(1, 0); torch.cuda.synchronize();
1.2 ms ± 119 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [10]: %timeit a[0, 0]; torch.cuda.synchronize();
1.23 ms ± 482 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)
```

This PR:

```python
In [3]: a=torch.rand((1000, 1000)).to_sparse_csr()

In [4]: %timeit a.select(0, 0)
4.75 µs ± 8.94 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [5]: %timeit a.select(1, 0)
565 µs ± 156 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [6]: %timeit a[0, 0]
13.1 µs ± 435 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [7]: a = a.cuda()

In [8]: %timeit a.select(0, 0); torch.cuda.synchronize();
21.6 µs ± 23.9 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [9]: %timeit a.select(1, 0); torch.cuda.synchronize();
1.15 ms ± 3.13 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [10]: %timeit a[0, 0]; torch.cuda.synchronize();
63.7 µs ± 2.5 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
```

</details>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88733
Approved by: https://github.com/nikitaved, https://github.com/amjames, https://github.com/cpuhrsch
2022-11-30 11:15:56 +00:00
Pearu Peterson
90bed8874f Generator of tensor inputs with variable layout and structure (batch/non-batch, hybrid/non-hybrid, block/non-block) (#88914)
This PR introduces `TestCase.generate_simple_inputs` method that is an improved and generalized version of the `TestSparseCompressed._generate_small_inputs` method.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88914
Approved by: https://github.com/cpuhrsch
2022-11-30 02:13:33 +00:00
Edward Z. Yang
5f8848f329 Don't suppress log messages for dynamo CI config (#89653)
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89653
Approved by: https://github.com/albanD, https://github.com/kit1980
2022-11-28 03:39:40 +00:00
Edward Z. Yang
5266953443 Add crossref debug mode for functionalization, catches stride errors (#89498)
The idea is to add a custom handler to Functionalize key in Python
dispatcher that runs the functionalized version alongside a
non-functionalized version, and checks that their outputs agree in the
end.  (Technically, for metadata mutation we should also check the
inputs, but for now we're relying on those functions returning self.)
I turned this on for test_functionalize.py (new TestCrossRefFunctionalize)
and found a bunch of failures that look legit.

This probably doesn't interact that nicely if you're also tracing at
the same time, probably need more special logic for that (directly,
just disabling tracing for when we create the nested fake tensor mode,
but IDK if there's a more principled way to organize this.)

There are some misc fixups which I can split if people really want.

- xfail_inherited_tests moved to test common_utils
- Bindings for _dispatch_tls_set_dispatch_key_included,
  _dispatch_tls_is_dispatch_key_included and _functionalization_reapply_views_tls
- Type stubs for _enable_functionalization, _disable_functionalization
- all_known_overloads utility to let you iterate over all OpOverloads
  in all namespaces.  Iterator support on all torch._ops objects to let
  you iterate over their members.
- suspend_functionalization lets you temporarily disable functionalization mode
  in a context
- check_metadata_matches for easily comparing outputs of functions and see
  if they match (TODO: there are a few copies of this logic, consolidate!)
- _fmt for easily printing the metadata of a tensor without its data
- _uncache_dispatch for removing a particular dispatch key from the cache,
  so that we force it to regenerate
- check_significant_strides new kwarg only_cuda to let you also do stride
  test even when inputs are not CUDA
- Functionalize in torch._C.DispatchKey

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89498
Approved by: https://github.com/malfet
2022-11-23 04:18:25 +00:00
Huy Do
ce342ed2d3 Fix retrying logic for successful unittest tests under --rerun-disabled-tests mode (#89454)
When looking into Rockset data for a disabled unittest test, for example `testAdd`, I see that it's re-run only 3 times instead of 50+ times as expected under rerun-disabled-tests mode

```
[
  {
    "name": "testAdd",
    "classname": "TestLazyReuseIr",
    "filename": "lazy/test_reuse_ir.py",
    "flaky": false,
    "num_green": 3,
    "num_red": 0
  }
]
```

It turns out that I made a mistake mixing `RERUN_DISABLED_TESTS` and `report_only` into `(RERUN_DISABLED_TESTS or report_only) and num_retries_left < MAX_NUM_RETRIES` in https://github.com/pytorch/pytorch/pull/88646.  The retrying logic for successful tests under rerun-disabled-tests mode is never executed because num_retries_left would be equal to MAX_NUM_RETRIES (not smaller) if the very first run succeeds. Thus, the sample test `testAdd` finishes right away (1 success count)

* `report_only` and `RERUN_DISABLED_TESTS` are 2 different things and shouldn't be mixed together. RERUN_DISABLED_TESTS has the higher priority.
* We also don't want to retry skipped tests under rerun-disabled-tests mode because they are only skipped due to `check_if_enable` check `Test is enabled but --rerun-disabled-tests verification mode is set, so only disabled tests are run`
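
The faulty condition can be evaluated on its own. In this sketch `MAX_NUM_RETRIES = 50` is an assumed value chosen to match the counter in the local run log, and the function name is made up:

```python
MAX_NUM_RETRIES = 50  # assumed value, matching the counter in the local run log


def buggy_should_retry_success(
    rerun_disabled_tests: bool, report_only: bool, num_retries_left: int
) -> bool:
    # The combined condition quoted above, written out as a predicate.
    return (rerun_disabled_tests or report_only) and num_retries_left < MAX_NUM_RETRIES


# First run succeeds: no retry has been consumed yet, so the strict `<` fails.
first_run = buggy_should_retry_success(True, False, MAX_NUM_RETRIES)
# The branch can only fire once at least one retry has already been used.
after_one_retry = buggy_should_retry_success(True, False, MAX_NUM_RETRIES - 1)
print(first_run, after_one_retry)
```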

### Testing

* CI https://github.com/pytorch/pytorch/actions/runs/3518228784 generates https://gha-artifacts.s3.amazonaws.com/pytorch/pytorch/3518228784/1/artifact/test-reports-test-default-4-4-linux.4xlarge.nvidia.gpu_9627285587.zip in which `testAdd` is correctly called multiple times and `TestLazyReuseIr` is skipped correctly
* Locally

```
# export CI=1
# export PYTORCH_RETRY_TEST_CASES=1
# export PYTORCH_OVERRIDE_FLAKY_SIGNAL=1
# export PYTORCH_TEST_RERUN_DISABLED_TESTS=1
$ python test/run_test.py --verbose -i lazy/test_reuse_ir
Ignoring disabled issues:  []
Selected tests:
 lazy/test_reuse_ir
Prioritized test from test file changes.
reordering tests for PR:
prioritized: []
the rest: ['lazy/test_reuse_ir']

Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /Users/huydo/Storage/mine/pytorch/test/.pytorch-slow-tests.json
Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /Users/huydo/Storage/mine/pytorch/test/.pytorch-disabled-tests.json
parallel (file granularity) tests:
 lazy/test_reuse_ir
serial (file granularity) tests:

Ignoring disabled issues:  []
Ignoring disabled issues:  []
Running lazy/test_reuse_ir ... [2022-11-21 13:21:07.165877]
Executing ['/Users/huydo/miniconda3/envs/py3.9/bin/python', '-bb', 'lazy/test_reuse_ir.py', '-v', '--import-slow-tests', '--import-disabled-tests', '--rerun-disabled-tests'] ... [2022-11-21 13:21:07.166279]

Expand the folded group to see the log file of lazy/test_reuse_ir
##[group]PRINTING LOG FILE of lazy/test_reuse_ir (/Users/huydo/Storage/mine/pytorch/test/test-reports/lazy-test_reuse_ir_6cf_dxa1)

Running tests...
----------------------------------------------------------------------
Test results will be stored in test-reports/python-unittest/lazy.test_reuse_ir
  testAdd (__main__.TestLazyReuseIr) ... ok (1.215s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 50
ok (0.001s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 49
ok (0.001s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 48
ok (0.001s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 47
ok (0.001s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 46
ok (0.001s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 45
ok (0.001s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 44
ok (0.001s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 43
ok (0.001s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 42
ok (0.001s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 41
ok (0.001s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 40
ok (0.001s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 39
ok (0.001s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 38
ok (0.001s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 37
ok (0.001s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 36
ok (0.001s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 35
ok (0.001s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 34
ok (0.001s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 33
ok (0.001s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 32
ok (0.001s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 31
ok (0.001s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 30
ok (0.001s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 29
ok (0.001s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 28
ok (0.001s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 27
ok (0.001s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 26
ok (0.001s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 25
ok (0.001s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 24
ok (0.001s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 23
ok (0.001s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 22
ok (0.001s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 21
ok (0.001s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 20
ok (0.001s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 19
ok (0.001s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 18
ok (0.001s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 17
ok (0.001s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 16
ok (0.001s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 15
ok (0.001s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 14
ok (0.001s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 13
ok (0.001s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 12
ok (0.001s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 11
ok (0.001s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 10
ok (0.001s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 9
ok (0.001s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 8
ok (0.001s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 7
ok (0.001s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 6
ok (0.001s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 5
ok (0.001s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 4
ok (0.001s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 3
ok (0.001s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 2
ok (0.001s)
  testAdd (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 1
ok (0.001s)
  testAddSub (__main__.TestLazyReuseIr) ...     testAdd succeeded - num_retries_left: 0
skip: Test is enabled but --rerun-disabled-tests verification mode is set, so only disabled tests are run (0.001s)
  testAddSubFallback (__main__.TestLazyReuseIr) ... skip: Test is enabled but --rerun-disabled-tests verification mode is set, so only disabled tests are run (0.001s)
  testBatchNorm (__main__.TestLazyReuseIr) ... skip: Test is enabled but --rerun-disabled-tests verification mode is set, so only disabled tests are run (0.001s)

----------------------------------------------------------------------
Ran 54 tests in 1.264s

OK (skipped=3)
```

Here is a sample Rockset query:

```
WITH added_row_number AS (
  SELECT
    *,
    ROW_NUMBER() OVER(PARTITION BY name, classname, filename ORDER BY _event_time DESC) AS row_number
  FROM
    commons.rerun_disabled_tests
)
SELECT
  name,
  classname,
  filename,
  flaky,
  num_green,
  num_red
FROM
  added_row_number
WHERE
  row_number = 1
  AND name = 'testAdd'
```
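
For readers less familiar with window functions, the `ROW_NUMBER() ... ORDER BY _event_time DESC` pattern above keeps only the most recent record per `(name, classname, filename)`. A minimal Python sketch of the same selection follows; the helper name `latest_rows` and the sample records are made up for illustration and are not real data from `commons.rerun_disabled_tests`.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


def latest_rows(rows: List[Row]) -> List[Row]:
    """Keep the newest row per (name, classname, filename), like row_number = 1."""
    latest: Dict[Tuple[str, str, str], Row] = {}
    for row in rows:
        key = (row["name"], row["classname"], row["filename"])
        # A larger _event_time means a more recent record.
        if key not in latest or row["_event_time"] > latest[key]["_event_time"]:
            latest[key] = row
    return list(latest.values())


# Hypothetical records, for illustration only.
rows = [
    {"name": "testAdd", "classname": "TestLazyReuseIr", "filename": "lazy/test_reuse_ir.py",
     "_event_time": 1, "flaky": False, "num_green": 40, "num_red": 0},
    {"name": "testAdd", "classname": "TestLazyReuseIr", "filename": "lazy/test_reuse_ir.py",
     "_event_time": 2, "flaky": False, "num_green": 50, "num_red": 0},
    {"name": "testAddSub", "classname": "TestLazyReuseIr", "filename": "lazy/test_reuse_ir.py",
     "_event_time": 2, "flaky": False, "num_green": 0, "num_red": 0},
]

# Equivalent of: WHERE row_number = 1 AND name = 'testAdd'
selected = [r for r in latest_rows(rows) if r["name"] == "testAdd"]
```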
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89454
Approved by: https://github.com/clee2000
2022-11-22 03:39:17 +00:00
lezcano
c2cf0bde1f Move the OpInfo same-storage error to the autograd test (#88306)
This check was previously located in the `non_contiguous` test (quite
an odd location). Moreover, at https://github.com/pytorch/pytorch/pull/86378#discussion_r993658395, Kshiteej found that this assert was not actually doing anything.

We move it to the autograd test and make it a proper `self.assert`. We also disallow returning 1-tuples from sample_input functions, as they were breaking this assert.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88306
Approved by: https://github.com/mruberry
2022-11-21 13:59:03 +00:00
Huy Do
177621a0b2 Use pytest-flakefinder to rerun tests multiple times (#89106)
Per title. The way re-run is handled in https://github.com/pytorch/pytorch/pull/88646 only applies to unittest.

### Testing

* https://github.com/pytorch/pytorch/actions/runs/3484930558
* https://github.com/pytorch/pytorch/actions/runs/3484930319

Manually download the test report artifacts and verify that pytest test_ops is called multiple times.
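
As a rough illustration of what rerunning through pytest-flakefinder looks like, the sketch below builds a pytest command line using the plugin's `--flake-finder` and `--flake-runs` options. The helper name `build_flakefinder_command` and the run count are assumptions for this example; how this PR actually wires the plugin into the test runner is not shown here.

```python
import sys
from typing import List


def build_flakefinder_command(test_file: str, runs: int) -> List[str]:
    """Build a pytest invocation that repeats every collected test `runs` times.

    Assumes the pytest-flakefinder plugin is installed in the environment.
    """
    return [
        sys.executable, "-m", "pytest", test_file,
        "--flake-finder",        # enable the plugin
        f"--flake-runs={runs}",  # number of copies of each test to run
    ]


# Hypothetical usage: the run count here is arbitrary.
cmd = build_flakefinder_command("test_ops.py", 3)
print(" ".join(cmd))
```

The command can then be handed to `subprocess.run(cmd)` from the test directory.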
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89106
Approved by: https://github.com/clee2000
2022-11-18 00:11:44 +00:00
Catherine Lee
175b7e1cde print xpass (#89020)
Print unexpected successes as XPASS.  I will submit a PR to test-infra so that the log classifier can find these.

Ex: https://github.com/pytorch/pytorch/actions/runs/3466368885/jobs/5790424173
```
  test_import_hipify (__main__.TestHipify) ... ok (0.000s)
  test_check_onnx_broadcast (__main__.TestONNXUtils) ... ok (0.000s)
  test_prepare_onnx_paddings (__main__.TestONNXUtils) ... ok (0.000s)
  test_load_standalone (__main__.TestStandaloneCPPJIT) ... ok (16.512s)

======================================================================
XPASS [4.072s]: test_smoke (__main__.TestCollectEnv)
----------------------------------------------------------------------

----------------------------------------------------------------------
Ran 31 tests in 24.594s

FAILED (skipped=7, unexpected successes=1)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89020
Approved by: https://github.com/huydhn, https://github.com/seemethere
2022-11-15 21:27:14 +00:00
Huy Do
21dd311077 Add a mode to rerun all disabled tests (without running anything else) (#88646)
Rerun all disabled tests to gather their latest results so that we can close disabled tickets automatically. When running in this mode (RERUN_DISABLED_TESTS=true), only disabled tests are run while the rest are skipped `<skipped message="Test is enabled but --rerun-disabled-tests verification mode is set, so only disabled tests are run" type="skip"/>`

The logic is roughly as follows. Each test runs multiple times (n=50):

* If the disabled test passes, and it's flaky, do nothing because it's still flaky.  In the test report, we'll see the test passes with the following skipped message:
```
<testcase classname="TestMultiprocessing" file="test_multiprocessing.py" line="357" name="test_fs" time="0.000" timestamp="0001-01-01T00:00:00">
    <skipped message="{&quot;flaky&quot;: True, &quot;num_red&quot;: 4, &quot;num_green&quot;: 0, &quot;max_num_retries&quot;: 3, &quot;rerun_disabled_test&quot;: true}" type="skip"/>
</testcase>
```

* If the disabled test passes every single time, and it is not flaky anymore, mark it so that it can be closed later.  We will see the test runs and passes, i.e.
```
<testcase classname="TestCommonCUDA" name="test_out_warning_linalg_lu_factor_cuda" time="0.170" file="test_ops.py" />
```

* If the disabled test fails after all retries, this is also expected, so we only report it and don't fail the job (we don't care about red signals here). We'll see the test is skipped (without the `flaky` field), i.e.
```
<testcase classname="TestMultiprocessing" file="test_multiprocessing.py" line="357" name="test_fs" time="0.000" timestamp="0001-01-01T00:00:00">
    <skipped message="{&quot;num_red&quot;: 4, &quot;num_green&quot;: 0, &quot;max_num_retries&quot;: 3, &quot;rerun_disabled_test&quot;: true}" type="skip"/>
</testcase>
```
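
The three cases above can be sketched as a small classifier over the `num_green` and `num_red` counters. This is a sketch only: the function name, the returned labels, and the assumption that "flaky" means at least one pass and at least one failure are illustrative, not the code in this PR.

```python
def classify_rerun(num_green: int, num_red: int) -> str:
    """Classify a disabled test from its rerun counters (illustrative labels)."""
    if num_green < 0 or num_red < 0 or num_green + num_red == 0:
        raise ValueError("expected at least one recorded run")
    if num_red == 0:
        # Passed every single time: candidate for closing the disabled ticket.
        return "fixed"
    if num_green == 0:
        # Failed on every attempt: still broken, reported but the job stays green.
        return "still_failing"
    # Mixed results: the test is still flaky, so nothing changes.
    return "flaky"


print(classify_rerun(num_green=50, num_red=0))
print(classify_rerun(num_green=0, num_red=4))
print(classify_rerun(num_green=3, num_red=1))
```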

This runs on the same schedule as `mem_leak_check` (daily).  The change to update test stats, and (potentially) grouping on HUD, will come in separate PRs.

### Testing

* pull https://github.com/pytorch/pytorch/actions/runs/3447434434
* trunk https://github.com/pytorch/pytorch/actions/runs/3447434928
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88646
Approved by: https://github.com/clee2000
2022-11-15 05:08:26 +00:00
Philip Meier
e6561291b8 add hack to allow hybrid compressed sparse comparison in assertEqual (#88749)
Hybrid sparse CSR tensors currently cannot be compared to strided ones since `.to_dense` does not work:

```py
import torch
from torch.testing._internal.common_utils import TestCase

assertEqual = TestCase().assertEqual

actual = torch.sparse_csr_tensor([0, 2, 4], [0, 1, 0, 1], [[1, 11], [2, 12] ,[3, 13] ,[4, 14]])
expected = torch.stack([actual[0].to_dense(), actual[1].to_dense()])
assertEqual(actual, expected)
```

```
main.py:4: UserWarning: Sparse CSR tensor support is in beta state. If you miss a functionality in the sparse tensor support, please submit a feature request to https://github.com/pytorch/pytorch/issues. (Triggered internally at ../aten/src/ATen/SparseCsrTensorImpl.cpp:54.)
  actual = torch.sparse_csr_tensor([0, 2, 4], [0, 1, 0, 1], [[1, 11], [2, 12] ,[3, 13] ,[4, 14]])
Traceback (most recent call last):
  File "/home/philip/git/pytorch/torch/torch/testing/_comparison.py", line 1098, in assert_equal
    pair.compare()
  File "/home/philip/git/pytorch/torch/torch/testing/_comparison.py", line 619, in compare
    actual, expected = self._equalize_attributes(actual, expected)
  File "/home/philip/git/pytorch/torch/torch/testing/_comparison.py", line 706, in _equalize_attributes
    actual = actual.to_dense() if actual.layout != torch.strided else actual
RuntimeError: sparse_compressed_to_dense: Hybrid tensors are not supported

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "main.py", line 10, in <module>
    assertEqual(actual, expected)
  File "/home/philip/git/pytorch/torch/torch/testing/_internal/common_utils.py", line 2503, in assertEqual
    msg=(lambda generated_msg: f"{generated_msg}\n{msg}") if isinstance(msg, str) and self.longMessage else msg,
  File "/home/philip/git/pytorch/torch/torch/testing/_comparison.py", line 1112, in assert_equal
    ) from error

RuntimeError: Comparing

TensorOrArrayPair(
    id=(),
    actual=tensor(crow_indices=tensor([0, 2, 4]),
       col_indices=tensor([0, 1, 0, 1]),
       values=tensor([[ 1, 11],
                      [ 2, 12],
                      [ 3, 13],
                      [ 4, 14]]), size=(2, 2, 2), nnz=4,
       layout=torch.sparse_csr),
    expected=tensor([[[ 1, 11],
         [ 2, 12]],

        [[ 3, 13],
         [ 4, 14]]]),
    rtol=0.0,
    atol=0.0,
    equal_nan=True,
    check_device=False,
    check_dtype=True,
    check_layout=False,
    check_stride=False,
    check_is_coalesced=False,
)

resulted in the unexpected exception above. If you are a user and see this message during normal operation please file an issue at https://github.com/pytorch/pytorch/issues. If you are a developer and working on the comparison functions, please except the previous error and raise an expressive `ErrorMeta` instead.
```

This adds a temporary hack to `TestCase.assertEqual` to enable this. Basically, we go through the individual CSR subtensors, call `.to_dense()` on them, and stack everything back together. I opted not to do this in the common machinery, so that users are not affected by this (undocumented) hack.

I also added an xfailed test that will trigger as soon as the behavior is supported natively so we don't forget to remove the hack when it is no longer needed.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88749
Approved by: https://github.com/mruberry, https://github.com/pearu
2022-11-10 13:44:45 +00:00
PyTorch MergeBot
d98a884b33 Revert "[cuDNN] (re-open) Enable cuDNN Frontend v8 API by Default (#87669)"
This reverts commit 3c6bddc3f6.

Reverted https://github.com/pytorch/pytorch/pull/87669 on behalf of https://github.com/eqy due to investigating convnext benchmark regressions
2022-11-08 19:04:25 +00:00
Catherine Lee
d632d94cc7 Disable mem leak check (#88373)
tbh at this point it might be easier to make a new workflow and copy the relevant jobs...

Changes:
* Disable cuda mem leak check except for on scheduled workflows
* Make pull and trunk run on a schedule which will run the memory leak check
* Periodic will always run the memory leak check -> periodic does not have parallelization anymore
* Concurrency check changed to be slightly more generous
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88373
Approved by: https://github.com/ZainRizvi, https://github.com/huydhn
2022-11-04 20:47:42 +00:00
soulitzer
4c20c0509d Split out forward AD tests from test_ops_gradients and reenable slow gradcheck CI (#88216)
Fixes: https://github.com/pytorch/pytorch/issues/88010

This PR does a couple of things to stop slow gradcheck from timing out:
- Splits out test_ops_fwd_gradients from test_ops_gradients, and factors out TestFwdGradients and TestBwdGradients which both inherit from TestGradients, now situated in common_utils (maybe there is a better place?)
- Skips CompositeCompliance (and several other test files) for slow gradcheck CI since they do not use gradcheck
- Because test times for test_ops_fwd_gradients and test_ops_gradients are either unknown or wrong, we hardcode them for now to prevent them from being put together. We can undo the hack after we see actual test times are updated. ("def calculate_shards" randomly divides tests with unknown test times in a round-robin fashion.)
- Updates references to test_ops_gradients and TestGradients
- Test files that are skipped for slow gradcheck CI are now centrally located in run_tests.py. This reduces how fine-grained we can be with the skips, so for some skips (one so far) we still use the old skipping mechanism, e.g. for test_mps
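
To illustrate why hardcoding the times keeps the two files apart, here is a simplified sharding sketch: files with known times go to the currently lightest shard, and files with unknown times are dealt out round-robin. It is an assumption-laden stand-in, not the real `calculate_shards`; the function name `calculate_shards_sketch`, the deterministic ordering, and the time values are all hypothetical.

```python
from typing import Dict, List


def calculate_shards_sketch(
    num_shards: int, tests: List[str], test_times: Dict[str, float]
) -> List[List[str]]:
    """Greedy placement for known times, round-robin for unknown ones."""
    shards: List[List[str]] = [[] for _ in range(num_shards)]
    totals = [0.0] * num_shards

    known = sorted(
        (t for t in tests if t in test_times),
        key=lambda t: test_times[t],
        reverse=True,  # place the slowest files first
    )
    unknown = [t for t in tests if t not in test_times]

    for test in known:
        lightest = totals.index(min(totals))
        shards[lightest].append(test)
        totals[lightest] += test_times[test]

    # Files without timing data carry no weight, so just alternate shards.
    for i, test in enumerate(unknown):
        shards[i % num_shards].append(test)
    return shards


# Hypothetical hardcoded times (seconds) for the two slow files.
times = {"test_ops_gradients": 3600.0, "test_ops_fwd_gradients": 3600.0}
files = ["test_ops_gradients", "test_ops_fwd_gradients", "test_a", "test_b"]
shards = calculate_shards_sketch(2, files, times)
print(shards)
```

With both slow files given a large known time, the greedy step is forced to put them on different shards, whereas leaving them unknown would let them land together.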

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88216
Approved by: https://github.com/albanD
2022-11-03 00:20:45 +00:00
eqy
3c6bddc3f6 [cuDNN] (re-open) Enable cuDNN Frontend v8 API by Default (#87669)
#58414

Has a small tweak to a test that was breaking on A10 (CC @malfet).

CC @ptrblck @ngimel
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87669
Approved by: https://github.com/ngimel
2022-11-02 01:36:37 +00:00
Edward Z. Yang
ff94494644 Revert "Revert "Unify meta tensor and fake tensor converter conversion (#87943)"" (#88045)
This reverts commit bc64999b83.

Check torch/_subclasses/meta_utils.py for "This is very tricky" for the bugfix explanation.

cc @mlazos @soumith @voznesenskym @yanboliang @penguinwu @anijain2305 @EikanWang @jgong5 @Guobing-Chen @chunyuan-w @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88045
Approved by: https://github.com/kit1980, https://github.com/Chillee
2022-10-31 17:50:14 +00:00
PyTorch MergeBot
bc64999b83 Revert "Unify meta tensor and fake tensor converter conversion (#87943)"
This reverts commit baa715e790.

Reverted https://github.com/pytorch/pytorch/pull/87943 on behalf of https://github.com/kit1980 due to Broke several inductor tests
2022-10-29 18:39:28 +00:00
Edward Z. Yang
baa715e790 Unify meta tensor and fake tensor converter conversion (#87943)
Meta tensor does a lot of work to make sure tensors "look" similar
to the original parts; e.g., if the original was a non-leaf, meta
converter ensures the meta tensor is a non-leaf too.  Fake tensor
destroyed some of these properties when it wraps it in a FakeTensor.

This patch pushes the FakeTensor constructor into the meta converter
itself, so that we first create a fake tensor, and then we do various
convertibility bits to it to make it look right.

The two tricky bits:

- We need to have no_dispatch enabled when we allocate the initial meta
  tensor, or fake tensor gets mad at us for making a meta fake tensor.
  This necessitates the double-callback structure of the callback
  arguments: the meta construction happens *inside* the function so
  it is covered by no_dispatch

- I can't store tensors for the storages anymore, as that will result
  in a leak.  But we have untyped storage now, so I just store untyped
  storages instead.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

cc @jansel @mlazos @soumith @voznesenskym @yanboliang @penguinwu @anijain2305 @EikanWang @jgong5 @Guobing-Chen @chunyuan-w @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87943
Approved by: https://github.com/eellison, https://github.com/albanD
2022-10-29 15:01:07 +00:00
wchen61
2c66889f90 Synchronize before change cuda stream (#82050) (#82056)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/82050

We need to synchronize before changing the CUDA stream.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/82056
Approved by: https://github.com/ngimel
2022-10-26 23:44:13 +00:00
lezcano
5e4bcb049e Improve readability of the extra message errors in assertEqual (#87202)
Goes from (note that the `linspace.default` is very difficult to find)
```
Mismatched elements: 15 / 50 (30.0%)
Greatest absolute difference: 1 at index (17,)
Greatest relative difference: 1.0 at index (17,) : linspace.default
args = (0, -3, 50)
kwargs = {'dtype': torch.int16, 'device': device(type='cpu'),
'pin_memory': False}
```
to
```
Mismatched elements: 15 / 50 (30.0%)
Greatest absolute difference: 1 at index (17,)
Greatest relative difference: 1.0 at index (17,)
linspace.default
args = (0, -3, 50)
kwargs = {'dtype': torch.int16, 'device': device(type='cpu'),
'pin_memory': False}
```
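
The before and after outputs differ only in how the user-supplied message is joined to the generated one. A minimal sketch of the two joining styles is below; the function names `join_old` and `join_new` are hypothetical, and the `" : "` separator is inferred from the first output rather than taken from the source code.

```python
def join_old(generated_msg: str, msg: str) -> str:
    # Extra message appended to the last generated line, which buries it.
    return f"{generated_msg} : {msg}"


def join_new(generated_msg: str, msg: str) -> str:
    # Extra message starts on its own line, so it is easy to spot.
    return f"{generated_msg}\n{msg}"


generated = (
    "Mismatched elements: 15 / 50 (30.0%)\n"
    "Greatest absolute difference: 1 at index (17,)\n"
    "Greatest relative difference: 1.0 at index (17,)"
)
extra = "linspace.default\nargs = (0, -3, 50)"

print(join_old(generated, extra))
print("---")
print(join_new(generated, extra))
```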
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87202
Approved by: https://github.com/ezyang
2022-10-24 06:11:50 +00:00