Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69257
These are sample functions that already use generators internally; this change just moves the `yield` into the sample function itself.
Diff is best viewed ignoring whitespace changes https://github.com/pytorch/pytorch/pull/69257/files?diff=unified&w=1
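A minimal sketch of the pattern, assuming the usual `sample_inputs_*` signature used by OpInfos (the op name and shapes below are illustrative, not taken from this PR):
```python
import torch
from torch.testing._internal.common_methods_invocations import SampleInput

def sample_inputs_myop(op_info, device, dtype, requires_grad, **kwargs):
    # Yield SampleInputs directly instead of building them inside a helper
    # generator and returning a list.
    for shape in ((), (3,), (2, 3)):
        yield SampleInput(torch.randn(shape, device=device, dtype=dtype,
                                      requires_grad=requires_grad))
```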
Test Plan: Imported from OSS
Reviewed By: mrshenli
Differential Revision: D32942007
Pulled By: mruberry
fbshipit-source-id: bb5b253d6d87b3495b7059924bed35b09d2768a2
Summary:
This fixes the following error:
```python
Traceback (most recent call last):
File "/home/gaoxiang/pytorch-ucc2/test/distributed/test_distributed_spawn.py", line 40, in <module>
run_tests()
File "/home/gaoxiang/.local/lib/python3.9/site-packages/torch/testing/_internal/common_utils.py", line 618, in run_tests
['--import-slow-tests'] if IMPORT_SLOW_TESTS else List[str]([]))
File "/usr/lib/python3.9/typing.py", line 680, in __call__
raise TypeError(f"Type {self._name} cannot be instantiated; "
TypeError: Type List cannot be instantiated; use list() instead
Traceback (most recent call last):
File "/home/gaoxiang/pytorch-ucc2/test/run_test.py", line 1058, in <module>
main()
File "/home/gaoxiang/pytorch-ucc2/test/run_test.py", line 1036, in main
raise RuntimeError(err_message)
RuntimeError: distributed/test_distributed_spawn failed!
```
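A minimal repro-and-fix sketch based on the traceback above: `typing.List` is only an annotation and cannot be instantiated, so the fallback has to be a real list.
```python
from typing import List

try:
    extra_args: List[str] = List[str]([])  # what the failing code effectively did
except TypeError as e:
    print(e)  # Type List cannot be instantiated; use list() instead

extra_args = []  # the working replacement (list() also works)
```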
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69578
Reviewed By: mrshenli
Differential Revision: D32963113
Pulled By: malfet
fbshipit-source-id: b064e230c5e572e890b4ac66ebdda2707b8c12d7
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69626
Sparse tensors are only supported by the TensorPipe RPC backend. As a
result, move test_embedding_bag_with_no_grad_tensors to be a TensorPipe-specific
test.
ghstack-source-id: 145134888
Test Plan: waitforbuildbot
Reviewed By: rohan-varma
Differential Revision: D32959952
fbshipit-source-id: d65f2edbb6dad7705475690a8c6293a322299dde
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67256
To change which tests can be run in various cases, the check logic should be moved into functions and variables that can be modified.
One challenge here is that decorators are not dynamic: anything they read at import time and that is changed afterwards will not actually take effect. This means we need to separate out the variables that have to change for our use case.
Those are put into common_distributed.py and can be changed before the distributed_test.py code is imported.
The use case is to add new backends to the tests and split them into tests that can be run on demand as a separate instance. To do so, you would change DistTestSkipCases after importing it into a launcher or a setup script and then load distributed_test.
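A minimal sketch of the import-time issue and the intended workflow, using hypothetical names and structure (only `DistTestSkipCases`, `common_distributed.py`, and `distributed_test.py` come from this summary):
```python
import unittest

# common_distributed.py (simplified): mutable state a launcher can edit.
DistTestSkipCases = {"skipped_tests": set()}  # structure is hypothetical

def skip_if_listed(name):
    # This check runs when distributed_test.py is imported, so
    # DistTestSkipCases must be updated *before* that import to take effect.
    def decorator(fn):
        if name in DistTestSkipCases["skipped_tests"]:
            return unittest.skip(f"{name} is listed in DistTestSkipCases")(fn)
        return fn
    return decorator

# Launcher / setup script (simplified): mutate first, then import the tests.
DistTestSkipCases["skipped_tests"].add("test_foo")
# import distributed_test  # decorators now see the updated skip list
```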
Test Plan: Check the signals
Reviewed By: mrshenli
Differential Revision: D31906947
fbshipit-source-id: 45e3258c55f4dc34e12a468bed65280f4c25748f
Summary:
Earlier, we were only testing inputs of shape `(5,)` for `nn.functional.dropout`, but since it's used a lot, I feel it's a good idea to test a few more shapes, including scalars. This PR:
1. Revises sample inputs for `nn.functional.dropout`
2. Adds an OpInfo for `nn.functional.dropout2d`.
A note regarding the documentation:
Looks like `nn.functional.dropout2d` also supports inputs of shape `(H, W)` in addition to `(N, C, H, W)` / `(C, H, W)`, but the [documentation](https://pytorch.org/docs/stable/generated/torch.nn.Dropout2d.html#torch.nn.Dropout2d) doesn't mention the `(H, W)` case. Should that be revised, or am I missing something here? (Filed an issue here: https://github.com/pytorch/pytorch/issues/67892)
```python
# A 2D tensor is a valid input for Dropout2d
In [11]: tensor = torch.randn((3, 4), device='cpu', dtype=torch.float32)
In [12]: dropout2d = torch.nn.Dropout2d(p=0.5)
In [13]: dropout2d(tensor)
Out[13]:
tensor([[-0.1026, -0.0000, -0.0000, -0.0000],
[-1.5647, 0.0000, -0.0000, -0.5820],
[-0.0000, -3.2080, 0.1164, -3.6780]])
```
Issue Tracker: https://github.com/pytorch/pytorch/issues/54261
cc: mruberry zou3519
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67891
Reviewed By: mrshenli
Differential Revision: D32628527
Pulled By: mruberry
fbshipit-source-id: 4c9b89550f1d49526e294378ce107eba9f29cabb
Summary:
As per title.
While working on this I discovered several issues with these methods related to grad instabilities; I will file them and link them here later. Given those issues, it was quite painful to get all the tests to pass, sorry for the delay, mruberry!
cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69107
Reviewed By: zou3519
Differential Revision: D32920341
Pulled By: mruberry
fbshipit-source-id: 15b33e2b46acdcbff8a37d8e43e381eb55d1a296
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69486
As the title says: migrate from the sign plugin to native TRT layers. All the layers are fused into a single PWN kernel in TRT.
```
[TensorRT] VERBOSE: Engine Layer Information:
Layer(PointWiseV2): PWN(sign_1_sign_rhs + sign_1_sign_rhs_broadcast, PWN(PWN(sign_1_floor_div*2_rhs + sign_1_floor_div*2_rhs_broadcast, PWN(PWN(PWN([UNARY]-[acc_ops.sign]-[sign_1_prod_abs], [UNARY]-[acc_ops.sign]-[sign_1_prod_abs_exp]), PWN([UNARY]-[acc_ops.sign]-[sign_1_prod_exp], [ELEMENTWISE]-[acc_ops.sign]-[sign_1_exp_floor_div])), [ELEMENTWISE]-[acc_ops.sign]-[sign_1_floor_div*2])), [ELEMENTWISE]-[acc_ops.sign]-[sign_1_sign])), Tactic: 0, x[Float(2,2,3)] -> output0[Float(2,2,3)]
```
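A hedged reconstruction in plain PyTorch of the elementwise decomposition the fused layer names above suggest (abs, exp, floor-div, multiply-by-2, subtract-1); the actual acc_ops.sign converter may differ, e.g. in how exact zeros are handled:
```python
import torch

def sign_from_pointwise_ops(x):
    # sign(x) ~ floor(exp(x) / exp(|x|)) * 2 - 1 for nonzero x:
    # exp(x - |x|) is 1 for x > 0 and lies in (0, 1) for x < 0.
    return torch.floor(torch.exp(x) / torch.exp(torch.abs(x))) * 2 - 1

x = torch.tensor([-3.0, -0.5, 0.7, 2.0])
print(sign_from_pointwise_ops(x))  # tensor([-1., -1.,  1.,  1.])
print(torch.sign(x))               # tensor([-1., -1.,  1.,  1.])
```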
Test Plan: CI
Reviewed By: wushirong
Differential Revision: D32887537
fbshipit-source-id: ac250b5197e340319de29653a27f879a0e1ea9cd
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69458
1. Added type hints to acc ops converters.
2. Moved some of the classes/logic in fx2trt.py into separate files (input_tensor_spec.py, trt_module.py, converter_registry.py).
3. Added imports in `__init__.py` so that users can just write `from torch.fx.experimental.fx2trt import xxx` instead of going through `experimental.fx2trt.fx2trt`.
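A hedged sketch of the `__init__.py` re-export pattern described in point 3; the exported names below are illustrative and may not match the actual module contents:
```python
# torch/fx/experimental/fx2trt/__init__.py (sketch; names illustrative)
from .fx2trt import TRTInterpreter                   # noqa: F401
from .input_tensor_spec import InputTensorSpec       # noqa: F401
from .trt_module import TRTModule                    # noqa: F401
from .converter_registry import tensorrt_converter   # noqa: F401

# Callers can then write:
#   from torch.fx.experimental.fx2trt import InputTensorSpec, TRTModule
```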
Test Plan: CI
Reviewed By: wushirong
Differential Revision: D32884637
fbshipit-source-id: e3e1e597edb9a08b47b4595bd371f570f2f3c9b6
Summary:
This PR:
- creates the "jiterator" pattern, allowing elementwise unary and binary kernels that don't accept scalars to be jit compiled when called
- ports the gcd and i1 CUDA kernels to use the jiterator
- extends elementwise binary systemic testing to be comparable to elementwise unary systemic testing
- separates one test case from test_out in test_ops.py
- updates more OpInfos to use expected failures instead of skips
The jiterator currently does not support half, bfloat16 or complex dtypes. It also (as mentioned above) doesn't support scalar inputs. In the future we expect to add support for those datatypes and scalars.
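From the caller's point of view nothing changes; a hedged usage sketch of the two ported ops, where the CUDA kernel is jit compiled lazily on first call:
```python
import torch

if torch.cuda.is_available():
    a = torch.tensor([12, 18, 27], device='cuda')
    b = torch.tensor([8, 24, 9], device='cuda')
    print(torch.gcd(a, b))          # elementwise binary kernel via the jiterator
    x = torch.linspace(0.1, 1.0, 5, device='cuda')
    print(torch.special.i1(x))      # elementwise unary kernel via the jiterator
```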
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69439
Reviewed By: ngimel
Differential Revision: D32874968
Pulled By: mruberry
fbshipit-source-id: d44bb9cde4f602703e75400ec5a0b209f085e9b3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69327
Original commit changeset: d44096d88265
Original Phabricator Diff: D32144240 (668574af4a)
Test Plan:
CI
The original diff failed 175 builds in CI.
Reviewed By: airboyang, anjali411
Differential Revision: D32809407
fbshipit-source-id: c7c8e69bcee0274992e2d5da901f035332e60071
Summary:
This PR fixes https://github.com/pytorch/pytorch/issues/67612 by creating the tensor first and then converting the dtype explicitly with a `.to(dtype)` call.
Looking forward to your feedback and suggestions on this.
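A minimal sketch of the construction pattern described above (the values and target dtype here are illustrative):
```python
import torch

# Build with the default dtype first, then convert explicitly, instead of
# constructing directly with the target dtype.
values = [[1.5, -2.0], [0.25, 3.0]]
t = torch.tensor(values).to(torch.bfloat16)
```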
cc: kshitij12345 mruberry
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68113
Reviewed By: zou3519
Differential Revision: D32797329
Pulled By: saketh-are
fbshipit-source-id: 5c34709ab277c82cda316a3ea1cf01e853e4c38b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68887
Closes #46988, closes #46987, closes #46761
By "simple" I mean operators that map 0->0 so we can implement it by
just re-dispatching on the values tensor. That does mean we have `sin`
but not `cos` for example, but without fill value support this is the
best that can be done.
Most of these don't support autograd because the derivative formulas
use unsupported operators.
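A hedged sketch of the values-redispatch idea (not the actual implementation): because `sin(0) == 0`, applying the dense op only to the stored values of a sparse COO tensor matches the densified result.
```python
import torch

t = torch.tensor([[0.0, 1.5], [-2.0, 0.0]]).to_sparse().coalesce()

# Re-dispatch on the values tensor and rebuild the sparse tensor; this is only
# valid for ops that map 0 -> 0 (cos(0) == 1 would silently drop entries).
out = torch.sparse_coo_tensor(t.indices(), torch.sin(t.values()), t.shape)

assert torch.allclose(out.to_dense(), torch.sin(t.to_dense()))
```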
cc nikitaved pearu cpuhrsch IvanYashchuk
Test Plan: Imported from OSS
Reviewed By: jbschlosser
Differential Revision: D32734911
Pulled By: cpuhrsch
fbshipit-source-id: 203ab105799f3d2d682b01ca3d6b18e7c994776a
Summary:
This PR adds an OpInfo entry for tensorsolve function.
The keyword argument name differs from NumPy's, so a lambda needs to be passed to `ref=`.
I had to change the dtypes for `test_reference_testing` because NumPy computes internally in double precision for all linear algebra functions (and possibly some others). Using `torch.float64` and `torch.complex128` is more reliable for NumPy comparisons.
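A hedged sketch of the `ref=` adapter: `numpy.linalg.tensorsolve` calls the keyword argument `axes`, while `torch.linalg.tensorsolve` calls it `dims`, so the NumPy reference can be wrapped roughly like this (the surrounding OpInfo fields are omitted):
```python
import numpy as np

# Maps torch's `dims=` keyword onto NumPy's `axes=`.
tensorsolve_ref = lambda a, b, dims=None: np.linalg.tensorsolve(a, b, axes=dims)
```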
cc mruberry
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68810
Reviewed By: soulitzer
Differential Revision: D32696065
Pulled By: mruberry
fbshipit-source-id: a4305065d3e7d0097503dc05938b3c4784e14996
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65819
Related to #61669.
Functions registered as CompositeImplicitAutograd MUST work for most, if
not all, backends. This includes Tensor subclasses.
To achieve this, we (PyTorch) impose a set of constraints on how a
CompositeImplicitAutograd function can be written.
Concretely, this PR adds a test for all OpInfos that checks for
compliance. The checks in this PR apply to composite
ops and verify that:
- the op does not change the metadata of a Tensor without performing
dispatches
- the op does not call set_ or resize_
- the op does not directly access the data ptr
The mechanism for the test is to create a new __torch_dispatch__
object, CompositeCompliantTensor. For each operator, we wrap all inputs
in CompositeCompliantTensor, turn on python mode for it,
and send it through the operator.
Non-CompositeImplicitAutograd operators will pass the test because they
perform a dispatch to backend code. Here's how CompositeCompliantTensor
catches problems:
- If it sees set_ or resize_ getting called, it will directly error
out
- After each operation, CompositeCompliantTensor checks to make sure
that its metadata is consistent with that of the thing it is wrapping.
If the CompositeImplicitAutograd op modifies the metadata directly
(through e.g. the TensorImpl API) then the metadata will go out of sync.
- If data_ptr gets called, that returns a nice error (because the
storage is meta).
CompositeCompliantTensor is written in an interesting way. First off,
if a view operation occurs (e.g. `B = A.view_op(...)`), then B.storage()
must alias A.storage() where B.storage() is CompositeCompliantTensor's
storage, NOT the storage of the tensor it is wrapping. This is an
invariant in autograd, see #62182 for details. To handle
this we replay the view on A's storage and set it as B's storage.
Secondly, there are cases where the metadata is allowed to go out of
sync. I believe this is only possible with in-place view functions, like
transpose_, t_, squeeze_, unsqueeze_. Those are special cased.
Finally, I added a new section to aten/src/ATen/native/README.md about
what it means to be CompositeImplicitAutograd compliant.
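A heavily simplified sketch of the wrapper-subclass idea described above (not the actual CompositeCompliantTensor: the view-replay and metadata-consistency machinery is omitted, and the op-name check is approximate):
```python
import torch
from torch.utils._pytree import tree_map

class ComplianceTensor(torch.Tensor):
    @staticmethod
    def __new__(cls, elem):
        # A wrapper subclass: the outer tensor only carries metadata, so
        # direct data_ptr() / storage access cannot silently succeed.
        r = torch.Tensor._make_wrapper_subclass(
            cls, elem.size(), strides=elem.stride(), dtype=elem.dtype,
            device=elem.device, requires_grad=elem.requires_grad)
        r.elem = elem
        return r

    @classmethod
    def __torch_dispatch__(cls, func, types, args=(), kwargs=None):
        if func.overloadpacket.__name__ in ("set_", "resize_"):
            raise RuntimeError(f"{func} makes a composite op non-compliant")

        unwrap = lambda t: t.elem if isinstance(t, cls) else t
        wrap = lambda t: cls(t) if isinstance(t, torch.Tensor) else t
        out = func(*tree_map(unwrap, args), **tree_map(unwrap, kwargs or {}))
        # The real test would now compare the wrapper's metadata against
        # elem's metadata to catch direct TensorImpl manipulation.
        return tree_map(wrap, out)
```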
Test Plan: - run tests
Reviewed By: ezyang, bdhirsh
Differential Revision: D31268369
Pulled By: zou3519
fbshipit-source-id: 31634b1cbe1778ab30196013cfc376ef9bd2e8b1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68887Closes#46988, closes#46987, closes#46761
By "simple" I mean operators that map 0->0 so we can implement it by
just re-dispatching on the values tensor. That does mean we have `sin`
but not `cos` for example, but without fill value support this is the
best that can be done.
Most of these don't support autograd because the derivative formulas
use unsupported operators.
cc nikitaved pearu cpuhrsch IvanYashchuk
Test Plan: Imported from OSS
Reviewed By: jbschlosser
Differential Revision: D32706197
Pulled By: cpuhrsch
fbshipit-source-id: 65e1acb3645737ca7bdb7f2db739d8e118906f4b
Summary:
This PR absolves `_TestParametrizer`s (e.g. `ops`, `modules`, `parametrize`) of the responsibility of adding device type (e.g. `'cpu'`, `'cuda'`, etc.) / dtype (e.g. 'float32') to generated test names. This fixes repeated instances of the device string being added to generated test names (e.g. `test_batch_norm_training_True_cuda_track_running_stats_True_cuda_affine_True_cuda`).
The responsibility for placing device / dtype suffixes is now handled by `instantiate_device_type_tests()` instead, so the suffix is added a single time. It will place `<device>_<dtype>` at the end of the test name unconditionally, maintaining the current naming convention.
As part of this work, I also tightened the semantics through some additional error case handling:
* Composing multiple decorators that each try to handle the same parameter will error out with a nice message. This includes the case of trying to compose `modules` + `ops`, as they each try to handle `dtype`. Similarly, `ops` + `dtypes` is forbidden when both try to handle `dtype`. This required changes in the following test files:
* `test/test_unary_ufuncs.py`
* `test/test_foreach.py`
* The `modules` / `ops` decorators will now error out with a nice message if used with `instantiate_parametrized_tests()` instead of `instantiate_device_type_tests()`, since they're not (currently) written to work outside of a device-specific context.
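A hedged sketch of the resulting naming convention, assuming the usual device-generic test pattern (the test body and parameters are illustrative):
```python
import torch
from torch.testing._internal.common_device_type import (
    dtypes, instantiate_device_type_tests)
from torch.testing._internal.common_utils import TestCase, parametrize, run_tests

class TestExample(TestCase):
    @parametrize("affine", [True, False])
    @dtypes(torch.float32)
    def test_batch_norm(self, device, dtype, affine):
        pass

# Generates names like test_batch_norm_affine_True_cpu_float32: the
# <device>_<dtype> suffix is appended exactly once, at the end.
instantiate_device_type_tests(TestExample, globals())

if __name__ == "__main__":
    run_tests()
```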
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65217
Reviewed By: mruberry
Differential Revision: D32627303
Pulled By: jbschlosser
fbshipit-source-id: c2957228353ed46a0b7da8fa1a34c67598779312
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68885
`torch.neg` should preserve the input dtype but for sparse tensors it
was promoting integers to floating point. This would have been picked
up by the OpInfo-based test, but `neg` wasn't marked with
`supports_sparse=True` so it was never run.
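A minimal repro sketch of the dtype-preservation property being fixed, assuming an integer sparse COO input:
```python
import torch

x = torch.tensor([[0, 2], [-3, 0]], dtype=torch.int64).to_sparse()
y = torch.neg(x)
# neg must keep the input dtype; the sparse path used to promote to float.
assert y.dtype == torch.int64
```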
cc nikitaved pearu cpuhrsch IvanYashchuk
Test Plan: Imported from OSS
Reviewed By: mrshenli
Differential Revision: D32680008
Pulled By: cpuhrsch
fbshipit-source-id: 502f8743c1c33ab802e3d9d097792887352cd220
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68566
These are just auto-linear as pointed out by Jeffrey.
ghstack-source-id: 143814393
Test Plan: - Run OpInfo tests.
Reviewed By: albanD, soulitzer
Differential Revision: D32520239
Pulled By: zou3519
fbshipit-source-id: 807115157b131e6370f364f61db1b14700279789
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66933
This PR exposes `torch.lu` as `torch.linalg.lu_factor` and
`torch.linalg.lu_factor_ex`.
This PR also adds support for matrices with zero elements both in
the size of the matrix and the batch. Note that this function simply
returns empty tensors of the correct size in this case.
We add a test and an OpInfo for the new function.
This PR also adds documentation for this new function in line of
the documentation in the rest of `torch.linalg`.
Fixes https://github.com/pytorch/pytorch/issues/56590
Fixes https://github.com/pytorch/pytorch/issues/64014
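A hedged usage sketch of the new entry points (shapes are illustrative):
```python
import torch

A = torch.randn(4, 3, 3)                         # batch of square matrices
LU, pivots = torch.linalg.lu_factor(A)           # checks for factorization errors
LU, pivots, info = torch.linalg.lu_factor_ex(A)  # reports failures via `info` instead

# Zero-sized inputs are supported and simply produce empty outputs.
LU0, piv0 = torch.linalg.lu_factor(torch.empty(0, 0))
print(LU0.shape, piv0.shape)                     # torch.Size([0, 0]) torch.Size([0])
```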
cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D32521980
Pulled By: mruberry
fbshipit-source-id: 26a49ebd87f8a41472f8cd4e9de4ddfb7f5581fb
Summary:
An update to https://github.com/pytorch/pytorch/issues/67442 to make sure all of the inputs produced are independent.
Updates group_norm and instance_norm (local_response_norm was already producing independent inputs).
Also fixes a bug in one set of instance_norm inputs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68526
Reviewed By: ngimel
Differential Revision: D32532076
Pulled By: samdow
fbshipit-source-id: 45b9320fd9aecead052b21f838f95887cfb71821