Commit Graph

3340 Commits

Author SHA1 Message Date
PyTorch MergeBot
d98a884b33 Revert "[cuDNN] (re-open) Enable cuDNN Frontend v8 API by Default (#87669)"
This reverts commit 3c6bddc3f6.

Reverted https://github.com/pytorch/pytorch/pull/87669 on behalf of https://github.com/eqy due to investigating convnext benchmark regressions
2022-11-08 19:04:25 +00:00
Edward Z. Yang
860e354d1c Support diag_embed.out decomposition (#88671)
This is a little tricky: there is a diag_embed.out, but it's not bound
in Python because it's autogenerated, see https://github.com/pytorch/pytorch/issues/88598
So I can't "just" add the out variant to the ref, as this makes it
inconsistent with the torch API.  To work around this, I mark the ref
as supporting out, but not the original function.

This is useful to do, because it means that diag_embed.out now supports
symbolic shapes.  However, this cannot be easily tested because
I can't mark the out variant as being supported in the normal OpInfo test.
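As a generic, hedged illustration of what "supporting out on the ref but not the original function" can look like, here is a stand-in reference with an out= parameter (the body just calls torch.diag_embed; the real decomposition and registration mechanism are not shown):

```python
import torch

def diag_embed_ref(t, offset=0, dim1=-2, dim2=-1, *, out=None):
    # Stand-in body: the real ref is a decomposition, not a call to the ATen op.
    result = torch.diag_embed(t, offset, dim1, dim2)
    if out is None:
        return result
    # Typical out= semantics: resize the destination, then copy the result in.
    out.resize_(result.shape)
    out.copy_(result)
    return out
```

Usage would be along the lines of `diag_embed_ref(x, out=buf)`, mirroring how other refs expose out= even when the public binding does not.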

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88671
Approved by: https://github.com/mruberry
2022-11-08 18:28:36 +00:00
Kurt Mohler
ee28b865ee Deprecate TypedStorage, its derived classes, and all of their public methods (#85303)
Part of #85302

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85303
Approved by: https://github.com/ezyang
2022-11-08 18:11:01 +00:00
Rodrigo Kumpera
6663ae5537 [2/n] Thread PG: add class _World to distributed_c10d.py (#781) (#88471)
Summary:
X-link: https://github.com/pytorch/torchrec/pull/781

Move a bunch of globals to instance members and replace all uses of them.

We move all PG-related globals under `_World` and use a singleton instance under `_world`.

This creates an undocumented extension point to inject full control of how c10d
state behaves.

One simple hack is to change `_world` to an implementation that uses a thread-local
and enables per-thread PGs.
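A hedged sketch of that thread-local hack; the `_World` class and the module-level `_world` name follow this PR's description, but the exact attribute surface and wiring are assumptions:

```python
import threading

import torch.distributed.distributed_c10d as c10d

class _ThreadLocalWorld(threading.local):
    """Per-thread c10d world state (illustrative; the _World attribute surface is assumed)."""

    def __init__(self):
        super().__init__()
        self._inner = c10d._World()  # each thread lazily builds its own state

    def __getattr__(self, name):
        return getattr(self._inner, name)

# The "simple hack": swap the module-level singleton for the thread-local one.
# c10d._world = _ThreadLocalWorld()
```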

This hack almost gets DDP working; the per-thread PG is just missing an implementation of all_reduce.

This enables notebook usage of PTD, which is a big deal for learning it:
https://gist.github.com/kumpera/32cb051fa26b8cad8bdf671f968dcd68

This change ensures BC by keeping the global variables around and having the default `_World` wrap them.

I have relinked this diff to a new github PR, so that I can update it. The original PR is
> Pull Request resolved: https://github.com/pytorch/pytorch/pull/86348

Differential Revision: D40236769

Pulled By: yhcharles

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88471
Approved by: https://github.com/gnadathur, https://github.com/rohan-varma
2022-11-07 17:56:40 +00:00
Brian Hirsh
a16ced03c9 reland "fix as_strided_scatter_backward (#87646)" (#88342)
This reverts commit 71fb763e54.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88342
Approved by: https://github.com/zou3519
2022-11-07 15:00:58 +00:00
PyTorch MergeBot
81042d3a53 Revert "Reenable optimizer overlap tests (#88439)"
This reverts commit da452bcadb.

Reverted https://github.com/pytorch/pytorch/pull/88439 on behalf of https://github.com/huydhn due to This change breaks trunk because of a land race: a missing `reason` parameter to `sandcastle_skip_if` (da452bcadb)
2022-11-06 02:29:53 +00:00
Nikita Karetnikov
bbaa0637df Add error inputs to gaussian_nll_loss OpInfo (#88486)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88486
Approved by: https://github.com/lezcano
2022-11-05 20:10:54 +00:00
Rohan Varma
da452bcadb Reenable optimizer overlap tests (#88439)
Closes https://github.com/pytorch/pytorch/issues/73259. Not sure of the root cause, but CI seems fine with these tests.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88439
Approved by: https://github.com/awgu
2022-11-05 18:26:01 +00:00
Andrew M. James
ff6770a9a1 enable backward for log1p (sparse layouts) (#88155)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88155
Approved by: https://github.com/cpuhrsch
2022-11-04 20:59:26 +00:00
Andrew M. James
6938dd0b2c Support sparse inputs to deg2rad (#88156)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88156
Approved by: https://github.com/cpuhrsch
2022-11-04 20:59:26 +00:00
Andrew M. James
f03302ba49 Add sparse layout support for torch.frac (#88153)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88153
Approved by: https://github.com/cpuhrsch
2022-11-04 20:59:22 +00:00
Catherine Lee
d632d94cc7 Disable mem leak check (#88373)
tbh at this point it might be easier to make a new workflow and copy the relevant jobs...

Changes:
* Disable cuda mem leak check except for on scheduled workflows
* Make pull and trunk run on a schedule which will run the memory leak check
* Periodic will always run the memory leak check -> periodic does not have parallelization anymore
* Concurrency check changed to be slightly more generous
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88373
Approved by: https://github.com/ZainRizvi, https://github.com/huydhn
2022-11-04 20:47:42 +00:00
PyTorch MergeBot
8c1c6759b2 Revert "remove assert_allclose from torch.testing (#87974)"
This reverts commit 5669e10d37.

Reverted https://github.com/pytorch/pytorch/pull/87974 on behalf of https://github.com/mehtanirav due to Internal breakages from method removal
2022-11-04 19:12:37 +00:00
Will Constable
70b00b1383 Add hf_bert + DDP multigpu test (#88435)
Spot-checks an e2e model working with ddp.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88435
Approved by: https://github.com/davidberard98
2022-11-04 03:17:48 +00:00
soulitzer
4c20c0509d Split out forward AD tests from test_ops_gradients and reenable slow gradcheck CI (#88216)
Fixes: https://github.com/pytorch/pytorch/issues/88010

This PR does a couple things to stop slow gradcheck from timing out:
- Splits out test_ops_fwd_gradients from test_ops_gradients, and factors out TestFwdGradients and TestBwdGradients which both inherit from TestGradients, now situated in common_utils (maybe there is a better place?)
- Skips CompositeCompliance (and several other test files) for slow gradcheck CI since they do not use gradcheck
- Because test times for test_ops_fwd_gradients and test_ops_gradients are either unknown or wrong, we hard-code them for now to prevent the two files from being sharded together. We can undo this hack once actual test times are updated. (`calculate_shards` divides tests with unknown test times in a round-robin fashion; see the sketch after this list.)
- Updates references to test_ops_gradients and TestGradients
- Test files that are skipped for slow gradcheck CI are now centrally located in run_tests.py. This reduces how fine-grained we can be with the skips, so for some skips (one so far) we still use the old skipping mechanism, e.g. for test_mps
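A hedged sketch (not the actual `calculate_shards` code) of the round-robin assignment of tests with unknown times mentioned above:

```python
# Hedged sketch, not the actual calculate_shards implementation.
def round_robin_shards(tests, num_shards):
    """Deal tests with unknown times out across shards in order."""
    shards = [[] for _ in range(num_shards)]
    for i, test in enumerate(tests):
        shards[i % num_shards].append(test)
    return shards

# e.g. round_robin_shards(["test_ops_gradients", "test_ops_fwd_gradients"], 2)
# puts the two files on different shards.
```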

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88216
Approved by: https://github.com/albanD
2022-11-03 00:20:45 +00:00
Huy Do
5b882a34c4 Consolidate macos pip dependencies (#88071)
Following the conda consolidation, this consolidates all macOS pip dependencies so that every dependency macOS CI needs is cached. Two small issues were found along the way in the `_mac-test-mps` workflow:

* It didn't have `Install macOS homebrew dependencies` to install libomp like the regular `_mac-test` workflow
* It didn't install `scipy`, thus silently skipping some `signal.windows` tests

Both are fixed in this PR
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88071
Approved by: https://github.com/malfet
2022-11-02 17:22:01 +00:00
PyTorch MergeBot
71fb763e54 Revert "fix as_strided_scatter_backward (#87646)"
This reverts commit f9d7985851.

Reverted https://github.com/pytorch/pytorch/pull/87646 on behalf of https://github.com/huydhn due to Sorry for reverting your PR, but I think this one or one of the PRs in the stack breaks bionic-cuda11.7 on trunk (70782981f0)
2022-11-02 16:54:36 +00:00
Brian Hirsh
f9d7985851 fix as_strided_scatter_backward (#87646)
as_strided_scatter's derivative formula was broken - instead of making a "mask" of 1's and 0's, it would effectively make a mask of 1's and uninitialized memory.
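A hedged, self-contained illustration of the mask construction described above; the shapes and the (size, stride, offset) triple are made up, and this is not the actual derivative formula:

```python
import torch

grad = torch.randn(4, 4)
size, stride, offset = (2, 2), (4, 1), 0   # region originally written by src

# Correct: scatter ones onto a zero base, giving an exact 0/1 mask.
mask = torch.as_strided_scatter(torch.zeros_like(grad), torch.ones(size), size, stride, offset)
grad_self = grad * (1 - mask)

# The bug was roughly equivalent to using an uninitialized base instead:
# torch.as_strided_scatter(torch.empty_like(grad), torch.ones(size), size, stride, offset)
```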

Fixes https://github.com/pytorch/pytorch/issues/88105

Pull Request resolved: https://github.com/pytorch/pytorch/pull/87646
Approved by: https://github.com/albanD
2022-11-02 14:36:49 +00:00
Philip Meier
5669e10d37 remove assert_allclose from torch.testing (#87974)
See #87969 or #86586 for the reasoning.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/87974
Approved by: https://github.com/mruberry
2022-11-02 14:05:01 +00:00
Philip Meier
b9c617838a remove make_non_contiguous from torch.testing (#87973)
See #87969 or #86586 for the reasoning.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/87973
Approved by: https://github.com/mruberry
2022-11-02 14:05:01 +00:00
Philip Meier
8893c6cd07 remove deprecated dtype getters from torch.testing (#87972)
See #87969 or #86586 for the reasoning.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/87972
Approved by: https://github.com/mruberry
2022-11-02 14:04:58 +00:00
Philip Meier
a360be50b5 remove deprecated device getter from torch.testing (#87971)
See #87969 or #86586 for the reasoning.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/87971
Approved by: https://github.com/mruberry
2022-11-02 14:04:54 +00:00
Philip Meier
554cdc9a63 remove deprecated rand and randn from torch.testing (#87970)
See #87969 or #86586 for the reasoning.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/87970
Approved by: https://github.com/mruberry
2022-11-02 14:04:51 +00:00
Philip Meier
bc73affdad prepare removal of deprecated functionality in torch.testing (#87969)
_Redo of #86586 with all BC breaking changes granularly placed into separate commits._

---

Per title. Deprecation happened on Feb 25, 2022 in c6f1bbc0ac, which made it into the 1.12 release. Since it is now 245 days later and the next release will be 1.14, the removals later in the stack comply with the [BC policy](https://github.com/pytorch/pytorch/wiki/PyTorch's-Python-Frontend-Backward-and-Forward-Compatibility-Policy#minimizing-the-disruption-of-bc-breaking-changes).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87969
Approved by: https://github.com/mruberry
2022-11-02 14:04:48 +00:00
Andrew Gu
d6b58d6924 [FSDP()][23/N] Refactor handle attr initialization (#87938)
**`_init_param_attributes()` -> `init_flat_param_attributes()`**
We move `_init_param_attributes()` to `FlatParamHandle.init_flat_param_attributes()` (as already marked as to-do during previous refactoring).

**`_reset_lazy_init()`**
We no longer delete `_local_shard` from each `FlatParameter` in `_reset_lazy_init()`.

**Analysis**
Thus, the two semantic differences are that we remove the initial `if hasattr(p, "_local_shard")` early return in `_init_param_attributes()` and the `delattr(p, "_local_shard")` in `_reset_lazy_init()`.

This is safe because
- If we never call `_reset_lazy_init()`, then `init_flat_param_attributes()` is only called once. There is no opportunity for an early return.
- If we call `_reset_lazy_init()`, then `init_flat_param_attributes()` will be called again in the next `_lazy_init()`. However, since we removed the early return, all of the attributes initialized in `init_flat_param_attributes()` simply get re-initialized and override any existing attributes.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87938
Approved by: https://github.com/mrshenli
2022-11-02 11:32:56 +00:00
lezcano
39d9d2ed70 Implement reference for lerp (#87424)
We follow the vectorised CPU implementation for numerical accuracy
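A hedged sketch of that numerically-friendly formulation (the actual reference in torch._refs may differ in details such as dtype handling):

```python
import torch

def lerp_ref(start, end, weight):
    # Interpolate from whichever endpoint is closer, which keeps the result
    # exact at weight == 0 and weight == 1 and reduces cancellation error.
    weight = torch.as_tensor(weight, dtype=start.dtype, device=start.device)
    return torch.where(
        weight < 0.5,
        start + weight * (end - start),
        end - (end - start) * (1 - weight),
    )
```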

Pull Request resolved: https://github.com/pytorch/pytorch/pull/87424
Approved by: https://github.com/ezyang
2022-11-02 11:21:01 +00:00
Kazuaki Ishizaki
2ddefbdc3c Fix typos used in documents under torch directory (#88300)
This PR fixes typos in comments of Python files that were found via the search box at https://pytorch.org/docs/master/search.html

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88300
Approved by: https://github.com/lezcano
2022-11-02 09:38:13 +00:00
Kshiteej K
e763b7abeb [complex] conv_transpose3d : complex support (#87967)
Reference: https://github.com/pytorch/pytorch/issues/71108

Pull Request resolved: https://github.com/pytorch/pytorch/pull/87967
Approved by: https://github.com/anjali411
2022-11-02 06:37:33 +00:00
eqy
3c6bddc3f6 [cuDNN] (re-open) Enable cuDNN Frontend v8 API by Default (#87669)
#58414

Has a small tweak to a test that was breaking on A10 (CC @malfet).

CC @ptrblck @ngimel
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87669
Approved by: https://github.com/ngimel
2022-11-02 01:36:37 +00:00
Peter Bell
dfa9475755 Check SM version before calling flash attention with BFloat16 (#86600)
The flash attention code path requires sm80 or newer to run with
BFloat16, so on older GPUs any OpInfo tests running with BFloat16 would
fail with the error:
```
RuntimeError: Expected q_dtype == at::kHalf || (is_sm8x && q_dtype == at::kBFloat16) to be true, but got false.
```
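A hedged sketch of the kind of capability guard this implies on the caller/test side; the helper name is illustrative, and the real check in this PR lives in the OpInfo/test code:

```python
import torch

def flash_attention_bf16_supported() -> bool:
    # sm80 (Ampere) or newer is required for the BFloat16 flash-attention path.
    if not torch.cuda.is_available():
        return False
    major, _ = torch.cuda.get_device_capability()
    return major >= 8

# e.g. skip or downcast when unsupported:
dtype = torch.bfloat16 if flash_attention_bf16_supported() else torch.float16
```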
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86600
Approved by: https://github.com/ngimel
2022-11-02 00:52:30 +00:00
Peter Bell
bc9caafc78 record_function: update to use custom_class API (#76420)
Re-submit of gh-72302

This still has a small performance hit, but it is much smaller. On my
machine I see `_record_function_exit._RecordFunction` takes 1.05 us
compared to the `Tensor` overload taking 0.79 us.

In an overall comparison, I see a 0.7 us slowdown from 6.0 us to
6.7 us for this timeit benchmark
```python
import torch

def foo():
  with torch.profiler.record_function("foo"):
    return torch.eye(3)

%timeit foo()
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76420
Approved by: https://github.com/robieta
2022-11-02 00:39:28 +00:00
Andrew M. James
d044b4cc58 Update torch.abs and torch.positive opinfos to reflect sparse support (#88151)
cc @nikitaved @pearu @cpuhrsch @bhosmer
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88151
Approved by: https://github.com/cpuhrsch
2022-11-01 22:18:56 +00:00
Yanli Zhao
44f8efd5c1 [BE]fix DDP when the number of output features is zero (#87793)
Fixes #87280

Pull Request resolved: https://github.com/pytorch/pytorch/pull/87793
Approved by: https://github.com/rohan-varma
2022-11-01 15:27:40 +00:00
Andrew Gu
b1750d0440 [FSDP()][13/N] Refactor unshard/reshard/grads (#87926)
This PR is not too complicated. We just move unshard/reshard/grads out to `_runtime_utils.py` and make them take `state: _State` instead of `self`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87926
Approved by: https://github.com/mrshenli
2022-11-01 13:37:31 +00:00
Sean Ross-Ross
1a9edc8136 Changing from sample_inputs to reference_inputs in test_compare_cpu (#86462)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86462
Approved by: https://github.com/lezcano, https://github.com/mruberry
2022-10-31 20:06:03 +00:00
Edward Z. Yang
ff94494644 Revert "Revert "Unify meta tensor and fake tensor converter conversion (#87943)"" (#88045)
This reverts commit bc64999b83.

Check torch/_subclasses/meta_utils.py for "This is very tricky" for the bugfix explanation.

cc @mlazos @soumith @voznesenskym @yanboliang @penguinwu @anijain2305 @EikanWang @jgong5 @Guobing-Chen @chunyuan-w @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88045
Approved by: https://github.com/kit1980, https://github.com/Chillee
2022-10-31 17:50:14 +00:00
Fuzzkatt
d13f1e6ab4 Add sequence number support for UCC (#85047)
Add sequence number support for UCC, mostly following the format of ProcessGroupNCCL.
Pass new test: `test_all_gather_object_subgroup`
Add skips for gather tests: `test_gather_object` and `test_gather_object_subgroup`

cc @mrshenli @pritamdamania87 @zhaojuanmao @satgera @rohan-varma @gqchen @aazzolini @osalpekar @jiayisuse @H-Huang @kwen2501 @awgu
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85047
Approved by: https://github.com/kwen2501
2022-10-31 03:56:55 +00:00
PyTorch MergeBot
bc64999b83 Revert "Unify meta tensor and fake tensor converter conversion (#87943)"
This reverts commit baa715e790.

Reverted https://github.com/pytorch/pytorch/pull/87943 on behalf of https://github.com/kit1980 due to Broke several inductor tests
2022-10-29 18:39:28 +00:00
Edward Z. Yang
baa715e790 Unify meta tensor and fake tensor converter conversion (#87943)
Meta tensor does a lot of work to make sure tensors "look" similar
to their originals; e.g., if the original was a non-leaf, the meta
converter ensures the meta tensor is a non-leaf too.  Fake tensor
destroyed some of these properties when it wrapped the result in a FakeTensor.

This patch pushes the FakeTensor constructor into the meta converter
itself, so that we first create a fake tensor, and then we do various
convertibility bits to it to make it look right.

The two tricky bits:

- We need to have no_dispatch enabled when we allocate the initial meta
  tensor, or fake tensor gets mad at us for making a meta fake tensor.
  This necessitates the double-callback structure of the callback
  arguments: the meta construction happens *inside* the function so
  it is covered by no_dispatch

- I can't store tensors for the storages anymore, as that will result
  in a leak.  But we have untyped storage now, so I just store untyped
  storages instead.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

cc @jansel @mlazos @soumith @voznesenskym @yanboliang @penguinwu @anijain2305 @EikanWang @jgong5 @Guobing-Chen @chunyuan-w @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87943
Approved by: https://github.com/eellison, https://github.com/albanD
2022-10-29 15:01:07 +00:00
Andrew Gu
e667c00656 [FSDP()][2/N] Refactor training state (#87916)
This PR actually has meaningful changes. We stratify `TrainingState` into two levels: one is per FSDP instance and one is per `FlatParamHandle`/`FlatParameter` (see the sketch after this list).
- At the FSDP instance level, we only care about `IDLE`, FSDP computation (i.e. `FORWARD_BACKWARD`), or `SUMMON_FULL_PARAMS`. These dynamically modify behavior (e.g. `summon_full_params()` forces full precision).
- At the `FlatParamHandle` level, we care about the training state for invariants and debugging. Hence, we keep `IDLE`, `FORWARD`, `BACKWARD_PRE`, `BACKWARD_POST`, and `SUMMON_FULL_PARAMS`.
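A rough sketch of those two levels as enums; the member names follow this description, while the class names and their exact location in the FSDP code are an assumption:

```python
import enum

class TrainingState(enum.Enum):  # per FSDP instance
    IDLE = enum.auto()
    FORWARD_BACKWARD = enum.auto()
    SUMMON_FULL_PARAMS = enum.auto()

class HandleTrainingState(enum.Enum):  # per FlatParamHandle / FlatParameter
    IDLE = enum.auto()
    FORWARD = enum.auto()
    BACKWARD_PRE = enum.auto()
    BACKWARD_POST = enum.auto()
    SUMMON_FULL_PARAMS = enum.auto()
```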
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87916
Approved by: https://github.com/mrshenli
2022-10-29 06:50:30 +00:00
Sherlock Huang
e8a97a3721 FakeTensorMode and Prims.add/sub/mul/div support scalar only inputs (#87759)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87759
Approved by: https://github.com/ngimel, https://github.com/mruberry, https://github.com/eellison
2022-10-28 04:34:25 +00:00
lezcano
f21d0b310c Add decomposition for diagonal_scatter (#87282)
cc @ezyang @mruberry @ngimel @Lezcano @fdrocha
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87282
Approved by: https://github.com/mruberry
2022-10-28 00:50:29 +00:00
Alvaro Gaona
46b16977d9 Reimplement Kaiser window (#87330)
Relates to #85366

- For reference follow #87082.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87330
Approved by: https://github.com/lezcano, https://github.com/mruberry
2022-10-27 21:01:01 +00:00
Natalia Gimelshein
f1b78224ca Fix type promotion for 2 wrapped scalar args (#87845)
Fixes #76801

Pull Request resolved: https://github.com/pytorch/pytorch/pull/87845
Approved by: https://github.com/SherlockNoMad, https://github.com/mruberry
2022-10-27 15:53:11 +00:00
kshitij12345
1780e0ef7f [complex] conv_transpose2d (#81805)
Reference: https://github.com/pytorch/pytorch/issues/71108

Fixes : #86414
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81805
Approved by: https://github.com/anjali411
2022-10-27 10:46:53 +00:00
Andrew Gu
107f92a683 [FSDP] ufmt FSDP test (#87812)
This applies `ufmt` to all of the FSDP test files in the `test/distributed/fsdp/` directory.

**Test Plan**
CI

**Notes**
For VSCode users,
- Install `ufmt`: https://pypi.org/project/ufmt/
- Install VSCode `ufmt` extension: https://marketplace.visualstudio.com/items?itemName=omnilib.ufmt
- Include in `settings.json`:
```
{
    "[python]": {
        "editor.defaultFormatter": "omnilib.ufmt",
        "editor.formatOnSave": true,
    },
}
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87812
Approved by: https://github.com/rohan-varma
2022-10-27 04:25:55 +00:00
wchen61
2c66889f90 Synchronize before change cuda stream (#82050) (#82056)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/82050

Need to synchronize before changing the CUDA stream.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/82056
Approved by: https://github.com/ngimel
2022-10-26 23:44:13 +00:00
Nikita Karetnikov
59b9d29260 [primTorch] Check error_regex in test_python_ref_errors (#86987)
cc @ezyang @mruberry @ngimel @Lezcano @fdrocha
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86987
Approved by: https://github.com/lezcano, https://github.com/mruberry
2022-10-26 23:34:34 +00:00
jpvillam
38dd4cbdf1 ROCm enable sparse_sampled_addmm (#86401)
Enables:
test_comprehensive_sparse_sampled_addmm_cuda_complex128
test_comprehensive_sparse_sampled_addmm_cuda_complex64
test_comprehensive_sparse_sampled_addmm_cuda_float32
test_comprehensive_sparse_sampled_addmm_cuda_float64
test_dispatch_meta_sparse_sampled_addmm_cuda_complex128
test_dispatch_meta_sparse_sampled_addmm_cuda_complex64
test_dispatch_meta_sparse_sampled_addmm_cuda_float32
test_dispatch_meta_sparse_sampled_addmm_cuda_float64
test_meta_sparse_sampled_addmm_cuda_complex128
test_meta_sparse_sampled_addmm_cuda_complex64
test_meta_sparse_sampled_addmm_cuda_float32
test_meta_sparse_sampled_addmm_cuda_float64

Pull Request resolved: https://github.com/pytorch/pytorch/pull/86401
Approved by: https://github.com/ngimel
2022-10-26 19:39:24 +00:00
Bill Schnurr
0367c12bce Fix torch.testing.assert_close not exported from module (#87619)
For pylance/pyright static typechecking
"Imported symbols are considered private by default. If they use the “import A as A” (a redundant module alias), “from X import A as A” (a redundant symbol alias)" https://github.com/microsoft/pyright/blob/main/docs/typed-libraries.md#library-interface

torch.testing.assert_close not exported from module https://github.com/microsoft/pylance-release/issues/3526
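A minimal sketch of the redundant-alias re-export this implies for torch/testing/__init__.py (the source module torch.testing._comparison is an assumption here):

```python
# Illustrative only: the redundant "as" alias marks the name as part of the
# public interface for pyright/pylance.  The source module is an assumption.
from torch.testing._comparison import assert_close as assert_close
```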


Pull Request resolved: https://github.com/pytorch/pytorch/pull/87619
Approved by: https://github.com/kit1980
2022-10-25 04:47:13 +00:00