Commit Graph

3017 Commits

Author SHA1 Message Date
Andrei Gheorghe
6275f91654 Improved DDP checkpoint documentation (#106985)
Amended the documentation for the specified case.

Fixes #84589

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106985
Approved by: https://github.com/wanchaol, https://github.com/fduwjj
2023-09-25 22:54:24 +00:00
Mikayla Gawarecki
abd83ce180 Small fix in SDPA docstring codeblock (#109086)
Fix https://github.com/pytorch/pytorch/issues/109072

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109086
Approved by: https://github.com/drisspg
2023-09-12 16:48:46 +00:00
FFFrog
bc3f0d341a LazyBatchNorm{1-3}d support dict&set (#109015)
Fixes #105292

As the title states, LazyBatchNorm{1-3}d did not support dict & set; this change keeps it consistent with BatchNorm{1-3}d.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109015
Approved by: https://github.com/mikaylagawarecki
2023-09-12 09:09:59 +00:00
FFFrog
003c5bb156 Add checks to num_layers for RNN, LSTM, GRU (#108853)
Fixes #108223

As the title states, this adds validation of `num_layers` for RNN, LSTM, and GRU (a minimal sketch follows).
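The example values and the exact exception type below are assumptions on my part, not taken from the PR:

```
import torch.nn as nn

# A non-positive num_layers is expected to be rejected at construction time
# rather than failing later or silently building a degenerate module.
try:
    nn.LSTM(input_size=8, hidden_size=16, num_layers=0)
except (ValueError, RuntimeError) as e:
    print("rejected:", e)
```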

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108853
Approved by: https://github.com/mikaylagawarecki
2023-09-09 19:33:52 +00:00
Randolf Scholz
ddbaad6d74 updated pad_sequence type hint (#108765)
Fixes #89623

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108765
Approved by: https://github.com/malfet, https://github.com/zou3519, https://github.com/ezyang
2023-09-08 13:06:03 +00:00
Randolf Scholz
8391e3fba4 fixed nn.Module.to type hint (#108767)
Fixes #108675

- [x] adds `str` as an option for `device`
- [x] uses `typing_extensions.Self` instead of `T` (see the sketch below).
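A rough sketch of the kind of signature this enables; the `TinyNet`/`move` names are illustrative only, and the real annotation lives on `nn.Module.to`:

```
from typing import Optional, Union

import torch
from typing_extensions import Self


class TinyNet(torch.nn.Module):
    # Illustrative wrapper: `device` may be a plain string, and the return
    # type is `Self` rather than a bare TypeVar `T`.
    def move(self, device: Optional[Union[str, torch.device]] = None) -> Self:
        return self.to(device)


net = TinyNet().move("cpu")  # a str device is a valid argument
```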

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108767
Approved by: https://github.com/ezyang
2023-09-08 02:40:53 +00:00
Minh-Long Luu (刘明龙)
95f268e426 Add examples for nn.CosineEmbeddingLoss (#108215)
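A usage example along the lines of what the PR adds (the exact docstring example may differ; values here are arbitrary):

```
import torch
import torch.nn as nn

loss_fn = nn.CosineEmbeddingLoss(margin=0.5)
x1 = torch.randn(3, 8)
x2 = torch.randn(3, 8)
target = torch.tensor([1, -1, 1])  # 1 marks a similar pair, -1 a dissimilar one
print(loss_fn(x1, x2, target))
```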
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108215
Approved by: https://github.com/mikaylagawarecki
2023-08-31 20:01:24 +00:00
hasteinmetz
b535ed2c1a Update to RNN documentation (issue #106085) (#106222)
Addresses [issue #106085](https://github.com/pytorch/pytorch/issues/106085).

In `torch/nn/modules/rnn.py`:
- Adds documentation string to RNNBase class.
- Adds parameters to the __init__ methods for the RNN, LSTM, and GRU classes.
- Adds type annotations to __init__ methods for RNN, LSTM, and GRU.

In `torch/ao/nn/quantized/dynamic/modules/rnn.py`:
- Adds type specifications to `_FLOAT_MODULE` attributes in RNNBase, RNN, LSTM, and GRU classes.
> This resolves a `mypy` assignment error `Incompatible types in assignment (expression has type "Type[LSTM]", base class "RNNBase" defined the type as "Type[RNNBase]")` that seemed to be a result of the fully specified type annotations in `torch/nn/modules/rnn.py`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106222
Approved by: https://github.com/mikaylagawarecki
2023-08-31 00:50:32 +00:00
Mikayla Gawarecki
584a01b650 Fix LayerNorm(bias=False) error (#108060)
Fixes #108048

- [ ] Cherry pick this [here](https://github.com/pytorch/pytorch/issues/108055)
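For reference, the configuration being fixed looks roughly like this (a sketch; bias-less LayerNorm itself requires a version that already contains #101683):

```
import torch
import torch.nn as nn

ln = nn.LayerNorm(16, bias=False)   # construction without a bias term
out = ln(torch.randn(2, 4, 16))
print(out.shape, ln.bias)           # ln.bias is None when bias=False
```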

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108060
Approved by: https://github.com/jbschlosser, https://github.com/albanD, https://github.com/malfet
2023-08-28 18:23:13 +00:00
FFFrog
969bf8a054 Fix the document of torch.nn.functional.conv2d (#107851)
Fixes #107692

Fixes the documentation of torch.nn.functional.conv2d; a usage example follows.
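Plain usage of the function whose docs are corrected (shapes chosen arbitrarily here):

```
import torch
import torch.nn.functional as F

inp = torch.randn(1, 3, 8, 8)      # (N, C_in, H, W)
weight = torch.randn(6, 3, 3, 3)   # (C_out, C_in // groups, kH, kW)
out = F.conv2d(inp, weight, bias=None, stride=1, padding=1)
print(out.shape)                   # torch.Size([1, 6, 8, 8])
```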
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107851
Approved by: https://github.com/mikaylagawarecki
2023-08-24 18:02:03 +00:00
dvorst
91a674ccd4 Fix docstring for shape of target for MultiLabelSoftMarginLoss (#107817)
Fixes #92000

The documentation at https://pytorch.org/docs/stable/generated/torch.nn.MultiLabelSoftMarginLoss.html#multilabelsoftmarginloss states:
> label targets padded by -1 ensuring same shape as the input.

However, the shapes of the input and target tensors are compared, and an exception is raised if they differ in either dimension 0 or 1, meaning the label targets are never padded. See the code snippet below and the resulting output. The documentation is therefore adjusted to:
> label targets must have the same shape as the input.

```
import torch
import torch.nn as nn

# Create some example data
input = torch.tensor(
    [
        [0.8, 0.2, -0.5],
        [0.1, 0.9, 0.3],
    ]
)
target1 = torch.tensor(
    [
        [1, 0, 1],
        [0, 1, 1],
        [0, 1, 1],
    ]
)
target2 = torch.tensor(
    [
        [1, 0],
        [0, 1],
    ]
)
target3 = torch.tensor(
    [
        [1, 0, 1],
        [0, 1, 1],
    ]
)
loss_func = nn.MultiLabelSoftMarginLoss()
try:
    loss = loss_func(input, target1).item()
except RuntimeError as e:
    print('target1 ', e)
try:
    loss = loss_func(input, target2).item()
except RuntimeError as e:
    print('target2 ', e)
loss = loss_func(input, target3).item()
print('target3 ', loss)
```

output:
```
target1  The size of tensor a (3) must match the size of tensor b (2) at non-singleton dimension 0
target2  The size of tensor a (2) must match the size of tensor b (3) at non-singleton dimension 1
target3  0.6305370926856995
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107817
Approved by: https://github.com/mikaylagawarecki
2023-08-24 15:13:46 +00:00
kato8966
f7a51c4208 fix pad_sequence docstring (#107669)
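For context, a typical call to the function whose docstring is fixed (example values are mine):

```
import torch
from torch.nn.utils.rnn import pad_sequence

a, b, c = torch.ones(3, 5), torch.ones(2, 5), torch.ones(1, 5)
padded = pad_sequence([a, b, c], batch_first=True, padding_value=0.0)
print(padded.shape)  # torch.Size([3, 3, 5]): padded to the longest sequence
```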
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107669
Approved by: https://github.com/mikaylagawarecki
2023-08-23 18:01:39 +00:00
Mikayla Gawarecki
48b1208e05 Disable nn.MHA fastpath for floating point masks (#107641)
Fixes https://github.com/pytorch/pytorch/issues/107084 by disabling the fast path when floating point masks (which should be additive) are passed

- [We claim in our docs for MHA that float masks will be added to the attention](https://pytorch.org/docs/stable/generated/torch.nn.MultiheadAttention.html) (be it `key_padding_mask` or `attn_mask`)
- We always canonicalize any mask at the start of MHA in python by converting it to float
- my understanding from Driss is that SDPA properly supports additive masking, but there are many special cases for mask shape in MHA that don't work properly currently (BxT, TxT), so [we're turning this off for now](https://github.com/pytorch/pytorch/blob/main/aten/src/ATen/native/transformers/cuda/attention.cu#L531-L532)
- More broadly, the problem isn't with the SDPA path, but that things are broken for the path it falls back to
- Right now MHA "fast path" code with non-None masks always goes through [this path](https://github.com/pytorch/pytorch/blob/main/aten/src/ATen/native/transformers/cuda/attention.cu#L554-L640), which has a call to `masked_softmax` that [converts the masks back to bool](https://github.com/pytorch/pytorch/blob/main/aten/src/ATen/native/transformers/attention.cpp#L154-L156)
- the implication here is that **additive floating point attn_mask and additive key_padding_mask passed to the nn.MHA fastpath are broken** (see the sketch below for the affected configuration)
- This wasn't broken for the user in [#107084](https://github.com/pytorch/pytorch/issues/107084) in 1.13.1 because of [this check which bypassed the fast path if attn_mask was defined](https://github.com/pytorch/pytorch/blob/v1.13.1/torch/nn/modules/activation.py#L1096-L1097) (as Driss pointed out, though, additive key_padding_mask with the fast path was probably broken in 1.13.1)
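A sketch of the affected configuration, i.e. an additive (floating point) `attn_mask` passed to `nn.MultiheadAttention` in eval mode; sizes are arbitrary, and this illustrates the input pattern rather than the internal dispatch:

```
import torch
import torch.nn as nn

mha = nn.MultiheadAttention(embed_dim=16, num_heads=4, batch_first=True).eval()
q = torch.randn(2, 5, 16)
attn_mask = torch.zeros(5, 5)        # additive mask: 0.0 keeps a position
attn_mask[:, -1] = float("-inf")     # -inf masks the last key position out
with torch.no_grad():
    out, weights = mha(q, q, q, attn_mask=attn_mask)
print(out.shape)
```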

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107641
Approved by: https://github.com/drisspg, https://github.com/jbschlosser
2023-08-23 15:08:18 +00:00
Aaron Gokaslan
660e8060ad [BE]: Update ruff to 0.285 (#107519)
This updates ruff to 0.285, which is faster, better, and fixes a bunch of false negatives with regard to f-strings.

I also enabled RUF017, which looks for accidental quadratic list summation (see the example below). Luckily, there seem to be no instances of it in our codebase, so enabling it ensures it stays that way. :)
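For reference, roughly the pattern RUF017 targets and the linear alternatives it points to (my own toy example):

```
import functools
import itertools
import operator

lists = [[1, 2], [3], [4, 5, 6]]

flat_slow = sum(lists, [])  # quadratic: builds a new list per element of `lists`

# Linear alternatives:
flat_chain = list(itertools.chain.from_iterable(lists))
flat_reduce = functools.reduce(operator.iconcat, lists, [])
assert flat_slow == flat_chain == flat_reduce
```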

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519
Approved by: https://github.com/ezyang
2023-08-22 23:16:38 +00:00
PyTorch MergeBot
d59a6864fb Revert "[BE]: Update ruff to 0.285 (#107519)"
This reverts commit 88ab3e4322.

Reverted https://github.com/pytorch/pytorch/pull/107519 on behalf of https://github.com/ZainRizvi due to Sorry, but this PR breaks internal tests. @ezyang, can you please help them get unblocked? It seems like one of the strings was probably accidentally modified ([comment](https://github.com/pytorch/pytorch/pull/107519#issuecomment-1688833480))
2023-08-22 19:53:32 +00:00
Aaron Gokaslan
88ab3e4322 [BE]: Update ruff to 0.285 (#107519)
This updates ruff to 0.285, which is faster, better, and fixes a bunch of false negatives with regard to f-strings.

I also enabled RUF017, which looks for accidental quadratic list summation. Luckily, there seem to be no instances of it in our codebase, so enabling it ensures it stays that way. :)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519
Approved by: https://github.com/ezyang
2023-08-20 01:36:18 +00:00
박종의 (PARK, Jongeui)
af0ed25ea8 Change >= in the GRU and the LSTM document to \ge (#107379)
Change >= in the GRU and LSTM documentation to \ge
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107379
Approved by: https://github.com/ezyang
2023-08-18 20:44:51 +00:00
FFFrog
2d2d43d9fb add more check on LSTMCell (#107380)
Just like #107223, the ``LSTMCell`` operator has the same problems as ``GRUCell``; this adds checks and related tests to fix it (see the sketch below).
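A minimal sketch of the kind of misuse such checks are meant to catch; the values and the exact exception type are assumptions:

```
import torch
import torch.nn as nn

cell = nn.LSTMCell(input_size=4, hidden_size=8)
x = torch.randn(2, 4)                          # batched 2D input
h0, c0 = torch.randn(2, 8), torch.randn(2, 8)
h1, c1 = cell(x, (h0, c0))                     # well-formed call
try:
    cell(x, (torch.randn(8), c0))              # 1D hidden with 2D input
except (RuntimeError, ValueError) as e:
    print("rejected:", e)
```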
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107380
Approved by: https://github.com/ezyang
2023-08-18 20:44:17 +00:00
Kurt Mohler
36141de427 Throw error if stateless.functional_call called with nn.DataParallel (#107403)
Part of #77576

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107403
Approved by: https://github.com/mikaylagawarecki
2023-08-18 03:02:04 +00:00
FFFrog
a4229690e3 Add Some Checks about dim (#107223)
Fixes #106769

As mentioned in [GRUCell](https://pytorch.org/docs/stable/generated/torch.nn.GRUCell.html#grucell), `hidden` should have the same dimension as `input`, and the dimension should be either `1D` or `2D`.

Other aspects are already verified in the `C++` code, e.g. that the batch sizes of `input` and `hidden` match, that `input`'s dim 1 matches `input_size`, and that `hidden`'s dim 1 matches `hidden_size` (see the sketch below).
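A minimal sketch of the constraint described above; example values are mine and the exact error type is an assumption:

```
import torch
import torch.nn as nn

cell = nn.GRUCell(input_size=4, hidden_size=8)
x, h = torch.randn(3, 4), torch.randn(3, 8)
print(cell(x, h).shape)            # torch.Size([3, 8])
try:
    cell(torch.randn(4), h)        # 1D input vs. 2D hidden should be rejected
except (RuntimeError, ValueError) as e:
    print("rejected:", e)
```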
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107223
Approved by: https://github.com/albanD
2023-08-16 22:03:31 +00:00
Mikayla Gawarecki
b08b0c915f [easy] Fix docs for sd calculation in BatchNorm1d/3d for consistency with BatchNorm2d (#107308)
Fixes https://github.com/pytorch/pytorch/issues/100048

BatchNorm2d docs were updated in https://github.com/pytorch/pytorch/pull/97974. There have been a number of issues filed due to confusion about this, so I think we should fix it before the branch cut.
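The documented point, shown numerically (a sketch with arbitrary shapes): during training, normalization uses the biased variance estimator, i.e. `torch.var(..., unbiased=False)`.

```
import torch
import torch.nn as nn

x = torch.randn(4, 3, 10)                 # (N, C, L) for BatchNorm1d
bn = nn.BatchNorm1d(3, affine=False)      # training mode by default
out = bn(x)
mean = x.mean(dim=(0, 2), keepdim=True)
var = x.var(dim=(0, 2), unbiased=False, keepdim=True)
manual = (x - mean) / torch.sqrt(var + bn.eps)
print(torch.allclose(out, manual, atol=1e-6))  # True
```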

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107308
Approved by: https://github.com/albanD
2023-08-16 21:51:02 +00:00
Guang Yang
0b57581dec [pytorch] Disable fast path in MultiheadAttention in Export (#106824)
Summary:
We are seeing that the `aten._native_multi_head_attention` op (which is not in the core ATen op set) is left in the exported graph and causes problems downstream at runtime.

Two proposed solutions:
 1. Disable fast path while tracing to leverage the non-optimized path to get decomp, that way, the blamed op won't show up in the exported graph
 2. Add a decomp rule for `aten._native_multi_head_attention`

After discussing with kimishpatel and bdhirsh, #1 is preferred and verified it could immediately unblock the critical model enablement work for PP.

Test Plan: CI

Differential Revision: D48169806

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106824
Approved by: https://github.com/kimishpatel
2023-08-10 00:18:37 +00:00
Jason Lu
bc88028e8e Back out "Reland "Make adding buffers more like adding parameters (#104069)" (#106224)" (#106743)
Summary:
Original commit changeset: 81319beb97f3

Original Phabricator Diff: D47961182

Test Plan: revert to maintain backward compat with legacy ads_dper3 production package. Read details in: S357822

Reviewed By: atuljangra

Differential Revision: D48131623

@diff-train-skip-merge
(D48131623 landed internally)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106743
Approved by: https://github.com/malfet
2023-08-08 15:27:34 +00:00
Mikayla Gawarecki
1317dbf176 Reland "Add nn.CircularPad{*}d for consistency + fix no_batch_dim support (#106148)" (#106632)
The previous one was reverted because the PR stacked under it (https://github.com/pytorch/pytorch/pull/106147, which added error checking to the Pad variants) was reverted, as internally some people pass 2D inputs to ZeroPad2d (which should actually take 3D or 4D inputs :) ). As far as I understand, there wasn't actually anything this PR itself was breaking.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106632
Approved by: https://github.com/albanD
2023-08-07 20:10:25 +00:00
Mikayla Gawarecki
786977c647 [easy] Add reset_parameters for nn.PRelu (#106507)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106507
Approved by: https://github.com/albanD
2023-08-04 23:22:42 +00:00
Michael Gschwind
63d45275f4 is causal hints for transformer (#106143)
Summary:
Make is_causal hint flags available for the top-level transformer module.

It's debatable whether this is useful -- at present we autodetect causal masks for the src and tgt masks in the transformer encoder and decoder, respectively. Making is_causal flags available would let users short-cut this check by asserting whether or not their mask is causal.

I am putting this diff up for discussion, not as a solution. Not doing anything may be the right solution, unless there is strong (data-driven) user demand -- it appears the consensus is to move ahead with this, per the discussions below.
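A hedged sketch of how such a hint could be used at the top level; the exact keyword names (e.g. `tgt_is_causal`) depend on the final API and the PyTorch version:

```
import torch
import torch.nn as nn

model = nn.Transformer(d_model=16, nhead=4, batch_first=True)
src = torch.randn(2, 6, 16)
tgt = torch.randn(2, 5, 16)
tgt_mask = nn.Transformer.generate_square_subsequent_mask(5)
# The caller asserts the mask is causal so the module can skip re-detection.
out = model(src, tgt, tgt_mask=tgt_mask, tgt_is_causal=True)
print(out.shape)  # torch.Size([2, 5, 16])
```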

@cpuhrsch @mikaylagawarecki @jbschlosser @janEbert

Test Plan: sandcastle

Differential Revision: D47373260

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106143
Approved by: https://github.com/mikaylagawarecki
2023-08-04 14:16:48 +00:00
Michael Gschwind
3db255020b Clarify the clarification (#106358)
Summary: Clarify the clarification

Differential Revision: D47941982

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106358
Approved by: https://github.com/mikaylagawarecki
2023-08-03 16:58:36 +00:00
PyTorch MergeBot
dfcfd5cedb Revert "Add nn.CircularPad{*}d for consistency + fix no_batch_dim support (#106148)"
This reverts commit 87d2536971.

Reverted https://github.com/pytorch/pytorch/pull/106148 on behalf of https://github.com/malfet due to Reverting as dependent PR https://github.com/pytorch/pytorch/pull/106147 was reverted as well ([comment](https://github.com/pytorch/pytorch/pull/106148#issuecomment-1662344543))
2023-08-02 14:46:00 +00:00
PyTorch MergeBot
d83b887f2a Revert "Add error checking for padding modules (#106147)"
This reverts commit 0547b6279d.

Reverted https://github.com/pytorch/pytorch/pull/106147 on behalf of https://github.com/jeanschmidt due to sadly it is breaking internal builds, and I can't coordinate a FF due to timezone differences ([comment](https://github.com/pytorch/pytorch/pull/106147#issuecomment-1661870970))
2023-08-02 09:37:40 +00:00
Danni Li
5e3aca6c5c [BE] Input check for torch.nn.MultiheadAttention (#106363)
Summary: Check `embed_dim` and `num_heads` of `torch.nn.MultiheadAttention`.
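A minimal sketch of the kind of input such a check presumably rejects at construction time; the exact exception type is an assumption:

```
import torch.nn as nn

try:
    nn.MultiheadAttention(embed_dim=0, num_heads=2)   # non-positive embed_dim
except (ValueError, AssertionError) as e:
    print("rejected:", e)
```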

Test Plan: Please see GitHub Actions.

Differential Revision: D47943134

Fix: #105630

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106363
Approved by: https://github.com/mikaylagawarecki
2023-08-01 23:28:23 +00:00
Danni Li
1a6f1d816d [Doc] Add proj_size < hidden_size in LSTM (#106364)
Summary:
Add parameter constraint: `proj_size` has to be smaller than `hidden_size` in RNNBase doc.

Ref:
ceea08a986/torch/nn/modules/rnn.py (L83)

ceea08a986/torch/nn/modules/rnn.py (L458)
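An illustration of the documented constraint (shapes are arbitrary): the projected hidden state has `proj_size` features, and `proj_size >= hidden_size` is rejected.

```
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=10, hidden_size=32, proj_size=16, batch_first=True)
out, (h, c) = lstm(torch.randn(2, 5, 10))
print(out.shape, h.shape, c.shape)   # (2, 5, 16), (1, 2, 16), (1, 2, 32)
try:
    nn.LSTM(input_size=10, hidden_size=32, proj_size=32)  # must be strictly smaller
except ValueError as e:
    print("rejected:", e)
```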

Test Plan: Please see GitHub Actions.

Differential Revision: D47943365

Fix: #105628

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106364
Approved by: https://github.com/mikaylagawarecki
2023-08-01 18:58:27 +00:00
Mikayla Gawarecki
87d2536971 Add nn.CircularPad{*}d for consistency + fix no_batch_dim support (#106148)
Fixes #105749 https://github.com/pytorch/pytorch/issues/95320

(tl;dr: the input should always be `[N, C, H, (W, D)]`, where only the H, W and D dimensions get circular padding, so in the 2D case where the user wants both dimensions padded they should `.unsqueeze(0)` first (as is the case for `Reflection/ReplicationPad`), but we didn't document this for circular padding; a short usage sketch follows the list below. [This seems to be the old docstring](277b05014a/torch/nn/functional.py (L4689)) that was somehow lost.)

Fixes no_batch_dim support https://github.com/pytorch/pytorch/issues/104860

- Adds missing documentation for circular padding
- Adds missing CircularPad modules
- Migrates legacy test_nn tests from circular padding to ModuleInfo
- Adds no_batch_dim support + sample inputs that test this
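A short sketch of the documented convention (requires a build that already has `nn.CircularPad2d`; values are arbitrary):

```
import torch
import torch.nn as nn

pad = nn.CircularPad2d(1)
x = torch.arange(9.0).reshape(1, 1, 3, 3)   # [N, C, H, W]
print(pad(x).shape)                         # torch.Size([1, 1, 5, 5])
y2d = torch.arange(9.0).reshape(3, 3)       # bare 2D input: add a channel dim first
print(pad(y2d.unsqueeze(0)).shape)          # torch.Size([1, 5, 5]), no-batch-dim case
```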

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106148
Approved by: https://github.com/albanD
ghstack dependencies: #106325, #106147
2023-08-01 12:49:58 +00:00
Mikayla Gawarecki
0547b6279d Add error checking for padding modules (#106147)
Fixes https://github.com/pytorch/pytorch/issues/105627

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106147
Approved by: https://github.com/albanD
ghstack dependencies: #106325
2023-08-01 12:49:58 +00:00
Mikayla Gawarecki
d8e5f2aa6d Reland "Make adding buffers more like adding parameters (#104069)" (#106224)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106224
Approved by: https://github.com/atalman, https://github.com/albanD
2023-07-31 17:18:56 +00:00
Rohan Varma
c11412b4a8 [DDP] Support optim in backward after DDP init (#105995)
This allows in backward optimizers to be configured after DDP init, in
addition to before as was previously supported.

Differential Revision: [D47783347](https://our.internmc.facebook.com/intern/diff/D47783347/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105995
Approved by: https://github.com/fegin
2023-07-29 01:36:25 +00:00
chunyuan
cb6c3cbc91 inductor: enable weight prepack for LSTM (#103071)
- Enabled LSTM weight prepack in inductor.
- Added an mkldnn decomposition for LSTM which won't change for different `seq_lens`. With the previous decomposition, for the dynamic-shapes use case where `seq_lens` changes, the graph would be different.
- Extended several inductor utility functions to support `List[Tensor]` as input. Previously those functions only supported `Tensor` input.

**Update 2023-07-26:**
- https://github.com/pytorch/pytorch/pull/103851 has moved CPU weight packing to be after AOTAutograd. Fixed the support in this PR to follow the same way (mainly in 3b207f7f1c (diff-6dffed1ade0ba3e887f9a4eafa3bfcec267ab2365b8adcb91bd391f49b3fd2e3)).
LSTM is decomposed in `aten.mkldnn_rnn_layer` by layer and by direction. The weight prepack is done at the `mkldnn_rnn_layer` level.
- Add a fix in rnn `__get_state__` function in case we need to recompile an `LSTM` module.
When compiling the module, the weights tensors which are the `named_parameters` of the module are converted to `functional_tensor` here:
76fb72e24a/torch/nn/utils/stateless.py (L125-L128)
The forward function of LSTM will be called:
76fb72e24a/torch/_functorch/aot_autograd.py (L3379-L3381)
In the forward function, the `_flat_weights` are updated to be the same as the weights, thus becoming `functional_tensor`:
76fb72e24a/torch/nn/modules/rnn.py (L775-L778)
The weights tensors are converted back to the original tensors (which are not `functional_tensor` anymore) before exiting the `_reparametrize_module` context here:
76fb72e24a/torch/nn/utils/stateless.py (L130-L142)
But since `_flat_weights` is not in the `named_parameters` of the module, it's still `functional_tensor` ([link of the parameters that will be converted to functional and reverted back](76fb72e24a/torch/_functorch/aot_autograd.py (L3695-L3698))).
At this moment, if we need to recompile the model, `deepcopy` will be called:
76fb72e24a/torch/_dynamo/utils.py (L915-L917)
And it will report `UnImplemented` since we have `functional_tensor` (`_flat_weights`) and will trigger graph break which is not what we expect:
76fb72e24a/torch/_subclasses/meta_utils.py (L514)
Added a fix in `__get_state__` to update `_flat_weights` whenever the weights have changed. The fix is covered in the `test_lstm_packed` UT.
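A minimal sketch of the use case this enables, i.e. compiling an LSTM module with the inductor backend; whether weight prepacking actually kicks in depends on the build and configuration (e.g. mkldnn availability and freezing):

```
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=16, hidden_size=32, num_layers=2, batch_first=True).eval()
compiled = torch.compile(lstm, backend="inductor")
with torch.no_grad():
    out, (h, c) = compiled(torch.randn(4, 10, 16))
print(out.shape)  # torch.Size([4, 10, 32])
```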

Pull Request resolved: https://github.com/pytorch/pytorch/pull/103071
Approved by: https://github.com/jgong5, https://github.com/jansel
2023-07-28 13:54:32 +00:00
Michael Gschwind
723bc136a1 Add context for warning about batch_first (#106139)
Summary: Add context for warning about batch_first

Test Plan: sandcastle github

Differential Revision: D47809651

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106139
Approved by: https://github.com/mikaylagawarecki
2023-07-27 23:02:05 +00:00
Mikayla Gawarecki
ca7ece9b50 [easy] improve hint on error message in nn.Module.load_state_dict (#106042)
Fix #105963
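For context, a sketch of the situation the improved hint targets: a state_dict whose keys do not line up with the module's (e.g. a stray `module.` prefix). Names here are illustrative.

```
import torch.nn as nn

model = nn.Linear(4, 2)
bad_state = {"module.weight": model.weight.detach(),
             "module.bias": model.bias.detach()}
try:
    model.load_state_dict(bad_state)          # strict=True by default
except RuntimeError as e:
    print(e)                                  # lists missing and unexpected keys
```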

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106042
Approved by: https://github.com/albanD
2023-07-27 19:56:02 +00:00
GwiHwan
2d41fa9d38 Revise err msgs for weight param of Multimarginloss (#106047)
Summary: fix lint issue of #106019

Fix: https://github.com/pytorch/pytorch/issues/106020
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106047
Approved by: https://github.com/Skylion007
2023-07-27 01:44:13 +00:00
Michael Gschwind
06dd850dd5 Simplify check (#106044)
Summary: Simplify check / refactor for readability

Test Plan: sandcastle, github

Differential Revision: D47800732

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106044
Approved by: https://github.com/mikaylagawarecki
2023-07-27 01:18:25 +00:00
FFFrog
9a1cdcb8a0 Format: fixing multiple string concatenation in single line (#106013)
Fixing multiple string concatenations on a single line
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106013
Approved by: https://github.com/albanD
2023-07-26 18:39:18 +00:00
Alexander Pivovarov
28a4fc8d8a Fix some typos (#105869)
### Description:
- Fixes for typos in comments
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105869
Approved by: https://github.com/mikaylagawarecki, https://github.com/Skylion007
2023-07-26 16:23:57 +00:00
janEbert
66b73b08df Allow disabling bias for Transformer (#101687)
As used by T5 and PaLM, citing "increased training stability for large models" (https://arxiv.org/abs/2204.02311).

Depends on #101683, which allows disabling bias for `LayerNorm`s. Marked as draft due to this.
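A hedged sketch of the resulting API at the encoder-layer level; the `bias` keyword requires a version containing this change and #101683:

```
import torch
import torch.nn as nn

layer = nn.TransformerEncoderLayer(d_model=16, nhead=4, bias=False, batch_first=True)
enc = nn.TransformerEncoder(layer, num_layers=2)
out = enc(torch.randn(2, 5, 16))
print(out.shape, layer.linear1.bias)   # the linear sublayers carry no bias (None)
```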
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101687
Approved by: https://github.com/mikaylagawarecki
2023-07-26 13:50:41 +00:00
lezcano
9bde7f4e27 Fix the docs for cosine_similarity (#104772)
The behaviour of `cosine_similarity` was subtly changed in
https://github.com/pytorch/pytorch/pull/31378, but the docs were not
updated.
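For reference, plain usage of the function whose docs were corrected (shapes arbitrary): the inputs are broadcast against each other and the `dim` dimension is reduced away.

```
import torch
import torch.nn.functional as F

x1 = torch.randn(4, 8)
x2 = torch.randn(1, 8)                               # broadcasts against x1
print(F.cosine_similarity(x1, x2, dim=1).shape)      # torch.Size([4])
```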

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104772
Approved by: https://github.com/albanD, https://github.com/svekars
2023-07-26 09:23:09 +00:00
Aaron Gokaslan
6d43c89f37 [BE]: Update Ruff to 0.0.280 (#105724)
Removes unused loop values in Python dictionary iteration. Automated fix from Ruff master.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105724
Approved by: https://github.com/ezyang, https://github.com/janeyx99
2023-07-22 23:03:34 +00:00
Justin Chu
4cc1745b13 [BE] f-stringify torch/ and scripts (#105538)
This PR is a follow-up in the pyupgrade series, converting more strings to f-strings using `flynt`.

- https://docs.python.org/3/reference/lexical_analysis.html#f-strings
- https://pypi.org/project/flynt/

Command used:

```
flynt torch/ -ll 120
flynt scripts/ -ll 120
flynt tools/ -ll 120
```

and excluded `collect_env.py`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105538
Approved by: https://github.com/ezyang, https://github.com/malfet
2023-07-21 19:35:24 +00:00
Justin Chu
79c5e33349 [BE] Enable ruff's UP rules and autoformat nn/ mps/ and torch/ (#105436)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105436
Approved by: https://github.com/malfet, https://github.com/albanD
2023-07-21 07:38:46 +00:00
Thomas Findlay
8399cf9bfe Rnn base hidden size type check (#105659)
Fixes #105631

Added a type and value check on `hidden_size` to align behaviour between GPU and CPU modes and alert users when the wrong type is supplied.
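A minimal sketch of the behavior described above; the exact exception type is an assumption:

```
import torch.nn as nn

try:
    nn.RNN(input_size=4, hidden_size=8.0)   # float hidden_size should be rejected
except (TypeError, ValueError) as e:
    print("rejected:", e)
```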

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105659
Approved by: https://github.com/albanD, https://github.com/mikaylagawarecki
2023-07-20 22:45:43 +00:00
Rodrigo Kumpera
795885d947 [docs] Fix docstring. (#105689)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105689
Approved by: https://github.com/clee2000
2023-07-20 22:02:43 +00:00
Andrey Talman
c6653b65d8 Back out "Make adding buffers more like adding parameters (#104069)" (#105581)
Summary:
D47537831 is breaking pyper tests: https://fb.workplace.com/groups/802176577445480/posts/1018902842439518/

with `TypeError: register_buffer() takes 3 positional arguments but 4 were given`

Original commit changeset: d4b4069fbd38

Original Phabricator Diff: D47537831

Test Plan:
```
buck2 run //caffe2/torch/fb/training_toolkit/integration_tests/training_lifecycle/cogwheel_tests/pyper_release_v2:cogwheel_smallworld_inline_cvr_infer_pyper_pyper__canary_offline_training-launcher -- --run-harness-in-tupperware --build-fbpkg ads_dper3 --build-fbpkg training_platform
```

Reviewed By: atalman

Differential Revision: D47600140

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105581
Approved by: https://github.com/mikaylagawarecki
2023-07-20 03:39:53 +00:00