Commit Graph

676 Commits

Author SHA1 Message Date
Xuehai Pan
f6e6e55fa7 [BE] enable UFMT for torch/nn/functional.py (#128592)
Part of #123062

- #123062

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128592
Approved by: https://github.com/mikaylagawarecki
ghstack dependencies: #128596, #128594
2024-06-17 16:29:29 +00:00
Xuehai Pan
67ef2683d9 [BE] wrap deprecated function/class with typing_extensions.deprecated (#127689)
Use `typing_extensions.deprecated` for deprecation annotation if possible. Otherwise, add `category=FutureWarning` to `warnings.warn("message")` if the category is missing.

Note that only warnings whose messages contain `[Dd]eprecat(ed|ion)` are updated in this PR.
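
For illustration, a minimal sketch of the two patterns described above; the function names are hypothetical and not code touched by this PR:

```python
import warnings

from typing_extensions import deprecated


def new_helper(x):
    return x * 2


# Preferred pattern: annotate the deprecated callable directly.
@deprecated(
    "`old_helper` is deprecated, use `new_helper` instead.",
    category=FutureWarning,
)
def old_helper(x):
    return new_helper(x)


# Fallback pattern: keep warnings.warn but make the category explicit.
def legacy_helper(x):
    warnings.warn(
        "`legacy_helper` is deprecated, use `new_helper` instead.",
        FutureWarning,
        stacklevel=2,
    )
    return new_helper(x)
```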

Resolves #126888

- #126888

This PR is split from PR #126898.

- #126898

------

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127689
Approved by: https://github.com/Skylion007
2024-06-02 12:30:43 +00:00
PyTorch MergeBot
033e733021 Revert "[BE] wrap deprecated function/class with typing_extensions.deprecated (#126898)"
This reverts commit 749a132fb0.

Reverted https://github.com/pytorch/pytorch/pull/126898 on behalf of https://github.com/fbgheith due to switching typing-extensions=4.3.0 to 4.9.0 causes internal failure ([comment](https://github.com/pytorch/pytorch/pull/126898#issuecomment-2142884456))
2024-05-31 19:47:24 +00:00
lancerts
ff65b18fcf Update the is_causal explanation in the SDPA doc (#127209)
Fixes #126873

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127209
Approved by: https://github.com/drisspg
2024-05-29 18:53:17 +00:00
Xuehai Pan
749a132fb0 [BE] wrap deprecated function/class with typing_extensions.deprecated (#126898)
Use `typing_extensions.deprecated` for deprecation annotation if possible. Otherwise, add `category=FutureWarning` to `warnings.warn("message")` if the category is missing.

Note that only warnings whose messages contain `[Dd]eprecat(ed|ion)` are updated in this PR.

UPDATE: Use `FutureWarning` instead of `DeprecationWarning`.

Resolves #126888

- #126888

Pull Request resolved: https://github.com/pytorch/pytorch/pull/126898
Approved by: https://github.com/albanD
2024-05-29 12:09:27 +00:00
Joel Schlosser
d15920a7d0 Warn SDPA users about dropout behavior (#126294)
Fixes #124464
Pull Request resolved: https://github.com/pytorch/pytorch/pull/126294
Approved by: https://github.com/mikaylagawarecki, https://github.com/drisspg
2024-05-15 20:58:23 +00:00
Noam Siegel
a03b9a2189 fix: typo (#125226)
Fixes a spelling error: "spacial" is an incorrect spelling of "spatial".

Pull Request resolved: https://github.com/pytorch/pytorch/pull/125226
Approved by: https://github.com/Skylion007
2024-04-30 16:57:39 +00:00
Aaron Gokaslan
5a1216bb2e [BE]: Update ruff to 0.4.1 (#124549)
Update ruff to 0.4.1.
This version fixes a lot of false negatives/false positives, is 20-40% faster, and has various other bug fixes.

Below is a before-and-after table showing the execution time of ruff lint and ruff format in milliseconds, courtesy of https://astral.sh/blog/ruff-v0.4.0

| Repository                                         | Linter (v0.3) | Linter (v0.4) | Formatter (v0.3) | Formatter (v0.4) |
|----------------------------------------------------|---------------|---------------|------------------|------------------|
| [pytorch/pytorch](https://github.com/pytorch/pytorch) | 328.7         | 251.8         | 351.1            | 274.9            |

Pull Request resolved: https://github.com/pytorch/pytorch/pull/124549
Approved by: https://github.com/ezyang
2024-04-21 14:06:23 +00:00
Dmitry Ulyanov
c8e117fb76 Tiny comments improvement (#123426)
Fixed a typo in `functional.py` and moved comment line to correct place in `transformer.py`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/123426
Approved by: https://github.com/mikaylagawarecki
2024-04-05 17:25:42 +00:00
Mikayla Gawarecki
487b6d40ec Add RMSNorm module (#121364)
Similar to dbeed9724b/torchmultimodal/modules/layers/normalizations.py (L51)

**The implementation here is not optimized and we welcome pull requests to improve this**

- Use `normalized_shape` instead of singular integer `dim` to be aligned with the `nn.LayerNorm` implementation
- Remove the [upcast to float and downcast](dbeed9724b/torchmultimodal/modules/layers/normalizations.py (L73))
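
A minimal usage sketch (assuming the module lands as `torch.nn.RMSNorm`; shapes chosen arbitrarily):

```python
import torch
import torch.nn as nn

# normalized_shape follows the same convention as nn.LayerNorm.
rms_norm = nn.RMSNorm(normalized_shape=64)

x = torch.randn(8, 128, 64)
print(rms_norm(x).shape)  # torch.Size([8, 128, 64])
```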

Differential Revision: [D55485840](https://our.internmc.facebook.com/intern/diff/D55485840)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121364
Approved by: https://github.com/albanD
2024-03-29 18:05:28 +00:00
PyTorch MergeBot
8698121636 Revert "Add RMSNorm module (#121364)"
This reverts commit a7306de0dc.

Reverted https://github.com/pytorch/pytorch/pull/121364 on behalf of https://github.com/atalman due to Broke internal tests ([comment](https://github.com/pytorch/pytorch/pull/121364#issuecomment-2025502007))
2024-03-28 15:31:10 +00:00
Mikayla Gawarecki
a7306de0dc Add RMSNorm module (#121364)
Similar to dbeed9724b/torchmultimodal/modules/layers/normalizations.py (L51)

**The implementation here is not optimized and we welcome pull requests to improve this**

- Use `normalized_shape` instead of singular integer `dim` to be aligned with the `nn.LayerNorm` implementation
- Remove the [upcast to float and downcast](dbeed9724b/torchmultimodal/modules/layers/normalizations.py (L73))

Pull Request resolved: https://github.com/pytorch/pytorch/pull/121364
Approved by: https://github.com/albanD
2024-03-27 21:39:30 +00:00
Gonçalo Rua
139647d317 Fix #83241: torch.nn.TripletMarginLoss allowed margin less or equal to 0 (#121978)
The documentation states that the `margin` parameter of torch.nn.TripletMarginLoss must be greater than 0; however, any value was being accepted. Also fixed torch.nn.TripletMarginWithDistanceLoss, which had the same problem. Added an error test input for the new ValueError.
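
A small sketch of the enforced behavior (the exact error message may differ):

```python
import torch
import torch.nn.functional as F

anchor, positive, negative = (torch.randn(4, 16) for _ in range(3))

# Valid: margin > 0.
print(F.triplet_margin_loss(anchor, positive, negative, margin=1.0))

# Invalid after this change: margin <= 0 now raises ValueError.
try:
    F.triplet_margin_loss(anchor, positive, negative, margin=0.0)
except ValueError as err:
    print("rejected:", err)
```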

Fixes #83241

Pull Request resolved: https://github.com/pytorch/pytorch/pull/121978
Approved by: https://github.com/mikaylagawarecki
2024-03-19 23:19:11 +00:00
João Gouveia
1afa8e0985 Fix #83153: torch.nn.hardtanh allowed min_val to be greater than max_val (#121627)
Fixes #83153

Pull Request resolved: https://github.com/pytorch/pytorch/pull/121627
Approved by: https://github.com/albanD
2024-03-15 00:57:45 +00:00
drisspg
f5391dad82 Update docs to point to new sdpa_kernel context manager (#121180)
# Summary

Updates the SDPA docs to fix some small inaccuracies and point to the new sdpa_kernel context manager. The Enum-like SDPBackend type bound from C++ does not render its fields for some reason, so they are listed manually for now.
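
A minimal sketch of the context manager the docs now point to (assuming `torch.nn.attention.sdpa_kernel` and `SDPBackend` as exposed in recent releases):

```python
import torch
import torch.nn.functional as F
from torch.nn.attention import SDPBackend, sdpa_kernel

q, k, v = (torch.randn(2, 8, 128, 64) for _ in range(3))

# Restrict dispatch to an explicit backend inside the context
# (e.g. SDPBackend.FLASH_ATTENTION on a suitable CUDA/fp16 setup).
with sdpa_kernel(SDPBackend.MATH):
    out = F.scaled_dot_product_attention(q, k, v, is_causal=True)

print(out.shape)  # torch.Size([2, 8, 128, 64])
```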

Pull Request resolved: https://github.com/pytorch/pytorch/pull/121180
Approved by: https://github.com/mikaylagawarecki
2024-03-05 22:19:48 +00:00
lancerts
67c97a9aad fix the scaled dot-product attention doc (#120859)
Fixes #120810

The following code verifies the broadcast behavior (from the issue):
```
import torch

B = 3
S = 5
L = 7
E = 16
EV = 32
additional_batches = [2, 4]

query_shape = [B] + additional_batches + [L, E]
key_shape = [B] + additional_batches + [S, E]
value_shape = [B] + additional_batches + [S, EV]

query = torch.rand(*query_shape)
key = torch.rand(*key_shape)
value = torch.rand(*value_shape)
mask = torch.zeros((1, 1, S), dtype=torch.bool)
mask[:, :, S // 2 :] = True

# query = query.to("cuda")
# key = key.to("cuda")
# value = value.to("cuda")
# mask = mask.to("cuda")

attention = torch.nn.functional.scaled_dot_product_attention(query, key, value, mask)

print(f"query shape = {query.shape}")
print(f"key shape = {key.shape}")
print(f"value shape = {value.shape}")
print(f"mask shape = {mask.shape}")
print(f"attention shape = {attention.shape}")

#in both CPU and cuda, output shape is:
# query shape = torch.Size([3, 2, 4, 7, 16])
# key shape = torch.Size([3, 2, 4, 5, 16])
# value shape = torch.Size([3, 2, 4, 5, 32])
# mask shape = torch.Size([1, 1, 5])
# attention shape = torch.Size([3, 2, 4, 7, 32])

## test add is broadcasting mask to query@(key.mT)
res = query@(key.mT)
print(res.shape)
res2 = torch.add(res, mask)
print(res2.shape)
```

At the code level, in the default backend (ab38354887/aten/src/ATen/native/transformers/attention.cpp (L735)), the add operation broadcasts the `attn_mask` to `auto attn = at::matmul(query, key.transpose(-2, -1) * scaling_factor);`.

- Changed the doc in [torch/nn/functional.py](https://github.com/pytorch/pytorch/pull/120859/files#diff-c358c214f663ba0c8b9c6846fbe0042fa29494cf02fe4714a17dcd0d268b035b).
- Also fixed a few inconsistencies in the cpp comments.

@mikaylagawarecki

Pull Request resolved: https://github.com/pytorch/pytorch/pull/120859
Approved by: https://github.com/drisspg
2024-03-01 02:54:08 +00:00
Isuru Fernando
435063aa89 Decomposition for upsample_linear{1d, 3d} (#114774)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114774
Approved by: https://github.com/lezcano, https://github.com/vfdev-5, https://github.com/peterbell10
2024-02-27 11:57:45 +00:00
Lei Mao
91d1d2c421 Make MHA Query Scaling Behaviors Consistent (#119323)
The multi-head attention (MHA) query scaling behaviors are not consistent when [`need_weights`](8ac9b20d4b/torch/nn/modules/activation.py (L1073)) values are different.

On the current main, when `need_weights = True`, the query scaling is performed using a [division](8ac9b20d4b/torch/nn/functional.py (L5434)) and is exported as a `Div` operator in ONNX. When `need_weights = False`, the query scaling is performed using a [multiplication](422b4271ae/aten/src/ATen/native/transformers/attention.cpp (L711)) and is exported as a `Mul` operator in ONNX, as defined in the [PyTorch ONNX Symbolics](422b4271ae/torch/onnx/symbolic_opset14.py (L177)).

We should make the query scaling behaviors consistent. On most platforms, multiplication performs no worse than division, so we should use multiplication consistently for both `need_weights = True` and `need_weights = False`.
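
A small illustrative sketch of the equivalence (not the PR's code): scaling the query by multiplying with `1/sqrt(E)` matches dividing by `sqrt(E)` up to floating-point rounding.

```python
import math

import torch

E = 64                                   # head dimension
q = torch.randn(2, 8, 16, E)

scaled_div = q / math.sqrt(E)            # old need_weights=True path (exported as Div)
scaled_mul = q * (1.0 / math.sqrt(E))    # fused/ONNX path (exported as Mul)

print(torch.allclose(scaled_div, scaled_mul, atol=1e-6))  # True
```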
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119323
Approved by: https://github.com/mikaylagawarecki, https://github.com/albanD
2024-02-07 18:42:57 +00:00
lancerts
238d87f74d Add a short code snippet in the RNN doc (#119150)
Fixes #109443; also removes a duplicated comment line `# Efficient implementation equivalent to the following:` from the scaled_dot_product_attention doc.

@mikaylagawarecki
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119150
Approved by: https://github.com/malfet
2024-02-06 17:41:51 +00:00
lancerts
0ddcb5c3ca Include the documentation on scale arg being a keyword only arg (#119129)
Fixes #117240
@drisspg
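
A one-line sketch of the calling convention the docs now state (`scale` is keyword-only):

```python
import torch
import torch.nn.functional as F

q, k, v = (torch.randn(2, 8, 16, 64) for _ in range(3))

# Passing scale positionally would be a TypeError; it must be a keyword argument.
out = F.scaled_dot_product_attention(q, k, v, scale=1 / 64**0.5)
```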

Pull Request resolved: https://github.com/pytorch/pytorch/pull/119129
Approved by: https://github.com/drisspg
2024-02-03 23:41:06 +00:00
Isuru Fernando
d40a7c6026 Add decompositions for replication_pad (#115113)
Fixes #115395

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115113
Approved by: https://github.com/peterbell10
2023-12-09 02:44:07 +00:00
Wongboo
68f74dd162 Add python and C++ support for LPPool3d (#114199)
Add Python and C++ support for LPPool3d. Fixes #114114
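
A minimal usage sketch of the new module (shapes chosen arbitrarily):

```python
import torch
import torch.nn as nn

pool = nn.LPPool3d(norm_type=2, kernel_size=2, stride=2)

x = torch.randn(1, 4, 8, 8, 8)   # (N, C, D, H, W)
print(pool(x).shape)             # torch.Size([1, 4, 4, 4, 4])
```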

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114199
Approved by: https://github.com/mikaylagawarecki
2023-12-08 18:18:44 +00:00
drisspg
d4c79a3078 Add an attention bias subclass for a lower right causal masking (#114823)
# Summary
This PR introduces a new Tensor subclass that is designed to be used with torch.nn.functional.scaled_dot_product_attention. Currently we have a boolean `is_causal` flag that allows users to do causal masking without the need to actually create the "realized" attention bias and pass it into SDPA. We originally added this flag since there is native support in both fused kernels we support. This provides a big performance gain (the kernels only need to iterate over ~0.5x the sequence, and for very large sequence lengths this can provide very large memory improvements).

The flag was introduced early on in the kernel development, and at the time it implicitly meant "upper_left" causal attention. This distinction only matters when the attention_bias is not square. For a more detailed breakdown see: https://github.com/pytorch/pytorch/issues/108108. The kernels' default behavior has since changed, largely due to the rise of autoregressive text generation, and unfortunately matching it would lead to a BC break. In the long term it may actually be beneficial to change the default meaning of `is_causal` to represent lower_right causal masking.

The larger theme, though, is laid out here: https://github.com/pytorch/pytorch/issues/110681. The thesis is that there is a lot of innovation in SDPA revolving around the attention_bias being used. This is the first of hopefully a few more attention_biases that we would like to add. The next interesting one would be `sliding_window`, which is used by the popular Mistral model family.

Results from benchmarking: I improved the meff_attention perf, hence the slightly decreased max perf.
```Shell
+---------+--------------------+------------+-----------+-----------+-----------+-----------+----------------+----------+
|  Type   |      Speedup       | batch_size | num_heads | q_seq_len | k_seq_len | embed_dim |     dtype      | head_dim |
+---------+--------------------+------------+-----------+-----------+-----------+-----------+----------------+----------+
| Average | 1.2388050062214226 |            |           |           |           |           |                |          |
|   Max   | 1.831672915579016  |    128     |    32     |   1024    |   2048    |   2048    | torch.bfloat16 |    64    |
|   Min   | 0.9430534166730135 |     1      |    16     |    256    |    416    |   2048    | torch.bfloat16 |   128    |
+---------+--------------------+------------+-----------+-----------+-----------+-----------+----------------+----------+
```
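
A hedged usage sketch, assuming the subclass is exposed via `torch.nn.attention.bias.causal_lower_right` as in recent releases:

```python
import torch
import torch.nn.functional as F
from torch.nn.attention.bias import causal_lower_right

B, H, L_q, L_kv, D = 2, 8, 7, 12, 64
q = torch.randn(B, H, L_q, D)
k = torch.randn(B, H, L_kv, D)
v = torch.randn(B, H, L_kv, D)

# Non-square case where upper-left vs. lower-right alignment actually differs.
attn_bias = causal_lower_right(L_q, L_kv)
out = F.scaled_dot_product_attention(q, k, v, attn_mask=attn_bias)
print(out.shape)  # torch.Size([2, 8, 7, 64])
```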

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114823
Approved by: https://github.com/cpuhrsch
2023-12-06 08:29:26 +00:00
Kurt Mohler
6f32eb7eef Add decomp for replication_pad2d and use for CUDA deterministic (#111590)
Fixes #95578
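
A small sketch of the op the decomposition covers (run on CPU here for portability; the deterministic path this PR adds targets the CUDA backward):

```python
import torch
import torch.nn.functional as F

torch.use_deterministic_algorithms(True)

x = torch.randn(1, 3, 8, 8, requires_grad=True)
y = F.pad(x, (2, 2, 2, 2), mode="replicate")   # replication_pad2d
y.sum().backward()
print(x.grad.shape)  # torch.Size([1, 3, 8, 8])
```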

Pull Request resolved: https://github.com/pytorch/pytorch/pull/111590
Approved by: https://github.com/peterbell10
2023-12-01 18:56:09 +00:00
PyTorch MergeBot
013675ff59 Revert "Add decomp for replication_pad2d and use for CUDA deterministic (#111590)"
This reverts commit f1286161a6.

Reverted https://github.com/pytorch/pytorch/pull/111590 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it is failing the XLA job. The job is also failing on the PR, but the log classifier failed to find the failed test, which led to it being wrongly marked as flaky ([comment](https://github.com/pytorch/pytorch/pull/111590#issuecomment-1833004794))
2023-11-30 02:28:14 +00:00
Kurt Mohler
f1286161a6 Add decomp for replication_pad2d and use for CUDA deterministic (#111590)
Fixes #95578

Pull Request resolved: https://github.com/pytorch/pytorch/pull/111590
Approved by: https://github.com/peterbell10
2023-11-29 21:50:46 +00:00
drisspg
039a4689a2 Update sdpa doctstring to point to flash-attn-v2 (#114124)
# Summary
See title

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114124
Approved by: https://github.com/albanD, https://github.com/Skylion007
2023-11-20 19:05:30 +00:00
pilot-j
9062e429db Fixed docstring errors in torch/nn/functional.py (Docathon H2) (#112856)
Fixes #112597
### Output:
**BEFORE:**
```
functional.py:1 at module level:
        D400: First line should end with a period (not 'e')
functional.py:438 in public function `fractional_max_pool2d_with_indices`:
        D400: First line should end with a period (not ')')
functional.py:537 in public function `fractional_max_pool3d_with_indices`:
        D400: First line should end with a period (not ')')
functional.py:646 in public function `max_pool1d_with_indices`:
        D400: First line should end with a period (not ')')
functional.py:732 in public function `max_pool2d_with_indices`:
        D400: First line should end with a period (not ')')
functional.py:818 in public function `max_pool3d_with_indices`:
        D400: First line should end with a period (not ')')
functional.py:932 in public function `max_unpool1d`:
        D401: First line should be in imperative mood (perhaps 'Compute', not 'Computes')
functional.py:968 in public function `max_unpool2d`:
        D401: First line should be in imperative mood (perhaps 'Compute', not 'Computes')
functional.py:1000 in public function `max_unpool3d`:
        D401: First line should be in imperative mood (perhaps 'Compute', not 'Computes')
functional.py:1031 in public function `lp_pool2d`:
        D205: 1 blank line required between summary line and description (found 0)
functional.py:1031 in public function `lp_pool2d`:
        D400: First line should end with a period (not 'f')
functional.py:1031 in public function `lp_pool2d`:
        D401: First line should be in imperative mood (perhaps 'Apply', not 'Applies')
functional.py:1056 in public function `lp_pool1d`:
        D205: 1 blank line required between summary line and description (found 0)
functional.py:1056 in public function `lp_pool1d`:
        D400: First line should end with a period (not 'f')
functional.py:1056 in public function `lp_pool1d`:
        D401: First line should be in imperative mood (perhaps 'Apply', not 'Applies')
functional.py:1077 in public function `adaptive_max_pool1d_with_indices`:
        D400: First line should end with a period (not ')')
functional.py:1119 in public function `adaptive_max_pool2d_with_indices`:
        D400: First line should end with a period (not ')')
functional.py:1163 in public function `adaptive_max_pool3d_with_indices`:
        D400: First line should end with a period (not ')')
functional.py:1220 in public function `adaptive_avg_pool2d`:
        D205: 1 blank line required between summary line and description (found 0)
functional.py:1220 in public function `adaptive_avg_pool2d`:
        D400: First line should end with a period (not 'f')
functional.py:1220 in public function `adaptive_avg_pool2d`:
        D401: First line should be in imperative mood (perhaps 'Apply', not 'Applies')
functional.py:1237 in public function `adaptive_avg_pool3d`:
        D205: 1 blank line required between summary line and description (found 0)
functional.py:1237 in public function `adaptive_avg_pool3d`:
        D400: First line should end with a period (not 'f')
functional.py:1237 in public function `adaptive_avg_pool3d`:
        D401: First line should be in imperative mood (perhaps 'Apply', not 'Applies')
functional.py:1255 in public function `dropout`:
        D205: 1 blank line required between summary line and description (found 0)
functional.py:1255 in public function `dropout`:
        D400: First line should end with a period (not 't')
functional.py:1275 in public function `alpha_dropout`:
        D401: First line should be in imperative mood (perhaps 'Apply', not 'Applies')
functional.py:1287 in public function `dropout1d`:
        D205: 1 blank line required between summary line and description (found 0)
functional.py:1287 in public function `dropout1d`:
        D400: First line should end with a period (not ',')
functional.py:1325 in public function `dropout2d`:
        D205: 1 blank line required between summary line and description (found 0)
functional.py:1325 in public function `dropout2d`:
        D400: First line should end with a period (not ',')
functional.py:1369 in public function `dropout3d`:
        D205: 1 blank line required between summary line and description (found 0)
functional.py:1369 in public function `dropout3d`:
        D400: First line should end with a period (not ',')
functional.py:1408 in public function `feature_alpha_dropout`:
        D205: 1 blank line required between summary line and description (found 0)
functional.py:1408 in public function `feature_alpha_dropout`:
        D400: First line should end with a period (not ',')
functional.py:1466 in public function `relu`:
        D400: First line should end with a period (not 'r')
functional.py:1466 in public function `relu`:
        D402: First line should not be the function's "signature"
functional.py:1491 in public function `glu`:
        D400: First line should end with a period (not 'r')
functional.py:1491 in public function `glu`:
        D402: First line should not be the function's "signature"
functional.py:1516 in public function `hardtanh`:
        D400: First line should end with a period (not 'r')
functional.py:1516 in public function `hardtanh`:
        D402: First line should not be the function's "signature"
functional.py:1542 in public function `relu6`:
        D400: First line should end with a period (not 'r')
functional.py:1542 in public function `relu6`:
        D402: First line should not be the function's "signature"
functional.py:1558 in public function `elu`:
        D401: First line should be in imperative mood (perhaps 'Apply', not 'Applies')
functional.py:1582 in public function `selu`:
        D400: First line should end with a period (not 'r')
functional.py:1582 in public function `selu`:
        D402: First line should not be the function's "signature"
functional.py:1611 in public function `celu`:
        D400: First line should end with a period (not 'r')
functional.py:1611 in public function `celu`:
        D402: First line should not be the function's "signature"
functional.py:1638 in public function `leaky_relu`:
        D400: First line should end with a period (not 'r')
functional.py:1638 in public function `leaky_relu`:
        D402: First line should not be the function's "signature"
functional.py:1688 in public function `rrelu`:
        D400: First line should end with a period (not 'r')
functional.py:1688 in public function `rrelu`:
        D402: First line should not be the function's "signature"
functional.py:1755 in public function `tanhshrink`:
        D400: First line should end with a period (not 'r')
functional.py:1755 in public function `tanhshrink`:
        D402: First line should not be the function's "signature"
functional.py:1767 in public function `softsign`:
        D400: First line should end with a period (not 'r')
functional.py:1767 in public function `softsign`:
        D402: First line should not be the function's "signature"
functional.py:1806 in public function `softmin`:
        D401: First line should be in imperative mood (perhaps 'Apply', not 'Applies')
functional.py:1832 in public function `softmax`:
        D401: First line should be in imperative mood (perhaps 'Apply', not 'Applies')
functional.py:1868 in public function `gumbel_softmax`:
        D401: First line should be in imperative mood (perhaps 'Sample', not 'Samples')
functional.py:1930 in public function `log_softmax`:
        D401: First line should be in imperative mood (perhaps 'Apply', not 'Applies')
functional.py:1969 in public function `tanh`:
        D400: First line should end with a period (not 'r')
functional.py:1969 in public function `tanh`:
        D402: First line should not be the function's "signature"
functional.py:1980 in public function `sigmoid`:
        D400: First line should end with a period (not 'r')
functional.py:1980 in public function `sigmoid`:
        D402: First line should not be the function's "signature"
functional.py:1990 in public function `hardsigmoid`:
        D400: First line should end with a period (not 'n')
functional.py:1990 in public function `hardsigmoid`:
        D401: First line should be in imperative mood (perhaps 'Apply', not 'Applies')
functional.py:2057 in public function `silu`:
        D205: 1 blank line required between summary line and description (found 0)
functional.py:2057 in public function `silu`:
        D401: First line should be in imperative mood (perhaps 'Apply', not 'Applies')
functional.py:2081 in public function `mish`:
        D205: 1 blank line required between summary line and description (found 0)
functional.py:2081 in public function `mish`:
        D401: First line should be in imperative mood (perhaps 'Apply', not 'Applies')
functional.py:2100 in public function `hardswish`:
        D400: First line should end with a period (not ':')
functional.py:2100 in public function `hardswish`:
        D401: First line should be in imperative mood (perhaps 'Apply', not 'Applies')
functional.py:2136 in public function `embedding`:
        D202: No blank lines allowed after function docstring (found 1)
functional.py:2136 in public function `embedding`:
        D401: First line should be in imperative mood; try rephrasing (found 'A')
functional.py:2254 in public function `embedding_bag`:
        D205: 1 blank line required between summary line and description (found 0)
functional.py:2254 in public function `embedding_bag`:
        D400: First line should end with a period (not 'e')
functional.py:2254 in public function `embedding_bag`:
        D401: First line should be in imperative mood (perhaps 'Compute', not 'Computes')
functional.py:2462 in public function `batch_norm`:
        D401: First line should be in imperative mood (perhaps 'Apply', not 'Applies')
functional.py:2507 in public function `instance_norm`:
        D205: 1 blank line required between summary line and description (found 0)
functional.py:2507 in public function `instance_norm`:
        D400: First line should end with a period (not 'a')
functional.py:2507 in public function `instance_norm`:
        D401: First line should be in imperative mood (perhaps 'Apply', not 'Applies')
functional.py:2540 in public function `layer_norm`:
        D401: First line should be in imperative mood (perhaps 'Apply', not 'Applies')
functional.py:2554 in public function `group_norm`:
        D401: First line should be in imperative mood (perhaps 'Apply', not 'Applies')
functional.py:2567 in public function `local_response_norm`:
        D205: 1 blank line required between summary line and description (found 0)
functional.py:2567 in public function `local_response_norm`:
        D400: First line should end with a period (not 'f')
functional.py:2567 in public function `local_response_norm`:
        D401: First line should be in imperative mood (perhaps 'Apply', not 'Applies')
functional.py:2611 in public function `ctc_loss`:
        D401: First line should be in imperative mood; try rephrasing (found 'The')
functional.py:2679 in public function `nll_loss`:
        D401: First line should be in imperative mood; try rephrasing (found 'The')
functional.py:2895 in public function `kl_div`:
        D205: 1 blank line required between summary line and description (found 0)
functional.py:2895 in public function `kl_div`:
        D400: First line should end with a period (not 's')
functional.py:2895 in public function `kl_div`:
        D401: First line should be in imperative mood; try rephrasing (found 'The')
functional.py:2978 in public function `cross_entropy`:
        D401: First line should be in imperative mood; try rephrasing (found 'This')
functional.py:3069 in public function `binary_cross_entropy`:
        D205: 1 blank line required between summary line and description (found 0)
functional.py:3069 in public function `binary_cross_entropy`:
        D400: First line should end with a period (not 't')
functional.py:3069 in public function `binary_cross_entropy`:
        D401: First line should be in imperative mood; try rephrasing (found 'Function')
functional.py:3139 in public function `binary_cross_entropy_with_logits`:
        D205: 1 blank line required between summary line and description (found 0)
functional.py:3139 in public function `binary_cross_entropy_with_logits`:
        D400: First line should end with a period (not 't')
functional.py:3139 in public function `binary_cross_entropy_with_logits`:
        D401: First line should be in imperative mood; try rephrasing (found 'Function')
functional.py:3211 in public function `smooth_l1_loss`:
        D205: 1 blank line required between summary line and description (found 0)
functional.py:3211 in public function `smooth_l1_loss`:
        D400: First line should end with a period (not 'e')
functional.py:3211 in public function `smooth_l1_loss`:
        D401: First line should be in imperative mood; try rephrasing (found 'Function')
functional.py:3251 in public function `huber_loss`:
        D205: 1 blank line required between summary line and description (found 0)
functional.py:3251 in public function `huber_loss`:
        D400: First line should end with a period (not 'e')
functional.py:3251 in public function `huber_loss`:
        D401: First line should be in imperative mood; try rephrasing (found 'Function')
functional.py:3282 in public function `l1_loss`:
        D400: First line should end with a period (not 'r')
functional.py:3282 in public function `l1_loss`:
        D402: First line should not be the function's "signature"
functional.py:3313 in public function `mse_loss`:
        D400: First line should end with a period (not 'r')
functional.py:3313 in public function `mse_loss`:
        D402: First line should not be the function's "signature"
functional.py:3346 in public function `margin_ranking_loss`:
        D400: First line should end with a period (not 'r')
functional.py:3346 in public function `margin_ranking_loss`:
        D402: First line should not be the function's "signature"
functional.py:3382 in public function `hinge_embedding_loss`:
        D400: First line should end with a period (not 'r')
functional.py:3382 in public function `hinge_embedding_loss`:
        D402: First line should not be the function's "signature"
functional.py:3411 in public function `multilabel_margin_loss`:
        D400: First line should end with a period (not 'r')
functional.py:3411 in public function `multilabel_margin_loss`:
        D402: First line should not be the function's "signature"
functional.py:3439 in public function `soft_margin_loss`:
        D400: First line should end with a period (not 'r')
functional.py:3439 in public function `soft_margin_loss`:
        D402: First line should not be the function's "signature"
functional.py:3462 in public function `multilabel_soft_margin_loss`:
        D400: First line should end with a period (not 'r')
functional.py:3462 in public function `multilabel_soft_margin_loss`:
        D402: First line should not be the function's "signature"
functional.py:3510 in public function `cosine_embedding_loss`:
        D400: First line should end with a period (not 'r')
functional.py:3510 in public function `cosine_embedding_loss`:
        D402: First line should not be the function's "signature"
functional.py:3543 in public function `multi_margin_loss`:
        D400: First line should end with a period (not 'r')
functional.py:3543 in public function `multi_margin_loss`:
        D402: First line should not be the function's "signature"
functional.py:3708 in public function `upsample` (skipping F811,B950):
        D103: Missing docstring in public function
functional.py:3713 in public function `upsample` (skipping F811,B950):
        D103: Missing docstring in public function
functional.py:3718 in public function `upsample` (skipping F811):
        D205: 1 blank line required between summary line and description (found 0)
functional.py:3718 in public function `upsample` (skipping F811):
        D400: First line should end with a period (not 'n')
functional.py:3783 in private function `_is_integer`:
        D205: 1 blank line required between summary line and description (found 0)
functional.py:3794 in public function `interpolate` (skipping F811,B950):
        D103: Missing docstring in public function
functional.py:3799 in public function `interpolate` (skipping F811,B950):
        D103: Missing docstring in public function
functional.py:3804 in public function `interpolate` (skipping F811,B950):
        D103: Missing docstring in public function
functional.py:3809 in public function `interpolate` (skipping F811):
        D103: Missing docstring in public function
functional.py:3821 in public function `interpolate` (skipping F811,B950):
        D205: 1 blank line required between summary line and description (found 0)
functional.py:3821 in public function `interpolate` (skipping F811,B950):
        D400: First line should end with a period (not 'n')
functional.py:4062 in public function `upsample_nearest` (skipping F811):
        D103: Missing docstring in public function
functional.py:4067 in public function `upsample_nearest` (skipping F811):
        D103: Missing docstring in public function
functional.py:4100 in public function `upsample_bilinear` (skipping F811):
        D103: Missing docstring in public function
functional.py:4107 in public function `upsample_bilinear` (skipping F811):
        D103: Missing docstring in public function
functional.py:4114 in public function `upsample_bilinear` (skipping F811):
        D103: Missing docstring in public function
functional.py:4121 in public function `upsample_bilinear` (skipping F811):
        D103: Missing docstring in public function
functional.py:4174 in public function `grid_sample`:
        D205: 1 blank line required between summary line and description (found 0)
functional.py:4174 in public function `grid_sample`:
        D400: First line should end with a period (not 'e')
functional.py:4315 in public function `affine_grid`:
        D205: 1 blank line required between summary line and description (found 0)
functional.py:4315 in public function `affine_grid`:
        D400: First line should end with a period (not 'f')
functional.py:4315 in public function `affine_grid`:
        D401: First line should be in imperative mood (perhaps 'Generate', not 'Generates')
functional.py:4608 in public function `triplet_margin_loss`:
        D200: One-line docstring should fit on one line with quotes (found 3)
functional.py:4608 in public function `triplet_margin_loss`:
        D400: First line should end with a period (not 's')
functional.py:4643 in public function `triplet_margin_with_distance_loss`:
        D200: One-line docstring should fit on one line with quotes (found 3)
functional.py:4705 in public function `normalize`:
        D401: First line should be in imperative mood (perhaps 'Perform', not 'Performs')
functional.py:4733 in public function `assert_int_or_pair`:
        D103: Missing docstring in public function
functional.py:4743 in public function `unfold`:
        D401: First line should be in imperative mood (perhaps 'Extract', not 'Extracts')
functional.py:4773 in public function `fold`:
        D205: 1 blank line required between summary line and description (found 0)
functional.py:4773 in public function `fold`:
        D400: First line should end with a period (not 'g')
functional.py:4773 in public function `fold`:
        D401: First line should be in imperative mood (perhaps 'Combine', not 'Combines')
functional.py:4800 in private function `_in_projection_packed`:
        D205: 1 blank line required between summary line and description (found 0)
functional.py:4800 in private function `_in_projection_packed`:
        D401: First line should be in imperative mood (perhaps 'Perform', not 'Performs')
functional.py:4867 in private function `_in_projection`:
        D205: 1 blank line required between summary line and description (found 0)
functional.py:4867 in private function `_in_projection`:
        D400: First line should end with a period (not 'y')
functional.py:4867 in private function `_in_projection`:
        D401: First line should be in imperative mood (perhaps 'Perform', not 'Performs')
functional.py:5128 in public function `multi_head_attention_forward`:
        D205: 1 blank line required between summary line and description (found 0)
functional.py:5128 in public function `multi_head_attention_forward`:
        D400: First line should end with a period (not ':')
160
```

**AFTER:**

```
functional.py:3709 in public function `upsample` (skipping F811,B950):
        D103: Missing docstring in public function
functional.py:3714 in public function `upsample` (skipping F811,B950):
        D103: Missing docstring in public function
functional.py:3798 in public function `interpolate` (skipping F811,B950):
        D103: Missing docstring in public function
functional.py:3803 in public function `interpolate` (skipping F811,B950):
        D103: Missing docstring in public function
functional.py:3808 in public function `interpolate` (skipping F811,B950):
        D103: Missing docstring in public function
functional.py:3813 in public function `interpolate` (skipping F811):
        D103: Missing docstring in public function
functional.py:4068 in public function `upsample_nearest` (skipping F811):
        D103: Missing docstring in public function
functional.py:4073 in public function `upsample_nearest` (skipping F811):
        D103: Missing docstring in public function
functional.py:4106 in public function `upsample_bilinear` (skipping F811):
        D103: Missing docstring in public function
functional.py:4113 in public function `upsample_bilinear` (skipping F811):
        D103: Missing docstring in public function
functional.py:4120 in public function `upsample_bilinear` (skipping F811):
        D103: Missing docstring in public function
functional.py:4127 in public function `upsample_bilinear` (skipping F811):
        D103: Missing docstring in public function
functional.py:4742 in public function `assert_int_or_pair`:
        D103: Missing docstring in public function
13
```

The file contained several docstring errors. I have fixed all of them (hopefully) and have tried to improve the overall readability of the code. For the most part, I have included relevant descriptions of the functions (referred from the official PyTorch docs). In some cases where functions are purely mathematical or it is difficult to give a one-line description, I have just included references.
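
An illustrative before/after for the most common fixes above (D400/D401), using a hypothetical function rather than one from `functional.py`:

```python
def frobnicate_before(x):
    """Applies frobnication to the input tensor"""  # D401 (not imperative) + D400 (no period)
    return x


def frobnicate_after(x):
    """Apply frobnication to the input tensor."""   # imperative mood, ends with a period
    return x
```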

For testing, I relied on my local system and created a separate file. For the final edits, I directly changed the contents of the forked repo, as is already visible.

Kindly review @svekars @subramen @kit1980

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112856
Approved by: https://github.com/kit1980
2023-11-13 22:16:49 +00:00
giacomo
7b28f8c5ea Better error message when applying interpolation on non-4D tensors (#113459)
Fixes #113445

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113459
Approved by: https://github.com/albanD
2023-11-10 21:06:51 +00:00
Eric Zhang
468a73f0e3 Support Numpy ints in the torch.nn.functional.interpolate dtype check (#110778)
In https://github.com/pytorch/pytorch/pull/99243, a check was added to ensure the `size` only contained integers.

This PR updates the check to also include numpy integers based on this comment (cc @kit1980): https://github.com/pytorch/pytorch/pull/99243#issuecomment-1646736646. Similar to the other commenter, I also ran into issues where existing software broke due to this after upgrading to PT2.1:

```
                if not torch.jit.is_scripting():
                    if not all(_is_integer(x) for x in size):
>                       raise TypeError(
                            "expected size to be one of int or Tuple[int] or Tuple[int, int] or "
                            f"Tuple[int, int, int], but got size with types {[type(x) for x in size]}"
                        )
E                       TypeError: expected size to be one of int or Tuple[int] or Tuple[int, int] or Tuple[int, int, int], but got size with types [<class 'numpy.int64'>, <class 'numpy.int64'>]

/conda-env/lib/python3.8/site-packages/torch/nn/functional.py:3924: TypeError
```
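
A small sketch of the accepted inputs after this change (illustrative, not the PR's test): plain Python ints and NumPy integer scalars both pass the size check.

```python
import numpy as np
import torch
import torch.nn.functional as F

x = torch.randn(1, 3, 8, 8)

out_py = F.interpolate(x, size=(16, 16), mode="bilinear", align_corners=False)
out_np = F.interpolate(x, size=(np.int64(16), np.int64(16)), mode="bilinear", align_corners=False)

print(out_py.shape, out_np.shape)  # both torch.Size([1, 3, 16, 16])
```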
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110778
Approved by: https://github.com/mikaylagawarecki
2023-10-10 01:46:33 +00:00
Mikayla Gawarecki
abd83ce180 Small fix in SDPA docstring codeblock (#109086)
Fix https://github.com/pytorch/pytorch/issues/109072

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109086
Approved by: https://github.com/drisspg
2023-09-12 16:48:46 +00:00
FFFrog
969bf8a054 Fix the document of torch.nn.functional.conv2d (#107851)
Fixes #107692

Fix the document of torch.nn.functional.conv2d
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107851
Approved by: https://github.com/mikaylagawarecki
2023-08-24 18:02:03 +00:00
Aaron Gokaslan
660e8060ad [BE]: Update ruff to 0.285 (#107519)
This updates ruff to 0.285, which is faster, better, and fixes a bunch of false negatives with regard to f-strings.

I also enabled RUF017, which looks for accidental quadratic list summation. Luckily, it seems there are no instances of it in our codebase, so I am enabling it so that it stays that way. :)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519
Approved by: https://github.com/ezyang
2023-08-22 23:16:38 +00:00
PyTorch MergeBot
d59a6864fb Revert "[BE]: Update ruff to 0.285 (#107519)"
This reverts commit 88ab3e4322.

Reverted https://github.com/pytorch/pytorch/pull/107519 on behalf of https://github.com/ZainRizvi due to Sorry, but this PR breaks internal tests. @ezyang, can you please help them get unblocked? It seems like one of the strings was probably accidentally modified ([comment](https://github.com/pytorch/pytorch/pull/107519#issuecomment-1688833480))
2023-08-22 19:53:32 +00:00
Aaron Gokaslan
88ab3e4322 [BE]: Update ruff to 0.285 (#107519)
This updates ruff to 0.285, which is faster, better, and fixes a bunch of false negatives with regard to f-strings.

I also enabled RUF017, which looks for accidental quadratic list summation. Luckily, it seems there are no instances of it in our codebase, so I am enabling it so that it stays that way. :)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519
Approved by: https://github.com/ezyang
2023-08-20 01:36:18 +00:00
Mikayla Gawarecki
1317dbf176 Reland "Add nn.CircularPad{*}d for consistency + fix no_batch_dim support (#106148)" (#106632)
The previous one was reverted because the PR stacked under it, which added error-checking to the Pad variants (https://github.com/pytorch/pytorch/pull/106147), was itself reverted: internally some people pass 2D inputs to ZeroPad2d (which should actually take 3D or 4D inputs :). As far as I understand, there wasn't actually anything this PR was breaking.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106632
Approved by: https://github.com/albanD
2023-08-07 20:10:25 +00:00
PyTorch MergeBot
dfcfd5cedb Revert "Add nn.CircularPad{*}d for consistency + fix no_batch_dim support (#106148)"
This reverts commit 87d2536971.

Reverted https://github.com/pytorch/pytorch/pull/106148 on behalf of https://github.com/malfet due to Reverting as dependent PR https://github.com/pytorch/pytorch/pull/106147 was reverted as well ([comment](https://github.com/pytorch/pytorch/pull/106148#issuecomment-1662344543))
2023-08-02 14:46:00 +00:00
Mikayla Gawarecki
87d2536971 Add nn.CircularPad{*}d for consistency + fix no_batch_dim support (#106148)
Fixes #105749 and https://github.com/pytorch/pytorch/issues/95320

(tl;dr: the input should always be `[N, C, H, (W, D)]`, where only the H, W, and D dimensions get circular padding, so in the 2D case where the user wants both dimensions padded they should `.unsqueeze(0)` first (as is the case for `Reflection/ReplicationPad`), but we didn't document this for circular padding. [This seems to be the old docstring](277b05014a/torch/nn/functional.py (L4689)) that was somehow lost.)

Fixes no_batch_dim support https://github.com/pytorch/pytorch/issues/104860

- Adds missing documentation for circular padding
- Adds missing CircularPad modules
- Migrates legacy test_nn tests from circular padding to ModuleInfo
- Adds no_batch_dim support + sample inputs that test this
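
A short sketch of the shape convention described above, assuming the module lands as `nn.CircularPad2d` with the no-batch-dim support added here:

```python
import torch
import torch.nn as nn

pad = nn.CircularPad2d(1)

# Usual batched case: [N, C, H, W]; only H and W get circular padding.
x = torch.randn(2, 3, 4, 4)
print(pad(x).shape)  # torch.Size([2, 3, 6, 6])

# 2D tensor where both dims should be padded: unsqueeze to [C, H, W] first.
x2d = torch.arange(12.0).reshape(3, 4)
print(pad(x2d.unsqueeze(0)).squeeze(0).shape)  # torch.Size([5, 6])
```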

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106148
Approved by: https://github.com/albanD
ghstack dependencies: #106325, #106147
2023-08-01 12:49:58 +00:00
FFFrog
9a1cdcb8a0 Format: fixing multiple string concatenation in single line (#106013)
Fixing multiple string concatenations in a single line
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106013
Approved by: https://github.com/albanD
2023-07-26 18:39:18 +00:00
lezcano
9bde7f4e27 Fix the docs for cosine_similarity (#104772)
The behaviour of `cosine_similarity` was subtly changed in
https://github.com/pytorch/pytorch/pull/31378, but the docs were not
updated.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104772
Approved by: https://github.com/albanD, https://github.com/svekars
2023-07-26 09:23:09 +00:00
Justin Chu
4cc1745b13 [BE] f-stringify torch/ and scripts (#105538)
This PR is a follow-up to the pyupgrade series, converting more strings to f-strings using `flynt`.

- https://docs.python.org/3/reference/lexical_analysis.html#f-strings
- https://pypi.org/project/flynt/

Command used:

```
flynt torch/ -ll 120
flynt scripts/ -ll 120
flynt tools/ -ll 120
```

and excluded `collect_env.py`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105538
Approved by: https://github.com/ezyang, https://github.com/malfet
2023-07-21 19:35:24 +00:00
Justin Chu
79c5e33349 [BE] Enable ruff's UP rules and autoformat nn/ mps/ and torch/ (#105436)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105436
Approved by: https://github.com/malfet, https://github.com/albanD
2023-07-21 07:38:46 +00:00
drisspg
2ee440054b Small tweaks to SDPA docs (#104749)
Fixes #104652

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104749
Approved by: https://github.com/mikaylagawarecki
2023-07-10 21:01:45 +00:00
yewentao
d3ba8901d8 Adding precision issue note docs for functional.interpolate (#104622)
Fixes #104157

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104622
Approved by: https://github.com/ezyang
2023-07-05 16:20:57 +00:00
vfdev
4ab140902b [docs] Fixed typo in grid_sample docstring (#104406)
Fixed a small typo in the grid_sample docstring:

<img width="265" alt="image" src="https://github.com/pytorch/pytorch/assets/2459423/1d2dd7a2-895a-4683-9d9f-a4d1d9d1a4a7">

- https://pytorch.org/docs/main/generated/torch.nn.functional.grid_sample.html

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104406
Approved by: https://github.com/mikaylagawarecki, https://github.com/svekars
2023-06-29 19:44:54 +00:00
Ryan Smith
6bda97e2c1 Raise type error message for interpolate if size contains non-integer elements (#99243)
Raise a TypeError for interpolate when the output size is a tuple containing elements that are not `int`.

Fixes #98287

The check is only performed if `size` is an instance of `list` or `tuple`.
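
A tiny sketch of the check (the exact message may differ slightly):

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 3, 8, 8)

try:
    F.interpolate(x, size=(16.0, 16.0), mode="bilinear", align_corners=False)
except TypeError as err:
    print("rejected:", err)
```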
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99243
Approved by: https://github.com/Skylion007, https://github.com/Neilblaze, https://github.com/MovsisyanM, https://github.com/albanD
2023-06-23 00:48:45 +00:00
MysticalMusings
f1f13a35b0 Fix GELU-related docstring formatting (#102845)
The docstring for GELU seems to be formatted incorrectly. The original docstring is rendered as below:

$$ \text{GELU}(x) = 0.5 * x * (1 + \text{Tanh}(\sqrt(2 / \pi) * (x + 0.044715 * x^3))) $$

where it is unclear which part the square root applies to.

I double-checked the formula, which should be:

$$ \text{GELU}(x) = 0.5 * x * (1 + \text{Tanh}(\sqrt{2 / \pi} * (x + 0.044715 * x^3))) $$

where the round brackets in the source code should be curly braces.

> _formula in [original paper](https://arxiv.org/abs/1606.08415)_
> ![Snipaste_2023-06-03_00-43-49](https://github.com/pytorch/pytorch/assets/39690782/22511c4e-2f20-4a16-9bda-4c182a360160)
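
A quick numerical check of the corrected formula (with the square root over the whole `2/π` term) against PyTorch's tanh-approximate GELU:

```python
import math

import torch
import torch.nn.functional as F

x = torch.linspace(-3, 3, steps=7)
manual = 0.5 * x * (1 + torch.tanh(math.sqrt(2 / math.pi) * (x + 0.044715 * x**3)))
print(torch.allclose(manual, F.gelu(x, approximate="tanh"), atol=1e-6))  # True
```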
Pull Request resolved: https://github.com/pytorch/pytorch/pull/102845
Approved by: https://github.com/mikaylagawarecki
2023-06-08 20:19:03 +00:00
cviviers
81c181dc01 Update BCEWithLogitsLoss pos_weight description in documentation (#101567)
Fixes #82496 and #65702

Pull Request resolved: https://github.com/pytorch/pytorch/pull/101567
Approved by: https://github.com/mikaylagawarecki
2023-05-19 21:23:21 +00:00
Edward Z. Yang
c567748e16 Make interpolate_bilinear deterministic using decomposition (#101115)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/101115
Approved by: https://github.com/ngimel
2023-05-11 22:48:01 +00:00
Joel Schlosser
bd9d50a3fc Remove future deprecation warning from kl_div docs (#96541)
Fixes #95687
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96541
Approved by: https://github.com/albanD
2023-05-05 23:01:21 +00:00