Use `typing_extensions.deprecated` for deprecation annotation if possible. Otherwise, add `category=FutureWarning` to `warnings.warn("message")` if the category is missing.
Note that this PR updates only warnings whose messages match `[Dd]eprecat(ed|ion)`.
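A minimal sketch of the second case, where a category is added to an existing `warnings.warn` call. The function name `old_api` and its message are hypothetical and are not taken from the PR; only the standard-library `warnings` module is used.

```python
import warnings


def old_api(x):
    """Hypothetical deprecated function, used only for illustration."""
    # Without an explicit category, warnings.warn defaults to UserWarning.
    warnings.warn(
        "old_api is deprecated, use new_api instead.",
        category=FutureWarning,
        stacklevel=2,  # attribute the warning to the caller of old_api
    )
    return x * 2


# Record the warning so its category can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = old_api(3)

print(result, caught[0].category.__name__)
```

The `stacklevel=2` argument is a choice made for this sketch so that the warning points at the calling code rather than at the deprecated function itself.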
Resolves #126888
- #126888
This PR is split from PR #126898.
- #126898
------
Pull Request resolved: https://github.com/pytorch/pytorch/pull/127689
Approved by: https://github.com/Skylion007
Use `typing_extensions.deprecated` for deprecation annotation if possible. Otherwise, add `category=FutureWarning` to `warnings.warn("message")` if the category is missing.
Note that this PR updates only warnings whose messages match `[Dd]eprecat(ed|ion)`.
UPDATE: Use `FutureWarning` instead of `DeprecationWarning`.
Resolves #126888
- #126888
Pull Request resolved: https://github.com/pytorch/pytorch/pull/126898
Approved by: https://github.com/albanD
Update ruff to 0.4.1.
This version fixes a lot of false negatives and false positives, is 20-40% faster, and includes various other bug fixes.
Below is a before-and-after table showing the execution time of ruff lint and ruff format in milliseconds, courtesy of https://astral.sh/blog/ruff-v0.4.0:
| Repository | Linter (v0.3) | Linter (v0.4) | Formatter (v0.3) | Formatter (v0.4) |
|----------------------------------------------------|---------------|---------------|------------------|------------------|
| [pytorch/pytorch](https://github.com/pytorch/pytorch) | 328.7 | 251.8 | 351.1 | 274.9 |
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124549
Approved by: https://github.com/ezyang
The documentation states that the `margin` parameter of `torch.nn.TripletMarginLoss` must be greater than 0; however, any value was being accepted. This PR also fixes `torch.nn.TripletMarginWithDistanceLoss`, which had the same problem, and adds an error test input for the new `ValueError`.
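The kind of check described above can be sketched as follows. The helper name `check_margin` and the error message are hypothetical; this is not the code added by the PR, only an illustration of rejecting non-positive margins with a `ValueError`.

```python
def check_margin(margin: float) -> float:
    """Hypothetical helper: reject margins that are not greater than 0."""
    if margin <= 0:
        raise ValueError(f"margin must be greater than 0, got {margin}")
    return margin


# A positive margin passes through unchanged.
accepted = check_margin(1.0)

# A non-positive margin is rejected.
try:
    check_margin(0.0)
    rejected = False
except ValueError:
    rejected = True
```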
Fixes #83241
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121978
Approved by: https://github.com/mikaylagawarecki
# Summary
Updates the SDPA docs to fix some small inaccuracies and point to the new `sdpa_kernel` context manager. The enum-like type `SDPBackend`, bound from C++, does not render its fields for some reason, so they are listed manually for now.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121180
Approved by: https://github.com/mikaylagawarecki
# Summary
This PR introduces a new Tensor subclass that is designed to be used with `torch.nn.functional.scaled_dot_product_attention`. Currently we have a boolean `is_causal` flag that lets users do causal masking without actually creating the "realized" attention bias and passing it into SDPA. We originally added this flag because both fused kernels we support handle it natively. This provides a big performance gain (the kernels only need to iterate over ~0.5x the sequence), and for very large sequence lengths it can provide very large memory improvements.
The flag was introduced early in kernel development, and at the time it was implicitly meant to denote "upper_left" causal attention. This distinction only matters when the attention_bias is not square. For a more detailed breakdown see: https://github.com/pytorch/pytorch/issues/108108. The kernels' default behavior has since changed, largely due to the rise of autoregressive text generation. Unfortunately, adopting that change here would lead to a BC break. In the long term it may actually be beneficial to change the default meaning of `is_causal` to represent lower_right causal masking.
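To make the upper_left versus lower_right distinction concrete, here is a small pure-Python sketch that builds both boolean masks for a non-square case. The function `causal_mask` is hypothetical and is not part of this PR; it assumes the lower_right variant shifts the diagonal by `k_len - q_len`, so that the last query row can attend to every key.

```python
def causal_mask(q_len, k_len, variant):
    """Return a q_len x k_len boolean mask; True means "may attend"."""
    if variant == "upper_left":
        offset = 0  # diagonal anchored at the top-left corner
    elif variant == "lower_right":
        offset = k_len - q_len  # diagonal anchored at the bottom-right corner
    else:
        raise ValueError(f"unknown variant: {variant}")
    return [[j <= i + offset for j in range(k_len)] for i in range(q_len)]


upper = causal_mask(2, 4, "upper_left")
lower = causal_mask(2, 4, "lower_right")

for name, mask in (("upper_left", upper), ("lower_right", lower)):
    print(name)
    for row in mask:
        print("  ", ["x" if allowed else "." for allowed in row])
```

For a square mask (`q_len == k_len`) the offset is zero and the two variants coincide, which is why the distinction only shows up for non-square biases.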
The larger theme, though, is laid out here: https://github.com/pytorch/pytorch/issues/110681. The thesis is that there is a lot of innovation in SDPA revolving around the attention_bias being used. This is the first of hopefully several attention_biases that we would like to add. The next interesting one would be `sliding_window`, which is used by the popular Mistral model family.
Results from benchmarking are shown below. I improved the meff_attention perf, hence the slightly decreased max speedup.
```Shell
+---------+--------------------+------------+-----------+-----------+-----------+-----------+----------------+----------+
| Type | Speedup | batch_size | num_heads | q_seq_len | k_seq_len | embed_dim | dtype | head_dim |
+---------+--------------------+------------+-----------+-----------+-----------+-----------+----------------+----------+
| Average | 1.2388050062214226 | | | | | | | |
| Max | 1.831672915579016 | 128 | 32 | 1024 | 2048 | 2048 | torch.bfloat16 | 64 |
| Min | 0.9430534166730135 | 1 | 16 | 256 | 416 | 2048 | torch.bfloat16 | 128 |
+---------+--------------------+------------+-----------+-----------+-----------+-----------+----------------+----------+
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114823
Approved by: https://github.com/cpuhrsch
Fixes #112597
### Output:
**BEFORE:**
```
functional.py:1 at module level:
D400: First line should end with a period (not 'e')
functional.py:438 in public function `fractional_max_pool2d_with_indices`:
D400: First line should end with a period (not ')')
functional.py:537 in public function `fractional_max_pool3d_with_indices`:
D400: First line should end with a period (not ')')
functional.py:646 in public function `max_pool1d_with_indices`:
D400: First line should end with a period (not ')')
functional.py:732 in public function `max_pool2d_with_indices`:
D400: First line should end with a period (not ')')
functional.py:818 in public function `max_pool3d_with_indices`:
D400: First line should end with a period (not ')')
functional.py:932 in public function `max_unpool1d`:
D401: First line should be in imperative mood (perhaps 'Compute', not 'Computes')
functional.py:968 in public function `max_unpool2d`:
D401: First line should be in imperative mood (perhaps 'Compute', not 'Computes')
functional.py:1000 in public function `max_unpool3d`:
D401: First line should be in imperative mood (perhaps 'Compute', not 'Computes')
functional.py:1031 in public function `lp_pool2d`:
D205: 1 blank line required between summary line and description (found 0)
functional.py:1031 in public function `lp_pool2d`:
D400: First line should end with a period (not 'f')
functional.py:1031 in public function `lp_pool2d`:
D401: First line should be in imperative mood (perhaps 'Apply', not 'Applies')
functional.py:1056 in public function `lp_pool1d`:
D205: 1 blank line required between summary line and description (found 0)
functional.py:1056 in public function `lp_pool1d`:
D400: First line should end with a period (not 'f')
functional.py:1056 in public function `lp_pool1d`:
D401: First line should be in imperative mood (perhaps 'Apply', not 'Applies')
functional.py:1077 in public function `adaptive_max_pool1d_with_indices`:
D400: First line should end with a period (not ')')
functional.py:1119 in public function `adaptive_max_pool2d_with_indices`:
D400: First line should end with a period (not ')')
functional.py:1163 in public function `adaptive_max_pool3d_with_indices`:
D400: First line should end with a period (not ')')
functional.py:1220 in public function `adaptive_avg_pool2d`:
D205: 1 blank line required between summary line and description (found 0)
functional.py:1220 in public function `adaptive_avg_pool2d`:
D400: First line should end with a period (not 'f')
functional.py:1220 in public function `adaptive_avg_pool2d`:
D401: First line should be in imperative mood (perhaps 'Apply', not 'Applies')
functional.py:1237 in public function `adaptive_avg_pool3d`:
D205: 1 blank line required between summary line and description (found 0)
functional.py:1237 in public function `adaptive_avg_pool3d`:
D400: First line should end with a period (not 'f')
functional.py:1237 in public function `adaptive_avg_pool3d`:
D401: First line should be in imperative mood (perhaps 'Apply', not 'Applies')
functional.py:1255 in public function `dropout`:
D205: 1 blank line required between summary line and description (found 0)
functional.py:1255 in public function `dropout`:
D400: First line should end with a period (not 't')
functional.py:1275 in public function `alpha_dropout`:
D401: First line should be in imperative mood (perhaps 'Apply', not 'Applies')
functional.py:1287 in public function `dropout1d`:
D205: 1 blank line required between summary line and description (found 0)
functional.py:1287 in public function `dropout1d`:
D400: First line should end with a period (not ',')
functional.py:1325 in public function `dropout2d`:
D205: 1 blank line required between summary line and description (found 0)
functional.py:1325 in public function `dropout2d`:
D400: First line should end with a period (not ',')
functional.py:1369 in public function `dropout3d`:
D205: 1 blank line required between summary line and description (found 0)
functional.py:1369 in public function `dropout3d`:
D400: First line should end with a period (not ',')
functional.py:1408 in public function `feature_alpha_dropout`:
D205: 1 blank line required between summary line and description (found 0)
functional.py:1408 in public function `feature_alpha_dropout`:
D400: First line should end with a period (not ',')
functional.py:1466 in public function `relu`:
D400: First line should end with a period (not 'r')
functional.py:1466 in public function `relu`:
D402: First line should not be the function's "signature"
functional.py:1491 in public function `glu`:
D400: First line should end with a period (not 'r')
functional.py:1491 in public function `glu`:
D402: First line should not be the function's "signature"
functional.py:1516 in public function `hardtanh`:
D400: First line should end with a period (not 'r')
functional.py:1516 in public function `hardtanh`:
D402: First line should not be the function's "signature"
functional.py:1542 in public function `relu6`:
D400: First line should end with a period (not 'r')
functional.py:1542 in public function `relu6`:
D402: First line should not be the function's "signature"
functional.py:1558 in public function `elu`:
D401: First line should be in imperative mood (perhaps 'Apply', not 'Applies')
functional.py:1582 in public function `selu`:
D400: First line should end with a period (not 'r')
functional.py:1582 in public function `selu`:
D402: First line should not be the function's "signature"
functional.py:1611 in public function `celu`:
D400: First line should end with a period (not 'r')
functional.py:1611 in public function `celu`:
D402: First line should not be the function's "signature"
functional.py:1638 in public function `leaky_relu`:
D400: First line should end with a period (not 'r')
functional.py:1638 in public function `leaky_relu`:
D402: First line should not be the function's "signature"
functional.py:1688 in public function `rrelu`:
D400: First line should end with a period (not 'r')
functional.py:1688 in public function `rrelu`:
D402: First line should not be the function's "signature"
functional.py:1755 in public function `tanhshrink`:
D400: First line should end with a period (not 'r')
functional.py:1755 in public function `tanhshrink`:
D402: First line should not be the function's "signature"
functional.py:1767 in public function `softsign`:
D400: First line should end with a period (not 'r')
functional.py:1767 in public function `softsign`:
D402: First line should not be the function's "signature"
functional.py:1806 in public function `softmin`:
D401: First line should be in imperative mood (perhaps 'Apply', not 'Applies')
functional.py:1832 in public function `softmax`:
D401: First line should be in imperative mood (perhaps 'Apply', not 'Applies')
functional.py:1868 in public function `gumbel_softmax`:
D401: First line should be in imperative mood (perhaps 'Sample', not 'Samples')
functional.py:1930 in public function `log_softmax`:
D401: First line should be in imperative mood (perhaps 'Apply', not 'Applies')
functional.py:1969 in public function `tanh`:
D400: First line should end with a period (not 'r')
functional.py:1969 in public function `tanh`:
D402: First line should not be the function's "signature"
functional.py:1980 in public function `sigmoid`:
D400: First line should end with a period (not 'r')
functional.py:1980 in public function `sigmoid`:
D402: First line should not be the function's "signature"
functional.py:1990 in public function `hardsigmoid`:
D400: First line should end with a period (not 'n')
functional.py:1990 in public function `hardsigmoid`:
D401: First line should be in imperative mood (perhaps 'Apply', not 'Applies')
functional.py:2057 in public function `silu`:
D205: 1 blank line required between summary line and description (found 0)
functional.py:2057 in public function `silu`:
D401: First line should be in imperative mood (perhaps 'Apply', not 'Applies')
functional.py:2081 in public function `mish`:
D205: 1 blank line required between summary line and description (found 0)
functional.py:2081 in public function `mish`:
D401: First line should be in imperative mood (perhaps 'Apply', not 'Applies')
functional.py:2100 in public function `hardswish`:
D400: First line should end with a period (not ':')
functional.py:2100 in public function `hardswish`:
D401: First line should be in imperative mood (perhaps 'Apply', not 'Applies')
functional.py:2136 in public function `embedding`:
D202: No blank lines allowed after function docstring (found 1)
functional.py:2136 in public function `embedding`:
D401: First line should be in imperative mood; try rephrasing (found 'A')
functional.py:2254 in public function `embedding_bag`:
D205: 1 blank line required between summary line and description (found 0)
functional.py:2254 in public function `embedding_bag`:
D400: First line should end with a period (not 'e')
functional.py:2254 in public function `embedding_bag`:
D401: First line should be in imperative mood (perhaps 'Compute', not 'Computes')
functional.py:2462 in public function `batch_norm`:
D401: First line should be in imperative mood (perhaps 'Apply', not 'Applies')
functional.py:2507 in public function `instance_norm`:
D205: 1 blank line required between summary line and description (found 0)
functional.py:2507 in public function `instance_norm`:
D400: First line should end with a period (not 'a')
functional.py:2507 in public function `instance_norm`:
D401: First line should be in imperative mood (perhaps 'Apply', not 'Applies')
functional.py:2540 in public function `layer_norm`:
D401: First line should be in imperative mood (perhaps 'Apply', not 'Applies')
functional.py:2554 in public function `group_norm`:
D401: First line should be in imperative mood (perhaps 'Apply', not 'Applies')
functional.py:2567 in public function `local_response_norm`:
D205: 1 blank line required between summary line and description (found 0)
functional.py:2567 in public function `local_response_norm`:
D400: First line should end with a period (not 'f')
functional.py:2567 in public function `local_response_norm`:
D401: First line should be in imperative mood (perhaps 'Apply', not 'Applies')
functional.py:2611 in public function `ctc_loss`:
D401: First line should be in imperative mood; try rephrasing (found 'The')
functional.py:2679 in public function `nll_loss`:
D401: First line should be in imperative mood; try rephrasing (found 'The')
functional.py:2895 in public function `kl_div`:
D205: 1 blank line required between summary line and description (found 0)
functional.py:2895 in public function `kl_div`:
D400: First line should end with a period (not 's')
functional.py:2895 in public function `kl_div`:
D401: First line should be in imperative mood; try rephrasing (found 'The')
functional.py:2978 in public function `cross_entropy`:
D401: First line should be in imperative mood; try rephrasing (found 'This')
functional.py:3069 in public function `binary_cross_entropy`:
D205: 1 blank line required between summary line and description (found 0)
functional.py:3069 in public function `binary_cross_entropy`:
D400: First line should end with a period (not 't')
functional.py:3069 in public function `binary_cross_entropy`:
D401: First line should be in imperative mood; try rephrasing (found 'Function')
functional.py:3139 in public function `binary_cross_entropy_with_logits`:
D205: 1 blank line required between summary line and description (found 0)
functional.py:3139 in public function `binary_cross_entropy_with_logits`:
D400: First line should end with a period (not 't')
functional.py:3139 in public function `binary_cross_entropy_with_logits`:
D401: First line should be in imperative mood; try rephrasing (found 'Function')
functional.py:3211 in public function `smooth_l1_loss`:
D205: 1 blank line required between summary line and description (found 0)
functional.py:3211 in public function `smooth_l1_loss`:
D400: First line should end with a period (not 'e')
functional.py:3211 in public function `smooth_l1_loss`:
D401: First line should be in imperative mood; try rephrasing (found 'Function')
functional.py:3251 in public function `huber_loss`:
D205: 1 blank line required between summary line and description (found 0)
functional.py:3251 in public function `huber_loss`:
D400: First line should end with a period (not 'e')
functional.py:3251 in public function `huber_loss`:
D401: First line should be in imperative mood; try rephrasing (found 'Function')
functional.py:3282 in public function `l1_loss`:
D400: First line should end with a period (not 'r')
functional.py:3282 in public function `l1_loss`:
D402: First line should not be the function's "signature"
functional.py:3313 in public function `mse_loss`:
D400: First line should end with a period (not 'r')
functional.py:3313 in public function `mse_loss`:
D402: First line should not be the function's "signature"
functional.py:3346 in public function `margin_ranking_loss`:
D400: First line should end with a period (not 'r')
functional.py:3346 in public function `margin_ranking_loss`:
D402: First line should not be the function's "signature"
functional.py:3382 in public function `hinge_embedding_loss`:
D400: First line should end with a period (not 'r')
functional.py:3382 in public function `hinge_embedding_loss`:
D402: First line should not be the function's "signature"
functional.py:3411 in public function `multilabel_margin_loss`:
D400: First line should end with a period (not 'r')
functional.py:3411 in public function `multilabel_margin_loss`:
D402: First line should not be the function's "signature"
functional.py:3439 in public function `soft_margin_loss`:
D400: First line should end with a period (not 'r')
functional.py:3439 in public function `soft_margin_loss`:
D402: First line should not be the function's "signature"
functional.py:3462 in public function `multilabel_soft_margin_loss`:
D400: First line should end with a period (not 'r')
functional.py:3462 in public function `multilabel_soft_margin_loss`:
D402: First line should not be the function's "signature"
functional.py:3510 in public function `cosine_embedding_loss`:
D400: First line should end with a period (not 'r')
functional.py:3510 in public function `cosine_embedding_loss`:
D402: First line should not be the function's "signature"
functional.py:3543 in public function `multi_margin_loss`:
D400: First line should end with a period (not 'r')
functional.py:3543 in public function `multi_margin_loss`:
D402: First line should not be the function's "signature"
functional.py:3708 in public function `upsample` (skipping F811,B950):
D103: Missing docstring in public function
functional.py:3713 in public function `upsample` (skipping F811,B950):
D103: Missing docstring in public function
functional.py:3718 in public function `upsample` (skipping F811):
D205: 1 blank line required between summary line and description (found 0)
functional.py:3718 in public function `upsample` (skipping F811):
D400: First line should end with a period (not 'n')
functional.py:3783 in private function `_is_integer`:
D205: 1 blank line required between summary line and description (found 0)
functional.py:3794 in public function `interpolate` (skipping F811,B950):
D103: Missing docstring in public function
functional.py:3799 in public function `interpolate` (skipping F811,B950):
D103: Missing docstring in public function
functional.py:3804 in public function `interpolate` (skipping F811,B950):
D103: Missing docstring in public function
functional.py:3809 in public function `interpolate` (skipping F811):
D103: Missing docstring in public function
functional.py:3821 in public function `interpolate` (skipping F811,B950):
D205: 1 blank line required between summary line and description (found 0)
functional.py:3821 in public function `interpolate` (skipping F811,B950):
D400: First line should end with a period (not 'n')
functional.py:4062 in public function `upsample_nearest` (skipping F811):
D103: Missing docstring in public function
functional.py:4067 in public function `upsample_nearest` (skipping F811):
D103: Missing docstring in public function
functional.py:4100 in public function `upsample_bilinear` (skipping F811):
D103: Missing docstring in public function
functional.py:4107 in public function `upsample_bilinear` (skipping F811):
D103: Missing docstring in public function
functional.py:4114 in public function `upsample_bilinear` (skipping F811):
D103: Missing docstring in public function
functional.py:4121 in public function `upsample_bilinear` (skipping F811):
D103: Missing docstring in public function
functional.py:4174 in public function `grid_sample`:
D205: 1 blank line required between summary line and description (found 0)
functional.py:4174 in public function `grid_sample`:
D400: First line should end with a period (not 'e')
functional.py:4315 in public function `affine_grid`:
D205: 1 blank line required between summary line and description (found 0)
functional.py:4315 in public function `affine_grid`:
D400: First line should end with a period (not 'f')
functional.py:4315 in public function `affine_grid`:
D401: First line should be in imperative mood (perhaps 'Generate', not 'Generates')
functional.py:4608 in public function `triplet_margin_loss`:
D200: One-line docstring should fit on one line with quotes (found 3)
functional.py:4608 in public function `triplet_margin_loss`:
D400: First line should end with a period (not 's')
functional.py:4643 in public function `triplet_margin_with_distance_loss`:
D200: One-line docstring should fit on one line with quotes (found 3)
functional.py:4705 in public function `normalize`:
D401: First line should be in imperative mood (perhaps 'Perform', not 'Performs')
functional.py:4733 in public function `assert_int_or_pair`:
D103: Missing docstring in public function
functional.py:4743 in public function `unfold`:
D401: First line should be in imperative mood (perhaps 'Extract', not 'Extracts')
functional.py:4773 in public function `fold`:
D205: 1 blank line required between summary line and description (found 0)
functional.py:4773 in public function `fold`:
D400: First line should end with a period (not 'g')
functional.py:4773 in public function `fold`:
D401: First line should be in imperative mood (perhaps 'Combine', not 'Combines')
functional.py:4800 in private function `_in_projection_packed`:
D205: 1 blank line required between summary line and description (found 0)
functional.py:4800 in private function `_in_projection_packed`:
D401: First line should be in imperative mood (perhaps 'Perform', not 'Performs')
functional.py:4867 in private function `_in_projection`:
D205: 1 blank line required between summary line and description (found 0)
functional.py:4867 in private function `_in_projection`:
D400: First line should end with a period (not 'y')
functional.py:4867 in private function `_in_projection`:
D401: First line should be in imperative mood (perhaps 'Perform', not 'Performs')
functional.py:5128 in public function `multi_head_attention_forward`:
D205: 1 blank line required between summary line and description (found 0)
functional.py:5128 in public function `multi_head_attention_forward`:
D400: First line should end with a period (not ':')
160
```
**AFTER:**
```
functional.py:3709 in public function `upsample` (skipping F811,B950):
D103: Missing docstring in public function
functional.py:3714 in public function `upsample` (skipping F811,B950):
D103: Missing docstring in public function
functional.py:3798 in public function `interpolate` (skipping F811,B950):
D103: Missing docstring in public function
functional.py:3803 in public function `interpolate` (skipping F811,B950):
D103: Missing docstring in public function
functional.py:3808 in public function `interpolate` (skipping F811,B950):
D103: Missing docstring in public function
functional.py:3813 in public function `interpolate` (skipping F811):
D103: Missing docstring in public function
functional.py:4068 in public function `upsample_nearest` (skipping F811):
D103: Missing docstring in public function
functional.py:4073 in public function `upsample_nearest` (skipping F811):
D103: Missing docstring in public function
functional.py:4106 in public function `upsample_bilinear` (skipping F811):
D103: Missing docstring in public function
functional.py:4113 in public function `upsample_bilinear` (skipping F811):
D103: Missing docstring in public function
functional.py:4120 in public function `upsample_bilinear` (skipping F811):
D103: Missing docstring in public function
functional.py:4127 in public function `upsample_bilinear` (skipping F811):
D103: Missing docstring in public function
functional.py:4742 in public function `assert_int_or_pair`:
D103: Missing docstring in public function
13
```
The file contained several docstring errors. I have fixed all of them (hopefully) and have tried to improve the overall readability of the code. For the most part, I have included relevant descriptions of the functions (taken from the official PyTorch docs). In some cases, where functions are purely mathematical or a one-line description is difficult to give, I have included only references.
For testing, I relied on my local system and created a separate file. For the final edits, I directly changed the contents of the forked repo, as is already visible.
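For reference, a docstring shaped to satisfy the most frequent checks in the output above (D205, D400, D401) might look like the following. The function `scale` is a hypothetical example and does not come from `functional.py`.

```python
def scale(values, factor):
    """Multiply every element of ``values`` by ``factor``.

    The summary line is in the imperative mood (D401), ends with a period
    (D400), and is separated from this description by one blank line (D205).
    """
    return [v * factor for v in values]


doc_lines = scale.__doc__.splitlines()
print(doc_lines[0])
```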
Kindly review @svekars @subramen @kit1980
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112856
Approved by: https://github.com/kit1980
In https://github.com/pytorch/pytorch/pull/99243, a check was added to ensure the `size` only contained integers.
This PR updates the check to also accept numpy integers, based on this comment (cc @kit1980): https://github.com/pytorch/pytorch/pull/99243#issuecomment-1646736646. Like the other commenter, I also ran into issues where existing software broke because of this check after upgrading to PT2.1:
```
if not torch.jit.is_scripting():
if not all(_is_integer(x) for x in size):
> raise TypeError(
"expected size to be one of int or Tuple[int] or Tuple[int, int] or "
f"Tuple[int, int, int], but got size with types {[type(x) for x in size]}"
)
E TypeError: expected size to be one of int or Tuple[int] or Tuple[int, int] or Tuple[int, int, int], but got size with types [<class 'numpy.int64'>, <class 'numpy.int64'>]
/conda-env/lib/python3.8/site-packages/torch/nn/functional.py:3924: TypeError
```
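One general way to accept integer-like values that do not subclass `int` is to rely on the `__index__` protocol, which NumPy integer scalars implement. The sketch below uses that swapped-in technique and is not the check added by this PR; `is_integer_like` and `FakeInt64` are hypothetical names, and `FakeInt64` only stands in for a type such as `numpy.int64` so the example needs no third-party imports.

```python
import operator


def is_integer_like(x) -> bool:
    """Return True if x can be losslessly used as an integer index."""
    try:
        operator.index(x)  # raises TypeError for floats, strings, etc.
    except TypeError:
        return False
    return True


class FakeInt64:
    """Stand-in for an integer type that does not subclass int."""

    def __init__(self, value):
        self._value = value

    def __index__(self):
        return self._value


size = (FakeInt64(4), 8)
print(all(is_integer_like(s) for s in size))
```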
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110778
Approved by: https://github.com/mikaylagawarecki
This updates ruff to 0.285, which is faster, better, and fixes a bunch of false negatives with regard to f-strings.
I also enabled RUF017, which looks for accidental quadratic list summation. Luckily, it seems there are no instances of it in our codebase, so I am enabling it to keep it that way. :)
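As an illustration of the pattern RUF017 targets, the sketch below flattens a list of lists in two ways. The variable names are made up for this example; `itertools.chain.from_iterable` is one common linear-time alternative, not necessarily the only fix the rule suggests.

```python
from itertools import chain

chunks = [[1, 2], [3], [4, 5, 6]]

# Quadratic: every "+" copies the accumulated list, so the total work
# grows with the square of the number of elements.
flat_quadratic = sum(chunks, [])

# Linear: elements are visited once and appended to a single new list.
flat_linear = list(chain.from_iterable(chunks))

print(flat_quadratic == flat_linear)
```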
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519
Approved by: https://github.com/ezyang