pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
isdanni	2f7bb18def	[Doc] Add padding size constraint in nn.ReflectionPad2d (#115995 ) Fixes #115532 Pull Request resolved: https://github.com/pytorch/pytorch/pull/115995 Approved by: https://github.com/mikaylagawarecki	2023-12-18 21:29:14 +00:00
Mikayla Gawarecki	6d5fe07659	Fix numpy warning when importing torch without numpy installed (#115867 ) Fixes #115638 I verified locally that with no numpy install the warning no longer occurs Pull Request resolved: https://github.com/pytorch/pytorch/pull/115867 Approved by: https://github.com/soulitzer	2023-12-15 02:22:12 +00:00
Wongboo	68f74dd162	Add python and C++ support for LPPool3d (#114199 ) Adds python and C++ support for LPPool3d. Fixes #114114 Pull Request resolved: https://github.com/pytorch/pytorch/pull/114199 Approved by: https://github.com/mikaylagawarecki	2023-12-08 18:18:44 +00:00
Mikayla Gawarecki	f5919335db	Fix _load_from_state_dict for num_batches_tracked in batchnorm (#115285 ) I approved https://github.com/pytorch/pytorch/pull/110850 which did the following: Previously: `num_batches_tracked` not in state_dict when doing `m.load_state_dict(state_dict)` --> always overwrite module's `num_batches_tracked` in `load_from_state_dict` with a 0 cpu tensor Now: `num_batches_tracked` not in state_dict loaded when doing `m.load_state_dict(state_dict)` --> only overwrite module's `num_batches_tracked` in `load_from_state_dict` with a 0 cpu tensor if module does not have `num_batches_tracked` This causes the following issue: ``` with torch.device('meta'): m = BatchNorm(...) m.load_state_dict(state_dict, assign=True) ``` If `num_batches_tracked` is not in `state_dict`, since the module's `num_batches_tracked` is present on meta device, it is not overwritten with a 0 cpu tensor. When compiling, this error is raised ``` AssertionError: Does not support mixing cuda+meta ``` I am not sure whether the explicit check for meta device makes sense as a fix; will add testing if this fix is ok Pull Request resolved: https://github.com/pytorch/pytorch/pull/115285 Approved by: https://github.com/albanD	2023-12-07 22:48:26 +00:00
Linus	7201edc0a5	Fix RNN class constructor signature (#115341 ) Fixes #114617 Pull Request resolved: https://github.com/pytorch/pytorch/pull/115341 Approved by: https://github.com/mikaylagawarecki	2023-12-07 19:46:33 +00:00
Aaron Gokaslan	ea7d70aecc	[BE]: ruff FURB136: replace ternary with min/max (preview) (#114382 ) Replaces ternary if else statements with simple min max when appropriate. Pull Request resolved: https://github.com/pytorch/pytorch/pull/114382 Approved by: https://github.com/albanD	2023-11-22 22:10:01 +00:00
Brian Vaughan	dbb96ef30d	improve annotation device parameters where a device ordinal is allowed (#113647 ) Using mypy in code that depends on pytorch, I noticed that the type annotation doesn't allow a device ordinal. `error: Argument "device" to "to_empty" of "Module" has incompatible type "int"; expected "str \| device" [arg-type]` Pull Request resolved: https://github.com/pytorch/pytorch/pull/113647 Approved by: https://github.com/albanD	2023-11-17 14:41:22 +00:00
zabboud	53e7de4b65	Issue 112599 - fix pydocstyle errors (#113177 ) Fixes #112599 Fixed errors relating to pydocstyle in the following files. The remaining errors are related to docstrings at the module level and at methods within each module, `forward()`, `reset_parameters`, `__init__` ..etc pydocstyle torch/nn/modules/pooling.py --count before: 49 after: 29 remaining errors: ``` torch/nn/modules/pooling.py:1 at module level: D100: Missing docstring in public module torch/nn/modules/pooling.py:90 in public method `forward`: D102: Missing docstring in public method torch/nn/modules/pooling.py:163 in public method `forward`: D102: Missing docstring in public method torch/nn/modules/pooling.py:240 in public method `forward`: D102: Missing docstring in public method torch/nn/modules/pooling.py:315 in public method `__init__`: D107: Missing docstring in __init__ torch/nn/modules/pooling.py:321 in public method `forward`: D102: Missing docstring in public method torch/nn/modules/pooling.py:402 in public method `__init__`: D107: Missing docstring in __init__ torch/nn/modules/pooling.py:408 in public method `forward`: D102: Missing docstring in public method torch/nn/modules/pooling.py:472 in public method `__init__`: D107: Missing docstring in __init__ torch/nn/modules/pooling.py:478 in public method `forward`: D102: Missing docstring in public method torch/nn/modules/pooling.py:541 in public method `__init__`: D107: Missing docstring in __init__ torch/nn/modules/pooling.py:550 in public method `forward`: D102: Missing docstring in public method torch/nn/modules/pooling.py:620 in public method `__init__`: D107: Missing docstring in __init__ torch/nn/modules/pooling.py:630 in public method `forward`: D102: Missing docstring in public method torch/nn/modules/pooling.py:706 in public method `__init__`: D107: Missing docstring in __init__ torch/nn/modules/pooling.py:716 in public method `forward`: D102: Missing docstring in public method torch/nn/modules/pooling.py:720 in 
public method `__setstate__`: D105: Missing docstring in magic method torch/nn/modules/pooling.py:774 in public method `__init__`: D107: Missing docstring in __init__ torch/nn/modules/pooling.py:792 in public method `forward`: D102: Missing docstring in public method torch/nn/modules/pooling.py:845 in public method `__init__`: D107: Missing docstring in __init__ torch/nn/modules/pooling.py:863 in public method `forward`: D102: Missing docstring in public method torch/nn/modules/pooling.py:925 in public method `forward`: D102: Missing docstring in public method torch/nn/modules/pooling.py:979 in public method `forward`: D102: Missing docstring in public method torch/nn/modules/pooling.py:1026 in public method `forward`: D102: Missing docstring in public method torch/nn/modules/pooling.py:1068 in public method `forward`: D102: Missing docstring in public method torch/nn/modules/pooling.py:1111 in public method `forward`: D102: Missing docstring in public method torch/nn/modules/pooling.py:1150 in public method `forward`: D102: Missing docstring in public method torch/nn/modules/pooling.py:1189 in public method `forward`: D102: Missing docstring in public method torch/nn/modules/pooling.py:1228 in public method `forward`: D102: Missing docstring in public method ``` pydocstyle torch/nn/modules/upsampling.py --count before: 14 after: 7 remaining: ``` torch/nn/modules/upsampling.py:1 at module level: D100: Missing docstring in public module torch/nn/modules/upsampling.py:142 in public method `__init__`: D107: Missing docstring in __init__ torch/nn/modules/upsampling.py:156 in public method `forward`: D102: Missing docstring in public method torch/nn/modules/upsampling.py:160 in public method `__setstate__`: D105: Missing docstring in magic method torch/nn/modules/upsampling.py:166 in public method `extra_repr`: D102: Missing docstring in public method torch/nn/modules/upsampling.py:216 in public method `__init__`: D107: Missing docstring in __init__ 
torch/nn/modules/upsampling.py:263 in public method `__init__`: D107: Missing docstring in __init__ ``` pydocstyle torch/nn/modules/rnn.py --count before: 47 after: 40 remaining ``` torch/nn/modules/rnn.py:1 at module level: D100: Missing docstring in public module torch/nn/modules/rnn.py:59 in public method `__init__`: D107: Missing docstring in __init__ torch/nn/modules/rnn.py:160 in public method `__setattr__`: D105: Missing docstring in magic method torch/nn/modules/rnn.py:225 in public method `reset_parameters`: D102: Missing docstring in public method torch/nn/modules/rnn.py:230 in public method `check_input`: D102: Missing docstring in public method torch/nn/modules/rnn.py:242 in public method `get_expected_hidden_size`: D102: Missing docstring in public method torch/nn/modules/rnn.py:256 in public method `check_hidden_size`: D102: Missing docstring in public method torch/nn/modules/rnn.py:272 in public method `check_forward_args`: D102: Missing docstring in public method torch/nn/modules/rnn.py:278 in public method `permute_hidden`: D102: Missing docstring in public method torch/nn/modules/rnn.py:284 in public method `extra_repr`: D102: Missing docstring in public method torch/nn/modules/rnn.py:305 in public method `__getstate__`: D105: Missing docstring in magic method torch/nn/modules/rnn.py:313 in public method `__setstate__`: D105: Missing docstring in magic method torch/nn/modules/rnn.py:355 in public method `all_weights`: D102: Missing docstring in public method torch/nn/modules/rnn.py:471 in public method `__init__`: D107: Missing docstring in __init__ torch/nn/modules/rnn.py:478 in public method `__init__`: D107: Missing docstring in __init__ torch/nn/modules/rnn.py:481 in public method `__init__`: D107: Missing docstring in __init__ torch/nn/modules/rnn.py:503 in public method `forward` (skipping F811): D102: Missing docstring in public method torch/nn/modules/rnn.py:762 in public method `__init__`: D107: Missing docstring in __init__ 
torch/nn/modules/rnn.py:768 in public method `__init__`: D107: Missing docstring in __init__ torch/nn/modules/rnn.py:771 in public method `__init__`: D107: Missing docstring in __init__ torch/nn/modules/rnn.py:774 in public method `get_expected_cell_size`: D102: Missing docstring in public method torch/nn/modules/rnn.py:786 in public method `check_forward_args`: D102: Missing docstring in public method torch/nn/modules/rnn.py:798 in public method `permute_hidden`: D102: Missing docstring in public method torch/nn/modules/rnn.py:809 in public method `forward` (skipping F811): D102: Missing docstring in public method torch/nn/modules/rnn.py:820 in public method `forward` (skipping F811): D102: Missing docstring in public method torch/nn/modules/rnn.py:1030 in public method `__init__`: D107: Missing docstring in __init__ torch/nn/modules/rnn.py:1036 in public method `__init__`: D107: Missing docstring in __init__ torch/nn/modules/rnn.py:1039 in public method `__init__`: D107: Missing docstring in __init__ torch/nn/modules/rnn.py:1046 in public method `forward` (skipping F811): D102: Missing docstring in public method torch/nn/modules/rnn.py:1054 in public method `forward` (skipping F811): D102: Missing docstring in public method torch/nn/modules/rnn.py:1123 in public class `RNNCellBase`: D101: Missing docstring in public class torch/nn/modules/rnn.py:1134 in public method `__init__`: D107: Missing docstring in __init__ torch/nn/modules/rnn.py:1152 in public method `extra_repr`: D102: Missing docstring in public method torch/nn/modules/rnn.py:1160 in public method `reset_parameters`: D102: Missing docstring in public method torch/nn/modules/rnn.py:1224 in public method `__init__`: D107: Missing docstring in __init__ torch/nn/modules/rnn.py:1230 in public method `forward`: D102: Missing docstring in public method torch/nn/modules/rnn.py:1327 in public method `__init__`: D107: Missing docstring in __init__ torch/nn/modules/rnn.py:1332 in public method `forward`: D102: 
Missing docstring in public method torch/nn/modules/rnn.py:1422 in public method `__init__`: D107: Missing docstring in __init__ torch/nn/modules/rnn.py:1427 in public method `forward`: D102: Missing docstring in public method ``` pydocstyle torch/nn/modules/pixelshuffle.py --count before: 13 after: 8 remaining: ``` torch/nn/modules/pixelshuffle.py:1 at module level: D100: Missing docstring in public module torch/nn/modules/pixelshuffle.py:52 in public method `__init__`: D107: Missing docstring in __init__ torch/nn/modules/pixelshuffle.py:56 in public method `forward`: D102: Missing docstring in public method torch/nn/modules/pixelshuffle.py:59 in public method `extra_repr`: D102: Missing docstring in public method torch/nn/modules/pixelshuffle.py:105 in public method `__init__`: D107: Missing docstring in __init__ torch/nn/modules/pixelshuffle.py:109 in public method `forward`: D102: Missing docstring in public method torch/nn/modules/pixelshuffle.py:112 in public method `extra_repr`: D102: Missing docstring in public method ``` pydocstyle torch/nn/modules/sparse.py --count before: 14 after: 8 remaining errors: ``` torch/nn/modules/sparse.py:1 at module level: D100: Missing docstring in public module torch/nn/modules/sparse.py:124 in public method `__init__`: D107: Missing docstring in __init__ torch/nn/modules/sparse.py:153 in public method `reset_parameters`: D102: Missing docstring in public method torch/nn/modules/sparse.py:162 in public method `forward`: D102: Missing docstring in public method torch/nn/modules/sparse.py:167 in public method `extra_repr`: D102: Missing docstring in public method torch/nn/modules/sparse.py:320 in public method `__init__`: D107: Missing docstring in __init__ torch/nn/modules/sparse.py:350 in public method `reset_parameters`: D102: Missing docstring in public method torch/nn/modules/sparse.py:396 in public method `extra_repr`: D102: Missing docstring in public method ``` Pull Request resolved: 
https://github.com/pytorch/pytorch/pull/113177 Approved by: https://github.com/ezyang	2023-11-14 20:55:22 +00:00
markstur	5540d276ce	Fix docstring errors in container.py, _functions.py, transformer.py, comm.py, parallel_apply.py, data_parallel.py, scatter_gather.py (#113250 ) Fix docstring errors in container.py, _functions.py, transformer.py, comm.py, parallel_apply.py, data_parallel.py, scatter_gather.py Fixes #112603 Pull Request resolved: https://github.com/pytorch/pytorch/pull/113250 Approved by: https://github.com/mikaylagawarecki	2023-11-10 21:07:25 +00:00
Alperen ÜNLÜ	cb233dada4	Fix docstrings on torch/nn/modules (#113260 ) Fixes #112598 ## Description Fixes the docstrings on following files. ```bash pydocstyle path-to-file --count ``` \| File \| Count \| \| ------------------------------------- \| ------- \| \| torch/nn/modules/adaptive.py \| 20 -> 4 \| \| torch/nn/modules/channelshuffle.py \| 7 -> 4 \| \| torch/nn/modules/conv.py \| 37 -> 25 \| \| torch/nn/modules/distance.py \| 7 -> 5 \| \| torch/nn/modules/dropout.py \| 17 -> 7 \| \| torch/nn/modules/flatten.py \| 10 -> 7 \| \| torch/nn/modules/fold.py \| 11 -> 7 \| \| torch/nn/modules/instancenorm.py \| 13 -> 1 \| \| torch/nn/modules/lazy.py \| 11 -> 2 \| \| torch/nn/modules/linear.py \| 20 -> 14 \| \| torch/nn/modules/normalization.py \| 25 -> 16 \| \| torch/nn/modules/padding.py \| 33 -> 19 \| Pull Request resolved: https://github.com/pytorch/pytorch/pull/113260 Approved by: https://github.com/mikaylagawarecki	2023-11-10 18:22:48 +00:00
Edward Z. Yang	b4dbb02d46	Adjust _list_with_default to also work with SymInt input (#113073 ) Fixes https://github.com/pytorch/pytorch/issues/112496 Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/113073 Approved by: https://github.com/jbschlosser	2023-11-07 00:59:25 +00:00
Adrian Wälchli	157bda1bf0	Fix pydocstyle errors in torch/nn/module (#112674 ) Fixes #112601 ``` pydocstyle torch/nn/modules/module.py --count ``` On master: 115 After my changes on this PR: 8 The remaining 8 are due to missing docstrings in the magic methods: ``` torch/nn/modules/module.py:1 at module level: D100: Missing docstring in public module torch/nn/modules/module.py:1635 in public method `__getstate__`: D105: Missing docstring in magic method torch/nn/modules/module.py:1640 in public method `__setstate__`: D105: Missing docstring in magic method torch/nn/modules/module.py:1674 in public method `__getattr__`: D105: Missing docstring in magic method torch/nn/modules/module.py:1689 in public method `__setattr__`: D105: Missing docstring in magic method torch/nn/modules/module.py:1748 in public method `__delattr__`: D105: Missing docstring in magic method torch/nn/modules/module.py:2480 in public method `__repr__`: D105: Missing docstring in magic method torch/nn/modules/module.py:2505 in public method `__dir__`: D105: Missing docstring in magic method ``` Should I add them too? Happy to do it, I just wasn't sure if you wanted these documented. Please let me know. Pull Request resolved: https://github.com/pytorch/pytorch/pull/112674 Approved by: https://github.com/mikaylagawarecki	2023-11-02 20:40:56 +00:00
XiaobingSuper	395614c1a4	keep sync bn training flag same with converted bn's training flag (#111998 ) When converting bn to sync bn, we need to keep the sync bn's training flag the same as the original bn's flag. The motivation: if the given original model has the training flag set on some bn modules and not on others, converting to sync bn should not change this behavior. Pull Request resolved: https://github.com/pytorch/pytorch/pull/111998 Approved by: https://github.com/mikaylagawarecki	2023-10-26 08:18:08 +00:00
FFFrog	0e0f6a248d	Fix num_batches_tracked of BatchNorm when load_state_dict (#110850 ) Fixes #110361 As the title says. Pull Request resolved: https://github.com/pytorch/pytorch/pull/110850 Approved by: https://github.com/mikaylagawarecki	2023-10-24 04:20:38 +00:00
Federico Galatolo	d118531733	Use `\odot` everywhere instead of mixing `\odot` and `*` for the Hadamard product (#111763 ) This pull request addresses an inconsistency in the representation of the Hadamard product across PyTorch documentation. Currently, the notation varies among different modules: - In `torch.nn.LSTM` documentation the Hadamard product is represented with $\odot$ - In `torch.nn.GRU` documentation the Hadamard product is represented with $*$ - In `torch.nn.LSTMCell` documentation the Hadamard product is represented with $*$ - In `torch.nn.GRUCell` documentation the Hadamard product is represented with $*$ - In `torch.ao.nn.quantized.dynamic.GRU` documentation the Hadamard product is represented with $*$ This PR proposes consistently representing the Hadamard product throughout the documentation to enhance clarity and align with established standards. The notation $\odot$ will be uniformly adopted, following the convention in the [Deep Learning Book](https://www.deeplearningbook.org/contents/linear_algebra.html). Changes Made: - Modified `torch.nn.GRU` documentation to represent the Hadamard product with $\odot$ - Modified `torch.nn.LSTMCell` documentation to represent the Hadamard product with $\odot$ - Modified `torch.nn.GRUCell` documentation to represent the Hadamard product with $\odot$ - Modified `torch.ao.nn.quantized.dynamic.GRU` documentation to represent the Hadamard product with $\odot$ Pull Request resolved: https://github.com/pytorch/pytorch/pull/111763 Approved by: https://github.com/albanD	2023-10-22 21:01:35 +00:00
Aniket Patil	6f06832219	Fixed typo in activation.py (#111358 ) liner -> linear Pull Request resolved: https://github.com/pytorch/pytorch/pull/111358 Approved by: https://github.com/mikaylagawarecki	2023-10-16 20:36:55 +00:00
isdanni	382327bd0e	[BE] Enable Ruff's Flake8 PYI034 (#111105 ) Enable [non-self-return-type (PYI034)](https://docs.astral.sh/ruff/rules/non-self-return-type/#non-self-return-type-pyi034) Link: #110950 EDIT: to newly added reviewers, please ignore the request, it's due to a rebase error 😅 Pull Request resolved: https://github.com/pytorch/pytorch/pull/111105 Approved by: https://github.com/Skylion007	2023-10-13 21:19:53 +00:00
isdanni	6c7013a3dc	[Doc] Add weight dtype in torch.nn.CrossEntropyLoss (#110998 ) Fixes #101213 Pull Request resolved: https://github.com/pytorch/pytorch/pull/110998 Approved by: https://github.com/albanD	2023-10-11 19:52:13 +00:00
Sehoon Kim	c36b31d530	`torch::nn::AdaptiveLogSoftmaxWithLoss`: check length of `cutoffs` (#106777 ) Fixes #106698 Also added a check for python API, because current error message ``` Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/home/sehoon/pytorch-latest/torch/nn/modules/adaptive.py", line 128, in __init__ or (min(cutoffs) <= 0) \ ValueError: min() arg is an empty sequence ``` is not very comprehensible. Pull Request resolved: https://github.com/pytorch/pytorch/pull/106777 Approved by: https://github.com/albanD	2023-10-05 05:35:47 +00:00
FFFrog	bc3f0d341a	LazyBatchNorm{1-3}d support dict&set (#109015 ) Fixes #105292 As the title says: LazyBatchNorm doesn't support dict&set, so this makes it consistent with BatchNorm{1-3}d. Pull Request resolved: https://github.com/pytorch/pytorch/pull/109015 Approved by: https://github.com/mikaylagawarecki	2023-09-12 09:09:59 +00:00
FFFrog	003c5bb156	Add checks to `num_layers` for `RNN`, `LSTM`, `GRU` (#108853 ) Fixes #108223 As the title says. Pull Request resolved: https://github.com/pytorch/pytorch/pull/108853 Approved by: https://github.com/mikaylagawarecki	2023-09-09 19:33:52 +00:00
Randolf Scholz	8391e3fba4	fixed nn.Module.to type hint (#108767 ) Fixes #108675 - [x] adds `str` as option for `device` - [x] use `typing_extensions.Self` instead of `T`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/108767 Approved by: https://github.com/ezyang	2023-09-08 02:40:53 +00:00
Minh-Long Luu (刘明龙)	95f268e426	Add examples for `nn.CosineEmbeddingLoss` (#108215 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/108215 Approved by: https://github.com/mikaylagawarecki	2023-08-31 20:01:24 +00:00
hasteinmetz	b535ed2c1a	Update to RNN documentation (issue #106085 ) (#106222 ) Addresses [issue #106085](https://github.com/pytorch/pytorch/issues/106085). In `torch/nn/modules/rnn.py`: - Adds documentation string to RNNBase class. - Adds parameters to __init__ methods for RNN, LSTM, and GRU, classes. - Adds type annotations to __init__ methods for RNN, LSTM, and GRU. In `torch/ao/nn/quantized/dynamic/modules/rnn.py`: - Adds type specifications to `_FLOAT_MODULE` attributes in RNNBase, RNN, LSTM, and GRU classes. > This resolves a `mypy` assignment error `Incompatible types in assignment (expression has type "Type[LSTM]", base class "RNNBase" defined the type as "Type[RNNBase]")` that seemed to be a result of fully specified type annotations in `torch/nn/modules/rnn.py`). Pull Request resolved: https://github.com/pytorch/pytorch/pull/106222 Approved by: https://github.com/mikaylagawarecki	2023-08-31 00:50:32 +00:00
Mikayla Gawarecki	584a01b650	Fix LayerNorm(bias=False) error (#108060 ) Fixes #108048 - [ ] Cherry pick this [here](https://github.com/pytorch/pytorch/issues/108055) Pull Request resolved: https://github.com/pytorch/pytorch/pull/108060 Approved by: https://github.com/jbschlosser, https://github.com/albanD, https://github.com/malfet	2023-08-28 18:23:13 +00:00
dvorst	91a674ccd4	Fix docstring for shape of `target` for MultiLabelSoftMarginLoss (#107817 ) Fixes #92000 The documentation at https://pytorch.org/docs/stable/generated/torch.nn.MultiLabelSoftMarginLoss.html#multilabelsoftmarginloss states: > label targets padded by -1 ensuring same shape as the input. However, the shape of input and target tensor are compared, and an exception is raised if they differ in either dimension 0 or 1. Meaning the label targets are never padded. See the code snippet below and the resulting output. The documentation is therefore adjusted to: > label targets must have the same shape as the input. ``` import torch import torch.nn as nn # Create some example data input = torch.tensor( [ [0.8, 0.2, -0.5], [0.1, 0.9, 0.3], ] ) target1 = torch.tensor( [ [1, 0, 1], [0, 1, 1], [0, 1, 1], ] ) target2 = torch.tensor( [ [1, 0], [0, 1], ] ) target3 = torch.tensor( [ [1, 0, 1], [0, 1, 1], ] ) loss_func = nn.MultiLabelSoftMarginLoss() try: loss = loss_func(input, target1).item() except RuntimeError as e: print('target1 ', e) try: loss = loss_func(input, target2).item() except RuntimeError as e: print('target2 ', e) loss = loss_func(input, target3).item() print('target3 ', loss) ``` output: ``` target1 The size of tensor a (3) must match the size of tensor b (2) at non-singleton dimension 0 target2 The size of tensor a (2) must match the size of tensor b (3) at non-singleton dimension 1 target3 0.6305370926856995 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/107817 Approved by: https://github.com/mikaylagawarecki	2023-08-24 15:13:46 +00:00
Mikayla Gawarecki	48b1208e05	Disable nn.MHA fastpath for floating point masks (#107641 ) Fixes https://github.com/pytorch/pytorch/issues/107084 by disabling the fast path when floating point masks (which should be additive) are passed - [We claim in our docs for MHA that float masks will be added to the attention](https://pytorch.org/docs/stable/generated/torch.nn.MultiheadAttention.html) (be it `key_padding_mask` or `attn_mask`) - We always canonicalize any mask at the start of MHA in python by converting it to float - my understanding from Driss is that SDPA properly supports additive masking (but there are many special cases for mask shape for MHA that don't work properly currently (BxT, TxT) so [we're turning this off for now](https://github.com/pytorch/pytorch/blob/main/aten/src/ATen/native/transformers/cuda/attention.cu#L531-L532) - More broadly, the problem isn't with the SDPA path, but that things are broken for the path it falls back to - Right now mha "fast path" code with non-None masks is always going through [this path](https://github.com/pytorch/pytorch/blob/main/aten/src/ATen/native/transformers/cuda/attention.cu#L554-L640) that has a call to `masked_softmax` that [converts the masks back to bool](https://github.com/pytorch/pytorch/blob/main/aten/src/ATen/native/transformers/attention.cpp#L154-L156) - the implication here is that additive floating point attn_mask and additive key_padding_mask to nn.MHA fastpath are broken - This wasn't broken for the user in https://github.com/pytorch/pytorch/issues/107084 in 1.13.1 because of [this check which bypassed the fast path if attn_mask was defined](https://github.com/pytorch/pytorch/blob/v1.13.1/torch/nn/modules/activation.py#L1096-L1097) (as
Driss pointed out though additive key_padding_mask with the fast path were probably broken in 1.13.1) Pull Request resolved: https://github.com/pytorch/pytorch/pull/107641 Approved by: https://github.com/drisspg, https://github.com/jbschlosser	2023-08-23 15:08:18 +00:00
Aaron Gokaslan	660e8060ad	[BE]: Update ruff to 0.285 (#107519 ) This updates ruff to 0.285, which is faster, better, and fixes a bunch of false negatives with regard to f-strings. I also enabled RUF017, which looks for accidental quadratic list summation. Luckily, it seems there are no instances of it in our codebase, so I am enabling it so that it stays that way. :) Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519 Approved by: https://github.com/ezyang	2023-08-22 23:16:38 +00:00
PyTorch MergeBot	d59a6864fb	Revert "[BE]: Update ruff to 0.285 (#107519 )" This reverts commit `88ab3e4322`. Reverted https://github.com/pytorch/pytorch/pull/107519 on behalf of https://github.com/ZainRizvi due to Sorry, but this PR breaks internal tests. @ezyang, can you please help them get unblocked? It seems like one of the strings was probably accidentally modified ([comment](https://github.com/pytorch/pytorch/pull/107519#issuecomment-1688833480))	2023-08-22 19:53:32 +00:00
Aaron Gokaslan	88ab3e4322	[BE]: Update ruff to 0.285 (#107519 ) This updates ruff to 0.285, which is faster, better, and fixes a bunch of false negatives with regard to f-strings. I also enabled RUF017, which looks for accidental quadratic list summation. Luckily, it seems there are no instances of it in our codebase, so I am enabling it so that it stays that way. :) Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519 Approved by: https://github.com/ezyang	2023-08-20 01:36:18 +00:00
박종의 (PARK, Jongeui)	af0ed25ea8	Change >= in the GRU and the LSTM document to \ge (#107379 ) Change >= in the GRU document to \ge Pull Request resolved: https://github.com/pytorch/pytorch/pull/107379 Approved by: https://github.com/ezyang	2023-08-18 20:44:51 +00:00
FFFrog	2d2d43d9fb	add more check on LSTMCell (#107380 ) Just like #107223, the ``LSTMCell`` operator has the same problems as ``GRUCell``; this adds some checks and related tests to fix it. Pull Request resolved: https://github.com/pytorch/pytorch/pull/107380 Approved by: https://github.com/ezyang	2023-08-18 20:44:17 +00:00
FFFrog	a4229690e3	Add Some Checks about dim (#107223 ) Fixes #106769 As mentioned in [GRUCell](https://pytorch.org/docs/stable/generated/torch.nn.GRUCell.html#grucell), `hidden` should have the same dimension as `input`, and the dimension should be either `1D` or `2D`. As for other aspects, it has been verified in `C++`, such as the batch of `Input` and `hidden` are the same, `Input`'s Dim1 and `input_size` are the same, `hidden`'s Dim1 and `hidden_size` are the same, etc. Pull Request resolved: https://github.com/pytorch/pytorch/pull/107223 Approved by: https://github.com/albanD	2023-08-16 22:03:31 +00:00
Mikayla Gawarecki	b08b0c915f	[easy] Fix docs for sd calculation in BatchNorm1d/3d for consistency with BatchNorm2d (#107308 ) Fixes https://github.com/pytorch/pytorch/issues/100048 BatchNorm2d docs were updated in https://github.com/pytorch/pytorch/pull/97974. There have been a number of issues filed due to confusion about this so I think we should fix before branch cut Pull Request resolved: https://github.com/pytorch/pytorch/pull/107308 Approved by: https://github.com/albanD	2023-08-16 21:51:02 +00:00
Guang Yang	0b57581dec	[pytorch] Disable fast path in MultiheadAttention in Export (#106824 ) Summary: We are seeing `aten._native_multi_head_attention` op (not in core Aten op set) is left in the exported graph and causes problems in the downstream at runtime. Two proposed solutions: 1. Disable fast path while tracing to leverage the non-optimized path to get decomp, that way, the blamed op won't show up in the exported graph 2. Add a decomp rule for `aten._native_multi_head_attention` After discussing with kimishpatel and bdhirsh, #1 is preferred and verified it could immediately unblock the critical model enablement work for PP. Test Plan: CI Differential Revision: D48169806 Pull Request resolved: https://github.com/pytorch/pytorch/pull/106824 Approved by: https://github.com/kimishpatel	2023-08-10 00:18:37 +00:00
Jason Lu	bc88028e8e	Back out "Reland "Make adding buffers more like adding parameters (#104069 )" (#106224 )" (#106743 ) Summary: Original commit changeset: 81319beb97f3 Original Phabricator Diff: D47961182 Test Plan: revert to maintain backward compat with legacy ads_dper3 production package. Read details in: S357822 Reviewed By: atuljangra Differential Revision: D48131623 @diff-train-skip-merge (D48131623 landed internally) Pull Request resolved: https://github.com/pytorch/pytorch/pull/106743 Approved by: https://github.com/malfet	2023-08-08 15:27:34 +00:00
Mikayla Gawarecki	1317dbf176	Reland "Add nn.CircularPad{*}d for consistency + fix no_batch_dim support (#106148 )" (#106632 ) The previous one was reverted because the PR stacked under it, which added error-checking to Pad variants (https://github.com/pytorch/pytorch/pull/106147), was reverted, as internally some people pass 2D inputs to ZeroPad2d (which should actually take 3D or 4D inputs). According to my understanding, there wasn't actually anything this PR was breaking. Pull Request resolved: https://github.com/pytorch/pytorch/pull/106632 Approved by: https://github.com/albanD	2023-08-07 20:10:25 +00:00
Mikayla Gawarecki	786977c647	[easy] Add reset_parameters for nn.PRelu (#106507 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/106507 Approved by: https://github.com/albanD	2023-08-04 23:22:42 +00:00
Michael Gschwind	63d45275f4	is causal hints for transformer (#106143 ) Summary: make is_causal hint flags available for the top level transformer module. It's debatable whether this is useful -- at present we autodetect causal masks for src and tgt masks in transformer encoder and decoder, respectively. Having is_causal flags available would enable users to short-cut this check by asserting whether their mask is causal or not. I am putting this diff up for discussion, not as a solution. Not doing anything may be the right solution, unless there is strong (data-driven) user demand. -- it appears the consensus is to move ahead with this, as per discussions below. @cpuhrsch @mikaylagawarecki @jbschlosser @janEbert Test Plan: sandcastle Differential Revision: D47373260 Pull Request resolved: https://github.com/pytorch/pytorch/pull/106143 Approved by: https://github.com/mikaylagawarecki	2023-08-04 14:16:48 +00:00
Michael Gschwind	3db255020b	Clarify the clarification (#106358 ) Summary: Clarify the clarification Differential Revision: D47941982 Pull Request resolved: https://github.com/pytorch/pytorch/pull/106358 Approved by: https://github.com/mikaylagawarecki	2023-08-03 16:58:36 +00:00
PyTorch MergeBot	dfcfd5cedb	Revert "Add nn.CircularPad{*}d for consistency + fix no_batch_dim support (#106148 )" This reverts commit `87d2536971`. Reverted https://github.com/pytorch/pytorch/pull/106148 on behalf of https://github.com/malfet due to Reverting as dependent PR https://github.com/pytorch/pytorch/pull/106147 was reverted as well ([comment](https://github.com/pytorch/pytorch/pull/106148#issuecomment-1662344543))	2023-08-02 14:46:00 +00:00
PyTorch MergeBot	d83b887f2a	Revert "Add error checking for padding modules (#106147 )" This reverts commit `0547b6279d`. Reverted https://github.com/pytorch/pytorch/pull/106147 on behalf of https://github.com/jeanschmidt due to sadly it is breaking internal builds, and I can't coordinate a FF due to timezone differences ([comment](https://github.com/pytorch/pytorch/pull/106147#issuecomment-1661870970))	2023-08-02 09:37:40 +00:00
Danni Li	5e3aca6c5c	[BE] Input check for torch.nn.MultiheadAttention (#106363 ) Summary: Check `embed_dim` and `num_heads` of `torch.nn.MultiheadAttention`. Test Plan: Please see GitHub Actions. Differential Revision: D47943134 Fix: #105630 Pull Request resolved: https://github.com/pytorch/pytorch/pull/106363 Approved by: https://github.com/mikaylagawarecki	2023-08-01 23:28:23 +00:00
Danni Li	1a6f1d816d	[Doc] Add proj_size < hidden_size in LSTM (#106364 ) Summary: Add parameter constraint: `proj_size` has to be smaller than `hidden_size` in RNNBase doc. Ref: `ceea08a986/torch/nn/modules/rnn.py (L83)` `ceea08a986/torch/nn/modules/rnn.py (L458)` Test Plan: Please see GitHub Actions. Differential Revision: D47943365 Fix: #105628 Pull Request resolved: https://github.com/pytorch/pytorch/pull/106364 Approved by: https://github.com/mikaylagawarecki	2023-08-01 18:58:27 +00:00
Mikayla Gawarecki	87d2536971	Add nn.CircularPad{*}d for consistency + fix no_batch_dim support (#106148 ) Fixes #105749 https://github.com/pytorch/pytorch/issues/95320 (tldr is that input should always be `[N, C, H, (W, D)]` where only H, W and D dimensions get circular padding, so the 2D case where user wants both dimensions to be padded --> they should `.unsqueeze(0)` (as is the case for `Reflection/ReplicationPad`) but we didn't document this for circular padding. [This seems to be the old docstring](`277b05014a/torch/nn/functional.py (L4689)`) that was somehow lost.) Fixes no_batch_dim support https://github.com/pytorch/pytorch/issues/104860 - Adds missing documentation for circular padding - Adds missing CircularPad modules - Migrates legacy test_nn tests from circular padding to ModuleInfo - Adds no_batch_dim support + sample inputs that test this Pull Request resolved: https://github.com/pytorch/pytorch/pull/106148 Approved by: https://github.com/albanD ghstack dependencies: #106325, #106147	2023-08-01 12:49:58 +00:00
Mikayla Gawarecki	0547b6279d	Add error checking for padding modules (#106147 ) Fixes https://github.com/pytorch/pytorch/issues/105627 Pull Request resolved: https://github.com/pytorch/pytorch/pull/106147 Approved by: https://github.com/albanD ghstack dependencies: #106325	2023-08-01 12:49:58 +00:00
Mikayla Gawarecki	d8e5f2aa6d	Reland "Make adding buffers more like adding parameters (#104069 )" (#106224 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/106224 Approved by: https://github.com/atalman, https://github.com/albanD	2023-07-31 17:18:56 +00:00
chunyuan	cb6c3cbc91	inductor: enable weight prepack for LSTM (#103071 ) - Enabled LSTM weight prepack in inductor. - Added a mkldnn decomposition for lstm which won't change for different `seq_lens`. With the previous decomposition, for dynamic shapes use case where `seq_lens` changes, the graph will be different. - Extended several inductor utility functions to support `List(Tensor)` as input. Previously those functions only supported `Tensor` input. Update 2023-07-26: - https://github.com/pytorch/pytorch/pull/103851 has moved CPU weight packing to be after AOTAutograd. Fixed the support in this PR to follow the same way (mainly in `3b207f7f1c (diff-6dffed1ade0ba3e887f9a4eafa3bfcec267ab2365b8adcb91bd391f49b3fd2e3)`). LSTM is decomposed in `aten.mkldnn_rnn_layer` by layer and by direction. The weight prepack is done at the `mkldnn_rnn_layer` level. - Add a fix in rnn `__get_state__` function in case we need to recompile an `LSTM` module. When compiling the module, the weights tensors which are the `named_parameters` of the module are converted to `functional_tensor` here: `76fb72e24a/torch/nn/utils/stateless.py (L125-L128)` The forward function of LSTM will be called: `76fb72e24a/torch/_functorch/aot_autograd.py (L3379-L3381)` In the forward function, the `_flat_weights` are updated to be the same as the weights, thus becoming `functional_tensor`: `76fb72e24a/torch/nn/modules/rnn.py (L775-L778)` The weights tensors are converted back to the original tensors (which are not `functional_tensor` anymore) before exiting the `_reparametrize_module` context here: `76fb72e24a/torch/nn/utils/stateless.py (L130-L142)` But since `_flat_weights` is not in the `named_parameters` of the module, it's still `functional_tensor` ([link of the parameters that will be converted to functional and reverted back](`76fb72e24a/torch/_functorch/aot_autograd.py (L3695-L3698)`)). 
At this moment, if we need to recompile the model, `deepcopy` will be called: `76fb72e24a/torch/_dynamo/utils.py (L915-L917)` And it will report `UnImplemented` since we have `functional_tensor` (`_flat_weights`) and will trigger graph break which is not what we expect: `76fb72e24a/torch/_subclasses/meta_utils.py (L514)` Added a fix in the `__get_state__` to update the `_flat_weights` if ever weights have changed to fix this issue. The fix is covered in the `test_lstm_packed` UT. Pull Request resolved: https://github.com/pytorch/pytorch/pull/103071 Approved by: https://github.com/jgong5, https://github.com/jansel	2023-07-28 13:54:32 +00:00
Michael Gschwind	723bc136a1	Add context for warning about batch_first (#106139 ) Summary: Add context for warning about batch_first Test Plan: sandcastle github Differential Revision: D47809651 Pull Request resolved: https://github.com/pytorch/pytorch/pull/106139 Approved by: https://github.com/mikaylagawarecki	2023-07-27 23:02:05 +00:00
Mikayla Gawarecki	ca7ece9b50	[easy] improve hint on error message in nn.Module.load_state_dict (#106042 ) Fix #105963 Pull Request resolved: https://github.com/pytorch/pytorch/pull/106042 Approved by: https://github.com/albanD	2023-07-27 19:56:02 +00:00

1433 Commits