This updates ruff to 0.285, which is faster, better, and fixes a bunch of false negatives with regard to f-strings.
I also enabled RUF017, which looks for accidental quadratic list summation. Luckily, it seems there are no instances of it in our codebase, so I am enabling the rule to keep it that way. :)
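For reference, the pattern RUF017 flags and the linear alternatives it points to look roughly like this (illustrative snippet, not code from our tree):
```python
import functools
import itertools
import operator

lists = [[i] * 3 for i in range(1000)]

# Quadratic: sum() concatenates with +, copying the growing accumulator at every step.
flat_slow = sum(lists, [])

# Linear alternatives: chain the sublists, or reduce with in-place +=.
flat_chain = list(itertools.chain.from_iterable(lists))
flat_reduce = functools.reduce(operator.iadd, lists, [])

assert flat_slow == flat_chain == flat_reduce
```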
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519
Approved by: https://github.com/ezyang
This PR re-lands
- [Typing] Fix PEP 484 Violation (#105022)
- Update mypy to 1.4.1 (#91983)
which were reverted due to a conflict with the internal source repo.
Mostly fixes for PEP 484 violations (i.e., cases where a default arg is set to None but the type is not annotated as Optional).
Plus a few real fixes:
- Add missing `_get_upgraders_entry_map` to `torch/_C/__init__.pyi`
- Add missing return statement to `torch._export.deserialize_graph`
- Fix error message in `torch.ao.ns.fx.weight_utils.get_lstm_mod_weights`
- Add an assert in `torch/optim/optimizer.py` that the Optional list is not None
TODO (in followup PR):
- Fix erroneous `isinstance` check in `torch/ao/quantization/_pt2e/qat_utils.py`
Unrelated: to bypass CI failures due to the gcc9 dependency update in Ubuntu-18.04:
- Add a hack to `.ci/docker/install_conda.sh` that squashes the older libstdc++ from the conda environment in favor of the one from the OS
- Update bazel CUDA builds to focal, as with libstdc++-6.0.32 bazel builds lose the ability to catch exceptions (probably because they link cupti statically, but I could not find where that is done)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105227
Approved by: https://github.com/atalman, https://github.com/albanD, https://github.com/Skylion007
In preparation for adopting future rocBLAS library options, it is necessary to track when the backward pass of training is executing. The scope-based helper class `BackwardPassGuard` is provided to toggle this state.
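A rough Python analogue of the scope-based pattern (only an illustration; the actual `BackwardPassGuard` is a C++ helper, and the names below are made up):
```python
import threading

_state = threading.local()  # hypothetical thread-local flag, for illustration only

class BackwardPassGuardSketch:
    """Set a flag for the duration of a scope and restore it on exit."""
    def __enter__(self):
        self._previous = getattr(_state, "in_backward", False)
        _state.in_backward = True
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        _state.in_backward = self._previous

def is_backward_pass():
    return getattr(_state, "in_backward", False)

with BackwardPassGuardSketch():
    assert is_backward_pass()   # flag is set inside the scope
assert not is_backward_pass()   # and restored when the scope exits
```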
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71881
Approved by: https://github.com/albanD
Summary:
Fixes https://github.com/pytorch/pytorch/issues/55741 by adding a comment regarding the behavior of `kaiming_uniform_`
The docstring is correct in this case. For example:
```python
import math
import matplotlib.pyplot as plt
import torch
import torch.nn as nn
in_channels = 120
groups = 2
kernel = (3, 8)
m = nn.Conv2d(in_channels=in_channels, groups=groups,
              out_channels=100, kernel_size=kernel)
k = math.sqrt(groups / (in_channels * math.prod(kernel)))
print(f"k: {k:0.6f}")
print(f"min weight: {m.weight.min().item():0.6f}")
print(f"max weight: {m.weight.max().item():0.6f}")
```
outputs:
```
k: 0.026352
min weight: -0.026352
max weight: 0.026352
```
And when we plot the distribution, it is uniform with the correct bounds:
```python
_ = plt.hist(m.weight.detach().numpy().ravel())
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/58931
Reviewed By: anjali411
Differential Revision: D28689863
Pulled By: jbschlosser
fbshipit-source-id: 98eebf265dfdaceed91f1991fc4b1592c0b3cf37
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45667
First part of #3867 (Pooling operators still to do)
This adds a `padding='same'` mode to the interface of `conv{n}d` and `nn.Conv{n}d`. This should match the behaviour of `tensorflow`. I couldn't find it explicitly documented, but through experimentation I found that `tensorflow` returns an output of shape `ceil(len/stride)` and always adds any extra asymmetric padding onto the right side of the input.
Since the `native_functions.yaml` schema doesn't seem to support strings or enums, I've moved the function interface into Python, and it now dispatches between the numerically padded `conv{n}d` and the `_conv{n}d_same` variant. Underscores because I couldn't see any way to avoid exporting a function into the `torch` namespace.
A note on asymmetric padding: the total padding required can be odd if the kernel length is even and the dilation is odd. mkldnn has native support for asymmetric padding, so there is no overhead there, but for other backends I resort to padding the input tensor by 1 on the right-hand side to make the remaining padding symmetrical. In these cases, I use `TORCH_WARN_ONCE` to notify the user of the performance implications.
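As a quick illustration with the interface as it now ships in released PyTorch (note that `padding='same'` is only accepted for stride 1), the 1-d case with an even kernel, where the extra padding lands on the right:
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(1, 4, 10)  # (N, C_in, L_in)

# Even kernel => total padding of 3 is odd, but the output length still matches the input.
conv = nn.Conv1d(in_channels=4, out_channels=8, kernel_size=4, padding='same')
print(conv(x).shape)  # torch.Size([1, 8, 10])

# Equivalent explicit asymmetric padding for kernel_size=4, dilation=1:
# 1 column on the left, 2 on the right, then an unpadded convolution.
conv_explicit = nn.Conv1d(4, 8, kernel_size=4, padding=0)
print(conv_explicit(F.pad(x, (1, 2))).shape)  # torch.Size([1, 8, 10])
```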
Test Plan: Imported from OSS
Reviewed By: ejguan
Differential Revision: D27170744
Pulled By: jbschlosser
fbshipit-source-id: b3d8a0380e0787ae781f2e5d8ee365a7bfd49f22
Summary:
Fixes https://github.com/pytorch/pytorch/issues/50002
The last commit adds tests for 3d conv with the `SubModelFusion` and `SubModelWithoutFusion` classes.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50003
Reviewed By: mrshenli
Differential Revision: D26325953
Pulled By: jerryzh168
fbshipit-source-id: 7406dd2721c0c4df477044d1b54a6c5e128a9034
Summary:
Fixes https://github.com/pytorch/pytorch/issues/53366
gchanan albanD
Thanks for the feedback. Did a first pass trying to address the concerns in the original issue.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53495
Reviewed By: mrshenli
Differential Revision: D26914768
Pulled By: albanD
fbshipit-source-id: fa049f1952ef05598f0da2abead9a5a5d3602f75
Summary:
Context: https://github.com/pytorch/pytorch/pull/53299#discussion_r587882857
These are the only hand-written parts of this diff:
- the addition to `.github/workflows/lint.yml`
- the file endings changed in these four files (to appease FB-internal land-blocking lints):
  - `GLOSSARY.md`
  - `aten/src/ATen/core/op_registration/README.md`
  - `scripts/README.md`
  - `torch/csrc/jit/codegen/fuser/README.md`
The rest was generated by running this command (on macOS):
```
git grep -I -l ' $' -- . ':(exclude)**/contrib/**' ':(exclude)third_party' | xargs gsed -i 's/ *$//'
```
I looked over the auto-generated changes and didn't see anything that looked problematic.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53406
Test Plan:
This run (after adding the lint but before removing existing trailing spaces) failed:
- https://github.com/pytorch/pytorch/runs/2043032377
This run (on the tip of this PR) succeeded:
- https://github.com/pytorch/pytorch/runs/2043296348
Reviewed By: walterddr, seemethere
Differential Revision: D26856620
Pulled By: samestep
fbshipit-source-id: 3f0de7f7c2e4b0f1c089eac9b5085a58dd7e0d97
Summary:
Some minor improvements for the lazy modules introduced in https://github.com/pytorch/pytorch/issues/44538, https://github.com/pytorch/pytorch/issues/47350 and https://github.com/pytorch/pytorch/issues/51548.
This PR mainly turns the bias into an `UninitializedParameter`, and instead of creating empty tensors like
```python
self.bias = Parameter(torch.Tensor(0))
self.bias = UninitializedParameter()
```
I think it would be better to do
```python
self.register_parameter('bias', None)
self.bias = UninitializedParameter()
```
In addition, I changed the constructor of `LazyBatchNorm` from
```python
self.running_mean = UninitializedBuffer()
```
to
```python
self.register_buffer('running_mean', UninitializedBuffer())
```
as the original one does not register the buffer in the underlying `self._buffers`.
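A minimal check of that difference (hypothetical toy modules, only to show why `register_buffer` is needed):
```python
import torch.nn as nn
from torch.nn.parameter import UninitializedBuffer

class PlainAssign(nn.Module):
    def __init__(self):
        super().__init__()
        # Plain attribute assignment: the buffer never enters self._buffers.
        self.running_mean = UninitializedBuffer()

class Registered(nn.Module):
    def __init__(self):
        super().__init__()
        # register_buffer puts the uninitialized buffer into self._buffers.
        self.register_buffer('running_mean', UninitializedBuffer())

print('running_mean' in dict(PlainAssign().named_buffers()))  # False
print('running_mean' in dict(Registered().named_buffers()))   # True
```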
Thank you for your time reviewing this PR :).
Gently ping albanD, mruberry
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52212
Reviewed By: jbschlosser
Differential Revision: D26504508
Pulled By: albanD
fbshipit-source-id: 7094d0bb4fa9e2a40a07b79d350ea12a6ebfd080
Summary:
Fixes https://github.com/pytorch/pytorch/issues/49064 by using raw strings
I removed `# noqa: W605` because that's the "invalid escape sequence" check: https://www.flake8rules.com/rules/W605.html
I wrote a quick test to make sure the strings are the same before and after this PR. This block should print `True` (it does for me).
```
convolution_notes1 = \
{"groups_note": r"""* :attr:`groups` controls the connections between inputs and outputs.
:attr:`in_channels` and :attr:`out_channels` must both be divisible by
:attr:`groups`. For example,
* At groups=1, all inputs are convolved to all outputs.
* At groups=2, the operation becomes equivalent to having two conv
layers side by side, each seeing half the input channels
and producing half the output channels, and both subsequently
concatenated.
* At groups= :attr:`in_channels`, each input channel is convolved with
its own set of filters (of size
:math:`\frac{\text{out\_channels}}{\text{in\_channels}}`).""",
"depthwise_separable_note": r"""When `groups == in_channels` and `out_channels == K * in_channels`,
where `K` is a positive integer, this operation is also known as a "depthwise convolution".
In other words, for an input of size :math:`(N, C_{in}, L_{in})`,
a depthwise convolution with a depthwise multiplier `K` can be performed with the arguments
:math:`(C_\text{in}=C_\text{in}, C_\text{out}=C_\text{in} \times \text{K}, ..., \text{groups}=C_\text{in})`."""} # noqa: B950
convolution_notes2 = \
{"groups_note": """* :attr:`groups` controls the connections between inputs and outputs.
:attr:`in_channels` and :attr:`out_channels` must both be divisible by
:attr:`groups`. For example,
* At groups=1, all inputs are convolved to all outputs.
* At groups=2, the operation becomes equivalent to having two conv
layers side by side, each seeing half the input channels
and producing half the output channels, and both subsequently
concatenated.
* At groups= :attr:`in_channels`, each input channel is convolved with
its own set of filters (of size
:math:`\\frac{\\text{out\_channels}}{\\text{in\_channels}}`).""", # noqa: W605
"depthwise_separable_note": """When `groups == in_channels` and `out_channels == K * in_channels`,
where `K` is a positive integer, this operation is also known as a "depthwise convolution".
In other words, for an input of size :math:`(N, C_{in}, L_{in})`,
a depthwise convolution with a depthwise multiplier `K` can be performed with the arguments
:math:`(C_\\text{in}=C_\\text{in}, C_\\text{out}=C_\\text{in} \\times \\text{K}, ..., \\text{groups}=C_\\text{in})`."""} # noqa: W605,B950
print(convolution_notes1 == convolution_notes2)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49065
Reviewed By: agolynski
Differential Revision: D25464507
Pulled By: H-Huang
fbshipit-source-id: 88a65a24e3cc29774af25e09823257b2136550fe
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48661
Previously, `_conv_forward` would add `self.bias` to the result, so the bias was added twice in the QAT ConvBn module.
This PR adds a bias argument to `_conv_forward`, and `_conv_forward` is now called with a zero bias in the ConvBn module.
Fixes: https://github.com/pytorch/pytorch/issues/48514
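A rough sketch of the resulting call pattern (a hypothetical subclass for illustration, not the actual QAT ConvBn code):
```python
import torch
import torch.nn as nn

class ConvNoDoubleBias(nn.Conv2d):
    def forward(self, input):
        # Run the convolution with a zero bias so self.bias is not baked into the output here.
        zero_bias = torch.zeros_like(self.bias)
        out = self._conv_forward(input, self.weight, zero_bias)
        # ... batch-norm style scaling of `out` would happen here ...
        # Add the real bias exactly once at the end.
        return out + self.bias.reshape(1, -1, 1, 1)

m = ConvNoDoubleBias(3, 16, kernel_size=3)
print(m(torch.randn(2, 3, 8, 8)).shape)  # torch.Size([2, 16, 6, 6])
```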
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D25249175
fbshipit-source-id: 4536c7545d3dcd7e8ea254368ffb7cf15118d78c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48593
Previously, `_conv_forward` would add `self.bias` to the result, so the bias was added twice in the QAT ConvBn module.
This PR adds a bias argument to `_conv_forward`, and `_conv_forward` is now called with a zero bias in the ConvBn module.
Fixes: https://github.com/pytorch/pytorch/issues/48514
Test Plan: Imported from OSS
Reviewed By: raghuramank100
Differential Revision: D25222215
fbshipit-source-id: 90c0ab79835b6d09622dcfec9de4139881a60746