Attempts to fix #92656
BC-breaking! This changes the default of `zero_grad` in `optim` and in `nn` to set grads to `None` instead of zero tensors. We are changing the default because there are proven perf wins and existing code has typically not regressed due to this change.
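A minimal sketch of the behavior change, assuming the existing `set_to_none` keyword (which previously defaulted to `False`):
```python
import torch
from torch import nn, optim

model = nn.Linear(2, 2)
opt = optim.SGD(model.parameters(), lr=0.1)

model(torch.randn(1, 2)).sum().backward()
opt.zero_grad()            # now equivalent to zero_grad(set_to_none=True)
print(model.weight.grad)   # None instead of a zero tensor

opt.zero_grad(set_to_none=False)  # opts back into the old zeroing behavior
```
Leaving grads as `None` skips the kernel launches needed to zero them, and the next backward pass can assign fresh gradient tensors rather than accumulating into zeroed ones, which is where the perf win comes from.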
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92731
Approved by: https://github.com/ngimel
Fixes #52664. Checks if the attribute is a property that defines a setter and uses `fset` in `__setattr__` rather than registering an inaccessible module / parameter.
This is BC-breaking: attribute setters on `nn.Module` properties used to be ignored and will now be called properly.
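A minimal sketch of the newly supported pattern (the module and property here are illustrative):
```python
import torch
from torch import nn

class MyModule(nn.Module):
    def __init__(self):
        super().__init__()
        self._weight = nn.Parameter(torch.ones(3))

    @property
    def weight(self):
        return self._weight

    @weight.setter
    def weight(self, value):
        self._weight = value

m = MyModule()
# Before this change, assigning a Parameter here registered a new,
# inaccessible parameter named "weight" (shadowed by the property getter);
# now the property's setter is called instead.
m.weight = nn.Parameter(torch.zeros(3))
assert torch.equal(m.weight, torch.zeros(3))
```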
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92044
Approved by: https://github.com/albanD
Fixes #91654.
Currently, the `hook` parameters of `nn.Module.register_forward_pre_hook` and `nn.Module.register_forward_hook` are typed as `Callable[..., None]`, which 1) does not enable validation of the signature of `hook` and 2) incorrectly restricts the return type of `hook`, which the docstrings of these methods themselves state can be non-`None`.
Typing the first parameter of `hook` as `TypeVar("T", bound="Module")` allows binding a `Callable` whose first parameter is a subclass of `Module`.
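For illustration, a simplified sketch of the kind of `hook` typing this enables (the exact `Union` below is illustrative, not the PR's verbatim annotation):
```python
from typing import Any, Callable, Dict, Optional, Tuple, TypeVar, Union

T = TypeVar("T", bound="Module")

# hook(module, args) -> None or modified input
# hook(module, args, kwargs) -> None or (modified args, modified kwargs)
ForwardPreHook = Union[
    Callable[[T, Tuple[Any, ...]], Optional[Any]],
    Callable[
        [T, Tuple[Any, ...], Dict[str, Any]],
        Optional[Tuple[Any, Dict[str, Any]]],
    ],
]
```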
---
Here are some examples of:
1. forward hooks and forward pre-hooks being accepted by mypy according to the new type hints
2. mypy throwing errors due to incorrect `hook` signatures
3. false negatives of pre-hooks being accepted as forward hooks
4. false negatives of hooks with kwargs being accepted irrespective of the value provided for `with_kwargs`
```python
from typing import Any, Dict, Tuple
import torch
from torch import nn
def forward_pre_hook(
module: nn.Linear,
args: Tuple[torch.Tensor, ...],
) -> None:
...
def forward_pre_hook_return_input(
module: nn.Linear,
args: Tuple[torch.Tensor, ...],
) -> Tuple[torch.Tensor, ...]:
...
def forward_pre_hook_with_kwargs(
module: nn.Linear,
args: Tuple[torch.Tensor, ...],
kwargs: Dict[str, Any],
) -> None:
...
def forward_pre_hook_with_kwargs_return_input(
module: nn.Linear,
args: Tuple[torch.Tensor, ...],
kwargs: Dict[str, Any],
) -> Tuple[Tuple[torch.Tensor, ...], Dict[str, Any]]:
...
def forward_hook(
module: nn.Linear,
args: Tuple[torch.Tensor, ...],
output: torch.Tensor,
) -> None:
...
def forward_hook_return_output(
module: nn.Linear,
args: Tuple[torch.Tensor, ...],
output: torch.Tensor,
) -> torch.Tensor:
...
def forward_hook_with_kwargs(
module: nn.Linear,
args: Tuple[torch.Tensor, ...],
kwargs: Dict[str, Any],
output: torch.Tensor,
) -> None:
...
def forward_hook_with_kwargs_return_output(
module: nn.Linear,
args: Tuple[torch.Tensor, ...],
kwargs: Dict[str, Any],
output: torch.Tensor,
) -> torch.Tensor:
...
model = nn.Module()
# OK
model.register_forward_pre_hook(forward_pre_hook)
model.register_forward_pre_hook(forward_pre_hook_return_input)
model.register_forward_pre_hook(forward_pre_hook_with_kwargs, with_kwargs=True)
model.register_forward_pre_hook(forward_pre_hook_with_kwargs_return_input, with_kwargs=True)
model.register_forward_hook(forward_hook)
model.register_forward_hook(forward_hook_return_output)
model.register_forward_hook(forward_hook_with_kwargs, with_kwargs=True)
model.register_forward_hook(forward_hook_with_kwargs_return_output, with_kwargs=True)
# mypy(error): [arg-type]
model.register_forward_pre_hook(forward_hook)
model.register_forward_pre_hook(forward_hook_return_output)
model.register_forward_pre_hook(forward_hook_with_kwargs)
model.register_forward_pre_hook(forward_hook_with_kwargs_return_output)
model.register_forward_hook(forward_pre_hook)
model.register_forward_hook(forward_pre_hook_return_input)
# false negatives
model.register_forward_hook(forward_pre_hook_with_kwargs)
model.register_forward_hook(forward_pre_hook_with_kwargs_return_input)
model.register_forward_pre_hook(forward_pre_hook_with_kwargs, with_kwargs=False)
model.register_forward_pre_hook(forward_pre_hook_with_kwargs_return_input, with_kwargs=False)
```
---
Though it is not functional as of mypy 0.991, the ideal typing of these methods would use [`typing.Literal`](https://mypy.readthedocs.io/en/stable/literal_types.html#literal-types):
```python
T = TypeVar("T", bound="Module")
class Module:
@overload
def register_forward_hook(
self,
hook: Callable[[T, Tuple[Any, ...], Any], Optional[Any]],
*,
prepend: bool = ...,
with_kwargs: Literal[False] = ...,
) -> RemovableHandle:
...
@overload
def register_forward_hook(
self,
hook: Callable[[T, Tuple[Any, ...], Dict[str, Any], Any], Optional[Any]],
*,
prepend: bool = ...,
with_kwargs: Literal[True] = ...,
) -> RemovableHandle:
...
def register_forward_hook(...):
...
```
which would:
1. validate the signature of `hook` according to the corresponding literal value provided for `with_kwargs` (and fix the false negative examples above)
2. implicitly define the [fallback `bool` signature](https://github.com/python/mypy/issues/6113#issuecomment-1266186192), e.g. to handle the case where a non-literal `bool` is provided for `with_kwargs`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92061
Approved by: https://github.com/albanD
Closes #35643
This PR is mostly borrowed from #82042. Thanks @Padarn for implementing
the first version and debugging the errors.
Based on the discussion in #82042, this PR adds a `with_kwargs`
argument to the `register_forward_pre_hook` and `register_forward_hook`
methods. When the argument is set to `True`, the provided hook must accept
keyword arguments. Under the hood, this PR adds
`_forward_pre_hooks_with_kwargs` and `_forward_hooks_with_kwargs`
to keep track of which hooks accept kwargs.
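A minimal usage sketch of the new argument (the toy module is illustrative):
```python
import torch
from torch import nn

def forward_hook(module, args, kwargs, output):
    # with with_kwargs=True, the hook also receives the forward's kwargs
    print(f"args={len(args)}, kwargs={list(kwargs)}")
    return output

class Concat(nn.Module):
    def forward(self, x, *, y):
        return torch.cat([x, y])

m = Concat()
handle = m.register_forward_hook(forward_hook, with_kwargs=True)
m(torch.ones(2), y=torch.zeros(2))  # prints: args=1, kwargs=['y']
handle.remove()
```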
Differential Revision: [D41431111](https://our.internmc.facebook.com/intern/diff/D41431111)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89389
Approved by: https://github.com/soulitzer
There is a bug in the implementation of the registration hooks introduced in https://github.com/pytorch/pytorch/pull/86148 whereby, if the hook returns a tensor, the short-circuiting logic:
```python
value = hook(self, name, value) or value
```
raises an exception:
```
RuntimeError: Boolean value of Tensor with more than one value is ambiguous
```
Fixing the logic so that it only checks whether the returned value is `None` before overriding.
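A sketch of the corrected check (surrounding code simplified):
```python
out = hook(self, name, value)
if out is not None:  # `or` would call bool() on a returned Tensor
    value = out
```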
Fixes #85837
CC: @albanD @jbschlosser
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87369
Approved by: https://github.com/albanD
As described in the issue, this PR adds hooks to be run when `register_parameter`, `register_buffer` and `register_module` are called.
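A minimal usage sketch, assuming the global entry points exposed on `torch.nn.modules.module`:
```python
import torch
from torch import nn

def log_param(module, name, param):
    print(f"registering parameter {name!r} on {type(module).__name__}")
    return None  # returning None leaves `param` unchanged

handle = nn.modules.module.register_module_parameter_registration_hook(log_param)
layer = nn.Linear(2, 2)  # fires the hook for 'weight' and for 'bias'
handle.remove()
```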
Fixes #85837
cc @albanD @mruberry @jbschlosser @walterddr @kshitij12345 @saketh-are
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86148
Approved by: https://github.com/albanD
Fixes #61398
The original implementation is very slow when `state_dict` contains many keys. This PR passes only the relevant keys to each child module.
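The idea, sketched with a hypothetical helper (the actual change lives inside `load_state_dict`'s recursion):
```python
# Hand each child only the keys under its own prefix instead of the full
# dict, so the recursion no longer rescans every key at every level.
def filter_state_dict(state_dict, prefix):
    return {k: v for k, v in state_dict.items() if k.startswith(prefix)}
```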
Existing tests pass: `pytest test/test_nn.py -k state_dict`
I couldn't figure out a good way to write a new test for this behavior. I had a snippet, but it would be flaky if integrated into the main CI because it's a timing-based check.
However, I can verify that the test below took 30s to run before this PR and only 0.5s after it.
```python
def test_load_state_dict_large(self):
    import copy
    import time
    # 4 levels of nesting with 10 copies each: 10**4 Linear modules,
    # i.e. 20k entries (weight + bias) in the state_dict
    base_module = nn.Linear(1, 1)
    model = base_module
    for _ in range(4):
        model = nn.Sequential(*[copy.deepcopy(model) for _ in range(10)])
    state_dict = model.state_dict()
    self.assertEqual(len(state_dict), 20000)
    st = time.time()
    model.load_state_dict(state_dict, strict=True)
    strict_load_time = time.time() - st
    # ~0.5s after this PR (~30s before); use a loose bound to avoid flakiness
    self.assertLess(strict_load_time, 10)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85743
Approved by: https://github.com/albanD
This is a new version of #15648 based on the latest master branch.
Unlike the previous PR where I fixed a lot of the doctests in addition to integrating xdoctest, I'm going to reduce the scope here. I'm simply going to integrate xdoctest, and then I'm going to mark all of the failing tests as "SKIP". This will let xdoctest run on the dashboards, provide some value, and still let the dashboards pass. I'll leave fixing the doctests themselves to another PR.
In my initial commit, I do the bare minimum to get something running, with failing dashboards. The few tests that I marked as skip were causing segfaults. Running xdoctest results in 293 failed and 201 passed tests. The next commits will disable those failing tests. (Unfortunately, I don't have a tool that will insert the `#xdoctest: +SKIP` directive over every failing test, so I'm going to do this mostly manually.)
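For reference, a sketch of what marking a failing doctest looks like with the xdoctest directive:
```python
def add_one(x):
    """
    Example:
        >>> # xdoctest: +SKIP
        >>> add_one(1)
        2
    """
    return x + 1
```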
Fixes https://github.com/pytorch/pytorch/issues/71105
@ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82797
Approved by: https://github.com/ezyang
Module's overridden `__setattr__` has special handling for parameters, submodules, and buffers, resulting in reduced performance for its default attributes (which are not of those types). Setting them directly results in a significant improvement in module construction speed (~10x).
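A minimal sketch of the idea (the attribute names mirror a few of `Module`'s real defaults, but treat the snippet as illustrative):
```python
class Module:
    def __setattr__(self, name, value):
        ...  # (simplified) dispatch for Parameters, submodules, and buffers

    def __init__(self):
        # object.__setattr__ bypasses the overridden __setattr__ and its
        # dispatch for default attributes that are never of those types
        object.__setattr__(self, "training", True)
        object.__setattr__(self, "_parameters", {})
        object.__setattr__(self, "_buffers", {})
```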
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77098
Approved by: https://github.com/jbschlosser
Summary:
Fixes https://github.com/pytorch/pytorch/issues/72778
TODO
- [x] Waiting for a conclusion from the discussion in the issue.
- [x] Still bugs in handling misplaced args; needed a re-design to cover all corner cases.
TODO changes
- [x] Put the deprecated signature second.
- [x] Change to kwargs, with positional args deprecated (see the sketch after this list).
- [x] Add a comment on why `DeprecationWarning` is not used.
- [x] Remove unnecessary comments.
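A hedged sketch of the kwargs-with-deprecated-positional pattern the checklist describes (function and parameter names are illustrative, not the PR's exact API):
```python
import warnings

def state_dict(self, *args, destination=None, prefix="", keep_vars=False):
    # Accept legacy positional calls but steer callers toward keyword args.
    if args:
        # DeprecationWarning is hidden by default outside __main__, so a
        # plainly visible warning category is emitted instead.
        warnings.warn(
            "Positional arguments to state_dict() are deprecated; "
            "pass destination, prefix, and keep_vars as keywords."
        )
        if len(args) >= 1 and destination is None:
            destination = args[0]
        if len(args) >= 2:
            prefix = args[1]
        if len(args) >= 3:
            keep_vars = args[2]
    ...
```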
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72780
Reviewed By: george-qi
Differential Revision: D34398656
Pulled By: albanD
fbshipit-source-id: e8f2708e3dfd925ff354e098a66905f9775f4e0a
(cherry picked from commit 7f8eaf05fc48b333d22a07af57a7024b8b9ec6bf)