Commit Graph

247 Commits

Author SHA1 Message Date
Ansley Ussery
85109ce427 Support submodule manipulation in GraphModule (#52358)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/52358

Test Plan: Imported from OSS

Reviewed By: jamesr66a

Differential Revision: D26759260

Pulled By: ansley

fbshipit-source-id: 25d2b9124a7d957704f1700a45dca143aaed391d
2021-03-04 14:52:35 -08:00
Horace He
a649d808e6 Added fast path in the case of no hooks (#52576)
Summary:
See the discussion here: https://github.com/pytorch/pytorch/pull/50431

~~Not completely done yet - need to figure out the backwards compatibility stuff as well as `RemovableHandle`.~~

~~Also, this concretely breaks Torchscript (which tries to script the properties), and more generally, probably requires modifying Torchscript hook support: https://github.com/pytorch/pytorch/issues/34329~~

Just kidding, I think all problems are solved :)

Another thing I could do in this PR is to simply replace all the `len(x) > 0` checks with the faster checks. That's about 1.5-2k more Python instructions and .4 - .5 microseconds slower.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/52576

Reviewed By: ailzhang

Differential Revision: D26650352

Pulled By: Chillee

fbshipit-source-id: 0fd73e916354b9e306701a8a396c5dc051e69f0d
2021-02-24 21:48:09 -08:00
Jeff Yang
0c455332e8 docs: add link to Tensor.share_memory_ in Module.share_memory (#52561)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/48228

Pull Request resolved: https://github.com/pytorch/pytorch/pull/52561

Reviewed By: malfet

Differential Revision: D26626012

fbshipit-source-id: 7aab44c60d1bcbda68012521ec852843864abc7f
2021-02-23 20:20:50 -08:00
Akifumi Imanishi
b3fda95fe7 Add LazyBatchNormXd (#51862)
Summary:
Same diff with https://github.com/pytorch/pytorch/issues/51548 (cc. albanD)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/51862

Reviewed By: izdeby

Differential Revision: D26312289

Pulled By: albanD

fbshipit-source-id: 9cdec0e0c9021c33d10d85010978c7fa5cb4dc60
2021-02-09 10:29:03 -08:00
Alban Desmaison
a930162c69 Revert D26276903: [pytorch][PR] Add LazyBatchNormXd
Test Plan: revert-hammer

Differential Revision:
D26276903 (aa1fd6b45a)

Original commit changeset: 0ac706974178

fbshipit-source-id: bfe01b01cd460f1e2845ea5ef1fc1514e6b6ba54
2021-02-05 12:37:29 -08:00
Akifumi Imanishi
aa1fd6b45a Add LazyBatchNormXd (#51548)
Summary:
This PR implements UninitializedBuffer and LazyBatchnormXd based on https://github.com/pytorch/pytorch/issues/44538. (cc. emcastillo and albanD)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/51548

Reviewed By: zhangguanheng66

Differential Revision: D26276903

Pulled By: albanD

fbshipit-source-id: 0ac706974178363f8af075e59b41d5989418922f
2021-02-05 10:27:04 -08:00
chengjun
4a8ef4525e Add new backend type for Intel heterogeneous computation platform. (#49786)
Summary:
Add a new device type 'XPU' ('xpu' for lower case) to PyTorch. Changes are needed for code related to device model and kernel dispatch, e.g. DeviceType, Backend and DispatchKey etc.

https://github.com/pytorch/pytorch/issues/48246

Pull Request resolved: https://github.com/pytorch/pytorch/pull/49786

Reviewed By: mrshenli

Differential Revision: D25893962

Pulled By: ezyang

fbshipit-source-id: 7ff0a316ee34cf0ed6fc7ead08ecdeb7df4b0052
2021-01-20 08:15:18 -08:00
jonykarki
934805bc49 cleaned up ModuleAttributeError (#50298)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/49726
Just cleaned up the unnecessary `ModuleAttributeError`

BC-breaking note:
`ModuleAttributeError` was added in the previous unsuccessful [PR](https://github.com/pytorch/pytorch/pull/49879) and removed here. If a user catches `ModuleAttributeError` specifically, this will no longer work. They should catch `AttributeError` instead.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/50298

Reviewed By: mrshenli

Differential Revision: D25907620

Pulled By: jbschlosser

fbshipit-source-id: cdfa6b1ea76ff080cd243287c10a9d749a3f3d0a
2021-01-14 06:58:01 -08:00
Guilherme Leobas
5f8e1a1da9 add type annotations to torch.nn.modules.module (#49045)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/49044

Pull Request resolved: https://github.com/pytorch/pytorch/pull/49045

Reviewed By: malfet

Differential Revision: D25767092

Pulled By: walterddr

fbshipit-source-id: a81ba96f3495943af7bb9ee3e5fc4c94c690c405
2021-01-11 17:01:47 -08:00
Alex Henrie
5f2ec6293d Unused variables in neural net classes and functions (#50100)
Summary:
These unused variables were identified by [pyflakes](https://pypi.org/project/pyflakes/). They can be safely removed to simplify the code and possibly improve performance.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/50100

Reviewed By: ezyang

Differential Revision: D25797764

Pulled By: smessmer

fbshipit-source-id: ced341aee692f429d2dcc3a4ef5c46c8ee99cabb
2021-01-06 08:16:57 -08:00
Samuel Marks
e6779d4357 [*.py] Rename "Arguments:" to "Args:" (#49736)
Summary:
I've written custom parsers and emitters for everything from docstrings to classes and functions. However, I recently came across an issue when I was parsing/generating from the TensorFlow codebase: inconsistent use of `Args:` and `Arguments:` in its docstrings.

```sh
(pytorch#c348fae)$ for name in 'Args:' 'Arguments:'; do
    printf '%-10s %04d\n' "$name" "$(rg -IFtpy --count-matches "$name" | paste -s -d+ -- | bc)"; done
Args:      1095
Arguments: 0336
```

It is easy enough to extend my parsers to support both variants, however it looks like `Arguments:` is wrong anyway, as per:

  - https://google.github.io/styleguide/pyguide.html#doc-function-args @ [`ddccc0f`](https://github.com/google/styleguide/blob/ddccc0f/pyguide.md)

  - https://chromium.googlesource.com/chromiumos/docs/+/master/styleguide/python.md#describing-arguments-in-docstrings @ [`9fc0fc0`](https://chromium.googlesource.com/chromiumos/docs/+/9fc0fc0/styleguide/python.md)

  - https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html @ [`c0ae8e3`](https://github.com/sphinx-contrib/napoleon/blob/c0ae8e3/docs/source/example_google.rst)

Therefore, only `Args:` is valid. This PR replaces them throughout the codebase.

PS: For related PRs, see tensorflow/tensorflow/pull/45420

PPS: The trackbacks automatically appearing below are sending the same changes to other repositories in the [PyTorch](https://github.com/pytorch) organisation.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/49736

Reviewed By: albanD

Differential Revision: D25710534

Pulled By: soumith

fbshipit-source-id: 61e8ff01abb433e9f78185c2d1d0cbd7c22c1619
2020-12-28 09:34:47 -08:00
albanD
ccd646696b Fix Module backward hooks for all Tensor inputs/outputs (#46163)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/598

This is BC-breaking as we now explicitly don't call the hook when there are not Tensors at the top level of the output.
This feature was not working anyways as the returned grad_input/grad_output were wrong (not respecting the output structure and wrong inputs for multi-Node Module).

This is also BC-breaking as we now report the correct gradients for `nn.Module`s that contain multiple autograd `Node`s while we use to return bad results before.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46163

Reviewed By: ailzhang, mruberry

Differential Revision: D24894180

Pulled By: albanD

fbshipit-source-id: e1b5d193d2818eb2f51e2a2722c7405c8bd13c2b
2020-12-18 09:04:36 -08:00
Richard Zou
22d21414d7 Revert D24574649: [pytorch][PR] Utility that loads a DP/DDP model state dict into a non-DDP model with the same architecture.
Test Plan: revert-hammer

Differential Revision:
D24574649 (b631c872c9)

Original commit changeset: 17d29ab16ae2

fbshipit-source-id: 6766c6b21b82c9463143da0370192d9c68dbce6c
2020-11-10 06:55:45 -08:00
Pradeep Ganesan
b631c872c9 Utility that loads a DP/DDP model state dict into a non-DDP model with the same architecture. (#45643)
Summary:
Added a convenience function that allows users to load models without DP/DDP from a DP/DDP state dict.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/45643

Reviewed By: rohan-varma

Differential Revision: D24574649

fbshipit-source-id: 17d29ab16ae24a30890168fa84da6c63650e61e9
2020-11-09 20:49:29 -08:00
Ivan Yashchuk
6de619e4a4 Allow converting parameters of nn.Module to complex dtypes (#44788)
Summary:
This PR makes it possible to cast the parameters of nn.Module to complex dtypes.
The following code works with the proposed changes.
```python
In [1]: import torch
In [2]: lin = torch.nn.Linear(5, 1).to(torch.complex64)
In [3]: lin(torch.zeros(3, 5, dtype=torch.complex64))
Out[3]:
tensor([[-0.1739+0.j],
        [-0.1739+0.j],
        [-0.1739+0.j]], grad_fn=<AddmmBackward>)
```
Fixes https://github.com/pytorch/pytorch/issues/43477.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/44788

Reviewed By: zou3519

Differential Revision: D24307225

Pulled By: anjali411

fbshipit-source-id: dacc4f5c8c9a99303f74d1f5d807cd657b3b69b5
2020-10-21 08:54:59 -07:00
Emilio Castillo
d38a71d579 torch.nn.modules.LazyModuleMixin and torch.nn.LazyLinear (Shape Inference II) (#44538)
Summary:
Retake on https://github.com/pytorch/pytorch/issues/40493 after all the feedback from albanD

This PR implements the generic Lazy mechanism and a sample `LazyLinear` layer with the `UninitializedParameter`.

The main differences with the previous PR are two;
Now `torch.nn.Module` remains untouched.
We don't require an explicit initialization or a dummy forward pass before starting the training or inference of the actual module. Making this much simpler to use from the user side.

As we discussed offline, there was the suggestion of not using a mixin, but changing the `__class__` attribute of `LazyLinear` to become `Linear` once it's completely initialized. While this can be useful, by the time being we need `LazyLinear` to be a `torch.nn.Module` subclass since there are many checks that rely on the modules being instances of `torch.nn.Module`.
This can cause problems when we create complex modules such as
```
class MyNetwork(torch.nn.Module):
    def __init__(self):
        super(MyNetwork, self).__init__()
        self.conv = torch.nn.Conv2d(20, 4, 2)
        self.linear = torch.nn.LazyLinear(10)
    def forward(self, x):
        y = self.conv(x).clamp(min=0)
        return self.linear(y)
```
Here, when the __setattr__ function is called at the time LazyLinear is registered, it won't be added to the child modules of `MyNetwork`, so we have to manually do it later, but currently there is no way to do such thing as we can't access the parent module from LazyLinear once it becomes the Linear module. (We can add a workaround to this if needed).

TODO:

Add convolutions once the design is OK
Fix docstrings

Pull Request resolved: https://github.com/pytorch/pytorch/pull/44538

Reviewed By: ngimel

Differential Revision: D24162854

Pulled By: albanD

fbshipit-source-id: 6d58dfe5d43bfb05b6ee506e266db3cf4b885f0c
2020-10-19 13:13:54 -07:00
Taras Galkovskyi
f11f9a8c1f [pytorch][improvement] Improve torch logging to identify problematic key (#45766)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45766

As per subj, making KeyError message more verbose.

Test Plan:
Verified that breakage can be successfully investigated with verbose error message
unit tests

Reviewed By: esqu1

Differential Revision: D24080362

fbshipit-source-id: f4e22a78809e5cff65a69780d5cbbc1e8b11b2e5
2020-10-05 14:54:52 -07:00
Michael Carilli
3e6bb5233f Reference amp tutorial (recipe) from core amp docs (#44725)
Summary:
https://pytorch.org/tutorials/recipes/recipes/amp_recipe.html is live.  Core amp docs should reference it.

Also i fixed some typos in the `zero_grad` docs we ignored when git was behaving weirdly during ngimel 's merge of https://github.com/pytorch/pytorch/pull/44423.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/44725

Reviewed By: mruberry

Differential Revision: D23723807

Pulled By: ngimel

fbshipit-source-id: ca0b76365f8ca908bd978e3b38bf81857fa6c2a3
2020-09-16 11:37:58 -07:00
taiyuanz
c515881137 Add reset_grad() function (#44423)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44423

Pull Request resolved: https://github.com/pytorch/pytorch/pull/42754

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D23010859

Pulled By: ngimel

fbshipit-source-id: 56eec43eba88b98cbf714841813977c68f983564
2020-09-09 22:05:45 -07:00
Oliver Thomas
e39b43fd76 Issue 43057 (#43063)
Summary:
A small change that adds a docstring that can be found with
`getattr(nn.Module, nn.Module.forward.__name__, None).__doc__`

Fixes https://github.com/pytorch/pytorch/issues/43057

Pull Request resolved: https://github.com/pytorch/pytorch/pull/43063

Reviewed By: mrshenli

Differential Revision: D23161782

Pulled By: ezyang

fbshipit-source-id: 95456f858e2b6a0e41ae551ea4ec2e78dd35ee3f
2020-08-18 12:50:53 -07:00
Max Berrendorf
ddb8849ffc Fix method stub used for fixing mypy issue to work with pylint (#42356)
Summary:
Make function from method

Since _forward_unimplemented is defined within the nn.Module class,
pylint (correctly) complains about not implementing this method in subclasses.

Fixes https://github.com/pytorch/pytorch/issues/42305

Pull Request resolved: https://github.com/pytorch/pytorch/pull/42356

Reviewed By: mruberry

Differential Revision: D22867255

Pulled By: ezyang

fbshipit-source-id: ccf3e45e359d927e010791fadf70b2ef231ddb0b
2020-08-05 19:57:38 -07:00
Sharvil Nanavati
832b1659e7 Fix missing attribute when loading model from older version (#42242) (#42290)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/42242

Pull Request resolved: https://github.com/pytorch/pytorch/pull/42290

Reviewed By: VitalyFedyunin

Differential Revision: D22844096

Pulled By: albanD

fbshipit-source-id: 707e552e0ed581fbe00f1527ab7426880edaed64
2020-07-31 09:03:07 -07:00
Yanli Zhao
79cfd85987 grad detach_ only when it has grad_fn in zero_grad call (#41283)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41283

in optimizer.zero_grad(), detach_ is useful to avoid memory leak only when grad has grad_fn, so add check to call grad.detach_ only when the grad has grad_fn in zero_grad() function
ghstack-source-id: 108702289

Test Plan: unit test

Reviewed By: mrshenli

Differential Revision: D22487315

fbshipit-source-id: 861909b15c8497f1da57f092d8963d4920c85e38
2020-07-29 11:40:13 -07:00
Michael Suo
300a3aaaad [jit] move private implementation out of jit/__init__.py (#40807)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40807

We pack a lot of logic into `jit/__init__.py`, making it unclear to
developers and users which parts of our API are public vs. internal. This
is one in a series of PRs intended to pull implementation out into
separate files, and leave `__init__.py` as a place to register the
public API.

This PR moves all the tracing-related stuff out, and fixes other spots up
as necessary. Followups will move other core APIs out.

The desired end-state is that we conform to the relevant rules in [PEP 8](https://www.python.org/dev/peps/pep-0008/#public-and-internal-interfaces). In particular:
- Internal implementation goes in modules prefixed by `_`.
- `__init__.py` exposes a public API from these private modules, and nothing more.
- We set `__all__` appropriately to declare our public API.
- All use of JIT-internal functionality outside the JIT are removed (in particular, ONNX is relying on a number internal APIs). Since they will need to be imported explicitly, it will be easier to catch new uses of internal APIs in review.

Test Plan: Imported from OSS

Reviewed By: eellison

Differential Revision: D22320645

Pulled By: suo

fbshipit-source-id: 0720ea9976240e09837d76695207e89afcc58270
2020-07-05 22:01:11 -07:00
Tongzhou Wang
b678666a04 Add module.training to docs (#40923)
Summary:
A lot of people ask https://discuss.pytorch.org/t/check-if-model-is-eval-or-train/9395/3
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40923

Reviewed By: pbelevich

Differential Revision: D22358799

Pulled By: zou3519

fbshipit-source-id: b5465ffedb691fb4811e097c4dbd7bbc405be09c
2020-07-02 12:36:59 -07:00
Jiayu Liu
0203d70c63 [nit] fix some typo within documentation (#40692)
Summary:
Apologize if this seems trivial, but i'd like to fix them on my way of reading some of the source code. Thanks!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40692

Differential Revision: D22284651

Pulled By: mrshenli

fbshipit-source-id: 4259d1808aa4d15a02cfd486cfb44dd75fdc58f8
2020-06-30 19:24:44 -07:00
Tongzhou Wang
ef5a314597 [typing] fix register_buffer/parameter (#40669)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/40669

Differential Revision: D22286130

Pulled By: ezyang

fbshipit-source-id: c0cc173279678978726895a0830343d5234e474e
2020-06-30 10:39:32 -07:00
Emilio Castillo
5e77999ecb Add global hooks to torch.nn.Module (#38972)
Summary:
This allows registering hooks that will be executed for every module.

This idea arose in a discussion with tkerola and niboshi kindly proposed this approach.

The use case for this is to avoid boilerplate code when registering the same hook for all the modules in a complex model, the internal use-case was to allow every model to accept a NumPy array in the forward pass in a simpler way. Other use cases involve general mechanisms for plotting or tracing & debugging.

Currently, this is shared for all the modules but this can be worked out to have the hooks shared only per type of module.

If this functionality is not needed feel free to close the PR.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38972

Differential Revision: D22091364

Pulled By: albanD

fbshipit-source-id: 204ff5f9e119eff5bdd9140c64cb5dc467bb23a2
2020-06-17 12:20:35 -07:00
Edward Yang
eace053398 Move all torch.nn.modules type annotations inline (#38211)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38211

Just because the annotations are inline doesn't mean the files type
check; most of the newly annotated files have type errors and I
added exclusions for them in mypy.ini.  The payoff of moving
all of these modules inline is I can delete the relevant code
generation logic for the pyi files (which was added ignore
annotations that weren't actually relevant anymore.)

For the most part the translation was completely mechanical, but there
were two hairy issues.  First, I needed to work around a Python 3.6 and
earlier bug where Generic has a nontrivial metaclass.  This fix is in
torch/jit/__init__.py.  Second, module.py, we need to apply the same
fix for avoiding contravariance checks that the pyi file used to have;
this is done by declaring forward as a variable (rather than a
function), which appears to be sufficient enough to get mypy to not
contravariantly check input arguments.

Because we aren't actually typechecking these modules in most
cases, it is inevitable that some of these type annotations are wrong.
I slavishly copied the old annotations from the pyi files unless there
was an obvious correction I could make.  These annotations will probably
need fixing up later.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Differential Revision: D21497397

Pulled By: ezyang

fbshipit-source-id: 2b08bacc152c48f074e7edc4ee5dce1b77d83702
2020-06-11 15:59:57 -07:00
Alban Desmaison
3569c59600 Inverse logic of persistent set and prevent use in jit (#38131)
Summary:
jit.ScriptModule deletes all the actual attributes but still uses the nn.Module implementation.
Since I don't know how to add this new set() to the ScriptModule, it is simpler to just raise a nice error for now.
I also reverted the logic so that an empty set() (which is always the case in a ScriptModule) means that everything is persistent.

cc zdevito should we open an issue to add this to the ScriptModule?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38131

Differential Revision: D21502183

Pulled By: albanD

fbshipit-source-id: 96f83098d9a2a9156e8af5bf5bd3526dd0fefc98
2020-05-11 09:59:24 -07:00
Sharvil Nanavati
594b33ea10 Add support for non-persistent buffers. (#37191)
Summary:
Issue: https://github.com/pytorch/pytorch/issues/18056
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37191

Differential Revision: D21428373

Pulled By: albanD

fbshipit-source-id: a7d367bafb95137e1bc380178b82b08eff5d5a5a
2020-05-07 06:52:31 -07:00
Peter Bell
f8ec51bd86 Ensure DataParallel replicas can be saved (#37307)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/37182

The `zero_grad` wrapper from `_replicate_for_data_parallel` can't be pickled. So instead, I set an attribute `_is_replica = True` and check for this in `Module.zero_grad`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37307

Differential Revision: D21246119

Pulled By: mrshenli

fbshipit-source-id: 4755786d48a20bc247570ba672de9dd526914ce1
2020-04-25 20:57:24 -07:00
Alban Desmaison
3799d1d74a Fix many doc issues (#37099)
Summary:
Fix https://github.com/pytorch/pytorch/issues/35643 https://github.com/pytorch/pytorch/issues/37063 https://github.com/pytorch/pytorch/issues/36307 https://github.com/pytorch/pytorch/issues/35861 https://github.com/pytorch/pytorch/issues/35299 https://github.com/pytorch/pytorch/issues/23108 https://github.com/pytorch/pytorch/issues/4661

Just a bunch of small updates on the doc.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37099

Differential Revision: D21185713

Pulled By: albanD

fbshipit-source-id: 4ac06d6709dc0da6109a6ad3daae75667ee5863e
2020-04-23 10:01:03 -07:00
Zachary DeVito
967cdc2baf Simplify replicate logic (#36174)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/36174

Test Plan: Imported from OSS

Differential Revision: D20903301

Pulled By: zdevito

fbshipit-source-id: 714a32fe417b7d1615886936c41505d1ba538f47
2020-04-13 11:21:43 -07:00
mpariente
79054495d3 (Fixes #33934) Fix AttributeError for nn.Module's properties (#34324)
Summary:
As described in https://github.com/pytorch/pytorch/issues/33934, the current attribute error in `nn.Module`'s properties are wrong.

```python
from torch import nn

class MyModule(nn.Module):
    property
    def something(self):
        hey = self.unknown_function()
        return hey

model = MyModule()
print(model.something)
```
This raises `AttributeError: 'MyModule' object has no attribute 'something'` when what we want is `AttributeError: MyModule instance has no attribute 'unknown_function'`.

This fixes this issue and will make properties much easier to debug !
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34324

Differential Revision: D20645563

Pulled By: ezyang

fbshipit-source-id: 130f861851bdbef43803569a5ce9e24d2b942179
2020-03-26 07:43:21 -07:00
rohithkrn
66ee4f1c81 [ROCm] Enable Bfloat16 type for activation and batch-norm
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/32065

Differential Revision: D19728858

Pulled By: ezyang

fbshipit-source-id: 8f828c558bfe6c5f43f476ff8a0f967341f8f351
2020-02-11 21:04:20 -08:00
Peter Bell
efba630287 Issue a warning when zero_grad is used in DataParallel (#33064)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/31768, second attempt of https://github.com/pytorch/pytorch/issues/32870

DataParallel creates replicas of the original `nn.Module` with the parameters duplicated onto the destination devices. Calling `backwards` will propagate gradients onto the original module parameters but calling `zero_grad` on the replica module doesn't clear the gradients from the parent module. However, any replica using backwards was broken anyway since the replica's parameters are not leaf nodes in autograd. So, we should issue a warning.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33064

Differential Revision: D19790178

Pulled By: albanD

fbshipit-source-id: 886f36640acef4834a6fa57a26ce16b42ff0e9ad
2020-02-10 07:04:27 -08:00
Richard Zou
8195961f20 Revert D19730209: [pytorch][PR] Issue a warning when using zero_grad in DataParallel
Test Plan: revert-hammer

Differential Revision:
D19730209

Original commit changeset: cb9b2cb0c2e0

fbshipit-source-id: 5bf53ea3c37a7ed2411a2acc34e40d07eff144c9
2020-02-06 07:05:51 -08:00
Peter Bell
46c3c18bcc Issue a warning when using zero_grad in DataParallel (#32870)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/31768

`DataParallel` creates replicas of the original `nn.Module` with the parameters duplicated onto the destination devices. Calling `backwards` will propagate gradients onto the original module parameters but calling `zero_grad` on the replica module doesn't clear the gradients from the parent module,

~breaking any model that uses `backward`-`zero_grad` in its `forward`. I fix this by patching the replica module so that `zero_grad` clears grads on the parent as well.~

However, any replica using backwards was broken anyway since the replica's parameters are not leaf nodes in autograd. So, we should raise a warning.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32870

Differential Revision: D19730209

Pulled By: ezyang

fbshipit-source-id: cb9b2cb0c2e0aca688ce0ff3e56b40fbd2aa3c66
2020-02-05 20:25:04 -08:00
Seyyed Hossein Hasanpour
c7bf4d22fe added exception args to the returned error message (#32693)
Summary:
addresses https://github.com/pytorch/pytorch/issues/32692
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32693

Differential Revision: D19606757

Pulled By: mrshenli

fbshipit-source-id: 79fc09f8bb6a33e1b73ce0bbc45387544c7adc1b
2020-01-29 08:26:27 -08:00
Alban Desmaison
81048c41ab remove simple .data from torch/nn
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/31482

Test Plan: Imported from OSS

Differential Revision: D19303243

Pulled By: albanD

fbshipit-source-id: 5afdfeb4b8382c09b9ec65acd545148ed76d4285
2020-01-15 12:40:38 -08:00
Alban Desmaison
77c78b7d28 remove .data from torch/nn doc
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/31481

Test Plan: Imported from OSS

Differential Revision: D19303242

Pulled By: albanD

fbshipit-source-id: 4f650df9e9e302a299175967bcc6e30a5099fa2a
2020-01-14 07:30:42 -08:00
Vitaly Fedyunin
66f2bba852 Adding function to convert Module to channels last
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/28991

Test Plan: Imported from OSS

Differential Revision: D18430810

Pulled By: VitalyFedyunin

fbshipit-source-id: 0693d4e31fc6f9831722c29fc83517f16ddfc028
2019-12-12 11:38:35 -08:00
James Reed
309b28ee3a Trace module calls
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/29261

Test Plan: Imported from OSS

Differential Revision: D18343363

Pulled By: jamesr66a

fbshipit-source-id: 0c6394205e2c0ea8708028d20df83fe17b466ff4
2019-11-06 15:05:49 -08:00
zou3519
e5d6b75319 Bag of documentation fixes; fix more sphinx warnings (#27850)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27850

Many of these are real problems in the documentation (i.e., link or
bullet point doesn't display correctly).

Test Plan: - built and viewed the documentation for each change locally.

Differential Revision: D17908123

Pulled By: zou3519

fbshipit-source-id: 65c92a352c89b90fb6b508c388b0874233a3817a
2019-10-15 07:31:14 -07:00
Michael Suo
ffa422a8b3 kill _parameter_list (#27399)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27399

This was devised in a time when we didn't have module attributes. They
are essentially just tensor lists, so represent them that way. This has
the additional benefit of making the RNN forward pass faster because we
effectively cache the flattened weights.

The only complication part is that someone may come along and do:
```
my_rnn_mod.w_ih_l0 = torch.nn.Parameter(...)
```

This means we need to override setattr to keep the flattened weights
cache up to date.

Test Plan: Imported from OSS

Differential Revision: D17785658

Pulled By: suo

fbshipit-source-id: 7789cd1d0d4922bfd5eba1716976442fbf150766
2019-10-12 09:51:53 -07:00
davidriazati
0046092178 Reduce special casing around 'training' (#27109)
Summary:
Most of this was old cruft left over from special handling of `training` before we had a `bool` type. This makes all modules have a `training` attribute that is true by default and removes all other special handling.

Fixes #26884
](https://our.intern.facebook.com/intern/diff/17728129/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27109

Pulled By: driazati

Differential Revision: D17728129

fbshipit-source-id: 8ddc9fbb07a953dd05529538bfdd01ed88b5cb57
2019-10-07 13:52:59 -07:00
Gregory Chanan
23fde77d3d Remove Module._backend as it's not used anymore.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/25342

Test Plan: Imported from OSS

Differential Revision: D17101571

Pulled By: gchanan

fbshipit-source-id: 2cda46fe197e26a1cacb5e912f535809973d306e
2019-08-29 15:43:49 -07:00
Michael Suo
e2f5bc5c08 Properly mangle nn.Module.__construct (#23779)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23779

Mangling is two underscores, not one :(. We want this method to be
private so that inheritors who define a `__construct` do not interfere
with Module initialization

Test Plan: Imported from OSS

Differential Revision: D16645156

Pulled By: suo

fbshipit-source-id: b9060cb35bfaa0391ff200b63fb78b1ac15fee39
2019-08-05 17:58:34 -07:00
Michael Suo
cbf05305c0 don't try to set training after ScriptModule has been initialized. (#23680)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23680

Now when initializing a ScriptModule during the torch.jit.load()
process, there is already a cpp module backing the thing. That means
that setting training will overwrite whatever the initialized
ScriptModule had.

This PR splits apart the common "set up internal state" part of the
Module __init__ and calls that from ScriptModule.__init__ and
Module.__init__, leaving the "nn.Module-specific" part (setting
`self.training`) for the nn.Module __init__

Test Plan: Imported from OSS

Differential Revision: D16606959

Pulled By: suo

fbshipit-source-id: f7ea6b36551ff4e4472b7685f65731d5cfab87fd
2019-08-04 15:04:55 -07:00
Jiakai Liu
3b1c3996e1 remove RTTI check for TensorImpl shadow copy (#22773)
Summary:
We introduced RTTI in recent change: https://github.com/pytorch/pytorch/pull/21613

For internal mobile build we don't enable '-frtti' yet. This diff is trying to replace
RTTI with alternative approach.

According to dzhulgakov we could compare two tensors' type_id directly in most cases -
which is more strict than comparing TensorImpl subclass type as TensorImpl -> type_id
mapping is 1-to-n but it's more proper for this use case.

The only two cases where we can relax direct type comparison (for legacy reason) are:
1. CPUTensor <-> CUDATensor;
2. SparseCPUTensor <-> SparseCUDATensor;
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22773

Differential Revision: D16277696

Pulled By: ljk53

fbshipit-source-id: 043e264fbacc37b7a11af2046983c70ddb62a599
2019-07-15 23:21:57 -07:00
SsnL
478d480d37 Add Module.requires_grad_ (#22576)
Summary:
addresses https://github.com/pytorch/pytorch/issues/20241
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22576

Differential Revision: D16149314

Pulled By: zou3519

fbshipit-source-id: 1cc4c1ec084df30e00e9ae73ce1a53494a034d5c
2019-07-08 12:13:07 -07:00
Jerry Zhang
577c04c490 add mutation support for forward_pre_hook and forward_hook (#22285)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22285

Previously forward hooks are expected to return None, this PR adds the support to overwrite input and output in `forward_pre_hook` and `forward_hook`, this is used to implement inserting quant/dequant function calls around forward functions.

Differential Revision: D16022491

fbshipit-source-id: 02340080745f22c8ea8a2f80c2c08e3a88e37253
2019-07-01 11:06:42 -07:00
Adam Paszke
41d0525de3 Improve repr for IncompatibleKeys (#22119)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/20128.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22119

Differential Revision: D15961965

Pulled By: ezyang

fbshipit-source-id: 9cc397726e6bea5580e79d291cfc1ee75337fa0c
2019-06-24 15:26:54 -07:00
Dmytro Dzhulgakov
82dd69326b Split nn.Module._save_to_state_dict to make it overridable (#21933)
Summary:
# Motivation

We allow to override JIT module serialization with `__getstate__/__setstate__` in order to cover cases where parameters are not serializable. Use cases include: MKLDNN integration: a388c78350/torch/utils/mkldnn.py (L18-L26)
and also fbgemm prepacked format integration for quantized tensors.

However many Eager scripts use `torch.save(module.state_dict())` form of serialization. There are several ways to make it work:

* make packed_weight itself pickleable (e.g. by binding `__getstate__/__setstate__` on C++ UDT level)
    * change: we’d need to allow module buffers to be of arbitrary, non-Tensor types
    * pro: no change to state_dict behavior
    * cons: might not be directly inspectable by user calling .state_dict(), especially if packed weights represent several tensors fused together
* make packed_weight being proper Tensor layout
    * pro: no change to state_dict or buffers behavior
    * cons: adding new tensor layouts is pretty costly today
    * cons: doesn’t work if multiple tensors are packed in one interleaved representation
* *[this approach]* allow Modules to override state_dict and return regular tensors
    * pro: most flexible and hackable
    * pro: maintains semantic meaning of statedict as all data necessary to represent module’s state
    * cons: complicates state_dict logic
    * cons: potential code duplication between `__getstate__/__setstate__`

Based on discussions with zdevito and gchanan we decided to pick latter approach. Rationale: this behavior is fully opt-in and will impact only modules that need it. For those modules the requirement listed above won't be true. But we do preserve requirement that all elements of state_dict are tensors. (https://fburl.com/qgybrug4 for internal discussion)

In the future we might also implement one of the approaches above but those are more involved.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21933

Differential Revision: D15937678

Pulled By: dzhulgakov

fbshipit-source-id: 3cb5d1a8304d04def7aabc0969d0a2e7be182367
2019-06-21 09:55:22 -07:00
Will Feng
6b972795e4 Add torch.__future__._overwrite_module_params_on_conversion global flag, and check it in nn.Module._apply() (#21613)
Summary:
https://github.com/pytorch/pytorch/pull/17072 breaks `model.to(xla_device)`, because moving `model` to XLA device involves changing its parameters' TensorImpl type, and the current implementation of `nn.Module.to()` doesn't support changing module parameters' TensorImpl type:
```python
# 6dc445e1a8/torch/nn/modules/module.py (L192-L208)
def _apply(self, fn):
    ...
    for param in self._parameters.values():
        if param is not None:
            # Tensors stored in modules are graph leaves, and we don't
            # want to create copy nodes, so we have to unpack the data.
            param.data = fn(param.data)  # NOTE: this doesn't allow changing `param.data`'s TensorImpl type
            if param._grad is not None:
                param._grad.data = fn(param._grad.data)  # NOTE: this doesn't allow changing `param._grad.data`'s TensorImpl type
   ...
```

yf225 TODO: fix the description here when we finish the implementation

To fix this problem, we introduce a new API `model.to_()` that always assign new tensors to the parameters (thus supporting changing the parameters to any TensorImpl type), and also bump the version counter of the original parameters correctly so that they are invalidated in any autograd graph they participate in.

We also add warning to the current `model.to()` API to inform users about the upcoming behavior change of `model.to()`: in future releases, it would create and return a new model instead of in-place updating the current model.

This unblocks adding XLA to our CI test suite, which also allows XLA to catch up with other changes in our codebase, notably the c10 dispatcher.

[xla ci]

cc. resistor ailzhang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21613

Differential Revision: D15895387

Pulled By: yf225

fbshipit-source-id: b79f230fb06019122a37fdf0711bf2130a016fe6
2019-06-19 10:30:02 -07:00
Will Feng
4b1df5c1f5 Use fn(param) instead of fn(param.data) in nn.Module._apply (#21865)
Summary:
When we pass `fn` to `nn.Module._apply()` and `fn` is an in-place operation, the correct behavior should also include bumping the parameters' and their gradients' version counters. This PR fixes the old incorrect behavior and makes sure the new behavior is right.

Note that this PR is BC-breaking in the following way:

Previously, passing an in-place operation to `nn.Module._apply()` does not bump the module's parameters' and their gradients' version counters. After this PR, the module's parameters' and their gradients' version counters will be correctly bumped by the in-place operation, which will invalidate them in any autograd graph they previously participate in.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21865

Differential Revision: D15881952

Pulled By: yf225

fbshipit-source-id: 62f9244a4283a110147e9f20145ff232a5579fbd
2019-06-18 20:45:40 -07:00
杨培文 (Yang Peiwen)
e447a733a1 Update module.py
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/21570

Differential Revision: D15732665

Pulled By: ezyang

fbshipit-source-id: caa12a8619ad1396540f787b5c849d29cc5b03bd
2019-06-09 15:28:35 -07:00
Dmytro Dzhulgakov
c25e33789e Lightweight at-most-once logging for API usage (#20745)
Summary:
Resubmit #20698 which got messed up.

Idea is that when PyTorch is used in a custom build environment (e.g. Facebook), it's useful to track usage of various APIs centrally. This PR introduces a simple very lightweight mechanism to do so - only first invocation of a trigger point would be logged. This is significantly more lightweight than #18235 and thus we can allow to put logging in e.g. TensorImpl.

Also adds an initial list of trigger points. Trigger points are added in such a way that no static initialization triggers them, i.e. just linking with libtorch.so will not cause any logging. Further suggestions of what to log are welcomed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20745

Differential Revision: D15429196

Pulled By: dzhulgakov

fbshipit-source-id: a5e41a709a65b7ebccc6b95f93854e583cf20aca
2019-05-23 23:17:59 -07:00
Sam Gross
c1fa449763 Break reference cycle in load_state_dict (#20397)
Summary:
load_state_dict includes a recursive inner function `load` that captures
Tensors through the close-over variable `state_dict`. Because it's
recursive, it also captures itself leading to a reference cycle.

This breaks the reference cycle so that any Tensors in state_dict can be
collected immediately instead of waiting until the next GC cycle.

Alternatively, we could have passed `state_dict` and `metadata` as
arguments to load to prevent capture of Tensors. (That would still
result in cyclic garbage, but not any cyclic garbage of Tensors).

See:
https://github.com/pytorch/pytorch/issues/20199#issuecomment-491089004
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20397

Differential Revision: D15414834

Pulled By: colesbury

fbshipit-source-id: 4c2275a08b2d8043deb3779db28be03bda15872d
2019-05-20 11:46:00 -07:00
Edward Z. Yang
9b1dbffba5
Re-sync with internal repository (#20702) 2019-05-20 09:22:57 -04:00
Dmytro Dzhulgakov
d3059b9c49 Lightweight logging for once-only API usage 2019-05-19 23:04:40 -07:00
Alexandros Metsai
9e3bdb3231 Update module.py documentation. (#19347)
Summary:
Added the ">>>" python interpreter sign(three greater than symbols), so that the edited lines will appear as code, not comments/output, in the documentation. Normally, the interpreter would display "..." when expecting a block, but I'm not sure how this would work on the pytorch docs website. It seems that in other code examples the ">>>" sign is used as well, therefore I used with too.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19347

Differential Revision: D14986154

Pulled By: soumith

fbshipit-source-id: 8f4d07d71ff7777b46c459837f350eb0a1f17e84
2019-04-18 06:46:24 -07:00
Sepehr Sameni
b11a8c6aef return missing keys from load_state_dict (#18668)
Summary:
return missing_keys and unexpected_keys from load_state_dict so the user can handle them when strict mode is off; also removed an unused variable
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18668

Differential Revision: D14782073

Pulled By: ezyang

fbshipit-source-id: ab3b855eb77bb7422594d971988067e86eef20f2
2019-04-04 18:11:56 -07:00
Alexandr Morev
abc171bd53 Fix typo in docstring
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18216

Differential Revision: D14539824

Pulled By: ezyang

fbshipit-source-id: 490b72951a75f3f8b949a2d692d660a3693ee98a
2019-03-20 11:16:36 -07:00
Kai Zhang
4ad17c9031 Misleading documentation for module._load_from_state_dict (#17618)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17618

Base on the code, we only add key to `missing_keys` and `unexpected_keys` if `$strict` is `True`. The documentation is confusing.

This diff also fix one FLAKE8 warning.

Reviewed By: ailzhang

Differential Revision: D14280593

fbshipit-source-id: d368f5596bdf74ff62ee4d28d79120f5af91e0a3
2019-03-12 16:57:39 -07:00
ZhuBaohe
acf5ec07af Correct conv and pooling docstrings in nn module (#17052)
Summary:
This PR fix conv and pooling docstrings in nn module
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17052

Differential Revision: D14068566

Pulled By: ezyang

fbshipit-source-id: 3ec1de232ff6334b6a544dadefbb0ee6193d443a
2019-02-15 06:58:02 -08:00
Michael Suo
bd75fba4e8 fix tracing using a dictionary as input (#16616)
Summary:
Previously this would fail with the error message:
```
ValueError: Auto nesting doesn't know how to process an input object of type dict. Accepted types: Tensors, or lists/tuples of them
```
Turns out we're not using the line that causes this error (or a side effect of that line), so removing it fixes the issue. Also cleaned up some related dead code (cc apaszke to make sure the code isn't useful in some way)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16616

Differential Revision: D13908352

Pulled By: suo

fbshipit-source-id: 27094f1f4ea0af215b901f7ed3520e94fbc587b3
2019-02-01 14:44:56 -08:00
FrankHui
fe4ae9dfe4 add if in register_buffer like register_parameters (#16110)
Summary:
without this "if", code below will throw error " Linear' object has no attribute '_buffers' "
And with this if, error would be "cannot assign buffer before Module.\_\_init\_\_() call", which I think it's more accurate, just like register_parameter.
```
import math
import torch
from torch.nn.parameter import Parameter
from torch.nn import functional as F
from torch.nn import Module
class Linear(Module):
    def __init__(self, in_features, out_features, bias=True):

        self.in_features = in_features
        self.out_features = out_features
        self.register_buffer('test', torch.Tensor(out_features, in_features))
        self.weight = Parameter(torch.Tensor(out_features, in_features))
        if bias:
            self.bias = Parameter(torch.Tensor(out_features))
        else:
            self.register_parameter('bias', None)

        super(Linear, self).__init__()

        self.reset_parameters()

    def reset_parameters(self):
        stdv = 1. / math.sqrt(self.weight.size(1))
        self.weight.data.uniform_(-stdv, stdv)
        if self.bias is not None:
            self.bias.data.uniform_(-stdv, stdv)

    def forward(self, input):
        return F.linear(input, self.weight, self.bias)

    def extra_repr(self):
        return 'in_features={}, out_features={}, bias={}'.format(
            self.in_features, self.out_features, self.bias is not None
        )

linear = Linear(3,4)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16110

Differential Revision: D13715839

Pulled By: soumith

fbshipit-source-id: c300eff0a8655aade448354cf489a592f7db722a
2019-01-17 11:50:12 -08:00
Derek Kim
19717224c5 Miscellaneous broken RSTs fixed (#16033)
Summary:
https://pytorch.org/docs/master/tensors.html#torch.Tensor.bernoulli_
https://pytorch.org/docs/master/torch.html#torch.addmm
https://pytorch.org/docs/master/distributed_deprecated.html#torch.distributed.deprecated.reduce_multigpu
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16033

Differential Revision: D13671202

Pulled By: soumith

fbshipit-source-id: 276e10e610affe205376573e7f0f9894695d218d
2019-01-15 09:50:12 -08:00
Peter Goldsborough
aec9fdf0a4 Fix _apply in nn.Module (#15305)
Summary:
Fixes an issue that arose from https://github.com/pytorch/pytorch/pull/13481 where `.shared_memory()` couldn't be called. Effectively undoes all changes to `nn.Module` from that PR and solve the relevant problem in a different way (the goal was to be able to call `._apply()` on the Python wrapper for a C++ module).

soumith
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15305

Differential Revision: D13493937

Pulled By: goldsborough

fbshipit-source-id: 4cb8687f90fc8709a536c5e7eacd0dc8edf6f750
2018-12-17 16:22:21 -08:00
Peter Goldsborough
0bf1383f0a Python <-> C++ Frontend inter-op (#13481)
Summary:
This PR enables C++ frontend modules to be bound into Python and added as submodules of Python modules. For this, I added lots of pybind11 bindings for the `torch::nn::Module` class, and modified the `torch.nn.Module` class in Python to have a new Metaclass that makes `isinstance(m, torch.nn.Module)` return true when `m` is a C++ frontend module. The methods and fields of C++ modules are bound in such a way that they work seamlessly as submodules of Python modules for most operations (one exception I know of: calling `.to()` ends up calling `.apply()` on each submodule with a Python lambda, which cannot be used in C++ -- this may require small changes on Python side).

I've added quite a bunch of tests to verify the bindings and equality with Python. I think I should also try out adding a C++ module as part of some large PyTorch module, like a WLM or something, and see if everything works smoothly.

The next step for inter-op across our system is ScriptModule <-> C++ Frontend Module inter-op. I think this will then also allow using C++ frontend modules from TorchScript.

apaszke zdevito

CC dzhulgakov
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13481

Differential Revision: D12981996

Pulled By: goldsborough

fbshipit-source-id: 147370d3596ebb0e94c82cec92993a148fee50a7
2018-12-13 08:04:02 -08:00
Ryan Moore
29d697aec4 typo in Module docstring
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14511

Differential Revision: D13246061

Pulled By: soumith

fbshipit-source-id: 6c13a2957c4c4324ab5d839d634689c61e25b0fe
2018-11-29 07:17:29 -08:00
albanD
f80d34a1c8 Update Tensor doc (#14339)
Summary:
Add to the Tensor doc info about `.device`, `.is_cuda`, `.requires_grad`, `.is_leaf` and `.grad`.
Update the `register_backward_hook` doc with a warning stating that it does not work in all cases.
Add support in the `_add_docstr` function to add docstring to attributes.

There is an explicit cast here but I am not sure how to handle it properly. The thing is that the doc field for getsetdescr is written as being a const char * (as all other doc fields in descriptors objects) in cpython online documentation. But in the code, it is the only one that is not const.
I assumed here that it is a bug in the code because it does not follow the doc and the convention of the others descriptors and so I cast out the const.
EDIT: the online doc I was looking at is for 3.7 and in that version both the code and the doc are const. For older versions, both are non const.
Please let me know if this should not be done. And if it should be done if there is a cleaner way to do it !
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14339

Differential Revision: D13243266

Pulled By: ezyang

fbshipit-source-id: 75b7838f7cd6c8dc72b0c61950e7a971baefaeeb
2018-11-28 15:28:17 -08:00
Tongzhou Wang
2cd912bcc2 Fix more spectral norm bugs (#13350)
Summary:
Problems with SN and DP after #12671 :
1. in eval mode, `weight_orig` is not getting correct gradient #12737 .

    Fix: keep `v` vector around as a buffer and always calculate `W = W_orig / (u @ W_orig @ v)` even in eval.

2. in training mode, the `weight` buffer of the parallelized module is never updated, if someone touches `weight_orig` and/or `weight` and makes them not sharing storage. So in `eval` the weight used is wrong.

    Fix: Make `weight` not a buffer anymore and always calculate it as above.

3. #12671 changed SN to update `u` in-place to make DP work correctly, but then it breaks backward through two forwards (e.g., the common GAN loss `D(real) - D(fake)`) because the vectors needed to backprop the 1st forward is changed in the 2nd forward.

    Fix: This PR clones `u` and `v` before using them.

To maintain BC, I added a hook interface for producing and loading state_dict. This is ugly and we should really have better interface for spectral_norm. But for the purpose to fix this issue, I make this patch. Even if we have a better interface, BC mechanism for legacy loading legacy state_dict still needs to be done.

cc The controller you requested could not be found. crcrpar
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13350

Differential Revision: D12931044

Pulled By: SsnL

fbshipit-source-id: 8be6f934eaa62414d76d2c644dedd7e1b7eb31ef
2018-11-06 19:16:13 -08:00
Evgeniy Zheltonozhskiy
c774cb8913 Rephrase unclear error message for shape mismatch (#12870)
Summary:
I spent a couple of minutes trying to understand which shape corresponds to checkpoint and which one to the model
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12870

Differential Revision: D10466600

Pulled By: SsnL

fbshipit-source-id: 3b68530b1b756462a2acd59e3a033ff633567a6b
2018-10-22 08:57:16 -07:00
Tongzhou Wang
de460c7ad3 Improvements on conv/pool/fold/stft/ParamDict docs (#11106)
Summary:
Also fixes some incorrect formula rendering.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11106

Differential Revision: D9752433

Pulled By: SsnL

fbshipit-source-id: 535fc8498638e8b645757fc7535d8771992b7d21
2018-09-11 08:56:21 -07:00
gngdb
c5b021cc88 State dict loading arguments were in the wrong order (#11200)
Summary:
In the state dict loading code, it would print the error message referring to the shape of the loaded parameters and the parameters in the initialised model with the formatting in the wrong order. Swapped them round to fix.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11200

Differential Revision: D9631160

Pulled By: SsnL

fbshipit-source-id: 03d9446303bd417fef67027b10d7a27de06486be
2018-09-03 15:42:30 -07:00
Jerry Ma
afd7477eaa Add `buffers(), named_buffers()` methods. (#10554)
Summary:
This commit adds the ``buffers()`` and ``named_buffers()`` methods as
analogues of ``parameters()`` and ``named_parameters()``.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10554

Reviewed By: SsnL

Differential Revision: D9367762

Pulled By: jma127

fbshipit-source-id: f2042e46a7e833dce40cb41681dbd80d7885c74e
2018-08-16 16:26:48 -07:00
Ailing Zhang
9df9c46992 fix loading 1dim tensor from 0.3.* to 0dim tensor (#9781)
Summary:
This PR fixes #9743 .

Adding backward support when loading a checkpoint from 0.3.* with 1dim tensor, they are now 0 dim tensor in 0.4+.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9781

Differential Revision: D8988196

Pulled By: ailzhang

fbshipit-source-id: a7a1bc771d597394208430575d5a4d23b9653fef
2018-07-26 17:09:41 -07:00
Adam Paszke
aa7af94656 Make JIT tracing a thread-local property (#9414)
Summary:
As in the title. Lets us simplify a lot of code.

Depends on #9363, so please review only the last commit.

zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9414

Reviewed By: zdevito

Differential Revision: D8836496

Pulled By: apaszke

fbshipit-source-id: 9b3c3d1f001a9dc522f8478abc005b6b86cfa3e3
2018-07-19 19:09:39 -07:00
Tongzhou Wang
623ae0c07c Fix loading 0.4 BN checkpoints (#9004)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/8481
Closes https://github.com/pytorch/pytorch/pull/9004

Reviewed By: soumith

Differential Revision: D8684017

Pulled By: SsnL

fbshipit-source-id: 57820ad5f6b60795358c9447409a364a93ffa1d9
2018-07-01 22:24:21 -07:00
Karan Dwivedi
a41d433d9d Check key should be string in nn.Module.add_module, parameter and buffer (#8960)
Summary:
Because I probably messed up the rebase in https://github.com/pytorch/pytorch/pull/8905
Closes https://github.com/pytorch/pytorch/pull/8960

Reviewed By: soumith

Differential Revision: D8668202

Pulled By: ezyang

fbshipit-source-id: 41e19803c7ac7aac898c8e70c6a9769314476ca9
2018-06-27 19:40:00 -07:00
Ailing
5f64484800
update to avoid potential duplicate error msg (#8638) 2018-06-19 08:50:00 -07:00
Ailing
f14887a63f
check for exact shape match before loading (#8619)
* check for exact shape match before loading

* Use RuntimeError instead of ValueError to keep it consistent with other errors

* fix lint
2018-06-18 20:16:34 -07:00
Tongzhou Wang
a77b391de7 [SpectralNorm] don't register original weight as buffer (#8170)
* don't register original weight as buffer; fixes for buffers that require grad

* add test
2018-06-12 14:42:05 -04:00
Tongzhou Wang
c0a419e6ba
Add non_blocking to Tensor/Module.to (#7312)
* Add non_blocking to Tensor/Module.to

* flake8

* Add argparse tests

* cpp parse

* Use C++ parser

* use a commong parse function with Tensor.to

* fix test_jit

* use THPObjectPtr

* increase refcount for None, True, and False

* address comments

* address comments
2018-06-04 18:46:52 -04:00
li-roy
d564ecb4a5 Update docs with new tensor repr (#6454)
* Update docs with new tensor repr

* remove cuda in dtype

* remove changes to gloo submodule

* [docs] document tensor.new_* ctor

* [docs] Add docs for tensor.to(), tensor.float(), etc

* [docs] Moar examples for docs.

* [docs] Warning for tensor ctor copy behavior

* Quick fix

* [docs] Document requires_grad_()

* [docs] Add example for requires_grad_()

* update slogdet and *fft

* update tensor rst

* small fixes

* update some docs

* additional doc changes

* update torch and tensor docs

* finish changing tensor docs

* fix flake8

* slogdet with negative det

* Update functional.py tensor ctors

* Fix nll_loss docs

* reorder to move device up

* torch.LongTensor -> torch.tensor or torch.empty in docs

* update tensor constructors in docs

* change tensor constructors

* change constructors

* change more Tensor() to tensor()

* Show requires_grads_ docs

* Fix set_default_dtype docs

* Update docs with new tensor repr

* remove cuda in dtype

* remove changes to gloo submodule

* [docs] document tensor.new_* ctor

* [docs] Add docs for tensor.to(), tensor.float(), etc

* [docs] Moar examples for docs.

* [docs] Warning for tensor ctor copy behavior

* Quick fix

* [docs] Document requires_grad_()

* [docs] Add example for requires_grad_()

* update slogdet and *fft

* update tensor rst

* small fixes

* update some docs

* additional doc changes

* update torch and tensor docs

* finish changing tensor docs

* fix flake8

* slogdet with negative det

* Update functional.py tensor ctors

* Fix nll_loss docs

* reorder to move device up

* torch.LongTensor -> torch.tensor or torch.empty in docs

* update tensor constructors in docs

* change tensor constructors

* change constructors

* change more Tensor() to tensor()

* Show requires_grads_ docs

* Fix set_default_dtype docs

* Link to torch.no_grad, etc, from torch doc

* Add dtype aliases to table

* regen docs again

* Tensor attributes stub page

* link to inplace sampling

* Link torch.dtype, device, and layout

* fix dots after nonfinite floats

* better layout docs
2018-04-21 07:35:37 -04:00
Tongzhou Wang
6a41e2dc47 Add BC mechanism to Module.load_state_dict (#6639)
* Add version counter to module, change load_state_dict to use load_local_state_dict which does class specific loading

* Clarifies version number in docs

* fix jit tests

* fix state_dict tests

* typo

* fix ddp

* exclude version numbers from state dict entries

* Fix jit test and empty modules

* address comments

* test for "."

* revert the private version change in state_dict

* make IN case a hard error

* fix not reporting error when unexpected submodule

* address comments

* disallow empty string in name and remvoe trailing dot
2018-04-19 15:36:30 -04:00
Tongzhou Wang
de9bdf1d31
Module.to doc udpate and example format update (#6774) 2018-04-19 13:30:40 -04:00
Tongzhou Wang
354dac9769
updates module.to doc for the new tensor.to(requires_grad) (#6733) 2018-04-18 18:42:15 -04:00
Tongzhou Wang
1c01eabd3c
Codemod to update our codebase to 0.4 standard (#6641)
* Codemod to update our codebase to 0.4 standard

* Update some of the test scri[ts

* remove Variable in test_clip_grad_value

* fix _symbolic_override_wrapper_maker
2018-04-17 22:06:54 -04:00
Tongzhou Wang
0e93a2c334
Add Module.to (#6629) 2018-04-16 17:46:52 -04:00
Kento NOZAWA
3b58b859b2 Fix typos in docs (#6389) 2018-04-07 12:41:15 -04:00
Kaiyu Shi
605307f8f3 Add support for printing extra information in Module and refactor redundant codes (#5936)
This PR enables users to print extra information of their subclassed nn.Module.
Now I simply insert the user-defined string at the ending of module name, which should be discussed in this PR.

Before this PR, users should redefine the __repr__ and copy&paste the source code from Module.

* Add support for extra information on Module

* Rewrite the repr method of Module

* Fix flake8

* Change the __repr__ to get_extra_repr in Linear

* Fix extra new-line for empty line

* Add test for __repr__ method

* Fix bug of block string indent

* Add indent for multi-line repr test.

* Address review comments

* Update tutorial for creating nn.Module

* Fix flake8, add extra_repr of bilinear

* Refactor DropoutNd

* Change to extra_repr in some Modules

* Fix flake8

* Refactor padding modules

* Refactor pooling module

* Fix typo

* Change to extra_repr

* Fix bug for GroupNorm

* Fix bug for LayerNorm
2018-04-02 13:52:33 -04:00
Tongzhou Wang
b21e135ab8 Add class-specific error when key mismatch in load_state_dict (#6086) 2018-03-29 12:22:23 +02:00
Tongzhou Wang
261dd6ea83 fix named_modules doc, clarify eval doc (#5691) 2018-03-10 17:35:07 -05:00
Kaiyu Shi
248c93372d Check value type for register_buffer (#5657)
* Check value type when registering buffer

* Fix PEP8

* Use isinstance in favor of is_tensor
2018-03-10 13:02:04 +01:00
Tongzhou Wang
57c7d132c9 Fix nn.Module.apply doc formatting (#5623)
* fix nn.Module.apply doc example

* other examples' double-colon and newline'
2018-03-08 22:26:01 -05:00
anderspapitto
b9cc035654 import torch.jit in torch/__init__.py (#5638)
previously, it was being implicitly imported via the import of
torch.onnx

this is no longer the case, and is a hacky thing to depend on anyway,
so import it explicitly
2018-03-08 22:17:47 -05:00