* Add non_blocking to Tensor/Module.to
* flake8
* Add argparse tests
* cpp parse
* Use C++ parser
* use a common parse function with Tensor.to
* fix test_jit
* use THPObjectPtr
* increase refcount for None, True, and False
* address comments
* address comments
* Add version counter to module, change load_state_dict to use load_local_state_dict which does class-specific loading
* Clarifies version number in docs
* fix jit tests
* fix state_dict tests
* typo
* fix ddp
* exclude version numbers from state dict entries
* Fix jit test and empty modules
* address comments
* test for "."
* revert the private version change in state_dict
* make IN case a hard error
* fix not reporting an error on unexpected submodules
* address comments
* disallow empty string in name and remove trailing dot
* Codemod to update our codebase to 0.4 standard
* Update some of the test scripts
* remove Variable in test_clip_grad_value
* fix _symbolic_override_wrapper_maker
This PR enables users to print extra information for their subclassed nn.Module.
For now the user-defined string is simply appended to the end of the module name; the exact placement is open for discussion in this PR.
Before this PR, users had to redefine __repr__ and copy-paste the source code from Module. (A usage sketch follows the commit list below.)
* Add support for extra information on Module
* Rewrite the repr method of Module
* Fix flake8
* Change the __repr__ to get_extra_repr in Linear
* Fix extra new-line for empty line
* Add test for __repr__ method
* Fix bug of block string indent
* Add indent for multi-line repr test.
* Address review comments
* Update tutorial for creating nn.Module
* Fix flake8, add extra_repr of bilinear
* Refactor DropoutNd
* Change to extra_repr in some Modules
* Fix flake8
* Refactor padding modules
* Refactor pooling module
* Fix typo
* Change to extra_repr
* Fix bug for GroupNorm
* Fix bug for LayerNorm
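As referenced in the PR description above, a minimal usage sketch of the mechanism; illustrative only, and it assumes the final method name extra_repr from the later commits (MyLinear is a hypothetical module):
>>> import torch
>>> import torch.nn as nn
>>> class MyLinear(nn.Module):
...     def __init__(self, in_features, out_features):
...         super(MyLinear, self).__init__()
...         self.in_features = in_features
...         self.out_features = out_features
...         self.weight = nn.Parameter(torch.randn(out_features, in_features))
...     def extra_repr(self):
...         # the returned string is folded into this module's repr line
...         return 'in_features={}, out_features={}'.format(self.in_features, self.out_features)
>>> print(MyLinear(3, 4))
MyLinear(in_features=3, out_features=4)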
Previously, it was being implicitly imported via the import of torch.onnx.
This is no longer the case, and an implicit import is a hacky thing to depend on anyway, so import it explicitly.
* Improve documentation
1. Add formula for erf, erfinv (reference definitions reproduced after this list)
2. Make exp, expm1 similar to log, log1p
3. Symbol change in ge, le, ne, isnan
* Fix minor nit in the docstring
* More doc improvements
1. Added some formulae
2. Complete scanning till "Other Operations" in Tensor docs
* Add more changes
1. Modify all torch.Tensor wherever required
* Fix Conv docs
1. Fix minor nits in the references for LAPACK routines
* Improve Pooling docs
1. Fix lint error
* Improve docs for RNN, Normalization and Padding
1. Fix flake8 error for pooling
* Final fixes for torch.nn.* docs.
1. Improve Loss Function documentation
2. Improve Vision Layers documentation
* Fix lint error
* Improve docstrings in torch.nn.init
* Fix lint error
* Fix minor error in torch.nn.init.sparse
* Fix Activation and Utils Docs
1. Fix Math Errors
2. Add explicit clean to Makefile in docs to prevent running graph generation script
while cleaning
3. Fix utils docs
* Make PYCMD a Makefile argument, clear up prints in the build_activation_images.py
* Fix batch norm doc error
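For reference (the formulas added in item 1 above), the standard definitions are:
\mathrm{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2} \, dt,
\qquad
\mathrm{erfinv}(y) = x \;\; \text{such that} \;\; \mathrm{erf}(x) = y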
The nn.* counterpart of #5443. Mostly removed the Variable wrapper. Also added docs for nn.RReLU.
Note that torch.randn(*, requires_grad=True) isn't documented until #5462 is done.
This replaces the torch.Tensor constructors with factories that produce
Variables. Similarly, functions on the torch module (e.g. torch.randn)
now return Variables.
To keep the PR to a reasonable size, I've left most of the unused tensor
code. Subsequent PRs will remove the dead code, clean up calls to
torch.autograd.Variable, and rename Variable to Tensor everywhere.
There are some breaking changes because Variable and Tensors had
slightly different semantics. There's a list of those changes here:
https://github.com/pytorch/pytorch/wiki/Breaking-Changes-from-Variable-and-Tensor-merge
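A minimal sketch of the new behavior (illustrative; exact reprs vary by version): factory functions return autograd-aware tensors directly, so wrapping in torch.autograd.Variable is no longer needed.
>>> import torch
>>> x = torch.randn(3, requires_grad=True)    # the factory returns an autograd-aware tensor directly
>>> isinstance(x, torch.Tensor)
True
>>> y = (x * 2).sum()
>>> y.backward()
>>> x.grad is not None                        # gradients accumulate without any Variable wrapping
True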
This removes volatile from Variable. The functionality is mostly
replaced by a global (thread-local) flag, which is controlled by
torch.set_grad_enabled() and the context manager torch.no_grad().
In C++, the flag is exposed through GradMode::is_enabled() and GradMode::set_enabled()
Fixes #3627
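Illustrative usage of the replacement APIs, torch.no_grad() and torch.set_grad_enabled():
>>> import torch
>>> x = torch.ones(3, requires_grad=True)
>>> with torch.no_grad():                     # context-manager form
...     y = x * 2
>>> y.requires_grad
False
>>> _ = torch.set_grad_enabled(False)         # global (thread-local) switch
>>> (x * 2).requires_grad
False
>>> _ = torch.set_grad_enabled(True)          # re-enable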
* Avoid casting integer params and buffers to float(), double() and half()
* Add test for immune integer buffers
* Fix documentation for float(), double() and half()
* Fix test
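A small sketch of the intended behavior (the module M and its step buffer are hypothetical): .float()/.double()/.half() casts only floating-point parameters and buffers, leaving integer-typed buffers untouched.
>>> import torch
>>> import torch.nn as nn
>>> class M(nn.Module):
...     def __init__(self):
...         super(M, self).__init__()
...         self.weight = nn.Parameter(torch.randn(2, 2))
...         self.register_buffer('step', torch.zeros(1, dtype=torch.long))  # integer buffer
>>> m = M().half()
>>> m.weight.dtype, m.step.dtype              # only the floating-point weight was cast
(torch.float16, torch.int64)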
* made it explicit in the docstring of Module.register_forward_hook() that the hook(s) will be called AFTER calling forward().
* added "every time" in docstring of Module.register_forward_pre_hook()
* Add weight normalization implementation
This adds forward "pre-hooks" which get called before the module's
forward() method. Weight norm is implemented as a hook which calculates
the weight variable from the weight_g and weight_v every iteration.
Based on @rtqichen's implementation.
* Specify return type
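The user-facing entry point for this pre-hook is nn.utils.weight_norm; a quick sketch:
>>> import torch.nn as nn
>>> from torch.nn.utils import weight_norm
>>> m = weight_norm(nn.Linear(20, 40), name='weight')
>>> m.weight_g.size()                         # magnitude; recombined with weight_v before every forward()
torch.Size([40, 1])
>>> m.weight_v.size()                         # direction
torch.Size([40, 20])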
A module that returns a non-standard data structure currently breaks due to checks for backward hooks. This refactors the code slightly so that it only breaks when backward hooks are actually registered.
We were keying hooks by RemovableHandle id. However, we don't hold onto
handles and ids of dead objects can be reused. This replaces id(handle)
with a global counter.
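A minimal sketch of the idea (not the actual torch.utils.hooks implementation): each handle draws its key from a process-wide counter, so a key is never reused even after an old handle has been garbage collected.
import itertools

_next_hook_id = itertools.count()            # process-wide counter; values are never recycled

class Handle(object):
    """Sketch of a removable handle keyed by a global counter rather than id(handle)."""
    def __init__(self, hooks_dict):
        self._hooks_dict = hooks_dict
        self.id = next(_next_hook_id)        # unlike id(self), this key cannot collide with a dead handle's

    def remove(self):
        self._hooks_dict.pop(self.id, None)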
The core autograd Variable, Function, and Engine no longer depend on the
Python API. This lets us implement functions in C++. In the future, we
can also multithread the engine and release the GIL for most of the
non-Python backward passes.
Here's the command I used to invoke autopep8 (in parallel!):
git ls-files | grep '\.py$' | xargs -n1 -P`nproc` autopep8 -i
Several rules are ignored in setup.cfg. The goal is to let autopep8
handle everything which it can handle safely, and to disable any rules
which are tricky or controversial to address. We may want to come back
and re-enable some of these rules later, but I'm trying to make this
patch as safe as possible.
Also configures flake8 to match pep8's behavior.
Also configures TravisCI to check the whole project for lint.
The load_state_dict() function now raises an error if the argument
state_dict has extra keys or is missing keys.
Previously, load_state_dict() ignored extra and missing keys, which made
it hard to notice when you load an invalid state_dict. This could
happen, for example, if you save the state_dict for a DataParallel, but
load it into a single model.
The state_dict() function now only includes the Tensor data from the
parameters, which reduces checkpoint size by not saving gradients.
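A short illustration of the DataParallel pitfall this catches (illustrative; the exact exception type and message depend on the PyTorch version):
>>> import torch.nn as nn
>>> model = nn.Linear(2, 3)
>>> wrapped = nn.DataParallel(model)
>>> state = wrapped.state_dict()              # keys are prefixed: 'module.weight', 'module.bias'
>>> model.load_state_dict(state)              # now raises: unexpected 'module.*' keys, missing 'weight'/'bias'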
The register hook calls now return an object that can be used to remove
the hook. For example,
>>> h = module.register_forward_hook(callback)
>>> h.remove() # removes hook
Or as a context manager:
>>> with module.register_forward_hook(callback):
... pass
This makes it easier for libraries to use hooks without worrying about
name collisions.
This hooks into the (internal) ForkingPickler class in multiprocessing
to reduce tensors, storages, and CUDA events instead of our queue from
joblib. This makes it easier to use the standard multiprocessing classes
in later versions of Python.
This also exposes:
- Tensor/Storage.share_memory_()
- Module.share_memory()
These methods move the CPU tensors and storages to shared memory. If
you're using the "fork" method of multiprocessing, these objects can be
directly inherited instead of serialized through a queue.
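Quick sketch of the new methods:
>>> import torch
>>> import torch.nn as nn
>>> t = torch.zeros(5).share_memory_()        # moves the underlying storage into shared memory, in place
>>> t.is_shared()
True
>>> net = nn.Linear(2, 2).share_memory()      # same, applied to every parameter and buffer of the module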
Uses the assignment syntax to get deterministic ordering of parameters.
The ordering of parameters using the constructor syntax is
non-deterministic because kwargs are collected in an unordered dict() in Python 3.5 and earlier.
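With the assignment syntax, submodules are recorded in assignment order, so parameter order is deterministic; an illustrative check (Net is a hypothetical network):
>>> import torch.nn as nn
>>> class Net(nn.Module):
...     def __init__(self):
...         super(Net, self).__init__()
...         self.fc1 = nn.Linear(4, 8)        # assignment order is recorded by the module
...         self.fc2 = nn.Linear(8, 2)
>>> [name for name, _ in Net().named_parameters()]
['fc1.weight', 'fc1.bias', 'fc2.weight', 'fc2.bias']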
modules(): returns an iterator over all modules in the network
children(): returns an iterator over immediate children
Also fix __getitem__ in Sequential
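Quick illustration of the two iterators and Sequential indexing:
>>> import torch.nn as nn
>>> net = nn.Sequential(nn.Linear(2, 2), nn.ReLU())
>>> [type(m).__name__ for m in net.children()]   # immediate children only
['Linear', 'ReLU']
>>> [type(m).__name__ for m in net.modules()]    # the container itself, then everything below it
['Sequential', 'Linear', 'ReLU']
>>> type(net[0]).__name__                        # __getitem__ on Sequential
'Linear'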
* _forward is renamed forward since users should override it
* some __call__ overrides are changed to forward
* functions which return a single variable are changed to return that
variable instead of a one-element tuple
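A minimal sketch reflecting the new conventions (Square and some_input are placeholders):
>>> import torch.nn as nn
>>> class Square(nn.Module):
...     def forward(self, input):             # override forward(); __call__ dispatches to it and runs hooks
...         return input * input              # return the variable itself, not a one-element tuple
>>> net = Square()
>>> out = net(some_input)                     # call the module; do not call forward() directly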