Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44715
The Python optimizer API is simple and intuitive. In large-scale distributed training (e.g. Distributed Model Parallel), however, users often want multithreaded rather than multiprocess training, because it offers better resource utilization and efficiency.
This PR introduces the concept of a functional optimizer (similar to `nn.functional`). The optimizer is split into two parts: 1. optimizer state management 2. optimizer computation. The computation part is exposed as a separate functional API available to internal and OSS developers; callers maintain their own state and call the functional API directly. The end-user API stays the same, while the functional API is TorchScript friendly and can be used by the distributed optimizer to speed up training without the GIL.
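The split described above can be illustrated with a minimal sketch. It uses plain Python floats instead of tensors so that it runs without dependencies, and the names `adagrad_update` and `TinyAdagrad` are hypothetical, not the API added by this PR. The point is only the shape of the design: a stateless function does the math, and a thin wrapper owns the state.

```python
import math


def adagrad_update(params, grads, state_sums, *, lr=0.01, eps=1e-10):
    """Computation part (hypothetical name): updates params and state_sums in place.

    The caller owns `state_sums`; this function keeps no state of its own.
    """
    for i, (p, g) in enumerate(zip(params, grads)):
        state_sums[i] += g * g  # accumulate squared gradients
        params[i] = p - lr * g / (math.sqrt(state_sums[i]) + eps)


class TinyAdagrad:
    """State-management part (hypothetical name): keeps the familiar step() interface."""

    def __init__(self, params, lr=0.01, eps=1e-10):
        self.params = params
        self.lr = lr
        self.eps = eps
        self.state_sums = [0.0] * len(params)

    def step(self, grads):
        # Delegate all arithmetic to the functional form.
        adagrad_update(self.params, grads, self.state_sums, lr=self.lr, eps=self.eps)


# A caller that manages its own state can skip the wrapper entirely.
weights = [1.0, 2.0]
sums = [0.0, 0.0]
adagrad_update(weights, [0.5, -1.0], sums, lr=0.1)
```

Because the functional form takes everything it needs as arguments, a caller such as a distributed optimizer can hold the state wherever is convenient and invoke the computation directly.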
Test Plan: Imported from OSS
Reviewed By: ailzhang
Differential Revision: D23935258
Pulled By: wanchaol
fbshipit-source-id: d2a5228439edb3bc64f7771af2bb9e891847136a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24980
We'll need this internally, so this just updates the open source version. The other optimizers have this argument anyway.
Test Plan: Imported from OSS
Differential Revision: D16945279
Pulled By: li-roy
fbshipit-source-id: 0b8cc86f15387cd65660747899d3d7dd870cff27
Summary:
The current code initializes `state` in the `__init__` method, but that initialization is not invoked in `add_param_group`, so parameters added later have no state.
I followed the approach used in other optimizers to initialize `state`.
```python
import torch
emb = torch.nn.Embedding(10,10)
emb2 = torch.nn.Embedding(10,10)
optim = torch.optim.Adagrad(emb.parameters())
print(optim.state[emb.weight]) # already initialized
optim.add_param_group({'params': emb2.parameters()})
print(optim.state[emb2.weight]) # empty dict
loss = emb2.weight.sum() + emb.weight.sum()
loss.backward()
optim.step() # raised KeyError
```
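One pattern that avoids this kind of failure is to create per-parameter state lazily inside `step()`, so that groups added later are handled like the original ones. The sketch below is dependency-free and uses hypothetical names (`Param`, `LazyStateAdagrad`); it assumes lazy initialization is the approach meant here, which this message does not spell out.

```python
import math
from collections import defaultdict


class Param:
    """Stand-in for a tensor parameter (hypothetical): a value and its gradient."""

    def __init__(self, value):
        self.value = value
        self.grad = 0.0


class LazyStateAdagrad:
    """Toy optimizer that creates per-parameter state on first use."""

    def __init__(self, params, lr=0.01, eps=1e-10):
        self.lr = lr
        self.eps = eps
        self.param_groups = []
        self.state = defaultdict(dict)  # empty dict until the first step()
        self.add_param_group({"params": list(params)})

    def add_param_group(self, group):
        # No state is created here, so there is nothing to forget.
        self.param_groups.append(group)

    def step(self):
        for group in self.param_groups:
            for p in group["params"]:
                state = self.state[p]
                if not state:  # first time this parameter is seen
                    state["sum"] = 0.0
                state["sum"] += p.grad ** 2
                p.value -= self.lr * p.grad / (math.sqrt(state["sum"]) + self.eps)


a, b = Param(1.0), Param(2.0)
optim = LazyStateAdagrad([a], lr=0.1)
optim.add_param_group({"params": [b]})  # added after construction
a.grad, b.grad = 0.5, -1.0
optim.step()  # works for both groups
```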
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17679
Differential Revision: D14577575
Pulled By: ezyang
fbshipit-source-id: 12440079ac964b9eedad48e393d47f558babe300
Summary:
This PR cleans up the `at::Tensor` class by removing all methods that start with an underscore in favor of functions in the `at::` namespace. This greatly simplifies the `Tensor` class and makes it clearer which parts of the API are public and which are not.
For this I changed `native_functions.yaml` and `Declarations.cwrap` to make all underscore methods `variant: function` (or add such a statement to begin with), and then fixed all code locations using the underscore methods.
ezyang colesbury gchanan
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11152
Differential Revision: D9683607
Pulled By: goldsborough
fbshipit-source-id: 97f869f788fa56639c05a439e2a33be49f10f543
As discussed in #1441.
I also added some docs giving clear guidance about how to handle coalescing
in sparse tensors.
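
As a rough illustration of why coalescing matters, the sketch below models a sparse vector as parallel lists of indices and values (a simplification, not the tensor API) and uses a hypothetical `coalesce` helper. Duplicate indices are harmless for linear operations, but a nonlinear step such as squaring gives a different answer unless duplicates are summed first.

```python
from collections import defaultdict


def coalesce(indices, values):
    """Sum values that share an index and return sorted, duplicate-free lists."""
    totals = defaultdict(float)
    for i, v in zip(indices, values):
        totals[i] += v
    keys = sorted(totals)
    return keys, [totals[k] for k in keys]


# Uncoalesced input: index 0 appears twice.
indices = [0, 2, 0]
values = [1.0, 3.0, 2.0]

# Squaring each stored entry and then summing treats the duplicates separately.
naive = defaultdict(float)
for i, v in zip(indices, values):
    naive[i] += v * v  # index 0 accumulates 1 + 4

# Coalescing first squares the true entry at each index.
ci, cv = coalesce(indices, values)
correct = {i: v * v for i, v in zip(ci, cv)}  # index 0 becomes (1 + 2) ** 2
```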
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Here's the command I used to invoke autopep8 (in parallel!):
git ls-files | grep '\.py$' | xargs -n1 -P`nproc` autopep8 -i
Several rules are ignored in setup.cfg. The goal is to let autopep8
handle everything which it can handle safely, and to disable any rules
which are tricky or controversial to address. We may want to come back
and re-enable some of these rules later, but I'm trying to make this
patch as safe as possible.
Also configures flake8 to match pep8's behavior.
Also configures TravisCI to check the whole project for lint.