3 Commits

Author SHA1 Message Date
Peter Goldsborough
13814d6744 Remove use of data() in optimizers (#10490)
Summary:
After talking to users of the C++ API, we found that having the tensor type be `autograd::Variable` causes more complications than having it be `at::Tensor`. Using `at::Tensor` used to be a problem because it didn't have the "autograd API" of `Variable` (e.g. the `detach()` or `grad()` methods), but those methods are now on `at::Tensor`. As such, we want to make one last big breaking change so that the tensor type is `at::Tensor`, while factory methods like `torch::ones` return `Variable`s disguised as `at::Tensor`. This will make many things easier, like calling functions in ATen that take vectors of tensors.

This PR takes a small step in this direction by updating the optimizer classes to not use `.data()` on `Variable` to access the underlying `at::Tensor`. Using `.data()` is effectively a hack to work around our modification rules for tensors that require grad. The proper way of doing things is to guard in-place operations with `with torch.no_grad():` in Python or, equivalently, with `NoGradGuard` in C++.
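To illustrate the guard pattern described above, here is a minimal, self-contained sketch. Everything in the `toy` namespace is a hypothetical stand-in written for this example, not libtorch's actual implementation: a grad-mode flag, an RAII `NoGradGuard` that switches the flag off for the duration of a scope, and a parameter type that rejects in-place updates while grad mode is on.

```cpp
#include <cassert>
#include <vector>

// Toy stand-ins for illustration only; these are NOT libtorch types.
namespace toy {

// Per-thread "grad mode" flag, enabled by default.
inline bool& grad_enabled() {
  thread_local bool enabled = true;
  return enabled;
}

// RAII guard: turns grad mode off on construction and restores the
// previous mode on destruction, so guards can be nested safely.
class NoGradGuard {
 public:
  NoGradGuard() : previous_(grad_enabled()) { grad_enabled() = false; }
  ~NoGradGuard() { grad_enabled() = previous_; }
  NoGradGuard(const NoGradGuard&) = delete;
  NoGradGuard& operator=(const NoGradGuard&) = delete;

 private:
  bool previous_;
};

// A scalar "parameter" with a gradient (assumed layout, for the sketch).
struct Param {
  double value;
  double grad;
  bool requires_grad;

  // In-place update. Returns false (and leaves value untouched) when the
  // parameter requires grad and grad mode is still enabled.
  bool add_(double delta) {
    if (requires_grad && grad_enabled()) {
      return false;
    }
    value += delta;
    return true;
  }
};

// Plain SGD step: the guard makes the in-place updates legal without
// reaching for the underlying data.
inline void sgd_step(std::vector<Param>& params, double lr) {
  NoGradGuard guard;
  for (auto& p : params) {
    p.add_(-lr * p.grad);
  }
}

}  // namespace toy
```

Outside the guard, `add_` on a parameter that requires grad is rejected; inside `sgd_step` the same update goes through, and grad mode is restored once the function returns.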

The next step can then simply redefine `torch::Tensor` to be `at::Tensor`. This transition should be smooth, since all methods available on `Variable` are at this point available on `at::Tensor`.

For this PR I:

1. Modified the implementations of optimizers to not use `.data()`. This means the implementations now differ from the PyTorch (Python) optimizers, which still use the legacy `.data` approach.
2. To properly verify (1), I added more fine-grained test cases to our optimizer tests, e.g. `SGD` with and without `weight_decay`, then with `nesterov` etc. Generally more tests = more happy!
3. Minor cleanup of the optimizer codebase

ebetica apaszke
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10490

Differential Revision: D9318229

Pulled By: goldsborough

fbshipit-source-id: fb386700f37840542bc5d323f308ea88fe5ea5c5
2018-08-14 13:10:19 -07:00
Peter Goldsborough
fef9a66d08 Use torch:: instead of at:: (#8911)
Summary:
This PR is the final step to making `torch::` the only namespace users of the C++ API ever see. Basically, I did:

```cpp
namespace torch {
using namespace at;
}
```

And then changed `at::` to `torch::` almost everywhere. This worked surprisingly well out of the box. So users can now write `torch::relu` and `torch::log_softmax` and `torch::conv2d` instead of having to know when to use `at::` and when `torch::`. This is happy!

Another thing I did was to have `using Dtype = at::ScalarType`, which will be the eventual name anyway.
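As a rough illustration of why the using-directive and the alias are enough, here is a self-contained sketch. The `at` namespace below is a toy stand-in with made-up contents (a scalar `relu` and a two-value `ScalarType`), not the real ATen declarations; only the `torch` namespace block mirrors what the summary describes.

```cpp
#include <cassert>
#include <type_traits>

// Toy stand-in for ATen; these are NOT the real declarations.
namespace at {
enum class ScalarType { Float, Double };

inline double relu(double x) { return x > 0.0 ? x : 0.0; }
}  // namespace at

namespace torch {
// Qualified lookup of torch::name falls through to at:: via this directive.
using namespace at;

// Alias so user code can spell the scalar type as torch::Dtype.
using Dtype = at::ScalarType;
}  // namespace torch
```

With this in place, `torch::relu(x)` resolves to `at::relu(x)` and `torch::Dtype` names the same type as `at::ScalarType`, so user code never has to mention `at::`.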

ebetica ezyang apaszke zdevito
Closes https://github.com/pytorch/pytorch/pull/8911

Reviewed By: ezyang

Differential Revision: D8668230

Pulled By: goldsborough

fbshipit-source-id: a72ccb70fca763c396c4b0997d3c4767c8cf4fd3
2018-06-27 14:42:01 -07:00
Peter Goldsborough
1f36caceb2 [C++ API] Rework optimization package (#8815)
* Rework optim folder

* Removed TORCH_OPTIMIZER_CLASS macro

* Got rid of CRTP/Impl

* Removed TORCH_AUTOGRAD_KWARG

* Differentiate between Optimizer and LossClosureOptimizer

* Make Optimizers parameters based instead of model based

* Allow construction of optimizer from arbitrary vector

* Added test for zero grad

* Added test for external parameter vectors

* Now comparing against baseline values

* Documentation

* Post rebase fixes

* Different strategy for creating and accessing buffers in optimizers

* Fix member ordering
2018-06-26 10:13:14 -07:00