Commit Graph

278 Commits

Author SHA1 Message Date
David Riazati
fef52cc1f8 Add resolver for 'torch' module (#10847)
Summary:
This lets you compile builtin functions from C++ without depending on Python

```cpp
auto module = torch::jit::compile(R"JIT(
def my_script_method(x, y):
    return torch.relu(x) + y
)JIT");
IValue result = module->run_method("my_script_method", 1, 2);
```

goldsborough zdevito apaszke
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10847

Differential Revision: D9543461

Pulled By: driazati

fbshipit-source-id: 6160dae094030ca144a0df93cb9f26aa78c8cf27
2018-09-06 12:42:21 -07:00
Peter Goldsborough
dccd0f2de6 Bag of clang tidy fixes for torch/csrc/ and torch/csrc/autograd (#11050)
Summary:
Linting `torch/csrc/` (non-recursive) and `torch/csrc/autograd` (non-recursive).

Fixed things like:
- `typedef` vs `using`
- Use `.empty()` instead of comparing with an empty string or using `.size() == 0`
- Use range-based for loops instead of old-style loops (`modernize-`)
- Remove redundant `virtual` from functions already marked `override`
- Replace `stdint.h` with `cstdint`
- Replace `return Type(x, y)` with `return {x, y}`
- Use boolean literals (`true`/`false`) instead of numbers (1/0)
- More ...

ezyang apaszke cpuhrsch
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11050

Differential Revision: D9597505

Pulled By: goldsborough

fbshipit-source-id: cb0fb4793ade885a8dbf4b10484487b84c64c7f2
2018-09-05 19:55:50 -07:00
Edward Yang
b2217109ec Move TensorOptions to ATen/core
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11147

Reviewed By: gchanan

Differential Revision: D9614321

fbshipit-source-id: 618cb342eb7c52181425f6bb9c17b9ecdb87a394
2018-09-04 08:55:54 -07:00
Edward Yang
0ff1bb0d8a Remove Type constructor from TensorOptions, add Type::options (#11189)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11189

Replaces it with an operator TensorOptions() method on
Type, reestablishing the implicit conversion.  I originally
wanted to get rid of the implicit conversion entirely, but
there were a *lot* of use-sites, so I added it back to avoid
a huge codemod.  In this patch, I only had to fix sites that
used the optional device_index API.
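
For illustration, a minimal sketch of what call sites can look like after this change (hedged; the exact `options()` signature is assumed from the summary):

```cpp
at::Tensor tensor = at::ones({2, 3});
at::Type& type = tensor.type();

at::TensorOptions opts = type.options();  // new explicit accessor
at::TensorOptions same = type;            // implicit conversion, kept to avoid a huge codemod

auto y = at::empty({2, 3}, opts);
```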

Reviewed By: cpuhrsch

Differential Revision: D9628281

fbshipit-source-id: 5fe2a68eefb77a3c9bb446f03a94ad723ef90210
2018-09-04 08:10:04 -07:00
Edward Yang
cd4c32691d Add complex32, complex64 and complex128 dtypes (#11173)
Summary:
We don't generate corresponding Type implementations for them,
so this doesn't do anything at the moment.

We don't plan on supporting complex32 in the near future, but
it is added to reserve the name and number in case we do at
some point in the future.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/11173

Reviewed By: SsnL

Differential Revision: D9627477

Pulled By: ezyang

fbshipit-source-id: f49a44ab1c92d8a33130c249ac7b234f210a65e6
2018-09-03 19:19:36 -07:00
Edward Yang
2c5ae8c4bf Get rid of type() method on TensorOptions; use at::getType instead (#11023)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11023

I'd like TensorOptions to not know anything about Context, so I can
move it to ATen/core without pulling in Context.  To do this, the
type() method has to go, since it consults the context to get a Type.
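
A hedged sketch of the call-site migration this implies (the exact `at::getType` overload is assumed):

```cpp
at::TensorOptions options = at::TensorOptions().dtype(at::kFloat).device(at::kCPU);

// Before: at::Type& type = options.type();
// After: resolve the Type through the free function instead.
at::Type& type = at::getType(options);
```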

Reviewed By: cpuhrsch

Differential Revision: D9562467

fbshipit-source-id: 61a18a76eb042a5e70b64b963501e9d68c25d4f0
2018-08-31 14:27:05 -07:00
Edward Yang
d95e68c8cc Delete Tensor constructor from TensorOptions. (#11101)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11101

I'd like to invert the dependency between Tensor and TensorOptions
(such that Tensor includes TensorOptions); to do this, I'd prefer
there to not be a Tensor constructor.  Eventually, all references
to Tensor will disappear from TensorOptions.h.

Reviewed By: cpuhrsch

Differential Revision: D9585627

fbshipit-source-id: dd4a28b2c06b1e55f629762915f03c2b6c34d840
2018-08-31 09:55:01 -07:00
Edward Yang
9fac0a5093 Rename at::getType to at::getNonVariableType (#11096)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11096

To discourage willy-nilly use, and make it clearer that it
is not a Variable

Reviewed By: cpuhrsch

Differential Revision: D9583699

fbshipit-source-id: 4fbde0c01ae3deb2c7ef8c125a9028f089b203ae
2018-08-31 09:10:49 -07:00
Edward Yang
c836a04dc8 Delete a bunch of uses of getType in favor of TensorOptions.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11087

Reviewed By: cpuhrsch

Differential Revision: D9581560

fbshipit-source-id: ebe3c4c0956da8a7215ada287bf6526dbcb2b07d
2018-08-30 20:11:24 -07:00
Gregory Chanan
87a7840fa6 Remove Tensor constructor of Scalar. (#10852)
Summary:
This is a step along the way to removing Tensor as a member of the tagged union in Scalar.  This simplifies ordering dependencies, because currently Scalar and Tensor depend on each other (which is why we introduce a TensorBase).  Also, this API isn't particularly useful publicly: we can't autograd through Scalars, so you still need a Tensor overload basically everywhere anyway.

I'm undecided what the final API should be here.  We could keep a Tensor constructor on Scalar, but have it generate a local scalar; this is convenient, but given that this API used to be non-synchronizing, it may not be the best choice.

For now, I'm just using _local_scalar, which is clear, although we should get rid of the prefix _ if that's the API we intend to promote.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10852

Reviewed By: ezyang

Differential Revision: D9496766

Pulled By: gchanan

fbshipit-source-id: 16f39b57536b9707132a5a4d915650c381bb57db
2018-08-24 16:02:05 -07:00
Peter Goldsborough
9403e0cac0 Use ATen implementation of RNNs (#10761)
Summary:
apaszke recently ported RNNs from Python into ATen, which means we can replace our implementation in the C++ API (written by ebetica) with the ATen implementation, which cleans up a lot of code (+99, -323). Thanks apaszke!

I also added the `bidirectional` and `batch_first` options to the C++ API RNN options, just because why not.
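
For reference, a hedged sketch of an LSTM built with these options (option and setter names are assumed to mirror the Python API):

```cpp
// Sketch only: the setter names (`layers`, `batch_first`, `bidirectional`) are assumptions.
torch::nn::LSTM lstm(torch::nn::LSTMOptions(/*input_size=*/10, /*hidden_size=*/20)
                         .layers(2)
                         .batch_first(true)
                         .bidirectional(true));
auto output = lstm->forward(torch::randn({8, 5, 10}));  // batch-first input: (batch, seq, features)
```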

apaszke ebetica
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10761

Differential Revision: D9443885

Pulled By: goldsborough

fbshipit-source-id: b6ef7566b9ced2b2f0b2e1f46c295b6f250c65a8
2018-08-23 16:12:14 -07:00
Edward Yang
19031c68dc Use intrusive_ptr in Storage; replace unique_ptr<Storage> with Storage (#10488)
Summary:
```
Use intrusive_ptr in Storage; replace unique_ptr<Storage> with Storage

This patch does two major changes:

- It replaces the use of Retainable in Storage with a new implementation
  based on intrusive_ptr.  This will be necessary because Caffe2 will
  be using this class to implement intrusive_ptrs, and we need to
  line these up for the merge.  One good thing about the new implementation is
  that the default copy/move constructors/assignment operators and destructor
  work automatically, instead of needing to be hardcoded into Storage/Tensor.

- It replaces all places where we returned std::unique_ptr<Storage> with
  Storage, collapsing an unnecessary double indirection that is no longer
  necessary now that we have correctly working copy/move constructors.

I didn't initially want to do step (2), but it was very important to
eliminate all bare uses of new Storage and new StorageImpl, and making
this API change was the most straightforward way to do that.

HOW TO FIX YOUR CODE IN THE NEW API

- You no longer need to dereference the result of tensor.storage() to pass
  it to set.  So, instead of:

      x.set_(*y.storage());

  just write:

      x.set_(y.storage());

- If you were accessing methods on StorageImpl via the pImpl() method, you
  must use the dot operator to call pImpl().  Even better: just drop pImpl;
  we now have method forwarding.  So, instead of:

      storage->pImpl()->data();

  just do:

      storage->data();
      // storage.pImpl()->data() works too but is not as recommended

- storage->getDevice() is no more; instead use storage->device().index()

MISC CODE UPDATES

- retain, release, weak_retain, weak_release and weak_lock are now
  reimplemented using the "blessed API", and renamed to make it
  clearer that their use is discouraged.

- nvcc OS X and general OS X portability improvements to intrusive_ptr

- A new comment in intrusive_ptr describing how stack allocated
  intrusive_ptr_targets work differently than heap allocated ones
  from c10::make_intrusive

CAVEAT EMPTOR

- THStorage_weakRetain used to work on strong pointers, but it NO LONGER
  works with intrusive_ptr.  You must reclaim the strong pointer into a
  real strong pointer, construct a weak pointer from it, and then release
  the strong and weak pointers.  See StorageSharing.cpp for an example.
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10488

Reviewed By: gchanan

Differential Revision: D9306134

Pulled By: ezyang

fbshipit-source-id: 02d58ef62dab8e4da6131e1a24834a65c21048e2
2018-08-21 21:39:55 -07:00
Edward Yang
6bdbad93b9 Refactor Device to not depend on Backend. (#10478)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10478

- Removed Backend constructor from Device, and fixed all
  use-sites to use DeviceType::CPU instead of kCPU, or
  use a new function backendToDeviceType to perform
  the conversion.
- New method device_type() on Type; it gives you the
  underlying device type, e.g., CPU for SparseCPU.
- We add backward compatibility for kCPU/kCUDA uses,
  by introducing a new special type which is implicitly
  convertible to both DeviceType and Backend.  As long as
  you don't define a function that's overloaded on both
  DeviceType and Backend (but not on BackendOrDeviceType),
  the implicit conversions will ensure that uses
  of at::Device(at::kCPU) keep working. We fixed use-sites in
  the library, but did NOT fix sites in the test code, so that
  we can exercise this BC code.
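
A small sketch of the conversions described above (hedged; spellings taken from the summary):

```cpp
at::Device cpu_device(at::kCPU);  // still compiles thanks to the BC conversion

// Explicit Backend -> DeviceType conversion where needed.
at::DeviceType dt = at::backendToDeviceType(at::Backend::SparseCPU);  // DeviceType::CPU

// Type::device_type() gives the underlying device type, e.g. CPU for SparseCPU.
at::Tensor t = at::ones({2});
at::DeviceType from_type = t.type().device_type();
```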

Reviewed By: Yangqing

Differential Revision: D9301861

fbshipit-source-id: 9a9d88620500715c7b37e655b4fd761f6dd72716
2018-08-18 17:39:14 -07:00
Peter Goldsborough
2e0dd86903 Make torch::Tensor -> at::Tensor (#10516)
Summary:
This PR removes the `using Tensor = autograd::Variable;` alias from `torch/tensor.h`, which means `torch::Tensor` is now `at::Tensor`. It also fixes up some last uses of `.data()` and tidies up the resulting code. For example, I was able to remove `TensorListView`, such that code like

```
auto loss = torch::stack(torch::TensorListView(policy_loss)).sum() +
    torch::stack(torch::TensorListView(value_loss)).sum();
```

is now

```
auto loss = torch::stack(policy_loss).sum() + torch::stack(value_loss).sum();
```

CC jgehring

ebetica
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10516

Differential Revision: D9324691

Pulled By: goldsborough

fbshipit-source-id: a7c1cb779c9c829f89cea55f07ac539b00c78449
2018-08-15 21:25:12 -07:00
Peter Goldsborough
13814d6744 Remove use of data() in optimizers (#10490)
Summary:
After talking to users of the C++ API we found that having the tensor type be `autograd::Variable` causes more complications than having it be `at::Tensor`. It used to be a problem because `at::Tensor` didn't have the "autograd API" of variable (e.g. `detach()` or `grad()` methods), but those methods are now on `at::Tensor`. As such, we want to make a last big breaking change to have the tensor type be `at::Tensor`, while factory methods like `torch::ones` will return `Variable`s disguised as `at::Tensor`. This will make many things easier, like calling functions in ATen that take vectors of tensors.

This PR makes a small step in this direction by updating the optimizer classes to not use `.data()` on `Variable` to access the underlying `at::Tensor`. Using `.data()` is effectively a hack to work around our modification rules for tensors that require grad. The proper way of doing things is to use `with torch.no_grad` or equivalently `NoGradGuard` in C++ to guard in-place operations.
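
A minimal sketch of the guarded in-place update pattern this refers to (assuming `torch::NoGradGuard` is the C++ counterpart of `with torch.no_grad()`):

```cpp
#include <torch/torch.h>
#include <vector>

// Sketch: update parameters in place without recording autograd history.
void sgd_step(std::vector<torch::Tensor>& parameters, double learning_rate) {
  torch::NoGradGuard no_grad;
  for (auto& p : parameters) {
    if (p.grad().defined()) {
      p.add_(-learning_rate * p.grad());
    }
  }
}
```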

The next step can then simply redefine `torch::Tensor` to be `at::Tensor`. This transition should be smooth, since all methods available on `Variable` are at this point available on `at::Tensor`.

For this PR I:

1. Modified the implementations of optimizers to not use `.data()`. This means the implementations are now different from PyTorch, which still uses the legacy method of using `.data`.
2. To properly verify (1), I added more fine-grained test cases to our optimizer tests, e.g. `SGD` with and without `weight_decay`, then with `nesterov` etc. Generally more tests = more happy!
3. Minor cleanup of the optimizer codebase

ebetica apaszke
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10490

Differential Revision: D9318229

Pulled By: goldsborough

fbshipit-source-id: fb386700f37840542bc5d323f308ea88fe5ea5c5
2018-08-14 13:10:19 -07:00
Zeming Lin
b8530dc1f0 A few additions (#9837)
Summary:
This PR provides 4 fixes / features:

1. torch::nn::Cloneable inherits virtually from torch::nn::Module. We want to pass around a module with new functions, and the best way to do this is to do a diamond inheritance pattern, i.e.

```c++
struct MySuperModuleImpl : virtual public torch::nn::Module {
  virtual void myFunction() = 0;
};

template <typename Derived>
struct MySuperModule : public torch::nn::Cloneable<Derived>, public MySuperModuleImpl {};

struct MyModule : public MySuperModule<MyModule> {
  void myFunction() override;
};
```

This way, we can simply pass around MySuperModuleImpl instead of torch::nn::Module.

2. Optimizer options are public now, since there's no way to decay the LR or modify it during training otherwise
3. Serialization functions create autograd history and call copy_! Bad!
4. Optimizers did not create buffers after add_parameters was called.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9837

Reviewed By: goldsborough

Differential Revision: D9199746

Pulled By: ebetica

fbshipit-source-id: 76d6b22e589a42637b7cc0b5bcd3c6b6662fb299
2018-08-13 10:24:58 -07:00
Sebastian Messmer
f51f15bb27 Update include paths for ATen/core (#10130)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10130

Update some include paths to make them internally consistent

Reviewed By: ezyang

Differential Revision: D9119906

fbshipit-source-id: b44e5cab8e8e795ee18afe9ffc6caf1f2b413467
2018-08-03 11:57:02 -07:00
Xiang Gao
6fc75eadf0 Add CELU activation to pytorch (#8551)
Summary:
Also fuse input scale multiplication into ELU

Paper:
https://arxiv.org/pdf/1704.07483.pdf
Pull Request resolved: https://github.com/pytorch/pytorch/pull/8551

Differential Revision: D9088477

Pulled By: SsnL

fbshipit-source-id: 877771bee251b27154058f2b67d747c9812c696b
2018-08-01 07:54:44 -07:00
Christian Puhrsch
ef9801f32c Merge THStorage into at::Storage
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/9772

Reviewed By: ezyang

Differential Revision: D9019375

Pulled By: cpuhrsch

fbshipit-source-id: d5185e29747929d648e4260db4967452cd40f563
2018-07-27 13:53:55 -07:00
Anders Papitto
620952117e remove unnecessary -Wno= flags
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/9608

Differential Revision: D8946664

Pulled By: anderspapitto

fbshipit-source-id: b05f10af58da25b2a2588f7153f393bb3637f29a
2018-07-24 18:40:42 -07:00
Peter Goldsborough
d05a8145c5 Change behavior of clone to clone to a device (#9609)
Summary:
ebetica made me aware that `nn::Module::clone()` always clones to the current device (usually CPU) instead of preserving the device of each parameter. This PR changes the signature of `clone` from

`shared_ptr<Module> clone()`

to

`shared_ptr<Module> clone(optional<Device> device = nullopt)`

with semantics of:

1. If a `device` is given, all parameters/buffers are moved to that device,
2. If no `device` is supplied (default), parameters/buffers retain their device.
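
For example (a hedged sketch; the `torch::Device` spelling is assumed):

```cpp
torch::nn::Linear module(3, 4);

// (2) no device given: every parameter/buffer keeps its current device
std::shared_ptr<torch::nn::Module> same_device = module->clone();

// (1) device given: every parameter/buffer is moved to that device
std::shared_ptr<torch::nn::Module> on_gpu = module->clone(torch::Device(torch::kCUDA, 0));
```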

ezyang apaszke ebetica
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9609

Differential Revision: D8957367

Pulled By: goldsborough

fbshipit-source-id: 0d409ae645ed2b8d97d6fc060240de2f3d4bc6c8
2018-07-23 14:55:25 -07:00
Peter Goldsborough
31ba2f15e1 Rename embedding variable to weight (#9720)
Summary:
I renamed the variable in the `Embedding` module from `weight` to `table` a few months ago, because it seemed like a more meaningful name. Turns out it's not such a good idea because it deviates from PyTorch, which unnecessarily breaks C++->Python translated code.

ebetica ezyang apaszke
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9720

Differential Revision: D8955647

Pulled By: goldsborough

fbshipit-source-id: 77228b07d2b733866e8cdecaa6d0686eef4cc3ea
2018-07-23 14:55:24 -07:00
Peter Goldsborough
5094684238 Create torch::from_blob for variables (#9605)
Summary:
Need an overload of `at::from_blob` for Variables.
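
A minimal usage sketch (hedged):

```cpp
// Wraps existing memory as a (Variable-backed) tensor without copying.
float data[] = {1.0f, 2.0f, 3.0f, 4.0f};
auto t = torch::from_blob(data, /*sizes=*/{2, 2});
```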

ezyang colesbury ebetica
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9605

Differential Revision: D8926226

Pulled By: goldsborough

fbshipit-source-id: e377c0d019d4377f3fc124614c7dcc562aa69990
2018-07-23 12:40:12 -07:00
Edward Yang
23ed26a0c3 Guard include of cuda-only header comm.h (#9656)
Summary:
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9656

Reviewed By: colesbury

Differential Revision: D8941361

Pulled By: ezyang

fbshipit-source-id: c18cb0e606ae0608e5892040192b8792ae542b74
2018-07-20 19:46:36 -07:00
Peter Goldsborough
b770156a7a Functional DataParallel (#9234)
Summary:
This PR adds the functional version of `DataParallel` (i.e. `data_parallel`) to the C++ frontend.

For this, I had to:
1. Add "differentiable" versions of scatter and gather, which perform their inverse operation in the backward pass, to C++. I've added them under `torch/csrc/autograd/functions/comm.{h,cpp}`. I had to move some utilities from `VariableType.cpp` into `torch/csrc/autograd/functions/utils.h`, and changed them a bit to fix the `const_cast`s for which there were `TODO`s,
2. Implement the `replicate`, `parallel_apply` and the combining `data_parallel` functions in C++.

`replicate` is implemented based on our existing `clone()` interface, along with the ability to set the current device via `at::OptionsGuard` (so nice).

`parallel_apply` is implemented using `at::parallel_for` (CC cpuhrsch) and [follows the code from PyTorch](https://github.com/pytorch/pytorch/blob/master/torch/nn/parallel/parallel_apply.py).

Added lots of tests for these things.
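
A rough usage sketch (hedged: the namespace and the defaulted device list are assumptions):

```cpp
torch::nn::Linear model(10, 5);
auto input = torch::randn({8, 10}, torch::device(torch::kCUDA));

// Replicates `model`, scatters `input` across the available CUDA devices,
// applies the replicas in parallel, and gathers the outputs.
auto output = torch::nn::parallel::data_parallel(model, input);
```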

apaszke ezyang ebetica colesbury
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9234

Differential Revision: D8865182

Pulled By: goldsborough

fbshipit-source-id: 4f1fecf2b3f3bc1540c071dfb2d23dd45de433e4
2018-07-19 16:12:04 -07:00
Peter Goldsborough
7e78e80d94 Make error message for empty module friendlier (#9565)
Summary:
In our pimpl system, default constructing a module holder default constructs the contained module. This means `Linear linear;` is ill-formed, since `Linear` doesn't have a default constructor. Instead we require `Linear linear = nullptr;` to get the empty state of the `Linear`. This PR makes the error message for the ill-formed case nicer.

I had to change the forwarding constructors of most of our modules for this, but that's a minor adjustment.

E.g.

```
Linear linear;

In file included from /home/psag/pytorch/pytorch/torch/csrc/api/include/torch/nn/module.h:5:0,
                 from /home/psag/pytorch/pytorch/test/cpp/api/module.cpp:3:
/home/psag/pytorch/pytorch/torch/csrc/api/include/torch/nn/pimpl.h: In instantiation of ‘torch::nn::ModuleHolder<Contained>::ModuleHolder() [with Contained = torch::nn::LinearImpl]’:
/home/psag/pytorch/pytorch/torch/csrc/api/include/torch/nn/modules/dropout.h:45:1:   required from here
/home/psag/pytorch/pytorch/torch/csrc/api/include/torch/nn/pimpl.h:46:5: error: static assertion failed: You are trying to default construct a module which has no default constructor. Use = nullptr to give it the empty state (like an empty std::shared_ptr).
     static_assert(
```
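
For completeness, a small sketch of the two well-formed alternatives:

```cpp
// Construct right away ...
torch::nn::Linear linear(3, 4);

// ... or start in the empty state (like an empty std::shared_ptr) and assign later.
torch::nn::Linear deferred = nullptr;
deferred = torch::nn::Linear(3, 4);
```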

ebetica ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9565

Differential Revision: D8903666

Pulled By: goldsborough

fbshipit-source-id: 5e6b788921a27a44359db89afdc2b057facc5cec
2018-07-19 15:56:54 -07:00
Peter Goldsborough
3b886500a0 Add CUDAGuard to ATen (#9277)
Summary:
THCStream was recently moved to ATen by mruberry: https://github.com/pytorch/pytorch/pull/8997. This PR now introduces a guard class that replaces `AutoStream` from `torch/csrc/` and also uses this new stream interface.

I had to extend the `CUDAStream` interface with unchecked calls, so that we can reset the stream without throwing an exception in the guard's destructor.

colesbury apaszke ezyang

Fixes https://github.com/pytorch/pytorch/issues/7800
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9277

Differential Revision: D8865183

Pulled By: goldsborough

fbshipit-source-id: 67c9bc09629d92fa5660286b5eec08fde9108cd7
2018-07-18 14:40:31 -07:00
Peter Goldsborough
2249751422 Add OptimizerBase::add_parameters (#9472)
Summary:
ebetica asked for a way to add parameters to `Optimizer`s after they are created.
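
A hedged usage sketch (the optimizer constructor and `parameters()` conversions are assumed):

```cpp
torch::nn::Linear body(32, 16);
torch::nn::Linear head(16, 4);

torch::optim::SGD optimizer(body->parameters(), torch::optim::SGDOptions(0.01));

// Later: register more parameters with the existing optimizer.
optimizer.add_parameters(head->parameters());
```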

ebetica ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9472

Differential Revision: D8872176

Pulled By: goldsborough

fbshipit-source-id: 39a4032c519a6d3b458dd3596361b04afea10365
2018-07-17 14:10:22 -07:00
Peter Goldsborough
ae44a6b5e3 Fix Sequential::clone() (#9372)
Summary:
I noticed that `Sequential::clone()` does not work. This is because `Sequential` does not use `reset()`, which is normally where modules have to initialize and register their submodules. This, in turn, is because of the way `Sequential` allows its modules to be passed in the constructor, which doesn't work with `reset()` (since it does "late" initialization).

I've added some better error messages inside `Cloneable::clone()` which makes this kind of mistake clearer for other users, and tests for `Sequential::clone()`.

I also had to give `AnyModule` a deep `clone()` method.

ebetica ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9372

Differential Revision: D8865189

Pulled By: goldsborough

fbshipit-source-id: b81586e0d3157cd3c4265b19ac8dd87c5d8dcf94
2018-07-16 21:53:42 -07:00
Peter Goldsborough
4a796e4430 Initialization functions (#9295)
Summary:
To allow our C++  customers to use our initialization methods as well, this PR moves some of the code from `torch.nn.init` to ATen, calls it from Python, and adds equivalent code to the C++ frontend.

Notes:
1. Happy to hear thoughts on whether it's ok to have e.g. `torch.nn.init.dirac_` *and* `torch.dirac_` (the former has a `no_grad` guard). We have this for `ones_` and stuff too, so I don't mind it.
2. I left the exception checking in Python because they throw `ValueError`s while ATen errors show as `RuntimeError`s. I imagine this would break users' error handling if someone were to have a `try`-`except` handler for `ValueError` (or maybe that's far-fetched).

EDIT: After discussions with zdevito, the PR now simply duplicates the code in C++ exclusively for the C++ API, and we leave the Python code as-is (to make it easier for people to read/modify).

ebetica ezyang apaszke
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9295

Differential Revision: D8813793

Pulled By: goldsborough

fbshipit-source-id: 4b969f3f75952c1be4e837e19e23b8098e5fbd4b
2018-07-12 18:53:57 -07:00
Peter Goldsborough
153e2e96d4 Make Sequential ref-counted (#9151)
Summary:
In the C++ API, `Sequential` was not refcounted itself; instead it stored `shared_ptr<AnyModule>` to get reference semantics. This is unfortunate because most modules in the API are accessed via `->`, e.g. `Linear l(1, 2); l->forward(...);`. `Sequential` was different in that it had value semantics itself, and thus was accessed via `.`.

This PR makes `Sequential` store `AnyModule` (without extra indirection), and uses the same pImpl mechanism we use for all other modules to make `Sequential` have reference semantics itself. This makes it consistent with the rest of the library. It also removes one level of indirection inside of `Sequential`, which is cool.
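
To illustrate the resulting consistency, a hedged usage sketch:

```cpp
// Sequential is now accessed through `->`, just like Linear and friends.
torch::nn::Sequential seq(
    torch::nn::Linear(3, 4),
    torch::nn::Functional(torch::relu),
    torch::nn::Linear(4, 1));
auto out = seq->forward(torch::ones({2, 3}));
```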

One thing I had to change was that the `ModuleHolder` with which the whole pImpl thing is implemented previously did some tricks to make `Linear(3, 4)` actually construct `Linear(LinearOptions(3, 4))`. This doesn't work well with `Sequential`, since it takes a variadic parameter pack. Instead, I made `ModuleHolder` forward all arguments to the underlying module, and pushed the trick of forwarding constructor arguments to a module's options type down into the actual Modules. This adds one constructor per Module in the library. This is not something user modules have to do (unless they want this nice forwarding themselves). It makes the code simpler overall.

ezyang ebetica apaszke
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9151

Reviewed By: ezyang

Differential Revision: D8809298

Pulled By: goldsborough

fbshipit-source-id: da68452c3de912fbc67af330ba93b5220de6909f
2018-07-11 17:24:59 -07:00
Peter Goldsborough
d863391871 nn::Module::as (#9149)
Summary:
Added a way to `dynamic_cast` an `nn::Module` and get a pointer to it. `nn::Module::is<T>` just checked if the return value of the `dynamic_cast` was nullptr, so I got rid of `is<T>` since it's equivalent to `as<T> != nullptr` (or just `as<T>` due to boolean conversion).

We're now at

```
if (auto* conv = module.as<nn::Conv2d>()) {
  conv->weight.data().normal_(0.0, 0.02);
} else if (auto* bn = module.as<nn::BatchNorm>()) {
  bn->weight.data().normal_(1.0, 0.02);
  bn->bias.data().fill_(0);
}
```

ezyang apaszke ebetica
Closes https://github.com/pytorch/pytorch/pull/9149

Differential Revision: D8735954

Pulled By: goldsborough

fbshipit-source-id: e2b8f6f0cea16a621f8bc0807a33cc7651d25154
2018-07-06 11:10:29 -07:00
Peter Goldsborough
97b9712aed Create Sequential::extend (#9116)
Summary:
There is no dedicated way to concatenate two `Sequential`s in Python, but it's easy enough to do in an immutable fashion by just writing `Sequential(first.modules() + second.modules())`. Concatenating vectors isn't as easy in C++, so I think it's fair to save users some for loops by giving them `Sequential::extend()`.
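
A hypothetical sketch of the resulting API (the exact parameter type of `extend()` is assumed):

```cpp
torch::nn::Sequential first(torch::nn::Linear(3, 4));
torch::nn::Sequential second(torch::nn::Linear(4, 5));

// Appends all of second's modules to first (hypothetical call shape).
first->extend(*second);
```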

apaszke ebetica ezyang

CC jamespinkerton
Closes https://github.com/pytorch/pytorch/pull/9116

Reviewed By: ezyang

Differential Revision: D8719630

Pulled By: goldsborough

fbshipit-source-id: 840d7ac70755350e6202b493c531e30ecbb6546f
2018-07-02 19:42:03 -07:00
Peter Goldsborough
9ce15173fb Move _cudnn_init_dropout_state to TensorOptions and enable cuDNN dropout in C++ API RNNs (#9012)
Summary:
The goal of this PR was to add support for dropout descriptors in the C++ API's RNN class.
The end result is a 4x-5x speedup for our RNN integration tests since they can now use cuDNN instead of autograd when dropout is set.

To achieve this, I had to move `_cudnn_init_dropout_state` to the `TensorOptions` API.

I also fixed a bug around `RNN::cuda()` not flattening parameters for cuDNN.

ebetica ezyang
Closes https://github.com/pytorch/pytorch/pull/9012

Reviewed By: pjh5

Differential Revision: D8689786

Pulled By: goldsborough

fbshipit-source-id: 44fb191f5a38e41c4ded5417306b5bbc012cd56c
2018-06-29 17:25:23 -07:00
Peter Goldsborough
f0772c0ab2 Replace max_pool with max_pool_with_indices (#8946)
Summary:
Re-push from https://github.com/pytorch/pytorch/pull/8892
Closes https://github.com/pytorch/pytorch/pull/8946

Differential Revision: D8666862

Pulled By: goldsborough

fbshipit-source-id: 44cd3d63d347316818a7b0f5f89fce8ff7486736
2018-06-28 16:10:08 -07:00
Peter Goldsborough
66465f1e17 Create nn::Module::is (#8970)
Summary:
When initializing weights for my C++ model, I had to write

```cpp
void initialize_weights(nn::Module& module) {
  if (module.name().find("Conv2d") != std::string::npos) {
    module.parameters()["weight"].data().normal_(0.0, 0.02);
  } else if (module.name().find("BatchNorm") != std::string::npos) {
    auto parameters = module.parameters();
    parameters["weight"].data().normal_(1.0, 0.02);
    parameters["bias"].data().fill_(0);
  }
}
```

The string-based module determination is not very nice, and not very C++-y. So I created `nn::Module::is<T>` which does a `dynamic_cast` inside. It also handles the `ModuleHolder` vs. `Module` distinction.

It now becomes

```cpp
if (module.is<nn::Conv2d>()) {
  module.parameters()["weight"].data().normal_(0.0, 0.02);
} else if (module.is<nn::BatchNorm>()) {
  auto parameters = module.parameters();
  parameters["weight"].data().normal_(1.0, 0.02);
  parameters["bias"].data().fill_(0);
}
```

ebetica ezyang apaszke
Closes https://github.com/pytorch/pytorch/pull/8970

Differential Revision: D8677476

Pulled By: goldsborough

fbshipit-source-id: 053294e19b6a58cce868167596c89639f7de91c2
2018-06-28 16:10:04 -07:00
Peter Goldsborough
ccc14071f4 Fix Module::zero_grad (#8964)
Summary:
`nn::Module::zero_grad` did not respect undefined `grad()` variables. This is fixed (the code now replicates PyTorch).

ebetica ezyang apaszke
Closes https://github.com/pytorch/pytorch/pull/8964

Reviewed By: ezyang

Differential Revision: D8677529

Pulled By: goldsborough

fbshipit-source-id: afdc4ba00dbf5012c37d1f794c731937ee5e422e
2018-06-28 10:26:52 -07:00
Peter Goldsborough
148088a681 Convert at::Tensor to torch::Tensor in AnyModule (#8968)
Summary:
Operations on `Variable`s (or `torch::Tensor`) usually return `at::Tensor`. This is usually fine, but the `AnyModule` used in the implementation of `torch::Sequential` is very picky about types, and does not understand implicit conversions like this. This means that `sequential.forward(at_tensor_that_is_actually_a_variable)` will fail unless you wrap `at_tensor_that_is_actually_a_variable` with `torch::Tensor`.

This PR adds a special case to `AnyModule` that will convert an `at::Tensor` to `torch::Tensor` when the tensor is really a variable, and else just pass the `at::Tensor`. This is a nice little usability improvement for the often-used `Sequential` class.

ebetica ezyang
Closes https://github.com/pytorch/pytorch/pull/8968

Reviewed By: ezyang

Differential Revision: D8670407

Pulled By: goldsborough

fbshipit-source-id: 3635ed6ed28238f3900ce4a876d07f1b11713831
2018-06-28 06:40:48 -07:00
Peter Goldsborough
03d0a70a4d Set random seed at the start of C++ tests (#8903)
Summary:
Sets the random seed at the start of C++ tests so that everything is super deterministic.

I made sure we only generate random values from torch instead of `std::`, so that this seed always applies. I.e. I do:

```
torch::randint(2, {2}, at::kInt64)
```

instead of

```
std::rand() % 2
```

Also got rid of the tests that test the random seeding, since they would interfere here. Those tests were not very useful anyway, since we just use ATen's seeding mechanism, which should work.

Fixes  #7288 #7286 #7289

ebetica ezyang
Closes https://github.com/pytorch/pytorch/pull/8903

Differential Revision: D8667269

Pulled By: goldsborough

fbshipit-source-id: a833e86e156d5e68dae8c53a4b1c433cb0608b6c
2018-06-27 20:09:46 -07:00
Peter Goldsborough
fef9a66d08 Use torch:: instead of at:: (#8911)
Summary:
This PR is the final step in making `torch::` the only namespace users of the C++ API ever see. Basically, I did:

``` cpp

namespace torch {
using namespace at;
}
```

And then changed `torch::` to `at::` almost everywhere. This worked surprisingly well out of the box. So users can now write `torch::relu`  and `torch::log_softmax` and `torch::conv2d` instead of having to know when to use `at::` and when `torch::`. This is happy!

Another thing I did was to have `using Dtype = at::ScalarType`, which will be the eventual name anyway.

ebetica ezyang apaszke zdevito
Closes https://github.com/pytorch/pytorch/pull/8911

Reviewed By: ezyang

Differential Revision: D8668230

Pulled By: goldsborough

fbshipit-source-id: a72ccb70fca763c396c4b0997d3c4767c8cf4fd3
2018-06-27 14:42:01 -07:00
Orion Reblitz-Richardson
9ec0a2aef4 fbshipit-source-id: ba600fcd2b5cefc7621357bdeb05e24cea02e5af 2018-06-27 04:50:56 -07:00
Peter Goldsborough
290d20b094
Replace max_pool with max_pool_with_indices (#8892)
* Create max_poolXd_with_indices

* Match ATen names in ONNX symbolic
2018-06-26 17:09:30 -07:00
Peter Goldsborough
55757357b2
[C++ API] Better forward methods (#8739)
* Better forward methods in C++ API

capitalize error message in test_torch.test_flatten

Support for operator()

* Add operator() to Functional

* Get rid of SigmoidLinear

* Add BoundFunction to FunctionalImpl

* Remove macro from conv because it makes errors more nasty
2018-06-26 13:23:16 -07:00
Peter Goldsborough
1f36caceb2
[C++ API] Rework optimization package (#8815)
* Rework optim folder

* Removed TORCH_OPTIMIZER_CLASS macro

* Got rid of CRTP/Impl

* Removed TORCH_AUTOGRAD_KWARG

* Differentiate between Optimizer and LossClosureOptimizer

* Make Optimizers parameters based instead of model based

* Allow construction of optimizer from arbitrary vector

* Added test for zero grad

* Added test for external parameter vectors

* Now comparing against baseline values

* Documentation

* Post rebase fixes

* Different strategy for creating and accessing buffers in optimizers

* Fix member ordering
2018-06-26 10:13:14 -07:00
Peter Goldsborough
47492ed451
[C++ API] Bag of fixes (#8843)
* Bag of fixes

* Rename tensor_range.h to tensor_list_view.h

* Post rebase fixes

* Rename torch::tensor namespace to torch::tensors due to name conflict

* Avoid recursion in Module::to
2018-06-25 21:11:49 -07:00
Peter Goldsborough
a5df8ec841
Created DefaultTensorOptions in ATen (#8647)
* Created DefaultTensorOptions

* Fix TensorOptions() call which was interpreted as function decl

* Fix empty OptionsGuard

* Make options_ and mutex_ in DefaultTensorOptions class static because of dynamic linker issues

* Make DefaultOptions thread local
2018-06-24 21:15:09 -07:00
Peter Goldsborough
521f5111ad
[C++ API] Use torch::Tensor instead of at::Tensor/Variable mix (#8680)
* Use torch::Tensor instead of at::Tensor/Variable mix

* TensorRange -> TensorListView
2018-06-24 19:03:39 -07:00
Peter Goldsborough
17784d2029
Make at::tensor faster (#8709) 2018-06-20 14:46:58 -07:00
Peter Goldsborough
9335885b1b
Create at::tensor (#8475) 2018-06-20 11:44:21 -07:00
Peter Goldsborough
065fdbd500
Created Tensor::to functions (#8643)
* Created Tensor::to functions

* Only have to(dtype) and to(device)

* Ignore requires_grad in TensorOptions(Tensor) constructor
2018-06-20 09:28:08 -07:00
Peter Goldsborough
d46312fd15
Create at::from_blob (#8640) 2018-06-19 17:00:28 -07:00
Peter Goldsborough
a2dd707031
[C++ API] Create fixed width dtypes in torch:: namespace (#8639)
* Create fixed width dtypes in torch:: namespace

* Make kByte -> kUInt8
2018-06-19 12:40:58 -07:00
Peter Goldsborough
271406f276
[C++ API] Make pImpl easy to use in modules to enable happy reference semantics (#8347)
* Created TORCH_MODULE macro

Rewrote Linear

Rewrote Dropout and added default constructor to TORCH_MODULE macro

Turned TORCH_MODULE contens into a proper base class

Added some documentation

Got rid of the old Dropout module

Got rid of the old Embedding module

Got rid of the old BatchNorm module

Got rid of the old Conv module

Fixing optimizers

Rebase

Removed old RNN modules and the TORCH_ATTR macro

Removed temporary P:: namespace

Added cloning behavior to all modules

Got rid of some get() calls

self review nits

Remove noexcept from ModuleHolder methods that can throw

Remove spaces

Add missing override to reset() methods

Added examples to documentation in pimpl.h

* Post rebase fixes
2018-06-18 19:45:53 -07:00
Peter Goldsborough
372d1d6735
Create ATen tensors via TensorOptions (#7869)
* Created TensorOptions

Storing the type in TensorOptions to solve the Variable problem

Created convenience creation functions for TensorOptions and added tests

Converted zeros to TensorOptions

Converted rand to TensorOptions

Fix codegen for TensorOptions and multiple arguments

Put TensorOptions convenience functions into torch namespace too

All factory functions except *_like support TensorOptions

Integrated with recent JIT changes

Support *_like functions

Fix in place modification

Some cleanups and fixes

Support sparse_coo_tensor

Fix bug in Type.cpp

Fix .empty calls in C++ API

Fix bug in Type.cpp

Trying to fix device placement

Make AutoGPU CPU compatible

Remove some auto_gpu.h uses

Fixing some headers

Fix some remaining CUDA/AutoGPU issues

Fix some AutoGPU uses

Fixes to dispatch_tensor_conversion

Reset version of new variables to zero

Implemented parsing device strings

Random fixes to tests

Self review cleanups

flake8

Undo changes to variable.{h,cpp} because they fail on gcc7.2

Add [cuda] tag to tensor_options_cuda.cpp

Move AutoGPU::set_index_from into .cpp file because Windows is stupid and sucks

Fix linker error in AutoGPU.cpp

Fix bad merge conflict in native_functions.yaml

Fixed caffe2/contrib/aten

Fix new window functions added to TensorFactories.cpp

* Removed torch::TensorOptions

Added code to generate wrapper functions for factory methods

Add implicit constructor from Backend to TensorOptions

Remove Var() from C++ API and use torch:: functions

Use torch:: functions more subtly in C++ API

Make AutoGPU::set_device more exception safe

Check status directly in DynamicCUDAHooksInterface

Rename AutoGPU to DeviceGuard

Removed set_requires_grad from python_variables.h and warn appropriately in Variable::set_requires_grad

remove python_default_init: self.type()

Add back original factory functions, but with deprecation warnings

Disable DeviceGuard for a couple functions in ATen

Remove print statement

Fix DeviceGuard construction from undefined tensor

Fixing CUDA device compiler issues

Moved as many methods as possible into header files

Dont generate python functions for deprecated factories

Remove merge conflict artefact

Fix tensor_options_cuda.cpp

Fix set_requires_grad not being checked

Fix tensor_new.h

TEMPORARILY put some methods in .cpp files to see if it solves issues on windows and mac

Fix bug in DeviceGuard.h

Missing includes

TEMPORARILY moving a few more methods into .cpp to see if it fixes windows

Fixing linker errors

* Fix up SummaryOps to use new factories

Undo device agnostic behavior of DeviceGuard

Use -1 instead of optional for default device index

Also move DeviceGuard methods into header

Fixes around device index after optional -> int32_t switch

Fix use of DeviceGuard in new_with_tensor_copy

Fix tensor_options.cpp

* Fix Type::copy(

* Remove test_non_float_params from ONNX tests

* Set requires_grad=False in ONNX tests that use ints

* Put layout/dtype/device on Tensor

* Post merge fixes

* Change behavior of DeviceGuard to match AutoGPU

* Fix C++ API integration tests

* Fix flip functions
2018-06-16 00:40:35 -07:00
Peter Goldsborough
de4e97e89a
[C++ API] Cursors (#8190)
* Add cursors to C++ API

* Small self nits

* s/struct/class

* Use more STL like names for cursors
2018-06-11 09:48:43 -07:00
Sam Gross
12229afd00
Record shape and type in autograd to validate gradients (#8168)
The check that the gradient is defined is currently disabled because
TestJit.test_ge_optimized will trigger the error.
2018-06-06 18:09:53 -04:00
Peter Goldsborough
990c6c5531 [C++ API] Improve and use OrderedDict for parameters / modules (#7823)
* Improve OrderedDict for C++ API

* Give OrderedDict a subject and fix review comments

* Fix OrderedDict use in torch/csrc/jit/script/init.cpp
2018-06-05 14:29:09 -04:00
Peter Goldsborough
4a80755834
Split up detail.h (#7836) 2018-05-30 08:55:34 -07:00
Peter Goldsborough
28b1a3852c
Add backward() to Tensor and Variable (#7774)
* Add backward() to Tensor and Variable

* Add at:: in front of Tensor

* Trying to not move optional to appease windows?

* Move implementation into cpp file

* Undo some formatting changes
2018-05-24 17:31:41 -07:00
Peter Goldsborough
b12164005f
[C++ API] Remove virtual forward and implement Sequential based on Any(Module) (#7508)
* Remove virtual forward

* Rebase
2018-05-24 12:46:51 -07:00
Peter Goldsborough
cfd70dc1cf
[C++ API] Back to reset() and fixed in-place cloning (#7796)
* Back to reset() and fixed in-place cloning

* Add final override to clone_
2018-05-23 22:11:32 -07:00
Will Feng
60745b3380 Revert #7750 and #7762 to fix Windows CI on master (#7772)
* Revert "Add missing brace (#7762)"

This reverts commit ea27c5af50.

* Revert "[C++ API] Add backward() to Tensor and Variable  (#7750)"

This reverts commit 1e2762796f.
2018-05-22 15:42:52 -07:00
Peter Goldsborough
ea27c5af50 Add missing brace (#7762) 2018-05-22 14:18:22 -04:00
Peter Goldsborough
1e2762796f
[C++ API] Add backward() to Tensor and Variable (#7750)
* Add backward() to Tensor and Variable

* Added a couple tests
2018-05-22 10:43:04 -07:00
Peter Goldsborough
549b4069bb
[C++ API] Using new registration mechanism (#7663)
* Using new registration mechanism

* Fix signature of param() in module.cpp

* Remove ParameterList

* Fix tests
2018-05-21 17:59:21 -07:00
Peter Goldsborough
cba19e59ca
[C++ API] Implement builder style construction (#7597)
* Implemented fused builder based construction mechanism

* "weights" -> "weight"

* Use int64_t instead of size_t everywhere in RNN

* Extracted Conv::ExpandingSize into its own thing

* Rename TORCH_PARAMETER to TORCH_ATTR

* Added documentation

* Fix weight names in batchnorm module
2018-05-17 17:10:15 -04:00
Matt Le
562d9971c9 Add LBFGS optimization algorithm to C++ API (#7596)
* Adding LBFGS to cpp API

* Adding stop conditions

* Test cases now passing and adding closure to all algs

* Addressing code review

* Set seeds to make optim tests more deterministic
2018-05-17 14:03:08 -04:00
Peter Goldsborough
3414475653
[C++ API] Remove initialize_* functions (#7517)
* Remove initialize_ functions

* Fix clone() to recursively clone children

* Small codemove
2018-05-14 18:24:58 -07:00
Peter Goldsborough
6ada041b31 Some small fixes in C++ API (#7510) 2018-05-11 18:56:53 -07:00
Peter Goldsborough
c5de3314cf Add name() to C++ modules (#7409)
* Add name() to C++ modules

* Use RTTI to get module name by default

* Add functional.cpp to CMakeLists.txt

* Call typeid() inside name() instead of constructor

* Add tests and use default constructor
2018-05-10 08:52:38 -07:00
Peter Goldsborough
4eaf5261d3 Provide default implementation of clone() in base module (#7446) 2018-05-10 00:49:29 -07:00
Peter Goldsborough
3023dd25f3 Use set_type to implement type conversions in C++ API (#7408)
* Use set_type to implement .cuda() in C++ API

* Change C++ module parameter types in place

* Fix bug where batchnorm state was not moved to CUDA
2018-05-09 17:01:19 -04:00
Peter Goldsborough
8fce8673bb
Rename Container to Module in autogradpp and reorg code (#7304)
* Rename autograd namespace to torch and change torch.h into python.h

* Pave the way for torch::nn::Module

* Reorganize module code structure

* Undo ONNX update

* Remove sleef submodule
2018-05-07 14:45:00 -07:00
Zeming Lin
5c575a1497
Fixes RNN shapes for C++ API (#7272) 2018-05-04 14:00:30 -04:00
Edward Z. Yang
157d7499e7 Disable two flaky C++ API tests. (#7290)
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
2018-05-04 10:23:52 -07:00
Peter Goldsborough
67d0d14908
Rename autograd namespace to torch and change torch.h into python.h (#7267)
* Rename autograd namespace to torch and change torch.h into python.h

* Include torch.h instead of python.h in test/cpp/api

* Change some mentions of torch.h to python.h in C++ extensions

* Set paths directly, without find_path
2018-05-04 08:04:57 -07:00
Peter Goldsborough
afe3c2688f Update C++ API tests to use Catch2 (#7108)
* Update C++ API tests to use Catch2

* Update download_mnist.py to be less verbose
2018-04-30 21:36:35 -04:00
Peter Goldsborough
af71fb882f
Merge autogradpp into PyTorch (#7074)
* Dump autogradpp into PyTorch

* Fixed up CMake for autogradpp/C++ API

* Made cereal a submodule

* Change search location of autogradpps mnist directory

* Add test_api to CI

* Download MNIST from the internet instead of storing in repo

* Fix warnings
2018-04-30 12:53:46 -07:00