Commit Graph

37 Commits

Author SHA1 Message Date
Edward Z. Yang
a88a8ec827
Convolution derivatives in ATen (#4116)
* Convolution derivatives in ATen

This PR introduces ATen implementation of convolution, which dispatches to
THNN/CuDNN/nnpack based on input parameters. The general strategy is to compose
this function out of the various forward-backward pairs of specific
implementations, rather than write a monolithic function with its own backwards (which
is what we did before, because the boilerplate of doing it otherwise would have
been very high). The new API provides the following functions:

  - _convolution, which is a fully generic, native convolution implementation
    that dispatches to various other convolution implementations depending on
    input characteristics. This is prefixed with an underscore because it
    explicitly takes benchmark, deterministic, and cudnn_enabled, which are
    implementation details for CuDNN. The intent is to eventually provide a
    convolution that reads these parameters out of the context using #4104.
  - _convolution_nogroup is a convolution implementation for non-CuDNN
    algorithms which don't support group convolution natively.
  - _convolution_double_backward is the generic double-backwards implementation
    for convolution.
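
As a rough picture of that strategy, the sketch below shows how the dispatch could be organized. This is a hedged illustration only: ConvBackend, the ConvParams fields, and the selection conditions are simplified stand-ins, not the actual logic in aten/src/ATen/native/Convolution.cpp.

```
#include <cstdint>
#include <cstdio>

// Illustrative only: the shape of a generic convolution entry point that
// composes specific forward/backward pairs instead of one monolithic
// implementation. The real selection logic inspects many more parameters
// (strides, dilation, transposed, dtype, ...).
enum class ConvBackend { Cudnn, Nnpack, ThnnGrouped, Thnn };

struct ConvParams {
  bool cudnn_enabled;     // stands in for benchmark/deterministic/cudnn_enabled
  bool nnpack_available;
  bool is_cuda;
  std::int64_t groups;
};

ConvBackend select_backend(const ConvParams& p) {
  if (p.is_cuda && p.cudnn_enabled) {
    return ConvBackend::Cudnn;        // CuDNN handles groups natively
  }
  if (!p.is_cuda && p.nnpack_available && p.groups == 1) {
    return ConvBackend::Nnpack;       // NNPACK path; no native group support
  }
  if (p.groups > 1) {
    return ConvBackend::ThnnGrouped;  // _convolution_nogroup per group, then cat
  }
  return ConvBackend::Thnn;           // generic THNN fallback
}

int main() {
  ConvParams p{/*cudnn_enabled=*/true, /*nnpack_available=*/false,
               /*is_cuda=*/true, /*groups=*/2};
  std::printf("backend = %d\n", static_cast<int>(select_backend(p)));
}
```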

In more detail:

- Most functionality from torch/csrc/autograd/functions/convolution.cpp has been
  moved into aten/src/ATen/native/Convolution.cpp
- We continue to make use of ConvParams, but we now construct the parameters
  upon entry to a function from the function signature (which does not use
  ConvParams; having convolution take ConvParams directly would require teaching
  the code generator how to accept these as parameters, complicating ATen's API
  model) and destruct them when making subprocedure calls.
- I introduce a new idiom, input_r, which represents a const Tensor& reference
  that is subsequently assigned to a local Tensor input. This is helpful
  because a lot of the existing algorithms relied on being able to assign to
  locals, which is not permitted with a const reference (see the sketch after
  this list).
- The native argument parser now supports std::array<bool,2> inputs (NB: there
  MUST NOT be a space; this is the same hack as is applied to derivatives.yaml)
- Native parser now supports Tensor? arguments, which indicates a nullable
  tensor. Previously this function was only used by NN methods.
- Documentation updates on THNN library
- I added an extra fgradInput argument to VolumetricConvolutionMM_updateOutput
  and VolumetricConvolutionMM_accGradParameters so that its buffer list lines up
  with the backward argument list. This makes it possible to write derivative
  for conv3d which previously was not supported (commented out in
  derivatives.yaml)
- Extra double_backward declarations were added for all convolution backwards
  functions.
- You can now use the syntax Tensor? in native_functions.yaml to indicate that a
  tensor argument is nullable.  There are adjustments to propagate this to the
  Python argument parser.
- NNPACK was ported to ATen, and ATen now builds and links against NNPACK if
  possible. New AT_NNPACK_ENABLED macro.  The nnpack functions are
  nnpack_spatial_convolution.
- Some modest CuDNN convolution refactoring to remove _forward from names.
- There's a new cudnn_convolution_backward function to deal with the fact that
  CuDNN convolution double backward requires you to have computed all gradients
  in one go.
- Variable set_flags now checks if the tensor is undefined, fixing a silent memory
  corruption.
- checkSameType updated to not raise an exception if called with Variable arguments
- "no ATen declaration found for" error message is improved to say what available declarations are
- make_variable now accepts undefined tensors, and returns an undefined tensor in this case.
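
Two items above, the input_r idiom and nullable (Tensor?) arguments, are easiest to see in a small sketch. conv_like and its body are made up for illustration; only the idiom matters.

```
#include <ATen/ATen.h>

// Sketch of the input_r idiom: the signature takes const references, and
// locals are created from them so the algorithm can reassign (e.g. to make
// a tensor contiguous) without touching the caller's arguments.
at::Tensor conv_like(const at::Tensor& input_r,
                     const at::Tensor& weight_r,
                     const at::Tensor& bias_r /* Tensor? : may be undefined */) {
  at::Tensor input = input_r;    // assignable local handles
  at::Tensor weight = weight_r;
  at::Tensor bias = bias_r;

  if (!input.is_contiguous()) {
    input = input.contiguous();  // legal on the local, not on the const ref
  }

  at::Tensor output = input.matmul(weight.t());  // stand-in for the real kernel
  if (bias.defined()) {          // nullable tensor: only use it if present
    output = output + bias;
  }
  return output;
}
```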
2017-12-20 14:19:27 -05:00
peterjc123
77ea2f26d8 Add build support for Python 2.7 using MSVC (#4226) 2017-12-20 15:07:25 +01:00
Zachary DeVito
929a11f920
Add interpreter support for Handles/PythonOp/CppOp (#3866)
* Add interpreter support for Handles/PythonOp/CppOp

This treats Handles as a first-class type in the interpreter
since this turned out to be conceptually simpler than treating
them as a separate concept, which requires a second channel for
register allocating and moving data from one op to the next.

Notes:
* The refcounting nature of tensors is factored into its own base type
so that it can be shared with other refcounted types such as handle.
* Some methods redundant with TensorBase have been deleted from Tensor
* The interpreter uses raw refcounted handles. In addition to letting it
treat Tensors and Handles as the same base object, this removes
a lot of redundant refcounting as objects move from tensors to
input/output lists.
* aten_dispatch has been updated to work directly on the raw refcounted
lists to avoid refcounting and duplicate lists.
* Removed jit_closure.cpp; the interpreter can now handle all pathways.

* Functions like `unsafeToTensorShare` describe how
ownership transfers in the interpreter. The `Steal` variants
take rvalue references as arguments and invalidate those
arguments to prevent accidental reuse (a sketch follows at the
end of this message).
* TensorTemporary is deliberately not a subtype of Tensor, because that
relationship makes it too easy to do something horribly unsafe:

```
  void foo(at::Tensor bar) {
    // bar's destructor calls release() on a temporary!
  }

  foo(TensorTemporary(retainable)); // structure slicing!
```
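
A hedged sketch of the Share/Steal distinction described above; Retainable, Handle, and the function names below are simplified stand-ins, not the interpreter's actual types.

```
#include <cstdio>
#include <utility>

// Minimal stand-in for a refcounted object; illustrative only.
struct Retainable {
  int refcount = 1;
  void retain() { ++refcount; }
  void release() { if (--refcount == 0) std::puts("last reference dropped"); }
};

struct Handle {
  Retainable* obj;
  explicit Handle(Retainable* o) : obj(o) {}
};

// Share variant: the callee takes its own reference; the caller keeps one.
Handle toHandleShare(Retainable* obj) {
  obj->retain();
  return Handle(obj);
}

// Steal variant: consumes the caller's reference and nulls the argument,
// so the now-invalid pointer cannot be reused by accident.
Handle toHandleSteal(Retainable*&& obj) {
  Handle h(obj);
  obj = nullptr;
  return h;
}

int main() {
  Retainable r;
  Retainable* raw = &r;
  Handle h = toHandleSteal(std::move(raw));
  std::printf("raw after steal: %p\n", static_cast<void*>(raw));  // nullptr
  h.obj->release();
}
```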
2017-11-29 11:38:57 -05:00
Edward Z. Yang
0217ad29d2 Fix OS X build, fixes #3573
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
2017-11-09 16:16:16 +08:00
peterjc123
aa911939a3 Improve Windows Compatibility (for csrc/scripts) (#2941) 2017-11-08 19:51:35 +01:00
Sam Gross
fde355f7d4
Allow in-place operations on views (#3384)
Allow in-place operations on views

Adds VariableViewImpl, a subclass of VariableImpl which has a pointer to
the base Variable on which it is a view. In-place operations on views
change the grad_fn of the base.

Note that in-place operations only work on views that are the first output of the function that created them. All C++/ATen-implemented functions have this behavior, but it's possible to write Python-implemented autograd functions that do not. In-place operations on these views will raise an exception.

Fixes #3313
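
A rough sketch of the bookkeeping this describes, using made-up names; the real classes are VariableImpl and VariableViewImpl, and the real grad_fn rebasing does more than shown here.

```
#include <cstdio>
#include <memory>
#include <string>

// Hedged sketch: a view keeps a pointer to its base, and an in-place
// operation on the view rewrites the base's grad_fn so a later backward
// pass accounts for the mutation.
struct Node { std::string name; };   // stands in for an autograd Function

struct Var {
  std::shared_ptr<Node> grad_fn;
  Var* base = nullptr;               // non-null only for views

  void mark_inplace(const std::string& op) {
    Var* target = base ? base : this;  // in-place on a view rewrites the base
    target->grad_fn = std::make_shared<Node>(Node{op + "Backward"});
  }
};

int main() {
  Var base_var;
  Var view;
  view.base = &base_var;
  view.mark_inplace("add_");         // in-place op on the view
  std::printf("base grad_fn: %s\n", base_var.grad_fn->name.c_str());
}
```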
2017-11-06 18:19:56 -05:00
Sam Gross
67839ce7bc Delete unused Softmax code (#3220)
Softmax and LogSoftmax are automatically bound and dispatched through
VariableType.
2017-10-21 20:51:27 +02:00
Sam Gross
f1f64c8d07 Generate autograd functions for NN / more refactors (#3136)
Generate autograd functions for NN and implement more derivatives in derivatives.yaml

A big refactor of gen_variable_type.py
2017-10-19 15:03:26 -04:00
Adam Paszke
98e67448fa Large Softmax and LogSoftmax refactor
- Cleaned up THNN and THCUNN code and kernels
- Improved THCUNN kernel performance 5x, making it match cuDNN performance
- Added support for computing softmax over arbitrary dims
  NOTE: The default dim for 3D inputs is now 1 (used to be 0)
- Both functions now accept inputs with arbitrarily many dimensions
- Autograd functions no longer save the input (it's unnecessary)
- Added cuDNN bindings for softmax, but they are unused as THCUNN
  matches or even exceeds cuDNN performance
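
To make "softmax over arbitrary dims" concrete, here is a plain-C++ sketch of a numerically stable softmax over a chosen dim of a 2-D array; illustrative only, the THNN/THCUNN kernels are far more general and faster.

```
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <vector>

// Hedged sketch: softmax over dim 0 (columns) or dim 1 (rows) of a
// row-major rows x cols matrix, with the usual max-subtraction for
// numerical stability.
std::vector<double> softmax2d(const std::vector<double>& x,
                              int rows, int cols, int dim) {
  std::vector<double> y(x.size());
  const int outer = (dim == 0) ? cols : rows;  // independent slices
  const int len   = (dim == 0) ? rows : cols;  // elements per slice
  for (int o = 0; o < outer; ++o) {
    auto idx = [&](int i) { return (dim == 0) ? i * cols + o : o * cols + i; };
    double m = x[idx(0)];
    for (int i = 1; i < len; ++i) m = std::max(m, x[idx(i)]);  // max-shift
    double sum = 0.0;
    for (int i = 0; i < len; ++i) sum += std::exp(x[idx(i)] - m);
    for (int i = 0; i < len; ++i) y[idx(i)] = std::exp(x[idx(i)] - m) / sum;
  }
  return y;
}

int main() {
  std::vector<double> x = {1, 2, 3, 4, 5, 6};          // 2 x 3
  auto y = softmax2d(x, 2, 3, /*dim=*/1);              // softmax over each row
  std::printf("%.3f %.3f %.3f\n", y[0], y[1], y[2]);   // 0.090 0.245 0.665
}
```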
2017-10-19 19:51:10 +02:00
Priya Goyal
2443fcac0b Deterministic cudnn algorithms 2017-10-10 10:53:34 -04:00
Sam Gross
de757805fc Implement some autograd functions using ATen (#2805)
This adds some generated autograd functions implemented in C++, which
are generated from derivatives.yaml. It also generates Python bindings
for the Variable methods. The generated files are:

 Functions.cpp/h: subclasses of torch::autograd::Function
 VariableType.cpp/h: The at::Type for autograd Variables
 python_variable_methods.cpp: Python bindings to torch::autograd::Variable
 python_variable_methods_dispatch.h: wrapper which releases GIL and sets the
     CUDA device
 python_functions.cpp/h: exposes generated autograd functions as Python
     objects

The generated functions are mostly shadowed by the definitions in
variable.py. We'll remove the Python implementations in favor of the
generated C++ implementations in a subsequent commit.
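
Roughly, each derivatives.yaml entry becomes a small generated Function subclass. The sketch below is a stripped-down, hypothetical picture of that shape, not the real generated code; the real classes live in Functions.cpp/h and derive from torch::autograd::Function.

```
#include <cstdio>
#include <vector>

// Hedged sketch: the general shape of a code-generated backward function.
struct Value { double data; };
using value_list = std::vector<Value>;

struct Function {
  virtual value_list apply(value_list&& grads) = 0;
  virtual ~Function() = default;
};

// A derivatives.yaml rule along the lines of "mul: grad * other, grad * self"
// would turn into roughly this:
struct MulBackward : Function {
  double saved_self, saved_other;   // saved during the forward pass
  value_list apply(value_list&& grads) override {
    double g = grads[0].data;
    return {Value{g * saved_other}, Value{g * saved_self}};
  }
};

int main() {
  MulBackward mb;
  mb.saved_self = 2.0;
  mb.saved_other = 3.0;
  auto grads = mb.apply({Value{1.0}});
  std::printf("d/dself = %.1f, d/dother = %.1f\n",
              grads[0].data, grads[1].data);  // 3.0 and 2.0
}
```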
2017-09-26 17:08:00 -04:00
houseroad
6855d24ff1 Move pybind11 type_caster to different pybind.h in the corresponding folders. (#222) 2017-09-19 10:53:32 -04:00
Sam Gross
1290e586fb Use at::Tensor based autograd Variable (#2676)
Variable is now a subclass of at::Tensor backed by a VariableImpl* pImpl. The implementation of the ATen functions is defined in the auto-generated VariableType.h/cpp file.

Currently, only functions which fall through to the base type, such as sizes() and isCuda() are implemented. Differentiable ops like add() and mul() will be added in a subsequent PR.
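
A schematic of that arrangement, with placeholder members; illustrative only, not the real class definitions.

```
#include <memory>

// Hedged sketch of the layout described above.
struct TensorImpl {
  virtual ~TensorImpl() = default;
  // sizes, strides, storage, type, ...
};

struct Tensor {
  std::shared_ptr<TensorImpl> pImpl;
  // sizes(), isCuda(), ... fall through to pImpl
};

struct VariableImpl : TensorImpl {
  Tensor grad;
  bool requires_grad = false;
  // grad_fn, version counter, hooks, ...
};

struct Variable : Tensor {            // a Variable is a Tensor whose pImpl
  VariableImpl* get() const {         // is actually a VariableImpl
    return static_cast<VariableImpl*>(pImpl.get());
  }
};

int main() {
  Variable v;
  v.pImpl = std::make_shared<VariableImpl>();
  v.get()->requires_grad = true;
}
```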
2017-09-12 11:36:01 -04:00
Adam Paszke
3b1dfcb51c Add trace flag checking in backward passes too 2017-09-06 21:35:50 -04:00
Edward Z. Yang
161e21f68d Missing batchnorm fix
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
2017-09-05 17:48:55 -04:00
Edward Z. Yang
2e266837f5 Port TracingState to pybind11, new export() method.
Along the way I added converters for Variable and TracingInput.  Variable should
probably be moved to a more widely known spot.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
2017-09-05 17:48:55 -04:00
Adam Paszke
594f98ce16 Support multi-stage AutogradClosures 2017-09-05 17:48:55 -04:00
Edward Z. Yang
82efbe349b Handle batchnorm properly.
Basic idea:
- Pass buffers (marked as non-Variable tensors) as input variables to
  the trace: every buffer is represented as an input variable, and we
  remember the correspondence between the underlying TH pointer and its
  input variable in the trace.
- When we initially trace a function, we DO NOT record the buffers
  as edges.  This is so autograd doesn't have to know anything about buffers.
  If we ever turn buffers into requires_grad=False parameters, then
  this problem goes away.
- When we primspec the buffer, NOW we reach into the cached buffers
  (now appropriately named) and gin up the buffer information we need.

Other things:
- CppOp execution is now supported (but lightly tested) using
  SimpleEval (thanks @apaszke!)

Todo:
- E2E tests need to have their hacks removed.
- Figure out what is going on with backwards

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
2017-09-05 17:48:55 -04:00
Adam Paszke
fa308b3183 Improve backward tracing 2017-09-05 17:48:55 -04:00
Trevor Killeen
c304d04fc6 Replace thpp::Tensor with ATen Tensor in autograd csrc (#2170) 2017-07-28 10:18:37 -04:00
gchanan
925208af72 Implement BatchNorm double backwards (#2207)
* Implement BatchNorm double backwards as a python function called directly from C++.

This will be converted to C++ code once ATen is integrated with autograd.

* Some performance improvements via inplace ops and reusing calculations.
2017-07-27 06:00:31 +05:30
albanD
c888857461 Conv double backward groups (#1993)
* add support for groups in double backward

* add tests for group in double backward

* fix lint

* separate some tests to reduce number of test cases

* remove redundant testing for different number of output channels
2017-07-13 00:41:14 -04:00
albanD
6cdcd9c603 Add Narrow function
Clean up error messages and support non-perfectly-sized inputs
2017-06-17 11:11:48 -04:00
albanD
2f8d21a7f2 add contiguous function 2017-06-17 11:11:48 -04:00
albanD
462ab8a644 add Transpose View Expand C functions 2017-06-17 11:11:48 -04:00
albanD
dd5c7c473f Add ConvBackwardBackward class 2017-06-17 11:11:48 -04:00
Alykhan Tejani
501467db17 added param name to tuple_parser for better error messages 2017-06-02 16:16:21 +02:00
Trevor Killeen
05bc877a05 make THPPointer have explicit constructors (#1636) 2017-05-25 15:35:54 -04:00
Adam Paszke
1c304a9ef6 Expose variable attribute of AccumulateGrad 2017-05-10 16:43:14 +02:00
Luca Antiga
23b556ef77 Expose custom attributes from C++ functions (#1430) 2017-05-07 13:49:55 +02:00
Adam Paszke
5c7453447f Fix bugs, rename differentiate to grad, make it more flexible 2017-05-01 16:44:56 -04:00
Adam Paszke
de9998e198 Add support for the new Function format 2017-05-01 16:44:56 -04:00
Adam Paszke
702a2e3bc5 Make Variables not subclass Function anymore
Because of this, Variables can no longer appear in the graph.
Every usage of a leaf Variable will leave an AccumulateGrad
function that has no outputs, but modifies var.grad as a side
effect.
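
A tiny illustration of that behaviour, with hypothetical types:

```
#include <cstdio>

// Hedged sketch: AccumulateGrad as a graph node with no outputs that adds
// the incoming gradient into the leaf variable's grad as a side effect.
struct LeafVariable {
  double value;
  double grad = 0.0;
};

struct AccumulateGrad {
  LeafVariable* variable;
  // Called during backward with the gradient flowing into the leaf.
  void operator()(double incoming) {
    variable->grad += incoming;   // side effect only; nothing is returned
  }
};

int main() {
  LeafVariable x{3.0};
  AccumulateGrad acc{&x};
  acc(0.5);
  acc(1.5);                       // gradients accumulate across uses
  std::printf("x.grad = %.1f\n", x.grad);  // 2.0
}
```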
2017-05-01 16:44:56 -04:00
ngimel
b3ab4b1094 Check torch.backends.cudnn.enabled, padding, and output_padding (#996)
* Check torch.backends.cudnn.enabled
* Don't allow negative padding and output_padding values
2017-03-22 19:42:11 -04:00
Sam Gross
34ce58c909 Parallelize backwards 2017-03-03 11:26:00 -08:00
Adam Paszke
da725830c2 Add support for variable length sequences in RNNs (#873) 2017-03-01 17:36:32 +01:00
Sam Gross
bd5303010d Refactor autograd package to separate Python dependencies. (#662)
The core autograd Variable, Function, and Engine no longer depend on the
Python API. This lets us implement functions in C++. In the future, we
can also multithread the engine and release the GIL for most of the
non-Python backwards.
2017-02-13 16:00:16 -08:00