This adds some generated autograd functions implemented in C++, which
are generated from derivatives.yaml. It also generates Python bindings
for the Variable methods. The generated files are:
- Functions.cpp/h: subclasses of torch::autograd::Function
- VariableType.cpp/h: the at::Type for autograd Variables
- python_variable_methods.cpp: Python bindings to torch::autograd::Variable
- python_variable_methods_dispatch.h: wrappers which release the GIL and set
  the CUDA device
- python_functions.cpp/h: exposes generated autograd functions as Python
  objects
The generated functions are mostly shadowed by the definitions in
variable.py. We'll remove the Python implementations in favor of the
generated C++ implementations in a subsequent commit.
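A rough Python-level illustration of what these bindings expose (the exact generated class names, e.g. MulBackward0, depend on the code generator and the PyTorch version):

```python
import torch
from torch.autograd import Variable

# Variable methods bound by python_variable_methods.cpp are callable from
# Python, and the grad_fn of a result is an instance of one of the generated
# autograd functions exposed by python_functions.cpp.
x = Variable(torch.randn(3), requires_grad=True)
y = x * 2

print(type(y.grad_fn))           # e.g. <class 'MulBackward0'> (generated name may differ)
print(y.grad_fn.next_functions)  # edges to the next functions in the graph
```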
Variable is now a subclass of at::Tensor backed by a VariableImpl* pImpl. The implementation of the ATen functions is defined in the auto-generated VariableType.h/cpp file.
Currently, only functions that fall through to the base type, such as sizes() and isCuda(), are implemented. Differentiable ops like add() and mul() will be added in a subsequent PR.
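A minimal Python-level sketch of the fall-through behavior, assuming only the public Variable API:

```python
import torch
from torch.autograd import Variable

data = torch.randn(4, 5)
v = Variable(data)

# These calls fall through VariableType to the underlying base type:
assert v.size() == data.size()     # sizes() in C++
assert v.is_cuda == data.is_cuda   # isCuda() in C++
```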
Along the way I added converters for Variable and TracingInput. Variable should
probably be moved to a more widely known spot.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Basic idea:
- Pass buffers (marked as non-Variable tensors) as input variables to
  the trace. Every buffer becomes an input variable to the trace, and we
  remember the correspondence between the underlying TH pointer and its
  input variable.
- When we initially trace a function, we DO NOT record the buffers
  as edges. This is so autograd doesn't have to know anything about buffers.
  If we ever turn buffers into requires_grad=False parameters, this
  problem goes away (see the sketch after this list).
- When we primspec the buffer, we NOW reach into the cached buffers
  (now appropriately named) and construct the buffer information we need.
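For context, a buffer here is a plain (non-Variable) tensor registered on a module rather than a parameter. A minimal sketch of the distinction, using the public nn.Module API (the module name is made up for illustration):

```python
import torch
import torch.nn as nn

class BufferedModule(nn.Module):
    def __init__(self):
        super().__init__()
        # Parameters are Variables that participate in autograd...
        self.weight = nn.Parameter(torch.randn(4))
        # ...while buffers are plain tensors tracked as module state.
        self.register_buffer('running_mean', torch.zeros(4))

m = BufferedModule()
print([p.requires_grad for p in m.parameters()])  # [True]
print(m.running_mean.requires_grad)               # False
```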
Other things:
- CppOp execution is now supported (but lightly tested) using
SimpleEval (thanks @apaszke!)
Todo:
- E2E tests need to have their hacks removed.
- Figure out what is going on with backwards
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Implement BatchNorm double backwards as a Python function called directly from C++.
This will be converted to C++ code once ATen is integrated with autograd (see the sketch after this list).
* Some performance improvements via inplace ops and reusing calculations.
* add support for groups in double backward
* add tests for groups in double backward
* fix lint
* separate some tests to reduce the number of test cases
* remove redundant testing for different number of output channels
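A hedged sketch of what "double backwards" means for BatchNorm at the Python level, using torch.autograd.grad with create_graph=True to differentiate through the first backward (this shows the generic mechanism, not the exact implementation called from C++):

```python
import torch
import torch.nn.functional as F
from torch.autograd import Variable

x = Variable(torch.randn(8, 3, 5, 5), requires_grad=True)
weight = Variable(torch.randn(3), requires_grad=True)
bias = Variable(torch.zeros(3), requires_grad=True)

out = F.batch_norm(x, running_mean=None, running_var=None,
                   weight=weight, bias=bias, training=True)

# First backward: keep the graph so it can be differentiated again.
grad_x, = torch.autograd.grad(out.sum(), x, create_graph=True)

# Second backward ("double backwards"): differentiate a function of grad_x.
grad_grad_x, = torch.autograd.grad(grad_x.pow(2).sum(), x)
print(grad_grad_x.shape)  # torch.Size([8, 3, 5, 5])
```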
Because of this, Variables can no longer appear in the graph.
Every usage of a leaf Variable will leave an AccumulateGrad
function that has no outputs but modifies var.grad as a side
effect.
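The observable effect at the Python level is that gradients of leaf Variables are accumulated into .grad instead of flowing out of the graph; a small sketch:

```python
import torch
from torch.autograd import Variable

x = Variable(torch.ones(3), requires_grad=True)   # a leaf Variable

# The graph built for y ends in an AccumulateGrad node for x; backward()
# produces no output for x, it just adds into x.grad as a side effect.
y = (x * 2).sum()
y.backward()
print(x.grad)   # [2., 2., 2.]

# A second backward through a new graph accumulates into the same buffer.
z = (x * 3).sum()
z.backward()
print(x.grad)   # [5., 5., 5.]
```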
The core autograd Variable, Function, and Engine no longer depend on the
Python API. This lets us implement functions in C++. In the future, we
can also multithread the engine and release the GIL for most of the
non-Python backward passes.