83 Commits

Author SHA1 Message Date
Sam Gross
0a434ff685 Remove Function::is_executable (#3907)
* Remove Function::is_executable

Ensure that grad_fn is null if requires_grad is false.

* Assert that grad_fn implies requires_grad=True
2017-11-28 18:29:27 -08:00
Adam Paszke
cf407213f9 Clean up stochastic function related dead code (#3782) 2017-11-20 12:44:45 -05:00
Gregory Chanan
9a2b54e08b [ATen] Rename isCuda -> is_cuda. 2017-11-15 18:33:07 -08:00
Sam Gross
b09d66e60d
Fix a reference cycle when in-place ops on views save the output (#3679)
Previously, an in-place operation that saves its output (such as
relu/threshold) would create a reference cycle when applied to a
view. There were two cycles created:

1) The cycle base.grad_fn.fn.input_.base
   base.grad_fn is a CopySlices
   base.grad_fn.fn is ThresholdBackward
   base.grad_fn.fn.input_ is a SavedVariable with base pointing to base

2) The cycle base.grad_fn.fn.input_.grad_fn.next_functions[0]
   base.grad_fn.fn.input_.grad_fn is AsStridedBackward
   and next_functions[0] points to base.grad_fn

Generally, we avoid cycles because the AD graph is mostly immutable. Two
notable exceptions are:

a) Variable.grad_fn can change to point to a new grad_fn
b) SavedVariables in a function can be set after the function is created

The first case is not a problem if grad_fns do not hold strong references
to Variables. Removing "base" from SavedVariable removes the strong ref.

For the second case, we need to avoid saving the grad_fn of outputs. We
were incorrectly saving the grad_fns of outputs when they were the
result of in-place ops on views.
2017-11-15 15:19:41 -05:00
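The first cycle described in b09d66e60d can be reproduced in miniature. The sketch below is a toy Python model, not PyTorch code: `Variable`, `GradFn`, `SavedWithBase` and `SavedWithoutBase` are hypothetical stand-ins, and the check assumes CPython's reference counting. It suggests why a saved variable that holds a strong `base` reference keeps the base alive until the cycle collector runs, while dropping `base` lets plain reference counting free it.

```python
import gc
import weakref


class Variable:
    """Toy stand-in for an autograd Variable (hypothetical)."""

    def __init__(self):
        self.grad_fn = None


class GradFn:
    """Toy grad_fn that holds one saved variable."""

    def __init__(self, saved):
        self.saved = saved


class SavedWithBase:
    """Saved variable with a strong reference back to its base (cycle)."""

    def __init__(self, base):
        self.base = base


class SavedWithoutBase:
    """Saved variable that does not keep the base (no cycle)."""

    def __init__(self, base):
        pass


def freed_by_refcount(saved_cls):
    """Return True if the base is freed without running the cycle collector."""
    was_enabled = gc.isenabled()
    gc.disable()  # make sure only reference counting can free the object
    try:
        base = Variable()
        base.grad_fn = GradFn(saved_cls(base))
        probe = weakref.ref(base)
        del base
        return probe() is None
    finally:
        if was_enabled:
            gc.enable()
        gc.collect()  # clean up any cycle created above
```

Calling `freed_by_refcount(SavedWithBase)` and `freed_by_refcount(SavedWithoutBase)` contrasts the two cases.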
Zach DeVito
ef4b19f767 Refactor ir.h to distinguish Nodes and Values
This commit adds a Value type similar to the one @ezyang suggested a while
ago for handling multi-return nodes.

Previously if we had a graph like:

  a = op1(b)
  c, d = op2(a)

Then its in-memory format would look like:

  %0 = op1(b)
  %1 = op2(%0)
  %2 = select(%1, 0)
  %3 = select(%1, 1)

Select nodes were used only to handle the multi-output case. In the
single-output case ops referred directly to their uses.

This required special handling for the single- and multi- output cases,
and was confusing when used with ONNX which distinguishes values (the
inputs/outputs of a node) from the nodes themselves (e.g. a Conv).

This commit adds the Node/Value distinction to the IR. In the example
above, `a`, `b`, `c`, and `d` are now Value objects, while `op1` and
`op2` are now Node objects. Inputs/Outputs to the graph are values.

* Nodes now always have multiple outputs, accessible through their `outputs()`
  method.
* Methods exist for adding/removing outputs from a node.
* Nodes own their output Values; destroying a node destroys its outputs, and it
is only valid to destroy a node when no uses of its outputs remain.
* Unlike select, Values do not appear in the nodes list.
* The method `node()` on `Value` retrieves its defining node. Calling it
is always valid. For inputs, its kind is "Param". Like "Return" there is a single Param
node representing all inputs.
* For single-output Nodes, the method `output()` retrieves the single
output Value, asserting that the node is in fact single-output.
* Functions are the same, but some functions like `type()` have moved to
Value.
* `replaceAllUsesWith` is now sanely defined for both Values and Nodes.
In the case of Nodes, it replaces all outputs of the node with the outputs
of the replacement node.
* stage is defined on both Node and Value. This is because Inputs require a stage.
* Apart from changing data types from Node to Value, most passes remain the same.
  Things that previously assumed single-output nodes now have to call output()
  to get the Value.
* This removes the uses = [...] field in the outputs because it was
getting confusing even before this commit when uses would refer to nodes,
but we print the names of Values. The lint pass validates the use list,
so printing it out seems less necessary.
2017-11-15 11:47:18 -08:00
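The Node/Value split described in ef4b19f767 can be modelled in a few lines. This is a hypothetical Python sketch, not the C++ classes in ir.h: the method names follow the commit message where it gives them (`node()`, `output()`, `replaceAllUsesWith`, written here in snake_case), while `outputs()`, the `uses` list of `(user, index)` pairs and the naming scheme are assumptions made for illustration.

```python
class Value:
    """An input or output of a Node (hypothetical model)."""

    def __init__(self, node, name):
        self._node = node
        self.name = name
        self.uses = []  # (user_node, input_index) pairs

    def node(self):
        """Return the defining node; always valid, "Param" for graph inputs."""
        return self._node

    def replace_all_uses_with(self, other):
        if other is self:
            return
        for user, idx in self.uses:
            user.inputs[idx] = other
            other.uses.append((user, idx))
        self.uses = []


class Node:
    """An operation that owns its output Values."""

    def __init__(self, kind, inputs=(), num_outputs=1):
        self.kind = kind
        self.inputs = list(inputs)
        for idx, value in enumerate(self.inputs):
            value.uses.append((self, idx))
        self._outputs = [Value(self, f"{kind}.{i}") for i in range(num_outputs)]

    def outputs(self):
        return list(self._outputs)

    def output(self):
        # Only valid for single-output nodes.
        assert len(self._outputs) == 1, "node is not single-output"
        return self._outputs[0]

    def replace_all_uses_with(self, other):
        # Replace each output with the matching output of the other node.
        assert len(self._outputs) == len(other._outputs)
        for old, new in zip(self._outputs, other._outputs):
            old.replace_all_uses_with(new)


# The example graph from the commit message: a = op1(b); c, d = op2(a)
param = Node("Param")            # single node representing all inputs
b = param.output()
op1 = Node("op1", [b])
a = op1.output()
op2 = Node("op2", [a], num_outputs=2)
c, d = op2.outputs()
```

In this model no select node is needed: `c` and `d` are simply the two Values owned by `op2`.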
Sam Gross
fde355f7d4
Allow in-place operations on views (#3384)

Adds VariableViewImpl, a subclass of VariableImpl which has a pointer to
the base Variable on which it is a view. In-place operations on views
change the grad_fn of the base.

Note that in-place operations only work on views that are the first output of the function that created them. All C++/ATen implemented functions have this behavior, but it's possible to write Python-implemented autograd functions that do not. In-place operations on these views will raise an exception.

Fixes #3313
2017-11-06 18:19:56 -05:00
Sam Gross
3003ebe67a
Replace None grad_inputs with zero tensors in some cases (#3433)

In Python-implemented autograd functions, we sometimes return None as
the grad_input if the output is marked "non-differentiable". This
replaces those None values with zero-filled Variables if the
corresponding input has requires_grad=True.

C++ implemented autograd functions expect the input (grad_outputs) to
be defined if they're executed. They always return non-null grad_inputs
if should_compute_output(i) is true. This could lead to segfaults if a
subsequent Python-implemented function returned None.

See #3412, #3241
2017-11-02 17:23:25 -04:00
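The rule from 3003ebe67a can be sketched as follows. This is a hypothetical illustration, not the actual patch: plain Python lists stand in for tensors, and `Input` and `fill_missing_grads` are names invented here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Input:
    """Toy stand-in for an input Variable (hypothetical)."""
    data: List[float]
    requires_grad: bool


def fill_missing_grads(grad_inputs: List[Optional[List[float]]],
                       inputs: List[Input]) -> List[Optional[List[float]]]:
    """Replace None gradients with zeros where the input requires grad."""
    filled = []
    for grad, inp in zip(grad_inputs, inputs):
        if grad is None and inp.requires_grad:
            # A zero-filled gradient with the same length as the input.
            grad = [0.0] * len(inp.data)
        filled.append(grad)
    return filled
```

Gradients for inputs with `requires_grad=False` are left as `None`, since nothing downstream should consume them.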
Adam Paszke
fa0f3cf98a Re-enable and fix most JIT tests 2017-10-27 02:40:09 +05:30
Sam Gross
e970d35091 Make VariableVersion refcounting thread-safe (#3184)
I've also made the version counter and the "live" reference count
atomics.

Note that it's not safe to set the version counter (operator=) from
multiple threads, because shared_ptr assignment isn't thread safe.
Currently, the only call sites to these functions are on newly created
variables before they can be accessed from other threads.

See #3111
2017-10-19 17:22:01 -04:00
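As a rough analogue of e970d35091: the commit uses atomics in C++, whereas Python has no atomic integer type, so this sketch swaps in a `threading.Lock` to get the same effect of race-free increments. `VersionCounter` and `bump_from_threads` are hypothetical names, and the unsafe `operator=` case mentioned in the message is not modelled.

```python
import threading


class VersionCounter:
    """Counter whose increments are safe to call from several threads."""

    def __init__(self):
        self._version = 0
        self._lock = threading.Lock()

    def bump(self):
        # The lock makes the read-modify-write a single indivisible step.
        with self._lock:
            self._version += 1

    @property
    def version(self):
        with self._lock:
            return self._version


def bump_from_threads(counter, n_threads, bumps_per_thread):
    """Bump the counter concurrently and return the final version."""

    def work():
        for _ in range(bumps_per_thread):
            counter.bump()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.version
```

With the lock in place, `bump_from_threads(VersionCounter(), 8, 1000)` should always account for every increment.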
Edward Z. Yang
66bb3d6dec Remove incorrect comment that join_with is symmetric.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
2017-10-13 01:31:22 +02:00
Adam Paszke
b6b41c829a Add inplace checks in JIT 2017-10-03 10:20:58 -04:00
Adam Paszke
411e1469e0 Add tools for autograd profiling 2017-09-25 23:21:30 -04:00
Edward Z. Yang
c08395e290 Give a better error message when we hit a legacy function.
We now include the type name of the legacy function implementing
class.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
2017-09-25 12:26:07 -04:00
Edward Z. Yang
6efd797376 Document unchecked invariant.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
2017-09-20 12:24:27 -04:00
Edward Z. Yang
25c2b7d8b2 Some minor extra comments on python_function
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
2017-09-20 12:24:27 -04:00
Adam Paszke
c536da7064 Remove TensorMeta 2017-09-19 10:53:32 -04:00
Adam Paszke
ba6e652c02 Add simple mode to Eval 2017-09-19 10:53:32 -04:00
Edward Z. Yang
1f80dd03bd Track change of Variable from shared_ptr to ATen style tensor
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
2017-09-19 10:53:32 -04:00
Adam Paszke
28828e033f Make certain functions traceable 2017-09-19 10:53:32 -04:00
Adam Paszke
4d1ed4ec42 Assign traces before saving Variables 2017-09-19 10:53:32 -04:00
Adam Paszke
964b731af3 Try to handle NULL Variables in the tracer 2017-09-19 10:53:32 -04:00
Adam Paszke
ddd417faf0 Fix non-CUDA builds after Windows PRs (#2760) 2017-09-17 02:02:52 -04:00
Sam Gross
1290e586fb Use at::Tensor based autograd Variable (#2676)
Variable is now a subclass of at::Tensor backed by a VariableImpl* pImpl. The implementation of the ATen functions is defined in the auto-generated VariableType.h/cpp file.

Currently, only functions which fall through to the base type, such as sizes() and isCuda(), are implemented. Differentiable ops like add() and mul() will be added in a subsequent PR.
2017-09-12 11:36:01 -04:00
Adam Paszke
965a349bbd Record context edges in the JIT 2017-09-05 17:48:55 -04:00
Adam Paszke
9f97291408 Make tracer thread-safe 2017-09-05 17:48:55 -04:00
Adam Paszke
fa308b3183 Improve backward tracing 2017-09-05 17:48:55 -04:00
Zach DeVito
55cd9f37d1 remove Select, and NodeWithKind 2017-09-05 17:48:55 -04:00
Zach DeVito
24cdb897d6 starting removing nodes by removing Return 2017-09-05 17:48:55 -04:00
Zach DeVito
b037efa92c prep for removing node subtypes 2017-09-05 17:48:55 -04:00
Adam Paszke
7f60a18293 Add initial support for backward tracing 2017-09-05 17:48:55 -04:00
Edward Z. Yang
4a1bbc01ac Fix #41.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
2017-09-05 17:48:55 -04:00
Edward Z. Yang
765b0bf137 Make in-place work again.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
2017-09-05 17:48:55 -04:00
Adam Paszke
1c4538e017 Trace C functions 2017-09-05 17:48:55 -04:00
Adam Paszke
bdcbbeaf68 Remove GlobalTracingState 2017-09-05 17:48:55 -04:00
Edward Z. Yang
c931feaad0 Elaborate on NB a little 2017-09-05 17:48:55 -04:00
Adam Paszke
3e0f1608fe Capture Variables that are not inputs as constants 2017-09-05 17:48:55 -04:00
Adam Paszke
af21c6b018 Add Node type to JIT IR
Rewrite Type as a class hierarchy

PR comments + rebase fixes
2017-09-05 17:48:55 -04:00
Adam Paszke
f270973937 Add JIT IR -> Autograd IR converter 2017-09-05 17:48:55 -04:00
Adam Paszke
9662cffd26 Use std::list in JIT IR 2017-09-05 17:48:55 -04:00
Adam Paszke
3dcbba1f35 Keep Variable mapping as part of TracingState 2017-09-05 17:48:55 -04:00
Adam Paszke
6be47ec907 Minor fixes and improvements 2017-09-05 17:48:55 -04:00
Adam Paszke
ea05ac8f41 Move JIT-related files to jit dir. Remove IR interpreter 2017-09-05 17:48:55 -04:00
Zach DeVito
1325fa511c JIT IR including use-def chains and updated comments. 2017-09-05 17:48:55 -04:00
Zach DeVito
7c083b00f8 refcounting for Node/Value 2017-09-05 17:48:55 -04:00
Zach DeVito
f369f8e80d simplify IR 2017-09-05 17:48:55 -04:00
Edward Z. Yang
4979359800 Add graphs, trace them.
It is not an /expression/ we trace, but it is a /graph/: that is,
a closed expression which knows its parameters.  Knowing the list
of parameters is helpful and helps remove a hack when interpreting.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
2017-09-05 17:48:55 -04:00
Edward Z. Yang
c1dec0663f New stratification: add Operator/Instruction
This prevents nested lets, which are not allowed in ANF.  We
basically have SSA now.

There's some niftiness with the visitor returning a lambda which
then gets fed the actual argument. I like it.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
2017-09-05 17:48:55 -04:00
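To make the ANF/SSA remark in c1dec0663f concrete, here is a minimal sketch of flattening a nested expression into a list of instructions whose arguments are always names. The tuple encoding of expressions, the `%N` naming and the function `to_anf` are assumptions made for this illustration; they are not taken from the commit.

```python
def to_anf(expr):
    """Flatten a nested expression into (name, op, args) instructions.

    An expression is either a variable name (str) or a tuple
    (op, arg1, arg2, ...) whose arguments are expressions.
    """
    instructions = []

    def visit(e):
        if isinstance(e, str):
            return e  # variables are already atomic
        op, *args = e
        # Name every nested sub-expression first, so arguments are atomic.
        arg_names = [visit(a) for a in args]
        name = f"%{len(instructions)}"
        instructions.append((name, op, arg_names))
        return name

    result = visit(expr)
    return instructions, result
```

Each name is assigned exactly once and no instruction contains another one, which is the property the commit message refers to as "basically SSA".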
Edward Z. Yang
7bd4c5a27c Minor sanity check.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
2017-09-05 17:48:55 -04:00
Edward Z. Yang
3055b69f63 Refactor Arg class away.
Although ANF-style developments traditionally stratify syntactic
classes into atomic (Arg) and complex (Expr) expressions, where
atomic expressions could be variables, constants or lambdas, Zach has
successfully convinced me that we should do away with the variant here and
always require arguments to be variables.  There are a few reasons for
this:

1) Tensor constants, not currently supported, could be modeled using a
"Constant" instruction, removing the need for them to be representable
directly inline.  An inline constant is marginally more convenient
for peephole optimizations, but since we have gone full ANF, we are going
to need to be able to see across def-uses in any case, and it is not
too much worse to need to handle constants this way.  By the way,
Swift Intermediate Language also made a similar choice, see
the slide on "Literal Instructions" in
http://llvm.org/devmtg/2015-10/slides/GroffLattner-SILHighLevelIR.pdf

2) Scalar constants, which are quite important for passing non-tensor
arguments to Python operators, are now stored out-of-band as NON
first-class values.  This more closely matches the ToffeeIR design,
and makes it clear what parameters are "first class" (tensors only)
and which ones are not.  However, we need to be able to unswizzle
the separate scalar/tensor lists into a unified list in the correct
format; this is what PyFunctionCConv is for.

Also, Locals got renamed into Tuple.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
2017-09-05 17:48:55 -04:00
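The "unswizzling" step mentioned in 3055b69f63 might look roughly like this. The commit only names PyFunctionCConv; the string encoding used below (`"s"` for a scalar slot, `"t"` for a tensor slot) and the function `unswizzle` are assumptions made for illustration.

```python
def unswizzle(cconv, scalars, tensors):
    """Merge separate scalar and tensor lists into one argument list.

    cconv is a string with one character per argument position:
    "s" takes the next scalar, "t" takes the next tensor.
    """
    scalar_iter = iter(scalars)
    tensor_iter = iter(tensors)
    args = []
    for kind in cconv:
        if kind == "s":
            args.append(next(scalar_iter))
        elif kind == "t":
            args.append(next(tensor_iter))
        else:
            raise ValueError(f"unknown calling-convention code: {kind!r}")
    return args
```

For example, `unswizzle("tst", [2], ["x", "y"])` rebuilds the original call order from one out-of-band scalar and two first-class tensor arguments.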
Edward Z. Yang
c466b2c1f6 Make an error message better
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
2017-09-05 17:48:55 -04:00