Commit Graph

40 Commits

Author SHA1 Message Date
Sam Gross
a8bda67ff1
Only check that arguments are Variables in VariableType (#4991)
Don't check the ScalarType and Backend of arguments in VariableType.
Instead, only check that arguments are Variables of any type. The
precise type checks are handled by the base type.

Many of our functions take heterogeneous types. There isn't enough
information in Declarations.yaml to ensure the precise types of
arguments in VariableType, which makes it difficult to add new methods.

This is #4943 with a fix to the memset call
2018-02-01 14:56:56 -05:00
Sam Gross
2bf9ed8e05
Revert "Only check that arguments are Variables in VariableType (#4943)" (#4980)
Revert "Only check that arguments are Variables in VariableType (#4943)"
2018-02-01 11:59:21 -05:00
Sam Gross
d44437968f
Only check that arguments are Variables in VariableType (#4943)
Don't check the ScalarType and Backend of arguments in VariableType.
Instead, only check that arguments are Variables of any type. The
precise type checks are handled by the base type.

Many of our functions take heterogeneous types. There isn't enough
information in Declarations.yaml to ensure the precise types of
arguments in VariableType, which makes it difficult to add new methods.
2018-01-31 12:31:11 -05:00
gchanan
9bb6d33d35
Enable scalars if compiled with WITH_SCALAR environment variable. (#4806)
* Enable scalars if compiled with WITH_SCALAR environment variable.

We are pretty close to enabling scalars (0-dimensional arrays); this allows turning them on
for development purposes and to be able to write code that works both with and without scalars enabled.

WITH_SCALARS is currently broken with distributions, but should work for test_torch, test_autograd, test_nn.

* Fix unsqueeze.

* Fix wrap dim, wrapping with Scalar.
2018-01-23 15:44:11 -05:00
Sam Gross
57549b7e44
Bind functions with out= arguments in VariableType (#4565)
This adds overrides in VariableType for the xxx_out ATen functions and
implements Python bindings. There is no support for automatic
differentiation. If any of the inputs (or outputs) requires grad, then the
function will throw an exception unless it's running in "no-grad" mode.

The bindings for calling torch.xxx functions on Variables are moved to a
different object. Previously, they were static method on VariableBase.
This change prevents users from accidentally calling static methods as if
they were instance methods.
2018-01-17 18:27:42 -05:00
gchanan
eb857ec367
Introduce a (non-public) autograd scalar method and improve printing (#4586)
* Specialize Variable pinting and always print device for GPU tensors/Variables.

* Introduce a (non-public) _scalar_sum() method for autograd scalar testing.
2018-01-12 14:26:38 -05:00
Richard Zou
2060f355a6 Fix python gc race condition with THPVariable_traverse (#4437) 2018-01-02 19:57:21 +01:00
Sam Gross
de28e754b2
Make Variable.is_sparse an attribute (#4308)
This matches Tensor.is_sparse, which makes it easier to replace Tensor
with Variable.
2017-12-22 12:46:28 -05:00
Sam Gross
d605058212
Replace Variable.volatile with torch.no_grad() (#3970)
This removes volatile from Variable. The functionality is mostly
replaced by a global (thread-local) flag, which is controlled by
torch.set_grad_enabled() and the context manager torch.no_grad().

In C++, the flag is exposed through GradMode::is_enabled() and GradMode::set_enabled()

Fixes #3627
2017-12-18 15:46:13 -05:00
Sam Gross
090a23251e
Add Variable._cdata (#4045)
This is to help with merging Variable and Tensor. It's equivalent to
Tensor._cdata and ATen's unsafeGetTH().
2017-12-06 12:26:01 -05:00
Sam Gross
0a434ff685 Remove Function::is_executable (#3907)
* Remove Function::is_executable

Ensure that grad_fn is null if requires_grad is false.

* Assert that grad_fn implies requires_grad=True
2017-11-28 18:29:27 -08:00
Sam Gross
4518793aa2
Implement indexing in ATen (#3725)
Implements basic and advanced indexing using ATen tensors/variables.
Basic indexing is translated at the Python-binding level
(python_variable_indexing.cpp) to slice/squeeze/unsqueeze/select calls.
Advanced indexing is implemented in ATen in terms of take() and put()
calls.
2017-11-21 13:19:00 -05:00
gchanan
067f799e9f
Implement remaining Variable fallthrough methods via ATen (#3744)
* Use aten version of is_signed.

* Define is_cuda native function and use it for variable.

* Use ATen dim for Variable dim/ndimension.

* Get rid of dim, ndimension fallthroughs in variable.py.

* Move size/stride Variable methods to use ATen.

* Implement shape property on Variable via ATen.

* Remove the _getattr__ function from Variable.

* Get rid of dispatch functions and avoid cast.

* Add THPUtils_packInt64Array.

* Throw python errors.

* Use fallthrough and fix fallthrough generation for native functions.

* is_cuda is a property, not a method.
2017-11-17 15:57:56 -05:00
anderspapitto
b97dfc8a92 Pretty names: support names set via export or Variable constructor (#3371)
Add (fully opt-in) functionality to support setting pretty names for
nodes in the graph. In particular

- Variable now has a `name` parameter in the constructor
- export now has `input_names` and `export_names` parameters

Nodes that are not named via this mechanism continue to be named
internally with unique integers.

Names have a few rules.

- They must all be unique in the graph.
- They may not be integers (because of potential conflicts with
  internally generated names).
2017-11-16 21:11:34 -05:00
Gregory Chanan
9a2b54e08b [ATen] Rename isCuda -> is_cuda. 2017-11-15 18:33:07 -08:00
Gregory Chanan
5fd93b56fd [master] Don't expose 0-dim tensors to Variable API. 2017-11-07 15:15:42 -05:00
Sam Gross
fde355f7d4
Allow in-place operations on views (#3384)
Allow in-place operations on views

Adds VariableViewImpl, a subclass of VariableImpl which has a pointer to
the base Variable on which it is a view. In-place operations on views
change the grad_fn of the base.

Note that in-place operations only work on views that are the first output of the function that created them. All C++/ATen implemented functions have this behavior, but it's possible to write Python-implemented autograd functions that do not. In-place operations on these view will raise an exception.

Fixes #3313
2017-11-06 18:19:56 -05:00
Sam Gross
6647475bc2 Lazily create Variable.data PyObject* (#3149)
Previously, we the Variable.data PyObject* in THPVariable_Wrap. For many
Variables, we don't access their data directly. Instead, they are passed
from one Variable compuatation to another.

This reduces the overhead of ATen-implemented Variable methods by
~200ns.
2017-10-17 11:54:55 -04:00
Sam Gross
e92246fffa Visit hooks in C++ implemented autograd functions (#3138)
Once mul uses ATen, this is necessary for TestAutograd.test_hooks_cycle
to pass.
2017-10-16 17:30:09 -04:00
Edward Z. Yang
06c44e2283 Replace Variable(new VariableImpl(...), false) with make_variable.
Also squash a warning about an implicit conversion that will never
occur (because the type being converted to is a superclass).

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
2017-09-14 14:33:08 -04:00
albanD
2356ee41b7 Fix segfault in backward 2017-09-13 14:47:26 -04:00
Sam Gross
1290e586fb Use at::Tensor based autograd Variable (#2676)
Variable is now a subclass of at::Tensor backed by a VariableImpl* pImpl. The implementation of the ATen functions is defined in the auto-generated VariableType.h/cpp file.

Currently, only functions which fall through to the base type, such as sizes() and isCuda() are implemented. Differentiable ops like add() and mul() will be added in a subsequent PR.
2017-09-12 11:36:01 -04:00
Adam Paszke
1c4538e017 Trace C functions 2017-09-05 17:48:55 -04:00
Adam Paszke
ea05ac8f41 Move JIT-related files to jit dir. Remove IR interpreter 2017-09-05 17:48:55 -04:00
Edward Z. Yang
a797ab9343 Rewrite AST to a new, more functional representation.
Previously, our AST was a DAG, where shared Nodes indicated a computation
should be reused.  This commit rewrites the IR into a new functional
representation which represents sharing explicitly using variable
bindings.

We offer a few justifications for this new style:

1. The new representation is not all that different from the
old one; it is about as easy to construct, and the lack of an
explicit graph doesn't negatively impact our ability to interpret
the graph, since we've chosen, as a matter of design, to NOT have
the IR participate in the actual execution of a graph.

2. The new let-binding representation has an implicit ordering,
which we can use to conveniently keep track of the original order
the trace showed up as.  This automatically gives us a topsort,
and gives us an easier to read textual representation of our
IR:

  %14 = Embedding %11, %0, -1, None, 2, False, False
  %15 = Dropout %14, 0.2, True, False
  %16 = Index %12, 0
  %17 = Index %12, 1
  %18 = Index %13, 0
  %19 = Index %13, 1
  %20 = Index %15, 0
  %21 = Linear %20, %1, %3
  %22 = Linear %16, %2, %4

3. It moves us closer to a Futhark style language
(http://futhark-lang.org/publications/pldi17.pdf).

Major aspects of the diff

- Node is replaced with Expr and Arg, a pair of mutually recursive
  structures which represent our new language.  In BNF, the language
  looks like this:

    a ::= c | %i
    e ::= %i, ... = e
        | PyOp e, ...
        | Ret %i, ...

  Technically, Ret is not actually a return (no control flow is involved),
  it just tuples up a series of tensors (identified by variables).

  One important invariant is that locals are always tensors; they
  are never constants (this is asymmetric with Args.)

- Arguments support Python constants.  This is an important piece because
  many operators take extra Python literals like integers and tuples in
  order to specify extra parameters about how an operator operates.  Adding
  this was essential to getting word_language_model to work.

- As both Expr and Arg have multiple variants, there is new infrastructure
  for doing case on the variants using ExprVisitor and ArgVisitor.  The
  strategy here is adapted from WebAssembly's visitors, although we have
  generalized to permit arbitrary argument forwarding, which is necessary
  to support tail-recursive visitor calls.  TCO is important because our
  interpreter may recurse arbitrarily deep into a stack of nested lets.
  If users wish, they can also manually case on the type tag.

- Tracing is now turned on and off using _tracer_enter/_tracer_exit in
  torch._C.  _tracer_enter accepts a list of variables which are to be
  treated as arguments; _tracer_exit accepts the list of traced variables
  which should be returned when you reexecute the trace, and returns
  the trace expression which can be reexecuted.  GlobalTracingState
  is a global variable which tracks whether or not we are tracing or not.

- You use run_forward to execute a trace on some set of parameters.

- When under tracing, variables keep track, via trace_local, what the
  name of their variables in the IR are.

Here is a simple runner which leaks memory but can be used to JIT models:

  import torch.autograd.function as F
  import torch._C

  def jit(model):
      import types
      real_forward = model.forward
      def forward(self, *args):
          def flatten(x):
              return tuple(F._iter_variables(x))
          if not hasattr(self, "saved_trace"):
              torch._C._tracer_enter(tuple(self.parameters()) + flatten(args))
              out = real_forward(*args)
              self.saved_trace = torch._C._tracer_exit(flatten(out))
              self.saved_outs = out
              return out
          else:
              flat_out = Variable._execution_engine.run_forward(self.saved_trace, tuple(self.parameters()) + flatten(args))
              return F._unflatten(flat_out, self.saved_outs)

Major problems:

- Sanity checking is spotty at best, especially when users pass in variables.

- The interpreter leaks tensor memory from the store.  When we add back def-use
  we should be able to deallocate tensors as soon as we know they are no longer
  necessary.

- The interpreter needs to reach feature parity with the old execution engine.
  From there, we need to see if backwards can be subsumed as well.

- I still have no confidence in having memory managed everything correctly.
  This requires a close look.

- Rather than return an *open* expression as a trace, we should return a
  *lambda* instead, which knows about how many formal parameters it
  requires.

- The IR is not introspectable from Python at the moment, but this is simply a
  matter of implementing all the binding code.

- The tracer is NOT reentrant (you can't trace while you're inside a trace.)
  Furthermore, no sanity checking is done if you try to incorrectly reuse
  things from one trace in another.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
2017-09-05 17:48:55 -04:00
Edward Z. Yang
e1b7872fc2 Make it possible to access IR from Python.
Also, add a new trace_fn field to attach forward IR to Variables.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
2017-09-05 17:48:55 -04:00
Zachary DeVito
43c944acbd Remove dead THPP code that has been replaced with ATen objects. (#2235)
THPP usage is now isolated in THD.
2017-07-29 08:07:41 +05:30
Zachary DeVito
bf26a51f91 fix a bug where an uninitialized at::Tensor was passed to createPyObject (#2239) 2017-07-29 06:28:18 +05:30
Trevor Killeen
c304d04fc6 Replace thpp::Tensor with ATen Tensor in autograd csrc (#2170) 2017-07-28 10:18:37 -04:00
Sam Gross
eba3dc8561 Fix gc_refs assertion failure (#1705)
* Fix gc_refs assertion failure

Ensure that each THPVariable -> THPFunction reference contributes one
ref count to the THPFunction by creating a new shared_ptr for each ref.

Because multiple shared_ptrs can again manage a single THPFunction, it's
not safe to use std::weak_ptr where it may point to a PyFunction. It's
still safe to use weak_ptr for grad_accumulator since these are never
PyFunctions.

Fixes #1626

* Remove stale comment
2017-06-02 21:08:50 -04:00
Adam Paszke
feef54ec34 Don't modify non-volatile grads in zero_grad 2017-05-10 16:43:14 +02:00
Adam Paszke
35cf380ed1 Improve output wrapping logic in autograd 2017-05-10 16:43:14 +02:00
Adam Paszke
20aa5b066f Convert some of the functions to new format
Also, fix a lot of issues that appeared after the previous commits.
2017-05-01 16:44:56 -04:00
Adam Paszke
702a2e3bc5 Make Variables not subclass Function anymore
Because of this Variables can no longer appear in the graph.
Every usage of a leaf Variable will leave an AccumulateGrad
function that has no outputs, but modifies var.grad as a side
effect.
2017-05-01 16:44:56 -04:00
Adam Paszke
2ca787fcf4 Refactor attribute names in autograd 2017-05-01 16:44:56 -04:00
Sam Gross
5073132837 Implement 'pre' and 'post' hooks at the C++ autograd level 2017-03-06 12:47:53 -08:00
Sam Gross
34ce58c909 Parallelize backwards 2017-03-03 11:26:00 -08:00
Martin Raison
f17cfe4293 sparse tensor operations (#735) 2017-03-03 18:37:03 +01:00
Adam Paszke
c0c62d099a Make detach() actually remove the creator 2017-02-17 10:40:08 +05:30
Sam Gross
bd5303010d Refactor autograd package to separate Python dependencies. (#662)
The core autograd Variable, Function, and Engine no longer depend on the
Python API. This let's us implement functions in C++. In the future, we
can also multithread engine and release the GIL for most of the
non-Python backwards.
2017-02-13 16:00:16 -08:00