24 Commits

Author SHA1 Message Date
Sam Gross
ebd32f7bcd
Check that parsed_args contains enough space for all parameters (#5467) 2018-02-28 14:34:04 -05:00
gchanan
f4cfd9bbfc
Don't python bind 'tensor' or 'sparse_coo_tensor'. (#5390)
These are internal ATen functions; we have better python APIs.
2018-02-26 11:06:25 -05:00
Sam Gross
30ec06c140
Merge Variable and Tensor classes (#5225)
This replaces the torch.Tensor constructors with factories that produce
Variables. Similarly, functions on the torch module (e.g. torch.randn)
now return Variables.

To keep the PR to a reasonable size, I've left most of the unused tensor
code. Subsequent PRs will remove the dead code, clean up calls to
torch.autograd.Variable, and rename Variable to Tensor everywhere.

There are some breaking changes because Variable and Tensors had
slightly different semantics. There's a list of those changes here:

 https://github.com/pytorch/pytorch/wiki/Breaking-Changes-from-Variable-and-Tensor-merge
2018-02-23 18:03:31 -05:00
gchanan
0878c6d4d7
Various dtype improvements. (#5321)
* Various dtype improvements.

1) Add dtypes to the new data-based constructors: Variable.new_tensor and torch.autograd.variable.
2) In the python signatures, use Type instead of Dtype to match the C++ signatures; the error messages still print as dtype.
3) Handle / add a better error message when a dtype is used when ATen was not compiled with that type (e.g. cuda types).
4) Move cuda_lazy_init to its own file.

A later commit will add support to the legacy constructors as well.

* Move implementation of lazy_init to cpp.

* Fix parsed_arg size.
2018-02-21 17:37:59 -05:00
gchanan
5edf6b2037
Add numpy-style dtypes to Variable factories. (#5245)
* Add numpy-style dtypes to Variable factories.

1) Add numpy-style dtypes corresponding to torch tensor types.  These are:
torch.float16, torch.float32, torch.float64, torch.uint8, torch.int8, torch.int16, torch.int32, torch.int64
as well as torch.cuda, torch.sparse, and torch.cuda.sparse equivalents.

2) Adds "legacy" names for the above dtypes that correspond more closely to existing tensor names.  These are:
torch.half, torch.float, torch.double, torch.short, torch.int, torch.long.
torch.byte and torch.char don't exist because they either don't match numpy semantics or differ on different architectures.

3) Adds a "dtype" parameter to Variable factories (e.g. zeros, ones) that allows the user to specify the type without changing the default tensor type.

4) Adds a "dtype" getter to Variables that returns the canonical dtype from 1).
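
The relationship between the legacy names in 2) and the numpy-style names in 1) can be written out as a lookup table. This is only an illustrative sketch: it uses plain strings rather than torch objects, and `LEGACY_DTYPE_NAMES` and `canonical_name` are hypothetical names that do not appear in the PR.

```python
# Illustrative only: maps the "legacy" dtype names from 2) to the
# numpy-style names from 1). Values are plain strings, not torch dtypes.
LEGACY_DTYPE_NAMES = {
    "half": "float16",
    "float": "float32",
    "double": "float64",
    "short": "int16",
    "int": "int32",
    "long": "int64",
}


def canonical_name(name):
    """Return the numpy-style name for a legacy or already-canonical name."""
    return LEGACY_DTYPE_NAMES.get(name, name)
```

Names with no legacy alias (for example "uint8") pass through unchanged.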

This PR is missing the following useful features that should be added in the future:
A) We only add the "dtype" parameter to auto-generated factories; hand-written factories like in tensor_new.cpp don't support this yet.

B) We don't allow type conversions to use dtypes; that should be added to type(param) or a new function.

C) We don't yet have a "device" parameter for these factories; right now, they will only create Variables on the default device.

* backend_to_string can be private.

* Define python binding argument indexes in a more simple way.

* add all_declared_types, still need to hook it up to THPDType.

* Fix all_declared_types for missing types (it's Sparse + Half).

* Ensure cuda dtypes are created even if compiled with NO_CUDA=1.

* Fix case where dtype is provided but dispatch is via namespace.

This happens in ones_like, empty_like, randn_like.

There is some question if we should do:
1) at::ones_like(tensor).toType(dtype)
2) at::ones_like(tensor.toType(dtype))

I did the former because this matches with the numpy documentation, i.e.:
"Overrides the data type of the result." and it's easier to implement.

Note that the above causes an extra copy, either of the input or output.
Here's a better implementation:
1) Make zeros_like, ones_like native functions that take an optional type (named dtype?).
2) Match the type argument with the dtype, so we don't have two different parameters.
3) Call at::zeros_like(input, type) -> at::native::zeros_like(input, type) -> type.zeros(input.sizes())

* Don't return from maybe_initialize_cuda.

* Don't leak DType name.

* Address cpp review comments.

* Share code between sparse and non-sparse test_dtypes.

* Rewrite _like functions as native function with explicit type parameter.

* Use type 'Type' instead of 'dtype' for consistency.

* Address review comments.

* Handle arg_idx when there is requires_grad but no dtype in python_binding_arguments.
2018-02-20 11:04:14 -05:00
Sam Gross
0509f26d41
Speed-up nn.Linear for the 3d input case (#5279)
This adds at::_unsafe_view and uses it in matmul. The _unsafe_view
function is identical to view except that the output is not treated
like a view by the automatic differentiation code. This avoids in-place
modifications triggering the more expensive CopySlices/AsStridedBackward
behavior.

The _unsafe_view function is only safe to use on temporaries that will
be immediately discarded and that do not alias other tensors. Otherwise,
in-place modifications may trigger incorrect gradients. The function is
not exposed to Python.

See #5169
2018-02-19 19:47:20 -05:00
Sam Gross
c1b98f0841
Add deprecated add_out overload (#5088)
We have a few calls that use this signature on Tensors. This also
updates the binding code to support deprecated xxx_out signatures.
2018-02-06 17:08:23 -05:00
Edward Z. Yang
7bd2db997e
Port cuDNN RNN bindings to ATen (#4881)
* Add transpose() to TensorGeometry.

This code is dead; I briefly used it in my RNN patchset but
eventually rewrote it to not be necessary.  However, it seemed
like a useful gadget so I kept it.  In general, it seems that it
would be useful for TensorGeometry to support all operations that
Tensor does, but it only computes the changes to sizes/strides
instead of actually doing the computation.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Turn on wrap_dim behavior for TensorGeometry

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
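
The commit does not spell out what wrap_dim does. Assuming it follows the usual convention that a negative dimension index counts from the end, a minimal sketch looks like this (`wrap_dim` here is a hypothetical Python helper, not the C++ function):

```python
def wrap_dim(dim, ndim):
    """Map a possibly negative dimension index into the range [0, ndim)."""
    if not -ndim <= dim < ndim:
        raise IndexError(f"dimension {dim} out of range for {ndim} dimensions")
    # Negative indices count back from the last dimension.
    return dim + ndim if dim < 0 else dim
```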

* Support for hard-coded differentiable outputs.

Some outputs of functions are nondifferentiable, and should always
be returned with requires_grad=False.  Traditionally, we have used
the presence of 'grad' to signal that only the first output is
differentiable, and the rest are not, but cudnn_rnn (to be
implemented) breaks this pattern; its first three outputs are differentiable,
but its last output is a buffer that is just consumed by backwards.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* TensorGeometry constructor from just sizes

The sizes are assumed to form a contiguous tensor, and we compute
the strides we would get in that case.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
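
A sketch of the stride computation described above, assuming the usual row-major layout in which the last dimension has stride 1. `contiguous_strides` is a hypothetical helper name, not the constructor itself:

```python
def contiguous_strides(sizes):
    """Strides (in elements) of a row-major contiguous tensor with these sizes."""
    strides = [1] * len(sizes)
    # Walk from the second-to-last dimension backwards; each stride is the
    # product of all sizes to its right.
    for i in range(len(sizes) - 2, -1, -1):
        strides[i] = strides[i + 1] * sizes[i + 1]
    return strides
```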

* Support saving TensorList for backwards.

There is some back story here.  Saved TensorList in backwards will
be used by cudnn_rnn, and it is worth asking, why is it necessary to
save a list of tensors?  Indeed, *technically* speaking a list of
tensors is not necessary, we only need to save the sizes of each
of the weight tensors.  (We need the sizes because cuDNN is only
going to blast the derivative of weights into a flat buffer, but
we need to match the sizes of the views into the buffer when we
eventually return the derivatives.)

However, it was surprisingly awful trying to implement passing just
sizes, because as non-Tensor arguments, the JIT interpreter generation
code is expected to handle all non-Tensor arguments as attributes in the
trace, and our attributes struct doesn't actually know how to do
arrays of arrays.  Saved TensorList code was much easier to get working,
so that's what this patch does.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
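
To illustrate why the sizes are needed: a flat derivative buffer can only be cut back into per-weight pieces if the shape of each weight is known. The following is a rough sketch over Python lists, with `split_flat_buffer` as a hypothetical name; the real code works with views into a tensor rather than copied chunks.

```python
from math import prod


def split_flat_buffer(flat, shapes):
    """Slice a flat buffer into one chunk per weight shape, in order."""
    chunks, offset = [], 0
    for shape in shapes:
        numel = prod(shape)  # number of elements in this weight
        chunks.append(flat[offset:offset + numel])
        offset += numel
    if offset != len(flat):
        raise ValueError("buffer length does not match the total number of elements")
    return chunks
```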

* MatrixRef - an ArrayRef with a stride, making it a 2D ArrayRef.

Like ArrayRef, this class does not own the underlying data, it is expected
to be used in situations where the data resides in some other buffer.
This is intended to be trivially copyable, so it should be passed by
value.

For now, 2D only (so the copies are actually cheap, without having
to write a SmallVector class) and contiguous only (so we can
return non-strided ArrayRef on index).

The intended use-case (not in this commit) is to make it easier to
work with RNN weights, which form a num_weights x num_layers matrix of
parameters.

P.S. dimension 0 indexes rows, dimension 1 indexes columns

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
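
The idea can be sketched in Python as a thin wrapper over a flat sequence plus a row stride. `MatrixView` is a hypothetical stand-in, not the C++ class; in particular, slicing a Python list copies the row, whereas the C++ version would return a non-owning ArrayRef.

```python
class MatrixView:
    """2D, contiguous-only view over a flat sequence (illustrative sketch)."""

    def __init__(self, data, stride0):
        if stride0 <= 0 or len(data) % stride0 != 0:
            raise ValueError("data length must be a multiple of stride0")
        self._data = data        # borrowed reference; the view does not own it
        self._stride0 = stride0  # number of elements per row

    def size(self, dim):
        # Dimension 0 indexes rows, dimension 1 indexes columns.
        if dim == 0:
            return len(self._data) // self._stride0
        if dim == 1:
            return self._stride0
        raise IndexError("MatrixView is 2D only")

    def __getitem__(self, row):
        if not 0 <= row < self.size(0):
            raise IndexError("row out of range")
        start = row * self._stride0
        # Note: this slice copies in Python.
        return self._data[start:start + self._stride0]
```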

* Generalize getDataType in Descriptors.h

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Change copy_range to take Tensor, and change cat_tensors_backward accordingly

Should a backward function return a Variable or a Tensor?  For the most
part, all of our backward functions return Tensor, except cat_tensors_backward,
which returns a variable_list (which is really the only thing that matters,
because Tensor and Variable are interconvertible).  But this is kind of weird,
because it means that you can't implement a backwards in ATen that returns
a std::vector<Tensor>, and then hook it up transparently with the derivatives
code.  So I switched it over.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Support 5-ary return Tensor tuple.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Support code generation with mixed Tensor/TensorList in output.

I don't think I ended up using this in cudnn_rnn, but it seems
like it might be useful for someone else later.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Support 4-ary boolean array

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Add support for retain_variables in tools/autograd/derivatives.yaml

'retain_variables' is a bool that is true if a user has specified
that saved variables should be retained in case the backwards is
run again later.  This allows an optimization where we can
destroy saved buffers if we know variables are not going to be retained;
e.g., it is (will be) used by _cudnn_rnn.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
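
The optimization can be pictured with a small sketch. `SavedState` and its `backward` method are hypothetical and stand in for generated autograd code; the "gradient" here is just a sum so the example stays self-contained.

```python
class SavedState:
    """Holds buffers saved for backward and frees them when allowed."""

    def __init__(self, buffers):
        self.buffers = list(buffers)

    def backward(self, retain_variables=False):
        if self.buffers is None:
            raise RuntimeError("saved buffers were already freed")
        grad = sum(self.buffers)  # placeholder for the real computation
        if not retain_variables:
            # Nobody will run backward again, so the buffers can be dropped.
            self.buffers = None
        return grad
```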

* Lazily initialize cuDNN descriptors

Previously, cuDNN descriptors were eagerly allocated as soon
as a FooDescriptor object was created.  However, in some uses
of TensorDescriptor, this is problematic: some tensors are optional
and cuDNN's API expects to be given a nullptr TensorDescriptor
in this case, not an uninitialized (but allocated) descriptor.

Lazily initializing the descriptors makes it less likely for
us to use uninitialized memory and matches the usual semantics of
unique_ptr.  It's good sense!

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
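
A minimal sketch of the lazy pattern, in Python rather than C++. `LazyDescriptor` and the `create` factory are hypothetical; the method name `mut_desc` echoes the one mentioned later in this log, but the class below is not the actual implementation.

```python
class LazyDescriptor:
    """Allocates its underlying handle only on first mutable access."""

    def __init__(self, create):
        self._create = create  # hypothetical factory standing in for a cuDNN create call
        self._handle = None    # nothing allocated yet

    def desc(self):
        # Read-only access: stays None until someone initializes the
        # descriptor, which mirrors passing a nullptr for optional tensors.
        return self._handle

    def mut_desc(self):
        # Mutable access: allocate on first use, then reuse the same handle.
        if self._handle is None:
            self._handle = self._create()
        return self._handle
```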

* Port cuDNN RNNs to ATen.

This brings three new functions:
  - _cudnn_rnn_flatten_weight: flatten a matrix of weight tensors into
    a single contiguous weight buffer as required by cuDNN
  - _cudnn_rnn: run RNN forwards
  - _cudnn_rnn_backward: run RNN backwards

RNNs have a lot of parameters, so we restructured what was previously
a single 'fn' object that recorded all the parameters into three
objects: RNNDescriptorParams, TensorDescriptorListParams and
DropoutDescriptorParams.

We make use of MatrixRef to organize the weight tensors (which are
weight/bias x number of layers), but I did not teach the codegen
how to pass these as arguments/return values natively, so instead
a MatrixRef is passed as its constituent ArrayRef and int64_t stride0.

cudnn_rnn has three differentiable outputs and one nondifferentiable
one, so it makes use of the support for hard-coded differentiable outputs.

I haven't deleted all of the descriptor code from Python, because dropout
initialization still goes through this codepath; that should be fixed soon,
but I don't see it as essential for this PR.

This commit also removes the last use of NestedIOFunction from PyTorch.

There are some shenanigans with cuDNN dropout descriptor initialization,
see below:

Note [cuDNN dropout descriptor initialization]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In most cases, setting descriptors in cuDNN is cheap (e.g.,
cudnnSetTensorNdDescriptor).  However, this is not the case for
cudnnSetDropoutDescriptor: in cuDNN 6/7 (and possibly others) it does an
expensive precomputation to initialize the random number generator states.  In
cuDNN 6, this is the ONLY official mechanism to initialize a dropout descriptor,
which means that law-abiding clients were expected to generate a dropout
descriptor once and cache it.  However, our ATen interface is (1) stateless (so
we can't cache the descriptors) and (2) does not accept arbitrary user types in
its interface (so we can't pass the descriptor in).  This puts us in a pickle.

In cuDNN 7, a new function, cudnnRestoreDropoutDescriptor was added, which
forgoes the expensive initialization process, and can initialize the
descriptor with a pre-initialized state CUDA tensor.  This is great, because
it means we can simply pass in the state tensor and then initialize the
descriptor internally.  Unfortunately, this function is not available in
cuDNN 6.

To work around this, we break the cuDNN abstraction barrier, and have
the struct layout of the underlying dropout descriptor.  With this struct,
we can reimplement cudnnRestoreDropoutDescriptor from scratch. Great!

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Fix cuDNN 7 behavior.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Delete some unused, controversial methods from MatrixRef.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Add missing filter_dim_a slice

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Replace nested for-loop with itertools.chain.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
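
The log does not show which loop was replaced. As a generic illustration of the transformation, with made-up weight names:

```python
from itertools import chain

# Hypothetical nested structure: one list of names per group.
groups = [["w_ih", "w_hh"], ["b_ih", "b_hh"]]

# Nested for-loop version.
flat_loop = []
for group in groups:
    for name in group:
        flat_loop.append(name)

# Equivalent version using itertools.chain.
flat_chain = list(chain.from_iterable(groups))
```

Both lists contain the same names in the same order.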

* CR comment on mut_desc()

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Refactor DropoutDescriptor API.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Use cached CurrentDeviceProperties from Context.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Document _cudnn_rnn outputs.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Improve fmap docs, convert some functions to use it.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Move IndexRange to autograd/function.h

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Elaborate on CUDNN_STATUS_INVALID_VALUE return some more.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Add an all-in-one setter for RNNDescriptorParams.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Print what the unrecognized RNN mode was

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* RNN TensorDescriptor improvements

- Have an explicit size/stride overload for set TensorDescriptor,
  so you don't have to create a goofy view to feed in.

- Change the padding to 3D rather than 5D, which is all you actually
  need (it's just 2D that is not supported by cuDNN API.)

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Fix implementation of cudnnRestoreDropoutDescriptor, plus test.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Better comments about input layout.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Add comment about no-DropoutDescriptor argument RNNDescriptor function.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Rename vocab_size back to input_size.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Don't use backslash in comment.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Bugfix for contiguous TensorGeometry calculation.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Don't allocate a dummy tensor when setting TensorDescriptor for flatten_weight.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Make contiguity errors more user-friendly.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* s/fn.dropout.train/fn_train/

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* s/_cudnn_rnn_backward_grad/_cudnn_rnn_backward_input/

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Make dcx properly undefined when not required.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Remove old TODO.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Add state size check in cudnnRestoreDropoutDescriptor

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Explicitly narrow int64_t to size_t

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Restore copyParams comment.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Update benchmark numbers, and slight engineering improvements.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Typofix.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
2018-02-05 13:54:11 -05:00
Richard Zou
bc11511cda
Restore sparse variable transpose_() and t_() (#4779)
* Restore sparse variable transpose_() and t_()

* Add dimension wrapping to transpose_, t_

* Don't expose sparse_raw_resize_ to python
2018-01-23 21:32:40 -05:00
gchanan
c49f0279a6
Add kwarg-only 'requires_grad' parameter to Variable factories. (#4748)
* Add kwarg-only 'requires_grad' parameter to Variable factories.

Functions that create variables, e.g. torch.ones_like, currently always return Variables with requires_grad=False;
this is less convenient than the existing Variable constructor that has a requires_grad parameter.  This commit
adds the parameter at the python binding level.
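
"Kwarg-only" means the parameter cannot be passed positionally. In Python syntax that looks roughly like the sketch below; this `ones_like` is a hypothetical stand-in operating on lists, not the torch function, and it simply returns the flag alongside the data.

```python
def ones_like(values, *, requires_grad=False):
    """Hypothetical factory: returns (data, requires_grad)."""
    # The bare * makes every parameter after it keyword-only.
    return [1 for _ in values], requires_grad
```

Calling `ones_like(x, requires_grad=True)` works, while `ones_like(x, True)` raises a TypeError.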

* Fix flake8.

* Address review comments.

* Match set_requires_grad implementation with tensor_new version.
2018-01-22 19:15:11 -05:00
Sam Gross
57549b7e44
Bind functions with out= arguments in VariableType (#4565)
This adds overrides in VariableType for the xxx_out ATen functions and
implements Python bindings. There is no support for automatic
differentiation. If any of the inputs (or outputs) requires grad, then the
function will throw an exception unless it's running in "no-grad" mode.

The bindings for calling torch.xxx functions on Variables are moved to a
different object. Previously, they were static methods on VariableBase.
This change prevents users from accidentally calling static methods as if
they were instance methods.
2018-01-17 18:27:42 -05:00
Sam Gross
f8a4b1a266
Split off load_derivatives and gen_autograd_functions from gen_variable_type (#4370) 2017-12-27 18:59:41 -05:00
Tongzhou Wang
d8b2e5d091
Add python only default init expression; Implement stft, hann/hamming/bartlett window. (#4095)
* implement stft

* addressed comments; implemented window functions; added support for python only default initialization
2017-12-18 12:28:23 -05:00
Sam Gross
c813ce3787
Implement Variable._sparse_mask (#4124)
* Implement Variable._sparse_mask

* Use SparseTensor as the dynamic_type
2017-12-15 17:25:20 -05:00
Sam Gross
aeb7a3668d
Implement Variable.new (#4080) 2017-12-11 15:45:43 -05:00
Tongzhou Wang
c681b03d37
Add determinant function on variable; Add backward on svd (#3816)
* determinant on variable

* svd bwd
2017-12-01 13:22:46 -05:00
Adam Paszke
65e0d5bad8
Fix void* wrapping in autograd codegen
Also, add assertions here and there to make sure bad things
never happen again.
2017-11-24 13:33:13 +01:00
Sam Gross
9cb8b43778
Split off in-place NN functions (#3683)
For example, this splits threshold into threshold(), which is now
never in-place, and threshold_() which is always in-place.

This simplifies the in-place vs. non-in-place logic in
gen_variable_type.py, which was bug-prone.
2017-11-14 12:59:06 -05:00
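
The split described in the commit above can be pictured with plain Python lists. This is only a sketch of the naming convention (a trailing underscore means in-place); the functions below are hypothetical stand-ins and assume the usual threshold rule, where elements not greater than `thresh` are replaced by `value`.

```python
def threshold(values, thresh, value):
    """Out-of-place: returns a new list and leaves the input untouched."""
    return [x if x > thresh else value for x in values]


def threshold_(values, thresh, value):
    """In-place: overwrites the input list and returns that same list."""
    for i, x in enumerate(values):
        if x <= thresh:
            values[i] = value
    return values
```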
Zach DeVito
5aa5b572e4
update build so that all of TH* is in libATen 2017-11-02 19:53:36 -04:00
Edward Z. Yang
53fe804322
Make ONNX work with new C++ autograd world.
The general strategy is there is a new module, torch.onnx.symbolic, which
contains a function for every ATen method name with the ONNX translation.
While implementing this, I took the opportunity to expunge all references
of 'g' from the public API; instead, it is managed by a global variable in
torch.onnx which tracks the "current graph".

Other changes:

- If you pass a Tensor to op as an argument, it will now automatically be
  converted into a Constant ONNX node.  This lets us remove needing to
  implement ONNX

- Rename value to other, wherever there is both a Scalar and Tensor overload.
  This way, keyword dispatch can work uniformly in both cases.

- Deleted any autograd Function classes that both had a symbolic and were ported
  to the new C++ autograd implementation.  There may still be some straggling
  classes that didn't have symbolic.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
2017-10-20 15:38:01 -04:00
Sam Gross
f1f64c8d07
Generate autograd functions for NN / more refactors (#3136)
Generate autograd functions for NN and implement more derivatives in derivatives.yaml

A big refactor of gen_variable_type.py
2017-10-19 15:03:26 -04:00
Sam Gross
f29bcab67e
Use Declarations.yaml to generate python bindings 2017-10-07 00:41:29 -04:00
Sam Gross
558d26a69e
Fix argument indices 2017-10-07 00:41:29 -04:00
Sam Gross
dcb8d0f088
Refactor out python binding generation from gen_variable_type.py
- Also includes some prep work for binding NN functions
2017-10-07 00:41:29 -04:00