* Sort declarations when generating Python bindings
This helps resolve ambiguities in argument parsing according to
any rules we will need.
For now, this allows us to make scalar operations more conservative
w.r.t. argument types while making them commutative again.
* Fix inconsistencies between mod with tensor and scalar
* Fix a stupid mistake
* start at generic trilinear
* Implement einsum (fixes #1889)
This provides a simple implementation of einsum. It is built on
top of the work for computing bilinear (#6110).
It uses a naive left-to-right resolution at the moment.
Autograd is able to differentiate by itself.
The obvious unsupported feature is taking diagonals (einsum('ii->i', (a,))).
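A quick usage sketch (shapes are illustrative; operands are passed as a tuple, as in the example above):
```python
import torch

a = torch.randn(2, 3)
b = torch.randn(3, 4)
torch.einsum('ik,kj->ij', (a, b))          # matrix multiplication, equivalent to a.mm(b)

# The bilinear-style contraction this implementation builds on (#6110):
x = torch.randn(5, 3)
w = torch.randn(6, 3, 4)
y = torch.randn(5, 4)
torch.einsum('bn,anm,bm->ba', (x, w, y))   # result has shape (5, 6)
```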
* add tests and docs
* fix flake8
* clean diff
* rebase on current master to resolve conflicting String wrapping
* clean up after rebase
* better commentary in einsum and sumproduct_pair
* don't say fixme if it's fixed and rename num_outputs to num_output_dims
* adapt python wrapper to use std::string instead of String to avoid typedef at::String
* typos and some vector to array conversion
* fix accidental python<->python3 change
* really fix bad rebase
* Add dtypes (with reasonable defaults) to sum, prod, cumsum, cumprod.
This adds optional dtypes to torch.sum, torch.prod, torch.cumsum, torch.cumprod.
By default, the dtype is torch.float64 for integral types, and the dtype of the input for floating point types.
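A brief example of the new parameter (the dtype is passed explicitly here rather than relying on the defaults described above):
```python
import torch

x = torch.arange(4, dtype=torch.int32)
x.sum(dtype=torch.float64)                   # accumulate in float64
torch.cumsum(x, dim=0, dtype=torch.float64)  # same for the cumulative variants
```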
* Don't use optional<ScalarType>, because the jit can't handle it yet.
Instead, we manually build the overloads. This is fairly painful because of default arguments, but should be easy to pull out once the jit can handle optional<ScalarType>.
* Fix keepdim with out parameters.
* Fix _cudnn_rnn_flatten_weight.
* If dtype is provided to an out function, make sure it matches the dtype of the result.
* Fix typo.
* Separate cuda-ness from dtype.
There are no longer torch.cuda.int64, etc.; only dtypes like torch.int64 that correspond to at::ScalarType.
At the python arg parser level, the corresponding ATen type is selected from the combination of (ScalarType, Layout, Device).
There is also currently unused code in here to support ScalarType in native_functions; this will be used for specifying aggregate types
on reduction functions.
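A minimal sketch of the resulting user-facing behavior (the string device form is introduced by a later commit in this log; the second line assumes a CUDA build):
```python
import torch

x = torch.zeros(2, 3, dtype=torch.int64)                 # CPU storage
y = torch.zeros(2, 3, dtype=torch.int64, device='cuda')  # CUDA storage, same dtype
x.dtype == y.dtype                                       # True: there is no torch.cuda.int64
```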
* Fix test_autograd.
* Add defaults to randint_like.
* Track is_cuda in py tensor types.
* Fix test_sparse.
* Fix multiprocessing.
* Fix rnn.
* Fix test_nn.
* Fix flake8.
* Add string-style devices to all tensors.
Previously, tensors only had a 'get_device' method, which would throw an exception on a CPU tensor. This made it necessary to add if/else branches
around code that was meant to be device agnostic.
This PR implements the following:
1) Adds a 'device' property to all tensors that returns a string representation of the device.
For cpu tensors this is 'cpu'. For cuda tensors this is 'cuda:X', where X is the cuda device ordinal.
2) Adds a DeviceSpec class. This is just a helper class for separating device_type and device_index specification and to allow partial specification.
For example, you can call DeviceSpec('cuda'), DeviceSpec('cuda:0'), DeviceSpec('cuda', 1).
Also has backwards compatibility support for specifying integers, which are treated as cuda devices.
DeviceSpecs have the following properties:
a) device_type: string representation of the device type (i.e. 'cpu' or 'cuda')
b) device_index: integer for the device index (None if not specified)
c) cuda_device_index: for backwards compatibility; behaves roughly like `get_device` did previously. I.e. if a function previously took integers for cuda devices,
it can now take DeviceSpecs (or strings), and can maintain the old functionality by calling `old_index = DeviceSpec(old).cuda_device_index`.
3) tensor methods and torch. functions that took integer devices can now take integers, strings, or DeviceSpecs. For example:
torch.randn((2,3), dtype=torch.cuda.float32, device='cuda:1')
TODO in future PRs:
A) Split out cuda from dtype so you don't need to overspecify cuda-ness
B) We currently only support strings/DeviceSpecs in tensor methods and torch. functions. We should have equivalents of torch.cuda.device(...), torch.cuda.device_of, etc.
at the torch. level that work on strings/DeviceSpecs
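A sketch of the API as this commit describes it (torch.DeviceSpec is renamed to torch.device further down this log, so treat this as illustrative rather than the current API):
```python
import torch

spec = torch.DeviceSpec('cuda', 1)        # equivalent to torch.DeviceSpec('cuda:1')
spec.device_type                          # 'cuda'
spec.device_index                         # 1
torch.DeviceSpec(1).cuda_device_index     # backwards-compatible integer form

x = torch.randn((2, 3), device='cuda:1')  # string form; assumes at least 2 CUDA devices
x.device                                  # 'cuda:1'
```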
* Add deviceInt64 to python arg parser.
* device_str.
* Remove device_str.
* remove device prefix from attributes.
* Use const char * instead of string.
* Move autogpu index out of Device.
* comment on is_default.
* Rename torch.DeviceSpec to torch.device.
* comment.
* Fix tests.
* Fix flake8.
* Fix sparse_coo_tensor parameter name.
* Improve error message.
* Remove device_ prefix from C++ device object.
* Allocate static strings.
* Return not implemented from rich compare.
* Move torch::Device to THPDevice.
* Remove cuda index.
* Py_RETURN_NOTIMPLEMENTED doesn't exist in python2.
* Add max_values and argmax convenience functions to ATen
* Add documentation for torch.argmax/argmin and skip max_values
* Add tests for argmax/argmin
* Don't default the dim argument
* Use dim=0 in test_torch.py for argmax tests
* Implement argmin() and argmax() without dim
* Call .contiguous() before .view(-1)
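Usage sketch of the convenience functions added above:
```python
import torch

x = torch.randn(4, 5)
torch.argmax(x)          # index into the flattened tensor
torch.argmax(x, dim=0)   # per-column indices, shape (5,)
x.argmin(dim=0)          # method form works as well
```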
* Introduce torch.layout and split layout from dtypes.
Tensors (and tensor types) now have a 'layout' attribute that returns either 'torch.strided' or 'torch.sparse_coo'.
Previously, dtypes were 1-to-1 with ATen types/PyTensorTypes; the impetus behind this decision was to make things easy in the common case
(i.e. specifying a type in a factory function). But this doesn't really follow for sparsity, which isn't a common case.
It also doesn't properly represent the concept of a dtype, which in numpy is a proper scalar type (i.e. roughly the type returned from indexing the
last dimension of an n-d array). But this should be the same whether or not the tensor is represented via strides, sparsity, etc.
This is accomplished by:
1) having the dtype of tensor return the (device-type, scalar-type) combination, i.e. torch.cuda.float32, so both
torch.cuda.FloatTensor and torch.cuda.sparse.FloatTensor have the same dtype
2) Adding a layout parameter to python functions, where the combination of (dtype, layout) maps to an ATen type that is used for dispatch.
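A small sketch of the resulting split (assumes a build with sparse support):
```python
import torch

dense = torch.zeros(2, 3)
sparse = torch.zeros(2, 3, layout=torch.sparse_coo)
dense.layout                   # torch.strided
sparse.layout                  # torch.sparse_coo
dense.dtype == sparse.dtype    # True: sparsity is no longer baked into the dtype
```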
* Formatting, make init throw python_error.
* Fix cuda not enabled error message.
* Fix test.
* Support native namespace functions with type dispatch.
Use 'ones' as an example. Note this is a "halfway" solution; i.e. the call chain is:
at::ones(shape, dtype) -> dtype.ones(shape, dtype) -> CPUFloatType.ones(shape, dtype) -> at::native::ones(shape, dtype)
The "nicer" solution would probably be something like:
at::ones(shape, dtype) -> dtype.ones(shape) -> CPUFloatType.ones(shape) -> at::native::ones(shape, this)
* Fix type inference.
* Fix test install.
* Fix extensions.
* Put dtype argument at the beginning.
* Fix extension.cpp.
* Fix rnn.
* Move zeros in the same manner.
* Fix cuda.
* Change randn.
* Change rand.
* Change randperm.
* Fix aten contrib.
* Resize in randperm_out.
* Implement eye.
* Fix sparse zeros.
* linspace, logspace.
* arange.
* range.
* Remove type dispatch from gen_python_functions.
* Properly generate maybe_init_cuda for type dispatch functions not named type.
* Don't duplicate the dtype and this parameters for native type dispatched functions.
* Call VariableType factory methods from the base type so it gets version number 0.
* Address review comments.
* Add support for device python arguments with constructors.
* Fix flake8.
* Simplify device handling.
* Don't use torch._C._VariableFunctions.
* Handle default values for functions that have tensor args (e.g. ones_like).
This replaces the torch.Tensor constructors with factories that produce
Variables. Similarly, functions on the torch module (e.g. torch.randn)
now return Variables.
To keep the PR to a reasonable size, I've left most of the unused tensor
code. Subsequent PRs will remove the dead code, clean-up calls to
torch.autograd.Variable, and rename Variable to Tensor everywhere.
There are some breaking changes because Variables and Tensors had
slightly different semantics. There's a list of those changes here:
https://github.com/pytorch/pytorch/wiki/Breaking-Changes-from-Variable-and-Tensor-merge
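A minimal sketch of the post-merge behavior:
```python
import torch
from torch.autograd import Variable

x = torch.randn(2, 3)      # factory now returns a Variable
isinstance(x, Variable)    # True: Variable and Tensor are merged
x.requires_grad            # False unless explicitly requested
```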
* Various dtype improvements.
1) Add dtypes to the new data-based constructors: Variable.new_tensor and torch.autograd.variable.
2) In the python signatures, use Type instead of Dtype to match the C++ signatures; the error messages still print as dtype.
3) Handle / add a better error message when a dtype is used when ATen was not compiled with that type (e.g. cuda types).
4) Move cuda_lazy_init to its own file.
A later commit will add support to the legacy constructors as well.
* Move implementation of lazy_init to cpp.
* Fix parsed_arg size.
* Add numpy-style dtypes to Variable factories.
1) Add numpy-style dtypes corresponding to torch tensor types. These are:
torch.float16, torch.float32, torch.float64, torch.uint8, torch.int8, torch.int16, torch.int32, torch.int64
as well as torch.cuda, torch.sparse, and torch.cuda.sparse equivalents.
2) Adds "legacy" names for the above dtypes that correspond more closely to existing tensor names. These are:
torch.half, torch.float, torch.double, torch.short, torch.int, torch.long.
torch.byte and torch.char don't exist because they either don't match numpy semantics or differ on different architectures.
3) Adds a "dtype" parameter to Variable factories (e.g. zeros, ones) that allows the user to specify the type without changing the default tensor type.
4) Adds a "dtype" getter to Variables that return the canonical dtype from 1)
This PR is missing the following useful features that should be added in the future:
A) We only add the "dtype" parameter to auto-generated factories; hand-written factories like in tensor_new.cpp don't support this yet.
B) We don't allow type conversions to use dtypes; that should be added to type(param) or a new function.
C) We don't yet have a "device" parameter for these factories; right now, they will only create Variables on the default device.
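Usage sketch of the dtype names and factory parameter described above:
```python
import torch

x = torch.zeros(2, 3, dtype=torch.float64)   # dtype parameter on a factory
x.dtype                                      # torch.float64
torch.float is torch.float32                 # True: "legacy" aliases map to the numpy-style names
torch.long is torch.int64                    # True
```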
* backend_to_string can be private.
* Define python binding argument indexes in a simpler way.
* add all_declared_types, still need to hook it up to THPDType.
* Fix all_declared_types for missing types (it's Sparse + Half).
* Ensure cuda dtypes are created even if compiled with NO_CUDA=1.
* Fix case where dtype is provided but dispatch is via namespace.
This happens in ones_like, empty_like, randn_like.
There is some question if we should do:
1) at::ones_like(tensor).toType(dtype)
2) at::ones_like(tensor.toType(dtype))
I did the former because this matches the numpy documentation, i.e.:
"Overrides the data type of the result." and it's easier to implement.
Note that the above causes an extra copy, either of the input or output.
Here's a better implementation:
1) Make zeros_like, ones_like native functions that take an optional type (named dtype?).
2) Match the type argument with the dtype, so we don't have two different parameters.
3) Call at::zeros_like(input, type) -> at::native::zeros_like(input, type) -> type.zeros(input.sizes())
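A brief example of the user-visible behavior of option 1 above:
```python
import torch

x = torch.zeros(2, 3, dtype=torch.int64)
y = torch.ones_like(x, dtype=torch.float32)  # dtype overrides the data type of the result
y.dtype                                      # torch.float32
```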
* Don't return from maybe_initialize_cuda.
* Don't leak DType name.
* Address cpp review comments.
* Share code between sparse and non-sparse test_dtypes.
* Rewrite _like functions as native function with explicit type parameter.
* Use type 'Type' instead of 'dtype' for consistency.
* Address review comments.
* Handle arg_idx when there is requires_grad but no dtype in python_binding_arguments.
This adds at::_unsafe_view and uses it in matmul. The _unsafe_view
function is identical to view except that the output is not treated
like a view by the automatic differentiation code. This avoids in-place
modifications triggering the more expensive CopySlices/AsStridedBackward
behavior.
The _unsafe_view function is only safe to use on temporaries that will
be immediately discarded and that do not alias other tensors. Otherwise,
in-place modifications may trigger incorrect gradients. The function is
not exposed to Python.
See #5169
* Add transpose() to TensorGeometry.
This code is dead; I briefly used it in my RNN patchset but
eventually rewrote it to not be necessary. However, it seemed
like a useful gadget so I kept it. In general, it seems that it
would be useful for TensorGeometry to support all operations that
Tensor does, except that it would only compute the resulting sizes/strides
instead of actually doing the computation.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Turn on wrap_dim behavior for TensorGeometry
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Support for hard-coded differentiable outputs.
Some outputs of functions are nondifferentiable, and should always
be returned with requires_grad=False. Traditionally, we have used
the presence of 'grad' to signal that only the first output is
differentiable, and the rest are not, but cudnn_rnn (to be
implemented) breaks this pattern; its first three outputs are differentiable,
but its last output is a buffer that is just consumed by backwards.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* TensorGeometry constructor from just sizes
The sizes are assumed to form a contiguous tensor, and we compute
the strides we would get in that case.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
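A minimal Python sketch of the stride rule this constructor applies (the exact handling of empty or zero-sized dimensions in ATen may differ):
```python
def contiguous_strides(sizes):
    # Innermost dimension has stride 1; each outer stride is the
    # product of the sizes to its right (row-major layout).
    strides = [1] * len(sizes)
    for i in reversed(range(len(sizes) - 1)):
        strides[i] = strides[i + 1] * sizes[i + 1]
    return strides

contiguous_strides([2, 3, 4])   # [12, 4, 1]
```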
* Support saving TensorList for backwards.
There is some back story here. Saved TensorList in backwards will
be used by cudnn_rnn, and it is worth asking, why is it necessary to
save a list of tensors? Indeed, *technically* speaking a list of
tensors is not necessary, we only need to save the sizes of each
of the weight tensors. (We need the sizes because cuDNN is only
going to blast the derivative of weights into a flat buffer, but
we need to match the sizes of the views into the buffer when we
eventually return the derivatives.)
However, it was surprisingly awful trying to implement passing just
sizes: as non-Tensor arguments, the JIT interpreter generation code
is expected to handle them as attributes in the trace, and our
attributes struct doesn't actually know how to do arrays of arrays.
Saved TensorList code was much easier to get working,
so that's what this patch does.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* MatrixRef - an ArrayRef with a stride, making it a 2D ArrayRef.
Like ArrayRef, this class does not own the underlying data, it is expected
to be used in situations where the data resides in some other buffer.
This is intended to be trivially copyable, so it should be passed by
value.
For now, 2D only (so the copies are actually cheap, without having
to write a SmallVector class) and contiguous only (so we can
return non-strided ArrayRef on index).
The intended use-case (not in this commit) is to make it easier to
work with RNN weights, which form a num_weights x num_layers matrix of
parameters.
P.S. dimension 0 indexes rows, dimension 1 indexes columns
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
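A conceptual Python sketch of the indexing scheme (the class and names here are illustrative stand-ins, not ATen's implementation):
```python
class MatrixRefSketch:
    # Non-owning 2D view over a flat buffer: indexing dimension 0 (rows)
    # yields a contiguous slice, the analogue of returning an ArrayRef.
    def __init__(self, data, stride0):
        assert len(data) % stride0 == 0
        self.data, self.stride0 = data, stride0

    def __getitem__(self, row):
        start = row * self.stride0
        return self.data[start:start + self.stride0]

weights = MatrixRefSketch(list(range(6)), stride0=3)   # 2 rows of 3 elements
weights[1]                                             # [3, 4, 5]
```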
* Generalize getDataType in Descriptors.h
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Change copy_range to take Tensor, and change cat_tensors_backward accordingly
Should a backward function return a Variable or a Tensor? For the most
part, all of our backward functions return Tensor, except cat_tensors_backward,
which returns a variable_list (which is really the only thing that matters,
because Tensor and Variable are interconvertible). But this is kind of weird,
because it means that you can't implement a backwards in ATen that returns
a std::vector<Tensor>, and then hook it up transparently with the derivatives
code. So I switched it over.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Support 5-ary return Tensor tuple.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Support code generation with mixed Tensor/TensorList in output.
I don't think I ended up using this in cudnn_rnn, but this seems
it might be useful for someone else later.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Support 4-ary boolean array
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Add support for retain_variables in tools/autograd/derivatives.yaml
This adds 'retain_variables', a bool which is true if a user has specified
that saved variables should be retained in case the backwards is
run again later. This allows an optimization where we can
destroy saved buffers if we know variables are not going to be retained,
e.g., it is (will be) used by _cudnn_rnn
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Lazily initialize cuDNN descriptors
Previously, cuDNN descriptors were eagerly allocated as soon
as a FooDescriptor object was created. However, in some uses
of TensorDescriptor, this is problematic: some tensors are optional
and cuDNN's API expects to be given a nullptr TensorDescriptor
in this case, not an uninitialized (but allocated) descriptor.
Lazily initializing the descriptors makes it less likely for
us to use uninitialized memory and matches the usual semantics of
unique_ptr. It's good sense!
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Port cuDNN RNNs to ATen.
This brings three new functions:
- _cudnn_rnn_flatten_weight: flatten a matrix of weight tensors into
a single contiguous weight buffer as required by cuDNN
- _cudnn_rnn: run RNN forwards
- _cudnn_rnn_backward: run RNN backwards
RNNs have a lot of parameters, so we restructured what was previously
a single 'fn' object that recorded all the parameters into three
objects: RNNDescriptorParams, TensorDescriptorListParams and
DropoutDescriptorParams.
We make use of MatrixRef to organize the weight tensors (which are
weight/bias x number of layers), but I did not teach the codegen
how to pass these as arguments/return values natively, so instead
a MatrixRef is passed as its constituent ArrayRef and int64_t stride0.
cudnn_rnn has three differentiable outputs and one nondifferentiable
one, so it makes use of the support for hard-coded differentiable outputs.
I haven't deleted all of the descriptor code from Python, because dropout
initialization still goes through this codepath; that should be fixed soon,
but I don't see it as essential for this PR.
This commit also removes the last use of NestedIOFunction from PyTorch.
There are some shenanigans with cuDNN dropout descriptor initialization,
see below:
Note [cuDNN dropout descriptor initialization]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In most cases, setting descriptors in cuDNN is cheap (e.g.,
cudnnSetTensorNdDescriptor). However, this is not the case for
cudnnSetDropoutDescriptor: in cuDNN 6/7 (and possibly others) it does an
expensive precomputation to initialize the random number generator states. In
cuDNN 6, this is the ONLY official mechanism to initialize a dropout descriptor,
which means that law-abiding clients were expected to generate a dropout
descriptor once and cache it. However, our ATen interface is (1) stateless (so
we can't cache the descriptors) and (2) does not accept arbitrary user types in
its interface (so we can't pass the descriptor in). This puts us in a pickle.
In cuDNN 7, a new function, cudnnRestoreDropoutDescriptor was added, which
forgoes the expensive initialization process, and can initialize the
descriptor with a pre-initialized state CUDA tensor. This is great, because
it means we can simply pass in the state tensor and then initialize the
descriptor internally. Unfortunately, this function is not available in
cuDNN 6.
To work around this, we break the cuDNN abstraction barrier and hard-code
the struct layout of the underlying dropout descriptor. With this struct,
we can reimplement cudnnRestoreDropoutDescriptor from scratch. Great!
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Fix cuDNN 7 behavior.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Delete some unused, controversial methods from MatrixRef.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Add missing filter_dim_a slice
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Replace nested for-loop with itertools.chain.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* CR comment on mut_desc()
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Refactor DropoutDescriptor API.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Use cached CurrentDeviceProperties from Context.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Document _cudnn_rnn outputs.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Improve fmap docs, convert some functions to use it.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Move IndexRange to autograd/function.h
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Elaborate on CUDNN_STATUS_INVALID_VALUE return some more.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Add an all-in-one setter for RNNDescriptorParams.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Print what the unrecognized RNN mode was
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* RNN TensorDescriptor improvements
- Have an explicit size/stride overload for setting a TensorDescriptor,
so you don't have to create a goofy view to feed in.
- Change the padding to 3D rather than 5D, which is all you actually
need (it's just 2D that is not supported by the cuDNN API).
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Fix implementation of cudnnRestoreDropoutDescriptor, plus test.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Better comments about input layout.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Add comment about no-DropoutDescriptor argument RNNDescriptor function.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Rename vocab_size back to input_size.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Don't use backslash in comment.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Bugfix for contiguous TensorGeometry calculation.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Don't allocate a dummy tensor when setting TensorDescriptor for flatten_weight.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Make contiguity errors more user-friendly.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* s/fn.dropout.train/fn_train/
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* s/_cudnn_rnn_backward_grad/_cudnn_rnn_backward_input/
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Make dcx properly undefined when not required.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Remove old TODO.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Add state size check in cudnnRestoreDropoutDescriptor
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Explicitly narrow int64_t to size_t
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Restore copyParams comment.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Update benchmark numbers, and slight engineering improvements.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Typofix.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Add kwarg-only 'requires_grad' parameter to Variable factories.
Functions that create variables, e.g. torch.ones_like, currently always return Variables with requires_grad=False;
this is less convenient than the existing Variable constructor that has a requires_grad parameter. This commit
adds the parameter at the python binding level.
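Usage sketch of the new keyword argument:
```python
import torch

x = torch.ones(2, 3, requires_grad=True)     # kwarg-only on the factory
y = torch.zeros_like(x, requires_grad=True)
x.requires_grad, y.requires_grad             # (True, True)
```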
* Fix flake8.
* Address review comments.
* Match set_requires_grad implementation with tensor_new version.
This adds overrides in VariableType for the xxx_out ATen functions and
implements Python bindings. There is no support for automatic
differentiation. If any of the inputs (or outputs) requires grad, then the
function will throw an exception unless it's running in "no-grad" mode.
The bindings for calling torch.xxx functions on Variables are moved to a
different object. Previously, they were static methods on VariableBase.
This change prevents users from accidentally calling static methods as if
they were instance methods.
For example, this splits threshold into threshold(), which is now
never in-place, and threshold_() which is always in-place.
This simplifies the in-place vs. non-in-place logic in
gen_variable_type.py, which was bug-prone.
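A brief sketch of the out= bindings described above (the no-grad escape hatch follows the text; exact error behavior is not shown):
```python
import torch

a, b = torch.randn(3), torch.randn(3)
out = torch.empty(3)
torch.add(a, b, out=out)          # fine: nothing requires grad

p = torch.randn(3, requires_grad=True)
with torch.no_grad():             # with a grad-requiring input, out= is only
    torch.add(p, b, out=out)      # allowed in "no-grad" mode
```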
The general strategy is that there is a new module, torch.onnx.symbolic, which
contains, for every ATen method name, a function with the ONNX translation.
While implementing this, I took the opportunity to expunge all references
of 'g' from the public API; instead, it is managed by a global variable in
torch.onnx which tracks the "current graph".
Other changes:
- If you pass a Tensor to op as an argument, it will now automatically be
converted into a Constant ONNX node. This lets us remove needing to
implement ONNX
- Rename value to other, wherever there is both a Scalar and Tensor overload.
This way, keyword dispatch can work uniformly in both cases.
- Deleted any autograd Function classes that both had a symbolic and were ported
to the new C++ autograd implementation. There may still be some straggling
classes that didn't have symbolic.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>