pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
Edward Z. Yang	711e5a6ceb	Port THS to ATen. (#8409 ) * Port THS to ATen. The basic structure of the patch: - All kernels in aten/src/THS got rewritten as native functions in aten/src/ATen/native/sparse. I took the liberty of renaming some of the kernels, opting for longer, more transparent names than things like 'spaddcmul'. - Instead of holding fields for sparse tensor in the TH C struct THSTensor, they are now held in a C++ class SparseTensorImpl (this explains why I had to do this all in one go; I can't have two reps for sparse tensors!) Along the way, we change a key internal representation invariant: an "empty" sparse tensor has dimI == 1 and dimV == 0 (this is different from dimI == 0 and dimV == 0 we had before); this ensures that we maintain the invariant that dim == dimI + dimV. "Scalar" sparse tensors are made illegal, because there really is no way to properly express them in COO format. - Because we haven't ported THCS or any of the traditional dense TH implementations, there is a new set of adapter functions in native/LegacyBridge.cpp exclusively devoted to deciding whether or not to go to the new native implementation or back to the legacy TH binding (prefixed with th_). The intent is that when everything gets ported, we can delete this file. - I've kept the stubs for all the THS functions, but they now all error if you try to actually call them. Eventually, we should replace these with calls to ATen so that everything keeps working. - I gobbled up SparseMM (SparseMM.cpp is no more). It was tasty. There are some miscellaneous improvements which were needed for other changes in this patch: - There is now AT_FORALL_SCALAR_TYPES_EXCEPT_HALF, which does what it says on the tin. - axpy templated function moved to TH/BlasUtils.h, there's a new macro which lets you easily forward to all of the TH functions. We also expose THBlas_copy. I'm not terribly pleased with these functions but they seem to serve a needed purpose. 
- New method on Tensor to get TensorImpl, unsafeGetTensorImpl - accessor() is now this-const, since const-correctness on Tensor is a lie - New toSparse()/toDense() methods on Type; now you can call these directly without having to manually apply at::toSparse/toDense on the Backend and then running toBackend yourself. Changes to the kernels: - Previously, the whole body of all kernels was compiled for every supported scalar type. In our new implementation, the scalar dispatch has been pushed into the smallest extent which (1) is not in a type loop and (2) requires statically knowing the scalar type. These sites all use AT_DISPATCH_ALL_TYPES. I tried to use lambdas as much as possible, but sometimes it was not possible when an OpenMP pragma was used. - Anywhere we tested if the nDimension of a tensor was zero, we replaced with a test that numel is zero. Because, as we know, nDimension of zero-size tensors in TH is zero, and that's wrong wrong wrong (and not done this way in ATen). Some subtleties: - Places where previously fastget1d was used, I now use a TensorAccessor. However, you have to be careful about grabbing the accessor, because sometimes you will be accessor'ing indices/values and they are empty, which means they will be 1D* ("oh, aren't indices always 2D?" Nope. Nyet.) So, essentially, it is only safe to grab an accessor after you have checked that nnz != 0. All of these shenanigans will go away when we properly support zero-size dimensions. A few places, we test for this case just by wrapping the loop in a conditional on nnz. Some other places this is not so easy, so we instead short-circuit the function with a special case for when nnz == 0 (usually, these implementations are degenerate). - There is a very subtle but important difference between _sparse_get_impl(self)->indices() and self._indices(); the latter may return a view! 
This is because nnz is not guaranteed to match the dimensions of indices/values; you can "truncate" a sparse tensor by setting the nnz. Actually, I think this is not a good idea and we should enforce a stronger invariant, but for this patch I slavishly adhere to the old ways, and as such I have to be very careful: if I want to resize something, I had better use the former and not the latter. - I had to reimplement broadcasting by hand (thus the s_ and non-s_ functions in the sparse native files). There is a very important distinction between foo_out and foo_, so it is important that the LegacyBridge function always call to the lower layer, and not try to avoid boilerplate by calling to another LegacyBridge function first. I did NOT put broadcasting in LegacyBridge (even though, ultimately, that's where it must live), because the th_ functions which are invoked from LegacyBridge handle broadcasting themselves, and I don't want to broadcast twice. - Sparse functions MUST explicitly specify the Type they dispatch from, otherwise Variable wrapping/unwrapping will not work correctly. If you use _get_sparse_impl, that is sufficient to levy this requirement. - The "has native" tests in LegacyBridge.cpp are not 100%, because some of the functions are mixed dense-sparse functions, and so you can't just say, "Oh, if it's sparse and CPU, call the native sparse implementation." This is handled on a case by case basis. There is some especially complex logic for add(), which has dense-dense, sparse-sparse and dense-sparse implementations. - I added some uses of SparseTensorRef in native_functions.yaml, but you will notice that these are all on native_* functions, and not the actual, top-level functions. So the SparseTensorRef is purely documentary (helping you not call the wrong overload) but there is no magic; we do the wrapping ourselves the hard way. (This is in contrast to the TH binding code which is magical.) Except for _sparse_mask; _sparse_mask is magical. 
- There is a raw_copy_sparse_ method, which is really my way of getting around the fact that copy_ has never been implemented for sparse tensors (even before this patch), but there IS a super secret, internal way of doing these copies that the THS code used, and which I needed to get my hands on when I did this port. We should refactor so that either (a) copy_ does support sparse-sparse copy natively, or (b) we do this other ways. - Irritatingly, I must explicitly resize_as_ before copy_ into a tensor. This was not the case with THTensor_(copy) but I don't have any direct binding that doesn't have this requirement. - For some reason, the sparse tensor constructor accepts a scalar tensor for the values tensor. This is kind of weird because you always need an nnz-dimension. However, the old code supported this and just expanded it into a 1D size 0 tensor; so we need some explicit code to do this. There are maybe a bit more AT_ASSERTs in some of the kernels than is wise. I added them all when I was debugging and was loath to remove them. Some last mile fixes after this commit went into PR - Move expand outside of dispatch so autograd works (it used to be inside and then we lost all of the recorded broadcasts). - Hack to duplicate the derivatives for our now two definitions TH and native. Mercifully the derivatives are short. - Apparently, TH has a special case to make foo_ functions method only, and if you don't do this the Python arg parsing is wrong. We carefully work around this in the native bindings - Apply DCE to a test_jit case, fixes wobbling due to DCE trick in tracing - Update test_function's output - Some last mile fixes for dispatch confusion in sparse_coo_tensor functions. - New simplified regression test based on failures I saw in ONNX - Increase tolerance on super resolution test - More robust dynamic_type normalization, fixes ONNX bug. The dynamic_type situation is very delicate; probably need to stop having both Scalar and real. 
- Make new_with_tensor_sparse more CUDA safe - Note about CUDA-safety in SparseTensorImpl - Rename dimI/dimV to sparseDims/denseDims. - Make localScalar on SparseTensorImpl work. - Make numel uniformly supported on all types, not just dense types - Add tests for is_nonzero() method (which exercises localScalar) - Disable constant JIT autogenerated tests, which are fragile and broken by this change, but being fixed in a parallel track. Signed-off-by: Edward Z. Yang <ezyang@fb.com>	2018-06-15 17:52:21 -04:00
li-roy	6869a5f0fb	Throw error on 0-length tensor slicing (#7775 ) * throw error on 0-length tensor slicing * return empty tensor instead of throwing error * make 0 slice work for tuples also * add tests * move check to aten * Address comments	2018-06-14 17:40:51 -04:00
Wei Yang	ae55865a3b	Migrated hardshrink() to ATen and deprecated nn.Hardshrink() (#8117 ) * 1. added hardshrink() to ATen (CPU + GPU); 2. removed nn.Hardshrink(); 3. reusing previous tests for nn.Hardshrink() and included CUDA tests at test_nn; 4. default parameter lambda=0.5 is not working yet * optimized memory read/write * 1. pass in lambd as scalar for CPU/CUDA_apply; 2. removed tests for hardshrink at test_legacy_nn fixes test_utils * 1. replace zeros_like with empty_like; 2. use scalar_cast in cuda * 1. printing lambd value; 2. default lambd=0.5 is still failing * getting around Scalar bug by removing default value of lambd from native_functions.yaml, and declare it at nn/functional.py * cleaned up debug printf	2018-06-14 16:42:20 -04:00
Chintak Sheth	21609e0fd0	``bincount`` feature implementation (#6688 ) * Implement CPU bincount feature support * Incorporate feedback on renaming to SummaryOps file and other nits * bincount gpu implementation * refactor cuda code and incorporate nits * doc fix * cuda bincount - cast weights to double if integral type * fix: signed unsigned comparison error * fix: ssize_t error * refactor * make template typenames readable and other nits * make compatible with v0.5 * incorporate comments * update test cases to ensure CUDA code coverage	2018-06-14 11:38:04 -04:00
li-roy	6a85b133d3	Improve number formatting in tensor print (#7632 ) * Improve number formatting in tensor print * fix bad rebase * address comments * fix test * fix test * use assertExpected for tests * address comments * address comments	2018-06-13 23:57:16 -07:00
Wei Yang	71a3633e3f	change tensor.set_() argument names to match descriptions in doc (#8403 ) Replaced args name `storage` and `sourceStorage` to `source` in tensor.set_() to match the descriptions in docs.	2018-06-13 13:22:50 -07:00
Will Feng	ffffee6aa9	Skip test_multinomial_invalid_probs on Windows (#8360 )	2018-06-12 17:00:49 -04:00
Wei Yang	c3e4b3c88b	raise more informative error msg for torch.load not support seek (#7754 ) Raising more informative error msg for torch.load() when input file does not support seek() or tell()	2018-06-12 12:57:28 -07:00
Tongzhou Wang	742912512c	Move signal window functions to ATen; add Blackman window (#8130 ) * Move signal window functions to ATen; add Blackman window * fix cuda test not checking scipy	2018-06-08 11:37:46 -04:00
Will Feng	89ea6acde2	[NEEDS REVIEW] Add nan and inf probability check to multinomial (#7647 ) * Add nan and inf probs check to multinomial * fix bug * Spawn CUDA test in subprocess * Make sure invalid input won't pass the test case * Try to fix error * Test failure cases in Python 3 only * Try to fix Windows error * Move CUDA test to test_cuda.py * fix issues * fix module name error * no need to check for CUDA existence in test_cuda * Use PY3	2018-06-06 22:49:12 -04:00
Tongzhou Wang	c0a419e6ba	Add non_blocking to Tensor/Module.to (#7312 ) * Add non_blocking to Tensor/Module.to * flake8 * Add argparse tests * cpp parse * Use C++ parser * use a common parse function with Tensor.to * fix test_jit * use THPObjectPtr * increase refcount for None, True, and False * address comments * address comments	2018-06-04 18:46:52 -04:00
li-roy	bafec1637e	support loading gzip (#6490 ) * support loading gzip * address comments * address comments * fix lint * fix test for python2	2018-05-31 15:06:38 -04:00
Seth Hendrickson	e9c33e91d9	Remove python bindings for `torch.slice` (#7924 ) * skip python bindings for slice * remove tests * convert slice test to indexing	2018-05-31 13:42:49 -04:00
Richard Zou	b5594ac750	Raise error when torch.load a storage on a non-existing device (#7921 ) * Raise error when torch.load a storage on a non-existing device Before, doing torch.load(...) on a CUDA tensor on a CPU-only machine would raise an unreadable error: ``` ~/pytorch/pytorch/torch/cuda/__init__.py in __enter__(self) 223 if self.idx is -1: 224 return --> 225 self.prev_idx = torch._C._cuda_getDevice() 226 if self.prev_idx != self.idx: 227 torch._C._cuda_setDevice(self.idx) AttributeError: module 'torch._C' has no attribute '_cuda_getDevice' ``` This PR makes it so that torch.load raises a hard error if one tries to load a storage onto a non-existing device and suggests the user to use torch.load's map_location feature. * Address comments * missing dep	2018-05-31 09:44:50 -04:00
Richard Zou	769f5f7cfe	Handling of scalars in torch.Size (#5676 ) * Handling of scalars in torch.Size torch.Size() constructor uses python_arg_parser IntList in python_arg_parser can take iter/range Have IntList take python iterables and ranges. Address comments: don't use python_arg_parser and instead call __index__ in THPSize_pynew Address comments Address comments * Rebased * Address nit	2018-05-30 17:50:32 -04:00
Thomas Viehmann	42a68749bf	einsum: don't inplace modify arguments (fixes: #7763 ) (#7765 ) Thank you, Pierce Freeman, for the report and minimal example!	2018-05-29 11:26:39 +01:00
Ailing	b4ae80d459	serialization for torch.device (#7713 )	2018-05-21 11:34:26 +02:00
Ailing	75cf0faf4c	Implement __reduce__ for torch.dtype (#7699 )	2018-05-20 14:59:02 +02:00
gchanan	4f20a0e439	Fix various sparse transpose issues; remove dead code from Declaratio… (#7200 ) * Fix various sparse transpose issues; remove dead code from Declarations.yaml. 1) Fixes some checks in t_, transpose_ that don't allow transposing empty sparse tensors. 2) Remove out= variants from docs since they don't exist (and haven't since at least v0.3.1). 3) Unify implementations of t_, transpose_, t, transpose. 4) Move dead checking code from Declarations.cwrap to actual implementations. 5) Fix test which never tested transpose_. * Add test for error with t, t_. * Address review comments. * Fix jit tests. * Fix test_jit.	2018-05-18 19:51:41 +02:00
gchanan	7abdc303c6	Don't allow requires_grad to be set on integer Tensor constructors in… (#7185 ) * Don't allow requires_grad to be set on integer Tensor constructors in tensor_new. * Fix autograd test. * Fix test_distributions. * Fix test_jit. * Fix NN tests.	2018-05-18 19:45:10 +02:00
Seth Hendrickson	32b23a4bfc	Throw error on tensor creation when sequence shape cannot be determined (#7583 ) * first commit * unit test * minor style edits	2018-05-18 19:14:42 +02:00
Thomas Viehmann	bf95dff85b	Map digamma +/-inf results to nan in test (fixes #7651 ) (#7665 )	2018-05-18 16:35:00 +02:00
Thomas Viehmann	e1148db7f2	Implement logsumexp (fixes #2591 ) (#7254 ) * Implement logsumexp (fixes #2591) * Add logsumexp_backward, fix _out declaration. Thank you Simon and Edward for your comments!	2018-05-14 22:08:14 -04:00
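The logsumexp commit above does not describe its implementation, so the following is only a minimal pure-Python sketch of the usual max-subtraction formulation of log-sum-exp, not the ATen kernel. The function name and the list-based input are assumptions made for illustration.

```python
import math


def logsumexp(values):
    """Return log(sum(exp(v) for v in values)) without overflowing.

    Illustrative only: operates on a plain list of floats, not on tensors.
    """
    if not values:
        raise ValueError("this sketch does not define logsumexp of an empty list")
    m = max(values)
    if math.isinf(m):
        # All entries are -inf (result -inf) or some entry is +inf (result +inf).
        # Returning early avoids computing inf - inf, which would give nan.
        return m
    # Subtracting the maximum keeps every exponent <= 0, so exp() cannot overflow.
    return m + math.log(sum(math.exp(v - m) for v in values))
```

Computing `math.log(sum(math.exp(v) ...))` directly would raise `OverflowError` for inputs around 1000, which is the case the subtraction guards against.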
Thomas Viehmann	cfc1d92975	Implement ellipses ('...') and diagonals (e.g. 'ii->i') in einsum. (#7173 ) This brings the two most important missing numpy einsum features to torch.einsum.	2018-05-12 23:39:37 -04:00
Richard Zou	eaa3f2e613	Fix advanced indexing with negative indices (#7345 ) * Fix advanced indexing with negative indices Fixes #7156 Here is some behavior before this PR: ``` In[1]: x = torch.arange(9).view(3, 3).contiguous() x[[0], [-1]] # Should be equivalent to x[0, -1] Out[1]: tensor([ 8]) ``` The bug is that negative indices are added to the computed linear index directly. In the above example, the linear index computed is "-1", which wraps around to "8", giving the last element of a flattened view of `x`. Instead, we should wrap negative indices around before adding them to the linear index. * Use toCLong()	2018-05-12 23:24:40 -04:00
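The bug described in the commit above can be reproduced in miniature with plain Python lists. This is a sketch under assumptions: `buggy_linear_index` and `linear_index` are hypothetical helper names, and the row-major offset computation stands in for the real indexing kernel, which is not shown in the log.

```python
def buggy_linear_index(indices, sizes):
    """Row-major offset that adds negative indices in directly (the old behavior)."""
    offset = 0
    for idx, size in zip(indices, sizes):
        offset = offset * size + idx
    return offset


def linear_index(indices, sizes):
    """Row-major offset that wraps each negative index within its own dimension first."""
    offset = 0
    for idx, size in zip(indices, sizes):
        if not -size <= idx < size:
            raise IndexError(f"index {idx} is out of range for dimension of size {size}")
        if idx < 0:
            idx += size  # wrap per dimension before accumulating
        offset = offset * size + idx
    return offset


flat = list(range(9))  # flattened view of a 3x3 tensor holding 0..8
wrong = flat[buggy_linear_index((0, -1), (3, 3))]  # offset -1 wraps to the last element
right = flat[linear_index((0, -1), (3, 3))]        # offset 2, i.e. x[0, -1]
```

Here `wrong` is 8, matching the `tensor([ 8])` output quoted in the commit message, while `right` is 2, the value of `x[0, -1]`.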
Jon Walsh	857e3f4a5e	Throw error in tensor constructor when numpy strides mismatch (#7440 )	2018-05-11 11:00:43 +02:00
Ethan Steinberg	9fa1dff66a	Allow the use of torch.device for loading (#7339 ) * Allow using torch.device for loading * Make recommended changes * Better tests	2018-05-10 15:50:00 -04:00
Richard Zou	71626491c4	Add batched linear solver to torch.gesv() (#6100 ) * Add batched linear solver to torch.gesv() Fixes #3164 Picks up from #4502 I moved `gesv` to ATen. Adds bindings for MAGMA's `gesv_batched` function for CUDA. For CPU, runs `THLapack(gesv)` in a for loop. The new function supports arbitrary batch dimensions (and broadcasting of those dimensions). For example, the 4-d tensor `A x B x M x M` should be treated as having batch-size `(A x B)`. The overhead of creating the magma_queue_t is: ~350000 microseconds the first time it's called and ~6 microseconds every time after that. * Tests and docs * Address comments * Address comments * Rebase * Address comments * Fix rebase * Addressed comments * Address comments * Address comments * Addressed comments	2018-05-08 17:06:27 -04:00
Adam Paszke	8091388d0f	Add support for __floordiv__ and __rdiv__ for integral tensors (#7245 )	2018-05-03 23:34:59 +02:00
gchanan	681baa9254	Restore warning to torch.range. (#7194 ) Also, get rid of warning specification in Declarations.cwrap, which currently has no effect.	2018-05-02 21:53:00 -04:00
Thomas Viehmann	07513cfd1d	implement sum over multiple dimensions (fixes #2006 ) (#6152 )	2018-05-02 21:50:29 -04:00
cpuhrsch	88a705555a	Add SLEEF for float and double (#6725 )	2018-05-02 18:40:44 +00:00
gchanan	8031da5479	Implement torch.as_tensor, similar to numpy.asarray. (#7109 ) * Implement torch.as_tensor, similar to numpy.asarray. torch.as_tensor behaves like torch.tensor except it avoids copies if possible; so also somewhat like tensor.new but without the size overloads. I didn't add a requires_grad field, because we haven't decided on the semantics such as as_param. * Remove requires_grad for doc.	2018-05-01 12:54:43 -04:00
Thomas Viehmann	8fbab83c2a	only Tensors of floating point dtype can require gradients (see #7021 ) (#7034 )	2018-04-30 10:20:00 +02:00
gchanan	361648a4a7	Fix torch.tensor(...) device-type calculation when used with numpy an… (#6995 ) * Fix torch.tensor(...) device-type calculation when used with numpy and type inference. * Fix tensor device type inference as well. * Better variable type inference: infer cuda-ness only if device is not specified.	2018-04-27 18:12:33 -04:00
cpuhrsch	ae35e0e924	Support non-contiguous tensors for unary ops (#6119 )	2018-04-27 21:31:34 +02:00
gchanan	a6bfa16c17	torch.arange: add numpy-style type inference. (#7016 ) * torch.arange: add numpy-style type inference. This is a backwards-compatibility breaking change. * Fix flake8. * Use at::optional. * Remove unneeded header files. * Use reference wrapper. * Update arange for test. * Address review comments.	2018-04-27 15:11:45 -04:00
gchanan	18ed2160b0	Use Index rather than Long for IntList parsing (#6674 ) * Use Index rather than Long for IntList, so floating-point types convertible to ints fail the parsing. Basically, our unpackLong code works with floating-point types that are convertible to ints, but this isn't often what you want (because of truncation). What you actually want is to convert to an index, which will usually find such issues. I made this the minimal change I could because: 1) I didn't want to change unpackLong because the existing code calls checkLong before unpackLong, so this should be a non-issue most of the time. And fixing this properly requires calling checkLong again, which will slow everything down. 2) An exception above is with IntList, which only checks that 1) it is a tuple or 2) it is a varargs tuple (i.e. torch.ones(1, 2, 3)). * Fix bug. * Don't conflict tensor and IntList bindings. * Change function to be consistent between python 2 and 3. * Check Index. * Move IntList overloads in legacy new functions to below Tensor overloads.	2018-04-26 19:13:23 -04:00
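The difference between converting to a long and converting to an index, as described in the commit above, has a direct Python-level analogue in `int()` versus `operator.index()`. The sketch below only illustrates that distinction; `parse_int_list` is a hypothetical name, and the actual change lives in the C++ argument parser, which is not reproduced here.

```python
import operator


def parse_int_list(args):
    """Convert each element via __index__, so floats are rejected instead of truncated."""
    return [operator.index(a) for a in args]


sizes = parse_int_list((1, 2, 3))  # plain ints pass through unchanged

truncated = int(2.7)  # int() silently drops the fractional part

try:
    parse_int_list((2.7, 3))
    rejected = False
except TypeError:
    # float does not implement __index__, so the bad size is caught here
    rejected = True
```

With index conversion, a call shaped like `torch.ones(2.7, 3)` would be expected to fail in parsing rather than quietly produce a tensor of a truncated size.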
gchanan	a08091a42d	Implement matmul_out and dot_out. (#6961 ) * Implement matmul_out and dot_out. * Fix autograd by only calling _out variants if we have an out ourselves. * Disallow mismatched types in dot_out. * Make sure out variant doesn't have a method. * Do proper type conversion.	2018-04-26 16:52:58 -04:00
Thomas Viehmann	2b44c420c8	Enhance diagonal (fixes #6479 ) (#6718 ) * Enhance diagonal This patch - adds Tensor.diagonal to complement torch.diagonal - implements diagonal natively in ATen - makes diagonal a view - implements taking arbitrary diagonals - implements diagonal backward instead of referring to the (more limited) diag * add tests, copy diagonal code to backward for double differentiability * improve tests and doc comment. Thank you, Adam! * Mark diagonal as view function in gen_autograd.py, use simple backward.	2018-04-26 11:11:20 -04:00
Thomas Viehmann	f98b778086	Fix forward and backward for norm/renorm with infty norm (fixes #6817 ) (#6969 )	2018-04-26 12:54:53 +02:00
gchanan	3d907ef78e	Consistently check 'out' variants against specified dtype/layout/device parameters. (#6973 ) We were previously doing this in the most common cases, but not consistently.	2018-04-25 22:46:42 -04:00
Soumith Chintala	333e8c9b22	any/all returns LongTensor, make test expect that (#6957 )	2018-04-25 14:05:29 -04:00
Tao He	39d4814933	Make any and all on ByteTensor behave like sum/prod. (#4627 )	2018-04-25 10:25:38 +02:00
cpuhrsch	a8bdb561b7	Fix reductions on some contiguous tensors where size(dim) == 1 (#6815 )	2018-04-22 13:55:55 -04:00
Richard Zou	d1a992a85e	Disallow chunks that are <= 0 in torch.chunk (#6761 ) Fixes #6759. Before, `tensor.chunk(0)` would cause a divide by 0. `tensor.chunk(-1)` would throw an error complaining that "split_size needs to be positive". This PR changes it so that the error message makes it clear that `chunks` has to be greater than 0.	2018-04-19 18:31:14 -04:00
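A list-based sketch of the validation described in the commit above. It assumes the split size is the ceiling of length over `chunks`; the function name, the error wording, and the use of `ValueError` are illustrative choices, not the messages or exception types PyTorch raises.

```python
def chunk(seq, chunks):
    """Split a list into at most `chunks` pieces of equal size (the last may be shorter)."""
    if chunks <= 0:
        # Checking up front avoids the division by zero for chunks == 0 and gives
        # a message that names the offending argument.
        raise ValueError(f"chunk expects `chunks` to be greater than 0, got: {chunks}")
    # Ceiling division; max() keeps the step positive for an empty input.
    split_size = max(1, (len(seq) + chunks - 1) // chunks)
    return [seq[i:i + split_size] for i in range(0, len(seq), split_size)]


halves = chunk(list(range(5)), 2)  # [[0, 1, 2], [3, 4]]
thirds = chunk(list(range(5)), 3)  # [[0, 1], [2, 3], [4]]
```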
MRuberry	9c47eb5548	Fixes test_torch.py so that all tests pass on Volta hardware. (#6736 ) Issue: "python3 test_cuda.py" currently results in a failure when using Volta hardware. The failure is in test_advancedindex, and is caused by two "sub-tests." At line 4651 a series of indices are used to compare PyTorch's and Numpy's indexing behavior. At least two of these indices index the same element of the reference tensor multiple times. These are: [slice(None), [[2]], [[0, 3], [4, 4]]] [slice(None), [[0, 1], [1, 0]], [[2, 3], [3, 0]]] The first index selects the 5th element of the third row twice, and the second index selects the 4th element of the second row twice. This causes the test to attempt to update the same index with two distinct values simultaneously. On my machine the Numpy created tensor will always take the "latter" of these two values, while the Volta tensor will always take the "former." (Not to say this behavior is guaranteed by either framework.) The fix is to remove these two indices from test_torch.py. This causes all tests to pass. While updating test_torch.py I also noticed that assert_get_eq(tensor, indexer) had a bug where it was referring to "reference" instead of "tensor." This bug had no impact on behavior. The fix is to have this function refer to its input tensor, "tensor," instead. All tests still pass after this fix.	2018-04-18 22:44:14 -04:00
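The duplicate targets mentioned in the commit above can be checked by hand. The sketch below pairs the row and column index lists under NumPy-style broadcasting, restricted to the 2-D case where each axis either matches or has length 1, and reports any (row, column) position that is written more than once. Both helper names are hypothetical.

```python
from collections import Counter


def broadcast_pairs(rows, cols):
    """Pair two 2-D nested index lists, stretching length-1 axes to match."""
    n_outer = max(len(rows), len(cols))
    n_inner = max(len(rows[0]), len(cols[0]))

    def at(grid, i, j):
        # Modulo maps every position onto index 0 along a length-1 axis.
        return grid[i % len(grid)][j % len(grid[0])]

    return [(at(rows, i, j), at(cols, i, j))
            for i in range(n_outer) for j in range(n_inner)]


def duplicated_targets(rows, cols):
    """Return the (row, col) positions that are selected more than once."""
    counts = Counter(broadcast_pairs(rows, cols))
    return sorted(pos for pos, n in counts.items() if n > 1)


first = duplicated_targets([[2]], [[0, 3], [4, 4]])             # [(2, 4)]
second = duplicated_targets([[0, 1], [1, 0]], [[2, 3], [3, 0]])  # [(1, 3)]
```

Position `(2, 4)` is the 5th element of the third row and `(1, 3)` is the 4th element of the second row, which agrees with the description in the commit message. An indexed assignment that hits such a position twice has no guaranteed winner, which is why the test removed those indices.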
Adam Paszke	d26ab68485	Sort declarations when generating Python bindings (#6701 ) * Sort declarations when generating Python bindings This helps resolve ambiguities in argument parsing according to any rules we will need. For now, this allows us to make scalar operations more conservative wrt. argument types, but makes them commutative again. * Fix inconsistencies between mod with tensor and scalar * Fix a stupid mistake	2018-04-18 21:51:35 -04:00
Thomas Viehmann	bd0cc7d364	Implement torch.einsum (fixes #1889 ) (#6307 ) * start at generic trilinear * Implement einsum (fixes #1889) This provides a simple implementation of einsum. It is built on top of the work for computing bilinear (#6110). It uses a naive left-to-right resolution at the moment. Autograd is able to differentiate by itself. The obvious unsupported feature is taking diagonals (einsum('ii->i',(a,))). * add tests and docs * fix flake8 * clean diff * rebase on current master to resolve conflicting String wrapping * clean up after rebase * better commentary in einsum and sumproduct_pair * don't say fixme if it's fixed and rename num_outputs to num_output_dims * adapt python wrapper to use std::string instead of String to avoid typedef at::String * typos and some vector to array conversion * fix accidental python<->python3 change * really fix bad rebase	2018-04-18 13:41:27 +02:00
Francisco Massa	feb8522f99	randperm supports n=0 (#6656 ) This makes it compatible with arange and numpy.random.permutation	2018-04-17 19:03:57 +02:00


332 Commits