This is the first of three PRs that #5537 will be split into.
This PR adds the MKL headers to the included files and provides helper functions for MKL FFT and cuFFT.
In particular, on POSIX the headers come from the mkl-include conda package, and on Windows they come from a new file that @yf225 and I made and uploaded to S3.
* add mkl-include to required packages
* include MKL headers; add AT_MKL_ENABLED flag; add a method to query MKL availability
* Add MKL and CUFFT helpers
* Support native namespace functions with type dispatch.
Use 'ones' as an example. Note this is a "halfway" solution; i.e. the call chain is:
at::ones(shape, dtype) -> dtype.ones(shape, dtype) -> CPUFloatType.ones(shape, dtype) -> at::native::ones(shape, dtype)
The "nicer" solution would probably be something like:
at::ones(shape, dtype) -> dtype.ones(shape) -> CPUFloatType.ones(shape) -> at::native::ones(shape, this)
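For illustration, here is a minimal standalone sketch of the "halfway" chain above. The Tensor/Type/CPUFloatType classes below are simplified stand-ins, not the actual ATen types or codegen output; the point is only the redundant pass-through of dtype.

```cpp
// Hypothetical sketch of the "halfway" dispatch; all types are mock stand-ins.
#include <cstdio>
#include <vector>

struct Tensor {};
using IntList = const std::vector<long long>&;  // stand-in for at::IntList

struct Type {
  virtual Tensor ones(IntList shape, const Type& dtype) const = 0;
  virtual ~Type() = default;
};

namespace native {
// Backend-agnostic implementation that every concrete Type dispatches to.
Tensor ones(IntList shape, const Type& /*dtype*/) {
  std::printf("native::ones over %zu dims\n", shape.size());
  return Tensor{};
}
}  // namespace native

struct CPUFloatType : Type {
  Tensor ones(IntList shape, const Type& dtype) const override {
    return native::ones(shape, dtype);
  }
};

// The at::-style free function: dtype is passed through redundantly, which is
// what makes this only a "halfway" solution.
Tensor ones(IntList shape, const Type& dtype) {
  return dtype.ones(shape, dtype);
}

int main() {
  CPUFloatType cpu_float;
  ones({2, 3}, cpu_float);  // at::ones -> Type::ones -> native::ones
}
```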
* Fix type inference.
* Fix test install.
* Fix extensions.
* Put dtype argument at the beginning.
* Fix extension.cpp.
* Fix rnn.
* Move zeros in the same manner.
* Fix cuda.
* Change randn.
* Change rand.
* Change randperm.
* Fix aten contrib.
* Resize in randperm_out.
* Implement eye.
* Fix sparse zeros.
* linspace, logspace.
* arange.
* range.
* Remove type dispatch from gen_python_functions.
* Properly generate maybe_init_cuda for type dispatch functions not named type.
* Don't duplicate the dtype and 'this' parameters for native type-dispatched functions.
* Call VariableType factory methods from the base type so it gets version number 0.
* Address review comments.
* Port cuDNN RNN dropout state initialization to ATen and make Python code use it.
Fixes #5138.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Variable/Tensor bugfix
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
The Tensor and Variable classes are being merged.
autograd.Function.forward is now called on Variables, but with "no-grad"
mode (torch.no_grad()) enabled.
One benefit is that we no longer have to explicitly track shared
storages.
* Add transpose() to TensorGeometry.
This code is dead; I briefly used it in my RNN patchset but
eventually rewrote it to not be necessary. However, it seemed
like a useful gadget so I kept it. In general, it seems it would
be useful for TensorGeometry to support every operation that
Tensor does, with TensorGeometry only computing the resulting
sizes/strides instead of actually performing the computation.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
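For reference, the idea in a self-contained sketch (not the real TensorGeometry class; just the shape-only transpose):

```cpp
// Sketch: transpose as a pure metadata operation; only the cached
// sizes/strides are swapped, the underlying data is never touched.
#include <cstdint>
#include <utility>
#include <vector>

struct Geometry {
  std::vector<int64_t> sizes;
  std::vector<int64_t> strides;

  Geometry transpose(size_t dim0, size_t dim1) const {
    Geometry out = *this;
    std::swap(out.sizes[dim0], out.sizes[dim1]);
    std::swap(out.strides[dim0], out.strides[dim1]);
    return out;
  }
};
```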
* Turn on wrap_dim behavior for TensorGeometry
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Support for hard-coded differentiable outputs.
Some outputs of functions are nondifferentiable, and should always
be returned with requires_grad=False. Traditionally, we have used
the presence of 'grad' to signal that only the first output is
differentiable, and the rest are not, but cudnn_rnn (to be
implemented) breaks this pattern; its first three outputs are differentiable,
but its last output is a buffer that is just consumed by backwards.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* TensorGeometry constructor from just sizes
The sizes are assumed to form a contiguous tensor, and we compute
the strides we would get in that case.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
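The stride computation is the usual row-major rule; a small sketch (not the exact constructor code):

```cpp
// Contiguous (row-major) strides are the running product of trailing sizes,
// e.g. sizes {2, 3, 4} -> strides {12, 4, 1}.
#include <cstdint>
#include <vector>

std::vector<int64_t> contiguous_strides(const std::vector<int64_t>& sizes) {
  std::vector<int64_t> strides(sizes.size());
  int64_t running = 1;
  for (size_t i = sizes.size(); i-- > 0;) {
    strides[i] = running;
    running *= sizes[i];
  }
  return strides;
}
```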
* Support saving TensorList for backwards.
There is some back story here. Saved TensorList in backwards will
be used by cudnn_rnn, and it is worth asking why it is necessary to
save a list of tensors at all. Indeed, *technically* speaking, a list of
tensors is not necessary: we only need to save the sizes of each
of the weight tensors. (We need the sizes because cuDNN is only
going to blast the derivative of weights into a flat buffer, but
we need to match the sizes of the views into the buffer when we
eventually return the derivatives.)
However, passing just the sizes turned out to be surprisingly awful to
implement: the JIT interpreter generation code is expected to handle all
non-Tensor arguments as attributes in the trace, and our attributes
struct doesn't actually know how to represent arrays of arrays. The
saved-TensorList code was much easier to get working,
so that's what this patch does.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
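To make the "we need the sizes to carve views out of the flat buffer" point concrete, here is a hedged sketch of that step (not the actual _cudnn_rnn_backward code; the saved weights are used only for their sizes):

```cpp
// Sketch: split a flat gradient buffer produced by cuDNN into one view per
// saved weight tensor, matching each weight's shape.
#include <ATen/ATen.h>
#include <vector>

std::vector<at::Tensor> unflatten_grads(const at::Tensor& flat_grad,
                                        const std::vector<at::Tensor>& weights) {
  std::vector<at::Tensor> grads;
  grads.reserve(weights.size());
  int64_t offset = 0;
  for (const auto& w : weights) {
    const int64_t n = w.numel();
    // narrow() is a view into the flat buffer; reshape it to the weight's shape.
    grads.push_back(flat_grad.narrow(0, offset, n).view(w.sizes()));
    offset += n;
  }
  return grads;
}
```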
* MatrixRef - an ArrayRef with a stride, making it a 2D ArrayRef.
Like ArrayRef, this class does not own the underlying data, it is expected
to be used in situations where the data resides in some other buffer.
This is intended to be trivially copyable, so it should be passed by
value.
For now, 2D only (so the copies are actually cheap, without having
to write a SmallVector class) and contiguous only (so we can
return non-strided ArrayRef on index).
The intended use-case (not in this commit) is to make it easier to
work with RNN weights, which form a num_weights x num_layers matrix of
parameters.
P.S. dimension 0 indexes rows, dimension 1 indexes columns
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
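A simplified standalone sketch of the shape of the class (not the real MatrixRef; ArrayRef is mocked as a tiny span type):

```cpp
// Sketch: a non-owning 2D view over a flat, contiguous buffer; indexing a row
// returns a contiguous 1D view, mirroring "non-strided ArrayRef on index".
#include <cassert>
#include <cstddef>

template <typename T>
struct Span {            // stand-in for at::ArrayRef<T>
  const T* data;
  size_t length;
  const T& operator[](size_t i) const { assert(i < length); return data[i]; }
};

template <typename T>
struct Matrix2DRef {     // stand-in for MatrixRef<T>
  const T* data;
  size_t stride0;        // elements per row (number of columns)
  size_t rows;

  // Dimension 0 indexes rows, dimension 1 indexes columns.
  Span<T> operator[](size_t row) const {
    assert(row < rows);
    return Span<T>{data + row * stride0, stride0};
  }
};
```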
* Generalize getDataType in Descriptors.h
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Change copy_range to take Tensor, and change cat_tensors_backward accordingly
Should a backward function return a Variable or a Tensor? For the most
part, all of our backward functions return Tensor, except cat_tensors_backward,
which returns a variable_list (not that the distinction really matters,
because Tensor and Variable are interconvertible). But this is kind of weird,
because it means that you can't implement a backwards in ATen that returns
a std::vector<Tensor>, and then hook it up transparently with the derivatives
code. So I switched it over.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
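For context, a hedged sketch of what a cat backward returning std::vector<Tensor> looks like (not the exact cat_tensors_backward; the per-input sizes are assumed to have been saved at forward time):

```cpp
// Sketch: slice the incoming gradient along `dim` by each input's extent.
#include <ATen/ATen.h>
#include <vector>

std::vector<at::Tensor> cat_backward_sketch(
    const at::Tensor& grad,
    const std::vector<std::vector<int64_t>>& input_sizes,
    int64_t dim) {
  std::vector<at::Tensor> grad_inputs;
  grad_inputs.reserve(input_sizes.size());
  int64_t offset = 0;
  for (const auto& size : input_sizes) {
    const int64_t extent = size[dim];
    grad_inputs.push_back(grad.narrow(dim, offset, extent));
    offset += extent;
  }
  return grad_inputs;
}
```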
* Support 5-ary return Tensor tuple.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Support code generation with mixed Tensor/TensorList in output.
I don't think I ended up using this in cudnn_rnn, but it seems like
it might be useful for someone else later.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Support 4-ary boolean array
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Add support for retain_variables in tools/autograd/derivatives.yaml
'retain_variables' is a bool which is true if the user has specified
that saved variables should be retained in case the backward is
run again later. This allows an optimization where we can
destroy saved buffers when we know the variables are not going to be
retained; e.g., it is (will be) used by _cudnn_rnn
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Lazily initialize cuDNN descriptors
Previously, cuDNN descriptors were eagerly allocated as soon
as a FooDescriptor object was created. However, in some uses
of TensorDescriptor, this is problematic: some tensors are optional
and cuDNN's API expects to be given a nullptr TensorDescriptor
in this case, not an uninitialized (but allocated) descriptor.
Lazily initializing the descriptors makes it less likely for
us to use uninitialized memory and matches the usual semantics of
unique_ptr. It's good sense!
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
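The pattern, in a simplified sketch (the real wrapper in Descriptors.h also checks return codes; only the lazy allocation is shown):

```cpp
// Sketch: the descriptor is only created on first *mutable* access, so an
// untouched wrapper hands cuDNN the nullptr it expects for optional tensors.
#include <cudnn.h>

class LazyTensorDescriptor {
 public:
  LazyTensorDescriptor() = default;
  ~LazyTensorDescriptor() {
    if (desc_ != nullptr) cudnnDestroyTensorDescriptor(desc_);
  }
  LazyTensorDescriptor(const LazyTensorDescriptor&) = delete;
  LazyTensorDescriptor& operator=(const LazyTensorDescriptor&) = delete;

  // Const access never allocates.
  cudnnTensorDescriptor_t desc() const { return desc_; }

  // Mutable access allocates on first use (unique_ptr-like semantics).
  cudnnTensorDescriptor_t mut_desc() {
    if (desc_ == nullptr) cudnnCreateTensorDescriptor(&desc_);
    return desc_;
  }

 private:
  cudnnTensorDescriptor_t desc_ = nullptr;
};
```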
* Port cuDNN RNNs to ATen.
This brings three new functions:
- _cudnn_rnn_flatten_weight: flatten a matrix of weight tensors into
a single contiguous weight buffer as required by cuDNN
- _cudnn_rnn: run RNN forwards
- _cudnn_rnn_backward: run RNN backwards
RNNs have a lot of parameters, so we restructured what was previously
a single 'fn' object that recorded all the parameters into three
objects: RNNDescriptorParams, TensorDescriptorListParams and
DropoutDescriptorParams.
We make use of MatrixRef to organize the weight tensors (which are
weight/bias x number of layers), but I did not teach the codegen
how to pass these as arguments/return values natively, so instead
a MatrixRef is passed as its constituent ArrayRef and int64_t stride0.
cudnn_rnn has three differentiable outputs and one nondifferentiable
one, so it makes use of the support for hard-coded differentiable outputs.
I haven't deleted all of the descriptor code from Python, because dropout
initialization still goes through this codepath. That should be fixed soon,
but I don't see it as essential for this PR.
This commit also removes the last use of NestedIOFunction from PyTorch.
There are some shenanigans with cuDNN dropout descriptor initialization;
see the note below:
Note [cuDNN dropout descriptor initialization]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In most cases, setting descriptors in cuDNN is cheap (e.g.,
cudnnSetTensorNdDescriptor). However, this is not the case for
cudnnSetDropoutDescriptor: in cuDNN 6/7 (and possibly others) it does an
expensive precomputation to initialize the random number generator states. In
cuDNN 6, this is the ONLY official mechanism to initialize a dropout descriptor,
which means that law-abiding clients were expected to generate a dropout
descriptor once and cache it. However, our ATen interface is (1) stateless (so
we can't cache the descriptors) and (2) does not accept arbitrary user types in
its interface (so we can't pass the descriptor in). This puts us in a pickle.
In cuDNN 7, a new function, cudnnRestoreDropoutDescriptor was added, which
forgoes the expensive initialization process, and can initialize the
descriptor with a pre-initialized state CUDA tensor. This is great, because
it means we can simply pass in the state tensor and then initialize the
descriptor internally. Unfortunately, this function is not available in
cuDNN 6.
To work around this, we break the cuDNN abstraction barrier and hard-code
the struct layout of the underlying dropout descriptor. With this struct,
we can reimplement cudnnRestoreDropoutDescriptor from scratch. Great!
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
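For concreteness, here is a hedged sketch of the two code paths from the note, using only the public cuDNN 7 API; error checking and the cuDNN 6 struct-layout workaround are omitted:

```cpp
// Sketch: expensive one-time initialization vs. cheap restore from a cached
// state buffer (cudnnRestoreDropoutDescriptor, cuDNN 7+).
#include <cudnn.h>
#include <cuda_runtime.h>

// One-time, expensive: seeds the dropout RNG states into a freshly
// allocated device buffer.
void init_dropout_state(cudnnHandle_t handle, cudnnDropoutDescriptor_t desc,
                        float dropout, unsigned long long seed,
                        void** state, size_t* state_size) {
  cudnnDropoutGetStatesSize(handle, state_size);
  cudaMalloc(state, *state_size);
  cudnnSetDropoutDescriptor(desc, handle, dropout, *state, *state_size, seed);
}

// Cheap and repeatable: rebuild a descriptor from the cached state buffer
// without redoing the RNG precomputation.
void restore_dropout(cudnnHandle_t handle, cudnnDropoutDescriptor_t desc,
                     float dropout, unsigned long long seed,
                     void* state, size_t state_size) {
  cudnnRestoreDropoutDescriptor(desc, handle, dropout, state, state_size, seed);
}
```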
* Fix cuDNN 7 behavior.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Delete some unused, controversial methods from MatrixRef.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Add missing filter_dim_a slice
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Replace nested for-loop with itertools.chain.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* CR comment on mut_desc()
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Refactor DropoutDescriptor API.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Use cached CurrentDeviceProperties from Context.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Document _cudnn_rnn outputs.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Improve fmap docs, convert some functions to use it.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Move IndexRange to autograd/function.h
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Elaborate on CUDNN_STATUS_INVALID_VALUE return some more.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Add an all-in-one setter for RNNDescriptorParams.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Print what the unrecognized RNN mode was
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* RNN TensorDescriptor improvements
- Add an explicit size/stride overload for setting a TensorDescriptor,
so you don't have to create a goofy view to feed in.
- Change the padding to 3D rather than 5D, which is all you actually
need (it's just 2D that is not supported by the cuDNN API).
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
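The padding itself is trivial; a hedged sketch (simplified from the real descriptor-setting code):

```cpp
// Sketch: the cuDNN API does not support 2D tensor descriptors, so trailing
// size-1 / stride-1 dimensions are appended until the minimum rank (3) is
// reached; this does not change the memory layout.
#include <cstdint>
#include <vector>

void pad_to_min_rank(std::vector<int64_t>& sizes,
                     std::vector<int64_t>& strides,
                     size_t min_rank = 3) {
  while (sizes.size() < min_rank) {
    sizes.push_back(1);
    strides.push_back(1);
  }
}
```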
* Fix implementation of cudnnRestoreDropoutDescriptor, plus test.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Better comments about input layout.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Add comment about no-DropoutDescriptor argument RNNDescriptor function.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Rename vocab_size back to input_size.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Don't use backslash in comment.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Bugfix for contiguous TensorGeometry calculation.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Don't allocate a dummy tensor when setting TensorDescriptor for flatten_weight.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Make contiguity errors more user-friendly.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* s/fn.dropout.train/fn_train/
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* s/_cudnn_rnn_backward_grad/_cudnn_rnn_backward_input/
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Make dcx properly undefined when not required.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Remove old TODO.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Add state size check in cudnnRestoreDropoutDescriptor
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Explicitly narrow int64_t to size_t
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Restore copyParams comment.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Update benchmark numbers, and slight engineering improvements.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Typofix.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
A three-stage plan to end the stupidly weird "why isn't cuDNN enabled"
bugs:
- Add torch.backends.cudnn.disable_global_flags(), which, as its name suggests,
disables global flag setting for cuDNN, so that you are not allowed to
make changes to this state. However, the flags() context
manager continues to work (since its changes are non-global).
- Call disable_global_flags() in test/common.py
- Switch all of the manual flag setting/unsetting in test/test_nn.py
to use the context manager.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
- Rename THNN convolution to have thnn_ prefix.
- Propagate CuDNN benchmark and deterministic to at::Context
- Add 'convolution', 'convNd' and 'conv_transposeNd' native wrappers, with defaults
The conv_transposeNd wrappers are updated to have the same argument
order as Python.
- torch.nn.functional directly dispatches to the native wrappers
- Make it possible to turn off tracing for some native wrappers, so I don't
have to write symbolics for all the functions above
- Spectral ops can now make use of CuDNN convolution if possible
- Better commentary on cudnn_batch_norm
- Turn on DCE for all JIT tests.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
This is not currently used by anything, but eventually ATen
will need to make decisions about whether or not to use
cuDNN functions, which means we need to propagate
this variable to ATen.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Check cuDNN version at runtime
This checks that the version from cudnn.h matches the version from
libcudnn.so.
Fixes #1476
* Only check major and minor version numbers
This ensures that we use the same library at the C++ level and with
Python ctypes. It moves the search for the correct library from
run-time to compile-time.
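A hedged sketch of the major/minor version comparison described above (not the exact PyTorch check):

```cpp
// Sketch: compare the compile-time header version with the version reported
// by the loaded cuDNN library, ignoring the patch level.
#include <cudnn.h>
#include <stdexcept>
#include <string>

void check_cudnn_version() {
  const size_t runtime = cudnnGetVersion();  // from the loaded libcudnn
  const size_t compiled = CUDNN_VERSION;     // from cudnn.h (major*1000 + minor*100 + patch)
  if (runtime / 1000 != compiled / 1000 ||
      (runtime % 1000) / 100 != (compiled % 1000) / 100) {
    throw std::runtime_error(
        "cuDNN version mismatch: compiled against " + std::to_string(compiled) +
        " but found " + std::to_string(runtime) + " at runtime");
  }
}
```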
Here's the command I used to invoke autopep8 (in parallel!):
git ls-files | grep '\.py$' | xargs -n1 -P`nproc` autopep8 -i
Several rules are ignored in setup.cfg. The goal is to let autopep8
handle everything it can handle safely, and to disable any rules
which are tricky or controversial to address. We may want to come back
and re-enable some of these rules later, but I'm trying to make this
patch as safe as possible.
Also configures flake8 to match pep8's behavior.
Also configures TravisCI to check the whole project for lint.