pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 00:21:07 +01:00

Author	SHA1	Message	Date
Soumith Chintala	dc186cc9fe	Remove NO_* and WITH_* across codebase, except in setup.py (#8555 ) * remove legacy options from CMakeLists * codemod WITH_ to USE_ for WITH_CUDA, WITH_CUDNN, WITH_DISTRIBUTED, WITH_DISTRIBUTED_MW, WITH_GLOO_IBVERBS, WITH_NCCL, WITH_ROCM, WITH_NUMPY * cover SYSTEM_NCCL, MKLDNN, NNPACK, C10D, NINJA * removed NO_* variables and hotpatch them only in setup.py * fix lint	2018-06-15 12:29:48 -04:00
James Reed	04503962ff	[ONNX] Add an ATen fallback pathway for ONNX export (#8273 ) * ATen fallback for ONNX export * Move to enum * Fix model test * Add comment * Address comments BC interface	2018-06-12 22:59:45 -07:00
Pieter Noordhuis	695d40efc2	Create initial Python bindings for c10d (#8119 ) * Build and install c10d from tools/build_pytorch_libs.sh * Create initial Python bindings for c10d * clang-format * Switch link order to include more symbols * Add bindings and tests for ProcessGroupGloo * Add broadcast test * Separate build flag for c10d * Explicit PIC property * Skip c10d tests if not available * Remove c10d from Windows blacklist Let it skip by itself because it won't be available anyway. * Make lint happy * Comments * Move c10d module into torch.distributed * Close tempfile such that it is deleted	2018-06-08 12:59:51 -07:00
Edward Z. Yang	15122e93bc	Test if ASAN is actually working as part of ASAN tests. (#6050 ) * Test if ASAN is actually working as part of ASAN tests. Signed-off-by: Edward Z. Yang <ezyang@fb.com> * Drop explicit use of libstdc++, we should not care. Signed-off-by: Edward Z. Yang <ezyang@fb.com> * Build with DEBUG=1 Signed-off-by: Edward Z. Yang <ezyang@fb.com> * Increase main thread stack size when using ASAN. Signed-off-by: Edward Z. Yang <ezyang@fb.com>	2018-05-30 11:31:42 -04:00
Zachary DeVito	286cd04a20	JIT cleanup (#7631 ) Cleans up dead code in the JIT: * Remove interpreter_autograd_function * Remove Handles * Remove HandleBuilder * Remove creates_handles, and tracing_autograd_python_function flags * Remove unused var_args * Fix submodules	2018-05-21 10:06:29 -07:00
Adam Paszke	0829d4502d	Trace size-dependent expressions correctly (#6554 ) This makes the JIT tracer much more robust, by allowing it to record dependencies on tensor sizes. For example, if you were to trace this function def fn(x): return x.view(x.size(1), -1) before this patch, then it would embed the actual value of x.size(1) in the trace as a constant, making it very hard to have e.g. batch size independent traces. Now, this will correctly record the dependency, and will retrieve the size of x at every run.	2018-05-04 10:55:39 +02:00
Zachary DeVito	d985cf46f1	Add workaround to fix include warnings in Python 2 builds. (#6716 )	2018-04-24 12:30:19 -07:00
li-roy	d1bb75e273	Redo tensor repr to make it less verbose (#6370 ) * Redo tensor repr to make it less verbose * fix empty tensor * fix scaled scalars * update for device-dtype split * address comments * removed repeated lines * address comments * add cuda to device string	2018-04-18 18:25:07 -07:00
bddppq	c43c911662	Export onnx protobuf bindings to python (#6651 ) * Export onnx protobuf bindings to python * rename native onnx module to _onnx	2018-04-17 16:38:57 -07:00
gchanan	d7cb78478f	Split set_default_tensor_type(dtype) into set_default_dtype(dtype). (#6599 ) * Split set_default_tensor_type(dtype) into set_default_dtype(dtype). * Fix flake8. The difference between this one and set_default_tensor_type is that it only sets scalar type what determines the type + device of a tensor returned from a factory function with defaults is the default tensor type + the current device (if the default tensor type is cuda). This just changes the scalar type of the default tensor type. We do eventually want to deprecate set_default_tensor_type; it is not clear how to do that in a sensible and backwards compatible way.	2018-04-16 13:49:00 -04:00
gchanan	749d51414a	Separate cuda-ness from dtype. (#6470 ) * Separate cuda-ness from dtype. There are no longer torch.cuda.int64, etc; only torch.int64 that correspond to at::ScalarType. At the python arg parser level, the corresponding ATen type is selected from the combination of (ScalarType, Layout, Device). There is also currently unused code in here for support ScalarType in native_functions; this will be used for specifying aggregate types on reduction functions. * Fix test_autograd. * Add defaults to randint_like. * Track is_cuda in py tensor types. * Fix test_sparse. * Fix multiprocessing. * Fix rnn. * Fix test_nn. * Fix flake8.	2018-04-12 14:05:44 -04:00
gchanan	87e369111a	Add string-style devices to all tensors. (#6283 ) * Add string-style devices to all tensors. Previously, tensors only had a 'get_device' method which would throw an exception on a CPU tensor. This made it necessary to if/else code that was meant to be device agnostic. This PR implements the following: 1) Adds a 'device' property to all tensors that returns a string representation of the device for all tensors. For cpu tensors this is 'cpu'. For cuda tensors this is 'cuda:X', where X is the cuda device ordinal. 2) Adds a DeviceSpec class. This is just a helper class for separating device_type and device_index specification and to allow partial specification. For example, you can call DeviceSpec('cuda'), DeviceSpec('cuda:0'), DeviceSpec('cuda', 1). Also has backwards compatibility support for specifying integers, which are treated as cuda devices. DeviceSpecs have the following properties: a) device_type: string representation of the device type (i.e. 'cpu' or 'cuda') b) device_index: integer for the device index (None if not specified) c) cuda_device_index: for backwards compatibility; behaves roughly like `get_device` did previously. I.e. if a function previously took integers for cuda devices, it can now take DeviceSpecs (or strings), and can maintain the old functionality by calling `old_index = DeviceSpec(old).cuda_device_index`. 3) tensor methods and torch. functions that took integer devices can now take integers, strings, or DeviceSpecs. For example: torch.randn((2,3), dtype=torch.cuda.float32, device='cuda:1') TODO in future PRs: A) Split out cuda from dtype so you don't need to overspecify cuda-ness B) We currently only support strings/DeviceSpecs in tensor methods and torch. functions. We should have equivalents torch.cuda.device(...), torch.cuda.device_of, etc. at the torch. level that work on strings/DeviceSpecs * Add deviceInt64 to python arg parser. * device_str. * Remove device_str. * remove device prefix from attributes. 
* Use const char * instead of string. * Move autogpu index out of Device. * comment on is_default. * Rename torch.DeviceSpec to torch.device. * comment. * Fix tests. * Fix flake8. * Fix sparse_coo_tensor parameter name. * Improve error message. * Remove device_ prefix from C++ device object. * Allocate static strings. * Return not implemented from rich compare. * Move torch::Device to THPDevice. * Remove cuda index. * Py_RETURN_NOTIMPLEMENTED doesn't exist in python2.	2018-04-06 15:12:05 -04:00
Sam Gross	6b3a4637d6	Make the tensor type torch.Tensor instead of torch.autograd.Variable (#5785 ) This changes type(tensor) to return `torch.Tensor` instead of `torch.autograd.Variable`. This requires a few implementation changes: - torch.Tensor is now a regular Python class instead of a pseudo-factory like torch.FloatTensor/torch.DoubleTensor - torch.autograd.Variable is just a shell with a __new__ function. Since no instances are constructed it doesn't have any methods. - Adds torch.get_default_dtype() since torch.Tensor.dtype returns <attribute 'dtype' of 'torch._C._TensorBase' objects>	2018-04-03 16:29:25 -04:00
Sam Gross	83926393d3	Detect re-initialization of _C shared library (#6232 ) We had a bug in the Buck build of PyTorch due to symbols from _C being present in two shared libraries that were both loaded at runtime. This caused global variables to be initialized twice and destructed twice on exit. The second destruction often caused segfaults on exit. This attempts to detect that sort of situation early on. If Module.cpp is compiled twice, the symbol pytorch_duplicate_guard()::initialized will be shared. The second initialization will print an error message and abort.	2018-04-03 15:28:37 -04:00
gchanan	4c81282c33	Introduce torch.layout and split layout from dtypes. (#6145 ) * Introduce torch.layout and split layout from dtypes. Tensors (and tensor types) now have a 'layout' attribute that returns either 'torch.strided' or 'torch.sparse_coo'. Previously, dtypes were 1-to-1 with ATen types/PyTensorTypes; the impetus behind this decision was to make things easy in the common case (i.e. specifying a type in a factory function). But this doesn't really follow for sparsity, which isn't a common case. It also doesn't properly represent the concept of a dtype, which in numpy are proper scalar types (i.e. roughly the type returned from indexing the last dimension of an n-d array). But this should be the same whether or not the tensor is represented via strides, sparsity, etc. This is accomplished by: 1) having the dtype of tensor return the (device-type, scalar-type) combination, i.e. torch.cuda.float32, so both torch.cuda.FloatTensor and torch.cuda.sparse.FloatTensor have the same dtype 2) Adding a layout parameter to python functions, where the combination of (dtype, layout) maps to an ATen type that is used for dispatch. * Formatting, make init throw python_error. * Fix cuda not enabled error message. * Fix test.	2018-04-02 14:07:50 -04:00
Tongzhou Wang	22ef8e5654	[fft][1 of 3] build system and helpers to support cuFFT and MKL (#5855 ) This is the first of three PRs that #5537 will be split into. This PR adds mkl headers to included files, and provides helper functions for MKL fft and cuFFT. In particular, on POSIX, headers are using mkl-include from conda, and on Windows, it is from a new file @yf225 and I made and uploaded to s3. * add mkl-include to required packages * include MKL headers; add AT_MKL_ENABLED flag; add a method to query MKL availability * Add MKL and CUFFT helpers	2018-03-19 15:43:14 -04:00
cpuhrsch	84400d5531	ReduceOps cleanup and set_num_threads (#5723 )	2018-03-19 13:40:56 -04:00
Sam Gross	7588893ce2	Some additional clean-ups (#5505 ) - Remove some uses of mega-header THP.h - Use HANDLE_TH_ERRORS in functions that may throw - Move NumPy includes to common header - Delete unused allocator	2018-03-05 17:45:02 -05:00
Sam Gross	5dedc648bb	Compile DataLoader.cpp separately (#5507 ) Don't #include DataLoader.cpp in Module.cpp	2018-03-02 05:54:33 -05:00
gchanan	285a9e2452	Add dtype to torch.Tensor constructors and accept them in set_default_tensor_type (#5444 ) * Add dtype to torch.Tensor, torch.FloatTensor, etc. * Support passing dtypes to set_default_tensor_type. * Check dtype exception. * Correctly handle new type initialization order. * Move handling of torch.Storage alias to C++. * Delete function that erroneously reappeared.	2018-03-01 14:06:55 -05:00
Sam Gross	509aed6ca3	More Variable/Tensor clean-ups (#5464 )	2018-02-28 16:46:47 -05:00
Sam Gross	48a3349c29	Delete dead Tensor code paths (#5417 ) This deletes most of the dead Tensor code paths, including the TensorMethods cwrap and generic/Tensor.cpp. This also moves the THNN.cwrap/.cpp generation to generate_code which can use ninja if installed.	2018-02-27 17:58:09 -05:00
gchanan	d5038309a1	Remove WITH_SCALARS, as it's enabled by default now. (#5437 )	2018-02-27 14:51:11 -05:00
Sam Gross	30ec06c140	Merge Variable and Tensor classes (#5225 ) This replaces the torch.Tensor constructors with factories that produce Variables. Similarly, functions on the torch module (e.g. torch.randn) now return Variables. To keep the PR to a reasonable size, I've left most of the unused tensor code. Subsequent PRs will remove the dead code, clean-up calls to torch.autograd.Variable, and rename Variable to Tensor everywhere. There are some breaking changes because Variable and Tensors had slightly different semantics. There's a list of those changes here: https://github.com/pytorch/pytorch/wiki/Breaking-Changes-from-Variable-and-Tensor-merge	2018-02-23 18:03:31 -05:00
gchanan	5edf6b2037	Add numpy-style dtypes to Variable factories. (#5245 ) * Add numpy-style dtypes to Variable factories. 1) Add numpy-style dtypes corresponding to torch tensor types. These are: torch.float16, torch.float32, torch.float64, torch.uint8, torch.int8, torch.int16, torch.int32, torch.int64 as well as torch.cuda, torch.sparse, and torch.cuda.sparse equivalents. 2) Adds "legacy" names for the above dtypes that correspond more closely to existing tensor names. These are: torch.half, torch.float, torch.double, torch.short, torch.int, torch.long. torch.byte and torch.char don't exist because they either don't match numpy semantics or differ on different architectures. 3) Adds a "dtype" parameter to Variable factories (e.g. zeros, ones) that allows the user to specify the type without changing the default tensor type. 4) Adds a "dtype" getter to Variables that return the canonical dtype from 1) This PR is missing the following useful features that should be added in the future: A) We only add the "dtype" parameter to auto-generated factories; hand-written factories like in tensor_new.cpp don't support this yet. B) We don't allow type conversions to use dtypes; that should be added to type(param) or a new function. C) We don't yet have a "device" parameter for these factories; right now, they will only create Variables on the default device. * backend_to_string can be private. * Define python binding argument indexes in a more simple way. * add all_declared_types, still need to hook it up to THPDType. * Fix all_declared_types for missing types (it's Sparse + Half). * Ensure cuda dtypes are created even if compiled with NO_CUDA=1. * Fix case where dtype is provided but dispatch is via namespace. This happens in ones_like, empty_like, randn_like. 
There is some question if we should do: 1) at::ones_like(tensor).toType(dtype) 2) at::ones_like(tensor.toType(dtype)) I did the former because this matches with the numpy documentation, i.e.: "Overrides the data type of the result." and it's easier to implement. Note that the above causes an extra copy, either of the input or output. Here's a better implementation: 1) Make zeros_like, ones_like native functions that take an optional type (named dtype?). 2) Match the type argument with the dtype, so we don't have two different parameters. 3) Call at::zeros_like(input, type) -> at::native::zeros_like(input, type) -> type.zeros(input.sizes()) * Don't return from maybe_initialize_cuda. * Don't leak DType name. * Address cpp review comments. * Share code between sparse and non-sparse test_dtypes. * Rewrite _like functions as native function with explicit type parameter. * Use type 'Type' instead of 'dtype' for consistency. * Address review comments. * Handle arg_idx when there is requires_grad but no dtype in python_binding_arguments.	2018-02-20 11:04:14 -05:00
Choongwoo Han	fae6c67121	Configurable flushing denormal numbers on CPU (#5294 ) * Configurable flushing denormal numbers on CPU * Formatting * Update docs * Minor doc changes	2018-02-19 19:23:43 -05:00
gchanan	9bb6d33d35	Enable scalars if compiled with WITH_SCALAR environment variable. (#4806 ) * Enable scalars if compiled with WITH_SCALAR environment variable. We are pretty close to enabling scalars (0-dimensional arrays); this allows turning them on for development purposes and to be able to write code that works both with and without scalars enabled. WITH_SCALARS is currently broken with distributions, but should work for test_torch, test_autograd, test_nn. * Fix unsqueeze. * Fix wrap dim, wrapping with Scalar.	2018-01-23 15:44:11 -05:00
gchanan	1569797b15	Use ATen infer_size implementation rather than TH. (#4781 ) * Use ATen infer_size implementation rather than TH. The only substantive difference between the two implementations is in how empty sizes are handled; in ATen these are treated as scalars (i.e., can be expanded to anything), whereas in TH they are treated as a special case of empty tensors (i.e., can't be expanded to anything). Therefore, this change is necessary to support scalars (0-dimensional tensors). We could also take a bool parameter for determining how we treat empty tensors but this seems unnecessary: if one tries to expand an empty tensor (as a result of an infer_size calculation), the expansion will fail. * Make changes for review. * Attempt to fix windows build. * long -> int.	2018-01-22 15:34:31 -05:00
Sam Gross	e855317370	Make dirichlet_grad and standard_gamma match ATen declarations (#4722 ) The Python function has an underscore (_) prefix so the C++ IMPLEMENT_STATELESS call should have an underscore prefix as well.	2018-01-18 16:49:18 -05:00
Adam Paszke	1061d7970d	Move broadcast and broadcast_coalesced to C++	2018-01-18 11:16:45 +01:00
Sam Gross	57549b7e44	Bind functions with out= arguments in VariableType (#4565 ) This adds overrides in VariableType for the xxx_out ATen functions and implements Python bindings. There is no support for automatic differentiation. If any of the inputs (or outputs) requires grad, then the function will throw an exception unless it's running in "no-grad" mode. The bindings for calling torch.xxx functions on Variables are moved to a different object. Previously, they were static methods on VariableBase. This change prevents users from accidentally calling static methods as if they were instance methods.	2018-01-17 18:27:42 -05:00
HE, Tao	f4a75deccf	Fix the inconsistency of `polygamma` on Tensor and Variable, for issue #4466 (#4527 ) * Fix the inconsistency of `polygamma` on Tensor and Variable. Signed-off-by: HE, Tao <sighingnow@gmail.com> * Regression test for #4466, polygamma works on variables. Signed-off-by: HE, Tao <sighingnow@gmail.com> * Add macro IMPLEMENT_STATELESS_SWAP to dispatch stateless methods on Variables correctly. When calling stateless methods with more than one argument where the `self` comes second, the `self` argument needs to be swapped to the first position before dispatching. The macro `IMPLEMENT_STATELESS_ADDXX` is still reserved for deprecated `add**` methods. Signed-off-by: HE, Tao <sighingnow@gmail.com>	2018-01-09 10:39:09 -05:00
Fritz Obermeyer	35abc4efa2	Add low-precision digamma() and polygamma() functions (#4399 )	2018-01-02 11:53:23 +01:00
Vishwak Srinivasan	e519ef5337	Adding torch.expm1() and its inplace function (#4350 )	2017-12-28 18:56:03 +09:00
SsnL	658d4c7ea8	allow optional int tensor	2017-12-24 03:08:28 +08:00
Edward Z. Yang	5f7c5502b8	Further improvements to ATen convolution (#4287 ) - Rename THNN convolution to have thnn_ prefix. - Propagate CuDNN benchmark and deterministic to at::Context - Add 'convolution', 'convNd' and 'conv_transposeNd' native wrappers, with defaults The conv_transposeNd wrappers are updated to have the same argument order as Python. - torch.nn.functional directly dispatches to the native wrappers - Make it possible to turn off tracing for some native wrappers, so I don't have to write symbolics for all the functions above - Spectral ops can now make use of CuDNN convolution if possible - Better commentary on cudnn_batch_norm - Turn on DCE for all JIT tests. Signed-off-by: Edward Z. Yang <ezyang@fb.com>	2017-12-21 13:03:43 -05:00
Edward Z. Yang	9bf5e40dfa	Refactor cudnn code layout / make build more robust. (#4201 ) * Refactor cudnn code layout / make build more robust. When I previously moved cuDNN into ATen, I wasn't too familiar with the ATen native function directory layout, and so I did a number of suboptimal things. This commit fixes those problems. - If NO_CUDA was set but cuDNN is installed on your system, we'd incorrectly assume that CUDNN was enabled, to hilarious effect. - We now distinguish between cudnn implementation files and cudnn native function files. The native files now live in ATen/native/cudnn, and are unconditionally compiled, even when we are not building with cuDNN. This means that we can unconditionally declare cudnn functions in yaml and they are always available, even if they are broken. The cuDNN specific files live in 'cudnn', they are never installed, and they are used purely for implementation purposes. I had to add stub implementations of all ATen functions to achieve this. - I had written headers for at::native functions manually, but codegen will generate them for me automatically. So I deleted the headers. That lets me get rid of some header install logic as well. - There's a new note about ATen preprocessor philosophy.	2017-12-18 16:47:57 -05:00
Sam Gross	d605058212	Replace Variable.volatile with torch.no_grad() (#3970 ) This removes volatile from Variable. The functionality is mostly replaced by a global (thread-local) flag, which is controlled by torch.set_grad_enabled() and the context manager torch.no_grad(). In C++, the flag is exposed through GradMode::is_enabled() and GradMode::set_enabled() Fixes #3627	2017-12-18 15:46:13 -05:00
gchanan	0876bab8b7	Support CPU Apply in ATen and implement standard_gamma using it (#4161 ) * Support CPU Apply directly in ATen and implement standard_gamma using it. Main changes in this PR: 1) Added a TH_APPLY-style templatized function for CPU apply calls (currently only 2 and 3 tensor argument versions are supported, but more are easy to add). In fact, this is basically identical to TH_APPLY, except it uses ATen functions and the API is a template instead of a macro. The template takes an operation that is performed on the data (and an indicator to signal early termination); i.e. you don't need to know that x_data is a pointer to the current data location of x. 2) Refactors the ATen dispatch code to easily generate dispatch code for different subsets of the scalar types. This is in preference to the template_scalar path, which requires valid specialization of each scalar type. Valid specializations are particularly annoying with CUDA because you most likely can't put the specializations in a header so need to write some sort of for-all-scalar-type macro to get the correct specializations. Currently, we only generate dispatch_all (all scalar types, the equivalent existed already), and dispatch_cpu_floating_types (which is used by standard_gamma). 3) Implements standard_gamma using the above changes (this is an arbitrary choice, it was the latest apply macro to be committed). The forward is bound via Declarations.yaml, the backward via the Apply template, and then they are hooked together in derivatives.yaml. This eliminates needing to change TH at all going forward, which means one can write idiomatic C++ instead of the TH-style macros (e.g. TH_MATH_NAME). * Generate Dispatch code with nicer spacing. * Small cleanups. * Fix typo. * Add TODOs for changing macros, remove dead code. * Use a lambda function. * Get rid of early exit. * Rename Scalar,ScalarType template parameters to CScalar. * Reorder _standard_gamma_grad parameters. 
* Add comments explaining calling convention. * Don't generate Dispatch.h anymore. * Get rid of backend specific checks in dispatch. * Fix empty/scalar check.	2017-12-18 15:45:01 -05:00
Fritz Obermeyer	ee98e7a82e	Implement Dirichlet and Beta distributions (#4117 )	2017-12-18 19:11:37 +01:00
Zachary DeVito	d8c5f2ae21	Fix a bug where from_dlpack fails if cuda is not initialized. (#4182 )	2017-12-14 21:54:36 -05:00
Edward Z. Yang	787b9c5202	Propagate CuDNN enabled to ATen library. (#4104 ) This is not currently used by anything, but eventually ATen will need to make decisions about whether or not to use CuDNN functions or not, which means we need to propagate this variable to ATen. Signed-off-by: Edward Z. Yang <ezyang@fb.com>	2017-12-14 11:29:25 -05:00
Fritz Obermeyer	05ebd21a36	Implement reparameterized gradient for Gamma sampler (#3978 )	2017-12-11 03:32:15 -08:00
gchanan	1c96809cf8	Bind cauchy_, exponential_, normal_, uniform_ functions to THPVariable. (#3945 ) * Bind cauchy_, exponential_, normal_, uniform_ functions to THPVariable. Also changes the error messages around Generator parser; previously, you'd get an error like: torch._C.Generator is not a torch.Generator; now the check is proper but returns that only None is supported. * Support passing Generators to ATen Variable-bound methods. This involves changing THPGenerator to have an at::Generator rather than a THGenerator. TH getRNGState, setRNGState are still called directly because they are not bound from ATen yet; they should probably be on the Generators and return (opaque) GenerateState objects. * Fix default values. * Properly use THRandom_initialSeed. * update standard gamma to use new default generator.	2017-12-07 14:34:51 -08:00
Sam Gross	d0cabbde74	Implement Variable.from_numpy (#4043 ) Implements from_numpy using ATen tensors. Variable.from_numpy is a convenient placeholder for the variant that returns Variables until we merge Tensor and Variable. The behavior is slightly changed: - from_numpy() on an empty array now returns an empty tensor instead of throwing an exception. The shape may not be preserved. - CharTensor(ndarray) used to throw an exception. It now copies the ndarray. Copying is implemented via ATen toType.	2017-12-06 14:08:56 -05:00
Fritz Obermeyer	165d0897e4	Implement distributions.Gamma (#3841 )	2017-12-02 01:10:08 +01:00
Edward Z. Yang	1c0fbd27a1	CuDNN bindings rewrite (into ATen) (#3666 ) * Comprehensive rewrite of Torch CuDNN bindings / a bit of ATen infra The executive summary is that this moves the torch/csrc/cudnn library into ATen, adding a number of new cudnn_ methods to ATen for batchnorm, convolution, affine grid generator and grid sampler. ATen infra changes: - TensorGeometry was moved to ATen - TensorGeometry was modified to make its interface resemble that of Tensor; in particular, sizes is no longer a field, it's a method. - AT_CUDA_ENABLED macro is set via ATen/Config.h header which is generated at cmake configure time. Fixes https://github.com/zdevito/ATen/issues/168 - Change AT_CUDA_ENABLED macro to be a function macro, so that we error if it is not defined - Introduce a new TensorArg class, which is a Tensor plus a little metadata. This helps us give good error messages when checking dimensions/shapes of tensors. Fixes https://github.com/zdevito/ATen/issues/169 - Also introduce a TensorGeometryArg class, for when you don't need the actual tensor data (which is most of the time.) - Add ATen/Check.h, which contains a number of utility functions for testing shapes, types and devices of input tensors. This will be particulary useful for native methods, which don't get code generated input testing code. These functions take a 'CheckedFrom' argument, at the moment just a string, which specifies some extra information about what function was doing the actual checking; this greatly improves error messages. - Many check functions take initializer lists, which let you test that all tensors have some property. This API is peculiar, in that we IGNORE undefined tensors in this case. This is handled by filterDefined. - Add AT_CUDNN_ENABLED macro - CuDNN linking from ATen was improved; for example, we now actually add the CuDNN headers to our include path. 
- Add some missing override specifiers to some methods - We now actually build tests with CUDA functionality accessible (previously, AT_CUDA_ENABLED was not defined, meaning that the headers were missing all CUDA-only functionality.) - Native functions now support giving explicit names to return outputs in yaml. This makes it possible to hook into the NN autogenerated derivatives codepath using native functions. CuDNN rewrite changes: - torch/csrc/cudnn now uses ATen (rather than passing around THVoidTensor) and lives in ATen. This lets us remove tensorPointer shenanigans. The functions are exposed to ATen as native functions described in aten/src/ATen/cudnn/cuDNN.yaml - ATen now builds and links against CuDNN when enabled. The cmake package script was taken from Caffe2. - Some header reorganization was done to help reduce dependencies on headers (this reorg is no longer used but I've kept it) - Rename CHECK to CUDNN_CHECK - Rip out old shape/type testing code in favor of modern ATen/Check.h interface using TensorArg. In many cases, increase the robustness of the checking code. - Change the inputs of the public facing functions, so that they can be bound by ATen - Delete THCState; this is retrieved from the global ATen context - Delete cudnnHandle_t, this is retrieved from the global Handles.h - Delete cudnnDataType_t, this is retrieved from the Tensor type - Delete Convolution class, instead its constituent arguments are passed individually - Change functions to return tensors, rather than take an appropriately sized output tensor as an input. - Redo how transposed convolution / backward convolution is implemented (knock on effect of returning tensors). Previously it was assumed that you would always pass an appropriately sized output tensor, but we don't want to do this anymore. For backwards, we instead give the desired output tensor (input, really) size, because that is readily available. 
For transposed* convolution, however, we take output_padding, and otherwise do the shape calculation. - Redo how legacy group convolution is implemented (knock on effect from porting cudnn to ATen.) Previously, group convolution was implemented by manually constructing sizes and strides and then outputting appropriate, with macros switching between individual groups and all-at-once based on CuDNN version. Now, the code looks exactly what you'd expect: there's a top-level wrapping function that supports group convolution no matter the version of CuDNN, and a low-level wrapper which supports only what CuDNN supports. The top-level function conditions on CuDNN version, and invokes the low-level interface 1 or n times. - There is now a debugging printer for tensor descriptors. - Convolution struct is replaced with ConvolutionArgs, which is not part of the public API but is used internally to conveniently pass around all of the arguments needed for Convolution. - Add some constexprs for well-known dimensions, reduce amount of magic numbers in code. - Put 'deterministic' in to ConvParams. Fixes #3659 - Lots more comments. - Some pessimizations, in the name of code clarity: - The descriptors are initialized on every invocation of convolution forward/backward. Previously, the descriptors were cached, so that you didn't have to initialize them again on backwards. This is difficult to support in the ATen interface so I didn't support it. - Legacy group convolution initializes its workspace for every group it performs. I did not feel motivated to fix this because the legacy codepath is already quite slow. - Affine grid generator and grid sampler automatically call contiguous on their arguments as necessary. 
- Batchnorm input checking is greatly beefed up, it now checks for the following input characteristics: - Definedness - GPU location - Type - Contiguity - Size PyTorch binding code changes - batchnorm now uses consistent var/data naming - batchnorm and convolution make use of new ATen bindings - Affine grid generator and grid sampler make use of ATen CuDNN bindings via derivatives.yaml. This means I had to restructure the code a little, since the THNN bindings still go through a legacy Python class. - I fixed some warnings: - s/friend class/friend struct/ on InterpreterStateImpl - Removed pessimizing move 'detached' in torch/csrc/autograd/variable.cpp - Removed unused pack_list on Scalar Signed-off-by: Edward Z. Yang <ezyang@fb.com> GCC 4.8 buildfix Signed-off-by: Edward Z. Yang <ezyang@fb.com> Add TensorGeometry to ATen.h Signed-off-by: Edward Z. Yang <ezyang@fb.com> CUDNN_CHECK Signed-off-by: Edward Z. Yang <ezyang@fb.com> Update TODO comment Signed-off-by: Edward Z. Yang <ezyang@fb.com> Delete return in cudnn_grid_sampler Signed-off-by: Edward Z. Yang <ezyang@fb.com> s/cudnnSetStreamToCurrent/setCuDNNStreamToCurrent/g Signed-off-by: Edward Z. Yang <ezyang@fb.com> Don't allocate a new vector when filtering defined. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Remove Check overloads, convert to pass references. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Some more microbenchmarking. Signed-off-by: Edward Z. Yang <ezyang@fb.com>	2017-11-30 23:06:58 -05:00
SsnL	1661370ac5	Signal handling in DataLoader workers; Timeout option (#3474 )	2017-11-29 23:52:14 +01:00
Sam Gross	4bce69be22	Implement Variable.storage() (#3765 ) This still uses THPStorage, but avoids touching THPTensor	2017-11-20 14:18:07 -05:00
peterjc123	aa911939a3	Improve Windows Compatibility (for csrc/scripts) (#2941 )	2017-11-08 19:51:35 +01:00
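
Commit d605058212 above describes replacing Variable.volatile with a global (thread-local) flag controlled by torch.set_grad_enabled() and the context manager torch.no_grad(). The following is a minimal pure-Python sketch of that idea only, not PyTorch's actual implementation: the module-level `_state` object and the three function bodies are assumptions made for illustration, and the real flag lives in C++ behind GradMode::is_enabled() and GradMode::set_enabled().

```python
import threading
from contextlib import contextmanager

# Hypothetical stand-in for the thread-local grad-mode flag.
# This is an illustration, not code taken from PyTorch.
_state = threading.local()


def is_grad_enabled():
    # A thread that has never set the flag sees the default (enabled).
    return getattr(_state, "enabled", True)


def set_grad_enabled(mode):
    # Only affects the calling thread, because _state is thread-local.
    _state.enabled = bool(mode)


@contextmanager
def no_grad():
    # Remember the current mode, disable it for the block, then restore it,
    # even if the block raises.
    previous = is_grad_enabled()
    set_grad_enabled(False)
    try:
        yield
    finally:
        set_grad_enabled(previous)
```

Because the flag is stored per thread, entering `no_grad()` in one thread leaves other threads in their own mode, which is the property the commit message's "(thread-local)" wording points at.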

169 Commits