Summary:
ebetica made me aware that `nn::Module::clone()` always clones to the current device (usually CPU) instead of preserving the device of each parameter. This PR changes the signature of `clone` from
`shared_ptr<Module> clone()`
to
`shared_ptr<Module> clone(optional<Device> device = nullopt)`
with the following semantics:
1. If a `device` is given, all parameters/buffers are moved to that device;
2. If no `device` is supplied (the default), each parameter/buffer retains its current device (see the sketch below).
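A minimal sketch of the new behavior (the module choice and device index are just for illustration):
```cpp
#include <torch/torch.h>

torch::nn::Linear linear(3, 4);

// Default: the clone keeps each parameter/buffer on its current device.
std::shared_ptr<torch::nn::Module> same_devices = linear->clone();

// Explicit device: every parameter/buffer of the clone ends up on CUDA:0.
std::shared_ptr<torch::nn::Module> on_gpu =
    linear->clone(torch::Device(torch::kCUDA, 0));
```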
ezyang apaszke ebetica
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9609
Differential Revision: D8957367
Pulled By: goldsborough
fbshipit-source-id: 0d409ae645ed2b8d97d6fc060240de2f3d4bc6c8
Summary:
This PR adds the functional version of `DataParallel` (i.e. `data_parallel`) to the C++ frontend.
For this, I had to:
1. Add "differentiable" versions of scatter and gather, which perform their inverse operation in the backward pass, to C++. I've added them under `torch/csrc/autograd/functions/comm.{h,cpp}`. I had to move some utilities from `VariableType.cpp` into `torch/csrc/autograd/functions/utils.h`, and changed them a bit to fix the `const_cast`s for which there were `TODO`s,
2. Implement the `replicate`, `parallel_apply` and the combining `data_parallel` functions in C++.
`replicate` is implemented based on our existing `clone()` interface, along with the ability to set the current device via `at::OptionsGuard` (so nice).
`parallel_apply` is implemented using `at::parallel_for` (CC cpuhrsch) and [follows the code from PyTorch](https://github.com/pytorch/pytorch/blob/master/torch/nn/parallel/parallel_apply.py).
Added lots of tests for these things.
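A rough usage sketch of the new functional API (the header path and defaulted arguments here are my best guess, not authoritative):
```cpp
#include <torch/nn/parallel/data_parallel.h>
#include <torch/torch.h>

torch::nn::Linear model(10, 5);
model->to(torch::Device(torch::kCUDA, 0));

auto input = torch::randn({64, 10}, torch::Device(torch::kCUDA, 0));

// Replicates the module onto the available CUDA devices, scatters `input`
// along dim 0, applies the replicas in parallel, and gathers the outputs back.
auto output = torch::nn::parallel::data_parallel(model, input);
```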
apaszke ezyang ebetica colesbury
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9234
Differential Revision: D8865182
Pulled By: goldsborough
fbshipit-source-id: 4f1fecf2b3f3bc1540c071dfb2d23dd45de433e4
Summary:
In our pimpl system, default constructing a module holder default constructs the contained module. This means `Linear linear;` is ill-formed, since `Linear` doesn't have a default constructor. Instead we require `Linear linear = nullptr;` to get the empty state of the `Linear`. This PR makes the error message for the ill-formed case nicer.
I had to change the forwarding constructors of most of our modules for this, but that's a minor adjustment.
E.g.
```
Linear linear;
In file included from /home/psag/pytorch/pytorch/torch/csrc/api/include/torch/nn/module.h:5:0,
from /home/psag/pytorch/pytorch/test/cpp/api/module.cpp:3:
/home/psag/pytorch/pytorch/torch/csrc/api/include/torch/nn/pimpl.h: In instantiation of ‘torch::nn::ModuleHolder<Contained>::ModuleHolder() [with Contained = torch::nn::LinearImpl]’:
/home/psag/pytorch/pytorch/torch/csrc/api/include/torch/nn/modules/dropout.h:45:1: required from here
/home/psag/pytorch/pytorch/torch/csrc/api/include/torch/nn/pimpl.h:46:5: error: static assertion failed: You are trying to default construct a module which has no default constructor. Use = nullptr to give it the empty state (like an empty std::shared_ptr).
static_assert(
```
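For reference, the well-formed way to defer construction now looks like this (a small sketch):
```cpp
// `Linear linear;` triggers the static_assert above. Instead:
torch::nn::Linear linear = nullptr;  // empty holder, like an empty shared_ptr
linear = torch::nn::Linear(3, 4);    // construct later, once sizes are known
```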
ebetica ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9565
Differential Revision: D8903666
Pulled By: goldsborough
fbshipit-source-id: 5e6b788921a27a44359db89afdc2b057facc5cec
Summary:
I noticed that `Sequential::clone()` does not work. This is because `Sequential` does not use `reset()`, which is normally where modules initialize and register their submodules. `Sequential` can't use `reset()` because it takes its modules directly in the constructor, which doesn't fit `reset()`'s model of "late" initialization.
I've added some better error messages inside `Cloneable::clone()` which makes this kind of mistake clearer for other users, and tests for `Sequential::clone()`.
I also had to give `AnyModule` a deep `clone()` method.
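A small sketch of what the new tests exercise (the particular submodules are just examples):
```cpp
torch::nn::Sequential sequential(
    torch::nn::Linear(3, 4),
    torch::nn::Functional(torch::relu));

// Now deep-copies the contained modules, including their parameters.
std::shared_ptr<torch::nn::Module> copy = sequential->clone();
```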
ebetica ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9372
Differential Revision: D8865189
Pulled By: goldsborough
fbshipit-source-id: b81586e0d3157cd3c4265b19ac8dd87c5d8dcf94
Summary:
In the C++ API, `Sequential` was previously not refcounted itself; it stored `shared_ptr<AnyModule>` internally to get reference semantics. This was unfortunate because most modules in the API are accessed via `->`, e.g. `Linear l(1, 2); l->forward(...);`, whereas `Sequential` had value semantics and was accessed via `.`.
This PR makes `Sequential` store `AnyModule` (without extra indirection), and uses the same pImpl mechanism we use for all other modules to make `Sequential` have reference semantics itself. This makes it consistent with the rest of the library. It also removes one level of indirection inside of `Sequential`, which is cool.
One thing I had to change: `ModuleHolder`, with which the whole pImpl mechanism is implemented, previously did some tricks to make `Linear(3, 4)` actually construct `Linear(LinearOptions(3, 4))`. This doesn't work well with `Sequential`, since it takes a variadic parameter pack. Instead, I made `ModuleHolder` forward all arguments to the underlying module, and pushed the trick of forwarding parameters to a module's options type down into the actual modules. This adds one constructor per module in the library, but it is not something user modules have to do (unless they want this nice forwarding themselves), and it makes the code simpler overall.
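Afterwards, `Sequential` reads like every other module holder (a sketch):
```cpp
torch::nn::Sequential sequential(
    torch::nn::Linear(3, 4),
    torch::nn::Functional(torch::sigmoid));

// Reference semantics via the pImpl holder: access is through `->`,
// consistent with `Linear l(1, 2); l->forward(...);`.
auto y = sequential->forward(torch::randn({2, 3}));
```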
ezyang ebetica apaszke
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9151
Reviewed By: ezyang
Differential Revision: D8809298
Pulled By: goldsborough
fbshipit-source-id: da68452c3de912fbc67af330ba93b5220de6909f
Summary:
Added a way to `dynamic_cast` an `nn::Module` to a concrete module type and get a pointer to it. `nn::Module::is<T>` just checked whether the return value of the `dynamic_cast` was `nullptr`, so I got rid of `is<T>`, since it's equivalent to `as<T>() != nullptr` (or just `as<T>()`, thanks to boolean conversion).
We're now at
```
if (auto* conv = module.as<nn::Conv2d>()) {
  conv->weight.data().normal_(0.0, 0.02);
} else if (auto* bn = module.as<nn::BatchNorm>()) {
  bn->weight.data().normal_(1.0, 0.02);
  bn->bias.data().fill_(0);
}
```
ezyang apaszke ebetica
Closes https://github.com/pytorch/pytorch/pull/9149
Differential Revision: D8735954
Pulled By: goldsborough
fbshipit-source-id: e2b8f6f0cea16a621f8bc0807a33cc7651d25154
Summary:
The goal of this PR was to add support for dropout descriptors in the C++ API's RNN class.
The end result is a 4x-5x speedup for our RNN integration tests since they can now use cuDNN instead of autograd when dropout is set.
To achieve this, I had to move `_cudnn_init_dropout_state` to the `TensorOptions` API.
I also fixed a bug around `RNN::cuda()` not flattening parameters for cuDNN.
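As a sketch, a configuration like the following now takes the cuDNN path (option and method names are best-effort reconstructions of the API at the time, not authoritative):
```cpp
// Dropout between stacked layers is what requires the cuDNN dropout descriptor.
torch::nn::LSTM lstm(
    torch::nn::LSTMOptions(/*input_size=*/128, /*hidden_size=*/64)
        .layers(2)
        .dropout(0.5));

// Moving to CUDA; the fix also makes RNN::cuda() flatten parameters for cuDNN.
lstm->to(torch::kCUDA);
```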
ebetica ezyang
Closes https://github.com/pytorch/pytorch/pull/9012
Reviewed By: pjh5
Differential Revision: D8689786
Pulled By: goldsborough
fbshipit-source-id: 44fb191f5a38e41c4ded5417306b5bbc012cd56c
Summary:
When initializing weights for my C++ model, I had to write
```cpp
void initialize_weights(nn::Module& module) {
  if (module.name().find("Conv2d") != std::string::npos) {
    module.parameters()["weight"].data().normal_(0.0, 0.02);
  } else if (module.name().find("BatchNorm") != std::string::npos) {
    auto parameters = module.parameters();
    parameters["weight"].data().normal_(1.0, 0.02);
    parameters["bias"].data().fill_(0);
  }
}
```
The string-based module determination is not very nice, and not very C++-y. So I created `nn::Module::is<T>` which does a `dynamic_cast` inside. It also handles the `ModuleHolder` vs. `Module` distinction.
It now becomes
```cpp
if (module.is<nn::Conv2d>()) {
  module.parameters()["weight"].data().normal_(0.0, 0.02);
} else if (module.is<nn::BatchNorm>()) {
  auto parameters = module.parameters();
  parameters["weight"].data().normal_(1.0, 0.02);
  parameters["bias"].data().fill_(0);
}
```
ebetica ezyang apaszke
Closes https://github.com/pytorch/pytorch/pull/8970
Differential Revision: D8677476
Pulled By: goldsborough
fbshipit-source-id: 053294e19b6a58cce868167596c89639f7de91c2
Summary:
Sets the random seed at the start of C++ tests so that everything is super deterministic.
I made sure we only generate random values from torch instead of `std::`, so that this seed always applies. I.e. I do:
```
torch::randint(2, {2}, at::kInt64)
```
instead of
```
std::rand() % 2
```
Also got rid of the tests that test the random seeding, since they would interfere here. Those tests aren't very useful anyway, since we just use ATen's seeding mechanism, which should work.
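Concretely, the setup amounts to something like this sketch (the seed value is arbitrary):
```cpp
#include <torch/torch.h>

int main() {
  // Seed torch's generators once at the start of the test binary so that
  // every torch::rand*/randint call below is deterministic across runs.
  torch::manual_seed(0);

  auto labels = torch::randint(/*high=*/2, /*size=*/{2}, torch::kInt64);
  // ... run the tests ...
}
```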
Fixes #7288, #7286, #7289
ebetica ezyang
Closes https://github.com/pytorch/pytorch/pull/8903
Differential Revision: D8667269
Pulled By: goldsborough
fbshipit-source-id: a833e86e156d5e68dae8c53a4b1c433cb0608b6c
Summary:
This PR is the final step to making `torch::` the only namespace users of the C++ API ever see. Basically, I did:
``` cpp
namespace torch {
using namespace at;
}
```
And then changed `at::` to `torch::` almost everywhere. This worked surprisingly well out of the box, so users can now write `torch::relu`, `torch::log_softmax`, and `torch::conv2d` instead of having to know when to use `at::` and when `torch::`. This makes me happy!
Another thing I did was to have `using Dtype = at::ScalarType`, which will be the eventual name anyway.
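In practice, user code now reads uniformly (a sketch):
```cpp
// Everything is reachable through torch::; no need to remember what lives in at::.
auto x = torch::randn({2, 10});
auto y = torch::log_softmax(x, /*dim=*/1);
auto z = torch::relu(y);
```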
ebetica ezyang apaszke zdevito
Closes https://github.com/pytorch/pytorch/pull/8911
Reviewed By: ezyang
Differential Revision: D8668230
Pulled By: goldsborough
fbshipit-source-id: a72ccb70fca763c396c4b0997d3c4767c8cf4fd3
* Better forward methods in C++ API
capitalize error message in test_torch.test_flatten
Support for operator()
* Add operator() to Functional
* Get rid of SigmoidLinear
* Add BoundFunction to FunctionalImpl
* Remove macro from conv because it makes errors more nasty
* Bag of fixes
* Rename tensor_range.h to tensor_list_view.h
* Post rebase fixes
* Rename torch::tensor namespace to torch::tensors due to name conflict
* Avoid recursion in Module::to
* Created TORCH_MODULE macro
Rewrote Linear
Rewrote Dropout and added default constructor to TORCH_MODULE macro
Turned TORCH_MODULE contents into a proper base class
Added some documentation
Got rid of the old Dropout module
Got rid of the old Embedding module
Got rid of the old BatchNorm module
Got rid of the old Conv module
Fixing optimizers
Rebase
Removed old RNN modules and the TORCH_ATTR macro
Removed temporary P:: namespace
Added cloning behavior to all modules
Got rid of some get() calls
self review nits
Remove noexcept from ModuleHolder methods that can throw
Remove spaces
Add missing override to reset() methods
Added examples to documentation in pimpl.h
* Post rebase fixes
* Created TensorOptions
Storing the type in TensorOptions to solve the Variable problem
Created convenience creation functions for TensorOptions and added tests
Converted zeros to TensorOptions
Converted rand to TensorOptions
Fix codegen for TensorOptions and multiple arguments
Put TensorOptions convenience functions into torch namespace too
All factory functions except *_like support TensorOptions
Integrated with recent JIT changes
Support *_like functions
Fix in place modification
Some cleanups and fixes
Support sparse_coo_tensor
Fix bug in Type.cpp
Fix .empty calls in C++ API
Fix bug in Type.cpp
Trying to fix device placement
Make AutoGPU CPU compatible
Remove some auto_gpu.h uses
Fixing some headers
Fix some remaining CUDA/AutoGPU issues
Fix some AutoGPU uses
Fixes to dispatch_tensor_conversion
Reset version of new variables to zero
Implemented parsing device strings
Random fixes to tests
Self review cleanups
flake8
Undo changes to variable.{h,cpp} because they fail on gcc7.2
Add [cuda] tag to tensor_options_cuda.cpp
Move AutoGPU::set_index_from into .cpp file because Windows is stupid and sucks
Fix linker error in AutoGPU.cpp
Fix bad merge conflict in native_functions.yaml
Fixed caffe2/contrib/aten
Fix new window functions added to TensorFactories.cpp
* Removed torch::TensorOptions
Added code to generate wrapper functions for factory methods
Add implicit constructor from Backend to TensorOptions
Remove Var() from C++ API and use torch:: functions
Use torch:: functions more subtly in C++ API
Make AutoGPU::set_device more exception safe
Check status directly in DynamicCUDAHooksInterface
Rename AutoGPU to DeviceGuard
Removed set_requires_grad from python_variables.h and warn appropriately in Variable::set_requires_grad
remove python_default_init: self.type()
Add back original factory functions, but with deprecation warnings
Disable DeviceGuard for a couple functions in ATen
Remove print statement
Fix DeviceGuard construction from undefined tensor
Fixing CUDA device compiler issues
Moved as many methods as possible into header files
Dont generate python functions for deprecated factories
Remove merge conflict artefact
Fix tensor_options_cuda.cpp
Fix set_requires_grad not being checked
Fix tensor_new.h
TEMPORARILY put some methods in .cpp files to see if it solves issues on windows and mac
Fix bug in DeviceGuard.h
Missing includes
TEMPORARILY moving a few more methods into .cpp to see if it fixes windows
Fixing linker errors
* Fix up SummaryOps to use new factories
Undo device agnostic behavior of DeviceGuard
Use -1 instead of optional for default device index
Also move DeviceGuard methods into header
Fixes around device index after optional -> int32_t switch
Fix use of DeviceGuard in new_with_tensor_copy
Fix tensor_options.cpp
* Fix Type::copy()
* Remove test_non_float_params from ONNX tests
* Set requires_grad=False in ONNX tests that use ints
* Put layout/dtype/device on Tensor
* Post merge fixes
* Change behavior of DeviceGuard to match AutoGPU
* Fix C++ API integration tests
* Fix flip functions
* Add backward() to Tensor and Variable
* Add at:: in front of Tensor
* Trying to not move optional to appease windows?
* Move implementation into cpp file
* Undo some formatting changes
* Implemented fused builder based construction mechanism
* "weights" -> "weight"
* Use int64_t instead of size_t everywhere in RNN
* Extracted Conv::ExpandingSize into its own thing
* Rename TORCH_PARAMETER to TORCH_ATTR
* Added documentation
* Fix weight names in batchnorm module
* Add name() to C++ modules
* Use RTTI to get module name by default
* Add functional.cpp to CMakeLists.txt
* Call typeid() inside name() instead of constructor
* Add tests and use default constructor