pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-08 07:39:33 +01:00

Author	SHA1	Message	Date
Stefan Krah	fc8834df4b	Port adaptive_max_pool3d() to ATen (#19547 ) Summary: This is the second part of #18064. Pull Request resolved: https://github.com/pytorch/pytorch/pull/19547 Differential Revision: D15046630 Pulled By: ezyang fbshipit-source-id: 03f80602b94d47bca66bfd0dcab1b7bb99e5b7f1	2019-04-23 12:51:25 -07:00
Stefan Krah	75ce5173a9	Port adaptive_max_pool2d() to ATen (#19409 ) Summary: This is the first part of #18064. Pull Request resolved: https://github.com/pytorch/pytorch/pull/19409 Differential Revision: D15037390 Pulled By: ezyang fbshipit-source-id: 16a3feed2fd9cc66033696da224a7d5fb7208534	2019-04-23 07:37:25 -07:00
Edward Yang	173f224570	Turn on F401: Unused import warning. (#18598 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18598 ghimport-source-id: c74597e5e7437e94a43c163cee0639b20d0d0c6a Stack from [ghstack](https://github.com/ezyang/ghstack): * #18598 Turn on F401: Unused import warning. This was requested by someone at Facebook; this lint is turned on for Facebook by default. "Sure, why not." I had to noqa a number of imports in __init__. Hypothetically we're supposed to use __all__ in this case, but I was too lazy to fix it. Left for future work. Be careful! flake8-2 and flake8-3 behave differently with respect to import resolution for # type: comments. flake8-3 will report an import unused; flake8-2 will not. For now, I just noqa'd all these sites. All the changes were done by hand. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Differential Revision: D14687478 fbshipit-source-id: 30d532381e914091aadfa0d2a5a89404819663e3	2019-03-30 09:01:17 -07:00
Thomas Viehmann	5360984fbd	Remove TH(CU)NN Sparse Linear (#17610 ) Summary: Sparse Linear in TH(CU)NN implements sparse linear layers without using sparse matrices. It is currently not documented in PyTorch and there is no functional or module interface. This means it is unused from a PyTorch point of view. The reason for removing it is twofold: - The module uses sort, which I would like to move to ATen. - When we implement a SparseLinear layer, we would want to do it using sparse tensors, so it's not all that useful, anyway. I checked this on slack with soumith, I hope the above is an accurate representation. All bad ideas are my own. This is part of the ongoing work to move sort/topk/mode/median/kthvalue to ATen. Pull Request resolved: https://github.com/pytorch/pytorch/pull/17610 Differential Revision: D14280663 Pulled By: gchanan fbshipit-source-id: 289231d2c20626855ce2ceecd4f204b460c32378	2019-03-01 12:36:52 -08:00
vishwakftw	8c81a72e87	Switch to CUDA implementation if batch size >= 65536 for affine_grid (#16403 ) Summary: Changelog: - Append a condition that switches to the native CUDA implementation for affine_grid Fixes #16365 Differential Revision: D13832192 Pulled By: soumith fbshipit-source-id: 3f484e6673d71e3ba7627b170cb8f1611e12b9b2	2019-01-26 11:18:57 -08:00
Shen Li	23e28efed4	Porting legacy reflection_pad2d to ATen Summary: Other changes: 1. Avoided using `THCDeviceTensor` by re-calculating the mapping from cuda (blockIdx, threadIdx) to input/output tensor index. 2. Changed Camelcase naming to underscore naming. Differential Revision: D13546803 fbshipit-source-id: 1df54f13e64934da3d803d9b6586bd5208d42d6d	2019-01-09 20:55:27 -08:00
Lin Huang	2d8b332262	Port replication_pad2d and replication_pad3d to ATen (#15538 ) Summary: port replication padding 2D and 3D from legacy TH API implementation to ATen implementation. Pull Request resolved: https://github.com/pytorch/pytorch/pull/15538 Differential Revision: D13547567 Pulled By: lhuang04 fbshipit-source-id: decfe100d9edfdcfb62f39ee23f37b6cae0d461f	2019-01-04 17:08:14 -08:00
Shen Li	279ca4acd2	Port legacy reflection_pad1d to ATen (#15480 ) Summary: 1. Avoided using `THCDeviceTensor` by re-calculating the mapping from cuda (blockIdx, threadIdx) to input/output tensor index. 2. Changed Camelcase naming to underscore naming. Profiling: Legacy: ```bash $py.test test/test_nn.py -k ReflectionPad1d -v -s .... =========== 2 passed, 1258 deselected, 800 warnings in 4.35 seconds ============ ``` Now: ```bash $py.test test/test_nn.py -k ReflectionPad1d -v -s ... =========== 2 passed, 1258 deselected, 800 warnings in 4.03 seconds ============ ``` I have two questions about the code. Any insights are appreciated. gchanan zou3519 1. I can verify that [this magic](https://github.com/pytorch/pytorch/blob/master/aten/src/THCUNN/TemporalReflectionPadding.cu#L32-L36) correctly maps output index to input index in different cases. But, I have no idea about how did you come up with this algorithm that merges three categories (in left padding, in original input, in right padding) into a single statement? 2. Why do we need [get contiguous](https://github.com/pytorch/pytorch/blob/master/aten/src/THNN/generic/TemporalReflectionPadding.c#L80) tensors when calculating forward and backward propagation? Reflection_pad2d porting will come in the next PR. Pull Request resolved: https://github.com/pytorch/pytorch/pull/15480 Differential Revision: D13544924 Pulled By: mrshenli fbshipit-source-id: 182045434f210032a82cab721a190da0cd781fbf	2019-01-03 10:30:37 -08:00
Lin Huang	b7bc49ad70	Port replication_pad1d to ATen (#15507 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/15507 Pull Request resolved: https://github.com/pytorch/pytorch/pull/15485 port replication_pad1d Reviewed By: ezyang Differential Revision: D13531920 fbshipit-source-id: dcd64ebd2c24b7431996231b8d5addfb600b1072	2018-12-24 06:34:02 -08:00
David Riazati	f3cc9b2218	Remove fully qualified weak script names (#15364 ) Summary: Cleanup to make references to `weak_script` consistent across codebase Pull Request resolved: https://github.com/pytorch/pytorch/pull/15364 Differential Revision: D13509676 Pulled By: driazati fbshipit-source-id: 93dbbbe57e9b9b6587895f3cc6fac678babd21de	2018-12-18 16:48:52 -08:00
Roy Li	e0b261a35b	Port nn fold and unfold to c++ Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14597 Reviewed By: ezyang Differential Revision: D13272227 fbshipit-source-id: 6eccab5ff5830a977398a96393b778095120edc6	2018-12-17 15:46:37 -08:00
Immanuel Alexander	64b3364209	Move adaptive avg pooling 2d to ATen native (#14714 ) Summary: adaptive_avg_pool1d, adaptive_avg_pool2d, and adaptive_avg_pool3d are neural network functions that are currently implemented in our legacy THNN (CPU) / THCUNN (CUDA) libraries. It is generally better if these live in our new library ATen, since it is more feature complete and reduces cognitive overhead. This change currently moves adaptive_avg_pool1d and adaptive_avg_pool2d to ATen. timed relevant cpu tests with this change: ``` [ialex@devgpu064.ash5 ~/pytorch] time python test/test_nn.py test_AdaptiveAvgPool1d (__main__.TestNN) test_AdaptiveAvgPool1d_cuda (__main__.TestNN) test_AdaptiveAvgPool2d_single (__main__.TestNN) test_AdaptiveAvgPool2d_single_cuda (__main__.TestNN) test_AdaptiveAvgPool2d_tuple (__main__.TestNN) test_AdaptiveAvgPool2d_tuple_cuda (__main__.TestNN) test_AdaptiveAvgPool2d_tuple_none (__main__.TestNN) test_AdaptiveAvgPool2d_tuple_none_cuda (__main__.TestNN) test_AdaptiveAvgPool3d_single (__main__.TestNN) test_AdaptiveAvgPool3d_single_cuda (__main__.TestNN) test_AdaptiveAvgPool3d_tuple (__main__.TestNN) test_AdaptiveAvgPool3d_tuple_cuda (__main__.TestNN) test_AdaptiveAvgPool3d_tuple_none (__main__.TestNN) test_AdaptiveAvgPool3d_tuple_none_cuda (__main__.TestNN) test_adaptive_log_softmax (__main__.TestNN) test_adaptive_pooling_input_size (__main__.TestNN) test_adaptive_pooling_size_none (__main__.TestNN) .s.s.s.s.s.s.s... 
---------------------------------------------------------------------- Ran 17 tests in 6.273s OK (skipped=7) real 0m7.164s user 3m1.289s sys 0m0.905s ``` compared to master: ``` [ialex@devgpu064.ash5 ~/pytorch] time python test/test_nn.py test_AdaptiveAvgPool1d (__main__.TestNN) test_AdaptiveAvgPool1d_cuda (__main__.TestNN) test_AdaptiveAvgPool2d_single (__main__.TestNN) test_AdaptiveAvgPool2d_single_cuda (__main__.TestNN) test_AdaptiveAvgPool2d_tuple (__main__.TestNN) test_AdaptiveAvgPool2d_tuple_cuda (__main__.TestNN) test_AdaptiveAvgPool2d_tuple_none (__main__.TestNN) test_AdaptiveAvgPool2d_tuple_none_cuda (__main__.TestNN) test_AdaptiveAvgPool3d_single (__main__.TestNN) test_AdaptiveAvgPool3d_single_cuda (__main__.TestNN) test_AdaptiveAvgPool3d_tuple (__main__.TestNN) test_AdaptiveAvgPool3d_tuple_cuda (__main__.TestNN) test_AdaptiveAvgPool3d_tuple_none (__main__.TestNN) test_AdaptiveAvgPool3d_tuple_none_cuda (__main__.TestNN) test_adaptive_log_softmax (__main__.TestNN) test_adaptive_pooling_input_size (__main__.TestNN) test_adaptive_pooling_size_none (__main__.TestNN) .s.s.s.s.s.s.s... 
---------------------------------------------------------------------- Ran 17 tests in 7.232s OK (skipped=7) real 0m8.065s user 3m34.714s sys 0m2.440s ``` also timed relevant cuda tests with this change: ``` [ialex@devgpu064.ash5 ~/pytorch] time python test/test_nn.py test_AdaptiveAvgPool1d (__main__.TestNN) test_AdaptiveAvgPool1d_cuda (__main__.TestNN) test_AdaptiveAvgPool2d_single (__main__.TestNN) test_AdaptiveAvgPool2d_single_cuda (__main__.TestNN) test_AdaptiveAvgPool2d_tuple (__main__.TestNN) test_AdaptiveAvgPool2d_tuple_cuda (__main__.TestNN) test_AdaptiveAvgPool2d_tuple_none (__main__.TestNN) test_AdaptiveAvgPool2d_tuple_none_cuda (__main__.TestNN) test_AdaptiveAvgPool3d_single (__main__.TestNN) test_AdaptiveAvgPool3d_single_cuda (__main__.TestNN) test_AdaptiveAvgPool3d_tuple (__main__.TestNN) test_AdaptiveAvgPool3d_tuple_cuda (__main__.TestNN) test_AdaptiveAvgPool3d_tuple_none (__main__.TestNN) test_AdaptiveAvgPool3d_tuple_none_cuda (__main__.TestNN) test_adaptive_log_softmax (__main__.TestNN) test_adaptive_pooling_input_size (__main__.TestNN) test_adaptive_pooling_size_none (__main__.TestNN) ................. 
---------------------------------------------------------------------- Ran 17 tests in 21.049s OK real 0m24.106s user 0m20.890s sys 0m4.026s ``` compared to master ``` [ialex@devgpu064.ash5 ~/pytorch] time python test/test_nn.py test_AdaptiveAvgPool1d (__main__.TestNN) test_AdaptiveAvgPool1d_cuda (__main__.TestNN) test_AdaptiveAvgPool2d_single (__main__.TestNN) test_AdaptiveAvgPool2d_single_cuda (__main__.TestNN) test_AdaptiveAvgPool2d_tuple (__main__.TestNN) test_AdaptiveAvgPool2d_tuple_cuda (__main__.TestNN) test_AdaptiveAvgPool2d_tuple_none (__main__.TestNN) test_AdaptiveAvgPool2d_tuple_none_cuda (__main__.TestNN) test_AdaptiveAvgPool3d_single (__main__.TestNN) test_AdaptiveAvgPool3d_single_cuda (__main__.TestNN) test_AdaptiveAvgPool3d_tuple (__main__.TestNN) test_AdaptiveAvgPool3d_tuple_cuda (__main__.TestNN) test_AdaptiveAvgPool3d_tuple_none (__main__.TestNN) test_AdaptiveAvgPool3d_tuple_none_cuda (__main__.TestNN) test_adaptive_log_softmax (__main__.TestNN) test_adaptive_pooling_input_size (__main__.TestNN) test_adaptive_pooling_size_none (__main__.TestNN) ................. ---------------------------------------------------------------------- Ran 17 tests in 23.021s OK real 0m27.095s user 0m20.121s sys 0m3.668s ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/14714 Differential Revision: D13384084 Pulled By: xnder fbshipit-source-id: 344442103ccbbda72d3c010d2feea00e9985d226	2018-12-12 12:25:22 -08:00
Elias Ellison	82175f31b4	Move Affine grid to C++ (#14392 ) Summary: Port AffineGrid to C++, because script does not support compiling Function classes. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14392 Differential Revision: D13219698 Pulled By: eellison fbshipit-source-id: 3ddad8a84c72010b5a6c6f7f9712be614202faa6	2018-11-27 18:38:11 -08:00
William Horton	1bec8f773b	Move ConstantPadNd into ATen (#10885 ) Summary: Addresses #9499. Completed work on the forward function, tests should be passing for that. Working on backward function now. Pull Request resolved: https://github.com/pytorch/pytorch/pull/10885 Differential Revision: D9643786 Pulled By: SsnL fbshipit-source-id: 2930d6f3d2975c45b2ba7042c55773cbdc8fa3ac	2018-10-26 15:25:27 -07:00
Adam Paszke	6d6655e6be	Port PackedSequences functions to C++ (#11224 ) Summary: zdevito Pull Request resolved: https://github.com/pytorch/pytorch/pull/11224 Differential Revision: D9652703 Pulled By: apaszke fbshipit-source-id: 558e39457e590cad07516e5bb2ecb12789564950	2018-09-05 06:35:15 -07:00
Adam Paszke	542aadd9a7	Stop using symbolic override for tracing RNNs (#10638 ) Summary: This disables the symbolic override hacks and makes tracing emit the recently added ATen ops for RNNs (`aten::lstm`, `aten::gru`, ...). I managed to reuse pretty much all of the translation code for their symbolics. zdevito Pull Request resolved: https://github.com/pytorch/pytorch/pull/10638 Differential Revision: D9385830 Pulled By: apaszke fbshipit-source-id: ff06ef7b1ae7c3b7774825e0991bc3887e1ff59b	2018-08-24 20:25:58 -07:00
Tongzhou Wang	de11a5fb28	Resubmit #8322 with scipy version check Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/10775 Differential Revision: D9458207 Pulled By: SsnL fbshipit-source-id: f2b0dbf2d236134afded9b15d8bf55ff98f50e7b	2018-08-22 13:39:49 -07:00
Adam Paszke	d35f365ad5	Remove all cuDNN specific inputs to RNN functions (#10581 ) Summary: This is still not the final PR, but it removes all blockers for actually using the RNN functions directly in the JIT. Next patch should be final, and will actually remove the symbolic_override code, and change it to proper symbolics for those ATen functions. Turns out the symbolic code can be also cleaned up a bit, and I'll do that too. zdevito ezyang colesbury (for minor DispatchStub.h) changes There was no way to handle those in the JIT for now, and they turned out to be completely unnecessary. It should make the Python and C++ module code much simpler too, since all the logic is now centralized in the native functions. The downside is that RNN modules no longer own their dropout buffers, which are shared per-device instead (with appropriate locking and synchronization). This might appear as a perf regression at first, but in reality it's highly unlikely that anyone will want to run cuDNN RNNs on the same GPU in parallel. Pull Request resolved: https://github.com/pytorch/pytorch/pull/10581 Reviewed By: colesbury Differential Revision: D9365541 Pulled By: apaszke fbshipit-source-id: 3ef8677ee5481bae60c74a9117a2508665b476b5	2018-08-17 11:09:51 -07:00
Simon Wang	a129f9ad3b	Revert D9332335: [pytorch][PR] Implements volumetric (5d) affine grid generation. Differential Revision: D9332335 Original commit changeset: 1b3a91d078ef fbshipit-source-id: 3dcce680257a6da121f5d67918ed4236e0c5bfec	2018-08-15 15:25:11 -07:00
Adam Paszke	86363e1d8e	Move RNN implementations to C++ (#10481 ) Summary: This is the first of two changes that are supposed to improve how we handle RNNs in the JIT. They still get traced as `PythonOp`s, but now it will be much easier to actually expose them to the JIT as e.g. `aten::lstm`, and ignore the Python interpreter entirely. This needs some symbolic adjustments that will be part of a second PR. Even when we fix symbolics, there will still be a bit of a problem with statefulness of the cuDNN API (we need a mutable cache for the dropout state, but our IR has no way of representing that). zdevito ezyang Pull Request resolved: https://github.com/pytorch/pytorch/pull/10481 Reviewed By: ezyang Differential Revision: D9341113 Pulled By: apaszke fbshipit-source-id: 0ae30ead72a1b12044b7c12369d11e5ca8ec30b5	2018-08-15 13:25:41 -07:00
Eli Stevens	f5a4dd89b5	Implements volumetric (5d) affine grid generation. (#8322 ) Summary: I've implemented affine grid generation for volumetric (5d) inputs. The implementation is based off of the spatial implementation, extended by one dimension. I have a few questions about my implementation vs. the existing one that I will add inline. I have some extensive test cases for the forward pass here: https://gist.github.com/elistevens/6e3bfb20d8d0652b83bd16b3e911285b However, they use `pytest.fixture` extensively, so I'm not sure the best way to incorporate them into the pytorch test suite. Suggestions? I have not tested backwards at all. Diff probably best viewed with whitespace changes ignored. Thanks for considering! Pull Request resolved: https://github.com/pytorch/pytorch/pull/8322 Differential Revision: D9332335 Pulled By: SsnL fbshipit-source-id: 1b3a91d078ef41a6d0a800514e49298fd817e4df	2018-08-15 11:02:08 -07:00
Adam Paszke	adbcb3c1dc	Move dropout and alpha dropout to ATen (#10384 ) Summary: zdevito ezyang Pull Request resolved: https://github.com/pytorch/pytorch/pull/10384 Reviewed By: ezyang Differential Revision: D9272583 Pulled By: apaszke fbshipit-source-id: ed5d37b28ce9ff25800bbaa0daf066cfbf1f9921	2018-08-10 14:55:28 -07:00
Adam Paszke	be5fb8f6fd	Move fused RNN kernels into ATen (#10305 ) Summary: As in the title. I also did a small refactor that let us lose almost 400 loc. This is a first step in moving the RNN code to C++. Pull Request resolved: https://github.com/pytorch/pytorch/pull/10305 Reviewed By: ezyang Differential Revision: D9196227 Pulled By: apaszke fbshipit-source-id: 54da905519aade29baa63ab1774a3ee1db5663ba	2018-08-10 09:12:05 -07:00
Natalia Gimelshein	5bb21493fd	add fused dropout kernels (#9666 ) Summary: While waiting for dropout to be fully ported to ATen, here's performance fix for the most common dropout case. Dropout is still in python function, I just added efficient path to it. I could not make inplace work, because generator always generates `return self` for inplace function, and I need to return both original tensor and mask, so inplace goes on the existing pass. Even with non-inplace version, since mask is now a ByteTensor, memory used is just a little larger than for inplace dropout, due to savings on mask. Once dropout is moved to aten, these kernels still can be used for efficient implementation. Pull Request resolved: https://github.com/pytorch/pytorch/pull/9666 Reviewed By: SsnL Differential Revision: D8948077 Pulled By: ezyang fbshipit-source-id: 52990ef769471d957e464af635e5f9b4e519567a	2018-08-07 13:34:53 -07:00
Sam Gross	829d763c69	Implement add, sub, mul, div using TensorIterator (#8919 ) Summary: ``` This adds TensorIterator, a helper class for computing element-wise operations that's intended to replace the CPU and CUDA apply utils functions. CPU kernels are implemented as functions that operate on strided 1-d tensors compared to CPUApplyUtils which operated individual elements. This allows the kernels to handle vectorization, while TensorIterator handles parallelization and non-coalesced dimensions. GPU kernels continue to operate on elements, but the number of specializations is reduced. The contiguous case remains the same. The non-contiguous case uses a single (reduced) shape for all operands and the fast integer division from THCIntegerDivider. To avoid extra specializations for indexing with 64-bits, large operations are split into smaller operations that can be indexed with 32-bits. Major semantic changes: - No more s_add, s_mul, s_div, or s_sub. Broadcasting is handled by TensorIterator. The autograd engine performs the reduction assuming standard broadcasting if the gradient shape does not match the expected shape. Functions that do not use standard broadcasting rules should either continue to trace the expand calls or handle the reduction in their derivative formula. - Use ONNX v7, which supports broadcasting ops. 
Performance impact: - Small increased fixed overhead (~0.5 us) - Larger overhead for wrapped numbers (~2.5 us) - No significant change for ops on contiguous tensors - Much faster worst-case performance for non-contiguous GPU tensors - Faster CPU bias addition (~2x) - Faster GPU bias addition (~30% faster) Future work: - Decrease overhead, especially for wrapping numbers in Tensors - Handle general inter-type operations - Extend to unary ops and reductions - Use buffering for compute-bound operations on non-contiguous tensors (pull in from CPUApplyUtils) ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/8919 Differential Revision: D8677600 Pulled By: colesbury fbshipit-source-id: 61bc9cc2a36931dfd00eb7153501003fe0584afd	2018-07-27 14:43:24 -07:00
Adam Paszke	aa7af94656	Make JIT tracing a thread-local property (#9414 ) Summary: As in the title. Lets us simplify a lot of code. Depends on #9363, so please review only the last commit. zdevito Pull Request resolved: https://github.com/pytorch/pytorch/pull/9414 Reviewed By: zdevito Differential Revision: D8836496 Pulled By: apaszke fbshipit-source-id: 9b3c3d1f001a9dc522f8478abc005b6b86cfa3e3	2018-07-19 19:09:39 -07:00
tippisum	5c695e3a60	Implement 2D and 3D alpha_dropout (#9073 ) Summary: It implements per-channel alpha_dropout. It also creates corresponding function classes and unifies the process of dropout and alpha_dropout. Pull Request resolved: https://github.com/pytorch/pytorch/pull/9073 Differential Revision: D8727008 Pulled By: ezyang fbshipit-source-id: 9d509f9c5db4e98f7b698cdfc4443505a4d2b331	2018-07-17 17:10:16 -07:00
Roy Li	a47a30b9ce	Implement grid_sampler in aten (#8929 ) Summary: Partially addresses #8928. Maybe #7273? Pull Request resolved: https://github.com/pytorch/pytorch/pull/8929 Reviewed By: ezyang Differential Revision: D8668919 Pulled By: li-roy fbshipit-source-id: 8ad07b224d2ab211c274c4c10f042501efaae32c	2018-07-10 15:10:24 -07:00
Wei Yang	cb1bfe91af	Deprecated several functions at torch.nn.functional (#8748 ) Summary: 1. fixes #6245 2. deprecated tanh, sigmoid Closes https://github.com/pytorch/pytorch/pull/8748 Differential Revision: D8697975 Pulled By: weiyangfb fbshipit-source-id: f30714aa0611a1fe870040692f3dbcc8238aece9	2018-07-02 15:54:46 -07:00
Peter Goldsborough	9ce15173fb	Move _cudnn_init_dropout_state to TensorOptions and enable cuDNN dropout in C++ API RNNs (#9012 ) Summary: The goal of this PR was to add support for dropout descriptors in the C++ API's RNN class. The end result is a 4x-5x speedup for our RNN integration tests since they can now use cuDNN instead of autograd when dropout is set. To achieve this, I had to move `_cudnn_init_dropout_state` to the `TensorOptions` API. I also fixed a bug around `RNN::cuda()` not flattening parameters for cuDNN. ebetica ezyang Closes https://github.com/pytorch/pytorch/pull/9012 Reviewed By: pjh5 Differential Revision: D8689786 Pulled By: goldsborough fbshipit-source-id: 44fb191f5a38e41c4ded5417306b5bbc012cd56c	2018-06-29 17:25:23 -07:00
Thomas Viehmann	9a9eadacc6	explicitly check device for grid_sampler (fixes: #8599 ) (#8646 )	2018-06-19 11:53:46 -04:00
Soumith Chintala	92f67d9404	fix lint	2018-06-15 18:18:20 -07:00
li-roy	26bed6d83e	assert limit on cudnn grid_sampler (#8576 )	2018-06-15 17:33:33 -07:00
ngimel	63ae163b24	put dropout states on the input device (#7515 ) * put dropout states on the input device * add assert to aten, add test, fix lint * only assert device if states are defined	2018-05-13 16:25:37 -04:00
Tongzhou Wang	1c01eabd3c	Codemod to update our codebase to 0.4 standard (#6641 ) * Codemod to update our codebase to 0.4 standard * Update some of the test scripts * remove Variable in test_clip_grad_value * fix _symbolic_override_wrapper_maker	2018-04-17 22:06:54 -04:00
James Reed	3b0204d43c	[JIT] Hacky: Staged symbolics for RNN nodes (#6297 ) * Staged symbolic for RNN modules * Move function to symbolic.py * Add comments, improve tests, fixup logic	2018-04-12 16:29:25 -07:00
gchanan	749d51414a	Separate cuda-ness from dtype. (#6470 ) * Separate cuda-ness from dtype. There are no longer torch.cuda.int64, etc; only torch.int64 that correspond to at::ScalarType. At the python arg parser level, the corresponding ATen type is selected from the combination of (ScalarType, Layout, Device). There is also currently unused code in here for support ScalarType in native_functions; this will be used for specifying aggregate types on reduction functions. * Fix test_autograd. * Add defaults to randint_like. * Track is_cuda in py tensor types. * Fix test_sparse. * Fix multiprocessing. * Fix rnn. * Fix test_nn. * Fix flake8.	2018-04-12 14:05:44 -04:00
Richard Zou	5d628db0a2	Deprecate ctx.saved_variables via python warning. (#5923 ) * Deprecate ctx.saved_variables via python warning. Advises replacing saved_variables with saved_tensors. Also replaces all instances of ctx.saved_variables with ctx.saved_tensors in the codebase. Test by running: ``` import torch from torch.autograd import Function class MyFunction(Function): @staticmethod def forward(ctx, tensor1, tensor2): ctx.save_for_backward(tensor1, tensor2) return tensor1 + tensor2 @staticmethod def backward(ctx, grad_output): var1, var2 = ctx.saved_variables return (grad_output, grad_output) x = torch.randn((3, 3), requires_grad=True) y = torch.randn((3, 3), requires_grad=True) model = MyFunction() model.apply(x, y).sum().backward() ``` and assert the warning shows up. * Address comments * Add deprecation test for saved_variables	2018-03-26 14:13:45 -04:00
li-roy	e4eee7c2cf	Implement MarginRankingLoss as native function and add reduce=True arg to it (#5346 ) * add reduce=True arg to MarginRankingLoss * make default margin arg match for legacy * remove accidentally added test * fix test * fix native_functions.yaml alphabetical order	2018-03-21 15:40:58 -04:00
li-roy	1dcad08537	Support N-D tensors in Bilinear (#5764 ) * support n-d inputs in bilinear and move to aten * support n-d inputs in bilinear and move to aten * add asserts to bilinear inputs * address comments * cast int64_t in asserts	2018-03-17 11:57:43 -04:00
li-roy	4c4a42b3f9	implement CosineEmbeddingLoss as a native function and add reduce arg (#5646 ) * implement CosineEmbeddingLoss as a native function and add reduce=True arg to it * fix flake8 * address comments * add reference function to tests * fix flake8	2018-03-08 17:54:24 -05:00
Edward Z. Yang	9de922991c	Revert "implement CosineEmbeddingLoss as a native function and add reduce arg" (#5640 ) * Revert "implement CosineEmbeddingLoss as a native function and add reduce arg (#5447)" This reverts commit `c16478fe3f`.	2018-03-08 14:07:17 -05:00
li-roy	c16478fe3f	implement CosineEmbeddingLoss as a native function and add reduce arg (#5447 ) forward (new) [1.1905965859768912, 1.160144692985341, 1.1558120870031416] backward (new) [1.9150976981036365, 1.9792822760064155, 1.8779143309220672] double backward (new) [3.6898688060464337, 3.5784677929477766, 3.569505032035522] forward (old) [3.2359962839400396, 3.275224728975445, 3.3409753759624436] backward (old) [5.668679727939889, 5.722980880062096, 5.585088661056943] double backward (old) N/A * implement CosineEmbeddingLoss as a native function and add reduce=True arg to it * fix flake8 * address comments * add reference function to tests * fix flake8	2018-03-08 13:15:12 -05:00
Edward Z. Yang	f064c5aa33	Expunge all occurrences of torch._C._VariableFunctions (#5525 ) Some of the call-sites now look a little hokey with this removed, saving that for another patch. Signed-off-by: Edward Z. Yang <ezyang@fb.com>	2018-03-02 12:19:44 -05:00
Edward Z. Yang	0877558e60	Port cuDNN RNN dropout state initialization to ATen and make Python c… (#5383 ) * Port cuDNN RNN dropout state initialization to ATen and make Python code use it. Fixes #5138. Signed-off-by: Edward Z. Yang <ezyang@fb.com> * Variable/Tensor bugfix Signed-off-by: Edward Z. Yang <ezyang@fb.com>	2018-03-02 10:00:00 -05:00
Soumith Chintala	36abf023bd	Added 3d grid sampler (for volumetric transformer networks) (#5453 ) * add 3d grid_sample * add cuda implementation, more testing	2018-02-28 19:32:15 -05:00
gchanan	affe742d31	Add scalar module tests for test_nn. (#5116 ) * Add scalar module tests for test_nn. * Properly return from glu. * Guard scalar test with skipIf.	2018-02-08 13:53:24 -05:00
anderspapitto	ef14590209	Support calling pack_padded_sequence with a Variable lengths (#5113 ) This was accidentally lost while addressing review comments on https://github.com/pytorch/pytorch/pull/4695 pack_padded_sequence may be called either with a list or with a Variable. If called with a list we convert to Variable internally. I added to test_nn to test the new codepath. The bug was also caught by the onnx-fb-universe tests (which rely on passing in Variable).	2018-02-07 17:11:33 -05:00
anderspapitto	b2cfd961d3	Handle sequence lengths correctly when exporting RNNs to ONNX (#4695 ) * PackedSequence: store batch_sizes as tensor rather than converting to a list of python integers. This maintains the invariant that module's inputs/outputs are collections of Variables. In particular, this causes the JIT to no longer choke when flattening and unflattening arguments. * Handle sequence lengths correctly when exporting RNNs to ONNX - when uniform sequence lengths are provided, correctly omit the argument when constructing the ONNX graph, so as to not fix the graph to the batch size. - handle PackedSequences by floating them through the graph and eliminating them in an optimization pass. ONNX does not have packed sequences, but operates on a representation equivalent to PaddedSequence, so we hide the representation-switching from ONNX - as a preliminary step towards handling PackedSequences, not directly tied to ONNX export, change batch_sizes from being an argument to the RNN operators into being an argument to the forward() function of those RNN operators. This more closely models the reality that batch_sizes are effectively part of the input sequences.	2018-02-06 21:40:27 -05:00
Sam Gross	895aebac08	Use Variable instead of Tensor in Function.forward (#4786 ) The Tensor and Variable classes are being merged. autograd.Function.forward is now called on Variables, but with "no-grad" mode (torch.no_grad()) enabled. One benefit is that we no longer have to explicitly track shared storages.	2018-02-06 17:24:27 -05:00
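
The reflection_pad1d entry above (279ca4acd2) asks how the three cases (left padding, original input, right padding) can be merged into a single index expression. The following is a minimal sketch of that idea, not code copied from the linked file: the function names are hypothetical, the padding is assumed to be non-negative and smaller than the input width, and the closed form is one expression that reproduces the piecewise mapping under those assumptions.

```python
def reflect_index_piecewise(out_x: int, pad_l: int, width: int) -> int:
    """Map an output position to an input position, one branch per region."""
    if out_x < pad_l:
        # Left padding: mirror around input index 0 (the edge is not repeated).
        return pad_l - out_x
    if out_x < pad_l + width:
        # Inside the original input: a plain shift by the left padding.
        return out_x - pad_l
    # Right padding: mirror around input index width - 1.
    return 2 * width + pad_l - 2 - out_x


def reflect_index_merged(out_x: int, pad_l: int, width: int) -> int:
    """Same mapping as a single branch-free expression.

    Each abs() term changes slope at one region boundary, so their
    difference selects the correct branch without an explicit `if`.
    """
    return (
        abs(out_x - pad_l)
        - abs(out_x - (width + pad_l - 1))
        - out_x
        + pad_l
        + width
        - 1
    )


if __name__ == "__main__":
    width, pad_l, pad_r = 4, 2, 2
    indices = [
        reflect_index_merged(o, pad_l, width)
        for o in range(pad_l + width + pad_r)
    ]
    print(indices)
```

Expanding the two absolute values in each of the three regions gives back the three branches of the piecewise version, which is why a GPU kernel can use the merged form and avoid divergent branches.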


262 Commits