pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
Gao, Xiang	a47749cb28	Add at::one_hot (#15208 ) Summary: Closes: https://github.com/pytorch/pytorch/issues/15060 Differential Revision: D13528014 Pulled By: ezyang fbshipit-source-id: 5a18689a4c5638d92f9390c91517f741e5396293	2018-12-20 14:24:58 -08:00
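For readers unfamiliar with the encoding that `at::one_hot` produces, here is a minimal pure-Python sketch of one-hot encoding. It is not the ATen implementation. The name `one_hot_sketch` is hypothetical, and inferring the class count as `max(labels) + 1` when none is given is an assumption made for this illustration.

```python
from typing import List, Optional


def one_hot_sketch(labels: List[int], num_classes: Optional[int] = None) -> List[List[int]]:
    """Return one row per label, with a single 1 at the label's index."""
    if not labels:
        return []
    if min(labels) < 0:
        raise ValueError("class labels must be non-negative")
    if num_classes is None:
        # Assumption for this sketch: infer the width from the largest label.
        num_classes = max(labels) + 1
    if max(labels) >= num_classes:
        raise ValueError("a label is out of range for num_classes")
    return [[1 if c == label else 0 for c in range(num_classes)] for label in labels]


encoded = one_hot_sketch([0, 2, 1])
```

Each output row has length `num_classes` and sums to 1.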
Erik Brinkman	8db44eda01	Add support for batched pdist (#12302 ) Summary: This updates pdist to work for batched inputs, and updates the documentation to reflect issues raised. closes #9406 Pull Request resolved: https://github.com/pytorch/pytorch/pull/12302 Reviewed By: ezyang Differential Revision: D13528485 Pulled By: erikbrinkman fbshipit-source-id: 63d93a6e1cc95b483fb58e9ff021758b341cd4de	2018-12-20 09:41:08 -08:00
David Riazati	f3cc9b2218	Remove fully qualified weak script names (#15364 ) Summary: Cleanup to make references to `weak_script` consistent across codebase Pull Request resolved: https://github.com/pytorch/pytorch/pull/15364 Differential Revision: D13509676 Pulled By: driazati fbshipit-source-id: 93dbbbe57e9b9b6587895f3cc6fac678babd21de	2018-12-18 16:48:52 -08:00
David Riazati	3118124cd6	Add (Un)Fold modules to standard library (#14759 ) Summary: Depends on #14597 for the corresponding aten ops. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14759 Differential Revision: D13325356 Pulled By: driazati fbshipit-source-id: 99e39449c1ccfa293de05672c31a11e580bdd11f	2018-12-18 12:03:08 -08:00
Roy Li	e0b261a35b	Port nn fold and unfold to c++ Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14597 Reviewed By: ezyang Differential Revision: D13272227 fbshipit-source-id: 6eccab5ff5830a977398a96393b778095120edc6	2018-12-17 15:46:37 -08:00
David Riazati	59d71b9664	Bicubic interpolation for nn.functional.interpolate (#9849 ) Summary: Addresses #918, interpolation results should be similar to tf * Adds bicubic interpolation operator to `nn.functional.interpolate` * Corresponding test in `test_nn.py` The operator is added in legacy `TH` to be aligned with the other upsampling operators; they can be refactored/moved to ATen all at once when #10482 is resolved Pull Request resolved: https://github.com/pytorch/pytorch/pull/9849 Differential Revision: D9007525 Pulled By: driazati fbshipit-source-id: 93ef49a34ce4e5ffd4bda94cd9a6ddc939f0a4cc	2018-12-17 15:31:48 -08:00
Yuxin Wu	110ccbb689	Improve the docs of interpolate(align_corners=) (#14806 ) Summary: ailzhang Pull Request resolved: https://github.com/pytorch/pytorch/pull/14806 Reviewed By: ailzhang Differential Revision: D13366332 Pulled By: ppwwyyxx fbshipit-source-id: 08fcea95d5c86b11cdfe464fdd9daa50050871f1	2018-12-10 12:50:38 -08:00
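The `align_corners` flag changes how an output index is mapped back to a source coordinate. The sketch below shows the two mappings as commonly described for `interpolate`. It is an illustration, not the library code: the helper name `source_coordinate` is hypothetical, and no clamping of out-of-range coordinates is applied.

```python
def source_coordinate(dst_index: int, in_size: int, out_size: int, align_corners: bool) -> float:
    """Map an output index to a (fractional) input coordinate along one axis."""
    if align_corners:
        # Corner pixels of input and output are aligned exactly.
        if out_size == 1:
            return 0.0
        return dst_index * (in_size - 1) / (out_size - 1)
    # Pixels are treated as unit cells; cell centers are aligned instead.
    return (dst_index + 0.5) * in_size / out_size - 0.5


aligned = [source_coordinate(i, 2, 4, True) for i in range(4)]
unaligned = [source_coordinate(i, 2, 4, False) for i in range(4)]
```

With `align_corners=True` the first and last output samples land exactly on the first and last input samples. With `align_corners=False` the outermost samples fall slightly outside the input range and have to be handled by the interpolation kernel.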
David Riazati	a66669a110	Enable testing on Loss modules (#14778 ) Summary: This PR adds `None` buffers as parameters (similarly to #14715). It also cleans up a bunch of the `test_jit.py` tests that should be covered by `common_nn.py` and brings in `criterion_tests` to test loss functions. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14778 Differential Revision: D13330849 Pulled By: driazati fbshipit-source-id: 924cc4cf94e0dcd11e811a55222fd2ebc42a9e76	2018-12-04 18:35:10 -08:00
Ailing Zhang	ef91cfd68b	Add new reduction mode in kl_div (#14457 ) Summary: Fixes #6622 . We used to average over all elements for kl divergence, which is not aligned with its math definition. This PR corrects the default reduction behavior of KL divergence so that it now averages over batch dimension. - In KL, default behavior `reduction=mean` averages over batch dimension. While for most other loss functions, `reduction=mean` averages over all elements. - We used to support scalar tensor as well. For BC purpose, we still support it, no reduction is performed on scalar tensor. - Added a new reduction mode called `batchmean` which has the correct behavior for KL. Add a warning to make `batchmean` as default for KL instead of `mean` in next major release. - [deprecated] I chose to not add a new reduction option, since "mean over batch dimension" is kinda special, and it only makes sense in few cases like KL. We don't want to explain why there's an option "batchmean" but it's not applicable for all other functions. I'm open to discussion on this one, as I cannot think of a perfect solution for this. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14457 Differential Revision: D13236016 Pulled By: ailzhang fbshipit-source-id: 905cc7b3bfc35a11d7cf098b1ebc382170a087a7	2018-12-04 12:24:28 -08:00
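The difference between averaging over all elements and averaging over the batch dimension can be seen with a small pure-Python sketch. It is not the PyTorch implementation. The name `kl_div_sketch` is hypothetical, and the input convention (log-probabilities for the prediction, probabilities for the target) is an assumption chosen to mirror `kl_div`.

```python
import math
from typing import List


def kl_div_sketch(log_q: List[List[float]], p: List[List[float]], reduction: str = "batchmean") -> float:
    """Sum p * (log p - log q) pointwise, then reduce as requested."""
    total = 0.0
    count = 0
    for log_q_row, p_row in zip(log_q, p):
        for lq, pi in zip(log_q_row, p_row):
            if pi > 0:  # convention: 0 * log(0) contributes 0
                total += pi * (math.log(pi) - lq)
            count += 1
    if reduction == "sum":
        return total
    if reduction == "mean":
        return total / count       # divides by batch_size * num_classes
    if reduction == "batchmean":
        return total / len(p)      # divides by batch_size only
    raise ValueError("unknown reduction: " + reduction)


p = [[0.5, 0.5], [0.9, 0.1]]
log_q = [[math.log(0.25), math.log(0.75)], [math.log(0.5), math.log(0.5)]]
per_element = kl_div_sketch(log_q, p, "mean")
per_sample = kl_div_sketch(log_q, p, "batchmean")
```

With two classes, `per_sample` is twice `per_element`. Only the `batchmean` value is the average KL divergence per distribution in the batch, which is the mathematical definition the commit message refers to.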
David Riazati	a23863fd6f	Add Pooling modules to Script (#14527 ) Summary: Depends on #14584 Pull Request resolved: https://github.com/pytorch/pytorch/pull/14527 Differential Revision: D13270773 Pulled By: driazati fbshipit-source-id: e4acd43ccbce0f4b62d41c30ce8d5c721171e19a	2018-12-03 23:55:04 -08:00
David Riazati	d429e78a9a	Add fractional_max_pool2d to standard lib Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14591 Differential Revision: D13270755 Pulled By: driazati fbshipit-source-id: 138a60256795f5ef8d236c75be2cfd929059b98f	2018-12-03 23:49:38 -08:00
Elias Ellison	404ad939e5	Revert existing no_grad_embedding_renorm_ from aten (#14639 ) Summary: Remove no_grad_embedding_renorm_ from aten. Setting the derivatives of the inputs to false has different semantics from calling with no_grad(), because it will not error if an input is modified and then has its grad accessed. Instead, make a custom op, and use NoGradGuard. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14639 Differential Revision: D13285604 Pulled By: eellison fbshipit-source-id: c7d343fe8f22e369669e92799f167674f124ffe7	2018-11-30 16:57:51 -08:00
David Riazati	89c3dbcad8	Add binary cross entropy to standard lib Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14583 Differential Revision: D13269423 Pulled By: driazati fbshipit-source-id: 7cc1594d8189c3e8f2d4ce0462fdc0a03683006e	2018-11-29 22:23:13 -08:00
David Riazati	15e8bb379e	Add `List` to annotations (#14482 ) Summary: This PR adds a polyfill for `typing.List` for Python versions that don't support `typing` as a builtin. It also moves the type definitions from `annotations.py` so that they can be used in `torch.nn`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14482 Differential Revision: D13237570 Pulled By: driazati fbshipit-source-id: 6575b7025c2d98198aee3b170f9c4323ad5314bd	2018-11-29 17:23:29 -08:00
David Riazati	666d383a00	Add broadcast list default arg support (#14361 ) Summary: To convert `max_unpool` functions to weak script, this PR adds support for `T` as default arguments for `BroadcastingListN[T]`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14361 Differential Revision: D13192231 Pulled By: driazati fbshipit-source-id: a25b75a0e88ba3dfa22d6a83775e9778d735e249	2018-11-29 15:15:47 -08:00
David Riazati	9e93a02624	Use nn module tests in test_jit (#14238 ) Summary: This PR adds weak modules for all activation modules and uses `test_nn` module tests to test weak modules that have been annotated with `weak_module` and therefore are in `torch._jit_internal._weak_types` Also depends on #14379 Pull Request resolved: https://github.com/pytorch/pytorch/pull/14238 Differential Revision: D13252887 Pulled By: driazati fbshipit-source-id: e9638cf74089884a32b8f0f38396cf432c02c988	2018-11-28 23:31:25 -08:00
Elias Ellison	6d63e9dbff	Support Embedding + EmbeddingBag in Script + (Ignore flakey test) (#14509 ) Summary: Resubmitting PR #14415 The tests added for Embedding + EmbeddingBag had random numbers as input, which affected the random number generator & caused the flakey test to break. Everything but the last two commits have already been accepted Pull Request resolved: https://github.com/pytorch/pytorch/pull/14509 Differential Revision: D13247917 Pulled By: eellison fbshipit-source-id: ea6963c47f666c07687787e2fa82020cddc6aa15	2018-11-28 19:16:38 -08:00
Elias Ellison	105fa58748	pointwise_loss (#14134 ) Summary: Adding pointwise loss ops to weak_script Pull Request resolved: https://github.com/pytorch/pytorch/pull/14134 Differential Revision: D13209455 Pulled By: eellison fbshipit-source-id: 87fc0222121f34a2f4edb24c2da2a11124b097d8	2018-11-28 18:14:38 -08:00
Edward Yang	5f07b33857	Revert D13219647: [pytorch][PR] Support Embedding + EmbeddingBag in Script Differential Revision: D13219647 Original commit changeset: c90706aa6fbd fbshipit-source-id: d189e717ba0773de43d633876bc3a688830a9303	2018-11-28 13:38:58 -08:00
Elias Ellison	7749804099	Support Embedding + EmbeddingBag in Script (#14415 ) Summary: Add support for Embedding and EmbeddingBag in script. Both functions require with torch.no_grad(), which we don't have any plans to support in the near future. To work around this, I added a embedding_renorm function without derivatives. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14415 Reviewed By: wanchaol Differential Revision: D13219647 Pulled By: eellison fbshipit-source-id: c90706aa6fbd48686eb10f3efdb65844be7b8717	2018-11-28 10:52:30 -08:00
David Riazati	3d98810fbd	Revert D13192230: [pytorch][PR] [jit] Use nn module tests in test_jit Differential Revision: D13192230 Original commit changeset: 36488960b6c9 fbshipit-source-id: 63b68bd909b9ef0548f52c986c84f549aecb8909	2018-11-28 00:23:09 -08:00
David Riazati	4cdcbbf410	Use nn module tests in test_jit (#14238 ) Summary: This PR adds weak modules for all activation modules and uses `test_nn` module tests to test weak modules that have been annotated with `weak_module` and therefore are in `torch._jit_internal._weak_types` Also depends on #14379 Pull Request resolved: https://github.com/pytorch/pytorch/pull/14238 Differential Revision: D13192230 Pulled By: driazati fbshipit-source-id: 36488960b6c91448b38c0fa65422539a93af8c5e	2018-11-27 21:19:51 -08:00
David Riazati	662f66ebb9	Add poisson_nll_loss to script Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14420 Differential Revision: D13220726 Pulled By: driazati fbshipit-source-id: 6c08a0050075beafcc8ba413c9603b273870c70c	2018-11-27 19:39:16 -08:00
David Riazati	d75f751bec	Add boolean dispatch for function overloading (#14425 ) Summary: This PR allows to overload functions based on the value of a parameter (so long as it is a constant). See max_pool1d for an example usage. This is the first step in enabling the use of max_pool functions for the standard library that can return `Tensor` or `Tuple[Tensor, Tensor]` based on the `return_indices` flag. This will give the JIT identical results to the Python versions of the functions. Fixes #14081 Pull Request resolved: https://github.com/pytorch/pytorch/pull/14425 Differential Revision: D13222104 Pulled By: driazati fbshipit-source-id: 8cb676b8b13ebcec3262234698edf4a7d7dcbbe1	2018-11-27 19:36:47 -08:00
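The idea of picking an implementation from the value of a constant boolean flag can be sketched in plain Python. This is an illustration of the dispatch pattern only, not the code added by the PR. The names `boolean_dispatch_sketch` and `max_pool1d_sketch` are hypothetical, the pooling works on plain lists, and the flag is only recognized when passed by keyword.

```python
from typing import Callable, List, Tuple


def boolean_dispatch_sketch(arg_name: str, default: bool,
                            if_true: Callable, if_false: Callable) -> Callable:
    """Build a function that forwards to if_true or if_false based on a bool keyword."""
    def dispatcher(*args, **kwargs):
        flag = kwargs.get(arg_name, default)
        if not isinstance(flag, bool):
            raise TypeError(arg_name + " must be a constant bool")
        target = if_true if flag else if_false
        return target(*args, **kwargs)
    return dispatcher


def _pool_with_indices(values: List[float], kernel_size: int,
                       return_indices: bool = False) -> Tuple[List[float], List[int]]:
    # Non-overlapping windows (stride equal to kernel_size) for simplicity.
    maxima, indices = [], []
    for start in range(0, len(values) - kernel_size + 1, kernel_size):
        window = values[start:start + kernel_size]
        best = max(range(kernel_size), key=lambda i: window[i])
        maxima.append(window[best])
        indices.append(start + best)
    return maxima, indices


def _pool(values: List[float], kernel_size: int, return_indices: bool = False) -> List[float]:
    return _pool_with_indices(values, kernel_size)[0]


max_pool1d_sketch = boolean_dispatch_sketch(
    "return_indices", False, if_true=_pool_with_indices, if_false=_pool)
```

Each branch has a single, fixed return type, so a static compiler that knows the flag at compile time can choose the branch and type the call accordingly. That is the property the commit message describes for `return_indices`.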
Elias Ellison	82175f31b4	Move Affine grid to C++ (#14392 ) Summary: Port AffineGrid to C++, because script does not support compiling Function classes. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14392 Differential Revision: D13219698 Pulled By: eellison fbshipit-source-id: 3ddad8a84c72010b5a6c6f7f9712be614202faa6	2018-11-27 18:38:11 -08:00
David Riazati	1b80644b4d	Revert D13192228: [pytorch][PR] [jit] Add boolean dispatch for function overloading Differential Revision: D13192228 Original commit changeset: fce33c400c1f fbshipit-source-id: 75c9991dc7097f9513c6c89d16eff2de6e287c3b	2018-11-27 13:14:42 -08:00
David Riazati	66c8bbf021	Add boolean dispatch for function overloading (#14081 ) Summary: This PR allows to overload functions based on the value of a parameter (so long as it is a constant). See `max_pool1d` for an example usage. This is the first step in enabling the use of `max_pool` functions for the standard library that can return `Tensor` or `Tuple[Tensor, Tensor]` based on the `return_indices` flag. This will give the JIT identical results to the Python versions of the functions. Depends on #14232 for `Optional[BroadcastingList[T]]` Pull Request resolved: https://github.com/pytorch/pytorch/pull/14081 Differential Revision: D13192228 Pulled By: driazati fbshipit-source-id: fce33c400c1fd06e59747d98507c5fdcd8d4c113	2018-11-27 10:51:32 -08:00
Wanchao Liang	7fc34a4122	Convert gumbel_softmax, lp pooling weak functions and modules (#14232 ) Summary: 1. Support `Optional[BroadcastingList1[int]]` like type annotation to accept a int or a list[int] 2. Convert gumbel_softmax, lp pooling weak functions and modules Pull Request resolved: https://github.com/pytorch/pytorch/pull/14232 Differential Revision: D13164506 Pulled By: wanchaol fbshipit-source-id: 6c2a2b9a0613bfe907dbb5934122656ce2b05700	2018-11-21 23:44:24 -08:00
David Riazati	d9cdcc9a3b	Add list inequality operator (#14129 ) Summary: This PR adds `aten::neq` for list inequality comparisons and converts `nll_loss` to weak script Pull Request resolved: https://github.com/pytorch/pytorch/pull/14129 Differential Revision: D13123894 Pulled By: driazati fbshipit-source-id: 8c1edf7c163217ec00eb653f95d196db3998613f	2018-11-21 16:32:58 -08:00
David Riazati	8f20d40bb7	Allow undefined tensors as constants (#14120 ) Summary: This PR inserts `prim::None` constants for undefined tensors. This comes in the standard library if an `Optional[Tensor]` is statically determined to be `None`: ```python @torch.jit.script def fn(x=None): # type: (Optional[Tensor]) -> Tensor return torch.jit._unwrap_optional(x) @torch.jit.script def fn2(): # type: () -> Tensor return fn() ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/14120 Differential Revision: D13124625 Pulled By: driazati fbshipit-source-id: 9eaa82e478c49c503f68ed89d8c770e8273ea569	2018-11-20 16:54:27 -08:00
Wanchao Liang	d6bfc53b9e	Export BatchNorm functional and module, add necessary JIT support (#14016 ) Summary: This PR did three things: 1. It exports the BatchNorm functional and module, and rewrites some of the components to stay aligned with the currently supported JIT features 2. In the process of export, it adds the necessary compiler support for in_place op aug assign 3. It changes the test_jit behavior in add_module_test to utilize a single rng state during module initialization Pull Request resolved: https://github.com/pytorch/pytorch/pull/14016 Differential Revision: D13112064 Pulled By: wanchaol fbshipit-source-id: 31e3aee5fbb509673c781e7dbb6d8884cfa55d91	2018-11-20 14:15:06 -08:00
David Riazati	0d29846d5e	Convert more weak functions (#14003 ) Summary: Same deal as #13707 Pull Request resolved: https://github.com/pytorch/pytorch/pull/14003 Differential Revision: D13076403 Pulled By: driazati fbshipit-source-id: eb3cb3b2c31caf1de591b613bdc4c9a6ed4e1767	2018-11-15 16:45:50 -08:00
Wanchao Liang	6d094224b9	Fix optional import/export, export multi-margin-loss (#13877 ) Summary: This PR did two things: 1. it fixes the optional import/export to include any type including tensor types (previously we only supported base types), this is essential to unblock optional tensor type annotation in our test logic 2. it tries to export multi_margin_loss functional to serve as an example of optional undefined tensor use case. Pull Request resolved: https://github.com/pytorch/pytorch/pull/13877 Differential Revision: D13076090 Pulled By: wanchaol fbshipit-source-id: c9597295efc8cf4b6462f99a93709aae8dcc0df8	2018-11-15 00:45:22 -08:00
Xiang Gao	143ba72264	Move cosine_similarity to ATen (#12199 ) Summary: I'm now traveling and don't have access to a good computer to compile test by myself. Will see the outcome of CI. Pull Request resolved: https://github.com/pytorch/pytorch/pull/12199 Differential Revision: D13062326 Pulled By: nairbv fbshipit-source-id: 85873525caa94906ccaf2c739eb4cd55a72a4ffd	2018-11-14 10:41:44 -08:00
David Riazati	5163a28917	Convert more weak functions (#13707 ) Summary: Convert some more functions to match up with features added. Some conversions were unsuccessful but the type line was left in for later. Pull Request resolved: https://github.com/pytorch/pytorch/pull/13707 Differential Revision: D13030210 Pulled By: driazati fbshipit-source-id: 02d5712779b83b7f18d0d55539e336321335e0cc	2018-11-13 13:50:57 -08:00
David Riazati	0c375571f5	Support OptionalType export and type match (#13647 ) Summary: * Adds `OptionalType` support for import/export * Optionals get exported along with their contained type, i.e. 'Optional[int]' * Allows concrete types and `None` to be passed to an op that takes an optional * Converts `softmax` Pull Request resolved: https://github.com/pytorch/pytorch/pull/13647 Differential Revision: D12954672 Pulled By: driazati fbshipit-source-id: 159e9bfb7f3e398bec3912d414c393098cc7455a	2018-11-12 12:15:25 -08:00
Wanchao Liang	79ceecec8e	Optional undefined tensor support (#13650 ) Summary: This PR is a part of task to unblock standard library export. * we treat None differently from Tensor and other types, when passing None as Tensor, it's an undefined tensor rather than the None IValue. * Refine the type system so that we have correct tensor types hierarchy (Dynamic/Tensor/CompleteTensor), Dynamic should be at the top of the inheritance hierarchy. * It also tries to export bilinear as an example of undefined tensor(None) input. Pull Request resolved: https://github.com/pytorch/pytorch/pull/13650 Differential Revision: D12967026 Pulled By: wanchaol fbshipit-source-id: 6aedccc7ce2a12fadd13d9e620c03e1260103a5a	2018-11-09 11:29:57 -08:00
Dan Zheng	51f58f0990	Fix typo in CTC loss doc comments. (#13727 ) Summary: `target_lenghts` -> `target_lengths` Pull Request resolved: https://github.com/pytorch/pytorch/pull/13727 Differential Revision: D12981582 Pulled By: zou3519 fbshipit-source-id: e5e02b26cf3030a91494655ff863273333cc4133	2018-11-08 14:50:48 -08:00
David Riazati	556ff8e7b7	Add builtins for `size()` and list with defaults (#13639 ) Summary: * `aten::size()` to match `torch.Tensor.size` * `aten::list_with_default` for semantics of `torch.nn.modules.utils.list_with_default` * converts `adaptive_avg_pool2d` and `adaptive_avg_pool3d` Pull Request resolved: https://github.com/pytorch/pytorch/pull/13639 Differential Revision: D12954670 Pulled By: driazati fbshipit-source-id: 68c30af0efc02c60af5fb8c9715b2435cc01a0d9	2018-11-08 11:26:35 -08:00
David Riazati	4472ad3b2f	Move functional _Reduction to its own module (#13401 ) Summary: To support `_Reduction` in the jit this PR moves it out to a new file so that it goes through the paths for python modules in the script compiler and converts `F.ctc_loss` to weak script Depends on #13484 for saving rng state Pull Request resolved: https://github.com/pytorch/pytorch/pull/13401 Differential Revision: D12868501 Pulled By: driazati fbshipit-source-id: 23cec0fb135744578c73e31ac825e238db495d27	2018-11-08 01:04:10 -08:00
Gregory Chanan	7341ab0a33	Fix range of target examples and JIT test case for CTC loss. Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13644 Differential Revision: D12949733 Pulled By: gchanan fbshipit-source-id: 1c4cacbb6a50d5002165bdd0a7881883db5c8249	2018-11-07 07:04:31 -08:00
David Riazati	fc6a9a19ea	Add torch._C._nn built-in, more weak fns (#13322 ) Summary: This PR adds functions defined in `torch._C._nn` as builtin functions (including inplace variants). This allows for the conversion of more functions to weak script NB: many `torch.nn.functional` functions will have to be slightly rewritten to avoid early returns (as with `threshold` in this PR) Converts these functions to weak script: * `threshold` * `relu` * `hardtanh` * `relu6` * `elu` * `selu` * `celu` * `leaky_relu` * `rrelu` * `tanh` * `sigmoid` Pull Request resolved: https://github.com/pytorch/pytorch/pull/13322 Differential Revision: D12852203 Pulled By: driazati fbshipit-source-id: 220670df32cb1ff39d120bdc04aa1bd41209c809	2018-11-05 21:02:18 -08:00
David Riazati	1969898647	Convert functional dropouts to weak script (#13484 ) Summary: To convert `nn.functional.dropout` * `_VF` had to be exposed as a Python module so this PR adds a module class to forward to `torch._C._VariableFunctions` * rng state between calls in the tests needed to be made consistent Pull Request resolved: https://github.com/pytorch/pytorch/pull/13484 Differential Revision: D12929622 Pulled By: driazati fbshipit-source-id: 78b455db9c8856b94d2dda573fb7dc74d5784f56	2018-11-05 17:13:07 -08:00
Sam Gross	98f5c005da	Speed up CPU threshold and relu implementation (#13182 ) Summary: ``` The previous threshold implementation was not vectorized or parallelized. This speeds up ResNet-50 CPU inference [1] from ~88 ms to ~67 ms CPU timings: https://gist.github.com/colesbury/d0d1be6974841d62696dbde329a8fde8 1 thread (before vs. after) 10240: 17.4 us vs. 6.9 µs per loop 102400: 141 us vs. 39.8 µs per loop 16 threads (before vs. after) 10240: 17.4 us vs. 6.7 µs per loop 102400: 141 us vs. 14.3 µs per loop CUDA timings are not measurably different. [1]: compiled with MKL-DNN, 8 threads, batch norm merged into convolutions https://gist.github.com/colesbury/8a64897dae97558b3b82da665048c782 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/13182 Reviewed By: soumith Differential Revision: D12825105 Pulled By: colesbury fbshipit-source-id: 557da608ebb87db8a04adbb0d2882af4f2eb3c15	2018-11-05 12:51:29 -08:00
Tongzhou Wang	99a5d19591	Rename elementwise_mean to mean (#13419 ) Summary: Closes #12459 Pull Request resolved: https://github.com/pytorch/pytorch/pull/13419 Differential Revision: D12883299 Pulled By: SsnL fbshipit-source-id: 8b4512ff73b66fdc674412904dbb3bf497ba70a7	2018-11-01 10:31:26 -07:00
Ailing Zhang	488d393ea6	Fix pointwise loss broadcast (#12996 ) Summary: Fixes #12129 , #12327 Differential Revision: D10513781 Pulled By: ailzhang fbshipit-source-id: a210008a39ff6c3f056c9fbe3f0576cfcce638ec	2018-10-31 10:17:25 -07:00
Michael Suo	d2659f6689	fix lint Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13346 Differential Revision: D12850686 Pulled By: michaelsuo fbshipit-source-id: b7474d0a3f3347034592bef45125610c040cff6a	2018-10-30 16:22:58 -07:00
verhoek	0db505bf27	Made docstrings for Embedding more accurate. (#13310 ) Summary: Made the previous description for max_norm more precise, avoiding 'this' and describing what actually happens in the code. Pull Request resolved: https://github.com/pytorch/pytorch/pull/13310 Differential Revision: D12840813 Pulled By: SsnL fbshipit-source-id: 98090c884267a62ce93cd85da84252d46926dfa5	2018-10-30 12:25:38 -07:00
Jason Gauci	5b15a501da	Refactor & unit test feed predictor Summary: 1. Refactor DDPG predictor. Merge the critic predictor with ParametricDQNPredictor since they are the same 2. Fix bug where loss was multiplied by the batch size 3. Create DDPGFeedPredictor which uses the feed predictor output format 4. Add support for gridworld simulation memoization to DDPG. Also memoize normalization tables. Reviewed By: kittipatv Differential Revision: D10161240 fbshipit-source-id: 2813890043de1241c1fb9b9c2b6a897403f9fc12	2018-10-30 10:27:47 -07:00
William Horton	1bec8f773b	Move ConstantPadNd into ATen (#10885 ) Summary: Addresses #9499. Completed work on the forward function, tests should be passing for that. Working on backward function now. Pull Request resolved: https://github.com/pytorch/pytorch/pull/10885 Differential Revision: D9643786 Pulled By: SsnL fbshipit-source-id: 2930d6f3d2975c45b2ba7042c55773cbdc8fa3ac	2018-10-26 15:25:27 -07:00
David Riazati	14ea4bf0d1	Make 7 nn modules into weak modules (#12966 ) Summary: Depends on #12682 ([stacked diff](https://github.com/driazati/pytorch/compare/weak_mod...driazati:mod_conv1)) * Adds tests for weak module conversion that creates a `ScriptModule` that uses the weak module and checks its graph * Adds `torch._jit_internal.weak_module` tags to modules that already work * `Sigmoid` * `Tanh` * `Hardshrink` * `PReLU` * `Softsign` * `Tanhshrink` * `PairwiseDistance` Pull Request resolved: https://github.com/pytorch/pytorch/pull/12966 Differential Revision: D10559557 Pulled By: driazati fbshipit-source-id: dc4bea3aa744b3c44d4fa7dceefd97e951f824d0	2018-10-25 13:59:34 -07:00
Thomas Viehmann	dd823ccd28	small improvements to torch.nn.normalization docs (#12936 ) Summary: Based on a [discussion at the forums](https://discuss.pytorch.org/t/question-about-functional-normalize-and-torch-norm/27755), it might be worthwhile to clarify the documentation. Pull Request resolved: https://github.com/pytorch/pytorch/pull/12936 Differential Revision: D10502139 Pulled By: ezyang fbshipit-source-id: 480c3c367f8c685dcde107b3018cb4129032322d	2018-10-22 23:14:47 -07:00
David Riazati	1e8064dec0	Convert 2 nn.functional functions to weak script (#12723 ) Summary: * Moves `weak_script` annotation to `torch/_jit_internal.py` folder to resolve dependency issue between `torch.jit` and `torch.nn` * Add `torch._jit.weak_script` to `tanhshrink` and `softsign`, their tests now pass instead of giving an `unknown builtin op` error * Blacklist converted `torch.nn.functional` functions from appearing in the builtin op list if they don't actually have corresponding `aten` ops Pull Request resolved: https://github.com/pytorch/pytorch/pull/12723 Differential Revision: D10452986 Pulled By: driazati fbshipit-source-id: c7842bc2d3ba0aaf7ca6e1e228523dbed3d63c36	2018-10-21 14:09:55 -07:00
Thomas Viehmann	0521c47c91	Amend nondeterminism notes (#12217 ) Summary: include atomicAdd commentary as this is less well known There is some discussion in #12207 Unfortunately, I cannot seem to get the ..include working in `_tensor_docs.py` and `_torch_docs.py`. I could use a hint for that. Pull Request resolved: https://github.com/pytorch/pytorch/pull/12217 Differential Revision: D10419739 Pulled By: SsnL fbshipit-source-id: eecd04fb7486bd9c6ee64cd34859d61a0a97ec4e	2018-10-16 23:59:26 -07:00
Tongzhou Wang	ac994f2c78	Fix SpectralNorm with DataParallel (#12671 ) Summary: There were two problems with SN + DP: 1. In SN, the updated _u vector is saved back to module via a `setattr`. However, in DP, everything is run on a replica, so those updates are lost. 2. In DP, the buffers are broadcast via a `broadcast_coalesced`, so on replicas they are all views. Therefore, the `detach_` call won't work. Fixes are: 1. Update _u vector in-place so, by the shared storage between 1st replica and the parallelized module, the update is retained 2. Do not call `detach_`. 3. Added comments in SN about the subtlety. 4. Added a note to the DP doc on this particular behavior of DP. cc crcrpar taesung89 yaoshengfu Fixes https://github.com/pytorch/pytorch/issues/11476 Pull Request resolved: https://github.com/pytorch/pytorch/pull/12671 Differential Revision: D10410232 Pulled By: SsnL fbshipit-source-id: c447951844a30366d8c196bf9436340e88f3b6d9	2018-10-16 16:02:17 -07:00
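The first problem and its fix come down to rebinding an attribute versus mutating shared storage. The following pure-Python sketch models that distinction under loud assumptions: a list stands in for a tensor's storage, `copy.copy` stands in for creating the first replica, and `TinyModule` and both update helpers are hypothetical names, not PyTorch code.

```python
import copy
from typing import List


class TinyModule:
    """Stand-in for a module holding a power-iteration vector `u`."""
    def __init__(self) -> None:
        self.u = [1.0, 0.0]


def update_by_setattr(replica: TinyModule, new_u: List[float]) -> None:
    # Rebinds the attribute on the replica only; the original keeps its old list.
    replica.u = list(new_u)


def update_in_place(replica: TinyModule, new_u: List[float]) -> None:
    # Writes into the existing list, which the original module also references.
    replica.u[:] = new_u


module = TinyModule()

replica = copy.copy(module)          # shallow copy: replica.u is module.u
update_by_setattr(replica, [0.6, 0.8])
after_setattr = list(module.u)       # update was lost on the original

replica = copy.copy(module)
update_in_place(replica, [0.6, 0.8])
after_in_place = list(module.u)      # update is visible on the original
```

Only the in-place write reaches the original object, which matches the commit's reasoning that the update is retained through storage shared between the first replica and the parallelized module.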
Ailing Zhang	e15501fb68	fix bce_with_logits with legacy reduce (#12689 ) Summary: Fix #12624 . internal usecase of legacy `reduce`. Add test in test_nn Pull Request resolved: https://github.com/pytorch/pytorch/pull/12689 Reviewed By: ezyang Differential Revision: D10391195 Pulled By: ailzhang fbshipit-source-id: 1af2b258c4abb2b6527eaaeac63e8bf1762c66a1	2018-10-16 09:46:58 -07:00
Natalia Gimelshein	a98958d3bd	dtype option for softmax (#11719 ) Summary: Add dtype argument to softmax/log_softmax functions. Computing softmax in fp32 precision is necessary for mixed precision training, and converting output of the previous layer into fp32 and then reading it as fp32 in softmax is expensive, memory and perf-wise, this PR allows one to avoid it. For most input data/dtype combinations, input data is converted to dtype and then softmax is computed. If input data is half type and dtype is fp32, kernels with the corresponding template arguments are called. Pull Request resolved: https://github.com/pytorch/pytorch/pull/11719 Reviewed By: ezyang Differential Revision: D10175514 Pulled By: zou3519 fbshipit-source-id: 06d285af91a0b659932236d41ad63b787eeed243	2018-10-13 17:57:10 -07:00
Ailing Zhang	5317429e82	move bceWithLogits from python to Aten (#11054 ) Summary: Fixes #10648 . Perf comparison: ``` import torch import torch.nn as nn import time def bm(testsize, repeat=100, cuda=False): total_time = 0.0 pos_weight= torch.ones(testsize[1], device='cuda' if cuda else 'cpu') / testsize[1] # loss = nn.BCEWithLogitsLoss(pos_weight=pos_weight) loss = nn.BCEWithLogitsLoss() input = torch.randn(testsize, device='cuda' if cuda else 'cpu').clamp_(2.8e-2, 1 - 2.8e-2) target = torch.randn(testsize, device='cuda' if cuda else 'cpu').gt(0).float() input.requires_grad = True target.requires_grad = True for _ in range(repeat): start = time.time() l = loss(input, target) l.backward() # print(target.grad) end = time.time() total_time += end - start return total_time for cuda in [False, True]: for testsize in [(100, 100), (1000, 1000), (2000, 2000)]: # print(testsize, cuda) print('{:.5f}'.format(bm(testsize, cuda=cuda))) ``` \| \| Python CPU \| Aten CPU \| Python GPU \| Aten GPU \| ------------- \| ------------- \| ------------- \| ------------- \| ------------- \| \| (100, 100) \| 0.15813s \| 0.10890s \| 0.14601s \| 0.07070s \| \| (1000, 1000) \| 1.74051s \| 0.95038s \| 0.15158s \| 0.10153s \| \| (2000, 2000) \| 5.36515s \| 2.46996s \| 0.31322s \| 0.200941s \| Pull Request resolved: https://github.com/pytorch/pytorch/pull/11054 Differential Revision: D9728289 Pulled By: ailzhang fbshipit-source-id: b7c5bc50635f8cc63c317caa4321e32f7df860f8	2018-10-12 11:13:33 -07:00
Wei Yang	de11fe0c83	migrate PReLU to ATen (#11758 ) Summary: - fixes https://github.com/pytorch/pytorch/issues/10723 - migrate PReLU to ATen and deprecate legacy PReLU - performance: CPU with weight.numel() = 1 ``` >>> m = nn.PReLU() >>> x = torch.randn(100, 100, 100, requires_grad=True) >>> %timeit -r 100 y = m(x) 100 loops, best of 100: 9.43 ms per loop >>> y = m(x).sum() >>> %timeit -r 100 y.backward(retain_graph=True) 10 loops, best of 100: 24.4 ms per loop >>> m = nn.PReLU() >>> x = torch.randn(100, 100, 100, requires_grad=True) >>> %timeit -r 100 y = m(x) 1000 loops, best of 100: 695 µs per loop >>> y = m(x).sum() >>> %timeit -r 100 y.backward(retain_graph=True) 100 loops, best of 100: 2.47 ms per loop ``` CPU with weight.numel() = channels ``` >>> m = nn.PReLU(100) >>> x = torch.randn(100, 100, 100, requires_grad=True) >>> %timeit -r 100 y = m(x) 1000 loops, best of 100: 603 µs per loop >>> y = m(x).sum() >>> %timeit -r 100 y.backward(retain_graph=True) 100 loops, best of 100: 13.3 ms per loop >>> m = nn.PReLU(100) >>> x = torch.randn(100, 100, 100, requires_grad=True) >>> %timeit -r 100 y = m(x) 1000 loops, best of 100: 655 µs per loop >>> y = m(x).sum() >>> %timeit -r 100 y.backward(retain_graph=True) 100 loops, best of 100: 2.45 ms per loop ``` CUDA with weight.numel() = 1 ``` >>> m = nn.PReLU().cuda() >>> x = torch.randn(100, 100, 100, requires_grad=True).cuda() >>> %timeit -r 100 torch.cuda.synchronize(); y = m(x); torch.cuda.synchronize(); 10000 loops, best of 100: 187 µs per loop >>> y = m(x).sum() >>> %timeit -r 100 torch.cuda.synchronize(); y.backward(retain_graph=True); torch.cuda.synchronize(); 100 loops, best of 100: 2.01 ms per loop >>> m = nn.PReLU().cuda() >>> x = torch.randn(100, 100, 100, requires_grad=True).cuda() >>> %timeit -r 100 torch.cuda.synchronize(); y = m(x); torch.cuda.synchronize(); 1000 loops, best of 100: 195 µs per loop >>> y = m(x).sum() >>> %timeit -r 100 torch.cuda.synchronize(); y.backward(retain_graph=True); torch.cuda.synchronize(); 100 loops, best of 100: 2.28 ms per loop ``` CUDA with weight.numel() = channel ``` >>> m = nn.PReLU(100).cuda() >>> x = torch.randn(100, 100, 100, requires_grad=True).cuda() >>> %timeit -r 100 torch.cuda.synchronize(); y = m(x); torch.cuda.synchronize(); 1000 loops, best of 100: 174 µs per loop >>> y = m(x).sum() >>> %timeit -r 100 torch.cuda.synchronize(); y.backward(retain_graph=True); torch.cuda.synchronize(); 100 loops, best of 100: 2.27 ms per loop >>> m = nn.PReLU(100).cuda() >>> x = torch.randn(100, 100, 100, requires_grad=True).cuda() >>> %timeit -r 100 torch.cuda.synchronize(); y = m(x); torch.cuda.synchronize(); 10000 loops, best of 100: 181 µs per loop >>> y = m(x).sum() >>> %timeit -r 100 torch.cuda.synchronize(); y.backward(retain_graph=True); torch.cuda.synchronize(); 100 loops, best of 100: 2.26 ms per loop ``` The huge performance regression in CPU when weight.numel() = 1 is addressed by replacing at::CPU_tensor_apply* with parallelized kernels. ezyang SsnL zou3519 soumith Pull Request resolved: https://github.com/pytorch/pytorch/pull/11758 Differential Revision: D9995799 Pulled By: weiyangfb fbshipit-source-id: d289937c78075f46a54dafbde92fab0cc4b5b86e	2018-09-21 16:26:04 -07:00
Marc Ferradou	e734c94fa2	Quick update to embedding_bag doc (#11784 ) Summary: Related to #11624 adding maxes to the function def of embedding_bag. Pull Request resolved: https://github.com/pytorch/pytorch/pull/11784 Differential Revision: D9892598 Pulled By: ezyang fbshipit-source-id: e6372ccf631826ddf1e1885b2f8f75f354a36c0b	2018-09-17 23:56:05 -07:00
Gao, Xiang	513fd3dd36	Improve doc of `torch.nn.functional.pad` (#11623 ) Summary: I'm reading the doc of `torch.nn.functional.pad` and it looks a bit confusing to me. Hopefully this PR makes it clearer. Pull Request resolved: https://github.com/pytorch/pytorch/pull/11623 Differential Revision: D9818255 Pulled By: soumith fbshipit-source-id: 4f6b17b0211c6927007f44bfdf42df5f84d47536	2018-09-13 19:25:24 -07:00
Tongzhou Wang	760679352e	Move Pixel Shuffle to ATen (#9721 ) Summary: <del>#9692 </del> Pull Request resolved: https://github.com/pytorch/pytorch/pull/9721 Differential Revision: D8955829 Pulled By: SsnL fbshipit-source-id: 4f4d1c7720b6f757fbef9a10f70209ae76f61399	2018-09-13 18:25:48 -07:00
Marc Ferradou	f129da1a47	Add max to the ValueError for EmbeddingBag mode check (#11655 ) Summary: Related to #11624 Pull Request resolved: https://github.com/pytorch/pytorch/pull/11655 Differential Revision: D9815454 Pulled By: SsnL fbshipit-source-id: 8dd82e0c0aa68362e12b301e095a85af7d7fd71a	2018-09-13 14:39:40 -07:00
Roy Li	75f49befeb	move instance_norm to aten (#10792 ) Summary: This also removes the usage of torch.onnx.symbolic_override in instance_norm. Fixes #8439. Pull Request resolved: https://github.com/pytorch/pytorch/pull/10792 Differential Revision: D9800643 Pulled By: li-roy fbshipit-source-id: fa13a57de5a31fbfa2d4d02639d214c867b9e1f1	2018-09-13 12:26:22 -07:00
Rasmus Diederichsen	35348dab10	WIP: Include note on cudnn determinism in each function backed by cudnn (#11434 ) Summary: Ping ezyang This addresses your comment in #114. Strangely, when running the doc build (`make html`) none of my changes are actually showing, could you point out what I'm doing wrong? Once #11329 is merged it might make sense to link to the reproducibility note everywhere. Pull Request resolved: https://github.com/pytorch/pytorch/pull/11434 Differential Revision: D9751208 Pulled By: ezyang fbshipit-source-id: cc672472449564ff099323c39603e8ff2b2d35c9	2018-09-11 20:27:09 -07:00
Peter Goldsborough	d95fedb436	Use ATen dropout implementation in Dropout module and add FeatureDropout (#11458 ) Summary: This PR does two things: 1. Replaces the implementation of the `Dropout` module with a call to the ATen function, 2. Replaces `Dropout2d` with a new `FeatureDropout` module that shall take the place of `Dropout2d` and `Dropout3d`. I contemplated calling it `Dropout2d` and making `Dropout3d` an alias for it, but similar to our decision for `BatchNorm{1,2,3}d` (c.f. https://github.com/pytorch/pytorch/pull/9188), we can deviate from Python PyTorch in favor of the ideal-world solution, which is to have a single module, since both actually just call `feature_dropout`. I also replaced the implementation of `dropout3d` with a call to `dropout2d` in Python. The code is the same and it's easier for developers to parse than having to manually match the tokens to make sure it's really 100% the same code (which it is, if I matched the tokens correctly). ebetica ezyang SsnL Pull Request resolved: https://github.com/pytorch/pytorch/pull/11458 Differential Revision: D9756603 Pulled By: goldsborough fbshipit-source-id: fe847cd2cda2b6da8b06779255d76e32a974807c	2018-09-11 20:16:12 -07:00
Tongzhou Wang	de460c7ad3	Improvements on conv/pool/fold/stft/ParamDict docs (#11106 ) Summary: Also fixes some incorrect formula rendering. Pull Request resolved: https://github.com/pytorch/pytorch/pull/11106 Differential Revision: D9752433 Pulled By: SsnL fbshipit-source-id: 535fc8498638e8b645757fc7535d8771992b7d21	2018-09-11 08:56:21 -07:00
Wei Yang	425ea6b31e	fix doc for functional.dropout* (#10417 ) Summary: - fixes #4177 Pull Request resolved: https://github.com/pytorch/pytorch/pull/10417 Differential Revision: D9542876 Pulled By: weiyangfb fbshipit-source-id: 480ed973d1fe0364f4acb5cd596c2031895b82df	2018-09-05 17:26:00 -07:00
Erik Brinkman	611a608517	Add ATen pdist CPU kernel (#10782 ) Summary: Also add single grad whitelist to the jit test Pull Request resolved: https://github.com/pytorch/pytorch/pull/10782 Reviewed By: ezyang Differential Revision: D9583378 Pulled By: erikbrinkman fbshipit-source-id: 069e5ae68ea7f3524dec39cf1d5fe9cd53941944	2018-08-30 11:55:27 -07:00
Roy Li	f2bb9f0bb5	speed up kl div loss (#10336 ) Summary: Moved kl div loss to aten. benchmarks for 5000 iterations on input size (1000,100) New ``` cuda: forward [0.9736350309103727, 0.9922929517924786, 0.9694818360731006] input requires_grad=True: backward [0.5595634011551738, 0.558339926879853, 0.5546616851352155] double backward [1.2445648494176567, 1.2245905152522027, 1.2349751549772918] target requires_grad=True: backward (new C++) [0.9489959231577814, 0.9553070571273565, 0.9556351029314101] double backward (new C++) [1.8184774098917842, 1.8164670099504292, 1.845708406995982] cpu: forward (new C++) [7.892430987209082, 8.3068826389499, 7.985283812973648] input requires_grad=True: backward (new C++) [4.328460982069373, 4.45323242014274, 4.27946363389492] double backward (new C++) [5.153504415880889, 4.629372010007501, 4.712803596165031] target requires_grad=True: backward (new C++) [3.4181493939831853, 3.3771288259886205, 3.7086612950079143] double backward (new C++) [0.21922698011621833, 0.1858532396145165, 0.19477044604718685] ``` Old ``` cuda: forward [3.101281268056482, 3.068499860819429, 3.0527669726870954] input requires_grad=True: backward [0.5650290949270129, 0.5730433077551425, 0.5588279226794839] double backward [1.1287697306834161, 1.13834543293342, 1.1298578432761133] target requires_grad=True: backward [0.9470391101203859, 0.9560198178514838, 0.9750375030562282] double backward [1.85760727385059, 1.7989214668050408, 1.788982989732176] cpu: forward (new C++) [12.474591840058565, 12.511441555805504, 12.666544185951352] input requires_grad=True: backward (new C++) [7.660991386976093, 7.449987292289734, 7.513917901087552] double backward (new C++) [4.073225498665124, 4.264980792999268, 4.429787891916931] target requires_grad=True: backward (new C++) [3.448499082121998, 3.9072313378565013, 3.2433970272541046] double backward (new C++) [2.126378359273076, 1.9045450473204255, 1.7932004742324352] ``` Pull Request resolved: 
https://github.com/pytorch/pytorch/pull/10336 Differential Revision: D9213636 Pulled By: li-roy fbshipit-source-id: 27cc530f6276f58d35dc7a1d56dfc758a0fc4a7b	2018-08-27 16:10:59 -07:00
Tongzhou Wang	d043f83019	Add tests for Tensor.* nn.* F.* docs (#10311 ) Summary: Test only for existence for now. I had to skip a lot of them, so there is a FIXME in the test. Also, I'm not testing torch.* because of a namespace issue. Pull Request resolved: https://github.com/pytorch/pytorch/pull/10311 Differential Revision: D9196341 Pulled By: SsnL fbshipit-source-id: 9c2ca1ffe660bc1cc664474993f8a21198525ccc	2018-08-14 11:39:46 -07:00
Adam Paszke	adbcb3c1dc	Move dropout and alpha dropout to ATen (#10384 ) Summary: zdevito ezyang Pull Request resolved: https://github.com/pytorch/pytorch/pull/10384 Reviewed By: ezyang Differential Revision: D9272583 Pulled By: apaszke fbshipit-source-id: ed5d37b28ce9ff25800bbaa0daf066cfbf1f9921	2018-08-10 14:55:28 -07:00
Tongzhou Wang	6a55238a3f	Grid sampler: nearest interpolation & reflection padding (#10051 ) Summary: closes #9702 . cc jph00 Commit structure: 1. Change the index calculation logic. I will explain using 1-D for simplicity. Previously we have (in pseudo code): ``` // 1. get the float locations from grid scalar_t x = from_grid() // 2. find the integral surrounding indices int x_left = floor(x) int x_right = x_left + 1 // 3. calculate the linear interpolate weights scalar_t w_left = x_right - x scalar_t w_right = x - x_left // 4. manipulate the integral surrounding indices if needed // (e.g., clip for border padding_mode) x_left = manipulate(x_left, padding_mode) x_right = manipulate(x_right, padding_mode) // 5. interpolate output_val = interpolate(w_left, w_right, x_left, x_right) ``` This is actually incorrect (and also unintuitive) because it calculates the weights before manipulating out-of-boundary indices. Fortunately, this doesn't manifest in either of the currently supported modes, `'zeros'` and `'border'` padding: + `'zeros'`: doesn't clip + `'border'`: clips, but for out-of-bound `x` both `x_left` and `x_right` are clipped to the same value, so weights don't matter But this is a problem with reflection padding, since after each time we reflect, the values of `w_left` and `w_right` should be swapped. So in this commit I change the algorithm to (numbers corresponding to the ordering in the above pseudo-code) ``` 1. get float location 4. clip the float location 2. find the integral surrounding indices 3. calculate the linear interpolate weights ``` In the backward, because of this change, I need to add new variables to track `d manipulate_output / d manipulate_input`, which is basically a multiplier on the gradient calculated for `grid`. From benchmarking, this addition doesn't cause obvious slowdowns. 2. Implement reflection padding. The indices are reflected repeatedly until they fall within the boundary. 
Added variants of `clip_coordinates` and `reflect_coordinates` to be used in backward. E.g., ```cpp // clip_coordinates_set_grad works similarly to clip_coordinates except that // it also returns the `d output / d input` via pointer argument `grad_in`. // This is useful in the backward pass of grid_sampler. scalar_t clip_coordinates_set_grad(scalar_t in, int64_t clip_limit, scalar_t *grad_in) ``` For example, if `in` is clipped in `'border'` mode, `grad_in` is set to `0`. If `in` is reflected an odd number of times in `'reflection'` mode, `grad_in` is set to `-1`. 3. Implement nearest interpolation. 4. Add test cases 5. Add better input checking Discussed with goldsborough for moving `operator<<` of `at::Device`, `at::DeviceType` and `at::Layout` into `at` namespace. (Otherwise `AT_CHECK` can't find them.) 6. Support empty tensors. cc gchanan + Make empty tensors not acceptable by cudnn. + Add `AT_ASSERT(kernel block size > 0)` if using `GET_BLOCKS` + Cache `numel` in `TensorGeometry` I was going to use `numel` to test if cudnn descriptor should accept a tensor, but it ended up unused. I can revert this if needed. 7. Add more test cases, including on input checking and empty tensors 8. Remove an obsolete comment 9. Update docs. Manually tested by generating docs. Pull Request resolved: https://github.com/pytorch/pytorch/pull/10051 Differential Revision: D9123950 Pulled By: SsnL fbshipit-source-id: ac3b4a0a36b39b5d02e83666cc6730111ce216f6	2018-08-10 12:43:27 -07:00
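The reordered index calculation described in the grid sampler commit above (transform the float location first, then derive indices and weights) can be sketched in 1-D. This is a minimal pure-Python illustration, not the ATen kernel: the names `clip_coordinate`, `reflect_coordinate`, and `sample_1d` are hypothetical, the reflection convention (mirroring about index `0` and index `size - 1`) is an assumption, and `'zeros'` padding is left out for brevity.

```python
import math


def clip_coordinate(x, size):
    """Clamp a float location into [0, size - 1] ('border' padding)."""
    return min(max(x, 0.0), size - 1.0)


def reflect_coordinate(x, size):
    """Mirror a float location about 0 and size - 1 until it is in range.

    Assumed convention for illustration only.
    """
    span = size - 1.0
    if span == 0.0:
        return 0.0
    x = abs(x)                      # reflect about index 0
    flips, extra = divmod(x, span)  # how many times we cross the far edge
    return extra if int(flips) % 2 == 0 else span - extra


def sample_1d(values, x, padding_mode="border"):
    """Linear interpolation with the coordinate handled BEFORE the weights."""
    size = len(values)
    # Step 1 + 4: get the float location, then clip/reflect it.
    if padding_mode == "border":
        x = clip_coordinate(x, size)
    elif padding_mode == "reflection":
        x = reflect_coordinate(x, size)
    else:
        raise ValueError("unsupported padding_mode: %r" % padding_mode)
    # Step 2: integral surrounding indices of the transformed location.
    x_left = int(math.floor(x))
    x_right = min(x_left + 1, size - 1)
    # Step 3: weights computed from the transformed location.
    w_right = x - x_left
    w_left = 1.0 - w_right
    return w_left * values[x_left] + w_right * values[x_right]
```

Because the weights come from the already reflected location, they are consistent with the reflected indices without any explicit swap of `w_left` and `w_right`.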
Wei Yang	149d4f776b	use logsigmoid at multilabel_soft_margin_loss, and change output from shape=(N, C)to (N,) (#9965 ) Summary: - fixes #9141, #9301 - use logsigmoid at multilabel_soft_margin_loss to make it more stable (NOT fixing legacy MultiLabelSoftMarginCriterion) - return (N) instead of (N, C) to match the same behavior as MultiMarginLoss - Note that with this PR, the following behavior is expected: ``` loss = F.multilabel_soft_margin_loss(outputs, labels, reduction='none') loss_mean = F.multilabel_soft_margin_loss(outputs, labels, reduction='elementwise_mean') loss_sum = F.multilabel_soft_margin_loss(outputs, labels, reduction='sum') loss.sum() == loss_sum # True loss.mean() == loss_mean # True ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/9965 Differential Revision: D9038402 Pulled By: weiyangfb fbshipit-source-id: 0fa94c7b3cd370ea62bd6333f1a0e9bd0b8ccbb9	2018-08-03 17:54:19 -07:00
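The reduction behavior listed in the `multilabel_soft_margin_loss` commit above can be illustrated with a small sketch. This is a plain-Python stand-in that assumes the per-sample loss is the mean over classes of the logsigmoid-based binary terms; the function names are hypothetical and it does not call PyTorch.

```python
import math


def logsigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    return min(x, 0.0) - math.log1p(math.exp(-abs(x)))


def multilabel_soft_margin(outputs, labels, reduction="none"):
    """Return one loss per sample (shape (N,)), optionally reduced."""
    per_sample = []
    for row, target in zip(outputs, labels):
        total = sum(
            y * logsigmoid(x) + (1 - y) * logsigmoid(-x)
            for x, y in zip(row, target)
        )
        per_sample.append(-total / len(row))  # average over the C classes
    if reduction == "none":
        return per_sample
    if reduction == "sum":
        return sum(per_sample)
    if reduction == "elementwise_mean":
        return sum(per_sample) / len(per_sample)
    raise ValueError("unknown reduction: %r" % reduction)
```

With `reduction="none"` the result has one entry per sample, so summing or averaging it by hand matches the `'sum'` and `'elementwise_mean'` results, as the commit message states.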
Rob Kunkle	6e85112f12	Adding katex rendering of equations, and required edits to equations. (#8848 ) Summary: This fixes issue #8529. - Adds Katex extension to conf.py and requirements.txt - Fixes syntax differences in docs - Should allow documentation pages to render faster Pull Request resolved: https://github.com/pytorch/pytorch/pull/8848 Reviewed By: soumith Differential Revision: D8677702 Pulled By: goodlux fbshipit-source-id: c4a832c5879e0eebcb14763b35a41663331ba23f	2018-08-02 12:25:17 -07:00
Xiang Gao	6fc75eadf0	Add CELU activation to pytorch (#8551 ) Summary: Also fuse input scale multiplication into ELU Paper: https://arxiv.org/pdf/1704.07483.pdf Pull Request resolved: https://github.com/pytorch/pytorch/pull/8551 Differential Revision: D9088477 Pulled By: SsnL fbshipit-source-id: 877771bee251b27154058f2b67d747c9812c696b	2018-08-01 07:54:44 -07:00
Kyle M. Tarplee	aae37324cc	fixed a newly introduced regression in softmax (#10066 ) Summary: There is a regression in softmin in 0.4.1 that was not present in 0.4.0. The behavior of softmin(x) should match softmax(-x); however, it is instead implemented (in v0.4.1) as -softmax(x). These are not the same. The fix is trivial because the bug is due to operator precedence. This is a major regression that broke my training. I'm not sure how a unit test did not catch this. ``` x = torch.tensor([1, 2, 3.5, 4]) print(F.softmin(x, dim=0)) # this has the wrong output in 0.4.1 but correct in 0.4.0 print(F.softmax(-x, dim=0)) # this is what softmin should be print(F.softmax(x, dim=0)) print(-F.softmax(x, dim=0)) # this is how softmin is implemented incorrectly ``` In 0.4.1 this produces tensor([-0.0278, -0.0755, -0.3385, -0.5581]) tensor([0.6668, 0.2453, 0.0547, 0.0332]) tensor([0.0278, 0.0755, 0.3385, 0.5581]) tensor([-0.0278, -0.0755, -0.3385, -0.5581]) In 0.4.0 this produces the correct values tensor([ 0.6668, 0.2453, 0.0547, 0.0332]) tensor([ 0.6668, 0.2453, 0.0547, 0.0332]) tensor([ 0.0278, 0.0755, 0.3385, 0.5581]) tensor([-0.0278, -0.0755, -0.3385, -0.5581]) Pull Request resolved: https://github.com/pytorch/pytorch/pull/10066 Differential Revision: D9106995 Pulled By: soumith fbshipit-source-id: 7332503c6077e8461ad6cd72422c749cf6ca595b	2018-07-31 19:28:30 -07:00
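The difference between the intended and the regressed behavior in the commit above can be reproduced without PyTorch. The sketch below uses hypothetical pure-Python helpers `softmax` and `softmin` on the same input as the commit message; it illustrates the math only and is not the library code.

```python
import math


def softmax(xs):
    """Softmax over a list of floats, shifted by the max for stability."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def softmin(xs):
    """Intended definition: softmin(x) == softmax(-x)."""
    return softmax([-x for x in xs])


x = [1.0, 2.0, 3.5, 4.0]
correct = softmin(x)                  # softmax(-x): a valid distribution
regressed = [-p for p in softmax(x)]  # -softmax(x): all entries negative
```

Here `correct` sums to 1 and agrees with the 0.4.0 values quoted above to four decimal places, while `regressed` reproduces the negative values reported for 0.4.1.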
Roy Li	2422801625	fix _pointwise_loss for target gradients (#10018 ) Summary: _pointwise loss has some python special casing, we converted reduction to aten enums too early. fixes #10009 Pull Request resolved: https://github.com/pytorch/pytorch/pull/10018 Differential Revision: D9075489 Pulled By: li-roy fbshipit-source-id: 4bf2f5e2911e757602c699ee1ec58223c61d0162	2018-07-31 13:39:58 -07:00
Thomas Viehmann	685224aa14	Add CTC loss (#9628 ) Summary: The CPU and CUDA variants are a direct transposition of Graves et al.'s description of the algorithm with the modification that it is in log space. There is also a binding for the (much faster) CuDNN implementation. This could eventually fix #3420 I still need to add tests (TestNN seems much more elaborate than the other testing) and fix the bugs that invariably turn up during the testing. Also, I want to add some more code comments. I could use feedback on all sorts of things, including: - Type handling (cuda vs. cpu for the int tensors, dtype for the int tensors) - Input convention. I use log probs because that is what the gradients are for. - Launch parameters for the kernels - Errors and omissions and anything else I'm not even aware of. Thank you for looking! In terms of performance it looks like it is superficially comparable to WarpCTC (and thus, but I have not systematically investigated this). I have read CuDNN is much faster than implementations because it does not use log-space, but also the gathering step is much much faster (but I avoided trying tricky things, it seems to contribute to warpctc's fragility). I might think some more which existing torch function (scatter or index..) I could learn from for that step. Average timings for the kernels from nvprof for some size: ``` CuDNN: 60.464us compute_alphas_and_betas 16.755us compute_grads_deterministic Cuda: 121.06us ctc_loss_backward_collect_gpu_kernel (= grads) 109.88us ctc_loss_gpu_kernel (= alphas) 98.517us ctc_loss_backward_betas_gpu_kernel (= betas) WarpCTC: 299.74us compute_betas_and_grad_kernel 66.977us compute_alpha_kernel ``` Of course, I still have the (silly) outer blocks loop rather than computing consecutive `s` in each thread which I might change, and there are a few other things where one could look for better implementations. 
Finally, it might not be unreasonable to start with these implementations, as the performance of the loss has to be seen in the context of the entire training computation, so this would likely dilute the relative speedup considerably. My performance measuring testing script: ``` import timeit import sys import torch num_labels = 10 target_length = 30 input_length = 50 eps = 1e-5 BLANK = 0#num_labels batch_size = 16 torch.manual_seed(5) activations = torch.randn(input_length, batch_size, num_labels + 1) log_probs = torch.log_softmax(activations, 2) probs = torch.exp(log_probs) targets = torch.randint(1, num_labels+1, (batch_size * target_length,), dtype=torch.long) targets_2d = targets.view(batch_size, target_length) target_lengths = torch.tensor(batch_size * [target_length]) input_lengths = torch.tensor(batch_size * [input_length]) activations = log_probs.detach() def time_cuda_ctc_loss(grout, args): torch.cuda.synchronize() culo, culog_alpha = torch._ctc_loss(*args) g, = torch.autograd.grad(culo, args[0], grout) torch.cuda.synchronize() def time_cudnn_ctc_loss(grout, args): torch.cuda.synchronize() culo, cugra= torch._cudnn_ctc_loss(*args) g, = torch.autograd.grad(culo, args[0], grout) torch.cuda.synchronize() def time_warp_ctc_loss(grout, args): torch.cuda.synchronize() culo = warpctc.ctc_loss(*args, blank_label=BLANK, size_average=False, length_average=False, reduce=False) g, = torch.autograd.grad(culo, args[0], grout) torch.cuda.synchronize() if sys.argv[1] == 'cuda': lpcu = log_probs.float().cuda().detach().requires_grad_() args = [lpcu, targets_2d.cuda(), input_lengths.cuda(), target_lengths.cuda(), BLANK] grout = lpcu.new_ones((batch_size,)) torch.cuda.synchronize() print(timeit.repeat("time_cuda_ctc_loss(grout, args)", number=1000, globals=globals())) elif sys.argv[1] == 'cudnn': lpcu = log_probs.float().cuda().detach().requires_grad_() args = [lpcu, targets.int(), input_lengths.int(), target_lengths.int(), BLANK, True] grout = lpcu.new_ones((batch_size,)) 
torch.cuda.synchronize() print(timeit.repeat("time_cudnn_ctc_loss(grout, args)", number=1000, globals=globals())) elif sys.argv[1] == 'warpctc': import warpctc activations = activations.cuda().detach().requires_grad_() args = [activations, input_lengths.int(), targets.int(), target_lengths.int()] grout = activations.new_ones((batch_size,), device='cpu') torch.cuda.synchronize() print(timeit.repeat("time_warp_ctc_loss(grout, args)", number=1000, globals=globals())) ``` I'll also link to a notebook that I used for writing up the algorithm in simple form and then test the implementations against it. Pull Request resolved: https://github.com/pytorch/pytorch/pull/9628 Differential Revision: D8952453 Pulled By: ezyang fbshipit-source-id: 18e073f40c2d01a7c96c1cdd41f6c70a06e35860	2018-07-31 11:09:48 -07:00
Adam Paszke	aa7af94656	Make JIT tracing a thread-local property (#9414 ) Summary: As in the title. Lets us simplify a lot of code. Depends on #9363, so please review only the last commit. zdevito Pull Request resolved: https://github.com/pytorch/pytorch/pull/9414 Reviewed By: zdevito Differential Revision: D8836496 Pulled By: apaszke fbshipit-source-id: 9b3c3d1f001a9dc522f8478abc005b6b86cfa3e3	2018-07-19 19:09:39 -07:00
tippisum	5c695e3a60	Implement 2D and 3D alpha_dropout (#9073 ) Summary: It implements per-channel alpha_dropout. It also creates corresponding function classes and unifies the process of dropout and alpha_dropout. Pull Request resolved: https://github.com/pytorch/pytorch/pull/9073 Differential Revision: D8727008 Pulled By: ezyang fbshipit-source-id: 9d509f9c5db4e98f7b698cdfc4443505a4d2b331	2018-07-17 17:10:16 -07:00
Roy Li	a47a30b9ce	Implement grid_sampler in aten (#8929 ) Summary: Partially addresses #8928. Maybe #7273? Pull Request resolved: https://github.com/pytorch/pytorch/pull/8929 Reviewed By: ezyang Differential Revision: D8668919 Pulled By: li-roy fbshipit-source-id: 8ad07b224d2ab211c274c4c10f042501efaae32c	2018-07-10 15:10:24 -07:00
Tongzhou Wang	e8536c08a1	Update extension docs, fix Fold/Unfold docs (#9239 ) Summary: Commits: 1. In extension doc, get rid of all references of `Variable` s (Closes #6947 ) + also add minor improvements + also added a section with links to cpp extension :) goldsborough + removed mentions of `autograd.Function.requires_grad` as it's not used anywhere and hardcoded to `return_Py_True`. 2. Fix several sphinx warnings 3. Change `*` in equations in `module/conv.py` to `\times` 4. Fix docs for `Fold` and `Unfold`. + Added better shape check for `Fold` (it previously may give bogus result when there are not enough blocks). Added test for the checks. 5. Fix doc saying `trtrs` not available for CUDA (#9247 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/9239 Reviewed By: soumith Differential Revision: D8762492 Pulled By: SsnL fbshipit-source-id: 13cd91128981a94493d5efdf250c40465f84346a	2018-07-08 19:09:39 -07:00
Ailing Zhang	227c8f2654	Implement nn.functional.interpolate based on upsample. (#8591 ) Summary: This PR addresses #5823. * fix docstring: upsample doesn't support LongTensor * Enable float scale up & down sampling for linear/bilinear/trilinear modes. (following SsnL 's commit) * Enable float scale up & down sampling for nearest mode. Note that our implementation is slightly different from TF that there's actually no "align_corners" concept in this mode. * Add a new interpolate function API to replace upsample. Add deprecate warning for upsample. * Add an area mode which is essentially Adaptive_average_pooling into resize_image. * Add test cases for interpolate in test_nn.py * Add a few comments to help understand linear interpolation code. There is only "cubic" mode missing in resize_images API which is pretty useful in practice. And it's labeled as hackamonth here #1552. I discussed with SsnL that we probably want to implement all new ops in ATen instead of THNN/THCUNN. Depending on the priority, I could either put it in my queue or leave it for a HAMer. After the change, the files named as Upsampling.c works for both up/down sampling. I could rename the files if needed. Differential Revision: D8729635 Pulled By: ailzhang fbshipit-source-id: a98dc5e1f587fce17606b5764db695366a6bb56b	2018-07-06 15:28:11 -07:00
Tongzhou Wang	7b25cbbef9	Test nn.Module on non-contiguous inputs (#9114 ) Summary: 1. Let `ModuleTest` raise when they fail on non-contiguous inputs. Fix legacy modules. 2. Fix BN (both THNN and cuDNN) not working on non-contiguous inputs. 3. Fix CUDA EmbeddingBag not working on non-contiguous inputs. To prevent calling `.contiguous()` in both `forward` and `backward`, a. prefix all current `embedding_bag` functions with `_`, indicating that they require input to be contiguous (there is a check in each function). b. create `embedding_bag`, which makes input arguments `.contiguous()`, and calls `_embedding_bag` 4. Make many ATen `embedding` functions work on non-contiguous inputs so we don't need to call `input = input.contiguous()` in Python `nn.functional.embedding`. 5. Fix dense-sparse addition when the sparse input is not coalesced and indices or values tensor is not contiguous. This came up in the test cases of Embedding modules with `sparse=True`. Added tests. 6. Update `TensorUtils.cpp` to use `AT_` macros. Request: review from cpuhrsch on the `Embedding` changes. review from ezyang on ATen sparse & BN changes. Closes https://github.com/pytorch/pytorch/pull/9114 Differential Revision: D8717299 Pulled By: SsnL fbshipit-source-id: 0acc6f1c9522b5b605361e75112c16bbe1e98527	2018-07-05 21:09:34 -07:00
Roy Li	21c786071b	update nn loss tests to use new reduction arg (#9118 ) Summary: The tests were using the old args, which caused them to emit a lot of deprecation warnings. closes #9103. Reviewed By: ezyang Differential Revision: D8720581 Pulled By: li-roy fbshipit-source-id: 3b79527f6fe862fb48b99a6394e8d7b89fc7a8c8	2018-07-02 19:41:57 -07:00
Wei Yang	cb1bfe91af	Deprecated several functions at torch.nn.functional (#8748 ) Summary: 1. fixes #6245 2. deprecated tanh, sigmoid Closes https://github.com/pytorch/pytorch/pull/8748 Differential Revision: D8697975 Pulled By: weiyangfb fbshipit-source-id: f30714aa0611a1fe870040692f3dbcc8238aece9	2018-07-02 15:54:46 -07:00
Roy Li	c61f0217a5	combine size_average and reduce args in loss functions (#8018 ) Summary: closes #7929 Closes https://github.com/pytorch/pytorch/pull/8018 Differential Revision: D8682540 Pulled By: li-roy fbshipit-source-id: 649170dd1a7f373151c1d4e949838bd1c5651936	2018-07-01 05:39:00 -07:00
Peter Goldsborough	f0772c0ab2	Replace max_pool with max_pool_with_indices (#8946 ) Summary: Re-push from https://github.com/pytorch/pytorch/pull/8892 Closes https://github.com/pytorch/pytorch/pull/8946 Differential Revision: D8666862 Pulled By: goldsborough fbshipit-source-id: 44cd3d63d347316818a7b0f5f89fce8ff7486736	2018-06-28 16:10:08 -07:00
Orion Reblitz-Richardson	9ec0a2aef4	fbshipit-source-id: ba600fcd2b5cefc7621357bdeb05e24cea02e5af	2018-06-27 04:50:56 -07:00
Peter Goldsborough	290d20b094	Replace max_pool with max_pool_with_indices (#8892 ) * Create max_poolXd_with_indices * Match ATen names in ONNX symbolic	2018-06-26 17:09:30 -07:00
Vadim Velikodniy	6e28d4d364	Add pos_weight argument to nn.BCEWithLogitsLoss (#5660 ) (#6856 ) * Add pos_weight argument to nn.BCEWithLogitsLoss and F.binary_cross_entropy_with_logits (#5660) - Add an option to control precision/recall in imbalanced datasets - Add tests (but new_criterion_tests) * Move pos_weight to the end of args list in the documentation. `pos_weight` was moved to the end because it is the last argument in both `nn.BCEWithLogitsLoss` and `binary_cross_entropy_with_logits`	2018-06-26 12:31:07 -04:00
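To make the effect of `pos_weight` in the commit above concrete, here is a minimal scalar sketch of binary cross entropy on logits with a positive-class weight. It assumes the weight multiplies only the positive-target term; the function name `bce_with_logits` is hypothetical and the code is plain Python rather than the PyTorch implementation.

```python
import math


def logsigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    return min(x, 0.0) - math.log1p(math.exp(-abs(x)))


def bce_with_logits(x, y, pos_weight=1.0):
    """Loss for a single logit x and target y in {0, 1}.

    pos_weight > 1 penalizes missed positives more heavily (favoring recall);
    pos_weight < 1 does the opposite (favoring precision).
    """
    return -(pos_weight * y * logsigmoid(x) + (1 - y) * logsigmoid(-x))
```

For a logit of `0`, a positive target costs `pos_weight * log(2)` while a negative target still costs `log(2)`, which is how the argument shifts the balance on imbalanced datasets.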
Peter Goldsborough	8e98a1a84d	Create avg_pool1d in ATen (#8880 ) * Create avg_pool1d in ATen * Put function name into check1d method	2018-06-25 20:31:32 -07:00
li-roy	85f4d2b55a	throw error when grid_sample is passed unsupported mode (#8884 )	2018-06-25 22:37:41 -04:00
Tongzhou Wang	731273b8d6	Improve convT output_padding docs (#8825 ) * improve output_padding doc for convT modules * Update functional.py * Update conv.py * lint	2018-06-23 14:33:18 -04:00
Ailing	ddda7cfea5	allow output_size to contain None in adaptive pooling methods (#8596 ) * allow output_size to contain None in adaptive pooling methods * fix lint * address comments	2018-06-22 13:29:15 -04:00
Thomas Viehmann	0ae8b6c027	add fold example and add nn.Fold/nn.Unfold and F.fold/F.unfold to doc (#8600 ) * add fold example and add nn.Fold/nn.Unfold and F.fold/F.unfold to doc and a few drive-by doc fixes * typo	2018-06-18 09:36:42 -04:00
Wei Yang	ae55865a3b	Migrated hardshrink() to ATen and deprecated nn.Hardshrink() (#8117 ) * 1. added hardshrink() to ATen (CPU + GPU); 2. removed nn.Hardshrink(); 3. reusing previous tests for nn.Hardshrink() and included CUDA tests at test_nn; 4. default parameter lambda=0.5 is not working yet * optimized memory read/write * 1. pass in lambd as scalar for CPU/CUDA_apply; 2. removed tests for hardshrink at test_legacy_nn fixes test_utils * 1. replace zeros_like with empty_like; 2. use scalar_cast in cuda * 1. printing lambd value; 2. default lambd=0.5 is still failing * getting around Scalar bug by removing default value of lambd from native_functions.yaml, and declare it at nn/functional.py * cleaned up debug printf	2018-06-14 16:42:20 -04:00
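For readers unfamiliar with the operation migrated in the commit above, hard shrinkage zeroes every element whose magnitude does not exceed `lambd` and passes the rest through unchanged. The following is a small list-based sketch with the default `lambd=0.5` mentioned in the commit; the helper is illustrative and is not the ATen kernel.

```python
def hardshrink(xs, lambd=0.5):
    """Keep x where |x| > lambd, otherwise output 0.0."""
    return [x if abs(x) > lambd else 0.0 for x in xs]


shrunk = hardshrink([-1.0, -0.5, 0.2, 0.5, 0.7])
```

Values exactly at `±lambd` are zeroed, since the comparison is strict.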
Tongzhou Wang	a77b391de7	[SpectralNorm] don't register original weight as buffer (#8170 ) * don't register original weight as buffer; fixes for buffers that require grad * add test	2018-06-12 14:42:05 -04:00
Tongzhou Wang	f9926e4ce5	Fix EmbeddingBag max_norm option (#7959 ) * fix EmbeddingBag max_norm option * flake8 * add warning to the embedding bag arg change	2018-05-31 09:42:56 -04:00

356 Commits