pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
Will Feng	c9e66351a7	Port all PyTorch and Caffe2 jobs to CircleCI (#11264) Summary: This PR adds all PyTorch and Caffe2 job configs to CircleCI. Steps for the CircleCI mini-trial: - [ ] Make sure this PR passes Jenkins CI and fbcode internal tests - [x] Approve this PR - [ ] Ask CircleCI to turn up the number of build machines - [ ] Land this PR so that the new `.circleci/config.yml` will take effect Several Caffe2 tests are flaky on CircleCI machines and hence skipped when running on CircleCI. A proper fix for them will be worked on after a successful mini-trial. Pull Request resolved: https://github.com/pytorch/pytorch/pull/11264 Differential Revision: D9656793 Pulled By: yf225 fbshipit-source-id: 7832e90018f3dff7651489c04a179d6742168fe1	2018-09-05 16:28:11 -07:00
Xiaomeng Yang	288d37998a	[Caffe2] Fix gradient_check on in-place ops (#8828) * Fix gradient_check on in-place ops * Fix hsm_test * Fix SplitByLengthOp test * Fix input_device_options for gradient_checker * Fix hypothesis_test_util.py	2018-06-25 15:25:56 -07:00
Orion Reblitz-Richardson	1d5780d42c	Remove Apache headers from source. * LICENSE file contains details, so removing from individual source files.	2018-03-27 13:10:18 -07:00
Alexander Sidorov	e0e124e617	Fix RNN scoping situation Summary: There is a long lasting problem of scoping which was introduced in original python wrappers early in H1. Basically each RNNCell implemented has to manually scope outputs of each of the operators. If somebody forgets, then there could be weird bugs with layers etc. Approach is the following. User has to explicitly specify current scope when using apply_over_sequence function and others if the function is going to be called several times (like for stacking layers). This way we use Caffe2 native scoping approach instead of inventing one extra API people have to use (i.e. passing scope name as an argument to the RNNCell constructor). Closes https://github.com/caffe2/caffe2/pull/1681 Differential Revision: D6777536 Pulled By: salexspb fbshipit-source-id: 73d860b8d4857589e04bdea5a6fcd3080d68427c	2018-02-07 17:35:29 -08:00
Anders Papitto	d8748a9d53	GRU sequence lengths: allow unspecified sequence lengths Summary: modeled after the earlier change for LSTM Closes https://github.com/caffe2/caffe2/pull/1841 Differential Revision: D6837461 Pulled By: anderspapitto fbshipit-source-id: de4e787019fa30f813a4b29f14b7000ce9d22d8e	2018-02-05 13:20:05 -08:00
Anders Papitto	0aa1a6387e	Add a seed to the gru unit test Summary: as it calls np.random and sometimes fails unreproducibly Closes https://github.com/caffe2/caffe2/pull/1779 Reviewed By: pietern Differential Revision: D6779802 Pulled By: anderspapitto fbshipit-source-id: 2ad069f8a15f70a8110b1a6bdb06f81577c53ad4	2018-01-23 13:47:43 -08:00
Anders Papitto	12309f4aa6	GRU cell: add linear_before_reset boolean parameter Summary: This matches the semantics of cudnn (and others, like pytorch) Closes https://github.com/caffe2/caffe2/pull/1695 Reviewed By: dzhulgakov Differential Revision: D6658208 Pulled By: anderspapitto fbshipit-source-id: 00e1716fba47b0ac296d1e9e0131165f4997ac7d	2018-01-08 13:22:56 -08:00
Yangqing Jia	8286ce1e3a	Re-license to Apache Summary: Closes https://github.com/caffe2/caffe2/pull/1260 Differential Revision: D5906739 Pulled By: Yangqing fbshipit-source-id: e482ba9ba60b5337d9165f28f7ec68d4518a0902	2017-09-28 16:22:00 -07:00
Alexander Sidorov	a7be496fe2	Revert D5589309: modify _LSTM into _RNN to adapt GRU Summary: This reverts commit f5af67dfe0842acd68223f6da3e96a81639e8049 bypass-lint Differential Revision: D5589309 fbshipit-source-id: 79b0a3a9455829c3899472a1368ef36dc75f6e14	2017-08-10 16:42:41 -07:00
Tao Wu	7b86a34610	modify _LSTM into _RNN to adapt GRU Summary: GRU is different than LSTM that it only has hidden states but no cell states. So in this case, reusing the code of _LSTM is problematic, as we need to delete the part of creating cell state, and change many other places that use hard-coded 4 (hidden_all, hidden, cell_all, cell) into 2 (hidden_all, hidden). Otherwise GRU will break during the backward pass, when the optimizer tries to apply gradient to each of the parameters, because cell state is never used, so it does not have gradients for the corresponding parameters (i.e., cell_state_w, cell_state_b). Differential Revision: D5589309 fbshipit-source-id: f5af67dfe0842acd68223f6da3e96a81639e8049	2017-08-09 13:24:45 -07:00
Jianlong Zhong	152d2ae3a8	Implement CUDA version of GRU operator Summary: Add CUDA version of GRU operator Reviewed By: jamesr66a Differential Revision: D5571043 fbshipit-source-id: 332aa64fc8a9116cc33382f2b2907080e58c13b3	2017-08-08 10:57:40 -07:00
Robert Verkuil	97193478c7	Implemented GRUCell Summary: Implemented python logic and tests to create an RNNCell for GRU. Uses the preexisting GRU Unit Op code. Reviewed By: salexspb Differential Revision: D5364893 fbshipit-source-id: 2451d7ec8c2eacb8d8c9b7c893bfd21b65fb9d18	2017-07-10 17:52:25 -07:00
Robert Verkuil	2409c2e359	GRUUnit Op Backwards Pass Summary: Just an implementation of the forward pass of the GRU Unit Op, not the full RNNCell. Functions were created to mimic LSTM implementation as closely as possible. Backwards pass implementations are defined in GRU_unit_op.{h, cc}. assertGradientChecks call added to gru_cell_test.py. Reviewed By: salexspb Differential Revision: D5364856 fbshipit-source-id: 09cff4478091827763b40cc331e4e0abf0ec258f	2017-07-10 17:52:24 -07:00
Robert Verkuil	279f3f095e	Implemented Gated Recurrent Unit (GRU) c++ operator forward pass Summary: Just an implementation of the forward pass of the GRU Unit Op, not the full RNNCell. Functions were created to mimic LSTM implementation as closely as possible. Implementation defined in GRU_unit_op.{h, cc}. Tests put in gru_cell_test.py, which imports rnn_cell_test_util.py for sigmoid, tanh, and _prepare_rnn functions. Reviewed By: jamesr66a Differential Revision: D5363697 fbshipit-source-id: f9ba9fe0be01ffc868dd22027be8be4975b84998	2017-07-10 17:52:23 -07:00

14 Commits
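
Several commits in this history concern GRU semantics: the cell carries only a hidden state (no cell state), a `linear_before_reset` flag selects where the reset gate is applied, and the unit test is seeded for reproducibility. The sketch below illustrates those three ideas in plain Python. It is not the Caffe2 implementation: the function names (`gru_step`, `make_params`), the parameter layout, the omission of bias terms, and the update convention `h = z * h_prev + (1 - z) * n` are all assumptions made for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(m, v):
    # Dense matrix-vector product on nested lists.
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def make_params(seed, input_size, hidden_size):
    # Hypothetical helper: seeded weights so repeated runs are reproducible.
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

    # Order: W_r, U_r, W_z, U_z, W_n, U_n (biases omitted for brevity).
    return [mat(hidden_size, input_size) if i % 2 == 0 else mat(hidden_size, hidden_size)
            for i in range(6)]


def gru_step(x, h_prev, params, linear_before_reset=False):
    # One GRU step; only a hidden state is carried between steps.
    W_r, U_r, W_z, U_z, W_n, U_n = params
    r = [sigmoid(a + b) for a, b in zip(matvec(W_r, x), matvec(U_r, h_prev))]
    z = [sigmoid(a + b) for a, b in zip(matvec(W_z, x), matvec(U_z, h_prev))]
    if linear_before_reset:
        # Recurrent linear map first, then the reset gate: r * (U_n h).
        rec = [ri * ui for ri, ui in zip(r, matvec(U_n, h_prev))]
    else:
        # Reset gate first, then the recurrent linear map: U_n (r * h).
        rec = matvec(U_n, [ri * hi for ri, hi in zip(r, h_prev)])
    n = [math.tanh(a + b) for a, b in zip(matvec(W_n, x), rec)]
    return [zi * hi + (1.0 - zi) * ni for zi, hi, ni in zip(z, h_prev, n)]


if __name__ == "__main__":
    params = make_params(seed=0, input_size=3, hidden_size=2)
    x, h = [0.1, -0.2, 0.3], [0.5, -0.5]
    print(gru_step(x, h, params, linear_before_reset=False))
    print(gru_step(x, h, params, linear_before_reset=True))
```

The two variants coincide when the previous hidden state is zero, since the recurrent term vanishes either way; in general they give different candidate activations.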