Commit Graph

8 Commits

Author SHA1 Message Date
Anders Papitto
12309f4aa6 GRU cell: add linear_before_reset boolean parameter
Summary:
This matches the semantics of cudnn (and others, like pytorch)
Closes https://github.com/caffe2/caffe2/pull/1695

Reviewed By: dzhulgakov

Differential Revision: D6658208

Pulled By: anderspapitto

fbshipit-source-id: 00e1716fba47b0ac296d1e9e0131165f4997ac7d
2018-01-08 13:22:56 -08:00
Yangqing Jia
8286ce1e3a Re-license to Apache
Summary: Closes https://github.com/caffe2/caffe2/pull/1260

Differential Revision: D5906739

Pulled By: Yangqing

fbshipit-source-id: e482ba9ba60b5337d9165f28f7ec68d4518a0902
2017-09-28 16:22:00 -07:00
Alexander Sidorov
a7be496fe2 Revert D5589309: modify _LSTM into _RNN to adapt GRU
Summary:
This reverts commit f5af67dfe0842acd68223f6da3e96a81639e8049

bypass-lint

Differential Revision: D5589309

fbshipit-source-id: 79b0a3a9455829c3899472a1368ef36dc75f6e14
2017-08-10 16:42:41 -07:00
Tao Wu
7b86a34610 modify _LSTM into _RNN to adapt GRU
Summary: GRU is different than LSTM that it only has hidden states but no cell states. So in this case, reusing the code of _LSTM is problematic, as we need to delete the part of creating cell state, and change many other places that use hard-coded 4 (hidden_all, hidden, cell_all, cell) into 2 (hidden_all, hidden). Otherwise GRU will break during the backward pass, when the optimizer tries to apply gradient to each of the parameters, because cell state is never used, so it does not have gradients for the corresponding parameters (i.e., cell_state_w, cell_state_b).

Differential Revision: D5589309

fbshipit-source-id: f5af67dfe0842acd68223f6da3e96a81639e8049
2017-08-09 13:24:45 -07:00
Jianlong Zhong
152d2ae3a8 Implement CUDA version of GRU operator
Summary: Add CUDA version of GRU operator

Reviewed By: jamesr66a

Differential Revision: D5571043

fbshipit-source-id: 332aa64fc8a9116cc33382f2b2907080e58c13b3
2017-08-08 10:57:40 -07:00
Robert Verkuil
97193478c7 Implemented GRUCell
Summary: Implemented python logic and tests to create an RNNCell for GRU.  Uses the preexisting GRU Unit Op code.

Reviewed By: salexspb

Differential Revision: D5364893

fbshipit-source-id: 2451d7ec8c2eacb8d8c9b7c893bfd21b65fb9d18
2017-07-10 17:52:25 -07:00
Robert Verkuil
2409c2e359 GRUUnit Op Backwards Pass
Summary:
Just an implementation of the forward pass of the GRU Unit Op, not the full RNNCell.
Functions were created to mimic LSTM implementation as closely as possible.
Backwards pass implementations are defined in GRU_unit_op.{h, cc}
assertGradientChecks call added to gru_cell_test.py

Reviewed By: salexspb

Differential Revision: D5364856

fbshipit-source-id: 09cff4478091827763b40cc331e4e0abf0ec258f
2017-07-10 17:52:24 -07:00
Robert Verkuil
279f3f095e Implemented Gated Recurrent Unit (GRU) c++ operator forward pass
Summary:
Just an implementation of the forward pass of the GRU Unit Op, not the full RNNCell.
Functions were created to mimic LSTM implementation as closely as possible.
Implementation defined in GRU_unit_op.{h, cc}
tests put in gru_cell_test.py, which import rnn_cell_test_util.py for sigmoid, tanh, and _prepare_rnn functions.

Reviewed By: jamesr66a

Differential Revision: D5363697

fbshipit-source-id: f9ba9fe0be01ffc868dd22027be8be4975b84998
2017-07-10 17:52:23 -07:00