Commit Graph

3 Commits

James Reed
10d95bd0f0 Remove batch_size parameter from attention and LSTMWithAttention interfaces
Summary: Reshape based on the tensor shapes in the graph rather than on a passed-in batch_size parameter (a sketch of the idea follows this entry).

Reviewed By: urikz

Differential Revision: D4702086

fbshipit-source-id: c4c1d8425cd36c1e86695918eaba2667c27e9601
2017-03-16 11:47:52 -07:00
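
The change summarized above avoids threading a batch_size argument through the attention and LSTMWithAttention helpers by letting the graph infer the collapsed dimension at run time. Below is a minimal standalone sketch of that idea, not the actual diff: it assumes Caffe2's Reshape op accepts -1 for an inferred dimension, and the blob names and the DIM constant are made up for illustration.

```python
from caffe2.python import core, workspace
import numpy as np

DIM = 8  # hidden dimension; known when the net is built

net = core.Net("reshape_without_batch_size")
# Collapse (T, N, D) -> (T*N, D). The -1 lets Reshape infer T*N from the
# blob's runtime shape, so no batch_size parameter has to be passed in.
net.Reshape(["hidden_all"], ["hidden_2d", "hidden_old_shape"], shape=[-1, DIM])

workspace.FeedBlob("hidden_all", np.random.randn(5, 3, DIM).astype(np.float32))
workspace.RunNetOnce(net)
print(workspace.FetchBlob("hidden_2d").shape)  # (15, 8)
```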
James Reed
8de1db9eb6 Implement recurrent attention in C2
Summary: Super rough implementation of recurrent attention. Planning to factor out the common code between the two functions, as well as between train and eval. I want to get this out and get eyes on it sooner rather than later. (A sketch of the per-step computation follows this entry.)

Differential Revision: D4647837

fbshipit-source-id: 54bc4e8ed0df6f04c86c425926decbe89f73b068
2017-03-08 11:21:28 -08:00
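
The commit does not spell out which attention variant is used, so the following is only a generic NumPy sketch of what one recurrent-attention step computes per decoder step (dot-product scoring assumed); the function and argument names are illustrative, not the Caffe2 API.

```python
import numpy as np

def softmax(x, axis=0):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_step(decoder_hidden, encoder_outputs):
    """One decoder step of dot-product attention.

    decoder_hidden:  (N, D)    current decoder hidden state
    encoder_outputs: (T, N, D) all encoder hidden states
    Returns context (N, D) and attention weights (T, N).
    """
    # Score every encoder position against the current decoder state.
    scores = np.einsum("tnd,nd->tn", encoder_outputs, decoder_hidden)
    weights = softmax(scores, axis=0)  # normalize over encoder time steps
    # Context is the attention-weighted sum of encoder states.
    context = np.einsum("tn,tnd->nd", weights, encoder_outputs)
    return context, weights
```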
Yury Zemlyanskiy
4a53ab3cb6 LSTMWithAttention implementation in Caffe2
Summary:
Implementation of LSTMWithAttention

Still TBD:
1. There are problems with backpropagation, because the gradient is not implemented for ops with broadcasting
2. I need to make initial_recurrent_state have shape [dim] rather than [1, batch_size, dim], so one doesn't need to provide batch_size to LSTMWithAttention (see the sketch after this entry)

Differential Revision: D4298735

fbshipit-source-id: 8903fcff4d6a66647ee6d45a6ef28803fc3091e5
2017-02-23 04:08:34 -08:00
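
As a rough illustration of TBD item 2 above (not code from the diff): if initial_recurrent_state were a [dim] vector, the per-batch layout could be recovered from the input's own shape at run time, so callers would not have to pass batch_size. The names and shapes below are made up for the example.

```python
import numpy as np

dim = 8
initial_recurrent_state = np.zeros(dim, dtype=np.float32)  # shape [dim]
inputs = np.random.randn(10, 4, 16).astype(np.float32)     # (T, N, input_dim)

# Read batch_size from the inputs instead of taking it as a parameter,
# then broadcast the [dim] state up to the [1, batch_size, dim] layout.
batch_size = inputs.shape[1]
recurrent_init = np.tile(initial_recurrent_state, (1, batch_size, 1))
print(recurrent_init.shape)  # (1, 4, 8)
```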