Summary:
Depending on what happened to be in uninitialized memory, only the first item
in the batch might actually have been used (when the rest of the memory was 0), or, if
the memory there held a big positive integer, the whole sequence was used. So whether the rest of the batch was used came down to luck :)
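A minimal sketch of the failure mode and the fix (hypothetical, not the code from this diff; a per-batch-item sequence-lengths array in NumPy stands in for the uninitialized blob):

    import numpy as np

    batch_size = 4

    # Buggy pattern: only slot 0 is written; the rest is whatever
    # happened to be left in memory.
    seq_lengths = np.empty(batch_size, dtype=np.int32)
    seq_lengths[0] = 25
    # seq_lengths[1:] may be 0 (those batch items are silently skipped)
    # or a huge positive integer (the whole sequence gets consumed).

    # Fix: explicitly fill the value for every item in the batch.
    seq_lengths = np.full(batch_size, 25, dtype=np.int32)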
Reviewed By: Yangqing
Differential Revision: D4599569
fbshipit-source-id: ae89cee796bbcbc232e4abcab71dee360b0d8bc6
Summary:
The input has to be arranged so that the j-th example of
batch i comes right before the j-th example of batch i+1 in the text.
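A minimal sketch of that arrangement (hypothetical helper, plain Python; `batch_size` and `seq_len` are assumed parameters): split the text into `batch_size` contiguous streams, then let batch i take the i-th chunk of every stream.

    def arrange_batches(text, batch_size, seq_len):
        # One contiguous stream of text per batch position j, so the j-th
        # example of batch i is immediately followed in the text by the
        # j-th example of batch i+1 (hidden state can be carried across
        # batches).
        stream_len = len(text) // batch_size
        streams = [text[j * stream_len:(j + 1) * stream_len]
                   for j in range(batch_size)]
        num_batches = stream_len // seq_len
        return [[s[i * seq_len:(i + 1) * seq_len] for s in streams]
                for i in range(num_batches)]

    # With batch_size=2, seq_len=3 and text "abcdefghijkl", the streams are
    # "abcdef" and "ghijkl"; batch 0 = ["abc", "ghi"], batch 1 = ["def", "jkl"],
    # so "abc" is followed by "def" in the original text.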
Reviewed By: urikz
Differential Revision: D4519553
fbshipit-source-id: 9dd80658e0c4d9ff0f97a7904cbb164f267fe39f
Summary: With a batch size of 32 and otherwise default parameters I get 70 iterations per second on GPU vs. 40 on CPU. Batching still doesn't produce a good loss; I am going to work on that in a separate diff.
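A minimal sketch of the general pattern for running a Caffe2 net on GPU (not the code from this diff; the toy FC net, the blob names, and the shapes are made up for illustration and stand in for the actual char-rnn training net):

    import numpy as np
    from caffe2.python import core, workspace
    from caffe2.proto import caffe2_pb2

    device_opt = core.DeviceOption(caffe2_pb2.CUDA, 0)  # GPU 0

    with core.DeviceScope(device_opt):
        net = core.Net("toy_gpu_net")
        net.FC(["x", "w", "b"], "y")  # stand-in for the LSTM training net

    # Feed the inputs onto the same device before running.
    workspace.FeedBlob("x", np.random.rand(32, 16).astype(np.float32),
                       device_option=device_opt)
    workspace.FeedBlob("w", np.random.rand(8, 16).astype(np.float32),
                       device_option=device_opt)
    workspace.FeedBlob("b", np.zeros(8, dtype=np.float32),
                       device_option=device_opt)

    workspace.CreateNet(net)
    workspace.RunNet(net.Proto().name)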
Reviewed By: urikz
Differential Revision: D4516566
fbshipit-source-id: d0611534747beb2cd935a8607a283369378e4a6c
Summary:
This learns Shakespeare and then generates samples one character at a time. We want this to be an example of using our LSTM, and RNNs in general.
Running the training net with the current parameters (batch size = 1) now takes 4 ms. I don't have per-operator timings yet, but the overall Python loop doesn't seem to add much overhead: with 1000 fake iterations in run_net, each outer iteration took 4 s, as expected (1000 iterations at 4 ms each).
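On the sampling side, generation runs one character at a time: predict a distribution over the next character, draw from it, and feed the drawn character back in. A minimal framework-agnostic sketch of that loop (`predict_probs` is a hypothetical stand-in for running the trained net):

    import numpy as np

    def sample_text(predict_probs, vocab, seed_char, length):
        # predict_probs(ch) -> probability distribution over `vocab` for
        # the character following `ch` (stand-in for running the net).
        out = [seed_char]
        for _ in range(length):
            probs = predict_probs(out[-1])
            next_idx = np.random.choice(len(vocab), p=probs)
            out.append(vocab[next_idx])
        return "".join(out)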
Future work:
* fixing convergence for batching
* profiling on operator level
* trying it out with GPUs
* benchmarking against existing char-rnn implementations
* stacking LSTMs (one LSTM is different from two; stacking requires taking care of scoping)
Reviewed By: urikz
Differential Revision: D4430612
fbshipit-source-id: b36644fed9844683f670717d57f8527c25ad285c