Summary: D4734505 part 2. Remove more instances of the batch_size parameter
Reviewed By: urikz
Differential Revision: D4736906
fbshipit-source-id: fc9d374e9308017d61c427890364c5ab9cec2edf
Summary: UNK needs to be indexed in the vocabulary for validation to work. With the default args, the training loss now decreases.
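A minimal sketch of why UNK needs its own index, assuming a plain dict vocabulary. The names `build_vocab` and `encode` and the `<UNK>` token string are hypothetical and are not taken from the script. Validation text can contain tokens never seen in training, so the lookup needs a fallback id:

```python
from collections import Counter

UNK = "<UNK>"  # hypothetical token string, chosen for illustration


def build_vocab(sentences, min_count=1):
    """Map each sufficiently frequent token to an integer id; UNK gets id 0."""
    counts = Counter(tok for s in sentences for tok in s.split())
    vocab = {UNK: 0}  # index UNK first so it always has a valid id
    for tok, count in sorted(counts.items()):
        if count >= min_count and tok not in vocab:
            vocab[tok] = len(vocab)
    return vocab


def encode(sentence, vocab):
    """Convert a sentence to ids, falling back to the UNK id for unseen tokens."""
    unk_id = vocab[UNK]
    return [vocab.get(tok, unk_id) for tok in sentence.split()]
```

Without the UNK entry, `encode` would have no id to return for an out-of-vocabulary token in the validation data.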
Reviewed By: urikz
Differential Revision: D4703393
fbshipit-source-id: e4d6ad100daf8392f8ba1e502f9ecf39bb8ce24a
Summary: The evaluation phase should use the vocabulary built on the training data, with corpus_eval as its data.
Reviewed By: urikz
Differential Revision: D4700382
fbshipit-source-id: ca1dd043a28f9bb585faad050c82fb12c1cdf6cc
Summary:
OSS implementation of a seq2seq model in Caffe2. The script uses the Seq2SeqModelCaffe2 class to build and run the model. It takes training data in the form of a text file with one sentence per line, builds a vocabulary, generates batches based on the batch size, and runs the net for a configurable number of epochs. It prints the total scalar loss at the end of each epoch.
All FBLearner and neural_mt type system dependencies have been removed. Unimplemented and unnecessary methods have been removed to make the script simpler.
fblearner/flow/projects/langtech/translation/neural_mt/model_util_caffe2.py has been moved to caffe2/caffe2/python/examples/seq2seq_util.py and remains unchanged.
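A rough sketch of the batch and epoch flow described above, not the script's actual code. `gen_batches`, `train`, and `step_fn` are hypothetical names. `step_fn` stands in for one run of the net on a batch and is assumed to return that batch's scalar loss. The real script may also pad or group sentences by length, which is omitted here:

```python
def gen_batches(examples, batch_size):
    """Yield consecutive slices of at most batch_size examples."""
    for start in range(0, len(examples), batch_size):
        yield examples[start:start + batch_size]


def train(examples, batch_size, epochs, step_fn):
    """Run step_fn over every batch for each epoch; return per-epoch total loss."""
    losses = []
    for epoch in range(epochs):
        total_loss = 0.0
        for batch in gen_batches(examples, batch_size):
            total_loss += step_fn(batch)  # assumed to return a scalar loss
        print("epoch {}: total loss {}".format(epoch, total_loss))
        losses.append(total_loss)
    return losses
```

The last batch of an epoch may be smaller than `batch_size` when the number of examples is not a multiple of it.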
Potential TODOs:
- Get the model running on GPU. GatherOp is the only operator without a corresponding GPU implementation. Try adding CopyGPUToCPU before and CopyCPUToGPU after Gather, and use a CUDA DeviceOption.
- Add evaluation on test data with a suitable metric (perplexity? BLEU?)
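For the perplexity option, a minimal sketch. It assumes the evaluation loss is a cross-entropy in natural-log units, summed over all target tokens, which this summary does not state. `perplexity` is a hypothetical helper, not part of the script:

```python
import math


def perplexity(total_nll, num_tokens):
    """Return exp of the average per-token negative log-likelihood."""
    if num_tokens <= 0:
        raise ValueError("num_tokens must be positive")
    return math.exp(total_nll / num_tokens)
```

If the loss is averaged per sentence or per batch instead, it has to be rescaled to a per-token sum before calling this.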
Reviewed By: urikz
Differential Revision: D4653333
fbshipit-source-id: 1c7d970ebc86afe23fad4d48854296bf54eb0f77