Fast RNN benchmarks

Benchmarks for TorchScript models

For the most stable results, do the following:

  • Set the CPU governor to performance mode (as opposed to powersave)
  • Turn off turbo for all CPUs (assuming Intel CPUs)
  • Shield CPUs via cset shield when running benchmarks.
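On Linux, the steps above can be sketched roughly as follows. This is a hedged example, not part of the benchmark suite: exact sysfs paths and tool availability vary by distribution and CPU driver, the CPU range 0-3 is an arbitrary choice, and every command requires root.

```shell
# Set the CPU frequency governor to performance (requires the cpupower tool)
sudo cpupower frequency-set -g performance

# Disable turbo on Intel CPUs (assumes the intel_pstate driver is in use)
echo 1 | sudo tee /sys/devices/system/cpu/intel_pstate/no_turbo

# Shield CPUs 0-3 from other user processes and kernel threads via cpuset
sudo cset shield --cpu 0-3 --kthread on
```

Once a shield is set up, a benchmark can be launched inside it with cset shield --exec, and the shield can be torn down afterwards with cset shield --reset.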

Some of these scripts accept command line args, but most of them do not yet. Support will probably be added sometime in the future; in the meantime, the default sizes are pretty reasonable.

Test fastrnns (fwd + bwd) correctness

Test the fastrnns benchmarking scripts with the following:

python -m fastrnns.test

or test a single RNN implementation independently:

python -m fastrnns.test --rnns jit

Run benchmarks

python -m fastrnns.bench

should give a good comparison, or you can specify which types of model to run:

python -m fastrnns.bench --rnns cudnn aten jit --group rnns
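To combine this with the CPU shielding suggested above, the benchmark can be launched inside an existing shield. This is a sketch assuming cset is installed and a shield has already been created; it is not something the benchmark scripts do for you:

```shell
# Run the benchmark on the shielded CPUs; --exec runs a command inside the shield
sudo cset shield --exec -- python -m fastrnns.bench --rnns cudnn aten jit --group rnns
```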

Run model profiling (calls nvprof)

python -m fastrnns.profile

should generate an nvprof file for each model. You can also specify which models to generate nvprof files for:

python -m fastrnns.profile --rnns aten jit
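The generated profiles can then be inspected with nvprof's import mode or opened in the NVIDIA Visual Profiler. The file name below is hypothetical; use whatever file the profiling run actually produced:

```shell
# Print a summary of a previously generated profile (hypothetical file name)
nvprof -i jit.nvprof

# Or open it in the Visual Profiler GUI
nvvp jit.nvprof
```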

Caveats

Use Linux for the most accurate timing. Many of these tests only run when CUDA is available.