mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Nikita Shulga 171f265d80 Back out "Revert D25717510: Clean up some type annotations in benchmarks/fastrnns" (#50556 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/50556 Original commit changeset: 2bcc19cd4340 Test Plan: Soft revert hammer Reviewed By: walterddr, seemethere Differential Revision: D25917129 fbshipit-source-id: e5caad77655789d607b84eee820aa7c960e00f51		2021-01-14 15:15:03 -08:00
__init__.py	Ignore F401 in all __init__.py without putting noqa (#25823 )	2019-10-23 15:28:13 -07:00
bench.py	Remove py2 compatible future imports (#44735 )	2020-09-16 12:55:57 -07:00
cells.py	Back out "Revert D25717510: Clean up some type annotations in benchmarks/fastrnns" (#50556 )	2021-01-14 15:15:03 -08:00
conftest.py	Benchmarks: make fuser and executor configurable from command line. (#44291 )	2020-09-09 11:59:35 -07:00
custom_lstms.py	Back out "Revert D25717510: Clean up some type annotations in benchmarks/fastrnns" (#50556 )	2021-01-14 15:15:03 -08:00
factory.py	Back out "Revert D25717510: Clean up some type annotations in benchmarks/fastrnns" (#50556 )	2021-01-14 15:15:03 -08:00
fuser.py	remove non-default settings in fuser.py (#48862 )	2020-12-05 20:58:39 -08:00
profile.py	Remove (most) Python 2 support from Python code (#35615 )	2020-04-22 09:23:14 -07:00
README.md	Fix typos (#30606 )	2019-12-02 20:17:42 -08:00
runner.py	try to make at::cat in mm_tree_reduction operate on contig tensors (#18816 )	2019-04-24 23:44:25 -07:00
scratch.py
test_bench.py	Remove py2 compatible future imports (#44735 )	2020-09-16 12:55:57 -07:00
test.py	Turn on F401: Unused import warning. (#18598 )	2019-03-30 09:01:17 -07:00

README.md

Fast RNN benchmarks

Benchmarks for TorchScript models

For the most stable results, do the following:

Set the CPU governor to performance mode (as opposed to energy-saving mode).
Turn off turbo for all CPUs (assuming Intel CPUs).
Shield CPUs via cset shield when running benchmarks.
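
The first two settings above can be checked before a run. The following is a minimal, read-only sketch, assuming a Linux machine that exposes the cpufreq sysfs files and, for the turbo check, the intel_pstate driver. The function names `read_cpu_settings` and `is_benchmark_ready` are hypothetical and not part of fastrnns; the sketch does not change any setting and does not cover cset shield.

```python
from pathlib import Path


def read_cpu_settings(sysfs_root="/sys/devices/system/cpu"):
    """Return (governors, no_turbo) read from Linux sysfs.

    governors maps a CPU name such as "cpu0" to its scaling governor.
    no_turbo is "1" when turbo is disabled, "0" when it is enabled, and
    None when the intel_pstate file is absent (assumption: other drivers
    expose turbo state elsewhere, which this sketch does not handle).
    """
    root = Path(sysfs_root)
    governors = {}
    for gov_file in sorted(root.glob("cpu[0-9]*/cpufreq/scaling_governor")):
        cpu_name = gov_file.parent.parent.name
        governors[cpu_name] = gov_file.read_text().strip()

    turbo_file = root / "intel_pstate" / "no_turbo"
    no_turbo = turbo_file.read_text().strip() if turbo_file.exists() else None
    return governors, no_turbo


def is_benchmark_ready(governors, no_turbo):
    """True when every CPU uses the performance governor and turbo is off."""
    return (
        bool(governors)
        and all(g == "performance" for g in governors.values())
        and no_turbo == "1"
    )


if __name__ == "__main__":
    govs, turbo = read_cpu_settings()
    print("governors:", govs)
    print("no_turbo:", turbo)
    print("ready:", is_benchmark_ready(govs, turbo))
```

Taking the sysfs root as a parameter keeps the check easy to try against a temporary directory on machines without these files.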

Some of these scripts accept command-line arguments, but most of them do not because I was lazy. Arguments will probably be added sometime in the future, but the default sizes are pretty reasonable.

Test fastrnns (fwd + bwd) correctness

Test the fastrnns benchmarking scripts with the following:

python -m fastrnns.test

or run the test independently:

python -m fastrnns.test --rnns jit

Run benchmarks

python -m fastrnns.bench

should give a good comparison. You can also specify the type of model to run:

python -m fastrnns.bench --rnns cudnn aten jit --group rnns
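
When several configurations are run from a script, the commands above can be assembled programmatically. This is a minimal sketch that uses only the `--rnns` and `--group` flags shown above. The helper names `bench_command` and `run_bench` are hypothetical, and the sketch assumes the working directory is the one containing the fastrnns package so that `python -m fastrnns.bench` resolves.

```python
import subprocess
import sys


def bench_command(rnns=None, group=None):
    """Build the argument list for `python -m fastrnns.bench`.

    rnns: optional list of model names, e.g. ["cudnn", "aten", "jit"].
    group: optional group name, e.g. "rnns".
    Omitted options fall back to the script's own defaults.
    """
    cmd = [sys.executable, "-m", "fastrnns.bench"]
    if rnns:
        cmd += ["--rnns", *rnns]
    if group:
        cmd += ["--group", group]
    return cmd


def run_bench(rnns=None, group=None, cwd="."):
    """Run the benchmark; cwd must contain the fastrnns package (assumption)."""
    return subprocess.run(bench_command(rnns, group), cwd=cwd, check=True)


if __name__ == "__main__":
    # Only prints the command; call run_bench(...) to execute it.
    print(" ".join(bench_command(["cudnn", "aten", "jit"], "rnns")))
```

Passing the arguments as a list avoids shell quoting issues, and `check=True` makes a failed benchmark run raise instead of passing silently.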

Run model profiling (calls nvprof)

python -m fastrnns.profile

should generate an nvprof file for each model. You can also specify which models to generate nvprof files for:

python -m fastrnns.profile --rnns aten jit

Caveats

Use Linux for the most accurate timing. A lot of these tests only run on CUDA.