mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-06 12:20:52 +01:00

Aaron Gokaslan e738f7ba23 [BE]: Enable ruff rule SIM113 (#147290 ) Lint rules that tells the user to avoid keeping track of their own counter and use the builtin enumerate when possible. Pull Request resolved: https://github.com/pytorch/pytorch/pull/147290 Approved by: https://github.com/jansel		2025-02-16 22:41:16 +00:00
__init__.py	[BE][Easy][3/19] enforce style for empty lines in import segments in `benchmarks/` (#129754 )	2024-07-17 14:34:42 +00:00
bench.py	Fix unused Python variables outside torch/ and test/ (#136359 )	2024-12-11 17:10:23 +00:00
cells.py	Migrate from Tuple -> tuple in benchmarks (#144259 )	2025-01-07 04:09:52 +00:00
conftest.py	[BE][Easy][3/19] enforce style for empty lines in import segments in `benchmarks/` (#129754 )	2024-07-17 14:34:42 +00:00
custom_lstms.py	[BE]: Enable ruff rule SIM113 (#147290 )	2025-02-16 22:41:16 +00:00
factory.py	PEP585 update - benchmarks tools torchgen (#145101 )	2025-01-18 05:05:07 +00:00
fuser.py	Apply UFMT to all files in benchmarks/ (#105928 )	2023-07-26 01:18:48 +00:00
profile.py	Apply UFMT to all files in benchmarks/ (#105928 )	2023-07-26 01:18:48 +00:00
README.md
runner.py	[5/N][Easy] fix typo for `usort` config in `pyproject.toml` (`kown` -> `known`): sort torch (#127126 )	2024-05-27 14:49:57 +00:00
scratch.py	[BE][Easy][3/19] enforce style for empty lines in import segments in `benchmarks/` (#129754 )	2024-07-17 14:34:42 +00:00
test_bench.py	Fix unused Python variables outside torch/ and test/ (#136359 )	2024-12-11 17:10:23 +00:00
test.py	[Ez][BE]: Enable new stable ruff rules (#129825 )	2024-07-02 14:47:10 +00:00

README.md

Fast RNN benchmarks

Benchmarks for TorchScript models

For the most stable results, do the following:

Set the CPU governor to performance mode (as opposed to powersave).
Turn off turbo for all CPUs (assuming Intel CPUs).
Shield CPUs via cset shield when running benchmarks.
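
The first two settings can be checked before a run. The following is a minimal sketch, not part of fastrnns: the helper name check_cpu_settings is hypothetical, and it assumes a Linux machine that exposes the cpufreq sysfs files and, for the turbo check, the intel_pstate driver. It only reads the settings and changes nothing.

```python
from pathlib import Path


def check_cpu_settings(sysfs_cpu="/sys/devices/system/cpu"):
    """Return warnings about CPU settings that can make benchmark timings noisy.

    Hypothetical helper for illustration; it is not part of fastrnns.
    """
    root = Path(sysfs_cpu)
    warnings = []

    # Each cpuN/cpufreq/scaling_governor file holds the active governor name.
    for gov_file in sorted(root.glob("cpu[0-9]*/cpufreq/scaling_governor")):
        governor = gov_file.read_text().strip()
        if governor != "performance":
            cpu = gov_file.parent.parent.name
            warnings.append(f"{cpu}: governor is '{governor}', expected 'performance'")

    # With the intel_pstate driver, no_turbo == 1 means turbo is disabled.
    # If the file is absent (other drivers), the turbo check is skipped.
    no_turbo = root / "intel_pstate" / "no_turbo"
    if no_turbo.exists() and no_turbo.read_text().strip() != "1":
        warnings.append("turbo is enabled (intel_pstate/no_turbo is not 1)")

    return warnings
```

Calling check_cpu_settings() with no arguments inspects the live system; an empty list means neither check found a problem. The sysfs_cpu parameter exists only so the function can be pointed at a different directory, for example in a test.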

Some of these scripts accept command-line arguments, but most do not. More arguments will probably be added in the future; in the meantime, the default sizes are reasonable.

Test fastrnns (fwd + bwd) correctness

Test the fastrnns benchmarking scripts with the following:

python -m fastrnns.test

or restrict the test to a single implementation:

python -m fastrnns.test --rnns jit

Run benchmarks

python -m fastrnns.bench

This should give a good comparison across models. You can also specify which types of model to run:

python -m fastrnns.bench --rnns cudnn aten jit --group rnns
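
To launch the benchmark from another Python script, the command above can be assembled programmatically. This is a minimal sketch: bench_command and run_bench are hypothetical helper names, only the --rnns and --group flags shown above are used, and it assumes --group takes a single value as in that example.

```python
import subprocess
import sys


def bench_command(rnns=None, group=None):
    """Build the argv for `python -m fastrnns.bench` (hypothetical helper)."""
    cmd = [sys.executable, "-m", "fastrnns.bench"]
    if rnns:
        # e.g. ["cudnn", "aten", "jit"], as in the command above
        cmd += ["--rnns", *rnns]
    if group:
        cmd += ["--group", group]
    return cmd


def run_bench(rnns=None, group=None, cwd=None):
    """Run the benchmark; `cwd` should be the directory that contains fastrnns/."""
    return subprocess.run(bench_command(rnns, group), cwd=cwd, check=True)
```

For example, run_bench(["cudnn", "aten", "jit"], "rnns") runs the same command as the line above and raises subprocess.CalledProcessError if the benchmark exits with a non-zero status.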

Run model profiling (calls nvprof)

python -m fastrnns.profile

This should generate an nvprof file for each model. You can also specify which models to generate nvprof files for:

python -m fastrnns.profile --rnns aten jit

Caveats

Use Linux for the most accurate timing. Many of these tests run only on CUDA.