mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-06 12:20:52 +01:00

Kevin Stephano 26a91a9f04 [WIP][JIT] Add benchmarking support of NV Fuser with FP16 dtype support (#44101)

Summary: Modified files in `benchmarks/tensorexpr` to add support for NVIDIA's Fuser for the jit compiler. This support has some modifications besides adding an option to support the NVIDIA fuser:
* Adds FP16 Datatype support
* Fixes SOL/Algo calculations to generally use the data type instead of being fixed to 4 bytes
* Adds IR printing and kernel printing knobs
* Adds a knob `input_iter` to create ranges of inputs currently only for reductions
* Adds further reduction support for Inner and Outer dimension reductions that are compatible with the `input_iter` knob.
* Added `simple_element`, `reduce2d_inner`, and `reduce2d_outer` to isolate performance on elementwise and reduction operations in the most minimal fashion.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/44101
Reviewed By: ngimel
Differential Revision: D23713658
Pulled By: bertmaher
fbshipit-source-id: d6b83cfab559aefe107c23b3c0f2df9923b3adc1

2020-09-15 15:10:49 -07:00
distributed/ddp	Add distributed data parallel benchmark tool (#35198)	2020-04-08 15:07:03 -07:00
fastrnns	Benchmarks: make fuser and executor configurable from command line. (#44291)	2020-09-09 11:59:35 -07:00
framework_overhead_benchmark	Fix spelling errors	2020-01-28 04:46:15 -08:00
functional_autograd_benchmark	Reland of benchmark code (#43428)	2020-08-24 13:27:26 -07:00
operator_benchmark	Add benchmark for channel_shuffle operator (#43509)	2020-09-02 08:15:19 -07:00
overrides_benchmark	Add __torch_function__ for methods (#37091)	2020-08-05 20:44:13 -07:00
profiler_benchmark	Destroy CUDA events after profiling (#39962)	2020-06-23 10:44:39 -07:00
record_function_benchmark	move benchmark utils into torch namespace (#41506)	2020-07-23 09:48:39 -07:00
serialization	[JIT] Make new zip serialization for torch save/load significantly (~70%) faster (#38379)	2020-05-29 01:56:18 -07:00
static_runtime	[Static Runtime] Add OSS build for static runtime benchmarks (#43881)	2020-09-02 08:00:18 -07:00
tensorexpr	[WIP][JIT] Add benchmarking support of NV Fuser with FP16 dtype support (#44101)	2020-09-15 15:10:49 -07:00
compare-fastrnn-results.py	Benchmarks: add scripts for FastRNNs results comparison. (#44134)	2020-09-03 13:44:42 -07:00
compare.sh	Benchmarks: add scripts for FastRNNs results comparison. (#44134)	2020-09-03 13:44:42 -07:00
README.md	Fix spelling errors	2020-01-28 04:46:15 -08:00
upload_scribe.py	Benchmarks: make fuser and executor configurable from command line. (#44291)	2020-09-09 11:59:35 -07:00

README.md

PyTorch Benchmarks

NOTE: This folder is currently a work in progress.

This folder contains scripts that produce reproducible timings of various PyTorch features.

It also provides mechanisms to compare PyTorch with other frameworks.
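As a rough illustration of what "reproducible timings" usually involves, here is a minimal sketch that discards a few warm-up runs and reports the median of repeated measurements. It is not taken from the scripts in this folder: the helper name `time_callable` and its `warmup` and `repeats` parameters are hypothetical, and it uses only the Python standard library.

```python
import statistics
import time


def time_callable(fn, warmup=3, repeats=10):
    """Return the median wall-clock time in seconds of fn() over `repeats` runs.

    Hypothetical helper for illustration; not part of the benchmark suites.
    """
    # Warm-up runs are executed but not timed, so one-time costs
    # (caches, lazy initialization) do not skew the samples.
    for _ in range(warmup):
        fn()

    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)

    # The median is less sensitive to occasional slow outliers than the mean.
    return statistics.median(samples)


if __name__ == "__main__":
    median_s = time_callable(lambda: sum(range(100_000)))
    print(f"median: {median_s * 1e6:.1f} us")
```

When timing GPU work, the timed callable would also need to wait for queued kernels to finish (for example by calling `torch.cuda.synchronize()`), otherwise only the launch overhead is measured.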

Setup environment

Make sure you're on a machine with CUDA, torchvision, and PyTorch installed. Install them in the following order:

# Install torchvision. It is installed together with the PyTorch stable release binary.
conda install pytorch torchvision -c pytorch

# Install the latest PyTorch master from source.
# It should supersede the installation from the release binary.
# $PYTORCH_HOME is assumed to be the root of your PyTorch source checkout.
cd "$PYTORCH_HOME"
python setup.py build develop

# Check the pytorch installation version
python -c "import torch; print(torch.__version__)"
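The version check above can be extended into a slightly broader sanity check. The sketch below is an illustration only, not part of the benchmark scripts: the function name `check_environment` is hypothetical. It uses the standard-library `importlib.util.find_spec` to see whether each package is importable, and only imports `torch` if it was found.

```python
import importlib.util


def check_environment(packages=("torch", "torchvision")):
    """Map each package name to True if it can be found on the import path.

    Hypothetical helper for illustration; not part of the benchmark suites.
    """
    return {name: importlib.util.find_spec(name) is not None for name in packages}


if __name__ == "__main__":
    status = check_environment()
    for name, found in status.items():
        print(f"{name}: {'found' if found else 'MISSING'}")

    # Only import torch if it is actually installed.
    if status["torch"]:
        import torch

        print("torch version:", torch.__version__)
        print("CUDA available:", torch.cuda.is_available())
```

If `CUDA available` prints `False`, the GPU benchmarks in the subfolders are unlikely to run as intended on that machine.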

Benchmark List

Please refer to each subfolder for details on its benchmark suite.

Fast RNNs benchmarks