PyTorch Benchmarks

NOTE: This folder is currently a work in progress.

This folder contains scripts that produce reproducible timings of various PyTorch features.

It also provides mechanisms to compare PyTorch with other frameworks.
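As a rough illustration of what "reproducible timings" usually involves, the sketch below warms up a callable, times it several times, and reports the median. It is not taken from any script in this folder: `time_callable`, `warmup`, and `repeats` are hypothetical names, and the code uses only the Python standard library. For GPU work the callable would also need to wait for the device to finish (for example by calling `torch.cuda.synchronize()`) before the timer stops.

```python
import statistics
import time


def time_callable(fn, warmup=3, repeats=10):
    """Time fn() and return summary statistics in seconds.

    Hypothetical helper for illustration; not part of the benchmark scripts.
    """
    # Warm-up runs are executed but not recorded, so one-time costs
    # (caches, lazy initialisation) do not skew the samples.
    for _ in range(warmup):
        fn()

    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)

    # The median is less sensitive to outliers than the mean.
    return {
        "median_s": statistics.median(samples),
        "min_s": min(samples),
        "max_s": max(samples),
        "runs": repeats,
    }


if __name__ == "__main__":
    stats = time_callable(lambda: sum(range(100_000)))
    print(stats)
```

Reporting the median alongside the minimum and maximum makes it easier to spot noisy runs before comparing results across machines or frameworks.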

Setup environment

Make sure the machine has CUDA, torchvision, and PyTorch installed. Install them in the following order:

# Install torchvision along with the PyTorch stable release binary.
conda install pytorch torchvision -c pytorch

# Install the latest pytorch master from source.
# It should supersede the installation from the release binary.
cd $PYTORCH_HOME
python setup.py build develop

# Check the pytorch installation version
python -c "import torch; print(torch.__version__)"
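The version check above can be extended into a slightly fuller environment check. The following is a minimal sketch, not part of this folder: `check_environment` is a hypothetical helper that reports the installed versions of `torch` and `torchvision` (or `None` when a package is missing or fails to import) and whether CUDA is usable, using `torch.cuda.is_available()`.

```python
import importlib
import importlib.util


def check_environment(packages=("torch", "torchvision")):
    """Map each package name to its version string, or None if unavailable.

    Hypothetical helper for illustration; also adds a 'cuda_available' flag.
    """
    report = {}
    for name in packages:
        # find_spec returns None when the package is not installed.
        if importlib.util.find_spec(name) is None:
            report[name] = None
            continue
        try:
            module = importlib.import_module(name)
        except Exception:
            # Installed but broken (for example a mismatched build).
            report[name] = None
            continue
        report[name] = getattr(module, "__version__", "unknown")

    cuda_available = False
    if report.get("torch") is not None:
        import torch

        cuda_available = bool(torch.cuda.is_available())
    report["cuda_available"] = cuda_available
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If `torch` reports the release version rather than the source build, the source installation has probably not superseded the binary and the build step may need to be repeated.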

Benchmark List

Each benchmark suite lives in its own subfolder; refer to the subfolder for details on that suite.