pytorch/benchmarks
Bert Maher b7261de0df [pytorch][te] Add compilation time benchmark (#46124)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46124

We want to make sure we can actually fuse kernels within a fairly
tight time budget.  So here's a quick benchmark of codegen for a simple
pointwise activation function (swish).  I kept all the intermediate tensors
separate to force TE to actually do inlining.
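The benchmarked expression can be sketched in eager PyTorch as follows (a rough illustration, assuming swish(x) = x * sigmoid(x); the actual benchmark constructs the TE IR directly, keeping every intermediate as its own tensor):

```python
import torch

# Hedged sketch of the benchmarked pointwise activation, assuming
# swish(x) = x * sigmoid(x). Each step is kept as a separate tensor,
# mirroring how the benchmark forces the fuser to inline intermediates
# rather than fusing one pre-flattened expression.
def swish(x):
    sig = torch.sigmoid(x)  # intermediate tensor 1
    out = x * sig           # intermediate tensor 2: the final result
    return out
```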

Test Plan:
```
buck run mode/opt //caffe2/benchmarks/cpp/tensorexpr:tensorexpr_bench
```

I've only run it in debug mode, so results aren't super meaningful, but even in
that mode compilation takes 18ms, 15ms of which is spent in LLVM.

Update, opt build mode:
```
----------------------------------------------------------------------------
Benchmark                                     Time           CPU Iterations
----------------------------------------------------------------------------
BM_CompileSwish                         5123276 ns    5119846 ns        148
BM_CompileSwishLLVMOnly                 4754361 ns    4753701 ns        160
```

Reviewed By: asuhan

Differential Revision: D24232801

fbshipit-source-id: d58a8b7f79bcd9244c49366af7a693e09f24bf76
2020-10-09 23:11:37 -07:00
| Path | Last commit | Date |
| --- | --- | --- |
| cpp/tensorexpr | [pytorch][te] Add compilation time benchmark (#46124) | 2020-10-09 23:11:37 -07:00 |
| distributed/ddp | Add distributed data parallel benchmark tool (#35198) | 2020-04-08 15:07:03 -07:00 |
| fastrnns | Benchmarks: tweak PE config settings. (#45349) | 2020-09-26 23:13:29 -07:00 |
| framework_overhead_benchmark | Remove py2 compatible future imports (#44735) | 2020-09-16 12:55:57 -07:00 |
| functional_autograd_benchmark | Reland of benchmark code (#43428) | 2020-08-24 13:27:26 -07:00 |
| operator_benchmark | [quant][pyper] Rename the sparse argument for embedding_bag ops (#46003) | 2020-10-08 16:15:28 -07:00 |
| overrides_benchmark | Add `__torch_function__` for methods (#37091) | 2020-08-05 20:44:13 -07:00 |
| profiler_benchmark | Source code level attribution in profiler (#43898) | 2020-09-30 00:57:35 -07:00 |
| record_function_benchmark | Fix D23995953 import. | 2020-09-29 19:30:23 -07:00 |
| serialization | [JIT] Make new zip serialization for torch save/load significantly (~70%) faster (#38379) | 2020-05-29 01:56:18 -07:00 |
| static_runtime | [StaticRuntime] Integrate Static Runtime into PyTorchPredictor (#45640) | 2020-10-02 23:03:05 -07:00 |
| tensorexpr | [JIT] Add dynamic shape benchmark for NV Fuser (#46107) | 2020-10-09 22:09:21 -07:00 |
| compare-fastrnn-results.py | Benchmarks: add scripts for FastRNNs results comparison. (#44134) | 2020-09-03 13:44:42 -07:00 |
| compare.sh | Benchmarks: add scripts for FastRNNs results comparison. (#44134) | 2020-09-03 13:44:42 -07:00 |
| README.md | Fix spelling errors | 2020-01-28 04:46:15 -08:00 |
| upload_scribe.py | Benchmarks: make fuser and executor configurable from command line. (#44291) | 2020-09-09 11:59:35 -07:00 |

# PyTorch Benchmarks

NOTE: This folder is currently a work in progress.

This folder contains scripts that produce reproducible timings of various PyTorch features.

It also provides mechanisms to compare PyTorch with other frameworks.

## Setup environment

Make sure you're on a machine with CUDA available, then install pytorch and torchvision in the following order:

```
# Install torchvision. It comes with the pytorch stable release binary
conda install pytorch torchvision -c pytorch

# Install the latest pytorch master from source.
# It should supersede the installation from the release binary.
cd $PYTORCH_HOME
python setup.py build develop

# Check the pytorch installation version
python -c "import torch; print(torch.__version__)"
```
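As a sanity check that the source build superseded the release binary, the reported version string can be inspected (a minimal sketch; the exact string depends on your checkout):

```python
import torch

# A build from source typically reports a dev version string such as
# "1.8.0a0+<sha>", while the conda release binary reports a plain
# release number like "1.6.0".
print(torch.__version__)
```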

## Benchmark List

Please refer to each subfolder to discover its benchmark suite.