pytorch/benchmarks
Xuan Zhang c2d1b225e6 [PT2][partitioners] raise getitems in partitioners to allow earlier release of buffers (#155809)
**Problem & Solution:**
Assume we have something like:
```
x = some_op(...)
x0 = x[0]
do_something_with_and_is_last_use_of(x0)
do_a_bunch_of_other_things()
x1 = x[1]
```
In this case, the memory associated with `x0` cannot be released until `x1 = x[1]`. Since `x1 = x[1]` does not use additional memory, it would be beneficial to move and `x1 = x[1]` and all such `getitem` operations to be immediately after `x = some_op(...)` such as
```
x = some_op(...)
x0 = x[0]
x1 = x[1]
do_something_with_and_is_last_use_of(x0)
do_a_bunch_of_other_things()
```

**Results:**
For instance, for the `res2net101_26w_4s` model in pytorch benchmark, when running with `aot_eager` backend and with `activation_memory_budget=0.4`, the peak memory are
* baseline: 7.73GiB
* with the chage: 6.45GiB

As a sanity check, for the same setting with `inductor` backend, the peak memory is not regressed.

cc and credit to @ShatianWang for noticing this issue.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/155809
Approved by: https://github.com/fmassa, https://github.com/bdhirsh
2025-06-21 19:57:21 +00:00
..
distributed/ddp [BE] Remove outdated RPC benchmark (#146716) 2025-03-29 04:44:36 +00:00
dynamo [PT2][partitioners] raise getitems in partitioners to allow earlier release of buffers (#155809) 2025-06-21 19:57:21 +00:00
fastrnns [BE] fix typos in benchmarks/ (#156077) 2025-06-17 13:12:18 +00:00
framework_overhead_benchmark Fix unused Python variables outside torch/ and test/ (#136359) 2024-12-11 17:10:23 +00:00
functional_autograd_benchmark [BE] fix typos in benchmarks/ (#156077) 2025-06-17 13:12:18 +00:00
fuser Fix unused Python variables outside torch/ and test/ (#136359) 2024-12-11 17:10:23 +00:00
gpt_fast [BE] fix typos in benchmarks/ (#156077) 2025-06-17 13:12:18 +00:00
inductor_backends Rename inductor cache (#156128) 2025-06-17 03:57:18 +00:00
inference [BE] fix typos in benchmarks/ (#156077) 2025-06-17 13:12:18 +00:00
instruction_counts [BE] fix typos in benchmarks/ (#156077) 2025-06-17 13:12:18 +00:00
nested Fix unused Python variables outside torch/ and test/ (#136359) 2024-12-11 17:10:23 +00:00
operator_benchmark [BE] fix typos in benchmarks/ (#156077) 2025-06-17 13:12:18 +00:00
overrides_benchmark [BE][Easy][3/19] enforce style for empty lines in import segments in benchmarks/ (#129754) 2024-07-17 14:34:42 +00:00
profiler_benchmark Apply TorchFix TOR203 fixes (#143691) 2024-12-23 18:21:03 +00:00
record_function_benchmark
serialization Fix unused Python variables outside torch/ and test/ (#136359) 2024-12-11 17:10:23 +00:00
sparse [build] Change --cmake{,-only} arguments to envvars to support modern Python build frontend (#156045) 2025-06-17 11:40:24 +00:00
static_runtime [3/N] Use internal linkage in C++ files (#151297) 2025-05-05 17:48:39 +00:00
tensorexpr [BE] fix typos in benchmarks/ (#156077) 2025-06-17 13:12:18 +00:00
transformer [BE] fix typos in benchmarks/ (#156077) 2025-06-17 13:12:18 +00:00
compare-fastrnn-results.py [BE][Easy][3/19] enforce style for empty lines in import segments in benchmarks/ (#129754) 2024-07-17 14:34:42 +00:00
compare.sh
README.md Removing conda references from PyTorch Docs (#152702) 2025-05-20 20:33:28 +00:00
upload_scribe.py Fix broken URLs (#152237) 2025-04-27 09:56:42 +00:00

PyTorch Benchmarks

This folder contains scripts that produce reproducible timings of various PyTorch features.

It also provides mechanisms to compare PyTorch with other frameworks.

Setup environment

Make sure you're on a machine with CUDA, torchvision, and pytorch installed. Install in the following order:

# Install torchvision. It comes with the pytorch stable release binary
pip3 install torch torchvision

# Install the latest pytorch master from source.
# It should supersede the installation from the release binary.
cd $PYTORCH_HOME
python setup.py build develop

# Check the pytorch installation version
python -c "import torch; print(torch.__version__)"

Benchmark List

Please refer to each subfolder to discover each benchmark suite. Links are provided where descriptions exist: