# PyTorch Data Sampler benchmark (#156974)

## Motivation

Many PRs optimizing samplers (e.g. https://github.com/pytorch/pytorch/pull/147706, https://github.com/pytorch/pytorch/pull/137423) rely on an ad hoc script for benchmarking samplers, and the script and its outputs are often copied over into the PRs. We want to begin centralizing benchmarks for torch.utils.data components.

## What?

* This PR adds a new `data` sub-folder in `benchmarks`, aimed at covering benchmarking scripts for torch.utils.data components such as the dataloader and samplers.
* Specifically, this PR includes a simple script to time samplers, which is often "copy-pasted" into PRs optimizing samplers. Having it in a centralized location should prevent that and allow a common standard.

## Output

```
Benchmark Results:
+--------------+-------------+----------------+-----------+-----------+
|   Batch Size | Drop Last   |   Original (s) |   New (s) | Speedup   |
+==============+=============+================+===========+===========+
|            4 | True        |          0.004 |    0.0088 | -119.62%  |
+--------------+-------------+----------------+-----------+-----------+
|            4 | False       |         0.0083 |     0.009 | -9.23%    |
+--------------+-------------+----------------+-----------+-----------+
|            8 | True        |          0.003 |    0.0074 | -147.64%  |
+--------------+-------------+----------------+-----------+-----------+
|            8 | False       |         0.0054 |    0.0075 | -38.72%   |
+--------------+-------------+----------------+-----------+-----------+
|           64 | True        |         0.0021 |    0.0056 | -161.92%  |
+--------------+-------------+----------------+-----------+-----------+
|           64 | False       |         0.0029 |    0.0055 | -92.50%   |
+--------------+-------------+----------------+-----------+-----------+
|          640 | True        |          0.002 |    0.0055 | -168.75%  |
+--------------+-------------+----------------+-----------+-----------+
|          640 | False       |         0.0024 |    0.0062 | -161.35%  |
+--------------+-------------+----------------+-----------+-----------+
|         6400 | True        |         0.0021 |    0.0055 | -160.13%  |
+--------------+-------------+----------------+-----------+-----------+
|         6400 | False       |         0.0021 |    0.0068 | -215.46%  |
+--------------+-------------+----------------+-----------+-----------+
|        64000 | True        |         0.0042 |    0.0065 | -55.29%   |
+--------------+-------------+----------------+-----------+-----------+
|        64000 | False       |         0.0029 |    0.0077 | -169.56%  |
+--------------+-------------+----------------+-----------+-----------+
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/156974
Approved by: https://github.com/ramanishsingh
This commit is contained in:
parent
195ef1bce8
commit
e6d8ed02cb
benchmarks/README.md

```diff
@@ -31,3 +31,4 @@ Please refer to each subfolder to discover each benchmark suite. Links are provi
 * [Overrides](overrides_benchmark/README.md)
 * [Sparse](sparse/README.md)
 * [Tensor expression](tensorexpr/HowToRun.md)
+* [Data](data/README.md)
```
benchmarks/data/README.md (new file, 62 lines)
@@ -0,0 +1,62 @@

# PyTorch Data Benchmarks

This directory contains benchmarks for the `torch.utils.data` module components, focusing on the performance of samplers.

## Dependencies

The benchmarks require the following dependencies:
```
numpy
tabulate
```

You can install them using pip:
```bash
pip install numpy tabulate
```

## Running the benchmarks

To run the BatchSampler benchmark:
```bash
python samplers_benchmark.py
```
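
The script prints a progress line and per-configuration speedup as it runs, followed by a summary table at the end, along these lines (timings vary by machine; the values below are from the PR's own output):
```
Benchmarking with batch_size=4, drop_last=True
Speedup: -119.62%

Benchmarking with batch_size=4, drop_last=False
Speedup: -9.23%
...

Benchmark Results:
+--------------+-------------+----------------+-----------+-----------+
|   Batch Size | Drop Last   |   Original (s) |   New (s) | Speedup   |
+==============+=============+================+===========+===========+
|            4 | True        |          0.004 |    0.0088 | -119.62%  |
...
```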

## Sampler Benchmark

The `samplers_benchmark.py` script benchmarks the performance of PyTorch's `BatchSampler` against an alternative implementation as an example. It tests with the following parameters:

- Batch sizes: 4, 8, 64, 640, 6400, 64000
- Drop last options: True, False
- Each configuration is run 10 times and averaged
- Results include speedup percentage calculations (see the timing sketch below)
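
Each configuration is timed by constructing the sampler fresh and iterating it to exhaustion. A condensed sketch of the timing loop in `samplers_benchmark.py` (one configuration shown; the full script also sleeps briefly between runs to reduce system load):

```python
import time

import numpy as np
from torch.utils.data import BatchSampler, SequentialSampler

times = []
for _ in range(10):  # AVG_TIMES
    start = time.perf_counter()
    # Full iteration of the sampler is the operation being measured
    for _ in BatchSampler(SequentialSampler(range(99999)), batch_size=64, drop_last=True):
        pass
    times.append(time.perf_counter() - start)
print(f"avg: {float(np.mean(times)):.4f}s")
```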

### Output

The benchmark outputs a table with the following columns:
- Batch Size
- Drop Last
- Original (s): Time taken by the original implementation
- New (s): Time taken by the alternative implementation
- Speedup: Percentage improvement of the new implementation over the original
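
Speedup is computed as `(original_avg - new_avg) / original_avg * 100`, so a positive value means the new implementation is faster and a negative value means it is slower than the original.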

Example output:
```
+------------+-----------+---------------+----------+---------+
| Batch Size | Drop Last | Original (s)  | New (s)  | Speedup |
+============+===========+===============+==========+=========+
| 4          | True      | 0.1234        | 0.1000   | 18.96%  |
+------------+-----------+---------------+----------+---------+
| 4          | False     | 0.1345        | 0.1100   | 18.22%  |
+------------+-----------+---------------+----------+---------+
...
```

### Extending the Benchmark

To benchmark a different implementation, on a local checkout:

1. Modify the `NewBatchSampler` class in `samplers_benchmark.py` with your implementation. Similarly, replace `BatchSampler` with the corresponding PyTorch implementation.
   * Be sure to pass all required inputs, such as `replacement` for `RandomSampler` and its variants (see the sketch below).
2. Run the benchmark to compare its performance against the original.
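
For example, to benchmark a change to `RandomSampler`, the two implementations under test might be set up as below. This is a hypothetical sketch: `MyRandomSampler` stands in for your modified implementation and `time_sampler` is an illustrative helper rather than part of the script; note that `replacement` is forwarded to both implementations.

```python
import time

import numpy as np
from torch.utils.data import RandomSampler

DATA_SIZE = 99999


def time_sampler(make_sampler, runs=10):
    # Average the time to fully iterate a freshly constructed sampler
    times = []
    for _ in range(runs):
        start = time.perf_counter()
        for _ in make_sampler():
            pass
        times.append(time.perf_counter() - start)
    return float(np.mean(times))


# Baseline: the current PyTorch implementation, with all of its inputs
baseline_avg = time_sampler(lambda: RandomSampler(range(DATA_SIZE), replacement=False))

# Candidate: swap in your implementation, constructed with the same inputs, e.g.
# candidate_avg = time_sampler(lambda: MyRandomSampler(range(DATA_SIZE), replacement=False))
print(f"baseline: {baseline_avg:.4f}s")
```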

benchmarks/data/samplers_benchmark.py (new file, 143 lines)
@@ -0,0 +1,143 @@

```python
#!/usr/bin/env python3

import time
from collections.abc import Iterable, Iterator
from typing import Union

import numpy as np
from tabulate import tabulate

from torch.utils.data import BatchSampler, Sampler, SequentialSampler


class NewBatchSampler(Sampler[list[int]]):
    """Alternative implementation of BatchSampler for benchmarking purposes."""

    def __init__(
        self,
        sampler: Union[Sampler[int], Iterable[int]],
        batch_size: int,
        drop_last: bool,
    ) -> None:
        if (
            not isinstance(batch_size, int)
            or isinstance(batch_size, bool)
            or batch_size <= 0
        ):
            raise ValueError(
                f"batch_size should be a positive integer value, but got batch_size={batch_size}"
            )
        if not isinstance(drop_last, bool):
            raise ValueError(
                f"drop_last should be a boolean value, but got drop_last={drop_last}"
            )
        self.sampler = sampler
        self.batch_size = batch_size
        self.drop_last = drop_last

    def __iter__(self) -> Iterator[list[int]]:
        if self.drop_last:
            # Pull batch_size items at a time; a StopIteration mid-batch
            # discards the incomplete batch, which is exactly drop_last semantics.
            sampler_iter = iter(self.sampler)
            while True:
                try:
                    batch = [next(sampler_iter) for _ in range(self.batch_size)]
                    yield batch
                except StopIteration:
                    break
        else:
            # Fill a preallocated list and yield the partial batch at the end.
            batch = [0] * self.batch_size
            idx_in_batch = 0
            for idx in self.sampler:
                batch[idx_in_batch] = idx
                idx_in_batch += 1
                if idx_in_batch == self.batch_size:
                    yield batch
                    idx_in_batch = 0
                    batch = [0] * self.batch_size
            if idx_in_batch > 0:
                yield batch[:idx_in_batch]

    def __len__(self) -> int:
        # Can only be called if self.sampler has __len__ implemented
        if self.drop_last:
            return len(self.sampler) // self.batch_size  # type: ignore[arg-type]
        else:
            return (len(self.sampler) + self.batch_size - 1) // self.batch_size  # type: ignore[arg-type]


def main():
    """Run benchmark with specified parameters."""
    DATA_SIZE = 99999
    AVG_TIMES = 10
    BATCH_SIZES = [4, 8, 64, 640, 6400, 64000]
    DROP_LAST_OPTIONS = [True, False]

    results = []

    # Set up samplers here; ensure the right args are passed in
    baselineSampler = BatchSampler
    testSampler = NewBatchSampler

    for batch_size in BATCH_SIZES:
        for drop_last in DROP_LAST_OPTIONS:
            print(f"Benchmarking with batch_size={batch_size}, drop_last={drop_last}")

            # Benchmark baselineSampler
            original_times = []
            for _ in range(AVG_TIMES):
                start = time.perf_counter()
                for _ in baselineSampler(
                    sampler=SequentialSampler(range(DATA_SIZE)),
                    batch_size=batch_size,
                    drop_last=drop_last,
                ):
                    pass
                end = time.perf_counter()
                original_times.append(end - start)
                time.sleep(0.1)  # Small delay to reduce system load

            original_avg = float(np.mean(original_times))

            # Benchmark testSampler
            new_times = []
            for _ in range(AVG_TIMES):
                start = time.perf_counter()
                for _ in testSampler(
                    sampler=SequentialSampler(range(DATA_SIZE)),
                    batch_size=batch_size,
                    drop_last=drop_last,
                ):
                    pass
                end = time.perf_counter()
                new_times.append(end - start)
                time.sleep(0.1)  # Small delay to reduce system load

            new_avg = float(np.mean(new_times))

            # Calculate speedup: positive means the new implementation is faster
            if original_avg > 0 and new_avg > 0:
                speedup = (original_avg - new_avg) / original_avg * 100
                speedup_str = f"{speedup:.2f}%"
            else:
                speedup_str = "N/A"

            print(f"Speedup: {speedup_str}\n")

            results.append(
                [
                    batch_size,
                    drop_last,
                    f"{original_avg:.4f}",
                    f"{new_avg:.4f}",
                    speedup_str,
                ]
            )

    # Print results in a table
    headers = ["Batch Size", "Drop Last", "Original (s)", "New (s)", "Speedup"]
    print("\nBenchmark Results:")
    print(tabulate(results, headers=headers, tablefmt="grid"))


if __name__ == "__main__":
    main()
```
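
A quick way to sanity-check `NewBatchSampler`'s batching behavior, run from the same module (illustrative only, not part of the file):

```python
from torch.utils.data import SequentialSampler

# The partial final batch is kept when drop_last=False...
print(list(NewBatchSampler(SequentialSampler(range(10)), batch_size=4, drop_last=False)))
# [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]

# ...and discarded when drop_last=True
print(list(NewBatchSampler(SequentialSampler(range(10)), batch_size=4, drop_last=True)))
# [[0, 1, 2, 3], [4, 5, 6, 7]]
```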
torch/utils/data/sampler.py

```diff
@@ -6,6 +6,12 @@ from typing import Generic, Optional, TypeVar, Union
 import torch
 
 
+# Note: For benchmarking changes to samplers, see:
+# /benchmarks/data/samplers_benchmark.py
+# This benchmark compares the performance of different sampler implementations
+# and can be used to evaluate the impact of optimizations.
+
+
 __all__ = [
     "BatchSampler",
     "RandomSampler",
```
```diff
@@ -324,7 +330,6 @@ class BatchSampler(Sampler[list[int]]):
         self.drop_last = drop_last
 
     def __iter__(self) -> Iterator[list[int]]:
-        # Implemented based on the benchmarking in https://github.com/pytorch/pytorch/pull/76951
         sampler_iter = iter(self.sampler)
         if self.drop_last:
             # Create multiple references to the same iterator
```