4 Commits

Author SHA1 Message Date
Mikayla Gawarecki
b0c9ccdc4b Add standard deviation of metrics over runs to inference benchmark (#113309)
Run each `(batch_size, compile)` benchmark 10 times in `./runner.sh` and report the mean and standard deviation of each metric in the output table

Only report `warmup_latency`, `average_latency`, `throughput` and `gpu_util`

Split `output.md` into one markdown file per `(batch_size, compile)` configuration. Each further run of `./runner.sh` appends one row to the table in each file for easy comparison
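
As an illustration of the aggregation described above, here is a minimal Python sketch that computes the mean and sample standard deviation of each reported metric over several runs and appends one row to a per-configuration markdown file. It is not the code from the PR: the function names `summarize` and `append_row`, the `mean +/- std` cell format, and the underscore spelling of the metric keys are assumptions made for the example.

```python
import statistics
from pathlib import Path

# Metric names follow the commit message; the exact keys are an assumption.
METRICS = ["warmup_latency", "average_latency", "throughput", "gpu_util"]


def summarize(runs):
    """Return {metric: (mean, sample std)} over a list of per-run metric dicts."""
    summary = {}
    for name in METRICS:
        values = [run[name] for run in runs]
        # The sample standard deviation needs at least two data points.
        std = statistics.stdev(values) if len(values) > 1 else 0.0
        summary[name] = (statistics.mean(values), std)
    return summary


def append_row(path, summary):
    """Append one table row to `path`, writing the header if the file is new."""
    path = Path(path)
    lines = []
    if not path.exists():
        lines.append("| " + " | ".join(METRICS) + " |")
        lines.append("|" + " --- |" * len(METRICS))
    cells = []
    for name in METRICS:
        mean, std = summary[name]
        cells.append(f"{mean:.3f} +/- {std:.3f}")
    lines.append("| " + " | ".join(cells) + " |")
    with path.open("a") as f:
        f.write("\n".join(lines) + "\n")
```

Calling `append_row` once per invocation of the runner, with one file per `(batch_size, compile)` pair, gives a table that grows by one row each time, which matches the comparison workflow described in the message.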

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113309
Approved by: https://github.com/albanD
2023-11-09 18:38:05 +00:00
Mikayla Gawarecki
df149581bc Tabulate outputs in inference benchmark (#112900)
- Fix error where script was always compiling model
- Make `runner.sh` parse outputs into nice `.md` format
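
The message does not say what caused the always-compile error. One common way a boolean command-line option ends up always true in Python is `type=bool`, because `bool("False")` is `True`. The sketch below shows one way to avoid that class of bug with `argparse.BooleanOptionalAction` (Python 3.9+) and how a sweep could build matching argument lists. The flag names `--batch_size`, `--compile` and `--no-compile` and the helper names are assumptions for illustration, not necessarily the script's actual interface.

```python
import argparse
import itertools


def parse_args(argv):
    """Parse hypothetical benchmark flags with an explicit on/off switch."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--batch_size", type=int, default=32)
    # BooleanOptionalAction generates both --compile and --no-compile.
    # By contrast, type=bool would turn the string "False" into True.
    parser.add_argument(
        "--compile", action=argparse.BooleanOptionalAction, default=True
    )
    return parser.parse_args(argv)


def sweep_arguments(batch_sizes=(1, 32, 64, 128, 256), compile_options=(True, False)):
    """Yield one argument list per (batch_size, compile) configuration."""
    for batch_size, compile_model in itertools.product(batch_sizes, compile_options):
        flag = "--compile" if compile_model else "--no-compile"
        yield ["--batch_size", str(batch_size), flag]
```

Each yielded list can be passed to `parse_args` directly, or appended to a command line that launches the benchmark script once per configuration.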

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112900
Approved by: https://github.com/albanD
ghstack dependencies: #112582, #112863
2023-11-03 23:53:30 +00:00
Mikayla Gawarecki
c799689437 Refactor inference benchmark and add runner script to do sweep (#112863)
- Added `runner.sh` that does a sweep over `batch_size=(1, 32, 64, 128, 256)` and `compile=(True, False)`
- Added GPU utilization as a metric
- Converted the frontend from 2 processes (one putting requests into `request_queue`, one reading from `response_queue` and collecting metrics) to a single process with 3 threads (one putting requests into `request_queue`, one reading from `response_queue` and collecting metrics, and one polling `nvidia-smi` for GPU utilization)
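
The three-thread frontend described above can be sketched roughly as follows. This is a simplified illustration and not the PR's implementation: the function names are hypothetical, an echo thread stands in for the model backend, and the GPU reader is passed in as a callable so the sketch also runs on a machine without a GPU. The `nvidia-smi` query flags shown are standard, but reading only the first GPU is an assumption.

```python
import queue
import subprocess
import threading
import time


def nvidia_smi_gpu_util():
    """Read utilization (%) of the first GPU; assumes nvidia-smi is on PATH."""
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=utilization.gpu", "--format=csv,noheader,nounits"],
        text=True,
    )
    return float(out.splitlines()[0])


def run_frontend(num_requests, read_gpu_util, poll_interval=0.01):
    request_queue = queue.Queue()
    response_queue = queue.Queue()
    done = threading.Event()
    metrics = {"latencies": [], "gpu_util": []}

    def send_requests():
        # Thread 1: put timestamped requests into request_queue.
        for i in range(num_requests):
            request_queue.put((i, time.perf_counter()))
        request_queue.put(None)  # sentinel: no more requests

    def echo_backend():
        # Stand-in for the model worker; not one of the 3 frontend threads.
        while (item := request_queue.get()) is not None:
            response_queue.put(item)
        response_queue.put(None)

    def read_responses():
        # Thread 2: read responses and record per-request latency.
        while (item := response_queue.get()) is not None:
            _, sent_at = item
            metrics["latencies"].append(time.perf_counter() - sent_at)
        done.set()

    def poll_gpu():
        # Thread 3: sample GPU utilization until all responses are in.
        while True:
            metrics["gpu_util"].append(read_gpu_util())
            if done.wait(poll_interval):
                break

    threads = [
        threading.Thread(target=fn)
        for fn in (send_requests, echo_backend, read_responses, poll_gpu)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return metrics
```

On a machine with an NVIDIA GPU, `run_frontend(100, nvidia_smi_gpu_util)` would sample real utilization. Passing a stub such as `lambda: 0.0` exercises the threading logic anywhere.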

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112863
Approved by: https://github.com/albanD
ghstack dependencies: #112582
2023-11-03 20:26:43 +00:00
Mikayla Gawarecki
7cbf9869d5 Add v0 inference benchmark script (#112582)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112582
Approved by: https://github.com/albanD
2023-11-02 17:21:15 +00:00