Mirror of https://github.com/zebrajr/pytorch.git, synced 2025-12-06 12:20:52 +01:00
[AOTI][dashboard] Update how peak memory is measured (#150534)
Summary: In the dashboard measurement script, AOTI needs to run Eager first to register the output pytree, so the peak memory compression ratio on the dashboard is always close to 1. Update the AOTI run to use an extra warmup run, so that the peak memory compression ratio measures the result at run time instead of at compile time.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/150534
Approved by: https://github.com/yushangdi
This commit is contained in:
parent 6fa1b17195
commit d4c30b4599
@@ -3735,6 +3735,10 @@ def run(runner, args, original_dir=None):
         # AOTInductor doesn't support control flow yet
         runner.skip_models.update(runner.skip_models_due_to_control_flow)
         runner.skip_models.update(runner.skip_models_due_to_export_not_supported)
+
+        # For AOTI, we only measure the memory compression ratio at the run time
+        # instead of the compile time, so use a warmup run to trigger AOTI compilation.
+        args.use_warm_peak_memory = True
     elif args.backend == "torchao":
         assert "cuda" in args.devices, "Quantization requires CUDA device."
         assert args.bfloat16, "Quantization requires dtype bfloat16."
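The rationale for the warmup run can be sketched in plain Python. This is a toy illustration, not PyTorch's actual benchmarking code: `tracemalloc` stands in for CUDA peak-memory tracking, and `LazyCompiledModel` mimics a model whose first call pays a large one-time compile-time allocation (as AOTI compilation does). Without a warmup, the measured peak includes that compile-time spike; with one, it reflects run time only.

```python
import tracemalloc

class LazyCompiledModel:
    """Toy stand-in for an AOTI-compiled model: the first call pays a
    large compile-time allocation; later calls only pay the run-time cost."""
    def __init__(self):
        self._compiled = False

    def __call__(self, n):
        if not self._compiled:
            scratch = bytearray(8_000_000)  # simulated compile-time peak
            del scratch
            self._compiled = True
        return bytes(n)  # simulated run-time allocation

def measure_peak(model, n, warmup):
    """Peak memory of model(n); the warmup call moves 'compilation'
    outside the measured region, as the dashboard change does."""
    if warmup:
        model(n)
    tracemalloc.start()
    model(n)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak

cold = measure_peak(LazyCompiledModel(), 1_000_000, warmup=False)
warm = measure_peak(LazyCompiledModel(), 1_000_000, warmup=True)
assert warm < cold  # warm run excludes the compile-time spike
```

With the warmup in place, the compression ratio compares run-time peaks on both sides, instead of comparing an Eager run-time peak against an AOTI compile-time peak.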