Mirror of https://github.com/zebrajr/pytorch.git, synced 2025-12-06 12:20:52 +01:00
https://github.com/pytorch/pytorch/pull/152708 expanded support for `get_estimated_runtime` to many more types of `SchedulerNode`. This caused an increase in compile time because `get_estimated_runtime` is now always called to populate the metrics table. This PR adds a flag to gate that logging, which reduces the instruction count by 8%. Long term, we should probably merge metrics.py with TORCH_LOGS/tlparse (suggestion from @xmfan).

Update: added TORCH_LOGS support for the metrics logging.

Test Plan: mm_loop.py and many existing tests cover this change.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/153506
Approved by: https://github.com/eellison
Directory contents:

- operator_inp_logs
- __init__.py
- analyze_templates.py
- bench_mm_fusion.py
- benchmark_helper.py
- cache_debug_microbenchmarks.py
- cache_hit_microbenchmarks.py
- dynamo_guard_eval.py
- dynamo_microbenchmarks.py
- fx_microbenchmarks.py
- inductor_bmm.py
- inductor_cpu_atomic.py
- inductor_mm.py
- matmul_relu.py
- microbench.py
- model.py
- operator_inp_utils.py
- operatorbench.py
- overheads.py
- tensor_layout_mini_benchmark.py
- utils.py