
PyTorch CI Stats

We track various stats about each CI job.

  1. Jobs upload their artifacts to an intermediate data store (either GitHub Actions artifacts or S3, depending on what permissions the job has). Example: a9f6a35a33/.github/workflows/_linux-build.yml (L144-L151)
  2. When a workflow completes, a workflow_run event triggers upload-test-stats.yml.
  3. upload-test-stats downloads the raw stats from the intermediate data store and uploads them as JSON to S3, from which they are then loaded into our database backend.
```mermaid
graph LR
    J1[Job with AWS creds<br>e.g. linux, win] --raw stats--> S3[(AWS S3)]
    J2[Job w/o AWS creds<br>e.g. mac] --raw stats--> GHA[(GH artifacts)]

    S3 --> uts[upload-test-stats.yml]
    GHA --> uts

    uts --json--> s3[(s3)]
    s3 --> DB[(database)]
```
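
The last hop of the flow above (raw stats in, JSON out to S3) can be sketched roughly as follows. This is only an illustration, not the code in this directory: the function names, the record fields, the key layout, and the choice of gzip-compressed newline-delimited JSON are all assumptions made for the example, and the actual S3 upload call is left out.

```python
from __future__ import annotations

import gzip
import json
from typing import Any


def to_gzipped_json_lines(records: list[dict[str, Any]]) -> bytes:
    """Serialize records as gzip-compressed, newline-delimited JSON.

    Hypothetical helper: one JSON document per line, which is a common
    shape for bulk ingestion from object storage.
    """
    body = "\n".join(json.dumps(record, sort_keys=True) for record in records)
    return gzip.compress(body.encode("utf-8"))


def object_key(name: str, workflow_run_id: int, run_attempt: int) -> str:
    """Build a hypothetical object key that groups stats by run and attempt."""
    return f"{name}/{workflow_run_id}/{run_attempt}.json.gz"


if __name__ == "__main__":
    # Made-up records standing in for the raw stats downloaded in step 3.
    raw_stats = [
        {"job": "linux-build", "metric": "build_time_s", "value": 1234.5},
        {"job": "macos-test", "metric": "test_time_s", "value": 987.0},
    ]
    payload = to_gzipped_json_lines(raw_stats)
    key = object_key("example_stats", workflow_run_id=123456, run_attempt=1)
    print(key, len(payload), "bytes")
```

Keying objects by workflow run and attempt is one way to make re-uploads for the same run overwrite rather than duplicate data; whether the real scripts do this is not stated here.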

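Why this weird indirection? Because writing to the database requires special permissions which, for security reasons, we do not want to give to pull request CI. Instead, we follow GitHub's recommended pattern for cases like this: the unprivileged jobs only produce artifacts, and a separate workflow triggered by workflow_run, which does hold the permissions, processes them.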
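
To make the indirection more concrete: the privileged workflow does not run pull request code. It only needs to know which run just finished so it can fetch that run's stats. Below is a minimal sketch of reading that information from a workflow_run event payload. The `CompletedRun` class and `parse_workflow_run_event` function are hypothetical names for this example; the payload fields used (`id`, `run_attempt`, `name`, `conclusion` under `workflow_run`) are part of GitHub's documented event payload.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class CompletedRun:
    """Identifies the finished workflow run whose stats should be uploaded."""

    run_id: int
    run_attempt: int
    workflow_name: str
    conclusion: str | None


def parse_workflow_run_event(event: dict[str, Any]) -> CompletedRun:
    """Extract the run identity from a workflow_run event payload."""
    run = event["workflow_run"]
    return CompletedRun(
        run_id=run["id"],
        run_attempt=run["run_attempt"],
        workflow_name=run["name"],
        conclusion=run.get("conclusion"),
    )


if __name__ == "__main__":
    # Trimmed, made-up payload; a real one carries many more fields.
    sample_event = {
        "workflow_run": {
            "id": 42,
            "run_attempt": 2,
            "name": "pull",
            "conclusion": "success",
        }
    }
    print(parse_workflow_run_event(sample_event))
```

Inside GitHub Actions, the payload is available as a JSON file whose path is given by the `GITHUB_EVENT_PATH` environment variable, so the dictionary passed in would typically come from `json.load` on that file.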

For more details about which stats we export, check out upload-test-stats.yml.