mirror of
https://github.com/zebrajr/pytorch.git
synced 2025-12-06 12:20:52 +01:00
Some disabled test runs weren't being uploaded as disabled tests because some dynamo tests are set to mark themselves as skipped if they are failing. This makes the script think that there are fewer retries than there actually are and that the job is not a rerun disabled tests job. Instead, query for the job name to see if it contains rerun disabled tests, and fall back to counting the number of retries if querying fails. Alternate options: relax the check for the number of tests. Pull Request resolved: https://github.com/pytorch/pytorch/pull/148027 Approved by: https://github.com/huydhn |
||
|---|---|---|
| upload_utilization_stats | ||
| __init__.py | ||
| check_disabled_tests.py | ||
| export_test_times.py | ||
| import_test_stats.py | ||
| monitor.py | ||
| README.md | ||
| sccache_stats_to_benchmark_format.py | ||
| test_dashboard.py | ||
| upload_artifacts.py | ||
| upload_dynamo_perf_stats.py | ||
| upload_external_contrib_stats.py | ||
| upload_metrics.py | ||
| upload_sccache_stats.py | ||
| upload_stats_lib.py | ||
| upload_test_stats_intermediate.py | ||
| upload_test_stats_running_jobs.py | ||
| upload_test_stats.py | ||
| utilization_stats_lib.py | ||
PyTorch CI Stats
We track various stats about each CI job.
- Jobs upload their artifacts to an intermediate data store (either GitHub
  Actions artifacts or S3, depending on what permissions the job has). Example:
  a9f6a35a33/.github/workflows/_linux-build.yml (L144-L151)
- When a workflow completes, a workflow_run event triggers upload-test-stats.yml.
- upload-test-stats downloads the raw stats from the intermediate data store and
  uploads them as JSON to S3, from which they are then uploaded to our database backend.
graph LR
J1[Job with AWS creds<br>e.g. linux, win] --raw stats--> S3[(AWS S3)]
J2[Job w/o AWS creds<br>e.g. mac] --raw stats--> GHA[(GH artifacts)]
S3 --> uts[upload-test-stats.yml]
GHA --> uts
uts --json--> s3[(s3)]
s3 --> DB[(database)]
Why this weird indirection? Because writing to the database requires special permissions which, for security reasons, we do not want to give to pull request CI. Instead, we implemented GitHub's recommended pattern for cases like this.
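To make the split in permissions concrete, here is a minimal Python sketch of the flow described above. It is an illustration only, not the code in this directory: the `Stores` class, the function names `upload_raw_stats` and `upload_test_stats`, and the stat fields are all hypothetical, and plain dictionaries stand in for GitHub Actions artifacts, S3, and the database-facing bucket.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Stores:
    """In-memory stand-ins for the storage locations (all hypothetical)."""
    gha_artifacts: dict = field(default_factory=dict)  # intermediate store for jobs without AWS creds
    s3_raw: dict = field(default_factory=dict)         # intermediate store for jobs with AWS creds
    s3_json: dict = field(default_factory=dict)        # JSON that the database backend ingests


def upload_raw_stats(stores, job_name, raw_stats, has_aws_creds):
    """Runs inside the CI job: it may only write to an intermediate store."""
    target = stores.s3_raw if has_aws_creds else stores.gha_artifacts
    target[job_name] = raw_stats
    return "s3" if has_aws_creds else "gha"


def upload_test_stats(stores):
    """Runs after the workflow completes, with the elevated permissions."""
    for source in (stores.s3_raw, stores.gha_artifacts):
        for job_name, raw_stats in source.items():
            # Re-serialize the raw stats as JSON for the database-facing store.
            stores.s3_json[job_name] = json.dumps(raw_stats, sort_keys=True)
    return sorted(stores.s3_json)


stores = Stores()
upload_raw_stats(stores, "linux-build", {"tests": 10, "failures": 1}, has_aws_creds=True)
upload_raw_stats(stores, "mac-test", {"tests": 4, "failures": 0}, has_aws_creds=False)
uploaded = upload_test_stats(stores)
print(uploaded)
```

In this sketch, only `upload_test_stats` ever touches the database-facing store, which mirrors the reason for the indirection: pull request jobs never need credentials that can write to the database.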
For more details about what stats we export, check out upload-test-stats.yml.