From eafa2355828724dc1e955a351542bcaddfb16ac8 Mon Sep 17 00:00:00 2001 From: Sam Estep Date: Tue, 30 Mar 2021 11:46:10 -0700 Subject: [PATCH] Clarify and document commit choice for CI jobs (#54967) Summary: PRs https://github.com/pytorch/pytorch/issues/53652 and https://github.com/pytorch/pytorch/issues/54693 attempted to increase the consistency of our choice of commit (head vs merge) for CI on PRs, and have so far been unsuccessful. This PR takes a less ambitious approach to the problem by clarifying the choice in one specific way (see the following paragraph) and documenting it in `CONTRIBUTING.md`. In addition to documentation, this PR also removes the current behavior of our GHA jobs that checkout the PR tip instead of the merge commit. At first glance, this behavior seems to increase consistency (by eliminating the special-case for `ghstack` PRs), but in reality, it actually just means that for non-`ghstack` PRs, the question "Which commit is used in CI?" has *two* answers instead of just one; see the description of https://github.com/pytorch/pytorch/issues/53652 for more details. Once merged, this PR will unblock other PRs that add modify our GHA workflows in breaking ways, such as https://github.com/pytorch/pytorch/issues/54737. Pull Request resolved: https://github.com/pytorch/pytorch/pull/54967 Test Plan: None. Reviewed By: walterddr, seemethere Differential Revision: D27435913 Pulled By: samestep fbshipit-source-id: 405fb419cf015cf88107d5eb2498cfb5bcb7ce33 --- .github/workflows/build_linux_conda.yml | 2 - .github/workflows/build_linux_libtorch.yml | 2 - .github/workflows/build_linux_wheels.yml | 2 - .github/workflows/clang_format.yml | 1 - .github/workflows/lint.yml | 9 ---- .github/workflows/test_tools.yml | 1 - CONTRIBUTING.md | 59 +++++++++++++++++++++- 7 files changed, 57 insertions(+), 19 deletions(-) diff --git a/.github/workflows/build_linux_conda.yml b/.github/workflows/build_linux_conda.yml index 3671afe2030..f36bdef347d 100644 --- a/.github/workflows/build_linux_conda.yml +++ b/.github/workflows/build_linux_conda.yml @@ -17,8 +17,6 @@ jobs: steps: - name: Clone pytorch/pytorch uses: actions/checkout@v2 - with: - ref: ${{ github.event.pull_request.head.sha }} - name: Generating build matrix id: set-matrix run: | diff --git a/.github/workflows/build_linux_libtorch.yml b/.github/workflows/build_linux_libtorch.yml index 1bc49df7f39..8385945f984 100644 --- a/.github/workflows/build_linux_libtorch.yml +++ b/.github/workflows/build_linux_libtorch.yml @@ -17,8 +17,6 @@ jobs: steps: - name: Clone pytorch/pytorch uses: actions/checkout@v2 - with: - ref: ${{ github.event.pull_request.head.sha }} - name: Generating build matrix id: set-matrix run: | diff --git a/.github/workflows/build_linux_wheels.yml b/.github/workflows/build_linux_wheels.yml index 605ad09ed9d..8bc3399ec37 100644 --- a/.github/workflows/build_linux_wheels.yml +++ b/.github/workflows/build_linux_wheels.yml @@ -17,8 +17,6 @@ jobs: steps: - name: Clone pytorch/pytorch uses: actions/checkout@v2 - with: - ref: ${{ github.event.pull_request.head.sha }} - name: Generating build matrix id: set-matrix run: | diff --git a/.github/workflows/clang_format.yml b/.github/workflows/clang_format.yml index 0f085f5bf30..f8656981d7a 100644 --- a/.github/workflows/clang_format.yml +++ b/.github/workflows/clang_format.yml @@ -15,7 +15,6 @@ jobs: - name: Fetch PyTorch uses: actions/checkout@v2 with: - ref: ${{ github.event.pull_request.head.sha }} fetch-depth: 0 # deep clone, to allow us to use git merge-base - name: Run clang-format run: | diff --git a/.github/workflows/lint.yml b/.github/workflows/lint.yml index 4c76e4641e2..d1c9d91148e 100644 --- a/.github/workflows/lint.yml +++ b/.github/workflows/lint.yml @@ -17,8 +17,6 @@ jobs: architecture: x64 - name: Checkout PyTorch uses: actions/checkout@v2 - with: - ref: ${{ github.event.pull_request.head.sha }} - name: Ensure consistent CircleCI YAML config run: | pip install -r requirements.txt @@ -66,8 +64,6 @@ jobs: architecture: x64 - name: Fetch PyTorch uses: actions/checkout@v2 - with: - ref: ${{ github.event.pull_request.head.sha }} - name: Get HEAD commit SHA run: echo ::set-output name=commit-sha::$(git rev-parse HEAD) id: get-commit-sha @@ -102,7 +98,6 @@ jobs: - name: Checkout PyTorch uses: actions/checkout@v2 with: - ref: ${{ github.event.pull_request.head.sha }} fetch-depth: 0 # deep clone, to allow us to use git merge-base - name: Get HEAD commit SHA run: echo ::set-output name=commit-sha::$(git rev-parse HEAD) @@ -203,8 +198,6 @@ jobs: architecture: x64 - name: Fetch PyTorch uses: actions/checkout@v2 - with: - ref: ${{ github.event.pull_request.head.sha }} - name: Run cmakelint run: | set -eux @@ -224,8 +217,6 @@ jobs: architecture: x64 - name: Fetch PyTorch uses: actions/checkout@v2 - with: - ref: ${{ github.event.pull_request.head.sha }} - name: Install dependencies run: | set -eux diff --git a/.github/workflows/test_tools.yml b/.github/workflows/test_tools.yml index fb4aebe5f69..d75ef6b8970 100644 --- a/.github/workflows/test_tools.yml +++ b/.github/workflows/test_tools.yml @@ -18,7 +18,6 @@ jobs: - name: Checkout PyTorch uses: actions/checkout@v2 with: - ref: ${{ github.event.pull_request.head.sha }} fetch-depth: 0 # deep clone, to allow us to use git log - name: Install dependencies # mypy and boto3 versions copied from diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index ffea19ce6f5..b3ea8f83831 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -39,6 +39,7 @@ - [Why no leak detection?](#why-no-leak-detection) - [Caffe2 notes](#caffe2-notes) - [CI failure tips](#ci-failure-tips) + - [Which commit is used in CI?](#which-commit-is-used-in-ci) ## Contributing to PyTorch @@ -1137,8 +1138,9 @@ Once you submit a PR or push a new commit to a branch that is in an active PR, CI jobs will be run automatically. Some of these may fail and you will need to find out why, by looking at the logs. -Fairly often, a CI failure might be unrelated to your changes. In this -case, you can usually ignore the failure. +Fairly often, a CI failure might be unrelated to your changes. In this case, you +can usually ignore the failure. See [the following +subsection](#which-commit-is-used-in-ci) for more details. Some failures might be related to specific hardware or environment configurations. In this case, if the job is run by CircleCI, you can @@ -1169,3 +1171,56 @@ following steps: For certain Windows failures, it may be useful to have a full [Remote Desktop](https://docs.microsoft.com/en-us/windows-server/remote/remote-desktop-services/clients/remote-desktop-clients) connection. See detailed instructions [here](https://github.com/pytorch/pytorch/wiki/Debugging-Windows-with-Remote-Desktop-or-CDB-(CLI-windbg)-on-CircleCI) for how to set that up after rerunning the job. + +### Which commit is used in CI? + +For CI run on `master`, this repository is checked out for a given `master` +commit, and CI is run on that commit (there isn't really any other choice). For +PRs, however, it's a bit more complicated. Consider this commit graph, where +`master` is at commit `A`, and the branch for PR #42 (just a placeholder) is at +commit `B`: + +``` + o---o---B (refs/pull/42/head) + / \ + / C (refs/pull/42/merge) + / / +---o---o---o---A (refs/heads/master) +``` + +There are two possible choices for which commit to use: + +1. Checkout commit `B`, the head of the PR (manually committed by the PR + author). +2. Checkout commit `C`, the hypothetical result of what would happen if the PR + were merged into `master` (automatically generated by GitHub). + +This choice depends on several factors; here is the decision tree as of +2021-03-30: + +- For CI jobs on CircleCI: + - If the name of the job (or one of its ancestors in the workflow DAG) + contains "xla" or "gcc5", choice **2** is used. This includes the following + jobs: + - pytorch_linux_xenial_py3_6_gcc5_4_build + - pytorch_cpp_doc_build + - pytorch_doc_test + - pytorch_linux_backward_compatibility_check_test + - pytorch_linux_xenial_py3_6_gcc5_4_jit_legacy_test + - pytorch_linux_xenial_py3_6_gcc5_4_test + - pytorch_python_doc_build + - pytorch_xla_linux_bionic_py3_6_clang9_build + - pytorch_xla_linux_bionic_py3_6_clang9_test + - Otherwise, choice **1** is used. +- For CI jobs on GitHub Actions: + - If the PR was created using [`ghstack`](https://github.com/ezyang/ghstack), + choice **1** is used. + - Otherwise, choice **2** is used. + +This is important to be aware of, because if you see a CI failure on your PR and +choice **2** is being used for that CI job, it is possible that the failure is +nondeterministically caused by a commit that does not exist in the ancestry of +your PR branch. If you happen to have write access to this repo, you can choose +to use `ghstack` to eliminate this nondeterminism for GitHub Actions jobs on +your PRs, but it will still be present for the select CircleCI jobs listed +above.