pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-06 12:20:52 +01:00

Author	SHA1	Message	Date
Sam Estep	21ef248fb8	[reland] Report test time regressions (#50171)	2021-02-08 15:35:21 -08:00

Summary:
This is a followup to https://github.com/pytorch/pytorch/issues/49190. Vaguely speaking, the goals are to make it easy to identify test time regressions introduced by PRs. Eventually the hope is to use this information to edit Dr CI comments, but this particular PR just does the analysis and prints it to stdout, so a followup PR would be needed to edit the actual comments on GitHub.

Important: for uninteresting reasons, this PR moves the `print_test_stats.py` file.

- Before: `test/print_test_stats.py`
- After: `torch/testing/_internal/print_test_stats.py`

Notes on the approach:

- Just getting the mean and stdev for the total job time of the last _N_ commits isn't sufficient, because e.g. if `master` was broken 5 commits ago, then a lot of those job times will be much shorter, breaking the statistics.
- We use the commit history to make better estimates for the mean and stdev of individual test (and suite) times, but only when the test in that historical commit is present and its status matches that of the base commit.
- We list all the tests that were removed or added, or whose status changed (e.g. skipped to not skipped, or vice versa), along with time (estimate) info for that test case and its containing suite.
- We don't list tests whose time changed a lot if their status didn't change, because there's a lot of noise and it's unclear how to do that well without too many false positives.
- We show a human-readable commit graph that indicates exactly how many commits are in the pool of commits that could be causing regressions (e.g. if a PR has multiple commits in it, or if the base commit on `master` doesn't have a report in S3).
- We don't show an overall estimate of whether the PR increased or decreased the total test job time, because it's noisy and it's a bit tricky to aggregate stdevs up from individual tests to the whole job level. This might change in a followup PR.
- Instead, we simply show a summary at the bottom which says how many tests were removed/added/modified (where "modified" means that the status changed), and our best estimates of the mean times (and stdevs) of those changes.
- Importantly, the summary at the bottom is only for the test cases that were already shown in the more verbose diff report, and does not include any information about tests whose status didn't change but whose running time got much longer.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/50171

Test Plan:
To run the unit tests:

```
$ python test/test_testing.py
$ python test/print_test_stats.py
```

To verify that this works, check the [CircleCI logs](https://app.circleci.com/pipelines/github/pytorch/pytorch/258628/workflows/9cfadc34-e042-485e-b3b3-dc251f160307) for a test job run on this PR; for example:

- pytorch_linux_bionic_py3_6_clang9_test

To test locally, use the following steps. First run an arbitrary test suite (you need to have some XML reports so that `test/print_test_stats.py` runs, but we'll be ignoring them here via the `--use-json` CLI option):

```
$ DATA_DIR=/tmp
$ ARBITRARY_TEST=testing
$ python test/test_$ARBITRARY_TEST.py --save-xml=$DATA_DIR/test/test_$ARBITRARY_TEST
```

Now choose a commit and a test job (it has to be on `master` since we're going to grab the test time data from S3, and [we only upload test times to S3 on the `master`, `nightly`, and `release` branches](https://github.com/pytorch/pytorch/pull/49645)):

```
$ export CIRCLE_SHA1=c39fb9771d89632c5c3a163d3c00af3bef1bd489
$ export CIRCLE_JOB=pytorch_linux_bionic_py3_6_clang9_test
```

Download the `.json.bz2` file(s) for that commit/job pair:

```
$ aws s3 cp s3://ossci-metrics/test_time/$CIRCLE_SHA1/$CIRCLE_JOB/ $DATA_DIR/ossci-metrics/test_time/$CIRCLE_SHA1/$CIRCLE_JOB --recursive
```

And feed everything into `test/print_test_stats.py`:

```
$ bzip2 -kdc $DATA_DIR/ossci-metrics/test_time/$CIRCLE_SHA1/$CIRCLE_JOB/Z.json.bz2 | torch/testing/_internal/print_test_stats.py --compare-with-s3 --use-json=/dev/stdin $DATA_DIR/test/test_$ARBITRARY_TEST
```

The first part of the output should be the same as before this PR; here is the new part, at the end of the output:

- https://pastebin.com/Jj1svhAn

Reviewed By: malfet, izdeby
Differential Revision: D26317769
Pulled By: samestep
fbshipit-source-id: 1ba06cec0fafac77f9e7341d57079543052d73db
Sam Estep	21dccbca62	Revert D26232345: [pytorch][PR] Report test time regressions	2021-02-08 10:15:07 -08:00

Test Plan: revert-hammer

Differential Revision: D26232345 (`7467f90b13`)
Original commit changeset: b687b1737519
fbshipit-source-id: 10a031c5500b083f7c82f2ae2743b671c5a07bff
Sam Estep	7467f90b13	Report test time regressions (#50171)	2021-02-08 07:54:34 -08:00

Summary:
This is a followup to https://github.com/pytorch/pytorch/issues/49190. Vaguely speaking, the goals are to make it easy to identify test time regressions introduced by PRs. Eventually the hope is to use this information to edit Dr CI comments, but this particular PR just does the analysis and prints it to stdout, so a followup PR would be needed to edit the actual comments on GitHub.

Important: for uninteresting reasons, this PR moves the `print_test_stats.py` file.

- Before: `test/print_test_stats.py`
- After: `torch/testing/_internal/print_test_stats.py`

Notes on the approach:

- Just getting the mean and stdev for the total job time of the last _N_ commits isn't sufficient, because e.g. if `master` was broken 5 commits ago, then a lot of those job times will be much shorter, breaking the statistics.
- We use the commit history to make better estimates for the mean and stdev of individual test (and suite) times, but only when the test in that historical commit is present and its status matches that of the base commit.
- We list all the tests that were removed or added, or whose status changed (e.g. skipped to not skipped, or vice versa), along with time (estimate) info for that test case and its containing suite.
- We don't list tests whose time changed a lot if their status didn't change, because there's a lot of noise and it's unclear how to do that well without too many false positives.
- We show a human-readable commit graph that indicates exactly how many commits are in the pool of commits that could be causing regressions (e.g. if a PR has multiple commits in it, or if the base commit on `master` doesn't have a report in S3).
- We don't show an overall estimate of whether the PR increased or decreased the total test job time, because it's noisy and it's a bit tricky to aggregate stdevs up from individual tests to the whole job level. This might change in a followup PR.
- Instead, we simply show a summary at the bottom which says how many tests were removed/added/modified (where "modified" means that the status changed), and our best estimates of the mean times (and stdevs) of those changes.
- Importantly, the summary at the bottom is only for the test cases that were already shown in the more verbose diff report, and does not include any information about tests whose status didn't change but whose running time got much longer.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/50171

Test Plan:
To run the unit tests:

```
$ python test/test_testing.py
$ python test/print_test_stats.py
```

To verify that this works, check the [CircleCI logs](https://app.circleci.com/pipelines/github/pytorch/pytorch/258628/workflows/9cfadc34-e042-485e-b3b3-dc251f160307) for a test job run on this PR; for example:

- pytorch_linux_bionic_py3_6_clang9_test

To test locally, use the following steps. First run an arbitrary test suite (you need to have some XML reports so that `test/print_test_stats.py` runs, but we'll be ignoring them here via the `--use-json` CLI option):

```
$ DATA_DIR=/tmp
$ ARBITRARY_TEST=testing
$ python test/test_$ARBITRARY_TEST.py --save-xml=$DATA_DIR/test/test_$ARBITRARY_TEST
```

Now choose a commit and a test job (it has to be on `master` since we're going to grab the test time data from S3, and [we only upload test times to S3 on the `master`, `nightly`, and `release` branches](https://github.com/pytorch/pytorch/pull/49645)):

```
$ export CIRCLE_SHA1=c39fb9771d89632c5c3a163d3c00af3bef1bd489
$ export CIRCLE_JOB=pytorch_linux_bionic_py3_6_clang9_test
```

Download the `.json.bz2` file(s) for that commit/job pair:

```
$ aws s3 cp s3://ossci-metrics/test_time/$CIRCLE_SHA1/$CIRCLE_JOB/ $DATA_DIR/ossci-metrics/test_time/$CIRCLE_SHA1/$CIRCLE_JOB --recursive
```

And feed everything into `test/print_test_stats.py`:

```
$ bzip2 -kdc $DATA_DIR/ossci-metrics/test_time/$CIRCLE_SHA1/$CIRCLE_JOB/Z.json.bz2 | torch/testing/_internal/print_test_stats.py --compare-with-s3 --use-json=/dev/stdin $DATA_DIR/test/test_$ARBITRARY_TEST
```

The first part of the output should be the same as before this PR; here is the new part, at the end of the output:

- https://pastebin.com/Jj1svhAn

Reviewed By: walterddr
Differential Revision: D26232345
Pulled By: samestep
fbshipit-source-id: b687b1737519d2eed68fbd591a667e4e029de509
Yujun Zhao	f3a79b881f	add `lcov` to oss for beautiful html report (#44568)	2020-09-11 15:29:24 -07:00

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44568

With `lcov`, we can generate a beautiful HTML report, which is better than the current file report and line report. Therefore, for gcc in oss, remove the `export` code and the `file/line level report` code, and use only the HTML report. For clang, since no such tool is available, we still use the file report and line report that we generate ourselves.

Test Plan:
Test in docker ubuntu machine.

## Measurement
1. After running `atest`, it takes about 15 mins to collect code coverage and generate the report.

```
# gcc code coverage
python oss_coverage.py --run-only=atest
```

## Presentation
The html result looks like:

Top Level: {F328330856}

File Level: {F328336709}

Reviewed By: malfet
Differential Revision: D23550784
fbshipit-source-id: 1fff050e7f7d1cc8e86a6a200fd8db04b47f5f3e
Yujun Zhao	c2b40b056a	Filter default tests for `clang` coverage in oss	2020-09-11 15:28:15 -07:00

Summary:
Some tests like `test_dataloader.py` cannot run under `clang` in oss, because they generate intermediate files that are too large (~40G) to be merged by `llvm`. Skip them when the user doesn't specify the `--run-only` option.

Test Plan:
Test locally. Still, running `clang` coverage in default mode is not recommended, because it takes too much space.

Reviewed By: malfet
Differential Revision: D23549829
fbshipit-source-id: 0737e6e9dcbe3f38de00580ee6007906e743e52f
Elias Ellison	f9146b4598	fix lint (#44346)	2020-09-08 18:29:44 -07:00

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44346

Reviewed By: jamesr66a
Differential Revision: D23589324
Pulled By: eellison
fbshipit-source-id: a4e22b69196909ec200ac3e262f04d2aaf78e9cf
Yujun Zhao	49e979bfde	Set default compiler differently according to platform (#43890)	2020-09-08 14:57:35 -07:00

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43890

1. auto-detect `CXX` default compiler type in oss, and `clang` as default compiler type in fbcode (because auto-detecting will say `gcc` is the default compiler on devserver).
2. change `compiler type` from str `"CLANG" "GCC"` to enum type
3. rename function `get_cov_type` to `detect_compiler_type`
4. auto-set the default pytorch folder for users in oss

Test Plan:
on devserver:

```
buck run :coverage //caffe2/c10:
```

on oss:

```
python oss_coverage.py --run-only=atest
```

Reviewed By: malfet
Differential Revision: D23420034
fbshipit-source-id: c0ea88188578bb1343a286f2090eb8a74cdf3982
yujunzhao@devvm1621.atn0.facebook.com	db6bd9d60b	rename input argument `interested-folder` to `interest-only` -- be consistent with other arguments (#43889)	2020-09-01 11:46:23 -07:00

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43889

1. rename input argument `interested-folder` to `interest-only` -- to be consistent with `run-only` and `coverage-only`, and to be shorter

Test Plan:
Test on devserver and linux docker.

Reviewed By: malfet
Differential Revision: D23417338
fbshipit-source-id: ce9711e75ca3a1c30801ad6bd1a620f3b06819c5
yujunzhao@devvm1621.atn0.facebook.com	e941a462a3	Enable gcc coverage in OSS (#43883)	2020-08-31 16:11:33 -07:00

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43883

Check that the result of GCC coverage in OSS is reasonable and ready to ship.

The number of executable lines is not the same between `gcc` and `clang` for the following reasons:

* The following lines are counted in `clang` but not in `gcc`:
  1. empty lines or lines with only “{” or “}”
  2. some comments are counted in clang but not in gcc
  3. `#define ...` -- not supported by gcc according to official documentation
* Besides, a statement that spans more than one line is counted as only one executable line in gcc, but as several lines in clang

## Advantage of `gcc` coverage
1. Much faster
   - the code coverage tool runtime is only 4 min (amazing!!) with `gcc`, compared to 3 hours!! with `clang`, to analyze all the tests' artifacts
2. Uses less disk
   - `Clang`'s artifacts take up as much as 170G, while `GCC`'s take 980M

Besides, also update `README.md`.

Test Plan:
Compare the result in OSS `clang` and OSS `gcc` with the same command:

```
python oss_coverage.py --run-only atest test_nn.py --interested-folder=aten
```

----

## GCC
Summary
> time: 0:15:45
> summary percentage: 44.85%

Report and Log
[File Coverage Report](P140825162)
[Line Coverage Report](P140825196)
[Log](P140825385)

------

## CLANG
Summary
> time: 0:21:35
> summary percentage: 44.08%

Report and Log
[File Coverage Report](P140825845)
[Line Coverage Report](P140825923)
[Log](P140825950)

----------

# Run all tests

```
# run all tests and get coverage over Pytorch
python oss_coverage.py
```

Summary
> time: 1:27:20 (time to run tests: 1:23:33)
> summary percentage: 56.62%

Report and Log
[File Coverage Report](P140837175)
[Log](P140837121)

Reviewed By: malfet
Differential Revision: D23416772
fbshipit-source-id: a6810fa4d8199690f10bd0a4f58a42ab2a22182b
yujunzhao@devvm229.ftw0.facebook.com	0564d7a652	Land code coverage tool for OSS (#43778)	2020-08-28 13:56:15 -07:00

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43778

Move code_coverage_tool from experimental folder to caffe2/tools folder.

Delete `TODO` and fb-related code.

Test Plan: Test locally

Reviewed By: malfet
Differential Revision: D23399983
fbshipit-source-id: 92316fd3cc88409d087d2dc6ed0be674155b3762
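
The "Notes on the approach" in the "Report test time regressions" commits in this log describe two steps: estimating per-test mean and stdev from historical commits (counting only commits where the test is present with the same status as in the base commit), and listing tests that were removed, added, or whose status changed. The following is a minimal sketch of that idea, not the code in `print_test_stats.py`. The `Report` shape, the function names `estimate_time` and `diff_reports`, the status strings, and the sample data are all hypothetical, and the sketch assumes the base commit's own time is not part of the estimate.

```python
import statistics
from typing import Dict, List, Optional, Tuple

# Hypothetical report shape (an assumption): test name -> (status, seconds).
Report = Dict[str, Tuple[str, float]]


def estimate_time(
    name: str, base: Report, history: List[Report]
) -> Optional[Tuple[float, float]]:
    """Return (mean, sample stdev) of a test's time over historical commits.

    A historical commit contributes only if the test is present there and
    its status matches the status in the base commit.
    """
    status = base[name][0]
    times = [r[name][1] for r in history if name in r and r[name][0] == status]
    if not times:
        return None  # no usable history for this test
    stdev = statistics.stdev(times) if len(times) > 1 else 0.0
    return statistics.mean(times), stdev


def diff_reports(base: Report, head: Report) -> Dict[str, List[str]]:
    """List tests removed, added, or whose status changed between reports."""
    removed = sorted(base.keys() - head.keys())
    added = sorted(head.keys() - base.keys())
    modified = sorted(
        n for n in base.keys() & head.keys() if base[n][0] != head[n][0]
    )
    return {"removed": removed, "added": added, "modified": modified}


# Made-up sample data, for illustration only.
history: List[Report] = [
    {"test_a": ("passed", 1.0), "test_b": ("passed", 2.0)},
    {"test_a": ("passed", 3.0), "test_b": ("skipped", 0.0)},
    {"test_b": ("passed", 4.0)},
]
base: Report = {"test_a": ("passed", 2.0), "test_b": ("passed", 3.0)}
head: Report = {"test_b": ("skipped", 0.0), "test_c": ("passed", 1.5)}

changes = diff_reports(base, head)
for kind in ("removed", "modified"):
    for test in changes[kind]:
        print(kind, test, estimate_time(test, base, history))
```

Tests whose status is unchanged are deliberately left out of the diff, mirroring the commit message's point that pure timing changes are too noisy to report without many false positives.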

10 Commits