pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 00:21:07 +01:00

Author	SHA1	Message	Date
William Wen	e800d27b10	[dashboard] Add graphs for all summary metrics, add additional testing flags (#89580) Title. Test post: https://github.com/pytorch/torchdynamo/issues/1831#issuecomment-1325572179 Pull Request resolved: https://github.com/pytorch/pytorch/pull/89580 Approved by: https://github.com/davidberard98	2022-11-23 20:11:39 +00:00
William Wen	8bf8e4d71e	[dashboard] Add metric graphs back to dashboard (#89531) Title. Pull Request resolved: https://github.com/pytorch/pytorch/pull/89531 Approved by: https://github.com/davidberard98	2022-11-22 23:42:09 +00:00
Animesh Jain	5bba783d21	[dashboard] Remove aot_cudagraphs and nvprims_nvfuser (#89514) Helps speed up Dashboard runs. We will bring these back when the backends are ready to be tested on the full model suite. Pull Request resolved: https://github.com/pytorch/pytorch/pull/89514 Approved by: https://github.com/SherlockNoMad	2022-11-22 22:25:30 +00:00
William Wen	77d7f2c659	[dashboard] Add commit date & fix date related issues (#89517) Add commit date to build summary of dashboard. Make the date of the run reflective of when the run started, not when the run ended. Use PST (UTC -8) to determine day, rather than GMT (UTC +0). Test comment: https://github.com/pytorch/torchdynamo/issues/1831#issuecomment-1324176119 Pull Request resolved: https://github.com/pytorch/pytorch/pull/89517 Approved by: https://github.com/anijain2305	2022-11-22 21:17:36 +00:00
William Wen	fa4980cd5e	Add commit hash to dynamo dashboard (#89462) Title - also fix a small bug with dashboard outputs. Sample: https://github.com/pytorch/torchdynamo/issues/1831#issuecomment-1322732698 Pull Request resolved: https://github.com/pytorch/pytorch/pull/89462 Approved by: https://github.com/anijain2305	2022-11-21 22:56:13 +00:00
William Wen	af448e84eb	Fix bug in dynamo dashboard summary stats diff (#89226) Fixes issue where a suite may not be present in one of the logs. Pull Request resolved: https://github.com/pytorch/pytorch/pull/89226 Approved by: https://github.com/anijain2305	2022-11-17 19:20:49 +00:00
William Wen	640af8d70a	More dynamo dashboard improvements (#89155)
A number of dashboard improvements:
- Add accuracy failures to warnings section
- Add regression detection to all metrics (speedup, compile time, peak memory), not just accuracy
- Add testing flag to update-dashboard to prevent image/comment uploads
- Add section for comparing summary statistics (passrate, speedup) between 2 most recent reports
- Show names of reports for summary stats diff and regression detection sections
- Remove metric graphs from the comment (they can still be found in the generated text file)
Sample comment: https://github.com/pytorch/torchdynamo/issues/1831#issuecomment-1317565972
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89155
Approved by: https://github.com/anijain2305	2022-11-16 21:54:27 +00:00
William Wen	45d2daaf85	Fix lookup file update in dashboard (#89024) Lookup file should be updated before graphs are generated. Pull Request resolved: https://github.com/pytorch/pytorch/pull/89024 Approved by: https://github.com/mlazos, https://github.com/anijain2305	2022-11-15 02:32:55 +00:00
William Wen	36d87465fb	Fix long comment error on dashboard (#89002)
Fix dashboard comment failure due to the following trace:
```
Traceback (most recent call last):
  File "/scratch/anijain/dashboard/work/pytorch/benchmarks/dynamo/runner.py", line 1180, in <module>
    DashboardUpdater(args).update()
  File "/scratch/anijain/dashboard/work/pytorch/benchmarks/dynamo/runner.py", line 1119, in update
    self.comment_on_gh(comment)
  File "/scratch/anijain/dashboard/work/pytorch/benchmarks/dynamo/runner.py", line 1096, in comment_on_gh
    subprocess.check_call(
  File "/scratch/anijain/dashboard/env/lib/python3.9/subprocess.py", line 368, in check_call
    retcode = call(*popenargs, **kwargs)
  File "/scratch/anijain/dashboard/env/lib/python3.9/subprocess.py", line 349, in call
    with Popen(*popenargs, **kwargs) as p:
  File "/scratch/anijain/dashboard/env/lib/python3.9/subprocess.py", line 951, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/scratch/anijain/dashboard/env/lib/python3.9/subprocess.py", line 1821, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
OSError: [Errno 7] Argument list too long: '/data/home/anijain/miniconda/bin/gh'
srun: error: a100-st-p4d24xlarge-27: task 0: Exited with exit code 1
```
That is, the `gh` command we were trying to execute exceeded the OS limit on argument length.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89002
Approved by: https://github.com/davidberard98	2022-11-14 18:43:50 +00:00
William Wen	4bcf2c53e5	Add warnings & regressions info text (#88837) Add text about what warnings and accuracy regressions dropdowns mean. Sample: https://github.com/pytorch/torchdynamo/issues/1831#issuecomment-1310770285 Pull Request resolved: https://github.com/pytorch/pytorch/pull/88837 Approved by: https://github.com/anijain2305	2022-11-10 19:22:09 +00:00
William Wen	0b8889c724	Do not flag models in dashboard due to NaN values (#88792) Title. Tested by running `python benchmarks/dynamo/runner.py --output-dir ../test-dynamo-runner-logs-4 --training --visualize_logs` on a copy of a recent set of logs. Pull Request resolved: https://github.com/pytorch/pytorch/pull/88792 Approved by: https://github.com/anijain2305	2022-11-10 01:48:04 +00:00
William Wen	6e3555edea	Add absolute latency to dashboard (#88790)
Add absolute latency to dashboard, as requested by https://github.com/pytorch/torchdynamo/issues/1833#issuecomment-1302742914
Tested by setting `run.sh` to
```
# Setup the output directory
rm -rf ../test-dynamo-runner-logs-7/
mkdir ../test-dynamo-runner-logs-7/

# Commands for torchbench for device=cuda, dtype=float32 for training and for performance testing
python benchmarks/dynamo/torchbench.py --performance --float32 -dcuda --output=../test-dynamo-runner-logs-7//inductor_torchbench_float32_training_cuda_performance.csv --training --inductor --no-skip --dashboard --only mobilenet_v2 --cold_start_latency

# Commands for torchbench for device=cuda, dtype=float32 for training and for accuracy testing
python benchmarks/dynamo/torchbench.py --accuracy --float32 -dcuda --output=../test-dynamo-runner-logs-7//inductor_torchbench_float32_training_cuda_accuracy.csv --training --inductor --no-skip --dashboard --only mobilenet_v2
```
and running `python benchmarks/dynamo/runner.py --output-dir ../test-dynamo-runner-logs-7/ --dashboard-archive-path /data/home/williamwen/dynamo-runner-logs-copy --training --run --compilers inductor --flag-compilers inductor --suites torchbench --update-dashboard` (need to comment out the `generate_commands` line and change the github issue ID from 681 to something else).
Sample comment: https://github.com/pytorch/torchdynamo/issues/1831#issuecomment-1309645562
NOTE: this change breaks processing old logs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88790
Approved by: https://github.com/anijain2305	2022-11-10 01:45:52 +00:00
William Wen	16bd363863	Fix dynamo dashboard passrate denominator (#88777)
Before the dashboard improvements, the passrate table looked like this:
~~~
+------------------------+------------+-------------+-------------+
| Compiler               | torchbench | huggingface | timm_models |
+------------------------+------------+-------------+-------------+
| eager                  | 98%, 54/55 | 100%, 43/43 | 100%, 61/61 |
| aot_eager              | 95%, 52/55 | 100%, 43/43 | 97%, 59/61  |
| aot_cudagraphs         | 75%, 41/55 | 49%, 21/43  | 38%, 23/61  |
| nvprims_nvfuser        | 71%, 39/55 | 16%, 7/43   | 48%, 29/61  |
| inductor               | 87%, 48/55 | 93%, 40/43  | 95%, 58/61  |
| inductor_no_cudagraphs | 93%, 51/55 | 93%, 40/43  | 95%, 58/61  |
+------------------------+------------+-------------+-------------+
~~~
After the change, the table looked like:
~~~
+------------------------+------------+-------------+-------------+
| Compiler               | torchbench | huggingface | timm_models |
+------------------------+------------+-------------+-------------+
| eager                  | 82%, 53/65 | 84%, 43/51  | 82%, 61/74  |
| aot_eager              | 83%, 54/65 | 84%, 43/51  | 82%, 61/74  |
| aot_cudagraphs         | 69%, 45/65 | 65%, 33/51  | 38%, 28/74  |
| nvprims_nvfuser        | 48%, 31/65 | 78%, 40/51  | 26%, 19/74  |
| inductor               | 75%, 49/65 | 82%, 42/51  | 81%, 60/74  |
| inductor_no_cudagraphs | 82%, 53/65 | 82%, 42/51  | 82%, 61/74  |
+------------------------+------------+-------------+-------------+
~~~
There is no actual regression, but the passrate is lower since the denominator is wrong.
Check fix by running locally (e.g. `python benchmarks/dynamo/runner.py --output-dir ../test-dynamo-runner-logs-5 --training --visualize_logs`) and comparing passrate table output to previously correct one.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88777
Approved by: https://github.com/anijain2305	2022-11-10 00:26:58 +00:00
William Wen	0e67b2f7dd	Dynamo Dashboard Improvements (#88516)
Implement various features in https://github.com/pytorch/torchdynamo/issues/1644:
- Upload nightly run logs to /fsx before parsing - for backing up parsing failures.
- Flag models with (1) < 0.95x speedup, (2) > 2min compile time, (3) < 0.9x compression ratio
- Flag models that were passing yesterday but failed today.
- Other small bug fixes.
See https://github.com/pytorch/torchdynamo/issues/1831 for sample outputs.
Also tested by running run.sh:
```bash
# Setup the output directory
rm -rf ../test-dynamo-runner-logs-3/
mkdir ../test-dynamo-runner-logs-3/

# Commands for torchbench for device=cuda, dtype=float32 for training and for performance testing
python benchmarks/dynamo/torchbench.py --performance --float32 -dcuda --output=../test-dynamo-runner-logs-3//inductor_torchbench_float32_training_cuda_performance.csv --training --inductor --no-skip --dashboard --only mobilenet_v2 --cold_start_latency

# Commands for torchbench for device=cuda, dtype=float32 for training and for accuracy testing
python benchmarks/dynamo/torchbench.py --accuracy --float32 -dcuda --output=../test-dynamo-runner-logs-3//inductor_torchbench_float32_training_cuda_accuracy.csv --training --inductor --no-skip --dashboard --only mobilenet_v2
```
with the command `python benchmarks/dynamo/runner.py --output-dir ../test-dynamo-runner-logs-3/ --dashboard-archive-path /data/home/williamwen/dynamo-runner-logs-copy --training --run --compilers inductor --flag-compilers inductor --suites torchbench --update-dashboard` (need to comment out the `generate_commands` line and change the github issue ID from 681 to something else).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88516
Approved by: https://github.com/anijain2305	2022-11-07 22:24:44 +00:00
Animesh Jain	f8b73340c8	[dashboard] Replace aot_nvfuser with nvprims_nvfuser (#88437) @IvanYashchuk @ngimel Pull Request resolved: https://github.com/pytorch/pytorch/pull/88437 Approved by: https://github.com/soumith	2022-11-03 19:07:03 +00:00
Yanbo Liang	72958b9665	[Dynamo] Update Dynamo benchmarks running commands (#87844) Fixes https://github.com/pytorch/torchdynamo/issues/1761 Pull Request resolved: https://github.com/pytorch/pytorch/pull/87844 Approved by: https://github.com/jansel	2022-11-01 22:45:13 +00:00
Animesh Jain	d67b2edec3	[dynamo][dashboard] minor fixes for a clean Dashboard (#88056) * better check for cold start latency * sort on inductor column for better readability. cc @mlazos @soumith @voznesenskym @yanboliang @penguinwu @EikanWang @jgong5 @Guobing-Chen @chunyuan-w @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx Pull Request resolved: https://github.com/pytorch/pytorch/pull/88056 Approved by: https://github.com/ngimel	2022-10-31 02:30:29 +00:00
Animesh Jain	83b381d34d	[dynamo] add inductor runs w/o cudagraphs (#87847) as title cc @jansel @mlazos @soumith @voznesenskym @yanboliang @penguinwu @EikanWang @jgong5 @Guobing-Chen @chunyuan-w @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx Pull Request resolved: https://github.com/pytorch/pytorch/pull/87847 Approved by: https://github.com/jansel	2022-10-27 19:49:29 +00:00
William Wen	cc64863d71	Clean Inductor compilation cache during dynamo dashboard run (#87246) Implement improvement from https://github.com/pytorch/torchdynamo/issues/1644. Tested by running `python benchmarks/dynamo/runner.py --print_run_commands --training` and inspecting the generated `run.sh` file for the `--cold_start_latency` flag, e.g. ``` python benchmarks/dynamo/torchbench.py --performance --float32 -dcuda --output=benchmark_logs/inductor_torchbench_float32_training_cuda_performance.csv --training --inductor --no-skip --dashboard -x fambench_xlmr -x detectron2_fasterrcnn_r_50_c4 -x detectron2_fasterrcnn_r_50_dc5 -x detectron2_maskrcnn_r_101_fpn -x detectron2_maskrcnn_r_50_fpn -x detectron2_fasterrcnn_r_50_fpn -x detectron2_maskrcnn -x detectron2_fasterrcnn_r_101_dc5 -x opacus_cifar10 -x detectron2_maskrcnn_r_101_c4 -x pyhpc_turbulent_kinetic_energy -x maml -x detectron2_fasterrcnn_r_101_fpn -x pyhpc_equation_of_state -x detectron2_fasterrcnn_r_101_c4 -x pyhpc_isoneutral_mixing --cold_start_latency ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/87246 Approved by: https://github.com/anijain2305, https://github.com/jansel	2022-10-19 16:39:12 +00:00
Animesh Jain	c30cfb07ab	[dynamo][dashboard] Run 2 iterations for the correctness runs (#87104) Pull Request resolved: https://github.com/pytorch/pytorch/pull/87104 Approved by: https://github.com/soumith	2022-10-18 15:53:40 +00:00
Animesh Jain	2b558138cf	[inductor] Set correct strides in fallback example run (#87049) Helps in resolving many issues seen in https://github.com/pytorch/torchdynamo/issues/1675 Pull Request resolved: https://github.com/pytorch/pytorch/pull/87049 Approved by: https://github.com/jansel	2022-10-17 15:43:53 +00:00
Jason Ansel	c7c09722ad	Move TorchDynamo into PyTorch core (#86461)
Context: https://github.com/pytorch/torchdynamo/issues/1588
This PR moves [TorchDynamo](https://github.com/pytorch/torchdynamo) and TorchInductor into PyTorch core.
- `torchdynamo` becomes `torch._dynamo`
- `torchinductor` becomes `torch._inductor`
This PR was generated by running `copy_to_core.sh` in https://github.com/pytorch/torchdynamo/pull/1538
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86461
Approved by: https://github.com/voznesenskym	2022-10-13 23:18:06 +00:00
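
The flagging rules quoted in 0e67b2f7dd (speedup below 0.95x, compile time above 2 minutes, compression ratio below 0.9x) and the NaN exemption from 0b8889c724 can be illustrated with a minimal sketch. This is not the code in `benchmarks/dynamo/runner.py`: the function name `flag_reasons`, its parameters, and the reason labels are hypothetical and chosen only for this example; only the three thresholds come from the commit messages.

```python
import math

# Thresholds taken from the 0e67b2f7dd commit message.
MIN_SPEEDUP = 0.95            # flag when speedup is below 0.95x
MAX_COMPILE_SECONDS = 120.0   # flag when compile time exceeds 2 minutes
MIN_COMPRESSION_RATIO = 0.9   # flag when compression ratio is below 0.9x


def flag_reasons(speedup, compile_seconds, compression_ratio):
    """Return the list of reasons a model would be flagged (hypothetical helper).

    NaN metrics are skipped explicitly, mirroring the intent of 0b8889c724
    ("Do not flag models in dashboard due to NaN values").
    """
    reasons = []
    if not math.isnan(speedup) and speedup < MIN_SPEEDUP:
        reasons.append("speedup")
    if not math.isnan(compile_seconds) and compile_seconds > MAX_COMPILE_SECONDS:
        reasons.append("compile_time")
    if not math.isnan(compression_ratio) and compression_ratio < MIN_COMPRESSION_RATIO:
        reasons.append("compression_ratio")
    return reasons


# A model that is slow, compiles for too long, and uses more memory.
print(flag_reasons(0.9, 130.0, 0.8))
# A model whose speedup could not be measured is not flagged for it.
print(flag_reasons(float("nan"), 10.0, 1.0))
```

The comparisons are strict, so a model sitting exactly on a threshold is not flagged; whether the dashboard treats boundary values the same way is not stated in the commit messages.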

22 Commits
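
Commit 77d7f2c659 describes dating a run by when it started and determining the day in PST (UTC -8) rather than GMT. A minimal sketch of that idea follows, assuming a fixed UTC-8 offset as the commit message states; the helper name `run_day` and the output format are hypothetical and not taken from the dashboard code.

```python
from datetime import datetime, timedelta, timezone

# Fixed offset of UTC-8, as stated in the 77d7f2c659 commit message.
PST = timezone(timedelta(hours=-8), name="PST")


def run_day(start_utc):
    """Return the calendar day (YYYY-MM-DD) of a run's start time in UTC-8.

    `start_utc` must be a timezone-aware datetime (hypothetical helper).
    """
    return start_utc.astimezone(PST).strftime("%Y-%m-%d")


# A run that starts at 03:00 UTC on Nov 23 still belongs to Nov 22 in PST.
start = datetime(2022, 11, 23, 3, 0, tzinfo=timezone.utc)
print(run_day(start))  # 2022-11-22
```

Using the start time rather than the end time keeps a run that crosses midnight attributed to a single day.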