30 Commits

Author SHA1 Message Date
Xuehai Pan
a229b4526f [BE] Prefer dash over underscore in command-line options (#94505)
Prefer dashes over underscores in command-line options. This adds `--command-arg-name` to the argument parsers; the old underscore spellings (`--command_arg_name`) are kept for backward compatibility.

Both dashes and underscores are used in the PyTorch codebase, and some argument parsers accept only one or the other. For example, the `torchrun` utility for distributed training only accepts underscore arguments (e.g., `--master_port`). Dashes are more common in other command-line tools, and they appear to be the default choice in the Python standard library:

`argparse.BooleanOptionalAction`: 4a9dff0e5a/Lib/argparse.py (L893-L895)

```python
class BooleanOptionalAction(Action):
    def __init__(...):
            if option_string.startswith('--'):
                option_string = '--no-' + option_string[2:]
                _option_strings.append(option_string)
```

It adds `--no-argname`, not `--no_argname`. Also, typing `_` requires pressing the shift or caps-lock key, while `-` does not.
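
As a minimal sketch of the pattern described above (not the code from this PR), an `argparse` parser can register the dashed spelling first and keep the underscore spelling as an alias. The parser, the option names, and the default value here are hypothetical and chosen only for illustration; `BooleanOptionalAction` requires Python 3.9 or later.

```python
import argparse

# Hypothetical parser for illustration; not taken from the PyTorch codebase.
parser = argparse.ArgumentParser(prog="example")

# The dashed spelling comes first, so argparse derives dest="master_port" from it.
# The underscore spelling is kept as an alias for backward compatibility.
parser.add_argument("--master-port", "--master_port", type=int, default=29500)

# BooleanOptionalAction generates the negated flag with a dash: --no-use-cache.
parser.add_argument("--use-cache", action=argparse.BooleanOptionalAction, default=True)

new_style = parser.parse_args(["--master-port", "1234", "--no-use-cache"])
old_style = parser.parse_args(["--master_port", "1234"])
```

Both spellings populate the same attribute (`master_port`), so callers of the parser do not need to change.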

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94505
Approved by: https://github.com/ezyang, https://github.com/seemethere
2023-02-09 20:16:49 +00:00
Jason Ansel
5d709af59a Rename aot_cudagraphs to cudagraphs (#93821)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/93821
Approved by: https://github.com/ezyang
2023-02-03 21:01:27 +00:00
William Wen
37a28255cb [dynamo, benchmarks] Fix dashboard update location (#94006)
Get dashboard uploading again

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94006
Approved by: https://github.com/yanboliang
2023-02-02 23:01:57 +00:00
Edward Z. Yang
35ea82541b Send float32 to a different GitHub issue (#93168)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/93168
Approved by: https://github.com/Chillee, https://github.com/jansel
2023-01-27 19:55:06 +00:00
Edward Z. Yang
729f1a8ef2 Setup shebang and set -x on generated runner script (#93007)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/93007
Approved by: https://github.com/williamwen42
2023-01-26 16:52:38 +00:00
blzheng
0c1777acec Dynamo benchmark: add CPU specific changes (#88477)
This PR adds some CPU-specific changes:

- Add support for IPEX backend
- https://github.com/pytorch/torchdynamo/issues/1618
- https://github.com/pytorch/torchdynamo/issues/1534
- Enable CPU launcher in runner.py.
- Fix the issue that some environment variables are not supported on CPU

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88477
Approved by: https://github.com/jgong5, https://github.com/jansel
2023-01-07 09:26:06 +00:00
Animesh Jain
5a79144a79 [dashboard] Fix flag compilers (#89853)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89853
Approved by: https://github.com/williamwen42
2022-11-30 01:02:36 +00:00
William Wen
63843401f5 Fix archive issue impacting summary stat diff (#89789)
The summary stat diff was reporting the difference between the previous day and the day before that, instead of between today and the previous day. This happened because summary stats were not uploaded to the archive before the summary stat differ ran.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89789
Approved by: https://github.com/anijain2305
2022-11-29 00:55:06 +00:00
William Wen
e800d27b10 [dashboard] Add graphs for all summary metrics, add additional testing flags (#89580)
Title. Test post: https://github.com/pytorch/torchdynamo/issues/1831#issuecomment-1325572179

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89580
Approved by: https://github.com/davidberard98
2022-11-23 20:11:39 +00:00
William Wen
8bf8e4d71e [dashboard] Add metric graphs back to dashboard (#89531)
Title.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89531
Approved by: https://github.com/davidberard98
2022-11-22 23:42:09 +00:00
Animesh Jain
5bba783d21 [dashboard] Remove aot_cudagraphs and nvprims_nvfuser (#89514)
Helps speed up dashboard runs.

We will bring these back when the backends are ready to be tested on the full model suite.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89514
Approved by: https://github.com/SherlockNoMad
2022-11-22 22:25:30 +00:00
William Wen
77d7f2c659 [dashboard] Add commit date & fix date related issues (#89517)
Add commit date to the build summary of the dashboard. Make the date of the run reflect when the run started, not when it ended. Use PST (UTC -8) to determine the day, rather than GMT (UTC +0).
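
The day computation described above could look roughly like the following sketch. The function name `run_day` and the sample timestamp are assumptions for illustration; the fixed UTC-8 offset follows the commit message and does not account for daylight saving time.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC-8 offset, as stated in the commit message (no daylight saving handling).
PST = timezone(timedelta(hours=-8), name="PST")


def run_day(start_time_utc: datetime) -> str:
    """Return the calendar day (YYYY-MM-DD) of a run's start time, as seen in PST."""
    return start_time_utc.astimezone(PST).strftime("%Y-%m-%d")


# A run that starts at 03:00 UTC on Nov 22 is still on Nov 21 in PST.
start = datetime(2022, 11, 22, 3, 0, tzinfo=timezone.utc)
day = run_day(start)
```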

Test comment: https://github.com/pytorch/torchdynamo/issues/1831#issuecomment-1324176119

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89517
Approved by: https://github.com/anijain2305
2022-11-22 21:17:36 +00:00
William Wen
fa4980cd5e Add commit hash to dynamo dashboard (#89462)
Title - also fix a small bug with dashboard outputs.

Sample: https://github.com/pytorch/torchdynamo/issues/1831#issuecomment-1322732698

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89462
Approved by: https://github.com/anijain2305
2022-11-21 22:56:13 +00:00
William Wen
af448e84eb Fix bug in dynamo dashboard summary stats diff (#89226)
Fixes an issue where a suite may not be present in one of the logs.
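
One way to make such a diff tolerant of a missing suite is sketched below. This is an illustration under assumed data shapes (a mapping from suite name to a summary value), not the actual fix in `runner.py`; the function name `diff_summary` is hypothetical.

```python
from typing import Dict, Optional


def diff_summary(prev: Dict[str, float], curr: Dict[str, float]) -> Dict[str, Optional[float]]:
    """Diff two per-suite summaries; a suite missing from either side maps to None."""
    result: Dict[str, Optional[float]] = {}
    for suite in sorted(set(prev) | set(curr)):
        if suite in prev and suite in curr:
            result[suite] = curr[suite] - prev[suite]
        else:
            # The suite is absent from one log, so no difference can be computed.
            result[suite] = None
    return result


# Hypothetical values: huggingface is missing from the current log.
deltas = diff_summary({"torchbench": 0.5, "huggingface": 0.75}, {"torchbench": 0.75})
```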

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89226
Approved by: https://github.com/anijain2305
2022-11-17 19:20:49 +00:00
William Wen
640af8d70a More dynamo dashboard improvements (#89155)
A number of dashboard improvements:
- Add accuracy failures to warnings section
- Add regression detection to all metrics (speedup, compile time, peak memory), not just accuracy
- Add testing flag to update-dashboard to prevent image/comment uploads
- Add section for comparing summary statistics (passrate, speedup) between 2 most recent reports
- Show names of reports for summary stats diff and regression detection sections
- Remove metric graphs from the comment (they can still be found in the generated text file)

Sample comment: https://github.com/pytorch/torchdynamo/issues/1831#issuecomment-1317565972

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89155
Approved by: https://github.com/anijain2305
2022-11-16 21:54:27 +00:00
William Wen
45d2daaf85 Fix lookup file update in dashboard (#89024)
Lookup file should be updated before graphs are generated.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89024
Approved by: https://github.com/mlazos, https://github.com/anijain2305
2022-11-15 02:32:55 +00:00
William Wen
36d87465fb Fix long comment error on dashboard (#89002)
Fix dashboard comment failure due to the following trace:
```
Traceback (most recent call last):
  File "/scratch/anijain/dashboard/work/pytorch/benchmarks/dynamo/runner.py", line 1180, in <module>
    DashboardUpdater(args).update()
  File "/scratch/anijain/dashboard/work/pytorch/benchmarks/dynamo/runner.py", line 1119, in update
    self.comment_on_gh(comment)
  File "/scratch/anijain/dashboard/work/pytorch/benchmarks/dynamo/runner.py", line 1096, in comment_on_gh
    subprocess.check_call(
  File "/scratch/anijain/dashboard/env/lib/python3.9/subprocess.py", line 368, in check_call
    retcode = call(*popenargs, **kwargs)
  File "/scratch/anijain/dashboard/env/lib/python3.9/subprocess.py", line 349, in call
    with Popen(*popenargs, **kwargs) as p:
  File "/scratch/anijain/dashboard/env/lib/python3.9/subprocess.py", line 951, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/scratch/anijain/dashboard/env/lib/python3.9/subprocess.py", line 1821, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
OSError: [Errno 7] Argument list too long: '/data/home/anijain/miniconda/bin/gh'
srun: error: a100-st-p4d24xlarge-27: task 0: Exited with exit code 1
```
That is, the `gh` command line we tried to execute exceeded the operating system's argument-length limit.
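
A common way to avoid this class of error is to pass the comment through a file instead of the command line. The sketch below is not necessarily how this PR fixed it; it assumes the GitHub CLI's `gh issue comment --body-file` option, and the helper name `build_comment_command` and the issue ID are hypothetical.

```python
import tempfile
from typing import List


def build_comment_command(issue_id: str, comment: str) -> List[str]:
    """Write the comment to a temporary file and return a gh command that reads it."""
    with tempfile.NamedTemporaryFile(
        "w", suffix=".md", delete=False, encoding="utf-8"
    ) as f:
        f.write(comment)
        path = f.name
    # Only the short file path goes on the command line, not the comment body.
    return ["gh", "issue", "comment", issue_id, "--body-file", path]


# Even a very long comment yields a short argument list.
cmd = build_comment_command("123", "x" * 1_000_000)
# With gh installed and authenticated, this could be run via subprocess.check_call(cmd).
```

The caller is responsible for deleting the temporary file once the command has run.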

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89002
Approved by: https://github.com/davidberard98
2022-11-14 18:43:50 +00:00
William Wen
4bcf2c53e5 Add warnings & regressions info text (#88837)
Add text explaining what the warnings and accuracy regressions dropdowns mean.

Sample: https://github.com/pytorch/torchdynamo/issues/1831#issuecomment-1310770285

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88837
Approved by: https://github.com/anijain2305
2022-11-10 19:22:09 +00:00
William Wen
0b8889c724 Do not flag models in dashboard due to NaN values (#88792)
Title.

Tested by running `python benchmarks/dynamo/runner.py --output-dir ../test-dynamo-runner-logs-4 --training --visualize_logs` on a copy of a recent set of logs.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88792
Approved by: https://github.com/anijain2305
2022-11-10 01:48:04 +00:00
William Wen
6e3555edea Add absolute latency to dashboard (#88790)
Add absolute latency to dashboard, as requested by https://github.com/pytorch/torchdynamo/issues/1833#issuecomment-1302742914

Tested by setting `run.sh` to
```
# Setup the output directory
rm -rf ../test-dynamo-runner-logs-7/
mkdir ../test-dynamo-runner-logs-7/

# Commands for torchbench for device=cuda, dtype=float32 for training and for performance testing
python benchmarks/dynamo/torchbench.py --performance --float32 -dcuda --output=../test-dynamo-runner-logs-7//inductor_torchbench_float32_training_cuda_performance.csv --training --inductor   --no-skip --dashboard --only mobilenet_v2 --cold_start_latency

# Commands for torchbench for device=cuda, dtype=float32 for training and for accuracy testing
python benchmarks/dynamo/torchbench.py --accuracy --float32 -dcuda --output=../test-dynamo-runner-logs-7//inductor_torchbench_float32_training_cuda_accuracy.csv --training --inductor   --no-skip --dashboard --only mobilenet_v2
```
and running `python benchmarks/dynamo/runner.py --output-dir ../test-dynamo-runner-logs-7/ --dashboard-archive-path /data/home/williamwen/dynamo-runner-logs-copy --training --run --compilers inductor --flag-compilers inductor --suites torchbench --update-dashboard` (this requires commenting out the `generate_commands` line and changing the GitHub issue ID from 681 to something else).

Sample comment: https://github.com/pytorch/torchdynamo/issues/1831#issuecomment-1309645562

NOTE: this change breaks processing old logs.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88790
Approved by: https://github.com/anijain2305
2022-11-10 01:45:52 +00:00
William Wen
16bd363863 Fix dynamo dashboard passrate denominator (#88777)
Before the dashboard improvements, the passrate table looked like this:
~~~
+------------------------+------------+-------------+-------------+
|        Compiler        | torchbench | huggingface | timm_models |
+------------------------+------------+-------------+-------------+
|         eager          | 98%, 54/55 | 100%, 43/43 | 100%, 61/61 |
|       aot_eager        | 95%, 52/55 | 100%, 43/43 | 97%, 59/61  |
|     aot_cudagraphs     | 75%, 41/55 | 49%, 21/43  | 38%, 23/61  |
|    nvprims_nvfuser     | 71%, 39/55 |  16%, 7/43  | 48%, 29/61  |
|        inductor        | 87%, 48/55 | 93%, 40/43  | 95%, 58/61  |
| inductor_no_cudagraphs | 93%, 51/55 | 93%, 40/43  | 95%, 58/61  |
+------------------------+------------+-------------+-------------+
~~~
After the change, the table looked like:
~~~
+------------------------+------------+-------------+-------------+
|        Compiler        | torchbench | huggingface | timm_models |
+------------------------+------------+-------------+-------------+
|         eager          | 82%, 53/65 | 84%, 43/51  | 82%, 61/74  |
|       aot_eager        | 83%, 54/65 | 84%, 43/51  | 82%, 61/74  |
|     aot_cudagraphs     | 69%, 45/65 | 65%, 33/51  | 38%, 28/74  |
|    nvprims_nvfuser     | 48%, 31/65 | 78%, 40/51  | 26%, 19/74  |
|        inductor        | 75%, 49/65 | 82%, 42/51  | 81%, 60/74  |
| inductor_no_cudagraphs | 82%, 53/65 | 82%, 42/51  | 82%, 61/74  |
+------------------------+------------+-------------+-------------+
~~~
There is no actual regression; the passrate is lower only because the denominator is wrong. To check the fix, run locally (e.g. `python benchmarks/dynamo/runner.py --output-dir ../test-dynamo-runner-logs-5 --training --visualize_logs`) and compare the passrate table output to the previously correct one.
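
To illustrate how much the denominator alone moves the reported number, here is a small sketch that reproduces the `eager`/`torchbench` cells from the two tables above. The `passrate` helper is hypothetical and only mirrors the table's cell format; it is not the dashboard's implementation.

```python
def passrate(passed: int, total: int) -> str:
    """Format a passrate cell as 'NN%, passed/total', like the tables above."""
    return f"{passed / total:.0%}, {passed}/{total}"


# eager / torchbench cells from the first and second tables.
before = passrate(54, 55)
after = passrate(53, 65)
```

A similar number of passing models yields a much lower percentage once the total is inflated from 55 to 65.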

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88777
Approved by: https://github.com/anijain2305
2022-11-10 00:26:58 +00:00
William Wen
0e67b2f7dd Dynamo Dashboard Improvements (#88516)
Implement various features in https://github.com/pytorch/torchdynamo/issues/1644:
- Upload nightly run logs to /fsx before parsing, as a backup in case parsing fails.
- Flag models with (1) < 0.95x speedup, (2) > 2min compile time, (3) < 0.9x compression ratio
- Flag models that were passing yesterday but failed today.
- Other small bug fixes.
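
The flagging rules in the list above can be sketched as follows. The thresholds come from the commit message; the metric key names, the helper names, and the sample data are assumptions for illustration and do not reflect the actual `runner.py` code.

```python
from typing import Dict, List

SPEEDUP_MIN = 0.95          # flag if speedup is below 0.95x
COMPILE_TIME_MAX_S = 120.0  # flag if compile time exceeds 2 minutes
COMPRESSION_MIN = 0.9       # flag if compression ratio is below 0.9x


def flag_reasons(metrics: Dict[str, float]) -> List[str]:
    """Return the names of the thresholds that a model's metrics violate."""
    reasons = []
    if metrics["speedup"] < SPEEDUP_MIN:
        reasons.append("speedup")
    if metrics["compile_time_s"] > COMPILE_TIME_MAX_S:
        reasons.append("compile_time")
    if metrics["compression_ratio"] < COMPRESSION_MIN:
        reasons.append("compression_ratio")
    return reasons


def newly_failing(yesterday: Dict[str, bool], today: Dict[str, bool]) -> List[str]:
    """Return models that passed yesterday and failed today (present in both runs)."""
    return sorted(
        name
        for name, passed in yesterday.items()
        if passed and name in today and not today[name]
    )


# Hypothetical sample data.
reasons = flag_reasons({"speedup": 0.9, "compile_time_s": 130.0, "compression_ratio": 1.0})
regressed = newly_failing(
    {"model_a": True, "model_b": True, "model_c": False},
    {"model_a": False, "model_b": True, "model_c": False},
)
```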

See https://github.com/pytorch/torchdynamo/issues/1831 for sample outputs.
Also tested by running run.sh:
```bash
# Setup the output directory
rm -rf ../test-dynamo-runner-logs-3/
mkdir ../test-dynamo-runner-logs-3/

# Commands for torchbench for device=cuda, dtype=float32 for training and for performance testing
python benchmarks/dynamo/torchbench.py --performance --float32 -dcuda --output=../test-dynamo-runner-logs-3//inductor_torchbench_float32_training_cuda_performance.csv --training --inductor   --no-skip --dashboard --only mobilenet_v2 --cold_start_latency

# Commands for torchbench for device=cuda, dtype=float32 for training and for accuracy testing
python benchmarks/dynamo/torchbench.py --accuracy --float32 -dcuda --output=../test-dynamo-runner-logs-3//inductor_torchbench_float32_training_cuda_accuracy.csv --training --inductor   --no-skip --dashboard --only mobilenet_v2
```

with the command
`python benchmarks/dynamo/runner.py --output-dir ../test-dynamo-runner-logs-3/ --dashboard-archive-path /data/home/williamwen/dynamo-runner-logs-copy --training --run --compilers inductor --flag-compilers inductor --suites torchbench --update-dashboard` (this requires commenting out the `generate_commands` line and changing the GitHub issue ID from 681 to something else).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88516
Approved by: https://github.com/anijain2305
2022-11-07 22:24:44 +00:00
Animesh Jain
f8b73340c8 [dashboard] Replace aot_nvfuser with nvprims_nvfuser (#88437)
@IvanYashchuk @ngimel

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88437
Approved by: https://github.com/soumith
2022-11-03 19:07:03 +00:00
Yanbo Liang
72958b9665 [Dynamo] Update Dynamo benchmarks running commands (#87844)
Fixes https://github.com/pytorch/torchdynamo/issues/1761

Pull Request resolved: https://github.com/pytorch/pytorch/pull/87844
Approved by: https://github.com/jansel
2022-11-01 22:45:13 +00:00
Animesh Jain
d67b2edec3 [dynamo][dashboard] minor fixes for a clean Dashboard (#88056)
* better check for cold start latency
* sort on inductor column for better readability.

cc @mlazos @soumith @voznesenskym @yanboliang @penguinwu @EikanWang @jgong5 @Guobing-Chen @chunyuan-w @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88056
Approved by: https://github.com/ngimel
2022-10-31 02:30:29 +00:00
Animesh Jain
83b381d34d [dynamo] add inductor runs w/o cudagraphs (#87847)
as title

cc @jansel @mlazos @soumith @voznesenskym @yanboliang @penguinwu @EikanWang @jgong5 @Guobing-Chen @chunyuan-w @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87847
Approved by: https://github.com/jansel
2022-10-27 19:49:29 +00:00
William Wen
cc64863d71 Clean Inductor compilation cache during dynamo dashboard run (#87246)
Implement improvement from https://github.com/pytorch/torchdynamo/issues/1644.

Tested by running `python benchmarks/dynamo/runner.py --print_run_commands --training` and inspecting the generated `run.sh` file for the `--cold_start_latency` flag, e.g.
```
python benchmarks/dynamo/torchbench.py --performance --float32 -dcuda --output=benchmark_logs/inductor_torchbench_float32_training_cuda_performance.csv --training --inductor   --no-skip --dashboard -x fambench_xlmr -x detectron2_fasterrcnn_r_50_c4 -x detectron2_fasterrcnn_r_50_dc5 -x detectron2_maskrcnn_r_101_fpn -x detectron2_maskrcnn_r_50_fpn -x detectron2_fasterrcnn_r_50_fpn -x detectron2_maskrcnn -x detectron2_fasterrcnn_r_101_dc5 -x opacus_cifar10 -x detectron2_maskrcnn_r_101_c4 -x pyhpc_turbulent_kinetic_energy -x maml -x detectron2_fasterrcnn_r_101_fpn -x pyhpc_equation_of_state -x detectron2_fasterrcnn_r_101_c4 -x pyhpc_isoneutral_mixing --cold_start_latency
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87246
Approved by: https://github.com/anijain2305, https://github.com/jansel
2022-10-19 16:39:12 +00:00
Animesh Jain
c30cfb07ab [dynamo][dashboard] Run 2 iterations for the correctness runs (#87104)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87104
Approved by: https://github.com/soumith
2022-10-18 15:53:40 +00:00
Animesh Jain
2b558138cf [inductor] Set correct strides in fallback example run (#87049)
Helps in resolving many issues seen in https://github.com/pytorch/torchdynamo/issues/1675
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87049
Approved by: https://github.com/jansel
2022-10-17 15:43:53 +00:00
Jason Ansel
c7c09722ad Move TorchDynamo into PyTorch core (#86461)
Context:
https://github.com/pytorch/torchdynamo/issues/1588

This PR moves [TorchDynamo](https://github.com/pytorch/torchdynamo) and TorchInductor into PyTorch core.
- `torchdynamo` becomes `torch._dynamo`
- `torchinductor` becomes `torch._inductor`
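
For downstream code that still refers to the old module names, the renames listed above amount to a simple textual mapping. The sketch below is only an illustration of that mapping; it makes no claim about what `copy_to_core.sh` actually does, and the helper name `rewrite_module_names` is hypothetical.

```python
import re

# Old top-level module names mapped to their new locations in PyTorch core.
RENAMES = {
    "torchdynamo": "torch._dynamo",
    "torchinductor": "torch._inductor",
}

# Match the old names only as whole words.
_PATTERN = re.compile(r"\b(" + "|".join(RENAMES) + r")\b")


def rewrite_module_names(source: str) -> str:
    """Replace old module names with their torch._* equivalents in source text."""
    return _PATTERN.sub(lambda match: RENAMES[match.group(1)], source)


rewritten = rewrite_module_names("import torchdynamo\nfrom torchinductor import config\n")
```

A purely textual rewrite like this can also touch strings and comments, so the result should be reviewed rather than applied blindly.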

This PR was generated by running `copy_to_core.sh` in https://github.com/pytorch/torchdynamo/pull/1538

Pull Request resolved: https://github.com/pytorch/pytorch/pull/86461
Approved by: https://github.com/voznesenskym
2022-10-13 23:18:06 +00:00