pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-06 12:20:52 +01:00

Author	SHA1	Message	Date
Xuehai Pan	c0ed38e644	[BE][Easy][3/19] enforce style for empty lines in import segments in `benchmarks/` (#129754 ) See https://github.com/pytorch/pytorch/pull/129751#issue-2380881501. Most changes are auto-generated by linter. You can review these PRs via: ```bash git diff --ignore-all-space --ignore-blank-lines HEAD~1 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/129754 Approved by: https://github.com/ezyang	2024-07-17 14:34:42 +00:00
Xuehai Pan	973037be6a	[BE][Easy] apply autofix for ruff rules unnecessary-collection-call (C408): `list()` / `tuple()` / `dict()` (#130199 ) This PR changes the empty collection factory call to Python literals: - `list()` -> `[]` - `tuple()` -> `()` - `dict()` -> `{}` The Python literals are more performant and safer. For example, the bytecode for building an empty dictionary: ```bash $ python3 -m dis - <<EOS import collections d1 = {} d2 = dict() dict = collections.OrderedDict d3 = dict() EOS ``` ```text 0 0 RESUME 0 1 2 LOAD_CONST 0 (0) 4 LOAD_CONST 1 (None) 6 IMPORT_NAME 0 (collections) 8 STORE_NAME 0 (collections) 3 10 BUILD_MAP 0 12 STORE_NAME 1 (d1) 4 14 PUSH_NULL 16 LOAD_NAME 2 (dict) 18 CALL 0 26 STORE_NAME 3 (d2) 6 28 LOAD_NAME 0 (collections) 30 LOAD_ATTR 8 (OrderedDict) 50 STORE_NAME 2 (dict) 7 52 PUSH_NULL 54 LOAD_NAME 2 (dict) 56 CALL 0 64 STORE_NAME 5 (d3) 66 RETURN_CONST 1 (None) ``` The dict literal `{}` only has one bytecode `BUILD_MAP`, while the factory call `dict()` has three `PUSH_NULL + LOAD_NAME + CALL`. Also, the factory call is not safe if users override the `dict` name in `locals` or `globals` (see the example of replacing with `OrderedDict` above). Pull Request resolved: https://github.com/pytorch/pytorch/pull/130199 Approved by: https://github.com/malfet	2024-07-11 17:30:28 +00:00
Shunting Zhang	c0735a3dd3	[pt2-bench] fix accuracy failure for a few models (#129941 ) This PR batch the fix for a few accuracy failures issues during training by raising tolerance. I do that only for models that I think it fails not due to real issue. ## sebotnet33ts_256 The accuracy test for this model start to fail around June 05 [link](https://hud.pytorch.org/benchmark/timm_models/inductor_with_cudagraphs?dashboard=torchinductor&startTime=Sun%2C%2002%20Jun%202024%2007%3A19%3A38%20GMT&stopTime=Tue%2C%2002%20Jul%202024%2007%3A19%3A38%20GMT&granularity=day&mode=training&dtype=amp&lBranch=main&lCommit=04a0d856207d83c2031e4b9cb6825ba3e0092850&rBranch=main&rCommit=e62925930f6a62f6aeeb1fe1a661a9bd3352b53d&model=sebotnet33ts_256). I can not repro locally, but from the log from the dashboard: ``` RMSE (res-fp64): 0.09441, (ref-fp64): 0.02971 and shape=torch.Size([1536]). res.dtype: torch.float32, multiplier: 3.000000, tol: 0.040000 ``` raising the tolerance should fix it. ## DebertaForQuestionAnswering This model fails accuracy test on the dashboard only in max-autotune mode. I can not repro locally by command: ``` TORCHINDUCTOR_MAX_AUTOTUNE=1 time python benchmarks/dynamo/huggingface.py --accuracy --no-translation-validation --training --amp --backend inductor --device cuda --only DebertaForQuestionAnswering ``` From error message on the dashboard: ``` RMSE (res-fp64): 0.01803, (ref-fp64): 0.00537 and shape=torch.Size([2]). res.dtype: torch.float32, multiplier: 3.000000, tol: 0.010000 ``` 0.02 tolerance should suppress this error. ## gluon_inception_v3 This model fail on the dashboard in max-autotune mode. I can not repro locally by command ``` TORCHINDUCTOR_MAX_AUTOTUNE=1 time python benchmarks/dynamo/timm_models.py --accuracy --training --amp --backend inductor --disable-cudagraphs --device cuda --only gluon_inception_v3 ``` From error message on the dashboard ``` RMSE (res-fp64): 0.02798, (ref-fp64): 0.00730 and shape=torch.Size([384]). res.dtype: torch.float32, multiplier: 3.000000, tol: 0.010000 Accuracy failed for key name Mixed_7c.branch3x3dbl_3a.bn.running_var ``` raising tolerance should suppress this error. # mobilenetv3_large_100 Fail in MA model. I can not repro locally by command ``` TORCHINDUCTOR_MAX_AUTOTUNE=1 time python benchmarks/dynamo/timm_models.py --accuracy --training --amp --backend inductor --disable-cudagraphs --device cuda --only ``` The error message on the dashboard is ``` RMSE (res-fp64): 0.29754, (ref-fp64): 0.05205 and shape=torch.Size([]). res.dtype: torch.float32, multiplier: 3.000000, tol: 0.040000 ``` The tensor is so small that the noise can be high. I use larger multiplier for smaller tensor in torch._dynamo.utils.same. # yolov3 Fail on dashboard with error ``` Error on the dashboard: RMSE (res-fp64): 0.01278, (ref-fp64): 0.00246 and shape=torch.Size([256]). res.dtype: torch.float32, multiplier: 3.000000, tol: 0.001000 ``` Fix it by using a larger multiplier for smaller tensors and raising the tolereance. # timm_efficientdet Fail on the dashboard with error ``` E0623 18:37:43.638000 139924418725056 torch/_dynamo/utils.py:1468] RMSE (res-fp64): 0.00096, (ref-fp64): 0.00009 and shape=torch.Size([2]). res.dtype: torch.float32, multiplier: 3.000000, tol: 0.001000 ``` But I can not repro locally with command ``` time python benchmarks/dynamo/torchbench.py --backend inductor --amp --performance --only timm_efficientdet --training ``` Raise the tolerance should fix. Pull Request resolved: https://github.com/pytorch/pytorch/pull/129941 Approved by: https://github.com/jansel ghstack dependencies: #129996	2024-07-05 10:26:39 +00:00
PyTorch MergeBot	fa3953a2e1	Revert "[pt2-bench] fix accuracy failure for a few models (#129941 )" This reverts commit `dafbd603ee`. Reverted https://github.com/pytorch/pytorch/pull/129941 on behalf of https://github.com/jeanschmidt due to Seems to have introduced breakages in main cuda12 focal jobs ([comment](https://github.com/pytorch/pytorch/pull/129996#issuecomment-2209175516))	2024-07-04 14:55:38 +00:00
Shunting Zhang	dafbd603ee	[pt2-bench] fix accuracy failure for a few models (#129941 ) This PR batch the fix for a few accuracy failures issues during training by raising tolerance. I do that only for models that I think it fails not due to real issue. ## sebotnet33ts_256 The accuracy test for this model start to fail around June 05 [link](https://hud.pytorch.org/benchmark/timm_models/inductor_with_cudagraphs?dashboard=torchinductor&startTime=Sun%2C%2002%20Jun%202024%2007%3A19%3A38%20GMT&stopTime=Tue%2C%2002%20Jul%202024%2007%3A19%3A38%20GMT&granularity=day&mode=training&dtype=amp&lBranch=main&lCommit=04a0d856207d83c2031e4b9cb6825ba3e0092850&rBranch=main&rCommit=e62925930f6a62f6aeeb1fe1a661a9bd3352b53d&model=sebotnet33ts_256). I can not repro locally, but from the log from the dashboard: ``` RMSE (res-fp64): 0.09441, (ref-fp64): 0.02971 and shape=torch.Size([1536]). res.dtype: torch.float32, multiplier: 3.000000, tol: 0.040000 ``` raising the tolerance should fix it. ## DebertaForQuestionAnswering This model fails accuracy test on the dashboard only in max-autotune mode. I can not repro locally by command: ``` TORCHINDUCTOR_MAX_AUTOTUNE=1 time python benchmarks/dynamo/huggingface.py --accuracy --no-translation-validation --training --amp --backend inductor --device cuda --only DebertaForQuestionAnswering ``` From error message on the dashboard: ``` RMSE (res-fp64): 0.01803, (ref-fp64): 0.00537 and shape=torch.Size([2]). res.dtype: torch.float32, multiplier: 3.000000, tol: 0.010000 ``` 0.02 tolerance should suppress this error. ## gluon_inception_v3 This model fail on the dashboard in max-autotune mode. I can not repro locally by command ``` TORCHINDUCTOR_MAX_AUTOTUNE=1 time python benchmarks/dynamo/timm_models.py --accuracy --training --amp --backend inductor --disable-cudagraphs --device cuda --only gluon_inception_v3 ``` From error message on the dashboard ``` RMSE (res-fp64): 0.02798, (ref-fp64): 0.00730 and shape=torch.Size([384]). res.dtype: torch.float32, multiplier: 3.000000, tol: 0.010000 Accuracy failed for key name Mixed_7c.branch3x3dbl_3a.bn.running_var ``` raising tolerance should suppress this error. # mobilenetv3_large_100 Fail in MA model. I can not repro locally by command ``` TORCHINDUCTOR_MAX_AUTOTUNE=1 time python benchmarks/dynamo/timm_models.py --accuracy --training --amp --backend inductor --disable-cudagraphs --device cuda --only ``` The error message on the dashboard is ``` RMSE (res-fp64): 0.29754, (ref-fp64): 0.05205 and shape=torch.Size([]). res.dtype: torch.float32, multiplier: 3.000000, tol: 0.040000 ``` The tensor is so small that the noise can be high. I use larger multiplier for smaller tensor in torch._dynamo.utils.same. # yolov3 Fail on dashboard with error ``` Error on the dashboard: RMSE (res-fp64): 0.01278, (ref-fp64): 0.00246 and shape=torch.Size([256]). res.dtype: torch.float32, multiplier: 3.000000, tol: 0.001000 ``` Fix it by using a larger multiplier for smaller tensors and raising the tolereance. # timm_efficientdet Fail on the dashboard with error ``` E0623 18:37:43.638000 139924418725056 torch/_dynamo/utils.py:1468] RMSE (res-fp64): 0.00096, (ref-fp64): 0.00009 and shape=torch.Size([2]). res.dtype: torch.float32, multiplier: 3.000000, tol: 0.001000 ``` But I can not repro locally with command ``` time python benchmarks/dynamo/torchbench.py --backend inductor --amp --performance --only timm_efficientdet --training ``` Raise the tolerance should fix. Pull Request resolved: https://github.com/pytorch/pytorch/pull/129941 Approved by: https://github.com/jansel ghstack dependencies: #129996	2024-07-04 01:14:29 +00:00
Valentine233	62b710782d	change LayoutLMForSequenceClassification inference accuracy tolerance (#129728 ) Fixes #128510. https://github.com/pytorch/pytorch/pull/124451 makes LayoutLMForSequenceClassification hit the SDPA pattern 1 and then encounter the accuracy issue. The issue only happens with BF16 inference single thread. This PR tends to increase the model tolerance and make the check pass. Note that even the math-version SDPA could have the issue because of some small implementation diff. The test log: Single thread ``` correct_result: SequenceClassifierOutput(loss=tensor(0.5998), logits=tensor([[0.3301, 0.1338]], dtype=torch.bfloat16), hidden_states=None, attentions=None) new_result: SequenceClassifierOutput(loss=tensor(0.6016), logits=tensor([[0.3281, 0.1357]], dtype=torch.bfloat16), hidden_states=None, attentions=None) E0627 01:09:16.762789 140281313759104 torch/_dynamo/utils.py:1476] RMSE (res-fp64): 0.00151, (ref-fp64): 0.00046 and shape=torch.Size([1, 2]). res.dtype: torch.bfloat16, multiplier: 3.000000, tol: 0.001000 E0627 01:09:16.762972 140281313759104 torch/_dynamo/utils.py:1390] Accuracy failed for key name logits fail_accuracy ``` Multiple threads ``` correct_result: SequenceClassifierOutput(loss=tensor(0.6007), logits=tensor([[0.3301, 0.1357]], dtype=torch.bfloat16), hidden_states=None, attentions=None) new_result: SequenceClassifierOutput(loss=tensor(0.6016), logits=tensor([[0.3281, 0.1357]], dtype=torch.bfloat16), hidden_states=None, attentions=None) pass ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/129728 Approved by: https://github.com/jgong5, https://github.com/jansel	2024-07-03 06:28:27 +00:00
Xu Zhao	094183dba6	[torchbench][pt2] Enable Huggingface and Timm models for interal buck runner (#127460 ) Summary: Add huggingface and timm model runs to the internal pt2 benchmark runner. Test Plan: Tesing huggingface model: ``` $ buck2 run mode/opt //pytorch/benchmark:pt2 -- --only BlenderbotSmallForCausalLM --performance --training --device=cuda --amp 33/ 33 +0 frames 2s 13 graphs 13 graph calls 0/ -12 = 0% ops 0% time ``` Testing timm model: ``` $ buck2 run mode/opt //pytorch/benchmark:pt2 -- --only coat_lite_mini --performance --training --device=cuda --amp loading model: 0it [00:11, ?it/s] cuda train coat_lite_mini 8/ 8 +0 frames 4s 2 graphs 2 graph calls 0/ -1 = 0% ops 0% time ``` Differential Revision: D57930582 Pull Request resolved: https://github.com/pytorch/pytorch/pull/127460 Approved by: https://github.com/HDCharles, https://github.com/huydhn	2024-05-30 21:18:28 +00:00
Xuehai Pan	26f4f10ac8	[5/N][Easy] fix typo for `usort` config in `pyproject.toml` (`kown` -> `known`): sort torch (#127126 ) The `usort` config in `pyproject.toml` has no effect due to a typo. Fixing the typo make `usort` do more and generate the changes in the PR. Except `pyproject.toml`, all changes are generated by `lintrunner -a --take UFMT --all-files`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/127126 Approved by: https://github.com/kit1980	2024-05-27 14:49:57 +00:00
PyTorch MergeBot	55c0ab2887	Revert "[5/N][Easy] fix typo for `usort` config in `pyproject.toml` (`kown` -> `known`): sort torch (#127126 )" This reverts commit `7763c83af6`. Reverted https://github.com/pytorch/pytorch/pull/127126 on behalf of https://github.com/XuehaiPan due to Broken CI ([comment](https://github.com/pytorch/pytorch/pull/127126#issuecomment-2133044286))	2024-05-27 09:22:08 +00:00
Xuehai Pan	7763c83af6	[5/N][Easy] fix typo for `usort` config in `pyproject.toml` (`kown` -> `known`): sort torch (#127126 ) The `usort` config in `pyproject.toml` has no effect due to a typo. Fixing the typo make `usort` do more and generate the changes in the PR. Except `pyproject.toml`, all changes are generated by `lintrunner -a --take UFMT --all-files`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/127126 Approved by: https://github.com/kit1980 ghstack dependencies: #127122, #127123, #127124, #127125	2024-05-27 04:22:18 +00:00
Sam Larsen	ecd9a4e5c3	Enable FX graph cache for huggingface and timm benchmarks (#126205 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/126205 Approved by: https://github.com/eellison	2024-05-17 18:36:05 +00:00
eellison	000d55870a	Enable in oss (#124031 ) Biggest movement is 4% HF inference, 9% TIMM inference. Note, this is max-autotune mode so we are more tolerant of compilation increases. We could improve compilation time by limiting: ``` # Take how many of the top triton kernels to benchmark epilogue max_epilogue_benchmarked_choices = 3 ``` There is a hf_Whisper failure which you can repro on main without this stack with `TORCHINDUCTOR_MAX_AUTOTUNE_GEMM_BACKENDS=TRITON TORCHINDUCTOR_MAX_AUTOTUNE=1 python benchmarks/dynamo/torchbench.py --backend inductor --amp --accuracy --training --only hf_Whisper`. When you turn off epilogue fusion, it fixes the accuracy. I bisected the failure to an epilogue, however when you compare the results of that epilogue with the corresponding separate kernels the results of the output are equivalent. Inference: <img width="1686" alt="image" src="https://github.com/pytorch/pytorch/assets/11477974/0b240080-cd33-4c08-89d3-583103b1fb0c"> Training: <img width="1329" alt="Screenshot 2024-04-16 at 6 16 30 PM" src="https://github.com/pytorch/pytorch/assets/11477974/db0afcc9-7288-4c27-84ce-4fc1a5690788"> Pull Request resolved: https://github.com/pytorch/pytorch/pull/124031 Approved by: https://github.com/Chillee, https://github.com/shunting314 ghstack dependencies: #124030, #122642, #123229, #122825	2024-04-19 20:28:55 +00:00
Shunting Zhang	b381a4372b	make GPT2ForSequenceClassification pass inference accuracy check (#120537 ) We need a higher tolerance for GPT2ForSequenceClassification since if I change --bfloat16 in ``` time python benchmarks/dynamo/huggingface.py --accuracy --inference --bfloat16 --backend inductor --disable-cudagraphs --only GPT2ForSequenceClassification ``` to --float16 or --float32 it will pass the accuracy check. Adding --freezing can also make the test pass for this model. I think that's may be due to different fusion output being generated (depending on if constant propagation is happening controlled by freezing) and cause some small numerical difference. Pull Request resolved: https://github.com/pytorch/pytorch/pull/120537 Approved by: https://github.com/jansel	2024-02-26 11:02:57 +00:00
haozhe.zhu	6500ccebd7	enable fp16 autocast for dynamo benchmark (#114088 ) `--amp` to enable amp path for` CUDA` (default amp_dtype will be float16) and `CPU` (default amp_dtype will be bfloat16). If users set `--amp_dtype`, the amp_dtype from users will have the highest priority. Pull Request resolved: https://github.com/pytorch/pytorch/pull/114088 Approved by: https://github.com/jgong5, https://github.com/jansel	2023-12-14 12:38:44 +00:00
angelayi	a565f1bee6	[aotinductor] Skip benchmarks with control flow (#109661 ) Since AOTInductor doesn't support control flow yet, we will skip over tests that are currently failing due to containing control flow in the code. Logs taken from https://hud.pytorch.org/benchmark/compilers?startTime=Tue%2C%2012%20Sep%202023%2022%3A56%3A40%20GMT&stopTime=Tue%2C%2019%20Sep%202023%2022%3A56%3A40%20GMT&granularity=hour&suite=torchbench&mode=inference&dtype=bfloat16&lBranch=main&lCommit=2c1554a0323107d821be3ff13df7833b9f0b960d&rBranch=main&rCommit=47be61e12bd51df27182343d312dc3df485d5559 Errors documented in https://github.com/pytorch/pytorch/issues/105217 Pull Request resolved: https://github.com/pytorch/pytorch/pull/109661 Approved by: https://github.com/desertfire	2023-09-25 18:49:06 +00:00
Simon Fan	54c5f474a7	Forward rank and world size info to Torchbench models when using dynamo runner (#108438 ) Adding support to pass rank and world_size to torchbench model, via its extra_args parameter: https://github.com/pytorch/benchmark/blob/main/torchbenchmark/util/model.py#L83C80-L83C90 This is used for models which distribute over multiple GPUs e.g. simple_gpt https://github.com/pytorch/benchmark/pull/1867 Also add an option to skip multiprocess only gpu models Testing via `python benchmarks/dynamo/torchbench.py -d cuda --output=benchmark_logs/performance.csv --inference --performance --timing --print-memory --multiprocess --only simple_gpt` Pull Request resolved: https://github.com/pytorch/pytorch/pull/108438 Approved by: https://github.com/Chillee	2023-09-14 21:01:20 +00:00
blzheng	b9befc53a6	benchmark: higher tolerance for RobertaForQuestionAnswering (#107376 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/107376 Approved by: https://github.com/kit1980, https://github.com/XiaobingSuper, https://github.com/jansel ghstack dependencies: #107375	2023-08-21 04:34:24 +00:00
Bin Bao	141828498c	[CI] Update inference accuracy test (#103361 ) Summary: 1) Switch inference accuracy test from fp32 to amp (consistent with dashboard run, https://github.com/pytorch/pytorch/pull/103220) 2) GoogleFnet fails in eager with amp or fp16, so fallback to always using fp32. Pull Request resolved: https://github.com/pytorch/pytorch/pull/103361 Approved by: https://github.com/eellison	2023-06-12 19:34:18 +00:00
Edward Z. Yang	eeb3c62117	Add Wav2Vec2 HuggingFace support (#103009 ) This is not actually enabled in the benchmark suite as you need https://github.com/pytorch/pytorch/pull/103001 and also training is broken per https://github.com/pytorch/pytorch/issues/101160 but might as well review this part first. Contains https://github.com/pytorch/pytorch/pull/102979 but I will probably rebase past that once it lands. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/103009 Approved by: https://github.com/Skylion007	2023-06-06 13:25:06 +00:00
Animesh Jain	65631d4515	[benchmarks] Use train mode for accuracy checks for HF models (#102578 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/102578 Approved by: https://github.com/desertfire	2023-05-31 19:47:18 +00:00
Animesh Jain	33a49eeae7	[benchmark] Flag to switch on activation checkpointing for HF models (#102557 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/102557 Approved by: https://github.com/ngimel, https://github.com/Chillee	2023-05-30 23:46:14 +00:00
Elias Ellison	e5e451a9db	Update batch size for a couple models (#101837 ) The memory compression for these models is at parity, but because we interleave timings between torch.compile and eager run memory is duplicated between between eager and cudagraphs pool and causes OOM. Pull Request resolved: https://github.com/pytorch/pytorch/pull/101837 Approved by: https://github.com/anijain2305	2023-05-19 19:09:59 +00:00
PaliC	e0fc24cdc5	add retries to inductor benchmark suite (#101019 ) This pr accomplishes 1) Enables retries for downloading torchbenchmark and huggingface models in a similar method to how we do it for timm models right now. 2) creates a `_download_model` function for the hugging face and TIMM runners whose output I plan to use to preload the models somewhere if possible (please double check I'll be saving the right thing). Instead of retries, we plan to just add torchbench to a docker image as it is relatively small. <!-- copilot:poem --> ### <samp>🤖 Generated by Copilot at 3361a4c</samp> > _We're the brave and bold coders of the `common.py` module_ > _We've made a handy function for downloading models_ > _We've shared it with our mates in the other runners_ > _So pull and push and try again, we'll get them all in time_ Pull Request resolved: https://github.com/pytorch/pytorch/pull/101019 Approved by: https://github.com/huydhn, https://github.com/desertfire	2023-05-16 21:41:50 +00:00
Edward Z. Yang	41a4e22015	Update torchbench pin (#101071 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/101071 Approved by: https://github.com/malfet	2023-05-11 18:09:40 +00:00
Animesh Jain	0acfe2ce09	[dashboard] higher tolerance for AlbertForQuestionAnswering (#100277 ) @desertfire Pull Request resolved: https://github.com/pytorch/pytorch/pull/100277 Approved by: https://github.com/desertfire	2023-05-02 23:51:08 +00:00
Jason Ansel	fdc853b14c	Add --baseline option to benchmark runners (#100266 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/100266 Approved by: https://github.com/ngimel	2023-05-02 02:35:11 +00:00
Edward Z. Yang	0b545bc667	Stop marking sequence length as dynamic (#99889 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/99889 Approved by: https://github.com/voznesenskym, https://github.com/huydhn	2023-04-25 01:04:16 +00:00
Edward Z. Yang	f602b3a6ae	Preserve mark_dynamic when cloning inputs (#99617 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/99617 Approved by: https://github.com/ngimel, https://github.com/voznesenskym, https://github.com/anijain2305	2023-04-22 19:46:31 +00:00
Natalia Gimelshein	bfbc4e74ab	adjust batch sizes for hf suite (#99691 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/99691 Approved by: https://github.com/yanboliang, https://github.com/anijain2305	2023-04-21 23:57:53 +00:00
Edward Z. Yang	37b9143206	Require sequence length in huggingface to be dynamic (#98335 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/98335 Approved by: https://github.com/voznesenskym	2023-04-05 19:40:22 +00:00
BowenBao	60a68477a6	Bump black version to 23.1.0 (#96578 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/96578 Approved by: https://github.com/ezyang	2023-03-15 06:27:59 +00:00
Xuehai Pan	8d45f555d7	[BE] [1/3] Rewrite `super()` calls in caffe2 and benchmarks (#94587 ) Rewrite Python built-in class `super()` calls. Only non-semantic changes should be applied. - #94587 - #94588 - #94592 Also, methods with only a `super()` call are removed: ```diff class MyModule(nn.Module): - def __init__(self): - super().__init__() - def forward(self, ...): ... ``` Some cases that change the semantics should be kept unchanged. E.g.: `f152a79be9/caffe2/python/net_printer.py (L184-L190)` `f152a79be9/test/test_jit_fuser_te.py (L2628-L2635)` Pull Request resolved: https://github.com/pytorch/pytorch/pull/94587 Approved by: https://github.com/ezyang	2023-02-11 18:19:48 +00:00
Michael Voznesensky	333e771394	Add benchmarks.py to run all benchmarks, add new file with all torchbench model names (#94146 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/94146 Approved by: https://github.com/ezyang	2023-02-08 01:18:38 +00:00
Edward Z. Yang	c52567ec18	Switch CI exclusions to use exact match. (#92761 ) Since the CI exclusions are hard-coded in our script, we might as well require them to match exactly. This solved some head scratching where I was like, "this model is not obviously excluded, why is it not showing up in CI." Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/92761 Approved by: https://github.com/jansel	2023-01-22 17:10:20 +00:00
blzheng	0c1777acec	Dynamo benchmark: add CPU specific changes (#88477 ) This pr adds some CPU specific changes: - Add support for IPEX backend - https://github.com/pytorch/torchdynamo/issues/1618 - https://github.com/pytorch/torchdynamo/issues/1534 - Enable CPU launcher in runner.py. - Fix the issue that some environment variables are not support on CPU Pull Request resolved: https://github.com/pytorch/pytorch/pull/88477 Approved by: https://github.com/jgong5, https://github.com/jansel	2023-01-07 09:26:06 +00:00
Natalia Gimelshein	c43209db4d	use libdevice for tanh (#90889 ) Per title I see slight differences in perf with this implementation, where standalone tanh is slightly slower for a tensor of 4000000 elements (20.4 us instead of 19.4us), other sizes are within noise. @bertmaher could you check if it affects your benchmarks? Pull Request resolved: https://github.com/pytorch/pytorch/pull/90889 Approved by: https://github.com/bertmaher, https://github.com/anijain2305	2022-12-20 23:21:37 +00:00
Michael Lazos	7c524221ba	[reland3][dynamo] Revert "Revert "[reland][dynamo] use optimizers correctly in benchmar… (#90956 ) …king (#87492)" (#90746)" This reverts commit `ff1bbc2773`. This should be okay to merge now. The flakiness of HF models will be fixed by seeding the rng (https://github.com/pytorch/pytorch/pull/90936), and the numeric mismatch was root-caused to three decomps (still investigating why those decomps cause this) see https://github.com/pytorch/torchdynamo/issues/1985 for more detail. Pull Request resolved: https://github.com/pytorch/pytorch/pull/90956 Approved by: https://github.com/desertfire	2022-12-17 06:27:15 +00:00
PyTorch MergeBot	6bc6fb21db	Revert "[reland2][dynamo] Revert "Revert "[reland][dynamo] use optimizers correctly in benchmar… (#90956 )" This reverts commit `8bc38ae4e2`. Reverted https://github.com/pytorch/pytorch/pull/90956 on behalf of https://github.com/desertfire due to Causing TIMM model failures	2022-12-16 19:28:05 +00:00
Michael Lazos	8bc38ae4e2	[reland2][dynamo] Revert "Revert "[reland][dynamo] use optimizers correctly in benchmar… (#90956 ) …king (#87492)" (#90746)" This reverts commit `ff1bbc2773`. This should be okay to merge now. The flakiness of HF models will be fixed by seeding the rng (https://github.com/pytorch/pytorch/pull/90936), and the numeric mismatch was root-caused to three decomps (still investigating why those decomps cause this) see https://github.com/pytorch/torchdynamo/issues/1985 for more detail. Pull Request resolved: https://github.com/pytorch/pytorch/pull/90956 Approved by: https://github.com/desertfire	2022-12-16 13:33:38 +00:00
Michael Lazos	f3da157ce3	Reset rng in hf before loading a model (#90936 ) Reset the rng in hf before generating input and loading model, this makes the huggingface inputs+weights deterministic depending on the seed of the rng. This matches the behavior of the other test suites. Pull Request resolved: https://github.com/pytorch/pytorch/pull/90936 Approved by: https://github.com/desertfire	2022-12-16 00:15:27 +00:00
Bin Bao	ff1bbc2773	Revert "[reland][dynamo] use optimizers correctly in benchmarking (#87492 )" (#90746 ) This reverts commit `d91d7a3221`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/90746 Approved by: https://github.com/anijain2305	2022-12-13 11:37:16 +00:00
Animesh Jain	d91d7a3221	[reland][dynamo] use optimizers correctly in benchmarking (#87492 ) Reland https://github.com/pytorch/pytorch/pull/87311 mlazos: updated to use SGD to not add a bunch of additional memory allocations (like Adam) Pull Request resolved: https://github.com/pytorch/pytorch/pull/87492 Approved by: https://github.com/desertfire	2022-12-09 20:32:53 +00:00
Animesh Jain	3162a48a77	[dynamo][benchmarks] Call zero grad (#90026 ) Hoping that it might reduce some flakiness Pull Request resolved: https://github.com/pytorch/pytorch/pull/90026 Approved by: https://github.com/williamwen42	2022-12-02 04:05:57 +00:00
Animesh Jain	cad5772c2c	[dashboard][huggingface] skip accuracy checks for really large models… (#89273 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/89273 Approved by: https://github.com/desertfire	2022-11-19 00:22:45 +00:00
Animesh Jain	74610a1ced	[dynamo][benchmarks] HF - Fix seq len and batch sizes (#89165 ) Fixes many models in https://github.com/pytorch/torchdynamo/issues/1842 Pull Request resolved: https://github.com/pytorch/pytorch/pull/89165 Approved by: https://github.com/ngimel	2022-11-17 06:14:24 +00:00
Animesh Jain	1b575782a0	[dynamo][benchmarks] use fresh inductor cache and raise batch size wherever possible (#88044 ) cc @mlazos @soumith @voznesenskym @yanboliang @penguinwu @EikanWang @jgong5 @Guobing-Chen @chunyuan-w @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx Pull Request resolved: https://github.com/pytorch/pytorch/pull/88044 Approved by: https://github.com/ngimel	2022-10-30 17:10:17 +00:00
PyTorch MergeBot	f38a88c4dd	Revert "[dynamo] use optimizers correctly in benchmarking (#87311 )" This reverts commit `703c19008d`. Reverted https://github.com/pytorch/pytorch/pull/87311 on behalf of https://github.com/anijain2305 due to Bin (desertfire) is trying to get torchbench models in CI, and this PR prevents that. I will bring this back after models are in CI.	2022-10-20 22:01:51 +00:00
Animesh Jain	703c19008d	[dynamo] use optimizers correctly in benchmarking (#87311 ) We were not setting optimizers correctly * This hid the issue that we see here - https://github.com/pytorch/torchdynamo/issues/1687 * This has also revealed that we are activating profilers for every dynamo optimized model call. This could affect speedup cc @jansel @lezcano @fdrocha Pull Request resolved: https://github.com/pytorch/pytorch/pull/87311 Approved by: https://github.com/mlazos, https://github.com/yanboliang	2022-10-20 05:46:25 +00:00
Animesh Jain	c30cfb07ab	[dynamo][dashboard] Run 2 iterations for the correctness runs (#87104 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/87104 Approved by: https://github.com/soumith	2022-10-18 15:53:40 +00:00
Jason Ansel	054a2fd6c2	Sync changes from `pytorch/torchdynamo` (#87013 ) This updates to: `6380959be2` Generated with: https://github.com/pytorch/torchdynamo/blob/main/copy_to_core.sh Pull Request resolved: https://github.com/pytorch/pytorch/pull/87013 Approved by: https://github.com/voznesenskym	2022-10-15 21:00:57 +00:00

1 2

51 Commits