pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-06 12:20:52 +01:00

Author	SHA1	Message	Date
Yukio Siraichi	2f6fc33c20	Move skip sets into a new file. (#118032 ) This PR moves the skip sets that lived in benchmarks/dynamo/torchbench.py into a more readable YAML file, so that it is consumable from other projects (e.g. XLA). Pull Request resolved: https://github.com/pytorch/pytorch/pull/118032 Approved by: https://github.com/lezcano, https://github.com/ezyang	2024-01-24 19:22:01 +00:00
haozhe.zhu	6500ccebd7	enable fp16 autocast for dynamo benchmark (#114088 ) `--amp` enables the amp path for `CUDA` (default amp_dtype is float16) and `CPU` (default amp_dtype is bfloat16). If users set `--amp_dtype`, the user-supplied amp_dtype takes the highest priority. Pull Request resolved: https://github.com/pytorch/pytorch/pull/114088 Approved by: https://github.com/jgong5, https://github.com/jansel	2023-12-14 12:38:44 +00:00
Jason Ansel	de89a53df8	[benchmarking] Reduce box_detections_per_img for vision_maskrcnn (#115487 ) This fixes a failure on the [perf dashboard](https://hud.pytorch.org/benchmark/compilers) with `--amp` mode. I believe boxes 5 and 6 were getting swapped. The existing comment explains the issue. Before ``` $ ./benchmarks/dynamo/torchbench.py --training --accuracy --no-translation-validatio --amp --backend=inductor --disable-cudagraphs --only vision_maskrcnn ... [2023-12-09 13:21:27,292] torch._dynamo.utils: [ERROR] RMSE (res-fp64): 0.00171, (ref-fp64): 0.00054 and shape=torch.Size([256, 256, 3, 3]) [2023-12-09 13:21:27,292] torch._dynamo.utils: [ERROR] Accuracy failed for key name backbone.fpn.layer_blocks.2.0.weight.grad fail_accuracy ``` After ``` $ ./benchmarks/dynamo/torchbench.py --training --accuracy --no-translation-validatio --amp --backend=inductor --disable-cudagraphs --only vision_maskrcnn ... pass ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/115487 Approved by: https://github.com/yanboliang	2023-12-11 08:42:25 +00:00
Jason Ansel	7bbc19adc4	[dynamo] Unskip DALLE2_pytorch (#114960 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/114960 Approved by: https://github.com/eellison ghstack dependencies: #114959	2023-12-02 00:40:25 +00:00
Jason Ansel	67562c8cf8	Add DALLE2_pytorch to skips (#114924 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/114924 Approved by: https://github.com/huydhn	2023-12-01 07:15:59 +00:00
Jason Ansel	b35ca2cb94	Better error message for misconfigured torchbench model (#114827 ) ``` File "/home/jansel/pytorch/./benchmarks/dynamo/torchbench.py", line 381, in load_model benchmark_cls.name = model_name AttributeError: 'NoneType' object has no attribute 'name' ``` becomes ``` File "/home/jansel/pytorch/./benchmarks/dynamo/torchbench.py", line 381, in load_model raise NotImplementedError(f"{model_name}.Model is None") NotImplementedError: torchrec_dlrm.Model is None ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/114827 Approved by: https://github.com/xuzhao9, https://github.com/yanboliang	2023-11-30 19:11:01 +00:00
eellison	605236af06	Force fp16 for vision_maskrcnn inference (#113110 ) Force fp16 for maskrcnn inference (it doesn't support bf16). Also skip phi_1_5 in training: it OOMs even with batch size 1. Pull Request resolved: https://github.com/pytorch/pytorch/pull/113110 Approved by: https://github.com/xmfan	2023-11-10 02:25:11 +00:00
Elias Ellison	f6fb9fd681	use smaller batch size for timm_efficientdet in inference (#113095 ) Previously had OOMs Pull Request resolved: https://github.com/pytorch/pytorch/pull/113095 Approved by: https://github.com/xmfan ghstack dependencies: #112650	2023-11-07 07:08:16 +00:00
Elias Ellison	5c1ea30ca3	bump torchbench commit (#112650 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/112650 Approved by: https://github.com/msaroufim, https://github.com/xuzhao9	2023-11-07 03:56:16 +00:00
Simon Fan	28ebe5df7a	yolov3: reduce batch size due to OOM (#111959 ) yolov3 w/ cudagraphs (known to use more memory) is failing perf test due to OOM (https://hud.pytorch.org/benchmark/torchbench/inductor_with_cudagraphs?startTime=Mon,%2016%20Oct%202023%2020:19:47%20GMT&stopTime=Mon,%2023%20Oct%202023%2020:19:47%20GMT&granularity=hour&mode=training&dtype=amp&lBranch=main&lCommit=0b424ee0b7bfe09e0a438a63e8336e95eea85901&rBranch=main&rCommit=29048be41ca3aa8974795d93b9ea9fd6dee415fc) I'm reducing the batch size from 16 to 8 to keep the same batch size for all yolov3 HUD benchmarks Pull Request resolved: https://github.com/pytorch/pytorch/pull/111959 Approved by: https://github.com/xuzhao9	2023-10-25 06:18:53 +00:00
Simon Fan	88ef126a93	rename nanogpt_generate to nanogpt to also support train (#109746 ) Differential Revision: [D49522940](https://our.internmc.facebook.com/intern/diff/D49522940) Pull Request resolved: https://github.com/pytorch/pytorch/pull/109746 Approved by: https://github.com/msaroufim, https://github.com/malfet, https://github.com/xuzhao9	2023-09-29 17:36:48 +00:00
angelayi	a565f1bee6	[aotinductor] Skip benchmarks with control flow (#109661 ) Since AOTInductor doesn't support control flow yet, we will skip over tests that are currently failing due to containing control flow in the code. Logs taken from https://hud.pytorch.org/benchmark/compilers?startTime=Tue%2C%2012%20Sep%202023%2022%3A56%3A40%20GMT&stopTime=Tue%2C%2019%20Sep%202023%2022%3A56%3A40%20GMT&granularity=hour&suite=torchbench&mode=inference&dtype=bfloat16&lBranch=main&lCommit=2c1554a0323107d821be3ff13df7833b9f0b960d&rBranch=main&rCommit=47be61e12bd51df27182343d312dc3df485d5559 Errors documented in https://github.com/pytorch/pytorch/issues/105217 Pull Request resolved: https://github.com/pytorch/pytorch/pull/109661 Approved by: https://github.com/desertfire	2023-09-25 18:49:06 +00:00
Mark Saroufim	e2cfbca5ab	Add clip to dynamo runners (#109840 ) CLIP was moved to canary models because we use the multimodal version, which depends on torchtext, which torchbench deprecated (https://github.com/pytorch/benchmark/pull/1837). This issue didn't show up before because we hadn't updated the torchbench pin. Pull Request resolved: https://github.com/pytorch/pytorch/pull/109840 Approved by: https://github.com/cpuhrsch	2023-09-22 20:50:57 +00:00
eellison	d24ba7a634	Add 3d Attn Pattern to match HF Whisper (#109156 ) Adds a 3d pattern that improves perf of HF Whisper from 1.3 -> 4.1. We could be matching more generally on 3d, but i'll leave that for another pr. Thanks to @drisspg for helping me write the pattern. Pull Request resolved: https://github.com/pytorch/pytorch/pull/109156 Approved by: https://github.com/yanboliang ghstack dependencies: #109663, #108894, #108917, #109142	2023-09-20 16:39:31 +00:00
Simon Fan	54c5f474a7	Forward rank and world size info to Torchbench models when using dynamo runner (#108438 ) Adding support to pass rank and world_size to torchbench model, via its extra_args parameter: https://github.com/pytorch/benchmark/blob/main/torchbenchmark/util/model.py#L83C80-L83C90 This is used for models which distribute over multiple GPUs e.g. simple_gpt https://github.com/pytorch/benchmark/pull/1867 Also add an option to skip multiprocess only gpu models Testing via `python benchmarks/dynamo/torchbench.py -d cuda --output=benchmark_logs/performance.csv --inference --performance --timing --print-memory --multiprocess --only simple_gpt` Pull Request resolved: https://github.com/pytorch/pytorch/pull/108438 Approved by: https://github.com/Chillee	2023-09-14 21:01:20 +00:00
drisspg	ad90ab31f2	Flash Attention v2 (#105602 ) # Summary ## PR Dependencies I don't use ghstack :( this is a PR where it would have been helpful. That being said I am going to peel off some PRs to make reviewing this easier: - [x] Separate build flags for Flash and MemEff: #107985 ### Description This pull request updates the version of _scaled_dot_product_flash_attention from version 1 to version 2. The changes are based on the flash attention code originally authored by @tridao ### Changes Made The majority of the changes in this pull request involve: - Copying over the flash_attention sources. - Updating header files. - Removing padding and slicing code from within the flash_attention kernel and relocating it to the composite implicit region of the SDPA. This was needed to make the kernel functional and appease autograd. - Introducing a simple kernel generator to generate different instantiations of the forward and backward flash templates. - Adding conditional compilation (ifdef) to prevent building when nvcc is invoked with gencode < sm80. - Introducing a separate dependent option for mem_eff_attention, as flash_attention v2 lacks support for Windows and cannot be built for sm50 generation codes. - Modifying build.sh to reduce parallelization on sm86 runners and to lower the maximum parallelization on the manywheel builds. This adjustment was made to address out-of-memory issues during the compilation of FlashAttentionV2 sources. - Adding/Updating tests. ### Notes for Reviewers This is not a fun review, and I apologize in advance. Most of the files-changed are in the flash_attn/ folder. The only files of interest here IMO: - aten/src/ATen/native/transformers/cuda/flash_attn/flash_api.cpp - aten/src/ATen/native/transformers/cuda/flash_attn/kernels/generate_kernels.py (this has been incorporated upstream to flash-attention github) There are a number of files all related to avoiding OOMs in CI/CD. These are typically shell scripts. ### Follow up items - Include the updates from `e07aa036db` and `9e5e8bc91e` \| https://github.com/pytorch/pytorch/issues/108108 ### Work Items - [x] I don't think Windows will be supported for 3.1.0 - Need to update cmake - [x] Let multi_query/attention pass through and test \| UPDATE: I have the fast path implemented here: https://github.com/pytorch/pytorch/pull/106730 but since this will require changes to semantics of math to call repeat_interleave, I think this should be done as a followup. - [x] Had to drop cutlass back to 3.0.0 to get it to compile. Need to figure out how to upgrade to 3.1.0 and later. Spoke with Tri and he is going to be taking a look. Note: compiling with clang currently errors for the cute headers. - [x] Update test to exercise above codepath - [x] Still need to disable on seq_len % 128 != 0 for backward (Tri beat me to it `a4f148b6ab`) - [x] Add determinism warning to BWD, Tri got to this one as well: 1c41d2b - [x] Update dispatcher to universally prefer FlashV2 - [x] Update tests to exercise new head_dims - [x] Move the head_dim padding from kernel to top level composite implicit function in order to make it purely functional - [x] Create template generator script - [x] Initial cmake support for building kernels/ folder - [x] Replay CudaGraph changes ### Results #### Forward only The TFlops reported here are on an a100 that is underclocked. ![flashv2_tflops_vs_seq_len](https://github.com/pytorch/pytorch/assets/32754868/152de46d-8fa6-42f0-9a9c-ef1eb7ae29e7) #### Forward+Backward Ran a sweep and for large compute bound sizes we do see a ~2x performance increase for forw+back. <img width="1684" alt="Screenshot 2023-07-20 at 3 47 47 PM" src="https://github.com/pytorch/pytorch/assets/32754868/fdd26e07-0077-4878-a417-f3a418b6fb3b"> Pull Request resolved: https://github.com/pytorch/pytorch/pull/105602 Approved by: https://github.com/huydhn, https://github.com/cpuhrsch	2023-09-13 13:59:05 +00:00
Huy Do	a9c663c269	Revert "Flash Attention v2 (#105602 )" (#108827 ) This reverts commit `add45aea1c`. There are some conflicts on some benchmark csv file https://github.com/pytorch/pytorch/pull/105602#issuecomment-1710988951 so I need to revert this manually. The diff has been reverted internally. Pull Request resolved: https://github.com/pytorch/pytorch/pull/108827 Approved by: https://github.com/kit1980	2023-09-08 07:43:04 +00:00
PyTorch MergeBot	e45b290127	Revert "Revert "Flash Attention v2 (#105602 )" (#108827 )" This reverts commit `24e9bbe22a`. Reverted https://github.com/pytorch/pytorch/pull/108827 on behalf of https://github.com/huydhn due to I need to land this revert properly as there are new failures showing up on trunk ([comment](https://github.com/pytorch/pytorch/pull/108827#issuecomment-1711020924))	2023-09-08 03:25:45 +00:00
Huy Do	24e9bbe22a	Revert "Flash Attention v2 (#105602 )" (#108827 ) This reverts commit `add45aea1c`. There are some conflicts on some benchmark csv file https://github.com/pytorch/pytorch/pull/105602#issuecomment-1710988951 so I need to revert this manually. The diff has been reverted internally. Pull Request resolved: https://github.com/pytorch/pytorch/pull/108827 Approved by: https://github.com/kit1980	2023-09-08 02:54:20 +00:00
eellison	738106c1f7	Torchbench model tolerance changes (#108598 ) Move detectron2_fcos_r_50_fpn to amp. The minifier showed the following snippet as causing the divergence, where inductor has better numerics than eager: ``` import torch def foo(x): return x > .2 inp = torch.tensor([.2002], device="cuda", dtype=torch.bfloat16) print(foo(inp)) print(torch.compile(foo)(inp)) ``` doctr_reco_predictor had very minimal divergence (.002 vs .001 required), bumping tolerance here. Pull Request resolved: https://github.com/pytorch/pytorch/pull/108598 Approved by: https://github.com/shunting314	2023-09-06 16:52:29 +00:00
drisspg	add45aea1c	Flash Attention v2 (#105602 ) # Summary ## PR Dependencies I don't use ghstack :( this is a PR where it would have been helpful. That being said I am going to peel off some PRs to make reviewing this easier: - [x] Separate build flags for Flash and MemEff: #107985 ### Description This pull request updates the version of _scaled_dot_product_flash_attention from version 1 to version 2. The changes are based on the flash attention code originally authored by @tridao ### Changes Made The majority of the changes in this pull request involve: - Copying over the flash_attention sources. - Updating header files. - Removing padding and slicing code from within the flash_attention kernel and relocating it to the composite implicit region of the SDPA. This was needed to make the kernel functional and appease autograd. - Introducing a simple kernel generator to generate different instantiations of the forward and backward flash templates. - Adding conditional compilation (ifdef) to prevent building when nvcc is invoked with gencode < sm80. - Introducing a separate dependent option for mem_eff_attention, as flash_attention v2 lacks support for Windows and cannot be built for sm50 generation codes. - Modifying build.sh to reduce parallelization on sm86 runners and to lower the maximum parallelization on the manywheel builds. This adjustment was made to address out-of-memory issues during the compilation of FlashAttentionV2 sources. - Adding/Updating tests. ### Notes for Reviewers This is not a fun review, and I apologize in advance. Most of the files-changed are in the flash_attn/ folder. The only files of interest here IMO: - aten/src/ATen/native/transformers/cuda/flash_attn/flash_api.cpp - aten/src/ATen/native/transformers/cuda/flash_attn/kernels/generate_kernels.py (this has been incorporated upstream to flash-attention github) There are a number of files all related to avoiding OOMs in CI/CD. These are typically shell scripts. ### Follow up items - Include the updates from `e07aa036db` and `9e5e8bc91e` \| https://github.com/pytorch/pytorch/issues/108108 ### Work Items - [x] I don't think Windows will be supported for 3.1.0 - Need to update cmake - [x] Let multi_query/attention pass through and test \| UPDATE: I have the fast path implemented here: https://github.com/pytorch/pytorch/pull/106730 but since this will require changes to semantics of math to call repeat_interleave, I think this should be done as a followup. - [x] Had to drop cutlass back to 3.0.0 to get it to compile. Need to figure out how to upgrade to 3.1.0 and later. Spoke with Tri and he is going to be taking a look. Note: compiling with clang currently errors for the cute headers. - [x] Update test to exercise above codepath - [x] Still need to disable on seq_len % 128 != 0 for backward (Tri beat me to it `a4f148b6ab`) - [x] Add determinism warning to BWD, Tri got to this one as well: 1c41d2b - [x] Update dispatcher to universally prefer FlashV2 - [x] Update tests to exercise new head_dims - [x] Move the head_dim padding from kernel to top level composite implicit function in order to make it purely functional - [x] Create template generator script - [x] Initial cmake support for building kernels/ folder - [x] Replay CudaGraph changes ### Results #### Forward only The TFlops reported here are on an a100 that is underclocked. ![flashv2_tflops_vs_seq_len](https://github.com/pytorch/pytorch/assets/32754868/152de46d-8fa6-42f0-9a9c-ef1eb7ae29e7) #### Forward+Backward Ran a sweep and for large compute bound sizes we do see a ~2x performance increase for forw+back. <img width="1684" alt="Screenshot 2023-07-20 at 3 47 47 PM" src="https://github.com/pytorch/pytorch/assets/32754868/fdd26e07-0077-4878-a417-f3a418b6fb3b"> Pull Request resolved: https://github.com/pytorch/pytorch/pull/105602 Approved by: https://github.com/huydhn, https://github.com/cpuhrsch	2023-09-01 22:14:44 +00:00
Elias Ellison	e18f512b81	Update accuracy checking for nan, floats (#108202 ) Fixes inference accuracy for `doctr_reco_predictor` and `pyhpc_turbulent_kinetic_energy`. For the `same(float, float)` comparison we weren't going through the more rigorous tensor comparison path, which takes into account the fp64 base results. Also return True when the fp64 base results are not well formed (nan). I debugged these models and the sources of divergence were innocuous: `doctr_reco_predictor` - can be fixed by turning off layout optimization, decomp for batch norm. `pyhpc_turbulent_kinetic_energy` - divergence caused because the fused kernel keeps precision in fp32 instead of casting back and forth from/to fp32/bf16. The fused kernel has better precision anyway. Pull Request resolved: https://github.com/pytorch/pytorch/pull/108202 Approved by: https://github.com/jansel	2023-09-01 02:54:01 +00:00
Elias Ellison	63eee52ba7	Add Drq to BF16 Higher Tolerance (#108368 ) This passes for me on aws gpu but not devgpu, and was already in the `REQUIRE_HIGHER_FP16_TOLERANCE` set. Pull Request resolved: https://github.com/pytorch/pytorch/pull/108368 Approved by: https://github.com/shunting314	2023-09-01 00:29:27 +00:00
Shunting Zhang	eb8659fe81	pass inference accuracy check for detectron2_fcos_r_50_fpn (#108328 ) We need a higher tolerance to pass the inference accuracy check for detectron2_fcos_r_50_fpn . Command: ``` python benchmarks/dynamo/torchbench.py --backend inductor --bfloat16 --accuracy --only detectron2_fcos_r_50_fpn --disable-cudagraphs --inference ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/108328 Approved by: https://github.com/jansel	2023-08-31 20:21:20 +00:00
Edward Z. Yang	5b04e9b6ce	Install torchrec/fbgemm from source in CI (#106808 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/106808 Approved by: https://github.com/malfet, https://github.com/xuzhao9	2023-08-12 02:08:44 +00:00
Mark Saroufim	1b32ac3cab	Update torchbench.txt (#106761 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/106761 Approved by: https://github.com/malfet	2023-08-09 19:01:21 +00:00
Edward Z. Yang	c379d6283a	Don't suppress ModuleNotFoundError if the failure is for an unrelated module (#106807 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/106807 Approved by: https://github.com/williamwen42, https://github.com/voznesenskym	2023-08-09 01:54:49 +00:00
Mark Saroufim	90c264c276	sd flaky on cpu skip (#106726 ) waiting for update expected script Pull Request resolved: https://github.com/pytorch/pytorch/pull/106726 Approved by: https://github.com/malfet	2023-08-08 02:44:47 +00:00
Elias Ellison	578969ca61	skip maml (#106471 ) This one benchmark distorts benchmarks because it is so low (.0007, the equivalent of a 1400x speedup). It also has been flakey, which has produced a lot of noise. Disabling. Pull Request resolved: https://github.com/pytorch/pytorch/pull/106471 Approved by: https://github.com/anijain2305	2023-08-04 22:14:09 +00:00
Howard Huang	236eda4d51	remove jit from torchbench (#106071 ) Need to remove jit arguments after changes in https://github.com/pytorch/benchmark/pull/1787 Also curious: is there a procedure for updating the torchbench version in PyTorch CI? Pull Request resolved: https://github.com/pytorch/pytorch/pull/106071 Approved by: https://github.com/xuzhao9, https://github.com/msaroufim, https://github.com/malfet, https://github.com/lezcano	2023-08-03 21:04:43 +00:00
Mark Saroufim	6268ab2c2d	torchbench pin upd: hf auth token, clip, whisper, llamav2, sd (#106009 ) Includes stable diffusion, whisper, llama7b and clip. To get this to work I had to pass the hf auth token to all CI jobs, since GitHub does not pass secrets from parent to child automatically. There's a likelihood HF will rate limit us; in that case please revert this PR and I'll work on adding a cache next - cc @voznesenskym @penguinwu @anijain2305 @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @aakhundov @malfet Something upstream changed in torchbench too: `hf_Bert` and `hf_Bert_large` are now both failing on what looks like a dynamic shape error, which I'm not sure how to debug yet. Adding a skip felt a bit gross, but I did so for now since others are building on top of this work @ezyang `llamav2_7b_16h` cannot pass the accuracy checks because it OOMs on deepcloning extra inputs; this seems to mean it does not need to show up in the expected numbers csv. Will figure this out when we update the pin with https://github.com/pytorch/benchmark/pull/1803 cc @H-Huang @xuzhao9 @cpuhrsch Pull Request resolved: https://github.com/pytorch/pytorch/pull/106009 Approved by: https://github.com/malfet	2023-08-03 16:28:40 +00:00
Bin Bao	28d42e66e4	[CI] Add DALLE2_pytorch to FORCE_AMP_FOR_FP16_BF16_MODELS (#104283 ) Summary: DALLE2_pytorch inference does not support bfloat16, fallback to use AMP. Pull Request resolved: https://github.com/pytorch/pytorch/pull/104283 Approved by: https://github.com/eellison	2023-06-28 02:37:15 +00:00
Bin Bao	a2988c9e6a	[CI] Switch inference accuracy and performance tests to bfloat16 (#103535 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/103535 Approved by: https://github.com/eellison	2023-06-17 00:24:37 +00:00
Edward Z. Yang	bc6ec97e02	Switch dynamic_shapes to True by default (#103597 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/103597 Approved by: https://github.com/voznesenskym	2023-06-15 15:16:20 +00:00
Animesh Jain	d6da649a1b	[benchmark] hf_T5_base - torchbench original batchsize too large (#103442 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/103442 Approved by: https://github.com/desertfire	2023-06-15 01:06:40 +00:00
Animesh Jain	16c2090b2d	[benchmark][compile] Limit number of bounding boxes to 5 (#103413 ) Depends on https://github.com/pytorch/benchmark/pull/1729 Pull Request resolved: https://github.com/pytorch/pytorch/pull/103413 Approved by: https://github.com/ezyang	2023-06-15 01:06:40 +00:00
Animesh Jain	428bff842d	[benchmarks] Torchbench llama is not suitable for training (#103094 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/103094 Approved by: https://github.com/eellison, https://github.com/desertfire	2023-06-07 01:33:07 +00:00
Animesh Jain	33a49eeae7	[benchmark] Flag to switch on activation checkpointing for HF models (#102557 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/102557 Approved by: https://github.com/ngimel, https://github.com/Chillee	2023-05-30 23:46:14 +00:00
Edward Z. Yang	22ca1a1124	Partially fix shape mismatch in vision_maskrcnn (#101477 ) The bulk of the heavy lifting is happening in https://github.com/pytorch/vision/pull/7592 Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/101477 Approved by: https://github.com/voznesenskym	2023-05-21 05:20:08 +00:00
Edward Z. Yang	41468833fb	vision_maskrcnn is now deterministic (#101116 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/101116 Approved by: https://github.com/ngimel	2023-05-16 21:32:17 +00:00
Edward Z. Yang	f48718f749	Update torchbench pin (#101365 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/101365 Approved by: https://github.com/albanD, https://github.com/awgu	2023-05-15 16:52:31 +00:00
Edward Z. Yang	fcf2fb273c	Make missing model import error marginally better (#101221 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/101221 Approved by: https://github.com/albanD, https://github.com/anijain2305	2023-05-14 19:57:01 +00:00
Edward Z. Yang	41a4e22015	Update torchbench pin (#101071 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/101071 Approved by: https://github.com/malfet	2023-05-11 18:09:40 +00:00
Edward Z. Yang	ad070b6dfa	Check canary_models for models too in torchbench.py (#101081 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/101081 Approved by: https://github.com/desertfire	2023-05-11 13:23:17 +00:00
Edward Z. Yang	d25c93f919	Remove speech_transformer workaround, torchbench handles it correctly now (#100558 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/100558 Approved by: https://github.com/albanD	2023-05-04 01:14:24 +00:00
Yanbo Liang	896eb1db26	[Dynamo] Skip TB Background_Matting model eager accuracy check because it is non-deterministic (#100513 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/100513 Approved by: https://github.com/anijain2305	2023-05-03 07:06:50 +00:00
Yanbo Liang	3009c42e7d	[CI Testing] Re-enable timm_efficientdet training (#99787 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/99787 Approved by: https://github.com/desertfire	2023-04-24 20:05:15 +00:00
Edward Z. Yang	fc8fa6c356	Require at least one tensor to be marked dynamic with --dynamic-batch-only (#99620 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/99620 Approved by: https://github.com/voznesenskym	2023-04-21 00:17:08 +00:00
Will Constable	9ac2b041c9	Make opacus xfail instead of skip (#99380 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/99380 Approved by: https://github.com/desertfire, https://github.com/anijain2305	2023-04-19 21:09:06 +00:00
Huy Do	5d395769a6	Skip vision_maskrcnn after #98923 (#99394 ) This is failing in trunk as documented in https://github.com/pytorch/pytorch/issues/99438 Pull Request resolved: https://github.com/pytorch/pytorch/pull/99394 Approved by: https://github.com/desertfire	2023-04-19 17:07:07 +00:00
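
Commit 6500ccebd7 in the listing above describes how `--amp` chooses its dtype: float16 by default on CUDA, bfloat16 by default on CPU, with a user-supplied `--amp_dtype` taking priority. A minimal sketch of that precedence rule follows. The helper name `resolve_amp_dtype`, the lookup table, and the use of plain dtype-name strings are assumptions made for illustration; they are not taken from benchmarks/dynamo/torchbench.py.

```python
from typing import Optional

# Per-device defaults as stated in the commit message for #114088.
DEFAULT_AMP_DTYPE = {"cuda": "float16", "cpu": "bfloat16"}


def resolve_amp_dtype(device: str, amp_dtype: Optional[str] = None) -> str:
    """Hypothetical helper: return the autocast dtype name for `device`."""
    if amp_dtype is not None:
        # An explicit --amp_dtype always wins over the per-device default.
        return amp_dtype
    # Raises KeyError for devices the commit message does not mention.
    return DEFAULT_AMP_DTYPE[device]
```

Keeping the defaults in one table means a new device only needs one new entry, while the override rule stays in a single branch.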


89 Commits
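
Commit c379d6283a in the listing above stops suppressing `ModuleNotFoundError` when the missing module is unrelated to the one being looked up. One way to express that idea is to inspect the exception's `name` attribute, as sketched below. The function `import_first_available` and its candidate-list interface are hypothetical and only illustrate the pattern; the actual change in torchbench.py may be structured differently.

```python
import importlib
from types import ModuleType
from typing import Optional, Sequence


def import_first_available(candidates: Sequence[str]) -> Optional[ModuleType]:
    """Hypothetical helper: import the first candidate module that exists."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ModuleNotFoundError as e:
            # Expected case: the candidate itself, or one of its parent
            # packages, does not exist, so move on to the next candidate.
            missing_is_candidate = e.name == name or name.startswith(f"{e.name}.")
            if not missing_is_candidate:
                # The candidate exists but one of its own dependencies is
                # missing; surface that error instead of hiding it.
                raise
    return None
```

With this check, a model package that fails to import because of a missing third-party dependency reports that dependency by name, rather than looking as if the model were absent.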