pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
xndcn	bebb8b7c1e	[inductor] use native fetch_add function for trivial types (#101931 ) floating-point is supported by std::atomic::fetch_add since C++20. However, this code path is not activated yet because cpp_flags in codecache.py is hard-coded to "-std=c++17" Pull Request resolved: https://github.com/pytorch/pytorch/pull/101931 Approved by: https://github.com/jgong5, https://github.com/EikanWang, https://github.com/jansel	2023-06-01 03:47:56 +00:00
Animesh Jain	65631d4515	[benchmarks] Use train mode for accuracy checks for HF models (#102578 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/102578 Approved by: https://github.com/desertfire	2023-05-31 19:47:18 +00:00
Bin Bao	47b884a74c	[inductor] Revert a CI remedy for Triton compilation error (#102541 ) Summary: revert https://github.com/pytorch/pytorch/pull/91634 Pull Request resolved: https://github.com/pytorch/pytorch/pull/102541 Approved by: https://github.com/ngimel	2023-05-31 13:13:51 +00:00
Animesh Jain	33a49eeae7	[benchmark] Flag to switch on activation checkpointing for HF models (#102557 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/102557 Approved by: https://github.com/ngimel, https://github.com/Chillee	2023-05-30 23:46:14 +00:00
Yanbo Liang	9ff1932d2b	[Dynamo] Save global autocast state to restore on graph break (#102415 ) Fixes #102414 Pull Request resolved: https://github.com/pytorch/pytorch/pull/102415 Approved by: https://github.com/yf225	2023-05-30 23:03:21 +00:00
Horace He	e71ab21422	update triton pin (#101919 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/101919 Approved by: https://github.com/ngimel	2023-05-30 17:16:05 +00:00
Animesh Jain	040d2cc969	[dynamo] Some torchrec_dlrm related fixes (#101953 ) Issue 1 of https://github.com/pytorch/pytorch/issues/101918 Pull Request resolved: https://github.com/pytorch/pytorch/pull/101953 Approved by: https://github.com/jansel	2023-05-28 17:56:08 +00:00
William Wen	3c77310752	fix benchmarks/dynamo/runner.py (#102311 ) Benchmark performance csv's can now contain `infra_error` strings, leading to failed parses. Fix by converting strings in data to 0. Pull Request resolved: https://github.com/pytorch/pytorch/pull/102311 Approved by: https://github.com/yanboliang	2023-05-25 22:42:03 +00:00
Bin Bao	ee33bae5c7	Fix an issue where checking sameness throw an exception (#102279 ) Summary: currently the exception is caught by outside and marked as infra_error Pull Request resolved: https://github.com/pytorch/pytorch/pull/102279 Approved by: https://github.com/anijain2305	2023-05-25 19:49:23 +00:00
Jongsoo Park	b91eb97d34	[transformer benchmark] relax tolerance in sdp.py (#101965 ) Summary: Otherwise we get ``` Traceback (most recent call last): File "<string>", line 49, in <module> File "<string>", line 47, in __run File "/usr/local/fbcode/platform010/lib/python3.8/runpy.py", line 194, in _run_module_as_main return _run_code(code, main_globals, None, File "/usr/local/fbcode/platform010/lib/python3.8/runpy.py", line 87, in _run_code exec(code, run_globals) File "/data/users/jongsoo/fbsource/buck-out/v2/gen/fbcode/ef4169ac7f95fb74/caffe2/benchmarks/transformer/__sdp__/sdp#link-tree/caffe2/benchmarks/transformer/sdp.py", line 346, in <module> main(save_path) File "/data/users/jongsoo/fbsource/buck-out/v2/gen/fbcode/ef4169ac7f95fb74/caffe2/benchmarks/transformer/__sdp__/sdp#link-tree/caffe2/benchmarks/transformer/sdp.py", line 328, in main experiment = run_single_experiment(experiment_config) File "/data/users/jongsoo/fbsource/buck-out/v2/gen/fbcode/ef4169ac7f95fb74/caffe2/benchmarks/transformer/__sdp__/sdp#link-tree/caffe2/benchmarks/transformer/sdp.py", line 229, in run_single_experiment assert_close_tensors(nn_mha_output, composite_mha_output) File "/data/users/jongsoo/fbsource/buck-out/v2/gen/fbcode/ef4169ac7f95fb74/caffe2/benchmarks/transformer/__sdp__/sdp#link-tree/caffe2/benchmarks/transformer/sdp.py", line 196, in assert_close_tensors assert torch.allclose(a, b, atol=1e-3, rtol=1e-3) AssertionError ``` Test Plan: buck run mode/dev-nosan //caffe2/benchmarks/transformer:sdp Differential Revision: D45843836 Pull Request resolved: https://github.com/pytorch/pytorch/pull/101965 Approved by: https://github.com/drisspg	2023-05-23 06:54:08 +00:00
Jason Ansel	5ba16011d7	Suppress profiler spam in dynamo benchmarks (#101942 ) Makes this stuff go away: ``` STAGE:2023-05-20 20:49:34 63580:63580 ActivityProfilerController.cpp:311] Completed Stage: Warm Up STAGE:2023-05-20 20:49:34 63580:63580 ActivityProfilerController.cpp:317] Completed Stage: Collection STAGE:2023-05-20 20:49:34 63580:63580 ActivityProfilerController.cpp:321] Completed Stage: Post Processing ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/101942 Approved by: https://github.com/shunting314, https://github.com/desertfire	2023-05-22 18:32:31 +00:00
Edward Z. Yang	22ca1a1124	Partially fix shape mismatch in vision_maskrcnn (#101477 ) The bulk of the heavy lifting is happening in https://github.com/pytorch/vision/pull/7592 Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/101477 Approved by: https://github.com/voznesenskym	2023-05-21 05:20:08 +00:00
PyTorch MergeBot	7f3fed125e	Revert "separate out dynamo .requires_grad and .is_grad_enabled guards (#100570 )" This reverts commit `1fabee399d`. Reverted https://github.com/pytorch/pytorch/pull/100570 on behalf of https://github.com/PaliC due to breaking inductor tests along with #101219 ([comment](https://github.com/pytorch/pytorch/pull/100570#issuecomment-1555271267))	2023-05-19 21:29:09 +00:00
Elias Ellison	e5e451a9db	Update batch size for a couple models (#101837 ) The memory compression for these models is at parity, but because we interleave timings between torch.compile and eager run memory is duplicated between between eager and cudagraphs pool and causes OOM. Pull Request resolved: https://github.com/pytorch/pytorch/pull/101837 Approved by: https://github.com/anijain2305	2023-05-19 19:09:59 +00:00
Brian Hirsh	1fabee399d	separate out dynamo .requires_grad and .is_grad_enabled guards (#100570 ) Fixes https://github.com/pytorch/pytorch/issues/100977 This will hopefully fix this error (from [issue](https://github.com/pytorch/pytorch/issues/99616)) This PR fixes an internal model: we were running an inductor inference graph, but `torch.is_grad_enabled()` was True, causing us to error inside of the inference graph when we encountered an out= operator. I haven't been able to create a smaller repro - before landing this, I want to create a smaller repro to convince myself of why we need to separate out these guards. Pull Request resolved: https://github.com/pytorch/pytorch/pull/100570 Approved by: https://github.com/ezyang	2023-05-19 16:14:56 +00:00
Michael Voznesensky	4c1bc91f42	Support autograd.Function w/ grad (#99483 ) This PR adds support for tracing autograd.Function with grad. A few important bullet points outlining our approach: 1) Our goal is to verify soundness in order to add a call_function to the autograd.Function's `apply` to the graph. 2) We achieve (1) by either verifying soundness or rejecting soundness, by ensuring that both forward and backward of the autograd.Function are sound. 3) For the forward, if we verify soundness, we install its guards into the graph. 4) For the backward, if we verify soundness, we throw it out. However, backwards soundness verification is more onerous, and has a config driven set of banned attrs and methods for tensors. 1-4 above are achieved by turning the forward and backward into UserDefinedFunctionVariables, and inlining through them, relying on dynamo's soundness detection. If we graph break in these, we raise and treat them as unsound. As noted above, backwards is stricter yet. For the tracing, the safety comes from dynamo's HigherOrderOperator system. That system ensures that not only do we trace soundly, but that no new variables are lifted into inputs during the tracing, and that the forward and backwards are entirely self contained. Whenever we reject a function as unsound, we restore back, as usual. Due to some limitations in the lifting logic, we have an escape hatch we implemented for tensors that are known in forward, but cross into backwards through save_tensors (save) /saved_tensors (load). We escape hatch here to avoid having the known saved tensors coming from forward end up being accidentally treated as lifted variables (and rejected). This is sound, but a little hacky feeling. Additionally, due to some limitations in fx node removal, combined with how we produce subgraphs for the traces installed from HigherOrderOperators, we had to improve our node removal logic. In the event of a restore, we remove the old nodes from the graph, as usual in dynamo. However, because the references to these nodes may exist in subgraphs, we traverse any nodes users and remove them first if and only if they are in another graph. This is always sound, because removal should only be downstream of restoration at this point. Pull Request resolved: https://github.com/pytorch/pytorch/pull/99483 Approved by: https://github.com/zou3519	2023-05-19 01:26:21 +00:00
drisspg	6f13d6892a	Add meta support for multinomial (#101324 ) # Summary Found this when trying to compile the text gen loop of nanogpt here: `b33289942b/torchbenchmark/models/nanogpt_generate/model.py (L322)` Pull Request resolved: https://github.com/pytorch/pytorch/pull/101324 Approved by: https://github.com/ngimel	2023-05-19 00:04:26 +00:00
Animesh Jain	794cc3952e	adding moco to CI (#101098 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/101098 Approved by: https://github.com/desertfire	2023-05-18 10:01:49 +00:00
chuanqiw	b315c9b5ab	[CI] Enlarge memory for OOM models in inductor cpu HF accuracy test (#101395 ) Change the Inductor CPU HF accuracy test node from `linux.4xlarge` (32GB) to `linux.24xlarge` (192GB) to enlarge the node memory. Also add 3 HF models back to CI test. Fixes #101390 Pull Request resolved: https://github.com/pytorch/pytorch/pull/101395 Approved by: https://github.com/EikanWang, https://github.com/desertfire, https://github.com/huydhn	2023-05-18 09:23:30 +00:00
Animesh Jain	dafa009c3c	[dynamo][moco] Save global torch state to restore on graph break (#101201 ) This is relevant to https://github.com/pytorch/pytorch/pull/100570 as well. Pull Request resolved: https://github.com/pytorch/pytorch/pull/101201 Approved by: https://github.com/voznesenskym	2023-05-18 01:03:15 +00:00
Peter Bell	ef512db0f8	[inductor] Constant and index_expr propagation pass (#101077 ) This pass does a limited form of constant propagation, as well as propagation of sympy indexing expressions. For example, say you have the function: ```python def flip(x): i = torch.arange(x.size(0) - 1, -1, -1, device=x.device) return x[i] ``` On current main this results in indirect indexing: ```python class buf0_loop_body: var_ranges = {z0: 4, z1: 3} index0 = 3 - z0 index1 = 3indirect0 + z1 index2 = 3z0 + z1 def body(self, ops): get_index = self.get_index('index0') index_expr = ops.index_expr(get_index, torch.int64) set_indirect0 = self.set_indirect0(index_expr) get_index_1 = self.get_index('index1') load = ops.load('arg0_1', get_index_1) get_index_2 = self.get_index('index2') store = ops.store('buf0', get_index_2, load, None) return store ``` With this PR the indexing is propagated through the computation and into direct indexing: ```python class buf0_loop_body: var_ranges = {z0: 4, z1: 3} index0 = -3z0 + z1 + 9 index1 = 3z0 + z1 def body(self, ops): get_index = self.get_index('index0') load = ops.load('arg0_1', get_index) get_index_1 = self.get_index('index1') store = ops.store('buf0', get_index_1, load, None) return store ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/101077 Approved by: https://github.com/lezcano, https://github.com/ngimel	2023-05-17 23:36:24 +00:00
Jongsoo Park	ebae77e891	[transformer benchmark] sort by cuda time (#101349 ) Summary: The benchmark is running on CUDA Test Plan: buck run mode/opt //caffe2/benchmarks/transformer:sdp_backwards Differential Revision: D45843837 Pull Request resolved: https://github.com/pytorch/pytorch/pull/101349 Approved by: https://github.com/drisspg	2023-05-17 15:38:56 +00:00
Jason Ansel	403ce1a1c9	Fix benchmark model names printouts with tqdm (#101627 ) With the TQDM changes in #100969 -- the models names ended up getting hidden from the benchmark printouts. We would print the model name with no newline, then tqdm would print a `\r` and overwrite the name of the running model. Pull Request resolved: https://github.com/pytorch/pytorch/pull/101627 Approved by: https://github.com/ezyang	2023-05-17 15:31:11 +00:00
PaliC	e0fc24cdc5	add retries to inductor benchmark suite (#101019 ) This pr accomplishes 1) Enables retries for downloading torchbenchmark and huggingface models in a similar method to how we do it for timm models right now. 2) creates a `_download_model` function for the hugging face and TIMM runners whose output I plan to use to preload the models somewhere if possible (please double check I'll be saving the right thing). Instead of retries, we plan to just add torchbench to a docker image as it is relatively small. <!-- copilot:poem --> ### <samp>🤖 Generated by Copilot at 3361a4c</samp> > _We're the brave and bold coders of the `common.py` module_ > _We've made a handy function for downloading models_ > _We've shared it with our mates in the other runners_ > _So pull and push and try again, we'll get them all in time_ Pull Request resolved: https://github.com/pytorch/pytorch/pull/101019 Approved by: https://github.com/huydhn, https://github.com/desertfire	2023-05-16 21:41:50 +00:00
Edward Z. Yang	41468833fb	vision_maskrcnn is now deterministic (#101116 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/101116 Approved by: https://github.com/ngimel	2023-05-16 21:32:17 +00:00
Edward Z. Yang	23d1cc3811	Update llama to failing (#101565 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/101565 Approved by: https://github.com/janeyx99	2023-05-16 14:12:26 +00:00
Yanbo Liang	e4eaf33346	Re-enable detectron2_maskrcnn on CI (#100791 ) #99665 has been fixed, we can re-enable these models on CI. Pull Request resolved: https://github.com/pytorch/pytorch/pull/100791 Approved by: https://github.com/huydhn	2023-05-16 04:25:58 +00:00
Edward Z. Yang	f48718f749	Update torchbench pin (#101365 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/101365 Approved by: https://github.com/albanD, https://github.com/awgu	2023-05-15 16:52:31 +00:00
Edward Z. Yang	fcf2fb273c	Make missing model import error marginally better (#101221 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/101221 Approved by: https://github.com/albanD, https://github.com/anijain2305	2023-05-14 19:57:01 +00:00
Jongsoo Park	8876c0b282	[transformer benchmark] fix in sdp_bwd for scaled_dot_product_attention return type (#101341 ) Summary: Otherwise we get ``` Traceback (most recent call last): File "<string>", line 49, in <module> File "<string>", line 47, in __run File "/usr/local/fbcode/platform010/lib/python3.8/runpy.py", line 194, in _run_module_as_main return _run_code(code, main_globals, None, File "/usr/local/fbcode/platform010/lib/python3.8/runpy.py", line 87, in _run_code exec(code, run_globals) File "/data/users/jongsoo/fbsource/buck-out/v2/gen/fbcode/ef4169ac7f95fb74/caffe2/benchmarks/transformer/__sdp_backwards__/sdp_backwards#link-tree/caffe2/benchmarks/transformer/sdp_backwards.py", line 188, in <module> main() File "/data/users/jongsoo/fbsource/buck-out/v2/gen/fbcode/ef4169ac7f95fb74/caffe2/benchmarks/transformer/__sdp_backwards__/sdp_backwards#link-tree/caffe2/benchmarks/transformer/sdp_backwards.py", line 184, in main run_timing(min_run_time, batch_size, embed_dim, num_heads, max_seq_len, dtype) File "/data/users/jongsoo/fbsource/buck-out/v2/gen/fbcode/ef4169ac7f95fb74/caffe2/benchmarks/transformer/__sdp_backwards__/sdp_backwards#link-tree/caffe2/benchmarks/transformer/sdp_backwards.py", line 105, in run_timing rand_fused_upward = cpt(x, x, x, mask).clone().detach() File "/data/users/jongsoo/fbsource/buck-out/v2/gen/fbcode/ef4169ac7f95fb74/caffe2/benchmarks/transformer/__sdp_backwards__/sdp_backwards#link-tree/torch/nn/modules/module.py", line 1502, in _wrapped_call_impl return self._call_impl(args, kwargs) File "/data/users/jongsoo/fbsource/buck-out/v2/gen/fbcode/ef4169ac7f95fb74/caffe2/benchmarks/transformer/__sdp_backwards__/sdp_backwards#link-tree/torch/nn/modules/module.py", line 1511, in _call_impl return forward_call(args, **kwargs) File "/data/users/jongsoo/fbsource/buck-out/v2/gen/fbcode/ef4169ac7f95fb74/caffe2/benchmarks/transformer/__sdp_backwards__/sdp_backwards#link-tree/caffe2/benchmarks/transformer/sdp_backwards.py", line 39, in forward attn, _ = torch.nn.functional.scaled_dot_product_attention( ValueError: too many values to unpack (expected 2) ``` Test Plan: buck run mode/dev-nosan //caffe2/benchmarks/transformer:sdp_backwards Differential Revision: D45843838 Pull Request resolved: https://github.com/pytorch/pytorch/pull/101341 Approved by: https://github.com/drisspg	2023-05-14 01:34:51 +00:00
Natalia Gimelshein	49578913fb	update timm commit (#100931 ) Fixes #100903 Pull Request resolved: https://github.com/pytorch/pytorch/pull/100931 Approved by: https://github.com/ezyang, https://github.com/malfet	2023-05-12 04:22:08 +00:00
Edward Z. Yang	41a4e22015	Update torchbench pin (#101071 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/101071 Approved by: https://github.com/malfet	2023-05-11 18:09:40 +00:00
Edward Z. Yang	ad070b6dfa	Check canary_models for models too in torchbench.py (#101081 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/101081 Approved by: https://github.com/desertfire	2023-05-11 13:23:17 +00:00
lezcano	8b4e28d65d	Fix microbenchmarks (#101065 ) As per title Pull Request resolved: https://github.com/pytorch/pytorch/pull/101065 Approved by: https://github.com/jansel	2023-05-11 09:14:22 +00:00
Jason Ansel	036a8d6b4a	Remove NullContext() from benchmark runners (#100309 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/100309 Approved by: https://github.com/Skylion007, https://github.com/anijain2305	2023-05-11 06:42:27 +00:00
XiaobingSuper	c84627c2ee	benchmarks: make --amp works for cpu path (#101057 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/101057 Approved by: https://github.com/jgong5, https://github.com/desertfire, https://github.com/jansel	2023-05-11 02:51:38 +00:00
Edward Z. Yang	c658732950	[RFC] Add tqdm to benchmarking script (#100969 ) Here's what it looks like, on a slower running benchmark: https://github.com/pytorch/pytorch/assets/13564/47c4a5bd-e963-45de-a15c-2fd943de0fa4 There's actually quite a bit of dead time, it's possible there are more spots we should add tqdm to. Looking for opinions on utility of this. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/100969 Approved by: https://github.com/Skylion007	2023-05-10 15:39:24 +00:00
Edward Z. Yang	1e89a56a5b	Apply static policy correctly to unspec (#98983 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/98983 Approved by: https://github.com/ezyang	2023-05-10 05:59:12 +00:00
Bin Bao	76cc3ab4f3	[CI] Delete skips from https://github.com/pytorch/pytorch/issues/93847 (#96049 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/96049 Approved by: https://github.com/jansel	2023-05-10 01:27:27 +00:00
Edward Z. Yang	9eab13fc90	Reenable llama benchmark (#100877 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/100877 Approved by: https://github.com/albanD	2023-05-09 01:12:54 +00:00
Natalia Gimelshein	9790f9174a	skip lcnet (#100726 ) Per title Pull Request resolved: https://github.com/pytorch/pytorch/pull/100726 Approved by: https://github.com/voznesenskym	2023-05-05 23:19:42 +00:00
Animesh Jain	3f025c607c	summarize graph breaks (#100696 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/100696 Approved by: https://github.com/yanboliang	2023-05-05 22:27:47 +00:00
Natalia Gimelshein	4ca26d183a	[CI] update hf version for ci (#100666 ) per title Pull Request resolved: https://github.com/pytorch/pytorch/pull/100666 Approved by: https://github.com/malfet	2023-05-05 18:12:53 +00:00
Animesh Jain	8994d9e610	[dynamo] Hide guard_fail_hook behind a flag to improve cache lookup time (+10% DebertaV2) (#100590 ) For TorchDynamo eager backend, DebertaV2 speedup improves from 0.77x to 0.87x. Pull Request resolved: https://github.com/pytorch/pytorch/pull/100590 Approved by: https://github.com/voznesenskym, https://github.com/wconstab	2023-05-04 18:52:21 +00:00
Edward Z. Yang	c58d9642d0	Don't build Triton from source in benchmarks/dynamo/Makefile (#100613 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/100613 Approved by: https://github.com/voznesenskym	2023-05-04 06:13:42 +00:00
Edward Z. Yang	d25c93f919	Remove speech_transformer workaround, torchbench handles it correctly now (#100558 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/100558 Approved by: https://github.com/albanD	2023-05-04 01:14:24 +00:00
Yanbo Liang	896eb1db26	[Dynamo] Skip TB Background_Matting model eager accuracy check because of non deterministic (#100513 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/100513 Approved by: https://github.com/anijain2305	2023-05-03 07:06:50 +00:00
Animesh Jain	0acfe2ce09	[dashboard] higher tolerance for AlbertForQuestionAnswering (#100277 ) @desertfire Pull Request resolved: https://github.com/pytorch/pytorch/pull/100277 Approved by: https://github.com/desertfire	2023-05-02 23:51:08 +00:00
Jason Ansel	fdc853b14c	Add --baseline option to benchmark runners (#100266 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/100266 Approved by: https://github.com/ngimel	2023-05-02 02:35:11 +00:00
Edward Z. Yang	e918fd18e7	Disable densenet121 as it is flaky (#100371 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/100371 Approved by: https://github.com/voznesenskym	2023-05-02 01:49:11 +00:00
Michael Voznesensky	aafc6ce8cc	Produce constant variables in cases where a SymNode is created with a constant (#100144 ) ` AOT_DYNAMIC_SHAPES=1 TORCHDYNAMO_DYNAMIC_SHAPES=1 benchmarks/dynamo/huggingface.py --performance --training --amp --backend eager --disable-cudagraphs --device cuda --only AllenaiLongformerBase --explain` Looks promising! Goes from: Dynamo produced 173 graphs covering 2760 ops with 160 graph breaks (14 unique) To: Dynamo produced 6 graphs covering 2298 ops with 15 graph breaks (7 unique) Pull Request resolved: https://github.com/pytorch/pytorch/pull/100144 Approved by: https://github.com/ezyang	2023-05-01 21:32:11 +00:00
Edward Z. Yang	5d93265cce	Report timeout/infra_error instead of 0.0000 on infra error (#100372 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/100372 Approved by: https://github.com/Skylion007, https://github.com/albanD	2023-05-01 14:56:01 +00:00
PyTorch MergeBot	89c43f4108	Revert "Produce constant variables in cases where a SymNode is created with a constant (#100144 )" This reverts commit `d7bdfd3454`. Reverted https://github.com/pytorch/pytorch/pull/100144 on behalf of https://github.com/ezyang due to ci failure is real ([comment](https://github.com/pytorch/pytorch/pull/100144#issuecomment-1529587039))	2023-05-01 11:10:48 +00:00
Michael Voznesensky	d7bdfd3454	Produce constant variables in cases where a SymNode is created with a constant (#100144 ) ` AOT_DYNAMIC_SHAPES=1 TORCHDYNAMO_DYNAMIC_SHAPES=1 benchmarks/dynamo/huggingface.py --performance --training --amp --backend eager --disable-cudagraphs --device cuda --only AllenaiLongformerBase --explain` Looks promising! Goes from: Dynamo produced 173 graphs covering 2760 ops with 160 graph breaks (14 unique) To: Dynamo produced 6 graphs covering 2298 ops with 15 graph breaks (7 unique) Pull Request resolved: https://github.com/pytorch/pytorch/pull/100144 Approved by: https://github.com/ezyang	2023-04-30 17:13:57 +00:00
Animesh Jain	006785cd46	[dynamo][hf_bigbird] Actually graph break on tensor.unsqueeze_/resize_ (#99986 ) Currently, we return `unimplemented` w/o a graph break on seeing a x.unsqueeze_ when x is input. This essentially means we fall back to the original frame. This PR actually graph breaks so that we can generate the continuation frame for the rest of the function. Instead of graph breaking at LOAD_ATTR, we delay the graph break to the actual CALL_FUNCTION, where its cleaner to graph break. Pull Request resolved: https://github.com/pytorch/pytorch/pull/99986 Approved by: https://github.com/jansel	2023-04-26 18:50:06 +00:00
Aaron Gokaslan	e2a3817dfd	[BE] Enable C419 rule for any all shortcircuiting (#99890 ) Apparently https://github.com/pytorch/pytorch/pull/78142 made torch.JIT allow for simple generator expressions which allows us to enable rules that replace unnecessary list comprehensions with generators in any/all. This was originally part of #99280 but I split it off into this PR so that it can be easily reverted should anything break. Pull Request resolved: https://github.com/pytorch/pytorch/pull/99890 Approved by: https://github.com/justinchuby, https://github.com/kit1980, https://github.com/malfet	2023-04-25 15:02:13 +00:00
Huy Do	9a69634b28	Skip some failing dynamic shape models on periodic (#99895 ) After some recent changes, these tests are failing in periodic trunk. So let's move them to unstable while waiting for the team to root cause the issue https://github.com/pytorch/pytorch/issues/99893. Note that a forward fix can use `ciflow/unstable` to run those unstable jobs to confirm that they are fixed. Pull Request resolved: https://github.com/pytorch/pytorch/pull/99895 Approved by: https://github.com/malfet	2023-04-25 07:05:08 +00:00
Edward Z. Yang	04e8df4dd7	Return full accuracy status for printing, not abbreviated version (#99894 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/99894 Approved by: https://github.com/jansel	2023-04-25 05:17:10 +00:00
Jiong Gong	e5c9a0fcf5	[dynamo] avoid graph break on repeat_interleave.self_int (#99528 ) Address convit_base failure: https://github.com/pytorch/torchdynamo/issues/1886 mentioned in https://github.com/pytorch/pytorch/issues/93777 Also for models like EleutherAI/gpt-j-6B. Pull Request resolved: https://github.com/pytorch/pytorch/pull/99528 Approved by: https://github.com/ezyang	2023-04-25 04:47:39 +00:00
Edward Z. Yang	cd61707167	yolov3 dynamic training accuracy is fixed (#99896 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/99896 Approved by: https://github.com/albanD	2023-04-25 01:15:24 +00:00
Edward Z. Yang	0b545bc667	Stop marking sequence length as dynamic (#99889 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/99889 Approved by: https://github.com/voznesenskym, https://github.com/huydhn	2023-04-25 01:04:16 +00:00
chuanqiw	e9e5ffe83e	Re-enable dynamic shapes test in dynamo benchmark (#99816 ) Set `torch._dynamo.config.assume_static_by_default = False` for dynamic shapes flag enabled Fixes #99815 Pull Request resolved: https://github.com/pytorch/pytorch/pull/99816 Approved by: https://github.com/jgong5, https://github.com/ezyang	2023-04-24 20:34:52 +00:00
Yanbo Liang	3009c42e7d	[CI Testing] Re-enable timm_efficientdet training (#99787 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/99787 Approved by: https://github.com/desertfire	2023-04-24 20:05:15 +00:00
Edward Z. Yang	dc1c0924ec	Properly parenthesize dynamo_dynamic_indices test (#99823 ) I've got the E2E test case which triggered this in https://github.com/pytorch/pytorch/pull/99809 Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/99823 Approved by: https://github.com/voznesenskym	2023-04-23 22:41:34 +00:00
Edward Z. Yang	f602b3a6ae	Preserve mark_dynamic when cloning inputs (#99617 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/99617 Approved by: https://github.com/ngimel, https://github.com/voznesenskym, https://github.com/anijain2305	2023-04-22 19:46:31 +00:00
Natalia Gimelshein	bfbc4e74ab	adjust batch sizes for hf suite (#99691 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/99691 Approved by: https://github.com/yanboliang, https://github.com/anijain2305	2023-04-21 23:57:53 +00:00
Bin Bao	e09f785a72	[CI] Remove inductor skip list for Huggingface (#99375 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/99375 Approved by: https://github.com/anijain2305	2023-04-21 18:13:22 +00:00
Edward Z. Yang	fc8fa6c356	Require at least one tensor to be marked dynamic with --dynamic-batch-only (#99620 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/99620 Approved by: https://github.com/voznesenskym	2023-04-21 00:17:08 +00:00
Huy Do	5315317b7b	Skip some detectron2_maskrcnn models with KeyError _ignore_torch_cuda_oom (#99599 ) These tests are failing in trunk `233cc34d3b` with `KeyError: '_ignore_torch_cuda_oom'` Pull Request resolved: https://github.com/pytorch/pytorch/pull/99599 Approved by: https://github.com/malfet	2023-04-20 18:11:35 +00:00
Shunting Zhang	68bc0fc012	[inductor] a script to benchmark the perf impact from tensor layout (#99583 ) Follow up on Jason's idea of tensor layout tuning. Add a script to show the perf impact of layout to convolution (will add more cases like batch/layer norm, reduction to the scripts). For convolution, a quick test shows using channels last layout, we get 1.4x speedup for convolution: ``` baseline 4.509183883666992 test 3.178528070449829 speedup 1.419x ``` The speedup definitely also depends on input/weight shapes. E.g., change input channel from 3 in the test to 8, we see speedup to be 2.1x The trace shows cudnn calls different kernels when input layout changes to channels last. <img width="997" alt="Screenshot 2023-04-19 at 5 27 54 PM" src="https://user-images.githubusercontent.com/52589240/233228656-4bdcac0a-7633-416a-82e1-17d8dc8ea9a6.png"> Pull Request resolved: https://github.com/pytorch/pytorch/pull/99583 Approved by: https://github.com/jansel	2023-04-20 06:26:10 +00:00
Jason Ansel	3233450d07	Add TorchXLA option to benchmark runner (#99505 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/99505 Approved by: https://github.com/voznesenskym	2023-04-19 22:44:52 +00:00
Will Constable	9ac2b041c9	Make opacus xfail instead of skip (#99380 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/99380 Approved by: https://github.com/desertfire, https://github.com/anijain2305	2023-04-19 21:09:06 +00:00
Huy Do	5d395769a6	Skip vision_maskrcnn after #98923 (#99394 ) This is failing in trunk as documented in https://github.com/pytorch/pytorch/issues/99438 Pull Request resolved: https://github.com/pytorch/pytorch/pull/99394 Approved by: https://github.com/desertfire	2023-04-19 17:07:07 +00:00
Michael Voznesensky	113bd11cf4	Skip levit (#99491 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/99491 Approved by: https://github.com/ezyang	2023-04-19 07:41:42 +00:00
Edward Z. Yang	e60557793f	Make hash update script more robust and run it (#99370 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/99370 Approved by: https://github.com/Chillee, https://github.com/voznesenskym	2023-04-19 02:26:03 +00:00
Bin Bao	46b9377190	[CI] Collect inductor max-autotune performance every Sunday (#99387 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/99387 Approved by: https://github.com/malfet, https://github.com/huydhn	2023-04-18 13:20:13 +00:00
PyTorch MergeBot	ce7c4ba11d	Revert "Mark doctr_det_predictor as broken on master (#99370 )" This reverts commit `b290381e09`. Reverted https://github.com/pytorch/pytorch/pull/99370 on behalf of https://github.com/ezyang due to malfet already directly fixed it	2023-04-18 13:18:10 +00:00
Edward Z. Yang	b290381e09	Mark doctr_det_predictor as broken on master (#99370 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/99370 Approved by: https://github.com/Chillee, https://github.com/voznesenskym	2023-04-18 06:58:47 +00:00
Edward Z. Yang	039faf0dbf	Add invariant that all symbolic shapes must be bound in graph (#99089 ) Previously, we had a problem when partitioning forward-backward dynamic graphs, which is that we could end up with a backward graph that mentions a symbol in an input tensor (e.g., `f32[s0 + s1]`), but without this symbol being otherwise bound elsewhere. When this happens, we have no way of actually deriving the values of `s0` and `s1`. Our fix for this in https://github.com/pytorch/pytorch/pull/93059 was to just retrace the graph, so that s0 + s1 got allocated a new symbol s2 and everything was happy. However, this strategy had other problems, namely (1) we lost all information from the previous ShapeEnv, including guards and (2) we end up allocating a LOT of fresh new symbols in backwards. With this change, we preserve the same ShapeEnv between forward and backwards. How do we do this? We simply require that every symbol which may be present inside tensors, ALSO be a plain SymInt input to the graph. This invariant is enforced by Dynamo. Once we have done this, we can straightforwardly modify the partitioner to preserve these SymInt as saved for backwards, if they are needed in the backwards graph to preserve the invariant as well. This apparently breaks yolov3, but since everything else is OK I'm merging this as obviously good and investigating later. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/99089 Approved by: https://github.com/voznesenskym	2023-04-16 01:48:19 +00:00
Yanbo Liang	15fe5a0798	[Dynamo] Fix benchmark --verbose error (#99224 ) Dynamo benchmark --verbose is broken: ``` Traceback (most recent call last): File "/scratch/ybliang/work/repos/pytorch/benchmarks/dynamo/torchbench.py", line 400, in <module> torchbench_main() File "/scratch/ybliang/work/repos/pytorch/benchmarks/dynamo/torchbench.py", line 396, in torchbench_main main(TorchBenchmarkRunner(), original_dir) File "/scratch/ybliang/work/repos/pytorch/benchmarks/dynamo/common.py", line 1967, in main return maybe_fresh_cache( File "/scratch/ybliang/work/repos/pytorch/benchmarks/dynamo/common.py", line 993, in inner return fn(args, *kwargs) File "/scratch/ybliang/work/repos/pytorch/benchmarks/dynamo/common.py", line 2135, in run torch._dynamo.config.log_level = logging.DEBUG File "/scratch/ybliang/work/repos/pytorch/torch/_dynamo/config_utils.py", line 67, in __setattr__ raise AttributeError(f"{self.__name__}.{name} does not exist") AttributeError: torch._dynamo.config.log_level does not exist ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/99224 Approved by: https://github.com/voznesenskym	2023-04-15 20:18:50 +00:00
Bin Bao	34f681c13b	[CI] Remove inductor skip list for timm_models (#98840 ) Summary: check against the expected csv file instead of skipping tests Pull Request resolved: https://github.com/pytorch/pytorch/pull/98840 Approved by: https://github.com/ezyang	2023-04-15 13:54:41 +00:00
Bin Bao	a595a50653	[CI] Use expected accuracy csv files to check benchmark test status (#98839 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/98839 Approved by: https://github.com/ezyang	2023-04-15 13:54:41 +00:00
Will Constable	6eab5e88c8	Graph-break on allowed modules if they have hooks (#97184 ) Allowed modules are stuck into dynamo's fx graph as call_module nodes, without dynamo doing any tracing of the module. This means during AOT trace time, hooks will fire during tracing when the call_module is executed, but the hooks themselves will disappear after that and not be present in the compiled program. (worse, if they performed any tensor operations, those would get traced so you could end up with part of the hook's functionality). To circumvent this, there are two options for 'allowed modules' with hooks. 1) don't treat them as 'allowed' - trace into them 2) graph-break, so the module is no longer part of the dynamo trace at all (1) will fail for users that opted into allowed modules becuase they know their module has problems being traced by dynamo. (2) causes graph breaks on common modules such as nn.Linear, just because they are marked as 'allowed'. It would help matters if we could differentiate between types of allowed modules (A) allowed to avoid overheads - used for common ops like nn.Linear (B) allowed to avoid dynamo graphbreaks caused by unsupported code Ideally, we'd use method (1) for group (A) and (2) for (B). For now, graph-break on all cases of allowed modules. Pull Request resolved: https://github.com/pytorch/pytorch/pull/97184 Approved by: https://github.com/jansel	2023-04-15 01:46:15 +00:00
lezcano	1e78a2edcc	Make summarize_perf.py work with perf-compare (#99095 ) [perf-compare](https://github.com/pytorch/pytorch/actions/workflows/inductor-perf-compare.yml) has a different structure than that of the nightlies. For these files, the script now generates: ``` # cuda float32 training performance results ## Geometric mean speedup huggingface timm_models torchbench -------- ------------- ------------- ------------ inductor 1.46 1.4 1.17 ## Mean compilation time huggingface timm_models torchbench -------- ------------- ------------- ------------ inductor 57.85 97.63 60.18 ## Peak memory compression ratio huggingface timm_models torchbench -------- ------------- ------------- ------------ inductor 1.06 1.01 0.83 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/99095 Approved by: https://github.com/ezyang	2023-04-14 12:10:54 +00:00
Bin Bao	e5501a967e	[inductor] Support IndexPutFallback in cpp_wrapper (#98972 ) Summary: 1) Make the fallback index_put generate the right cpp code in cpp_wapper 2) Add a --cpp-wrapper option to common.py Pull Request resolved: https://github.com/pytorch/pytorch/pull/98972 Approved by: https://github.com/jgong5, https://github.com/jansel	2023-04-13 15:41:03 +00:00
Will Constable	296822c475	Make update_expected not fail on one missing file (#98982 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/98982 Approved by: https://github.com/voznesenskym	2023-04-13 03:59:20 +00:00
PyTorch MergeBot	629377ea8b	Revert "Replace _dynamo.config with an object instead of module (#96455 )" This reverts commit `420104a886`. Reverted https://github.com/pytorch/pytorch/pull/96455 on behalf of https://github.com/jansel due to BC breaking, was landed prematurely	2023-04-12 15:06:14 +00:00
Han Qi	420104a886	Replace _dynamo.config with an object instead of module (#96455 ) Summary: Replace _dynamo.config with an object instead of module Current usage patterns of setting and reading fields on config will work unchanged. Only changes needed going forward: 1. import torch._dynamo.config will not work. However, just doing import torch._dynamo is sufficient to access dynamo config as torch._dynamo.config. 2. Files inside of _dynamo folder need to access config via from torch._dynamo.config_util import config instead of from torch._dynamo import config. Because _dynamo/__init__.py imports some of the files so it would be circular import. Test Plan: Reviewers: Subscribers: Tasks: Tags: Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/96455 Approved by: https://github.com/williamwen42	2023-04-11 21:23:32 +00:00
Edward Z. Yang	16beb636b8	Generalize summary script to work with more CSV names (#98500 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/98500 Approved by: https://github.com/wconstab	2023-04-11 19:05:18 +00:00
Edward Z. Yang	b8b840be3d	Convert logging f-strings to use % format, part five (#98765 ) This does some annoying but simple cases by hand. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/98765 Approved by: https://github.com/wanchaol	2023-04-11 13:17:59 +00:00
Edward Z. Yang	b09722f540	Convert logging f-strings to use % format, part two (#98700 ) This hits multi-line logging strings Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/98700 Approved by: https://github.com/voznesenskym	2023-04-10 12:19:31 +00:00
Edward Z. Yang	9a8f71f23e	Convert logging f-strings to use % format (#98697 ) Codemod done with https://gist.github.com/ezyang/2e8b0463cdc6be278478495b23ff0530 with assistance from ChatGPT. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/98697 Approved by: https://github.com/voznesenskym	2023-04-10 12:19:31 +00:00
Jason Ansel	f4858fa8ef	Improve dynamo support for autograd.Function (#98158 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/98158 Approved by: https://github.com/yanboliang, https://github.com/anijain2305	2023-04-10 00:33:51 +00:00
Bin Bao	5210d7c423	[CI] Mark vision_maskrcnn as NONDETERMINISTIC (#98570 ) Summary: vision_maskrcnn fails eager checking, so mark it as NONDETERMINISTIC to reduce noise on the dashboard. Pull Request resolved: https://github.com/pytorch/pytorch/pull/98570 Approved by: https://github.com/eellison, https://github.com/huydhn	2023-04-07 19:33:20 +00:00
PyTorch MergeBot	e394f6db5a	Revert "Improve dynamo support for autograd.Function (#98158 )" This reverts commit `4716fa2411`. Reverted https://github.com/pytorch/pytorch/pull/98158 on behalf of https://github.com/huydhn due to Sorry for reverting your PR, but it seems to breaks MacOS trunk job `4716fa2411`. The signal was missing from the PR because we disabled MacOS job yesterday due to https://github.com/pytorch/pytorch/issues/98362	2023-04-06 18:15:02 +00:00
William Wen	bb33173962	Add max-autotune compilers to benchmarks (#98464 ) Title Pull Request resolved: https://github.com/pytorch/pytorch/pull/98464 Approved by: https://github.com/shunting314	2023-04-06 17:13:02 +00:00
Jason Ansel	4716fa2411	Improve dynamo support for autograd.Function (#98158 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/98158 Approved by: https://github.com/yanboliang, https://github.com/anijain2305	2023-04-06 16:44:37 +00:00
Edward Z. Yang	bdb79a8f52	Turn off divisible_by_16 for dynamic shapes; support ablation (#98471 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/98471 Approved by: https://github.com/ngimel, https://github.com/voznesenskym	2023-04-06 12:57:07 +00:00
Bin Bao	007587aa00	[CI] Update update_expected.py to make it generate a combined csv file (#98407 ) Summary: make update_expected.py combine csv files from all shards into a single csv file for each test suite Pull Request resolved: https://github.com/pytorch/pytorch/pull/98407 Approved by: https://github.com/wconstab, https://github.com/ezyang	2023-04-06 00:00:58 +00:00
Edward Z. Yang	37b9143206	Require sequence length in huggingface to be dynamic (#98335 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/98335 Approved by: https://github.com/voznesenskym	2023-04-05 19:40:22 +00:00
Edward Z. Yang	cf1bfca2ba	Require batch dimensions to be compiled dynamically (#98334 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/98334 Approved by: https://github.com/voznesenskym	2023-04-05 19:40:22 +00:00
Bin Bao	c4de7fdef5	[CI] Mark sebotnet33ts_256 as nondeterministic (#98356 ) Summary: The goal is make sure the new dashboard doesn't give noisy alert on this test. Pull Request resolved: https://github.com/pytorch/pytorch/pull/98356 Approved by: https://github.com/ezyang	2023-04-05 12:05:47 +00:00
Edward Z. Yang	b923f84805	Switch accuracy CI to dynamic batch only (#98307 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/98307 Approved by: https://github.com/wconstab	2023-04-05 01:20:12 +00:00
Elias Ellison	a3365e1d0d	Increment pending forwards after invocation (#98101 ) Forwards are only pending following invocation, not before. Pull Request resolved: https://github.com/pytorch/pytorch/pull/98101 Approved by: https://github.com/ngimel	2023-04-05 00:04:39 +00:00
Bin Bao	bd6db54285	[CI] Mark mobilenet_v3_large as nondeterministic (#98314 ) Summary: Skip mobilenet_v3_large for accuracy checking to reduce noise on the dashboard. The root cause still needs to be investigated. mobilenet_v3_large shows random accuracy check failures with different error values from time to time, and here are some examples: ``` cuda train mobilenet_v3_large [2023-04-04 14:54:50,990] torch._dynamo.utils: [ERROR] RMSE (res-fp64): 0.02172, (ref-fp64): 0.01068 and shape=torch.Size([960, 1, 5, 5]) [2023-04-04 14:54:50,990] torch._dynamo.utils: [ERROR] Accuracy failed for key name features.14.block.1.0.weight.grad ``` ``` cuda train mobilenet_v3_large [2023-04-04 14:57:59,972] torch._dynamo.utils: [ERROR] RMSE (res-fp64): 0.07744, (ref-fp64): 0.03073 and shape=torch.Size([72, 1, 5, 5]) [2023-04-04 14:57:59,973] torch._dynamo.utils: [ERROR] Accuracy failed for key name features.4.block.1.0.weight.grad ``` One observation is turnning off cudnn in the eager mode with `torch.backends.cudnn.enabled = False` makes the non-deterministic behvior go away but meanwhile it fails accuaracy checking consistently. Minifier didn't help to narrow down the error. Pull Request resolved: https://github.com/pytorch/pytorch/pull/98314 Approved by: https://github.com/huydhn	2023-04-04 21:55:23 +00:00
Edward Z. Yang	3c36f82fa2	[EASY] Handle new inference csv from CI (#98294 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/98294 Approved by: https://github.com/wconstab	2023-04-04 20:37:51 +00:00
William Wen	4cf3e7c255	[dynamo benchmarks] Fix inference benchmark runs (#98248 ) Update flags for dynamo inference benchmark runs. Add flag to not compute regressions/metric graphs (useful if there aren't previous runs to compare with). Pull Request resolved: https://github.com/pytorch/pytorch/pull/98248 Approved by: https://github.com/shunting314	2023-04-04 01:24:13 +00:00
Bin Bao	69ff39d2e7	Skip gat, gcn and sage for TorchBench CUDA test (#98244 ) Summary: The three models only support CPU for now. Pull Request resolved: https://github.com/pytorch/pytorch/pull/98244 Approved by: https://github.com/ezyang	2023-04-04 01:06:18 +00:00
Jason Ansel	55afaa46a4	Support functools.partial and itertools.product (#98120 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/98120 Approved by: https://github.com/anijain2305	2023-04-03 18:23:25 +00:00
Bin Bao	ba7ee00f00	Add a --inference flag to dynamo benchmark script (#98173 ) Summary: When calling benchmark scripts, make it a requirement to pass --inference or --training Pull Request resolved: https://github.com/pytorch/pytorch/pull/98173 Approved by: https://github.com/huydhn	2023-04-03 17:12:28 +00:00
Jason Ansel	76074dc0a3	Improve support for dict subclasses (#98154 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/98154 Approved by: https://github.com/anijain2305	2023-04-03 01:42:08 +00:00
Jason Ansel	bc9dd969e1	Support inlining no_grad() decorator (#98121 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/98121 Approved by: https://github.com/anijain2305, https://github.com/voznesenskym	2023-04-03 00:24:56 +00:00
Jason Ansel	92b46202ef	Add --stats option to benchmark scripts (#98109 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/98109 Approved by: https://github.com/anijain2305	2023-04-02 02:23:13 +00:00
Edward Z. Yang	5df59f957f	Fix G001,G002,G003 in logs to % syntax (#97812 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/97812 Approved by: https://github.com/Skylion007, https://github.com/kiukchung, https://github.com/malfet, https://github.com/mlazos	2023-04-01 01:43:33 +00:00
Animesh Jain	6b319d1525	[dynamo][graph break fix] inplace add for empty tuple (#97923 ) Fixes one of the frequent graph breaks in HF models. Pull Request resolved: https://github.com/pytorch/pytorch/pull/97923 Approved by: https://github.com/yanboliang, https://github.com/jansel	2023-04-01 00:11:16 +00:00
Bin Bao	c699ac17df	[CI] Bump up torchbench version to fix dynamo graph breaks in transformers (#98003 ) Summary: When we bump up the torchbench version pin last time, we found there were new graph breaks introduced with the trasformers version upgrade, see https://github.com/pytorch/pytorch/pull/96782. Turns out they are already fixed upstream, see https://github.com/huggingface/transformers/pull/21648 and https://github.com/pytorch/benchmark/pull/1511 Pull Request resolved: https://github.com/pytorch/pytorch/pull/98003 Approved by: https://github.com/ngimel	2023-03-31 16:52:09 +00:00
Edward Z. Yang	91ad5984d8	Add script to summarize performance from CI performance run (#97977 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/97977 Approved by: https://github.com/wconstab	2023-03-31 12:44:48 +00:00
Edward Z. Yang	97fc8ea5f4	Run the benchmark suite with dynamic batch only (#97912 ) Symbolic shapes compile time on full CI with inductor is horribly long (even though our aot_eager local runs seemed to suggest that the added latency was only 10s per model.) To patch over the problem for now, run the benchmark suite with dynamic batch only. This should absolve a lot of sins. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/97912 Approved by: https://github.com/janeyx99, https://github.com/desertfire	2023-03-30 18:04:48 +00:00
Aaron Gokaslan	47dca20d80	[BE] Enable flake8-comprehension rule C417 (#97880 ) Enables flake8-comprehension rule C417. Ruff autogenerated these fixes to the codebase. Pull Request resolved: https://github.com/pytorch/pytorch/pull/97880 Approved by: https://github.com/ezyang, https://github.com/kit1980, https://github.com/albanD	2023-03-30 14:34:24 +00:00
Will Constable	2f86c9bc0b	Update query version for update_expected.py (#97898 ) Unclear why this wobbled, but rocks had an outage and fixed it, maybe new endpoints were generated as a result of that. Pull Request resolved: https://github.com/pytorch/pytorch/pull/97898 Approved by: https://github.com/huydhn	2023-03-29 21:50:19 +00:00
William Wen	b93e1f377e	[dynamo, benchmarks] Add inductor-mode (for max-autotune) and warm start options to dynamo benchmarks (#97719 ) Title. Pull Request resolved: https://github.com/pytorch/pytorch/pull/97719 Approved by: https://github.com/shunting314	2023-03-29 21:09:00 +00:00
Edward Z. Yang	f754be897a	Disable speedup_experiment_ds (#97806 ) It seems to be broken. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/97806 Approved by: https://github.com/jansel	2023-03-29 01:27:31 +00:00
Aaron Gokaslan	597b558c51	[BE]: Update flake8 and plugins and fix bugs (#97795 ) Update flake8 and flake8-plugins in lintrunner to a modern version. Enables more checks and makes flake8 checks significantly faster. Added a few additional rule ignores that will need to be fixed in the future. Pull Request resolved: https://github.com/pytorch/pytorch/pull/97795 Approved by: https://github.com/alexsio27444, https://github.com/janeyx99, https://github.com/ezyang	2023-03-28 23:51:55 +00:00
Bin Bao	a9a81ab7e3	[CI] Run benchmark test with dynamo_eager in periodic (#97543 ) Summary: The idea is to catch any dynamo_eager regression earlier, and also we can take that off the dashboard run. Pull Request resolved: https://github.com/pytorch/pytorch/pull/97543 Approved by: https://github.com/huydhn	2023-03-28 01:02:49 +00:00
Shunting Zhang	652592efa9	[inductor] use torch.prifiler in the triton wrapper (#97405 ) I think it's helpful to use torch.profiler to profile the triton wrapper. E.g., I tried it for nvidia_deeprecommender's infernece graph. Even with max-autotune, we see the majority of the time the GPU is running 2 mm/addmm op. That's why max autotune does not help for this model since tuning does not affect the external mm ops. <img width="711" alt="Screenshot 2023-03-22 at 5 49 28 PM" src="https://user-images.githubusercontent.com/52589240/227072474-2f0d7205-4a10-4929-b1b7-551214788c61.png"> next step I'll check why the triton mm kernels are not picked. EDIT: the above screenshot is captured without max-autotune due to a typo. below is the trace with max-autotune enabled: <img width="712" alt="Screenshot 2023-03-22 at 6 43 26 PM" src="https://user-images.githubusercontent.com/52589240/227077624-fdccf928-be08-4211-871b-a9e3d7b76fbe.png"> Pull Request resolved: https://github.com/pytorch/pytorch/pull/97405 Approved by: https://github.com/ngimel	2023-03-27 21:54:25 +00:00
Yanbo Liang	d305d4a57f	[Dynamo] Fix TIMM benchmark compute_loss (#97423 ) Fixes #97382 #95416 fixed a critical bug in dynamo benchmark, where AMP tests fall back to eager mode before that PR. However, after that PR, we found [a list of TIMM models amp + eager + training testing failed](https://docs.google.com/spreadsheets/d/1DEhirVOkj15Lu4UNawIUon9MqkVLaWqyT-DQPif5NHk/edit#gid=0). Now we identified the root cause is: high loss values make gradient checking harder, as small changes in accumulation order upset accuracy checks. We should switch to the helper function ```reduce_to_scalar_loss``` which has been used by Torchbench tests. After switching to ```reduce_to_scalar_loss```, TIMM models accuracy pass rate grows from 67.74% to 91.94% in my local test. The rest 5 failed models(ese_vovnet19b_dw, fbnetc_100, mnasnet_100, mobilevit_s, sebotnet33ts_256) need further investigation and handling, but I think it should be similar reason. Pull Request resolved: https://github.com/pytorch/pytorch/pull/97423 Approved by: https://github.com/Chillee	2023-03-24 16:50:28 +00:00
Scott Wolchok	3b54592050	[PyTorch] Add annotation_str benchmark (#96496 ) To be used to evaluate performance of following improvements. Baseline numbers: https://gist.github.com/swolchok/c8bcb92be1dc6e67c4f7efad498becd5 Differential Revision: [D43919653](https://our.internmc.facebook.com/intern/diff/D43919653/) NOTE FOR REVIEWERS: This PR has internal Meta-specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D43919653/)! Pull Request resolved: https://github.com/pytorch/pytorch/pull/96496 Approved by: https://github.com/Skylion007	2023-03-23 04:18:07 +00:00
Jason Ansel	9370f253e3	[inductor] Rewrite convolution triton templates (#95556 ) Fixes #95775 Pull Request resolved: https://github.com/pytorch/pytorch/pull/95556 Approved by: https://github.com/Chillee, https://github.com/ngimel	2023-03-22 18:12:23 +00:00
Edward Z. Yang	cff4826f28	pytorch_unet is now passing (#97309 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/97309 Approved by: https://github.com/janeyx99, https://github.com/zou3519	2023-03-22 13:55:05 +00:00
Bin Bao	be49d3b170	[CI] Turn on debug logging for dla102 and gernet_l (#97307 ) Summary: Log the generated code for those two flaky tests to see if there is any codegen difference when they fail. Pull Request resolved: https://github.com/pytorch/pytorch/pull/97307 Approved by: https://github.com/ezyang	2023-03-22 13:42:13 +00:00
jjsjann123	2b32a74ab0	moving nvfuser benchmark to third_party/nvfuser (#96725 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/96725 Approved by: https://github.com/davidberard98	2023-03-21 23:19:15 +00:00
Natalia Gimelshein	e7d9331688	[inductor] hoist symbolic padding expressions (#97099 ) Towards fixing pnasnet5large, see #96709. The generated kernel looks much better ``` @pointwise(size_hints=[1048576], filename=__file__, meta={'signature': {0: 'fp32', 1: 'fp32', 2: 'i32', 3: 'i32', 4: 'i32', 5: 'i32', 6: 'i32'}, 'device': 0, 'constants': {}, 'mutated_arg_names': [], 'configs': [instance_descriptor(divisible_by_16=(0, 1, 6), equal_to_1=())]}) @triton.jit def triton_(in_ptr0, out_ptr0, ks0, ks1, ks2, ks3, xnumel, XBLOCK : tl.constexpr): xoffset = tl.program_id(0) * XBLOCK xindex = xoffset + tl.arange(0, XBLOCK)[:] xmask = xindex < xnumel x1 = (xindex // ks0) % ks0 x0 = xindex % ks0 x2 = (xindex // ks3) x4 = xindex tmp0 = x1 + ((-1)ks1) tmp1 = 0 tmp2 = tmp0 >= tmp1 tmp3 = ks2 tmp4 = tmp0 < tmp3 tmp5 = x0 + ((-1)ks1) tmp6 = tmp5 >= tmp1 tmp7 = tmp5 < tmp3 tmp8 = tmp2 & tmp4 tmp9 = tmp8 & tmp6 tmp10 = tmp9 & tmp7 tmp11 = tl.load(in_ptr0 + (x0 + ((-1)ks1) + (ks2x1) + (x2(ks2ks2)) + ((-1)ks1ks2) + tl.zeros([XBLOCK], tl.int32)), tmp10 & xmask, other=0) tmp12 = tl.where(tmp10, tmp11, 0.0) tl.store(out_ptr0 + (x4 + tl.zeros([XBLOCK], tl.int32)), tmp12, xmask) ``` Interestingly, removing `expand` in in index `simplify` function makes `load` expression a little bit better, but `store` fails to simplify to flat store in this case, so I'm leaving `expand` in. Full pnasnet still chokes on `ceiling` in batch_norm kernels, additionally, it looks like shape propagation goofs in inductor and generates overly complicated expressions, we should switch to meta data from fx graph. I'm still not adding `ceil` print to triton, because we should be able to hoist all indexing expression (and just printing ceil without converting to int64 doesn't work) Pull Request resolved: https://github.com/pytorch/pytorch/pull/97099 Approved by: https://github.com/jansel	2023-03-21 21:43:32 +00:00
Bin Bao	ead5186462	[CI] Change tests used by the new dashboard (#96986 ) Summary: Stop using runn.py to trigger the new dashboard run. Instead, we spell out the actual cmd which will be easier to extend. Dropping perf tests for dynamo_eager and aot_eager in this PR. Pull Request resolved: https://github.com/pytorch/pytorch/pull/96986 Approved by: https://github.com/huydhn, https://github.com/weiwangmeta	2023-03-20 17:28:12 +00:00
Edward Z. Yang	e74c5e5637	rexnet_100 is disabled for static, does not need dynamic listing (#97100 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/97100 Approved by: https://github.com/Skylion007	2023-03-19 20:57:49 +00:00
David Berard	a4c706bcbc	[dynamo][dashboard] fix triton clone step in dashboard (#96623 ) previously this would clone triton, and then try to checkout without being in the git repo directory. This wasn't usually a problem because the environment already had a triton repo downloaded; but I ran into this while trying to construct a new environment. Pull Request resolved: https://github.com/pytorch/pytorch/pull/96623 Approved by: https://github.com/anijain2305	2023-03-17 22:36:26 +00:00
Bin Bao	577d930c39	[CI] Revert https://github.com/pytorch/pytorch/pull/96195 (#96897 ) Summary: https://github.com/pytorch/pytorch/pull/96195 was an experiment for debugging flaky failures on CI. Pull Request resolved: https://github.com/pytorch/pytorch/pull/96897 Approved by: https://github.com/ngimel	2023-03-16 06:28:18 +00:00
Edward Z. Yang	3606f59366	Default specialize_int to False (#96624 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/96624 Approved by: https://github.com/janeyx99	2023-03-16 02:54:18 +00:00
Will Constable	54cd4a67d0	Output peak memory stats from dynamo torchbench perf CI (#95666 ) Adds absolute memory usage numbers (in addition to compression ratio) to performance jobs. Example output: <img width="1211" alt="image" src="https://user-images.githubusercontent.com/4984825/225419950-500908c5-00ce-4711-afa2-c995bf90d35d.png"> Pull Request resolved: https://github.com/pytorch/pytorch/pull/95666 Approved by: https://github.com/ezyang, https://github.com/williamwen42	2023-03-15 19:24:47 +00:00
Bin Bao	33c7be360f	[reland][CI] switch torchbench to a pinned version (#96782 ) Summary: This is reland of https://github.com/pytorch/pytorch/pull/96553 Pull Request resolved: https://github.com/pytorch/pytorch/pull/96782 Approved by: https://github.com/huydhn	2023-03-15 12:46:36 +00:00
BowenBao	60a68477a6	Bump black version to 23.1.0 (#96578 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/96578 Approved by: https://github.com/ezyang	2023-03-15 06:27:59 +00:00
Edward Z. Yang	037acd5a22	Update CI skips (#96745 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/96745 Approved by: https://github.com/wconstab	2023-03-14 22:19:10 +00:00
PyTorch MergeBot	be4eaa69c2	Revert "[CI] switch torchbench to a pinned version (#96553 )" This reverts commit `61d6ccd29a`. Reverted https://github.com/pytorch/pytorch/pull/96553 on behalf of https://github.com/desertfire due to land race	2023-03-14 21:39:45 +00:00
PyTorch MergeBot	2951a75c3a	Revert "Update perf smoke test threshold in check_hf_bert_perf_csv.py (#96772 )" This reverts commit `2eed44933b`. Reverted https://github.com/pytorch/pytorch/pull/96772 on behalf of https://github.com/desertfire due to land race	2023-03-14 21:37:30 +00:00
Wei Wang	2eed44933b	Update perf smoke test threshold in check_hf_bert_perf_csv.py (#96772 ) Reduce the threshold a little further due to runner to runner performance variations. e.g. https://github.com/pytorch/pytorch/actions/runs/4419276220/jobs/7747985757 https://github.com/pytorch/pytorch/actions/runs/4419548525/jobs/7748553775 failed to meet 1.145 but were above 1.140. Pull Request resolved: https://github.com/pytorch/pytorch/pull/96772 Approved by: https://github.com/seemethere, https://github.com/huydhn, https://github.com/atalman	2023-03-14 21:00:13 +00:00
PyTorch MergeBot	ba4fb9b6ad	Revert "Default specialize_int to False (#96624 )" This reverts commit `1ac8782db2`. Reverted https://github.com/pytorch/pytorch/pull/96624 on behalf of https://github.com/kit1980 due to Broke inductor/test_torchinductor_dynamic_shapes.py	2023-03-14 19:43:47 +00:00
Will Constable	66871d61bb	One line print for check_graph_breaks (#96750 ) New output looks like this <img width="1040" alt="image" src="https://user-images.githubusercontent.com/4984825/225059313-fbac5152-ea8b-46ba-893d-dc1e2f8d82cc.png"> Pull Request resolved: https://github.com/pytorch/pytorch/pull/96750 Approved by: https://github.com/ezyang	2023-03-14 19:35:54 +00:00
Bin Bao	61d6ccd29a	[CI] switch torchbench to a pinned version (#96553 ) Summary: Previously we were using a branch on torchbench which skips torchaudio. We should switch to make sure a good test coverage. Pull Request resolved: https://github.com/pytorch/pytorch/pull/96553 Approved by: https://github.com/huydhn, https://github.com/ezyang	2023-03-14 18:42:22 +00:00
Edward Z. Yang	1ac8782db2	Default specialize_int to False (#96624 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/96624 Approved by: https://github.com/janeyx99	2023-03-14 18:37:47 +00:00
David Berard	6e3d51b08a	[inductor][CI] also skip rexnet_100 on non-dynamic shapes (#96691 ) Recent failures show rexnet_100 accuracy is flaky also on non-dynamic shapes (was already disabled for dynamic shapes in #96474). The failure occurs for the same reason (stem.bn.weight.grad). e.g. https://github.com/pytorch/pytorch/actions/runs/4402868441/jobs/7710977874 Pull Request resolved: https://github.com/pytorch/pytorch/pull/96691 Approved by: https://github.com/desertfire	2023-03-14 18:11:59 +00:00
Edward Z. Yang	ff7e510d1e	Correctly use PythonPrinter for generating wrapper code referencing sympy (#96710 ) Otherwise you get stuff like ceiling(s0) which is not valid Python code. Fixes volo_d1_224 Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/96710 Approved by: https://github.com/ngimel, https://github.com/jansel	2023-03-14 14:35:52 +00:00
Will Constable	f1d4d291b0	update_expected.py to parse artifacts and update graph break stats (#96480 ) TODO (cc @soumith @voznesenskym @penguinwu @anijain2305 @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @desertfire @ZainRizvi) hopefully i can convert the rocks query i'm using to a public API and delete the rocs api usage (and need for apikey) from this before landing. If that's not easy or if i need to make a new query first, maybe i should land this as-is and at least people can use it if they get an apikey. Also, any bad practices in how i parsed/mangled the filenames? Would be nice to make the naming of artifacts more consistent with the job names so less mangling is needed. Pull Request resolved: https://github.com/pytorch/pytorch/pull/96480 Approved by: https://github.com/ZainRizvi	2023-03-14 13:37:21 +00:00
Wang, Eikan	3cad8d23d0	[Inductor] Skip the hf_T5_base due to intermittent failure on CI (#96649 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/96649 Approved by: https://github.com/desertfire	2023-03-14 07:40:20 +00:00
Will Constable	218eeacacd	Check dynamo graph-breaks in CI (#96346 ) - add graph-breaks baselines - add check_graph_breaks script (message users on regress or improvement) - hook up test.sh for existing accuracy job Refactor graph-break CI check Take steps toward merging checker with existing check flow, consider merging it all the way inside the bench runner. csvs Pull Request resolved: https://github.com/pytorch/pytorch/pull/96346 Approved by: https://github.com/ezyang	2023-03-14 03:39:36 +00:00
Edward Z. Yang	507feb805f	Don't specialize torch.Size with specialize_int = False (#96419 ) Fixes https://github.com/pytorch/pytorch/issues/95868 Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/96419 Approved by: https://github.com/jansel, https://github.com/ngimel	2023-03-14 01:32:58 +00:00
David Berard	1d792288a5	[dynamo][dashboard] Clear local changes before pulling git repos (#96667 ) Current dashboard issue is due to a .pt file in torchbench that has beeen modified for some reason. This clears any local changes before pulling. Tested in a duplicate dashboard environment with the same .pt file modified: * Before the change to this makefile, `make pull-deps` fails * After the change to this makefile, `make pull-deps` succeeds. Pull Request resolved: https://github.com/pytorch/pytorch/pull/96667 Approved by: https://github.com/anijain2305	2023-03-13 22:50:38 +00:00
Edward Z. Yang	c7f39c0820	Update CI skips (#96554 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/96554 Approved by: https://github.com/janeyx99	2023-03-13 13:40:45 +00:00
Huy Do	c3614c7a61	Add a flag to benchmarks script to keep the test report directory (#96398 ) I notice from the Rockset data that there are only `float32` records, while there should be both dtypes there. It turns out that the benchmarks script generated by `runner.py` always removes the output directory by default, so there are only records from `float32` running later left. For example, `rm -rf /var/lib/jenkins/workspace/test/test-reports` appeared twice in the CI log https://ossci-raw-job-status.s3.amazonaws.com/log/11840774308. I'm adding a new flag `--keep-output-dir` to keep the output directory. This is off by default as I'm not sure how this script is used internally, people probably expect to see the output directory cleaned up everytime. ### Testing Not really want to start the 10h jobs just to test this small flag, so I'm triple check the change to make sure that there is no bug Pull Request resolved: https://github.com/pytorch/pytorch/pull/96398 Approved by: https://github.com/weiwangmeta	2023-03-11 03:16:56 +00:00
Yanbo Liang	7fcf8b1829	[Dynamo] Support torch.{cuda/cpu}.amp.autocast (#95416 ) For Meta internal use cases. Pull Request resolved: https://github.com/pytorch/pytorch/pull/95416 Approved by: https://github.com/jansel	2023-03-10 21:48:08 +00:00
Wei Wang	49eed50d19	[Inductor Perf CI] Lower the threshold of performance smoke test speedup. (#96531 ) Avoids issues with https://github.com/pytorch/pytorch/issues/96530 Pull Request resolved: https://github.com/pytorch/pytorch/pull/96531 Approved by: https://github.com/seemethere	2023-03-10 18:58:28 +00:00
David Berard	29cd60dfb7	[CI] handle more dynamo benchmark models that are not expected to be deterministic (#96324 ) Follow-up to #96245. alexnet, Background_Matting, vision_maskrcnn, and vgg16 all have the same problem; but on float32 they were also failing on the previous day so I missed this. Once the amp jobs became available I could see that these have the same issue (on both float32 and amp). Pull Request resolved: https://github.com/pytorch/pytorch/pull/96324 Approved by: https://github.com/desertfire	2023-03-10 18:15:34 +00:00
Bin Bao	a651e6253a	[CI] Change compile_threads to 1 when running benchmark accuracy test on CI (#96195 ) Summary: This is not a pretty solution, but it a way to verify if the flakiness is coming from parallel compilation. Pull Request resolved: https://github.com/pytorch/pytorch/pull/96195 Approved by: https://github.com/ngimel	2023-03-10 17:39:38 +00:00
Edward Z. Yang	ff2e14f200	Skip rexnet_100 in dynamic CI (#96474 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/96474 Approved by: https://github.com/yanboliang, https://github.com/msaroufim	2023-03-10 01:23:19 +00:00
Horace He	5bbec680d7	Fix usages of contextmanager without finally (#96170 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/96170 Approved by: https://github.com/ngimel, https://github.com/malfet	2023-03-08 20:59:27 +00:00
Edward Z. Yang	c988de1040	[EASY] Update inductor training dynamic skips (#96298 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/96298 Approved by: https://github.com/Chillee, https://github.com/janeyx99	2023-03-08 19:31:46 +00:00
Bin Bao	b3a079810e	[CI] Add a workflow for quick perf comparison (#96166 ) Summary: ciflow/inductor-perf-test-nightly now contains full dashboard run which takes a very long time. Ed proposed a simplification of the perf run there, but it is still worth to have a set of fast perf test which only includes one configuration (--training --amp). Pull Request resolved: https://github.com/pytorch/pytorch/pull/96166 Approved by: https://github.com/huydhn, https://github.com/weiwangmeta	2023-03-08 19:09:04 +00:00
Bin Bao	664381b293	[CI] Avoid calling torch.use_deterministic_algorithms for some models (#96245 ) tests Pull Request resolved: https://github.com/pytorch/pytorch/pull/96245 Approved by: https://github.com/davidberard98	2023-03-08 03:35:32 +00:00
Edward Z. Yang	d0641ed247	[TEST] Turn on unspecialize int dynamic training inductor CI (#96058 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/96058 Approved by: https://github.com/janeyx99, https://github.com/voznesenskym	2023-03-07 16:08:45 +00:00
Edward Z. Yang	a6e3e7905e	Turn on unspecialize int dynamic inductor CI (#96034 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/96034 Approved by: https://github.com/voznesenskym	2023-03-07 12:39:55 +00:00
Jason Ansel	95d17dc93d	[inductor] Reland #95567 part 1 (#96023 ) This is the non-problematic part of #95567. The errors were coming from IR printing changes which will be next in the stack. Pull Request resolved: https://github.com/pytorch/pytorch/pull/96023 Approved by: https://github.com/ngimel, https://github.com/mlazos	2023-03-06 22:57:22 +00:00
Edward Z. Yang	1fd7ea1ba8	Update skips for RecursionError (#96109 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/96109 Approved by: https://github.com/huydhn	2023-03-06 17:55:38 +00:00
Bin Bao	02792ff16f	[CI] Make inductor-perf-test-nightly produce data for dashboard (#95685 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/95685 Approved by: https://github.com/ezyang, https://github.com/huydhn	2023-03-06 03:14:03 +00:00
Bin Bao	60cf95610d	[CI] Skip xcit_large_24_p8_224 in TIMM (#96048 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/96048 Approved by: https://github.com/jansel	2023-03-05 14:54:46 +00:00
Bin Bao	1359d16fe8	[CI] Further tighten the checking of two eager runs (#95902 ) Summary: To catch nondeterminism in eager if there is any. Pull Request resolved: https://github.com/pytorch/pytorch/pull/95902 Approved by: https://github.com/jansel	2023-03-05 14:53:02 +00:00
Edward Z. Yang	c7c4a20321	Update dynamic skips (#95966 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/95966 Approved by: https://github.com/janeyx99, https://github.com/voznesenskym	2023-03-04 23:01:58 +00:00
Jason Ansel	43dd043ea7	Revert "[inductor] Improve error messages (#95567 )" (#96014 ) This reverts commit `62b775583f`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/96014 Approved by: https://github.com/Chillee	2023-03-04 04:03:31 +00:00
Edward Z. Yang	d303665d33	Make int unspecialization actually work (#95621 ) OK, so this PR used to be about reducing the number of constants we specialize on, but it turns out that unspecialization was ~essentially never used (because we still constant specialized way too aggressively) and I ended up having to fix a bunch of issues to actually get tests to pass. So this PR is now "make int unspecialization actually work". As part of this, I have to turn off unspecialization by default, as there are still latent bugs in inductor. The general strategy is that an unspecialized int is represented as a SymInt. Representing it as a 0d tensor (which is what the code used to do) is untenable: (1) we often need unspecialized ints to participate in size computations, but we have no way of propagating sympy expressions through tensor compute, and (2) a lot of APIs work when passed SymInt, but not when passed a Tensor. However, I continue to represent Numpy scalars as Tensors, as they are rarely used for size computation and they have an explicit dtype, so they are more accurately modeled as 0d tensors. * I folded in the changes from https://github.com/pytorch/pytorch/pull/95099 as I cannot represent unspecialized ints as SymInts without also turning on dynamic shapes. This also eliminates the necessity for test_unspec.py, as toggling specialization without dynamic shapes doesn't do anything. As dynamic shapes defaults to unspecializing, I just deleted this entirely; for the specialization case, I rely on regular static shape tests to catch it. (Hypothetically, we could also rerun all the tests with dynamic shapes, but WITH int/float specialization, but this seems... not that useful? I mean, I guess export wants it, but I'd kind of like our Source heuristic to improve enough that export doesn't have to toggle this either.) * Only 0/1 integers get specialized by default now * A hodgepodge of fixes. I'll comment on the PR about them. Fixes https://github.com/pytorch/pytorch/issues/95469 Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/95621 Approved by: https://github.com/jansel, https://github.com/Chillee	2023-03-04 01:22:08 +00:00
Jason Ansel	62b775583f	[inductor] Improve error messages (#95567 ) Example error message before/after (710 to 131 lines): https://gist.github.com/jansel/6fecad057738089fa95bf08c3de9fc8a Pull Request resolved: https://github.com/pytorch/pytorch/pull/95567 Approved by: https://github.com/mlazos	2023-03-02 02:20:55 +00:00
Bin Bao	879f0c3fee	[CI] Increate the timeout limit for benchmark test (#95787 ) Summary: xcit_large_24_p8_224 occasionally hits TIMEOUT on CI. Bump up the limit to reduce flakiness. Pull Request resolved: https://github.com/pytorch/pytorch/pull/95787 Approved by: https://github.com/ezyang, https://github.com/ZainRizvi	2023-03-01 19:54:25 +00:00
Bin Bao	e79b2b7792	[CI] Force clear triton cache between running each test (#95729 ) Summary: The idea is to see if this reduces some of the flakiness we have seen on CI. If it does help, then we have a problem in our caching implementation. Pull Request resolved: https://github.com/pytorch/pytorch/pull/95729 Approved by: https://github.com/ngimel	2023-03-01 04:10:03 +00:00
William Wen	cf3638a9cc	[dynamo] Clear cache on dynamo dashboard accuracy tests (#95726 ) Might fix some flaky accuracy tests? Pull Request resolved: https://github.com/pytorch/pytorch/pull/95726 Approved by: https://github.com/ngimel, https://github.com/anijain2305, https://github.com/desertfire	2023-03-01 00:50:19 +00:00
Will Constable	1a72712645	Add dynamo graph break stats to CI (#95635 ) Adds columns to csv produced by accuracy job including dynamo graph break stats. Example output from torchbench CI job: <img width="771" alt="image" src="https://user-images.githubusercontent.com/4984825/221716236-9276684e-1be8-43e1-837e-f41671d4e0e3.png"> Pull Request resolved: https://github.com/pytorch/pytorch/pull/95635 Approved by: https://github.com/ezyang	2023-02-28 16:17:46 +00:00
Edward Z. Yang	3762e801ba	Update dynamic skips (#95587 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/95587 Approved by: https://github.com/voznesenskym	2023-02-28 03:26:55 +00:00
Bin Bao	fa5a4b0dfc	[CI] Do not compare two eager run results against fp64 result (#95616 ) Summary: When running the benchmark test with --accuracy, two eager runs should return the same result. If not, we want to detect it early, but comparing against fp64_output may hide the non-deterministism in eager. Pull Request resolved: https://github.com/pytorch/pytorch/pull/95616 Approved by: https://github.com/ZainRizvi	2023-02-27 20:11:21 +00:00
Bin Bao	ab1ab3ab19	[CI] Specify more torch.backends.cudnn options to reduce non-determinism (#95478 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/95478 Approved by: https://github.com/ezyang	2023-02-25 18:54:12 +00:00
Edward Z. Yang	b8151d2ba9	Utility for running delta comparisons between two flag configs (#95411 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/95411 Approved by: https://github.com/Chillee	2023-02-25 02:30:22 +00:00
Bin Bao	4c8ad93a7c	[Inductor][CI] Remove hf_GPT2_large from CPU inference test (#95473 ) Summary: hf_GPT2_large shows random failure on CI for the CPU inference. Created https://github.com/pytorch/pytorch/issues/95474 for the Intel team to investigate. Pull Request resolved: https://github.com/pytorch/pytorch/pull/95473 Approved by: https://github.com/anijain2305	2023-02-24 18:21:36 +00:00
Will Constable	8de4238a31	Add dynamo bench arg --per_process_memory_fraction (#95260 ) Simply pipes the arg to the existing torch.cuda API by the same name. Useful for locally debugging OOMs that happened on a smaller GPU. Pull Request resolved: https://github.com/pytorch/pytorch/pull/95260 Approved by: https://github.com/davidberard98	2023-02-22 05:11:18 +00:00
Edward Z. Yang	08370ddad8	Update model skips (#95089 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/95089 Approved by: https://github.com/albanD	2023-02-20 13:24:49 +00:00
Wang, Eikan	954c767bc6	[Inductor] Enable accuracy test for CPPBackend (#94898 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/94898 Approved by: https://github.com/jgong5, https://github.com/desertfire	2023-02-20 05:02:15 +00:00
Edward Z. Yang	a2f44d82f8	Flag guard unbacked SymInt/SymFloat support (#94987 ) I believe this fixes the AllenaiLongformerBase problem in periodic. The longer version of the problem is here is we are currently optimistically converting all item() calls into unbacked SymInt/SymFloat, but sometimes this results in a downstream error due to a data-dependent guard. Fallbacks for this case are non-existent; this will just crash the model. This is bad. So we flag guard until we get working fallbacks. What could these fallbacks look like? One idea I have is to optimistically make data-dependent calls unbacked, but then if it results in a crash, restart Dynamo analysis with the plan of graph breaking when the item() call immediately happened. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/94987 Approved by: https://github.com/Skylion007, https://github.com/malfet	2023-02-17 00:25:05 +00:00
Edward Z. Yang	5747a51657	Fix flaky StaticRuntime.Nonzero test (#94418 ) If the operator produces a zero size tensor, the memory may be equal to the original. With nonzero, we would sometimes get unlucky and everything was zero. See failing tests at https://hud.pytorch.org/failure/%5B%20%20FAILED%20%20%5D%20StaticRuntime.Nonzero Arguably we should also fix the seeding but it was less obvious to me where to do that. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/94418 Approved by: https://github.com/albanD	2023-02-16 21:25:15 +00:00
Edward Z. Yang	7aaebe00ee	Fail dynamic_aot_eager AllenaiLongformerBase model (#94986 ) ``` GuardOnDataDependentSymNode: It appears that you're trying to get a value out of symbolic int/float whose value is data-dependent (and thus we do not know the true value.) The expression we were trying to evaluate is Eq(i3, -1). Scroll up to see where each of these data-dependent accesses originally occurred. While executing %as_strided : [#users=1] = call_method[target=as_strided](args = (%pad,), kwargs = {size: (12, %add, 768, 64), stride: (%getitem, %mul, %getitem_1, %getitem_2)}) Original traceback: File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/transformers/models/longformer/modeling_longformer.py", line 928, in <graph break in _sliding_chunks_matmul_attn_probs_value> chunked_value = padded_value.as_strided(size=chunked_value_size, stride=chunked_value_stride) ``` Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/94986 Approved by: https://github.com/albanD	2023-02-16 20:02:46 +00:00
Wei Wang	5705199fb1	Update smoke test threshold (#94888 ) https://github.com/pytorch/pytorch/pull/94249 touched upon what values we should set. It turns out 1.17 is too high, as seemingly innocent commits are failing to yield 1.17x. They yielded ~1.168x. https://github.com/pytorch/pytorch/actions/runs/4180998255/jobs/7242758816 <img width="881" alt="image" src="https://user-images.githubusercontent.com/109318740/218951536-476d3764-1aa6-481b-bd92-f55d1c50e385.png"> Setting it to 1.165x. Pull Request resolved: https://github.com/pytorch/pytorch/pull/94888 Approved by: https://github.com/ngimel	2023-02-15 07:29:41 +00:00
Xuehai Pan	b005ec62b9	[BE] Remove dependency on `six` and `future` (#94709 ) Remove the Python 2 and 3 compatibility library [six](https://pypi.org/project/six) and [future](https://pypi.org/project/future) and `torch._six`. We only support Python 3.8+ now. It's time to retire them. Pull Request resolved: https://github.com/pytorch/pytorch/pull/94709 Approved by: https://github.com/malfet, https://github.com/Skylion007	2023-02-14 09:14:14 +00:00
Natalia Gimelshein	f2aee8b8d5	small fixes for mlir backend (#94717 ) Fixes for skipped tests with mlir triton backend (will unskip once #94249 lands) Pull Request resolved: https://github.com/pytorch/pytorch/pull/94717 Approved by: https://github.com/malfet, https://github.com/atalman	2023-02-13 22:42:53 +00:00
Aaron Gokaslan	0444a6c90a	[BE] Remove deprecated logging warn method (#94708 ) Swaps all logging.warn calls to logging.warning since the former is deprecated and even raises a deprecation warning now. Pull Request resolved: https://github.com/pytorch/pytorch/pull/94708 Approved by: https://github.com/ezyang	2023-02-13 18:24:52 +00:00
Edward Z. Yang	ae7a628b03	Dynamic shapes CI updates (#94690 ) Data from https://github.com/pytorch/pytorch/pull/94683 Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/94690 Approved by: https://github.com/cpuhrsch	2023-02-13 18:20:12 +00:00
Nikita Shulga	4869929f32	Update Triton hash (#94249 ) That includes MLIR + latest packaging changes (that also download ptxas from CUDA-12) Tweak CI to install gcc-9 to build trition Disable a few tests to make everything be correct Pull Request resolved: https://github.com/pytorch/pytorch/pull/94249 Approved by: https://github.com/Skylion007, https://github.com/ngimel, https://github.com/weiwangmeta	2023-02-13 13:17:36 +00:00
Aaron Gokaslan	67d9790985	[BE] Apply almost all remaining flake8-comprehension checks (#94676 ) Applies the remaining flake8-comprehension fixes and checks. This changes replace all remaining unnecessary generator expressions with list/dict/set comprehensions which are more succinct, performant, and better supported by our torch.jit compiler. It also removes useless generators such as 'set(a for a in b)`, resolving it into just the set call. Pull Request resolved: https://github.com/pytorch/pytorch/pull/94676 Approved by: https://github.com/ezyang	2023-02-12 01:01:25 +00:00
Xuehai Pan	8d45f555d7	[BE] [1/3] Rewrite `super()` calls in caffe2 and benchmarks (#94587 ) Rewrite Python built-in class `super()` calls. Only non-semantic changes should be applied. - #94587 - #94588 - #94592 Also, methods with only a `super()` call are removed: ```diff class MyModule(nn.Module): - def __init__(self): - super().__init__() - def forward(self, ...): ... ``` Some cases that change the semantics should be kept unchanged. E.g.: `f152a79be9/caffe2/python/net_printer.py (L184-L190)` `f152a79be9/test/test_jit_fuser_te.py (L2628-L2635)` Pull Request resolved: https://github.com/pytorch/pytorch/pull/94587 Approved by: https://github.com/ezyang	2023-02-11 18:19:48 +00:00

... 2 3 4 5 6 ...

1332 Commits