Edward Z. Yang
b8b840be3d
Convert logging f-strings to use % format, part five ( #98765 )
...
This handles some annoying but simple cases by hand.
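For context, the conversion pattern is roughly the following (an illustrative sketch, not lines from this diff):
```python
import logging

log = logging.getLogger(__name__)
name, count = "resnet50", 3  # hypothetical values for illustration

# Before: the f-string is formatted eagerly, even when the log level is disabled.
log.info(f"compiled {name} {count} times")

# After: %-style arguments are formatted lazily by the logging framework.
log.info("compiled %s %d times", name, count)
```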
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98765
Approved by: https://github.com/wanchaol
2023-04-11 13:17:59 +00:00
Edward Z. Yang
b09722f540
Convert logging f-strings to use % format, part two ( #98700 )
...
This pass hits multi-line logging strings.
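An illustrative multi-line case (a sketch, not taken from the actual diff):
```python
import logging

log = logging.getLogger(__name__)
graph, reason = "forward", "data-dependent branch"  # hypothetical values

# Before: implicitly concatenated f-strings spanning several lines.
log.warning(
    f"graph break in {graph} "
    f"because of {reason}"
)

# After: a single %-format string with the arguments passed separately.
log.warning("graph break in %s because of %s", graph, reason)
```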
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98700
Approved by: https://github.com/voznesenskym
2023-04-10 12:19:31 +00:00
Edward Z. Yang
9a8f71f23e
Convert logging f-strings to use % format ( #98697 )
...
Codemod done with
https://gist.github.com/ezyang/2e8b0463cdc6be278478495b23ff0530 with
assistance from ChatGPT.
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98697
Approved by: https://github.com/voznesenskym
2023-04-10 12:19:31 +00:00
Edward Z. Yang
bdb79a8f52
Turn off divisible_by_16 for dynamic shapes; support ablation ( #98471 )
...
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98471
Approved by: https://github.com/ngimel , https://github.com/voznesenskym
2023-04-06 12:57:07 +00:00
Edward Z. Yang
cf1bfca2ba
Require batch dimensions to be compiled dynamically ( #98334 )
...
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98334
Approved by: https://github.com/voznesenskym
2023-04-05 19:40:22 +00:00
Edward Z. Yang
b923f84805
Switch accuracy CI to dynamic batch only ( #98307 )
...
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98307
Approved by: https://github.com/wconstab
2023-04-05 01:20:12 +00:00
Elias Ellison
a3365e1d0d
Increment pending forwards after invocation ( #98101 )
...
Forwards are only pending following invocation, not before.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98101
Approved by: https://github.com/ngimel
2023-04-05 00:04:39 +00:00
Bin Bao
69ff39d2e7
Skip gat, gcn and sage for TorchBench CUDA test ( #98244 )
...
Summary: The three models only support CPU for now.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98244
Approved by: https://github.com/ezyang
2023-04-04 01:06:18 +00:00
Jason Ansel
55afaa46a4
Support functools.partial and itertools.product ( #98120 )
...
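A toy example of the kind of code this lets dynamo trace (hypothetical function, not from the PR):
```python
import functools
import itertools

import torch

def fn(x):
    # Both constructs are traced rather than causing graph breaks.
    scaled_add = functools.partial(torch.add, alpha=2)
    for i, j in itertools.product((1, 2), (3, 4)):
        x = scaled_add(x, i * j)
    return x

out = torch.compile(fn)(torch.randn(4))
```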
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98120
Approved by: https://github.com/anijain2305
2023-04-03 18:23:25 +00:00
Bin Bao
ba7ee00f00
Add a --inference flag to dynamo benchmark script ( #98173 )
...
Summary: When calling the benchmark scripts, make it a requirement to pass
either --inference or --training.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98173
Approved by: https://github.com/huydhn
2023-04-03 17:12:28 +00:00
Jason Ansel
92b46202ef
Add --stats option to benchmark scripts ( #98109 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98109
Approved by: https://github.com/anijain2305
2023-04-02 02:23:13 +00:00
Edward Z. Yang
5df59f957f
Fix G001,G002,G003 in logs to % syntax ( #97812 )
...
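These flake8-logging-format rules flag eager formatting inside logging calls; roughly (illustrative sketch):
```python
import logging

log = logging.getLogger(__name__)
user = "alice"  # hypothetical value

log.info("hello {}".format(user))  # G001: str.format() in a logging call
log.info("hello %s" % user)        # G002: eager %-interpolation
log.info("hello " + user)          # G003: string concatenation

log.info("hello %s", user)         # preferred: lazy %-style arguments
```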
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97812
Approved by: https://github.com/Skylion007 , https://github.com/kiukchung , https://github.com/malfet , https://github.com/mlazos
2023-04-01 01:43:33 +00:00
Bin Bao
c699ac17df
[CI] Bump up torchbench version to fix dynamo graph breaks in transformers ( #98003 )
...
Summary: When we bumped up the torchbench version pin last time, we found
new graph breaks introduced by the transformers version
upgrade; see https://github.com/pytorch/pytorch/pull/96782 . It turns out
they were already fixed upstream; see
https://github.com/huggingface/transformers/pull/21648 and https://github.com/pytorch/benchmark/pull/1511
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98003
Approved by: https://github.com/ngimel
2023-03-31 16:52:09 +00:00
Edward Z. Yang
97fc8ea5f4
Run the benchmark suite with dynamic batch only ( #97912 )
...
Symbolic shapes compile time on full CI with inductor is horribly long (even though our aot_eager local runs seemed to suggest that the added latency was only 10s per model). To patch over the problem for now, run the benchmark suite with dynamic batch only. This should absolve a lot of sins.
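For reference, marking only the batch dimension dynamic looks roughly like this (a sketch using torch._dynamo.mark_dynamic; the benchmark harness wiring differs):
```python
import torch
import torch._dynamo

model = torch.nn.Linear(16, 16)  # hypothetical model
x = torch.randn(8, 16)

# Treat dim 0 (the batch dimension) as symbolic, so one compiled graph
# serves all batch sizes instead of recompiling per size.
torch._dynamo.mark_dynamic(x, 0)

out = torch.compile(model)(x)
```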
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97912
Approved by: https://github.com/janeyx99 , https://github.com/desertfire
2023-03-30 18:04:48 +00:00
Aaron Gokaslan
47dca20d80
[BE] Enable flake8-comprehensions rule C417 ( #97880 )
...
Enables flake8-comprehensions rule C417. Ruff autogenerated these fixes across the codebase.
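C417 flags unnecessary `map()` calls that can be rewritten as comprehensions or generator expressions, e.g.:
```python
xs = [1, 2, 3]

# Flagged by C417: map() over a lambda.
squares = list(map(lambda x: x * x, xs))

# The autofix rewrites it as a comprehension.
squares = [x * x for x in xs]
```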
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97880
Approved by: https://github.com/ezyang , https://github.com/kit1980 , https://github.com/albanD
2023-03-30 14:34:24 +00:00
William Wen
b93e1f377e
[dynamo, benchmarks] Add inductor-mode (for max-autotune) and warm start options to dynamo benchmarks ( #97719 )
...
Title.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97719
Approved by: https://github.com/shunting314
2023-03-29 21:09:00 +00:00
Edward Z. Yang
f754be897a
Disable speedup_experiment_ds ( #97806 )
...
It seems to be broken.
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97806
Approved by: https://github.com/jansel
2023-03-29 01:27:31 +00:00
Bin Bao
a9a81ab7e3
[CI] Run benchmark test with dynamo_eager in periodic ( #97543 )
...
Summary: The idea is to catch any dynamo_eager regression earlier, and it also
lets us take that configuration off the dashboard run.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97543
Approved by: https://github.com/huydhn
2023-03-28 01:02:49 +00:00
Shunting Zhang
652592efa9
[inductor] use torch.profiler in the triton wrapper ( #97405 )
...
I think it's helpful to use torch.profiler to profile the triton wrapper.
E.g., I tried it for nvidia_deeprecommender's inference graph.
Even with max-autotune, we see that the majority of the time the GPU is running 2 mm/addmm ops. That's why max-autotune does not help for this model: tuning does not affect the external mm ops.
[profiler trace screenshot, captured 2023-03-22]
Next step: I'll check why the triton mm kernels are not picked.
EDIT: the screenshot above was captured without max-autotune due to a typo; below is the trace with max-autotune enabled:
[profiler trace screenshot, max-autotune enabled]
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97405
Approved by: https://github.com/ngimel
2023-03-27 21:54:25 +00:00
Edward Z. Yang
cff4826f28
pytorch_unet is now passing ( #97309 )
...
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97309
Approved by: https://github.com/janeyx99 , https://github.com/zou3519
2023-03-22 13:55:05 +00:00
Bin Bao
be49d3b170
[CI] Turn on debug logging for dla102 and gernet_l ( #97307 )
...
Summary: Log the generated code for those two flaky tests to see if
there is any codegen difference when they fail.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97307
Approved by: https://github.com/ezyang
2023-03-22 13:42:13 +00:00
Natalia Gimelshein
e7d9331688
[inductor] hoist symbolic padding expressions ( #97099 )
...
Towards fixing pnasnet5large; see #96709 . The generated kernel looks much better:
```
@pointwise(size_hints=[1048576], filename=__file__, meta={'signature': {0: '*fp32', 1: '*fp32', 2: 'i32', 3: 'i32', 4: 'i32', 5: 'i32', 6: 'i32'}, 'device': 0, 'constants': {}, 'mutated_arg_names': [], 'configs': [instance_descriptor(divisible_by_16=(0, 1, 6), equal_to_1=())]})
@triton.jit
def triton_(in_ptr0, out_ptr0, ks0, ks1, ks2, ks3, xnumel, XBLOCK : tl.constexpr):
xoffset = tl.program_id(0) * XBLOCK
xindex = xoffset + tl.arange(0, XBLOCK)[:]
xmask = xindex < xnumel
x1 = (xindex // ks0) % ks0
x0 = xindex % ks0
x2 = (xindex // ks3)
x4 = xindex
tmp0 = x1 + ((-1)*ks1)
tmp1 = 0
tmp2 = tmp0 >= tmp1
tmp3 = ks2
tmp4 = tmp0 < tmp3
tmp5 = x0 + ((-1)*ks1)
tmp6 = tmp5 >= tmp1
tmp7 = tmp5 < tmp3
tmp8 = tmp2 & tmp4
tmp9 = tmp8 & tmp6
tmp10 = tmp9 & tmp7
tmp11 = tl.load(in_ptr0 + (x0 + ((-1)*ks1) + (ks2*x1) + (x2*(ks2*ks2)) + ((-1)*ks1*ks2) + tl.zeros([XBLOCK], tl.int32)), tmp10 & xmask, other=0)
tmp12 = tl.where(tmp10, tmp11, 0.0)
tl.store(out_ptr0 + (x4 + tl.zeros([XBLOCK], tl.int32)), tmp12, xmask)
```
Interestingly, removing `expand` in the index `simplify` function makes the `load` expression a little better, but the `store` then fails to simplify to a flat store, so I'm leaving `expand` in.
Full pnasnet still chokes on `ceiling` in batch_norm kernels. Additionally, it looks like shape propagation goofs in inductor and generates overly complicated expressions; we should switch to the metadata from the fx graph.
I'm still not adding a `ceil` print to triton, because we should be able to hoist all indexing expressions (and just printing `ceil` without converting to int64 doesn't work).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97099
Approved by: https://github.com/jansel
2023-03-21 21:43:32 +00:00
Edward Z. Yang
e74c5e5637
rexnet_100 is disabled for static, does not need dynamic listing ( #97100 )
...
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97100
Approved by: https://github.com/Skylion007
2023-03-19 20:57:49 +00:00
Bin Bao
577d930c39
[CI] Revert https://github.com/pytorch/pytorch/pull/96195 ( #96897 )
...
Summary: https://github.com/pytorch/pytorch/pull/96195 was an experiment
for debugging flaky failures on CI.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96897
Approved by: https://github.com/ngimel
2023-03-16 06:28:18 +00:00
Edward Z. Yang
3606f59366
Default specialize_int to False ( #96624 )
...
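Roughly, this flag controls whether Python int arguments are burned into the graph as constants or treated symbolically; a sketch (config name as of this era of the codebase):
```python
import torch
import torch._dynamo

# With specialize_int=False, an int argument becomes a symbolic input
# instead of a constant that forces a recompile for every new value.
torch._dynamo.config.specialize_int = False

@torch.compile(dynamic=True)
def f(x, n):
    return x * n

f(torch.randn(4), 3)
f(torch.randn(4), 7)  # should reuse the compiled graph rather than recompile
```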
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96624
Approved by: https://github.com/janeyx99
2023-03-16 02:54:18 +00:00
Will Constable
54cd4a67d0
Output peak memory stats from dynamo torchbench perf CI ( #95666 )
...
Adds absolute memory usage numbers (in addition to compression ratio) to performance jobs.
Example output:
<img width="1211" alt="image" src="https://user-images.githubusercontent.com/4984825/225419950-500908c5-00ce-4711-afa2-c995bf90d35d.png ">
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95666
Approved by: https://github.com/ezyang , https://github.com/williamwen42
2023-03-15 19:24:47 +00:00
Bin Bao
33c7be360f
[reland][CI] switch torchbench to a pinned version ( #96782 )
...
Summary: This is a reland of https://github.com/pytorch/pytorch/pull/96553
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96782
Approved by: https://github.com/huydhn
2023-03-15 12:46:36 +00:00
Edward Z. Yang
037acd5a22
Update CI skips ( #96745 )
...
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96745
Approved by: https://github.com/wconstab
2023-03-14 22:19:10 +00:00
PyTorch MergeBot
be4eaa69c2
Revert "[CI] switch torchbench to a pinned version ( #96553 )"
...
This reverts commit 61d6ccd29a .
Reverted https://github.com/pytorch/pytorch/pull/96553 on behalf of https://github.com/desertfire due to land race
2023-03-14 21:39:45 +00:00
PyTorch MergeBot
ba4fb9b6ad
Revert "Default specialize_int to False ( #96624 )"
...
This reverts commit 1ac8782db2 .
Reverted https://github.com/pytorch/pytorch/pull/96624 on behalf of https://github.com/kit1980 due to Broke inductor/test_torchinductor_dynamic_shapes.py
2023-03-14 19:43:47 +00:00
Bin Bao
61d6ccd29a
[CI] switch torchbench to a pinned version ( #96553 )
...
Summary: Previously we were using a branch of torchbench which skips
torchaudio. We should switch to a pinned version to ensure good test coverage.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96553
Approved by: https://github.com/huydhn , https://github.com/ezyang
2023-03-14 18:42:22 +00:00
Edward Z. Yang
1ac8782db2
Default specialize_int to False ( #96624 )
...
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96624
Approved by: https://github.com/janeyx99
2023-03-14 18:37:47 +00:00
David Berard
6e3d51b08a
[inductor][CI] also skip rexnet_100 on non-dynamic shapes ( #96691 )
...
Recent failures show rexnet_100 accuracy is also flaky on non-dynamic shapes (it was already disabled for dynamic shapes in #96474 ). The failure occurs for the same reason (stem.bn.weight.grad).
e.g. https://github.com/pytorch/pytorch/actions/runs/4402868441/jobs/7710977874
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96691
Approved by: https://github.com/desertfire
2023-03-14 18:11:59 +00:00
Edward Z. Yang
ff7e510d1e
Correctly use PythonPrinter for generating wrapper code referencing sympy ( #96710 )
...
Otherwise you get stuff like `ceiling(s0)`, which is not valid Python code. Fixes volo_d1_224.
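The underlying issue is that sympy's default printer emits names like `ceiling` that don't exist in Python; a minimal illustration with plain sympy (inductor's PythonPrinter plays the analogous role for wrapper codegen):
```python
import sympy

s0 = sympy.Symbol("s0")
expr = sympy.ceiling(s0 / 2)

print(expr)               # e.g. 'ceiling(s0/2)' -- not valid Python
print(sympy.pycode(expr)) # e.g. 'math.ceil((1/2)*s0)' -- valid Python
```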
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96710
Approved by: https://github.com/ngimel , https://github.com/jansel
2023-03-14 14:35:52 +00:00
Wang, Eikan
3cad8d23d0
[Inductor] Skip the hf_T5_base due to intermittent failure on CI ( #96649 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96649
Approved by: https://github.com/desertfire
2023-03-14 07:40:20 +00:00
Edward Z. Yang
507feb805f
Don't specialize torch.Size with specialize_int = False ( #96419 )
...
Fixes https://github.com/pytorch/pytorch/issues/95868
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96419
Approved by: https://github.com/jansel , https://github.com/ngimel
2023-03-14 01:32:58 +00:00
Edward Z. Yang
c7f39c0820
Update CI skips ( #96554 )
...
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96554
Approved by: https://github.com/janeyx99
2023-03-13 13:40:45 +00:00
David Berard
29cd60dfb7
[CI] handle more dynamo benchmark models that are not expected to be deterministic ( #96324 )
...
Follow-up to #96245 . alexnet, Background_Matting, vision_maskrcnn, and vgg16 all have the same problem, but on float32 they were also failing on the previous day, so I missed this. Once the amp jobs became available, I could see that these have the same issue (on both float32 and amp).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96324
Approved by: https://github.com/desertfire
2023-03-10 18:15:34 +00:00
Bin Bao
a651e6253a
[CI] Change compile_threads to 1 when running benchmark accuracy test on CI ( #96195 )
...
Summary: This is not a pretty solution, but it is a way to verify whether the flakiness is coming from parallel compilation.
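The knob in question can also be set directly (config name as of this era of the codebase):
```python
import torch._inductor.config as inductor_config

# Serialize inductor's kernel compilation to rule out races from
# parallel compilation when chasing flaky accuracy results.
inductor_config.compile_threads = 1
```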
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96195
Approved by: https://github.com/ngimel
2023-03-10 17:39:38 +00:00
Edward Z. Yang
ff2e14f200
Skip rexnet_100 in dynamic CI ( #96474 )
...
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96474
Approved by: https://github.com/yanboliang , https://github.com/msaroufim
2023-03-10 01:23:19 +00:00
Edward Z. Yang
c988de1040
[EASY] Update inductor training dynamic skips ( #96298 )
...
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96298
Approved by: https://github.com/Chillee , https://github.com/janeyx99
2023-03-08 19:31:46 +00:00
Bin Bao
b3a079810e
[CI] Add a workflow for quick perf comparison ( #96166 )
...
Summary: ciflow/inductor-perf-test-nightly now contains a full dashboard
run, which takes a very long time. Ed proposed a simplification of the
perf run there, but it is still worthwhile to have a set of fast perf tests
which only includes one configuration (--training --amp).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96166
Approved by: https://github.com/huydhn , https://github.com/weiwangmeta
2023-03-08 19:09:04 +00:00
Bin Bao
664381b293
[CI] Avoid calling torch.use_deterministic_algorithms for some models tests ( #96245 )
...
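For context, the call being skipped toggles PyTorch's global determinism checks:
```python
import torch

# Makes known-nondeterministic ops raise instead of silently varying;
# some benchmark models cannot run under this setting, so the CI
# harness skips the call for them.
torch.use_deterministic_algorithms(True)
```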
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96245
Approved by: https://github.com/davidberard98
2023-03-08 03:35:32 +00:00
Edward Z. Yang
d0641ed247
[TEST] Turn on unspecialize int dynamic training inductor CI ( #96058 )
...
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96058
Approved by: https://github.com/janeyx99 , https://github.com/voznesenskym
2023-03-07 16:08:45 +00:00
Edward Z. Yang
a6e3e7905e
Turn on unspecialize int dynamic inductor CI ( #96034 )
...
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96034
Approved by: https://github.com/voznesenskym
2023-03-07 12:39:55 +00:00
Jason Ansel
95d17dc93d
[inductor] Reland #95567 part 1 ( #96023 )
...
This is the non-problematic part of #95567 . The errors were coming from
IR printing changes which will be next in the stack.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96023
Approved by: https://github.com/ngimel , https://github.com/mlazos
2023-03-06 22:57:22 +00:00
Edward Z. Yang
1fd7ea1ba8
Update skips for RecursionError ( #96109 )
...
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96109
Approved by: https://github.com/huydhn
2023-03-06 17:55:38 +00:00
Bin Bao
60cf95610d
[CI] Skip xcit_large_24_p8_224 in TIMM ( #96048 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96048
Approved by: https://github.com/jansel
2023-03-05 14:54:46 +00:00
Bin Bao
1359d16fe8
[CI] Further tighten the checking of two eager runs ( #95902 )
...
Summary: To catch nondeterminism in eager if there is any.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95902
Approved by: https://github.com/jansel
2023-03-05 14:53:02 +00:00
Edward Z. Yang
c7c4a20321
Update dynamic skips ( #95966 )
...
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95966
Approved by: https://github.com/janeyx99 , https://github.com/voznesenskym
2023-03-04 23:01:58 +00:00