Commit Graph

22 Commits

Author SHA1 Message Date
PyTorch MergeBot
a7bfa04da6 Revert "More markDynamoStrictTest (#115870)"
This reverts commit 7f686c8fe1.

Reverted https://github.com/pytorch/pytorch/pull/115870 on behalf of https://github.com/jeanschmidt due to Breaking internal tests and builds, please check diff ([comment](https://github.com/pytorch/pytorch/pull/115870#issuecomment-1862997125))
2023-12-19 15:40:57 +00:00
rzou
7f686c8fe1 More markDynamoStrictTest (#115870)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115870
Approved by: https://github.com/voznesenskym
ghstack dependencies: #115845, #115855, #115856, #115857, #115858
2023-12-15 05:26:54 +00:00
Jithun Nair
2ea2421b44 Skip unit tests that fail on MI210 runners (#114613)
Taken from https://github.com/pytorch/pytorch/pull/105980
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114613
Approved by: https://github.com/malfet
2023-11-27 22:25:35 +00:00
rraminen
44367c59b2 Update skip reason for failing unit tests on ROCm 5.7 (#113286)
Follow up to https://github.com/pytorch/pytorch/pull/110465. Updated skip reason for failing unit tests on ROCm 5.7

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113286
Approved by: https://github.com/malfet
2023-11-13 19:29:04 +00:00
rraminen
3a429423fc Upgrade CI to ROCm5.7 (#110465)
This PR is to upgrade CI to ROCm5.7

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110465
Approved by: https://github.com/pruthvistony, https://github.com/malfet
2023-11-08 06:11:10 +00:00
Pruthvi Madugundu
9ce2e02fd6 Revert "[ROCm] Remove PYTORCH_MIOPEN_SUGGEST_NHWC flag (#90725)" (#110319)
This reverts commit 66bfcd32fd.

NHWC is have perf regression on MIOpen, so reverting till the performance issue is fixed.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110319
Approved by: https://github.com/jeffdaily, https://github.com/jithunnair-amd, https://github.com/kit1980
2023-10-03 19:14:47 +00:00
CaoE
7c9052165a add fp16 support for native conv and deconv on CPU (#99497)
### Testing

Native conv vs. mkldnn conv on SPR (with avx512_fp16 support)

Single core:

Input | Naïve impl   / us | oneDNN /   us | Speed up
-- | -- | -- | --
IC:   64, OC: 256, kernel: 1, stride: 1, N: 256, H: 56, W: 56, G: 1, pad: 0 | 34676789 | 524199.8 | 66.15185
IC:   128, OC: 512, kernel: 1, stride: 1, N: 256, H: 28, W: 28, G: 1, pad: 0 | 33454125 | 349844.4 | 95.62573
IC: 256, OC: 256, kernel: 3, stride: 1,   N: 1, H: 16, W: 16, G: 1, pad: 0 | 317650.1 | 2317.677 | 137.0554
IC: 128, OC: 256, kernel: 3, stride: 1,   N: 1, L: 64 | 15334.68 | 167.264 | 91.67952

56 cores:
Input | Naïve impl   / us | oneDNN /   us | Speed up
-- | -- | -- | --
IC:   64, OC: 256, kernel: 1, stride: 1, N: 256, H: 56, W: 56, G: 1, pad: 0 | 1032064 | 11073.58 | 93.20061
IC:   128, OC: 512, kernel: 1, stride: 1, N: 256, H: 28, W: 28, G: 1, pad: 0 | 1000097 | 16371.19 | 61.08883
IC:   256, OC: 1024, kernel: 1, stride: 1, N: 256, H: 14, W: 14, G: 1, pad: 0 | 981813.4 | 9008.908 | 108.9825
IC: 1024, OC: 256, kernel: 1, stride: 1,   N: 256, H: 14, W: 14, G: 1, pad: 0 | 1082606 | 10150.47 | 106.6558
IC: 256, OC: 256, kernel: 3, stride: 1,   N: 1, H: 16, W: 16, G: 1, pad: 0 | 319980.6 | 181.598 | 1762.027

Pull Request resolved: https://github.com/pytorch/pytorch/pull/99497
Approved by: https://github.com/jgong5, https://github.com/cpuhrsch
2023-09-25 01:31:26 +00:00
Justin Chu
79c5e33349 [BE] Enable ruff's UP rules and autoformat nn/ mps/ and torch/ (#105436)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105436
Approved by: https://github.com/malfet, https://github.com/albanD
2023-07-21 07:38:46 +00:00
Fuzzkatt
6d570ccd59 tf32 context fixes for various tests (#103137)
Addresses tf32 context related failures from NVIDIA internal testing for following unit tests:

H100:

- functorch/test_vmap.py: test_op_has_batch_rule

A100:

- test_expanded_weights.py: test_cnn_model_sum
- nn/test_convolution.py: test_conv2d_same_padding_backward

Pull Request resolved: https://github.com/pytorch/pytorch/pull/103137
Approved by: https://github.com/zou3519
2023-06-15 02:33:12 +00:00
Fuzzkatt
f8896b7b0e update tf32 thresholds in nn/test_convolution.py (#102015)
updated tf32 thresholds for test_cudnn_convolution_relu, test_cudnn_convolution_add_relu

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102015
Approved by: https://github.com/ngimel
2023-05-24 22:42:25 +00:00
Fuzzkatt
47e9dba765 move tf32_on_and_off fix for test_convolution.py (#102007)
move tf32_on_and_off after  @torch.backends.cudnn.flags(enabled=True, benchmark=False) due to  @torch.backends.cudnn.flags(enabled=True, benchmark=False) overwriting tf32_on_and_off if after.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102007
Approved by: https://github.com/ngimel
2023-05-24 02:23:06 +00:00
kshitij12345
3b966a6ce3 [autograd] disable backward/grad for complex scalar output (#92753)
Fixes https://github.com/pytorch/pytorch/issues/92750

Pull Request resolved: https://github.com/pytorch/pytorch/pull/92753
Approved by: https://github.com/ezyang
2023-02-23 11:38:27 +00:00
Jeff Daily
66bfcd32fd [ROCm] Remove PYTORCH_MIOPEN_SUGGEST_NHWC flag (#90725)
Fixes #64427.  MIOpen supports ChannelsLast.  No longer need to opt-in with env var.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90725
Approved by: https://github.com/malfet
2023-02-09 22:26:24 +00:00
mingfeima
26cba842ad Optimize ConvTransposed2D with mkldnn float32 and bfloat16 on CPU (#92530)
this PR optimized `ConvTranspose2d` with oneDNN and add channels last support for it. Also the fallback path `slow_conv_transpose2d` also have channels last support. So the memory format propagation behavior would stay the same with or without oneDNN.

Replacement of https://github.com/pytorch/pytorch/pull/77060, https://github.com/pytorch/pytorch/pull/70897 and https://github.com/pytorch/pytorch/pull/74023 which enables oneDNN for `ConvTranspose2d` and `ConvTranspose3d`

The following results collects on Skylake Xeon 8180, dual sockets, 28 cores per socket.
### single core channels last

configs | forward before/ms | forward after/ms | ratio | backward   before/ms | backward after/ms | ratio
-- | -- | -- | -- | -- | -- | --
input size: (32, 32, 100, 100), weight size: (32, 32, 3, 3) | 181.36 | 91.16 | 1.99 | 531.38 | 124.08 | 4.28
input size:   (32, 16, 200, 200), weight size: (16, 16, 3, 3) | 324.35 | 153.50 | 2.11 | 973.16 | 185.97 | 5.23
input size:   (32, 128, 100, 100), weight size: (128, 128, 3, 3) | 1086.82 | 671.52 | 1.62 | 3008.94 | 1453.33 | 2.07

### single core channels first

configs | forward before/ms | forward after/ms | ratio | backward   before/ms | backward after/ms | ratio
-- | -- | -- | -- | -- | -- | --
input size: (32, 32, 100, 100), weight size: (32, 32, 3, 3) | 138.10 | 5.94 | 23.23 | 37.97 | 11.25 | 3.38
input size:   (32, 16, 200, 200), weight size: (16, 16, 3, 3) | 236.43 | 8.75 | 27.03 | 87.77 | 18.58 | 4.72
input size:   (32, 128, 100, 100), weight size: (128, 128, 3, 3) | 484.39 | 37.69 | 12.85 | 185.40 | 90.57 | 2.05

### single socket channels last

configs | forward before/ms | forward after/ms | ratio | backward   before/ms | backward after/ms | ratio
-- | -- | -- | -- | -- | -- | --
input size: (32, 32, 100, 100), weight size: (32, 32, 3, 3) | 138.10 | 5.94 | 23.23 | 37.97 | 11.25 | 3.38
input size:   (32, 16, 200, 200), weight size: (16, 16, 3, 3) | 236.43 | 8.75 | 27.03 | 87.77 | 18.58 | 4.72
input size:   (32, 128, 100, 100), weight size: (128, 128, 3, 3) | 484.39 | 37.69 | 12.85 | 185.40 | 90.57 | 2.0

### single socket channels first

configs | forward before/ms | forward after/ms | ratio | backward   before/ms | backward after/ms | ratio
-- | -- | -- | -- | -- | -- | --
input size: (32, 32, 100,   100), weight size: (32, 32, 3, 3) | 132.56 | 7.19 | 18.43 | 31.43 | 11.20 | 2.81
input size:   (32, 16, 200, 200), weight size: (16, 16, 3, 3) | 227.94 | 13.33 | 17.11 | 63.00 | 23.41 | 2.69
input size:   (32, 128, 100, 100), weight size: (128, 128, 3, 3) | 473.68 | 52.79 | 8.97 | 150.40 | 87.33 | 1.72

Pull Request resolved: https://github.com/pytorch/pytorch/pull/92530
Approved by: https://github.com/jgong5, https://github.com/ezyang
2023-02-06 10:11:25 +00:00
Jeff Daily
72502b94f3 correct use of torch.backends.cudnn.flags() (#93182)
Fixes #77467.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/93182
Approved by: https://github.com/ngimel
2023-01-28 06:50:06 +00:00
Eddie Yan
dabf515c18 [cuDNN][cuDNN V8 API] (re-re-re-open) cuDNN V8 API on by default (#91117)
Re-opening following #91025

CC @ptrblck @ngimel
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91117
Approved by: https://github.com/ngimel
2022-12-20 18:52:29 +00:00
PyTorch MergeBot
ba7aeac37b Revert "[cuDNN][cuDNN V8 API] (re-re-open) cuDNN V8 API on by default (#89022)"
This reverts commit eecd621f06.

Reverted https://github.com/pytorch/pytorch/pull/89022 on behalf of https://github.com/ngimel due to breaks some convolution configurations #91025
2022-12-16 23:06:35 +00:00
Eddie Yan
eecd621f06 [cuDNN][cuDNN V8 API] (re-re-open) cuDNN V8 API on by default (#89022)
Testing V8 on by default again after fixes have been merged for e.g., https://github.com/pytorch/torchdynamo/issues/1833

One new failure that seems to be surfaced with V8 on appears in halonext + amp
```
RuntimeError: Internal Triton PTX codegen error:
Segmentation fault (core dumped)
```
But I'm not sure if this points to a V8 issue or a Triton issue CC @ngimel @ptrblck

Current dynamo benchmarks on A100:
v7 vs. v8
|dev |name                           |batch_size|abs_latency_v7|abs_latency_v8|
|----|-------------------------------|----------|--------------|--------------|
|cuda|adv_inception_v3               |128       |166.0240      |165.5798      |
|cuda|beit_base_patch16_224          |64        |123.5912      |123.0797      |
|cuda|botnet26t_256                  |128       |107.7343      |107.5948      |
|cuda|cait_m36_384                   |4         |184.5038      |184.0271      |
|cuda|coat_lite_mini                 |128       |142.3061      |140.5814      |
|cuda|convit_base                    |64        |165.2499      |161.0743      |
|cuda|convmixer_768_32               |32        |325.6984      |325.7094      |
|cuda|convnext_base                  |64        |237.4632      |238.0142      |
|cuda|crossvit_9_240                 |128       |72.2980       |72.4367       |
|cuda|cspdarknet53                   |64        |96.6862       |96.8308       |
|cuda|deit_base_distilled_patch16_224|64        |117.6045      |117.9616      |
|cuda|dla102                         |128       |182.3073      |182.2304      |
|cuda|dm_nfnet_f0                    |128       |133.6011      |133.6298      |
|cuda|dpn107                         |32        |148.5080      |148.5885      |
|cuda|eca_botnext26ts_256            |128       |113.8676      |113.1514      |
|cuda|eca_halonext26ts               |128       |119.2242      |119.1845      |
|cuda|ese_vovnet19b_dw               |128       |80.0217       |79.9438       |
|cuda|fbnetc_100                     |128       |91.4548       |91.4009       |
|cuda|fbnetv3_b                      |128       |115.4496      |115.5058      |
|cuda|gernet_l                       |128       |114.8365      |114.7870      |
|cuda|ghostnet_100                   |128       |58.5766       |58.5766       |
|cuda|gluon_inception_v3             |128       |165.5222      |165.7167      |
|cuda|gluon_xception65               |32        |165.8779      |165.7818      |
|cuda|gmixer_24_224                  |128       |116.3611      |113.4925      |
|cuda|gmlp_s16_224                   |128       |121.2607      |121.2534      |
|cuda|hrnet_w18                      |128       |246.5706      |246.7599      |
|cuda|inception_v3                   |128       |166.1096      |166.2034      |
|cuda|jx_nest_base                   |32        |93.6064       |93.4088       |
|cuda|lcnet_050                      |128       |21.4156       |21.4207       |
|cuda|levit_128                      |128       |27.2901       |27.2543       |
|cuda|mixer_b16_224                  |128       |157.8992      |158.2878      |
|cuda|mixnet_l                       |128       |197.3443      |197.2125      |
|cuda|mnasnet_100                    |128       |71.4604       |71.2997       |
|cuda|mobilenetv2_100                |128       |67.6080       |67.7515       |
|cuda|mobilenetv3_large_100          |128       |57.7224       |57.6591       |
|cuda|mobilevit_s                    |64        |93.0372       |93.0530       |
|cuda|nfnet_l0                       |128       |113.1664      |113.2853      |
|cuda|pit_b_224                      |64        |133.3333      |133.4153      |
|cuda|pnasnet5large                  |16        |238.9545      |238.8122      |
|cuda|poolformer_m36                 |64        |144.2353      |144.2375      |
|cuda|regnety_002                    |128       |32.8534       |32.9069       |
|cuda|repvgg_a2                      |128       |102.4150      |102.3827      |
|cuda|res2net101_26w_4s              |64        |120.8127      |120.8322      |
|cuda|res2net50_14w_8s               |128       |149.7052      |149.8969      |
|cuda|res2next50                     |128       |153.7439      |153.8215      |
|cuda|resmlp_12_224                  |128       |89.1918       |86.9226       |
|cuda|resnest101e                    |64        |159.4706      |159.3133      |
|cuda|rexnet_100                     |128       |88.0032       |88.0397       |
|cuda|sebotnet33ts_256               |64        |80.4635       |80.0120       |
|cuda|selecsls42b                    |128       |70.4430       |70.3663       |
|cuda|spnasnet_100                   |128       |78.0537       |78.1991       |
|cuda|swin_base_patch4_window7_224   |64        |212.9073      |213.0824      |
|cuda|swsl_resnext101_32x16d         |32        |193.0229      |193.0404      |
|cuda|tf_efficientnet_b0             |128       |97.1316       |97.0410       |
|cuda|tf_mixnet_l                    |128       |203.4956      |203.5340      |
|cuda|tinynet_a                      |128       |82.4038       |82.8733       |
|cuda|tnt_s_patch16_224              |128       |284.8576      |284.8867      |
|cuda|twins_pcpvt_base               |64        |118.3893      |119.2329      |
|cuda|visformer_small                |128       |126.0533      |126.0390      |
|cuda|vit_base_patch16_224           |64        |118.2873      |118.0573      |
|cuda|volo_d1_224                    |64        |108.7764      |108.2063      |
|cuda|xcit_large_24_p8_224           |5         |100.4656      |100.5209      |

v7 vs. v8 amp

|dev |name                           |batch_size|abs_latency_v7|abs_latency_v8|
|----|-------------------------------|----------|--------------|--------------|
|cuda|adv_inception_v3               |128       |104.9729      |105.1237      |
|cuda|beit_base_patch16_224          |64        |75.4330       |75.2039       |
|cuda|botnet26t_256                  |128       |74.5149       |74.8071       |
|cuda|cait_m36_384                   |4         |110.9788      |111.5170      |
|cuda|coat_lite_mini                 |128       |62.3618       |64.4965       |
|cuda|convit_base                    |64        |116.4054      |117.9129      |
|cuda|convmixer_768_32               |32        |264.4401      |264.4491      |
|cuda|convnext_base                  |64        |182.9009      |179.2136      |
|cuda|crossvit_9_240                 |128       |48.8586       |48.8359       |
|cuda|cspdarknet53                   |64        |80.0245       |80.0160       |
|cuda|deit_base_distilled_patch16_224|64        |66.5921       |66.7448       |
|cuda|dla102                         |128       |116.7780      |117.1683      |
|cuda|dm_nfnet_f0                    |128       |78.9322       |79.1135       |
|cuda|dpn107                         |32        |85.5206       |85.7514       |
|cuda|eca_botnext26ts_256            |128       |76.3672       |77.0050       |
|cuda|eca_halonext26ts               |128       |86.2458       |              |
|cuda|ese_vovnet19b_dw               |128       |43.2943       |43.3379       |
|cuda|fbnetc_100                     |128       |54.8479       |54.9251       |
|cuda|fbnetv3_b                      |128       |70.7504       |71.0188       |
|cuda|gernet_l                       |128       |66.1607       |66.0379       |
|cuda|ghostnet_100                   |128       |43.8882       |43.9336       |
|cuda|gluon_inception_v3             |128       |104.9297      |105.0204      |
|cuda|gluon_xception65               |32        |85.7118       |85.8370       |
|cuda|gmixer_24_224                  |128       |75.1214       |76.1170       |
|cuda|gmlp_s16_224                   |128       |76.4207       |76.6641       |
|cuda|hrnet_w18                      |128       |186.1326      |186.2435      |
|cuda|inception_v3                   |128       |105.0561      |105.0783      |
|cuda|jx_nest_base                   |32        |65.3066       |65.3245       |
|cuda|lcnet_050                      |128       |14.7991       |14.8687       |
|cuda|levit_128                      |128       |19.2893       |19.4772       |
|cuda|mixer_b16_224                  |128       |93.9826       |94.2056       |
|cuda|mixnet_l                       |128       |147.1245      |147.0435      |
|cuda|mnasnet_100                    |128       |39.1781       |39.2565       |
|cuda|mobilenetv2_100                |128       |42.3704       |42.3114       |
|cuda|mobilenetv3_large_100          |128       |37.2946       |37.2816       |
|cuda|mobilevit_s                    |64        |55.8930       |55.8934       |
|cuda|nfnet_l0                       |128       |64.0448       |64.4438       |
|cuda|pit_b_224                      |64        |80.6342       |80.2933       |
|cuda|pnasnet5large                  |16        |154.9611      |154.8654      |
|cuda|poolformer_m36                 |64        |101.7489      |101.8138      |
|cuda|regnety_002                    |128       |27.0939       |27.0309       |
|cuda|repvgg_a2                      |128       |60.9651       |61.2533       |
|cuda|res2net101_26w_4s              |64        |77.3291       |77.4739       |
|cuda|res2net50_14w_8s               |128       |93.6572       |93.7221       |
|cuda|res2next50                     |128       |112.4975      |112.3248      |
|cuda|resmlp_12_224                  |128       |59.5422       |60.7644       |
|cuda|resnest101e                    |64        |97.9894       |98.3358       |
|cuda|rexnet_100                     |128       |55.2218       |55.0718       |
|cuda|sebotnet33ts_256               |64        |60.4880       |60.8113       |
|cuda|selecsls42b                    |128       |41.4294       |41.5341       |
|cuda|spnasnet_100                   |128       |45.0037       |45.0304       |
|cuda|swin_base_patch4_window7_224   |64        |98.2561       |98.6925       |
|cuda|swsl_resnext101_32x16d         |32        |100.6179      |100.9195      |
|cuda|tf_efficientnet_b0             |128       |56.5344       |56.4591       |
|cuda|tf_mixnet_l                    |128       |153.0318      |152.9367      |
|cuda|tinynet_a                      |128       |54.1307       |53.9298       |
|cuda|tnt_s_patch16_224              |128       |142.4801      |142.6589      |
|cuda|twins_pcpvt_base               |64        |67.9027       |67.8325       |
|cuda|visformer_small                |128       |72.5589       |72.9427       |
|cuda|vit_base_patch16_224           |64        |71.4885       |71.7342       |
|cuda|volo_d1_224                    |64        |69.3539       |69.5910       |
|cuda|xcit_large_24_p8_224           |5         |59.9000       |59.9699       |

v7 vs. v8 float16
|dev |name                           |batch_size|abs_latency|abs_latency|
|----|-------------------------------|----------|-----------|-----------|
|cuda|adv_inception_v3               |128       |104.2544   |104.2677   |
|cuda|beit_base_patch16_224          |64        |85.3601    |85.3786    |
|cuda|botnet26t_256                  |128       |72.1476    |71.8277    |
|cuda|cait_m36_384                   |4         |108.3075   |108.5941   |
|cuda|coat_lite_mini                 |128       |61.2382    |61.6049    |
|cuda|convmixer_768_32               |32        |263.3818   |263.3598   |
|cuda|convnext_base                  |64        |172.6821   |173.8520   |
|cuda|crossvit_9_240                 |128       |44.6321    |44.6340    |
|cuda|cspdarknet53                   |64        |79.3165    |79.2964    |
|cuda|deit_base_distilled_patch16_224|64        |61.9816    |62.2109    |
|cuda|dla102                         |128       |115.7403   |115.9928   |
|cuda|dm_nfnet_f0                    |128       |77.5434    |77.7440    |
|cuda|dpn107                         |32        |83.6489    |83.5605    |
|cuda|eca_botnext26ts_256            |128       |73.9953    |74.1031    |
|cuda|eca_halonext26ts               |128       |81.7951    |81.7103    |
|cuda|ese_vovnet19b_dw               |128       |42.9618    |42.8853    |
|cuda|fbnetc_100                     |128       |54.3590    |54.3575    |
|cuda|fbnetv3_b                      |128       |69.7977    |70.1696    |
|cuda|gernet_l                       |128       |64.8684    |65.1726    |
|cuda|ghostnet_100                   |128       |43.2054    |43.1319    |
|cuda|gluon_inception_v3             |128       |104.1988   |104.3030   |
|cuda|gluon_xception65               |32        |84.2245    |84.5085    |
|cuda|gmixer_24_224                  |128       |82.0418    |82.7252    |
|cuda|gmlp_s16_224                   |128       |75.4792    |75.8374    |
|cuda|hrnet_w18                      |128       |184.1450   |184.1848   |
|cuda|inception_v3                   |128       |104.1203   |104.2536   |
|cuda|jx_nest_base                   |32        |58.2386    |58.4901    |
|cuda|lcnet_050                      |128       |14.6409    |14.5616    |
|cuda|levit_128                      |128       |22.3875    |22.4680    |
|cuda|mixer_b16_224                  |128       |98.9534    |98.4730    |
|cuda|mixnet_l                       |128       |146.1623   |146.1947   |
|cuda|mnasnet_100                    |128       |38.9208    |39.3463    |
|cuda|mobilenetv2_100                |128       |41.8946    |41.9847    |
|cuda|mobilenetv3_large_100          |128       |36.7810    |36.8264    |
|cuda|mobilevit_s                    |64        |55.3211    |55.3186    |
|cuda|nfnet_l0                       |128       |63.1302    |63.5544    |
|cuda|pit_b_224                      |64        |73.8752    |73.4602    |
|cuda|pnasnet5large                  |16        |151.6806   |151.6111   |
|cuda|poolformer_m36                 |64        |86.8341    |86.8021    |
|cuda|regnety_002                    |128       |26.6798    |26.5295    |
|cuda|repvgg_a2                      |128       |61.6652    |62.1482    |
|cuda|res2net101_26w_4s              |64        |75.8037    |75.7739    |
|cuda|res2net50_14w_8s               |128       |92.6362    |92.4338    |
|cuda|res2next50                     |128       |111.5371   |111.5832   |
|cuda|resmlp_12_224                  |128       |58.2349    |57.9807    |
|cuda|resnest101e                    |64        |96.1114    |96.2742    |
|cuda|rexnet_100                     |128       |54.8138    |54.7643    |
|cuda|sebotnet33ts_256               |64        |53.1524    |53.3823    |
|cuda|selecsls42b                    |128       |40.6070    |40.7104    |
|cuda|spnasnet_100                   |128       |44.5732    |44.4318    |
|cuda|swin_base_patch4_window7_224   |64        |98.6447    |98.8445    |
|cuda|swsl_resnext101_32x16d         |32        |97.0195    |97.2968    |
|cuda|tf_efficientnet_b0             |128       |56.0640    |56.0278    |
|cuda|tf_mixnet_l                    |128       |152.0958   |152.0874   |
|cuda|tinynet_a                      |128       |53.3694    |53.3762    |
|cuda|tnt_s_patch16_224              |128       |130.2981   |130.3726   |
|cuda|twins_pcpvt_base               |64        |62.5459    |62.6416    |
|cuda|visformer_small                |128       |68.8502    |69.1756    |
|cuda|vit_base_patch16_224           |64        |65.8587    |66.0285    |
|cuda|volo_d1_224                    |64        |64.5348    |64.6057    |

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89022
Approved by: https://github.com/ngimel
2022-12-15 03:24:44 +00:00
kshitij12345
6a964c16e5 [flaky] relax tolerance conv1d_vs_scipy (#89193)
Fixes https://github.com/pytorch/pytorch/issues/89087

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89193
Approved by: https://github.com/kit1980
2022-11-18 07:31:10 +00:00
PyTorch MergeBot
d98a884b33 Revert "[cuDNN] (re-open) Enable cuDNN Frontend v8 API by Default (#87669)"
This reverts commit 3c6bddc3f6.

Reverted https://github.com/pytorch/pytorch/pull/87669 on behalf of https://github.com/eqy due to investigating convnext benchmark regressions
2022-11-08 19:04:25 +00:00
eqy
3c6bddc3f6 [cuDNN] (re-open) Enable cuDNN Frontend v8 API by Default (#87669)
#58414

Has a small tweak to a test that was breaking on A10 (CC @malfet).

CC @ptrblck @ngimel
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87669
Approved by: https://github.com/ngimel
2022-11-02 01:36:37 +00:00
Kshiteej K
6735bf21c7 [test_nn] split convolution tests from test_nn (#87474)
Ref #63085

Pull Request resolved: https://github.com/pytorch/pytorch/pull/87474
Approved by: https://github.com/albanD
2022-10-31 04:42:45 +00:00