PyTorch MergeBot
a7bfa04da6
Revert "More markDynamoStrictTest (#115870)"
...
This reverts commit 7f686c8fe1.
Reverted https://github.com/pytorch/pytorch/pull/115870 on behalf of https://github.com/jeanschmidt due to Breaking internal tests and builds, please check diff ([comment](https://github.com/pytorch/pytorch/pull/115870#issuecomment-1862997125))
2023-12-19 15:40:57 +00:00
rzou
7f686c8fe1
More markDynamoStrictTest (#115870)
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115870
Approved by: https://github.com/voznesenskym
ghstack dependencies: #115845, #115855, #115856, #115857, #115858
2023-12-15 05:26:54 +00:00
Jithun Nair
2ea2421b44
Skip unit tests that fail on MI210 runners (#114613)
...
Taken from https://github.com/pytorch/pytorch/pull/105980
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114613
Approved by: https://github.com/malfet
2023-11-27 22:25:35 +00:00
rraminen
44367c59b2
Update skip reason for failing unit tests on ROCm 5.7 (#113286)
...
Follow-up to https://github.com/pytorch/pytorch/pull/110465. Updated the skip reason for unit tests that fail on ROCm 5.7.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113286
Approved by: https://github.com/malfet
2023-11-13 19:29:04 +00:00
rraminen
3a429423fc
Upgrade CI to ROCm 5.7 (#110465)
...
This PR upgrades CI to ROCm 5.7.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110465
Approved by: https://github.com/pruthvistony, https://github.com/malfet
2023-11-08 06:11:10 +00:00
Pruthvi Madugundu
9ce2e02fd6
Revert "[ROCm] Remove PYTORCH_MIOPEN_SUGGEST_NHWC flag (#90725)" (#110319)
...
This reverts commit 66bfcd32fd.
NHWC has a performance regression on MIOpen, so this is reverted until the issue is fixed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110319
Approved by: https://github.com/jeffdaily, https://github.com/jithunnair-amd, https://github.com/kit1980
2023-10-03 19:14:47 +00:00
CaoE
7c9052165a
add fp16 support for native conv and deconv on CPU (#99497)
...
### Testing
Native conv vs. mkldnn conv on SPR (with avx512_fp16 support)
Single core:
Input | Naïve impl / us | oneDNN / us | Speed up
-- | -- | -- | --
IC: 64, OC: 256, kernel: 1, stride: 1, N: 256, H: 56, W: 56, G: 1, pad: 0 | 34676789 | 524199.8 | 66.15185
IC: 128, OC: 512, kernel: 1, stride: 1, N: 256, H: 28, W: 28, G: 1, pad: 0 | 33454125 | 349844.4 | 95.62573
IC: 256, OC: 256, kernel: 3, stride: 1, N: 1, H: 16, W: 16, G: 1, pad: 0 | 317650.1 | 2317.677 | 137.0554
IC: 128, OC: 256, kernel: 3, stride: 1, N: 1, L: 64 | 15334.68 | 167.264 | 91.67952
56 cores:
Input | Naïve impl / us | oneDNN / us | Speed up
-- | -- | -- | --
IC: 64, OC: 256, kernel: 1, stride: 1, N: 256, H: 56, W: 56, G: 1, pad: 0 | 1032064 | 11073.58 | 93.20061
IC: 128, OC: 512, kernel: 1, stride: 1, N: 256, H: 28, W: 28, G: 1, pad: 0 | 1000097 | 16371.19 | 61.08883
IC: 256, OC: 1024, kernel: 1, stride: 1, N: 256, H: 14, W: 14, G: 1, pad: 0 | 981813.4 | 9008.908 | 108.9825
IC: 1024, OC: 256, kernel: 1, stride: 1, N: 256, H: 14, W: 14, G: 1, pad: 0 | 1082606 | 10150.47 | 106.6558
IC: 256, OC: 256, kernel: 3, stride: 1, N: 1, H: 16, W: 16, G: 1, pad: 0 | 319980.6 | 181.598 | 1762.027
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99497
Approved by: https://github.com/jgong5, https://github.com/cpuhrsch
2023-09-25 01:31:26 +00:00
Justin Chu
79c5e33349
[BE] Enable ruff's UP rules and autoformat nn/ mps/ and torch/ (#105436)
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105436
Approved by: https://github.com/malfet, https://github.com/albanD
2023-07-21 07:38:46 +00:00
Fuzzkatt
6d570ccd59
tf32 context fixes for various tests (#103137)
...
Addresses tf32 context related failures from NVIDIA internal testing for following unit tests:
H100:
- functorch/test_vmap.py: test_op_has_batch_rule
A100:
- test_expanded_weights.py: test_cnn_model_sum
- nn/test_convolution.py: test_conv2d_same_padding_backward
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103137
Approved by: https://github.com/zou3519
2023-06-15 02:33:12 +00:00
Fuzzkatt
f8896b7b0e
update tf32 thresholds in nn/test_convolution.py (#102015)
...
Updated tf32 thresholds for test_cudnn_convolution_relu and test_cudnn_convolution_add_relu.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/102015
Approved by: https://github.com/ngimel
2023-05-24 22:42:25 +00:00
Fuzzkatt
47e9dba765
move tf32_on_and_off fix for test_convolution.py (#102007)
...
Moves tf32_on_and_off after @torch.backends.cudnn.flags(enabled=True, benchmark=False), because the flags decorator overwrites tf32_on_and_off's settings when applied after it.
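The ordering issue can be sketched in plain Python with hypothetical stand-ins for the two decorators (not PyTorch's actual implementations): a flags-style decorator saves and re-sets *all* backend flags, so if it runs inside the TF32 toggle it clobbers that toggle.

```python
import functools

# Simulated backend flag state (stand-in for the cuDNN backend flags).
backend_flags = {"allow_tf32": False, "benchmark": True}

def set_flags(benchmark=True, allow_tf32=False):
    """Stand-in for torch.backends.cudnn.flags(): saves all flags,
    overrides them (including allow_tf32's default), restores on exit."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            saved = dict(backend_flags)
            backend_flags.update(benchmark=benchmark, allow_tf32=allow_tf32)
            try:
                return fn(*args, **kwargs)
            finally:
                backend_flags.update(saved)
        return wrapper
    return decorator

def tf32_on(fn):
    """Stand-in for tf32_on_and_off (just the 'on' half): toggles allow_tf32."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        saved = backend_flags["allow_tf32"]
        backend_flags["allow_tf32"] = True
        try:
            return fn(*args, **kwargs)
        finally:
            backend_flags["allow_tf32"] = saved
    return wrapper

@set_flags(benchmark=False)  # listed first => outermost wrapper
@tf32_on                     # listed after => runs inside, so its toggle wins
def good_order():
    return backend_flags["allow_tf32"]

@tf32_on                     # outermost: sets allow_tf32=True first...
@set_flags(benchmark=False)  # ...then this re-sets allow_tf32 to its default
def bad_order():
    return backend_flags["allow_tf32"]

print(good_order(), bad_order())  # True False
```

Decorators apply bottom-up, so the one listed last runs closest to the test body; putting tf32_on_and_off after (below) the flags decorator keeps its toggle in effect, matching the move this commit makes.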
Pull Request resolved: https://github.com/pytorch/pytorch/pull/102007
Approved by: https://github.com/ngimel
2023-05-24 02:23:06 +00:00
kshitij12345
3b966a6ce3
[autograd] disable backward/grad for complex scalar output (#92753)
...
Fixes https://github.com/pytorch/pytorch/issues/92750
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92753
Approved by: https://github.com/ezyang
2023-02-23 11:38:27 +00:00
Jeff Daily
66bfcd32fd
[ROCm] Remove PYTORCH_MIOPEN_SUGGEST_NHWC flag (#90725)
...
Fixes #64427. MIOpen supports ChannelsLast, so opting in via the environment variable is no longer needed.
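For context, the opt-in this commit removes was an environment variable (the name comes from the commit title); before this change a user wanting NHWC convolutions through MIOpen had to set it:

```shell
# Previously required to opt in to channels-last (NHWC) convolutions
# through MIOpen; after this commit MIOpen's ChannelsLast support is
# used without any environment variable.
export PYTORCH_MIOPEN_SUGGEST_NHWC=1
```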
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90725
Approved by: https://github.com/malfet
2023-02-09 22:26:24 +00:00
mingfeima
26cba842ad
Optimize ConvTranspose2d with mkldnn float32 and bfloat16 on CPU (#92530)
...
This PR optimizes `ConvTranspose2d` with oneDNN and adds channels-last support for it. The fallback path `slow_conv_transpose2d` also gets channels-last support, so memory-format propagation behaves the same with or without oneDNN.
Replacement of https://github.com/pytorch/pytorch/pull/77060, https://github.com/pytorch/pytorch/pull/70897, and https://github.com/pytorch/pytorch/pull/74023, which enable oneDNN for `ConvTranspose2d` and `ConvTranspose3d`.
The following results were collected on a Skylake Xeon 8180, dual socket, 28 cores per socket.
### single core channels last
configs | forward before/ms | forward after/ms | ratio | backward before/ms | backward after/ms | ratio
-- | -- | -- | -- | -- | -- | --
input size: (32, 32, 100, 100), weight size: (32, 32, 3, 3) | 181.36 | 91.16 | 1.99 | 531.38 | 124.08 | 4.28
input size: (32, 16, 200, 200), weight size: (16, 16, 3, 3) | 324.35 | 153.50 | 2.11 | 973.16 | 185.97 | 5.23
input size: (32, 128, 100, 100), weight size: (128, 128, 3, 3) | 1086.82 | 671.52 | 1.62 | 3008.94 | 1453.33 | 2.07
### single core channels first
configs | forward before/ms | forward after/ms | ratio | backward before/ms | backward after/ms | ratio
-- | -- | -- | -- | -- | -- | --
input size: (32, 32, 100, 100), weight size: (32, 32, 3, 3) | 138.10 | 5.94 | 23.23 | 37.97 | 11.25 | 3.38
input size: (32, 16, 200, 200), weight size: (16, 16, 3, 3) | 236.43 | 8.75 | 27.03 | 87.77 | 18.58 | 4.72
input size: (32, 128, 100, 100), weight size: (128, 128, 3, 3) | 484.39 | 37.69 | 12.85 | 185.40 | 90.57 | 2.05
### single socket channels last
configs | forward before/ms | forward after/ms | ratio | backward before/ms | backward after/ms | ratio
-- | -- | -- | -- | -- | -- | --
input size: (32, 32, 100, 100), weight size: (32, 32, 3, 3) | 138.10 | 5.94 | 23.23 | 37.97 | 11.25 | 3.38
input size: (32, 16, 200, 200), weight size: (16, 16, 3, 3) | 236.43 | 8.75 | 27.03 | 87.77 | 18.58 | 4.72
input size: (32, 128, 100, 100), weight size: (128, 128, 3, 3) | 484.39 | 37.69 | 12.85 | 185.40 | 90.57 | 2.0
### single socket channels first
configs | forward before/ms | forward after/ms | ratio | backward before/ms | backward after/ms | ratio
-- | -- | -- | -- | -- | -- | --
input size: (32, 32, 100, 100), weight size: (32, 32, 3, 3) | 132.56 | 7.19 | 18.43 | 31.43 | 11.20 | 2.81
input size: (32, 16, 200, 200), weight size: (16, 16, 3, 3) | 227.94 | 13.33 | 17.11 | 63.00 | 23.41 | 2.69
input size: (32, 128, 100, 100), weight size: (128, 128, 3, 3) | 473.68 | 52.79 | 8.97 | 150.40 | 87.33 | 1.72
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92530
Approved by: https://github.com/jgong5, https://github.com/ezyang
2023-02-06 10:11:25 +00:00
Jeff Daily
72502b94f3
correct use of torch.backends.cudnn.flags() (#93182)
...
Fixes #77467.
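torch.backends.cudnn.flags() is a context manager that saves the cuDNN backend flags, applies overrides, and restores the saved state on exit. A minimal plain-Python stand-in of that save/override/restore pattern (hypothetical names, not PyTorch's implementation):

```python
import contextlib

# Simulated backend flag state (stand-in for the cuDNN backend flags).
cudnn_flags = {"enabled": True, "benchmark": True, "deterministic": False}

@contextlib.contextmanager
def flags(**overrides):
    # Save the current flags, apply the overrides for the duration of
    # the `with` block, and restore the saved values on exit.
    saved = dict(cudnn_flags)
    cudnn_flags.update(overrides)
    try:
        yield
    finally:
        cudnn_flags.update(saved)

with flags(enabled=True, benchmark=False):
    assert cudnn_flags["benchmark"] is False   # override active inside
assert cudnn_flags["benchmark"] is True        # restored after the block
```

Using the helper as a `with` block (rather than leaking flag changes across tests) is the usage pattern the commit title refers to.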
Pull Request resolved: https://github.com/pytorch/pytorch/pull/93182
Approved by: https://github.com/ngimel
2023-01-28 06:50:06 +00:00
Eddie Yan
dabf515c18
[cuDNN][cuDNN V8 API] (re-re-re-open) cuDNN V8 API on by default (#91117)
...
Re-opening following #91025
CC @ptrblck @ngimel
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91117
Approved by: https://github.com/ngimel
2022-12-20 18:52:29 +00:00
PyTorch MergeBot
ba7aeac37b
Revert "[cuDNN][cuDNN V8 API] (re-re-open) cuDNN V8 API on by default (#89022)"
...
This reverts commit eecd621f06.
Reverted https://github.com/pytorch/pytorch/pull/89022 on behalf of https://github.com/ngimel due to breaks some convolution configurations #91025
2022-12-16 23:06:35 +00:00
Eddie Yan
eecd621f06
[cuDNN][cuDNN V8 API] (re-re-open) cuDNN V8 API on by default ( #89022 )
...
Testing V8 on by default again after fixes have been merged for e.g. https://github.com/pytorch/torchdynamo/issues/1833.
One new failure that seems to surface with V8 enabled appears in halonext + amp:
```
RuntimeError: Internal Triton PTX codegen error:
Segmentation fault (core dumped)
```
But I'm not sure whether this points to a V8 issue or a Triton issue. CC @ngimel @ptrblck
Current dynamo benchmarks on A100:
v7 vs. v8
|dev |name |batch_size|abs_latency_v7|abs_latency_v8|
|----|-------------------------------|----------|--------------|--------------|
|cuda|adv_inception_v3 |128 |166.0240 |165.5798 |
|cuda|beit_base_patch16_224 |64 |123.5912 |123.0797 |
|cuda|botnet26t_256 |128 |107.7343 |107.5948 |
|cuda|cait_m36_384 |4 |184.5038 |184.0271 |
|cuda|coat_lite_mini |128 |142.3061 |140.5814 |
|cuda|convit_base |64 |165.2499 |161.0743 |
|cuda|convmixer_768_32 |32 |325.6984 |325.7094 |
|cuda|convnext_base |64 |237.4632 |238.0142 |
|cuda|crossvit_9_240 |128 |72.2980 |72.4367 |
|cuda|cspdarknet53 |64 |96.6862 |96.8308 |
|cuda|deit_base_distilled_patch16_224|64 |117.6045 |117.9616 |
|cuda|dla102 |128 |182.3073 |182.2304 |
|cuda|dm_nfnet_f0 |128 |133.6011 |133.6298 |
|cuda|dpn107 |32 |148.5080 |148.5885 |
|cuda|eca_botnext26ts_256 |128 |113.8676 |113.1514 |
|cuda|eca_halonext26ts |128 |119.2242 |119.1845 |
|cuda|ese_vovnet19b_dw |128 |80.0217 |79.9438 |
|cuda|fbnetc_100 |128 |91.4548 |91.4009 |
|cuda|fbnetv3_b |128 |115.4496 |115.5058 |
|cuda|gernet_l |128 |114.8365 |114.7870 |
|cuda|ghostnet_100 |128 |58.5766 |58.5766 |
|cuda|gluon_inception_v3 |128 |165.5222 |165.7167 |
|cuda|gluon_xception65 |32 |165.8779 |165.7818 |
|cuda|gmixer_24_224 |128 |116.3611 |113.4925 |
|cuda|gmlp_s16_224 |128 |121.2607 |121.2534 |
|cuda|hrnet_w18 |128 |246.5706 |246.7599 |
|cuda|inception_v3 |128 |166.1096 |166.2034 |
|cuda|jx_nest_base |32 |93.6064 |93.4088 |
|cuda|lcnet_050 |128 |21.4156 |21.4207 |
|cuda|levit_128 |128 |27.2901 |27.2543 |
|cuda|mixer_b16_224 |128 |157.8992 |158.2878 |
|cuda|mixnet_l |128 |197.3443 |197.2125 |
|cuda|mnasnet_100 |128 |71.4604 |71.2997 |
|cuda|mobilenetv2_100 |128 |67.6080 |67.7515 |
|cuda|mobilenetv3_large_100 |128 |57.7224 |57.6591 |
|cuda|mobilevit_s |64 |93.0372 |93.0530 |
|cuda|nfnet_l0 |128 |113.1664 |113.2853 |
|cuda|pit_b_224 |64 |133.3333 |133.4153 |
|cuda|pnasnet5large |16 |238.9545 |238.8122 |
|cuda|poolformer_m36 |64 |144.2353 |144.2375 |
|cuda|regnety_002 |128 |32.8534 |32.9069 |
|cuda|repvgg_a2 |128 |102.4150 |102.3827 |
|cuda|res2net101_26w_4s |64 |120.8127 |120.8322 |
|cuda|res2net50_14w_8s |128 |149.7052 |149.8969 |
|cuda|res2next50 |128 |153.7439 |153.8215 |
|cuda|resmlp_12_224 |128 |89.1918 |86.9226 |
|cuda|resnest101e |64 |159.4706 |159.3133 |
|cuda|rexnet_100 |128 |88.0032 |88.0397 |
|cuda|sebotnet33ts_256 |64 |80.4635 |80.0120 |
|cuda|selecsls42b |128 |70.4430 |70.3663 |
|cuda|spnasnet_100 |128 |78.0537 |78.1991 |
|cuda|swin_base_patch4_window7_224 |64 |212.9073 |213.0824 |
|cuda|swsl_resnext101_32x16d |32 |193.0229 |193.0404 |
|cuda|tf_efficientnet_b0 |128 |97.1316 |97.0410 |
|cuda|tf_mixnet_l |128 |203.4956 |203.5340 |
|cuda|tinynet_a |128 |82.4038 |82.8733 |
|cuda|tnt_s_patch16_224 |128 |284.8576 |284.8867 |
|cuda|twins_pcpvt_base |64 |118.3893 |119.2329 |
|cuda|visformer_small |128 |126.0533 |126.0390 |
|cuda|vit_base_patch16_224 |64 |118.2873 |118.0573 |
|cuda|volo_d1_224 |64 |108.7764 |108.2063 |
|cuda|xcit_large_24_p8_224 |5 |100.4656 |100.5209 |
v7 vs. v8 amp
|dev |name |batch_size|abs_latency_v7|abs_latency_v8|
|----|-------------------------------|----------|--------------|--------------|
|cuda|adv_inception_v3 |128 |104.9729 |105.1237 |
|cuda|beit_base_patch16_224 |64 |75.4330 |75.2039 |
|cuda|botnet26t_256 |128 |74.5149 |74.8071 |
|cuda|cait_m36_384 |4 |110.9788 |111.5170 |
|cuda|coat_lite_mini |128 |62.3618 |64.4965 |
|cuda|convit_base |64 |116.4054 |117.9129 |
|cuda|convmixer_768_32 |32 |264.4401 |264.4491 |
|cuda|convnext_base |64 |182.9009 |179.2136 |
|cuda|crossvit_9_240 |128 |48.8586 |48.8359 |
|cuda|cspdarknet53 |64 |80.0245 |80.0160 |
|cuda|deit_base_distilled_patch16_224|64 |66.5921 |66.7448 |
|cuda|dla102 |128 |116.7780 |117.1683 |
|cuda|dm_nfnet_f0 |128 |78.9322 |79.1135 |
|cuda|dpn107 |32 |85.5206 |85.7514 |
|cuda|eca_botnext26ts_256 |128 |76.3672 |77.0050 |
|cuda|eca_halonext26ts |128 |86.2458 | |
|cuda|ese_vovnet19b_dw |128 |43.2943 |43.3379 |
|cuda|fbnetc_100 |128 |54.8479 |54.9251 |
|cuda|fbnetv3_b |128 |70.7504 |71.0188 |
|cuda|gernet_l |128 |66.1607 |66.0379 |
|cuda|ghostnet_100 |128 |43.8882 |43.9336 |
|cuda|gluon_inception_v3 |128 |104.9297 |105.0204 |
|cuda|gluon_xception65 |32 |85.7118 |85.8370 |
|cuda|gmixer_24_224 |128 |75.1214 |76.1170 |
|cuda|gmlp_s16_224 |128 |76.4207 |76.6641 |
|cuda|hrnet_w18 |128 |186.1326 |186.2435 |
|cuda|inception_v3 |128 |105.0561 |105.0783 |
|cuda|jx_nest_base |32 |65.3066 |65.3245 |
|cuda|lcnet_050 |128 |14.7991 |14.8687 |
|cuda|levit_128 |128 |19.2893 |19.4772 |
|cuda|mixer_b16_224 |128 |93.9826 |94.2056 |
|cuda|mixnet_l |128 |147.1245 |147.0435 |
|cuda|mnasnet_100 |128 |39.1781 |39.2565 |
|cuda|mobilenetv2_100 |128 |42.3704 |42.3114 |
|cuda|mobilenetv3_large_100 |128 |37.2946 |37.2816 |
|cuda|mobilevit_s |64 |55.8930 |55.8934 |
|cuda|nfnet_l0 |128 |64.0448 |64.4438 |
|cuda|pit_b_224 |64 |80.6342 |80.2933 |
|cuda|pnasnet5large |16 |154.9611 |154.8654 |
|cuda|poolformer_m36 |64 |101.7489 |101.8138 |
|cuda|regnety_002 |128 |27.0939 |27.0309 |
|cuda|repvgg_a2 |128 |60.9651 |61.2533 |
|cuda|res2net101_26w_4s |64 |77.3291 |77.4739 |
|cuda|res2net50_14w_8s |128 |93.6572 |93.7221 |
|cuda|res2next50 |128 |112.4975 |112.3248 |
|cuda|resmlp_12_224 |128 |59.5422 |60.7644 |
|cuda|resnest101e |64 |97.9894 |98.3358 |
|cuda|rexnet_100 |128 |55.2218 |55.0718 |
|cuda|sebotnet33ts_256 |64 |60.4880 |60.8113 |
|cuda|selecsls42b |128 |41.4294 |41.5341 |
|cuda|spnasnet_100 |128 |45.0037 |45.0304 |
|cuda|swin_base_patch4_window7_224 |64 |98.2561 |98.6925 |
|cuda|swsl_resnext101_32x16d |32 |100.6179 |100.9195 |
|cuda|tf_efficientnet_b0 |128 |56.5344 |56.4591 |
|cuda|tf_mixnet_l |128 |153.0318 |152.9367 |
|cuda|tinynet_a |128 |54.1307 |53.9298 |
|cuda|tnt_s_patch16_224 |128 |142.4801 |142.6589 |
|cuda|twins_pcpvt_base |64 |67.9027 |67.8325 |
|cuda|visformer_small |128 |72.5589 |72.9427 |
|cuda|vit_base_patch16_224 |64 |71.4885 |71.7342 |
|cuda|volo_d1_224 |64 |69.3539 |69.5910 |
|cuda|xcit_large_24_p8_224 |5 |59.9000 |59.9699 |
v7 vs. v8 float16
|dev |name |batch_size|abs_latency_v7|abs_latency_v8|
|----|-------------------------------|----------|--------------|--------------|
|cuda|adv_inception_v3 |128 |104.2544 |104.2677 |
|cuda|beit_base_patch16_224 |64 |85.3601 |85.3786 |
|cuda|botnet26t_256 |128 |72.1476 |71.8277 |
|cuda|cait_m36_384 |4 |108.3075 |108.5941 |
|cuda|coat_lite_mini |128 |61.2382 |61.6049 |
|cuda|convmixer_768_32 |32 |263.3818 |263.3598 |
|cuda|convnext_base |64 |172.6821 |173.8520 |
|cuda|crossvit_9_240 |128 |44.6321 |44.6340 |
|cuda|cspdarknet53 |64 |79.3165 |79.2964 |
|cuda|deit_base_distilled_patch16_224|64 |61.9816 |62.2109 |
|cuda|dla102 |128 |115.7403 |115.9928 |
|cuda|dm_nfnet_f0 |128 |77.5434 |77.7440 |
|cuda|dpn107 |32 |83.6489 |83.5605 |
|cuda|eca_botnext26ts_256 |128 |73.9953 |74.1031 |
|cuda|eca_halonext26ts |128 |81.7951 |81.7103 |
|cuda|ese_vovnet19b_dw |128 |42.9618 |42.8853 |
|cuda|fbnetc_100 |128 |54.3590 |54.3575 |
|cuda|fbnetv3_b |128 |69.7977 |70.1696 |
|cuda|gernet_l |128 |64.8684 |65.1726 |
|cuda|ghostnet_100 |128 |43.2054 |43.1319 |
|cuda|gluon_inception_v3 |128 |104.1988 |104.3030 |
|cuda|gluon_xception65 |32 |84.2245 |84.5085 |
|cuda|gmixer_24_224 |128 |82.0418 |82.7252 |
|cuda|gmlp_s16_224 |128 |75.4792 |75.8374 |
|cuda|hrnet_w18 |128 |184.1450 |184.1848 |
|cuda|inception_v3 |128 |104.1203 |104.2536 |
|cuda|jx_nest_base |32 |58.2386 |58.4901 |
|cuda|lcnet_050 |128 |14.6409 |14.5616 |
|cuda|levit_128 |128 |22.3875 |22.4680 |
|cuda|mixer_b16_224 |128 |98.9534 |98.4730 |
|cuda|mixnet_l |128 |146.1623 |146.1947 |
|cuda|mnasnet_100 |128 |38.9208 |39.3463 |
|cuda|mobilenetv2_100 |128 |41.8946 |41.9847 |
|cuda|mobilenetv3_large_100 |128 |36.7810 |36.8264 |
|cuda|mobilevit_s |64 |55.3211 |55.3186 |
|cuda|nfnet_l0 |128 |63.1302 |63.5544 |
|cuda|pit_b_224 |64 |73.8752 |73.4602 |
|cuda|pnasnet5large |16 |151.6806 |151.6111 |
|cuda|poolformer_m36 |64 |86.8341 |86.8021 |
|cuda|regnety_002 |128 |26.6798 |26.5295 |
|cuda|repvgg_a2 |128 |61.6652 |62.1482 |
|cuda|res2net101_26w_4s |64 |75.8037 |75.7739 |
|cuda|res2net50_14w_8s |128 |92.6362 |92.4338 |
|cuda|res2next50 |128 |111.5371 |111.5832 |
|cuda|resmlp_12_224 |128 |58.2349 |57.9807 |
|cuda|resnest101e |64 |96.1114 |96.2742 |
|cuda|rexnet_100 |128 |54.8138 |54.7643 |
|cuda|sebotnet33ts_256 |64 |53.1524 |53.3823 |
|cuda|selecsls42b |128 |40.6070 |40.7104 |
|cuda|spnasnet_100 |128 |44.5732 |44.4318 |
|cuda|swin_base_patch4_window7_224 |64 |98.6447 |98.8445 |
|cuda|swsl_resnext101_32x16d |32 |97.0195 |97.2968 |
|cuda|tf_efficientnet_b0 |128 |56.0640 |56.0278 |
|cuda|tf_mixnet_l |128 |152.0958 |152.0874 |
|cuda|tinynet_a |128 |53.3694 |53.3762 |
|cuda|tnt_s_patch16_224 |128 |130.2981 |130.3726 |
|cuda|twins_pcpvt_base |64 |62.5459 |62.6416 |
|cuda|visformer_small |128 |68.8502 |69.1756 |
|cuda|vit_base_patch16_224 |64 |65.8587 |66.0285 |
|cuda|volo_d1_224 |64 |64.5348 |64.6057 |
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89022
Approved by: https://github.com/ngimel
2022-12-15 03:24:44 +00:00
kshitij12345
6a964c16e5
[flaky] relax tolerance conv1d_vs_scipy (#89193)
...
Fixes https://github.com/pytorch/pytorch/issues/89087
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89193
Approved by: https://github.com/kit1980
2022-11-18 07:31:10 +00:00
PyTorch MergeBot
d98a884b33
Revert "[cuDNN] (re-open) Enable cuDNN Frontend v8 API by Default (#87669)"
...
This reverts commit 3c6bddc3f6.
Reverted https://github.com/pytorch/pytorch/pull/87669 on behalf of https://github.com/eqy due to investigating convnext benchmark regressions
2022-11-08 19:04:25 +00:00
eqy
3c6bddc3f6
[cuDNN] (re-open) Enable cuDNN Frontend v8 API by Default (#87669)
...
#58414
Has a small tweak to a test that was breaking on A10 (CC @malfet).
CC @ptrblck @ngimel
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87669
Approved by: https://github.com/ngimel
2022-11-02 01:36:37 +00:00
Kshiteej K
6735bf21c7
[test_nn] split convolution tests from test_nn (#87474)
...
Ref #63085
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87474
Approved by: https://github.com/albanD
2022-10-31 04:42:45 +00:00