pytorch/benchmarks/cpp
Raghavan Raman 2cc9778495 [MicroBench] Added a log_vml version of the signed log1p kernel (#64205)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64205

The log_vml version of the micro-bench is over **2x** faster than the log1p version. Here are the perf numbers:

```
---------------------------------------------------------------------------------------------
Benchmark                                   Time             CPU   Iterations UserCounters...
---------------------------------------------------------------------------------------------
SignedLog1pBench/ATen/10/1467           45915 ns        45908 ns        14506 GB/s=2.5564G/s
SignedLog1pBench/NNC/10/1467            40469 ns        40466 ns        17367 GB/s=2.9002G/s
SignedLog1pBench/NNCLogVml/10/1467      19560 ns        19559 ns        35902 GB/s=6.00016G/s
```

Thanks to bertmaher for pointing this out.

Test Plan: Imported from OSS

Reviewed By: bertmaher

Differential Revision: D30644716

Pulled By: navahgar

fbshipit-source-id: ba2b32c79d4265cd48a2886b0c62d0e89ff69c19
2021-09-10 16:49:06 -07:00
..
tensorexpr [MicroBench] Added a log_vml version of the signed log1p kernel (#64205) 2021-09-10 16:49:06 -07:00
CMakeLists.txt CPU Convolution benchmark harness for some popular models (#56455) 2021-04-22 22:14:36 -07:00
convolution.cpp Disable avoid-non-const-global-variables lint check (#62008) 2021-07-22 18:04:40 -07:00