pytorch/c10/benchmark
Yangqing Jia 1ef99cf0ab Intrusive_ptr implementation slower than shared_ptr (#30810)
Summary:
This started as a random coding exercise, so I didn't put much effort into it. I wondered whether the current intrusive_ptr implementation is optimized enough, so I compared it with shared_ptr (using std::enable_shared_from_this).

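My benchmark results show that intrusive_ptr is actually slower: roughly 40% in the constructor/destructor benchmark (14 ns vs. 10 ns) and about 3% in the array benchmark. On my MacBook the numbers are: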

```
---------------------------------------------------------------
Benchmark                        Time           CPU Iterations
---------------------------------------------------------------
BM_IntrusivePtrCtorDtor         14 ns         14 ns   52541902
BM_SharedPtrCtorDtor            10 ns         10 ns   71898849
BM_IntrusivePtrArray         14285 ns      14112 ns      49775
BM_SharedPtrArray            13821 ns      13384 ns      51602
```
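
For readers without the benchmark source at hand, the constructor/destructor comparison can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the code in intrusive_ptr_benchmark.cpp: `ToyIntrusivePtr`, `Counted`, `Foo`, `Bar`, and the timing helpers are hypothetical names invented for illustration, the toy pointer is not c10::intrusive_ptr, and a plain `std::chrono` loop stands in for a proper benchmark harness, so absolute numbers will differ from the table above.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical base class: the reference count lives inside the object itself.
struct Counted {
  std::atomic<std::size_t> refcount{0};
};

// Toy intrusive pointer (illustration only, not c10::intrusive_ptr).
template <typename T>
class ToyIntrusivePtr {
 public:
  ToyIntrusivePtr() = default;
  explicit ToyIntrusivePtr(T* p) : ptr_(p) { retain(); }
  ToyIntrusivePtr(const ToyIntrusivePtr& o) : ptr_(o.ptr_) { retain(); }
  ToyIntrusivePtr(ToyIntrusivePtr&& o) noexcept : ptr_(o.ptr_) { o.ptr_ = nullptr; }
  // Copy-and-swap handles both copy and move assignment.
  ToyIntrusivePtr& operator=(ToyIntrusivePtr o) noexcept {
    std::swap(ptr_, o.ptr_);
    return *this;
  }
  ~ToyIntrusivePtr() { release(); }

  T* get() const { return ptr_; }
  std::size_t use_count() const {
    return ptr_ ? ptr_->refcount.load(std::memory_order_acquire) : 0;
  }

 private:
  void retain() {
    if (ptr_) ptr_->refcount.fetch_add(1, std::memory_order_relaxed);
  }
  void release() {
    // fetch_sub returns the previous value; 1 means this was the last owner.
    if (ptr_ && ptr_->refcount.fetch_sub(1, std::memory_order_acq_rel) == 1) {
      delete ptr_;
    }
    ptr_ = nullptr;
  }
  T* ptr_ = nullptr;
};

struct Foo : Counted { int value = 0; };                        // intrusive side
struct Bar : std::enable_shared_from_this<Bar> { int value = 0; };  // shared_ptr side

// Average nanoseconds per call of fn over `iters` iterations.
template <typename Fn>
double ns_per_iter(std::size_t iters, Fn fn) {
  assert(iters > 0);
  auto start = std::chrono::steady_clock::now();
  for (std::size_t i = 0; i < iters; ++i) fn();
  auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration<double, std::nano>(stop - start).count() /
         static_cast<double>(iters);
}

double time_intrusive_ctor_dtor(std::size_t iters) {
  return ns_per_iter(iters, [] {
    ToyIntrusivePtr<Foo> p(new Foo());
    volatile int sink = p.get()->value;  // discourage the optimizer from dropping the work
    (void)sink;
  });
}

double time_shared_ctor_dtor(std::size_t iters) {
  return ns_per_iter(iters, [] {
    auto p = std::make_shared<Bar>();
    volatile int sink = p->value;
    (void)sink;
  });
}

// Sanity check: a copy bumps the shared count, and destroying it drops the count again.
bool refcount_behaves() {
  ToyIntrusivePtr<Foo> a(new Foo());
  {
    ToyIntrusivePtr<Foo> b = a;
    if (b.use_count() != 2) return false;
  }
  return a.use_count() == 1;
}
```

Calling `time_intrusive_ctor_dtor(1000000)` and `time_shared_ctor_dtor(1000000)` from a small driver and printing both values gives a rough per-iteration cost for each pointer type. Results from a hand-rolled loop like this are noisy, so treat them as a sanity check rather than a replacement for the numbers reported above.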

I wanted to share the results in case someone is interested in taking a look.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30810

Reviewed By: yinghai

Differential Revision: D18828785

Pulled By: bddppq

fbshipit-source-id: 202e9849c9d8a3da17edbe568572a74bb70cb6c5
2019-12-13 00:25:36 -08:00
CMakeLists.txt Intrusive_ptr implementation slower than shared_ptr (#30810) 2019-12-13 00:25:36 -08:00
intrusive_ptr_benchmark.cpp Intrusive_ptr implementation slower than shared_ptr (#30810) 2019-12-13 00:25:36 -08:00