pytorch/test/cpp
Taylor Robie 0b1f3bd158 [Profiler] Prefer TSC to wall clock when available (#73855)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73855

Calling the clock is one of the most expensive parts of profiling. We can reduce the profiling overhead by using `rdtsc` instead. The tradeoff is that we have to measure how the TSC relates to the wall clock and convert each reading (shift and scale).
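
To illustrate the "shift and scale" idea, here is a minimal sketch of a linear counter-to-nanoseconds conversion. It is not the code from this PR: `TscToNsConverter` and `calibrate` are hypothetical names, and the sketch assumes the two calibration points were obtained elsewhere by pairing a raw counter reading (for example from `rdtsc` on x86) with a wall-clock reading taken at roughly the same moment.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper for illustration only; not PyTorch's actual converter.
struct TscToNsConverter {
  double scale_ns_per_tick;  // "scale": nanoseconds per counter tick
  int64_t tsc_origin;        // counter value at the first calibration point
  int64_t ns_origin;         // "shift": wall-clock ns at the first calibration point

  // Map a raw counter reading onto the wall-clock timeline.
  int64_t to_ns(int64_t tsc) const {
    const double ticks = static_cast<double>(tsc - tsc_origin);
    return ns_origin + static_cast<int64_t>(std::llround(ticks * scale_ns_per_tick));
  }
};

// Derive the scale from two (counter, wall-clock ns) pairs measured some time apart.
// Assumes the counter ticks at a constant rate between the two points.
inline TscToNsConverter calibrate(int64_t tsc0, int64_t ns0,
                                  int64_t tsc1, int64_t ns1) {
  assert(tsc1 > tsc0);  // the counter must have advanced between measurements
  const double scale =
      static_cast<double>(ns1 - ns0) / static_cast<double>(tsc1 - tsc0);
  return TscToNsConverter{scale, tsc0, ns0};
}
```

With calibration points `(1000, 5000)` and `(3000, 6000)`, the scale is 0.5 ns per tick, so `calibrate(1000, 5000, 3000, 6000).to_ns(2000)` yields `5500`. The hot path then only reads the counter; the conversion can be deferred until the profile is post-processed.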

Test Plan: I added a cpp unit test with *very* aggressive anti-flake measures. I also ran the overhead benchmark (9 replicates) with `--stressTestKineto` (0.94 -> 0.89 us) and `--stressTestKineto --kinetoProfileMemory` (1.27 -> 1.17 us)

Reviewed By: chaekit

Differential Revision: D34231071

fbshipit-source-id: e3b3dd7580d93bcc783e87c7f2fc726cb74f4df8
(cherry picked from commit e8be9f8160793c6ee35d5af02bca3e01703e377d)
2022-03-13 18:29:06 +00:00
..
api [caffe2] fix build failures in optimized builds under clang 2022-02-22 22:31:47 +00:00
c10d Add support for deleteKey for FileStore (#69953) 2022-01-07 06:20:59 -08:00
common
dist_autograd Fix distributed autograd gradients synchronization (#57792) 2021-05-09 17:32:59 -07:00
jit Revert D34455360: Multisect successfully blamed D34455360 for test failures 2022-03-08 23:18:54 +00:00
lazy Revert D34342689: Revert D34250357: Sync lazy_tensor_staging back to master 2022-02-18 17:31:21 +00:00
lite_interpreter_runtime [PyTorch] Add codegen unboxing ability (#69881) 2022-03-01 23:28:13 +00:00
monitor torch/monitor: merge Interval and FixedCount stats (#72009) 2022-01-30 23:21:59 +00:00
profiler [Profiler] Prefer TSC to wall clock when available (#73855) 2022-03-13 18:29:06 +00:00
rpc Remove ProcessGroup from TensorPipeAgent initialization (#68128) 2021-11-11 12:28:55 -08:00
tensorexpr [Quant][core] Merged conv packed params and linear packed params (#73486) 2022-03-11 15:18:45 +00:00
__init__.py