pytorch/test/cpp
Jeremy Lilley 468a9d448e [aten] Pass std::function<> to thread_pool by value, instead of const ref. (#37681)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37681

By passing by value, we can std::move, and avoid unnecessarily copying
args that are part of any std::function/lambda state (e.g. in the jit
interpreter, there is a std::vector<> stack passed in the
InterpreterContinuation)

This makes the api also consistent with e.g. folly and best practices.
Added a minor at::launch() benchmark to test/cpp/, the difference is
mostly noticeable when copying the std::function<> internal args is
non-trivial.

Benchmarks pre/post (min over ~5 runs)
NoData: 5.81 us -> 5.63 us (-3.2%)
WithData(0): 6.67 us -> 5.88 us (-11.8%)
WithData(4): 6.98 us -> 6.51 us (-6.7%)
WithData(256): 9.44 us -> 7.89 (-16.5%)

ghstack-source-id: 103322321

Test Plan:
- perf: buck run mode/opt caffe2/test/cpp/api:parallel_benchmark pre/post
  - correctness buck test mode/dev-nosan caffe2/test/...

Reviewed By: dzhulgakov

Differential Revision: D21355148

fbshipit-source-id: 3567e730845106f1991091e4a892d093e00571c3
2020-05-05 08:41:38 -07:00
..
api [aten] Pass std::function<> to thread_pool by value, instead of const ref. (#37681) 2020-05-05 08:41:38 -07:00
common Trim libshm deps, move tempfile.h to c10 (#17019) 2019-02-13 19:38:35 -08:00
dist_autograd Fix/relax CMake linter rules (#35574) 2020-03-27 16:52:33 -07:00
jit Adding symbolic sizes, contiguity, stride indices (#36101) 2020-05-01 02:01:25 -07:00
rpc [TensorPipe/RPC] Serialize and deserialize message (#36197) 2020-05-05 05:45:57 -07:00
tensorexpr Add an iterator to Block. (#37542) 2020-05-01 15:12:49 -07:00
__init__.py Add train() / eval() / is_training() to C++ ScriptModule API (#16044) 2019-02-01 13:07:38 -08:00