pytorch

OSSForks/pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Commit Graph

Author: Bert Maher

SHA1: b7261de0df

[pytorch][te] Add compilation time benchmark (#46124)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46124

We want to make sure we can actually fuse kernels within a fairly
tight time budget.  So here's a quick benchmark of codegen for a simple
pointwise activation function (swish).  I kept all the intermediate tensors
separate to force TE to actually do inlining.
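
To illustrate what keeping the intermediate tensors separate means, here is a minimal Python sketch. It is not the benchmark's C++ code and does not use the TE API. It assumes the standard definition swish(x) = x * sigmoid(x); the commit message does not spell out the formula, and the names `swish_separate` and `swish_inlined` are hypothetical. The first version materializes one buffer per stage, which is the kind of input that makes a fuser do real inlining work. The second is the shape of the result once every intermediate has been inlined into a single loop.

```python
import math


def swish_separate(xs):
    """Swish with every intermediate stored in its own buffer."""
    neg = [-x for x in xs]                    # stage 1: negate
    exp = [math.exp(v) for v in neg]          # stage 2: exponentiate
    denom = [1.0 + v for v in exp]            # stage 3: add one
    sig = [1.0 / v for v in denom]            # stage 4: sigmoid
    return [x * s for x, s in zip(xs, sig)]   # stage 5: x * sigmoid(x)


def swish_inlined(xs):
    """Same computation with all intermediates inlined into one loop."""
    return [x * (1.0 / (1.0 + math.exp(-x))) for x in xs]


if __name__ == "__main__":
    data = [-2.0, -0.5, 0.0, 0.5, 2.0]
    for a, b in zip(swish_separate(data), swish_inlined(data)):
        # The two versions should agree up to floating-point rounding.
        assert math.isclose(a, b, rel_tol=1e-12, abs_tol=1e-15)
```

Both functions compute the same values; only the number of temporary buffers differs, which is the difference a fusing compiler is expected to eliminate.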

Test Plan:
```
buck run mode/opt //caffe2/benchmarks/cpp/tensorexpr:tensorexpr_bench
```

I've only run this in debug mode, so the results aren't very meaningful, but
even in that mode compilation takes 18 ms, 15 ms of which is spent in LLVM.

Update, opt build mode:
```
----------------------------------------------------------------------------
Benchmark                                     Time           CPU Iterations
----------------------------------------------------------------------------
BM_CompileSwish                         5123276 ns    5119846 ns        148
BM_CompileSwishLLVMOnly                 4754361 ns    4753701 ns        160
```
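
For a rough reading of the opt-mode table, the arithmetic can be sketched in Python as below. It assumes that BM_CompileSwishLLVMOnly times only the LLVM portion of the same pipeline, which the name suggests but the commit message does not state. The two rows are also separate benchmark runs, so the difference is an estimate rather than a measured quantity.

```python
# Figures copied from the opt-mode benchmark table (nanoseconds, Time column).
compile_swish_ns = 5_123_276   # BM_CompileSwish
llvm_only_ns = 4_754_361       # BM_CompileSwishLLVMOnly


def ns_to_ms(ns):
    """Convert nanoseconds to milliseconds."""
    return ns / 1_000_000


# Estimated time spent outside LLVM, and LLVM's share of the total.
non_llvm_ns = compile_swish_ns - llvm_only_ns
llvm_share = llvm_only_ns / compile_swish_ns

print(f"total compile: {ns_to_ms(compile_swish_ns):.2f} ms")   # 5.12 ms
print(f"LLVM only:     {ns_to_ms(llvm_only_ns):.2f} ms")       # 4.75 ms
print(f"outside LLVM:  {ns_to_ms(non_llvm_ns):.2f} ms")        # 0.37 ms
print(f"LLVM share:    {llvm_share:.1%}")                      # 92.8%
```

Under that assumption, most of the roughly 5 ms compile time in opt mode is attributable to LLVM.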

Reviewed By: asuhan

Differential Revision: D24232801

fbshipit-source-id: d58a8b7f79bcd9244c49366af7a693e09f24bf76

Date: 2020-10-09 23:11:37 -07:00