pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00


Bert Maher fbeb8b4992 [nnc] Speed up batchnorm benchmark	2021-06-22 22:57:43 -07:00

Summary: Use better scheduling: fuse and parallelize NC, fuse and vectorize HW.

```
-----------------------------------------------
N/C/H/W                   ATen             NNC
-----------------------------------------------
1/64/112/112          45449 ns        36672 ns
1/256/14/14           15555 ns         7116 ns
1/128/28/28           15737 ns         8560 ns
1/64/56/56            20766 ns        12153 ns
1/512/7/7             16985 ns         8182 ns
5/64/112/112        2532475 ns      2069668 ns
5/256/14/14           24507 ns        12228 ns
5/128/28/28           29352 ns        20146 ns
5/64/56/56            44786 ns        38784 ns
5/512/7/7             22307 ns        20505 ns
```

Test Plan: benchmark results above
Reviewed By: navahgar
Differential Revision: D29288658
fbshipit-source-id: dd05efa4b7d26b6ad94f54a9ef6c8c47adb160b5
..
tensorexpr	[nnc] Speed up batchnorm benchmark	2021-06-22 22:57:43 -07:00
CMakeLists.txt	CPU Convolution benchmark harness for some popular models (#56455)	2021-04-22 22:14:36 -07:00
convolution.cpp	[clang-tidy] Exclude cppcoreguidelines-avoid-magic-numbers (#57841)	2021-05-07 20:02:33 -07:00