opencv

mirror of https://github.com/zebrajr/opencv.git synced 2025-12-06 12:19:50 +01:00

pratham-mcw 8f3976ae97 Merge pull request #27785 from pratham-mcw:dnn-lstm-neon	2025-10-03 10:50:50 +03:00

dnn: added neon intrinsics implementation of fastGEMM1T function #27785

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch

- This PR improves the performance of the LSTM function on ARM64 targets.
- Added a NEON intrinsics implementation of the fastGEMM1T function and enabled its use in the fully connected and recurrent layer files.
- As a result, ARM64 now benefits from vectorized matrix–vector multiplications, leading to measurable performance improvements in the LSTM layer.
- This change is limited to ARM64 and does not affect other architectures.

Performance impact:

- The optimization significantly improves the performance of LSTM functions on ARM64 targets.

cmake	Add Definition "_USE_MATH_DEFINES" for dnn plugin on Win32 build	2024-04-07 21:08:09 +09:00
include/opencv2	pre: OpenCV 4.12.0 (version++).	2025-06-19 11:03:59 +03:00
misc	Add Java wrapper support for List<List<MatShape>>	2025-08-25 02:09:19 +09:00
perf	Merge pull request #26127 from alexlyulkov:al/blob-from-images	2024-12-23 10:04:34 +03:00
src	Merge pull request #27785 from pratham-mcw:dnn-lstm-neon	2025-10-03 10:50:50 +03:00
test	Higher threshold for ViT on OpenVINO	2025-05-21 09:31:40 +03:00
CMakeLists.txt	Merge pull request #27785 from pratham-mcw:dnn-lstm-neon	2025-10-03 10:50:50 +03:00