Danila Kutenin
|
49eb18ee4f
|
Optimize vectorscan for aarch64 by using shrn instruction
This optimization is based on the thread
https://twitter.com/Danlark1/status/1539344279268691970 and uses the
SHRN (shift right and narrow) instruction with a shift of 4: https://developer.arm.com/documentation/ddi0596/2020-12/SIMD-FP-Instructions/SHRN--SHRN2--Shift-Right-Narrow--immediate--
To achieve that, I needed to slightly redesign movemask into comparemask
and add an extra step to mask iteration. Our benchmarks
showed a 10-15% improvement on average for long matches.
|
2022-06-26 22:55:45 +00:00 |
|
Danila Kutenin
|
9af996b936
|
Fix all ASAN issues in vectorscan
|
2022-02-18 17:14:51 +00:00 |
|
Konstantinos Margaritis
|
713aaef799
|
move casemask helper functions to separate header
|
2021-11-01 16:05:43 +00:00 |
|
Konstantinos Margaritis
|
41ff0962c4
|
minor fixes
|
2021-10-12 11:51:34 +03:00 |
|
Konstantinos Margaritis
|
2753dbb3b0
|
rename supervector class header, use dup_*() function names instead of set1_*(), minor fixes
|
2021-10-12 11:51:34 +03:00 |
|
George Wort
|
d1009e8830
|
Fix error in initial noodle double final call.
Change-Id: Ie044988f183b47e0b2f1eed3b4bd23de75c3117d
|
2021-10-12 11:51:34 +03:00 |
|
George Wort
|
d6df8116a5
|
Add SVE2 support for noodle
Change-Id: Iacb7d1f164bdd0ba50e2e13d26fe548cf9b45a6a
|
2021-10-12 11:51:34 +03:00 |
|
Konstantinos Margaritis
|
e215157a21
|
move definitions elsewhere
|
2021-10-12 11:51:34 +03:00 |
|
Konstantinos Margaritis
|
05c7c8e576
|
move SuperVector versions of noodleEngine scan functions to _simd.hpp file
|
2021-10-12 11:51:34 +03:00 |
|
Konstantinos Margaritis
|
c6406bebde
|
simplify scanSingleMain() and scanDoubleMain()
|
2021-10-12 11:51:34 +03:00 |
|
Konstantinos Margaritis
|
ede2b18564
|
add generic SIMD implementation
|
2021-10-12 11:51:34 +03:00 |
|
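The shrn-based comparemask described in commit 49eb18ee4f can be illustrated with a portable scalar sketch. It is not the vectorscan code: the names `comparemask16`, `match_positions` and `ctz64` are hypothetical, and the loop only emulates what a NEON sequence such as `vceqq_u8` followed by `vshrn_n_u16(..., 4)` would presumably produce. The point it shows is that each input byte ends up as one nibble of a 64-bit mask, so mask iteration needs one extra step: keep a single bit per nibble and divide the bit index by 4.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar emulation (assumption: little-endian lane layout, as on aarch64) of
// "compare 16 bytes, then shift each 16-bit lane right by 4 and narrow to 8 bits".
// The result holds 4 mask bits per input byte: byte i maps to nibble i.
static uint64_t comparemask16(const uint8_t *data, uint8_t needle) {
    uint64_t mask = 0;
    for (size_t lane = 0; lane < 8; ++lane) {
        // Byte-wise compare: 0xFF on match, 0x00 otherwise.
        uint8_t lo = (data[2 * lane] == needle) ? 0xFF : 0x00;
        uint8_t hi = (data[2 * lane + 1] == needle) ? 0xFF : 0x00;
        // View the two compare results as one 16-bit lane.
        uint16_t wide = static_cast<uint16_t>(lo | (hi << 8));
        // Shift right by 4 and keep the low 8 bits (the "narrow" step).
        uint8_t narrow = static_cast<uint8_t>(wide >> 4);
        mask |= static_cast<uint64_t>(narrow) << (8 * lane);
    }
    return mask;
}

// Count trailing zero bits; only called with a non-zero argument.
static unsigned ctz64(uint64_t x) {
    unsigned n = 0;
    while ((x & 1) == 0) {
        x >>= 1;
        ++n;
    }
    return n;
}

// The additional iteration step: reduce each nibble to one bit, then
// convert every set bit index into a byte index by dividing by 4.
static std::vector<size_t> match_positions(uint64_t mask) {
    std::vector<size_t> out;
    mask &= 0x1111111111111111ULL;  // one bit per nibble
    while (mask) {
        out.push_back(ctz64(mask) >> 2);
        mask &= mask - 1;  // clear the lowest set bit
    }
    return out;
}
```

With a classic movemask the bit index is already the byte index; here the nibble layout trades that for a cheaper mask extraction on aarch64, which is why the commit message mentions an additional step in mask iteration.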