291 Commits

Author SHA1 Message Date
Hong, Yang A
5209c7978a remove invalid nfa dump info 2023-09-05 13:58:24 +03:00
Hong, Yang A
91f0cb6cea fix nfa dump error 2023-09-05 13:51:22 +03:00
Hong, Yang A
978105a4c0 klocwork: fix risk issues 2023-09-05 13:45:33 +03:00
Konstantinos Margaritis
48105cdd1d move variable 2022-09-16 14:05:31 +03:00
Hong, Yang A
decabdfede update year for bugfix #302-#305 2022-08-29 15:03:11 +03:00
Hong, Yang A
a119693a66 mcclellan: improve wide-state checking in Sherman optimization
fixes github issue #305
2022-08-29 15:03:06 +03:00
Danila Kutenin
49eb18ee4f Optimize vectorscan for aarch64 by using shrn instruction
This optimization is based on the thread
https://twitter.com/Danlark1/status/1539344279268691970 and uses
shift right and narrow by 4 instruction https://developer.arm.com/documentation/ddi0596/2020-12/SIMD-FP-Instructions/SHRN--SHRN2--Shift-Right-Narrow--immediate--

To achieve that, I needed to redesign a little movemask into comparemask
and have an additional step towards mask iteration. Our benchmarks
showed 10-15% improvement on average for long matches.
2022-06-26 22:55:45 +00:00
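The move from movemask to comparemask described in the commit above can be illustrated with a scalar model. This is a minimal sketch, not the vectorscan code: the names `comparemask16`, `ctz64`, and `collect_matches` are hypothetical, and the loop only imitates what a byte-wise compare followed by a shift-right-and-narrow by 4 would produce on AArch64, namely a 64-bit mask with one nibble per input byte. Because each byte owns 4 bits rather than 1, the mask-iteration step has to divide the bit position by 4.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model (assumption: little-endian lane order, as on AArch64).
 * A byte-wise compare yields 0xFF or 0x00 per byte; narrowing each
 * 16-bit lane with a right shift by 4 keeps one nibble per input byte.
 * Nibble i of the result is 0xF when buf[i] == c, else 0x0. */
static uint64_t comparemask16(const uint8_t *buf, uint8_t c) {
    uint64_t mask = 0;
    for (size_t i = 0; i < 16; i++) {
        if (buf[i] == c) {
            mask |= (uint64_t)0xF << (4 * i);
        }
    }
    return mask;
}

/* Portable count-trailing-zeros; the caller guarantees mask != 0. */
static unsigned ctz64(uint64_t mask) {
    unsigned n = 0;
    while ((mask & 1u) == 0) {
        mask >>= 1;
        n++;
    }
    return n;
}

/* The extra iteration step: bit position / 4 gives the byte index.
 * Writes the matching byte indices to out and returns their count. */
static size_t collect_matches(uint64_t mask, size_t *out) {
    assert(out != NULL);
    size_t n = 0;
    while (mask != 0) {
        size_t idx = ctz64(mask) >> 2;           /* 4 mask bits per byte */
        out[n++] = idx;
        mask &= ~((uint64_t)0xF << (4 * idx));   /* clear that nibble */
    }
    return n;
}
```

For a 16-byte block, calling `comparemask16(block, 'X')` and passing the result to `collect_matches` lists the offsets of every `'X'` in ascending order. The nibble-per-byte layout costs a shift during iteration but avoids the longer instruction sequence otherwise needed to emulate a 1-bit-per-byte movemask on NEON.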
Danila Kutenin
5f8729a085 Fix a couple of tests 2022-02-18 19:31:03 +00:00
Danila Kutenin
9af996b936 Fix all ASAN issues in vectorscan 2022-02-18 17:14:51 +00:00
Konstantinos Margaritis
81fba99f3a fix SVE2 build after the changes 2021-11-25 18:48:24 +02:00
Apostolos Tapsas
bfc8da1102 Removed accidentally included header file 2021-11-24 12:11:21 +00:00
Apostolos Tapsas
54158a1746 vermicelli and match implementations for ppc64el added 2021-11-13 19:36:46 +00:00
apostolos
e09d8674b4 resolving conflicts after merging 2021-11-13 18:58:22 +02:00
Konstantinos Margaritis
dcf6b59e8d split vermicelli block implementations per arch 2021-11-08 19:45:21 +00:00
Apostolos Tapsas
ba90cdeb5a SuperVector constructors as well as andnot implementation fixed 2021-11-05 13:34:48 +00:00
Konstantinos Margaritis
24fa54081b add len parameter and mask, fixes corner cases on AVX512 2021-11-05 14:30:22 +02:00
Konstantinos Margaritis
210295a702 remove vermicelli.h and replace it with vermicelli.hpp 2021-11-02 22:30:53 +02:00
Konstantinos Margaritis
869d2bd53b refactor vermicelliDoubleMaskedExec() 2021-11-02 22:30:21 +02:00
Konstantinos Margaritis
f6fd845400 complete refactoring and unification of Vermicelli functions 2021-11-01 16:28:50 +00:00
Konstantinos Margaritis
d47641c2fc remove unneeded header 2021-11-01 16:28:50 +00:00
Konstantinos Margaritis
bc1a1127cf add new include file 2021-11-01 16:28:50 +00:00
Konstantinos Margaritis
5eabceddcf renamed matcher functions, added new ones for Vermicelli 2021-11-01 16:28:50 +00:00
Konstantinos Margaritis
16e5e2ae64 nits 2021-11-01 16:05:43 +00:00
Konstantinos Margaritis
2fa947af9c added refactored vermicelli_simd.cpp implementation 2021-11-01 16:05:43 +00:00
apostolos
d9d39d48c5 prints, comments and formatting fixes 2021-11-01 10:09:15 +02:00
Apostolos Tapsas
1eb3b19f63 Shuffle simd and SuperVector implementations as well as their tests really fixed 2021-10-25 09:19:30 +03:00
Apostolos Tapsas
d43d6733b6 SuperVector shuffle implementation and test function optimized 2021-10-22 11:55:39 +00:00
apostolos
b53b0a0fcd test for movemask and shuffle cases added 2021-10-22 11:17:43 +03:00
Apostolos Tapsas
2b1db73326 WIP: fixes for functions in simd & bitutils files 2021-10-21 13:34:02 +00:00
apostolos
ba4472a61c truffle and shufti implementations for ARCH_PPC64EL 2021-10-14 16:01:21 +03:00
apostolos
d0a41252c8 blockSingleMask implementations for ARCH_PPC64 added 2021-10-14 15:56:13 +03:00
Konstantinos Margaritis
4e044d4142 Add missing copyright info from tampered files 2021-10-12 11:51:35 +03:00
Konstantinos Margaritis
2d9f52d03e add arm truffle block function 2021-10-12 11:51:35 +03:00
Konstantinos Margaritis
aea10b8ab0 simplify truffle and provide arch-specific block functions 2021-10-12 11:51:35 +03:00
Konstantinos Margaritis
623c64142b simplify shufti and provide arch-specific block functions 2021-10-12 11:51:35 +03:00
Konstantinos Margaritis
fad39b6058 optimize and simplify Shufti and Truffle to work with a single block method instead 2021-10-12 11:51:35 +03:00
Konstantinos Margaritis
9e6c1c30cf remove asserts, as they are not needed 2021-10-12 11:51:35 +03:00
Konstantinos Margaritis
9ab18cf419 fix for new pshufb 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
e7161fdfec initial SSE/AVX2 implementation 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
08357a096c remove Windows/ICC support 2021-10-12 11:51:34 +03:00
apostolos
b3a20afbbc limex_shuffle and its unit tests added 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
de30471edd remove duplicate functions from previous merge 2021-10-12 11:51:34 +03:00
George Wort
a879715953 Move SVE functions into their own files.
Change-Id: I995ba4b7d2b558ee403693ee45d747d414d3b177
2021-10-12 11:51:34 +03:00
George Wort
6c6aee9682 Implement new DoubleVermicelli16 acceleration functions using SVE2
Change-Id: Id4a8ffca840caab930a6e78cc0dfd0fe7d320b4e
2021-10-12 11:51:34 +03:00
George Wort
00fff3f53c Use SVE for double shufti.
Change-Id: I09e0d57bb8a2f05b613f6225dea79ae823136268
2021-10-12 11:51:34 +03:00
George Wort
c95a4c3dd1 Use SVE for single shufti.
Change-Id: Ic76940c5bb9b81a1c45d39e9ca396a158c50a7dc
2021-10-12 11:51:34 +03:00
George Wort
df926ef62f Implement new Vermicelli16 acceleration functions using SVE2.
The scheme utilises the MATCH and NMATCH instructions to
scan for 16 characters at the same rate as vermicelli
scans for one.

Change-Id: Ie2cef904c56651e6108593c668e9b65bc001a886
2021-10-12 11:51:34 +03:00
George Wort
c7086cb7f1 Add SVE2 support for dvermicelli
Change-Id: I056ef15e162ab6fb1f78964321ce893f4096367e
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
a38324a5a3 add arm rshift128/rshift128 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
603bc14cdd fix failing corner case, add pshufb_maskz() 2021-10-12 11:51:34 +03:00