77 Commits

Author SHA1 Message Date
Konstantinos Margaritis
0d2f9ccbaa Fix 'unqualified call to std::move' errors in clang 15+ 2023-10-03 20:24:39 +03:00
Danila Kutenin
49eb18ee4f Optimize vectorscan for aarch64 by using shrn instruction
This optimization is based on the thread
https://twitter.com/Danlark1/status/1539344279268691970 and uses the
"shift right and narrow by 4" instruction https://developer.arm.com/documentation/ddi0596/2020-12/SIMD-FP-Instructions/SHRN--SHRN2--Shift-Right-Narrow--immediate--

To achieve that, I needed to slightly redesign movemask into comparemask
and add an extra step to mask iteration. Our benchmarks
showed a 10-15% improvement on average for long matches.
2022-06-26 22:55:45 +00:00
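For readers unfamiliar with the comparemask idea described in the commit above, here is a minimal portable sketch. It emulates the narrowing step in plain C++ rather than using NEON intrinsics, and every name in it (eq_bytes, compare_mask, match_positions) is hypothetical and not taken from the vectorscan sources. It assumes a 16-byte block and little-endian lane layout; the real code may differ in all of these details.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: byte-wise equality compare, 0xFF where data[i] == c,
// 0x00 elsewhere (what a SIMD compare-equal would produce).
std::array<std::uint8_t, 16> eq_bytes(const std::array<std::uint8_t, 16>& data,
                                      std::uint8_t c) {
    std::array<std::uint8_t, 16> eq{};
    for (std::size_t i = 0; i < 16; ++i) {
        eq[i] = (data[i] == c) ? 0xFF : 0x00;
    }
    return eq;
}

// Emulates "shift right and narrow by 4": treat the 16 bytes as eight
// little-endian 16-bit lanes, shift each right by 4 and keep the low 8 bits.
// The result is a 64-bit mask with 4 bits per input byte.
std::uint64_t compare_mask(const std::array<std::uint8_t, 16>& eq) {
    std::uint64_t mask = 0;
    for (std::size_t lane = 0; lane < 8; ++lane) {
        const std::uint16_t v = static_cast<std::uint16_t>(
            eq[2 * lane] | (static_cast<unsigned>(eq[2 * lane + 1]) << 8));
        const std::uint8_t narrowed = static_cast<std::uint8_t>(v >> 4);
        mask |= static_cast<std::uint64_t>(narrowed) << (8 * lane);
    }
    return mask;
}

// The "additional step towards mask iteration": keep one bit per nibble,
// then divide the bit index by 4 to recover the byte position.
std::vector<std::size_t> match_positions(std::uint64_t mask) {
    std::vector<std::size_t> positions;
    mask &= 0x8888888888888888ULL;  // one representative bit per input byte
    while (mask != 0) {
        std::size_t bit = 0;
        while (((mask >> bit) & 1U) == 0) {
            ++bit;  // portable count-trailing-zeros
        }
        positions.push_back(bit >> 2);
        mask &= mask - 1;  // clear the lowest set bit
    }
    return positions;
}
```

Calling match_positions(compare_mask(eq_bytes(block, 'a'))) on a 16-byte block returns the indices at which 'a' occurs, in ascending order. The 4-bits-per-byte layout is why iteration needs the extra divide-by-4 step compared with a classic 1-bit-per-byte movemask.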
Danila Kutenin
9af996b936 Fix all ASAN issues in vectorscan 2022-02-18 17:14:51 +00:00
Konstantinos Margaritis
81fba99f3a fix SVE2 build after the changes 2021-11-25 18:48:24 +02:00
Konstantinos Margaritis
210295a702 remove vermicelli.h and replace it with vermicelli.hpp 2021-11-02 22:30:53 +02:00
Konstantinos Margaritis
bc1a1127cf add new include file 2021-11-01 16:28:50 +00:00
Konstantinos Margaritis
713aaef799 move casemask helper functions to separate header 2021-11-01 16:05:43 +00:00
Konstantinos Margaritis
4e044d4142 Add missing copyright info from tampered files 2021-10-12 11:51:35 +03:00
George Wort
df926ef62f Implement new Vermicelli16 acceleration functions using SVE2.
The scheme utilises the MATCH and NMATCH instructions to
scan for 16 characters at the same rate as vermicelli
scans for one.

Change-Id: Ie2cef904c56651e6108593c668e9b65bc001a886
2021-10-12 11:51:34 +03:00
George Wort
c7086cb7f1 Add SVE2 support for dvermicelli
Change-Id: I056ef15e162ab6fb1f78964321ce893f4096367e
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
e35b88f2c8 use STL make_unique, remove wrapper header, breaks C++17 compilation 2021-10-12 11:51:34 +03:00
George Wort
9fb79ac3ec Add SVE2 support for vermicelli
Change-Id: Ia025de53521fbaefe5fb1e4425aaf75c7d80a14e
2021-10-12 11:51:34 +03:00
George Wort
7162446358 Remove possibly undefined behaviour from Noodle.
Change-Id: I9a7997cea6a48927cb02b00c5dba5009bbf83850
2021-10-12 11:51:34 +03:00
George Wort
b48ea2c1a6 Remove first check from scanDouble Noodle.
Change-Id: I00eabb3cb06ef6a2060df52c26fa8591907a2711
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
41ff0962c4 minor fixes 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
2753dbb3b0 rename supervector class header, use dup_*() functions names instead of set1_*(), minor fixes 2021-10-12 11:51:34 +03:00
George Wort
d1009e8830 Fix error in initial noodle double final call.
Change-Id: Ie044988f183b47e0b2f1eed3b4bd23de75c3117d
2021-10-12 11:51:34 +03:00
George Wort
d6df8116a5 Add SVE2 support for noodle
Change-Id: Iacb7d1f164bdd0ba50e2e13d26fe548cf9b45a6a
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
e215157a21 move definitions elsewhere 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
05c7c8e576 move SuperVector versions of noodleEngine scan functions to _simd.hpp file 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
c6406bebde simplify scanSingleMain() and scanDoubleMain() 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
f77837130d delete separate implementations 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
ede2b18564 add generic SIMD implementation 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
7a9a2dd0dc convert to C++ 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
831091db9e fix typo 2021-10-12 11:51:33 +03:00
Konstantinos Margaritis
ec5531a6b1 minor optimizations 2021-10-12 11:50:32 +03:00
Konstantinos Margaritis
d3ff893871 prefetch works best when addresses are 64-byte aligned 2021-10-12 11:50:32 +03:00
Konstantinos Margaritis
58cface115 optimise case handling 2021-10-12 11:50:32 +03:00
Konstantinos Margaritis
e3e101b412 simplify and make scanSingle*()/scanDouble*() more uniform 2021-10-12 11:50:32 +03:00
Konstantinos Margaritis
2f13ad0674 optimize caseMask handling 2021-10-12 11:50:32 +03:00
Konstantinos Margaritis
27bd09454f use correct function names for AVX512, fix build failure 2021-02-15 13:54:19 +02:00
Konstantinos Margaritis
9fd94e0062 use unaligned loads for short scans 2021-02-11 14:21:57 +02:00
Konstantinos Margaritis
d3e03ed88a optimize case mask AND out of the loop 2021-02-10 13:29:45 +02:00
Konstantinos Margaritis
5333467249 fix names, use own intrinsic instead of explicit _mm* ones 2020-09-23 11:51:21 +03:00
Konstantinos Margaritis
8ed5f4ac75 fix include paths for masked_move 2020-09-18 12:55:57 +03:00
Wang Xiang W
f658c4e149 Noodle: avoid an extra convert instruction
fixes github issue #221
2020-05-25 13:46:42 +00:00
Hong, Yang A
23e5f06594 add new Literal API for pure literal expressions:
Add the compile-time API functions hs_compile_lit() and hs_compile_lit_multi()
to handle pure literal pattern sets. A corresponding option, --literal-on,
is added to the hyperscan testing suites. Extended parameters and some
flags are not supported by this API.
2019-08-13 14:51:38 +08:00
Hong, Yang A
f68723a606 literal matching: separate path for pure literal patterns 2019-01-21 09:59:22 +08:00
Justin Viiret
af519f3190 hwlm_build: default for HWLMProto::make_small
Silences Coverity warning.
2017-09-18 13:29:34 +10:00
Alex Coyte
41783fe912 more comments on hwlm/fdr's start parameter 2017-08-21 11:23:41 +10:00
Wang, Xiang W
86c5f7feb1 FDR: Squash buckets of included literals in FDR confirm
- Change the compile of literal matchers to two passes.
- Reverse the bucket assignment in FDR, so that a bucket with longer literals has
  a smaller bucket id.
- Squash the buckets of included literals and jump to the program of
  included literals directly from the parent literal program without going
  through FDR confirm for included literals.
2017-08-21 11:12:36 +10:00
Wang, Xiang W
67a8f43355 literal matchers: change context passed to callback to scratch 2017-08-21 11:12:36 +10:00
Wang, Xiang W
ebb1b0006b remove start argument in literal matcher callbacks 2017-08-21 11:12:36 +10:00
Matthew Barr
35d396d061 noodle: correct streaming bounds 2017-08-21 11:12:26 +10:00
Matthew Barr
f2b97a51d8 noodle: param name 2017-08-21 11:12:26 +10:00
Matthew Barr
166f5d8ba5 noodle: scan using the correct offsets 2017-08-21 11:12:24 +10:00
Matthew Barr
31a445a0e8 noodle: behave like our other literal matchers
Noodle now supports supplementary masks.
2017-08-21 11:10:20 +10:00
Matthew Barr
9c538a7522 Move hwlm literal len define 2017-08-21 11:10:20 +10:00
Matthew Barr
293f9fcc49 noodle: we don't need memcpy 2017-08-21 11:10:20 +10:00
Matthew Barr
4be7d6fecc noodle: Use a sane temp buf for streaming 2017-08-21 11:10:18 +10:00