Konstantinos Margaritis
3ed0c593f4
Fix 'unqualified call to std::move' errors in clang 15+
2023-10-03 20:24:39 +03:00
Danila Kutenin
eb7b0bb50c
Optimize vectorscan for aarch64 by using shrn instruction
...
This optimization is based on the thread
https://twitter.com/Danlark1/status/1539344279268691970 and uses the
shift right and narrow by 4 (SHRN) instruction https://developer.arm.com/documentation/ddi0596/2020-12/SIMD-FP-Instructions/SHRN--SHRN2--Shift-Right-Narrow--immediate--
To achieve that, I needed to slightly redesign movemask into comparemask
and add an extra step to mask iteration. Our benchmarks
showed a 10-15% improvement on average for long matches.
2022-06-26 22:55:45 +00:00
Danila Kutenin
a526f6bb6b
Fix all ASAN issues in vectorscan
2022-02-18 17:14:51 +00:00
Konstantinos Margaritis
cc1a8dd47e
fix SVE2 build after the changes
2021-11-25 18:48:24 +02:00
Konstantinos Margaritis
694e2faf7f
remove vermicelli.h and replace it with vermicelli.hpp
2021-11-02 22:30:53 +02:00
Konstantinos Margaritis
0d886f7800
add new include file
2021-11-01 16:28:50 +00:00
Konstantinos Margaritis
9e69273807
move casemask helper functions to separate header
2021-11-01 16:05:43 +00:00
Konstantinos Margaritis
2b3d0a355b
Add missing copyright info from tampered files
2021-10-12 11:51:35 +03:00
George Wort
b54710d208
Implement new Vermicelli16 acceleration functions using SVE2.
...
The scheme utilises the MATCH and NMATCH instructions to
scan for 16 characters at the same rate as vermicelli
scans for one.
Change-Id: Ie2cef904c56651e6108593c668e9b65bc001a886
2021-10-12 11:51:34 +03:00
George Wort
b6a7ee7e84
Add SVE2 support for dvermicelli
...
Change-Id: I056ef15e162ab6fb1f78964321ce893f4096367e
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
5adbfc94b8
use STL make_unique; remove the wrapper header, which breaks C++17 compilation
2021-10-12 11:51:34 +03:00
George Wort
acfa11a34f
Add SVE2 support for vermicelli
...
Change-Id: Ia025de53521fbaefe5fb1e4425aaf75c7d80a14e
2021-10-12 11:51:34 +03:00
George Wort
b2332218a4
Remove possibly undefined behaviour from Noodle.
...
Change-Id: I9a7997cea6a48927cb02b00c5dba5009bbf83850
2021-10-12 11:51:34 +03:00
George Wort
ddffd031ed
Remove first check from scanDouble Noodle.
...
Change-Id: I00eabb3cb06ef6a2060df52c26fa8591907a2711
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
31aca74801
minor fixes
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
c2a5de03e0
rename supervector class header, use dup_*() function names instead of set1_*(), minor fixes
2021-10-12 11:51:34 +03:00
George Wort
db26cdd4bf
Fix error in initial noodle double final call.
...
Change-Id: Ie044988f183b47e0b2f1eed3b4bd23de75c3117d
2021-10-12 11:51:34 +03:00
George Wort
0ba1cbb32b
Add SVE2 support for noodle
...
Change-Id: Iacb7d1f164bdd0ba50e2e13d26fe548cf9b45a6a
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
94089d9acb
move definitions elsewhere
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
4fe9a75abc
move SuperVector versions of noodleEngine scan functions to _simd.hpp file
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
923f8bd357
simplify scanSingleMain() and scanDoubleMain()
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
4d8fbe95a7
delete separate implementations
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
7c303b62e3
add generic SIMD implementation
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
718cc7be1d
convert to C++
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
5171627e3b
fix typo
2021-10-12 11:51:33 +03:00
Konstantinos Margaritis
361aa4b900
minor optimizations
2021-10-12 11:50:32 +03:00
Konstantinos Margaritis
acb542a5be
prefetch works best when addresses are 64-byte aligned
2021-10-12 11:50:32 +03:00
Konstantinos Margaritis
95b929ed26
optimise case handling
2021-10-12 11:50:32 +03:00
Konstantinos Margaritis
ffb6a95e72
simplify and make scanSingle*()/scanDouble*() more uniform
2021-10-12 11:50:32 +03:00
Konstantinos Margaritis
da6216e42d
optimize caseMask handling
2021-10-12 11:50:32 +03:00
Konstantinos Margaritis
1c1bca4f98
use correct function names for AVX512, fix build failure
2021-02-15 13:54:19 +02:00
Konstantinos Margaritis
e12a6a0fd2
use unaligned loads for short scans
2021-02-11 14:21:57 +02:00
Konstantinos Margaritis
ecd2842217
optimize case mask AND out of the loop
2021-02-10 13:29:45 +02:00
Konstantinos Margaritis
8ef26f19fc
fix names, use own intrinsic instead of explicit _mm* ones
2020-09-23 11:51:21 +03:00
Konstantinos Margaritis
09993e5190
fix include paths for masked_move
2020-09-18 12:55:57 +03:00
Wang Xiang W
f658c4e149
Noodle: avoid an extra convert instruction
...
fixes github issue #221
2020-05-25 13:46:42 +00:00
Hong, Yang A
23e5f06594
add new Literal API for pure literal expressions:
...
Add the compile-time API functions hs_compile_lit() and hs_compile_lit_multi()
to handle pure literal pattern sets. The corresponding option --literal-on
is added to the Hyperscan test suites. Extended parameters and some
flags are not supported by this API.
2019-08-13 14:51:38 +08:00
Hong, Yang A
f68723a606
literal matching: separate path for pure literal patterns
2019-01-21 09:59:22 +08:00
Justin Viiret
af519f3190
hwlm_build: default for HWLMProto::make_small
...
Silences Coverity warning.
2017-09-18 13:29:34 +10:00
Alex Coyte
41783fe912
more comments on hwlm/fdr's start parameter
2017-08-21 11:23:41 +10:00
Wang, Xiang W
86c5f7feb1
FDR: Squash buckets of included literals in FDR confirm
...
- Change the compilation of literal matchers to two passes.
- Reverse the bucket assignment in FDR: buckets with longer literals get
smaller bucket ids.
- Squash the buckets of included literals and jump to the program of
included literals directly from the parent literal program without going
through FDR confirm for included literals.
2017-08-21 11:12:36 +10:00
Wang, Xiang W
67a8f43355
literal matchers: change context passed to callback to scratch
2017-08-21 11:12:36 +10:00
Wang, Xiang W
ebb1b0006b
remove start argument in literal matcher callbacks
2017-08-21 11:12:36 +10:00
Matthew Barr
35d396d061
noodle: correct streaming bounds
2017-08-21 11:12:26 +10:00
Matthew Barr
f2b97a51d8
noodle: param name
2017-08-21 11:12:26 +10:00
Matthew Barr
166f5d8ba5
noodle: scan using the correct offsets
2017-08-21 11:12:24 +10:00
Matthew Barr
31a445a0e8
noodle: behave like our other literal matchers
...
Noodle now supports supplementary masks.
2017-08-21 11:10:20 +10:00
Matthew Barr
9c538a7522
Move hwlm literal len define
2017-08-21 11:10:20 +10:00
Matthew Barr
293f9fcc49
noodle: we don't need memcpy
2017-08-21 11:10:20 +10:00
Matthew Barr
4be7d6fecc
noodle: Use a sane temp buf for streaming
2017-08-21 11:10:18 +10:00