Hong, Yang A
e05b154abb
mcclellan: improve wide-state checking in Sherman optimization
...
fixes github issue #305
2022-08-29 15:03:06 +03:00
Danila Kutenin
eb7b0bb50c
Optimize vectorscan for aarch64 by using shrn instruction
...
This optimization is based on the thread
https://twitter.com/Danlark1/status/1539344279268691970 and uses
shift right and narrow by 4 instruction https://developer.arm.com/documentation/ddi0596/2020-12/SIMD-FP-Instructions/SHRN--SHRN2--Shift-Right-Narrow--immediate--
To achieve that, I had to slightly redesign movemask into comparemask
and add an extra step to mask iteration. Our benchmarks
showed a 10-15% improvement on average for long matches.
2022-06-26 22:55:45 +00:00
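The comparemask idea described in the commit above can be illustrated with a small scalar model. This is only a sketch under stated assumptions, not the code from the commit: it emulates a byte-wise compare and the shift-right-narrow-by-4 step in plain C++ so that it runs on any machine, and the function names (`compare_eq`, `comparemask`, `first_match_index`, `clear_match`) are hypothetical. The point it shows is that the narrowed result holds 4 bits per input byte, so iterating over matches needs the extra step of dividing the bit position by 4.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Scalar model of a byte-wise vector compare: 0xFF where the byte equals
// `needle`, 0x00 elsewhere. (Hypothetical helper, for illustration only.)
void compare_eq(const uint8_t in[16], uint8_t needle, uint8_t out[16]) {
    for (size_t i = 0; i < 16; ++i) {
        out[i] = (in[i] == needle) ? 0xFF : 0x00;
    }
}

// Scalar model of "shift right and narrow by 4": view the 16 bytes as eight
// little-endian 16-bit lanes, shift each lane right by 4 and keep its low
// 8 bits. The 64-bit result carries one nibble per input byte.
uint64_t comparemask(const uint8_t cmp[16]) {
    uint64_t mask = 0;
    for (size_t lane = 0; lane < 8; ++lane) {
        uint16_t v = static_cast<uint16_t>(cmp[2 * lane] |
                                           (cmp[2 * lane + 1] << 8));
        uint8_t narrowed = static_cast<uint8_t>(v >> 4);
        mask |= static_cast<uint64_t>(narrowed) << (8 * lane);
    }
    return mask;
}

// The additional iteration step: each byte owns 4 mask bits, so the byte
// index is the position of the lowest set bit divided by 4.
unsigned first_match_index(uint64_t mask) {
    assert(mask != 0);
    unsigned bit = 0;
    while ((mask & 1u) == 0) {
        mask >>= 1;
        ++bit;
    }
    return bit >> 2;
}

// Clear the nibble belonging to byte `idx` so the next match can be found.
uint64_t clear_match(uint64_t mask, unsigned idx) {
    return mask & ~(0xFULL << (4 * idx));
}
```

A caller would loop while the mask is non-zero, taking `first_match_index` and then `clear_match` on each pass. With a classic movemask the mask has one bit per byte, so the index is the bit position itself; here the division by 4 is the cost of using the cheaper narrowing step.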
Danila Kutenin
faca38e058
Fix a couple of tests
2022-02-18 19:31:03 +00:00
Danila Kutenin
a526f6bb6b
Fix all ASAN issues in vectorscan
2022-02-18 17:14:51 +00:00
Konstantinos Margaritis
cc1a8dd47e
fix SVE2 build after the changes
2021-11-25 18:48:24 +02:00
Apostolos Tapsas
8a6c3f81e4
Removed accidentally included header file
2021-11-24 12:11:21 +00:00
Apostolos Tapsas
aac39f3208
vermicelli and match implementations for ppc64el added
2021-11-13 19:36:46 +00:00
apostolos
2136580d50
resolving conflicts after merging
2021-11-13 18:58:22 +02:00
Konstantinos Margaritis
3fd710706a
split vermicelli block implementations per arch
2021-11-08 19:45:21 +00:00
Apostolos Tapsas
5b18538373
SuperVector constructors as well as andnot implementation fixed
2021-11-05 13:34:48 +00:00
Konstantinos Margaritis
6317e24a82
add len parameter and mask, fixes corner cases on AVX512
2021-11-05 14:30:22 +02:00
Konstantinos Margaritis
694e2faf7f
remove vermicelli.h and replace it with vermicelli.hpp
2021-11-02 22:30:53 +02:00
Konstantinos Margaritis
80286f38cb
refactor vermicelliDoubleMaskedExec()
2021-11-02 22:30:21 +02:00
Konstantinos Margaritis
4db360c7b6
complete refactoring and unification of Vermicelli functions
2021-11-01 16:28:50 +00:00
Konstantinos Margaritis
94b467dc12
remove unneeded header
2021-11-01 16:28:50 +00:00
Konstantinos Margaritis
0d886f7800
add new include file
2021-11-01 16:28:50 +00:00
Konstantinos Margaritis
54245bc5ac
renamed matcher functions, added new ones for Vermicelli
2021-11-01 16:28:50 +00:00
Konstantinos Margaritis
1c1a06aaae
nits
2021-11-01 16:05:43 +00:00
Konstantinos Margaritis
d6fe28afc8
added refactored vermicelli_simd.cpp implementation
2021-11-01 16:05:43 +00:00
apostolos
3a4d8afb48
prints, comments and formatting fixes
2021-11-01 10:09:15 +02:00
Apostolos Tapsas
4f53ec6b08
Shuffle simd and SuperVector implementations as well as their test really fixed
2021-10-25 09:19:30 +03:00
Apostolos Tapsas
789f723814
SuperVector shuffle implementation and test function optimized
2021-10-22 11:55:39 +00:00
apostolos
ea5add7d4f
test for movemask and shuffle cases added
2021-10-22 11:17:43 +03:00
Apostolos Tapsas
7978b3f054
WIP: simd & bitutils files functions fixes
2021-10-21 13:34:02 +00:00
apostolos
fd905a0c9e
truffle and shufti implementations for ARCH_PPC64EL
2021-10-14 16:01:21 +03:00
apostolos
6aac8241b1
blockSingleMask implementations for ARCH_PPC64 added
2021-10-14 15:56:13 +03:00
Konstantinos Margaritis
2b3d0a355b
Add missing copyright info from tampered files
2021-10-12 11:51:35 +03:00
Konstantinos Margaritis
ae81088193
add arm truffle block function
2021-10-12 11:51:35 +03:00
Konstantinos Margaritis
a654204122
simplify truffle and provide arch-specific block functions
2021-10-12 11:51:35 +03:00
Konstantinos Margaritis
92e0b9a351
simplify shufti and provide arch-specific block functions
2021-10-12 11:51:35 +03:00
Konstantinos Margaritis
f6f7d7a039
optimize and simplify Shufti and Truffle to work with a single block method instead
2021-10-12 11:51:35 +03:00
Konstantinos Margaritis
1503d9a946
remove asserts, as they are not needed
2021-10-12 11:51:35 +03:00
Konstantinos Margaritis
690e3c24e6
fix for new pshufb
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
a3f083a9ff
initial SSE/AVX2 implementation
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
cf4b95fff2
remove Windows/ICC support
2021-10-12 11:51:34 +03:00
apostolos
150ae10ea4
limex_shuffle added and its unit tests
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
b9fbfb1204
remove duplicate functions from previous merge
2021-10-12 11:51:34 +03:00
George Wort
3bdd48fd61
Move SVE functions into their own files.
...
Change-Id: I995ba4b7d2b558ee403693ee45d747d414d3b177
2021-10-12 11:51:34 +03:00
George Wort
e1f0f6baf7
Implement new DoubleVermicelli16 acceleration functions using SVE2
...
Change-Id: Id4a8ffca840caab930a6e78cc0dfd0fe7d320b4e
2021-10-12 11:51:34 +03:00
George Wort
60b2112505
Use SVE for double shufti.
...
Change-Id: I09e0d57bb8a2f05b613f6225dea79ae823136268
2021-10-12 11:51:34 +03:00
George Wort
87ee8d4d7f
Use SVE for single shufti.
...
Change-Id: Ic76940c5bb9b81a1c45d39e9ca396a158c50a7dc
2021-10-12 11:51:34 +03:00
George Wort
b54710d208
Implement new Vermicelli16 acceleration functions using SVE2.
...
The scheme utilises the MATCH and NMATCH instructions to
scan for 16 characters at the same rate as vermicelli
scans for one.
Change-Id: Ie2cef904c56651e6108593c668e9b65bc001a886
2021-10-12 11:51:34 +03:00
George Wort
b6a7ee7e84
Add SVE2 support for dvermicelli
...
Change-Id: I056ef15e162ab6fb1f78964321ce893f4096367e
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
3296d538ea
add arm rshift128/rshift128
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
0033cec725
fix failing corner case, add pshufb_maskz()
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
5adbfc94b8
use STL make_unique, remove wrapper header, breaks C++17 compilation
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
0ec5dc37ca
remove low4bits from the arguments, fix cases that mostly affect loading large (64) vectors and falling out of bounds
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
03e7d788b6
use rshift128() instead of vector-wide right shift
2021-10-12 11:51:34 +03:00
George Wort
7e5138b78f
Fix CROSS_COMPILE_AARCH64 for SVE issues.
...
Change-Id: I7b9ba3ccb754d96eee22ca01714c783dae1e4956
2021-10-12 11:51:34 +03:00
George Wort
acfa11a34f
Add SVE2 support for vermicelli
...
Change-Id: Ia025de53521fbaefe5fb1e4425aaf75c7d80a14e
2021-10-12 11:51:34 +03:00