Commit Graph

260 Commits

Author SHA1 Message Date
Hong, Yang A
1edddabb76 bugfix: add vbmi platform parameter for tests in single.cpp 2023-09-05 13:52:03 +03:00
Hong, Yang A
9a42397dc9 update year 2022 2023-09-05 13:49:52 +03:00
Hong, Yang A
e510f1c776 UTF-8 validation: fix one cotec check corner issue
fix github issue #362
2023-09-05 13:49:41 +03:00
Konstantinos Margaritis
bdc3947746 [VSX] correct lshiftbyte_m128/rshiftbyte_m128, variable_byte_shift 2022-09-06 23:59:51 +03:00
Konstantinos Margaritis
c0436e7cad Add missing <memory> header 2022-08-30 20:40:23 +03:00
Danila Kutenin
1e09891b2b Fix avx512 movemask call 2022-07-20 09:03:50 +01:00
Danila Kutenin
eb7b0bb50c Optimize vectorscan for aarch64 by using shrn instruction
This optimization is based on the thread
https://twitter.com/Danlark1/status/1539344279268691970 and uses
shift right and narrow by 4 instruction https://developer.arm.com/documentation/ddi0596/2020-12/SIMD-FP-Instructions/SHRN--SHRN2--Shift-Right-Narrow--immediate--

To achieve that, I needed to redesign a little movemask into comparemask
and have an additional step towards mask iteration. Our benchmarks
showed 10-15% improvement on average for long matches.
2022-06-26 22:55:45 +00:00
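The comparemask idea described in the commit above can be modelled in portable C++. This is only a scalar sketch of what a shift-right-and-narrow by 4 does to a 16-byte comparison result, not the vectorscan code itself: the names `compare_eq`, `comparemask_emulated` and `match_positions` are hypothetical, and a real aarch64 build would use the NEON instruction rather than these loops. Each input byte ends up as one 4-bit group in a 64-bit mask, so iterating over matches needs the extra step of dividing the bit position by 4.

```cpp
#include <array>
#include <cstdint>
#include <vector>

// Hypothetical helper: byte-wise equality, 0xFF where data[i] == c, else 0x00
// (the shape of result a SIMD byte compare produces).
std::array<std::uint8_t, 16> compare_eq(const std::array<std::uint8_t, 16> &data,
                                        std::uint8_t c) {
    std::array<std::uint8_t, 16> cmp{};
    for (int i = 0; i < 16; ++i) {
        cmp[i] = (data[i] == c) ? 0xFF : 0x00;
    }
    return cmp;
}

// Scalar model of "shift right and narrow by 4": treat the 16 bytes as eight
// little-endian 16-bit lanes, shift each right by 4 and keep the low 8 bits.
// The result packs one nibble per input byte into a 64-bit mask.
std::uint64_t comparemask_emulated(const std::array<std::uint8_t, 16> &cmp) {
    std::uint64_t mask = 0;
    for (int lane = 0; lane < 8; ++lane) {
        std::uint16_t v = static_cast<std::uint16_t>(
            cmp[2 * lane] | (cmp[2 * lane + 1] << 8));
        std::uint8_t narrowed = static_cast<std::uint8_t>(v >> 4);
        mask |= static_cast<std::uint64_t>(narrowed) << (8 * lane);
    }
    return mask;
}

// Mask iteration: find the lowest set bit, convert it to a byte index by
// dividing by 4, then clear that byte's whole nibble.
std::vector<int> match_positions(std::uint64_t mask) {
    std::vector<int> out;
    while (mask != 0) {
        int bit = 0;
        while (((mask >> bit) & 1u) == 0) {
            ++bit;
        }
        int idx = bit >> 2;
        out.push_back(idx);
        mask &= ~(0xFULL << (idx * 4));
    }
    return out;
}
```

Compared with a classic movemask (one bit per byte), the mask here is four times wider, which is why the iteration step shifts the bit position right by 2 before using it as a byte offset.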
Danila Kutenin
a526f6bb6b Fix all ASAN issues in vectorscan 2022-02-18 17:14:51 +00:00
Konstantinos Margaritis
1609e7a56e clang is more strict 2021-12-02 23:09:53 +02:00
Konstantinos Margaritis
77f9b7edf9 nit 2021-11-25 06:21:07 +00:00
Apostolos Tapsas
d73bf231ee Removed duplicates 2021-11-24 15:09:53 +00:00
Apostolos Tapsas
e655d76a01 *fix palignr implementation for VSX Release mode
*add unit test for palignr
*enable unit test building for Release mode
2021-11-24 15:03:49 +00:00
Apostolos Tapsas
aac39f3208 vermicelli and match implementations for ppc64el added 2021-11-13 19:36:46 +00:00
apostolos
2136580d50 resolving conflicts after merging 2021-11-13 18:58:22 +02:00
apostolos
6440d18b48 SuperVector opandnot test enriched 2021-11-10 15:12:25 +02:00
apostolos
537d81a27e test for load m128 from u64a function added 2021-11-10 09:01:28 +02:00
Konstantinos Margaritis
694e2faf7f remove vermicelli.h and replace it with vermicelli.hpp 2021-11-02 22:30:53 +02:00
Konstantinos Margaritis
0d886f7800 add new include file 2021-11-01 16:28:50 +00:00
Konstantinos Margaritis
5e59b36634 add arm vector types in union, avoid -flax-conversions, fix castings 2021-11-01 16:52:17 +02:00
apostolos
3a4d8afb48 prints comments and formatting fixes 2021-11-01 10:09:15 +02:00
apostolos
b8d3d81d7f nits 2021-10-26 11:55:02 +03:00
apostolos
d06839ad8b Special case for Shuffle test added as well as comments for respective implementations 2021-10-26 11:48:33 +03:00
Apostolos Tapsas
4f53ec6b08 Shuffle simd and SuperVector implementations as well as their test really fixed 2021-10-25 09:19:30 +03:00
Apostolos Tapsas
789f723814 SuperVector shuffle implementation and test function optimized 2021-10-22 11:55:39 +00:00
apostolos
ddebbeeb11 print functions keyword renamed 2021-10-22 12:36:07 +03:00
apostolos
ea5add7d4f test for movemask and shuffle cases added 2021-10-22 11:17:43 +03:00
Apostolos Tapsas
7978b3f054 WIP: simd & bitutils files functions fixes 2021-10-21 13:34:02 +00:00
Apostolos Tapsas
3423ea5b2b WIP: Power VSX support almost completed 2021-10-14 13:53:55 +03:00
Konstantinos Margaritis
2f55e5b54f add x86 vsh* implementations 2021-10-12 11:51:35 +03:00
Konstantinos Margaritis
ef7da97aa1 no need to convert to size_t 2021-10-12 11:51:35 +03:00
Konstantinos Margaritis
1af82e395f Changes/Additions to SuperVector class
* added ==,!=,>=,>,<=,< operators
* reworked shift operators to be more uniform and orthogonal, like Arm ISA
* Added Unroller class to allow handling of multiple cases but avoid code duplication
* pshufb method can now emulate Intel or not (avoids one instruction).
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
a3f083a9ff initial SSE/AVX2 implementation 2021-10-12 11:51:34 +03:00
apostolos
bb9bcb3760 micro-benchmarks for shufti, truffle and noodle added 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
cf4b95fff2 remove Windows/ICC support 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
752d6cf997 fix lshift128 test 2021-10-12 11:51:34 +03:00
apostolos
b26a88efe5 alignr methods for avx2 and avx512 added 2021-10-12 11:51:34 +03:00
apostolos
150ae10ea4 limex_shuffle added and its unit tests 2021-10-12 11:51:34 +03:00
George Wort
e1f0f6baf7 Implement new DoubleVermicelli16 acceleration functions using SVE2
Change-Id: Id4a8ffca840caab930a6e78cc0dfd0fe7d320b4e
2021-10-12 11:51:34 +03:00
George Wort
60b2112505 Use SVE for double shufti.
Change-Id: I09e0d57bb8a2f05b613f6225dea79ae823136268
2021-10-12 11:51:34 +03:00
George Wort
b54710d208 Implement new Vermicelli16 acceleration functions using SVE2.
The scheme utilises the MATCH and NMATCH instructions to
scan for 16 characters at the same rate as vermicelli
scans for one.

Change-Id: Ie2cef904c56651e6108593c668e9b65bc001a886
2021-10-12 11:51:34 +03:00
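As a rough illustration of the scheme in the commit above, here is a scalar model of a 16-character scan. It assumes the MATCH instruction reports, for each byte of a 16-byte block, whether it equals any byte of a 16-byte set, and that NMATCH reports the opposite. The names `match16` and `vermicelli16_sketch` are hypothetical and this is not the SVE2 implementation from the commit. It only shows why one pass over the buffer can test 16 candidate characters at once.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar model of a MATCH-like operation: lane i is true when hay[i]
// equals any of the 16 bytes in `set`.
std::array<bool, 16> match16(const std::array<std::uint8_t, 16> &hay,
                             const std::array<std::uint8_t, 16> &set) {
    std::array<bool, 16> hit{};
    for (int i = 0; i < 16; ++i) {
        hit[i] = std::find(set.begin(), set.end(), hay[i]) != set.end();
    }
    return hit;
}

// Returns the index of the first byte of `buf` that is in `set`
// (or, with negate == true, the first byte that is NOT in `set`,
// modelling NMATCH). Returns buf.size() when no such byte exists.
std::size_t vermicelli16_sketch(const std::vector<std::uint8_t> &buf,
                                const std::array<std::uint8_t, 16> &set,
                                bool negate = false) {
    for (std::size_t base = 0; base < buf.size(); base += 16) {
        std::size_t n = std::min<std::size_t>(16, buf.size() - base);
        std::array<std::uint8_t, 16> hay{};
        std::copy_n(buf.begin() + static_cast<std::ptrdiff_t>(base), n, hay.begin());
        std::array<bool, 16> hit = match16(hay, set);
        // Only the first n lanes hold real data; padding lanes are ignored.
        for (std::size_t i = 0; i < n; ++i) {
            if (hit[i] != negate) {
                return base + i;
            }
        }
    }
    return buf.size();
}
```

The block loop does the same amount of work per 16 bytes regardless of how many of the 16 set entries are distinct, which mirrors the commit's claim that the 16-character scan runs at the rate of a single-character one.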
Konstantinos Margaritis
5adbfc94b8 use STL make_unique, remove wrapper header, breaks C++17 compilation 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
d6fd17ec82 convert to for loops 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
5fd1ed58e6 add {l,r}shift128()+tests, rename printv_u64() to print64() 2021-10-12 11:51:34 +03:00
George Wort
acfa11a34f Add SVE2 support for vermicelli
Change-Id: Ia025de53521fbaefe5fb1e4425aaf75c7d80a14e
2021-10-12 11:51:34 +03:00
apostolos
ce9ffe9bce Equal mask test fixed with random numbers 2021-10-12 11:51:34 +03:00
apostolos
b1dfc6abc4 Supervector test fixes 2021-10-12 11:51:34 +03:00
apostolos
a369e3aa53 SuperVector AVX512 implementations 2021-10-12 11:51:34 +03:00
apostolos
3f72b681cc SuperVector unit tests for AVX2 and AVX512 added 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
1f496a1411 tiny change in vector initialization 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
f59be47288 harmonise syntax of x86 SuperVector impl.cpp like arm, fix alignr, define printv_* functions when on debug mode only 2021-10-12 11:51:34 +03:00