Hong, Yang A
4fb3a48dfd
bugfix: add vbmi case for test in database.cpp
2023-09-05 13:52:10 +03:00
Hong, Yang A
6765b35d48
bugfix: add vbmi platform parameter for tests in single.cpp
2023-09-05 13:52:03 +03:00
Hong, Yang A
b7ee9102ee
update year 2022
2023-09-05 13:49:52 +03:00
Hong, Yang A
684f0ce2cb
UTF-8 validation: fix one cotec check corner issue
...
fix github issue #362
2023-09-05 13:49:41 +03:00
Konstantinos Margaritis
94fe406f0c
[VSX] correct lshiftbyte_m128/rshiftbyte_m128, variable_byte_shift
2022-09-06 23:59:51 +03:00
Konstantinos Margaritis
74ab41897c
Add missing <memory> header
2022-08-30 20:40:23 +03:00
Danila Kutenin
db52ce6f08
Fix avx512 movemask call
2022-07-20 09:03:50 +01:00
Danila Kutenin
49eb18ee4f
Optimize vectorscan for aarch64 by using shrn instruction
...
This optimization is based on the thread
https://twitter.com/Danlark1/status/1539344279268691970 and uses the
shift right and narrow by 4 instruction (SHRN): https://developer.arm.com/documentation/ddi0596/2020-12/SIMD-FP-Instructions/SHRN--SHRN2--Shift-Right-Narrow--immediate--
To achieve that, I had to slightly redesign movemask into comparemask
and add an extra step to mask iteration. Our benchmarks
showed a 10-15% improvement on average for long matches.
2022-06-26 22:55:45 +00:00
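The comparemask idea from commit 49eb18ee4f can be illustrated with a portable scalar sketch. This is not the vectorscan code: it only emulates what a shift-right-and-narrow by 4 does to a 16-byte compare result (each byte 0x00 or 0xFF), giving a 64-bit mask with 4 bits per input byte. The names `comparemask_shrn4`, `first_match` and `clear_match` are hypothetical and chosen for illustration only.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Scalar stand-in for NEON "shift right and narrow by 4" (SHRN #4) applied to
// a 16-byte compare result whose bytes are each 0x00 or 0xFF.
// Byte i of the input ends up as nibble i (bits 4i..4i+3) of the result.
std::uint64_t comparemask_shrn4(const std::array<std::uint8_t, 16>& cmp) {
    std::uint64_t mask = 0;
    for (std::size_t lane = 0; lane < 8; ++lane) {
        // View two adjacent bytes as one little-endian 16-bit lane.
        const std::uint16_t wide = static_cast<std::uint16_t>(
            cmp[2 * lane] | (cmp[2 * lane + 1] << 8));
        // Shift right by 4, then keep only the low 8 bits (the "narrow").
        const std::uint8_t narrow = static_cast<std::uint8_t>(wide >> 4);
        mask |= static_cast<std::uint64_t>(narrow) << (8 * lane);
    }
    return mask;
}

// Extra iteration step: the mask has 4 bits per byte, so the bit position of
// the lowest set bit is divided by 4. Returns 16 when the mask is empty.
std::size_t first_match(std::uint64_t mask) {
    if (mask == 0) return 16;
    std::size_t bit = 0;
    while (((mask >> bit) & 1u) == 0) ++bit;  // portable count-trailing-zeros
    return bit / 4;
}

// Clear the nibble belonging to byte `index` to move on to the next match.
std::uint64_t clear_match(std::uint64_t mask, std::size_t index) {
    return mask & ~(std::uint64_t{0xF} << (4 * index));
}
```

Compared with an x86-style movemask (1 bit per byte), the only change visible to callers in this sketch is the division by 4 when turning a bit position into a byte index, which is presumably the "additional step towards mask iteration" the commit message mentions.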
Danila Kutenin
9af996b936
Fix all ASAN issues in vectorscan
2022-02-18 17:14:51 +00:00
Konstantinos Margaritis
7cad514366
clang is more strict
2021-12-02 23:09:53 +02:00
Konstantinos Margaritis
00384c9e37
nit
2021-11-25 06:21:07 +00:00
Apostolos Tapsas
725a8d8f1a
Removed duplicates
2021-11-24 15:09:53 +00:00
Apostolos Tapsas
35e5369c70
*fix palignr implementation for VSX Release mode
...
*add unit test for palignr
*enable unit test building for Release mode
2021-11-24 15:03:49 +00:00
Apostolos Tapsas
54158a1746
vermicelli and match implementations for ppc64el added
2021-11-13 19:36:46 +00:00
apostolos
e09d8674b4
resolving conflicts after merging
2021-11-13 18:58:22 +02:00
apostolos
4114b8a480
SuperVector opandnot test enriched
2021-11-10 15:12:25 +02:00
apostolos
942deb7d80
test for load m128 from u64a function added
2021-11-10 09:01:28 +02:00
Konstantinos Margaritis
210295a702
remove vermicelli.h and replace it with vermicelli.hpp
2021-11-02 22:30:53 +02:00
Konstantinos Margaritis
bc1a1127cf
add new include file
2021-11-01 16:28:50 +00:00
Konstantinos Margaritis
7b65b298c1
add arm vector types in union, avoid -flax-conversions, fix castings
2021-11-01 16:52:17 +02:00
apostolos
d9d39d48c5
prints comments and formatting fixes
2021-11-01 10:09:15 +02:00
apostolos
3f17750a27
nits
2021-10-26 11:55:02 +03:00
apostolos
bf54aae779
Special case for Shuffle test added as well as comments for respective implementations
2021-10-26 11:48:33 +03:00
Apostolos Tapsas
1eb3b19f63
Shuffle simd and SuperVector implementations as well as their tests really fixed
2021-10-25 09:19:30 +03:00
Apostolos Tapsas
d43d6733b6
SuperVector shuffle implementation and test function optimized
2021-10-22 11:55:39 +00:00
apostolos
24f149f239
print functions keyword renamed
2021-10-22 12:36:07 +03:00
apostolos
b53b0a0fcd
test for movemask and shuffle cases added
2021-10-22 11:17:43 +03:00
Apostolos Tapsas
2b1db73326
WIP: simd & bitutils files functions fixes
2021-10-21 13:34:02 +00:00
Apostolos Tapsas
7888dd4418
WIP: Power VSX support almost completed
2021-10-14 13:53:55 +03:00
Konstantinos Margaritis
8b7ba89cb5
add x86 vsh* implementations
2021-10-12 11:51:35 +03:00
Konstantinos Margaritis
456b1c6182
no need to convert to size_t
2021-10-12 11:51:35 +03:00
Konstantinos Margaritis
67e0674df8
Changes/Additions to SuperVector class * added ==,!=,>=,>,<=,< operators * reworked shift operators to be more uniform and orthogonal, like Arm ISA * Added Unroller class to allow handling of multiple cases but avoid code duplication * pshufb method can now emulate Intel or not (avoids one instruction).
2021-10-12 11:51:34 +03:00
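The "pshufb method can now emulate Intel or not" item in commit 67e0674df8 can be pictured with a scalar sketch. It rests on an assumption the commit message does not spell out: that the non-emulating path follows Arm TBL-style lookup (any index of 16 or more yields 0), while Intel PSHUFB yields 0 only when bit 7 of the index is set and otherwise uses the low 4 bits. Under that assumption, emulation costs one extra AND with 0x8F. `pshufb_scalar` is a hypothetical name, not the SuperVector API.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Bytes16 = std::array<std::uint8_t, 16>;

// Scalar model of a 16-byte table lookup.
// EmulateIntel == false: TBL-style, any index >= 16 produces 0.
// EmulateIntel == true : PSHUFB-style, only bit 7 forces 0; otherwise the low
//                        4 bits select the table entry. Masking the index with
//                        0x8F first turns the TBL-style lookup into this
//                        behaviour (assumed to be the instruction saved).
template <bool EmulateIntel>
Bytes16 pshufb_scalar(const Bytes16& table, const Bytes16& idx) {
    Bytes16 out{};
    for (std::size_t i = 0; i < 16; ++i) {
        std::uint8_t k = idx[i];
        if (EmulateIntel) {
            k = static_cast<std::uint8_t>(k & 0x8F);  // keep bit 7 and low nibble
        }
        out[i] = (k < 16) ? table[k] : std::uint8_t{0};
    }
    return out;
}
```

The two variants differ only for indices in the range 0x10 to 0x7F, so callers that never produce such indices can skip the masking step.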
Konstantinos Margaritis
e7161fdfec
initial SSE/AVX2 implementation
2021-10-12 11:51:34 +03:00
apostolos
904a94fbe5
micro-benchmarks for shufti, truffle and noodle added
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
08357a096c
remove Windows/ICC support
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
8cff876962
fix lshift128 test
2021-10-12 11:51:34 +03:00
apostolos
67fa6d2738
alignr methods for avx2 and avx512 added
2021-10-12 11:51:34 +03:00
apostolos
b3a20afbbc
limex_shuffle and its unit tests added
2021-10-12 11:51:34 +03:00
George Wort
6c6aee9682
Implement new DoubleVermicelli16 acceleration functions using SVE2
...
Change-Id: Id4a8ffca840caab930a6e78cc0dfd0fe7d320b4e
2021-10-12 11:51:34 +03:00
George Wort
00fff3f53c
Use SVE for double shufti.
...
Change-Id: I09e0d57bb8a2f05b613f6225dea79ae823136268
2021-10-12 11:51:34 +03:00
George Wort
df926ef62f
Implement new Vermicelli16 acceleration functions using SVE2.
...
The scheme utilises the MATCH and NMATCH instructions to
scan for 16 characters at the same rate as vermicelli
scans for one.
Change-Id: Ie2cef904c56651e6108593c668e9b65bc001a886
2021-10-12 11:51:34 +03:00
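What the Vermicelli16 scan in commit df926ef62f computes can be shown with a scalar reference sketch. It is not the SVE2 implementation: it only models the result of testing every input byte against a set of 16 characters (the MATCH case) or against its complement (the NMATCH case). The function names are hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Build a 256-entry membership table from the 16 characters to search for.
static std::array<bool, 256> make_table(const std::array<std::uint8_t, 16>& set) {
    std::array<bool, 256> member{};
    for (std::uint8_t c : set) member[c] = true;
    return member;
}

// MATCH-like: index of the first byte that IS one of the 16 characters,
// or `len` if there is none.
std::size_t vermicelli16_scalar(const std::array<std::uint8_t, 16>& set,
                                const std::uint8_t* buf, std::size_t len) {
    const std::array<bool, 256> member = make_table(set);
    for (std::size_t i = 0; i < len; ++i) {
        if (member[buf[i]]) return i;
    }
    return len;
}

// NMATCH-like: index of the first byte that is NOT one of the 16 characters,
// or `len` if every byte is in the set.
std::size_t nvermicelli16_scalar(const std::array<std::uint8_t, 16>& set,
                                 const std::uint8_t* buf, std::size_t len) {
    const std::array<bool, 256> member = make_table(set);
    for (std::size_t i = 0; i < len; ++i) {
        if (!member[buf[i]]) return i;
    }
    return len;
}
```

A vectorised version performs the same membership test on a whole vector of input bytes per step, which is how, per the commit message, 16 characters can be scanned for at the rate of one.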
Konstantinos Margaritis
e35b88f2c8
use STL make_unique, remove wrapper header, breaks C++17 compilation
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
a2e6143ea1
convert to for loops
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
6c51f7f591
add {l,r}shift128()+tests, rename printv_u64() to print64()
2021-10-12 11:51:34 +03:00
George Wort
9fb79ac3ec
Add SVE2 support for vermicelli
...
Change-Id: Ia025de53521fbaefe5fb1e4425aaf75c7d80a14e
2021-10-12 11:51:34 +03:00
apostolos
89b123d003
Equal mask test fixed with random numbers
2021-10-12 11:51:34 +03:00
apostolos
6f88ecac44
Supervector test fixes
2021-10-12 11:51:34 +03:00
apostolos
ae6bc52076
SuperVector AVX512 implementations
2021-10-12 11:51:34 +03:00
apostolos
32350cf9b1
SuperVector unit tests for AVX2 and AVX512 added
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
78e098661f
tiny change in vector initialization
2021-10-12 11:51:34 +03:00