gtsoul-tech
3ced2f7ebf
shiftTooManyBitsSigned
2024-04-24 11:13:28 +03:00
gtsoul-tech
ba3603b285
uninitvar
2024-04-24 11:13:02 +03:00
gtsoul-tech
ca340c141e
invalidPrintfArgType_sint
2024-04-24 11:07:23 +03:00
Konstantinos Margaritis
1fb601f3a9
fix SIMDe emulation builds on Arm, add native translation from x86 for comparison
2023-11-27 12:21:58 +00:00
Konstantinos Margaritis
de7a376c9f
fix test for SIMDe
2023-11-23 16:07:58 +00:00
Konstantinos Margaritis
bbeec16894
use the right type of cast
2023-10-04 23:35:10 +03:00
Konstantinos Margaritis
ba81576d28
clang 16 as well
2023-10-04 22:07:34 +03:00
Konstantinos Margaritis
4ae1aebc1b
use the conditional in the right way
2023-10-04 20:35:58 +03:00
Konstantinos Margaritis
bfe1aa52f1
add conditional for __clang__
2023-10-04 20:28:35 +03:00
Konstantinos Margaritis
b5d87d3877
clang 15 (but not 16) fails on ppc64le with -Wdeprecate-lax-vec-conv-all
2023-10-04 20:09:45 +03:00
Konstantinos Margaritis
71374eea1d
Reduce unit test runtimes dramatically for debug builds
2023-10-04 19:21:30 +03:00
Hong, Yang A
06975070ae
bugfix: add vbmi case for test in database.cpp
2023-09-05 13:52:10 +03:00
Hong, Yang A
9a42397dc9
update year 2022
2023-09-05 13:49:52 +03:00
Hong, Yang A
e510f1c776
UTF-8 validation: fix one cotec check corner issue
...
fix github issue #362
2023-09-05 13:49:41 +03:00
Konstantinos Margaritis
bdc3947746
[VSX] correct lshiftbyte_m128/rshiftbyte_m128, variable_byte_shift
2022-09-06 23:59:51 +03:00
Konstantinos Margaritis
c0436e7cad
Add missing <memory> header
2022-08-30 20:40:23 +03:00
Danila Kutenin
1e09891b2b
Fix avx512 movemask call
2022-07-20 09:03:50 +01:00
Danila Kutenin
eb7b0bb50c
Optimize vectorscan for aarch64 by using shrn instruction
...
This optimization is based on the thread
https://twitter.com/Danlark1/status/1539344279268691970 and uses
shift right and narrow by 4 instruction https://developer.arm.com/documentation/ddi0596/2020-12/SIMD-FP-Instructions/SHRN--SHRN2--Shift-Right-Narrow--immediate--
To achieve that, I needed to redesign a little movemask into comparemask
and have an additional step towards mask iteration. Our benchmarks
showed 10-15% improvement on average for long matches.
2022-06-26 22:55:45 +00:00
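The shrn-based comparemask described in the commit above can be sketched in portable scalar C++. This is only an emulation for illustration, not the Vectorscan code: the helper names (`comparemask_emulated`, `match_positions`, `ctz64`) are hypothetical, and the input is assumed to be a 16-byte comparison result in which every byte is either 0x00 or 0xFF. Shifting each 16-bit lane right by 4 and narrowing it to 8 bits leaves 4 mask bits per input byte, so the byte index of a match is the bit position divided by 4, which is the additional step in mask iteration that the commit mentions.

```cpp
#include <array>
#include <cstdint>
#include <vector>

// Count trailing zero bits; the caller must pass a non-zero value.
static int ctz64(uint64_t v) {
    int n = 0;
    while ((v & 1) == 0) {
        v >>= 1;
        ++n;
    }
    return n;
}

// Scalar emulation of "shift right and narrow by 4" over a 16-byte
// comparison result. Assumption: each byte of cmp is 0x00 or 0xFF.
// Hypothetical helper, not part of the Vectorscan API.
static uint64_t comparemask_emulated(const std::array<uint8_t, 16> &cmp) {
    uint64_t mask = 0;
    for (int lane = 0; lane < 8; ++lane) {
        // View two adjacent bytes as one little-endian 16-bit lane.
        uint16_t wide = static_cast<uint16_t>(cmp[2 * lane] |
                                              (cmp[2 * lane + 1] << 8));
        // Shift right by 4, then keep only the low 8 bits (the narrow step).
        uint8_t narrow = static_cast<uint8_t>(wide >> 4);
        mask |= static_cast<uint64_t>(narrow) << (8 * lane);
    }
    return mask; // byte i of cmp occupies bits [4*i, 4*i + 3]
}

// Walk the 64-bit mask and return the byte index of every match.
static std::vector<int> match_positions(uint64_t mask) {
    std::vector<int> out;
    // Keep one bit per nibble so each matching byte is visited once.
    mask &= 0x1111111111111111ULL;
    while (mask != 0) {
        out.push_back(ctz64(mask) >> 2); // 4 mask bits per input byte
        mask &= mask - 1;                // clear the lowest set bit
    }
    return out;
}
```

With bytes 3 and 10 of the comparison result set to 0xFF, `comparemask_emulated` sets nibbles 3 and 10 of the mask, and `match_positions` returns the indices 3 and 10.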
Danila Kutenin
a526f6bb6b
Fix all ASAN issues in vectorscan
2022-02-18 17:14:51 +00:00
Konstantinos Margaritis
1609e7a56e
clang is more strict
2021-12-02 23:09:53 +02:00
Konstantinos Margaritis
77f9b7edf9
nit
2021-11-25 06:21:07 +00:00
Apostolos Tapsas
e655d76a01
*fix palignr implementation for VSX Release mode
...
*add unit test for palignr
*enable unit test building for Release mode
2021-11-24 15:03:49 +00:00
Apostolos Tapsas
aac39f3208
vermicelli and match implementations for ppc64el added
2021-11-13 19:36:46 +00:00
apostolos
2136580d50
resolving conflicts after merging
2021-11-13 18:58:22 +02:00
apostolos
6440d18b48
SuperVector opandnot test enriched
2021-11-10 15:12:25 +02:00
apostolos
537d81a27e
test for load m128 from u64a function added
2021-11-10 09:01:28 +02:00
Konstantinos Margaritis
694e2faf7f
remove vermicelli.h and replace it with vermicelli.hpp
2021-11-02 22:30:53 +02:00
Konstantinos Margaritis
0d886f7800
add new include file
2021-11-01 16:28:50 +00:00
Konstantinos Margaritis
5e59b36634
add arm vector types in union, avoid -flax-conversions, fix castings
2021-11-01 16:52:17 +02:00
apostolos
3a4d8afb48
print comments and formatting fixes
2021-11-01 10:09:15 +02:00
apostolos
b8d3d81d7f
nits
2021-10-26 11:55:02 +03:00
apostolos
d06839ad8b
Special case for Shuffle test added as well as comments for respective implementations
2021-10-26 11:48:33 +03:00
Apostolos Tapsas
4f53ec6b08
Shuffle simd and SuperVector implementations as well as their tests really fixed
2021-10-25 09:19:30 +03:00
Apostolos Tapsas
789f723814
SuperVector shuffle implementation and test function optimized
2021-10-22 11:55:39 +00:00
apostolos
ddebbeeb11
print functions keyword renamed
2021-10-22 12:36:07 +03:00
apostolos
ea5add7d4f
test for movemask and shuffle cases added
2021-10-22 11:17:43 +03:00
Apostolos Tapsas
7978b3f054
WIP: simd & bitutils files functions fixes
2021-10-21 13:34:02 +00:00
Apostolos Tapsas
3423ea5b2b
WIP: Power VSX support almost completed
2021-10-14 13:53:55 +03:00
Konstantinos Margaritis
2f55e5b54f
add x86 vsh* implementations
2021-10-12 11:51:35 +03:00
Konstantinos Margaritis
ef7da97aa1
no need to convert to size_t
2021-10-12 11:51:35 +03:00
Konstantinos Margaritis
1af82e395f
Changes/Additions to SuperVector class
* added ==, !=, >=, >, <=, < operators
* reworked shift operators to be more uniform and orthogonal, like Arm ISA
* Added Unroller class to allow handling of multiple cases but avoid code duplication
* pshufb method can now emulate Intel or not (avoids one instruction).
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
a3f083a9ff
initial SSE/AVX2 implementation
2021-10-12 11:51:34 +03:00
apostolos
bb9bcb3760
micro-benchmarks for shufti, truffle and noodle added
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
752d6cf997
fix lshift128 test
2021-10-12 11:51:34 +03:00
apostolos
b26a88efe5
alignr methods for avx2 and avx512 added
2021-10-12 11:51:34 +03:00
apostolos
150ae10ea4
limex_shuffle added and its unit tests
2021-10-12 11:51:34 +03:00
George Wort
e1f0f6baf7
Implement new DoubleVermicelli16 acceleration functions using SVE2
...
Change-Id: Id4a8ffca840caab930a6e78cc0dfd0fe7d320b4e
2021-10-12 11:51:34 +03:00
George Wort
60b2112505
Use SVE for double shufti.
...
Change-Id: I09e0d57bb8a2f05b613f6225dea79ae823136268
2021-10-12 11:51:34 +03:00
George Wort
b54710d208
Implement new Vermicelli16 acceleration functions using SVE2.
...
The scheme utilises the MATCH and NMATCH instructions to
scan for 16 characters at the same rate as vermicelli
scans for one.
Change-Id: Ie2cef904c56651e6108593c668e9b65bc001a886
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
5adbfc94b8
use STL make_unique, remove wrapper header, breaks C++17 compilation
2021-10-12 11:51:34 +03:00