Konstantinos Margaritis
|
7dbcab34c2
|
WIP: Refactor CMake build system to be more modular
|
2023-10-08 23:27:24 +03:00 |
|
Konstantinos Margaritis
|
bafaffb967
|
remove extra print
|
2023-10-06 12:08:36 +03:00 |
|
Konstantinos Margaritis
|
d2142ea96f
|
Reduce debug unit tests runtime even more
In single.cpp featuremask with AVX512 features is not relevant to non-x86 platforms,
and just extends the runtime for no reason.
|
2023-10-05 19:12:58 +03:00 |
|
Konstantinos Margaritis
|
37edb70936
|
Don't run regression UE_2595 on debug, it times out CI
|
2023-10-05 10:40:30 +03:00 |
|
Konstantinos Margaritis
|
bbeec16894
|
use the right type of cast
|
2023-10-04 23:35:10 +03:00 |
|
Konstantinos Margaritis
|
ba81576d28
|
clang 16 as well
|
2023-10-04 22:07:34 +03:00 |
|
Konstantinos Margaritis
|
4ae1aebc1b
|
use the conditional in the right way
|
2023-10-04 20:35:58 +03:00 |
|
Konstantinos Margaritis
|
bfe1aa52f1
|
add conditional for __clang__
|
2023-10-04 20:28:35 +03:00 |
|
Konstantinos Margaritis
|
b5d87d3877
|
clang 15 (but not 16) fails on ppc64le with -Wdeprecate-lax-vec-conv-all
|
2023-10-04 20:09:45 +03:00 |
|
Konstantinos Margaritis
|
71374eea1d
|
Reduce unit test runtimes dramatically for debug builds
|
2023-10-04 19:21:30 +03:00 |
|
Hong, Yang A
|
06975070ae
|
bugfix: add vbmi case for test in database.cpp
|
2023-09-05 13:52:10 +03:00 |
|
Hong, Yang A
|
1edddabb76
|
bugfix: add vbmi platform parameter for tests in single.cpp
|
2023-09-05 13:52:03 +03:00 |
|
Hong, Yang A
|
9a42397dc9
|
update year 2022
|
2023-09-05 13:49:52 +03:00 |
|
Hong, Yang A
|
e510f1c776
|
UTF-8 validation: fix one cotec check corner issue
fix github issue #362
|
2023-09-05 13:49:41 +03:00 |
|
Konstantinos Margaritis
|
bdc3947746
|
[VSX] correct lshiftbyte_m128/rshiftbyte_m128, variable_byte_shift
|
2022-09-06 23:59:51 +03:00 |
|
Konstantinos Margaritis
|
c0436e7cad
|
Add missing <memory> header
|
2022-08-30 20:40:23 +03:00 |
|
Danila Kutenin
|
1e09891b2b
|
Fix avx512 movemask call
|
2022-07-20 09:03:50 +01:00 |
|
Danila Kutenin
|
eb7b0bb50c
|
Optimize vectorscan for aarch64 by using shrn instruction
This optimization is based on the thread
https://twitter.com/Danlark1/status/1539344279268691970 and uses the
"shift right and narrow by 4" instruction https://developer.arm.com/documentation/ddi0596/2020-12/SIMD-FP-Instructions/SHRN--SHRN2--Shift-Right-Narrow--immediate--
To achieve that, I needed to slightly redesign movemask into comparemask
and add an extra step to mask iteration. Our benchmarks
showed a 10-15% improvement on average for long matches.
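A minimal sketch of the idea, written as a portable scalar model and not taken from the vectorscan sources: a 16-byte comparison result (each byte 0xFF or 0x00) is viewed as eight 16-bit lanes, each lane is shifted right by 4 and narrowed to 8 bits, giving a 64-bit mask with one nibble per input byte. The function names `comparemask_model`, `match_positions` and `ctz64` are hypothetical, and little-endian lane order is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Count trailing zero bits of a non-zero 64-bit value (portable loop).
static unsigned ctz64(std::uint64_t x) {
    assert(x != 0);
    unsigned n = 0;
    while ((x & 1u) == 0) {
        x >>= 1;
        ++n;
    }
    return n;
}

// Scalar model of "shift right and narrow by 4" on eight 16-bit lanes.
// cmp holds a byte-wise comparison result: 0xFF for a match, 0x00 otherwise.
// Byte i of the input ends up as nibble i of the returned 64-bit mask.
std::uint64_t comparemask_model(const std::uint8_t cmp[16]) {
    std::uint64_t mask = 0;
    for (std::size_t lane = 0; lane < 8; ++lane) {
        // Assemble one little-endian 16-bit lane from two adjacent bytes.
        const std::uint16_t v = static_cast<std::uint16_t>(
            cmp[2 * lane] | (cmp[2 * lane + 1] << 8));
        // Shift right by 4, then keep only the low 8 bits (the "narrow").
        const std::uint8_t narrowed = static_cast<std::uint8_t>(v >> 4);
        mask |= static_cast<std::uint64_t>(narrowed) << (8 * lane);
    }
    return mask;
}

// The extra iteration step: keep one bit per nibble, then map each set
// bit back to a byte index by dividing its bit position by 4.
std::vector<std::size_t> match_positions(std::uint64_t mask) {
    std::vector<std::size_t> out;
    mask &= 0x1111111111111111ULL;
    while (mask != 0) {
        out.push_back(ctz64(mask) >> 2);
        mask &= mask - 1;  // clear the lowest set bit
    }
    return out;
}
```

On real aarch64 hardware the narrowing loop would be a single SHRN instruction; the scalar loop here only illustrates the bit layout that makes the additional iteration step necessary.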
|
2022-06-26 22:55:45 +00:00 |
|
Danila Kutenin
|
a526f6bb6b
|
Fix all ASAN issues in vectorscan
|
2022-02-18 17:14:51 +00:00 |
|
Konstantinos Margaritis
|
1609e7a56e
|
clang is more strict
|
2021-12-02 23:09:53 +02:00 |
|
Konstantinos Margaritis
|
77f9b7edf9
|
nit
|
2021-11-25 06:21:07 +00:00 |
|
Apostolos Tapsas
|
d73bf231ee
|
Removed duplicates
|
2021-11-24 15:09:53 +00:00 |
|
Apostolos Tapsas
|
e655d76a01
|
*fix palignr implementation for VSX Release mode
*add unit test for palignr
*enable unit test building for Release mode
|
2021-11-24 15:03:49 +00:00 |
|
Apostolos Tapsas
|
aac39f3208
|
vermicelli and match implementations for ppc64el added
|
2021-11-13 19:36:46 +00:00 |
|
apostolos
|
2136580d50
|
resolving conflicts after merging
|
2021-11-13 18:58:22 +02:00 |
|
apostolos
|
6440d18b48
|
SuperVector opandnot test enriched
|
2021-11-10 15:12:25 +02:00 |
|
apostolos
|
537d81a27e
|
test for load m128 from u64a function added
|
2021-11-10 09:01:28 +02:00 |
|
Konstantinos Margaritis
|
694e2faf7f
|
remove vermicelli.h and replace it with vermicelli.hpp
|
2021-11-02 22:30:53 +02:00 |
|
Konstantinos Margaritis
|
0d886f7800
|
add new include file
|
2021-11-01 16:28:50 +00:00 |
|
Konstantinos Margaritis
|
5e59b36634
|
add arm vector types in union, avoid -flax-conversions, fix castings
|
2021-11-01 16:52:17 +02:00 |
|
apostolos
|
3a4d8afb48
|
prints comments and formatting fixes
|
2021-11-01 10:09:15 +02:00 |
|
apostolos
|
b8d3d81d7f
|
nits
|
2021-10-26 11:55:02 +03:00 |
|
apostolos
|
d06839ad8b
|
Special case for Shuffle test added, as well as comments for the respective implementations
|
2021-10-26 11:48:33 +03:00 |
|
Apostolos Tapsas
|
4f53ec6b08
|
Shuffle simd and SuperVector implementations as well as their tests really fixed
|
2021-10-25 09:19:30 +03:00 |
|
Apostolos Tapsas
|
789f723814
|
SuperVector shuffle implementation and test function optimized
|
2021-10-22 11:55:39 +00:00 |
|
apostolos
|
ddebbeeb11
|
print functions keyword renamed
|
2021-10-22 12:36:07 +03:00 |
|
apostolos
|
ea5add7d4f
|
test for movemask and shuffle cases added
|
2021-10-22 11:17:43 +03:00 |
|
Apostolos Tapsas
|
7978b3f054
|
WIP: simd & bitutils files functions fixes
|
2021-10-21 13:34:02 +00:00 |
|
Apostolos Tapsas
|
3423ea5b2b
|
WIP: Power VSX support almost completed
|
2021-10-14 13:53:55 +03:00 |
|
Konstantinos Margaritis
|
2f55e5b54f
|
add x86 vsh* implementations
|
2021-10-12 11:51:35 +03:00 |
|
Konstantinos Margaritis
|
ef7da97aa1
|
no need to convert to size_t
|
2021-10-12 11:51:35 +03:00 |
|
Konstantinos Margaritis
|
1af82e395f
|
Changes/Additions to SuperVector class
* added ==,!=,>=,>,<=,< operators
* reworked shift operators to be more uniform and orthogonal, like Arm ISA
* Added Unroller class to allow handling of multiple cases but avoid code duplication
* pshufb method can now emulate Intel or not (avoids one instruction).
|
2021-10-12 11:51:34 +03:00 |
|
Konstantinos Margaritis
|
a3f083a9ff
|
initial SSE/AVX2 implementation
|
2021-10-12 11:51:34 +03:00 |
|
apostolos
|
bb9bcb3760
|
micro-benchmarks for shufti, truffle and noodle added
|
2021-10-12 11:51:34 +03:00 |
|
Konstantinos Margaritis
|
cf4b95fff2
|
remove Windows/ICC support
|
2021-10-12 11:51:34 +03:00 |
|
Konstantinos Margaritis
|
752d6cf997
|
fix lshift128 test
|
2021-10-12 11:51:34 +03:00 |
|
apostolos
|
b26a88efe5
|
alignr methods for avx2 and avx512 added
|
2021-10-12 11:51:34 +03:00 |
|
apostolos
|
150ae10ea4
|
limex_shuffle and its unit tests added
|
2021-10-12 11:51:34 +03:00 |
|
George Wort
|
e1f0f6baf7
|
Implement new DoubleVermicelli16 acceleration functions using SVE2
Change-Id: Id4a8ffca840caab930a6e78cc0dfd0fe7d320b4e
|
2021-10-12 11:51:34 +03:00 |
|
George Wort
|
60b2112505
|
Use SVE for double shufti.
Change-Id: I09e0d57bb8a2f05b613f6225dea79ae823136268
|
2021-10-12 11:51:34 +03:00 |
|