Commit Graph

1705 Commits

Author SHA1 Message Date
Konstantinos Margaritis
f2e45ccc06 remove simd_utils.c 2021-10-12 11:51:35 +03:00
Konstantinos Margaritis
2f55e5b54f add x86 vsh* implementations 2021-10-12 11:51:35 +03:00
Konstantinos Margaritis
3248393d1a use movemask 2021-10-12 11:51:35 +03:00
Konstantinos Margaritis
a85b1c75d1 add header define to avoid double inclusion 2021-10-12 11:51:35 +03:00
Konstantinos Margaritis
6ec68bbedd do not include the Supervector impl.cpp files in fat runtime 2021-10-12 11:51:35 +03:00
Konstantinos Margaritis
ba9d11c1b9 atm, do not built benchmark tool for fat runtime, as the function names are modified, need to rethink this 2021-10-12 11:51:35 +03:00
Konstantinos Margaritis
b8bf6063b6 Improve benchmarks 2021-10-12 11:51:35 +03:00
Konstantinos Margaritis
f6f7d7a039 optimize and simplify Shufti and Truffle to work with a single block method instead 2021-10-12 11:51:35 +03:00
Konstantinos Margaritis
ef7da97aa1 no need to convert to size_t 2021-10-12 11:51:35 +03:00
Konstantinos Margaritis
1503d9a946 remove asserts, as they are not needed 2021-10-12 11:51:35 +03:00
Konstantinos Margaritis
5563f0c3b6 firstMatch/lastMatch are now arch-dependent, emulating movemask on non-Intel is very costly, the alternative is almost twice as fast on Arm 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
690e3c24e6 fix for new pshufb 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
1af82e395f Changes/Additions to SuperVector class * added ==,!=,>=,>,<=,< operators * reworked shift operators to be more uniform and orthogonal, like Arm ISA * Added Unroller class to allow handling of multiple cases but avoid code duplication * pshufb method can now emulate Intel or not (avoids one instruction). 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
a3f083a9ff initial SSE/AVX2 implementation 2021-10-12 11:51:34 +03:00
Duncan Bellamy
314116cbb5 remove adding CMAKE_CXX_IMPLICIT_LINK_LIBRARIES to PRIVATE_LIBS
as on alpine linux this add gcc_s which is a shared library

on alpine:
Libs.private: -lstdc++ -lm -lssp_nonshared -lgcc_s -lgcc -lc -lgcc_s -lgcc
2021-10-12 11:51:34 +03:00
apostolos
70092c48b6 Unify benchmarks, more accurate measurements 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
6a72d7b5ca Unify benchmarks, more accurate measurements
(cherry picked from commit f50d7656bc78c54ec25916b6c8e655c188d79a13)
2021-10-12 11:51:34 +03:00
apostolos
ed4b280a7f benchmarks functions replaced with lambdas 2021-10-12 11:51:34 +03:00
apostolos
390573a07a raw pointers replaced with smart pointers 2021-10-12 11:51:34 +03:00
apostolos
d9b8e9e224 nit 2021-10-12 11:51:34 +03:00
apostolos
388dc457de nit 2021-10-12 11:51:34 +03:00
apostolos
8ae4fab6c6 fix benchmarks outputs 2021-10-12 11:51:34 +03:00
apostolos
d39e132fdf bandwidth output fixes 2021-10-12 11:51:34 +03:00
apostolos
b03428e584 size outup for case with match fixed 2021-10-12 11:51:34 +03:00
apostolos
da4805bdc5 nits 2021-10-12 11:51:34 +03:00
apostolos
488517c45a size output fixed 2021-10-12 11:51:34 +03:00
apostolos
87f605b87c nits 2021-10-12 11:51:34 +03:00
apostolos
d1cf8989c7 benchmarks output fixes 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
98a950f405 add missing header 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
e2fc2c3dfe remove confusing OPTIMISE flag 2021-10-12 11:51:34 +03:00
apostolos
ee2ed6a8c8 nits 2021-10-12 11:51:34 +03:00
apostolos
e0fefb3489 code size reduction by using function arrays and add bandwidth to output 2021-10-12 11:51:34 +03:00
apostolos
bb9bcb3760 micro-benchmarks for shufti, trufle and noodle added 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
cf4b95fff2 remove Windows/ICC support 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
752d6cf997 fix lshift128 test 2021-10-12 11:51:34 +03:00
apostolos
b26a88efe5 alignr methods for avx2 and avx512 added 2021-10-12 11:51:34 +03:00
apostolos
150ae10ea4 limex_shuffle added and it's unit tests 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
b9fbfb1204 remove duplicate functions from previous merge 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
acacafe1af add missing compile flags 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
44496d7508 add accidentally removed lines 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
cd5c251f67 * add -fno-new-ttp-matching to fix build-failures on newer gcc compilers with C++17
* add explicit -mssse3, -mavx2 in compiler flags in respective build profiles
2021-10-12 11:51:34 +03:00
George Wort
3bdd48fd61 Move SVE functions into their own files.
Change-Id: I995ba4b7d2b558ee403693ee45d747d414d3b177
2021-10-12 11:51:34 +03:00
George Wort
e1f0f6baf7 Implement new DoubleVermicelli16 acceleration functions using SVE2
Change-Id: Id4a8ffca840caab930a6e78cc0dfd0fe7d320b4e
2021-10-12 11:51:34 +03:00
George Wort
91f5f10831 Use SVE shufti for counting miracles.
Change-Id: Idd4aaf5bbc05fc90e9138c6fed385bc6ffa7b0b8
2021-10-12 11:51:34 +03:00
George Wort
60b2112505 Use SVE for double shufti.
Change-Id: I09e0d57bb8a2f05b613f6225dea79ae823136268
2021-10-12 11:51:34 +03:00
George Wort
87ee8d4d7f Use SVE for single shufti.
Change-Id: Ic76940c5bb9b81a1c45d39e9ca396a158c50a7dc
2021-10-12 11:51:34 +03:00
George Wort
d1e763c13b Use SVE2 for counting miracles.
Change-Id: I048dc182e5f4e726b847b3285ffafef4f538e550
2021-10-12 11:51:34 +03:00
George Wort
ceb230c7db Replace USE_ARM_SVE with HAVE_SVE.
Change-Id: I469efaac197cba93201f2ca6eca78ca61be3054d
2021-10-12 11:51:34 +03:00
George Wort
7ba060bbf8 Add Licence to state_compress and bitutils.
Change-Id: I958daf82e5aef5bd306424dcfa7812382b266d65
2021-10-12 11:51:34 +03:00
George Wort
b54710d208 Implement new Vermicelli16 acceleration functions using SVE2.
The scheme utilises the MATCH and NMATCH instructions to
scan for 16 characters at the same rate as vermicelli
scans for one.

Change-Id: Ie2cef904c56651e6108593c668e9b65bc001a886
2021-10-12 11:51:34 +03:00