Konstantinos Margaritis
|
d52428694b
|
added refactored vermicelli_simd.cpp implementation
|
2021-10-27 12:29:39 +03:00 |
|
Konstantinos Margaritis
|
9f7b2fa8a8
|
link benchmarks against static lib only as some symbols are not exposed in the shared lib
|
2021-10-12 10:33:40 +00:00 |
|
Konstantinos Margaritis
|
2b3d0a355b
|
Add missing copyright info from tampered files
|
2021-10-12 11:51:35 +03:00 |
|
Konstantinos Margaritis
|
9e07d7971d
|
bump version
|
2021-10-12 11:51:35 +03:00 |
|
Konstantinos Margaritis
|
4a4a851c6d
|
fix multiple/undefined symbols when using fat runtimes
|
2021-10-12 11:51:35 +03:00 |
|
Konstantinos Margaritis
|
ae81088193
|
add arm truffle block function
|
2021-10-12 11:51:35 +03:00 |
|
Konstantinos Margaritis
|
45f395245b
|
add simd_onebit_masks as static in arm simd_utils.h as well
|
2021-10-12 11:51:35 +03:00 |
|
Konstantinos Margaritis
|
a654204122
|
simplify truffle and provide arch-specific block functions
|
2021-10-12 11:51:35 +03:00 |
|
Konstantinos Margaritis
|
92e0b9a351
|
simplify shufti and provide arch-specific block functions
|
2021-10-12 11:51:35 +03:00 |
|
Konstantinos Margaritis
|
a1acc456cc
|
rearrange method declarations
|
2021-10-12 11:51:35 +03:00 |
|
Konstantinos Margaritis
|
f2e45ccc06
|
remove simd_utils.c
|
2021-10-12 11:51:35 +03:00 |
|
Konstantinos Margaritis
|
2f55e5b54f
|
add x86 vsh* implementations
|
2021-10-12 11:51:35 +03:00 |
|
Konstantinos Margaritis
|
3248393d1a
|
use movemask
|
2021-10-12 11:51:35 +03:00 |
|
Konstantinos Margaritis
|
a85b1c75d1
|
add header define to avoid double inclusion
|
2021-10-12 11:51:35 +03:00 |
|
Konstantinos Margaritis
|
6ec68bbedd
|
do not include the Supervector impl.cpp files in fat runtime
|
2021-10-12 11:51:35 +03:00 |
|
Konstantinos Margaritis
|
ba9d11c1b9
|
atm, do not build the benchmark tool for fat runtime, as the function names are modified; need to rethink this
|
2021-10-12 11:51:35 +03:00 |
|
Konstantinos Margaritis
|
b8bf6063b6
|
Improve benchmarks
|
2021-10-12 11:51:35 +03:00 |
|
Konstantinos Margaritis
|
f6f7d7a039
|
optimize and simplify Shufti and Truffle to work with a single block method instead
|
2021-10-12 11:51:35 +03:00 |
|
Konstantinos Margaritis
|
ef7da97aa1
|
no need to convert to size_t
|
2021-10-12 11:51:35 +03:00 |
|
Konstantinos Margaritis
|
1503d9a946
|
remove asserts, as they are not needed
|
2021-10-12 11:51:35 +03:00 |
|
Konstantinos Margaritis
|
5563f0c3b6
|
firstMatch/lastMatch are now arch-dependent: emulating movemask on non-Intel is very costly, and the alternative is almost twice as fast on Arm
|
2021-10-12 11:51:34 +03:00 |
|
Konstantinos Margaritis
|
690e3c24e6
|
fix for new pshufb
|
2021-10-12 11:51:34 +03:00 |
|
Konstantinos Margaritis
|
1af82e395f
|
Changes/Additions to SuperVector class
* added ==,!=,>=,>,<=,< operators
* reworked shift operators to be more uniform and orthogonal, like Arm ISA
* Added Unroller class to allow handling of multiple cases but avoid code duplication
* pshufb method can now emulate Intel or not (avoids one instruction).
|
2021-10-12 11:51:34 +03:00 |
|
Konstantinos Margaritis
|
a3f083a9ff
|
initial SSE/AVX2 implementation
|
2021-10-12 11:51:34 +03:00 |
|
Duncan Bellamy
|
314116cbb5
|
remove adding CMAKE_CXX_IMPLICIT_LINK_LIBRARIES to PRIVATE_LIBS
as on alpine linux this adds gcc_s which is a shared library
on alpine:
Libs.private: -lstdc++ -lm -lssp_nonshared -lgcc_s -lgcc -lc -lgcc_s -lgcc
|
2021-10-12 11:51:34 +03:00 |
|
apostolos
|
70092c48b6
|
Unify benchmarks, more accurate measurements
|
2021-10-12 11:51:34 +03:00 |
|
Konstantinos Margaritis
|
6a72d7b5ca
|
Unify benchmarks, more accurate measurements
(cherry picked from commit f50d7656bc78c54ec25916b6c8e655c188d79a13)
|
2021-10-12 11:51:34 +03:00 |
|
apostolos
|
ed4b280a7f
|
benchmarks functions replaced with lambdas
|
2021-10-12 11:51:34 +03:00 |
|
apostolos
|
390573a07a
|
raw pointers replaced with smart pointers
|
2021-10-12 11:51:34 +03:00 |
|
apostolos
|
d9b8e9e224
|
nit
|
2021-10-12 11:51:34 +03:00 |
|
apostolos
|
388dc457de
|
nit
|
2021-10-12 11:51:34 +03:00 |
|
apostolos
|
8ae4fab6c6
|
fix benchmarks outputs
|
2021-10-12 11:51:34 +03:00 |
|
apostolos
|
d39e132fdf
|
bandwidth output fixes
|
2021-10-12 11:51:34 +03:00 |
|
apostolos
|
b03428e584
|
size output for case with match fixed
|
2021-10-12 11:51:34 +03:00 |
|
apostolos
|
da4805bdc5
|
nits
|
2021-10-12 11:51:34 +03:00 |
|
apostolos
|
488517c45a
|
size output fixed
|
2021-10-12 11:51:34 +03:00 |
|
apostolos
|
87f605b87c
|
nits
|
2021-10-12 11:51:34 +03:00 |
|
apostolos
|
d1cf8989c7
|
benchmarks output fixes
|
2021-10-12 11:51:34 +03:00 |
|
Konstantinos Margaritis
|
98a950f405
|
add missing header
|
2021-10-12 11:51:34 +03:00 |
|
Konstantinos Margaritis
|
e2fc2c3dfe
|
remove confusing OPTIMISE flag
|
2021-10-12 11:51:34 +03:00 |
|
apostolos
|
ee2ed6a8c8
|
nits
|
2021-10-12 11:51:34 +03:00 |
|
apostolos
|
e0fefb3489
|
code size reduction by using function arrays and add bandwidth to output
|
2021-10-12 11:51:34 +03:00 |
|
apostolos
|
bb9bcb3760
|
micro-benchmarks for shufti, truffle and noodle added
|
2021-10-12 11:51:34 +03:00 |
|
Konstantinos Margaritis
|
cf4b95fff2
|
remove Windows/ICC support
|
2021-10-12 11:51:34 +03:00 |
|
Konstantinos Margaritis
|
752d6cf997
|
fix lshift128 test
|
2021-10-12 11:51:34 +03:00 |
|
apostolos
|
b26a88efe5
|
alignr methods for avx2 and avx512 added
|
2021-10-12 11:51:34 +03:00 |
|
apostolos
|
150ae10ea4
|
limex_shuffle added and its unit tests
|
2021-10-12 11:51:34 +03:00 |
|
Konstantinos Margaritis
|
b9fbfb1204
|
remove duplicate functions from previous merge
|
2021-10-12 11:51:34 +03:00 |
|
Konstantinos Margaritis
|
acacafe1af
|
add missing compile flags
|
2021-10-12 11:51:34 +03:00 |
|
Konstantinos Margaritis
|
44496d7508
|
add accidentally removed lines
|
2021-10-12 11:51:34 +03:00 |
|