440 Commits

Author SHA1 Message Date
Konstantinos Margaritis
eebd6c97bc use movemask 2021-10-12 11:51:35 +03:00
Konstantinos Margaritis
6ceab8435d add header define to avoid double inclusion 2021-10-12 11:51:35 +03:00
Konstantinos Margaritis
fa3d509fad firstMatch/lastMatch are now arch-dependent: emulating movemask on non-Intel is very costly, and the alternative is almost twice as fast on Arm 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
67e0674df8 Changes/Additions to SuperVector class
* added ==,!=,>=,>,<=,< operators
* reworked shift operators to be more uniform and orthogonal, like Arm ISA
* added Unroller class to allow handling of multiple cases but avoid code duplication
* pshufb method can now emulate Intel or not (avoids one instruction)
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
e7161fdfec initial SSE/AVX2 implementation 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
08357a096c remove Windows/ICC support 2021-10-12 11:51:34 +03:00
apostolos
67fa6d2738 alignr methods for avx2 and avx512 added 2021-10-12 11:51:34 +03:00
George Wort
a879715953 Move SVE functions into their own files.
Change-Id: I995ba4b7d2b558ee403693ee45d747d414d3b177
2021-10-12 11:51:34 +03:00
George Wort
c95a4c3dd1 Use SVE for single shufti.
Change-Id: Ic76940c5bb9b81a1c45d39e9ca396a158c50a7dc
2021-10-12 11:51:34 +03:00
George Wort
ab5d4d9279 Replace USE_ARM_SVE with HAVE_SVE.
Change-Id: I469efaac197cba93201f2ca6eca78ca61be3054d
2021-10-12 11:51:34 +03:00
George Wort
8242f46ed7 Add Licence to state_compress and bitutils.
Change-Id: I958daf82e5aef5bd306424dcfa7812382b266d65
2021-10-12 11:51:34 +03:00
George Wort
c7086cb7f1 Add SVE2 support for dvermicelli
Change-Id: I056ef15e162ab6fb1f78964321ce893f4096367e
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
603bc14cdd fix failing corner case, add pshufb_maskz() 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
e35b88f2c8 use STL make_unique, remove wrapper header, which breaks C++17 compilation 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
f2d9784979 fix loadu_maskz, add {l,r}shift128_var(), tab fixes 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
f8ce0bb922 minor fixes, add 2 constructors from half size vectors 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
cabd13d18a fix lastMatch<64> 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
ebb1b84ae3 provide an {l,r}shift128_var() to fix immediate value build failure in loadu_maskz 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
825460856f fix arm loadu_maskz() 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
86accf41a3 add arm lshift128/rshift128 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
6c51f7f591 add {l,r}shift128()+tests, rename printv_u64() to print64() 2021-10-12 11:51:34 +03:00
George Wort
051ceed0f9 Use SVE2 Bitperm's bdep instruction in bitutils and state_compress
Specifically for pdep64, expand32, and expand64 in bitutils,
as well as all of the loadcompressed functions used in
state_compress.

Change-Id: I92851bd12481dbee6a7e344df0890c4901b56d01
2021-10-12 11:51:34 +03:00
George Wort
9fb79ac3ec Add SVE2 support for vermicelli
Change-Id: Ia025de53521fbaefe5fb1e4425aaf75c7d80a14e
2021-10-12 11:51:34 +03:00
apostolos
6f88ecac44 Supervector test fixes 2021-10-12 11:51:34 +03:00
apostolos
ae6bc52076 SuperVector AVX512 implementations 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
7ae636dfe9 really fix lshift for avx2 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
b42b187712 add AVX2 specializations 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
dede600637 lots of fixes to AVX2 implementation 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
c45e72775f convert print helper functions to class methods 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
ec3f108d71 fix arm SuperVector implementation 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
f425951b49 fix x86 debug alignr 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
845e533b66 move firstMatch, lastMatch to own header in util 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
6d8f3b9ff8 compilation fixes for debug mode 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
d7b247a949 fix arm implementation of alignr() 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
28b2949396 harmonise syntax of x86 SuperVector impl.cpp like arm, fix alignr, define printv_* functions when on debug mode only 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
9de3065e68 style fixes 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
e0a45a354d removed obsolete file 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
2753dbb3b0 rename supervector class header, use dup_*() functions names instead of set1_*(), minor fixes 2021-10-12 11:51:34 +03:00
apostolos
1ce5e17ce9 Truffle simd vectorized 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
8b09ecfe48 nits 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
cceb599fc9 fix typo 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
e49fa3a97a fix unit tests, and resp. ARM SuperVector methods based on those unit tests, add print functions for SuperVector 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
acca824dea add missing ARM SuperVector methods, some tests still fail, WIP 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
6fbd18183a rename arm impl.hpp to impl.cpp, add operator|() to SuperVector class 2021-10-12 11:51:34 +03:00
George Wort
3ee7b75ee0 Add SVE, SVE2, and SVE2_BITPERM as targets
Change-Id: I5231e2eb0a31708a16c853dc83ea48db32e0b0a5
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
6526df81e4 add more functions, move defines here, enable inlining of template specializations only when running optimized code 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
d8b5eb5d17 fix compilation on C++ 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
273b9683ac simplify function 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
6e63aafbea add arm support for the new SuperVector class 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
e6c1fa04ce add C++ template SIMD library (WIP) 2021-10-12 11:51:34 +03:00