George Wort
df926ef62f
Implement new Vermicelli16 acceleration functions using SVE2.
The scheme utilises the MATCH and NMATCH instructions to
scan for 16 characters at the same rate as vermicelli
scans for one.
Change-Id: Ie2cef904c56651e6108593c668e9b65bc001a886
2021-10-12 11:51:34 +03:00
George Wort
c7086cb7f1
Add SVE2 support for dvermicelli
Change-Id: I056ef15e162ab6fb1f78964321ce893f4096367e
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
a38324a5a3
add arm lshift128/rshift128
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
603bc14cdd
fix failing corner case, add pshufb_maskz()
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
e35b88f2c8
use STL make_unique, remove wrapper header, breaks C++17 compilation
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
6f44a1aa26
remove low4bits from the arguments, fix cases that mostly affect loading large (64) vectors and falling out of bounds
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
f2d9784979
fix loadu_maskz, add {l,r}shift128_var(), tab fixes
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
f8ce0bb922
minor fixes, add 2 constructors from half size vectors
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
cabd13d18a
fix lastMatch<64>
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
ebb1b84ae3
provide an {l,r}shift128_var() to fix immediate value build failure in loadu_maskz
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
825460856f
fix arm loadu_maskz()
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
86accf41a3
add arm lshift128/rshift128
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
b67cd7dfd0
use rshift128() instead of vector-wide right shift
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
6c51f7f591
add {l,r}shift128()+tests, rename printv_u64() to print64()
2021-10-12 11:51:34 +03:00
George Wort
051ceed0f9
Use SVE2 Bitperm's bdep instruction in bitutils and state_compress
Specifically for pdep64, expand32, and expand64 in bitutils,
as well as all of the loadcompressed functions used in
state_compress.
Change-Id: I92851bd12481dbee6a7e344df0890c4901b56d01
2021-10-12 11:51:34 +03:00
George Wort
4bc28272da
Fix CROSS_COMPILE_AARCH64 for SVE issues.
Change-Id: I7b9ba3ccb754d96eee22ca01714c783dae1e4956
2021-10-12 11:51:34 +03:00
George Wort
9fb79ac3ec
Add SVE2 support for vermicelli
Change-Id: Ia025de53521fbaefe5fb1e4425aaf75c7d80a14e
2021-10-12 11:51:34 +03:00
George Wort
7162446358
Remove possibly undefined behaviour from Noodle.
Change-Id: I9a7997cea6a48927cb02b00c5dba5009bbf83850
2021-10-12 11:51:34 +03:00
George Wort
b48ea2c1a6
Remove first check from scanDouble Noodle.
Change-Id: I00eabb3cb06ef6a2060df52c26fa8591907a2711
2021-10-12 11:51:34 +03:00
apostolos
6f88ecac44
Supervector test fixes
2021-10-12 11:51:34 +03:00
apostolos
ae6bc52076
SuperVector AVX512 implementations
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
7ae636dfe9
really fix lshift for avx2
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
d04b899c29
fix truffle SIMD for S>16 as well
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
b42b187712
add AVX2 specializations
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
dede600637
lots of fixes to AVX2 implementation
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
c45e72775f
convert print helper functions to class methods
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
d453a612dc
fix last failing Shufti/Truffle tests
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
ec3f108d71
fix arm SuperVector implementation
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
0ed10082b1
fix rtruffle, was failing Lbr and a few ReverseTruffle tests
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
f425951b49
fix x86 debug alignr
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
845e533b66
move firstMatch, lastMatch to own header in util
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
41ff0962c4
minor fixes
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
6d8f3b9ff8
compilation fixes for debug mode
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
d7b247a949
fix arm implementation of alignr()
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
28b2949396
harmonise syntax of x86 SuperVector impl.cpp like arm, fix alignr, define printv_* functions when on debug mode only
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
9de3065e68
style fixes
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
e0a45a354d
removed obsolete file
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
2753dbb3b0
rename supervector class header, use dup_*() functions names instead of set1_*(), minor fixes
2021-10-12 11:51:34 +03:00
apostolos
1ce5e17ce9
Truffle simd vectorized
2021-10-12 11:51:34 +03:00
George Wort
d1009e8830
Fix error in initial noodle double final call.
Change-Id: Ie044988f183b47e0b2f1eed3b4bd23de75c3117d
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
8b09ecfe48
nits
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
cceb599fc9
fix typo
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
e49fa3a97a
fix unit tests, and resp. ARM SuperVector methods based on those unit tests, add print functions for SuperVector
2021-10-12 11:51:34 +03:00
George Wort
d6df8116a5
Add SVE2 support for noodle
Change-Id: Iacb7d1f164bdd0ba50e2e13d26fe548cf9b45a6a
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
acca824dea
add missing ARM SuperVector methods, some tests still fail, WIP
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
6fbd18183a
rename arm impl.hpp to impl.cpp, add operator|() to SuperVector class
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
23b075cbd4
refactor shufti algorithm to use SuperVector class, WIP
2021-10-12 11:51:34 +03:00
George Wort
3ee7b75ee0
Add SVE, SVE2, and SVE2_BITPERM as targets
Change-Id: I5231e2eb0a31708a16c853dc83ea48db32e0b0a5
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
6526df81e4
add more functions, move defines here, enable inlining of template specializations only when running optimized code
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
d8b5eb5d17
fix compilation on C++
2021-10-12 11:51:34 +03:00