George Wort
a94219aaed
Use SVE for single shufti.
Change-Id: Ic76940c5bb9b81a1c45d39e9ca396a158c50a7dc
2021-07-26 19:15:58 +03:00
George Wort
6e434318a1
Use SVE2 for counting miracles.
Change-Id: I048dc182e5f4e726b847b3285ffafef4f538e550
2021-07-26 19:15:58 +03:00
George Wort
6d23032a6b
Replace USE_ARM_SVE with HAVE_SVE.
Change-Id: I469efaac197cba93201f2ca6eca78ca61be3054d
2021-07-26 19:15:58 +03:00
George Wort
ca7d5d7536
Add Licence to state_compress and bitutils.
Change-Id: I958daf82e5aef5bd306424dcfa7812382b266d65
2021-07-26 19:15:58 +03:00
George Wort
db0d8f79e6
Implement new Vermicelli16 acceleration functions using SVE2.
The scheme utilises the MATCH and NMATCH instructions to
scan for 16 characters at the same rate as vermicelli
scans for one.
Change-Id: Ie2cef904c56651e6108593c668e9b65bc001a886
2021-07-26 19:15:58 +03:00
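For readers unfamiliar with the scheme described in the commit above, here is a minimal scalar sketch of what a 16-character scan computes. It is not the SVE2 code from the commit: the names `in_set` and `find_first_in_set` are hypothetical, and the plain loop only stands in for the semantics of MATCH (byte is in the set) and NMATCH (byte is not in the set), which the real implementation evaluates across a whole vector at once.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true if c equals any of the 16 set bytes.
// Sets with fewer than 16 distinct characters are padded by repeating one.
static inline bool in_set(const std::array<uint8_t, 16> &set, uint8_t c) {
    for (uint8_t s : set) {
        if (s == c) {
            return true;
        }
    }
    return false;
}

// Hypothetical scalar reference: index of the first byte in buf that is in
// the set (negate == false, MATCH-like) or not in the set (negate == true,
// NMATCH-like). Returns len when no byte qualifies.
static inline size_t find_first_in_set(const std::array<uint8_t, 16> &set,
                                       const uint8_t *buf, size_t len,
                                       bool negate = false) {
    for (size_t i = 0; i < len; i++) {
        if (in_set(set, buf[i]) != negate) {
            return i;
        }
    }
    return len;
}
```

A vectorised version would presumably test one vector of input bytes per iteration against all 16 set bytes, which is why the commit message can claim the same scan rate as single-character vermicelli.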
George Wort
185c45263b
Add SVE2 support for dvermicelli
Change-Id: I056ef15e162ab6fb1f78964321ce893f4096367e
2021-07-26 19:15:58 +03:00
Konstantinos Margaritis
455789db9f
add arm rshift128/rshift128
2021-07-26 19:15:58 +03:00
Konstantinos Margaritis
55d5631c5c
fix failing corner case, add pshufb_maskz()
2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
be5a675da8
use STL make_unique, remove wrapper header, breaks C++17 compilation
2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
18693bd14c
change C/C++ standard used to C17/C++17
2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
369e0d473d
remove low4bits from the arguments, fix cases that mostly affect loading large (64) vectors and falling out of bounds
2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
47e29602e0
fix loadu_maskz, add {l,r}shift128_var(), tab fixes
2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
e25c8ad78b
convert to for loops
2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
b517cbfb8a
minor fixes, add 2 constructors from half size vectors
2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
e047fcf629
fix lastMatch<64>
2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
0d1d76140a
provide an {l,r}shift128_var() to fix immediate value build failure in loadu_maskz
2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
f680a79f1e
fix arm loadu_maskz()
2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
b0fbb39cf0
add arm rshift128/rshift128
2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
e1950f1ce5
use rshift128() instead of vector-wide right shift
2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
93abed4d87
add {l,r}shift128()+tests, rename printv_u64() to print64()
2021-07-26 00:10:54 +03:00
George Wort
e325ce9452
Use SVE2 Bitperm's bdep instruction in bitutils and state_compress
Specifically for pdep64, expand32, and expand64 in bitutils,
as well as all of the loadcompressed functions used in
state_compress.
Change-Id: I92851bd12481dbee6a7e344df0890c4901b56d01
2021-07-26 00:10:54 +03:00
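As background for the commit above, the following is a portable sketch of the bit-deposit operation that a hardware bdep (or x86 pdep) instruction performs. The name `pdep64_ref` is hypothetical and the loop is only a reference for the semantics; it is not the code from bitutils or state_compress.

```cpp
#include <cstdint>

// Hypothetical reference for a 64-bit parallel bit deposit: the low-order
// bits of x are scattered, in order, to the positions of the set bits of
// mask. All other result bits are zero.
static inline uint64_t pdep64_ref(uint64_t x, uint64_t mask) {
    uint64_t result = 0;
    for (uint64_t bit = 1; mask != 0; bit <<= 1) {
        uint64_t lowest = mask & (~mask + 1); // isolate lowest set bit of mask
        if (x & bit) {
            result |= lowest;                 // deposit the current source bit
        }
        mask &= mask - 1;                     // clear that mask bit
    }
    return result;
}
```

For example, depositing the source bits 101 into a mask with bits 1, 3 and 4 set places the first source bit at position 1, the second at position 3 and the third at position 4. Expanding compressed state in this way is the kind of work the commit hands to a single instruction.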
George Wort
726a668b65
Fix CROSS_COMPILE_AARCH64 for SVE issues.
Change-Id: I7b9ba3ccb754d96eee22ca01714c783dae1e4956
2021-07-26 00:10:54 +03:00
George Wort
7d7d31ec0d
Add SVE2 support for vermicelli
Change-Id: Ia025de53521fbaefe5fb1e4425aaf75c7d80a14e
2021-07-26 00:10:54 +03:00
George Wort
418851a26e
Remove possibly undefined behaviour from Noodle.
Change-Id: I9a7997cea6a48927cb02b00c5dba5009bbf83850
2021-07-26 00:10:54 +03:00
George Wort
f99c380167
Remove first check from scanDouble Noodle.
Change-Id: I00eabb3cb06ef6a2060df52c26fa8591907a2711
2021-07-26 00:10:54 +03:00
apostolos
80c01e451d
Equal mask test fixed with random numbers
2021-07-26 00:10:54 +03:00
apostolos
4326aecfda
Supervector test fixes
2021-07-26 00:10:54 +03:00
apostolos
a98a114568
SuperVector AVX512 implementations
2021-07-26 00:10:54 +03:00
apostolos
17bf07f446
SuperVector unit tests for AVX2 and AVX512 added
2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
56b124b9a1
really fix lshift for avx2
2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
e296a71540
disable OPTIMISE by default
2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
6ad1e70c36
fix truffle SIMD for S>16 as well
2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
46c37ece4b
add AVX2 specializations
2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
65d911c976
lots of fixes to AVX2 implementation
2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
517d2aa633
convert print helper functions to class methods
2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
009ac9432c
tiny change in vector initialization
2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
9c2c6fdcfc
fix last failing Shufti/Truffle tests
2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
3021db0a17
fix arm SuperVector implementation
2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
0576c5e8b1
fix rtruffle, was failing Lbr and a few ReverseTruffle tests
2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
d1097e713d
fix x86 debug alignr
2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
848a583599
move firstMatch, lastMatch to own header in util
2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
e31dd448b4
minor fixes
2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
69378e7eee
compilation fixes for debug mode
2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
a5cce2670e
fix arm implementation of alignr()
2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
736da56850
harmonise syntax of x86 SuperVector impl.cpp like arm, fix alignr, define printv_* functions when on debug mode only
2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
f6edd100c4
style fixes
2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
a6bbe55574
removed obsolete file
2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
6348a2f222
rename supervector class header, use dup_*() functions names instead of set1_*(), minor fixes
2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
e024e41a2c
handle GNUCC_ARCH on non-x86 properly
2021-07-26 00:10:54 +03:00
apostolos
885a4da0c8
Truffle simd vectorized
2021-07-26 00:10:54 +03:00