1457 Commits

Author SHA1 Message Date
George Wort
a94219aaed Use SVE for single shufti.
Change-Id: Ic76940c5bb9b81a1c45d39e9ca396a158c50a7dc
2021-07-26 19:15:58 +03:00
George Wort
6e434318a1 Use SVE2 for counting miracles.
Change-Id: I048dc182e5f4e726b847b3285ffafef4f538e550
2021-07-26 19:15:58 +03:00
George Wort
6d23032a6b Replace USE_ARM_SVE with HAVE_SVE.
Change-Id: I469efaac197cba93201f2ca6eca78ca61be3054d
2021-07-26 19:15:58 +03:00
George Wort
ca7d5d7536 Add Licence to state_compress and bitutils.
Change-Id: I958daf82e5aef5bd306424dcfa7812382b266d65
2021-07-26 19:15:58 +03:00
George Wort
db0d8f79e6 Implement new Vermicelli16 acceleration functions using SVE2.
The scheme utilises the MATCH and NMATCH instructions to
scan for 16 characters at the same rate as vermicelli
scans for one.

Change-Id: Ie2cef904c56651e6108593c668e9b65bc001a886
2021-07-26 19:15:58 +03:00
George Wort
185c45263b Add SVE2 support for dvermicelli
Change-Id: I056ef15e162ab6fb1f78964321ce893f4096367e
2021-07-26 19:15:58 +03:00
Konstantinos Margaritis
455789db9f add arm rshift128/rshift128 2021-07-26 19:15:58 +03:00
Konstantinos Margaritis
55d5631c5c fix failing corner case, add pshufb_maskz() 2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
be5a675da8 use STL make_unique, remove wrapper header, breaks C++17 compilation 2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
18693bd14c change C/C++ standard used to C17/C++17 2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
369e0d473d remove low4bits from the arguments, fix cases that mostly affect loading large (64) vectors and falling out of bounds 2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
47e29602e0 fix loadu_maskz, add {l,r}shift128_var(), tab fixes 2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
e25c8ad78b convert to for loops 2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
b517cbfb8a minor fixes, add 2 constructors from half size vectors 2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
e047fcf629 fix lastMatch<64> 2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
0d1d76140a provide an {l,r}shift128_var() to fix immediate value build failure in loadu_maskz 2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
f680a79f1e fix arm loadu_maskz() 2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
b0fbb39cf0 add arm rshift128/rshift128 2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
e1950f1ce5 use rshift128() instead of vector-wide right shift 2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
93abed4d87 add {l,r}shift128()+tests, rename printv_u64() to print64() 2021-07-26 00:10:54 +03:00
George Wort
e325ce9452 Use SVE2 Bitperm's bdep instruction in bitutils and state_compress
Specifically for pdep64, expand32, and expand64 in bitutils,
as well as all of the loadcompressed functions used in
state_compress.

Change-Id: I92851bd12481dbee6a7e344df0890c4901b56d01
2021-07-26 00:10:54 +03:00
George Wort
726a668b65 Fix CROSS_COMPILE_AARCH64 for SVE issues.
Change-Id: I7b9ba3ccb754d96eee22ca01714c783dae1e4956
2021-07-26 00:10:54 +03:00
George Wort
7d7d31ec0d Add SVE2 support for vermicelli
Change-Id: Ia025de53521fbaefe5fb1e4425aaf75c7d80a14e
2021-07-26 00:10:54 +03:00
George Wort
418851a26e Remove possibly undefined behaviour from Noodle.
Change-Id: I9a7997cea6a48927cb02b00c5dba5009bbf83850
2021-07-26 00:10:54 +03:00
George Wort
f99c380167 Remove first check from scanDouble Noodle.
Change-Id: I00eabb3cb06ef6a2060df52c26fa8591907a2711
2021-07-26 00:10:54 +03:00
apostolos
80c01e451d Equal mask test fixed with random numbers 2021-07-26 00:10:54 +03:00
apostolos
4326aecfda Supervector test fixes 2021-07-26 00:10:54 +03:00
apostolos
a98a114568 SuperVector AVX512 implementations 2021-07-26 00:10:54 +03:00
apostolos
17bf07f446 SuperVector unit tests for AVX2 and AVX512 added 2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
56b124b9a1 really fix lshift for avx2 2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
e296a71540 disable OPTIMISE by default 2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
6ad1e70c36 fix truffle SIMD for S>16 as well 2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
46c37ece4b add AVX2 specializations 2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
65d911c976 lots of fixes to AVX2 implementation 2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
517d2aa633 convert print helper functions to class methods 2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
009ac9432c tiny change in vector initialization 2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
9c2c6fdcfc fix last failing Shufti/Truffle tests 2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
3021db0a17 fix arm SuperVector implementation 2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
0576c5e8b1 fix rtruffle, was failing Lbr and a few ReverseTruffle tests 2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
d1097e713d fix x86 debug alignr 2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
848a583599 move firstMatch, lastMatch to own header in util 2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
e31dd448b4 minor fixes 2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
69378e7eee compilation fixes for debug mode 2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
a5cce2670e fix arm implementation of alignr() 2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
736da56850 harmonise syntax of x86 SuperVector impl.cpp like arm, fix alignr, define printv_* functions when on debug mode only 2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
f6edd100c4 style fixes 2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
a6bbe55574 removed obsolete file 2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
6348a2f222 rename supervector class header, use dup_*() functions names instead of set1_*(), minor fixes 2021-07-26 00:10:54 +03:00
Konstantinos Margaritis
e024e41a2c handle GNUCC_ARCH on non-x86 properly 2021-07-26 00:10:54 +03:00
apostolos
885a4da0c8 Truffle simd vectorized 2021-07-26 00:10:54 +03:00