George Wort
8242f46ed7
Add Licence to state_compress and bitutils.
Change-Id: I958daf82e5aef5bd306424dcfa7812382b266d65
2021-10-12 11:51:34 +03:00
George Wort
c7086cb7f1
Add SVE2 support for dvermicelli
Change-Id: I056ef15e162ab6fb1f78964321ce893f4096367e
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
603bc14cdd
fix failing corner case, add pshufb_maskz()
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
e35b88f2c8
use STL make_unique, remove wrapper header, breaks C++17 compilation
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
f2d9784979
fix loadu_maskz, add {l,r}shift128_var(), tab fixes
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
f8ce0bb922
minor fixes, add 2 constructors from half size vectors
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
cabd13d18a
fix lastMatch<64>
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
ebb1b84ae3
provide an {l,r}shift128_var() to fix immediate value build failure in loadu_maskz
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
825460856f
fix arm loadu_maskz()
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
86accf41a3
add arm lshift128/rshift128
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
6c51f7f591
add {l,r}shift128()+tests, rename printv_u64() to print64()
2021-10-12 11:51:34 +03:00
George Wort
051ceed0f9
Use SVE2 Bitperm's bdep instruction in bitutils and state_compress
Specifically for pdep64, expand32, and expand64 in bitutils,
as well as all of the loadcompressed functions used in
state_compress.
Change-Id: I92851bd12481dbee6a7e344df0890c4901b56d01
2021-10-12 11:51:34 +03:00
George Wort
9fb79ac3ec
Add SVE2 support for vermicelli
Change-Id: Ia025de53521fbaefe5fb1e4425aaf75c7d80a14e
2021-10-12 11:51:34 +03:00
apostolos
6f88ecac44
Supervector test fixes
2021-10-12 11:51:34 +03:00
apostolos
ae6bc52076
SuperVector AVX512 implementations
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
7ae636dfe9
really fix lshift for avx2
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
b42b187712
add AVX2 specializations
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
dede600637
lots of fixes to AVX2 implementation
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
c45e72775f
convert print helper functions to class methods
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
ec3f108d71
fix arm SuperVector implementation
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
f425951b49
fix x86 debug alignr
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
845e533b66
move firstMatch, lastMatch to own header in util
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
6d8f3b9ff8
compilation fixes for debug mode
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
d7b247a949
fix arm implementation of alignr()
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
28b2949396
harmonise syntax of x86 SuperVector impl.cpp like arm, fix alignr, define printv_* functions when on debug mode only
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
9de3065e68
style fixes
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
e0a45a354d
removed obsolete file
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
2753dbb3b0
rename supervector class header, use dup_*() functions names instead of set1_*(), minor fixes
2021-10-12 11:51:34 +03:00
apostolos
1ce5e17ce9
Truffle simd vectorized
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
8b09ecfe48
nits
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
cceb599fc9
fix typo
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
e49fa3a97a
fix unit tests, and resp. ARM SuperVector methods based on those unit tests, add print functions for SuperVector
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
acca824dea
add missing ARM SuperVector methods, some tests still fail, WIP
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
6fbd18183a
rename arm impl.hpp to impl.cpp, add operator|() to SuperVector class
2021-10-12 11:51:34 +03:00
George Wort
3ee7b75ee0
Add SVE, SVE2, and SVE2_BITPERM as targets
Change-Id: I5231e2eb0a31708a16c853dc83ea48db32e0b0a5
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
6526df81e4
add more functions, move defines here, enable inlining of template specializations only when running optimized code
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
d8b5eb5d17
fix compilation on C++
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
273b9683ac
simplify function
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
6e63aafbea
add arm support for the new SuperVector class
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
e6c1fa04ce
add C++ template SIMD library (WIP)
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
52661f35e8
add global definitions for CHUNKSIZE/VECTORSIZE, define HAVE_AVX512* only when BUILD_AVX512 is also enabled
2021-10-12 11:51:33 +03:00
Konstantinos Margaritis
831091db9e
fix typo
2021-10-12 11:51:33 +03:00
Konstantinos Margaritis
556206f138
replace push_back by emplace_back where possible
2021-10-12 11:51:33 +03:00
Konstantinos Margaritis
27bd09454f
use correct function names for AVX512, fix build failure
2021-02-15 13:54:19 +02:00
Konstantinos Margaritis
741d8246c5
fix some AVX512 function names, to fix AVX512 build failure, also rename the expand* functions to broadcast*() ones for consistency
2021-02-15 13:54:19 +02:00
Konstantinos Margaritis
c3c68b1c3f
fix x86 implementations for compress128/expand128
2021-02-15 13:54:19 +02:00
Konstantinos Margaritis
814045201f
add BUILD_AVX2 definition, enable non-AVX2 building selectively
2021-02-15 13:54:19 +02:00
Konstantinos Margaritis
be66cdb51d
fixes in shifting primitives
2021-02-08 19:38:20 +02:00
Konstantinos Margaritis
f541f75400
bugfix compress128/expand128, add unit tests
2021-02-08 19:20:37 +02:00
Konstantinos Margaritis
d9874898c7
make const
2021-02-08 19:19:52 +02:00