George Wort
c7086cb7f1
Add SVE2 support for dvermicelli
Change-Id: I056ef15e162ab6fb1f78964321ce893f4096367e
2021-10-12 11:51:34 +03:00
George Wort
051ceed0f9
Use SVE2 Bitperm's bdep instruction in bitutils and state_compress
Specifically for pdep64, expand32, and expand64 in bitutils,
as well as all of the loadcompressed functions used in
state_compress.
Change-Id: I92851bd12481dbee6a7e344df0890c4901b56d01
2021-10-12 11:51:34 +03:00
George Wort
9fb79ac3ec
Add SVE2 support for vermicelli
Change-Id: Ia025de53521fbaefe5fb1e4425aaf75c7d80a14e
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
6e63aafbea
add arm support for the new SuperVector class
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
be66cdb51d
fixes in shifting primitives
2021-02-08 19:38:20 +02:00
Konstantinos Margaritis
f541f75400
bugfix compress128/expand128, add unit tests
2021-02-08 19:20:37 +02:00
Konstantinos Margaritis
e2f253d8ab
remove loads from movemask128, variable_byte_shift, add palignr_imm(), minor fixes
2021-01-25 12:13:35 +02:00
Konstantinos Margaritis
5b85589274
add some useful intrinsics
2021-01-25 12:13:35 +02:00
Konstantinos Margaritis
1c581e45e9
add expand128() implementation for NEON
2021-01-25 12:13:35 +02:00
Konstantinos Margaritis
773dc6fa69
optimize *shiftbyte_m128() functions to use palign instead of variable_byte_shift_m128()
2020-12-07 23:12:26 +02:00
Konstantinos Margaritis
38477b08bc
fix movq and load_m128_from_u64a and resp. test for NEON
2020-12-03 19:27:38 +02:00
Konstantinos Margaritis
1c26f044a7
when building in debug mode, vgetq_lane_*() and vextq_*() need immediate operands, and we have to use switch()'ed versions
2020-11-24 17:56:40 +02:00
Konstantinos Margaritis
c4f1372814
remove debug from functions
2020-11-05 20:33:17 +02:00
Konstantinos Margaritis
33904180d8
add compress128 function and implementation
2020-11-05 19:20:06 +02:00
Konstantinos Margaritis
7b8cf97546
add extra instructions (currently arm-only), fix order of elements in set4x32/set2x64
2020-11-05 19:18:53 +02:00
Konstantinos Margaritis
548242981d
fix ARM implementations
2020-10-30 10:38:41 +02:00
Konstantinos Margaritis
c4db63665a
scalar implementations of diffrich256 and diffrich384
2020-10-16 13:02:40 +03:00
Konstantinos Margaritis
c5a7f4b846
add ARM simd_utils vectorized functions for 128-bit vectors
2020-10-15 16:26:49 +03:00
Konstantinos Margaritis
5b425bd5a6
add arm simple cpuid_flags
2020-10-15 16:26:04 +03:00
Konstantinos Margaritis
31ac6718dd
add ARM version of simd_utils.h
2020-10-13 09:19:56 +03:00
Konstantinos Margaritis
a9212174ee
add arm bitutils.h header
2020-10-08 20:50:55 +03:00
Konstantinos Margaritis
4c924cc920
add arm architecture basic defines
2020-10-07 14:28:12 +03:00