22 Commits

Author SHA1 Message Date
George Wort
c7086cb7f1 Add SVE2 support for dvermicelli
Change-Id: I056ef15e162ab6fb1f78964321ce893f4096367e
2021-10-12 11:51:34 +03:00
George Wort
051ceed0f9 Use SVE2 Bitperm's bdep instruction in bitutils and state_compress
Specifically for pdep64, expand32, and expand64 in bitutils,
as well as all of the loadcompressed functions used in
state_compress.

Change-Id: I92851bd12481dbee6a7e344df0890c4901b56d01
2021-10-12 11:51:34 +03:00
George Wort
9fb79ac3ec Add SVE2 support for vermicelli
Change-Id: Ia025de53521fbaefe5fb1e4425aaf75c7d80a14e
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
6e63aafbea add arm support for the new SuperVector class 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
be66cdb51d fixes in shifting primitives 2021-02-08 19:38:20 +02:00
Konstantinos Margaritis
f541f75400 bugfix compress128/expand128, add unit tests 2021-02-08 19:20:37 +02:00
Konstantinos Margaritis
e2f253d8ab remove loads from movemask128, variable_byte_shift, add palignr_imm(), minor fixes 2021-01-25 12:13:35 +02:00
Konstantinos Margaritis
5b85589274 add some useful intrinsics 2021-01-25 12:13:35 +02:00
Konstantinos Margaritis
1c581e45e9 add expand128() implementation for NEON 2021-01-25 12:13:35 +02:00
Konstantinos Margaritis
773dc6fa69 optimize *shiftbyte_m128() functions to use palign instead of variable_byte_shift_m128() 2020-12-07 23:12:26 +02:00
Konstantinos Margaritis
38477b08bc fix movq and load_m128_from_u64a and resp. test for NEON 2020-12-03 19:27:38 +02:00
Konstantinos Margaritis
1c26f044a7 when building in debug mode, vgetq_lane_*() and vextq_*() need immediate operands, and we have to use switch()'ed versions 2020-11-24 17:56:40 +02:00
Konstantinos Margaritis
c4f1372814 remove debug from functions 2020-11-05 20:33:17 +02:00
Konstantinos Margaritis
33904180d8 add compress128 function and implementation 2020-11-05 19:20:06 +02:00
Konstantinos Margaritis
7b8cf97546 add extra instructions (currently arm-only), fix order of elements in set4x32/set2x64 2020-11-05 19:18:53 +02:00
Konstantinos Margaritis
548242981d fix ARM implementations 2020-10-30 10:38:41 +02:00
Konstantinos Margaritis
c4db63665a scalar implementations of diffrich256 and diffrich384 2020-10-16 13:02:40 +03:00
Konstantinos Margaritis
c5a7f4b846 add ARM simd_utils vectorized functions for 128-bit vectors 2020-10-15 16:26:49 +03:00
Konstantinos Margaritis
5b425bd5a6 add arm simple cpuid_flags 2020-10-15 16:26:04 +03:00
Konstantinos Margaritis
31ac6718dd add ARM version of simd_utils.h 2020-10-13 09:19:56 +03:00
Konstantinos Margaritis
a9212174ee add arm bitutils.h header 2020-10-08 20:50:55 +03:00
Konstantinos Margaritis
4c924cc920 add arm architecture basic defines 2020-10-07 14:28:12 +03:00