21 Commits

Author SHA1 Message Date
George Wort
ace6cd15f2 Use SVE2 Bitperm's bdep instruction in bitutils and state_compress
Specifically for pdep64, expand32, and expand64 in bitutils,
as well as all of the loadcompressed functions used in
state_compress.

Change-Id: I92851bd12481dbee6a7e344df0890c4901b56d01
2021-10-12 11:51:34 +03:00
George Wort
acfa11a34f Add SVE2 support for vermicelli
Change-Id: Ia025de53521fbaefe5fb1e4425aaf75c7d80a14e
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
bc2e3dfd2e add arm support for the new SuperVector class 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
e2ed30f42c fixes in shifting primitives 2021-02-08 19:38:20 +02:00
Konstantinos Margaritis
2ebb7d2b21 bugfix compress128/expand128, add unit tests 2021-02-08 19:20:37 +02:00
Konstantinos Margaritis
444bec59fb remove loads from movemask128, variable_byte_shift, add palignr_imm(), minor fixes 2021-01-25 12:13:35 +02:00
Konstantinos Margaritis
f30ced88c2 add some useful intrinsics 2021-01-25 12:13:35 +02:00
Konstantinos Margaritis
0b14b24616 add expand128() implementation for NEON 2021-01-25 12:13:35 +02:00
Konstantinos Margaritis
9716197623 optimize *shiftbyte_m128() functions to use palign instead of variable_byte_shift_m128() 2020-12-07 23:12:26 +02:00
Konstantinos Margaritis
ffbb6eb548 fix movq and load_m128_from_u64a and resp. test for NEON 2020-12-03 19:27:38 +02:00
Konstantinos Margaritis
505d7215c3 when building in debug mode, vgetq_lane_*() and vextq_*() need immediate operands, and we have to use switch()'ed versions 2020-11-24 17:56:40 +02:00
Konstantinos Margaritis
1d02082052 remove debug from functions 2020-11-05 20:33:17 +02:00
Konstantinos Margaritis
c728b76898 add compress128 function and implementation 2020-11-05 19:20:06 +02:00
Konstantinos Margaritis
5e2d704bcd add extra instructions (currently arm-only), fix order of elements in set4x32/set2x64 2020-11-05 19:18:53 +02:00
Konstantinos Margaritis
c9b338fd6c fix ARM implementations 2020-10-30 10:38:41 +02:00
Konstantinos Margaritis
a34cbf8edb scalar implementations of diffrich256 and diffrich384 2020-10-16 13:02:40 +03:00
Konstantinos Margaritis
08c3114090 add ARM simd_utils vectorized functions for 128-bit vectors 2020-10-15 16:26:49 +03:00
Konstantinos Margaritis
64535610f5 add arm simple cpuid_flags 2020-10-15 16:26:04 +03:00
Konstantinos Margaritis
d3b33ac02d add ARM version of simd_utils.h 2020-10-13 09:19:56 +03:00
Konstantinos Margaritis
73297dea33 add arm bitutils.h header 2020-10-08 20:50:55 +03:00
Konstantinos Margaritis
b77ffbf4ed add arm architecture basic defines 2020-10-07 14:28:12 +03:00
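
Several of the commits above (pdep64, expand32/expand64, compress128/expand128, and the SVE2 bdep change) revolve around bit deposit and bit extract. As a rough illustration only, the following portable C sketch shows what these operations compute in scalar form. The names deposit_bits64 and extract_bits64 are hypothetical and are not taken from bitutils; the sketch assumes the usual PDEP/PEXT-style semantics and says nothing about how the repository actually implements them.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar reference for "bit deposit" (the operation performed
 * by x86 PDEP and SVE2 BDEP): the low-order bits of x are scattered, in
 * order, to the positions where mask has a 1 bit. */
static uint64_t deposit_bits64(uint64_t x, uint64_t mask) {
    uint64_t result = 0;
    for (uint64_t bit = 1; mask != 0; bit <<= 1) {
        uint64_t lowest = mask & (~mask + 1); /* isolate lowest set mask bit */
        if (x & bit) {
            result |= lowest;
        }
        mask &= mask - 1; /* clear that mask bit and move on */
    }
    return result;
}

/* Hypothetical scalar reference for the inverse, "bit extract" (PEXT-style):
 * the bits of x selected by mask are gathered into the low-order bits. */
static uint64_t extract_bits64(uint64_t x, uint64_t mask) {
    uint64_t result = 0;
    for (uint64_t bit = 1; mask != 0; bit <<= 1) {
        uint64_t lowest = mask & (~mask + 1);
        if (x & lowest) {
            result |= bit;
        }
        mask &= mask - 1;
    }
    return result;
}

/* Small self-check: mask 0x1A selects bit positions 1, 3 and 4. */
static void deposit_extract_self_check(void) {
    assert(deposit_bits64(0x5, 0x1A) == 0x12);
    assert(extract_bits64(0x12, 0x1A) == 0x5);
}
```

A loop like this costs one iteration per set mask bit, which is presumably why a single hardware instruction such as bdep is attractive for these paths.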