35 Commits

Author SHA1 Message Date
Konstantinos Margaritis
1f55d419eb add initial ppc64el support
(cherry picked from commit 63e26a4b2880eda7b6ac7b49271d83ba3e6143c4)
(cherry picked from commit c214ba253327114c16d0724f75c998ab00d44919)
2021-10-14 13:53:55 +03:00
Konstantinos Margaritis
45bfed9b9d add scalar versions of the vectorized functions for architectures that don't support 256-bit/512-bit SIMD vectors such as ARM
2020-10-15 16:30:18 +03:00
Konstantinos Margaritis
31ac6718dd add ARM version of simd_utils.h
2020-10-13 09:19:56 +03:00
Konstantinos Margaritis
e8e188acaf move x86 implementations of simd_utils.h to util/arch/x86/
2020-09-22 13:12:07 +03:00
Konstantinos Margaritis
2d89df44ae move x86 arch and SIMD types to x86 arch folder
2020-09-17 19:00:48 +03:00
Chang, Harry
43204dda48 AVX512VBMI Teddy.
2020-05-25 13:47:53 +00:00
Chang, Harry
8da2d13baa AVX512 Reinforced FAT teddy.
2017-08-21 11:14:59 +10:00
Chang, Harry
68e08d8e18 AVX512 reinforced teddy.
2017-08-21 11:12:36 +10:00
Matthew Barr
3e345c2567 If we can shift by an immediate, do it. Otherwise, don't.
2017-05-30 14:00:45 +10:00
Matthew Barr
f6b688fc06 rename pshufb to pshufb_m128
2017-05-30 13:59:23 +10:00
Matthew Barr
a295c96198 rename vpshufb to pshufb_m256
2017-05-30 13:59:23 +10:00
Matthew Barr
8a56d16d57 avx512: add basic functions to simd_utils
Extends the m512 type to use avx512 and also changes required
for limex.
2017-05-30 13:59:18 +10:00
Xu, Chi
ae3cb7de6f rose: add multi-path shufti 16x8, 32x8, 32x16, 64x8 and multi-path lookaround instructions.
2017-04-26 15:18:56 +10:00
Matthew Barr
cd418ea6a8 Wrapper for system intrin header
2017-04-26 15:18:26 +10:00
Matthew Barr
8201183138 Check compiler architecture flags in one place
2017-04-26 15:18:26 +10:00
Matthew Barr
d2416736cb Use intrinsic to get correct movq everywhere
The real trick here is that _mm_set_epi64x() (note the 'x') takes a 64-bit
value - not a ptr to a 128-bit value like the non-x - so compilers don't
twist themselves in knots with alignment or whatever confuses them.
2017-04-26 15:16:03 +10:00
Wang, Xiang W
90216921b0 FDR: front end loop improvement
2017-04-26 15:11:10 +10:00
Alex Coyte
e51b6d23b9 introduce Sheng-McClellan hybrid
2016-12-14 15:27:18 +11:00
Matthew Barr
99e14df117 Fix combine2x128
2016-12-02 11:33:48 +11:00
Matthew Barr
7d3eff8648 extern "C" for mask1bit table
2016-10-28 14:51:49 +11:00
Xu, Chi
04d79629de rose: add shufti-based lookaround instructions
More lookaround specialisations that use the shufti approach.
2016-10-28 14:46:27 +11:00
Alex Coyte
e74b141e95 rework load_m128_from_u64a()
2016-10-28 14:44:16 +11:00
Alex Coyte
a08e1dd690 Introduce a 64-bit LimEx model.
On 64-bit platforms, the Limex 64 model is implemented in normal GPRs.
On 32-bit platforms, however, 128-bit SSE registers are used for the
runtime implementation.
2016-10-28 14:44:12 +11:00
Xu, Chi
b96d5c23d1 rose: add new instruction CHECK_MASK_32
This is a specialisation of the "lookaround" code.
2016-10-28 14:43:33 +11:00
Justin Viiret
49bb3b5c82 simd_utils: setbit/clearbit by loading 1-bit mask
2016-08-10 14:52:56 +10:00
Matthew Barr
22b451b59b Ensure that m256 is 32-aligned on non-avx2 builds
2016-08-10 14:52:56 +10:00
Matthew Barr
e3d416a6ea Apply some consistency to the names we give shifts
2016-07-08 11:07:50 +10:00
Matthew Barr
c76ff285e7 remove unnecessary function proto
2016-07-08 11:07:50 +10:00
Matthew Barr
9c915cc936 remove only use of cmpmsk8 and unused cmpmsk16
2016-07-08 11:07:50 +10:00
Matthew Barr
0722b5db5b Remove GCC-style compound statements
These do not appear to give us benefits over inlining on recent compilers.
2016-07-08 11:07:50 +10:00
Matthew Barr
adf820bbba simd: simplify the set-all-ones util funcs
Modern compilers (gcc, icc) get this right, with the benefit of
removing our last use of inline asm in this file.
2016-07-08 11:07:50 +10:00
Matthew Barr
4d6934fc77 Move limex specific shuffle utils and ssse3 funcs
2016-07-08 11:07:50 +10:00
Alex Coyte
e86688e313 add m128 byte shift functions
variable_byte_shift_m128 taken from pug-interpreter branch
2016-05-18 16:22:44 +10:00
Matthew Barr
dd4c1eceb8 Remove unused loadu2x128
2016-04-20 13:34:55 +10:00
Matthew Barr
904e436f11 Initial commit of Hyperscan
2015-10-20 09:13:35 +11:00
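
Illustration for commit d2416736cb above: the SSE2 intrinsic `_mm_set_epi64x` takes its two 64-bit halves by value, in (high, low) order, whereas the `movq`-style load `_mm_loadl_epi64` takes a `__m128i const *`. The sketch below is not the code from that commit. `widen_u64` is a hypothetical helper name, and the scalar fallback is an assumption added only so the example also builds on targets without SSE2.

```c
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h> /* SSE2 intrinsics */
#endif

/* Hypothetical helper (not from the repository): place a 64-bit value in the
 * low half of a 16-byte buffer and zero the high half. */
static void widen_u64(uint64_t v, uint8_t out[16]) {
#if defined(__SSE2__)
    /* Arguments are (high, low), both passed by value, so no pointer to a
     * 128-bit object is involved and alignment is not a concern here. */
    __m128i x = _mm_set_epi64x(0LL, (long long)v);
    _mm_storeu_si128((__m128i *)out, x); /* unaligned store into the buffer */
#else
    /* Assumed scalar fallback for targets without SSE2. */
    memset(out, 0, 16);
    memcpy(out, &v, sizeof v);
#endif
}
```

Because the value is passed in a general-purpose register, the compiler is free to emit a plain `movq` into the vector register, which appears to be the effect the commit message describes.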