Konstantinos Margaritis
1f55d419eb
add initial ppc64el support
(cherry picked from commit 63e26a4b2880eda7b6ac7b49271d83ba3e6143c4)
(cherry picked from commit c214ba253327114c16d0724f75c998ab00d44919)
2021-10-14 13:53:55 +03:00
Konstantinos Margaritis
45bfed9b9d
add scalar versions of the vectorized functions for architectures that don't support 256-bit/512-bit SIMD vectors such as ARM
2020-10-15 16:30:18 +03:00
Konstantinos Margaritis
31ac6718dd
add ARM version of simd_utils.h
2020-10-13 09:19:56 +03:00
Konstantinos Margaritis
e8e188acaf
move x86 implementations of simd_utils.h to util/arch/x86/
2020-09-22 13:12:07 +03:00
Konstantinos Margaritis
2d89df44ae
move x86 arch and SIMD types to x86 arch folder
2020-09-17 19:00:48 +03:00
Chang, Harry
43204dda48
AVX512VBMI Teddy.
2020-05-25 13:47:53 +00:00
Chang, Harry
8da2d13baa
AVX512 Reinforced FAT teddy.
2017-08-21 11:14:59 +10:00
Chang, Harry
68e08d8e18
AVX512 reinforced teddy.
2017-08-21 11:12:36 +10:00
Matthew Barr
3e345c2567
If we can shift by an immediate, do it. Otherwise, don't.
2017-05-30 14:00:45 +10:00
Matthew Barr
f6b688fc06
rename pshufb to pshufb_m128
2017-05-30 13:59:23 +10:00
Matthew Barr
a295c96198
rename vpshufb to pshufb_m256
2017-05-30 13:59:23 +10:00
Matthew Barr
8a56d16d57
avx512: add basic functions to simd_utils
Extends the m512 type to use avx512 and also changes required
for limex.
2017-05-30 13:59:18 +10:00
Xu, Chi
ae3cb7de6f
rose: add multi-path shufti 16x8, 32x8, 32x16, 64x8 and multi-path lookaround instructions.
2017-04-26 15:18:56 +10:00
Matthew Barr
cd418ea6a8
Wrapper for system intrin header
2017-04-26 15:18:26 +10:00
Matthew Barr
8201183138
Check compiler architecture flags in one place
2017-04-26 15:18:26 +10:00
Matthew Barr
d2416736cb
Use intrinsic to get correct movq everywhere
The real trick here is that _mm_set_epi64x() (note the 'x') takes a 64-bit
value - not a ptr to a 128-bit value like the non-x - so compilers don't
twist themselves in knots with alignment or whatever confuses them.
2017-04-26 15:16:03 +10:00
Wang, Xiang W
90216921b0
FDR: front end loop improvement
2017-04-26 15:11:10 +10:00
Alex Coyte
e51b6d23b9
introduce Sheng-McClellan hybrid
2016-12-14 15:27:18 +11:00
Matthew Barr
99e14df117
Fix combine2x128
2016-12-02 11:33:48 +11:00
Matthew Barr
7d3eff8648
extern "C" for mask1bit table
2016-10-28 14:51:49 +11:00
Xu, Chi
04d79629de
rose: add shufti-based lookaround instructions
More lookaround specialisations that use the shufti approach.
2016-10-28 14:46:27 +11:00
Alex Coyte
e74b141e95
rework load_m128_from_u64a()
2016-10-28 14:44:16 +11:00
Alex Coyte
a08e1dd690
Introduce a 64-bit LimEx model.
On 64-bit platforms, the Limex 64 model is implemented in normal GPRs.
On 32-bit platforms, however, 128-bit SSE registers are used for the
runtime implementation.
2016-10-28 14:44:12 +11:00
Xu, Chi
b96d5c23d1
rose: add new instruction CHECK_MASK_32
This is a specialisation of the "lookaround" code.
2016-10-28 14:43:33 +11:00
Justin Viiret
49bb3b5c82
simd_utils: setbit/clearbit by loading 1-bit mask
2016-08-10 14:52:56 +10:00
Matthew Barr
22b451b59b
Ensure that m256 is 32-aligned on non-avx2 builds
2016-08-10 14:52:56 +10:00
Matthew Barr
e3d416a6ea
Apply some consistency to the names we give shifts
2016-07-08 11:07:50 +10:00
Matthew Barr
c76ff285e7
remove unnecessary function proto
2016-07-08 11:07:50 +10:00
Matthew Barr
9c915cc936
remove only use of cmpmsk8 and unused cmpmsk16
2016-07-08 11:07:50 +10:00
Matthew Barr
0722b5db5b
Remove GCC-style compound statements
These do not appear to give us benefits over inlining on recent compilers.
2016-07-08 11:07:50 +10:00
Matthew Barr
adf820bbba
simd: simplify the set-all-ones util funcs
Modern compilers (gcc, icc) get this right, with the benefit of
removing our last use of inline asm in this file.
2016-07-08 11:07:50 +10:00
Matthew Barr
4d6934fc77
Move limex specific shuffle utils and ssse3 funcs
2016-07-08 11:07:50 +10:00
Alex Coyte
e86688e313
add m128 byte shift functions
variable_byte_shift_m128 taken from pug-interpreter branch
2016-05-18 16:22:44 +10:00
Matthew Barr
dd4c1eceb8
Remove unused loadu2x128
2016-04-20 13:34:55 +10:00
Matthew Barr
904e436f11
Initial commit of Hyperscan
2015-10-20 09:13:35 +11:00