Apostolos Tapsas
1eb3b19f63
Shuffle simd and SuperVector implementations as well as their tests really fixed
2021-10-25 09:19:30 +03:00
Apostolos Tapsas
d43d6733b6
SuperVector shuffle implementation and test function optimized
2021-10-22 11:55:39 +00:00
apostolos
57301721f1
print functions missing keywords replaced
2021-10-22 12:38:16 +03:00
apostolos
24f149f239
print functions keyword renamed
2021-10-22 12:36:07 +03:00
apostolos
b53b0a0fcd
test for movemask and shuffle cases added
2021-10-22 11:17:43 +03:00
Apostolos Tapsas
5abda15c26
expand128 bugs fixed
2021-10-22 07:05:55 +00:00
apostolos
7184ce9870
expand128 implementation was changed to be like arm's
2021-10-22 09:46:04 +03:00
Apostolos Tapsas
2b1db73326
WIP: simd & bitutils files functions fixes
2021-10-21 13:34:02 +00:00
Apostolos Tapsas
558313a2c2
SuperVector operators fixes and simd_utils low/high64 functions implementations added
2021-10-18 12:26:38 +00:00
Apostolos Tapsas
e084c2d6e4
SuperVector vsh* implementations
2021-10-15 14:07:17 +00:00
apostolos
b1f53f8e49
match file for ARCH_PPC64EL added
2021-10-14 16:26:59 +03:00
apostolos
ba4472a61c
truffle and shufle implementations for ARCH_PPC64EL
2021-10-14 16:01:21 +03:00
apostolos
d0a41252c8
blockSingleMask implementations for ARCH_PPC64 added
2021-10-14 15:56:13 +03:00
apostolos
4d2acd59e2
Supervector vsh* added
2021-10-14 15:08:23 +03:00
Apostolos Tapsas
7888dd4418
WIP: Power VSX support almost completed
2021-10-14 13:53:55 +03:00
Vectorcamp
2231f7c024
compile fixes for vsc port
2021-10-14 13:53:55 +03:00
apostolos
90d3db1776
update powerpc simd util file functions
2021-10-14 13:53:55 +03:00
apostolos
0078c28ee6
implementations for powerpc64el architecture
2021-10-14 13:53:55 +03:00
Konstantinos Margaritis
1f55d419eb
add initial ppc64el support
...
(cherry picked from commit 63e26a4b2880eda7b6ac7b49271d83ba3e6143c4)
(cherry picked from commit c214ba253327114c16d0724f75c998ab00d44919)
2021-10-14 13:53:55 +03:00
Konstantinos Margaritis
4e044d4142
Add missing copyright info from tampered files
2021-10-12 11:51:35 +03:00
Konstantinos Margaritis
c3baf3d296
fix multiple/undefined symbols when using fat runtimes
2021-10-12 11:51:35 +03:00
Konstantinos Margaritis
2d9f52d03e
add arm truffle block function
2021-10-12 11:51:35 +03:00
Konstantinos Margaritis
9d0c15c448
add simd_onebit_masks as static in arm simd_utils.h as well
2021-10-12 11:51:35 +03:00
Konstantinos Margaritis
aea10b8ab0
simplify truffle and provide arch-specific block functions
2021-10-12 11:51:35 +03:00
Konstantinos Margaritis
623c64142b
simplify shufti and provide arch-specific block functions
2021-10-12 11:51:35 +03:00
Konstantinos Margaritis
577e03e0c7
rearrange method declarations
2021-10-12 11:51:35 +03:00
Konstantinos Margaritis
9c54412447
remove simd_utils.c
2021-10-12 11:51:35 +03:00
Konstantinos Margaritis
8b7ba89cb5
add x86 vsh* implementations
2021-10-12 11:51:35 +03:00
Konstantinos Margaritis
eebd6c97bc
use movemask
2021-10-12 11:51:35 +03:00
Konstantinos Margaritis
6ceab8435d
add header define to avoid double inclusion
2021-10-12 11:51:35 +03:00
Konstantinos Margaritis
fad39b6058
optimize and simplify Shufti and Truffle to work with a single block method instead
2021-10-12 11:51:35 +03:00
Konstantinos Margaritis
9e6c1c30cf
remove asserts, as they are not needed
2021-10-12 11:51:35 +03:00
Konstantinos Margaritis
fa3d509fad
firstMatch/lastMatch are now arch-dependent; emulating movemask on non-Intel is very costly, and the alternative is almost twice as fast on Arm
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
9ab18cf419
fix for new pshufb
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
67e0674df8
Changes/Additions to SuperVector class
* added ==,!=,>=,>,<=,< operators
* reworked shift operators to be more uniform and orthogonal, like Arm ISA
* Added Unroller class to allow handling of multiple cases but avoid code duplication
* pshufb method can now emulate Intel or not (avoids one instruction).
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
e7161fdfec
initial SSE/AVX2 implementation
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
08357a096c
remove Windows/ICC support
2021-10-12 11:51:34 +03:00
apostolos
67fa6d2738
alignr methods for avx2 and avx512 added
2021-10-12 11:51:34 +03:00
apostolos
b3a20afbbc
limex_shuffle and its unit tests added
2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
de30471edd
remove duplicate functions from previous merge
2021-10-12 11:51:34 +03:00
George Wort
a879715953
Move SVE functions into their own files.
...
Change-Id: I995ba4b7d2b558ee403693ee45d747d414d3b177
2021-10-12 11:51:34 +03:00
George Wort
6c6aee9682
Implement new DoubleVermicelli16 acceleration functions using SVE2
...
Change-Id: Id4a8ffca840caab930a6e78cc0dfd0fe7d320b4e
2021-10-12 11:51:34 +03:00
George Wort
25183089fd
Use SVE shufti for counting miracles.
...
Change-Id: Idd4aaf5bbc05fc90e9138c6fed385bc6ffa7b0b8
2021-10-12 11:51:34 +03:00
George Wort
00fff3f53c
Use SVE for double shufti.
...
Change-Id: I09e0d57bb8a2f05b613f6225dea79ae823136268
2021-10-12 11:51:34 +03:00
George Wort
c95a4c3dd1
Use SVE for single shufti.
...
Change-Id: Ic76940c5bb9b81a1c45d39e9ca396a158c50a7dc
2021-10-12 11:51:34 +03:00
George Wort
56ef2d5f72
Use SVE2 for counting miracles.
...
Change-Id: I048dc182e5f4e726b847b3285ffafef4f538e550
2021-10-12 11:51:34 +03:00
George Wort
ab5d4d9279
Replace USE_ARM_SVE with HAVE_SVE.
...
Change-Id: I469efaac197cba93201f2ca6eca78ca61be3054d
2021-10-12 11:51:34 +03:00
George Wort
8242f46ed7
Add Licence to state_compress and bitutils.
...
Change-Id: I958daf82e5aef5bd306424dcfa7812382b266d65
2021-10-12 11:51:34 +03:00
George Wort
df926ef62f
Implement new Vermicelli16 acceleration functions using SVE2.
...
The scheme utilises the MATCH and NMATCH instructions to
scan for 16 characters at the same rate as vermicelli
scans for one.
Change-Id: Ie2cef904c56651e6108593c668e9b65bc001a886
2021-10-12 11:51:34 +03:00
George Wort
c7086cb7f1
Add SVE2 support for dvermicelli
...
Change-Id: I056ef15e162ab6fb1f78964321ce893f4096367e
2021-10-12 11:51:34 +03:00