Danila Kutenin
|
db52ce6f08
|
Fix avx512 movemask call
|
2022-07-20 09:03:50 +01:00 |
|
Danila Kutenin
|
49eb18ee4f
|
Optimize vectorscan for aarch64 by using shrn instruction
This optimization is based on the thread
https://twitter.com/Danlark1/status/1539344279268691970 and uses the
shift right and narrow by 4 instruction (SHRN): https://developer.arm.com/documentation/ddi0596/2020-12/SIMD-FP-Instructions/SHRN--SHRN2--Shift-Right-Narrow--immediate--
To achieve that, I had to slightly redesign movemask into comparemask
and add an extra step to mask iteration. Our benchmarks
showed a 10-15% improvement on average for long matches.
|
2022-06-26 22:55:45 +00:00 |
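The comparemask idea from commit 49eb18ee4f can be illustrated with a portable scalar emulation. This is only a sketch, not the code from the commit: the names `compare_eq`, `comparemask` and `match_positions` are hypothetical, the loop stands in for what a single SHRN by 4 does on a 128-bit register, and a little-endian lane layout is assumed. Each byte of the compare result (0xFF or 0x00) ends up as one 4-bit nibble of a 64-bit mask, so the byte index of a match is the bit position divided by 4. That division is the additional step in mask iteration.

```cpp
#include <array>
#include <cstdint>
#include <vector>

using Bytes16 = std::array<std::uint8_t, 16>;

// Byte-wise equality compare: 0xFF where data[i] == c, 0x00 otherwise.
Bytes16 compare_eq(const Bytes16 &data, std::uint8_t c) {
    Bytes16 out{};
    for (int i = 0; i < 16; ++i) out[i] = (data[i] == c) ? 0xFF : 0x00;
    return out;
}

// Scalar emulation of "shift right and narrow by 4" over eight 16-bit lanes.
// Assumes little-endian layout: byte 2*lane is the low half of the lane.
std::uint64_t comparemask(const Bytes16 &cmp) {
    std::uint64_t mask = 0;
    for (int lane = 0; lane < 8; ++lane) {
        std::uint16_t wide = static_cast<std::uint16_t>(
            cmp[2 * lane] | (cmp[2 * lane + 1] << 8));
        // Shift right by 4, then keep only the low 8 bits (the "narrow").
        std::uint8_t narrow = static_cast<std::uint8_t>(wide >> 4);
        mask |= static_cast<std::uint64_t>(narrow) << (8 * lane);
    }
    return mask;
}

// Walk the mask: keep one bit per nibble, then pop the lowest set bit
// on each iteration. The byte index is the bit position divided by 4.
std::vector<int> match_positions(std::uint64_t mask) {
    std::vector<int> positions;
    mask &= 0x1111111111111111ULL;
    while (mask != 0) {
        int bit = 0;
        while (((mask >> bit) & 1U) == 0) ++bit;  // portable count-trailing-zeros
        positions.push_back(bit / 4);
        mask &= mask - 1;  // clear the lowest set bit
    }
    return positions;
}
```

On real aarch64 hardware the loop in `comparemask` would be replaced by NEON intrinsics; the scalar form is used here only so the sketch runs on any platform.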
|
apostolos
|
4114b8a480
|
SuperVector opandnot test enriched
|
2021-11-10 15:12:25 +02:00 |
|
Apostolos Tapsas
|
1eb3b19f63
|
Shuffle simd and SuperVector implementations as well as their test really fixed
|
2021-10-25 09:19:30 +03:00 |
|
Apostolos Tapsas
|
d43d6733b6
|
SuperVector shuffle implementation and test function optimized
|
2021-10-22 11:55:39 +00:00 |
|
Konstantinos Margaritis
|
8b7ba89cb5
|
add x86 vsh* implementations
|
2021-10-12 11:51:35 +03:00 |
|
Konstantinos Margaritis
|
67e0674df8
|
Changes/Additions to SuperVector class
* added ==,!=,>=,>,<=,< operators
* reworked shift operators to be more uniform and orthogonal, like Arm ISA
* Added Unroller class to allow handling of multiple cases but avoid code duplication
* pshufb method can now emulate Intel or not (avoids one instruction).
|
2021-10-12 11:51:34 +03:00 |
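To show what an unroller of the kind mentioned in commit 67e0674df8 might look like, here is a minimal sketch. It is not the class from the commit: the shape of `Unroller` and the helpers `shift_left_imm` and `shift_left_runtime` are hypothetical. The assumption is that some operations need their shift amount as a compile-time immediate, so a runtime amount has to be matched against every possible constant without writing each case by hand.

```cpp
#include <cstdint>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for an operation that needs a compile-time immediate.
template <int N>
std::uint64_t shift_left_imm(std::uint64_t v) {
    return v << N;
}

// Calls action(integral_constant<int, I>) for every I in [Begin, End).
template <int Begin, int End>
struct Unroller {
    template <typename Action>
    static void iterator(Action &&action) {
        action(std::integral_constant<int, Begin>());
        Unroller<Begin + 1, End>::iterator(action);
    }
};

// Recursion terminator: an empty range does nothing.
template <int End>
struct Unroller<End, End> {
    template <typename Action>
    static void iterator(Action &&) {}
};

// Dispatch a runtime shift amount to the matching compile-time instance.
// Amounts outside [0, 64) leave the value unchanged in this sketch.
std::uint64_t shift_left_runtime(std::uint64_t v, int n) {
    std::uint64_t result = v;
    Unroller<0, 64>::iterator([&](auto i) {
        constexpr int N = decltype(i)::value;
        if (n == N) result = shift_left_imm<N>(v);
    });
    return result;
}
```

The generic lambda is written once and instantiated for each constant, which is how this pattern covers many cases without duplicated code.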
|
Konstantinos Margaritis
|
e7161fdfec
|
initial SSE/AVX2 implementation
|
2021-10-12 11:51:34 +03:00 |
|
apostolos
|
904a94fbe5
|
micro-benchmarks for shufti, truffle and noodle added
|
2021-10-12 11:51:34 +03:00 |
|
Konstantinos Margaritis
|
8cff876962
|
fix lshift128 test
|
2021-10-12 11:51:34 +03:00 |
|
apostolos
|
67fa6d2738
|
alignr methods for avx2 and avx512 added
|
2021-10-12 11:51:34 +03:00 |
|
apostolos
|
b3a20afbbc
|
limex_shuffle and its unit tests added
|
2021-10-12 11:51:34 +03:00 |
|
Konstantinos Margaritis
|
a2e6143ea1
|
convert to for loops
|
2021-10-12 11:51:34 +03:00 |
|
Konstantinos Margaritis
|
6c51f7f591
|
add {l,r}shift128()+tests, rename printv_u64() to print64()
|
2021-10-12 11:51:34 +03:00 |
|
apostolos
|
89b123d003
|
Equal mask test fixed with random numbers
|
2021-10-12 11:51:34 +03:00 |
|
apostolos
|
6f88ecac44
|
Supervector test fixes
|
2021-10-12 11:51:34 +03:00 |
|
apostolos
|
ae6bc52076
|
SuperVector AVX512 implementations
|
2021-10-12 11:51:34 +03:00 |
|
apostolos
|
32350cf9b1
|
SuperVector unit tests for AVX2 and AVX512 added
|
2021-10-12 11:51:34 +03:00 |
|
Konstantinos Margaritis
|
78e098661f
|
tiny change in vector initialization
|
2021-10-12 11:51:34 +03:00 |
|
Konstantinos Margaritis
|
28b2949396
|
harmonise syntax of x86 SuperVector impl.cpp like arm, fix alignr, define printv_* functions when on debug mode only
|
2021-10-12 11:51:34 +03:00 |
|
Konstantinos Margaritis
|
2753dbb3b0
|
rename supervector class header, use dup_*() functions names instead of set1_*(), minor fixes
|
2021-10-12 11:51:34 +03:00 |
|
apostolos
|
1ce5e17ce9
|
Truffle simd vectorized
|
2021-10-12 11:51:34 +03:00 |
|
Konstantinos Margaritis
|
5297ed5038
|
syntax fixes
|
2021-10-12 11:51:34 +03:00 |
|
Konstantinos Margaritis
|
e49fa3a97a
|
fix unit tests, and resp. ARM SuperVector methods based on those unit tests, add print functions for SuperVector
|
2021-10-12 11:51:34 +03:00 |
|
apostolos
|
1e434a9b3d
|
Supervector Unit Tests
|
2021-10-12 11:51:34 +03:00 |
|
apostolos
|
feb2d3ccf7
|
SuperVector unit tests
|
2021-10-12 11:51:34 +03:00 |
|
apostolos
|
096fb55faa
|
unit tests for supervector
|
2021-10-12 11:51:34 +03:00 |
|