Author | Commit | Message | Date
Konstantinos Margaritis | 273b9683ac | simplify function | 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis | e215157a21 | move definitions elsewhere | 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis | 05c7c8e576 | move SuperVector versions of noodleEngine scan functions to _simd.hpp file | 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis | 6e63aafbea | add arm support for the new SuperVector class | 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis | c6406bebde | simplify scanSingleMain() and scanDoubleMain() | 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis | f77837130d | delete separate implementations | 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis | e6c1fa04ce | add C++ template SIMD library (WIP) | 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis | ede2b18564 | add generic SIMD implementation | 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis | 7a9a2dd0dc | convert to C++ | 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis | 2805ff038a | revert to push_back() | 2021-10-12 11:51:33 +03:00
Konstantinos Margaritis | 52661f35e8 | add global definitions for CHUNKSIZE/VECTORSIZE, define HAVE_AVX512* only when BUILD_AVX512 is also enabled | 2021-10-12 11:51:33 +03:00
Konstantinos Margaritis | 831091db9e | fix typo | 2021-10-12 11:51:33 +03:00
Konstantinos Margaritis | 556206f138 | replace push_back by emplace_back where possible | 2021-10-12 11:51:33 +03:00
Konstantinos Margaritis | ec5531a6b1 | minor optimizations | 2021-10-12 11:50:32 +03:00
Konstantinos Margaritis | d3ff893871 | prefetch works best when addresses are 64-byte aligned | 2021-10-12 11:50:32 +03:00
Konstantinos Margaritis | 521f233cfd | Revert "replace long macro and switch statement with function pointer array and branchless execution" This reverts commit cc9dfed2494d709aac79051c29adb0a563903ba9. | 2021-10-12 11:50:32 +03:00
Konstantinos Margaritis | 92916e311f | replace long macro and switch statement with function pointer array and branchless execution | 2021-10-12 11:50:32 +03:00
Konstantinos Margaritis | 58cface115 | optimise case handling | 2021-10-12 11:50:32 +03:00
Konstantinos Margaritis | e3e101b412 | simplify and make scanSingle*()/scanDouble*() more uniform | 2021-10-12 11:50:32 +03:00
Konstantinos Margaritis | 2f13ad0674 | optimize caseMask handling | 2021-10-12 11:50:32 +03:00
Konstantinos Margaritis | 27bd09454f | use correct function names for AVX512, fix build failure | 2021-02-15 13:54:19 +02:00
Konstantinos Margaritis | 741d8246c5 | fix some AVX512 function names, to fix AVX512 build failure, also rename the expand* functions to broadcast*() ones for consistency | 2021-02-15 13:54:19 +02:00
Konstantinos Margaritis | c3c68b1c3f | fix x86 implementations for compress128/expand128 | 2021-02-15 13:54:19 +02:00
Konstantinos Margaritis | 814045201f | add BUILD_AVX2 definition, enable non-AVX2 building selectively | 2021-02-15 13:54:19 +02:00
Konstantinos Margaritis | 9fd94e0062 | use unaligned loads for short scans | 2021-02-11 14:21:57 +02:00
Konstantinos Margaritis | d3e03ed88a | optimize case mask AND out of the loop | 2021-02-10 13:29:45 +02:00
Konstantinos Margaritis | be66cdb51d | fixes in shifting primitives | 2021-02-08 19:38:20 +02:00
Konstantinos Margaritis | f541f75400 | bugfix compress128/expand128, add unit tests | 2021-02-08 19:20:37 +02:00
Konstantinos Margaritis | d9874898c7 | make const | 2021-02-08 19:19:52 +02:00
Wang Xiang W | 6a8a7a6c01 | Bump version number for release | 2021-01-25 14:13:13 +02:00
Chang, Harry | 52f658ac55 | Fix Klocwork scan issues. | 2021-01-25 14:13:13 +02:00
Wang Xiang W | 5f930b267c | Limex: exception handling with AVX512 | 2021-01-25 14:13:13 +02:00
Chang, Harry | 001b7824d2 | Logical Combination: use hs_misc_free instead of free. fixes github issue #284 | 2021-01-25 14:13:13 +02:00
Wang Xiang W | beaca7c7db | Adjust sensitive terms | 2021-01-25 14:13:13 +02:00
Wang Xiang W | 9ea1e4be3d | limex: add fast NFA check | 2021-01-25 14:13:13 +02:00
Chang, Harry | 5ad3d64b4b | Discard HAVE_AVX512VBMI checks at Sheng/McSheng compile time. | 2021-01-25 14:13:13 +02:00
Chang, Harry | b19a41528a | Add cpu feature / target info "AVX512VBMI". | 2021-01-25 14:13:13 +02:00
Zhu,Wenjun | d96f1ab505 | MCSHENG64: extend to 64-state based on mcsheng | 2021-01-25 14:13:13 +02:00
Hong, Yang A | dea7c4dc2e | lookaround: add 64x8 and 64x16 shufti models; add mask64 model; expand entry quantity | 2021-01-25 14:13:13 +02:00
Chang, Harry | 56cb107005 | AVX512VBMI Fat Teddy. | 2021-01-25 14:13:13 +02:00
Chang, Harry | f5657ef7b7 | Fix find_vertices_in_cycles(): don't check self-loop in SCC. | 2021-01-25 14:13:13 +02:00
Chang, Harry | a388a0f193 | Fix sheng64 dump compile issue in clang. | 2021-01-25 14:13:13 +02:00
Chang, Harry | c41d33c53f | Fix sheng64 compile issue in clang and in DEBUG_OUTPUT mode on SKX. | 2021-01-25 14:13:13 +02:00
Chang, Harry | ed4b0f713a | SHENG64: 64-state 1-byte shuffle based DFA. | 2021-01-25 14:13:13 +02:00
Chang, Harry | 6a42b37fca | SHENG32: Compile priority sheng > mcsheng > sheng32. | 2021-01-25 14:13:13 +02:00
Chang, Harry | cc747013c4 | SHENG32: 32-state 1-byte shuffle based DFA. | 2021-01-25 14:13:13 +02:00
Hong, Yang A | d71515be04 | DFA: use sherman economically | 2021-01-25 14:13:13 +02:00
Konstantinos Margaritis | 87413fbff0 | optimize get_conf_stride_1() | 2021-01-25 12:13:35 +02:00
Konstantinos Margaritis | e2f253d8ab | remove loads from movemask128, variable_byte_shift, add palignr_imm(), minor fixes | 2021-01-25 12:13:35 +02:00
Konstantinos Margaritis | a039089888 | fix non-const char * write-strings compile error | 2021-01-25 12:13:35 +02:00