Konstantinos Margaritis
|
a6230d6410
|
add more functions, move defines here, enable inlining of template specializations only when running optimized code
|
2021-10-12 11:51:34 +03:00 |
|
Konstantinos Margaritis
|
d72038bc31
|
fix compilation on C++
|
2021-10-12 11:51:34 +03:00 |
|
Konstantinos Margaritis
|
7262ae8b74
|
simplify function
|
2021-10-12 11:51:34 +03:00 |
|
Konstantinos Margaritis
|
94089d9acb
|
move definitions elsewhere
|
2021-10-12 11:51:34 +03:00 |
|
Konstantinos Margaritis
|
4fe9a75abc
|
move SuperVector versions of noodleEngine scan functions to _simd.hpp file
|
2021-10-12 11:51:34 +03:00 |
|
Konstantinos Margaritis
|
bc2e3dfd2e
|
add arm support for the new SuperVector class
|
2021-10-12 11:51:34 +03:00 |
|
Konstantinos Margaritis
|
923f8bd357
|
simplify scanSingleMain() and scanDoubleMain()
|
2021-10-12 11:51:34 +03:00 |
|
Konstantinos Margaritis
|
4d8fbe95a7
|
delete separate implementations
|
2021-10-12 11:51:34 +03:00 |
|
Konstantinos Margaritis
|
c3101d53f4
|
add C++ template SIMD library (WIP)
|
2021-10-12 11:51:34 +03:00 |
|
Konstantinos Margaritis
|
7c303b62e3
|
add generic SIMD implementation
|
2021-10-12 11:51:34 +03:00 |
|
Konstantinos Margaritis
|
718cc7be1d
|
convert to C++
|
2021-10-12 11:51:34 +03:00 |
|
Konstantinos Margaritis
|
26bb00a932
|
revert to push_back()
|
2021-10-12 11:51:33 +03:00 |
|
Konstantinos Margaritis
|
317fb3dcfc
|
add global definitions for CHUNKSIZE/VECTORSIZE, define HAVE_AVX512* only when BUILD_AVX512 is also enabled
|
2021-10-12 11:51:33 +03:00 |
|
Konstantinos Margaritis
|
5171627e3b
|
fix typo
|
2021-10-12 11:51:33 +03:00 |
|
Konstantinos Margaritis
|
3f35a2be37
|
replace push_back by emplace_back where possible
|
2021-10-12 11:51:33 +03:00 |
|
Konstantinos Margaritis
|
361aa4b900
|
minor optimizations
|
2021-10-12 11:50:32 +03:00 |
|
Konstantinos Margaritis
|
acb542a5be
|
prefetch works best when addresses are 64-byte aligned
|
2021-10-12 11:50:32 +03:00 |
|
Konstantinos Margaritis
|
68b92f338d
|
Revert "replace long macro and switch statement with function pointer array and branchless execution"
This reverts commit cc9dfed249.
|
2021-10-12 11:50:32 +03:00 |
|
Konstantinos Margaritis
|
381ef41168
|
replace long macro and switch statement with function pointer array and branchless execution
|
2021-10-12 11:50:32 +03:00 |
|
Konstantinos Margaritis
|
95b929ed26
|
optimise case handling
|
2021-10-12 11:50:32 +03:00 |
|
Konstantinos Margaritis
|
ffb6a95e72
|
simplify and make scanSingle*()/scanDouble*() more uniform
|
2021-10-12 11:50:32 +03:00 |
|
Konstantinos Margaritis
|
da6216e42d
|
optimize caseMask handling
|
2021-10-12 11:50:32 +03:00 |
|
Konstantinos Margaritis
|
1c1bca4f98
|
use correct function names for AVX512, fix build failure
|
2021-02-15 13:54:19 +02:00 |
|
Konstantinos Margaritis
|
73bab6346d
|
fix some AVX512 function names, to fix AVX512 build failure, also rename the expand* functions to broadcast*() ones for consistency
|
2021-02-15 13:54:19 +02:00 |
|
Konstantinos Margaritis
|
2a2609229c
|
fix x86 implementations for compress128/expand128
|
2021-02-15 13:54:19 +02:00 |
|
Konstantinos Margaritis
|
71c59a95e9
|
add BUILD_AVX2 definition, enable non-AVX2 building selectively
|
2021-02-15 13:54:19 +02:00 |
|
Konstantinos Margaritis
|
e12a6a0fd2
|
use unaligned loads for short scans
|
2021-02-11 14:21:57 +02:00 |
|
Konstantinos Margaritis
|
ecd2842217
|
optimize case mask AND out of the loop
|
2021-02-10 13:29:45 +02:00 |
|
Konstantinos Margaritis
|
e2ed30f42c
|
fixes in shifting primitives
|
2021-02-08 19:38:20 +02:00 |
|
Konstantinos Margaritis
|
2ebb7d2b21
|
bugfix compress128/expand128, add unit tests
|
2021-02-08 19:20:37 +02:00 |
|
Konstantinos Margaritis
|
1e41465eff
|
make const
|
2021-02-08 19:19:52 +02:00 |
|
Wang Xiang W
|
bd29733de2
|
Bump version number for release
|
2021-01-25 14:13:13 +02:00 |
|
Chang, Harry
|
686e73c35e
|
Fix Klocwork scan issues.
|
2021-01-25 14:13:13 +02:00 |
|
Wang Xiang W
|
723b469cf7
|
Limex: exception handling with AVX512
|
2021-01-25 14:13:13 +02:00 |
|
Chang, Harry
|
80c4d1bc6b
|
Logical Combination: use hs_misc_free instead of free.
fixes github issue #284
|
2021-01-25 14:13:13 +02:00 |
|
Wang Xiang W
|
5bd1ee9888
|
Adjust sensitive terms
|
2021-01-25 14:13:13 +02:00 |
|
Wang Xiang W
|
a307e11283
|
limex: add fast NFA check
|
2021-01-25 14:13:13 +02:00 |
|
Chang, Harry
|
191cfef6cd
|
Discard HAVE_AVX512VBMI checks at Sheng/McSheng compile time.
|
2021-01-25 14:13:13 +02:00 |
|
Chang, Harry
|
86b57e409f
|
Add cpu feature / target info "AVX512VBMI".
|
2021-01-25 14:13:13 +02:00 |
|
Zhu,Wenjun
|
1c8c7ea806
|
MCSHENG64: extend to 64-state based on mcsheng
|
2021-01-25 14:13:13 +02:00 |
|
Hong, Yang A
|
8436f95f24
|
lookaround:
add 64x8 and 64x16 shufti models
add mask64 model
expand entry quantity
|
2021-01-25 14:13:13 +02:00 |
|
Chang, Harry
|
e1706c435c
|
AVX512VBMI Fat Teddy.
|
2021-01-25 14:13:13 +02:00 |
|
Chang, Harry
|
99ffbbf425
|
Fix find_vertices_in_cycles(): don't check self-loop in SCC.
|
2021-01-25 14:13:13 +02:00 |
|
Chang, Harry
|
63c7345ab2
|
Fix sheng64 dump compile issue in clang.
|
2021-01-25 14:13:13 +02:00 |
|
Chang, Harry
|
6c56aaf7a9
|
Fix sheng64 compile issue in clang and in DEBUG_OUTPUT mode on SKX.
|
2021-01-25 14:13:13 +02:00 |
|
Chang, Harry
|
cfd3b0bf4e
|
SHENG64: 64-state 1-byte shuffle based DFA.
|
2021-01-25 14:13:13 +02:00 |
|
Chang, Harry
|
7bd488922a
|
SHENG32: Compile priority sheng > mcsheng > sheng32.
|
2021-01-25 14:13:13 +02:00 |
|
Chang, Harry
|
2cde84c96d
|
SHENG32: 32-state 1-byte shuffle based DFA.
|
2021-01-25 14:13:13 +02:00 |
|
Hong, Yang A
|
6f8bfa1854
|
DFA: use sherman economically
|
2021-01-25 14:13:13 +02:00 |
|
Konstantinos Margaritis
|
ba4a83aee7
|
optimize get_conf_stride_1()
|
2021-01-25 12:13:35 +02:00 |
|