Commit Graph

  • a388a0f193 Fix sheng64 dump compile issue in clang. Chang, Harry 2020-09-01 07:04:04 +00:00
  • c41d33c53f Fix sheng64 compile issue in clang and in DEBUG_OUTPUT mode on SKX. Chang, Harry 2020-08-31 13:27:22 +00:00
  • ed4b0f713a SHENG64: 64-state 1-byte shuffle based DFA. Chang, Harry 2020-07-10 13:26:17 +00:00
  • 6a42b37fca SHENG32: Compile priority sheng > mcsheng > sheng32. Chang, Harry 2020-07-20 06:36:53 +00:00
  • cc747013c4 SHENG32: 32-state 1-byte shuffle based DFA. Chang, Harry 2018-11-01 16:33:58 +08:00
  • d71515be04 DFA: use sherman economically Hong, Yang A 2020-06-18 09:48:52 +00:00
  • 7d21fc157c hsbench: add CSV dump support Wang Xiang W 2020-04-30 07:37:55 -04:00
  • 87413fbff0 optimize get_conf_stride_1() Konstantinos Margaritis 2021-01-22 10:13:55 +02:00
  • e2f253d8ab remove loads from movemask128, variable_byte_shift, add palignr_imm(), minor fixes Konstantinos Margaritis 2021-01-22 10:13:19 +02:00
  • a039089888 fix non-const char * write-strings compile error Konstantinos Margaritis 2021-01-22 10:11:20 +02:00
  • 4686ac47b6 replace andn() by explicit bitops and group loads/stores, gives ~1% gain Konstantinos Margaritis 2021-01-18 13:00:45 +02:00
  • b62247a36e borrow cache prefetching tricks from the Marvell port, seem to improve performance by 5-28% Konstantinos Margaritis 2021-01-15 17:42:11 +02:00
  • 51dcfa8571 fix compilation on non-x86 Konstantinos Margaritis 2021-01-15 17:35:21 +02:00
  • 5b85589274 add some useful intrinsics Konstantinos Margaritis 2021-01-15 17:35:01 +02:00
  • 1c581e45e9 add expand128() implementation for NEON Konstantinos Margaritis 2021-01-15 17:33:41 +02:00
  • c238d627c9 optimize get_conf_stride_1() Konstantinos Margaritis 2021-01-22 10:13:55 +02:00
  • f9ef98ce19 remove loads from movemask128, variable_byte_shift, add palignr_imm(), minor fixes Konstantinos Margaritis 2021-01-22 10:13:19 +02:00
  • dfba9227e9 fix non-const char * write-strings compile error Konstantinos Margaritis 2021-01-22 10:11:20 +02:00
  • 9bf5cac782 replace andn() by explicit bitops and group loads/stores, gives ~1% gain Konstantinos Margaritis 2021-01-18 13:00:45 +02:00
  • 94739756b4 borrow cache prefetching tricks from the Marvell port, seem to improve performance by 5-28% Konstantinos Margaritis 2021-01-15 17:42:11 +02:00
  • fc4338eca0 fix compilation on non-x86 Konstantinos Margaritis 2021-01-15 17:35:21 +02:00
  • ef9bf02d00 add some useful intrinsics Konstantinos Margaritis 2021-01-15 17:35:01 +02:00
  • 6a11c83630 add expand128() implementation for NEON Konstantinos Margaritis 2021-01-15 17:33:41 +02:00
  • 64a995bf44 Merge branch 'github_develop' into github_master (tag: v5.4.0) Hong, Yang A 2021-01-13 14:39:34 +00:00
  • 433d2f386a Bump version number for release Wang Xiang W 2020-12-21 10:11:22 +00:00
  • 76066b9ef2 changelog: updates for 5.4.0 release Wang Xiang W 2020-12-21 10:09:43 +00:00
  • 66dc649197 Fix Klocwork scan issues. Chang, Harry 2020-12-27 12:04:55 +00:00
  • d1ea4c762a chimera: fix return value handling Wang Xiang W 2020-12-01 10:50:13 -05:00
  • 2945c9bd20 Limex: exception handling with AVX512 Wang Xiang W 2020-04-24 11:51:34 -04:00
  • 20e69f6ad8 Logical Combination: use hs_misc_free instead of free. Chang, Harry 2020-12-02 05:13:23 +00:00
  • 845ea5c9e3 examples: add cmake enabling option BUILD_EXAMPLES. Hong, Yang A 2020-12-01 08:41:59 +00:00
  • b16c6200ee [dev-reference] Fix minor typo in docs Piotr Skamruk 2020-08-12 17:30:11 +02:00
  • 1a43a63218 Fixed several typos. Fixed spellings of regular, interpretation, and grammar to improve readability. Walt Stoneburner 2020-05-18 13:15:34 -04:00
  • 04d3be487d Adjust sensitive terms Wang Xiang W 2020-11-19 14:25:21 +00:00
  • 5eab583df5 limex: add fast NFA check Wang Xiang W 2020-09-10 09:55:12 +00:00
  • ddc247516c Discard HAVE_AVX512VBMI checks at Sheng/McSheng compile time. Chang, Harry 2020-10-21 12:30:04 +00:00
  • 5326b3e688 Add cpu feature / target info "AVX512VBMI". Chang, Harry 2020-10-21 05:14:53 +00:00
  • 0102f03c9c MCSHENG64: extend to 64-state based on mcsheng Zhu,Wenjun 2020-09-08 14:59:33 +00:00
  • f06e19e6cb lookaround: add 64x8 and 64x16 shufti models; add mask64 model; expand entry quantity Hong, Yang A 2020-10-20 20:34:50 +00:00
  • 00b697bb3b AVX512VBMI Fat Teddy. Chang, Harry 2020-02-25 13:35:09 +08:00
  • 007117146c Fix find_vertices_in_cycles(): don't check self-loop in SCC. Chang, Harry 2020-09-19 05:00:13 +00:00
  • 1bd99d9318 Fix cmake error on ICX under release mode. Chang, Harry 2020-08-26 05:39:10 +00:00
  • 0c4c149433 Fix sheng64 dump compile issue in clang. Chang, Harry 2020-09-01 07:04:04 +00:00
  • d8dc1ad685 Fix sheng64 compile issue in clang and in DEBUG_OUTPUT mode on SKX. Chang, Harry 2020-08-31 13:27:22 +00:00
  • 27ab2e086d SHENG64: 64-state 1-byte shuffle based DFA. Chang, Harry 2020-07-10 13:26:17 +00:00
  • cf06d552f8 SHENG32: Compile priority sheng > mcsheng > sheng32. Chang, Harry 2020-07-20 06:36:53 +00:00
  • 33cef12050 SHENG32: 32-state 1-byte shuffle based DFA. Chang, Harry 2018-11-01 16:33:58 +08:00
  • 15f0ccd1b8 DFA: use sherman economically Hong, Yang A 2020-06-18 09:48:52 +00:00
  • 475ad00f53 hsbench: add CSV dump support Wang Xiang W 2020-04-30 07:37:55 -04:00
  • 644aac5e1b Merge pull request #5 from VectorCamp/bugfix/fix-ia32-build (tag: v5.3.2) Konstantinos Margaritis 2020-12-31 09:50:35 +02:00
  • 752a42419b fix IA32 build, as we need minimum SSSE3 support for compilation to succeed Konstantinos Margaritis 2020-12-30 19:57:44 +02:00
  • 124455a4a8 Merge pull request #2 from VectorCamp/develop (tag: v5.3.1) Konstantinos Margaritis 2020-12-21 20:50:27 +02:00
  • 0372a8120a Merge pull request #1 from VectorCamp/feature/add-arm-support Konstantinos Margaritis 2020-12-16 19:01:32 +02:00
  • 61b963a717 fix x86 compilation Konstantinos Margaritis 2020-12-08 11:42:30 +02:00
  • e088c6ae2b remove forgotten printf Konstantinos Margaritis 2020-12-07 23:12:41 +02:00
  • 773dc6fa69 optimize *shiftbyte_m128() functions to use palign instead of variable_byte_shift_m128() Konstantinos Margaritis 2020-12-07 23:12:26 +02:00
  • 39945b7775 clear zones array Konstantinos Margaritis 2020-12-03 19:30:50 +02:00
  • c38722a68b add ARM platform Konstantinos Margaritis 2020-12-03 19:27:58 +02:00
  • 38477b08bc fix movq and load_m128_from_u64a and resp. test for NEON Konstantinos Margaritis 2020-12-03 19:27:38 +02:00
  • 259c2572c1 define debug vector print functions to NULL in non-debug mode Konstantinos Margaritis 2020-12-03 19:27:05 +02:00
  • 17ab42d891 small optimization that was for some reason failing in ARM, should be faster anyway Konstantinos Margaritis 2020-11-24 17:59:42 +02:00
  • d76365240b helper functions to print a m128 vector in debug mode Konstantinos Margaritis 2020-11-24 17:57:16 +02:00
  • 1c26f044a7 when building in debug mode, vgetq_lane_*() and vextq_*() need immediate operands, and we have to use switch()'ed versions Konstantinos Margaritis 2020-11-24 17:56:40 +02:00
  • 606c53a05f fix compiler flag testcase Konstantinos Margaritis 2020-11-24 17:55:03 +02:00
  • c4f1372814 remove debug from functions Konstantinos Margaritis 2020-11-05 20:33:17 +02:00
  • 62fed20ad0 add some debug and minor optimizations in unit test Konstantinos Margaritis 2020-11-05 19:21:16 +02:00
  • 501f60e930 add some debug info Konstantinos Margaritis 2020-11-05 19:20:37 +02:00
  • 33904180d8 add compress128 function and implementation Konstantinos Margaritis 2020-11-05 19:20:06 +02:00
  • 7b8cf97546 add extra instructions (currently arm-only), fix order of elements in set4x32/set2x64 Konstantinos Margaritis 2020-11-05 19:18:53 +02:00
  • 18296eee47 fix 32-bit/64-bit detection Konstantinos Margaritis 2020-11-05 17:31:20 +02:00
  • 592b1905af needed for ARM vector type conversions Konstantinos Margaritis 2020-10-30 10:50:24 +02:00
  • 547f79b920 small optimization in storecompress*() Konstantinos Margaritis 2020-10-30 10:49:50 +02:00
  • 548242981d fix ARM implementations Konstantinos Margaritis 2020-10-30 10:38:41 +02:00
  • 0bef151437 don't use SSE directly in the tests Konstantinos Margaritis 2020-10-30 10:38:05 +02:00
  • 149ea938c4 don't redefine function on x86 Konstantinos Margaritis 2020-10-16 13:09:08 +03:00
  • c4db63665a scalar implementations of diffrich256 and diffrich384 Konstantinos Margaritis 2020-10-16 13:02:40 +03:00
  • 4bce012570 Revert "move x86 popcount.h implementations to util/arch/x86/popcount.h" Konstantinos Margaritis 2020-10-16 12:32:44 +03:00
  • 83977db7ab split arch-agnostic simd_utils.h functions into the common file Konstantinos Margaritis 2020-10-16 12:30:34 +03:00
  • e7e1308d7f fix compilation paths for cpuid_flags for x86 Konstantinos Margaritis 2020-10-16 12:29:45 +03:00
  • 45bfed9b9d add scalar versions of the vectorized functions for architectures that don't support 256-bit/512-bit SIMD vectors such as ARM Konstantinos Margaritis 2020-10-15 16:30:18 +03:00
  • c5a7f4b846 add ARM simd_utils vectorized functions for 128-bit vectors Konstantinos Margaritis 2020-10-15 16:26:49 +03:00
  • 5b425bd5a6 add arm simple cpuid_flags Konstantinos Margaritis 2020-10-15 16:25:29 +03:00
  • 31ac6718dd add ARM version of simd_utils.h Konstantinos Margaritis 2020-10-13 09:19:56 +03:00
  • a9212174ee add arm bitutils.h header Konstantinos Margaritis 2020-10-08 20:50:55 +03:00
  • 1c2c73becf add C implementation of pdep64() Konstantinos Margaritis 2020-10-08 20:50:18 +03:00
  • d2cf1a7882 move cpuid_flags.h header to common Konstantinos Margaritis 2020-10-08 20:48:20 +03:00
  • 5d773dd9db use C implementation of popcount for arm Konstantinos Margaritis 2020-10-07 14:28:45 +03:00
  • 4c924cc920 add arm architecture basic defines Konstantinos Margaritis 2020-10-07 14:28:12 +03:00
  • 9a0494259e minor fix Konstantinos Margaritis 2020-10-07 14:26:41 +03:00
  • e91082d477 use right intrinsic Konstantinos Margaritis 2020-10-06 13:45:52 +03:00
  • 5952c64066 add necessary modifications to CMake system to enable building on ARM, add arm_neon.h intrinsic header to intrinsics.h Konstantinos Margaritis 2020-10-06 12:44:23 +03:00
  • b1170bcc2e add arm checks in platform.cmake Konstantinos Margaritis 2020-10-06 08:09:18 +03:00
  • f0e70bc0ad Revert "Revert "move x86 popcount.h implementations to util/arch/x86/popcount.h"" Konstantinos Margaritis 2020-09-24 11:52:59 +03:00
  • 04fbf24681 Revert "move x86 popcount.h implementations to util/arch/x86/popcount.h" Konstantinos Margaritis 2020-09-23 21:38:12 +03:00
  • 5333467249 fix names, use own intrinsic instead of explicit _mm* ones Konstantinos Margaritis 2020-09-23 11:51:21 +03:00
  • f7a6b8934c add some set*() functions, harmonize names, rename setAxB to set1_AxB when using mm_set1_* internally Konstantinos Margaritis 2020-09-23 11:49:26 +03:00
  • e8e188acaf move x86 implementations of simd_utils.h to util/arch/x86/ Konstantinos Margaritis 2020-09-22 13:12:07 +03:00
  • e915d84864 no need to check for WIN32* Konstantinos Margaritis 2020-09-22 13:10:52 +03:00
  • 9f3ad89ed6 move andn helper function to bitutils.h Konstantinos Margaritis 2020-09-22 12:17:27 +03:00
  • 6581aae90e move x86 popcount.h implementations to util/arch/x86/popcount.h Konstantinos Margaritis 2020-09-22 11:45:24 +03:00