Commit Graph

  • bf54aae779 Special case for Shuffle test added as well as comments for respective implementations apostolos 2021-10-26 11:48:33 +03:00
  • 1eb3b19f63 Shuffle simd and SuperVector implementations as well as their test really fixed Apostolos Tapsas 2021-10-24 16:52:12 +00:00
  • d43d6733b6 SuperVector shuffle implementation and test function optimized Apostolos Tapsas 2021-10-22 11:55:39 +00:00
  • 57301721f1 print functions missing keywords replaced apostolos 2021-10-22 12:38:16 +03:00
  • 24f149f239 print functions keyword renamed apostolos 2021-10-22 12:36:07 +03:00
  • b53b0a0fcd test for movemask and shuffle cases added apostolos 2021-10-22 11:17:43 +03:00
  • 5abda15c26 expand128 bugs fixed Apostolos Tapsas 2021-10-22 07:05:55 +00:00
  • 7184ce9870 expand128 implementation was changed to be like arm's apostolos 2021-10-22 09:46:04 +03:00
  • 2b1db73326 WIP: simd & bitutils files functions fixes Apostolos Tapsas 2021-10-21 13:34:02 +00:00
  • 558313a2c2 SuperVector operators fixes and simd_utils low/high64 functions implementations added Apostolos Tapsas 2021-10-18 12:26:38 +00:00
  • e084c2d6e4 SuperVector vsh* implementations Apostolos Tapsas 2021-10-15 14:07:17 +00:00
  • b1f53f8e49 match file for ARCH_PPC64EL added apostolos 2021-10-14 16:26:59 +03:00
  • ba4472a61c truffle and shufle implementations for ARCH_PPC64EL apostolos 2021-10-14 16:01:21 +03:00
  • d0a41252c8 blockSigleMask implementations for ARCH_PPC64 added apostolos 2021-10-14 15:56:13 +03:00
  • 4d2acd59e2 Supervector vsh* added apostolos 2021-10-14 15:08:23 +03:00
  • 7888dd4418 WIP: Power VSX support almost completed Apostolos Tapsas 2021-10-14 10:33:10 +00:00
  • 2231f7c024 compile fixes for vsc port Vectorcamp 2021-10-06 06:23:46 -04:00
  • 90d3db1776 update powerpc simd util file functions apostolos 2021-09-27 15:14:07 +03:00
  • 0078c28ee6 implementations for powerpc64el architecture apostolos 2021-09-24 13:01:14 +03:00
  • 079f3518d7 ppc64el architecture added in CMakelists file Vectorcamp 2021-09-23 10:07:27 -04:00
  • f1d781ffee test commit from VM and CMakelists add power support Vectorcamp 2021-09-23 09:28:37 -04:00
  • 1f55d419eb add initial ppc64el support Konstantinos Margaritis 2021-01-26 00:44:38 +02:00
  • 35a25fffd7 link benchmarks against static lib only as some symbols are not exposed in the shared lib v5.4.4+vectorscan Konstantinos Margaritis 2021-10-12 10:33:40 +00:00
  • 4e044d4142 Add missing copyright info from tampered files v5.4.3+vectorscan Konstantinos Margaritis 2021-10-12 10:55:33 +03:00
  • b9801478b2 bump version Konstantinos Margaritis 2021-10-12 08:50:45 +03:00
  • c3baf3d296 fix multiple/undefined symbols when using fat runtimes Konstantinos Margaritis 2021-10-11 14:28:42 +03:00
  • 2d9f52d03e add arm truffle block function Konstantinos Margaritis 2021-10-08 22:12:43 +00:00
  • 9d0c15c448 add simd_onebit_masks as static in arm simd_utils.h as well Konstantinos Margaritis 2021-10-08 22:12:24 +00:00
  • aea10b8ab0 simplify truffle and provide arch-specific block functions Konstantinos Margaritis 2021-10-09 00:36:21 +03:00
  • 623c64142b simplify shufti and provide arch-specific block functions Konstantinos Margaritis 2021-10-09 00:35:59 +03:00
  • 577e03e0c7 rearrange method declarations Konstantinos Margaritis 2021-10-09 00:35:04 +03:00
  • 9c54412447 remove simd_utils.c Konstantinos Margaritis 2021-10-09 00:34:35 +03:00
  • 8b7ba89cb5 add x86 vsh* implementations Konstantinos Margaritis 2021-10-09 00:31:13 +03:00
  • eebd6c97bc use movemask Konstantinos Margaritis 2021-10-09 00:29:33 +03:00
  • 6ceab8435d add header define to avoid double inclusion Konstantinos Margaritis 2021-10-09 00:29:08 +03:00
  • db6354b787 do not include the Supervector impl.cpp files in fat runtime Konstantinos Margaritis 2021-10-09 00:28:22 +03:00
  • a78f3789a9 atm, do not build benchmark tool for fat runtime, as the function names are modified, need to rethink this Konstantinos Margaritis 2021-10-09 00:25:29 +03:00
  • 96af3e8613 Improve benchmarks Konstantinos Margaritis 2021-10-03 10:51:31 +00:00
  • fad39b6058 optimize and simplify Shufti and Truffle to work with a single block method instead Konstantinos Margaritis 2021-10-03 10:51:03 +00:00
  • 456b1c6182 no need to convert to size_t Konstantinos Margaritis 2021-10-03 10:49:38 +00:00
  • 9e6c1c30cf remove asserts, as they are not needed Konstantinos Margaritis 2021-10-03 10:49:09 +00:00
  • fa3d509fad firstMatch/lastMatch are now arch-dependent, emulating movemask on non-Intel is very costly, the alternative is almost twice as fast on Arm Konstantinos Margaritis 2021-10-03 10:47:53 +00:00
  • 9ab18cf419 fix for new pshufb Konstantinos Margaritis 2021-10-03 10:46:47 +00:00
  • 67e0674df8 Changes/Additions to SuperVector class * added ==,!=,>=,>,<=,< operators * reworked shift operators to be more uniform and orthogonal, like Arm ISA * Added Unroller class to allow handling of multiple cases but avoid code duplication * pshufb method can now emulate Intel or not (avoids one instruction). Konstantinos Margaritis 2021-10-03 10:43:13 +00:00
  • e7161fdfec initial SSE/AVX2 implementation Konstantinos Margaritis 2021-09-20 23:52:31 +03:00
  • e5e2057ca9 remove adding CMAKE_CXX_IMPLICIT_LINK_LIBRARIES to PRIVATE_LIBS as on alpine linux this add gcc_s which is a shared library Duncan Bellamy 2021-09-27 09:37:00 +01:00
  • bc57891aa0 Unify benchmarks, more accurate measurements apostolos 2021-09-22 12:05:28 +03:00
  • b40899966f Unify benchmarks, more accurate measurements Konstantinos Margaritis 2021-09-22 11:21:37 +03:00
  • d7e9d2d915 benchmarks functions replaced with lambdas apostolos 2021-09-16 17:23:10 +03:00
  • cf1d72745c raw pointers replaced with smart pointers apostolos 2021-09-15 13:03:25 +03:00
  • c774a76f24 nit apostolos 2021-09-14 16:35:33 +03:00
  • a86d6c290d nit apostolos 2021-09-14 16:01:32 +03:00
  • ee8fa17351 fix benchmarks outputs apostolos 2021-09-14 15:32:26 +03:00
  • 53b9034546 bandwidth output fixes apostolos 2021-09-13 20:25:46 +03:00
  • 0e141ce700 size output for case with match fixed apostolos 2021-09-13 10:09:13 +03:00
  • 5d4adf267d nits apostolos 2021-09-09 12:06:02 +03:00
  • 2e6c75c895 size output fixed apostolos 2021-09-09 12:02:33 +03:00
  • 9901477bcf nits apostolos 2021-09-07 11:41:19 +03:00
  • 2b9636ccc0 benchmarks output fixes apostolos 2021-09-07 11:01:10 +03:00
  • 91f58fb1ca add missing header Konstantinos Margaritis 2021-09-02 15:35:23 +03:00
  • be1551aa94 remove confusing OPTIMISE flag Konstantinos Margaritis 2021-09-02 15:34:55 +03:00
  • 4027319d6c nits apostolos 2021-08-25 11:43:33 +03:00
  • 1009391d9f code size reduction by using function arrays and add bandwidth to output apostolos 2021-08-25 11:09:45 +03:00
  • 904a94fbe5 micro-benchmarks for shufti, truffle and noodle added apostolos 2021-08-24 14:05:12 +03:00
  • 08357a096c remove Windows/ICC support Konstantinos Margaritis 2021-07-30 12:49:38 +03:00
  • 8cff876962 fix lshift128 test Konstantinos Margaritis 2021-07-30 12:37:41 +03:00
  • 67fa6d2738 alignr methods for avx2 and avx512 added apostolos 2021-07-28 12:55:32 +03:00
  • b3a20afbbc limex_shuffle added and its unit tests apostolos 2021-07-27 11:44:35 +03:00
  • de30471edd remove duplicate functions from previous merge Konstantinos Margaritis 2021-07-26 21:11:30 +03:00
  • e5050c9373 add missing compile flags Konstantinos Margaritis 2021-07-26 21:09:12 +03:00
  • 7f5e859019 add accidentally removed lines Konstantinos Margaritis 2021-07-26 19:50:34 +03:00
  • deae90f947 * add -fno-new-ttp-matching to fix build-failures on newer gcc compilers with C++17 * add explicit -mssse3, -mavx2 in compiler flags in respective build profiles Konstantinos Margaritis 2021-07-26 19:13:33 +03:00
  • a879715953 Move SVE functions into their own files. George Wort 2021-07-20 18:13:02 +01:00
  • 6c6aee9682 Implement new DoubleVermicelli16 acceleration functions using SVE2 George Wort 2021-06-28 16:29:43 +01:00
  • 25183089fd Use SVE shufti for counting miracles. George Wort 2021-07-02 15:54:42 +01:00
  • 00fff3f53c Use SVE for double shufti. George Wort 2021-07-13 20:39:53 +01:00
  • c95a4c3dd1 Use SVE for single shufti. George Wort 2021-07-13 15:09:38 +01:00
  • 56ef2d5f72 Use SVE2 for counting miracles. George Wort 2021-07-02 15:53:43 +01:00
  • ab5d4d9279 Replace USE_ARM_SVE with HAVE_SVE. George Wort 2021-07-16 13:21:14 +01:00
  • 8242f46ed7 Add Licence to state_compress and bitutils. George Wort 2021-07-16 11:56:48 +01:00
  • df926ef62f Implement new Vermicelli16 acceleration functions using SVE2. George Wort 2021-06-28 16:29:43 +01:00
  • c7086cb7f1 Add SVE2 support for dvermicelli George Wort 2021-06-23 14:14:28 +01:00
  • a38324a5a3 add arm rshift128/rshift128 Konstantinos Margaritis 2021-07-20 14:33:03 +03:00
  • 603bc14cdd fix failing corner case, add pshufb_maskz() Konstantinos Margaritis 2021-07-23 18:55:56 +03:00
  • e35b88f2c8 use STL make_unique, remove wrapper header, breaks C++17 compilation Konstantinos Margaritis 2021-07-23 11:54:53 +03:00
  • f5f37f3f40 change C/C++ standard used to C17/C++17 Konstantinos Margaritis 2021-07-23 11:47:45 +03:00
  • 6f44a1aa26 remove low4bits from the arguments, fix cases that mostly affect loading large (64) vectors and falling out of bounds Konstantinos Margaritis 2021-07-23 11:45:58 +03:00
  • f2d9784979 fix loadu_maskz, add {l,r}shift128_var(), tab fixes Konstantinos Margaritis 2021-07-23 11:44:46 +03:00
  • a2e6143ea1 convert to for loops Konstantinos Margaritis 2021-07-23 11:43:51 +03:00
  • f8ce0bb922 minor fixes, add 2 constructors from half size vectors Konstantinos Margaritis 2021-07-23 11:43:10 +03:00
  • cabd13d18a fix lastMatch<64> Konstantinos Margaritis 2021-07-23 11:42:13 +03:00
  • ebb1b84ae3 provide an {l,r}shift128_var() to fix immediate value build failure in loadu_maskz Konstantinos Margaritis 2021-07-21 10:20:40 +00:00
  • 825460856f fix arm loadu_maskz() Konstantinos Margaritis 2021-07-20 11:38:19 +00:00
  • 86accf41a3 add arm rshift128/rshift128 Konstantinos Margaritis 2021-07-20 14:33:03 +03:00
  • b67cd7dfd0 use rshift128() instead of vector-wide right shift Konstantinos Margaritis 2021-07-20 14:33:03 +03:00
  • 6c51f7f591 add {l,r}shift128()+tests, rename printv_u64() to print64() Konstantinos Margaritis 2021-07-20 14:32:40 +03:00
  • 051ceed0f9 Use SVE2 Bitperm's bdep instruction in bitutils and state_compress George Wort 2021-07-02 10:43:48 +01:00
  • 4bc28272da Fix CROSS_COMPILE_AARCH64 for SVE issues. George Wort 2021-07-12 17:08:11 +01:00
  • 9fb79ac3ec Add SVE2 support for vermicelli George Wort 2021-06-07 13:55:09 +01:00
  • 7162446358 Remove possibly undefined behaviour from Noodle. George Wort 2021-07-01 14:19:20 +01:00