Commit Graph

  • 6fbd18183a rename arm impl.hpp to impl.cpp, add operator|() to SuperVector class Konstantinos Margaritis 2021-06-10 13:35:51 +03:00
  • f689179a82 refactor shufti algorithm to use SuperVector class, WIP Konstantinos Margaritis 2021-06-10 13:34:38 +03:00
  • 23b075cbd4 refactor shufti algorithm to use SuperVector class, WIP Konstantinos Margaritis 2021-06-10 13:34:38 +03:00
  • d59f11dc01 Add SVE, SVE2, and SVE2_BITPERM as targets George Wort 2021-05-17 17:13:14 +01:00
  • 3ee7b75ee0 Add SVE, SVE2, and SVE2_BITPERM as targets George Wort 2021-05-17 17:13:14 +01:00
  • 503483a8ee Enable cross compilation to aarch64 George Wort 2021-05-17 15:17:38 +01:00
  • b6c3ab723b Enable cross compilation to aarch64 George Wort 2021-05-17 15:17:38 +01:00
  • 1e7765c485 SuperVector unit tests apostolos 2021-06-11 11:54:47 +03:00
  • feb2d3ccf7 SuperVector unit tests apostolos 2021-06-11 11:54:47 +03:00
  • 8bbcfe698a unit tests for supervector apostolos 2021-06-09 11:58:59 +03:00
  • 096fb55faa unit tests for supervector apostolos 2021-06-09 11:58:59 +03:00
  • a6230d6410 add more functions, move defines here, enable inlining of template specializations only when running optimized code Konstantinos Margaritis 2021-06-07 10:07:29 +03:00
  • 6526df81e4 add more functions, move defines here, enable inlining of template specializations only when running optimized code Konstantinos Margaritis 2021-06-07 10:07:29 +03:00
  • d72038bc31 fix compilation on C++ Konstantinos Margaritis 2021-06-07 10:04:57 +03:00
  • d8b5eb5d17 fix compilation on C++ Konstantinos Margaritis 2021-06-07 10:04:57 +03:00
  • 7262ae8b74 simplify function Konstantinos Margaritis 2021-06-07 10:04:36 +03:00
  • 273b9683ac simplify function Konstantinos Margaritis 2021-06-07 10:04:36 +03:00
  • 94089d9acb move definitions elsewhere Konstantinos Margaritis 2021-06-07 10:04:19 +03:00
  • e215157a21 move definitions elsewhere Konstantinos Margaritis 2021-06-07 10:04:19 +03:00
  • 4fe9a75abc move SuperVector versions of noodleEngine scan functions to _simd.hpp file Konstantinos Margaritis 2021-05-25 17:15:00 +03:00
  • 05c7c8e576 move SuperVector versions of noodleEngine scan functions to _simd.hpp file Konstantinos Margaritis 2021-05-25 17:15:00 +03:00
  • bc2e3dfd2e add arm support for the new SuperVector class Konstantinos Margaritis 2021-05-13 20:06:34 +03:00
  • 6e63aafbea add arm support for the new SuperVector class Konstantinos Margaritis 2021-05-13 20:06:34 +03:00
  • 923f8bd357 simplify scanSingleMain() and scanDoubleMain() Konstantinos Margaritis 2021-05-13 17:53:12 +03:00
  • c6406bebde simplify scanSingleMain() and scanDoubleMain() Konstantinos Margaritis 2021-05-13 17:53:12 +03:00
  • 4d8fbe95a7 delete separate implementations Konstantinos Margaritis 2021-05-12 20:18:05 +03:00
  • f77837130d delete separate implementations Konstantinos Margaritis 2021-05-12 20:18:05 +03:00
  • c3101d53f4 add C++ template SIMD library (WIP) Konstantinos Margaritis 2021-05-12 13:31:12 +03:00
  • e6c1fa04ce add C++ template SIMD library (WIP) Konstantinos Margaritis 2021-05-12 13:31:12 +03:00
  • 7c303b62e3 add generic SIMD implementation Konstantinos Margaritis 2021-05-12 13:30:20 +03:00
  • ede2b18564 add generic SIMD implementation Konstantinos Margaritis 2021-05-12 13:30:20 +03:00
  • c96cfd73c4 rename project, change to noodle_engine.cpp Konstantinos Margaritis 2021-05-12 13:29:50 +03:00
  • 5213ef579d rename project, change to noodle_engine.cpp Konstantinos Margaritis 2021-05-12 13:29:50 +03:00
  • 718cc7be1d convert to C++ Konstantinos Margaritis 2021-05-12 13:29:16 +03:00
  • 7a9a2dd0dc convert to C++ Konstantinos Margaritis 2021-05-12 13:29:16 +03:00
  • 26bb00a932 revert to push_back() Konstantinos Margaritis 2021-05-12 13:27:18 +03:00
  • 2805ff038a revert to push_back() Konstantinos Margaritis 2021-05-12 13:27:18 +03:00
  • 317fb3dcfc add global definitions for CHUNKSIZE/VECTORSIZE, define HAVE_AVX512* only when BUILD_AVX512 is also enabled Konstantinos Margaritis 2021-05-12 13:26:42 +03:00
  • 52661f35e8 add global definitions for CHUNKSIZE/VECTORSIZE, define HAVE_AVX512* only when BUILD_AVX512 is also enabled Konstantinos Margaritis 2021-05-12 13:26:42 +03:00
  • 5171627e3b fix typo Konstantinos Margaritis 2021-05-12 13:25:41 +03:00
  • 831091db9e fix typo Konstantinos Margaritis 2021-05-12 13:25:41 +03:00
  • 3f35a2be37 replace push_back by emplace_back where possible Konstantinos Margaritis 2021-03-26 12:39:40 +02:00
  • 556206f138 replace push_back by emplace_back where possible Konstantinos Margaritis 2021-03-26 12:39:40 +02:00
  • 1cdb7312cb use -O3 for C++ code as well, makes a difference Konstantinos Margaritis 2021-03-22 19:43:38 +02:00
  • 9f7088a9e0 use -O3 for C++ code as well, makes a difference Konstantinos Margaritis 2021-03-22 19:43:38 +02:00
  • e67148b315 merge with master Konstantinos Margaritis 2021-10-12 11:51:20 +03:00
  • 48e9a17f0a merge with master Konstantinos Margaritis 2021-10-12 11:51:20 +03:00
  • 361aa4b900 minor optimizations Konstantinos Margaritis 2021-03-16 17:47:00 +02:00
  • ec5531a6b1 minor optimizations Konstantinos Margaritis 2021-03-16 17:47:00 +02:00
  • acb542a5be prefetch works best when addresses are 64-byte aligned Konstantinos Margaritis 2021-03-12 10:10:53 +02:00
  • d3ff893871 prefetch works best when addresses are 64-byte aligned Konstantinos Margaritis 2021-03-12 10:10:53 +02:00
  • 68b92f338d Revert "replace long macro and switch statement with function pointer array and branchless execution" Konstantinos Margaritis 2021-02-26 16:40:58 +02:00
  • 521f233cfd Revert "replace long macro and switch statement with function pointer array and branchless execution" Konstantinos Margaritis 2021-02-26 16:40:58 +02:00
  • 381ef41168 replace long macro and switch statement with function pointer array and branchless execution Konstantinos Margaritis 2021-02-26 16:39:24 +02:00
  • 92916e311f replace long macro and switch statement with function pointer array and branchless execution Konstantinos Margaritis 2021-02-26 16:39:24 +02:00
  • 95b929ed26 optimise case handling Konstantinos Margaritis 2021-02-22 13:59:05 +02:00
  • 58cface115 optimise case handling Konstantinos Margaritis 2021-02-22 13:59:05 +02:00
  • ffb6a95e72 simplify and make scanSingle*()/scanDouble*() more uniform Konstantinos Margaritis 2021-02-19 12:16:43 +02:00
  • e3e101b412 simplify and make scanSingle*()/scanDouble*() more uniform Konstantinos Margaritis 2021-02-19 12:16:43 +02:00
  • da6216e42d optimize caseMask handling Konstantinos Margaritis 2021-02-16 22:10:42 +02:00
  • 2f13ad0674 optimize caseMask handling Konstantinos Margaritis 2021-02-16 22:10:42 +02:00
  • 387c45a990 add -fno-new-ttp-matching to fix build-failures on newer gcc compilers with C++17; add explicit -mssse3, -mavx2 in compiler flags in respective build profiles [develop-SVE2-r20210721] Konstantinos Margaritis 2021-07-26 19:13:33 +03:00
  • 0f39535621 Move SVE functions into their own files. George Wort 2021-07-20 18:13:02 +01:00
  • 87a6733fbe Implement new DoubleVermicelli16 acceleration functions using SVE2 George Wort 2021-06-28 16:29:43 +01:00
  • 854854d8cf Use SVE shufti for counting miracles. George Wort 2021-07-02 15:54:42 +01:00
  • 2686048e6c Use SVE for double shufti. George Wort 2021-07-13 20:39:53 +01:00
  • a94219aaed Use SVE for single shufti. George Wort 2021-07-13 15:09:38 +01:00
  • 6e434318a1 Use SVE2 for counting miracles. George Wort 2021-07-02 15:53:43 +01:00
  • 6d23032a6b Replace USE_ARM_SVE with HAVE_SVE. George Wort 2021-07-16 13:21:14 +01:00
  • ca7d5d7536 Add Licence to state_compress and bitutils. George Wort 2021-07-16 11:56:48 +01:00
  • db0d8f79e6 Implement new Vermicelli16 acceleration functions using SVE2. George Wort 2021-06-28 16:29:43 +01:00
  • 185c45263b Add SVE2 support for dvermicelli George Wort 2021-06-23 14:14:28 +01:00
  • 455789db9f add arm rshift128/rshift128 Konstantinos Margaritis 2021-07-20 14:33:03 +03:00
  • 55d5631c5c fix failing corner case, add pshufb_maskz() Konstantinos Margaritis 2021-07-23 18:55:56 +03:00
  • be5a675da8 use STL make_unique, remove wrapper header, breaks C++17 compilation Konstantinos Margaritis 2021-07-23 11:54:53 +03:00
  • 18693bd14c change C/C++ standard used to C17/C++17 Konstantinos Margaritis 2021-07-23 11:47:45 +03:00
  • 369e0d473d remove low4bits from the arguments, fix cases that mostly affect loading large (64) vectors and falling out of bounds Konstantinos Margaritis 2021-07-23 11:45:58 +03:00
  • 47e29602e0 fix loadu_maskz, add {l,r}shift128_var(), tab fixes Konstantinos Margaritis 2021-07-23 11:44:46 +03:00
  • e25c8ad78b convert to for loops Konstantinos Margaritis 2021-07-23 11:43:51 +03:00
  • b517cbfb8a minor fixes, add 2 constructors from half size vectors Konstantinos Margaritis 2021-07-23 11:43:10 +03:00
  • e047fcf629 fix lastMatch<64> Konstantinos Margaritis 2021-07-23 11:42:13 +03:00
  • 0d1d76140a provide an {l,r}shift128_var() to fix immediate value build failure in loadu_maskz Konstantinos Margaritis 2021-07-21 10:20:40 +00:00
  • f680a79f1e fix arm loadu_maskz() Konstantinos Margaritis 2021-07-20 11:38:19 +00:00
  • b0fbb39cf0 add arm rshift128/rshift128 Konstantinos Margaritis 2021-07-20 14:33:03 +03:00
  • e1950f1ce5 use rshift128() instead of vector-wide right shift Konstantinos Margaritis 2021-07-20 14:33:03 +03:00
  • 93abed4d87 add {l,r}shift128()+tests, rename printv_u64() to print64() Konstantinos Margaritis 2021-07-20 14:32:40 +03:00
  • e325ce9452 Use SVE2 Bitperm's bdep instruction in bitutils and state_compress George Wort 2021-07-02 10:43:48 +01:00
  • 726a668b65 Fix CROSS_COMPILE_AARCH64 for SVE issues. George Wort 2021-07-12 17:08:11 +01:00
  • 7d7d31ec0d Add SVE2 support for vermicelli George Wort 2021-06-07 13:55:09 +01:00
  • 418851a26e Remove possibly undefined behaviour from Noodle. George Wort 2021-07-01 14:19:20 +01:00
  • f99c380167 Remove first check from scanDouble Noodle. George Wort 2021-06-30 14:13:27 +01:00
  • 80c01e451d Equal mask test fixed with random numbers apostolos 2021-07-19 13:12:58 +03:00
  • 4326aecfda Supervector test fixes apostolos 2021-07-19 10:23:11 +03:00
  • a98a114568 SuperVector AVX512 implementations apostolos 2021-07-16 11:17:28 +03:00
  • 17bf07f446 SuperVector unit tests for AVX2 and AVX512 added apostolos 2021-07-13 16:38:25 +03:00
  • 56b124b9a1 really fix lshift for avx2 Konstantinos Margaritis 2021-07-13 13:19:48 +03:00
  • e296a71540 disable OPTIMISE by default Konstantinos Margaritis 2021-07-12 21:12:21 +03:00
  • 6ad1e70c36 fix truffle SIMD for S>16 as well Konstantinos Margaritis 2021-07-12 21:12:05 +03:00
  • 46c37ece4b add AVX2 specializations Konstantinos Margaritis 2021-07-12 21:09:10 +03:00
  • 65d911c976 lots of fixes to AVX2 implementation Konstantinos Margaritis 2021-07-12 21:08:51 +03:00