There are some issues with the destructors in boost::container::small_vector and/or vector, which MSan reports as an error. The suppression `__attribute__((no_sanitize_memory))` works up to clang-15, but no longer works since clang-16. It looks like before clang-16 no_sanitize_memory applied to all child functions as well, while since clang-16 it applies only to the annotated function. I tried adding it to a few other functions, but a) it looks icky and b) I did not manage to finish this process.

I also measured the performance, and it did not change. boost::small_vector should be faster than std::vector, but apparently that did not matter much in my particular case.

One more thing: MSan reports this only with -O0; with -O3 it is not reproduced.

<details>
<summary>MSan report:</summary>

_Note: it was slightly trimmed_

```
==11364==WARNING: MemorySanitizer: use-of-uninitialized-value
2023.05.10 15:40:53.000233 [ 11620 ] {} <Trace> AsynchronousMetrics: MemoryTracking: was 1012.32 MiB, peak 1012.32 MiB, free memory in arenas 0.00 B, will set to 1015.82 MiB (RSS), difference: 3.50 MiB
    #0 0x55558d13289f in boost::container::vector_alloc_holder<boost::container::small_vector_allocator<std::__1::pair<unsigned char, unsigned char>, std::__1::allocator<void>, void>, unsigned long, boost::move_detail::integral_constant<unsigned int, 1u>>::deallocate(std::__1::pair<unsigned char, unsigned char>* const&, unsigned long) .cmake-llvm16-msan/./contrib/boost/boost/container/vector.hpp:455:7
    #1 0x55558d139e8e in boost::container::vector_alloc_holder<boost::container::small_vector_allocator<std::__1::pair<unsigned char, unsigned char>, std::__1::allocator<void>, void>, unsigned long, boost::move_detail::integral_constant<unsigned int, 1u>>::~vector_alloc_holder() .cmake-llvm16-msan/./contrib/boost/boost/container/vector.hpp:420:16
    #2 0x55558d139e0b in boost::container::vector<std::__1::pair<unsigned char, unsigned char>, boost::container::small_vector_allocator<std::__1::pair<unsigned char, unsigned char>, std::__1::allocator<void>, void>, void>::~vector() .cmake-llvm16-msan/./contrib/boost/boost/container/vector.hpp:1141:4
    #3 0x55558d12a4fa in boost::container::small_vector_base<std::__1::pair<unsigned char, unsigned char>, std::__1::allocator<std::__1::pair<unsigned char, unsigned char>>, void>::~small_vector_base() .cmake-llvm16-msan/./contrib/boost/boost/container/small_vector.hpp:445:80
    #4 0x55558d12a4fa in boost::container::small_vector<std::__1::pair<unsigned char, unsigned char>, 1ul, std::__1::allocator<std::__1::pair<unsigned char, unsigned char>>, void>::~small_vector() .cmake-llvm16-msan/./contrib/boost/boost/container/small_vector.hpp:564:7
    #5 0x55558d13a21b in std::__1::__tuple_leaf<0ul, boost::container::small_vector<std::__1::pair<unsigned char, unsigned char>, 1ul, std::__1::allocator<std::__1::pair<unsigned char, unsigned char>>, void>, false>::~__tuple_leaf() .cmake-llvm16-msan/./contrib/llvm-project/libcxx/include/tuple:265:7
    #6 0x55558d13a13a in std::__1::__tuple_impl<>::~__tuple_impl() .cmake-llvm16-msan/./contrib/llvm-project/libcxx/include/tuple:451:37
    #7 0x55558d13a05b in std::__1::tuple<>::~tuple() .cmake-llvm16-msan/./contrib/llvm-project/libcxx/include/tuple:538:28
    #8 0x55558d139f7b in ue2::flat_detail::flat_base<>::~flat_base() .cmake-llvm16-msan/./contrib/vectorscan/src/util/flat_containers.h:89:7
    #9 0x55558d1299da in ue2::flat_set<>::~flat_set() .cmake-llvm16-msan/./contrib/vectorscan/src/util/flat_containers.h:152:7
    #10 0x55558d4e4dda in ue2::(anonymous namespace)::DAccelScheme::~DAccelScheme() .cmake-llvm16-msan/./contrib/vectorscan/src/nfagraph/ng_limex_accel.cpp:301:8
    #11 0x55558d4ff6cf in void boost::container::allocator_traits<>::priv_destroy<ue2::(anonymous namespace)::DAccelScheme>(boost::move_detail::integral_constant<bool, false>, boost::container::small_vector_allocator<ue2::(anonymous namespace)::DAccelScheme, boost::container::new_allocator<void>, void>&, ue2::(anonymous namespace)::DAccelScheme*) .cmake-llvm16-msan/./contrib/boost/boost/container/allocator_traits.hpp:403:11
    #12 0x55558d4fefde in void boost::container::allocator_traits<>::destroy<ue2::(anonymous namespace)::DAccelScheme>(boost::container::small_vector_allocator<ue2::(anonymous namespace)::DAccelScheme, boost::container::new_allocator<void>, void>&, ue2::(anonymous namespace)::DAccelScheme*) .cmake-llvm16-msan/./contrib/boost/boost/container/allocator_traits.hpp:331:7
    #13 0x55558d4fc364 in boost::container::dtl::disable_if_trivially_destructible<>::type boost::container::destroy_alloc_n<>(boost::container::small_vector_allocator<ue2::(anonymous namespace)::DAccelScheme, boost::container::new_allocator<void>, void>&, ue2::(anonymous namespace)::DAccelScheme*, unsigned long) .cmake-llvm16-msan/./contrib/boost/boost/container/detail/copy_move_algo.hpp:988:7
    #14 0x55558d517962 in boost::container::vector<>::~vector() .cmake-llvm16-msan/./contrib/boost/boost/container/vector.hpp:1138:7
    #15 0x55558d4f724d in boost::container::small_vector_base<>::~small_vector_base() .cmake-llvm16-msan/./contrib/boost/boost/container/small_vector.hpp:445:80
    #16 0x55558d4f724d in boost::container::small_vector<>::~small_vector() .cmake-llvm16-msan/./contrib/boost/boost/container/small_vector.hpp:564:7
    #17 0x55558d4f2ff3 in ue2::findDoubleBest() .cmake-llvm16-msan/./contrib/vectorscan/src/nfagraph/ng_limex_accel.cpp:444:1
    #18 0x55558d4f2f41 in ue2::findDoubleBest() .cmake-llvm16-msan/./contrib/vectorscan/src/nfagraph/ng_limex_accel.cpp:442:9
    #19 0x55558d4f2f41 in ue2::findDoubleBest() .cmake-llvm16-msan/./contrib/vectorscan/src/nfagraph/ng_limex_accel.cpp:442:9
    #20 0x55558d4f2f41 in ue2::findDoubleBest() .cmake-llvm16-msan/./contrib/vectorscan/src/nfagraph/ng_limex_accel.cpp:442:9
    #21 0x55558d4f2f41 in ue2::findDoubleBest() .cmake-llvm16-msan/./contrib/vectorscan/src/nfagraph/ng_limex_accel.cpp:442:9
    #22 0x55558d4f2f41 in ue2::findDoubleBest() .cmake-llvm16-msan/./contrib/vectorscan/src/nfagraph/ng_limex_accel.cpp:442:9
    #23 0x55558d4f2f41 in ue2::findDoubleBest() .cmake-llvm16-msan/./contrib/vectorscan/src/nfagraph/ng_limex_accel.cpp:442:9
    #24 0x55558d4f2f41 in ue2::findDoubleBest() .cmake-llvm16-msan/./contrib/vectorscan/src/nfagraph/ng_limex_accel.cpp:442:9
    #25 0x55558d4e4af5 in ue2::findBestDoubleAccelScheme() .cmake-llvm16-msan/./contrib/vectorscan/src/nfagraph/ng_limex_accel.cpp:556:5
    #26 0x55558d4e2659 in ue2::findBestAccelScheme() .cmake-llvm16-msan/./contrib/vectorscan/src/nfagraph/ng_limex_accel.cpp:569:27
    #27 0x55558d3aa8ff in ue2::look_for_offset_accel(ue2::raw_dfa const&, unsigned short, unsigned int) .cmake-llvm16-msan/./contrib/vectorscan/src/nfa/accel_dfa_build_strat.cpp:197:22
    #28 0x55558d3a9727 in ue2::accel_dfa_build_strat::find_escape_strings(unsigned short) const .cmake-llvm16-msan/./contrib/vectorscan/src/nfa/accel_dfa_build_strat.cpp:414:13
    #29 0x55558d3b2119 in ue2::accel_dfa_build_strat::getAccelInfo(ue2::Grey const&)::$_0::operator()(unsigned long) const .cmake-llvm16-msan/./contrib/vectorscan/src/nfa/accel_dfa_build_strat.cpp:606:26
    #30 0x55558d3aefd4 in ue2::accel_dfa_build_strat::getAccelInfo(ue2::Grey const&) .cmake-llvm16-msan/./contrib/vectorscan/src/nfa/accel_dfa_build_strat.cpp:627:13
    #31 0x55558d2fc61f in ue2::mcclellanCompile8() .cmake-llvm16-msan/./contrib/vectorscan/src/nfa/mcclellancompile.cpp:935:22
    #32 0x55558d2e89ec in ue2::mcclellanCompile_i() .cmake-llvm16-msan/./contrib/vectorscan/src/nfa/mcclellancompile.cpp:1510:15
    #33 0x55558d2ff502 in ue2::mcclellanCompile() .cmake-llvm16-msan/./contrib/vectorscan/src/nfa/mcclellancompile.cpp:1527:12
    #34 0x55558fb13b52 in ue2::getDfa() .cmake-llvm16-msan/./contrib/vectorscan/src/rose/rose_build_bytecode.cpp:646:15
    #35 0x55558fb7e8c8 in ue2::makeLeftNfa() .cmake-llvm16-msan/./contrib/vectorscan/src/rose/rose_build_bytecode.cpp:854:22
    #36 0x55558fb6bd36 in ue2::buildLeftfix() .cmake-llvm16-msan/./contrib/vectorscan/src/rose/rose_build_bytecode.cpp:1123:15
    #37 0x55558fb21020 in ue2::buildLeftfixes() .cmake-llvm16-msan/./contrib/vectorscan/src/rose/rose_build_bytecode.cpp:1579:9
    #38 0x55558fad972c in ue2::buildNfas() .cmake-llvm16-msan/./contrib/vectorscan/src/rose/rose_build_bytecode.cpp:2063:10
    #39 0x55558fac9843 in ue2::RoseBuildImpl::buildFinalEngine(unsigned int) .cmake-llvm16-msan/./contrib/vectorscan/src/rose/rose_build_bytecode.cpp:3660:10
    #40 0x55558f2b2d86 in ue2::RoseBuildImpl::buildRose(unsigned int) .cmake-llvm16-msan/./contrib/vectorscan/src/rose/rose_build_compile.cpp:1796:12

  Uninitialized value was stored to memory at
    #0 0x55558d132898 in boost::container::vector_alloc_holder<boost::container::small_vector_allocator<std::__1::pair<unsigned char, unsigned char>, std::__1::allocator<void>, void>, unsigned long, boost::move_detail::integral_constant<unsigned int, 1u>>::deallocate(std::__1::pair<unsigned char, unsigned char>* const&, unsigned long) .cmake-llvm16-msan/./contrib/boost/boost/container/vector.hpp:455:56
    #1 0x55558d139e8e in boost::container::vector_alloc_holder<>::~vector_alloc_holder() .cmake-llvm16-msan/./contrib/boost/boost/container/vector.hpp:420:16
    #2 0x55558d139e0b in boost::container::vector<>::~vector() .cmake-llvm16-msan/./contrib/boost/boost/container/vector.hpp:1141:4
    #3 0x55558d12a4fa in boost::container::small_vector_base<>::~small_vector_base() .cmake-llvm16-msan/./contrib/boost/boost/container/small_vector.hpp:445:80
    #4 0x55558d12a4fa in boost::container::small_vector<std::__1::pair<unsigned char, unsigned char>, 1ul, std::__1::allocator<std::__1::pair<unsigned char, unsigned char>>, void>::~small_vector() .cmake-llvm16-msan/./contrib/boost/boost/container/small_vector.hpp:564:7
    #5 0x55558d13a21b in std::__1::__tuple_leaf<>::~__tuple_leaf() .cmake-llvm16-msan/./contrib/llvm-project/libcxx/include/tuple:265:7
    #6 0x55558d13a13a in std::__1::__tuple_impl<>::~__tuple_impl .cmake-llvm16-msan/./contrib/llvm-project/libcxx/include/tuple:451:37
    #7 0x55558d13a05b in std::__1::tuple<>::~tuple() .cmake-llvm16-msan/./contrib/llvm-project/libcxx/include/tuple:538:28
    #8 0x55558d139f7b in ue2::flat_detail::flat_base<>::~flat_base() .cmake-llvm16-msan/./contrib/vectorscan/src/util/flat_containers.h:89:7
    #9 0x55558d1299da in ue2::flat_set<>::~flat_set() .cmake-llvm16-msan/./contrib/vectorscan/src/util/flat_containers.h:152:7
    #10 0x55558d4e4dda in ue2::(anonymous namespace)::DAccelScheme::~DAccelScheme() .cmake-llvm16-msan/./contrib/vectorscan/src/nfagraph/ng_limex_accel.cpp:301:8
    #11 0x55558d4ff6cf in void boost::container::allocator_traits<>::priv_destroy<ue2::(anonymous namespace)::DAccelScheme>() .cmake-llvm16-msan/./contrib/boost/boost/container/allocator_traits.hpp:403:11
    #12 0x55558d4fefde in void boost::container::allocator_traits<>::destroy<ue2::(anonymous namespace)::DAccelScheme>(boost::container::small_vector_allocator<>&, ue2::(anonymous namespace)::DAccelScheme*) .cmake-llvm16-msan/./contrib/boost/boost/container/allocator_traits.hpp:331:7
    #13 0x55558d4fc364 in boost::container::dtl::disable_if_trivially_destructible<>::type boost::container::destroy_alloc_n<boost::container::small_vector_allocator<ue2::(anonymous namespace)::DAccelScheme, boost::container::new_allocator<void>, void>, ue2::(anonymous namespace)::DAccelScheme*, unsigned long>(boost::container::small_vector_allocator<ue2::(anonymous namespace)::DAccelScheme, boost::container::new_allocator<void>, void>&, ue2::(anonymous namespace)::DAccelScheme*, unsigned long) .cmake-llvm16-msan/./contrib/boost/boost/container/detail/copy_move_algo.hpp:988:7
    #14 0x55558d517962 in boost::container::vector<ue2::(anonymous namespace)::DAccelScheme, boost::container::small_vector_allocator<ue2::(anonymous namespace)::DAccelScheme, boost::container::new_allocator<void>, void>, void>::~vector() .cmake-llvm16-msan/./contrib/boost/boost/container/vector.hpp:1138:7
    #15 0x55558d4f724d in boost::container::small_vector_base<>::~small_vector_base() .cmake-llvm16-msan/./contrib/boost/boost/container/small_vector.hpp:445:80
    #16 0x55558d4f724d in boost::container::small_vector<>::~small_vector() .cmake-llvm16-msan/./contrib/boost/boost/container/small_vector.hpp:564:7
    #17 0x55558d4f2ff3 in ue2::findDoubleBest() .cmake-llvm16-msan/./contrib/vectorscan/src/nfagraph/ng_limex_accel.cpp:444:1
    #18 0x55558d4f2f41 in ue2::findDoubleBest() .cmake-llvm16-msan/./contrib/vectorscan/src/nfagraph/ng_limex_accel.cpp:442:9
    #19 0x55558d4f2f41 in ue2::findDoubleBest() .cmake-llvm16-msan/./contrib/vectorscan/src/nfagraph/ng_limex_accel.cpp:442:9
    #20 0x55558d4f2f41 in ue2::findDoubleBest() .cmake-llvm16-msan/./contrib/vectorscan/src/nfagraph/ng_limex_accel.cpp:442:9
    #21 0x55558d4f2f41 in ue2::findDoubleBest() .cmake-llvm16-msan/./contrib/vectorscan/src/nfagraph/ng_limex_accel.cpp:442:9

  Member fields were destroyed
    #0 0x5555652e08dd in __sanitizer_dtor_callback_fields /src/llvm/worktrees/llvm-16/compiler-rt/lib/msan/msan_interceptors.cpp:961:5
    #1 0x55558d4f71a6 in boost::container::small_vector<>::~small_vector() .cmake-llvm16-msan/./contrib/boost/boost/container/small_vector.hpp:528:8
    #2 0x55558d4f71a6 in boost::container::small_vector<>::~small_vector() .cmake-llvm16-msan/./contrib/boost/boost/container/small_vector.hpp:564:7
    #3 0x55558d4f2ff3 in ue2::findDoubleBest() .cmake-llvm16-msan/./contrib/vectorscan/src/nfagraph/ng_limex_accel.cpp:444:1
    #4 0x55558d4f2f41 in ue2::findDoubleBest() .cmake-llvm16-msan/./contrib/vectorscan/src/nfagraph/ng_limex_accel.cpp:442:9
    #5 0x55558d4f2f41 in ue2::findDoubleBest() .cmake-llvm16-msan/./contrib/vectorscan/src/nfagraph/ng_limex_accel.cpp:442:9
    #6 0x55558d4f2f41 in ue2::findDoubleBest() .cmake-llvm16-msan/./contrib/vectorscan/src/nfagraph/ng_limex_accel.cpp:442:9
    #7 0x55558d4f2f41 in ue2::findDoubleBest() .cmake-llvm16-msan/./contrib/vectorscan/src/nfagraph/ng_limex_accel.cpp:442:9
    #8 0x55558d4f2f41 in ue2::findDoubleBest() .cmake-llvm16-msan/./contrib/vectorscan/src/nfagraph/ng_limex_accel.cpp:442:9
    #9 0x55558d4f2f41 in ue2::findDoubleBest() .cmake-llvm16-msan/./contrib/vectorscan/src/nfagraph/ng_limex_accel.cpp:442:9
    #10 0x55558d4f2f41 in ue2::findDoubleBest() .cmake-llvm16-msan/./contrib/vectorscan/src/nfagraph/ng_limex_accel.cpp:442:9
    #11 0x55558d4e4af5 in ue2::findBestDoubleAccelScheme() .cmake-llvm16-msan/./contrib/vectorscan/src/nfagraph/ng_limex_accel.cpp:556:5
    #12 0x55558d4e2659 in ue2::findBestAccelScheme() .cmake-llvm16-msan/./contrib/vectorscan/src/nfagraph/ng_limex_accel.cpp:569:27
    #13 0x55558d3aa8ff in ue2::look_for_offset_accel(ue2::raw_dfa const&, unsigned short, unsigned int) .cmake-llvm16-msan/./contrib/vectorscan/src/nfa/accel_dfa_build_strat.cpp:197:22
    #14 0x55558d3a9727 in ue2::accel_dfa_build_strat::find_escape_strings(unsigned short) const .cmake-llvm16-msan/./contrib/vectorscan/src/nfa/accel_dfa_build_strat.cpp:414:13
    #15 0x55558d3b2119 in ue2::accel_dfa_build_strat::getAccelInfo(ue2::Grey const&)::$_0::operator()(unsigned long) const .cmake-llvm16-msan/./contrib/vectorscan/src/nfa/accel_dfa_build_strat.cpp:606:26
    #16 0x55558d3aefd4 in ue2::accel_dfa_build_strat::getAccelInfo(ue2::Grey const&) .cmake-llvm16-msan/./contrib/vectorscan/src/nfa/accel_dfa_build_strat.cpp:627:13
    #17 0x55558d2fc61f in ue2::mcclellanCompile8(ue2::(anonymous namespace)::dfa_info&, ue2::CompileContext const&, std::__1::set<unsigned short, std::__1::less<unsigned short>, std::__1::allocator<unsigned short>>*) .cmake-llvm16-msan/./contrib/vectorscan/src/nfa/mcclellancompile.cpp:935:22
    #18 0x55558d2e89ec in ue2::mcclellanCompile_i(ue2::raw_dfa&, ue2::accel_dfa_build_strat&, ue2::CompileContext const&, bool, std::__1::set<unsigned short, std::__1::less<unsigned short>, std::__1::allocator<unsigned short>>*) .cmake-llvm16-msan/./contrib/vectorscan/src/nfa/mcclellancompile.cpp:1510:15
    #19 0x55558d2ff502 in ue2::mcclellanCompile(ue2::raw_dfa&, ue2::CompileContext const&, ue2::ReportManager const&, bool, bool, std::__1::set<unsigned short, std::__1::less<unsigned short>, std::__1::allocator<unsigned short>>*) .cmake-llvm16-msan/./contrib/vectorscan/src/nfa/mcclellancompile.cpp:1527:12
    #20 0x55558fb13b52 in ue2::getDfa(ue2::raw_dfa&, bool, ue2::CompileContext const&, ue2::ReportManager const&) .cmake-llvm16-msan/./contrib/vectorscan/src/rose/rose_build_bytecode.cpp:646:15
```

</details>

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
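For readers unfamiliar with the suppression mentioned in the description above, here is a minimal sketch of how `__attribute__((no_sanitize_memory))` is attached to a function. The `NO_SANITIZE_MEMORY` macro, the `Buffer` class, and `sum` are hypothetical names used only for illustration; they are not taken from Vectorscan or Boost, and this is not the change made in this PR. The macro expands to nothing outside MSan builds, so the sketch compiles with any C++11 compiler.

```cpp
#include <cassert>
#include <cstddef>

// Expand to the Clang attribute only in MSan builds; expand to nothing elsewhere.
#if defined(__has_feature)
#  if __has_feature(memory_sanitizer)
#    define NO_SANITIZE_MEMORY __attribute__((no_sanitize_memory))
#  endif
#endif
#ifndef NO_SANITIZE_MEMORY
#  define NO_SANITIZE_MEMORY
#endif

// Hypothetical stand-in for a container that owns a heap buffer.
class Buffer {
public:
    explicit Buffer(std::size_t n) : data_(new int[n]()), size_(n) {}
    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;

    // Only the destructor carries the attribute. release() is a separate
    // function, so it would need its own annotation to be covered as well
    // (this matches the behaviour observed with clang-16 in the description).
    NO_SANITIZE_MEMORY ~Buffer() { release(); }

    std::size_t size() const { return size_; }

private:
    void release() {
        delete[] data_;
        data_ = nullptr;
        size_ = 0;
    }

    int* data_;
    std::size_t size_;
};

// Free functions take the attribute in the same position.
NO_SANITIZE_MEMORY inline int sum(const int* values, std::size_t count) {
    assert(values != nullptr || count == 0);
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += values[i];
    }
    return total;
}
```

Annotating every function on a deep destructor chain this way quickly becomes unwieldy, which is the "icky" part noted above.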
Vectorscan?
A fork of Intel's Hyperscan, modified to run on more platforms. Currently ARM NEON/ASIMD is 100% functional, and Power VSX is in development. ARM SVE2 will be implemented when hardware becomes accessible to the developers. More platforms will follow in the future, on demand/request.
Vectorscan will follow Intel's API and internal algorithms where possible, but will not hesitate to make code changes where they are expected to give better performance or better portability. In addition, the code will be gradually simplified and made more uniform, and all architecture-specific (currently Intel) #ifdefs will be removed and abstracted away.
Why the fork?
Originally, the ARM port was supposed to be merged into Intel's own Hyperscan, and two pull requests were made to the project for this reason (1, 2). Unfortunately, the PRs were rejected for now and for the foreseeable future, so we created Vectorscan for our own multi-architecture, open-source, collaborative needs.
What is Hyperscan?
Hyperscan is a high-performance multiple regex matching library. It follows the regular expression syntax of the commonly-used libpcre library, but is a standalone library with its own C API.
Hyperscan uses hybrid automata techniques to allow simultaneous matching of large numbers (up to tens of thousands) of regular expressions and for the matching of regular expressions across streams of data.
Vectorscan is typically used in a DPI library stack, just like Hyperscan.
Cross Compiling for AArch64
- To cross compile for AArch64, first adjust the variables set in cmake/setenv-arm64-cross.sh.
export CROSS=<arm-cross-compiler-dir>/bin/aarch64-linux-gnu-
export CROSS_SYS=<arm-cross-compiler-system-dir>
export BOOST_PATH=<boost-source-dir>
- Set the environment variables:
source cmake/setenv-arm64-cross.sh
- Configure Vectorscan:
mkdir <build-dir>
cd <build-dir>
cmake -DCROSS_COMPILE_AARCH64=1 <hyperscan-source-dir> -DCMAKE_TOOLCHAIN_FILE=<hyperscan-source-dir>/cmake/arm64-cross.cmake
- Build Vectorscan:
make -jT
where T is the number of threads used to compile.
cmake --build . -- -j T
can also be used instead of make.
Compiling for SVE
The following cmake variables can be set in order to target Arm's Scalable Vector Extension. They are listed in ascending order of strength, with cmake detecting whether the feature is available in the compiler and falling back to a weaker version if not. Only one of these variables needs to be set as weaker variables will be implied as set.
BUILD_SVE
BUILD_SVE2
BUILD_SVE2_BITPERM
Documentation
Information on building the Hyperscan library and using its API is available in the Developer Reference Guide.
License
Vectorscan, like Hyperscan, is licensed under the BSD License. See the LICENSE file in the project repository.
Versioning
The master branch on GitHub will always contain the most recent release of Hyperscan. Each version released to master goes through QA and testing before it is released; if you're a user, rather than a developer, this is the version you should be using.
Further development towards the next release takes place on the develop branch.
Get Involved
The official homepage for Vectorscan is at www.github.com/VectorCamp/vectorscan.
Original Hyperscan links
The official homepage for Hyperscan is at www.hyperscan.io.
If you have questions or comments, we encourage you to join the mailing list. Bugs can be filed by sending email to the list, or by creating an issue on GitHub.
If you wish to contact the Hyperscan team at Intel directly, without posting publicly to the mailing list, send email to hyperscan@intel.com.