vectorscan

github_mirrors/vectorscan

mirror of https://github.com/VectorCamp/vectorscan.git synced 2025-06-28 16:41:01 +03:00

Commit Graph

Author

SHA1

Message

Date

Konstantinos Margaritis

f7d5546fe5

Bugfix/fix avx512vbmi regressions (#335)

Multiple AVX512VBMI-related fixes:

src/nfa/mcsheng_compile.cpp: No need for an assert here, impl_id can be set to 0
src/nfa/nfa_api_queue.h: Make sure this compiles on both C++ and C
src/nfagraph/ng_fuzzy.cpp: Fix compilation error when DEBUG_OUTPUT=on
src/runtime.c: Fix crash when data == NULL
unit/internal/sheng.cpp: Unit test has to enable AVX512VBMI manually, as autodetection does not get triggered, which causes the test to fail
src/fdr/teddy_fat.cpp: AVX512 loads need to be 64-bit aligned, caused a crash on clang-18

2025-05-30 21:08:55 +03:00

Yoan Picchi

938c026256

Speed up truffle with 256b TBL instructions

256b wide SVE vectors allow some simplification of truffle.
Up to 40% speedup on Graviton3, going from 12500 MB/s to 17000 MB/s
on the microbenchmark.
SVE2 also offers this capability for 128b vectors, with a speedup of around
25% compared to normal SVE.

Add unit tests and benchmark for this wide variant

Signed-off-by: Yoan Picchi <yoan.picchi@arm.com>

2024-05-22 16:13:53 +00:00

Yoan Picchi

f2d8d63793

Add sheng tests

Signed-off-by: Yoan Picchi <yoan.picchi@arm.com>

2024-04-30 14:34:14 +00:00

3 Commits