11 Commits

Author SHA1 Message Date
Yoan Picchi
7054378c93 Speed up truffle with 256b TBL instructions
256b wide SVE vectors allow some simplification of truffle.
Up to 40% speedup on Graviton3, going from 12500 MB/s to 17000 MB/s
on the microbenchmark.
SVE2 also offers this capability for 128b vectors, with a speedup of around
25% compared to normal SVE.

Add unit tests and benchmark for this wide variant

Signed-off-by: Yoan Picchi <yoan.picchi@arm.com>
2024-05-22 16:13:53 +00:00
Konstantinos Margaritis
e39db866ce Fix C-style casts 2024-05-16 12:03:42 +03:00
George Wort
e1f0f6baf7 Implement new DoubleVermicelli16 acceleration functions using SVE2
Change-Id: Id4a8ffca840caab930a6e78cc0dfd0fe7d320b4e
2021-10-12 11:51:34 +03:00
George Wort
b54710d208 Implement new Vermicelli16 acceleration functions using SVE2.
The scheme utilises the MATCH and NMATCH instructions to
scan for 16 characters at the same rate as vermicelli
scans for one.

Change-Id: Ie2cef904c56651e6108593c668e9b65bc001a886
2021-10-12 11:51:34 +03:00
Matthew Barr
423569ec82 De-multiaccel 2017-05-30 13:59:00 +10:00
Matthew Barr
2214296b7f Convert compile-time code to not require SIMD 2016-12-14 15:29:01 +11:00
Alex Coyte
ed3ef5b997 raise the limit of strings in double shufti 2016-04-20 13:34:56 +10:00
Alex Coyte
b4727cf1ea masked version of dverm 2016-04-20 13:34:56 +10:00
Alex Coyte
89d7728f77 refactoring of double byte offset accel to use paths and add to mcclellan 2016-04-20 13:34:56 +10:00
Anatoly Burakov
87424713a7 Multibyte acceleration compile side 2016-03-01 11:21:39 +11:00
Matthew Barr
904e436f11 Initial commit of Hyperscan 2015-10-20 09:13:35 +11:00