256b wide SVE vectors allow some simplification of truffle.
Up to 40% speedup on graviton3. Going from 12500 MB/s to 17000 MB/s
onhe microbenchmark.
SVE2 also offer this capability for 128b vector with a speedup around
25% compared to normal SVE
Add unit tests and benchmark for this wide variant
Signed-off-by: Yoan Picchi <yoan.picchi@arm.com>
The scheme utilises the MATCH and NMATCH instructions to
scan for 16 characters at the same rate as vermicelli
scans for one.
Change-Id: Ie2cef904c56651e6108593c668e9b65bc001a886
At stream boundaries, we can mark streams as exhausted if there are no
groups active and there are no other ways to report matches. This allows us
to stop maintaining the history buffer on subsequent stream writes.
Previously, streams were only marked as exhausted if a pure highlander case
reported all patterns or the outfix in a sole outfix case died.
Move the hash table used for long literal support in streaming mode from
FDR to Rose, and introduce new instructions CHECK_LONG_LIT and
CHECK_LONG_LIT_NOCASE for doing literal confirm for long literals.
This simplifies FDR confirm, and guarantees that HWLM matchers will only
be used for literals < 256 bytes long.