At stream boundaries, we can mark streams as exhausted if there are no
groups active and there are no other ways to report matches. This allows us
to stop maintaining the history buffer on subsequent stream writes.
Previously, streams were only marked as exhausted if a pure highlander case
reported all patterns or the outfix in a sole outfix case died.
Replaces the original long lit hash table (used in streaming mode) with a
smaller, simpler linear probing approach. Adds a bloom filter in front
of it to reduce time spent on false positives.
Sizing of both the hash table and bloom filter are done based on max
load.
Move the hash table used for long literal support in streaming mode from
FDR to Rose, and introduce new instructions CHECK_LONG_LIT and
CHECK_LONG_LIT_NOCASE for doing literal confirm for long literals.
This simplifies FDR confirm, and guarantees that HWLM matchers will only
be used for literals < 256 bytes long.
The roseRunProgram function had gotten very large for the number of
sites it was being inlined into, with negative effects on performance in
large cases. This change moves it into its own translation unit.
Replace the use of the internal_report structure (for reports from
engines, MPV etc) with the Rose program interpreter.
SOM processing was reworked to use a new som_operation structure that is
embedded in the appropriate instructions.
Unifies all literal match paths so that the Rose program is used for all
of them. This removes the previous specialised "direct report" and
"multi direct report" paths. Some additional REPORT instruction work was
necessary for this.
Reworked literal construction path at compile time in prep for using
program offsets as literal IDs.
Completely removed the anchored log runtime, which is no longer worth
the extra complexity.
Remove the RoseEngine and stream state pointers frose RoseContext, as
they are also present in core_info.
Unify stream state handing in Rose to always use a char * (we were often
a u8 * for no particularly good reason) and tidy up.