Design compile time api hs_compile_lit() and hs_compile_lit_multi()
to handle pure literal pattern sets. Corresponding option --literal-on
is added for hyperscan testing suites. Extended parameters and part of
flags are not supported for this api.
Replaces the original long lit hash table (used in streaming mode) with a
smaller, simpler linear probing approach. Adds a bloom filter in front
of it to reduce time spent on false positives.
Sizing of both the hash table and bloom filter are done based on max
load.
Move the hash table used for long literal support in streaming mode from
FDR to Rose, and introduce new instructions CHECK_LONG_LIT and
CHECK_LONG_LIT_NOCASE for doing literal confirm for long literals.
This simplifies FDR confirm, and guarantees that HWLM matchers will only
be used for literals < 256 bytes long.
Replace the use of the internal_report structure (for reports from
engines, MPV etc) with the Rose program interpreter.
SOM processing was reworked to use a new som_operation structure that is
embedded in the appropriate instructions.
Unifies all literal match paths so that the Rose program is used for all
of them. This removes the previous specialised "direct report" and
"multi direct report" paths. Some additional REPORT instruction work was
necessary for this.
Reworked literal construction path at compile time in prep for using
program offsets as literal IDs.
Completely removed the anchored log runtime, which is no longer worth
the extra complexity.
Replace the RoseLiteral structure with more program instructions; now,
instead of each literal ID leading to a RoseLiteral, it simply has a
program to run (and a delay rebuild program).
This commit also makes some other improvements:
* CHECK_STATE instruction, for use instead of a sparse iterator over a
single element.
* Elide some checks (CHECK_LIT_EARLY, ANCHORED_DELAY, etc) when not
needed.
* Flatten PUSH_DELAYED behaviour to one instruction per delayed
literal, rather than the mask/index-list approach used before.
* Simple program cache at compile time for deduplication.
- cleanups
- add sparse iter instructions
- merge "root" and "sparse iter" programs together
- move program execution to new file program_runtime.h
- simplify EOD execution
- Use program for EOD sparse iterator
- Use program for literal sparse iterator
- Eliminate RoseRole, RosePred, RoseVertexProps::role
- Small performance optimizations
We were using intermediate values int he enum and casting back and forth
with a u32; it is cleaner to just use a u32 and define some special
values.
Silences ICC warning #188: enumerated type mixed with another type.