The CHECK_NOT_HANDLED instruction was being inserted into an already
partially-flattened program, which would result in jump offsets becoming
incorrect.
This change places it as part of the normal flow of program
construction, which avoids this issue.
Replace the use of the internal_report structure (for reports from
engines, MPV etc) with the Rose program interpreter.
SOM processing was reworked to use a new som_operation structure that is
embedded in the appropriate instructions.
Previously, the exhaustion vector was a standard bitvector, which
required an expensive memset() call at init for databases with a large
number of exhaustion keys.
- Fix bugs introduced by recent addition of the boundary program. It's
not safe to do catchup there.
- Only do catchup once per report set, when necessary.
Unifies all literal match paths so that the Rose program is used for all
of them. This removes the previous specialised "direct report" and
"multi direct report" paths. Some additional REPORT instruction work was
necessary for this.
Reworked literal construction path at compile time in prep for using
program offsets as literal IDs.
Completely removed the anchored log runtime, which is no longer worth
the extra complexity.
Move report preconditions (bounds, exhaustion, etc) into program
instructions and use a more direct path to the user match callback than
the adaptor functions.
Report handling has been moved to new file src/report.h. Reporting from
EOD now uses the same instructions as normal report handling, rather
than its own.
Jump target tracking in rose_build_bytecode.cpp has been cleaned up.
Specifically, we must set build_context::floatingMinLiteralMatchOffset
to 1 when thew anchored table contains multiple DFAs, as they can
produce unordered matches.
This check was already been done, but too late to affect the generation
of ANCHORED_DELAY instructions.
Replace the RoseLiteral structure with more program instructions; now,
instead of each literal ID leading to a RoseLiteral, it simply has a
program to run (and a delay rebuild program).
This commit also makes some other improvements:
* CHECK_STATE instruction, for use instead of a sparse iterator over a
single element.
* Elide some checks (CHECK_LIT_EARLY, ANCHORED_DELAY, etc) when not
needed.
* Flatten PUSH_DELAYED behaviour to one instruction per delayed
literal, rather than the mask/index-list approach used before.
* Simple program cache at compile time for deduplication.