30 Commits

Author SHA1 Message Date
Alex Coyte
15c8a7bd98 rose: rework storage of extra lookaround information
- remove explicit lookaround table from bytecode
- make the RoseInstr responsible for adding required info to blob
2017-05-30 13:59:00 +10:00
Alex Coyte
f74f475189 rose_program: merge RECORD_ANCHORED instruction into ANCHORED_DELAY 2017-05-30 13:58:32 +10:00
Xu, Chi
ae3cb7de6f rose: add multi-path shufti 16x8, 32x8, 32x16, 64x8 and multi-path lookaround instructions. 2017-04-26 15:18:56 +10:00
Justin Viiret
18f843bcc1 rose: add CLEAR_WORK_DONE instruction
Preparatory work for allowing fragments to be shared between literals
that squash groups and those that don't.
2017-04-26 15:18:26 +10:00
Justin Viiret
eb14792a63 rose: group final ids by fragment 2017-04-26 14:41:29 +10:00
Justin Viiret
07a6b6510c rose/hwlm: limit literals to eight bytes
Rework HWLM to work over literals of eight bytes ("medium length"),
doing confirm in the Rose interpreter.
2017-04-26 14:41:29 +10:00
Justin Viiret
68bf473e2e fdr: move long literal handling into Rose
Move the hash table used for long literal support in streaming mode from
FDR to Rose, and introduce new instructions CHECK_LONG_LIT and
CHECK_LONG_LIT_NOCASE for doing literal confirm for long literals.

This simplifies FDR confirm, and guarantees that HWLM matchers will only
be used for literals < 256 bytes long.
2016-10-28 14:52:26 +11:00
Justin Viiret
6e533589bb rose: move END instruction to start of enum
Stop overloading END as the last Rose interpreter instruction, use new
sentinel LAST_ROSE_INSTRUCTION for that.

This change will also make it easier to add new instructions without
renumbering END and thus changing all generated bytecodes.
2016-10-28 14:47:07 +11:00
Xu, Chi
997787bd4b rose: add CHECK_SINGLE_LOOKAROUND instruction
This specialisation is cheaper than the shufti-based variants, so we
prefer it for single character class tests.
2016-10-28 14:47:04 +11:00
Xu, Chi
04d79629de rose: add shufti-based lookaround instructions
More lookaround specialisations that use the shufti approach.
2016-10-28 14:46:27 +11:00
Justin Viiret
13af3bfb74 rose: decouple build-time program representation
This commit replaces the build-time representation of the Rose
interpreter programs, from a class containing a discriminated union of
the bytecode structures to a class hierarchy of build-time prototypes.

This makes it easier to reason about and manipulate Rose programs during
compilation.
2016-10-28 14:45:15 +11:00
Justin Viiret
c8868fb9c7 rose: remove CHECK_LIT_MASK instruction 2016-10-28 14:43:33 +11:00
Xu, Chi
b96d5c23d1 rose: add new instruction CHECK_MASK_32
This is a specialisation of the "lookaround" code.
2016-10-28 14:43:33 +11:00
Justin Viiret
4dbbc4eaa5 rose: add RECORD_ANCHORED instruction to program
Moves recordAnchoredLiteralMatch from an unconditional call in the
anchored callback to being driven by a program instruction.
2016-08-10 14:59:10 +10:00
Xu, Chi
4d7469392d rose: add CHECK_BYTE/CHECK_MASK instructions
These instructions are specialisations of the "lookaround" code for
performance.
2016-08-10 14:57:48 +10:00
Justin Viiret
39461cc806 eod: move hwlm execution into MATCHER_EOD instr 2016-07-08 10:55:36 +10:00
Justin Viiret
2761e0105d eod: more suffix iteration into program 2016-07-08 10:54:07 +10:00
Justin Viiret
7a6a476723 eod: move engine checks into ENGINES_EOD instr 2016-07-08 10:54:07 +10:00
Justin Viiret
1df4da16ad rose: parameterise CHECK_LIT_EARLY 2016-07-08 10:47:33 +10:00
Justin Viiret
1d85987d96 FINAL_REPORT: Add specialised instruction
Specialisation of the REPORT instruction that also terminates execution
of the program. Improves performance on programs that generate many
reports.
2016-04-20 13:34:57 +10:00
Justin Viiret
36150bbc19 Rose: replace internal_report with program
Replace the use of the internal_report structure (for reports from
engines, MPV etc) with the Rose program interpreter.

SOM processing was reworked to use a new som_operation structure that is
embedded in the appropriate instructions.
2016-04-20 13:34:57 +10:00
Justin Viiret
67b9784dae Rose: use program for all literal matches
Unifies all literal match paths so that the Rose program is used for all
of them. This removes the previous specialised "direct report" and
"multi direct report" paths. Some additional REPORT instruction work was
necessary for this.

Reworked literal construction path at compile time in prep for using
program offsets as literal IDs.

Completely removed the anchored log runtime, which is no longer worth
the extra complexity.
2016-04-20 13:34:54 +10:00
Justin Viiret
5a1dd54049 Split CHECK_LEFTFIX into CHECK_{INFIX,PREFIX} 2016-03-01 11:35:08 +11:00
Justin Viiret
060defe6c4 Rose: move more report handling work into program
Move report preconditions (bounds, exhaustion, etc) into program
instructions and use a more direct path to the user match callback than
the adaptor functions.

Report handling has been moved to new file src/report.h. Reporting from
EOD now uses the same instructions as normal report handling, rather
than its own.

Jump target tracking in rose_build_bytecode.cpp has been cleaned up.
2016-03-01 11:32:01 +11:00
Justin Viiret
10cda4cc33 Rose: Move all literal operations into program
Replace the RoseLiteral structure with more program instructions; now,
instead of each literal ID leading to a RoseLiteral, it simply has a
program to run (and a delay rebuild program).

This commit also makes some other improvements:

 * CHECK_STATE instruction, for use instead of a sparse iterator over a
   single element.
 * Elide some checks (CHECK_LIT_EARLY, ANCHORED_DELAY, etc) when not
   needed.
 * Flatten PUSH_DELAYED behaviour to one instruction per delayed
   literal, rather than the mask/index-list approach used before.
 * Simple program cache at compile time for deduplication.
2016-03-01 11:23:56 +11:00
Justin Viiret
48c9d7c381 Remove use of depth from Rose entirely 2016-03-01 11:23:11 +11:00
Justin Viiret
3d87e382fa Remove CHECK_DEPTH instruction 2016-03-01 11:23:11 +11:00
Justin Viiret
b2ebdac642 rose: Extend program to handle literals, iterators
- cleanups
- add sparse iter instructions
- merge "root" and "sparse iter" programs together
- move program execution to new file program_runtime.h
- simplify EOD execution
2016-03-01 11:17:31 +11:00
Justin Viiret
d67c7583ea rose: Extend the interpreter to handle more work
- Use program for EOD sparse iterator
- Use program for literal sparse iterator
- Eliminate RoseRole, RosePred, RoseVertexProps::role
- Small performance optimizations
2016-03-01 11:16:02 +11:00
Justin Viiret
9cb2233589 rose: Use an interpreter for role runtime
Replace much of the RoseRole structure with an interpreted program,
simplifying the Rose runtime and making it much more flexible.
2016-03-01 11:16:02 +11:00