64 Commits

Author SHA1 Message Date
Wang, Xiang W
252eb820c4 ue-3145: make parents of included literals exclusive 2017-08-21 11:12:36 +10:00
Wang, Xiang W
86c5f7feb1 FDR: Squash buckets of included literals in FDR confirm
- Change the compile of literal matchers to two passes.
 - Reverse the bucket assignment in FDR, bucket with longer literals has
   smaller bucket id.
 - Squash the buckets of included literals and jump to the the program of
   included literals directly from parent literal program without going
   through FDR confirm for included iterals.
2017-08-21 11:12:36 +10:00
Wang, Xiang W
ebb1b0006b remove start argument in literal matcher callbacks 2017-08-21 11:12:36 +10:00
Matthew Barr
f6b688fc06 rename pshufb to pshufb_m128 2017-05-30 13:59:23 +10:00
Matthew Barr
a295c96198 rename vpshufb to pshufb_m256 2017-05-30 13:59:23 +10:00
Alex Coyte
15c8a7bd98 rose: rework storage of extra lookaround information
- remove explicit lookaround table from bytecode
- make the RoseInstr responsible for adding required info to blob
2017-05-30 13:59:00 +10:00
Alex Coyte
f74f475189 rose_program: merge RECORD_ANCHORED instruction into ANCHORED_DELAY 2017-05-30 13:58:32 +10:00
Alex Coyte
88fd95e38a rose: minor clean up of catchup
- anchored dfa do not mean that catchup is required
- remove needsCatchup from rose bytecode as catchup is based on interpreter
2017-05-30 13:58:32 +10:00
Xu, Chi
2f9d063190 rose: fix CHECK_MULTIPATH_LOOKAROUND match difference bug 2017-04-26 15:19:49 +10:00
Xu, Chi
ae3cb7de6f rose: add multi-path shufti 16x8, 32x8, 32x16, 64x8 and multi-path lookaround instructions. 2017-04-26 15:18:56 +10:00
Justin Viiret
18f843bcc1 rose: add CLEAR_WORK_DONE instruction
Preparatory work for allowing fragments to be shared between literals
that squash groups and those that don't.
2017-04-26 15:18:26 +10:00
Justin Viiret
c426d2dc7d rose: reduce anchored program dep on final_id
We only need to build anchored programs for cases where a
RECORD_ANCHORED instruction has been generated, and we can key those
directly rather than using final_id.
2017-04-26 15:04:30 +10:00
Justin Viiret
a4af801dd1 rose: define invalid value for program offset 2017-04-26 14:56:49 +10:00
Justin Viiret
eb14792a63 rose: group final ids by fragment 2017-04-26 14:41:29 +10:00
Justin Viiret
07a6b6510c rose/hwlm: limit literals to eight bytes
Rework HWLM to work over literals of eight bytes ("medium length"),
doing confirm in the Rose interpreter.
2017-04-26 14:41:29 +10:00
Alex Coyte
f9324febde Ensure the queue structure is initialised in roseEnginesEod(). 2017-03-01 13:05:10 +11:00
Justin Viiret
91a7ce1cda getData256(): data needs to be 32-byte aligned 2016-12-02 11:28:16 +11:00
Justin Viiret
68bf473e2e fdr: move long literal handling into Rose
Move the hash table used for long literal support in streaming mode from
FDR to Rose, and introduce new instructions CHECK_LONG_LIT and
CHECK_LONG_LIT_NOCASE for doing literal confirm for long literals.

This simplifies FDR confirm, and guarantees that HWLM matchers will only
be used for literals < 256 bytes long.
2016-10-28 14:52:26 +11:00
Matthew Barr
7849b9d611 MSVC prefers the attrib at the beginning 2016-10-28 14:48:31 +11:00
Justin Viiret
6e533589bb rose: move END instruction to start of enum
Stop overloading END as the last Rose interpreter instruction, use new
sentinel LAST_ROSE_INSTRUCTION for that.

This change will also make it easier to add new instructions without
renumbering END and thus changing all generated bytecodes.
2016-10-28 14:47:07 +11:00
Xu, Chi
997787bd4b rose: add CHECK_SINGLE_LOOKAROUND instruction
This specialisation is cheaper than the shufti-based variants, so we
prefer it for single character class tests.
2016-10-28 14:47:04 +11:00
Xu, Chi
04d79629de rose: add shufti-based lookaround instructions
More lookaround specialisations that use the shufti approach.
2016-10-28 14:46:27 +11:00
Justin Viiret
13af3bfb74 rose: decouple build-time program representation
This commit replaces the build-time representation of the Rose
interpreter programs, from a class containing a discriminated union of
the bytecode structures to a class hierarchy of build-time prototypes.

This makes it easier to reason about and manipulate Rose programs during
compilation.
2016-10-28 14:45:15 +11:00
Justin Viiret
c8868fb9c7 rose: remove CHECK_LIT_MASK instruction 2016-10-28 14:43:33 +11:00
Xu, Chi
b96d5c23d1 rose: add new instruction CHECK_MASK_32
This is a specialisation of the "lookaround" code.
2016-10-28 14:43:33 +11:00
Justin Viiret
e03375b644 program_runtime: remove commented-out code 2016-08-10 15:11:15 +10:00
Justin Viiret
4dbbc4eaa5 rose: add RECORD_ANCHORED instruction to program
Moves recordAnchoredLiteralMatch from an unconditional call in the
anchored callback to being driven by a program instruction.
2016-08-10 14:59:10 +10:00
Xu, Chi
4d7469392d rose: add CHECK_BYTE/CHECK_MASK instructions
These instructions are specialisations of the "lookaround" code for
performance.
2016-08-10 14:57:48 +10:00
Justin Viiret
9f98f4c7b2 nfa: standardise callback start, end naming 2016-07-08 11:02:05 +10:00
Justin Viiret
cf9e40ae1c nfa: unify NfaCallback and SomNfaCallback
Use just one callback type, with both start and end offsets.
2016-07-08 11:01:56 +10:00
Justin Viiret
013dbd3b3c rose: re-inline literal handling program exec 2016-07-08 11:01:34 +10:00
Justin Viiret
76d96809f8 rose: move roseRunProgram into its own unit
The roseRunProgram function had gotten very large for the number of
sites it was being inlined into, with negative effects on performance in
large cases. This change moves it into its own translation unit.
2016-07-08 11:01:34 +10:00
Alex Coyte
f166bc5658 allow some prefixes that may squash the literal match to run eagerly 2016-07-08 11:01:34 +10:00
Justin Viiret
159c09b70e roseEnginesEod: trust the queue structure 2016-07-08 10:55:36 +10:00
Justin Viiret
d9bd6d5dee roseSuffixesEod: trust the queue structure 2016-07-08 10:55:36 +10:00
Justin Viiret
3e0232f0d6 eod: retire getELiteralMatcher 2016-07-08 10:55:36 +10:00
Justin Viiret
78e4332a8b move eod iter program into general eod program 2016-07-08 10:55:36 +10:00
Justin Viiret
39461cc806 eod: move hwlm execution into MATCHER_EOD instr 2016-07-08 10:55:36 +10:00
Justin Viiret
2761e0105d eod: more suffix iteration into program 2016-07-08 10:54:07 +10:00
Justin Viiret
7a6a476723 eod: move engine checks into ENGINES_EOD instr 2016-07-08 10:54:07 +10:00
Justin Viiret
1df4da16ad rose: parameterise CHECK_LIT_EARLY 2016-07-08 10:47:33 +10:00
Justin Viiret
9e0ec02ac9 rose: assert that program offset is sane 2016-05-18 16:22:01 +10:00
Justin Viiret
1d85987d96 FINAL_REPORT: Add specialised instruction
Specialisation of the REPORT instruction that also terminates execution
of the program. Improves performance on programs that generate many
reports.
2016-04-20 13:34:57 +10:00
Justin Viiret
36150bbc19 Rose: replace internal_report with program
Replace the use of the internal_report structure (for reports from
engines, MPV etc) with the Rose program interpreter.

SOM processing was reworked to use a new som_operation structure that is
embedded in the appropriate instructions.
2016-04-20 13:34:57 +10:00
Justin Viiret
f2c0a66b6f Rose: use a multibit for the exhaustion vector
Previously, the exhaustion vector was a standard bitvector, which
required an expensive memset() call at init for databases with a large
number of exhaustion keys.
2016-04-20 13:34:55 +10:00
Justin Viiret
67b9784dae Rose: use program for all literal matches
Unifies all literal match paths so that the Rose program is used for all
of them. This removes the previous specialised "direct report" and
"multi direct report" paths. Some additional REPORT instruction work was
necessary for this.

Reworked literal construction path at compile time in prep for using
program offsets as literal IDs.

Completely removed the anchored log runtime, which is no longer worth
the extra complexity.
2016-04-20 13:34:54 +10:00
Justin Viiret
6e8f394d8d Make comparison signed (fix warning) 2016-03-01 11:36:09 +11:00
Justin Viiret
129578f970 Rose program: Improvements to debug/assertions
- Add current pc to debug printf.
- Assert that pc doesn't escape the RoseEngine structure.
2016-03-01 11:35:09 +11:00
Justin Viiret
5a1dd54049 Split CHECK_LEFTFIX into CHECK_{INFIX,PREFIX} 2016-03-01 11:35:08 +11:00
Justin Viiret
4d5710a84a Rename rosePrefixCheckMiracles to roseLeftfix... 2016-03-01 11:34:57 +11:00