91 Commits

Author SHA1 Message Date
Xu, Chi
997787bd4b rose: add CHECK_SINGLE_LOOKAROUND instruction
This specialisation is cheaper than the shufti-based variants, so we
prefer it for single character class tests.
2016-10-28 14:47:04 +11:00
Justin Viiret
385f71b44e rose: enable generation of shufti32x16 case 2016-10-28 14:46:37 +11:00
Xu, Chi
04d79629de rose: add shufti-based lookaround instructions
More lookaround specialisations that use the shufti approach.
2016-10-28 14:46:27 +11:00
Justin Viiret
9139123642 rose: move sparse iter cache to RoseEngineBlob
This enables its use for iterators written by instructions.
2016-10-28 14:45:32 +11:00
Justin Viiret
13af3bfb74 rose: decouple build-time program representation
This commit replaces the build-time representation of the Rose
interpreter programs, from a class containing a discriminated union of
the bytecode structures to a class hierarchy of build-time prototypes.

This makes it easier to reason about and manipulate Rose programs during
compilation.
2016-10-28 14:45:15 +11:00
Justin Viiret
f4fa6cd4dd rose: tighten up requirements for catch up
We only need to catch up when there is an actual anchored table, not
merely when there are successors of anchored_root in the Rose graph.
2016-10-28 14:44:20 +11:00
Justin Viiret
3cf4199879 debug: always use %zu in format string for size_t 2016-10-28 14:43:34 +11:00
Justin Viiret
c8868fb9c7 rose: remove CHECK_LIT_MASK instruction 2016-10-28 14:43:33 +11:00
Justin Viiret
4ce306864e rose: use lookarounds to implement benefits masks
This replaces the CHECK_LIT_MASK instruction.
2016-10-28 14:43:33 +11:00
Xu, Chi
b96d5c23d1 rose: add new instruction CHECK_MASK_32
This is a specialisation of the "lookaround" code.
2016-10-28 14:43:33 +11:00
Justin Viiret
ae14187462 rose: use min of max_offset in left merges
Be more careful with max_offset, since we rely on it for ANCH history
cases. Also adds tighter assertions.
2016-08-10 15:12:12 +10:00
Anatoly Burakov
6331da4e29 dfa: adding new Sheng engine
A new shuffle-based DFA engine, complete with acceleration and smallwrite.
2016-08-10 15:10:46 +10:00
Matthew Barr
cbd115f7fe Don't shadow names 2016-08-10 15:06:57 +10:00
Justin Viiret
7f49958824 rose: only write out report programs if in use
These programs are only used by output-exposed engines.
2016-08-10 15:05:53 +10:00
Alex Coyte
d574557200 take mask overhang into account for hwlm accel, float min dist 2016-08-10 15:05:19 +10:00
Justin Viiret
9eb349a343 rose: expose smwr builder, tidy up engine build 2016-08-10 14:59:10 +10:00
Justin Viiret
8754cbbd24 rose: use program offset, not final_id, in atable
This removes the need to look up the program offset in a table when
handling an anchored literal match.
2016-08-10 14:59:10 +10:00
Justin Viiret
4dbbc4eaa5 rose: add RECORD_ANCHORED instruction to program
Moves recordAnchoredLiteralMatch from an unconditional call in the
anchored callback to being driven by a program instruction.
2016-08-10 14:59:10 +10:00
Alex Coyte
981b59fd05 minor eager prefixes improvements
- count eager prefixes as always run engine when comparing with smwr
- only check if a prefix is vacuous after adding back literal fragments
2016-08-10 14:59:10 +10:00
Xu, Chi
4d7469392d rose: add CHECK_BYTE/CHECK_MASK instructions
These instructions are specialisations of the "lookaround" code for
performance.
2016-08-10 14:57:48 +10:00
Justin Viiret
3e96cd48ef rose: sanity check CHECK_BOUNDS instruction 2016-08-10 14:57:36 +10:00
Alex Coyte
3a1429a621 group_weak_end is no longer used 2016-08-10 14:52:56 +10:00
Justin Viiret
cf9e40ae1c nfa: unify NfaCallback and SomNfaCallback
Use just one callback type, with both start and end offsets.
2016-07-08 11:01:56 +10:00
Xiang Wang
9087d59be5 tamarama: add container engine for exclusive nfas
Add the new Tamarama engine that acts as a container for infix/suffix
engines that can be proven to run exclusively of one another.

This reduces stream state for pattern sets with many exclusive engines.
2016-07-08 11:01:34 +10:00
Alex Coyte
f166bc5658 allow some prefixes that may squash the literal match to run eagerly 2016-07-08 11:01:34 +10:00
Alex Coyte
575e8c06dc only show floating groups to the floating table 2016-07-08 10:59:40 +10:00
Justin Viiret
6239805561 rose: don't build empty sparse iter subprograms 2016-07-08 10:59:40 +10:00
Justin Viiret
cdaf705a87 rose: pick up more prefix->lookaround conversions 2016-07-08 10:57:29 +10:00
Justin Viiret
d3c56b532b rose build: dedupe hasLastByteHistorySucc func 2016-07-08 10:57:00 +10:00
Justin Viiret
426bfc9cfb rose_build_bytecode: clean up 2016-07-08 10:55:36 +10:00
Justin Viiret
78e4332a8b move eod iter program into general eod program 2016-07-08 10:55:36 +10:00
Justin Viiret
39461cc806 eod: move hwlm execution into MATCHER_EOD instr 2016-07-08 10:55:36 +10:00
Justin Viiret
b8f771e824 rose_build_bytecode: tidy up addPredBlocks 2016-07-08 10:55:36 +10:00
Justin Viiret
2761e0105d eod: move suffix iteration into program 2016-07-08 10:54:07 +10:00
Justin Viiret
9669e0fe94 eod: remove forced sparse iter optimization 2016-07-08 10:54:07 +10:00
Justin Viiret
7a7dff5b70 eod: don't force sparse iter for general prog 2016-07-08 10:54:07 +10:00
Justin Viiret
02595cda1f eod: consolidate eod anchor programs 2016-07-08 10:54:07 +10:00
Justin Viiret
7a6a476723 eod: move engine checks into ENGINES_EOD instr 2016-07-08 10:54:07 +10:00
Justin Viiret
8e4c68e9df rose: eagerly report EOD literal matches
Where possible, eagerly report a match when a literal that matches at
EOD occurs, rather than setting a state bit and waiting for EOD
processing.
2016-07-08 10:47:33 +10:00
Justin Viiret
1df4da16ad rose: parameterise CHECK_LIT_EARLY 2016-07-08 10:47:33 +10:00
Justin Viiret
c2496fbf76 rose: elide SET_GROUPS when possible 2016-07-08 10:47:07 +10:00
Justin Viiret
beec5e59df rose: linear scan for lookaround during build
This allows us to reuse more lookaround entries in the bytecode.
2016-07-08 10:46:54 +10:00
Justin Viiret
9b7eca5400 rose: dump leftfix/suffix queue indices 2016-07-08 10:44:56 +10:00
Boris Nagaev
6d87533ef0 fix add_to_engine_blob for iterator=pointer 2016-07-06 19:46:41 +03:00
Justin Viiret
614ca0accf rose: always push CHECK_BOUNDS onto end of program 2016-06-01 10:56:57 +10:00
Justin Viiret
9826522e34 rose: fix CHECK_NOT_HANDLED placement bug
The CHECK_NOT_HANDLED instruction was being inserted into an already
partially-flattened program, which would result in jump offsets becoming
incorrect.

This change places it as part of the normal flow of program
construction, which avoids this issue.
2016-06-01 10:56:52 +10:00
Justin Viiret
ee7f31ac39 mpv: native report remapping 2016-05-18 16:22:01 +10:00
Justin Viiret
c101beb541 castle, lbr: native report remap 2016-05-18 16:21:36 +10:00
Justin Viiret
1f41a921f2 mcclellan, gough: native report remapping 2016-05-18 16:20:45 +10:00
Justin Viiret
611579511c rose: remap reports to program offsets 2016-05-18 16:20:42 +10:00