2150 Commits

Author SHA1 Message Date
Justin Viiret
fb8747295e roseTestLeftfix: unify common "nfa is dead" code 2016-03-01 11:34:57 +11:00
Justin Viiret
996eba9686 Add CATCH_UP to report_block, not "parent" program
Also ensure that exhaustion check happens after catch up, as catch up
may fire reports (which could exhaust).
2016-03-01 11:34:57 +11:00
Justin Viiret
1619d975c6 limex_runtime.h: scratch header no longer needed 2016-03-01 11:34:57 +11:00
Justin Viiret
cf00094f24 Remove more unused structures from unit tests
The NFA, LBR no longer need scratch or the NFAContext structure stored
outside the NFA stack.
2016-03-01 11:34:38 +11:00
Justin Viiret
c3860a9f29 NFA API: Remove unused scratch ptr from struct mq 2016-03-01 11:34:38 +11:00
Justin Viiret
58f9617f66 NFA API: Remove nfaBlockExecReverse scratch arg
Scratch is no longer used by this function's implementations.
2016-03-01 11:34:38 +11:00
Justin Viiret
3e002f8181 NFA: Move NFAContext to stack (from scratch) 2016-03-01 11:34:38 +11:00
Justin Viiret
7b54856642 Rose: allow block-mode merge of small prefixes
Previously, we disallowed the merging of all Rose prefixes in block mode
where the literal sets are not identical.

This change allows merging if the prefix graphs to be merged are very
small, as a small performance improvement for cases with lots of tiny
prefixes.

This check is deliberately conservative: graphs must have some common
vertices, and the result of the merge must not give up any
accelerability.
2016-03-01 11:34:26 +11:00
Justin Viiret
670eff5bc0 NFA merging: permit different reports
For cases where the edges from start are not to a mix of accept and
acceptEod, report sets can be combined.
2016-03-01 11:34:26 +11:00
Justin Viiret
42d34f19d1 Dump: don't call dumpNfaNotes for SOM reverse NFAs
These NFAs have no queue index.
2016-03-01 11:34:26 +11:00
Justin Viiret
961e303ff3 SET_GROUPS instr: don't generate more than one 2016-03-01 11:34:26 +11:00
Justin Viiret
314da68085 dedupeCatchup: only call when necessary at runtime 2016-03-01 11:34:11 +11:00
Justin Viiret
cd133f77ee DEDUPE instr: generate only when necessary 2016-03-01 11:34:11 +11:00
Justin Viiret
09bf568d95 Rose: clean up use of scratch, RoseContext 2016-03-01 11:32:11 +11:00
Justin Viiret
9e9bb6a960 Rose: pack global state bits into one u8
Eliminate the RoseRuntimeState structure in favour of a single status
byte that is stored in scratch and copied to/from stream state.
2016-03-01 11:32:01 +11:00
Justin Viiret
28f379d738 Rose: remove alignment req for anchored DFA state 2016-03-01 11:32:01 +11:00
Justin Viiret
060defe6c4 Rose: move more report handling work into program
Move report preconditions (bounds, exhaustion, etc) into program
instructions and use a more direct path to the user match callback than
the adaptor functions.

Report handling has been moved to new file src/report.h. Reporting from
EOD now uses the same instructions as normal report handling, rather
than its own.

Jump target tracking in rose_build_bytecode.cpp has been cleaned up.
2016-03-01 11:32:01 +11:00
Justin Viiret
94b33421ca ng_filter: Fix bug introduced in 98eff64
If the max width is modified for a region, use the modified version when
checking to see if a self-loop must be added on the last vertex.
2016-03-01 11:30:20 +11:00
Justin Viiret
9eb328b455 RoseRuntimeState no longer needs to be packed
This structure only contains u8 values now. In the future we may wish to
eliminate it entirely and store the few bits we need more directly.
2016-03-01 11:29:04 +11:00
Justin Viiret
435b08b984 Docs for Rose callback types 2016-03-01 11:29:04 +11:00
Justin Viiret
4feabf7bd6 Make Rose callback types explicitly take scratch 2016-03-01 11:29:04 +11:00
Justin Viiret
70620327cc Remove RoseContext::userCtx
All Rose callbacks receive scratch as their context.
2016-03-01 11:29:04 +11:00
Justin Viiret
cca4116861 Move cyclic path redundancy into reduce loop
Sometimes cyclic path redundancy can uncover further reduction work that
can be done by the other passes in the reduce loop.
2016-03-01 11:29:00 +11:00
Justin Viiret
b36197df26 roseEodRunMatcher: correct early return value 2016-03-01 11:28:56 +11:00
Justin Viiret
621dfbebb7 nfaCheckFinalState: define return value
Make nfaCheckFinalState return MO_HALT_MATCHING when the user instructs
us (via the callback return value) to halt matching. In the caller,
check this value and stop matching if told.
2016-03-01 11:28:03 +11:00
Anatoly Burakov
843ca0e7cc Don't look for accel friends for multibyte acceleration 2016-03-01 11:27:36 +11:00
Justin Viiret
755e6700c1 scratch: correctly align fatbit arrays
This fixes an assertion failure on 32-bit targets.
2016-03-01 11:27:29 +11:00
Justin Viiret
de61b32e98 Use fatbit for anch log, delay slots in scratch
Since these structures are in scratch, they do not have to be as small
as possible and we can use fatbit instead of multibit to improve
performance.
2016-03-01 11:24:17 +11:00
Justin Viiret
1c2fca8840 rose_build_anchored: take ref, not pointer 2016-03-01 11:24:17 +11:00
Justin Viiret
69682ed263 Account for multi-dfa case with ANCHORED_DELAY
Specifically, we must set build_context::floatingMinLiteralMatchOffset
to 1 when thew anchored table contains multiple DFAs, as they can
produce unordered matches.

This check was already been done, but too late to affect the generation
of ANCHORED_DELAY instructions.
2016-03-01 11:24:13 +11:00
Justin Viiret
d7c8ffc7fd Use correct type for anchored matcher build 2016-03-01 11:24:09 +11:00
Justin Viiret
e63fcec3c7 Fix release build (unused var) 2016-03-01 11:24:08 +11:00
Justin Viiret
8783750c72 Remove dupe engine, state ptrs from RoseContext
Remove the RoseEngine and stream state pointers frose RoseContext, as
they are also present in core_info.

Unify stream state handing in Rose to always use a char * (we were often
a u8 * for no particularly good reason) and tidy up.
2016-03-01 11:23:56 +11:00
Matthew Barr
39886a0968 Coverity: Restore output stream format 2016-03-01 11:23:56 +11:00
Justin Viiret
10cda4cc33 Rose: Move all literal operations into program
Replace the RoseLiteral structure with more program instructions; now,
instead of each literal ID leading to a RoseLiteral, it simply has a
program to run (and a delay rebuild program).

This commit also makes some other improvements:

 * CHECK_STATE instruction, for use instead of a sparse iterator over a
   single element.
 * Elide some checks (CHECK_LIT_EARLY, ANCHORED_DELAY, etc) when not
   needed.
 * Flatten PUSH_DELAYED behaviour to one instruction per delayed
   literal, rather than the mask/index-list approach used before.
 * Simple program cache at compile time for deduplication.
2016-03-01 11:23:56 +11:00
Alex Coyte
255d84a83a squashing: prevent generation of pairs of squash states 2016-03-01 11:23:56 +11:00
Matthew Barr
fe475cc069 alignof() should operate on a type-id 2016-03-01 11:23:48 +11:00
Justin Viiret
fafcc83520 Delete unused build_context::depths 2016-03-01 11:23:11 +11:00
Justin Viiret
48c9d7c381 Remove use of depth from Rose entirely 2016-03-01 11:23:11 +11:00
Justin Viiret
14f18bd6e8 Don't use depth for in-flight check 2016-03-01 11:23:11 +11:00
Justin Viiret
3d87e382fa Remove CHECK_DEPTH instruction 2016-03-01 11:23:11 +11:00
Justin Viiret
e051077a26 Remove "dot" entries from leftfix lookarounds
Note that we have to be careful to leave the first lookaround entry in
place, if it's a dot. This should eventually be done with a program
instruction.
2016-03-01 11:23:06 +11:00
Justin Viiret
e92a20e5fa ComponentRepeat: remove firsts_cache, precalc code
Firsts are easy to compute in ComponentRepeat::first() now.
2016-03-01 11:22:45 +11:00
Justin Viiret
3d049d6de3 ComponentRepeat: wire X{0,N} and (X?){N} the same 2016-03-01 11:22:45 +11:00
Justin Viiret
997c0c9efd ComponentRepeat: wire R{0,N} as (R{1,N})?
Change the way that we wire up the edges in a bounded repeat to avoid
large fan-out from predecessors.
2016-03-01 11:22:45 +11:00
Justin Viiret
98eff64edf ng_prefilter: turn large max bound into inf
During prefilter region replacement, turn regions with very large max
bounds into repeats with inf max bound. This improves compile time and
the likelihood that we will actually be able to build an implementation
for such patterns.
2016-03-01 11:22:45 +11:00
Anatoly Burakov
fb932616ca Multibyte matcher unit-tests 2016-03-01 11:21:39 +11:00
Anatoly Burakov
e6709cee5f Bitmatcher unit-tests 2016-03-01 11:21:39 +11:00
Anatoly Burakov
87424713a7 Multibyte acceleration compile side 2016-03-01 11:21:39 +11:00
Anatoly Burakov
081b3ef369 Multibyte truffle runtime 2016-03-01 11:21:39 +11:00