Justin Viiret
176c61aeaa
rose_build_bytecode: clean up findEdgesByLiteral()
2017-04-26 15:04:31 +10:00
Justin Viiret
6a0dc261a2
rose_build_bytecode: less final_id
2017-04-26 15:04:31 +10:00
Justin Viiret
24ffb156e9
rose: eliminate global final to fragment map
2017-04-26 15:04:31 +10:00
Justin Viiret
454fbf33d5
rose: tidy
2017-04-26 15:04:31 +10:00
Justin Viiret
dc50ab291b
container: allow sort_and_unique to have a comparator
2017-04-26 15:04:31 +10:00
Justin Viiret
cea8f452f2
rose: reorganise delay program generation
2017-04-26 15:04:31 +10:00
Justin Viiret
a2d2f7cb95
rose: dedupe anch programs and RECORD_ANCHOREDs
2017-04-26 15:04:31 +10:00
Justin Viiret
75c7f42314
rose: don't emit RECORD_ANCHORED in anchored progs
2017-04-26 15:04:31 +10:00
Justin Viiret
f5dd20e461
rose: rearrange anchored program generation
2017-04-26 15:04:31 +10:00
Justin Viiret
6a945e27fb
rose: reduce delay program dep on final_id
2017-04-26 15:04:31 +10:00
Justin Viiret
dc8220648c
rose: remove now-unused anchored_base_id
2017-04-26 15:04:30 +10:00
Justin Viiret
c426d2dc7d
rose: reduce anchored program dep on final_id
...
We only need to build anchored programs for cases where a
RECORD_ANCHORED instruction has been generated, and we can key those
directly rather than using final_id.
2017-04-26 15:04:30 +10:00
Justin Viiret
ea8d0bcb1c
rose: build fragments directly
2017-04-26 15:04:30 +10:00
Justin Viiret
79512bd5c3
rose: use fragment ids earlier for anchored dfas
2017-04-26 15:04:30 +10:00
Justin Viiret
8b25d83415
rose: write fragment ids into literal_info
2017-04-26 15:04:30 +10:00
Justin Viiret
7bdb327203
rose: use final_ids less in program construction
2017-04-26 14:56:48 +10:00
Justin Viiret
a83b7cb348
move final_id_to_literal into build_context
2017-04-26 14:56:48 +10:00
Justin Viiret
a0260c0362
rose: do fragment group assignment earlier
2017-04-26 14:56:48 +10:00
Justin Viiret
6bf35cb637
rose: make groupByFragment local
2017-04-26 14:49:51 +10:00
Justin Viiret
a5b3bc814f
rose: delete RoseEngine::literalCount
2017-04-26 14:49:51 +10:00
Justin Viiret
9550058e75
remove lit program tables from bytecode
2017-04-26 14:49:51 +10:00
Justin Viiret
c2cac5009a
tidy up args to builders
2017-04-26 14:46:49 +10:00
Justin Viiret
3ae2fb417e
move final_to_frag_map into RoseBuildImpl (for dump code)
2017-04-26 14:46:49 +10:00
Justin Viiret
76f72b6ab4
rose: use program offsets directly in lit tables
2017-04-26 14:46:48 +10:00
Justin Viiret
ac858cd47c
rose: build a separate delay rebuild matcher
2017-04-26 14:46:48 +10:00
Alex Coyte
bbd64f98ae
allow streams to marked as exhausted in more cases
...
At stream boundaries, we can mark streams as exhausted if there are no
groups active and there are no other ways to report matches. This allows us
to stop maintaining the history buffer on subsequent stream writes.
Previously, streams were only marked as exhausted if a pure highlander case
reported all patterns or the outfix in a sole outfix case died.
2017-04-26 14:44:53 +10:00
Justin Viiret
c6b2563df6
rose: delete literal_info requires_explode flag
2017-04-26 14:43:28 +10:00
Justin Viiret
f307956584
rose: do not combine fragments which squash groups
2017-04-26 14:41:30 +10:00
Justin Viiret
eb14792a63
rose: group final ids by fragment
2017-04-26 14:41:29 +10:00
Justin Viiret
07a6b6510c
rose/hwlm: limit literals to eight bytes
...
Rework HWLM to work over literals of eight bytes ("medium length"),
doing confirm in the Rose interpreter.
2017-04-26 14:41:29 +10:00
Justin Viiret
5c9c540424
rose: fix up comments referring to CHECK_LITERAL
...
This instruction is now called CHECK_LONG_LIT.
2017-04-26 14:41:29 +10:00
Matthew Barr
bc2f336d9d
Work around for deficiency in C++11/14/17 standard
...
As explained to us by STL at Microsoft (the author of their
vector), there is a hole in the standard wrt the vector copy
constructor, which always exists even if it won't compile.
2017-04-26 14:41:29 +10:00
Matthew Barr
2214296b7f
Convert compile-time code to not require SIMD
2016-12-14 15:29:01 +11:00
Justin Viiret
e271781d95
multibit, fatbit: make _size build-time only
...
This commit makes mmbit_size() and fatbit_size compile-time only, and
adds a resource limit for very large multibits.
2016-12-14 15:28:54 +11:00
Alex Coyte
e51b6d23b9
introduce Sheng-McClellan hybrid
2016-12-14 15:27:18 +11:00
Matthew Barr
99e14df117
Fix combine2x128
2016-12-02 11:33:48 +11:00
Alex Coyte
e1e9010cac
Introduce custom adjacency-list based graph
2016-12-02 11:31:33 +11:00
Justin Viiret
8869dee643
rose: simplify long lit table, add bloom filter
...
Replaces the original long lit hash table (used in streaming mode) with a
smaller, simpler linear probing approach. Adds a bloom filter in front
of it to reduce time spent on false positives.
Sizing of both the hash table and bloom filter are done based on max
load.
2016-10-28 14:52:45 +11:00
Justin Viiret
68bf473e2e
fdr: move long literal handling into Rose
...
Move the hash table used for long literal support in streaming mode from
FDR to Rose, and introduce new instructions CHECK_LONG_LIT and
CHECK_LONG_LIT_NOCASE for doing literal confirm for long literals.
This simplifies FDR confirm, and guarantees that HWLM matchers will only
be used for literals < 256 bytes long.
2016-10-28 14:52:26 +11:00
Alex Coyte
c94899dd44
allow sets of tops on edges
2016-10-28 14:51:46 +11:00
Xu, Chi
997787bd4b
rose: add CHECK_SINGLE_LOOKAROUND instruction
...
This specialisation is cheaper than the shufti-based variants, so we
prefer it for single character class tests.
2016-10-28 14:47:04 +11:00
Justin Viiret
385f71b44e
rose: enable generation of shufti32x16 case
2016-10-28 14:46:37 +11:00
Xu, Chi
04d79629de
rose: add shufti-based lookaround instructions
...
More lookaround specialisations that use the shufti approach.
2016-10-28 14:46:27 +11:00
Justin Viiret
9139123642
rose: move sparse iter cache to RoseEngineBlob
...
This enables its use for iterators written by instructions.
2016-10-28 14:45:32 +11:00
Justin Viiret
13af3bfb74
rose: decouple build-time program representation
...
This commit replaces the build-time representation of the Rose
interpreter programs, from a class containing a discriminated union of
the bytecode structures to a class hierarchy of build-time prototypes.
This makes it easier to reason about and manipulate Rose programs during
compilation.
2016-10-28 14:45:15 +11:00
Justin Viiret
f4fa6cd4dd
rose: tighten up requirements for catch up
...
We only need to catch up when there is an actual anchored table, not
merely when there are successors of anchored_root in the Rose graph.
2016-10-28 14:44:20 +11:00
Justin Viiret
3cf4199879
debug: always use %zu in format string for size_t
2016-10-28 14:43:34 +11:00
Justin Viiret
c8868fb9c7
rose: remove CHECK_LIT_MASK instruction
2016-10-28 14:43:33 +11:00
Justin Viiret
4ce306864e
rose: use lookarounds to implement benefits masks
...
This replaces the CHECK_LIT_MASK instruction.
2016-10-28 14:43:33 +11:00
Xu, Chi
b96d5c23d1
rose: add new instruction CHECK_MASK_32
...
This is a specialisation of the "lookaround" code.
2016-10-28 14:43:33 +11:00