52 Commits

Author SHA1 Message Date
gtsoul-tech
fc272868b8 fixes left_id and suffix_id 2024-05-13 13:04:57 +03:00
gtsoul-tech
0b8e863282 more fixes 2024-05-13 12:41:40 +03:00
gtsoul-tech
bb6464431f new variableScope 2024-04-29 15:09:55 +03:00
Konstantinos Margaritis
e35b88f2c8 use STL make_unique, remove wrapper header, breaks C++17 compilation 2021-10-12 11:51:34 +03:00
Konstantinos Margaritis
556206f138 replace push_back by emplace_back where possible 2021-10-12 11:51:33 +03:00
Justin Viiret
ce7cfbde82 misc: docs, typo fixes, small cleanups 2018-06-27 13:39:05 +08:00
Alex Coyte
a1fdc3afcf dedupeLeftfixesVariableLag: refactor, more blockmode deduping 2017-09-18 13:29:34 +10:00
Justin Viiret
4889a492e4 rose: more hash member funcs for rose types 2017-08-21 11:24:52 +10:00
Alex Coyte
37033ef9bb Provide RoseResources to roseQuality.
RoseResources is an alternative to manually digging through the bytecode.
2017-08-21 11:24:52 +10:00
Alex Coyte
ffc2d578b1 roseQuality() no longer needs to be part of rose's API. 2017-08-21 11:23:41 +10:00
Justin Viiret
9cf66b6ac9 util: switch from Boost to std::unordered set/map
This commit replaces the ue2::unordered_{set,map} types with their STL
versions, with some new hashing utilities in util/hash.h. The new types
ue2_unordered_set<T> and ue2_unordered_map<Key, T> default to using the
ue2_hasher.

The header util/ue2_containers.h has been removed, and the flat_set/map
containers moved to util/flat_containers.h.
2017-08-21 11:14:55 +10:00
Justin Viiret
8b9328fe9e rose: replace RoseLiteralMap use of bimap
This apoproach is simpler and more efficient for cases with large
numbers of literals.
2017-05-30 13:58:59 +10:00
Justin Viiret
a75b2ba2e5 rose: remove hasLiteral() 2017-05-30 13:58:59 +10:00
Alex Coyte
bb29aeb298 rose: shift program construction functions to rose_build_program 2017-05-30 13:58:32 +10:00
Justin Viiret
82838f5728 rose_build: move dedupe analysis into own file 2017-05-30 13:58:32 +10:00
Justin Viiret
3e5a8c9c90 rose: eliminate roseSize, use bytecode_ptr size 2017-04-26 15:19:43 +10:00
Justin Viiret
cf82924a39 depth: make constructor explicit 2017-04-26 15:19:19 +10:00
Justin Viiret
0a163b5535 rose: only use live reports for dedupe assignment 2017-04-26 15:18:13 +10:00
Justin Viiret
6a945e27fb rose: reduce delay program dep on final_id 2017-04-26 15:04:31 +10:00
Justin Viiret
dc8220648c rose: remove now-unused anchored_base_id 2017-04-26 15:04:30 +10:00
Justin Viiret
d43e9d838f rose: delete dead code for cloneVertex 2017-04-26 14:56:49 +10:00
Alex Coyte
bbd64f98ae allow streams to marked as exhausted in more cases
At stream boundaries, we can mark streams as exhausted if there are no
groups active and there are no other ways to report matches. This allows us
to stop maintaining the history buffer on subsequent stream writes.
Previously, streams were only marked as exhausted if a pure highlander case
reported all patterns or the outfix in a sole outfix case died.
2017-04-26 14:44:53 +10:00
Alex Coyte
512c049493 shift early_dfa construction earlier 2017-04-26 14:44:03 +10:00
Justin Viiret
07a6b6510c rose/hwlm: limit literals to eight bytes
Rework HWLM to work over literals of eight bytes ("medium length"),
doing confirm in the Rose interpreter.
2017-04-26 14:41:29 +10:00
Alex Coyte
e51b6d23b9 introduce Sheng-McClellan hybrid 2016-12-14 15:27:18 +11:00
Alex Coyte
e1e9010cac Introduce custom adjacency-list based graph 2016-12-02 11:31:33 +11:00
Alex Coyte
c94899dd44 allow sets of tops on edges 2016-10-28 14:51:46 +11:00
Matthew Barr
707fe675ea Operator precedence matters 2016-10-28 14:48:46 +11:00
Justin Viiret
9eb349a343 rose: expose smwr builder, tidy up engine build 2016-08-10 14:59:10 +10:00
Alex Coyte
981b59fd05 minor eager prefixes improvements
- count eager prefixes as always run engine when comparing with smwr
 - only check if a prefix is vacuous after adding back literal fragments
2016-08-10 14:59:10 +10:00
Justin Viiret
790683b641 rose: don't always dedupe small-block lit variants 2016-08-10 14:52:56 +10:00
Alex Coyte
691b08d170 use NGHolder::foo in favour of NFAGraph::foo 2016-08-10 14:52:56 +10:00
Alex Coyte
3a1429a621 group_weak_end is no longer used 2016-08-10 14:52:56 +10:00
Justin Viiret
e9cfbae68f workaround for freebsd/clang/libc++ build issues
Rather than relying on set's constructor from {}, explicitly construct
the set.
2016-07-08 11:07:51 +10:00
Xiang Wang
9087d59be5 tamarama: add container engine for exclusive nfas
Add the new Tamarama engine that acts as a container for infix/suffix
engines that can be proven to run exclusively of one another.

This reduces stream state for pattern sets with many exclusive engines.
2016-07-08 11:01:34 +10:00
Justin Viiret
32c866a8f9 OutfixInfo: use boost::variant for engines 2016-04-20 13:34:57 +10:00
Justin Viiret
fa27025bcb Wrap MPV puffettes in a struct 2016-04-20 13:34:57 +10:00
Justin Viiret
67b9784dae Rose: use program for all literal matches
Unifies all literal match paths so that the Rose program is used for all
of them. This removes the previous specialised "direct report" and
"multi direct report" paths. Some additional REPORT instruction work was
necessary for this.

Reworked literal construction path at compile time in prep for using
program offsets as literal IDs.

Completely removed the anchored log runtime, which is no longer worth
the extra complexity.
2016-04-20 13:34:54 +10:00
Justin Viiret
6bcccb4c5d Rose: further generalise literal dedupe work 2016-03-01 11:36:10 +11:00
Justin Viiret
f519fd9bcd Rose: don't assume roles with >1 lit need dedupe
We only require dedupe for such roles when they have literals that can
arrive simultaneously (i.e. one literal overlaps with the suffix of
another).
2016-03-01 11:36:10 +11:00
Justin Viiret
10cda4cc33 Rose: Move all literal operations into program
Replace the RoseLiteral structure with more program instructions; now,
instead of each literal ID leading to a RoseLiteral, it simply has a
program to run (and a delay rebuild program).

This commit also makes some other improvements:

 * CHECK_STATE instruction, for use instead of a sparse iterator over a
   single element.
 * Elide some checks (CHECK_LIT_EARLY, ANCHORED_DELAY, etc) when not
   needed.
 * Flatten PUSH_DELAYED behaviour to one instruction per delayed
   literal, rather than the mask/index-list approach used before.
 * Simple program cache at compile time for deduplication.
2016-03-01 11:23:56 +11:00
Justin Viiret
48c9d7c381 Remove use of depth from Rose entirely 2016-03-01 11:23:11 +11:00
Justin Viiret
d67c7583ea rose: Extend the interpreter to handle more work
- Use program for EOD sparse iterator
- Use program for literal sparse iterator
- Eliminate RoseRole, RosePred, RoseVertexProps::role
- Small performance optimizations
2016-03-01 11:16:02 +11:00
Justin Viiret
9cb2233589 rose: Use an interpreter for role runtime
Replace much of the RoseRole structure with an interpreted program,
simplifying the Rose runtime and making it much more flexible.
2016-03-01 11:16:02 +11:00
Alex Coyte
a7d8dafb71 detach the sidecar 2016-03-01 11:13:23 +11:00
Alex Coyte
5e0d10d805 Allow lag on castle infixes to be reduced
Reducing lag allows for castles to be merged more effectively
2016-03-01 11:10:13 +11:00
Justin Viiret
8c09d054c9 Add per-top findMinWidth etc for NFA graphs 2015-12-07 09:38:32 +11:00
Justin Viiret
da23e8306a assignDkeys: use flat_set<ReportID>, not set 2015-12-07 09:38:32 +11:00
Justin Viiret
8dac64d1dc findMinWidth, findMaxWidth: width for a given top
Currently only implemented for Castle suffixes.
2015-12-07 09:38:32 +11:00
Justin Viiret
03953f34b1 RoseDedupeAuxImpl: collect unique suffixes first 2015-12-07 09:38:32 +11:00