prefilter: workaround for \b in UCP and !UTF8 mode

For now, just drop the word-boundary assertion in this mode. The result is still a superset of the true matches, which is all that prefiltering semantics require.
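To illustrate why dropping the assertion is safe, here is a minimal sketch. It uses std::regex (ECMAScript grammar) as a stand-in rather than this project's own matcher, and the helper names `matches` and `relaxedIsSuperset` are hypothetical. It checks that, on a set of inputs, every input matched by a pattern containing \b is also matched by the same pattern with \b removed.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not part of this codebase): true if `pattern` matches
// anywhere in `input`, using std::regex with its default ECMAScript grammar.
bool matches(const std::string &pattern, const std::string &input) {
    return std::regex_search(input, std::regex(pattern));
}

// Checks the superset property on a fixed set of inputs: whenever the exact
// pattern matches an input, the relaxed pattern must match it as well.
// The relaxed pattern is allowed to match additional inputs.
bool relaxedIsSuperset(const std::string &exact, const std::string &relaxed,
                       const std::vector<std::string> &inputs) {
    for (const auto &in : inputs) {
        if (matches(exact, in) && !matches(relaxed, in)) {
            return false; // an exact match was lost, so not a superset
        }
    }
    return true;
}
```

For example, `relaxedIsSuperset("\\bcat\\b", "cat", {"a cat sat", "concatenate", "dog"})` holds: the relaxed pattern additionally matches "concatenate", which is the kind of false positive a prefilter caller is expected to confirm or reject with an exact matcher.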
2026-01-17 16:00:26 +03:00 · 2017-01-18 11:33:57 +11:00
parent 734eb2ce62
commit cacf07fe9b
1 changed file with 10 additions and 0 deletions
--- a/src/parser/prefilter.cpp
+++ b/src/parser/prefilter.cpp
@@ -295,6 +295,16 @@ public:

    Component *visit(ComponentWordBoundary *c) override {
        assert(c);
+
+        // TODO: Right now, we do not have correct code for resolving these
+        // when prefiltering is on, UCP is on, and UTF-8 is *off*. For now, we
+        // just replace with an empty sequence (as that will return a superset
+        // of matches).
+        if (mode.ucp && !mode.utf8) {
+            return new ComponentSequence();
+        }
+
+        // All other cases can be prefiltered.
        c->setPrefilter(true);
        return c;
    }