Mirror of https://github.com/VectorCamp/vectorscan.git, synced 2025-06-28 08:31:00 +03:00
The generated documentation continues to refer to Hyperscan despite the project now being Vectorscan. Let's replace many of the Hyperscan references with Vectorscan. At the same time, let's resync the documentation here with the Vectorscan README. This updates the supported platforms/compilers and build options. Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
.. include:: <isonum.txt>

.. _intro:

############
Introduction
############

Vectorscan is a software regular expression matching engine designed with
high performance and flexibility in mind. It is implemented as a library that
exposes a straightforward C API.

The Vectorscan API itself is composed of two major components:

***********
Compilation
***********

These functions take a group of regular expressions, along with identifiers and
option flags, and compile them into an immutable database that can be used by
the Vectorscan scanning API. This compilation process performs considerable
analysis and optimization work in order to build a database that will match the
given expressions efficiently.

If a pattern cannot be built into a database for any reason (such as the use of
an unsupported expression construct, or the overflowing of a resource limit),
an error will be returned by the pattern compiler.

Compiled databases can be serialized and relocated, so that they can be stored
to disk or moved between hosts. They can also be targeted to particular
platform features (for example, the use of Intel\ |reg| Advanced Vector Extensions
2 (Intel\ |reg| AVX2) instructions).

See :ref:`compilation` for more detail.

********
Scanning
********

Once a Vectorscan database has been created, it can be used to scan data in
memory. Vectorscan provides several scanning modes, depending on whether the
data to be scanned is available as a single contiguous block, whether it is
distributed amongst several blocks in memory at the same time, or whether it is
to be scanned as a sequence of blocks in a stream.

Matches are delivered to the application via a user-supplied callback function
that is called synchronously for each match.

For a given database, Vectorscan provides several guarantees:

* No memory allocations occur at runtime with the exception of two
  fixed-size allocations, both of which should be done ahead of time for
  performance-critical applications:

  - **Scratch space**: temporary memory used for internal data at scan time.
    Structures in scratch space do not persist beyond the end of a single scan
    call.
  - **Stream state**: in streaming mode only, some state space is required to
    store data that persists between scan calls for each stream. This allows
    Vectorscan to track matches that span multiple blocks of data.

* The sizes of the scratch space and stream state (in streaming mode) required
  for a given database are fixed and determined at database compile time. This
  means that the memory requirements of the application are known ahead of
  time, and these structures can be pre-allocated if required for performance
  reasons.

* Any pattern that has successfully been compiled by the Vectorscan compiler can
  be scanned against any input. There are no internal resource limits or other
  limitations at runtime that could cause a scan call to return an error.

See :ref:`runtime` for more detail.

*****
Tools
*****

Some utilities for testing and benchmarking Vectorscan are included with the
library. See :ref:`tools` for more information.

************
Example Code
************

Some simple example code demonstrating the use of the Vectorscan API is
available in the ``examples/`` subdirectory of the Vectorscan distribution.