Rather that iterating over NFAAccept structures and testing individual
bits in the state structure, iterate over the state vector and index
into accept structures.
Adds report list support to this path, unified with the report lists
used for exception handling.
On 64-bit platforms, the Limex 64 model is implemented in normal GPRs.
On 32-bit platforms, however, 128-bit SSE registers are used for the
runtime implementation.
Replaces the old LimEx NFA engines, which were specialised for model
size and number of shifts, with a new set of engines that can handle a
variable number of shifts.