AnchoredSetVariable::resolve is called for every rule
(see RuleWithOperator::evaluate). The previous implementation allocated
a new copy of every variable, which quickly added up. In my tests,
AnchoredSetVariable::resolve function consumed 7.8% of run time.
AnchoredSetVariable (which is a multimap) values are never changed,
only added. This means it's safe to store them in std::shared_ptr,
and make resolve return shared_ptr pointing to the same object.
Other resolve implementation could also use this optimization by not
allocating new objects, however, they are not hot spots, so this
optimization was not implemented there.
In my benchmark, this raises performance from 117 requests per second to
131 RPS, and overhead is lowered from 7.8% to 2.4%.
As a bonus, replacing plain pointer with smart pointers make code
cleaner, since using smart pointers makes manual deletes no longer necessary.
Additionally, VariableOrigin is now stored in plain std::vector,
since it's wasteful to store structure containing just two integer
values using std::list<std::unique_ptr<T>>.
Using GEOIP_INDEX_CACHE on some older versions of libGeoIP (e.g. 1.5.0
which is the default version on CentOS 7) leads to "Error reading file"
error while opening completely valid GeoIP.dat:
# cat test.c
#include <stdio.h>
#include "GeoIP.h"
int main(void) {
GeoIP *g;
g = GeoIP_open("/tmp/GeoIP.dat", GEOIP_INDEX_CACHE);
if (g == NULL) {
printf("error!\n");
}
GeoIP_delete(g);
exit(0);
}
# cc -lGeoIP -o test test.c
# ./test
Error reading file /tmp/GeoIP.dat
error!
# sed -i -e 's,GEOIP_INDEX_CACHE,GEOIP_MEMORY_CACHE,' test.c
# cc -lGeoIP -o test test.c
# ./test
# geoiplookup -f /tmp/GeoIP.dat -v 8.8.8.8
GeoIP Country Edition: GEO-106FREE 20180327 Build 1 Copyright (c) 2018 MaxMind Inc All Rights Reserved
Also tested with recent GeoLite databases converted from new format
into legacy format, distributed here:
https://mailfud.org/geoip-legacy/