Re: FOSS Distributed database

14 Jun 2020

The paper I linked for Anna describes "v0", which has since been extended by
"v1". The C++ implementation follows "v1".

https://arxiv.org/abs/1809.00089

v0 implements any-scale, coordination-free, partitioned lattice replication
with simple multi-master replication, using a single replication factor for
the whole system.
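
To make "a single number for the whole system" concrete, here is a minimal
sketch of placing a key on R nodes. The function name replicas_for and the
simple modular placement are my own assumptions for illustration; I am not
claiming this is how anna-kvs assigns keys to nodes.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical helper: returns the indices of the nodes that hold `key`,
// given `node_count` nodes and one system-wide replication factor.
std::vector<std::size_t> replicas_for(const std::string& key,
                                      std::size_t node_count,
                                      std::size_t replication_factor) {
  std::vector<std::size_t> nodes;
  if (node_count == 0) {
    return nodes;  // no nodes, nowhere to place the key
  }
  // Never ask for more replicas than there are nodes.
  const std::size_t r = std::min(replication_factor, node_count);
  // Hash the key to pick the first replica, then take the next r-1 nodes.
  const std::size_t first = std::hash<std::string>{}(key) % node_count;
  for (std::size_t i = 0; i < r; ++i) {
    nodes.push_back((first + i) % node_count);
  }
  return nodes;
}
```

Every key gets the same number of replicas here, whether it is read once a
day or a million times a second. Any of the returned nodes can accept a write
(multi-master), and the replicas converge by merging lattice values.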

v1 replaces the single system-wide replication factor with selective (per-key)
replication, and adds vertical tiering (in-memory vs. persistent storage) and
horizontal elasticity (scaling the cluster with load to keep latency within a
bound). This adds two services around the core anna-kvs, namely anna-monitor
and anna-router.

The monitor watches the cluster and tunes the selective replication so that
hot keys are available on many in-memory nodes, while cold keys exist on fewer
on-disk nodes. If the cluster is under- or over-utilized, the monitor can
remove or add resources (if it has been given the functionality and authority
to do so; otherwise it may just warn its administrator accordingly).
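
As a rough illustration of the kind of decision the monitor makes, here is a
minimal sketch of a threshold-based policy. All names (Policy, Placement,
ScaleAction, place, plan, scale) and all thresholds are hypothetical; the real
anna-monitor policy is more involved, so treat this only as a picture of the
idea.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical tuning knobs, chosen for illustration only.
struct Policy {
  std::uint64_t hot_threshold;      // accesses per window at which a key is hot
  std::size_t hot_memory_replicas;  // in-memory replicas for a hot key
  std::size_t cold_disk_replicas;   // on-disk replicas for a cold key
  double low_utilization;           // below this, the cluster is over-provisioned
  double high_utilization;          // above this, the cluster is overloaded
};

struct Placement {
  std::size_t memory_replicas;
  std::size_t disk_replicas;
};

enum class ScaleAction { RemoveNode, None, AddNode };

// Hot keys go to many in-memory nodes, cold keys to fewer on-disk nodes.
Placement place(std::uint64_t accesses, const Policy& p) {
  if (accesses >= p.hot_threshold) {
    return Placement{p.hot_memory_replicas, 0};
  }
  return Placement{0, p.cold_disk_replicas};
}

// Compute a placement for every key from its access count in the last window.
std::unordered_map<std::string, Placement> plan(
    const std::unordered_map<std::string, std::uint64_t>& access_counts,
    const Policy& p) {
  std::unordered_map<std::string, Placement> result;
  for (const auto& entry : access_counts) {
    result[entry.first] = place(entry.second, p);
  }
  return result;
}

// Decide whether the cluster should grow, shrink, or stay as it is.
ScaleAction scale(double utilization, const Policy& p) {
  if (utilization > p.high_utilization) return ScaleAction::AddNode;
  if (utilization < p.low_utilization) return ScaleAction::RemoveNode;
  return ScaleAction::None;
}
```

A monitor without the authority to act on a ScaleAction would simply report
it to the administrator instead of adding or removing a node.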

The router abstracts away the indirections of the elastic system from clients.

v0 was approximately 2k lines of C++ excluding external libraries (zmq and
protobufs) but including the lattice library and client code.  (Probably
excluding comments and blank lines.)

As of now, v1 is 3166 lines of code, excluding comments and blank lines.

There are a few lattices defined with different consistency semantics:
(causally) (un)ordered (multi)values, i.e. {get,put} key value, {get,put}_set
key [values], and {get,put}_causal key value. It only takes about 10-20 lines
of additional code to implement different semantics if your application needs
them.
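
To give a feel for why a new lattice is so little code, here is a minimal
sketch of two of them. The type names LwwLattice and SetLattice and their
layout are my own, not the ones in Anna's lattice library; the only thing a
lattice really needs is a merge that is commutative, associative and
idempotent.

```cpp
#include <cstdint>
#include <set>
#include <string>
#include <tuple>

// Hypothetical types for illustration; not Anna's actual lattice classes.

// Last-writer-wins register: merge keeps the entry with the larger timestamp.
struct LwwLattice {
  std::uint64_t timestamp;
  std::string value;

  void merge(const LwwLattice& other) {
    // Break timestamp ties on the value so that merge stays commutative.
    if (std::tie(other.timestamp, other.value) >
        std::tie(timestamp, value)) {
      *this = other;
    }
  }
};

// Unordered multi-value: merge is set union.
struct SetLattice {
  std::set<std::string> values;

  void merge(const SetLattice& other) {
    if (&other == this) {
      return;  // merging with itself changes nothing
    }
    values.insert(other.values.begin(), other.values.end());
  }
};
```

Because merge gives the same result regardless of the order in which updates
arrive, replicas can exchange state without coordinating and still converge.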

Finally, v0 was a prototype, and I haven't seen any indication that v1 is
"production ready", but it does not appear to be far from it.

I have fixed one issue and have reported a few more, but these are minor and
should be easy to resolve. Hint: don't worry about the current 100% CPU
utilization. That will likely be fixed by replacing just one async operation
with a blocking one.


James McGlashan