Kyle Kingsbury gives an excellent overview of the way that distributed systems deal with network partitions. And if that didn’t make any sense to you, bail out now, this article is not for you.
The things I’d like to emphasize from my own experience with Riak, though, are these:
In this case, a healthy cluster lost 71% of operations: when two clients wrote to the value at roughly the same time, Riak simply kept the write with the higher timestamp and discarded the others, even though the discarded writes might have added new numbers.
This is not linearizable consistency, and not all data structures can be expressed as CRDTs.
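To make the failure mode above concrete, here is a toy simulation (not Riak’s actual client API) of two clients racing to update a shared set. Under last-write-wins, the sibling with the higher timestamp silently replaces the other; under a set-union merge (the G-Set style CRDT behavior), both writes survive:

```python
def lww_resolve(siblings):
    """Last-write-wins: keep only the sibling with the highest timestamp."""
    return max(siblings, key=lambda s: s["ts"])["value"]

def union_resolve(siblings):
    """CRDT-style merge: take the union of all sibling sets."""
    merged = set()
    for s in siblings:
        merged |= s["value"]
    return merged

# Both clients read {1, 2}, then write back concurrently:
siblings = [
    {"ts": 100, "value": {1, 2, 3}},  # client A added 3
    {"ts": 101, "value": {1, 2, 4}},  # client B added 4
]

print(lww_resolve(siblings))    # {1, 2, 4} -- client A's write is silently lost
print(union_resolve(siblings))  # {1, 2, 3, 4} -- both writes survive
```

The union merge works here precisely because a grow-only set is a CRDT: merging is commutative, associative, and idempotent. A data type without such a merge (say, “the current value of a counter”) has no equivalent escape hatch.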
The thing about Riak is that you trade better behavior in the face of network partitions for a lot of pain and inconvenience right away. You absolutely cannot use Riak, except in rare cases (where, basically, you don’t care if arbitrary numbers of writes are thrown away), without immediately investing in merge functions. When we started using Riak, we were tempted to believe that conflicts and non-last-write-wins behavior were rare, ignorable cases. Read the above again: a healthy cluster lost 71% of operations.
The win for Riak is recovery from catastrophic scenarios in which all other solutions simply screw you. But in return for that protection, the amount of effort you need to spend just dealing with daily operations is enormously greater.
You probably aren’t saving set data, where simple merge functions are easy to write and reliable. In fact, the first question you should ask when you consider using Riak is “can I write a merge function for this data type at all?” If the answer is no, and you care whether unbounded numbers of writes are thrown on the floor in normal, non-partitioned operation, then just walk away from Riak. If you can write the merge function, write it from the very start, and commit to maintaining it.
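What does a merge function look like for data that isn’t a plain set? A hedged sketch, assuming a hypothetical record shape (a user profile with a tag set and a last-login timestamp — neither is from the original text): each field needs its own deterministic, order-independent resolution rule, and you have to decide those rules up front.

```python
def merge_profiles(siblings):
    """Merge conflicting versions of a (hypothetical) profile record.

    Per-field rules, chosen so the result is the same regardless of
    the order siblings arrive in:
      - tags:       set union, so no tag added on either side is lost
      - last_login: maximum, since a newer login strictly supersedes
    """
    merged = {"tags": set(), "last_login": 0}
    for s in siblings:
        merged["tags"] |= set(s["tags"])
        merged["last_login"] = max(merged["last_login"], s["last_login"])
    return merged

a = {"tags": {"admin"}, "last_login": 100}
b = {"tags": {"beta"}, "last_login": 200}
print(merge_profiles([a, b]))  # tags: {'admin', 'beta'}, last_login: 200
```

Note what this costs you: every new field added to the record needs a merge rule, and some fields (e.g. “current balance”) may have no sensible one — which is exactly the “walk away” case above.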
Riak’s behavior in the event of network partitioning is wonderful, and network partitions are an inevitable part of life, so it’s great that it’s available. But you don’t want good behavior during network partitions if it means constant pain during normal operations, and if you’re a small, labor-constrained organization, you may want to use a simpler data store that performs worse under partitions, simply on the theory of deferred pain. It feels very adult and foresightful to plan ahead for partitions, clustering, and growth, but labor-constrained organizations need to kick some problems down the road and deal with them later.