I’ve just been heavily involved in a project involving riak, which, for the uninitiated, is a new-fangled NoSQL key-value store. If that doesn’t mean anything to you, you can pretty much stop reading this article right here.
I’d always used SQL datastores prior to this, and a few things about riak tripped me up. We had a data structure that was logically “a parent object with zero or more child objects.” In SQL, the logical approach to that structure is to have a table that contains the parent object, and then another table for the child objects which has a foreign key back to the parent object. And so we pretty much duplicated that pattern in riak: we have a bucket for the parent, a bucket for the children, and we constructed a search index linking the two together.
So right there, that was probably a mistake. Unlike SQL, you should only follow that pattern in riak if you really, really, really have to. We didn’t, and would probably have been 100% better off if we had just stored the entire composite object in one bucket.
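To make the contrast concrete, here’s a hypothetical sketch of the two layouts. The field names and keys are made up; the point is just the shape of the data, where each dict stands in for the JSON value stored under one riak key:

```python
import json

# Two-bucket layout (what we did): parent and children live in separate
# buckets, tied together by a search index over parent_id. Every read of
# the composite object touches multiple keys, each eventually consistent.
parent = {"id": "order-42", "customer": "acme"}
children = [
    {"parent_id": "order-42", "sku": "widget", "qty": 3},
    {"parent_id": "order-42", "sku": "gadget", "qty": 1},
]

# One-bucket layout (what we should have done): the whole composite object
# is a single value under a single key, so one read gets a coherent snapshot.
composite = {
    "id": "order-42",
    "customer": "acme",
    "items": [
        {"sku": "widget", "qty": 3},
        {"sku": "gadget", "qty": 1},
    ],
}

doc = json.dumps(composite)  # one key, one value, no cross-bucket join
```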
But let’s say that you either made the mistake or there’s a genuinely good reason that it’s not a mistake for you. What we started to realize was that when we stored a parent/children object combination and then retrieved it a few seconds later, there was a very good chance that either the parent or the children would come back stale.
We knew, of course, that riak is eventually consistent and all that, but I had assumed that this was a misty, far-off problem for people running massive operations across dozens of nodes and millions of operations. Turns out, no: you have to account for it right from the start.
But it’s not that hard to account for. We had a method of telling (within limits) whether the children and the parent object were in agreement. We were already running that kind of validity check when we saved, so we just reused the same check in our load function. Then, once we’ve composed the object in code, we run the validity check. If it comes back invalid, we just try the entire load again, eventually backing off in time. Our scheme is that the first retry is instantaneous, and then each additional retry backs off by a power of two seconds: 1, 2, 4, 8. If after the 8-second delay and reload the object is still inconsistent, we give up and assume that there’s an actual data error.
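The retry loop described above can be sketched in a few lines. This is a minimal version, not our production code; `load` and `is_valid` are hypothetical stand-ins for the real load function and the parent/children agreement check:

```python
import time

def load_consistent(load, is_valid, backoffs=(0, 1, 2, 4, 8)):
    """Load an object, retrying until it passes the validity check.

    The first retry is instantaneous (back-off of 0), and each additional
    retry backs off by a power of two seconds: 1, 2, 4, 8.
    """
    obj = load()
    for delay in backoffs:
        if is_valid(obj):
            return obj
        time.sleep(delay)
        obj = load()  # try the entire load again
    if is_valid(obj):
        return obj
    # Still inconsistent after the 8-second reload: assume a real data error.
    raise RuntimeError("object still inconsistent after all retries")
```

Keeping the back-off schedule as a parameter also makes the loop trivial to test: pass all-zero delays and count how many loads it takes before the check passes.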
I was a little dubious about this when I coded it: it’s only a few simple lines of code, and I doubted it would actually end up helping much in production. I was pretty badly wrong. We see the back-off happen all the time in our relatively small operation, and usually after one to three retries (0 to 3 seconds of total delay), the data achieves consistency. We saw an immediate improvement in our holistic sense of the health of our service after implementing it.