Richard Bucker

Response to Seven Databases in Seven Weeks

Posted at — Dec 20, 2011

For this “7 in 7” book I only glanced at the motives the author gave for selecting these particular DBs. What caught my attention was the TOC. While the title suggests that the book is going to be a reference to modern databases and the NoSQL movement, it includes Postgres. What’s curious here is that a) PSQL is not a modern database and it’s not a NoSQL database either, and b) while it is a modern implementation, none of its modern features are mentioned.

And then there is a huge gap where BDB, BerkeleyDB, should be. While BDB is sometimes considered a NoSQL database, it is not built around the CAP theorem, which is consistently attached to NoSQL DBs. What makes BDB interesting, and what would seem to be the subliminal rationale for the many query dialects of the NoSQL DBs, is an essay by Mike Olson in which he justified BDB’s APIs and the absence of a formal query language: [programmers know their data better than any query optimizer], and then [the extra steps to compile and optimize are time consuming and are better done at compile time than at runtime].

CAP is the anti-pattern to ACID. Essentially CAP comes down to a principle of economics: [pick two of the following three attributes]. A lot of time has been devoted to this paper and to the many follow-up research papers. I’m not qualified to rebut the thesis, but I always wonder if there is a spoiler out there. VoltDB has a novel approach that suggests you can, in fact, have your cake and eat it too. (It’s also absent from the book.)

The real challenge with the NoSQL movement, and with this publication, is that the projects are shipping code as fast as they can. By the time this article is posted, something new and interesting will have been deployed.

Missing from consideration: memcache, leveldb, Bigtable, S3, BDB (mentioned above), Orient, UnSQL (a completely different movement), and SQLite.

Finally, the one thing that is missing for me is a comprehensive, or at least a beginner’s, list of use-cases, the DBs that best satisfy those use-cases, and why.
For example, Riak seems to be a special-purpose DB, whereas MongoDB seems to be more of a general-purpose DB. There are still some edge cases, but when you’re talking about the volume of data that many of the NoSQL people talk about, you had better have a good plan, especially if you think you might be moving the data from one storage engine to another.