I just finished reading this post by Jonathan Ellis that brought up some interesting ideas about what CouchDB is and, more specifically, what it isn't. I started to write a comment trying to explain some of my disagreement and realized I was writing an entire post. So instead of spamming his comments, I figure I'll just post my thoughts in my own little corner.
Jonathan's main beef appears to be that CouchDB considers itself a distributed database and he disagrees with that billing. So before I get too far ahead of myself, I thought I'd pull up Wikipedia's definition:
A distributed database is a database that is under the control of a central database management system (DBMS) in which storage devices are not all attached to a common CPU. It may be stored in multiple computers located in the same physical location, or may be dispersed over a network of interconnected computers.
As with all things in life, the true meaning of that definition depends on how you parse it. As a proponent of the "CouchDB is distributed" idea, I would say that we meet the definition in full. Someone arguing against CouchDB being distributed would point to the phrase "under the control of a central database management system". Or to put it another way, with no apparent coordination of a set of CouchDB nodes, is it really a distributed database?
So, to answer whether CouchDB is a distributed database, I'll merely ask, "Is the web a distributed system?" Because really, the answers are one and the same. Some would argue that the web is merely the collection of emergent properties of the underlying systems. I would argue that the web, while having no central coordinating authority, is a distributed system. So, in reality, it's merely, "You say tomato, I say tomato."
Theoretical hand-wavy arguments aside, I think I understand Jonathan's concern. CouchDB does not provide users with a method of automatically spreading load amongst a set of physically distinct nodes. No automatic document sharding or re-balancing as nodes enter and leave the system. Yet. This sort of work is planned: it's been discussed, and general methods and algorithms have been proposed on IRC and the mailing lists. The thing is, such features haven't hit the top of the priority queue in terms of their cost/benefit ratio. I think one of Damien Katz's less appreciated traits is that he appears to be extremely focused on developing features in order of the most benefit to the community, instead of in the order of neato-ness. That or we have quite different definitions of neato.
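To make "document sharding" a little more concrete, here is a rough sketch of the kind of thing you'd have to do by hand on the client side today. To be clear, this is not anything CouchDB ships and not one of the approaches proposed on the mailing lists; the node URLs, the shard_for helper, and the choice of naive hash-modulo placement are all my own assumptions, picked only for illustration.

```python
import hashlib
from typing import Sequence


def shard_for(doc_id: str, nodes: Sequence[str]) -> str:
    """Pick a node for a document by hashing its id (naive modulo placement)."""
    digest = hashlib.md5(doc_id.encode("utf-8")).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]


# Hypothetical node URLs, used only for illustration.
nodes = ["http://couch-a:5984", "http://couch-b:5984", "http://couch-c:5984"]
doc_ids = [f"doc-{i}" for i in range(1000)]

# Where each document lives with three nodes...
before = {d: shard_for(d, nodes) for d in doc_ids}

# ...and where it would live after a fourth node joins.
grown = nodes + ["http://couch-d:5984"]
after = {d: shard_for(d, grown) for d in doc_ids}

# Count the documents whose home node changed when the cluster grew.
moved = sum(1 for d in doc_ids if before[d] != after[d])
print(f"{moved} of {len(doc_ids)} documents would have to move")
```

With plain modulo placement, most documents change homes when a single node joins, which is exactly the re-balancing problem. Doing this well (consistent hashing or something like it, plus actually moving the data) is real work, and that's why it has to earn its spot in the priority queue.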
Lots of people seem to mistake CouchDB for a replacement for an RDBMS. They may try to convince you otherwise by saying things like, "Now, I know CouchDB isn't trying to replace the RDBMSs out there, but..." and then launch into a laundry list of things CouchDB doesn't do. I don't think it's a conscious decision at all. I spent my first three or four months with CouchDB trying to figure out how to bend it to my preconceived notions. It turns out I was bending the wrong thing. It was how I thought of CouchDB that needed to change.
I have spent a fair amount of time with PostgreSQL. It's awesome. I very much dislike MySQL. It's not awesome. The reasons I like PostgreSQL are all the reasons that I imagine Jonathan is alluding to when he says that you should ask your favorite non-MySQL DBA why those fancy features exist. The thing is, he's also disregarding that a huge part of the market using RDBMSs isn't using these features. He also doesn't mention the fact that all the talk about denormalizing to improve scalability is spitting in the face of these features.
The most important part of this entire post is the following statement: The only time when CouchDB should be considered as a replacement for an RDBMS is when an RDBMS was the wrong choice in the first place.
Just as CouchDB is not always the right tool for the job, RDBMSs are also not always the right answer. Now, to be clear, my financial and medical institutions better damn well be using some sort of RDBMS that has all of those fancy features. My blog, on the other hand (if it weren't static), does not require materialized views or pivot tables.
I wrote this pretty quickly, so it's probably got errors and whatnot. I'll be re-reading it and updating over the next day or so.