Scaling CouchDB by Bradley Holt, review by Luke Gotszling

by Tony Cappellini posted on Sep 18, 2011 05:55 PM last modified Jun 17, 2012 09:57 PM

CouchDB, while tolerant of failure during writes and supporting replication, suffers a bit in that it was not designed from the start for more sophisticated clustering (including failover and sharding).

This book covers approaches to mitigate this difficulty, in particular for sharding databases dynamically. The most basic technique is oversharding: creating and running more than one shard on a machine, then moving the "extra" shards to their own hardware once the need arises. Since a change stream is available, it's possible to take advantage of continuous replication for both fault tolerance and scaling reads. One of the big benefits of CouchDB is that there's a great GUI for setting up replication (a command-line option is available as well).
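Outside the GUI, replication can also be requested over HTTP by POSTing a JSON document to CouchDB's `/_replicate` endpoint. The following is a minimal sketch that only builds that request body; the database name and replica URL are made-up placeholders, and the helper `replication_request` is hypothetical rather than anything taken from the book.

```python
import json


def replication_request(source, target, continuous=True, create_target=False):
    """Build the JSON body for a POST to CouchDB's /_replicate endpoint.

    `source` and `target` are database names or full database URLs.
    This helper is illustrative only; it does not contact a server.
    """
    body = {"source": source, "target": target}
    if continuous:
        # Keep replicating as new changes arrive on the source.
        body["continuous"] = True
    if create_target:
        # Ask CouchDB to create the target database if it is missing.
        body["create_target"] = True
    return json.dumps(body, sort_keys=True)


# Placeholder names: a local "articles" database and a made-up replica host.
body = replication_request(
    "articles", "http://replica.example.com:5984/articles"
)
print(body)
```

Sending this body with a `Content-Type: application/json` header to a running node starts the replication; read traffic can then be pointed at the replica.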

BigCouch is a library that allows for easier creation and management of sharded environments using a lightweight layer built into a binary that also runs Couch. Although it is still an early release (v0.3 as of writing; v0.4 currently), this approach shows promise. Smart-proxy and dumb-proxy methods are also presented. It's possible to set some of these up manually using web servers, because Couch uses HTTP as its communication protocol. Some of these clustering techniques use a library like Lounge or Pillow to run as an intermediary proxy for routing requests. This is similar to how some deployments scale memcached on the client side.
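To make the routing idea concrete, here is a rough sketch of what a shard-aware proxy (or a client, memcached-style) might do: hash the document ID to one of a fixed number of shards, then look up which node currently hosts that shard. This is not how Lounge, Pillow, or BigCouch are implemented; the shard count, host names, and the `articles_shard<N>` naming scheme are all assumptions made for illustration.

```python
import hashlib
from urllib.parse import quote

# Fixed up front and larger than the number of machines (oversharding).
NUM_SHARDS = 8


def shard_for(doc_id, num_shards=NUM_SHARDS):
    """Map a document ID to a shard number with a stable hash."""
    digest = hashlib.md5(doc_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def url_for(doc_id, shard_nodes):
    """Return the URL of the document on whichever node hosts its shard."""
    shard = shard_for(doc_id)
    return "%s/articles_shard%d/%s" % (
        shard_nodes[shard], shard, quote(doc_id, safe="")
    )


# Initially every shard lives on one (hypothetical) machine.
shard_nodes = {s: "http://node-a.example.com:5984" for s in range(NUM_SHARDS)}

# When load grows, move half of the shards to new hardware. Only the
# shard-to-node map changes; no document is rehashed to another shard.
for s in range(4, NUM_SHARDS):
    shard_nodes[s] = "http://node-b.example.com:5984"

print(url_for("post-123", shard_nodes))
```

Because the number of shards never changes, adding a machine is a matter of replicating the affected shard databases to it and updating the map.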
 
A chapter is included on tuning and designing for scale; however, not much is said in the way of scaling complex view queries. The default language for writing view (as well as map/reduce) queries is JavaScript, but we can write them in Python thanks to the couchdb-python library's view server. This can help reduce the burden of creating queries for Python back-ends.
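As an illustration of what a Python view might look like: to my understanding, the couchdb-python view server accepts map functions written as a function that takes a document and yields key/value pairs. The sketch below follows that shape but runs entirely locally; the document fields (`type`, `tags`) are invented, and `run_view` is a hypothetical stand-in that sums values per key, not the view server itself.

```python
from collections import defaultdict


def fun(doc):
    """Map function: emit one (tag, 1) row for each tag on a post."""
    if doc.get("type") == "post":
        for tag in doc.get("tags", []):
            yield tag, 1


def run_view(map_fun, docs):
    """Local stand-in for map plus a summing reduce, grouped by key."""
    counts = defaultdict(int)
    for doc in docs:
        for key, value in map_fun(doc):
            counts[key] += value
    return dict(counts)


# Made-up sample documents.
docs = [
    {"_id": "a", "type": "post", "tags": ["couchdb", "python"]},
    {"_id": "b", "type": "post", "tags": ["couchdb"]},
    {"_id": "c", "type": "comment"},
]

tag_counts = run_view(fun, docs)
print(tag_counts)
```

Being able to exercise a map function against sample documents like this, without a server, is one practical advantage of writing views in the same language as the back-end.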
 
The author concludes with load testing (using Tsung) and monitoring information (using Munin). The steps and instructions are very detailed, albeit tailored to Ubuntu. The book is only 58 pages in length, and the paper version would benefit from an index. All in all, the book goes over a lot of relevant information pertaining to scaling CouchDB in a clear format. Anyone facing scaling issues with Couch would find the book a great starting point.
 
