The Future of NoSQL

Last night I went to an NYC MySQL Group meetup entitled “Future of 21st Century Databases: CEO/CTO Discussion among Database Superstars”. Originally scheduled at AOL’s offices, it was moved to AppNexus’ headquarters due to overwhelming demand: it was packed with over 600 attendees (the organizers expected 300). That speaks volumes about the growing interest in NoSQL (Not Only SQL) databases. The evening’s guest panel featured executives from three major NoSQL database vendors: Barry Morris of NuoDB, Bob Wiederhold of Couchbase, and Eliot Horowitz of MongoDB. There was little if any debate among them, as the focus seemed to be on the features that set NoSQL apart from traditional relational data stores.

NoSQL databases offer high-scale, high-concurrency, low-latency data storage that can support a massive number of operations per second. They are ideal for large web-facing, big data applications with increasing amounts of unstructured and semi-structured data. Relational databases, on the other hand, feature rigid schemas that are difficult to change, but they easily support complex multi-table transactions. Barry Morris made the important point that relational database transactions are themselves a layer of abstraction around what are essentially “grownup filesystems.” Therefore, in his view, trying to replicate ACID (Atomicity, Consistency, Isolation, Durability) is a mistake. The alternative model is BASE (Basically Available, Soft state, Eventual consistency). In other words, relational databases are better at some things, and NoSQL at others. Here’s a little about each vendor:

  • NuoDB. The newest of the bunch, NuoDB shipped for the first time on January 15 of this year. It also differs in that the product is currently closed-source, though there are some open-source initiatives on the horizon. NuoDB is unique in that it promises “SQL on the outside, NoSQL on the inside,” meaning that developers can use SQL queries and transactions just as with relational databases, but with the scalability of NoSQL. It is a peer-to-peer system with transactional persistence and is based on “emergence,” the manifestation of intelligent behavior in groups without anyone being in charge (think of a flock of birds). If you need more performance, simply add more servers to the cluster (“scale-out”, not “scale-up”), with no need to reconfigure or repartition. Their largest deployment is a single multi-terabyte database running on 100 nodes without any sharding.
  • Couchbase. A large, well-known, open-source NoSQL database that started off as a project that used memcached with MySQL as a key-value store, combining efficient in-memory caching with persistent storage. It evolved into a document database (with indexes and queries) after merging with CouchOne. Couchbase promises ease of use, reliability, high performance, and scalability (low latency and high throughput). They have 400 customers in the social, gaming, e-commerce, enterprise, and finance sectors. Big-name clients include Zynga, DrawSomething, and Orbitz, where Couchbase replaced Oracle and practically eliminated system downtime. Their largest single database is 7TB across 123 nodes.
  • MongoDB. Another popular open-source database, also built on a document storage model, that offers features similar to those of a relational DB. Multi-document transactions are possible if the documents sit on the same slice of the architecture, and the team will keep adding relational features as long as they don’t inhibit scalability. Large clients include Craigslist and SourceForge. Their largest install is for a government entity, using 1,200 nodes with 400 shards at 250GB each and handling 20,000 transactions per second.
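
Sharding comes up in several of the descriptions above, so here is a rough sketch of what routing documents to shards can look like. It is a toy in-memory illustration, not how NuoDB, Couchbase, or MongoDB actually implement it; the class name `ShardedStore`, its methods, and the sample keys are all made up for this example.

```python
import hashlib


class ShardedStore:
    """Toy in-memory document store that routes each key to one shard.

    Hypothetical illustration only; not any vendor's API.
    """

    def __init__(self, num_shards):
        if num_shards < 1:
            raise ValueError("num_shards must be at least 1")
        # Each shard is a plain dict standing in for a separate server.
        self.shards = [{} for _ in range(num_shards)]

    def shard_for(self, key):
        # Use a stable hash (not Python's per-process randomized hash())
        # so the same key always maps to the same shard.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.shards)

    def put(self, key, document):
        # Schemaless: any dict can be stored under a key.
        self.shards[self.shard_for(key)][key] = document

    def get(self, key):
        # Returns None when the key is not present on its shard.
        return self.shards[self.shard_for(key)].get(key)


store = ShardedStore(num_shards=4)
store.put("user:1", {"name": "Ada", "tags": ["admin"]})
store.put("user:2", {"name": "Grace"})
print(store.get("user:1"))
```

With simple modulo routing like this, changing the number of shards remaps most keys, which is one reason repartitioning a sharded database tends to be painful and why "no need to repartition" is an attractive promise.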

So what does the future hold for NoSQL? First of all, the open source model is here to stay. The panelists agreed that there will also likely be convergence among platforms and consolidation among vendors. Developers will learn to better understand and appreciate the differences between relational and NoSQL databases, and to use the right tool for the job. Finally, new SQL-like, set-based data management languages will likely evolve. Thanks to the panelists for speaking and to AppNexus for letting us use their beautiful facilities!

About Martin Rybak

I am a New York area software developer and MBA with 10+ years of server-side experience on the Microsoft stack. I've also been a native iOS developer since before the days of ARC. I architect and develop full-stack web applications, iOS apps, database systems, and backend services.
