Date Published: February 10, 2018

A Manager’s Guide to the Database Galaxy – Part 4 (NoSQL Document Stores)

PART 4

In the last blog we examined different relational database offerings and considered some of their unique features and differences. In this blog we compare and evaluate different NoSQL offerings, starting with Document Stores, to highlight some of the key differences between otherwise similar technologies. We also consider differences in performance and the costs associated with running each database on premises or in the cloud, so that it becomes easier to recognize the database that best suits your needs.

2. NoSQL – Document Stores

| Name | Amazon DynamoDB | Couchbase | CouchDB | MongoDB |
| --- | --- | --- | --- | --- |
| Description | Hosted, scalable database service | JSON-based document store derived from CouchDB with memcached-compatible interface | Native JSON document store, scalable from globally distributed server-clusters to mobile phones | MongoDB is a free and open-source cross-platform document-oriented database |
| Primary DB Model | Document Store | Document Store | Document Store | Document Store |
| Additional DB Models | Key-Value Store | None | None | Key-Value Store |
| Popularity Ranking (DBs Overall) | #21 | #24 | #28 | #5 |
| Popularity Ranking (in Document Stores) | #4 | #3 | #4 | #1 |
| Developer | Amazon | Couchbase, Inc. | Apache Software Foundation | MongoDB, Inc. |
| Initial Release | 2012 | 2011 | 2005 | 2009 |
| Current Release | Last update August 2017 | 5.1.0, February 2018 | 2.1.1, November 2017 | 3.6.4, April 2018 |
| License | Commercial | Open Source | Open Source | Open Source |
| Cloud-Based | Yes | No | No | No |
| Implementation Language | Hosted | C, C++, Go, Erlang | Erlang | C++ |
| Server Operating Systems | Hosted (access via API; can be used with apps running any OS: Linux, Windows, iOS, Android, Solaris, AIX, HP-UX, etc.) | Linux, OS X, Windows | Android, BSD, Linux, OS X, Solaris, Windows | Linux, OS X, Solaris, Windows |
| Data Scheme | Schema-free | Schema-free | Schema-free | Schema-free |
| Typing | Yes | Yes | No | Yes |
| XML Support | No, XML<->JSON translator needed | No, XML<->JSON translator needed | No, XML<->JSON translator needed | No, XML<->JSON translator needed |
| Secondary Indexes | Yes | Yes | Yes | Yes |
| SQL | No | SQL-like Query Language used (N1QL) | No | No |
| APIs / Access Methods | RESTful HTTP API | RESTful HTTP API, Memcached Protocol | RESTful HTTP/JSON API | Proprietary Protocol using JSON |
| Supported Programming Languages | .Net, ColdFusion, Erlang, Groovy, Java, JavaScript, Perl, PHP, Python, Ruby | .Net, C, Clojure, ColdFusion, Erlang, Go, Java, JavaScript, Perl, PHP, Python, Ruby, Scala, Tcl | C, C#, ColdFusion, Erlang, Haskell, Java, JavaScript, Lisp, Lua, Objective C, OCaml, Perl, PHP, PL/SQL, Python, Ruby, SmallTalk | Actionscript, C, C#, C++, Clojure, ColdFusion, D, Dart, Delphi, Erlang, Go, Groovy, Haskell, Java, JavaScript, Lisp, Lua, MatLab, Perl, PHP, PowerShell, Prolog, Python, R, Ruby, Scala, SmallTalk |
| Server-Side Scripts | No | JavaScript | JavaScript | JavaScript |
| Triggers | Yes | Yes | Yes | No |
| Partitioning Methods | Sharding | Sharding | Sharding | Sharding |
| Replication Methods | Yes | Master-Master Replication, Master-Slave Replication | Master-Master Replication, Master-Slave Replication | Master-Slave Replication |
| MapReduce | No | Yes | Yes | Yes |
| Consistency Concepts | Eventual Consistency, Immediate Consistency | Eventual Consistency, Immediate Consistency | Eventual Consistency | Eventual Consistency, Immediate Consistency |
| Foreign Keys | No | No | No | No |
| Transaction Concepts | No | No | No | No |
| Concurrency | Yes | Yes | Yes | Yes |
| Durability | Yes | Yes | Yes | Yes |
| In-Memory Capabilities | Not standard, only via DAX | Yes, depending on bucket mode | No | Yes |
| User Concepts | Access rights and roles defined via AWS IAM | User and Admin separation via password-based and LDAP integrated Authentication | Access and user rights defined per database | Access rights for users and roles defined per database |

Distinguishing Features

Although it is valid to argue that including Amazon’s cloud-hosted DynamoDB does not make for an apples-to-apples comparison with the other Document Stores, Amazon does not offer a comparable non-hosted product, and as DynamoDB is the 4th most popular document store, we felt it was relevant to include. It is the only cloud-hosted document store in this comparison and also the only one with a commercial license; the other three databases are open source.

All of the compared document stores are schema-free, but CouchDB is the only one that does not offer typing, which makes ingesting new data types easier but increases the management effort and the potential for errors. Furthermore, none of the considered document stores has native XML support; all require an XML<->JSON translator of some sort. Couchbase is the only option that offers SQL-like queries, via the N1QL language.

Whilst MongoDB supports the greatest number of programming languages, it offers less choice in replication methods: it provides only master-slave replication by default, whereas Couchbase and CouchDB also offer master-master replication. Lastly, while MongoDB, Couchbase, and DynamoDB all have eventual and immediate consistency options, CouchDB only has eventual consistency.
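For readers who want to see the difference between eventual and immediate consistency in concrete terms, here is a minimal toy sketch in Python. It does not use any of the four databases or their APIs; the `Cluster` and `Replica` classes and the `order-17` key are hypothetical names made up for illustration, and real systems replicate in the background rather than through an explicit call.

```python
# Toy model of master-slave replication (illustrative only; not any vendor's API).
class Replica:
    def __init__(self):
        self.data = {}


class Cluster:
    def __init__(self):
        self.master = Replica()
        self.slave = Replica()
        self.pending = []  # writes accepted by the master but not yet copied to the slave

    def write(self, key, value):
        # Writes always go to the master and are queued for the slave.
        self.master.data[key] = value
        self.pending.append((key, value))

    def replicate(self):
        # Apply the queued writes to the slave, in order.
        for key, value in self.pending:
            self.slave.data[key] = value
        self.pending.clear()

    def read(self, key, immediate=False):
        # Immediate consistency: read from the master, which has every write.
        # Eventual consistency: read from the slave, which may lag behind.
        source = self.master if immediate else self.slave
        return source.data.get(key)


cluster = Cluster()
cluster.write("order-17", "shipped")
stale = cluster.read("order-17")                  # slave has not caught up yet
fresh = cluster.read("order-17", immediate=True)  # master already has the write
cluster.replicate()
caught_up = cluster.read("order-17")              # slave now agrees with the master
print(stale, fresh, caught_up)
```

The eventually consistent read is cheaper to serve from any replica, but it can briefly return old data; that is the trade-off behind the "Consistency Concepts" row in the table.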

Performance

In terms of performance there are some significant differences between the databases compared. Whilst MongoDB has much faster read speeds, CouchDB and Couchbase can run on Apple iOS and Android devices, making them great for mobile applications. If you need maximum throughput or have a rapidly growing database, MongoDB could be the better option. If you need a database that runs on mobile, or you need master-master replication or single-server durability, then CouchDB is a good choice.

Where does Couchbase come into all this? MongoDB has some nice features and can be a good fit for some use cases, typically small-scale applications running on a few nodes. But companies report problems when trying to scale MongoDB to support more users and bigger workloads on clusters with multiple nodes.

This is where Couchbase shines. Couchbase has an entirely different, in-memory architecture that’s purpose-built to deliver consistent high performance in distributed environments, at scale. Its Multi-Dimensional Scaling (MDS) allows users to add nodes that expand specific services (data, indexing, and query) to accommodate different workload types and growth patterns. This way, users get consistently fast performance at scale along with exceptional agility and ease of development and deployment.

Couchbase has user-defined views or maps which are optimized by the system and only reindexed when the underlying document changes significantly. This makes Couchbase ideal for situations where the structure of your documents changes infrequently and you know in advance what kinds of queries you will be executing. Couchbase also offers excellent support for offline databases and built-in master-master replication, making it a good candidate for mobile and other occasionally connected devices.
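To illustrate the idea of a pre-defined view, here is a rough Python sketch of a map function and a simple sum-style reduce applied to a handful of documents. It is not Couchbase's actual view engine (where, per the table above, server-side scripts are written in JavaScript); the function names and the sample documents are hypothetical.

```python
from collections import defaultdict


def map_order(doc):
    """Map step: emit (key, value) pairs only for the documents we care about."""
    if doc.get("type") == "order":
        yield doc["customer"], doc["total"]


def build_view(docs, map_fn):
    """Run the map function over every document and sum the values per key (reduce step)."""
    view = defaultdict(int)
    for doc in docs:
        for key, value in map_fn(doc):
            view[key] += value
    return dict(view)


# Hypothetical sample documents; note that they do not all share the same fields.
docs = [
    {"type": "order", "customer": "acme", "total": 120},
    {"type": "order", "customer": "acme", "total": 80},
    {"type": "order", "customer": "globex", "total": 45},
    {"type": "customer", "name": "acme", "city": "Berlin"},
]

totals_by_customer = build_view(docs, map_order)
print(totals_by_customer)
```

Because the question ("total order value per customer") is fixed in advance, the result can be computed once and kept up to date as documents change, which is why this style suits workloads with known, repeated queries.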

MongoDB is great for dynamic queries if you prefer to define indexes rather than map/reduce functions. MongoDB is also better if your data changes a lot, filling up disks. CouchDB or Couchbase are better for accumulating large amounts of occasionally changing data on which pre-defined queries are to be run. Overall, these differences make MongoDB the fastest of the three databases but not necessarily the most reliable at scale or the most user-friendly; for example, it does not come with a default administration console/GUI.
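As a rough illustration of the index-based approach, the Python sketch below builds a secondary index on one field and then answers ad hoc lookups against it. It is a toy model rather than MongoDB's query engine or driver API; `build_index`, `find`, and the sample documents are hypothetical.

```python
from collections import defaultdict


def build_index(docs, field):
    """Map each value of `field` to the ids of the documents that contain it."""
    index = defaultdict(list)
    for doc_id, doc in docs.items():
        if field in doc:
            index[doc[field]].append(doc_id)
    return index


def find(docs, index, value):
    """Answer an ad hoc lookup from the index instead of scanning every document."""
    return [docs[doc_id] for doc_id in index.get(value, [])]


# Hypothetical sample documents keyed by id.
docs = {
    1: {"name": "Ada", "city": "London"},
    2: {"name": "Linus", "city": "Helsinki"},
    3: {"name": "Alan", "city": "London"},
    4: {"name": "Grace"},  # schema-free: this document has no "city" field
}

city_index = build_index(docs, "city")
londoners = find(docs, city_index, "London")
print([person["name"] for person in londoners])
```

The index is defined once per field, and any value can then be queried without writing a new function, which is what makes this style convenient for dynamic queries.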

Cost

All the compared document stores are freemium, meaning that the services are free up to a certain limit (such as storage space or read/write capacity) or with limited features. For the paid versions of these services, DynamoDB becomes increasingly costly as read/write capacity grows, while MongoDB does not. However, MongoDB is more expensive to scale up in terms of storage.

MongoDB can be costly because its architecture limits its ability to efficiently support many concurrent users on a single node. Beyond a few dozen concurrent users per node, performance rapidly degrades, so the only way to deal with the degradation effectively is to add more hardware resources, which increases the cost of your deployment. MongoDB is also seen as costly because companies may need to add a dedicated third-party cache to meet performance requirements at scale, and because many features are not included in the default open source edition and need to be purchased separately.
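The relationship between users per node and hardware cost can be made concrete with some simple arithmetic. The Python sketch below uses entirely hypothetical numbers (1,000 concurrent users, 40 versus 200 users per node, 500 currency units per node per month); none of these figures come from a vendor or a benchmark, so substitute your own measurements.

```python
import math


def nodes_needed(concurrent_users, users_per_node):
    """Smallest number of nodes that keeps every node at or below its user limit."""
    return math.ceil(concurrent_users / users_per_node)


def monthly_cost(concurrent_users, users_per_node, cost_per_node):
    """Hardware cost per month for the required number of nodes."""
    return nodes_needed(concurrent_users, users_per_node) * cost_per_node


# Hypothetical inputs, for illustration only.
USERS = 1000
COST_PER_NODE = 500  # made-up currency units per node per month

low_density = monthly_cost(USERS, users_per_node=40, cost_per_node=COST_PER_NODE)
high_density = monthly_cost(USERS, users_per_node=200, cost_per_node=COST_PER_NODE)
print(low_density, high_density)
```

Under these assumptions, a database that supports five times as many users per node needs one fifth of the nodes, so the per-node capacity you measure in your own tests drives the hardware bill directly.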

That being said, Couchbase also isn’t cheap because you are charged for every GET request rather than for the actual data transferred out. All in all, MongoDB is the costliest option to deploy, followed by DynamoDB and then Couchbase.

Conclusion

In conclusion, each database offering has its own unique strengths and weaknesses. In terms of performance, we find that MongoDB is the fastest alternative, although it may not be well suited to distributed environments or large-scale deployments; here Couchbase takes the win. In terms of cost, MongoDB is the most expensive option but also offers unique features for this price, whereas Couchbase is the cheapest alternative. In short, which database is best for you will depend on your use case, so you should always test different technologies side by side and find the one that suits you best before committing to any one technology.

**DISCLAIMER**
Whilst we are avid technology geeks ourselves and love the nitty-gritty nuts and bolts, kernel profiling and digging through stack traces, we also recognize the need for a higher-level, more digestible approach to understanding the cloud computing landscape. For this reason, the AVM Consulting Business Blog series has a slightly different tone, aimed at business or management professionals and decision makers. We hope that this series of cloud business blogs will provide valuable information and new insights into the otherwise highly technical and rapidly changing cloud environment. Lastly, it is important to note that the views expressed in these blogs merely represent the opinions, perspectives, and point of view of AVM Consulting, and although some of the findings are based on facts, the meat of the content is purely subjective and open to interpretation. This is what we think; do what you will with this information.

