Date Published: July 10, 2018

A Manager’s Guide to the Database Galaxy – Part 6 (NoSQL – Key Value Stores)

PART 6

In the last blog we considered different Wide Column Store databases and examined some facets of performance and associated costs. In this blog we look at another type of NoSQL database, the much simpler Key Value Stores. Offerings from different vendors are examined in an attempt to highlight some of the key differences between otherwise similar technologies. We also consider the differences in performance and the costs associated with running each database on premises or in the cloud, so that it becomes clearer and easier to recognize the database that best suits your needs.

4. NoSQL – Key Value Stores

| Name | Hazelcast | Memcached | Redis |
| --- | --- | --- | --- |
| Description | Widely adopted Java in-memory data grid | In-memory key-value store originally intended for caching | In-memory data store used as database, cache, and message broker |
| Primary DB Model | Key-Value Store | Key-Value Store | Key-Value Store |
| Additional DB Models | None | None | Document Store, Graph DBMS, Time Series DBMS |
| Popularity Ranking (DBs Overall) | #41 | #23 | #9 |
| Popularity Ranking (in Key-Value Stores) | #5 | #3 | #1 |
| Developer | Hazelcast | Danga Interactive | Salvatore Sanfilippo |
| Initial Release | 2008 | 2003 | 2009 |
| Current Release | 3.9.2, January 2018 | 1.5.6, February 2018 | 4.0.9, March 2018 |
| License | Open Source | Open Source | Open Source |
| Cloud-Based | No | No | No |
| Implementation Language | Java | C | C |
| Server Operating Systems | All OS with Java VM | FreeBSD, Linux, OS X, Unix, Windows | BSD, Linux, OS X, Windows |
| Data Scheme | schema-free | schema-free | schema-free |
| Typing | Yes | No | Partial |
| XML Support | No | No | No |
| Secondary Indexes | No | No | No |
| SQL | SQL-like Query Language | No | No |
| APIs / Access Methods | JCache, JPA, Memcached Protocol, RESTful HTTP API | Proprietary Protocol | Proprietary Protocol |
| Supported Programming Languages | .Net, C#, C++, Clojure, Java, JavaScript, Python, Scala | .Net, C, C++, ColdFusion, Erlang, Java, Lisp, Lua, OCaml, Perl, PHP, Python, Ruby | C, C#, C++, Clojure, Crystal, D, Dart, Elixir, Erlang, Fancy, Go, Haskell, Haxe, Java, JavaScript (Node.js), Lisp, Lua, MatLab, Objective-C, OCaml, Pascal, Perl, PHP, Prolog, Pure Data, Python, R, Rebol, Ruby, Scala, Scheme, SmallTalk, Swift, Tcl, Visual Basic |
| Server-Side Scripts | No | No | Lua |
| Triggers | Yes | No | No |
| Partitioning Methods | Sharding | None | Sharding |
| Replication Methods | Yes, Replicated Map | None | Master-Slave Replication, Multi-Master Replication |
| MapReduce | Yes | No | No |
| Consistency Concepts | Immediate Consistency | Eventual Consistency | Eventual Consistency |
| Foreign Keys | No | No | No |
| Transaction Concepts | ACID | No | Optimistic locking, atomic execution of command blocks and scripts |
| Concurrency | Yes | Yes | Yes |
| Durability | Yes | No | Yes |
| In-Memory Capabilities | Yes | Yes | Yes |
| User Concepts | Access rights per client and object definable | Using SASL (Simple Authentication Security Layer) | Simple password-based access control |

Distinguishing Features

Redis is widely regarded as one of the fastest databases available, so it is no wonder that it ranks #1 as the most popular key-value database on the market today. In contrast to Memcached and Hazelcast, Redis also offers additional data models, including document store, graph DBMS and time series DBMS, giving the user added flexibility and the choice of ingesting multiple different data types.

Hazelcast supports all operating systems with a Java VM, while Memcached and Redis support BSD, Linux, OS X and Windows setups. All three databases are schema-free, but only Hazelcast offers full data typing; Redis offers partial typing and Memcached has no typing at all. Hazelcast is also the only one of the three that supports SQL-like queries, and it supports various access methods including JCache, JPA and a RESTful HTTP API, whereas Redis and Memcached each support only a proprietary protocol.

All three databases store data in key-value format, which reduces data complexity, and use in-memory storage as standard, making them incredibly fast. Only Redis has server-side scripts (in Lua), and only Hazelcast supports triggers. Hazelcast is also the only one that replicates through what it calls a Replicated Map. A Replicated Map does not partition data, nor does it spread data across different cluster members. Instead, it replicates the data to all members. This is different from the standard Master-Slave or Multi-Master replication offered by Redis, and a lot more than Memcached, which offers no replication methods at all.
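To make the difference between partitioning (sharding) and a Replicated Map more concrete, here is a minimal Python sketch. It is a toy model only: the member names, the keys and the hash-modulo placement rule are assumptions chosen for illustration, and it does not reproduce Hazelcast's actual partitioning scheme or API.

```python
import zlib


def owner(key, members):
    """Pick the single member that owns a key (simple hash-modulo sharding).

    Assumption: a CRC32-modulo rule stands in for a real partitioning scheme.
    """
    return members[zlib.crc32(key.encode("utf-8")) % len(members)]


def place_partitioned(keys, members):
    """Sharding: every key is stored on exactly one member."""
    placement = {m: [] for m in members}
    for key in keys:
        placement[owner(key, members)].append(key)
    return placement


def place_replicated(keys, members):
    """Replicated-map style: every member holds a copy of every key."""
    return {m: list(keys) for m in members}


# Hypothetical cluster members and keys, for illustration only.
members = ["member-a", "member-b", "member-c"]
keys = ["user:1", "user:2", "user:3", "user:4"]

partitioned = place_partitioned(keys, members)
replicated = place_replicated(keys, members)

for m in members:
    print(m, "partitioned:", partitioned[m], "| replicated:", replicated[m])
```

In the partitioned layout each member holds only a slice of the data, so capacity grows as members are added. In the replicated layout every member can answer any read locally, at the price of storing the full data set on each member.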

Finally, Hazelcast and Redis both offer sharding as a way to partition the data, but only Hazelcast has immediate consistency versus the eventual consistency model offered by Redis and Memcached. And while Redis supports the greatest number of programming languages, making integration with development easier, only Hazelcast offers ACID transactions for data operations requiring guaranteed validity.

Performance

In a Hazelcast grid, data is distributed among the nodes, or as we call them “members”, of a computer cluster, allowing for horizontal scaling both in terms of available storage space and processing power. Backups are also distributed in a similar fashion to other members, based on configuration, thereby protecting against single member failure. Memcached clusters consist of 1 to 20 nodes. Scaling a Memcached cluster is as easy as adding or removing nodes from the cache cluster. Each Memcached node is independent of the others and shares nothing.
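Because Memcached nodes share nothing, it is typically the client that decides which node holds which key. The following Python sketch illustrates that idea with in-process stand-ins. `FakeNode` and `ShardedClient` are hypothetical names invented for this example, not part of any Memcached client library, and the hash-modulo routing is an assumption (real clients often use consistent hashing to limit key movement when nodes are added or removed).

```python
import zlib


class FakeNode:
    """Stand-in for one independent cache node (hypothetical, in-process only)."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class ShardedClient:
    """Routes each key to exactly one node; the nodes never talk to each other."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def _node_for(self, key):
        # Assumption: simple CRC32-modulo routing chosen for brevity.
        index = zlib.crc32(key.encode("utf-8")) % len(self.nodes)
        return self.nodes[index]

    def set(self, key, value):
        self._node_for(key).data[key] = value

    def get(self, key):
        # Returns None on a cache miss, as a cache client typically would.
        return self._node_for(key).data.get(key)


nodes = [FakeNode("node-1"), FakeNode("node-2"), FakeNode("node-3")]
client = ShardedClient(nodes)

for i in range(10):
    client.set(f"session:{i}", i)

for node in nodes:
    print(node.name, "holds", len(node.data), "keys")
```

Each key lives on exactly one node and there are no backups, so losing a node simply turns its keys into cache misses. That is acceptable for a cache, but it is the main contrast with the distributed backups described for Hazelcast.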

Redis, Hazelcast and Memcached keep all data in RAM, which of course makes them supremely useful as a caching layer. However, Redis does not offer multi-threaded processing the same way Memcached and Hazelcast do. This may explain why, for read-heavy and balanced workloads in single node mode, Memcached tends to outperform Redis significantly, while Hazelcast stays right in the middle. However, when there is only a single concurrent client or a low number of threads, Redis demonstrates significantly higher throughput than the other two databases. Interestingly, for write-heavy workloads in single node mode Hazelcast also tends to outperform Memcached, with Redis a close second.

In terms of read latency, single node Memcached also has significantly lower latency for read-heavy and balanced workloads. Whilst Memcached also has lower read latency for write-heavy workloads, the difference between the databases is less significant here, and at the tail, for upwards of 24 clients, Hazelcast and Redis are even faster than Memcached. Memcached also outperforms the other databases in terms of write latency, except with write-heavy workloads, where Hazelcast clearly takes the lead.

In cluster mode much the same picture emerges: Memcached has the highest throughput for read-heavy and balanced workloads, whilst Redis triumphs in write-heavy loads. Redis also triumphs with low workloads until the concurrency of requestors increases, at which point Hazelcast shines. Memcached read and write latencies are significantly lower than those of the other databases for read-heavy and balanced loads, while Redis has the lowest latency for write-heavy loads.

Cost

All three of the database solutions are open source and thus free to use, apart from the cost of cloud resources such as EC2 instances or other VMs. However, across the board, regardless of whether the workload was read-heavy, write-heavy or balanced, Redis consumed a lot more memory than the other databases, making it the most expensive to run in any cloud-hosted environment.

Although in essence open source software should be free, the hosted offerings of these open source database solutions generally only support a very small amount of data for free, and charges vary after that. Redis, for example, charges $338 per month for a 5GB standard database hosted on any major cloud (a 5GB cache is only $105 per month), but also offers a pay-as-you-go option which has a base price of $338 per month plus usage. Only the pay-as-you-go option offers Multicore Redis.

Hazelcast does it slightly differently, offering three different editions: Professional Support, Enterprise and Enterprise HD, ascending in price and offering more functionality and features with each tier. Hazelcast also charges a flat fee per Hazelcast node or JVM (regardless of size), so it doesn’t penalize the user for running larger instances, and users can run multiple instances within one JVM, counted as only one node.

Whilst Memcached is fully free under a BSD license and does not incur licensing costs when deployed on premises, users still have to consider the cost of hosting the database software if they are looking to implement a cloud solution. A common solution we’ve come across is ElastiCache (from AWS), which although reliable and easy to implement is quite costly.

Prices are calculated by the size of the EC2 instance, so the more memory and the larger the instances within the cluster, the more expensive the setup becomes. For example, a current generation cache.r3.large memory optimized cache node goes for $0.228 per hour, which works out to approximately $166 per month. Assuming cluster mode with automated failover is desired over a single node architecture, running Memcached on AWS comes to approximately $332 per month for a two-node cluster.
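The monthly figures above can be reproduced with a few lines of Python. This is a rough sketch under one stated assumption: a month is taken as 730 hours (8,760 hours per year divided by 12). The hourly rate is the $0.228 quoted above, and the function name is purely illustrative.

```python
HOURS_PER_MONTH = 730  # assumption: 8,760 hours per year / 12 months


def monthly_cost(hourly_rate_usd, node_count=1, hours=HOURS_PER_MONTH):
    """Estimated monthly cost in USD for a cluster of identically priced nodes."""
    return hourly_rate_usd * hours * node_count


single_node = monthly_cost(0.228)      # one cache.r3.large node
two_nodes = monthly_cost(0.228, 2)     # two-node cluster

print(f"single node: ${single_node:.2f} per month")
print(f"two nodes:   ${two_nodes:.2f} per month")
```

With these inputs a single node comes to about $166.44 per month and two nodes to about $332.88. The $332 quoted above corresponds to doubling the rounded single-node figure of $166.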

Conclusion

In conclusion, we find that each database offering has its own unique strengths and weaknesses. In terms of performance, single node Memcached has the best latency for read-heavy and balanced workloads, but for more than 24 clients, Hazelcast and Redis are significantly faster. For write-heavy workloads Hazelcast clearly takes the lead, with Redis a close second. In terms of throughput, Memcached and Hazelcast outperform Redis, which is partially explained by Redis’s lack of multi-threaded processing.

In terms of cost all three databases have comparable offerings, yet there are differences depending on the setup you want to run. Redis is the most expensive at $338 per month for 5GB, and only the pay-as-you-go option (with additional fees for usage) offers multicore. Hazelcast has three price tiers, ascending in price and in the number of features included, but it has the great advantage of charging a flat fee per node regardless of size, so if you want to run multiple instances counted as a single node this may be your most cost-effective option. Finally, Memcached is a fully free BSD-licensed technology that only incurs the usage costs of the cloud you run it on. For larger, high-performance setups we recommend Oracle Cloud (OCI) as the most cost-effective and reliable infrastructure.

In any case, selecting the best database for you will depend on your use case. Therefore, we recommend you always test different technologies side by side and find the one that suits you best before committing to any one technology.

**DISCLAIMER**
Whilst we are avid technology geeks ourselves and love the nitty-gritty nuts and bolts, kernel profiling and digging through stack traces, we also recognize the need for a higher-level, more digestible approach to understanding the cloud computing landscape. From this origin and perceived need, the AVM Consulting Business Blog series has a slightly different tone, aimed at business or management professionals and decision makers. We hope that this series of cloud business blogs will provide valuable information and new insights into the otherwise highly technical and rapidly changing cloud environment. Lastly, it is important to note that the views expressed in these blogs merely represent the opinions, perspectives, and point of view of AVM Consulting, and although some of the findings are based on facts, the meat of the content is purely subjective and open to interpretation. This is what we think; do what you will with this information.
