This is the second blog in our series comparing different database management systems. In the last blog we looked at different data structures and their corresponding database management systems. In this part of the blog series we will consider different use cases best suited to each system.
All Database Management Systems have their ups and down when applied to specific situations or workloads, but each DBMS has at least one use case that it is particularly suited for (unless we are talking about a multi-model DBMS then more than one use case may apply). Let’s have a look at some of the most common use cases for each database type, so that suitable scenarios for each database management system can be identified and more easily recognized.
1. Relational Databases Management Systems (RDBMS)
Relational Databases are best suited for storing and managing highly structured, non-complex data often associated with transactional applications, and are commonly used in large enterprise scenarios.
Unlike NoSQL databases which were built primarily for availability and partition-tolerance, the great strength of relational databases is their transactional consistency. RDBMS are capable of ACID transactions (Atomicity, Consistency, Isolation, Durability), which is a set of properties that guarantee validity of the data stored and managed, even in the event of system errors, power failures, and other disasters.
This makes RDBMS extremely well suited for applications or use cases where strong consistency is required, including but not limited to billing, rights management, operations support, adtech, SLA management, session management, trade verification, fraud detection, bid & offer management and sensor management. Oracle DB, AWS Aurora and Google Cloud Spanner are very popular options for these use cases, with Oracle Database Exadata Cloud Service being particularly well-equipped for large enterprise operations and applications.
2. NewSQL Databases Management Systems
While traditional RDBMS have extensibility issues and are often expensive to scale, a new wave of SQL data-bases including MemSQL and VoltDB are becoming increasingly popular. These NewSQL databases maintain the transactional properties of traditional RDBMS, are often in-memory, massively parallel, and can scale out linearly on inexpensive cloud commodity hardware. They are designed for low latency high read/write performance, using a shared-nothing, cluster-native, cloud-friendly architecture to provide high availability and fault tolerance, as well as geographic redundancy.
|Performance / big volume||No||Yes||Yes|
For example, MemSQL was used by a leading gas and electric utility enterprise to handle massive quantities of data from smart-meters. In this use case, a MemSQL cluster of 5 nodes was used to handle over 200,000 meter readings per second while users simultaneously processed queries against that data. In the first part of a year 72 billion meter readings were loaded which measured up to 6TB of raw data. MemSQL compressed this data by 10x and reduced it to 624GB of storage on disk. The queries against the data used to take hours, but with MemSQL only takes seconds or a few minutes.
3. Object Oriented Databases Management Systems (OODBMS)
OODBMS apply object oriented programming principles to the management of persistent objects with the goals of security, integrity, recovery and contention management. They are based on data types and classes, and inheritance mechanisms between them, as well as on the concepts of polymorphism, dynamic binding, and message passing.
OODBMS are particularly good at avoiding what is known as impedance mismatch problem, which occurs when the data models used to persistently store data and the data models used to write programs against the data (C++, Smalltalk, Visual Basic, Java, C#) are different. This could potentially occur when developing new applications in Java which access data in relational systems. With OODBMS this can be avoided because the programming language the applications are written in is the same as the language used to access data the DBMS.
With object-oriented programming languages like Java and C# being used in such a wide spread fashion it is surprising that Object Oriented databases like Versant or Caché aren’t more popular themselves. Reasons for this could include the lack of persistency compared to relational systems, or an industry wide lacking understanding and knowledge of OODBMS due to the systems’ relative immaturity.
However, some examples of where object oriented databases have been implemented successfully can be found with Credit Suisse, who uses OODBMS to enable user-defined policies for e-mail and SMS messaging related to credit-card activity, with Siemens who uses OODBMS as part of their automation system solution for postal processes, or with Fujitsu who uses Jasmin ODBMS to store multimedia data such as images, pictures and animations. In some other application areas, such as engineering, where large amounts of complex data from different sources need to be stored and managed, a full-fledged OODBMS may also be a good solution.
4. Object Oriented Relational Database Management Systems (ORDBMS)
ORDBMS are used when transactional properties of relational databases are desired but the data is too complex and the volume of data being processed is too great to be managed in a traditional RDBMS. Oracle Database is an example of a fully serviced database solution with object-relational capabilities and can be used for example, to manage customer contact information, purchase orders and stock.
In this ORDBMS example, the main entities–customers, stock, and purchase orders–become object types. These types are defined to represent entire addresses and entire lists of phone numbers rather than breaking these up into unrelated columns as would be done in a traditional RDBMS. Similarly, the object relational approach uses nested tables to keep line items with their purchase orders instead of storing them separately. As such, it becomes easier to store and capture both attributes (which describe states of an object) and actions (which describe behaviors) of the object in question. In other words, with an ORDBMS it becomes possible to manage and query a specific line item included in numerous purchase orders of each customer, frequently ordered to a specific address of that customer.
Other examples of common ORDBMS applications include B2B e-commerce platforms (as with BASF Agricultural platform built on PostgreSQL), knowledge management and data mining platforms (as Mohawk Software used to manage US Census data), or to manage digital data recorded from Video cameras, environmental sensors, specialized monitoring equipment, or data recorded for rich media systems, industrial routing applications, and medical imaging systems. A rule of thumb for when ORDBMS are suitable could read; use ORDBMS when large amounts of complex data are collected and organized in semi-structured form.
5. NoSQL Databases – Key Value Stores
Key-value stores are probably the simplest form of database management systems. They can only store pairs of keys and values, and retrieve values when a key is known. This is one of the main reasons Key Value stores very often outperform traditional RDBMS in many scenarios, because a key-value store doesn’t need to search through columns or tables to find an object requested by a query, but rather, only needs to know the key to quickly determine an objects location. These database are also free schema and do not enforce structure on the data, which offers a high level of flexibility for modeling data to match the requirements of any application.
Key Value Stores are quite simple to use because there is no need to perform lock, join, union or other operations generally required when working with objects, which also makes them very easy to scale out on demand. Some key-value databases are specifically designed to simplify operations by ensuring that it is as easy as possible to add and remove capacity as needed and that any hardware or network failures within the environment do not create downtime. Lastly, these types of databases can make it easier and less complex to provide high availability because they may use a masterless, distributed architecture that eliminates single points of failure to maximize resiliency.
Key-Value stores are frequently used for managing session information in web applications, in massive multiplayer on-line gaming to manage each player session, or in e-commerce to manage the shopping cart of online-buyers. Although in the last case it is recommended to use K-V Stores only for all shopping cart transactions up to the point of payment and a relational database for managing payment transactions and revenue posting.
6. NoSQL Databases – Document Stores
The fact that document-stores are schema free makes them very flexible for data entry from different source in different formats, and it makes them highly scalable. Not emphasizing relationships between records means less code for inheritance hierarchies, methods, pointers and thus faster performance. Therefore, document stores are great for large volumes of less related, separate documents, which need to be accessed at real-time.
Examples of document stores being successfully implemented include Craigslist (MongoDB), who uses the database to archive billions of records in varying formats, and or LinkedIn, which uses Couchbase to run 10 million queries per second with over 200 clusters in production, staging, and corporate environments. Other examples can be found in the telecommunications industry with AT&T (Couchbase), Verizon (Couchbase) or Comcast (Couchbase), and in the travel industry with big names like Expedia whose Scratchpad App uses MongoDB to automate the note-taking and travel planning process to compare millions of dates, airlines, destinations and hotel prices all with one database backend.
7. NoSQL Databases – Wide-Column Stores
Wide-Column stores are designed for storing very large amounts of data in keyspaces and column families. This format of DBMS is great when you need random, real-time read/write access to your big data. Due to the columnar family grouping, these types of data bases can store a large number of different and distinct data formats and can manage a high variation in application clients, such as browser, mobile, and so on.
This means that even when each client needs the same content but in different formats, columnar databases can remain performant in answering the queries. Clients may not just consume and query content, but may also produce large amounts of different types of data very fast, as is the case in social media with users generating thousands of posts, tweets, images and blogs at high speeds. Also in e-commerce domains, where large numbers of logs about customer search activities and usage history have to be collected, stored and analyzed, as well as augmented with incremental trickled data (as coming from and ad’s impressions or clickstream) to produce recommendations for better targeting and advertisement, columnar databases make a lot of good sense.
Lastly, wide-column databases such as HBase and Cassandra have also been successfully used in medical and health applications. Their use cases include storing genome sequences and running MapReduce on them, storing the disease history of people of an area for further analytics and proactive disease prevention measures, and much more. In short, columnar stores are useful when petabytes of heterogeneous data need to be stored and analyzed in a fast efficient manner.
8. NoSQL Databases – Graph Databases
It is important to note that in most graph databases edges have a direction associated with them. Directions help to specify the relationship between two nodes and can be written in Cypher Query Language, for example, as:
Although both statements document a relationship between records, entities, or objects, directional mapping ensures a clean layout so that read speeds are increased.
In short this makes graph databases a great source for building applications with highly connected data. Noteworthy examples of graph database implementations include Walmart (Neo4j) who uses the benefits provided by the technology to make increasingly relevant and rich recommendations to customers in their online shopping journey, eBay (Neo4j) who uses graph databases for their ShopBot which enables conversational shopping, or even in crime-fighting with the Minesterio de la Acusacion in Argentina, who uses OrientDB to to map the relationships between intra- and inter-case elements such as accused, victims, participants, prosecutors, defenders, elements, sites, and times.
There are many different ways of storing and organizing data in types of database. Each database type has its own advantages and disadvantages and results in a list of more or less suitable use cases. This part of our blog series reviewed the different types of databases and data-store formats and covered some of the most common use cases for each type.
The next part of this blog series will draw direct technical comparisons between some of the available technologies for each database management type and consider some of the factors involved with running these databases in different clouds.To be continued…