Updated: Apr 26
There are a huge number of choices, each with its own benefits, competing communities and use cases. Finding the best match among graph-based, column-based and document-based databases is never easy. Every category has a list of options to choose from, each with its own advantages and disadvantages. In this post, we will try to understand the advantages, disadvantages, examples and use cases of the most popular databases.
Three main keys to consider while choosing a database
According to the CAP theorem (Brewer’s theorem), when you are designing a distributed system you cannot achieve all three of Consistency, Availability and Partition tolerance at the same time. You can pick only two of the three. ~ wiki
Let’s get into the practicalities of selection. We have divided NoSQL databases into 4 categories.
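To make the trade-off concrete, here is a minimal sketch of a toy two-replica store. It is not how any real database is implemented; the class names and the "CP"/"AP" mode labels are assumptions made for illustration. When the two replicas cannot talk to each other, the CP variant refuses the write (it stays consistent but is unavailable), while the AP variant accepts it (it stays available but the replicas diverge).

```python
class Replica:
    """One node holding a single value."""

    def __init__(self):
        self.value = None


class TwoNodeStore:
    """Toy two-replica store; mode is "CP" or "AP" (hypothetical labels)."""

    def __init__(self, mode):
        self.mode = mode
        self.a, self.b = Replica(), Replica()
        self.partitioned = False  # True when a and b cannot communicate

    def write(self, value):
        """Write through replica a; return True if the write was accepted."""
        if not self.partitioned:
            self.a.value = self.b.value = value  # replicate to both nodes
            return True
        if self.mode == "CP":
            return False  # refuse: keep consistency, give up availability
        self.a.value = value  # AP: accept locally, replicas now diverge
        return True

    def consistent(self):
        return self.a.value == self.b.value


cp, ap = TwoNodeStore("CP"), TwoNodeStore("AP")
for store in (cp, ap):
    store.write("v1")          # healthy network: both replicas get v1
    store.partitioned = True   # simulate a network partition

cp_accepted = cp.write("v2")   # rejected, replicas still agree on v1
ap_accepted = ap.write("v2")   # accepted, replica a has v2 and b has v1
```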
Categorized Database Names:
Key-Value: Riak, Redis, Memcached, Scalaris, Tokyo Cabinet.
Document-Based: MongoDB, CouchDB, OrientDB, RavenDB.
Column-Based: Cassandra, HBase, Hypertable, Bigtable.
Graph-Based: Neo4j, InfoGrid, InfiniteGraph, FlockDB.
Apart from the above, there is one more category called the Multi-Model Database. We will get to it after covering the four above.
Key-value stores are fast, but schema-less (unstructured).
Examples: URL shortener, Pastebin.
E-commerce use cases: temporary prices, user profiles, product recommendations, session information, etc.
Companies using it: Twitter uses Redis to deliver your timeline. Pinterest uses it for followers, following, etc.
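As a rough illustration of the URL shortener and session use cases, here is a minimal in-memory sketch of a key-value store with optional expiry. It is a stand-in written for this post, not the API of Redis or any other product; the class, the key names and the `ttl_seconds` parameter are all assumptions.

```python
import time


class KeyValueStore:
    """In-memory stand-in for a key-value database (illustrative only)."""

    def __init__(self):
        self._data = {}  # key -> (value, expiry timestamp or None)

    def set(self, key, value, ttl_seconds=None):
        expires_at = None if ttl_seconds is None else time.monotonic() + ttl_seconds
        self._data[key] = (value, expires_at)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and time.monotonic() >= expires_at:
            del self._data[key]  # lazily drop expired entries on read
            return None
        return value


store = KeyValueStore()
# URL shortener: the short code is the key, the long URL is the value.
store.set("short:abc123", "https://example.com/a/very/long/path")
# Session data: stored with a time-to-live (0 here, so it expires at once).
store.set("session:42", {"cart": ["sku-1"]}, ttl_seconds=0)

long_url = store.get("short:abc123")
expired_session = store.get("session:42")
```

The store never inspects the value, which is why lookups by key are fast and why there is no schema to enforce.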
Document-based databases are designed for storing, retrieving and managing document-based information.
Advantages: tolerant of incomplete data.
Disadvantages: query performance, no standard query syntax.
Use cases: scalable general-purpose storage.
Examples: a famous weather app (iOS) delivers weather alerts to 40M users; SEGA uses MongoDB to handle 11M in-game accounts.
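A small sketch can show what tolerance of incomplete data looks like in practice. The collection below is a plain-Python stand-in, not MongoDB's API; the class and field names are hypothetical. Documents in the same collection carry different fields, and a query simply skips documents that lack the field it asks about.

```python
class DocumentCollection:
    """Toy document collection: a list of free-form dictionaries."""

    def __init__(self):
        self._docs = []

    def insert(self, doc):
        self._docs.append(dict(doc))  # store a copy; no schema is checked

    def find(self, **criteria):
        """Return documents whose fields equal every given criterion."""
        return [
            d for d in self._docs
            if all(k in d and d[k] == v for k, v in criteria.items())
        ]


accounts = DocumentCollection()
accounts.insert({"user": "ana", "level": 7, "platform": "ios"})
accounts.insert({"user": "ben", "level": 7})                  # no platform field
accounts.insert({"user": "cy", "level": 2, "friends": ["ana"]})

level_seven = [d["user"] for d in accounts.find(level=7)]
ios_users = [d["user"] for d in accounts.find(platform="ios")]
```

Note that `find` scans every document. Real document databases add indexes to avoid this, but the sketch hints at why query performance is listed as a disadvantage.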
Column-based databases offer very high performance and a highly scalable architecture, because data is fast to load and query. They suit real-time use cases:
– A user's tweet information is saved column-wise.
– Data is organized into rows and groups of columns.
– Facebook: uses a column-based store (HBase) for nearby friends.
– Spotify: uses Cassandra to store user profile attributes such as artists, songs, etc.
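The idea of rows and groups of columns can be sketched as a nested mapping: row key, then column group, then column. This is a simplified model for illustration, not the storage engine of Cassandra or HBase; the class, the group names and the timestamp column names are assumptions.

```python
from collections import defaultdict


class ColumnFamilyTable:
    """Toy wide-column table: row key -> column group -> column -> value."""

    def __init__(self):
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row_key, family, column, value):
        self._rows[row_key][family][column] = value

    def get_family(self, row_key, family):
        """Read one group of columns for a row without touching the others."""
        return dict(self._rows[row_key][family])


users = ColumnFamilyTable()
users.put("user:1", "profile", "name", "Ana")
# Each tweet is a new column in the "tweets" group, keyed by timestamp.
users.put("user:1", "tweets", "2024-01-01T10:00", "hello world")
users.put("user:1", "tweets", "2024-01-02T09:30", "second tweet")
users.put("user:2", "profile", "name", "Ben")  # this row has no tweets at all

tweets = users.get_family("user:1", "tweets")
```

Rows do not need the same columns, and reading one group of columns leaves the rest of the row untouched.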
Graph-based databases are used for various purposes by many well-known companies.
– LinkedIn: for showing connections.
– Google Knowledge Graph: for example, search for "Indian Prime Minister"; the first result box Google shows is an example of graph-based data.
– Walmart: uses Neo4j for customers' personalized product recommendations.
– Medium: uses Neo4j to build its social graph to enhance content personalization.
The picture below shows where and when each type of database can be used.
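The "showing connections" use case comes down to following edges between people. Here is a minimal sketch of a second-degree connection lookup on an in-memory graph. It illustrates the kind of traversal a graph database is built for and is not the query language or API of Neo4j; the class and the sample names are hypothetical.

```python
from collections import defaultdict


class SocialGraph:
    """Toy undirected graph of people and their connections."""

    def __init__(self):
        self._edges = defaultdict(set)

    def connect(self, a, b):
        self._edges[a].add(b)
        self._edges[b].add(a)

    def second_degree(self, person):
        """People reachable in two hops who are not already connected."""
        direct = self._edges.get(person, set())
        candidates = set()
        for friend in direct:
            candidates |= self._edges.get(friend, set())
        return sorted(candidates - direct - {person})


graph = SocialGraph()
graph.connect("ana", "ben")
graph.connect("ben", "cy")
graph.connect("cy", "dee")
graph.connect("ana", "eve")
graph.connect("eve", "cy")

suggestions_for_ana = graph.second_degree("ana")
suggestions_for_ben = graph.second_degree("ben")
```

In a relational schema the same question needs a self-join per hop; in a graph model it is a walk along stored relationships.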
Source: Martin Fowler
A Multi-Model Database combines the capabilities of Column-Based, Graph-Based, Document-Based and Key-Value databases.
Examples: Microsoft Azure Cosmos DB, OrientDB
The Monolithic Database Approach
Issues in the monolithic approach
Difficult to make schema changes
Single point of failure
Let’s split up this monolithic approach by type of data to resolve these issues
The information generated from the application/system.
1- Events, logs, signals
2- No persistent storage so it should be highly available
Temporary data whose sole purpose is to improve the user experience by serving information in real-time e.g. cache for user experience.
Information gathered from user sessions — such as user clicks, cart data.
Payment processing and order processing data.
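One way to picture the split is a small router that sends each kind of data to its own store instead of one shared database. The sketch below is only an illustration: the category names and, in particular, the store chosen for each category are assumptions made for this example, not a recommendation from this post. Only the use of key-value stores for session information comes from the key-value section above.

```python
from enum import Enum


class DataKind(Enum):
    """Hypothetical labels for the four kinds of data described above."""
    OPERATIONAL = "events, logs, signals"
    TRANSIENT = "real-time cache"
    SESSION = "user clicks, cart data"
    TRANSACTIONAL = "payments, orders"


# Assumed mapping, for illustration only; a real system would choose
# stores based on its own consistency and availability requirements.
STORE_FOR = {
    DataKind.OPERATIONAL: "column-based",
    DataKind.TRANSIENT: "key-value",
    DataKind.SESSION: "key-value",
    DataKind.TRANSACTIONAL: "relational",
}


class PolyglotRouter:
    """Keeps one independent in-memory list per store type."""

    def __init__(self):
        self.stores = {name: [] for name in set(STORE_FOR.values())}

    def save(self, kind, record):
        store_name = STORE_FOR[kind]
        self.stores[store_name].append(record)
        return store_name


router = PolyglotRouter()
cart_store = router.save(DataKind.SESSION, {"user": 42, "cart": ["sku-1"]})
order_store = router.save(DataKind.TRANSACTIONAL, {"order": 1001, "paid": True})
```

Because each store is separate, a schema change or outage in one does not have to affect the others, which addresses the two monolith issues listed earlier.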
I hope this is useful to you in some way. Let me know your views.
Thanks for reading. :)