Databases 101 – Factors to Consider When Choosing a Database
In part one of this series, we discussed the different types of databases available on the market today. In part two, we discussed how to choose the right database for your requirements. Today, we are going to look at some additional factors you should consider when selecting a database.
Let’s get started.
Additional Factors to Consider When Choosing a Database
After reviewing the first two posts, we know a bit more about the types of databases, as well as how they function and what problems they can solve. And we looked at the CAP theorem as one way to help evaluate which database engine will meet your requirements. But the reality is that there are many additional factors to consider when selecting a database engine. I’ve mentioned a few already, but let’s build a list and add some commentary.
ACID vs. BASE
Relational database engines do ACID (Atomicity, Consistency, Isolation, and Durability) very well, whereas NoSQL engines use BASE (Basic Availability, Soft-state, and Eventual consistency) to achieve the performance offered through horizontal scaling, high availability (HA), and fault tolerance. This is why I wrote in part one that relational systems tend to favor writes, and NoSQL systems tend to favor reads.
Your database engine must have the ability to be highly available. However, that may come at a high price. For example, clustering or adding nodes will also increase hardware and software costs. Make certain you understand the topology of the HA architecture and all associated costs.
You should review your disaster recovery plans to make certain your database engine meets your requirements and defined SLAs. If it takes too long to backup and restore, you may not want that engine. Or, if backups aren’t an option, and you need to restore or recreate a bunch of flat files, you’ll want to know that before disaster strikes. Oh, and before I forget, here is your reminder that HA != DR.
One of the most nebulous statements out there today: “that database doesn’t perform well.” You must understand your ratios of reads to writes and if your database engine is designed to maximize one or the other (or both). Also, you should understand that fast database performance may be a result of a tradeoff somewhere else. For example, you may be promised amazeballs transaction velocity, but it’s the result of weaker consistency. If you need strong consistency, then you may not get the throughput you are promised.
Chances are you will be looking to use some other piece of technology, such as Spark, Kafka, or RedisCache. You should consider if your engine is robust enough to connect with these other platforms as needed, or if you will need to spend the time and money to build the hooks yourself.
Speaking of complementary technologies, you will probably need to be moving data into and out of that database. If you already have a preferred ETL (Extract, Transform, and Load) provider, make certain it will work with your choice of database engine.
The database engine may require you to provide the infrastructure necessary for operations. Or it could be part of a larger data platform offering by a major cloud provider. Another item to consider here is one of data migrations. Spend the time evaluating the effort required to move your data as needed.
You may already have a preferred infrastructure vendor, such as Microsoft or AWS. If so, then your costs may be greatly reduced by leveraging an existing partnership. Otherwise, you may find the time, money, and effort required to bring on new hires with the necessary skills for your chosen database engine to be prohibitive.
Monitoring and Alerting
Whatever database you choose, you will want to make certain you can monitor, alert, and act upon the alerts as necessary. You don’t want to decide on a database solution that also requires you to roll your own monitoring solution. Or, maybe you do, but you should factor that as an additional expense, including the potential for needing to bring in some expertise.
At the end of the day, if you don’t have a champion for your choice of database engine on the business side of the fence, then your project won’t go anywhere. If you’ve read this far, then take this advice: Get a champion first, before you invest too much of your time, effort, blood, sweat, and tears into selecting the perfect database(s).
The complex systems of today require multiple database engines to provide efficient solutions that a single engine of yesterday cannot. The major database vendors are adding cross-functionality support (relational and non-relational features) to their products in response to the needs of their customers. As a result, many of today’s database engines appear to be similar to one another, and it can be difficult to understand what the term “database” even means.
It is my hope that you can use this article as a guide to navigate the world of relational versus NoSQL database engines.