Database

Database 102 – Why Is My Query Slow?

February 14, 2020

Database 102 – Why Is My Query Slow?

As a database administrator (aka DBA, or Default Blame Acceptor) throughout my career, I’ve worked with a myriad of developers, system administrators, and business users who have all had the same question—why is my query (or application) slow? Many organizations lack a full-time DBA, which makes the question even harder to answer. The answer is sometimes simple, sometimes complicated, but they all start with one bit of analysis you need to do: whether the relational database management system (RDBMS) you are using is DB2, MySQL, Microsoft SQL Server, Oracle, or PostgreSQL.

It’s All About the Execution Plan

A database engine balances CPU, memory, and storage resources to try to provide the best overall performance for all queries. As part of this, the engine will try to limit the number of times it executes expensive processes, by caching various objects in RAM—one use is saving blocks with the data needed to return results to a query. Another common use of caching is for execution plans or explain plans (different engines call these different things), which is probably the most important factor in your queries performance.

When you submit a query to a database engine, a couple of things happen—the query is first parsed, to ensure its syntax is valid, the objects (tables, views, functions) you’re querying exist, and you have permission to access them. This process is very fast and happens in a matter of microseconds. Next, the database engine will look to see if that query has been recently executed and if the cache of execution plans has a plan for that query. If there’s not an existing plan, the engine will have to generate a new plan. This process is very expensive from a CPU perspective, which is why the database engine will attempt to cache plans.

Execution or explain plans are simply the map and order of operations required to gather the data to answer your query. The engine uses statistics or metadata about the data in your table to build its best guess at the optimal way to gather your data. Depending on your database engine, other factors such as the number of CPUs, the amount of available memory, various server settings, and even the speed of your storage may impact the operations included in your plan (DBAs frequently refer to this as the shape of the plan).

How Do I Get This Plan and How Do I Read It?

Depending on your RDBMS, there are different approaches to gathering the plan. Typically, you can get the engine to give you a pre-plan, which tells you the operations the engine will perform to retrieve your data. This is helpful when you need to identify large operations like table scans, which would benefit from an index would be helpful. For example—if I had the following table called Employees:

EmployeeID LastName State
02 Dantoni PA
09 Brees LA

 

If I wanted to query by LastName, e.g.,

SELECT STATE

FROM Employees

WHERE LastName = ‘Dantoni’

I would want to add an index to the LastName column. Some database engines will even flag a missing index warning, to let you know an index on that column would help the query go faster.

There is also the notion of a post plan, which includes the actual row counts and execution times of the query. This can be useful if your statistics are very out of date, and the engine is making poor assumptions about the number of rows your query will return.

Performance tuning database systems are a combination of dark arts and science and can require a deep level of experience. However, knowing about the existence of and how to capture execution plans, allows you to have a much better understanding of the work your database engine is doing, and can give you a path to fix it.


Joseph D'Antoni is a Principal Consultant at Denny Cherry and Associates Consulting, and Microsoft Data Platform MVP with over 20 years of experience working in both Fortune 500 and smaller firms. He is a frequent speaker at major tech events like Microsoft Ignite, PASS Summit, and Enterprise Data World. He blogs about all topics technology at joeydantoni.com. He believes that no single platform is the answer to all technology problems. He lives in Malvern, PA, and holds a BS in Computer Information Systems from Louisiana Tech University, an MBA from North Carolina State University, and is the co-author of the Microsoft book "Introducing SQL Server 2016."