
Selecting Queries for Optimization

Query optimization can feel like an overwhelming task. Where to start? In this blog post, we’ll discuss how to select which queries you really should optimize. This is a step that’s often skipped, but shouldn’t be! Why is selecting the right queries to optimize so important? There are several reasons.

Penny-Wise, Pound-Foolish. Look at the forest, not just the trees, and optimize globally, not locally. If you're optimizing a query that never causes a user-visible problem, doesn't impact other queries, and doesn't significantly load your servers, you may be "optimizing" something that doesn't matter, spending more money than you save. Your time has value, too! And remember that a query generating only 1% of the database's overall load can reduce that load by at most 1%, no matter how well you optimize it.

Whack-A-Mole Queries. It's very common to find a slow query in a log file, re-execute it, and find that it's now fast. It was slow at some point in the past, but why? It can be hard to tell the difference between genuinely bad queries, cache misses, victims of poor overall server performance, and victims of other bad queries. This is especially true if you don't have a good way to examine what was happening on the server at the time the query ran slowly. And if you can't measure the effects of any changes you make, you might just spend your time going in circles.

With this in mind, select queries that are:

  • Major contributors. As a rule of thumb, queries that contribute more than about 5% of the total execution time (service demand) are worth examining individually to see if they can be improved. The exact threshold is up to your judgment, but that's our suggestion; see the sketch just after this list for one way to measure it.
  • Consistently slow. Queries that have high latency generally cause user-facing performance problems if they are executed with any frequency.
  • Occasionally slow, and important. Queries that are only sometimes slow can point to small server-wide stalls that may worsen in the future, long-tail latency problems, or occasionally broken functionality. Many applications have a "celebrity" edge case, named after the outlying effects of highly popular celebrity accounts on social media, which have orders of magnitude more followers and activity than ordinary accounts. Such edge cases can easily fall through the cracks of average-case analysis.
  • Queries with red flags. Queries that cause errors or warnings, don’t use indexes, and the like may be good to clean up, even if it’s purely to avoid “broken window syndrome” and keep your workflow clean and manageable.
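
To put a number on "major contributors," one rough sketch (assuming MySQL 8.0 or newer for window functions, and that statement digest collection is enabled in performance_schema) is to rank digests by their share of total execution time. The 5% cutoff is the rule of thumb above, and the LIMIT is illustrative:

    select digest_text,
           count_star as executions,
           round(sum_timer_wait / 1e12, 2) as total_seconds,
           round(sum_timer_wait / sum(sum_timer_wait) over () * 100, 2) as pct_of_total
    from performance_schema.events_statements_summary_by_digest
    order by sum_timer_wait desc
    limit 20;

Anything near the top of this list with a pct_of_total above roughly 5% is a candidate for individual attention. PostgreSQL users can compute the same ratio from pg_stat_statements.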

Once you’ve selected the queries that need optimization, you need to find samples to examine. You can’t EXPLAIN or otherwise examine a statement digest, generally speaking. You need individual query occurrences that you can re-execute and study. Here are some basic ways to find the samples you need:

  • MySQL Performance Schema. You can find sample queries in the events_statements_history_long table, which contains a digest column you can use to filter by queries of interest:
    select *
    from performance_schema.events_statements_history_long
    where digest = 'e51be358a1cbf99c1acab35cc1c6b683';
  • PostgreSQL pg_stat_activity. You can select from the pg_stat_activity view to find samples of queries. Depending on whether the state column is "active" or "idle", the query shown for a backend is either in progress or the most recently finished one. Ideally you'd capture queries after they finish but before that backend begins another query, so you can see the query's execution time, which is state_change minus query_start when the backend is idle. See the pg_stat_activity documentation for details on this statistics view:
    select datname, usename, application_name, client_addr,
           client_hostname, client_port,
           state_change as end_time,
           state_change - query_start as response_time,
           query
    from pg_stat_activity
    where state = 'idle';
    -- Before PostgreSQL 9.6 you could also filter with "and waiting = 'f'";
    -- that column was later replaced by wait_event_type and wait_event.
  • Query Logs. The tools you use to aggregate the logs will present samples from each category of queries. For example, pt-query-digest will show the “worst” sample from each category.
  • SolarWinds® Database Performance Monitor. SolarWinds DPM has sophisticated sampling algorithms to capture representative samples of queries. In the future we will write a white paper about the statistical techniques we use.
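
Whichever source you use, the payoff is a concrete statement you can re-execute and study with EXPLAIN. Here is a minimal sketch of that step; the orders table, its columns, and the literal value are hypothetical stand-ins for whatever your sampling method actually returned, and plain EXPLAIN works in both MySQL and PostgreSQL:

    -- Hypothetical captured sample; substitute the real query text you found.
    explain
    select id, total
    from orders
    where customer_id = 42
    order by created_at desc
    limit 10;

From there, EXPLAIN ANALYZE (available in PostgreSQL, and in MySQL as of 8.0.18) will actually execute the statement and report observed timings alongside the plan.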