The High Cost and Low Benefit of Unused Index Advice
April 26, 2018 |
You have to be careful what you wish for. The signal:noise ratio of finding out which indexes are “unused” is low. This isn’t obvious at first glance, but upon deeper inspection, there’s often not much to be gained, and it can even harm you. Let’s look at why.
How to Find Unused Indexes
The typical way to figure out which indexes aren’t used is to examine index-usage counters over a period of time. The database itself can provide these statistics: PostgreSQL has built-in views that can expose them, and in MySQL there’s similar views as well as community patches that originally came from Google. There’s a way to get the stats in MongoDB too.
What advice do people want about unused indexes? It typically boils down to:
The theory is that indexes cause bloat and wasted space if they’re not needed, and that they slow down modifications because they require duplication of writes when the indexed table is modified. Removing the biggest indexes from the heaviest-used tables could potentially eliminate a lot of wasted space and write amplification.
I’ve seen software that will auto-generate statements or migration code to drop the indexes. It couldn’t be more convenient: you just view the report and copy-paste the code, and you’ve got an “optimization easy button.”
Or is it?
- Tell me indexes that don’t seem to have any activity according to the statistics.
- Prioritize these indexes by the size or busyness of the data they reference.
Risk: Are They Really Unused?
The problem with finding “unused” indexes is that absence of evidence isn’t evidence of absence. There’s a nontrivial risk that an index really is used: just not while you gather statistics about its usage. And these indexes can be absolutely vital for performance during tasks you’re not thinking about. What about the end-of-month closing of the books, or the end-of-week generation of reports to email out to clients?
If you drop an index that’s actually needed, something performance-sensitive could turn into a full scan next time it runs. It could turn from a single-row access into a billion-row scan. I’ve seen a dropped index result in an outage at a later date, when it caused an emergency due to high load or missed deadlines.
Risk: Are the Indexes or Tables Large or Busy?
If you’ve ever operated data platforms at scale under high load, you know that schema changes need to be analyzed carefully for the potential impact. In particular, if you’re modifying a table or index that’s large or busy, you have the potential to cause a complete outage, which could be very long.
Depending on the type of modification, and the database technology, you could:
I have personally experienced these problems many times. My scars run deep.
The bigger and busier these tables and indexes are, the worse this problem gets. Schema changes on serious production systems aren’t to be treated casually.
- Cause high load for a long period of time, starving the system of the resources and impacting performance across the board.
- Lock or limit access to a table or set of tables, potentially resulting in hung statements and giant queues for things like trying to get a connection from a pool, which can cause backpressure and cascading outages.
- End up with failed, aborted, or killed alterations, which can result in outcomes like crashes and rollbacks, which can take much longer to recover from and even carry the risk of catastrophes like data corruption.
Benefit: Speed Up Inserts and Updates!
Indexes slow down inserts and updates, so removing unused ones can speed those operations
But, how much? And how much does it matter?
Databases are smart. They don’t do wasteful updates for data that doesn’t change (if you update column A, they don’t update indexes that reference column B). And they do smart caching of data in memory, and smart buffering of writes, and write-ahead-logging, and a ton of other optimizations. The net result is often that index updates are far less costly than you might think.
But that’s kind of an unqualified assertion, isn’t it! How would you know if you don’t measure? Exactly. When you have the type of data that SolarWinds® Database Performance Monitor (DPM) provides, you can make quantitative decisions and know whether you were right.
So the simple theory—indexes make things slower—isn’t always straightforward to prove. The database often optimizes things to the point that indexing overhead is negligible. You don’t know unless you measure. My experience is that when you measure you’ll often be surprised.
Benefit: Save Disk Space!
You can measure how much storage space an index uses, so you can figure out how much space you’ll save by dropping it.
But it turns out that the benefit isn’t clear-cut. Either an index is small and you’ll gain little by dropping it, or it’s large and there’s a higher chance of causing more harm than good.
All Risk, No Benefit?
So when you return to the “automatic waste-reduction adviser,” you’ll see that what it spits out is advice to the effect of this:
“Indexes bloat your disk space and slow down your database. Drop these unused* indexes, which are on seldom-updated tables and constitute a total of 20MB of data, to free up disk space and speed up your database, by copy-pasting the following commands: ALTER TABLE foo…”
* “unused” isn’t provably correct, remember.
What if the tool were pessimistic instead of optimistic? The advice might read like this:
“We don’t see that these indexes are used, but we can’t be sure. If you drop them you might free space and make writes faster, but you could also cause an outage: either now due to the cost of removing the index, or in the future due to the cost of missing the index in the event it’s actually needed.”
You can’t outsource caring about your database’s performance to a tool.
But what about the other case, when the index is on a hot column on a huge table and really is slowing things down? In that case the optimistic tool might say:
“Drop this unused index on a super-hot 1TB table and you’ll save 800GB of disk space and the overhead of maintaining 20k modifications per second. ALTER TABLE giant_table...”
But again, just pasting that ALTER into the command-line without scrutinizing it could cause take down production systems. And if the tool was wrong and the index really is used? Now you have to rebuild an index on a 1TB table. How long will that take?
Final note: how often does a situation like this occur? A few times a year? Does it need to be built into a tool, or should it just be a snippet that’s posted on a gist or a wiki page, or written into one of those DBA cookbooks?
“Easy-button” unused-index advice is highly dangerous in my personal experience. It can trap inexperienced operators into a potentially doomsday scenario. The cost of bad advice is astronomical. When it’s not risky, there’s no benefit; and when there’s a lot of benefit, by definition there’s also a lot of risk. This is a terrible combination.