Anyone who has ever put a filtered index on a table knows that auto-update-stats is inadequate at keeping the query optimizer informed about them. The reason is that to fire a stats update, ~20% of the table has to change. If you have a table with 500 million rows, and a filtered index covering 5 million of those rows, you’d still need ~100 million rows to change before stats would be automatically updated.
Luckily we already have a script that will use advanced criteria to update statistics for us. I refactored the original script and test to account for filtered indexes. Now, if we want to update stats if 15% of the table changes, we will also update stats if 15% of the filtered index changes.
Looking at our original example, the big table with 500 million rows would get its stats updated if 750,000 rows inside the filtered index are changed. That’s much better.
The key is that the sys.partitions view includes the estimated number of rows in an index. If an index is filtered, it will have a lower number of rows. The ratio of rows in the index to rows in the table tells you roughly what percentage of the table the index covers.
Script: Uber-Update-Stats-v11.sql Unit Test: Uber-Update-Stats-Test-v11.sql
PermalinkExperienced DBAs know that auto-update-stats is not sufficient to keep statistics updated on large tables. This is doubly true for tables where the DML activity is near one end of the clustered key.
There are some common workarounds:
Not content with any of these, I’ve gone ahead and built my own script:
Script: Uber-Update-Stats.sql Unit Test: Uber-Update-Stats-Test.sql
Now I can run this script on a near-constant schedule, and it will update each table as soon as a configurable percentage of rows or number of rows have changed. You can also configure the sampling percentages. I think it is The New Hotness very useful.
Coming up next: A version that compensates for filtered indexes & filtered statistics.
Permalink