Category Archives: Statistics

What do *YOU* want to read about?

I would like to make my blog as relevant as possible to readers. What would you like me to blog about? I’ve chosen from my areas of expertise, so expect 300-500 level posts in each of these areas.

No Perf for You! Read-only Databases stops Auto Create Stats

Read-only databases cannot create new statistics

A while back, I was working with Kendra Little (@KendraLittleblog), and we were using log shipping to make warm standby copies of reporting databases. The warm standby databases were marked as READ_ONLY. However, this meant the query optimization process couldn’t create new statistics (MSDN) when users ran ad-hoc SELECT statements.

Query Plans With Missing Statistics

Using a one-table unit test, we are able to see that a read-only database cannot create stats. The estimating query plan shows a warning symbol, symbolizing that the query optimizer cannot create statistics on necessary columns.

Query Plan With Missing Stats

Click on the image to see the warning in detail and actual query plan

Query Plans With Manually-Created Statistics

Let’s suppose that there are stats on every single column of every single table. Then, the query optimizer can do a better job of producing good plans. Using the query in our one-table unit test now produces a plan without warnings.

Query Plan With Stats

Click on the image to see the actual query plan

MANUALLY CREATED STATISTICS SCRIPT AND TEST

My solution is to manually create statistics on every column where they don’t already exist. Below is an example script and a unit test for it.

Script: Manually-Create-Stats.sql
Unit Test:  Manually-Create-Stats-Test.sql

Performance results

Unit Test Perf Results

The performance results are pretty clear. Using statistics in a simple, one-table unit test creates a slight performance boost. I have seen gains of 10-30% with many reporting queries; the effect of missing statistics as you join more tables together, use larger tables, and use columns for filtering.

Uber Update Stats 1.1: keep stats up to date with filtered indexes

Auto_Update_Stats & Filtered Indexes: they don’t play well together

Anyone who has ever put a filtered index on a table knows that auto-update-stats is inadequate at keeping the query optimizer informed about them. The reason is that to fire a stats update, ~20% of the table has to change. If you have a table with 500 million rows, and a filtered index covering 5 million of those rows, you’d still need ~100 million rows to change before stats would be automatically updated.

Uber-Update-Stats 1.1

Luckily we already have a script that will use advanced criteria to update statistics for us. I refactored the original script and test to account for filtered indexes. Now, if we want to update stats if 15% of the table changes, we will also update stats if 15% of the filtered index changes.

Nerds Candy

We are what we eat. Luckily nobody notices

Looking at our original example, the big table with 500 million rows would get its stats updated if 750,000 rows inside the filtered index are changed. That’s much better.

The key is that the sys.partitions view includes the estimated number of rows in an index. If an index is filtered, it will have a lower number of rows. The ratio of rows in the index to rows in the table tells you roughly what percentage of the table the index covers.

Script: Uber-Update-Stats-v11.sql
Unit Test: Uber-Update-Stats-Test-v11.sql