Category Archives: T-SQL Tuesdays

T-SQL Tuesday #25 – Tips and Tricks

T-SQL Tuesday logoIt is T-SQL Tuesday once again, hosted by Allen White (b / t). The theme for this T-SQL Tuesday is Tricks.

One of my favorite tricks is to use INTERSECT when I have to rewrite a SELECT query, to make sure I am returning the same data.

Let’s say I have a slow query that returns 1640 rows:

I refactor it for speed, and make sure it still returns 1640 rows:

How am I sure my new query returns the same data? I use INTERSECT, and make sure the intersect-ed query returns the same number of rows as the original query:

Success is concrete. Just like the Mafia.

Now I know my changes have not affected what data is returned. Success!

I also have a more concrete example, using the AdventureWorks2008 R2 database, and Example.sql .

T-SQL Tuesday #16: Statistics using CLR Aggregates

T-SQL Tuesday logoIt is T-SQL Tuesday again, brought to you by Jes Schultz Borland (@Grrl_Geek). The theme for this T-SQL Tuesday is Aggregations.

For my T-SQL Tuesday post, I’ve written up some CLR aggregation functions that can do statistical analysis. I’ve built three of the most common: median(), percentile(), and covariance().

Here’s my code in a ZIP file: MathCLR – By Dev Nambi.

Median

Median finds the middle number in a set of data. If you sorted the numbers and then picked the exact middle value, that is the median.

I’m using the same sample data set from my last post, which is about the runtime of a backup job over the past week. As we can see here, the median runtime is 33 minutes.

Percentile

Percentile finds the number under which X percent of the values fall. Using our sample data set, we can find that 80% of the backup runtimes are under 480 minutes.

The way to use the function is to pass in the data set as the first parameter, and the percentile you’re looking for as the second.

Covariance

Covariance is less well known. It shows the relationship between two sets of numbers. It finds the  correlations between two sets of numbers.

Using our data set, let’s see if there’s a relationship between backup time and the amount of data backed up. As we can see below, the covariance is negative. That means that either the backups get faster as they get larger (unlikely), or something else is at work.

How about a relationship between disk queue length on the backup file server and the runtime? Aha! That shows a very high covariance, suggesting that the disks on the backup file server are the bottleneck. Either the backup server is too slow, or else some other process is slowing it down.

 

Good luck with your analysis!

T-SQL Tuesday #14: SQL Azure

T-SQL Tuesday logoIt is T-SQL Tuesday again, brought to you by Jen McCown (@MidnightDBA). The theme for this T-SQL Tuesday is Resolutions. Mine is, “This year I will learn and use SQL Azure“.

Why SQL Azure?

The professional answer? Companies are always looking to do more with less. Also, the spread of the Internet to all areas of life means that businesses are having to deal with the brave new world of disintermediation, scale challenges, and thinner margins.

lightning

We're fine, everything's in the cloud. What could go wrong?

The personal answer? It’s a fun technology to experiment with. The concepts of sharding and queue-based architectures are, well, cool.

SQL Azure means more databases per DBA. Smaller outfits may not have DBAs at all. Unfortunately, that’s what the cloud does to our profession; it automates some of our work away. So take a deep breath, sigh, and move on.

More practically, work for talented computer professionals isn’t likely to decrease. Migration work to SQL Azure-based architecture takes a lot of time, even with the migration tools and tips that experts have come up with. Some industries, such as health-care & banking, are not likely to move their data to a third party. Database design and architecture work should always be around.

bright cloud

Welcome to "The Cloud"

So, time to learn! How can I make a 1TB database fit into 40GB chunks? How can I leverage the worker queues in Windows Azure to my benefit? How can I migrate half a database to SQL Azure, and still provide my application access to all data? How do I do a backup? How do I figure out the costs of moving to SQL Azure vs. the costs of the status quo?

Time to learn!