DB Perf is Like Cake: The Design Layer

04 April 2011

“Refactor it, and perf will come”

Layer 3 of The Layer Model is the design layer. The design layer is the schema of your tables and the application interfaces into your database. Each table’s datatypes, the views and stored procedures used by applications, LINQ, Entity Framework…they are the design layer of your database. Clean up your design, and you will get better performance.

Work with your developers

To do any of this work, there must be a good working relationship between the DBA team and the development team(s). Everybody needs to get along. These tuning techniques require DBAs, database developers, and C# developers to work well together. All of us will be working outside of our comfort zone.

Data-Access Tuning

The way that applications access your database has a lot of influence on how well your application behaves, and the load on your database. Does your UI or middle-tier use LINQ everywhere? Is it all done through stored procedures? Some other way?

Luckily, developers are pretty conservative (read: lazy), so your application probably only uses one set of tools to access the databases.

Stored Procedures

If your application always calls stored procedures (‘sprocs’), then you’re in luck. There are a lot of things you can do for performance tuning on your own. I’d recommend the following approach:

1. Do query tuning for the stored procedures in your system. 2. Do you have multiple applications hitting the same database? Consider making different stored procedures for each one. That way you can tune each application independently, and not have to worry about the others. That way a set of stored procedures becomes the application<->database interface layer. 3. Make sure each stored procedure is doing the minimum amount of work necessary. If you only need to return 2 columns to your application, only return those 2, instead of all 20 in the table.

Object to Relational Mappers

If your application uses an Object-to-Relational Mapper (ORM) like LINQ, Entity Framework, or nHibernate, then brace yourself. ORM tools hide a lot of query information from DBAs. They don’t expose information on what queries they create, and those queries are often hard to tune because nobody can re-write them directly. They make it hard for anyone to understand what the application is doing with your database.

For this post, I’m going to stick with tuning LINQ to SQL, since it seems to be the most common ORM tool with my readers. Nowadays I can’t swing a stick without hitting someone who’s run into performance issues using LINQ to SQL. Some intrepid souls have even benchmarked the performance hit you might receive, and the results may make you want to cry.

Luckily, this problem is common enough that there are a lot of resources around how to go performance tuning. Some of my favorites are below.

All of these resources require that DBAs and developers work together. They often require that developers have access to a database that’s similar in size to your production system, to find out what’s going on. However, it’s a great way to speed up your systems and meet nerdy colleagues.

Table Tuning

Table tuning isn’t talked about very often. Why would you change the schema of a table for performance? Isn’t it usually done only for new features, or to add data integrity?

The reason is that your table schema can _dramatically _change how your queries behave. Consider these two examples, which can be used for the exact same application:

Let’s say the table had 1 million rows. The first table would need 197MB for the clustered index. It also needs an additional 2GB for the images being stored as LOBs. The second table would need 107MB for the clustered index, and nothing else. That’s a space savings of over 95%.

I’d recommend the following approach:

  1. Make sure it’s safe to redesign your tables. That means making sure stored procedures are used everywhere, or views, or you have agreement from the folks working with the ORM tools you have.
  2. Redesign the largest tables first.
  3. Resign the tables that have the worst blocking or deadlocks.
  4. Redesign ‘core’ tables. If a table is involved in over ~20% of your application’s queries, it’s a ‘core’ table
  5. Redesign anything else.

There are some great resources on table design. My favorite starting point, however, is Kimberly Tripp’s blog post on GUIDs.

Summary

The Good: Great performance improvements are possible (5x-100x is very achievable). Space savings come along as well.

The Bad: Development teams can be busy and tiring. There is a medium amount of development risk. If you don’t have a good database developer, you could make things worse, and break your applications in the process.

The Ugly: That application that is using LINQ to SQL on 1000 active spids against your lone OLTP database? It’s still there, even if the queries finish faster.