Comparing strings regardless of collation

10 February 2011

How do you know if one string value is the same as another? The obvious way is an equality check, such as:

select case when 'lowercase'='lowercase'
        then 1 else 0 end as IsIdentical

However, there are a few limitations there, especially if you use the default SQL Server collation, which is case-insensitive. For example:

select case when 'lowercase'='LoWeRcAsE'
        then 1 else 0 end as IsIdentical

That will return 1, saying the two strings are identical. According to the collation rules, they are. Practically, they aren’t.

The workaround: Use varbinary

The workaround is to convert the values to varbinary() before comparing them, because the binary representation of these values is different, regardless of collation. This works with XML as well.

Here’s the same example, with a varbinary conversion:

select case when convert(varbinary,'lowercase')
                 =convert(varbinary,'LoWeRcAsE')
            then 1 else 0 end as IsIdentical

This now returns 0, saying the values are in fact different. Hoorah!

The Catch: CPU Time

The one caveat is that you pay a CPU performance penalty for doing convert() calls. In the example above, a comparison using 50K rows took 16 ms of CPU time normally, and 63ms of CPU time with the varbinary conversion.