Letting SQL Server change data types automatically can seriously hurt performance. Because the conversion has to be applied to the column value in every row, you can’t get an index seek; instead, you’re forced into a scan. I can demonstrate this pretty simply. Here’s a script that sets up a test table with three columns and three indexes, then loads a million rows into it:
CREATE TABLE dbo.ConvertTest
(
    BigIntColumn BIGINT NOT NULL,
    IntColumn INT NOT NULL,
    DateColumn VARCHAR(30)
);

CREATE INDEX BigIntIndex ON dbo.ConvertTest (BigIntColumn);
CREATE INDEX IntIndex ON dbo.ConvertTest (IntColumn);
CREATE INDEX DateIndex ON dbo.ConvertTest (DateColumn);

WITH Nums
AS (SELECT TOP (1000000)
           ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS n
    FROM master.sys.all_columns AS ac1
    CROSS JOIN master.sys.all_columns AS ac2)
INSERT INTO dbo.ConvertTest
(
    BigIntColumn,
    IntColumn,
    DateColumn
)
SELECT Nums.n,
       Nums.n,
       DATEADD(HOUR, Nums.n, '1/1/1900')
FROM Nums;
If I run a very simple query like this:
DECLARE @param DATETIME;

SET @param = '3/15/1963 8:00:00';

SELECT ct.DateColumn
FROM dbo.ConvertTest AS ct
WHERE ct.DateColumn = @param;
I get an execution plan that looks like this:
The issue is that while I put dates into DateColumn, its data type is VARCHAR, not DATETIME. Passing a DATETIME value results in an implicit conversion, which we can see in the Predicate property of the Index Scan:
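In text form, that predicate reads roughly like this (a sketch of what the property shows; the exact qualification of the column name varies by database and version):

CONVERT_IMPLICIT(datetime, [ConvertTest].[DateColumn] as [ct].[DateColumn], 0) = [@param]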
That conversion is a function applied to the column, and that is what forces the scan. If I modify the code as follows:
DECLARE @param VARCHAR(30);

SET @param = '3/15/1963 8:00:00';

SELECT ct.DateColumn
FROM dbo.ConvertTest AS ct
WHERE ct.DateColumn = @param;
I get an execution plan that looks like this:
No change in data type is needed, so the index can be used. It’s a very simple demonstration of how implicit conversions hurt performance. Now, what about the integer columns? What happens when I run this code:
DECLARE @param INT;

SET @param = 650323;

SELECT ct.BigIntColumn
FROM dbo.ConvertTest AS ct
WHERE ct.BigIntColumn = @param;
Here’s the execution plan:
And here is the predicate:
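It reads something like this (again a sketch of what the Seek Predicate property shows, not a screenshot):

Seek Keys[1]: Prefix: [dbo].[ConvertTest].BigIntColumn = Scalar Operator(CONVERT_IMPLICIT(bigint, [@param], 0))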
In this case, we’re still getting an implicit conversion of the data type, but SQL Server is smart enough to realize that it’s still just an integer, so it can convert @param instead of the column, resulting in a seek. If I reverse the test so that it queries the INT column while passing a BIGINT value, it doesn’t even do a conversion of any kind.
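For completeness, here’s a sketch of that reversed test against the same table, simply swapping the column and the parameter type:

DECLARE @param BIGINT;

SET @param = 650323;

SELECT ct.IntColumn
FROM dbo.ConvertTest AS ct
WHERE ct.IntColumn = @param;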
In short, not all implicit conversions are a problem. However, it’s still a best practice to choose the right data type for each column, and to match the data types of your parameters and local variables to the columns they’re compared against. Taking these steps avoids implicit conversion issues entirely.
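If you want to find out whether implicit conversions are already happening on your instance, one rough approach (a sketch, and a blunt one: the string search will also match perfectly harmless conversions, and shredding the plan cache like this isn’t free on a busy server) is to look for CONVERT_IMPLICIT in cached plans:

-- Sketch: surface cached plans whose XML mentions CONVERT_IMPLICIT.
SELECT TOP (50)
       cp.usecounts,
       cp.objtype,
       qp.query_plan
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
WHERE CAST(qp.query_plan AS NVARCHAR(MAX)) LIKE N'%CONVERT_IMPLICIT%'
ORDER BY cp.usecounts DESC;

From there you can open the individual plans and check whether the conversion sits on a column (bad) or on a parameter (usually fine).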
For lots more information on query tuning, in August I’ll be doing an all-day pre-con at the SQLServerGeeks Annual Summit in Bangalore, India.

I’m also going to be doing an all-day training course before SQL Saturday Oslo in September.
Thank you for publishing this. I was looking for a resource to explain to one of my report developers why implicit conversion is such a big deal. I was able to give them this post along with a comparison of the execution plans from their query. If you added the execution times for comparison, it would be nearly perfect.
You’ve got the code, so you could run up the execution times pretty readily. With a small data set like this, they’re not hugely different, but eliminating the scan and the parallel execution does, in this case, improve performance.
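If you want to try that comparison yourself, a quick sketch (nothing fancy, just the standard session statistics; the variable names here are purely for illustration) is to run both versions of the query with timing turned on:

-- Sketch: compare elapsed time and reads for the two versions.
SET STATISTICS TIME ON;
SET STATISTICS IO ON;

DECLARE @dtParam DATETIME = '3/15/1963 8:00:00';

SELECT ct.DateColumn
FROM dbo.ConvertTest AS ct
WHERE ct.DateColumn = @dtParam; -- DATETIME parameter: implicit conversion, scan

DECLARE @vcParam VARCHAR(30) = '3/15/1963 8:00:00';

SELECT ct.DateColumn
FROM dbo.ConvertTest AS ct
WHERE ct.DateColumn = @vcParam; -- matching VARCHAR parameter: seek

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;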