Parallelism and Columnstore Indexes

T-SQL
Columnstore indexes are fascinating and really cool. Unfortunately, they're adding an interesting new wrinkle to an old problem. What's the Cost Threshold for Parallelism set to on your server? If you just said "The whatsis of whositz?" then the value is 5. The cost threshold is the point at which the estimated cost of an execution plan goes from definitely serial to possibly parallel. This default was set for SQL Server 2000 and hasn't been changed since. I've long argued, loudly, that it's too low. I've suggested changing it to a much higher value. My advice has gone from 35 to 50 and several places in between. You could just look at the median or the mode of costs on your system and use the higher of those values as…
Read More

Database Engine Tuning Advisor

Azure, SQL Server, T-SQL
I would love to see the Database Engine Tuning Advisor (DTA) pulled from the product. Completely. Heck, I feel bad that I included a chapter on it in my query tuning book (all updated for SQL Server 2014 by the way). Let me tell you why we need to pull this tool. First, I understand its purpose. It's supposed to be a fast and easy way to get some performance tuning done for people who just don't have the time or knowledge to go through the full process of gathering metrics, evaluating poor performers, understanding root causes and applying indexes to fix those causes. I also readily acknowledge that it actually is an amazing piece of software. If you don't agree with that, go read this white paper. With those acknowledgements…
Read More

Is Performance Better With LEFT JOIN or RIGHT JOIN?

T-SQL
I tend to write my queries using LEFT JOIN. Why? Because logically I see it in my head like this: Give me all the rows from this table and only those rows that match from the other table. But, wouldn't this logic work just as well: Give me only the rows in this table that match the rows from this other table where I'm selecting all of them. I know. If I worked on it some more I could make that a better sentence, but I'm pretty sure the logic is still sound. Only matching rows from one data set, all the rows from another data set. In short, RIGHT JOIN. I read recently that we ought to be making everything into a LEFT JOIN because it performs better. I suspect…
Read More

Effects of Persisted Columns on Performance

T-SQL
I live for questions. And my favorite questions are the ones where I'm not completely sure of the answer. Those are the questions that make me stop presenting in order to take a note so I can try to answer the question later, usually in a blog post. Guess where we are today? I was asked at SQL Bits in London about the direct impact of the PERSISTED operator on calculated columns, both inserts and selects. I didn't have a specific answer, so I wrote it down for later (and asked the, self-described, persisting Dane, to email me to remind me. He did, so I put together a few tests to try to answer his question. First, I created three tables: CREATE TABLE dbo.PersistTest ( PersistTestID INT IDENTITY(1,1) NOT NULL PRIMARY…
Read More

Constraints and SELECT Statements

Azure, SQL Server, T-SQL
I've posted previously about how a foreign key constraint can change how a SELECT query behaves. Logically that just makes sense. But other types of constraints don't affect execution plans do they? Yes. Let's take this constraint as an example: ALTER TABLE Sales.SalesOrderDetail WITH CHECK ADD  CONSTRAINT CK_SalesOrderDetail_UnitPrice CHECK  ((UnitPrice>=(0.00))) That will ensure that no values less than zero can slip in there. We can even validate it: INSERT Sales.SalesOrderDetail (SalesOrderID, CarrierTrackingNumber, OrderQty, ProductID, SpecialOfferID, UnitPrice, UnitPriceDiscount, rowguid, ModifiedDate ) VALUES (60176, -- SalesOrderID - int N'XYZ123', -- CarrierTrackingNumber - nvarchar(25) 1, -- OrderQty - smallint 873, -- ProductID - int 1, -- SpecialOfferID - int -22, -- UnitPrice - money 0.0, -- UnitPriceDiscount - money NEWID(), -- rowguid - uniqueidentifier GETDATE() -- ModifiedDate - datetime ); Will give me…
Read More

Does the New Cardinality Estimator Reduce Bad Parameter Sniffing

T-SQL
No. Next question. Although, that answer can be slightly, ever so slightly, nuanced... Parameter sniffing is a good thing. But, like a good wine, parameter sniffing can go bad. It always comes down to your statistics. A very accurate set of statistics with very little data skew (some values that have radically more/less data than other values) and a very even distribution (most values have approximately similar cardinality), and parameter sniffing is your bestest buddy on the planet (next to a tested backup). But, introduce some data skew, let the stats get wildly out of date, or suffer from seriously uneven distribution, and suddenly your best friend is doing unspeakable things to your performance (kind of like multi-statement table valued user defined functions). SQL Server 2014 has the first upgrade…
Read More

Simple Parameterization and Data Types

SQL Server, T-SQL
Simple paramaterization occurs when the optimizer determines that a query would benefit from a reusable plan, so it takes the hard coded values and converts them to a parameter. Great stuff. But... Let's take this example. Here's a very simple query: SELECT ct.* FROM Person.ContactType AS ct WHERE ct.ContactTypeID = 7; This query results in simple parameterization and we can see it in the SELECT operator of the execution plan: We can also see the parameter that was defined in use in the predicate of the seek operation: Hang on. Who the heck put the wrong data type in there that's causing an implicit conversion? The query optimizer did it. Yeah. Fun stuff. If I change the predicate value to 7000 or 700000 I'll get two more plans and I…
Read More

Monitoring for Timeouts

T-SQL
The question came up at SQL Rally, "Can you use Extended Events to monitor for query timeouts?" My immediate response was yes... and then I stood there trying to think of exactly how I'd do it. Nothing came quickly to mind. So, I promised to track down the answer and post it to the blog. My first thought is to use the Causality Tracking feature to find all the places where you have a sql_batch_starting without a sql_batch_completed (or the same thing with rpc calls). And you know what, that would work. But, before I got too deep in trying to write the query that would find all the mismatched attach_activity_id values that have a sequence of 1, but not one of 2, I did some additional reading. Seems there's…
Read More

Understand the True Source of Problems

SQL Server, T-SQL
There's an old joke that goes, "Doctor, doctor, it hurts when I do this." While the person in question swings their arm over their head. The doctor's response is, "Don't do that." Problem solved, right? Well, maybe not. Let's take a quick example from life. I do crossfit (yeah, I'm one of those, pull up a chair I'll tell you all about my clean & jerk progress... kidding). I've been experiencing pain in my shoulder. "It hurts when I do this." But, I'm not going to stop. I've been working with my coach to identify where the pain is and what stretches and warm-ups I can do to get around it (assuming it's not a real injury, and it isn't). In short, we're identifying the root cause and addressing the…
Read More

Common Table Expressions Are Not Tables

T-SQL
There's power in naming things. Supposedly some types of magic are even based on knowing the correct names for things. The name for the T-SQL clause Common Table Expression (CTE) is actually pretty accurate. It's an expression that looks like a table and can be used in common across the entire query (at least I think that's what the common part refers to). But note, I didn't say it was a table. It's not. It's an expression. If you look at the T-SQL definition at the link, it refers to a "temporary" result set. Now, to a lot of people, that means table. But it isn't. Let's look at this in more detail. Here's a query that defines a simple CTE and then uses it to query the date in the…
Read More