Reinforcing the Importance of Statistics on Row Estimate

SQL Server, SQL Server 2016, T-SQL
I recently wrote an introductory post about the importance of statistics. I just received a reinforcement on how important they are during my own work. Bad Estimate I hit a weird problem while I was setting up a query to illustrate a point (blog to be published next week). Let's take the basis of the problem and explain it. I wanted data with distribution skew, so I ran this query to find out if there was a wide disparity between the top and bottom of the range: SELECT i.BillToCustomerID, COUNT(i.BillToCustomerID) AS TestCount FROM Sales.Invoices AS i GROUP BY i.BillToCustomerID ORDER BY TestCount ASC; Sure enough, the bottom of the range returned three (3) rows and the top returned 21,551. If I then run a query to retrieve just a few rows…
Read More

Stored Procedures Are Not Faster Than Views

SQL Server, SQL Server 2016, T-SQL
A performance tuning tip I saw recently said, "Views don't perform as well as stored procedures." <sigh> Let's break this down, just a little. Definitions A view is nothing but a query. The definition given by Microsoft is that it's a virtual table that's defined by a query. It's a query that is used to mask data or perform a complex join or similar behaviors. Views are queries that get stored in the database. Views can be easily referred to as if they were a tables. That's it. I've written in the past about views, including how they can possibly perform poorly. A stored procedure is also a query, or a series of queries, or, a whole lot more. Microsoft's definition of a stored procedure basically defines it as programming object that can accept input through…
Read More

A Sub-Query Does Not Hurt Performance

SQL Server, SQL Server 2016, T-SQL
The things you read on the internet, for example, "don't use a sub-query because that hurts performance." Truly? Where do people get these things? Let's Test It I've written before about the concept of cargo cult data professionals. They see one issue, one time, and consequently extrapolate that to all issues, all the time. It's the best explanation I have for why someone would suggest that a sub-query is flat out wrong and will hurt performance. Let me put a caveat up front (which I will reiterate in the conclusion, just so we're clear), there's nothing magically good about sub-queries just like there is nothing magically evil about sub-queries. You can absolutely write a sub-query that performs horribly, does horrible things, runs badly, and therefore absolutely screws up your system. Just as…
Read More

SELECT * Does Not Hurt Performance

SQL Server, SQL Server 2016, T-SQL
I read all the time how SELECT * hurts performance. I even see where people have said that you just have to supply a column list instead of SELECT * to get a performance improvement. Let's test it, because I think this is bunkum. The Test I have here two queries: SELECT * FROM Warehouse.StockItemTransactions AS sit; --and SELECT sit.StockItemTransactionID, sit.StockItemID, sit.TransactionTypeID, sit.CustomerID, sit.InvoiceID, sit.SupplierID, sit.PurchaseOrderID, sit.TransactionOccurredWhen, sit.Quantity, sit.LastEditedBy, sit.LastEditedWhen FROM Warehouse.StockItemTransactions AS sit; I'm basically going to run this a few hundred times each from PowerShell. I'll capture the executions using Extended Events and we'll aggregate the results. The Results I ran the test multiple times because, funny enough, I kept seeing some disparity in the results. One test would show a clear bias for one method, another test would…
Read More

Statistics Are Vital For Query Performance

SQL Server, SQL Server 2016
This is post #10 supporting  Tim Ford’s (b|t) initiative on #iwanttohelp, #entrylevel. Read about it here. When you send a query to your SQL Server database (and this applies to Azure SQL Database, APS, and Azure SQL Data Warehouse), that query is going to go through a process known as query optimization. The query optimization process figures out if you can use indexes to assist the query, whether or not it can seek against those indexes or has to use a scan, and a whole bunch of other stuff. The primary driving force in making these decisions are the statistics available on the indexes and on your tables. What Are Statistics Statistics are a mathematical construct to represent the data in your tables. Instead of scanning through the data each and every…
Read More

Correlated Datetime Columns

SQL Server, SQL Server 2016, T-SQL
SQL Server is a deep and complex product. There's always more to learn. For example, I had never heard of Correlated Datetime Columns. They were evidently introduced as a database option in SQL Server 2005 to help support data warehousing style queries (frequently using dates and times as join criteria or filter criteria). You can read up on the concept here from this older article from 2008 on MSDN. However, doing a search online I didn't find much else explaining how this  stuff worked (one article here, that didn't break this down in a way I could easily understand). Time for me to get my learn on. The concept is simple, turning this on for your database means that dates which have a relationship, the example from MSDN uses OrderDate and…
Read More

The Clustered Index Is Vital To Your Database Design

Azure, SQL Server, SQL Server 2016
This is post #9 supporting  Tim Ford’s (b|t) initiative on #iwanttohelp, #entrylevel. Read about it here. You get one clustered index per table. That bears repeating, you get one clustered index per table. Choosing a clustered index is an extremely important and fundamental aspect of all your SQL Server work. The one clustered index that you get determines how the data in your table is stored. Because the clustered index determines how your data is stored, it also determines how your data is retrieved. Much of SQL Server is engineered around the clustered index because it is such a foundational object for the rest of all behavior. Without a clustered index, the data in your table is stored in what is called a heap. It is essentially a pile, a heap, of data,…
Read More

There Is No Difference Between Table Variables, Temporary Tables and Common Table Expressions

SQL Server, SQL Server 2016, SQL Server 2017, T-SQL
I actually saw the above statement posted online. The person making the claim further stated that choosing between these three constructs was "personal preference" and didn't change at all the way SQL Server would choose to deal with them in a query. Let's immediately say, right up front, the title is wrong. Yes, there are very distinct differences between these three constructs. Yes, SQL Server will absolutely deal with these three constructs in different ways. No, picking which one is correct in a given situation is not about personal preference, but rather about the differences in behavior between the three. To illustrate just a few of the differences between these three constructs, I'll use variations of this query: SELECT * FROM Sales.Orders AS o JOIN Sales.OrderLines AS ol ON ol.OrderID = o.OrderID WHERE ol.StockItemID = 227; The…
Read More

Monitor Query Performance

Azure, SQL Server, SQL Server 2016
Blog post #7 in support of Tim Ford’s (b|t) #iwanttohelp, #entrylevel. Read about it here. Sooner or later when you're working with SQL Server, someone is going to complain that the server is slow. I already pointed out the first place you should look when this comes up. But what if they're more precise? What if, you know, or at least suspect, you have a problem with a query? How do you get information about how queries are behaving in SQL Server? Choices For Query Metrics It's not enough to know that you have a slow query or queries. You need to know exactly how slow they are. You must measure. You need to know how long they take to run and you need to know how many resources are…
Read More

Common Table Expression, Just a Name

SQL Server, SQL Server 2016, T-SQL
The Common Table Expression (CTE) is a great tool in T-SQL. The CTE provides a mechanism to define a query that can be easily reused over and over within another query. The CTE also provides a mechanism for recursion which, though a little dangerous and overused, is extremely handy for certain types of queries. However, the CTE has a very unfortunate name. Over and over I've had to walk people back from the "Table" in Common Table Expression. The CTE is just a query. It's not a table. It's not providing a temporary storage space like a table variable or a temporary table. It's just a query. Think of it more like a temporary view, which is also just a query. Every time I explain this, there are people who don't…
Read More