SELECT * Does Not Hurt Performance

SQL Server, SQL Server 2016, T-SQL
I read all the time how SELECT * hurts performance. I even see where people have said that you just have to supply a column list instead of SELECT * to get a performance improvement. Let's test it, because I think this is bunkum. The Test I have here two queries: SELECT * FROM Warehouse.StockItemTransactions AS sit; --and SELECT sit.StockItemTransactionID, sit.StockItemID, sit.TransactionTypeID, sit.CustomerID, sit.InvoiceID, sit.SupplierID, sit.PurchaseOrderID, sit.TransactionOccurredWhen, sit.Quantity, sit.LastEditedBy, sit.LastEditedWhen FROM Warehouse.StockItemTransactions AS sit; I'm basically going to run this a few hundred times each from PowerShell. I'll capture the executions using Extended Events and we'll aggregate the results. The Results I ran the test multiple times because, funny enough, I kept seeing some disparity in the results. One test would show a clear bias for one method, another test would…
Read More

Statistics Are Vital For Query Performance

SQL Server, SQL Server 2016
This is post #10 supporting  Tim Ford’s (b|t) initiative on #iwanttohelp, #entrylevel. Read about it here. When you send a query to your SQL Server database (and this applies to Azure SQL Database, APS, and Azure SQL Data Warehouse), that query is going to go through a process known as query optimization. The query optimization process figures out if you can use indexes to assist the query, whether or not it can seek against those indexes or has to use a scan, and a whole bunch of other stuff. The primary driving force in making these decisions are the statistics available on the indexes and on your tables. What Are Statistics Statistics are a mathematical construct to represent the data in your tables. Instead of scanning through the data each and every…
Read More

Correlated Datetime Columns

SQL Server, SQL Server 2016, T-SQL
SQL Server is a deep and complex product. There's always more to learn. For example, I had never heard of Correlated Datetime Columns. They were evidently introduced as a database option in SQL Server 2005 to help support data warehousing style queries (frequently using dates and times as join criteria or filter criteria). You can read up on the concept here from this older article from 2008 on MSDN. However, doing a search online I didn't find much else explaining how this  stuff worked (one article here, that didn't break this down in a way I could easily understand). Time for me to get my learn on. The concept is simple, turning this on for your database means that dates which have a relationship, the example from MSDN uses OrderDate and…
Read More

The Clustered Index Is Vital To Your Database Design

Azure, SQL Server, SQL Server 2016
This is post #9 supporting  Tim Ford’s (b|t) initiative on #iwanttohelp, #entrylevel. Read about it here. You get one clustered index per table. That bears repeating, you get one clustered index per table. Choosing a clustered index is an extremely important and fundamental aspect of all your SQL Server work. The one clustered index that you get determines how the data in your table is stored. Because the clustered index determines how your data is stored, it also determines how your data is retrieved. Much of SQL Server is engineered around the clustered index because it is such a foundational object for the rest of all behavior. Without a clustered index, the data in your table is stored in what is called a heap. It is essentially a pile, a heap, of data,…
Read More

There Is No Difference Between Table Variables, Temporary Tables and Common Table Expressions

SQL Server, SQL Server 2016, SQL Server 2017, T-SQL
I actually saw the above statement posted online. The person making the claim further stated that choosing between these three constructs was "personal preference" and didn't change at all the way SQL Server would choose to deal with them in a query. Let's immediately say, right up front, the title is wrong. Yes, there are very distinct differences between these three constructs. Yes, SQL Server will absolutely deal with these three constructs in different ways. No, picking which one is correct in a given situation is not about personal preference, but rather about the differences in behavior between the three. To illustrate just a few of the differences between these three constructs, I'll use variations of this query: SELECT * FROM Sales.Orders AS o JOIN Sales.OrderLines AS ol ON ol.OrderID = o.OrderID WHERE ol.StockItemID = 227; The…
Read More

Monitor Query Performance

Azure, SQL Server, SQL Server 2016
Blog post #7 in support of Tim Ford’s (b|t) #iwanttohelp, #entrylevel. Read about it here. Sooner or later when you're working with SQL Server, someone is going to complain that the server is slow. I already pointed out the first place you should look when this comes up. But what if they're more precise? What if, you know, or at least suspect, you have a problem with a query? How do you get information about how queries are behaving in SQL Server? Choices For Query Metrics It's not enough to know that you have a slow query or queries. You need to know exactly how slow they are. You must measure. You need to know how long they take to run and you need to know how many resources are…
Read More

Common Table Expression, Just a Name

SQL Server, SQL Server 2016, T-SQL
The Common Table Expression (CTE) is a great tool in T-SQL. The CTE provides a mechanism to define a query that can be easily reused over and over within another query. The CTE also provides a mechanism for recursion which, though a little dangerous and overused, is extremely handy for certain types of queries. However, the CTE has a very unfortunate name. Over and over I've had to walk people back from the "Table" in Common Table Expression. The CTE is just a query. It's not a table. It's not providing a temporary storage space like a table variable or a temporary table. It's just a query. Think of it more like a temporary view, which is also just a query. Every time I explain this, there are people who don't…
Read More

Same Query, Different Servers, Different Performance. Now What?

SQL Server, SQL Server 2016, T-SQL
Based on the number of times I see this question on forums, it must be occurring all the time. You have two different servers that, as far as you know, are identical in terms of their options and setup (although not necessarily in terms of power, think a test or pre-production system versus production). On these servers you have a database on each that, as far as you know, is the same as the other in terms of options, objects, maybe even data (although, this does mean that you have unmasked production information in your QA environment, which potentially means you're going to jail, might want to address this, especially now that I've told you about it, mens rea, you're welcome). On each database you run, as far as you know, the exact same query (whether…
Read More

CASE Statement in GROUP BY

SQL Server, SQL Server 2016
Set based operations means you should put everything into a single statement, right? Well, not really. People seem to think that having two queries is really bad, so when faced with logical gaps, they just cram them into the query they have. This is partly because SQL Server and T-SQL supports letting you do this, and it's partly because it looks like a logical extension of code reuse to arrive at a query structure that supports multiple logic chains. However, let's explore what happens when you do this on particular situation, a CASE statement in a GROUP BY clause. You see this a lot because a given set of data may be needed in slightly different context by different groups within the company. Like many of my example queries, this…
Read More

Use The Correct Data Type

SQL Server, SQL Server 2016, T-SQL
Blog post #5 in support of Tim Ford’s (b|t) #iwanttohelp, #entrylevel. Read about it here. Saying that you should use the correct data type seems like something that should be very straight forward. Unfortunately it's very easy for things to get confusing. Let's take a simple example from AdventureWorks. If I run this query: SELECT a.ModifiedDate FROM Person.Address AS a WHERE a.AddressID = 42; The output looks like this: 2009-01-20 00:00:00.000 Normal right? You see the year, the month and the day followed by the time in hours, minutes, and seconds as a decimal. Ah, but there is an issue. This query is supposed to be for the reporting system, and the business only cares about the date that the values in the Person.Address table have been modified, so they don't want…
Read More