A question that comes up on the forums all the time: which query used the most CPU? You may see variations on memory, I/O, or just resources in general. People want to know this information, and it’s not readily apparent how to get it.
While you can look at what’s in cache through the DMVs to see the queries there, you don’t get any real history and you don’t get any detail of when the executions occurred. You can certainly take advantage of the Query Store for this kind of information. However, even that data is aggregated over an interval (an hour by default). If you really want a detailed analysis of which query used the most CPU, you need to first set up an Extended Events session and then consume that data.
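To see why the cache-based approach falls short, here’s a sketch of the kind of DMV query people usually start with. It uses the standard `sys.dm_exec_query_stats` and `sys.dm_exec_sql_text` objects; the numbers are cumulative since the plan was cached, so they vanish when a plan is evicted or the instance restarts, and you get no per-execution timing:

```sql
-- Top CPU consumers from the plan cache (no history, no execution times)
SELECT TOP 10
       qs.total_worker_time AS total_cpu_microseconds,
       qs.execution_count,
       SUBSTRING(st.text,
                 (qs.statement_start_offset / 2) + 1,
                 ((CASE qs.statement_end_offset
                       WHEN -1 THEN DATALENGTH(st.text)
                       ELSE qs.statement_end_offset
                   END - qs.statement_start_offset) / 2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS qs
    CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_worker_time DESC;
```

Useful as a quick check, but it can’t tell you when the executions happened, which is exactly what the Extended Events session below gives you.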
A Detailed Extended Events Session
Because I want a lot of information, I’m going to create a fairly detailed Extended Events session. I’m running this in a test environment and filtering down to a single database, so I’ll be able to deal with the amount of information generated. Take into account that, as configured, this will generate quite a bit of data. If you run it in a production environment, either run it for short periods of time or add additional filtering.
Here’s the script to define the Extended Events Session:
CREATE EVENT SESSION QueryExecutionDetails
ON SERVER
ADD EVENT sqlserver.rpc_completed
(WHERE (sqlserver.database_name = N'AdventureWorks')),
ADD EVENT sqlserver.rpc_starting
(WHERE (sqlserver.database_name = N'AdventureWorks')),
ADD EVENT sqlserver.sp_statement_completed
(WHERE (sqlserver.database_name = N'AdventureWorks')),
ADD EVENT sqlserver.sql_batch_completed
(WHERE (sqlserver.database_name = N'AdventureWorks')),
ADD EVENT sqlserver.sql_batch_starting
(WHERE (sqlserver.database_name = N'AdventureWorks')),
ADD EVENT sqlserver.sql_statement_completed
(SET collect_statement = (1)
WHERE (sqlserver.database_name = N'AdventureWorks'))
ADD TARGET package0.event_file
(SET filename = N'C:\ExEvents\QueryExecutionDetails.xel')
WITH (TRACK_CAUSALITY = ON);
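One thing worth calling out: `CREATE EVENT SESSION` only defines the session. Nothing gets captured until you start it, and you’ll want to stop it again when you’re done collecting:

```sql
-- Start the session so events begin flowing to the event_file target
ALTER EVENT SESSION QueryExecutionDetails ON SERVER STATE = START;

-- ...and stop it when you've captured what you need
ALTER EVENT SESSION QueryExecutionDetails ON SERVER STATE = STOP;
```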
The information I’m capturing includes the start and stop of a given batch as well as the individual statements within the batch, and then the start and stop of a given procedure call as well as the statements within that call. I’ve also added causality tracking so that I can very easily group each set of statements together should I desire to.
Identifying Which Query Used the Most CPU
Once the data is coming in, you have a bunch of choices for how to consume it. Let’s assume that I just want to open one of the QueryExecutionDetails.xel files. Once open, you’d see something like this:
This shows the events and the details. Clicking on one of the events opens up the details like this:
You can see that we’ve captured cpu_time on this event. However, if you were to try to scroll through the events captured (there are 1,865 of them), it might take a while. Instead, let’s put this tool to work. On the Extended Events data viewer toolbar is a button labeled “Choose Columns…”. Opening that, you’ll see a list of all the columns from all the events within the currently open file (or live Extended Events session). I’ve picked a number of columns, including cpu_time:
Once I click OK, you’ll notice that the top grid has completely changed. It now shows all the columns I have selected:
You’ll note that some events have a batch_text value and for others that’s null. Some events have reads or writes, and for others those too are null. That’s fine; each event captures a different set of columns. With all this in place, finding the query with the highest CPU cost is shockingly simple.
If I click twice on the cpu_time column, the data in it gets sorted. The first time, ascending. The second time, descending. Ta-da, the query that used the most CPU was a batch consisting of several queries:
That doesn’t actually answer the question, though: which individual query used the most CPU? Let’s use another trick built into Extended Events. We can filter the events by clicking the button labeled “Filters…” in the toolbar. I’ve limited the results to just the statements:
Once I apply this filter, I can definitively say exactly which query used the most CPU. I’m displaying this information now at the individual statement level:
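If you’d rather get the same answer in T-SQL instead of the data viewer, you can shred the event file with `sys.fn_xe_file_target_read_file`. This is a sketch assuming the file path from the session definition above (the wildcard picks up any rollover files); the XPath pulls the `cpu_time` and `statement` fields from the statement-level events:

```sql
-- Read the .xel file(s) and find the statements with the highest cpu_time
WITH xe AS
(
    SELECT CAST(event_data AS XML) AS event_xml
    FROM sys.fn_xe_file_target_read_file(
             'C:\ExEvents\QueryExecutionDetails*.xel', NULL, NULL, NULL)
)
SELECT TOP 10
       event_xml.value('(event/@name)[1]', 'nvarchar(60)') AS event_name,
       event_xml.value('(event/@timestamp)[1]', 'datetime2') AS event_time,
       event_xml.value('(event/data[@name="cpu_time"]/value)[1]', 'bigint') AS cpu_time,
       event_xml.value('(event/data[@name="statement"]/value)[1]', 'nvarchar(max)') AS statement_text
FROM xe
WHERE event_xml.value('(event/@name)[1]', 'nvarchar(60)')
      IN (N'sql_statement_completed', N'sp_statement_completed')
ORDER BY cpu_time DESC;
```

The XML shredding is slower than the viewer for ad hoc exploration, but it’s handy when you want to automate the analysis or join the results to other data.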
Best of all, the columns I’ve picked and filters I’ve applied are kept in a configuration file. So, when I reopen the same Session data, I’ll have my formatting in place.
Conclusion
You can easily see how I could add other columns to the grid and then sort or group by them to see how other queries are consuming resources. The ease with which I can quickly generate and consume this information using Extended Events is one of many reasons why I love using them so much. You should be using them on your systems as well. Using Extended Events, I can easily identify which query used the most CPU.
I have several upcoming precons, including one specific to how to better use Extended Events:
SQLSaturday Columbus Precon, June 7 2019, Columbus OH. Last chance to get signed up for this event and learn a whole lot more about Extended Events and other tools for tuning queries.
For info on DevOps, look to these precons coming up.
SQLSaturday Indianapolis Precon, Friday August 16th, 2019. Click here now to register.
SQLSaturday Oslo Precon, Friday August 30th, 2019. Click here to register.