I advocate for, use, document, teach, and just downright love, Extended Events. They are so much better than the old Trace Events (aka, Profiler) that it’s sometimes difficult to keep from just gushing. Let’s talk about a common situation that you’re going to run into on your servers all the time and how you can put Extended Events to work to help you, without actually doing any work at all.
What’s that? Be lazy and get rewards? Yes.
The Extended Events system_health Session
On your servers, any of them that are SQL Server 2008 or newer, right now, unless you’ve taken steps to prevent it, you’re running the Extended Events system_health session. It’s just happening, right now, on all your servers. There’s nothing you need to do about it at all. I’ll bet a lot of you never even knew it was there.
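Don’t take my word for it. Here’s a quick sketch of a query (against sys.dm_xe_sessions, which only shows sessions that are currently started) that will confirm the session is up and running on your instance:

-- Confirm the system_health session is currently running
SELECT dxs.name,
       dxs.create_time
FROM sys.dm_xe_sessions AS dxs
WHERE dxs.name = N'system_health';

If that returns a row, the session is running and already collecting data for you.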
If you follow the link you can see all the various types of information being gathered by the Extended Events system_health session. I won’t detail all of it here. Let me just provide a little context around how the session works. First and foremost, similar to the error log, this session consists of four files, each 5MB in size, rolling over as they get filled. For systems with a very high degree of activity, that means the information here may only be hours old. However, for most of us, this provides days’, if not weeks’, worth of information about the behavior of your system.
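If you’re curious what those file settings look like on your own instance, here’s a minimal sketch that reads them straight out of the session metadata (assuming the default system_health definition hasn’t been altered; max_file_size and max_rollover_files are the field names defined on the event_file target):

-- Read the file target settings for the system_health session
SELECT ses.name AS session_name,
       sesf.name AS setting_name,
       sesf.value AS setting_value
FROM sys.server_event_sessions AS ses
JOIN sys.server_event_session_fields AS sesf
    ON sesf.event_session_id = ses.event_session_id
WHERE ses.name = N'system_health'
      AND sesf.name IN (N'max_file_size', N'max_rollover_files');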
You’ll need to know where the files are located so that we can query them or open them up in the Live Data window. Here’s a simple query that will give you the path on any system:
SELECT dosdlc.path
FROM sys.dm_os_server_diagnostics_log_configurations AS dosdlc;
Now there are a ton of reasons why you should be taking advantage of the system_health session (deadlocks, for example), but I’m going to focus on just one: blocking.
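Just to give you a taste of one of those other reasons before we move on: the deadlock graphs are sitting in the same files. A minimal sketch of pulling them out, using the same pattern as the wait query further down, looks something like this:

-- Pull any deadlock reports captured by the system_health session
DECLARE @path NVARCHAR(260);

SELECT @path = dosdlc.path
FROM sys.dm_os_server_diagnostics_log_configurations AS dosdlc;

SELECT @path = @path + N'system_health_*';

SELECT CAST(fx.event_data AS XML) AS Event_Data
FROM sys.fn_xe_file_target_read_file(@path, NULL, NULL, NULL) AS fx
WHERE fx.object_name = 'xml_deadlock_report';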
Waits In The system_health Session
To see this information in action, we can set up a really simple query. Run this from two different query windows and let it sit for about 45 seconds or so (it only needs 30, but for our purposes, add a little padding):
BEGIN TRAN
UPDATE dbo.ErrorLog
SET ErrorTime = GETDATE()
WHERE ErrorLogID = 1;
After a little while, roll back both scripts. Then, we can run this script to take a look at the system_health information specifically on the waits caused by these two queries blocking one another:
DECLARE @path NVARCHAR(260); --to retrieve the local path of system_health files

SELECT @path = dosdlc.path
FROM sys.dm_os_server_diagnostics_log_configurations AS dosdlc;

SELECT @path = @path + N'system_health_*';

SELECT CAST(fx.event_data AS XML) AS Event_Data,
       fx.object_name
FROM sys.fn_xe_file_target_read_file(@path, NULL, NULL, NULL) AS fx
WHERE fx.object_name = 'wait_info';
The results on my system look like this:
<event name="wait_info" package="sqlos" timestamp="2018-06-01T12:04:18.915Z"> <data name="wait_type"> <value>5</value> <text>LCK_M_X</text> </data> <data name="opcode"> <value>1</value> <text>End</text> </data> <data name="duration"> <value>92115</value> </data> <data name="signal_duration"> <value>0</value> </data> <data name="wait_resource"> <value>0x0000002c2d3fb690</value> </data> <action name="sql_text" package="sqlserver"> <value>BEGIN TRAN UPDATE dbo.ErrorLog SET ErrorTime = GETDATE() WHERE ErrorLogID = 1; </value> </action> <action name="session_id" package="sqlserver"> <value>71</value> </action> <action name="callstack" package="package0"> <value>sqllang.dll+0x000000000053FF82 sqllang.dll+0x0000000000541241 sqllang.dll+0x0000000000540669 sqldk.dll+0x00000000001055E5 sqldk.dll+0x00000000000025E1 sqldk.dll+0x000000000000148F sqldk.dll+0x000000000000354B sqlmin.dll+0x0000000000067449 sqlmin.dll+0x000000000006720B sqlmin.dll+0x000000000000CF48 sqlmin.dll+0x000000000001391E sqlmin.dll+0x0000000000072EA6 sqlmin.dll+0x000000000002E7AF sqlmin.dll+0x000000000002E595 sqlmin.dll+0x0000000000031786 sqlTsEs.dll+0x0000000000001F17 sqlmin.dll+0x0000000000059010 sqlmin.dll+0x000000000003F6BF sqllang.dll+0x000000000001817B sqllang.dll+0x000000000002B86F sqllang.dll+0x000000000002B61F sqllang.dll+0x0000000000B8BFDB sqllang.dll+0x0000000000B7D7CB sqllang.dll+0x0000000000013A74</value> </action> </event>
You have the query, the wait_resource, and the duration: in short, everything you need to start identifying why you had excessive blocking on one of your servers, and all you had to do was reach out and grab it.
I’ll leave parsing the XML to others (check out this book, which I’m just starting to read).
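That said, if you want a quick start on the shredding, here’s a minimal sketch (assuming the same @path setup as above) that pulls the most useful columns out of the wait_info events with XQuery:

-- Shred the wait_info events into columns (minimal sketch)
DECLARE @path NVARCHAR(260);

SELECT @path = dosdlc.path
FROM sys.dm_os_server_diagnostics_log_configurations AS dosdlc;

SELECT @path = @path + N'system_health_*';

WITH WaitEvents
AS (SELECT CAST(fx.event_data AS XML) AS EventData
    FROM sys.fn_xe_file_target_read_file(@path, NULL, NULL, NULL) AS fx
    WHERE fx.object_name = 'wait_info')
SELECT we.EventData.value('(event/@timestamp)[1]', 'datetime2') AS EventTime,
       we.EventData.value('(event/data[@name="wait_type"]/text)[1]', 'nvarchar(60)') AS WaitType,
       we.EventData.value('(event/data[@name="duration"]/value)[1]', 'bigint') AS DurationMS,
       we.EventData.value('(event/action[@name="session_id"]/value)[1]', 'int') AS SessionID,
       we.EventData.value('(event/action[@name="sql_text"]/value)[1]', 'nvarchar(max)') AS SqlText
FROM WaitEvents AS we;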
Conclusion
Really, there was a query that ran unusually long yesterday? I wonder if it was blocked and waiting on resources? Let me take a quick look at the Extended Events system_health session. It really is that easy. This is a free, easily accessed resource that is available to you right now. Please, use it.
Want to see a WHOLE bunch of other tricks and methods with Extended Events, as well as a slew of other tools for dealing with SQL Server performance? I’m putting on an all-day seminar in various locations. Please follow the links below to take advantage of this great learning and sharing opportunity:
For SQLSaturday Indianapolis (and my first time speaking in Indiana) on August 10, 2018. Please go here to sign up.
I’ll be at SQLSaturday Sioux Falls (and my first time speaking in South Dakota) on August 18, 2018. Please sign up here.
For SQLSaturday Oslo on August 31, 2018. Click here right now to register.
Haven’t tried it on all versions yet, but this works in 2016, and based on the docs I think it will work in 2012+:
SELECT * FROM sys.fn_xe_file_target_read_file('system_health*.xel', 'system_health*.xem', NULL, NULL) fx
That way you can avoid the call to get the path.
Cool beans. Thanks.
I found this script by Wayne Sheffield. It’s a nice script, and it’s never fun to reinvent the wheel. I made a few mods and it works fabulously for me in my deadlock report emails.
https://blog.waynesheffield.com/wayne/code-library/deadlocks/shred-deadlock-graph/
Wayne is a great guy. Glad he’s sharing this kind of work.