Let me be extremely clear up front: this is not my original work. I saw this post on DBA.StackExchange.com and I wanted to share and promote it. Nice work, FevziKartal. The rest of this post is just me replicating work already done by others. I just want to see it in action. Columns in Live Data Explorer Back when I wrote the 2017 version of my query performance tuning book (no link: a) get the 2022 version for reasons I'm about to explain, and b) this post is just about testing FevziKartal's work, not self-promotion), I was on board with #TeamXE. Jonathan Kehayias had taken me under his wing and explained the virtues of Extended Events and I was sold. I was also grossly ignorant. I thought that the way you…
The invitation this month for #PGSqlPhriday comes from Dian Fay. The topic is pretty simple, database change management. Now, I may have, once or twice, spoken about database change management, database DevOps, automating deployments, and all that sort of thing. Maybe. Once or twice. OK. This is my topic. I've got some great examples on taking changes from the schema on your PostgreSQL databases and then deploying them. All the technical stuff you could want. However, I don't want to talk about that today. Instead, I want to talk about something really important, the concept of rollbacks when it comes to database deployments. Why Are Rollbacks Difficult? The entire purpose of your PostgreSQL database is to persist, that is to store, the data. Then, of course, offer up a way…
I was talking with some developers from my team about monitoring, and I said, "We all use the same tools," referring to other monitoring software. Then, it hit me. How is AWS collecting monitoring data on its RDS servers, specifically the SQL Server instances? So, I set out to determine what happens on AWS RDS when it comes to the native monitoring. The Setup This part should be as obvious as it is easy. I'm going to use Extended Events. I've written before about how AWS RDS supports Extended Events, so I won't repeat all that here. I'll just leave you with the session I'm running to see what happens on AWS RDS: CREATE EVENT SESSION [ExEventTesting] ON SERVER ADD EVENT sqlserver.rpc_completed, ADD EVENT sqlserver.sql_batch_completed ADD TARGET package0.event_file (SET filename…
Honestly, sincerely, no kidding, I love Distributed Replay. Yes, I get it. Proof positive I'm an idiot. As if we needed proof. To be a little fair to me, I love what Distributed Replay could have been, with a little more love. However, fact is, it's on the deprecation list for 2022. Which means that whatever minimal amount of love, if any, Microsoft was giving to it is gone, forever. Unlike the Little Engine That Could, turns out that Distributed Replay was the Little Engine That Almost Could, But Didn't. Really Didn't. Let's discuss it a bit. Distributed Replay The concept is wonderful. Capture a bunch of queries from your production system. Replay them on a non-production system for testing. Add in the idea of being able to chain together…
In the last few Fundamentals posts you were introduced to a couple of ways to limit and control the data stored in the tables in your database. A primary key won’t allow a duplicate value. A foreign key won’t allow a value to be added that doesn’t already exist in the parent table and it will prevent data from being deleted. These are types of constraints on data in your database. There are a bunch of other ways to constrain the data in an effort to ensure that the data stored is exactly what the business needs. The next few Fundamentals posts will cover several methods of limiting data. Unique Constraints to Stop Duplicates When the concept of the primary key was introduced earlier in the series, two different types…
Everyone knows that you only get a single clustered index, right? Wouldn't it be great though if you could have two clustered indexes? Well, you can. Sort of. Let's talk about it. Two Clustered Indexes First I'm going to create a table: DROP TABLE IF EXISTS dbo.od; GO SELECT pod.PurchaseOrderID, pod.PurchaseOrderDetailID, pod.DueDate, pod.OrderQty, pod.ProductID, pod.UnitPrice, pod.LineTotal, pod.ReceivedQty, pod.RejectedQty, pod.StockedQty, pod.ModifiedDate INTO dbo.od FROM Purchasing.PurchaseOrderDetail AS pod; With that in place, let's start with a clustered index: CREATE CLUSTERED INDEX TestCIndex ON od (ProductID); And, a query to test with: SELECT od.PurchaseOrderID, od.PurchaseOrderDetailID, od.DueDate, od.OrderQty, od.ProductID, od.UnitPrice, od.LineTotal, od.ReceivedQty, od.RejectedQty, od.StockedQty, od.ModifiedDate FROM dbo.od WHERE od.ProductID BETWEEN 500 AND 510 ORDER BY od.ProductID; This results in the following execution plan: OK. Well done, Grant. That's how a clustered index works. The…
Quite a few years ago, I wrote a post about SELECT * and performance. That post had a bit of a click-bait title (freely admitted). I wrote the post because there was a really bad checklist of performance tips making the rounds (pretty sure it's still making the rounds). The checklist recommended a whole bunch of silly stuff. One silly thing it recommended was to simply substitute ALL columns (let me emphasize that again, name each and every column) instead of SELECT * because "it was faster". My post, linked above, showed that this statement was nonsense. Let's be clear, I'm not a fan of SELECT *. Yes, it has some legitimate functionality. However, by and large, using SELECT * causes performance problems. SELECT * Hurts The most fundamental place…
Reading execution plans in SQL Server is just hard. There's a lot to learn and understand. I previously outlined the basics I use to get started when I'm looking at an execution plan for the first time. However, just those pointers are not enough. I want to explain a little further why and how those basic steps are how you get started reading execution plans. To begin with, instead of talking about the first operator, which I've detailed before, we'll talk about the highest cost operators. Highest Cost Operator Every execution plan within SQL Server includes what the optimizer has determined to be the estimated cost of each operation. All these estimated operator costs are tallied up, and that makes up the estimated cost of the whole execution plan. You…
Wouldn't it be great to be able to put together queries and waits at the same time? You all capture query metrics using some method. Most of us query sys.dm_os_wait_stats or sys.dm_db_wait_stats. Combining them is hard. You could query the wait stats. Store the results in a table variable. Run the query in question. Then query the wait stats again into a different table variable. Join the two table variables together to find the differences. Ta-da, you have query waits. Well. Probably. If you're the only one running queries on the system. Also, you're not seeing system waits or other noise caused by activity on the system. Or, we could put Extended Events to work. Queries and Waits Just like Profiler/Trace, you can capture stored procedures, batches, and individual statements…
Of all the things that Extended Events does, I've found the ability to quickly and easily gather a little bit of data and then use the Data Explorer window Live Data grouping to aggregate it to be one of the greatest. Sure, if we're talking about using Extended Events on a busy production server, this method probably isn't going to work well. There, you are going to be better off querying the XML (I know, I know, but I have ways to help there too). But in development, when doing testing and query tuning, the Live Data window is a gift of the gods on par with fire or beer (it's not as good as whiskey). Live Data Grouping Let's imagine a scenario like this. You're working on some query…