A question came up on SQL Server Central as to why or how Profiler could bring down the production server. I was aware of the fact that Profiler caused problems and that’s why you should always use a server-side trace on production systems. Duh! Of course…. But why? I didn’t have an answer. I just “knew” what everyone told me. So I set out to do some research. Initially I hit my favorite source of information on SQL Server (and it should be yours too) the Books Online.
Here is what I was reading on the architecture of Profiler. Good, but not enough.
I still want more information, so I’ve gone out a’wanderin’ through the interwebs. First, Linchi Shea has this great set of tests comparing Profiler to server side traces. Fantastic information, and he proves what lots of people “know,” but he doesn’t explain why it’s occurring, simply that it does, which, by and large, is good enough for me most of the time, but my interest was piqued.
I still haven’t read Inside 2005: Query Tuning and Optimization (trying very hard not to plagiarize, even by accident, that’s about done and I can’t wait to read this), but one chapter, just on trace and Profiler is available online. That’s a scary read (read it to find out what happens if the file buffer fills). But there is a mechanism that flushes the file reader queue every four seconds, which is a trick to apparently reduce load on systems with a high number of transactions. That and the general speed of a file system over SMO seems to be why there is better performance for the server-side trace. They go on to quote Linchi Shea’s article above, which still doesn’t provide the answer as to why Profiler is, not only slower, but actually slows the system.
Ah, finally, here’s the answer. Profiler, as opposed to trace, actually requires memory latches, and exclusive latches at that, in order for the sessions on SQL Server to write the event out to Profiler. Then, you factor in the network, that the machine consuming this stuff is probably not terribly powerful… You get Profiler slowing down the server… Cool!
UPDATE: Missed some links. Thanks Gail.
And it you want to really stuff things up, make Profiler write to a table destination. Now the latches have to be held not until the client machine receives the event, but until it received the event and fires off an insert into a database.
Fast way to bring down a server – Use Profiler to trace the Lock:Acquired event and write the trace into a remote server.
p.s. The link to the SSC discussion’s missing
Great research Grant. Now another resource to use when I get the articles for I’m working on written.
Oops. Thanks Gail.
I’m still waiting for your article Jack, but I had a bug in my ear and I had to get rid of it. Now you write it up correctly.
Gail,
Yeah, I probably should have mentioned how stupid & insane that is, but it seems to be covered by others so well already. Still, it makes you wonder why on earth MS would allow people to shoot themselves in the foot so thoroughly.
if only i had a dollar for every time a developer told me that they NEEDED to run profiler in order to find out why the server was so slow. or, my personal favorite, in order to find out how to tune a query.
let me show you what an execution plan looks like…perhaps we could start with that before firing up profiler on your production box?
Yeah, I get that a lot too. That’s one of the reasons we set up server side tracing most production machines. A developer asks, we deliver a file, they whine, we educate, they go away… for a time.
[…] How Profiler can Bring Down a Server – when people ask you to troubleshoot a database server using Profiler, make sure you are using a screaming fast machine for the profiling, and even then, be wary. I’ve had this happen to me on a data warehouse too. […]
“if only i had a dollar for every time a developer told me that they NEEDED to run profiler in order to find out why the server was so slow. or, my personal favorite, in order to find out how to tune a query.
let me show you what an execution plan looks like…perhaps we could start with that before firing up profiler on your production box?”
Can you please explain to me in simple terms how you find what plan to look at in the first place in the case of a poor performing production system? Doesn’t it make sense to profile to filter the low hanging fruit before diving in at the explain plans of SQL that may or may not be running slow? What if you have hundreds of SQL statements running at any given time.
It seems the DBA should be activtly monitoring the server. So that a developer who normally doesn’t have the needed access or skills isn’t stuck trying to pull of a task they aren’t equiped for.
I don’t want to speak for SQL Batman (he might hurt me), but if you don’t know what’s slowing things down, then yes, a server-side trace (not Profiler) and Performance Monitor (or data from Operations Manager or a third party product) are absolutely necessary in order to identify where the slow performance is occurring. You also should be monitoring it before someone else notices that things are slow.
I think SQL Batman’s issue was, that instead of hitting the tools available when they already know where the slow down is, they’re trying to get access to an inappropriate tool to start trouble-shooting in a way that’s not leading towards the answer, but rather introducing noise (not to mention that Profiler shouldn’t be run on production, as the research above shows).
[…] the Profiler GUI against a production server is not something you should do. I’ve outlined my research into exactly why in the past. But I hit another little issue with the Profiler GUI as part of work I’m doing on a […]
[…] addition, The Profiler can crash the server! SQL profiler may not do well on a production server (Profiler Research « Home of the Scary DBA). The complexity in performance tuning and troubleshooting is obvious. However, the map I am […]
[…] the production machine in order to capture the workload (see Grant Fritchey’s post on the Profiler GUI to learn why the server-side trace is always preferred).  She captures the trace data in a table […]
Nice one, thanks Grant. Bringing it all together…
Just to note, seems to be a small typo in the link to BOL – it points to: “http://msdn.microsoft.com/en-us/library/ms187933.aspx%5Dhttp://msdn.microsoft.com/en-us/library/ms187933.aspx”
They must have changed the links. I’ll see if I can remember what I was looking at four years ago. I’ll get something in place.