I keep thinking I’ve got a handle on the way execution plans are dealt with in SQL Server. I had a pretty simplistic view of things, there’s the estimated plan which comes out of the optimizer and there’s the actual plan which comes out of the data engine. The one shows the best guess for what will happen based on the query and the underlying statistics. The other shows what actually happened based on the query and the real data, load on the server, available memory, you get the idea. This model is easy to explain and understand. Too bad it’s not quite the way things work.
If you query the dynamic management function sys.dm_exec_query_plan, you can see a query plan. Once you drill down on the XML, or browse through the GUI, you’ll see that this is an estimated plan, with no execution statistics at all. The safe assumption here is that SQL Server takes the execution plan created by the optimizer and stores it. Ah, but is that exactly how it works? Can you generate an estimated execution plan for this batch statement:
CREATE TABLE dbo.TempTable (id INT IDENTITY(1, 1)
,val VARCHAR(50) ) ;
INSERT INTO dbo.TempTable
(val)
VALUES
(‘dude’) ;
SELECT tt.*
FROM dbo.TempTable AS tt ;
DROP TABLE dbo.TempTable ;
No. You can’t. If you try to generate an estimated plan you’ll get the error “Invalid object name ‘dbo.TempTable’.” This is because the algebrizer, a part of the process of the optimizer, which is what generates the estimated plan, can’t resolve the table dbo.TempTable because it doesn’t exist yet. However, if you execute the batch above and then run this query:
SELECT deqp.query_plan
FROM sys.dm_exec_cached_plans AS decp
CROSS APPLY sys.dm_exec_query_plan(decp.plan_handle) AS deqp
CROSS APPLY sys.dm_exec_sql_text(decp.plan_handle) AS dest
WHERE dest.text LIKE ‘create table%’
You can see the estimated plan for executing this batch. So… if the optimizer creates estimated plans, but it can’t create an estimated plan for this query since the algebrizer can’t resolve the object name… where did this estimated plan come from? Good question.
Here’s some more interesting info. You know that you can capture both estimated & actual execution plans through trace events, right? Watch what happens if you run the batch statement above and capture both the “Showplan XML” and “Showplan XML Statistics Profile” events, estimated and actual, respectively. You’ll get the estimated plan that is being stored in sys.dm_exec_query_plan and you’ll get an actual execution plan including actual rows vs. estimated, etc. Where did that execution plan come from? Another good question.
A better man than I, Tibor Karaszi, first noticed that there was, other than the execution statistics, no difference between “XML Showplan” and “XML Showplan Statistics Profile.” He asked whether or not we should ever expect to see a difference in a forum post. I jumped in with both feet, of course you’ll see a difference. After all, one is the estimated plan and one is the actual plan. Mr. Karaszi corrected a couple of my assumptions and suddenly, there I was, not understanding execution plans, again. I didn’t know where those estimated plans were coming from, not for sure. I assumed the optimizer, but then how were they getting past the algebrizer?
I had a theory and a mechanism for testing it. If I set up a trace and capture the recompile events using SQL:StmtRecompile and return the EventSubClass column, I can see what caused a recompile. When I run this query and check the trace information, I find that the EventSubClass resolves to 3-Deferred Compile. I’m not 100% certain that I’m correct here, but the following is what I think occurrs.
Because of the CREATE statement, the batch is going to fail in the optimizer. But, because it’s a DDL statement, it goes on to execute anyway. Then, a recompile puts the batch back through the optimizer. There, a table is now available where none was previously. An estimated execution plan is created. This plan, previously non-existent gets stored and can be accessed by us.
Within the facts available, this seems to explain the behavior. In fact, it’s basically supported by Connor Cunningham, one of the MS SQL Server developers, in some blog posts over at Conner vs. SQL.