Azure SQL Data Warehouse can sometimes feel like it’s completely different from SQL Server, but under the covers, it is still (mostly) SQL Server and it does have execution plans. Let’s take a look at one.
I’ve created a new SQL Data Warehouse using the sample database available on the portal, AdventureWorksDW. Here’s a query against that database:
SELECT dd.FullDateAlternateKey AS OrderDate, dc.LastName, SUM(fis.SalesAmount) AS SumSalesAmount FROM dbo.FactInternetSales AS fis JOIN dbo.DimDate AS dd ON fis.OrderDateKey = dd.DateKey JOIN dbo.DimCustomer AS dc ON dc.CustomerKey = fis.CustomerKey GROUP BY dd.FullDateAlternateKey, dc.LastName HAVING SUM(fis.SalesAmount) > 5000.0 ORDER BY OrderDate DESC;
If I attempt to capture an execution plan using the SQL Server Management Studio GUI, nothing happens. If I try to use T-SQL commands, I get an error that those commands are not supported with this version of SQL Server. Same thing if I try to capture a plan using Visual Studio. So… now what? Enter our new command:
EXPLAIN SELECT dd.FullDateAlternateKey AS OrderDate,        dc.LastName,        SUM(fis.SalesAmount) AS SumSalesAmount FROM   dbo.FactInternetSales AS fis JOIN   dbo.DimDate AS dd        ON fis.OrderDateKey = dd.DateKey JOIN   dbo.DimCustomer AS dc        ON dc.CustomerKey = fis.CustomerKey GROUP BY dd.FullDateAlternateKey,        dc.LastName HAVING SUM(fis.SalesAmount) > 5000.0 ORDER BY OrderDate DESC;
If I run this through the SSMS query window, I get a syntax error. So we’re now in Visual Studio. This is how we generate an execution plan from within Azure SQL Data Warehouse. What you get is XML output in the results like this:
According to the documentation on EXPLAIN, I should be able to click on the XML and it will open up to explore. In my version of Visual Studio (2015), I didn’t find that to be the case. Instead I had to copy and paste the XML into an XML file window that I created within Visual Studio. This is what I finished with:
<?xml version="1.0" encoding="utf-8"?> <dsql_query number_nodes="1" number_distributions="60" number_distributions_per_node="60"> <sql>SELECT dd.FullDateAlternateKey AS OrderDate, dc.LastName, SUM(fis.SalesAmount) AS SumSalesAmount FROM dbo.FactInternetSales AS fis JOIN dbo.DimDate AS dd ON fis.OrderDateKey = dd.DateKey JOIN dbo.DimCustomer AS dc ON dc.CustomerKey = fis.CustomerKey GROUP BY dd.FullDateAlternateKey, dc.LastName HAVING SUM(fis.SalesAmount) > 5000.0 ORDER BY OrderDate DESC</sql> <dsql_operations total_cost="5.98868068474576" total_number_operations="13"> <dsql_operation operation_type="RND_ID"> <identifier>TEMP_ID_14</identifier> </dsql_operation> <dsql_operation operation_type="ON"> <location permanent="false" distribution="AllComputeNodes" /> <sql_operations> <sql_operation type="statement">CREATE TABLE [tempdb].[dbo].[TEMP_ID_14] ([DateKey] INT NOT NULL, [FullDateAlternateKey] DATE NOT NULL ) WITH(DATA_COMPRESSION=PAGE);</sql_operation> </sql_operations> </dsql_operation> <dsql_operation operation_type="BROADCAST_MOVE"> <operation_cost cost="1.99584" accumulative_cost="1.99584" average_rowsize="7" output_rows="1188" GroupNumber="12" /> <source_statement>SELECT [T1_1].[DateKey] AS [DateKey], [T1_1].[FullDateAlternateKey] AS [FullDateAlternateKey] FROM [DWTest].[dbo].[DimDate] AS T1_1</source_statement> <destination_table>[TEMP_ID_14]</destination_table> </dsql_operation> <dsql_operation operation_type="RND_ID"> <identifier>TEMP_ID_15</identifier> </dsql_operation> <dsql_operation operation_type="ON"> <location permanent="false" distribution="AllDistributions" /> <sql_operations> <sql_operation type="statement">CREATE TABLE [tempdb].[dbo].[TEMP_ID_15] ([OrderDateKey] INT NOT NULL, [CustomerKey] INT NOT NULL, [SalesAmount] MONEY NOT NULL ) WITH(DATA_COMPRESSION=PAGE);</sql_operation> </sql_operations> </dsql_operation> <dsql_operation operation_type="SHUFFLE_MOVE"> <operation_cost cost="3.93098847457627" accumulative_cost="5.92682847457627" average_rowsize="16" output_rows="60398" GroupNumber="10" /> <source_statement>SELECT [T1_1].[OrderDateKey] AS [OrderDateKey], [T1_1].[CustomerKey] AS [CustomerKey], [T1_1].[SalesAmount] AS [SalesAmount] FROM [DWTest].[dbo].[FactInternetSales] AS T1_1</source_statement> <destination_table>[TEMP_ID_15]</destination_table> <shuffle_columns>CustomerKey;</shuffle_columns> </dsql_operation> <dsql_operation operation_type="RND_ID"> <identifier>TEMP_ID_16</identifier> </dsql_operation> <dsql_operation operation_type="ON"> <location permanent="false" distribution="AllDistributions" /> <sql_operations> <sql_operation type="statement">CREATE TABLE [tempdb].[dbo].[TEMP_ID_16] ([FullDateAlternateKey] DATE NOT NULL, [LastName] NVARCHAR(50) COLLATE SQL_Latin1_General_CP1_CI_AS, [col] MONEY NOT NULL ) WITH(DATA_COMPRESSION=PAGE);</sql_operation> </sql_operations> </dsql_operation> <dsql_operation operation_type="SHUFFLE_MOVE"> <operation_cost cost="0.0618522101694915" accumulative_cost="5.98868068474576" average_rowsize="111" output_rows="136.985" GroupNumber="25" /> <source_statement>SELECT [T1_1].[FullDateAlternateKey] AS [FullDateAlternateKey], [T1_1].[LastName] AS [LastName], [T1_1].[col] AS [col] FROM (SELECT SUM([T2_2].[SalesAmount]) AS [col], [T2_1].[FullDateAlternateKey] AS [FullDateAlternateKey], [T2_2].[LastName] AS [LastName] FROM [tempdb].[dbo].[TEMP_ID_14] AS T2_1 INNER JOIN (SELECT [T3_2].[OrderDateKey] AS [OrderDateKey], [T3_2].[SalesAmount] AS [SalesAmount], [T3_1].[LastName] AS [LastName] FROM [DWTest].[dbo].[DimCustomer] AS T3_1 INNER JOIN [tempdb].[dbo].[TEMP_ID_15] AS T3_2 ON ([T3_2].[CustomerKey] = [T3_1].[CustomerKey])) AS T2_2 ON ([T2_1].[DateKey] = [T2_2].[OrderDateKey]) GROUP BY [T2_1].[FullDateAlternateKey], [T2_2].[LastName]) AS T1_1</source_statement> <destination_table>[TEMP_ID_16]</destination_table> <shuffle_columns>FullDateAlternateKey;</shuffle_columns> </dsql_operation> <dsql_operation operation_type="ON"> <location permanent="false" distribution="AllDistributions" /> <sql_operations> <sql_operation type="statement">DROP TABLE [tempdb].[dbo].[TEMP_ID_15]</sql_operation> </sql_operations> </dsql_operation> <dsql_operation operation_type="ON"> <location permanent="false" distribution="AllComputeNodes" /> <sql_operations> <sql_operation type="statement">DROP TABLE [tempdb].[dbo].[TEMP_ID_14]</sql_operation> </sql_operations> </dsql_operation> <dsql_operation operation_type="RETURN"> <location distribution="AllDistributions" /> <select>SELECT [T1_1].[FullDateAlternateKey] AS [FullDateAlternateKey], [T1_1].[LastName] AS [LastName], [T1_1].[col] AS [col] FROM (SELECT [T2_1].[FullDateAlternateKey] AS [FullDateAlternateKey], [T2_1].[LastName] AS [LastName], [T2_1].[col] AS [col] FROM (SELECT SUM([T3_1].[col]) AS [col], [T3_1].[FullDateAlternateKey] AS [FullDateAlternateKey], [T3_1].[LastName] AS [LastName] FROM [tempdb].[dbo].[TEMP_ID_16] AS T3_1 GROUP BY [T3_1].[FullDateAlternateKey], [T3_1].[LastName]) AS T2_1 WHERE ([T2_1].[col] > CAST ((5000.0) AS DECIMAL (5, 1)))) AS T1_1 ORDER BY [T1_1].[FullDateAlternateKey] DESC</select> </dsql_operation> <dsql_operation operation_type="ON"> <location permanent="false" distribution="AllDistributions" /> <sql_operations> <sql_operation type="statement">DROP TABLE [tempdb].[dbo].[TEMP_ID_16]</sql_operation> </sql_operations> </dsql_operation> </dsql_operations> </dsql_query>
So now we just save this as a .sqlplan file and open it in SSMS, right?
Nope!
See, that’s not a regular execution plan, at all. Instead, it’s a D-SQL plan. It’s not the same as our old execution plans. You can’t open it as a graphical plan (and no, not even in that very popular 3rd party tool, I tried). You will have to learn how to read these plans differently because, well, they are different.
[…] by /u/ScaryDBA [link] […]
[…] Grant Fritchey shows how to build an execution plan for an Azure SQL Data Warehouse query: […]
Have you checked the portal, then to the activity in your Azure SQL Data Warehouse, and then clicked through the queries executed against your warehouse? While It’s not entirely graphical, it does parse out the execution steps into operations. It lets you see the operation, whether that step was a control, commpute, or storage operation, start and duration of the step. It’s a start at least. I would like to see that “very popular 3rd party tool” pick up DSQL plans too. It would be a boon to teaching how to improve developers queries in Azure SQL Data Warehouse.
No. I haven’t tried that yet. I have used the DBCC command to show the live execution. That’s pretty slick.
[…] have one Polybase node in our scaleout cluster.  Azure SQL Data Warehouse has a similar output; this Grant Fritchey post shows an example with number_distributions=”60″ instead of 8.  For more information […]
Graphical plan via SSMS’s button “Estimated execution plan” is now possible in SSMS 17.5.
I’ve heard. I haven’t had the chance to test this yet.