1. Main Strategies
a. Caused by I/O?
i. How to know if there are physical I/O activities?
SET STATISTICS IO ON
SET STATISTICS TIME ON
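As a sketch (the output numbers below are illustrative, and SalesOrderHeader_test is the demo table used throughout these notes), the Messages tab then reports the I/O per table:

```sql
SET STATISTICS IO ON
SET STATISTICS TIME ON
GO
SELECT COUNT(*) FROM dbo.SalesOrderHeader_test
-- Messages tab (numbers illustrative):
--   Table 'SalesOrderHeader_test'. Scan count 1, logical reads 800,
--   physical reads 120, read-ahead reads 650, ...
-- physical reads / read-ahead reads > 0 means pages came from disk (real I/O);
-- logical reads alone means the pages were already in the buffer pool.
```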
ii. If the performance problem is related to I/O, what to do?
1. Memory bottleneck? Page-out/page-in problems? If not, continue:
2. Is the query heavily used or only occasionally used? If rarely used but it processes a large volume of data, suggest that the client run the query at non-peak times.
3. Is it possible to reduce the volume of data? Examine the execution plan.
4. Hardware (disk) improvement
b. Caused by heavy compile or recompile?
i. Focus on two types of queries
1. Simple but frequently used ones – is it possible to reduce the compile time via database re-design?
2. Complex queries
ii. Two ways to calculate the compile time
1. SET STATISTICS TIME ON – cannot be used for application code
2. SQL Trace
a. For batches, compile time = SQL:StmtStarting – SQL:BatchStarting
b. For SP, compile time = SQL:StmtStarting – SP:StmtStarting
c. For RPC, compile time = SQL:StmtStarting – RPC:Starting
d. For dynamic queries, compile time = 2nd SP:StmtStarting (for the actual query) – 1st SP:StmtStarting (for the EXEC)
USE [AdventureWorks]
GO
DROP PROC LongCompile
GO
CREATE PROC LongCompile ( @i INT )
AS
    DECLARE @cmd VARCHAR(MAX)
    DECLARE @j INT
    SET @j = 0
    SET @cmd = '
SELECT COUNT(b.SalesOrderID), SUM(p.Weight)
FROM dbo.SalesOrderHeader_test a
    INNER JOIN dbo.SalesOrderDetail_test b ON a.SalesOrderID = b.SalesOrderID
    INNER JOIN Production.Product p ON b.ProductID = p.ProductID
WHERE a.SalesOrderID IN (43659'
    WHILE @j < @i
    BEGIN
        SET @cmd = @cmd + ',' + STR(@j + 43659)
        SET @j = @j + 1
    END
    SET @cmd = @cmd + ')'
    PRINT @cmd
    EXEC(@cmd)
GO
SET STATISTICS TIME ON
EXEC LongCompile 100
iii. How to improve if the compile time is too long?
1. Make the query shorter or simpler
2. Delete extra indexes
3. Reuse the plans if possible
c. Wrong execution plan in each step?
i. Compare the actual returned rows and the estimated rows.
ii. Index seek or table scan? – Generally a seek is better than a scan, but if a wrong plan is used due to parameter mismatch, a scan may be better in some cases. So it depends.
iii. Right join type? Is Nested Loops, Merge, or Hash Join the right choice?
iv. Filter (i.e., the WHERE clause) after join, or join after filter? – Filtering before the join is preferred.
v. Summary of possible reasons for using a wrong execution plan
1. No statistics, or statistics out of date – auto update, or a manual/scheduled job to update
2. Complex query (e.g., complex WHERE) makes it hard for SQL Server to estimate the cost
3. A wrong plan is used due to uneven data distribution
a. Parameter sniffing – performance degradation due to a wrong parameter
b. Reason: uneven data distribution, combined with passed-in parameters that differ greatly
c. Solutions
i. Use EXEC(@cmd + @i)
USE [AdventureWorks]
GO
DROP PROC NOSniff
GO
CREATE PROC NOSniff ( @i INT )
AS
    DECLARE @cmd VARCHAR(MAX)
    DECLARE @iStr VARCHAR(12)
    SET @iStr = @i  -- implicit INT-to-VARCHAR conversion; EXEC() can only concatenate variables of compatible types
    SET @cmd = '
SELECT COUNT(b.[SalesOrderID]), SUM(p.[Weight])
FROM [dbo].[SalesOrderHeader_test] a
    INNER JOIN [dbo].[SalesOrderDetail_test] b ON a.[SalesOrderID] = b.[SalesOrderID]
    INNER JOIN [Production].[Product] p ON b.[ProductID] = p.[ProductID]
WHERE a.[SalesOrderID] = '
    EXEC(@cmd + @iStr)
GO
ii. Use Local Variables
CREATE PROC Sniff2 ( @i INT )
AS
    DECLARE @j INT
    SET @j = @i
    SELECT COUNT(b.[SalesOrderID]), SUM(p.[Weight])
    FROM [dbo].[SalesOrderHeader_test] a
        INNER JOIN [dbo].[SalesOrderDetail_test] b ON a.[SalesOrderID] = b.[SalesOrderID]
        INNER JOIN [Production].[Product] p ON b.[ProductID] = p.[ProductID]
    WHERE a.[SalesOrderID] = @j
GO
iii. Query Hint – OPTION(RECOMPILE)
USE [AdventureWorks]
GO
DROP PROC NoSniff_QueryHint_Recompile
GO
CREATE PROC NoSniff_QueryHint_Recompile ( @i INT )
AS
    SELECT COUNT(b.[SalesOrderID]), SUM(p.[Weight])
    FROM [dbo].[SalesOrderHeader_test] a
        INNER JOIN [dbo].[SalesOrderDetail_test] b ON a.[SalesOrderID] = b.[SalesOrderID]
        INNER JOIN [Production].[Product] p ON b.[ProductID] = p.[ProductID]
    WHERE a.[SalesOrderID] = @i
    OPTION ( RECOMPILE )
GO
iv. CREATE PROC xxx WITH RECOMPILE
USE [AdventureWorks]
GO
DROP PROC NoSniff_SPCreate_Recompile
GO
CREATE PROC NoSniff_SPCreate_Recompile ( @i INT )
WITH RECOMPILE
AS
    SELECT COUNT(b.[SalesOrderID]), SUM(p.[Weight])
    FROM [dbo].[SalesOrderHeader_test] a
        INNER JOIN [dbo].[SalesOrderDetail_test] b ON a.[SalesOrderID] = b.[SalesOrderID]
        INNER JOIN [Production].[Product] p ON b.[ProductID] = p.[ProductID]
    WHERE a.[SalesOrderID] = @i
GO
v. Force SQL Server to use Nested Loops/Merge/Hash Join
CREATE PROC NoSniff_QueryHint_JoinHint ( @i INT )
AS
    SELECT COUNT(b.[SalesOrderID]), SUM(p.[Weight])
    FROM [dbo].[SalesOrderHeader_test] a
        INNER JOIN [dbo].[SalesOrderDetail_test] b ON a.[SalesOrderID] = b.[SalesOrderID]
        INNER HASH JOIN [Production].[Product] p ON b.[ProductID] = p.[ProductID]
    WHERE a.[SalesOrderID] = @i
GO
vi. OPTIMIZE FOR
USE [AdventureWorks]
GO
DROP PROC NoSniff_QueryHint_OptimizeFor
GO
CREATE PROC NoSniff_QueryHint_OptimizeFor ( @i INT )
AS
    SELECT COUNT(b.[SalesOrderID]), SUM(p.[Weight])
    FROM [dbo].[SalesOrderHeader_test] a
        INNER JOIN [dbo].[SalesOrderDetail_test] b ON a.[SalesOrderID] = b.[SalesOrderID]
        INNER JOIN [Production].[Product] p ON b.[ProductID] = p.[ProductID]
    WHERE a.[SalesOrderID] = @i
    OPTION ( OPTIMIZE FOR ( @i = 74 ) )
GO
vii. Plan Guide
USE [AdventureWorks]
GO
EXEC [sys].[sp_create_plan_guide]
    @name = N'Guide1',
    @stmt = N'SELECT COUNT(b.[SalesOrderID]),SUM(p.[Weight])
FROM [dbo].[SalesOrderHeader_test] a
INNER JOIN [dbo].[SalesOrderDetail_test] b
ON a.[SalesOrderID]=b.[SalesOrderID]
INNER JOIN [Production].[Product] p
ON b.[ProductID]=p.[ProductID]
WHERE a.[SalesOrderID]=@i',
    @type = N'OBJECT',
    @module_or_batch = N'Sniff',
    @params = NULL,
    @hints = N'option(optimize for(@i=75124))';
GO
viii. Comparison of the methods for fighting parameter sniffing

| Method                         | Need to change the proc | Need to recompile each time | Accuracy of execution plan |
| Exec(@DynamicSQL)              | Yes                     | Yes                         | Excellent                  |
| Local variable                 | Yes                     | No                          | Good                       |
| Query hint "Recompile"         | Yes                     | Yes                         | Excellent                  |
| Query hint with specified join | Yes                     | No                          | Fair                       |
| Query hint OPTIMIZE FOR        | Yes                     | No                          | Good+                      |
| Plan Guide                     | No                      | No                          | Good+                      |
4. Non-sargable predicates used in the WHERE clause, making the index unusable
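For illustration (a sketch against the same demo table), wrapping the column in a function makes the predicate non-sargable, while an equivalent range on the bare column can use an index on OrderDate:

```sql
-- Non-sargable: the function on the column prevents an index seek
SELECT COUNT(*)
FROM dbo.SalesOrderHeader_test
WHERE YEAR(OrderDate) = 2004

-- Sargable: a half-open range on the bare column allows an index seek
SELECT COUNT(*)
FROM dbo.SalesOrderHeader_test
WHERE OrderDate >= '20040101' AND OrderDate < '20050101'
```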
d. Caused by index, table structure, or DML statements? – modify the indexes and use plan guides
i. Covering index vs. INCLUDE
1. A covering index covers some or all of the columns in the query. (A covering index – which is used in covered queries – can provide some or all of the indexed columns that a covered query uses. If a covering index is also a composite index, i.e., it indexes more than one column in its base table or view, it might contain columns that aren't used in the covered query but are used instead in other queries that have overlapping columns.)
2. Select the 1st key column carefully, as the statistics histogram is built only on the 1st column.
3. You do not have to include all columns, especially lengthy ones, because a covering index arranges its columns in the B-tree.
4. An index with included columns (A INCLUDE B) is an alternative to a covering index on (A, B) – it has less overhead because the included columns are not part of the B-tree key structure.
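A sketch of the two variants on the demo table (the index names are made up):

```sql
-- Covering composite index: both columns are key columns in the B-tree
CREATE NONCLUSTERED INDEX IX_Detail_ProductID_OrderQty
ON dbo.SalesOrderDetail_test (ProductID, OrderQty)

-- Index with an included column: only ProductID is a key column; OrderQty is
-- stored at the leaf level only, so it still covers the same query with less
-- key-maintenance overhead
CREATE NONCLUSTERED INDEX IX_Detail_ProductID_Incl_OrderQty
ON dbo.SalesOrderDetail_test (ProductID) INCLUDE (OrderQty)
```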
ii. Primary Key and Clustered Index
1. The PK is logical; the clustered index is physical.
2. By default, the PK column gets a clustered index, but it can be a non-clustered index as well.
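A minimal sketch of separating the two choices (table and index names are illustrative):

```sql
-- The primary key is declared non-clustered, so it stays a purely logical constraint
CREATE TABLE dbo.Demo
(
    ID INT NOT NULL CONSTRAINT PK_Demo PRIMARY KEY NONCLUSTERED,
    CreatedAt DATETIME NOT NULL
)
-- The physical row order follows the clustered index, placed on a different column
CREATE CLUSTERED INDEX CIX_Demo_CreatedAt ON dbo.Demo (CreatedAt)
```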
iii. How to know if it is an index problem?
1. Step 1 – find the high-cost operator
2. Step 2 – is it a table scan or an index scan?
3. Step 3 – compare the number of rows returned with the number of rows in the table; if it is much less, then a SCAN is not a good option – the table lacks an index to seek the data.
iv. How to find missing indexes?
1. SQL Profiler – use the event Performance – Showplan XML Statistics Profile, then export it to a file (right-click)
2. SELECT * FROM sys.dm_db_missing_index_details
3. DTA (Database Engine Tuning Advisor)
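The details view is most useful when joined to its companion DMVs, which add usage counts and an estimated impact; a minimal sketch:

```sql
SELECT  d.statement AS table_name,
        d.equality_columns, d.inequality_columns, d.included_columns,
        s.user_seeks, s.avg_user_impact
FROM    sys.dm_db_missing_index_details d
        INNER JOIN sys.dm_db_missing_index_groups g
            ON d.index_handle = g.index_handle
        INNER JOIN sys.dm_db_missing_index_group_stats s
            ON g.index_group_handle = s.group_handle
ORDER BY s.user_seeks * s.avg_user_impact DESC  -- rough ranking of candidates
```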
v. Using plan guides
1. Useful in some limited scenarios
2. Command Tune-Up – what else can we do, after the solutions for memory/I/O bottlenecks, blocking and deadlocks, compile/recompile, parameter sniffing, indexes, and plan guides?
a. Use SARGable filters; do not use NOT or <>, built-in functions, or computations in filters. If you have to use UPPER(name), for instance, add a computed column and index it.
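A sketch of the UPPER(name) case (table and column names are illustrative):

```sql
-- UPPER(name) = 'SMITH' on its own is non-sargable; indexing a computed column
-- turns the function-based filter into a plain index seek
ALTER TABLE dbo.Customers ADD name_upper AS UPPER(name)
CREATE NONCLUSTERED INDEX IX_Customers_NameUpper ON dbo.Customers (name_upper)

-- Filter on the computed column instead of on UPPER(name) directly
SELECT * FROM dbo.Customers WHERE name_upper = 'SMITH'
```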
b. If the passed-in parameter changes inside the stored procedure, there are two options: (1) use OPTION(RECOMPILE); (2) break the stored procedure into a main one and a sub one.
CREATE PROC GetRecentSales ( @date DATETIME )
AS
BEGIN
    -- Note: if @date is NULL, it will be replaced with a computed date,
    -- so the sniffed parameter value no longer matches the one actually used
    IF @date IS NULL
        SET @date = DATEADD(mm, -3,
            ( SELECT MAX([OrderDate])
              FROM [dbo].[SalesOrderHeader_test] ))
    SELECT SUM(d.[OrderQty])
    FROM [dbo].[SalesOrderHeader_test] h,
         [dbo].[SalesOrderDetail_test] d
    WHERE h.[SalesOrderID] = d.[SalesOrderID]
        AND h.[OrderDate] > @date
END
GO
EXEC [sys].[sp_recompile] @objname = N'GetRecentSales'
GO
DBCC FREEPROCCACHE
GO
-- Option 1: use the query hint OPTION(RECOMPILE)
ALTER PROC GetRecentSales ( @date DATETIME )
AS
BEGIN
    IF @date IS NULL
        SET @date = DATEADD(mm, -3,
            ( SELECT MAX([OrderDate])
              FROM [dbo].[SalesOrderHeader_test] ))
    SELECT SUM(d.[OrderQty])
    FROM [dbo].[SalesOrderHeader_test] h,
         [dbo].[SalesOrderDetail_test] d
    WHERE h.[SalesOrderID] = d.[SalesOrderID]
        AND h.[OrderDate] > @date
    OPTION ( RECOMPILE )
END
GO
-- Option 2: move the potentially problematic statement into its own sub stored
-- procedure, and have the original procedure call that sub procedure instead of
-- containing the statement itself. For example:
USE [AdventureWorks]
GO
-- The sub stored procedure
CREATE PROCEDURE GetRecentSalesHelper ( @date DATETIME )
AS
BEGIN
    SELECT SUM(d.[OrderQty])
    FROM [dbo].[SalesOrderHeader_test] h,
         [dbo].[SalesOrderDetail_test] d
    WHERE h.[SalesOrderID] = d.[SalesOrderID]
        AND h.[OrderDate] > @date
END
GO
-- The main stored procedure
ALTER PROC GetRecentSales ( @date DATETIME )
AS
BEGIN
    IF @date IS NULL
        SET @date = DATEADD(mm, -3,
            ( SELECT MAX([OrderDate])
              FROM [dbo].[SalesOrderHeader_test] ))
    EXEC GetRecentSalesHelper @date  -- call the sub sp with the final parameter value
END
GO
c. Temp table vs. table variable
i. Temp table – can have statistics and indexes; use it for a large number of rows
ii. Table variable – for a small number of rows (<100)
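A sketch of the two choices:

```sql
-- Temp table: supports extra indexes and column statistics, so the optimizer can
-- estimate row counts well for larger intermediate result sets
CREATE TABLE #BigOrders (SalesOrderID INT PRIMARY KEY, OrderDate DATETIME)
CREATE NONCLUSTERED INDEX IX_BigOrders_OrderDate ON #BigOrders (OrderDate)

-- Table variable: no statistics (the optimizer typically assumes very few rows),
-- so keep it for small row counts
DECLARE @FewOrders TABLE (SalesOrderID INT PRIMARY KEY)
```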
d. Simplify complex queries
i. If you have to use a dynamic query, limit the number of condition choices and the rows returned
ii. If there are too many tables in a join – consider de-normalizing them
iii. Avoid too many levels of nesting: views within views, or stored procedures within stored procedures
iv. Use ORDER BY only if you have to
v. Do you really have to return that many rows? If not, use TOP to limit them
vi. Use several simple statements to replace a complex one.