1. Main Strategies
a. Caused by I/O?
i. How to know if there are physical I/O activities?
SET STATISTICS IO ON
SET STATISTICS TIME ON
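As a sketch (the output numbers below are illustrative, and SalesOrderHeader_test is the demo table used throughout these notes), the Messages tab then reports the I/O per table:

```sql
SET STATISTICS IO ON
SET STATISTICS TIME ON
GO
SELECT COUNT(*) FROM dbo.SalesOrderHeader_test
-- Messages tab (numbers illustrative):
--   Table 'SalesOrderHeader_test'. Scan count 1, logical reads 800,
--   physical reads 120, read-ahead reads 650, ...
-- physical reads / read-ahead reads > 0 means pages came from disk (real I/O);
-- logical reads alone means the pages were already in the buffer pool.
```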
ii. If the performance problem is related to I/O, what to do?
1. Memory bottleneck? Page-out/page-in problems? If not, continue:
2. Is the query heavily used or only occasionally used? If rarely used but it processes a large volume of data, suggest that the client run the query at non-peak times.
3. Is it possible to reduce the volume of data? Examine the execution plan.
4. Hardware (disk) improvement
b. Caused by heavy compile or recompile?
i. Focus on two types of queries
1. Simple but frequently used ones – is it possible to reduce the compile time via database re-design?
2. Complex queries
ii. Two ways to calculate the compile time
1. SET STATISTICS TIME ON – cannot be used for application code
2. SQL Trace
a. For batches, compile time = SQL:StmtStarting – SQL:BatchStarting
b. For SP, compile time = SQL:StmtStarting – SP:StmtStarting
c. For RPC, compile time = SQL:StmtStarting – RPC:Starting
d. For dynamic queries, compile time = 2nd SP:StmtStarting (for the actual query) – 1st SP:StmtStarting (for the EXEC)
USE [AdventureWorks]
GO
DROP PROC LongCompile
GO
CREATE PROC LongCompile ( @i INT )
AS
    DECLARE @cmd VARCHAR(MAX)
    DECLARE @j INT
    SET @j = 0
    SET @cmd = '
SELECT COUNT(b.SalesOrderID), SUM(p.Weight)
FROM dbo.SalesOrderHeader_test a
    INNER JOIN dbo.SalesOrderDetail_test b ON a.SalesOrderID = b.SalesOrderID
    INNER JOIN Production.Product p ON b.ProductID = p.ProductID
WHERE a.SalesOrderID IN (43659'
    WHILE @j < @i
    BEGIN
        SET @cmd = @cmd + ',' + STR(@j + 43659)
        SET @j = @j + 1
    END
    SET @cmd = @cmd + ')'
    PRINT @cmd
    EXEC(@cmd)
GO
SET STATISTICS TIME ON
EXEC LongCompile 100
iii. How to improve if the compile time is too long?
1. Make the query shorter or simpler
2. Delete extra indexes
3. Reuse the plans if possible
c. Wrong execution plan in each step?
i. Compare the actual returned rows and the estimated rows.
ii. Index seek or table scan? – Generally a seek is better than a scan, but if a wrong plan is used due to parameter mismatch, a scan may be better in some cases. So it depends.
iii. Right join type? Is Nested Loops, Merge, or Hash Join the right choice?
iv. Filter (i.e., the WHERE clause) after join, or join after filter? – Filtering before the join is preferred.
v. Summary of possible reasons for using a wrong execution plan
1. No statistics, or statistics out of date – auto update, or a manual/scheduled job to update
2. Complex query (e.g., complex WHERE) makes it hard for SQL Server to estimate the cost
3. A wrong plan is used due to uneven data distribution
a. Parameter sniffing – performance degradation due to a wrong parameter
b. Reason: uneven data distribution, combined with passed-in parameters that differ greatly
c. Solutions
i. Use EXEC(@cmd + @i)
USE [AdventureWorks]
GO
DROP PROC NOSniff
GO
CREATE PROC NOSniff ( @i INT )
AS
    DECLARE @cmd VARCHAR(MAX)
    DECLARE @iStr VARCHAR(12)
    SET @iStr = @i  -- implicit INT-to-VARCHAR conversion; EXEC() can only concatenate variables of compatible types
    SET @cmd = '
SELECT COUNT(b.[SalesOrderID]), SUM(p.[Weight])
FROM [dbo].[SalesOrderHeader_test] a
    INNER JOIN [dbo].[SalesOrderDetail_test] b ON a.[SalesOrderID] = b.[SalesOrderID]
    INNER JOIN [Production].[Product] p ON b.[ProductID] = p.[ProductID]
WHERE a.[SalesOrderID] = '
    EXEC(@cmd + @iStr)
GO
ii. Use Local Variables
CREATE PROC Sniff2 ( @i INT )
AS
    DECLARE @j INT
    SET @j = @i
    SELECT COUNT(b.[SalesOrderID]), SUM(p.[Weight])
    FROM [dbo].[SalesOrderHeader_test] a
        INNER JOIN [dbo].[SalesOrderDetail_test] b ON a.[SalesOrderID] = b.[SalesOrderID]
        INNER JOIN [Production].[Product] p ON b.[ProductID] = p.[ProductID]
    WHERE a.[SalesOrderID] = @j
GO
iii. Query Hint – OPTION(RECOMPILE)
USE [AdventureWorks]
GO
DROP PROC NoSniff_QueryHint_Recompile
GO
CREATE PROC NoSniff_QueryHint_Recompile ( @i INT )
AS
    SELECT COUNT(b.[SalesOrderID]), SUM(p.[Weight])
    FROM [dbo].[SalesOrderHeader_test] a
        INNER JOIN [dbo].[SalesOrderDetail_test] b ON a.[SalesOrderID] = b.[SalesOrderID]
        INNER JOIN [Production].[Product] p ON b.[ProductID] = p.[ProductID]
    WHERE a.[SalesOrderID] = @i
    OPTION ( RECOMPILE )
GO
iv. CREATE PROC xxx WITH RECOMPILE
USE [AdventureWorks]
GO
DROP PROC NoSniff_SPCreate_Recompile
GO
CREATE PROC NoSniff_SPCreate_Recompile ( @i INT )
WITH RECOMPILE
AS
    SELECT COUNT(b.[SalesOrderID]), SUM(p.[Weight])
    FROM [dbo].[SalesOrderHeader_test] a
        INNER JOIN [dbo].[SalesOrderDetail_test] b ON a.[SalesOrderID] = b.[SalesOrderID]
        INNER JOIN [Production].[Product] p ON b.[ProductID] = p.[ProductID]
    WHERE a.[SalesOrderID] = @i
GO
v. Force SQL Server to use Nested Loops/Merge/Hash Join
CREATE PROC NoSniff_QueryHint_JoinHint ( @i INT )
AS
    SELECT COUNT(b.[SalesOrderID]), SUM(p.[Weight])
    FROM [dbo].[SalesOrderHeader_test] a
        INNER JOIN [dbo].[SalesOrderDetail_test] b ON a.[SalesOrderID] = b.[SalesOrderID]
        INNER HASH JOIN [Production].[Product] p ON b.[ProductID] = p.[ProductID]
    WHERE a.[SalesOrderID] = @i
GO
vi. OPTIMIZE FOR
USE [AdventureWorks]
GO
DROP PROC NoSniff_QueryHint_OptimizeFor
GO
CREATE PROC NoSniff_QueryHint_OptimizeFor ( @i INT )
AS
    SELECT COUNT(b.[SalesOrderID]), SUM(p.[Weight])
    FROM [dbo].[SalesOrderHeader_test] a
        INNER JOIN [dbo].[SalesOrderDetail_test] b ON a.[SalesOrderID] = b.[SalesOrderID]
        INNER JOIN [Production].[Product] p ON b.[ProductID] = p.[ProductID]
    WHERE a.[SalesOrderID] = @i
    OPTION ( OPTIMIZE FOR ( @i = 74 ) )
GO
vii. Plan Guide
USE [AdventureWorks]
GO
EXEC [sys].[sp_create_plan_guide]
    @name = N'Guide1',
    @stmt = N'SELECT COUNT(b.[SalesOrderID]),SUM(p.[Weight])
FROM [dbo].[SalesOrderHeader_test] a
INNER JOIN [dbo].[SalesOrderDetail_test] b
ON a.[SalesOrderID]=b.[SalesOrderID]
INNER JOIN [Production].[Product] p
ON b.[ProductID]=p.[ProductID]
WHERE a.[SalesOrderID]=@i',
    @type = N'OBJECT',
    @module_or_batch = N'Sniff',
    @params = NULL,
    @hints = N'option(optimize for(@i=75124))';
GO
viii. Comparison of the methods for fighting parameter sniffing

| Method                         | Need to change the proc | Need to recompile each time | Accuracy of execution plan |
| Exec(@DynamicSQL)              | Yes                     | Yes                         | Excellent                  |
| Local variable                 | Yes                     | No                          | Good                       |
| Query hint "Recompile"         | Yes                     | Yes                         | Excellent                  |
| Query hint with specified join | Yes                     | No                          | Fair                       |
| Query hint OPTIMIZE FOR        | Yes                     | No                          | Good+                      |
| Plan Guide                     | No                      | No                          | Good+                      |
4. Non-sargable predicates used in the WHERE clause, making the index unusable
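For illustration (a sketch against the same demo table), wrapping the column in a function makes the predicate non-sargable, while an equivalent range on the bare column can use an index on OrderDate:

```sql
-- Non-sargable: the function on the column prevents an index seek
SELECT COUNT(*)
FROM dbo.SalesOrderHeader_test
WHERE YEAR(OrderDate) = 2004

-- Sargable: a half-open range on the bare column allows an index seek
SELECT COUNT(*)
FROM dbo.SalesOrderHeader_test
WHERE OrderDate >= '20040101' AND OrderDate < '20050101'
```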
d. Caused by index, table structure, or DML statements? – modify the indexes and use plan guides
i. Covering index vs. INCLUDE
1. A covering index covers some or all of the columns in the query. (A covering index – which is used in covered queries – can provide some or all of the indexed columns that a covered query uses. If a covering index is also a composite index, i.e., it indexes more than one column in its base table or view, it might contain columns that aren't used in the covered query but are used instead in other queries that have overlapping columns.)
2. Select the 1st key column carefully, as the statistics histogram is built only on the 1st column.
3. You do not have to include all columns, especially lengthy ones, because a covering index arranges its columns in the B-tree.
4. An index with included columns (A INCLUDE B) is an alternative to a covering index on (A, B) – it has less overhead because the included columns are not part of the B-tree key structure.
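A sketch of the two variants on the demo table (the index names are made up):

```sql
-- Covering composite index: both columns are key columns in the B-tree
CREATE NONCLUSTERED INDEX IX_Detail_ProductID_OrderQty
ON dbo.SalesOrderDetail_test (ProductID, OrderQty)

-- Index with an included column: only ProductID is a key column; OrderQty is
-- stored at the leaf level only, so it still covers the same query with less
-- key-maintenance overhead
CREATE NONCLUSTERED INDEX IX_Detail_ProductID_Incl_OrderQty
ON dbo.SalesOrderDetail_test (ProductID) INCLUDE (OrderQty)
```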
ii. Primary Key and Clustered Index
1. The PK is logical; the clustered index is physical.
2. By default, the PK column gets a clustered index, but it can be a non-clustered index as well.
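A minimal sketch of separating the two choices (table and index names are illustrative):

```sql
-- The primary key is declared non-clustered, so it stays a purely logical constraint
CREATE TABLE dbo.Demo
(
    ID INT NOT NULL CONSTRAINT PK_Demo PRIMARY KEY NONCLUSTERED,
    CreatedAt DATETIME NOT NULL
)
-- The physical row order follows the clustered index, placed on a different column
CREATE CLUSTERED INDEX CIX_Demo_CreatedAt ON dbo.Demo (CreatedAt)
```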
iii. How to know if it is an index problem?
1. Step 1 – find the high-cost operator
2. Step 2 – is it a table scan or an index scan?
3. Step 3 – compare the number of rows returned with the number of rows in the table; if it is much less, then a SCAN is not a good option – the table lacks an index to seek the data.
iv. How to find missing indexes?
1. SQL Profiler – use the event Performance – Showplan XML Statistics Profile, then export it to a file (right-click)
2. SELECT * FROM sys.dm_db_missing_index_details
3. DTA (Database Engine Tuning Advisor)
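The details view is most useful when joined to its companion DMVs, which add usage counts and an estimated impact; a minimal sketch:

```sql
SELECT  d.statement AS table_name,
        d.equality_columns, d.inequality_columns, d.included_columns,
        s.user_seeks, s.avg_user_impact
FROM    sys.dm_db_missing_index_details d
        INNER JOIN sys.dm_db_missing_index_groups g
            ON d.index_handle = g.index_handle
        INNER JOIN sys.dm_db_missing_index_group_stats s
            ON g.index_group_handle = s.group_handle
ORDER BY s.user_seeks * s.avg_user_impact DESC  -- rough ranking of candidates
```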
v. Using plan guides
1. Useful in some limited scenarios
2. Command Tune-Up – what else can we do, after the solutions for memory/I/O bottlenecks, blocking and deadlocks, compile/recompile, parameter sniffing, indexes, and plan guides?
a. Use SARGable filters; do not use NOT or <>, built-in functions, or computations in filters. If you have to use UPPER(name), for instance, add a computed column and index it.
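A sketch of the UPPER(name) case (table and column names are illustrative):

```sql
-- UPPER(name) = 'SMITH' on its own is non-sargable; indexing a computed column
-- turns the function-based filter into a plain index seek
ALTER TABLE dbo.Customers ADD name_upper AS UPPER(name)
CREATE NONCLUSTERED INDEX IX_Customers_NameUpper ON dbo.Customers (name_upper)

-- Filter on the computed column instead of on UPPER(name) directly
SELECT * FROM dbo.Customers WHERE name_upper = 'SMITH'
```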
b. If the passed-in parameter changes inside the stored procedure, there are two options: (1) use OPTION(RECOMPILE); (2) break the stored procedure into a main one and a sub one.
CREATE PROC GetRecentSales ( @date DATETIME )
AS
BEGIN
    -- Note: if @date is NULL, it will be replaced with a computed date,
    -- so the sniffed parameter value no longer matches the one actually used
    IF @date IS NULL
        SET @date = DATEADD(mm, -3,
            ( SELECT MAX([OrderDate])
              FROM [dbo].[SalesOrderHeader_test] ))
    SELECT SUM(d.[OrderQty])
    FROM [dbo].[SalesOrderHeader_test] h,
         [dbo].[SalesOrderDetail_test] d
    WHERE h.[SalesOrderID] = d.[SalesOrderID]
        AND h.[OrderDate] > @date
END
GO
EXEC [sys].[sp_recompile] @objname = N'GetRecentSales'
GO
DBCC FREEPROCCACHE
GO
-- Option 1: use the query hint OPTION(RECOMPILE)
ALTER PROC GetRecentSales ( @date DATETIME )
AS
BEGIN
    IF @date IS NULL
        SET @date = DATEADD(mm, -3,
            ( SELECT MAX([OrderDate])
              FROM [dbo].[SalesOrderHeader_test] ))
    SELECT SUM(d.[OrderQty])
    FROM [dbo].[SalesOrderHeader_test] h,
         [dbo].[SalesOrderDetail_test] d
    WHERE h.[SalesOrderID] = d.[SalesOrderID]
        AND h.[OrderDate] > @date
    OPTION ( RECOMPILE )
END
GO
-- Option 2: move the potentially problematic statement into its own sub stored
-- procedure, and have the original procedure call that sub procedure instead of
-- containing the statement itself. For example:
USE [AdventureWorks]
GO
-- The sub stored procedure
CREATE PROCEDURE GetRecentSalesHelper ( @date DATETIME )
AS
BEGIN
    SELECT SUM(d.[OrderQty])
    FROM [dbo].[SalesOrderHeader_test] h,
         [dbo].[SalesOrderDetail_test] d
    WHERE h.[SalesOrderID] = d.[SalesOrderID]
        AND h.[OrderDate] > @date
END
GO
-- The main stored procedure
ALTER PROC GetRecentSales ( @date DATETIME )
AS
BEGIN
    IF @date IS NULL
        SET @date = DATEADD(mm, -3,
            ( SELECT MAX([OrderDate])
              FROM [dbo].[SalesOrderHeader_test] ))
    EXEC GetRecentSalesHelper @date  -- call the sub sp with the final parameter value
END
GO
c. Temp table vs. table variable
i. Temp table – can have statistics and indexes; use it for a large number of rows
ii. Table variable – for a small number of rows (<100)
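A sketch of the two choices:

```sql
-- Temp table: supports extra indexes and column statistics, so the optimizer can
-- estimate row counts well for larger intermediate result sets
CREATE TABLE #BigOrders (SalesOrderID INT PRIMARY KEY, OrderDate DATETIME)
CREATE NONCLUSTERED INDEX IX_BigOrders_OrderDate ON #BigOrders (OrderDate)

-- Table variable: no statistics (the optimizer typically assumes very few rows),
-- so keep it for small row counts
DECLARE @FewOrders TABLE (SalesOrderID INT PRIMARY KEY)
```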
d. Simplify complex queries
i. If you have to use a dynamic query, limit the number of condition choices and the rows returned
ii. If there are too many tables in a join – consider de-normalizing them
iii. Avoid too many levels of nesting: views within views, or stored procedures within stored procedures
iv. Use ORDER BY only if you have to
v. Do you really have to return that many rows? If not, use TOP to limit them
vi. Use several simple statements to replace a complex one.