Summary:
Due to the introduction of SQLOS, CPU often is not a trouble
maker for SQL Server performances. But problems do occur. The typical problems
are:
1. A long-running task occupies the CPU
and makes the server hang, you will find Errors 17883, 17884, and 17888 in the
error log.
2. High % of CPU usage. In this case,
you need to identify the trouble maker.
Some problems can be solved by upgrading the hardware, some
are side effects of memory and disk I/O problems. But, more than often, the
problems are caused by bad database designs, bad queries, or inappropriate use
of queries.
1. For SQL Server professionals, we first need to separate OLTP from OLAP (Data Warehouse), then use different strategies to prevent/lessen/solve the problems which make CPU unhappy.
1) For OLTP:
· Use small-size indexes for the tables requiring many modifications.
· Decrease the CPU usage % by reusing the execution plans and joining fewer tables with a flatten design
· Optimize indexes.
· Use small transaction sizes
· Avoid parallel execution of OLTP queries
2) For OLAP:
· You can build as many indexes as you need.
· For complex queries, recompiling is more efficient than re-using the non-optimal existing plans
· Consider adding index if necessary.
· If large scans are inevitable, use index rebuild/reorganization to reduce fragmentation.
· Parallel execution are often good for OLAP queries
==================================================================================================
2. SQLOS (see more on http://blogs.technet.com/b/josebda/archive/2009/03/30/sql-server-2008-sqlos.aspx)
1) Key components
2.1.1 Scheduler
2.1.2 Memory management
2.1.3 Error and Exception handling
2.1.4 Deadlock detection and resolution
2.1.5 Run 3rd party codes (Dll, Extended SP etc.)
2)
Why use it?
2.2.1 Only the running tasks are assigned with threads, idle connections are ignored
2.2.2 Sharply reduce the context switch in Windows
2.2.3 Very helpful, 1000-2000 simultaneous developers only need 100-200 threads.
3) Basics
2.3.1 What is it?
i. User mode layer between SQL Server and the Windows OS.
ii. Use Thread mode or fiber mode. Default is thread.
2.3.2 Scheduler
i. One scheduler on a SQL Server, corresponds to one logical CPU
ii. One time, only one worker can run. Other workers must wait.
2.3.3 Worker
i. A worker = A thread or fiber, the execution unit.
ii. Scheduler schedule workers not threads or fibers
iii. When a worker runs, it will try to finish the entire task unless it yields
iv. When the worker is idle over 15 minutes, scheduler will delete the worker and the thread/fiber to release the resources
v. There is a max limit of threads (workers) depending on the number of CPU and 32 vs. 64 bit, such as 256 on a 32-bit, <=4 processors machine.
2.3.4 Task
i. The smallest execution unit such as a batch
ii. When it runs, it executes or waits if it is blocked.
2.3.5 Yielding – yielding to other workers
i. All workers are non-preemptive (vs. preemptive in Windows)
ii. SQL Server defines many rules for yielding to refrain the running time of a task in the scheduler.
iii. If for some reason, a task runs over 60s and does not yield, SQL Server Health Monitor will create a mini-dump and a 17833, 17884, or 17888 error in error logs.
· 17883 – a single scheduler has trouble, check the abnormal number of user mode, kernel mode, system idle%, or process utilization% etc. But often it is cause by system IO.
· 17884 – all schedulers have threads waiting or blocked
· 17888 – over 50% scheduler have threads waiting or blocked
iv. Check the connection status of thread, workers, schedulers etc.
--了解scheduler和用户连接所使用的thread的情况. UMS =
User Mode Scheduling
DBCC SQLPERF(umsstats)
SELECT * FROM sys.sysprocesses WHERE kpid<>0
--More details
SELECT * FROM sys.dm_os_schedulers
SELECT * FROM sys.dm_os_workers
SELECT * FROM sys.dm_os_threads
SELECT * FROM sys.dm_os_tasks
SELECT * FROM sys.dm_os_waiting_tasks
SELECT * FROM sys.dm_os_ring_buffers
2.3.6 Overall architecture
对于每个CPU,SQL都会有一个scheduler与之对应。在每个scheduler里,会有若干个worker,对应于每个线程。在客户端发过来请求之后,SQL会将其分解成一个或多个task。根据每个scheduler的繁忙程度,task会被分配到某个scheduler上。如果scheduler里有空闲的worker,task就会被分配到某个worker上。如果没有,scheduler会创建新的worker,供task使用。如果scheduler里的worker已经到了他的上限值,而他们都有task要运行,那么新的task只好进入等待worker的状态.
3. SQL Server CPU 100%
1) Often not a problem, much less memory and disk, typically is low, 60-70% is very high to SQL
2) When CPU is used?
3.2.1 Compile and recompile
3.2.2 Sort and aggregation
3.2.3 Join
3.2.4 Decompression
3)
How to set CPU in SQL Server (use sp_configure)
3.3.1 Priority boost – not recommended as it may mess up windows
3.3.2 Affinity mask
3.3.3 Lightweight pooling – if on, fiber instead of thread is used, complicated but will reduce the number of context switch. Not suggested to change
3.3.4 Max degree of parallelism = CPU number
3.3.5 Cost Threshold of Parallelism – default to 5, rarely change it
3.3.6 Max workers threads – default to 0, auto decide. Theoretically, 256 on a 32-bit 4CU machine and 512 on a 64-bit 4CPU machine
3.3.7 Bottom line: often caused by workloads, little can be done in SQL Server to improve it. So need to find who caused the high %CPU and anyway to redesign the db.
4) Troubleshooting
3.4.1 How much is CPU% on the server, how much is from SQL Server? – Check the counters in performance monitor
i. Entire server
· Processor:%Processor Time: 该计数器可以监视CPU执行的非闲置线程所用的时间。持续80%到90%的状态可能表明需要升级CPU或需要增加更多的处理器。
· Processor:%Privileged Time(Kernel Mode):对应于处理器执行Microsoft Windows内核命令(如处理SQL Server I/O请求)所用时间百分比。如果Physical Disk计数器的值很高时,该计数器的值也一直很高,则考虑安装速度更快或效率更高的磁盘子系统。
· Processor:%User Time(User Mode):对应于处理器执行用户进程(例如 SQL Server)所用时间的百分比。
· System:Processor queue length:对应于等待处理器时间的线程数。当一个进程的线程需要的处理器循环数超过可获得的循环数时,就产生了处理器瓶颈。
如果有很多进程在争用处理器时间,可能需要安装一个速度更快的处理器。如果使用的是多处理器系统,则可以增加一个处理器。
· Context switches/sec:指计算机上的所有处理器全都从一个线程转换到另一个线程的综合速率。
当正在运行的线程自动放弃处理器时出现上下文转换,由一个有更高优先就绪的线程占先或在用户模式和特权 (内核) 模式之间转换以使用执行或分系统服务。
如果此计数器的数值较大,则表明锁定竞争很激烈,或者线程在用户和内核模式之间频繁切换。
ii. SQL Server Process
· Processor:%Processor Tim
· Processor:%Privileged Time(Kernel Mode
· Processor:%User Time(User Mode)
3.4.2 Any 17883/17884 or Access Violation errors? – Check the errorlog
3.4.3 Find the most expensive queries and optimize them
i. SQL Trace is the better, but negative impacts making it impossible to use
ii. SQL 2005+, use the following DMV, or standard reports
--SQL2005 DMV:
top 50 queries utilizing CPU since startup
SELECT
highest_cpu_queries.*,
highest_cpu_queries.total_worker_time,
DB_NAME(q.dbid) AS dbname,
q.[text] AS qtext
FROM (SELECT TOP 50
qs.*
FROM sys.dm_exec_query_stats qs
ORDER BY qs.total_worker_time DESC)
AS highest_cpu_queries
CROSS APPLY sys.dm_exec_sql_text(plan_handle) AS q
ORDER BY highest_cpu_queries.total_worker_time DESC
--find the top 25
recomplied usps
SELECT TOP 25
sql_text.text AS sqltext,
sql_handle AS sqlhandle,
plan_generation_num AS plangenerationnum,
execution_count AS execcount,
DB_NAME(dbid) AS dbname,
objectid AS objectid
FROM sys.dm_exec_query_stats a
CROSS APPLY sys.dm_exec_sql_text(sql_handle) AS sql_text
WHERE plan_generation_num > 1
ORDER BY plan_generation_num DESC
iii. ReadTrace – see http://www.sqlskills.com/blogs/erin/rml-utilities-and-sql-server-2012/
iv. After finding the problematic queries, the key now is to optimize the database and queries
v. If you cannot change or re-design, migrate some databases to other servers and upgrade the hardware.
4. How to Improve Performance of an OLTP system?
1) Database Design?
4.1.1 The frequent run queries have 4+ table join.
i. Find them
--Return the most
popular top 100 queries
SELECT TOP 100
cp.cacheobjtype,
cp.usecounts,
cp.size_in_bytes,
qs.statement_start_offset,
qs.statement_end_offset,
qt.dbid,
qt.objectid,
SUBSTRING(qt.text, qs.statement_start_offset / 2, CASE
WHEN qs.statement_end_offset = -1 THEN LEN(CONVERT(nvarchar(MAX), qt.text)) * 2 ELSE qs.statement_end_offset
END - qs.statement_start_offset / 2) AS statement
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS qt
INNER JOIN sys.dm_exec_cached_plans AS cp
ON qs.plan_handle = cp.plan_handle
WHERE cp.plan_handle = qs.plan_handle
AND cp.usecounts > 4
ORDER BY dbid, usecounts DESC
ii. Recommendation: make the tables less normalized.
4.1.2 More than 3 indexes are frequently updated
i. Find them.
--Find the top
100 indexes frequently updated
--From
databaseid,objectid,indexid和partitionnumber, determine whhich index on which
table
SELECT TOP 100 DB_NAME(database_id),*
FROM sys.dm_db_index_operational_stats(NULL,NULL,NULL,null)
ORDER BY leaf_insert_count+leaf_delete_count+leaf_update_count DESC
ii. Suggestions: really needed?
4.1.3 A large number of table scans and range scans
i. Find them: high value of SQL Server: Access Methods – Full Scans/sec and Range Scans/sec, and run the query below:
--return the top
50 queries with the most I/O and their execution plans
SELECT TOP 50
total_logical_reads / execution_count AS avg_logical_read,
total_logical_writes / execution_count AS avg_logical_write,
total_physical_reads / execution_count AS avg_phys_read,
execution_count AS execcount,
statement_start_offset AS stmt_start_offset,
statement_end_offset AS stmt_end_offset,
SUBSTRING(sql_text.text, (statement_start_offset / 2),
CASE
WHEN (statement_end_offset - statement_start_offset) / 2 <= 0 THEN 64000 ELSE (statement_end_offset - statement_start_offset) / 2
END) AS exec_statment,
sql_text.text AS text,
plan_text.*
FROM sys.dm_exec_query_stats
CROSS APPLY sys.dm_exec_sql_text(sql_handle) AS sql_text
CROSS APPLY sys.dm_exec_query_plan(plan_handle) AS plan_text
ORDER BY (total_logical_reads + total_logical_writes) / execution_count DESC
ii. Add appropriate indexes
4.1.4 Indexes not used
i. Find them: indexes not in sys.dm_db_index_usage_stats
ii. Get rid of them
2) CPU Threshold in the OLTP
4.2.1 Signal waits should be <25%
SELECT CONVERT(NUMERIC(5,4),SUM(signal_wait_time_ms)/SUM(wait_time_ms))
FROM sys.dm_os_wait_stats
4.2.2
Re-use rate of execution plans in OLTP should be
>95%, computed from SQL Server Statistics
Initial
compilation/sec = SQL Compilation/sec – SQL Re-compilation/sec
Re-use rate of
execution plans = (Batch request/sec – Initial Compilation/sec)/Batch
requests/sec
4.2.3
The % of Cxpacket/wait in parallel execution
should be <5% - parallel execution in OLTP is bad
--Calculate % of
expacket/wait
DECLARE @cxpacket BIGINT
DECLARE @sumwait BIGINT
SELECT @cxpacket=wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type='cxpacket'
SELECT @sumwait=SUM(wait_time_ms)
FROM sys.dm_os_wait_stats
SELECT CONVERT(NUMERIC(5,4),@cxpacket/@sumwait)
3) Memory in the OLTP
4.3.1
Page life expectancy should be >300s -
Check SQLServer: Buffer Manager, SQLServer: Buffer Nodes
4.3.2
Page life expectancy should not decrease
over 50% - Check SQLServer: Buffer Manager
4.3.3
Memory Grants Pending should be <=1 -
Check SQL Server Memory Manager: Memory
Grants Pending
4.3.4
SQL Cache hit
ratio should >90 – Check SQLServer: Plan Cache
4)
I/O in the OLTP
4.4.1
Average
disk sec/read should be <20ms, often 4-8 ms, - Check Physical
Disk
4.4.2
Average
disk sec/write should be <20ms, often 1 ms,for log writing -
Check Physical Disk
4.4.3
Big IOs Table Scans and Range Scans should be
<1 – Check SQLServer:Access Method: Full Scans/sec and Range Scans/sec if
they are high, adding proper indexes
4.4.4
These (asynch_io_completion , io_completion,
logmgr, writelog, and pageiolatch_x) should not be in the top 2 list when
running the following query:
SELECT TOP 2 wait_type FROM sys.dm_os_wait_stats ORDER BY wait_time_ms DESC
5)
Blocking in the OLTP
4.5.1 % of blocking should be <2%, run the following script to check
DECLARE @dbid INT
SELECT
@dbid = DB_ID()
SELECT
dbid = database_id,
objectname = OBJECT_NAME(s.object_id),
indexname = i.name,
i.index_id,
partition_number,
row_lock_count,
row_lock_wait_count,
[block%] = CAST(100 * row_lock_wait_count / (1 + row_lock_count) AS numeric(15, 2)),
row_lock_wait_in_ms,
[avg row lock waits in ms] = CAST(1 * row_lock_wait_in_ms / (1 + row_lock_wait_count) AS numeric(15, 2))
FROM sys.dm_db_index_operational_stats(@dbid, NULL, NULL, NULL) AS s,
sys.indexes AS i
WHERE OBJECTPROPERTY(s.object_id, 'IsUserTable') = 1
AND i.object_id = s.object_id
AND i.index_id = s.index_id
ORDER BY row_lock_wait_count DESC
4.5.2 Automatically report blocking over 30s in SQL Trace, - Check sp_configure ‘blocked process threshold’
4.5.3 The average block time should be <100ms, run the script as above
4.5.4 The top 2 blocks should not have LCK_M_?? when running the following script:
SELECT TOP 2 wait_type FROM sys.dm_os_wait_stats ORDER BY wait_time_ms DESC
4.5.5
Deadlock should be less than 5 every hour. Turn
on Trace flag 1204 or check SQL Trace
6) Networking in the OLTP
4.6.1 Output queue length should be <2 – check Network Interface
4.6.2 Packets Outbound Discarded, Packets Outbound Errors, Packets Received Discarded, Packets Received Errors – check Network Interface.
5. How to Improve Performance of an OLAP system?
1) Database Design
5.1.1 More indexes are fine as most of DMLs are read only, covered index is recommended.
5.1.2 Reduce the fragmentation <25% by using rebuilding indexes
USE master --改为你要扫描索引碎片的那个数据库
DECLARE @dbid INT
SELECT
@dbid = DB_ID()
SELECT
*
FROM sys.dm_db_index_physical_stats(@dbid, NULL, NULL, NULL, NULL)
WHERE avg_fragmentation_in_percent > 25
ORDER BY avg_fragmentation_in_percent DESC
5.1.3 Add proper indexes
--Find the
possible missing indexes in the current database
SELECT
d.*,
s.avg_total_user_cost AS avgtotalusercost,
s.avg_user_impact AS avguserimpact,
s.last_user_seek AS lastuserseek,
s.unique_compiles AS uniquecompile
FROM sys.dm_db_missing_index_groups g
INNER JOIN sys.dm_db_missing_index_group_stats s
ON s.group_handle = g.index_group_handle
INNER JOIN sys.dm_db_missing_index_details d
ON g.index_handle = d.index_handle
WHERE s.group_handle = g.index_group_handle
AND d.index_handle = g.index_handle
ORDER BY s.avg_user_impact DESC
--推荐建立索引的字段
DECLARE @handle INT
SELECT @handle=d.index_handle
FROM sys.dm_db_missing_index_group_stats s,
sys.dm_db_missing_index_groups g,
sys.dm_db_missing_index_details d
WHERE s.group_handle=g.index_group_handle
AND d.index_handle=g.index_handle
SELECT * FROM sys.dm_db_missing_index_columns(@handle)
ORDER BY column_id
5.1.4 Do not add useless indexes
2) CPU Threshold in the OLAP
5.2.1
Signal waits should be <25%
SELECT CONVERT(NUMERIC(5,4),SUM(signal_wait_time_ms)/SUM(wait_time_ms))
FROM sys.dm_os_wait_stats
5.2.2
Re-use rate of execution plans in OLTP should be
<25% (precise execution plan is much more important than compile/recompile),
computed from SQL Server Statistics
Initial
compilation/sec = SQL Compilation/sec – SQL Re-compilation/sec
Re-use rate of
execution plans = (Batch request/sec – Initial Compilation/sec)/Batch
requests/sec
5.2.3
The % of Cxpacket/wait in parallel execution
should be >10% - parallel execution in OLAP is popular
--Calculate % of
expacket/wait
DECLARE @cxpacket BIGINT
DECLARE @sumwait BIGINT
SELECT @cxpacket=wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type='cxpacket'
SELECT @sumwait=SUM(wait_time_ms)
FROM sys.dm_os_wait_stats
SELECT CONVERT(NUMERIC(5,4),@cxpacket/@sumwait)
3) Memory in the OLTP
5.3.1
Page life expectancy should not decrease
over 50% (data paging is normal in OLAP, but often decrease indicates memory
problem, try to see if you can improve by optimizing indexes) - Check
SQLServer: Buffer Manager
5.3.2
Memory Grants Pending should be <=1 -
Check SQL Server Memory Manager: Memory
Grants Pending
4)
I/O in the OLAP – IO is more than in OLTP, but
much IO is still a problem.
5.4.1 Average disk sec/read should be <20ms, often 4-8 ms, - Check Physical Disk
5.4.2 Average disk sec/write should be <20ms, often 1 ms,for log writing - Check Physical Disk
5.4.3 Big IO Scans should be <1 (large scan indicates missing indexes) – Check SQLServer:Access Method: Full Scans/sec and Range Scans/sec if they are high, adding proper indexes
5.4.4 These (asynch_io_completion , io_completion, logmgr, writelog,and pageiolatch_x) should not be in the top 2 list when running the following query:
SELECT TOP 2 wait_type FROM sys.dm_os_wait_stats ORDER BY wait_time_ms DESC
5.4.5 Locks in the OLAP
i. % of locking should be <2%, run the following script to check
DECLARE @dbid INT
SELECT
@dbid = DB_ID()
SELECT
dbid = database_id,
objectname = OBJECT_NAME(s.object_id),
indexname = i.name,
i.index_id,
partition_number,
row_lock_count,
row_lock_wait_count,
[block%] = CAST(100 * row_lock_wait_count / (1 + row_lock_count) AS numeric(15, 2)),
row_lock_wait_in_ms,
[avg row lock waits in ms] = CAST(1 * row_lock_wait_in_ms / (1 + row_lock_wait_count) AS numeric(15, 2))
FROM sys.dm_db_index_operational_stats(@dbid, NULL, NULL, NULL) AS s,
sys.indexes AS i
WHERE OBJECTPROPERTY(s.object_id, 'IsUserTable') = 1
AND i.object_id = s.object_id
AND i.index_id = s.index_id
ORDER BY row_lock_wait_count DESC
ii. Automatically report locks over 30s in SQL Trace, - Check sp_configure ‘blocked process threshold’
iii. The average lock time should be <100ms, run the script as above
iv. The top 2 locks should not have LCK_M_?? when running the following script:
SELECT TOP 2 wait_type FROM sys.dm_os_wait_stats ORDER BY wait_time_ms DESC