Guidelines for Troubleshooting SQL Server Memory, Disk I/O and CPU Problems

  • Rule 1 – Never make a conclusion based on one or several independent counters. You need to think from many perspective in a related way.
  • Rule 2 – The conclusion you draw need to explain both the abnormal incidences and the normal behaviors in SQL Server.
  • Rule 3 – Start with the Windows system. If there is bottleneck on the resource(s), is it mainly caused by SQL Server, or by Windows/other programs?
  • Rule 4 – If there is a bottleneck in SQL Server, first find out which resource SQL Server is waiting for and the wait type (e.g., run sysprocesses or sys.dm_exec_requests, and sys.dm_os_wait_stats)
  • Rule 5 – Memory often is a trouble maker for SQL Server performance. If there is a problem, you need to further locate the bottleneck – in database pages, stolen memory, or multi-pages? For instance, if Page Life Expectancy does not drop sharply, it means there is no bottleneck in database pages.
  • Rule 6 – If no problems on memory, then examine disk I/O to see if the disk is under pressure (e.g., Physical Disk: % Disk Time, Physical Disk: Avg. Disk Queue Length, Avg. Disk sec/ReadAvg. Disk sec / WriteAvg. Disk sec/Transfer etc.)
  • Rule 7 – If no bottleneck on memory and disk I/O, and no locks, examine CPU. If there is a bottleneck, determine if it is caused by the workload or hardware.
  • Rule 8 – If there is a bottleneck on I/O or CPU, determine if it is caused by memory.
  • Rule 9 – Consider to adjust SQL Server parameters through sp_configure etc to improve the SQL Server performance, although less likely.
  • Rule 10 – Examine the running workload in SQL Server, and find out the DMLs or actions causing the problems. Communicate with the applications developers who send these troublesome DMLs if necessary.