How to Monitor the Space Usage in Temp DB?

Note: This post elaborates and demonstrates the example in the book Microsoft SQL Server企业级平台管理实践 by 徐海蔚  Microsoft SQL Server Enterprise Platform Management by Haiwei Xu: 1.1.5 案例:通过脚本监视 tempdb空间使用))


1.      Challenges

 

a. Some objects such as #tempTable or @tableVariable are reflected in sys.partitions and sys.allocation_units. But others such as the internal objects or version-based storage used by SQL Server are not reflected in sys.partitions and sys.allocation_units, thus, we cannot use sp_spaceused, which employs sys.partitions and sys.allocation_units for the calculation, to find the precise information about the space usage by temp DB.

 

b. DBCC ShowFileStats only gives us the overall information, but not the detailed breakdowns.

 

USE tempdb

GO

 

DBCC ShowFileStats

 

/*

Fileid        FileGroup      TotalExtents     UsedExtents   Name     FileName

1                1                      128                  39                    tempdev D:\xxx\tempdb.mdf

*/

 

c. Sys.dm_dm_file_space_usage can give us the snapshot information on the space used by the user objects, internal objects, or version-based objects in the temp DB. By itself alone, however, we cannot monitor the space usage in the tempDB dynamically. But, but, but, we can keep running the query periodically to monitor the space usage in the tempDB by SQL Server. This DMV serves the foundation of the approach presented in this post.

 

USE tempdb

GO

 

SELECT * FROM sys.dm_db_file_space_usage

 

/****manually edited ******

database_id                                               2

file_id                                                       1      

filegroup_id                                              1      

total_page_count                                      1024

allocated_extent_page_count                   312            

unallocated_extent_page_count               712            

version_store_reserved_page_count         0

user_object_reserved_page_count          128

internal_object_reserved_page_count                 8      

mixed_extent_page_count                       176

*/

 

2.      How to monitor the space usage in temp DB?

a. Scenario: Suppose you just notice a sudden huge increase in the temp DB, what causes it? Is it normal or something wrong? How to monitor the SQL Server behavior?

b. Option 1 – SQL Profiler or SQL Trace. The problem with this option is that we need to keep the SQL Trace or SQL Profiler always on as we usually do not know when the increase occurs. Obviously, there is a big negative impact on performance, which makes it less optimal, especially on a busy box.

c. Option 2 – A less impact method is to run the DBCC/DMV/DMF command at a certain interval and output the result to a file for a later analysis. This option is recommended on a heavy system and demonstrated below.

 

3.      How to use the script to monitor the space usage in the temp DB?

a. Scenario: Suppose we need to perform the following queries. Of course, we can do a quick qualitative analysis as below. But how do we monitor the use of space in tempDB quantitatively?

 

-- First, open a new query window. Lets say: Connection A (spid=55 in my case)

select @@spid

go

 

use adventureworks2012

go

 

select getdate()

go

 

-- create a temp table

-- This operation should employ user objects page

 

select * into #mySalesOrderDetail

from Sales.SalesOrderDetail

go

 

waitfor delay '0:0:2'

select getdate()

go

 

-- drop the temp table

-- This operation should cause the user object page decreased

drop table #mySalesOrderDetail

go

 

waitfor delay '0:0:2'

select getdate()

go

 

-- This join should use many internal objects.

select top 100000 * 

from [Sales].[SalesOrderDetail]

INNER JOIN [Sales].[SalesOrderHeader]

ON [Sales].[SalesOrderHeader] .[SalesOrderID] = [Sales].[SalesOrderDetail].[SalesOrderID];

 

select getdate()

go

 

-- After the join, the use of the internal objects page should dramatically decrease

SELECT getdate()

go

 

b. Use the following the script to monitor

 

-- Purpose: To monitor the space usage in the tempdb

 

--Step 1: In a separate connection, lets say connection B (spid=53 in my case) choose Query-Results to file, run the entire script below. Choose a location and file name for saving the query result. Keep it running until the query in connection A is done.

 

USE tempdb

GO

 

WHILE 1 = 1

BEGIN

  SELECT

    GETDATE()

 

-- check the space usage for tempdb at the database level

dbcc showfilestats

 

-- Query 1

-- return the reserved spaces in KB for the user objects, internal objects, version-stored objects and free space in the file for tempdb.

  

SELECT

    'Tempdb' AS DB,

    GETDATE(AS Time,

    SUM(user_object_reserved_page_count) * 8 AS user_objects_kb,

    SUM(internal_object_reserved_page_count) * 8 AS internal_objects_kb,

    SUM(version_store_reserved_page_count) * 8 AS version_store_kb,

    SUM(unallocated_extent_page_count) * 8 AS freespace_kb

  FROM sys.dm_db_file_space_usage

  WHERE database_id = 2

 

-- Query 2

-- This query returns the allocated and deallocated pages of the internal and user objects for the sessions utilizing the tempdb

  

SELECT

    t1.session_id,

    t1.internal_objects_alloc_page_count,

    t1.user_objects_alloc_page_count,

    t1.internal_objects_dealloc_page_count,

    t1.user_objects_dealloc_page_count,

    t3.*

  FROM sys.dm_db_session_space_usage t1 -- Returns the number of pages allocated and deallocated by each session for the tempdb.                           

  JOIN sys.dm_exec_sessions AS t3 -- Returns one row per authenticated session, it shows information about all active user connections and internal tasks.

    ON t1.session_id = t3.session_id

  WHERE (t1.internal_objects_alloc_page_count > 0

  OR t1.user_objects_alloc_page_count > 0

  OR t1.internal_objects_dealloc_page_count > 0

  OR t1.user_objects_dealloc_page_count > 0)

 

-- Query 3

-- Returns the queries that are running and have utilized the tempdb

 

  SELECT

    t1.session_id,

    st.text

  FROM sys.dm_db_session_space_usage AS t1,

       sys.dm_exec_requests AS t4

       CROSS APPLY sys.dm_exec_sql_text(t4.sql_handleAS st

  WHERE t1.session_id = t4.session_id

  AND t1.session_id > 50

  AND (t1.internal_objects_alloc_page_count > 0

  OR t1.user_objects_alloc_page_count > 0

  OR t1.internal_objects_dealloc_page_count > 0

  OR t1.user_objects_dealloc_page_count > 0)

  WAITFOR DELAY '0:0:1'

END                       

 

--Step 2: run the queries in connection A (@spid=55). The result is as below. Notice the four time stamps from the result:

2014-08-26 12:29:14

2014-08-26 12:29:16

2014-08-26 12:29:18

2014-08-26 12:29:24

 

https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi4ZCuB788j9TlJ1fuRWps_E_ajiDj2A4eWTx_lQj1wemOmdrHnn547vTr2vk7a7pFUxqdhsI6b3Ug8DpDxt-07OygLpa8WDyNwOPGZaQAaeWdmbsK449M_4KAXL413VLxfcDd10O7gTCs/s1600/monitorspaceusageInTempDB_Query+To+Be+Monitored.png

 

After the queries in connection A are done, manually cancel the running query in connection B. The usage information of the spaces in the tempdb in connection A should be captured in the file produced by the queries in connection B.

 

--Step 3: Now is the hard part: to find and interpret the information captured in the file. We need a text editor for that purpose. We choose UltraEdit.

 

--Step 3a. Open the file (checkTempUsage.rtf in our case) in UltraEdit. We will check the result from DBCC ShowFileStats first. On the menu, Search|Find|enter tempDev and select 列出包含字符串的行. That should return 36 rows.

https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjmW8zcw3P35620xPo4xrvimxvDC4tu1rK2UuctjIsaPXFcOiHebAzIqEocgATZykNwRAYMKFa2rXz83nnVJGZY_pEWjzRm3AUv39rKnkgacaIi-22m9kviSNe-tMT6rQvmEelb7B8uf_U/s1600/UltraEdit_DBCC+ShowFileStats.png

 

--Step 3a1. In the panel of the 36 rows, right click, copy to clipboard, paste it a text file. Then manually copy the header information and paste it this .txt file as below. Save it.

 

Fileid    FileGroup  TotalExtents UsedExtents    Name              Filename

--------- --------- -------------- ---------- --------                --------

1                1                      251                  42           tempdev            D:\xxx\tempdb.mdf

1                1                      251                  42           tempdev            D:\xxx\tempdb.mdf

---Other 34 rows are omitted. The filename is not fully presented------

 

--Step 3a2. If we do not need the filename column, we can open the newly saved .txt file in UltraEdit, from the menu Edit|Select Range, enter the beginning and ending numbers for rows and columns. Copy the result to a new .txt file. Thats the result from the DBCC ShowFileStats command that we will examine on.

 

https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiKc10l3tbkooay8DA8Kk7F-Bh3N5rcTWiZ11JsAKMJp3JOdOhszlUCUOF3YqxAthVQRNqYvhwXvHbji2rZzSmpUjH4fktON04HL-9CGguXm9He_6G1kluTiuI-x3uoAH9Am3q6ZiwXA7U/s1600/UltraEdit_Range+selection.png

 

Fileid      FileGroup   TotalExtents         UsedExtents          Name    

----------- ----------- -------------------- -------------------- ---------

********the 1st 16 rows, identical to the 17th row, are omitted******

1             1                           251                  42                    tempdev 

1             1                           251                  229                  tempdev  (1st increase – temp table created)

1             1                           251                  229                  tempdev 

1             1                           251                  229                  tempdev 

1             1                           251                  42                    tempdev (1st decrease – temp table dropped)

1             1                           251                  57                    tempdev 

1             1                           251                  116                  tempdev 

1             1                           251                  144                  tempdev  (2nd increase – a join query)

1             1                           251                  144                  tempdev 

1             1                           251                  144                  tempdev 

1             1                           251                  144                  tempdev 

1             1                           251                  42                    tempdev    (2nd decrease – the query done)

*****Eight identical rows are omitted********

--------------------------------------------------------------------------------------------------------------------------------

--Step 3b. Now lets check the result from query 1 in the monitoring connection (spid=53). On the menu, Search|Find|enter tempdb 2014 and select 列出包含字符串的行. That should return 36 rows as well. Manually adding the header information.

 

https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEimGWJRodKbQIOvOD8UmcoeEA7S3OBxGokJP1apQzXmiGwT-r6it_Ss4rbu67NhRKFHzIZekOLKL1naKkvLy8D84g0ORVUXu88pdaP_RJdXg2z_Ty42rwjckeZv4m1DmTFBj_TEoR4tiaM/s1600/UltraEdit_Query+1.png

 

DB     Time                                       user_objects_kb      internal_objects_kb  version_store_kb         freespace_kb         

 

…..omitted 21 rows, identical to the 22nd row……….


Tempdb 2014-08-26 12:29:14.237 1024                          256                       0                                           13376

 (1st increase from 1024 to 12992 due to creation of the temp table)

The time is slightly after 2014-08-26 12:29:14

Notice the increase occurs in the time period of 12:29:14 - 12:29:15

Tempdb 2014-08-26 12:29:15.240 12992                        256                       0                                           1408

Tempdb 2014-08-26 12:29:16.243 12992                        256                       0                                           1408

Tempdb 2014-08-26 12:29:17.243 12992                        256                       0                                           1408


 (1st decrease from 12992 to 1024 due to dropping of the temp table)

The time is slightly after 2014-08-26 12:29:16

Notice the decrease occurs in the time period of 12:29:17 - 12:29:18


Tempdb 2014-08-26 12:29:18.247 1024                          256                       0                                           13376

Tempdb 2014-08-26 12:29:19.250 1024                          1216                     0                                           12416

Tempdb 2014-08-26 12:29:20.270 1024                          4992                    0                                           8640

 (2nd increase from 256 to 6784 for using internal objects by the join)

The time is slightly after 2014-08-26 12:29:18

Notice the increase occurs in the time period of 12:29:18 - 12:29:21

Tempdb 2014-08-26 12:29:21.273 1024                          6784                     0                                           6848

Tempdb 2014-08-26 12:29:22.277 1024                          6784                     0                                           6848

Tempdb 2014-08-26 12:29:23.280 1024                          6784                     0                                           6848

Tempdb 2014-08-26 12:29:24.300 1024                          6784                     0                                           6848

(2nd decrease from 6784 to 256 after the join is done)

The time is slightly after 2014-08-26 12:29:24

Notice the decrease starts at 12:29:24.300. But we do not know when it is done exactly

Tempdb 2014-08-26 12:29:25.303 1024                          256                       0                                           13376

Tempdb 2014-08-26 12:29:26.310 1024                          256                       0                                           13376

Tempdb 2014-08-26 12:29:27.310 1024                          256                       0                                           13376

Tempdb 2014-08-26 12:29:28.317 1024                          256                       0                                           13376

Tempdb 2014-08-26 12:29:29.317 1024                          256                       0                                           13376

Tempdb 2014-08-26 12:29:30.330 1024                          256                       0                                           13376

Tempdb 2014-08-26 12:29:31.333 1024                          256                       0                                           13376

Tempdb 2014-08-26 12:29:32.337 1024                          256                       0                                           13376

Tempdb 2014-08-26 12:29:33.340 1024                          256                       0                                           13376


--Step 3c. Now lets check the result from query 2 in the monitor connection. On the menu, Search|Find|enter 55         [0-9]. 55 is the session ID for the main query. After it, there is 9 spaces, followed by a number. This is the regular expression in UltraEdit. In this way, we will find the result beginning with session_id 55, followed by internal_object_alloc_page_counts. That should return 36 rows as well. Manually adding the header information. Save it in .txt. Open the .txt file in UntraEdit for editing (deleting un-wanted columns as explained below)

 

https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhr4U_HOJW5wJpTJtI_RVuigHHcUVlN7p2kZ8e2QPBf76iKFPUs54_qIiDhft4EwQWCQGNYyBYtaTlfn4UNXkDKyhJkzuMVFEeGG_91_-DPjst3FaWseZcB3SVXPb0T0czgEO9R8jrRC3Y/s1600/UltraEdit_Query+2.png

 

But, the results contain much more columns than what we need. We can use the Column|Delete Columns.menu. Unfortunately, you can only delete 100 spaces each time. Finally, we get what we need:

https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgcnUJXeNlV7s6hFb1tgQcjGwtHuCVwfg60tdnZk0BJbFfxbOKjygLKJHaPqlOvDVrUXnplDPDF7IafT-5Q3RcvNLub7CgB7GxO9AvDDaVyIp5y_B1hQmOQ1qRiKZFW-LuwLnF2aYY90Lo/s1600/UltraEdit_Query+2_cleaned.png

 

 

session_id internal_objects user_objects           status        last_request_start_tm       last_request_end_tm

--------  ----------------------   ------------               ----------    -----------------------           ---------------------

55         816                  1537                    sleeping    2014-08-26 12:25:45.830 2014-08-26 12:25:45.830

55         816                  1537                    sleeping    2014-08-26 12:25:45.830 2014-08-26 12:25:45.830

55         816                  1537                    sleeping    2014-08-26 12:25:45.830 2014-08-26 12:25:45.830

………………………………………………………………………………….

55         816                 1537                     sleeping   2014-08-26 12:25:45.830 2014-08-26 12:25:45.830

55         816                 1537                     running    2014-08-26 12:29:14.217 2014-08-26 12:29:14.120

55         816                 3074                     running    2014-08-26 12:29:14.623 2014-08-26 12:29:14.620

55         816                 3074                     running    2014-08-26 12:29:14.623 2014-08-26 12:29:14.620

55         816                 3074                     running    2014-08-26 12:29:16.697 2014-08-26 12:29:16.697

55         816                 3074                     running    2014-08-26 12:29:16.697 2014-08-26 12:29:16.697

55         816                 3074                     running    2014-08-26 12:29:18.783 2014-08-26 12:29:18.697

55         816                 3074                     running    2014-08-26 12:29:18.783 2014-08-26 12:29:18.697

55         816                 3074                     running    2014-08-26 12:29:18.783 2014-08-26 12:29:18.697

55         816                 3074                     running    2014-08-26 12:29:18.783 2014-08-26 12:29:18.697

55         816                 3074                     running    2014-08-26 12:29:18.783 2014-08-26 12:29:18.697

55         816                 3074                     running    2014-08-26 12:29:18.783 2014-08-26 12:29:18.697

……………………………………………………………………………………

55         1632               3074                     sleeping   2014-08-26 12:29:24.633 2014-08-26 12:29:24.633


What do they mean? Remember the four time periods of changes on the space usage in the tempdb from the result in query 1?

Result from Query 1                               Result from Query 2

12:29:14 - 12:29:15                                  Running!

12:29:17 - 12:29:18                                  Running!

12:29:18 - 12:29:21                                  Running!
12:29:24.300 - ???               

(The last session is done somewhere between 12:29.24.300 and 12:29:24.633)

Surely, the running status for connection A (spid=55) is on when the tempdb is utilized, no matter how we monitor the usage. Thus, query 2 confirms the running/sleeping status of connection A. Query 2 also produces much information we have not discussed yet in this demo.


--Step 3d. Now let’s check the result from query 3 in the monitor connection. 

From the result of query 1, we know that from 12:29:18.247 to 12:29:21.273, the internal_object_used has increased from 256KB to 6784KB, what causes it? Let’s see which query is running on connection A (spid=55) utilizing the space in tempdb during the time period? That’s query 3’s show!

Time                                                                User_Obj       Internal_Obj
----------------------------------------------------------------------------------

2014-08-26 12:29:18.247                                     1024                256


--The query running on connection A (spid=55) during the time period is:  
session_id   text         

----------   --------------

55      waitfor delay '0:0:2'

 

Time                                                                User_Obj       Internal_Obj
----------------------------------------------------------------------------------

2014-08-26 12:29:19.250                                     1024                1216


--The query running on connection A (spid=55) during the time period is:  

 

session_id   text         

----------   --------------

55           select top 100000 *

From [Sales].[SalesOrderDetail

INNER JOIN [Sales].[SalesOrderHeader

ON [Sales].[SalesOrderHeader].[SalesOrderID = [Sales].[SalesOrderHeader].[SalesOrderID];  

 

Time                                                                User_Obj       Internal_Obj
----------------------------------------------------------------------------------

2014-08-26 12:29:20.270                                     1024                4992


--The query running on connection A (spid=55) during the time period is:  
session_id   text         

----------   --------------

55           select top 100000 *

from[Sales].[SalesOrderDetail]                   

INNER JOIN [Sales].[SalesOrderHeader]       

ON [Sales].[SalesOrderHeader].[SalesOrderID] = [Sales].[SalesOrderHeader].[SalesOrderID];  

 

Time                                                                User_Obj       Internal_Obj
----------------------------------------------------------------------------------

2014-08-26 12:29:21.273                                     1024                6784

 

The result clearly shows that the big join query on connection A (spid=55) causes the utilization of internal objects in tempdb, which mainly occurred between 12:29:19.250 and 12:29:21.273.