Troubleshooting the Transaction Log Problems

1.  What’s inside the log?

--Use the database from an earlier demo

USE MyDB

GO

 

--the dbid=10, recovery model is full, Data file=4096KB, Log file=3456MB

sp_helpdb 'MyDB'

--Log size=3.37MB, Used %=93%

DBCC SQLPERF(LOGSPACE)

 

--Create a test table

CREATE TABLE test ( a INT )

GO

 

--empty the log

BACKUP DATABASE MyDB TO DISK='c:\Users\Charlie\Documents\myDB.bak'

BACKUP LOG MyDB TO DISK='c:\Users\Charlie\Documents\myDB.trn'

 

--Log size=3.37MB, Used %=21.8% after log backup

DBCC SQLPERF(LOGSPACE)

 

--Run DBCC Log (dbID, formatID = 3 for detailed info), returning 20 rows.

--Notice there are only 3 rows for the current spid 52

DBCC log(10,3)

 

--Now insert a row into the table

INSERT  INTO Test

VALUES  ( 1 )

 

--Run DBCC Log again. Now it returns 27 additional rows. Three of the them are related to the insert actioon.

DBCC log(10,3)

/*****Total 336 columns from the DBCC Log command. Selected columns and rows, edited in Excel****

Current LSN               Operation          SPID  BeginlogStatus   BeginTime   xName RowLogContents0

00000032:000000f0:0002    LOP_BEGIN_XACT     52   0x01000000     38:36:080   INSERT      NULL

00000032:000000f0:001a    LOP_INSERT_ROWS    NULL  NULL           NULL        NULL  0x1000080001000000010000

00000032:000000f0:001c    LOP_COMMIT_XACT    NULL  NULL           NULL        NULL        NULL

*/

--Nextinsert another row

INSERT  INTO test

VALUES  ( 2 )

GO

 

--Run DBCC Log again. Now it returns 3 additional rows for this inserttion. Notice LSNs are different from the previous one.

DBCC log(10,3)

GO

/********Selected rows, edited in Excel*****

Current LSN            Operation       SPID  BeginlogStatus   BeginTime   xName      RowLogContents0

00000032:000000f8:0002 LOP_BEGIN_XACT  52    0x01000000     09:03:52:687 INSERT     NULL

00000032:000000f8:001a LOP_INSERT_ROWS NULL  NULL              NULL         NULL 0x1000080002000000010000

00000032:000000f8:001c LOP_COMMIT_XACT NULL  NULL              NULL           NULL       NULL

*/

--What if we do an update? It returns addtional 4 rows for the update

UPDATE  test

SET     a = 3

 

/********Selected rows, edited in Excel*****

Current LSN            Operation      SPID   BeginlogStatus    BeginTime   xName RowLogContents0

00000032:00000100:0001 LOP_BEGIN_XACT 52     0x01000000    09:12:20:483 UPDATE           NULL

00000032:00000100:0002 LOP_MODIFY_ROW NULL   NULL              NULL         NULL           0x01

00000032:00000100:0003 LOP_MODIFY_ROW NULL   NULL              NULL          NULL              0x02

00000032:00000100:0004 LOP_COMMIT_XACTNULL   NULL              NULL          NULL              NULL

*/

 

The above excerpted results indicate that:

·       The log content contain data and meta data, but not the commands causing the data changes.

·       The contents are meaningful to SQL Server, but not meaningful to human beings.

·       One row change may have multiple records in the log file.

·       We can find the beginning and ending time, but we do not know who and what causes the changes from the log records. If you need to track who makes the changes, use SQL Trace or Events, not DBCC LOG. 

·       The log records contain the beginning and ending status of the data, but not the intermediate changes.

2.  Why does the log keep growing?

1)  What are kept in xLog?

a)             Log records before Checkpoint

b)             Transaction not committed yet

c)             Log records not backed up yet

d)             Lo records needed by replications, log shipping, DB mirroring etc.

2)  Why xLog keep growing?

a)             DB in full or bulk_logged recovery model, but log has not been backed up

·               If you do not need the log records, set the DB to simple

·               If you need the log records, back it up 

b)             An open transaction

·               Find it and correct it

c)             A long-running transaction such as create/rebuild index, delete/insert a large volume of data

·               Break these tasks into chunks if possible

·               Do them on off-peak time

d)             Abnormal behaviors caused by replication or DB mirroring

·       Identify/Temporarily stop these tasks and correct the problems

 

3.  Case study: identify the cause of log growth

1)  Step 1 – examine the status of the log use and re-use

-- find the log size and used %

DBCC sqlperf(logspace)

GO

 

/****** Find the status of log reuse:

 

log_reuse_wait_desc = Reuse of transaction log space is currently waiting on one of the following as of the last checkpoint:

 

0 = Nothing

1 = Checkpoint (When a database uses a recovery model and has a memory-optimized data filegroup, you should expect to see the log_reuse_wait column indicate checkpoint or xtp_checkpoint.)

2 = Log Backup

3 = Active backup or restore

4 = Active transaction

5 = Database mirroring

6 = Replication

7 = Database snapshot creation

8 = Log scan

9 = An AlwaysOn Availability Groups secondary replica is applying transaction log records of this database to a corresponding secondary database.

10 = For internal use only

11 = For internal use only

12 = For internal use only

13 = Oldest page

14 = Other

16 = XTP_CHECKPOINT (When a database uses a recovery model and has a memory-optimized data filegroup,you should expect to see the log_reuse_wait column indicate checkpoint or xtp_checkpoint.)

*/

 

SELECT  name ,

        recovery_model_desc ,

        log_reuse_wait ,

        log_reuse_wait_desc

FROM    sys.databases

GO

 

If you a high % of log use, you can

·          Back it up to truncate the log if waiting status is nothing

·          Switch to Simple if you do not need the log data

·          Waiting for the actions to complete to back it up

·          Investigate why the log cannot be backed up or why it increases to a large size again

 

2)  Step 2 – find the oldest open transaction

/*

DBCC OPENTRAN helps to identify active transactions that may be preventing log truncation.

 

DBCC OPENTRAN displays information about the oldest active transaction and the oldest distributed and nondistributed replicated transactions, if any, within the transaction log of the specified database.

 

Results are displayed only if there is an active transaction that exists in the log or if the database contains replication information.

 

Get the spid for the oldest transaction

*/

 

DBCC opentran

GO

 

--With the identified spid for the oldest open transaction, we can find the command it issued

SELECT  st.text ,

        t2.*

FROM    sys.dm_exec_sessions AS t2 ,

        sys.dm_exec_connections AS t1

        CROSS APPLY sys.dm_exec_sql_text(t1.most_recent_sql_handle) AS st

WHERE   t1.session_id = t2.session_id

        AND t1.session_id > 50 

 

3)  Step 3 - Handle the active transaction – now we know who issued it and what it does, we can

a)             Contact the owner of the command to commit it

b)             Contact the owner of the command to roll it back

c)             Kill @spid by ourself. But this option is not the optimal if the open transaction actually needs a long time to finish.