Target_recovery_time_in_seconds – Indirect checkpoint explained using extended events


 

Today I was querying sys.databases system view in SQL 2012 and was curious to find one column named target_recovery_time_in_seconds.I was pretty sure that this column was not available before 2012.

I decided to learn more about this,and a quick search on BOL lead me to a different world altogether.This modification was part of a new enhancement/feature in SQL 2012 called ‘Indirect Checkpoint’.

Target_Recovery_Time is a new database level (Important to note that this) setting with SQL 2012,which will override the sp_configure option recovery_interval for that particular database.

Lets check the sp_configure value for recovery_interval for one of my test instance.

name minimum  maximum config_value run_value
recovery interval (min) 0 32767 0 0

Now lets check what is the target_recovery_time_in_seconds for my test databases

SELECT name,target_recovery_time_in_seconds from SYS.Databases
name       target_recovery_time_in_seconds
master 0
tempdb 0
model 0
msdb 0
Test_Extd_Blocking 0
AdventureWorks2012 0

Lets now change the Target_Recovery_Time for database AdventureWorks2012 [ Note – Please do not do this directly on a production database without proper test!]

ALTER DATABASE [AdventureWorks2012] SET TARGET_RECOVERY_TIME = 10 seconds
SELECT name,target_recovery_time_in_seconds from SYS.Databases
Where target_recovery_time_in_seconds <>0
name       target_recovery_time_in_seconds
AdventureWorks2012  10

 

What does this mean ?

Checkpoint forces all dirty pages for the current database to be written to disk. Dirty pages are data or log pages modified after entered into the buffer cache, but the modifications have not yet been written to disk.Checkpoint plays an important part in the recovery interval for a database.

Before 2012 the automatic checkpoint was purely controlled using Recovery_Interval sp_configure option,and there was no flexibility to control this at database level.The default value is zero.

There are some limitations with the checkpoint algorithm before 2012,and its very clearly explained by Microsoft PFE Parikshit Savjani (B/T) in his blog post which can be read here. Another blog post which explained the concepts really clear was from Sourabh Agarwal,yet another PFE (B) and his that post can be read here.The major drawbacks of the old algorithm is massive Disk Write activity during checkpoint and unpredictable database recovery time.

With 2012 Indirect checkpoint algorithm there is a Dirty Page list is maintained and also there is a threshold which is maintained for target dirty buffer.A background process Recovery Writer polls the dirty pages to compare the threshold and flushes them.The whole process is very clearly explained in the above mentioned blog posts and I do not want to repeat the same here.Please refer both the blogs to get a detailed explanation with videos.

We can use extended events to check the behavior of Checkpoints and you can add the events checkpoint_begin and checkpoint_end as per requirement.

I did a quick check on behavior of checkpoint for a very small database which has one table.Below are the test results.A simple statement was wrtitten to insert 80000 rows to the table.

Use[Demo_Checkpoints]
INSERT INTO Test VALUES(1000000)
GO 80000

It took 1 minute and 17 seconds to complete the whole insert.

Default checkpoint behavior

The extended event session came up with the below results.

For the default behavior a checkpoint was applied 5 times.

Behavior after changing the Target Recovery_Time for the database

For the purpose of testing I changed the Target_Recovery_Time for the database as 90 seconds.The event session was triggered again.

This time checkpoint was applied for 3 times and I could easily understand the behavior changes.

One of the most important facts which I observed that the same INSERT query took 1 minute and 28 seconds to complete this time.This clearly states the fact that changing the default behavior needs lot of testing and should be used with lot of caution.

Thanks for reading.

Advertisement

4 thoughts on “Target_recovery_time_in_seconds – Indirect checkpoint explained using extended events

  1. This is very interesting, We recently moved to 2012 on a huge 64 processor machine – super fast and we are experiencing the dreaded “SQL Server has encountered 1045 occurrence(s) of I/O requests taking longer than 15 seconds to complete on file:…” The new Indirect checkpoints are supposed to “smooth out” I/O spikes, but is it possible for them to have a negative effect on I/O?

    • Thanks and glad that you liked it. Indirect checkpoints should only be used on rare occasions after conducting proper test cases. There can be negative effects as you mentioned.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s