Cumulative Update 1 for SQL Server 2012 – Why am I interested in this?


During the month of April 2012 Microsoft released CU1 for SQL Server 2012.

Earlier I focused only on Service Packs (SP) whenever they were released, and ignored Cumulative Updates (CU). I have now moved on from that strategy.

I have decided to pay attention to CUs as well whenever they are released, and have made up my mind to read about and understand the hotfixes included in each CU.

Some of the fixes included in CU1 for SQL 2012 which grabbed my attention are mentioned below. You will need to refer to http://support.microsoft.com/kb/2679368 for the full list of fixes.

1. FIX: Low CPU spikes at a set interval even without user activity in SQL Server 2008 R2 or in SQL Server 2012.

This issue has a very interesting symptom -

You have a huge database which has heaps, and you have enabled Read Committed Snapshot Isolation (RCSI). You might experience low CPU spikes every 5 minutes without any user activity!
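
If you want to quickly check which of your databases are running with RCSI enabled (and are therefore candidates for this symptom), a simple query against sys.databases will tell you.

--List the databases which have Read Committed Snapshot Isolation enabled
SELECT name 
FROM   sys.databases 
WHERE  is_read_committed_snapshot_on = 1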

2. FIX: Backup fails in SQL Server 2008, in SQL Server 2008 R2 or in SQL Server 2012 if you enable change tracking on the database

The symptom for this issue is that the backup will fail if change tracking is enabled on the database and the database has been recovered at least once after change tracking was enabled.
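
To quickly check which databases on an instance have change tracking enabled, the sys.change_tracking_databases catalog view can be queried as shown below.

--List the databases which have change tracking enabled
SELECT Db_name(database_id) AS DatabaseName 
FROM   sys.change_tracking_databases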

I tried to repro this situation on the RTM build, but never hit the issue. In case you are getting this error in the situation mentioned above, you can test out CU1.

3. FIX: It takes a long time to restore a database in SQL Server 2008 R2, SQL Server 2008 or SQL Server 2012

The symptom of this issue is a long duration for restoring databases which have many Virtual Log Files (VLFs).
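
If you want to see how many VLFs a database currently has, the classic (undocumented) DBCC LOGINFO command can be used; each row it returns represents one VLF.

--Each row returned represents one VLF in the current database
DBCC LOGINFO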

4. FIX: SQL Server Configuration Manager takes a long time to open if many SQL Server 2008 R2 or SQL Server 2012 instances are installed.

The symptom is Configuration Manager taking a long time to open when there are many instances configured on the same server. I don’t prefer running multiple instances for a really busy OLTP environment, as sharing resources between the instances and managing them is a tough deal. These points are best discussed during the initial planning phase itself.

Conclusion

The above list of fixes covers just the ones which grabbed my attention, as these issues might be common.

In case you are facing an issue which has been fixed as per the KB article mentioned above, then you should test/validate CU1 and plan for deployment. The strategy of testing/deploying CUs varies from environment to environment, and it should be thoroughly discussed within the DBA team and well documented.

The compressed (zipped) folder is invalid or corrupted – Not always true!


Have you ever encountered this error message when trying to extract zipped backup files using the built-in ‘Extract All‘ facility of Windows -

The error message will first give you the impression that the zip file might be corrupt.

This might be true in some cases – a file not downloaded fully from FTP, a file corrupted while being copied over a network, etc.

In most of the cases I have experienced, though, the zip file was fully functional and never had any corruption. The Extract All facility still throws errors even though the zip file is valid.

I am not a big fan of the default zip/unzip facility provided by the OS, hence I depend on tools like WinRAR and PKZIP, which offer better compression rates, speed, etc.

I would recommend extracting the zipped file using any of the above mentioned tools and checking whether the zip file is really corrupted.

The same zip file which showed errors when extracted using the default Extract All feature proved fine and was inflated using the PKZIP utility.

Conclusion

It’s not always true that the zipped file is corrupt, hence you should always leverage third-party tools to test the file’s validity.

Thanks for reading.

Book Review – SQL Server Hardware by Glenn Berry


I’m really happy and excited to write my first book review. I just completed reading the wonderful book SQL Server Hardware, written by deep SQL Server technical expert Glenn Berry (B/T).

I will rate this book with 

I am pretty confident in saying that this book will make you “a better DBA/Consultant“.

Glenn, with his deep technical experience, has structured the book really well, and it’s worth reading each and every page. Trust me, you will love it!

What this book is all about – 

In simple words, this book will help you set up SQL Server environments with great confidence.

Setting up highly reliable, high-performing environments is never an easy task, and by reading this book you will be better equipped to deal with multiple decision-making situations.

In most environments there is a good chance that a senior DBA will be responsible for selecting the best hardware to run SQL Server workloads. He/she should have thorough knowledge of the workload expected to run on the new system, and based on those facts the CPU/Memory/IO decisions need to be made.

Within the CPU category the DBA has to further decide on the number of sockets, cores, threads, etc. When it comes to IO, there are lots of other factors which need to be decided, like which RAID level should be used. There are lots of factors which need to be considered for memory too.

The DBA who makes the hardware decision should be aware of the latest trends in processor/memory/IO technology and should be able to benchmark them with confidence. Lacking hardware-related knowledge can stress out the person and can cause confusion about what to select from the large set of options available in the market.

This book will help that person who is going to select the best possible hardware to run SQL Server workloads.

The best part of this book is that it not only covers the hardware portion of SQL Server, but also some great and exciting areas like selecting the best operating system, tuning the OS, SQL Server versions, editions, comparisons, patching, and instance-level/database-level configurations.

A quick glance at the chapters included in this book is given below -

1. The first chapter explains the hardware fundamentals.

2. This chapter deals with the storage subsystem.

3. This chapter was something really new and exciting to me and it deals with database and hardware benchmarking.

4. Deals with hardware investigation tools.

5. Explains Windows OS editions and benefits of running the correct version for SQL Server workloads.

6. Deals with SQL Server editions and versions.

7. Explains installation, configuration of SQL Server for optimum performance.

You can get this book from Amazon and they offer both Kindle and Paperback options.

http://www.amazon.com/SQL-Server-Hardware-Glenn-Berry/dp/1906434638

Conclusion

This book gave me a lot of confidence and I would recommend it as a ‘Must Read’. I would like to convey my sincere thanks to Glenn for writing such a valuable book and helping the SQL Server community.

Pull user-mode VAS limit details for x86 and x64 servers using sys.dm_os_ring_buffers


We all know that the user-mode VAS (Virtual Address Space) limit for an x86 server is 2 GB, while for an x64 server it is a pretty huge 8 TB.

If you are interested in reading a lot more about VAS, then I would recommend reading the article written by Jonathan Kehayias (B/T) here.

I was trying to figure out how I could demonstrate/confirm this fact to users during technical talks or during the initial discussion phase of a project.

The answer to my question was found in the sys.dm_os_ring_buffers DMV. This DMV exposes the ring buffer structures within SQL Server and can be used to pull multiple pieces of diagnostic information.

This DMV is relatively undocumented; however, there are enough blog posts out there on the web, written by deep technical experts, which will come to your rescue.

To double-check the VAS details you need to query this DMV with a condition of ring_buffer_type = RING_BUFFER_RESOURCE_MONITOR. This allows you to see memory state changes due to various types of memory pressure.

SELECT Cast(record AS XML),  * 
FROM   sys.dm_os_ring_buffers 
WHERE  ring_buffer_type = 'RING_BUFFER_RESOURCE_MONITOR' 
go

We will get an XML record as the output of this code, and you can review the XML to get information related to VAS.
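
If you prefer to shred the XML instead of eyeballing it, a sketch like the one below can be used. Please note that the node names used here (Record/MemoryRecord/TotalPhysicalMemory and AvailableVirtualAddressSpace) are assumptions on my side; verify them against the actual XML returned on your build before relying on the output.

--Shred the ring buffer XML (node names are assumptions - verify against your own output)
SELECT x.record.value('(Record/MemoryRecord/TotalPhysicalMemory)[1]', 'bigint')          AS TotalPhysicalMemoryKB, 
       x.record.value('(Record/MemoryRecord/AvailableVirtualAddressSpace)[1]', 'bigint') AS AvailableVASKB 
FROM   (SELECT Cast(record AS XML) AS record 
        FROM   sys.dm_os_ring_buffers 
        WHERE  ring_buffer_type = 'RING_BUFFER_RESOURCE_MONITOR') AS x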

Below are captured views of the XML records for both the x86 and x64 versions of SQL Server. You can clearly understand the VAS limit from these.

For x86 versions

For x64 versions

If you closely observe the XML data, you can see that there is a lot of very useful memory-related information which you can pull. This info can be really helpful in your troubleshooting process.

Conclusion

sys.dm_os_ring_buffers is a really powerful DMV which exposes the ring buffer structures within SQL Server, and you can leverage it to troubleshoot multiple issues.

Thanks for reading.

SQL Server has encountered occurrence(s) of cachestore flush – Warning !!!


You might have seen the below messages in the SQL Server error log at some point -

SQL Server has encountered 1 occurrence(s) of cachestore flush for the ‘Object Plans’ cachestore (part of plan cache) due to some database maintenance or reconfigure operations.
SQL Server has encountered 1 occurrence(s) of cachestore flush for the ‘SQL Plans’ cachestore (part of plan cache) due to some database maintenance or reconfigure operations.
SQL Server has encountered 1 occurrence(s) of cachestore flush for the ‘Bound Trees’ cachestore (part of plan cache) due to some database maintenance or reconfigure operations.

In the event you see these messages, your next course of action should be to check whether any of the databases have the ‘Auto Close’ property set to True.

There are other conditions too which trigger these messages; however, you don’t want your production databases to have Auto Close set to True at all.

We can query the sys.databases view to get these details pretty easily.

--Query the sys.databases view for autoclose details
SELECT name 
FROM   sys.databases 
WHERE  is_auto_close_on <> 0
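
If any production database shows up in that list, you will most likely want to turn the option off right away. A minimal sketch is below; the database name is just a placeholder.

--Turn Auto Close off (replace the placeholder database name)
ALTER DATABASE [YourDatabase] SET AUTO_CLOSE OFF WITH NO_WAIT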

Obviously this question comes to your mind – what is the relation between Auto Close being turned ON and this cachestore flush message?

The answer to that question is explained well in MSDN.

 When the database is set to AUTOCLOSE = ON, an operation that initiates an automatic database shutdown clears the plan cache for the instance of SQL Server. Clearing the plan cache causes a recompilation of all subsequent execution plans and can cause a sudden, temporary decrease in query performance.

Let’s do a quick demo to demonstrate this behaviour.

Demo

1. I am using the Adventureworks database running on SQL Server 2008 R2.

2. I will now run a simple SELECT * from one of the tables. I will run the statement twice just to increase the execution count.

--Select query to pull the records
SELECT * 
FROM   production.WorkOrder

3. I will now query the plan cache to retrieve the details of the SELECT statement run above.

--Query plan cache for finding the SELECT Statement
SELECT CP.objtype                          AS ObjectType, 
       Object_name(SXT.objectid, SXT.dbid) AS ObjectName, 
       CP.usecounts                        AS ExecutionCount, 
       SXT.TEXT                            AS QueryText, 
       CP.plan_handle 
FROM   sys.dm_exec_cached_plans AS CP 
       CROSS apply sys.Dm_exec_sql_text(CP.plan_handle) AS SXT 
WHERE  CP.usecounts = 2

You can clearly see the query text (the SELECT statement) and the plan_handle in the cache. Note down the plan_handle, as it will be required later on.

0x06000800557BB714B8E0F30A000000000000000000000000

4. Now we will set the Auto Close property for the Adventureworks database to True.

--Set Autoclose to ON
USE [master]
GO
ALTER DATABASE [Adventureworks] SET AUTO_CLOSE ON WITH NO_WAIT
GO

5. Next we will again query the plan cache to confirm that the details have been flushed, as the Auto Close property was turned ON and the database was closed.

--Query plan cache for finding the SELECT Statement
SELECT CP.objtype                          AS ObjectType, 
       Object_name(SXT.objectid, SXT.dbid) AS ObjectName, 
       CP.usecounts                        AS ExecutionCount, 
       SXT.TEXT                            AS QueryText, 
       CP.plan_handle 
FROM   sys.dm_exec_cached_plans AS CP 
       CROSS apply sys.Dm_exec_sql_text(CP.plan_handle) AS SXT 
WHERE plan_handle = 0x06000800557BB714B8E0F30A000000000000000000000000

(0 row(s) affected)

Conclusion

Next time you see these messages, make sure the Auto Close property is cross-checked and ensure that it is not turned ON.

Mirroring connection timeout – Looking back at the best practices!


How many times have you followed a best practices checklist when deploying a mission-critical solution like Database Mirroring, Log Shipping or even Failover Cluster Instances?

If the answer is “No, I have never followed any best practices checklist!”, then it’s the right time to change that perspective.

Let’s look at the Wikipedia definition of best practice -

best practice is a method or technique that has consistently shown results superior to those achieved with other means, and that is used as a benchmark. In addition, a “best” practice can evolve to become better as improvements are discovered. Best practice is considered by some as a business buzzword, used to describe the process of developing and following a standard way of doing things that multiple organizations can use.

This definition makes perfect sense for SQL Server too. We have a great set of information out there which clearly talks about best practices when dealing with SQL Server deployments.

A simple example would be – your data and log files should not reside on the same drive.

There will always be a reason for a best practice, and a best practice is not constant. It can evolve over time and become better. (I like to stress this sentence a lot!)

Recently I observed a question related to Database Mirroring (DBM), wherein the user was getting the following event logged in the SQL error log -

The mirroring connection to “TCP://abc.xyz.com:5022″ has timed out for database “DB” after 10 seconds without a response.  Check the service and network connections.

This is directly related to a best practice within DBM. The failover mechanism within database mirroring is designed in such a way that the instances send a ping to each other to check if they are alive.

A simple pictorial representation of the Principal instance and the Witness instance sending a ping to the Mirror instance is shown below. This is a one-way representation only.

A ping is sent every 1 second.

The mirror/witness instances will also send pings to check if the principal is alive. In the event of the Principal not being available, the Mirror instance will talk to the Witness and get confirmation that the Principal is indeed unavailable. The next course of action is initiated from then on.

This course of action depends on something called the Partner Timeout. The default is 10 seconds, and you can change this value. As per best practices, the value should be 10 seconds or above.
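
To check the value currently in effect, you can query the sys.database_mirroring catalog view; the mirroring_connection_timeout column reports the partner timeout in seconds for each mirrored database.

--Check the current partner timeout (in seconds) for mirrored databases
SELECT Db_name(database_id)         AS DatabaseName, 
       mirroring_connection_timeout AS PartnerTimeoutInSeconds 
FROM   sys.database_mirroring 
WHERE  mirroring_guid IS NOT NULL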

Let’s assume that your DBM environment is configured in such a way that it utilizes synchronous mirroring with automatic failover. If the Mirror instance does not receive any response to its ping from the Principal for 10 seconds, then a failover will be initiated [under the default Partner Timeout setting].

Let’s go back to the error log message which was discussed earlier. In this situation the user was not using automatic failover (no witness), and the ping was between the Principal and the Mirror instance.

There was network latency between the servers, hence the ping from the Principal did not receive any response for 10 seconds and an event was logged. As there was no witness, an automatic failover was not triggered.

If there had been a witness for automatic failover, it would have been pretty dangerous to initiate a failover just because of network latency.

The solution for the above user is to set a higher Partner Timeout value.

ALTER DATABASE <dbname> SET PARTNER TIMEOUT 20

This command has to be run on the Principal server.

Conclusion

Lowering this number to achieve quick failure responses is a myth! The default value of 10 should never be changed to a lower value. Higher values can be set after doing a proper analysis of the DBM environment and failover requirements.

I am curious to learn whether you have set a higher value for this in your environment, and if yes, what the reason was. Please share your thoughts as comments here.

Thanks for reading.

Backups on secondary replicas – AlwaysOn Availability Groups


How many times have you seen heavy CPU utilization and heavy disk I/O whenever a backup job is running on a production SQL Server? The CPU utilization hits top values if backup compression is enabled for the instance with the intention of reducing backup sizes.

Would you love to offload these backup jobs from the main production server and avoid these high CPU/IO hits?

This is exactly what SQL 2012 is capable of doing for you. The databases which are part of availability groups can be backed up according to a set preference.

Some of the important points to mention here are -

1. The backup on the primary replica still works if you need to go with that option.

2. Only copy-only full backups are allowed on secondaries.

3. Log backups can also be done on multiple secondaries, and they will all form a single chain.

4. Differential backups are not supported on secondaries; you would need to run differentials on the primary replica.

After setting up the Availability groups you have an option to set the backup preference.

Let’s say we have a database Adventureworks2012 sitting on 2 replicas, WIN2008NodeA\AlwaysONSrv1 (Primary) and WIN2008NodeB\AlwaysONSrv2 (Secondary), and you would like backups to happen only on the secondary; you can set that up. Refer to the below screens to get a clear understanding of the options which you have.
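
If you prefer T-SQL over the wizard screens, the same preference can also be set using ALTER AVAILABILITY GROUP. The sketch below is only an illustration; the availability group name is a placeholder, and you should verify the exact option values against Books Online for your build.

--Set the automated backup preference so that backups happen only on secondaries
ALTER AVAILABILITY GROUP [MyAvailabilityGroup] 
SET (AUTOMATED_BACKUP_PREFERENCE = SECONDARY_ONLY)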

If you are using T-SQL jobs to schedule your backups, then there is a new function which you would need to incorporate if you are planning to implement backups on secondary replicas.

The function is sys.fn_hadr_backup_is_preferred_replica, and it will return a value of 1 if the current replica is the preferred one for backups.

For syntax details you can refer to http://msdn.microsoft.com/en-us/library/hh213235.aspx

A very simple piece of code which you can incorporate in your T-SQL backup jobs could be -

-- Skip the backup on replicas that are not the preferred backup replica
DECLARE @dbname sysname = N'Adventureworks2012';
IF sys.fn_hadr_backup_is_preferred_replica(@dbname) <> 1 
BEGIN
    RETURN; -- If this is not the preferred replica, exit (without error).
END
-- If this is the preferred replica, continue to do the backup.

In case you are using a maintenance plan schedule to back up the databases, the SQL 2012 backup task is smart enough to do this check.

Conclusion


In case you are dealing with issues like backups running for a long duration and causing a good amount of performance problems on your mission-critical servers, then you can think about moving to Availability Groups and making use of backups on a secondary replica. The need for choosing this option should be evaluated very well.

Thanks for reading.

Network validation warnings during setting up cluster on a Virtual Machine (Lab setup)


Last week I was pretty much focused on setting up a lab environment to test SQL 2012 Availability Groups and Failover Cluster Instances.

I had to set up 3 VMs to play around with, and one of them would be a domain controller.

Once all the VMs were ready and started communicating with each other within the domain, I decided to set up the cluster.

The very first process involved in setting up a cluster is to validate the configuration, so I went ahead with the validation part.

I was running all these VMs on my laptop and had no option for shared storage, and I was pretty sure that I could just ignore all the warnings I was going to get for the disks during cluster validation.

The validation completed and came up with one expected warning and one additional warning.

The additional warning was related to the network.

It was simply telling me that I was running on one network card and there was no redundant one available to handle a crisis situation. This is a single point of failure.

This is just a warning and will allow you to proceed with setting up the cluster; however, it is giving you a very important message.

For lab situations this is normal; however, for production deployments you should have redundant network cards.

                     [Picture courtesy http://support.microsoft.com/kb/813937 ]

Conclusion


It’s pretty fine to ignore these warnings when you are doing a lab setup; however, you should be very careful when dealing with highly critical production environments.

mscorlib.tlb could not be loaded – SSMS just won’t agree to run!


Today I opened SSMS on one of my SQL 2005 boxes, and to my surprise I got this error right away -

“The file C:\WINDOWS\Microsoft.NET\Framework\%version%\mscorlib.tlb could not be loaded.”

I was running SQL 2005 SP3 (yes, you heard it right!) and the only .NET Framework installed on the box was .NET 2.0.

I went ahead to check the folder C:\Windows\Microsoft.Net\Framework, and to my surprise I found that the Framework folder was missing!

How did that happen? Before getting into identifying what caused this, I had to fix the problem with SSMS. SSMS had to be opened directly on the server (I agree that it’s not a good practice; however, the situation demanded it).

I decided to try my luck with repairing the .NET Framework, as this was one of the options I had. I went ahead and did a repair of .NET Framework 2.0 from the Add/Remove Programs options.

The repair went smoothly, and I could see the missing files/folders getting created again while the setup was running.

SSMS launched perfectly and the issue was fixed.

Conclusion


You can also do this repair in case you face a similar issue. I will be focusing on why this corruption happened over the coming days and will update this post if I get some hints.

Thanks for reading and keep repairing !