SQL Server 2014 is a great release with high performance features like In Memory OLTP, Updatable Column Store Indexes etc.
These features still remains as the top favorite and the most compelling reasons for organizations to do an upgrade, However SQL Server 2014 also includes very small but very relevant and useful enhancements, and this post talks about one of those enhancement:
SELECT INTO now runs Parallel !
When I first heard about this feature my first impression was: What…was it not running parallel before ?
No, SELECT INTO was single threaded all these years and with SQL Server 2014 this will now use multiple threads.
Let’s do a quick walk-through of this enhancement.
In this walk-through we will use a table named MasterData which has around 12 million rows.
--Select Count SELECT COUNT(*) AS Row_Count FROM [dbo].[MasterData]
Row_Count
11,999,880
Data from this table will be copied over to a different table named MasterData_Copy using SELECT INTO operation.
--Turn on Actual Execution Plan SET STATISTICS TIME ON GO SELECT * INTO [dbo].[MasterData_Copy] FROM [dbo].[MasterData]
This operation was completed in 6 seconds (Desktop class machine) and if we examine the execution plan we can observe parallelism.
And the distribution of row’s among multiple threads:
Now let’s make this little more interesting. We will change the database compatibility level to SQL2012(110) and see what impact that brings to the SELECT INTO operation.
--Change Database Compatibility ALTER DATABASE SELECTINTO_Demo SET COMPATIBILITY_LEVEL = 110
--Turn on Actual Execution Plan SET STATISTICS TIME ON GO SELECT * INTO [dbo].[MasterData_Copy] FROM [dbo].[MasterData]
This operation was completed in 11 seconds and it generated a serial plan.
And here is the row distribution.
<Update 1/13/2015 2 PM>
I received a comment from Toni which says “It might be a good idea to set MAXDOP, most likely having 16 threads isn’t the best option”.
I absolutely agree with this and a well tested MAXDOP setting will definitely bring better results and this is something which you have to keep in mind when queries goes parallel.
</Update>
Conclusion:
If you are using SELECT INTO a lot in your environment, then I would highly recommend you to compare and see what improvements you can again out of SQL Server 2014.
It might be a good idea to set MAXDOP, most likely having 16 threads isn’t the best option…
Oh yeah..absolutely agree…Setting MAXDOP according to workload demands is always recommended. I will update the blog post with your comments.
It is worth mentioning that the SELECT..INTO..in T-SQL dialect is not ANSI/ISO Standard. This would be called a “Singleton Select” and it loads a single row value into local variables, not a table. It is not used much, but it replaces a cursor with much less overhead. It cannot take an ORDER BY and has other minor limitations.
Does this also apply to selecting into local temporary tables? We do that ALOT and it would be good to know that we will see some improvement in the speed of those processes.
I’m not sure..but I’m testing it right now to be sure…will update the comment or will write a new post..Will keep you posted on this.
Yes,it does
Awesome! Thanks for looking into that.
This is how non-SQL programmers fake 1950’s scratch tapes. In declarative programming, we do not use local variables or deliberately materialize data on a disk. The goal is to use a single expression to do the job. This is the joy of having aluminum wheels for your horse and buggy; it makes the wrong approach faster, but it is still the wrong approach.