SQL Server and UDF’s Performance Issues

The estimated reading time for this post is 3 minutes

Recently this issue happened once we upgraded a database from SQL Server 2005 to SQL Server 2014 to meet the compliance. Once the upgrading process is done, users tried to do the User Acceptance Test (UAT) to make sure every single functionality in the application and also the database performance still remain intact. A part of the application is doing some BULK INSERT operation to move certain data rows from a table to historical table (I know it is not the best practice to do), this operation took less than a minute to be completed in SQL Server 2005 but it takes few minutes to be completed in SQL Server 2014, Client expected to see some performance gains instead of losing it. After my investigations I found that there is a User Defined Function in Select…Into statement. The UDF is doing some calculation for every single row once the row is moved to the archived/history table.

Based on the findings I tried to simulate the environment in our own server to compare the Transaction activities in the BULK Operation with UDF and without UDF to observe how SQL Server behaves toward this BULK Operation. The following statements are for regenerate the performance issue.

CREATE DATABASE Test;
go
USE Test;
go
CREATE TABLE Active (ID BIGINT IDENTITY(1,1), Padding BINARY(10) DEFAULT 0XFFF, LastUpdate DATETIME DEFAULT SysDateTime());
 
Go
 
INSERT INTO Active DEFAULT VALUES
Go 3000
 
go
 
CREATE FUNCTION GetRecordAge(@LastUpdate AS DATETIME)
RETURNS  BIGINT
AS BEGIN
RETURN (SELECT DATEDIFF(MINUTE,@LastUpdate,Sysdatetime()));
END
 
go
 
SELECT ID, Padding,LastUpdate FROM Active;
--Total Records: 3000
go
 
SELECT * FROM fn_dblog(null,null);
 
--Current LSN: 0000002b:00000037:0002;
 
go
 
SELECT ID,Padding,dbo.GetRecordAge(LastUpdate) AS [MinutesOld] INTO History FROM Active;

As you see on the above statements, there is a UDF that does simple calculation and there are one table called active and also the Select…Into statements that does the data movement operation.

At the next step, I took a deep look into the Transaction Log file by using the following statements:-

SELECT * FROM fn_dblog(null,null);

 

 

The fn_dblog() function reads the entire transaction log and dump all the information, as you see at below , it shows that SQL Server does fully logged the entire transactions of the Select…Into with the UDF.

SELECT COUNT(*),MAX([CURRENT LSN]) FROM fn_dblog(null,null) WHERE [CURRENT LSN]>'0000002b:00000037:0002'
--Transactions Log Record Counts: 3200
--Current LSN: 0000002c:0000010f:0121

 

And below shows that SQL Server does minimally logged the entire transactions of the Select…Into without UDF.

SELECT ID,Padding, DATEDIFF(MINUTE,LastUpdate,Sysdatetime()) AS [MinutesOld] INTO History2 FROM Active;
 
SELECT COUNT(*),MAX([CURRENT LSN]) FROM fn_dblog(null,null) WHERE [CURRENT LSN]>'0000002c:0000010f:0121'
--Transactions Log Record Counts:142
--Current LSN: 0000002c:0000015b:0071

 

Then what is the solution to rectify this performance issue? I thought of the following configurations before finding the real solutions:-

  1. Enable the Instant File Initialization (But it does not work for Transaction Log files)
  2. Enable MAXDOP (It did not work as well)
  3. Disable Non-Clustered Indexes on the Archive/History table (It did not improve that much)
  4. Move the Archive/History table into a filegroup with multiple files. (It did not improve that much)

After reading the entire MSDN regarding User Defined Functions and their behavior and the changes has been done in SQL Server 2014, I found that in SQL Server 2012 and later the Select…Into operation will be fully logged due to data integrity issue as the UDF might read/write data in the same object that cause data corruption, which was not enabled on the earlier versions. To rectify this issue we need to make sure that UserDataAccess option in the UDF is set to 0 by enabling Schemabinding option on the UDF.

If the Select…Into statement still references at least one user-defined function that has UserDataAccess property set to 1, the operation is fully logged.

As you see in the below statements, after I have enabled the Schemabinding option on the scalar function, the number of transaction log records reduced significantly.

ALTER FUNCTION GetRecordAge(@LastUpdate AS DATETIME)
RETURNS  BIGINT WITH Schemabinding
AS BEGIN
	 RETURN (SELECT DATEDIFF(MINUTE,@LastUpdate,Sysdatetime()));
END
 
go
 
SELECT ID,Padding, dbo.GetRecordAge(LastUpdate) AS [MinutesOld] INTO History3 FROM Active;
 
SELECT COUNT(*),MAX([CURRENT LSN]) FROM fn_dblog(null,null) WHERE [CURRENT LSN]>'0000002c:0000015b:0071'
--Transaction Log Record Counts: 194

 

 

Hamid J. Fard

I am SQL Server Data Platform Expert with more than 9 years’ of professional experience, I am currently Microsoft Certified Master: SQL Server 2008, Microsoft Certified Solutions Master: Charter-Data Platform, Microsoft Data Platform MVP and CIW Database Design Specialist. I also do Pig, Hive and Sqoop development regarding Hadoop Big Data platform. After a few years of being a production database administrator I jumped into the role of Data Platform Expert. Being a consultant allows me to work directly with customers to help solve questions regarding database issues for SQL Server.

More Posts

Follow Me:
FacebookLinkedIn


Leave a Comment

6 Comments on "SQL Server and UDF’s Performance Issues"

Notify of
avatar
1000
Sort by:   newest | oldest | most voted
Siavash Golchoobian
Guest

It’s great, Thank you for your useful solution.

vnarayanaraju
Guest

Is this issue resolved by doing the above modification in UDF…

wpDiscuz