Designing and Implementing User-Defined Functions

The estimated reading time for this post is 10 minutes

Functions are routines that consist of one or more Transact-SQL statements that you can use to encapsulate code for reuse. In this blog, you will learn to design and implement user-defined functions (UDFs) that enforce business rules or data consistency, and to modify and maintain existing functions that other developers have written. A function takes zero or more input parameters and returns either a scalar value or a table. Functions do not support output parameters, but do return results, either as a single value or a table.

Microsoft SQL Server data management software offers three types of functions: scalar functions, table-valued functions (TVFs), and system functions. You can create two types of TVFs: inline TVFs and multistatement TVFs.

  • Scalar Functions. Scalar functions return a single data value of the type that is defined in a RETURNS clause.
  • Inline Table-Valued Functions. An inline TVF returns a table that is the result of a single SELECT statement. This is similar to a view, but an inline TVF is more flexible in that parameters can be passed to the SELECT statement within the function.
  • Multistatement Table-Valued Functions. A multistatement TVF returns a table that one or more Transact-SQL statements built. It is similar to a stored procedure. Multistatement TVFs are created for the same reasons as inline TVFs, but you can use them when the logic that the function needs to implement is too complex to be expressed in a single SELECT statement. You can call them from within a FROM clause.
  • System Functions. System functions are built-in functions that SQL Server provides to help you perform a variety of operations. You cannot modify them. System functions are described in the next topic.

A function that modifies the underlying database is considered to have “side-effects.” In SQL Server, functions are not permitted to have side-effects. You cannot change data in a database within a function, you may not call a stored procedure, and you may not execute dynamic Structured Query Language (SQL) code.

SQL Server provides a set of system functions to return information about values, objects, and settings in SQL Server. Most of the system functions are scalar functions. They provide the functionality such as operations on data types and conversions between data types. SQL Server also provides a library of mathematical and cryptographic functions.

  • Aggregates such as MIN, MAX, AVG, SUM, and COUNT perform calculations across groups of rows. Many of these functions automatically ignore NULL rows.
  • Ranking functions such as ROW_NUMBER, RANK, DENSE RANK, and NTILE perform windowing operations on rows of data.

User-defined functions are created by using the CREATE FUNCTION statement, modified by using the ALTER FUNCTION statement, and removed by using the DROP FUNCTION statement. Even though you must wrap the body of the function (apart from inline functions) in a BEGIN…END block, CREATE FUNCTION must be the only statement in the batch.

A scalar function returns a single data value of the type that is defined in a RETURNS clause. Unlike the definition of a stored procedure, where it is optional to use a BEGIN…END block that wraps the body of the stored procedure, the body of a function must be defined in a BEGIN…END block. The function body contains the series of Transact-SQL statements that return the value.

Scalar Function Definition Example

CREATE FUNCTION dbo.ExtractProtocolFromURL ( @URL NVARCHAR(1000))
RETURNS NVARCHAR(1000)
AS
BEGIN
       RETURN
              CASE
                     WHEN CHARINDEX(N':',@URL,1) >= 1
                     THEN SUBSTRING(@URL,1,CHARINDEX(N':',@URL,1) - 1)
              END;
 
END;

You can use scalar functions in the following code.

SELECT dbo.ExtractProtocolFromURL(N'http://www.microsoft.com')

Unlike views, the code for scalar functions is not incorporated directly into the surrounding query. The overuse of scalar functions is a common cause of performance problems in SQL Server systems. For example, a WHERE clause predicate that calls a scalar function calls that function for every target row. In many cases, extracting the code from the function definition and incorporating it directly into the query will resolve the performance issue.

Deterministic and Nondeterministic Functions

Both built-in functions and UDFs fall into one of two categories: deterministic and nondeterministic. This distinction is important because it determines where you can use a function.

A deterministic function always returns the same result when it is provided with the same set of input values for the same database state.

Deterministic Function Example

CREATE FUNCTION dbo.AddInteger
(@FirstValue INT, @SecondValue INT)
RETURNS INT
AS
BEGIN
RETURN @FirstValue + @SecondValue;
END;
GO

Every time the function is called with the same two integer values, it will return exactly the same result. A nondeterministic function may return different results for the same set of input values each time it is called, even if the database remains in the same state.

Nondeterministic Function Example

CREATE FUNCTION dbo.CurrentUTCTimeAsString()
RETURNS VARCHAR(40)
AS
BEGIN
RETURN CONVERT(VARCHAR(40),SYSUTCDATETIME(),100);
END;

Each time the function is called, it will return a different value, even though no input parameters are supplied.

You can use the OBJECTPROPERTY() function to determine if a UDF is deterministic.

SELECT OBJECTPROPERTY(OBJECT_ID('dbo.ufnGetAccountingEndDate'),'IsDeterministic')

Unlike scalar functions, TVFs return a table that can contain many rows of data, each with many columns. There are two types of TVFs. Inline TVFs return an output table that is defined by a RETURN statement that consists of a single SELECT statement. Multistatement TVFs that is used when the logic of the function is too complex to include in a single SELECT statement.

You can use both types of TVF as the equivalent of parameterized views. You can use inline functions to achieve the functionality of parameterized views. One of the limitations of a view is that you cannot include a user-provided parameter within the view when you create it.

Inline Table-Valued Function Definition Example

CREATE FUNCTION dbo.ufnGetEmployeesByCountry (@Region NVARCHAR(50))
RETURNS TABLE
AS
RETURN (SELECT VE.BusinessEntityID,
                     VE.Title,
                     VE.FirstName,
                     VE.MiddleName,
                     VE.LastName,
                     VE.JobTitle,
                     VE.EmailAddress,
                     VE.CountryRegionName
              FROM HumanResources.vEmployee VE
              WHERE VE.CountryRegionName = @Region
              );
GO

Note that the return type is TABLE. The definition of the columns of the table is not shown. You do not explicitly define the schema of the returned table. The output table schema is derived from the SELECT statement that you provide within the RETURN statement. Every column that the SELECT statement returns should also have a distinct name.

For inline functions, the body of the function is not enclosed in a BEGIN…END block. A syntax error occurs if you attempt to use this block. The CREATE FUNCTION statement still needs to be the only statement in the batch.

You can use inline table-valued functions in the following code.

SELECT * FROM ufnGetEmployeesByCountry ('Canada')

A multi-statement TVF enables more complexity in how the table to be returned is constructed. You can use UDFs that return a table to replace views. This is very useful when the logic for constructing the return table is more complex than would be possible within the definition of a view. A TVF (like a stored procedure) can use complex logic and multiple Transact-SQL statements to build a table.

Multi-statement Table-Valued Function Definition Example

CREATE FUNCTION [dbo].[ufnGetContactInformation](@PersonID INT)
RETURNS @retContactInformation TABLE
(
    [PersonID] INT NOT NULL,
    [FirstName] [NVARCHAR](50) NULL,
    [LastName] [NVARCHAR](50) NULL,
       [JobTitle] [NVARCHAR](50) NULL,
    [BusinessEntityType] [NVARCHAR](50) NULL
)
AS
BEGIN
       IF @PersonID IS NOT NULL
              BEGIN
              IF EXISTS(SELECT * FROM [HumanResources].[Employee] e
                                  WHERE e.[BusinessEntityID] = @PersonID)
                     INSERT INTO @retContactInformation
                           SELECT @PersonID, p.FirstName, p.LastName, e.[JobTitle], 'Employee'
                           FROM [HumanResources].[Employee] AS e
                                  INNER JOIN [Person].[Person] p
                                  ON p.[BusinessEntityID] = e.[BusinessEntityID]
                           WHERE e.[BusinessEntityID] = @PersonID
                           ORDER BY e.BusinessEntityID;
 
              IF EXISTS(SELECT * FROM [Purchasing].[Vendor] AS v
                                  INNER JOIN [Person].[BusinessEntityContact] bec
                                  ON bec.[BusinessEntityID] = v.[BusinessEntityID]
                                  WHERE bec.[PersonID] = @PersonID)
                     INSERT INTO @retContactInformation
                           SELECT @PersonID, p.FirstName, p.LastName, ct.[Name], 'Vendor Contact'
                           FROM [Purchasing].[Vendor] AS v
                                  INNER JOIN [Person].[BusinessEntityContact] bec
                                  ON bec.[BusinessEntityID] = v.[BusinessEntityID]
                                  INNER JOIN [Person].ContactType ct
                                  ON ct.[ContactTypeID] = bec.[ContactTypeID]
                                  INNER JOIN [Person].[Person] p
                                  ON p.[BusinessEntityID] = bec.[PersonID]
                           WHERE bec.[PersonID] = @PersonID;
 
              IF EXISTS(SELECT * FROM [Sales].[Store] AS s
                                  INNER JOIN [Person].[BusinessEntityContact] bec
                                  ON bec.[BusinessEntityID] = s.[BusinessEntityID]
                                  WHERE bec.[PersonID] = @PersonID)
                     INSERT INTO @retContactInformation
                           SELECT @PersonID, p.FirstName, p.LastName, ct.[Name], 'Store Contact'
                           FROM [Sales].[Store] AS s
                                  INNER JOIN [Person].[BusinessEntityContact] bec
                                  ON bec.[BusinessEntityID] = s.[BusinessEntityID]
                                  INNER JOIN [Person].ContactType ct
                                  ON ct.[ContactTypeID] = bec.[ContactTypeID]
                                  INNER JOIN [Person].[Person] p
                                  ON p.[BusinessEntityID] = bec.[PersonID]
                           WHERE bec.[PersonID] = @PersonID;
 
              IF EXISTS(SELECT * FROM [Person].[Person] AS p
                                  INNER JOIN [Sales].[Customer] AS c
                                  ON c.[PersonID] = p.[BusinessEntityID]
                                  WHERE p.[BusinessEntityID] = @PersonID AND c.[StoreID] IS NULL)
                     INSERT INTO @retContactInformation
                           SELECT @PersonID, p.FirstName, p.LastName, NULL, 'Consumer'
                           FROM [Person].[Person] AS p
                                  INNER JOIN [Sales].[Customer] AS c
                                  ON c.[PersonID] = p.[BusinessEntityID]
                                  WHERE p.[BusinessEntityID] = @PersonID AND c.[StoreID] IS NULL;
              END
 
       RETURN;
END;
GO

The code for a TVF may or may not be incorporated into the query that uses the function depending on what type of TVF it is. Inline TVFs are directly incorporated into the code of the query that uses them. Multi-statement TVFs are not incorporated into the code of the query that uses them. The inappropriate usage of such TVFs is a common cause of performance issues in SQL Server.

You can use the CROSS APPLY operator to call a TVF for each row in the table on the left within the query. Designs that require the calling of a TVF for every row in a table can lead to significant performance overhead. You should examine the design to see if there is a way to avoid the need to call the function for each row.

Guidelines for Creating Functions

  • In many cases, the performance of inline functions is much higher than the performance of multi-statement functions. Wherever possible, try to implement functions as inline functions.
  • Avoid building large, general-purpose functions. Keep functions relatively small and targeted at a specific purpose. This will avoid code complexity but will also increase the opportunities for reusing the functions.
  • Use two-part naming to qualify the name of any database objects within the function and also use two-part naming when you are choosing the name of the function.
  • Consider the impact of using functions in combination with indexes. In particular, note that a WHERE clause that uses a function within the predicate, is likely to remove the usefulness of an index.

Comparing Table-Valued Functions and Stored Procedures

You can often use TVFs and stored procedures to achieve similar outcomes. Each approach has its pros and cons.

  • Although it is possible to access the output rows of a stored procedure by using an INSERT EXEC statement, it is easier to consume the output of a function in code than the output of a stored procedure.
  • Stored procedures can modify data in database tables. Functions cannot modify data in database tables.
  • Functions can have significant performance impacts when they are called for each row in a query, such as when a TVF is called by using a CROSS APPLY or OUTER APPLY
  • Stored procedures can execute dynamic SQL statements. Functions are not permitted to execute dynamic SQL statements.
  • Stored procedures can include detailed exception handling. Functions cannot contain exception handling.
  • Stored procedures can return multiple result sets from a single stored procedure call. TVFs can return a single row set from a function call.

Comparing Table-Valued Functions and Views

TVFs can provide similar outcomes to views.

  • Views, and TVFs that do not contain parameters, can usually be consumed by most client applications that can access tables. Not all such applications can pass parameters to a TVF.
  • It is possible to update views and inline TVFs. This is not the case for multi-statement TVFs.
  • Views can have INSTEAD OF triggers associated with them. This is mostly used to provide for updatable views based on multiple base tables.
  • Views and inline TVFs are incorporated into surrounding queries. Multi-statement TVFs are not incorporated into surrounding queries and often lead to performance issues when they are used inappropriately.

Conclusion

Functions are only one option for implementing code. Now, we know how to Explores situations where other solutions may be appropriate and make decisions about which solution to use. Hope this post was informative, please share it with others if you think it worth to read and stay tuned to learn more about SQL Server.

 

Hamide B. Abshuri

Hamide B. Abshuri is SQL Server Database Developer and Administrator with extensive of working experience. She started work with SQL Server since year 2015. She is currently Microsoft Certified Professional and Microsoft Certified Solutions Associate: SQL Server 2016 Database Developer.

More Posts - Website


Leave a Comment

avatar
1000
  Subscribe  
Notify of