Functions are routines that consist of one or more Transact-SQL statements that you can use to encapsulate code for reuse. In this blog, you will learn to design and implement user-defined functions (UDFs) that enforce business rules or data consistency, and to modify and maintain existing functions that other developers have written. A function takes zero or more input parameters and returns either a scalar value or a table. Functions do not support output parameters, but do return results, either as a single value or a table.
Microsoft SQL Server data management software offers three types of functions: scalar functions, table-valued functions (TVFs), and system functions. You can create two types of TVFs: inline TVFs and multistatement TVFs.
- Scalar Functions. Scalar functions return a single data value of the type that is defined in a RETURNS clause.
- Inline Table-Valued Functions. An inline TVF returns a table that is the result of a single SELECT statement. This is similar to a view, but an inline TVF is more flexible in that parameters can be passed to the SELECT statement within the function.
- Multistatement Table-Valued Functions. A multistatement TVF returns a table that one or more Transact-SQL statements built. It is similar to a stored procedure. Multistatement TVFs are created for the same reasons as inline TVFs, but you can use them when the logic that the function needs to implement is too complex to be expressed in a single SELECT statement. You can call them from within a FROM clause.
- System Functions. System functions are built-in functions that SQL Server provides to help you perform a variety of operations. You cannot modify them. System functions are described in the next topic.
SQL Server provides a set of system functions to return information about values, objects, and settings in SQL Server. Most of the system functions are scalar functions. They provide the functionality such as operations on data types and conversions between data types. SQL Server also provides a library of mathematical and cryptographic functions.
- Aggregates such as MIN, MAX, AVG, SUM, and COUNT perform calculations across groups of rows. Many of these functions automatically ignore NULL rows.
- Ranking functions such as ROW_NUMBER, RANK, DENSE RANK, and NTILE perform windowing operations on rows of data.
User-defined functions are created by using the CREATE FUNCTION statement, modified by using the ALTER FUNCTION statement, and removed by using the DROP FUNCTION statement. Even though you must wrap the body of the function (apart from inline functions) in a BEGIN…END block, CREATE FUNCTION must be the only statement in the batch.
A scalar function returns a single data value of the type that is defined in a RETURNS clause. Unlike the definition of a stored procedure, where it is optional to use a BEGIN…END block that wraps the body of the stored procedure, the body of a function must be defined in a BEGIN…END block. The function body contains the series of Transact-SQL statements that return the value.
Scalar Function Definition Example
CREATE FUNCTION dbo.ExtractProtocolFromURL ( @URL NVARCHAR(1000)) RETURNS NVARCHAR(1000) AS BEGIN RETURN CASE WHEN CHARINDEX(N':',@URL,1) >= 1 THEN SUBSTRING(@URL,1,CHARINDEX(N':',@URL,1) - 1) END; END;
You can use scalar functions in the following code.
Unlike views, the code for scalar functions is not incorporated directly into the surrounding query. The overuse of scalar functions is a common cause of performance problems in SQL Server systems. For example, a WHERE clause predicate that calls a scalar function calls that function for every target row. In many cases, extracting the code from the function definition and incorporating it directly into the query will resolve the performance issue.
Deterministic and Nondeterministic Functions
Both built-in functions and UDFs fall into one of two categories: deterministic and nondeterministic. This distinction is important because it determines where you can use a function.
A deterministic function always returns the same result when it is provided with the same set of input values for the same database state.
Deterministic Function Example
CREATE FUNCTION dbo.AddInteger (@FirstValue INT, @SecondValue INT) RETURNS INT AS BEGIN RETURN @FirstValue + @SecondValue; END; GO
Every time the function is called with the same two integer values, it will return exactly the same result. A nondeterministic function may return different results for the same set of input values each time it is called, even if the database remains in the same state.
Nondeterministic Function Example
CREATE FUNCTION dbo.CurrentUTCTimeAsString() RETURNS VARCHAR(40) AS BEGIN RETURN CONVERT(VARCHAR(40),SYSUTCDATETIME(),100); END;
Each time the function is called, it will return a different value, even though no input parameters are supplied.
You can use the OBJECTPROPERTY() function to determine if a UDF is deterministic.
Unlike scalar functions, TVFs return a table that can contain many rows of data, each with many columns. There are two types of TVFs. Inline TVFs return an output table that is defined by a RETURN statement that consists of a single SELECT statement. Multistatement TVFs that is used when the logic of the function is too complex to include in a single SELECT statement.
You can use both types of TVF as the equivalent of parameterized views. You can use inline functions to achieve the functionality of parameterized views. One of the limitations of a view is that you cannot include a user-provided parameter within the view when you create it.
Inline Table-Valued Function Definition Example
CREATE FUNCTION dbo.ufnGetEmployeesByCountry (@Region NVARCHAR(50)) RETURNS TABLE AS RETURN (SELECT VE.BusinessEntityID, VE.Title, VE.FirstName, VE.MiddleName, VE.LastName, VE.JobTitle, VE.EmailAddress, VE.CountryRegionName FROM HumanResources.vEmployee VE WHERE VE.CountryRegionName = @Region ); GO
Note that the return type is TABLE. The definition of the columns of the table is not shown. You do not explicitly define the schema of the returned table. The output table schema is derived from the SELECT statement that you provide within the RETURN statement. Every column that the SELECT statement returns should also have a distinct name.
For inline functions, the body of the function is not enclosed in a BEGIN…END block. A syntax error occurs if you attempt to use this block. The CREATE FUNCTION statement still needs to be the only statement in the batch.
You can use inline table-valued functions in the following code.
SELECT * FROM ufnGetEmployeesByCountry ('Canada')
A multi-statement TVF enables more complexity in how the table to be returned is constructed. You can use UDFs that return a table to replace views. This is very useful when the logic for constructing the return table is more complex than would be possible within the definition of a view. A TVF (like a stored procedure) can use complex logic and multiple Transact-SQL statements to build a table.
Multi-statement Table-Valued Function Definition Example
CREATE FUNCTION [dbo].[ufnGetContactInformation](@PersonID INT) RETURNS @retContactInformation TABLE ( [PersonID] INT NOT NULL, [FirstName] [NVARCHAR](50) NULL, [LastName] [NVARCHAR](50) NULL, [JobTitle] [NVARCHAR](50) NULL, [BusinessEntityType] [NVARCHAR](50) NULL ) AS BEGIN IF @PersonID IS NOT NULL BEGIN IF EXISTS(SELECT * FROM [HumanResources].[Employee] e WHERE e.[BusinessEntityID] = @PersonID) INSERT INTO @retContactInformation SELECT @PersonID, p.FirstName, p.LastName, e.[JobTitle], 'Employee' FROM [HumanResources].[Employee] AS e INNER JOIN [Person].[Person] p ON p.[BusinessEntityID] = e.[BusinessEntityID] WHERE e.[BusinessEntityID] = @PersonID ORDER BY e.BusinessEntityID; IF EXISTS(SELECT * FROM [Purchasing].[Vendor] AS v INNER JOIN [Person].[BusinessEntityContact] bec ON bec.[BusinessEntityID] = v.[BusinessEntityID] WHERE bec.[PersonID] = @PersonID) INSERT INTO @retContactInformation SELECT @PersonID, p.FirstName, p.LastName, ct.[Name], 'Vendor Contact' FROM [Purchasing].[Vendor] AS v INNER JOIN [Person].[BusinessEntityContact] bec ON bec.[BusinessEntityID] = v.[BusinessEntityID] INNER JOIN [Person].ContactType ct ON ct.[ContactTypeID] = bec.[ContactTypeID] INNER JOIN [Person].[Person] p ON p.[BusinessEntityID] = bec.[PersonID] WHERE bec.[PersonID] = @PersonID; IF EXISTS(SELECT * FROM [Sales].[Store] AS s INNER JOIN [Person].[BusinessEntityContact] bec ON bec.[BusinessEntityID] = s.[BusinessEntityID] WHERE bec.[PersonID] = @PersonID) INSERT INTO @retContactInformation SELECT @PersonID, p.FirstName, p.LastName, ct.[Name], 'Store Contact' FROM [Sales].[Store] AS s INNER JOIN [Person].[BusinessEntityContact] bec ON bec.[BusinessEntityID] = s.[BusinessEntityID] INNER JOIN [Person].ContactType ct ON ct.[ContactTypeID] = bec.[ContactTypeID] INNER JOIN [Person].[Person] p ON p.[BusinessEntityID] = bec.[PersonID] WHERE bec.[PersonID] = @PersonID; IF EXISTS(SELECT * FROM [Person].[Person] AS p INNER JOIN [Sales].[Customer] AS c ON c.[PersonID] = p.[BusinessEntityID] WHERE p.[BusinessEntityID] = @PersonID AND c.[StoreID] IS NULL) INSERT INTO @retContactInformation SELECT @PersonID, p.FirstName, p.LastName, NULL, 'Consumer' FROM [Person].[Person] AS p INNER JOIN [Sales].[Customer] AS c ON c.[PersonID] = p.[BusinessEntityID] WHERE p.[BusinessEntityID] = @PersonID AND c.[StoreID] IS NULL; END RETURN; END; GO
The code for a TVF may or may not be incorporated into the query that uses the function depending on what type of TVF it is. Inline TVFs are directly incorporated into the code of the query that uses them. Multi-statement TVFs are not incorporated into the code of the query that uses them. The inappropriate usage of such TVFs is a common cause of performance issues in SQL Server.
You can use the CROSS APPLY operator to call a TVF for each row in the table on the left within the query. Designs that require the calling of a TVF for every row in a table can lead to significant performance overhead. You should examine the design to see if there is a way to avoid the need to call the function for each row.
Guidelines for Creating Functions
- In many cases, the performance of inline functions is much higher than the performance of multi-statement functions. Wherever possible, try to implement functions as inline functions.
- Avoid building large, general-purpose functions. Keep functions relatively small and targeted at a specific purpose. This will avoid code complexity but will also increase the opportunities for reusing the functions.
- Use two-part naming to qualify the name of any database objects within the function and also use two-part naming when you are choosing the name of the function.
- Consider the impact of using functions in combination with indexes. In particular, note that a WHERE clause that uses a function within the predicate, is likely to remove the usefulness of an index.
Comparing Table-Valued Functions and Stored Procedures
You can often use TVFs and stored procedures to achieve similar outcomes. Each approach has its pros and cons.
- Although it is possible to access the output rows of a stored procedure by using an INSERT EXEC statement, it is easier to consume the output of a function in code than the output of a stored procedure.
- Stored procedures can modify data in database tables. Functions cannot modify data in database tables.
- Functions can have significant performance impacts when they are called for each row in a query, such as when a TVF is called by using a CROSS APPLY or OUTER APPLY
- Stored procedures can execute dynamic SQL statements. Functions are not permitted to execute dynamic SQL statements.
- Stored procedures can include detailed exception handling. Functions cannot contain exception handling.
- Stored procedures can return multiple result sets from a single stored procedure call. TVFs can return a single row set from a function call.
Comparing Table-Valued Functions and Views
TVFs can provide similar outcomes to views.
- Views, and TVFs that do not contain parameters, can usually be consumed by most client applications that can access tables. Not all such applications can pass parameters to a TVF.
- It is possible to update views and inline TVFs. This is not the case for multi-statement TVFs.
- Views can have INSTEAD OF triggers associated with them. This is mostly used to provide for updatable views based on multiple base tables.
- Views and inline TVFs are incorporated into surrounding queries. Multi-statement TVFs are not incorporated into surrounding queries and often lead to performance issues when they are used inappropriately.
Functions are only one option for implementing code. Now, we know how to Explores situations where other solutions may be appropriate and make decisions about which solution to use. Hope this post was informative, please share it with others if you think it worth to read and stay tuned to learn more about SQL Server.