Introduction
SQL Server 2000 supports User-Defined Functions (UDFs) - one or more Transact-SQL statements that can be used to encapsulate code for reuse. User-defined functions cannot make a permanent changes to the data or modify database tables. UDF can change only local objects for this UDF, such as local cursors or variables.There are three types of UDF in SQL Server 2000:
Scalar functions return a single data value (not a table) with RETURNS clause. Scalar functions can use all scalar data types, with exception of timestamp and user-defined data types.
Inline table-valued functions return the result set of a single SELECT statement.
Multistatement table-valued functions return a table, that was built with many TRANSACT-SQL statements.
User-defined functions can be invoked from a query like built-in functions such as OBJECT_ID, LEN, DATEDIFF, or can be executed through an EXECUTE statement like stored procedures.
1 . Scalar Functions
Scalar functions return a data type such as int, money, varchar, real, etc. They can be used anywhere a built-in SQL function is allowed. The syntax for a scalar function is the following:
- CREATE FUNCTION [owner_name.] function_name
- ( [{ @parameter_name scalar_parameter_type [ = default]} [,..n]])
- RETURNS scalar_return_type
- [WITH <function_option> >::={SCHEMABINDING | ENCRYPTION]
- [AS]
- BEGIN
- function_body
- RETURN scalar_expression
- END
- A simple scalar function to cube a number would look like this:
- CREATE FUNCTION dbo.Cube( @fNumber float)
- RETURNS float
- AS
- BEGIN
- RETURN(@fNumber * @fNumber * @fNumber)
- END
- Surprisingly, user-defined functions (UDFs) support recursion. Here is an
- SQL Server 2000 UDF using the standard factorial example:
- CREATE FUNCTION dbo.Factorial ( @iNumber int )
- RETURNS INT
- AS
- BEGIN
- DECLARE @i int
- IF @iNumber <= 1
- SET @i = 1
- ELSE
- SET @i = @iNumber * dbo.Factorial( @iNumber - 1 )
- RETURN (@i)
- END
On Other Example
The following scalar function returns a maximum amount of books sold for a specified title. This function should be executed in pubs database. If the title has no sales, the UDF will return zero.
/* scalar function to return max amount of books sold
** for the specified title_id
*/
CREATE FUNCTION dbo.udf_max_copies_sold_for_title (@title_id CHAR(6))
RETURNS INT
AS
BEGIN
DECLARE @qty INT
-- initialize the variable at 0:
SELECT @qty = 0
SELECT
@qty = MAX(qty)
FROM sales
WHERE
title_id = @title_id
/* If there are no books sold for title_id specified
** then return 0:
*/
RETURN ISNULL(@qty, 0)
END
Now we can execute this function as follows: SELECT dbo.udf_max_copies_sold_for_title ('bu2075')
Results:
-----------
55
2. In-Line Table Functions
In-line table functions are functions that return the output of a single SELECT statement as a table data type. Since this type of function returns a table, the output can be used in joins of queries as if it was a standard table. The syntax for an in-line table function is as follows:
Other Example
- CREATE FUNCTION [owner_name.] function_name
- ( [{ @parameter_name scalar_parameter_type [ = default]} [,..n]])
- RETURNS TABLE
- [WITH <function_option>::={SCHEMABINDING | ENCRYPTION}]
- RETURN [(] select_statement [)]
- An in-line function to return the authors from a particular state would
- look like this:
- CREATE FUNCTION dbo.AuthorsForState(@cState char(2) )
- RETURNS TABLE
- AS
- RETURN (SELECT * FROM Authors WHERE state = @cState)
The following in-line function returns the product name, quantity ordered, and supplier for the top five best sellers in a particular year in the Northwind database:
/* in-line function to return quantity, product name and supplier
** for the top 5 best-selling products within the specified year
*/
CREATE FUNCTION dbo.udf_top_5_best_sellers (@year INT)
RETURNS TABLE
AS
RETURN
SELECT TOP 5
SUM(quantity) AS quantity_ordered,
a.ProductID,
ProductName,
CompanyName AS Supplier
FROM [order details] a INNER JOIN products b
ON a.productid = b.productid
INNER JOIN suppliers c ON c.supplierid = b.supplierid
INNER JOIN orders d ON d.orderid = a.orderid
AND DATEPART(YEAR, OrderDate) = @year
GROUP BY a.productid, productname, CompanyName
ORDER BY 1 DESC
We can execute this function as follows:
SELECT * FROM dbo.udf_top_5_best_sellers (1998)
Results:
quantity_ordered |
ProductID |
ProductName |
Supplier |
659 |
13 |
Konbu |
Mayumi's |
546 |
24 |
Guaraná Fantástica |
Refrescos Americanas LTDA |
542 |
60 |
Camembert Pierrot |
Gai pturage |
513 |
59 |
Raclette Courdavault |
Gai pturage |
414 |
7 |
Uncle Bob's Organic Dried Pears |
Grandma Kelly's Homestead |
Alternatively, we can SELECT only the desired columns from this function:
SELECT ProductID,
ProductName
FROM dbo.udf_top_5_best_sellers (1998)
Results:
ProductID |
ProductName |
13 |
Konbu |
24 |
Guaraná Fantástica |
60 |
Camembert Pierrot |
59 |
Raclette Courdavault |
7 |
Uncle Bob's Organic Dried Pears |
3. Multistatement Table Functions
Multistatement table functions are similar to stored procedures except that they return a table. This type of function is suited to address situations where more logic is required than can be expressed in a single query. The following is the syntax for a multistatement table function:
Hierarchical data, such as an organizational structure, is an example of data that cannot be gathered in a single query. The Northwind Company database's Employees table contains a field called ReportsTo that contains the EmployeeID of the employee's manager. GetManagerReports is a multistatement table function that returns a list of the employees who report to a specific employee, either directly or indirectly.
- CREATE FUNCTION [owner_name.] function_name
- ( [{ @parameter_name scalar_parameter_type [ = default]} [,..n]])
- RETURNS TABLE
- [WITH <function_option> >::={SCHEMABINDING | ENCRYPTION]
- [AS]
- BEGIN
- function_body
- RETURN
- END
The output of this function would be used in the same manner as a standard table. Figure 1 demonstrates JOINING the output of GetManagerReports with the Employees table to produce a listing of the organizational structure of the Northwind Company:
- CREATE FUNCTION dbo.GetManagerReports ( @iEmployeeID int )
- RETURNS @ManagerReports TABLE
- (
- EmployeeID int,
- EmployeeFirstName nvarchar(10),
- EmployeeLastName nvarchar(20),
- Title nvarchar(30),
- TitleOfCourtesy nvarchar(25),
- Extension nvarchar(4),
- ManagerID int
- )
- AS
- BEGIN
- DECLARE
- @iRowsAdded int, -- Counts rows added to
- -- table with each iteration
- @PREPROCESSED tinyint, -- Constant
- for record prior
- -- to processing
- @PROCESSING tinyint, -- Constant
- for record
- -- being processed
- @POSTPROCESSED tinyint -- Constant for
- records that
- -- have been processed
- SET @PREPROCESSED = 0
- SET @PROCESSING = 1
- SET @POSTPROCESSED = 2
- DECLARE @tblReports TABLE (
- -- Holds employees added with each pass thru source employees table
- EmployeeID int,
- EmployeeFirstName nvarchar(10),
- EmployeeLastName nvarchar(20),
- Title nvarchar(30),
- TitleOfCourtesy nvarchar(25),
- Extension nvarchar(4),
- ManagerID int,
- ProcessedState tinyint
- DEFAULT 0
- )
- --Begin by adding employees who report to the Manager directly.
- INSERT INTO @tblReports
- SELECT EmployeeID, FirstName, LastName, Title, TitleOfCourtesy,
- Extension, ReportsTo, @PREPROCESSED
- FROM Employees
- WHERE ReportsTo = @iEmployeeID
- --Save number of direct reports
- SET @iRowsAdded = @@ROWCOUNT
- -- Loop through Employees table until no more iterations are necessary
- -- (e.g., no more rows added) to add all indirect reports.
- WHILE @iRowsAdded > 0
- BEGIN
- --Set just added employees ProcessedState to PROCESSING
- -- (for first pass)
- UPDATE @tblReports
- SET ProcessedState = @PROCESSING
- WHERE ProcessedState = @PREPROCESSED
- --Add employees who report to Managers in
- -- ProcessedState = PROCESSING
- INSERT INTO @tblReports
- SELECT e.EmployeeID, e.FirstName, e.LastName, e.Title,
- e.TitleOfCourtesy, e.Extension, e.ReportsTo, @PREPROCESSED
- FROM Employees e
- INNER JOIN @tblReports r ON e.ReportsTo = r.EmployeeID
- WHERE r.ProcessedState = @PROCESSING
- AND e.ReportsTo <> @iEmployeeID
- --Save number of rows added for this iteration
- SET @iRowsAdded = @@ROWCOUNT
- --Set ProcessedState to POSTPROCESSED for Managers whose
- --reports were added in this iteration
- UPDATE @tblReports
- SET ProcessedState = @POSTPROCESSED
- WHERE ProcessedState = @PROCESSING
- END
- --Save all data to output table
- INSERT INTO @ManagerReports
- SELECT EmployeeID, EmployeeFirstName, EmployeeLastName, Title,
- TitleOfCourtesy, Extension, ManagerID
- FROM @tblReports
- RETURN
- END
Figure 1: User function used in JOIN query.
Other Example
The following multi-statement function accepts a delimited list as a parameter parses the string and returns the table containing each of the values in the parameter.
/* multi-statement function to parse
** a delimited list and turn it into a rowset
** DEFAULT delimiter is comma
*/
CREATE FUNCTION dbo.parse_comma_delimited_integer(
@list VARCHAR(8000),
@delimiter VARCHAR(10) = ',')
-- table variable that will contain values
RETURNS @tablevalues TABLE (
item INT)
AS
BEGIN
DECLARE @item VARCHAR(255)
/* Loop over the commadelimited list */
WHILE (DATALENGTH(@list) > 0)
BEGIN
IF CHARINDEX(@delimiter,@list) > 0
BEGIN
SELECT @item =
SUBSTRING(@list,1,(CHARINDEX(@delimiter, @list)-1))
SELECT @list =
SUBSTRING(@list,(CHARINDEX(@delimiter, @list) +
DATALENGTH(@delimiter)),DATALENGTH(@list))
END
ELSE
BEGIN
SELECT @item = @list
SELECT @list = NULL
END
-- Insert each item into temp table
INSERT @tablevalues (
item)
SELECT item = CONVERT(INT, @item)
END
RETURN
END
Now we can execute this function as follows:
SELECT * FROM dbo.parse_comma_delimited_integer
('39, 549, 324, 3556, 24, 2132, 345', ',')
Results:
item |
39 |
549 |
324 |
3556 |
24 |
2132 |
345 |
The following multi-statement UDF written for the Northwind database finds customers who have bought products in large quantities during a given year. The UDF accepts the year and amount of total sales and returns the customer name, the product that they have bought in large quantities, and all sales representatives who interfaced with these customers within the given year:
CREATE FUNCTION dbo.udf_top_customers_and_reps (
@year INT,
@amount INT)
RETURNS
@temp TABLE(
ProductName VARCHAR(200),
CategoryName VARCHAR(200),
CustomerID CHAR(5),
CompanyName VARCHAR(200),
TotalSales INT,
EmployeeNames VARCHAR(2000))
AS
BEGIN
/* populate the temp table with customers that have purchased any product
* with total sales greater than the specified amount, within the given year
*/
INSERT @temp (
ProductName ,
CategoryName ,
CompanyName ,
CustomerID ,
TotalSales
)
SELECT
ProductName,
CategoryName,
CompanyName,
b.CustomerID,
SUM(a.UnitPrice * quantity) AS total_sales
FROM [order details] a
INNER JOIN orders b ON a.orderid = b.orderid
INNER JOIN products c ON c.productid = a.productid
INNER JOIN customers d ON d.customerid = b.customerid
INNER JOIN categories e ON e.CategoryID = c.CategoryID
WHERE DATEPART(YEAR, OrderDate) = @year
GROUP BY c.ProductName, e.CategoryName, b.CustomerID,
d.CompanyName, DATEPART(YEAR, OrderDate)
HAVING SUM(a.UnitPrice * quantity) > @amount
ORDER BY ProductName
/* now get all the employees that have been involved with the customers
* in the given year and return them in a comma-delimited list
*/
DECLARE @CustomerID CHAR(5),
@EmployeeName VARCHAR(200)
DECLARE @Employees TABLE (
EmployeeName VARCHAR(80))
DECLARE CustomerCursor CURSOR FOR
SELECT CustomerID
FROM @temp a
OPEN CustomerCursor
FETCH NEXT FROM CustomerCursor INTO @CustomerID
WHILE @@FETCH_STATUS = 0
BEGIN
INSERT @employees
SELECT DISTINCT FirstName + ' ' + LastName
FROM Employees a INNER JOIN Orders b ON a.EmployeeID = b.EmployeeID
AND DATEPART(YEAR,OrderDate) = @year
WHERE b.CustomerID = @CustomerID
/* create a comma-delimited list of employees */
SELECT @EmployeeName = ''
SELECT @EmployeeName = @EmployeeName + ', ' + EmployeeName
FROM @Employees
SELECT @EmployeeName = SUBSTRING(@EmployeeName, 3, LEN(@EmployeeName)-2)
UPDATE @temp
SET EmployeeNames = @EmployeeName
WHERE CustomerID = @CustomerID
DELETE @Employees
FETCH NEXT FROM CustomerCursor INTO @CustomerID
END
CLOSE CustomerCursor
DEALLOCATE CustomerCursor
RETURN
END
The following execution of the function returns customers who have bought any
product in amounts of over $10,000 during 1997: SELECT * FROM dbo.udf_top_customers_and_reps (1997, 10000)
Results:
ProductName |
CategoryName |
CustomerID |
Companyname |
TotalSales |
EmployeeNames |
Côte de Blaye |
Beverages |
MEREP |
Mère Paillarde |
10329 |
Janet Leverling, Laura Callahan, Margaret Peacock, Michael
Suyama, Nancy Davolio, Robert King |
Côte de Blaye |
Beverages |
SIMOB |
Simons bistro |
10540 |
Andrew Fuller, Margaret Peacock, Robert King |
Invoking Functions
There are a few syntax idiosyncrasies to observe when invoking user-defined functions. SQL Server 2000 provides some system-level user-defined functions in the Master database. These system functions are invoked with a slightly different syntax than ones that you would create. System functions that return a table have the following syntax:
- ::function_name ([argument_expr], [,...])
- System functions that return a scalar value use this syntax:
- function_name ([argument_expr], [,...])
- User-created scalar and rowset functions are invoked in exactly the same
- manner. The syntax for invoking a user-created function looks like this:
- [database_name] owner_name. function_name ([argument_expr], [,...])
Limitations
User-defined functions do have some restrictions placed upon them. Not every SQL statement or operation is valid within a function. The following lists enumerate the valid and invalid function operations: Valid:
- Assignment statements
- Control-flow statements
- Variable declarations
- SELECT statements that modify local variables
- Cursor operations that fetch into local variables
- INSERT, UPDATE, DELETE statement that act upon local table variables
- Built-in, nondeterministic functions such as GetDate()
- Statements that update, insert, or delete tables or views
- Cursor fetch operations that return data to the client
Performance Implications
Using UDFs will impact the performance of queries. The extent of the performance impact depends upon how and where you use a user-defined function. This is also true of built-in functions. However, UDFs have the potential for more dramatic performance hits than built-in functions. You should exercise caution when implementing functions in your queries and perform benchmarking tests to insure that the benefits of using your functions exceed the performance costs of using them.
Uses for Functions
Check constraints
Scalar user-defined functions can be used as check constraints for columns in table definitions. As long as an argument to the function is a constant or built-in function or an argument is the column being checked, the function may be used to validate the column's value. These UDF check constraints provide the ability to use more complex logic for determining acceptable column values than Boolean expressions or LIKE patterns would allow.
The following function validates that a serial number follows a specific pattern and portions of the serial number match a specific algorithm for a product type.
Computed columns
CREATE FUNCTION dbo.ValidSerialNumber( @nvcSerialNumber nvarchar(50)) RETURNS BIT AS BEGIN DECLARE @bValid BIT, @iNumber INT --default to invalid serial number SET @bValid = 0 --Home Office Product IF @nvcSerialNumber LIKE '[0-9][A-Z][0-9][A-Z][0-9][0-9][0-9][0-9]' BEGIN SET @iNumber = CONVERT(int,RIGHT(@nvcSerialNumber,4)) IF @iNumber % 7 = 2 BEGIN SET @bValid = 1 END END -- Video Game IF @nvcSerialNumber LIKE '[0-9][0-9][0-9][A-Z][0-9]5[A-Z]' BEGIN SET @iNumber = CONVERT(int,LEFT(@nvcSerialNumber, 3)) IF @iNumber % 2 = 0 BEGIN SET @bValid = 1 END END RETURN ( @bValid) END CREATE TABLE dbo.CustomerProduct ( CustomerID int NOT NULL PRIMARY KEY, ProductID int NOT NULL, SerialNumber nvarchar(20) NOT NULL CHECK(dbo.ValidSerialNumber(SerialNumber) = 1) )
Scalar functions can be used to compute column values in table definitions. Arguments to computed column functions must be table columns, constants, or built-in functions. This example shows a table that uses a Volume function to compute the volume of a container:
You should note that computed columns might be excluded from being indexed if user-defined functions determine their value. An index can be created on the computed column if the user-defined function is deterministic (e.g., always returns the same value given the same input).
CREATE FUNCTION dbo.Volume ( @dHeight decimal(5,2), @dLength decimal(5,2), @dWidth decimal(5,2) ) RETURNS decimal (15,4) AS BEGIN RETURN (@dHeight * @dLength * @dWidth ) END CREATE TABLE dbo.Container ( ContainerID int NOT NULL PRIMARY KEY, MaterialID int NOT NULL REFERENCES Material(MaterialID), ManufacturerID int NOT NULL REFERENCES Manufacturer(ManufacturerID) Height decimal(5,2) NOT NULL, Length decimal(5,2) NOT NULL, Width decimal(5,2) NOT NULL, Volume AS ( dbo.Volume( Height, Length, Width ) ) )
Default constraints
Default column values can be set with user-defined functions. UDFs can be very useful when a hard-coded value or built-in function does not suffice. For example, if a doctor's office wished to save a patient's appointment preference, a user-defined function could calculate the default day and time in a function by using the current date/time when the patient's record was created. If the patient's record were created on a Friday at 10:34 AM the AppointmentPref column would default to "Friday at 10:00" using the following function:
Assignments
CREATE FUNCTION dbo.AppointmentPreference ( @dtDefaultDateTime datetime ) RETURNS nvarchar(50) AS BEGIN DECLARE @nDay nvarchar(10), @nHour nvarchar(6), @nPreference nvarchar(50), @tiHour tinyint --Get date description SET @nDay = DATENAME(dw, @dtDefaultDateTime ) --Find current hour SET @tiHour = DATEPART(hh,@dtDefaultDateTime) --Use only 12-hour times IF @tiHour > 12 BEGIN SET @tiHour = @tiHour - 12 END --Don't allow appointments during lunch IF @tiHour = 12 BEGIN SET @tiHour = 1 END -- These are invalid hours IF @tiHour IN(5,6,7,8) BEGIN SET @tiHour = 4 END --Create preference text SET @nPreference = RTRIM(@nDay) + '''s at ' + CONVERT(varchar(2),@tiHour) + ':00' RETURN ( @nPreference) END CREATE TABLE dbo.Patient ( PatientID int NOT NULL PRIMARY KEY IDENTITY, FirstName nvarchar(20) NOT NULL, LastName nvarchar(20) NOT NULL, Addr1 nvarchar(50), Addr2 nvarchar(50), City nvarchar(50), State nvarchar(2), ZipCode nvarchar(20), HomePhone nvarchar(20), WorkPhone nvarchar(20), AppointmentPref nvarchar(50) DEFAULT (dbo.AppointmentPreference(GETDATE())) )
Scalar user-defined functions can be used to assign values to scalar variables. They may be used in any situation where a scalar built-in function may be used.
Control flow
DECLARE @fCube float SET @fCube = dbo.Cube( 4.5 )
Scalar user-defined functions may be used to control program flow when used in Boolean expressions.
Case expressions
IF dbo.ValidSerialNumber('002A15A') = 1 PRINT 'Yes' ELSE PRINT 'No'
User-defined functions that return a scalar value can be used in any of the cases of CASE expressions. The following example uses the DailySpecial function in a case function to determine what to display for a given day:
Alternative to views
CREATE FUNCTION dbo.DailySpecial( @nvcDay nvarchar(10)) RETURNS NVARCHAR(100) AS BEGIN DECLARE @nvcSpecial nvarchar(100) SET @nvcDay = UPPER(@nvcDay) IF @nvcDay = 'SUNDAY' SET @nvcSpecial = 'Roast beef with green beans and baked potato' IF @nvcDay = 'MONDAY' SET @nvcSpecial = 'Chopped beef with green bean casserole' IF @nvcDay = 'TUESDAY' SET @nvcSpecial = 'Beef stew' IF @nvcDay = 'WEDNESDAY' SET @nvcSpecial = 'Beef pot pie' IF @nvcDay = 'THURSDAY' OR @nvcDay = 'FRIDAY' OR @nvcDay = 'SATURDAY' SET @nvcSpecial = 'Beef surprise' RETURN ( @nvcSpecial ) END --Use output of DailySpecial function SELECT Special = CASE DateName(dw, getdate()) WHEN 'Sunday' THEN dbo.DailySpecial('Sunday') WHEN 'Monday' THEN dbo.DailySpecial('Monday') WHEN 'Tuesday' THEN dbo.DailySpecial('Tuesday') WHEN 'Wednesday' THEN dbo.DailySpecial('Wednesday') ELSE 'It's a mystery!' END
Rowset functions, functions that return tables, can be used as alternatives to read-only views. Since views are limited to a single select statement, user-defined functions can provide greater functionality than a view. Powerful logic can be used when determining the records returned, which is not possible within a view. Also, views cannot accept parameters so a separate view must be created if the WHERE clause must change for different search c riteria.
Alternative to temporary tables Rowset functions can be used as alternatives to temporary tables. For example, if you wished to find authors in the Pubs database who sold no books in a particular state, you could create a couple of functions that would generate the desired resultset.
To find the quantity of books sold for a particular author in a given state you could write the following function:
You could then create another function that would use the output from the first function to find the authors in a particular state that have not had any sales:
CREATE FUNCTION dbo.AuthorPoularityForState ( @cState Char(2)) RETURNS TABLE AS RETURN ( SELECT a.au_id, a.au_fname, a.au_lname, SUM(s.qty) AS QTY FROM Authors a INNER JOIN TitleAuthor ta ON a.au_id = ta.au_id INNER JOIN Titles t ON t.title_id = ta.title_id INNER JOIN Sales s ON s.title_id = t.title_id INNER JOIN Stores st ON st.Stor_ID = s.stor_id WHERE st.state = @cState GROUP BY a.au_id, a.au_fname, a.au_lname ORDER BY QTY DESC, a.au_lname, a.au_fname ) END
The following SQL statement would list the California authors who had not sold any books:
CREATE FUNCTION dbo.ReallyBoringAuthorsForState ( @cState Char(2) ) RETURNS TABLE RETURN( SELECT a.au_id as AuthorID, a.au_fname AS AuthorFirstName, a.au_lname AS AuthorLastName, @cState AS State FROM AuthorPopularityForState(@cState ) pa RIGHT JOIN authors a ON pa.AuthorID = a.au_id WHERE IsNull(pa.UnitsSold, 0) = 0 )
SELECT AuthorLastName, AuthorFirstName, AuthorID FROM ReallyBoringAuthorsForState('CA') ORDER BY AuthorLastName, AuthorFirstName
List of authors without book sales
Before the release of SQL Server 2000, temporary tables would likely have been used to generate the interim data to be used for the final query output. By using functions instead of temporary tables, potential table name concurrency problems are avoided. Functions also offer greater code reuse than temporary tables.