SQL Server TSQL Subquery

The estimated reading time for this post is 5 minutes

A subquery is a SELECT statement nested, or embedded, within another query. The nested query (subquery), is the inner query. The query containing the nested query is the outer query. A subquery can be used anywhere an expression is allowed, as a column expression, in a WHERE clause, in a HAVING clause or even in a FROM clause.

The purpose of a subquery is to return results to the outer query.

Consideration of Subquery:

To donate a query as a subquery, enclose it in parentheses. Multiple level of subqueries are supported in SQL server up to 32 levels which is not recommended as it increases the query complexity and Optimizer might not be able to convert them to Inner Join.

If the subquery returns an empty result set, the result of the subquery is converted and returned as a NULL. Ensure your outer query can gracefully handle a NULL, in addition to other expected results.

The form of the results will determine whether the subquery is a scalar or multi-valued subquery:

  1. Scalar subqueries, like scalar functions, returns a single value. Outer queries need to be written to process a single result.
  2. Multi-Valued subqueries return a result much like a single-column table. Outer queries need to be written to handle multiple possible results.

There are two kind of subqueries, Self-Contained Subquery and Correlated Subquery which is explained on the followings:-

SELF-CONTAINED SUBQUERY

Can be written as stand-alone queries, with no dependencies on the outer query. A self-contained subquery is processed once, when the outer query runs and passes its results to that query. It can return its result as scalar or multi-valued.

SELECT  TOP 1 sod.OrderQty , soh.Salesorderid , sod.SalesOrderDetailID , soh.TotalDue , soh.OrderDate 
FROM sales.salesorderheader AS soh
INNER join sales.salesorderdetail AS sod ON soh.salesorderid=sod.salesorderid
WHERE sod.salesorderid = ( SELECT sod.salesorderid FROM sales.salesorderdetail AS sod
      WHERE sod.OrderQty = (SELECT MAX (sod.OrderQty) FROM sales.salesorderdetail AS sod))
      ORDER BY OrderQty DESC

A multi-valued subquery is well suited to return results to the IN predicate, as in the following example:

SELECT SalespersonId , TerritoryId , SUM(TotalDue) AS TotalDue FROM Sales.Salesorderheader
WHERE SalespersonId in ( SELECT BusinessEntityId FROM Sales.Salesperson WHERE SalesYTD>2000000)
GROUP BY SalespersonId , TerritoryId

Correlated Subquery

Correlated subqueries are SELECT statements nested within an outer query. They may also be written as scalar or multi-valued subqueries. They are typically used to pass a value from the outer query to the inner query, to be used as a parameter there. In correlated subqueries,  the inner query receives input from the outer query and conceptually executes once per row in it. Correlated subqueries cannot be executed separately from the outer query. This complicates testing and debugging. Unlike self-contained subqueries which are processed once, correlated subqueries will run multiple times. Logically, the outer query runs first, and for each row returned, the inner query is processed.

SELECT SalesPersonID ,(SELECT FirstName+' '+LastName FROM person.person AS pp
 WHERE pp.BusinessEntityID=soh.salespersonid) AS FullName,
 SUM(TotalDue) AS TotalDue ,
 (SELECT TOP 1 YEAR(orderdate) FROM sales.salesorderheader WHERE orderdate='2005-10-15') AS OrderYear
 FROM sales.salesorderheader AS soh
 GROUP BY salespersonid

It is better to minimize using subqueries and replace them with Inner Join or other type of joins to increase performance.

Many Transact-SQL statements that include subqueries can be alternatively formulated as joins. Other questions can be posed only with subqueries. In Transact-SQL, there is usually no performance difference between a statement that includes a subquery and a semantically equivalent version that does not. However, in some cases where existence must be checked, a join yields better performance. Otherwise, the nested query must be processed for each result of the outer query to ensure elimination of duplicates. In such cases, a join approach would yield better results. The following is an example showing both a subquery SELECT and a join SELECT that return the same result set:

 SELECT SalespersonId , SUM(TotalDue) AS TotalDue , YEAR(Orderdate) AS [YEAR] FROM Sales.Salesorderheader
 WHERE SalesPersonID IN (SELECT BusinessEntityId FROM Person.Person WHERE FirstName =N'David')
 GROUP BY SalesPersonID , YEAR(OrderDate)
 
 SELECT SalespersonId , SUM(TotalDue) AS TotalDue , YEAR(Orderdate) AS [YEAR] FROM Sales.Salesorderheader AS SOH
 INNER join Person.Person AS PP ON SOH.SalesPersonID=PP.BusinessEntityID
 WHERE PP.FirstName =N'David'
 GROUP BY SalesPersonID , YEAR(OrderDate)

Using The EXIST Predicates With Subqueries

SQL Server provides a mechanism for checking whether any results would be returned from a query. The EXISTS predicate evaluates whether rows exist, but rather than return them, it returns TRUE or FALSE. This is a useful technique for validating data without incurring the overhead of retrieving and counting the results.

Compare the following queries:-

 SELECT PP.ProductID , PP.[Name] , PP.ListPrice FROM Production.Product AS PP
 WHERE ( SELECT COUNT(*) FROM Sales.SalesOrderDetail AS SOD
 WHERE PP.ProductID=SOD.ProductID)>0
 
SELECT PP.ProductID , PP.[Name] , PP.ListPrice FROM Production.Product AS PP
 WHERE EXISTS ( SELECT * FROM Sales.SalesOrderDetail AS SOD
 WHERE PP.ProductID=SOD.ProductID)

In the first example, the subquery must count every occurrence of each ProuductID found in the Sales.SalesOrderDetail table, and compare the count results to zero, simply to indicate that the Product has associated orders.

In the second query, EXISTS returns TRUE for an ProductID as soon as one has been found in the Sales.SalesOrderDetail table—a complete accounting of each occurrence is unnecessary.

Another useful application of EXISTS is negating it with NOT, as in the following example, which will return any Product which has never sold.

SELECT PP.ProductID , PP.[Name] , PP.ListPrice FROM Production.Product AS PP
 WHERE  NOT EXISTS ( SELECT * FROM Sales.SalesOrderDetail AS SOD
 WHERE PP.ProductID=SOD.ProductID)

To write queries that use EXISTS with subqueries, consider the following guidelines:

The keyword EXISTS directly follows WHERE. No column name (or other expression) needs to precede it, unless NOT is also used. 

Within the subquery following EXISTS, the SELECT list only needs to contain (*). No rows are returned by the subquery, so no columns need to be specified.

I hope this blog has been able to attract your attention and informative enough.

 

Saeed Jabarpourfard

Saeed is a Microsoft Certified Professional and Microsoft Certified Solutions Associate: SQL Server 2016 Database Developer, with extensive experience in Database Development and Administration. He has been working with SQL Server since year 2010, and he used SQL Server 2008 R2 as his first RDBMS.

More Posts - Website