Creating a subquery to retrieve data from more than one table

When you need to include in your WHERE clause selection criteria that only exists in another table, you can add subqueries to a SQL statement to retrieve the values that satisfy the condition. A subquery is a complete query that appears in the WHERE or HAVING clause of an SQL statement.

You can specify up to 16 subqueries within a single SQL statement, and you can specify subqueries within a subquery. Subqueries run from last to first within the main SQL statement in which they appear.

Rules for creating a subquery

  • Enclose the subquery in parentheses.
  • Specify only one column or expression in a subquery unless you are using IN, ANY, ALL, or EXISTS.
  • A subquery cannot contain a BETWEEN or LIKE clause.
  • A subquery cannot contain an ORDER BY clause.
  • A subquery in an UPDATE statement cannot retrieve data from the same table in which data is to be updated.
  • A subquery in a DELETE statement cannot retrieve data from the same table in which data is to be deleted.
The following query displays the names and IDs of employees who work in Boston. The subquery (in parentheses) finds the department number for the location of BOSTON in the Q.ORG table. Then the main query selects the names of the employees in that department from the Q.STAFF table.
SELECT NAME, ID
  FROM Q.STAFF
  WHERE DEPT=(SELECT DEPTNUMB
              FROM Q.ORG
              WHERE LOCATION='BOSTON')
In the next example, the subquery and main query retrieve data from the same table. The subquery calculates the average salary for all the employees in the Q.STAFF table. Then, the main query selects the salespeople whose salaries are equal to or greater than the average salary.
SELECT ID, NAME, SALARY
  FROM Q.STAFF
  WHERE JOB = 'SALES' AND
    SALARY >= (SELECT AVG(SALARY)
               FROM Q.STAFF)

Retrieving more than one value with a subquery

Usually a subquery selects only one column and returns only one value to the query. However, you can create a subquery that uses the ANY or ALL keywords with comparison operators (=, ¬=, >, >=, <, or <=) to return a particular set of values. In addition, just as you use the IN keyword in place of multiple OR statements in a query, you can also use IN in place of the ANY keyword in a subquery.

The query in the following figure selects any employee who works in the Eastern division. The subquery finds the department numbers in the Eastern division, and then the main query selects the employees who work in any of these departments.

Use the ANY keyword for this query, because it is likely that the subquery will find more than one department in the Eastern division. If you use the ALL keyword instead of the ANY keyword, no data is selected because no employee works in all departments of the Eastern division.

Figure 1. This SQL query contains a subquery that uses the ANY keyword.
SELECT NAME, ID
  FROM Q.STAFF
  WHERE DEPT = ANY
      (SELECT DEPTNUMB
       FROM Q.ORG
       WHERE DIVISION='EASTERN')

The query in the following figure selects the department with the highest average salary. The subquery finds the average salary for each department, and then the main query selects the department with the highest average salary.

Use the ALL keyword for this subquery. The department selected by the query must have an average salary greater than or equal to all the average salaries of the other departments.

Figure 2. This SQL query contains a subquery using the ALL keyword.
SELECT DEPT, AVG(SALARY)
  FROM Q.STAFF
  GROUP BY DEPT
  HAVING AVG(SALARY) >= ALL
        (SELECT AVG(SALARY)
         FROM Q.STAFF
         GROUP BY DEPT)

The query in following figure selects all salespeople (and their salaries) who work for managers earning more than $20,000 a year. The subquery finds the managers who earn more than $20,000 a year, and the main query selects the salespeople who work for those managers.

Use the IN keyword for this subquery because you need to find values from more than one department.

Figure 3. This SQL query contains a subquery using the IN keyword.
SELECT ID, NAME, SALARY
  FROM Q.STAFF
  WHERE JOB = 'SALES'
    AND DEPT IN
       (SELECT DISTINCT DEPT
        FROM Q.STAFF
        WHERE JOB = 'MGR'
          AND SALARY > 20000)

You can also specify NOT IN in a subquery to select information from one table when corresponding information does not exist in the other table.

Checking for rows that satisfy a condition

Not only can you use a subquery to return a value to the query, you can also use a WHERE EXISTS clause in a subquery to check for rows that satisfy a certain row condition.

The query in the following figure selects employees from the Q.STAFF table who have a salary of less than $14,000 and who work in a department where at least one other employee with the same job earns a salary greater than $14,000. (The subquery checks for other employees in the department with the same job, but who earn a salary greater than $14,000.)
Figure 4. This subquery checks for rows that satisfy a condition.
SELECT NAME, DEPT, JOB, SALARY
  FROM Q.STAFF S
  WHERE S.SALARY < 14000 AND
        EXISTS (SELECT * FROM Q.STAFF
        WHERE S.DEPT=DEPT AND SALARY >14000
        AND S.JOB=JOB)
  ORDER BY S.DEPT

Specifying a correlation name in a subquery

You can specify a correlation name in a subquery to use a condition to evaluate every row retrieved by the main query.

The query in the following figure selects the department, name, and salary of the employees who have the highest salary in their departments. The subquery calculates the maximum salary for each department that is selected by the main query. The correlation name, Y, compares each row that is selected by the query to the maximum salary that is calculated for the department in the subquery.

Figure 5. This subquery specifies a correlation name.
SELECT DEPT, NAME, SALARY
  FROM Q.STAFF Y
  WHERE SALARY = (SELECT MAX (SALARY)
                  FROM Q.STAFF
                  WHERE DEPT = Y.DEPT)