Creating a subquery to retrieve data from more than one table
When you need to include in your WHERE clause selection criteria that only exists in another table, you can add subqueries to a SQL statement to retrieve the values that satisfy the condition. A subquery is a complete query that appears in the WHERE or HAVING clause of an SQL statement.
You can specify up to 16 subqueries within a single SQL statement, and you can specify subqueries within a subquery. Subqueries run from last to first within the main SQL statement in which they appear.
Rules for creating a subquery
- Enclose the subquery in parentheses.
- Specify only one column or expression in a subquery unless you are using IN, ANY, ALL, or EXISTS.
- A subquery cannot contain a BETWEEN or LIKE clause.
- A subquery cannot contain an ORDER BY clause.
- A subquery in an UPDATE statement cannot retrieve data from the same table in which data is to be updated.
- A subquery in a DELETE statement cannot retrieve data from the same table in which data is to be deleted.
SELECT NAME, ID
FROM Q.STAFF
WHERE DEPT=(SELECT DEPTNUMB
FROM Q.ORG
WHERE LOCATION='BOSTON')
SELECT ID, NAME, SALARY
FROM Q.STAFF
WHERE JOB = 'SALES' AND
SALARY >= (SELECT AVG(SALARY)
FROM Q.STAFF)
Retrieving more than one value with a subquery
Usually a subquery selects only one column and returns only one value to the query. However, you can create a subquery that uses the ANY or ALL keywords with comparison operators (=, ¬=, >, >=, <, or <=) to return a particular set of values. In addition, just as you use the IN keyword in place of multiple OR statements in a query, you can also use IN in place of the ANY keyword in a subquery.
The query in the following figure selects any employee who works in the Eastern division. The subquery finds the department numbers in the Eastern division, and then the main query selects the employees who work in any of these departments.
Use the ANY keyword for this query, because it is likely that the subquery will find more than one department in the Eastern division. If you use the ALL keyword instead of the ANY keyword, no data is selected because no employee works in all departments of the Eastern division.
SELECT NAME, ID
FROM Q.STAFF
WHERE DEPT = ANY
(SELECT DEPTNUMB
FROM Q.ORG
WHERE DIVISION='EASTERN')
The query in the following figure selects the department with the highest average salary. The subquery finds the average salary for each department, and then the main query selects the department with the highest average salary.
Use the ALL keyword for this subquery. The department selected by the query must have an average salary greater than or equal to all the average salaries of the other departments.
SELECT DEPT, AVG(SALARY)
FROM Q.STAFF
GROUP BY DEPT
HAVING AVG(SALARY) >= ALL
(SELECT AVG(SALARY)
FROM Q.STAFF
GROUP BY DEPT)
The query in following figure selects all salespeople (and their salaries) who work for managers earning more than $20,000 a year. The subquery finds the managers who earn more than $20,000 a year, and the main query selects the salespeople who work for those managers.
Use the IN keyword for this subquery because you need to find values from more than one department.
SELECT ID, NAME, SALARY
FROM Q.STAFF
WHERE JOB = 'SALES'
AND DEPT IN
(SELECT DISTINCT DEPT
FROM Q.STAFF
WHERE JOB = 'MGR'
AND SALARY > 20000)
You can also specify NOT IN in a subquery to select information from one table when corresponding information does not exist in the other table.
Checking for rows that satisfy a condition
Not only can you use a subquery to return a value to the query, you can also use a WHERE EXISTS clause in a subquery to check for rows that satisfy a certain row condition.
SELECT NAME, DEPT, JOB, SALARY
FROM Q.STAFF S
WHERE S.SALARY < 14000 AND
EXISTS (SELECT * FROM Q.STAFF
WHERE S.DEPT=DEPT AND SALARY >14000
AND S.JOB=JOB)
ORDER BY S.DEPT
Specifying a correlation name in a subquery
You can specify a correlation name in a subquery to use a condition to evaluate every row retrieved by the main query.
The query in the following figure selects the department, name, and salary of the employees who have the highest salary in their departments. The subquery calculates the maximum salary for each department that is selected by the main query. The correlation name, Y, compares each row that is selected by the query to the maximum salary that is calculated for the department in the subquery.
SELECT DEPT, NAME, SALARY
FROM Q.STAFF Y
WHERE SALARY = (SELECT MAX (SALARY)
FROM Q.STAFF
WHERE DEPT = Y.DEPT)