SELECT (to retrieve rows)

Use the SELECT command to retrieve rows from a table or view. It returns the rows that satisfy the criteria you specify, including the WHERE condition. If you omit the WHERE condition, the SELECT command chooses from all rows.

Note: For a description of how to use the SELECT command to start a stored procedure on a IBM® Netezza® host, see the SELECT command in the IBM Netezza Stored Procedures Developer's Guide.
The system forms output rows by computing the select output expressions for each selected row.
  • You can include * in the output list as a shorthand way of indicating all columns of selected rows. You can also indicate <table>.* as shorthand for the columns that come from a specific table. Query a table describes functions that you can use within a SELECT command.
  • You can use the keyword distinct to eliminate duplicate rows from the result. The all keyword (the default) returns all candidate rows, including duplicates.

Syntax

Syntax for using the SELECT command to retrieve rows from a table or view:
SELECT [ DISTINCT | ALL ] [ * | <col> [ AS <output_name> ]
  <expression> [ AS <output_name> ] [,<expression> [ AS <output_name> ]…]
    [ FROM <from_item>[,<from_item>…] ]
    [ WHERE <condition> ]
    [ GROUP BY <expression>[,<expression>…] ]
    [ HAVING <condition>[,<condition>…] ]
    [ { UNION | INTERSECT | EXCEPT | MINUS }[ DISTINCT | ALL ]
SELECT
    [ ORDER BY <expression> [ ASC | DESC | USING <operator>
                          [NULLS {FIRST | LAST}][, …] ]
    [ LIMIT { <count> | ALL } ]
Where <from_item> represents:
<table>
     [ [ AS ] <alias> [ ( <column_alias_list> ) ] ] |
     ( <select> ) [ AS ] <alias> [ ( <column_alias_list> ) ] |
      <from_item> [ NATURAL ] <join_type> <from_item>
         [ ON <join_condition> | USING ( <join_column_list> ) ]

Inputs

The SELECT command takes the following inputs:

Table 1. SELECT inputs
Input Description
alias Specifies a substitute name for the preceding table name. Use an alias for brevity or to eliminate ambiguity for self-joins (where the same table is scanned multiple times). If you write an alias, you can also write a column alias list to provide substitute names for one or more columns of the table.
<col> The name of a column.
<output_name> The name the column is to be given in the output. You typically use this name to label a column for display. You can also use it to refer to the column value in ORDER BY and GROUP BY clauses.

You cannot use this name in the WHERE or HAVING clauses; write out the expression instead.

<expression> The name of a table column or an expression.
FROM The FROM clause specifies one or more source tables for the SELECT command. If you specify multiple sources, the result is the Cartesian product of all the rows in all the sources. Usually, though, you add qualifying conditions to restrict the rows the system returns to few the Cartesian product.

You can parenthesize a subselect command within a FROM clause. Using a subselect command is the only way to get multiple levels of grouping, aggregation, or sorting in a single query.

You must specify an alias for the subselect command.

A FROM item can be a join clause, which combines two simpler from items. Use parentheses, if necessary, to determine the order of nesting.

<from_item> A table reference, subselect, or join clause.
WHERE This clause has the following general form:
WHERE boolean_expr
The boolean_expr portion of the clause can consist of any expression that produces a Boolean value. In many cases, you use the expression as follows:
expr cond_op expr
or 
log_op expr
where:
  • cond_op can be one of: =, <, <=, >, >= or <>, a conditional operator like all, any, in, like, or a locally defined operator.
  • log_op can be one of: and, or, not. select ignores all rows for which the WHERE condition does not return true.
GROUP BY Use this clause to divide a table into groups of rows that match one or more values. The clause specifies a grouped table that is derived by applying the clause:
GROUP BY <expression>[,<expression>…]

The GROUP BY clause condenses, into a single row, all selected rows that share values for the grouped columns. The system computes aggregate functions across all rows that make up each group, producing a separate value for each group (whereas without GROUP BY, an aggregate produces a single value that is computed across all the selected rows). When you include the GROUP BY clause, the SELECT command output expression or expressions cannot refer to ungrouped columns except within aggregate functions because there would be more than one possible value to return for an ungrouped column.

A group by value can be:
  • An input column name.
  • The name or ordinal number of an output column (select expression).
  • An arbitrary expression that is formed from input-column values. If there is ambiguity, the system interprets a group by name as an input-column name rather than an output column name.
HAVING The optional HAVING clause has the general form:
HAVING boolean_expr
where boolean_expr is the same as specified for the where clause.

The HAVING clause specifies a grouped table that is derived by the elimination of rows that do not satisfy the boolean_expr.

The HAVING clause is different from the WHERE clause:
  • The WHERE clause filters individual rows before application of GROUP BY.
  • The HAVING clause filters group rows that are created by GROUP BY.

Each column that is referenced in boolean_expr must unambiguously reference a grouping column, unless the reference displays within an aggregate function.

In a grouping select, the HAVING clause can only reference expressions that are single-valued within a group. That is, you can only reference group fields, aggregates, or single-valued expressions that are derived from group fields or aggregates (which must include constants).

For example, to return grp and counts of grps with more than four members:
SELECT grp, count(id) AS n FROM emp GROUP BY grp HAVING n > 4

In a non-grouping select where the select is conceptually grouped by zero group fields, you can only reference aggregates or expressions that are single-valued.

For example to return no rows if there are four or fewer employees in emp, or one row with the count if there are more than four employees in emp:
SELECT count(id) AS n FROM emp HAVING n > 4
<condition> A Boolean expression that gives a result of true or false.
UNION Operator causes the system to compute the collection of rows that are returned by the queries. Eliminates duplicate rows unless you specify the ALL keyword.
table_query UNION [ ALL ] table_query
[ ORDER BY <expression> [ASC|DESC|USING operator] [, …] ]
[ LIMIT { COUNT | ALL } ]
[ OFFSET start ]
where table_query specifies any select expression without an ORDER BY, FOR UPDATE, or LIMIT clause.

If you enclose a subexpression in parentheses, you can include ORDER BY and LIMIT clauses. If you do not include parentheses, the clauses are taken to apply to the result of the union, not to its right input expression.

The UNION operator computes the collection (set union) of the rows that are returned by the queries involved. The two selects that represent the direct operands of the union must produce the same number of columns, and corresponding columns must be of compatible data types.

The result of UNION does not contain any duplicate rows unless you specify the ALL option. The ALL option prevents elimination of duplicates.

Multiple UNION operators in the same SELECT command are evaluated left to right, unless you indicate otherwise by using parentheses.

INTERSECT Combines the results of two queries into a single result that comprises all the rows common to both queries. For more information, see INTERSECT operation.
EXCEPT or MINUS Finds the difference between the two queries and the result comprises the rows that belong only to the first query. For more information, see EXCEPT operation.
ORDER BY Use this clause to sort returned rows in the order that you specify. An ORDER BY clause can be:
  • The name or ordinal number of an output column (select expression). The ordinal number refers to the ordinal (left-to-right) position of the result column. This feature makes it possible to define an ordering based on a column that does not have a name. For example:
    SELECT title, date_prod + 1 AS newlen
      FROM films ORDER BY newlen;
  • An arbitrary expression that is formed from input-column values, including fields that do not display in the select result list. For example:
    SELECT name FROM distributors ORDER BY code;
    If there is ambiguity, an ORDER BY name is interpreted as an output-column name.

A limitation of this feature is that an ORDER BY clause that applies to the result of a union query might only specify an output column name or number, not an expression.

If an ORDER BY clause is a simple name that matches both a result column name and an input column name, ORDER BY interprets it as the result column name. This is the opposite of the choice that the GROUP BY clause makes in the same situation.

You can specify ascending or descending sort order:
ASC
Sort in ascending order. This is equivalent to using <. This is the default
DESC
Sort in descending order. This is equivalent to using >.
You can specify a sorting order for null data:
NULLS FIRST
Null values come before non-null values. For an ascending sort, nulls display first in the output; for a descending sort, they display last. This is the default
NULLS LAST
Null values come after non-null values.
LIMIT Return a subset of the rows that are produced by your query:
<count>
Return at most the specified number of rows.
ALL
Return all rows. This is the default.

When you specify the LIMIT parameter, also specify the ORDER BY parameter. Otherwise, the subset of the rows is returned is indeterminate.

<join_column_list> A join column list. For example, specifying USING (a,b) is equivalent to specifying ON left_table.a = right_table.a and ON left_table.b = right_table.b.
<join_condition> A qualification condition. This condition is similar to the WHERE condition except that it only applies to the two from_items being joined in this join clause.
<join_type> One of the following types of join:
  • [INNER] JOIN
  • LEFT [OUTER] JOIN
  • RIGHT [OUTER] JOIN
  • FULL [OUTER] JOIN
  • CROSS JOIN
For an inner or outer join type, you must include exactly one of the following clauses:
  • NATURAL
  • ON <join_condition>
  • USING (<join_column_list>)
Do not include any of these clauses for a cross join.
<select> A SELECT command. You must enclose the subselect in parentheses, and provide an alias for it.

When you include a subselect in the FROM clause, the subselect acts as though its output were created as a temporary table during this single select command.

<table> In a FROM clause, the name of a table or view.
CROSS JOIN and INNER JOIN Produces a simple Cartesian product, the same as you would get if you listed the two items at the top level of FROM.

The cross join and inner join on (true) types are equivalent; no rows are removed by qualification.

These join types are a notational convenience. You can accomplish the same results by using the FROM and WHERE clauses.

LEFT OUTER JOIN Returns all rows in the qualified Cartesian product (that is, all combined rows that pass its on condition), plus one copy of each row in the left table for which there was no right row that passed the on condition.

The system considers only the own on or using condition of the join to determine those rows that have matches. It applies OUTER ON or WHERE conditions afterward.

RIGHT OUTER JOIN Returns all the joined rows, plus one row for each unmatched right row (extended with nulls on the left).

This is just a notational convenience. You can convert it to a left outer join by switching the left and right inputs.

FULL OUTER JOIN Returns all the joined rows, plus one row for each unmatched left row (extended with nulls on the right), plus one row for each unmatched right row (extended with nulls on the left).
For all the join types except cross join, you must write exactly one of:
ON <join_condition>
This is the most general case. You can write any qualification expression that involves the two tables you want to join.
USING <join_column_list>
For example, specifying USING (a,b) is equivalent to specifying ON left_table.a = right_table.a and ON left_table.b = right_table.b. The term USING implies that only one of each pair of equivalent columns is to be included in the join output, not both.
NATURAL
Use a USING list that mentions all similarly named columns in the tables.

Outputs

The SELECT command has the following outputs:

Table 2. SELECT outputs
Output Description
ROWS Returns the complete set of rows that result from the query.
COUNT Returns the number of rows that are returned by the query.

Privileges

You must be the admin user, the table owner, the owner of the database or schema where the table is defined, or your account must have Select privileges on the table or the Table object class. You must have the Select privilege for a table to read its values. As of Release 4.0, to select data from an external table you must have the Select privilege for the External Table class.

For more information about object privileges and object classes, refer to the IBM Netezza System Administrator’s Guide.

Usage

The following provides sample usage.
  • Join the table films with the table distributors:
    MYDB.SCH1(USER)=> SELECT f.title, f.did, d.name, f.date_prod, f.kind
    FROM distributors d, films f WHERE f.did = d.did;
    
       title                     |did  |name              |date_prod  |kind
       --------------------------+-----+------------------+-----------+-------
       The Third Man             |101  |British Lion      |1949-12-23 |Drama
       The African Queen         |101  |British Lion      |1951-08-11 |Romantic
       Une Femme est une Femme   |102  |Jean Luc Godard   |1961-03-12 |Romantic
       Vertigo                   |103  |Paramount         |1958-11-14 |Action
       Becket                    |103  |Paramount         |1964-02-03 |Drama
       48 Hours                  |103  |Paramount         |1982-10-22 |Action
       War and Peace             |104  |Mosfilm           |1967-02-12 |Drama
       West Side Story           |105  |United Artists    |1961-01-03 |Musical
       Bananas                   |105  |United Artists    |1971-07-13 |Comedy
       Yojimbo                   |106  |Toho              |1961-06-16 |Drama
       There's a Girl in my Soup |107  |Columbia          |1970-06-11 |Comedy
       Taxi Driver               |107  |Columbia          |1975-05-15 |Action
       Absence of Malice         |107  |Columbia          |1981-11-15 |Action
       Storia di una donna       |108  |Westward          |1970-08-15 |Romantic
       The King and I            |109  |20th Century Fox  |1956-08-11 |Musical
       Das Boot                  |110  |Bavaria Atelier   |1981-11-11 |Drama
  • Sum the column len of all films and group the results by kind:
    MYDB.SCH1(USER)=> SELECT kind, SUM(len) AS total FROM films GROUP BY
    kind;
       kind     |total
       ---------+-------
       Action   |07:34
       Comedy   |02:58
       Drama    |14:28
       Musical  |06:42
       Romantic |04:38
  • Sum the column len of all films, group the results by kind, and display those group totals that are less than 5 hours:
    MYDB.SCH1(USER)=> SELECT kind, SUM(len) AS total FROM films GROUP BY
    kind HAVING SUM(len) < INTERVAL '5 HOUR';
       kind     | total
       ---------+------
       Comedy   | 02:58
       Romantic | 04:38
  • Sort the contents of the table distributors according to the contents of the second column (name):
    MYDB.SCH1(USER)=> SELECT * FROM distributors ORDER BY name;
    or
    MYDB.SCH1(USER)=> SELECT * FROM distributors ORDER BY 2;
       did   | name
       ------+-----------------
       109   | 20th Century Fox
       110   | Bavaria Atelier
       101   | British Lion
       107   | Columbia
       102   | Jean Luc Godard
       113   | Luso Films
       104   | Mosfilm
       103   | Paramount
       106   | Toho
       105   | United Artists
       111   | Walt Disney
       112   | Warner Bros.
       108   | Westward
  • Obtain the union of the tables distributors and actors, and restrict the results to those that begin with letter W:
    MYDB.SCH1(USER)=> SELECT distributors.name
          FROM distributors
          WHERE distributors.name LIKE 'W%';
          UNION
          SELECT actors.name
          FROM actors
          WHERE actors.name LIKE 'W%';
       Walt Disney
       Walter Matthau
       Warner Bros.
       Warren Beatty
       Westward
       Woody Allen

Because only distinct rows are wanted, the ALL keyword is omitted.