The previous article, "Developing with Apache Derby -- Hitting the Trifecta: Database development with Apache Derby, Part 4," introduced the fundamental concepts related to selecting data from an Apache Derby database. These included the
WHERE clauses, and the concept of joining two tables together to facilitate more advanced data selection. That article also
discussed how to initialize your Derby workspace by using the included SQL script file to build
and populate the Apache Derby database demonstrated in the example code listings. If you
haven't already done so, you should read the previous article and initialize your Derby
workspace by running the SQL script available for download with that article.
This article builds on these basic SQL concepts to facilitate more advanced data-selection operations, including
the use of basic mathematical operations and SQL functions within a query. You can use these operations and
functions in the
WHERE clause to strengthen the data restrictions in your query or
apply them in the
SELECT clause to extract computed values. First, however, the article discusses
several additional SQL query keywords that you can use to modify the results of
your query. Start Derby's
ij tool, connect to your test
database, and follow along to learn more about writing SQL queries with the Apache Derby database.
Modify the selection of data
All the queries presented in the previous article were straightforward, selecting a set of
columns from all rows in a table, from a simple join of two tables, or from a query that included
WHERE clause. Although useful, this functionality corresponds to
performing a data dump. Fortunately, SQL provides a number of techniques to filter
both the rows selected from a table and the data extracted. The rest of this section
discusses two basic techniques for controlling the data extracted by a query: the
DISTINCT keyword and the
ORDER BY clause.
Extract distinct rows
By default, when data is selected by using an SQL query, all rows that satisfy the
WHERE clause are extracted from the database. In some cases, this may result in
rows that have identical column values being returned. If you need to restrict your query so that only unique row values are
returned, you can use the
DISTINCT qualifier, as shown in Listing 1.
Listing 1. Using the
DISTINCT qualifier in a query
rb$ java org.apache.derby.tools.ij ij version 10.1 ij> connect 'jdbc:derby:test' ; ij> SELECT DISTINCT vendorNumber AS "Vendor #" FROM bigdog.vendors ; Vendor # ----------- 3 2 1 3 rows selected ij> SELECT DISTINCT vendorNumber AS "Vendor #", itemNumber as "Item #" FROM bigdog.vendors WHERE itemNumber > 5 ; Vendor # |Item # ----------------------- 1 |6 1 |7 1 |8 3 |9 3 |10 5 rows selected ij>
The first query in this listing uses the
DISTINCT qualifier to restrict the output of
the query to only distinct, or unique, values of the
column, which is the only column listed in the
SELECT clause. In the
example schema that these articles use, there are only three vendors (with
vendorNumber being restricted to 1, 2, or 3). Thus, when the
DISTINCT qualifier is used in the query, only three rows are selected.
DISTINCT qualifier, however, applies to the entire list of selected
columns, so if multiple columns are listed following a
keyword, only unique combinations of all the columns are selected. This is demonstrated in
the second example, where both
itemNumber are listed in the
clause. Because every item has a unique
itemNumber, every combination of
these two columns is unique, and all rows that satisfy the
are selected -- in other words, the
DISTINCT qualifier has no effect on
One remaining point that you may have noticed from the examples in Listing 1 is that the selected
rows were not in the same order. If the order of selected rows is important, you can easily
control it by using an
ORDER BY clause in your query, as shown in next section.
In general, you can't assume that Apache Derby, or any database, will return rows from a query in a specific
order. If the order is important, you can use the
ORDER BY clause to
have Apache Derby order the data that are returned by your query in a particular manner. Generally,
you do so by specifying a column that should be used to provide the ordinal values for
comparison, as shown in Listing 2.
Listing 2. Using the
ORDER BY clause in a query
ij> SELECT v.vendorNumber AS "Vendor #", vendorName as "Vendor", p.price as "Price", p.itemNumber AS "Item #" FROM bigdog.products AS p, bigdog.vendors AS v WHERE p.itemNumber = v.itemNumber AND p.price > 20.0 ORDER by v.vendorNumber ; Vendor # |Vendor |Price |Item # --------------------------------------------------------------- 1 |Luna Vista Limited |32.95 |8 1 |Luna Vista Limited |24.95 |7 1 |Luna Vista Limited |99.99 |2 2 |Mikal Arroyo Incorporated |49.95 |5 2 |Mikal Arroyo Incorporated |29.95 |4 3 |Quiet Beach Industries |34.95 |10 6 rows selected ij> SELECT v.vendorNumber AS "Vendor #", vendorName as "Vendor", p.price as "Price", p.itemNumber AS "Item #" FROM bigdog.products AS p, bigdog.vendors AS v WHERE p.itemNumber = v.itemNumber AND p.price > 20.0 ORDER BY v.vendorNumber ASC, p.price DESC ; Vendor # |Vendor |Price |Item # --------------------------------------------------------------- 1 |Luna Vista Limited |99.99 |2 1 |Luna Vista Limited |32.95 |8 1 |Luna Vista Limited |24.95 |7 2 |Mikal Arroyo Incorporated |49.95 |5 2 |Mikal Arroyo Incorporated |29.95 |4 3 |Quiet Beach Industries |34.95 |10 6 rows selected ij>
In Listing 2, the first query uses the
ORDER BY clause
to list a subset of all the rows in the table that results from joining the
bigdog.vendors table to the
The rows are ordered by
vendorNumber (the subset is constructed by applying the
WHERE clause). An
ORDER BY clause can take either
a column name, as in this example, or a column number, which is taken from the order in
which the columns are listed after the
You can also specify multiple columns to use during the sorting process and
ASC for ascending order, which is the default, or
DESC for descending order. For example, if you used the
ORDER BY 1 DESC, 4 DESC clause in the first query, the query would return
the same rows, but they would be ordered by using the
vendorNumber column as the
primary sort column in descending order followed by the
itemNumber column as the secondary sort
column in descending order.
Do math in a query
Selecting columns from a database provides a number of useful benefits, but being able to compute
and select quantities based on data in a table opens up even more possibilities. Apache Derby
provides several mathematical operators, detailed in Table 1, that you can use in either a
SELECT clause or a
Table 1. Apache Derby mathematical operators
|unary ||A noop, or no operation, as |
|unary ||Changes the sign of the value to which it's applied|
|Adds the second value to the first value|
|Subtracts the second value from the first value|
|Multiplies the first value by the second value|
|Divides the first value by the second value|
Using these operators is straightforward because they generally behave exactly as you expect. For example,
if the sales tax is 8.25%, you can return the price for an item both before and after sales tax has been
applied by using
SELECT price, price * 1.0825 FROM bigdog.products ;.
As another example, if you have a column called
numberItems that tracks the number of items purchased and another column called
price that contains the price at which they're
purchased, you can return the total amount paid for those items at a given price by using
numberItems * price. Several of the queries shown in the code listings
in the rest of this article provide additional examples of how to use these operators.
The only concern when using these operators arises from complications that result from using different data types,
such as integer or floating-point, in a mathematical operation. If both operands are the same data type, the result type will be the same.
If you're performing division, this can result in truncation (for example, if you're using two
integer values), which might cause unexpected problems. You can handle this issue properly by using the
CAST function described later in this article. On the other hand, if the two operands are different data types,
the result type is promoted to the more complex type.
Use Derby's SQL functions
SQL is a powerful and expressive language that can be used to perform a wide range of actions. Part of the SQL language's power comes from its ability to directly interact with a variety of data types. This section discusses the SQL-92 standard built-in functions provided by Apache Derby, which you can use in your SQL statements to simplify many common data-manipulation tasks. For the purposes of this article, these functions are broken into five categories:
- Type conversion
- Date and time
- Character string
The simplest type of function to use in Apache Derby is the type-conversion function. These functions can be used to explicitly convert one type of data, such as a character string, to another type of data, such as a date or integer, or vice versa. There are two main categories of type-conversion functions: the explicit type functions, detailed in Table 2, and
CAST function, which provides a more powerful approach to type
Table 2. Apache Derby type-conversion functions
|Returns a 64-bit integer constant for a given character string or number.|
|Returns a fixed-length character representation, up to a maximum length of 254 bytes, of a given value, which must be one of the built-in Apache Derby types. An optional second argument can be supplied to specify the length of the character string.|
|Returns a date representation of the input value.|
|Returns a double-precision, floating-point representation of the input number or character string.|
|Returns an integer constant for a given character string, date, time, or number.|
|Returns a small integer constant for a given character string or number.|
|Returns a time representation of the input value.|
|Returns a timestamp representation of the input value.|
|Returns a variable-length character representation, up to a maximum length of 32,672 bytes, of a given date, time, timestamp, or character-string value.|
You can use the
CAST function to change the length of a character
string or convert one type of data to another, including data types that are not among those
supported by the type-conversion functions listed in Table 2 (such as
DECIMAL). In this manner, you can use
CAST to make the columns selected by a query more readable or more compact or to preserve numerical precision in a mathematical
When you're extracting results from a database within a Java™ program, however, you also have the option of converting data types within the application. Both methods provide the same result, but you gain several performance benefits by having the database cast the values, because you can minimize the network traffic by having the database send only the actual data required and push any processing to the database engine and away from a potential thin client. This approach may overburden a database server if taken to an extreme, so you should use it only when warranted. Using these type-conversion functions is straightforward, as demonstrated in Listing 3.
Listing 3. Using the
ij> SELECT p.price AS "Price", DOUBLE(p.price * 1.0825) AS "Float Total", CAST(p.price * 1.0825 AS DECIMAL(6,2)) AS "Total", CAST(p.description AS CHAR(20)) AS "Description" FROM bigdog.products AS p WHERE p.price > 21.0 AND p.stockDate < Date('2006-01-21') ; Price |Float Total |Total |Description -------------------------------------------------------------- 24.95 |27.008375 |27.00 |White beach towel 32.95 |35.668375 |35.66 |Blue-striped beach to 2 rows selected
The single query shown in Listing 3 demonstrates how to properly use several type-conversion
functions. The second line of this query converts the product of the selected
price column and the numerical factor of 1.0875, which applies a tax of 8.75% to the price of an item, to a
DOUBLE precision floating-point
number. The third and fourth lines use the
CAST function to first
convert this same product to a
DECIMAL type and then truncate the
description column to display only 20 characters. Notice
CAST function truncated the calculated price when converting
the value to a
DECIMAL data type. Finally,
DATE function converts a date character string to a
DATE data type. Although this step isn't required, being explicit is
a best practice (see the sidebar on explicit types).
This query also demonstrates the use of the
AS clause to rename columns in a
query. This is important because the query uses functions in place of column names. When Apache
Derby doesn't have an explicit name for a column, it automatically uses numbers, which
provide little insight into the nature of the data. By explicitly providing names for all
columns, you avoid this problem. In addition, because these names include white space, you must enclose
them in double quotation marks to indicate to the
ij tool that they are character-string constants.
Compute aggregate values
Apache Derby also provides functions that operate on multiple rows, known as aggregate functions, which you can use to enhance a query. Aggregate functions -- also known as set functions in SQL-92 or, more informally, as column functions -- return a computed quantity from a column over a number of rows. Apache Derby supports five aggregate functions, which are detailed in Table 3.
Table 3. Apache Derby aggregate functions
|Returns the average value of a column from all rows that satisfy an expression. Can only be used with built-in numeric data types. The precision of the returned value is defined by the precision of the column being evaluated.|
|Returns the number of rows that satisfy an expression, such as a query. Can be used with any data type.|
|Returns the maximum value of a column from all rows that satisfy an expression. Can only be used with built-in data types.|
|Returns the minimum value of a column from all rows that satisfy an expression. Can only be used with built-in data types.|
|Returns the sum of a column over all rows that satisfy an expression. Can only be used with built-in numeric data types.|
These aggregate functions can often be used to quickly find useful information that might otherwise be difficult to identify, as shown in Listing 4.
Listing 4. Using aggregate functions in a query
ij> SELECT COUNT(p.itemNumber) AS Number, CAST((AVG(p.price) + 0.005) AS DECIMAL(5,2)) AS Average, MIN(p.stockDate) AS "First Date", MAX(p.stockDate) AS "Last Date" FROM bigdog.products AS p ; NUMBER |AVERAGE |First Date|Last Date --------------------------------------------- 10 |31.66 |2005-12-20|2006-03-31 1 row selected
Listing 4 uses four of the five aggregate functions to get summary information about
the data in the
bigdog.products table. The
function indicates that the table includes ten rows (because the query didn't use a
WHERE clause to restrict the rows selected from the table). The
AVG function calculates the average price of all items in the
bigdog.products table. Because this calculation results in a
floating-point number, the query explicitly casts the result back to the suitably sized
DECIMAL data type. Unlike in Listing 3, this example adds
0.005 to force the
CAST function to properly
round the calculated value. Finally, the
MAX functions extract the minimum and maximum dates from the
Apache Derby provides three built-in functions, detailed in Table 4, that let you perform
more complex mathematical operations in a SQL query, especially when used in concert with the mathematical
operators detailed earlier in this article. These functions can be used in either the
SELECT or the
WHERE clause of a query.
Table 4. Apache Derby math functions
|Returns the absolute value of an expression, which must be one of the built-in numeric types. The return type is the same as the argument.|
|Returns the remainder when the first argument is divided by
the second argument. The return type is the same as the argument with the biggest integer
|Returns the square root of an expression, which must be a floating-point value. The return type is the same as the argument.|
Date and time functions
Apache Derby provides considerable support for date and time data types. To simplify
using these types in a database query, there are six date and time functions, which are detailed in Table 5.
These functions often provide an alternative to using the type-conversion functions described
earlier. For example, you can replace the
Date('2006-01-21') part of the
WHERE clause in the query
shown in Listing 3 with the
YEAR(p.stockDate) < 2006.
Table 5. Apache Derby date and time functions
|Returns an integer that contains the day component of a date, timestamp, or character string that contains a valid date|
|Returns an integer that contains the hour component of a time, timestamp, or character string that contains a valid time|
|Returns an integer that contains the minute component of a time, timestamp, or character string that contains a valid time|
|Returns an integer that contains the month component of a date, timestamp, or character string that contains a valid date|
|Returns an integer that contains the second component of a time, timestamp, or character string that contains a valid time|
|Returns an integer that contains the year component of a date, timestamp, or character string that contains a valid date|
The last category of functions in Apache Derby simplifies the task of working with character data. These eight functions, detailed in Table 6, can be used to count character values, convert character values to all uppercase or all lowercase, remove white space at the start or end of the string, find a specific substring, or select a substring.
The character values used by these functions can be database columns that are one of the character data types supported by
the Apache Derby database, database columns that can be directly converted to a character data
type, or constant character strings. For example, the query
UCASE(p.description), UCASE('hello world'), UCASE(p.stockDate) FROM bigdog.products AS
p ; returns three columns, which are all displayed as uppercase character strings.
Table 6. Apache Derby character-string functions
|The concatenation operator combines two values into a new character string. If both values
are of type |
|Returns a character string in which all alphabetical characters in the input value have been converted to lowercase.|
|Returns the number of characters in the input value. Noncharacter data is implicitly converted to a character string.|
|Returns the starting location for the first occurrence of a substring in a search string, or zero if the substring isn't found. The first argument is the substring, the second argument is the search string, and an optional starting position can be supplied as a third argument.|
|Returns a character string in which all spaces at the end of the input value have been removed.|
|Returns a character string in which all spaces at the start of the input value have been removed.|
|Returns part of an input character string as a |
|Returns a character string in which all alphabetical characters in the input value have been converted to uppercase.|
This article introduced several techniques for writing more powerful SQL queries. First, the
DISTINCT keyword was used to select only unique rows in a query. Second, the
ORDER BY clause was used to sort the selected rows in a query by one or more criteria. Third, you read about the basic
math operators that can be used in a query. And finally, you learned about the basic built-in SQL functions in detail. These functions can often be used to simplify complex SQL queries.
These techniques will be further demonstrated in future articles where you'll use them to selectively delete or modify data in an existing database.
- Check out the other articles in this series:
- "Developing with Apache Derby -- Hitting the Trifecta: Introduction to Apache Derby" (developerWorks, February 2006) introduces the Apache Derby database and provides the foundation for many topics in this series.
- "Developing with Apache Derby -- Hitting the Trifecta: Database development with Apache Derby, Part 1: Start working with Derby today" (developerWorks, March 2006) introduces the
ijtool and demonstrates how to use it to connect to an Apache Derby database.
- "Developing with Apache Derby -- Hitting the Trifecta: Database development with Apache Derby, Part 2: Schemas" (developerWorks, April 2006) covers several database concepts, including schemas, tables, and column data tapes, and gives you a simple introduction to SQL.
- "Developing with Apache Derby -- Hitting the Trifecta: Database development with Apache Derby, Part 3: Running scripts and inserting data" (developerWorks, May 2006) introduces the concept of executing SQL scripts with Apache Derby and demonstrates how to insert data into tables in an Apache Derby database.
- "Developing with Apache Derby -- Hitting the Trifecta: Database development with Apache Derby, Part 4: Select data with a query" (developerWorks, June 2006) introduces the concept of SQL queries, including the basic SELECT, FROM, and WHERE clauses.
- Peruse the Apache Derby Project online manuals for more detailed information about how to use the Apache Derby database.
- Take the Apache Derby Project tutorial that details how to download and install Apache Derby.
- Check out the developerWorks Apache Derby project area for articles, tutorials, and other resources to help you get started with Derby today.
- Browse all the Apache articles and free Apache tutorials available in the developerWorks Open source zone.
- Visit the developerWorks Open source zone for extensive how-to information, tools, and project updates to help you develop with open source technologies and use them with IBM's products.
- Browse for books on these and other technical topics at the Safari bookstore.
Get products and technologies
- Download Apache Derby from the Apache Derby Project home page.
- Innovate your next open source development project with IBM trial software, available for download or on DVD.
- Get involved in the developerWorks community by participating in developerWorks blogs.