A compendium of Cloudscape frequently asked questions

IBM® Cloudscape™ is a commercial release of the Apache Derby open source database. Derby is a small-footprint, multiuser, standards-driven, relational database system written entirely in Java™ that is well suited for embedding in Java applications and servers. This technical article answers common questions.

Stanley Bradbury (bradbury@us.ibm.com), Community Coordinator - Cloudscape, SDI Corp.

Stanley Bradbury is currently working as Community Coordinator with the Cloudscape group at IBM. Before this he provided third level support to Cloudscape customers (2001 to 2005). Prior to working with Cloudscape he managed databases for companies in the Biotech, Manufacturing and Internet Security Service industries. He is a graduate of the University of California at Berkeley and enjoys spending time with his family.



27 July 2006 (First published 03 August 2004)

Also available in

Table of Contents

General

What is the difference between IBM Cloudscape and Apache Derby?
Is IBM Cloudscape free?
What is "embedded"?
Is embedded Derby a multiuser database?
How can double-booting with a read-only connection cause corruption?
Why isn't Client/Server support bundled in the base engine jar?
Does Derby support SQLJ?
Why shouldn't the Derby JDBC Driver be deployed at the J2EE application level?

Installation and configuration

How do I install the Derby environment?
What are the basic components of a deployment?
What is the embedded Server architecture?
What are the steps needed to implement authentication and authorization?
How do I setup Network Server to run under the java security manager?
How can I tune Derby to improve cache hits and reduce I/O?

General usage

How is a schema different from a database?
What can I do to avoid duplicate key failures (ERROR 23505) from values generated for a Derby identity column after I have inserted specific values into the column?
How can an application connect to more than one database?
How do I list the names of the tables in a database?
How do I create a Java function in a Derby database?
How do I find the filename for a datatable?

Troubleshooting

How do I set up my development environment for Derby?
How can I avoid OutOfMemory exceptions when running Derby?
How do I verify the integrity of all my tables and indexes?

Releases

What is new in IBM Cloudscape, Version 10.1?
Why was a new client driver library released with version 10.1?
What is IBM Cloudscape, Version 10.0?

Resources

Useful links


General questions

Q: What is the difference between IBM Cloudscape and Apache Derby?

A: There is no difference in core functionality between Apache Derby and IBM Cloudscape. Beginning with Version 10, IBM Cloudscape became a commercial implementation of the database being developed as part of the Apache Derby Project. (See Resources .) In August 2004, IBM contributed the source code for the IBM Cloudscape product to the Apache Software Foundation (ASF). The software was renamed Derby. IBM produces builds of IBM Cloudscape, Version 10, and above directly from the ASF codeline. IBM makes no changes to the database engine, tools, or network server code it repackages and distributes as IBM Cloudscape.

Beyond the core functionality, IBM Cloudscape differs from Derby in packaging and in optional, value-added modules and tools provided by IBM. Because the core products are the same, these additional modules and tools can be used with Apache Derby or IBM Cloudscape; both products are indistinguishable at the core level. The following lists the packaging and value-added modules and tools available for use with Derby and/or IBM Cloudscape.

  • IBM Cloudscape builds undergo extensive testing to insure the stability of the codeline prior to releasing the product for download and include:
    • Major releases bundled in an installer package to simplify setup
    • The IBM build of the JRE for Windows and Linux within the respective installer package for use, if needed
    • The DB2 JDBC client driver that can be used to connect to the Network Server: IBM DB2 JDBC Universal Driver. [ Note: The open source Derby Network Client V10.1 should be used in place of this library in most installations.] These JAR files are included with all IBM Cloudscape installers and are also available as a separate download for use with ASF Derby distributions. (See Resources .)
    • Documentation in PDF format
  • An ODBC driver for Derby, the IBM DB2 Runtime Client (available as a separate download). (See Resources .)
  • A database browsing tool, the IBM DB2 plug-ins for Eclipse (available as a separate download). (See Resources .) [ Updated: A smaller, improved version of this tool will be made available soon after version 10.1 is released.]

Access the IBM Cloudscape download page to download these modules free of charge. (See Resources .)

IBM also offers services to IBM Cloudscape users. A fee-based Support license (see Resources ) for IBM Cloudscape is available to individuals and organizations that require guaranteed response and resolution to issues. This offering provides voice and electronic access into IBM support organizations. The service license is restricted to IBM Cloudscape distributions of Derby and is on a 'per machine' basis. For individuals and organizations that do not require formal support services, IBM also provides free support through papers available on the developerWorks Cloudscape Web site and through the IBM Cloudscape community forum. (See Resources .)

Q: Is IBM Cloudscape free?

A: Yes. IBM Cloudscape is free for download. Use the "Download free modules" link in the Resources section of this document. In addition its license allows IBM Cloudscape to be freely redistributed.

Q: What is "embedded"?

A: Definition from the Merriam-Webster OnLine Dictionary (http://m-w.com):

em·bed·ded; em·bed·ding - transitive senses
1 a : to enclose closely in or as if in a matrix <fossils embedded in stone> b : to make something an integral part of <the prejudices embedded in our language> c : to prepare (a microscopy specimen) for sectioning by infiltrating with and enclosing in a supporting substance
2 : to surround closely <a sweet pulp embeds the plum seed> intransitive senses : to become embedded

"Embedded Derby" means there isn't a separate database server; Derby is installed with the Java application (one JAR file plus the database files). It runs in the same JVM as the application and is booted and shutdown by the application. The end-user does not need to be aware there is a database system within it. The result is an application that is easy to deploy with all the features of a relational database system. From an application design standpoint, Derby is about as 'Plug N Play' as a DBMS can be.

Q: Is embedded Derby a multiuser database?

A: Yes, Derby supports multiuser access, regardless of the deployment option used. "Multiuser" means the engine manages multiple, isolated transactions as needed. The Derby manuals address using Derby in single user applications. Derby is well suited to run in this architecture but is not restricted to a single user or single connection. Derby supports multiple connections even in a fully embedded deployment. Each connection represents an individual transaction. A single application can establish multiple connections and thus execute multiple, independent transactions. Each connection can specify one of four levels of transaction isolation to provide the required level of concurrency. See the Derby Developers Guide for a complete description of transaction isolation levels, concurrency, and the types of locking utilized.

It is important to note that the Derby JAR file contains no networking code. Again, this does not mean Derby is single user, although this misinterpretation is common. Network messages are required for communications outside the JVM, which is running the Derby engine (inter-process communication). Communication between two independent processes is termed "Client-Server." To meet the need for Client-Server capabilities, Derby provides the Network Server (in a separate JAR file). Derby easily embeds in Network Server to provide Client-Server connectivity when needed. Derby also embeds easily in many other server programs (for example, WebSphere, Apache Tomcat, and so on), so the choice is up to you.

A word of caution; if an interactive connection to the database is required in addition to the connection(s) established by the application, it is advisable to use a server environment (the embedded server architecture is the most flexible). In several instances database corruption has occurred when, during testing, a programmer attempted to monitor test progress using a separate query tool. The query tool runs in its own JVM and, if allowed to connect, will corrupt the database by 'double-booting' (also known as 'dual-booting') the database. See the section on Double-booting in the DB2 Help site documentation for important information on this subject.

Q: How can double-booting Derby with a read-only connection cause corruption?

A: The first connection made to a Derby database "boots" the database. Booting means that all memory structures and processes are initialized (caches, managers, lists, and so on), the data dictionary is read, and any uncommitted transactions in the transaction log are rolled back. In a double-boot (also known as dual-boot) scenario, where the uncommitted transactions are those active in the other JVM process, this rollback will cause corruption. So the type of connection being established does not matter because the act of booting the database will recover any incomplete transactions in the transaction log in order to preserve data integrity.

Where possible, Derby tries to detect and prevent double-booting, but, until recently, it has not been possible to stop this in all environments. Use of a JVM of version 1.4 or higher enables Derby to prevent double-booting on UNIX, Linux, and similar operating systems. Use of a version 1.4 or higher JVM is highly recommended on these platforms. With systems where an earlier JVM must be used, dual-booting can be minimized by setting the property derby.database.forceDatabaseLock=true . One side-effect of setting forceDatabaseLock is that the existence of the database lock file (db.lck) will always prevent a database from booting. When a database is not properly shutdown (for example, a crash or an abort) an 'orphan' database lock file will be left. When forceDatabaseLock is set, this orphan file will need to be removed manually. Normally Derby detects and handles orphaned database lock files appropriately, but forceDatabaseLock could prevent the database from booting even if no other process has an active lock on the file. Be sure to verify that no JVMs are running on the database machine before manually removing a database lock file.

Q: Why isn't Client/Server support bundled in the base engine jar?

A: Many applications run entirely on a single system in a single process, so do not require Client/Server capabilities. For instance, a servlet storing data entered into a Web form or an address book application running on a PC does not need to communicate with other processes. Client/Server code would increase the footprint of these applications and provide no benefit. For such applications, Derby provides all the advantages of a relational database without unnecessary networking code. Client/Server capabilities can be added incrementally in instances where it is required by running Network Server or some other server architecture.

Q: Does Derby support SQLJ?

A: No. There is no SQLJ implementation produced specifically for Derby. Though some SQLJ translators may generate code that does not fail when used with Derby this has not been tested and the performance of the code might be poor. The SQLJ standard allows for 'vendor customizations' in order to produce the best result possible. This is why most SQLJ implementations are produced by the major database vendors (Oracle, Sybase, IBM, etc.) and include customizations specific to that database. Generally when people are interested in SQLJ, they want more than just a simple JDBC translator. They expect the resulting code to be as fast or faster than the programs they would produce themselves.

Q: Why shouldn't the Derby JDBC Driver be deployed at the J2EE application level?

A: This is because the Derby driver contains DBMS engine classes, not just the communication classes that make up most JDBC drivers. When the driver is first loaded the engine boots and these internal structures need to be accessible by any other JVM thread(s) performing database activity. If the Derby driver is first loaded by a deployed application such access is not possible. This is because application servers setup a hierarchy of restricted classloaders that isolate deployed modules (applications, JSPs, Servlets, EJBs, etc.) to prevent them from "stepping" on one and other. To work properly the engine classes must be visible JVM-wide (e.g. above the application level in the classloader hierarchy of the server). This insures that Derby internal threads as well as threads making database requests will have access to the classes and objects needed to perform the requested functions. Because it is embedded, Derby inherits the privileges of the routine that loads it. If Derby is booted in a restricted environment the results are unpredictable.


Installation and configuration

Q: How do I install the IBM Cloudscape environment?

A: Installation is simple and made platform independent by using the Java installer. (Command line syntax to begin the installation and file path designation remain platform dependant). This answer outlines installation using the Java installer. Platform-specific installers are available for Windows and Linux environments; see the install.html file for information on using platform-specific installers.

Installation requires a functioning JVM and adequate disk space on the filesystem -- about 4Mb for the JAR files alone, 24Mb for the complete IBM Cloudscape installation with javadoc and samples, or (on Windows only) 64Mb for the complete IBM Cloudscape installation, and the IBM JRE. These estimates do not include the size of the installation JAR file.

At the OS command line prompt, move to the directory containing the IBM Cloudscape installation JAR file. Invoke the installer by typing the command:

<path-to-java-binary>\java -cp . -jar
<cloudscape-installation-jarfile>

Where:

  • <path-to-java-binary> is the fully qualified path to the java.exe file (usually indicated by the environment variable JAVA_HOME )
  • and <cloudscape-installation-jarfile> is the installation JAR file

Respond to the prompts:

  • Initial screen (1): Verify that the correct product is to be installed: Click Next .
  • Screen 2: Uncheck the checkbox labeled "Yes, I would like to read the release notes now." Click Next .
  • Screen 3: Review the licensing terms. If acceptable, select I accept the terms of the license agreement . Click Next .
  • Screen 4: Verify or enter the full directory path for the installation. Click Next .
  • Screen 5: Verify the installation selections and the disk space needed. Click Next .
  • Screen 6: Click Next , then click Finish .

Configuration/Setup

Q: What are the basic components of a Derby deployment?

A: The key Derby components in an application deployment are 1) the JAR files, 2) the database files, and 3) the error log. These will exist in all deployments and must be located before any database level work or debugging can occur.

The concept of the System Home is also important when talking about any Derby deployment. The Derby System Home is the default location where all files, the database, and the Derby log are created. If this location is not defined by setting the property derby.system.home , it will default to the current working directory of the JVM process running the database engine. It is prudent to explicitly define the System Home in production deployments and document the location. Set derby.system.home before connecting to the first database (also referred to as "starting the Derby engine").

The default name of the error log file is derby.log , but the name is configurable and is sometimes changed by the application programmer. By default, the error log is created in derby.system.home, and it is prudent to allow it to default to this location. The error log is required to conduct any database level investigation/debugging activities. Its location and name must be known in order to work with a Derby deployment. The log will contain a list of all databases booted by the Derby engine along with any severe errors that occur (see the Tuning Guide, "Working with Derby Properties," for information on setting derby.stream.error.logSeverityLevel to capture less severe messages and warnings).

The database files (for example, c2a10.dat) and the transaction logfiles (for example, log2.dat) are located in subdirectories of the main database directory. The main directory will have the same name as the database (for example, toursDB). The path to the database directory is needed in order to boot the database. The main database directory will contain at least two subdirectories:

  • "seg0" contains the datafiles
  • "log" contains the transaction logfiles

Never directly access any files in these directories.

The derby.jar file contains the code to run the Derby engine. This file must be in your classpath to load the embedded driver and boot a Derby database. The Derby development environment contains other optional JAR files that may be included in a Derby deployment. These are:

  • derbytools.jar - The tools ij and dblook
    • ij - Derby interactive JDBC scripting tool and SQL command line interface
    • dblook - Derby utility to view or dump the Data Definition Language (DDL) for objects in a database
  • derbynet.jar - The Derby Network server
  • derbyclient.jar - The Derby Network Client. This client library became available with release 10.1 and is the preferred client library. It can also be used with Network Server Version 10.0.
  • db2jcc.jar and db2jcc_license_c.jar - The DB2 JDBC Universal Driver that supports client communications with the Network Server. [ Note: derbyclient.jar is the preferred client library to use with Network Server.]

Q: What is the embedded Server architecture?

A: Put simply, an embedded Server architecture is created when a Server program that can communicate with Derby runs in the same JVM as the Derby engine. In this architecture any thread running within the JVM can access Derby databases using the embedded driver (org.apache.derby.jdbc.EmbeddedDriver) while threads running in separate JVMs can simultaneously access the same databases using the appropriate client driver. The internal threads benefit by being able to interact directly with the Derby engine (embedded) while the external threads communicate with Derby via the Server intermediary (client/server).

When using the Derby Network Server an embedded server architecture can be easily created by setting the property derby.drda.startNetworkServer to true. The property causes Derby to automatically start the Network Server when the engine boots. By simply setting this property on the java command line or in the derby.properties file any program that loads the Derby embedded driver will run with the Network Server embedded as well. This is particularly useful when testing or debugging an embedded application because it provides the ability to monitor the database using ad-hoc queries while he application is running. SimpleNetworkServerSample.java program (provided with the Derby distribution) for a demonstration of a simple embedded server environment.

The most common example of an embedded server environment is when Derby is used in conjunction with and application server such as Tomcat or WebSphere. By configuring these servers to load the Derby engine at startup an embedded server environment is created. The database engine exists in the same JVM as the application server so all applications, JSPs, EJBs, Servlets, etc running on the server can use the embedded driver for database access (via a global or server-wide embedded datasource)and avoid the overhead of client/server communication. For more information on using Derby with application servers read the paper titled "Use Derby in a J2EE Server environment" listed in the Resources section.

Q: What are the steps needed to implement user authentication and authorization?

A: Derby has a built-in user authentication (username and password) and user authorization (data access levels) system that can be used in environments where increased security is needed and other systems (e.g. LDAP) are not being used. In its default configuration Derby authentication and authorization is not used. This example activates authentication by the built-in provider then defines four user accounts (Spot, Rover, Dick, Jane) and assigns two to each of the two authorization (access) levels. Properties are then set to disallow access without a valid username and password and to prevent the settings from being overridden by system level properties set programmatically.

Active authentication by the built-in provider
CALL SYSCS_UTIL.SYSCS_SET_DATABASE_PROPERTY(
'derby.connection.requireAuthentication', 'true') 

CALL SYSCS_UTIL.SYSCS_SET_DATABASE_PROPERTY(
'derby.authentication.provider', 'BUILTIN')
Define four user accounts and passwords
CALL SYSCS_UTIL.SYSCS_SET_DATABASE_PROPERTY('derby.user.Spot', 's06oh1r1')
CALL SYSCS_UTIL.SYSCS_SET_DATABASE_PROPERTY('derby.user.Rover', 'j2um3u5u')
CALL SYSCS_UTIL.SYSCS_SET_DATABASE_PROPERTY('derby.user.Dick', 'bt84ayNn9') 
CALL SYSCS_UTIL.SYSCS_SET_DATABASE_PROPERTY('derby.user.Jane', 'aal379y0w')
Define authorization level (Full or Read-only) for each account
CALL SYSCS_UTIL.SYSCS_SET_DATABASE_PROPERTY('derby.database.fullAccessUsers','Spot,Rover') 
CALL SYSCS_UTIL.SYSCS_SET_DATABASE_PROPERTY('derby.database.readAccessUsers','Dick,Jane')
Restrict access to defined, authorized users and block property override
CALL SYSCS_UTIL.SYSCS_SET_DATABASE_PROPERTY('derby.connection.requireAuthentication', 'true') 
CALL SYSCS_UTIL.SYSCS_SET_DATABASE_PROPERTY('derby.database.defaultAccessMode', 'noAccess')
CALL SYSCS_UTIL.SYSCS_SET_DATABASE_PROPERTY('derby.database.propertiesOnly', 'true')
Examples of invalid and valid connection attempts using IJ
java org.apache.derby.tools.ij
ij version 10.1
ij> connect 'jdbc:derby:authdb';
ERROR 08004: Connection refused : Invalid authentication.
ij> connect 'jdbc:derby:authdb;user=Rover';
ERROR 08004: Connection refused : Invalid authentication.
ij> connect 'jdbc:derby:authdb;user=Spot;password=s06oh1r1';
ij>
Listing the authorized accounts for a database
ij> values SYSCS_UTIL.SYSCS_GET_DATABASE_PROPERTY
      ('derby.database.fullAccessUsers');
1
------------------------------------------------------
Spot,Rover
ij> values SYSCS_UTIL.SYSCS_GET_DATABASE_PROPERTY
       ('derby.database.readAccessUsers');
1
-------------------------------------------------------
Dick,Jane

Considerations when using Authentication

  • Always create at least one full access user account before setting derby.connection.requireAuthentication or you will not be able to manipulate data in the database.
  • Applications should use fully qualified tablenames in the format schemaname.tablename to avoid problems due to each user having a different default schema. When the schema name is not explicitly listed in an SQL statement the default schema of the connection is assumed. For an authenticated connection the username becomes the default schema. If no username is specified when a connection is made the system default APP schema is assumed.
  • Consider encrypting the database when security is important.
  • To deauthenticate an account set the password to null. Example: CALL SYSCS_UTIL.SYSCS_SET_DATABASE_PROPERTY('derby.user.Jane',null)
  • To deauthorize a user recreate the authorization list(s) without the username listed.

Q: How do I setup Network Server to run under the java security manager?

A: Derby and the Network Server require specific access permissions be granted in order to run under the java security manager. These permissions can be listed in a special policy that is specified on the command line when Network Server is started. The example policy file that follows lists the basic permissions needed. Create the policy file (named nwsvr.policy in this example) and specify the following two options when starting Network Server:

  • -Djava.security.manager - to invoke the default security manager
  • -Djava.security.policy=nwsvr.policy - to allow the access required to Network Server and Derby to run properly.
Example policy file syntax to run Network Server
//Recommended set of permissions to start and use the Network Server. 
//  Assumes the Java System variable ${derby.base} is set to the directory that  
//     contains all the derby jars.  Restrict access to this directory!
// Assumes all connections will be from localhost
// Assumes the database is under the directory ${derby.system.home}.
// Fine tune based on your environment settings
grant codeBase "file:${derby.base}/-" {
permission java.io.FilePermission "${derby.system.home}", "read";
permission java.io.FilePermission "${derby.system.home}${/}-", "read, write, delete";
permission java.io.FilePermission "${user.dir}${/}-", "read, write, delete";
permission java.util.PropertyPermission "derby.*", "read";
permission java.util.PropertyPermission "user.dir", "read";
permission java.lang.RuntimePermission "createClassLoader";
permission java.net.SocketPermission "localhost", "accept";
};

Use the following command to start Network Server to accept connections from localhost using the property file nwsvr.policy residing in the current directory and with the Derby jarfiles being in the directory /opt/derby/lib.

java -Djava.security.manager -Djava.security.policy=nwsvr.policy 
     -Dderby.base=/opt/derby/lib
    org.apache.derby.drda.NetworkServerControl -h localhost start

To start the Network Server so connections can be made from other machines you will need to modify both the policy file and the Network Server startup command. Change the policy file line that sets permission java.net.SocketPermission to accept connections from the machines or domains that require access. When starting the Network Server use the -h parameter and specify the address 0.0.0.0 so the Network Server itself does not restrict connections.

Example policy: allow access for machines in "myOwnDomain.com"
permission java.net.SocketPermission "*.myOwnDomain.com", "accept";
Example start command: do not restrict connections
java -Djava.security.manager -Djava.security.policy=nwsvr.policy 
     -Dderby.base=/opt/derby/lib
    org.apache.derby.drda.NetworkServerControl -h 0.0.0.0 start

Besides the syntax shown in the examples the hosts specification of the java.net.SocketPermission class can be a complete DNS name or a numerical IP address and can include port number ranges. For complete information on specifying access to this and other security manager permission classes refer to the "Java Security Specification: Permissions" link in Resources section.

Other Security Considerations when using Network Server

  • Secure the machine hosting the Network Server
  • Restrict access to the database files and directories
  • Change the port that the Network Server listens on
  • Encrypt the Derby database files
  • Setup user authentication and authorization using the Derby BUILTIN authentication provider or other system.
  • Encrypt network transmissions using SSL or other technology

Performance

Q: How can I tune Derby to improve cache hits and reduce I/O?

A: Performance may be improved and I/O reduced for systems that handle large amounts of data by increasing the size of the data cache available to Derby. The goal of sizing the cache for performance would be to set the pageCache to hold as much data as possible up to the size of the entire database while avoiding OutOfMemory exceptions (OOMs). A rough (but large) estimate of the amount of data in a database can be obtained from the total size of the datafiles in the seg0 directory. The sum of file sizes will always exceed the total sum of all the records in the database because there is free space in these files. Compressing the database just prior to noting the total datafile size will minimize the error in this method. Usually a much smaller cache provides good performance, so there is no point sizing a pageCache larger than the value above.

Getting the cache sized optimally requires iterative testing to insure that the increased cache both improves performance and does not exceed the memory available to the JVM (maximum heap). Too large a Derby pageCache will cause the JVM to crash. The following factors need to be balanced to avoid OutOfMemory exceptions (OOMs):

  • The physical memory on the machine. The JVM must fit in physical memory.
  • The maximum heap size allowed for the jvm (-Xmx )
  • The size of the Derby pageCache ( pageCache * avg. pageSize )

See the Troubleshooting section of this document, question " How can I avoid OutOfMemory exceptions when running Derby? ," for information on adjusting these values.


General usage/features (the basics)

Q: How is a schema different from a database?

A: A Derby database is a group of schemas, each with its own set of tables, all of which are defined by the same data dictionary (the system tables in the SYS schema). On disk, the database is a set of data files (located in the seg0 directory) and a set of transaction log files (located in the log directory). Transactions cannot span databases, but they can span schemas in the same database. Refer to the question: " How can an application connect to more than one database in a Derby system? " for more information on multiple databases opened by a single Derby engine.

Derby implements schemas to provide multiple namespaces within a single DB. They are used to group related tables into a logical unit. There are no separate disk files related to schemas. All schemas in the same database share the same data dictionary and transaction logs. You can refer to the multiple schemas in a single SQL statement and it will execute within the same connection (no open is required). SQL manipulation between schemas in the same database are transactional.

Q: What can I do to avoid duplicate key failures (ERROR 23505) from values generated for a Derby identity column after I have inserted specific values into the column?

A:NOTE: The ability to insert data into an identity column was introduced in Version 10.1.2.1. Earlier versions do not contain this functionality. The ability to reset the identity sequence to adjust for manually inserted values will not be available until version 10.2 is released late in 2006.

As you've discovered, manually inserting values into an identity column that requires unique values is a move away from a Derby zero administration system. Derby increments the identity sequence based on the last number the value generator assigned and is unaware of values manually inserted. Because identity columns of type GENERATED BY DEFAULT allow duplicate values to occur, applying a unique constraint to the column can result in insert failures if the values manually inserted are later generated as part of the identity sequence. This problem can be addressed in one of the following two ways:

  • Avoid overlapping values: Implement a design where the values are partitioned into ranges that do not overlap. For instance, the records from source A are in the range 1 to 1 million, the generated values in this table ranges from 1 million and 1 to 2 million and the records from source C are more than 2 million. The starting value for the identity sequence as well as the value the sequence increments by can be specified when the column is created.
  • Step through the conflicting values: The value generated in the sequence increments even when the insert fails. It is possible to retry the failed insert statement until the value generated does not conflict with the existing values and the insert succeeds.

Version 10.2 will introduce the ability to restart an identity sequence using the ALTER TABLE command. If this problem is encountered, a restart value larger than the largest value found in the column can be specified with the following syntax:

ALTER TABLE <tableName> 
   ALTER COLUMN <identity ColName> 
   RESTART WITH <integer-constant>

Q: How can an application connect to more than one database in a Derby system?

A: It is possible to open multiple connections, each against a different database in a single application. The instances/connections, however, are unaware of each other with regard to cross database manipulation with SQL. Any manipulation between the two databases/connections would need to be done at the application level (not within a single SQL statement) and the results displayed or written to the appropriate location.

For example, you can read table ONE from database A into an array and table TWO from database B into another array. Loop through the data records in the arrays, and change the values as needed, then write the data back to the original database and table. By contrast, creating table ONE and table TWO in two different schemas of the same database allows this manipulation to be done using SQL within a single transaction.

Q: How do I list the names of the tables in a database?

A: Tables are listed in the system table SYSTABLES. A fully qualified table name includes the schema in which the table was created (APP is the default schema). The following query lists the fully qualified tablenames of all user tables in the database.

select s.schemaname || '.' || t.tablename 
     from sys.systables t, sys.sysschemas s 
     where t.schemaid = s.schemaid
          and t.tabletype = 'T'
     order by s.schemaname, t.tablename

Q: How do I create a Java function in a Derby database?

A: A powerful feature of Derby is the ability to create your own functions to perform specialized processing. Functions are implemented in Java, called as part of a SQL statement and can be used in triggers to perform conditional data manipulations. The following example demonstrates the steps involved in creating a user-defined function in Derby. The example function converts a BIGINT value to a Hexadecimal string. For more examples of creating Derby functions and procedures see the link in the Resources section.

The basic steps to creating a database function are:

  • Create java methods to perform the desired manipulations and place them in a JAR file
  • Install (load) the JAR file into the database
  • Set the database classpath to include the JAR file
  • Create (define) the function to call each method
  • Test the newly defined function
Listing 1. Java code to support the bigintToHexString function
// Class supporting Derby Java Stored Procedures and Functions
import java.sql.*;

public class derbyJavaUtils
{
// bigintToHexString: converts a BIGINT value to a Hexadecimal String
public static String bigintToHexString(long myBigint)
    {
        return Long.toHexString(myBigint);
    }
}
Listing 2. SQL syntax to install a JAR file to the database

Click to see code listing

Listing 2. SQL syntax to install a JAR file to the database

Format: CALL SQLJ.install_jar( 'jarFilePath', 'qualifiedJarName', 0)
  Example: CALL sqlj.install_jar(  'derbyJavaUtils.jar','APP.derbyJavaUtils',0 ) 

    NOTE: In this example the file 'derbyJavaUtils.jar' is in the Derby default directory, derby.system.home.  
       The filepath will needed to be specified if the jarfile is not located in derby.system.home.
Listing 3. Setting the database classpath to include the JAR file
Format: Call SYSCS_UTIL.SYSCS_GET_DATABASE_PROPERTY('KEY','VALUE')
  Example: CALL SYSCS_UTIL.SYSCS_SET_DATABASE_PROPERTY (
             'derby.database.classpath', 'APP.derbyJavaUtils')
Listing 4. Defining the function to the database
Example: CREATE FUNCTION app.bigintToHexString(hexString bigint)
RETURNS VARCHAR(16)
PARAMETER STYLE JAVA NO SQL
LANGUAGE JAVA 
EXTERNAL NAME 'derbyJavaUtils.bigintToHexString'
Listing 5. Testing (using) the function
Example: select CONGLOMERATENAME, bigintToHexString(CONGLOMERATENUMBER) 
           from SYS.SYSCONGLOMERATES

Q: How do I find the filename for a datatable?

A: The system table SYSCONGLOMERATES references the physical files associated with the data objects in a Derby database. The filename is constructed as a hexadecimal string representation of the conglomerate number. To list the file names for all user tables in the tables in a database create the bigintToHexString function shown in the Java function FAQ and execute the following query:

select  TABLENAME, 'C' ||  bigintToHexString(CONGLOMERATENUMBER) ||  '.dat' 
 from SYS.SYSCONGLOMERATES a, SYS.SYSTABLES b
 where a.TABLEID = b.TABLEID AND b.TABLETYPE <> 'S' ;

CAUTION: Never change or delete a datafile for any reason or unrecoverable corruption will occur. The Derby DBMS engine is the only program that should ever access or change the contents of a Derby datafile.


Troubleshooting

Q: How do I set up my development environment for Derby?

A: Set the Derby properties shown below in development and test environments so the Derby error log is not automatically removed and all errors and warnings are recorded. Derby is no longer a zero admin tem when these settings are used, so you will need to watch the size of the logfile. Never use these properties in a production environment, where zero administration is important.

The simplest way to set these properties is by creating a file named derby.properties in the derby.system.home directory (usually this is the same directory that contains the Derby errorlog logfile). Alternately you can specify each property using the -D parameter on the java command line. You will need to restart Derby for the properties to take effect. Below is a commented properties file for use in development and test environments:

Listing 6. Properties file for use in development and test environments
#           Top of Sample File 
# This is a sample Derby properties file provided to show 
#   setting properties which are useful for development and 
# troubleshooting 
# 
# -- Append to the log file rather than overwriting it 
# - manually remove the derby.log when it becomes large
derby.infolog.append=true 
# 
# -- Log all errors/messages of any severity (will list deadlocks)
derby.stream.error.logSeverityLevel=0 
# 
# -- Log all lock timeouts with additional trace information
derby.locks.monitor=true 
# 
# - List transaction information on victim and survivor transactions # for both lockTimeouts
#     for both lockTimeouts and Deadlocks
derby.locks.deadlockTrace=true 
#           End of Sample File

Q: How can I avoid OutOfMemory exceptions when running Derby?

A: The following sections provide background on Derby memory use and provide one approach to sizing Derby and the JVM to avoid OutOfMemory exceptions.

Factors relating to Derby memory use

If you are unhappy with Derby performance or are encountering OutOfMemory exceptions (OOMs), the first thing to try is adjusting the Derby pageCache . The memory you can safely allocate to pageCache depends on many factors; physical memory, average pageSize , and JVM maximum Heap size ( maxHeap ) are the main factors. The basic calculation is to compute the number of pages to allocate to the pageCache as a ratio of the JVM maximum Heap size.

The JVM maximum Heap Size

The JVM maximum Heap size ( maxHeap ) is set using the java parameter -Xmx#m , where # is the maximum number of Mb to allocate to the heap (for example, java -Xmx128m for a 128 Mb maxHeap ). Be sure that the JVM maximum heap size specified is less than the total physical memory of the machine.

Relating JVM MaxHeap to Derby PageCache

Tests in-house using Cloudscape 4.0 (a precursor of Derby) and Java 1.4.0 proved it to be safe to jump to a heapSize to Cloudscape pageCache ratio of 20:1. This ratio supported an active application and many applications work well with smaller ratios. Applications that handle a larger number of transactions per second might experience OutOfMemoryExceptions with larger JVM maxHeap to Cloudscape pageCache ratios. If you are experiencing OOMs, we recommend you perform stress testing to determine the optimal ratio for your tem. Start with a JVM to pageCache ratio of 20:1 and perform long running stress tests to determine when memory utilization stabilizes. If no OOMs are encountered, perform additional tests with smaller ratios until an acceptable throughput is achieved or until the application encounters OutOfMemory exceptions. Light weight applications can run with a ratio of 2.5:1. The Derby default with Version 5.1 (1000 pageCache [= 4 MB cache] and a 64K maxHeap ) is a 16:1 ratio and works well for most applications. The best ratio to use is one that provides adequate performance without encountering OOMs.

PageCache: Doing the math

The size of the Derby pageCache is determined by the average pagesize used in the database and the setting of the property derby.storage.pageCacheSize . pageCache specifies the maximum number of pages held in the Derby page cache. To estimate the amount of memory the pageCache will use, multiply the pageCache by the average page size in the database. (See the documentation on the derby.storage.pageSize property for more information on page sizes.) If you use the default pagesize of 4K throughout all your tables in the database, this calculation is simply (pgSz = 4k) x ( pageCache setting).

JVM Heap to PageCache Ratio: Doing the math

The default heap:pageCache ratio (if neither maxHeap nor pageCache is set) for Cloudscape Version 4.0 running in a V1.4 JVM and below is 400:1 (64M / ( (40*4K)/1000). With Version 5.0 the default, pageCache size was increased from 40 to 1000 pages, thus changing the default ratio to 16:1 (64M / ( (1000*4)/1000 ) ). This ratio may be too small for active applications, and OOMs could occur using the Version 5.0 default.

Q: How do I verify the integrity of all my tables and indexes?

A: Derby's transaction logging and recovery system protects against machine and database crashes. If, however, there are hardware problems (e.g. I/O system failure) or the database was dual-booted it is advisable to verify the internal consistency of the tables and indexes. Derby supplies the system function SYSCS_UTIL.SYSCS_CHECK_TABLE to check the consistency of a tables indexes against the base table. This function can be executed as part of a query against the data dictionary to check each user table in the database. If a problem is found the query needs to be reexecuted excluding the problem table or tables until the query completes without an exception. The following steps describe this process:

  1. Run the following query using IJ or another query mechanism
    SELECT schemaname || '.' || tablename as TableName,
       SYSCS_UTIL.SYSCS_CHECK_TABLE(schemaname, tablename) AS OK
    FROM sys.sysschemas s, sys.systables t
    WHERE s.schemaid = t.schemaid
       and t.tabletype = 'T';
  2. When a problem is found the query aborts without executing on the remaining tables. Note the name of the table listed in the exception. If the problem is with an index, also note the index name. Rerun the query excluding the problem table(s).
    	SELECT schemaname ||'.' || tablename as TableName,
       SYSCS_UTIL.SYSCS_CHECK_TABLE(schemaname, tablename) AS OK
    FROM sys.sysschemas s, sys.systables t
    WHERE s.schemaid = t.schemaid
       and t.tabletype = 'T'
       and tablename NOT IN ("<tableX>");
  3. If further problems are found continue to add to the NOT IN list until the query completes without an exception. the clause might look something like this tablename NOT IN ("CITIES","FLIGHTS"). When run against the demo database toursdb the output would look like this:
    ij>  	SELECT schemaname ||'.' || tablename as TableName,
       SYSCS_UTIL.SYSCS_CHECK_TABLE(schemaname, tablename) AS OK
    FROM sys.sysschemas s, sys.systables t
    WHERE s.schemaid = t.schemaid
       and t.tabletype = 'T'
       and tablename NOT IN ("CITIES","FLIGHTS");
       
    	TABLENAME
            |OK
    --------------------------------
    --------------------
    APP.AIRLINES
            |1
    APP.COUNTRIES
            |1
    APP.FLIGHTAVAILABILITY
            |1
    APP.MAPS
            |1
    APP.FLIGHTS_HISTORY
            |1
    
    5 rows selected

Index problems reported by SYSCS_CHECK_TABLE can be corrected by dropping and recreating the indexes. If problems are found in the datatables the database will need to be restored from the last backup.


Release Information

Q: What is new in IBM Cloudscape, Version 10.1?

A: IBM Cloudscape 10.1 includes the Apache Derby 10.1 release and is the second commercial release from IBM of Apache Derby. In addition to all the benefits and features inherent in Version 10.0, Version 10.1 contains:

  • Support for JSR 169, the JDBC API defined for the Connected Device Configuration (CDC) / Foundation Profile (FP). Use the org.apache.derby.jdbc.EmbeddedSimpleDataSource class when running in a Java 2 Platform, Micro Edition (J2ME) CDC/FP environment.
  • The Derby Network Client, an open source client library for use with the Derby Network Server. Network Server, Version 10.1, continues to support use of the IBM DB2 JDBC Universal Driver, but the Derby Network Client will be the standard client library going forward.
  • XA distributed transaction support when using the Derby Network client. Use the org.apache.derby.client.jdbc.ClientXADataSource client class when implementing Derby in a J2EE Resource manager in a distributed J2EE environment implementing the Java Transaction API (JTA).
  • Soft Upgrade: the ability to access a database created by Derby 10.0 using version 10.1 without specifying upgrade=true and still be able to use the older version to access the database. Certain new features will not be available, however, until a hard upgrade is performed.
  • Support for the JDBC 3.0 Updatable ResultSet API
  • SQL synonyms
  • Functions for performing SQL Date/Time arithmetic
  • SQL intercept and except
  • Ability to import or insert values into identity columns
  • Run in place compress table
  • The Migration Tool to assist in upgrading from Cloudscape, Version 5, to IBM Cloudscape, Version 10.x (available for download from developerWorks at the end of August 2005). (See Resources .)

Q: Why was a new client driver library released with version 10.1?

A: The Derby Network Client (derbyclient.jar) is the new client-side library for the Derby Network Server. Like the DB2 JDBC Universal Driver, it provides access to Derby databases with the Derby Network Server from a remote machine or process. Unlike its predecessor, the Network Client is open source code and is better integrated with the Derby Network Server. Some of the improvements are:

  • A username and password is only required if authentication is being used.
  • Special quoting is no longer required to escape the slash and colon characters used in some database paths.
  • The URL connection attributes are position independent and use a single delimiter (semicolon).
  • There is no separate license for this client library (like the rest of the Apache Derby software, it is available under the Apache license).

Version 10.1 of the Derby Network Server supports both the Derby Network Client and the DB2 JDBC Universal Driver. Going forward, enhancements to Network Server will require the use of the Derby Network Client library. For example, only the Network Client contains the Network Server XA datasource, org.apache.derby.client.jdbc.ClientXADataSource .

Switching from using the Universal Driver to the Network Client will require:

  • Distributing the derbyclient.jar file and placing it in the CLASSPATH, instead of db2jcc.jar and db2jcc_license_c.jar.
  • Updating the application to load the new driver (org.apache.derby.jdbc.ClientDriver) .
  • Altering the database connection URLs to the simpler syntax:

                         jdbc:derby://<server>[:<port>]/<databaseName>
                         [;<URL attribute>=<value>[;...]]

Q: What is IBM Cloudscape, Version 10.0?

A: IBM Cloudscape 10.0, a commercial release from IBM of Apache Derby, is a small-footprint, multiuser, standards-driven, relational database tem written entirely in Java that is well suited for embedding in Java applications and servers. The core functionality of Derby and IBM Cloudscape is the same as what is referred to here as Derby. Derby runs in any fully functional JVM, thus the same application will run unaltered on many hardware platforms.

Derby supports the SQL-92E (entry) standard and part of SQL-99. The Derby API is JDBC. Derby requires no administration (zero admin); so there is no need to install or manage the database separate from the application. Install it and forget it, is the key goal for a successful, embeddable component.

Resources

Learn

Get products and technologies

  • Migration Tool : Assists in upgrading from Cloudscape, Version 5, to IBM Cloudscape, Version 10.x (available for download from developerWorks at the end of August 2005).
  • IBM DB2 JDBC Universal Driver for the Apache Derby Network Server: Provided with Cloudscape 10.0; however, it is not included with the Derby Network Server. It can be downloaded from this page, free.
  • IBM DB2 plug-ins for Eclipse, Version 1.0.2 Beta (developerWorks, August 2004) discusses IBM DB2 plug-ins for Eclipse and provides links for downloading the plug-ins. ( Note: A smaller, improved version of this tool will be made available soon after version 10.1 is released.)
  • Download free modules - Cloudscape and related software from the Cloudscape download page.
  • Technical support for IBM Cloudscape is available to individuals and organizations that require guaranteed response and resolution to issues (fee-based).

Discuss

Comments

developerWorks: Sign in

Required fields are indicated with an asterisk (*).


Need an IBM ID?
Forgot your IBM ID?


Forgot your password?
Change your password

By clicking Submit, you agree to the developerWorks terms of use.

 


The first time you sign into developerWorks, a profile is created for you. Information in your profile (your name, country/region, and company name) is displayed to the public and will accompany any content you post, unless you opt to hide your company name. You may update your IBM account at any time.

All information submitted is secure.

Choose your display name



The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerWorks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

Required fields are indicated with an asterisk (*).

(Must be between 3 – 31 characters.)

By clicking Submit, you agree to the developerWorks terms of use.

 


All information submitted is secure.

Dig deeper into Information management on developerWorks


static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Information Management
ArticleID=14616
ArticleTitle=A compendium of Cloudscape frequently asked questions
publish-date=07272006