General
What is the difference between IBM Cloudscape and Apache Derby?
Is IBM Cloudscape free?
What is "embedded"?
Is embedded Derby a multiuser database?
How can double-booting with a read-only connection cause corruption?
Why isn't Client/Server support bundled in the base engine jar?
Does Derby support SQLJ?
Why shouldn't the Derby JDBC Driver be deployed at the J2EE application level?
Installation and configuration
How do I install the Derby environment?
What are the basic components of a deployment?
What is the embedded Server architecture?
What are the steps needed to implement authentication and authorization?
How do I setup Network Server to run under the java security manager?
How can I tune Derby to improve cache hits and reduce I/O?
General usage
How is a schema different from a database?
What can I do to avoid duplicate key failures (ERROR 23505) from values generated for a Derby identity column after I have inserted specific values into the column?
How can an application connect to more than one database?
How do I list the names of the tables in a database?
How do I create a Java function in a Derby database?
How do I find the filename for a datatable?
Troubleshooting
How do I set up my development environment for Derby?
How can I avoid OutOfMemory exceptions when running Derby?
How do I verify the integrity of all my tables and indexes?
Releases
What is new in IBM Cloudscape, Version 10.1?
Why was a new client driver library released with version 10.1?
What is IBM Cloudscape, Version 10.0?
Resources
Q: What is the difference between IBM Cloudscape and Apache Derby?
A: There is no difference in core functionality between Apache Derby and IBM Cloudscape. Beginning with Version 10, IBM Cloudscape became a commercial implementation of the database being developed as part of the Apache Derby Project. (See Resources .) In August 2004, IBM contributed the source code for the IBM Cloudscape product to the Apache Software Foundation (ASF). The software was renamed Derby. IBM produces builds of IBM Cloudscape, Version 10, and above directly from the ASF codeline. IBM makes no changes to the database engine, tools, or network server code it repackages and distributes as IBM Cloudscape.
Beyond the core functionality, IBM Cloudscape differs from Derby in packaging and in optional, value-added modules and tools provided by IBM. Because the core products are the same, these additional modules and tools can be used with Apache Derby or IBM Cloudscape; both products are indistinguishable at the core level. The following lists the packaging and value-added modules and tools available for use with Derby and/or IBM Cloudscape.
-
IBM Cloudscape builds undergo extensive testing to insure the
stability of the codeline prior to releasing the product for
download and include:
- Major releases bundled in an installer package to simplify setup
- The IBM build of the JRE for Windows and Linux within the respective installer package for use, if needed
- The DB2 JDBC client driver that can be used to connect to the Network Server: IBM DB2 JDBC Universal Driver. [ Note: The open source Derby Network Client V10.1 should be used in place of this library in most installations.] These JAR files are included with all IBM Cloudscape installers and are also available as a separate download for use with ASF Derby distributions. (See Resources .)
- Documentation in PDF format
- An ODBC driver for Derby, the IBM DB2 Runtime Client (available as a separate download). (See Resources .)
- A database browsing tool, the IBM DB2 plug-ins for Eclipse (available as a separate download). (See Resources .) [ Updated: A smaller, improved version of this tool will be made available soon after version 10.1 is released.]
Access the IBM Cloudscape download page to download these modules free of charge. (See Resources .)
IBM also offers services to IBM Cloudscape users. A fee-based Support license (see Resources ) for IBM Cloudscape is available to individuals and organizations that require guaranteed response and resolution to issues. This offering provides voice and electronic access into IBM support organizations. The service license is restricted to IBM Cloudscape distributions of Derby and is on a 'per machine' basis. For individuals and organizations that do not require formal support services, IBM also provides free support through papers available on the developerWorks Cloudscape Web site and through the IBM Cloudscape community forum. (See Resources .)
A: Yes. IBM Cloudscape is free for download. Use the "Download free modules" link in the Resources section of this document. In addition its license allows IBM Cloudscape to be freely redistributed.
A: Definition from the Merriam-Webster OnLine Dictionary (http://m-w.com):
em·bed·ded; em·bed·ding - transitive senses
1 a : to enclose closely in or as if in a matrix <fossils embedded in stone> b : to make something an integral part of <the prejudices embedded in our language> c : to prepare (a microscopy specimen) for sectioning by infiltrating with and enclosing in a supporting substance
2 : to surround closely <a sweet pulp embeds the plum seed> intransitive senses : to become embedded
"Embedded Derby" means there isn't a separate database server; Derby is installed with the Java application (one JAR file plus the database files). It runs in the same JVM as the application and is booted and shutdown by the application. The end-user does not need to be aware there is a database system within it. The result is an application that is easy to deploy with all the features of a relational database system. From an application design standpoint, Derby is about as 'Plug N Play' as a DBMS can be.
Q: Is embedded Derby a multiuser database?
A: Yes, Derby supports multiuser access, regardless of the deployment option used. "Multiuser" means the engine manages multiple, isolated transactions as needed. The Derby manuals address using Derby in single user applications. Derby is well suited to run in this architecture but is not restricted to a single user or single connection. Derby supports multiple connections even in a fully embedded deployment. Each connection represents an individual transaction. A single application can establish multiple connections and thus execute multiple, independent transactions. Each connection can specify one of four levels of transaction isolation to provide the required level of concurrency. See the Derby Developers Guide for a complete description of transaction isolation levels, concurrency, and the types of locking utilized.
It is important to note that the Derby JAR file contains no networking code. Again, this does not mean Derby is single user, although this misinterpretation is common. Network messages are required for communications outside the JVM, which is running the Derby engine (inter-process communication). Communication between two independent processes is termed "Client-Server." To meet the need for Client-Server capabilities, Derby provides the Network Server (in a separate JAR file). Derby easily embeds in Network Server to provide Client-Server connectivity when needed. Derby also embeds easily in many other server programs (for example, WebSphere, Apache Tomcat, and so on), so the choice is up to you.
A word of caution; if an interactive connection to the database is required in addition to the connection(s) established by the application, it is advisable to use a server environment (the embedded server architecture is the most flexible). In several instances database corruption has occurred when, during testing, a programmer attempted to monitor test progress using a separate query tool. The query tool runs in its own JVM and, if allowed to connect, will corrupt the database by 'double-booting' (also known as 'dual-booting') the database. See the section on Double-booting in the DB2 Help site documentation for important information on this subject.
Q: How can double-booting Derby with a read-only connection cause corruption?
A: The first connection made to a Derby database "boots" the database. Booting means that all memory structures and processes are initialized (caches, managers, lists, and so on), the data dictionary is read, and any uncommitted transactions in the transaction log are rolled back. In a double-boot (also known as dual-boot) scenario, where the uncommitted transactions are those active in the other JVM process, this rollback will cause corruption. So the type of connection being established does not matter because the act of booting the database will recover any incomplete transactions in the transaction log in order to preserve data integrity.
Where possible, Derby tries to detect and prevent
double-booting, but, until recently, it has not been possible to stop
this in all environments. Use of a JVM of version 1.4 or higher enables
Derby to prevent double-booting on UNIX, Linux, and similar
operating systems. Use of a version 1.4 or higher JVM is highly
recommended on these platforms. With systems where an earlier JVM must be
used, dual-booting can be minimized by setting the property
derby.database.forceDatabaseLock=true
. One side-effect of setting
forceDatabaseLock
is that the existence of the database lock file (db.lck) will always
prevent a database from booting. When a database is not properly shutdown
(for example, a crash or an abort) an 'orphan' database lock
file will be left. When
forceDatabaseLock
is set, this orphan file will need to be removed manually. Normally
Derby detects and handles orphaned database lock files
appropriately, but
forceDatabaseLock
could prevent the database from booting even if no other process has an
active lock on the file. Be sure to verify that no JVMs are running on
the database machine before manually removing a database lock file.
Q: Why isn't Client/Server support bundled in the base engine jar?
A: Many applications run entirely on a single system in a single process, so do not require Client/Server capabilities. For instance, a servlet storing data entered into a Web form or an address book application running on a PC does not need to communicate with other processes. Client/Server code would increase the footprint of these applications and provide no benefit. For such applications, Derby provides all the advantages of a relational database without unnecessary networking code. Client/Server capabilities can be added incrementally in instances where it is required by running Network Server or some other server architecture.
A: No. There is no SQLJ implementation produced specifically for Derby. Though some SQLJ translators may generate code that does not fail when used with Derby this has not been tested and the performance of the code might be poor. The SQLJ standard allows for 'vendor customizations' in order to produce the best result possible. This is why most SQLJ implementations are produced by the major database vendors (Oracle, Sybase, IBM, etc.) and include customizations specific to that database. Generally when people are interested in SQLJ, they want more than just a simple JDBC translator. They expect the resulting code to be as fast or faster than the programs they would produce themselves.
Q: Why shouldn't the Derby JDBC Driver be deployed at the J2EE application level?
A: This is because the Derby driver contains DBMS engine classes, not just the communication classes that make up most JDBC drivers. When the driver is first loaded the engine boots and these internal structures need to be accessible by any other JVM thread(s) performing database activity. If the Derby driver is first loaded by a deployed application such access is not possible. This is because application servers setup a hierarchy of restricted classloaders that isolate deployed modules (applications, JSPs, Servlets, EJBs, etc.) to prevent them from "stepping" on one and other. To work properly the engine classes must be visible JVM-wide (e.g. above the application level in the classloader hierarchy of the server). This insures that Derby internal threads as well as threads making database requests will have access to the classes and objects needed to perform the requested functions. Because it is embedded, Derby inherits the privileges of the routine that loads it. If Derby is booted in a restricted environment the results are unpredictable.
Installation and configuration
Q: How do I install the IBM Cloudscape environment?
A: Installation is simple and made platform independent by using the Java installer. (Command line syntax to begin the installation and file path designation remain platform dependant). This answer outlines installation using the Java installer. Platform-specific installers are available for Windows and Linux environments; see the install.html file for information on using platform-specific installers.
Installation requires a functioning JVM and adequate disk space on the filesystem -- about 4Mb for the JAR files alone, 24Mb for the complete IBM Cloudscape installation with javadoc and samples, or (on Windows only) 64Mb for the complete IBM Cloudscape installation, and the IBM JRE. These estimates do not include the size of the installation JAR file.
At the OS command line prompt, move to the directory containing the IBM Cloudscape installation JAR file. Invoke the installer by typing the command:
<path-to-java-binary>\java -cp . -jar
<cloudscape-installation-jarfile>
|
Where:
-
<path-to-java-binary>is the fully qualified path to thejava.exefile (usually indicated by the environment variableJAVA_HOME) -
and
<cloudscape-installation-jarfile>is the installation JAR file
Respond to the prompts:
- Initial screen (1): Verify that the correct product is to be installed: Click Next .
- Screen 2: Uncheck the checkbox labeled "Yes, I would like to read the release notes now." Click Next .
- Screen 3: Review the licensing terms. If acceptable, select I accept the terms of the license agreement . Click Next .
- Screen 4: Verify or enter the full directory path for the installation. Click Next .
- Screen 5: Verify the installation selections and the disk space needed. Click Next .
- Screen 6: Click Next , then click Finish .
Q: What are the basic components of a Derby deployment?
A: The key Derby components in an application deployment are 1) the JAR files, 2) the database files, and 3) the error log. These will exist in all deployments and must be located before any database level work or debugging can occur.
The concept of the System Home is also important when talking about any
Derby deployment. The Derby System Home is the default
location where all files, the database, and the Derby log are created. If
this location is not defined by setting the property
derby.system.home
, it will default to the current working directory of the JVM process
running the database engine. It is prudent to explicitly define the
System Home in production deployments and document the location. Set
derby.system.home
before connecting to the first database (also referred to as
"starting the Derby engine").
The default name of the error log file is
derby.log
, but the name is configurable and is sometimes changed by the
application programmer. By default, the error log is created in
derby.system.home, and it is prudent to allow it to default to this
location. The error log is required to conduct any database level
investigation/debugging activities. Its location and name must be known
in order to work with a Derby deployment. The log will contain
a list of all databases booted by the Derby engine along with
any severe errors that occur (see the
Tuning
Guide, "Working with Derby Properties," for information on
setting
derby.stream.error.logSeverityLevel
to capture less severe messages and warnings).
The database files (for example, c2a10.dat) and the transaction logfiles (for example, log2.dat) are located in subdirectories of the main database directory. The main directory will have the same name as the database (for example, toursDB). The path to the database directory is needed in order to boot the database. The main database directory will contain at least two subdirectories:
- "seg0" contains the datafiles
- "log" contains the transaction logfiles
Never directly access any files in these directories.
The derby.jar file contains the code to run the Derby engine. This file must be in your classpath to load the embedded driver and boot a Derby database. The Derby development environment contains other optional JAR files that may be included in a Derby deployment. These are:
-
derbytools.jar - The tools
ijanddblook-
ij- Derby interactive JDBC scripting tool and SQL command line interface -
dblook- Derby utility to view or dump the Data Definition Language (DDL) for objects in a database
-
- derbynet.jar - The Derby Network server
- derbyclient.jar - The Derby Network Client. This client library became available with release 10.1 and is the preferred client library. It can also be used with Network Server Version 10.0.
- db2jcc.jar and db2jcc_license_c.jar - The DB2 JDBC Universal Driver that supports client communications with the Network Server. [ Note: derbyclient.jar is the preferred client library to use with Network Server.]
Q: What is the embedded Server architecture?
A:
Put simply, an embedded Server architecture is created when a Server program that can communicate with Derby runs in the same JVM as the Derby engine. In this architecture any thread running within the JVM can access Derby databases using the embedded driver (org.apache.derby.jdbc.EmbeddedDriver) while threads running in separate JVMs can simultaneously access the same databases using the appropriate client driver. The internal threads benefit by being able to interact directly with the Derby engine (embedded) while the external threads communicate with Derby via the Server intermediary (client/server).
When using the Derby Network Server an embedded server architecture can be easily created by setting the property derby.drda.startNetworkServer to true. The property causes Derby to automatically start the Network Server when the engine boots. By simply setting this property on the java command line or in the derby.properties file any program that loads the Derby embedded driver will run with the Network Server embedded as well. This is particularly useful when testing or debugging an embedded application because it provides the ability to monitor the database using ad-hoc queries while he application is running. SimpleNetworkServerSample.java program (provided with the Derby distribution) for a demonstration of a simple embedded server environment.
The most common example of an embedded server environment is when Derby is used in conjunction with and application server such as Tomcat or WebSphere. By configuring these servers to load the Derby engine at startup an embedded server environment is created. The database engine exists in the same JVM as the application server so all applications, JSPs, EJBs, Servlets, etc running on the server can use the embedded driver for database access (via a global or server-wide embedded datasource)and avoid the overhead of client/server communication. For more information on using Derby with application servers read the paper titled "Use Derby in a J2EE Server environment" listed in the Resources section.
Q: What are the steps needed to implement user authentication and authorization?
A:
Derby has a built-in user authentication (username and password) and user authorization (data access levels)
system that can be used in environments where increased security is needed and other systems (e.g. LDAP) are not being used.
In its default configuration Derby authentication and authorization is not used.
This example activates authentication by the built-in provider then defines four user accounts
(Spot, Rover, Dick, Jane) and assigns two to each of the two authorization (access) levels.
Properties are then set to disallow access without a valid username and password and to prevent the settings from being overridden by system level properties set programmatically.
Active authentication by the built-in provider
CALL SYSCS_UTIL.SYSCS_SET_DATABASE_PROPERTY(
'derby.connection.requireAuthentication', 'true')
CALL SYSCS_UTIL.SYSCS_SET_DATABASE_PROPERTY(
'derby.authentication.provider', 'BUILTIN')
|
Define four user accounts and passwords
CALL SYSCS_UTIL.SYSCS_SET_DATABASE_PROPERTY('derby.user.Spot', 's06oh1r1')
CALL SYSCS_UTIL.SYSCS_SET_DATABASE_PROPERTY('derby.user.Rover', 'j2um3u5u')
CALL SYSCS_UTIL.SYSCS_SET_DATABASE_PROPERTY('derby.user.Dick', 'bt84ayNn9')
CALL SYSCS_UTIL.SYSCS_SET_DATABASE_PROPERTY('derby.user.Jane', 'aal379y0w')
|
Define authorization level (Full or Read-only) for each account
CALL SYSCS_UTIL.SYSCS_SET_DATABASE_PROPERTY('derby.database.fullAccessUsers','Spot,Rover')
CALL SYSCS_UTIL.SYSCS_SET_DATABASE_PROPERTY('derby.database.readAccessUsers','Dick,Jane')
|
Restrict access to defined, authorized users and block property override
CALL SYSCS_UTIL.SYSCS_SET_DATABASE_PROPERTY('derby.connection.requireAuthentication', 'true')
CALL SYSCS_UTIL.SYSCS_SET_DATABASE_PROPERTY('derby.database.defaultAccessMode', 'noAccess')
CALL SYSCS_UTIL.SYSCS_SET_DATABASE_PROPERTY('derby.database.propertiesOnly', 'true')
|
Examples of invalid and valid connection attempts using IJ
java org.apache.derby.tools.ij
ij version 10.1
ij> connect 'jdbc:derby:authdb';
ERROR 08004: Connection refused : Invalid authentication.
ij> connect 'jdbc:derby:authdb;user=Rover';
ERROR 08004: Connection refused : Invalid authentication.
ij> connect 'jdbc:derby:authdb;user=Spot;password=s06oh1r1';
ij>
|
Listing the authorized accounts for a database
ij> values SYSCS_UTIL.SYSCS_GET_DATABASE_PROPERTY
('derby.database.fullAccessUsers');
1
------------------------------------------------------
Spot,Rover
ij> values SYSCS_UTIL.SYSCS_GET_DATABASE_PROPERTY
('derby.database.readAccessUsers');
1
-------------------------------------------------------
Dick,Jane
|
Considerations when using Authentication
- Always create at least one full access user account before setting
derby.connection.requireAuthenticationor you will not be able to manipulate data in the database. - Applications should use fully qualified tablenames in the format
schemaname.tablenameto avoid problems due to each user having a different default schema. When the schema name is not explicitly listed in an SQL statement the default schema of the connection is assumed. For an authenticated connection the username becomes the default schema. If no username is specified when a connection is made the system default APP schema is assumed. - Consider encrypting the database when security is important.
- To deauthenticate an account set the password to
null. Example:CALL SYSCS_UTIL.SYSCS_SET_DATABASE_PROPERTY('derby.user.Jane',null) - To deauthorize a user recreate the authorization list(s) without the username listed.
Q: How do I setup Network Server to run under the java security manager?
A: Derby and the Network Server require specific access permissions be granted in order to run under the java security manager. These permissions can be listed in a special policy that is specified on the command line when Network Server is started. The example policy file that follows lists the basic permissions needed. Create the policy file (named nwsvr.policy in this example) and specify the following two options when starting Network Server:
-
-Djava.security.manager- to invoke the default security manager -
-Djava.security.policy=nwsvr.policy- to allow the access required to Network Server and Derby to run properly.
Example policy file syntax to run Network Server
//Recommended set of permissions to start and use the Network Server.
// Assumes the Java System variable ${derby.base} is set to the directory that
// contains all the derby jars. Restrict access to this directory!
// Assumes all connections will be from localhost
// Assumes the database is under the directory ${derby.system.home}.
// Fine tune based on your environment settings
grant codeBase "file:${derby.base}/-" {
permission java.io.FilePermission "${derby.system.home}", "read";
permission java.io.FilePermission "${derby.system.home}${/}-", "read, write, delete";
permission java.io.FilePermission "${user.dir}${/}-", "read, write, delete";
permission java.util.PropertyPermission "derby.*", "read";
permission java.util.PropertyPermission "user.dir", "read";
permission java.lang.RuntimePermission "createClassLoader";
permission java.net.SocketPermission "localhost", "accept";
};
|
Use the following command to start Network Server to accept connections from localhost using the property file nwsvr.policy residing in the current directory and with the Derby jarfiles being in the directory
/opt/derby/lib.
java -Djava.security.manager -Djava.security.policy=nwsvr.policy
-Dderby.base=/opt/derby/lib
org.apache.derby.drda.NetworkServerControl -h localhost start
|
To start the Network Server so connections can be made from other machines you will need to modify both the policy file and the Network Server startup command. Change the policy file line that sets
permission java.net.SocketPermission to accept connections from the machines or domains that require access.
When starting the Network Server use the -h parameter and specify the address 0.0.0.0 so the Network Server itself does not restrict connections.
Example policy: allow access for machines in "myOwnDomain.com"
permission java.net.SocketPermission "*.myOwnDomain.com", "accept"; |
Example start command: do not restrict connections
java -Djava.security.manager -Djava.security.policy=nwsvr.policy
-Dderby.base=/opt/derby/lib
org.apache.derby.drda.NetworkServerControl -h 0.0.0.0 start
|
Besides the syntax shown in the examples the hosts specification of the
java.net.SocketPermission class can be a complete DNS name or a numerical IP address and can include port number ranges.
For complete information on specifying access to this and other security manager permission classes refer to
the "Java Security Specification: Permissions" link in
Resources section.
Other Security Considerations when using Network Server
- Secure the machine hosting the Network Server
- Restrict access to the database files and directories
- Change the port that the Network Server listens on
- Encrypt the Derby database files
- Setup user authentication and authorization using the Derby BUILTIN authentication provider or other system.
- Encrypt network transmissions using SSL or other technology
Q: How can I tune Derby to improve cache hits and reduce I/O?
A:
Performance may be improved and I/O reduced for systems that handle large
amounts of data by increasing the size of the data cache available to
Derby. The goal of sizing the cache for performance would be
to set the
pageCache
to hold as much data as possible up to the size of the entire database
while avoiding OutOfMemory exceptions (OOMs). A rough (but large)
estimate of the amount of data in a database can be obtained from the
total size of the datafiles in the seg0 directory. The sum of file sizes
will always exceed the total sum of all the records in the database
because there is free space in these files. Compressing the database just
prior to noting the total datafile size will minimize the error in this
method. Usually a much smaller cache provides good performance, so there
is no point sizing a
pageCache
larger than the value above.
Getting the cache sized optimally requires iterative testing to insure
that the increased cache both improves performance and does not exceed
the memory available to the JVM (maximum heap). Too large a
Derby
pageCache
will cause the JVM to crash. The following factors need to be balanced to
avoid OutOfMemory exceptions (OOMs):
- The physical memory on the machine. The JVM must fit in physical memory.
- The maximum heap size allowed for the jvm (-Xmx )
-
The size of the Derby
pageCache(pageCache* avg.pageSize)
See the Troubleshooting section of this document, question " How can I avoid OutOfMemory exceptions when running Derby? ," for information on adjusting these values.
General usage/features (the basics)
Q: How is a schema different from a database?
A: A Derby database is a group of schemas, each with its own set of tables, all of which are defined by the same data dictionary (the system tables in the SYS schema). On disk, the database is a set of data files (located in the seg0 directory) and a set of transaction log files (located in the log directory). Transactions cannot span databases, but they can span schemas in the same database. Refer to the question: " How can an application connect to more than one database in a Derby system? " for more information on multiple databases opened by a single Derby engine.
Derby implements schemas to provide multiple namespaces within a single DB. They are used to group related tables into a logical unit. There are no separate disk files related to schemas. All schemas in the same database share the same data dictionary and transaction logs. You can refer to the multiple schemas in a single SQL statement and it will execute within the same connection (no open is required). SQL manipulation between schemas in the same database are transactional.
A: NOTE: The ability to insert data into an identity column was introduced in Version 10.1.2.1. Earlier versions do not contain this functionality. The ability to reset the identity sequence to adjust for manually inserted values will not be available until version 10.2 is released late in 2006.
As you've discovered, manually inserting values into an identity column that requires unique values is a move away from a Derby zero administration system. Derby increments the identity sequence based on the last number the value generator assigned and is unaware of values manually inserted. Because identity columns of type GENERATED BY DEFAULT allow duplicate values to occur, applying a unique constraint to the column can result in insert failures if the values manually inserted are later generated as part of the identity sequence. This problem can be addressed in one of the following two ways:
- Avoid overlapping values: Implement a design where the values are partitioned into ranges that do not overlap. For instance, the records from source A are in the range 1 to 1 million, the generated values in this table ranges from 1 million and 1 to 2 million and the records from source C are more than 2 million. The starting value for the identity sequence as well as the value the sequence increments by can be specified when the column is created.
- Step through the conflicting values: The value generated in the sequence increments even when the insert fails. It is possible to retry the failed insert statement until the value generated does not conflict with the existing values and the insert succeeds.
Version 10.2 will introduce the ability to restart an identity sequence using the ALTER TABLE command. If this problem is encountered, a restart value larger than the largest value found in the column can be specified with the following syntax:
ALTER TABLE <tableName> ALTER COLUMN <identity ColName> RESTART WITH <integer-constant> |
Q: How can an application connect to more than one database in a Derby system?
A: It is possible to open multiple connections, each against a different database in a single application. The instances/connections, however, are unaware of each other with regard to cross database manipulation with SQL. Any manipulation between the two databases/connections would need to be done at the application level (not within a single SQL statement) and the results displayed or written to the appropriate location.
For example, you can read table ONE from database A into an array and table TWO from database B into another array. Loop through the data records in the arrays, and change the values as needed, then write the data back to the original database and table. By contrast, creating table ONE and table TWO in two different schemas of the same database allows this manipulation to be done using SQL within a single transaction.
Q: How do I list the names of the tables in a database?
A:
Tables are listed in the system table SYSTABLES. A fully qualified table name includes the schema in which the table was created (APP is the default schema).
The following query lists the fully qualified tablenames of all user tables in the database.
select s.schemaname || '.' || t.tablename
from sys.systables t, sys.sysschemas s
where t.schemaid = s.schemaid
and t.tabletype = 'T'
order by s.schemaname, t.tablename
|
Q: How do I create a Java function in a Derby database?
A: A powerful feature of Derby is the ability to create your own functions to perform specialized processing. Functions are implemented in Java, called as part of a SQL statement and can be used in triggers to perform conditional data manipulations. The following example demonstrates the steps involved in creating a user-defined function in Derby. The example function converts a BIGINT value to a Hexadecimal string. For more examples of creating Derby functions and procedures see the link in the Resources section.
The basic steps to creating a database function are:
- Create java methods to perform the desired manipulations and place them in a JAR file
- Install (load) the JAR file into the database
- Set the database classpath to include the JAR file
- Create (define) the function to call each method
- Test the newly defined function
Listing 1. Java code to support the bigintToHexString function
// Class supporting Derby Java Stored Procedures and Functions
import java.sql.*;
public class derbyJavaUtils
{
// bigintToHexString: converts a BIGINT value to a Hexadecimal String
public static String bigintToHexString(long myBigint)
{
return Long.toHexString(myBigint);
}
}
|
Listing 2. SQL syntax to install a JAR file to the database
Format: CALL SQLJ.install_jar( 'jarFilePath', 'qualifiedJarName', 0)
Example: CALL sqlj.install_jar( 'derbyJavaUtils.jar','APP.derbyJavaUtils',0 )
NOTE: In this example the file 'derbyJavaUtils.jar' is in the Derby default directory, derby.system.home.
The filepath will needed to be specified if the jarfile is not located in derby.system.home.
|
Listing 3. Setting the database classpath to include the JAR file
Format: Call SYSCS_UTIL.SYSCS_GET_DATABASE_PROPERTY('KEY','VALUE')
Example: CALL SYSCS_UTIL.SYSCS_SET_DATABASE_PROPERTY (
'derby.database.classpath', 'APP.derbyJavaUtils')
|
Listing 4. Defining the function to the database
Example: CREATE FUNCTION app.bigintToHexString(hexString bigint)
RETURNS VARCHAR(16)
PARAMETER STYLE JAVA NO SQL
LANGUAGE JAVA
EXTERNAL NAME 'derbyJavaUtils.bigintToHexString'
|
Listing 5. Testing (using) the function
Example: select CONGLOMERATENAME, bigintToHexString(CONGLOMERATENUMBER)
from SYS.SYSCONGLOMERATES
|
Q: How do I find the filename for a datatable?
A:
The system table SYSCONGLOMERATES references the physical files associated with the data objects in a Derby database.
The filename is constructed as a hexadecimal string representation of the conglomerate number. To list the file names for all user tables in the
tables in a database create the bigintToHexString function shown in the Java function FAQ and execute the following query:
select TABLENAME, 'C' || bigintToHexString(CONGLOMERATENUMBER) || '.dat' from SYS.SYSCONGLOMERATES a, SYS.SYSTABLES b where a.TABLEID = b.TABLEID AND b.TABLETYPE <> 'S' ; |
CAUTION: Never change or delete a datafile for any reason or unrecoverable corruption will occur. The Derby DBMS engine is the only program that should ever access or change the contents of a Derby datafile.
Q: How do I set up my development environment for Derby?
A: Set the Derby properties shown below in development and test environments so the Derby error log is not automatically removed and all errors and warnings are recorded. Derby is no longer a zero admin tem when these settings are used, so you will need to watch the size of the logfile. Never use these properties in a production environment, where zero administration is important.
The simplest way to set these properties is by creating a file named
derby.properties in the derby.system.home directory (usually
this is the same directory that contains the Derby errorlog
logfile). Alternately you can specify each property using the
-D
parameter on the java command line. You will need to restart
Derby for the properties to take effect. Below is a commented
properties file for use in development and test environments:
Listing 6. Properties file for use in development and test environments
# Top of Sample File
# This is a sample Derby properties file provided to show
# setting properties which are useful for development and
# troubleshooting
#
# -- Append to the log file rather than overwriting it
# - manually remove the derby.log when it becomes large
derby.infolog.append=true
#
# -- Log all errors/messages of any severity (will list deadlocks)
derby.stream.error.logSeverityLevel=0
#
# -- Log all lock timeouts with additional trace information
derby.locks.monitor=true
#
# - List transaction information on victim and survivor transactions # for both lockTimeouts
# for both lockTimeouts and Deadlocks
derby.locks.deadlockTrace=true
# End of Sample File
|
Q: How can I avoid OutOfMemory exceptions when running Derby?
A: The following sections provide background on Derby memory use and provide one approach to sizing Derby and the JVM to avoid OutOfMemory exceptions.
Factors relating to Derby memory use
If you are unhappy with Derby performance or are encountering
OutOfMemory exceptions (OOMs), the first thing to try is adjusting the
Derby
pageCache
. The memory you can safely allocate to
pageCache
depends on many factors; physical memory, average
pageSize
, and JVM maximum Heap size (
maxHeap
) are the main factors. The basic calculation is to compute the number of
pages to allocate to the
pageCache
as a ratio of the JVM maximum Heap size.
The JVM maximum Heap Size
The JVM maximum Heap size (
maxHeap
) is set using the java parameter
-Xmx#m
, where
#
is the maximum number of Mb to allocate to the heap (for example,
java -Xmx128m
for a 128 Mb
maxHeap
). Be sure that the JVM maximum heap size specified is less than the
total physical memory of the machine.
Relating JVM MaxHeap to Derby PageCache
Tests in-house using Cloudscape 4.0 (a precursor of Derby) and Java 1.4.0 proved it to be safe
to jump to a
heapSize
to Cloudscape
pageCache
ratio of 20:1. This ratio supported an active application and many
applications work well with smaller ratios. Applications that handle a
larger number of transactions per second might experience
OutOfMemoryExceptions with larger JVM
maxHeap
to Cloudscape
pageCache
ratios. If you are experiencing OOMs, we recommend you perform stress
testing to determine the optimal ratio for your tem. Start with a JVM
to
pageCache
ratio of 20:1 and perform long running stress tests to determine when
memory utilization stabilizes. If no OOMs are encountered, perform
additional tests with smaller ratios until an acceptable throughput is
achieved or until the application encounters OutOfMemory exceptions.
Light weight applications can run with a ratio of 2.5:1. The
Derby default with Version 5.1 (1000
pageCache
[= 4 MB cache] and a 64K
maxHeap
) is a 16:1 ratio and works well for most applications. The best ratio to
use is one that provides adequate performance without encountering OOMs.
PageCache: Doing the math
The size of the Derby
pageCache
is determined by the average pagesize used in the database and the
setting of the property
derby.storage.pageCacheSize
.
pageCache
specifies the maximum number of pages held in the Derby page
cache. To estimate the amount of memory the
pageCache
will use, multiply the
pageCache
by the average page size in the database. (See the documentation on the
derby.storage.pageSize
property for more information on page sizes.) If you use the default
pagesize of 4K throughout all your tables in the database, this
calculation is simply (pgSz = 4k) x (
pageCache
setting).
JVM Heap to PageCache Ratio: Doing the math
The default
heap:pageCache
ratio (if neither
maxHeap
nor
pageCache
is set) for Cloudscape Version 4.0 running in a V1.4 JVM and below is
400:1 (64M / ( (40*4K)/1000). With Version 5.0 the default,
pageCache
size was increased from 40 to 1000 pages, thus changing the default ratio
to 16:1 (64M / ( (1000*4)/1000 ) ). This ratio may be too small for
active applications, and OOMs could occur using the Version 5.0 default.
Q: How do I verify the integrity of all my tables and indexes?
A: Derby's transaction logging and recovery system protects against machine and database crashes. If, however, there are hardware problems (e.g. I/O system failure) or the database was dual-booted it is advisable to verify the internal consistency of the tables and indexes. Derby supplies the system function SYSCS_UTIL.SYSCS_CHECK_TABLE to check the consistency of a tables indexes against the base table. This function can be executed as part of a query against the data dictionary to check each user table in the database. If a problem is found the query needs to be reexecuted excluding the problem table or tables until the query completes without an exception. The following steps describe this process:
- Run the following query using IJ or another query mechanism
SELECT schemaname || '.' || tablename as TableName, SYSCS_UTIL.SYSCS_CHECK_TABLE(schemaname, tablename) AS OK FROM sys.sysschemas s, sys.systables t WHERE s.schemaid = t.schemaid and t.tabletype = 'T';
- When a problem is found the query aborts without executing on the remaining tables. Note the name of the table listed in the
exception. If the problem is with an index, also note the index name. Rerun the query excluding the problem table(s).
SELECT schemaname ||'.' || tablename as TableName, SYSCS_UTIL.SYSCS_CHECK_TABLE(schemaname, tablename) AS OK FROM sys.sysschemas s, sys.systables t WHERE s.schemaid = t.schemaid and t.tabletype = 'T' and tablename NOT IN ("<tableX>");
- If further problems are found continue to add to the
NOT INlist until the query completes without an exception. the clause might look something like thistablename NOT IN ("CITIES","FLIGHTS"). When run against the demo database toursdb the output would look like this:ij> SELECT schemaname ||'.' || tablename as TableName, SYSCS_UTIL.SYSCS_CHECK_TABLE(schemaname, tablename) AS OK FROM sys.sysschemas s, sys.systables t WHERE s.schemaid = t.schemaid and t.tabletype = 'T' and tablename NOT IN ("CITIES","FLIGHTS"); TABLENAME |OK -------------------------------- -------------------- APP.AIRLINES |1 APP.COUNTRIES |1 APP.FLIGHTAVAILABILITY |1 APP.MAPS |1 APP.FLIGHTS_HISTORY |1 5 rows selected
Index problems reported by SYSCS_CHECK_TABLE can be corrected by dropping and recreating the indexes.
If problems are found in the datatables the database will need to be restored from the last backup.
Q: What is new in IBM Cloudscape, Version 10.1?
A: IBM Cloudscape 10.1 includes the Apache Derby 10.1 release and is the second commercial release from IBM of Apache Derby. In addition to all the benefits and features inherent in Version 10.0, Version 10.1 contains:
-
Support for JSR 169, the JDBC API defined for the Connected Device
Configuration (CDC) / Foundation Profile (FP). Use the
org.apache.derby.jdbc.EmbeddedSimpleDataSourceclass when running in a Java 2 Platform, Micro Edition (J2ME) CDC/FP environment. - The Derby Network Client, an open source client library for use with the Derby Network Server. Network Server, Version 10.1, continues to support use of the IBM DB2 JDBC Universal Driver, but the Derby Network Client will be the standard client library going forward.
-
XA distributed transaction support when using the Derby Network
client. Use the
org.apache.derby.client.jdbc.ClientXADataSourceclient class when implementing Derby in a J2EE Resource manager in a distributed J2EE environment implementing the Java Transaction API (JTA). -
Soft Upgrade: the ability to access a database created by
Derby 10.0 using version 10.1 without specifying
upgrade=trueand still be able to use the older version to access the database. Certain new features will not be available, however, until a hard upgrade is performed. - Support for the JDBC 3.0 Updatable ResultSet API
- SQL synonyms
- Functions for performing SQL Date/Time arithmetic
- SQL intercept and except
- Ability to import or insert values into identity columns
- Run in place compress table
- The Migration Tool to assist in upgrading from Cloudscape, Version 5, to IBM Cloudscape, Version 10.x (available for download from developerWorks at the end of August 2005). (See Resources .)
Q: Why was a new client driver library released with version 10.1?
A: The Derby Network Client (derbyclient.jar) is the new client-side library for the Derby Network Server. Like the DB2 JDBC Universal Driver, it provides access to Derby databases with the Derby Network Server from a remote machine or process. Unlike its predecessor, the Network Client is open source code and is better integrated with the Derby Network Server. Some of the improvements are:
- A username and password is only required if authentication is being used.
- Special quoting is no longer required to escape the slash and colon characters used in some database paths.
- The URL connection attributes are position independent and use a single delimiter (semicolon).
- There is no separate license for this client library (like the rest of the Apache Derby software, it is available under the Apache license).
Version 10.1 of the Derby Network Server supports both the Derby Network
Client and the DB2 JDBC Universal Driver. Going forward, enhancements to
Network Server will require the use of the Derby Network Client library.
For example, only the Network Client contains the Network Server XA
datasource,
org.apache.derby.client.jdbc.ClientXADataSource
.
Switching from using the Universal Driver to the Network Client will require:
- Distributing the derbyclient.jar file and placing it in the CLASSPATH, instead of db2jcc.jar and db2jcc_license_c.jar.
- Updating the application to load the new driver (org.apache.derby.jdbc.ClientDriver) .
-
Altering the database connection URLs to the simpler syntax:
jdbc:derby://<server>[:<port>]/<databaseName> [;<URL attribute>=<value>[;...]]
Q: What is IBM Cloudscape, Version 10.0?
A: IBM Cloudscape 10.0, a commercial release from IBM of Apache Derby, is a small-footprint, multiuser, standards-driven, relational database tem written entirely in Java that is well suited for embedding in Java applications and servers. The core functionality of Derby and IBM Cloudscape is the same as what is referred to here as Derby. Derby runs in any fully functional JVM, thus the same application will run unaltered on many hardware platforms.
Derby supports the SQL-92E (entry) standard and part of SQL-99. The Derby API is JDBC. Derby requires no administration (zero admin); so there is no need to install or manage the database separate from the application. Install it and forget it, is the key goal for a successful, embeddable component.
Learn
-
Cloudscape Technical Support Technotes
: Search available support for Cloudscape.
-
Cloudscape technical overview
(developerworks, August 2005) highlights the Derby embeddable SQL
database engine for Java technology and provides you with an overview of
Derby technical features.
-
Cloudscape technical resource center
on developerWorks: Access information about IBM Cloudscape.
-
Cloudscape and ODBC
(developerWorks, August 2005) discusses platforms, tem
requirements, and information about where to download the free driver.
-
Apache Derby Project
home page.
-
Derby Functions and Procedures
-
Java Security Manager Policy File Syntax
-
Java Security Specification: Permissions
-
Use Derby in a J2EE Server environment
-
developerWorks DB2 zone
: Learn more about DB2. You'll find technical documentation, how-to
articles, education, downloads, product information, and more.
Get products and technologies
-
Migration Tool
: Assists in upgrading from Cloudscape, Version 5, to IBM Cloudscape,
Version 10.x (available for download from developerWorks at the end of
August 2005).
-
IBM DB2 JDBC Universal Driver
for the Apache Derby Network Server: Provided with Cloudscape 10.0;
however, it is not included with the Derby Network Server. It can be
downloaded from this page, free.
-
IBM DB2 plug-ins for Eclipse, Version 1.0.2 Beta
(developerWorks, August 2004) discusses IBM DB2 plug-ins for
Eclipse and provides links for downloading the plug-ins. (
Note:
A smaller, improved version of this tool will be made available soon
after version 10.1 is released.)
-
Download free modules
- Cloudscape and related software from the Cloudscape download page.
-
Technical support
for IBM Cloudscape is available to individuals and organizations that
require guaranteed response and resolution to issues (fee-based).
Discuss
- Participate in the discussion forum.
-
developerWorks blogs
: Get involved in the developerWorks community.
Stanley Bradbury is currently working as Community Coordinator with the Cloudscape group at IBM. Before this he provided third level support to Cloudscape customers (2001 to 2005). Prior to working with Cloudscape he managed databases for companies in the Biotech, Manufacturing and Internet Security Service industries. He is a graduate of the University of California at Berkeley and enjoys spending time with his family.





