Developing with Apache Derby -- Hitting the Trifecta: Introduction to Apache Derby

Jump in and try it -- and appreciate the simplicity

At some point, almost every application developer confronts the need to save data. With the growth of Internet- or Web-enabled applications, this need has become even more acute. This installment of the regular column "Developing with Apache Derby -- Hitting the Trifecta" introduces Apache Derby -- an open source, standards-based, small-footprint Java database system -- compares it to other database systems, and discusses issues related to downloading and installing it. By the end of this article, you'll be ready to start developing database applications using Derby.


Robert Brunner, NCSA Research Scientist, Assistant Professor of Astronomy, University of Illinois, Urbana-Champaign

Robert J. Brunner is a research scientist at the National Center for Supercomputing Applications and an assistant professor of astronomy at the University of Illinois, Urbana-Champaign. He has published several books and a number of articles and tutorials on a range of topics.



14 February 2006


The Apache Derby project

This article is the first in a new series, "Developing with Apache Derby -- Hitting the Trifecta," which is devoted to exploring the software technology developed by the Apache Derby project. The software released by the Apache Derby project is an open source database system based on technology donated to the Apache Software Foundation by IBM. The Apache Derby database software is written in the Java™ programming language, so it's highly portable but still provides remarkable performance in a small package.

The Derby database also implements a number of database standards, so it's easy to either begin using Derby if you already have database experience or port existing Derby database applications to other standards-compliant database systems if the need arises. Because Derby was officially released less than a year ago, there's a shortage of useful information. IBM developerWorks is filling this void with a number of articles and tutorials. This series focuses on users who are less experienced with database systems. Other articles on the developerWorks Web site provide more advanced introductions to the Apache Derby database software and information about how it can be integrated into the Java Enterprise software stack.

In keeping with the spirit of this series, this article first takes a brief side trip to explore database systems in general before discussing Apache Derby in more detail.


A brief digression on database systems

Whether you realize it or not, as you surf the Internet you're interacting with a variety of database-backed Web applications. This nomenclature may be unfamiliar, but it simply means that a Web site you visit is dynamically created using data saved in a database. To demonstrate, consider the following types of Web sites that you may visit:

  • An information portal, like the developerWorks Open source project area shown in Figure 1
  • A newspaper Web site to catch up on the local news or sports
  • A financial Web site, like that of a bank or investment institution, to monitor your financial portfolio
  • A map Web site to find driving directions
  • A search engine where you can identify interesting Web sites for more detailed information on a subject
Figure 1. The developerWorks Open source project area

Each of these examples uses databases to store, locate, and retrieve information dynamically. In each of these applications, the Web site collects necessary information from the user (such as a street address), queries the application database, and collects the data that has been requested into a suitable visual result.

Many of these database systems are large and complex -- imagine holding all the map information needed to provide accurate driving directions with pictures! Clearly, storing data and making it available to applications is a big task, one that has been addressed by a number of vendors, including IBM with IBM DB2® and Microsoft® with Microsoft SQL Server. These commercial database systems provide full, enterprise-class capabilities. As a result, they can hold enormous quantities of data, concurrently interact with a large number of users, and scale across several large computational systems.

Database roles

As you might expect, working with these systems isn't trivial, and they can be expensive to operate. Historically, the tasks involved in working with these databases have been divided into three categories. Although the roles sometimes overlap, their individual responsibilities are easy to comprehend:

  1. Database administrator (DBA) -- Responsible for the overall operation of the database system, which includes the selection and layout of the underlying hardware, the installation and optimization of the database server (especially given the hardware being used), and the day-to-day operations of the database server, such as data backup and recovery.
  2. Database developer -- Responsible for the actual databases in operation, including designing databases, schemas, tables, table relationships, and indexes as well as optimizing queries.
  3. Database application developer -- Responsible for integrating application code with the underlying database by using database application programming interfaces (APIs) like Java Database Connectivity (JDBC) or Open Database Connectivity (ODBC) to store and retrieve data as necessary.

If the previous discussion leaves you feeling intimidated, that's OK -- working with databases has historically been difficult. To understand why, let's examine a specific example in more detail: online banking. When you connect to your bank's Web site, you provide your credentials (most likely a username and password) and thereby gain access to your financial accounts. You can view your data, pay bills, and transfer funds. The database your bank uses must quickly locate the relevant information, safely manage the transactions, securely interact with users, and -- most important -- not lose any data! And the bank must do this for a large number of users simultaneously.

But not all applications are this demanding, especially when you're starting out. If you're just learning to work with databases, or if you want to quickly prototype a database application, most commercial database systems can be cumbersome. Fortunately, using the Apache Derby database to develop database-enabled applications is easier than you might think. The rest of this article provides a basic introduction to the Apache Derby project. Future column installments will demonstrate how to build database applications using the Apache Derby database.


What is the Apache Derby project?

The Apache Derby project is aimed at building an open source database written entirely in the Java programming language that's easy to use but suitable for a majority of applications. As you can imagine, developing a database isn't simple, and the Apache Derby database is no exception (because it's open source software, you can look for yourself). But the Derby project didn't start from scratch. Back in 1996, a new company called Cloudscape, Inc. was founded with the goal of building a database server written in the Java language. The company's first release came a year later, and eventually the product's name was changed to Cloudscape. In 1999, Cloudscape, Inc. was purchased by Informix Software, Inc., a large database vendor. Informix Software was purchased by IBM in 2001, and the IBM Cloudscape™ database system was used as an embedded database engine in a number of IBM's products. In August 2004, IBM donated the Cloudscape database software to the Apache Software Foundation, and the Apache Derby project was born (see Figure 2).

Figure 2. The Apache Derby project Web site

At the time, the Cloudscape database was approximately half a million lines of Java code, which took some time to properly convert to the Apache Derby project. After an incubation period, Derby was officially released in July 2005. So while it may seem like the new kid on the block, Derby comes with nearly ten years of development behind it.

IBM continues to manage the Cloudscape database, which is built from the Apache Derby source code. IBM offers the Cloudscape database as a free download, and also offers fee-based consulting services for clients who want added peace of mind. In addition, Sun Microsystems has announced that it will include a patched version of Apache Derby as its Java DB product. This strong commitment from IBM and Sun emphasizes the bright future of the Apache Derby database. The Derby database also conforms to a number of database standards, such as SQL-92 and JDBC, Version 3.0; thus an application that is initially developed using the Derby database system can be easily ported to another database system, such as IBM DB2 Universal Database™.

An overview of the Apache Derby database

Apache Derby is written in the Java language, so it can run anywhere that a suitable Java Virtual Machine (JVM) exists. This means Derby can run on virtually any operating system, including the Microsoft Windows®, Macintosh, Linux®, and UNIX® platforms. Derby can also run on any of the three Java platforms: Java 2 Platform, Micro Edition (J2ME); Java 2 Platform, Standard Edition (J2SE); and Java 2 Platform, Enterprise Edition (J2EE). The Derby software is bundled in a Java Archive (JAR) file that's only 2 MB in size. Given this small footprint, the Derby database can be easily bundled along with an application.

You can use the Derby database in two ways:

  • As an embedded database, in which the user is unaware of the existence of the database. The application and the database run in the same JVM, and the database stores the data on the local file system. In the embedded model, the database communicates only with an application that is running in the same JVM.
  • As a client-server database, the more traditional model used by many commercial vendors. In this model, the application communicates with the database over a network connection, and the application and the database operate in separate JVMs. The database server can communicate with multiple client applications.
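From the application's point of view, the choice between these two models shows up mainly in the JDBC connection URL. The following is a minimal sketch of the two URL shapes, not a complete connection example. The class name DerbyUrls, the database name sampleDB, and the host localhost are hypothetical names chosen for illustration; port 1527 is the default port of the Derby network server.

```java
/**
 * Sketch: builds the JDBC URL for each of Derby's two usage models.
 * DerbyUrls, "sampleDB", and "localhost" are illustrative assumptions.
 */
final class DerbyUrls {
    private DerbyUrls() {}

    /** Embedded model: the database engine runs inside the application's JVM. */
    static String embedded(String dbName) {
        return "jdbc:derby:" + dbName;
    }

    /** Client-server model: the application reaches a Derby server over the network. */
    static String clientServer(String host, int port, String dbName) {
        return "jdbc:derby://" + host + ":" + port + "/" + dbName;
    }

    public static void main(String[] args) {
        // Print both URL forms so they can be compared side by side.
        System.out.println(embedded("sampleDB"));
        System.out.println(clientServer("localhost", 1527, "sampleDB"));
    }
}
```

Either string would be passed to java.sql.DriverManager.getConnection once the matching Derby driver is on the classpath; the rest of the JDBC code can stay the same in both models.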

Downloading Apache Derby

To appreciate the simplicity of working with Apache Derby, the best technique is to jump in and try it. The rest of this section provides general instructions for downloading and verifying your version of Apache Derby (see Resources for a link to the official Apache Derby Web site for instructions specific to your operating system). These instructions assume you have a suitable Java Runtime Environment (JRE) successfully installed. Any JRE version 1.3 or higher should be sufficient, but this article series uses Java 1.4.2 or higher.
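If you're unsure which Java version is installed, one option is to ask the JRE itself. The sketch below reads the standard java.specification.version system property and checks it against the 1.4 level this series uses. The class name JreCheck and the method isAtLeast14 are hypothetical names for this illustration.

```java
/**
 * Sketch: checks whether the running JRE is at the 1.4 level or newer.
 * JreCheck and isAtLeast14 are illustrative names, not part of Derby.
 */
final class JreCheck {
    private JreCheck() {}

    /** Accepts strings such as "1.3", "1.4", or "17" and compares them with 1.4. */
    static boolean isAtLeast14(String specVersion) {
        String[] parts = specVersion.split("\\.");
        int major = Integer.parseInt(parts[0]);
        if (major > 1) {
            return true; // newer runtimes report a single number such as "11" or "17"
        }
        int minor = parts.length > 1 ? Integer.parseInt(parts[1]) : 0;
        return major == 1 && minor >= 4;
    }

    public static void main(String[] args) {
        String version = System.getProperty("java.specification.version");
        System.out.println("Java specification version: " + version);
        System.out.println("Suitable for this series: " + isAtLeast14(version));
    }
}
```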

With those prerequisites out of the way, the first step is to download Apache Derby. As shown in Figure 3, you can download three different versions: source, library, and binary. The source version is just that: the source code. To use this version, you must compile the source code and build your own .jar file. The library version includes only the necessary .jar file for the Derby database. The binary version includes the .jar file and the Derby documentation.

For simplicity's sake, download the binary version. Be sure to verify the integrity of your download. This includes verifying the PGP (Pretty Good Privacy) signature, which confirms that you downloaded the official version, and comparing the MD5 (Message-Digest algorithm 5) checksum, which confirms that your download files weren't corrupted in transit.
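The MD5 comparison can be done with any checksum utility, but it can also be sketched in a few lines of Java using the standard java.security.MessageDigest class. In the sketch below, the class name Md5Check and its command-line arguments (the archive path and the published checksum) are assumptions made for illustration. Note that an MD5 match only indicates the file wasn't accidentally damaged; the PGP signature is what ties the file to its publisher.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/**
 * Sketch: computes an MD5 checksum and compares it with a published value.
 * Md5Check and its arguments are illustrative, not part of Derby.
 */
final class Md5Check {
    private Md5Check() {}

    /** Returns the MD5 digest of the data as a lowercase hexadecimal string. */
    static String md5Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(data);
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on the Java platform, so this is unexpected.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    /** Compares the computed checksum with the published one, ignoring case and whitespace. */
    static boolean matches(byte[] data, String publishedHex) {
        return md5Hex(data).equalsIgnoreCase(publishedHex.trim());
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to the downloaded archive; args[1]: checksum from the download page
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(matches(data, args[1]) ? "Checksum matches" : "Checksum MISMATCH");
    }
}
```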

Figure 3. Downloading the Apache Derby database

Installing Apache Derby

After you've successfully downloaded and verified the integrity of the archive containing the Derby database files, the installation is simple (although mildly platform dependent):

  1. Choose a suitable location, such as C:\Apache on a Windows system or /opt/Apache on a UNIX-based system.
  2. Open a terminal window (or a command prompt on Windows), change to this new directory, and expand the archive containing the Derby database. Doing so creates a new directory named with the version of the Derby database you installed, for example, db-derby-10.1.2.1-bin.
  3. Add the Derby .jar files to your CLASSPATH environment variable. If you're comfortable working at the command prompt, you can do this directly by adding derby.jar and derbytools.jar (which contains the sysinfo tool used in the next step), both located in the db-derby-10.1.2.1-bin/lib subdirectory, to the CLASSPATH variable. Alternatively, you can run the setEmbeddedCP script provided in the db-derby-10.1.2.1-bin/frameworks/embedded/bin subdirectory. Before running this script, you should either set the DERBY_INSTALL environment variable or modify the script so that DERBY_INSTALL points to the directory in which you installed the Derby database.
  4. Verify your installation by using the sysinfo tool, as shown in Listing 1.
Listing 1. Testing the Derby installation with sysinfo
rb$ java org.apache.derby.tools.sysinfo
------------------ Java Information ------------------
Java Version:    1.4.2_09
Java Vendor:     Apple Computer, Inc.
Java home:       /System/Library/Frameworks/JavaVM.framework/Versions/1.4.2/Home
Java classpath:  /opt/Apache/db-derby-10.1.2.1-bin/lib/derby.jar:/opt/Apache/db-derby-10.1.2.1-bin/lib/derbytools.jar:.
OS name:         Mac OS X
OS architecture: ppc
OS version:      10.4.3
Java user name:  rb
Java user home:  /Users/rb
Java user dir:   /opt/Apache/db-derby-10.1.2.1-bin
java.specification.name: Java Platform API Specification
java.specification.version: 1.4
--------- Derby Information --------
JRE - JDBC: J2SE 1.4.2 - JDBC 3.0
[/opt/Apache/db-derby-10.1.2.1-bin/lib/derby.jar] 10.1.2.1 - (330608)
[/opt/Apache/db-derby-10.1.2.1-bin/lib/derbytools.jar] 10.1.2.1 - (330608)
[/opt/Apache/db-derby-10.1.2.1-bin/lib/derby.jar] 10.1.2.1 - (330608)
[/opt/Apache/db-derby-10.1.2.1-bin/lib/derbytools.jar] 10.1.2.1 - (330608)
------------------------------------------------------
----------------- Locale Information -----------------
------------------------------------------------------

At this point, you have a working database system, which you can run either as a stand-alone network server or as an embedded database within your own application. Future column installments will show you how to run Derby in both models and provide additional database insights.
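The classpath setup can also be checked from your own Java code, which is sometimes handy when an application fails to find the database at run time. The sketch below simply tries to load a class by name. The class name DerbyClasspathCheck is hypothetical; org.apache.derby.jdbc.EmbeddedDriver is the embedded JDBC driver class shipped in derby.jar, and org.apache.derby.tools.sysinfo is the tool run in Listing 1.

```java
/**
 * Sketch: reports whether Derby's classes are visible on the classpath.
 * DerbyClasspathCheck is an illustrative name, not part of Derby.
 */
final class DerbyClasspathCheck {
    /** Embedded JDBC driver class, found in derby.jar. */
    static final String EMBEDDED_DRIVER = "org.apache.derby.jdbc.EmbeddedDriver";
    /** The sysinfo tool class used in Listing 1. */
    static final String SYSINFO_TOOL = "org.apache.derby.tools.sysinfo";

    private DerbyClasspathCheck() {}

    /** Returns true if the named class can be loaded from the current classpath. */
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("Embedded driver found: " + isOnClasspath(EMBEDDED_DRIVER));
        System.out.println("sysinfo tool found:    " + isOnClasspath(SYSINFO_TOOL));
    }
}
```

If either line reports false, the corresponding .jar file is most likely missing from the CLASSPATH variable.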

Using Apache Derby

Working with a database system doesn't have to be difficult. By using Apache Derby, you can quickly begin working with a full-featured database system. And because Apache Derby is a standards-compliant database, applications developed using it can be easily migrated to a more powerful database system as the situation warrants.

Resources

Learn

Get products and technologies

Discuss
