The Apache Derby project
This article is the first in a new series, "Developing with Apache Derby -- Hitting the Trifecta," which is devoted to exploring the software technology developed by the Apache Derby project. The software released by the Apache Derby project is an open source database system based on technology donated to the Apache Software Foundation by IBM. The Apache Derby database software is written in the Java™ programming language, so it's highly portable but still provides remarkable performance in a small package.
The Derby database also implements a number of database standards, so it's easy to either begin using Derby if you already have database experience or port existing Derby database applications to other standards-compliant database systems if the need arises. Because Derby was officially released less than a year ago, there's a shortage of useful information. IBM developerWorks is filling this void with a number of articles and tutorials. This series focuses on users who are less experienced with database systems. Other articles on the developerWorks Web site provide more advanced introductions to the Apache Derby database software and information about how it can be integrated into the Java Enterprise software stack.
In keeping with the spirit of this series, this article first takes a brief side trip to explore database systems in general before discussing Apache Derby in more detail.
A brief digression on database systems
Whether you realize it or not, as you surf the Internet you're interacting with a variety of database-backed Web applications. This nomenclature may be unfamiliar, but it simply means that a Web site you visit is dynamically created using data saved in a database. To demonstrate, consider the following types of Web sites that you may visit:
- An information portal, like the developerWorks Open source project area shown in Figure 1
- A newspaper Web site to catch up on the local news or sports
- A financial Web site, like that of a bank or investment institution, to monitor your financial portfolio
- A map Web site to find driving directions
- A search engine where you can identify interesting Web sites for more detailed information on a subject
Figure 1. The developerWorks Open source project area
Each of these examples uses databases to store, locate, and retrieve information dynamically. In each of these applications, the Web site collects the necessary information from the user (such as a street address), queries the application database, and formats the requested data into a suitable visual result.
Many of these database systems are large and complex -- imagine holding all the map information needed to provide accurate driving directions with pictures! Clearly, storing data and making it available to applications is a big task, one that has been addressed by a number of vendors, including IBM with IBM DB2® and Microsoft® with Microsoft SQL Server. These commercial database systems provide full, enterprise-class capabilities. As a result, they can hold enormous quantities of data, concurrently interact with a large number of users, and scale across several large computational systems.
As you might expect, working with these systems isn't trivial, and they can be expensive to operate. Historically, the tasks involved in working with these databases have been divided into three categories. Although the roles sometimes overlap, their individual responsibilities are easy to comprehend:
- Database administrator (DBA) -- Responsible for the overall operation of the database system, which includes the selection and layout of the underlying hardware, the installation and optimization of the database server (especially given the hardware being used), and the day-to-day operations of the database server, such as data backup and recovery.
- Database developer -- Responsible for the actual databases in operation, including designing databases, schemas, tables, table relationships, and indexes as well as optimizing queries.
- Database application developer -- Responsible for integrating application code with the underlying database by using database application programming interfaces (APIs) like Java Database Connectivity (JDBC) or Open Database Connectivity (ODBC) to store and retrieve data as necessary.
If the previous discussion leaves you feeling intimidated, that's OK -- working with databases has historically been difficult. To understand why, let's examine a specific example in more detail: online banking. When you connect to your bank's Web site, you provide your credentials (most likely a username and password) and thereby gain access to your financial accounts. You can view your data, pay bills, and transfer funds. The database your bank uses must quickly locate the relevant information, safely manage the transactions, securely interact with users, and -- most important -- not lose any data! And the bank must do this for a large number of users simultaneously.
But not all applications are this demanding, especially when you're starting out. If you're just learning to work with databases, or if you want to quickly prototype a database application, most commercial database systems can be cumbersome. Fortunately, using the Apache Derby database to develop database-enabled applications is easier than you might think. The rest of this article provides a basic introduction to the Apache Derby project. Future column installments will demonstrate how to build database applications using the Apache Derby database.
What is the Apache Derby project?
The Apache Derby project is aimed at building an open source database written entirely in the Java programming language that's easy to use but suitable for a majority of applications. As you can imagine, developing a database isn't simple, and the Apache Derby database is no exception (because it's open source software, you can look for yourself). But the Derby project didn't start from scratch. Back in 1996, a new company called Cloudscape, Inc. was founded with the goal of building a database server written in the Java language. The company's first release came a year later, and eventually the product's name was changed to Cloudscape. In 1999, Cloudscape, Inc. was purchased by Informix Software, Inc., a large database vendor. Informix Software was purchased by IBM in 2001, and the IBM Cloudscape™ database system was used as an embedded database engine in a number of IBM's products. In August 2004, IBM donated the Cloudscape database software to the Apache Software Foundation, and the Apache Derby project was born (see Figure 2).
Figure 2. The Apache Derby project Web site
At the time, the Cloudscape database was approximately half a million lines of Java code, and converting it properly into an Apache project took some time. After an incubation period, Derby was officially released in July 2005. So while it may seem like the new kid on the block, Derby comes with nearly ten years of development behind it.
IBM continues to manage the Cloudscape database, which is built from the Apache Derby source code. IBM offers the Cloudscape database as a free download, and also offers fee-based consulting services for clients who want added peace of mind. In addition, Sun Microsystems has announced that it will include a patched version of Apache Derby as its Java DB product. This strong commitment from IBM and Sun emphasizes the bright future of the Apache Derby database. The Derby database also conforms to a number of database standards, such as SQL-92 and JDBC, Version 3.0; thus an application that is initially developed using the Derby database system can be easily ported to another database system, such as IBM DB2 Universal Database™.
An overview of the Apache Derby database
Apache Derby is written in the Java language, so it can run anywhere that a suitable Java Virtual Machine (JVM) exists. This means Derby can run on virtually any operating system, including the Microsoft Windows®, Macintosh, Linux®, and UNIX® platforms. Derby can also run on any of the three Java platforms: Java 2 Platform, Micro Edition (J2ME); Java 2 Platform, Standard Edition (J2SE); and Java 2 Platform, Enterprise Edition (J2EE). The Derby software is bundled in a Java Archive (JAR) file that's only 2 MB in size. Given this small footprint, the Derby database can be easily bundled along with an application.
You can use the Derby database in two ways:
- As an embedded database, in which the user is unaware of the existence of the database. The application and the database run in the same JVM, and the database stores the data on the local file system. In the embedded model, the database communicates only with an application that is running in the same JVM.
- As a client-server database, the more traditional model used by many commercial vendors. In this model, the application communicates with the database over a network connection, and the application and the database operate in separate JVMs. The database server can communicate with multiple client applications.
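To make the two models concrete, here is a minimal sketch of how the JDBC connection URL typically differs between them. The class name DerbyUrls, the database name demoDB, and the host localhost are illustrative assumptions; 1527 is the default port of the Derby Network Server. The sketch assumes Java 7 or later (for try-with-resources) and a Derby release whose JDBC driver registers itself automatically. With the Java 1.4.2 setup used in this series, you would first load the driver explicitly, for example with Class.forName("org.apache.derby.jdbc.EmbeddedDriver").

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Hypothetical helper class, used here only for illustration.
class DerbyUrls {

    // Embedded model: the database engine runs inside this JVM and
    // stores its data on the local file system.
    static String embedded(String dbName, boolean create) {
        return "jdbc:derby:" + dbName + (create ? ";create=true" : "");
    }

    // Client-server model: the application talks over the network to a
    // Derby Network Server that runs in a separate JVM.
    static String client(String host, int port, String dbName, boolean create) {
        return "jdbc:derby://" + host + ":" + port + "/" + dbName
                + (create ? ";create=true" : "");
    }

    public static void main(String[] args) {
        // Use the URL given on the command line, or fall back to an
        // embedded database named demoDB (an assumed name).
        String url = args.length > 0 ? args[0] : embedded("demoDB", true);
        try (Connection conn = DriverManager.getConnection(url)) {
            System.out.println("Connected to "
                    + conn.getMetaData().getDatabaseProductName());
        } catch (SQLException e) {
            // Typical causes: the Derby .jar files are not on the classpath,
            // or no network server is listening at the given host and port.
            System.out.println("Could not connect to " + url + ": " + e.getMessage());
        }
    }
}
```

Apart from the URL (and the driver .jar file on the classpath, derby.jar for the embedded model or derbyclient.jar for the network client), the application code stays the same in both models.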
Downloading Apache Derby
To appreciate the simplicity of working with Apache Derby, the best technique is to jump in and try it. The rest of this section provides general instructions for downloading and verifying your version of Apache Derby (see Resources for a link to the official Apache Derby Web site for instructions specific to your operating system). These instructions assume you have a suitable Java Runtime Environment (JRE) successfully installed. Any JRE version higher than 1.3 should be sufficient, but this article series uses Java 1.4.2 or higher.
With those prerequisites out of the way, the first step is to download Apache Derby. As shown in Figure 3, you can download three different versions: source, library, and binary. The source version is just that: the source code. To use this version, you must compile the source code and build your own .jar file. The library version includes only the necessary .jar file for the Derby database. The binary version includes the .jar file and the Derby documentation.
For simplicity's sake, download the binary version. Be sure to verify the integrity of your download; this includes verifying the PGP (Pretty Good Privacy) signature, which confirms that the archive was published by the Apache Derby project, and verifying the MD5 (Message-Digest algorithm 5) checksum, which detects whether the downloaded file was corrupted in transit.
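If no checksum utility is installed on your system, the MD5 comparison can be sketched in a few lines of Java. This is only an illustration: the class name Md5Check and its command-line arguments are assumptions, the expected value is whatever the download page publishes for your archive, and the code assumes Java 7 or later. It checks for corruption only; it is not a substitute for the PGP signature check.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class, used here only for illustration.
class Md5Check {

    // Wraps the checked exception so callers need not handle it.
    static MessageDigest newMd5() {
        try {
            return MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    // Converts the raw digest bytes to lowercase hexadecimal text.
    static String toHex(byte[] digest) {
        StringBuilder sb = new StringBuilder(digest.length * 2);
        for (byte b : digest) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Digest of an in-memory byte array.
    static String md5Hex(byte[] data) {
        return toHex(newMd5().digest(data));
    }

    // Digest of a stream, read in chunks so large archives fit in memory.
    static String md5Hex(InputStream in) throws IOException {
        MessageDigest md = newMd5();
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            md.update(buf, 0, n);
        }
        return toHex(md.digest());
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java Md5Check <archive> <expected-md5>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            String actual = md5Hex(in);
            System.out.println(actual.equalsIgnoreCase(args[1].trim())
                    ? "MD5 matches"
                    : "MD5 MISMATCH, computed " + actual);
        }
    }
}
```

You would run it with the path to the downloaded archive and the published checksum as the two arguments.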
Figure 3. Downloading the Apache Derby database
Installing Apache Derby
After you've successfully downloaded and verified the integrity of the archive containing the Derby database files, the installation is simple (although mildly platform dependent):
- Choose a suitable location, such as C:\Apache on a Windows system or /opt/Apache on a UNIX-based system.
- Open a terminal window (or a command prompt on Windows), change to this new directory, and expand the archive containing the Derby database. Doing so creates a new directory named with the version of the Derby database you installed, for example, db-derby-10.1.2.1-bin.
- Add the Derby .jar file to your CLASSPATH environment variable. If you're comfortable working at the command prompt, you can do this directly by adding derby.jar, which is located in the db-derby-10.1.2.1-bin/lib subdirectory, to the CLASSPATH variable. Alternatively, you can run the setEmbeddedCP script provided in the db-derby-10.1.2.1-bin\frameworks\embedded\bin subdirectory. Before running this script, you should either set the DERBY_INSTALL environment variable or modify the script so that DERBY_INSTALL points to the directory in which you installed the Derby database.
- Verify your installation by using the sysinfo tool, as shown in Listing 1.
Listing 1. Testing the Derby installation with sysinfo
    rb$ java org.apache.derby.tools.sysinfo
    ------------------ Java Information ------------------
    Java Version: 1.4.2_09
    Java Vendor: Apple Computer, Inc.
    Java home: /System/Library/Frameworks/JavaVM.framework/Versions/1.4.2/Home
    Java classpath: /opt/Apache/db-derby-10.1.2.1-bin/lib/derby.jar:/opt/Apache/db-derby-10.1.2.1-bin/lib/derbytools.jar:.
    OS name: Mac OS X
    OS architecture: ppc
    OS version: 10.4.3
    Java user name: rb
    Java user home: /Users/rb
    Java user dir: /opt/Apache/db-derby-10.1.2.1-bin
    java.specification.name: Java Platform API Specification
    java.specification.version: 1.4
    --------- Derby Information --------
    JRE - JDBC: J2SE 1.4.2 - JDBC 3.0
    [/opt/Apache/db-derby-10.1.2.1-bin/lib/derby.jar] 10.1.2.1 - (330608)
    [/opt/Apache/db-derby-10.1.2.1-bin/lib/derbytools.jar] 10.1.2.1 - (330608)
    [/opt/Apache/db-derby-10.1.2.1-bin/lib/derby.jar] 10.1.2.1 - (330608)
    [/opt/Apache/db-derby-10.1.2.1-bin/lib/derbytools.jar] 10.1.2.1 - (330608)
    ------------------------------------------------------
    ----------------- Locale Information -----------------
    ------------------------------------------------------
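If sysinfo cannot be started at all, the usual cause is a CLASSPATH that does not include the Derby .jar files. The following minimal sketch checks for that from Java. The class name DerbyClasspathCheck is an assumption made for illustration; org.apache.derby.jdbc.EmbeddedDriver is the embedded JDBC driver class shipped in derby.jar for the release line discussed here. The multi-catch syntax assumes Java 7 or later.

```java
// Hypothetical helper class, used here only for illustration.
class DerbyClasspathCheck {

    // Returns true if the named class can be found on the current classpath.
    // The class is located but not initialized, so no database is started.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false,
                    DerbyClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String driver = "org.apache.derby.jdbc.EmbeddedDriver";
        System.out.println(driver + (isOnClasspath(driver)
                ? " found"
                : " NOT found; check your CLASSPATH"));
        // Print the effective classpath to help spot a wrong or missing entry.
        System.out.println("java.class.path = "
                + System.getProperty("java.class.path"));
    }
}
```

This complements sysinfo rather than replacing it: sysinfo also reports the Derby version found in each .jar file.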
At this point, you have a working database system, which you can run either as a stand-alone network server or as an embedded database within your own application. Future column installments will show you how to run Derby in both models and provide additional database insights.
Using Apache Derby
Working with a database system doesn't have to be difficult. By using Apache Derby, you can quickly begin working with a full-featured database system. And because Apache Derby is a standards-compliant database, applications developed using it can be easily migrated to a more powerful database system as the situation warrants.
Resources
- Visit the Apache Derby Project Web site.
- Get more detailed information about how to use the Apache Derby database from the Apache Derby project online manuals.
- Access the Apache Derby project tutorial that details how to download and install Apache Derby.
- Learn how to verify your download.
- Read developerWorks' articles and tutorials about the Cloudscape database, which is built using the Apache Derby code base.
- Check out the developerWorks Apache Derby project area for articles, tutorials, and other resources to help you get started with Derby today.
- Visit the developerWorks Open source zone for extensive how-to information, tools, and project updates to help you develop with open source technologies and use them with IBM's products.
- Browse all the Apache Derby articles and free tutorials available in the developerWorks Open source zone.
- Browse all the Apache articles and free Apache tutorials available in the developerWorks Open source zone.
Get products and technologies
- Download the Apache Derby 10.1.2.1 release.