Level: Intermediate
Dan Jemiolo (danjemiolo@us.ibm.com), Advisory Software Engineer, IBM
11 Sep 2007
The Project Zero development platform includes a data access library that is
easy to use, allowing developers to execute SQL statements from their application
code with minimal configuration. In fact, setting up a database and connecting to it
requires just a four-line configuration file and knowledge of basic SQL, neither of
which should tax the average Web developer. But as simple as database-driven
development is, there are issues surrounding the packaging of database-driven
components that require significant thought: Without the proper design, a Zero
component may drag along dependencies and make assumptions other developers
cannot accept. This article discusses best practices for configuring and packaging
database-driven components so they are highly reusable by other Zero developers.
Editor's note: IBM® WebSphere® sMash and IBM WebSphere sMash Developer Edition are based on the highly acclaimed Project Zero incubator project. Project Zero is the development community for WebSphere sMash and will continue to offer developers a cost-free platform for developing applications with the latest builds, the latest features, and the support of the community.
Before you get started
This article assumes you have downloaded Project Zero and either completed the
"Developing applications with Project Zero" introductory tutorial or written a simple application yourself. You should be familiar with
Zero's data access library (referred to as zero.data) and Apache Ant.
You should also be using one of the database products that appear in this article's examples for your development: Apache Derby, MySQL, or IBM DB2.
See Resources for more information.
Introduction
When packaging a Zero component for use in a production system, database
configuration is of paramount concern. Without the proper design, a Zero component
may drag along dependencies and make assumptions that other developers cannot
accept. If a Zero component cannot be used without extracting all of its contents
and manipulating its configuration files and scripts, it will not be widely
reused. Even assuming that the license allows it, few developers are willing to
take on the responsibility of changing third-party code. The following sections
look at three areas of a Zero component that can be optimized to make it database
agnostic and consumer friendly: configuration files, library dependencies, and table creation.
Optimizing your configuration file
The Zero configuration file is where you specify the details of your component's
database connection, including the driver name, database location, and
authentication information. It is found in your application directory under
/config/zero.config. Listing 1 shows a sample zero.config file for a blog component that uses the Apache Derby database:
Listing 1. Sample zero.config file with database configuration
[/app/db/blog/config]
class=org.apache.derby.jdbc.ClientDataSource
serverName=localhost
portNumber=1527
databaseName=BlogDB
The implementation code for such a component would access the database by calling zero.data's Manager.create() method and passing it the name of the database configuration (e.g., Manager.create("blog")).
The file in Listing 1 is fine for initial development and testing, but the fact
is that we've hard-coded the database name and the database product into our
implementation. If someone wants to reuse our component in an application, we are
forcing that person to create a new database instance ("blog") rather
than reusing an existing database. That person is also forced to install and
deploy our database product of choice if they are not already using it. In most production environments, both of these requirements are non-starters. Deployers cannot afford to give every Zero component its own database, nor can they install and maintain another product for the benefit of one component. In short, your configuration must be flexible enough to adapt to the consumer's database environment, or your code will not be used. In this section, we modify our zero.config file to be more accommodating.
Removing hard-coded database names
The first and easiest step toward flexibility is to remove the hard-coded
database name from the database-oriented code. The stanza in Listing 2 sets the value of the blog component's dbKey property, which we can read at runtime using Zero's GlobalContext API:
Listing 2. Configuring the database name
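A minimal sketch of such a stanza, assuming it uses the same [/app/blog] section that Listing 3 overrides and that the development-time value is the blog configuration name from Listing 1:

```ini
# Assumed development-time setting: points dbKey at the "blog"
# database configuration defined under /app/db/blog/config.
[/app/blog]
dbKey=blog
```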
With this property in place, we can change our code so that instead of Manager.create("blog") we use Manager.create(app.blog.dbKey). Externalizing the database name makes it easier for a consumer to change it without hunting through our code.
Still, we can do better. The value of the dbKey
property can either be the name of the database or defaultDB; the latter is a special value that tells the zero.data
library to use the database configuration of the application that contains it.
This is exactly what we want! If other Zero developers add the blog component to
their application, they can override our dbKey value in
their application's zero.config file by setting it to defaultDB, as shown in Listing 3:
Listing 3. Overriding the database name
[/app/blog]
dbKey=defaultDB
Now we have the perfect balance: during development and test, our component uses
the blog database; during deployment, our consumers can override the name in their own zero.config file so that it matches their database setup.
Dynamic database configuration
Our database name is now dynamic, but the blog component still has information in its zero.config file that is specific to our database setup. Listing 1 includes settings that are highly specific to Apache Derby, but it so happens that our blog component works just fine with IBM DB2 or MySQL, so why should we limit ourselves?
The first step to making this configuration dynamic is to pull it out of our
zero.config file. We can put the text in Listing 1 into
another file named data.config and include it in the original zero.config file
using the syntax in Listing 4:
Listing 4. Extracting the database configuration
For smaller components that will never be used as stand-alone applications, we can probably stop here. Before packaging this component for redistribution, we would just comment out the @include line, assuming that the consumer's application will provide the database configuration we need. That's all!
However, there are a number of component types that, while they are useful
building blocks for more complex applications, can also serve as complete applications themselves. Our blog component is a great example of this. There are many Web sites that deal solely with blogging (such as blogger.com) and many others for whom blogging is just one part of their user experience (such as MySpace). Such components must have the ability to set their database configuration dynamically if the consumer wants to use them as stand-alone applications. The ideal scenario would be for the component to configure its database requirements as part of its installation process, which the consumer initiates using the Zero command-line interface.
Adding dynamic database configuration is easy to achieve now that our database
information is in a separate file. Each Zero component has an Apache Ant file
(build.xml) that is used to invoke the Zero command-line interface, and you can
use it to add custom build and deployment logic. Our next step is to add Ant
targets that can generate the proper data.config file for the consumer's database.
You can add the Ant targets shown in Listing 5 to the blog component's build.xml file:
Listing 5. Ant tasks for generating data.config file
<property name="config-file" value="config/data.config"/>
<target name="create-derby">
<echo file="${config-file}">
[/app/db/blog/config]
class=org.apache.derby.jdbc.ClientDataSource
serverName=localhost
portNumber=1527
databaseName=BlogDB
</echo>
</target>
<target name="create-mysql">
<echo file="${config-file}">
[/app/db/blog/config]
class=com.mysql.jdbc.jdbc2.optional.MysqlDataSource
serverName=localhost
portNumber=3306
databaseName=BlogDB
</echo>
</target>
<target name="create-db2">
<echo file="${config-file}">
[/app/db/blog/config]
class=com.ibm.db2.jcc.DB2DataSource
serverName=localhost
portNumber=50000
databaseName=BlogDB
</echo>
</target>
Listing 5 adds one target for each of the three databases
we intend to support. Each target creates a data.config file with vendor-specific
settings. A consumer who has downloaded our blog component and wants to run it as
a stand-alone application backed by Apache Derby would issue the commands in
Listing 6:
Listing 6. Running Ant task to create data.config for Apache Derby
$ zero resolve
$ zero create-derby
$ zero run
DB2 and MySQL users would replace create-derby with their respective targets (create-db2 or create-mysql).
At this point, we have made our database configuration as flexible as possible to meet the needs of our potential consumers. We now turn to the problem of unnecessary dependencies.
Optimizing your dependency list
To build and test your database-driven code, you need to add libraries to your
component that provide the necessary JDBC drivers. For example, if you are
supporting the Apache Derby database, you need the derbyclient.jar file that is included in every Derby distribution (we continue to use Derby as our example database for this section, but the advice applies to all database products). The JAR file containing the JDBC driver can either be placed in the component's /lib directory or it can be discovered during the build process using Apache Ivy. In the first case, installation is simple: you copy a JAR file to the appropriate directory; in the second case, you need to add an entry to your component's Ivy file, located at /config/ivy.xml. Listing 7 shows what such an entry would look like:
Listing 7. Ivy entry for Apache Derby's JDBC driver
<dependency name="derbyclient" org="org.apache.derby" rev="10.2+"/>
Either way you do it, you are adding a dependency to your component that will
affect your consumers. MySQL users will not want Derby drivers included in their application, and vice versa. Unused dependencies may not have an impact on the correctness of an application, but they do affect its size and licensing. You don't want your component to be thought of as "bloated", nor do you want it to be dismissed because of legal concerns. To avoid these issues, you need to separate your needs as a developer from those of your consumers.
Leave the details to the deployer
When we first started modifying our zero.config file, our intent was to leave as
much configuration as possible to the consumers so that we would not paint them
into a corner. We will approach dependency management in the same way: We want to include the necessary JDBC drivers in our build and test environment, but we don't want them to show up in our final distribution.
The simplest way to achieve our goal is to add the necessary JDBC drivers using
the /lib directory and then exclude them when creating the distribution artifact.
The Zero command-line interface has a package target
that takes everything in your component's directory and adds it to a ZIP file;
what you probably don't know is that the package target
has an optional excludes property that you can use to
leave out certain files and directories. The Ant target in Listing 8 sets the excludes property to the list of JAR files we want to leave out and then calls Zero's package target:
Listing 8. Excluding drivers from component package
<target name="clean-package">
<property name="excludes" value="lib/derbyclient.jar"/>
<package/>
</target>
You can add this Ant target to the component's build.xml file, just as we did
with the targets in Listing 5. When we are ready to create our distribution file, we simply run zero clean-package instead of the usual zero package. We now have a database-driven component that is flexible with regard to its configuration and free of any unwanted files.
Optimizing vendor-specific table creation
Unfortunately, not all projects end up as neat and clean as the picture we have
painted in the previous two sections. Occasionally, despite all attempts to avoid
it, you will end up in a situation where some of your logic or configuration
must be vendor specific. There have been many articles published on the
topic of writing vendor-neutral SQL statements, and their advice does not need to
be repeated here. Instead, we will focus on a case where vendor-specific SQL will
be unavoidable: creating database tables. In this case, you will have SQL scripts that you want to run during a component's installation, but you will not know ahead of time which database the consumer will be using. Fortunately, an elegant solution is not far off.
Ant to the rescue (again!)
In Listing 5, we created vendor-specific Ant targets that
generate a vendor-specific configuration file at installation time. We will
now extend those targets to execute vendor-specific SQL that creates our database
tables. Let's assume that we've organized all of our SQL scripts in directories
named after database products (for example, /sql/derby and /sql/mysql). Our
targets can take advantage of this convention to find and execute the right SQL script using Ant's sql task. Listing 9 shows how the create-derby target was modified to set the Derby-specific properties that are used by the sql task to execute the right script:
Listing 9. Augmenting our Ant task to execute SQL scripts
<property name="config-file" value="config/data.config"/>
<target name="create-derby" depends="init-derby, create-tables">
<echo file="${config-file}">
[/app/db/blog/config]
class=org.apache.derby.jdbc.ClientDataSource
serverName=localhost
portNumber=1527
databaseName=BlogDB
</echo>
</target>
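<target name="init-derby">
<property name="db-name" value="derby"/>
<property name="db-driver" value="org.apache.derby.jdbc.ClientDriver"/>
<property name="db-jar" value="lib/derbyclient.jar"/>
<property name="db-url" value="jdbc:derby://localhost:1527/db/BlogDB"/>
<!-- Placeholder credentials (an assumption); replace with real values -->
<property name="db-user" value="APP"/>
<property name="db-password" value="APP"/>
</target>
<target name="create-tables">
<!-- Ant's sql task requires userid and password as well as driver and url -->
<sql driver="${db-driver}"
classpath="${db-jar}"
url="${db-url}"
userid="${db-user}"
password="${db-password}"
src="sql/${db-name}/create-blog-tables.sql"/>
</target>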
Now when a deployer runs zero create-derby, the target creates
the proper database tables in addition to generating the data.config file. Our
MySQL and DB2 targets can be augmented in the same way by adding equivalent init- targets. The consumers can now ensure that the component's
database tables are created using the right syntax for their product, and the only
drawback is that we have to distribute the SQL scripts for all vendors. Because the scripts are relatively small text files with no licensing implications, this is nothing to worry about.
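As a sketch of what the equivalent init- targets might look like, the MySQL and DB2 versions could set the same properties that init-derby sets in Listing 9. The driver class names and URL formats are the standard ones for MySQL Connector/J 5.x and the IBM DB2 JDBC driver. The JAR file names, the credentials, and the assumption that each database is named BlogDB on its vendor's default port are illustrative only and should be adjusted to the consumer's installation. The db-user and db-password properties are included because Ant's sql task requires userid and password attributes; they take effect only if the create-tables target passes them along.

```xml
<!-- Sketch only: JAR names, credentials, and database locations are assumptions. -->
<target name="init-mysql">
  <property name="db-name" value="mysql"/>
  <property name="db-driver" value="com.mysql.jdbc.Driver"/>
  <property name="db-jar" value="lib/mysql-connector-java.jar"/>
  <property name="db-url" value="jdbc:mysql://localhost:3306/BlogDB"/>
  <property name="db-user" value="CHANGE_ME"/>
  <property name="db-password" value="CHANGE_ME"/>
</target>
<target name="init-db2">
  <property name="db-name" value="db2"/>
  <property name="db-driver" value="com.ibm.db2.jcc.DB2Driver"/>
  <property name="db-jar" value="lib/db2jcc.jar"/>
  <property name="db-url" value="jdbc:db2://localhost:50000/BlogDB"/>
  <property name="db-user" value="CHANGE_ME"/>
  <property name="db-password" value="CHANGE_ME"/>
</target>
```

The create-mysql and create-db2 targets from Listing 5 would then declare depends="init-mysql, create-tables" and depends="init-db2, create-tables", mirroring create-derby in Listing 9. The db-name values also assume the scripts live in /sql/mysql and /sql/db2.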
Conclusion
There are a number of issues to consider when it comes to packaging a Zero
component for reuse by other developers, and database access is the cause of many of them. In this article, you have learned how to alleviate some of the restrictions and problems that developers unknowingly create for their consumers during development time. With the help of Apache Ant and the Zero command-line tools, we were able to eliminate the most common problems associated with publishing database-driven components.
Resources
Learn
Get products and technologies
Discuss
- The Project Zero Community site includes invitations to join in on forum discussions, blogs, and wikis. Get involved today!
About the author
Dan Jemiolo is an Advisory Software Engineer on IBM's Autonomic Computing team in Research Triangle Park, NC. He led the design and development of Apache Muse 2.0 and continues to work on the project today. Dan also participates in the WS-RF TC as editor of the WS-ResourceMetadataDescriptor specification and is involved in IBM's strategy for increasing adoption of Web services standards. He came to IBM just over two years ago after earning his Master of Science degree in Computer Science from Rensselaer Polytechnic Institute.