Skip to main content

Tip: Create portable database representations with PEAR MDB2_Schema

Transfer database structures and records from one RDBMS to another with PHP

Vikram Vaswani, Founder, Melonfire
Vikram Vaswani is the founder and CEO of Melonfire, a consulting services firm with special expertise in open-source tools and technologies. He is also the author of the books PHP Programming Solutions and How to do Everything with PHP and MySQL.

Summary:  To change an application's database back-end is a complex task, that often requires the developer to manually re-create database tables and records using data types and SQL functions compatible with the new RDBMS. The PEAR MDB2_Schema package can make this task easier, by generating a vendor-neutral representation of a database using XML and providing tools to import this representation into any supported RDBMS.

View more content in this series

Date:  07 Aug 2007
Level:  Intermediate
Activity:  2208 views

Introduction

If you work with different databases on different platforms, it's likely that you're already aware of the difficulties involved in moving your application from one relational database management system (RDBMS) to another. Not only do different RDBMSs support different subsets of SQL-92 and SQL-99, but many also use their own custom enhancements and extensions to the language.

Switching to a new RDBMS thus requires the application developer to thoroughly check the code tree of the application for non-standard SQL or RDBMS-specific features, and then re-engineer these to work with the new RDBMS. This re-engineering might be as simple as resetting field data types, or as complex as rewriting SQL queries and stored procedures in a format more suitable to the new database engine. Database configuration settings and variables might also need to be optimized for maximum performance.

MDB2_Schema, a package from the PHP Extension and Application Repository (PEAR), can help you reduce some drudge work in moving from one RDBMS to another. The MDB2_Schema package provides an API to extract entities and records from an RDBMS, store them in a vendor-neutral XML format, and use this XML to recreate the original database and records in a different RDBMS. It supports most commonly-used database systems, and thus provides an easy and efficient way to quickly port your data from one system to another.


Installation

The MDB2_Schema package is maintained by Helgi Pormar, Igor Feghali and David Coallier, and released to the PHP community under a BSD license. It requires PHP 4.2.0 (or later). The easiest way to install it is with the automated PEAR installer, which should have been included by default with your PHP build. To install it, simply issue the following command at your shell prompt:

shell> pear install MDB2_Schema

The PEAR installer will now connect to the PEAR package server, download the package, and install it to the appropriate location on your system. This tip uses MDB2_Schema V. 0.7.2.

To install the package by hand, visit its home page, download the source code archive, and manually uncompress the files to the desired location. Note that this manual installation process presupposes some knowledge of PEAR's package organization structure.

MDB2_Schema depends heavily on two other PEAR packages, the MDB2 database abstraction layer and the XML_Parser package. MDB2, in turn, further requires that you install database-specific drivers and corresponding PHP extensions for the different database engines to be supported. As of this writing, MDB2 drivers are available for MySQL, SQLite, PostgreSQL, Firebird, Interbase, MS-SQL and Oracle. The examples in this tip make use of the MySQL and PostgreSQL drivers. As described previously, you can use the PEAR automated installer to install all of these packages; alternatively, you'll find links to the packages from the Resources for this tip.

This tip demonstrates use of the MDB2_Schema package by porting MySQL's sample world database, which provides various pre-populated and linked tables containing city and country data, to PostgreSQL. This pre-supposes that both MySQL and PostgreSQL are installed, and that the world database is correctly installed. See the instructions to obtain MySQL, PostgreSQL and the MySQL world database in Resources.


Using the Point-And-Click Tool

The easiest way to start with MDB2_Schema is with the example script included in the package archive. This script, named example.php, provides a Web-based form to create, import and alter XML-based definition files for a target database. To install it, simply copy it to a directory under the document root of your Web server.

Figure 1 illustrates what this form looks like:


Figure 1. The Web form to dump or create a database representation
Web form to dump or create a database representation

To dump a database definition of the MySQL world database in XML, fill up the form fields with appropriate access parameters for your MySQL server and submit it. In most cases, you will want to select the All value of the Dump option of the database. It's also a good idea to turn on debugging.

Figure 2 illustrates the filled-in form:


Figure 2. The filled-in Web form
Filled-in Web form

The form processor will use the supplied values to make a connection to the MySQL database, read its entities and contents, and write the output to the named file. Here's a snippet of what the file looks like:


Listing 1. Sample XML representation of a MySQL database
                
<?xml version="1.0" encoding="ISO-8859-1" ?>
<database>

 <name>world</name>
 <create>true</create>
 <overwrite>false</overwrite>

 <table>

  <name>city</name>

  <declaration>

   <field>
    <name>ID</name>
    <type>integer</type>
    <length>4</length>
    <notnull>true</notnull>
    <default>0</default>
    <autoincrement>1</autoincrement>
   </field>

   <field>
    <name>Name</name>
    <type>text</type>
    <length>35</length>
    <notnull>true</notnull>
    <fixed>true</fixed>
    <default></default>
   </field>

    ...

   <index>
    <name>idx_ccode</name>
    <field>
     <name>CountryCode</name>
     <sorting>ascending</sorting>
    </field>
   </index>

  ...

  </declaration>

  <initialization>

   <insert>

    <field>
     <name>ID</name>
     <value>1</value>
    </field>

    <field>
     <name>Name</name>
     <value>Kabul</value>
    </field>

    <field>
     <name>CountryCode</name>
     <value>AFG</value>
    </field>

    <field>
     <name>District</name>
     <value>Kabol</value>
    </field>

    <field>
     <name>Population</name>
     <value>1780000</value>
    </field>

   </insert>

   ...
   
  </initialization>
  
 </table>

 <table>

 ...

 </table>

</database>
        

A brief description of the XML format for the MDB2_Schema is useful here. Each table within the database is represented by a <table> element, which further has two primary child elements:

  • <declaration>, which contains descriptions of the database fields and indices.
  • <initialization>, which contains the actual records. Each record is represented by an <insert> element within the parent <initialization> node.

Once you have your database as an XML file, you can also use the example.php Web-based form to import it into another RDBMS. To do this, simply fill up the form again with access parameters for the target RDBMS, input the path to the source XML file, select the option to create a database, turn on debugging and let it run! MDB2_Schema will read and parse the XML definitions, and turn them into actual database tables, fields and records.

Figure 3 illustrates the result of converting the XML file generated previously into a PostgreSQL database:


Figure 3. The debug output after creating a PostgreSQL database from an XML representation
Debug output after creating a PostgreSQL database from an XML representation

Using the MDB2_Schema API

While point-and-click is good in certain situations, at other times you might wish to directly integrate MDB2_Schema into your PHP application. This is not difficult to do, for both dump and create operations. Listing 2 illustrates the PHP code that generates an XML representation of a database and its contents using MDB2_Schema:


Listing 2. The PHP code to dump a database in XML
                            
<?php
set_time_limit(0); 
// include class
include_once 'MDB2/Schema.php';

// initialize an MDB connection to the database
// configure for file output
$options['output_mode'] = 'file';
$options['force_defaults'] = true;
$schema = MDB2_Schema::factory(MDB2::factory('mysql://root@localhost/world'), 
$options);

// get data definitions
$data = $schema->getDefinitionFromDatabase();
if ($data instanceof MDB2_Error) {
    die ($data->getMessage() . ': ' . $data->getUserInfo());
} 

            
// dump structure and content            
$options['output'] = '/tmp/all.xml';
$ret = $schema->dumpDatabase($data, $options, MDB2_SCHEMA_DUMP_ALL);
if ($ret instanceof MDB2_Error) {
    die ($ret->getMessage() . ': ' . $ret->getUserInfo());
} else {
    echo 'Database structure and contents successfully dumped to /tmp/all.xml. 
    <br />';
}


// dump structure only
$options['output'] = '/tmp/structure.xml';
$ret = $schema->dumpDatabase($data, $options, MDB2_SCHEMA_DUMP_STRUCTURE);
if ($ret instanceof MDB2_Error) {
    die ($ret->getMessage() . ': ' . $ret->getUserInfo());
} else {
    echo 'Database structure successfully dumped to /tmp/structure.xml. 
    <br />';
}

// dump data only
$options['output'] = '/tmp/content.xml';
$ret = $schema->dumpDatabase($data, $options, MDB2_SCHEMA_DUMP_CONTENT);
if ($ret instanceof MDB2_Error) {
    die ($ret->getMessage() . ': ' . $ret->getUserInfo());
} else {
    echo 'Database contents successfully dumped to /tmp/content.xml. <br />';
}
?>
        

In Listing 2, the code first includes the necessary class file for the MDB2_Schema package, and then initializes the MDB2 abstraction layer by calling its static factory() method. This factory() method is passed a DSN string representing the database to be exported; the resulting MDB2 object instance is then used to initialize an MDB2_Schema object instance, which is stored in $schema.

Once an MDB2_Schema object is initialized, retrieving an XML-formatted data dump is an easy two steps away: first, call the object's getDefinitionFromDatabase() method to retrieve the database definition, and then call the object's dumpDatabase() method to actually write the database entities and/or records to a file. The name and location of the file are specified in the output key of the $options array, which is passed to dumpDatabase() as the second argument. Note also the third argument to dumpDatabase(), a constant which specifies whether the dump should include only entities (MDB2_SCHEMA_DUMP_STRUCTURE), only data (MDB2_SCHEMA_DUMP_CONTENT) or both (MDB2_SCHEMA_DUMP_ALL).

A word now about the $options array passed to MDB2_Schema's factory() method. This array holds various options that control how the database is exported, as well as the output mode. For example, the output_mode key of the array specifies whether to send the output of dumpDatabase() to a file or a user-defined PHP function, while the output key specifies the name of the destination file or function. Listing 2 uses the file output mode to create the files all.xml, structure.xml and content.xml, but you can just as easily pass the output of dumpDatabase() to a user-defined function for further processing.

Regenerating the database (even on a different RDBMS) is even simpler than dumping it. Consider Listing 3, which illustrates the process of reading an MDB2 schema file and transferring it into a PostgreSQL database:


Listing 3. The PHP code to import a database in XML
                            
<?php
set_time_limit(0);
// include class
include 'PEAR/MDB2/Schema.php';

// initialize an MDB connection to the database
$schema = MDB2_Schema::factory(
    MDB2::factory("pgsql://postgres:postgres@localhost/world"));

// parse data definition from file
$data = $schema->parseDatabaseDefinitionFile('/tmp/structure.xml');
if ($data instanceof MDB2_Schema_Error) {
    die ($data->getMessage() . ': ' . $data->getUserInfo());
} else {
    echo 'Database definition parsed. <br />';
}

// write database
$ret = $schema->createDatabase($data);
if ($ret instanceof MDB2_Error) {
    die ($ret->getMessage() . ': ' . $ret->getUserInfo());
} else {
    echo 'Database schema successfully imported. <br />';
}
?>
        

The code in Listing 3 initializes an instance of the MDB2_Schema class, by passing it an MDB2 object representing the target (PostgreSQL) database. Next, the parseDatabaseDefinitionFile() method reads the XML-formatted database dump file into the $data variable, and the createDatebase() method then regenerates the database using the parsed definition.

Once the script completes — it might take a while for large or complex databases, which is why Listing 3 includes a call to set_time_limit(0) at the top — you can pop open your PostgreSQL command-line client, and check that the data was correctly inserted:

postgres-# \c world
Password for user "postgres":
You are now connected to database "world".
world-# \dt
              List of relations
 Schema |      Name       | Type  |  Owner
--------+-----------------+-------+----------
 public | city            | table | postgres
 public | country         | table | postgres
 public | countrylanguage | table | postgres
(3 rows)

world=# SELECT * FROM city;
 id | name      | countrycode |       district       | population
----+-----------+-------------+----------------------+------------
  1 | Kabul     | AFG         | Kabol                |    1780000
  2 | Qandahar  | AFG         | Qandahar             |     237500
...
        


Caveats

While MDB2_Schema works quite well for most common databases (including MySQL, PostgreSQL and SQLite), it's important to note that it's still an automated tool and hence not perfect. As an example, one error you can encounter relates to TIMESTAMP or DATE fields which do not have a default field value. In these cases, when you read the corresponding XML definition with parseDatabaseDefinitionFile() method, you might receive a parser error related to the <default> element for those fields. The only way to solve these problems is to manually tweak the XML, on a case-by-case basis and take into account the SQL syntax of the target RDBMS.

As the previous listings illustrate, the MDB2_Schema package provides a useful tool to quickly and safely move data from one RDBMS to another. While it's not perfect and can require manual tweaking in some cases, it still provides an efficient alternative to hand-coding new tables and manually writing SQL to dump and insert records from one database to another. Try it out the next time you have a database to port, and see what you think!


Resources

Learn

Get products and technologies

  • IBM trial software: Build your next development project with trial software available for download directly from developerWorks.

  • The PEAR MDB2_Schema package: Download and maintain RDBMS independent schema files in XML that you can use to create, alter and drop database entities and insert data into a database.

  • The PEAR MDB2 package: Download a portable merge of the PEAR DB and Metabase php database abstraction layers for a common API for all supported RDBMS.

  • The MySQL driver: Download the driver for the PEAR MDB2 package.

  • The PostgreSQL driver: Download this driver for the PEAR MDB2 package.

  • The example MySQL world database: Obtain instructions for to download and install sample country, city, and language data.

  • IBM® DB2® Enterprise 9: Download a trial version of DB2 9 or DB2 Express-C 9, a no-charge version of DB2 Express 9 data server.

Discuss

About the author

Vikram Vaswani is the founder and CEO of Melonfire, a consulting services firm with special expertise in open-source tools and technologies. He is also the author of the books PHP Programming Solutions and How to do Everything with PHP and MySQL.

Comments (Undergoing maintenance)



Trademarks  |  My developerWorks terms and conditions

Help: Update or add to My dW interests

What's this?

This little timesaver lets you update your My developerWorks profile with just one click! The general subject of this content (AIX and UNIX, Information Management, Lotus, Rational, Tivoli, WebSphere, Java, Linux, Open source, SOA and Web services, Web development, or XML) will be added to the interests section of your profile, if it's not there already. You only need to be logged in to My developerWorks.

And what's the point of adding your interests to your profile? That's how you find other users with the same interests as yours, and see what they're reading and contributing to the community. Your interests also help us recommend relevant developerWorks content to you.

View your My developerWorks profile

Return from help

Help: Remove from My dW interests

What's this?

Removing this interest does not alter your profile, but rather removes this piece of content from a list of all content for which you've indicated interest. In a future enhancement to My developerWorks, you'll be able to see a record of that content.

View your My developerWorks profile

Return from help

static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=XML, Open source
ArticleID=243931
ArticleTitle=Tip: Create portable database representations with PEAR MDB2_Schema
publish-date=08072007
author1-email=vikram.melonfire@gmail.com
author1-email-cc=dwxed@us.ibm.com

My developerWorks community

Tags

Help
Use the search field to find all types of content in My developerWorks with that tag.

Use the slider bar to see more or fewer tags.

Popular tags shows the top tags for this particular content zone (for example, Java technology, Linux, WebSphere).

My tags shows your tags for this particular content zone (for example, Java technology, Linux, WebSphere).

Use the search field to find all types of content in My developerWorks with that tag. Popular tags shows the top tags for this particular content zone (for example, Java technology, Linux, WebSphere). My tags shows your tags for this particular content zone (for example, Java technology, Linux, WebSphere).

Special offers