MySQL is currently the most common database server used with the PHP programming language to build dynamic web applications. However, DB2 is another popular database that is well-supported by PHP and offers compelling advantages over MySQL that make it an ideal choice for many applications.
This series describes why it makes sense to move a PHP application to DB2, how to prepare for the migration, how to execute it, how to support it, and how to handle potential risks based on the authors' experience with a recent migration. Many code and configuration samples are provided, as well as pointers to resources to help the project run smoothly.
With examples and lessons learned from a successful real-life conversion, you will see that this can be a straightforward project that is well documented and offers compelling benefits.
This four-part series shares lessons learned from a successful MySQL to DB2 migration for a production-grade, mission-critical PHP intranet application used by 4,000 global users within IBM to support content production for ibm.com.
- Part 1 describes the steps taken to prepare for the migration.
- Part 2 describes the steps taken to migrate the database.
- Part 3 describes the steps taken to convert the PHP code.
- Part 4 describes the steps taken to deploy and support the application.
The goal of this article series is to provide you with an understanding of what is generally needed to migrate a PHP application from MySQL to DB2, what resources are available to help you, and how an IBM project team carried out the task in early 2010.
If you have investigated a migration from MySQL to DB2, you might have already seen the value DB2 offers based on product literature, performance benchmarks, features you have read about in the DB2 documentation, or comparisons in IBM Redbooks ® dedicated to the task, including the MySQL to DB2 Conversion Guide (see Resources ).
You might also know that DB2 Express-C is a free, fully functional relational data server that can be easily installed or evaluated using the IBM SmartCloud (formerly known as the Development and Test on the IBM Cloud) or Amazon EC2. Links to these offerings are available in the Resources section.
This article series provides you a concrete example of how a real world migration was executed successfully in 2010 for a heavily used PHP intranet application within IBM to support the day-to-day management of content that is published for many sections of the ibm.com web site.
After you finish reading this series, you will be able to make the case for a similar migration, understand the timing and dependencies of work items that need to be executed, anticipate potential risks, and know where to look for support each step of the way. All of this will give you more confidence to choose DB2 and make the best use of it for your PHP applications that are currently built on MySQL.
This article series intends to share the lessons learned from an internal IBM migration to DB2 from MySQL and provide you with information on the resources available to you to perform a similar undertaking. It is not a comprehensive guide to migration that is applicable to all scenarios.
To determine an approach that's right for you, refer to the MySQL to DB2 Conversion Guide or contact the Software Migration Project Office (SMPO) for a free migration estimate. Links are provided in the Resources section.
This article covers the five primary work efforts in the case study to carry out the migration of the PHP application code from using MySQL drivers and syntax to those that work with DB2. If needed, refer to Part 1 of the series to see the scheduling of these steps in the overall migration process before proceeding with the conversion.
- Step 1: Begin a first pass at the code migration
- Ensure that the PHP configuration is updated to support DB2
- Update the individual SQL statements to support DB2 syntax
- Emulate native MySQL functions using DB2 user-defined functions, if needed
- Move logic into PHP from SQL, if needed
- Step 2: Begin a second pass at the code migration
- Examine the changes needed to support the appropriate isolation levels
- Reorganize the queries into logical units of work to achieve better data integrity and improve performance
- Step 3: Begin initial business use case testing with stakeholders
- Engage stakeholders to execute use cases that they are accustomed to performing in the old system
- Capture test failures as defects to be analyzed and fixed by developers
- Step 4: Address bottlenecks and confirm against functional baseline
- Improve the performance of the system based on user feedback after functional verification
- Focus on what DB2 can fix for you automatically, because DB2 is the largest change you introduced
- Address PHP bottlenecks by looking at operating system resource usage
- Step 5: Evaluate the code migration baseline
- After iteratively completing each of the steps above, declare the code conversion complete
- Back up the system, and tag the milestone in your source control system
- Assess the level of preparation needed for the next task: application deployment
For the example application in this article, the source code for the Project Tracking Tool (PTT) consists of several hundred PHP files. The codebase contains function libraries, object-oriented code organized into data transfer objects and manager classes, and various HTML template fragments and helpers that render the user interface.
The PTT database is used for various functions to support the workflow of information published on ibm.com. Over 4,000 users worldwide access and modify the PTT database through the PHP web front end. At any given time, several hundred users are active on the system concurrently.
The code is deployed onto a single Apache web server that loads mod_php as a shared module.
In the example, the existing code was updated for the new DB2 system, primarily by making changes to the embedded SQL. Other minor modifications were required in order to update the PHP configuration to use the DB2 driver and to tweak the database connection code to use a new connection string. This article also identifies and addresses some improvements to the application structure to better achieve a model-view-controller (MVC) architectural model to improve application quality, organization, and maintainability.
To prepare for the example code conversion, install the following components on a Windows workstation to be used to carry out the conversion steps.
- A version of MySQL copy of the source database
- In order to verify each change to the code, it's important to have a copy of the old system available to use as a functional reference in order to validate changes to the new system side-by-side with the old. You can use the same system as you did in Part 2.
- A version of DB2 with the data server drivers installed locally or on a test server
- Install DB2 to create the new target database on the workstation. In general, this does not need to be the same edition as the database used in production, but for full feature compatibility, it is a good idea. To follow the example in this article, install DB2 Enterprise Server Edition Version 9.7.2. Ensure that the data server drivers are available to provide the necessary PHP client libraries. You can use the same system as you did in Part 2.
- A version of PHP built with the ibm_db2 extension or PDO_IBM (PHP Data Objects)
- Download the latest version of Zend Server, and select the optional DB2 extension, which requires the installation of additional packages.
- An IDE, such as Zend Studio or Eclipse PHP Development Tools (PDT)
- Use a PHP-aware IDE, such as Zend Studio or the Eclipse PHP Development Tools, in order to ease your PHP development. Because you maintain a PHP application, it's likely you already have a preferred tool.
Be sure to document your configuration decisions and lessons-learned carefully so that you can repeat the steps when you deploy. Consider saving a snapshot of the Windows operating system to a virtual image at key milestones when important goals are achieved in order to serve as configuration backups and baselines for further code improvement comparisons.
If you want to capture an image of a physical machine configuration, you can do so with the free VMware vCenter Converter. As an alternative, consider using the cloud for these hands-on changes. You can use the Amazon EC2 DB2 AMIs, or you can sign up for the IBM SmartCloud (formerly known as the Development and Test on the IBM Cloud). With virtual computers, you'll be able to avoid the initial effort to procure server hardware and install an operating system and DB2, which will save time, speed the migration process, and give you more confidence to experiment with the configuration that best fits your needs. Links to all these products can be found in Resources.
The first major code conversion step is to configure the new PHP and DB2 infrastructure and translate the way that the application uses PHP and SQL to access data. This step involves the following substeps:
- Update the PHP driver and configuration
- Convert SQL syntax
- Create user-defined functions to emulate MySQL
- Move logic to PHP from SQL, if needed
Ensure that the existing PHP configuration is modified to support the DB2 driver, to match the server database encoding, and to provide important diagnostic messages via verbose warning and error notifications.
In order to update the PHP code for DB2, you will normally change each spot in the code where the ibm_db2 extension functions are used, or update the PDO_IBM connection string that implements the PDO (PHP Data Objects) interface. If you use PDO as a database abstraction library in your MySQL-based system, the changes to your code should be minor. It is a matter of changing the single connection string. But even if you are using the procedural ibm_db2 extension, this task should be straightforward. You can normally replace any functions that start with mysql_ or mysqli_ in your code with db2_ prefixed equivalents. See section 6.3 in Chapter 6 of the IBM Redbook Developing PHP Applications for IBM Data Servers (see Resources) for function mapping tables and sample code.
Both connection methods rely on the DB2 client as a bridge to connect to the DB2 server, so you need to confirm that the DB2 client has been configured to match the settings on the remote server. In particular, you should make sure the code page of the DB2 client is the same as that of the DB2 server; otherwise, you will have trouble with character sets. For the example, run the command shown in Listing 1 to set the client encoding to UTF-8, which will match the CREATE DATABASE command in Listing 1 in Part 2.
Listing 1. Set the DB2 client code page to UTF-8
To ensure this setting is in place on each of the workstations or servers where you install the runtime client, run the command in Listing 2.
Listing 2. View the DB2 configuration variables
Finally, during development it is helpful to ensure that all error messages and warnings are captured and shown in the browser. In your php.ini file, update the parameters shown in Listing 3. Remember to set them back to less verbose settings when you move to production. Part 4 of this articles series provides an example of a more appropriate error-reporting mechanism to use when you deploy to production.
Listing 3. Set error control parameters in php.ini
display_errors = On error_reporting = E_ALL & ~E_NOTICE
Next, update the individual SQL statements to support DB2 syntax, or alter them to access data in a slightly different way that achieves the same result. For most applications, this step accounts for the bulk of the code updates and verification testing. Refer to Chapter 8.1 Data Manipulation Language differences and similarities in the Redbook MySQL to DB2 Conversion Guide to see a list of common syntax conversions. Following are a few more situations in the case study application.
- SELECT clause wildcard behavior
- SELECT statements that use a wildcard (the asterisk character) to choose specific columns in addition to all columns from multiple tables are legal in MySQL, but they are not allowed in DB2. In this case, add the table name (or alias) qualifier prefix to the wildcard, specify each column from each table you are interested in, or simply list all columns in each table with a single wildcard. Listing 4 shows how to modify such queries.
Listing 4. Comparing the legal column and table wildcards in MySQL and DB2
-- In MySQL: SELECT R.NAME, * FROM USER U, ROLE R WHERE U.ROLE_ID = R.ID; -- In DB2: SELECT R.NAME, U.* FROM USER U, ROLE R WHERE U.ROLE_ID = R.ID; -- Or SELECT R.NAME, U.ID, U.NAME, U.ROLE_ID, U.DEPART_ID FROM USER U, ROLE R WHERE U.ROLE_ID = R.ID;
- SELECT clause specifying the maximum result set size with LIMIT
LIMITis a MySQL-only, non-standard keyword that specifies the maximum number of rows to return from a query. DB2 uses the
FETCH FIRST n ROWS ONLYsyntax for the same purpose. Listing 5 shows you how to rewrite the query for DB2.
Listing 5. Specifying the maximum number of rows to return from a query in MySQL and DB2
-- In MySQL: SELECT * FROM ROLE LIMIT 10; -- In DB2: SELECT * FROM ROLE FETCH FIRST 10 ROWS ONLY;
- GROUP BY clause
- MySQL allows the use of a GROUP BY clause without specifying each of the column names not operated by on an aggregate function in the column list. But in DB2, this is illegal, because the resultset can contain a degree of ambiguity unacceptable to many users. Listing 6 shows how to explicitly specify the columns in the query for DB2.
Listing 6. Comparing the legal GROUP BY aggregations in MySQL and DB2
-- In MySQL: SELECT R.ID, R.NAME, COUNT(U.ID) AS NUM FROM USER U, ROLE R WHERE U.ROLE_ID = R.ID GROUP BY R.ID; -- In DB2: SELECT R.ID, MIN(R.NAME), COUNT(U.ID) AS NUM FROM USER U, ROLE R WHERE U.ROLE_ID = R.ID GROUP BY R.ID; -- Or SELECT R.ID, R.NAME, COUNT(U.ID) AS NUM FROM USER U, ROLE R WHERE U.ROLE_ID = R.ID GROUP BY R.ID, R.NAME;
In Listing 6, you can update the GROUP BY, because both R.ID and R.NAME are unique keys in this table. Listing 7 shows another kind of GROUP BY query that can't be handled as simply as the one in Listing 6.
Listing 7. Ambiguous GROUP BY in MySQL that needs to be modified for DB2
-- In MySQL: SELECT R.ID, R.NAME, COUNT(U.ID) AS NUM FROM USER U, ROLE R WHERE U.ROLE_ID = R.ID GROUP BY R.NAME;
In this case, R.ID is unique within the table, but R.NAME is not. If there are duplicate R.NAME values in the ROLE table, then you can't replace R.ID with MIN(R.ID), and you can't add R.ID into a GROUP BY clause. How you convert this SQL depends on what kind of result you expect. Listing 8 shows a few options for this situation.
Listing 8. Translated GROUP BY query in DB2
-- In DB2: -- Option 1, if you want to get same result as what you get in MySQL, the SQL is: SELECT (SELECT RL.ID FROM ROLE RL WHERE RL.NAME = R.NAME FETCH FIRST 1 ROW ONLY) AS ID, R.NAME, COUNT(U.ID) AS NUM FROM USER U, ROLE R WHERE U.ROLE_ID = R.ID GROUP BY R.NAME; -- Option 2, if role name is same, treat it as same role. In this case, -- the result is a little different from MySQL version in that there is no R.ID: SELECT R.NAME, COUNT(U.ID) AS NUM FROM USER U, ROLE R WHERE U.ROLE_ID = R.ID GROUP BY R.NAME; -- Option 3, if role ID is different, treat them as different role. -- In this case, the result is totally different than the MySQL version: SELECT R.ID, R.NAME, COUNT(U.ID) AS NUM FROM USER U, ROLE R WHERE U.ROLE_ID = R.ID GROUP BY R.ID, R.NAME;
- REPLACE INTO in MySQL versus MERGE in DB2
- MySQL provides a REPLACE INTO clause. DB2 provides a MERGE clause for a similar, but not equivalent, purpose. To achieve the equivalent result, create a new record or replace an existing one if it has the same primary key or unique value. Listing 9 shows how to check for a unique value in the table and either update or insert values.
Listing 9. Translating MySQL REPLACE INTO syntax for DB2
-- In MySQL: REPLACE INTO ROLE (ID, NAME, DESCRIPTION) SELECT ID, NAME, DESCRIPTION FROM ROLE_TMP; -- In DB2: MERGE INTO ROLE R USING (SELECT ID, NAME, DESCRIPTION FROM ROLE_TMP) RT ON (R.ID = RT.ID) WHEN MATCHED THEN UPDATE SET (ID, NAME, DESCRIPTION) = (RT.ID, RT.NAME, RT.DESCRIPTION) WHEN NOT MATCHED THEN INSERT (ID, NAME, DESCRIPTION) VALUES (RT.ID, RT.NAME, RT.DESCRIPTION);
- JOIN clause
- When performing an outer join on a pair of tables while also retrieving data from a third table, specify the tables used for the outer join adjacent to the join keywords in DB2. MySQL allows them to be listed in either order. Listing 10 shows the required change.
Listing 10. Comparing MySQL and DB2 JOIN syntax
-- Both of the following work in MySQL: SELECT * FROM USER U, ROLE R LEFT JOIN DEPARTMENT D ON U.DEPT_ID = D.ID; -- Or SELECT * FROM ROLE R, USER U LEFT JOIN DEPARTMENT D ON U.DEPT_ID = D.ID; -- Only this one works in DB2: SELECT * FROM ROLE R, USER U LEFT JOIN DEPARTMENT D ON U.DEPT_ID = D.ID;
- Escape characters
- The character sequence \ represents a backslash escaped single quote ' in MySQL, but in DB2, a single quote must be escaped by adding another single quote '' (double single quotes). Listing 11 shows an example of the escape sequences for a string that uses a single quote as an apostrophe.
Listing 11. MySQL uses the backslash to escape single quotes where DB2 uses a second single quote
-- In MySQL: SELECT * FROM ROLE WHERE DESCRIPTION = 'It\'s a super admin role'; -- In DB2: SELECT * FROM ROLE WHERE DESCRIPTION = 'It''s a super admin role';
- Out of range checking in DB2
- MySQL doesn't enforce the range of a data type in a SELECT clause, but DB2 does. Thus the SQL shown in Listing 12 that queries for information between February 1 and February 30 doesn't work in DB2, but it works in MySQL.
Listing 12. MySQL does not check the range of values for WHERE clauses against the data type
-- MySQL allows you to specify February 30th as part of the range SELECT * FROM USER WHERE BIRTHDAY BETWEEN '1980-02-01' AND '1980-02-30';
- INSERT into a NOT NULL column without a default value
- In MySQL, if you try to execute an INSERT statement that does not provide a value for a NOT NULL column, the command is successful. MySQL populates a default value automatically, even if you never define a default value for that column when creating the table. But in DB2, you must provide a value for NOT NULL columns in an INSERT statement when a default hasn't been defined. The SQL statements in Listing 13 show the different behaviors.
Listing 13. MySQL and DB2 handle the insertion of NULL values differently
-- Works in both MySQL and DB2 CREATE TABLE TEST1 (ID INTEGER, NAME VARCHAR(20) NOT NULL); -- Works in MySQL, doesn't work in DB2 INSERT INTO TEST1 (ID) VALUES(1); -- Works in both MySQL and DB2 CREATE TABLE TEST2 (ID INTEGER, NAME VARCHAR(20) NOT NULL DEFAULT ''); -- Works in both MySQL and DB2 INSERT INTO TEST2 (ID) VALUES(1);
- Convert MySQL functions to similar functions in DB2
- In addition to MySQL-specific SQL, you may also be using built-in MySQL functions. You can replace these with SQL standard functions or an equivalent function available in DB2. Refer to Appendix A: Mapping MySQL built-in functions and operators in the MySQL to DB2 Conversion Guide Redbook (see Resources).
As just described, if you can't find a DB2 equivalent function, you can rewrite your SQL to access data a different way. Alternatively, you can choose to emulate the function in DB2 or move the logic into your PHP code, as described in the following sections.
If you can't just update SQL statements in DB2 syntax to achieve the same results as your MySQL queries, you can implement a workaround in the form of a user-defined function (UDF) in DB2 SQL to emulate a built-in MySQL function.
There were several places that used MySQL and PHP in the example scenario that should be approached differently when updating the application to use DB2. In particular, you can move some functionality out of MySQL into PHP, such as when doing date translation. Or you can emulate function in DB2, such as to create user-defined functions in DB2 to approximate some functions that are built into MySQL, including UNIX_TIMESTAMP() and NOW(). Appendix B in the MySQL to DB2 Conversion Guide Redbook (see Resources) explains the differences, and it gives an approach to addressing changes. Daniel Krook's tips collected from his customer migration experience are also helpful (see Resources).
The DML of DB2 is different from MySQL. This article described how to convert embedded SQLs from MySQL to DB2. However, it is not necessary to convert all syntax from MySQL to DB2, especially for MySQL built-in functions. Appendix A. Mapping MySQL built-in functions and operators in the MySQL to DB2 Conversion Guide Redbook (see Resources) lists lots of MySQL built-in functions. Generally, many of the functions can be converted to DB2 UDFs. You decide which frequently used built-in functions to convert. For the example scenario, it is worth the effort to use DB2 UDFs to convert MySQL functions or registers if you use them frequently. Following is a list of MySQL functions and registers you can convert.
- CURRENT_DATE() / CURDATE()
Fortunately, many of these functions have already been implemented and are freely available to reuse, as shown in the article "DB2 basics: Fun with dates and times" (see Resources). If you are using DB2 v9.7.2 or later, you can opt to set the new compatibility vector DB2_COMPATIBILITY_VECTOR=MYS.
DB2 makes user-defined functions easy, because they can be written in SQL rather than requiring the use of the C programming language, as is the case in MySQL.
Listing 14 is an example of a UDF that you can create to emulate MySQL's non-standard but commonly used NOW() function. You can use the emulation in many places in the example application.
Listing 14. A DB2 UDF definition to emulate the built-in MySQL NOW() function
CREATE FUNCTION NOW() RETURNS TIMESTAMP NO EXTERNAL ACTION BEGIN ATOMIC RETURN SELECT CURRENT TIMESTAMP FROM SYSIBM.DUAL; END
There are several other user-defined functions for emulating commonly used built-in MySQL functions listed in Appendix B of the MySQL to DB2 Conversion Guide Redbook (see Resources).
There may be situations where you cannot translate MySQL functions or syntax into a convenient DB2 format. In such cases, look into whether PHP itself supports the feature and use that function or extension instead. For example, MySQL provides an SQL function called INET_ATON() that has no equivalent in DB2. To write a user-defined function for INET_ATON() could be error prone or tedious. Fortunately PHP provides the ip2long() function that offers the same functionality and is not dependent on the database vendor.
With the PHP code updated to modify the embedded SQL statements and to read or update the data in a manner similar to MySQL, the next step is to refactor your queries to achieve the non-functional goals of better data integrity and improved performance. This step involves the following substeps:
- Select the appropriate approach to concurrency
- Consolidate queries to improve data quality and performance
Once the individual queries embedded in the MySQL-based application are converted to work on DB2 in an equivalent fashion, it's time to consider how they can be improved to take advantage of the DB2 concurrency and data integrity features. Examine the changes needed to support the appropriate isolation levels between several simultaneous connections to the database.
There are four isolation levels in DB2 to control concurrency. Isolation levels determine how a transaction hides in-progress data changes from other users. The isolation levels are as follows:
- Repeatable read isolation level (RR)
- All rows are locked until the end of the transaction.
- Read stability isolation level (RS)
- Rows qualifying the predicate condition are locked until the end of the transaction.
- Cursor stability isolation level (CS)
- Only those rows in which the cursor is positioned are locked. This is the default.
- Uncommitted read isolation level (UR)
- No rows are locked unless the data is changing.
Consider which isolation level is appropriate for your application from a performance and data integrity perspective. The concurrency decreases and data integrity increases when you move the isolation level from uncommitted read (UR) to repeatable read (RR). The RR isolation level ensures maximum data integrity at the cost of high performance. Conversely, UR ensures the highest level of performance at the cost of some data inconsistency. Figure 1 illustrates the relationship between performance and data integrity.
Figure 1. Isolation levels in DB2
You can specify the isolation level at different levels, from individual queries up through connections and the session. If you have a read-heavy workload, you might want to set isolation to uncommitted read (UR) at the connection level so that you don't have to specify it in each statement. If you have a mix of reads and writes, you may want to focus on modifying individual queries rather than changing from the default level, cursor stability (CS).
You might also want to consider the type of cursor you should use to optimize data access. Both the ibm_db2 and PDO_IBM drivers support two types of cursors: forward-only cursors and scrollable cursors. The default is the forward-only cursor. Generally speaking, the default is a good choice for the vast majority of use cases, in which case you iterate sequentially through a result set. In terms of performance, the forward-only cursor is better than the scrollable cursor. However, in some cases, you will need to read cursors both forward and backward, in which case you should use scrollable cursor. In the example scenario, the cursor type was set using the commands in Listing 15 for each necessary query.
Listing 15. Defining the DB2 scroll type from the ibm_db2 functions and PDO driver
-- For ibm_db2 db2_exec($connection_resource, $sql, array('cursor' => DB2_SCROLLABLE)); -- For PDO_IBM $DB_PDO->prepare($sql, array(PDO::ATTR_CURSOR, PDO::CURSOR_SCROLL));
See section 4.2 Using PHP with DB2 database in the Developing PHP Applications for IBM Data Servers Redbook (see Resources) for more details.
Next, group queries into logical units rather than independent queries. In this pass at the code migration, once you know that the SQL statements are receiving the same data and that the statements are modifying the data as they had in the old application, consider regrouping the data access queries and updates. The goal is to match the performance level of MySQL while ensuring the data integrity level of DB2.
- Narrow the proportion of database connections to statements
- In general, it's faster to build a connection to a database in MySQL than it is in DB2. Therefore reducing the number of total connections per HTTP request by reusing a connection for more than one query improves processor, disk, and network performance. For the example scenario, queries were consolidated, including getting user account information to perform more extensive joins up front (rather than using several follow-on lookup queries), as shown in Listing 16. This might come at the expense of pre-fetching too much data, but for the example scenario, performance improved, the amount of fetched data was reduced, and developers had a better idea of how much data actually should be retrieved for a given page, which also saved resources.
Listing 16. Consolidating queries
-- Two individual queries that require two trips to the data server on the page SELECT FIN_PROJECT_MANAGER AS PM, PROJNAME FROM PROJECT WHERE ID = $id; SELECT EMAIL, FIRSTNAME, LASTNAME FROM USER WHERE ID = $pm; -- Single consolidated query that retrieves the same information in one trip SELECT U.EMAIL, U.FIRSTNAME, U.LASTNAME FROM USER U, PROJECT P WHERE U.ID = P.FIN_PROJECT_MANAGER AND PROJECT.ID = $id
- Make strides toward a more maintainable MVC architecture
- In a well designed three-tiered web application built using the
model-view-controller (MVC) architecture pattern, the controller
handles a web request from a user and then invokes a command on the
model to prepare a result. This result object is transferred to the
view (often represented by a simple HTML template) to display the
response data on the page. The original application did not use the
MVC pattern. Instead, the application embedded queries directly into
the templates, effectively intermingling the three responsibilities
into one page.
By adopting an MVC best practice of grouping the queries, such as getting all user account information and all projects for that user in a single call to the model, the view page need only display the information, rather than incrementally fetching more data. This approach moves the application from being data-centric towards business object-centric, and therefore can help better organize your code by segmenting the data access logic from the layout and design markup.
- Better distribute computational workload
- Because the example scenario already had a single web server and a single database server, the web server mostly dealt with high volumes of HTTP traffic. The web server performed server-side logic through the PHP module. To improve performance at the HTTP server by reducing the CPU and memory consumption that the web server spent in mod_php, some business logic was offloaded to the data server. This often had the additional benefit of reducing the amount of network traffic (because it reduced traffic between the web and data servers), and therefore increased throughput by processing data at the same level at which it was accessed.
- Improve data integrity by grouping queries on the server
- The final and most important benefit for transactional integrity is
to group logic into atomic, transactional units on the data server
using stored procedures, triggers, and user-defined functions. You
will find that the creation of tasks in an application like PTT is
more reliable when the following happens:
- All the data is transferred as a whole to the database
- The multiple INSERT statements are performed together as a unit
- A single success or failure error is returned to the user
Listing 17. Group updates into a stored procedure
CREATE PROCEDURE BILLING_TYPE_UPDATE (IN p_date DATE) BEGIN -- Update for project's billing type FOR row AS SELECT * FROM proj_billingtype_snapshot WHERE end_date IS NULL AND start_date = p_date DO -- Execute two updates in one transaction t1: begin atomic -- Transaction begins -- First update UPDATE fin_attributes SET proj_type = row.billing_type WHERE project_id = row.proj_id; -- Second update UPDATE fin_attributes_archive SET proj_type = row.billing_type WHERE project_id = row.proj_id AND year = YEAR(NOW()); end t1; -- Transaction ends END FOR; END
All DB2 versions support transactions, but most MySQL storage engines, including the default MyISAM type, don't support them. The example application was built using the default MyISAM table type, and the developers tried to enforce data integrity at the code level. Therefore it was a welcome transition to DB2, because it was possible to delegate transaction management to the database, which is something DB2 does very well. It also reduced the overall size and complexity of the code and the various moving parts that needed to work together on any given PHP page in the application. See section 5.2.8 Transactions and isolation level in the Redbook PHP Applications for IBM Data Servers (see Resources) to learn more about concurrency in DB2.
By now, you should have a functional system that can be tested as a whole, or at least as a set of related functions that comprise a specific component. For example, creating a new task that is associated to a parent project is a fairly self-contained use case in the PTT system. Sending a piece of content, such as a product title and description, to a translation service is another use case that can be tested independently. This step of the conversion involves the following substeps:
- Compile or create user-acceptance tests
- Unit test and fix problems
- Capture an image to use as a baseline
In this step, start to engage interested stakeholders to execute a series of use cases that they are used to performing in the old system. You want to ensure that the same series of tests work as expected on the new system.
The example scenario used a fairly standard format for user-acceptance testing over the years. Therefore, there was an archive of tests to validate new features and to avoid regression problems with existing code. The test portfolio was expanded to accommodate the critical functions core to PTT's functionality. These tests generally encapsulate a set of related steps that achieve a business goal, and they resemble a use-case diagram. The difference from a static use-case model is that the tests are normally stored in a spreadsheet format along with screenshots and tables with updatable rows where a tester can enter whether a particular series of steps or individual steps passed or failed. Table 1 shows an example user-acceptance test case format to verify that task creation worked as expected.
Table 1. Sample user-acceptance test
|1. Create a new task||1.1 Log into the system.||Welcome page is shown.||Yes|
|1.2. Open new task page.||New task form loads.||Yes|
|1.3 Fill out form and save.||Success message and unique task ID link shown.||Yes|
|2. Approve task||2.1 Log into the system.||Welcome page is shown.||Yes|
|2.2. Locate the list of tasks.||The list of tasks loads.||Yes|
|2.3 Click the approve button next to task.||Success message notification shown and email sent.||No|
Keep in mind that it's critically important that your testers invest the proper time to execute user-acceptance tests and note any issues. If they simply skim through the tests, they may be overlooking errors. For the same reason, it is important to keep records of who tested at what time and what their reported results were in order to enforce accountability.
This step involves the systematic resolution of problems reported by user-acceptance testers. When the user-acceptance tests are received from the testers, problems that are reported are assigned to individual developers to validate and fix. For example, Test 2.3 in Table 1 would be assigned as a new defect.
As you iteratively migrate the system, developers should compare the migrated function in DB2 against the original MySQL source system to ensure that specific units of work performed as expected. Along with user-acceptance tests, it might be of value to automate this process using a unit-test framework such as PHPUnit. Because the example codebase was not well divided into MVC code that lends itself to easy programmatic testing, automated unit test or continuous integration system was not used.
After the bulk of the use cases are complete, the final step before starting a new test iteration is to capture the current state of the system as a backup and a baseline to refer back to for functional and performance verification. Again, as iterative changes meet stable milestones, it's important to capture backups. This can be in the form of traditional file and SQL backups, or it can be in the form of complete operating system images. It is also important to keep these save points in case of an error down the road, or to use as a comparative baseline.
Once you have a complete set of user-acceptance tests that have been signed off by your stakeholders, it's time to improve the performance of the system. This step involves completing the following steps:
- Confirm the baseline UAT tests
- Use tools to fix performance issues in DB2
- Locate and address bottlenecks in the operating system
For this preparatory step, it's important to reiterate that you should have a good view of the confirmed functionality of the system so that you know whether any improvements to non-functional requirements, such as performance improvements, don't negatively impact or come at a tradeoff to the agreed functional requirements. This step should happen after functional verification for two reasons. First, it ensures that you have a baseline of known-good, signed-off functionality before making any non-functional changes. Second, it helps you avoid the pitfalls of premature optimization in your migration. You may know that a certain query or approach is faster in one case, but you should not blindly apply a change across the entire codebase, particularly if there is no indication that there is a performance problem.
For the example, a history was kept of the user-acceptance tests to confirm whether something had been tested and passed earlier to use to verify regression problems if they arose later.
Your testing at this point should focus primarily on what DB2 can fix for you automatically, because DB2 is the largest change you introduced. As you change code, unit test, and perform UAT, you should identify bottlenecks and implement changes as needed. A key philosophy to follow is to change only the things that are known to be problematic through unit test and UAT testing rather than making assumptions about potential sources of performance problems. After this is done, you can then document what was changed and record it for future application against later problems.
There is a delicate balance here between premature optimization and applying fixes across the board based on a list of best practices. This will be an iterative process throughout the lifetime of the application. See the sidebar about the perils of premature optimization.
One technique that you may find useful is to implement a function, as was done for the example PTT application, that monitors scripts that run for longer than 60 seconds and send an error report. An example of this function is described in Part 4 of this article series.
Based on the information available in that report, you can discover whether the problem is a result of slow PHP execution or a slow query. For PHP problems, a debugger such as the one available in Zend Studio or the Eclipse PDT can help you locate the problem. For query problems, you can use IBM Data Studio to locate the problem and propose a fix. Figure 2 shows an example access plan graph for a given query.
Figure 2. Tuning a query in IBM Data Studio
If you've eliminated DB2 and PHP as the source of your application performance problems, you should iteratively locate additional bottlenecks by using the tried-and-true best practices described by John Coggeshall in "Zend Enterprise PHP Patterns" (see Resources). In short, you can start to look at three general sources of application bottlenecks in order to see if you can address any performance problems through a simple hardware upgrade or by tuning the resources allotted to a virtual machine:
While many of the tools used to determine bottlenecks in the Zend book apply to Linux operating systems, many of the techniques can be achieved with Windows-specific tools as you are developing or after you deploy your updated application to a test or staging server and again after deployment.
After one complete iteration of Steps 1-4, which in turn involved several passes themselves, you should have a functional database system on a Windows workstation and several notes on what you changed and what was problematic. If you used virtual machines, you can capture an image of the Windows system as a snapshot, both to archive as a functioning save point and to use as a baseline to compare future performance changes. Another option, of course, is to use a virtual image in the IBM or Amazon Cloud and use that in the same way. You might want to experiment with several different code updates to see what works best for your environment.
Once you are satisfied with the system on the Windows workstation used in the Steps 1-4, you can tag your converted code as a release in your version control system and prepare the infrastructure for the final steps before deployment as described in Part 4 of this article series.
The goal of this article series is to provide you with an understanding of what is generally needed to migrate a PHP application from MySQL to DB2, what resources are available to help you, and an example of a successful migration.
In this third part of the series, you:
- Learned about the source PHP code you converted.
- Learned how to update the application for DB2.
- Learned how to test and tune the code after you converted it.
In Part 4, you'll learn how to deploy the migrated application and handle ongoing support.
The authors thank Leons Petrazickis and Ambrish Bhargava for their review and comments on this article.
- Use an RSS feed to request notification for the upcoming articles in
this series. (Find out more about RSS feeds of
- Use the no-charge MySQL to DB2 Conversion Guide from IBM Redbooks to plan and
execute your migration.
- Consult the no-charge Developing
PHP Applications for IBM Data Servers IBM Redbook to understand
how to configure, run, and test a PHP application with DB2 using a sample
- Consult the Apress book Zend Enterprise PHP
Patterns by John Coggeshall with Morgan Tocker to learn best
practices and techniques for PHP application performance and architecture.
- Download and learn about the no-charge IBM Data Movement Tool, which can assist in the migration of
database objects and data from MySQL to DB2.
- Go to this Information Center topic to learn more about how to reverse
engineer an existing database into a data model.
- Read the InfoSphere documentation to find out how to transform a physical
data model into DDL using the Eclipse-based tools from IBM.
- View Daniel Krook's blog about Writing SQL for both
MySQL and DB2 about the migration of an ISV's application to DB2.
- See DB2 Basics: Fun with Dates and Times to learn more about MySQL
functions and registers you can convert.
- Consult the Recommended reading list: DB2 for Linux, UNIX, and Windows
application development to learn more about specific development
- Consult the Recommended reading list: DB2 for Linux, UNIX, and Windows database
administration to learn more about specific administration topics.
- Introduce yourself to DB2 and learn its
features in-depth with Understanding DB2: Learning Visually with Examples, 2nd Edition
by Raul F. Chong, Xiaomei Wang, Michael Dang, and Dwaine R. Snow from IBM
- Learn more about DB2 application
development and administration with DB2 9 for Linux, UNIX, and Windows: DBA Guide, Reference, and Exam
Prep, 6th Edition by George Baklarz and Paul C. Zikopoulos from
- Learn strategies to keep your data safe
with Leverage MySQL skills to learn DB2 Express: DB2 versus MySQL backup
- Read Leverage MySQL skills to learn DB2 Express: DB2 versus MySQL
administration and basic tasks to get a high-level overview of the
tools, SQL syntax, concurrency options, and permissions.
- Explore Leverage MySQL skills to learn DB2 Express: DB2 versus MySQL data
movement to learn about the import and export tools available
within MySQL and DB2.
- Compare the features available in the
visual tools for managing database systems in Leverage MySQL skills to learn DB2 Express: DB2 versus MySQL
graphical user interface.
- Read IBM DB2 functions
in the official PHP documentation and the PDO driver to
learn about the two recommended DB2 interfaces for PHP.
practices for DB2 for Linux, UNIX, and Windows offers a series of
papers offering optimal solutions to common challenges and how to best
make use of available tools, such as Information Modeling with Rational Data Architect Version 7,
which was used in this article's case study.
- Take a quick look at the IBM SmartCloud (formerly known as the Development and Test on the
- Get an introduction to DB2 performance
optimization in Best practices for DB2 for Linux, UNIX, and Windows: Cost reduction
strategies with DB2.
- Get an introduction to DB2 query tuning
optimization in Best practices: Writing and tuning queries for optimal
- Find more about migrating to IBM software
from other databases, application servers, and operating systems at the Migration station.
- Learn more about Information Management
at the developerWorks
Information Management zone. Find technical documentation, how-to
articles, education, downloads, product information, and more.
- Stay current with developerWorks technical events and webcasts.
- Follow developerWorks on
Get products and technologies
- Get the IBM DB2 e-kit for Database Professionals to learn DB2 features,
leverage your existing database administration and application development
skills, join the DB2 community, and prepare for certification.
- Reach out to the
Software Migration Project Office DB2 migration team for a
no-charge migration assessment.
- Create and verify
your PHP/DB2 applications on the IBM Cloud with the IBM Smart
Business Development and Test on the IBM Cloud .
- Evaluate DB2 on Amazon EC2.
- Check out Zend Server for a
complete, enterprise-ready web application server to run and manage PHP
applications that require a high level of reliability, performance, and
security on Linux, Windows, or IBM i.
- Zend Server
includes the DB2 drivers, but if you have your own PHP configuration, you
can find the driver extension source on PECL or Windows binaries on SourceForge.
- Learn more about
DB2 training and
certification or attend an Information Management bootcamp.
- Download and
install the no-charge DB2
Express-C data server.
- Use Rational Software Architect or InfoSphere Data Architect to perform logical and physical data
- Use Optim
Development Studio to develop and optimize your DB2-based
- Leverage the
vCenter Converter to convert physical machines to virtual images.
- Build your next
development project with IBM trial
software, available for download directly from developerWorks.
- Participate in the discussion forum.
- Search, monitor, and submit your
questions on the PHP forum: Developing PHP applications with IBM Information
Management products (DB2 and Informix).
- Submit your questions about the IBM
Data Management Tool on the DB2 Application Enablement forum.
- Check out the developerWorks
blogs and get involved in the developerWorks
Daniel Krook is an IBM/Open Group Master Certified IT Specialist based in the greater New York City area. He has over ten years experience in web application development and currently builds a Cloud infrastructure for IBM using Java EE, DB2, REST, and mobile technologies. He holds certifications in PHP, Java EE, BlackBerry, DB2, and Solaris. He writes PHP-related articles for IBM developerWorks and co-authored the IBM Redbook "Developing PHP Applications for IBM Data Servers."