Accelerate access to critical data using IBM solidDB Universal Cache

Use an in-memory RDBMS as a cache to existing disk-based databases

IBM® solidDB® Universal Cache is in-memory caching software for use with traditional disk-based relational databases. It promises extreme speed, adaptability, and robustness. Explore these promises and follow a sample use case for the first — extreme speed. See, also, how to use a custom testing tool written in Jython.

Share:

Sami Salkosuo (sami.salkosuo@fi.ibm.com), Software IT Architect, EMC

Sami Salkosuo works as a Software IT Architect at IBM Software Group in Finland.


developerWorks Contributing author
        level

12 March 2009

Also available in Chinese

IBM solidDB product family is a recent addition to the IBM software portfolio. Version 6.1 was released June 2008, and the second release, 6.3, was released December 2008. The two offerings of IBM solidDB product family are IBM solidDB, a persistent in-memory relational database, and IBM solidDB Universal Cache, in-memory caching software for traditional disk-based relational databases.

In this article, get an introduction to IBM solidDB Universal Cache and its features. Follow a simple use case and test scenario that shows the speed of IBM solidDB. See how to write your own test tools (even though application development is not the main focus of this article) and test JDBC connectivity using Jython programming language to quickly prototype a testing application. The test scenario uses IBM DB2® and Apache Derby with solidDB Universal Cache to explore database performance.

IBM solidDB and IBM solidDB Universal Cache

IBM solidDB is a relational in-memory database. It offers microsecond speed in database transactions. Applications can use familiar SQL to access it using standard JDBC (using JDBC 2.0 Type 4 driver) and ODBC interfaces. IBM solidDB supports multiprocessor and multi-core architectures, and it has configurable security using user and role privileges.

IBM solidDB has high availability and load balancing features, as well as advanced replication features. It can also be embedded and linked directly with Java™ or C/C++ applications.

IBM solidDB Universal Cache can be used as a cache to accelerate access for performance-critical data for traditional disk-based databases such as DB2, IBM Informix® Dynamic Server, and Oracle. Figure 1 illustrates the IBM solidDB product family:

Figure 1. IBM solidDB product family
IBM solidDB product family

Both IBM solidDB and IBM solidDB Universal Cache persist all data to disk so that applications using either product are assured of data durability.

Extreme speed

Because IBM solidDB is an in-memory database, speed is almost a side effect. In addition, solidDB is designed to take advantage of the fact that data is in memory. For example, data structures and access methods are specifically optimized to work on data that is located in main memory.

Since IBM solidDB always persists all the data to disk, data integrity is ensured over server failures. Persistence is achieved using built-in checkpoint and transaction logging mechanisms. Using checkpoints, committed transactions are written to disk. Between checkpoints, transactions are written to a transaction log and, if the system crashes, transactions are recovered from the transaction log. Transaction logging can be configured using strict or relaxed logging, where strict logging writes committed transactions to disk synchronously and relaxed writes transactions asynchronously. Transaction logging can also be turned off.

The extreme speed capability that is provided by an in-memory database can be combined with a traditional DBMS, where IBM solidDB acts as a cache to the databases. Figure 2 shows an overview of how Universal Cache works:

Figure 2. How IBM solidDB Universal Cache works
How IBM solidDB Universal Cache works

At a high level, here's how Universal Cache works:

  1. You identify the tables that would benefit from high-speed access. (High-speed access may be relevant to all or just a small set of tables.)
  2. You load the selected tables from the backend database into IBM solidDB Universal Cache.
  3. Applications connect to IBM solidDB Universal Cache and use standard interfaces to access the data.
  4. Transparent to applications, IBM solidDB Universal Cache automatically synchronizes data with the backend database.

Since the same data is in the back-end DBMS, applications that do not require high -peed access, such as reporting tools, can access data in the back-end database. Figure 3 shows an architectural overview of IBM solidDB Universal Cache:

Figure 3. IBM solidDB Universal Cache architecture overview
IBM solidDB Universal Cache architecture overview

The architecture includes following components:

  • solidDB - The frontend in-memory cache. Synchronization can be unidirectional from/to the front-end database or bi-directional.
  • Data server- The back-end database.
  • CDC instance (Change Data Capture instance) - An agent software for each database that reads database log files and replicates changes to target databases using another CDC agent.
  • Configuration tool - A tool that configures the CDC instance.
  • Management Console - A GUI application to configure and monitor replication.
  • Access server - Enables the Management Console to access and configure CDC instances.

Adaptability

IBM solidDB Universal Cache can be adapted to specific needs. Universal Cache can be deployed as read-only cache where data is owned by back-end database (such as DB2 or IDS), or read-write cache, where data is owned by IBM solidDB front-end database or data ownership is shared with front-end and back-end databases.

In each scenario, data changes are synchronized automatically, transaction by transaction, or on demand. In the shared ownership scenario, synchronization conflicts are resolved by using predefined conflict resolution methods.

Adaptability in IBM solidDB Universal Cache gives you great flexibility to configure the cache. You can use the same database schema in the front-end as in the back-end, you can load only specific tables, columns, or rows to the front-end, and you can extend the front-end in-memory cache with additional tables. IBM solidDB Universal Cache also supports stored procedures, triggers, and events, enabling IBM solidDB Universal Cache to execute business logic in the in-memory cache.

Scalability of IBM solidDB Universal Cache leverages large memory sizes of 64-bit computers, and uses multi-processor and multi-core architectures to provide and maintain microsecond response times and high transaction throughput. Scalability enables IBM solidDB Universal Cache to be scaled horizontally to multiple servers, where each server provides access to either identical data or each server may hold different partitions of large data in the back-end database. For example, if a back-end database contains 1,000,000 data rows, it can be partitioned to four IBM solidDB Universal Cache instances in four servers, each containing 250,000 data rows.

Robustness

Robustness of IBM solidDB Universal Cache has several components: data durability, high availability, instant failover, and cache management/monitoring capabilities.

IBM solidDB implements ACID-compliant (Atomicity, Consistency, Isolation, Durability) in-memory cache to provide data durability. ACID properties guarantee that solidDB transactions are processed reliably and that data is persisted to disk using checkpointing and transaction logging. Even if the network connection between the front-end and the back-end databases is disrupted, front-end logs transaction to disk and they are later, when network connection is available again, synchronized to the back-end database.

High availability and instant failover are provided by the solidDB HotStandby (HSB) configuration, where two copies of data is synchronized between two cache instances. If system failure occurs, failover is done in less than a second without disruption to operation.

In the HotStandby configuration, transactions are replicated from primary to secondary so that both nodes are synchronized. Replication can be configured using synchronous or asynchronous replication.

A major feature of HotStandby is that applications have only one logical connection to IBM solidDB. In case of failure, there is no action from the application and ODBC/JDBC driver automatically maintains connectivity to IBM solidDB.

HotStandby also has a beneficial side effect of load balancing read operations. Because driver maintains connection to both nodes while write transactions are directed to the primary node, read transactions can be directed to either node. This can increase performance by even 100%.

Figure 4 shows an architectural overview of IBM solidDB HotStandby configuration that is used to provide robustness in IBM solidDB (stand-alone) and IBM solidDB Universal Cache scenarios:

Figure 4. IBM solidDB HotStandby architecture overview
IBM solidDB HotStandby architecture overview

Primary and secondary servers have solidDB servers that communicate using HotStandby API (HSB API) admin commands. The watchdog application is called HAC, High Availability Controller.

IBM solidDB Universal Cache provides a graphical management console for users to define database schemas, field mappings, and data transformations. Runtime counters are also available to monitor the cache operation.


Use case for IBM solidDB Universal Cache

The use case in this article is taken from financial industry. There is a fictional stock exchange and a stock broker with multiple stock traders who each make a number of trades. In this scenario, trades are randomly generated. Figure 5 and Figure 6 provide architectural overviews of the scenario use case.

Speed and availability are paramount in financial markets, and IBM solidDB Universal Cache fulfills both. In addition, network latency is an issue that can be addressed with IBM WebSphere® MQ Low Latency Messaging, but it is out of the scope of this article.


Test scenario

To test the speed of IBM solidDB Universal Cache, this article uses a simple scenario where you have one or more "traders" that trade and you update the database using normal SQL INSERT statements.

The test scenario includes IBM DB2 and Apache Derby databases, and tests are executed against each database. Results of the tests are shown only to explore the speed of IBM solidDB Universal Cache; no conclusions of database performance have been made (see the test results disclaimer).

Architecture

The test case uses the architecture shown in Figure 5 and Figure 6. Figure 5 represents a scenario using a direct connection to a database (Derby and DB2 in this case):

Figure 5. Test architecture, direct to database
Test architecture, direct to database

Figure 6 shows scenario where IBM solidDB Universal Cache is used as a front-end cache to a DB2 back-end database:

Figure 6. Test architecture, solidDB cache
Test architecture, solidDB cache

Broker 1 machine is "a stock broker company" that has one or more Trader processes, "stock traders". Each Trader process runs on top of Java Virtual Machine (JVM). The test database is located in another machine called DB Server, and the database connection is JDBC using a database-specific driver.

The statistics database is a stand-alone IBM solidDB database in the "stock broker" computer, and the statistics program retrieves elapsed times of INSERT statements from the statistics database using a separate process after the test cases have been executed. The statistics database is also used to generate unique trade IDs.

Database servers, IBM solidDB Universal Cache cache server and DB2/Derby server, are VMWare virtual machines running on top of a single desktop PC. The component stack for both servers is shown below in Figure 7:

Figure 7. Database server component stack
Database server component stack

Table 1, below, shows the component details:

Table 1. DB server component details
ComponentDetails
HardwareCPU: AMD Athlon 64 Dual Core 3GHz
RAM: 3.5GB
Hard disk (back-end VM): 80GB External USB 2.0
Hard disk (front-end VM): 320GB Internal SATA
Network: 100 Mbps
Host OSWindows XP SP2 (x86)
Virtual hardwareBack-end server:
VMWare Workstation 6.5.0
RAM: 2GB
Single CPU
Bridged networking
Front-end server:
VMWare Workstation 6.5.0
RAM: 512MB
Single CPU
Bridged networking
Guest OSFront-end and back-end:
Windows 2003 Standard SP 2 (x86)
DatabaseDB2, Derby, and solidDB. (See the "Databases" section for more information.)

For the test scenario, "stock broker" was an IBM T42 laptop, with 2GB RAM, using Windows XP SP 2. The laptop was connected to virtual machines using 100Mbps LAN.

Tools

Jython

Jython is an implementation of the Python programming language written in Java code. Jython integrates with Java technology, so it makes it easy to develop Java applications using a scripting language. Jython can be easily embedded in applications to provide scripting capabilities for application users.
Jython is also used in IBM products. For example, WebSphere Application Server, Version 6.1 and later prefers Jython as a scripting language for WebSphere Application Server administration.

Testing tools for the test scenario are custom-made. This article uses Jython to test how it can be used in ad-hoc development and prototyping. The following tools were developed (by the author) for database testing:

  • trader.py
  • dbstats.py
  • cleartables.py
  • createstatisticstables.py
  • clearid.py

You can download the code for the testing tools from this article.

trader.py is the program that makes trades to the stock exchange. This Jython program connects to the database and updates the stock exchange database with random trade. Listing 1 shows the loop that inserts trades to the database:

Listing 1. Inserting trades to database
for dbname in databaseNames:
  preparedStatement[dbname]=dbConnections[dbname].prepareStatement(
          "INSERT INTO TRADE_DATA (ID, TICKER, PRICE) VALUES (?,?,?)")    
currentPrice=0.0
for i in range(totalTrades):
  print "\rTrade %04d/%04d" % (i+1,totalTrades),
  currentPrice=getNextPrice()
  tradeId=getNextIDFromSolidDB()
  insertElapsedTimeRow(tradeId)
  for dbname in databaseNames:
    preparedStatement[dbname].setLong(1,tradeId)
    preparedStatement[dbname].setString(2,ticker)
    preparedStatement[dbname].setFloat(3,currentPrice)
        
    startTime=System.nanoTime()
    preparedStatement[dbname].executeUpdate()
    endTime=System.nanoTime()
        
    updateElapsedTimeRow(tradeId,dbname,endTime-startTime)

Database response time used to get the elapsed time of the SQL INSERT is done using preparedStatement and its executeUpdate() method. Elapsed time, in microseconds, is then inserted to the statistics database.

dbstats.py program uses open source library JFreeChart to draw charts from statistics results. Listing 2 shows how JFreeChart is used to create a chart from the database table. The program generates a chart of the average times and displays the result in an HTML page. Results of the dbstats.py program are shown separately in the "Test results" section.

Listing 2. Creating statistics charts
for dbname in databaseNames:
  cols=cols+(", AVG(ELAPSED_TIME_%s) AS %s " % (dbname,dbname))

jdbcCategorySet=JDBCCategoryDataset(statDbConnection)
jdbcCategorySet.executeQuery("SELECT 'Averages' %s FROM ELAPSED_TIMES" % cols)
chart=ChartFactory.createBarChart3D(
          "Total trades: %d\nElapsed time averages." % totalTrades,"","Microseconds",
          jdbcCategorySet,PlotOrientation.VERTICAL,True,True,True)
averageFileName="%s_average_times.png" % timestamp
writeImage(chart, averageFileName)

cleartables.py, createstatisticstables.py, and clearid.py are used to set and reset the test scenario. cleartables.py clears the database tables to each database, createstatisticstables.py creates statistics tables to the IBM solidDB Universal Cache statistics database, and clearid.py clears trade IDs between test runs. More about the testing process is in the "Test cases" section.

Java runtime for testing tools was JDK 1.6.0_11. Download the code for the testing tools from this article. Note that you need to separately download Jython, Java runtime, JFreeChart, and JDBC drivers for the databases.

Databases

Table 2 lists databases that were used in the test. Databases are running simultaneously in the single virtual machine, but they are not accessed simultaneously.

Table 2. Databases in test scenario
DatabaseVersionComments
IBM solidDB Universal Cache6.3Front-end cache to back-end DB2 database.
IBM DB29.5Free Express-C version of DB2.
Apache Derby10.4.2.0Latest version at the time of testing.
JVM is Sun JDK 1.6.0_10.

Note that none of the databases were optimized in any way; they were used as they were installed using the default configuration.


Test cases

The test scenario uses three test cases to explore the speed if IBM solidDB Universal Cache. Figure 8 shows the testing procedure. Three test cases in the test scenario use 1, 2, or 4 trader clients inserting 50,000, 100,000, and 200,000 trades to the database.

Figure 8. Testing procedure
Testing procedure

As illustrated in Figure 8, you first need to set up each of the databases and configure solidDB Universal Cache as front end for DB2; replication is configured only from front-end solidDB to DB2 in the back end. You also need to create the tables that are used in the test.

Setting up IBM solidDB Universal Cache

Figure 9 shows the steps needed to set up solidDB Universal Cache.

Figure 9. Setting up solidDB Universal Cache
Setting up solidDB Universal Cache

In this test scenario, the first step — identify tables that would benefit high-speed access — is easy. For the test scenario, you have to create just one table in the DB2 database, called TRADE_DATA. Listing 3 shows the SQL to create the table in the DB2 database:

Listing 3. DB2 SQL snippet
CREATE TABLE TRADE_DATA (ID BIGINT NOT NULL, TICKER VARCHAR(5), PRICE REAL, 
                         CONSTRAINT pk PRIMARY KEY (ID) )

The second step of setting up solidDB Universal Cache is to load selected tables from the back-end to the front-end cache. Complete this step using the Management Console. Because of the adaptability of IBM solidDB Universal Cache, set up replication so that it is read/write cache, where data is owned by the front end.

The third step is actually developing (or using) applications that connect to the database. Since the connection is made using a standard interface (JDBC in this case), applications do not need to worry about database implementation.

The fourth step is automatic — IBM solidDB Universal Cache synchronizes data to the back-end database.

Testing

As illustrated in Figure 8, execute the following steps (steps 1-9 in Figure 8) for each test case (1, 2, or 4 clients):

  1. Clear tables. Before running the test, delete the data from the tables.
  2. Create the statistics tables before running tests.
  3. Do trades using 1, 2, or 4 clients against Derby database.
  4. Clear ids before running test against another database.
  5. Do trades using 1, 2, or 4 clients against DB2 database.
  6. Clear ids before running test against another database.
  7. Clear tables in DB2 so that solidDB cache does not try to update existing data.
  8. Do trades using 1, 2, or 4 clients against DB2 database.

In steps "Do x trades...", trader processes are executed in parallel, in one, two, or four processes.


Test results

Disclaimer

Results of the test cases are not, in any case, official or even unofficial results. They are merely an indication of how databases may perform and no conclusions have been made. The test cases were not executed by any other than the author, and the results were not verified by any other than the author.

The test environment and architecture are far from optimal, and there has been no optimization in any of the tested databases; each database is used as it was installed.

The testing tools were developed in a "quick-and-dirty" fashion and not optimized at all. Tools were done solely using Jython because the author was interested in seeing how Jython can be used for application prototyping.

In addition, all the tests were SQL INSERT commands. Read access to each database is naturally a lot faster.

Results are also shown in the test cases. Diagrams of average times for each test case are presented as well as tables of minimum and maximum times. Values for results are in microseconds.

Test case 1: One client

This test case is just a single client accessing the database. Broker 1 machine is used as a client. Total of trades, SQL INSERTs to database, for a single client is 50,000.

Figure 10. Results of test case 1
Results of test case 1
Table 3. Results of test case 1
DatabaseMinMaxAvg
solidDB270.98400986586.984375831.555115
Derby1223.060059155578.7187501938.367920
DB21159.64502085668.4296881823.605347

Note how total trades (in Figure 10) shows only 49994 trades. This is because the statistics program removes the maximum value (because that is probably the result of the first call to the database when the connection to the database is initialized) and the minimum value of elapsed time before drawing the diagram.

Test case 2: Two clients

This test case has two clients accessing the database. Total of trades for two clients is 100,000.

Figure 11. Results of test case 2
Results of test case 2
Table 4. Results of test case 2
DatabaseMinMaxAvg
solidDB274.058014102114.375000946.763733
Derby1262.171997278358.2812502938.149902
DB21250.718018194886.2187502905.133301

Test case 3: Four clients

This test case has four clients accessing the database. Total of trades for four clients is 200,000.

Figure 12. Results of test case 3
Results of test case 3
Table 5. Results of test case 3
DatabaseMINMAXAVG
solidDB270.704010174741.4843751137.652710
Derby1259.656982332118.7500003137.169678
DB21189.816040334269.8437502780.724121

Summary

This article introduced IBM solidDB Universal Cache and its major features: extreme speed, adaptability, and robustness. Extreme speed was explored in this article using SQL INSERT commands, and it is clear that speed would bring great value to various applications, such as in financial markets.


Download

DescriptionNameSize
Database testing toolsTestingTools.zip5KB

Resources

Learn

  • IBM solidDB: Get more information about solidDB, an in-memory database optimized for extreme speed.
  • IBM solidDB data sheet (IBM, 2008): Get detailed information about solidDB and its features.
  • DB2 skill kit: Increase your knowledge and skills whether you are a database administrator or an application developer. Learn how to install, configure, and administer IBM DB2 Universal Database. Topics covered include: Overview, installation, configuration, database creation, and administration.
  • developerWorks Information Management zone: Learn more about Information Management. Find technical documentation, how-to articles, education, downloads, product information, and more.
  • Stay current with developerWorks technical events and webcasts.
  • Technology bookstore: Browse for books on these and other technical topics.

Get products and technologies

  • The Jython Project: Learn more about Jython.
  • JFreeChart: Download JFreeChart, a free 100% Java chart library that makes it easy for you to display professional quality charts in your applications..
  • Apache Derby: Download Apache Derby.
  • DB2 Express-C: Now you can use DB2 for free. Download DB2 Express-C, a free version of DB2 Express Edition for the community that offers the same core data features as DB2 Express Edition, and provides a solid base to build and deploy applications.
  • Build your next development project with IBM trial software, available for download directly from developerWorks.

Discuss

Comments

developerWorks: Sign in

Required fields are indicated with an asterisk (*).


Need an IBM ID?
Forgot your IBM ID?


Forgot your password?
Change your password

By clicking Submit, you agree to the developerWorks terms of use.

 


The first time you sign into developerWorks, a profile is created for you. Information in your profile (your name, country/region, and company name) is displayed to the public and will accompany any content you post, unless you opt to hide your company name. You may update your IBM account at any time.

All information submitted is secure.

Choose your display name



The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerWorks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

Required fields are indicated with an asterisk (*).

(Must be between 3 – 31 characters.)

By clicking Submit, you agree to the developerWorks terms of use.

 


All information submitted is secure.

Dig deeper into Information management on developerWorks


static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Information Management
ArticleID=375717
ArticleTitle=Accelerate access to critical data using IBM solidDB Universal Cache
publish-date=03122009