Previously, I’ve talked about the new DB2 10 temporal tables and how they are great for data warehousing applications. To leverage the temporal tables and data warehousing applications within the DB2 for z/OS environment the system needs the proper DSNZPARM settings. Since everyone has too many systems, some organizations only review system DSNZPARMs during migrations, meaning that many of the new settings might not be set up or enabled. It’s a good idea to review the DSNZPARMs on a regular basis and get them set up to maximize the performance of your data warehousing application SQL by leveraging all the DB2 optimizer capabilities.
First, get a listing of the current DSNZPARMs for your data warehouse system’s settings. This can be done a number of different ways: through your performance monitor, through the JCL that creates the DSNZPARMs or through the IBM REXX exec of the DB2 stored procedure DSNWZP. This DSNWZP stored procedure routine is one of the many DB2 administration procedures and is set up during the system install.
The data warehouse DSNZPARMs affect the performance of the DB2 system and its applications in many different ways, from the DB2 optimizer’s access path choices to the number of parallel processes DB2 will perform during data retrieval. To make sure that all DB2’s performance features and functions are available to improve performance I've created an alphabetic list of DB2 data warehouse DSNZPARMs. Making sure these data warehouse DSNZPARMs are enabled and adjusted for your system will help your application fully exploit DB2 and get the best performance possible for your data warehouse.
Well the votes are in, 22 billion rows is big enough data. It’s not the billions of web logs rows of a Google or Facebook but its big enough for everyone. One of the comments that struck me was that one in a million happens 22,000 times. So whatever your criterion is for big data, it is more a state of mind about the amount of data as opposed to the actual terabyte amounts or the number of rows. Regardless of what database systems you work with, big is a relative term. Just ask your SQL Server, Oracle and Sybase DBA friends what they consider a big system. Usually the answer is nowhere near what you get for DB2 z/OS or even DB2 LUW systems. I talked about this a last year in my blog ("Performance is Relative").
Other comments and questions received about last week’s blog asked for more clarification on the idea of keeping a database design simple. So below are three different ways to keep your big data data warehouse design simple.
First: There are reasons Bill Inmon’s and Ralph Kimball’s decentralized and centralized data warehouse ideas are so popular, those design patterns work. Design patterns for all types of IT applications, Java/.NET MVC (model view controller), various business models and standard processes have been extensively analyzed and endorsed over many years through the design pattern books, conferences and government studies. The decentralized and centralized data warehouse design patterns work and your design should use them for your data warehouse performance. Big data or not, there is no reason to do something more complex. Starting with these types of design patterns, using and optimizing simple Fact table(s) surrounded by Dimension tables(s) design pattern will provide you data warehouse performance. Decentralize or extend these design patterns with as many Fact tables and slow moving Dimension tables will optimize and minimize the amount of big data referenced in typical transaction and your data warehouse performance won’t be an issue.
Second: Make sure to normalize your big data design. It’s typical to try to consolidate everything within a data warehouse performance design. Unfortunately having too many elements in a table forces too much data into an application transaction and data warehouse performance can suffer. Just as decentralized and centralized data warehouse performance design patterns have been used for years, database table normalization has been around for even longer because it logically optimizes your design. The database design normalization process has been documented everywhere over the years and it is effective for making sure the table elements relate to the whole database table key(s). Combining table keys or designs causes excessive repeating data or data groups and over-normalization leads to excessive application joins. Normalization is striking a balance and no one does it perfectly the first time. Normalize your data warehouse performance design several times and your transaction performance can strike a balanced performance for all critical applications.
Third: Design, test, redesign test and repeat. Schedule enough database and application meetings and testing time to understand all the application transactions and reports. Data warehouse performance and modeling big data can get unwieldy, so testing your design early is vital. Sometimes big data table population numbers cause tools to abort. Cut the number of zeros down and model all your application transactions against your database design, build it and run applications against it. Data warehouse performance requires real life testing and actual running of the code or SQL that interfaces with the design. No one has any time to do it perfectly but everyone will be mad if you have to redesign and do it all over right before implementation. Know your performance before production through thorough testing. Big data and data warehouse performance requires design and testing. Make sure to do these several times during your development with as much big data as possible.
These are only some of the simple truths that insure your data warehouse performance for your big data system is a success.
Was 22 billion rows Big Data? As I began the new data warehouse project, I knew that the data warehouse performance needed to be designed from the very first data element. The new project was very interesting with complex calculations, cross referencing reports and Big Data implications and big data warehouse performance challenges. Many standard reports that affected whole industries and government policies were depending on data warehouse performance for timely reports and 100% not even 99.999% accuracy of every bit of the Big Data.
Another consideration was the implementation deadline for this high performance Big Data data warehouse system was only 120 days away and wasn’t movable. All of these reasons made this Big Data system quite the challenge. Our small team of business users, SQL analysts and technical staff were committed and we were successful in completing the implementation using agile scrum development techniques. The data warehouse performance was so improved that some of the processes went from 37 hours to under 7 minutes and some reports come back in seconds. Complex calculations and report verification done during our data quality testing found bugs in the old system’s existing reports that had been trusted for over 20 years, so we knew everything was 100%. Our overall performance, 100% accuracy and overall quality implementation were a testament to the commitment of a winning team effort.
There were many success factors that drove the team, the design, development and successful implementation of this 22 billion row Big Data system. Three success factors drove the data warehouse performance.
First: Keep it simple. Yes I know everyone talks about keeping it simple in every development and system design. Designing it and doing it simply can be much harder. Instead of endless discussions and meetings, design the big picture and then see how all the reports, every type of interface and users interact with the warehouse. Data warehouse performance is achieved through computing power. Simple designs that leverage parallelism, that drive I/O and calculation computing power for quickly producing reports are key.
Second: Leverage all existing infrastructure and processes. The tight 120-day schedule from start to finish did not allow the team to have issues, only time delayed solutions. Since we used an agile scrum type of development methodology the agile scrum stories and the data warehouse performance of the Big Data reports were tested several times during the short development period. All used the same standard corporate testing procedures, documentation and approval processes that had been used for years. The team contacted everyone that needed to guarantee the data warehouse performance, got them involved early and got their requirements, feedback, improvements and eventual approval for all the components and implementation. Everyone knew the Big Data project schedule ahead of time so all infrastructure and processes were ready. Issues were worked on until solutions were approved and implemented.
Third: Don’t be intimidated by Big Data. I have been fortunate to design, develop, deploy and study data warehouse performance since back in 1988 when I implemented my first “Big Data” system for the financial firm E.F. Hutton. At least back then it was considered Big Data and it seems that all the other data warehouse systems I have dealt with over the many years have had bigger data then the previous one. Big Data will keep getting bigger and it is just another couple of zeros on the end of your table population and design considerations. Data warehouse performance is built on the same principles as many years ago. As I mentioned with the first success factor, design for both I/O and CPU parallelism in every single item of your system. Only then can your Big Data system have the data warehouse performance it needs to speed up your processes and reports from hours to just minutes or even seconds making everyone, especially your users, happy
To figure out the best temporal table design aspects you need to think of the various options and considerations that will affect its performance. The most important aspect for your temporal table is the answers that your applications or users are expecting from it. The best way is to figure out the time aspect that the application is trying to capture. Are your applications looking for the financial value, the insurance coverage level, enrollment status, customer value or something else?
The temporal table status can be contingent on two types of settings: business time or the system processing time. If the processing is delayed and the system time is later than expected, does that affect your temporal table status? Or are you using the temporal table in a real time scenario where either the business or system time will affect the meaning of the data? There are many ways to respond to the situations and questions, but the design decision should be based on the application and user questions that need to be answered. So it is best to test both SYSTEM_TIME and BUSINESS_TIME scenarios out and see which design provides the best answers with the best performance.
The next design point is to figure out your timestamp type. Do your temporal table application answers require distinct timestamps throughout the system? Your DB2 10 system now has new capabilities to provide a column that is unique within the table system wide. This DB2 syntax is defined WITHOUT OVERLAPS and can be used for your temporal table only for your BUSINESS_TIME values. After the temporal table is created, an index is defined for it using your unique columns and the BUSINESS_TIME WITHOUT OVERLAPS keyword. BUSINESS_TIME is the only option the WITHOUT OVERLAPS keyword works with.
When BUSINESS_TIME WITHOUT OVERLAPS is specified, the columns of the BUSINESS_TIME period must not be specified as part of the constraint. The specification of BUSINESS_TIME WITHOUT OVERLAPS adds the following to the constraints:The end column of the BUSINESS_TIME period in ascending orderThe start column of the BUSINESS_TIME period in ascending orderThe minimum value of a TIMESTAMP(12), the value is 0001-01-01-00:00:00.000000000000The maximum value of a TIMESTAMP(12), the value is 9999-12-31-24:00:00.000000000000
For DATE the minimum is 0001-01-01 and the maximum value is 9999-12-31.
A system generated check constraint named DB2_GENERATED_CHECK_CONSTRAINT_FOR_BUSINESS_TIME is also generated this definition process to ensure that the value for end-column-name is greater than the value for start-column-name. BUSINESS_TIME WITHOUT OVERLAPS must not be specified for a PARTITIONED index.
There are a number of considerations when creating your DB2 10 temporal table. When your application needs it to be unique, the system wide the BUSINESS_TIME option provides the capabilities with some cautions. Check out other posts on temporal tables via Developer Works or at my site, www.davebeulke.com.
It is usually pretty easy to quickly implement new DB2 features. DB2 makes it easy to improve your database performance with a new zParm or table space definition. Unlike most of these new features, however, DB2 temporal tables need research before you implement them. DB2 temporal tables offer great flexibility and many data warehouse design options that can be leveraged very effectively--or be abused--with the wrong application design.
To prepare you need to evaluate whether the application is appropriate for temporal tables. First with DB2 temporal tables it is even more important to determine the frequency of the inserts, updates and deletes that are going to happen. Frequencies are always a good design point for any application but it is especially important for DB2 temporal tables because of the way BUSINESS_TIME or SYSTEM_TIME is maintained and how all the data changes are captured within the associated history temporal table. Every data change could really be two processes because rows need to be replicated into your history table. That could be a major performance consideration.
The next research points are the restrictions with DB2 temporal tables and history tables. The temporal table must be a regular table with the added BUSINESS_TIME or SYSTEM_TIME. No clone table capabilities, column masks, row permissions or security label columns are allowed. The same restrictions are in place for the history table.
The temporal table and its associated history table must be kept in sync. Restrictions also exist regarding the altering, adding or removing columns into the temporal table and the history table to guarantee integrity. Also backup and recovery for the temporal table and its history table must be kept in sync and there are restrictions around DB2 Utilities that could delete data from these tables. In addition once the history table is defined, its table space or table cannot be dropped. So make sure the columns desired in your DB2 temporal tables are stable and well defined for the application.
There are several resources that should be reviewed before designing your first application using DB2 temporal tables. The first is the IBM DB2 10 manuals. By reading these friendly manuals you get a good understanding of the syntax, various examples and details about all the restrictions. Next there have been some presentations about DB2 temporal tables at past IDUG conferences and IOD conferences. Track down you colleagues that went to the conferences and get the CD or access to the website for the presentation downloads.
The improvements in DB2 10 technology are definitely making building performance into business intelligence systems easier. Additional SQL data warehousing extensions such as Rank, Dense_Rank, and Materialized Query tables (MQTs) were great when they came out in DB2 8 and DB2 9. DB2 10 takes another giant step with data warehousing additional features.
Use a DB2 10 Temporal Table
DB2 10 provides Temporal and Bi-Temporal database tables along with the OLAP capabilities of moving sums, averages and aggregates. These additional improvements within the DB2 engine, SQL and table designs provide faster result set capabilities for overall business intelligence performance. Also the DB2 10 enhancements with the new scalar and table functions can provide performance for all types of OLAP activities.
Since these business intelligence performance features are built directly into the DB2 10 engine, temporal designs, moving sums, averages and aggregates will become standard calculations and properties for all reporting requirements. Calculations will be done using different groups of temporal time periods or location-based data for product sales, store location or other common data warehouse criteria. Data warehousing performance will improve dramatically as DB2 data warehouse applications are quickly designed to leverage these common built-in features and functions for all types of business intelligence performance requirements.
By using a temporal data warehouse design along with partitioning, time frame or common table design, the standard OLAP functions can provide quick calculations for complex or simple data warehouse reporting requirements. Also given the improvements within SQL, moving sums, averages and aggregates can be included in common table expressions, SQL SELECT lists or ORDER BY statements to satisfy any application requirements
Leverage DB2 10 for Business Intelligence Performance
Business Intelligence performance always improves when the designers and developers leverage the DB2 engine built in functions. Use temporal tables, bi-temporal, moving sums, averages and aggregates so your business intelligence performance will only get better with DB2 10.
Corporations build business intelligence and data warehouse systems to gain competitive advantage. DB2 helps you create business intelligence and data warehouse systems that give the performance corporations need by providing many industry leading features.
Combine DB2 MQTs for DB2 Business Intelligence Databases
In other posts I talked about DB2 business intelligence and data warehouse designs that combine multiple DB2 materialized query tables (DB2 MQTs). Combining these DB2 MQTS with different Weekly, Monthly, Quarterly and Yearly data provides a great way to quickly get reporting information.
But business intelligence data warehouse systems are more than just providing a platform to total sales figures quickly. As an IBM commercial pointed out, business intelligence is about getting a deeper understanding of your business. Your data warehouse design needs to be able to provide the extra data or information that provides context, comparisons and a deeper meaning to data.
Drill Down for Requirements for Your DB2 Business Intelligence and Data Warehousing Systems
Understanding the flow of the sales totals, noticing whether sales trends are improving or declining is only the beginning. The ability to drill into your data warehouse information quickly and meaningfully is the start of a business intelligence system. Being able to drill through big department categories and then also analyze to the tiniest pixel color levels within unstructured data items is becoming part of data warehouse requirements for new systems.
When you are designing your new DB2 data warehouse business intelligence system, don’t stop with the major questions that are going to be asked. Follow up to uncover additional questions that will be asked after the first question gets answered. After you have uncovered four or five levels of questions, you’re ready to start the DB2 business intelligence data warehouse design. This way your system can provide better value and deep insights into improving the business.
As the European IDUG conference approaches, it’s good to remember the outstanding technical experiences discussed at IDUG with DB2 for z/OS and LUW from many user companies. The knowledge is from users, consultants and IBMers who have implemented systems and applications and have that extra in-depth knowledge of the multiple versions of DB2, including DB2 10 from the beta program. The presentations always provide a little extra tidbit that helps you avoid problems because speakers discuss the details of their scars from the front lines.
At IDUG North America, all the DB2 10 beta users’ presentations was really great. Most of the DB2 10 beta customers gave their impressions and testing results of DB2 10. They demonstrated why DB2 10 would be rolling into production environments soon. Some companies were even planning on consolidating DB2 systems or data sharing members because of the improved memory capabilities and since a single DB2 10 system can now handle almost 10 times more users as previous versions.
Several companies were evaluating DB2 10 against SAP and PeopleSoft applications. The testing of these applications has gone well and the companies were waiting for SAP and Oracle to get back to them with their certification of DB2 10. It seems the beta customers are testing more extensively and faster than these software companies. It really doesn’t surprise me that Oracle would be late to certifying their application since Oracle continues to lose customers due to the improvements on the mainframe and other versions of DB2.
Also the talk at the North American IDUG conference was that many Oracle shops that also have DB2 LUW are converting a large number of their Oracle instances to DB2 because of the crazy Oracle priced contracts. Saving money with DB2 LUW over Oracle continues as companies look at the performance and reliability of DB2. Expect that trend to continue as DB2 10 rolls out and companies are even consolidating their Oracle instances into their DB2 for z/OS environments.
I look forward to getting the updates at IDUG Europe.
As the 2011 European IDUG approaches, it’s good to remember the annual North American IDUG conference in May where the latest and greatest information about DB2 was unveiled. The IDUG Board of Directors and especially the Conference Planning Committee did a great job providing qualified users, consultants and IBM developers as speakers to discuss their technical experiences with DB2 for z/OS and LUW, including the new release of DB2 10. DB2 performance tuning, application development and DBA topics filled the agenda and everyone learned the latest techniques to help their companies gain the competitive advantage
DB2 10 Beta program was the largest with testing by 24 companies and the most successful testing period ever for a new DB2 version.
The DB2 10 for z/OS software is only about a year old and is definitely being tested by an ever growing number of companies. The very nature of a new release requires the regular introduction of the maintenance for both z/OS and DB2 for the best stable environment and to fully leverage its performance tuning improvements. It is recommended that companies planning to go to DB2 10 should plan on at least quarterly maintenance cycles to fully incorporate all APARs, Hipers, and other fixes or enhancements.
DB2 10 continues to be enhanced due to some of the issues discovered during the beta program and other items that needed additional work. The DB2 for z/OS Version 9 Plan Stability feature is getting a lot of attention with its ability to help with production package access path preservation. The DB2 10 additional features of APCOMPARE and APREUSE are still being worked on. Even if these new enhancements don’t make it in the release, Plan stability continues to be a great feature introduced in DB2 for z/OS Version 9 for guaranteeing that access paths stay the same or improve with every new developer version.
For those of you that are looking for the DB2 10 White Paper the sign up link for the free download of the PDF from the IBM website is below. You have to sign up because they want to know how many people download the DB2 10 White Paper, but it’s free and you can opt out of any email spam. So download it today at:https://www14.software.ibm.com/webapp/iwm/web/signup.do?source=sw-infomgt&S_PKG=db2_zos_reduce_costs
DB2 offers application designers new functionality for their data warehousing requirements. The new DB2 10 Temporal Tables provide a way to have a snapshot in time of the status of customers, orders or any other type of business situation.
DB2 Temporal Tables, with their built in functionality, automatically understand the business time or system time of the data entered into the system. This functionality is ideal for handling and documenting the condition of the any business aspect at a certain time. This functionality is driven from two new column definitions, BUSINESS_TIME and SYSTEM_TIME, defined within a table definition. Using these new time period columns within a DB2 Temporal Table definition provides a system-maintained, a period-maintained or bi-temporal time period for your data.
Many systems today have manual processes or utilities that manage or migrate their real time data to history tables. The new DB2 Temporal Tables with their new system time and business time columns can be used in conjunction with a user-defined trigger to automatically migrate transactional temporal table data to another user defined HISTORY table. Having these facilities built into the database greatly improves regulatory compliance, operations and overall DB2 performance tuning.
Separating out the real time transaction data versus the old data within your database using the HISTORY table requires planning and design steps. The separation of the old data from new data guarantees application and SQL performance does not suffer when your database is fully populated. Separation of the old and new data also helps DB2 performance tuning management so more resources can be delegated to maintaining base new transaction data where DB2 performance tuning matters for business operational success.
Over the coming weeks I will go through the steps and design decisions required to set up a Temporal Table. We will go through the SYSTEM_TIME, BUSINESS_TIME and a bi-temporal table design.
Jeff Jonas' keynote session at IDUG Europe 2010 brought up several interesting thoughts and ideas. The sessions and conversations started and it seemed that Java, Hibernate and .Net systems have started to cause DB2 Java performance problems for a large number of companies. Many great hallway conversations pointed out how we all have great standards, code review, and EXPLAIN processes within our COBOL infrastructure, but have nothing within these other development environments, including DB2 Java. This is common and I always help clients with their DB2 Java performance by using Optimization Service Center, Visual Explain, the Optim Data Studio and Query Tuner products. All of these are great to quickly improve their DB2 Java, Hibernate and sometimes even .Net systems.
Java, Hibernate and .Net Projects
Several people wanted to hear about my experience with fixing DB2 Java performance, working with the Optim Data Studio products and how they can help with DB2 Java, Hibernate and .Net projects. We talked about how easy the SQL can get uncovered and then changed from dynamic JDBC Java processing to static SQL with the new IBM pureQuery product. For several companies their storage constrained DB2 systems can really use the reduction in the dynamic statement cache by getting these DB2 Java performance problems defined to be static applications. In addition, the bonus of getting a CPU reduction from Java, Hibernate and other JDBC connected applications from being static applications and not having to double check security, object existence and access plans is a huge business selling point for getting pureQuery implemented as soon as possible.
Be sure to join us at IDUG 2011 in Prague where I'll be presenting "DB2 10 Temporal Database Designs for Performance” on November 14th.
Another situation when a DB2 Java transaction runs into problems is when it must check something outside of the critical transaction path or its normal activity. For example, when a DB2 Java transaction uses seven different discrete web services to accomplish a complete transaction unit-of-work and after the third web service runs into a situation where something else needs to be checked. The processing then tries to resolve the situation by accessing another service and the new service experiences an error exception. In most DB2 Java database transaction environments the previous three services’ work would be rolled back and the entire transaction would need to be restarted.
Within good DB2 Java processing designs the extra situation checking would be moved out of the standard flow and services transaction processing. Starting another unit of work is understandable for these double checking situations only because we want to retain the integrity of the first group of services activity already completed. Analysis needs to be completed to determine the number of times the exception processing is needed and how many times it errors out with an exception.
One of the new features within DB2 Version 9.7, Cobra, is called autonomous transactions. This allows a transaction to commit a block of statements independent of an invoking transaction. This invoked autonomous transaction, implies that the work done is committed even if the invoking transaction itself is rolled back. This feature is perfect for this type of exception processing within DB2 Java applications and can be easily implemented within a web service structure.
In previous blog entries I have talked about transaction scope, how DB2 Java applications access the database too much and transaction units of work (UOWs) are not really analyzed properly .
Too often these days the design and development of DB2 Java applications are done in an Agile or SCRUM type of project methodology where short concise project deliverables are designed to deliver working transactions. These methodologies are good for transactions but sometime are not good at overall DB2 Java performance. Since the scope of the Agile or SCRUM sessions are individual transactions, the big picture of the overall business and processing objectives sometimes gets lost. This leads to transactions that only accomplish a small discrete piece of the business. Other transactions are necessary and retrieve the same master customer or product information again and again in order to complete the processing activity.
Database caching can mitigate and shield the impact on performance for repeatedly getting the same database information but cannot cache all the activity. When analyzing your various transactions, determine the overall business objectives and flow of your DB2 Java application. Combine standalone transactions or SOA services that use the same data keys as much as possible.
DB2 Java performance is often a problem because the application processing is emulating the database which executes more efficiently or the processing is poorly designed. Either of these scenarios that my teams have found during performance or design reviews of DB2 Java performance of systems and applications always led to extended I/O activities and excessive CPU usage.
Too often, when the DB2 Java application was designed, the full scope of the eventual implemented processing was unknown. The specifications or even the coding of the backbone framework processing began before everything was known or so many additional processing add-ons were bolted on the transaction that the original design no longer fits the transaction and it no longer performs well. When additional functions are added into the transaction scope many times, the additional data retrievals are not added into the existing SQL processing. Instead they are coded as additional stand-alone SQL calls. This leads to SQL statement after SQL statement being executed during the single transaction of a DB2 Java application.
These add-on transaction functions typically add additional SQL to the transaction unit-of-work. This leads to the DB2 Java system transactions that seem to need hundreds or even thousands of SQL database calls to process their transaction from beginning to end. These large numbers of SQL calls usually touch and lock a large number of tables, inspect the data and finally perform the transaction processing. This situation typically uses excessive CPU and performs a large number of unneeded I/Os.
By not combining or enhancing the existing SQL to retrieve the additional data, the overall number of calls continues to expand and the DB2 Java database performance continues to suffer. The application design is needlessly neglected when these new requirements come along. When the changing requirements result in additional SQL calls with the application itself evaluating or combining new SQL data with an existing object data store, the result is more CPU usage and poor response time.
To avoid these types of situations in your DB2 Java application, understand all the data that is needed by your transaction. The application processing that combines or reevaluates data needs to be pushed back into the existing database SQL statements. DB2 does it much more efficiently. Retrieving additional data is bad for I/O, CPU, locking, and overall performance. So next time your DB2 Java transaction needs additional functionality don’t just add on, integrate your new functions into the existing SQL and designs of your DB2 Java application database processes.
Designing the unit-of-work for a given transaction entails many components. Different techniques and methods are incorporated depending on the components such as Hibernate, iBatis, JPA or Enterprise Bean technology to process the transaction. The Java transaction framework and the object patterns incorporated with the components also affect the transaction unit-of-work. All these factors together provide complete flexibility for today's Java developers.
Unfortunately, with this flexibility comes the responsibility to handle the transaction as effectively and efficiently as possible. For example, the various Hibernate, iBatis, JPA or Enterprise Bean technologies often shield the programmer from the database access. The database object is often passed through several methods or classes before it is used so a number of modifications could have already taken place. This same database object is also the same database SQL table access used for many different of processes and this is typical for the majority of the DB2 Java system reviews done recently.
This same database object is usually a SQL access that is usually a generic SQL call to a single table retrieving all of its columns with minimal WHERE clause filtering to obtain DB2 Java data. Alternatively, the DB2 Java SQL call could be using a unique key to get a single row from the table. In both cases the SQL is fairly simple. When it only has minor WHERE clause filtering, the database access is too generic. When the access is a unique key, the access is usually too fine-tuned to retrieve the group of data desired. Sometimes the DB2 Java processing passes multiple instances of the database unique access object and the Java method processes all of these database objects.
In all these scenarios, the SQL database access within the DB2 Java application does not fit the transaction processing or its unit-of-work. Generic access through the various Java persistence layers usually only provides basic performance for your transaction processing and usually retrieves too many rows for a given transaction. To achieve peak performance the DB2 Java transaction needs to access specific sets of database information and process them quickly. In too many DB2 Java systems this is a rare transaction processing situation.
One of the first standards and principles neglected in the DB2 Java applications that I have seen is that the application references the database too many times to complete a single transaction. While it is good to use your ORM database interface, the architect and application programmer should know how many times the ORM layer is used during each different type of the transaction. Since these ORM frameworks mask the database as just another object, many programmers do not know when their Java class or web service is firing a SQL call to the database. Within some DB2 Java performance problem systems, I have seen several hundred DB2 Java application calls to the database to complete a single transaction. This level of activity will never provide sub-second DB2 Java application transaction performance.
Comparing these new DB2 Java application database call levels against the other applications within any environment usually shows a substantial increase in overall usage. Sometimes the legacy transactions are only referencing the database 10-25 times while the new DB2 Java applications are referencing the database 130-175 times to complete a single transaction. Database usage during the Agile project development process or new scrum scenarios needs must be highlighted so everyone understands the overall performance requirements and expectations.
Most object oriented applications have their database calls travel through the network, web server and application servers making performance monitoring and evaluation even more difficult. Even though the next buzz word of cloud computing is supposed to cache and make magic of all these transaction performance problems, not even a cloud can make hundreds of calls to the database perform with sub-second response time.
Given the object model of Java and the relational model of the DB2 database, accessing data properly continues to be difficult for most DB2 Java application developers. Over the history of DB2 Java development, there have been many attempts within vendor products, interfaces and open source projects to bridge this object to relational data chasm. Many object to relational mapping of (ORM) solutions exist but few applications leverage them properly or efficiently.
The leading architectures providing good performance for my DB2 Java clients today, Enterprise Java Bean (EJB) specifications and Java Persistence Architecture (JPA) continue to evolve. Sometimes I have seen the open source Hibernate product implemented by DB2 Java projects looking for a quick database interface, but usually it is leveraged improperly and performs poorly. Sometimes the Hibernate interface even masks or creates problems in DB2 Java systems with its SQL handling and various parameter settings. Some clients have even written their own plain old Java object (POJO) interface to get to their DB2 data.
Any of these ways to get to the data can work and get good performance if the proper application framework, architecture and design pattern is matched with the correct application and transaction type. Working with IMS, IDMS, CICS and MQ systems referencing DB2 shows that very large databases with high availability have a variety of frameworks, architectures and design patterns. Just because an application is using an object oriented language such as Java, C## or .Net does not mean that performance or good database standards and principles should not be implemented or expected. Sometimes the application focuses on making the application fit the framework and more attention should be paid to DB2 Java performance.
Over the last three years, my clients have shown that multiple framework, architecture and programming patterns are usually implemented within the same project. The problem is the poor performance lessons experienced from the application implementation are not fully understood and the performance problems are continued and replicated into the next architecture, framework or pattern iteration, including DB2 Java applications.
Each application is different and each service or process within the application is unique. Step back and be flexible in your design patterns to understand that one or two architectures, frameworks or programming patterns are not correct for every situation. Your design should reflect the application requirements and the correct implementation for achieving the best DB2 Java performance might mean a variety or mix of approaches.
Objects within Java are great for flexibility and reuse. Java services and open source products such as Hibernate, iBatis, Ruby and techniques such as Java Persistence Architecture (JPA) and Data Access Objects (DAO) are great for accessing the DB2 database. Many of these techniques are common in today’s DB2 Java applications. My clients have experienced problems with these techniques when the application processing does not pay attention to the transaction integrity or the unit of work properly. When this happens the DB2 Java application processes usually have connected to the database multiple times, processed the transaction too many times or not committed or rolled back the transaction properly. These DB2 Java transaction situations usually manifest themselves in JDBC errors or poor referential integrity issues that developers blame on the database. Unfortunately, it is not the database but the application coding of the services that cause the problems.
In the coming weeks I will talk further about DB2 Java applications, their processing and issues that I have experienced with my clients. I know it will help you avoid some of these problems too.
When getting my taxes ready every year, I review the previous year’s activity. Reviewing 2008 showed that the majority of my consulting was spent fixing and tuning DB2 Java based systems. This is not a big surprise since the trend and majority of my clients since 2005 have had DB2 Java performance tuning opportunities. Even in 2011, the trend of Java applications being built and implemented with DB2 z/OS and LUW continues at an ever-growing pace.
In most of these client situations, the majority of the problem has not been the database or the DB2 LUW or z/OS system; it has been related to the DB2 Java application processing. Java is the new workhorse and many systems are being implemented and not performing well. Working with my clients, I’ve discovered a variety of issues with these systems and over the next several weeks I will highlight the most common factors that kill performance for these new DB2 Java application systems.
Many object framework, architecture and programming pattern options are implemented with these languages. Given that there is no one object framework, architecture or application programming pattern that is right for every situation, there is neither a single right nor wrong way for performance success with your DB2 Java application.
(To be continued next week.)
Leverage DB2 Version 9 BASIC and EXTENDED Plan Stability Features
Last week I talked about leveraging the Version 9 DB2 Plan BASIC and EXTENDED plan stability features. I also commented that if you don’t have DB2 9 and are coming from DB2 Version 8 to DB2 10 that you can still save your DB2 access plans. If you are migrating from DB2 Version 8, move all your DB2 Plans to Packages and make copies of the Packages with different OWNERs or COLLIDs such as “SAVED” or something equally obvious. This way you can copy back, include the backup collection or manipulate your application to use these SAVED packages. Keeping the old SQL optimization and access path information is vital.
The following diagrams show how to setup a way save your SQL optimization access path and then back out a module. This worked for me a long time ago Version 3 or 4 and for a client with Version 7, so make sure you test out your setup and back out procedures. Then the process can be quickly executed by operations personnel with a REXX exec or library control process such as Endevor or other change management products. Since a picture is a thousand words below are two charts that illustrate how to set up a simple DB2 PLAN PKLIST access path preservation process.
Gene Fuh and Terry Purcell wrote an article describing this type of DB2 Plan technique and the CURRENT PACKAGESET options for protecting your access paths. "Insurance for Your Access Paths Across Rebinds" for further detailed explanation.
Remember DB2 Plan Stability with DB2 9 and DB2 10 is the best and simplest way to implement a method to protect your SQL Optimizer access paths. There are also software tools on the market for handling more complex backup and recovery situations. Use Version 10 DB2 Plan Stability as soon as possible, but there are several other techniques to protect your access paths and guarantee performance. My DB2 10 information has been put into a collaborative book with information from Roger Miller, Julian Stuhler, Surekha Pureka, DB2 10 for z/OS – Cost Savings ….. Out of the Box.