Claims are easy to make, but hard to back up.
IBM has a tendency to be conservative when making claims, and I personally am even more so when I have to stand up in front of a customer and defend them.
As the saying goes:
Don't trust statistics you haven't developed yourself.
So when I started updating my presentations on Oracle Application Compatibility to DB2 10 I decided to do my own research.
Getting the data
It would have been easy to pick a small, fixed set of benchmark applications and measure compatibility over time.
The problem with such an approach is that it immediately introduces a bias toward tuning for that fixed set.
For example, if openBravo, the application we use for the DB2 Boot Camps, were the indicator of compatibility, then the temptation to tweak DB2 to achieve 100% for that specific application would be very high.
But that would give no indication whatever about overall compatibility for other applications.
A random set of applications is a much better measure.
Luckily such a set does exist.
Any customer or partner who is considering an Oracle application for enablement to DB2 can download the MEET DB2 tool.
In a nutshell, MEET DB2 parses a DDL dump of an Oracle database, including tables, views, sequences, and any PL/SQL objects.
Its task is to detect how many objects and how many lines of code (LOC) can be syntactically parsed by DB2.
It also has a rule engine that detects various semantic differences such as the absence of certain PL/SQL Packages.
The really interesting part about MEET DB2, as far as this post is concerned, is that the customer has to send the analysis in raw format to IBM to retrieve the printable report.
IBM does not receive any source code, but only the aggregated data.
The result is a database of MEET DB2 reports which provides a direct measure of compatibility for real applications outside of IBM's direct control.
The only downside is that the results are always about a point in time. A report on MEET DB2 9.7.3 cannot be re-graded to MEET DB2 10.
Here is a graph of the number of reports submitted to this database by customers and partners.
Note that this is not a complete list of all reports since IBM-run reports are typically not collected in this database.
This picture warrants some explanation.
Everyone wants to see an upwards trajectory. But no one wants to see a peak.
Since these are customer-initiated MEET reports, they are primarily submitted for versions of DB2 which are generally available.
DB2 10 is not yet generally available, which explains the very limited number of reports for DB2 10.
Similar reasoning holds for DB2 9.7.5, which went GA only a few months ago.
It has not yet had the opportunity to accumulate as much attention as DB2 9.7.4.
In a nutshell, the peak always trails the GA date by several months.
Now, taking some 10 applications to come up with a general statement seems rather brazen.
Some 170 applications on DB2 9.7.5 on the other hand provide a decent statistical basis.
I hope we can all agree that compatibility of DB2 10 is bound to be higher than that of DB2 9.7.5.
After all DB2 10 includes all the previous version's features plus some more.
So I have chosen to investigate the DB2 9.7.5 MEET reports in more detail.
What I found was that there are quite a few reports with fewer than 1,000 lines of code.
The smaller the report the higher the variation. There are numerous trivial ones which show 100% compatibility.
And there are some that have 0% compatibility.
One must wonder whether users accidentally analyzed a non-SQL file.
So my next step has been to ignore any applications with fewer than 1,000 LOC of PL/SQL.
In addition I eliminated a handful of duplicates.
This left me with:
- 74 MEET DB2 reports for DB2 9.7.5
- Sizes between 1,015 LOC and 240,706 LOC PL/SQL
- A total of 2,513,236 LOC PL/SQL
- Between 90.1% and 99.9% of PL/SQL statements directly compatible with DB2
- An average of 98.6% of LOC compatible.
The average here is the sum of the percentages divided by the number of applications.
I could also have used an average weighted by LOC, but that seems less realistic because a few very large applications would dominate the result.
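To make the difference between the two averages concrete, here is a minimal Python sketch of the filtering and averaging steps described above. The `Report` fields, the `app_id` used to spot duplicates, and the sample numbers are all invented for illustration; they are not the layout or content of the real MEET DB2 database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    app_id: str          # hypothetical key used to spot duplicate submissions
    loc_total: int       # PL/SQL lines of code analyzed
    loc_compatible: int  # lines of code that parsed without change

    @property
    def pct(self) -> float:
        """Compatibility of this one application, in percent."""
        return 100.0 * self.loc_compatible / self.loc_total


def select_reports(reports, min_loc=1000):
    """Drop reports below min_loc and keep only the first report per app_id."""
    seen = set()
    kept = []
    for report in reports:
        if report.loc_total < min_loc or report.app_id in seen:
            continue
        seen.add(report.app_id)
        kept.append(report)
    return kept


def unweighted_average(reports):
    """Sum of the per-application percentages divided by the number of applications."""
    return sum(r.pct for r in reports) / len(reports)


def weighted_average(reports):
    """Compatible LOC over total LOC, so large applications count for more."""
    compatible = sum(r.loc_compatible for r in reports)
    total = sum(r.loc_total for r in reports)
    return 100.0 * compatible / total


# Invented sample data, NOT taken from the MEET DB2 database.
sample = [
    Report("app-a", 2_000, 1_800),      # small application, 90.0%
    Report("app-b", 200_000, 199_000),  # large application, 99.5%
    Report("app-c", 500, 0),            # below the 1,000 LOC cutoff
    Report("app-a", 2_000, 1_800),      # duplicate submission
]

kept = select_reports(sample)
print(f"reports kept:       {len(kept)}")
print(f"unweighted average: {unweighted_average(kept):.2f}%")
print(f"weighted average:   {weighted_average(kept):.2f}%")
```

In this made-up sample the large application pulls the weighted figure close to its own score, while the unweighted figure treats both applications as equally important.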
Here is the detail view:
Is there a correlation between the number of the LOC and the level of compatibility?
Interestingly: No, there is not.
In the above image, the 240,000 LOC example is at the top.
The number of LOC decreases clockwise, down towards the 1,000 LOC example.
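One way to check the correlation question numerically rather than by eye is a Pearson correlation coefficient between application size and compatibility. The sketch below is only an illustration: the percentages are invented, and taking the logarithm of LOC is my own assumption to keep the largest applications from dominating. It is not how the chart above was produced.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / math.sqrt(var_x * var_y)


# Invented sample: sizes span the range reported above, percentages are made up.
loc = [1_015, 5_000, 20_000, 80_000, 240_706]
pct = [98.0, 99.5, 97.0, 99.0, 98.5]

# Compare against log10(LOC) so that size is measured in orders of magnitude.
r = pearson([math.log10(n) for n in loc], pct)
print(f"Pearson r between log10(LOC) and compatibility: {r:+.2f}")
```

A value of r close to zero would support the "no correlation" reading; values near +1 or -1 would contradict it.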
I think IBM can safely defend a claim of an average of 98% compatibility.
Realistically even 98.6% seems to be a safe lower boundary.
Over time we will see whether DB2 10 reaches even higher levels once a sufficient number of reports have been submitted.