How DB2 Express-C avoids becoming crippleware
Free versions of commercial products are nothing new in 2012. The open source movement has created a climate of expectation for products to be available free of charge, and many software providers responded by releasing no-charge versions (lawyers don't like the word "free" for some reason) of their commercial products. It has not been an easy move to make. After all if one's products can be obtained free of charge, what is the reason to continue to pay hundreds of thousands or even millions of dollars to buy the same products?
To protect the considerable revenue from existing products, vendors saddled their no-charge products with limitations, often so severe that most of these products have been branded as "crippleware." Good examples are no-charge DBMS from Oracle and Microsoft. Both Oracle Express Edition and Microsoft SQL Server Express Edition are very capable versions of their commercial and at times very expensive counterparts. To protect the revenue (as they should) Oracle and Microsoft introduced a very similar set of restrictions in the areas of memory use, processing power, and availability. These restrictions, though on the low side, are fairly reasonable ways to restrict the performance of the DBMS without crippling the functionality. However, both vendors chose to do severely restrict the amount of data their no-charge DBMS can manage. Oracle restricts Oracle XE to 11GB of data and Microsoft SQL Server imposes a restriction of 10GB.
Is DB2 Express-C different from other free commercial DBMS?
DB2 Express-C chose a different strategy. We feel the course we took, while riskier for IBM, is a better one for our customers. (If people don't pay, are they still your customers? We think so.) This strategy makes available all of the capabilities of DB2, without any reservations, to the application developers. When building applications with DB2 Express-C, you can be sure that every programming interface that DB2 offers is available to you. You can also be sure that application code you build using DB2 Express-C will run without any modifications with any other edition of DB2 for Linux®, UNIX®, and Windows® and IBM InfoSphere® Warehouse (DB2 data warehouse product). In most cases it will also run unchanged with DB2 for z/OS® and DB2 on System i®.
And because DB2 Express-C contains exactly the same code as the priced editions of DB2 for Linux, UNIX, and Windows, you can be sure that your application will behave exactly the same when running with paid editions of DB2 as it does when you use it with the free DB2 Express-C. And unlike competing offerings from Oracle and Microsoft you are never limited by the amount of data DB2 Express-C can manage. Anyone who has seen ORA-12592 error message when exceeding 11GB database size can attest how devastating this can be, especially if your reputation is on the line with your customers. There is no equivalent message in DB2 Express-C and DB2 never stops working because you went over some arbitrary limit on the amount of data. The size of the database that DB2 Express-C can manage is essentially unlimited. It is constrained only by the resources you make available to your DB2 Express-C server and the constraint is on performance not on function. DB2 Express-C will not put you in a position where your application stops working because of license limitation.
Since we are on the subject of databases and resources we should also point out that DB2 Express-C servers can be physical or virtual. Moreover, you can have multiple instances of DB2 running on a single server with each instance having as many as 256 separate and distinct DB2 databases. Each database can further separate data in to 32K schemas. This is a great deal of power for any price DBMS and is unmatched in the "free DBMS" market.
If DB2 Express-C offers this much power at no charge, why would one ever want to pay for Express, Workgroup, Enterprise and Advanced Enterprise editions of DB2? The performance of a DBMS is highly dependent on memory and processor resources available to it. Its value is also directly related to the operational efficiency that is derived through advanced functionality available to the database administrator top optimize DBMS operation. These are the two dimensions of differentiation for the various editions of DB2.
Resources available to DB2 Express-C
Let's first take a look at the resource dimension. DB2 Express-C can be deployed on any size server (physical or virtual) with any amount of memory. In other words, your selection of the server hardware to run DB2 Express-C is not constrained in any way. However, DB2 Express-C will govern itself to only use the resources that it is allowed to use under the terms of the license agreement. Specifically, DB2 Express-C will schedule execution of DB2 tasks on two processor cores regardless of how many processor cores are present in the server where DB2 Express-C is running. This is done automatically without placing any burden of monitoring and managing compliance on the customer. If you wish, you can alter which processor cores will be used by DB2 Express-C. By default it will use the first two processor cores, that is, processor cores numbered 0 and 1. This CPU power limitation in DB2 Express-C 10 is unchanged from the previous versions of DB2 Express-C.
The other aspect of the resources dimension is main memory or RAM. Previous versions of DB2 Express-C governed themselves to utilize maximum 2GB of memory. Version 10 of DB2 Express-C doubles maximum amount of memory to 4GB. You may be wondering why we chose to double amount of usable memory while keeping number of processor cores the same as previous versions. We find that memory has a much greater impact on performance than CPU capacity. Specifically, we find that many of the new workloads such as those involving text and XML processing can really use the extra memory.
To complete the review of the resource dimension, consider a couple of examples. Say you have an x64 architecture server with two quad-core processors and 64GB of memory. You can deploy DB2 Express-C on this server without any worries about license compliance. DB2 Express-C will use two processor cores leaving the remaining six cores available for other workloads such as web and application server or the new text search server (more about this later). Similarly, DB2 Express-C will use 4GB of memory with the remaining 60GB being available for other workloads.
The question of why upgrade to the paid versions of DB2 is likely still on your mind. Paid versions of DB2 do take advantage of more processor and memory resources. As a result, they will deliver higher levels of performance. In addition, paid editions of DB2 offer DBAs additional functions to optimize performance of their databases and reduce operating costs. For example, paid editions of DB2 offer advanced data compression functionality that can drastically reduce the cost of storage while also improving performance. For customers needing to manage very large databases, savings in storage cost alone will easily pay for the cost of the DB2 license.
Another example of cost reducing features in the paid editions of DB2 is multi-temperature storage (new in V10.1). This feature allows DBAs to locate often used and important data (hot data) on faster storage such as SSD while placing reference or infrequently used data (cold data) on much cheaper storage. Just as is the case with compression, improved performance and reduced storage costs can more than offset the cost of the DB2 license.
Another example may be materialized query tables (MQT). These are essentially pre-computed complex queries that can provide instantaneous answers to questions that may otherwise take hours to compute. Connection concentrator function allows DBAs to handle workloads for tens of thousands of users without dedicating database server resources to each one. This permits a single database instance handle a much greater number of concurrent users. This capability is available in the paid versions of DB2 but is not offered as part of the DB2 Express-C product. Advanced security is another example. Label-based access control (LBAC) and row and column access control (RCAC, new in V10.1) allow DBAs to deliver stronger security and privacy of the data without burdening application developers with these requirements.
Notice that all of these functions are designed to help DBAs to design higher performing, less expensive to operate and more secure database environments. This is not a coincidence but is a design point. As we said earlier in the article, DB2 Express-C delivers all of the programming interfaces of DB2, allowing applications built with DB2 Express-C to run across all editions of DB2 while providing DBAs with advanced functionality to optimize cost and performance of these applications. We should also mention that DB2 Express-C is supported by the community of like minded people while paid editions of DB2 provide technical support delivered by the world wide IBM support team.
DB2 programming interfaces
So far we talked about how DB2 Express-C provides all of the interfaces of DB2 while differentiating on the resources and operational features. Now is the time to talk about these interfaces and the enhancements that are delivered in V10. We will start with something that is one of the most unique features of DB2, the ability to process XML data using the feature called pureXML®.
Even faster XML processing
We made it a point to include pureXML in the free DB2 Express-C. Version 10 delivers enhancements in the way it processes XML that are best described by a single word, "speed." DB2 processing of XML has always been fast. In version 10 it is even faster … much faster. First, there are enhancements deep in the engine to speed up several types of XML queries. Specifically, queries using XMLTABLE function, queries with early-out join predicates, non-linear XQuery queries, and queries with a parent axis. The best feature is that all of these enhancements improve performance without any change to the API. You get performance boost by simply upgrading to DB2 Express-C 10 and without touching any of your application code.
Another way XML queries are accelerated is through addition of new indexes.
In V10 you can define indexes on XML
INTEGER data types.
Before DB2 10, you were limited to indexes of type
DOUBLE, which didn't provide the native
DB2 XML engine with all of the opportunities to achieve the highest performance possible.
DB2 10 also enables you to define indexes over
FN:UPPER_CASE XML functions, resulting in
fast case-insensitive searches of your XML data. Indexes can also be
functions, thereby speeding up queries that search for a particular element in an XML column.
Again, it is important to point out that no changes to the application code are required.
Simply define these new indexes to accelerate existing queries.
Finally, Java programmers can now take advantage of the more compact
binary XML (
XDBX) format to speed up the transfer of XML data
between a Java application and the DB2 Express-C 10 server.
More efficient text search
In DB2, creating and maintaining text search indexes is the job of the text search service, which at times can be a resource-consuming task that has the potential to consume resources on the DB2 Express-C server. DB2 10 enables you to host the Text Search service on a separate server so that it gets dedicated resources and doesn't interfere with the rest of the DB2 work. DB2 Express-C users get an additional benefit. If you are running DB2 Express-C on a server with resources over an above those that can be used by the DB2 Express-C (2 processor cores and 4GB of memory) you can use the extra resources for the text search service.
In addition to improved efficiency, DB2 Express-C 10 text search supports fuzzy searches as well as proximity searches. Fuzzy search lets you find similar text (think Google Suggests type of function) while proximity search allows you to put constraints on the distance between the strings you are searching for. Unlike the other features we described so far, both proximity search and fuzzy search require minor changes to the queries.
RDF: A brand new opportunity for DB2 Express-C
The explosion of the Internet's popularity and the proliferation of web resources have caused many people to recognize the need for a cohesive framework for managing metadata in such a vast sea of resources. For example, a simple web page might have a title, an author, creation and modification dates, and attribution information, all of which can be of significant value if machines could be enabled to search for and discover resources on the web. The ability of computers to use such metadata to understand information about web resources is the idea behind the Semantic Web.
The Resource Description Framework (RDF) is a W3C standard for describing web resources. RDF describes web resources in the form of subject-predicate-object, for example, "Leon is Canadian" or "Leon knows Paul." These types of data entities expressed in the form of subject-predicate-object are called triples and are stored in specialized triplestore databases. DB2 Express-C 10 can function as a native triplestore database, providing the ability to store and query RDF data in DB2 Express-C. RDF is a very interesting data format for representing relationships such as social graph (think social networks such as Facebook and Twitter). Friend of a friend (FOAF) is a project that is creating machine-readable information describing people and links between them, and is an example of the use of the RDF data.
Time travel through data
Today, more than ever, there is a great deal of interest in managing time as a data dimension. There are a number of business drivers for this need. One such need stems from a requirement to comply with various government and industry regulations. Many of the compliance requirements center around a need to keep a complete history of data changes. Many regulations mandate the ability to answer questions about the state of business and various data points at a point in time. Many organizations require the ability to query and update effective dates and validity periods for their operations.
DB2 Express-C 10 introduces temporal data management capabilities called DB2 Time Travel Query that make the process of capturing data changes over time or setting effective dates and validity periods a simple process requiring just a few SQL statements. In the past, adding temporal capability would have required creation of database triggers and very complex application logic. The new Time Travel Query greatly simplifies this process. Our measurements show that DB2 Time Travel Query can reduce the effort as much as 45 times. Pretty good value for a free product.
Get a detailed introduction to temporal data and time travel query in the article " A matter of time: Temporal data management in DB2 10."
The discussion of the new programming feature would not be complete without at least a mention of the Oracle compatibility features of DB2 Express-C 10. While we did make Oracle compatibility available in DB2 Express-C V9.7.5, many people will have their first experience with these features in version 10 of DB2 Express-C. Also, in version 10, we have further enhanced Oracle compatibility to make even more applications built for Oracle to easily run on DB2 with very few changes. Recent studies show that as much as 98% of the application code written for Oracle can be left untouched, and applications will still work with DB2 Express-C. We are not going to recount all of the Oracle compatibility features of DB2 here, as there is a wealth of information available on this topic, such as the article "DB2 10: Run Oracle applications on DB2 10 for Linux, UNIX, and Windows."
Now is a great time to get started with DB2 Express-C 10
By now you have seen all of the value that the DB2 Express-C 10 product delivers, surprisingly, completely free of charge. We have also made is very easy to get started with DB2 Express-C v10. The product is available for download from ibm.com/db2/express. You can also get your very own DB2 Express-C server running on the cloud in less than 30 minutes: http://bit.ly/Iv8837. You can expect new free courses to appear on DB2University.com, and there is a veritable torrent of articles and tutorials coming on the new features in DB2 10. Now is the time to take action and give DB2 Express-C 10 a try.
- Learn more about DB2 10 in the article "What's new in DB2 10 for Linux, UNIX, and Windows" (developerWorks, Apr 2012).
- Get a detailed introduction to temporal data and time travel query in the article "A matter of time: Temporal data management in DB2 10" (developerWorks, Apr 2012).
- Learn the details of Oracle compatibility in DB2 10 in the article "DB2 10: Run Oracle applications on DB2 10 for Linux, UNIX, and Windows" (developerWorks, Apr 2012).
- Get a detailed guide to "DB2 V10.1 Multi-temperature data management recommendations" (developerWorks, Apr 2012).
- In the DB2 for Linux, UNIX, and Windows area on developerWorks, get the resources you need to advance your DB2 skills.
- Get more information about DB2 Express-C on ibm.com.
- Follow developerWorks on Twitter.
Get products and technologies
- Download DB2 Express-C today and try it for yourself.
- Participate in the discussion forum.
- Get involved in the developerWorks community. Connect with other developerWorks users while exploring the developer-driven blogs, forums, groups, and wikis.
Dig deeper into Information management on developerWorks
Get samples, articles, product docs, and community resources to help build, deploy, and manage your cloud apps.
Keep up with the best and latest technical info to help you tackle your development challenges.
Software development in the cloud. Register today to create a project.
Evaluate IBM software and solutions, and transform challenges into opportunities.