In this final part of my blog entry on Carl Olofson’s paper, “The Third Generation of Database Technology”, I summarize the key points in the remainder of the paper. I have taken the liberty of leaving out most of the detailed descriptions of each vendor and jumping straight to his thoughts on Market Strategies, Technology Assessment and Essential Guidance.
It should be apparent to the reader by now that I chose to spend the time on this paper (versus many other papers available on the subject) because it validates a lot of the work that we are currently pursuing within the Research and Development teams at
Vendors and Products of the “3rd Generation Database Technology” (in Alphabetical Order)
Only the vendor and product names are listed here. You can find technologies associated with each vendor in Part II of my blog entry on this topic.
Amazon – SimpleDB
ENEA – Polyhedra
Four Js – Genero DB
Google – Bigtable
Infobionics – Knowledge Server
Ingres – VectorWise
McObject – eXtremeDB
Microsoft – SQL Azure
Oracle – TimesTen, Exadata Storage Server
ParAccel – PADB
RainStor – RainStor cell-based RDBMS
Vertica – FlexStor
The author believes that fundamentally different approaches to data definition, storage, manipulation, and delivery are needed to provide more, better, faster, and cheaper database management. The desire to acquire and use business analytics in moment-by-moment decision making, to accumulate and reconcile large amounts of enterprise data for reporting and analysis, and to streamline operations with better and more precise data collection and movement all create demand for DBMS technologies that deliver orders of magnitude better performance, with much higher reliability, than the leading conventional products can offer.
As stated in previous parts of this blog entry, the author believes that the DBMS industry has long been held back by the relational paradigm, which has proven to be both a blessing and a curse. The blessing is that its simplicity and broad applicability have enabled DBMS technology to become the standard way that business application data is stored and managed, which brings order and manageability to both the data and the applications that use it. The curse is that relational databases are unable to hold semantic metadata or to directly represent data organization concepts such as multidimensionality, containment, derivation, recursion, or collection. This limitation can force DBAs to store data that reflects such concepts in arcane combinations of cross-reference tables that require multiple complex joins to navigate, and to encapsulate the management of those table relationships in program code or stored procedures. Both of these, in the absence of detailed documentation, tend to mask how the data is actually organized logically. [Though readers familiar with
The use of Web services and the growth in event-driven architectures (EDAs) are diminishing the need for SQL support in favor of the publication of data services that can be triggered or invoked in a SOA or EDA context. Freeing database management from the limitations imposed by SQL, even at the interface level, can further accelerate the development of DBMS technology that is, from the current perspective, truly revolutionary.
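To make the two preceding points a little more concrete, here is a minimal sketch of my own (it is not from Olofson’s paper), written in Python with the standard sqlite3 module. It stores a containment hierarchy the relational way, in a cross-reference table, and then hides the recursive joins behind a small function that a caller could invoke as a data service without writing any SQL. The tables `part` and `part_contains`, the sample rows, and the function `contained_parts` are all hypothetical names chosen for illustration.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database holding a tiny, made-up parts hierarchy."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE part (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        -- Cross-reference table: containment is not a first-class concept,
        -- so "parent contains child" has to be stored as rows of id pairs.
        CREATE TABLE part_contains (
            parent_id INTEGER NOT NULL REFERENCES part(id),
            child_id  INTEGER NOT NULL REFERENCES part(id),
            PRIMARY KEY (parent_id, child_id)
        );
        INSERT INTO part VALUES (1, 'bicycle'), (2, 'wheel'), (3, 'spoke'), (4, 'frame');
        INSERT INTO part_contains VALUES (1, 2), (2, 3), (1, 4);
        """
    )
    return conn


def contained_parts(conn: sqlite3.Connection, root_name: str) -> list[str]:
    """Return the names of all parts directly or indirectly inside root_name.

    The caller never sees the joins or the recursion; that logic is
    encapsulated here, which is also why the logical shape of the data
    is hard to discover without reading this code.
    """
    rows = conn.execute(
        """
        WITH RECURSIVE descendants(id) AS (
            SELECT pc.child_id
            FROM part p
            JOIN part_contains pc ON pc.parent_id = p.id
            WHERE p.name = ?
            UNION
            SELECT pc.child_id
            FROM descendants d
            JOIN part_contains pc ON pc.parent_id = d.id
        )
        SELECT p.name
        FROM descendants d
        JOIN part p ON p.id = d.id
        ORDER BY p.name
        """,
        (root_name,),
    ).fetchall()
    return [name for (name,) in rows]


if __name__ == "__main__":
    conn = build_demo_db()
    print(contained_parts(conn, "bicycle"))
```

Even for a three-level hierarchy, answering “what is inside a bicycle?” takes a recursive query over two tables. Wrapping it in a function keeps callers away from SQL, but it also moves the knowledge of how the data is organized out of the schema and into code.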
Essential Guidance (Carl Olofson’s conclusions)
The DBMS world is dominated by second-generation disk-based RDBMS technologies today. The third-generation technologies will probably not displace the existing disk-based RDBMS products overnight. In fact, it is more likely that the leading RDBMS products will evolve to include some of these technologies, and the leading vendors may acquire third-generation DBMS companies to add to their data management portfolios.
Third-generation technologies will arise in a market that acknowledges that one size truly does not fit all, and that some existing RDBMS technologies, perhaps with third-generation enhancements, remain perfectly appropriate for a range of OLTP and data warehousing workloads. Other workloads, such as those that depend on the rapid capture and processing of streaming data, involve the collection and analysis of vast amounts of heterogeneous data, or support the highly variable demands of the public cloud, will require these emerging technologies, often deployed in specialized, workload-specific ways.