Data normalization reconsidered, Part 2: Business records in the 21st century

An examination of record keeping in computer systems

From the developerWorks archives

Susan Malaika and Matthias Nicola

Date archived: January 13, 2017 | First published: January 12, 2012

Relational databases have been fundamental to business systems for more than 25 years. Data normalization is a methodology that minimizes data duplication to safeguard databases against logical and structural problems, such as data anomalies. Relational database normalization continues to be taught in universities and practiced widely. Normalization was devised in the 1970s when the assumptions about computer systems were different from what they are today.

The first part of this two-part series provided a historical review of record keeping and examined the problems associated with data normalization, such as the difficulty of mapping evolving business records to a normalized format. Since the Internet has led to the widespread creation of business records in digital formats such as XML, it has become possible to store records in computer systems in their original format.

The second part of the series discusses alternative data representations, such as XML, JSON, and RDF, that can overcome normalization issues or introduce schema flexibility. In the 21st century, digitized business records are often created in XML to begin with. This paper compares XML to normalized relational structures and explains when and why XML enables easier and faster data access. After a discussion of JSON and RDF, it concludes with a summary and suggestions for reconsidering normalization.

This content is no longer being updated or maintained. The full article is provided "as is" in a PDF file. Given the rapid evolution of technology, some steps and illustrations may have changed.
