Comparison of the XML model and the relational model
When you design your databases, you need to decide whether your data is better suited to the XML model or the relational model.
This topic discusses some of the factors that you need to consider as you make this decision.
- XML data is hierarchical; relational data has a flat structure.
An XML document contains information about the relationship of data items to each other in the form of the hierarchy. With the relational model, the only types of relationships that can be defined are parent table and dependent table relationships.
- XML data is self-describing; relational data is not.
An XML document contains not only the data, but also tagging for the data that explains what it is. A single document can have different types of data. With the relational model, the content of the data is defined by its column definition. All data in a column must have the same type of data.
- XML data has inherent ordering; relational data does not.
For an XML document, the order in which data items are specified is assumed to be the order of the data in the document. There is often no other way to specify order within the document. For relational data, the order of the rows is not guaranteed unless you specify an ORDER BY clause on one or more columns.
Sometimes the nature of the data dictates the way in which you store it. For example, if the data is naturally hierarchical and self-describing, you might store it as XML data. However, other factors might influence your decision about which model to use.
- Whether maximum flexibility of the data is needed
Relational tables are fairly rigid. For example, normalizing one table into many or denormalizing many tables into one can be very difficult. If the data design changes often, representing it as XML data is a better choice.
- Whether maximum performance for data retrieval is needed
Some expense is associated with serializing and interpreting XML data. Retrieval of a few items from a large XML document is relatively expensive, so performance might be better for data in a relational format. However, for retrieval of entire documents, XML data might be more efficient if a large number of relational joins are needed to retrieve equivalent data in a relational format.
- Whether the data is processed later as relational data
If subsequent processing of the data depends on the data being stored in a relational database, it might be appropriate to store parts of the data as relational, using decomposition. An example of this situation is when online analytical processing (OLAP) is applied to the data in a data warehouse. Also, if other processing is required on the XML document as a whole, then storing some of the data as relational as well as storing the entire XML document might be a suitable approach in this case.
- Whether the data components have meaning outside a hierarchy
Data might be inherently hierarchical in nature, but the child components do not need the parents to provide value. For example, a purchase order might contain part numbers. The purchase orders with the part numbers might be best represented as XML documents. However, each part number has a part description associated with it. It might be better to include the part descriptions in a relational table, because the relationship between the part numbers and the part descriptions is logically independent of the purchase orders in which the part numbers are used.
- Whether data attributes apply to all data, or to only a small
subset of the data
Some sets of data have a large number of possible attributes, but only a small number of those attributes apply to any particular data value. For example, in a retail catalog, there are many possible data attributes, such as size, color, weight, material, style, weave, power requirements, or fuel requirements. For any given item in the catalog, only a subset of those attributes is relevant: power requirements are meaningful for a table saw, but not for a coat. This type of data is difficult to represent and search with a relational model, but relatively easy to represent and search with an XML model.
- Whether referential integrity is required
XML columns cannot be defined as part of referential constraints. Therefore, if values in XML documents need to participate in referential constraints, you should store the data as relational data.
- Whether the data needs to be updated often
Currently, you can update XML data in an XML column only by replacing full documents. If you need to frequently update small fragments of very large documents for a large number of rows, it can be more efficient to store the data in non-XML columns. If, however, you are updating small documents and only a few documents at a time, storing as XML can be efficient as well.