Understanding the logical structure

The first stage in modeling data with DFDL is to examine the logical structure of your data.

Procedure

  1. Identify complex structures.
    Complex structures correspond to the complex types in the model. There is one overall complex type for the data as a whole. If the data contains substructures, each substructure also has a complex type. For example, each structure level in a COBOL copy book, or each distinct kind of row in a CSV message, corresponds to an element of complex type.
  2. Identify simple items.
    Simple items occur within each complex type, and each has a logical data type. Simple items correspond to simple elements. For example, each field in a COBOL copy book with a PIC clause, or each comma-separated value in a CSV message, corresponds to an element of simple type.
  3. Identify structure ordering.
    Ordering determines whether the group within a complex type is a sequence or a choice. In a sequence, the listed items occur one after another. In a choice, only one of the listed items can occur. Examples of a choice are C unions and COBOL REDEFINES.
  4. Identify structure and item cardinality.
    Cardinality provides the values for the minOccurs and maxOccurs logical properties of your elements.
    • Is an element required (minOccurs != 0) or optional (minOccurs = 0)?
    • Is an element an array (maxOccurs > 1)?
    • If so, is there a fixed number of occurrences (minOccurs = maxOccurs) or a variable number of occurrences (minOccurs != maxOccurs)?
    • Can the number of occurrences be unlimited (maxOccurs = unbounded)?
  5. Identify nillable items and default values.
    Some elements might need to carry a special out-of-range value, in which case they must be nillable. For example, a numeric field in a COBOL copy book might sometimes be set to SPACES, which is not legal for a DFDL number. Some required elements might be empty in the data, in which case a default value can be provided.
  6. Consider whether any components can be reused.
    If any of the types are common, consider creating global complex or simple types. If any of the elements are common, consider creating global elements.
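
The steps above can be sketched as a plain XML Schema fragment. This is a minimal illustration only: the order structure and every element and type name in it are hypothetical and do not come from any particular data format. It shows the logical structure alone, with no DFDL annotations.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical names throughout; logical structure only, no DFDL annotations. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Step 6: a global simple type that several elements could reuse -->
  <xs:simpleType name="PostalCode">
    <xs:restriction base="xs:string"/>
  </xs:simpleType>

  <!-- Step 1: the overall complex structure -->
  <xs:element name="order">
    <xs:complexType>
      <!-- Step 3: the group is a sequence -->
      <xs:sequence>
        <!-- Step 2: a simple item; required, because minOccurs and maxOccurs default to 1 -->
        <xs:element name="orderId" type="xs:int"/>
        <!-- Step 5: nillable, so a special value in the data can mean "no value" -->
        <xs:element name="discount" type="xs:decimal" nillable="true"/>
        <!-- Step 5: a default value for a required element that might be empty -->
        <xs:element name="currency" type="xs:string" default="USD"/>
        <!-- Step 4: optional element -->
        <xs:element name="postalCode" type="PostalCode" minOccurs="0"/>
        <!-- Step 4: variable-length array with one or more occurrences -->
        <xs:element name="item" maxOccurs="unbounded">
          <!-- Step 1: a substructure has its own complex type -->
          <xs:complexType>
            <xs:sequence>
              <xs:element name="sku" type="xs:string"/>
              <xs:element name="quantity" type="xs:int"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
        <!-- Step 3: a choice; only one of the two branches occurs -->
        <xs:choice>
          <xs:element name="cardNumber" type="xs:string"/>
          <xs:element name="accountNumber" type="xs:string"/>
        </xs:choice>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

How the nil value, the default, and the choice branches are recognized in the physical data is not part of the logical structure; that is decided later, when the DFDL annotations are configured.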

Example

Figure: Data format with two complex types.

As an example, the picture shows a file of employee records. This file could be modeled using DFDL as an overall complex element employees that contains a complex element employeeRecord. The employeeRecord element repeats an arbitrary number of times, hence maxOccurs is set to unbounded.

The employeeRecord is a sequence of simple elements:
  • name of type xs:string
  • age of type xs:int
  • dob of type xs:date
  • permanent of type xs:boolean
  • salary of type xs:decimal
The salary element is present only when permanent is Y, so it is optional and has minOccurs 0. All the other simple elements are required and have minOccurs 1.
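
The logical structure described in this example might be written as the following schema. It is a sketch under stated assumptions: the element names and types are the ones listed above, but the text does not give minOccurs for employeeRecord, so it is left at the default of 1 here (use 0 if a file with no records is valid). No DFDL annotations are included.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Overall complex element for the whole file -->
  <xs:element name="employees">
    <xs:complexType>
      <xs:sequence>
        <!-- Repeats an arbitrary number of times.
             Assumption: minOccurs is left at its default of 1. -->
        <xs:element name="employeeRecord" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <!-- Required simple elements (minOccurs defaults to 1) -->
              <xs:element name="name" type="xs:string"/>
              <xs:element name="age" type="xs:int"/>
              <xs:element name="dob" type="xs:date"/>
              <xs:element name="permanent" type="xs:boolean"/>
              <!-- Optional: present only for permanent employees -->
              <xs:element name="salary" type="xs:decimal" minOccurs="0"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

The schema records only that salary is optional. The rule that salary appears when permanent is Y, and the fact that the boolean is represented as Y or N in the data, belong to the physical format and are expressed later through DFDL annotations.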

What to do next

The next stage is to configure the DFDL annotations. See Configuring the DFDL annotations.