The first stage in modeling data with DFDL is to examine
the logical structure of your data.
Procedure
- Identify complex structures.
Complex structures correspond to the complex types in the model.
There is one overall complex type for the data as a whole. If the
data contains substructures, each substructure has its own complex type.
For example, each structure level in a COBOL copy book, or each
different row in a CSV message, corresponds to an element of complex type.
- Identify simple items.
Simple items occur within
each complex type, and each has a logical data type. Simple items
correspond to simple elements. For example, each field in a COBOL
copy book with a PIC clause, or each comma-separated value in a CSV
message, corresponds to an element of simple type.
- Identify structure ordering.
Ordering determines whether the group within a complex type is a
sequence or a choice. In a sequence, the listed items occur in order.
In a choice, only one of the listed items can occur. Examples of
choices are C unions and COBOL REDEFINES.
- Identify structure and item cardinality.
Cardinality provides the values for the minOccurs and maxOccurs
logical properties of your elements.
  - Is an element required (minOccurs != 0) or optional (minOccurs = 0)?
  - Is an element an array (maxOccurs > 1)?
  - If so, are there a fixed number of occurrences (minOccurs = maxOccurs)
    or a variable number of occurrences (minOccurs != maxOccurs)?
  - Can the number of occurrences be unlimited (maxOccurs = unbounded)?
- Identify nillable items and default values.
Some
elements might need to carry a special out-of-range value, in which
case they must be nillable. For example, a numeric field in a COBOL
copy book might sometimes be set to SPACES, which is not legal for
a DFDL number. Some required elements might be empty in the data,
in which case a default value can be provided.
- Consider whether any components can be reused.
If
any of the types are common, consider creating global complex or simple
types. If any of the elements are common, consider creating global
elements.
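The following is a minimal sketch of how the outcome of these steps might look as the logical part of a DFDL schema, that is, plain XML Schema before any DFDL annotations are added. All element and type names (payment, AmountType, and so on) and the default value are hypothetical and are chosen only for illustration.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Reuse: a global simple type that several elements can share (hypothetical name) -->
  <xs:simpleType name="AmountType">
    <xs:restriction base="xs:decimal"/>
  </xs:simpleType>

  <!-- Complex structure: the overall complex type for the entire data -->
  <xs:element name="payment">
    <xs:complexType>
      <!-- Ordering: the items of a sequence occur in the listed order -->
      <xs:sequence>

        <!-- Required simple item: minOccurs and maxOccurs both default to 1 -->
        <xs:element name="reference" type="xs:string"/>

        <!-- Nillable item: can carry a special out-of-range value -->
        <xs:element name="amount" type="AmountType" nillable="true"/>

        <!-- Required item that might be empty in the data, so a default is provided -->
        <xs:element name="currency" type="xs:string" default="USD"/>

        <!-- Fixed-size array: minOccurs = maxOccurs -->
        <xs:element name="addressLine" type="xs:string" minOccurs="3" maxOccurs="3"/>

        <!-- Optional, variable-size array with no upper limit -->
        <xs:element name="note" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>

        <!-- Ordering: in a choice, only one of the listed items can occur -->
        <xs:choice>
          <xs:element name="accountNumber" type="xs:string"/>
          <xs:element name="cardNumber" type="xs:string"/>
        </xs:choice>

      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

The sketch captures only the logical structure. How each item is physically represented in the data, including which data value counts as nil, is left to the DFDL annotations that are configured in the next stage.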
Example
As an example, the picture shows a file of employee records. This file
could be modeled using DFDL as an overall complex element employees that
contains a complex element employeeRecord. The employeeRecord element
repeats an arbitrary number of times, hence maxOccurs is set to unbounded.
The employeeRecord is a sequence of simple elements:
- name of type xs:string
- age of type xs:int
- dob of type xs:date
- permanent of type xs:boolean
- salary of type xs:decimal
The salary element is present only when permanent is Y, so it is optional
and has minOccurs 0. All the other simple elements are required and have
minOccurs 1.
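As a sketch, the logical model described in this example could be written as the following XML Schema fragment. The element names and types are taken from the description above. The surrounding schema element is an assumption, as is the minOccurs of employeeRecord, which the description does not state and which is therefore left at the XML Schema default of 1.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Overall complex element for the whole file -->
  <xs:element name="employees">
    <xs:complexType>
      <xs:sequence>

        <!-- Repeats an arbitrary number of times; minOccurs is not stated
             in the description, so the default of 1 is assumed here -->
        <xs:element name="employeeRecord" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <!-- Required simple elements: minOccurs defaults to 1 -->
              <xs:element name="name" type="xs:string"/>
              <xs:element name="age" type="xs:int"/>
              <xs:element name="dob" type="xs:date"/>
              <xs:element name="permanent" type="xs:boolean"/>
              <!-- Optional: present only for permanent employees -->
              <xs:element name="salary" type="xs:decimal" minOccurs="0"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>

      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

This fragment records only that salary is optional. The rule that ties its presence to the value of permanent is not expressed by the logical structure alone.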
What to do next
The next stage is to configure the DFDL annotations. See Configuring the DFDL annotations.