Level: Intermediate Brian Byrne (byrneb@us.ibm.com), Industry Models and Integration Architect, IBM David McCarty (davidmccarty@fr.ibm.com), IT Architect, IBM Japan Dr. Guenter Sauter (gsauter@us.ibm.com), Information Architect, IBM Peter Worcester (pworcest@us.ibm.com), Services Solution Marketing Manager, IBM Japan
27 Mar 2008 Discover how you can use the IBM® Rational® Data Architect, IBM Industry Models and the unified metadata management of IBM Information Server to align process, service, and data models. Use these tools to accelerate your SOA project. The fifth part of "The information perspective of SOA design" series describes the key features of the products that support the data modeling pattern in SOA.
Introduction
Rational Data Architect (RDA) is one of the strategic IBM tools that addresses
various modeling and design requirements within the information
management domain. It is built on the Rational Software Development
platform (and implicitly on Eclipse). This provides an integrated
tooling environment linking the capabilities of RDA with those of
Rational Software Architect (RSA), Rational Requisite Pro, ClearCase
and other key parts of the Rational Software Development
platform. Figure 1 shows an overview of the major
capabilities of RDA.
Figure 1. RDA capability overview
The core functions of RDA go beyond just data modeling. This article
introduces the major capabilities of RDA and shows how these play a
role in the SOA analysis and design process. These capabilities include:
-
Standardize - Supporting the Business Glossary
Part
2 in this series discusses how data naming standards promote a common understanding of
business terms. Sharing of these terms across organizational
boundaries can reduce data redundancy through a more consistent
expression of data requirements with the objective of the
consolidation of synonymous and overlapping data terms.
-
Model - Supporting conceptual, logical, and physical data modeling
Part 4
in this series outlines the role of conceptual and logical data modeling within
SOA. RDA provides extensive capabilities to define data models on
a conceptual, logical and physical level and to convert between
UML and Information Engineering notations as needed. This allows
modeling of data at different levels of abstraction while
maintaining consistency and traceability between these models.
-
Visualize - Supporting conceptual, logical, and physical data modeling
Data models quickly become very large with many entities and
relationships. RDA supports visualization of data models through a
range of diagram types. Diagrams can be very detailed, showing full entity,
relationship and attribute structures, or can give a higher-level
view of the core relationships between coarse-grained
concepts. RDA also supports visualization of deployed databases
through the extraction of physical data models.
-
XML
XML is the standard language for service specification
(through WSDL/XSD). Much of the data that is being processed,
shared, and exchanged by services needs to be persisted in a
database. RDA supports visualization of XML schema definitions and
mapping of the XML schema to logical data models. If required,
RDA assists in shredding the XML data into a relational form.
-
Relate
RDA supports complex relationships between different levels of
a model. For example, it can map between a logical and physical model,
that is, between the concepts of the canonical data model and the realization of these
concepts in the data platform. This allows analysis and traceability
of the relationship between data patterns, both across lines of
business and at varying levels of abstraction.
-
Synchronize
The IT infrastructure of an organization is dynamic and changes
over time. A logical model may have changed without propagating the
change in the physical model or vice versa. When different data models become
"out of sync," it is important that you can compare them, identify the
differences and then resolve them by synchronizing the models.
-
Federate
RDA allows mapping models to be used to specify a join of
information that may reside in multiple sources. It brings the information together in a
virtual federated view and generates the necessary statements that
you can deploy on IBM Information Server (WebSphere® Federation Server) or DB2.
-
Code
Once a data model is specified, RDA allows you to create the SQL code for
DB2 so that you can deploy the data model on a database. RDA also
allows you to develop code for Java applications around this data
model or to develop stored procedures.
-
Store
RDA models can be stored within the RDA tool, exported to the
unified metadata platform of IBM Information Server, or checked
into a versioning system such as Rational ClearCase.
-
Team
RDA supports team collaboration through integration with Rational
ClearCase and Rational ClearQuest. This promotes sharing of models
amongst team members, and makes tracking the model changes much easier.
-
Integrate
RDA integrates with a range of database and data modeling tools as
well as SOA tools and environments. This allows the canonical data
models that express the reusable data concepts to be integrated
tightly with other artifacts such as business glossaries, service
definitions and business processes.
-
Import/Export
RDA can import and export artifacts from a variety of environments
such as CA ERwin. This is critical in allowing existing data
modeling to be harvested into a single environment and, ultimately,
a single model.
RDA provides extensive data modeling capabilities that allow data models
to be defined at different levels of abstraction. RDA supports the
specification of glossary models, logical data models, physical data
models and the relationships and transformations between them. RDA can
be used to develop data models in either a top-down or bottom-up manner,
allowing analysis of existing databases and development of data models
by reverse engineering of existing data assets.
Information modeling in the SOA analysis and design process
Part 4
of this series introduces the information modeling activities that can be applied to
SOA projects. Figure 2 shows some of the main
integration points and dependencies between these information modeling
activities and some of the other important activities in the SOA
analysis and design process.
Figure 2. Information modeling in the SOA analysis and design process
There are many varying approaches to
modeling tools and methods for SOA. This article explains some of the
common approaches and how these are supported by IBM tools.
Glossary modeling
Proper SOA design advocates the use of a consistent business glossary
that provides alignment of business terms across the enterprise. This
series of articles positions WebSphere Business Glossary as the strategic
software solution from IBM to define business terminology and to share
it broadly across the enterprise. RDA tightly integrates with
WebSphere Business Glossary, allowing the business glossary to be
accessed and manipulated within the data modeling environment. RDA
glossary models can be exported to IBM Information Server, populating the
business glossary, and re-imported into RDA, updating the original
glossary model. This is supported through a set of 'metabrokers'
installed on top of RDA. This means that traceability can be maintained
between the glossary models and the conceptual model/logical data model
within RDA.
An RDA glossary model (.ndm) may be used as the entry point for
glossary specification, either by manually defining a new model, or by
importing assets such as IBM's Industry Models. Alternatively, a
glossary model may be created initially within WebSphere Business Glossary, and later exported
to RDA so that it may be mapped to other data modeling artifacts.
Figure 3 shows a business glossary model in RDA.
Figure 3. Business glossary model in RDA
Regardless of its origin, the presence of a glossary model in RDA
enables mapping of standard business terms into the formalized
structures of the canonical (conceptual/logical) data model. As
discussed in Part 4, this canonical model influences not
just the analysis and design of the data platform, but also that of
shared services across the enterprise, and the business processes that
consume these services. This means that data and service architects
have direct access to structured and consistent business terms when
defining or extending canonical data models and service models. This results in designs that align with each other across traditional
silos (such as LOB or product) and with the stated requirements of the
business.
RDA explicitly supports glossary modeling during the definition of
data models. RDA's content assist key (Ctrl+Space) displays the defined
words when you select an entity or attribute name, as Figure 4 shows.
Figure 4. Using RDA content assist
During model review, the terms used in the logical model can be
analyzed for compliance with glossary model terminology, as Figure 5 illustrates.
Figure 5. Analyzing a model in RDA
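The kind of compliance check described above can be approximated in a few lines. The following is a minimal Python sketch, not RDA's actual analysis rules: the glossary contents, the sample model, and the helper names `split_words` and `non_compliant` are all hypothetical and serve only to illustrate the idea of checking model names against approved terms.

```python
import re

# Hypothetical glossary of approved business terms (not taken from any IBM model).
GLOSSARY = {"customer", "account", "identifier", "name", "balance", "date"}

def split_words(name):
    """Split names like 'CustomerName' or 'acct_balance' into lowercase words."""
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", name)
    return [word.lower() for word in re.split(r"[\s_]+", spaced) if word]

def non_compliant(model, glossary):
    """Return {element name: [words missing from the glossary]}."""
    findings = {}
    for entity, attributes in model.items():
        for name in [entity, *attributes]:
            unknown = [w for w in split_words(name) if w not in glossary]
            if unknown:
                findings[name] = unknown
    return findings

# Hypothetical logical model: entity name -> attribute names.
model = {
    "Customer": ["CustomerIdentifier", "CustomerName"],
    "Account": ["AccountIdentifier", "acct_balance", "OpenDate"],
}

report = non_compliant(model, GLOSSARY)
for name, words in report.items():
    print(f"{name}: not in glossary -> {', '.join(words)}")
```

In this sketch, a name is flagged as soon as one of its component words is missing from the glossary. A real naming standard would typically also cover abbreviations and word order.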
Further detail on the role of the glossary model and support for
glossaries in WebSphere Business Glossary is provided in the second and
third articles of this series.
Business requirements and use cases
Many SOA projects use use-case modeling as a well-known technique for
capturing the business requirements and driving business process models.
Rational RequisitePro provides the requirements management tooling
necessary to formalize this process. The business glossary provides a
clear and unambiguous set of business entity definitions that can be
used in use-case documents to improve the accuracy and efficiency of
downstream tasks such as process and service modeling. RequisitePro
delivers an Eclipse plugin-based user interface component that can plug
into RDA, RSA or WebSphere Business Modeler. This means that
requirements definitions in RequisitePro can be traced directly to the
glossary definitions in RDA. This allows business rule specifications, service requirements or other requirement types to be expressed in terms of glossary terms. These structures are then mapped directly to the terms that they reference. The user interface capabilities of WebSphere Business Glossary Anywhere may also be leveraged to
provide broad-based access to glossary terms during requirements definition.
Canonical data modeling
As discussed in Part 4, canonical data
modeling is the approach to define a common representation of the
information in the scope of the project and, ultimately, of the entire
organization. It includes the concepts of conceptual and logical data
modeling. While conceptual and logical data modeling may result in two
separate yet related data models, more often, these are two sequential
phases of analysis acting upon the same data model.
The conceptual data model defines the major business entities,
generalizations of those entities, and relationships between the
entities. It may include the specification of some attributes of the
entities but it mainly defines the scope and not a comprehensive or
complete attribute definition. A logical data model (LDM) contains
representations of entities and attributes, relationships, unique
identifiers, sub-types and super-types, and constraints between
relationships, while remaining independent of the physical
representation of those data structures. An LDM does not just drive the analysis and design of downstream data persistence
structures defined in a physical data model (PDM) and instantiated in
operational systems or warehouses. It also drives the structures and
definitions of several other models
in the SOA development process, as Figure 2 illustrates.
Canonical data model to service model integration
Service modeling is typically performed using UML in Rational Software
Architect (RSA). It is an iterative process that often requires several
interchanges with the canonical data model in Rational Data Architect.
This interchange of content is supported through the UML-LDM and
LDM-UML transforms that are available with RDA and RSA. These allow an
LDM expressed in RDA to be transformed into a UML model
within RSA for use during service analysis. Figure 2
illustrates this in the linkage between the logical data
model and the service analysis model. Similarly, changes to this UML
model, perhaps based on business use cases in RSA, can be imported as
updates to the LDM in RDA. The UML-LDM and LDM-UML
transforms are used by creating a transformation configuration
(identifying the source and target models) and running the transform,
as Figure 6 illustrates.
Figure 6. Synchronizing UML and LDM models
Following a top-down approach, an LDM can be developed
based on the union of requirements expressed through information
engineering and UML-based analysis, through successive refinement of
the conceptual data model, and the subsequent addition of attributes,
keys, domain structures and so on, all in accordance with the naming
standards expressed through the business glossary. Specifically, the
UML2LDM transform pulls the type, subtype, relationship and attribute
definitions from a UML model (in this case a services analysis model)
and uses them to construct an LDM. In reverse, the
LDM2UML transform takes the same information from the logical data
model, and uses it to build a UML model. In this round trip, metadata
structures such as operations in the UML domain and keys in the LDM
space are not transformed. However, these constructs (and the
diagrams for each model type) can be maintained through the use of the
model merge capabilities of RSA or RDA respectively.
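The round trip described above, and the reason a merge step is needed, can be illustrated with a small Python sketch. It is a conceptual model only and does not reflect how the RDA or RSA transforms are implemented: the classes `UmlClass` and `LdmEntity` and the functions `uml_to_ldm`, `ldm_to_uml` and `merge_into_ldm` are hypothetical names introduced for this example.

```python
from dataclasses import dataclass, field

@dataclass
class UmlClass:
    name: str
    attributes: dict                                   # attribute name -> type name
    operations: list = field(default_factory=list)     # UML-only construct

@dataclass
class LdmEntity:
    name: str
    attributes: dict                                   # attribute name -> type name
    primary_key: list = field(default_factory=list)    # LDM-only construct

def uml_to_ldm(uml_class):
    """Carry over the name and attributes; operations have no LDM equivalent."""
    return LdmEntity(uml_class.name, dict(uml_class.attributes))

def ldm_to_uml(entity):
    """Carry over the name and attributes; keys have no UML equivalent here."""
    return UmlClass(entity.name, dict(entity.attributes))

def merge_into_ldm(existing, transformed):
    """Adopt the transformed structure but keep the keys of the existing entity."""
    kept = [k for k in existing.primary_key if k in transformed.attributes]
    return LdmEntity(transformed.name, dict(transformed.attributes), kept)

# Start from an LDM entity that has a key.
ldm = LdmEntity("Customer", {"customerId": "Integer", "name": "String"}, ["customerId"])

# Service analysis in UML adds an attribute and an operation.
uml = ldm_to_uml(ldm)
uml.attributes["birthDate"] = "Date"
uml.operations.append("getAge")

fresh = uml_to_ldm(uml)              # key information is absent after the transform
merged = merge_into_ldm(ldm, fresh)  # the merge restores it from the existing LDM

print(fresh.primary_key, merged.primary_key, sorted(merged.attributes))
```

The transformed entity carries the new `birthDate` attribute but no key. Merging it with the existing entity keeps both the new attribute and the original key definition.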
After entities and relationships are fully defined as a logical data
model, RDA can be used to transform the logical model into a
database-specific physical representation in the form of a physical
data model. Physical data models are database-specific models that
represent relational data objects (for example, tables, columns,
primary and foreign keys) and their relationships. A physical data
model can be used to generate data definition language (DDL) statements
which can then be deployed to a database server. RDA allows the modeler to change a data model and view the resulting changes to the DDL generated from this model. This DDL can then be executed directly against a server, updating the database specification, or stored as a file for distribution across the enterprise.
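To make the step from physical data model to DDL concrete, here is a minimal Python sketch. It is an illustration under stated assumptions rather than RDA's generator: the dictionary layout of `PHYSICAL_MODEL` and the function `generate_ddl` are hypothetical, and an in-memory SQLite database stands in for the target database server purely so that the example can be run anywhere.

```python
import sqlite3

# Hypothetical physical model: table -> columns, primary key, foreign keys.
PHYSICAL_MODEL = {
    "CUSTOMER": {
        "columns": {"CUSTOMER_ID": "INTEGER NOT NULL", "NAME": "VARCHAR(100)"},
        "primary_key": ["CUSTOMER_ID"],
        "foreign_keys": [],
    },
    "ACCOUNT": {
        "columns": {
            "ACCOUNT_ID": "INTEGER NOT NULL",
            "CUSTOMER_ID": "INTEGER NOT NULL",
            "BALANCE": "DECIMAL(15,2)",
        },
        "primary_key": ["ACCOUNT_ID"],
        "foreign_keys": [("CUSTOMER_ID", "CUSTOMER", "CUSTOMER_ID")],
    },
}

def generate_ddl(model):
    """Return one CREATE TABLE statement per table in the model."""
    statements = []
    for table, spec in model.items():
        lines = [f"  {col} {sql_type}" for col, sql_type in spec["columns"].items()]
        lines.append(f"  PRIMARY KEY ({', '.join(spec['primary_key'])})")
        for col, ref_table, ref_col in spec["foreign_keys"]:
            lines.append(f"  FOREIGN KEY ({col}) REFERENCES {ref_table} ({ref_col})")
        statements.append(f"CREATE TABLE {table} (\n" + ",\n".join(lines) + "\n)")
    return statements

ddl = generate_ddl(PHYSICAL_MODEL)

# Option 1: keep the DDL as a script that could be stored in a file.
script = ";\n\n".join(ddl) + ";"
print(script)

# Option 2: execute the DDL directly (SQLite is only a stand-in server here).
conn = sqlite3.connect(":memory:")
for statement in ddl:
    conn.execute(statement)
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
print(tables)
```

Changing an entry in the model and calling `generate_ddl` again shows the effect of that change on the generated statements, which mirrors the edit-and-preview workflow described above.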
Canonical data model to canonical message model integration
As outlined in the fourth article in this series, the canonical
message model represents the standardized format used for exchanging
business information on a service bus. Not every data structure passing
through the different layers of the architecture necessarily
complies with this model. Rather, the model provides the default
business data interchange formats so that all components need only
know (at most) their own data format and the default data format
(the one shared on the service bus). In most cases, new services either drive the message
format or they are implemented to comply
with this message format.
A common representation for the canonical message model is to use a
set of XML schemas. The process to define such a set of schemas varies
across projects. The example shown in Figure 2 is
typical for a project being driven top-down. Here the XML schemas are
an output of the technical service design in UML (in this case the IFW
model for financial services) which produces WSDL definitions for the
services as well as the included data element schema definitions
making up the service messages. Figure 7 shows the XML types in the RSA/RDA XML editor.
Figure 7. XML types in RSA/RDA XML editor
In other projects where the emphasis is on ESB or bottom-up reuse, the
canonical message model can be developed directly in XML or extracted
from generated service classes and message flows. For this scenario,
it can be useful to export the base data types directly from the
logical model as a starting point for XML development. Figure 8 shows
this function in RDA.
Figure 8. Exporting LDM from RDA as XML
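As a rough illustration of what exporting base data types from a logical model to XML schema involves, the following Python sketch builds one XSD complex type per entity. It does not reproduce the output of RDA's export function: the type mapping in `TYPE_MAP`, the sample `LOGICAL_MODEL`, the function `ldm_to_xsd` and the namespace `http://example.com/canonical` are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)

# Assumed mapping from logical data types to XSD built-in types.
TYPE_MAP = {
    "String": "xs:string",
    "Integer": "xs:integer",
    "Decimal": "xs:decimal",
    "Date": "xs:date",
}

def ldm_to_xsd(entities, target_namespace):
    """Build an xs:schema element with one complexType per logical entity."""
    schema = ET.Element(f"{{{XS}}}schema", {"targetNamespace": target_namespace})
    for entity, attributes in entities.items():
        complex_type = ET.SubElement(schema, f"{{{XS}}}complexType", {"name": entity})
        sequence = ET.SubElement(complex_type, f"{{{XS}}}sequence")
        for attribute, logical_type in attributes.items():
            ET.SubElement(sequence, f"{{{XS}}}element",
                          {"name": attribute, "type": TYPE_MAP[logical_type]})
    return schema

# Hypothetical logical model: entity -> {attribute: logical type}.
LOGICAL_MODEL = {
    "Customer": {"customerId": "Integer", "name": "String", "birthDate": "Date"},
    "Account": {"accountId": "Integer", "balance": "Decimal"},
}

schema = ldm_to_xsd(LOGICAL_MODEL, "http://example.com/canonical")
xsd_text = ET.tostring(schema, encoding="unicode")
print(xsd_text)
```

The resulting schema contains only the data types. Message wrappers, cardinalities and service-specific elements would still be added during XML development.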
Canonical data model to business process integration
Business process models are typically developed in two phases during an
SOA project.
- Business analysts perform high-level process modeling to define the
business activities and flows, identify KPIs, and analyze the costs
and benefits of potential changes by doing simulations.
WebSphere Business Modeler provides the tooling for these activities.
- Business process designers take the models designed by business
analysts and assemble the implementation. This may be performed in
WebSphere Business Modeler and/or WebSphere Integration Developer.
The business analyst working in WebSphere Business Modeler defines
business items which represent the information entities passing through
the business processes. In WebSphere Business Modeler V6.1 these
business item definitions can be imported directly from an LDM defined
in Rational Data Architect. Both forms of the canonical data model
(the conceptual data model or the logical data model) in Rational
Data Architect can provide the source business item definitions needed
depending on how fine-grained the business analyst needs to be in the
process models. Alternatively, the expression of data requirements
during process analysis may be as textual requirements, referring back
to the business terms of the glossary. This really depends on the
skillset of the process analysis community, the maturity of the logical
data model at the time of process analysis and other factors. In
particular, it can be process analysis itself, and subsequent use-case
based analysis of the service requirements of this process (through the
service analysis model) that drives some of the requirements that
define the canonical data model.
The business process designer builds out the details of the business
process and then binds service implementations to it. Data passed
through the business process is in the form of business items which
are described as XML schemas. These business items are referenced as
inputs and outputs for the tasks in the business process, and are often
constructed from the reusable XSD types that result from the service
design phase. The canonical message model can be imported directly
into WebSphere Integration Developer as an XML schema using Import ->
File System and specifying the xsd-includes target directory in your
business integration project. Figure 9 shows this step.
Figure 9. Importing canonical message model types into WebSphere Integration Developer
This results in all data types defined in the message model schema
being made available as business objects, as Figure 10 shows. In
practice, these business objects are commonly expressed as a reusable
library structure within WebSphere Integration Developer that is then
used to describe both service parameters, and the dataflow through the
business process.
Figure 10. Imported business objects in WebSphere Integration Developer
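The relationship between the types in a message model schema and the resulting business objects can be sketched outside the tooling. The following Python example simply lists the complex types of a schema together with their fields, which is the information that surfaces as business objects after an import. The schema text, the namespace `http://example.com/canonical` and the function `business_objects` are hypothetical, and the sketch makes no claim about how WebSphere Integration Developer processes the file internally.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical canonical message model schema.
SCHEMA_TEXT = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.com/canonical">
  <xs:complexType name="Customer">
    <xs:sequence>
      <xs:element name="customerId" type="xs:integer"/>
      <xs:element name="name" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="Account">
    <xs:sequence>
      <xs:element name="accountId" type="xs:integer"/>
      <xs:element name="balance" type="xs:decimal"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
"""

def business_objects(schema_text):
    """Return {type name: {field name: field type}} for each top-level complexType."""
    root = ET.fromstring(schema_text)
    objects = {}
    for complex_type in root.findall(f"{XS}complexType"):
        fields = {el.get("name"): el.get("type")
                  for el in complex_type.iter(f"{XS}element")}
        objects[complex_type.get("name")] = fields
    return objects

objects = business_objects(SCHEMA_TEXT)
for name, fields in objects.items():
    print(name, fields)
```

Because every service parameter and process variable refers to the same named types, a change to the schema has a single place of definition.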
Managing RDA models
RDA supports the comparison and synchronization of data models across
the enterprise. This is particularly useful in the maintenance of an
enterprise LDM, where extensions to that model may be
occurring within a number of distinct projects and are reconciled by
a model manager. RDA allows differences in models to be identified and
these differences accepted or rejected, merging accepted changes into
a resultant model. You can manage the resultant model by using a
source control management system such as ClearCase or CVS.
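The compare-and-merge cycle described above can be sketched as follows. This is a simplified Python illustration, not RDA's comparison algorithm: models are reduced to dictionaries of entities and typed attributes, deletions are ignored for brevity, and the function names `compare_models` and `merge` as well as the sample models are hypothetical.

```python
def compare_models(base, other):
    """List differences found in `other` relative to `base`.

    Each difference is a tuple (kind, entity, attribute, value).
    """
    diffs = []
    for entity, attributes in other.items():
        if entity not in base:
            diffs.append(("add_entity", entity, None, dict(attributes)))
            continue
        for attribute, data_type in attributes.items():
            if attribute not in base[entity]:
                diffs.append(("add_attribute", entity, attribute, data_type))
            elif base[entity][attribute] != data_type:
                diffs.append(("change_type", entity, attribute, data_type))
    return diffs

def merge(base, diffs, accept):
    """Apply only the differences for which `accept(diff)` returns True."""
    result = {entity: dict(attrs) for entity, attrs in base.items()}
    for diff in diffs:
        if not accept(diff):
            continue
        kind, entity, attribute, value = diff
        if kind == "add_entity":
            result[entity] = dict(value)
        else:
            result[entity][attribute] = value
    return result

# Enterprise LDM and a project-level extension of it (both hypothetical).
enterprise = {"Customer": {"customerId": "Integer", "name": "String"}}
project = {
    "Customer": {"customerId": "String", "name": "String", "segment": "String"},
    "Branch": {"branchId": "Integer"},
}

diffs = compare_models(enterprise, project)

# The model manager accepts additions but rejects the data type change.
merged = merge(enterprise, diffs, accept=lambda d: d[0] != "change_type")
print(diffs)
print(merged)
```

Here the `accept` callback plays the role of the model manager, who decides difference by difference what enters the resultant model.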
The RDA tooling can also relate and map one or more source models to
one or more target models. For example, you may want to map models that
are related but have developed independently such as a canonical data
model and physical data models of one or more legacy systems. This
helps you understand the distribution of data concepts across the
enterprise and the relationships between the same concepts expressed
in multiple models. Similarly, this capability is useful in the
definition of a single federated data model based on the physical
models of a number of discrete systems. RDA also supports mapping of
XML schemas to other model types, which is particularly useful where
a schema needs to be persisted in a data store, requiring a mapping
between the schema and the PDM of the data store.
Figure 11. Mapping of an XML schema against a PDM
Mapping may either be explicitly stated by the modeler, or
discovered by RDA. Mappings identified by RDA can be accepted or
rejected into the mapping model, allowing the modeler to retain
control over the mapping exercise.
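The combination of discovered and explicitly stated mappings can be illustrated with a short Python sketch. The discovery step here uses simple name similarity from the standard `difflib` module, which is an assumption made for this example and not a description of RDA's discovery techniques. The element and column names, the threshold of 0.8 and the functions `normalize` and `discover` are hypothetical.

```python
from difflib import SequenceMatcher

def normalize(name):
    """Lowercase a name and drop underscores so naming styles compare equally."""
    return name.replace("_", "").lower()

def discover(sources, targets, threshold=0.8):
    """Suggest, for each source, the most similar target at or above the threshold."""
    suggestions = []
    for source in sources:
        scored = [(SequenceMatcher(None, normalize(source), normalize(t)).ratio(), t)
                  for t in targets]
        score, best = max(scored)
        if score >= threshold:
            suggestions.append((source, best, score))
    return suggestions

# Hypothetical XML schema elements and physical data model columns.
schema_elements = ["customerId", "customerName", "postalCode", "loyaltyTier"]
pdm_columns = ["CUSTOMER_ID", "CUSTOMER_NAME", "POSTAL_CD", "TIER_CODE"]

suggestions = discover(schema_elements, pdm_columns)

# The modeler reviews each suggestion; here all suggestions are accepted.
mapping = {source: target for source, target, _ in suggestions}

# A mapping that discovery did not find is stated explicitly by the modeler.
mapping["loyaltyTier"] = "TIER_CODE"

for source, target in mapping.items():
    print(f"{source} -> {target}")
```

Name similarity finds the obvious pairs but misses `loyaltyTier`, which is why the modeler keeps the final say over the mapping model.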
Conclusion
In this article, you have seen that the patterns identified in Part 4 of this series can be followed using Rational Data Architect to
support canonical data modeling. Furthermore, canonical data models
can be made to comply with standardized business terms as expressed
through a business glossary. The resulting model structures of the
canonical data model may be interchanged with several other tools
used in the development process within an SOA project. Rational
Data Architect provides the tools to design information systems
including creation of the necessary SQL code to make those changes
a reality.
About the authors
Brian Byrne has over 10 years of experience in the design and development of distributed systems, spending 7 years driving the architecture of Industry Models across a range of industries. Brian is currently an architect within IBM's Information Management organization.
David McCarty is based at IBM's European Business Solution Center in La Gaude, France and has 20 years of experience designing and developing IT systems with IBM customers. He is currently a member of the Information as a Service Competency Center developing techniques and best practices for leveraging data systems in SOA solutions.
Guenter Sauter is an architect in the Information Platform & Solutions segment within IBM's software group. He is driving architectural patterns and usage scenarios across IBM's master data management and information platform technologies. Until recently, he was the head of an architect team developing the architecture approach, patterns and best practices for Information as a Service. He is the technical co-lead for IBM's SOA Scenario on Information as a Service.
Peter joined IBM three years ago after almost 25 years at institutions like the US Dept. of Defense, GE Corporate and Morgan Stanley where he held technical leadership positions and gained valuable experience in Enterprise Architecture and Enterprise Data Integration. He initially joined IBM as a Sr. IT Architect as part of the architect team for Information as a Service. Currently he is a Solutions Marketing Manager for the IPS Global Services organization, specializing in MDM solutions.