Rational Data Architect (RDA) is one of the strategic IBM tools that addresses various modeling and design requirements within the information management domain. It is built on the Rational Software Development platform (and implicitly on Eclipse). This provides an integrated tooling environment linking the capabilities of RDA with those of Rational Software Architect (RSA), Rational RequisitePro, ClearCase and other key parts of the Rational Software Development platform. Figure 1 shows an overview of the major capabilities of RDA.
Figure 1. RDA capability overview
The core functions of RDA go beyond just data modeling. This article introduces the major capabilities of RDA and shows how these play a role in the SOA analysis and design process. These capabilities include:
Standardize - Supporting the Business Glossary
Part 2 in this series discusses how data naming standards promote a common understanding of business terms. Sharing these terms across organizational boundaries can reduce data redundancy through a more consistent expression of data requirements, with the aim of consolidating synonymous and overlapping data terms.
Model - Supporting conceptual, logical, and physical data modeling
Part 4 in this series outlines the role of conceptual and logical data modeling within SOA. RDA provides extensive capabilities to define data models on a conceptual, logical and physical level and to convert between UML and Information Engineering notations as needed. This allows modeling of data at different levels of abstraction while maintaining consistency and traceability between these models.
Visualize - Supporting diagrams of data models and deployed databases
Data models quickly become very large, with many entities and relationships. RDA supports visualization of data models through a range of diagram types. These diagrams can be very detailed, showing full entity, relationship and attribute structures, or they can provide a higher-level view of the core relationships between coarse-grained concepts. RDA also supports visualization of deployed databases through the extraction of physical data models.
XML is the standard language for service specification (through WSDL/XSD). Much of the data that is being processed, shared, and exchanged by services needs to be persisted in a database. RDA supports visualization of XML schema definitions and mapping of the XML schema to logical data models. If required, RDA assists in shredding the XML data into a relational form.
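To make the idea of shredding concrete, the following is a minimal Python sketch that flattens a small nested XML document into two relational tables using only the standard library. The message layout, table names and the `shred_orders` function are hypothetical; the sketch illustrates the general technique and does not represent RDA output or any RDA API.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical service message; element and attribute names are illustrative only.
XML_DOC = """
<orders>
  <order id="1" customer="ACME">
    <item sku="A-100" qty="2"/>
    <item sku="B-200" qty="1"/>
  </order>
  <order id="2" customer="Globex">
    <item sku="A-100" qty="5"/>
  </order>
</orders>
"""

def shred_orders(xml_text, conn):
    """Shred nested order XML into a parent table and a child table."""
    conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT)")
    conn.execute(
        "CREATE TABLE order_items ("
        "order_id INTEGER REFERENCES orders(order_id), sku TEXT, qty INTEGER)"
    )
    root = ET.fromstring(xml_text)
    for order in root.findall("order"):
        order_id = int(order.get("id"))
        conn.execute("INSERT INTO orders VALUES (?, ?)", (order_id, order.get("customer")))
        # Each repeating child element becomes a row keyed back to its parent.
        for item in order.findall("item"):
            conn.execute(
                "INSERT INTO order_items VALUES (?, ?, ?)",
                (order_id, item.get("sku"), int(item.get("qty"))),
            )
    conn.commit()

conn = sqlite3.connect(":memory:")
shred_orders(XML_DOC, conn)
order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
total_qty = conn.execute("SELECT SUM(qty) FROM order_items").fetchone()[0]
```

The nesting in the XML turns into a foreign key relationship between the two tables, which is the essential step in any shredding approach.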
RDA supports complex relationships between different levels of a model. For example, it can map between a logical and physical model, that is, between the concepts of the canonical data model and the realization of these concepts in the data platform. This allows analysis and traceability of the relationship between data patterns, both across lines of business and at varying levels of abstraction.
The IT infrastructure of an organization is dynamic and changes over time. A logical model may have changed without propagating the change in the physical model or vice versa. When different data models become "out of sync," it is important that you can compare them, identify the differences and then resolve them by synchronizing the models.
RDA allows mapping models to be used to specify a join of information that may reside in multiple sources. It brings the information together in a virtual federated view and generates the necessary statements that you can deploy on IBM Information Server (InfoSphere® Federation Server) or DB2.
Once a data model is specified, RDA allows you to create the SQL code for DB2 so that you can deploy the data model on a database. RDA also allows you to develop code for Java applications around this data model or to develop stored procedures.
RDA models can be stored within the RDA tool, exported to the unified metadata platform of IBM Information Server, or checked into a versioning system such as Rational ClearCase.
RDA supports team collaboration through integration with Rational ClearCase and Rational ClearQuest. This promotes sharing of models amongst team members, and makes tracking the model changes much easier.
RDA integrates with a range of database and data modeling tools as well as SOA tools and environments. This allows the canonical data models that express the reusable data concepts to be integrated tightly with other artifacts such as business glossaries, service definitions and business processes.
RDA can import and export artifacts from a variety of environments such as CA ERwin. This is critical in allowing existing data modeling to be harvested into a single environment and, ultimately, a single model.
RDA provides extensive data modeling capabilities that allow data models to be defined at different levels of abstraction. RDA supports the specification of glossary models, logical data models, physical data models and the relationships and transformations between them. RDA can be used to develop data models in either a top-down or bottom-up manner, allowing analysis of existing databases and development of data models by reverse engineering of existing data assets.
Part 4 of this series introduces the information modeling activities that can be applied to SOA projects. Figure 2 shows some of the main integration points and dependencies between these information modeling activities and some of the other important activities in the SOA analysis and design process.
Figure 2. Information modeling in the SOA analysis and design process
There are many varying approaches to modeling tools and methods for SOA. This article explains some of the common approaches and how these are supported by IBM tools.
Proper SOA design advocates the use of a consistent business glossary that provides alignment of business terms across the enterprise. This series of articles positions InfoSphere Business Glossary as the strategic software solution from IBM to define business terminology and to share it broadly across the enterprise. RDA tightly integrates with InfoSphere Business Glossary, allowing the business glossary to be accessed and manipulated within the data modeling environment. RDA glossary models can be exported to IBM Information Server, populating the business glossary, and re-imported into RDA, updating the original glossary model. This is supported through a set of 'metabrokers' installed on top of RDA. This means that traceability can be maintained between the glossary models and the conceptual model/logical data model within RDA.
An RDA glossary model (.ndm) may be used as the entry point for glossary specification, either by manually defining a new model, or by importing assets such as IBM's Industry Models. Alternatively, a glossary model may be created initially within InfoSphere Business Glossary, and later exported to RDA so that it may be mapped to other data modeling artifacts.
Figure 3 shows a business glossary model in RDA.
Figure 3. Business glossary model in RDA
Regardless of its origin, the presence of a glossary model in RDA enables mapping of standard business terms into the formalized structures of the canonical (conceptual/logical) data model. As discussed in Part 4, this canonical model influences not just the analysis and design of the data platform, but also that of shared services across the enterprise, and the business processes that consume these services. This means that data and service architects have direct access to structured and consistent business terms when defining or extending canonical data models and service models. This results in designs that align with each other across traditional silos (such as LOB or product) and with the stated requirements of the business.
RDA explicitly supports glossary modeling during the definition of data models. RDA's content assist key (Ctrl+Space) displays the defined words when you select an entity or attribute name, as Figure 4 shows.
Figure 4. Using RDA content assist
During model review, the terms used in the logical model can be analyzed for compliance with glossary model terminology, as Figure 5 illustrates.
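The kind of check performed during such a review can be approximated in a few lines of Python. This is only a sketch of the idea, assuming a glossary held as a plain set of approved words and a logical model held as a dictionary; the `GLOSSARY`, `split_words` and `non_compliant` names are hypothetical and are not part of RDA.

```python
import re

# Hypothetical glossary of approved business terms (not an RDA export format).
GLOSSARY = {"customer", "account", "balance", "identifier", "name"}

def split_words(name):
    """Split names such as 'CustomerAccount' or 'ACCT_BAL' into lowercase words."""
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", name)
    return [w.lower() for w in re.split(r"[\s_]+", spaced) if w]

def non_compliant(model, glossary):
    """Return {name: [words not in the glossary]} for entity and attribute names."""
    findings = {}
    for entity, attributes in model.items():
        for name in [entity, *attributes]:
            unknown = [w for w in split_words(name) if w not in glossary]
            if unknown:
                findings[name] = unknown
    return findings

# Hypothetical logical model: entity name -> list of attribute names.
LOGICAL_MODEL = {
    "CustomerAccount": ["AccountIdentifier", "CustomerName", "ACCT_BAL"],
}
report = non_compliant(LOGICAL_MODEL, GLOSSARY)
```

In this example only `ACCT_BAL` is reported, because its abbreviated words do not appear in the glossary, while the other names are built entirely from approved terms.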
Figure 5. Analyzing a model in RDA
Further detail on the role of the glossary model and support for glossaries in InfoSphere Business Glossary is provided in the second and third articles of this series.
Many SOA projects use use-case modeling as a well-known technique for capturing the business requirements and driving business process models. Rational RequisitePro provides the requirements management tooling necessary to formalize this process. The business glossary provides a clear and unambiguous set of business entity definitions that can be used in use-case documents to improve the accuracy and efficiency of downstream tasks such as process and service modeling. RequisitePro delivers an Eclipse plugin-based user interface component that can plug into RDA, RSA or WebSphere Business Modeler. This means that requirements definitions in RequisitePro can be traced directly to the glossary definitions in RDA. This allows business rule specifications, service requirements or other requirement types to be expressed in terms of glossary terms. These structures are then mapped directly to the terms that they reference. The user interface capabilities of InfoSphere Business Glossary Anywhere may also be leveraged to provide broad-based access to glossary terms during requirements definition.
As discussed in Part 4, canonical data modeling is the approach to define a common representation of the information in the scope of the project and, ultimately, of the entire organization. It includes the concepts of conceptual and logical data modeling. While conceptual and logical data modeling may result in two separate yet related data models, more often these are two sequential phases of analysis acting upon the same data model.
The conceptual data model defines the major business entities, generalizations of those entities, and relationships between the entities. It may include the specification of some attributes of the entities, but it mainly defines the scope rather than a comprehensive or complete attribute definition. A logical data model (LDM) contains representations of entities and attributes, relationships, unique identifiers, sub-types and super-types, and constraints between relationships, while remaining independent of the physical representation of those data structures. A LDM does not just drive the analysis and design of downstream data persistence structures defined in a physical data model (PDM) and instantiated in operational systems or warehouses. It also drives the structures and definitions of several other models in the SOA development process, as Figure 2 illustrates.
Service modeling is typically performed using UML in Rational Software Architect (RSA). It is an iterative process that often requires several interchanges with the canonical data model in Rational Data Architect. This interchange of content is supported through the UML-LDM and LDM-UML transforms that are available with RDA and RSA. These allow a LDM expressed in RDA to be transformed into a UML model within RSA for use during service analysis. Figure 2 illustrates this in the linkage between the logical data model and the service analysis model. Similarly, changes to this UML model, perhaps based on business use cases in RSA, can be imported as updates to the LDM in RDA. The UML-LDM and LDM-UML transforms are used by creating a transformation configuration (identifying the source and target models) and running the transform, as Figure 6 illustrates.
Figure 6. Synchronizing UML and LDM models
Following a top-down approach, a LDM can be developed based on the union of requirements expressed through information engineering and UML-based analysis. This is done through successive refinement of the conceptual data model and the subsequent addition of attributes, keys, domain structures and so on, all in accordance with the naming standards expressed through the business glossary. Specifically, the UML2LDM transform pulls the type, subtype, relationship and attribute definitions from a UML model (in this case a services analysis model) and uses them to construct a LDM. In reverse, the LDM2UML transform takes the same information from the logical data model and uses it to build a UML model. In this round trip, metadata structures such as operations in the UML domain and keys in the LDM space are not transformed. However, these constructions (and the diagrams for each model type) can be maintained through the use of the model merge capabilities of RSA or RDA respectively.
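The round-trip behavior described above can be illustrated with a small Python sketch. It is a deliberately simplified model, assuming each class or entity carries only a name and typed attributes (subtypes and relationships are omitted), and the `UmlClass`, `LdmEntity` and `merge_uml` names are hypothetical stand-ins rather than the actual transform implementation in RDA or RSA.

```python
from dataclasses import dataclass, field

@dataclass
class UmlClass:
    name: str
    attributes: dict                                   # attribute name -> type name
    operations: list = field(default_factory=list)     # UML-only construct

@dataclass
class LdmEntity:
    name: str
    attributes: dict
    primary_key: list = field(default_factory=list)    # LDM-only construct

def uml_to_ldm(cls):
    """Carry over name and attributes; operations have no LDM counterpart."""
    return LdmEntity(cls.name, dict(cls.attributes))

def ldm_to_uml(entity):
    """Carry over name and attributes; keys have no UML counterpart in this sketch."""
    return UmlClass(entity.name, dict(entity.attributes))

def merge_uml(original, regenerated):
    """Toy merge: attributes from the regenerated class, operations from the original."""
    return UmlClass(regenerated.name, dict(regenerated.attributes), list(original.operations))

service_class = UmlClass("Account", {"accountId": "String", "balance": "Decimal"}, ["close()"])
entity = uml_to_ldm(service_class)
entity.attributes["currency"] = "String"    # change made on the data modeling side
round_tripped = ldm_to_uml(entity)          # operations are lost at this point
merged = merge_uml(service_class, round_tripped)
```

The regenerated class has no operations, which is why a merge step against the original UML model is needed to keep them while still picking up the new attribute.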
After entities and relationships are fully defined as a logical data model, RDA can be used to transform the logical model into a database-specific physical representation in the form of a physical data model. Physical data models are database-specific models that represent relational data objects (for example, tables, columns, primary and foreign keys) and their relationships. A physical data model can be used to generate data definition language (DDL) statements, which can then be deployed to a database server. RDA allows the modeler to change a data model and view the changes to the DDL generated from this model. This DDL can then be executed directly against a server, updating the database specification, or stored as a file for distribution across the enterprise.
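As a rough illustration of generating DDL from a model, the Python sketch below renders a `CREATE TABLE` statement from a simple entity description. The entity layout, the `TYPE_MAP` type mapping and the `to_ddl` function are assumptions made for this example; RDA's own generation is driven by the physical data model and the target database, and SQLite is used here only so that the generated statement can be executed.

```python
import sqlite3

# Hypothetical mapping from logical types to physical column types.
TYPE_MAP = {"identifier": "INTEGER", "text": "VARCHAR(100)", "amount": "DECIMAL(15,2)"}

def to_ddl(entity):
    """Render one CREATE TABLE statement from a simple entity description."""
    columns = [
        f"  {name.upper()} {TYPE_MAP[logical_type]} NOT NULL"
        for name, logical_type in entity["attributes"].items()
    ]
    key_cols = ", ".join(k.upper() for k in entity["primary_key"])
    columns.append(f"  PRIMARY KEY ({key_cols})")
    return f"CREATE TABLE {entity['name'].upper()} (\n" + ",\n".join(columns) + "\n)"

account = {
    "name": "account",
    "attributes": {"account_id": "identifier", "holder_name": "text", "balance": "amount"},
    "primary_key": ["account_id"],
}
ddl = to_ddl(account)

# SQLite stands in for the target server purely so the sketch can be executed.
conn = sqlite3.connect(":memory:")
conn.execute(ddl)
column_names = [row[1] for row in conn.execute("PRAGMA table_info(ACCOUNT)")]
```

Changing the `account` description and calling `to_ddl` again shows how a model change flows through to the generated statement, which can then be run against a server or saved to a file.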
As outlined in the fourth article in this series, the canonical message model represents the standardized format used for exchanging business information on a service bus. Not every data structure passing through the different layers of the architecture necessarily complies with this model. Rather, the model provides the default business data interchange formats so that all components need only know (at most) their own data format and the default data format (that shared on a service bus). In most cases, new services either drive the message format or they are implemented to comply with this message format.
A common representation for the canonical message model is to use a set of XML schemas. The process to define such a set of schemas varies across projects. The example shown in Figure 2 is typical for a project being driven top-down. Here the XML schemas are an output of the technical service design in UML (in this case the IFW model for financial services) which produces WSDL definitions for the services as well as the included data element schema definitions making up the service messages. Figure 7 shows the XML types in RSA/RDA XML editor; for a larger version, see the sidefile.
Figure 7. XML types in RSA/RDA XML editor
In other projects, where the emphasis is on ESB or bottom-up reuse, the canonical message model can be developed directly in XML or extracted from generated service classes and message flows. For this scenario, it can be useful to export the base data types directly from the logical model as a starting point for XML development. Figure 8 shows this function in RDA.
Figure 8. Exporting LDM from RDA as XML
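To show what such an export amounts to, here is a minimal Python sketch that turns one entity into an XML Schema complex type using the standard library. The entity, the `XSD_TYPES` mapping and the `entity_to_complex_type` helper are hypothetical, and the schema RDA actually produces will differ in structure and detail.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)

# Hypothetical mapping from logical attribute types to XSD built-in types.
XSD_TYPES = {"text": "xs:string", "amount": "xs:decimal", "date": "xs:date"}

def entity_to_complex_type(schema, name, attributes):
    """Append one xs:complexType with a sequence of elements for an entity."""
    complex_type = ET.SubElement(schema, f"{{{XS}}}complexType", {"name": name})
    sequence = ET.SubElement(complex_type, f"{{{XS}}}sequence")
    for attr_name, logical_type in attributes.items():
        ET.SubElement(
            sequence,
            f"{{{XS}}}element",
            {"name": attr_name, "type": XSD_TYPES[logical_type]},
        )
    return complex_type

schema = ET.Element(f"{{{XS}}}schema")
entity_to_complex_type(
    schema, "Account", {"holderName": "text", "balance": "amount", "openedOn": "date"}
)
xsd_text = ET.tostring(schema, encoding="unicode")
element_names = [e.get("name") for e in schema.iter(f"{{{XS}}}element")]
```

The resulting types give XML developers a starting point that already uses the names and base types agreed in the logical model.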
Business process models are typically developed in two phases during an SOA project.
- Business analysts perform high-level process modeling to define the business activities and flows, identify KPIs, and analyze the costs and benefits of potential changes by doing simulations. WebSphere Business Modeler provides the tooling for these activities.
- Business process designers take the models designed by business analysts and assemble the implementation. This may be performed in WebSphere Business Modeler and/or WebSphere Integration Developer.
The business analyst working in WebSphere Business Modeler defines business items, which represent the information entities passing through the business processes. In WebSphere Business Modeler V6.1 these business item definitions can be imported directly from a LDM defined in Rational Data Architect. Both forms of the canonical data model (the conceptual data model or the logical data model) in Rational Data Architect can provide the source business item definitions needed, depending on how fine-grained the business analyst needs the process models to be. Alternatively, data requirements may be expressed during process analysis as textual requirements that refer back to the business terms of the glossary. The choice depends on the skillset of the process analysis community, the maturity of the logical data model at the time of process analysis and other factors. In particular, it can be process analysis itself, and subsequent use-case based analysis of the service requirements of this process (through the service analysis model), that drives some of the requirements that define the canonical data model.
The business process designer builds out the details of the business process and then binds service implementations to it. Data passed through the business process is in the form of business items, which are described as XML schemas. These business items are referenced as inputs and outputs for the tasks in the business process, and are often constructed from the reusable XSD types that result from the service design phase. The canonical message model can be imported directly into WebSphere Integration Developer as an XML schema using import -> File System and specifying the xsd-includes target directory in your business integration project. Figure 9 shows this step.
Figure 9. Importing canonical message model types into WebSphere Integration Developer
This results in all data types defined in the message model schema being made available as business objects as Figure 10 shows. In practice, these business objects are commonly expressed as a reusable library structure within WebSphere Integration Developer that is then used to describe both service parameters, and the dataflow through the business process.
Figure 10. Imported business objects in WebSphere Integration Developer
RDA supports the comparison and synchronization of data models across the enterprise. This is particularly useful in the maintenance of an enterprise LDM, where extensions to that model may be occurring within a number of distinct projects and are reconciled by a model manager. RDA allows differences in models to be identified and these differences accepted or rejected, merging accepted changes into a resultant model. You can manage the resultant model by using a source control management system such as ClearCase or CVS.
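The compare, review and merge cycle can be sketched in Python as follows. This is a simplified illustration, assuming models are plain dictionaries of entities and typed attributes; the `compare_models` and `merge_models` functions and the review policy are hypothetical and do not reflect how RDA represents or merges models internally.

```python
def compare_models(base, other):
    """List differences between two models given as {entity: {attribute: type}}."""
    diffs = []
    for entity in sorted(set(base) | set(other)):
        base_attrs = base.get(entity, {})
        other_attrs = other.get(entity, {})
        for attr in sorted(set(base_attrs) | set(other_attrs)):
            old, new = base_attrs.get(attr), other_attrs.get(attr)
            if old != new:
                diffs.append((entity, attr, old, new))
    return diffs

def merge_models(base, diffs, accept):
    """Apply only the differences for which accept(diff) is true."""
    merged = {entity: dict(attrs) for entity, attrs in base.items()}
    for entity, attr, old, new in diffs:
        if not accept((entity, attr, old, new)):
            continue
        if new is None:
            merged.get(entity, {}).pop(attr, None)    # attribute removed in other model
        else:
            merged.setdefault(entity, {})[attr] = new
    return merged

enterprise = {"Customer": {"id": "INTEGER", "name": "VARCHAR(50)"}}
project = {"Customer": {"id": "INTEGER", "name": "VARCHAR(100)", "nickname": "VARCHAR(20)"}}

diffs = compare_models(enterprise, project)
# Hypothetical review policy: accept type changes, reject brand-new attributes.
merged = merge_models(enterprise, diffs, accept=lambda d: d[2] is not None)
```

Here the widened `name` column is merged into the enterprise model while the project-specific `nickname` attribute is rejected, mirroring the accept or reject decision a model manager makes for each difference.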
The RDA tooling can also relate and map one or more source models to one or more target models. For example, you may want to map models that are related but have developed independently such as a canonical data model and physical data models of one or more legacy systems. This helps you understand the distribution of data concepts across the enterprise and the relationships between the same concepts expressed in multiple models. Similarly, this capability is useful in the definition of a single federated data model based on the physical models of a number of discrete systems. RDA also supports mapping of XML schemas to other model types, which is particularly useful where a schema needs to be persisted in a data store, requiring a mapping between the schema and the PDM of the data store.
Figure 11. Mapping of an XML schema against a PDM
Mapping may either be explicitly stated by the modeler, or discovered by RDA. Mappings identified by RDA can be accepted or rejected into the mapping model, allowing the modeler to retain control over the mapping exercise.
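One simple way to picture discovered mappings is name similarity. The Python sketch below proposes candidate mappings between schema element names and column names using the standard `difflib` module, and leaves acceptance to the modeler. The matching rule, threshold and function names are assumptions chosen for illustration; the source does not describe how RDA's own discovery works.

```python
from difflib import SequenceMatcher

def normalize(name):
    """Lowercase a name and drop separators so 'CUST_NAME' and 'custName' compare equal."""
    return "".join(ch for ch in name.lower() if ch.isalnum())

def discover_mappings(source_names, target_names, threshold=0.6):
    """Propose (source, target, score) candidates for review; nothing is auto-accepted."""
    proposals = []
    for source in source_names:
        scored = [
            (SequenceMatcher(None, normalize(source), normalize(target)).ratio(), target)
            for target in target_names
        ]
        score, best = max(scored)
        if score >= threshold:
            proposals.append((source, best, round(score, 2)))
    return proposals

# Hypothetical XML schema element names and physical column names.
xsd_elements = ["customerName", "postalCode", "orderTotal"]
pdm_columns = ["CUSTOMER_NAME", "POSTAL_CODE", "CREATED_TS"]

proposals = discover_mappings(xsd_elements, pdm_columns)
# The modeler's decision: here only exact matches after normalization are accepted.
accepted = [(s, t) for s, t, score in proposals if score == 1.0]
```

Keeping proposal and acceptance as separate steps reflects the point made above: discovery suggests candidates, but the modeler decides what enters the mapping model.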
In this article, you have seen that the patterns identified in Part 4 of this series can be followed using Rational Data Architect to support canonical data modeling. Furthermore, canonical data models can be made to comply with standardized business terms as expressed through a business glossary. The resulting model structures of the canonical data model may be interchanged with several other tools used in the development process within an SOA project. Rational Data Architect provides the tools to design information systems including creation of the necessary SQL code to make those changes a reality.
Brian Byrne has over 10 years of experience in the design and development of distributed systems, spending 7 years driving the architecture of Industry Models across a range of industries. Brian is currently an architect within IBM's Information Management organization.
David McCarty is based at IBM's European Business Solution Center in La Gaude, France, and has 20 years of experience designing and developing IT systems with IBM customers. He is currently a member of the Information as a Service Competency Center, developing techniques and best practices for leveraging data systems in SOA solutions.
Guenter Sauter is an architect in the Information Platform & Solutions segment within IBM's software group. He is driving architectural patterns and usage scenarios across IBM's master data management and information platform technologies. Until recently, he was the head of an architect team developing the architecture approach, patterns and best practices for Information as a Service. He is the technical co-lead for IBM's SOA Scenario on Information as a Service.
Peter joined IBM three years ago after almost 25 years at institutions like the US Dept. of Defense, GE Corporate and Morgan Stanley where he held technical leadership positions and gained valuable experience in Enterprise Architecture and Enterprise Data Integration. He initially joined IBM as a Sr. IT Architect as part of the architect team for Information as a Service. Currently he is a Solutions Marketing Manager for the IPS Global Services organization, specializing in MDM solutions.