Extend IBM InfoSphere Data Architect to meet your specific data modeling and integration requirements, Part 2: Build customized reports and validation rules with IDA

IBM InfoSphere Data Architect (IDA, formerly Rational Data Architect) is gaining momentum as a comprehensive tool that helps organizations promote a thorough understanding of their enterprise information architecture. As more people use IDA, there's an increasing need for some customers to extend IDA to meet their unique data modeling and integration requirements. This two-part series shows you how to extend IDA's models, properties view, model reports and validation rules. In Part 1, you learned how to programmatically traverse and modify IDA models and how to add and display custom properties. In Part 2, learn how to generate customized reports and how to add your own validation rules for IDA models.

[02 Jul 2009: This article was updated to reflect that the GUI for BIRT report customization has been changed for performance improvement. This is shown in the section "Generate customized model reports using BIRT."--Ed.]

[07 Oct 2010: This article was updated to reflect InfoSphere Data Architect 7.5.3 capabilities.--Ed.]

Wei Liu (liuw@us.ibm.com), Software Engineer, IBM

Wei Liu is a software engineer working at the IBM Seattle office in Seattle, Washington. She works on data tooling and modeling.



07 October 2010 (First published 11 September 2008)


Introduction

IBM InfoSphere Data Architect (IDA) is a comprehensive development environment for data modeling and integration. IDA enables users to discover, model, visualize, and relate diverse and distributed data assets. IDA is an integrated data management offering from IBM and is tightly integrated with the Optim®, Rational, and InfoSphere products built using Eclipse. IDA supports logical, physical, glossary, storage, domain, and integration data modeling. As more enterprise customers use IDA, there is an increasing need to extend IDA to meet their unique data modeling and integration requirements.

Product name change

On December 16th, 2008, IBM announced that as of Version 7.5.1, Rational Data Architect is renamed to InfoSphere Data Architect to feature its role in InfoSphere Foundation Tools.

As mentioned before, IDA is Eclipse-based and therefore highly extensible by design. In this series, learn more about the extension points, APIs, and factories available with IDA that you can use to extend IDA. See how to extend IDA to:

  • Programmatically traverse and modify IDA models (Part 1)
  • Add new properties and display them in the Properties View (Part 1)
  • Generate customized model reports (Part 2)
  • Add model validation rules (Part 2)

The Eclipse BIRT project

The BIRT project is part of the Eclipse framework that provides infrastructure and tools to design, develop, and deploy report content to your Java®/J2EE application. BIRT has two main components: a report designer and a runtime component. BIRT also offers a charting engine that enables you to add charts to your applications. (See Resources for more information about BIRT).

This article assumes that you have Eclipse plug-in development experience and basic knowledge of the Eclipse EMF and BIRT projects. The sample code provided in this article was tested on IDA 7.5.1 and 7.5.2 and on Eclipse 3.4.1 and 3.4.2.


Generate customized model reports using BIRT

Reporting is an important feature of IDA. It provides information on the whole or part of a model: that is, a list of objects and their relationships. This information can be copied, printed, and distributed as one physical document. Reports are also used to provide compliance information in many organizations. IDA provides a wide range of built-in reports or templates for your logical, physical, glossary, and mapping models. BIRT has been integrated and extended to provide more flexible reporting and customization capabilities since RDA 7003 (version 7 fixpack 3). IDA reporting uses BIRT in combination with the Open Data Access (ODA) component.

The ODA and EMF ODA driver

The Open Data Access (ODA) component is an open and flexible data access framework that provides a uniform, scalable way to retrieve data from heterogeneous enterprise data sources. BIRT provides JDBC, XML, Web services, and flat file support, as well as support for using code to get access to other sources of data. BIRT's use of the ODA framework allows anyone to build new user-interface support and runtime support for any kind of tabular data. Using the ODA extension framework, Eclipse developers can create new types of data components that will access data from custom data sources, through a user interface that is similar to the out-of-the-box BIRT data sources. (See Resources for more information on ODA).

IDA uses the EMF ODA driver developed at IBM to let you build complex queries that are executed directly against an EMF source, such as IDA models. Using the EMF ODA driver, either EMF model instances or Ecore meta-models can be defined as a data source to provide model structure information at report design time. At runtime (or preview time), model instances as a data source must be bound to the report in order for the report to be rendered.

In this section, you will do the following:

  1. Create a report and specify a data source
  2. Create a Tables data set that is used to display tables in the report
  3. Generate a sample report from existing IDA reports
  4. Customize the report to add a column that reports on the masking method used for a table column

You will learn how the EMF ODA driver works by completing these steps. You can then customize the reports as needed using the BIRT designer.

Step 1: Create a new report and specify a data source

Follow these steps to create a new BIRT report design with an EMF data source:

  1. Open the Report Design perspective.
  2. Create a new report using File > New > Report.
  3. Select a parent folder and Simple Listing report template in the new report wizard, and click Finish.
  4. Right-click on the Data Sources folder in the Data Explorer and select New Data Source.
  5. Select EMF Data Source and enter SAMPLE Data Source in the New Data Source window, as shown in Figure 1.
Figure 1. New data source dialog
  6. Click Next.
  7. Select Add to add the SAMPLE.dbm (created in Part 1 of this series) as the EMF data instance, as shown in Figure 2.
Figure 2. Add SAMPLE.dbm as EMF data instance
  8. Click Finish.

The SAMPLE model, which is a physical model instance, serves as the data source in these steps. As mentioned, you can also define meta-models as the data source when designing a report.

Step 2: Create a Tables data set

Follow these steps to create a Tables data set that gets all the tables in a model:

  1. Right-click on the Data Sets folder in the Data Explorer and select New Data Set.
  2. Type in Tables Data Set as the data set name, and click Next.
  3. Click Next on the Query Parameters page.
  4. In the Row Mapping page, click on the drop-down arrow and select sample.dbm, as shown in Figure 3.
Figure 3. Select sample.dbm to browse in the row mapping page

The database, schema, index, persistent table, and column objects contained in the SAMPLE model are then listed in the Browse area. You can expand objects in this page and get familiar with their structure.

  5. Select any of the persistent tables in the Browse area, click the > buttons to add query expressions, and set the query type, as shown in Figure 4.
Figure 4. Set query expression and type from a persistent table in the row mapping page
  6. Click Next.
  7. In the Column Mapping page, click on the drop-down arrow, and select SQLTables:PersistentTable, as shown in Figure 5.
Figure 5. Select the persistent table to browse in the column mapping dialog

The structure of PersistentTable is populated.

  8. Select name:EString from the Browse area, and click the > button to add it as a column query, as shown in Figure 6.
Figure 6. Add persistent table name as a column query in the column mapping dialog
  9. Click Finish.
  10. Click Preview Results. You should see a list of tables, as shown in Figure 7.
Figure 7. Data set preview results
  11. Click OK to finish. You should see the SAMPLE Data Source and Tables Data Set created in the Data Explorer, as shown in Figure 8.
Figure 8. A SAMPLE Data Source and Tables Data Set are created
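
The row mapping and column mapping pages divide the work of a data set in two: the row mapping decides which model objects become rows, and the column mapping decides which attributes of each object become columns. The plain-Java sketch below mimics that split for the Tables data set. The ModelObject type, the TablesDataSetSketch class, and the sample object names are hypothetical stand-ins for illustration; this is not the EMF ODA driver's API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in for an EMF model object: a type name, a name, and children.
final class ModelObject {
    final String type;
    final String name;
    final List<ModelObject> children = new ArrayList<>();

    ModelObject(String type, String name) {
        this.type = type;
        this.name = name;
    }

    ModelObject add(ModelObject child) {
        children.add(child);
        return this;
    }
}

final class TablesDataSetSketch {

    // Row mapping: walk the containment tree and keep every object of the requested type.
    static List<ModelObject> rows(ModelObject root, String type) {
        List<ModelObject> result = new ArrayList<>();
        collect(root, type, result);
        return result;
    }

    private static void collect(ModelObject node, String type, List<ModelObject> out) {
        if (node.type.equals(type)) {
            out.add(node);
        }
        for (ModelObject child : node.children) {
            collect(child, type, out);
        }
    }

    // Column mapping: apply one extractor per report column to every row object.
    static List<List<String>> project(List<ModelObject> rows,
                                      List<Function<ModelObject, String>> columns) {
        List<List<String>> table = new ArrayList<>();
        for (ModelObject row : rows) {
            List<String> cells = new ArrayList<>();
            for (Function<ModelObject, String> column : columns) {
                cells.add(column.apply(row));
            }
            table.add(cells);
        }
        return table;
    }

    // Builds a tiny made-up model and returns one row per persistent table.
    static List<List<String>> demo() {
        ModelObject schema = new ModelObject("Schema", "DEMO_SCHEMA")
            .add(new ModelObject("PersistentTable", "EMPLOYEE")
                .add(new ModelObject("Column", "BONUS")))
            .add(new ModelObject("PersistentTable", "DEPARTMENT"))
            .add(new ModelObject("Index", "EMPLOYEE_IX"));
        ModelObject database = new ModelObject("Database", "SAMPLE").add(schema);

        List<Function<ModelObject, String>> columns = new ArrayList<>();
        columns.add(o -> o.name); // plays the role of the name:EString column query
        return project(rows(database, "PersistentTable"), columns);
    }
}
```

Calling TablesDataSetSketch.demo() yields one single-cell row per persistent table in the made-up model, which is the same shape as the data set preview: the row query fixes the number of rows, and each column query adds one cell per row.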

Step 3: Generate a sample report from IDA built-in reports

The built-in reports provided by IDA are categorized by the model type they report on. Open the Report Explorer view under Reporting, and you can see a complete list of IDA built-in BIRT and XSLT reports as shown in Figure 9. The reports with a file extension of .rptdesign are BIRT reports. Those with the extension of .xsl are classic XSLT reports.

Figure 9. IDA built-in reports by model category

Note: There is a new transformation template, TransformationReport.rptdesign, that is custom-built for generating reports in true Excel-type horizontal report format, which provides the ability to sort and group information within the output. In addition, this transformation template defines joins across data model objects, effectively producing reports across different model types like logical and physical data models.

For example, if you have a physical model, you can generate a report for all objects in the model, or only for columns, column mappings, or table spaces, by using the Physical Data Model Report, Column Report, Column Mapping Report, or Table Space Report. All of these reports are listed under the Physical Data Model category.

You need to create a copy of a built-in report to be able to open it. For example, you can open and have a closer look at the Data Source of the Column Report by following these steps:

  1. Right-click Column Report under the Physical Data Model category, and Copy and Paste it to a project.
  2. Open the copy of Column Report with Report Editor in the Report Design perspective.
  3. Right-click Data Source in the Data Explorer, and select Edit. You see that Ecore.ecore, db2.ecore, schema.ecore, tables.ecore, and other SQL ecore models are listed as Ecore meta-models for the report design, as shown in Figure 10.
Figure 10. Ecore meta-models are defined as data source in built-in reports

Follow these steps to configure and generate a column report for the SAMPLE model you created in Part 1:

  1. Open the Data Perspective, if it's not already opened.
  2. Click Run > Report > Report Configurations.
  3. Right-click BIRT Report, and select New in the Report Configurations dialog.
  4. Type in the name, set Column Report as Built-in report, add SAMPLE.dbm as Data Source, and select the output location and format, as shown in Figure 11.
Figure 11. Configure a column report for the SAMPLE model
  5. Click Report to generate a report.

Step 4: Customize IDA reports

IDA customers often have special requirements for the content and format of their model reports. IDA has integrated and extended Eclipse BIRT since RDA 7003 so that customers can customize reports with the BIRT designer to meet those needs. Because a logical or physical model is an abstraction of a real-world system, it is usually complicated and consists of many building blocks and relationships.

IDA has provided many built-in reports that can be used as templates to help you start with customization. The report list in Figure 9 shows you pairs of BIRT reports in which one member of the pair is labeled Blank (such as the Physical Data Model Report and Blank Physical Data Model Report). In general, reports such as the Physical Data Model and Column reports can be applied directly to a physical model to generate reports, while the blank reports are used as templates for customization. You can copy and paste a built-in report with .rptdesign extension from the Report Explorer, and then open the copy in the BIRT designer to display the report design. If you open a copy of the Column Report, you find defined Data Source and Data Sets, as well as a presentation design, as shown in Figure 12. The Blank Column Report only has the defined Data Source and Data Sets.

Figure 12. Column report

In the Blank Column Report, the presentation part is empty so that you can create it yourself, as shown in Figure 13.

Figure 13. Blank column report

As a report customization example, you can update the column report to include the masking method, which was added as a new property in Part 1 of this series. You can start the customization from either the column report or the blank column report. Because most of the existing presentation is worth keeping, starting from the column report saves time.

  1. Copy the column report, right-click the physical data model folder, and paste it as Column Report with Privacy (set file name as ColumnWithPrivacy, and select a folder in the paste report dialogs), as shown in Figure 14.
Figure 14. Created Column Report with Privacy from Column Report
  2. Double-click Column Report with Privacy to open it in the BIRT report designer.

If you saved the masking method property as an eAnnotation entry in Part 1 of this series, you need to add it to the Column data set to display it in your report.

To create a new column mapping for the Column data set, complete the following steps:

  1. Right-click Column data set in the Data Explorer, and select Edit, as shown in Figure 15.
Figure 15. Edit the Column data set
  2. Select Column Mapping, click on the drop-down arrow, and select SQLTables:Column to browse in the Edit Data Set dialog. The browse tree is displayed.
  3. Expand the browse tree, select eAnnotations/details/value, and click the > button in the middle.
  4. Type in Masking Method as the name, and append [1] to the query, as shown in Figure 16.
Figure 16. Add column mapping using the Edit Data Set - Column dialog
  5. Click Preview Results to preview the newly added column, and click OK. The Masking Method column mapping is created for the Column data set, as shown in Figure 17.
Figure 17. Masking method is created in column data set

Follow these steps to add a Masking Method field to the report:

  1. Insert a column to the right of the Documentation column, as shown in Figure 18.
Figure 18. Insert a new column in the report
  2. Drag the newly created Masking Method column mapping from the Data Explorer, and drop it in the table details row of the inserted column, as shown in Figure 19.
Figure 19. Bind the masking method column data set to the report by drag and drop
  3. Change the label of the inserted column to be Masking Method.
  4. Click File > Save.

When you generate a report using Column Report with Privacy for the SAMPLE model, you see a Masking Method field, with HASHING as the value for the BONUS column, as shown in Figure 20.

Figure 20. Column report with masking method column

Add validation rules

At any time while you are building a data model, you can analyze the model to verify that it complies with the defined constraints. Based on the EMF validation framework, IDA provides built-in constraints that not only ensure model integrity but also help improve model quality by providing design suggestions and best practices. In this section, learn how to extend the IDA built-in constraints with a new constraint that checks for the existence of a masking method when a column is marked as privacy data.

EMF validation framework

The EMF validation framework provides support for constraint definitions for any EMF meta-model (batch and live constraints), customizable model traversal algorithms, constraint parsing for languages, configurable constraint bindings to application contexts, and validation listeners. (See Resources for more information about the Eclipse EMF validation framework.)

IDA built-in constraints

IDA model validation uses and extends the EMF validation framework. IDA provides comprehensive built-in constraints to do a model syntax check and to provide design suggestions for logical and physical models, as shown in Figure 21.

Figure 21. IDA built-in constraints to check for model syntax and give design suggestions

You can choose to enable or disable any of the constraints either through Preferences or the Analyze Model dialog. When you right-click a database or schema in a physical data model or a package in a logical data model from the Data Project Explorer and select Analyze Model, the validation results on the enabled constraints appear in the Problems view, as shown in Figure 22.

Figure 22. Analyze model results appear in the Problems view

Add a new constraint

The org.eclipse.emf.validation.constraintProviders extension point is used to add constraints into the model validation framework (see Resources for more information about this extension point). Constraints are grouped into hierarchically structured categories. A constraint category defines the following attributes:

  • id—Identifier for the category. The ID is a hierarchical name, delimited by slashes, relative to the ID of the containing category element (if any).
  • name—The localized name of the category.
  • mandatory—Indicates whether the category is mandatory.

IDA defines the constraint categories shown in Listing 1.

Listing 1. Constraint categories defined in IDA
   <extension
      id="com.ibm.datatools.validation"
      name="Datatools Constraint Provider"
      point="org.eclipse.emf.validation.constraintProviders">
      <category
         name="%VALIDATION.CATEGORY.PHYSICAL"
         id="com.ibm.datatools.validation.physicalmodel">
         <category
            name="%VALIDATION.CATEGORY.SYNTAX"
            id="syntax">
            <category
               name="%VALIDATION.CATEGORY.DATATYPE"
               id="datatypes">
               %VALIDATION.CATEGORY.DATATYPE_DESC       
            </category>
            <category
               name="%VALIDATION.CATEGORY.SQL"
               id="sql_statement">
               %VALIDATION.CATEGORY.SQL_DESC
            </category>
            <category
               name="%VALIDATION.CATEGORY.OBJECTNAME"
               id="object_names">
               %VALIDATION.CATEGORY.OBJECTNAME_DESC
            </category>
            <category
               name="%VALIDATION.CATEGORY.KEY_CONSTRAINT_INDEX"
               id="key_constraints">
               %VALIDATION.CATEGORY.KEY_CONSTRAINT_INDEX_DESC
            </category>
            <category
               name="%VALIDATION.CATEGORY.IDENTITY_COLUMN"
               id="identity_columns">
               %VALIDATION.CATEGORY.IDENTITY_COLUMN_DESC
            </category>
            %VALIDATION.CATEGORY.SYNTAX_DESC
         </category>
         %VALIDATION.CATEGORY.PHYSICAL_DESC
      </category>
   </extension>
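
Because each nested category id is relative to its parent, the full path of a category is the parent path and the child id joined by a slash. The following sketch composes such a path from the ids in Listing 1. The CategoryPathSketch class and its methods are hypothetical helpers for illustration, not part of the EMF validation API.

```java
import java.util.Arrays;
import java.util.List;

final class CategoryPathSketch {

    // Joins the root category id and the ids of its nested categories with slashes.
    static String fullPath(List<String> idsFromRootToLeaf) {
        return String.join("/", idsFromRootToLeaf);
    }

    // Composes the path of the "datatypes" category nested under "syntax" in Listing 1.
    static String demo() {
        return fullPath(Arrays.asList(
            "com.ibm.datatools.validation.physicalmodel", "syntax", "datatypes"));
    }
}
```

A path of this form is what a group of constraints names in its categories attribute to declare its membership.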

Constraint providers target one or more EPackages by namespace URI. A group of constraints declares categories in which they are members. Each constraint has a variety of meta-data associated with it. The following attributes are used to define a constraint:

  • id—A unique identifier for the constraint.
  • name—A localizable name for the constraint (appears in the GUI).
  • lang—Identifies the language in which the constraint is expressed. The language is not case-sensitive.
  • severity—The severity of the problem if the constraint is violated. This correlates to the severity of tasks in the Tasks view of the Eclipse environment.
  • statusCode—The plug-in unique status code, useful for logging.
  • class—For Java language constraints only, identifies a class implementing the constraint.
  • mode—Describes whether a constraint operates in batch mode, live mode, or feature mode.

Now, follow these steps to add a new constraint to check for the existence of a masking method if a column is set as privacy data:

  1. Add a new constraint using the extension point in the plugin.xml file, as shown in Listing 2.
Listing 2. Using the constraintProviders extension point to add a constraint
<!-- add a new constraint -->
<extension point="org.eclipse.emf.validation.constraintProviders">           
   <constraintProvider>
      <package namespaceUri="http:///org/eclipse/datatools/modelbase/sql/schema.ecore"/>
      <package namespaceUri="http:///org/eclipse/datatools/modelbase/sql/tables.ecore"/>
      <package namespaceUri="http:///org/eclipse/datatools/modelbase/derby/derby.ecore"/>
      <constraints
         categories="com.ibm.datatools.validation.physicalmodel/design/normalization">
         <constraint
            name="Column privacy"
            severity="WARNING"
            statusCode="10"
            class="com.ibm.extendrda.sample.validation.PrivacyDataCheck"
            lang="Java"
            mode="Batch"
            id="com.ibm.datatools.extendValidation.PrivacyDataCheck">
            <description>
               Discover columns that are defined as privacy data,
                  but do not have masking method defined
            </description>
            <param name="extraction" value="elementName"/>
            <message>
               Column {0} is defined as privacy data, 
                  but does not have a masking method defined
            </message>
            <target
               class="Column">
            </target>
         </constraint>            
      </constraints>
   </constraintProvider>         
</extension>

The code in Listing 2 adds the column privacy constraint under the design and normalization category as batch mode with a severity of warning, as shown in Figure 23.

Figure 23. The column privacy constraint
  2. Add the implementation class, as shown in Listing 3.
Listing 3. Sample code that implements the column privacy constraint
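package com.ibm.extendrda.sample.validation;

import org.eclipse.core.runtime.IStatus;
import org.eclipse.datatools.modelbase.sql.tables.Column;
import org.eclipse.emf.ecore.EAnnotation;
import org.eclipse.emf.ecore.EObject;
import org.eclipse.emf.validation.AbstractModelConstraint;
import org.eclipse.emf.validation.IValidationContext;
// SamplePropertySection is the property section class from Part 1;
// import it from the package in which you created it.

public class PrivacyDataCheck extends AbstractModelConstraint {

	// Fails for a column that is flagged as privacy data but has no masking method.
	public IStatus validate(IValidationContext ctx) {
		EObject target = ctx.getTarget();
		if (target instanceof Column) {
			Column col = (Column) target;
			if (isPrivateData(col) && (!hasMaskingMethod(col))) {
				ctx.addResult(col);
				// The column name replaces {0} in the constraint message.
				return ctx.createFailureStatus(
					new Object[] { col.getName() });
			}
		}
		return ctx.createSuccessStatus();
	}

	// Reads the privacy flag that Part 1 stored in the column's EAnnotation.
	private boolean isPrivateData(Column column) {
		EAnnotation eannotation = column
			.getEAnnotation(SamplePropertySection.SAMPLE_EANNOTAITN_NAME);
		if (eannotation == null)
			return false;
		String privacyStr = (String) eannotation.getDetails().get(
			SamplePropertySection.SAMPLE_PRIVACY_PROPERTY_NAME);
		if (privacyStr == null)
			return false;
		// parseBoolean interprets the string itself; Boolean.getBoolean would
		// instead look up a system property with that name.
		return Boolean.parseBoolean(privacyStr);
	}

	// Returns true when a non-empty masking method is stored for the column.
	private boolean hasMaskingMethod(Column column) {
		EAnnotation eannotation = column
			.getEAnnotation(SamplePropertySection.SAMPLE_EANNOTAITN_NAME);
		if (eannotation == null)
			return false;
		String maskingStr = (String) eannotation.getDetails().get(
			SamplePropertySection.SAMPLE_MASKING_PROPERTY_NAME);
		return (maskingStr != null) && (maskingStr.length() > 0);
	}
}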
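
The rule in Listing 3 reduces to a small decision: report a column only when its privacy flag is true and its masking method is missing or empty. The sketch below reproduces that decision in plain Java so that it can be tried outside Eclipse. The Map standing in for the EAnnotation details, the key names privacy and maskingMethod, and the PrivacyRuleSketch class are assumptions made for illustration. The message pattern is the one from Listing 2, and formatting it with java.text.MessageFormat is likewise an assumption about how the {0} placeholder is filled.

```java
import java.text.MessageFormat;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

final class PrivacyRuleSketch {

    // Hypothetical detail keys; the real names are the constants defined in Part 1.
    static final String PRIVACY_KEY = "privacy";
    static final String MASKING_KEY = "maskingMethod";

    // Message pattern taken from the <message> element in Listing 2.
    static final String PATTERN =
        "Column {0} is defined as privacy data, but does not have a masking method defined";

    // Returns the warning text for a violating column, or empty when the column passes.
    static Optional<String> check(String columnName, Map<String, String> details) {
        // parseBoolean returns false for null, so a missing flag means "not private".
        boolean isPrivate = Boolean.parseBoolean(details.get(PRIVACY_KEY));
        String masking = details.get(MASKING_KEY);
        boolean hasMasking = masking != null && !masking.isEmpty();
        if (isPrivate && !hasMasking) {
            return Optional.of(MessageFormat.format(PATTERN, columnName));
        }
        return Optional.empty();
    }

    // Convenience builder for the stand-in detail map; null values are left out.
    static Map<String, String> details(String privacy, String masking) {
        Map<String, String> map = new HashMap<>();
        if (privacy != null) {
            map.put(PRIVACY_KEY, privacy);
        }
        if (masking != null) {
            map.put(MASKING_KEY, masking);
        }
        return map;
    }
}
```

For example, check("BONUS", details("true", null)) produces a warning, while check("BONUS", details("true", "HASHING")) and check("BONUS", details("false", null)) both pass. Note that Boolean.parseBoolean interprets the string it is given, whereas Boolean.getBoolean looks up a system property with that name, so only the former is suitable for reading a stored flag.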

Now, if you set a column in the SAMPLE model as privacy data and leave its masking method empty, you get a warning from the added constraint in the Problems view when you run Analyze Model on SAMPLE, as shown in Figure 24.

Figure 24. A warning from the column privacy constraint

Conclusion

As a comprehensive modeling and integration tool, IDA is highly extensible by design. In Part 2 of this two-part series, you learned how to generate customized model reports using BIRT and how to add validation constraints to enforce business rules. Combined with the techniques from Part 1 for programmatically traversing and modifying IDA models and adding custom properties, these let you extend IDA to meet your data modeling and integration requirements.


Acknowledgement

Thank you to Robin Raddatz, who is responsible for the updates made in the 07 October 2010 release of this article.

Resources

Learn

Get products and technologies

Discuss
