Many organizations keep important data in mainframe data stores, such as sequential and VSAM files. This data format is often defined using data structures of a traditional programming language such as COBOL. For your enterprise modernization initiatives to be successful, it is crucial that your overall data design and integration process includes "legacy" data stored in these COBOL data structures along with information captured in more modern systems.
InfoSphere Data Architect is a collaborative data design solution. It provides a COBOL import capability that lets you include COBOL copybooks and source files in your data modeling efforts. This article shows you how to import COBOL data files into InfoSphere Data Architect and how to use key modeling capabilities, such as visualizing the data with data diagrams and transforming the logical data model into physical data models for different deployment environments. These capabilities can help you understand the existing data or create a relational data store to hold its content.
The Eclipse workbench stores files in folders called projects. So before you can create or import any files, you must create a project to contain them.
To create a data design project with InfoSphere Data Architect, follow these steps:
- From the main menu, select File > New > Data Design Project. This opens the New Data Design Project wizard.
- In the Project Name field, enter a name for your project. For example, Copybook.
- Click Finish.
The Data Project Explorer displays the new data project and its initial set of folders. As shown in Figure 1, this includes folders for Mappings, XML Schemas, Data Diagrams, Data Models, and SQL Scripts.
Figure 1. Data Project Explorer view
Your environment is now prepared for importing a COBOL copybook into this data project.
Before actually importing a COBOL copybook or source file, you should verify that your InfoSphere Data Architect settings are correct for your environment. For example, you may need to change the platform information or select the appropriate code page.
To review or modify these settings, follow these steps:
- Select Window > Preferences.
- Expand the Importer item in the navigation tree and select COBOL.
- Modify the preference settings to match your environment.
In Figure 2, the target platform has been specified as Windows 32 and the desired code page and other related options are selected.
Figure 2. COBOL import preference settings
- If you make any changes, click OK to save them.
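To see why the code page setting matters, here is a minimal sketch that reads the same bytes under two code pages. It is only an illustration and does not use InfoSphere Data Architect: `cp037` (an EBCDIC code page) and `latin-1` are standard Python codecs, while the `decode_field` helper and the sample value are hypothetical.

```python
# Hypothetical illustration: the same bytes interpreted under two code pages.
# "cp037" (EBCDIC) and "latin-1" are standard Python codecs; the helper name
# and the sample field value are made up for this example.

def decode_field(raw: bytes, code_page: str) -> str:
    """Decode the raw bytes of a character field using the given code page."""
    return raw.decode(code_page)

# The text "HELLO" as it would be stored on an EBCDIC host.
ebcdic_bytes = "HELLO".encode("cp037")

as_ebcdic = decode_field(ebcdic_bytes, "cp037")    # code page matches the data
as_latin1 = decode_field(ebcdic_bytes, "latin-1")  # wrong code page: unreadable text

print(as_ebcdic)
print(ascii(as_latin1))  # ascii() keeps the output printable on any console
```

If the code page selected for the import does not match the encoding of the data, character values are misread in the same way as `as_latin1` here.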
In this section you learn how to ensure you have the right input files or copybooks, and how to launch the COBOL Import wizard. Follow these steps:
- Prepare the import files. The COBOL import capability supports both copybooks and source files, so make sure they have the correct extensions:
- Make sure your COBOL copybooks have an extension of .cpy.
- Make sure your COBOL source files have an extension of either .cbl or .ccp.
- Select File > Import > Data to launch the Import wizard.
- From the displayed list of import filters, expand the Data folder and select COBOL Model Import Wizard, as shown in Figure 3.
Figure 3. Import selection dialog
- Click Next.
- On the Cobol Source and Target Model screen, enter appropriate values in the fields:
- Source file — the complete path to the source file, which can be either a COBOL copybook or COBOL source file. You can use the Browse... button to search for the file.
- Target project — the name of the project in the current workspace where you want to see the resulting logical data model. This field also has a Browse... button associated with it.
- File name — the name of the resulting logical data model.
For example, the values shown in Figure 4 would create a logical model named Copybook import model in a project named /Copybook.
Figure 4. Specify the input Copybook and the name of resulting logical data model
- Click Next.
- On the next screen, click Finish to complete the import process.
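The extension check in the first step can be scripted before you start the wizard. The following is a minimal sketch, not part of InfoSphere Data Architect: the function name is hypothetical, and treating extensions as case-insensitive is an assumption of this example rather than documented behavior of the importer.

```python
from pathlib import Path

# Extensions named in the steps above.
COPYBOOK_EXTENSIONS = {".cpy"}
SOURCE_EXTENSIONS = {".cbl", ".ccp"}


def classify_cobol_file(path: str) -> str:
    """Return 'copybook', 'source', or 'unsupported' based on the file extension.

    Assumption: extensions are compared case-insensitively. The importer's
    actual handling of upper-case extensions is not covered by this article.
    """
    suffix = Path(path).suffix.lower()
    if suffix in COPYBOOK_EXTENSIONS:
        return "copybook"
    if suffix in SOURCE_EXTENSIONS:
        return "source"
    return "unsupported"


for name in ["coboltest.cpy", "payroll.cbl", "notes.txt"]:
    print(name, "->", classify_cobol_file(name))
```

Only `coboltest.cpy` comes from this article; the other file names are placeholders.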
Listing 1 shows a simple COBOL copybook named coboltest.cpy.
Listing 1. A copybook
000100 01 PERSON-CREATION-REQUEST.
Figure 5 shows the Data Project Explorer view of the logical data model named Copybook import model.ldm that results from the import of the coboltest.cpy copybook. The view is expanded to show several of the entities, attributes, and relationships of the logical data model.
Figure 5. Resulting logical data model
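To get a feel for the structure that a copybook carries, the sketch below turns COBOL level numbers into a nested tree of items. It is an illustration only and makes no claim about how the importer builds entities, attributes, and relationships. The group names PERSON-CREATION-REQUEST, PERSON-NAME-PART, and PERSON-PHONE come from this article; every elementary field, the `Item` class, and the `parse_copybook` function are hypothetical.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Item:
    level: int
    name: str
    picture: Optional[str] = None          # PIC clause, if the item has one
    children: List["Item"] = field(default_factory=list)


# Optional 6-digit sequence number, level number, data name, optional PIC clause.
LINE_RE = re.compile(
    r"^\s*(?:\d{6}\s+)?(\d{2})\s+([A-Z0-9-]+)(?:\s+PIC\s+(\S+?))?\s*\.\s*$"
)


def parse_copybook(text: str) -> List[Item]:
    """Build a tree of items: a higher level number nests under the item before it."""
    roots: List[Item] = []
    stack: List[Item] = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # skip blank lines and anything this simple pattern cannot read
        item = Item(int(match.group(1)), match.group(2), match.group(3))
        # Drop items that are not ancestors of the current one.
        while stack and stack[-1].level >= item.level:
            stack.pop()
        (stack[-1].children if stack else roots).append(item)
        stack.append(item)
    return roots


# Hypothetical copybook content; only the three group names appear in the article.
SAMPLE = """\
000100 01 PERSON-CREATION-REQUEST.
000200    05 REQUEST-ID        PIC 9(8).
000300    05 PERSON-NAME-PART.
000400       10 FIRST-NAME     PIC X(20).
000500       10 LAST-NAME      PIC X(30).
000600    05 PERSON-PHONE.
000700       10 PHONE-NUMBER   PIC X(15).
"""


def show(item: Item, depth: int = 0) -> None:
    print("  " * depth + item.name + (f" ({item.picture})" if item.picture else ""))
    for child in item.children:
        show(child, depth + 1)


for root in parse_copybook(SAMPLE):
    show(root)
```

The pattern handles only simple group and PIC items; clauses such as REDEFINES or OCCURS are deliberately left out of this sketch.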
Now that you have used the import process to create a logical data model in your project, you can use InfoSphere Data Architect to visualize the structure of the data. You do this by creating a data diagram to graphically depict the entities, attributes, and relationships of your model.
To create a diagram for the logical data model you just created, follow these steps:
- In the Data Project Explorer, expand the folder for the Package1 package.
- Right-click the Diagrams folder, and select New Blank Diagram from the context menu to create an empty diagram.
- Select the new diagram in the Data Project Explorer, and enter Copybook imported model in the Diagram name field in the General tab of the Properties view.
- Drag each of the entities from the Package1 package in the Data Project Explorer onto the diagram.
These entities are:
Figure 6 shows the resulting diagram.
Figure 6. Diagram for logical data model Copybook imported model
The data diagram in Figure 6 shows that the logical data model contains a supertype, subtypes, and the generalization relationships between them, all of which InfoSphere Data Architect supports. PERSON-CREATION-REQUEST is the supertype, and PERSON-NAME-PART and PERSON-PHONE are subtypes. InfoSphere Data Architect supports three possible transformations of the generalization relationship:
- Roll down
- Roll up
- Separate table
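The following sketch illustrates what these three options usually mean in data modeling. It is not taken from the product, and the tables that InfoSphere Data Architect actually generates may differ in naming and key handling. The entity names come from this article; the attribute names and all three functions are hypothetical.

```python
from typing import Dict, List

Tables = Dict[str, List[str]]  # table name -> column names


def roll_up(sup: str, sup_attrs: List[str], subtypes: Tables) -> Tables:
    """Roll up: one table for the supertype that absorbs every subtype's attributes."""
    columns = list(sup_attrs)
    for attrs in subtypes.values():
        columns.extend(attrs)
    return {sup: columns}


def roll_down(sup_attrs: List[str], subtypes: Tables) -> Tables:
    """Roll down: one table per subtype, each repeating the supertype's attributes."""
    return {name: list(sup_attrs) + list(attrs) for name, attrs in subtypes.items()}


def separate_table(sup: str, sup_attrs: List[str], subtypes: Tables, key: str) -> Tables:
    """Separate table: one table per entity; subtype tables carry the supertype key."""
    tables: Tables = {sup: list(sup_attrs)}
    for name, attrs in subtypes.items():
        tables[name] = [key] + list(attrs)
    return tables


# Hypothetical attributes for the entities shown in Figure 6.
supertype = "PERSON-CREATION-REQUEST"
supertype_attrs = ["REQUEST-ID"]
subtype_attrs = {
    "PERSON-NAME-PART": ["FIRST-NAME", "LAST-NAME"],
    "PERSON-PHONE": ["PHONE-NUMBER"],
}

print(roll_up(supertype, supertype_attrs, subtype_attrs))
print(roll_down(supertype_attrs, subtype_attrs))
print(separate_table(supertype, supertype_attrs, subtype_attrs, key="REQUEST-ID"))
```

Roll up yields a single wide table, roll down yields one table per subtype, and separate table keeps all three entities as tables linked through the shared key.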
To change the transformation options, follow these steps:
- From either the Copybook imported model data diagram (Figure 6) or the Data Project Explorer (Figure 5), select one of the generalization relationship links. For example, select the link between PERSON-PHONE and PERSON-CREATION-REQUEST.
- On the Properties tab, select General, and click the Transform As drop-down list to review the three supported options, as shown in Figure 7.
Figure 7. Change transformation options for generalization relationships from the data diagram Properties tab
Because the ability to visualize data using diagrams is so critical, this release of InfoSphere Data Architect includes new diagram layout options that give you greater flexibility and control over the objects on the diagrams. This capability is made possible by new built-in integration with ILOG.
Although a detailed description of these capabilities is beyond the scope of this article, you can view the new layout options by selecting the data diagram, going to the Properties tab, and selecting the Layout view, as shown in Figure 8. Explore the various layout options to determine which choice works best for your specific data diagram.
Figure 8. ILOG diagram layout options
InfoSphere Data Architect also provides you the ability to create a relational database that you can use to store the content of the legacy data. You can leverage the process of forward engineering to derive a number of physical data models from the imported logical data model for various target deployment environments. This capability is possible because of how InfoSphere Data Architect maintains logical and physical data models in separate files.
To transform a logical data model to a physical data model, follow these steps:
- From the Data Project Explorer, select a logical data model. For example, select the model that you named Copybook import model.
- Navigate to Data > Transform > Physical data model. This opens the Transform to Physical Data Model wizard.
- Keep Create New Data Model checked, and click Next.
This takes you to the Options screen of the wizard as shown in Figure 9.
Figure 9. Transformation wizard Options screen
- Ensure that the Generate traceability check box is selected. This adds a dependency to each column on the physical data model so that you can use the Analyze Impact feature. This feature visually lists and reports on the dependent objects that could be impacted by a change. For example, in a physical model this analysis helps to identify objects such as foreign keys, primary keys, or indexes that could be affected by a change to a column.
- Click Next.
- On the next screen, click Finish. This completes the transformation process and generates the physical data model.
- In the Data Project Explorer, go to the Data Models folder and expand your physical data model to see all the tables, as shown in Figure 10.
Figure 10. Physical data model created by the transform
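The idea behind impact analysis on traceability dependencies can be pictured as a walk over a dependency graph. The sketch below is a conceptual illustration only, not the implementation of the Analyze Impact feature; the function and all object names in it are hypothetical.

```python
from collections import deque
from typing import Dict, List


def impacted_objects(dependents: Dict[str, List[str]], changed: str) -> List[str]:
    """Return every object reachable from `changed` by following dependency links.

    `dependents` maps an object to the objects that depend on it directly.
    The result is in breadth-first order and excludes the changed object itself.
    """
    seen = {changed}
    order: List[str] = []
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, []):
            if dependent not in seen:
                seen.add(dependent)
                order.append(dependent)
                queue.append(dependent)
    return order


# Hypothetical physical model objects and their direct dependents.
dependents = {
    "PERSON_PHONE.PHONE_NUMBER": ["IDX_PHONE_NUMBER", "PK_PERSON_PHONE"],
    "PK_PERSON_PHONE": ["FK_REQUEST_PHONE"],
}

print(impacted_objects(dependents, "PERSON_PHONE.PHONE_NUMBER"))
```

A change to the column reaches the index and primary key directly, and the foreign key indirectly through the primary key, which is the kind of chain an impact report makes visible.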
Many businesses have a critical need to be able to continue using legacy data. This article showed you how to use InfoSphere Data Architect to import COBOL data files in several easy steps. This enables architects and developers to include this information in their overall data modeling and design process. The article demonstrated some of the key modeling capabilities of InfoSphere Data Architect that can be useful for visualizing and understanding the legacy data, and showed you how to create a relational store for the data.
InfoSphere Data Architect provides many other advanced capabilities in addition to those discussed here. Refer to the Resources section below for links to other related articles and additional information.
The authors would like to thank Davor Gornik, who wrote the original article published in 2006 upon which this article was based and that contained the example used in this article. The authors would also like to thank Kathryn Zeidenstein for editing and reviewing this article.
- Watch the Introduction to IBM InfoSphere Data Architect demo.
- Practices: Information Modeling with Rational Data Architect Version 7 and learn from the experience of others.
- If you've had experience with CA ERwin Data Modeler and are now starting to learn about InfoSphere Data Architect, check out Migrating from CA ERwin to InfoSphere Data Architect to speed up your model migration process and ease your learning of the new environment.
Get products and technologies
- IBM InfoSphere Data Architect and try it yourself for free.
- Participate in the discussion forum.
- Check out the Managing the data lifecycle blog and get involved in the Integrated Data Management Community Space, which has a comprehensive list of resources and downloads.
Quy has been involved in database tools development for almost 10 years. His experience includes stored procedure builder and Java debugger for DB2, as well as the Integrated Query Editor for editing SQL and XQuery statements. Currently, he is working on InfoSphere Data Architect, a data modeling and integration design tool. His hobbies include traveling, tennis, and fishing.
Seeling Cheung is an enablement architect from the IBM Integrated Data Management team at IBM Silicon Valley Lab in San Jose, CA. She spends much of her time with customers and business partners helping them build solutions around the numerous integrated data management products. Previously, Seeling held other advanced technical positions, including development responsibilities for the federation technology and the pureXML capabilities on the Distributed DB2 database team. She joined IBM after finishing her Masters degree in Computer Sciences and working a couple years at Oracle.