Using SoDA to generate large documents in the Microsoft Word environment

Dr. Einar W. Karlsen (einar.karlsen@de.ibm.com), Senior IT Specialist, IBM, Software Group
Einar Karlsen has more than twenty years' experience in the area of software development tools and methods. He has held positions in research, development, project management, and consulting. His Ph.D. from the University of Bremen addresses the integration of CASE tools. He has been with the Rational brand for eight years supporting customers in their use of Rational Apex, Rational Suite, and the Rational Software Development Platform.
Debra L.K. Johnson, Advisory Software Engineer, IBM
Debra Johnson has been involved in the electronics and software industries for more than 20 years. She has worked as a test floor supervisor, programmer, support engineer, instructor, technical writer, maintenance tech, education courseware developer, and marketing engineer. She has been with the Rational software brand for nine years, working in technical support as an Advisory Engineer for SoDA and ProjectConsole, as well as in marketing. Currently she is a marketing engineer for the Rational Portfolio Manager, Rational SoDA and the Team Unifying Platform tools.

Summary:  from The Rational Edge: Demonstrating compliance with industry and legal guidelines often requires software development teams to document their efforts at various points in the project lifecycle. This paper demonstrates how to divide complex documents into more manageable pieces using IBM Rational SoDA in concert with the Microsoft Word concept of master and affiliated documents.

Date:  15 Jan 2007
Level:  Introductory

Compliance is a topic of increasing relevance in the IT industry, whether it's compliance with International Organization for Standardization (ISO) standards, the Rational Unified Process® (RUP®), the V-Model, Department of Defense (DoD) standards, or government mandates such as the Sarbanes-Oxley Act (SOX). One way that IT development teams frequently demonstrate compliance is by delivering documents that show that the development effort has been conducted in accordance with the relevant development standard. Usually, this imposes a need to generate documentation across workflows, phases, and iterations in the adopted development process. Moreover, documentation may also be required as a formal deliverable in a larger project. Finally, globally distributed teams have an increased need for reports and documentation to support the greater communication demands of a distributed environment.

A reporting system is instrumental in demonstrating that an IT organization conducts its business in accordance with the rules of the company or organization. IT organizations working in an environment with compliance requirements must be able to document that they have designed, implemented, and tested the application in accordance with these standards. This essentially calls for the ability to generate cross-domain reports and documents spanning requirements, models, code, test cases, and test results, to name a few.

In this paper we shall show you how to use IBM Rational SoDA® for Microsoft Word to generate large documents in a flexible and scalable way. The approach is based on using Microsoft Word's master and affiliated documents function. Rather than generating one large report, generation is split into a number of smaller generated reports. This way, each report can be regenerated independently. A small application called SoDA Report Generator is provided in the Rational area of IBM developerWorks that allows a user to re-generate all the reports that make up the master document in a single step, in order to limit the number of user interactions required when the entire master document needs to be produced. When generating large documents, a number of related issues come up such as selective generation or parameterization of templates. We shall therefore touch upon these issues as well. We assume that the reader knows how to develop SoDA templates. If you need more information on how to develop templates, refer to the exercises available on IBM developerWorks.

Automated documentation systems: A brief overview

As noted earlier, software development teams frequently need to generate documentation across workflows, phases, and iterations in the adopted development process. Locations in the development lifecycle where documentation is typically required are illustrated in Figure 1.

Figure 1: Documentation icons, mapped onto the standard Rational Unified Process chart, indicate locations in the project lifecycle where documentation may be needed.

Documents are typically formal in nature and consist of many components, such as a front page, table of contents, list of tables, list of figures, individual chapters (again structured hierarchically in subsections), appendices, etc. A document is therefore frequently made up of several chapters and contains a mixture of generated information and hand-written information: The introduction to a chapter may be provided by a human, whereas the detailed information following may be extracted from one or more development tools. A document is generated, delivered, read, reviewed, revised, and eventually stored (usually) in a Configuration Management System. All of this makes for a very time-consuming and human-intensive process. It also means that the document must be produced more than once, which, in large projects with large amounts of data, rules out a manual approach.

One of the key advantages to using a documentation system such as IBM Rational SoDA to generate reports is that it is customizable to meet documentation standards such as ISO 9000, SEI CMM, CMMI, Mil-Std-498, and IEEE. Since the information is extracted directly from the tools (rather than being copied by human beings), it reduces the overall cost for documentation by preventing mistakes, shortening the creation cycle, and shortening the review cycle, thus keeping the documents up-to-date. When changes to the data are made, it is quite easy to re-generate the documents, thus improving quality and consistency of information while keeping the costs of having to reproduce the documentation low. For large projects, there is a natural need for a larger set of documents that must be produced, delivered, reviewed, and changed over time. Consequently, it becomes vital to ensure that the documents are consistent with the development artifacts as the underlying artifacts are changed. In the context of large projects and the huge set of documentation required, flexibility and scalability are the key issues of concern. Forcing the user to regenerate 2000+ pages for a design document just because a couple of details have changed in a single part of the design is not an optimal solution.

Divide and conquer using master and affiliated documents

Let's start off by showing a typical SoDA template that has not yet been optimized using the master and affiliated documents function. We will change this template so that it becomes easier to handle through this optimized process. The purpose of the template is to generate a consolidated report covering the design phase of a development project. This document consists of a title page, a revision history, and a table of contents, followed by several chapters related to the design of the project. The first three chapters -- "Scope," "Referenced Documents," and "Architectural Goals" -- contain text entered by the user. The remaining chapters are intended to contain information extracted from a model that has been created using Rational Software Architect. These report elements are shown in the initial design template, illustrated in Figure 2.

Figure 2: Initial design template

Starting with chapter 4, SoDA commands are developed to extract the logical architecture and the detailed design of each involved package/subsystem contained within the model, as shown in Figure 3.

Figure 3: Logical and detailed design template

There are several reasons why this template may not be usable in a project with large models:

  1. The model contains enough packages/subsystems and classes to push the size of the generated report beyond 1000 pages.
  2. Performance would be an issue.
  3. The size of the template would render the results hard to understand and consequently almost impossible to maintain (or debug).
  4. Having a document defined in terms of a single file is impractical when working in a team.

One of the most common reasons this template may not be usable is that many projects require the ability to quickly re-generate documentation when changes occur to the design model. The typical scenario is that some changes are made to the model (e.g., refining one or more class diagrams) just before a deadline. It would be impractical to force a project to re-generate the entire document just because some details have changed in a single part of the model. So usually there is a demand for selective regeneration of documentation when the artifacts have changed.

The solution is to split the template into smaller parts that can be (re-)generated or edited individually. We will use the Microsoft Word master and affiliated documents function to do this, as illustrated in Figure 4. The Microsoft Word master document will contain static parts such as the title page, revision history and the table of contents. It will furthermore contain references to the relevant affiliated documents, which in turn will either contain free form edited text that has been provided by the project members, or text generated in the form of SoDA reports. The master document and all the subsequent affiliated documents must be stored in the same directory if the SoDA Report Generator is to function correctly.
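
The same-directory requirement is easy to violate once a document has many parts, so it can be worth checking before generation. The following is only a minimal sketch of such a check, not part of SoDA or the SoDA Report Generator. The folder names and the master document name are hypothetical; only the "04" and "05" file names appear in this article.

```python
from pathlib import PureWindowsPath


def misplaced_documents(master, affiliated):
    """Return the affiliated documents that are NOT in the master's directory."""
    master_dir = PureWindowsPath(master).parent
    return [doc for doc in affiliated if PureWindowsPath(doc).parent != master_dir]


# Hypothetical layout, used for illustration only.
master = r"C:\Docs\Design\Design Document.doc"
affiliated = [
    r"C:\Docs\Design\04 Logical Architecture.doc",
    r"C:\Docs\Design\05 Detailed Design.doc",
    r"C:\Docs\Drafts\01 Scope.doc",  # deliberately stored in the wrong folder
]

for doc in misplaced_documents(master, affiliated):
    print("Not in the master document's directory:", doc)
```

The sketch compares paths only and does not touch the file system, so it can be run anywhere.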

Figure 4: Master and affiliated documents

Figure 4 shows the document after it has been split into a number of affiliated documents. The first three affiliated documents -- "Scope," "Referenced Documents," and "Architectural Goals" -- are simple Microsoft Word documents that do not contain any SoDA commands. The rest are generated SoDA reports based on SoDA templates that target a specific portion of the model. The advantage of splitting a complex SoDA template into smaller parts is that the entire document becomes more manageable to author and generate. The disadvantage of this approach is that the OPEN command connected to the Rational Software Architect model needs to be duplicated in every SoDA template. For example, the template "04 Logical Architecture.doc" has an OPEN as the first command (see Figure 5) and so do any subsequent SoDA templates such as "05 Detailed Design.doc" (see Figure 6).

Figure 5: Template for logical architecture

Figure 6: Template for detailed design

Selective generation

With very large models, splitting individual chapters into affiliated documents as shown above may not fully solve the problem, because a chapter such as "Detailed Design" can still run to too many pages. The only way around this situation is to split the chapter further into smaller parts -- i.e., to group the packages/subsystems into clusters, and then create an affiliated report for each cluster.

The technical solution, in SoDA terms, is to specify the names of the packages to generate in a text file, and then set up SoDA to extract this information as a basis for selecting the relevant packages from the model. Remember: you will need to set up this text file for each cluster with a unique name that SoDA can recognize. You must create the file that enumerates the packages to access from the model before you create the SoDA template. You must then insert an OPEN command that will connect to this file prior to any SoDA REPEAT or DISPLAY commands that will utilize the information stored in this file, as illustrated in Figure 7.

Figure 7: Design template for Part1

Figure 7 shows the OPEN command that connects to a file called "PackagesPart1.txt", which in turn lists the names of the packages/subsystems for which data will be gathered, in this case "User Services" and "Business Services." After the OPEN command, a REPEAT command for File_Record is inserted, which repeats over each record in the file.

Figure 8: File Model Packages using information in files

The enclosed REPEAT command: NestedPackage, as shown in Figure 8, iterates over all top-level packages in the "Logical View" of the model with the condition that the package name must be equal to the first field of the file record. This way, only the packages listed in the file "PackagesPart1.txt" will be rendered when the SoDA report is generated. When entering the package names in the file "PackagesPart1.txt," take care to ensure the following:

  • The names listed in this file exactly match the package name in the model (this includes capitalization, spelling, and spacing).
  • A line feed is inserted at the end of the last package name.
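
Both rules can be checked mechanically before a report is generated. The sketch below is an illustration under stated assumptions, not a SoDA feature: the list of model package names is typed in by hand here (SoDA reads it from the model), and "Data Services" is a made-up package name.

```python
def package_file_text(names):
    """Build the file content: one package name per line, line feed after the last."""
    return "".join(name + "\n" for name in names)


def unmatched_names(names, model_packages):
    """Return names that do not match a model package exactly.

    The comparison is deliberately strict: capitalization, spelling,
    and spacing must all be identical.
    """
    known = set(model_packages)
    return [name for name in names if name not in known]


# Hypothetical model content; only the first two names come from the article.
model_packages = ["User Services", "Business Services", "Data Services"]

# The second entry has the wrong capitalization and would select nothing.
wanted = ["User Services", "business services"]
print("No exact match in model:", unmatched_names(wanted, model_packages))

# Content that could be saved as PackagesPart1.txt.
print(repr(package_file_text(["User Services", "Business Services"])))
```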

The next step is to generate the report and then incorporate it as an affiliated document of the master document. Later, more packages can be added to the file "PackagesPart1.txt." If the generated report for these packages/subsystems reaches a size of around 200-400 generated pages, it may pay off to define a new cluster. To do this, make a copy of the template and rename it to "Detailed Design Part2.doc," then change the OPEN command for the file in this new template to point to a file called "PackagesPart2.txt" (which defines the packages in the second cluster), then create the file "PackagesPart2.txt" as shown earlier in this article. After generating this new report, add it to your master document. This solution is flexible and scalable and will allow the generation of large design documents. The technique is not limited to design documents but applies equally well to large requirement specifications and test reports.
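
The clustering step can be sketched as follows. Note the assumption: the threshold described above is a page count, which is only known after generation, so this sketch uses a fixed number of packages per cluster as a rough stand-in. The function name and the limit of two packages are illustrative, and "Data Services" is a made-up package name.

```python
def split_into_clusters(packages, max_per_cluster):
    """Map cluster file names (PackagesPart1.txt, ...) to the packages they list."""
    if max_per_cluster < 1:
        raise ValueError("max_per_cluster must be at least 1")
    clusters = {}
    for start in range(0, len(packages), max_per_cluster):
        part = start // max_per_cluster + 1
        clusters[f"PackagesPart{part}.txt"] = packages[start:start + max_per_cluster]
    return clusters


# Hypothetical package list; order is preserved within and across clusters.
packages = ["User Services", "Business Services", "Data Services"]
for file_name, names in split_into_clusters(packages, 2).items():
    print(file_name, "->", names)
```

Each resulting file would then be paired with its own copy of the template, as described above.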

Beyond the requirements for flexibility and scalability, there are also functional requirements that can be fulfilled with this approach. Frequently, projects want to print out the packages/subsystems in some logical order. SoDA allows a template to use alphabetic or numeric sorting of the artifacts, but this may not always be sufficient, for instance when the packages/subsystems must be ordered by some other criterion. With the approach just described, the packages will be printed out in the order in which they are listed in the text file. Moreover, it becomes quite straightforward to control whether or not a package is included in the documentation. To exclude a package, simply make sure that it is not listed in the file (or in any of the other files that define the clusters). This way you can ensure that standard libraries or other low-level utility packages are not included in the final report. Another use for this method arises during the testing of a template: It is much faster and easier to work with a subset of the data rather than huge models in order to ensure that the template is functional. Once the template has been verified to work, you can remove this OPEN and the associated REPEAT without much difficulty.
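
The ordering and exclusion behavior follows directly from the nesting of the two REPEAT commands. The following sketch imitates that nesting in Python. It is an analogy only, not SoDA template syntax, and it assumes each file record holds just the package name. "System Utilities" is a made-up package name.

```python
def select_packages(file_records, model_packages):
    """Imitate the nested REPEAT: file order decides output order."""
    selected = []
    for record in file_records:          # outer REPEAT over the file records
        for package in model_packages:   # inner REPEAT over the model's packages
            if package == record:        # condition: name equals the record's first field
                selected.append(package)
    return selected


# Hypothetical model content, listed alphabetically.
model_packages = ["Business Services", "System Utilities", "User Services"]

# Order taken from the text file; "System Utilities" is left out on purpose.
file_records = ["User Services", "Business Services"]

print(select_packages(file_records, model_packages))
# prints ['User Services', 'Business Services']
```

Because the outer loop runs over the file, the result follows the file's order rather than the model's, and any package missing from the file never appears.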

Parameterized templates

Every now and then, template designers need to parameterize their SoDA template. For example, the design document could come in several variants: one overview document that just prints out the packages, class diagrams, and class descriptions, and then a more detailed variant that includes class details such as the operations, attributes, and relationships. Maintaining a separate template for each variant would be quite impractical. A better solution would be to use parameterized templates -- meaning that parameters are used to enable or disable specific parts of the template.

Figure 9: Parameterization of detailed design information

To create a parameterized template, use a file in which the parameters are entered. Each parameter in this file is identified by a unique name (the first field) and has a parameter value (the second field), as shown in Figure 9. The connector defined by the OPEN command, FileSys_FileRecord, connects to a single row in the file "Parameter.txt." That row is identified by its unique key, which in this example is the first field, "ShowClassDetails." The second field is the parameter value, which is retrieved later on by a LIMIT command and used to determine whether class details are to be included.

Figure 10: Design template filtered by parameter

Whether the class details will be rendered or not is specified as a condition associated with the LIMIT command, as shown in Figure 10. This condition yields true if the second field of the file record has the value "Yes," in which case SoDA also generates the information about the operations.
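
The lookup that the OPEN and LIMIT commands perform on the parameter file can be pictured with a short sketch. It is an analogy, not SoDA syntax. The article does not state how the two fields are separated, so the comma delimiter used here is an assumption, as are the function names.

```python
def parse_parameters(text, delimiter=","):
    """Read 'name<delimiter>value' rows into a dictionary keyed by parameter name."""
    params = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        name, _, value = line.partition(delimiter)
        params[name.strip()] = value.strip()
    return params


def is_enabled(params, name):
    """Mirror the LIMIT condition: true only if the value is exactly 'Yes'."""
    return params.get(name) == "Yes"


# Assumed content of Parameter.txt (delimiter is an assumption).
parameter_text = "ShowClassDetails,Yes\n"

params = parse_parameters(parameter_text)
if is_enabled(params, "ShowClassDetails"):
    print("Class details will be included")
else:
    print("Overview only")
```

Switching the value to anything other than "Yes" turns the detailed part off without touching the template itself.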

Bulk report generation

When a template is split into a number of smaller and manageable parts, the next question is how to generate them. One solution is for a user to open each individual template and then generate it manually by invoking SoDA>Generate Report. When the number of templates is small, this is an acceptable solution, but as the number of templates increases, this method risks data mismatches if one or more templates are not regenerated. An alternative solution is to use a Visual Basic application that has been developed for this purpose. This SoDA Report Generator uses the name of the master document to generate all of the SoDA templates found in the same directory as the master document. While it does so, SoDA reports its progress, although it actually runs in the background.

Figure 11: Generation of all affiliated documents

You can download this application from the Rational area of developerWorks. The application is still interactive: it will open up the template and generate the report. Any dialogs opened by SoDA will appear -- e.g., those dealing with security where a user is required to log into a domain (e.g., the IBM Rational RequisitePro® or the IBM Rational ClearQuest® domain) -- and must be responded to by a human before the generation process can continue.
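
The overall control flow of such a bulk run can be outlined in a few lines. This is a rough sketch of the idea only and not the SoDA Report Generator's actual implementation, which is a Visual Basic application. In particular, it assumes that every other .doc file in the directory is a template, and the generate_one callback is a hypothetical placeholder for whatever actually drives SoDA.

```python
def templates_in_directory(master_name, file_names):
    """Pick every .doc file except the master itself, in sorted order."""
    master = master_name.lower()
    return sorted(
        name for name in file_names
        if name.lower().endswith(".doc") and name.lower() != master
    )


def generate_all(templates, generate_one):
    """Call the supplied generation step once per template, in order."""
    generated = []
    for template in templates:
        generate_one(template)  # placeholder: no real SoDA call is made here
        generated.append(template)
    return generated


# Hypothetical directory listing; the master document name is made up.
files = [
    "Design Document.doc",
    "05 Detailed Design.doc",
    "04 Logical Architecture.doc",
    "PackagesPart1.txt",
]

templates = templates_in_directory("Design Document.doc", files)
done = generate_all(templates, lambda name: print("Generating", name))
```

Numbering the file names, as in "04 Logical Architecture.doc", keeps the sorted order aligned with the chapter order.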

Conclusion

In this paper we have briefly demonstrated how to divide complex SoDA based documents into more manageable pieces using the Microsoft Word concept of master and affiliated documents. This approach has several advantages:

  • The document becomes easier to develop, debug, and maintain
  • The solution is scalable when teams are faced with large information domains (for example, design models)
  • The approach makes it possible to mix handwritten text with generated text
  • Team development is supported since each affiliated document can be edited (or generated) independently

References

Einar Karlsen: "SoDA Report Generator, Rational SoDA script and template index," IBM developerWorks, http://www.ibm.com/developerworks/rational/library/05/712_soda/.

Debra L.K. Johnson: "Understanding IBM Rational SoDA," IBM developerWorks, http://www.ibm.com/developerworks/rational/library/05/726_us/.


