A business glossary, sometimes referred to as a data dictionary, is the artifact that defines and contains the agreed definitions of the terms and data associated with an initiative. The business glossary defines the language of the business and, by extension, the language of the project. Care needs to be exercised that the terms defined in the business glossary are fully qualified and that specific, descriptive definitions are provided. To the extent possible, a definition that applies enterprise-wide should be crafted. Different departments may use a term differently; all those definitions should be captured and associated with their appropriate contexts (department). Business glossaries are one step along the path of establishing precise semantic definitions across an organization.
Simply put, the business glossary is the formal contract between the producers and consumers of information across the enterprise. It is intended to be the artifact or reference that allows anyone to determine the meaning, type and context of any term and, in particular, any business data element used in an initiative. Too often there is a significant lack of formal definition around data entities even within a single system. Different interpretations of the same term increase the risk of failed project delivery. There are often embedded business rules that make data inaccurate outside the context in which it was originally used. These rules might be embedded in the programs that use the data, or additional reference data might be needed to give the entity context.
Now that you have read the first two parts of this series and understand the importance of creating a business glossary, this article uses the InfoSphere Business Glossary as an example to show you how to use a business glossary to best suit your needs and purposes. The InfoSphere Business Glossary supports a glossary that is used in an SOA context but can also be used in data integration engagements and other projects that require clear definitions of terms.
InfoSphere Business Glossary creates and manages a controlled vocabulary that enables a common language between business and IT professionals. The following three products are complementary and serve different user communities with specific functionality for each group. Below is a list of the products and the services they offer; the services are all discussed in later sections of this article:
Business Glossary targets "power users" who need to access, author, control, and administer a glossary:
- Manage business terms and categories (see Section below):
Business Glossary provides a dedicated, web-based user interface for creating, managing, and sharing a controlled vocabulary. Terms represent the major information concepts in your enterprise. Categories are used to organize these terms into hierarchies.
- Manage stewardship (see Section below):
Stewards are people or organizations with responsibility for a given information asset. Using the Business Glossary functionality of IBM Information Server, administrators can import steward profiles from external sources, create and edit profiles in the Web interface, and create relationships of responsibility between stewards and business terms or any of the artifacts managed in Business Glossary.
- Customize and extend (see Section below):
Needs around business metadata tend to differ from one enterprise to the next. For this reason, there is no "one size fits all" meta-model. In addition to being able to customize the entry page to the application, administrators can extend the application with custom attributes on both business categories and business terms.
- Collaborate (see Section below):
It is not enough simply to document business metadata. This information must be alive in the enterprise, with open access to all. Business Glossary provides a collaborative environment in which users can organically grow this important information asset. Notes and annotations as well as subscriptions to topics of interest allow different users of Business Glossary to cooperate and to jointly develop and improve the glossary information.
Business Glossary Browser is primarily aimed at business users who need to view glossary information such as definitions of terms and associated stewards:
- Simply browse (see Section and scenario below):
Business Glossary Browser is an intuitive read-only browser interface requiring no training to utilize. Business users can search and explore the vocabulary and its classification, identify stewards responsible for assets and provide direct feedback.
Business Glossary Anywhere allows anyone in the organization to view glossary information in the context of their traditional work environment and tools:
- Desktop search (see Section below):
Business Glossary Anywhere allows users to search any term from any application without loss of context. One click instantly pops up a small window with information about associated metadata in the business glossary including the steward of the term.
In addition to its glossary-authoring capabilities, InfoSphere Business Glossary allows data (or metadata) to be imported from other applications like Rational Data Architect (RDA). This obviously saves the manual time and labor of having to input this metadata by hand, and it also allows for consistency of business terms across tools.
InfoSphere Business Glossary also allows for simple and direct natural language query of the metadata, coupled with collaborative capabilities, so that when a term is updated or refined, people who "subscribe" to that term can be notified of the changes. The true value of a business glossary can only be realized when all relevant users (that is, architects, developers, and the like) follow that terminology.
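The subscription idea can be pictured with a small, product-neutral sketch. This is not the InfoSphere Business Glossary API; the class `TermSubscriptions`, its methods, and the sample user names are assumptions made purely for illustration.

```python
from collections import defaultdict


class TermSubscriptions:
    """Hypothetical helper: tracks subscribers per term and queues change notices."""

    def __init__(self) -> None:
        self._subscribers: dict[str, set[str]] = defaultdict(set)
        self.outbox: list[tuple[str, str]] = []  # (user, message) pairs

    def subscribe(self, user: str, term: str) -> None:
        self._subscribers[term].add(user)

    def term_updated(self, term: str, change: str) -> None:
        """Queue one notice for each subscriber of the updated term."""
        for user in sorted(self._subscribers.get(term, ())):
            self.outbox.append((user, f"{term}: {change}"))


subs = TermSubscriptions()
subs.subscribe("architect", "Contact Preference")
subs.subscribe("developer", "Contact Preference")
subs.term_updated("Contact Preference", "definition refined")
```

Only users who subscribed to the changed term receive a notice; updating a term that nobody follows queues nothing.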
Finally, InfoSphere Business Glossary can be populated with glossary structures from the IBM Industry Models, across a range of industries including banking, insurance, finance, telecommunications, healthcare and retail. Populating InfoSphere Business Glossary with content from Industry Models yields glossary structures for business terms, reporting requirements (including regulatory reporting), and business functions.
The collaborative glossary management support of InfoSphere Business Glossary, coupled with the modeling capabilities of Rational Data Architect and the rich business content of IBM Industry Models, provides a complete solution to the glossary challenge, driving SOA initiatives through consistent and reusable business definitions.
Figure 1 shows the initial screen of the InfoSphere Business Glossary. Several tabs are available that group the functions that are available for a particular role as described above.
This article focuses on the Glossary page, where you manage the overview, business terms and custom attributes for those terms.
On the left hand side is the navigation pane where you can browse, search, manage and maintain the history associated with the business term in question.
Figure 1. Business Glossary home page
The Business Glossary is installed on a server and is accessed through a browser interface. This server is supported by a metadata repository which captures not just the metadata associated with a glossary, but also physical models, services and other metadata sources across the enterprise. There are three roles available in the business glossary; you will either be in a user, author or administrator role.
If you are in the Business Glossary user role, you can examine the metadata assets in the metadata repository, including the terms and the categories that contain terms. You can communicate your concerns or information about particular objects to the glossary administrator. As a user, you can perform the following types of tasks:
- Browse the structure of categories and terms
- Search the metadata repository for categories, terms, and other objects
- Explore the attributes and relationships of all objects in the metadata repository
- Send feedback to the administrator
If you have a Business Glossary author role, you can create and edit terms and categories and use terms to classify objects. The author role is assigned to users who manage categories and terms and who decide how objects are classified and who the stewards are for specific objects. As an author, you can perform all tasks that are associated with the Business Glossary user role. In addition, as an author, you can perform the following types of tasks:
- Create and edit a hierarchy of categories that contain terms that are used by your enterprise
- Classify objects in the metadata repository by using terms
- Set stewardship for objects in the metadata repository
- Upload terms and categories to the metadata repository
- Specify values for custom attributes
If you have a Business Glossary administrator role, you can set up and administer the glossary so that other users can find and analyze the information they need. You can perform all the tasks that are associated with the author and user roles. As an administrator, you can create, edit, and delete terms and categories, as well as associate terms and stewards with objects. You can browse the metadata repository, create annotations, and perform any other glossary task. In addition, being assigned as a Business Glossary administrator allows you to perform the following tasks:
- Customize the Overview page of the Business Glossary to provide users with a starting point that is specific to an enterprise and allows them to easily navigate the hierarchy of categories
- Set application options
- Designate users and groups as stewards, and delete the steward relationship from a user or group
- Create, edit, and delete custom attributes
- Edit and delete annotations that were created by others
- Delete terms and categories that were created by others
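The three roles build on one another: an author can do everything a user can, and an administrator can do everything an author can. A minimal sketch of that layering follows. The `Role` enum, the task identifiers, and the helper functions are hypothetical names chosen for this illustration and do not reflect how the product stores permissions.

```python
from enum import Enum


class Role(Enum):
    USER = 1
    AUTHOR = 2
    ADMINISTRATOR = 3


# Tasks that each role adds on top of the role below it.
# The task identifiers are illustrative shorthand for the lists above.
ADDED_TASKS = {
    Role.USER: {"browse", "search", "explore", "send_feedback"},
    Role.AUTHOR: {
        "edit_categories_and_terms", "classify_objects", "set_stewardship",
        "upload_terms", "set_custom_attribute_values",
    },
    Role.ADMINISTRATOR: {
        "customize_overview", "set_application_options", "designate_stewards",
        "manage_custom_attributes", "edit_others_annotations",
        "delete_others_terms",
    },
}


def allowed_tasks(role: Role) -> set[str]:
    """Return every task a role may perform, including inherited ones."""
    tasks: set[str] = set()
    for candidate in Role:
        if candidate.value <= role.value:
            tasks |= ADDED_TASKS[candidate]
    return tasks


def can(role: Role, task: str) -> bool:
    return task in allowed_tasks(role)
```

With this model, `can(Role.AUTHOR, "browse")` is true because the author inherits the user tasks, while `can(Role.AUTHOR, "customize_overview")` is false because that task belongs to the administrator only.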
Administrators and authors create a logical structure of categories, terms, and classified objects. The choice of procedures that you use when you build the glossary depends on whether you are primarily creating new categories and terms or using categories and terms that already exist in the metadata repository of the IBM Information Server. In either case, you should map out the desired structure before you build it in the glossary. When planning the structure, consider the following questions:
- What categories do you need?
- Do you have existing categories and terms that can be imported or uploaded into the repository?
- Which categories are the top-level categories, and which are sub-categories?
- Which categories do you want to present to the user on the Overview page as the starting point for browsing the metadata repository? These categories do not need to be the same as the top-level categories.
- What terms do you need?
- Which categories contain which terms?
- Which categories reference terms that they do not contain?
- Which terms are related to other terms?
- Which terms are synonyms of other terms?
If you answer these questions in detail before you create the glossary structure, you can build a structure that is simple for users to understand and that supports your enterprise goals. You can decide in advance which objects are classified by which terms, or you can wait until after you build the glossary structure to make that decision. If you are building a glossary structure by creating new categories and terms, instead of uploading or importing existing categories and terms into the repository, you must first create the categories, then create the terms, and then edit the categories and terms to set the relationships between them.
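The build order described above (categories first, then terms, then relationships) can be sketched with a simple in-memory model. The `Category` class and the sample category and term names are assumptions for illustration only; they are not part of the product.

```python
from dataclasses import dataclass, field


@dataclass
class Category:
    """Hypothetical category node that holds contained and referenced terms."""
    name: str
    parent: "Category | None" = None
    contained_terms: list[str] = field(default_factory=list)
    referenced_terms: list[str] = field(default_factory=list)

    def path(self) -> str:
        """Full path from the top-level category down to this one."""
        parts = []
        node = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return " > ".join(reversed(parts))


# Step 1: create the categories (top-level first, then sub-categories).
party = Category("Party")
individual = Category("Individual", parent=party)

# Step 2: create the terms inside their parent category.
party.contained_terms.append("Postal Address")
individual.contained_terms.append("Date of Birth")

# Step 3: set relationships, here a reference to a term contained elsewhere.
individual.referenced_terms.append("Postal Address")

print(individual.path())
```

Keeping contained and referenced terms in separate lists mirrors the planning questions: a term has one containing category, but any number of categories can reference it.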
Administrators and authors can use terms to classify objects in the metadata repository. A term is a word or phrase that can be used to classify and group objects in the metadata repository. For example, you might use the term Africa Sales to classify some of the tables and columns in the metadata repository, and the term Europe Sales to classify other tables and columns.
If your metadata repository includes some different terms that mean the same thing, you can designate such terms as synonyms. If two terms are not synonyms, but are related in some other way that is important, you can designate them as related terms. You can specify which term of a group of terms is the preferred term, and which terms to replace with other terms. You can also specify standard abbreviations of terms.
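As a rough illustration of synonyms, preferred terms, and abbreviations, the lookup below resolves any known variant to its preferred term. The data structures, the function name `preferred_term`, and the sample synonym group are hypothetical and are not taken from the product.

```python
# Each preferred term maps to its group of synonyms (sample data is illustrative).
synonym_groups = {
    "Customer": {"Customer", "Client", "Account Holder"},
}

# Standard abbreviations, mapped to the term they abbreviate.
abbreviations = {"Cust": "Customer"}


def preferred_term(name: str) -> str:
    """Resolve a term, synonym, or abbreviation to the preferred term."""
    name = abbreviations.get(name, name)
    for preferred, group in synonym_groups.items():
        if name in group:
            return preferred
    return name  # not part of any synonym group: the term stands for itself
```

A term that has no synonyms, such as Africa Sales, is simply returned unchanged.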
When you create or edit a term, you can specify term properties, relationships (including synonyms, related terms and classified objects) and values for custom attributes and properties that apply to the term. Administrators and authors can also upload files of categories and terms to the metadata repository, and then specify additional properties and relationships.
The Business Glossary supports the addition of a wide range of properties to business terms, for example:
- Name -- Term names must be unique.
- Parent category -- The category that contains the term. A term has one and only one parent category.
- Short description (optional) -- Helps uniquely identify a term in a list of other terms with similar names. The text should be no longer than one or two lines. Short descriptions are used in many searches and are displayed in lists of objects.
- Usage (optional) -- Information about how to use the term, and any business rules that govern its use.
- Example (optional) -- An example of how the term is used, or a typical sample value.
- Status -- Approval status of the term within the organization.
- Preferred synonym -- Indicates that the term is the preferred term in a group of synonyms.
- Abbreviations (optional) -- Standard abbreviations of the term.
Similarly, a range of relationship types may be specified for terms, including:
- Steward -- The person or group that is responsible for the term. A term can have only one steward.
- Related terms -- Terms that are related in some way to the term in question.
- Synonyms -- Terms that have the same meaning.
- Classified objects -- Objects that the term classifies.
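The properties and relationships listed above can be summarized as a single record. The sketch below is an assumption-laden illustration, not the product's data model: the `Term` and `Glossary` classes, the default status of "Candidate", and the sample category and column names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Term:
    # Properties
    name: str
    parent_category: str                  # exactly one parent category
    status: str = "Candidate"             # assumed default for this sketch
    short_description: Optional[str] = None
    usage: Optional[str] = None
    example: Optional[str] = None
    is_preferred_synonym: bool = False
    abbreviations: list[str] = field(default_factory=list)
    # Relationships
    steward: Optional[str] = None         # at most one steward
    related_terms: set[str] = field(default_factory=set)
    synonyms: set[str] = field(default_factory=set)
    classified_objects: set[str] = field(default_factory=set)


class Glossary:
    """Hypothetical registry that enforces unique term names."""

    def __init__(self) -> None:
        self._terms: dict[str, Term] = {}

    def add(self, term: Term) -> None:
        if term.name in self._terms:
            raise ValueError(f"term name must be unique: {term.name!r}")
        self._terms[term.name] = term

    def get(self, name: str) -> Term:
        return self._terms[name]


glossary = Glossary()
glossary.add(Term("Africa Sales", parent_category="Sales"))
glossary.get("Africa Sales").classified_objects.add("SALES_AFRICA.REVENUE")
```

Modeling the steward as a single optional value and the parent category as a single required value reflects the "only one" constraints stated in the lists.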
Stewards are users or groups that have responsibility for one or more glossary elements in the repository. Business Glossary administrators can designate that a user or group in the metadata repository is a steward and is responsible for one or more definitions -- usually the definitions for which that user or group is the primary contact. When browsing an object that has a steward, a link to the steward is displayed. The link leads to contact information, which includes the steward's e-mail address and phone number.
You can assign responsibility for multiple objects when you designate a new steward or when you edit a steward on the Manage Stewards page. You can also assign an object to a steward from the Tasks list on the browse page of the object, or on the browse page of a user or group who is a steward. In addition, you can assign responsibility for a particular category or term to a steward when you create or edit the category or term.
In addition to the standard term properties and relationships, glossary administrators can define additional custom attributes for categories and terms. Custom attributes are often used to apply governance standards, enable architecture frameworks, or provide other metadata that is standard for an organization. Each custom attribute has a name, a description, and a valid value or set of values that can be changed at any time.
For example, an administrator creates a custom attribute named Data Sensitivity with the following description: "A number from 1 to 5, which indicates the sensitivity of the data. Sensitivity is a subjective measure of the impact of the data being released to unauthorized consumers." The administrator specifies that the Data Sensitivity attribute applies only to terms, chooses the enumerated valid value type, and enters the numbers 1 through 5 as valid values. This sets up a custom property for Data Sensitivity for which a glossary author may select a value from 1 through 5.
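The Data Sensitivity example can be expressed as a small validation sketch. The `CustomAttribute` class and its `validate` method are hypothetical and stand in for whatever checks the product performs internally; only the attribute name, description, scope, and valid values come from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomAttribute:
    """Hypothetical model of a custom attribute with enumerated valid values."""
    name: str
    description: str
    applies_to: frozenset[str]   # "term", "category", or both
    valid_values: tuple

    def validate(self, target_kind: str, value) -> None:
        """Raise ValueError if the attribute or value is not allowed."""
        if target_kind not in self.applies_to:
            raise ValueError(f"{self.name} does not apply to a {target_kind}")
        if value not in self.valid_values:
            raise ValueError(f"{value!r} is not a valid value for {self.name}")


data_sensitivity = CustomAttribute(
    name="Data Sensitivity",
    description="A number from 1 to 5, which indicates the sensitivity of the data.",
    applies_to=frozenset({"term"}),
    valid_values=(1, 2, 3, 4, 5),
)

data_sensitivity.validate("term", 3)  # accepted: 3 is one of the enumerated values
```

In this sketch, a value of 6 or an attempt to set the attribute on a category would raise a `ValueError`.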
The Business Glossary supports collaborative authoring of glossary structures, including sending feedback on glossary terms to the glossary administrator. Glossary users may add notes to terms providing further contextual information and subject matter knowledge to the glossary. These notes are then made available to those browsing the glossary.
The Business Glossary Browser provides a web-based interface to search and browse for information that has been entered in Business Glossary. Users can search the glossary by terms, categories or both, and can also search for stewards, in a flexible and easy manner. Business Glossary Browser then returns the list of entries that match the search string. Terms and categories are displayed in a list that includes the name, description, category, associated steward and type of the result (term / category). Search results for stewards are displayed in a list with their name and contact information.
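To make the shape of a search result concrete, here is a minimal sketch of a search over terms and categories. The article does not state how the product matches the search string, so the case-insensitive substring match used here is an assumption, as are the `Entry` class, the `search` function, and the sample entries and steward names.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """One row of a search result, with the columns described above."""
    name: str
    description: str
    category: str
    steward: str
    kind: str  # "term" or "category"


def search(entries, text, kinds=("term", "category")):
    """Return entries of the requested kinds whose name or description contains text."""
    needle = text.lower()
    return [
        e for e in entries
        if e.kind in kinds
        and (needle in e.name.lower() or needle in e.description.lower())
    ]


entries = [
    Entry("Contact Preference", "How a customer wishes to be contacted",
          "Individual", "Steward A", "category"),
    Entry("Preferred Language", "Language used when contacting a customer",
          "Contact Preference", "Steward B", "term"),
    Entry("Account", "An arrangement held by a customer",
          "Arrangement", "Steward A", "term"),
]

both = search(entries, "contact")                       # terms and categories
terms_only = search(entries, "contact", kinds=("term",))
```

Restricting `kinds` corresponds to searching by terms, by categories, or by both.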
Another option to view the glossary information is by browsing through the categories, which are displayed in a hierarchical form as shown in Figure 3 below. The tree structure in the left part of the screen allows for easy navigation, and the right-hand pane shows the detail of a selected category.
Once a user has found a term or category of interest, Business Glossary Browser provides a detail page where the user can see additional information such as when the term or category was created and updated, and the name of the steward.
If the user has questions or concerns about a glossary definition, the tool allows the user to contact the associated steward and provide feedback.
In addition to the Business Glossary interfaces discussed above, users also have access to glossary definitions through Business Glossary Anywhere. This is a simple read-only interface that provides an extremely flexible and business-user friendly browsing experience for glossary content. It can be invoked by users from any location that defines text (requirements documents, emails, model content, and so on). Business Glossary Anywhere is invoked through a mouse or keyboard shortcut on selected text and returns matching glossary terms. Business Glossary Anywhere can use text from documents, slides, models, web pages, reports -- anything that exposes a selectable text string -- and retrieve the appropriate terms from the enterprise glossary. This simple but powerful capability greatly increases the accessibility of glossary content to stakeholders across the organization, encouraging the use of standard business definitions in a range of contexts where more ambiguous terminology would traditionally be applied.
The screenshot below shows the Business Glossary Anywhere pop-up window that appears when you select "contact preference" and search for its definition.
Figure 2. Business Glossary Anywhere pop-up
In addition to the glossary authoring and management capabilities of InfoSphere Business Glossary and the modeling capabilities of RDA, the IBM Industry Models (that is, IFW and IAA) bring large numbers of pre-defined glossary categories and terms to accelerate glossary development within an organization. The glossary content of the industry models comes from three distinct model dimensions: business terms, business functions and reporting requirements.
The business terms content of the Industry Models provides a structured hierarchy of business definitions describing the data entities that are relevant for that particular industry. For example, it includes a set of formalized definitions describing accounting structures, relationships between accounts, and properties of accounts for a financial institution.
Figure 3. Business Glossary Browser with Industry Model content: Business terms
The business functions content of the Industry Models provides a structured decomposition of the business itself, defining non-overlapping areas of the business such as account administration, channel reconciliation or relationship monitoring.
Figure 4. Business Glossary Browser with Industry Model Content: Business functions
The reporting requirements of the Industry Models provide detailed business definitions of key reporting requirements, such as those implied by Basel II, Sarbanes-Oxley or other regulatory requirements. These reporting requirements identify the measures and dimensions of interest for specific reporting structures such as Capital Allocation Analysis, Liquidity Analysis, Suspicious Activity Analysis and so on.
Figure 5. Business Glossary Browser with Industry Model Content: Reporting requirements
Using the IBM Industry Models to support glossary development brings a large number (on the order of 10,000) of business definitions to an organization's glossary development, but more importantly, these definitions come from a single formalized source, are non-overlapping, and are consistent. This very quickly provides a base set of categories and terms that can then be further annotated and customized within the organization. These glossary definitions are also consistent with and pre-mapped to the underlying logical data models (for example, data warehouse models) and UML models (for example, services models), providing an interconnected suite of glossary definitions and structured analysis and design models, as well as a business entry point into solution specification. This is particularly important in the context of information within SOA, where consistency and reusability of data definitions, coupled with an understanding of how these data definitions relate to the underlying data platform, can be the difference between the success or failure of a project.
This section shows an example of the construction and use of a business glossary within a financial institution. The IBM Industry Models are defined in an extension of Rational Data Architect and you can see how the glossary content from RDA can be imported into InfoSphere Business Glossary. Alternatively, the process can start by entering glossary content directly into InfoSphere Business Glossary. After you see how to create the glossary, learn to customize that content within InfoSphere Business Glossary, browse the glossary and add notes to a business term.
Within RDA, define a glossary model (.ndm). For example, glossary models may be obtained through the IBM Industry Models. Select the RDA project containing the glossary, then select Export > Export a Glossary Model to the Metadata Server. This invokes the RDA metabroker, which loads the content of the RDA model into InfoSphere Business Glossary as shown below:
Figure 6. Export from RDA to Metadata Server (and Business Glossary)
After the InfoSphere Business Glossary is populated with business terms, this glossary is then customized and extended over time, as a glossary is a living document. The procedure to customize the glossary is similar to that of defining a new glossary, in the event that no content was available for import during the previous step. In this example, assume that the glossary is to be customized to define the contact preferences for an individual customer. The contact preferences define how a customer wishes to be contacted -- at which address, by whom and at what times.
To capture this requirement within the glossary, define a new category "Contact Preference" within the glossary hierarchy. The location of this category within the hierarchy is a matter of choice; however, in this example, this category is inserted under the existing category that describes "Individual".
Figure 7. Customizing the glossary
Within this category, define a set of terms that detail the data requirement: that individuals may specify a range of properties about how they may be contacted, including language, accessibility preferences, timing preferences, contacting person, address information and so on. It is important to get sufficient detail and clarity into the business definitions of terms. Each of these data requirements should be entered as a distinct business term and related to the term "Contact Preference".
Figure 8. Customizing the glossary: Relating terms
This results in a glossary structure similar to that shown below, with a category containing terms relating to contact preferences and terms defined for each aspect of that requirement. Related terms have also been established to refer to terms that are relevant to this business concept, but are not contained within this category (for example, address information for the individual).
Figure 9. Resulting glossary structure
Each term has a set of properties that can be set. In this example, the value of Status for each term is changed from 'Candidate' to 'Accepted'. Similarly, custom attributes, if they have been established, can be set for each term.
Figure 10. Modifying the status
Next a steward should be assigned for each term. The steward is defined as being the owner of the definition of this particular term and accountable for its maintenance.
Figure 11. Assigning a steward
Finally, these glossary terms are linked to physical artifacts within the Metadata Server, defining a relationship between the business concepts expressed in the glossary and the (potentially multiple) realizations of those terms within technology-specific artifacts. Specifically, terms can be used to describe technical artifacts such as data models, ETL flows and business intelligence structures.
There are a number of ways to browse glossary content within InfoSphere Business Glossary and the relationship of this content to other artifacts that are stored within the metadata server. This section provides a sample set of ways to access the content added through this example.
Users can browse the glossary structure to explore categories, terms, and objects in the metadata repository of IBM Information Server. Users start browsing the glossary from the Overview page (as Figure 12 shows), which displays the top-level categories that the glossary administrator has designated as most important for navigation in the metadata repository. Users can also search for objects and select an object from the search results.
When you select an object, the browse page of the object is displayed on the Browse Glossary tab, which lists the name, class, stewardship and other important properties of the object. Object attributes and relationships can be inspected, and feedback submitted to the glossary administrator.
Figure 12. Browsing the glossary
Users can also perform simple or advanced searches in the Business Glossary, as illustrated in the screenshot below.
Figure 13. Business Glossary search result
Glossary users may collaborate on the definition of a business term. Users have the ability to annotate various aspects of any defined terms in the glossary by defining notes. Users may add new notes or comment on existing ones, providing additional contextual details to glossary terms.
Figure 14. Annotation in Business Glossary
This article describes some of the uses of, and reasons for using, InfoSphere Business Glossary, not only in a traditional information architecture context but also in SOA. You can use InfoSphere Business Glossary to create a common vocabulary between technical and business users in a common environment that allows for collaborative creation and maintenance of valuable glossary content. This ensures a more formal and system-supported agreement on key terminology early in the project.
- Read the article The value of applying the business glossary pattern in SOA in this series for a product-neutral description of the business glossary pattern.
Get products and technologies
- Create, manage and share an enterprise vocabulary and classification system with IBM Information Server and in particular InfoSphere Business Glossary.
- Use Rational Data Architect to simplify data modeling and integration design.
- Utilize IBM Industry Models to accelerate projects and
and get involved in the developerWorks community.
Brian Byrne has over 10 years of experience in the design and development of distributed systems, spending 7 years driving the architecture of Industry Models across a range of industries. Brian is currently an architect within IBM's Information Management organization.
Guenter Sauter is an architect in the Information Platform & Solutions segment within IBM's software group. He is driving architectural patterns and usage scenarios across IBM's master data management and information platform technologies. Until recently, he was the head of an architect team developing the architecture approach, patterns and best practices for Information as a Service. He is the technical co-lead for IBM's SOA Scenario on Information as a Service.
Peter joined IBM three years ago after almost 25 years at institutions like the US Dept. of Defense, GE Corporate and Morgan Stanley where he held technical leadership positions and gained valuable experience in Enterprise Architecture and Enterprise Data Integration. He initially joined IBM as a Sr. IT Architect as part of the architect team for Information as a Service. Currently he is a Solutions Marketing Manager for the IPS Global Services organization, specializing in MDM solutions.