Cancer registry worked example

This worked example shows how to scope data elements in the IBM Unified Data Model for Healthcare (UDMH) to support a section of a typical US State Cancer Registry, and how to extend the data scope to enable analysis related to breast cancer diagnosis, treatment and screening.

A cancer registry is a central repository of cancer data, which is usually collected by cancer registrars. Each healthcare facility reports every new patient who is admitted to the facility for cancer-related treatment. The data is a summary of the patient's demographic information, diagnosis, tumor and treatment.

Departments of health typically mandate the collection of cancer cases so that the collected data can be used to observe cancer trends and to provide a research base for studies into the possible causes and treatment of cancer. The common goal of cancer registries is to reduce cancer-related deaths and illnesses by providing data on cancer incidence.

The North American Association of Central Cancer Registries (NAACCR) establishes and maintains a consensus on standards for cancer registration in North America, and it aggregates the annual data from population-based registries throughout the US. Supportive content for NAACCR v13 is provided in the vocabulary in Information Governance Catalog (IGC), and it contains mappings to the logical models. When scoping the model, it is important to refer to the NAACCR supportive content to ensure compliance with standards and to reuse the existing mappings to the Business Data Model (BDM), Atomic Warehouse Model (AWM) and the Dimensional Warehouse Model (DWM).

This worked example shows how a healthcare organization can use the IBM Unified Data Model for Healthcare to provide a subset of the data required by a cancer registry and, by extending with additional data elements, carry out further analysis of breast cancer patient data (diagnosis, treatment and screening).

Scoping the IBM Unified Data Model for Healthcare (UDMH) is the process of selecting content from an existing version of the vocabulary, BDM, AWM and DWM. The scoped content can include:

Attributes
Entities
Diagrams
Packages

When you scope UDMH, you determine which terms in the vocabulary and which of the models (BDM, AWM and DWM) are involved in the project. The default process involves:

Analyzing the business requirements to determine what existing and new content is needed
Searching for supporting content in the vocabulary
Transforming any changes into BDM and AWM
Creating analysis elements in DWM

Note: for this example, a section of a typical US State Cancer Registry is used and the required data can be found in UDMH. Because the healthcare organization does not need to modify or add content to the vocabulary, BDM or AWM, this example focuses on scoping and creating data elements in DWM, which are mapped to BDM and AWM, and on creating the corresponding analysis areas. The lessons show examples of how to scope and modify the model. Tables that contain the required customizations to the model are provided in the lessons and in the Appendix: referenced tables.

This worked example:

Explores the existing NAACCR supportive content in the Information Governance Catalog (IGC) to find the terms corresponding to the Cancer Registry sample
Uses the assigned asset mappings for the identified NAACCR business terms
Traces which entities and attributes these data elements are mapped to in the logical data model:
- Business
- Atomic
- Dimensional
Creates a scope within the logical dimensional model by using the identified entities and attributes
Creates a sample breast cancer snapshot and diagram
Creates a sample business solution template and diagram to show the number of breast cancer patients aged 50-70 who were screened within the last three years
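
To make the final measure concrete, the following Python sketch counts breast cancer patients aged 50-70 who were screened within the last three years. It is an illustration only and is not part of UDMH, IGC or IDA: the PatientRow fields, the "breast" site label, the inclusive age bounds and the inclusive three-year window are all assumptions made for this sketch. In the worked example itself, the measure is modeled in the business solution template rather than coded.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional


@dataclass
class PatientRow:
    """Hypothetical flattened row; field names are illustrative, not UDMH names."""
    patient_id: str
    date_of_birth: date
    primary_site: str                      # assumption: "breast" marks a breast cancer case
    last_screening_date: Optional[date]    # None if no screening is recorded


def age_on(as_of: date, dob: date) -> int:
    """Age in completed years on the as_of date."""
    return as_of.year - dob.year - ((as_of.month, as_of.day) < (dob.month, dob.day))


def years_before(as_of: date, years: int) -> date:
    """Same calendar day `years` earlier; Feb 29 falls back to Feb 28."""
    try:
        return as_of.replace(year=as_of.year - years)
    except ValueError:
        return as_of.replace(year=as_of.year - years, day=28)


def count_screened(rows: Iterable[PatientRow], as_of: date,
                   min_age: int = 50, max_age: int = 70,
                   window_years: int = 3) -> int:
    """Count breast cancer patients in the age band with a screening in the window."""
    cutoff = years_before(as_of, window_years)
    return sum(
        1
        for r in rows
        if r.primary_site == "breast"
        and min_age <= age_on(as_of, r.date_of_birth) <= max_age
        and r.last_screening_date is not None
        and cutoff <= r.last_screening_date <= as_of
    )
```

Passing the reporting date explicitly as as_of, instead of reading the system clock, keeps the count reproducible for a given snapshot date.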

Learning objectives

In this worked example, you will learn how to:

Scope the NAACCR supportive content in IGC to identify the required data elements
Trace these data elements to identify the entities and attributes in DWM
Scope DWM by creating diagrams, a snapshot and a business solution template

Time required

This worked example takes approximately 150 minutes to complete. If you explore other concepts related to this example, it will take longer to complete.

Skill level

Intermediate

Audience

This worked example is intended for data modelers with experience using IGC and IDA.

System requirements

Information Governance Catalog (IGC)
InfoSphere Data Architect (IDA)
IBM Unified Data Model for Healthcare

Review the InfoSphere Data Architect (IDA) Tutorials (in the IDA menu, click Help > Welcome).
See the IBM Unified Data Model for Healthcare user guide for more information.
Use the following table, which lists some data elements that are typically required by a cancer registry, as the basis for this worked example:

Cancer Registry Data Element
Patient Name (first, middle, last)
Patient Social Security Number
Patient Date of Birth
Patient Sex
Patient Address Current
Patient Race
Reporting Facility ID
Primary Site C
Date of Initial Diagnosis
Class of Case – Groups cancer cases into Analytic vs. Non-Analytic
CS Mets at DX – Identifies whether there is metastatic involvement at the time of diagnosis
RX Date Chemo – Date chemotherapy started
RX Summ - Chemo – Code for the type of chemotherapy administered
Rx Chemo Flag – Code indicating why no date has been identified

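The scoping idea behind this table can be sketched in a few lines of Python: given the required registry data elements and a set of term-to-model mappings, collect the mapped DWM entities and attributes and list any elements that have no DWM mapping. This is a minimal sketch under stated assumptions. The mapping structure is hypothetical, and the entity and attribute names ("Example Patient Dimension" and so on) are placeholders, not actual UDMH or NAACCR names. In practice the mappings are read from the NAACCR supportive content in IGC, not hand-coded.

```python
from typing import Dict, List, Tuple

# (model, entity, attribute); a hypothetical shape for one assigned asset mapping
Mapping = Tuple[str, str, str]


def build_scope(required: List[str],
                mappings: Dict[str, List[Mapping]],
                model: str = "DWM") -> Tuple[Dict[str, List[str]], List[str]]:
    """Return ({entity: [attributes]}, [unmapped elements]) for one logical model."""
    scope: Dict[str, List[str]] = {}
    unmapped: List[str] = []
    for element in required:
        # Keep only the mappings that point into the requested model.
        targets = [(e, a) for (m, e, a) in mappings.get(element, []) if m == model]
        if not targets:
            unmapped.append(element)
        for entity, attribute in targets:
            scope.setdefault(entity, []).append(attribute)
    return scope, unmapped


# Placeholder mappings for illustration only; these are NOT real UDMH names.
EXAMPLE_MAPPINGS: Dict[str, List[Mapping]] = {
    "Patient Date of Birth": [
        ("BDM", "Example Patient", "Birth Date"),
        ("DWM", "Example Patient Dimension", "Birth Date"),
    ],
    "Patient Sex": [("DWM", "Example Patient Dimension", "Sex")],
    "Date of Initial Diagnosis": [
        ("DWM", "Example Diagnosis Fact", "Initial Diagnosis Date"),
    ],
    "Reporting Facility ID": [("AWM", "Example Facility", "Facility Identifier")],
}

EXAMPLE_REQUIRED = [
    "Patient Date of Birth",
    "Patient Sex",
    "Date of Initial Diagnosis",
    "Reporting Facility ID",
    "Patient Race",
]

scope, unmapped = build_scope(EXAMPLE_REQUIRED, EXAMPLE_MAPPINGS)
```

Elements that end up in the unmapped list are the ones to investigate further, either by searching the vocabulary again or by considering whether the model needs to be extended.
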
Expected results

Identify the appropriate NAACCR data elements in the IGC supportive content, together with their associated entities and attributes, to support the section of a typical cancer registry. These translate into scoped entities and attributes in DWM. The scope is then extended in DWM to include diagrams, a snapshot and a business solution template, which support breast cancer analysis.

Note: reference tables used throughout the worked example are available in the appendix: