This worked example shows how to scope data elements in
the IBM Unified Data Model for Healthcare to
support a section of a typical US State Cancer Registry and extend
the data scope to enable analysis related to breast cancer diagnosis,
treatment and screening.
A cancer registry is a central repository of cancer data
which is usually collected by cancer registrars. Each healthcare facility
reports each instance of a new patient admitted to their facility
for cancer-related treatment. The data represents a summary of the
patient's demographic information, diagnosis, tumor and treatment.
Departments of health typically mandate the collection of cancer cases so that
the collected data can be used to observe cancer trends and provide
a research base for studies into the possible causes and
treatment of cancer. The common goal for cancer registries is to reduce
cancer-related deaths and illnesses by providing data on cancer incidence.
The North American Association of Central Cancer Registries (NAACCR) establishes
and maintains a consensus on standards for cancer registration in
North America, and it aggregates the annual data from population-based
registries throughout the US. Supportive content for NAACCR v13 is provided in
the vocabulary in Information Governance Catalog (IGC) and
contains mappings to the logical models. When scoping
the model, refer to the NAACCR supportive content
to ensure compliance with standards and to reuse the existing mappings
to the Business Data Model (BDM), Atomic Warehouse Model (AWM) and
the Dimensional Warehouse Model (DWM).
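As a rough illustration of reusing those mappings, the following Python sketch groups the model targets of a NAACCR business term by model layer. It is not an IGC API: the `AssetMapping` structure, the `trace_term` helper, and every entity and attribute name in `MAPPINGS` are hypothetical placeholders. The real mappings are the assigned assets shown for each term in IGC.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AssetMapping:
    """One link from a business term to an attribute in a logical model."""
    term: str       # NAACCR business term
    model: str      # "BDM", "AWM" or "DWM"
    entity: str
    attribute: str


# Hypothetical rows for illustration only; these are NOT real UDMH names.
MAPPINGS = [
    AssetMapping("Date of Initial Diagnosis", "BDM",
                 "Example Diagnosis", "Example Diagnosis Date"),
    AssetMapping("Date of Initial Diagnosis", "AWM",
                 "EXAMPLE_DIAGNOSIS", "DIAGNOSIS_DT"),
    AssetMapping("Date of Initial Diagnosis", "DWM",
                 "EXAMPLE_DIAGNOSIS_DIM", "DIAGNOSIS_DT"),
    AssetMapping("Patient Sex", "DWM", "EXAMPLE_PATIENT_DIM", "SEX_CD"),
]


def trace_term(term, mappings):
    """Return {model: ["Entity.Attribute", ...]} for one business term."""
    traced = defaultdict(list)
    for m in mappings:
        if m.term == term:
            traced[m.model].append(f"{m.entity}.{m.attribute}")
    return dict(traced)


if __name__ == "__main__":
    for model, targets in trace_term("Date of Initial Diagnosis", MAPPINGS).items():
        print(model, targets)
```

A term that has no mapping returns an empty dictionary, which is a quick signal that the element may need new content rather than reuse.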
This worked example shows
how a healthcare organization can use the IBM Unified Data Model for Healthcare to
provide a subset of the data required by a cancer registry and, by
extending with additional data elements, carry out further analysis
of breast cancer patient data (diagnosis, treatment and screening).
Scoping the IBM Unified Data Model for Healthcare (UDMH) is
the process of selecting content from an existing version of the vocabulary,
BDM, AWM and DWM. The scoped content can include:
- Attributes
- Entities
- Diagrams
- Packages
When you scope UDMH,
you determine which terms in the vocabulary and which of the models
(BDM, AWM and DWM) are involved in the project. The default process
involves:
- Analyzing the business requirements to determine what existing
and new content is needed
- Searching for supporting content in the vocabulary
- Transforming any changes into BDM and AWM
- Creating analysis elements in DWM
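The first two steps amount to comparing a list of required data elements with the terms already in the vocabulary. The sketch below shows that comparison in Python, assuming both lists have been exported as plain strings. The `split_requirements` helper and the sample values are hypothetical, and matching by case-insensitive name is a simplification of a real search in the vocabulary.

```python
def split_requirements(required, vocabulary_terms):
    """Split required element names into (found, missing).

    Matching is by case-insensitive name only, which is an assumption
    made for this sketch; the order of `required` is preserved.
    """
    available = {term.casefold() for term in vocabulary_terms}
    found = [name for name in required if name.casefold() in available]
    missing = [name for name in required if name.casefold() not in available]
    return found, missing


if __name__ == "__main__":
    # Hypothetical inputs for illustration only.
    required = ["Patient Sex", "Patient Race", "Hypothetical Local Element"]
    vocabulary = ["patient sex", "Patient Race", "Date of Initial Diagnosis"]
    found, missing = split_requirements(required, vocabulary)
    print("Existing content:", found)
    print("New content needed:", missing)
```

Elements in the `missing` list are the candidates for new content that would then be transformed into BDM and AWM.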
Note: This example uses a section of a typical US State Cancer
Registry, and the required data can be found in
UDMH.
Because the healthcare organization does not need to modify or add content
to the vocabulary, BDM or AWM, this example focuses on scoping and
creating data elements in DWM, which are mapped to BDM and AWM, and
creating the corresponding analysis areas. The lessons show examples
of how to scope and modify the model. Tables that contain the required customizations
to the model are provided in the lessons and in the
Appendix: referenced tables.
This worked example:
- Explores the existing NAACCR supportive content in the Information Governance Catalog (IGC) to find the
terms corresponding to the Cancer Registry sample.
- Uses the assigned asset mappings for the identified NAACCR business
terms
- Traces what entities and attributes these data elements are mapped
to in the logical data model:
  - Business
  - Atomic
  - Dimensional
- Creates a scope within the logical dimensional model by using the
identified entities and attributes
- Creates a sample breast cancer snapshot and diagram
- Creates a sample business solution template and diagram to show
the number of breast cancer patients aged 50-70 who were screened
within the last three years
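To make the last measure concrete, the following Python sketch computes the same count over a handful of in-memory records. It is only an illustration of the logic, not the DWM business solution template itself. The `PatientRecord` fields, the helper names, the sample data, and the choice to treat the 50-70 age band and the three-year window as inclusive are all assumptions.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class PatientRecord:
    """Hypothetical flattened patient row; not a UDMH entity."""
    patient_id: str
    birth_date: date
    is_breast_cancer: bool
    last_screening_date: Optional[date]


def age_on(birth_date, as_of):
    """Age in completed years on the as_of date."""
    years = as_of.year - birth_date.year
    if (as_of.month, as_of.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


def years_before(as_of, years):
    """Same calendar day `years` earlier; 29 Feb falls back to 28 Feb."""
    try:
        return as_of.replace(year=as_of.year - years)
    except ValueError:
        return as_of.replace(year=as_of.year - years, day=28)


def count_screened(records, as_of, min_age=50, max_age=70, window_years=3):
    """Count breast cancer patients in the age band screened in the window."""
    cutoff = years_before(as_of, window_years)
    return sum(
        1
        for r in records
        if r.is_breast_cancer
        and min_age <= age_on(r.birth_date, as_of) <= max_age
        and r.last_screening_date is not None
        and cutoff <= r.last_screening_date <= as_of
    )


# Hypothetical sample data for illustration only.
SAMPLE = [
    PatientRecord("p1", date(1960, 1, 15), True, date(2023, 3, 1)),
    PatientRecord("p2", date(1940, 5, 1), True, date(2023, 3, 1)),
    PatientRecord("p3", date(1965, 7, 1), True, date(2019, 1, 1)),
    PatientRecord("p4", date(1970, 2, 2), False, date(2023, 3, 1)),
    PatientRecord("p5", date(1974, 6, 30), True, date(2021, 6, 30)),
]

if __name__ == "__main__":
    print(count_screened(SAMPLE, as_of=date(2024, 6, 30)))
```

Passing the reference date explicitly, rather than reading the system clock inside the function, keeps the count reproducible for a given reporting period.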
Learning objectives
In this worked
example, you will learn how to:
- Scope the NAACCR supportive content in IGC to identify
the required data elements
- Trace these data elements to identify the entities and attributes
in DWM
- Scope DWM by creating diagrams, a snapshot and a business solution
template
Time required
This worked example takes
approximately 150 minutes to complete. If you explore other concepts
related to this example, it will take longer to complete.
Audience
This worked example is intended
for data modelers with experience using IGC and IDA.
System requirements
- Information Governance Catalog (IGC)
- InfoSphere Data Architect (IDA)
- IBM Unified Data Model for Healthcare
Recommended
- Review the InfoSphere Data Architect (IDA) Tutorials
(in the IDA menu, click ).
- See the IBM Unified Data Model for Healthcare user
guide for more information.
- Use the following table, which lists some data elements that are
typically required by a cancer registry, as the basis for this worked
example:
Cancer Registry Data Element |
Patient Name (first, middle, last) |
Patient Social Security Number |
Patient Date of Birth |
Patient Sex |
Patient Address Current |
Patient Race |
Reporting Facility ID |
Primary Site C |
Date of Initial Diagnosis |
Class of Case – Groups cancer cases into Analytic vs Non Analytic. |
CS Mets at DX – Identifies whether there is metastatic involvement at the time of diagnosis. |
RX Date Chemo – Date Chemo Started. |
RX Summ - Chemo – Code for type of chemo administered. |
Rx Chemo Flag – Code indicating why no date has been identified. |
Expected results
Identify the appropriate
NAACCR data elements in IGC supportive
content and their associated entities and attributes to support the
section of a typical cancer registry. These will translate into scoped
entities and attributes in DWM. This scope will be extended in DWM
to include diagrams, a snapshot and a business solution template,
which support breast cancer analysis.
Note: Reference
tables used throughout the worked example are available in the appendix: