Data Collection Component architecture

Data is collected from the IBM® Engineering Lifecycle Management applications periodically by data collection jobs.

Data collection jobs are commonly known as extract, transform, and load (ETL) jobs. Data Collection Component uses the ETL process to extract data from various products, for example Engineering Lifecycle Management products, transform it, and ultimately store it in the data warehouse for reporting purposes. When you author reports that need metrics from the Engineering Lifecycle Management applications, Data Collection Component populates the metrics tables in the data warehouse.
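The three ETL stages can be illustrated with a minimal Python sketch. It is not Data Collection Component code: the record layout, the function names, and the dictionary that stands in for a warehouse table are all assumptions made for illustration.

```python
# Hypothetical source records; a real application exposes far richer data.
SOURCE_WORK_ITEMS = [
    {"id": 1, "summary": "  Fix login  ", "state": "open"},
    {"id": 2, "summary": "Update docs", "state": "CLOSED"},
]


def extract(source):
    """Extract: read the raw records from the source application."""
    return list(source)


def transform(records):
    """Transform: map raw records onto a predefined (id, summary, state) layout."""
    return [
        (r["id"], r["summary"].strip(), r["state"].lower())
        for r in records
    ]


def load(rows, warehouse_table):
    """Load: insert or overwrite rows in the target table, keyed by id."""
    for row in rows:
        warehouse_table[row[0]] = row
    return warehouse_table


# A dict stands in for a data warehouse table (assumption for this sketch).
warehouse_table = load(transform(extract(SOURCE_WORK_ITEMS)), {})
```

Keeping the stages as separate functions mirrors the way an ETL job separates reading from the source, mapping to the target schema, and writing to the warehouse.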

Data Collection Component uses three different types of data collection:
  • ODS data collection
  • Data-mart data collection
  • Licenses data collection
Tip: You can schedule each of these collections independently to run at multiple intervals or at a certain time of day. For details about scheduling data collection, see the step on scheduling data collection jobs in the Collecting data warehouse data with the Data Collection Component topic.
Figure 1. Data flow process: Data collection process
Full data flow diagram with the data collection process highlighted.

ODS data collection

Operational data store (ODS) data collection jobs are application-specific. They extract data from the application data storage and the Jazz Team Server data storage, transform the data to map to a predefined set of data tables, and then load the transformed data into the respective tables in the ODS of the data warehouse.
Tip: All the jobs in the ODS collection are incremental delta loads, so you can schedule them to run multiple times a day. For example, you can schedule an interval as short as 5 minutes, but on a production server the ODS jobs are typically scheduled to run every 30 minutes.
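A delta load copies only what changed since the previous run, which is why frequent runs stay cheap. The following Python sketch illustrates the general idea with a "last modified" watermark. The source does not describe how Data Collection Component detects changes, so the watermark mechanism, the `delta_load` function, and the row layout are assumptions.

```python
def delta_load(source_rows, ods_table, last_run):
    """Copy only rows modified after last_run into ods_table.

    Returns (number of rows loaded, new watermark).
    """
    loaded = 0
    newest = last_run
    for row in source_rows:
        if row["modified"] > last_run:
            ods_table[row["id"]] = row  # insert or overwrite by id
            loaded += 1
            newest = max(newest, row["modified"])
    return loaded, newest


# Hypothetical source rows; "modified" is a simple integer timestamp.
source = [
    {"id": "WI-1", "modified": 100, "state": "open"},
    {"id": "WI-2", "modified": 130, "state": "open"},
]
ods = {}

# First run: nothing was loaded before, so both rows are copied.
first_count, watermark = delta_load(source, ods, last_run=0)

# WI-1 changes in the source; WI-2 does not.
source[0] = {"id": "WI-1", "modified": 160, "state": "closed"}

# Second run: only the changed row is copied again.
second_count, watermark = delta_load(source, ods, last_run=watermark)
```

In this sketch the second run touches a single row, so the cost of a run depends on the amount of change rather than on the total size of the source.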
The operational data store has the following data collection jobs:
Change and Configuration Management - Planning
Collects planning data. This job runs only against IBM Engineering Workflow Management applications.
Jazz Foundation - Core
Collects data that is common to all Engineering Lifecycle Management applications, such as project areas, team areas, iterations, timelines, user information, and other elements. This job runs against all Engineering Lifecycle Management applications, including the Jazz Team Server.
Quality Management
Collects test and quality data from the IBM Engineering Test Management application.
Requirement Management
Collects requirement data from the IBM Engineering Requirements Management DOORS® and IBM Engineering Requirements Management DOORS Next applications.
Change and Configuration Management - Build
Collects build data from the Engineering Workflow Management application.
Change and Configuration Management - Work Items
Collects work item data from the Engineering Workflow Management and Engineering Test Management applications.
Change and Configuration Management - Time Sheets (Part 1) to (Part 3)
Populates time-sheet data.

Data-mart data collection

The Data-mart data collection processes are distinct from the application data collection.

The Data-mart data collection processes extract data from the ODS, transform it, and then load it into metrics tables, which consist of a set of fact tables and associated dimensions. The Data-mart data collection files are stored in the Jazz Team Server. These files map relationships between the entries in the ODS tables and the entries in the fact tables and dimensions of the metrics tables. For example, the metrics tables store defect arrival and closure rates.
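The relationship between ODS rows, a dimension, and a fact table can be sketched in Python using the defect arrival and closure example. The table layouts, the field names, and the `build_defect_facts` function are hypothetical and do not reflect the actual data warehouse schema.

```python
from collections import Counter

# Hypothetical ODS rows: one row per defect with created and closed dates.
ods_defects = [
    {"id": 1, "created": "2024-05-01", "closed": "2024-05-02"},
    {"id": 2, "created": "2024-05-01", "closed": None},
    {"id": 3, "created": "2024-05-02", "closed": "2024-05-02"},
]


def build_defect_facts(rows):
    """Rebuild a date dimension and a daily arrival/closure fact table."""
    arrivals = Counter(r["created"] for r in rows)
    closures = Counter(r["closed"] for r in rows if r["closed"])
    dates = sorted(set(arrivals) | set(closures))

    # Dimension: one surrogate key per distinct date.
    date_dim = {date: key for key, date in enumerate(dates, start=1)}

    # Facts: one row per date, referencing the dimension by key.
    facts = [
        {
            "date_key": date_dim[date],
            "arrivals": arrivals[date],
            "closures": closures[date],
        }
        for date in dates
    ]
    return date_dim, facts


date_dim, defect_facts = build_defect_facts(ods_defects)
```

This sketch recomputes the whole fact table from the ODS rows on every call rather than applying a delta, which is one reason a job of this shape costs more per run than an incremental load.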

In summary, the Data-mart collection contains jobs that mainly collect metric data and some slowly changing operational data from the various Engineering Lifecycle Management applications. Some jobs in this collection also perform maintenance and cleanup on the data warehouse tables.

Tip: Schedule the Data-mart data collection jobs to run as infrequently as possible, for example at most once a day. In addition, never run the jobs in this collection individually. Data-mart data collection jobs can take a significant amount of time to process because they are not delta jobs and typically insert a large amount of data into the data warehouse.
The data-mart data collection has the following data collection jobs:
Change and Configuration Management - Planning Capacity
Collects planning data about project area and team area capacities, such as the daily capacities in person-hours. This job runs only against Engineering Workflow Management applications.
Data Cleanup
Performs various maintenance and cleanup tasks on the data warehouse. For example, one of these tasks archives the child artifacts of a project area when the project area is archived.
Delete Fact Table Data
Cleans up metric data that remains from a previous incomplete run.
Dimension
Populates the dimension tables in the Data-mart using data from the ODS tables.
Activity Facts
Populates activity-related metric tables. The Engineering Workflow Management application pushes activity data to the ODS tables only when the time-sheet function is enabled.
Build Facts
Populates build-related metric tables.
File Facts
Populates metric tables related to source code control.
Project Management Facts
Populates metric tables related to project management and time sheets.
Quality Management Facts
Populates quality-related and test-related metric tables.
Requirement Management Facts
Populates requirement-related metric tables.
Request Management Facts
Populates metric tables related to work items.
Task Facts
Populates task-related metric tables.
Jazz Foundation Services - Project Area (Part 1) and (Part 2)
Collects slowly changing project area data, such as the read access permission, membership, and role data.
Change and Configuration Management - Source Control Dimensions
Populates dimension tables related to source code control.
Change and Configuration Management - Source Control Metrics
Populates metric tables related to source code control.
Change and Configuration Management - Source Control
Collects source code control data from the Engineering Workflow Management application.
Change and Configuration Management - Work Item Metrics
Populates special work item metric data used by the built-in Engineering Workflow Management BIRT reports. Also collects data about work item state changes.
Important: Data Collection Component no longer collects trend metrics broken down by user (owner or creator). This change was necessary to avoid metric data growth, which contributes to performance degradation over time. To disable this new behavior, follow the steps outlined in the readme file at [server_installdir]/server/conf/dcc/mapping/legacy.

Licenses data collection

Licenses data collection gathers information and metrics about license usage on the Engineering Lifecycle Management systems.
Tip: You must schedule the license data collection to run hourly.
The license data collection has the following data collection job:
Jazz Foundation Services - Licensing
Collects the license usage metrics that are used to display the Jazz Team Server license usage reports.