SSRS

Microsoft SQL Server Reporting Services (SSRS) is the highest layer of the Microsoft business intelligence environment. This layer is responsible for creating reports from the data it extracts directly from the data mart as well as from the analytics center of your data warehouse.

IBM Automatic Data Lineage can process SSRS reports, producing precise end-to-end data lineage across the Microsoft SQL Server family.

Automatic Data Lineage currently scans:

Reports
Datasets
Data sources

Check out the guides below for more details on setting up this scanner.

Extraction and Analysis Phase Scenarios

Extraction Phase

For the extraction phase for SQL Server Reporting Services, there are two scenarios.

SSRS extractor scenario: connects to each configured SSRS server and extracts the configured projects
SSRS ingestion scenario: pulls inputs from Git (see Manta Flow Agent Configuration for Extraction: Git Source) or from a remote agent filesystem location (see Manta Flow Agent Configuration for Extraction: Agent Source)

Analysis Phase

For the analysis phase for SSRS server, there is only one scenario.

SSRS server dataflow scenario: harvests metadata and lineage from the extracted SSRS projects and saves them in your Automatic Data Lineage metadata repository

Visualization

Reports can use two types of datasets — embedded and shared. Their visualization and filtering are different.

Embedded Dataset

An embedded dataset is fully defined in the report definition; it only needs a data source definition, which can be either shared or embedded. This dataset type is visualized as a single dataset object placed inside the report that it is embedded in. Its input is directly connected to the source database.

If both a dataset and a data source are embedded in a report, then no other definitions are required for the report analysis; it is enough to extract just the single report definition. If a shared data source is used, then its definition needs to be extracted too; otherwise, the analysis does not know which database is used as a data source and the lineage cannot be complete.
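To check which shared data source definitions a report depends on before extraction, the report definition itself can be inspected. The following is a minimal sketch, not the scanner's own logic. It assumes the standard RDL layout, in which an embedded data source carries a ConnectionProperties element and a shared one carries a DataSourceReference element. The sample report, the data source names, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down report definition used only for illustration.
SAMPLE_RDL = """<Report xmlns="http://schemas.microsoft.com/sqlserver/reporting/2016/01/reportdefinition">
  <DataSources>
    <DataSource Name="EmbeddedSales">
      <ConnectionProperties>
        <DataProvider>SQL</DataProvider>
        <ConnectString>Data Source=dbhost;Initial Catalog=Sales</ConnectString>
      </ConnectionProperties>
    </DataSource>
    <DataSource Name="SharedFinance">
      <DataSourceReference>/Data Sources/Finance</DataSourceReference>
    </DataSource>
  </DataSources>
</Report>"""


def local_name(tag):
    """Drop the XML namespace so the sketch works across RDL schema versions."""
    return tag.rsplit("}", 1)[-1]


def shared_data_source_references(rdl_text):
    """Map each data source that uses a shared definition to its reference path."""
    root = ET.fromstring(rdl_text)
    references = {}
    for element in root.iter():
        if local_name(element.tag) != "DataSource":
            continue
        for child in element:
            if local_name(child.tag) == "DataSourceReference":
                references[element.get("Name")] = (child.text or "").strip()
    return references


print(shared_data_source_references(SAMPLE_RDL))
# prints {'SharedFinance': '/Data Sources/Finance'}
```

An empty result suggests that the report's data sources are all embedded, so the single report definition should be enough. Any entry in the result points to a shared data source definition that also needs to be extracted.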

Shared Dataset

Shared datasets retrieve data from shared data sources that connect to external data sources. Shared datasets only use shared data sources, not embedded data sources. Like shared data sources, shared datasets are managed independently from the reports that they are used in. That means both must be included in the extraction together with all the reports that use them. When a shared dataset is used in a report, an instance of the shared dataset is added to the report. This is visualized as two duplicate datasets. One represents the shared dataset, and its input is the source database. The other represents the dataset instance in the report; it is placed in the report, and its input is the shared data source.
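The difference between the two dataset types is also visible in the report definition. The sketch below is an illustration only, not how Automatic Data Lineage analyzes reports. It assumes the standard RDL layout, in which an embedded dataset has a Query element naming its data source and a shared dataset instance has a SharedDataSet element with a SharedDataSetReference. The sample report and all names in it are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical report fragment with one embedded and one shared dataset.
SAMPLE_RDL = """<Report xmlns="http://schemas.microsoft.com/sqlserver/reporting/2016/01/reportdefinition">
  <DataSets>
    <DataSet Name="Orders">
      <Query>
        <DataSourceName>EmbeddedSales</DataSourceName>
        <CommandText>SELECT OrderId, Amount FROM dbo.Orders</CommandText>
      </Query>
    </DataSet>
    <DataSet Name="Customers">
      <SharedDataSet>
        <SharedDataSetReference>/Datasets/Customers</SharedDataSetReference>
      </SharedDataSet>
    </DataSet>
  </DataSets>
</Report>"""


def local_name(tag):
    """Drop the XML namespace so the sketch works across RDL schema versions."""
    return tag.rsplit("}", 1)[-1]


def child_text(parent, wanted):
    """Return the stripped text of the first direct child named `wanted`, or None."""
    for child in parent:
        if local_name(child.tag) == wanted:
            return (child.text or "").strip()
    return None


def classify_datasets(rdl_text):
    """Map each dataset name to ('shared', reference) or ('embedded', data source name)."""
    root = ET.fromstring(rdl_text)
    result = {}
    for element in root.iter():
        if local_name(element.tag) != "DataSet":
            continue
        kind, target = "embedded", None
        for child in element:
            name = local_name(child.tag)
            if name == "SharedDataSet":
                kind, target = "shared", child_text(child, "SharedDataSetReference")
            elif name == "Query" and kind == "embedded":
                target = child_text(child, "DataSourceName")
        result[element.get("Name")] = (kind, target)
    return result


print(classify_datasets(SAMPLE_RDL))
# prints {'Orders': ('embedded', 'EmbeddedSales'), 'Customers': ('shared', '/Datasets/Customers')}
```

Each dataset reported as shared refers to a separately managed definition, which needs to be part of the extraction along with the shared data source it uses.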