SSRS
Microsoft SQL Server Reporting Services (SSRS) is the highest layer of the Microsoft business intelligence environment. This layer is responsible for creating reports from the data it extracts directly from the data mart as well as from the analytics center of your data warehouse.
IBM Automatic Data Lineage can process SSRS reports, which completes the end-to-end data lineage for the Microsoft SQL Server family.
Automatic Data Lineage currently scans:
- Reports
- Datasets
- Data sources
Check out the guides below for more details on setting up this scanner.
- SSRS Integration Requirements
- SSRS Resource Configuration
- SSRS Include/Exclude Options
- SSRS Manual Inputs (since R42.3)
Extraction and Analysis Phase Scenarios
Extraction Phase
The extraction phase for SQL Server Reporting Services has two scenarios.
- SSRS extractor scenario: connects to each configured SSRS server and extracts the configured projects
- SSRS ingestion scenario: pulls inputs from Git (see Manta Flow Agent Configuration for Extraction: Git Source) or from a remote agent filesystem location (see Manta Flow Agent Configuration for Extraction: Agent Source)
Analysis Phase
The analysis phase for SSRS has only one scenario.
- SSRS server dataflow scenario: harvests metadata and lineage from the extracted SSRS projects and saves it in your Automatic Data Lineage metadata repository
Visualization
Reports can use two types of datasets: embedded and shared. The two types are visualized and filtered differently.
Embedded Dataset
An embedded dataset is fully defined in the report definition; it only needs a data source definition, which can itself be either shared or embedded. This dataset type is visualized as a single dataset object placed inside the report that it is embedded in. Its input is directly connected to the source database.
If both a dataset and a data source are embedded in a report, then no other definitions are required for the report analysis; it is enough to extract just the single report definition. If a shared data source is used, then its definition needs to be extracted too; otherwise, the analysis does not know which database is used as a data source and the lineage cannot be complete.
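As a rough illustration of the rule above, the following sketch inspects a report definition and reports whether its datasets and data sources are embedded or shared. It is not how Automatic Data Lineage performs its analysis. The sample report is a hypothetical, heavily trimmed fragment, the function names are invented for this example, and the element names (DataSourceReference, SharedDataSet) are assumed to follow the Report Definition Language schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed report definition: one embedded dataset that
# queries through a reference to a shared data source.
SAMPLE_RDL = """
<Report xmlns="http://schemas.microsoft.com/sqlserver/reporting/2016/01/reportdefinition">
  <DataSources>
    <DataSource Name="SalesDb">
      <DataSourceReference>/Data Sources/SalesDb</DataSourceReference>
    </DataSource>
  </DataSources>
  <DataSets>
    <DataSet Name="Orders">
      <Query>
        <DataSourceName>SalesDb</DataSourceName>
        <CommandText>SELECT OrderId, Amount FROM dbo.Orders</CommandText>
      </Query>
    </DataSet>
  </DataSets>
</Report>
"""


def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def has_child(elem, name):
    """Return True if elem has a direct child with the given local name."""
    return any(local_name(child.tag) == name for child in elem)


def classify(rdl_text):
    """Return ({data source: kind}, {dataset: kind}); kind is 'embedded' or 'shared'."""
    sources, datasets = {}, {}
    for elem in ET.fromstring(rdl_text).iter():
        tag = local_name(elem.tag)
        if tag == "DataSource":
            # Assumption: a shared data source appears as a reference element.
            shared = has_child(elem, "DataSourceReference")
            sources[elem.get("Name")] = "shared" if shared else "embedded"
        elif tag == "DataSet":
            # Assumption: a shared dataset instance wraps a SharedDataSet element.
            shared = has_child(elem, "SharedDataSet")
            datasets[elem.get("Name")] = "shared" if shared else "embedded"
    return sources, datasets


def is_self_contained(rdl_text):
    """True when the single report definition is enough for analysis."""
    sources, datasets = classify(rdl_text)
    kinds = list(sources.values()) + list(datasets.values())
    return all(kind == "embedded" for kind in kinds)


sources, datasets = classify(SAMPLE_RDL)
print(sources)                        # {'SalesDb': 'shared'}
print(datasets)                       # {'Orders': 'embedded'}
print(is_self_contained(SAMPLE_RDL))  # False
```

In this sample the dataset is embedded but the data source is shared, so the shared data source definition would also have to be extracted for the lineage to be complete.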
Shared Dataset
Shared datasets retrieve data from shared data sources that connect to external data sources. They can use only shared data sources, never embedded ones. Like shared data sources, shared datasets are managed independently of the reports that use them. That means both must be included in the extraction for every report that uses them. When a shared dataset is used in a report, an instance of the shared dataset is added to the report. This is visualized as two duplicate datasets. One represents the shared dataset itself, and its input is the source database. The other represents the dataset instance; it is placed inside the report, and its input is the shared data source.
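To make the dependency on separately managed definitions concrete, here is a minimal sketch that lists the shared items a report definition points to, which is one way to check what else must be included in the extraction. It is an illustration only, not the scanner's implementation. The sample report, its paths, and the function name shared_references are hypothetical, and the element names SharedDataSetReference and DataSourceReference are assumed to follow the Report Definition Language schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed report definition containing one shared dataset instance.
SHARED_SAMPLE_RDL = """
<Report xmlns="http://schemas.microsoft.com/sqlserver/reporting/2016/01/reportdefinition">
  <DataSets>
    <DataSet Name="Customers">
      <SharedDataSet>
        <SharedDataSetReference>/Datasets/Customers</SharedDataSetReference>
      </SharedDataSet>
    </DataSet>
  </DataSets>
</Report>
"""

# Assumed element names that point at separately managed definitions.
REFERENCE_TAGS = ("SharedDataSetReference", "DataSourceReference")


def shared_references(rdl_text):
    """Return a sorted list of (reference element name, referenced path) pairs."""
    refs = set()
    for elem in ET.fromstring(rdl_text).iter():
        # ElementTree prefixes tags with '{namespace}'; keep only the local name.
        tag = elem.tag.rsplit("}", 1)[-1]
        if tag in REFERENCE_TAGS and elem.text and elem.text.strip():
            refs.add((tag, elem.text.strip()))
    return sorted(refs)


for kind, path in shared_references(SHARED_SAMPLE_RDL):
    print(kind, path)  # SharedDataSetReference /Datasets/Customers
```

The report fragment names only the shared dataset. The shared data source behind it would presumably be named in the shared dataset's own definition, so that definition needs the same check.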