PostgreSQL, Redshift, Greenplum, Yellowbrick, Amazon RDS, Amazon Aurora for PostgreSQL, YugabyteDB, and EDB
This scanner can handle many PostgreSQL-based databases. IBM Automatic Data Lineage has been tested with the following: PostgreSQL, Redshift, Greenplum, Yellowbrick, Amazon RDS, Amazon Aurora for PostgreSQL, YugabyteDB, and EDB (EnterpriseDB).
PostgreSQL is an open-source object-relational SQL database system for which Automatic Data Lineage offers a powerful scanner. Once configured, Automatic Data Lineage can automatically connect to the PostgreSQL resource to extract and analyze the pertinent metadata within the selected databases. This metadata includes, but is not limited to, the PostgreSQL data dictionary, scripts, views, and functions written in SQL and PL/pgSQL. Automatic Data Lineage also supports custom types in PostgreSQL. It can parse all the programming code and logic stored in these objects, generating lineage down to the column level and showing all transformation logic associated with individual column elements.
Automatic Data Lineage currently scans:
- Data dictionaries
- Scripts
- Views
- Functions (SQL and PL/pgSQL)
- Custom types
Check out the guides below for more details on setting up this scanner.
Extraction and Analysis Phase Scenarios
Extraction Phase
For the extraction phase for PostgreSQL database servers, there are two scenarios.
- PostgreSQL dictionary mapping scenario: connects to each configured PostgreSQL database server and stores the mapping between these values: dictionary ID, subdialect, host name, port, included databases/schemas, and excluded databases/schemas
- PostgreSQL extractor scenario: connects to each configured PostgreSQL database server and extracts the database dictionary and DDL scripts from the configured schemas

As of version 42.4, IBM Automatic Data Lineage supports Git Ingest connections, which download files from a Git repository to the PostgreSQL workflow. For more information, see Manta Flow Agent Configuration for Extraction: Git Source.
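To make the dictionary mapping scenario more concrete, the values it stores can be pictured as one record per configured server. The following is only a minimal sketch: the class name `DictionaryMapping`, its field names, the helper `find_dictionary_id`, and all sample values are hypothetical and do not reflect the product's actual storage format or API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DictionaryMapping:
    """One mapping record. Field names are illustrative assumptions."""
    dictionary_id: str
    subdialect: str        # sample values below are made up
    host: str
    port: int
    included: tuple = ()   # included databases/schemas
    excluded: tuple = ()   # excluded databases/schemas


def find_dictionary_id(mappings, host: str, port: int) -> Optional[str]:
    """Return the dictionary ID recorded for a host/port pair, or None.

    Case-insensitive host comparison is an assumption of this sketch.
    """
    for m in mappings:
        if m.host.lower() == host.lower() and m.port == port:
            return m.dictionary_id
    return None


# Hypothetical sample data for two configured servers.
mappings = [
    DictionaryMapping("pg_sales", "postgresql", "db1.example.com", 5432,
                      included=("sales",), excluded=("sales_tmp",)),
    DictionaryMapping("rs_dwh", "redshift", "dwh.example.com", 5439),
]
```

In this sketch, calling `find_dictionary_id(mappings, "db1.example.com", 5432)` resolves a connection to the dictionary ID under which its extracted metadata would be filed.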
Analysis Phase
For the analysis phase for PostgreSQL database servers, there are three scenarios.
- PostgreSQL dictionary dataflow scenario: analyzes metadata from the extracted PostgreSQL database dictionaries and saves it in your Automatic Data Lineage metadata repository
- PostgreSQL DDL dataflow scenario: harvests metadata and lineage from the extracted PostgreSQL DDL scripts and saves it in your Automatic Data Lineage metadata repository
- PostgreSQL PL/pgSQL dataflow scenario: harvests metadata and lineage from the provided PostgreSQL PL/pgSQL scripts and saves it in your Automatic Data Lineage metadata repository