PostgreSQL Resource Configuration

Before you configure your scanner, make sure you meet the prerequisites. Read our guide on PostgreSQL integration requirements to double-check.

Source System Properties

This configuration can be setup by creating a new connection on Admin UI /Connections tab or editing an existing connection in Admin UI / Connections / Databases / PostgreSQL / specific connection. New connection can also be created via Manta Orchestration API.

Granularity of the IBM Automatic Data Lineage connection for PostgreSQL is one PostgreSQL, Greenplum, Yellowbrick, Redshift, Amazon RDS, or Amazon Aurora for PostgreSQL server. Use filter on databases and schemas to limit the scope of analysis as needed. Use of multiple connections against a single PostgreSQL, Greenplum, Yellowbrick, Redshift, Amazon RDS, or Amazon Aurora for PostgreSQL server may lead to within-system lineage not be connected properly.

Property name	Description	Example
postgresql.dictionary.id	Name of a resource representing this PostgreSQL server known as the dictionary ID, used as an output subdirectory name for extracted DDL files and the database dictionary	dwh
postgresql.subdialect	The type of server—use `POSTGRESQL` for PostgreSQL, Greenplum, Yellowbrick, Amazon RDS, or Amazon Aurora for PostgreSQL; use `REDSHIFT` for Redshift	POSTGRESQL
postgresql.url	JDBC connection string for the database server PgBouncer is a PostgreSQL connection pooler which maintains a pool of connections that database transactions share. If PgBouncer is used by PostgreSQL and the pooling method is set to transaction pooling, `?prepareThreshold=0` must be added to the end of the connection string. See Prepared Statement 'X' Already Exists" in the PostgreSQL Extractor Log.	jdbc:postgresql://<host>:<port>/<database>?prepareThreshold=0
postgresql.username	User name for connection to the database server	manta_user
postgresql.password	User password or authentication token for connection to the database server. Use the authentication token if you are creating a connection to an Amazon RDS or Amazon Aurora for PostgreSQL database instance that uses the “Password and IAM database authentication“ authentication type.	manta_password
postgresql.extractedDbsSchemas	List of databases and schemas to extract, separated by commas, which are to be provided in the format database/schema; each part is evaluated as a regular expression. The filter is case-insensitive; it is not possible to configure case sensitivity in the filter at the moment.	database1/sch1,database2/sch2,database3
postgresql.excludedDbsSchemas	List of databases and schemas to exclude from extraction, separated by commas The filter is case-insensitive; it is not possible to configure case sensitivity in the filter at the moment.	database3/sch3,database3/sch4
postgresql.extraction.method	Set to Agent:default when the desired extraction method is the default Manta Extractor Agent, set to Agent:{remote_agent_name} when a remote Agent is the desired extraction method, set to Git:{git.dictionary.id} when the Git ingest method is the desired extraction method. For more information on setting up a remote extractor Agent refer to the Manta Flow Agent Configuration for Extraction documentation. For additional details on configuring a Git ingest method, refer to the Manta Flow Agent Configuration for Extraction:Git Source documentation.	default Git agent
postgresql.ddl.encoding	Encoding of automatically extracted DDL scripts. See Encodings for applicable values.	utf8
postgresql.script.encoding	Encoding of manually provided PgSQL scripts performed on this database server.See Encodings for applicable values.	utf8
postgresql.extractor.authentication.type (available as of R42.14)	Authentication type used to connect to the database instance. Default: BASE. Available only for the Redshift subdialect (`postgresql.subdialect property`)	Base
redshift.credentials.accessKeyId (available as of R42.14)	IAM Access Key ID representing the entity that connects to the Redshift instance.	AKIAIOSFODNN7EXAMPLE
redshift.credentials.secretAccessKey (available as of R42.14)	Secret Access Key for the entity represented by the paired Access Key ID (`redshift.credentials.accessKeyId`)	wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY

Common Scanner Properties

This configuration is common for all PostgreSQL source systems and for all PostgreSQL scenarios, and is configure in Admin UI / Configuration / CLI / PostgreSQL / PostgreSQL Common. It can be overridden on individual connection level.

Property name	Description	Example
postgresql.dictionary.dir	Directory with data dictionaries extracted from PostgreSQL	${manta.dir.temp}/postgresql
filepath.lowercase	Whether paths to files should be lowercase (false for case-sensitive file systems, true otherwise)	false true
postgresql.dll.output	Directory for automatically extracted PostgreSQL DDL scripts (for the extraction phase)	${manta.dir.temp}/postgresql/${postgresql.dictionary.id}/ddl
postgresql.ddl.input	Directory with automatically extracted PostgreSQL DDL scripts (for the analysis phase)	${postgresql.dll.output}
postgresql.script.input	Directory with manually provided PgSQL scripts which are performed on a given database server (for the analysis phase)	${manta.dir.input}/postgresql/${postgresql.dictionary.id}/
postgresql.script.replace	Path to the CSV file with the replacements to be applied to the provided PL/pgSQL scripts; see Placeholder Replacement in Input Scripts for details about the replacement file format	${manta.dir.input}/postgresql/${postgresql.dictionary.id}/replace.csv
postgresql.script.replace.regex	Flag specifying whether replacements for PL/pgSQL scripts in the provided CSV file specified in `postgresql.script.replace` should be interpreted as regular expressions (true) or simple text (false)	false true
postgresql.analyze.parallelCount	Number of parallel threads which will analyze DDL and PgSQL scripts	4
postgresql.dictionary.mappingFile	Path to automatically generated mappings for PostgreSQL servers	${manta.dir.temp}/postgresql/postgresqlDictionaryMantaMapping.csv
postgresql.dictionary.mappingManualFile	Path to mappings provided manually for PostgreSQL servers	${manta.dir.scenario}/conf/postgresqlDictionaryMantaMappingManual.csv
postgresql.defaultPath	Schema lookup path used by PostgreSQL servers; leave blank for the default value	stage,core
postgresql.standard_conforming_strings	Controls whether ordinary string literals ('...') treat backslashes literally, as specified in the SQL standard. Beginning in PostgreSQL 9.1, the default value of STANDARD_CONFORMING_STRINGS is ON (prior releases defaulted to OFF). (ON => true, OFF => false).	true false
postgresql.expressionDescriptions.enabled.ddl	Enables or disables building transformation logic descriptions from SQL code in DDL scripts	false true
postgresql.expressionDescriptions.enabled.plpgsql	Enables or disables building transformation logic descriptions from SQL code in PL/pgSQL scripts	false true
postgresql.expressionDescriptions.enabled.queryService	Enables or disables building transformation logic descriptions from PostgreSQL SQL code defined in other technologies	false true
postgresql.extractor.mode	Chooses which extractor implementation to use, bulk extraction mode. Bulk extraction mode should be used for the Greenplum database.	sequential parallel bulk
postgresql.parallelExtractor.numThreads	Number of threads to run DDL loading in the parallel extractor; only useful if postgresql.extractor.mode=parallel	8
postgresql.parallelExtractor.bufferSize	Number of DDL queries to pre-load; only useful if postgresql.extractor.mode=parallel	100
postgresql.bulk.numThreads	Number of parallel threads that will extract the metadata; only useful if postgresql.extractor.mode=bulk	4
postgresql.parallelExtractor.persistenceFlushCount	Number of dictionary persistence calls that will force an auto-flush; only useful if postgresql.extractor.mode=parallel	1000
postgresql.extractor.ssl.enabled	Use SSL-protected connection	true false
postgresql.extractor.ssl.mode	SSL mode for the connection	disable allow prefer require verify-ca verify-full
postgresql.analyze.retainUnusedResultSetColumns	Flag specifying whether the data lineage should include sub-query resultset columns that do not have any downstream lineage By default, set to false	false true

Manual Mapping Properties

It is possible to manually configure mappings for PostgreSQL servers. This is necessary for non-extracted schemas in the Automatic Data Lineage extraction phase and complicated usage of database links.

Each mapping has its own row with the following parts separated by semicolons.

Property name	Description	Example
Dictionary ID	Name of a resource representing this PostgreSQL server known as the dictionary ID, used as an output subdirectory name for extracted DDL files and the database dictionary	dwh
Sub-dialect	The type of server—use `POSTGRESQL` for PostgreSQL, Greenplum, Yellowbrick, Amazon RDS, or Amazon Aurora for PostgreSQL; use `REDSHIFT` for Redshift	POSTGRESQL
Hostname / Endpoint	Name of the host where the database is running, or endpoint name for Redshift	localhost 127.0.0.1
Port	Port on which the database listens	5432
Connection ID	External PostgreSQL connection ID in third-party tools; can be left empty if it is the same as the host name	dwh
Included databases/schemas	List of databases and schemas to extract, separated by commas, which are to be provided in the format database/schema; each part is evaluated as a regular expression	database1/sch1,database2/sch2,database3
Excluded databases/schemas	List of databases and schemas to exclude from extraction, separated by commas	database3/sch3,database3/sch4