IBM Netezza Performance Server lineage configuration

Connect to Netezza Performance Server to scan metadata and display it on lineage.

This information applies to IBM Manta Data Lineage service.

Supported Netezza Performance Server versions

IBM Netezza appliance software 11.2.2.10 and higher

Processed metadata

The following Netezza Performance Server metadata is processed and displayed on lineage:

Database Objects:
- Databases - database names and schemas
- Schemas - schema names and default schemas
- Tables - regular tables and external tables with columns
- Views - regular views and materialized views with definitions
- Procedures - stored procedures with arguments, body, and return types (including obfuscated procedures)
- Functions - user-defined functions (UDFs), aggregate functions, and table functions with arguments and return types
- Sequences – sequence objects with data types, minimum and maximum values, and step increments
- Synonyms - synonyms referencing other database objects
Column-Level Metadata:
- Column names
- Data types
- Descriptions
- NOT NULL constraints
- Default values
Constraints (Extended Attributes):
- Primary keys - primary key constraints on columns
- Unique constraints - unique constraints on columns
- Foreign keys - foreign key relationships between tables including:
- Constrained database, schema, and column
- Referenced database, schema, table, and column
SQL Processing
- DDL scripts - CREATE statements for tables, views, materialized views, procedures, sequences, and synonyms
- View definitions - SQL definitions for views and materialized views
- Procedure bodies – the code of stored procedures (both visible and obfuscated)
- SQL extensions - support for Netezza SQL extensions
Additional metadata:
- Schema mode - multi-schema or single-schema configuration
- External tables - external table definitions with source file references
- Function types - distinction between regular functions, aggregate functions, and table functions
- Procedure overloading - support for multiple procedures/functions with same name but different arguments
- Dependencies - object dependencies and references between database objects
Configuration options:
- Include/exclude filters for databases and schemas
- Extended attributes extraction (constraints)
- DDL extraction toggle
- SQL extensions support

Limitations

The following limitations apply:

The following data is not processed:
- Dynamic code that is executed by using the EXECUTE IMMEDIATE statement
- External tables
- Synonyms for built-in routines and functions from Netezza Extension toolkit
Synonyms are included in the data lineage visualization only if they are referenced in the SQL queries, procedures, or functions.
When synonym of the table is used within the SELECT statement, attributes of the synonym must be specified with the respective schema, otherwise such attributes are marked as deduced. For example, instead of SELECT syn_name.col_name ..., use SELECT schema_name.syn_name.col_name ....
In the single-schema mode, only the default schema is processed. When you create the metadata import, provide the default schema name in the scope of the import in the include list. To find the default schema name, run the SELECT database, defschema FROM _v_database; query.
- When you use single-schema mode with SQL Extensions, install the SQL Extensions in the default schema and specify the corresponding database and default schema names in the SQL Extensions Schema advanced property.

Prerequisite configuration

Before you import lineage metadata, ensure that you have the following rights on each extracted database:

LIST to access the database
SELECT on the following system tables or views. These rights are granted to every user by default.
- _t_environ with value set for HOSTNAME or manually defined User-set host name in the connection form
- _v_database
- _v_schema
- _v_table
- _v_extobject
- _v_view
- _v_relation_column
- _v_relation_keydata
- _v_procedure
- _v_function
- _v_aggregate
- _v_sequence
- _v_synonym
In each extracted schema:
- LIST to access the schema
- SELECT on the following management table. You must grant this right explicitly.
  - _vt_sequence
- LIST on all relevant object types:
  - GRANT LIST ON <db>.<schema>.SEQUENCE TO <manta user>;
  - GRANT LIST ON <db>.<schema>.SYNONYM TO <manta user>;
  - GRANT LIST ON <db>.<schema>.TABLE TO <manta user>;
  - GRANT LIST ON <db>.<schema>.EXTERNAL TABLE TO <manta user>;
  - GRANT LIST ON <db>.<schema>.FUNCTION TO <manta user>;
  - GRANT LIST ON <db>.<schema>.AGGREGATE TO <manta user>;
  - GRANT LIST ON <db>.<schema>.PROCEDURE TO <manta user>;
  - GRANT LIST ON <db>.<schema>.VIEW TO <manta user>;
  - GRANT LIST ON <db>.<schema>.MATERIALIZED VIEW TO <manta user>;

Creating a metadata import asset

Data source connection

To connect to the data source from which you want to import lineage metadata, you need to select a data source definition and a connection. You can create them before you start creating the metadata import, or you can create them when you create and configure the metadata import asset.

Data source definition

Select IBM Netezza Performance Server as the data source type.

Connection

Database names are case-sensitive. When you specify a database to connect to, you must provide the exact name, including letter case. If you are unsure of the correct case, specify the full database name in uppercase, which is the default in Netezza Performance Server. After the import completes, verify that the data was imported correctly.

For connection details, see Netezza Performance Server connection.

Include and exclude lists

You can include or exclude assets up to the schema level. Provide databases and schemas in the format database/schema. Each part is evaluated as a regular expression. Assets which are added later in the data source will also be included or excluded if they match the conditions specified in the lists. Example values:

myDB/: all schemas in myDB database.
myDB2/.*: all schemas in myDB2 database.
myDB3/mySchema1: mySchema1 schema from myDB3 database.
myDB4/mySchema[1-5]: any schema in my myDB4 database with a name that starts with mySchema and ends with a digit between 1 and 5.

External inputs

You can add external NZPLSQL scripts scripts for Netezza Performance Server in a .zip file as external inputs. You can organize the structure of a .zip file as subfolders that represent databases and schemas. After the scripts are scanned, they are added under respective databases and schemas in the selected catalog or project. The .zip file can have the following structure:

    <database_name>
       - <schema_name>
          - <script_name.sql>
    <database_name>
       - <script_name.sql>
    <script_name.sql>
    replace.csv

The replace.csv file contains placeholder replacements for the scripts that are added in the .zip file. For more information about the format, see Placeholder replacements.

External input properties

Encoding: Specify an encoding of NZPLSQL scripts:

Default Value: UTF-8
Example Value: UTF-8

Replacements support: External inputs allow variable replacement using the replace.csv file.

Advanced import options

Extract extended attributes

You can extract extended attributes like primary key, unique and referential integrity constraints of columns. By default these attributes are not extracted.

SQL Extensions installed

If you installed the SQL Extensions toolkit, set this property to true.

SQL Extensions database

If you installed the SQL Extensions toolkit, provide the database name where the toolkit is installed.

SQL Extensions schema

If you installed the SQL Extensions toolkit, provide the schema name where the toolkit is installed.

Performance profile

For selected data sources you can choose a performance profile. Depending on your current needs, the lineage metadata import might be faster or more complete. You can choose between the following profiles:

Fast: Low time and memory consumption are the priorities in this profile. If your input is large, lineage might not be complete.
Balanced: Both performance and lineage completeness are important. It is a compromise between the lineage compactness and time and memory that is spent on lineage import.
Complete: The completness for lineage is the priority in this profile. If your input is large, the lineage import might take a significant amount of resources and time.
Custom profile: You can create your own performance profile by providing values for the following properties:
- Dataflow Analysis Timeout Limit: Specifies the maximum estimated time (in seconds) after which the dataflow analysis of a single input is stopped. The time is checked when each node is added, or in some cases when edges are created. Therefore, in some cases, the timeout might slightly exceed the specified limit. If you set the value to 0, the analysis is not stopped. Example value: 60.
- Dataflow Analysis Edge Limit: Specifies the maximum number of edges that are allowed for a single input during the dataflow analysis. If this limit is exceeded, all filter edges are removed and no more filter edges are added. If the limit is still exceeded even after that, the analysis is stopped and the input fails. To disable the limit, set the value to 0. Example value: 2500.