How IBM Automatic Data Lineage Handles Database Temporary Tables
Overview
A temporary table is a table that holds data only for the duration of a session or transaction.
Temporary tables are often used in conjunction with ETL pipelines as temporary staging areas for a variety of purposes. This can be a great mechanism to store and transform data en route to its target. A temporary table exists solely to store data in the context of a session and can be global or local in nature. While Automatic Data Lineage will process temporary tables, there are some nuances that this article aims to address and clarify.
Automatic Data Lineage and Temporary Tables
There are a number of flavors of temporary tables that vary across different database platforms; that is, not all platforms support all flavors, and they may be called by different names. The most common flavors are global and local temporary tables. These temporary tables can be implemented in a number of contexts. The flavor of the temporary table will dictate how/if it is extracted and analyzed using Automatic Data Lineage. Here are some steps that are generally recommended to perform this action.
-
For MS SQL Server, explicitly extract and analyze the
tempdb
. This can capture any global temporary tables that are still persisted in the database. -
Use an identical user for both your Automatic Data Lineage database connection and your production ETL connection. This can allow for the capture of local temporary database objects if they weren’t deleted prior to the session.
-
Allow Automatic Data Lineage to perform deduction when the database object is created in another technology connecting to the database, as it won't exist in the database and exists only for the single script context.
-
Consider migrating architecture patterns from a local to a global scope in extreme cases where local temporary tables are not being extracted.
-
Especially for the session-level or local temp tables, it may happen that the table does not exist in the context under which Automatic Data Lineage runs the extraction; for example, local temporary tables in MS SQL. This means that Automatic Data Lineage does not understand the table structure and any use of such a temp table will suffer from limitations of Deduction: How Manta Handles Unresolved References to Database Objects.
The following sections discuss the specifics of temporary tables in selected technologies.
SQL Server
Table type |
Comments |
Syntax |
---|---|---|
Table variables |
Are visible only to the connection that creates it, and are deleted when the batch/stored procedure ends |
|
Local temporary tables |
Local temporary tables are only available to the SQL Server connection (meaning a single user). Local temporary tables are deleted/dropped when the user disconnects from the SQL Server instance. |
|
Global temporary tables |
Global temporary tables are visible to any user and any connection after they are created. They are deleted when all users that are referencing the table disconnect from the SQL Server instance. |
|
Tempdb permanent tables |
Are visible to everyone and are deleted when the server is restarted |
Oracle
Similar to SQL Server except that they are statically defined. You create a temp table once per database instead of once per stored procedure.
Table type |
Comments |
Syntax |
---|---|---|
Global temporary tables |
Permanent database objects that store data on a disk and are visible to all sessions. The data stored in the global temporary table is private to the session. |