SAS Manual Inputs
To analyze SAS files:
-
Copy all SAS code files to the
<MANTA_DIR_HOME>/input/sas/${sas.system.id}
folder, wheresas.system.id
is the identification of the project (if not changed in the configuration). -
As of R40, copy all SAS metadata collection for predefined library table definitions into
<MANTA_DIR_HOME>/input/sas/${sas.system.id}/${sas.extractor.input}
This is required when datasets are predefined or defined in different scripts than where they are used. See the section below on how to obtain those.
SAS Metadata Collection for Predefined Library Tables
Problem Statement
Permanent SAS datasets are often used in SAS code by defining the
libname
statement in the code. However, those permanent datasets may have been created in a different SAS code. For example:
create_member.sas
libname stg1 "/opt/sas/data/stage1";
data stg1.member;
member_id= 12345;
member_name= "John Smith";
member_address= "1234, Manta Lane, Tampa, FL, 33634";
member_phone= "999-999-9999";
run;
use_member.sas
libname stg1 "/opt/sas/data/stage1";
data member1;
set stg1.member;
run;
proc sql noprint;
create table member2 as
select member_id, member_name
from member1;
quit;
In cases where these datasets are used to build subsequent datasets (like in the example for use_member.sas
above), Manta does not have the metadata for the source table, so the source table is deduced based on the query used to build
the target table. This sometimes leads to incorrect or missing lineage.
And you will see the log message below in the sasDataflowScenario log file.
2023-01-29 09:43:37.854 [pool-2-thread-2] 2 INFO eu.profinit.manta.connector.common.resolver.ReferenceResolver.DEDUCTION [Context: \library_table_use.sas] Resolving reference <4,5> AST_QUALIFIED_NAME using deduction to DB_TABLE: ResUnknownObject [currentGuess=Table [name=MEMBER, properties=[DB_TABLE, PROP_RESULTSET], data.id=null, id=2124004797, definition=4,10] (data type: ResUnknownDataType [currentGuess=AbstractDataType [name=MEMBER, properties=[DB_TABLE_TYPE, PROP_RESULTSET], data.id=null, id=1886977149, definition=4,10], properties=[DB_TABLE_TYPE, PROP_UNKNOWN, PROP_RESULTSET, PROP_PROXY], position=4,10]), properties=[DB_TABLE, PROP_UNKNOWN, PROP_RESULTSET, PROP_PROXY], position=4,10]
Notice the part
... using deduction to DB_TABLE: ResUnknownObject [currentGuess=Table [name=MEMBER ...
.
Solution
To circumvent this issue, provide the table metadata to Manta. As of R40, use SAS scanner to provide SAS scripts to build a “database/dataset” dictionary that will be used while processing actual SAS code with data transformations.
Expected Result:
Obtaining SAS Dictionary Scripts
Use the SAS script generate_metadata_table.sas
below to generate a SAS code with DDL statements for all the tables from the SAS libraries, and include this SAS file along with your other SAS files for analysis using the replace.csv
file.
Follow these steps.
-
Create a
libnames.txt
file with all the library definitions that the metadata needs to be extracted from. Here is an examplelibnames.txt
file.libname dds '/opt/sas/data/dds'; libname dds1 '/opt/sas/data/dds1'; libname stg1 '/opt/sas/data/stg1'; libname stg2 '/opt/sas/data/stg2';
-
Download the SAS code attached above to generate DDL for all the tables from all SAS permanent libraries.
-
Place
libnames.txt
in the/tmp
directory. If you would like to use a different location, modify line 21 ingenerate_metadata_table.sas
.%let libloc= /tmp; /*** path where libnames.txt file will be saved *****/
It will produce all the output
lib_meta_"library".sas
files in the/tmp
location. You can modify the section of the code below if you want to create the files at a different location on line 22.%let outfile= /tmp; /*** Path where the output sas code files will be generated ****/
-
The total number of files generated after the SAS code execution should be the same as the number of entries in the
libnames.txt
file. Here is an example forlib_meta_"library".sas
(e.g.,lib_meta_stg1.sas
).libname DDS '/tmp/dds'; /* SAS data step DDL for DDS.A2_ADDRESS_WITH_LABLES; */ data DDS.A2_ADDRESS_WITH_LABLES; attrib ADDRESS_END_DATE length=$8. format=22.3 label="Last date of Address validity" ADDRESS_RK length=8 format=12. label="ADDRESS_RK" ADDRESS_START_DATE length=$8. format=22.3 label="First day of address validity" ADDRESS_STATE_CODE length=3 format=$3. label="ADDRESS_STATE_CODE" ADDRESS_STATE_NAME length=50 format=$50. label="ADDRESS_STATE_NAME" ADDRESS_STREET length=100 format=$100. label="ADDRESS_STREET" ADDRESS_TOWN length=50 format=$50. label="ADDRESS_TOWN" ADDRESS_TYPE_CODE length=3 format=$3. label="Adress Type Code (HOM=Home, WRK=Work)" ADDRESS_TYPE_TEXT length=50 format=$50. label="Address Type (Home, Work)" ADDRESS_ZIP length=10 format=$10. label="ZIP code" ARRIDX_POPULATION_REGISTRY_IF23 length=8 format=13. label="ARRIDX_POPULATION_REGISTRY_IF23" IDS_IF23_ADDRESS length=8 format=22. label="IDS_IF23_ADDRESS" POPULATION_REGISTRY_RK length=8 format=14. label="POPULATION_REGISTRY_RK" PROCESSED_DTTM length=8 format=20. label="PROCESSED_DTTM" SOURCE_SYSTEM_CD length=3 format=$3. label="SOURCE_SYSTEM_CD" SYS_CREATEDATE length=8 format=9. label="SYS_CREATEDATE" VALID_FROM_DTTM length=8 format=20. label="VALID_FROM_DTTM" VALID_TO_DTTM length=8 format=20. label="VALID_TO_DTTM" ; stop; run; /* SAS data step DDL for DDS.TEST1; */ data DDS.TEST1; attrib col1 length=8 format=8. col2 length=3 format=3. col3 length=4 format=4. ; stop; run;
-
Copy these files to the Manta server
<MANTA_DIR_HOME>/input/sas/${sas.system.id}/${sas.extractor.input}
folder. -
In the Admin UI for the SAS connection, include the extractor.
-
Execute the workflow.