Commands for managing analysis tasks

You can use commands to manage analysis tasks, for example to get the status of an analysis task or to cancel it.

The basic syntax of a command to manage tasks is:
IAAdmin -user user_name -password password -url https://host:port option -[suboption]
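The syntax above can be assembled programmatically. The following is a minimal sketch in Python, not part of the product: the function name iaadmin_args and the executable parameter are hypothetical, and the sketch only builds the argument list without running the command.

```python
def iaadmin_args(user, password, url, option, suboptions=(), executable="IAAdmin"):
    """Build the IAAdmin argument list in the order shown in the basic syntax.

    'executable' is an assumption: the launcher name can differ per platform.
    """
    args = [executable, "-user", user, "-password", password, "-url", url, option]
    # Suboptions (for example "-scheduleID" and its value) follow the option.
    args.extend(suboptions)
    return args


# Example: the argument list for a -getStatus call with a placeholder schedule ID.
args = iaadmin_args(
    "isadmin", "isadmin", "https://host:9443",
    "-getStatus", ["-scheduleID", "taskScheduleID"],
)
print(" ".join(args))
```

A list of arguments, rather than one concatenated string, can be passed directly to a process launcher such as Python's subprocess.run without shell quoting issues.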

Basic parameters

Table 1. Basic parameters
Parameter Description
-user user_name Name of the user (required)
-password password Password to use (required)
-help or -h Displays list of available IAAdmin parameters
-v Verbose. If specified, success or error codes are displayed on the console or are sent to the specified file. The success code is 200. Error code 400 indicates that the request was bad. Error code 500 indicates that there was a server error. Consult the application server log files for more details.
-url https://host:port Host name and port number of the server (required)
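The codes that the -v parameter reports can be mapped to their documented meanings in a script. This is a minimal sketch in Python; the names CODE_MEANINGS and describe_code are hypothetical, and because the exact layout of the verbose output is not shown here, the sketch assumes that the numeric code has already been extracted.

```python
# Meanings of the codes listed for the -v parameter.
CODE_MEANINGS = {
    200: "success",
    400: "bad request",
    500: "server error (consult the application server log files)",
}


def describe_code(code):
    """Return the documented meaning of a code, or a fallback for other values."""
    return CODE_MEANINGS.get(code, "undocumented code")


print(describe_code(400))
```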

-batchAnalysis option

The -batchAnalysis command runs an analysis on a large number of data sets without overloading the system. Without -batchAnalysis, all analysis jobs are queued by the workload manager, and it takes a while before the first job starts running. With -batchAnalysis, the large payload is split into one request per two tables (by default). The analysis of the first job starts quickly because the system is busy preparing only the first few jobs for the first tables, and it prepares the next jobs while the first job runs.

The command takes a text file (.txt) that contains a list of the tables or files that you want to analyze, and it runs the column and data quality analyses through batch processing. The text file must be formatted correctly for the analyses to run.
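The idea of splitting a large payload into small requests can be illustrated with a short Python sketch. It only demonstrates the grouping and is not the product's implementation; the function name split_into_requests and the table names are hypothetical.

```python
def split_into_requests(names, tables_per_request=2):
    """Group table or file names into consecutive requests of a fixed size.

    The default of 2 mirrors the documented default of one request per two tables.
    """
    if tables_per_request < 1:
        raise ValueError("tables_per_request must be at least 1")
    return [
        names[start:start + tables_per_request]
        for start in range(0, len(names), tables_per_request)
    ]


# Five hypothetical tables produce three requests; the last one holds the remainder.
requests = split_into_requests(["T1", "T2", "T3", "T4", "T5"])
print(requests)
```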

Table 2. -batchAnalysis option
Option Description
-projectName The name of the InfoSphere® Information Analyzer project that you want to work with.
-content The name of the text file that contains a list of tables or files that you want to run a column analysis or data quality analysis for. The text file must be in the following format:
// Use the code below to run a column analysis on the following tables.
// If you only want to run a column analysis, you can omit the @ColumnAnalysis annotation
// and provide only a list of tables or files.
@ColumnAnalysis
<Table_Name1>
<Table_Name2>
<Table_NameN>
@SampleOptions
<SampleType[random|sequential|every_nth]>,<Size>,[<Percent>],[<Seed>],[<Nth_Value>]

// Use the code below to run a data quality analysis for the following tables. 
// Note: This option also runs a column analysis for files if a column
// analysis has not been run yet.
// 'SampleOptions' is an optional parameter.
@DataQualityAnalysis
<Table_Name1>
<Table_Name2>
<Table_Name3>
<Table_NameN>
@SampleOptions
<SampleType[random|sequential|every_nth]>,<Size>,[<Percent>],[<Seed>],[<Nth_Value>]
All table and file names must be fully qualified. All files must have a header, be comma delimited, and have line feed terminators. Names must be in the following format:
  • File name: <host>:<connection_name>:<folder_path>:<file_name>
  • Table name: <host>.<database_name>.<schema_name>.<table_name>
If you do not know the name of the data connection, you can specify the asterisk symbol (*).

You can specify the following sample types:
  • Random: This sample type requires the variables -sampleSize, -seed, and -percent.
  • Every nth: This sample type requires the variables -sampleSize and -interval.
  • Sequential: This sample type requires the variable -sampleSize.
Specify the @ColumnAnalysis option if you only want a column analysis run on particular tables. Specify @DataQualityAnalysis if you want to run a column analysis and a data quality analysis on particular tables. If you do not specify an option, a column analysis is run.
The following example shows how to specify a column analysis and a data quality analysis for the JK_BANK2 and JK_BANK1 tables. The RunBothCAandDQLWithSample.txt file contains the following parameters:
@ColumnAnalysis
JKLW_HS.JKLW.JK_BANK2.*
JKLW_HS.JKLW.JK_BANK1.*

@SampleOptions
every_nth,2000,2


@DataQualityAnalysis
JKLW_HS.JKLW.JK_BANK2.*
JKLW_HS.JKLW.JK_BANK1.*

@SampleOptions
every_nth,2000,2
The following example shows how to specify a data quality analysis for the customers.csv file. The text file RunDQForCustomerFile.txt contains the following parameters:
@DataQualityAnalysis
cvm140-rh7:*:/iaqa/IAPRD:customers.csv

@SampleOptions
every_nth,2000,2
[-registerIfRequired] Specify this parameter if you need to register tables or files to the workspace that is specified with the -projectName option.
[-nbOfConcurrentTables <number_of_tables>] Specify this parameter if you want to analyze multiple tables or files at the same time. It controls how many requests are processed concurrently. If you do not specify the number of tables, the default setting is 2. If you do not specify this parameter, tables and files are run sequentially (one at a time). If you set the value too high, the workload manager limits the number of jobs that are executed. If you set it too low, the system is used less efficiently, because it waits until the first jobs are processed before it prepares the next job, even when it has the capacity to run multiple jobs at the same time.
[-noDataClasses] Specify this parameter if you do not want to include data classes when running the analyses.
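The content file format described for the -content option can be generated rather than typed by hand. The following Python sketch is one way to do that, under stated assumptions: all function names are hypothetical, and optional sample fields that do not apply to a sample type are left out of the line, which is inferred from the every_nth,2000,2 example rather than from a formal specification.

```python
def table_name(host, database, schema, table):
    """Fully qualified table name: <host>.<database_name>.<schema_name>.<table_name>."""
    return ".".join([host, database, schema, table])


def file_name(host, folder_path, name, connection="*"):
    """Fully qualified file name; '*' stands for an unknown data connection."""
    return ":".join([host, connection, folder_path, name])


def sample_options_line(sample_type, size, percent=None, seed=None, nth_value=None):
    """Build one @SampleOptions line, checking the values each sample type requires."""
    if sample_type == "random":
        if percent is None or seed is None:
            raise ValueError("random sampling requires percent and seed")
        fields = [size, percent, seed]
    elif sample_type == "every_nth":
        if nth_value is None:
            raise ValueError("every_nth sampling requires nth_value")
        fields = [size, nth_value]
    elif sample_type == "sequential":
        fields = [size]
    else:
        raise ValueError("unknown sample type: " + sample_type)
    return ",".join([sample_type] + [str(field) for field in fields])


def content_text(analysis, names, sample_line=None):
    """Return the text of one annotated section, for example 'DataQualityAnalysis'."""
    lines = ["@" + analysis] + list(names)
    if sample_line is not None:
        lines += ["", "@SampleOptions", sample_line]
    return "\n".join(lines) + "\n"


# Reproduces the contents of the RunDQForCustomerFile.txt example.
text = content_text(
    "DataQualityAnalysis",
    [file_name("cvm140-rh7", "/iaqa/IAPRD", "customers.csv")],
    sample_options_line("every_nth", 2000, nth_value=2),
)
print(text)
```

Writing the returned text to a .txt file gives a file that can be passed to -content.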

-generateXML option

The command -generateXML takes a text file (.txt) that contains a list of tables that you want to analyze and generates the corresponding XML. You can use the XML to run a column or data quality analysis by using the -runTasks command. The text file must be formatted correctly in order for the XML to be generated. You can use this command if you want to generate XML in order to run other tasks from the command line interface.

Table 3. -generateXML option
Option Description
-projectName The name of the InfoSphere Information Analyzer project that you want to work with.
-content The name of the text file that you want to generate XML content for. The text file must be in the following format:
// Use the code below to generate XML for the following tables.  
@ColumnAnalysis
<Table_Name1>
<Table_Name2>
<Table_NameN>
@SampleOptions
<SampleType[random|sequential|every_nth]>,<Size>,[<Percent>],[<Seed>],[<Nth_Value>]

// Use the code below to generate XML for the following tables. 
// 'SampleOptions' is an optional parameter.
@DataQualityAnalysis
<Table_Name1>
<Table_Name2>
<Table_Name3>
<Table_NameN>
@SampleOptions
<SampleType[random|sequential|every_nth]>,<Size>,[<Percent>],[<Seed>],[<Nth_Value>]
For example, the RunBothCAandDQLWithSample.txt file contains the following parameters:
@ColumnAnalysis
JKLW_HS.JKLW.JK_BANK2.*
JKLW_HS.JKLW.JK_BANK1.*

@SampleOptions
every_nth,2000,2


@DataQualityAnalysis
JKLW_HS.JKLW.JK_BANK2.*
JKLW_HS.JKLW.JK_BANK1.*

@SampleOptions
every_nth,2000,2
[-noDataClasses] Specify this parameter if you do not want to include data classes when generating the XML.

-runTasks option

The command -runTasks runs analyses. For example, it runs column analyses, key analyses, rules, or metric analyses.

Table 4. -runTasks option
Option Description
-content inputXMLFile Path of the XML file that specifies the analysis to be run.

-cancelTask and -getStatus options

The scheduleID is a unique identifier that is used for running an analysis. You get the schedule ID from the return value of the -runTasks command.

The -cancelTask command cancels or aborts a running analysis, and the -getStatus command gets the status of an analysis that was previously submitted with -runTasks.

Table 5. -cancelTask and -getStatus options
Option Description
-scheduleID taskScheduleID Specifies the schedule ID of the task.

Examples:

The following command generates XML from a text file:
IAAdmin -user isadmin -password isadmin -url https://host:9443 -generateXML -projectName IAProject1 -content tables.txt 
The following command runs the tasks that are specified in the input XML file:
IAAdmin -user isadmin -password isadmin -url https://host:9443 -runTasks -content xmlFile
The response for runTasks:
ScheduledTaskId	TaskType
d70c6594.80cb2b5c.000kocnf1.4iavtdr.3mqnmp.0sov0mba9tcso94i7nia7 ColumnAnalysis
The following command gets the status of the task with the given scheduleID:
IAAdmin -user isadmin -password isadmin -url https://host:9443 -getStatus -scheduleID d70c6594.80cb2b5c.000kocnf1.4iavtdr.3mqnmp.0sov0mba9tcso94i7nia7
The response for getStatus:
ExecutionId = d70c6594.80cb2b5c.000kocnf1.4iavtdr.3mqnmp.0sov0mba9tcso94i7nia7-680a0afd-9144-48f8-8093-01925b20cfb2
ExecutionDate = 2015-03-19T17:47:48+01:00
ExecutionTime = 0
Status = running
Progress = 33
Jobs:  
The following command cancels the task with the given scheduleID:
IAAdmin -user isadmin -password isadmin -url https://host:9443 -cancelTask -scheduleID d70c6594.80cb2b5c.000kocnf1.4iavtdr.3mqnmp.0sov0mba9tcso94i7nia7

The -cancelTask command returns no response.

The following command gets the status of the task again with the given scheduleID to confirm that the task was canceled:
IAAdmin -user isadmin -password isadmin -url https://host:9443 -getStatus -scheduleID d70c6594.80cb2b5c.000kocnf1.4iavtdr.3mqnmp.0sov0mba9tcso94i7nia7
The response for getStatus after cancellation:
ExecutionId = d70c6594.80cb2b5c.000kocnf1.4iavtdr.3mqnmp.0sov0mba9tcso94i7nia7-680a0afd-9144-48f8-8093-01925b20cfb2
ExecutionDate = 2015-03-19T17:47:48+01:00
ExecutionTime = 262939
Status = cancelled
Progress = 
Jobs:
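
The responses shown in these examples are plain text, so a script can extract the schedule ID and the status from them. The following Python sketch assumes that the responses have exactly the layout shown above; the function names are hypothetical, and the schedule ID in the usage lines is shortened for readability.

```python
def parse_run_tasks_response(text):
    """Return (schedule_id, task_type) pairs from a -runTasks response."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Skip the header row and any line that is not an "id type" pair.
        if len(parts) == 2 and parts[0] != "ScheduledTaskId":
            rows.append((parts[0], parts[1]))
    return rows


def parse_status_response(text):
    """Return the 'key = value' fields of a -getStatus response as a dictionary."""
    fields = {}
    for line in text.splitlines():
        key, separator, value = line.partition("=")
        if separator:  # lines without '=' (such as "Jobs:") are ignored
            fields[key.strip()] = value.strip()
    return fields


run_response = "ScheduledTaskId\tTaskType\nd70c6594.80cb2b5c ColumnAnalysis\n"
status_response = (
    "ExecutionDate = 2015-03-19T17:47:48+01:00\n"
    "ExecutionTime = 262939\n"
    "Status = cancelled\n"
    "Progress = \n"
    "Jobs:\n"
)
print(parse_run_tasks_response(run_response))
print(parse_status_response(status_response)["Status"])
```

A polling script could call -getStatus repeatedly and stop when the Status field is no longer running.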