SUBSET

Use the SUBSET statement to create a subset file from an existing archive file. The original archive file remains intact in the directory.

The subset file is an archive file that contains a subset of related rows from the source archive file and the objects needed to recreate the database tables. You can use the subset file to restore previously archived rows in a database.

There is a growing need to ensure information privacy and preserve data for audit or legal purposes. It is equally important to establish policies to dispose of information at the end of its lifecycle. Features of the Subset process enable you to meet these requirements. You can use Subset to select rows from the original archive and create a file with a retention period that is different from the source archive file. When you create subset file you can specify a Point-and-Shoot list, override variables used in the source archive file, change relationship properties and control whether a table is processed as a reference table. Additionally you can register the subset file in the archive directory, create an index for it, and add it to a collection.

If security is active at your site, you can assign security status for any files created using the SUBSET process and cataloged in the archive directory.

SUBSET 
  ARCHIVE_NAME {explicitfilename | (USEDD)}
  SUBSET_FILE(File Allocation Parameters)
    [ COPY_ARCHIVE_INFO { YES |NO }]
  [ REF_TABLE ([ cid. ] tblname1 { YES | NO })]
  [ ROWLIST {dsname| (USEDD) } ]
  [ POINT_SHOOT_ERROR { STOP | CONTINUE | SUBSET }
  [ SELECT ( [ cid. ] tblname1,coloperator,colname1,   critoperator,criteria
             [, colname2,critoperator,criteria,... ] ) ]
  [ SELECT ( [ cid. ] tblname2,coloperator,colname1, critoperator,criteria 
             [, colname2,critoperator,criteria,... ] ) ]…
  [ SQL ( [ cid. ] tblname1, { AND | OR }, sqlclause1 ) ]
  [ SQL ( [ cid. ] tblname2, { AND | OR }, sqlclause2 ) ]...
  [ CASE { YES | NO } ]
  [ INDEX_FILE (File Allocation Parameters)]
  [ DROP_INDEX_COLUMN( [ cid. ] table, column1 [,  column2, ... ] ) ]
  [ DROP_INDEX_TABLE( [ cid. ] table1 [, [ cid. ] table2,... ] ) ]
  [ DROP_INDEX_ALL{ YES | NO } ]
  [ SPARSE_INDEX( [ cid. ] table, column1 [, column2, ...] ) ]
  [ DENSE_INDEX( [ cid. ] table, column1 [, column2, ... ] ) ]
  [ VAR (varname, value)]
  [ SECURITY_STATUS { PUBLIC | READONLY | PRIVATE } ]
  [ GROUP group ]
  [ DESC desc ]
  [ RETENTION_PERIOD{ NOLIMIT | nD | nY | yyyy-mm-dd | yyyy.ddd | PERM | NEVER } ]
  [ SKIP_ARC_CATALOG  { YES | NO }]
  [ REPLACE_ARC_DIR { YES | NO }]
  [ COLLECTION (cid.collectionname [,cid.collectionname2,...])]
  [ START_TABLE [ cid. ] tblname ]
  [ REL_OVRD (TABLE [ cid.] tblname, REL relname
                  [ SELECT { YES | NO } | Q1 { YES | NO } | Q2 { YES | NO } ] ) ]]
  [ REPORT_LEVEL { DETAIL | SUMMARY } ]
  [ NOT_FOUND_RC4 { YES | NO } ]

ARCHIVE_NAME

The name of the source archive file from which the subset file will be created. ARCHIVE_NAME is required.

explicitfilename: The fully qualified name of a cataloged archive file.

(USEDD): Use the DDNAME PSDFEXTR. Only one USEDD operand is supported.

SUBSET_FILE: Name of the output file that will contain the subset of rows from the source archive file. This parameter is required. See File Allocation Parameters.

COPY_ARCHIVE_INFO

Use this parameter to create the subset file with the same attributes as the source archive file.

YES: Copy the source archive file attributes EXPIRATION_DATE, SECURITY_STATUS, GROUP, DESC and RETENTION_PERIOD and use these attributes for the subset file created by this process.
Note: EXPIRATION_DATE cannot be overridden. All of the remaining attributes can be overridden using the respective parameters.

NO: Do not copy the source archive file attributes, except for EXPIRATION_DATE which is always copied and used for the subset file. This is the default.

Processing parameters

Use the following parameters to provide processing options for the archive file.

REF_TABLE

Use this parameter to identify tables to use as reference tables in the subset process. All rows in a reference table are selected, unless selection criteria are specified for the table. By default, the reference table specifications in the Access Definition used to create the source Archive File are used. You can add reference table specifications or remove an existing reference table specification for tables. Use the REF_TABLE parameter for each table.

cid.tblname: Table name. Prefix with Creator ID, if needed to identify the table.

YES: Table is a reference table.

NO: Table is not a reference table.

ROWLIST

The name of a Point-and-Shoot file used to select data for subsetting.

Note: If you specify ROWLIST and SELECT or SQL criteria, only rows that meet the SELECT or SQL criteria are matched against the ROWLIST.

dsname: The fully qualified name of a sequential file or partitioned data set with member name.

(USEDD): Use the DDNAME PSDFPNS to specify the Point-and-Shoot file. Only one USEDD operand is supported.

POINT_SHOOT_ERROR

Indicate action to be taken when processing the Point-and-Shoot file results in errors.

STOP: Stop the run if the Point-and-Shoot file contains invalid keys or is empty. This is the default.

CONTINUE: Continue the run if invalid keys are found, but do not use the Point-and-Shoot file.

SUBSET: If invalid keys are found, continue the run with the valid keys only. If errors other than invalid keys are found, stop the run.

SELECT

Criteria used to select data. If you omit the SELECT parameter, all rows in the archive file are copied to the subset file.

Note: If you specify ROWLIST and SELECT or SQL criteria, only rows that meet the SELECT or SQL criteria are matched against the ROWLIST.

cid.tblname

The table name. Prefix with Creator ID, if needed to identify the table.

coloperator

Operator used to combine criteria for the table. One of the following must be indicated, regardless of the number of colname operands.

AND: Select data that matches criteria for all columns.
OR: Select data that matches criteria for at least one column.

colname

The name of the column. At least one colname operand, with coloperator, critoperator, and criteria, must be used.

critoperator

Operator for criteria. (See following explanation for criteria.)

criteria

Up to 250 characters of criteria per column.

If the column has a character, graphic, or binary data type (e.g., CHAR, VARCHAR, GRAPHIC, VARGRAPHIC, BINARY, VARBINARY, DATE, TIME, TIMESTAMP, or TIMESTAMP WITH TIME ZONE), the criteria must be delimited with single quotes.

If the column has a numeric data type (e.g., INTEGER, SMALLINT, BIGINT or DECIMAL), the criteria must not be in quotes.

If the archive file has dense indexes that you want to use, the criteria must be case-sensitive. See the CASE parameter for more information.

Separate criteria and critoperator values with a comma.

Note: You must leave a space after a comma that precedes a numeric value if the DB2^® setup specifies a comma as the decimal point value.

EQ, criteria
NE, criteria
GT, criteria
LT, criteria
GE, criteria
LE, criteria
IN, (a,b,c,d...)
NOT IN, (a,b,c,d...)
IS NULL
IS NOT NULL
LIKE, pattern
NOT LIKE, pattern
BETWEEN, x AND y
NOT BETWEEN, x AND y

SQL

SQL WHERE criteria used to select data from a table in the source archive file.

Note: If you specify ROWLIST and SELECT or SQL criteria, only rows that meet the SELECT or SQL criteria are matched against the ROWLIST.

cid.tblname: Table name. Prefix with Creator ID, if needed to identify the table.
AND: Select data that matches selection criteria and SQL WHERE clause for table (default).
OR: Select data that matches selection criteria or SQL WHERE clause for table.
sqlclause: Specify criteria as an SQL WHERE clause. Do not use quotes or parentheses; clause ends with statement (at close parenthesis).
Note: You must leave a space after a comma that precedes a numeric value if the DB2 setup specifies a comma as the decimal point value.

CASE

Case requirement for matching selection criteria. Omit the CASE parameter to select data without regard to case. The value for this parameter must be YES if the archive file has dense indexes that you want to use. Specifying the value NO or omitting this parameter causes any dense indexes to be ignored.

YES: Select data that match case of criteria.
NO: Select data that match criteria without regard to case. This is the default.

INDEX_FILE: Create an Index file for dense indexes. If omitted, and a dense index is to be created, an error occurs. Use the parameters shown in File Allocation Parameters

DROP_INDEX_COLUMN

Drop the index for one or more columns in a table in the Access Definition. (If the Access Definition does not have an index for the specified column, an error occurs.) You can specify multiple DROP_INDEX_COLUMN parameters, but duplicate column names cause an error.

cid.table: The name of the table. If you omit the Creator ID, the default Creator ID is used.

column: The name of one or more columns with an index to be dropped.

DROP_INDEX_TABLE

Drop all indexes for the specified tables in the Access Definition. You can specify multiple DROP_INDEX_TABLE parameters, but duplicate table names cause an error.

cid.table: The name of one or more tables with indexes to be dropped. If you omit the Creator ID, the default Creator ID is used.

DROP_INDEX_ALL

Drop all indexes for all tables in the Access Definition.

YES: Drop all indexes.

NO: Do not drop all indexes (default).

SPARSE_INDEX

Create a sparse index for one or more columns in the specified table. If the Access Definition includes a sparse index for the specified column, an error occurs unless the index has been dropped using a prior DROP_INDEX parameter.

You can use multiple SPARSE_INDEX parameters, but duplicate column names cause an error. You can specify a sparse index for a column that has a dense index.

cid.table: The name of the table. If you omit the Creator ID, the default Creator ID is used.

column: The name of one or more columns to be indexed.

DENSE_INDEX

Create a dense index for one or more columns in the specified table. If the Access Definition defines a dense index for the specified column, an error occurs unless the index has been dropped using a prior DROP_INDEX parameter.

You can use multiple DENSE_INDEX parameters, but duplicate column names cause an error. You can create a dense index for a column that also has a sparse index.

cid.table: The name of the table. If you omit the Creator ID, the default Creator ID is used.

column: The name of one or more columns to be indexed.

VAR

Override for the default value of a substitution variable assigned in the Access Definition. If a default value was not defined in the Access Definition, you must use the VAR parameter to specify a value, or an error occurs.

varname: The name of the substitution variable assigned in the Access Definition. A colon (:) before varname is optional.

value: The value for the substitution variable. You must enclose the value in single quotes if the variable is for a CHAR, VARCHAR, GRAPHIC, VARGRAPHIC, BINARY, VARBINARY, DATE, TIME, or TIMESTAMP column.
Note: If you specify a column name for the default value, do not enclose the value in quotes.

SECURITY_STATUS

Security status to be assigned to the subset file. If archive security is not currently activated, the assigned status becomes effective when security is activated for the site. Omit this parameter to use the site default for archive security (or PUBLIC, if the site default is not specified).

PUBLIC: All can use and modify.

READONLY: All can use, but only the Archive Administrator or the owner can modify.

PRIVATE: Only the Archive Administrator or the owner can use and modify.

GROUP

Group designation for the subset file.

group: Up to 8-character group designation, enclosed in single or double quotes if using blanks or special characters.

DESC

Description of the subset file.

desc: Up to 40-character description, enclosed in single or double quotes.

SKIP_ARC_CATALOG

Indicate whether to create an archive Directory entry for the subset file.

YES: Skip creating an entry for the subset file or the duplicate archive file. This is the default.

NO: Create an entry for the subset file.

DUP: Skip creating an entry for the duplicate archive file.

REPLACE_ARC_DIR

The action taken if an Archive Directory entry already exists for the file created by this process.

Note: If you specify SKIP_ARC_DIR YES, any specification for REPLACE_ARC_DIRECTORY is ignored.

YES: Replace the existing entry.

NO: Do not replace any existing archive directory entry. This is the default.

RETENTION_PERIOD

The retention period for the subset file. You cannot delete or overwrite the subset file, its associated Archive Directory entry, or any associated archive index file until after the specified retention period.

This parameter is valid only when the Site Option, Specify Arc File Retention, is set to User. If omitted, when the Site Option allows user specification, no retention period is assigned.

NOLIMIT: No retention period is assigned.

nD: The number of days to retain the subset file.

nY: The number of years to retain the subset file.

yyyy-mm-dd, yyyy/mm/dd, or yyyy.ddd: An explicit date, after which you can delete or overwrite the subset file. Enter a year from 1900-2155. For ddd values, enter a day from 000-366.

PERM, NEVER, 1999.365, 1999.366, 1999/12/31, or 99/12/31: The retention period does not expire.
Note: Any archive file assigned a retention period over 9999 days or 27 years is considered permanent, and can only be deleted or overwritten if you reduce the retention period using the ALTER statement of the IBM^® Utility program, IDCAMS.

SECURITY_STATUS

PUBLIC: All can use and modify.

READONLY: All can use, but only the Archive Administrator or the owner can modify.

PRIVATE: Only the Archive Administrator or the owner can use and modify.

COLLECTION

If appropriate, specify the name of the Archive Collection(s) to which the subset file should be added. Each collection is identified by a Collection ID of up to eight characters, and a Collection Name of up to 12 characters, such as CID1.COLLECTION. You can specify multiple collections, but duplicate names cause an error.

Note: If you specify the COLLECTION parameter you must also specify SKIP_ARC_CATALOG NO.

cid: The Collection ID.

collectionname: The Collection Name.

START_TABLE

Override for the Start Table in the Access Definition used to create the source archive file.

cid.tblname: Table name. Prefix with Creator ID, if needed to identify the table.

REL_OVRD

Overrides the “select” flag and Q1/Q2 values for a relationship, as defined in the Access Definition. The REL_OVRD parameters must be enclosed in parentheses and separated by spaces. See Select Relationships in the Common Elements Manual for a detailed description of how Optim uses the following relationship parameters.

TABLE: Name of the child table in the relationship (required).
Note: Prefix with the Creator ID, if needed to identify the table. If you omit the Creator ID, the default Creator ID from the Access Definition is used.

REL: The relationship name (required).

SELECT

Indicates use of the relationship in processing (optional).

YES: Use the relationship.

NO: Ignore the relationship.

Q1: The Q1 value for the relationship (optional).

Q2: The Q2 value for the relationship (optional).

Note: The SELECT, Q1, and Q2 parameters are optional. If omitted, the corresponding Access Definition values apply.

REPORT_LEVEL

The level of detail in the process report.

DETAIL: Produce a detailed report (default).
SUMMARY: Produce a summary report.

NOT_FOUND_RC4

Indicate the action taken and return code (RC) setting when no rows meet the selection criteria during subset batch processing.

YES: A warning message will be issued and the program will end with RC=4 (default).
NO: The program will end with RC=0.

Example

Use the following statement to create a subset file, named DEPT14.OCT2006.CUST.NJ, from the existing archive file, DEPT14.OCT2006.CUST.

SUBSET
   ARCHIVE_NAME  DEPT14.OCT2006.CUST
   SUBSET_FILE  (DSNAME  DEPT14.OCT2006.CUST.NJ  MODE REP)
   SELECT  (DEPT14.CUSTOMERS, OR, STATE, EQ, 'NJ')
   CASE YES