SUBSET

Use the SUBSET statement to create a subset file from an existing archive file. The original archive file remains intact in the directory.

The subset file is an archive file that contains a subset of related rows from the source archive file and the objects needed to recreate the database tables. You can use the subset file to restore previously archived rows in a database.

There is a growing need to ensure information privacy and preserve data for audit or legal purposes. It is equally important to establish policies to dispose of information at the end of its lifecycle. Features of the Subset process enable you to meet these requirements. You can use Subset to select rows from the original archive and create a file with a retention period that is different from the source archive file. When you create subset file you can specify a Point-and-Shoot list, override variables used in the source archive file, change relationship properties and control whether a table is processed as a reference table. Additionally you can register the subset file in the archive directory, create an index for it, and add it to a collection.

If security is active at your site, you can assign security status for any files created using the SUBSET process and cataloged in the archive directory.

SUBSET 
  ARCHIVE_NAME {explicitfilename | (USEDD)}
  SUBSET_FILE(File Allocation Parameters)
    [ COPY_ARCHIVE_INFO { YES |NO }]
  [ REF_TABLE ([ cid. ] tblname1 { YES | NO })]
  [ ROWLIST {dsname| (USEDD) } ]
  [ POINT_SHOOT_ERROR { STOP | CONTINUE | SUBSET }
  [ SELECT ( [ cid. ] tblname1,coloperator,colname1,   critoperator,criteria
             [, colname2,critoperator,criteria,... ] ) ]
  [ SELECT ( [ cid. ] tblname2,coloperator,colname1, critoperator,criteria 
             [, colname2,critoperator,criteria,... ] ) ]…
  [ SQL ( [ cid. ] tblname1, { AND | OR }, sqlclause1 ) ]
  [ SQL ( [ cid. ] tblname2, { AND | OR }, sqlclause2 ) ]...
  [ CASE { YES | NO } ]
  [ INDEX_FILE (File Allocation Parameters)]
  [ DROP_INDEX_COLUMN( [ cid. ] table, column1 [,  column2, ... ] ) ]
  [ DROP_INDEX_TABLE( [ cid. ] table1 [, [ cid. ] table2,... ] ) ]
  [ DROP_INDEX_ALL{ YES | NO } ]
  [ SPARSE_INDEX( [ cid. ] table, column1 [, column2, ...] ) ]
  [ DENSE_INDEX( [ cid. ] table, column1 [, column2, ... ] ) ]
  [ VAR (varname, value)]
  [ SECURITY_STATUS { PUBLIC | READONLY | PRIVATE } ]
  [ GROUP group ]
  [ DESC desc ]
  [ RETENTION_PERIOD{ NOLIMIT | nD | nY | yyyy-mm-dd | yyyy.ddd | PERM | NEVER } ]
  [ SKIP_ARC_CATALOG  { YES | NO }]
  [ REPLACE_ARC_DIR { YES | NO }]
  [ COLLECTION (cid.collectionname [,cid.collectionname2,...])]
  [ START_TABLE [ cid. ] tblname ]
  [ REL_OVRD (TABLE [ cid.] tblname, REL relname
                  [ SELECT { YES | NO } | Q1 { YES | NO } | Q2 { YES | NO } ] ) ]]
  [ REPORT_LEVEL { DETAIL | SUMMARY } ]
  [ NOT_FOUND_RC4 { YES | NO } ]
ARCHIVE_NAME
The name of the source archive file from which the subset file will be created. ARCHIVE_NAME is required.
explicitfilename
The fully qualified name of a cataloged archive file.
(USEDD)
Use the DDNAME PSDFEXTR. Only one USEDD operand is supported.
SUBSET_FILE
Name of the output file that will contain the subset of rows from the source archive file. This parameter is required. See File Allocation Parameters.
COPY_ARCHIVE_INFO
Use this parameter to create the subset file with the same attributes as the source archive file.
YES
Copy the source archive file attributes EXPIRATION_DATE, SECURITY_STATUS, GROUP, DESC and RETENTION_PERIOD and use these attributes for the subset file created by this process.
Note: EXPIRATION_DATE cannot be overridden. All of the remaining attributes can be overridden using the respective parameters.
NO
Do not copy the source archive file attributes, except for EXPIRATION_DATE which is always copied and used for the subset file. This is the default.
Processing parameters
Use the following parameters to provide processing options for the archive file.
REF_TABLE
Use this parameter to identify tables to use as reference tables in the subset process. All rows in a reference table are selected, unless selection criteria are specified for the table. By default, the reference table specifications in the Access Definition used to create the source Archive File are used. You can add reference table specifications or remove an existing reference table specification for tables. Use the REF_TABLE parameter for each table.
cid.tblname
Table name. Prefix with Creator ID, if needed to identify the table.
YES
Table is a reference table.
NO
Table is not a reference table.
ROWLIST
The name of a Point-and-Shoot file used to select data for subsetting.
Note: If you specify ROWLIST and SELECT or SQL criteria, only rows that meet the SELECT or SQL criteria are matched against the ROWLIST.
dsname
The fully qualified name of a sequential file or partitioned data set with member name.
(USEDD)
Use the DDNAME PSDFPNS to specify the Point-and-Shoot file. Only one USEDD operand is supported.
POINT_SHOOT_ERROR
Indicate action to be taken when processing the Point-and-Shoot file results in errors.
STOP
Stop the run if the Point-and-Shoot file contains invalid keys or is empty. This is the default.
CONTINUE
Continue the run if invalid keys are found, but do not use the Point-and-Shoot file.
SUBSET
If invalid keys are found, continue the run with the valid keys only. If errors other than invalid keys are found, stop the run.
SELECT
Criteria used to select data. If you omit the SELECT parameter, all rows in the archive file are copied to the subset file.
Note: If you specify ROWLIST and SELECT or SQL criteria, only rows that meet the SELECT or SQL criteria are matched against the ROWLIST.
cid.tblname
The table name. Prefix with Creator ID, if needed to identify the table.
coloperator
Operator used to combine criteria for the table. One of the following must be indicated, regardless of the number of colname operands.
AND
Select data that matches criteria for all columns.
OR
Select data that matches criteria for at least one column.
colname
The name of the column. At least one colname operand, with coloperator, critoperator, and criteria, must be used.
critoperator
Operator for criteria. (See following explanation for criteria.)
criteria
Up to 250 characters of criteria per column.

If the column has a character, graphic, or binary data type (e.g., CHAR, VARCHAR, GRAPHIC, VARGRAPHIC, BINARY, VARBINARY, DATE, TIME, TIMESTAMP, or TIMESTAMP WITH TIME ZONE), the criteria must be delimited with single quotes.

If the column has a numeric data type (e.g., INTEGER, SMALLINT, BIGINT or DECIMAL), the criteria must not be in quotes.

If the archive file has dense indexes that you want to use, the criteria must be case-sensitive. See the CASE parameter for more information.

Separate criteria and critoperator values with a comma.

Note: You must leave a space after a comma that precedes a numeric value if the DB2® setup specifies a comma as the decimal point value.
  • EQ, criteria
  • NE, criteria
  • GT, criteria
  • LT, criteria
  • GE, criteria
  • LE, criteria
  • IN, (a,b,c,d...)
  • NOT IN, (a,b,c,d...)
  • IS NULL
  • IS NOT NULL
  • LIKE, pattern
  • NOT LIKE, pattern
  • BETWEEN, x AND y
  • NOT BETWEEN, x AND y
SQL
SQL WHERE criteria used to select data from a table in the source archive file.
Note: If you specify ROWLIST and SELECT or SQL criteria, only rows that meet the SELECT or SQL criteria are matched against the ROWLIST.
cid.tblname
Table name. Prefix with Creator ID, if needed to identify the table.
AND
Select data that matches selection criteria and SQL WHERE clause for table (default).
OR
Select data that matches selection criteria or SQL WHERE clause for table.
sqlclause
Specify criteria as an SQL WHERE clause. Do not use quotes or parentheses; clause ends with statement (at close parenthesis).
Note: You must leave a space after a comma that precedes a numeric value if the DB2 setup specifies a comma as the decimal point value.
CASE
Case requirement for matching selection criteria. Omit the CASE parameter to select data without regard to case. The value for this parameter must be YES if the archive file has dense indexes that you want to use. Specifying the value NO or omitting this parameter causes any dense indexes to be ignored.
YES
Select data that match case of criteria.
NO
Select data that match criteria without regard to case. This is the default.
INDEX_FILE
Create an Index file for dense indexes. If omitted, and a dense index is to be created, an error occurs. Use the parameters shown in File Allocation Parameters
DROP_INDEX_COLUMN
Drop the index for one or more columns in a table in the Access Definition. (If the Access Definition does not have an index for the specified column, an error occurs.) You can specify multiple DROP_INDEX_COLUMN parameters, but duplicate column names cause an error.
cid.table
The name of the table. If you omit the Creator ID, the default Creator ID is used.
column
The name of one or more columns with an index to be dropped.
DROP_INDEX_TABLE
Drop all indexes for the specified tables in the Access Definition. You can specify multiple DROP_INDEX_TABLE parameters, but duplicate table names cause an error.
cid.table
The name of one or more tables with indexes to be dropped. If you omit the Creator ID, the default Creator ID is used.
DROP_INDEX_ALL
Drop all indexes for all tables in the Access Definition.
YES
Drop all indexes.
NO
Do not drop all indexes (default).
SPARSE_INDEX
Create a sparse index for one or more columns in the specified table. If the Access Definition includes a sparse index for the specified column, an error occurs unless the index has been dropped using a prior DROP_INDEX parameter.
You can use multiple SPARSE_INDEX parameters, but duplicate column names cause an error. You can specify a sparse index for a column that has a dense index.
cid.table
The name of the table. If you omit the Creator ID, the default Creator ID is used.
column
The name of one or more columns to be indexed.
DENSE_INDEX
Create a dense index for one or more columns in the specified table. If the Access Definition defines a dense index for the specified column, an error occurs unless the index has been dropped using a prior DROP_INDEX parameter.
You can use multiple DENSE_INDEX parameters, but duplicate column names cause an error. You can create a dense index for a column that also has a sparse index.
cid.table
The name of the table. If you omit the Creator ID, the default Creator ID is used.
column
The name of one or more columns to be indexed.
VAR
Override for the default value of a substitution variable assigned in the Access Definition. If a default value was not defined in the Access Definition, you must use the VAR parameter to specify a value, or an error occurs.
varname
The name of the substitution variable assigned in the Access Definition. A colon (:) before varname is optional.
value
The value for the substitution variable. You must enclose the value in single quotes if the variable is for a CHAR, VARCHAR, GRAPHIC, VARGRAPHIC, BINARY, VARBINARY, DATE, TIME, or TIMESTAMP column.
Note: If you specify a column name for the default value, do not enclose the value in quotes.
SECURITY_STATUS
Security status to be assigned to the subset file. If archive security is not currently activated, the assigned status becomes effective when security is activated for the site. Omit this parameter to use the site default for archive security (or PUBLIC, if the site default is not specified).
PUBLIC
All can use and modify.
READONLY
All can use, but only the Archive Administrator or the owner can modify.
PRIVATE
Only the Archive Administrator or the owner can use and modify.
GROUP
Group designation for the subset file.
group
Up to 8-character group designation, enclosed in single or double quotes if using blanks or special characters.
DESC
Description of the subset file.
desc
Up to 40-character description, enclosed in single or double quotes.
SKIP_ARC_CATALOG
Indicate whether to create an archive Directory entry for the subset file.
YES
Skip creating an entry for the subset file or the duplicate archive file. This is the default.
NO
Create an entry for the subset file.
DUP
Skip creating an entry for the duplicate archive file.
REPLACE_ARC_DIR
The action taken if an Archive Directory entry already exists for the file created by this process.
Note: If you specify SKIP_ARC_DIR YES, any specification for REPLACE_ARC_DIRECTORY is ignored.
YES
Replace the existing entry.
NO
Do not replace any existing archive directory entry. This is the default.
RETENTION_PERIOD
The retention period for the subset file. You cannot delete or overwrite the subset file, its associated Archive Directory entry, or any associated archive index file until after the specified retention period.
This parameter is valid only when the Site Option, Specify Arc File Retention, is set to User. If omitted, when the Site Option allows user specification, no retention period is assigned.
NOLIMIT
No retention period is assigned.
nD
The number of days to retain the subset file.
nY
The number of years to retain the subset file.
yyyy-mm-dd, yyyy/mm/dd, or yyyy.ddd
An explicit date, after which you can delete or overwrite the subset file. Enter a year from 1900-2155. For ddd values, enter a day from 000-366.
PERM, NEVER, 1999.365, 1999.366, 1999/12/31, or 99/12/31
The retention period does not expire.
Note: Any archive file assigned a retention period over 9999 days or 27 years is considered permanent, and can only be deleted or overwritten if you reduce the retention period using the ALTER statement of the IBM® Utility program, IDCAMS.
SECURITY_STATUS
Security status to be assigned to the subset file. If archive security is not currently activated, the assigned status becomes effective when security is activated for the site. Omit this parameter to use the site default for archive security (or PUBLIC, if the site default is not specified).
PUBLIC
All can use and modify.
READONLY
All can use, but only the Archive Administrator or the owner can modify.
PRIVATE
Only the Archive Administrator or the owner can use and modify.
COLLECTION
If appropriate, specify the name of the Archive Collection(s) to which the subset file should be added. Each collection is identified by a Collection ID of up to eight characters, and a Collection Name of up to 12 characters, such as CID1.COLLECTION. You can specify multiple collections, but duplicate names cause an error.
Note: If you specify the COLLECTION parameter you must also specify SKIP_ARC_CATALOG NO.
cid
The Collection ID.
collectionname
The Collection Name.
START_TABLE
Override for the Start Table in the Access Definition used to create the source archive file.
cid.tblname
Table name. Prefix with Creator ID, if needed to identify the table.
REL_OVRD
Overrides the “select” flag and Q1/Q2 values for a relationship, as defined in the Access Definition. The REL_OVRD parameters must be enclosed in parentheses and separated by spaces. See Select Relationships in the Common Elements Manual for a detailed description of how Optim uses the following relationship parameters.
TABLE
Name of the child table in the relationship (required).
Note: Prefix with the Creator ID, if needed to identify the table. If you omit the Creator ID, the default Creator ID from the Access Definition is used.
REL
The relationship name (required).
SELECT
Indicates use of the relationship in processing (optional).
YES
Use the relationship.
NO
Ignore the relationship.
Q1
The Q1 value for the relationship (optional).
Q2
The Q2 value for the relationship (optional).
Note: The SELECT, Q1, and Q2 parameters are optional. If omitted, the corresponding Access Definition values apply.
REPORT_LEVEL
The level of detail in the process report.
DETAIL
Produce a detailed report (default).
SUMMARY
Produce a summary report.
NOT_FOUND_RC4
Indicate the action taken and return code (RC) setting when no rows meet the selection criteria during subset batch processing.
YES
A warning message will be issued and the program will end with RC=4 (default).
NO
The program will end with RC=0.

Example

Use the following statement to create a subset file, named DEPT14.OCT2006.CUST.NJ, from the existing archive file, DEPT14.OCT2006.CUST.

SUBSET
   ARCHIVE_NAME  DEPT14.OCT2006.CUST
   SUBSET_FILE  (DSNAME  DEPT14.OCT2006.CUST.NJ  MODE REP)
   SELECT  (DEPT14.CUSTOMERS, OR, STATE, EQ, 'NJ')
   CASE YES