Database Export Index Options
The Indexes dialog box enables you to create indexes on database tables
exported from IBM® SPSS® Modeler. You can specify the
field sets you want to include and customize the CREATE INDEX
command, as needed.
The dialog box has two parts:
- The text field at the top displays a template that can be used to generate
one or more
CREATE INDEX
commands, which by default follows the format:CREATE INDEX <index-name> ON <table-name>
- The table in the lower portion of the dialog box enables you to add
specifications for each index you want to create. For each index, specify the index name and the
fields or columns to include. The dialog box automatically generates the values of the
<index-name>
and<table-name>
parameters accordingly.For example, the generated SQL for a single index on the fields empid and deptid might look like this:
CREATE INDEX MYTABLE_IDX1 ON MYTABLE(EMPID,DEPTID)
You can add multiple rows to create multiple indexes. A separate
CREATE INDEX
command is generated for each row.
Customizing the CREATE INDEX Command
Optionally, you can customize the CREATE INDEX
command for
all indexes or for a specific index only. This gives you the flexibility to accommodate specific
database requirements or options and to apply customizations to all indexes or only specific ones,
as needed.
- Select Customize CREATE INDEX command at the top of the dialog box to modify the template used for all indexes added subsequently. Note that changes will not automatically apply to indexes that have already been added to the table.
- Select one or more rows in the table and then click Update selected indexes at the top of the dialog box to apply the current customizations to all selected rows.
- Select the Customize check box in each row to modify the command template for that index only.
Note that the values of the <index-name>
and
<table-name>
parameters are generated automatically by the dialog box based
on the table specifications and cannot be edited directly.
BITMAP KEYWORD. If you are using an Oracle database, you can customize the template to create a bitmap index rather than a standard index, as follows:
CREATE BITMAP INDEX <index-name> ON <table-name>
Bitmap indexes may be useful for indexing columns with a small number of distinct values. The resulting SQL might look this:
CREATE BITMAP INDEX MYTABLE_IDX1 ON MYTABLE(COLOR)
UNIQUE keyword. Most databases support the UNIQUE
keyword in the
CREATE INDEX
command. This enforces a uniqueness constraint similar to a primary
key constraint on the underlying table.
CREATE UNIQUE INDEX <index-name> ON <table-name>
Note that for fields actually designated as primary keys, this specification
is not necessary. Most databases will automatically create an index for any fields specified as
primary key fields within the CREATE TABLE
command, so explicitly creating indexes
on these fields is not necessary. See the topic Database export schema options for more information.
FILLFACTOR keyword. Some physical parameters for the index can be fine-tuned. For example, SQL Server enables the user to trade off the index size (after initial creation) against the costs of maintenance as future changes are made to the table.
CREATE INDEX MYTABLE_IDX1 ON MYTABLE(EMPID,DEPTID) WITH FILLFACTOR=20
Other Comments
- If an index already exists with the specified name, index creation will fail. Any failures will initially be treated as warnings, allowing subsequent indexes to be created and then re-reported as an error in the message log after all indexes have been attempted.
- For best performance, indexes should be created after data has been loaded into the table. Indexes must contain at least one column.
- Before executing the node, you can preview the generated SQL in the message log.
- For temporary tables written to the database (that is, when node caching is enabled) the options to specify primary keys and indexes are not available. However the system may create indexes on the temporary table as appropriate, depending on how the data is used in downstream nodes. For example, if cached data is subsequently joined by a DEPT column, it would make sense to index the cached tabled on this column.
Indexes and Query Optimization
In some database management systems, once a database table has been created, loaded, and indexed, a further step is required before the optimizer is able to utilize the indexes to speed up query execution on the new table. For example, in Oracle, the cost-based query optimizer requires that a table be analyzed before its indexes can be used in query optimization. The internal ODBC properties file for Oracle (not user-visible) contains an option to make this happen, as follows:
# Defines SQL to be executed after a table and any associated indexes
# have been created and populated
table_analysis_sql, 'ANALYZE TABLE <table-name> COMPUTE STATISTICS'
This step is executed whenever a table is created in Oracle (regardless of whether primary keys or indexes are defined). If necessary, the ODBC properties file for additional databases can be customized in a similar way - contact Support for assistance.