IDS Experts

For a complete list of current SQL reserved words, see the IBM Informix Guide to SQL: Syntax.

Sanjit Chakraborty[Read More]

ON-Bar whole system parallelize backup/restore

cheetahblog 1 Comment 1,924 Views

Prior to Cheetah,  a whole-system backup (onbar -b-w)  was serial backup of all storage spaces andlogical logs based on a single checkpoint. That time is stored with thebackup information. Theadvantage of using a whole-system backup was  that you canrestore the storage spaces with or without the logical logs.Because the data in all storage spaces is consistent in a whole-systembackup, you do not need to restore the logical logs to make the dataconsistent. Level 0, 1, or 2 backups are supported.

One of the new features of Cheetah is Backup/Restore dbspacesin parallel with whole system functionality .  With parallelbackup , multiple processes run simultaneously  each processbacking up a different dbspace . In most cases, parallel backups complete fasterthan serialbackups, which use only one process.

ONCONFIG parameter  "BAR_MAX_BACKUP" isused for whole system backups/restores as well. TheBAR_MAX_BACKUP parameter specifies the maximum number of parallelprocesses that are allowed for each onbar command. Both UNIX andWindows support parallel backups. Although the database server defaultvalue for BAR_MAX_BACKUP is 4, the onconfig.std value is 0.

To specify parallel backups and restores, including parallel wholesystem backups and restores, set BAR_MAX_BACKUP to a value higher than1. For example, if you set BAR_MAX_BACKUP to 3 and execute anON–Bar command, ON–Bar will spawn the maximum 3 processes concurrently.

If  BAR_MAX_BACKUP to 0, the system creates as manyON–Bar processes as needed. The number of ON–Barprocesses is limited only by the number of storage spaces or the amountof memory available to the database server, whichever is less.

To perform a serial backup or restore, including a serialwhole system backup or restore, set BAR_MAX_BACKUP to 1. 

Also in Cheetah,  storage manager efficiency isimproved  by improvingdbspace ordering .  "Biggest" dbspaces are backed up first and the "smallest"last.Dbspace with most pages used is the "biggest" and will be backed upfirst. This ensures better parallelism, no matter how BAR_MAX_BACKUP isset or how many pages are to be backed up in different dbspaces. Neworder of backup/restore is also effective for non-wholesystembackup/restore. Restore is done in the same order as backup to ease storagemanager's  access to stored objects.

To implement new ordering of dbspaces for backup and restore changes are made to the bar_instance table in the sysutils Database .
New column "ins_backup_order" (integer default 0 not null) is added to the table bar_instance .
Also changes are made to ixbar file :
New field with backup sequence integer (Last field in the line ) is added in ixbar file.
Value of the new field is always "0" (zero) for log backup objects.

Even if the BAR_MAX_BACKUP is set to 1 for serial backup,the dbspaces will be backed up using the new dbspace order .

Backup Operation
Only 1 checkpoint for all dbspaces - just before backup of rootdbs .
Rootdbs is backed up first, without parallelism (no change here).
Before image processor threads "arcbackup2" are started at this time, one for each dbspace  (more threads running in parallel.)
As each dbspace backup completes, the respective "arcbackup2" thread exits ( less "arcbackup2" threads as backup progresses. )
New order of backup. The order is based on the used-pages count at the start time of backup. Dbspace with most pages used is the "biggest" and will be backed up first.

Restore Operation
Not much changed, only it is now parallel.
Rootdbs is restored first, without parallelism (no change here).
Restore dbspaces in same order as they were backed up. Use the value of "ins_backup_order" to determine correct order. If BAR_MAX_BACKUP is changed between backup and restore, the objects will still be restored in the same order, however the timing relative to each other may differ significantly. This could have negative effects on SM performance .

Rashmi Chawak[Read More]

Index Self-Join

cheetahblog Tags:  index selfjoin 3 Comments 4,724 Views
Beginning with IDS version 11.10, Informix introduced a new index scan feature, called “Index Self Join”. Priorto IDS version 11.10, a composite index scan allows to scan a single range of an index, based on start and end index key position. The Index Self Join access method permits scan many small subsequent ranges inside the composite index, instead of a large single range, based on filter on non-leading keys of a composite index.

On previous IDS versions, queries from table with composite index perform inefficiently where higher ratio of duplicate values in leading column than subsequent column(s) in the composite index. The Index Self Join is new type of index scan that can use subsets of the full range of a composite index. With this new index scan, tables can join logically to itself, and more selective non-leading index keys are applied as index bound filters to each unique combination of the leading key values.

Previously optimizer scans all index keys that fulfill the complete index key condition; otherwise it would make a sequential scan if the leading key had no WHERE clause conditions associated in the query. With this feature it will find the unique leading keys (low selectivity), and will make small queries using this unique keys and the rest of the index keys provided in the WHERE clause condition of the query.

An Index Self Join is beneficial for situations in which:

  • The lead key of an index has many duplicates, and
  • Predicates on the lead key are not selective, but predicates on the non-leading index keys are selective.

How ‘Index Self Join’ works
Lets create a table ‘tab1’ with following table schema and load some data:
      CREATE TABLE tab1 (        col1  int,        col2  int,        col3  int,        col4  char(10));      CREATE INDEX idx1 ON tab1(col1,col2,col3);

Here is an illustration how IDS process following query prior version and on v11.10:

      SELECT * FROM TAB1      WHERE col1 >= 1 AND col1 <= 3      AND   col2 >= 20 AND col2 <=31      AND   col3 >= 40 AND col3 <= 62;

The simple view of the composite index ‘idx1’ on tab1(col1,col2,col3):

Diagram of a composite index

Prior to IDS v11.10, only possible filters could use on col1 (col1 >= 1 AND col1 <= 3) for positioning of the index scan:

Prior version
Lower Filter col1 >= 1, Upper Filter col1 <= 3

Diagram of an index scan prior to v11.10

Beginning with IDS version 11.10, we can use filters on ‘col2’ and ‘col3’for positioning of the index scan that allows skip unnecessary index keys at two ends of the index. IDS scan selective index keys that are relevant, which avoid scanning a large portion of the index. This strategy will improve the query performance by reducing the portion of the index needs to scan.

IDS version 11.10
Lead Keys: col1, col2
Lower Filter col1 = col1, col2 = col2 and col3 >= 40
Upper Filter col3 <= 62

Diagram of an index scan on v11.10

Example of ‘Index Self Join’
A table ‘tab1’ has been created with following schema:

Table schema of TAB1

So the table ‘tab1’ has a composite index ‘idx1’ on columns ‘col1’, ‘col2’ and ‘col3’. Now we will test following query without the leading index key ‘col1’ in WHERE clause condition:

    SELECT * FROM TAB1    WHERE col2 >= 10 AND col2 <=11    AND   col3 >= 20 AND col3 <= 22;

Following is the query access path of above SQL in IDS version prior to 11.10 where optimizer doing sequential scan:

Explain output prior to IDS v.11.10

Following is the query access path of previous SQL in IDS version 11.10 where optimizer using subsets of the full range of composite index:

Explain output on IDS v.11.10

Default Setting
By default, the optimizer considers ‘Index Self Join’ on IDS v.11.10. Following are couple of ways to change the default settings:

  • Set onconfig parameter INDEX_SELFJOIN to 0
  • Dynamically change the onconfig settings using onmode command
           onmode -wm INDEX_SELFJOIN=0 
  • Using optimizer directive AVOID_INDEX_SJ
           SELECT {+AVOID_INDEX_SJ(TAB1 IDX1) *        FROM TAB1       WHERE col2 >= 10 AND col2 <=11       AND   col3 >= 20 AND col3 <= 22;

New Directives with ‘Index Self Join’
The optimizer also supports two new access-method directives:
    INDEX_SJ (table index [ , index ...] )     AVOID_INDEX_SJ (table index [ , index ...]
Use the specified index to scan the table in an index self-join path. The optimizer is forced to scan the table using an index self-join path with the specified index (or to choose the least costly index in a comma-separated list of indexes for an index self-join path).

Does not use an index self-join path for the specified indexes. The optimizer does not consider the specified index for scanning the table in an index self-join path.

Sanjit Chakraborty[Read More]

Introduction to Basic Text Search Datablade Part 1 Description and Syntax

cheetahblog Tags:  text search syntax bts 2,762 Views
One of the new features for Cheetah is the inclusion of severaladditional datablades. One of the more exciting of these blades is theBasic Text Search blade, BTS  for short. This datablade allowsyou to do a lot of  searches on text data found in char,varchar, lvarchar, blob and clob datatypes. The intent is to provide amore robust method by which to search text than current forms like theMATCHES or LIKE functions.  The documentationfor the BTS blade can be found in TheBuilt-In DataBlade Modules User's Guide . Today we willgo over a little of the syntax that allows you to perform some neatqueries.

At the heart of the BTS datablade is a single function. The name isbts_contains, and it has the following syntax:

bts_contains (<column to search>, <searchcriteria>,  {score # REAL})

The score section at the end is optional, and what it does is allow youto give a weighted value to each member of your result set.
So let's presume that you had a varchar(255) column called remarks in atable called photos. If you wanted to search for the words "Apple" orthe phrase "Macbook Pro" , and you only wanted the score of 70.0 or greater you would do the following:

SELECT * FROM photos WHERE bts_contains(remarks, 'Apple OR "MacbookPro"', score # REAL)
AND score >= 70.0

If  you didn't care about getting the score back from yoursearch, then you could write it as the following:

SELECT * FROM photos WHERE bts_contains(remarks, 'Apple OR "MacbookPro"')

The BTS datablade takes most of the shortcuts you are familiar withfrom using most web search engines in today's marketplace.

In part 2 we will cover how to setup the BTS blade and the administrative functionsthat are available.

Mark Jamison[Read More]

onstat -g pqs

cheetahblog Tags:  diagnostic sql analysis 1 Comment 3,874 Views
In version 10.00 IDS made available a new onstat command. The onstatcommand was onstat -g pqs. For those familiar with the old X-based toolcalled xtree, onstat -g pqs allows you to dynamically track a session'squery plan/tree. The syntax is as follows:

onstat -g pqs {0||<session id>}

The 0 will actually attempt to grab the current step of each query planfor all sessions.

Two considerations

  1. onstat -g pqs only shows all phases that have currentlyprocessed, plus the current phase.
  2. onstat -g pqs only shows active queries, once a query iscomplete the plan is gone.

Below is a sample based on the following query.

SELECT c.customer_num, fname,lname, company, o.order_num
FROM customer c, orders o, items i
WHERE c.customer_num= o.customer_num
AND o.order_num = i.order_num
GROUP BY c.customer_num, o.order_num, fname, lname, company
ORDER BY c.customer_num, lname, fname, company, order_num

Onstat -g pqs output

As you can see an additional drawback is that the tables themselves are not named, but it does clearly show you the query tree as it was executed.

Mark Jamison[Read More]