Encoded vector indexes

An encoded vector index (EVI) is used to provide fast data access in decision support and query reporting environments.

EVIs are a complementary alternative to existing index objects (binary radix tree structure - logical file or SQL index) and are a variation on bitmap indexing. Because of their compact size and relative simplicity, EVIs provide for faster scans of a table that can also be processed in parallel.

An EVI is a data structure that is stored as two components:

  • The symbol table contains statistical and descriptive information about each distinct key value represented in the table. Each distinct key is assigned a unique code, either 1 byte, 2 bytes or 4 bytes in size.

    By specifying INCLUDE on the create, additional aggregate values can be maintained in real time as an extension of the key portion of the symbol table entry. These aggregated values are over non-key data in the table grouped by the specified EVI key.

  • The vector is an array of codes listed in the same ordinal position as the rows in the table. The vector does not contain any pointers to the actual rows in the table.

Advantages of EVIs:

  • Require less storage
  • May have better build times than radix, especially if the number of unique values in the columns defined for the key is relatively small.
  • Provide more accurate statistics to the query optimizer
  • Considerably better performance for certain grouping types of queries
  • Good performance characteristics for decision support environments.
  • Can be further extended for certain types of grouping queries with the addition of INCLUDE values. Provides ready-made numeric aggregate values maintained in real time as part of index maintenance. INCLUDE values become an extension of the EVI symbol table. Multiple include values can be specified over different aggregating columns and maintained in the same EVI provided the group by values are the same. This technique can reduce overall maintenance.

Disadvantages of EVIs:

  • Cannot be used in ordering.
  • Use for grouping is specialized. Supports:
    • COUNT, DISTINCT requests over key columns
    • aggregate requests over key columns where all other selection can be applied to the EVI symbol table keys
    • INCLUDE aggregates
    • MIN or MAX, if aggregating value is part of the symbol table key.
  • Use with joins always done in cooperation with hash table processing.
  • Some additional maintenance idiosyncrasies.