Creating tables in an IBM Db2 Warehouse on Cloud Enterprise MPP or IBM Db2 Warehouse MPP database

You can create tables in a IBM® Db2® Warehouse on Cloud Enterprise MPP or IBM Db2 Warehouse MPP database by using the web console or through an application that is connected to the database. When you create a table, you must decide how to distribute the table data.

About this task

When you create a new table in an MPP database, you can use hash distribution or random distribution:
Hash distribution

Data is distributed by applying a hashing algorithm to the values in the columns that are listed in a distribution key.

A well-chosen distribution key can balance two objectives:
  1. Maximizing parallel processing of queries and maximizing use of available storage space by distributing table data evenly across the system.
  2. Minimizing the time that it takes to fetch data by collocating data that is likely to be fetched together.

In general, a well-chosen hash distribution key produces the best performance.

You can specify the distribution key by including the DISTRIBUTE BY HASH clause in the CREATE TABLE statement.

Random distribution

Data is distributed evenly across the system.

Random distribution maximizes parallel processing of queries and maximizes use of available storage space.

Consider using random distribution as a simpler alternative to hash distribution:
  • If you don't have enough information to choose an effective hash distribution key.
  • If you know that collocation isn't needed for the queries that are entered against the table.
  • If the table is small enough that performance is not affected by the distribution.

You can use random distribution by including the DISTRIBUTE BY RANDOM clause in the CREATE TABLE statement.

If you specify DISTRIBUTE BY RANDOM when creating a table with a primary or unique key, the database manager will implement the random distribution by creating a hash key on the unique or primary key. In this case, when you view the table definition in the web console or the catalog tables, you will see this hash distribution key despite the fact that you specified DISTRIBUTE BY RANDOM when you created the table.

Default

If you do not specify a distribution clause with the CREATE TABLE statement, a default hash distribution key is used.

Procedure

Issue the CREATE TABLE statement in either the web console or an application that is connected to your MPP database.

  • Define the primary key and other check constraints in the CREATE TABLE statement instead of creating the table first and then defining the primary key or other check constraints in an ALTER TABLE statement.
  • Specify the distribution in the CREATE TABLE statement.
    Example 1: Hash distribution
    
    CREATE TABLE MYTABLE
    (
      COL1 INT,
      COL2 VARCHAR(5)
    )
    DISTRIBUTE BY HASH( COL1 )
    
    Example 2: Random distribution
    
    CREATE TABLE MYTABLE
    (
      COL1 INT,
      COL2 VARCHAR(5)
    )
    DISTRIBUTE BY RANDOM
    

    See also: CREATE TABLE statement

  • If you do not specify a distribution key in the CREATE TABLE statement, a default hash distribution key is used. You can see which column was used as the default distribution key by viewing the table definition information in the web console.