In some cases, however, an unequal distribution of data can be desirable. To use an example from the DB2 9.7 documentation, if some database partitions reside on a particularly powerful machine, then it may be beneficial for those database partitions to contain larger volumes of data than other partitions.
To determine data skew, the same documentation recommends using a query like the following on a representative table in a partition group:
SELECT DBPARTITIONNUM(column_name), COUNT(*) FROM table_name
GROUP BY DBPARTITIONNUM(column_name)
ORDER BY DBPARTITIONNUM(column_name) DESC
where column_name is the name of the distribution key for table table_name. Another option for determining data skew is the ANALYZE_LOG_SPACE procedure.Another form of skew to examine is skew in your data access patterns. You can observe data access skew by looking at how the data actually gets used, that is, to see if more data is being read from one partition than the others by the queries being run by the users of the database. In today’s post, we will investigate the Skew screen of db2top and learn what it can tell you about data access skew in your database.
Click here to read the rest of this post on the db2top Skew Screen at Keith's blog