Analysis database sizing

An analysis database is a component that IBM® InfoSphere® Information Analyzer uses when it runs analysis jobs.

The analysis databases store the extended analysis information: the high-volume, detailed results of analyses such as column analysis, primary key analysis, and domain analysis. In addition, the metadata repository stores the information analysis projects, which contain the analysis results.

Before you create the analysis databases, review the quantity of data to be analyzed. This review helps you to determine an appropriate storage size, location, and configuration of the analysis databases.

When you plan the size of your databases, consider the following factors, which affect the size of each database:

  • Number of tables to be analyzed
  • Number of columns in tables to be analyzed
  • Number of unique records within these tables
  • Number of char and varchar columns
  • Types of analysis to be done

Unless you use sampled analysis, an analysis database might be larger than the combined size of all the analyzed data sources.
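
The factors above can be combined into a rough planning estimate. The following Python sketch is illustrative only and is not an IBM sizing formula: the TableProfile class, the estimate_analysis_db_bytes function, the default byte widths, and the overhead_factor of 1.5 are all assumptions made for this example. Replace them with measurements from your own data sources and with the types of analysis that you plan to run.

```python
import math
from dataclasses import dataclass


@dataclass
class TableProfile:
    """Hypothetical description of one table to be analyzed."""
    name: str
    unique_rows: int          # number of unique records in the table
    text_columns: int         # number of char and varchar columns
    other_columns: int        # number of all remaining columns
    avg_text_bytes: int = 32  # assumed average width of a text value
    avg_other_bytes: int = 8  # assumed average width of any other value


def estimate_analysis_db_bytes(tables, overhead_factor=1.5, sample_fraction=1.0):
    """Return a rough byte estimate for an analysis database.

    overhead_factor is an assumed multiplier that stands in for the
    types of analysis to be done; it is a placeholder, not a product
    figure. sample_fraction is the share of rows analyzed, where 1.0
    means full (unsampled) analysis.
    """
    if not 0.0 < sample_fraction <= 1.0:
        raise ValueError("sample_fraction must be in the range (0, 1]")

    total = 0.0
    for table in tables:
        # Bytes needed to hold one row's worth of analyzed values.
        row_bytes = (table.text_columns * table.avg_text_bytes
                     + table.other_columns * table.avg_other_bytes)
        analyzed_rows = table.unique_rows * sample_fraction
        total += analyzed_rows * row_bytes * overhead_factor
    return math.ceil(total)


if __name__ == "__main__":
    tables = [
        TableProfile("customers", unique_rows=1_000_000,
                     text_columns=5, other_columns=5),
        TableProfile("orders", unique_rows=4_000_000,
                     text_columns=2, other_columns=10),
    ]
    full = estimate_analysis_db_bytes(tables)
    sampled = estimate_analysis_db_bytes(tables, sample_fraction=0.1)
    print(f"Full analysis:    {full / 1e9:.2f} GB")
    print(f"Sampled analysis: {sampled / 1e9:.2f} GB")
```

With an overhead factor greater than 1, the estimate for a full analysis exceeds the raw size of the analyzed columns, which mirrors the statement that an analysis database might be larger than the combined data sources. Lowering sample_fraction shows how sampled analysis reduces the estimate.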