
Pinned topic Does IA store the source or only information about it?

2010-08-17T13:23:01Z
Does Information Analyzer, when it runs an analysis such as column analysis, store only information about the source, or does it copy the entire source and keep it permanently?

We have a DB2 Data Warehouse and we were wondering how much space IA will use.

When IA runs Column Analysis, does it copy and permanently store the tables being analyzed, or does it read the tables, extract or create the information it needs, and store only that information? In other words, does IA store only the metadata rather than a copy of the tables?

How can we estimate the storage space needed for IA?

Thank you.
Updated on 2010-08-18T20:52:44Z by smithha
  • smithha
    162 Posts

    Re: Does IA store the source or only information about it?

    Information Analyzer stores only the results of column analysis, specifically the frequency distributions (detail values) and the summarized analysis information. It does not copy or store the original source.

    The column analysis process does a single-pass read of the source, uses a patented frequency distribution process in the parallel engine to optimize data processing, and then loads the frequency distributions and generates the summarized results.

    Space requirements are highly dependent on the cardinality of your data and on the number of columns you are analyzing. The primary high-volume storage requirement is for the frequency distributions, though there are other tables in the IADB repository.
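
    To illustrate why cardinality drives the space requirement, here is a toy sketch of a frequency distribution built in a single pass. It is not Information Analyzer's actual (patented) process; the function name and the sample columns are hypothetical. It only shows that the number of stored entries equals the number of distinct values, not the number of source rows.

```python
from collections import Counter


def frequency_distribution(values):
    """Count occurrences of each distinct value in one pass over the data."""
    counts = Counter()
    for value in values:  # single pass over the column
        counts[value] += 1
    return counts


# Hypothetical sample columns: same row count, very different cardinality.
status_column = ["OPEN", "CLOSED", "OPEN", "OPEN", "CLOSED", "PENDING"]
order_id_column = [1001, 1002, 1003, 1004, 1005, 1006]

status_fd = frequency_distribution(status_column)
order_id_fd = frequency_distribution(order_id_column)

# One entry per distinct value: a low-cardinality column needs far fewer
# entries than a unique key column with the same number of rows.
print(len(status_fd), len(order_id_fd))
```

    A column such as a unique key produces one entry per source row, while a status or code column produces only a handful.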

    Each row in the frequency distribution (FD) tables, one row per distinct value, takes at most 678 bytes and at least 116 bytes plus twice the average source column length. You can use the following to roughly gauge the storage for these tables:

    Estimated Minimum Size = Number of Distinct Values * (116 bytes + average source column length * 2)

    Estimated Maximum Size = Number of Distinct Values * 678 bytes

    Note that these formulas give only the logical space requirement for the Information Analyzer Analysis Database (IADB). They do not address the physical database size, which depends on the DBMS type, database configuration, and optimization.
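
    As a rough worked example, the two formulas can be applied per column and summed. This is a minimal sketch: the function name and the sample column profiles are made up for illustration, and it assumes the formulas are applied to each analyzed column separately.

```python
def estimate_fd_storage(distinct_values, avg_column_length):
    """Return (minimum, maximum) logical FD storage in bytes for one column.

    Uses the rough formulas from the post:
      minimum = distinct values * (116 bytes + average column length * 2)
      maximum = distinct values * 678 bytes
    """
    min_bytes = distinct_values * (116 + avg_column_length * 2)
    max_bytes = distinct_values * 678
    return min_bytes, max_bytes


# Hypothetical column profiles: (distinct values, average source column length)
columns = {
    "customer_id": (1_000_000, 20),
    "status_code": (5, 8),
}

total_min = 0
total_max = 0
for name, (distinct, avg_len) in columns.items():
    col_min, col_max = estimate_fd_storage(distinct, avg_len)
    total_min += col_min
    total_max += col_max
    print(f"{name}: {col_min:,} to {col_max:,} bytes")

print(f"total: {total_min:,} to {total_max:,} bytes")
```

    For the hypothetical customer_id column this works out to 1,000,000 * (116 + 40) = 156,000,000 bytes at minimum and 1,000,000 * 678 = 678,000,000 bytes at maximum, so the high-cardinality columns dominate the estimate.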

    Hopefully that provides some of what you need.