Topic
5 replies Latest Post - ‏2013-01-09T14:17:11Z by smithha
SystemAdmin
SystemAdmin
294 Posts
ACCEPTED ANSWER

Pinned topic best way to compare sum1 vs sum2 where join is not possible

‏2013-01-08T07:43:56Z |
i want to compare sum value from table_1 VS sum from table_2.
the rule set force me to perform join between the tables.
the join is not needed.
is there a way to do such compare without jion?
Updated on 2013-01-09T14:17:11Z at 2013-01-09T14:17:11Z by smithha
  • smithha
    smithha
    144 Posts
    ACCEPTED ANSWER

    Re: best way to compare sum1 vs sum2 where join is not possible

    ‏2013-01-08T13:05:18Z  in response to SystemAdmin
    Hi,

    There was a similar question posed in the "Table Rows Count" thread, so you can see some of the same discussion there http://www.ibm.com/developerworks/forums/thread.jspa?threadID=465654&tstart=0

    The simplest way is to create a rule for each table such as FieldX Exists and bind that to the key field. Each rule will generate statistics for Total Records and Total Rows Met (those should generally be equal on a key field).

    Then create a Metric that compares the Total Records for table1 and the Total Records for table2. Assuming either one could be higher than the other a good Metric expression would be: *ABS(ruleA Total Records - ruleB TotalRecords) *
    Include a Benchmark for the Metric so that you can indicate what variance, if any, you tolerate.

    For more on logic for Metrics see: http://publib.boulder.ibm.com/infocenter/iisinfsv/v8r7/topic/com.ibm.swg.im.iis.ia.quality.doc/topics/dq_functions_syntax_metrics.html

    Harald
    • SystemAdmin
      SystemAdmin
      294 Posts
      ACCEPTED ANSWER

      Re: best way to compare sum1 vs sum2 where join is not possible

      ‏2013-01-08T14:07:26Z  in response to smithha
      Hi,

      your idea will work well with count1 VS count2.
      I'm trying to compare sum VS sum.
      Metric let you use number of record or % number of records but it is not possible to use the result of sum from your rule.

      thanks
      • smithha
        smithha
        144 Posts
        ACCEPTED ANSWER

        Re: best way to compare sum1 vs sum2 where join is not possible

        ‏2013-01-08T17:52:59Z  in response to SystemAdmin
        Hi,

        You are correct that the metric approach won't work for sum vs. sum.

        If you are using v8.7 or before, there is not going to be an easy solution since there is nothing in common to bring the sums together. The two approaches that are viable in v8.7 are:
        1) Create rules to calculate the sums for each table; write the distinct sums plus a common key value to the output tables. Extract the values of the output tables via the API's and format as .csv files. Ensure you have ODBC definitions and the metadata imported for the .csv files. Add a third rule to read the .csv files and compare the sums using a join on the common key value.
        2) Put the rules into distinct Rule Stages in a DataStage/QualityStage job where you read each file, run each through their respective rules, then bring the output together and compare the sums (could be in a third Rule Stage if you've put the two sums into a common record). Will allow you to evaluate the sums against each other, but won't have the results stored in the IA repository.

        In the latest v9.1 release, there are a couple of added options taking advantage of the named output table support.
        1) Create rules to calculate the sums for each table; write the sums plus a common key to distinct named output tables; add a third rule to read the named output tables and compare the sums using a join on the common key.
        2) Create rules to calculate the sums for each table; write the sums plus a common key to the same named output table; add a third rule to read the named output table and compare the max and min value for the sum column for the common key (avoids a join and minimizes the number of named output tables needed even where you have multiple sums you want to compare).

        Harald
        • SystemAdmin
          SystemAdmin
          294 Posts
          ACCEPTED ANSWER

          Re: best way to compare sum1 vs sum2 where join is not possible

          ‏2013-01-09T08:01:33Z  in response to smithha
          Hi,

          Thank you for your replay.
          Is there is a way to upgrade only the information analyzer from 8.7 to 9.1 version without upgrade the datastage?

          tanks.
          • smithha
            smithha
            144 Posts
            ACCEPTED ANSWER

            Re: best way to compare sum1 vs sum2 where join is not possible

            ‏2013-01-09T14:17:11Z  in response to SystemAdmin
            You cannot upgrade one component of Information Server without upgrading another -- they share application server and repository.
            You could install a separate Information Server 9.1 instance on another server with only install Information Analyzer there. That would allow you to focus your information quality work, but you would have to assess impacts in regards to other Information Server components (e.g. Business Glossary, Metadata Workbench) that you may be using.