Topic
1 reply Latest Post - ‏2012-12-07T14:27:01Z by RobertDickson
SystemAdmin
SystemAdmin
7754 Posts
ACCEPTED ANSWER

Pinned topic What stage to use to get a SQL UNION effect in datastage?

‏2012-12-06T10:50:54Z |
Hello,
How to design a job that takes the union of multiple tables so that the end table contains
1. all the columns present in each table (columns that are common in all tables only occur once) and
2. the rows represent a union of all the data present in all the tables. So row data is inserted in their respective columns and nulls in any extra columns where they do not exist in their source tables.
In the Developer Guide, I was inclined to use the funnel stage but it says the metadata for all the tables must be the same. Some columns are present in all the multiple tables and some are unique to certain tables. Will it transfer those records whose columns metadata don't match? i.e extra cols?E.g.
Table 1:
ID EMP_ID STATUS LEAVE
Table 2:
ID EMP_ID FUNCTION COMMENTS START END
Table 3:
ID EMP_ID STATUS LEAVE START END
Table 4:
ID EMP_ID STATUS PENSION DEPT

The Final Table should have a union of all the above table rows and the following columns:
ID EMP_ID STATUS LEAVE FUNCTION COMMENTS START END PENSION DEPT

Assume the first ID, EMP_ID are integers in all and all other columns are VARCHAR.

I hope I was clear enough, please let me know if I need to clarify the question any further.

Regards
Updated on 2012-12-07T14:27:01Z at 2012-12-07T14:27:01Z by RobertDickson
  • RobertDickson
    RobertDickson
    33 Posts
    ACCEPTED ANSWER

    Re: What stage to use to get a SQL UNION effect in datastage?

    ‏2012-12-07T14:27:01Z  in response to SystemAdmin
    Hi,

    Funnel requires the metadata be the same. You could read each source, use a Transformer to create a common output, and then Funnel. But I don't think that will give you what you want. I THINK (not sure) you want one record will all possible columns for ID+EMP_ID keys. If this is the case, then this sounds like a series of Joins using ID and EMP_ID as the join keys and then mapping the required columns to the output.

    However, if all tables exist in the same database, it may be better to do the SQL in the database.

    If I am guessing incorrectly, then can you please provide an example with data?

    Regards,
    Robert