
An SQL join operation matches rows from two or more related tables according to a condition and creates a single data set with columns selected from across the multiple tables. Typically, the condition matches keys from the two tables by an equality predicate. Join operators perform the same function in a data flow, and support inner joins, outer joins, and cross-joins of multiple input data sets.
With a join operator, you can perform the following types of joins:
You can also specify columns to project from the result of the join operation because a join operator provides the same functionality as project operators, allowing you to add or remove columns, and define scalar and arithmetic functions. The single result data set contains each of the selected columns from the multiple input tables.
You can access the expression builder to create conditions and define column expressions.
Example
You want to select columns from two related tables. The first table is a dimension table. Col1 is the key and Col2 is a dimension attribute:Table1
Col1 Col2
1 A
2 B
3 C
4 D
Table2
Col1 Col2 Col3
1 2 27
2 1 32
3 1 16
4 3 28
5 2 19
6 4 5
You create an inner join by defining a condition that compares column 1 of table 1 and column 2 of table 2:
table1.col1 = table2.col2
You remove one of the key columns that you used in the condition from the select list to create the following data set:
Col1 Col2 Col3
1 B 27
2 A 32
3 A 16
4 C 28
5 B 19
6 D 5
The following partial data flow shows an inner join of three tables. The result of the join is sent to an order by operator in this case. Note that only the Inner output port is available in this case. You cannot outer join more than two tables with one join operator, so the Outer ports are unavailable. (The validation error for the order by operator shows that its result needs to be connected to another operator to complete the data flow.)

See Creating a data flow that populates the MARTS.ORDER_FACT table in the SQL Warehousing tutorial for an example of using the table join operator.