Avoiding unnecessary outer joins
The semantics of certain queries require outer joins (either left, right, or full). However, if the query semantics do not require an outer join, and the query is being used to deal with inconsistent data, it is best to deal with the inconsistent data problems at their root cause.
For example, in a data mart with a star schema, the fact table
might contain rows for transactions but no matching parent dimension
rows for some dimensions, due to data consistency problems. This could
occur because the extract, transform, and load (ETL) process could
not reconcile some business keys for some reason. In this scenario,
the fact table rows are left outer joined with the dimensions to ensure
that they are returned, even when they do not have a parent. For example:
SELECT...
FROM DAILY_SALES F
LEFT OUTER JOIN CUSTOMER C ON F.CUST_KEY = C.CUST_KEY
LEFT OUTER JOIN STORE S ON F.STORE_KEY = S.STORE_KEY
WHERE
C.CUST_NAME = 'SMITH'
The left outer join can prevent
a number of optimizations, including the use of specialized star-schema
join access methods. However, in some cases the left outer join can
be automatically rewritten to an inner join by the query optimizer.
In this example, the left outer join between CUSTOMER and DAILY_SALES
can be converted to an inner join because the predicate C.CUST_NAME
= 'SMITH'
will remove any rows with null values in this column,
making a left outer join semantically unnecessary. So the loss of
some optimizations due to the presence of outer joins might not adversely
affect all queries. However, it is important to be aware of these
limitations and to avoid outer joins unless they are absolutely required.