Federation: A walk through a basic federated query

At a high level, the basic steps in a federated query are:

A user or application submits a query.
The federated server decomposes the query by source.
The federated server and wrappers collaborate on a query plan.
The federated server implements the plan through query execution.
Wrappers access sources through each source's API.
Sources return data to the wrappers.
Wrappers return the data to the federated server.
The federated server compensates for work that the data sources are unable to do and combines data from different sources.
The federated server returns data to the user or application.

After a query is submitted, the federated server consults its system catalog. It looks for information such as which tables or other data stores contain the data to be retrieved and which wrappers have been designated to retrieve it.
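
The catalog lookup can be pictured as a mapping from each object named in the query to the wrapper and data source that serve it. The following is a minimal sketch under stated assumptions: the CatalogEntry fields, the CATALOG contents, and the resolve function are hypothetical names chosen for illustration, not the federated server's actual catalog layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    name: str         # name the query uses for the object
    wrapper: str      # wrapper designated to retrieve it
    source: str       # data source that holds it
    remote_name: str  # name of the object at the source


# Hypothetical catalog contents, for illustration only.
CATALOG = {
    "ORDERS": CatalogEntry("ORDERS", "relational_wrapper", "SALES_DB", "APP.ORDERS"),
    "REVIEWS": CatalogEntry("REVIEWS", "file_wrapper", "REVIEW_FILES", "reviews.csv"),
}


def resolve(names):
    """Group the objects referenced by a query by the wrapper that serves them."""
    by_wrapper = {}
    for name in names:
        entry = CATALOG[name.upper()]  # KeyError if the object is not registered
        by_wrapper.setdefault(entry.wrapper, []).append(entry)
    return by_wrapper


routing = resolve(["orders", "reviews"])
for wrapper, entries in routing.items():
    print(wrapper, [e.remote_name for e in entries])
```

Grouping by wrapper mirrors the decomposition step: everything a single wrapper can reach is a candidate for being sent to that wrapper together.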

The federated server devises alternative strategies, called access plans, for evaluating the query. Such a plan might call for parts of the query to be processed by the data sources, by the federated server, or partly by the sources and partly by the federated server. The federated server chooses among the plans primarily on the basis of cost.
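
As a rough illustration of cost-based selection, the sketch below compares three candidate access plans that place the work at the sources, at the federated server, or split between them. The AccessPlan class, the three cost components, and all numbers are assumptions made up for this example; a real optimizer uses a far richer cost model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessPlan:
    description: str
    source_cost: float    # estimated work done at the data sources
    transfer_cost: float  # estimated cost of shipping rows to the federated server
    server_cost: float    # estimated work done by the federated server

    @property
    def total_cost(self):
        return self.source_cost + self.transfer_cost + self.server_cost


# Hypothetical candidates with invented cost estimates.
candidates = [
    AccessPlan("filter and join at the source", 40.0, 5.0, 1.0),
    AccessPlan("filter at the source, join at the federated server", 25.0, 30.0, 20.0),
    AccessPlan("fetch everything, process at the federated server", 5.0, 120.0, 60.0),
]

# Choose the plan with the lowest estimated total cost.
best = min(candidates, key=lambda plan: plan.total_cost)
print(best.description, best.total_cost)
```

In this invented data set, doing more work at the source wins because it sharply reduces the number of rows that have to be transferred.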

The optimizer generates pieces of the original query, called query fragments. The federated server submits each query fragment to a wrapper in a request, and the wrapper responds with a reply. The reply tells the optimizer which sub-fragments (such as select-list elements and predicates) of that query fragment the wrapper can execute. This set of sub-fragments is called the accepted fragment. In the reply, the wrapper also estimates the cost (in time) of evaluating the accepted fragment and the number of rows it will produce. The optimizer then compensates for the sub-fragments that the data source cannot handle by adding them to the federated server's portion of the query plan. The whole process by which the wrapper and the federated server interact during query planning is called the Request-Reply-Compensate (RRC) protocol.
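
The request, reply, and compensate exchange can be sketched in a few lines. This is a toy model, not the real wrapper interface: Request, Reply, SimpleWrapper, and compensate are hypothetical names, predicates are simplified to (column, operator, value) tuples, and the cost and row estimates use an invented rule of thumb.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    table: str
    select_list: tuple
    predicates: tuple  # each predicate is (column, operator, value)


@dataclass(frozen=True)
class Reply:
    accepted_select: tuple
    accepted_predicates: tuple
    est_cost: float  # estimated time to evaluate the accepted fragment
    est_rows: int    # estimated number of rows returned


class SimpleWrapper:
    """Hypothetical wrapper for a source that supports only some operators."""

    def __init__(self, supported_ops, table_rows):
        self.supported_ops = set(supported_ops)
        self.table_rows = table_rows

    def plan(self, request):
        # Accept only the predicates the source is able to evaluate.
        accepted = tuple(p for p in request.predicates if p[1] in self.supported_ops)
        # Invented estimate: each accepted predicate keeps one row in ten.
        rows = self.table_rows
        for _ in accepted:
            rows = max(1, rows // 10)
        return Reply(request.select_list, accepted,
                     est_cost=self.table_rows / 100, est_rows=rows)


def compensate(request, reply):
    """Return the predicates the federated server must evaluate itself."""
    return tuple(p for p in request.predicates if p not in reply.accepted_predicates)


request = Request("ORDERS", ("id", "total"),
                  (("total", ">", 100), ("note", "LIKE", "%rush%")))
wrapper = SimpleWrapper({"=", "<", ">"}, table_rows=10_000)
reply = wrapper.plan(request)
leftover = compensate(request, reply)
print("accepted:", reply.accepted_predicates)
print("compensated at the federated server:", leftover)
```

Here the source cannot evaluate LIKE, so that predicate is left out of the accepted fragment and becomes part of the federated server's portion of the plan.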

The federated server analyzes combinations of the plans for individual fragments to determine the best overall plan.
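
One way to picture this final step is an exhaustive search over one plan choice per fragment, adding the cost of combining the results at the federated server. The sketch below assumes made-up cost and row estimates and a simple linear combine cost; fragment_plans, combine_cost, and best_overall are hypothetical names, and a real optimizer would prune the search rather than enumerate every combination.

```python
from itertools import product

# Hypothetical alternatives per fragment: (label, estimated cost, estimated rows).
fragment_plans = {
    "orders": [("pushdown filter", 12.0, 1_000), ("full scan", 4.0, 50_000)],
    "reviews": [("pushdown filter", 9.0, 200), ("full scan", 3.0, 8_000)],
}


def combine_cost(row_counts, per_row=0.001):
    """Assumed cost of combining fragment results at the federated server."""
    return sum(row_counts) * per_row


def best_overall(plans):
    """Try every combination of per-fragment plans and keep the cheapest."""
    names = list(plans)
    best_choice, best_cost = None, float("inf")
    for combo in product(*(plans[name] for name in names)):
        cost = sum(plan_cost for _, plan_cost, _ in combo)
        cost += combine_cost(rows for _, _, rows in combo)
        if cost < best_cost:
            best_choice = {name: label for name, (label, _, _) in zip(names, combo)}
            best_cost = cost
    return best_choice, best_cost


choice, total = best_overall(fragment_plans)
print(choice, total)
```

With these invented numbers, the cheapest plan for each fragment taken in isolation is the full scan, yet the best overall plan pushes both filters down, because the smaller results are much cheaper to combine. That is why the plans are evaluated in combination rather than fragment by fragment.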