Federation: Request-reply-compensate protocol

During query planning, the optimizer generates sub-pieces of the original query submitted by the user's application called query fragments. A query fragment can contain tables, predicates, and head expressions. Head expressions are expressions found in the SELECT clause of a query. The optimizer then submits each query fragment to a wrapper in a request.

By definition, all the data needed to evaluate a fragment comes from one data source. However, the processing of this data can be done by the foreign server, by the federated server, or some by each. The wrapper indicates which sub-pieces of the fragment, called sub-fragments, it can evaluate, and puts this information in the reply to the request. The wrapper must either accept or reject an entire sub-fragment.

The reply contains:

accepted nicknames, predicates and head expressions
cost and cardinality estimates for the accepted fragment
ordering properties, if any, for the results that will be returned
Wrapper Execution Descriptor (described in the following text)

For the purposes of this document, "quantifier" and "nickname" are interchangeable.

For a single query, the optimizer typically generates many requests for each wrapper, each request representing a different fragment of the original query. For each such request, the wrapper generates zero, one, or more replies. Each reply represents a different accepted fragment. An accepted fragment is a fragment the wrapper or data source can evaluate itself. Each reply contains the associated cost and cardinality estimates for the accepted fragment.

By the end of query planning, the optimizer will have made a cost-based decision. It will have come up with a plan incorporating some set of the accepted fragments offered up by the wrapper in response to requests. It is this particular set of accepted fragments that the wrapper will eventually be asked to execute.

The optimizer will ensure that any sub-fragments that are not part of the accepted fragments in the plan it has chosen will be evaluated by the federated server. Examples of this include a complex predicate or a sort that is beyond the capability of the data source in question. This includes all cross-source joins as well as any other expressions (such as function invocations) that mix data from multiple sources since a fragment is single-source. This is called compensation.

Overall, the whole process by which the wrapper and the federated server interact during compilation is called the Request-Reply-Compensate (RRC) protocol.

The wrapper can determine the cost and cardinality estimates to put in the replies on its own or it can reuse a default cost model provided by the federated server. It can also selectively replace parts of the default cost model so as to improve its accuracy without building a new model from scratch. The default model uses statistics that the wrapper can calculate from the data, or they can be supplied via DDL.

As part of the reply, wrappers also have to provide a Wrapper Execution Descriptor. This is a black box whose content is up to the wrapper. The wrapper must be able to submit the accepted fragment that the reply represents to the data source. If the optimizer uses the accepted fragment in the execution plan it ultimately selects, the Wrapper Execution Descriptor will be returned to the wrapper when it is time to run the query. In between, the Wrapper Execution Descriptor resides in the federated server catalogs, as part of the access plan for a precompiled query.

The wrapper for the source has complete control over the persistent representation of the remote query in the Wrapper Execution Descriptor.