Query analysis
An important part of query processing is the analysis that determines how to tune the query for optimal performance.
To obtain data from data sources, clients (users and applications) submit queries in SQL to the federated database. The SQL compiler then consults information in the global catalog and the data source wrapper to help it process the query. This includes information about connecting to the data source, server attributes, mappings, index information, and nickname statistics.
The compiler considers alternative access plans, under which the query can be:
- Processed by the data sources
- Processed by the federated server
- Processed partly by the data sources and partly by the federated server
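When deciding where a query fragment should be processed, the query optimizer takes into account factors such as:
- The amount of data that needs to be processed.
- The processing speed of the data source.
- The amount of data that the fragment will return.
- The communication bandwidth.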
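To make the trade-off between these factors concrete, the following is a minimal sketch of a cost comparison. It is not the cost model that the federated database uses. The class `FragmentEstimate`, its fields, and the functions `remote_cost`, `local_cost`, and `should_push_down` are hypothetical names introduced for illustration, and the model assumes that cost can be approximated as elapsed seconds.

```python
from dataclasses import dataclass


@dataclass
class FragmentEstimate:
    """Hypothetical estimates for one query fragment (illustration only)."""
    rows_scanned: int           # amount of data that needs to be processed
    rows_returned: int          # amount of data that the fragment will return
    source_rows_per_sec: float  # processing speed of the data source
    local_rows_per_sec: float   # processing speed of the federated server
    link_rows_per_sec: float    # communication bandwidth, expressed in rows per second


def remote_cost(e: FragmentEstimate) -> float:
    """Seconds to process at the data source and ship only the result rows."""
    return e.rows_scanned / e.source_rows_per_sec + e.rows_returned / e.link_rows_per_sec


def local_cost(e: FragmentEstimate) -> float:
    """Seconds to ship every scanned row, then process at the federated server."""
    return e.rows_scanned / e.link_rows_per_sec + e.rows_scanned / e.local_rows_per_sec


def should_push_down(e: FragmentEstimate) -> bool:
    """Push the fragment down only when the remote estimate is cheaper."""
    return remote_cost(e) < local_cost(e)
```

Under this simplified model, a selective fragment on a reasonably fast source favors pushdown, whereas a fragment that returns most of its input from a slow source on a fast link can be cheaper to process at the federated server.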
Pushdown analysis is performed only on relational data sources. Nonrelational data sources use the request-reply-compensate protocol.
The following figure illustrates the steps performed by the SQL compiler when it processes a query.

The query optimizer generates local and remote access plans for processing a query fragment, based on resource cost. The federated database then chooses the plan it believes will process the query with the least resource cost.
If any of the fragments are to be processed by data sources, the federated server submits these fragments to the data sources. After the data sources process the fragments, the results are retrieved and returned to the federated server. If the federated database performed any part of the processing, it combines its results with the results retrieved from the data source. The federated server then returns the results to the client.
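The flow described above can be sketched as follows. This is an illustration under stated assumptions, not the federated server's implementation: each data source is stood in for by a plain Python callable, and the names `execute_federated` and `join_orders_to_customers`, as well as the sample sources `sales` and `crm`, are hypothetical.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


def execute_federated(
    remote_fragments: Dict[str, Callable[[], List[Row]]],
    combine: Callable[[Dict[str, List[Row]]], List[Row]],
) -> List[Row]:
    """Submit fragments to their sources, then combine the results locally."""
    # Step 1: submit each fragment to its data source and retrieve the results.
    results = {source: fragment() for source, fragment in remote_fragments.items()}
    # Step 2: perform the federated server's share of the work on those results.
    return combine(results)


def join_orders_to_customers(results: Dict[str, List[Row]]) -> List[Row]:
    """Local processing: join rows from two different data sources."""
    names = {c["id"]: c["name"] for c in results["crm"]}
    return [
        {"order": o["order"], "customer": names[o["cust_id"]]}
        for o in results["sales"]
        if o["cust_id"] in names
    ]


if __name__ == "__main__":
    rows = execute_federated(
        {
            "sales": lambda: [{"order": 1, "cust_id": 10}, {"order": 2, "cust_id": 99}],
            "crm": lambda: [{"id": 10, "name": "Ada"}],
        },
        join_orders_to_customers,
    )
    print(rows)  # the combined result is what the client would receive
```

Here the join is performed at the federated server because the two inputs come from different sources; only the per-source fragments run remotely.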
The primary task of pushdown analysis is to determine which operations can be evaluated remotely. Pushdown analysis makes this determination from the SQL statement it receives and from its knowledge of the capabilities and semantics of the remote data source. The query optimizer then evaluates the alternatives and chooses an access plan based on cost. The optimizer might choose not to perform an operation on a remote data source, even though the data source supports it, if doing so is less cost-effective. A secondary task of pushdown analysis is to attempt to rewrite the query to compensate for differences in semantics and SQL operations between the federated server and the data source, so that the query can be better optimized.
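The capability check at the heart of pushdown analysis can be sketched as a lookup against a table of what each source supports. The table `SOURCE_CAPABILITIES`, the source names, the operation labels, and the function `pushdown_analysis` are all hypothetical. In the product, this knowledge comes from the wrapper and the global catalog rather than from a hard-coded dictionary.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical capability table: which operations each source can evaluate.
SOURCE_CAPABILITIES: Dict[str, Set[str]] = {
    "source_a": {"FILTER", "JOIN", "SORT", "AGGREGATE"},
    "source_b": {"FILTER"},
}


def pushdown_analysis(operations: List[str], source: str) -> Tuple[List[str], List[str]]:
    """Split operations into remote candidates and local-only operations.

    A remote candidate is only eligible for remote evaluation. The optimizer
    can still decide, on the basis of cost, to evaluate it locally.
    """
    supported = SOURCE_CAPABILITIES.get(source, set())
    remote_candidates = [op for op in operations if op in supported]
    local_only = [op for op in operations if op not in supported]
    return remote_candidates, local_only
```

For example, `pushdown_analysis(["FILTER", "SORT"], "source_b")` marks `FILTER` as a remote candidate and leaves `SORT` to the federated server, which must then compensate for the missing capability.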
The final access plan selected by the optimizer can include operations evaluated at the remote data sources. For those operations that are performed remotely, the SQL compiler creates efficient SQL phrased in the SQL dialect of the remote data source during the generation phase. The process of producing an optimal query plan that takes all sources into account is called global optimization.
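As an illustration of why the generation phase must target a specific dialect, the same row-limiting request is phrased differently across products. The function `limit_rows_sql` and its dialect labels are hypothetical and show only the idea; this is not how the SQL compiler generates remote statements.

```python
def limit_rows_sql(table: str, n: int, dialect: str) -> str:
    """Return a row-limited SELECT phrased for the given dialect.

    Illustration only: the table name is interpolated without escaping,
    so this must not be used with untrusted input.
    """
    if dialect == "db2":
        return f"SELECT * FROM {table} FETCH FIRST {n} ROWS ONLY"
    if dialect == "sqlserver":
        return f"SELECT TOP ({n}) * FROM {table}"
    if dialect in ("mysql", "postgresql"):
        return f"SELECT * FROM {table} LIMIT {n}"
    raise ValueError(f"unsupported dialect: {dialect}")
```

A statement that is valid for one source can be rejected by another, which is why the remote SQL is produced only after the optimizer has fixed which operations run at which source.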
For nonrelational sources, the wrappers use the request-reply-compensate protocol.