This post brings the insights on client connectivity for ‘PureData System for Analytics’ at one place. It does not cover the information on installation and configuration of client drivers, details of which can be found in:
Let us categorize the details as follows
- Execute queries
- Describe Metadata
Before moving on to the details, here’s a figure that shows the analogy between different types of drivers like ODBC, JDBC and OLEDB.
Coming back to the categories, let’s discuss in details
A session is a handshake protocol for negotiation between the client and server that starts with authentication. Session parameters like database, user, password, protocol etc, can be configured while creating the Data sources in these drivers or can be provided as part of connection string. There are few advanced options available which can be configured while initiating the connection. To take ODBC as example, refer below link
On the server (PureData System for Analytics) side, nzsession command can be used to view and manage sessions. Refer link below for details
Transactions refer to unit of work that operates on database-related objects, data or both.
Transactions provide the following benefits:
- Ensure integrity among multiple operations by allowing all or none of the operations to take effect. You accomplish this by starting a transaction, performing operations and then executing either a commit or a rollback (also called an abort).
- Provide means of cancelling completed work for a series of operations that fail before finishing.
- Provide a consistent view of data to users, in the midst of changes by other users.
The transaction processing task contains five steps:
- Allocating statement handle(s)
- Preparing and executing SQL statements
- Processing results
- Committing or Rolling Back
- (Optional) Freeing statement handle(s) if the statement is unlikely to be executed again.
Client drivers provide explicit APIs to start/end transactions (e.g. SQLTransact, SQLEndTran), control on isolation level( e.g. SQLSetStmtAttr, SQLSetConnectAttr), query cancellation(e.g. SQLCancel), etc.
Once again, nzsession on server is handy to show and manage transactions.
- Execute Queries
Query execution from client driver involves multiple phases as listed below:
- Parsing the query
Minimal parsing is done is client drivers just to identify the type of query (parameter markers, multi-statement batch), normalization (white spaces, comments) and processing of escape sequences.
- Preparing the statement
Preparation of statement involves processing of vendor specific escape clauses, translation based on server supported syntax, creating the access plan and getting some metadata information for result-sets (column information – types, lengths and others). These steps are relevant for SELECT queries and no-op for rest of the SQL.
- Execution and fetching the results
This step deals with binding of data and actual execution of the query on the server. In case of multi-row operations (supported for INSERT/UPDATE/DELETE), column-wise or row-wise binding is done. This also involves any codepage conversion and fetching the result-sets (if generated) on client. Once the result-sets are available, it can be fetched in the application into bound variables.
Client drivers provides APIs to execute (e.g. SQLExecute, SQLExecDirect), fetch (e.g. SQLFetch, SQLFetchScroll), getting the row-counts (e.g. SQLRowCount) etc.
Everything is executed synchronously as there is wait/block for response from server. Also the server does not support for multiple queries to be active on a single session.
- Describe Metadata
Metadata information refers to data about data. It describes the specific characteristics of an individual data item like table, column, stored procedure, etc which is usually stored in the system catalogs.
Client drivers provide APIs to retrieve metadata information, e.g.
- GetTables (Databases, Users)
There is no client API for creating/modifying the metadata. Server provides it’s own set of system views like _v_odbc_tables1/2/3, _v_jdbc_tables1/2/3 and _v_oledb_tables1/2/3 to be utilized by client having different version suffix for cross db and schema support.
As far as linguistics are concerned, client drivers provide full unicode support for queries and data with UTF8, UTF16 or UTF32 encoding though internally everything is in UTF8 (same as server encoding).
For locale specific encoding to unicode, drivers make use of ICU libraries (except JDBC).
Error messages are in english only which means that drivers are fully internalized but not localized.
Protocol can be talked about in two ways here
- Handshake protocol
This refers to the custom protocol for initial negotiations while creating sessions.
- Data Protocol
This refers to protocol for any information to sent/received within a session. State transitions are part of it. It is partly inherited from postgres protocol and consists of two parts
- Common Header
- Single byte ASCII code – key to indicate message type/action
- Message number – 4-byte - currently only for synchronizing after cancellation
- Message length – 4-byte (not including the header fields)
- Variable-length pay-load
Both the types of protocols are versioned for extensibility.
Hope this helps – questions/comments are welcome.