Data processing
The purpose of a module is to extract some information from its input, and generate some contextual information and metrics as output.
Limitations
As this is a sample module, some protocols may have only limited implementation. For example, for LDAP, only search type requests are extracted. You can expand the information that can be decoded for a protocol using the Generic TCP Module.
Event-driven processing
Processing is triggered
by a change in the input (for example, additional payload), which
leads to an invocation of the module’s process function.
Each invocation
of the module’s process function is provided with
three parameters: the wrt_module_instance_t initialized
by init, a wrt_api_session_t handle,
and a wrt_api_data_t handle. The session handle is
common for each call to process for the same network
session (for example, TCP session); this handle can be used to maintain
state between calls to process. The data handle provides
access to the currently available request/reply data, context, and
metrics.
TCP-based protocols are usually stateful, which means
some state must be stored between calls to the module’s process function
to decode them. Even non-stateful protocols may require some state
passing, as the processing may be provided with partial data that
must be either processed immediately, or buffered by the module. Both
scenarios are described in the following sections.
Storing state
As mentioned above, the wrt_api_session_t handle
may be used to maintain state between calls to process.
This is done with the set_userdata and get_userdata functions
of wrt_module_api_t.
For example, to store some data in the session, use the following code:
void my_destructor(wrt_api_session_t session, void *data) {
free(data);
}
...
void *userdata = malloc(sizeof(long)); /* any data that fits in void* */
wrt_api_status_t status = api->set_userdata(session, userdata, &my_destructor);
If the call to set_userdata succeeds
(that is, it returns zero), retrieve the value later with get_userdata.
When the session terminates, the destructor (if specified) will be
invoked with the session and the userdata as
arguments.
To retrieve the userdata, call the
API as follows:
void *userdata;
wrt_api_status_t status = api->get_userdata(session, &userdata);
If no data was previously set, get_userdata returns WRT_API_STATUS_NODATA;
otherwise it copies the value into the provided pointer, and then
returns WRT_API_STATUS_OK (zero). For a new session, userdata is
always unset. A common pattern for initializing state for session
decoding is to first call get_userdata, check if WRT_API_STATUS_NODATA was
returned, and if so create a new state object and call set_userdata.
struct my_session_state *state = NULL;
wrt_api_status_t status = api->get_userdata(session, (void**)&state);
if (status == WRT_API_STATUS_NODATA)
{
state = malloc(sizeof(struct my_session_state));
/* init state */
status = api->set_userdata(session, state, &destroy_state);
if (status != WRT_API_STATUS_OK)
{
/* catastrophic failure: could not set state. */
}
}
Buffering data
In order to minimize resource requirements, the module container does not retain payload data after it has provided it to a module. If a module is presented with partial data, and the module cannot process the data until it is received in its entirety, the module must perform its own buffering.
To buffer data, use the session state and userdata mechanism described above. For example, you could store a state structure which contains the amount of data buffered so far, and a pointer to a heap-allocated copy of the data. The API flow in process is similar to the following:
- Obtain or initialize the session state, using the pattern described above.
- Obtain the request/reply payload data, accumulating it into any previously buffered payload data.
- Process the currently buffered data, and retain only the unprocessed data.
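The steps above can be sketched roughly as follows. This is a minimal illustration, not part of the module API: struct my_buffer_state, my_buffer_append, and my_buffer_consume are hypothetical names, and the sketch assumes the module has already obtained a pointer to the payload and its length. In a real module, the structure would be stored in the session with set_userdata and released by the destructor.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-session buffer state; not defined by the module API. */
struct my_buffer_state {
    unsigned char *data; /* heap-allocated copy of unprocessed payload */
    size_t len;          /* number of bytes currently buffered */
    size_t cap;          /* allocated size of data */
};

/* Append a payload fragment to the buffer.
 * Returns 0 on success, or -1 on overflow or allocation failure,
 * in which case the existing buffer contents are left unchanged. */
static int my_buffer_append(struct my_buffer_state *st,
                            const void *payload, size_t n)
{
    size_t need;

    if (n == 0)
        return 0;
    if (n > (size_t)-1 - st->len)
        return -1; /* st->len + n would overflow */
    need = st->len + n;

    if (need > st->cap) {
        size_t cap = st->cap ? st->cap : 64;
        unsigned char *p;

        /* Grow geometrically so repeated appends stay cheap. */
        while (cap < need) {
            if (cap > (size_t)-1 / 2) {
                cap = need;
                break;
            }
            cap *= 2;
        }
        p = realloc(st->data, cap);
        if (p == NULL)
            return -1;
        st->data = p;
        st->cap = cap;
    }
    memcpy(st->data + st->len, payload, n);
    st->len += n;
    return 0;
}

/* Discard the first n bytes (already processed) and keep the rest. */
static void my_buffer_consume(struct my_buffer_state *st, size_t n)
{
    assert(n <= st->len);
    if (n == 0)
        return;
    memmove(st->data, st->data + n, st->len - n);
    st->len -= n;
}
```

Growing the buffer geometrically is one possible design choice; a module that knows the message length in advance (for example, from a length prefix) could allocate the exact size instead.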
Contextual information and metrics
When
a module processes some data, it may choose to send it along to the
next module in the processing chain, typically with some additional
information that it has extracted from the input. The data that is
sent is transferred through a wrt_api_data_t handle.
A wrt_api_data_t has
associated context information (e.g. the source and destination IP
addresses, source and destination TCP ports), and some metrics (e.g.
the request/reply response time, request timestamp, reply timestamp).
Each module in a processing chain may add to or modify the values
in the set, but never remove information. Thus, all input context
and metrics are implicitly output; only their values may be modified,
and additional context and metrics may be added.
To set context
and metrics, a module requires a unique numeric ID for each context
and metric item as described in “Module initialization”. These IDs
are provided to the module via a wrt_module_config_t structure,
and the module supplies them to calls to the get/set_metric and get/set_context API
functions.
Context example
The following example shows calls to get and set context. It assumes that a context ID for the context item baz was obtained as described in “Module initialization”.
wrt_context_id_t baz_id; /* Assigned in foo_init_function. */
wrt_context_type_t ctx_type;
const void *ctx_value;
size_t ctx_size;
wrt_api_status_t status;
status = api->get_context(data, baz_id, &ctx_type, &ctx_value, &ctx_size);
/* Do something with the value, then update it. */
status = api->set_context(data, baz_id, ctx_type, ctx_value, ctx_size);
There are various in-built context items, depending on the underlying protocol. The numeric IDs for these context items can be obtained in the same way as described previously. For request/reply TCP or IPv4, the keys of the context items are:
| Context key | Description | Type |
|---|---|---|
| tcp.srcport | Source TCP port | uint16 |
| tcp.dstport | Destination TCP port | uint16 |
| ipv4.srcaddr | Source IPv4 address | ipv4 |
| ipv4.dstaddr | Destination IPv4 address | ipv4 |
| ipv4.origsrcaddr | Original source IPv4 address | ipv4 |
| ipv4.origdstaddr | Original destination IPv4 address | ipv4 |
ipv4.srcaddr may be updated
to represent a source address other than the actual address; for example,
an HTTP module may report the address from the X-Forwarded-For header.
The value of ipv4.origsrcaddr should always be the actual source IPv4
address (for example, that of the proxy server).
Metrics example
Metrics are handled
similarly to context. See below for an example of using get_metric and set_metric:
wrt_metric_id_t server_time_id; /* Assigned in foo_init_function. */
wrt_metric_type_t type;
wrt_metric_value_t value;
wrt_api_status_t status;
status = api->get_metric(data, server_time_id, &type, &value);
/* wrt_metric_value_t is a union of basic integer types.
* If you don't know the type of the metric ahead of time,
* check the "type" variable updated by get_metric, and switch
* on the result. For brevity, we assume a specific type here. */
/* The Server Time metric is an unsigned 64-bit integer. */
value.u64 += 42;
status = api->set_metric(data, server_time_id, type, value);
- tcp.response_time.total
- tcp.response_time.server
- tcp.response_time.network
- tcp.response_time.load
- tcp.response_time.resolve
- tcp.response_time.client_render
See the Enhanced network timing calculations for Web Response Time metrics in the Administrator's Guide for definitions of these metrics.
Trace logging
To enable you to debug a module,
the API provides two functions for logging: init_log and log_message.
init_log is
an optional function for registering a logging handle with a specified
filename. The filename is used only for identifying log messages.
A typical call to init_log looks like:
wrt_api_log_handle_t log_handle;
wrt_api_status_t status = api->init_log(__FILE__, &log_handle);
log_message formats and logs a
message to the module container’s log, optionally specifying a log
handle initialized with init_log. The format is the
same as in the C89/C99 vprintf function. Calls to log_message specify
a log level, which is interpreted by the container to determine whether
or not to log the message. A typical call to log_message looks like:
api->log_message(log_handle, WRT_API_LOG_ERROR,
__func__, __LINE__,
"send_data failed with status code %d",
(int)status);
The log handle parameter may optionally be NULL, in which case the log message is associated with a filename of the module container’s choosing, instead of a filename specified by the module.
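To make the behavior described above more concrete, the following sketch imitates what a container might do with a log call: filter by level, fall back to a name of its own choosing when no filename is registered, and expand a printf-style format. Everything here is an assumption for illustration: my_log_level_t, my_log_threshold, my_format_log, the "container" fallback name, and the output layout are hypothetical and do not describe the actual module container.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical log levels; lower values are more severe. */
typedef enum {
    MY_LOG_ERROR = 0,
    MY_LOG_WARNING = 1,
    MY_LOG_DEBUG = 2
} my_log_level_t;

/* Hypothetical container setting: messages less severe than this are dropped. */
static my_log_level_t my_log_threshold = MY_LOG_WARNING;

/* Format a log line into out.
 * Returns the number of characters written, 0 if the message was
 * suppressed by the level threshold, or -1 on error or truncation. */
static int my_format_log(char *out, size_t out_size, const char *file,
                         my_log_level_t level, const char *func, int line,
                         const char *fmt, ...)
{
    va_list ap;
    int prefix, body;

    assert(out != NULL && out_size > 0);
    out[0] = '\0';
    if (level > my_log_threshold)
        return 0;

    /* A NULL file plays the role of a NULL log handle. */
    prefix = snprintf(out, out_size, "%s:%s:%d: ",
                      file != NULL ? file : "container", func, line);
    if (prefix < 0 || (size_t)prefix >= out_size)
        return -1;

    va_start(ap, fmt);
    body = vsnprintf(out + prefix, out_size - (size_t)prefix, fmt, ap);
    va_end(ap);
    if (body < 0 || (size_t)body >= out_size - (size_t)prefix)
        return -1;
    return prefix + body;
}
```

Because the level check happens before any formatting, suppressed messages cost little, which is one reason a module can leave debug-level calls in place.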