Using IBM AIX RAS infrastructure tracing services

For kernel extensions and device drivers

This article explains how to use the IBM® AIX® reliability, availability, and serviceability (RAS) infrastructure services to enable tracing in kernel extensions and device drivers. The AIX RAS infrastructure manages core error logging, component tracing, and dump facilities for kernel extensions and device drivers, and provides the interfaces to register components and control their trace levels. The article illustrates these concepts with a sample implementation and lists best practices for interfacing with the RAS infrastructure.

Premkumar Nagarajan (premkuna@in.ibm.com), staff system engineer, IBM

Premkumar Nagarajan is a staff system engineer on the AIX RAS development team. He works on features that bring reliability, availability, and serviceability capabilities to AIX. Before joining IBM, he worked on storage technologies and core kernel component development.



01 March 2013


Overview

The IBM AIX RAS infrastructure models AIX and third-party components as a hierarchy so that it can provide error logging, component tracing, and dump facilities for each of them. The hierarchy is built from base components, subcomponents, and component path names. This makes it possible to change the error logging, dump, and trace characteristics of an individual component. Every component is registered in a logical namespace ("<base component name>.<component specific subcomponents>") that reflects the hierarchical naming structure.


Terminology

component: Refers to an individual base component or subcomponent. In file system terms, this is analogous to an individual directory (every component is a directory as it can have further child components). Each component has a name relative to its parent and can have further child components.

base component: Refers to a component without any parent components.

subcomponent: Refers to a component with parent components.

component path: Refers to a full component name. In file system terms, this is analogous to the full path to a file.

registrant: Refers to the subsystem or code that is dealing with the component hierarchy.

registration: Registrants use the ras_register and ras_unregister services to register and unregister components. Each component is assigned a ras_block_t during registration, which serves as a handle for controlling that component's RAS characteristics. In this article, the component registers itself as trace aware.
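
The naming terms above can be pictured with a small user-space model. The sketch below is not the AIX RAS API: the component struct and the component_path helper are hypothetical names, used only to show how a component path could be assembled from a base component and its subcomponents, joined with dots as in the logical namespace.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one node in the component hierarchy. */
struct component {
    const char *name;               /* name relative to the parent */
    const struct component *parent; /* NULL for a base component */
};

/*
 * Write the full component path ("base.sub.sub") into buf.
 * Returns the path length, or -1 if buf is too small
 * (the contents of buf are then unspecified).
 */
static int component_path(const struct component *c, char *buf, size_t size)
{
    int used = 0;
    size_t len;

    assert(c != NULL && buf != NULL);

    if (c->parent != NULL) {
        /* Emit the parent's path first, then the separator. */
        used = component_path(c->parent, buf, size);
        if (used < 0 || (size_t)used + 1 >= size)
            return -1;
        buf[used++] = '.';
    }
    len = strlen(c->name);
    if ((size_t)used + len >= size)
        return -1;
    memcpy(buf + used, c->name, len + 1);   /* also copies the NUL */
    return used + (int)len;
}
```

With a base component named "ethernet" and a subcomponent named "eth0" (both illustrative names), the helper yields the component path "ethernet.eth0".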


RAS kernel service routines

typedef long (*ras_callback_t)(
              ras_block_t,
              long,
              void *,
              void *);
 
long ras_register(
              ras_block_t *rasbp,
              char *name,
              ras_block_t parent,
              ras_type_t typesubtype,
              char *desc,
              long flags,
              ras_callback_t ras_callback,
              void *private_data);

When a base component is registered, a typesubtype value is registered with it. This field provides a hint as to the base component's function. For example, the scdisk component would be registered with a typesubtype of RAS_TYPE_STORAGE_DISK. This denotes a parent-child type relationship with respect to the component naming specification.

Syntax:

long (*ras_callback)(
              ras_block_t ras_blk,
              long command,
              void *arg,
              void *private_data);
 

#include <sys/ras.h>
 
long ras_customize(ras_block_t ras_blk);

#include <sys/ras.h>
 
long ras_control(
              ras_block_t ras_blk,
              long command,
              void *arg,
              long argsize);

To make its customized properties persistent, a component goes through a three-step process:

  • Call ras_register to specify a name and set the properties.
  • Call ras_control to set the component's default settings through the RAS infrastructure. For example, you can set the trace buffer size and the error logging level.
  • Call ras_customize, which is synchronous, to validate the settings and make them persistent. The registrant can be informed through its callback of what has changed, but the RAS infrastructure does all the work, such as calling ras_control to change the properties that are customized for the component. Callback commands pass through the component before the RAS infrastructure takes any default action.

Sample kernel extension

#include <errno.h>
#include <syslog.h>
#include <sys/device.h>
#include <sys/ras_trace.h>
#include <sys/ras.h>
#include <sys/trchkid.h>
ras_block_t rascb;
#define  HKWD_SAMPLE 0xcccc

/* Trace five data words to both the system trace and the private buffer. */
#define CTRC_HOOK(level, hook, tag, d1, d2, d3, d4) \
        CT_HOOK5(rascb, level, MT_SYSTEM|MT_PRIV, \
          hook, *(ulong *)(tag), \
          (ulong)(d1), (ulong)(d2), (ulong)(d3), (ulong)(d4))


/* Callback invoked by the RAS infrastructure; matches ras_callback_t. */
long
hello_callback(ras_block_t cb,
               long cmd,
               void *arg, void *callback_data)
{
        long rc;

        switch (cmd) {
        /* Component Trace commands */
        case RASCT_MEMTRC_ON:
                rc = ras_control(cb, RASCT_SET_ALLOC_BUFFER, 0, 0);
                if (rc) break;
                /* fall through */
        case RASCT_MEMTRC_RESUME:
                rc = ras_control(cb, RASCT_SET_MEMTRC_RESUME, 0, 0);
                break;
        case RASCT_MEMTRC_OFF:
                rc = ras_control(cb, RASCT_SET_MEMTRC_SUSPEND, 0, 0);
                if (rc) break;
                rc = ras_control(cb, RASCT_SET_FREE_BUFFER, 0, 0);
                break;
        case RASCT_MEMTRC_SUSPEND:
                rc = ras_control(cb, RASCT_SET_MEMTRC_SUSPEND, 0, 0);
                break;
        case RASCT_MEMTRC_LVL:
                rc = ras_control(cb, RASCT_SET_MEMTRC_LVL, arg, 0);
                break;
        case RASCT_SYSTRC_ON:
                /* Apply cmd to support API, although we do not use systrace */
                rc = ras_control(cb, RASCT_SET_SYSTRC_ON, 0, 0);
                break;
        case RASCT_SYSTRC_OFF:
                /* Apply cmd to support API, although we do not use systrace */
                rc = ras_control(cb, RASCT_SET_SYSTRC_OFF, 0, 0);
                break;
        default:
                rc = -1;
        }

        return rc;
}
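
/* Configuration entry point: register the component and enable tracing. */
int hello_init(int cmd, struct uio *uio)
{
        kerrno_t krc = 0;
        char *name = "mydriver";

        rascb = RAS_BLOCK_INVALID;

        /* Step 1: register a trace-aware base component (no parent). */
        if ((krc = ras_register(&rascb, name, NULL, RAS_TYPE_FILESYSTEM,
            "All ethernet devices", RASF_TRACE_AWARE, hello_callback, NULL))) {
                /* Any nonzero errno signals failure; EINVAL is an arbitrary choice. */
                return EINVAL;
        }

        /* Step 2: set the default properties through ras_control. */
        if ((krc = ras_control(rascb, RASC_LOGICAL_ALIAS, 0, 0))) {
                goto ras_error;
        }
        if ((krc = ras_control(rascb, RASCT_SET_SYSTRC_ON, 0, 0))) {
                goto ras_error;
        }
        if ((krc = ras_control(rascb, RASCT_SET_MEMBUFSIZE,
                               (void *)2048, 0))) {
                goto ras_error;
        }
        if ((krc = ras_control(rascb, RASCT_SET_ALLOC_BUFFER, 0, 0))) {
                goto ras_error;
        }
        if ((krc = ras_control(rascb, RASCT_SET_MEMTRC_RESUME, 0, 0))) {
                goto ras_error;
        }

        /* Step 3: validate the settings and make them persistent. */
        if ((krc = ras_customize(rascb))) {
                goto ras_error;
        }

        CTRC_HOOK(0, HKWD_SAMPLE, "ACLE", "BCLE", "CCLE", "DCLE", "ECLE");

        return 0;

ras_error:
        /* Undo the registration so that no half-configured component remains. */
        ras_unregister(rascb);
        return EINVAL;
}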

Sample output from the kernel extension above:

Extract the trace data from the component buffer:

mtrcsave -M rare -C all -d <dir>

Display the trace (the output can be formatted by adding an entry for HKWD_SAMPLE to /etc/trcfmt):
# trcrpt mtrcrare
..
60C   409.451976437    9433.321187           D1=ACLE D2=BCLE D3=CCLE D4=DCLE D5=ECLE
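
The D1 to D5 columns above are trace data words that print as four-character tags. The sample macro builds its first data word by dereferencing the tag string as a ulong, which depends on the word size and byte order of the machine. As a hedged alternative, the hypothetical helper tag4 below packs exactly four characters into a 32-bit value with the first character in the most significant byte, so the characters read left to right when the word is dumped in hexadecimal. It is a portable illustration and not part of the RAS API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: pack a 4-character tag such as "ACLE" into one
 * 32-bit word, first character in the most significant byte.
 * The caller must pass a string of at least four characters.
 */
static uint32_t tag4(const char *tag)
{
    assert(tag != NULL);

    return ((uint32_t)(unsigned char)tag[0] << 24) |
           ((uint32_t)(unsigned char)tag[1] << 16) |
           ((uint32_t)(unsigned char)tag[2] << 8)  |
            (uint32_t)(unsigned char)tag[3];
}
```

For example, tag4("ACLE") evaluates to 0x41434C45, the ASCII codes of A, C, L, and E in order. A value built this way could be passed wherever the sample passes a data word.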

Best practices to be considered for a traceware component

  • Two types of serialization are available for a component's access to its trace buffers. In the first type, the component delegates buffer access control to the RAS infrastructure, which serializes concurrent buffer operations such as logging from multiple code locations in the driver routines and suspend or resume operations. In the second type, the component controls the serialization itself. In that case, the component must handle some scenarios the same way the RAS infrastructure would, such as component tracing running in parallel with a buffer resize: only one of the operations may access the trace buffer at a time. CT_TRCON is the macro that indicates whether tracing is enabled or disabled for a particular component, so it can be used to decide whether to log trace data. It also indicates that the trace buffer is in the active state and not in the suspend or resume state. With component serialization, the component must hold its lock while evaluating this macro.
  • Level and buffer size

    Components can check the trace level and the component buffer size that a user sets with the ctctrl command, and can therefore adjust their tracing to the current settings.

  • Register / Unregister

    If component tracing is enabled for a driver, a kernel extension, or a subsystem that can be stopped, unregister the component when it stops. A component cannot be unregistered from the framework while it has child components, so subcomponents must be unregistered first.

    Moreover, no component trace activity should be occurring while a component is being unregistered.

    For example, given a parent component handle rasb_eth and a child handle rasb_eth0 (illustrative names that do not appear in the sample above), the child is unregistered first:

    ras_unregister(rasb_eth0);
    ras_unregister(rasb_eth);

    The ras_unregister call must be made from the process environment.

  • The __INFREQUENT macro can be placed at the start of a trace code path to tell the compiler that the path is rarely executed.

    if (CT_TRCON(rascb, CT_LVL_DETAIL))
    {
        __INFREQUENT;  /* the above condition is rare */
        if (adapter->opened)
        {
            TRACE_CODE();
        }
    }
  • Components specify a parameter in CT_HOOK that selects where the data is traced: the system trace (MT_SYSTEM), the lightweight memory trace (MT_COMMON, MT_RARE), or the component private buffer (MT_PRIV), depending on the requirement.
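
The unregistration rule from the list above (subcomponents first, then the parent) can be modeled in user space. The sketch below is not the RAS infrastructure: model_comp, model_register, and model_unregister are hypothetical names, and the child counter is only an assumption about how such a rule could be enforced.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a registered component. */
struct model_comp {
    struct model_comp *parent;  /* NULL for a base component */
    int children;               /* number of registered children */
    int registered;             /* 1 while registered */
};

/* Register c under parent, or as a base component when parent is NULL. */
static void model_register(struct model_comp *c, struct model_comp *parent)
{
    assert(c != NULL);

    c->parent = parent;
    c->children = 0;
    c->registered = 1;
    if (parent != NULL)
        parent->children++;
}

/*
 * Unregister c. Returns 0 on success, or -1 if c is not registered
 * or still has registered children.
 */
static int model_unregister(struct model_comp *c)
{
    assert(c != NULL);

    if (!c->registered || c->children > 0)
        return -1;
    if (c->parent != NULL)
        c->parent->children--;
    c->registered = 0;
    return 0;
}
```

In this model, unregistering a parent while a child is still registered fails, and the same call succeeds once the child has been unregistered, which mirrors the child-first order shown in the list.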


