Introducing InfoSphere Streams 2.0 features, Part 1: Application monitoring with metrics

C++ and SPL metrics examples

New in IBM® InfoSphere® Streams 2.0, the metrics component provides runtime access to system and user-defined statistics that can be used to monitor the inner workings of your Streams application. This article outlines the new metrics system and describes how to access and use both system and user-defined metrics as part of your Streams application.

Chris Howard (chris_howard@ie.ibm.com), Big Data Solution Architect, Office of the CTO, IBM

Chris Howard has been with IBM since 1998 and is currently a solution architect focused on Big Data solutions (InfoSphere Streams and InfoSphere BigInsights) as part of the CTO's office within IBM Software Group. In his previous role, he managed the EMEA Stream Computing Centre in Dublin and was responsible for client solution development for IBM's stream processing solution: IBM InfoSphere Streams. He is a Chartered Fellow of the British Computer Society.



15 March 2012

Introduction

This article describes the capabilities of the new metrics system accessible to InfoSphere Streams developers. Introduced with the release of Streams 2.0, the metrics system provides runtime access to application-specific metrics, providing the ability to externally monitor application statistics of interest. We provide a number of samples in the Streams Processing Language and C++.

This article is written for Streams component developers and application programmers who have Streams programming language skills and C++ skills. You can use the article as a reference, or examine and run its samples to see the techniques in action. To run the samples, you should have a general working knowledge of Streams programming.

Running the samples will require a Red Hat Enterprise Linux® box with InfoSphere Streams V2.0 or later.


Overview of the metrics system

The Streams 2.0 runtime lets you query various application metrics and use this data within a running Streams job or from an external tool or program (such as streamtool). Access to these metrics supports a range of uses, from simple application monitoring to more sophisticated real-time adaptation of the running application (load shedding, for example).

Access to these metrics is provided via runtime APIs. These allow the retrieval of a range of metric data related to the number of tuples processed, dropped, queued, etc. for input ports and output ports at the Streams operator or processing element (PE) level.

The Streams runtime provides the ability to support two primary metric types:

  • System metrics: Pre-defined counter data for operators and PEs, such as the number of tuples submitted by an operator output port.
  • User-defined metrics: Developer-defined metrics, implemented at the operator level, and used to report any application values of interest.

The sections below go into more detail on the various types of metrics and the specifics of their implementation.

Metrics data types

Each metric, whether system-provided or user-defined, takes one of three basic metric data types. The type gives some context as to the class of data the metric represents. The metric data types are listed below.

  • Counters: Used to represent metric values that increase (or decrease) over time, for example the number of tuples processed by an operator input port.
  • Gauges: Used to represent a value at a point in time that can be set to any value, such as the number of window punctuations currently queued.
  • Time: Used to represent a date/time stamp. This might be used to record the time at which a particular event occurred within the Streams application.

System metrics provided by the runtime are of a fixed data type (normally counters or gauges), whereas the developer chooses the metric data type when implementing a user-defined metric.
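
To make the three data types concrete, here is a minimal standalone sketch. SketchMetric and MetricKind are hypothetical stand-ins written for this illustration; they are not the SPL runtime classes, and only the setValue/incrementValue naming mirrors the functions described later in this article.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>

// Stand-in types for illustration only; these are NOT the SPL runtime classes.
enum class MetricKind { Counter, Gauge, Time };

class SketchMetric {
public:
    SketchMetric(std::string name, MetricKind kind)
        : name_(std::move(name)), kind_(kind), value_(0) {}

    // Overwrite the current value (natural for gauges and time stamps).
    void setValue(std::int64_t v) { value_ = v; }

    // Add a delta to the current value (natural for counters).
    void incrementValue(std::int64_t delta = 1) { value_ += delta; }

    std::int64_t getValue() const { return value_; }
    MetricKind kind() const { return kind_; }
    const std::string& name() const { return name_; }

private:
    std::string name_;
    MetricKind kind_;
    std::int64_t value_;
};
```

In this sketch the kind is only a label: a counter would typically be driven through incrementValue(), while a gauge or a time stamp would be overwritten with setValue().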


System metrics

Included by default as part of the Streams runtime, system metrics allow access to a range of values that provide insight into the internal operation of a Streams application. At the application level, the APIs available to an operator provide access to metrics associated with both that operator and the processing element containing it; the scope of metric access is limited to these.

Having this level of insight into the inner workings of an operator and its containing processing element allows an application to adapt its behavior based on the metric values. Within the application itself, potential options include shedding tuples when data rates are excessive or checkpointing an operator's state after a given number of tuples has been processed. Externally to the application, scripts or programs might poll the wider application or system-level metrics available, whether for basic monitoring or to react at a system level to events being recorded by the metrics system.
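
The two in-application adaptations mentioned above can be sketched as plain decision functions. This is an illustrative sketch only: the function names, the highWaterMark threshold and the checkpoint interval are assumptions made for the example, and the inputs stand for metric readings (such as nTuplesQueued, queueSize and nTuplesProcessed) that an operator would obtain through the runtime APIs described later.

```cpp
#include <cassert>
#include <cstdint>

// Decide whether to drop an incoming tuple, given the queued-tuple count and
// queue size read for a port. highWaterMark is a hypothetical tuning knob:
// the fraction of the queue that may fill before shedding starts.
bool shouldShed(std::int64_t nTuplesQueued, std::int64_t queueSize,
                double highWaterMark) {
    if (queueSize <= 0) {
        return false;  // no queue on this port, so there is nothing to relieve
    }
    return static_cast<double>(nTuplesQueued) >=
           highWaterMark * static_cast<double>(queueSize);
}

// Decide whether a checkpoint is due, once per `interval` processed tuples.
bool checkpointDue(std::int64_t nTuplesProcessed, std::int64_t interval) {
    return interval > 0 && nTuplesProcessed > 0 &&
           nTuplesProcessed % interval == 0;
}
```

Keeping the policy in small pure functions like these makes it easy to test the thresholds separately from the operator code that reads the metrics.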

System metrics come in two varieties: operator-level metrics and PE-level metrics. In the following section, we will explore each.

Operator metrics — Input and output port metrics

Threaded ports

A threaded port allows an operator input port to execute in its own separate thread. Tuples arrive on each named input port in a separate execution thread, even if the operator is fused with others. This allows for more concurrent execution of operators, especially in the same partition/PE. See Table 1 for a list of those metrics available only for threaded ports.

Operator-level system metrics allow for fine-grain inspection of the processing being performed by a specific operator. It is possible to query a variety of metric values for each input and output port associated with an operator — ranging from data tuple counts (processed, dropped, queued) to details of punctuation tuples such as numbers of, type (window, final) and status (processed, queued).

Having access to this level of detail for an individual operator allows for a considerable amount of diagnostic capability. This allows you to diagnose issues with an application, such as performance bottlenecks, via the monitoring of dropped or queued tuples on threaded ports (see the "Threaded ports" sidebar).

Figure 1. Operator-level metrics
Image shows a Streams operator with two input ports and one output port

Operator-level metrics are accessed via the OperatorContext class and its getMetrics() member function, which returns an OperatorMetrics object. Accessing the operator's context first requires a call to the getContext() member function of the operator.

SPL::Metric class

The SPL::Metric class, used to represent a metric object, provides a range of member functions for accessing metric name, description, kind, in addition to getting the current value. See Resources for links to the SPL Operator Runtime C++ API documentation.

Calls to the getInputPortMetric(port, metric) and getOutputPortMetric(port, metric) methods, both of the OperatorMetrics class, allow direct access to individual input and output port metrics, respectively. Here, port is the index number of the port and metric is the name of the metric to be returned (using OperatorMetrics::InputPortMetricName or OperatorMetrics::OutputPortMetricName, depending on the metric type being requested).

Listing 1. Declaring the metric header file (_h.cgt)
    ...
  
    private:
      // Members 
    
        // Note const required for getInputPortMetric and getOutputPortMetric
        Metric const & numTuplesProcessed; 

    ...
Listing 2. Access the metric through the primitive operator (_cpp.cgt)
    // Constructor
    MY_OPERATOR::MY_OPERATOR()
      : numTuplesProcessed(getContext().getMetrics()
          .getInputPortMetric(0, OperatorMetrics::nTuplesProcessed))
    {
    }    
	    
    // Tuple processing for non-mutating ports
    void MY_OPERATOR::process(Tuple const & tuple, uint32_t port)
    {

        // Declare variable to store the metric's current value
        int64_t nProcessedValue;

        ...

        nProcessedValue = numTuplesProcessed.getValueNoLock();
    
        ...
    }

NOTE: The call to getInputPortMetric() is made in the constructor's initialization list rather than in the constructor body. The member is a const reference to the metric returned by the call, and a C++ reference member must be bound when it is initialized; it cannot be assigned later in the constructor body.
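
The initialization-list requirement is a general C++ rule rather than anything specific to Streams. The following standalone sketch shows it with hypothetical stand-in types (Counter and Holder are invented for this illustration and are not part of the SPL API).

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for a metric object owned elsewhere (for example, by a runtime).
struct Counter {
    std::int64_t value;
};

class Holder {
public:
    // The reference member is bound here, in the initialization list.
    explicit Holder(Counter const& counter) : counter_(counter) {}

    // Writing "counter_ = counter;" in the constructor body instead would not
    // compile: a reference member must be initialized and cannot be reseated.

    // Reads through the reference, so it always sees the live value.
    std::int64_t read() const { return counter_.value; }

private:
    Counter const& counter_;
};
```

Because the member is a reference rather than a copy, a Holder built from a Counter reports whatever value that Counter holds at the time of each read() call.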

The following example is in SPL, this time reading the number of tuples queued at an input port. Implemented as part of the onTuple event within an SPL Custom operator, it would allow alternative action to be taken when too many tuples are queuing at the input port.

Listing 3. SPL operator getInputPortMetric example
    // Declare metric value
    mutable int64 nQueuedValue = 0; 	
	
    // Return the number of Tuples Queued for input port 0
    getInputPortMetricValue(0u, Sys.nTuplesQueued, nQueuedValue);

The following table gives details of each of the operator-level metrics available, listing metric name, scope, and description.

Table 1. Operator-level metrics provided by the system
Name                     Scope             Description
nTuplesProcessed         per input port    Number of tuples processed (counter)
nTuplesDropped*          per input port    Number of tuples dropped (counter)
nTuplesQueued*           per input port    Number of tuples queued (gauge)
nWindowPunctsProcessed   per input port    Number of window punctuations processed (counter)
nFinalPunctsProcessed    per input port    Number of final punctuations processed (counter)
nWindowPunctsQueued*     per input port    Number of window punctuations queued (gauge)
nFinalPunctsQueued*      per input port    Number of final punctuations queued (gauge)
queueSize*               per input port    Size of queue for the port or 0 if no queue (gauge)
nTuplesSubmitted         per output port   Number of tuples submitted (counter)
nWindowPunctsSubmitted   per output port   Number of window punctuations submitted (counter)
nFinalPunctsSubmitted    per output port   Number of final punctuations submitted (counter)

Metrics flagged with * are only available for input ports configured with threadedPort. For non-threaded ports, a call to getInputPortMetric() for one of these metrics will return 0.

PE metrics — Input and output port metrics

Metrics at the processing element level are similar to operator-level metrics. When dealing with operator-level metrics, we are concerned with values specific to the operator in context, whereas metrics at the processing element level relate to all operators that implement streams that cross the PE boundary.

Figure 2 illustrates this boundary concept in detail. Here, we represent a single processing element that contains four operators. Each operator has a number of input and output ports. When dealing with PE-level metrics, only the input and output ports that cross the PE boundary are available for querying (for example, the number of tuples processed from input port 0, 1, and 2 in addition to the number of tuples submitted at output port 0). Although PE input port 1 feeds operators 1 and 2, the tuples are only counted once when querying the number of tuples processed for PE input port 1.

Figure 2. PE-level metrics
Image shows Streams processing element with multiple operators

Table 2 gives a mapping between the operator and PE-level metrics. Notice how ports internal to the PE that do not cross the PE boundary are only accessible at the operator level.

Table 2. Operator and PE port mapping
Operator     Operator port   PE port         Access scope
Operator 1   Input Port 0    Input Port 0    Operator and PE
Operator 1   Input Port 1    Input Port 1    Operator and PE
Operator 1   Output Port 0   n/a             Operator only
Operator 2   Input Port 0    Input Port 1    Operator and PE
Operator 2   Input Port 1    Input Port 2    Operator and PE
Operator 2   Output Port 0   n/a             Operator only
Operator 3   Input Port 0    n/a             Operator only
Operator 3   Output Port 0   n/a             Operator only
Operator 4   Input Port 0    n/a             Operator only
Operator 4   Output Port 0   Output Port 0   Operator and PE

To arrive at a total number of tuples processed by the PE, it's necessary to sum the individual tuple numbers for each PE input port (ports 0, 1 and 2). A PE that contains only one operator (for example, using partition exlocation placement) will return the same results for PE and operator-level metrics.
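
The summation described above can be sketched as follows. The function name and the per-port readings are hypothetical; in a real operator the values would come from the PE-level nTuplesProcessed metric of each PE input port.

```cpp
#include <cassert>
#include <cstdint>
#include <numeric>
#include <vector>

// Sum the nTuplesProcessed readings taken from each PE input port.
// perPort[i] holds the (hypothetical) reading for PE input port i.
std::int64_t totalTuplesProcessed(const std::vector<std::int64_t>& perPort) {
    return std::accumulate(perPort.begin(), perPort.end(), std::int64_t{0});
}
```

Passing a std::int64_t as the initial value keeps the accumulation in 64 bits, matching the width of the metric values.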

In addition to the tuple and punctuation counts provided at the operator level, PE metrics are extended to cover tuple byte rates on input and output ports, as well as details of connection status for output ports (number of broken connections, for example). This type of metric might be useful to check for failed links between PEs that may require some level of attention. PE-level metrics follow the same access mechanism as operator metrics, but rather than querying via the operator context, you use getPE() to return a handle to the PE.
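
As one way to act on the connection metrics, a monitor could compare two successive polls of a counter such as nBrokenConnections and flag the ports where it rose. The sketch below assumes the readings have already been collected into vectors indexed by PE output port; the function name and polling scheme are illustrative assumptions, not part of the Streams API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Compare two polls of a per-port counter such as nBrokenConnections and
// return the indexes of the ports whose count rose between the polls.
std::vector<std::size_t> portsWithNewBreaks(
        const std::vector<std::int64_t>& previous,
        const std::vector<std::int64_t>& current) {
    std::vector<std::size_t> flagged;
    // Only compare ports present in both polls.
    const std::size_t n =
        previous.size() < current.size() ? previous.size() : current.size();
    for (std::size_t port = 0; port < n; ++port) {
        if (current[port] > previous[port]) {
            flagged.push_back(port);
        }
    }
    return flagged;
}
```

Because the metric is a counter, the difference between polls is what signals a new failure; the absolute value alone only says that a break happened at some point in the past.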

The following table gives details of each of the PE-level metrics available, listing metric name, scope, and description.

Table 3. PE-level metrics provided by the system
Name                     Scope             Description
nTuplesProcessed         per input port    Number of tuples processed (counter)
nTupleBytesProcessed     per input port    Number of bytes processed by the port (counter)
nWindowPunctsProcessed   per input port    Number of window punctuations processed (counter)
nFinalPunctsProcessed    per input port    Number of final punctuations processed (counter)
nTuplesSubmitted         per output port   Number of tuples submitted (counter)
nTupleBytesSubmitted     per output port   Number of tuple bytes submitted (counter)
nWindowPunctsSubmitted   per output port   Number of window punctuations submitted (counter)
nFinalPunctsSubmitted    per output port   Number of final punctuations submitted (counter)
nBrokenConnections       per output port   Number of broken connections that have occurred on the port (counter)
nRequiredConnecting      per output port   Number of required connections currently connecting on the port (gauge)
nOptionalConnecting      per output port   Number of optional connections currently connecting on the port (gauge)

User-defined metrics

In addition to the built-in metrics provided by the Streams runtime, it is also possible to add custom metrics. These behave the same as system metrics, but they allow you to expose your own values of interest externally.

When to use a custom metric

Custom metrics provide an excellent mechanism to expose items of interest that might normally be buried deep in your Streams operator code. Anytime you need to know the size of a buffer, the length of a linked list, or the timestamp of an operator's last checkpoint, a custom metric can be used. Custom metrics provide an alternative to resorting to a sink operator or debug logging.

Defining custom metrics

Custom metrics can be declared statically via the operator model or can be created dynamically at runtime. Metrics defined in the operator model are automatically instantiated by the SPL language runtime on execution. To create custom metrics at runtime, use the createCustomMetric(name, description, kind) member function.

Regardless of the way in which they are declared, custom metrics can be accessed via the getCustomMetricByName(name) member function of the OperatorMetrics class. You are free to update the value of a custom metric at any point after creation, using the setValue(newValue) and incrementValue(incValue) functions. More details can be found in the IBM Streams Processing Language Runtime API Documentation.

Defining a custom metric as part of the Streams operator model

An operator model is an XML document used to describe a C++ primitive operator or a Java™ primitive operator. The document consists of four major elements:

  • Context
  • Parameters
  • Input ports
  • Output ports

The context element describes global properties of the operator that are not tied to particular parameters or ports. This element also captures common definitions that may be referenced in later entries of the operator model.

Editing the operator model is performed using the Streams Studio Editor in tree view or source XML mode. Figure 3 shows the tree view, and Listing 4 shows the corresponding XML snippet for the custom operator.

Figure 3. Editing the operator model tree view
Image of the Operator Model Editor
Listing 4. Sample XML (partial) from the operator model
  ...        
  <metrics>
    <metric>
      <name>myCount</name>
      <description>A new metric defined via the operator model</description>
      <kind>Counter</kind>
    </metric>
  </metrics>
  ...

Accessing custom metrics is a straightforward operation, using the OperatorContext class and its getMetrics() member function. The example below shows how you can access your newly created custom metric myCount and assign a value to it.

Listing 5. Declaring the metric header file (_h.cgt)
    ...
  
    private:
        // Members 
    
        Metric & myCountStatic;
    ...
Listing 6. Access the metric through the primitive operator (_cpp.cgt)
    // Constructor
    MY_OPERATOR::MY_OPERATOR()
      : myCountStatic(getContext().getMetrics().getCustomMetricByName("myCount"))
    {
    }    
	    
    // Tuple processing for non-mutating ports
    void MY_OPERATOR::process(Tuple const & tuple, uint32_t port)
    {

        ...

        // Set the Custom Metric to an arbitrary value
        myCountStatic.setValue(someValue);

        // Increment the Metric by 10
        myCountStatic.incrementValue(10);
    
        ...
    }

Two other useful member functions for the SPL::OperatorMetrics class are hasCustomMetric(name) and getCustomMetricNames(), which allow you to query for the existence of any custom metrics defined in the operator model and return their names.
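
The lookup-before-use pattern that these two functions enable can be sketched with a small standalone registry. SketchMetrics and getOrCreate are hypothetical names invented for this example; the class is not SPL::OperatorMetrics, and it borrows only the hasCustomMetric and getCustomMetricNames names to show how such queries might be used.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Stand-in registry for illustration only; NOT the SPL::OperatorMetrics class.
class SketchMetrics {
public:
    bool hasCustomMetric(const std::string& name) const {
        return values_.count(name) != 0;
    }

    // Return the existing value slot, or create it at zero on first use.
    std::int64_t& getOrCreate(const std::string& name) {
        return values_[name];  // std::map value-initializes new entries to 0
    }

    std::vector<std::string> getCustomMetricNames() const {
        std::vector<std::string> names;
        for (const auto& entry : values_) {
            names.push_back(entry.first);
        }
        return names;  // std::map iterates in sorted key order
    }

private:
    std::map<std::string, std::int64_t> values_;
};
```

Checking for existence first lets shared helper code work with operators that may or may not define a given metric, instead of assuming that every name is present.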

Creating a custom metric at runtime

As an alternative to pre-defining a custom metric as part of the operator model, it is also possible to dynamically create new metrics using the createCustomMetric(name, description, kind) function, with kind being one of the following:

  • Metric::Counter
  • Metric::Gauge
  • Metric::Time

Our final example shows how to dynamically create a new metric at runtime without having to previously define it as part of the operator model. Once defined, the same mechanisms are used to set or increment the value of the new metric.

Listing 7. Declaring the metric header file (_h.cgt)
    ...
  
    private:
      // Members 
    
        Metric & myCountDynamic;
    ...
Listing 8. Access the metric through the primitive operator (_cpp.cgt)
    // Constructor
    MY_OPERATOR::MY_OPERATOR()
      : myCountDynamic(getContext().getMetrics().createCustomMetric("myCountDynamic",
          "A description for a dynamic metric", Metric::Counter))
    {
    }    
	    
    // Tuple processing for non-mutating ports
    void MY_OPERATOR::process(Tuple const & tuple, uint32_t port)
    {

        ...

        // Set the Custom Metric to an arbitrary value
        myCountDynamic.setValue(someValue);

        // Increment the Metric by 10
        myCountDynamic.incrementValue(10);
    
        ...
    }

NOTE: Although we have provided a description for our metric above, this value will not be picked up and displayed when inspecting the operator or PE in the instance graph of Streams Studio. Metrics created via the operator model have their descriptions added to the application description language file (.adl), which is then used to construct the live graph. This is not the case for custom metrics created at runtime.


Accessing metrics

The final section of the article looks at the range of techniques available to inspect metrics data without having to resort to custom development and the various runtime APIs.

Accessing metrics through the streamtool capturestate command

The streamtool capturestate command can be used to capture the current status of system hosts and jobs in a Streams instance, in addition to the performance metrics available for those hosts and jobs. For reference, the command returns output in XML format, so this data can be parsed and used within your own scripts to drive further actions as needed.

The schema is available in the file: streams-install-directory/schema/streamsInstanceState.xsd.

Use the following command to capture performance metrics information for the jobs running in a Streams instance: streamtool capturestate --select jobs=metrics.

When you request performance metrics information, the command output includes additional information about PEs, operators, and ports. Listing 9 shows a sample of metrics information for an operator.

Listing 9. Sample XML output from streamtool capturestate
  ...
  <operator name="PrimOp">
    <metric name="myCountDynamic" lastChangeObserved="1329912395" userDefined="true">
      <metricValue xsi:type="streams:longType" value="65"/>
    </metric>
    <metric name="myCount" lastChangeObserved="1329912395" userDefined="true">
      <metricValue xsi:type="streams:longType" value="1065"/>
    </metric>
    <inputPort index="0" name="BeaconStream">
      <metric name="nTuplesProcessed" lastChangeObserved="1329912395" userDefined="false">
        <metricValue xsi:type="streams:longType" value="782"/>
      </metric>

    ...
    </inputPort>
    ...
  </operator>

Notice how our custom metrics are shown with their current values. The lastChangeObserved field gives the time based on the number of seconds since the epoch. For more information about command usage and how to interpret the output, see the section on performance metrics in the Streams Administration and Installation Guide (Chapter 10, Monitoring and managing IBM InfoSphere Streams).
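
As a rough illustration of post-processing this output, the sketch below pulls attribute values out of a single element with a naive string scan and converts lastChangeObserved into a readable UTC time. Both function names are hypothetical, and the scan is not a real XML parser; a production script would more likely use an XML library validated against the schema mentioned above.

```cpp
#include <cassert>
#include <ctime>
#include <string>

// Return the value of attribute `name` from one XML element's text, or an
// empty string if it is absent. A naive string scan, not a real XML parser.
std::string attributeValue(const std::string& element, const std::string& name) {
    const std::string key = " " + name + "=\"";
    const std::string::size_type start = element.find(key);
    if (start == std::string::npos) {
        return "";
    }
    const std::string::size_type from = start + key.size();
    const std::string::size_type end = element.find('"', from);
    if (end == std::string::npos) {
        return "";
    }
    return element.substr(from, end - from);
}

// Format seconds since the epoch as a UTC "YYYY-MM-DD HH:MM:SS" string.
std::string formatEpochUtc(std::time_t seconds) {
    const std::tm* utc = std::gmtime(&seconds);
    if (utc == nullptr) {
        return "";
    }
    char buffer[32];
    std::strftime(buffer, sizeof buffer, "%Y-%m-%d %H:%M:%S", utc);
    return buffer;
}
```

Applied to the lastChangeObserved value of 1329912395 from Listing 9, formatEpochUtc() yields 2012-02-22 12:06:35 in UTC.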

Accessing metrics through Streams Studio — Metrics view

Streams Studio contains a metrics view that allows you to monitor metrics associated with a running instance. The metrics view is divided into two primary components:

  • A tree-based navigation view in the left-hand pane where you select the element whose metrics you want to see
  • A table-based metrics view in the right-hand pane, which contains the metrics for the selected element on the left

Navigating the tree in the left-hand pane allows you to select hosts, jobs, PEs, operators, and input and output ports for inspection, with the right-hand pane showing the captured metrics for the selected entity. The figures below show two examples: first the metrics view for our custom metrics, then the view at the operator input port level.

Figure 4. Operator custom metrics
Image of the metrics view showing custom metrics, shows time, alerts, myCountDynamic, and myCount
Figure 5. Operator input port metrics
Image of the metrics view showing input port metrics, shows time, nTuplesProcessed, nTuplesDropped, and nTuplesQueued

This is only a small example of the wealth of information available through the metrics view. A detailed exploration of the view is outside of the scope of this article, but the InfoSphere Streams Studio Installation and User's Guide provides a detailed walk-through in Chapter 9 — Monitoring running SPL applications.

Accessing metrics through Streams Studio — Instance graph

Our final option for inspecting runtime metrics is provided via the instance graph view of Streams Studio. The instance graph view shows a dynamic topology of the SPL applications running on a Streams instance. The view can be customized to change the way in which the topology of the running applications is displayed.

Collecting metrics information

To enable the collection of metrics information, click the Collect Metrics Information icon.

You can choose what type of Streams elements (such as operators, PEs, and jobs) to include in the graph and how to group and color these elements. It is also possible to enable and configure the collection of runtime metrics information for operators and PEs in the Instance Graph view.

In Figure 6, you can see an example of the runtime metrics being displayed as the mouse is hovered over a given operator. The fly-over window shows the operator name, metrics such as tuples in and tuples out, followed by the custom metrics declared earlier. Notice how the metrics created at runtime do not have descriptions associated with them.

Again, this is only a small example of the information available through the instance graph view. Please see the InfoSphere Streams Studio Installation and User's Guide for more information about metrics monitoring under the instance graph.

Figure 6. Fly-over showing operator runtime metrics
Image shows operator runtime metrics

Conclusion

We hope this article has introduced you to the use of metrics with InfoSphere Streams. Whether you are looking for insight into the operations of your running application or have a specific need to expose key counters, the metrics system provides a flexible way to tap this data.

The easiest way to find out more is to try it yourself. Armed with these code snippets, build your own operator and experiment with the range of system-provided and custom metrics available to you. If you would like more information, or have a question on anything covered in this article, please feel free to post a question to the Streams forum (see Resources).

Acknowledgement

Many thanks to Mark Mendell for his assistance and review of this article.

Resources

Learn

Get products and technologies

  • Download a trial version of InfoSphere Streams and learn how to implement your own metrics.
  • Build your next development project with IBM trial software, available for download directly from developerWorks.
  • Now you can use DB2 for free. Download DB2 Express-C, a no-charge version of DB2 Express Edition for the community that offers the same core data features as DB2 Express Edition and provides a solid base to build and deploy applications.
