Apps may be composed of many microservices—save your admins headaches by enabling proper distributed tracing

In a cloud-native application, multiple microservices collaborate to deliver the expected functionality. If you have hundreds of services, how do you debug an individual request as it travels through the distributed system? For Java enterprise developers, the Eclipse MicroProfile OpenTracing specification makes this easier.

In our reference implementation, our storefront application is composed of several independent microservice applications, as depicted below:


To ensure transactions are delivered reliably and thus keep the database data consistent, the microservices shown above coordinate updates through the open source message queue, RabbitMQ. To demonstrate the need for microservices-specific tracing, let’s assume RabbitMQ has a problem and has stopped working. If standard logs were written only to the server hosting each service, finding the source of the problem could require the admin to go through the logs server by server, delaying problem resolution:


With distributed tracing, it’s much easier to find where the problem lies and resolve it (e.g., by restarting the server, allocating more disk space, etc.).

With a proper distributed tracing system in place, you have important clues to help you debug the problematic services. Fortunately for Java enterprise developers, you can easily enable it in your MicroProfile application without any explicit code, thanks to MicroProfile OpenTracing.

In our last blog post, David Shi explained how to monitor microservices using MicroProfile Health and Metrics. In today’s entry, I’ll cover the MicroProfile OpenTracing specification we implemented as part of our team’s reference storefront application (GitHub).


This blog series is based on my team’s experience with migrating from Spring Boot-based microservices to MicroProfile, an optimized enterprise Java programming model for a microservices architecture. Both projects are hosted on GitHub. You can access the Spring Boot version of our reference application here and the MicroProfile version of our reference application here.

Monitoring your microservices app in Spring Boot

A commonly used distributed tracing tool for Spring Cloud is Spring Cloud Sleuth. Adding trace information to the logs helps us debug transaction flows. Because Spring Cloud Sleuth stamps each log statement with a trace ID and a span ID, the statements that belong to one request can be picked out across the existing microservices, which lets developers and admins troubleshoot the cause.
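To make the idea concrete, here is a minimal sketch in plain Java of how a shared trace ID lets you pull one request's log lines out of several services' logs. It assumes the bracketed `[service,traceId,spanId,exportable]` layout that Sleuth has used for its log prefix; the class name, the pattern, and the sample log lines are all made up for illustration and are not taken from our reference application.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch only: groups log lines from several services by their trace ID.
class TraceLogGrouper {

    // Assumed layout of the tracing block: [service,traceId,spanId,exportable]
    private static final Pattern TRACE_BLOCK =
        Pattern.compile("\\[([^,\\]]*),([0-9a-f]+),([0-9a-f]+),(true|false)\\]");

    // Returns traceId -> log lines carrying that ID, preserving input order.
    // Lines without a tracing block are skipped.
    static Map<String, List<String>> groupByTraceId(List<String> lines) {
        Map<String, List<String>> byTrace = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher m = TRACE_BLOCK.matcher(line);
            if (m.find()) {
                byTrace.computeIfAbsent(m.group(2), k -> new ArrayList<>()).add(line);
            }
        }
        return byTrace;
    }

    public static void main(String[] args) {
        // Hypothetical log lines, invented for this example.
        List<String> lines = Arrays.asList(
            "INFO [catalog,5f2a9c1b7d3e4f60,5f2a9c1b7d3e4f60,true] GET /items",
            "INFO [inventory,5f2a9c1b7d3e4f60,a1b2c3d4e5f60718,true] loading stock",
            "INFO [orders,9e8d7c6b5a4f3e2d,9e8d7c6b5a4f3e2d,true] POST /orders",
            "ERROR [inventory,5f2a9c1b7d3e4f60,0f1e2d3c4b5a6978,true] queue unreachable");

        groupByTraceId(lines).forEach((traceId, group) ->
            System.out.println(traceId + " -> " + group.size() + " line(s)"));
    }
}
```

In practice you would rarely write this yourself; a log aggregator or Zipkin does the grouping. The point is only that the trace ID is the join key across services.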

Optionally, you can integrate Zipkin with Spring Cloud Sleuth to add more information to the traces. Add the Zipkin dependencies, provide a Zipkin server (for example, a Spring Boot application marked with the @EnableZipkinServer annotation, as shown here), and supply the required configuration in the .properties file. The integration is simple and gives a nice insight into the transactions and communication happening between the different microservices in our application.

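// Spring Boot imports; the package of @EnableZipkinServer depends on the Zipkin server version
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

// The class is named so that it does not shadow the @SpringBootApplication annotation
@SpringBootApplication
@EnableZipkinServer
public class ZipkinServerApplication {
    public static void main(String[] args) {
        SpringApplication.run(ZipkinServerApplication.class, args);
    }
}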

The next section compares distributed tracing with Spring Cloud Sleuth to the Eclipse MicroProfile approach, MicroProfile OpenTracing.

Monitoring your microservices app with MicroProfile OpenTracing

MicroProfile OpenTracing enables distributed tracing in our application. It helps us to analyze the transaction flows so that we can easily debug the problematic services and fix them.
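Under the hood, a transaction flow can be followed because every hop carries the same trace ID and records which span called it. The plain-Java sketch below illustrates that idea using Zipkin's B3 header names (`X-B3-TraceId`, `X-B3-SpanId`, `X-B3-ParentSpanId`). It is not the code MicroProfile OpenTracing or the Liberty Zipkin tracer runs, since they handle propagation for you; the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;

// Sketch only: how trace context travels from one service to the next.
class B3PropagationSketch {
    static final String TRACE_ID = "X-B3-TraceId";
    static final String SPAN_ID = "X-B3-SpanId";
    static final String PARENT_SPAN_ID = "X-B3-ParentSpanId";

    // Random 64-bit ID rendered as 16 lower-case hex characters.
    static String newId() {
        return String.format("%016x", ThreadLocalRandom.current().nextLong());
    }

    // Context created by the first service that sees a request:
    // a new trace with a root span and no parent.
    static Map<String, String> startTrace() {
        Map<String, String> headers = new HashMap<>();
        headers.put(TRACE_ID, newId());
        headers.put(SPAN_ID, newId());
        return headers;
    }

    // Context a service attaches to an outbound call: same trace,
    // a new span, and the caller's span recorded as the parent.
    static Map<String, String> childOf(Map<String, String> incoming) {
        Map<String, String> outgoing = new HashMap<>();
        outgoing.put(TRACE_ID, incoming.get(TRACE_ID));
        outgoing.put(PARENT_SPAN_ID, incoming.get(SPAN_ID));
        outgoing.put(SPAN_ID, newId());
        return outgoing;
    }

    public static void main(String[] args) {
        Map<String, String> atFirstService = startTrace();
        Map<String, String> atSecondService = childOf(atFirstService);
        Map<String, String> atThirdService = childOf(atSecondService);

        System.out.println("first:  " + atFirstService);
        System.out.println("second: " + atSecondService);
        System.out.println("third:  " + atThirdService);
    }
}
```

All three maps share one trace ID, and each parent ID points at the previous hop's span. That chain is what a tracing system such as Zipkin reassembles into a single trace.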

In our sample application, we used Zipkin as our distributed tracing system. So, let’s see how we configured WebSphere Liberty to use MicroProfile OpenTracing with Zipkin:

In the server.xml, add opentracingZipkin as a user feature:

<featureManager>
  <feature>usr:opentracingZipkin-0.30</feature>
</featureManager>

Specify the host and port of your Zipkin server in your server.xml:

<opentracingZipkin host="${env.zipkinHost}" port="${env.zipkinPort}"/>

To enable custom tracing in your application, you need an implementation of the OpenTracing Tracer interface. For WebSphere Liberty, a Zipkin tracer implementation is available as a user feature. To download and install this feature, add the Maven plugin to your pom.xml as shown below:

<!-- Plugin to install opentracing zipkin tracer -->
<plugin>
  <groupId>com.googlecode.maven-download-plugin</groupId>
  <artifactId>download-maven-plugin</artifactId>
  <version>${version.download-maven-plugin}</version>
  <executions>
    <execution>
      <id>install-tracer</id>
      <phase>prepare-package</phase>
      <goals>
        <goal>wget</goal>
      </goals>
      <configuration>
        <url>https://repo1.maven.org/.../liberty-opentracing-zipkintracer-1.0-sample.zip</url>
        <unpack>true</unpack>
        <outputDirectory>${project.build.directory}/liberty/wlp/usr</outputDirectory>
      </configuration>
    </execution>
  </executions>
</plugin>

You can also install this feature manually without the Maven plugin; see Enabling distributed tracing for details.

Once the MicroProfile OpenTracing feature, mpOpenTracing-1.0, is enabled, distributed tracing is on by default for all the JAX-RS methods in the application. You can customize the traces further with the @Traced annotation, and you can create your own spans with an injected Tracer and an ActiveSpan object.

Now, let’s consider the Catalog service in our reference implementation. This is a sample snippet that gives you an idea of how we defined custom traces in our sample application:

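import java.time.temporal.ChronoUnit;
import java.util.List;
import javax.enterprise.context.RequestScoped;
import javax.inject.Inject;
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;
import org.eclipse.microprofile.faulttolerance.Fallback;
import org.eclipse.microprofile.faulttolerance.Retry;
import org.eclipse.microprofile.faulttolerance.Timeout;
import org.eclipse.microprofile.opentracing.Traced;
import io.opentracing.ActiveSpan;
import io.opentracing.Tracer;

@RequestScoped
@Path("/items")
@Produces(MediaType.APPLICATION_JSON)
public class CatalogService {
  // Tracer provided by the container, used to build custom spans
  @Inject
  Tracer tracer;

  @Timeout(value = 2, unit = ChronoUnit.SECONDS)
  @Retry(maxRetries = 2, maxDuration = 2000)
  @Fallback(fallbackMethod = "fallbackInventory")
  @GET
  @Traced(value = true, operationName = "getCatalog.list")
  public List<Item> getInventory() {
    ...snip...
    // Return all items in Inventory
  }

  // Fallback for getInventory(); its work is recorded in a custom span
  public List<Item> fallbackInventory() {
    try (ActiveSpan childSpan =
        tracer
          .buildSpan("Grabbing messages from Messaging System")
          .startActive()) {
      ...snip...
      // Return a default fallback list
    }
  }

...
}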

Defining custom traces is straightforward. Setting the value parameter of @Traced to false disables the default tracing, and the operationName parameter lets you name your spans. You can also define customized traces with a Tracer object that you obtain through the @Inject annotation.

Now that you are all set to use distributed tracing in your application, what does the result look like? Let’s take a look at the Zipkin traces recorded for an actual problem in our storefront app. In this example, I intentionally terminated the RabbitMQ service. When this happens, the updated stock is not passed to the inventory service, so the MySQL database is not synchronized. A clear indicator of a problem is the trace depth of “3” instead of the “6” expected from a completed transaction sequence:


If the admin has no logging that points to the cause (here, RabbitMQ being down), the problem is hard to pin down, because from the user’s point of view the web interface is (almost) working.
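The depth comparison can also be expressed in code. The following plain-Java sketch computes the depth of a trace from its parent-child span relationships and compares a truncated trace with a complete one. The span names and the simple chain shape are hypothetical; the real traces in our storefront app come from Zipkin, not from this class.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: derive a trace's depth from its span parent links.
class TraceDepthCheck {

    // parentOf maps spanId -> parentSpanId; a root span maps to null.
    // Depth is the number of spans on the longest root-to-leaf path.
    static int depth(Map<String, String> parentOf) {
        int max = 0;
        for (String span : parentOf.keySet()) {
            int d = 0;
            String cur = span;
            while (cur != null && parentOf.containsKey(cur)) {
                d++;
                if (d > parentOf.size()) {
                    throw new IllegalArgumentException("cycle in span parent links");
                }
                cur = parentOf.get(cur);
            }
            max = Math.max(max, d);
        }
        return max;
    }

    // Builds a hypothetical trace in which each span calls exactly one child.
    static Map<String, String> chain(int length) {
        Map<String, String> parentOf = new HashMap<>();
        for (int i = 1; i <= length; i++) {
            parentOf.put("span-" + i, i == 1 ? null : "span-" + (i - 1));
        }
        return parentOf;
    }

    public static void main(String[] args) {
        int expectedDepth = 6;           // depth of a completed transaction
        int observed = depth(chain(3));  // trace cut short by the outage

        System.out.println("complete trace depth: " + depth(chain(6))); // prints 6
        System.out.println("observed trace depth: " + observed);        // prints 3
        if (observed < expectedDepth) {
            System.out.println("trace ended early; inspect the last span for errors");
        }
    }
}
```

A check like this could feed an alert, but the expected depth is specific to each transaction type, so treat the constant as something you would have to measure per workflow.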

Below are the Zipkin trace details for the sequence above; they include a log of the HTTP 500 error interrupting the workflow of the application:


Compare this to the trace of the same transaction during normal operation:


To run the whole application together, please check out BlueCompute – MicroProfile Implementation.

You can speed problem resolution with proper distributed tracing

In a large organization, your applications may be composed of hundreds of microservices—save your SREs headaches by enabling proper distributed tracing. It will speed problem resolution by simplifying the effort to isolate a failure to a specific microservices application and providing important context surrounding the problem. MicroProfile OpenTracing does (most of) the hard work for you.


To enable OpenTracing in this application, I found the Open Liberty guides very useful, specifically the guide Enabling distributed tracing in microservices (20 minutes to read).

This blog excerpts the code and specifications for distributed tracing in our sample application using MicroProfile OpenTracing. You can see all the code available on GitHub. The microservices in our simple storefront application can be run individually using Maven—you can import them and run them as-is locally. You can also run them on IBM Cloud and IBM Cloud Private.
