Monitor and Debug Java Microservices with MicroProfile OpenTracing

Apps may be composed of many microservices—save your admins headaches by enabling proper distributed tracing

In a cloud-native application, many microservices collaborate to deliver the expected functionality. If you have hundreds of services, how do you debug an individual request as it travels through the distributed system? For Java enterprise developers, the Eclipse MicroProfile OpenTracing specification makes this much easier.

In our reference implementation, our storefront application is composed of several independent microservice applications, as depicted below:

Storefront application composed of several independent microservice applications

To ensure transactions are delivered reliably and the database stays consistent, the microservices shown above coordinate updates through the open source message queue RabbitMQ. To demonstrate the need for microservices-specific tracing, let's assume RabbitMQ has a problem and has stopped working. If the standard logging only went to the server hosting each service, finding the source of the problem could require the admin to traipse through logs server by server, adding delay to problem resolution:

Microservice app fails. Where is the problem?

With distributed tracing, it’s much easier to find where the problem lies and resolve it (e.g., by restarting the server, allocating more disk space, etc.).

With a proper distributed tracing system in place, you have important clues to help you debug the problematic services. Fortunately for Java enterprise developers, you can easily enable it in your MicroProfile application without any explicit code, thanks to MicroProfile OpenTracing.

In our last blog post, David Shi explained how to monitor microservices using MicroProfile Health and Metrics. In today’s entry, I’ll cover the MicroProfile OpenTracing specification we implemented as part of our team’s reference storefront application (GitHub).

MicroProfile

This blog series is based on my team's experience with migrating from Spring Boot-based microservices to MicroProfile, an optimized enterprise Java programming model for a microservices architecture. Both projects are hosted on GitHub: you can access the Spring Boot version of our reference application here and the MicroProfile version of our reference application here.

Monitoring your microservices app in Spring Boot

A commonly used distributed tracing tool for Spring Cloud is Spring Cloud Sleuth. Adding logging traces helps us debug the transaction flows: using the trace IDs and span IDs that Spring Cloud Sleuth injects into the log statements, developers and admins can correlate the individual traces across the existing microservices and troubleshoot the cause of a problem.
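
For illustration, a Sleuth-instrumented service prefixes each log statement with the application name, trace ID, span ID, and an export flag; the service name, logger, and ID values below are made up, and the exact layout depends on your logging pattern:

2019-02-14 10:15:32.101  INFO [catalog-service,5c3f9a1e2b7d4c88,9d4b2f7a1c3e5d60,true] 1 --- [nio-8080-exec-1] c.example.CatalogController : Fetching items from inventory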

Optionally, you can integrate Zipkin with Spring Cloud Sleuth to add more information to the traces. Simply add Zipkin as a dependency, enable it in your application using the @EnableZipkinServer annotation, and provide the required configuration in the .properties file. It is pretty simple to integrate and gives a nice insight into the transactions and communication happening between the different microservices in our application.

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
// @EnableZipkinServer is provided by the Zipkin server dependency (its import package varies by version)

@SpringBootApplication
@EnableZipkinServer
public class ZipkinApplication {
    public static void main(String[] args) {
        // Starts the embedded Zipkin server alongside the Spring Boot application
        SpringApplication.run(ZipkinApplication.class, args);
    }
}
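
The .properties configuration mentioned above is minimal. As a sketch (the property names assume the Spring Cloud Sleuth and Zipkin versions we used, so check them against your release), each traced service points at the Zipkin server and sets a sampling rate:

# Assumed property names; verify against your Spring Cloud Sleuth/Zipkin versions
spring.zipkin.baseUrl=http://localhost:9411
spring.sleuth.sampler.percentage=1.0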

The next section compares the approach to distributed tracing with Spring Cloud Sleuth to the Eclipse MicroProfile implementation, OpenTracing.

Monitoring your microservices app with MicroProfile OpenTracing

MicroProfile OpenTracing enables distributed tracing in our application. It helps us to analyze the transaction flows so that we can easily debug the problematic services and fix them.

In our sample application, we used Zipkin as our distributed tracing system, so let's see how we configured WebSphere Liberty to use MicroProfile OpenTracing with Zipkin:

In the server.xml, add opentracingZipkin as a user feature:

<featureManager>
  <feature>usr:opentracingZipkin-0.30</feature>
</featureManager>

Specify the host and port of your Zipkin server in your server.xml:

<opentracingZipkin host="${env.zipkinHost}" port="${env.zipkinPort}"/>
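
Putting the pieces together, and remembering that the mpOpenTracing-1.0 feature itself must also be enabled, the relevant part of the server.xml looks roughly like this (a sketch; your server likely enables other features as well):

<server>
  <featureManager>
    <!-- MicroProfile OpenTracing plus the user feature that provides the Zipkin tracer -->
    <feature>mpOpenTracing-1.0</feature>
    <feature>usr:opentracingZipkin-0.30</feature>
  </featureManager>

  <!-- Zipkin endpoint resolved from environment variables -->
  <opentracingZipkin host="${env.zipkinHost}" port="${env.zipkinPort}"/>
</server>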

To enable custom tracing in your application, you need an implementation of the OpenTracing Tracer interface. In WebSphere Liberty, the Zipkin tracer implementation is packaged as a user feature. To download and install this feature, add the download-maven-plugin configuration to your pom.xml as shown below:

<!-- Plugin to install opentracing zipkin tracer -->
<plugin>
  <groupId>com.googlecode.maven-download-plugin</groupId>
  <artifactId>download-maven-plugin</artifactId>
  <version>${version.download-maven-plugin}</version>
  <executions>
    <execution>
      <id>install-tracer</id>
      <phase>prepare-package</phase>
      <goals>
        <goal>wget</goal>
      </goals>
      <configuration>
        <url>https://repo1.maven.org/.../liberty-opentracing-zipkintracer-1.0-sample.zip</url>
        <unpack>true</unpack>
        <outputDirectory>${project.build.directory}/liberty/wlp/usr</outputDirectory>
      </configuration>
    </execution>
  </executions>
</plugin>

You can also install this feature manually without using this plugin; see Enabling distributed tracing for details.

Once the mpOpenTracing-1.0 feature is enabled, distributed tracing is enabled by default for all JAX-RS methods in your application. You can further customize the traces using the @Traced annotation and create explicit child spans with an ActiveSpan object.

Now, let's consider the Catalog service in our reference implementation. This snippet gives you an idea of how we defined custom traces:

@RequestScoped
@Path("/items")
@Produces(MediaType.APPLICATION_JSON)
public class CatalogService {

  @Inject
  Tracer tracer;

  @Timeout(value = 2, unit = ChronoUnit.SECONDS)
  @Retry(maxRetries = 2, maxDuration = 2000)
  @Fallback(fallbackMethod = "fallbackInventory")
  @GET
  @Traced(value = true, operationName = "getCatalog.list")
  public List<Item> getInventory() {
    ...snip...
    // Return all items in Inventory
  }

  public List<Item> fallbackInventory() {
    try (ActiveSpan childSpan =
        tracer
          .buildSpan("Grabbing messages from Messaging System")
          .startActive()) {
      ...snip...
      // Return a default fallback list
    }
  }

  ...
}

Defining custom traces is easy. The @Traced annotation includes an option to disable the default tracing: set the value parameter to false. You can also name your spans using the operationName parameter, and you can define customized traces with a Tracer object injected via the @Inject annotation.
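
For example, here is a minimal sketch (the class, paths, and method names are illustrative, not from our reference application) that turns tracing off for one JAX-RS method and names the span on another:

import javax.ws.rs.GET;
import javax.ws.rs.Path;

import org.eclipse.microprofile.opentracing.Traced;

@Path("/utility")
public class UtilityService {

  // Exclude this endpoint from the default JAX-RS tracing
  @GET
  @Path("/ping")
  @Traced(false)
  public String ping() {
    return "pong";
  }

  // Trace this endpoint under a custom span name
  @GET
  @Path("/status")
  @Traced(value = true, operationName = "utility.status")
  public String status() {
    return "OK";
  }
}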

Now that you are all set to use distributed tracing in your application, what does the result look like? Let's take a look at the Zipkin traces captured for an actual problem in our storefront app. In this example, I intentionally terminated the RabbitMQ service. When this happens, the updated stock will not be passed to the Inventory service, and thus the MySQL database will not be synchronized. A clear indicator of a problem is the trace depth of "3" instead of the expected "6" from a completed transaction sequence:

Zipkin logged problem

If the admin doesn't have logging that points to the source of the problem (i.e., RabbitMQ is down), the failure is hard to pin down because, from the user's point of view, the web interface is (almost) working.

Below are the Zipkin trace details for the sequence above; they include a log of the HTTP 500 error that interrupted the workflow of the application:

Zipkin logged problem details

Compare this to the trace of the same transaction completing normally:

Zipkin logged no problem

For more details on how the Catalog Service is built, see the source code here. To run the whole application together, please check out BlueCompute – MicroProfile Implementation.

You can speed problem resolution with proper distributed tracing

In a large organization, your applications may be composed of hundreds of microservices, so save your SREs headaches by enabling proper distributed tracing. It speeds problem resolution by making it easy to isolate a failure to a specific microservice and by providing important context surrounding the problem. MicroProfile OpenTracing does (most of) the hard work for you.

To enable OpenTracing in this application, I found the Open Liberty guides very useful, specifically the guide Enabling distributed tracing in microservices (a 20-minute read).

This blog excerpts the code and configuration for distributed tracing in our sample application using MicroProfile OpenTracing. All the code is available on GitHub. The microservices in our simple storefront application can be run individually using Maven; you can import them and run them as-is locally. You can also run them on IBM Cloud and IBM Cloud Private.

