opensource.google.com

Menu

Java zPages for OpenTelemetry

Friday, August 14, 2020

What is OpenTelemetry?

OpenTelemetry is an open source project aimed at improving the observability of our applications. It is a collection of cloud monitoring libraries and services for capturing distributed traces and metrics and integrates naturally with external observability tools, such as Prometheus and Zipkin. As of now, OpenTelemetry is in its beta stage and supports a few different languages.

What are zPages?

zPages are a set of dynamically generated HTML web pages that display trace and metrics data from the running application. The term zPages was coined at Google, where similar pages are used to view basic diagnostic data from a particular host or service. For our project, we built the Java /tracez and /traceconfigz zPages, which focus on collecting and displaying trace spans.

TraceZ

The /tracez zPage displays span data from the instrumented application. Spans are split into two groups: spans that are still running and spans that have completed.

TraceConfigZ

The /traceconfigz zPage displays the currently active tracing configuration and allows users to change the tracing parameters. Examples of such parameters include the sampling probability and the maximum number of attributes.

Using the zPages

This section describes how to start and use the Java zPages.

Add the dependencies to your project

First, you need to add OpenTelemetry as a dependency to your Java application.

Maven

For Maven, add the following to your pom.xml file:
<dependencies>
    <dependency>
        <groupId>io.opentelemetry</groupId>
        <artifactId>opentelemetry-api</artifactId>
        <version>0.7.0</version>
    </dependency>
    <dependency>
        <groupId>io.opentelemetry</groupId>
        <artifactId>opentelemetry-sdk</artifactId>
        <version>0.7.0</version>
    </dependency>
    <dependency>
        <groupId>io.opentelemetry</groupId>
        <artifactId>opentelemetry-sdk-extension-    zpages</artifactId>
        <version>0.7.0</version>
    </dependency>
</dependencies>

Gradle

For Gradle, add the following to your build.gradle dependencies:
implementation 'io.opentelemetry:opentelemetry-api:0.7.0'
implementation 'io.opentelemetry:opentelemetry-sdk:0.7.0'
implementation 'io.opentelemetry:opentelemetry-sdk-extension-zpages:0.7.0'

Register the zPages

To set-up the zPages, simply call startHttpServerAndRegisterAllPages(int port) from the ZPageServer class in your main function:
import io.opentelemetry.sdk.extensions.zpages.ZPageServer;

public class MyMainClass {
    public static void main(String[] args) throws Exception {
        ZPageServer.startHttpServerAndRegisterAllPages(8080);
        // ... do work
    }
}
Note that the package com.sun.net.httpserver is required to use the default zPages setup. Please make sure your version of the JDK includes this package if you plan to use the default server.

Alternatively, you can call registerAllPagesToHttpServer(HttpServer server) to register the zPages to a shared server:
import io.opentelemetry.sdk.extensions.zpages.ZPageServer;

public class MyMainClass {
    public static void main(String[] args) throws Exception {
        HttpServer server = HttpServer.create(new                     InetSocketAddress(8000), 10);
        ZPageServer.registerAllPagesToHttpServer(server);
        server.start();
        // ... do work
    }
}

Access the zPages

View all available zPages on the index page

The index page (at /) lists all available zPages with a link and description.


View trace spans on the /tracez zPage

The /tracez zPage displays information about running and completed spans, with completed spans further organized into latency and error buckets. The data is aggregated into a summary-level table:


You can click on each of the counts in the table cells to access the corresponding span details. For example, here are the details of the ChildSpan latency sample (row 1, col 4):


View and update the tracing configuration on the /traceconfigz zPage.

The /traceconfigz zPage provides an interface for users to modify the current tracing parameters:


Design

This section goes into the underlying design of our code.

Frontend


The frontend consists of two main parts: HttpHandler and HttpServer. The HttpHandler is responsible for rendering the HTML content, with each zPage implementing its own ZPageHandler. The HttpServer, on the other hand, is responsible for listening to incoming requests, obtaining the requested data, and then invoking the aforementioned ZPageHandlers. The HttpServer class from com.sun.net is used to construct the default server and to handle http requests on different routes.

Backend





The backend consists of two components as well: SpanProcessor and DataAggregator. The SpanProcessor watches the lifecycle of each span, invoking functions each time a span starts or ends. The DataAggregator, on the other hand, restructures the data from the SpanProcessor into an accessible format for the frontend to display. The class constructor requires a TracezSpanProcessor instance, so that the TracezDataAggregator class can access the spans collected by a specific TracezSpanProcessor. The frontend only needs to call functions in the DataAggregator to obtain information required for the web page.

Conclusion

We hope that this blog post has given you a little insight into the development and use cases of OpenTelemetry’s Java zPages. The zPages themselves are lightweight performance monitoring tools that allow users to troubleshoot and better understand their applications. Once OpenTelemetry is officially released, we hope that you try out and use the /tracez and /traceconfigz zPages!

By William Hu and Terry Wang – Software Engineering Interns, Core Compute Observability
.