Spring Boot Observability: Setting up Micrometer, Grafana and Prometheus

Observability is a desirable quality of any production system: we want to see what’s happening in production and draw conclusions from it.

There are many observability tools and processes out there, but this series focuses on Spring Boot and some of the tools it brings us.

In this post, we are going to cover the basics of observability, add Micrometer to a simple Spring Boot app, and monitor its metrics using Prometheus and Grafana.

TRY IT YOURSELF: You can find the source code of this post here.

Spring Boot Observability Series

  • Part 1: Setting up Micrometer, Grafana and Prometheus (you are here)
  • Part 2: Discovering a Database Connection Leak (soon)
  • Part 3: Validating Tail Latency with Percentiles (soon)
  • Part 4: Distributed Tracing with Sleuth, Graylog and Zipkin (soon)

If you like this post and are interested in hearing more about my journey as a Software Engineer, you can follow me on Twitter and travel together.

What is Observability?

Observability is the ability to see what’s happening in a distributed system and determine its current state. What we can see, we can measure and improve. Observability rests mainly on three kinds of data:

  • Metrics: Data about the resources involved in the distributed system, for instance memory consumption, database connections, CPU, requests per minute, etc. Metrics are tied to time: their values change as time passes and can mean different things at different moments.
  • Logs: Information, at different levels of detail, about the operations happening in the system. Each entry has a level like INFO or DEBUG, plus information relevant to that moment in time and that operation.
  • Traces: How information flows through the system. Traces connect the dots across the whole system, showing where an operation starts and finishes and how much time it took.

Use Case: Observing JVM Metrics

In the following sections, we are going to set up everything we need to observe JVM metrics in a Spring Boot application.

Solution Architecture

Requirements

  • Java 17
  • Docker
  • Docker Compose

Setting up a Spring Boot app with Micrometer

Micrometer is a framework that allows us to define metrics and send/expose them to other applications.

First, we need to add the Micrometer and Actuator dependencies to the project:

dependencies {
	....
	implementation 'org.springframework.boot:spring-boot-starter-actuator'
	....
	runtimeOnly 'io.micrometer:micrometer-registry-prometheus'
	....
}

On line 3, we can see the Actuator dependency, which is required to expose an endpoint for reading metrics. On line 5, we can see the Micrometer registry for Prometheus, as we will use Prometheus in the following sections.

Now, we need to expose those metrics in the application.yml file:

management:
  endpoints:
    web:
      exposure:
        include: metrics, prometheus
  metrics:
    tags:
      application: ${spring.application.name}

spring:
  application:
    name: spring-observability

On line 5, we expose the Actuator endpoints named metrics and prometheus. On line 8, we add a tag to every generated metric so that it can be traced back to the application name defined on line 12.

The following is an example of calling http://localhost:8080/actuator/metrics, showing the metrics Micrometer generates by default:

{
  "names": [
    "jvm.buffer.count",
    "jvm.buffer.memory.used",
    "jvm.buffer.total.capacity",
    "jvm.classes.loaded",
    "jvm.classes.unloaded",
    "jvm.gc.live.data.size",
    "jvm.gc.max.data.size",
    "jvm.gc.memory.allocated",
    "jvm.gc.memory.promoted",
    "jvm.gc.pause",
    "jvm.memory.committed",
    "jvm.memory.max",
    "jvm.memory.used",
    "jvm.threads.daemon",
    "jvm.threads.live",
    "jvm.threads.peak",
    "jvm.threads.states",
    "logback.events",
    "process.cpu.usage",
    "process.files.max",
    "process.files.open",
    "process.start.time",
    "process.uptime",
    "system.cpu.count",
    "system.cpu.usage",
    "system.load.average.1m",
    "tomcat.sessions.active.current",
    "tomcat.sessions.active.max",
    "tomcat.sessions.alive.max",
    "tomcat.sessions.created",
    "tomcat.sessions.expired",
    "tomcat.sessions.rejected"
  ]
}

There, we see many default metrics, ranging from memory and CPU to threads and sessions.
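
To get a feeling for where values like these can come from, here is a minimal sketch that reads a few of them directly from the JDK's own java.lang.management beans. The class name JvmMetricsSnapshot and the map layout are assumptions made for this illustration; the keys simply reuse names from the list above, and this is not a copy of how Micrometer is implemented.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Micrometer or Spring Boot.
final class JvmMetricsSnapshot {

    // Reads a few JVM values from the JDK management beans.
    // The keys reuse metric names from the actuator list purely as labels.
    static Map<String, Double> read() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        Map<String, Double> values = new LinkedHashMap<>();
        values.put("jvm.threads.live", (double) threads.getThreadCount());
        values.put("jvm.threads.daemon", (double) threads.getDaemonThreadCount());
        values.put("jvm.threads.peak", (double) threads.getPeakThreadCount());
        values.put("system.cpu.count", (double) Runtime.getRuntime().availableProcessors());
        return values;
    }

    public static void main(String[] args) {
        // Print one "name = value" line per reading.
        read().forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

The numbers printed depend on the machine and the moment the program runs, which is exactly why a metric is only meaningful together with the time it was read.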

In addition, the following is an example of calling http://localhost:8080/actuator/prometheus, which shows the current values of the metrics we saw on the metrics endpoint:

...
# TYPE system_cpu_count gauge
# HELP system_cpu_count The number of processors available to the Java virtual machine
system_cpu_count{application="spring-observability"} 4.0
...
# TYPE jvm_threads_states_threads gauge
# HELP jvm_threads_states_threads The current number of threads having NEW state
jvm_threads_states_threads{application="spring-observability",state="timed-waiting"} 4.0
...
# TYPE jvm_memory_used_bytes gauge
# HELP jvm_memory_used_bytes The amount of used memory
jvm_memory_used_bytes{application="spring-observability",area="nonheap",id="CodeHeap 'profiled nmethods'"} 9829120.0
...

This is the Prometheus format. Each line that does not start with # defines a sample: a metric name, a set of labels (properties), and its current value.
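
To make the structure of a sample line concrete, the following sketch splits one line into its name, labels and value. PrometheusLineParser is a hypothetical class written for this post, not an official parser; it assumes a single sample line with a plain numeric value, and it ignores comment lines, optional timestamps and special values such as +Inf.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical, simplified parser for illustration; not a complete implementation of the format.
final class PrometheusLineParser {

    // metric name, optional {labels} block, whitespace, then the value
    private static final Pattern SAMPLE =
            Pattern.compile("^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\\{(.*)\\})?\\s+(\\S+)$");

    // one key="value" pair; the value may contain backslash-escaped characters
    private static final Pattern LABEL =
            Pattern.compile("([a-zA-Z_][a-zA-Z0-9_]*)=\"((?:[^\"\\\\]|\\\\.)*)\"");

    record Sample(String name, Map<String, String> labels, double value) {}

    static Sample parse(String line) {
        Matcher sample = SAMPLE.matcher(line.trim());
        if (!sample.matches()) {
            throw new IllegalArgumentException("Not a sample line: " + line);
        }
        Map<String, String> labels = new LinkedHashMap<>();
        if (sample.group(2) != null) {
            // Walk over every key="value" pair inside the braces.
            Matcher label = LABEL.matcher(sample.group(2));
            while (label.find()) {
                labels.put(label.group(1), label.group(2));
            }
        }
        return new Sample(sample.group(1), labels, Double.parseDouble(sample.group(3)));
    }

    public static void main(String[] args) {
        System.out.println(parse("system_cpu_count{application=\"spring-observability\"} 4.0"));
    }
}
```

Parsing the system_cpu_count line from the output above yields the name system_cpu_count, one label (application) and the value 4.0.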

NOTE: See the label application, which holds the name of the application this metric belongs to. This value was set up in the application.yml file, in the property management.metrics.tags.application.
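
You may have noticed that the names differ between the two endpoints: jvm.memory.used on the metrics endpoint appears as jvm_memory_used_bytes in the Prometheus output. The sketch below is a simplified approximation of that renaming for gauges only: dots become underscores and the unit is appended as a suffix. The class PrometheusNameSketch and the units passed to it are assumptions inferred from the output shown above; the real naming convention handles more cases, such as counters, timers and characters that are not allowed in names.

```java
// Hypothetical, simplified illustration of the renaming; not the actual Micrometer code.
final class PrometheusNameSketch {

    // Converts a dotted metric name into a Prometheus-style gauge name.
    // baseUnit may be null when the metric has no unit.
    static String gaugeName(String dottedName, String baseUnit) {
        String name = dottedName.replace('.', '_');
        if (baseUnit != null && !name.endsWith("_" + baseUnit)) {
            name = name + "_" + baseUnit;
        }
        return name;
    }

    public static void main(String[] args) {
        System.out.println(gaugeName("jvm.memory.used", "bytes"));
        System.out.println(gaugeName("jvm.threads.live", "threads"));
        System.out.println(gaugeName("system.cpu.count", null));
    }
}
```

This is why the query used later in this post refers to jvm_threads_live_threads rather than jvm.threads.live.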

The prometheus endpoint will be used by Prometheus to grab metrics from the Spring Boot app. The next section describes how.

Grabbing Metrics in Prometheus from Spring Boot

Prometheus is a specialized database that stores time series data, such as metrics, and lets us query it with different operations, like aggregations or averages.

Let’s start with the Prometheus configuration file, prometheus.yml:

global:
  scrape_interval: 15s
  evaluation_interval: 30s

scrape_configs:
  - job_name: prometheus
    honor_labels: true
    static_configs:
      - targets: ["localhost:9090"]

  - job_name: spring-observability
    honor_labels: true
    metrics_path: /actuator/prometheus
    static_configs:
      - targets: ["host.docker.internal:8080"]

There we can find:

  • On line 2, the scrape interval. Prometheus scrapes (pulls) the metrics from each source every 15 seconds.
  • On line 6, we defined a scrape job to grab metrics from Prometheus itself on localhost:9090.
  • On line 11, we defined a scrape job to grab metrics from host.docker.internal:8080/actuator/prometheus.

NOTE: The path /actuator/prometheus is where the Spring Boot app exposes the Prometheus metrics.

NOTE2: host.docker.internal is a special DNS name that, inside a Docker container, resolves to the host machine. It gives access to processes that are not running inside a Docker container.
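
Conceptually, a scrape is just an HTTP GET against the target and metrics path, repeated at every scrape interval. The sketch below performs one such pull with the JDK HTTP client. The class ScrapeSketch, its method names and the 10 second timeout are assumptions chosen for this illustration; Prometheus itself is far more elaborate.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical illustration of a single pull; not how Prometheus is implemented.
final class ScrapeSketch {

    // Builds the URL a scrape job would request from a target and a metrics path.
    static URI scrapeUri(String target, String metricsPath) {
        return URI.create("http://" + target + metricsPath);
    }

    // Performs one pull and returns the metrics text exposed by the app.
    static String scrapeOnce(HttpClient client, URI uri) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(uri)
                .timeout(Duration.ofSeconds(10)) // arbitrary timeout chosen for this sketch
                .GET()
                .build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Scrape failed with HTTP status " + response.statusCode());
        }
        return response.body();
    }

    public static void main(String[] args) throws InterruptedException {
        // From the host machine the app is reachable on localhost:8080.
        URI uri = scrapeUri("localhost:8080", "/actuator/prometheus");
        try {
            String body = scrapeOnce(HttpClient.newHttpClient(), uri);
            System.out.println("Scraped " + body.lines().count() + " lines from " + uri);
        } catch (IOException e) {
            // Expected when the Spring Boot app is not running.
            System.out.println("Scrape failed: " + e);
        }
    }
}
```

Running the main method while the Spring Boot app is stopped ends in the failure branch, since there is nothing listening on the target.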

Now, let’s set up a Docker Compose file to start Prometheus:

services:
  prometheus:
    image: prom/prometheus:v2.31.1
    command: --config.file=/etc/prometheus/prometheus.yml --log.level=debug
    extra_hosts:
      - "host.docker.internal:host-gateway"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    ports:
      - "9090:9090"

There we can see:

  • On line 4, additional parameters for the startup command. They tell Prometheus where the prometheus.yml file is located and change the log level to debug, which makes it easier to assess what happened if an error occurs.
  • On line 6, we set up an extra host named host.docker.internal. This allows a process inside the Docker container to reach a process outside of it.
  • On line 8, we mount our prometheus.yml at the path the Docker container expects.
  • On line 10, we expose Prometheus on port 9090.

Next, we can run the Docker Compose file with the command docker-compose up. This will print the following on the console: msg="Server is ready to receive web requests."

You will also see a debug message like the following: msg="Scrape failed" err="Get \"http://host.docker.internal:8080/actuator/prometheus\": dial tcp 172.17.0.1:8080: connect: connection refused". This is because Prometheus tries to scrape the Spring Boot app, but the app is not running. To make this message go away, just start the Spring Boot app.

Now, you can access a query console at this URL: http://localhost:9090/graph. You will see the following web page:

Prometheus UI

There, we can query the data Prometheus holds. Let’s start with the up query, to see which scrape targets are active and working:

Querying metric up on Prometheus UI

The image shows that the two scrape jobs we set up are active.

Now, let’s query something more interesting: the live thread count of the Spring Boot app over the last minute: jvm_threads_live_threads{application="spring-observability"}[1m]

Querying jvm_threads_live_threads metric on Prometheus UI

There we can see the live thread values Prometheus has stored. You can write more complex queries with aggregations, ranges, percentiles, and so on. We won’t cover them in this post, but you can find them here.
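
The same kind of query can also be sent to Prometheus programmatically through its HTTP API, using the /api/v1/query endpoint with a query parameter. The sketch below only builds the request URL, to show how the expression has to be URL-encoded. The class PromQueryUrlSketch and its method name are assumptions made for this illustration.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; it only builds the URL and sends nothing.
final class PromQueryUrlSketch {

    // Builds the URL of an instant query for the given PromQL expression.
    static URI instantQueryUri(String prometheusBaseUrl, String promQl) {
        String encoded = URLEncoder.encode(promQl, StandardCharsets.UTF_8);
        return URI.create(prometheusBaseUrl + "/api/v1/query?query=" + encoded);
    }

    public static void main(String[] args) {
        System.out.println(instantQueryUri(
                "http://localhost:9090",
                "jvm_threads_live_threads{application=\"spring-observability\"}"));
    }
}
```

Opening the printed URL in a browser while Prometheus is running returns the result of the query as JSON instead of a table.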

So far, we have exposed metrics using Micrometer, Prometheus scrapes them, and we can use the Prometheus query language to access this information. However, it is hard to draw conclusions from a table of data. It is better to see the data as a chart over time, so we can follow the behavior of the metric. This is where Grafana comes in.

Using Grafana to Chart Prometheus Data

Grafana allows us to create operational dashboards with different chart types that display data over time in a way that is easy to understand and act on.

Let’s add Grafana to the Docker Compose file:

services:
  prometheus:
    image: prom/prometheus:v2.31.1
    command: --config.file=/etc/prometheus/prometheus.yml --log.level=debug
    extra_hosts:
      - "host.docker.internal:host-gateway"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    ports:
      - "9090:9090"
  grafana:
    image: grafana/grafana:8.2.3
    ports:
      - "3000:3000"
    volumes:
      - grafana-storage:/var/lib/grafana
    user: '104'
    links:
      - prometheus
volumes:
  grafana-storage:

There, we can find:

  • On line 14, we expose Grafana on port 3000.
  • On lines 16 and 21, we create a volume to persist Grafana data locally.
  • On line 17, we set the user the Docker container runs as, so that it can read from and write to disk.
  • On line 19, we link this container to the Prometheus container. This creates a DNS name, prometheus, that lets the Grafana container call the Prometheus container.

Now, let’s run the Docker Compose file with the docker-compose up command. Afterwards, we can access Grafana at the URL http://localhost:3000:

Grafana UI

Now, we need to create a Prometheus data source:

  1. Click on the “cogwheel” in the sidebar to open the Configuration menu.
  2. Click on “Data Sources”.
  3. Click on “Add data source”.
  4. Select “Prometheus” as the type.
  5. Set the appropriate Prometheus server URL. In this case, it will be http://prometheus:9090
  6. Click “Save & Test” to save the new data source.

The configuration should look like the following image:

Creating a Prometheus data source on Grafana

Next, we should import our first dashboard. There are open-source dashboards on the Grafana page; one of them is a JVM Micrometer dashboard.

To import the dashboard, follow these steps:

  1. Click on the “four squares” in the sidebar to open the Dashboards menu.
  2. Click on “Manage”.
  3. Click on “Import”.
  4. Set the Grafana dashboard id to 4701
  5. Click on “Load”
  6. Select folder and Prometheus datasource.
  7. Click on “Import”

After you navigate to the new dashboard, you will see something like this (keep in mind that you may need to wait a while for Prometheus to collect enough data from the Spring Boot app before Grafana can draw the charts):

JVM Grafana dashboard

There, we find many charts covering everything from CPU usage to thread changes. You can filter by time and application name. You can also expand a chart to see more details, or choose a date range by selecting a time span inside the chart.

Each of the charts has one or more Prometheus queries, for instance, the Threads chart uses the following query: jvm_threads_live_threads{application="$application", instance="$instance"}

Details of Threads Grafana dashboard

$application and $instance are dynamic variables that you choose from the dropdown inputs at the top of the page.
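
As a rough mental model, the dashboard fills in those variables before the query reaches Prometheus. The sketch below imitates that substitution with plain string replacement. DashboardVariableSketch is a hypothetical class, the instance value is an assumption based on the scrape target configured earlier, and Grafana’s real interpolation supports more syntax than this.

```java
import java.util.Map;

// Hypothetical, simplified illustration of variable substitution; not Grafana's implementation.
final class DashboardVariableSketch {

    // Replaces every $name placeholder with the selected value.
    // Simplification: assumes no variable name is a prefix of another one.
    static String interpolate(String template, Map<String, String> variables) {
        String result = template;
        for (Map.Entry<String, String> variable : variables.entrySet()) {
            result = result.replace("$" + variable.getKey(), variable.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        String template =
                "jvm_threads_live_threads{application=\"$application\", instance=\"$instance\"}";
        System.out.println(interpolate(template, Map.of(
                "application", "spring-observability",
                "instance", "host.docker.internal:8080")));
    }
}
```

The resulting string is an ordinary query of the same kind we typed into the Prometheus UI earlier.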

With that, you can observe some of the metrics of your Spring Boot app using Micrometer, Prometheus, and Grafana.

Final Thought

Observability is crucial in any production system: we want to understand how the system is behaving and fix the errors that appear.

Spring Boot integrates out of the box with different observability frameworks. In this case, Micrometer exposes the metrics, Prometheus stores the time series data, and Grafana charts that data.

In the next post, we will use these tools to debug a database connection leak in a Spring Boot app.

