💨 How to know what's going on in your cloud 🌥️⚡️

edenfed

Eden Federman

Posted on November 6, 2023


TL;DR

Today, you will learn about the world of open-source observability. In this guide, you will:

  • Build a Comprehensive Observability Stack: Assemble an end-to-end solution using popular open-source tools.

  • Extract and Ship with Ease: Learn the ins and outs of handling traces, metrics, and logs.

  • Master Data Correlation: Gain expertise in linking various types of observability data for deeper insights.

  • Utilize a Powerful Backend Stack: Implement Grafana, Prometheus, Tempo, and Loki for robust backend management.

  • Control with Confidence: Harness the capabilities of Odigos to manage your observability landscape effectively.



Odigos - Open-source Distributed Tracing

Monitor all your apps simultaneously without writing a single line of code!
Simplify OpenTelemetry complexity with the only platform that can generate distributed tracing across all your applications.

We are really just starting out.
Can you help us with a star? Plz? 😽

https://github.com/keyval-dev/odigos



Theory

If you are new to observability or just interested in the difference between monitoring and observability, we recommend watching this short video by the creator of OpenTelemetry.

In short, distributed traces, metrics, and logs, together with the ability to correlate one signal with another, are the best practice for debugging production issues in microservices-based applications.

This is exactly what we are going to achieve for our demo application.

There is no need to learn any new technologies in order to implement and enjoy observability. With some basic Kubernetes commands — you are ready to get started.


Solution Overview

We are going to deploy 3 different systems on our Kubernetes cluster:

  • Target application: we will use a microservices-based application written in Java and Python. (For an example application with more programming languages and a more complex architecture, use the example from the Odigos getting started guide.)
  • Observability backend: we are going to use the following applications to store and analyze our observability data:
      - Grafana: for dashboards and visualization of the data
      - Prometheus: for storage of metrics data
      - Loki: for storage of logs data
      - Tempo: for storage of distributed tracing data
  • Observability control plane: we will use Odigos for automatic instrumentation of our applications (automatic extraction of traces, metrics, and logs), collector deployment, and configuration.


Prerequisites

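The following tools are required to run this tutorial:

  • kind, to create the local Kubernetes cluster
  • kubectl, to deploy and inspect the applications
  • Helm, to install the observability backend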
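Before starting, it can help to confirm that the command-line tools are on your PATH. The helper below is a minimal sketch; `missing_tools` is a hypothetical name, and the tool list in the usage note is inferred from the commands used later in this tutorial rather than taken from an official requirements list.

```shell
# Print every tool from the argument list that is not found on PATH.
# Returns non-zero if at least one tool is missing.
missing_tools() {
  local tool
  local missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}
```

For this tutorial you could call it as `missing_tools kind kubectl helm`; no output and a zero exit status means everything was found.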


Creating the Kubernetes cluster

Create a new local Kubernetes cluster, by running the following command:

kind create cluster

Installing Target Applications

We will install bank-of-athnos, a fork of Bank of Anthos, an example bank application created by Google.

We use a modified version without any instrumentation code to demonstrate how Odigos automatically collects observability data from the application.

Deploy the application using the following command:

kubectl apply -f https://raw.githubusercontent.com/keyval-dev/bank-of-athnos/main/release/kubernetes-manifests.yaml

Installing Observability Backend

Since the stack used here has no single database that stores traces, logs, and metrics, we will deploy three different databases alongside Grafana as a visualization tool.

The Helm chart for this step deploys Tempo (traces database), Prometheus (metrics database), and Loki (logs database), as well as a preconfigured Grafana instance with those databases as data sources.

Install the Odigos CLI by following this link.

Connecting Everything Together Using Odigos

Now that our test application is running and our observability databases are deployed and ready to receive data, the last piece of the puzzle is to extract and ship logs, metrics, and traces from our applications to the observability databases.

The simplest way to do this is with Odigos, a control plane for observability data. Install Odigos by executing the following command:

odigos install

After all the pods in the odigos-system namespace are running, open the Odigos UI by running the following command:

odigos ui

Then navigate to http://localhost:3000 to access the UI.

Selecting Applications

There are two ways to select which applications Odigos should instrument:

  • Opt out (recommended): Instrument everything, including every new application deployed going forward. Users can still manually mark applications that should not be instrumented.
  • Opt in: Only instrument the applications selected manually by the user.


For this tutorial, we recommend choosing the opt out mode.

Choosing Destinations


The next step is to tell Odigos how to reach the three databases that we deployed earlier.
Add the following three destinations. After adding the first one, select Destinations from the sidebar and click Add New Destination to add each of the others:

  • Tempo
  • Prometheus
  • Loki

Exploring The Data

Wait a few seconds for Odigos to finish deploying the required collectors and instrumenting the target applications. You can monitor the progress by running:

kubectl get pods -w

Wait for all the pods to be in Running state (especially notice the transactionservice application which has a slow startup time).
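If you would rather block until the pods are ready than watch the list by hand, `kubectl wait` can do that. The wrapper below is a minimal sketch: `wait_for_pods` is a hypothetical helper name, the 300s default timeout is an arbitrary choice, and it assumes the target application runs in the default namespace.

```shell
# Block until every pod in a namespace reports the Ready condition,
# or fail once the timeout expires.
# Usage: wait_for_pods <namespace> [timeout]
wait_for_pods() {
  local namespace="${1:?namespace is required}"
  local timeout="${2:-300s}"   # arbitrary default, tune for slow services
  kubectl wait --for=condition=Ready pods --all \
    --namespace "$namespace" --timeout "$timeout"
}
```

Running `wait_for_pods default` covers the target application, and `wait_for_pods odigos-system` does the same check for the Odigos components. A slow starter such as transactionservice may need a longer timeout.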

The last step is to explore our observability data in Grafana. We can now see and correlate metrics to traces to logs in order to dive deeply into how our application behaves.

Connecting to Grafana

Port forward to your Grafana instance by running:

kubectl port-forward svc/observability-grafana -n observability 3000:80

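Then navigate to http://localhost:3000. If the Odigos UI from the earlier step is still listening on port 3000, stop it first or forward Grafana to a different local port.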

  • Enter admin as the username
  • For the password enter the output of the following command:
kubectl get secret -n observability observability-grafana -o jsonpath="{.data.admin-password}" | base64 --decode
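The decode step is needed because Kubernetes stores secret values base64-encoded. As a small sketch, the same lookup can be wrapped in a reusable function; `grafana_admin_password` is a hypothetical helper name, and its defaults simply mirror the namespace and secret name used in the command above.

```shell
# Print the Grafana admin password stored in a Kubernetes secret.
# Usage: grafana_admin_password [namespace] [secret-name]
grafana_admin_password() {
  local namespace="${1:-observability}"
  local secret="${2:-observability-grafana}"
  # Secret data is base64-encoded, so decode it before printing.
  kubectl get secret --namespace "$namespace" "$secret" \
    -o jsonpath='{.data.admin-password}' | base64 --decode
  echo  # end with a newline so the prompt does not run into the password
}
```

Calling `grafana_admin_password` with no arguments prints the password for the Grafana instance deployed in this tutorial.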

Service Graph

Let’s start by viewing a service graph of our microservices application:

  1. Go to Explore from the sidebar
  2. Select Tempo as the data source
  3. Choose the Service Graph tab
  4. Run the query


Metrics

Now let’s view some metrics. Click on the contacts node in the service graph and choose Request rate.

A graph of the request rate for the contacts service should be presented.

Odigos collects many more metrics, all of which can be queried easily from the Prometheus data source. Check out this document for the full list.

Traces

Click on the contacts application again in the Service Graph, but this time choose Request Histogram.

In order to correlate metrics to traces, we will use a feature called exemplars. To show exemplars:

  1. Open the options menu
  2. Turn on exemplars
  3. Notice that green diamonds now appear on the histogram.
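Each diamond is an exemplar: a single sample that carries the ID of the trace that produced it. In the OpenMetrics text format, an exemplar is appended to a sample line after a `#`. The sketch below pulls the trace ID out of such lines to show what links a metric to a trace; `exemplar_trace_ids` is a hypothetical helper, and the `trace_id` label name is an assumption about how the exemplar is labeled in your setup.

```shell
# Read OpenMetrics sample lines on stdin and print the trace ID of every
# line that carries an exemplar with a trace_id label (assumed label name).
# Lines without such an exemplar produce no output.
exemplar_trace_ids() {
  sed -n 's/.*# {[^}]*trace_id="\([^"]*\)".*/\1/p'
}
```

For example, piping the made-up line `http_server_duration_bucket{le="0.5"} 12 # {trace_id="4bf92f3577b34da6a3ce929d0e0e4736"} 0.43` into `exemplar_trace_ids` prints the trace ID, which is the value Grafana uses to look up the trace in Tempo.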


Hover over one of the added points and click Query With Tempo. The trace behind that data point should be presented.

In this trace, you can see exactly how much time each part of the entire request took.

Digging into one of the sections will show additional information such as database queries.

Logs

To further investigate a specific action, you can query the relevant logs by clicking the small document icon next to it.
For example, click the document icon next to balancereader to show its logs.

Summary

We have shown how easy it is to extract and ship logs, traces, and metrics using only open-source solutions.

We were also able to generate traces, metrics, and logs from an application within minutes, without adding any instrumentation code.

We now also have the ability to correlate between the different signals: We correlated metrics to traces and traces to logs. We now have all the needed data to quickly detect and fix production issues in our target applications.

What’s Next?

Note that the observability backend that we installed is not suited for production usage.

For high volumes of data, it is recommended to persist those databases to cloud storage like S3 or use a managed offering.
