OTel Crash Course in a Week

For this week’s hackathon at Grafana Labs, I decided to spend some time learning about OpenTelemetry in the most hands-on and immersive way possible. For me, hands-on means I get to write some code, however sloppy, and see how the system behaves. Immersive means drawing on several resources for my learning, which usually includes watching YouTube videos of conference presentations on the subject, listening to podcasts, and maybe throwing in a book for good measure.

I decided to attempt to do the following as part of learning about OpenTelemetry:

Method or Madness

I have recently been incorporating the Cursor AI-powered IDE into my workflow for writing code. The first thing I thought to try was to have Cursor instrument an existing Go service that integrates the OpenWeatherMap API and a Philips Hue bridge to “light the weather.” A few suggestions in, I realized this was not going to work: the code was getting too complicated and I had no idea what Cursor was really up to. So, I took a step back and did what any sensible engineer faced with a problem they cannot solve should do: read the frigging documentation (https://opentelemetry.io/docs/) and start by solving a smaller, related problem. Thankfully, the OpenTelemetry Getting Started documentation has a demo of instrumenting a web-based dice-rolling service written in Go. So, away I went, copying the code for the dice-rolling service, making some modifications to package it as a container, adding a Prometheus instance to scrape metrics, and then starting things up to see how the heck it works.

Spelunk Adventure: The OTel SDK for Go

Once I had the container up and running and could view traces and logs in the console (stdout), I went spelunking into the code to understand what libraries were being imported (clue: there are a lot) and what each of those libraries was responsible for. After a few hours of spelunking around the code and reading library documentation, I had a good enough mental model that I could get started instrumenting a simpler version of the microservice I was looking to instrument. I cut out the interaction with the Philips Hue bridge but kept the calls to OpenWeatherMap. That made things simpler while staying close enough to the reality of microservice environments, where one service calls another and another, ad nauseam.

The modified microservice simply calls the OpenWeatherMap API to fetch temperatures when a request is received at the /getTemp endpoint. In the rest of this blog post, we will walk through the code and use it as a jumping-off point to learn about important observability and OpenTelemetry concepts. But first, what the heck is OpenTelemetry and what problem does it really solve?

What is this OTel Thing?

Martin Thwaites’s “What is OpenTelemetry” presentation at goto; 2024 is an excellent resource for understanding the problem that OpenTelemetry (OTel) seeks to solve, the history of the project, and why it has become the fastest-growing CNCF project (almost toe-to-toe with Kubernetes by contributor count). OTel is a newish solution to an old problem: portable and unified observability in a complex microservice environment.