Getting Started with PromQL: A Comprehensive Guide

Abhinavsharma
3 min read · Feb 27, 2024

Introduction

Prometheus, an open-source monitoring and alerting toolkit, relies on PromQL (Prometheus Query Language) to interact with and analyze time series data. PromQL is a powerful language that allows users to query and manipulate metrics collected by Prometheus. In this guide, we will explore the key concepts of PromQL and provide examples to help you get started.

Basics of PromQL

Metric Selectors

At the core of PromQL are metric selectors, which pick out time series by metric name and by the values of their labels. Label values are always quoted strings. The basic syntax is as follows:

<metric_name>{<label_name>="<label_value>", ...}

Let’s consider an example:

cpu_usage{job="webapp", instance="server-1"}

This query selects the CPU usage metric for the “webapp” job on the “server-1” instance.
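
Besides equality, PromQL selectors accept three more matchers: != (not equal), =~ (regular expression match), and !~ (regular expression non-match). A short sketch that reuses the cpu_usage metric from above; the label values are illustrative assumptions, not names from a real setup:

```promql
# All webapp series except the one from server-1.
cpu_usage{job="webapp", instance!="server-1"}

# Series from any instance whose name starts with "server-".
# Regular expressions are fully anchored, so the pattern must match the whole label value.
cpu_usage{job="webapp", instance=~"server-.*"}

# Series from every job except those whose name ends in "-test".
cpu_usage{job!~".*-test"}
```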

Instant Vector Selectors

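Instant vector selectors return one sample per matching time series: the most recent value at the query's evaluation time. The selector from the previous section is already an instant vector selector. For example:

cpu_usage{job="webapp"}  # Returns the latest CPU usage sample of every series with job="webapp".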
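
To look at a point in the past instead of the evaluation time, a selector can carry the offset modifier. A minimal sketch, assuming the same cpu_usage metric and that it behaves like a gauge:

```promql
# CPU usage as it was one hour before the evaluation time.
cpu_usage{job="webapp"} offset 1h

# Change in CPU usage compared with one hour ago, per series.
cpu_usage{job="webapp"} - cpu_usage{job="webapp"} offset 1h
```

The offset applies only to the selector it follows, so the second query subtracts the old value from the current one for each series.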

Range Vector Selectors

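Range vector selectors return, for each matching series, all samples within a time window that ends at the evaluation time. The window is written as a duration in square brackets after the selector, and functions such as rate() take the resulting range vector as input. To calculate the rate of a metric over the last 5 minutes: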

rate(http_requests_total{job="api"}[5m])

This query calculates the per-second rate of HTTP requests for the “api” job over the last 5 minutes.
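
rate() is only one of the functions that accept a range vector. The sketch below shows a bare range selector and two of the *_over_time functions; it assumes cpu_usage is a gauge, which the article does not state:

```promql
# Raw samples from the last 5 minutes (viewable as a table, but not graphable directly).
cpu_usage{job="webapp"}[5m]

# Highest value each series reached in the last 10 minutes.
max_over_time(cpu_usage{job="webapp"}[10m])

# Average of each series over the last hour.
avg_over_time(cpu_usage{job="webapp"}[1h])
```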

Aggregation and Grouping

PromQL supports various aggregation functions to summarize and analyze data. For instance, to calculate the total CPU usage per instance for the “app” job:

sum(cpu_usage{job="app"}) by (instance)

This groups the data by the “instance” label and calculates the sum of CPU usage for each instance.
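
A few more aggregation forms, sketched against the same hypothetical cpu_usage metric. The by clause keeps only the listed labels, while without drops the listed labels and keeps the rest:

```promql
# Equivalent spelling of the query above, with the grouping clause first.
sum by (instance) (cpu_usage{job="app"})

# Average across instances: drop the instance label and keep all others.
avg without (instance) (cpu_usage{job="app"})

# The three series with the highest current CPU usage.
topk(3, cpu_usage{job="app"})

# Number of series currently matching the selector.
count(cpu_usage{job="app"})
```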

Common Functions and Operators

Rate Function

The rate() function calculates the average per-second rate of increase of a counter over a specified time range, and it accounts for counter resets. Example:

rate(http_requests_total{job="frontend"}[1h])

This query computes the rate of HTTP requests per second over the last hour for the “frontend” job.
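
In practice rate() is usually combined with an aggregation. A minimal sketch, assuming http_requests_total is a counter exported by several instances of each job:

```promql
# Per-second request rate for each job, summed over all of its instances.
# rate() is applied first and sum() second, so a counter reset on a single
# instance is handled before the series are combined.
sum by (job) (rate(http_requests_total[5m]))
```

Summing the raw counters first and taking the rate afterwards would hide individual resets, which is why the order shown here is the conventional one.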

Increase Function

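The increase() function returns how much a counter grew over a specified time range. The result is extrapolated to cover the full range, so it can be a non-integer even when the counter only ever increases by whole numbers. Example: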

increase(http_errors_total{job="api"}[30m])

This query returns the total increase in HTTP errors for the “api” job over the last 30 minutes.
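
increase() and rate() are two views of the same calculation: the increase over a window is the per-second rate multiplied by the length of the window in seconds. As a sketch with the metric from the example above:

```promql
# Total increase in errors over 30 minutes.
increase(http_errors_total{job="api"}[30m])

# The same value expressed with rate(): 30 minutes = 1800 seconds.
rate(http_errors_total{job="api"}[30m]) * 1800
```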

Binary Operators

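PromQL supports arithmetic binary operators such as +, -, *, and / between scalars and instant vectors. Between two vectors, an operator only combines samples whose label sets match exactly, so adding two selectors with different job values returns no result. Aggregating each side first removes the differing labels:

sum(http_requests_total{job="app"}) + sum(http_requests_total{job="api"})

This adds the total HTTP requests for the “app” and “api” jobs.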
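
A common use of binary operators is a ratio of two rates. The sketch below assumes that http_errors_total and http_requests_total are both counters carrying an instance label; aggregating both sides by that label leaves identical label sets for the operator to match on:

```promql
# Fraction of requests that failed over the last 5 minutes, per instance.
  sum by (instance) (rate(http_errors_total{job="api"}[5m]))
/
  sum by (instance) (rate(http_requests_total{job="api"}[5m]))

# Arithmetic with a scalar applies to every sample: the overall ratio as a percentage.
100 * sum(rate(http_errors_total{job="api"}[5m])) / sum(rate(http_requests_total{job="api"}[5m]))
```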

Advanced Topics

Subqueries

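A subquery evaluates an instant query repeatedly over a time range at a given resolution, written as [<range>:<resolution>]. The result is a range vector that can be passed to functions such as avg_over_time(). Example:

avg_over_time(http_request_duration_seconds{job="webapp"}[5m:1m])

This evaluates the selector at 1-minute steps across the last 5 minutes and averages the resulting values for each series.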
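
Subqueries are most useful when the inner expression is itself a function call, because a plain range selector cannot be applied to a function result. A sketch, assuming the webapp job also exports an http_requests_total counter:

```promql
# Peak 5-minute request rate seen during the last 30 minutes,
# with the inner rate() re-evaluated every minute.
max_over_time(rate(http_requests_total{job="webapp"}[5m])[30m:1m])
```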

Alerting Rules

Prometheus uses PromQL expressions in alerting rules to define the conditions that trigger alerts. Since Prometheus 2.0, rules are written in YAML rule files (the group name below is a placeholder). Example:

groups:
  - name: api-alerts
    rules:
      - alert: HighErrorRate
        expr: rate(http_errors_total{job="api"}[5m]) > 0.5
        for: 3m

This rule fires if the per-second error rate for the “api” job, averaged over a 5-minute window, stays above 0.5 for 3 minutes.

Conclusion

PromQL is a versatile language for querying and analyzing time series data collected by Prometheus. This guide provides a foundation for understanding the basics, aggregation, functions, operators, and advanced features of PromQL. Experimenting with these examples in a Prometheus environment will deepen your understanding and proficiency in using PromQL for effective monitoring and alerting.

Abhinavsharma

Platform Engineer | Community Leader | Advising and training companies around Cloud Native