โœ•

Join us at KubeCon + CloudNativeCon North America, Nov 7-9 in Chicago! Swing by Booth L26 for exclusive SWAG ๐ŸŽ

โœ•
Skip to content
On this page

Subscribe to our newsletter

Get the latest dates and news about GreptimeDB.
Engineering
โ€ข
February 21, 2023

Deep Dive into PromQL and How GreptimeDB Implements the Parser

As the main query language within Prometheus, PromQL helps to explore and visualize time series data. Our last blog introduces the basic concept and illustrates the main differences between PromQL and SQL. In this blog, we are going to dig a bit deeper into the back-end, showing you how data is processed and calculated in PromQL and introduce how GreptimeDB implements promql-parser to offer GreptimeDB better Prometheus compatibility.

As a powerful and the main query language within Prometheus, PromQL helps to explore and visualize time series data by building queries for graphs, notifications and alerts. Since time series is a relatively new field, and PromQL has little similarity with popular relational database query languages, beginners always find itโ€˜s challenging to pick up the syntax of PromQL. Besides, the official documentation doesn't introduce PromQL by comparing it with more commonly used query language, such as SQL. Thus, it's difficult for beginners to grasp the logic behind how time series data is processed in PromQL.

Our last blog introduces the basic concept and illustrates the main differences between PromQL and SQL using a few examples. If you are new to PromQL, we recommend you read this article first to find out how simple and efficient PromQL is. In this blog, instead of scratching the surface, we are going to dig a bit deeper into the back-end, showing you how data is processed and calculated in PromQL and comparing it with SQL in more details.

Prepare โ€‹

In this blog, letโ€™s use a simple example to explain data processing logic. Please note the returned output of each query may differ slightly.

Suppose Prometheus scrapes disk usage metrics in bytes from three nodes, collecting one data point every 15 seconds, thereby producing twelve rows of demo_disk_usage_bytes data points per minute.

PromQL โ€‹

You can directly visit promlens demo and type the same query used in this tutorial to retrieve similar results, but the actual value will be slightly different.

MySQL โ€‹

The following example illustrates the preparation in a table view of conventional relation database, while how the samples stored in TSDB is totally different. The actual value and timestamp may differ in each query.

sql
SELECT * FROM demo_disk_usage_bytes;
markdown
+----+----------------------+------+--------------+------------+
| id | instance             | job  | value        | timestamp  |
+----+----------------------+------+--------------+------------+
|  1 | demo-service-0:10000 | demo | 128860721875 | 1676516213 |
|  2 | demo-service-0:10000 | demo | 128868200567 | 1676516228 |
|  3 | demo-service-0:10000 | demo | 128875203461 | 1676516243 |
|  4 | demo-service-0:10000 | demo | 128883076445 | 1676516258 |
|  5 | demo-service-1:10001 | demo | 128412356891 | 1676516212 |
|  6 | demo-service-1:10001 | demo | 128419231363 | 1676516227 |
|  7 | demo-service-1:10001 | demo | 128426393871 | 1676516242 |
|  8 | demo-service-1:10001 | demo | 128433694141 | 1676516257 |
|  9 | demo-service-2:10002 | demo | 128428942426 | 1676516217 |
| 10 | demo-service-2:10002 | demo | 128436270269 | 1676516232 |
| 11 | demo-service-2:10002 | demo | 128444039092 | 1676516247 |
| 12 | demo-service-2:10002 | demo | 128452595934 | 1676516262 |
+----+----------------------+------+--------------+------------+
12 rows in set (0.000 sec)

"Series" in Prometheus is equivalent to "table" in RDBMS, and "label" is to "column" if it helps when comparing the two.

Basic โ€‹

Unlike SQL, PromQL has fewer syntaxes. The basic query to select a time series from Prometheus is simply typing the series name, which means table name in RDBMS.

To improve your understanding of PromQL syntax, assume you have a table named temperature, which only includes three columns: city, value, timestamp. Now if you want to find the latest temperature of all cities, you can just type temperature, and itโ€™s all displayed for you in tables. No group or other syntax is needed compared to SQL.

You can filter the result by putting labels in curly brackets at the end of the query. Let's say you want to see the temperature of London, LA and Tokyo. You can type temperature{city=~"london|la|tokyo"}.

If you want to check the temperature of all cities during the last five minutes, simply append the time duration at the end in square brackets --> temperature{city=~"london|la|tokyo"}[5m].

To get the average temperature of all these cities, call avg_over_time(temperature{city=~"london|la|tokyo"}[5m]) and the result comes up.

As you can tell from the examples above, PromQL is as simple and intuitive as the human language. Nevertheless, you should bear in mind the following when using it:

  • PromQL retrieves the latest data if no offset or at is used, and the result contains all UNIQUE label combinations.

  • If you want to inquire data of a certain time period instead of the latest, make sure to specify it by appending [duration].

In PromQL, there are four types of data๏ผš

  • Instant vector: each unique label combination produces only the latest one value
  • Range vector: each unique label combination produces multiple values in a specified range of time
  • Scalar literal: like 3.1415
  • String literal: like "promql"

Instant vector and range vector are fundamental in PromQL, and are grouped by unique label dimension by default. Let's examine them in more details by comparing with SQL.

Instant vector โ€‹

Instant vector produces different results if compared with select * from <table> in SQL, and the latter should be grouped by label combination manually to retrieve the latest record of each unique combination.

Select time series โ€‹

  • PromQL

demo_disk_usage_bytes

markdown
| Series                                                             | Value        |
|--------------------------------------------------------------------|--------------|
| demo_disk_usage_bytes{instance="demo-service-0:10000", job="demo"} | 128807264685 |
| demo_disk_usage_bytes{instance="demo-service-1:10001", job="demo"} | 128360742191 |
| demo_disk_usage_bytes{instance="demo-service-2:10002", job="demo"} | 128375885663 |

While it's pretty easy in Prometheus to retrieve the latest record of each label combination, it is much more complex in SQL, because you have to find the latest index of each unique column combination first, then query the result via the index.

  • MySQL
sql
SELECT du.*
FROM   demo_disk_usage_bytes AS du
       JOIN (SELECT instance,
                    job,
                    Max(timestamp) AS max_ts
             FROM   demo_disk_usage_bytes
             GROUP  BY instance,
                       job) gdu
         ON du.instance = gdu.instance
            AND du.job = gdu.job
            AND du.timestamp = gdu.max_ts;
markdown
+----+----------------------+------+--------------+------------+
| id | instance             | job  | value        | timestamp  |
+----+----------------------+------+--------------+------------+
|  4 | demo-service-0:10000 | demo | 128883076445 | 1676516258 |
|  8 | demo-service-1:10001 | demo | 128433694141 | 1676516257 |
| 12 | demo-service-2:10002 | demo | 128452595934 | 1676516262 |
+----+----------------------+------+--------------+------------+
3 rows in set (0.001 sec)

Select time series with label filter โ€‹

  • PromQL

demo_disk_usage_bytes{instance="demo-service-0:10000"}

markdown
| Series                                                             | Value        |
| demo_disk_usage_bytes{instance="demo-service-0:10000", job="demo"} | 129012213695 |
  • MySQL
sql
SELECT *
FROM   demo_disk_usage_bytes
WHERE  instance = "demo-service-0:10000"
ORDER  BY `timestamp` DESC
LIMIT  1;
markdown
+----+----------------------+------+--------------+------------+
| id | instance             | job  | value        | timestamp  |
+----+----------------------+------+--------------+------------+
|  4 | demo-service-0:10000 | demo | 128883076445 | 1676516258 |
+----+----------------------+------+--------------+------------+
1 row in set (0.000 sec)

Range vector โ€‹

Instead of retrieving the latest record of each label combination as what the instant vector does, range vector retrieves multiple records of a specified duration. For example, select * from <table> where timestamp > <ts> results in a range vector.

Notice that the time duration is in the form of a number plus one of the following units:

  • ms milliseconds
  • s seconds
  • m minutes
  • h hours
  • d days - assuming a day always has 24h
  • w weeks - assuming a week always has 7d
  • y years - assuming a year always has 365d

Note, there is no unit for months, you have to calculate how many days in a month and use the unit d instead.

Select last one minute series โ€‹

  • PromQL

demo_disk_usage_bytes[1m]

markdown
| Series                                                             | Value        | timestamp  |
|--------------------------------------------------------------------|--------------|------------|
| demo_disk_usage_bytes{instance="demo-service-0:10000", job="demo"} | 129202305831 | 1676516888 |
|                                                                    | 129208891990 | 1676516903 |
|                                                                    | 129217645753 | 1676516918 |
|                                                                    | 129225360100 | 1676516933 |
| demo_disk_usage_bytes{instance="demo-service-1:10001", job="demo"} | 128753672200 | 1676516887 |
|                                                                    | 128761251595 | 1676516902 |
|                                                                    | 128769530143 | 1676516917 |
|                                                                    | 128776696388 | 1676516932 |
| demo_disk_usage_bytes{instance="demo-service-2:10002", job="demo"} | 128761347797 | 1676516877 |
|                                                                    | 128768555774 | 1676516892 |
|                                                                    | 128775362323 | 1676516907 |
|                                                                    | 128782587455 | 1676516922 |
  • MySQL
sql
SELECT *
FROM   demo_disk_usage_bytes
WHERE `timestamp` > unix_timestamp() - 60;
markdown
+----+----------------------+------+--------------+------------+
| id | instance             | job  | value        | timestamp  |
+----+----------------------+------+--------------+------------+
|  1 | demo-service-0:10000 | demo | 128860721875 | 1676516213 |
|  2 | demo-service-0:10000 | demo | 128868200567 | 1676516228 |
|  3 | demo-service-0:10000 | demo | 128875203461 | 1676516243 |
|  4 | demo-service-0:10000 | demo | 128883076445 | 1676516258 |
|  5 | demo-service-1:10001 | demo | 128412356891 | 1676516212 |
|  6 | demo-service-1:10001 | demo | 128419231363 | 1676516227 |
|  7 | demo-service-1:10001 | demo | 128426393871 | 1676516242 |
|  8 | demo-service-1:10001 | demo | 128433694141 | 1676516257 |
|  9 | demo-service-2:10002 | demo | 128428942426 | 1676516217 |
| 10 | demo-service-2:10002 | demo | 128436270269 | 1676516232 |
| 11 | demo-service-2:10002 | demo | 128444039092 | 1676516247 |
| 12 | demo-service-2:10002 | demo | 128452595934 | 1676516262 |
+----+----------------------+------+--------------+------------+
12 rows in set (0.000 sec)

Operator โ€‹

PromQL supports binary and aggregation operators.

Binary operator โ€‹

There are Arithmetic, Comparison and Logical/set binary operators, shaping PromQL to a high-level programming language like Python.

Arithmetic binary operators โ€‹

Binary arithmetic operators are defined between scalar/scalar, vector/scalar, and vector/vector value pairs. For operations between two instant vectors, the matching behavior can be modified.

  • + (addition)
  • - (subtraction)
  • * (multiplication)
  • / (division)
  • % (modulo)
  • ^ (power/exponentiation)
Convert the unit from byte to gigabyte โ€‹
  • PromQL

demo_disk_usage_bytes / 1024 / 1024 / 1024

markdown
| Series                                        | Value              |
|-----------------------------------------------|--------------------|
| {instance="demo-service-0:10000", job="demo"} | 120.68923075404018 |
| {instance="demo-service-1:10001", job="demo"} | 120.26891701295972 |
| {instance="demo-service-2:10002", job="demo"} | 120.28242788556963 |
  • MysQL
sql
SELECT du.instance, du.job, du.value / 1024 / 1024 / 1024 as value
FROM   demo_disk_usage_bytes AS du
       JOIN (SELECT instance,
                    job,
                    Max(timestamp) AS max_ts
             FROM   demo_disk_usage_bytes
             GROUP  BY instance,
                       job) gdu
         ON du.instance = gdu.instance
            AND du.job = gdu.job
            AND du.timestamp = gdu.max_ts;
markdown
+----------------------+------+------------------+
| instance             | job  | value            |
+----------------------+------+------------------+
| demo-service-0:10000 | demo | 120.031718579121 |
| demo-service-1:10001 | demo | 119.613198694773 |
| demo-service-2:10002 | demo | 119.630802361295 |
+----------------------+------+------------------+
3 rows in set (0.001 sec)

It may be hard to comprehend the concept of binary operators between two instant vectors. Well, the following two tips may help:

  • These operators behave as a filter by default, which is used to match the entries and only the exact same label combination can be operated on.

  • For operations between two instant vectors, the matching behavior can be modified. The vector-matching syntax on ignoring group_left group_right helps in more complex scenarios, and the matching is just to tell how the labels are matched in the operations, without changing the operation itself.

Aggregation operator โ€‹

PromQL supports the following aggregation operators, which can only be used among instant vectors, the result is another instant vector.

  • sum (calculate sum over dimensions)
  • min (select minimum over dimensions)
  • max (select maximum over dimensions)
  • avg (calculate the average over dimensions)
  • group (all values in the resulting vector are 1)
  • stddev (calculate population standard deviation over dimensions)
  • stdvar (calculate population standard variance over dimensions)
  • count (count number of elements in the vector)
  • count_values (count number of elements with the same value)
  • bottomk (smallest k elements by sample value)
  • topk (largest k elements by sample value)
  • quantile (calculate ฯ†-quantile (0 โ‰ค ฯ† โ‰ค 1) over dimensions)

Sum in gigabyte โ€‹

  • PromQL

sum(demo_disk_usage_bytes) / 1024 / 1024 / 1024

markdown
| Series | Value              |
|--------|--------------------|
| {}     | 361.62267420999706 |
  • MysQL
sql
SELECT Sum(du.value) / 1024 / 1024 / 1024 AS value
FROM   demo_disk_usage_bytes AS du
       JOIN (SELECT instance,
                    job,
                    Max(timestamp) AS max_ts
             FROM   demo_disk_usage_bytes
             GROUP  BY instance,
                       job) gdu
         ON du.instance = gdu.instance
            AND du.job = gdu.job
            AND du.timestamp = gdu.max_ts;
markdown
+------------------+
| value            |
+------------------+
| 359.275719635189 |
+------------------+
1 row in set (0.001 sec)

In aggregation operation, keep in mind that you can only use instant vectors to generate other instant vectors, and preserve distinct dimensions by using by and without syntax.

There are also some built-in functions such as <aggregation>_over_time() to aggregate range vectors.

Function โ€‹

In order to behave more like an analytical query language, the ability to support various calculations of functions is essential. Since achieving this in SQL is almost impossible, we won't provide comparisons with MySQL.

rate โ€‹

rate calculates the per-second average of increase in the range vector. If you want to know the increment, itโ€™s impossible to calculate it with the latest record alone. Instead, you have to retrieve the sample data through a time period, the process can be roughly divided into three steps:

  • Get samples in a time period, let's say one hour
  • Sum the value of each label combination
  • Divide the result by 3,600

Calculate the byte increment per second for all nodes and convert it to megabyte โ€‹

rate(demo_disk_usage_bytes[1m]) / 1024 / 1024

markdown
| Series                                        | Value               |
|-----------------------------------------------|---------------------|
| {instance="demo-service-0:10000", job="demo"} | 0.4426248338487413  |
| {instance="demo-service-1:10001", job="demo"} | 0.4578960418701171  |
| {instance="demo-service-2:10002", job="demo"} | 0.46780273706060654 |

Calculate the bytes increment per second for all nodes and convert it to megabyte, then sum up โ€‹

The combination of sum and rate is helpful. However, always use rate before sum, otherwise you may get unpredicted behaviors.

See rate then sum never sum then rate.

sum(rate(demo_disk_usage_bytes[1m])) / 1024 / 1024

markdown
| Series | Value              |
|--------|--------------------|
| {}     | 1.4306116906318123 |

<aggregation>_over_time โ€‹

<aggregation>_over_time is available for range vector aggregation, returning an instant vector with per-series aggregation results.

avg_over_time(demo_disk_usage_bytes[1m]) / 1024 / 1024 / 1024

markdown
| Series                                        | Value              |
|-----------------------------------------------|--------------------|
| {instance="demo-service-0:10000", job="demo"} | 121.08960774564184 |
| {instance="demo-service-1:10001", job="demo"} | 120.67763310461305 |
| {instance="demo-service-2:10002", job="demo"} | 120.67987838410772 |

Graph โ€‹

For those curious to learn about the powerful visualization ability of PromQL, you may be wondering how time series data is displayed in dashboard in Grafana. Actually, only instant vectors can be displayed, and in order to plot lines, one query isn't enough.

Then how is the line graph composed? Well, It depends on range query API, which acquires the evaluation of an expression query over a range of time. As each label dimension possesses multiple points, corresponding lines will be composed accordingly.

The query_range API supports query, start, end, step, and timeout parameters.

As one data type of PromQL, range vector CANNOT be drawn on the dashboard. You can only apply correlative functions, such as avg, sum, etc. on range vectors. If the type of the result function is an instant vector, then the result CAN be displayed on the Dashboard.

Parser โ€‹

To explore data stored in GreptimeDB through PromQL, GreptimeDB needs to provide the ability to parse the query into AST (abstract syntax tree), and retrieve data from memory or disk via logical and physical plans. Since there is no ready-to-use PromQL Rust Parser, our team decides to develop it ourselves, and weโ€™re glad to announce that promql-parser v0.1.0 is now available.

We choose cfgrammar for promql-parser because it's compatible with yacc rules. Thus developers can easily reuse Prometheus's yacc rule file during the implementation. However, there are still several differences between Yacc and grmtools. One of them is explained below:

Although rare, it is possible to generate accept/reduce conflicts (e.g. for a grammar with the sole rule A: A;). grmtools considers accept/reduce conflicts to be a hard error, and refuses to generate anything for the resulting grammar, whereas Yacc allows them through (with unclear consequences). Bison also appears to consider accept/reduce conflicts a hard error, though it appears to detect them in a more generic way (reporting such rules as "not generating any sentences").

The difference causes shift/reduce conflicts in original prometheus Yacc rules (visit silcnitc for detailed explanations of the Yacc grammar).

In Prometheus, when matching many-to-one and one-to-many vectors, group_left and group_right can be used either with or without parenthesis (when no label is needed).

For example:

  • method_code:http_errors:rate5m / ignoring(code) group_left method:http_requests:rate5m
  • method_code:http_errors:rate5m / ignoring(code) group_left() method:http_requests:rate5m
  • method_code:http_errors:rate5m / ignoring(code) group_left(status) method:http_requests:rate5m

Nevertheless, in original Prometheus Yacc rules, it is not specified how group_<left/right> behaves when appending with (. As a result, the parser cannot tell if the expression in the appending () is the label belonging to the group_<left/right>, or it is just the right hand side expression of the binary operator, causing shift/reduce conflicts. ( has to be assigned with higher precedence, and this issue has been resolved at promql.y:

// left_paren has higher precedence than group_left/group_right, to fix the reduce/shift conflict.
// if group_left/group_right is followed by left_paren, the parser will shift instead of reduce
%nonassoc GROUP_LEFT GROUP_RIGHT
%right LEFT_PAREN

Conclusion โ€‹

PromQL is a handy tool everyone should master in the field of cloud-native monitoring and Observability. With its much simpler keywords compared to SQL and more tailored for time series scenarios, PromQL helps users explore the true value behind their vast amount of time series data.

This quick start only covers some common grammar and syntax. For other aggregators or functions, you can visit the official website query and cheat sheet to manage the language better. If you want to know how PromQL syntax and grammar are tested, visit parser_test.go or parse.rs.

As a database vendor committed to utilizing powerful database services, especially for time series data processing, GreptimeDB will soon support PromQL in the coming version 0.1. Please find us on GitHub to try out GreptimeDB, and you are also welcome to join our slack community to stay tuned.

Join our community

Get the latest updates and discuss with other users.

Subscribe to our newsletter

Get the latest dates and news about GreptimeDB.