Engage Prometheus in the Rust Ecosystem: A PromCon 2025 Talk Recap

Our engineer Ruihang Xia shared at PromCon EU 2025 how Rust can unlock new possibilities in the Go-dominated Prometheus ecosystem—and how GreptimeDB achieved the highest PromQL compliance score outside of Prometheus itself.

Open the PromQL compliance test page, and you'll see a clear pattern: row after row of Go gophers next to every tested implementation. If you want to build serious Prometheus tooling, Go seems like your only option.

But at PromCon EU 2025 in Munich, our engineer Ruihang Xia presented a different path—reimplementing core Prometheus components in Rust and achieving the highest PromQL compliance score among all tested implementations, second only to Prometheus itself.

This article is based on his talk "Engage Prometheus in the Rust Ecosystem."

Opening: Some Necessary "Content Warnings"

"I'm not that kind of guy who's yelling of using Rust to rewrite everything."

That's how Ruihang opened his talk. As an Apache DataFusion PMC member and Apache Arrow Committer, he works in the Rust ecosystem daily. But this talk wasn't about starting a language war.

He offered a few honest disclaimers: this is about Rust in a Go-centric project; some comparisons involve his own code (expect bias); and—"broken English, not very fluent."

This candor set the tone: technical discussion, not tribal warfare.

WHY: Bringing Prometheus into Rust

The Go-Dominated Reality

Open the PromLabs PromQL compliance test page[1], and you'll notice something striking: every tested project has a Go gopher next to it. Amazon Managed Service for Prometheus, VictoriaMetrics, Cortex, Thanos—all Go.

[Figure: PromQL compliance test results]

This isn't coincidental. The Prometheus ecosystem has formed a powerful, battle-tested stack: Prometheus itself, Cortex/Mimir, Thanos, M3DB, and various cloud-hosted services. Innovation typically happens within the Go ecosystem—by importing Prometheus internals, you can quickly build new features.

"If you are involved with Prometheus, you are mostly limited to exporting metrics, only in the SDKs to report data to Prometheus, not leverage your data with your system," Ruihang pointed out.

The Pain of CGO

Ruihang shared a personal story: he once spent nearly a year trying to integrate a Rust key-value storage engine into a Go project via CGO.

"I think that's one of the worst choices I have ever made," he admitted. In reality, about 70% of the time wasn't spent on the project itself—it was about negotiating between Go and Rust through CGO.

CGO lets Go call C code, and through a C ABI layer you can link Rust or C++ libraries into Go projects, but the practical friction far exceeds expectations. This experience led him to ask: instead of painfully bringing Rust into the Go ecosystem, why not rebuild Prometheus's core capabilities directly in Rust?

Rust's Unique Advantages

Choosing Rust isn't just about language preference—it's about accessing capabilities the Go ecosystem can't easily reach.

Cutting-Edge Data Formats

The talk presented a comparison of innovative data formats that have recently emerged:

| Format | Description | Source Language | Go SDK |
|---|---|---|---|
| BtrBlocks | Lightweight encoding/compression (SIGMOD'23) | C++ | No |
| FastLanes | SIMD-friendly format (VLDB'25) | C++ | No |
| Lance | ML/AI-focused columnar format | Rust | No |
| Nimble (Meta) | Wide table/feature store optimization | C++ | No |
| Vortex | Compressed Arrow arrays, LF AI incubation | Rust | No |

"I don't mean to find a lot of formats that's not available in Go. Actually, this is a list that our customers sent to me and asked, do you support them?" he explained. Except for the old Apache ORC, all the cutting-edge formats lack official Go SDKs.

Not surprising. More low-level innovation happens in Rust and C++, especially in data infrastructure.

Cross-Language Bindings: Nuclear Fission

Rust code can relatively easily generate bindings to other languages. Ruihang gave two examples:

  1. promql-parser: Our Rust PromQL parser now has Python and Lua bindings created by the community.

  2. Apache OpenDAL: This Rust-based cloud storage access layer has bindings for 15+ languages including C, C++, Java, Python, Node.js, and Go.

"Have you ever thought of writing some plugin in your NGINX layer with Lua script to intercept some PromQL query? This is not feasible if you are all in Go, but it's now an option."

Jump out of the Go ecosystem, and it's like nuclear fission—releasing multi-language energy. Theoretically Go could achieve similar results through CGO, but such cases are rare in practice.

WHAT: Prometheus as a Standard

This practical experience changed how Ruihang views Prometheus.

"My perspective to Prometheus has changed from it is a monolith system to, I think it's more like a standard for me."

Prometheus defines three core interfaces:

  1. OpenMetrics — Data format standard
  2. PromQL — Query language standard
  3. Remote Write — Transport protocol standard

All three have corresponding compliance tests[2]. Standards aren't bound to any specific language—if your implementation conforms to these interface specifications, you're part of the Prometheus ecosystem.

"If one thing sounds like Prometheus and looks like Prometheus and acts like Prometheus, then it is Prometheus."

This perspective shift matters: you can implement a Prometheus-compatible system in any language, with any tech stack, without being confined to the Go ecosystem.
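
To make the "standard, not monolith" point concrete, here is a minimal sketch of reading one sample line of the text exposition format in plain Rust, with no Prometheus code involved. The names `ParsedSample` and `parse_sample` are invented for this illustration, and the parser is deliberately incomplete: it assumes no escape sequences or commas inside label values, no timestamps, and no `# HELP` / `# TYPE` lines.

```rust
use std::collections::BTreeMap;

/// One parsed sample line (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
pub struct ParsedSample {
    pub name: String,
    pub labels: BTreeMap<String, String>,
    pub value: f64,
}

/// Parses a line such as `http_requests_total{method="GET",code="200"} 1027`.
/// Returns `None` for anything outside the simplified subset described above.
pub fn parse_sample(line: &str) -> Option<ParsedSample> {
    let line = line.trim();
    // The value is everything after the last space.
    let split = line.rfind(' ')?;
    let series = line[..split].trim();
    let value: f64 = line[split + 1..].parse().ok()?;

    let (name, labels) = match series.find('{') {
        // No label block: the whole series part is the metric name.
        None => (series, BTreeMap::new()),
        Some(open) => {
            if !series.ends_with('}') {
                return None;
            }
            let mut labels = BTreeMap::new();
            let body = &series[open + 1..series.len() - 1];
            for pair in body.split(',').filter(|p| !p.trim().is_empty()) {
                let eq = pair.find('=')?;
                let key = pair[..eq].trim();
                let val = pair[eq + 1..].trim();
                // Label values must be wrapped in double quotes.
                if val.len() < 2 || !val.starts_with('"') || !val.ends_with('"') {
                    return None;
                }
                labels.insert(key.to_string(), val[1..val.len() - 1].to_string());
            }
            (&series[..open], labels)
        }
    };
    if name.is_empty() {
        return None;
    }
    Some(ParsedSample { name: name.to_string(), labels, value })
}
```

Calling `parse_sample("up 1")` yields a sample named `up` with no labels; a production implementation would follow the full format specification and its compliance tests rather than this subset.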

HOW: Two Fundamentally Different Implementation Strategies

The talk shared two cases representing completely different implementation strategies.

Case 1: promql-parser — Faithful Following

promql-parser[3] is our open-source pure Rust PromQL parser. Its strategy: follow the native Prometheus implementation as faithfully as possible.

Technical Implementation

About five years ago, Prometheus migrated from a hand-written parser to a YACC-based generated parser. This created an opportunity for cross-language reuse: YACC grammar definitions are language-agnostic.

promql-parser reuses Prometheus's YACC grammar definition, with only the necessary porting work (Prometheus uses goyacc, which differs slightly from standard YACC). The core outputs are:

  1. AST (Abstract Syntax Tree) definitions — describing the structure of PromQL queries
  2. Generated parsing logic — converting query strings to AST
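```rust
use promql_parser::parser;

fn main() {
    // A selector with a regex matcher, a negative matcher, and an offset modifier.
    let promql = r#"
        http_requests_total{
            environment=~"staging|testing|development",
            method!="GET"
        } offset 5m
    "#;

    // `parse` returns the AST on success or an error description on failure.
    match parser::parse(promql) {
        Ok(expr) => {
            println!("Prettify:\n{}\n", expr.prettify());
            println!("AST:\n{expr:?}");
        }
        Err(info) => println!("Err: {info:?}"),
    }
}
```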

Careful Dependency Selection

The Rust ecosystem has no one-to-one equivalents for Prometheus's Go dependencies. About half of the development time went into choosing dependencies carefully to maximize compatibility.

Use Cases

  • PromQL query analysis
  • Query interception and rewriting
  • Structured alert rule processing
  • Parsing PromQL in Lua scripts (via lua-promql-parser binding)
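
The first two use cases boil down to walking or mutating an AST. The sketch below uses a toy `Expr` enum defined here for illustration; it is not promql-parser's actual AST, whose types are richer and named differently. `metric_names` stands in for query analysis and `inject_matcher` for the kind of rewrite a proxy might apply to scope every selector to one tenant.

```rust
/// A deliberately small, hypothetical PromQL AST (not the promql-parser types).
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    /// `metric{label="value"}`, with equality matchers only.
    VectorSelector { metric: String, matchers: Vec<(String, String)> },
    /// `func(arg, ...)`, for example `abs(...)`.
    Call { func: String, args: Vec<Expr> },
    /// `lhs op rhs`, for example `a / b`.
    Binary { op: char, lhs: Box<Expr>, rhs: Box<Expr> },
    /// A numeric literal such as `100`.
    Number(f64),
}

/// Query analysis: collects every metric name the expression references.
pub fn metric_names(expr: &Expr, out: &mut Vec<String>) {
    match expr {
        Expr::VectorSelector { metric, .. } => out.push(metric.clone()),
        Expr::Call { args, .. } => {
            for arg in args {
                metric_names(arg, out);
            }
        }
        Expr::Binary { lhs, rhs, .. } => {
            metric_names(lhs, out);
            metric_names(rhs, out);
        }
        Expr::Number(_) => {}
    }
}

/// Query rewriting: adds `label="value"` to every vector selector in place.
pub fn inject_matcher(expr: &mut Expr, label: &str, value: &str) {
    match expr {
        Expr::VectorSelector { matchers, .. } => {
            matchers.push((label.to_string(), value.to_string()));
        }
        Expr::Call { args, .. } => {
            for arg in args.iter_mut() {
                inject_matcher(arg, label, value);
            }
        }
        Expr::Binary { lhs, rhs, .. } => {
            inject_matcher(lhs, label, value);
            inject_matcher(rhs, label, value);
        }
        Expr::Number(_) => {}
    }
}
```

With a real parser in front, the flow would be parse, rewrite, then print the expression back to a query string before forwarding it upstream.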

Project Status

At the time of the talk, promql-parser was compatible with Prometheus v2.45.0. Shortly after, we released v0.7.0 with Prometheus 3 support—including string identifiers and Unicode label names[4]. By reusing the YACC grammar definition, we can keep up with upstream changes relatively easily.

Case 2: PromQL Query Engine — Complete Reconstruction

GreptimeDB's PromQL execution engine took a completely different path: fully deconstruct PromQL and rebuild it using data lake approaches.

Core Insight: "Monitoring is About Junk Data"

"Monitoring is about junk data," Ruihang stated bluntly.

This isn't pejorative. Individual monitoring data points have low value—you don't care about a single CPU usage reading, you care about trends, anomaly detection, and aggregations. Only by storing and processing large volumes can you extract value.

This is exactly what data lakes excel at.

Data Lake + Prometheus

GreptimeDB maps PromQL semantics entirely to the SQL model:

| PromQL Concept | SQL/Data Lake Mapping |
|---|---|
| Binary Operator | JOIN |
| Range Vector | Array slices (Array of Array) |
| Filter | WHERE clause |
| Aggregation | GROUP BY + aggregate functions |
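
The range-vector row is probably the least obvious of these mappings, so here is a minimal sketch of the idea in plain Rust. It is not GreptimeDB's implementation: `range_windows` and `sum_over_time` are names chosen for this example, samples are assumed to be sorted by timestamp, and the left-open window `(t - range, t]` is a convention picked for the sketch.

```rust
/// One sample as (timestamp in seconds, value).
pub type Sample = (i64, f64);

/// Builds the range vector `series[range]` at every evaluation timestamp from
/// `start` to `end` (inclusive) in increments of `step`. Each window is a
/// borrowed slice of the input covering (t - range, t], so the result is an
/// array of arrays that copies no samples.
pub fn range_windows(
    series: &[Sample],
    start: i64,
    end: i64,
    step: i64,
    range: i64,
) -> Vec<&[Sample]> {
    assert!(step > 0, "step must be positive");
    let mut windows = Vec::new();
    let mut t = start;
    while t <= end {
        // Binary search works because the samples are sorted by timestamp.
        let lo = series.partition_point(|s| s.0 <= t - range);
        let hi = series.partition_point(|s| s.0 <= t);
        windows.push(&series[lo..hi]);
        t += step;
    }
    windows
}

/// A `sum_over_time`-style function: one output value per window.
pub fn sum_over_time(windows: &[&[Sample]]) -> Vec<f64> {
    windows
        .iter()
        .map(|w| w.iter().map(|s| s.1).sum::<f64>())
        .collect()
}
```

Once the windows exist as slices, every `*_over_time` function is an ordinary per-array computation, which is what makes the array-of-arrays representation convenient for a columnar engine.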

This mapping lets you reuse mature SQL execution engine capabilities. More importantly, the approach generalizes—"If you want to use, for example, DuckDB to read your own version of PromQL, you can, or at least for me, I have confidence to do this within one month."
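
As a rough illustration of the binary-operator and aggregation rows, the sketch below evaluates `lhs / rhs` as an inner hash join on label sets and `sum by (label)` as a GROUP BY. It assumes one-to-one matching on identical label sets with the metric name already stripped, and it ignores `on`, `ignoring`, and group modifiers. All names (`Labels`, `InstantVector`, `labels`, `divide`, `sum_by`) are made up for this example.

```rust
use std::collections::{BTreeMap, HashMap};

/// A label set without the metric name, kept sorted so equal sets compare equal.
pub type Labels = BTreeMap<String, String>;
/// An instant vector: one value per label set at a single timestamp.
pub type InstantVector = Vec<(Labels, f64)>;

/// Convenience constructor for a label set.
pub fn labels(pairs: &[(&str, &str)]) -> Labels {
    pairs.iter().map(|&(k, v)| (k.to_string(), v.to_string())).collect()
}

/// `lhs / rhs` as an inner join keyed on the full label set.
pub fn divide(lhs: &[(Labels, f64)], rhs: &[(Labels, f64)]) -> InstantVector {
    // Build side: index the right-hand vector by label set.
    let index: HashMap<&Labels, f64> = rhs.iter().map(|(l, v)| (l, *v)).collect();
    // Probe side: left rows without a matching right row are dropped.
    lhs.iter()
        .filter_map(|(l, v)| index.get(l).map(|r| (l.clone(), *v / *r)))
        .collect()
}

/// `sum by (label) (v)` as GROUP BY `label` plus SUM.
pub fn sum_by(v: &[(Labels, f64)], label: &str) -> BTreeMap<String, f64> {
    let mut groups = BTreeMap::new();
    for (l, value) in v {
        // Series missing the label fall into the empty-string group.
        let key = l.get(label).cloned().unwrap_or_default();
        *groups.entry(key).or_insert(0.0) += *value;
    }
    groups
}
```

A SQL engine would plan these as a join and an aggregate node and could then apply its usual optimizations to them.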

Tech Stack: All Apache Projects

GreptimeDB's chosen stack:

  • Apache Parquet — Columnar file format, also used as the index format
  • Apache DataFusion — Query execution engine; PromQL remapping and reusable operations are directly taken from it
  • Apache Arrow — In-memory format used not only at the computation level, but also at the protocol level and memory cache level
  • Apache OpenDAL — Cloud storage access layer for many different services

"We didn't choose our dependency based on their names, but they happened to bring Apache," Ruihang joked.

Goals Achieved

  1. Drop-in replacement: Just change the query endpoint to switch from Prometheus
  2. Highest PromQL compliance: The highest score of any implementation except Prometheus itself, higher even than implementations that depend on Prometheus
  3. Built-in distributed support: Horizontal scaling without additional components

SQL Experience Absorbed

Reimplementation brought unexpected benefits—direct access to mature SQL ecosystem technologies:

  • Query Optimizer: Rewrite inefficient queries or reject queries that could overload your system
  • EXPLAIN ANALYZE: Visualize query execution plans with precise timing per step
  • Out-of-the-box distributed query support
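
As one hedged example of "rejecting queries that could overload your system", a plan-time check can bound how many points a range query would produce per series. The rule below is hypothetical and not taken from GreptimeDB; `RangeQuery`, `check_resolution`, and the limit parameter are invented for this sketch. Prometheus's own range-query API enforces a comparable cap of 11,000 points per series.

```rust
/// A range query: evaluate from `start` to `end` (seconds) every `step` seconds.
pub struct RangeQuery {
    pub start: u64,
    pub end: u64,
    pub step: u64,
}

/// Returns the number of evaluation points per series, or an error when the
/// range is invalid or the query exceeds `max_points`.
pub fn check_resolution(q: &RangeQuery, max_points: u64) -> Result<u64, String> {
    if q.step == 0 || q.end < q.start {
        return Err("invalid time range or step".to_string());
    }
    // Both endpoints are evaluated, hence the + 1.
    let points = (q.end - q.start) / q.step + 1;
    if points > max_points {
        Err(format!(
            "query would return {} points per series, limit is {}",
            points, max_points
        ))
    } else {
        Ok(points)
    }
}
```

A range of 0 to 40 seconds with a 10-second step passes with 5 points, while a 30-day range at a 1-second step is rejected before any data is read.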


SQL + PromQL Hybrid Queries

A unique capability: mixing SQL and PromQL in the same query using CTE (Common Table Expression):

```sql
WITH
    tql_data (ts, val) AS (TQL EVAL (0, 40, '10s') metric),
    filtered AS (SELECT * FROM tql_data WHERE val > 5)
SELECT count(*) FROM filtered;
```

TelemetryQL (TQL) is GreptimeDB's wrapper for PromQL.

The first CTE is PromQL, the second filters its output with SQL, and the final SELECT counts the remaining rows. Both query languages have their strengths, and now you can pick whichever fits each step.

How does this work? We lower all query logic to a central intermediate representation (IR), the logical plan. At this level, you cannot tell whether a step came from SQL or PromQL; the two are fully mixed, so they cooperate seamlessly.
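
A toy version of that idea: two front ends that lower to the same plan nodes. The `Plan` enum here is a hypothetical stand-in for a real plan type such as DataFusion's `LogicalPlan`, and the column name `val` and the one-metric-per-table layout are assumptions of this sketch.

```rust
/// A toy logical plan with just enough nodes for the example.
#[derive(Debug, Clone, PartialEq)]
pub enum Plan {
    /// Read every row of a table.
    Scan { table: String },
    /// Keep rows where `column > threshold`.
    Filter { input: Box<Plan>, column: String, threshold: f64 },
    /// Count the rows produced by `input`.
    Count { input: Box<Plan> },
}

/// SQL front end: `SELECT * FROM <table> WHERE <column> > <threshold>`.
pub fn from_sql_where(table: &str, column: &str, threshold: f64) -> Plan {
    Plan::Filter {
        input: Box::new(Plan::Scan { table: table.to_string() }),
        column: column.to_string(),
        threshold,
    }
}

/// PromQL front end: `<metric> > <threshold>`, a comparison that filters samples.
pub fn from_promql_gt(metric: &str, threshold: f64) -> Plan {
    Plan::Filter {
        input: Box::new(Plan::Scan { table: metric.to_string() }),
        column: "val".to_string(),
        threshold,
    }
}

/// Either front end's output can feed a further SQL-style step.
pub fn count(input: Plan) -> Plan {
    Plan::Count { input: Box::new(input) }
}
```

Here `from_sql_where("metric", "val", 5.0)` and `from_promql_gt("metric", 5.0)` produce equal plans, so an optimizer or executor downstream has no need to know which language a node came from.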

For a practical example of using TQL with CTE for Kubernetes monitoring, see PromQL + SQL: A Practical Guide to Analyzing Kubernetes Monitoring Data.

Rich Index Support

Beyond the core capabilities mentioned in the talk, GreptimeDB provides rich index support to accelerate queries:

| Index Type | Purpose | Implementation |
|---|---|---|
| Inverted Index | Time series lookup | In-house |
| Skipping Index | Large-scale time series filtering | In-house |
| Fulltext Index | Keyword search | In-house/tantivy |
| Vector Index | Similarity queries | In progress |

Extensible Semantics

Complete reimplementation also means extending PromQL semantics:

  • Optional extrapolation: Control whether extrapolation is applied
  • Strict mode: Prometheus doesn't report an error when you query something that doesn't exist; we can choose to report one, which is closer to SQL behavior
  • Rich types: Support Duration, Interval, DateTime, and other types
  • Multi-values: Physically store related metrics like load1, load5, load15 together
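
Strict mode is the easiest of these to picture in code. The sketch below models a metric lookup that either returns an empty result for an unknown name, as Prometheus does, or fails the way an unknown table does in SQL. The store layout, `select`, and `QueryError` are invented for this example and do not reflect GreptimeDB's API.

```rust
use std::collections::BTreeMap;

/// One series as (timestamp, value) pairs.
pub type Series = Vec<(i64, f64)>;

#[derive(Debug, PartialEq)]
pub enum QueryError {
    /// The queried metric does not exist in the store.
    UnknownMetric(String),
}

/// Selects a metric by name. With `strict` off, an unknown metric yields an
/// empty series; with `strict` on, it yields an error instead.
pub fn select(
    store: &BTreeMap<String, Series>,
    metric: &str,
    strict: bool,
) -> Result<Series, QueryError> {
    match store.get(metric) {
        Some(series) => Ok(series.clone()),
        None if strict => Err(QueryError::UnknownMetric(metric.to_string())),
        None => Ok(Vec::new()),
    }
}
```

The practical difference shows up with typos: querying `laod1` instead of `load1` silently returns nothing in lenient mode, while strict mode surfaces the mistake immediately.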

On rich types, Ruihang gave an interesting example: "Have you ever wanted to query things other than just float? What about duration? What about interval? What about date time? My home assistant, I have a mmWave radar in my home, and it will report if there are some human activities, which the data is represented in interval."

Catching the Fundamentals: Protocol, Data, Computation

Ruihang summarized the core philosophy: the rewriting is about catching the fundamental things—the protocols, the data, and the computation.

  1. Protocol — Compatible with Prometheus interface standards
  2. Data — Stored and organized the data lake way
  3. Computation — Reuse SQL execution engine capabilities

Catch these three fundamentals, and the specific implementation can be completely different. That's why a system that "looks like Prometheus and works like Prometheus" can be built with an entirely different tech stack.

The PromQL Compliance Journey

GreptimeDB's PromQL compliance wasn't achieved overnight. The GitHub tracking issue[5] documents the evolution:

| Date | Compliance | Milestone |
|---|---|---|
| 2023-02 | 13.14% | Initial support |
| 2023-02 | 33.03% | aggr_over_time functions |
| 2023-04 | 54.38% | Fixed offset edge cases |
| 2023-06 | 66.61% | predict_linear support |
| 2023-12 | 82.12% | histogram_quantile support |

From 13% to over 82%—two years of continuous investment. GreptimeDB now holds the highest score outside of Prometheus itself, and we're continuing to improve compliance.

Closing

"In Rust, in multiple languages, in the most suitable tech stack." That's how Ruihang concluded.

Prometheus as an open standard means more possibilities:

  1. A new toolbox opens: Rust ecosystem's cutting-edge data infrastructure can now serve Prometheus users
  2. Cross-language collaboration becomes possible: From Rust, you can bind to almost any language
  3. Extend time series the data lake way: Absorb decades of engineering wisdom from the SQL world

This isn't about replacing Go ecosystem Prometheus projects—it's about providing more implementation choices for the Prometheus standard.

Q&A

After the talk, the host remarked: "I've always been waiting for the day somebody wakes up and says, let's rewrite Prometheus in Rust. And I'm happy that it's happening. It's actually quite exciting."

An audience member asked: "Where can we follow what you're doing?"

"For the parser, we have a dedicated repo for it, and you can also try to add bindings for other languages. And for the query, it is a standalone module within GreptimeDB's repo. It does not have a dedicated repo, but the code itself is separated."


This article is based on Ruihang Xia's talk "Engage Prometheus in the Rust Ecosystem" at PromCon EU 2025 (Munich). Ruihang is a Senior Software Engineer at GreptimeDB and Apache DataFusion PMC Member.

References


  1. PromQL Compliance Tests - PromLabs
  2. Prometheus Compliance Repository
  3. promql-parser - GitHub
  4. promql-parser v0.7.0 Release
  5. Tracking Issue: Improve PromQL compliance - GreptimeDB
