Welcome New Committer Erxi | Greptime Biweekly Report - No. 84

Welcome Our New Committer: Erxi 🎉


We're delighted to welcome Erxi as a new GreptimeDB Committer!

In the community he goes by Erxi — a name with a quiet bit of wordplay. It comes from the Chinese word "无名" (nameless): keep only the upper halves of the two characters and they become "二夕" (Erxi).

Erxi comes from a big-data and OLAP DBA background. What first drew him to GreptimeDB was stability: years of running production systems have taught him how much it matters. In a conversation with Wayne, he learned that GreptimeDB has had very few severe, system-level production issues, and that most problems have been logic bugs instead, something he attributes in part to the project being written in Rust. The more he took part, the more he came to value how open and contributor-friendly the project keeps its community, even with a company behind it.

In GreptimeDB his interests span storage and flush behavior, big-data ecosystem integration, and observability. Over the past two months he has landed a steady run of meaningful work, including:

Hardening the metasrv control plane — revoking meta KV writes outside the metasrv leader (#8060)
Flush observability — propagating flush reasons through the FlushRegions path (#8051)
Query lifecycle — tracking INSERT ... SELECT in the process manager so it can be cancelled, with KILL coverage (#8138, #8151)
Pipeline ergonomics — allowing detailed index configuration in pipelines (#8036)
Mito option validation (#8094) and gRPC CLI option naming alignment (#8021)
COPY FROM CSV usability — skipping bad records (#8198) and headerless CSV support (#8233)

Beyond GreptimeDB, Erxi is active in the Apache Paimon community — especially the paimon-rust project and integrations with Ray Data and Daft — maintains a personal GreptimeDB Flink connector, and has lately been exploring AI observability and multimodal data processing.

In his own words, he'd rather not pin too many labels on himself — just "a developer who enjoys exploring and building interesting things." You can find him on X and GitHub.

Welcome aboard, Erxi! 🚀

Summary

Development period: 2026-05-18 - 2026-05-31

Here are the highlights from recent commits:

Tables can be repartitioned via meta-srv without being dropped and recreated
Flow queries track incremental read positions with checkpoint support
Query prefilters cache results to skip repeated parquet row group scans

We encourage users on older versions to upgrade for these fixes and improvements.

Contributors

Over the past two weeks, 15 contributors merged a total of 50 PRs. Among them, 4 individual contributors contributed 5 PRs. Welcome to our new contributor: @rogierlommers!

Thanks to our individual contributors:

Highlights of Recent PRs

db#8179 feat(flow): support incremental read checkpoints

Flow queries can now use incremental checkpoints instead of full table scans when processing supported aggregate SQL patterns. The system tracks region watermarks to safely advance checkpoints and merges delta results with existing sink state, reducing computation overhead for continuous data processing workloads.
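To make the idea concrete, here is a minimal Python sketch of checkpointed incremental aggregation. It is not GreptimeDB's implementation: `FlowState`, `safe_watermark`, and `incremental_step` are hypothetical names, and a single integer sequence number stands in for whatever position each region actually reports. It only illustrates the two points described above: the checkpoint advances no further than the slowest region's watermark, and the delta is merged into the existing sink state.

```python
from dataclasses import dataclass, field


@dataclass
class FlowState:
    """Hypothetical in-memory stand-in for a flow's checkpoint and sink table."""
    checkpoint: int = 0                       # rows with seq <= checkpoint are already folded in
    sink: dict = field(default_factory=dict)  # group key -> (sum, count)


def safe_watermark(region_watermarks):
    """The checkpoint may only advance to a point every region has reached."""
    return min(region_watermarks.values())


def incremental_step(state, rows, region_watermarks):
    """Fold rows in (checkpoint, watermark] into the sink, then advance the checkpoint.

    `rows` is an iterable of (seq, key, value) tuples. The range filter below
    stands in for a ranged read; a real system would not iterate older rows.
    """
    upper = safe_watermark(region_watermarks)
    if upper <= state.checkpoint:
        return state  # nothing new is safe to read yet

    # Aggregate only the delta.
    delta = {}
    for seq, key, value in rows:
        if state.checkpoint < seq <= upper:
            s, c = delta.get(key, (0, 0))
            delta[key] = (s + value, c + 1)

    # Merge the delta with the existing sink state (SUM and COUNT are mergeable).
    for key, (s, c) in delta.items():
        old_s, old_c = state.sink.get(key, (0, 0))
        state.sink[key] = (old_s + s, old_c + c)

    state.checkpoint = upper
    return state


rows = [(1, "a", 10), (2, "b", 5), (3, "a", 7), (4, "b", 1)]
state = FlowState()
incremental_step(state, rows, {"region-0": 2, "region-1": 3})  # safe watermark is 2
incremental_step(state, rows, {"region-0": 4, "region-1": 4})  # safe watermark is 4
print(state.checkpoint, state.sink)  # 4 {'a': (17, 2), 'b': (6, 2)}
```

The sketch uses SUM and COUNT because their partial results can be added together, which is what makes merging a delta into the sink valid. Aggregates without that property would need a different strategy.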

db#8108 feat: inc query join rewrite helper

Incremental queries can now be rewritten as joins with sink tables through a new query rewrite helper. This internal optimization lays the foundation for more efficient incremental query execution patterns.

db#8154 feat: add flow query-context plumbing for terminal watermarks

Flow tasks can now propagate query context and collect terminal watermark metrics through new plumbing in the frontend client. This infrastructure prepares for incremental Flow reads without changing current batching execution behavior.

db#8186 feat(meta-srv): support repartition for unpartitioned tables

Until now, repartitioning only worked on tables that already had partition keys defined. The repartition procedure can now convert unpartitioned tables into partitioned ones by adding partition key indices and splitting the single region into multiple regions.
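The conversion can be pictured with a toy Python model. Everything here is an assumption made for illustration: the `Table` class, the `repartition_unpartitioned` function, and the use of sorted range bounds as the partition rule are not taken from the PR. The sketch only mirrors the two steps named above: recording a partition key index and splitting the single region into several.

```python
import bisect
from dataclasses import dataclass, field


@dataclass
class Table:
    """Hypothetical toy model of table metadata; not GreptimeDB's actual types."""
    columns: list
    regions: list  # each region is a list of row tuples
    partition_key_indices: list = field(default_factory=list)  # empty means unpartitioned


def repartition_unpartitioned(table, key_column, bounds):
    """Turn a single-region, unpartitioned table into a range-partitioned one.

    `bounds` must be sorted; it yields len(bounds) + 1 regions.
    """
    if table.partition_key_indices or len(table.regions) != 1:
        raise ValueError("expected an unpartitioned table with exactly one region")

    key_idx = table.columns.index(key_column)
    new_regions = [[] for _ in range(len(bounds) + 1)]
    for row in table.regions[0]:
        # bisect_right sends a value equal to a bound into the upper region.
        new_regions[bisect.bisect_right(bounds, row[key_idx])].append(row)

    table.partition_key_indices = [key_idx]  # record which column is the partition key
    table.regions = new_regions
    return table


table = Table(
    columns=["ts", "device_id", "value"],
    regions=[[(1, 50, 0.1), (2, 100, 0.2), (3, 250, 0.3), (4, 199, 0.4)]],
)
repartition_unpartitioned(table, "device_id", bounds=[100, 200])
print(table.partition_key_indices)        # [1]
print([len(r) for r in table.regions])    # [1, 2, 1]
```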

db#8102 feat: implement a cache for the prefilter

Repeated scans over the same parquet row groups now reuse previously computed filter results from a new PrefilterResultCache instead of re-reading and re-evaluating filter columns. The cache is keyed per row group and filter expression, with a configurable size limit (default 128MB) set via the new prefilter_result_cache_size option.
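A size-bounded cache of this kind can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the actual PrefilterResultCache: the class name `PrefilterCacheSketch`, the LRU eviction policy, the `file_id` component of the key, and the representation of a filter result as a `bytes` bitmap are all hypothetical.

```python
from collections import OrderedDict


class PrefilterCacheSketch:
    """Size-bounded LRU cache; a hypothetical stand-in for PrefilterResultCache."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self._entries = OrderedDict()  # (file_id, row_group, filter_expr) -> bytes

    def get_or_compute(self, file_id, row_group, filter_expr, compute):
        key = (file_id, row_group, filter_expr)
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            return self._entries[key]

        selection = compute()  # the expensive step: read filter columns, evaluate
        self._insert(key, selection)
        return selection

    def _insert(self, key, selection):
        if len(selection) > self.capacity_bytes:
            return  # too large to cache at all
        self._entries[key] = selection
        self.used_bytes += len(selection)
        while self.used_bytes > self.capacity_bytes:
            _, evicted = self._entries.popitem(last=False)  # drop least recently used
            self.used_bytes -= len(evicted)


calls = []


def evaluate():
    calls.append(1)
    return bytes([0b1010])  # one byte of selection bitmap


cache = PrefilterCacheSketch(capacity_bytes=128 * 1024 * 1024)
first = cache.get_or_compute("file-1", 0, "host = 'a'", evaluate)
second = cache.get_or_compute("file-1", 0, "host = 'a'", evaluate)
print(len(calls))  # 1: the second lookup was served from the cache
```

Bounding the cache by bytes instead of entry count matches the way the option is described (a size limit), since filter results for large row groups can differ widely in size.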

Good First Issue

Issue#8227 Timestamp display precision should respect column schema

When querying through the MySQL/PostgreSQL CLI, timestamp rendering does not always respect the precision defined by the column schema — e.g. a TIMESTAMP(9) column gets truncated to microseconds. Fix the formatting to honor each column's declared precision.

Keywords: MySQL protocol, Timestamp formatting
Difficulty: Easy
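For anyone picking up Issue#8227, the intended behavior can be illustrated with a short Python sketch. It is not the server's formatting code: `format_timestamp` is a hypothetical helper, and it assumes the value is stored as nanoseconds since the Unix epoch and rendered in UTC.

```python
from datetime import datetime, timezone


def format_timestamp(nanos_since_epoch, precision):
    """Render a UTC timestamp with exactly `precision` fractional digits (0 to 9)."""
    if not 0 <= precision <= 9:
        raise ValueError("precision must be between 0 and 9")
    seconds, nanos = divmod(nanos_since_epoch, 1_000_000_000)
    base = datetime.fromtimestamp(seconds, tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
    if precision == 0:
        return base
    # Keep the leading `precision` digits of the 9-digit nanosecond field.
    fraction = f"{nanos:09d}"[:precision]
    return f"{base}.{fraction}"


ts = 1_700_000_000_123_456_789
print(format_timestamp(ts, 9))  # 2023-11-14 22:13:20.123456789
print(format_timestamp(ts, 6))  # 2023-11-14 22:13:20.123456
```

The second call shows the truncation the issue describes: a TIMESTAMP(9) value rendered with only six fractional digits loses its last three.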

Issue#7987 feat: add flow_statistics system table and SHOW FLOW STATUS for flow runtime observability

Add a system table called flow_statistics and a SHOW FLOW STATUS SQL command to display flow runtime information like start time, uptime, processed data volume, and recent errors.

Keywords: SQL parser, Observability
Difficulty: Medium

