The GreptimeDB v0.14 release focuses on enhancing full-text indexing capabilities, improving performance and cost efficiency, and introducing a dual-engine architecture for the Flow Engine by supporting both batching and streaming modes. Additionally, it officially adds support for OTel Traces ingestion and querying.
For v0.14, the Greptime team made significant progress: we merged 247 pull requests, including **100 feature enhancements**, 56 bug fixes, 30 code refactorings, 9 performance optimizations, and extensive testing improvements.
During this cycle, 7 community contributors made 16 contributions. We deeply appreciate their efforts and invite more developers to join us!
Let’s take a quick look at what’s new:
Full-Text Index Enhancements
GreptimeDB provides full-text indexing to accelerate text search operations.
Users can configure full-text indexes during table creation or modification, with a variety of options tailored for different use cases.
v0.14 brings two major updates to full-text indexing:
New matches_term Function and @@ Operator
Users can now perform exact term or phrase matching in SQL queries using the new matches_term function, which is especially useful for log analytics.
The @@ operator is also introduced as a shorthand for matches_term.
Example:
-- Using matches_term function
SELECT * FROM logs WHERE matches_term(message, 'error') OR matches_term(message, 'fail');
-- Using @@ operator (shorthand for matches_term)
SELECT * FROM logs WHERE message @@ 'error' OR message @@ 'fail';
How matches_term works:
- text: The column containing the string data to match.
- term: The exact term or phrase to match, following these rules:
  - Case-sensitive
  - The match must be delimited by non-alphanumeric characters or text boundaries
  - Supports both single-word and multi-word phrases
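To make the matching rules concrete, here is a minimal Python sketch that emulates them outside the database. It is not GreptimeDB's implementation: the function body, the use of Python's `str.isalnum()` as the definition of "alphanumeric", and the handling of an empty term are all assumptions made for illustration.

```python
def matches_term(text: str, term: str) -> bool:
    """Emulate the documented rules: a case-sensitive match that must be
    delimited by non-alphanumeric characters or by the text boundaries."""
    if not term:
        return False  # assumption: an empty term never matches

    start = text.find(term)  # str.find is case-sensitive
    while start != -1:
        end = start + len(term)
        # Left side is fine at the start of the text or after a delimiter.
        left_ok = start == 0 or not text[start - 1].isalnum()
        # Right side is fine at the end of the text or before a delimiter.
        right_ok = end == len(text) or not text[end].isalnum()
        if left_ok and right_ok:
            return True
        # Otherwise keep scanning for a later occurrence.
        start = text.find(term, start + 1)
    return False


if __name__ == "__main__":
    for text, term in [
        ("disk error: retry", "error"),          # delimited by ' ' and ':'
        ("errors found", "error"),               # followed by 's'
        ("Error found", "error"),                # different case
        ("connection timed out!", "timed out"),  # multi-word phrase
    ]:
        print(f"{text!r} / {term!r} -> {matches_term(text, term)}")
```

Under these assumptions the first and last inputs match, while "errors found" and "Error found" do not, which is the behaviour that separates term matching from a plain substring search.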
For detailed usage, check the GreptimeDB documentation.
Full-Text Index Backend: Bloom and Tantivy
v0.14 introduces a new Bloom-based full-text index backend.
Now users have two backend options to choose from depending on their use case:
1. Bloom Backend
Best For | General-purpose log search |
Highlights | Efficient filtering using Bloom filters; low storage overhead; stable performance across query patterns |
Limitations | Slightly slower for highly selective queries |
Example Storage Costs | Raw data: ~10 GB; Bloom index: ~1 GB |
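For readers unfamiliar with Bloom filters, the sketch below shows the general idea of using one small filter per block of log lines to skip blocks that cannot contain a term. It is a generic illustration, not GreptimeDB's index format: the `BloomFilter` class, the block layout, the whitespace tokenizer, and the filter sizes are all hypothetical.

```python
import hashlib


class BloomFilter:
    """A tiny Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item: str):
        # Derive several bit positions from seeded SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def build_block_filters(blocks):
    """Build one filter per block from whitespace-separated tokens."""
    filters = []
    for block in blocks:
        bloom = BloomFilter()
        for line in block:
            for token in line.split():
                bloom.add(token)
        filters.append(bloom)
    return filters


def candidate_blocks(filters, term):
    """Return indexes of blocks that may contain the term and must be scanned."""
    return [i for i, bloom in enumerate(filters) if bloom.might_contain(term)]


if __name__ == "__main__":
    blocks = [
        ["GET /index 200", "GET /login 200"],
        ["POST /pay 500 error"],
    ]
    filters = build_block_filters(blocks)
    print(candidate_blocks(filters, "error"))
```

The filter for each block has a fixed size regardless of how many tokens it holds, which is the general reason a Bloom-based index can stay much smaller than the raw data. A block that really contains the term is never skipped; a block that does not may still be scanned occasionally because of false positives.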
2. Tantivy Backend
Best For | High-selectivity queries (e.g., TraceID) |
Highlights | Fast, accurate matching using inverted indexes; outstanding performance for selective queries |
Limitations | High storage overhead (near raw data size); slower for low-selectivity queries |
Example Storage Costs | Raw data: ~10 GB; Tantivy index: ~10 GB |
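As a rough illustration of why an inverted index is fast for selective lookups yet costly to store, the following Python sketch maps every token to the set of rows that contain it. This is a textbook inverted index, not Tantivy's actual on-disk layout; the function names and the whitespace tokenizer are assumptions.

```python
from collections import defaultdict


def build_inverted_index(rows):
    """Map each whitespace-separated token to the row ids containing it."""
    index = defaultdict(set)
    for row_id, text in enumerate(rows):
        for token in text.split():
            index[token].add(row_id)
    return index


def lookup(index, term):
    """Return the sorted row ids for a term without scanning any row."""
    return sorted(index.get(term, ()))


if __name__ == "__main__":
    rows = [
        "trace_id=abc123 status=ok",
        "trace_id=def456 status=ok",
        "trace_id=abc123 status=error",
    ]
    index = build_inverted_index(rows)
    print(lookup(index, "trace_id=abc123"))  # rare term: short posting list
    print(lookup(index, "status=ok"))        # common term: longer posting list
```

A rare term such as a trace ID resolves directly to a handful of rows, while a common term returns a long posting list that still has to be fetched and processed. Because every distinct token and its row ids are stored, the index grows with the data itself.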
Performance Comparison
The table below compares query performance across methods (Bloom = 1x baseline):
Method | High Selectivity (e.g., TraceID) | Low Selectivity (e.g., "HTTP") |
---|---|---|
Bloom | 1x (baseline) | 1x (baseline) |
Tantivy | 5x faster | 5x slower |
LIKE Query | 50x slower | 1x |
Key observations:
- Tantivy excels at highly selective queries (like unique identifiers).
- Bloom performs more consistently across different query types.
- Bloom offers much lower storage costs (1GB vs 10GB in our tests).
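The relative figures above can be put on a common scale with a little arithmetic. The sketch below assumes that "5x faster" means one fifth of the Bloom latency and "50x slower" means fifty times the Bloom latency; the dictionaries simply restate the numbers from the table and the storage examples, and the variable names are made up for illustration.

```python
from fractions import Fraction

# Latency relative to Bloom (Bloom = 1); smaller is faster.
HIGH_SELECTIVITY_LATENCY = {
    "bloom": Fraction(1),
    "tantivy": Fraction(1, 5),  # "5x faster"
    "like": Fraction(50),       # "50x slower"
}
LOW_SELECTIVITY_LATENCY = {
    "bloom": Fraction(1),
    "tantivy": Fraction(5),     # "5x slower"
    "like": Fraction(1),        # "1x"
}

# Example storage figures in GB from the backend descriptions.
RAW_DATA_GB = 10
INDEX_SIZE_GB = {"bloom": 1, "tantivy": 10}


def speedup(latencies, fast, slow):
    """How many times faster `fast` is than `slow` under the same workload."""
    return latencies[slow] / latencies[fast]


def index_overhead(backend):
    """Index size as a fraction of the raw data size."""
    return Fraction(INDEX_SIZE_GB[backend], RAW_DATA_GB)


if __name__ == "__main__":
    print(speedup(HIGH_SELECTIVITY_LATENCY, "tantivy", "like"))
    print(speedup(LOW_SELECTIVITY_LATENCY, "bloom", "tantivy"))
    print(index_overhead("bloom"), index_overhead("tantivy"))
```

Read this way, Tantivy is about 250 times faster than LIKE on highly selective queries, while Bloom adds roughly 10% of the raw data size as index overhead compared with about 100% for Tantivy.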
Learn more about full-text indexing in the official documentation.
Other Notable Updates
Experimental Support for Jaeger Query Protocol
You can now query Traces data in GreptimeDB using Grafana’s Jaeger plugin or the Jaeger UI.
⚠️ The Jaeger query protocol support is currently experimental and may change in future releases.
For setup instructions, check the GreptimeDB documentation.
Flow Engine: Dual-Engine Architecture
The Flow Engine now supports both Batching mode and Streaming mode, enabling more flexible data processing:
- Batching Mode: Periodically processes data with a minimum flush interval.
- Streaming Mode: Continuously processes incoming data for lower latency.
Typical use cases:
- Batching: Suitable for large-scale aggregation, reporting, or scheduled jobs with smoother resource consumption.
- Streaming: Ideal for real-time scenarios requiring low-latency data processing, though it demands more resources.
The system currently auto-selects the appropriate mode based on the task.
In the future, we plan to allow users to configure the mode manually.
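The difference between the two modes can be sketched with a simple running count. This is a conceptual Python model only, not the Flow Engine's implementation: the function names, the `(timestamp, key)` event shape, and the `flush_interval` parameter are hypothetical.

```python
from collections import Counter


def streaming_counts(events):
    """Streaming: update the aggregate on every event, one snapshot each."""
    counts = Counter()
    snapshots = []
    for _, key in events:
        counts[key] += 1
        snapshots.append(dict(counts))
    return snapshots


def batching_counts(events, flush_interval):
    """Batching: buffer events and fold them in only once at least
    `flush_interval` time units have passed since the last flush."""
    counts = Counter()
    pending = []
    snapshots = []
    last_flush = None
    for ts, key in events:
        if last_flush is None:
            last_flush = ts
        pending.append(key)
        if ts - last_flush >= flush_interval:
            counts.update(pending)
            pending.clear()
            snapshots.append(dict(counts))
            last_flush = ts
    if pending:  # final flush for whatever is still buffered
        counts.update(pending)
        snapshots.append(dict(counts))
    return snapshots


if __name__ == "__main__":
    events = [(0, "error"), (1, "ok"), (2, "error"),
              (5, "ok"), (6, "error"), (11, "ok")]
    print(len(streaming_counts(events)))    # one result update per event
    print(len(batching_counts(events, 5)))  # far fewer, but delayed, updates
```

Both functions end with the same totals. The streaming version produces a fresh result after every event, while the batching version does less frequent work at the cost of results that lag by up to the flush interval.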
Read more about Flow Engine in the user guide.
Full Support for OTel Traces
GreptimeDB now fully supports direct ingestion of OpenTelemetry Traces using the native OTel protocol. Built-in table schemas make it easy to query, analyze, and visualize Traces data.
Learn how to ingest and work with Traces in the GreptimeDB documentation.
Upgrade Notes
v0.14 is fully compatible with v0.13 for both data and configurations.
If you’re on v0.13, you can upgrade directly.
For other versions, please follow our upgrade guide.
What’s Next?
Following our 2025 GreptimeDB Roadmap, the next versions will focus on:
- Bulk ingestion throughput optimization
- Distributed reliability and stability enhancements
Stay tuned for even more improvements!
About Greptime
GreptimeDB is an open-source, cloud-native database purpose-built for real-time observability. Built in Rust and optimized for cloud-native environments, it provides unified storage and processing for metrics, logs, and traces, delivering sub-second insights from edge to cloud at any scale.
GreptimeDB OSS – The open-source database for small to medium-scale observability and IoT use cases, ideal for personal projects or dev/test environments.
GreptimeDB Enterprise – A robust observability database with enhanced security, high availability, and enterprise-grade support.
GreptimeCloud – A fully managed, serverless DBaaS with elastic scaling and zero operational overhead. Built for teams that need speed, flexibility, and ease of use out of the box.
🚀 We’re open to contributors—get started with issues labeled good first issue and connect with our community.