
Engineering
April 27, 2025

GreptimeDB v0.14 Release: Full-Text Index Evolution, Dual-Engine Flow, and OTel Traces Support

GreptimeDB v0.14 introduces enhanced full-text search with matches_term, Bloom indexing, and a dual-engine Flow architecture. It also adds official support for OpenTelemetry Traces and Jaeger querying, boosting scalability and observability.

The GreptimeDB v0.14 release focuses on enhancing full-text indexing capabilities, improving performance and cost efficiency, and introducing a dual-engine architecture for the Flow Engine by supporting both batching and streaming modes. Additionally, it officially adds support for OTel Traces ingestion and querying.

For v0.14, the Greptime team made significant progress: we merged 247 pull requests, including 100 feature enhancements, 56 bug fixes, 30 code refactorings, 9 performance optimizations, and extensive testing improvements.

During this cycle, 7 community contributors made 16 contributions. We deeply appreciate their efforts and invite more developers to join us!

Let’s take a quick look at what’s new:

Full-Text Index Enhancements

GreptimeDB provides full-text indexing to accelerate text search operations.

Users can configure full-text indexes during table creation or modification, with a variety of options tailored for different use cases.

v0.14 brings two major updates to full-text indexing:

New matches_term Function and @@ Operator

Users can now perform exact term or phrase matching in SQL queries using the new matches_term function, which is especially useful for log analytics.

The @@ operator is also introduced as a shorthand for matches_term.

Example (SQL):
-- Using matches_term function
SELECT * FROM logs WHERE matches_term(message, 'error') OR matches_term(message, 'fail');
-- Using @@ operator (shorthand for matches_term)
SELECT * FROM logs WHERE message @@ 'error' OR message @@ 'fail';

How matches_term works:

matches_term(text, term) takes two arguments:

  • text: The column containing the string data to match.
  • term: The exact term or phrase to match, following these rules:
    • Case-sensitive
    • The match must be delimited by non-alphanumeric characters or text boundaries
    • Supports both single-word and multi-word phrases
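The matching rules above can be modeled in a few lines of Python. This is only a rough sketch of the stated rules, not GreptimeDB's implementation: it assumes "alphanumeric" means ASCII letters and digits, and it does not try to model terms that themselves begin or end with punctuation, which the rules above do not specify. The Python function name simply mirrors the SQL function for readability.

```python
import re

# Assumption: "alphanumeric" means ASCII letters and digits.
_ALNUM = "A-Za-z0-9"


def matches_term(text: str, term: str) -> bool:
    """Return True if `term` occurs in `text` as a whole term.

    The occurrence must not be directly preceded or followed by an
    alphanumeric character; the start and end of the text also count
    as boundaries. Matching is case-sensitive.
    """
    pattern = rf"(?<![{_ALNUM}]){re.escape(term)}(?![{_ALNUM}])"
    return re.search(pattern, text) is not None


# Whole-word match delimited by a space and a colon.
print(matches_term("disk error: timeout", "error"))      # True
# "errors" continues with a letter, so "error" is not a whole term.
print(matches_term("3 errors found", "error"))           # False
# Case-sensitive: "Error" does not match "error".
print(matches_term("Error while reading", "error"))      # False
# Multi-word phrases are matched as a unit.
print(matches_term("failed: connection reset by peer", "connection reset"))  # True
```

Under these rules a substring search such as LIKE '%error%' would also match "errors", while matches_term would not, which is the main behavioral difference to keep in mind when migrating queries.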

For detailed usage, check the GreptimeDB documentation.

Full-Text Index Backend: Bloom and Tantivy

v0.14 introduces a new Bloom-based full-text index backend.

Now users have two backend options to choose from depending on their use case:

1. Bloom Backend

  • Best for: General-purpose log search
  • Highlights:
    • Efficient filtering using Bloom filters
    • Low storage overhead
    • Stable performance across query patterns
  • Limitations: Slightly slower for highly selective queries
  • Example storage costs:
    • Raw data: ~10GB
    • Bloom index: ~1GB

2. Tantivy Backend

  • Best for: High-selectivity queries (e.g., TraceID)
  • Highlights:
    • Fast, accurate matching using inverted indexes
    • Outstanding performance for selective queries
  • Limitations:
    • High storage overhead (near raw data size)
    • Slower for low-selectivity queries
  • Example storage costs:
    • Raw data: ~10GB
    • Tantivy index: ~10GB
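To give some intuition for why a Bloom-based index can stay small, here is a toy Bloom filter in Python. It illustrates the general technique only and is not GreptimeDB's implementation: the class, the helper names, the tokenizer, and the idea of one filter per data segment are all assumptions made for this sketch. A Bloom filter stores a fixed-size bit array instead of the terms themselves, can answer "definitely absent" or "possibly present", and never produces false negatives.

```python
import hashlib
import re
from typing import Iterable, Iterator, List


class BloomFilter:
    """Toy Bloom filter: a fixed-size bit array plus a few hash functions."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, token: str) -> Iterator[int]:
        # Derive several bit positions by salting one hash with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{token}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, token: str) -> None:
        for pos in self._positions(token):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, token: str) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(token)
        )


def tokenize(line: str) -> List[str]:
    # Hypothetical tokenizer: split on anything that is not an ASCII letter or digit.
    return re.findall(r"[A-Za-z0-9]+", line)


def build_filter(lines: Iterable[str]) -> BloomFilter:
    """Build one filter covering every token in a segment of log lines."""
    bloom = BloomFilter()
    for line in lines:
        for token in tokenize(line):
            bloom.add(token)
    return bloom


def candidate_segments(filters: List[BloomFilter], term: str) -> List[int]:
    """Return indexes of segments that may contain `term`; the rest can be skipped."""
    return [i for i, bloom in enumerate(filters) if bloom.might_contain(term)]


segments = [["disk error on node1", "retry ok"], ["GET /health 200"]]
filters = [build_filter(lines) for lines in segments]
# Segment 0 is always among the candidates for "error" (no false negatives).
print(candidate_segments(filters, "error"))
```

Each filter here occupies 128 bytes no matter how many distinct tokens the segment holds, whereas an inverted index stores the terms and their positions. Segments that survive the filter still have to be scanned, which is consistent with the Bloom backend being somewhat slower on highly selective queries.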

Performance Comparison

The table below compares query performance across methods (Bloom = 1x baseline):

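| Query Type | High Selectivity (e.g., TraceID) | Low Selectivity (e.g., "HTTP") |
|------------|----------------------------------|--------------------------------|
| Bloom      | 1x (baseline)                    | 1x (baseline)                  |
| Tantivy    | 5x faster                        | 5x slower                      |
| LIKE Query | 50x slower                       | 1x                             |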

Key observations:

  • Tantivy excels at highly selective queries (like unique identifiers).
  • Bloom performs more consistently across different query types.
  • Bloom offers much lower storage costs (an index of ~1GB vs. ~10GB for ~10GB of raw data in our tests).

Learn more about full-text indexing in the official documentation.

Other Notable Updates

Experimental Support for Jaeger Query Protocol

You can now query Traces data in GreptimeDB using Grafana’s Jaeger plugin or the Jaeger UI.

⚠️ The Jaeger query protocol support is currently experimental and may change in future releases.

For setup instructions, check the GreptimeDB documentation.

Flow Engine: Dual-Engine Architecture

The Flow Engine now supports both Batching mode and Streaming mode, enabling more flexible data processing:

  • Batching Mode: Periodically processes data with a minimum flush interval.
  • Streaming Mode: Continuously processes incoming data for lower latency.

Typical use cases:

  • Batching: Suitable for large-scale aggregation, reporting, or scheduled jobs with smoother resource consumption.
  • Streaming: Ideal for real-time scenarios requiring low-latency data processing, though it demands more resources.
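The trade-off between the two modes can be sketched with a running sum in Python. This is a conceptual model only, not how the Flow Engine is implemented: the class names, the explicit `now` timestamps, and the `updates` counter (a stand-in for how often results are recomputed) are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StreamingSum:
    """Updates the result on every incoming value (low latency, more work)."""

    total: int = 0
    updates: int = 0

    def ingest(self, value: int, now: float) -> None:
        self.total += value
        self.updates += 1


@dataclass
class BatchingSum:
    """Buffers values and folds them in at most once per flush interval."""

    min_flush_interval: float
    total: int = 0
    updates: int = 0
    _pending: List[int] = field(default_factory=list)
    _last_flush: float = 0.0

    def ingest(self, value: int, now: float) -> None:
        self._pending.append(value)
        if now - self._last_flush >= self.min_flush_interval:
            self.flush(now)

    def flush(self, now: float) -> None:
        if self._pending:
            self.total += sum(self._pending)
            self.updates += 1
            self._pending.clear()
        self._last_flush = now


# (value, arrival time in seconds); the timestamps are made up for the demo.
events = [(1, 0.5), (2, 1.0), (3, 4.0), (4, 5.5), (5, 7.0)]
streaming = StreamingSum()
batching = BatchingSum(min_flush_interval=5.0)
for value, now in events:
    streaming.ingest(value, now)
    batching.ingest(value, now)

print(streaming.total, streaming.updates)  # 15 5
print(batching.total, batching.updates)    # 10 1 (the value 5 is still buffered)
batching.flush(now=10.5)
print(batching.total, batching.updates)    # 15 2
```

Both modes converge on the same result. The streaming version is always current but does work on every event, while the batching version does far fewer updates at the cost of results that can lag by up to one flush interval.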

The system currently auto-selects the appropriate mode based on the task.

In the future, we plan to allow users to configure the mode manually.

Read more about Flow Engine in the user guide.

Full Support for OTel Traces

GreptimeDB now fully supports direct ingestion of OpenTelemetry Traces using the native OTel protocol. Built-in table schemas make it easy to query, analyze, and visualize Traces data.

Learn how to ingest and work with Traces in the GreptimeDB documentation.

Upgrade Notes

v0.14 is fully compatible with v0.13 for both data and configurations.

If you’re on v0.13, you can upgrade directly.

For other versions, please follow our upgrade guide.

What’s Next?

Following our 2025 GreptimeDB Roadmap, the next versions will focus on:

  • Bulk ingestion throughput optimization
  • Distributed reliability and stability enhancements

Stay tuned for even more improvements!


About Greptime

GreptimeDB is an open-source, cloud-native database purpose-built for real-time observability. Built in Rust and optimized for cloud-native environments, it provides unified storage and processing for metrics, logs, and traces, delivering sub-second insights from edge to cloud at any scale.

  • GreptimeDB OSS – The open-source database for small to medium-scale observability and IoT use cases, ideal for personal projects or dev/test environments.

  • GreptimeDB Enterprise – A robust observability database with enhanced security, high availability, and enterprise-grade support.

  • GreptimeCloud – A fully managed, serverless DBaaS with elastic scaling and zero operational overhead. Built for teams that need speed, flexibility, and ease of use out of the box.

🚀 We’re open to contributors—get started with issues labeled good first issue and connect with our community.
