10x Performance Improvement in Certain Scenarios! Support Jaeger Query Protocol and Unify Index Syntax | Greptime Biweekly Report

Together with our global community of contributors, GreptimeDB continues to evolve and flourish as a growing open-source project. We are grateful to each and every one of you.

Below are the highlights among recent commits:

Support for Jaeger query protocol
Unified syntax for creating index, with support for using ALTER to modify skipping index
Optimized table creation speed in metrics scenarios
Improved write and flush performance

Contributors

Over the past four weeks, our community has been very active, with a total of 140 PRs merged. 19 PRs from 3 individual contributors were merged successfully, and many more are pending.

Congrats to our most active contributors over the past 4 weeks:

@ozewr (db#5484)
@Stephan3555 (db#5441)
@yihong0618 (db#5458 db#5453 db#5442 db#5433 db#5422 db#5398 db#5529 db#5523 db#5518 db#5507 db#5497 db#5491 db#5473 db#5470 db#5468 db#5467 demo-scene#72)

👏 Welcome @ozewr and @yihong0618 to the community as new contributors, each with a successfully merged PR. More PRs from other individual contributors are waiting to be merged.

🎆 We are thrilled to welcome @JetSquirrel (Tian Deng) as GreptimeDB's first-ever Advocate! He will play a key role in driving GreptimeDB's technical advocacy, expanding our community, and building the developer ecosystem. Together with the community, he'll help propel the project's growth and widespread adoption!

🎉 A big THANK YOU to all our members and contributors! It is people like you who are making GreptimeDB a great product. Let's build an even greater community together.

Highlights of Recent PRs

db#5452 Support for Jaeger query protocol

GreptimeDB is now compatible with the following Jaeger v2 query APIs, served under the /v1/jaeger HTTP endpoint:

/api/services
/api/operations
/api/traces

Users can now write trace data using the OpenTelemetry protocol and then query and analyze it with the Jaeger protocol.
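
As a rough illustration of how the paths above combine with the /v1/jaeger endpoint, the following sketch only builds the request URLs and does not send them. The host and port (`localhost:4000`), the service name `my-service`, and the `service` and `limit` query parameters are assumptions borrowed from common Jaeger query usage; they are not confirmed by this report.

```python
from urllib.parse import urlencode

# Assumption: GreptimeDB's HTTP server is reachable at this address.
BASE_URL = "http://localhost:4000/v1/jaeger"


def jaeger_url(path, **params):
    """Build a URL for one of the Jaeger query paths listed above."""
    query = urlencode(params)
    return f"{BASE_URL}{path}" + (f"?{query}" if query else "")


# The three paths named in this report; the query parameters are assumed.
services_url = jaeger_url("/api/services")
operations_url = jaeger_url("/api/operations", service="my-service")
traces_url = jaeger_url("/api/traces", service="my-service", limit=20)

for url in (services_url, operations_url, traces_url):
    print(url)
```

Any HTTP client can then issue a GET request against these URLs, or a Jaeger-compatible frontend can be pointed at the same base address.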

db#5486 db#5538 Unified syntax for creating index, with support for using `ALTER` to modify skipping index

PR db#5486 unified the syntax for creating indexes in GreptimeDB, resolving the inconsistencies among the syntaxes for creating inverted, skipping, and full-text indexes. See issue db#5332 for more background information.

The unified index syntax allows using column constraints to create an index:

sql

Column Constraint: ... <INDEX_TYPE> INDEX [WITH (...)]

For example, the following statement creates full-text and inverted indexes through column constraints:

sql

CREATE TABLE IF NOT EXISTS system_metrics (
    host STRING,
    idc STRING FULLTEXT INDEX INVERTED INDEX,
    cpu_util DOUBLE,
    memory_util DOUBLE,
    disk_util DOUBLE,
    desc1 STRING,
    desc2 STRING FULLTEXT INDEX,
    desc3 STRING FULLTEXT INDEX,
    ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY(host, idc),
    TIME INDEX(ts)
);

Users can now use the ALTER command to set or remove a skipping index:

sql

ALTER TABLE table_name MODIFY COLUMN column_name SET SKIPPING INDEX WITH(granularity = 1024, type = 'BLOOM');
ALTER TABLE table_name MODIFY COLUMN column_name UNSET SKIPPING INDEX;
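
For applications that issue these statements programmatically, the two forms above can be rendered from a table and column name. This is a minimal sketch: the helper names are hypothetical, the identifier check is a deliberately strict assumption rather than GreptimeDB's actual identifier rules, and the default option values simply mirror the example above.

```python
import re

# Assumption: only plain identifiers are accepted, to avoid building
# malformed or unsafe SQL from untrusted input.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _check(name):
    if not _IDENT.match(name):
        raise ValueError(f"unsupported identifier: {name!r}")
    return name


def set_skipping_index(table, column, granularity=1024, index_type="BLOOM"):
    """Render the SET SKIPPING INDEX statement shown above."""
    return (
        f"ALTER TABLE {_check(table)} MODIFY COLUMN {_check(column)} "
        f"SET SKIPPING INDEX WITH(granularity = {int(granularity)}, "
        f"type = '{_check(index_type)}');"
    )


def unset_skipping_index(table, column):
    """Render the UNSET SKIPPING INDEX statement shown above."""
    return (
        f"ALTER TABLE {_check(table)} MODIFY COLUMN {_check(column)} "
        "UNSET SKIPPING INDEX;"
    )


set_stmt = set_skipping_index("system_metrics", "desc1")
unset_stmt = unset_skipping_index("system_metrics", "desc1")
print(set_stmt)
print(unset_stmt)
```

The resulting strings can be sent through whichever SQL client the application already uses.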

Additionally, when modifying a full-text index using ALTER, users must specify the INDEX keyword:

sql

ALTER TABLE table_name MODIFY COLUMN column_name SET FULLTEXT INDEX WITH(analyzer = 'English', case_sensitive = 'true');

db#5503 db#5504 Optimized table creation speed in metrics scenarios

When ingesting data through the Prometheus Remote-Write protocol, GreptimeDB may need to automatically create a large number of tables. Previously, table creation could take several minutes, causing users to wait a long time before they could reliably write data. The optimizations in db#5503 and db#5504 have significantly improved table creation efficiency, reducing the time to create 3,000 tables (with metric-engine) from 8 minutes to just a few seconds.

db#5451 db#5456 db#5455 db#5460 Optimized write performance

The above PRs parallelize certain operations during the write process and optimize the memtable implementation, further enhancing write performance in a single region. Writing to a single region can now also take advantage of multiple CPU cores.

db#5518 Improved deduplication efficiency during `last_non_null` table flushing by 10 times

When using a table with merge_mode = last_non_null, if users write a large amount of duplicate data, the database may experience long delays in flushing that table's data to disk. The primary reason for this issue was the low deduplication efficiency of the original implementation.

After the optimizations in db#5518, the flushing speed in this scenario has improved by approximately 10 times!
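
To make the scenario concrete, the sketch below illustrates the deduplication semantics this mode implies, assuming that `last_non_null` keeps, for rows sharing the same primary key and timestamp, the most recently written non-null value of each field. It is a conceptual illustration only and not the implementation from db#5518; the column names are borrowed from the `system_metrics` example earlier in this report.

```python
def merge_last_non_null(rows, key_columns=("host", "ts")):
    """Merge rows (given in write order) that share the same key.

    For each field, a later non-null value overrides an earlier one,
    while a later null leaves the earlier value untouched.
    """
    merged = {}
    for row in rows:
        key = tuple(row[c] for c in key_columns)
        target = merged.setdefault(key, {c: row[c] for c in key_columns})
        for column, value in row.items():
            if column in key_columns:
                continue
            # Keep a null only when no value has been seen for this field yet.
            if value is not None or column not in target:
                target[column] = value
    return list(merged.values())


# Three writes to the same (host, ts) key, each carrying partial data.
writes = [
    {"host": "a", "ts": 1, "cpu_util": 0.5, "memory_util": None},
    {"host": "a", "ts": 1, "cpu_util": None, "memory_util": 0.7},
    {"host": "a", "ts": 1, "cpu_util": 0.9, "memory_util": None},
]
result = merge_last_non_null(writes)
print(result)
```

With many duplicate rows per key, the amount of merging work at flush time grows accordingly, which is why the efficiency of this step matters.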

Good First Issue

db#5296 Storing necessary fields instead of the entire `QueryContext` struct in `CreateFlowData`

Level: Medium
Keyword: Query, refactor

About Greptime

Greptime offers industry-leading time series database products and solutions to empower IoT and Observability scenarios, enabling enterprises to uncover valuable insights from their data with less time, complexity, and cost.

GreptimeDB is an open-source, high-performance time-series database offering unified storage and analysis for metrics, logs, and events. Try it out instantly with GreptimeCloud, a fully-managed DBaaS solution—no deployment needed!

The Edge-Cloud Integrated Solution combines multimodal edge databases with cloud-based GreptimeDB to optimize IoT edge scenarios, cutting costs while boosting data performance.

Star us on GitHub or join GreptimeDB Community on Slack to get connected.
