
September 11, 2024

GreptimeDB v0.9.3 Released! Introducing indexing for Remote WAL to accelerate replay | Greptime Biweekly Report

A recap of the past two weeks' progress and changes in GreptimeDB.

Summary

Together with our global community of contributors, GreptimeDB continues to evolve and flourish as a growing open-source project. We are grateful to each and every one of you.

Below are the highlights among recent commits:

  • Released GreptimeDB v0.9.3, with the following major bug fixes:

    • Fixed an issue where the last_value function could return null due to caching problems.
    • Fixed a bug that caused certain queries to crash in last_non_null merge mode.
    • Fixed the issue where some data would be lost when querying timestamp columns in append mode.
  • Added indexing for Remote WAL to reduce read amplification and improve replay speed.

For the past two weeks, our community has been super active with a total of 82 PRs merged. Of these, 5 PRs were contributed by 3 individual contributors, with many more pending to be merged.

Congrats on becoming our most active contributors in the past 2 weeks!

👏 Welcome @billy7x17 to the community as a new individual contributor, and congratulations on successfully merging their first PR!

New Contributors of GreptimeDB

A big THANK YOU to all our members and contributors! It is people like you who are making GreptimeDB a great product. Let's build an even greater community together.

Highlights of Recent PRs

db#4424 db#4461 db#4530 db#4565 Added indexing implementation for Remote WAL to mitigate read amplification and improve replay speed

Since the number of topics in Remote WAL is usually limited, in previous versions, if there were many regions in a single GreptimeDB cluster instance, multiple regions' WALs would map to the same topic. This caused severe read amplification during WAL replays, consuming a large amount of bandwidth and slowing down replay speed. The above PRs implemented data offset indexing on the client side, significantly improving replay speed.
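
To make the read amplification concrete, here is a minimal sketch in Python. It is a toy model, not GreptimeDB's implementation: the shared topic is a plain list, the list position stands in for the log offset, and the names `replay_by_scan`, `build_index`, and `replay_by_index` are hypothetical. It only illustrates why a per-region offset index lets a replay skip entries that belong to other regions.

```python
from collections import defaultdict

# Toy model of one shared WAL topic: each entry is (region_id, payload),
# and its position in the list plays the role of the log offset.
topic = [
    (1, "a0"), (2, "b0"), (1, "a1"), (3, "c0"), (2, "b1"), (1, "a2"),
]


def replay_by_scan(topic, region_id):
    """Replay without an index: read every entry, keep only this region's."""
    entries_read = 0
    payloads = []
    for rid, payload in topic:
        entries_read += 1
        if rid == region_id:
            payloads.append(payload)
    return payloads, entries_read


def build_index(topic):
    """Map each region to the offsets of its own entries in the topic."""
    index = defaultdict(list)
    for offset, (rid, _) in enumerate(topic):
        index[rid].append(offset)
    return index


def replay_by_index(topic, index, region_id):
    """Replay with an index: fetch only the offsets recorded for this region."""
    offsets = index.get(region_id, [])
    payloads = [topic[offset][1] for offset in offsets]
    return payloads, len(offsets)


index = build_index(topic)
scan_payloads, scan_reads = replay_by_scan(topic, 3)
idx_payloads, idx_reads = replay_by_index(topic, index, 3)
print(scan_payloads, scan_reads)  # ['c0'] 6
print(idx_payloads, idx_reads)    # ['c0'] 1
```

In this toy topic, region 3 owns one entry out of six, so the scan reads six entries to recover one while the indexed replay reads only that one. The gap widens as more regions share a topic.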

db#4382 db#4639 db#4642 db#4654 Enhanced database backup and recovery tools

We already had backup and recovery functionality to help back up and restore the database in case of issues. However, it was not user-friendly, as both the COPY DATABASE SQL and the Greptime CLI tool had to be used together. These PRs enhanced the CLI tool to support one-click database backup and one-click recovery.

db#4571 Refactored and optimized the construction of data streams in Log pipeline

The pipeline is used to process structured key-value data, including trimming and modifying data. This PR optimizes the construction of data streams in the pipeline and enhances the pipeline’s intermediate state types. In several key cases, performance improvements exceed 10%.
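
As a rough illustration of what "trimming and modifying" key-value data means, the following Python sketch chains small processors over a record. It is an assumption-laden toy, not the GreptimeDB pipeline API: `trim`, `uppercase`, and `run_pipeline` are hypothetical names, and the real pipeline is configured differently.

```python
def trim(field):
    """Return a processor that strips surrounding whitespace from one field."""
    def apply(record):
        out = dict(record)  # copy so the input record is left untouched
        if isinstance(out.get(field), str):
            out[field] = out[field].strip()
        return out
    return apply


def uppercase(field):
    """Return a processor that upper-cases one string field."""
    def apply(record):
        out = dict(record)
        if isinstance(out.get(field), str):
            out[field] = out[field].upper()
        return out
    return apply


def run_pipeline(processors, records):
    """Apply every processor, in order, to each record."""
    for record in records:
        for processor in processors:
            record = processor(record)
        yield record


raw = [{"level": "info", "message": "  disk ok  "}]
result = list(run_pipeline([trim("message"), uppercase("level")], raw))
print(result)  # [{'level': 'INFO', 'message': 'disk ok'}]
```

Each processor returns a new record, so the intermediate state between steps is explicit. That intermediate representation is the kind of thing the PR describes optimizing.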

Good First Issue

db#4480 Add debug level traces in Mito Engine

The current traces are not detailed enough for users to debug performance. More detailed spans could be added in the Mito Engine, for example to record the time taken to read data from S3.

  • Keywords: Mito Engine, Trace

  • Difficulty: Medium

db#3265 Add more tests for COPY FROM statements

COPY FROM loads data from an external source, so more tests are needed to ensure it works correctly.

  • Keywords: SQL, Unit Tests, Coverage

  • Difficulty: Simple


About Greptime

We help industries that generate large amounts of time-series data, such as Connected Vehicles (CV), IoT, and Observability, to efficiently uncover the hidden value of data in real-time.

Visit the latest version from any device to get started and get the most out of your data.

  • GreptimeDB, written in Rust, is a distributed, open-source, time-series database designed for scalability, efficiency, and powerful analytics.
  • Edge-Cloud Integrated TSDB is designed for the unique demands of edge storage and compute in IoT. It tackles the exponential growth of edge data by integrating a multimodal edge-side database with cloud-based GreptimeDB Enterprise. This combination reduces traffic, computing, and storage costs while enhancing data timeliness and business insights.
  • GreptimeCloud is a fully-managed cloud database-as-a-service (DBaaS) solution built on GreptimeDB. It efficiently supports applications in fields such as observability, IoT, and finance.

Star us on GitHub or join GreptimeDB Community on Slack to get connected. Also, you can go to our contribution page to find some interesting issues to start with.
