
June 5, 2024

Manual Trigger for Multi-Strategy Compaction and View Query Feature Now Available | Greptime Biweekly Report

A recap of the past two weeks' progress and changes in GreptimeDB.

Summary

Together with our global community of contributors, GreptimeDB continues to evolve and flourish as a growing open-source project. We are grateful to each and every one of you.

Below are the highlights among recent commits:

  • Manual Compaction Options: Added the ability to manually trigger different compaction strategies via SQL commands, providing more control over data organization.

  • View Query Support: Enabled the creation and querying of views, simplifying complex queries and improving data management.

  • Performance Boost for Scanner: Enhanced the UnorderedScan by enabling parallel row group processing, significantly improving scan efficiency.

Contributors

Over the past two weeks, our community has been very active, with a total of 49 PRs merged. 8 PRs from 6 individual contributors were merged, and many more are pending.

Congrats on becoming our most active contributors in the past 2 weeks:

👏 Welcome @confoc and @LYZJU2019 to the community as new individual contributors, and congratulations on successfully merging their first PRs. More PRs are waiting to be merged.

New Contributor of GreptimeDB

A big THANK YOU to all our members and contributors! It is people like you who are making GreptimeDB a great product. Let's build an even greater community together.

Highlights of Recent PRs

db#3988 Manual Table Compaction with Different Strategies

This PR introduces the capability to manually trigger different types of table compactions via SQL commands. The newly supported syntax is:

sql
SELECT COMPACT_TABLE(<table_name>, [<compact_type>], [<options>])

Currently supported compact_type options include:

  • regular: Triggers a standard compaction, similar to those initiated by flush operations.

  • strict_window: Splits SST files strictly by a specified time window.

The <options> are type-specific compaction settings. For example, with strict_window, the option specifies the compaction window in seconds.
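To make the idea of a strict time window concrete, here is a minimal Python sketch of how timestamps can be grouped into aligned, non-overlapping windows. It only illustrates the bucketing arithmetic; the function name `group_by_strict_window` and the sample timestamps are hypothetical, and this is not GreptimeDB's actual compaction code.

```python
from collections import defaultdict


def group_by_strict_window(timestamps_s, window_s):
    """Group Unix timestamps (in seconds) into aligned, non-overlapping windows.

    Hypothetical helper for illustration only, not part of GreptimeDB.
    """
    if window_s <= 0:
        raise ValueError("window_s must be positive")
    buckets = defaultdict(list)
    for ts in timestamps_s:
        # Align the timestamp down to the start of its window.
        window_start = ts - ts % window_s
        buckets[window_start].append(ts)
    return dict(sorted(buckets.items()))


# With a one-hour window (3600 seconds), each timestamp lands in exactly one bucket.
groups = group_by_strict_window([10, 3599, 3600, 7300], window_s=3600)
print(groups)
```

Under this scheme, data from different windows never shares a bucket, which is the property the strict_window description above refers to.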

db#3952 Enable Querying from Views

This PR introduces support for querying from views, greatly enhancing the simplicity and efficiency of working with complex queries.

With this update, users can now create and query views using intuitive SQL syntax. This feature encapsulates complex logic into reusable virtual tables, providing better data security, performance optimization, and abstraction of data complexity.

The newly supported syntax is:

  • Create View:
sql
CREATE [OR REPLACE] [IF NOT EXISTS] VIEW <view_name> AS <SELECT_statement>;
  • Query View:
sql
SELECT * FROM <view_name>;
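As a rough illustration of what a view does, the following Python sketch uses the standard-library `sqlite3` module as a stand-in database. It is not GreptimeDB, the table and view names (`cpu_metrics`, `host_avg`) are made up, and SQLite does not accept the `OR REPLACE` clause, so only the `IF NOT EXISTS` form is shown.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cpu_metrics (host TEXT, ts INTEGER, cpu REAL)")
conn.executemany(
    "INSERT INTO cpu_metrics VALUES (?, ?, ?)",
    [("a", 1, 0.25), ("a", 2, 0.75), ("b", 1, 0.5)],
)

# The view stores the query, not the data.
conn.execute(
    "CREATE VIEW IF NOT EXISTS host_avg AS "
    "SELECT host, AVG(cpu) AS avg_cpu FROM cpu_metrics GROUP BY host"
)
before = conn.execute("SELECT * FROM host_avg ORDER BY host").fetchall()

# New rows in the base table are visible through the view without redefining it.
conn.execute("INSERT INTO cpu_metrics VALUES ('b', 2, 1.0)")
after = conn.execute("SELECT * FROM host_avg ORDER BY host").fetchall()
print(before, after)
```

Because the view is evaluated at query time, callers can keep issuing the short `SELECT` while the aggregation logic lives in one place.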

db#3957 Adding TLS Support for gRPC Service

This PR enhances the security of the gRPC service by adding TLS (Transport Layer Security) support. The gRPC server TLS options are configured as follows:

toml
[grpc.tls]
## TLS mode.
mode = "enable"
## Certificate file path.
cert_path = "/path/to/certfile"
## Private key file path.
key_path = "/path/to/keyfile"
## Watch for Certificate and key file change and auto reload.
watch = false

db#3992 Enhancing Performance with Parallel Unordered Scanner

This PR significantly boosts the performance of UnorderedScan by reading Parquet row groups in parallel. Row groups and memtables are now processed concurrently, according to the specified parallelism settings, which improves scan efficiency and responsiveness.
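The general pattern of an unordered parallel scan can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the GreptimeDB implementation (which is written in Rust): `read_row_group` is a hypothetical stand-in for decoding one row group, and the `parallelism` parameter is an assumed name.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def read_row_group(row_group):
    """Hypothetical stand-in for decoding one Parquet row group into rows."""
    return [value * 2 for value in row_group]


def unordered_scan(row_groups, parallelism):
    """Yield each batch as soon as any worker finishes it, in no fixed order."""
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        futures = [pool.submit(read_row_group, rg) for rg in row_groups]
        for future in as_completed(futures):
            yield future.result()


batches = list(unordered_scan([[1, 2], [3], [4, 5, 6]], parallelism=2))
# Batch order may vary between runs, so sort before comparing results.
rows = sorted(row for batch in batches for row in batch)
print(rows)
```

Giving up ordering is what lets the scanner hand back whichever batch is ready first, so a slow row group does not block the others.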

db#4024 Introducing Round-Robin Selector In Metasrv

This PR introduces a RoundRobin selector that ensures fair and sequential selection of peers for load distribution across datanodes in Metasrv. By cycling through datanodes based on their node_id, it provides an additional method alongside existing LoadBased and LeaseBased selectors. This approach is useful for evenly distributing loads but it does not account for the actual load on each datanode. Use it with care!
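A round-robin selection over node IDs can be sketched in a few lines of Python. The class and method names here are hypothetical and the real Metasrv selector is implemented in Rust, so treat this as an illustration of the technique rather than its API.

```python
import itertools
import threading


class RoundRobinSelector:
    """Hypothetical sketch: cycle through datanodes ordered by node_id."""

    def __init__(self, node_ids):
        self._nodes = sorted(node_ids)
        self._counter = itertools.count()
        self._lock = threading.Lock()

    def select(self, n):
        """Return the next n node IDs, wrapping around at the end of the list."""
        if not self._nodes:
            return []
        with self._lock:
            return [
                self._nodes[next(self._counter) % len(self._nodes)]
                for _ in range(n)
            ]


selector = RoundRobinSelector([3, 1, 2])
first = selector.select(2)
second = selector.select(2)
print(first, second)
```

Note that the selector looks only at its position in the cycle and never at node load, which is why the caution above applies.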

db#4019 Enabling TCP Keepalive for HTTP Server

This PR introduces TCP keepalive for the HTTP server to address the issue of "dangling" HTTP connections observed in some customer environments. These connections, which remained in an established state long after the client-side pods were destroyed, consumed significant memory resources. By implementing TCP keepalive, idle connections are closed after one hour, mitigating issues caused by ungraceful shutdowns or network problems common in cloud environments. The keepalive option is hardcoded to maintain simplicity and avoid complicating the configuration file.
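For readers unfamiliar with the mechanism, here is a minimal Python sketch of enabling TCP keepalive on a socket. The one-hour idle value mirrors the figure mentioned above, but the helper name and the exact option set are assumptions, not what the PR configures. With keepalive, the kernel starts probing after the idle period and drops the connection only if the peer stops responding.

```python
import socket


def enable_keepalive(sock, idle_s=3600):
    """Turn on TCP keepalive; idle_s is an assumed one-hour idle threshold."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # TCP_KEEPIDLE is platform-specific (available on Linux); skip it elsewhere.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_s)
    return sock


sock = enable_keepalive(socket.socket(socket.AF_INET, socket.SOCK_STREAM))
keepalive_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
sock.close()
print(keepalive_on)
```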

Good First Issue

db#3884 Remove unnecessary traits and wrapper types from the query crate

The manifest doesn't have any checksum for data validation. We need a way to do the checksum validation for region manifests. A possible way is to save the checksum as the part of manifest file name, for example, 000000000001-{checksum}.json.

Most implementations of these traits and wrapper types simply forward requests to DataFusion. Since we are tightly coupled with DataFusion and have no plans to support another query engine, we can remove these types.

Keywords: Refactoring

Difficulty: Simple


About Greptime

We help industries that generate large amounts of time-series data, such as Connected Vehicles (CV), IoT, and Observability, to efficiently uncover the hidden value of data in real-time.

Visit the latest version to get started and get the most out of your data.

  • GreptimeDB, written in Rust, is a distributed, open-source time-series database designed for unlimited horizontal scaling, high performance, and integrated analytics. We provide GreptimeDB Enterprise for corporate users which supports more enterprise features and customized services. Contact us here for more information.

  • GreptimeCloud is a fully-managed cloud database-as-a-service (DBaaS) solution built on GreptimeDB. It efficiently supports applications in fields such as observability, IoT, and finance. The built-in observability solution, GreptimeAI, helps users comprehensively monitor the cost, performance, traffic, and security of LLM applications.

  • The Vehicle-Cloud Integrated TSDB is a finely tailored solution that aligns closely with the specific business scenarios of automotive companies, addressing the challenges posed by the exponential growth of vehicle data. The multimodal vehicle-side database, combined with the cloud-based GreptimeDB Enterprise, greatly reduces traffic, computing, and storage costs, and boosts data timeliness and business insight capabilities.

If anything above draws your attention, don't hesitate to star us on GitHub or join GreptimeDB Community on Slack. Also, you can go to our contribution page to find some interesting issues to start with.
