Biweekly
July 17, 2024

Full-Text Indexing with v0.9 Log Engine Improves Data Search Efficiency | Greptime Biweekly Report

A recap of the past two weeks of progress and changes in GreptimeDB.

Summary

Together with our global community of contributors, GreptimeDB continues to evolve and flourish as a growing open-source project. We are grateful to each and every one of you.

Below are the highlights among recent commits:

  • Support Full-text Index: Added full-text index to support searching keywords in log data.

  • View-related Statements: Added SQL statements to show, create, and drop views.

  • Optimize last_value Performance: Made last_value faster (from 7 seconds to about 0.5 seconds in a complex query) by applying new optimization rules.

  • Start flownode in Distributed Mode: Added support for starting a flownode, which enables continuous aggregation in distributed mode.

Attention

We're thrilled to announce the release of the new GreptimeDB version in the second half of 2024! This update features the Log Engine, a storage engine optimized for log storage and queries. It improves log data storage efficiency and offers powerful log data processing and querying capabilities.

Join our Virtual Meetup to learn more tech details and watch the Log Engine Demo.

📅 July 31st at 8 PM PDT

Register now to receive the Zoom invite: https://forms.gle/6warqF3RrQvLRfJz9

The GreptimeDB v0.9 Virtual Meetup

Contributors

Over the past two weeks, our community has been very active, with a total of 127 PRs merged. 8 PRs from 6 individual contributors were merged, and many more are pending.

Congrats on becoming our most active contributors in the past 2 weeks!

👏 Welcome @scintillavoy to the community as a new individual contributor, and congratulations on successfully merging their first PR! More PRs are waiting to be merged.

New Contributor of GreptimeDB

A big THANK YOU to all our members and contributors! It is people like you who are making GreptimeDB a great product. Let's build an even greater community together.

Highlights of Recent PRs

db#4310 Support Full-text Index

This PR introduces SQL syntax FULLTEXT and gRPC options to enable full-text indexing when creating a table. Example:

sql
CREATE TABLE log (
    ts TIMESTAMP TIME INDEX,
    msg TEXT FULLTEXT WITH (analyzer='English', case_sensitive='true')
);

This will create a full-text index on the msg column with a case-sensitive English analyzer. Then, you can use the following SQL to search in the msg column:

sql
SELECT * FROM log WHERE MATCHES(msg, 'error OR fail');
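
To make the idea behind the query concrete, here is a minimal Python sketch of a toy inverted index answering an OR query such as 'error OR fail'. It is not GreptimeDB's implementation: the `tokenize`, `build_index`, and `match_any` helpers and the sample log lines are hypothetical, and a real English analyzer does more than split on non-word characters.

```python
import re
from collections import defaultdict


def tokenize(text, case_sensitive=True):
    """Split text into word tokens; a simplified stand-in for a real analyzer."""
    tokens = re.findall(r"[A-Za-z0-9_]+", text)
    return tokens if case_sensitive else [t.lower() for t in tokens]


def build_index(rows, case_sensitive=True):
    """Map each token to the set of row ids whose text contains it."""
    index = defaultdict(set)
    for row_id, msg in enumerate(rows):
        for token in tokenize(msg, case_sensitive):
            index[token].add(row_id)
    return index


def match_any(index, terms):
    """Return the sorted row ids that contain at least one of the terms (OR)."""
    hits = set()
    for term in terms:
        hits |= index.get(term, set())
    return sorted(hits)


# Hypothetical log messages, used only for illustration.
logs = [
    "disk error on node-1",
    "request completed",
    "health check fail",
    "Error while parsing config",
]

index = build_index(logs, case_sensitive=True)
matched = match_any(index, ["error", "fail"])
print(matched)
```

In this sketch, a case-sensitive index does not match the capitalized "Error" in the last message. That is the kind of difference the case_sensitive option controls.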

View-related Statements

These PRs add SQL statements for working with views: SHOW CREATE VIEW, SHOW VIEWS, and DROP VIEW.

For example:

sql
public=> CREATE VIEW v1 as SELECT * FROM numbers LIMIT 10;
OK 0
public=> SHOW CREATE VIEW v1;
 View |                   Create View                    
------+--------------------------------------------------
 v1   | CREATE VIEW v1 AS SELECT * FROM numbers LIMIT 10
(1 row)

public=> SHOW VIEWS;
 Views 
-------
 v1
(1 row)

public=> DROP VIEW v1;
OK 0

db#4357 db#4369 Optimize last_value Performance

These PRs implement numerous optimizations, including adding optimization rules and caching the last value of a row, making last_value considerably faster. For example, for the following query, the execution time is reduced from 7 seconds to around 0.5 seconds:

sql
SELECT
    last_value(hostname ORDER BY ts),
    last_value(region ORDER BY ts),
    last_value(datacenter ORDER BY ts),
    last_value(rack ORDER BY ts),
    last_value(os ORDER BY ts),
    last_value(arch ORDER BY ts),
    last_value(team ORDER BY ts),
    last_value(service ORDER BY ts),
    last_value(service_version ORDER BY ts),
    last_value(service_environment ORDER BY ts),
    last_value(usage_user ORDER BY ts),
    last_value(usage_system ORDER BY ts),
    last_value(usage_idle ORDER BY ts),
    last_value(usage_nice ORDER BY ts),
    last_value(usage_iowait ORDER BY ts),
    last_value(usage_irq ORDER BY ts),
    last_value(usage_softirq ORDER BY ts),
    last_value(usage_steal ORDER BY ts),
    last_value(usage_guest ORDER BY ts),
    last_value(usage_guest_nice ORDER BY ts)
FROM cpu
GROUP BY hostname;
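
For readers less familiar with the aggregate, the following Python sketch shows what last_value(col ORDER BY ts) with GROUP BY hostname computes: for each host, the value from the row with the newest timestamp. The helper name `last_value_by_group` and the sample rows are hypothetical, and this is only an illustration of the semantics, not of how GreptimeDB executes or optimizes the query.

```python
def last_value_by_group(rows, group_key, value_key, order_key):
    """For each group, keep the value from the row with the largest order_key."""
    latest = {}  # group -> (order value, value)
    for row in rows:
        group = row[group_key]
        if group not in latest or row[order_key] >= latest[group][0]:
            latest[group] = (row[order_key], row[value_key])
    return {group: value for group, (_, value) in latest.items()}


# Hypothetical sample of the cpu table, deliberately out of timestamp order.
cpu = [
    {"hostname": "host_0", "ts": 1, "usage_user": 10.0},
    {"hostname": "host_1", "ts": 1, "usage_user": 55.0},
    {"hostname": "host_0", "ts": 3, "usage_user": 30.0},
    {"hostname": "host_0", "ts": 2, "usage_user": 20.0},
    {"hostname": "host_1", "ts": 2, "usage_user": 60.0},
]

result = last_value_by_group(cpu, "hostname", "usage_user", "ts")
print(result)
```

Because the answer depends only on the newest row of each group, remembering that row can avoid rescanning older data, which is the intuition behind caching the last value.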

db#4256 Start flownode in Distributed Mode

This PR allows you to start a flownode, enabling you to create continuous aggregation jobs in distributed mode. For example, you can use the following shell command to start a flownode:

shell
greptime flownode start --node-id=0 --rpc-addr=127.0.0.1:6800 --metasrv-addrs=127.0.0.1:3002

Then, you can use CREATE FLOW to create new continuous aggregation jobs, just like in standalone mode.

Good First Issue

db#4351 Fail to Parse Number 32768 in Default Value Position

In GreptimeDB, a faulty implementation of SMALLINT restricts its range to -32767 to 32767, whereas the correct range is -32768 to 32767. We can fix this by properly implementing default value support.

  • Keywords: Query, SQL

  • Difficulty: Easy
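
As a rough illustration of how this kind of off-by-one can arise, the Python sketch below contrasts two ways of validating a signed 16-bit literal. Both functions are hypothetical, and the first is only one plausible failure mode. Whether it matches the actual cause in GreptimeDB is an assumption, not something the issue confirms.

```python
SMALLINT_MIN = -(2 ** 15)   # -32768
SMALLINT_MAX = 2 ** 15 - 1  # 32767


def in_range(value):
    """Check whether value fits in a signed 16-bit integer."""
    return SMALLINT_MIN <= value <= SMALLINT_MAX


def parse_magnitude_first(literal):
    """Hypothetical faulty approach: range-check the magnitude, then apply the sign."""
    negative = literal.startswith("-")
    magnitude = int(literal.lstrip("-"))
    if not in_range(magnitude):
        raise ValueError(f"{literal} is out of range for SMALLINT")
    return -magnitude if negative else magnitude


def parse_signed(literal):
    """Range-check the signed value as a whole."""
    value = int(literal)
    if not in_range(value):
        raise ValueError(f"{literal} is out of range for SMALLINT")
    return value


print(parse_signed("-32768"))
try:
    parse_magnitude_first("-32768")
except ValueError as err:
    print("rejected:", err)
```

Checking the magnitude before applying the sign rejects -32768, because 32768 on its own does not fit, which effectively narrows the range to -32767 to 32767.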

db#4340 Improve Error Messages

The error messages in GreptimeDB have some issues:

a. Errors from DataFusion always appear as internal errors, which can be confusing and unhelpful for users.

b. The MySQL and PostgreSQL protocols consistently return an internal error type, which is also unhelpful.

c. The Timeout Error message is also very unclear.

  • Keywords: DataFusion, SQL

  • Difficulty: Medium

db#3799 Remove QueryRequest from GreptimeRequest

The GreptimeRequest enum still contains a QueryRequest. However, GreptimeRequest is no longer supposed to be used for queries, so the QueryRequest in the enum (but not the QueryRequest itself!) can be deleted, along with some related code.

  • Keywords: Protobuf

  • Difficulty: Easy
