Engineering • Aug 2, 2024

GreptimeDB vs. Grafana Mimir - First Official Benchmark for High-Volume Write Performance

We conducted a massive data-ingestion performance test using the Prometheus-Benchmark, a tool widely used in the observability industry, to verify the strong scalability of the GreptimeDB architecture. Compared to Grafana Mimir, which is also based on object storage, GreptimeDB demonstrated a 5x reduction in resource consumption.

In the industry-standard Prometheus-Benchmark test, GreptimeDB handled approximately 40 million points per second with a single cluster of 100 nodes, each with 8 cores and 16 GB of memory. At peak load, the datanodes sustained CPU usage at 38% and memory usage at 40%. The system managed a total of 610 million active time series, updating 6.15 million time series every ten minutes, maintaining stable write performance throughout the entire 1.5-hour test period.


The test results indicate that GreptimeDB's architecture can support extremely large-scale clusters while maintaining very low resource usage. In contrast, Grafana Mimir, which also utilizes object storage, requires more than five times the CPU and memory resources to handle the same data scale (based on calculations from Mimir's own test report).

Test Environment and Tools

Test Tools

The key parameters and explanations for the Prometheus-Benchmark stress-testing tool used in the test are as follows:

Key parameters of the Prometheus-Benchmark test tool

With the above configuration, the final theoretical data generation rate is approximately 41.6 million points per second.
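As a rough cross-check, the theoretical rate can be approximated from the series count and the scrape interval. The Python sketch below is our own back-of-envelope arithmetic, not part of the benchmark tooling; the 15-second scrape interval is an assumption (the exact value is set in the parameter table above), and the series count comes from the summary at the top of this post.

```python
# Back-of-envelope estimate of the ingestion rate (illustrative, not benchmark code).
# ASSUMPTION: a 15-second scrape interval; the actual interval is defined in the
# Prometheus-Benchmark parameters and may differ.
active_series = 610_000_000      # total active time series reported in the test
scrape_interval_s = 15           # assumed scrape interval in seconds

points_per_second = active_series / scrape_interval_s
print(f"~{points_per_second / 1e6:.1f} million points/s")  # ~40.7 million points/s
```

With these figures the estimate lands at roughly 40.7 million points per second, consistent with both the observed ~40 million points/s and the ~41.6 million points/s theoretical maximum.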

Test Environment

There are two clusters in the test environment: the Prometheus-Benchmark TestSuite and the GreptimeDB database cluster. Both clusters are deployed on AWS EKS. The GreptimeDB cluster uses image version v0.8.2. The deployment specifications and instance counts for the components involved are as follows:

Deployment specifications of the test environment components

The WAL component of the GreptimeDB cluster uses AWS EBS, while SST flush and compaction are both performed on S3. The local disks are GP3 volumes, and the S3 storage class is Standard, in the same region as the cluster. This test does not include scenarios involving node scheduling or migration. Data partitions are evenly distributed among the nodes using a Round-Robin Selector, with a total of 200 regions.
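For illustration, the sketch below shows the placement idea behind a round-robin selector: regions are assigned to datanodes in turn, so 200 regions spread evenly across 100 datanodes as two regions per node. This is a minimal Python sketch of the concept, not GreptimeDB's actual selector implementation.

```python
# Minimal sketch of round-robin region placement (concept only, not GreptimeDB code).
def round_robin_placement(num_regions: int, num_datanodes: int) -> dict[int, list[int]]:
    """Assign region i to datanode i % num_datanodes."""
    placement: dict[int, list[int]] = {node: [] for node in range(num_datanodes)}
    for region in range(num_regions):
        placement[region % num_datanodes].append(region)
    return placement

# 200 regions over 100 datanodes -> exactly 2 regions on every node.
placement = round_robin_placement(num_regions=200, num_datanodes=100)
assert all(len(regions) == 2 for regions in placement.values())
```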

Detailed Testing Data

Write Throughput

The test lasted approximately 1.5 hours. Monitoring data for the total write volume and the per-node write volume are shown below. Throughout the 1.5 hours, the cluster's write throughput maintained a stable rate of approximately 40 million points per second.

The write throughput of GreptimeDB
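To put the sustained rate in perspective, the quick calculation below (our own arithmetic based on the figures above, assuming a perfectly steady rate) estimates the total volume ingested and the average load per node.

```python
# Rough arithmetic from the reported figures (assumes a perfectly steady rate).
rate_points_per_s = 40_000_000      # sustained cluster-wide ingestion rate
duration_s = 1.5 * 3600             # ~1.5-hour test window
datanodes, frontends = 100, 50

total_points = rate_points_per_s * duration_s
print(f"total ingested: ~{total_points / 1e9:.0f} billion points")       # ~216 billion
print(f"per datanode:   ~{rate_points_per_s / datanodes:,.0f} points/s") # 400,000
print(f"per frontend:   ~{rate_points_per_s / frontends:,.0f} points/s") # 800,000
```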

Resource Utilization

Observation of the GreptimeDB cluster's resource utilization yields the following data (the percentages are reproduced in the short sketch after the list):

  • Datanode's total CPU usage: With a total of 100 nodes each having 8 cores (800 cores in total), the peak usage reached 305 cores, approximately 38%.
Datanode's Total CPU Usage
  • Frontend's total CPU usage: With a total of 50 nodes each having 8 cores (400 cores in total), the peak usage reached 250 cores, approximately 63%.
Frontend's Total CPU Usage Chart
  • Datanode's total memory usage: With 100 nodes each having 16 GiB of memory (1600 GiB in total), the peak usage reached 640 GiB, approximately 40%.
Datanode's Total Memory Usage Chart
  • Frontend's total memory usage: With 50 nodes each having 16 GiB of memory (800 GiB in total), the peak usage reached 110 GiB, approximately 14%.
Frontend's Total Memory Usage Chart
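
The percentages above are simply peak usage divided by total capacity; the short sketch below reproduces them using the numbers from the bullets, with nothing new added.

```python
# Peak usage / total capacity, copied from the bullets above.
utilization = {
    "datanode CPU (cores)":  (305, 100 * 8),
    "frontend CPU (cores)":  (250, 50 * 8),
    "datanode memory (GiB)": (640, 100 * 16),
    "frontend memory (GiB)": (110, 50 * 16),
}

for name, (used, total) in utilization.items():
    print(f"{name}: {used}/{total} = {used / total:.1%}")
# -> 38.1%, 62.5%, 40.0%, 13.8%, matching the ~38 / 63 / 40 / 14 percent figures above.
```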

Storage Write Throughput

The following figure shows the S3 write throughput statistics. The bandwidth usage was mostly below 512 MiB/s, with a total write volume of 2.8 TiB over the 1.5-hour period.

S3 Write Throughput Statistics

The WAL component sustained a write throughput of approximately 800 MiB/s. Comparing the two figures shows that GreptimeDB effectively compressed the data, achieving a compression ratio of 75% and thereby reducing the flush throughput. As a result, the S3 directory size at the end of the stress test was significantly less than 1 TiB.

WAL Write Size Chart

Conclusion

A 100-node cluster is far from the upper limit; this test verifies the strong horizontal scalability of GreptimeDB's architecture. Compared to Grafana Mimir, which is also based on object storage, GreptimeDB demonstrates a 5x reduction in resource consumption.

We will continue optimizing performance and conducting further tests to enable GreptimeDB to handle massive volumes of time-series data for reading, writing, and analysis. The version tested in this round was v0.8.2. Since then, GreptimeDB has released version v0.9.0, which introduces a log engine and full-text search capabilities, further enhancing performance and stability. This marks a significant step towards becoming a unified time-series database that integrates metrics, logs, and events. You're all welcome to test GreptimeDB, and all feedback is welcome!


About Greptime

We help industries that generate large amounts of time-series data, such as Connected Vehicles (CV), IoT, and Observability, to efficiently uncover the hidden value of data in real-time.

Visit the latest version of GreptimeDB from any device to get started and get the most out of your data.

  • GreptimeDB, written in Rust, is a distributed, open-source, time-series database designed for scalability, efficiency, and powerful analytics.
  • Edge-Cloud Integrated TSDB is designed for the unique demands of edge storage and compute in IoT. It tackles the exponential growth of edge data by integrating a multimodal edge-side database with cloud-based GreptimeDB Enterprise. This combination reduces traffic, computing, and storage costs while enhancing data timeliness and business insights.
  • GreptimeCloud is a fully-managed cloud database-as-a-service (DBaaS) solution built on GreptimeDB. It efficiently supports applications in fields such as observability, IoT, and finance.

Star us on GitHub or join GreptimeDB Community on Slack to get connected. Also, you can go to our contribution page to find some interesting issues to start with.

scalability
benchmark
