
Benchmark
March 10, 2025

67.5% Boost in Write Performance, 50% Lower Storage Cost: GreptimeDB v0.12 Log Performance Benchmark

GreptimeDB v0.12 brings a significant performance boost to log ingestion and query scenarios, achieving up to 67.5% higher write throughput while cutting CPU usage by up to 60% compared to v0.9. The latest version also requires only 50% of ClickHouse's persistent storage and 12.7% of Elasticsearch's. With enhanced write and query capabilities, GreptimeDB provides a cost-effective, high-performance alternative for log processing and analytics.

Key Findings

  • Compared to GreptimeDB v0.9, which first introduced log-related functionality, v0.12 delivers substantial improvements in read/write performance and resource efficiency, achieving up to 67.5% higher write throughput while cutting CPU usage to roughly 40% of v0.9's level.
  • In log ingestion scenarios, whether handling structured or unstructured data, GreptimeDB demonstrates exceptional write throughput, reaching 111% of ClickHouse’s performance and 470% of Elasticsearch’s. Furthermore, when leveraging cost-efficient object storage, there is no significant performance degradation.
  • For log queries, GreptimeDB remains competitive with ClickHouse and Elasticsearch, with each database excelling in different use cases. Thanks to multi-level caching optimizations, GreptimeDB sustains high query performance even when data is stored on object storage, achieving an optimal cost-to-performance balance.
  • With its columnar storage and advanced compression algorithms, GreptimeDB offers the best compression ratio among the three databases, with persistent storage requirements at only 50% of ClickHouse’s and 12.7% of Elasticsearch’s, effectively reducing long-term data retention costs.

Benchmark Scenarios

Test Data and Methodology

We used nginx access logs as the input data, with a typical log entry structured as follows:

plaintext
129.37.245.88 - meln1ks [01/Aug/2024:14:22:47 +0800] "PATCH /observability/metrics/production HTTP/1.0" 501 33085

The log data was generated and ingested using Vector, an open-source observability pipeline. The overall benchmarking process is illustrated below:

Figure 1: Test Flowchart

We tested two data models based on different storage and query needs:

  • Structured Data Model: Logs were parsed into individual fields, with each field stored in separate columns. Queries targeted specific columns for filtering and retrieval.
  • Unstructured Data Model: Except for the timestamp field, the entire log message was stored as a single text field with full-text indexing enabled. Keyword-based queries were used for searches.
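As a concrete illustration, here is a minimal Python sketch of how the sample log line maps into each model. The regex mirrors the Vector parsing configuration in the appendix; field handling is simplified.

```python
import re

# Regex mirroring the Vector `parse_regex` configuration in the appendix.
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) - (?P<user>\S+) \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<http_version>\S+)" '
    r'(?P<status>\d+) (?P<bytes>\d+)$'
)

line = ('129.37.245.88 - meln1ks [01/Aug/2024:14:22:47 +0800] '
        '"PATCH /observability/metrics/production HTTP/1.0" 501 33085')

# Structured model: every captured field becomes its own column.
structured = LOG_PATTERN.match(line).groupdict()
structured["status"] = int(structured["status"])
structured["bytes"] = int(structured["bytes"])

# Unstructured model: aside from the timestamp, the whole message is
# stored as a single text field and served by a full-text index.
unstructured = {"timestamp": structured["timestamp"], "message": line}
```

The structured model pays the parsing cost once at ingestion so queries can prune columns; the unstructured model defers that cost to query time via the full-text index.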

Hardware and Software Setup

Hardware Platform

| Component | Configuration |
|---|---|
| Server Model | AWS c5d.2xlarge (8 vCPU, 16 GiB memory) |
| Operating System | Ubuntu 24.04 LTS |

Software Versions

| Database | Version |
|---|---|
| GreptimeDB | v0.12 |
| ClickHouse | 24.9.1.219 |
| Elasticsearch | 8.15.0 |

Note: Detailed database configurations are provided in the appendix.

Read/Write Performance Benchmarking

Write Throughput

| Database | Structured Model Avg. TPS | Unstructured Model Avg. TPS |
|---|---|---|
| GreptimeDB (v0.9) | 127,226 | 95,238 |
| GreptimeDB (v0.12) | 185,535 | 159,502 |
| GreptimeDB (v0.12, S3) | 182,272 | 154,209 |
| ClickHouse | 166,667 | 136,612 |
| Elasticsearch | 39,401 | 28,058 |

Figure 2: Write throughput performance

Observations:

  • GreptimeDB v0.12 significantly improves write performance, achieving a 45.8% increase in structured data throughput and a 67.5% increase in unstructured data throughput compared to v0.9.
  • GreptimeDB outperforms ClickHouse in write throughput, while both significantly surpass Elasticsearch. For structured logs, GreptimeDB achieves 4.7x Elasticsearch’s throughput, while for unstructured logs, it reaches 5.7x.
  • When using AWS S3 for storage, GreptimeDB’s write throughput is reduced by only 1–2%, making it a cost-efficient solution without compromising performance.
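These percentages follow directly from the throughput table; a quick Python check of the arithmetic:

```python
# Average TPS figures from the write-throughput table above.
def gain(new, old):
    """Relative improvement of `new` over `old`, as a percentage."""
    return (new / old - 1) * 100

# v0.12 vs. v0.9
print(round(gain(185_535, 127_226), 1))  # 45.8 (structured)
print(round(gain(159_502, 95_238), 1))   # 67.5 (unstructured)

# v0.12 vs. Elasticsearch
print(round(185_535 / 39_401, 1))  # 4.7x (structured)
print(round(159_502 / 28_058, 1))  # 5.7x (unstructured)

# S3 penalty vs. local storage
print(round((1 - 182_272 / 185_535) * 100, 1))  # 1.8% (structured)
```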

Query Performance

Six different query types were evaluated to simulate real-world log analysis use cases:

  1. Count Query: Count total rows in the dataset.
  2. Keyword Matching: Filter logs based on user, method, endpoint, version, and status code.
  3. Time Range Query: Retrieve approximately half of the dataset (~50 million rows) within a time range.
  4. Mid-Time Range Query: Retrieve 1,000 rows within a middle-time window.
  5. Recent-Time Range Query: Retrieve 1,000 rows within the most recent minute.
  6. Keyword Matching + Time Range Query: Filter logs within a specific time range using keyword matching.

Note:

  • Both GreptimeDB and ClickHouse use SQL for queries. GreptimeDB supports MySQL client tools, whereas ClickHouse provides its own command-line client: ClickHouse CLI.
  • For Elasticsearch, we used the search REST API: Elasticsearch Search API.
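To make the keyword-matching scenario concrete, here is a minimal sketch of an Elasticsearch request body for the structured index. The index name and `.keyword` sub-fields come from the mapping in the appendix; the specific values filtered on are hypothetical, and the benchmark's actual query bodies are not reproduced in this post.

```python
import json

# Hypothetical keyword-matching query against the structured index,
# sent as the body of POST /vector-2024.08.19/_search.
# Each text field uses its `.keyword` sub-field for exact matching;
# `status` is mapped as a long, so it is matched directly.
query = {
    "query": {
        "bool": {
            "filter": [
                {"term": {"user.keyword": "meln1ks"}},
                {"term": {"method.keyword": "PATCH"}},
                {"term": {"path.keyword": "/observability/metrics/production"}},
                {"term": {"http_version.keyword": "HTTP/1.0"}},
                {"term": {"status": 501}},
            ]
        }
    },
    "size": 10,
}

body = json.dumps(query)  # request body for the Search API
```

On GreptimeDB or ClickHouse, the equivalent is a SQL `WHERE` clause over the same five columns.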

Structured Data Queries

| Query Type (ms) | GreptimeDB | GreptimeDB on S3 | ClickHouse | Elasticsearch |
|---|---|---|---|---|
| Count Query | 6 | 6 | 4 | 610 |
| Keyword Matching | 22.8 | 24.7 | 52 | 134 |
| Time Range | 512.3 | 653.9 | 413 | 16 |
| Middle Time Range | 18.6 | 15.7 | 56 | 32 |
| Recent Time Range | 16.1 | 11.8 | 133 | 25 |
| Keyword Matching + Range Query | 19 | 43.2 | 52 | 88 |

Figure 3: Structured data query time consumption

Findings:

  • GreptimeDB and ClickHouse exhibit comparable query performance for structured data.
  • For large time-range scans, GreptimeDB and ClickHouse experience noticeable latency increases but complete queries within a few hundred milliseconds.
  • Thanks to multi-level caching, GreptimeDB maintains stable performance even when data is stored on S3.

Unstructured Data

| Query Type (ms) | GreptimeDB | GreptimeDB on S3 | ClickHouse | Elasticsearch |
|---|---|---|---|---|
| Count Query | 6 | 6 | 8 | 9 |
| Keyword Matching | 2547 | 2478 | 2080 | 161 |
| Time Range | 1166 | 1330 | 572 | 10 |
| Middle Time Range | 22.4 | 32.2 | 51 | 26 |
| Recent Time Range | 13.5 | 13.8 | 606 | 22 |
| Keyword Matching + Range Query | 331.8 | 350.2 | 1610 | 122 |

Figure 4: Unstructured data query time consumption

Findings:

  • GreptimeDB v0.12 has improved performance in most unstructured data queries compared to v0.9.
  • Both GreptimeDB and ClickHouse experience performance degradation in unstructured data queries, falling behind Elasticsearch.

Resource Utilization and Compression Rate

Resource Utilization

With a write rate limited to 20,000 rows per second, we compared CPU and memory usage for GreptimeDB, ClickHouse, and Elasticsearch:

| Database | Structured CPU (%) | Structured Memory (MB) | Unstructured CPU (%) | Unstructured Memory (MB) |
|---|---|---|---|---|
| GreptimeDB (v0.9) | 33.24 | 337 | 16.79 | 462 |
| GreptimeDB (v0.12) | 13.2 | 408 | 10.29 | 624 |
| ClickHouse | 9.56 | 611 | 26.77 | 732 |
| Elasticsearch | 40.22 | 9883 | 47.54 | 9320 |

  • ClickHouse has the lowest CPU usage, while GreptimeDB is slightly higher but shows significant improvements over v0.9 (from 33.24% to 13.2% in structured data and from 16.79% to 10.29% in unstructured data).
  • GreptimeDB consumes the least memory, whereas Elasticsearch’s memory usage is an order of magnitude higher than both GreptimeDB and ClickHouse.
  • Both GreptimeDB and ClickHouse employ LSM Tree-like structures, resulting in significant memory fluctuations.

Compression Rate

We tested data compression by writing approximately 10GB of raw data and measuring the persistent storage size across different databases:

| Database | Structured Data Size (GB) | Compression Ratio | Unstructured Data Size (GB) | Compression Ratio |
|---|---|---|---|---|
| GreptimeDB | 1.3 | 13% | 3.3 | 33% |
| ClickHouse (before compression) | 7.6 | 26% | 15.5 | 51% |
| ClickHouse (after compression) | 2.6 | 26% | 5.1 | 51% |
| Elasticsearch (before compression) | 14.6 | 102% | 19 | 172% |
| Elasticsearch (after compression) | 10.2 | 102% | 17.2 | 172% |

Key takeaways:

  • GreptimeDB v0.12 maintains a strong compression advantage, storing structured data at just 13% of the original size and unstructured data at 33%.
  • Structured data models leverage columnar encoding and adaptive compression strategies, achieving significantly better compression than unstructured models.
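The ratios above can be verified directly from the storage-size table; a quick Python check:

```python
# Persistent storage sizes from the table above (GB), for ~10 GB of raw
# logs; ClickHouse and Elasticsearch figures are post-compression.
RAW_GB = 10.0
stored = {
    "greptimedb":    {"structured": 1.3,  "unstructured": 3.3},
    "clickhouse":    {"structured": 2.6,  "unstructured": 5.1},
    "elasticsearch": {"structured": 10.2, "unstructured": 17.2},
}

# Compression ratio = persisted size / raw size, as a percentage.
for db, sizes in stored.items():
    print(db, {m: round(gb / RAW_GB * 100) for m, gb in sizes.items()})

# GreptimeDB's footprint relative to the others (structured model):
print(round(1.3 / 2.6 * 100))      # 50   (% of ClickHouse's footprint)
print(round(1.3 / 10.2 * 100, 1))  # 12.7 (% of Elasticsearch's footprint)
```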

Note: ClickHouse and Elasticsearch continuously compress data in the background. The table above records both pre- and post-compression storage sizes.

Conclusions

GreptimeDB v0.12 brings significant improvements in both write and query performance, making it an efficient and cost-effective choice for log processing. With superior write throughput, robust storage efficiency, and competitive query performance, GreptimeDB stands as a compelling alternative to ClickHouse and Elasticsearch in observability and logging use cases.


Appendix

Software Configuration

  • GreptimeDB (Local Storage): Default settings were used.
  • GreptimeDB (S3 Storage): Configured as follows:
toml
[storage]
type = "S3"
bucket = "<bucket_name>"
root = "log_benchmark"
access_key_id = "<ACCESS_KEY>"
secret_access_key = "<SECRET_KEY>"
endpoint = "<S3_ENDPOINT>"
region = "<S3_REGION>"
cache_path = "<CACHE_PATH>"
cache_capacity = "20G"

[[region_engine]]
[region_engine.mito]
enable_experimental_write_cache = true
experimental_write_cache_size = "20G"

Vector Log Parsing Configuration

toml
[transforms.parse_logs]
type = "remap"
inputs = ["demo_logs"]
source = '''
. = parse_regex!(.message, r'^(?P<ip>\S+) - (?P<user>\S+) \[(?P<timestamp>[^\]]+)\] "(?P<method>\S+) (?P<path>\S+) (?P<http_version>\S+)" (?P<status>\d+) (?P<bytes>\d+)$')

# Convert timestamp to a standard format
.timestamp = parse_timestamp!(.timestamp, format: "%d/%b/%Y:%H:%M:%S %z")

# Convert status and bytes to integers
.status = to_int!(.status)
.bytes = to_int!(.bytes)
'''

Table Schema

Structured Data Model:

GreptimeDB
sql
-- Enables append mode and sets `user`, `path`, and `status` as tags (i.e., primary keys)
CREATE TABLE IF NOT EXISTS `test_table` (
    `bytes` Int64 NULL,
    `http_version` STRING NULL,
    `ip` STRING NULL,
    `method` STRING NULL,
    `path` STRING NULL,
    `status` SMALLINT UNSIGNED NULL,
    `user` STRING NULL,
    `timestamp` TIMESTAMP(3) NOT NULL,
    PRIMARY KEY (`user`, `path`, `status`),
    TIME INDEX (`timestamp`)
)
ENGINE=mito
WITH(
    append_mode = 'true'
);
ClickHouse
sql
-- Use the default MergeTree engine, defining the same sorting key.
CREATE TABLE IF NOT EXISTS test_table
(
    bytes UInt64 NOT NULL,
    http_version String NOT NULL,
    ip String NOT NULL,
    method String NOT NULL,
    path String NOT NULL,
    status UInt8 NOT NULL,
    user String NOT NULL,
    timestamp String NOT NULL
)
ENGINE = MergeTree()
ORDER BY (user, path, status);
Elasticsearch
json
{
  "vector-2024.08.19": {
    "mappings": {
      "properties": {
        "bytes": {
          "type": "long"
        },
        "http_version": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "ip": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "method": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "path": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "status": {
          "type": "long"
        },
        "timestamp": {
          "type": "date"
        },
        "user": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        }
      }
    }
  }
}

Unstructured Data Model

GreptimeDB
sql
-- The `message` column enables the FULLTEXT option, turning on full-text indexing.
CREATE TABLE IF NOT EXISTS `test_table` (
    `message` STRING NULL FULLTEXT WITH(analyzer = 'English', case_sensitive = 'false'),
    `timestamp` TIMESTAMP(3) NOT NULL,
    TIME INDEX (`timestamp`)
)
ENGINE=mito
WITH(
    append_mode = 'true'
);
ClickHouse
sql
SET allow_experimental_full_text_index = true;
CREATE TABLE IF NOT EXISTS test_table
(
    message String,
    timestamp String,
    INDEX inv_idx(message) TYPE full_text(0) GRANULARITY 1
)
ENGINE = MergeTree()
ORDER BY tuple();
Elasticsearch
json
{
  "vector-2024.08.19": {
    "mappings": {
      "properties": {
        "message": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "service": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "source_type": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "timestamp": {
          "type": "date"
        }
      }
    }
  }
}

About Greptime

Greptime offers industry-leading time series database products and solutions to empower IoT and Observability scenarios, enabling enterprises to uncover valuable insights from their data with less time, complexity, and cost.

GreptimeDB is an open-source, high-performance time-series database offering unified storage and analysis for metrics, logs, and events. Try it out instantly with GreptimeCloud, a fully-managed DBaaS solution—no deployment needed!

The Edge-Cloud Integrated Solution combines multimodal edge databases with cloud-based GreptimeDB to optimize IoT edge scenarios, cutting costs while boosting data performance.

Star us on GitHub or join GreptimeDB Community on Slack to get connected.
