Engineering
April 15, 2025

A Practical Guide to Observability Data Modeling with GreptimeDB

Observability hinges on metrics, logs, and traces. This article provides a hands-on guide to modeling observability data with GreptimeDB, covering key practices such as understanding column cardinality, primary key design, wide table schema modeling, index selection, deduplication, and partitioning strategies.

When we talk about observability data, we inevitably deal with the holy trinity: Metrics, Logs, and Traces. Much like how a doctor needs body temperature, blood reports, and CT scans to assess a patient's health, these three data types are essential for gaining a comprehensive view of system behavior.

In this article, we’ll walk through how to build an efficient and flexible "Medical Record System" for your infrastructure using GreptimeDB, a comprehensive database purpose-built for observability workloads.

🧩 Step 1: Understand the Data Before Modeling

Before diving into schema design, it’s crucial to understand two foundational concepts:

1. Column Cardinality: How 'Unique' is Your Data

Imagine you're managing a library:

  • Low-cardinality columns are like book genres—Sci-Fi, History, Art. These usually contain only hundreds or thousands of unique values. Examples include:
    • region
    • cluster_name
    • http_method
  • High-cardinality columns are more like ISBNs or user IDs—potentially millions or even billions of unique values, such as:
    • trace_id
    • user_id
    • request_id

In one e-commerce observability project, region had only 7 values (low cardinality), while user_id reached the billion-scale (high cardinality). This distinction may seem trivial, but it has massive implications for performance.
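
If you already have sample data loaded, a quick way to gauge cardinality is a distinct count over the candidate columns (a minimal sketch; the staging table raw_events and its columns are hypothetical):

sql
-- Count distinct values to classify columns as low- or high-cardinality
SELECT
  count(DISTINCT region)  AS region_cardinality,
  count(DISTINCT user_id) AS user_id_cardinality
FROM raw_events;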

2. Column Roles: Tag, Field, and Timestamp

Every column in GreptimeDB plays a specific role:

  • Tag: Used in the primary key; determines how data is deduplicated and sorted.
  • Field: Stores actual measurements or non-key values.
  • Timestamp: Marked with the TIME INDEX constraint; required to give the data its time-series context.

Example:

sql
CREATE TABLE http_logs (
  access_time TIMESTAMP TIME INDEX,
  application STRING,
  http_status INT,
  request_id STRING,
  PRIMARY KEY(application)
);

🔑 Step 2: Designing the Primary Key — Choosing the Right Identifier

Scenario 1: Log Collection Without a Primary Key

For write-heavy logging workloads, using no primary key with append-only mode can maximize performance:

sql
CREATE TABLE access_logs (
  timestamp TIMESTAMP TIME INDEX,
  service STRING,
  status_code INT,
  response_time DOUBLE,
  trace_id STRING
) with ('append_mode'='true');

This pattern is ideal for 'write once, never update' use cases and can improve write performance by over 30%.

Scenario 2: Prefer Low-Cardinality Columns

When updates or dimension-based queries are required, thoughtful key design is critical. Golden rules:

  1. ✅ Choose only low-cardinality columns (avoid trace_id, user_id).
  2. ✅ Keep the cardinality of the key combination under 100k.
  3. ✅ Use ≤ 5 key columns.
  4. ✅ Prefer strings and integers as keys; avoid floats and timestamps.

Example:

sql
CREATE TABLE service_metrics (
  datacenter STRING,
  service_name STRING,
  cpu_util DOUBLE,
  memory_util DOUBLE,
  ts TIMESTAMP,
  PRIMARY KEY(datacenter, service_name),
  TIME INDEX(ts)
);

🔍 Step 3: Accelerate Queries with the Right Indexes

GreptimeDB provides three types of indexes. Use them based on your query and data characteristics:

Inverted Index — Precise and Categorized Retrieval

Best for low-cardinality filtering:

sql
CREATE TABLE api_metrics (
  `endpoint` STRING,
  service_name STRING INVERTED INDEX,
  http_status INT INVERTED INDEX,
  latency DOUBLE,
  ts TIMESTAMP,
  PRIMARY KEY(service_name, `endpoint`),
  TIME INDEX(ts)
);

The inverted index supports filters such as =, <, >, IN, and BETWEEN.
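
For example, the inverted indexes on service_name and http_status make a query like the following fast (the filter values are hypothetical):

sql
-- Error breakdown for one service over the last 15 minutes
SELECT endpoint, avg(latency) AS avg_latency, count(*) AS requests
FROM api_metrics
WHERE service_name = 'checkout'
  AND http_status IN (500, 502, 503)
  AND ts >= now() - INTERVAL '15 minutes'
GROUP BY endpoint;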

Skipping Index — Efficient on High-Cardinality Data

Great for selective equality filters on massive datasets:

sql
CREATE TABLE user_events (
  user_id STRING SKIPPING INDEX,
  event_type STRING,
  device STRING,
  ts TIMESTAMP,
  TIME INDEX(ts)
) with ('append_mode'='true');

The skipping index is a very efficient option: it requires less storage and has minimal impact on write performance. However, it only supports equality queries.
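
For example, a point lookup on a single user only needs to read the data blocks that may contain that value (a sketch against the user_events table above; the user ID is hypothetical):

sql
-- Recent events for one specific user
SELECT event_type, device, ts
FROM user_events
WHERE user_id = 'user-1024'
  AND ts >= now() - INTERVAL '1 hour';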

Full-Text Index — Keyword Search Over Log Text

Best for searching free-form text such as log messages:

sql
CREATE TABLE error_logs (
  message STRING FULLTEXT INDEX WITH(analyzer = 'English'),
  `level` STRING INVERTED INDEX,
  ts TIMESTAMP,
  TIME INDEX(ts)
) with ('append_mode'='true');

The full-text index enables expressive text queries like:

sql
SELECT * FROM error_logs WHERE matches(message, 'connection timeout');

🧮 Step 4: Wide Table vs. Multi-Table Schema

Wide Table — A Supermarket of Metrics

Our best practice is to store related metrics in a single wide table, especially when they are collected together:

sql
CREATE TABLE node_metrics (
  host STRING,
  cpu_user DOUBLE,
  cpu_system DOUBLE,
  memory_used DOUBLE,
  disk_read_bytes DOUBLE,
  disk_write_bytes DOUBLE,
  net_in_bytes DOUBLE,
  net_out_bytes DOUBLE,
  ts TIMESTAMP,
  PRIMARY KEY(host),
  TIME INDEX(ts)
);

Benefits:

  • 🔒 Better compression (30–50% less storage);
  • 🔍 Simplified queries (no JOIN needed).
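
For example, CPU and memory can be aggregated together in a single scan, with no JOIN (a sketch against the node_metrics table above):

sql
-- Per-host CPU and memory over the last 5 minutes, from one table
SELECT
  host,
  avg(cpu_user + cpu_system) AS avg_cpu,
  avg(memory_used)           AS avg_memory
FROM node_metrics
WHERE ts >= now() - INTERVAL '5 minutes'
GROUP BY host;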

When to Use Multiple Tables

Split metrics into multiple tables only when:

  • Metrics have very different collection frequencies (e.g., per second vs. hourly).
  • Tag schema varies widely.
  • Access control requires separation.

🔄 Step 5: Merge Modes — Update or Deduplicate Data

GreptimeDB supports two deduplication modes:

last_row: Row-Level Deduplication

last_row is the default merge mode: when rows share the same primary key and timestamp, the most recent write replaces the entire earlier row.
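
A minimal sketch of the effect (the sensor_last_row table and values are hypothetical):

sql
CREATE TABLE sensor_last_row (
  sensor_id STRING,
  temperature DOUBLE,
  humidity DOUBLE,
  ts TIMESTAMP,
  PRIMARY KEY(sensor_id),
  TIME INDEX(ts)
);

-- Two writes with the same primary key and timestamp
INSERT INTO sensor_last_row VALUES ('s1', 21.0, 40.0, '2025-04-15 00:00:00');
INSERT INTO sensor_last_row (sensor_id, temperature, ts)
VALUES ('s1', 22.5, '2025-04-15 00:00:00');

-- With last_row, the second write wins as a whole row:
-- the surviving humidity is NULL, not 40.0.
SELECT * FROM sensor_last_row;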

last_non_null: Field-Level Updates

When you only need to update some fields of a row while keeping the others, last_non_null is the better choice: for each field, the most recent non-null value is kept.

sql
CREATE TABLE device_telemetry (
  device_id STRING,
  temperature DOUBLE,
  humidity DOUBLE,
  battery DOUBLE,
  ts TIMESTAMP,
  PRIMARY KEY(device_id),
  TIME INDEX(ts)
) with ('merge_mode'='last_non_null');

Update example: the insert below writes only temperature; with last_non_null, fields that are omitted (or written as NULL) keep their last non-null values for the same primary key and timestamp:

sql
INSERT INTO device_telemetry(device_id, temperature, ts)
VALUES ('device1', 25.5, now());

🌐 Step 6: Distributed Table Design — Scale to Petabyte Level

When data exceeds TB scale, distributed partitioning is essential:

sql
CREATE TABLE global_metrics (
  region STRING,
  datacenter STRING,
  host STRING,
  cpu DOUBLE,
  memory DOUBLE,
  ts TIMESTAMP,
  PRIMARY KEY(region, datacenter, host),
  TIME INDEX(ts)
) PARTITION ON COLUMNS (region) (
  region = 'region1',
  region = 'region2',
  region = 'region3'
);

Best practice: partition by columns with even distribution and query-aligned patterns—region, business unit, or app name work well.
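
Queries that filter on the partition column should then only need to touch the matching partition (the region value is hypothetical):

sql
-- Scans only the 'region2' partition
SELECT host, avg(cpu) AS avg_cpu
FROM global_metrics
WHERE region = 'region2'
  AND ts >= now() - INTERVAL '10 minutes'
GROUP BY host;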

🛠️ Case Study Analysis

Case 1: Large-Scale API Monitoring

For a system handling billions of API requests daily, we use:

sql
CREATE TABLE api_metrics (
  region STRING,
  service STRING,
  `endpoint` STRING,
  status_code INT INVERTED INDEX,
  p50_latency DOUBLE,
  p95_latency DOUBLE,
  p99_latency DOUBLE,
  error_count INT,
  success_count INT,
  ts TIMESTAMP,
  PRIMARY KEY(region, service, `endpoint`),
  TIME INDEX(ts)
) PARTITION ON COLUMNS (region) (
  region = 'us-east-1',
  region = 'eu-central-1',
  region = 'ap-southeast-2'
) with ('merge_mode'='last_non_null');

Advantages of this design:

  • Uses multi-level, low-cardinality primary keys, making it easy to filter by service or endpoint.
  • Stores related metrics in the same row, simplifying queries and improving compression.
  • Creates an index on status codes to support fast filtering of error requests.
  • Partitions by region to reduce cross-region query latency.

Case 2: IoT Device Monitoring Platform

For a platform that monitors hundreds of thousands of IoT devices, we adopted the following schema:

sql
CREATE TABLE device_metrics (
  device_type STRING,
  `location` STRING,
  device_id STRING SKIPPING INDEX,
  temperature DOUBLE,
  humidity DOUBLE,
  battery_level DOUBLE,
  signal_strength DOUBLE,
  alert_count INT,
  ts TIMESTAMP,
  PRIMARY KEY(device_type, `location`),
  TIME INDEX(ts)
) PARTITION ON COLUMNS (device_type) (
  device_type = 'sensor',
  device_type = 'actuator',
  device_type = 'controller'
);

Benefits:

  • Uses device_type and location as low-cardinality primary keys instead of the high-cardinality device_id.
  • Adds a skipping index on device_id to allow fast lookup of individual devices.
  • Stores related sensor metrics in the same row to improve query performance.
  • Partitions by device type for easier management of retention policies across device categories.

💡 Performance Tuning Tips

Based on our hands-on experience, here are eight practical tips to significantly improve performance:

  • Start simple: Begin with basic tables without primary keys to establish a performance baseline;
  • Avoid over-indexing: Indexes increase write overhead — only index frequently used query conditions;
  • Monitor table size: Consider partitioning when a single table exceeds 500GB;
  • Time partitioning is built-in: GreptimeDB automatically organizes data by time — no extra steps needed;
  • Watch for hot partitions: Ensure write and query loads are evenly distributed across partitions;
  • Set TTLs wisely: Use different data retention periods based on importance;
  • Capacity planning: Reserve at least 50% of system resources for queries after meeting write throughput;
  • Downsampling strategy: Use Flow tasks to downsample data (e.g., from per-second to per-minute granularity); see the sketch below.
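
As a minimal sketch of the downsampling tip (the flow name and the sink table node_metrics_1m are hypothetical; see the Flow documentation below for the exact options):

sql
-- Continuously aggregate node_metrics into per-minute averages
CREATE FLOW IF NOT EXISTS node_metrics_downsample
SINK TO node_metrics_1m
AS
SELECT
  host,
  avg(cpu_user)    AS cpu_user_avg,
  avg(memory_used) AS memory_used_avg,
  date_bin(INTERVAL '1 minute', ts) AS time_window
FROM node_metrics
GROUP BY host, time_window;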

👉 Flow Documentation

Conclusion

Designing for observability isn’t just about storing data—it’s about making it accessible, fast, and scalable. With the right schema and indexing strategies, GreptimeDB can serve as a powerful foundation for your metrics, logs, and trace workloads.

Author's note: This article is based on features available in GreptimeDB v0.13 and above. Some details may change as new versions are released.

You're welcome to check out the official Data Modeling Guide.


About Greptime

Greptime offers industry-leading time series database products and solutions to empower IoT and Observability scenarios, enabling enterprises to uncover valuable insights from their data with less time, complexity, and cost.

GreptimeDB is an open-source, high-performance time-series database offering unified storage and analysis for metrics, logs, and events. Try it out instantly with GreptimeCloud, a fully-managed DBaaS solution—no deployment needed!

The Edge-Cloud Integrated Solution combines multimodal edge databases with cloud-based GreptimeDB to optimize IoT edge scenarios, cutting costs while boosting data performance.

Star us on GitHub or join GreptimeDB Community on Slack to get connected.
