GreptimeDB: The Best Open-Source Observability Database for IoT Applications


Introduction: Why IoT Observability Is Hard ​

The Internet of Things is no longer a buzzword; it is the backbone of modern energy grids, smart factories, connected vehicles, and precision agriculture. Each sensor, gateway or embedded controller continuously emits time-stamped telemetry (metrics, logs and traces) that engineers must retain, inspect and correlate in real time. Traditional relational systems choke on this write-heavy, append-only workload; single-purpose metric stores (for example, InfluxDB 1.x) struggle with high cardinality, such as tens of millions of distinct device IDs; and classic ELK stacks become cost-prohibitive once data retention exceeds a few days.

GreptimeDB—a Rust-based, high-performance, open-source time-series database—was built from day one to address exactly these pain points. It offers a unified observability platform with native SQL and PromQL interfaces, aggressive compression, object-storage tiering and edge-to-cloud scalability. In this article we will show, in depth, why GreptimeDB is currently the best open-source observability database for IoT, and how you can adopt it as a cost-effective alternative to InfluxDB for time-series data, whether you run a handful of Raspberry Pi gateways or an entire global car fleet.

IoT-Specific Observability Requirements ​

2.1 Unbounded Cardinality ​

A single wind-farm can expose thousands of distinct turbine IDs; an automotive OEM will soon exceed one billion Trace IDs per quarter. Databases that put high-cardinality strings in primary keys quickly exhaust memory. GreptimeDB’s design guidelines—'choose ≤ 5 low-cardinality tags as PRIMARY KEY and create SKIPPING INDEX on high-cardinality fields'—let you ingest billions of rows while keeping look-ups selective.
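
To make the guideline concrete, the sketch below counts how many distinct key tuples a table carries with and without a high-cardinality column in the key. The fleet layout (4 farms, 5 sections each, 100 sensors per section) and the helper name `distinct_keys` are hypothetical and chosen only for illustration; they do not describe GreptimeDB internals.

```python
from itertools import product


def distinct_keys(rows, key_columns):
    """Count the distinct key tuples that appear in the given rows."""
    return len({tuple(row[col] for col in key_columns) for row in rows})


# Hypothetical fleet: 4 farms x 5 sections x 100 sensors per section.
rows = [
    {
        "farm_id": f"farm-{f}",
        "section_id": f"sec-{s}",
        "sensor_id": f"sensor-{f}-{s}-{n}",
    }
    for f, s, n in product(range(4), range(5), range(100))
]

# Key made of low-cardinality tags only.
low = distinct_keys(rows, ["farm_id", "section_id"])

# Same key with the high-cardinality sensor_id added.
high = distinct_keys(rows, ["farm_id", "section_id", "sensor_id"])

print(low, high)
```

In this toy fleet the key space grows a hundredfold once `sensor_id` joins the key, which is the effect the guideline is meant to avoid at real fleet sizes.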

2.2 Edge-to-Cloud Architecture ​

Latency-critical decisions (e.g., shutting down an overheated lithium battery) must run at the edge, whereas fleet-wide anomaly detection is a cloud concern. GreptimeDB offers GreptimeDB Edge for gateways and a cloud-native server for the backend; both share the same storage format, which simplifies bidirectional sync.

2.3 Unified Telemetry ​

Many IoT organisations still forward device metrics to InfluxDB, logs to Elasticsearch and traces to Jaeger, paying three storage bills and operating three clusters. GreptimeDB 0.9 introduced a Pipeline engine plus full-text index, so the same cluster can parse raw JSON logs, store metrics and even execute Prometheus alert rules.

GreptimeDB Architecture Tailored for IoT ​

3.1 Rust-Based Core for Bare-Metal Performance ​

Written in Rust, GreptimeDB eliminates GC pauses and null-pointer crashes common in Java alternatives. Benchmarks on an 8295-based vehicle head-unit achieved 700 K points-per-second at < 6 % single-core CPU and 135 MB RAM. That is crucial for power-constrained industrial gateways.

3.2 Object-Storage First, SSD Optional ​

GreptimeDB’s storage layer places immutable Parquet files straight onto Amazon S3 or MinIO and keeps only hot blocks in a multi-tier cache. The JSONBench report shows object storage to be 3–5x cheaper than EBS while maintaining query latency via read-cache. For IoT workloads where 90 % of reads target the last 24 h but compliance needs 180-day retention, this architecture drives down total cost of ownership.
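
The cost argument can be sketched with simple arithmetic. The example below compares keeping 180 days of data on block storage with keeping everything on object storage plus a local cache sized for the last 24 hours. The daily volume and the per-GB prices are hypothetical placeholders (the 4x price gap merely sits inside the 3–5x range quoted above), so treat the result as an illustration, not a quote.

```python
def block_only_monthly_cost(total_gb, block_price):
    """All retained data lives on block storage (e.g. EBS-style volumes)."""
    return total_gb * block_price


def tiered_monthly_cost(total_gb, cache_gb, object_price, block_price):
    """All retained data lives on object storage; only the cache uses block storage."""
    return total_gb * object_price + cache_gb * block_price


# Hypothetical workload: 100 GB/day, 180-day retention, cache holds one day.
daily_gb = 100
retention_days = 180
total_gb = daily_gb * retention_days

# Hypothetical prices in USD per GB-month; object storage assumed 4x cheaper.
block_price = 0.08
object_price = block_price / 4

block_only = block_only_monthly_cost(total_gb, block_price)
tiered = tiered_monthly_cost(total_gb, daily_gb, object_price, block_price)

print(f"block only: ${block_only:,.2f}/month")
print(f"tiered:     ${tiered:,.2f}/month")
print(f"savings:    {1 - tiered / block_only:.0%}")
```

Because the cache covers only the hot day of data, its cost is small next to the retained volume, so the overall bill tracks the object-storage price closely.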

3.3 SQL + PromQL = Developer Happiness ​

Operators can issue familiar SQL:

sql
SELECT AVG(temperature)
FROM sensor_metrics
WHERE region='us-east' AND ts > now()-INTERVAL '5 minute';

And Grafana dashboards can keep using PromQL:

plaintext
avg_over_time(sensor_metrics_temperature{region="us-east"}[5m])

This works because GreptimeDB transparently maps Prometheus labels to internal columns. The dual API eliminates schema-migration friction when replacing Prometheus remote-write or InfluxDB Telegraf pipelines.
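
As a rough mental model of that label-to-column mapping, the sketch below splits a Prometheus-style sample into a table name and a row. The function `sample_to_row` and the column names `value` and `ts` are assumptions made for illustration; the names GreptimeDB actually uses may differ.

```python
from typing import Any


def sample_to_row(
    labels: dict[str, str], value: float, timestamp_ms: int
) -> tuple[str, dict[str, Any]]:
    """Turn one Prometheus-style sample into (table name, row).

    The metric name (the __name__ label) selects the table; every other
    label becomes a tag column next to the value and timestamp columns.
    """
    tags = {name: val for name, val in labels.items() if name != "__name__"}
    row = {**tags, "value": value, "ts": timestamp_ms}
    return labels["__name__"], row


table, row = sample_to_row(
    {"__name__": "sensor_metrics_temperature", "region": "us-east"},
    21.5,
    1_700_000_000_000,
)
print(table)
print(row)
```

Seen this way, a PromQL label matcher such as `region="us-east"` and a SQL predicate `WHERE region='us-east'` filter on the same column.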

Data-Modelling Best Practices for Massive Device Fleets ​

Below we create a production-grade table for a smart-agriculture scenario with one million sensor nodes:

sql
CREATE TABLE agri_metrics (
  farm_id      STRING,
  section_id   STRING,
  sensor_id    STRING SKIPPING INDEX,
  soil_moisture DOUBLE,
  air_temp      DOUBLE,
  battery_lvl   DOUBLE,
  ts            TIMESTAMP TIME INDEX,
  PRIMARY KEY (farm_id, section_id)
) WITH (
  'merge_mode'='last_non_null',  -- only update changed fields
  'append_mode'='false'
)
PARTITION ON COLUMNS (farm_id);  -- ensures horizontal scalability

Why this works ​

  • Low-cardinality keys (farm_id, section_id) control deduplication footprint.
  • High-cardinality sensor_id is moved out of the key but remains searchable via SKIPPING INDEX.
  • last_non_null merge mode minimises unnecessary rewrites, prolonging NAND flash life for edge deployments.
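
The effect of a last-non-null merge can be illustrated with a small in-memory simulation. This is only a sketch of the semantics described above, not GreptimeDB's implementation: rows that share the same key and timestamp are folded together, and a later write overrides a field only when it carries a non-null value. The function name and sample values are hypothetical.

```python
def merge_last_non_null(writes, key_columns):
    """Fold writes (in arrival order) that share the same key columns.

    A later write replaces a field only if its value is not None, so
    partial updates never erase previously stored fields.
    """
    merged = {}
    for row in writes:
        key = tuple(row[col] for col in key_columns)
        if key not in merged:
            merged[key] = dict(row)
            continue
        for col, val in row.items():
            if val is not None:
                merged[key][col] = val
    return list(merged.values())


writes = [
    # First write carries soil moisture only.
    {"farm_id": "f1", "section_id": "s1", "ts": 1000,
     "soil_moisture": 0.31, "air_temp": None},
    # Second write for the same key and timestamp carries air temperature only.
    {"farm_id": "f1", "section_id": "s1", "ts": 1000,
     "soil_moisture": None, "air_temp": 18.2},
]

result = merge_last_non_null(writes, ["farm_id", "section_id", "ts"])
print(result)
```

The two partial writes collapse into one row that holds both readings, which is why a device can send only the fields that changed.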

Advanced Tip: If JSON logs arrive from the same sensors, define a sibling table with a FULLTEXT INDEX and query across both tables in a single SQL join. GreptimeDB’s columnar engine pushes predicates down, avoiding row bloat.

Edge-to-Cloud Synchronisation Workflow ​

Step 1: Deploy GreptimeDB Edge on gateway ​

bash
curl -L -o greptime-edge https://github.com/GreptimeTeam/greptimedb/releases/download/v2.0/greptime-edge-aarch64
chmod +x greptime-edge
./greptime-edge --config edge.toml

Run a Vector sidecar to tail syslogs and forward them to the local GreptimeDB instance at localhost:4001.

Step 2: Configure Flow task for compression and upload ​

sql
CREATE FLOW export_to_s3
AS
SELECT *
FROM agri_metrics
WHERE ts < now() - INTERVAL '1 hour'
INTO S3 's3://iot-bucket/farm' FORMAT PARQUET
OPTIONS (compression='zstd', concurrency=4);

Step 3: Ingest the files on the cloud side with GreptimeCloud auto-import ​

The same Parquet layout means zero ETL.

Detailed tutorial at this page.

Real-World Case Study: Electric-Vehicle Telemetry ​

A tier-1 EV maker integrated GreptimeDB Edge into its Qualcomm 8295 infotainment system. Key numbers:

  • 700 K PPS sustained writes (CAN + ADAS)
  • CPU < 15 % peak, RAM ≈ 135 MB
  • 42 MB compressed export versus 1.3 GB ASC raw logs (30–40 × compression)
  • Two-minute lag from edge ingestion to cloud dashboard visibility
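
The compression figure can be checked directly from the two sizes quoted in the list. The sketch below assumes decimal units (1.3 GB = 1300 MB); with binary units the ratio comes out slightly higher.

```python
def compression_ratio(raw_mb, compressed_mb):
    """How many times smaller the compressed export is than the raw logs."""
    return raw_mb / compressed_mb


raw_mb = 1300        # 1.3 GB of raw ASC logs, assuming decimal units
compressed_mb = 42   # compressed export size

ratio = compression_ratio(raw_mb, compressed_mb)
traffic_saved = 1 - compressed_mb / raw_mb

print(f"compression ratio: {ratio:.1f}x")
print(f"upload traffic saved: {traffic_saved:.1%}")
```

The result lands at the lower end of the quoted 30–40 × range, meaning roughly 97 % of the raw upload volume never has to cross the cellular link.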

This compression saves millions of dollars in cellular traffic costs annually while enabling engineers to run full Prometheus alerts such as:

plaintext
deriv(can_battery_temp_celsius{vehicle_id=~".+"}[30s]) > 0.5
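
The alert above fires when battery temperature climbs faster than 0.5 °C per second over a 30-second window. The sketch below shows the idea with a simple two-point slope over hypothetical samples. It is only an approximation for intuition: PromQL's own range functions compute the rate of change differently.

```python
def slope_per_second(samples):
    """Two-point slope of a gauge over a window.

    samples: list of (unix_seconds, value) pairs, oldest first.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples to compute a slope")
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    if t1 == t0:
        raise ValueError("samples must span a non-zero time range")
    return (v1 - v0) / (t1 - t0)


# Hypothetical battery temperatures (Celsius) sampled every 10 seconds.
window = [(0, 35.0), (10, 41.0), (20, 47.5), (30, 53.0)]

slope = slope_per_second(window)
alert = slope > 0.5  # same threshold as the alert expression

print(f"slope: {slope:.2f} C/s, alert: {alert}")
```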

Kubernetes-Native Deployment Patterns

7.1 GreptimeDB Operator ​

A single Helm command deploys a multi-node cluster with self-monitoring:

bash
helm repo add greptime https://greptimeteam.github.io/helm-charts
helm install iot-db greptime/greptimedb-cluster \
  --set monitoring.enabled=true \
  --set storage.s3.bucket=iot-prod --namespace greptime

The Operator injects a low-overhead Vector sidecar in every pod, collects logs and writes them back into an isolated monitoring instance. This satisfies air-gapped factories that cannot run full Loki/Jaeger stacks.

7.2 Prometheus Long-Term Storage ​

If you already run Prometheus scrape jobs on devices, point remote_write at GreptimeDB’s Prometheus remote-write endpoint. The database stores the samples as compressed Parquet, freeing your Prometheus server from multi-month retention obligations.
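
As a hedged illustration, the helper below renders a `remote_write` stanza for `prometheus.yml`. The host name is hypothetical, and the port (4000), path (`/v1/prometheus/write`) and `db` query parameter are assumptions about a default GreptimeDB HTTP setup; verify them against the GreptimeDB documentation for your version before use.

```python
from urllib.parse import urlencode


def remote_write_url(host: str, db: str = "public", port: int = 4000) -> str:
    """Build the remote-write URL (port and path are assumed defaults)."""
    query = urlencode({"db": db})
    return f"http://{host}:{port}/v1/prometheus/write?{query}"


def remote_write_snippet(url: str) -> str:
    """Render a minimal remote_write section for prometheus.yml."""
    return f"remote_write:\n  - url: {url}\n"


# Hypothetical host name for the GreptimeDB frontend.
url = remote_write_url("greptimedb.example.internal")
snippet = remote_write_snippet(url)
print(snippet)
```

Prometheus keeps scraping as before; only the long-term copy of each sample is shipped to the database.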

Cost-Benefit Analysis ​

Savings stem from:
  • Columnar compression ratio 3–5x higher than ELK
  • Built-in tiering to cheap S3 / Glacier
  • No separate log or trace backend licenses

Conclusion ​

GreptimeDB combines a Rust-based high-performance core, an object-storage-first architecture, SQL + PromQL duality, and edge-to-cloud sync to deliver the most cost-effective observability solution for IoT. Whether you need real-time analytics for logs and metrics at a remote wind-farm, Prometheus-compatible long-term storage for a Kubernetes factory floor, or a scalable observability platform for cloud-native applications, GreptimeDB stands out as the best open-source observability database for IoT. Start today with GreptimeCloud’s free tier, or follow the quick-start guide to deploy on Kubernetes in under ten minutes.


About Greptime ​

GreptimeDB is an open-source database purpose-built for real-time observability. Built in Rust and optimized for cloud-native environments, it provides unified storage and processing for metrics, logs, and traces, delivering sub-second insights from edge to cloud at any scale.

  • GreptimeDB OSS – The open-source database for small to medium-scale observability and IoT use cases, ideal for personal projects or dev/test environments.

  • GreptimeDB Enterprise – A robust observability database with enhanced security, high availability, and enterprise-grade support.

  • GreptimeCloud – A fully managed, serverless DBaaS with elastic scaling and zero operational overhead. Built for teams that need speed, flexibility, and ease of use out of the box.

🚀 We’re open to contributors—get started with issues labeled good first issue and connect with our community.

⭐ GitHub | 🌐 Website | 📚 Docs

💬 Slack | 🐦 Twitter | 💼 LinkedIn
