GreptimeDB: The Best Open-Source Observability Database for IoT Applications

Introduction: Why IoT Observability Is Hard
The Internet of Things is no longer a buzzword; it is the backbone of modern energy grids, smart factories, connected vehicles, and precision agriculture. Each sensor, gateway or embedded controller continuously emits time-stamped telemetry (metrics, logs and traces) that engineers must retain, inspect and correlate in real time. Traditional relational systems choke on this write-heavy, append-only workload; single-purpose metric stores (for example, InfluxDB 1.x) struggle with the high cardinality of tens of millions of distinct device IDs; and classic ELK stacks become cost-prohibitive once data retention exceeds a few days.
GreptimeDB, a Rust-based, high-performance, open-source time-series database, was built from day one to address exactly these pain points. It offers a unified observability platform with native SQL and PromQL interfaces, aggressive compression, object-storage tiering and edge-to-cloud scalability. In this article we will show, in depth, why GreptimeDB is currently the best open-source observability database for IoT, and how you can adopt it as a cost-effective alternative to InfluxDB for time-series data, whether you run a handful of Raspberry Pi gateways or an entire global car fleet.
IoT-Specific Observability Requirements
2.1 Unbounded Cardinality
A single wind farm can expose thousands of distinct turbine IDs; an automotive OEM will soon exceed one billion trace IDs per quarter. Databases that put high-cardinality strings in primary keys quickly exhaust memory. GreptimeDB's design guidelines ("choose ≤ 5 low-cardinality tags as PRIMARY KEY and create SKIPPING INDEX on high-cardinality fields") let you ingest billions of rows while keeping look-ups selective.
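To make the guideline concrete, the following minimal sketch counts how many distinct primary-key combinations a table has to track under two key layouts. The fleet shape (3 farms, 4 sections per farm, 500 sensors per section) and the helper name `distinct_keys` are illustrative assumptions, not GreptimeDB internals.

```python
from itertools import product

def distinct_keys(rows, key_columns):
    """Count the distinct key tuples that appear in `rows`."""
    return len({tuple(row[c] for c in key_columns) for row in rows})

# Hypothetical fleet: 3 farms x 4 sections x 500 sensors per section.
rows = [
    {
        "farm_id": f"farm-{f}",
        "section_id": f"sec-{s}",
        "sensor_id": f"sensor-{f}-{s}-{n}",
    }
    for f, s, n in product(range(3), range(4), range(500))
]

# Low-cardinality key: only farm and section.
print(distinct_keys(rows, ["farm_id", "section_id"]))               # 12
# Adding the high-cardinality sensor_id multiplies the key space.
print(distinct_keys(rows, ["farm_id", "section_id", "sensor_id"]))  # 6000
```

The key space grows with the number of sensors as soon as `sensor_id` joins the key, which is the growth the guideline tries to avoid.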
2.2 Edge-to-Cloud Architecture
Latency-critical decisions (e.g., shutting down an overheated lithium battery) must run at the edge, whereas fleet-wide anomaly detection is a cloud concern. GreptimeDB offers the GreptimeDB Edge solution and a cloud-native server that share the same storage format, which simplifies bidirectional sync.
2.3 Unified Telemetry
Many IoT organisations still forward device metrics to InfluxDB, logs to Elasticsearch and traces to Jaeger, paying three storage bills and operating three clusters. GreptimeDB 0.9 introduced a Pipeline engine plus full-text index, so the same cluster can parse raw JSON logs, store metrics and even execute Prometheus alert rules.
GreptimeDB Architecture Tailored for IoT
3.1 Rust-Based Core for Bare-Metal Performance
Written in Rust, GreptimeDB eliminates GC pauses and null-pointer crashes common in Java alternatives. Benchmarks on an 8295-based vehicle head-unit achieved 700 K points-per-second at < 6 % single-core CPU and 135 MB RAM. That is crucial for power-constrained industrial gateways.
3.2 Object-Storage First, SSD Optional
GreptimeDB's storage layer places immutable Parquet files straight onto Amazon S3 or MinIO and keeps only hot blocks in a multi-tier cache. The JSONBench report shows object storage to be 3-5x cheaper than EBS while maintaining query latency via read-cache. For IoT workloads where 90 % of reads target the last 24 h but compliance needs 180-day retention, this architecture drives down total cost of ownership.
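The cost argument can be sketched with a little arithmetic. In the example below, the ingest volume and both per-GB prices are hypothetical placeholders (chosen so that object storage is 4x cheaper, inside the 3-5x range quoted above); only the 180-day retention and the 24-hour hot window come from the text.

```python
def monthly_cost(total_gb, cache_gb, object_price, ssd_price):
    """Object storage holds every file; SSD holds only the hot cache."""
    return total_gb * object_price + cache_gb * ssd_price

DAILY_GB = 10          # hypothetical ingest volume per day
RETENTION_DAYS = 180   # compliance window from the text
HOT_DAYS = 1           # most reads target the last 24 h
SSD_PRICE = 0.10       # hypothetical $/GB-month for block storage
OBJECT_PRICE = 0.025   # hypothetical $/GB-month for object storage

total_gb = DAILY_GB * RETENTION_DAYS

# Baseline: keep the full retention window on block storage.
ssd_only = total_gb * SSD_PRICE
# Tiered: everything in object storage plus one day of SSD cache.
tiered = monthly_cost(total_gb, DAILY_GB * HOT_DAYS, OBJECT_PRICE, SSD_PRICE)

print(f"SSD only: ${ssd_only:.2f}/month")  # SSD only: $180.00/month
print(f"Tiered:   ${tiered:.2f}/month")    # Tiered:   $46.00/month
```

Under these assumed numbers the tiered layout costs roughly a quarter of the SSD-only baseline; real savings depend on actual prices, request charges and cache hit rates, which this sketch ignores.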
3.3 SQL + PromQL = Developer Happiness
Operators can issue familiar SQL:
SELECT AVG(temperature)
FROM sensor_metrics
WHERE region='us-east' AND ts > now()-INTERVAL '5 minute';
And Grafana dashboards can keep using PromQL:
avg_over_time(sensor_metrics_temperature{region="us-east"}[5m])
Both queries work against the same data because GreptimeDB transparently maps Prometheus labels to internal columns. This dual API eliminates schema-migration friction when replacing Prometheus remote-write or InfluxDB Telegraf pipelines.
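As a conceptual illustration of the label-to-column mapping, the sketch below flattens Prometheus-style samples into rows and then filters them the way the SQL query does. The function `sample_to_row` and the column names `value` and `ts` are hypothetical; this shows the idea, not GreptimeDB's actual implementation.

```python
from statistics import mean

def sample_to_row(metric, labels, value, timestamp_ms):
    """Flatten one sample: the metric names the table, each label becomes a column."""
    row = dict(labels)
    row["value"] = value       # hypothetical name for the value column
    row["ts"] = timestamp_ms   # hypothetical name for the time index column
    return metric, row

# Hypothetical samples as they might arrive via remote-write.
samples = [
    ("sensor_metrics_temperature", {"region": "us-east", "device": "a"}, 21.0, 1000),
    ("sensor_metrics_temperature", {"region": "us-east", "device": "b"}, 23.0, 2000),
    ("sensor_metrics_temperature", {"region": "eu-west", "device": "c"}, 30.0, 3000),
]

tables = {}
for metric, labels, value, ts in samples:
    table, row = sample_to_row(metric, labels, value, ts)
    tables.setdefault(table, []).append(row)

# Equivalent of: SELECT AVG(value) ... WHERE region = 'us-east'
us_east = [
    r["value"]
    for r in tables["sensor_metrics_temperature"]
    if r["region"] == "us-east"
]
print(mean(us_east))  # 22.0
```

Once labels are ordinary columns, a PromQL label matcher and a SQL `WHERE` clause are two spellings of the same filter.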
Data-Modelling Best Practices for Massive Device Fleets
Below we create a production-grade table for a smart-agriculture scenario with one million sensor nodes:
CREATE TABLE agri_metrics (
farm_id STRING,
section_id STRING,
sensor_id STRING SKIPPING INDEX,
soil_moisture DOUBLE,
air_temp DOUBLE,
battery_lvl DOUBLE,
ts TIMESTAMP TIME INDEX,
PRIMARY KEY (farm_id, section_id)
) WITH (
'merge_mode'='last_non_null', -- only update changed fields
'append_mode'='false'
)
PARTITION ON COLUMNS (farm_id); -- ensures horizontal scalability
Why this works
- Low-cardinality keys (farm_id, section_id) control deduplication footprint.
- High-cardinality sensor_id is moved out of the key but remains searchable via SKIPPING INDEX.
- last_non_null merge mode minimises unnecessary rewrites, prolonging NAND flash life for edge deployments.
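The effect of the last_non_null merge mode can be illustrated in a few lines. The sketch below merges rows that share the same primary key and timestamp, keeping the most recent non-null value for each field. It models the semantics only; the function name `merge_last_non_null` and the sample writes are illustrative assumptions, not the storage engine's code.

```python
def merge_last_non_null(rows, key_columns):
    """Merge rows sharing a key; each field keeps its latest non-None value.

    `rows` is assumed to be in write order (oldest first).
    """
    merged = {}
    for row in rows:
        key = tuple(row[c] for c in key_columns)
        current = merged.setdefault(key, {})
        for column, value in row.items():
            # A None never overwrites a value that was already stored.
            if value is not None or column not in current:
                current[column] = value
    return list(merged.values())

# Three partial writes for the same farm, section and timestamp.
writes = [
    {"farm_id": "f1", "section_id": "s1", "ts": 1000,
     "soil_moisture": 0.31, "air_temp": None, "battery_lvl": None},
    {"farm_id": "f1", "section_id": "s1", "ts": 1000,
     "soil_moisture": None, "air_temp": 18.5, "battery_lvl": None},
    {"farm_id": "f1", "section_id": "s1", "ts": 1000,
     "soil_moisture": 0.33, "air_temp": None, "battery_lvl": None},
]

merged = merge_last_non_null(writes, ("farm_id", "section_id", "ts"))
print(merged)
```

The three writes collapse into one row with `soil_moisture` 0.33 and `air_temp` 18.5, so a device only needs to send the fields that changed.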
Advanced Tip: If JSON logs arrive from the same sensors, define a sibling table with a FULLTEXT INDEX and query across both tables in a single SQL join. GreptimeDB's columnar engine will push predicates down, avoiding row bloat.
Edge-to-Cloud Synchronisation Workflow
Step 1: Deploy GreptimeDB Edge on gateway
curl -L -o greptime-edge https://github.com/GreptimeTeam/greptimedb/releases/download/v2.0/greptime-edge-aarch64
chmod +x greptime-edge
./greptime-edge --config edge.toml
Run a Vector sidecar to tail syslogs and remote-write to localhost:4001.
Step 2: Configure Flow task for compression and upload
CREATE FLOW export_to_s3
AS
SELECT *
FROM agri_metrics
WHERE ts < now() - INTERVAL '1 hour'
INTO S3 's3://iot-bucket/farm' FORMAT PARQUET
OPTIONS (compression='zstd', concurrency=4);
Step 3: On the cloud side, ingest files with GreptimeCloud auto-import. The same Parquet layout means zero ETL.
Detailed tutorial at this page.
Real-World Case Study: Electric-Vehicle Telemetry
A tier-1 EV maker integrated GreptimeDB Edge into its Qualcomm 8295 infotainment system. Key numbers:
- 700 K PPS sustained writes (CAN + ADAS)
- CPU < 15 % peak, RAM ≈ 135 MB
- 42 MB compressed export versus 1.3 GB ASC raw logs (30-40× compression)
- Two-minute lag from edge ingestion to cloud dashboard visibility
This compression saves millions of dollars in cellular traffic costs annually while enabling engineers to run full Prometheus alerts such as:
deriv(can_battery_temp_celsius{vehicle_id=~".+"}[30s]) > 0.5
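The compression figure quoted in the case study can be checked with simple arithmetic. The sketch below assumes decimal units (1 GB = 1000 MB); the variable names are illustrative.

```python
RAW_MB = 1300        # 1.3 GB of raw ASC logs, assuming 1 GB = 1000 MB
COMPRESSED_MB = 42   # compressed export size from the case study

# How many times smaller the export is than the raw logs.
ratio = RAW_MB / COMPRESSED_MB
# Share of uplink traffic that no longer has to be sent.
saved = 1 - COMPRESSED_MB / RAW_MB

print(f"compression ratio: {ratio:.1f}x")  # compression ratio: 31.0x
print(f"traffic saved: {saved:.1%}")       # traffic saved: 96.8%
```

A ratio of about 31x sits at the lower end of the 30-40× range reported above.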
Kubernetes-Native Deployment Patterns
7.1 GreptimeDB Operator
Once the chart repository is added, a single Helm command deploys a multi-node cluster with self-monitoring:
helm repo add greptime https://greptimeteam.github.io/helm-charts
helm install iot-db greptime/greptimedb-cluster \
--set monitoring.enabled=true \
--set storage.s3.bucket=iot-prod --namespace greptime
The Operator injects a low-overhead Vector sidecar in every pod, collects logs and writes them back into an isolated monitoring instance. This satisfies air-gapped factories that cannot run full Loki/Jaeger stacks.
7.2 Prometheus Long-Term Storage
If you already run Prometheus scrape jobs on devices, point remote_write at GreptimeDB's Prometheus remote-write endpoint. The database stores the data as compressed Parquet, freeing your Prometheus server from multi-month retention obligations.
Cost-Benefit Analysis
Savings stem from:
- Columnar compression ratio 3-5x higher than ELK
- Built-in tiering to cheap S3 / Glacier
- No separate log or trace backend licenses
Conclusion
GreptimeDB combines a Rust-based high-performance core, an object-storage-first architecture, SQL + PromQL duality, and edge-to-cloud sync to deliver the most cost-effective observability solution for IoT. Whether you need real-time analytics for logs and metrics at a remote wind farm, Prometheus-compatible long-term storage for a Kubernetes factory floor, or a scalable observability platform for cloud-native applications, GreptimeDB stands out as the best open-source observability database for IoT. Start today with GreptimeCloud's free tier, or follow the quick-start guide to deploy on Kubernetes in under ten minutes.
About Greptime
GreptimeDB is an open-source, cloud-native database purpose-built for real-time observability. Built in Rust and optimized for cloud-native environments, it provides unified storage and processing for metrics, logs, and traces, delivering sub-second insights from edge to cloud at any scale.
GreptimeDB OSS: The open-source database for small to medium-scale observability and IoT use cases, ideal for personal projects or dev/test environments.
GreptimeDB Enterprise: A robust observability database with enhanced security, high availability, and enterprise-grade support.
GreptimeCloud: A fully managed, serverless DBaaS with elastic scaling and zero operational overhead. Built for teams that need speed, flexibility, and ease of use out of the box.
We're open to contributors: get started with issues labeled "good first issue" and connect with our community.