Introduction: Why IoT Observability Is Hard
The Internet of Things is no longer a buzzword; it is the backbone of modern energy grids, smart factories, connected vehicles, and precision agriculture. Each sensor, gateway, or embedded controller continuously emits time-stamped telemetry—metrics, logs, and traces—that engineers must retain, inspect, and correlate in real time. Traditional relational systems choke on this write-heavy, append-only workload; single-purpose metric stores (for example, InfluxDB 1.x) struggle with tens of millions of distinct device IDs (high cardinality); and classic ELK stacks become cost-prohibitive once data retention exceeds a few days.
GreptimeDB—a Rust-based, high-performance, open-source time-series database—was built from day one to address exactly these pain points. It offers a unified observability platform with native SQL and PromQL interfaces, aggressive compression, object-storage tiering and edge-to-cloud scalability. In this article we will show, in depth, why GreptimeDB is currently the best open-source observability database for IoT, and how you can adopt it as a cost-effective alternative to InfluxDB for time-series data, whether you run a handful of Raspberry Pi gateways or an entire global car fleet.
IoT-Specific Observability Requirements
2.1 Unbounded Cardinality
A single wind-farm can expose thousands of distinct turbine IDs; a connected-vehicle fleet generates millions of distinct trace IDs that quickly exhaust traditional indexing. Databases that put high-cardinality strings in primary keys typically suffer memory bloat and slow scans. GreptimeDB v1.0 ships flat format as the default SST layout — a redesigned memtable plus multi-series merge path that delivers 4× write throughput and up to 10× faster queries on millions of series without the memory pressure that breaks series-oriented stores. For schema design, a practical pattern is to limit PRIMARY KEY to a handful of low-cardinality tags and add SKIPPING INDEX on high-cardinality fields.
2.2 Edge-to-Cloud Architecture
Latency-critical decisions (e.g., shutting down an overheating lithium battery) must run at the edge, whereas fleet-wide anomaly detection is a cloud concern. GreptimeDB offers GreptimeDB Edge and a cloud-native server that share the same storage format, making bidirectional sync trivial.
2.3 Unified Telemetry
Many IoT organisations still forward device metrics to InfluxDB, logs to Elasticsearch and traces to Jaeger, paying three storage bills and operating three clusters. GreptimeDB stores metrics, logs, traces, and wide events in one engine, with a Pipeline engine plus full-text index for raw JSON parsing and native Prometheus alert rule execution.
GreptimeDB Architecture Tailored for IoT
3.1 Rust-Based Core for Bare-Metal Performance
Written in Rust, GreptimeDB eliminates the GC pauses and null-pointer crashes common in Java alternatives. Published benchmarks on a Qualcomm 8295-based vehicle head unit sustained 700K points per second at ~6% average single-core CPU (peak below 15%) and ~130 MB RAM. That headroom matters on power-constrained industrial gateways.
3.2 Object-Storage First, SSD Optional
GreptimeDB’s storage layer places immutable Parquet files straight onto Amazon S3 or MinIO and keeps only hot blocks in a multi-tier cache. The JSONBench report shows object storage to be 3–5× cheaper than EBS while maintaining query latency via the read cache. For IoT workloads where 90% of reads target the last 24 hours but compliance needs 180-day retention, this architecture drives down total cost of ownership.
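As a sketch of what this looks like in practice, the storage section of a standalone `config.toml` can point directly at S3. The bucket name, prefix, and region below are illustrative; check the current GreptimeDB configuration reference for the exact key names in your version:

```toml
[storage]
type = "S3"
bucket = "iot-telemetry"   # illustrative bucket name
root = "greptimedb"        # prefix inside the bucket
region = "us-east-1"
# Credentials may also come from the environment or instance profiles:
# access_key_id = "..."
# secret_access_key = "..."
```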
3.3 SQL + PromQL = Developer Happiness
Operators can issue familiar SQL:
```sql
SELECT AVG(temperature)
FROM sensor_metrics
WHERE region = 'us-east' AND ts > now() - INTERVAL '5 minute';
```

And Grafana dashboards can keep using PromQL, because GreptimeDB transparently maps Prometheus labels to internal columns:

```promql
avg_over_time(sensor_metrics_temperature{region="us-east"}[5m])
```

This dual API eliminates schema-migration friction when replacing Prometheus remote-write or InfluxDB Telegraf pipelines.
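To sketch what the SQL path looks like from application code: GreptimeDB exposes an HTTP endpoint for SQL queries (`/v1/sql` on the default port 4000). The helper below only builds the request; the host, database name, and port are assumptions based on a default standalone setup.

```python
# Sketch: issuing the SQL query above through GreptimeDB's HTTP API.
# Host, db name, and port 4000 assume a default standalone deployment.
from urllib.parse import urlencode

def build_sql_request(host: str, db: str, sql: str) -> tuple[str, dict]:
    """Return the URL and form body for a POST to the /v1/sql endpoint."""
    url = f"http://{host}:4000/v1/sql?{urlencode({'db': db})}"
    body = {"sql": sql}
    return url, body

url, body = build_sql_request(
    "localhost", "public",
    "SELECT AVG(temperature) FROM sensor_metrics "
    "WHERE region='us-east' AND ts > now() - INTERVAL '5 minute'",
)
# Send with any HTTP client, e.g. requests.post(url, data=body)
```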
Data-Modelling Best Practices for Massive Device Fleets
Below we create a production-grade table for a smart-agriculture scenario with one million sensor nodes:
```sql
CREATE TABLE agri_metrics (
  farm_id STRING,
  section_id STRING,
  sensor_id STRING SKIPPING INDEX,
  soil_moisture DOUBLE,
  air_temp DOUBLE,
  battery_lvl DOUBLE,
  ts TIMESTAMP TIME INDEX,
  PRIMARY KEY (farm_id, section_id)
) WITH (
  'merge_mode' = 'last_non_null',  -- only update changed fields
  'append_mode' = 'false'
)
PARTITION ON COLUMNS (farm_id);  -- ensures horizontal scalability
```

Why this works
- Low-cardinality keys (farm_id, section_id) control the deduplication footprint.
- High-cardinality sensor_id is moved out of the key but remains searchable via SKIPPING INDEX.
- last_non_null merge mode minimises unnecessary rewrites, prolonging NAND flash life for edge deployments.

Advanced Tip: If JSON logs arrive from the same sensors, define a sibling table with a FULLTEXT INDEX and query across both tables in a single SQL join—GreptimeDB’s columnar engine will push predicates down, avoiding row bloat.
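As a sketch of that tip (table and column names are illustrative, and the exact full-text index syntax and match function should be verified against the GreptimeDB docs for your version):

```sql
-- Hypothetical sibling table for raw device logs
CREATE TABLE agri_logs (
  sensor_id STRING SKIPPING INDEX,
  message STRING FULLTEXT INDEX,
  ts TIMESTAMP TIME INDEX
);

-- Correlate low-battery readings with error logs in one query
SELECT m.sensor_id, m.battery_lvl, l.message
FROM agri_metrics m
JOIN agri_logs l ON m.sensor_id = l.sensor_id
WHERE m.battery_lvl < 10
  AND matches(l.message, 'error');
```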
Edge-to-Cloud Synchronisation Workflow
GreptimeDB Edge runs on the gateway or in-vehicle compute, and shares the same storage format as the cloud-side cluster. Raw telemetry is decoded, compressed, and uploaded as Parquet — the cloud side ingests directly without an ETL step.
Li Auto's vehicle data architecture shows this in production. By deploying an on-vehicle time-series database alongside the data collector and signal decoder, the team applies columnar encoding (delta encoding, run-length encoding, zstd with pre-trained dictionaries) before upload. The result is files over 30% smaller than raw messages with generic compression, and tens of millions saved in cumulative bandwidth and cloud resource costs across the fleet. The Greptime Edge-Cloud solution productises the same pattern for connected vehicles and industrial IoT.
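The columnar tricks mentioned above can be illustrated with a toy sketch: delta encoding followed by run-length encoding. Real encoders (e.g. inside Parquet writers) are far more sophisticated, but this shows why slowly-changing telemetry compresses so well.

```python
# Toy sketch of delta + run-length encoding on a telemetry column.
# Not GreptimeDB's actual implementation; purely illustrative.

def delta_encode(values):
    """Store the first value, then successive differences."""
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]

def run_length_encode(values):
    """Collapse runs of repeated values into (value, count) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [(v, n) for v, n in runs]

# A slowly rising battery-temperature signal (hundredths of a degree)
signal = [2500, 2500, 2500, 2501, 2501, 2502, 2502, 2502, 2502, 2503]
deltas = delta_encode(signal)        # [2500, 0, 0, 1, 0, 1, 0, 0, 0, 1]
compact = run_length_encode(deltas)  # mostly zero-runs -> few pairs
```

After the delta pass, almost every value is 0 or 1, which run-length and dictionary coders then shrink dramatically; the same effect underlies the fleet-scale savings described above.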
For the gateway-to-cloud path, the GreptimeDB ingestion guide covers Prometheus Remote Write, OTLP, InfluxDB Line Protocol, and gRPC — pick the one that matches the existing collector.
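If the collector speaks InfluxDB Line Protocol, a reading for the agri_metrics table above can be formatted as a single line. This sketch omits the protocol's escaping rules and integer type suffixes; measurement, tag, and field names are illustrative.

```python
# Sketch: formatting one reading as InfluxDB Line Protocol
# (measurement,tag=val,... field=val,... timestamp). Escaping omitted.

def to_line_protocol(measurement, tags, fields, ts_ns):
    """Build a single line-protocol record with a nanosecond timestamp."""
    tag_str = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    field_str = ",".join(f"{k}={v}" for k, v in sorted(fields.items()))
    return f"{measurement},{tag_str} {field_str} {ts_ns}"

line = to_line_protocol(
    "agri_metrics",
    {"farm_id": "farm-7", "section_id": "s3"},
    {"soil_moisture": 41.2, "air_temp": 18.6},
    1735689600000000000,
)
```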
EMQX Tables: MQTT + Time-Series in One Platform
Many IoT deployments already terminate MQTT through EMQX, one of the most widely deployed brokers, with over 250 million connected devices across 60+ countries. EMQX Tables builds on this by embedding GreptimeDB directly inside EMQX Cloud — MQTT broker and time-series database on a single managed platform.
Getting from device to dashboard takes four steps:
- Connect devices to the EMQX broker via MQTT.
- Route topics to EMQX Tables through the built-in rule engine — no custom code, no ETL.
- Query with SQL or PromQL from the EMQX Cloud console.
- Visualise with Grafana, Metabase, or any SQL-compatible tool.
Because GreptimeDB supports schema-on-the-fly, new device payloads create tables and columns automatically. There is no ALTER TABLE step every time a sensor type changes, which matters in fleets where device firmware evolves faster than the data team can keep up.
Real-World Case Studies
Connected Vehicles: Edge Telemetry on the 8295 SoC
A tier-1 EV maker integrated GreptimeDB Edge into its Qualcomm 8295 infotainment system. Key numbers from the published benchmark:
- 700 K PPS sustained writes (CAN + ADAS)
- ~6% average single-core CPU, peak below 15%, RAM ≈ 130 MB
- 42 MB compressed export versus 1.3 GB raw ASC logs (30–40× compression)
- Two-minute lag from edge ingestion to cloud dashboard visibility
This compression saves millions of dollars in cellular traffic annually while enabling engineers to run full Prometheus alert rules such as:

```promql
rate(can_battery_temp_celsius{vehicle_id=~".+"}[30s]) > 0.5
```

Industrial IoT: D6 Monitoramento
D6, a Brazilian industrial IoT team, monitors energy consumption and production signals on factory floors. Their current deployment covers 20 industrial assets, 10 edge gateways, and 30 sensors, with over 1.2 billion data points stored on a single standalone GreptimeDB node — targeting 10 years of retention.
What changed with GreptimeDB versus their previous VictoriaMetrics setup: PromQL still drives real-time dashboards, while SQL plus the Flow engine lets them derive higher-level indicators from raw electrical measurements — inferring whether a machine is running or idle, its operational cycles, and in some cases the type of product being manufactured. TimescaleDB came up during their evaluation but required significantly more infrastructure for the same workload.
Kubernetes-Native Deployment Patterns
GreptimeDB Operator
The GreptimeDB Operator deploys a multi-node cluster with self-monitoring via Helm. The Operator can also inject a Vector sidecar to collect logs and write them back into an isolated monitoring instance — suitable for air-gapped factories that cannot run full Loki/Jaeger stacks.
Prometheus Long-Term Storage
If you already run Prometheus scrape jobs on devices, remote-write to GreptimeDB. The database keeps compressed Parquet on object storage, freeing your Prometheus server from multi-month retention obligations.
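A minimal prometheus.yml fragment for this looks like the following; the endpoint path follows GreptimeDB's documented remote-write URL on the default HTTP port 4000, while the hostname and database name are placeholders to adjust for your deployment:

```yaml
# Prometheus remote-write to GreptimeDB (sketch; hostname and db are placeholders)
remote_write:
  - url: http://greptimedb.example.com:4000/v1/prometheus/write?db=public
```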
Cost-Benefit Analysis
Object-storage-first design (S3, GCS, Azure Blob) decouples storage from provisioned compute. At AWS list prices, S3 Standard sits roughly 3–5× below provisioned SSD EBS, and GreptimeDB's columnar compression reaches 30–40× on time-series payloads. Combined with one cluster handling metrics, logs, and traces — instead of separate Prometheus, Loki, and Elasticsearch stacks — the GreptimeDB OSS product page reports up to 50× total cost reduction for observability workloads.
Conclusion
GreptimeDB combines a Rust-based high-performance core, an object-storage-first architecture, SQL + PromQL duality, and edge-to-cloud sync to deliver the most cost-effective observability solution for IoT. Whether you need real-time analytics for logs and metrics at a remote wind-farm, Prometheus-compatible long-term storage for a Kubernetes factory floor, or a scalable observability platform for cloud-native applications, GreptimeDB stands out as the best open-source observability database for IoT. Follow the quick-start guide to deploy GreptimeDB on Kubernetes in under ten minutes.
About Greptime
GreptimeDB is an open-source, cloud-native database purpose-built for real-time observability. Built in Rust and optimized for cloud-native environments, it provides unified storage and processing for metrics, logs, and traces — delivering sub-second insights from edge to cloud — at any scale.
GreptimeDB OSS – The open-sourced database for small to medium-scale observability and IoT use cases, ideal for personal projects or dev/test environments.
GreptimeDB Enterprise – A robust observability database with enhanced security, high availability, and enterprise-grade support.
🚀 We’re open to contributors—get started with issues labeled good first issue and connect with our community.
Stay in the loop
Join our community
Get the latest updates and discuss with other users.
