
How to Choose the Right Ingestion Protocol for GreptimeDB

Benchmark results comparing 7 GreptimeDB ingestion protocols under identical conditions on GreptimeDB v1.0 GA, with throughput from 72K to 2.7M rows/sec at batch=1000 (peak 3.3M at batch=2000). Practical guidance for choosing the right one.

GreptimeDB's ingestion protocols can differ by up to ~37x in throughput. This post uses benchmark data to help you pick the right one.

Updated 2026-04-21: The data below has been regenerated on GreptimeDB v1.0 GA. A few changes relative to the original run are worth noting. The OTLP path now uses the greptime_identity pipeline to expand attributes into individual columns, producing a storage layout comparable to the other protocols. v1.0 also made Flat SST / BulkMemtable the default; the stricter Arrow schema check on that path surfaced a timestamp type mismatch in the Go ingester, which has been fixed in greptimedb-ingester-go v0.7.2. PostgreSQL now has prepared-statement caching enabled. Release context is in the v1.0 GA post.

GreptimeDB supports over a dozen ingestion protocols, and the most common question in our community is: which one should I use?

There's plenty of scattered data out there, but test conditions vary so much that direct comparison is impossible. So I built an open-source benchmark tool, greptimedb-ingestion-benchmark, to test the most common protocols under identical conditions. This post shares the results and recommendations.

Protocols tested

Three categories, picked from GreptimeDB's many ingestion options:

GreptimeDB gRPC protocol, used through official SDKs, with three write modes:

| Write mode | Description |
|---|---|
| gRPC SDK (Unary) | One RPC call per batch; the simplest |
| gRPC Stream | Bidirectional streaming over a persistent connection; suited for high-frequency, sustained high-throughput writes |
| gRPC Bulk (Arrow) | Arrow Flight DoPut with columnar transfer; highest throughput |

Open standard protocols: InfluxDB Line Protocol (HTTP text) and OTLP Logs (HTTP + Protobuf).

SQL protocols: MySQL INSERT and PostgreSQL INSERT.

We tested OTLP Logs rather than OTLP Metrics. In GreptimeDB's OTLP data model, Metrics maps each metric name to a separate table. This benchmark has 5 metric fields, so the Metrics model would create 5 tables — not a fair comparison. The Logs model writes all fields into a single table, keeping conditions consistent. In this run the OTLP Logs path uses the greptime_identity pipeline, which expands log attributes into individual columns instead of packing them into a JSON column — so the resulting storage layout is comparable to the other protocols.

On Schemaless writes: gRPC SDK, gRPC Stream, InfluxDB LP, and OTLP all support automatic table creation[1] — just write new fields and GreptimeDB adds columns on the fly. SQL INSERT and gRPC Bulk (Arrow) require pre-created tables. SQL depends on an existing table structure for INSERT INTO statements; Arrow Bulk needs the target table to exist for column mapping via the DoPut interface. If your data structure changes frequently (IoT device fields, LLM conversation data, etc.), go with a Schemaless-capable protocol.
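For a concrete feel of the schemaless path, here is a minimal write in the "gRPC SDK" mode. This is a sketch following the greptimedb-ingester-go README; the table and column names are made up for illustration, so check the v0.7.x API docs for details:

```go
package main

import (
	"context"
	"log"
	"time"

	greptime "github.com/GreptimeTeam/greptimedb-ingester-go"
	"github.com/GreptimeTeam/greptimedb-ingester-go/table"
	"github.com/GreptimeTeam/greptimedb-ingester-go/table/types"
)

func main() {
	cfg := greptime.NewConfig("127.0.0.1").WithDatabase("public")
	cli, err := greptime.NewClient(cfg)
	if err != nil {
		log.Fatal(err)
	}

	// No CREATE TABLE beforehand: the server creates the table on first
	// write and adds columns as new fields show up (schemaless).
	tbl, err := table.New("device_metrics") // hypothetical table name
	if err != nil {
		log.Fatal(err)
	}
	tbl.AddTagColumn("host", types.STRING)
	tbl.AddFieldColumn("cpu_util", types.FLOAT)
	tbl.AddTimestampColumn("ts", types.TIMESTAMP_MILLISECOND)
	if err := tbl.AddRow("host-1", 0.42, time.Now()); err != nil {
		log.Fatal(err)
	}

	// One unary RPC per batch (the "gRPC SDK" row in the tables below).
	if _, err := cli.Write(context.Background(), tbl); err != nil {
		log.Fatal(err)
	}
}
```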

Test setup

10 million rows, 1 million time series (1,000 hosts × 5 regions × 10 datacenters × 20 services), 5 float64 metric fields per row, fixed random seed (seed=42). Each protocol writes to its own isolated table. 5 concurrent workers, all SDKs at default settings.
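To make the cardinality arithmetic concrete, here's how a 1M-series universe with a fixed seed can be generated. This is hypothetical code for illustration; the benchmark's actual generator lives in the repository:

```go
package main

import (
	"fmt"
	"math/rand"
)

// One time series = one unique tag combination:
// 1,000 hosts × 5 regions × 10 datacenters × 20 services = 1,000,000.
type series struct{ host, region, dc, service string }

func main() {
	rng := rand.New(rand.NewSource(42)) // fixed seed => reproducible field values

	all := make([]series, 0, 1_000_000)
	for h := 0; h < 1000; h++ {
		for r := 0; r < 5; r++ {
			for d := 0; d < 10; d++ {
				for s := 0; s < 20; s++ {
					all = append(all, series{
						fmt.Sprintf("host-%04d", h),
						fmt.Sprintf("region-%d", r),
						fmt.Sprintf("dc-%02d", d),
						fmt.Sprintf("svc-%02d", s),
					})
				}
			}
		}
	}

	// Each row then carries 5 float64 metric fields drawn from the seeded RNG.
	fmt.Println(len(all), rng.Float64())
}
```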

Test environment: MacBook Pro 14-inch (M4 Max, 48 GB), GreptimeDB v1.0 GA standalone mode, greptimedb-ingester-go v0.7.2. This is a single-machine test to compare relative differences between protocols, not to measure absolute throughput limits. A production distributed cluster will yield higher absolute numbers, but the relative ordering stays the same. Full methodology in the repository README.

Results

batch=1000, 1M series

1 million time series is close to real production cardinality, and batch=1000 is a reasonable default for most workloads.

| Protocol | Throughput (rows/sec) | Duration | P50 latency | P99 latency |
|---|---|---|---|---|
| gRPC Bulk (Arrow) | 2,678,839 | 3.7 s | 1.54 ms | 8.80 ms |
| gRPC Stream | 1,562,134 | 6.4 s | 2.88 ms | 10.77 ms |
| gRPC SDK | 1,174,221 | 8.5 s | 4.15 ms | 10.80 ms |
| InfluxDB LP | 889,051 | 11.3 s | 5.42 ms | 13.06 ms |
| OTLP Logs (HTTP) | 621,367 | 16.1 s | 7.79 ms | 16.38 ms |
| PostgreSQL INSERT | 73,760 | 135.6 s | 66.58 ms | 101.58 ms |
| MySQL INSERT | 72,103 | 138.7 s | 67.65 ms | 119.37 ms |

[Figure: protocol throughput comparison]

The three gRPC modes land between 1.17M and 2.68M rows/sec. HTTP protocols — InfluxDB LP at 889K and OTLP Logs at 621K — sit in the next tier. SQL comes in at 72K–74K rows/sec. That's roughly a 37x gap between the fastest and slowest.

OTLP at 621K rows/sec falls below InfluxDB LP, which is expected. Both run over HTTP with one request-response per batch, but OTLP additionally runs the greptime_identity pipeline server-side to expand attributes into columns; that expansion is the dominant CPU cost. In return, OTLP logs are stored in a column layout comparable to the other protocols, and, as the cardinality section below shows, this path is nearly insensitive to series count.
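For reference, selecting that pipeline from an OTel SDK is a matter of HTTP headers. A sketch with the Go OTLP/HTTP log exporter follows; the header and path names are taken from GreptimeDB's OTLP docs, so verify them against your server version, and the table name is hypothetical:

```go
package main

import (
	"context"
	"log"

	"go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploghttp"
	sdklog "go.opentelemetry.io/otel/sdk/log"
)

func main() {
	ctx := context.Background()

	// Point the OTLP/HTTP log exporter at GreptimeDB (HTTP port 4000 by
	// default). The X-Greptime-* headers pick the target table and the
	// greptime_identity pipeline.
	exp, err := otlploghttp.New(ctx,
		otlploghttp.WithEndpoint("127.0.0.1:4000"),
		otlploghttp.WithURLPath("/v1/otlp/v1/logs"),
		otlploghttp.WithInsecure(),
		otlploghttp.WithHeaders(map[string]string{
			"X-Greptime-Log-Table-Name":    "bench_otlp", // hypothetical table name
			"X-Greptime-Log-Pipeline-Name": "greptime_identity",
		}),
	)
	if err != nil {
		log.Fatal(err)
	}

	// Batch-export logs through the standard OTel SDK pipeline.
	provider := sdklog.NewLoggerProvider(
		sdklog.WithProcessor(sdklog.NewBatchProcessor(exp)),
	)
	defer provider.Shutdown(ctx)
}
```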

A note on the SQL results: both the connection pool and concurrency were set to 5, same as every other protocol in this benchmark. In practice, you can improve SQL write throughput by increasing the connection pool size and concurrency, but that's outside the scope of this test. The numbers here reflect relative performance under identical concurrency.

Lower cardinality reference: batch=1000, 100K series

With fewer time series (a few hundred hosts, say), most protocols are essentially flat — but not all:

| Protocol | Throughput (rows/sec) | P50 latency | P99 latency |
|---|---|---|---|
| gRPC Bulk (Arrow) | 1,950,517 | 1.37 ms | 12.92 ms |
| gRPC Stream | 1,653,225 | 2.64 ms | 11.47 ms |
| gRPC SDK | 1,228,298 | 3.96 ms | 9.87 ms |
| InfluxDB LP | 945,912 | 4.99 ms | 15.26 ms |
| OTLP Logs (HTTP) | 616,848 | 7.46 ms | 18.16 ms |
| PostgreSQL INSERT | 74,218 | 65.90 ms | 106.31 ms |
| MySQL INSERT | 73,813 | 66.12 ms | 107.61 ms |

gRPC Bulk is actually slower at 100K series than at 1M — the opposite of every other protocol. That isn't a typo; the "Series cardinality impact" section below digs into why.

Batch size impact

Four batch sizes (50 / 200 / 1,000 / 2,000) at 1M series:

| Protocol | batch=50 | batch=200 | batch=1000 | batch=2000 |
|---|---|---|---|---|
| gRPC Bulk (Arrow) | 806,495 | 1,524,316 | 2,678,839 | 3,335,918 |
| gRPC Stream | 641,485 | 1,076,555 | 1,562,134 | 1,759,202 |
| gRPC SDK | 552,341 | 950,197 | 1,174,221 | 1,415,292 |
| InfluxDB LP | 507,072 | 761,088 | 889,051 | 1,082,739 |
| OTLP Logs (HTTP) | 425,849 | 564,228 | 621,367 | 683,222 |
| MySQL INSERT | 66,788 | 67,692 | 72,103 | 73,534 |
| PostgreSQL INSERT | 66,490 | 68,176 | 73,760 | 73,617 |

[Figure: batch size impact on throughput]

The gRPC family keeps gaining as batches grow. Bulk goes from 806K at batch=50 to 3.34M at batch=2000 — more than 4x. Stream 2.7x (641K → 1.76M), SDK 2.6x. The latency tradeoff is reasonable too: at batch=2000, P99 stays in the low-teens ms for gRPC Bulk (13.78 ms).

InfluxDB LP scales better than OTLP (507K → 1.08M vs 426K → 683K). OTLP runs protobuf plus pipeline processing, both of which are heavier than InfluxDB's plain text line protocol.

SQL barely moves. MySQL goes from 67K to 74K, PostgreSQL from 66K to 74K. Throughput is saturated by the INSERT ... VALUES (...) row-by-row execution model; bigger batches mostly just inflate per-statement latency (P99 climbs from ~7 ms to ~200 ms).
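For reference, the SQL runs batch rows into a single multi-row statement, in the spirit of the sketch below (hypothetical table and column names; GreptimeDB's MySQL endpoint listens on port 4002 by default):

```go
package main

import (
	"database/sql"
	"log"
	"strings"
	"time"

	_ "github.com/go-sql-driver/mysql"
)

// insertBatch sends one INSERT ... VALUES statement carrying the whole
// batch; the server still parses the SQL and converts types row by row.
func insertBatch(db *sql.DB, rows [][]any) error {
	placeholders := make([]string, len(rows))
	args := make([]any, 0, len(rows)*3)
	for i, r := range rows {
		placeholders[i] = "(?, ?, ?)"
		args = append(args, r...)
	}
	_, err := db.Exec(
		"INSERT INTO bench_sql (host, cpu_util, ts) VALUES "+
			strings.Join(placeholders, ","),
		args...)
	return err
}

func main() {
	db, err := sql.Open("mysql", "tcp(127.0.0.1:4002)/public")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()
	db.SetMaxOpenConns(5) // match the benchmark's pool of 5

	err = insertBatch(db, [][]any{
		{"host-1", 0.42, time.Now()},
		{"host-2", 0.37, time.Now()},
	})
	if err != nil {
		log.Fatal(err)
	}
}
```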

Series cardinality impact

One piece of v1.0 context matters for this section: Flat SST / BulkMemtable is now the default storage format (see the v1.0 GA announcement). Every protocol in this benchmark writes through the new BulkMemtable, and the cardinality sweep reflects that clearly.

100K vs 1M series, batch=1000:

| Protocol | 100K series | 1M series | Change |
|---|---|---|---|
| gRPC Bulk (Arrow) | 1,950,517 | 2,678,839 | +37.3% |
| gRPC Stream | 1,653,225 | 1,562,134 | −5.5% |
| gRPC SDK | 1,228,298 | 1,174,221 | −4.4% |
| InfluxDB LP | 945,912 | 889,051 | −6.0% |
| OTLP Logs (HTTP) | 616,848 | 621,367 | +0.7% |
| PostgreSQL INSERT | 74,218 | 73,760 | −0.6% |
| MySQL INSERT | 73,813 | 72,103 | −2.3% |

[Figure: series cardinality impact]

Two observations stand out. First, every protocol except gRPC Bulk is flat within ±6%. Second, gRPC Bulk is the outlier and moves in the opposite direction: +37.3% faster at 1M series than at 100K. The result is stable across repeated runs, not a one-off.

Where the difference actually lives

At 10M rows and batch=2000, dumping information_schema.region_statistics plus the mito flush metrics gives a clear picture:

| Metric | 100K series | 1M series |
|---|---|---|
| Throughput (rows/sec) | 2,159K | 2,933K (+36%) |
| Wall clock | 4.63 s | 3.41 s (Δ = −1.22 s) |
| Region count | 1 | 1 |
| Flush count (reason="EngineFull") | 3 | 3 |
| SST output size | 491 MB | 493 MB |
| flush_elapsed_sum{type="total"} | 2.31 s | 1.02 s |
| flush_memtables sum | 2.28 s | 0.99 s |
| write_batch (parquet IO) sum | 2.23 s | 2.01 s |

A few facts read directly off the table above. The two runs have identical region count, flush count, and SST output size, so the gap does not come from triggering more flushes or writing more bytes. write_batch (parquet IO) is nearly identical as well, so the bottleneck is not IO. The only row that moves materially is flush_memtables: at 100K it takes 2.3x as long as at 1M, and that ~1.29 s delta covers almost the entire 1.22 s wall-clock gap. The cardinality reversal is, at its core, a reversal in how long each flush takes inside BulkMemtable.
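If you want to pull the same numbers from your own run, both sources are easy to reach. A sketch follows; the ports are GreptimeDB's defaults, and the flush metrics are found by substring match rather than exact names:

```go
package main

import (
	"bufio"
	"database/sql"
	"fmt"
	"log"
	"net/http"
	"strings"

	_ "github.com/go-sql-driver/mysql"
)

func main() {
	// Region-level stats over the MySQL protocol (port 4002 by default).
	db, err := sql.Open("mysql", "tcp(127.0.0.1:4002)/public")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	rows, err := db.Query("SELECT * FROM information_schema.region_statistics")
	if err != nil {
		log.Fatal(err)
	}
	cols, _ := rows.Columns()
	n := 0
	for rows.Next() {
		n++
	}
	rows.Close()
	fmt.Printf("columns: %v, regions: %d\n", cols, n)

	// Flush timings from the Prometheus endpoint (HTTP port 4000 by default).
	resp, err := http.Get("http://127.0.0.1:4000/metrics")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()
	sc := bufio.NewScanner(resp.Body)
	for sc.Scan() {
		if strings.Contains(sc.Text(), "flush_elapsed") {
			fmt.Println(sc.Text())
		}
	}
}
```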

Control experiment: no flush, no gap

If flush is the source of the gap, shrinking the total write down to something that doesn't trigger a flush should make the gap disappear. It does:

| Scenario | 100K series | 1M series | Flush count |
|---|---|---|---|
| total = 1M rows (no flush) | 5,138K / 4,981K rows/sec | 5,425K / 3,537K rows/sec | 0 |
| total = 10M rows (3 flushes) | 2,186K rows/sec | 2,929K rows/sec | 3 |

Without a flush, the two cardinalities are indistinguishable (100K even edges ahead on one run). Once flushes start firing, the gap reappears.

50M rows: the same pattern at 5x scale

To confirm this isn't a 10M-row artifact, we reran at 50M rows on the same machine and config:

| batch | 100K rows/sec | 1M rows/sec | 1M advantage | 100K: flushes / total flush time | 1M: flushes / total flush time | Per-flush ratio |
|---|---|---|---|---|---|---|
| 1000 | 1,604K | 2,235K | +39% | 12 / 5.80 s | 12 / 3.85 s | 1.5× |
| 10000 | 3,171K | 4,090K | +29% | 12 / 14.86 s | 12 / 10.22 s | 1.45× |

Flush count is identical across both cardinalities (12), and the overall shape matches. Each 100K flush runs roughly 50% slower than the corresponding 1M flush. The gap narrows from +39% (batch=1000) to +29% (batch=10000) but does not disappear. Results at 10M and 50M rows agree, so this is a structural effect rather than small-sample noise.

Why only gRPC Bulk shows this

Every other protocol does per-row work on the way in: partition routing, row-to-column conversion, schema validation. That cost grows with cardinality and roughly cancels the flush speedup on the memtable side, which is why Stream, SDK, InfluxDB, OTLP, and SQL all come out within a few percent of flat. Bulk skips almost all of that per-row work (whole Arrow RecordBatches are appended as parts), so the flush-side cardinality effect is no longer hidden. In this range it works in the user's favor.

Why the gap is so large

gRPC's advantage comes from encoding efficiency. Protocol Buffers is a compact binary format — small payloads, fast parsing. The three modes differ in connection handling: SDK sends one independent RPC per batch; Stream reuses a bidirectional stream, skipping per-batch connection negotiation for roughly 20–30% higher throughput; Bulk uses the Arrow Flight protocol[2] for columnar transfer, and since GreptimeDB also uses Arrow internally as its in-memory format, writes are near zero-copy — that's where the 2.68M rows/sec (and 3.34M at batch=2000) comes from. The tradeoff: you need to pre-create the table.

InfluxDB LP and OTLP both run over HTTP, with a full request-response cycle per batch. That's their ceiling. InfluxDB LP uses a text format but is otherwise a thin path; OTLP adds greptime_identity pipeline processing on the server, which is why it ends up below InfluxDB LP in this run even though both are HTTP + batch-oriented.
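For a sense of how thin that path is, a single Line Protocol point is just text over HTTP. A sketch against GreptimeDB's InfluxDB-compatible v1 write endpoint (default HTTP port 4000; the measurement and field names are made up):

```go
package main

import (
	"fmt"
	"log"
	"net/http"
	"strings"
	"time"
)

func main() {
	// Line Protocol: measurement,tags fields timestamp (here in ms to
	// match precision=ms on the URL).
	line := fmt.Sprintf(
		"bench_influx,host=host-1,region=region-0 cpu_util=0.42,mem_util=0.18 %d",
		time.Now().UnixMilli(),
	)
	resp, err := http.Post(
		"http://127.0.0.1:4000/v1/influxdb/write?db=public&precision=ms",
		"text/plain",
		strings.NewReader(line),
	)
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()
	fmt.Println(resp.Status) // expect 204 No Content on success
}
```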

SQL is slow for two reasons. First, the processing path is long: the client assembles INSERT INTO ... VALUES (...) text, the server parses the SQL, converts types row by row, then writes. Every step adds overhead, and the text payload is much larger than binary. Second, the concurrency model: MySQL and PostgreSQL protocols use synchronous connections — one connection handles one statement at a time, and concurrency is limited by the connection pool. This is fundamentally different from gRPC's asynchronous streaming model. None of this is GreptimeDB-specific — any time-series database accepting SQL writes faces the same protocol overhead.

How to choose

Most workloads: gRPC SDK. Around 1.17M rows/sec, simple code, Schemaless support. Our official SDKs cover Go, Java, Rust, Erlang, and .NET. If you don't have special requirements, start here. For JS/TS stacks (no gRPC JS client yet), use InfluxDB LP or OTLP instead — both have mature JS libraries and perform at the hundreds-of-thousands-rows-per-second level.

Bulk imports: gRPC Bulk. Data migrations, backfills, ETL. ~2.68M rows/sec at batch=1000, over 3.3M at batch=2000; 10 million rows in under 4 seconds. Requires pre-created tables. The Erlang SDK doesn't support this mode yet.

High-frequency or sustained high-throughput: gRPC Stream. IoT gateways, monitoring collectors, or any scenario with continuous non-stop writes. Also a good fit when write frequency is very high with small payloads per request. Bidirectional streaming avoids per-batch connection setup, delivering ~1.56M rows/sec with Schemaless support.

InfluxDB ecosystem: InfluxDB Line Protocol. Already running Telegraf or outputting Line Protocol? Plug straight into GreptimeDB's compatible endpoint. Around 890K rows/sec, near-zero migration cost.

OTel ecosystem: OTLP. Already using OpenTelemetry Collector or OTel SDKs? OTLP is the natural fit at around 620K rows/sec with Schemaless support. The throughput is lower than InfluxDB LP because the server-side greptime_identity pipeline expands attributes into proper columns — worth the cost because it gives you a comparable storage layout. Note that Metrics and Logs use different data models[3]: Metrics creates one table per metric name (suited for Prometheus-style monitoring), while Logs writes to a unified log table (suited for flexible data structures). Pick based on your actual data model.

Development and debugging: MySQL / PostgreSQL. Write throughput is low (~72K rows/sec for both), but mysql, psql, DBeaver, ORMs, and language drivers all connect directly. No Schemaless support — create tables first. Slow writes don't mean slow queries: MySQL/PG protocols are GreptimeDB's primary query interface.

Quick reference

| | gRPC SDK | gRPC Stream | gRPC Bulk | InfluxDB LP | OTLP | MySQL/PG |
|---|---|---|---|---|---|---|
| Throughput | 1.17M/s | 1.56M/s | 2.68M/s | 890K/s | 620K/s | ~72K/s |
| Schemaless | ✅ | ✅ | ❌ Pre-create | ✅ | ✅ | ❌ Pre-create |
| Wire format | Protobuf | Protobuf | Arrow IPC | Text | Protobuf | SQL text |
| SDK coverage | Go/Java/Rust/Erlang/.NET | Same | Same (no Erlang) | All languages | All languages | All languages |
| Best for | General default | High-freq / sustained | Bulk import | InfluxDB migration | OTel ecosystem | Queries & debugging |

In short: pick gRPC for performance (start with SDK, move to Stream or Bulk when needed), pick the compatible protocol for your existing ecosystem (InfluxDB LP / OTLP), and use SQL for queries and debugging.

Reproduce it yourself

```bash
git clone https://github.com/killme2008/greptimedb-ingestion-benchmark.git
cd greptimedb-ingestion-benchmark
bin/run.sh
```

The script downloads GreptimeDB, starts it, runs every protocol, and prints results. Customize as needed:

```bash
bin/run.sh -protocols grpc,grpc_bulk,influxdb -batch-size 500,1000,2000
bin/run.sh -host 10.0.0.1  # connect to a remote instance
```

Got different results, or findings from a specific workload? We'd love to hear about it on GitHub Discussions or Slack.

References


1. GreptimeDB ingestion — automatic schema generation

2. Apache Arrow Flight protocol

3. GreptimeDB OpenTelemetry data model
