Tutorial
2025-12-02

Faster Historical Queries: Guide to GreptimeDB's Independent Index File Cache

Querying historical data on S3 means waiting for index files to download—every single time. For GreptimeDB users on object storage, this has been a persistent pain point. Version 1.0.0-beta.1 introduces independent index file caching to fix this.

The new version caches index files separately on local disk and preloads them from object storage on startup. Historical queries no longer hit object storage repeatedly, and latency drops significantly.

Why Separate Caching for Index Files?

Here's GreptimeDB's storage architecture with object storage:

Write Request → WAL → Memtable → Flush → Parquet Files
                                            ├─→ Write Cache (Local Disk)
                                            └─→ Object Storage (S3/GCS/Azure Blob)

During queries, GreptimeDB reads from local caches whenever possible. The cache hierarchy from fastest to slowest:

  1. Memtable — Most recent writes, in-memory
  2. Write Cache — Local cache for recently written data
  3. Read Cache — Data loaded from object storage during queries, LRU eviction
  4. Object Storage — Persistent storage, highest latency

The independent index file cache is part of Write Cache. In future versions, Write Cache and Read Cache will be unified.
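Conceptually, a query falls through these tiers in order. The sketch below illustrates that lookup path in Rust; it is not GreptimeDB's actual code (each tier is modeled as a plain in-memory map and eviction is omitted), but it shows why missing every local tier is expensive: only then does the read go to object storage.

rust
use std::collections::HashMap;

// Illustrative sketch of the read path implied by the cache hierarchy above.
// Not GreptimeDB's actual implementation: each tier is a plain map, no eviction.
struct ReadPath {
    memtable: HashMap<String, Vec<u8>>,    // 1. most recent writes, in memory
    write_cache: HashMap<String, Vec<u8>>, // 2. recently written files, local disk
    read_cache: HashMap<String, Vec<u8>>,  // 3. data fetched by earlier queries
}

impl ReadPath {
    // Check each tier from fastest to slowest; fall back to object storage
    // (modeled as a closure) only when every local tier misses.
    fn get(&mut self, key: &str, object_storage: impl Fn(&str) -> Vec<u8>) -> Vec<u8> {
        if let Some(v) = self.memtable.get(key) {
            return v.clone();
        }
        if let Some(v) = self.write_cache.get(key) {
            return v.clone();
        }
        if let Some(v) = self.read_cache.get(key) {
            return v.clone();
        }
        // 4. Slowest path: download from object storage, then keep a local
        // copy so the next query for the same data stays local.
        let v = object_storage(key);
        self.read_cache.insert(key.to_string(), v.clone());
        v
    }
}

fn main() {
    let mut path = ReadPath {
        memtable: HashMap::new(),
        write_cache: HashMap::new(),
        read_cache: HashMap::new(),
    };
    // The first read misses every local tier and hits "object storage"...
    let v = path.get("file-1", |_key: &str| b"data".to_vec());
    assert_eq!(v, b"data".to_vec());
    // ...subsequent reads are served from the read cache.
    assert!(path.read_cache.contains_key("file-1"));
}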

The problem: in previous versions, increasing local cache size only helped newly written data files. Index files for historical data had no dedicated cache space, so every query might fetch them from object storage.

Worse, index files are small but accessed on every query—they're needed to locate the actual data. When indexes share the cache pool with large data files, they get evicted easily.

How It Works

The 1.0.0-beta.1 solution is straightforward: give index files their own dedicated cache space instead of competing with data files.

Key mechanisms:

  • Dual cache pools: Parquet (data files) and Puffin (index files) use separate LRU pools that don't interfere with each other
  • Default 20%: Automatically reserves 20% of Write Cache for indexes, with a minimum of 512MB per pool (see the sizing sketch after this list)
  • Background preloading: Indexes load asynchronously from object storage on startup, newest data first (sorted by timestamp descending); stops when cache capacity is reached, without blocking startup (see the preload sketch further below)
  • Configurable ratio: Adjust allocation via index_cache_percent
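To make the sizing rule concrete, here is a small sketch of how a Write Cache budget could be split into two independent pools. It is illustrative only, not GreptimeDB's implementation; the 512MB floor is applied here as a simple per-pool minimum.

rust
// Sketch of the capacity split described above; not GreptimeDB's actual code.
// All sizes are in bytes.
const MIN_POOL_BYTES: u64 = 512 * 1024 * 1024; // 512MB floor per pool

// Split the Write Cache budget between the Puffin (index) pool and the
// Parquet (data) pool according to index_cache_percent (valid range 1-99).
fn split_write_cache(total: u64, index_cache_percent: u64) -> (u64, u64) {
    assert!((1..=99).contains(&index_cache_percent));
    let index_pool = (total * index_cache_percent / 100).max(MIN_POOL_BYTES);
    let data_pool = total.saturating_sub(index_pool).max(MIN_POOL_BYTES);
    (index_pool, data_pool)
}

fn main() {
    const GIB: u64 = 1024 * 1024 * 1024;
    // The default: 10GiB total with 20% for indexes
    // => 2GiB Puffin pool, 8GiB Parquet pool.
    let (index_pool, data_pool) = split_write_cache(10 * GIB, 20);
    println!("index: {} GiB, data: {} GiB", index_pool / GIB, data_pool / GIB);
}

Because the two pools are separate LRU caches, a burst of large data files can only evict other data files; entries in the index pool stay put.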

In practice:

  1. Indexes won't get evicted by large data file writes
  2. Historical indexes stay local—no waiting for the first query to trigger loading
  3. Fast recovery after restarts, no warm-up needed
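The preload behavior can be sketched in the same spirit: order index files so the newest data comes first, then fill the index pool until its capacity is reached. Again, this is a conceptual sketch based on the description above, not the actual implementation; the IndexFile fields are made up for illustration.

rust
// Conceptual sketch of startup preloading as described above;
// not GreptimeDB's actual code. Fields are made up for illustration.
struct IndexFile {
    path: String,
    size: u64,          // bytes
    max_timestamp: i64, // newest data covered by this index file
}

// Decide which index files to preload: newest data first, stopping once
// the next file would exceed the index pool's capacity.
fn plan_preload(mut files: Vec<IndexFile>, capacity: u64) -> Vec<IndexFile> {
    // Timestamp descending, i.e. newest first.
    files.sort_by(|a, b| b.max_timestamp.cmp(&a.max_timestamp));

    let mut used = 0u64;
    let mut plan = Vec::new();
    for f in files {
        if used + f.size > capacity {
            break; // index pool is full; stop preloading
        }
        used += f.size;
        plan.push(f);
    }
    // The downloads themselves would run asynchronously in the background,
    // so startup is never blocked on object storage.
    plan
}

fn main() {
    let files = vec![
        IndexFile { path: "a.puffin".into(), size: 300, max_timestamp: 100 },
        IndexFile { path: "b.puffin".into(), size: 300, max_timestamp: 200 },
        IndexFile { path: "c.puffin".into(), size: 300, max_timestamp: 300 },
    ];
    // With room for two files, the two newest ("c", then "b") are preloaded.
    let plan = plan_preload(files, 600);
    assert_eq!(plan.len(), 2);
    assert_eq!(plan[0].path, "c.puffin");
}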

Configuration

Basic Setup

When using object storage, Write Cache is enabled by default. Just configure the index ratio:

toml
[[region_engine]]
[region_engine.mito]
# Enabled by default for object storage
enable_write_cache = true

# Total Write Cache size
write_cache_size = "10GiB"

# Index file percentage (default 20, valid range 1-99)
index_cache_percent = 20

# Preload indexes on startup (default true)
preload_index_cache = true

With this configuration, a 10GiB Write Cache allocates 2GiB for indexes and 8GiB for data files.

Tuning the Ratio

Increase the ratio when:

  • Index cache hit rate drops below 80%
  • Total index size (index_length in information_schema.tables) exceeds the current cache allocation
  • Historical queries dominate
  • Complex query conditions with frequent index access

Decrease the ratio when:

  • Mostly querying recent data (hot data already in Write Cache)
  • Simple table structures with small index files
  • Limited local disk space

Monitoring Cache Performance

Use Prometheus metrics to track cache behavior. The type label distinguishes index and data files:

promql
# Index cache miss rate
rate(greptime_mito_cache_miss{type="index"}[5m])

# Data file cache miss rate
rate(greptime_mito_cache_miss{type="file"}[5m])

# Index cache usage
greptime_mito_cache_bytes{type="index"}

# Index cache hit rate
rate(greptime_mito_cache_hit{type="index"}[5m]) /
(rate(greptime_mito_cache_hit{type="index"}[5m]) + rate(greptime_mito_cache_miss{type="index"}[5m]))

If the index cache hit rate drops below 80% (empirical threshold), cache capacity may be insufficient. Increase write_cache_size or raise index_cache_percent.

Relationship with Other Caches

GreptimeDB has multiple cache layers. Understanding their roles helps with tuning:

toml
[[region_engine]]
[region_engine.mito]
# Disk-level cache
enable_write_cache = true
write_cache_size = "10GiB"
index_cache_percent = 20

# Memory-level cache
sst_meta_cache_size = "128MB"
vector_cache_size = "512MB"          # Tag column Arrow arrays
page_cache_size = "512MB"
selector_result_cache_size = "512MB" # File last row cache

# Index-specific memory cache
[region_engine.mito.index]
metadata_cache_size = "64MiB"
content_cache_size = "128MiB"

Cache responsibilities:

Cache                        Location  Purpose
Write Cache (data portion)   Disk      Avoids downloading data files from object storage
Write Cache (index portion)  Disk      Avoids downloading index files from object storage
metadata_cache_size          Memory    Avoids disk I/O for index metadata
content_cache_size           Memory    Avoids disk I/O for index content

Disk and memory caches complement each other: disk cache determines whether to download from object storage; memory cache determines whether to read from local disk. With ample disk space, increase write_cache_size. With ample memory, increase metadata_cache_size and content_cache_size.

Best Practices

Scenario 1: Trace Query Workloads

One user stores 10 billion traces on Alibaba Cloud OSS and needs to query individual traces by trace_id across the entire dataset. Querying index_length from information_schema.tables shows a total index size of ~106GB. Before upgrading, queries frequently timed out or took too long because indexes kept getting evicted from cache. After enabling independent index caching, queries now return within minutes.

Recommended configuration:

toml
[[region_engine]]
[region_engine.mito]
write_cache_size = "200GiB"
index_cache_percent = 30  # ~60GB for indexes, covers hot index data

[region_engine.mito.index]
metadata_cache_size = "128MiB"
content_cache_size = "256MiB"

Scenario 2: Real-time Monitoring

For dashboards that mostly query recent data, defaults work well:

toml
[[region_engine]]
[region_engine.mito]
write_cache_size = "10GiB"
index_cache_percent = 20

Upgrade Notes

When upgrading to 1.0.0-beta.1:

  1. No manual migration: The feature works out of the box; indexes load automatically in the background
  2. Expect startup I/O spike: First startup may show brief I/O spikes during preloading; monitor progress via greptime_mito_cache_fill_downloaded_files metric
  3. Check disk space: Ensure sufficient local disk for the index cache
  4. To disable preloading: Set preload_index_cache = false
  5. Cluster deployment: Configure on each Datanode; caches are not shared between nodes

Summary

Independent index file caching optimizes GreptimeDB for object storage: 20% of Write Cache reserved for indexes by default, background preloading on startup (newest first), tunable via index_cache_percent.

Expect faster historical queries after upgrading.

For more details, see the GreptimeDB Performance Tuning Guide.
