Tutorial
2025-12-02

Faster Historical Queries: Guide to GreptimeDB's Independent Index File Cache

Querying historical data on S3 means waiting for index files to download—every single time. For GreptimeDB users on object storage, this has been a persistent pain point. Version 1.0.0-beta.1 introduces independent index file caching to fix this.

The new version caches index files separately on local disk and preloads them from object storage on startup. Historical queries no longer hit object storage repeatedly, and latency drops significantly.

Why Separate Caching for Index Files?

Here's GreptimeDB's storage architecture with object storage:

Write Request → WAL → Memtable → Flush → Parquet Files
                                            ├─→ Write Cache (Local Disk)
                                            └─→ Object Storage (S3/GCS/Azure Blob)

During queries, GreptimeDB reads from local caches whenever possible. The cache hierarchy from fastest to slowest:

  1. Memtable — Most recent writes, in-memory
  2. Write Cache — Local cache for recently written data
  3. Read Cache — Data loaded from object storage during queries, LRU eviction
  4. Object Storage — Persistent storage, highest latency

The independent index file cache is part of Write Cache. In future versions, Write Cache and Read Cache will be unified.
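Conceptually, a query falls through these tiers in order. The sketch below illustrates that lookup path in Rust; it is not GreptimeDB's actual code (each tier is modeled as a plain in-memory map and eviction is omitted), but it shows why missing every local tier is expensive: only then does the read go to object storage.

rust
use std::collections::HashMap;

// Illustrative sketch of the read path implied by the cache hierarchy above.
// Not GreptimeDB's actual implementation: each tier is a plain map, no eviction.
struct ReadPath {
    memtable: HashMap<String, Vec<u8>>,    // 1. most recent writes, in memory
    write_cache: HashMap<String, Vec<u8>>, // 2. recently written files, local disk
    read_cache: HashMap<String, Vec<u8>>,  // 3. data fetched by earlier queries
}

impl ReadPath {
    // Check each tier from fastest to slowest; fall back to object storage
    // (modeled as a closure) only when every local tier misses.
    fn get(&mut self, key: &str, object_storage: impl Fn(&str) -> Vec<u8>) -> Vec<u8> {
        if let Some(v) = self.memtable.get(key) {
            return v.clone();
        }
        if let Some(v) = self.write_cache.get(key) {
            return v.clone();
        }
        if let Some(v) = self.read_cache.get(key) {
            return v.clone();
        }
        // 4. Slowest path: download from object storage, then keep a local
        // copy so the next query for the same data stays local.
        let v = object_storage(key);
        self.read_cache.insert(key.to_string(), v.clone());
        v
    }
}

fn main() {
    let mut path = ReadPath {
        memtable: HashMap::new(),
        write_cache: HashMap::new(),
        read_cache: HashMap::new(),
    };
    // The first read misses every local tier and hits "object storage"...
    let v = path.get("file-1", |_key: &str| b"data".to_vec());
    assert_eq!(v, b"data".to_vec());
    // ...subsequent reads are served from the read cache.
    assert!(path.read_cache.contains_key("file-1"));
}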

The problem: in previous versions, increasing local cache size only helped newly written data files. Index files for historical data had no dedicated cache space, so every query might fetch them from object storage.

Worse, index files are small but accessed on every query—they're needed to locate the actual data. When indexes share the cache pool with large data files, they get evicted easily.

How It Works

The 1.0.0-beta.1 solution is straightforward: give index files their own dedicated cache space instead of competing with data files.

Key mechanisms:

  • Dual cache pools: Parquet (data files) and Puffin (index files) use separate LRU pools that don't interfere with each other
  • Default 20%: Automatically reserves 20% of Write Cache for indexes, with a minimum of 512MB per pool (see the sizing sketch after this list)
  • Background preloading: Indexes load asynchronously from object storage on startup, newest data first (sorted by timestamp descending); stops when cache capacity is reached, without blocking startup (see the preload sketch further below)
  • Configurable ratio: Adjust allocation via index_cache_percent
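To make the sizing rule concrete, here is a small sketch of how a Write Cache budget could be split into two independent pools. It is illustrative only, not GreptimeDB's implementation; the 512MB floor is applied here as a simple per-pool minimum.

rust
// Sketch of the capacity split described above; not GreptimeDB's actual code.
// All sizes are in bytes.
const MIN_POOL_BYTES: u64 = 512 * 1024 * 1024; // 512MB floor per pool

// Split the Write Cache budget between the Puffin (index) pool and the
// Parquet (data) pool according to index_cache_percent (valid range 1-99).
fn split_write_cache(total: u64, index_cache_percent: u64) -> (u64, u64) {
    assert!((1..=99).contains(&index_cache_percent));
    let index_pool = (total * index_cache_percent / 100).max(MIN_POOL_BYTES);
    let data_pool = total.saturating_sub(index_pool).max(MIN_POOL_BYTES);
    (index_pool, data_pool)
}

fn main() {
    const GIB: u64 = 1024 * 1024 * 1024;
    // The default: 10GiB total with 20% for indexes
    // => 2GiB Puffin pool, 8GiB Parquet pool.
    let (index_pool, data_pool) = split_write_cache(10 * GIB, 20);
    println!("index: {} GiB, data: {} GiB", index_pool / GIB, data_pool / GIB);
}

Because the two pools are separate LRU caches, a burst of large data files can only evict other data files; entries in the index pool stay put.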

In practice:

  1. Indexes won't get evicted by large data file writes
  2. Historical indexes stay local—no waiting for the first query to trigger loading
  3. Fast recovery after restarts, no warm-up needed
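The preload behavior can be sketched in the same spirit: order index files so the newest data comes first, then fill the index pool until its capacity is reached. Again, this is a conceptual sketch based on the description above, not the actual implementation; the IndexFile fields are made up for illustration.

rust
// Conceptual sketch of startup preloading as described above;
// not GreptimeDB's actual code. Fields are made up for illustration.
struct IndexFile {
    path: String,
    size: u64,          // bytes
    max_timestamp: i64, // newest data covered by this index file
}

// Decide which index files to preload: newest data first, stopping once
// the next file would exceed the index pool's capacity.
fn plan_preload(mut files: Vec<IndexFile>, capacity: u64) -> Vec<IndexFile> {
    // Timestamp descending, i.e. newest first.
    files.sort_by(|a, b| b.max_timestamp.cmp(&a.max_timestamp));

    let mut used = 0u64;
    let mut plan = Vec::new();
    for f in files {
        if used + f.size > capacity {
            break; // index pool is full; stop preloading
        }
        used += f.size;
        plan.push(f);
    }
    // The downloads themselves would run asynchronously in the background,
    // so startup is never blocked on object storage.
    plan
}

fn main() {
    let files = vec![
        IndexFile { path: "a.puffin".into(), size: 300, max_timestamp: 100 },
        IndexFile { path: "b.puffin".into(), size: 300, max_timestamp: 200 },
        IndexFile { path: "c.puffin".into(), size: 300, max_timestamp: 300 },
    ];
    // With room for two files, the two newest ("c", then "b") are preloaded.
    let plan = plan_preload(files, 600);
    assert_eq!(plan.len(), 2);
    assert_eq!(plan[0].path, "c.puffin");
}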

Configuration

Basic Setup

When using object storage, Write Cache is enabled by default. Just configure the index ratio:

toml
[[region_engine]]
[region_engine.mito]
# Enabled by default for object storage
enable_write_cache = true

# Total Write Cache size
write_cache_size = "10GiB"

# Index file percentage (default 20, valid range 1-99)
index_cache_percent = 20

# Preload indexes on startup (default true)
preload_index_cache = true

With this configuration, a 10GiB Write Cache allocates 2GiB for indexes and 8GiB for data files.

Tuning the Ratio

Increase the ratio when:

  • Index cache hit rate drops below 80%
  • Total index size (index_length in information_schema.tables) exceeds the current cache allocation
  • Historical queries dominate
  • Complex query conditions with frequent index access

Decrease the ratio when:

  • Mostly querying recent data (hot data already in Write Cache)
  • Simple table structures with small index files
  • Limited local disk space

Monitoring Cache Performance

Use Prometheus metrics to track cache behavior. The type label distinguishes index and data files:

promql
# Index cache miss rate
rate(greptime_mito_cache_miss{type="index"}[5m])

# Data file cache miss rate
rate(greptime_mito_cache_miss{type="file"}[5m])

# Index cache usage
greptime_mito_cache_bytes{type="index"}

# Index cache hit rate
rate(greptime_mito_cache_hit{type="index"}[5m]) /
(rate(greptime_mito_cache_hit{type="index"}[5m]) + rate(greptime_mito_cache_miss{type="index"}[5m]))

If the index cache hit rate drops below 80% (empirical threshold), cache capacity may be insufficient. Increase write_cache_size or raise index_cache_percent.

Relationship with Other Caches

GreptimeDB has multiple cache layers. Understanding their roles helps with tuning:

toml
[[region_engine]]
[region_engine.mito]
# Disk-level cache
enable_write_cache = true
write_cache_size = "10GiB"
index_cache_percent = 20

# Memory-level cache
sst_meta_cache_size = "128MB"
vector_cache_size = "512MB"          # Tag column Arrow arrays
page_cache_size = "512MB"
selector_result_cache_size = "512MB" # File last row cache

# Index-specific memory cache
[region_engine.mito.index]
metadata_cache_size = "64MiB"
content_cache_size = "128MiB"

Cache responsibilities:

Cache                        Location  Purpose
Write Cache (data portion)   Disk      Avoids downloading data files from object storage
Write Cache (index portion)  Disk      Avoids downloading index files from object storage
metadata_cache_size          Memory    Avoids disk I/O for index metadata
content_cache_size           Memory    Avoids disk I/O for index content

Disk and memory caches complement each other: disk cache determines whether to download from object storage; memory cache determines whether to read from local disk. With ample disk space, increase write_cache_size. With ample memory, increase metadata_cache_size and content_cache_size.

Best Practices

Scenario 1: Trace Query Workloads

One user stores 10 billion traces on Alibaba Cloud OSS and needs to query individual traces by trace_id across the entire dataset. Querying index_length from information_schema.tables shows a total index size of ~106GB. Before upgrading, queries frequently timed out or took too long because indexes kept getting evicted from cache. After enabling independent index caching, queries now return within minutes.

Recommended configuration:

toml
[[region_engine]]
[region_engine.mito]
write_cache_size = "200GiB"
index_cache_percent = 30  # ~60GB for indexes, covers hot index data

[region_engine.mito.index]
metadata_cache_size = "128MiB"
content_cache_size = "256MiB"

Scenario 2: Real-time Monitoring

For dashboards that mostly query recent data, defaults work well:

toml
[[region_engine]]
[region_engine.mito]
write_cache_size = "10GiB"
index_cache_percent = 20

Upgrade Notes

When upgrading to 1.0.0-beta.1:

  1. No manual migration: The feature works out of the box; indexes load automatically in the background
  2. Expect startup I/O spike: First startup may show brief I/O spikes during preloading; monitor progress via greptime_mito_cache_fill_downloaded_files metric
  3. Check disk space: Ensure sufficient local disk for the index cache
  4. To disable preloading: Set preload_index_cache = false
  5. Cluster deployment: Configure on each Datanode; caches are not shared between nodes

Summary

Independent index file caching optimizes GreptimeDB for object storage: 20% of Write Cache reserved for indexes by default, background preloading on startup (newest first), tunable via index_cache_percent.

Expect faster historical queries after upgrading.

For more details, see the GreptimeDB Performance Tuning Guide.
