Starting with greptimedb#1733 last June, GreptimeDB has adopted Jemalloc as its default memory allocator. This change not only boosts performance and reduces memory fragmentation, but also provides convenient memory analysis capabilities.
In our previous article, Unraveling Rust Memory Leaks: Easy-to-Follow Techniques for Identifying and Solving Memory Issues, we explored several common methods for analyzing memory leaks in Rust applications.
In this article, I will delve into Jemalloc-based troubleshooting techniques in detail. If you encounter unusual memory usage while using or developing GreptimeDB, refer to this article for quick diagnostics and identification of potential memory leaks.
Preparations
Install tools
- Install the `flamegraph.pl` script

```shell
curl -s https://raw.githubusercontent.com/brendangregg/FlameGraph/master/flamegraph.pl > ${HOME}/.local/bin/flamegraph.pl
chmod +x ${HOME}/.local/bin/flamegraph.pl
export PATH=$PATH:${HOME}/.local/bin
```

`flamegraph.pl`, authored by Brendan Gregg, is a Perl script designed for visualizing hot spots in code call stacks. Brendan Gregg is an expert in system performance optimization, and we are grateful to him for developing and open-sourcing numerous tools, including `flamegraph.pl`.
- Install the `jeprof` command

```shell
# For Ubuntu
sudo apt install -y libjemalloc-dev
# For Fedora
sudo dnf install jemalloc-devel
```

For other operating systems, you can find the dependency packages providing `jeprof` through pkgs.org.
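Before moving on, it can save time to confirm that the tools are actually reachable on your `PATH`. Below is a small helper sketch; `check_tools` is my own name for it, not part of any of these projects:

```shell
# check_tools: report any of the given commands that are not on PATH.
# Returns non-zero if at least one is missing.
check_tools() {
  local missing=0 tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      missing=1
    fi
  done
  return $missing
}

# Example:
#   check_tools jeprof flamegraph.pl
```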
Enabling Heap Profiling in GreptimeDB
The heap profiling feature in GreptimeDB is turned off by default. You can enable it by turning on the `mem-prof` feature when compiling GreptimeDB:

```shell
cargo build --release -F mem-prof
```
The discussion about whether the `mem-prof` feature should be enabled by default is ongoing in greptimedb#3166. You are welcome to share your opinion there.
Starting GreptimeDB with the `mem-prof` Feature
To enable the heap profiling feature, you need to set the `MALLOC_CONF` environment variable when starting the GreptimeDB process:

```shell
MALLOC_CONF=prof:true <path_to_greptime_binary> standalone start
```
You can use the `curl` command to check whether heap profiling is enabled:

```shell
curl <greptimedb_ip>:4000/v1/prof/mem
```
If the heap profiling feature is turned on, the `curl` command should yield a response similar to the following:

```
heap_v2/524288
t*: 125: 136218 [0: 0]
t0: 59: 31005 [0: 0]
...
MAPPED_LIBRARIES:
55aa05c66000-55aa0697a000 r--p 00000000 103:02 40748099 /home/lei/workspace/greptimedb/target/debug/greptime
55aa0697a000-55aa11e74000 r-xp 00d14000 103:02 40748099 /home/lei/workspace/greptimedb/target/debug/greptime
```
If you receive the response `{"error":"Memory profiling is not enabled"}`, it indicates that the `MALLOC_CONF=prof:true` environment variable has not been set correctly.
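If you are scripting this check, the two responses above can be told apart mechanically. Here is a minimal sketch; the helper name `check_prof_enabled` is my own, and the matched strings come from the responses shown above:

```shell
# check_prof_enabled: succeed (exit 0) if the body returned by the
# /v1/prof/mem endpoint looks like a heap profile, and fail if it is
# the "Memory profiling is not enabled" error shown above.
check_prof_enabled() {
  case "$1" in
    *'"error":"Memory profiling is not enabled"'*) return 1 ;;
    heap_v2*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example:
#   resp=$(curl -s <greptimedb_ip>:4000/v1/prof/mem)
#   check_prof_enabled "$resp" && echo "profiling is on"
```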
For information on the data format returned by the heap profiling API, refer to the HEAP PROFILE FORMAT document on jemalloc.net.
Begin your Memory Exploration Journey
By running `curl <greptimedb_ip>:4000/v1/prof/mem`, you can quickly obtain details of the memory allocated by GreptimeDB. The tools `jeprof` and `flamegraph.pl` can then visualize these memory usage details as a flame graph:

```shell
# To get memory allocation details
curl <greptimedb_ip>:4000/v1/prof/mem > mem.hprof
# To generate a flame graph of memory allocation
jeprof <path_to_greptime_binary> ./mem.hprof --collapse | flamegraph.pl > mem-prof.svg
```
After executing the above commands, a flame graph named `mem-prof.svg` will be generated in the working directory.
How to Interpret the Flame Graph
Created by Brendan Gregg, the flame graph is a powerful tool for analyzing CPU overhead and memory allocation details. It is generated by recording the function call stack that triggers each sampled memory allocation event.
After a sufficient number of samples have been recorded, the call stacks of all allocations are merged, revealing how much memory each function call, together with its child calls, has allocated.
The bottom of the flame graph represents the base of the function stack, while the top represents the stack top.
Each cell in the flame graph represents a function call, with the cells below it being the callers of that function, and the cells above being the callees, the functions that it calls.
The width of a cell indicates the total amount of memory allocated by that function and its child functions. Wider cells indicate that those functions are allocating more memory. If some functions allocate a lot of memory but they do not have many child functions (as shown in the diagram, with wider stack tops in the flame graph, known as plateaus), it suggests that these functions themselves might have a substantial number of allocation operations.
The color of each cell in the flame graph is a random warm color.
Opening the flame graph's SVG file in a browser allows for interactive clicking into each function for more detailed analysis.
Accelerating Flame Graph Generation
The heap memory details returned by Jemalloc include the address of each function in the call stack. Generating the flame graph requires translating these addresses into file names and line numbers, which is the most time-consuming step. On Linux systems, this task is typically handled by the `addr2line` tool from GNU Binutils.
To speed up flame graph generation, we can replace the Binutils `addr2line` tool with gimli-rs/addr2line, achieving at least a 2x speedup:

```shell
git clone https://github.com/gimli-rs/addr2line
cd addr2line
cargo build --release
sudo cp /usr/bin/addr2line /usr/bin/addr2line-bak
sudo cp target/release/examples/addr2line /usr/bin/addr2line
```
Catching Memory Leaks through Allocation Differences
In most memory leak cases, the usage of memory tends to increase slowly. Therefore, during the process of memory growth, capturing memory usage at two different time points and analyzing the difference between them often points to potential memory leaks.
We can collect the memory data at the initial time point to establish a baseline:

```shell
curl -s <greptimedb_ip>:4000/v1/prof/mem > base.hprof
```
Later, once memory usage has grown in a way that suggests a possible leak, we collect the memory data again:

```shell
curl -s <greptimedb_ip>:4000/v1/prof/mem > leak.hprof
```
Then, using `base.hprof` as a baseline, analyze the memory usage and generate a flame graph:

```shell
jeprof <path_to_greptime_binary> --base ./base.hprof ./leak.hprof --collapse | flamegraph.pl > leak.svg
```
In a flame graph generated with the `--base` parameter specifying a baseline, only the memory allocation differences between the current collection and the baseline are included. This makes it much clearer which function calls are responsible for the growth in memory usage.
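Because leaks often develop over hours, it can help to automate the capture step so that any two snapshots can later be compared with `jeprof --base`. Below is a minimal sketch: the endpoint is the `/v1/prof/mem` API described above, while the host default, output directory, and file-naming scheme are illustrative assumptions, not GreptimeDB settings:

```shell
# snapshot_heap: dump one timestamped heap profile from a running
# GreptimeDB instance into $OUTDIR, printing the file it wrote.
# GREPTIME_HTTP and OUTDIR are illustrative defaults for this sketch.
GREPTIME_HTTP=${GREPTIME_HTTP:-127.0.0.1:4000}
OUTDIR=${OUTDIR:-./heap-snapshots}

snapshot_heap() {
  mkdir -p "$OUTDIR"
  local file
  file="${OUTDIR}/$(date +%Y%m%d-%H%M%S).hprof"
  curl -s "${GREPTIME_HTTP}/v1/prof/mem" > "$file" && echo "$file"
}

# Example: take one snapshot every 10 minutes until interrupted,
# then diff any pair with: jeprof <binary> --base <old> <new> --collapse
#   while snapshot_heap; do sleep 600; done
```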