Join us for a virtual meetup on Zoom at 8 PM, July 31 (PDT) about using One Time Series Database for Both Metrics and Logs 👉🏻 Register Now

Skip to content
On this page
Engineering
December 16, 2024

GreptimeDB Now Supports JSON Type for Complex Time-Series Data Structures

This article explores the new JSON type feature in GreptimeDB v0.10, using MySQL protocol examples to illustrate how to efficiently store and query semi-structured data, such as trace data.

GreptimeDB v0.10's JSON Support for Unified Time-Series Data

GreptimeDB v0.10(FYI. v0.11.0 already released) introduces support for the JSON type, enhancing its ability to handle both semi-structured and structured data. GreptimeDB's schemaless model, similar to MongoDB, supports dynamic table creation and automatic column insersion when writing through protocols like gRPC and Prometheus Remote Write, allowing it to adapt flexibly to changing data structures.

However, in many use cases, much of the data is structured, such as the typical Traces Data in OpenTelemetry:

json
{
  "name": "hello",
  "context": {
    "trace_id": "5b8aa5a2d2c872e8321cf37308d69df2",
    "span_id": "051581bf3cb55c13"
  },
  "parent_id": null,
  "start_time": "2022-04-29T18:52:58.114201Z",
  "end_time": "2022-04-29T18:52:58.114687Z",
  "attributes": {
    "http.route": "some_route1"
  },
  "events": [
    {
      "name": "Guten Tag!",
      "timestamp": "2022-04-29T18:52:58.114561Z",
      "attributes": {
        "event_attributes": 1
      }
    }
  ]
}

The context, attributes, and events are all JSON objects or arrays. Since GreptimeDB is a time-series database capable of unifying metrics, logs, events, and traces, it is essential to support the JSON type.

This article uses the MySQL protocol supported in GreptimeDB as an example, assuming that readers have already installed GreptimeDB v0.10 or later versions. For installation instructions, please refer to the documentation.

Creating JSON-Type Tables and Managing JSON Data with GreptimeDB

Creating tables with JSON Column in GreptimeDB

First, when creating a table, you can specify a column as the JSON type:

sql
CREATE TABLE traces(
   `name` STRING, 
    context JSON, 
    parent_id STRING, 
    start_time TIMESTAMP, 
    end_time TIMESTAMP, 
    attributes JSON, 
    events JSON, 
    TIME INDEX (start_time), 
    PRIMARY KEY(`name`, parent_id, context)
    );

In this example, we set context, attributes, and events as JSON types. The time index is set to start_time, and name, parent_id, and context are included in the Primary Key as tag columns.

Since name is a reserved keyword, it needs to be escaped using backticks.

Inserting Data

Now, let's try inserting data:

INSERT INTO traces
  VALUES
  (
     'hello',
     '{
       "trace_id": "5b8aa5a2d2c872e8321cf37308d69df2",
       "span_id": "051581bf3cb55c13"
      }',
      null,
      '2022-04-29T18:52:58.114201Z'::TIMESTAMP,
      '2022-04-29T18:52:58.114687Z'::TIMESTAMP,
      '{
        "http.route": "some_route1"
      }',
      '[
         {
           "name": "Guten Tag!",
           "timestamp": "2022-04-29T18:52:58.114561Z",
           "attributes": {
             "event_attributes": 1
           }
         }
      ]');

You can insert a JSON string as a JSON type, and GreptimeDB will automatically convert it to the JSON type.

Querying and Handling JSON Data

To query the table:

sql
mysql> SELECT * FROM traces\G;
*************************** 1. row ***************************
      name: hello
   context: {"span_id":"051581bf3cb55c13","trace_id":"5b8aa5a2d2c872e8321cf37308d69df2"}
 parent_id: NULL
start_time: 2022-04-29 18:52:58.114000
  end_time: 2022-04-29 18:52:58.114000
attributes: {"http.route":"some_route1"}
    events: [{"attributes":{"event_attributes":1},"name":"Guten Tag!","timestamp":"2022-04-29T18:52:58.114561Z"}]
1 row in set (0.01 sec)

ERROR:
No query specified

When querying table data, the JSON type is automatically converted to a string for display, as the type information can be retrieved from the table's schema.

GreptimeDB offers several JSON functions to assist with processing JSON data. For instance, parse_json can be used to convert a string into JSON:

sql
mysql> select parse_json('{"span_id":"051581bf3cb55c13","trace_id":"5b8aa5a2d2c872e8321cf37308d69df2"}');
+--------------------------------------------------------------------------------------------------+
| parse_json(Utf8("{"span_id":"051581bf3cb55c13","trace_id":"5b8aa5a2d2c872e8321cf37308d69df2"}")) |
+--------------------------------------------------------------------------------------------------+
| @          span_idtrace_id051581bf3cb55c135b8aa5a2d2c872e8321cf37308d69df2              |
+--------------------------------------------------------------------------------------------------+
1 row in set (0.01 sec)

Since GreptimeDB stores JSON data in binary format at the underlying level, the parse_json function returns a binary result.

The json_to_string function converts the JSON type into a string:

sql
mysql> select json_to_string(parse_json('{"span_id":"051581bf3cb55c13","trace_id":"5b8aa5a2d2c872e8321cf37308d69df2"}'));
+------------------------------------------------------------------------------------------------------------------+
| json_to_string(parse_json(Utf8("{"span_id":"051581bf3cb55c13","trace_id":"5b8aa5a2d2c872e8321cf37308d69df2"}"))) |
+------------------------------------------------------------------------------------------------------------------+
| {"span_id":"051581bf3cb55c13","trace_id":"5b8aa5a2d2c872e8321cf37308d69df2"}                                     |
+------------------------------------------------------------------------------------------------------------------+

You can extract specific fields from a JSON object using the json_get_ functions. For example:

sql
mysql> SELECT json_get_string(context, 'trace_id') FROM traces;
+--------------------------------------------------+
| json_get_string(traces.context,Utf8("trace_id")) |
+--------------------------------------------------+
| 5b8aa5a2d2c872e8321cf37308d69df2                 |
+--------------------------------------------------+
1 row in set (0.02 sec)


mysql> select json_get_string(parse_json('{"span_id":"051581bf3cb55c13","trace_id":"5b8aa5a2d2c872e8321cf37308d69df2"}'), 'span_id');
+-----------------------------------------------------------------------------------------------------------------------------------+
| json_get_string(parse_json(Utf8("{"span_id":"051581bf3cb55c13","trace_id":"5b8aa5a2d2c872e8321cf37308d69df2"}")),Utf8("span_id")) |
+-----------------------------------------------------------------------------------------------------------------------------------+
| 051581bf3cb55c13                                                                                                                  |
+-----------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.01 sec)

At this point, those familiar with MongoDB might wonder if we can directly query with SELECT context.trace_id FROM traces.

Unfortunately, this isn't possible for now. This is because we can't guarantee that the type of context.trace_id will be consistent across all rows. GreptimeDB's SQL query engine and results are strongly typed, meaning that users must explicitly specify the type they want to return.

If the field type doesn't match while using json_get_ functions, it will either attempt a type conversion or return null.

To test the compatibility of different JSON-type fields, let's insert data with an integer trace_id:

sql
INSERT INTO traces(name, context,start_time) 
  values ('world', '{"trace_id": 999}',now());

Next, let's query using json_get_string and json_get_int respectively:

sql
mysql> SELECT json_get_string(context, 'trace_id') FROM traces;
+--------------------------------------------------+
| json_get_string(traces.context,Utf8("trace_id")) |
+--------------------------------------------------+
| 999                                              |
| 5b8aa5a2d2c872e8321cf37308d69df2                 |
+--------------------------------------------------+
2 rows in set (0.02 sec)


mysql> SELECT json_get_int(context, 'trace_id') FROM traces;
+-----------------------------------------------+
| json_get_int(traces.context,Utf8("trace_id")) |
+-----------------------------------------------+
|                                          NULL |
|                                           999 |
+-----------------------------------------------+
2 rows in set (0.02 sec)

The results show that json_get_string successfully returns both rows, converting the integer 999 to a string and displaying it correctly. On the other hand, json_get_int correctly extracts the integer field 999, but since the other row's trace_id is a string and cannot be converted, it returns NULL.

Enhanced JSON Support in GreptimeDB Coming Versions

With the introduction of JSON type support in GreptimeDB v0.10, along with JSON functions, the system has enhanced its ability to handle both semi-structured and structured data. In the future, GreptimeDB will further improve its support for JSON types, such as enabling inverted indexing on specific fields within JSON objects to accelerate queries and introducing functions similar to Databend JQ function to simplify data processing.


About Greptime

Greptime offers industry-leading time series database products and solutions to empower IoT and Observability scenarios, enabling enterprises to uncover valuable insights from their data with less time, complexity, and cost.

GreptimeDB is an open-source, high-performance time-series database offering unified storage and analysis for metrics, logs, and events. Try it out instantly with GreptimeCloud, a fully-managed DBaaS solution—no deployment needed!

The Edge-Cloud Integrated Solution combines multimodal edge databases with cloud-based GreptimeDB to optimize IoT edge scenarios, cutting costs while boosting data performance.

Star us on GitHub or join GreptimeDB Community on Slack to get connected.

JSON type

Join our community

Get the latest updates and discuss with other users.