
Engineering
July 23, 2024

What is Semantic Convention in Observability and Why it Matters

Semantic conventions standardize naming in observability to ensure clarity and consistency in monitoring data. They help avoid miscommunication and enhance tool interoperability. This article explains what semantic conventions in observability are and why they matter.

Semantic Conventions in Observability

Semantic conventions provide agreed-upon meanings for words and phrases within a language or culture, facilitating clear communication. In programming, they are even more crucial: without conversational cues, names are easy to misread, which complicates code maintenance. In observability, semantic conventions are equally important for ensuring consistency and clarity.

Definition and Examples

Semantic conventions in observability are standardized sets of telemetry and attribute names. They define how users should name their observability data when monitoring common software or libraries.

For example, in a database-driven application, which is a widely adopted architecture in the industry, semantic conventions define standard names and value enumerations such as:

  • db.system: The database type (e.g., mysql or postgresql)
  • db.operation.name: The database operation (e.g., SELECT)

These standardized names can be used across various observability data types:

  • Metrics: As metric names or label names and values
  • Logs/Events: As field names and values
  • Traces: As event field or attribute field names and values
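
As a minimal sketch of how the same two names can appear in all three data types, the snippet below builds a metric sample, a log event, and a span as plain Python dictionaries. The record shapes, the metric name `example.db.query.duration`, the span name, and the helper functions are hypothetical and exist only for illustration; only `db.system` and `db.operation.name` come from the list above.

```python
# Attribute names taken from the convention list above.
DB_SYSTEM = "db.system"
DB_OPERATION_NAME = "db.operation.name"


def db_attributes(system: str, operation: str) -> dict:
    """Return the convention-named attributes for one database call."""
    return {DB_SYSTEM: system, DB_OPERATION_NAME: operation}


attrs = db_attributes("postgresql", "SELECT")

# Metric: the names become label keys (the metric name is hypothetical).
metric_sample = {"name": "example.db.query.duration", "labels": attrs, "value": 0.012}

# Log/event: the names become field keys.
log_event = {"message": "query finished", **attrs}

# Trace: the names become span attribute keys (the span name is hypothetical).
span = {"name": "SELECT orders", "attributes": attrs}


def db_system_of(record: dict) -> str:
    """Read db.system from any of the three record shapes above."""
    containers = (record, record.get("labels", {}), record.get("attributes", {}))
    for container in containers:
        if DB_SYSTEM in container:
            return container[DB_SYSTEM]
    raise KeyError(DB_SYSTEM)
```

Because every record uses the same key, a single lookup such as `db_system_of` works on metrics, logs, and spans without any per-signal translation.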

Industry Standards

Several widely adopted semantic conventions exist in the industry. The OpenTelemetry semantic conventions, which this article refers to later, are one example.

Concept Architecture: The V-Model

To illustrate the relationship between various observability concepts, we can use the V-model architecture.

As shown in the diagram below, the left half of the letter "V" represents the data collection chain, while the right half represents the data application chain. Data is collected and transmitted from the upper left side, then stored. From storage, it is accessed via query languages and APIs, and finally applied in monitoring products such as dashboards and alert systems.

Observability V-model

Semantic Layer

  • Responsible for collecting data and providing data applications (analytics, dashboarding, alerting)
  • Understands data and its meaning
  • Example: A metric named db.connection.pool.active represents active connections in the application's connection pool
  • Enables building domain-specific applications for data insights and suggestions

Protocol Layer

  • Adds abstraction and is responsible for moving data from API/SDK via network to collectors
  • Typically doesn't need to understand the meaning of particular values
  • Deals with concepts like metrics (counters, histograms), logs, and traces
  • Defines query languages and transport APIs for data extraction
  • Most OpenTelemetry specifications focus on this layer

Storage Layer

  • Simpler and sometimes agnostic to observability
  • Views data by its model (time series or table) and data types (strings, floats)
  • Observability-focused databases may include features for efficient querying and data retrieval
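
The division of knowledge between the three layers can be sketched roughly as follows. This is an illustration under stated assumptions, not how any particular product is implemented: the `Row` and `GaugePoint` types, the choice to map a metric name to a table name, the `pool_size` parameter, and the 0.9 threshold are all hypothetical. Only the metric name `db.connection.pool.active` comes from the example above.

```python
from dataclasses import dataclass


@dataclass
class Row:
    """Storage layer view: a table, a timestamp, string tags, and a float."""
    table: str
    timestamp_ms: int
    tags: dict
    value: float


@dataclass
class GaugePoint:
    """Protocol layer view: a gauge data point with attributes."""
    name: str
    attributes: dict
    timestamp_ms: int
    value: float


def to_row(point: GaugePoint) -> Row:
    """Protocol -> storage: here the metric name simply becomes the table name."""
    return Row(point.name, point.timestamp_ms, dict(point.attributes), point.value)


# Semantic layer knowledge: what this particular name means.
ACTIVE_CONNECTIONS = "db.connection.pool.active"


def pool_is_saturated(row: Row, pool_size: int, threshold: float = 0.9) -> bool:
    """Interpret the value as active connections in a pool of pool_size."""
    if row.table != ACTIVE_CONNECTIONS:
        raise ValueError(f"expected {ACTIVE_CONNECTIONS}, got {row.table}")
    return row.value / pool_size >= threshold


point = GaugePoint(ACTIVE_CONNECTIONS, {"db.system": "mysql"}, 1721728800000, 19.0)
row = to_row(point)
```

Neither `GaugePoint` nor `Row` needs to know what the number 19.0 means. Only `pool_is_saturated`, which stands in for the semantic layer, relies on the agreed meaning of the name.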

Why semantic convention matters

Historically, the industry lacked a widely recognized standard for the semantic layer. This led to issues such as:

  • Organizations defining their own metric names and labels
  • Custom-built applications for dashboards and alerting
  • Vendor lock-in (e.g., Datadog's proprietary conventions)

A standard semantic layer offers several benefits:

  • Interoperability: Sharing agents and applications across different backends
  • Consistency and ecosystem: Building standard observability solutions for common middleware and infrastructures
  • Simplified Adoption: Organizations can leverage pre-built dashboards, alerts, and analytics that understand these conventions
  • Optimization: Enables optimized data storage and computation in the other layers
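
To make the interoperability point concrete, here is a small sketch that compares records following the convention with records that use ad-hoc names. The `service` field, the custom keys `database_kind` and `dbType`, and both helper functions are hypothetical; the sketch assumes records are plain dictionaries.

```python
from collections import Counter

# Records that already follow the convention.
standard_records = [
    {"service": "checkout", "db.system": "mysql"},
    {"service": "billing", "db.system": "postgresql"},
    {"service": "search", "db.system": "mysql"},
]

# Records from teams that invented their own names (hypothetical keys).
custom_records = [
    {"service": "checkout", "database_kind": "mysql"},
    {"service": "billing", "dbType": "postgresql"},
]

# Without a shared convention, someone has to maintain a mapping like this.
CUSTOM_KEY_MAPPING = {"database_kind": "db.system", "dbType": "db.system"}


def normalize(record: dict, mapping: dict) -> dict:
    """Rename ad-hoc keys to their convention names; keep other keys as-is."""
    return {mapping.get(key, key): value for key, value in record.items()}


def count_by_db_system(records: list) -> Counter:
    """A reusable 'dashboard query' written once against the standard name."""
    return Counter(r["db.system"] for r in records if "db.system" in r)
```

`count_by_db_system` works on `standard_records` directly, but it finds nothing in `custom_records` until each record has been passed through `normalize` with a hand-maintained mapping. That mapping is the extra cost a shared convention removes.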

How GreptimeDB works for semantic conventions

GreptimeDB, the time series database that works for various types of observability data like metrics, logs, and events, has benefited from the establishment of a well-defined semantic standard in the industry.

For instance, GreptimeDB has a built-in ETL engine that parses unstructured logs into standard events. We recommend naming your fields according to the OpenTelemetry semantic conventions so that you can benefit from standard dashboards and analytics products in the future. We are also likely to provide upper-layer cloud applications for data that is compatible with the semantic standard. Stay tuned for future updates!
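
The idea of turning an unstructured log line into an event with convention-named fields can be sketched in plain Python. This is not GreptimeDB's ETL engine or its configuration syntax. The log line format, the `timestamp` and `duration_ms` field names, and the `parse_line` function are assumptions made for this example; only `db.system` and `db.operation.name` come from the conventions discussed earlier.

```python
import re
from typing import Optional

# Hypothetical log format: "<timestamp> <db system> <OPERATION> <duration>ms"
LOG_PATTERN = re.compile(
    r"(?P<timestamp>\S+) (?P<system>\w+) (?P<operation>[A-Z]+) (?P<duration_ms>\d+)ms"
)


def parse_line(line: str) -> Optional[dict]:
    """Parse one log line into an event, or return None if it does not match."""
    match = LOG_PATTERN.fullmatch(line.strip())
    if match is None:
        return None
    return {
        "timestamp": match.group("timestamp"),
        # Convention-named fields, so standard tooling can recognize them.
        "db.system": match.group("system"),
        "db.operation.name": match.group("operation"),
        # Hypothetical field name for the measured duration.
        "duration_ms": int(match.group("duration_ms")),
    }


event = parse_line("2024-07-23T10:00:00Z mysql SELECT 12ms")
```

Naming the extracted fields this way at parse time means downstream dashboards and analytics do not need to know anything about the original log format.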


About Greptime

We help industries that generate large amounts of time-series data, such as Connected Vehicles (CV), IoT, and Observability, to efficiently uncover the hidden value of data in real-time.

Visit the latest version from any device to get started and get the most out of your data.

  • GreptimeDB, written in Rust, is a distributed, open-source, time-series database designed for scalability, efficiency, and powerful analytics.
  • Edge-Cloud Integrated TSDB is designed for the unique demands of edge storage and compute in IoT. It tackles the exponential growth of edge data by integrating a multimodal edge-side database with cloud-based GreptimeDB Enterprise. This combination reduces traffic, computing, and storage costs while enhancing data timeliness and business insights.
  • GreptimeCloud is a fully-managed cloud database-as-a-service (DBaaS) solution built on GreptimeDB. It efficiently supports applications in fields such as observability, IoT, and finance.

Star us on GitHub or join the GreptimeDB community on Slack to get connected. You can also visit our contribution page to find interesting issues to start with.
