OpenTelemetry is an open-source observability framework used to collect, process, and export telemetry data from distributed systems. It allows developers to instrument their applications with standardized, vendor-neutral APIs and SDKs, enabling them to generate telemetry data that can be used for monitoring, debugging, and optimizing their systems.
OpenTelemetry provides a unified approach to observability by supporting a variety of telemetry data types, including traces, metrics, and logs. It also supports a wide range of programming languages and cloud platforms, making it a versatile solution for developers building modern cloud-native applications.
OpenTelemetry comprises several key elements, including a cross-language specification, SDKs for various programming languages, tools for collecting, transforming, and exporting telemetry data, as well as automatic instrumentation and contrib packages. By utilizing OpenTelemetry, developers can eliminate the need for vendor-specific SDKs and tools to generate and export telemetry data.
The Collector in OpenTelemetry is a component that facilitates the management and processing of telemetry data. It can receive telemetry data from various sources, including instrumented applications and services, and then preprocess, filter, and route the data to appropriate destinations. The Collector also supports data transformation, aggregation, and buffering, as well as integration with various backends and exporters. It provides a flexible and scalable approach to telemetry data collection and management in distributed systems.
Automatic instrumentation in OpenTelemetry generates telemetry data from an application without changes to its source code. By using auto-instrumentation libraries or agents, developers can add OpenTelemetry instrumentation to their applications without manually instrumenting each function or method.
The automatic instrumentation libraries or agents can automatically instrument common frameworks and libraries used in modern applications, including databases, message queues, and web servers, to generate consistent telemetry data. This makes it easier to monitor and debug complex distributed systems.
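The idea of wrapping library calls without touching application code can be illustrated with a toy sketch. This is not the OpenTelemetry API: `instrument`, `RECORDED_SPANS`, and `FakeDbClient` are hypothetical names invented for this example, and real agents rely on more robust mechanisms than this simple method replacement.

```python
import functools
import time

# Hypothetical in-memory sink; a real SDK would hand spans to an exporter.
RECORDED_SPANS = []


class FakeDbClient:
    """Stand-in for a third-party library the application already uses."""

    def query(self, sql):
        return f"rows for: {sql}"


def instrument(cls, method_name):
    """Replace cls.method_name with a wrapper that records timing data."""
    original = getattr(cls, method_name)

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        start = time.perf_counter()
        try:
            return original(self, *args, **kwargs)
        finally:
            # Record the call even if the wrapped method raised.
            RECORDED_SPANS.append({
                "name": f"{cls.__name__}.{method_name}",
                "duration_s": time.perf_counter() - start,
            })

    setattr(cls, method_name, wrapper)


# The "agent" patches the library once at startup...
instrument(FakeDbClient, "query")

# ...and the application code stays unchanged.
result = FakeDbClient().query("SELECT 1")
```

After the call, `RECORDED_SPANS` holds one entry named `FakeDbClient.query`, even though the calling code never mentions telemetry.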
OpenTelemetry provides language-specific SDKs for various programming languages, including Java, Python, Go, C++, .NET, Node.js, Ruby, and PHP. These SDKs allow developers to instrument their applications with standardized, vendor-neutral APIs and generate telemetry data, which can then be collected, processed, and exported by the OpenTelemetry Collector. The language SDKs support various telemetry data types, such as traces, metrics, and logs, and can be integrated with popular frameworks and libraries.
How Does the OpenTelemetry Collector Work?
The OpenTelemetry project simplifies the collection of telemetry data via the OpenTelemetry Collector. This collector provides a vendor-agnostic approach for receiving, processing, and exporting telemetry data, eliminating the need for multiple agents or collectors to support open-source observability data formats such as Jaeger and Prometheus. The Collector also empowers end-users by giving them control over their data.
The Collector comprises the following components.
Receivers are modules that enable the Collector to receive telemetry data from various sources. They are responsible for ingesting data from different protocols, such as HTTP, gRPC, and UDP. Receivers support various telemetry data types, such as traces, metrics, and logs, and can be used to receive data from instrumented applications or services. The Collector supports multiple receivers, allowing it to receive data from different sources simultaneously.
Exporters are modules that allow the Collector to export telemetry data to various backends, such as monitoring systems, storage systems, or analysis tools. Exporters can transform and format telemetry data to fit the requirements of different backends. The Collector supports multiple exporters, enabling it to export data to multiple destinations simultaneously.
Processors are modules that enable the Collector to modify and enrich telemetry data before exporting it to backends. Processors can filter, aggregate, sample, and transform telemetry data to fit the requirements of different use cases. For example, processors can aggregate metrics, add context to trace data, or filter out irrelevant logs.
Extensions are optional modules that provide Collector functionality that is not directly involved in processing telemetry data. Typical examples include health checks, service discovery, and authentication for receivers and exporters. Because extensions sit outside the data path, they can be added or removed without changing how telemetry flows through the Collector.
The service section of the Collector configuration determines which of the configured components are actually enabled. It includes pipelines and extensions. Pipelines are configurable sequences of receivers, processors, and exporters that define how telemetry data is processed and exported, and each pipeline handles one type of telemetry data, such as traces, metrics, or logs. Extensions are enabled in the service section alongside the pipelines rather than inside them. The service section provides a unified, modular approach to telemetry data management, making it easy to configure, customize, and extend the Collector to meet specific requirements.
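The flow from receivers through processors to exporters described above can be modeled in a few lines. The following is a toy model in plain Python, not Collector code or Collector configuration: the `Pipeline` class, the component functions, and the record fields are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, object]


@dataclass
class Pipeline:
    """Toy model: data flows receivers -> processors (in order) -> exporters."""
    receivers: List[Callable[[], List[Record]]]
    processors: List[Callable[[List[Record]], List[Record]]]
    exporters: List[Callable[[List[Record]], None]]

    def run_once(self) -> None:
        batch: List[Record] = []
        for receive in self.receivers:
            batch.extend(receive())
        for process in self.processors:
            batch = process(batch)
        for export in self.exporters:
            export(list(batch))  # each exporter gets its own shallow copy


# Hypothetical components, invented for this example.
def app_receiver() -> List[Record]:
    return [
        {"name": "GET /health", "duration_ms": 2},
        {"name": "GET /orders", "duration_ms": 140},
    ]


def drop_health_checks(batch: List[Record]) -> List[Record]:
    return [r for r in batch if r["name"] != "GET /health"]


def add_environment(batch: List[Record]) -> List[Record]:
    return [{**r, "env": "staging"} for r in batch]


backend_a: List[Record] = []
backend_b: List[Record] = []

pipeline = Pipeline(
    receivers=[app_receiver],
    processors=[drop_health_checks, add_environment],
    exporters=[backend_a.extend, backend_b.extend],
)
pipeline.run_once()
```

Both `backend_a` and `backend_b` end up with the single enriched `GET /orders` record, which mirrors how one pipeline can filter and enrich data once and then send it to several destinations.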
Architecture of an OpenTelemetry Client
The architecture of an OpenTelemetry client is organized into “signals.” In the context of observability, signals are the data streams or telemetry data generated by an application or service that can be collected, processed, and analyzed to gain insights into the performance, behavior, and state of the system.
The OpenTelemetry framework defines three types of signals for telemetry data: traces, metrics, and logs.
Traces: These represent the distributed flow of a transaction or request through a system, including the various services, components, and dependencies that handle the transaction. Traces are composed of spans, each of which represents a single unit of work within the transaction. Spans can include information such as timestamps, attributes, and events.
Metrics: These represent quantitative measurements of the performance, behavior, or state of a system or service. Metrics can include information such as counters, gauges, and histograms, and are typically aggregated over a period of time.
Logs: These represent timestamped records generated by applications or services, either structured or unstructured, including errors, warnings, and informational messages. Logs can include information such as timestamps, severity levels, and message contents.
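To make the relationship between a trace and its spans concrete, here is a minimal sketch of the data involved. It is a simplified model, not the OpenTelemetry data model or API: the `Span` class, its field names, and the `example.rows` attribute key are assumptions chosen for illustration.

```python
import time
import uuid
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Span:
    """One unit of work; spans sharing a trace_id form a trace."""
    name: str
    trace_id: str
    parent_id: Optional[str] = None
    span_id: str = field(default_factory=lambda: uuid.uuid4().hex[:16])
    start_ns: int = field(default_factory=time.time_ns)
    end_ns: Optional[int] = None
    attributes: Dict[str, object] = field(default_factory=dict)

    def child(self, name: str) -> "Span":
        # A child keeps the trace ID and points back at its parent span.
        return Span(name=name, trace_id=self.trace_id, parent_id=self.span_id)

    def end(self) -> None:
        self.end_ns = time.time_ns()


# A request handled by a service that makes one database call.
root = Span(name="checkout", trace_id=uuid.uuid4().hex)
db = root.child("SELECT orders")
db.attributes["example.rows"] = 3  # illustrative key, not a semantic convention
db.end()
root.end()

trace = [root, db]
```

The shared `trace_id` ties the spans together, while `parent_id` records which unit of work triggered which, so a backend can rebuild the request as a tree.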
Within the signals, there are several packages that manage different aspects of each signal:
API packages: These packages define a standardized set of interfaces and methods for generating telemetry data, such as traces, metrics, and logs. The API packages ensure that telemetry data is generated in a consistent format across different languages and platforms, enabling easy integration with the OpenTelemetry Collector.
SDKs: These implement the API for each language. They provide functionality for managing the lifecycle of telemetry data, including sampling, context propagation, data aggregation, and export. Automatic instrumentation for popular libraries and frameworks, such as databases, message queues, and web servers, is typically distributed as separate packages that build on the SDK.
Semantic conventions: OpenTelemetry defines keys and values describing commonly observed concepts, protocols, and operations. These conventions include Resource Conventions, Span Conventions, and Metrics Conventions.
Contribution packages: These enable users to extend the functionality of the SDKs and the OpenTelemetry Collector. They can provide additional instrumentation for popular libraries and frameworks, or add new functionality to the SDKs, such as custom exporters or processors.
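The split between API packages and SDKs listed above can be sketched as an interface with a do-nothing default plus a concrete implementation that the application installs at startup. This is a toy illustration of the pattern only: `Tracer`, `RecordingTracer`, `set_tracer`, `get_tracer`, and `handle_request` are hypothetical names, not the actual OpenTelemetry classes or functions.

```python
from typing import List


class Tracer:
    """'API' layer: the interface that instrumented code depends on."""

    def start_span(self, name: str) -> None:
        # The default does nothing, so library code can call it safely
        # even when the application has not installed an implementation.
        return None


class RecordingTracer(Tracer):
    """'SDK' layer: a concrete implementation chosen by the application."""

    def __init__(self) -> None:
        self.started: List[str] = []

    def start_span(self, name: str) -> None:
        self.started.append(name)


_tracer: Tracer = Tracer()


def set_tracer(tracer: Tracer) -> None:
    global _tracer
    _tracer = tracer


def get_tracer() -> Tracer:
    return _tracer


def handle_request(path: str) -> str:
    """Library code: written only against the 'API' layer."""
    get_tracer().start_span(f"handle {path}")
    return "ok"


before = handle_request("/a")  # no implementation installed: nothing recorded

sdk = RecordingTracer()
set_tracer(sdk)                # the application opts in at startup
after = handle_request("/b")   # same library code now produces telemetry
```

Only the second call is recorded in `sdk.started`. Keeping the interface separate from the implementation is what lets libraries ship instrumentation without forcing a particular telemetry backend, or any overhead, on applications that do not configure one.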