The roadmap to v1 for the OpenTelemetry Collector

The OpenTelemetry Collector is a very popular component in OpenTelemetry that has been under heavy development for quite some time. It is a binary that allows many formats of telemetry to be sent to it, transformed, and emitted to a destination. Much has been said about the Collector over the past few years in various blog posts and talks. Here’s a small list of talks about the Collector if you haven’t had the chance to learn about it:

The Collector has been a core component for organizations looking to adopt OpenTelemetry as part of their strategy to improve the telemetry emitted by their systems. Organizations around the world have already adopted it and successfully process large amounts of data through pipelines as documented by these various talks:

A few months ago, there was an ask from the community to declare the OpenTelemetry Collector stable.

Can haz Collector v1?

Now you might be asking yourself “Why would anyone want the Collector to be declared stable? You just told me it’s already used in production!” It’s true: the Collector and its configuration have been fairly stable for core components for some time. However, being “unofficially stable” is not good enough for a wide variety of organizations that wish to adopt the Collector:

  • An official v1 will signal that the OpenTelemetry community is ready to provide long-term support and will not introduce backwards-incompatible changes without bumping the major version.
  • Organizations with policies not to use pre-release software will be able to start adopting the Collector.
  • Stability in the Collector helps the community move OpenTelemetry to become a CNCF Graduated project.

The request to stabilize was met with pushback from maintainers since calling anything 1.0 has a way of setting expectations indefinitely. This led to a series of discussions and meetings that brought together the maintainers of the Collector to decide on what a 1.0 really means for the Collector.

After a lot of back and forth, we settled on a limited scope to focus on:

  1. A distribution of the Collector that only includes an OTLP receiver and an OTLP exporter.
  2. Individual Go modules that the Collector components rely upon must also be marked as stable as per the project’s versioning guidelines.
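
To make the first item concrete, here is a minimal sketch of the kind of configuration such an OTLP-only distribution would run. It is an illustration rather than an official example from the roadmap: the exporter endpoint `backend.example.com:4317` is a placeholder, and the listening addresses are simply the conventional OTLP ports.

```yaml
# Illustrative only: receive OTLP, forward OTLP.
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317   # conventional OTLP/gRPC port
      http:
        endpoint: 0.0.0.0:4318   # conventional OTLP/HTTP port

exporters:
  otlp:
    endpoint: backend.example.com:4317   # placeholder destination (assumption)

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlp]
    metrics:
      receivers: [otlp]
      exporters: [otlp]
    logs:
      receivers: [otlp]
      exporters: [otlp]
```

Each pipeline wires the single receiver to the single exporter, which is all a distribution limited to these two components can express.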

Aside from this, there were a few areas the contributors wanted to improve based on user feedback:

  • The telemetry generated by the Collector about itself:
    • Traces, metrics, and logs must be available via OTLP.
    • The configuration for the telemetry must follow the configuration schema.
  • The scalability of the Collector:
    • Handling for queueing, back pressure, and errors must be improved.
    • Clear benchmarks and performance expectations must be available to end users.
  • Overall documentation befitting a stable piece of critical infrastructure.

The roadmap was published in the Collector’s repository and milestones were created to track the work underway. To ensure the effort can be successful, the scope of the deliverable was limited to provide:

  • a clear and achievable goal
  • the focus needed to not get distracted
  • a signal to new contributors of where the project is focusing

There is much to do, as you can see on the project board, but there is a lot of excitement around this effort. If you’re keen on helping, reach out either by commenting on any of the open issues in GitHub or by attending the Collector SIG call that happens weekly on Wednesdays. For a quick overview of the 1.0 progress, you can check out the tracking issue.

Last modified May 6, 2024: add roadmap blog post (#4419) (62debcd2)