Hacker News

I've noticed that the orgs where I've worked range from being totally insensitive to observability cost to being real hardasses about it. But I think most smaller shops fall into the former category. I've even heard crazy shit in meetings like "It's very low overhead, only about 5%", which would get you laughed out of the office at, say, Google. Unfortunately (to me) the focus on ease-of-use has meant that OpenTelemetry concepts are structured in such a way to preclude even the possibility of a very efficient implementation, which means that there will be a schism between people who are happy in the otel ecosystem and people who can't use it on cost grounds, and the latter will probably splinter into distinct home-grown solutions.


> Unfortunately (to me) the focus on ease-of-use has meant that OpenTelemetry concepts are structured in such a way to preclude even the possibility of a very efficient implementation

Curious what you mean about the design of OpenTelemetry precluding efficient implementation?


It’s also about where your costs are. Google has much less revenue per compute-time/request/whatever than most other companies. If you target the business-to-business market, your computing costs are usually negligible while your developers cost a lot of money. Throwing more hardware at the problem is often the most economical solution.


You're correct that a 5% increase in resource usage is probably not noticeable for most orgs.

It's important to know what audience you're building for. I believe the audience for otel consists largely of companies that don't look anything like Google. So it's fine to sacrifice that last bit of perf gain if it means the code is easier to use and maintain.

FWIW, it also appears that companies like Google would fork or reimplement such systems anyway.


I want to point out that OpenTelemetry is explicitly designed to decouple the API from the SDK, allowing for other people to reimplement parts of the SDK as needed while maintaining compatibility with not only other SDK components, but also the overall ecosystem. This was one of the major changes we introduced as part of the OpenTracing/OpenCensus merger.
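To make the API/SDK split concrete, here is a minimal sketch in Python. It is not the real `opentelemetry` package: `NoopTracer`, `RecordingTracer`, `set_tracer`, `get_tracer`, and `handle_request` are hypothetical names made up for illustration. The idea is only that instrumented code talks to a small API surface, and whatever sits behind it can be swapped without touching that code.

```python
from contextlib import contextmanager
import time


class NoopTracer:
    """Hypothetical default used when no SDK is installed: records nothing."""

    @contextmanager
    def span(self, name):
        yield


class RecordingTracer:
    """Hypothetical stand-in for an SDK: keeps finished spans in memory."""

    def __init__(self):
        self.finished = []  # list of (span name, duration in seconds)

    @contextmanager
    def span(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            self.finished.append((name, time.perf_counter() - start))


# The "API" layer: a process-wide tracer that defaults to the no-op.
_tracer = NoopTracer()


def set_tracer(tracer):
    """Install a different implementation behind the same API."""
    global _tracer
    _tracer = tracer


def get_tracer():
    return _tracer


def handle_request(payload):
    # Instrumented code only depends on get_tracer() and span().
    with get_tracer().span("handle_request"):
        return payload.upper()
```

With nothing installed, `handle_request("ping")` runs against the no-op. After `set_tracer(RecordingTracer())`, the same call records a span named `handle_request`, and a leaner implementation could be dropped in the same way.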


> Unfortunately (to me) the focus on ease-of-use has meant that OpenTelemetry concepts are structured in such a way to preclude even the possibility of a very efficient implementation

Looks like OpenTelemetry (or at least its predecessor, OpenCensus) originated at Google. From the OpenCensus website (https://opencensus.io/):

> OpenCensus and OpenTracing have merged to form OpenTelemetry

> OpenCensus originates from Google, where a set of libraries called Census are used to automatically capture traces and metrics from services.

The original internal Google tracing system was probably designed for scale, and OpenTelemetry's design is probably based on that internal system.

So maybe poor performance is just an implementation issue.


OpenCensus bears no resemblance whatsoever to Google Census, except for the name. Census at Google is wired tight as hell. Any time it showed up on the first page of fleet-wide profiles it would get hammered back down. At the same time it also has more capabilities than its open source successors. Unfortunately, nothing has ever been published about it.



