Day Two Cloud 192: OpenTelemetry – Getting From Visibility To Observability With Ben Hall

Ned
Bellavance

Ethan
Banks

Listen, Subscribe & Follow:
Apple Podcasts Spotify Overcast Pocket Casts RSS

OpenTelemetry is an open-source project that brings together tools, SDKs, and APIs for collecting telemetry (that is, logs, metrics, and traces) in a standardized way. The goal of the project is to help developers and operators instrument highly distributed applications and services to understand dependencies, monitor performance, and quickly troubleshoot problems.

On today’s Day Two Cloud podcast we explore OpenTelemetry and how it works. We also discuss the difference between visibility and observability, and why this matters. Guest Ben Hall sees visibility as measuring something that’s happening (for example CPU or memory utilization on a host) while observability is understanding why something is happening. Ben is a Principal Software Engineer at Homes England.

We discuss:

  • The origins of OpenTelemetry
  • OpenTelemetry’s key components
  • What information operators can learn from logs, metrics, and traces
  • Working with OpenTelemetry collectors
  • Industry support for OpenTelemetry
  • How to get started
  • More

Takeaways:

1. OpenTelemetry offers a unified standard for observability, enabling developers to gain greater insight into their systems and troubleshoot issues more efficiently across tech stack boundaries.

2. The OpenTelemetry Collector is a versatile Swiss Army knife that simplifies the transition to OpenTelemetry and enables gradual, seamless integration with existing and new monitoring and observability tools.

3. OpenTelemetry promotes a clear division of responsibilities between tracing and logging, allowing developers to focus on capturing the right information for each purpose. See it as training wheels – encouraging us to improve our logging practices, and reduce “log trash” to maintain more meaningful logs, while leveraging tracing for efficient transaction tracking and correlation.

Show Links:

OpenTelemetry.io

@benhall_io – Ben Hall on Twitter

Ben Hall on LinkedIn

Failing Fast.io – Ben’s blog

The Modern Observability Problem

OpenTelemetry, The Missing Ingredient

Transcript:

[00:00:01.310] – Ethan
Welcome to Day Two Cloud. We've got a good one for you today. We're talking about the open source project OpenTelemetry, and I honestly think this is going to be a very big deal, very impactful for you if you're in cloud, the cloud native world, et cetera. OpenTelemetry is going to impact you one way or the other. Would you agree, Ned?

[00:00:20.380] – Ned
Yeah, I always think of monitoring as super boring, but it turns out it’s kind of interesting and that’s why we ended up with such a robust episode with our guest today.

[00:00:30.940] – Ethan
Our guest is Ben Hall. He is a principal software engineer at Homes England, and he'll talk about that in just a moment. Enjoy this conversation with Ben Hall. Ben Hall, welcome to Day Two Cloud, sir. Thank you for making the time. And you've not been on the show before, so let's introduce you to the audience. Tell the nice people listening who you are and what you do.

[00:00:53.970] – Ben
Yeah, as you said, my name is Ben Hall. I’m a principal software engineer at Homes England, which is a UK public body. And we work to stimulate the housing market by increasing the supply of affordable homes and creating sustainable communities across England. And in my past life, I was a school teacher.

[00:01:12.470] – Ethan
Okay, you said principal software engineer. Is that a distinct, like, tier of software engineer, principal as opposed to a different sort?

[00:01:22.330] – Ben
I suppose so, yeah. I think it depends on the company, the organization you work at, how they use these terms. Some people will treat principal as an architectural role in place of solutions architects. Some might use it as a sort of senior technical position. And others much more managerial, more akin to maybe head of software driving practice. I consider myself quite fortunate that I get to do all three of those best I can, balancing my time between the three of those.

[00:01:51.890] – Ethan
So there’s an architecture role here. Ned and I and most of our audience are more on the infrastructure side of things. So when you say architecture to us, what we’re thinking about is how to build a system end to end that’s going to deliver an application that Ben Hall might be writing or leading a team of developers to write, perhaps. So when you say software architecture, what is that at a very high level? What’s that look like?

[00:02:16.680] – Ben
I think the line is really blurred. I've always struggled to understand the different types of architects. You see all these subtly different job titles. I suppose in my world, an enterprise architect would start looking at that much bigger picture about how all the systems hook together, and then solutions architects and a principal might get involved and zoom in a little bit closer on a particular service and the technology stack being used to build it. So in a .NET Azure world, I might be consulted as to whether a Logic App or a Function would be a good way to solve a particular problem.

[00:02:48.310] – Ethan
Right, okay. So our worlds overlap quite a bit then. You're just looking at it from a different point of view. What an infrastructure architect and a software architect might be doing are somewhat different things, different focuses, but I think there's a similar result in a lot of ways. And a lot of that overlap, if we were to draw a Venn diagram, would be in our topic today: visibility, observability, understanding what's going on within that architecture. Now, the premise of our recording today is the two articles you wrote on OpenTelemetry, the open standard here for telemetry, getting a look inside the black box at what's going on. So we need to set the stage here for people with some baseline definitions. And one of those is the difference between visibility and observability. Ben, that's one of the very first things you bring out in your two-part blog series. Can you explain your perspective on that?

[00:03:44.310] – Ben
Yeah, I think it's important to have that understanding before you dive in. So visibility is just knowing that something's happened. If I was being a bit of a cynical dev, historically maybe we would have seen the ops guys in that half of DevOps doing a lot of that stuff. A lot of those monitoring tools telling us how healthy the infrastructure is, performance measures, how's the CPU doing, how's the hard drive space, that sort of stuff. And a lot of those tend to be the predictable problems. You know that a processor being overused on a box is something that can happen, so you might put an alert in place that warns you when that happens. So you would know when that's happened, but you wouldn't necessarily know why it happened. And actually, slight tangent here, because every time I talk about observability I love to quote Charity Majors somewhere. She's great, from Honeycomb, and she does say, look, if it is a predictable problem, then why haven't you automated the fix anyway? Why does a human even need to know? So, classic example: database connections. They do have problems, they can go down, and you'll monitor that.

[00:04:50.010] – Ben
But if you know that's something that could happen, then you should have implemented a circuit breaker, some way of handling those connection failures gracefully and retrying at a specified interval. So you might argue that there's not much of an ROI, return on investment, in visibility alone. So then on the other hand, there's observability. And I'm not just going to say, well, developers do that and they've done it well, because we haven't, and I'll get on to that maybe when we talk about logging. But that's the why. There are a lot of great definitions out there. Charity has slightly different ways of explaining it, but for me it's just: why is this system exhibiting certain behavior? It's that deep understanding, finding the root cause, and having a system, an observability solution, a way of digging into that system and finding answers to those unpredictable problems, the ones you didn't know would happen. Having that in place, that's the challenge. Right?

[00:05:45.230] – Ned
So not something that you would necessarily pick up on just by observing the standard metrics. I say observing, but I guess we're going to have to work on some terms. But just by looking at the metrics, I wouldn't necessarily know there's a problem, because the CPU looks fine, the memory looks fine, but the application is thrashing for some reason that I haven't predicted. Observability is the thing that's going to help me understand why that's happening with the application.

[00:06:12.550] – Ben
Yeah, that's a really good way of putting it, actually. Instead of talking about observing a system, you can talk about monitoring. I think to most people that means the same thing.

[00:06:24.170] – Ned
Okay, so if we're thinking about monitoring, and then observing as a larger portion of that activity, what are the main things that I'm going to be focused on to get the information I need to perform that monitoring, and then later that observability?

[00:06:43.970] – Ben
So you might have heard the term the three pillars of telemetry, and there's quite a lot to unpick here. You might not have come across the term telemetry before. That is just the data that we're getting. It could be those metrics, it could be logs, or it could be traces. And those are the three pillars, the data that we can use to help us understand the state of a particular system or service. You may also hear people, you may also hear me, calling telemetry "signals." It's just another term we use for that data we're emitting from the system. So of the three pillars, you've got logs, and that's very much what has happened. And if a lot of developers were honest, they would say that's all they know how to do. That's what they do, they just fill up lots of logging. Those are those human-readable, detailed records of important events that you use for troubleshooting. And they are really useful for troubleshooting; they're usually very detailed. The problem with those, of course, is they're not always tied, or are only loosely tied, to any particular user or request in the system.

[00:07:50.360] – Ben
So you can't always chase that back to other connected events. I'm guilty early in my career as well of abusing them and using them more like tracing. Which takes me on to tracing, which is finding out where it happens. So if you implement the tracing signal in your systems, if you emit that everywhere, then you can capture that whole end-to-end flow, that whole transaction through the system, and ultimately reduce that need to put excessive logging everywhere to try and rebuild that journey. And then the third pillar, which is the one we've already touched on, is the metrics. That kind of tells you how well things have been happening, and for me they're the numbers, aren't they? I like to iterate. You've maybe seen my blog title, Failing Fast. I like to get in there and just try and build things. If I build it and it works, I'm happy. And I do find, especially in maybe some of the organizations I've worked in, you tend to come to these NFRs, these non-functional requirements, after you've finished it, and say, oh, we need to sign it off, how fast do we want it?

[00:08:57.130] – Ben
Then we put the metrics over it.
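
Ben's three pillars can be made concrete with a small sketch. The Python below is purely illustrative (the field names and helper functions are made up for this example, not OpenTelemetry's or any vendor's actual data model), but it shows the shape of each signal and how a shared trace ID ties all three to one transaction:

```python
import time
import uuid

# Illustrative shapes only: field names and helpers are hypothetical,
# not OpenTelemetry's actual schema.

def make_log(message, trace_id, level="ERROR"):
    # A log: a detailed, human-readable record of what happened.
    return {"timestamp": time.time(), "level": level,
            "message": message, "trace_id": trace_id}

def make_metric(name, value, unit):
    # A metric: a number describing how well things are happening.
    return {"timestamp": time.time(), "name": name,
            "value": value, "unit": unit}

def make_span(name, trace_id, parent_id=None):
    # A trace span: one step of a transaction, where it happened,
    # linked to the rest of the journey by trace_id.
    return {"span_id": uuid.uuid4().hex[:16], "trace_id": trace_id,
            "parent_id": parent_id, "name": name}

trace_id = uuid.uuid4().hex  # shared by every signal in one transaction
span = make_span("checkout", trace_id)
log = make_log("payment declined", trace_id)
metric = make_metric("http.request.duration", 0.42, "s")
```

The key detail is the last four lines: the log and the span carry the same trace ID, which is what later lets you jump from a detailed log record to the whole journey it belongs to.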

[00:09:00.170] – Ethan
I wanted to dig into the logging versus tracing for just a second here. Now you said you’re guilty as a developer sometimes or you have been guilty of using logs as tracing. I think I know what you’re getting at there. I hack around a bit, and if I’m working on code and trying to understand it, sometimes I’ll just be putting.

[00:09:18.500] – Ned
Like in the very early stages of.

[00:09:20.000] – Ethan
Development, you throw in a print statement just to throw something on the screen so you know something happened or what just occurred. And you could do that a bunch, and you get a bunch of things on the screen. That's kind of logging, but it's really more tracing, because you're trying to debug something or figure out something that's going on. Is that the kind of thing you're getting at?

[00:09:41.530] – Ben
It is, yeah. And I mean, I could go through a lot of dirty secrets in my past in the logging world. You've got a problem in live, you add some logs, you redeploy. Or you run one of the services in a distributed system locally to make the traffic go through you and debug it. I've rebuilt a library with extra logging and we deployed that DLL into live just so I could get that extra logging. Because we hadn't instrumented, we hadn't put the right telemetry in our system such that we could deal with these unpredictable problems. So when you start doing these anti-patterns, you know that something must have gone wrong. You obviously haven't made an observable system. Ultimately, a lot of logging isn't bad. I mean, it is bad when you see permanent logging in a service that says "entered this function, left this function" because people are trying to get that journey traced through. Of course, a stack trace is a brilliant thing, really useful within one service, but you only get those if something actually crashes, if there's an exception that you haven't handled. A lot of the stuff that goes wrong, especially in large modern systems,

[00:10:45.160] – Ben
it's just not that simple. It isn't going to throw an obvious error anyway.

[00:10:49.730] – Ethan
Now, logging in my mind was distinct from tracing in that tracing provides some amount of context around the event so that you can tell end to end, like we were talking about with distributed systems. So if we're running a transaction that goes through several different microservices, let's say, before the transaction is complete, you need context around this event. You need to know where in the transaction the thing happens, so you have some idea of what's going on and where the failure is occurring. And to me that's kind of the distinction between logging and tracing: tracing gives me that context, so I know where in the distributed stack I am.

[00:11:30.410] – Ben
Yeah, and just to help you see where I was coming from: in a previous role, our organization only had logging. They were doing it at scale, it was going into Elastic, and it was sort of homegrown, home-rolled correlation. So there's this vague communication and trust between teams to sort of pass correlation IDs, or have some sort of unique GUID, such that when you query it later on in this giant vault of logs, you can find a way to join those together and recreate that journey. Which works to a point, but it's flaky, and eventually it does become really challenging to maintain. I mean, there was confusion all over the place with that one. So our ideal situation, of course, would be that we have all three of these signals. The logs are really important, you put them in critical places, but they are also decorated with the same ID, the same unique identifier. Let's put it another way. If we know there's a problem in service C, we look at the logs and we can't work out why this problem has happened.

[00:12:36.760] – Ben
We can then use the traces to go back and find where the original, the root cause was, which service did have the problems, and then associate that with the logs on that service and hopefully dig deeper and find the actual root cause. That makes sense, right?

[00:12:52.310] – Ned
So traces are at least in part linking the events that have occurred in these different services together through some sort of shared ID. Whether it’s a transaction ID or some other way of stamping that this is all part of a larger transaction that’s happening across multiple services.

[00:13:11.710] – Ben
That's a better way of putting it. And some might say, well, why don't you just put your logging in there? But of course there are overheads here to storing all that data. You can talk about the cardinality of a metric or a trace or a log, how many different values you try squeezing into it. Logs are going to be expensive to store; I mean, you're using plain English in them a lot. Whereas traces, you want to be able to store more of those, and you need to keep those lighter. So you don't want all that logging information in there, but you want to be able to find the log associated with a trace at a particular point in a transaction, right?
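
One way to picture what Ben describes, keeping logs detailed but keeping spans light, is a simple join on the shared ID. This is a toy stdlib sketch (the record layout is hypothetical), not how a real observability back end indexes telemetry:

```python
def logs_for_trace(logs, trace_id):
    # Pull out just the log records stamped with one transaction's
    # trace ID, recreating the journey across services.
    return [rec for rec in logs if rec.get("trace_id") == trace_id]

# Logs from three services; two share a transaction's trace ID.
logs = [
    {"service": "A", "trace_id": "t-123", "message": "request received"},
    {"service": "B", "trace_id": "t-999", "message": "unrelated work"},
    {"service": "C", "trace_id": "t-123", "message": "db timeout"},
]

journey = logs_for_trace(logs, "t-123")  # service A and C records only
```

The trace tells you which services the transaction touched; the stamped logs are where you then dig for the detail.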

[00:13:47.180] – Ned
And when I’ve looked at logs, they can be really difficult to parse even though they’re written in human language. Sometimes you want to search them programmatically and that can be very difficult because they’re not in that format. That’s one problem that I can think of. But I imagine there’s some other problems that we’re facing in the world of observability when it comes to these more complex modern systems. What are some of the big issues that you’re trying to get your arms around with something like open telemetry?

[00:14:15.970] – Ben
So another big issue for me, one of the two bigger challenges as well as the correlation and context problems, is the format and structure. A large organization will have many teams, historically probably doing things in different ways: different ways of implementing observability, possibly different back ends for looking at it, different logging libraries, probably different tech stacks, potentially having merged with some other companies and brought in their tech as well. So it can be a big mismatch, and at some point you are going to need that cohesive view of it all. And of course, one of the problems we've got: there are some fantastic open source and commercial vendors out there for working with observability, but they use proprietary protocols, their own protocols. So you end up sort of vendor-locked, which for me is one of those other bigger challenges. I really fear that. I'm a big proponent of open standards and I'm just always terrified of getting locked into something, because doing this sort of work, instrumenting your applications, plugging in the sort of observability solution on a large scale, is a big ask, and you don't really want to have to come back and do it again, right?

[00:15:36.040] – Ned
And I would worry about homegrown solutions as well, where somebody decides to implement their own observability or telemetry stack and comes up with their own data formats. I'm going to use JSON, but I'm going to use this really weird structure that nobody else uses. And then when you try to parse that or load it into your larger library to look at things, you now need to, like, write a custom adapter

[00:15:59.740] – Ben
For that. And it just encourages... it needs someone to own that very clearly. It's very difficult maintaining the schemas for that sort of thing. I've been in situations where we've tried to find out: are we using the term correlation ID or trace ID? Because it's our homegrown one, home-rolled. We couldn't even tell, because both of them were present in some of these JSONs. And it actually gets to a point in the end where you're using a vast amount of storage because people are just JSON-dumping every object, everything. If you're not sure what to log, just put it all there. Maintaining that is hard enough as it is, and onboarding is difficult. And then I couldn't even imagine trying to then have to write our own libraries for different languages just so we could support our entire estate, so it could all log to the same place. Because ultimately that's going to be one of the key asks of observability: having all that in one place so you can get that joined-up visibility.

[00:16:53.490] – Ned
That’s a really good point. I didn’t think about the polyglot problem where you probably have yes, it’s a microservices architecture. You might have four or five different languages involved there from different teams and they’re all going to have their own libraries and ideally you’d still like them to emit something that is useful to your telemetry solution that’s consistent across all of these languages.
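
A tiny sketch of the "custom adapter" tax being described here: two teams' homegrown log shapes have to be normalized into one before they can be queried together. Every field name below is invented for illustration; the point is that each new homegrown schema means another hand-written mapping:

```python
# Two teams invented their own log shapes; a third system must adapt
# both into one schema before it can query them together.
team_a_log = {"correlationId": "abc", "msg": "timeout", "svc": "orders"}
team_b_log = {"trace_id": "abc", "message": "timeout", "service": "billing"}

def normalize(record):
    # Hand-written adapter: every new homegrown schema adds branches here,
    # and someone has to own and maintain this forever.
    return {
        "trace_id": record.get("trace_id") or record.get("correlationId"),
        "message": record.get("message") or record.get("msg"),
        "service": record.get("service") or record.get("svc"),
    }

unified = [normalize(team_a_log), normalize(team_b_log)]
```

A shared standard like OpenTelemetry's data model aims to make this whole layer unnecessary, because everything is emitted in one agreed shape to begin with.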

[00:17:18.510] – Ben
Yeah, and it's not just them. It's also all the other services that you use going down the line, the SaaS solutions, the libraries that you bring in. A lot of the programming happens, all the code is actually inside the library, and then that becomes a black box in itself. How do we know what's happened inside that, too? So there are all these challenges that sort of chain on from this bigger problem.

[00:17:46.850] – Ethan
Ben, again, most of us listening to Day Two Cloud are not developers; we're on the other side of the fence, more on the ops side. And you said something that caught my attention. You said depending on the toolkit that you're using, the tech stack that you're using, you're getting a bunch of telemetry kind of built in, is how I understood it. Did I understand that right? Well, I guess what I'm trying to understand is: what is the tooling that you as a developer are using to write your code? What telemetry is built in, versus what are you as a coder creating to write specific log entries and so on?

[00:18:22.320] – Ben
Where to take this? Because obviously it's widely variable. I mean, it depends what tech stack, which organization. Everyone does things so differently, so I'm not sure how I could even give...

[00:18:35.590] – Ethan
You an answer. Well, so here's what I'm trying to understand. You get, as a function of the tech stack that you're using, the framework that you're using, maybe that's the right word, some amount of telemetry functionality that's built in that you're able to turn on and take advantage of as a developer. Is that fair to say?

[00:18:54.410] – Ben
I'll speak from my own .NET world. I seem to move from Microsoft house to Microsoft house, so that's where my confidence lies and I can speak to that. And most of the time we're rolling stuff based on ASP.NET web apps, whether it be Functions, that sort of thing. And out of the box, .NET Core isn't really... you'd still have to plug in libraries. But of course, the Azure stuff off the shelf is usually Application Insights. So straight away I'm talking about vendor locking there. You do get some quite nice automatic instrumentation there, you get some reasonable traces you can look at in Azure Monitor, but it very much sort of locks you in straight away.

[00:19:36.890] – Ned
Right, I have some experience with that, setting up Application Insights for development teams. They were coding and using that library in their code when they were writing the application. So, yeah, it slots really nicely into Azure and running it there. But if they wanted to move that same tech stack over to AWS and run their .NET application there, they wouldn't be able to emit that same telemetry data to... not CloudFormation, Cloud-something, I'm trying to remember all the different names... CloudWatch. They wouldn't be able to emit the same sort of metrics to CloudWatch, because it doesn't understand Application Insights. So that seems like a pretty big problem.

[00:20:21.510] – Ben
You kind of have to fully embrace having that freedom to move. So we might containerize something, put it in Kubernetes, and we're really proud: we can lift it up and we can put it wherever we like, if we ever chose to move. But once you start actually locking yourself into Azure's monitoring system, Azure's AD system, that isn't an easy change. You can't just move, because you have to go through a lot of your code changing how you log and how you instrument, depending on which sort of back end you're going to be using in the end.

[00:20:50.270] – Ned
Not as simple as just switching out a library, you have to switch out all the calls to that library and whatnot as well.

[00:20:56.850] – Ben
Yeah, it depends. I mean, if you have just been a bit lazy and just plugged in the automatic stuff, then you could just remove the library. But there’s a limit to how much you get from automatic instrumentation.

[00:21:07.070] – Ethan
Right. Well, this sounds like the perfect time to introduce OpenTelemetry then, Ben, because we just, I think, articulated a big part of the problem. So tell us, what is OpenTelemetry? And I think we can abbreviate it OTel. Is that what the industry has been calling it these days?

[00:21:23.290] – Ben
Yeah, certainly when it comes to typing it. Anything to save my fingers a few more years until the AI can just guess what I want to say. So, yeah, OpenTelemetry, OTel, it's the cool kid on the block. It's an evolution of its predecessors, OpenTracing and OpenCensus. They were a bit of a fragmented ecosystem. They did some good stuff, I think. OpenTracing did tracing; OpenCensus did a bit of tracing and metrics. But developers were kind of having to pick and choose a bit of both. And they had a lot of other problems too, especially in terms of being able to adapt them and extend them. They hadn't quite got that right. So OpenTelemetry has come along, and it's like a single comprehensive solution that combines the strengths of both of those, and it's vendor neutral by default, which is pretty awesome.

[00:22:16.810] – Ethan
Now, the OpenTelemetry project is part of the CNCF. It's yet another one of the many, many projects under the Cloud Native Computing Foundation's care and feeding, right?

[00:22:27.790] – Ben
It is. I've heard quotes that it's, I think, the number two most active project after Kubernetes. I haven't seen that for myself, but it might just be.

[00:22:43.430] – Ethan
You just said Kubernetes, which I suppose is a magic word there. Was OTel conceived with Kubernetes in mind? Or is it that, because we're talking about distributed applications and microservices, and the platform they run on tends to be Kubernetes, Kubernetes just tends to come up naturally in the OpenTelemetry conversation?

[00:23:04.050] – Ben
I wouldn't want to imply that link, but OTel was definitely born out of the needs of distributed systems. That said, its features are valuable in any environment, not just distributed systems.

[00:23:18.040] – Ned
Okay, yeah, I was going to ask that question because it's part of the CNCF. That says cloud native to me; that says these large distributed systems. What if I'm not running that? What if I'm running like a basic three-tier application? Is that still useful to me?

[00:23:34.950] – Ben
Any three tiers? I tell you what, we can talk about stories of distributed systems that were made up of n-tier applications. So you go up one side of a trace, go across, come down the other side, maybe into a library that's faulty, and that's another story for another day. But in terms of a monolith, dirty word, isn't it? Your single service. I like to use analogies for this sort of thing. So think of maybe a public transportation system. You've got your bus routes in the city; that's your non-distributed, your single-service system. And then you've got a big train network, a rail network that interconnects all these cities for the more complex journeys. And I'm probably going to take a bus in this city, get to a train station, take the rail across to the other city, and then the bus again on the other side. And the sort of system we'd expect is that I can buy shared tickets that work on both. I don't have to buy separate ones. There's a standard route map planning system. Just like when I go on Google, it doesn't tell me I have to go off and research the bus or the train separately.

[00:24:39.100] – Ben
I can do all that in one place. The schedules, the signage, whether I'm taking the bus or the train, they all mean the same to me. Just like when I'm checking the service health or disruption: I want to know, is my journey going to go okay? How long is it going to take? I want to be able to use the same language, the same systems, to look at that, to get that overall picture. A lot of that's relevant to distributed systems, too. But if you pick through it, there are also really important benefits just for the developer: that flexibility to move between systems and have that common standard. It's the same each way. Actually, I think we struggle a lot in the public sector. We don't pay brilliantly, but we have other benefits. And anything can help recruitment and retention. If we say, well, we're using this standard, you can onboard and immediately start working with it. And likewise for the developers here: you're working on a modern standard, this is the future, so you'll be learning it as well.

[00:25:41.560] – Ben
And you'll be able to reuse that anywhere. You'll be welcome on any system, because they all speak the same language.

[00:25:46.670] – Ethan
So it’s all about that standardization of the data for you. You’re making a bigger deal of that than the fact that it works well and helps you solve the tracing through a distributed system.

[00:26:00.850] – Ben
Yeah, you can talk about the other benefits that you get. There are more components, there's more to it than just that standard protocol. But that is at the core of it.

[00:26:14.420] – Ethan
Well, so the reason I posed the question, Ben, is it feels like the magic, to me, would be in the tracing component. Being able to figure out where the heck a transaction went wrong, knowing it's buried deeply in a complex transaction, and then being able to pinpoint that using the telemetry, that seems super key. But I guess as you're stating it here, there are other tools that can do that, right? There are other solutions that can give you that distributed tracing functionality. What OpenTelemetry is giving us is that, but then also this predictable format for the telemetry that gives you portability and, again, that standard that everyone can interoperate with if they choose to. Which we're already seeing from some tooling vendors. Some tooling vendors are announcing support for OpenTelemetry one way or the other: either they consume it, or they can produce OpenTelemetry-formatted data. That's popping

[00:27:12.910] – Ben
Up more and more, or is native-only now. So, I think it was Jaeger, I think I'm right: Jaeger is now OTLP. You don't use Jaeger's protocol anymore; you have to use the new protocol, which we might get on to. Yes, it gives you that freedom. You're completely decoupled, so you can just switch your languages at will. You can switch the back-end provider. You could use multiple back-end providers. And that's not a big deal to be able to do anymore. But you're right, the tracing itself isn't new. That's been around. It was harder if I wanted to use a particular vendor; it was reliant on them having libraries that supported instrumenting C# or JavaScript or Python. And eventually there's going to be a point where there isn't a way to get the telemetry from the system I like, and that kind of ruins my party then, doesn't it? I don't have that bigger picture. There's always going to be that. I loved NServiceBus, they got that right, I'm a big fan of it. I think we had about 20-plus endpoints, so always messages flying around. But there'd always be a point where an endpoint called some API that wasn't part of NServiceBus's

[00:28:19.860] – Ben
lovely correlation IDs, and probably in a different time zone as well, because NServiceBus kept forcing it to somewhere in America. So your correlation would just fizzle; straight away you wouldn't really be sure where the root cause was, and you'd lose that trail. So there's that big risk, which OpenTelemetry is going to help get rid of. You really will be able to make sure you get that complete, cohesive view.

[00:28:42.710] – Ned
Interesting. So it's really playing at that layer. I'd almost make an analogy to something like TCP/IP, where it's not governing what's happening on one end of the conversation or the other; it's just the protocol by which data can be transferred. And so OTel is kind of doing that, from the application to whatever is going to be ingesting that information and making sense of it. Does that line up, or am I way off?

[00:29:11.390] – Ben
Yeah, if you’re on protocol analogies you could say why wouldn’t you adopt it? It would be like think about the CompuServe days or having really strange proprietary instead of putting up a website that people can access using Https, you use Ethan’s protocol. Why would you tell yourself in that way that everyone had to go out and get the special browser, the special protocol, just so they can talk to your look at your website, you just wouldn’t do that.

[00:29:41.290] – Ned
I guess Internet Explorer tried to do that for a while, and we know how well that went, at least from a security standpoint. But yeah, if we're looking at OTel, it's kind of loosely coupled on both ends. So what is OTel actually comprised of? What are the components it's built up from?

[00:30:00.590] – Ben
Yeah, there are several components, I think six or seven, depending on what level of detail you go to. But I think the way to model things to start with, from a developer's point of view and just for getting started, is the core libraries: the SDKs that implement the APIs. I won't go into the details of where the differences lie there, but essentially the SDK gives you the bare bones to start manually sending the three types of telemetry from a particular application. Those SDKs are available for many languages; I think eleven are officially supported at the moment, but that could have changed. I know there are a couple more trying to sneak in there. I think you have to get a certain number of supporters and people working on something before it can be officially moved in as a community-owned one. So once you've got that basic SDK in place, you can then bring in the instrumentation libraries. I'm not sure if we've touched on this yet, but instrumentation means getting some code in there to have an application or a service send out those signals.

[00:31:16.560] – Ben
That telemetry data, be it metrics, logs, or traces. And there are tons of those out of the box for all sorts of languages and frameworks. So in the .NET world, I think we talked about how easily you could do some initial automatic instrumentation. With an ASP.NET application, I can just add a couple of libraries, one for the website's incoming requests and one for HttpClient, and then, out of the box, straight away, my API calls another API which has also got that automatic stuff, and it's going to record that journey for me. It's going to propagate, it's going to pass on that information, because it's not magic. There is no central server doing this. If you want that context, those IDs, to move across your system so you can follow that trail, then you are going to have to propagate it. So any service involved needs to be aware of OpenTelemetry so that it can receive the context and pass it on. But you can get that with the automatic stuff.
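The propagation Ben describes is standardized as the W3C Trace Context "traceparent" HTTP header, which carries the trace ID and the calling span's ID between services. As a rough illustration of the mechanics only (the real SDKs and instrumentation libraries do all of this for you; the function names here are made up), a sketch in Python:

```python
import secrets

def parse_traceparent(header):
    """Split a W3C traceparent header: version-traceid-parentspanid-flags."""
    version, trace_id, parent_span_id, flags = header.split("-")
    return trace_id, parent_span_id, flags

def make_child_traceparent(header):
    """Keep the trace ID, mint a fresh span ID for the outgoing call."""
    trace_id, _parent, flags = parse_traceparent(header)
    new_span_id = secrets.token_hex(8)  # 8 random bytes -> 16 hex chars
    return f"00-{trace_id}-{new_span_id}-{flags}"

incoming = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
outgoing = make_child_traceparent(incoming)
# The trace ID survives the hop unchanged; only the span ID is new.
assert outgoing.split("-")[1] == "4bf92f3577b34da6a3ce929d0e0e4736"
```

This is the "no central server" point: every service reads the incoming header and writes a new one on each outbound call, or the trail fizzles exactly as Ben describes.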

[00:32:13.050] – Ned
You just connected some dots for me, because I was sort of thinking, well, who generates that ID to begin with, and how is that ID passed along through all the applications if you want to do this trace? And that was the connection point for me: each application needs to be aware that OpenTelemetry is what's being used, and then respect what the calling application is doing.

[00:32:39.570] – Ben
A lot of developers listening might be kicking themselves thinking, oh, I could have released this years ago, I've written this. Because I think we all have. We've all written something where the first service that finds it doesn't have an ID goes, oh, it must be my turn to create it, I'll be the one that creates it, and then it gets used forever on. I don't think there's any rocket science here. It's a straightforward thing. It just needed people to step up, and it has been that way, all the big providers stepping up and contributing and making this happen. The other thing I didn't mention when we talked about core libraries: you've got the instrumenters, which generate the stuff ready to send out, but you also need to pick your exporter libraries, and there's a load of them as well. So I can have my ASP.NET application export to Jaeger, export to loads of places from my application. You might not want to do that, it could slow things down a little, but you can do it.

[00:33:33.830] – Ned
Okay, so where it's exporting to would be whatever system of record you're using to capture all this trace data so you can comb through it later. Okay. And would each individual application in the chain export to the collector? Or would it be once the transaction is complete, or reaches some sort of end state, that the last app in the chain is what actually sends it up? Because then you could lose information in the chain, I guess.

[00:34:01.290] – Ben
Yeah. Look at you, keen and jumping straight onto the collector. I think there are a lot of options. I'm quite good at tangents; I could go off into the collector now if you like. It's certainly part of the solution to that. It depends. Certainly in the early days, with really small systems experimenting with it, you could just have every service exporting directly to the back end of choice, wherever you want that data to end up.

[00:34:28.860] – Ethan
But the OpenTelemetry Collector does that, and we want to get there in a minute. Before we do, Ben, could you walk us through what a trace feels like going through the OpenTelemetry system? How does OpenTelemetry do the job of tracing? I know in your blog post you described an OTel span: it's got a trace ID, then there's a span ID and a parent ID and so on. Can you walk us through that?

[00:34:55.310] – Ben
Yeah, it brings back visions, if anyone studied computer science, of linked lists and such, being able to traverse them backwards and forwards. And it is that simple, though. Like we said, you want traces so you can map the whole journey of a transaction through your entire system, and what OpenTelemetry does is break that down to a few simple terms. So you have a span. Well, sorry, stepping back a bit: you have a unique ID, which you've just been talking about, a trace ID, which represents that entire journey. So if I want to find any logs, metrics, or traces associated with that particular transaction through the system, I've got that trace ID. That's how I'm going to correlate.

[00:35:34.310] – Ethan
Okay. You associate a trace ID with a transaction, which is how I was thinking of it. A transaction is end to end, the delivery of this thing through the system. A trace ID identifies that transaction, and the trace ID belongs to a span, to what OpenTelemetry would call a span.

[00:35:53.090] – Ben
No. So at the top of this tree, we've got the whole journey, the trace ID. Within that, you have got single operations, and that doesn't necessarily mean everything one service does; it could mean a lot of things the service does. Unless you're using some sort of automatic instrumentation to get you started, it's the developers that choose what a span is. And I think that's quite important to acknowledge, because I'm not a big fan of some of the automated tracing, and I'm also not a big fan of leaving it till later and doing it retrospectively. You need to do it there and then, when you have an understanding of what that code does. When you're writing that code, you know best which bits are going to be important to know about later on. And if you don't, that's where you should be talking to the business and other stakeholders to find out what's important to them, what they need to know. Because they might not need to know that the CPU is high when newsletters get sent out; that's not important to them. What is important is what's going to affect their users. So a span is another ID linked to, I suppose, a child ID, if you like, of that trace.

[00:37:00.400] – Ben
It's one of the spans in that trace, so it's linked to that trace ID. So you've got the trace ID at the top and then a span. Of course, there are going to be a lot of spans through that journey, all over the place. How are you going to know which order to put them back together in, to reconstruct that journey so you can look at it? Each span keeps a record of the parent span ID it spawned from.

[00:37:23.870] – Ned
Okay.

[00:37:24.340] – Ben
And then you put the whole puzzle back together by retracing your steps, looking at the parent IDs.

[00:37:30.350] – Ned
Right. So you're building out a tree here, where the trace ID is just the unique identifier for the transaction. And then you start with one span, which is that initial kickoff of the transaction, which creates other spans, each referencing its parent. And it could be this large branching tree from a single beginning trace, yeah.
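The parent-ID chain Ben walks through is all a back end needs to reconstruct that tree. A toy sketch of the reconstruction in Python (field names like span_id and parent_id are illustrative, not any real SDK's schema):

```python
def build_trace_tree(spans):
    """Group spans by parent_id, then walk down from the root (parent_id None)."""
    children = {}
    for span in spans:
        children.setdefault(span["parent_id"], []).append(span)

    def walk(parent_id, depth=0):
        lines = []
        # Order siblings by start time, as a tracing back end would.
        for span in sorted(children.get(parent_id, []), key=lambda s: s["start"]):
            lines.append("  " * depth + span["name"])
            lines.extend(walk(span["span_id"], depth + 1))
        return lines

    return walk(None)

spans = [
    {"span_id": "a1", "parent_id": None, "name": "GET /checkout", "start": 0},
    {"span_id": "b2", "parent_id": "a1", "name": "call payments API", "start": 1},
    {"span_id": "c3", "parent_id": "b2", "name": "SELECT orders", "start": 2},
    {"span_id": "d4", "parent_id": "a1", "name": "publish event", "start": 3},
]
print("\n".join(build_trace_tree(spans)))
# GET /checkout
#   call payments API
#     SELECT orders
#   publish event
```

The Gantt-style views Ben mentions next are just this tree rendered against time.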

[00:37:49.690] – Ben
Incredibly hard to draw, which is why I've not drawn it in my blog. The way we normally look at these things, App Insights is the classic, it does it out of the box, they look a bit like Gantt charts. You might remember those from college or school. Horrible things. Great for waterfall, so if waterfall comes back, it's all Gantt charts here. Back ends tend to display spans against time like that, but multiple spans and really complex journeys can get a little bit hard to represent. So you can drill down, and that's where you want the freedom to pick the best back end. I'm going to try not to name any particular vendors, but one day one might be doing a really good job of that, and I'd love that, and I want to be able to just switch, without having to say, look, I need six months because we've got to change our applications and the way they're instrumented to do this. I want to be able to make that change fast. Even better than changing, I'd like to just be able to point at a second one too. You couldn't do that before without going through your code.

[00:38:51.210] – Ben
It’s just not even something that would be possible right now.

[00:38:54.110] – Ned
Right, that does sound ideal: if you could just say, okay, we're going to point at both for a little bit, make sure the new solution does everything the old solution can, plus whatever it promised, and then, when we're ready, we just stop sending to the old solution and everybody's happy. Can you explain the concept of OTel baggage? I know that came up, I believe, in one of your posts.

[00:39:19.650] – Ben
Yeah. I mean, going on a trip, you've got your suitcase, you pick up things as you go, and as you go through your distributed system, you can pick up bits of information. That's the analogy. We're just talking simple key-value pairs. It is a shame, though. It always reminds me of the singleton pattern, and I could be really cruel about that pattern. Obviously not all your audience is developers, but the singleton pattern is where you've written a bit of code and you've got something that lots of bits of code you've decoupled all need to use. It's too late, you've written it the wrong way, so all you can do is have this big shared class that everyone can access. And there are all sorts of bad reasons for doing that, and baggage feels like that. And I think it's probably right that it's there in the beginning, because you can't get too deep into this; if you're trying to get people to adopt it and migrate, you can't make it too difficult. And baggage is going to be one of those shortcuts, isn't it?

[00:40:25.620] – Ben
So the classic example is: at this point in this service, I know the user ID. Five services down the line, I don't, but I'm going to need it, so I could put it in baggage and pull it out there. But it's very worrying. I couldn't get it past our security guys: the information-exposure risk, people putting sensitive data in, the wrong service being able to see it, and trying to keep an eye on all that. There are definitely ways; I think with time we'll learn good ways of avoiding it. Once we understand OpenTelemetry, you've got custom attributes on spans. Once we've got logging in place, because it's not fully there yet, and you can correlate logs with your traces, then you can put all that good information in your logs instead, can't you? You can sneak a bit more in there, and because it's correlated, I can go back and find it later on. Not ideal, but it's a lot safer than blindly sending baggage around all over the place.
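For reference, OTel baggage travels between services as the W3C "baggage" HTTP header: percent-encoded key-value pairs, readable by every hop along the way, which is exactly the exposure risk Ben is raising. A rough sketch of that encoding (the helper names here are made up; the real SDKs have proper baggage APIs):

```python
from urllib.parse import quote, unquote

def encode_baggage(pairs):
    """Serialise key/value pairs into a W3C-style 'baggage' header value."""
    return ",".join(f"{quote(k)}={quote(v)}" for k, v in pairs.items())

def decode_baggage(header):
    """Parse a baggage header value back into a dict."""
    out = {}
    for item in header.split(","):
        key, _, value = item.strip().partition("=")
        out[unquote(key)] = unquote(value)
    return out

# Anything put in here rides along on every downstream request, in plain sight.
header = encode_baggage({"user.id": "5", "tenant": "homes england"})
assert decode_baggage(header) == {"user.id": "5", "tenant": "homes england"}
```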

[00:41:18.730] – Ned
The danger there is that I'm putting in information I think might be useful, just cramming it into the span or whatever and passing it along to the next service down the row, when that information should really be in a log entry, which I can then access via the information that's in the span. Okay, that correlates to this log, which I can pull up the entry for, as opposed to passing it down the line.

[00:41:43.790] – Ben
That's definitely one way of doing it. There are other ways, but logging will be one solution to that; there are different ways to solve the problem without having to do that. And I don't even think you're going to be second-guessing. It's more likely you're working on two systems: you've decided you want that bit of information at the other end, on the next system, so you just quickly redeploy that one, telling it to shove it in the baggage, and you pick it up at the other end. Because we'll take shortcuts as developers. We do have a bad rep, but we have to deliver at a certain time, and we like our technical debt backlog, keep it nice and full of things we should have done, right? So if it's there, we'll use it and we'll abuse it.

[00:42:22.410] – Ethan
Well, let's move on to the OpenTelemetry Collector. Ben, this is something that in our prep call and in your blog post you make a big deal about. You're a big fan of the Collector and all it's doing for us. So fit it into the OpenTelemetry architecture first of all, and then let's get into what it does.

[00:42:39.250] – Ben
Yeah, the reason I'm such a big fan of it is because, in the end, yes, we hate vendor lock-in, and it would be annoying being locked into a vendor that went out of business; then you really would have to move on, wouldn't you? But otherwise it's always scary talking about these big migrations and big changes, and I think the Collector is really going to help as a migration aid as well. I don't know if that's necessarily been talked up as much as it could be. So, the Collector. Although OpenTelemetry is very much about the standard, at its core a specification, just setting a common language for how we do things, with the vendors and the community building the libraries that let you gather that information, we also need help getting that information, all that telemetry data, out to the different back ends. What the Collector is, is a binary. You can run it containerized, or in a Kubernetes cluster, or just on its own dedicated server; there are a number of ways, and we can get onto that. It's a bit like, well, it's not really a proxy, but it can ingest.

[00:43:49.250] – Ben
It can take all that telemetry data from all these different sources, do something with it, people talk about massaging the data, and there are a lot of different things it can do with that data, and then ultimately send it to any number of places too. I think it just adds that extra freedom to decouple from existing solutions without disrupting your service, and we sort of touched on that already with the parallel stuff. So imagine a distributed system. I've got multiple services on different tech stacks, using, to name a few that have been donated to the project, things like Jaeger, Prometheus, Zipkin. I've got Prometheus for my metrics, Zipkin for my tracing, some Fluentd logging, lots of stuff going on. Those are all embedded in my applications, all these different signals being spat out at the moment, and I'm using back end A for my monitoring, but I'm interested in using back end B. Instead of having to go through all my applications, like we said earlier, completely re-instrumenting, changing to new libraries, probably changing to different types of structured logging and stuff like that, all I'd have to do is tell the Collector to also export to that second place, and I could just pass that stuff on.

[00:45:09.560] – Ben
So I think at the core, what's happening is the Collector can take all sorts of data formats, loads of different data formats, and it converts all of those internally to its own protocol, one single protocol, and then you can choose to export just that. That's the future we're looking for: all you have to export is the OpenTelemetry protocol to all these services, and we'll get there, I hope. But at the moment, all the vendors, of course, are making sure the Collector has exporter plugins so you can keep sending that stuff to their service.

[00:45:48.990] – Ethan
In networking, we have a parallel with network visibility fabrics. You can collect packet data streams and send them to this box in the middle that would function, in this architecture, like the Collector. You can munge those data streams in various ways depending on what tool you want to send them to, and then off they go. These are very powerful boxes that let you munge data and be selective about the kind of data you're sending to various back ends. And this feels very much like that, right? It's not a proxy, exactly; architecturally it kind of feels like a proxy, but it isn't really one. And in this situation, Ben, with the OpenTelemetry Collector, you're saying it gives us the ability to talk to whatever back end we've been using to munge our telemetry data, present it to us in some kind of interesting way, parse through it and so on, and then move to a different tool if we want to later on. We're not stuck with, again, the vendor lock-in situation. The Collector feels like a big part of the magic that liberates us from that.

[00:47:00.950] – Ben
Yes, and liberates is a good word for it as well. And I suppose the word I should have used is pipeline. Strictly, conceptually, it's probably a pipeline per signal type; certainly if you go and look at the OpenTelemetry documentation, in the YAML files you use to build the pipelines they are very much separate things. So on one side you've got receivers, and you build the pipeline with plugins, essentially, so you can pick any receivers you like. I think there are over 100 now. It's important to understand, when you initially go to, let's say, the Collector GitHub repository, you'll just see a few receivers and a few exporters. Those are the core ones. But there's also a contrib version as well, the community-driven stuff, and that's where the numbers really add up and what gets us up to about 100 receivers. And by 100 receivers we're talking about 100 different data formats, and they go right down to even simple ones. There's one called the filelog receiver. So if I've got, and I do have, a lot of old .NET Framework applications that are just logging to file with log4net.

[00:48:09.050] – Ben
In my Collector, I add the filelog receiver, give it a file path and a regular expression that tells it how to parse my log files, and that's it. I now have all those log files from those legacy applications being brought in and converted to OTLP, and all I had to do was put a couple of lines into my Collector as a receiver. Then on the exporter side, like you covered, of course there are exporters for all these different systems as well. And in the middle there's a lot of magic you can do too. Processors, we call them. I did see some stuff the other day; I think there are getting to be too many processors and it's becoming hard to manage, so there's a new piece of work in alpha called the transform processor, which will do a lot of those things. And I think it's getting its own language called, I don't know, observability transformation language, maybe, I'm not sure. I just glanced at it as it came off my newsfeed. But there's lots of stuff in there. You've got a Kubernetes processor, so if your Collector has all these different apps in Kubernetes sending it telemetry, it will automatically stamp metadata such as namespaces or pod IDs on there for you, for that sort of consistency, enriching it with all that metadata.
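Ben's filelog example corresponds to a Collector pipeline along these lines. The filelog receiver, batch processor, and otlp exporter are real Collector components, but the file path, regular expression, and endpoint below are placeholders, a sketch rather than a drop-in config:

```yaml
receivers:
  filelog:
    include: [ /var/log/legacy-app/*.log ]
    operators:
      - type: regex_parser
        regex: '^(?P<time>\S+) (?P<severity>\w+) (?P<message>.*)$'

processors:
  batch: {}

exporters:
  otlp:
    endpoint: backend-a.example.com:4317

service:
  pipelines:
    logs:
      receivers: [filelog]
      processors: [batch]
      exporters: [otlp]
```

Pointing at a second back end during a migration is then just a matter of adding another exporter and listing it in the pipeline.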

[00:49:29.120] – Ben
There's a personal-information processor as well, so you can have a processor in there that redacts and anonymizes personally identifiable information. And I could go on; if you look in the processors folder in the contrib repository, there's a lot there.

[00:49:44.330] – Ned
It kind of reminds me of how people started using Envoy to do some data transformation or massaging when it intercepts requests between two different services. This sounds like the same kind of thing, but for exporting telemetry data or logging. You have this intermediary that also does some processing before it sends it out the other end.

[00:50:07.330] – Ethan
Yeah, there's a scaling challenge here, to me at least, Ben. It feels like I could be throwing an awful lot of telemetry into this binary, into this container, that's doing an awful lot of munging and sending to an awful lot of exporters, and that this could be a bottleneck, or I could be just dropping transactions on the ground. Is there a way I can scale this thing, throw it behind a load balancer or something?

[00:50:32.730] – Ben
How you deploy the Collector totally depends on your requirements. And I know it's easy to say it depends; in our industry it's the answer to everything. But it depends on your requirements, depends on your infrastructure. There are two really common deployment patterns, though: as a central gateway and as an agent. As a gateway, it's a central process, and again, that could be a dedicated box, it could be in a Kubernetes cluster. That can take all your data, do its processing and that sort of stuff, and then send it to multiple back ends. And that gives you a single point of egress, simplifying management: I don't have to give all these different development teams and applications API tokens to talk to the different back ends, and batching and all that sort of stuff I can control in one place. And it's often why I think we're going to find more and more of the ops personnel owning the central gateway, so you can get better governance around what goes out, how things are shuttled, what's encrypted and what's hidden.

[00:51:53.650] – Ben
So you'll probably run a central gateway regardless, for that. But the same binary can also be run in agent mode, and you're probably quite familiar with that in the DevOps community; in infrastructure there are quite a few things that run as agents on boxes for monitoring behavior. That gets deployed closer to every host. One of the main advantages there, and you were talking about performance problems as well, is that you want your application to offload logging quickly. Say the back end is misbehaving: in the days when I might have sent logs straight to some vendor back end, if that vendor was having problems, my whole application might just stall, waiting until it could send those logs, or I'd have to bin them. So the nice thing about having that Collector as an agent close to your host is that you can offload this stuff quickly, and you can also do different levels of configuration more suited to that particular scenario, wherever it might be. And that could be a classic sidecar, even one-to-one inside Kubernetes, or a DaemonSet where you've got one for a bunch of nodes in the cluster.

[00:53:04.630] – Ben
There are a lot of options as to how you might choose to do that, and that really is an it-depends. You might want only one egress out of a particular network boundary, so you might have one collector that a few apps go to.

[00:53:20.430] – Ethan
I'm glad you said sidecar, because I was like, wow, an agent on a host? Even that's a potential bottleneck, depending on a lot of things. Could I do it as a sidecar, more like per process or per container if I wanted? Or maybe per pod is the right way to put it in a Kubernetes context. It sounds like I can do either of those: agent on a host, or sidecar in a pod.

[00:53:47.070] – Ben
Yeah. And a lot of us, when we're looking at this, all the greenfield modern cloud stack stuff and the Kubernetes stuff is quite well suited to it. But actually, for this big picture, for this journey, I think we're going to have a lot of legacy applications that we're going to want to do this to, without necessarily having to modernize them when we put them in the cloud. And we still have plenty in the data center on the older .NET Frameworks, but we still want them to be part of that bigger picture, and there's lots the Collector can offer in that scenario as well. If you put one of these Collector agents on the box, no surprise, there are receivers for monitoring Windows, Linux, even Macs: all the stuff you'd normally get from Windows performance counters to look at the state of processes and the health of a machine. There are receivers for that too. I think there's a collectd receiver as well; you might be familiar with that. So there's an awful lot of power, and I think that's what impresses me most about the Collector: the speed at which it has grown up, and also the concern it's caused, with them trying to make one processor to rule them all and in the darkness bind them, because there are too many of them.

[00:54:55.230] – Ethan
So Ben, it feels like there's a lot of industry support for OpenTelemetry. It's growing. We're getting briefed by vendors with tooling that's adding support for OpenTelemetry in some way or another. There's a lot of momentum here, so maybe this is popping up on someone's radar and they want to know how to get started with OTel. Can you give them some recommendations?

[00:55:13.750] – Ben
Yes, absolutely. The adoption signs are there: it's here to stay, and there are lots of success stories. I don't think there's any avoiding it, which is why, when the typical advice is to just try it on one application and see how it works for you, it's like, well, do you have to do that if you don't actually have any choice? I think maybe what people are looking for now is: how do I actually start adopting it? What's a realistic plan? What I would say first, though, in terms of readiness, is that we're still waiting. Logging still needs some work to be finished. It was the one left till last because it was the hardest, the most difficult in terms of the challenges involved. Traces and metrics were very much a clean rewrite, pretty straightforward if you're allowed to reinvent something. Logging was trickier, because there's a really large number of existing logging systems, and they're quite divided communities too. In .NET you've got the log4nets, the NLogs and whatnot, not to be confused with Log4j and those awful bugs.

[00:56:18.530] – Ben
Nothing to do with that. Yeah, so because of that, it's just taking a bit longer, because they want to embrace those logging solutions. They don't want to wipe them out; they want to try and ensure compatibility, because they're too ingrained. Because, as we know, we have been abusing them. We've been using logging for everything. I've seen logging used for metrics in a light-touch way, people recording timestamps. Developers love logs. That's the only tool they've really had. The DevOps culture is improving, but I wouldn't say devs are necessarily deep into metrics, even if Grafana is a beautiful, pretty thing and all that. I still think logs are something we need to preserve. So what's happening is the OpenTelemetry project is now looking for a way to add that richer correlation to our logs by ensuring compatibility: we keep the logs we've got, but we also get those trace and span IDs into our existing logging. So if we make the effort now to start instrumenting a bit of tracing in our applications, that will be able to correlate with the existing logging that we have, so we can get that bigger picture without any brand-new work for logging.

[00:57:32.050] – Ben
And with that, they're introducing a new logging API, and off the back of that, a log bridge API, which is going to help us make that transition. They don't expect application developers to do that, though; it's the library authors. So in the .NET world we have Serilog, and that is just plug and play. I've already prototyped it: one line of code in my ASP.NET application that already uses Serilog, and magically I'm now emitting OpenTelemetry-compliant logging with that tracing information. So it looks like it's going to be a smooth transition, but I don't know. Hands up, I haven't had time to be involved in the community, but I'm getting the impression it's going to be a good year before we feel comfortable saying we're going to adopt all three. But I don't think we should just be waiting. So, on to how to get started, how to actually start adopting. I would say it's not about writing some code first. I think it's about starting with the infrastructure and getting an end-to-end journey in place.
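The correlation Ben describes, stamping the active trace and span IDs onto every log line so a back end can join logs to their trace, can be sketched with nothing but the standard library. This is an illustration of the idea only, not the real log bridge; in practice the SDK's logging integration supplies the IDs, and everything here (the IDs, the logger name) is made up:

```python
import logging

# Pretend these came from the currently active span's context.
CURRENT_TRACE_ID = "4bf92f3577b34da6a3ce929d0e0e4736"
CURRENT_SPAN_ID = "00f067aa0ba902b7"

class TraceContextFilter(logging.Filter):
    """Attach the active trace/span IDs to every log record."""
    def filter(self, record):
        record.trace_id = CURRENT_TRACE_ID
        record.span_id = CURRENT_SPAN_ID
        return True

handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter(
    "%(levelname)s trace=%(trace_id)s span=%(span_id)s %(message)s"))
logger = logging.getLogger("checkout")
logger.addHandler(handler)
logger.addFilter(TraceContextFilter())
logger.propagate = False
logger.warning("payment retry scheduled")
# A back end can now join this line to the matching trace by its IDs.
```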

[00:58:34.790] – Ben
And that would be a gateway Collector first. Because obviously it does depend: are you talking about greenfield or an existing estate? I think the challenges of adopting this are going to be in the existing estate, and if that's the problem you're solving, which is where I am, then it's get that Collector in place, get that infrastructure. And then maybe, if you've got some legacy bare metal or VMs in the data center, that host metrics receiver I talked about is a lot of fun. Get that agent on a host and start getting some metrics back about your servers via the Collector instead, and hopefully export that to the existing back end you're using anyway to monitor the health of your servers. Just have a go at doing that from a DevOps infrastructure angle. And then, once you've got that channel going, that full journey, it's just a case of looking at small, standard applications and having a go at plugging them in. But get that infrastructure, get that end-to-end journey in first.
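The host-metrics starting point Ben suggests looks roughly like this in Collector YAML. The hostmetrics receiver and its scrapers are real Collector components; the gateway endpoint is a placeholder, and a real config would pick scrapers and intervals to suit the estate:

```yaml
receivers:
  hostmetrics:
    collection_interval: 30s
    scrapers:
      cpu: {}
      memory: {}
      disk: {}

exporters:
  otlp:
    endpoint: gateway-collector.example.com:4317

service:
  pipelines:
    metrics:
      receivers: [hostmetrics]
      exporters: [otlp]
```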

[00:59:37.350] – Ethan
Okay, that all seems straightforward enough and logical. And the caveat about where the project is, and how that might impact your decision to go forward with OpenTelemetry now or maybe wait a little bit, is helpful insight. But I heard what you were saying. You’re saying don’t wait, but just know that we’re not fully mature here with OpenTelemetry.

[01:00:00.830] – Ben
I wouldn’t consider doing any greenfield, any new work now without fully embracing OpenTelemetry, because of the collector, because of the way it works. And even if you don’t want the collector, there are also exporter libraries for all the different languages so you can continue using your existing back end. So you can write a new application completely in the OpenTelemetry world, but still send that in Jaeger format to Jaeger if you want to, until you’re ready to switch over. So certainly I wouldn’t be writing anything new without embracing OpenTelemetry. Absolutely.
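As a hedged sketch of the transition path Ben mentions (keep your existing back end while writing OpenTelemetry-native code): recent Jaeger releases ingest OTLP natively, so a Collector pipeline can forward traces straight to an existing Jaeger deployment until you are ready to switch. This fragment extends a base config that already defines `otlp` and `batch`; the endpoint is a placeholder.

```yaml
# Collector fragment: forward traces to an existing Jaeger back end over OTLP.
exporters:
  otlp/jaeger:
    endpoint: jaeger.example.com:4317   # placeholder; Jaeger 1.35+ accepts OTLP
    tls:
      insecure: true                    # demo only; enable TLS in production

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp/jaeger]
```

The design point is that only this exporter block changes when you later move to a different back end; the applications keep emitting standard OTLP the whole time.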

[01:00:32.360] – Ethan
Okay, so folks listening, this podcast was an expansion of two blog posts that Ben wrote about OpenTelemetry, the modern observability problem, and OpenTelemetry, the missing ingredient, which you can find on his blog, failingfast.io. Now, Ben, back to you. Are there other places people can follow you if they want to interact with you or just keep up with what you’re doing?

[01:00:55.330] – Ben
Follow me on Twitter. I haven’t jumped ship to Mastodon or anything like that yet. So I am on Twitter: I’m at Ben Hall underscore IO. There’s a daft story to that we’ll save for another day, but I’m stuck with it now. Every time I mention it, obviously, it’s even more firmly embedded in my future. Ben Hall underscore IO on Twitter.

[01:01:15.510] – Ethan
Very good. Again, Ben’s blog: failingfast.io. And if you’ve listened all the way to this point, virtual high fives to you, you awesome human. If you have suggestions for future shows, Ned and I would love to hear them. You can hit us up on Twitter, because we haven’t jumped ship to Mastodon either. I think I have a Mastodon account; I haven’t even checked it since I built it. Ned, do you have a Mastodon account?

[01:01:35.220] – Ben
Did you do that? Yes.

[01:01:36.580] – Ned
And same here. Barely checked it.

[01:01:38.740] – Ethan
Okay, well, anyway, we are still paying attention on Twitter at Day Two Cloud Show. And if you’re not a Twitter person, you can fill out the request form on our website, daytwocloud.io, and let us know the topics you’d like us to dig into for you. And if you like engineering-oriented shows like this one, and I know you do, visit packetpushers.net/subscribe. Day Two Cloud is of course part of the Packet Pushers podcast network, and that subscribe page has everything you’d want to know from the Packet Pushers podcast network: all of our podcasts, newsletters, and websites, it’s all there. Nerdy content designed for your professional career development. And until then, just remember: cloud is what happens while IT is making other plans.
