Conf42 Observability 2025 - Online

- premiere 5PM GMT

Writing Custom eBPF Programs for Observability: What You Need to Know

Abstract

Is your observability deep enough? Getting critical insights too late or not at all? You need eBPF! Gain privileged, real-time visibility into system operations without code changes or heavy overhead. From tracing syscalls to profiling latency, learn to harness eBPF like a pro!

Summary

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Okay, so let's go right into the presentation: writing custom eBPF programs for observability purposes.

What exactly is eBPF? The extended Berkeley Packet Filter is an evolution of the original BPF technology. It extends the capabilities of BPF by providing a more powerful and flexible way to do things like dynamic tracing, network analysis, and performance monitoring. Mainly, it does this by providing a framework for writing kernel-compatible, sandboxed programs that can be executed directly in kernel space without modifying the kernel code itself and without running afoul of any security protocols. You get a flexible tool that you can use to program observability into, control, or enhance your kernel's behavior.

Looking here, we see the architecture: the process by which an eBPF program is created and loaded into kernel space. You start with the C source, an eBPF program written in C. The compiler converts this program to bytecode, which is then loaded into kernel space and attached to hooks, events, or system calls, as the case may be. Between the compiler and the attach stage, the verifier goes through your program to ensure that there are no unbounded loops, no out-of-bounds memory accesses, and no other security violations. That makes it a compatible program that can be executed as machine code within kernel space.

Why is eBPF a game changer for observability? Because you get that flexibility: the ability to execute code within your kernel space. This gives you kernel-level visibility, something you can use to get custom metrics well before any user-space application might see them. You get them almost immediately, at runtime, as the operations are being executed.
No code instrumentation is needed. You don't need any manual instrumentation to get this visibility, because eBPF is a kernel-compatible technology. It works with most kernel versions, and you do not need to compromise the security of your kernel.

There is also low overhead. As we said earlier, everything gets compiled down to machine code. All your scripts, however flexible, are compiled into machine code that is executed at runtime, at the same time as your operations.

Real-world use cases: a lot of companies have used this tool, and a lot of companies have extended it, using it as a foundation for tracing system calls, network visibility, performance profiling, and security monitoring. I linked in some examples from Netflix and Apple showing what they have done with it.

We have a demo, and I added a GitHub link in case that is useful. The GitHub link has the code and instructions on how to set up your environment to test the eBPF code as well. For this particular one, you just need an instance. I'm going to be using an AWS micro instance, and that's all I need to test these two demos.

First, the packet log, which captures traffic packets and logs them into a BPF map. Let's see how that works. I have already set up the environment, so we can just test this by sending in traffic packets. Then we'll see that BPF maps have been generated for us, packet counters and packet events, and we can inspect them. A BPF map is accessible from kernel space but also from user space. It can be exported and pushed to your observability tools, like Prometheus and Grafana, using user-space tools.
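The export step mentioned above can be sketched in user space. This is a minimal Python sketch, assuming the counters have already been read out of the BPF map into a plain dict. The metric name `ebpf_packets_total`, the `protocol` label, and the sample values are hypothetical and are not taken from the demo repository. It only renders the counters in the Prometheus text exposition format.

```python
def to_prometheus_text(counters, metric="ebpf_packets_total",
                       help_text="Packets counted by the eBPF program."):
    """Render {label_value: count} in Prometheus text exposition format.

    Assumption: label values are simple strings that need no escaping.
    """
    lines = [
        f"# HELP {metric} {help_text}",
        f"# TYPE {metric} counter",
    ]
    for protocol in sorted(counters):
        lines.append(f'{metric}{{protocol="{protocol}"}} {counters[protocol]}')
    return "\n".join(lines) + "\n"


# Hypothetical values standing in for data read from the BPF map.
sample = {"tcp": 120, "udp": 8}
print(to_prometheus_text(sample), end="")
```

Text in this shape could be served over HTTP for Prometheus to scrape; how the dict is filled from the map depends on the loader library you use.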
For this particular demo, we used a C program to parse the packet information from our traffic and log that data into a BPF map, with a counter to generate custom metrics: a packet count and the packet size as well. Once you have put down your eBPF logic, you then load it. Like I said earlier, you compile it; this is the compile stage. You compile it and load it onto this interface, and you can immediately see that it has been loaded. This is the packet observer. We have tested it and seen that the map has been created and stored in kernel space, where it can be accessed by user-space programs.

But that is not the only way to create or use eBPF programs. You can write native C eBPF programs, but there are also language extensions, such as Python, Go, and Rust, in BCC and the Cilium library. These extensions offer abstractions over the base code. They help you do more with your eBPF programs and make things easier for you. However, they also have a performance cost, because you get more delayed execution and a larger footprint.

We'll see this in our next demo, where we use the BCC tool to create an eBPF program in Python. This is the Python code. You can see that BCC is imported and then used to pass in your main BPF program logic, which is then used to initialize and process events. For this particular code, we first pull up an NGINX web server and add some files to it. Then you can trace user access to those files, basically just logging accesses to the files and generating metrics from that. I have done all of this already, installing, testing, and creating the HTML files that we will trace, on the Ubuntu instance.
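A BCC program of the kind described above might look like the following. This is a minimal sketch, not the code from the demo repository: it assumes the `bcc` Python package is installed and that the script runs as root. The probe name `trace_open` and the choice of the `openat` syscall are illustrative assumptions, and the sketch only prints a line per call rather than extracting file names or building metrics.

```python
# Assumption: the bcc Python bindings are installed. The import is guarded
# so this file still imports on machines without them.
try:
    from bcc import BPF
except ImportError:
    BPF = None

# eBPF program written in C and handed to BCC as a string.
# trace_open is a hypothetical probe name chosen for this sketch.
BPF_PROGRAM = r"""
int trace_open(void *ctx) {
    bpf_trace_printk("openat called\n");
    return 0;
}
"""


def run():
    """Compile, load, and attach the probe, then stream trace output."""
    if BPF is None:
        raise RuntimeError("bcc is not installed")
    b = BPF(text=BPF_PROGRAM)  # compiles the C text and loads it via the verifier
    # Resolve the kernel symbol for the openat syscall on this architecture.
    b.attach_kprobe(event=b.get_syscall_fnname("openat"), fn_name="trace_open")
    b.trace_print()  # blocks and prints trace lines until interrupted
```

Calling `run()` as root prints one trace line per `openat` call. BCC handles compiling, loading, and attaching at runtime, so there is no separate loader step.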
Let's start that up and test it. Let's go with the load test instead. We run the load test of our NGINX web server in another terminal, and it makes about a hundred requests to all the different pages. We can see it here: this is our BPF program reacting and carrying out our program logic for the hundred requests that the server receives.

The program logic is up to you. You can do whatever you want with that information and with the visibility you now have access to, whether you want to create custom metrics from it, export the data directly to your observability and analysis tools, or just have the information logged. eBPF gives you the access and the ability to program your custom logic into kernel space.

So we have seen two demos now: one using native C programming to create an eBPF program and attach it to kernel space, and another using Python. As we saw, with Python we didn't need to load or attach anything manually; the Python BCC program did that at runtime.

So, extending eBPF. Like I said earlier, you have those custom metrics, or you have the BPF map with your data, and you can export this and use it with a user-space application however you need. For extending eBPF, there is a wide range of tools. OpenTelemetry has one, Prometheus has one, Grafana has one, all based on eBPF and using this revolutionary technology that gives your observability tooling access to kernel space.

Here we see a kind of ecosystem map, looking at the projects and SDKs that are available to you for whatever observability and monitoring use case you have. We have Pixie, which is a Kubernetes observability tool.
It removes the need for manual instrumentation, as it is based on eBPF. You have Parca, which gives you continuous system profiling, and Pyroscope, another continuous profiling platform. DeepFlow has automated observability with tracing, and you have network tools like Cilium and Hubble, which give you networking for your Kubernetes purposes. There are also program management tools, such as L3AF and bpfman. The eBPF ecosystem has a wide variety of tooling, management, and optimization tools that you can use to better apply and optimize observability, networking, or security across your infrastructure.

Here I linked in some more learning docs, to see more on how eBPF came to be, on the infrastructure of eBPF, and on learning how to use eBPF more. There is also an article that goes through how companies, including tech giants like Apple and Netflix, are using eBPF for their tools.

This is the end of my presentation. The GitHub link and the slides will be made available to you. Thank you very much for taking the time to go through this presentation.

Sooter Saalu

Technical Writer @ Draft.dev
