What is eBPF?

8 October 2024

What is eBPF?

eBPF is an event-driven programming technology that enables developers to write efficient, safe and non-intrusive programs that run directly in the Linux operating system (OS) kernel space, effectively “extending” the OS kernel.

The kernel of an operating system is an extremely, and intentionally, stable entity. It supports the entire OS, so by design it can be complicated and labor-intensive to amend or modify. eBPF addresses this extensibility challenge by enabling developers to run sandboxed programs in a privileged context, such as the OS kernel.

The OS stack can be broken down into three logical layers: the hardware layer, the kernel layer and the user layer. The kernel layer is the core of an operating system. It sits between the hardware layer (the physical processors, memory and storage devices that the OS runs on) and the user layer (which houses the web browsers and applications on an OS).

The apps and browsers in the user space must communicate with components of the hardware layer to complete their respective tasks, but each hardware component has specific communication protocols and compatibility requirements. This is where the kernel layer (or kernel space) enters the picture. It services system calls and enables applications to communicate with the underlying hardware components.

eBPF tools help developers more easily expand the features of existing software at runtime without modifying the kernel source code, loading kernel modules (loadable pieces of code that can extend kernel functions) or otherwise disrupting the kernel space.

eBPF technologies represent an evolution of the original Berkeley Packet Filter (BPF), which provided a simple way to select and analyze network packets in a user space program. But beyond packet filtering, BPF programs lacked the flexibility to handle more complex tasks within the kernel.

Recognizing the need for a more versatile technology, the Linux community developed eBPF, which built upon the backend features of BPF but extended its in-kernel programmability. The advanced features of eBPF programs, together with their sandboxed approach, enable developers to implement enhanced packet filtering processes, improve kernel space observability and monitoring capabilities, conduct high-end performance analyses, and enforce kernel-level security policies in both on-premises data centers and cloud-native environments.

Components of eBPF programs

The primary components of an eBPF program are:

eBPF bytecode

eBPF programs are typically written in a restricted subset of C and then compiled into eBPF bytecode with a toolchain such as Clang and LLVM: Clang acts as the front end for the C source, and LLVM provides the back end that emits eBPF bytecode. The bytecode is a restricted set of instructions that adheres to the eBPF instruction set architecture, which makes it possible to check a program for unsafe behavior before it runs.
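To make "bytecode" concrete, here is a minimal sketch in Python of how a single eBPF instruction is laid out: an 8-bit opcode, two 4-bit register fields, a 16-bit offset and a 32-bit immediate, 64 bits in total. The helper name `encode_insn` is hypothetical, and the layout assumes a little-endian host; real programs are produced by the compiler, not packed by hand.

```python
import struct

# Opcode values from the eBPF instruction set: class | operation | source.
BPF_ALU64_MOV_K = 0xB7  # BPF_ALU64 | BPF_MOV | BPF_K: dst = imm
BPF_JMP_EXIT = 0x95     # BPF_JMP | BPF_EXIT: return the value in r0


def encode_insn(opcode, dst=0, src=0, off=0, imm=0):
    """Pack one 64-bit eBPF instruction (hypothetical helper, little-endian)."""
    if not (0 <= dst <= 15 and 0 <= src <= 15):
        raise ValueError("register fields are 4 bits wide")
    regs = (src << 4) | dst  # dst in the low nibble, src in the high nibble
    return struct.pack("<BBhi", opcode, regs, off, imm)


# "r0 = 0; exit" is about the smallest useful program: it returns 0.
program = encode_insn(BPF_ALU64_MOV_K, dst=0, imm=0) + encode_insn(BPF_JMP_EXIT)
```

Each instruction occupies exactly 8 bytes, so `program` here is 16 bytes long. The fixed-width format is part of what makes the bytecode straightforward to analyze before it runs.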

eBPF interpreter/JIT compiler

The Linux kernel can execute eBPF bytecode with an in-kernel interpreter, but just-in-time (JIT) compilation offers superior performance. A JIT compiler translates the bytecode into native machine code for the specific hardware platform the kernel is running on.
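The difference between the two approaches is easier to see with a toy interpreter. The sketch below, in Python, decodes and executes a handful of straight-line eBPF instructions one at a time, which is the work a JIT compiler avoids by translating each instruction to native code once. The function names `insn` and `interpret` are hypothetical, only five opcodes are handled, and jumps, memory access and helper calls are left out.

```python
import struct

MOV_IMM, MOV_REG = 0xB7, 0xBF   # dst = imm / dst = src
ADD_IMM, ADD_REG = 0x07, 0x0F   # dst += imm / dst += src
EXIT = 0x95                     # return r0
MASK64 = (1 << 64) - 1          # registers are 64 bits wide


def insn(opcode, dst=0, src=0, imm=0):
    """Encode one instruction (little-endian layout, offset always 0)."""
    return struct.pack("<BBhi", opcode, (src << 4) | dst, 0, imm)


def interpret(bytecode, ctx=0):
    """Run straight-line bytecode; r1 holds the context value on entry."""
    regs = [0] * 11              # r0..r10
    regs[1] = ctx
    for pc in range(len(bytecode) // 8):
        opcode, reg_byte, _off, imm = struct.unpack_from("<BBhi", bytecode, pc * 8)
        dst, src = reg_byte & 0x0F, reg_byte >> 4
        if opcode == MOV_IMM:
            regs[dst] = imm & MASK64            # immediate is sign-extended
        elif opcode == MOV_REG:
            regs[dst] = regs[src]
        elif opcode == ADD_IMM:
            regs[dst] = (regs[dst] + imm) & MASK64
        elif opcode == ADD_REG:
            regs[dst] = (regs[dst] + regs[src]) & MASK64
        elif opcode == EXIT:
            return regs[0]
        else:
            raise ValueError(f"unsupported opcode {opcode:#x}")
    raise ValueError("program ended without exit")


# r0 = r1; r0 += 2; exit
add_two = insn(MOV_REG, dst=0, src=1) + insn(ADD_IMM, dst=0, imm=2) + insn(EXIT)
result = interpret(add_two, ctx=40)
```

The interpreter pays the decode-and-dispatch cost on every instruction of every run, which is why JIT compilation tends to be preferred for programs attached to busy hooks.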

User space loader

User space loaders are programs in the user space that load the eBPF bytecode into the kernel, attaching it to the appropriate hooks and managing any associated eBPF maps. Examples of user space loaders include tools such as BPF Compiler Collection (BCC) and bpftrace.

eBPF maps

eBPF maps are data structures with key-value pairs and read/write access that provide shared storage space and facilitate interaction between eBPF kernel programs and user space applications. Created and managed through system calls, eBPF maps can also be used to maintain state between different iterations of the eBPF programs.
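As a rough mental model, a map behaves like a fixed-capacity dictionary that one side writes and the other side reads. The Python sketch below is only that, a model: `ToyHashMap`, `on_syscall` and `syscalls_by_pid` are hypothetical names, and the errno-style return values are an assumption about how a real hash map reports a full table or a missing key.

```python
import errno


class ToyHashMap:
    """Fixed-capacity key-value store, loosely modeled on an eBPF hash map."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self._entries = {}

    def update(self, key, value):
        # A new key is refused once the map is full; existing keys can change.
        if key not in self._entries and len(self._entries) >= self.max_entries:
            return -errno.E2BIG
        self._entries[key] = value
        return 0

    def lookup(self, key):
        return self._entries.get(key)   # None plays the role of a NULL result

    def delete(self, key):
        if key not in self._entries:
            return -errno.ENOENT
        del self._entries[key]
        return 0

    def items(self):
        return sorted(self._entries.items())


syscalls_by_pid = ToyHashMap(max_entries=2)


def on_syscall(pid):
    """Kernel side: bump the per-process counter each time the hook fires."""
    count = syscalls_by_pid.lookup(pid) or 0
    return syscalls_by_pid.update(pid, count + 1)


for pid in (100, 100, 200):
    on_syscall(pid)

# User side: read the shared state without touching the "kernel" program.
report = syscalls_by_pid.items()
```

Because the counter lives in the map rather than in the program, it survives from one invocation to the next, which is how state is kept between runs.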

eBPF verifier

The verifier, a critical component of eBPF systems, checks the bytecode when it's loaded into the kernel, before the program is allowed to run, to make sure that the program doesn't contain any harmful operations, such as infinite loops, illegal instructions or out-of-bounds memory access. It also helps make sure that all execution paths of the program terminate.
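The kind of static check involved can be sketched in a few lines of Python. This toy verifier is far simpler than the real one: it works on a made-up tuple format rather than real bytecode, and it rejects every backward jump as a blunt way to guarantee termination. The names `verify`, `is_safe` and `VerifierError`, and the `max_insns` cap, are all assumptions for illustration.

```python
class VerifierError(Exception):
    """Raised when the toy verifier rejects a program."""


def verify(program, max_insns=4096):
    """Reject programs that could loop forever or jump out of bounds.

    Instructions are tuples: ("mov", dst, imm), ("add", dst, imm),
    ("jeq", dst, imm, off), ("ja", off) and ("exit",). Jump offsets are
    relative to the instruction after the jump.
    """
    if not program:
        raise VerifierError("empty program")
    if len(program) > max_insns:
        raise VerifierError("program too large")
    if program[-1][0] != "exit":
        raise VerifierError("last instruction must be exit")
    for pc, insn in enumerate(program):
        op = insn[0]
        if op in ("mov", "add", "jeq") and not 0 <= insn[1] <= 10:
            raise VerifierError(f"insn {pc}: invalid register r{insn[1]}")
        if op in ("ja", "jeq"):
            off = insn[-1]
            if off < 0:
                raise VerifierError(f"insn {pc}: backward jump could loop")
            if pc + 1 + off >= len(program):
                raise VerifierError(f"insn {pc}: jump out of bounds")
        elif op not in ("mov", "add", "exit"):
            raise VerifierError(f"insn {pc}: unknown instruction {op!r}")


def is_safe(program):
    """Convenience wrapper: True if the toy verifier accepts the program."""
    try:
        verify(program)
    except VerifierError:
        return False
    return True


good = [("mov", 0, 1), ("jeq", 0, 1, 1), ("add", 0, 1), ("exit",)]
looping = [("mov", 0, 0), ("ja", -2), ("exit",)]
```

With only forward, in-range jumps and a final `exit`, every path through the program must reach the end, so termination is guaranteed without running anything. A production verifier tracks far more, including the type and bounds of every register along every path.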

eBPF hooks

Hooks are points in the kernel code where eBPF programs can be attached. When the kernel reaches a hook, it runs the attached eBPF program.

Different types of hooks such as tracepoints, kprobes, uprobes and network packet receive queues give eBPF programs broad data access and enable them to complete various operations. Tracepoints, for example, enable programs to inspect and collect data about the kernel or other processes, while traffic control hooks can be used to inspect and modify network packets. And kprobes and uprobes facilitate dynamic tracing at the kernel level and the user level.
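The attach-then-fire relationship can be pictured with a small Python model. Nothing here talks to a real kernel: `ToyHookTable` is a hypothetical class, the hook names are illustrative labels written in a bpftrace-like notation, and the context dictionaries stand in for whatever data a real hook would pass to its program.

```python
class ToyHookTable:
    """Maps hook names to the programs attached to them."""

    def __init__(self):
        self._attached = {}

    def attach(self, hook, program):
        self._attached.setdefault(hook, []).append(program)

    def fire(self, hook, ctx):
        """Run every program attached to `hook`; return how many ran."""
        programs = self._attached.get(hook, [])
        for program in programs:
            program(ctx)
        return len(programs)


opened = []          # what the attached "program" collects
hooks = ToyHookTable()
hooks.attach("tracepoint:syscalls:sys_enter_openat",
             lambda ctx: opened.append(ctx["filename"]))

# The kernel reaching a hook is modeled as a call to fire().
hooks.fire("tracepoint:syscalls:sys_enter_openat", {"filename": "/etc/hosts"})
hooks.fire("kprobe:tcp_connect", {})   # nothing attached, so nothing runs
```

The program only ever sees the context of the hook it is attached to, which is why the choice of hook determines what data a program can inspect.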

eXpress Data Path (XDP)

XDP is a high-performance data path that accelerates packet processing at the network driver level. It enables eBPF programs to decide whether to pass, drop or redirect a packet before the packet even reaches the kernel's networking stack.

The integration of XDP into the Linux kernel (in the mid-2010s) ultimately enabled developers to deploy eBPF-based load balancing functions capable of managing data traffic in even the busiest data centers.
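The shape of such an early decision can be sketched in Python. A real program at this hook is written in restricted C and runs in the driver; the model below only mirrors the logic: check bounds, parse just enough of the frame, return a verdict. The function names, the blocklist parameter and the frame builder are hypothetical, while the two action codes follow the kernel's XDP action numbering.

```python
import socket
import struct

XDP_DROP, XDP_PASS = 1, 2   # verdict codes: discard the packet / hand it on
ETH_P_IP = 0x0800           # EtherType for IPv4
ETH_HLEN = 14               # Ethernet header length in bytes
IPV4_MIN_HLEN = 20


def xdp_filter(packet, blocked_sources):
    """Return a verdict for one raw Ethernet frame (toy model)."""
    # Bounds check before any read, as a real verifier would insist.
    if len(packet) < ETH_HLEN + IPV4_MIN_HLEN:
        return XDP_PASS
    (ethertype,) = struct.unpack_from("!H", packet, 12)
    if ethertype != ETH_P_IP:
        return XDP_PASS
    # The IPv4 source address sits 12 bytes into the IP header.
    src_ip = socket.inet_ntoa(packet[ETH_HLEN + 12:ETH_HLEN + 16])
    return XDP_DROP if src_ip in blocked_sources else XDP_PASS


def make_ipv4_frame(src_ip, dst_ip):
    """Build a minimal Ethernet + IPv4 header pair for experiments."""
    eth = bytes(12) + struct.pack("!H", ETH_P_IP)
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20, 0, 0, 64, 6, 0,
                     socket.inet_aton(src_ip), socket.inet_aton(dst_ip))
    return eth + ip


blocked = {"192.0.2.1"}
verdict = xdp_filter(make_ipv4_frame("192.0.2.1", "198.51.100.7"), blocked)
```

Dropping unwanted traffic at this point is cheap because no further per-packet work has been done yet, which is the property load balancers and denial-of-service filters rely on.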

Helper functions

Because eBPF programs cannot call arbitrary kernel functions and must stay compatible across kernel versions, the basic eBPF instruction set alone isn't always expressive enough to run advanced operations. Helper functions bridge this gap.

Helper functions are a set of predefined kernel functions, exposed through a stable API, that eBPF programs can call. They provide a way for eBPF programs to complete more complex operations (such as getting the current time or generating random numbers) that aren't directly supported by the instruction set.
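One way to picture helpers is as a fixed table of numbered functions that a program may call, with everything outside the table off limits. The Python sketch below models that idea only. The table `HELPERS` and the function `call_helper` are hypothetical, the two IDs are my understanding of the kernel's helper numbering for `bpf_ktime_get_ns` and `bpf_get_prandom_u32` and should be treated as illustrative, and the Python standard library stands in for the kernel implementations.

```python
import random
import time

# Helper IDs (illustrative; assumed to match the kernel's numbering).
BPF_FUNC_ktime_get_ns = 5
BPF_FUNC_get_prandom_u32 = 7

# The only functions a toy program is allowed to call.
HELPERS = {
    BPF_FUNC_ktime_get_ns: time.monotonic_ns,               # nanosecond clock
    BPF_FUNC_get_prandom_u32: lambda: random.getrandbits(32),
}


def call_helper(func_id, *args):
    """Dispatch a helper call by ID; unknown IDs are refused outright."""
    helper = HELPERS.get(func_id)
    if helper is None:
        raise PermissionError(f"helper {func_id} is not available")
    return helper(*args)


started = call_helper(BPF_FUNC_ktime_get_ns)
sample = call_helper(BPF_FUNC_get_prandom_u32)
elapsed_ns = call_helper(BPF_FUNC_ktime_get_ns) - started
```

Keeping the callable surface to a known list is what lets the kernel evolve internally while programs that use only helpers keep working.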

How does eBPF work?

Generally, eBPF operates as a virtual machine (VM) inside the Linux kernel: it defines a low-level instruction set architecture and runs eBPF bytecode. The process of running an eBPF program tends to follow certain major steps.

Developers first write the eBPF program and compile the bytecode. The program's purpose will dictate the appropriate type of code. For instance, if a team wants to monitor CPU usage, it will write code that includes functions for capturing usage metrics.

After the compiler converts the high-level C code into lower-level bytecode, a user space loader issues a bpf() system call to load the program into the kernel. The loader is also responsible for handling errors and setting up any eBPF maps the program needs.

With the program bytecode and maps in place, the kernel runs the eBPF verifier to confirm the program is safe to run. If the program is deemed unsafe, the system call to load it fails, and the loader receives an error message. If the program passes verification, it's allowed to run.

The kernel then either interprets the bytecode or uses a JIT compiler to translate it into native machine code. Because eBPF is an event-driven technology, the program runs only in response to specific hook points or events within the kernel (system calls, network events, process initiation or CPU idling, for example). When such an event occurs, the kernel runs the attached eBPF program, enabling developers to inspect and manipulate various components of the system.

When the eBPF program is running, developers can interact with it from the user space using eBPF maps. For example, the application might periodically check a map to collect data from the eBPF program, or it might update a map to change the program's behavior.

Unloading the program is the final step of most eBPF execution processes. When the eBPF program has done its job, the loader detaches it from its hook and releases its references to it, at which point the kernel stops running the program and frees its associated resources. The cleanup process might also include removing individual elements from any eBPF maps the team no longer needs (by using the element-delete command of the bpf() system call) and then releasing the maps themselves.
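The ordering of these steps can be condensed into a short Python model: load (which verifies and sets up maps), attach, run on events, read the map from user space, and unload. Everything here is a stand-in. `ToyKernel`, `ToyProgram` and `LoadError` are hypothetical names, the "bytecode" is a list of strings, and verification is reduced to a single check.

```python
from dataclasses import dataclass, field
from typing import Callable


class LoadError(Exception):
    """Raised when the toy verifier rejects a program at load time."""


@dataclass
class ToyProgram:
    insns: list          # stand-in for compiled bytecode
    handler: Callable    # stand-in for what the bytecode does


@dataclass
class ToyKernel:
    hooks: dict = field(default_factory=dict)   # hook name -> program
    maps: dict = field(default_factory=dict)    # map name -> dict

    def load(self, prog, map_names):
        # Verification happens during load; failure aborts the whole load.
        if not prog.insns or prog.insns[-1] != "exit":
            raise LoadError("program must end with exit")
        for name in map_names:
            self.maps.setdefault(name, {})
        return prog

    def attach(self, hook, prog):
        self.hooks[hook] = prog

    def fire(self, hook, ctx):
        # The program runs only when its event occurs.
        prog = self.hooks.get(hook)
        if prog is not None:
            prog.handler(ctx, self.maps)

    def unload(self, hook, map_names):
        # Detach the program, then release the maps it used.
        self.hooks.pop(hook, None)
        for name in map_names:
            self.maps.pop(name, None)


def count_by_pid(ctx, maps):
    counts = maps["counts"]
    counts[ctx["pid"]] = counts.get(ctx["pid"], 0) + 1


kernel = ToyKernel()
prog = kernel.load(ToyProgram(["call", "exit"], count_by_pid), ["counts"])
kernel.attach("sys_enter", prog)
for pid in (7, 7, 9):
    kernel.fire("sys_enter", {"pid": pid})
snapshot = dict(kernel.maps["counts"])   # user space reads the map
kernel.unload("sys_enter", ["counts"])
```

Note that a rejected program never reaches `attach`: in this model, as in the description above, a failed load surfaces as an error in the loader and nothing is left behind in the kernel.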

eBPF vs. BPF

The Berkeley Packet Filter (BPF) was originally developed as a mechanism for packet filtering in Unix-based systems, allowing user-level code to define filters that could efficiently capture and process network packets within the kernel. Because unwanted packets were discarded in the kernel, this approach minimized the processing power needed to copy unnecessary data to the user space, which streamlined packet capture.

BPF uses a kernel agent to process packets at the networking stack's entry point. After a BPF program is developed, it's loaded into the kernel space by a BPF kernel agent, which validates the program before attaching it to the relevant socket. Consequently, the user space application receives only the packets that match the BPF program's filter on that socket. The validation step limits a program's access to permitted memory areas and prevents potential kernel crashes.
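A classic BPF filter is itself a tiny program: a list of (code, jt, jf, k) instructions run against each packet, returning how many bytes to keep (0 means drop). The Python sketch below interprets the four-instruction filter that, to my knowledge, tcpdump generates for the expression `ip` on an Ethernet link; the accept length 262144 is assumed to be tcpdump's usual default snapshot length. `run_filter` is a hypothetical name and supports only the three opcodes this filter needs.

```python
import struct

# Classic BPF opcodes used by this filter.
LDH_ABS = 0x28   # A = 16-bit big-endian word at packet[k]
JEQ_K = 0x15     # if A == k skip jt instructions, otherwise skip jf
RET_K = 0x06     # return k: bytes of the packet to keep (0 drops it)

# Each instruction is (code, jt, jf, k).
IP_ONLY = [
    (LDH_ABS, 0, 0, 12),       # load the EtherType field
    (JEQ_K, 0, 1, 0x0800),     # IPv4? fall through, else skip one
    (RET_K, 0, 0, 262144),     # accept
    (RET_K, 0, 0, 0),          # reject
]


def run_filter(prog, packet):
    """Interpret a classic BPF filter over one packet (toy subset)."""
    acc, pc = 0, 0
    while pc < len(prog):
        code, jt, jf, k = prog[pc]
        pc += 1
        if code == LDH_ABS:
            if k + 2 > len(packet):
                return 0           # out-of-bounds load rejects the packet
            (acc,) = struct.unpack_from("!H", packet, k)
        elif code == JEQ_K:
            pc += jt if acc == k else jf
        elif code == RET_K:
            return k
        else:
            raise ValueError(f"unsupported opcode {code:#x}")
    return 0


ipv4_frame = bytes(12) + b"\x08\x00" + bytes(20)
ipv6_frame = bytes(12) + b"\x86\xdd" + bytes(40)
```

Here `run_filter(IP_ONLY, ipv4_frame)` yields the accept length and `run_filter(IP_ONLY, ipv6_frame)` yields 0, so only IPv4 frames would be delivered to the socket. With one accumulator and forward-only jumps, this model shows why classic BPF was well suited to filtering but little else.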

eBPF first emerged in 2014, at which point it represented a significant evolution of the original BPF concept. In addition to the original networking use cases, eBPF applications broadened to include system calls and other functions, which is why developers often referred to it as the "fully extended Berkeley Packet Filter."

One of the key areas where eBPF excels is network performance monitoring. It enables IT teams to conduct real-time analyses and troubleshooting by providing granular insights into network behavior, performance metrics and bottlenecks. eBPF plays a key role in network security, monitoring and filtering system calls and network activities, enforcing network security policies and detecting system anomalies.

eBPF also offers developers a valuable tool for tracing and profiling both kernel and user-space applications and running custom actions and data transformations as data traverses the kernel, further enhancing its versatility and utility. Due to these expansive capabilities (which go far beyond packet filtering), eBPF is now recognized as a stand-alone term, rather than an acronym for extended Berkeley Packet Filter.

Advancements in eBPF technology have prompted software developers to extend it to other operating systems, so that non-Linux-based platforms can take advantage of eBPF’s sophisticated tracing, networking and monitoring capabilities.¹

In fact, the eBPF Foundation—an extension of the Linux Foundation whose members include Google, Meta, Netflix, Microsoft, Intel and Isovalent, among others—has invested heavily in the expansion of OS compatibility for eBPF programs, hoping to eventually broaden the usefulness of eBPF programming.²

Though BPF laid the groundwork for efficient packet filtering, eBPF has expanded its scope considerably. Modern eBPF provides a comprehensive tool for optimized observability, performance and security in Linux systems. Its ability to run dynamic, user-defined programs within the kernel creates new possibilities for system monitoring and management, making eBPF a valuable tool for software developers.

eBPF use cases and benefits

eBPF technologies have already become a cornerstone of modern Linux systems, enabling fine-grained control over the Linux kernel and empowering companies to build more innovative programs within the Linux ecosystem.

eBPF has facilitated advancements in:

Networking and network filtering

eBPF enables developers to install faster, more tailored packet processing features, load balancing processes, application profiling scripts and network monitoring practices. Open-source platforms, such as Cilium, deploy eBPF to provide secure, scalable, observable networking for Kubernetes clusters and workloads, and other containerized microservices.

eBPF also helps IT teams impose simple and complex rules early in the event path for more effective traffic routing, content filtering and loss prevention. Using kernel-level packet forwarding logic, eBPF can minimize latency, streamline routing processes and enable faster overall network response.

Observability

As apps are broken down into microservices, observability into the user space can become challenging. eBPF gives monitoring tools a kernel-space point of view so that observability remains intact end-to-end.

eBPF allows developers to instrument the kernel and user space applications to collect detailed performance data and metrics without significantly impacting the system's performance. These capabilities enable real-time monitoring and observability for each network component (and its dependencies).

Real-time monitoring

eBPF can monitor system calls, network traffic and system behavior at both the kernel and socket levels to detect and respond to potential security threats in real time. Falco (a cloud-native runtime security tool), for example, uses eBPF to implement runtime security auditing and incident response, enhancing the overall security of the system.

Performance tuning

Many eBPF tools can trace system calls, monitor CPU usage and track resource usage (disk I/O, as one example). These features can help developers more easily probe for bottlenecks in system performance, implement debugging protocols and identify optimization opportunities.
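Tracing tools in this space commonly keep overhead low by aggregating measurements inside the kernel and exporting only a compact summary, often a power-of-two histogram of latencies. The Python sketch below shows that bucketing on its own. The function name `log2_histogram` and the bucket numbering are choices made for this example and are not claimed to match any particular tool's output, and the sample values are invented.

```python
from collections import Counter


def log2_histogram(samples):
    """Count non-negative integers into power-of-two buckets.

    Bucket 0 holds only the value 0; bucket b (b >= 1) holds values v
    with 2**(b - 1) <= v < 2**b.
    """
    hist = Counter()
    for value in samples:
        if value < 0:
            raise ValueError("samples must be non-negative")
        hist[value.bit_length()] += 1
    return dict(sorted(hist.items()))


# Invented disk I/O latencies in microseconds.
latencies_us = [0, 1, 2, 3, 4, 7, 8, 1000]
summary = log2_histogram(latencies_us)
```

A histogram like this takes a few dozen counters no matter how many events occur, so a long tail (such as the single sample near 1000 here, which lands in bucket 10) stands out without every event having to be copied to user space.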

Security policies

eBPFs can install and enforce kernel-level security policies (network traffic filters, firewalls and behavior restrictions, for instance) and safety checks to prevent bad actors and unauthorized users from accessing the network.

eBPF in microservices architectures

In a microservices architecture, visibility into production workloads within the container is vital. However, traditional observability tools can struggle to keep up with containerized microservices.

Containers are ephemeral by design; they’re created when needed and destroyed as soon as they’ve served their purpose. Each container acts like an individual host, and in a production environment, the sheer volume of metrics they create can easily overwhelm standard app, network and infrastructure monitoring tools. Virtual machines can behave similarly, but the rapid-cycling nature of containers can complicate telemetry capture.

Furthermore, containers are often deployed in large numbers in cloud environments, making visibility even more challenging.

eBPF, which runs in the kernel of the host that the containers share, enables developers to gather telemetry from short-lived containers. It helps integrate network, application and infrastructure visibility into a unified eBPF-based service. With eBPF, developers can capture data on processes, memory usage, network activity and file access at the container level, even if the containers aren’t deployed in the cloud.

eBPF and Kubernetes

Similarly, in Kubernetes-based containerized environments, eBPF uses a single interface and toolset to collect data from disparate clusters so that IT teams don’t have to deploy individual user space agents to complete data collection tasks across the network. eBPF tools can run on control plane nodes (for API server monitoring, for instance) and monitor worker nodes to generate insights, correlating data points and insights from both node types for fine-tuned cluster observability.

Footnotes

¹Foundation Proposes Advancing eBPF Adoption Across Multiple OSes, DevOps.com, 21 August 2021.

²Latest eBPF Advances Are Harbingers of Major Changes to IT, DevOps.com, 13 September 2023.

Authors

Chrystal R. China

Michael Goodwin
