For a while, I’ve been following developments around eBPF, and calling it very promising is an understatement. eBPF brings many new possibilities to our toolbox: performance profiling, tracing, security, networking, and more. But let’s start from the beginning.
By the way, I’m doing this on OSX. For eBPF, you need Linux kernel 4.1 or newer. So, I’ll be running some VMs. This setup should be doable on Linux too. Code is available here.
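Once you have a shell inside a VM, a quick way to check the kernel requirement is a few lines of Python. This is only a sketch: the helper names `parse_kernel_release` and `kernel_ok` are made up for illustration, and the 4.1 threshold is simply the number quoted above.

```python
import platform
import re

# Minimum kernel version quoted in the text above.
MIN_KERNEL = (4, 1)


def parse_kernel_release(release):
    """Turn a release string like '5.15.0-41-generic' into (5, 15)."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    return int(match.group(1)), int(match.group(2))


def kernel_ok(release, minimum=MIN_KERNEL):
    """Return True if the release is at least the minimum (major, minor)."""
    return parse_kernel_release(release) >= minimum


if platform.system() == "Linux":
    release = platform.release()
    print(release, "ok" if kernel_ok(release) else "too old for eBPF")
else:
    print("not Linux; run this inside the VM")
```

On the macOS host this just prints the reminder; inside the VM it reports the running kernel.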
What is eBPF?
A little bit of history
A long time ago, in the last century (1992), Steven McCanne and Van Jacobson wrote the paper The BSD Packet Filter. In short, the goal was to tap into a network interface and filter packets. The result was an in-kernel virtual machine for filtering packets, called the Berkeley Packet Filter (BPF).
In 2014, Alexei Starovoitov and Daniel Borkmann started to work on extending BPF. That is how extended BPF (eBPF) was born. eBPF is not only used for filtering packets; as mentioned, it is also used for tracing, performance profiling, security, and more.
eBPF virtual machine
An eBPF program is compiled to eBPF bytecode. This bytecode is loaded into the kernel, verified, and then executed by the eBPF virtual machine. It uses different maps to communicate with the kernel, store data, and expose data to user space. That, in short, is how eBPF works.
An eBPF program is event-driven. It is attached to a hook and executed when that hook is triggered. There are many different hooks, such as system calls, function entry and exit, network events, etc.
We write eBPF programs in pseudo-C code and then use LLVM to compile them to eBPF bytecode. Or we use eBPF through projects like Cilium, bcc, and bpftrace. They are abstractions on top of eBPF and make our life easier. After the program is compiled to eBPF bytecode, it is loaded into the kernel and verified. If everything is ok, the eBPF bytecode is translated to native code by the JIT compiler and executed.
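To make the load, verify, execute sequence a bit more concrete, here is a toy model in Python. It is not the real eBPF instruction set or verifier: the opcodes (`MOV`, `ADD`, `JEQ`, `EXIT`), the register count, and the functions `verify`, `run`, and `load_and_run` are all invented for illustration. It only mimics two ideas the real verifier is known for: refusing programs that might not terminate (here, any backward jump) and refusing programs that touch invalid state (here, an unknown register or an out-of-range jump).

```python
from typing import List, Tuple

# Toy instruction: (opcode, a, b). This is NOT the real eBPF encoding.
Insn = Tuple[str, int, int]
NUM_REGS = 4


class VerifierError(Exception):
    """Raised when a toy program fails verification."""


def verify(prog: List[Insn]) -> None:
    """Reject programs that could loop forever or touch invalid state."""
    if not prog or prog[-1][0] != "EXIT":
        raise VerifierError("program must end with EXIT")
    for pc, (op, a, b) in enumerate(prog):
        if op in ("MOV", "ADD"):
            # a is the destination register, b is an immediate value
            if not 0 <= a < NUM_REGS:
                raise VerifierError(f"insn {pc}: bad register r{a}")
        elif op == "JEQ":
            # if r0 == a, skip the next b instructions
            if b < 0:
                raise VerifierError(f"insn {pc}: backward jump (possible loop)")
            if pc + 1 + b >= len(prog):
                raise VerifierError(f"insn {pc}: jump out of bounds")
        elif op != "EXIT":
            raise VerifierError(f"insn {pc}: unknown opcode {op}")


def run(prog: List[Insn]) -> int:
    """Interpret an already verified program and return r0."""
    regs = [0] * NUM_REGS
    pc = 0
    while True:
        op, a, b = prog[pc]
        pc += 1
        if op == "MOV":
            regs[a] = b
        elif op == "ADD":
            regs[a] += b
        elif op == "JEQ" and regs[0] == a:
            pc += b
        elif op == "EXIT":
            return regs[0]


def load_and_run(prog: List[Insn]) -> int:
    verify(prog)      # a program that fails this step is never executed
    return run(prog)  # a real kernel would JIT-compile; this toy interprets


good = [("MOV", 0, 40), ("ADD", 0, 2), ("EXIT", 0, 0)]
print(load_and_run(good))  # 42
```

A program such as `[("MOV", 0, 1), ("JEQ", 1, -2), ("EXIT", 0, 0)]` is rejected with a `VerifierError` before a single instruction runs, which is the point of verifying at load time rather than trusting the program.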
In the above image, there are eBPF maps. The eBPF programs use them to collect, store and share data. There are different types of eBPF maps, like hash table, array, ring buffer, etc.
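As a rough sketch of how a map ties the two sides together, the following uses BCC's Python bindings: the eBPF side increments a per-process counter in a hash map each time `clone()` is called, and user space reads the same map a few seconds later. The names `counts`, `count_clone`, `top_counts`, and the `--attach` flag are my own choices for illustration. Running it for real needs BCC installed and root, so `bcc` is imported only inside `main()`.

```python
import sys

# eBPF side (restricted C, compiled by BCC at runtime).
# BPF_HASH declares a hash map named `counts` with u32 keys and u64 values.
PROGRAM = r"""
BPF_HASH(counts, u32, u64);

int count_clone(void *ctx) {
    u32 pid = bpf_get_current_pid_tgid() >> 32;  // upper 32 bits: process ID
    counts.increment(pid);                        // kernel side updates the map
    return 0;
}
"""


def top_counts(pairs, n=5):
    """Return the n (pid, count) pairs with the highest counts."""
    return sorted(pairs, key=lambda kv: kv[1], reverse=True)[:n]


def main(seconds=5):
    import time
    from bcc import BPF  # imported lazily: needs bcc installed and root

    b = BPF(text=PROGRAM)
    b.attach_kprobe(event=b.get_syscall_fnname("clone"), fn_name="count_clone")
    time.sleep(seconds)
    # User space reads the same map the kernel side has been updating.
    pairs = [(k.value, v.value) for k, v in b["counts"].items()]
    for pid, count in top_counts(pairs):
        print("pid %d: %d clone() calls" % (pid, count))


# Attach only when asked explicitly, e.g. `sudo python3 count_clone.py --attach`
# (the file name is hypothetical).
if __name__ == "__main__" and "--attach" in sys.argv:
    main()
```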
Helper functions, tail calls, and function calls are not shown in the image. Helper functions let an eBPF program use functionality that the kernel exposes. Tail calls are used to call other eBPF programs. Function calls are used to call other functions inside the same eBPF program.
Enough theory. Let’s start with some practice…
Setting up the environment (OSX)
I’m doing this on OSX. But even if I were doing this on a Linux box, I would still use VMs. Why? Because I would like to have a clean environment and not mess up my host.
First, I set up a small Git project for messing around. I’ll be using it for all my experiments. All code will be in this repo. And I’ll be using VS Code or the Community Edition of IntelliJ IDEA. My aim is to run the code in VM(s) and keep the IDE on my host. I should be able to easily share files and run code on the VM(s).
Vagrant is a CLI tool for managing VMs. Under the hood, it relies on a provider such as VirtualBox. So, you can use Vagrant and VirtualBox. I’m not a VirtualBox fan, but I’ll give it a try.
I want to use BCC (BPF Compiler Collection) on the first try. BCC is a toolkit for creating efficient kernel tracing and manipulation programs. It is based on eBPF and provides Python bindings for writing the user-space part. So, my Vagrantfile looks like this:
# -*- mode: ruby -*-
# vi: set ft=ruby :

$script = <<-SCRIPT
echo "Provisioning..."
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 4052245BD4284CDD
echo "deb https://repo.iovisor.org/apt/$(lsb_release -cs) $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/iovisor.list
sudo apt-get update
sudo apt-get -y install bcc-tools libbcc-examples linux-headers-$(uname -r)
SCRIPT

Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/bionic64"
  config.vm.provision "shell", inline: $script
end
If you run vagrant up, you should get an Ubuntu 18.04 LTS VM with BCC installed. You can check it with vagrant ssh and then dpkg -l | grep bcc. You should get something like this:
ii  bcc-tools        0.10.0-1              all    Command line tools for BPF Compiler Collection (BCC)
ii  libbcc           0.10.0-1              all    Shared Library for BPF Compiler Collection (BCC)
ii  libbcc-examples  0.10.0-1              amd64  Examples for BPF Compiler Collection (BCC)
ii  libcc1-0:amd64   8.4.0-1ubuntu1~18.04  amd64  GCC cc1 plugin for GDB
ii  python-bcc       0.10.0-1              all    Python wrappers for BPF Compiler Collection (BCC)
Now if you go to the /vagrant folder, you’ll see your host folder with the Vagrantfile. I keep my files there as well, including a Hello World example.
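A minimal sketch of what such a Hello World can look like with BCC's Python bindings follows. It is modeled on the well-known BCC hello world and is not necessarily the exact file from the repo. The function name `hello`, the choice of the `clone()` syscall as the hook, and the `--attach` flag are assumptions for illustration.

```python
import sys

# eBPF side: print a line to the kernel trace pipe every time the hook fires.
PROGRAM = r"""
int hello(void *ctx) {
    bpf_trace_printk("Hello, World!\n");
    return 0;
}
"""


def main():
    from bcc import BPF  # imported lazily: needs bcc installed and root

    b = BPF(text=PROGRAM)  # compile and load the program
    # Attach to the kernel function behind the clone() syscall (the hook).
    b.attach_kprobe(event=b.get_syscall_fnname("clone"), fn_name="hello")
    print("Tracing clone()... press Ctrl-C to stop")
    b.trace_print()  # stream the trace pipe output to stdout


# Attach only when asked explicitly, e.g. `sudo python hello.py --attach`.
if __name__ == "__main__" and "--attach" in sys.argv:
    main()
```

With the program attached, starting any new process in another terminal (for example running `ls`) should produce a Hello, World! line, because the hook fires on `clone()`.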
Well, I want to skip VirtualBox, and I would like to use a newer Ubuntu. So, I decided to try Lima.
According to the docs, it is best suited to run on OSX. It uses Hypervisor.framework to run VMs. To install it, you can use brew install lima or see Getting started.
For a start, I’m going with this config:
images:
  # Try to use release-yyyyMMdd image if available. Note that release-yyyyMMdd will be removed after several months.
  - location: "https://cloud-images.ubuntu.com/releases/22.04/release/ubuntu-22.04-server-cloudimg-amd64.img"
    arch: "x86_64"
  - location: "https://cloud-images.ubuntu.com/releases/22.04/release/ubuntu-22.04-server-cloudimg-arm64.img"
    arch: "aarch64"

mounts:
  - location: "~"
  - location: "/tmp/lima"
    writable: true

provision:
  - mode: system
    script: |
      apt-get update
      apt-get install -y apt-transport-https ca-certificates curl clang llvm jq
      apt-get install -y libelf-dev libpcap-dev libbfd-dev binutils-dev build-essential make
      apt-get install -y linux-tools-common linux-tools-5.15.0-41-generic bpfcc-tools
      apt-get install -y python3-pip
      # Assumption: fetch the Go tarball from go.dev first, since tar needs the file to exist.
      curl -fsSL -o /tmp/go1.20.3.linux-amd64.tar.gz https://go.dev/dl/go1.20.3.linux-amd64.tar.gz
      rm -rf /usr/local/go && tar -C /usr/local -xzf /tmp/go1.20.3.linux-amd64.tar.gz
Start the VM with limactl start --name=ubuntu ubuntu-lst-lima.yml and choose Proceed with the current configuration. When it is done, you can run limactl shell ubuntu to get a shell.
Lima for bpftrace
So, in both previous setups, I could run the hello.py example. But I would also like to run bpftrace. For that, I’m creating a separate config:
images:
  # Try to use release-yyyyMMdd image if available. Note that release-yyyyMMdd will be removed after several months.
  - location: "https://cloud-images.ubuntu.com/releases/22.04/release/ubuntu-22.04-server-cloudimg-amd64.img"
    arch: "x86_64"
  - location: "https://cloud-images.ubuntu.com/releases/22.04/release/ubuntu-22.04-server-cloudimg-arm64.img"
    arch: "aarch64"

mounts:
  - location: "~"
  - location: "/tmp/lima"
    writable: true

provision:
  - mode: system
    script: |
      apt-get update
      echo "deb http://ddebs.ubuntu.com $(lsb_release -cs) main restricted universe multiverse
      deb http://ddebs.ubuntu.com $(lsb_release -cs)-updates main restricted universe multiverse
      deb http://ddebs.ubuntu.com $(lsb_release -cs)-proposed main restricted universe multiverse" | tee -a /etc/apt/sources.list.d/ddebs.list
      sudo apt install -y ubuntu-dbgsym-keyring
      sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys F2EDC64DC5AEE1F6B9C621F0C8CAB6595FDFF622
      apt-get update
      apt-get install -y bpftrace
      apt-get install -y bpftrace-dbgsym libbpfcc-dbgsym
Create the VM with limactl start --name=bpftrace bpftrace.yml and run limactl shell bpftrace to get a shell.
I’m inviting you to join me on this journey. I set up two VMs with Ubuntu 18.04 LTS and Ubuntu 22.04 LTS. I was able to run bpftrace examples. IMHO, this is good enough for a start. For now, I have all I need to start learning eBPF. If you have any suggestions or comments, please let me know.