I’ve been following eBPF for a while, and calling it very promising is an understatement. eBPF adds many new possibilities to our toolbox: performance profiling, tracing, security, networking, and more. But let’s start from the beginning.

By the way, I’m doing this on OSX. For eBPF, you need Linux kernel 4.1 or newer. So, I’ll be running some VMs. This setup should be doable on Linux too. Code is available here.
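If you are not sure whether a machine or VM meets the kernel requirement, a quick check could look like the sketch below. The function names and the MINIMUM constant are my own (hypothetical) helpers, and the script only compares the major and minor version numbers against the 4.1 mentioned above.

```python
import platform
import re

MINIMUM = (4, 1)  # minimum kernel version quoted in this post


def parse_kernel_release(release):
    """Return (major, minor) from a release string like '5.15.0-41-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    return int(match.group(1)), int(match.group(2))


def kernel_is_new_enough(release, minimum=MINIMUM):
    """True when the release's (major, minor) is at least `minimum`."""
    return parse_kernel_release(release) >= minimum


if __name__ == "__main__":
    if platform.system() != "Linux":
        # On the macOS host platform.release() reports the Darwin version,
        # which says nothing about eBPF support.
        print("Not Linux (%s): run this inside the VM" % platform.system())
    else:
        release = platform.release()
        verdict = "OK" if kernel_is_new_enough(release) else "too old"
        print("kernel %s: %s for eBPF" % (release, verdict))
```

Run it inside the VM; on the host it only tells you that you are not on Linux.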

What is eBPF?

A little bit of history

A long time ago, in the last century (1992), Steven McCanne and Van Jacobson wrote the paper The BSD Packet Filter. In short, the goal was to tap a network interface and filter packets. The result is an in-kernel virtual machine that filters packets, called the Berkeley Packet Filter (BPF). Today the name BPF covers much more than packet filtering: tracing, performance profiling, security, etc.

In 2014, Alexei Starovoitov and Daniel Borkmann started to work on extending BPF. That is how extended BPF (eBPF) was born. eBPF is no longer limited to filtering packets; as mentioned, it is also used for tracing, performance profiling, security, etc.

eBPF virtual machine

[Image: eBPF workflow]

An eBPF program is compiled to eBPF bytecode. This bytecode is loaded into the kernel, verified, and then executed by the eBPF virtual machine. The program uses maps to store data and to share it with user space. The image above shows, in short, how eBPF works.

An eBPF program is event-driven: it is attached to a hook and executed when that hook is triggered. There are many different hooks, such as system calls, function entry and exit, network events, etc.

We write eBPF programs in a restricted subset of C and then use LLVM to compile them to eBPF bytecode. Alternatively, we use projects like Cilium, bcc, and bpftrace, which are abstractions on top of eBPF and make our life easier. After the program is compiled to eBPF bytecode, it is loaded into the kernel and verified. If everything is OK, the bytecode is translated to native code by the JIT compiler and executed.

In the above image, there are eBPF maps. The eBPF programs use them to collect, store and share data. There are different types of eBPF maps, like hash table, array, ring buffer, etc.
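To make maps a bit more concrete, here is a minimal sketch that uses BCC's Python bindings, which are only set up later in this post. It declares a hash map in the eBPF program, increments a per-PID counter every time clone() is called, and reads the map from user space after a few seconds. The names counts, count_clone, and top_entries are made up for this illustration, and it assumes BCC is installed and the script runs as root.

```python
import time

try:
    from bcc import BPF
except ImportError:
    BPF = None  # BCC bindings not installed (for example on the macOS host)

# eBPF side: a hash map keyed by PID, incremented on every clone() call.
PROGRAM = r"""
BPF_HASH(counts, u32, u64);

int count_clone(void *ctx)
{
    u32 pid = bpf_get_current_pid_tgid() >> 32;
    counts.increment(pid);
    return 0;
}
"""


def top_entries(pairs, limit=5):
    """Return the `limit` (pid, count) pairs with the highest counts."""
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)[:limit]


def main(seconds=5):
    if BPF is None:
        print("BCC Python bindings not found; run this inside the VM")
        return 1
    b = BPF(text=PROGRAM)
    # Attach the eBPF function to the clone() system call.
    b.attach_kprobe(event=b.get_syscall_fnname("clone"), fn_name="count_clone")
    time.sleep(seconds)
    # User-space side: read the same map the eBPF program wrote to.
    pairs = [(k.value, v.value) for k, v in b["counts"].items()]
    for pid, count in top_entries(pairs):
        print("pid %d called clone() %d times" % (pid, count))
    return 0


if __name__ == "__main__":
    main()
```

The map is the only thing the two sides share: the kernel-side program writes to it, and the Python script reads it.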

Helper functions, tail calls, and function calls are not shown in the image. Helper functions are a fixed API that the kernel exposes to eBPF programs, so programs do not call arbitrary kernel functions directly. Tail calls jump to another eBPF program. Function calls invoke other functions inside the same eBPF program.

Enough theory. Let’s start with some practice…

Setting up the environment (OSX)

I’m doing this on OSX. But, even if I do this on a Linux box, I still use VMs. Why? Because I would like to have a clean environment and not mess up my host.

First, I set up a small Git project for experiments. All code will be in this repo. I’ll be using VS Code or the Community Edition of IntelliJ IDEA. My aim is to run the code in VM(s) while keeping the IDE on my host, so I should be able to easily share files with the VM(s) and run code there.

Vagrant

Vagrant is a CLI tool for managing VMs. Under the hood, it uses VirtualBox by default, so you need both Vagrant and VirtualBox. I’m not a VirtualBox fan, but I’ll give it a try.

I want to use BCC (BPF Compiler Collection) on the first try. BCC is a toolkit for creating efficient kernel tracing and manipulation programs. It is based on eBPF and provides Python bindings. So, my Vagrantfile looks like this:

# -*- mode: ruby -*-
# vi: set ft=ruby :

$script = <<-SCRIPT
  echo "Provisioning..."
  sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 4052245BD4284CDD
  echo "deb https://repo.iovisor.org/apt/$(lsb_release -cs) $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/iovisor.list
  sudo apt-get update
  sudo apt-get -y install bcc-tools libbcc-examples linux-headers-$(uname -r)
SCRIPT

Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/bionic64"
  config.vm.provision "shell", inline: $script
end

If you run vagrant up, you should get an Ubuntu 18.04 LTS VM with BCC installed. You can check it with vagrant ssh and then dpkg -l | grep bcc. You should get something like this:

ii  bcc-tools                        0.10.0-1                            all          Command line tools for BPF Compiler Collection (BCC)
ii  libbcc                           0.10.0-1                            all          Shared Library for BPF Compiler Collection (BCC)
ii  libbcc-examples                  0.10.0-1                            amd64        Examples for BPF Compiler Collection (BCC)
ii  libcc1-0:amd64                   8.4.0-1ubuntu1~18.04                amd64        GCC cc1 plugin for GDB
ii  python-bcc                       0.10.0-1                            all          Python wrappers for BPF Compiler Collection (BCC)

Now if you go to /vagrant folder, you’ll see your host folder with Vagrantfile. There I keep my files as well. I have hello.py as Hello World example:
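The listing of my hello.py is not reproduced here, so take the following as a minimal sketch of what such a file can look like, modelled on the classic BCC Hello World. It attaches a tiny eBPF program to the clone() system call and prints a line every time a new process is created. The try/except around the import and the main() wrapper are my additions, so the file also loads on a machine without BCC.

```python
try:
    from bcc import BPF
except ImportError:
    BPF = None  # BCC bindings not installed (for example on the macOS host)

# eBPF program in restricted C. BCC automatically attaches a function named
# kprobe__<name> to a kprobe on the kernel function <name>.
PROGRAM = r"""
int kprobe__sys_clone(void *ctx)
{
    bpf_trace_printk("Hello, World!\n");
    return 0;
}
"""


def main():
    if BPF is None:
        print("BCC Python bindings not found; run this inside the VM")
        return 1
    # Compile and load the program, then stream the kernel trace pipe
    # until interrupted with Ctrl-C.
    BPF(text=PROGRAM).trace_print()
    return 0


if __name__ == "__main__":
    main()
```

On the Vagrant VM the installed bindings are python-bcc, so I would run it as root with sudo python hello.py from /vagrant, then open a second shell and run any command to trigger some output.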

Well. I want to skip VirtualBox, and I would like to use the newer Ubuntu. So, I decided to try Lima.

Lima

According to the docs, it is best suited to run on OSX. It uses Hypervisor.framework to run VMs. To install it, you can use brew install lima or see Getting started.

For a start, I’m going with this config:

images:
# Try to use release-yyyyMMdd image if available. Note that release-yyyyMMdd will be removed after several months.
- location: "https://cloud-images.ubuntu.com/releases/22.04/release/ubuntu-22.04-server-cloudimg-amd64.img"
  arch: "x86_64"
- location: "https://cloud-images.ubuntu.com/releases/22.04/release/ubuntu-22.04-server-cloudimg-arm64.img"
  arch: "aarch64"

mounts:
- location: "~"
- location: "/tmp/lima"
  writable: true
provision:
- mode: system
  script: |
    apt-get update
    apt-get install -y apt-transport-https ca-certificates curl clang llvm jq
    apt-get install -y libelf-dev libpcap-dev libbfd-dev binutils-dev build-essential make 
    apt-get install -y linux-tools-common linux-tools-$(uname -r) bpfcc-tools
    apt-get install -y python3-pip 
    curl -fsSL -o /tmp/go.tar.gz "https://go.dev/dl/go1.20.3.linux-$(dpkg --print-architecture).tar.gz"
    rm -rf /usr/local/go && tar -C /usr/local -xzf /tmp/go.tar.gz

Just run limactl start --name=ubuntu ubuntu-lst-lima.yml and choose Proceed with the current configuration. When it is done, you can run limactl shell ubuntu to get a shell:

Lima for bpftrace

So, in both previous setups, I could run the hello.py example. But I would also like to run bpftrace. For it, I’m creating a separate config:

images:
# Try to use release-yyyyMMdd image if available. Note that release-yyyyMMdd will be removed after several months.
- location: "https://cloud-images.ubuntu.com/releases/22.04/release/ubuntu-22.04-server-cloudimg-amd64.img"
  arch: "x86_64"
- location: "https://cloud-images.ubuntu.com/releases/22.04/release/ubuntu-22.04-server-cloudimg-arm64.img"
  arch: "aarch64"

mounts:
- location: "~"
- location: "/tmp/lima"
  writable: true
provision:
- mode: system
  script: |
    apt-get update
    echo "deb http://ddebs.ubuntu.com $(lsb_release -cs) main restricted universe multiverse" | tee -a /etc/apt/sources.list.d/ddebs.list
    echo "deb http://ddebs.ubuntu.com $(lsb_release -cs)-updates main restricted universe multiverse" | tee -a /etc/apt/sources.list.d/ddebs.list
    echo "deb http://ddebs.ubuntu.com $(lsb_release -cs)-proposed main restricted universe multiverse" | tee -a /etc/apt/sources.list.d/ddebs.list
    apt-get install -y ubuntu-dbgsym-keyring
    apt-key adv --keyserver keyserver.ubuntu.com --recv-keys F2EDC64DC5AEE1F6B9C621F0C8CAB6595FDFF622
    apt-get update
    apt-get install -y bpftrace 
    apt-get install -y bpftrace-dbgsym libbpfcc-dbgsym    

Create VM with limactl start --name=bpftrace bpftrace.yml and run limactl shell bpftrace to get a shell:

Conclusion

I’m inviting you to join me on this journey. I set up VMs with Ubuntu 18.04 LTS and Ubuntu 22.04 LTS and was able to run the hello.py and bpftrace examples. IMHO, this is good enough for a start: I have all I need to start learning eBPF. If you have any suggestions or comments, please let me know.

Resources