Syscall Sandboxing with seccomp-BPF (on Yocto)

When a network-connected application on an embedded device gets compromised, the attacker’s first move is typically to spawn a shell or execute additional tools. This tutorial demonstrates how to use seccomp-BPF to block process spawning at the kernel level, limiting the blast radius of a successful exploit. Even if an attacker gains code execution within your application, they cannot break out to a shell.

This approach is used in production by Chrome, Docker, systemd, and Android. There are no secrets involved and nothing to leak—the kernel simply enforces a syscall policy. The application either works correctly or dies loudly when attempting a blocked operation.

All files required for this tutorial are provided inline below. Create the layer structure and copy the code blocks into the respective files.

Yocto Project version compatibility

Yocto Project   Codename    Supported
5.0             Scarthgap   yes
4.0             Kirkstone   yes

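Earlier releases may work but have not been tested. Seccomp filter mode (CONFIG_SECCOMP_FILTER, which depends on CONFIG_SECCOMP) has been available since Linux 3.5, but the SCMP_ACT_KILL_PROCESS and SCMP_ACT_LOG actions used in this tutorial require Linux 4.14 or later. The linux-yocto kernels in both releases above are newer than that. See seccomp(2) for the system call documentation.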

Prerequisites

This tutorial assumes a working Yocto build environment. Your build host must satisfy the standard Yocto Project requirements. If you prefer reproducible builds with kas, see Using kas to reproduce your Yocto builds.

The threat model

Any code that processes external input is attack surface. This includes:

  • Network services — HTTP endpoints, MQTT subscribers, REST APIs, or any socket-based communication. A malformed request can trigger buffer overflows or injection vulnerabilities.
  • Configuration files — Applications parsing YAML, JSON, XML, or custom formats from /etc, /data, or user-writable locations. A crafted configuration file can exploit parser vulnerabilities or trigger unintended code paths.
  • Peripheral input — Serial data, USB devices, or sensor readings that are parsed without sufficient validation.

The common pattern: untrusted data reaches a parser, and a vulnerability in that parser grants the attacker code execution within the application’s context. Traditional security measures focus on preventing the initial breach, but defense in depth assumes that breaches will occur and limits the damage.

Once an attacker has code execution, their next step is typically to:

  1. Spawn a shell (execve("/bin/sh", ...))
  2. Download and execute additional tools
  3. Pivot to other processes or systems

With seccomp-BPF, we block these post-exploitation steps at the kernel level:

  • Process spawning (execve, fork, clone) — prevents shell execution and tool deployment
  • Process debugging (ptrace) — prevents injection into other processes
  • System modification (mount, chmod) — prevents persistence mechanisms

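The filter is applied by the application itself before processing untrusted input. Once loaded, the kernel enforces it: filters can only be added, never removed, so neither the application nor an attacker who has compromised it can weaken the policy. The demo filter in this tutorial covers process spawning and ptrace; mount and chmod can be blocked with the same kind of rule.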
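One way to observe this from inside a process is the Seccomp: field of /proc/self/status, which the kernel reports as 0 (disabled), 1 (strict) or 2 (filter). The sketch below is not part of the demo application; the function names are made up for illustration, and the 4 KiB buffer is an assumption that the status file fits in it.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns the value of the "Seccomp:" field from the
 * text of /proc/<pid>/status, or -1 if the field is not present. */
int seccomp_mode_from_status(const char *status_text)
{
    const char *line = status_text;

    while (line != NULL && *line != '\0') {
        int mode;

        /* The space in the format also matches the tab the kernel emits.
         * "Seccomp_filters:" does not match because of the '_' after
         * "Seccomp". */
        if (sscanf(line, "Seccomp: %d", &mode) == 1)
            return mode;

        /* Advance to the start of the next line, if any. */
        line = strchr(line, '\n');
        if (line != NULL)
            line++;
    }
    return -1;
}

/* Hypothetical helper: reads the seccomp mode of the calling process.
 * Assumes the status file fits into the buffer. */
int current_seccomp_mode(void)
{
    char buf[4096];
    size_t n;
    FILE *f = fopen("/proc/self/status", "r");

    if (f == NULL)
        return -1;
    n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';
    return seccomp_mode_from_status(buf);
}
```

Calling current_seccomp_mode() before and after the filter is loaded should show the value change from 0 to 2 on a kernel that exposes this field.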

Layer structure

Create a layer named meta-devicedojo-seccomp. First, set up the directory structure:

mkdir -p meta-devicedojo-seccomp/conf
cd meta-devicedojo-seccomp

The complete layer will have this structure:

meta-devicedojo-seccomp/
├── conf/
│   └── layer.conf
├── recipes-demo/
│   ├── images/
│   │   └── devicedojo-seccomp-image.bb
│   └── seccomp-demo/
│       ├── seccomp-demo_1.0.bb
│       └── files/
│           └── seccomp-demo.c
└── recipes-kernel/
    └── linux/
        ├── linux-yocto_%.bbappend
        └── files/
            └── seccomp.cfg

Layer configuration

cat > conf/layer.conf << 'EOF'
BBPATH .= ":${LAYERDIR}"

BBFILES += "${LAYERDIR}/recipes-*/*/*.bb \
            ${LAYERDIR}/recipes-*/*/*.bbappend"

BBFILE_COLLECTIONS += "devicedojo-seccomp"
BBFILE_PATTERN_devicedojo-seccomp = "^${LAYERDIR}/"
BBFILE_PRIORITY_devicedojo-seccomp = "6"

LAYERDEPENDS_devicedojo-seccomp = "core"
LAYERSERIES_COMPAT_devicedojo-seccomp = "kirkstone scarthgap"
EOF

Kernel configuration fragment

The kernel must have seccomp support enabled. The audit options are not strictly required but provide visibility into blocked syscalls via dmesg.

mkdir -p recipes-kernel/linux/files

cat > recipes-kernel/linux/files/seccomp.cfg << 'EOF'
CONFIG_SECCOMP=y
CONFIG_SECCOMP_FILTER=y
CONFIG_AUDIT=y
CONFIG_AUDITSYSCALL=y
EOF

cat > recipes-kernel/linux/linux-yocto_%.bbappend << 'EOF'
FILESEXTRAPATHS:prepend := "${THISDIR}/files:"

SRC_URI += "file://seccomp.cfg"
EOF
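A fragment can be silently dropped if another fragment or a defconfig overrides it, so it is worth checking the resulting kernel configuration. As a rough sketch, the helper below scans the text of a kernel .config for an option set to y. How you obtain that text (from the kernel build directory, or from the target if it exposes its configuration) is left open here, and the function name is hypothetical.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if `config_text` (the contents of a kernel
 * .config) contains a line that is exactly "<option>=y". */
bool config_option_enabled(const char *config_text, const char *option)
{
    size_t optlen = strlen(option);
    const char *line = config_text;

    while (line != NULL && *line != '\0') {
        /* Match the whole line so that CONFIG_SECCOMP does not match
         * CONFIG_SECCOMP_FILTER=y or "# CONFIG_SECCOMP is not set". */
        if (strncmp(line, option, optlen) == 0 &&
            strncmp(line + optlen, "=y", 2) == 0 &&
            (line[optlen + 2] == '\n' || line[optlen + 2] == '\0'))
            return true;

        /* Advance to the start of the next line, if any. */
        line = strchr(line, '\n');
        if (line != NULL)
            line++;
    }
    return false;
}
```

Checking CONFIG_SECCOMP and CONFIG_SECCOMP_FILTER this way covers the two options the filter depends on; the audit options only affect logging.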

The demo application

The application is a minimal HTTP client that applies a seccomp filter before making network requests. It uses libseccomp for readability—you could also construct raw BPF programs, but that is significantly more error-prone. The demo fetches data from httpbin.org, a simple HTTP request/response service.

mkdir -p recipes-demo/seccomp-demo/files

cat > recipes-demo/seccomp-demo/files/seccomp-demo.c << 'EOF'
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netdb.h>
#include <seccomp.h>
#include <errno.h>

#define TARGET_HOST "httpbin.org"
#define TARGET_PORT "80"
#define TARGET_PATH "/ip"

static int apply_seccomp_filter(void) {
    scmp_filter_ctx ctx;
    int rc;

    /* Default action: allow all syscalls */
    ctx = seccomp_init(SCMP_ACT_ALLOW);
    if (ctx == NULL) {
        fprintf(stderr, "seccomp_init failed\n");
        return -1;
    }

    /* Block process spawning */
    rc = seccomp_rule_add(ctx, SCMP_ACT_KILL_PROCESS, SCMP_SYS(execve), 0);
    if (rc < 0) goto fail;

    rc = seccomp_rule_add(ctx, SCMP_ACT_KILL_PROCESS, SCMP_SYS(execveat), 0);
    if (rc < 0) goto fail;

    rc = seccomp_rule_add(ctx, SCMP_ACT_KILL_PROCESS, SCMP_SYS(fork), 0);
    if (rc < 0) goto fail;

    rc = seccomp_rule_add(ctx, SCMP_ACT_KILL_PROCESS, SCMP_SYS(vfork), 0);
    if (rc < 0) goto fail;

    rc = seccomp_rule_add(ctx, SCMP_ACT_KILL_PROCESS, SCMP_SYS(clone), 0);
    if (rc < 0) goto fail;

    rc = seccomp_rule_add(ctx, SCMP_ACT_KILL_PROCESS, SCMP_SYS(clone3), 0);
    if (rc < 0) goto fail;

    /* Defense in depth: block debugging */
    rc = seccomp_rule_add(ctx, SCMP_ACT_KILL_PROCESS, SCMP_SYS(ptrace), 0);
    if (rc < 0) goto fail;

    rc = seccomp_load(ctx);
    if (rc < 0) {
        /* Return directly: the fail label reports seccomp_rule_add errors. */
        fprintf(stderr, "seccomp_load failed: %s\n", strerror(-rc));
        seccomp_release(ctx);
        return -1;
    }

    seccomp_release(ctx);
    printf("[+] seccomp filter applied: execve/fork/clone blocked\n");
    return 0;

fail:
    seccomp_release(ctx);
    fprintf(stderr, "seccomp_rule_add failed: %s\n", strerror(-rc));
    return -1;
}

static int do_http_get(void) {
    struct addrinfo hints, *res, *rp;
    int sockfd, rc;
    char request[256];
    char response[4096];
    ssize_t n;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;

    printf("[*] Resolving %s...\n", TARGET_HOST);
    rc = getaddrinfo(TARGET_HOST, TARGET_PORT, &hints, &res);
    if (rc != 0) {
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(rc));
        return -1;
    }

    for (rp = res; rp != NULL; rp = rp->ai_next) {
        sockfd = socket(rp->ai_family, rp->ai_socktype, rp->ai_protocol);
        if (sockfd == -1) continue;

        if (connect(sockfd, rp->ai_addr, rp->ai_addrlen) == 0)
            break;

        close(sockfd);
    }

    freeaddrinfo(res);

    if (rp == NULL) {
        fprintf(stderr, "Could not connect to %s\n", TARGET_HOST);
        return -1;
    }

    printf("[*] Connected to %s:%s\n", TARGET_HOST, TARGET_PORT);

    snprintf(request, sizeof(request),
        "GET %s HTTP/1.1\r\n"
        "Host: %s\r\n"
        "Connection: close\r\n"
        "\r\n",
        TARGET_PATH, TARGET_HOST);

    n = write(sockfd, request, strlen(request));
    if (n < 0) {
        perror("write");
        close(sockfd);
        return -1;
    }

    printf("[*] Request sent, reading response...\n");

    n = read(sockfd, response, sizeof(response) - 1);
    if (n < 0) {
        perror("read");
        close(sockfd);
        return -1;
    }

    response[n] = '\0';
    close(sockfd);

    printf("[+] Response received (%zd bytes):\n", n);
    printf("--- BEGIN RESPONSE ---\n");

    char *line = response;
    int lines = 0;
    while (line && lines < 10) {
        char *next = strchr(line, '\n');
        if (next) *next = '\0';
        printf("%s\n", line);
        if (next) line = next + 1;
        else break;
        lines++;
    }
    printf("--- END RESPONSE ---\n");

    return 0;
}

#ifdef SIMULATE_ATTACK
static void simulated_attack(void) {
    printf("\n[!] SIMULATING ATTACK: attempting to spawn shell...\n");
    int rc = system("/bin/sh -c 'echo ATTACKER SHELL SPAWNED; id'");
    if (rc == -1) {
        printf("[!] system() returned error: %s\n", strerror(errno));
    } else {
        printf("[!] system() returned: %d\n", rc);
    }
}
#endif

int main(int argc, char *argv[]) {
    int use_seccomp = 1;

    printf("=== seccomp-BPF Demo ===\n\n");

    if (argc > 1 && strcmp(argv[1], "--no-seccomp") == 0) {
        use_seccomp = 0;
        printf("[!] WARNING: seccomp disabled (unsafe mode)\n\n");
    }

    if (use_seccomp) {
        if (apply_seccomp_filter() < 0) {
            fprintf(stderr, "Failed to apply seccomp filter, aborting\n");
            return 1;
        }
    }

    if (do_http_get() < 0) {
        fprintf(stderr, "HTTP request failed\n");
        return 1;
    }

#ifdef SIMULATE_ATTACK
    simulated_attack();
#endif

    printf("\n[+] Demo complete\n");
    return 0;
}
EOF

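The filter uses a blocklist approach: allow everything by default, then explicitly block dangerous syscalls. This is more robust against libc updates than an allowlist, which would break whenever libc starts using a new syscall internally. Note that threads are also created with clone (or clone3), so this filter prevents the process from starting threads as well. That is fine for the single-threaded demo, but a threaded application needs a narrower rule.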
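For comparison with the libseccomp calls above, here is roughly what a hand-written classic BPF blocklist looks like when it blocks only execve. This is a simplified sketch, not the program libseccomp generates: the function names are invented for illustration, the caller passes the AUDIT_ARCH_* value of the target, and details such as multiple ABIs on one architecture are ignored.

```c
#include <stddef.h>
#include <string.h>
#include <linux/audit.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>

/* Hypothetical helper: writes a 7-instruction filter into `out` that kills
 * the process on execve and allows every other syscall. Returns the number
 * of instructions, or 0 if `cap` is too small. */
size_t build_execve_blocklist(struct sock_filter *out, size_t cap,
                              unsigned int audit_arch)
{
    const struct sock_filter prog[] = {
        /* [0] Load the architecture the syscall was made with. */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, arch)),
        /* [1] Expected architecture: skip [2]. Anything else: run [2]. */
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, audit_arch, 1, 0),
        /* [2] Unknown ABI, syscall numbers cannot be trusted: kill. */
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL_PROCESS),
        /* [3] Load the syscall number. */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
        /* [4] execve: run [5]. Anything else: skip to [6]. */
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_execve, 0, 1),
        /* [5] Blocked syscall. */
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL_PROCESS),
        /* [6] Default action. */
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
    };
    const size_t count = sizeof(prog) / sizeof(prog[0]);

    if (cap < count)
        return 0;
    memcpy(out, prog, sizeof(prog));
    return count;
}

/* Hypothetical helper: installs the filter for the calling process.
 * Returns 0 on success, -1 on error (errno is set by prctl). */
int install_filter(struct sock_filter *insns, size_t count)
{
    struct sock_fprog fprog = {
        .len = (unsigned short)count,
        .filter = insns,
    };

    /* An unprivileged process must set no_new_privs before loading a filter. */
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
        return -1;
    return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &fprog);
}
```

Every additional syscall needs another compare-and-jump pair with hand-counted offsets, which is where most mistakes in raw filters come from and why the demo uses libseccomp instead.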

The recipe

The recipe builds two variants: seccomp-demo for normal operation and seccomp-demo-attack which simulates what happens when an attacker gains code execution and tries to spawn a shell.

cat > recipes-demo/seccomp-demo/seccomp-demo_1.0.bb << 'EOF'
SUMMARY = "seccomp-BPF demonstration application"
DESCRIPTION = "Network client demonstrating syscall filtering"
LICENSE = "MIT"
LIC_FILES_CHKSUM = "file://${COMMON_LICENSE_DIR}/MIT;md5=0835ade698e0bcf8506ecda2f7b4f302"

SRC_URI = "file://seccomp-demo.c"

DEPENDS = "libseccomp"

S = "${WORKDIR}"

do_compile() {
    ${CC} ${CFLAGS} ${LDFLAGS} -o seccomp-demo seccomp-demo.c -lseccomp
    ${CC} ${CFLAGS} ${LDFLAGS} -DSIMULATE_ATTACK -o seccomp-demo-attack seccomp-demo.c -lseccomp
}

do_install() {
    install -d ${D}${bindir}
    install -m 0755 seccomp-demo ${D}${bindir}/
    install -m 0755 seccomp-demo-attack ${D}${bindir}/
}
EOF

The image

The image includes strace for analyzing which syscalls the application uses—useful for understanding what to allow or block.

mkdir -p recipes-demo/images

cat > recipes-demo/images/devicedojo-seccomp-image.bb << 'EOF'
SUMMARY = "DeviceDojo seccomp demo image"

inherit core-image

IMAGE_INSTALL += " \
    seccomp-demo \
    strace \
"
EOF

Building

This tutorial assumes you have an existing Yocto build environment. If you need a starting point with Mender support, the meta-mender-community repository provides ready-to-use configurations, such as kas/qemuarm64.yml.

Add the layer to your build:

bitbake-layers add-layer /path/to/meta-devicedojo-seccomp

Then build the image:

bitbake devicedojo-seccomp-image

Running the demo

Start QEMU with user-mode networking to allow outbound connections:

runqemu qemuarm64 nographic slirp

Once booted, run the normal application:

seccomp-demo

Expected output:

=== seccomp-BPF Demo ===

[+] seccomp filter applied: execve/fork/clone blocked
[*] Resolving httpbin.org...
[*] Connected to httpbin.org:80
[*] Request sent, reading response...
[+] Response received (... bytes):
--- BEGIN RESPONSE ---
HTTP/1.1 200 OK
...
--- END RESPONSE ---

[+] Demo complete

Now run the attack simulation with seccomp enabled:

seccomp-demo-attack

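The process is killed with SIGSYS as soon as system() tries to create the child process. This typically happens at the clone (or clone3) call that libc uses to spawn the child, before execve is ever reached: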

=== seccomp-BPF Demo ===

[+] seccomp filter applied: execve/fork/clone blocked
[*] Resolving httpbin.org...
...
[!] SIMULATING ATTACK: attempting to spawn shell...
Bad system call
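The "Bad system call" message comes from the shell reporting that its child died from SIGSYS. A supervising process can detect the same condition from the wait status. The following is a minimal sketch of that check, with hypothetical function names that do not appear in the demo:

```c
#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical helper: true if a status returned by waitpid() says the
 * child was terminated by SIGSYS, the signal seccomp's kill action uses. */
bool killed_by_sigsys(int wait_status)
{
    return WIFSIGNALED(wait_status) && WTERMSIG(wait_status) == SIGSYS;
}

/* Hypothetical helper: waits for `pid` and returns 1 if it was killed by
 * SIGSYS, 0 if it ended any other way, -1 if waitpid() failed. */
int wait_for_seccomp_kill(pid_t pid)
{
    int status;

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return killed_by_sigsys(status) ? 1 : 0;
}
```

A watchdog could use such a check to tell a policy violation apart from an ordinary crash and raise an alert instead of silently restarting the service.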

Check the kernel audit log:

dmesg | grep seccomp

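You will see an audit record of type 1326 (AUDIT_SECCOMP) that identifies the process (comm= and exe=), the signal (sig=31, which is SIGSYS on arm64), and the number of the blocked syscall (syscall=).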
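If you want to process these records programmatically, for example in a test harness that asserts which syscall tripped the filter, the syscall number can be pulled out of the record text. The sketch below assumes the record carries type=1326 and a syscall= field, as kernel seccomp audit records do; the function name is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: extracts the syscall number from one line of kernel
 * log output. Returns -1 if the line is not a seccomp audit record
 * (type=1326) or has no parsable syscall= field. */
long seccomp_audit_syscall(const char *line)
{
    const char *field;
    long nr;

    if (strstr(line, "type=1326") == NULL)
        return -1;

    /* The leading space avoids matching a longer field name that merely
     * ends in "syscall=". */
    field = strstr(line, " syscall=");
    if (field == NULL || sscanf(field, " syscall=%ld", &nr) != 1)
        return -1;
    return nr;
}
```

The number is architecture-specific, so it has to be looked up in the syscall table of the target (arm64 in this tutorial) rather than that of the build host.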

For comparison, run the attack variant without the seccomp filter:

seccomp-demo-attack --no-seccomp

Output:

=== seccomp-BPF Demo ===

[!] WARNING: seccomp disabled (unsafe mode)

[*] Resolving httpbin.org...
...
[!] SIMULATING ATTACK: attempting to spawn shell...
ATTACKER SHELL SPAWNED
uid=0(root) gid=0(root)
[!] system() returned: 0

[+] Demo complete

The attacker successfully spawned a shell and executed id.

Analyzing syscalls with strace

Use strace to understand which syscalls your application uses:

strace -f -o /tmp/trace.log seccomp-demo
cat /tmp/trace.log

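Look for the syscalls that are actually invoked. For a network client, expect to see socket, connect, sendto, recvfrom, read, write, and various memory management calls (brk, mmap, mprotect). The first line of the trace is always the execve that strace itself uses to launch the program; it runs before the filter is installed. After that point, execve, fork, and clone should be absent, since these are what we block.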
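For longer traces it can help to reduce each line to just the syscall name before sorting and counting. The helper below is a small sketch of that step. It assumes the line layout that strace -f -o produces (a PID column followed by name(args) = result), and the function name is made up for this example.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: copies the syscall name from one line of
 * "strace -f -o" output into `out`. Returns false for lines that do not
 * start a syscall (signals, exit markers, "<... resumed>" continuations)
 * or if the name does not fit into `cap` bytes. */
bool strace_syscall_name(const char *line, char *out, size_t cap)
{
    size_t n = 0;

    if (cap == 0)
        return false;

    /* Skip the PID column and the padding after it. */
    while (isdigit((unsigned char)*line))
        line++;
    while (*line == ' ' || *line == '\t')
        line++;

    /* Copy identifier characters, leaving room for the terminator. */
    while ((isalnum((unsigned char)*line) || *line == '_') && n + 1 < cap)
        out[n++] = *line++;
    out[n] = '\0';

    /* A syscall line has its argument list right after the name. */
    return n > 0 && *line == '(';
}
```

Feeding every line of /tmp/trace.log through this function and collecting the distinct names gives a compact list to compare against the filter.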

Production considerations

Testing is critical. A seccomp filter that is too aggressive will break your application in subtle ways. Test thoroughly after libc or library updates.

Use SCMP_ACT_LOG during development. Replace SCMP_ACT_KILL_PROCESS with SCMP_ACT_LOG so that matching syscalls are allowed but logged instead of killing the process. This shows which code paths would hit the filter before you deploy a kill policy.

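Consider systemd integration. For managed services, systemd provides SystemCallFilter= directives that apply seccomp filters without modifying application code. Note that such a filter is installed before systemd executes the service binary, so it cannot deny execve itself without preventing the service from starting; blocking execve requires the in-process approach shown here.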

Allowlist vs blocklist tradeoffs. This tutorial uses a blocklist for simplicity. An allowlist (only permit specific syscalls) provides stronger security but requires more maintenance as dependencies change.

