🐳 A Minimal Container in Go: ELI5

This Go program creates a tiny container, kind of like Docker, but from scratch using Linux syscalls. No Docker or container engine involved!


🧠 What It Does

  1. Runs itself twice: once as a parent, once as a child
  2. The parent sets up Linux namespaces for isolation
  3. The child changes its environment to behave like a container
  4. Finally, it runs a shell (/bin/sh) inside that container

🧩 Step-by-Step Breakdown

The Golang program we discuss here consists of this header, plus three functions:

package main

import (
    "fmt"
    "os"
    "os/exec"
    "syscall"
)

1. Main Function

func main() {
    if len(os.Args) > 1 && os.Args[1] == "child" {
        runChild()
    } else {
        runParent()
    }
}
  • If the program was run with the argument "child", it does container setup
  • Otherwise, it behaves as the parent and starts a new containerized child process

2. Parent Process

func runParent() {
    fmt.Println(">> Parent: starting container process")
    cmd := exec.Command("/proc/self/exe", "child")
    cmd.Stdin = os.Stdin
    cmd.Stdout = os.Stdout
    cmd.Stderr = os.Stderr
    cmd.SysProcAttr = &syscall.SysProcAttr{
        Cloneflags: syscall.CLONE_NEWUTS |
                    syscall.CLONE_NEWPID |
                    syscall.CLONE_NEWNS |
                    syscall.CLONE_NEWIPC |
                    syscall.CLONE_NEWNET,
    }
    err := cmd.Run()
    if err != nil {
        panic(err)
    }
}
  • Runs the same binary again with "child" as argument

  • /proc/self/exe refers to the currently running executable

  • Cloneflags create new Linux namespaces for:

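    • 🏷 CLONE_NEWUTS: hostname isolation
    • 🔒 CLONE_NEWPID: new PID tree (the child starts as PID 1)
    • 📁 CLONE_NEWNS: new mount namespace
    • 🧠 CLONE_NEWIPC: new IPC namespace (System V IPC objects and POSIX message queues)
    • 📡 CLONE_NEWNET: separate networking stack

Together these give the child its own hostname, PID tree, mounts, IPC objects, and network stack: the same namespace isolation that real container runtimes build on.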
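
One way to see the namespaces in action is to look at the links under /proc/self/ns, which Linux exposes for every process. The sketch below is not part of the original program; the helper name namespaceID is made up for illustration. It prints the identifier of each namespace the current process lives in.

```go
package main

import (
	"fmt"
	"os"
)

// namespaceID (hypothetical helper) returns the kernel's identifier for one
// of the calling process's namespaces, in the form "uts:[<inode number>]".
func namespaceID(kind string) (string, error) {
	return os.Readlink("/proc/self/ns/" + kind)
}

func main() {
	// These names correspond to CLONE_NEWUTS, CLONE_NEWPID, CLONE_NEWNS,
	// CLONE_NEWIPC and CLONE_NEWNET respectively.
	for _, kind := range []string{"uts", "pid", "mnt", "ipc", "net"} {
		id, err := namespaceID(kind)
		if err != nil {
			fmt.Printf("%-3s unavailable: %v\n", kind, err)
			continue
		}
		fmt.Printf("%-3s %s\n", kind, id)
	}
}
```

Two processes that report the same identifier share that namespace. Comparing the values seen on the host with the values seen from inside the container should show five different identifiers.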


3. Child Process (the “Container”)

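func runChild() {
    fmt.Println(">> Child: setting up container")

    // Name the container; this only affects the new UTS namespace.
    if err := syscall.Sethostname([]byte("container")); err != nil {
        panic(err)
    }
    // Confine the filesystem view to ./rootfs and move into the new root.
    if err := syscall.Chroot("rootfs"); err != nil {
        panic(err)
    }
    if err := os.Chdir("/"); err != nil {
        panic(err)
    }
    // Mount a fresh /proc so tools like ps see the new PID namespace.
    if err := syscall.Mount("proc", "/proc", "proc", 0, ""); err != nil {
        panic(err)
    }

    fmt.Println(">> Running shell inside container")
    // Exec replaces this process and only returns if it fails.
    panic(syscall.Exec("/bin/sh", []string{"/bin/sh"}, os.Environ()))
}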
  • Sets the container’s hostname to "container"
  • Uses chroot("rootfs") to change the root directory, which limits what the container can “see” on the host filesystem
  • Changes directory to / inside that new root
  • Mounts /proc, so commands like ps, top, etc. work
  • Finally, replaces the process with /bin/sh so you’re inside the container shell
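
To check that the setup took effect, the child could report how it sees itself just before handing over to the shell. This is a minimal sketch, not part of the original program: describeSelf is a hypothetical helper. Called inside runChild after the hostname is set, it should report PID 1 and the hostname "container". Run on its own on the host, it prints the host's values.

```go
package main

import (
	"fmt"
	"os"
)

// describeSelf (hypothetical helper) reports the PID and hostname as seen
// by the calling process, i.e. relative to its own PID and UTS namespaces.
func describeSelf() string {
	host, err := os.Hostname()
	if err != nil {
		host = "unknown"
	}
	return fmt.Sprintf("pid=%d hostname=%s", os.Getpid(), host)
}

func main() {
	fmt.Println(describeSelf())
}
```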

📦 Requirements to Make It Work

You need a minimal Linux root filesystem (rootfs/) that includes:

  • /bin/sh (a shell, for example the one BusyBox provides)
  • A basic directory structure (/proc, /etc, /bin, etc.)
  • Correct permissions and mountable directories

You can build this using BusyBox:

mkdir -p rootfs/{bin,proc}
wget https://busybox.net/downloads/binaries/1.21.1/busybox-x86_64
mv busybox-x86_64 rootfs/bin/sh
chmod +x rootfs/bin/sh
cd rootfs/bin
# sh already exists as the BusyBox binary itself, so it is not linked again
for cmd in ls mkdir echo cat sleep uname ps; do
  ln -s sh "$cmd"
done
cd -

🤯 Why Is This Cool?

You’re building a container runtime like Docker from scratch:

  • Isolated process tree
  • Isolated hostname and network
  • Own root filesystem
  • Interactive shell inside the container

All in ~50 lines of Go code, using only Linux syscalls.


🧪 Want to Try It?

Containers from Scratch - Building a Container Runtime with Nothing But Syscalls (in Go)