Frequently Asked Question

What are Linux capabilities and why split up root?

Historically the kernel had one big switch: a process either ran as root (UID 0) and could do anything, or it did not and could do almost nothing privileged. Every program that needed a single privileged operation (binding to port 80, sending an ICMP echo, changing its UID) therefore had to be setuid root and inherit the keys to the kingdom. Capabilities, introduced in Linux 2.2 and refined ever since, break that monolith into about forty distinct privileges (CAP_NET_BIND_SERVICE, CAP_NET_RAW, CAP_CHOWN, CAP_SYS_ADMIN, …) that can be granted to a binary or process individually.
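To make the "distinct privileges" idea concrete: the kernel tracks each capability as one bit in a mask, and exposes a process's masks as hex values in /proc/<pid>/status (the CapEff line is the effective set). The following is a minimal sketch, assuming a Linux system and Python 3.9 or later; the helper names decode_caps and effective_caps are made up for illustration, and the name table follows the numbering in <linux/capability.h> at the time of writing, so newer kernels may define more bits.

```python
# Capability names indexed by bit number (numbering from <linux/capability.h>).
# Assumption: newer kernels may append entries; unknown bits are reported by number.
CAP_NAMES = [
    "CAP_CHOWN", "CAP_DAC_OVERRIDE", "CAP_DAC_READ_SEARCH", "CAP_FOWNER",
    "CAP_FSETID", "CAP_KILL", "CAP_SETGID", "CAP_SETUID",
    "CAP_SETPCAP", "CAP_LINUX_IMMUTABLE", "CAP_NET_BIND_SERVICE", "CAP_NET_BROADCAST",
    "CAP_NET_ADMIN", "CAP_NET_RAW", "CAP_IPC_LOCK", "CAP_IPC_OWNER",
    "CAP_SYS_MODULE", "CAP_SYS_RAWIO", "CAP_SYS_CHROOT", "CAP_SYS_PTRACE",
    "CAP_SYS_PACCT", "CAP_SYS_ADMIN", "CAP_SYS_BOOT", "CAP_SYS_NICE",
    "CAP_SYS_RESOURCE", "CAP_SYS_TIME", "CAP_SYS_TTY_CONFIG", "CAP_MKNOD",
    "CAP_LEASE", "CAP_AUDIT_WRITE", "CAP_AUDIT_CONTROL", "CAP_SETFCAP",
    "CAP_MAC_OVERRIDE", "CAP_MAC_ADMIN", "CAP_SYSLOG", "CAP_WAKE_ALARM",
    "CAP_BLOCK_SUSPEND", "CAP_AUDIT_READ", "CAP_PERFMON", "CAP_BPF",
    "CAP_CHECKPOINT_RESTORE",
]


def decode_caps(mask: int) -> list[str]:
    """Return the names of the capabilities whose bits are set in mask."""
    names = []
    bit = 0
    while mask >> bit:
        if (mask >> bit) & 1:
            # Fall back to the raw bit number for capabilities newer than the table.
            names.append(CAP_NAMES[bit] if bit < len(CAP_NAMES) else f"bit {bit}")
        bit += 1
    return names


def effective_caps(status_path: str = "/proc/self/status") -> list[str]:
    """Decode the CapEff line of a /proc/<pid>/status file (Linux only)."""
    with open(status_path) as f:
        for line in f:
            if line.startswith("CapEff:"):
                return decode_caps(int(line.split()[1], 16))
    return []
```

Called from an ordinary unprivileged shell, effective_caps() typically returns an empty list; run as root outside a container it typically returns nearly every name in the table.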

The user-space tools are setcap, which attaches capabilities to a file, and getcap, which lists them; man 7 capabilities enumerates the full set. A modern ping, for example, no longer needs to be setuid root: on many distributions it carries cap_net_raw+ep and nothing else, so an attacker who successfully exploits ping gains raw-socket access but still cannot read other users' files. Container runtimes like Docker and Podman use the same mechanism to give containers only the kernel privileges they need.
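Under the hood, file capabilities live in an extended attribute named security.capability on the binary, which is what setcap writes and getcap reads. The sketch below decodes that attribute directly, assuming the on-disk layout documented in <linux/capability.h> (a little-endian header word holding a revision and an "effective" flag, followed by permitted/inheritable 32-bit pairs). The function names parse_file_caps and file_caps are hypothetical, and the code is an illustration of the format rather than a replacement for getcap.

```python
import os
import struct

# Constants from <linux/capability.h>.
VFS_CAP_REVISION_MASK = 0xFF000000
VFS_CAP_FLAGS_EFFECTIVE = 0x000001


def parse_file_caps(raw: bytes) -> dict:
    """Decode the raw value of a security.capability extended attribute."""
    magic = struct.unpack_from("<I", raw, 0)[0]
    revision = (magic & VFS_CAP_REVISION_MASK) >> 24
    # Revision 1 stores one 32-bit word per set; revisions 2 and 3 store two.
    words = 1 if revision == 1 else 2
    permitted = inheritable = 0
    for i in range(words):
        p, inh = struct.unpack_from("<II", raw, 4 + 8 * i)
        permitted |= p << (32 * i)
        inheritable |= inh << (32 * i)
    return {
        "revision": revision,
        # The "e" in cap_net_raw+ep: a single flag, not a per-capability mask.
        "effective": bool(magic & VFS_CAP_FLAGS_EFFECTIVE),
        "permitted": permitted,
        "inheritable": inheritable,
    }


def file_caps(path: str):
    """Return decoded file capabilities for path, or None if none are set (Linux only)."""
    try:
        raw = os.getxattr(path, "security.capability")
    except OSError:
        return None
    return parse_file_caps(raw)
```

For a binary marked cap_net_raw+ep, the permitted mask would be expected to have only bit 13 (CAP_NET_RAW) set, with the effective flag true. The location of ping varies between distributions, so any path passed to file_caps is an assumption about the local system.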

For a defender, capabilities are how you express least privilege at the kernel level. When you write a systemd unit, prefer AmbientCapabilities=CAP_NET_BIND_SERVICE together with a CapabilityBoundingSet= that admits only the smallest set of privileges the service needs, rather than running the service as root because "that worked".
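As a rough illustration of the unit-file advice above, the sketch below embeds a sample unit as a string and pulls out its two capability settings so they can be reviewed. Everything about the unit is hypothetical: the daemon name mywebd, its path, and its user are placeholders, and User= is added on the assumption that the service runs as a non-root account. Python's configparser is only an approximate stand-in for systemd's own parser (it does not handle repeated keys or line continuations the way systemd does), so treat this as a review aid for simple units.

```python
import configparser

# Hypothetical unit for an imaginary daemon; names and paths are placeholders.
UNIT_TEXT = """\
[Unit]
Description=Example daemon that binds port 80 without running as root

[Service]
ExecStart=/usr/local/bin/mywebd
User=mywebd
AmbientCapabilities=CAP_NET_BIND_SERVICE
CapabilityBoundingSet=CAP_NET_BIND_SERVICE
"""


def capability_settings(unit_text: str) -> dict:
    """Return the capability-related [Service] settings as lists of names."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep the unit file's case-sensitive key names
    parser.read_string(unit_text)
    service = parser["Service"]
    return {
        key: service.get(key, fallback="").split()
        for key in ("AmbientCapabilities", "CapabilityBoundingSet")
    }
```

Here capability_settings(UNIT_TEXT) reports a single capability in each setting. An empty CapabilityBoundingSet list from this helper means the key was missing or blank, which deserves a closer look in either case.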

Further reading and video