The Linux Kernel: How do system calls actually work?

Dr Chris Paton

Frequently Asked Question

How do system calls actually work?

Each syscall has a number (on x86-64, read is 0, write is 1, and openat is 257). The user-space program loads that number into a register (rax on x86-64), puts up to six arguments in other registers (rdi, rsi, rdx, r10, r8, r9, in that order), and executes the syscall instruction. The CPU switches to kernel mode and jumps to a fixed address in the kernel (the syscall entry point), which uses the number to look up the handler in the syscall table and calls it. The return value comes back in rax; on failure it is a negative errno value.

You almost never write that assembly yourself. The C library (glibc, musl) provides a thin wrapper for most syscalls: a call to read(fd, buf, n) in C ends up executing the few instructions above, and the wrapper turns a negative return value into -1 with errno set. You can watch the resulting syscalls with strace: each line of its output is, roughly, one syscall.
