Frequently Asked Question

What are dirty pages and how does writeback work?

When a process writes to a file, the data lands in the page cache and the affected pages are marked dirty, meaning the copy in RAM is newer than the copy on disk. The kernel's writeback machinery flushes them out later, in the background, so an ordinary buffered write syscall usually returns as soon as the data is in memory. Two thresholds in /proc/sys/vm/ control the pace: dirty_background_ratio (default 10% of available memory) is the level at which the kworker flush threads start writing in the background; dirty_ratio (default 20%) is the level at which writing processes are themselves throttled, blocking inside write until enough dirty data has been written back. dirty_expire_centisecs sets the age (3000 centiseconds, i.e. 30 seconds, by default) beyond which a dirty page becomes eligible for the next periodic flush, so even a quiet write eventually reaches disk.
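To check what a given machine is actually using, the three settings named above can be read straight from /proc/sys/vm/. The following is a minimal sketch, assuming a Linux system with procfs mounted; the helper names (read_vm_knob, dirty_settings, expire_seconds) are made up for illustration and are not part of any library.

```python
from pathlib import Path

# Standard procfs location of the vm sysctls on Linux.
VM_DIR = Path("/proc/sys/vm")

# The three knobs discussed in the answer above.
KNOBS = ("dirty_background_ratio", "dirty_ratio", "dirty_expire_centisecs")


def read_vm_knob(name, base=VM_DIR):
    """Return one sysctl as an int, or None if it is missing or unreadable."""
    try:
        return int((Path(base) / name).read_text().strip())
    except (OSError, ValueError):
        return None


def dirty_settings(base=VM_DIR):
    """Collect the writeback knobs into a dict (hypothetical helper)."""
    return {name: read_vm_knob(name, base) for name in KNOBS}


def expire_seconds(settings):
    """Convert dirty_expire_centisecs to seconds (100 centiseconds = 1 s)."""
    centisecs = settings.get("dirty_expire_centisecs")
    return None if centisecs is None else centisecs / 100


if __name__ == "__main__":
    settings = dirty_settings()
    for name, value in settings.items():
        print(f"{name} = {value}")
    print("expire age in seconds:", expire_seconds(settings))
```

The base argument exists only so the functions can be pointed at a different directory for testing; on a real system the defaults are enough. A value of None means the file could not be read, for example on a non-Linux host.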

You can see the current numbers in /proc/meminfo as Dirty (waiting to be written) and Writeback (being written right now). If Dirty is large and draining slowly, a slow disk is the likely bottleneck. If it spikes and then your application stalls, you are probably hitting the dirty_ratio ceiling: either the workload is too bursty for the disk or the thresholds are tuned too high for the underlying storage. On fsync-heavy workloads (databases) the picture is different again: every fsync forces writeback of that file's dirty pages before returning, whatever the thresholds are.
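For watching these two counters from a script rather than by eye, a small parser is enough. This is a sketch under the assumption that /proc/meminfo uses its usual "Name:   value kB" layout; parse_meminfo and dirty_and_writeback are illustrative names, not an existing API.

```python
from pathlib import Path

MEMINFO = Path("/proc/meminfo")


def parse_meminfo(text):
    """Map each field name to its integer value (kB for most fields)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not "Name: value"
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def dirty_and_writeback(text):
    """Return (Dirty, Writeback) in kB, using 0 for a missing field."""
    fields = parse_meminfo(text)
    return fields.get("Dirty", 0), fields.get("Writeback", 0)


if __name__ == "__main__":
    if MEMINFO.exists():
        dirty_kb, writeback_kb = dirty_and_writeback(MEMINFO.read_text())
        print(f"Dirty: {dirty_kb} kB  Writeback: {writeback_kb} kB")
    else:
        print("/proc/meminfo not found (not a Linux system?)")
```

Calling dirty_and_writeback once a second during a heavy write shows how quickly Dirty drains, or whether it climbs toward the ceiling before the application stalls.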
