Frequently Asked Question

How do BEGIN and END blocks in awk work?

BEGIN and END are special patterns that never match input lines. A BEGIN block runs once, before any input is read; an END block runs once, after all input has been processed. Between them, the normal pattern { action } rules run on each input record (by default, each line). So a complete awk program might initialise variables in BEGIN, accumulate state line by line, and report results in END.
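As a minimal sketch of the three phases, the following feeds three made-up lines to awk and prints a marker from each phase. The function name demo_phases and the sample input are hypothetical, chosen only for illustration.

```shell
# demo_phases is a hypothetical helper; the input lines are made up.
demo_phases() {
  printf 'alpha\nbeta\ngamma\n' |
  awk '
  BEGIN { print "start" }            # runs once, before any input is read
        { n++; print n ": " $0 }     # runs for every input line
  END   { print "lines seen: " n }   # runs once, after the last line
  '
}

demo_phases
```

The "start" line appears before any numbered line and the "lines seen" line appears after the last one, which matches the ordering described above.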

The classic use is summing or averaging a column. The command awk '{ sum += $1 } END { print sum }' numbers.txt walks every line, adds field 1 to a running total, and prints the total once at the end. To get the mean, add count++ to the per-line action and print sum/count in END instead. BEGIN is often used to set the field separator (BEGIN { FS = ":" }) or to print a header line. Together, BEGIN and END turn awk into a respectable one-pass report generator: the equivalent of a SQL aggregate query in a line or two.
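The mean calculation and the field-separator setup can be combined in one program. This is a sketch under assumptions: the function name column_mean and the colon-separated name:score data are hypothetical, and the count > 0 check is an added guard so that empty input prints nothing instead of dividing by zero.

```shell
# column_mean is a hypothetical helper that reads name:score lines on stdin.
column_mean() {
  awk '
  BEGIN { FS = ":" }                          # set the separator before input is read
        { sum += $2; count++ }                # accumulate field 2 on every line
  END   { if (count > 0) print sum / count }  # report once; skip empty input
  '
}

# Made-up sample data: the scores 10, 20 and 30 average to 20.
printf 'ann:10\nbob:20\ncy:30\n' | column_mean
```

Setting FS in BEGIN matters here: if it were assigned inside a per-line action, the first line would already have been split with the default separator.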
