Frequently Asked Question

What is the difference between basic, extended, and Perl-compatible regex?

Unix regex comes in three main dialects. Basic Regular Expressions (BRE) are the oldest and the default for plain grep and sed. In BRE, the characters (, ), {, }, |, +, and ? are literal. Grouping and interval operators are written with a backslash, as in \(foo\)\{2\}. POSIX BRE has no alternation, +, or ? operators at all; GNU grep and sed add them as the extensions \|, \+, and \?, as in \(foo\|bar\). Extended Regular Expressions (ERE), enabled with grep -E (egrep is an older, obsolescent spelling) and sed -E, flip the convention: those characters are operators by default and you backslash them to match literally. For most command-line work, ERE is the more convenient dialect.
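The convention flip can be sketched in a few lines of Python. This is a minimal illustration, not a full translator: the function name bre_to_ere is hypothetical, it assumes GNU-style BRE (where \|, \+, and \? are operators), it ignores bracket expressions such as [(+] and BRE special cases like a leading *, and it uses Python's re module only as a stand-in for an ERE engine on this simple subset.

```python
import re

# Characters whose escaped/unescaped meaning is swapped between BRE and ERE.
# Assumption: GNU-style BRE, where \| \+ \? act as operators (not POSIX).
TOGGLED = set("(){}|+?")

def bre_to_ere(pattern: str) -> str:
    """Rewrite a simple GNU-style BRE pattern in ERE syntax.

    Hypothetical helper for illustration only: bracket expressions
    and other BRE corner cases are deliberately not handled.
    """
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            if nxt in TOGGLED:
                out.append(nxt)        # BRE \( is an operator -> ERE (
            else:
                out.append(ch + nxt)   # other escapes pass through unchanged
            i += 2
        elif ch in TOGGLED:
            out.append("\\" + ch)      # BRE ( is a literal -> ERE \(
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)

grouped = bre_to_ere(r"\(foo\|bar\)")  # operators lose their backslashes
literal = bre_to_ere("a+b")            # a literal + gains one
```

Here grouped comes out as (foo|bar) and literal as a\+b, which mirrors what you would type after switching from grep to grep -E.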

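Perl-Compatible Regular Expressions (PCRE), available in GNU grep as grep -P, are a separate, richer dialect modeled on Perl. Most ERE patterns work unchanged, but PCRE is not a strict superset: for example, at an alternation it takes the first alternative that matches rather than the longest. PCRE adds shorthand character classes (\d for digits, \w for word characters, \s for whitespace), non-greedy quantifiers (*?, +?), lookahead and lookbehind assertions, and Unicode properties. Many modern languages and tools (Python, JavaScript, VS Code) use a Perl-style dialect close to PCRE, so learning it once pays off in most places, although each differs in details. Vim is an exception: its native syntax is closer to BRE, with its own "magic" modes.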
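A few of these features can be tried with Python's built-in re module, which is Perl-style but not PCRE itself (it does not support Unicode property escapes such as \p{L}, for instance). The sample string below is made up for illustration.

```python
import re

# Made-up sample text for illustration.
text = "order 42 <b>bold</b> <i>it</i> price: $100 cost: 250"

# Shorthand class: \d+ matches each run of digits.
digits = re.findall(r"\d+", text)

# Greedy .+ runs to the last '>', non-greedy .+? stops at the first one.
greedy = re.findall(r"<.+>", text)
lazy = re.findall(r"<.+?>", text)

# Lookbehind: digits only when preceded by a dollar sign.
dollars = re.findall(r"(?<=\$)\d+", text)

# Lookahead: a word only when followed by a colon (the colon is not consumed).
labels = re.findall(r"\w+(?=:)", text)

print(digits)   # ['42', '100', '250']
print(greedy)   # ['<b>bold</b> <i>it</i>']
print(lazy)     # ['<b>', '</b>', '<i>', '</i>']
print(dollars)  # ['100']
print(labels)   # ['price', 'cost']
```

The greedy and non-greedy results differ only because of the trailing ?, which is the most common reason a pattern that works in a Perl-style engine has no direct BRE or ERE equivalent.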
