Frequently Asked Question
What is the difference between basic, extended, and Perl-compatible regex?
Unix regex comes in three main dialects. Basic Regular Expressions (BRE) are
the oldest and the default for plain grep and sed. In BRE, the characters
(, ), {, }, |, +, and ? are literal: to use them as operators you must
backslash them, as in \(foo\|bar\). Traditionally POSIX defines only \( \) and
\{ \} for BRE; \|, \+, and \? are extensions that GNU grep and sed accept but
that are not guaranteed everywhere. Extended Regular Expressions (ERE), enabled
with grep -E (or the obsolescent egrep) and sed -E, flip the convention: those
characters are operators by default and you backslash them to match literally.
ERE is almost always the dialect you want.
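As a rough illustration of the escaping difference, the sketch below runs the
same interval pattern through grep in both modes. The three sample lines and
the variable names are made up for this example. It sticks to \{ \} because
those are defined for BRE by POSIX, so it should not depend on GNU extensions.

```shell
# Hypothetical sample input: three lines, one of which is the literal text a{2}.
sample=$(printf 'aa\na{2}\nb\n')

# BRE (plain grep): an unescaped { is an ordinary character,
# so this finds the line containing the literal text a{2}.
bre_literal=$(printf '%s\n' "$sample" | grep 'a{2}')

# BRE: backslashes turn \{2\} into the "exactly two" interval operator,
# so this finds the line aa.
bre_interval=$(printf '%s\n' "$sample" | grep 'a\{2\}')

# ERE (grep -E): { is an operator without any backslash...
ere_interval=$(printf '%s\n' "$sample" | grep -E 'a{2}')

# ...and a backslash makes it literal again.
ere_literal=$(printf '%s\n' "$sample" | grep -E 'a\{2\}')

printf '%s -> %s\n' 'BRE, unescaped braces' "$bre_literal"   # a{2}
printf '%s -> %s\n' 'BRE, escaped braces'   "$bre_interval"  # aa
printf '%s -> %s\n' 'ERE, unescaped braces' "$ere_interval"  # aa
printf '%s -> %s\n' 'ERE, escaped braces'   "$ere_literal"   # a{2}
```

The same pattern text selects different lines depending only on the dialect
flag, which is why a pattern copied between grep and grep -E can quietly
change meaning.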
Perl-Compatible Regular Expressions (PCRE), enabled in GNU grep with grep -P
(a GNU option, not part of POSIX), are largely a superset of ERE syntax
borrowed from Perl. They add shorthand character classes (\d
for digits, \w for word characters, \s for whitespace), non-greedy quantifiers
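(*?, +?), lookahead and lookbehind assertions, and Unicode properties. Many
modern languages and tools (Python, JavaScript, VS Code) use a Perl-style
dialect, though details such as lookbehind and Unicode support vary between
them. Vim is a notable exception: its default syntax is closer to BRE. Learning
the Perl-style features once still pays off in most places you write regex.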
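A minimal sketch of three of the PCRE additions named above, assuming a GNU
grep built with PCRE support (grep -P is missing from some other grep
implementations). The sample line and variable names are hypothetical, and -o
(print only the matched text) is a common grep option rather than a POSIX one.

```shell
# Hypothetical sample line for illustration.
line='id=42 name="a" note="b"'

# Shorthand class: \d+ matches a run of digits.
digits=$(printf '%s\n' "$line" | grep -oP '\d+')

# Greedy .* runs to the LAST closing quote on the line...
greedy=$(printf '%s\n' "$line" | grep -oP '".*"')

# ...while non-greedy .*? stops at the FIRST one. grep -o prints every
# match on its own line, so head keeps only the first match.
lazy=$(printf '%s\n' "$line" | grep -oP '".*?"' | head -n 1)

# Lookbehind: require id= before the digits without including it in the match.
idval=$(printf '%s\n' "$line" | grep -oP '(?<=id=)\d+')

printf '%s -> %s\n' 'digits'     "$digits"   # 42
printf '%s -> %s\n' 'greedy'     "$greedy"   # "a" note="b"
printf '%s -> %s\n' 'non-greedy' "$lazy"     # "a"
printf '%s -> %s\n' 'lookbehind' "$idval"    # 42
```

None of these three patterns is valid POSIX ERE, so the same commands with
grep -E in place of grep -P are not expected to give the same results.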