Regular Expression Reference

This is a quick reference for general regular expression syntax.

Introduction

Regular expressions (regex) provide powerful pattern matching features to software products.

Flavors vary.  For example, a regex match in Beyond Compare is confined to a single line, while a match in Notepad++ can span multiple lines.


Syntax


Character Classes

Note:  Most regular expression "special characters" are treated as literal in character class definitions. Hint:  A caret ^ that does not start a character class definition, or a hyphen at the beginning or end of a definition, is literal without an escape.
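
The note and hint above can be checked with a small sketch. It assumes Python's re module as the flavor; the sample strings and variable names are made up for illustration, and other flavors may treat a few characters differently.

```python
import re

# Inside [...], most metacharacters lose their special meaning.
dots_and_stars = re.findall(r"[.*+]", "a.b*c+d")

# A caret is literal when it is not the first character of the class;
# a hyphen is literal at the start or end of the class.
caret_or_hyphen = re.findall(r"[a^-]", "a^b-c")

# A leading caret negates the class instead.
not_vowels = re.findall(r"[^aeiou]", "regex")

print(dots_and_stars)   # ['.', '*', '+']
print(caret_or_hyphen)  # ['a', '^', '-']
print(not_vowels)       # ['r', 'g', 'x']
```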

Character Shorthand

Character Shorthand          Inclusive  Exclusive
Tab character                \t
Whitespace (including tabs)  \s         \S
Numeric characters           \d         \D
Word characters              \w         \W
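
As a rough illustration of the shorthand classes, the sketch below runs each one against a made-up sample string, again assuming Python's re module. The exclusive (uppercase) form matches whatever the inclusive form does not.

```python
import re

# Sample text containing a real tab character and a space.
text = "id_7\tOK 42"

digits = re.findall(r"\d+", text)      # runs of numeric characters
non_digits = re.findall(r"\D+", text)  # everything between those runs
words = re.findall(r"\w+", text)       # letters, digits and underscores
spaces = re.findall(r"\s", text)       # the tab and the space

print(digits)      # ['7', '42']
print(non_digits)  # ['id_', '\tOK ']
print(words)       # ['id_7', 'OK', '42']
print(spaces)      # ['\t', ' ']
```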

Boundaries

Word boundaries ensure whole-word matches by skipping results adjacent to other word characters (alphanumerics and underscores).
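
A minimal sketch of the whole-word behavior described above, assuming Python's re module, where \b is the word boundary. The sample text is invented for the example.

```python
import re

text = "cat concat cats cat_1 cat."

# Without boundaries, every occurrence of the letters matches.
loose = re.findall(r"cat", text)

# With \b on both sides, matches that touch another word character
# (letter, digit or underscore) are skipped.
whole_starts = [m.start() for m in re.finditer(r"\bcat\b", text)]

print(len(loose))    # 5
print(whole_starts)  # [0, 22]
```

Only the first "cat" and the one before the period survive, because "concat", "cats" and "cat_1" all place a word character next to the match.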

Subexpressions

Hint:  A backreference match is used in Search criteria while a backreference insert is used in Replace criteria.  n is the <name> or numeric sequence of the backreferenced subexpression.
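
The difference between a backreference match and a backreference insert can be sketched as follows. This assumes Python's re module, whose spelling for named groups, (?P<name>...) in the pattern and \g<name> in the replacement, differs from some other flavors; the sample strings are made up.

```python
import re

# Backreference match (Search side): \1 must repeat what group 1 captured,
# so this finds doubled words.
doubled = re.findall(r"\b(\w+) \1\b", "this is is a test test")

# Backreference insert (Replace side): \1 and \2 insert the captured text.
swapped = re.sub(r"(\w+)@(\w+)", r"\2 at \1", "user@example")

# Named subexpressions are referenced by name instead of by number.
renamed = re.sub(
    r"(?P<year>\d{4})-(?P<month>\d{2})",
    r"\g<month>/\g<year>",
    "2024-05",
)

print(doubled)  # ['is', 'test']
print(swapped)  # example at user
print(renamed)  # 05/2024
```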

Quantifiers

Number of Matches      Greedy  Lazy    Possessive
0 or 1 (optional)      ?       ??      ?+
0 or more (optional)   *       *?      *+
1 or more (required)   +       +?      ++
Exactly n matches      {n}     {n}?
At least n matches     {n,}    {n,}?
Min (n) / Max (m)      {n,m}   {n,m}?
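
To show how the greedy and lazy columns differ in practice, here is a small sketch assuming Python's re module. The markup string is invented for the example. Possessive quantifiers are left out because older Python versions do not accept them.

```python
import re

html = "<b>one</b><b>two</b>"

# Greedy: .* takes as much as it can, then gives back just enough to match.
greedy = re.findall(r"<b>.*</b>", html)

# Lazy: .*? takes as little as it can, stopping at the first closing tag.
lazy = re.findall(r"<b>.*?</b>", html)

# Min/Max: between 2 and 3 digits per match.
bounded = re.findall(r"\d{2,3}", "1 22 333 4444")

print(greedy)   # ['<b>one</b><b>two</b>']
print(lazy)     # ['<b>one</b>', '<b>two</b>']
print(bounded)  # ['22', '333', '444']
```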

Note:  Possessive quantifiers discard backtracking positions, so a match attempt can fail early instead of trying every permutation.  They are mainly used in performance tuning.

Mode Modifiers

                    Enable  Disable
Case Insensitivity  (?i)    (?-i)
Free-Spacing        (?x)    (?-x)
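
The enabling modifiers can be tried with the sketch below, which assumes Python's re module. As far as I know, Python wants the inline flag at the very start of the pattern and does not accept a bare (?-i) toggle, so the scoped group form (?i:...) is used here as the closest equivalent for limiting a mode to part of the pattern. The sample strings are made up.

```python
import re

# (?i) turns on case insensitivity for the whole pattern.
any_case = re.findall(r"(?i)regex", "Regex REGEX regex")

# (?x) turns on free-spacing: whitespace and # comments in the pattern
# are ignored, which allows the pattern to be laid out and annotated.
phone = re.search(r"""(?x)
    (\d{3})   # prefix
    -         # literal hyphen
    (\d{4})   # line number
""", "call 555-1234")

# Scoped form: only the "a" is case-insensitive, the "b" is not.
scoped = re.findall(r"(?i:a)b", "ab Ab AB")

print(any_case)        # ['Regex', 'REGEX', 'regex']
print(phone.groups())  # ('555', '1234')
print(scoped)          # ['ab', 'Ab']
```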

Lookarounds

            Positive  Negative
Lookahead   (?=...)   (?!...)
Lookbehind  (?<=...)  (?<!...)
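
The four lookarounds can be compared side by side with a short sketch, assuming Python's re module and a made-up price string. In each case the lookaround checks the neighboring text without including it in the match.

```python
import re

text = "cost: 100 USD, 200 EUR, $300"

# Positive lookahead: digits that are followed by " USD".
usd = re.findall(r"\d+(?= USD)", text)

# Negative lookahead: whole numbers that are not followed by " USD".
not_usd = re.findall(r"\b\d+\b(?! USD)", text)

# Positive lookbehind: digits that are preceded by a dollar sign.
after_dollar = re.findall(r"(?<=\$)\d+", text)

# Negative lookbehind: digits not preceded by a dollar sign or another digit.
no_dollar = re.findall(r"(?<![\$\d])\d+", text)

print(usd)           # ['100']
print(not_usd)       # ['200', '300']
print(after_dollar)  # ['300']
print(no_dollar)     # ['100', '200']
```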

Atomic Expressions

Hint:  A branch reset subexpression, (?|...), can capture an alternation match into a single backreference.


Recursion

Note:  The main purpose of recursion is to match balanced or nested constructs.  An optional recursive expression is repeatedly applied until it fails, then the remaining expression is applied until all open levels of recursion have been closed. (?0) and \g<0> are synonyms for recursion.