Matching a single character | |
Characters that otherwise have special regexp meanings | |
| \ | Escapes a character that otherwise has a special meaning, so that it matches literally: \. \+ \* \? \| \{ \( \[ \^ \$ \\ |
Characters that need to be written in a special way | |
| \t | The tab character |
| \n | The newline (line feed) character |
| \r | The carriage-return character |
| \f | The form-feed character |
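
As a minimal sketch of the escaping rules above, the following uses Python's `re` module. The choice of engine is an assumption, since the table does not name one; the sample strings are made up for illustration.

```python
import re

# Unescaped, "." is a metacharacter, so "3.14" also matches "3x14".
unescaped = re.fullmatch(r"3.14", "3x14") is not None

# Escaped, "\." matches only a literal dot.
escaped = re.fullmatch(r"3\.14", "3x14") is not None
literal = re.fullmatch(r"3\.14", "3.14") is not None

# "\t" in the pattern matches a tab character in the input.
fields = re.split(r"\t", "name\tage\tcity")

print(unescaped, escaped, literal, fields)
```
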
Matching a single character with a predefined character class | |
| . | Any character. Whether it also matches line terminators depends on the engine's mode (typically a "dot-all" flag) |
| \d | A digit: [0-9] |
| \D | A non-digit: [^0-9] |
| \s | A whitespace character: [ \t\n\x0B\f\r] |
| \S | A non-whitespace character: [^\s] |
| \w | A word character: [a-zA-Z_0-9] |
| \W | A non-word character: [^\w] |
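
The predefined classes can be tried out with a short sketch like the one below. It assumes Python's `re` module, which the table does not prescribe; the `re.ASCII` flag is used so that `\d`, `\s`, and `\w` are limited to the ASCII sets listed above. The sample text is invented.

```python
import re

text = "Order 66 shipped\ton 2024-05-01"

# \d picks out single digits, \w+ picks out runs of word characters.
digits = re.findall(r"\d", text, flags=re.ASCII)
words = re.findall(r"\w+", text, flags=re.ASCII)

# \s matches the spaces and the tab, so substituting "" removes them.
no_space = re.sub(r"\s", "", text, flags=re.ASCII)

# By default "." does not match a newline; re.DOTALL changes that.
dot_default = re.fullmatch(r"a.b", "a\nb") is not None
dot_all = re.fullmatch(r"a.b", "a\nb", flags=re.DOTALL) is not None

print(digits, words, no_space, dot_default, dot_all)
```
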
Defining Character classes (match one character) | |
| Character classes provide a way to specify a set of characters. The class specification is enclosed in []. The set can also be defined by what must not be in it, by beginning the set with a caret, "^". A hyphen, "-", indicates a range of character values. Although a character class matches only one character, a quantifier following it can be used to match multiple characters. | |
| [abc] | a, b, or c (simple class) |
| [^abc] | Any character except a, b, or c (negation) |
| [a-zA-Z] | a through z or A through Z, inclusive (range) |
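
A small sketch of the three class forms, again assuming Python's `re` module and an invented sample string:

```python
import re

text = "cab 42 Zed"

# Simple class: each match is exactly one of a, b, or c.
simple = re.findall(r"[abc]", text)

# Negated class: any single character except a, b, c, or a space.
negated = re.findall(r"[^abc ]", text)

# Range plus a quantifier: runs of one or more ASCII letters.
ranged = re.findall(r"[a-zA-Z]+", text)

print(simple, negated, ranged)
```

The last pattern shows the point made above: the class itself matches one character, and the `+` quantifier extends it to a run.
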
Position and Boundary patterns (match zero characters) | |
| ^ | The beginning of a line. Very useful. |
| $ | The end of a line. Very useful. ^$ matches all empty lines. |
| \b | A word boundary |
| \B | A non-word boundary |
| \A | The beginning of the input |
| \G | The end of the previous match |
| \Z | The end of the input, or the position just before the final line terminator, if any |
| \z | The end of the input |
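
The zero-width patterns can be illustrated with the sketch below. It assumes Python's `re` module, where `^` and `$` match at line boundaries only when `re.MULTILINE` is set. `\G` and `\Z` are left out because Python's `re` has no `\G` and its `\Z` matches only at the very end of the input, unlike the description in the table.

```python
import re

text = "first line\n\nthird line"

# With re.MULTILINE, ^ and $ match at the start and end of every line.
line_starts = re.findall(r"^\w+", text, flags=re.MULTILINE)
line_ends = re.findall(r"\w+$", text, flags=re.MULTILINE)
empty_lines = len(re.findall(r"^$", text, flags=re.MULTILINE))

# \A matches only at the start of the whole input, even in multiline mode.
input_start = re.findall(r"\A\w+", text, flags=re.MULTILINE)

# \b requires a word boundary; \B requires that there is none.
whole_words = re.findall(r"\bcat\b", "cat concat cats cat.")
inside_word = re.findall(r"\Bcat", "cat concat")

print(line_starts, line_ends, empty_lines, input_start)
print(whole_words, inside_word)
```
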
Quantifiers (repeating the previous element) | |
| Greedy quantifiers - Expand as much as possible | |
|---|---|
| X? | X, once or not at all |
| X* | X, zero or more times |
| X+ | X, one or more times |
| X{n} | X, exactly n times |
| X{n,} | X, at least n times |
| X{n,m} | X, at least n but not more than m times |
| Reluctant quantifiers - Expand only if forced by later failure to match | |
| X?? | X, once or not at all |
| X*? | X, zero or more times |
| X+? | X, one or more times |
| X{n}? | X, exactly n times |
| X{n,}? | X, at least n times |
| X{n,m}? | X, at least n but not more than m times |
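
The difference between greedy and reluctant quantifiers is easiest to see side by side. This is a minimal sketch assuming Python's `re` module; the input strings are invented.

```python
import re

html = "<b>bold</b> and <i>italic</i>"

# Greedy: .+ expands as far as possible, so one match spans the whole string.
greedy = re.findall(r"<.+>", html)

# Reluctant: .+? expands only as far as needed to reach the next ">".
reluctant = re.findall(r"<.+?>", html)

# The same contrast with a bounded quantifier.
greedy_bounded = re.match(r"a{2,4}", "aaaaaa").group()
reluctant_bounded = re.match(r"a{2,4}?", "aaaaaa").group()

print(greedy, reluctant, greedy_bounded, reluctant_bounded)
```

Both forms accept the same repetition counts; they differ only in which count is tried first.
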
Other | |
| Alternation | |
| X|Y | Tries to match X first; if that fails, tries Y |
| Grouping - Parentheses both group and create a numbered element that can be used later. | |
| (X) | Matches X as a capturing group. The matched text is remembered so it can be referenced later. Groups are numbered starting at 1. |
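
Alternation and capturing groups can be sketched as follows, assuming Python's `re` module. The back-reference syntax `\1` used here is Python's; other engines may spell references to groups differently.

```python
import re

# Alternation: at each position X is tried before Y.
animals = re.findall(r"cat|dog", "dog cat catdog")

# Because "a" is tried first and succeeds, "ab" is never attempted.
first_wins = re.match(r"a|ab", "ab").group()

# Capturing groups are numbered from 1, left to right.
m = re.fullmatch(r"(\d{4})-(\d{2})-(\d{2})", "2024-05-01")
year, month, day = m.group(1), m.group(2), m.group(3)

# \1 inside the pattern refers back to what group 1 matched,
# which finds words that are immediately repeated.
doubled = re.findall(r"\b(\w+) \1\b", "this is is a test test")

print(animals, first_wins, year, month, day, doubled)
```
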