

Repeating Patterns


When we want to match a character or character class occurring more than once, we can use quantifiers, which specify the minimum and maximum number of times the preceding entity can occur. A quantifier is written by placing the minimum and maximum counts in braces, separated by a comma: {min,max}.

One common misspelling is the word lose written as loose. To match either spelling, you could use the expression "lo{1,2}se", which matches lose and loose, but neither lse nor looose:

regex_play(array("loser", "looser", "lser", "looooser"),
           'lo{1,2}se');

regex_play called to match 'lo{1,2}se':
Array Index 0 matches: array ( 0 => 'lose', ) "loser"
Array Index 1 matches: array ( 0 => 'loose', ) "looser"
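
The regex_play helper is not defined in this section. As a rough stand-in, the same test can be sketched with PHP's PCRE function preg_match. This assumes PCRE syntax, which requires delimiters such as / around the pattern; the function name first_matches is hypothetical and exists only for this illustration.

```php
<?php
// Hypothetical stand-in for regex_play: returns the first match found in
// each input string, keyed by that input's array index.
function first_matches(string $pattern, array $inputs): array
{
    $results = [];
    foreach ($inputs as $index => $input) {
        // preg_match returns 1 on a match and stores the matched text in $m[0].
        if (preg_match($pattern, $input, $m) === 1) {
            $results[$index] = $m[0];
        }
    }
    return $results;
}

// Assumption: PCRE delimiters ('/') wrap the pattern from the text.
$bounded = first_matches('/lo{1,2}se/', ["loser", "looser", "lser", "looooser"]);
print_r($bounded);
```

Only indexes 0 and 1 should appear in the result, mirroring the output above: "lser" has no o at all, and "looooser" has more than two.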

You can also omit the upper bound, in which case any number of repetitions greater than or equal to the minimum matches:

regex_play(array("loser", "looser", "lser", "looooser"),
           'lo{1,}se');

regex_play called to match 'lo{1,}se':
Array Index 0 matches: array ( 0 => 'lose', ) "loser"
Array Index 1 matches: array ( 0 => 'loose', ) "looser"
Array Index 3 matches: array ( 0 => 'loooose', ) "looooser"

Three extremely common repeating patterns get their own special quantifiers:

  • {0,} This is represented by the special quantifier *, which means match zero or more of the preceding entity.

  • {1,} This is represented by the special quantifier +, which means match one or more of the preceding entity.

  • {0,1} This is represented by the special quantifier ?, which means the preceding entity is optional: it may occur once or not at all.
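
One way to convince yourself that these shorthands are equivalent to their brace forms is to run both against the same inputs and compare the results. The following is a minimal sketch, assuming PHP's PCRE function preg_match with / delimiters; the variable names are illustrative only.

```php
<?php
// Each pair holds a shorthand pattern and its explicit {min,max} form.
$pairs = [
    ['/lo*se/', '/lo{0,}se/'],
    ['/lo+se/', '/lo{1,}se/'],
    ['/lo?se/', '/lo{0,1}se/'],
];
$inputs = ["lser", "loser", "looser", "looooser"];

$agree = true;
foreach ($pairs as [$short, $explicit]) {
    foreach ($inputs as $input) {
        $a = preg_match($short, $input, $ma);
        $b = preg_match($explicit, $input, $mb);
        // Both forms should agree on whether they match and on the matched text.
        if ($a !== $b || $ma !== $mb) {
            $agree = false;
        }
        printf("%-10s %-10s %s\n", $short, $input, $a ? $ma[0] : '(no match)');
    }
}
var_dump($agree);
```

Note how the three differ on the same inputs: * and ? accept "lser" because they allow zero occurrences of o, while + does not, and ? rejects "looser" because it allows at most one.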

For example, to match any sequence of digits ending in 99, we can use the regular expression "[0-9]*99". Because * permits zero repetitions, this expression also matches 99 on its own.
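
To see how this expression behaves on a few inputs, here is a short sketch. It assumes PHP's PCRE function preg_match with / delimiters rather than the regex_play helper, and the sample strings are made up for illustration.

```php
<?php
$pattern = '/[0-9]*99/';
$inputs  = ["1299", "99", "1234", "abc599x", "9"];

$found = [];
foreach ($inputs as $index => $input) {
    // Record the matched text, keyed by the input's array index.
    if (preg_match($pattern, $input, $m) === 1) {
        $found[$index] = $m[0];
    }
}
print_r($found);
```

Since the pattern is not tied to the start or end of the string, it also finds a digit run ending in 99 inside a longer string such as "abc599x", where the matched text is just the digits.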

