PHP

The Dot

One special character, the period or dot (.), is used in regular expressions to match any single character (in PHP's PCRE functions, any character except a newline by default). Therefore, the regular expression "s.n" matches sun, sin, son, s!n, s%n, and sSn. The dot must, however, match exactly one character, so sn does not match this expression. To match an actual period in your string, escape the dot with a backslash (\.).
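A quick way to check this behavior is with PHP's preg_match function. The sketch below is only illustrative: it assumes the PCRE functions (which require delimiters such as / around the pattern) rather than the the_regex helper used later in this section, and the variable names are made up for the example.

```php
<?php
// Illustrative sketch: which strings contain a match for s.n?
$candidates = array("sun", "s!n", "sSn", "sn");
$results = array();
foreach ($candidates as $word) {
    // preg_match returns 1 when the pattern matches and 0 when it does not
    $results[$word] = preg_match('/s.n/', $word);
}

// An escaped dot matches only a literal period
$literalHit  = preg_match('/s\.n/', "s.n");
$literalMiss = preg_match('/s\.n/', "sun");

print_r($results);
```

Here "sn" is the only candidate that fails, because the dot has no character to consume.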

A dot inside a character class has no special meaning and is not treated as a metacharacter; it simply matches a period. For example, to match some common characters seen at the end of a word, we might write [.,:;'">?!\]-]. The hyphen is placed last so that it is read as a literal hyphen rather than as the range operator between two characters.
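The following sketch shows the difference between a dot inside and outside a character class. It assumes PHP's preg_match and a $ anchor to pin the class to the end of the string; the class is written with the hyphen last so that it is taken literally, and the sample strings are invented for illustration.

```php
<?php
// Illustrative sketch: inside a class, the dot is just a period
$insideClass  = preg_match('/a[.]c/', "abc");  // period required between a and c
$insideClass2 = preg_match('/a[.]c/', "a.c");
$outsideClass = preg_match('/a.c/', "abc");    // any character allowed

// End-of-word punctuation class; \' is only PHP string quoting for '
$endPunct = '/[.,:;\'">?!\]-]$/';
$checks = array();
foreach (array("done.", "done", "wait-", "list]") as $word) {
    $checks[$word] = preg_match($endPunct, $word);
}
print_r($checks);
```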

Repeating Patterns

When we want to match a character or character class occurring more than once, we can use quantifiers, which enable us to specify the minimum and maximum number of times the preceding entity can occur. Quantifiers are specified by placing the minimum and maximum counts in braces, separated by a comma and written without spaces: {min,max}.

One common misspelling seen these days is the word lose spelled as loose. To match either of these, you could use the following expression: "lo{1,2}se", which would match lose and loose, but neither lse nor looose:

the_regex(array("loser", "looser", "lser", "looooser"),
           'lo{1,2}se');
the_regex called to match 'lo{1,2}se':
Array Index 0 matches: array ( 0 => 'lose', ) "loser"
Array Index 1 matches: array ( 0 => 'loose', ) "looser"

You can, if you so desire, omit the upper bound, in which case any number greater than or equal to the minimum bound matches:

the_regex(array("loser", "looser", "lser", "looooser"),
           'lo{1,}se');
the_regex called to match 'lo{1,}se':
Array Index 0 matches: array ( 0 => 'lose', ) "loser"
Array Index 1 matches: array ( 0 => 'loose', ) "looser"
Array Index 3 matches: array ( 0 => 'loooose', ) "looooser"

Three extremely common repeating patterns get their own special quantifiers:

  • {0,} This is represented by the special quantifier *, which means match zero or more of the preceding entity.

  • {1,} This is represented by the special quantifier +, which means match one or more of the preceding entity.

  • {0,1} This sequence denotes that something can optionally exist, but only once if it does. It is represented by the special quantifier ?.
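To see that each shorthand behaves like its brace form, the sketch below compares them on a few sample strings. It assumes PHP's preg_match; the patterns and variable names are chosen only for illustration.

```php
<?php
// Illustrative sketch: each shorthand quantifier paired with its {min,max} form
$subjects = array("lse", "lose", "loose");
$pairs = array(
    '/lo*se/' => '/lo{0,}se/',   // zero or more o's
    '/lo+se/' => '/lo{1,}se/',   // one or more o's
    '/lo?se/' => '/lo{0,1}se/',  // at most one o
);

$allAgree = true;
foreach ($pairs as $short => $long) {
    foreach ($subjects as $s) {
        // Both forms should give the same result for every subject
        if (preg_match($short, $s) !== preg_match($long, $s)) {
            $allAgree = false;
        }
    }
}

$starOnLse     = preg_match('/lo*se/', "lse");    // * allows zero o's
$plusOnLse     = preg_match('/lo+se/', "lse");    // + needs at least one o
$questionLoose = preg_match('/lo?se/', "loose");  // ? allows at most one o
var_dump($allAgree);
```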

For example, to match any sequence of digits ending in 99, we can use the regular expression "[0-9]*99".
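As a rough check of this expression, the sketch below runs it through PHP's preg_match and captures the matched text. The sample strings and variable names are assumptions made for the example. Note that the pattern is not anchored, so it finds a run of digits ending in 99 anywhere inside a longer string.

```php
<?php
// Illustrative sketch: find a run of digits ending in 99
$pattern = '/[0-9]*99/';

$found1 = preg_match($pattern, "Invoice 12399 paid", $m1); // digits inside text
$found2 = preg_match($pattern, "99", $m2);                 // * may match zero digits
$found3 = preg_match($pattern, "1234", $m3);               // no 99 present

// $m1[0] and $m2[0] hold the text that matched
print_r($m1);
print_r($m2);
```

In the first case the * initially consumes all five digits and then gives two back so that the trailing 99 can match.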

by BrainBell