Subpatterns in a regular expression can be grouped by placing parentheses around them. This allows the optional and repeating operators to be applied to groups rather than just a single character. For example, the expression:

 ereg("(123)+", $var)

matches "123", "123123", "123123123", etc. Grouping characters allows complex patterns to be expressed, as in the following example that matches a URL:

// A simple, incomplete HTTP URL regular expression that doesn't allow digits
$pattern = '^(http://)?[a-zA-Z]+(\.[a-zA-Z]+)+$';
$found = ereg($pattern, "www.ora.com"); // true

The regular expression assigned to $pattern includes both the start and end anchors, ^ and $, so the whole subject string, "www.ora.com", must match the pattern. The pattern starts with the optional group of characters "http://", as specified by "(http://)?". This doesn't match any of the subject string in the example but doesn't rule out a match, because the "http://" pattern is optional. Next, the "[a-zA-Z]+" pattern specifies one or more alphabetic characters, and this matches "www" from the subject string. The next pattern is the group "(\.[a-zA-Z]+)". This group must start with a period (the wildcard meaning of . is escaped with the backslash), followed by one or more alphabetic characters. The group is followed by the + operator, so it must occur at least once in the subject and can repeat many times. In the example, the first occurrence is ".ora" and the second occurrence is ".com".
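
As a rough check of the walkthrough above, the following sketch runs the URL pattern against a few subject strings. It assumes a PHP installation where ereg( ) is not available (the function was removed in PHP 7), so preg_match( ) is swapped in, with the same pattern wrapped in # delimiters and the second character class written as [a-zA-Z]. The helper name matches_simple_url( ) and the extra subject strings are made up for illustration.

```php
<?php
// Hypothetical helper: the URL pattern from the text, run through
// preg_match() because ereg() is unavailable in PHP 7 and later.
function matches_simple_url(string $subject): bool
{
    // # is used as the delimiter so the slashes in http:// need no escaping
    $pattern = '#^(http://)?[a-zA-Z]+(\.[a-zA-Z]+)+$#';
    return preg_match($pattern, $subject) === 1;
}

var_dump(matches_simple_url("www.ora.com"));        // bool(true)
var_dump(matches_simple_url("http://www.ora.com")); // bool(true): optional group is present
var_dump(matches_simple_url("ora"));                // bool(false): the repeated group must occur at least once
var_dump(matches_simple_url("www.ora2.com"));       // bool(false): digits are not allowed
```

The last two calls fail for the reasons given in the text: the group "(\.[a-zA-Z]+)" is required at least once, and the character classes contain only letters.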

Groups can also define subpatterns when ereg( ) extracts values into an array. We discuss the use of ereg( ) to extract values later in this section.

Alternative patterns

Alternatives in a pattern are specified with the | operator; for example, the pattern "cat|bat|rat" matches "cat", "bat", or "rat". The | operator has the lowest precedence of the regular expression operators, treating the largest surrounding expressions as alternative patterns. To match "cat", "bat", or "rat" another way, the following expression can be used:

$var = "bat";
$found = ereg("(c|b|r)at", $var);
// $found is true
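
The low precedence of | becomes visible once anchors are added. The sketch below is an illustration only: it uses preg_match( ) as a stand-in for ereg( ) (assuming PHP 7 or later, where ereg( ) no longer exists), and the variable names and the subject "wombat" are invented for the example. The ungrouped pattern is read as "^cat" or "bat" or "rat$", so each anchor applies to only one alternative; grouping makes both anchors apply to all three.

```php
<?php
// Without a group, | splits the whole pattern: (^cat) | (bat) | (rat$)
$loose = '/^cat|bat|rat$/';
// With a group, both anchors apply to every alternative
$strict = '/^(c|b|r)at$/';

var_dump(preg_match($loose, "wombat") === 1);  // bool(true): "bat" may appear anywhere
var_dump(preg_match($strict, "wombat") === 1); // bool(false): the whole string must be cat, bat, or rat
var_dump(preg_match($strict, "bat") === 1);    // bool(true)
```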

Another example shows alternative beginnings to a pattern:

// match some URLs
$pattern = '(^ftp|^http|^gopher)://';
$found = ereg($pattern, "http://www.ora.com");
// $found is true
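
Because all three alternatives begin with the same anchor, the anchor can also be moved outside the group. The following sketch compares the two forms on a handful of subjects. It again substitutes preg_match( ) for ereg( ) (an assumption for PHP 7 and later), uses # as the delimiter so the slashes need no escaping, and the helper name same_result( ) and all subject strings other than "http://www.ora.com" are illustrative.

```php
<?php
$original = '#(^ftp|^http|^gopher)://#';
$factored = '#^(ftp|http|gopher)://#';

// Hypothetical helper: true when both patterns agree on a subject
function same_result(string $a, string $b, string $subject): bool
{
    return preg_match($a, $subject) === preg_match($b, $subject);
}

$subjects = array(
    "http://www.ora.com",         // starts with an accepted scheme
    "gopher://example.org",       // starts with an accepted scheme
    "mailto:someone@example.org", // scheme is not in the list
    "see http://www.ora.com",     // scheme is not at the start of the string
);

foreach ($subjects as $subject) {
    $matched = preg_match($factored, $subject) === 1;
    printf("%-28s %s\n", $subject, $matched ? "match" : "no match");
}
```

The first two subjects match and the last two do not, and both patterns give the same answer for each.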
