PHP

Basic Searches

In its most basic usage, a regular expression contains a character or set of characters to match in the input string. If we have the following array in our PHP scripts

$clothes = array("shoes", "pants", "socks", "jacket", "cardigan",
  "scarf", "t-shirt", "blouse", "underpants", "belt",
  "handbag",
);

then to see which strings contain the letter a, we could write the following:

the_regex($clothes, 'a');

This function outputs the following:

the_regex called to match 'a':
Array Index 1 matches: array ( 0 => 'a', ) "pants"
Array Index 3 matches: array ( 0 => 'a', ) "jacket"
Array Index 4 matches: array ( 0 => 'a', ) "cardigan"
Array Index 5 matches: array ( 0 => 'a', ) "scarf"
Array Index 8 matches: array ( 0 => 'a', ) "underpants"
Array Index 10 matches: array ( 0 => 'a', ) "handbag"
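The the_regex helper itself is not defined in this section. As a rough idea of how such a helper could be put together, here is a minimal sketch. The name the_regex_sketch, its return value, and its output format are our own assumptions, and it uses the PCRE function preg_match rather than the POSIX ereg functions, which are no longer available as of PHP 7:

```php
<?php
// Hypothetical stand-in for the_regex(); name and output format are assumptions.
function the_regex_sketch(array $strings, string $pattern): array
{
    // Wrap the pattern in PCRE delimiters. This naive escaping assumes the
    // pattern does not already contain an escaped "/" character.
    $regex = '/' . str_replace('/', '\/', $pattern) . '/';
    $found = [];

    echo "the_regex called to match '$pattern':\n";
    foreach ($strings as $index => $string) {
        // preg_match() returns 1 once it finds a match and fills $matches.
        if (preg_match($regex, $string, $matches) === 1) {
            $found[$index] = $matches;
            echo "Array Index $index matches: '{$matches[0]}' in \"$string\"\n";
        }
    }
    return $found; // matches keyed by the index of the input string
}

$clothes = array("shoes", "pants", "socks", "jacket", "cardigan",
  "scarf", "t-shirt", "blouse", "underpants", "belt",
  "handbag",
);

$result = the_regex_sketch($clothes, 'a');
```

Returning the matches keyed by array index, in addition to printing them, is simply a convenience that lets the caller inspect the results afterwards.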

It is interesting to look at the last item in the array: handbag. We might intuitively ask why the results array does not contain two instances of the letter a, given that the input string contains two. The answer lies in how the POSIX regular expression processor works: as soon as it satisfies the condition (here, finding a single letter a), it stops processing.
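The same first-match behavior can be observed with PHP's PCRE functions, which we use here as a stand-in for the POSIX processor discussed above. In this small sketch (the variable names are our own), preg_match stops at the first a in handbag, whereas preg_match_all keeps scanning and reports both:

```php
<?php
$first = [];
$all = [];

// preg_match() stops after the first successful match.
$firstCount = preg_match('/a/', 'handbag', $first);

// preg_match_all() continues through the string and collects every match.
$allCount = preg_match_all('/a/', 'handbag', $all);

echo "preg_match found $firstCount match: ";
var_export($first);
echo "\npreg_match_all found $allCount matches: ";
var_export($all[0]);
echo "\n";
```

If we need every occurrence rather than a simple yes-or-no answer, a function in the style of preg_match_all is the more suitable choice.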

To find all those entries that contain pants, we could write the following:

the_regex($clothes, 'pants');

The output would be as follows:

the_regex called to match 'pants':
Array Index 1 matches: array ( 0 => 'pants', ) "pants"
Array Index 8 matches: array ( 0 => 'pants', ) "underpants"

Given that the pants in underpants also matched our regular expression, we see further evidence that the regular expression is just matching characters. By default, it does not care about word boundaries or whether the text it seeks is embedded within other characters.
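To make the contrast concrete, the following sketch compares a plain character match with one that insists on word boundaries. It relies on the PCRE function preg_grep and the \b assertion, neither of which appears in the examples above, so treat it as an illustration rather than as part of the the_regex helper:

```php
<?php
$clothes = array("shoes", "pants", "socks", "jacket", "cardigan",
  "scarf", "t-shirt", "blouse", "underpants", "belt",
  "handbag",
);

// Plain match: "pants" may appear anywhere, even inside a longer word.
$anywhere = preg_grep('/pants/', $clothes);

// \b asserts a word boundary, so "pants" must stand alone as a word.
$wholeWord = preg_grep('/\bpants\b/', $clothes);

// preg_grep() preserves the original array keys.
var_export($anywhere);
echo "\n";
var_export($wholeWord);
echo "\n";
```

The first call keeps both pants and underpants, while the second keeps only pants, because there is no word boundary between the r and the p in underpants.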

We can also search for multi-byte characters, assuming we have correctly enabled the mbstring extension:

$mb_strings = array("",
                    "",
                    "");
the_regex($mb_strings, "");

The output from the preceding would be as follows:

the_regex called to match '':
Array Index 0 matches: array ( 0 => '', ) ""
Array Index 2 matches: array ( 0 => '', ) ""
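As a self-contained illustration of a multi-byte search, here is a minimal sketch. The sample strings and the helper name mb_matches_sketch are our own assumptions and are not the data used in the example above. The sketch uses mb_ereg when the mbstring extension is loaded and otherwise falls back to PCRE with the u modifier:

```php
<?php
// Illustrative UTF-8 sample strings (assumptions, not the original data).
$mb_strings = array("日本語のテキスト", "plain ASCII text", "テキスト処理");
$needle = "テキスト";

// Hypothetical helper: true if $pattern matches somewhere in $subject.
function mb_matches_sketch(string $pattern, string $subject): bool
{
    if (function_exists('mb_ereg')) {
        mb_regex_encoding('UTF-8'); // treat pattern and subject as UTF-8
        return mb_ereg($pattern, $subject) !== false;
    }
    // Fallback if mbstring is not loaded: PCRE with the u (UTF-8) modifier.
    // Assumes the pattern contains no "/" characters.
    return preg_match('/' . $pattern . '/u', $subject) === 1;
}

$matched = [];
foreach ($mb_strings as $index => $string) {
    if (mb_matches_sketch($needle, $string)) {
        $matched[] = $index;
        echo "Array Index $index matches: \"$string\"\n";
    }
}
```

The script file itself must be saved as UTF-8 for the string literals to be interpreted as intended.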

by BrainBell