PHP

Searching in Strings

if (strpos($string, $substring) === false) {
  echo 'No match found.';
} else {
  echo 'Match found.';
}

To look for a substring in a string, use strpos() (or its counterpart strrpos(), which searches from the end of the string). The tricky thing about this function is that it returns the index of the first occurrence of the substring, or false if there is no occurrence. That means that a plain truth test such as if (strpos($string, $substring)) is incorrect.

It is incorrect because if $string happens to start with $substring, strpos() returns 0, which evaluates to false. Therefore, a comparison using === or !== must be used to take the data type into account. The code at the beginning of this section shows how to correctly use strpos().
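The difference between the loose and the strict check can be seen in a short sketch. The helper name containsSubstring() and the sample string are illustrative assumptions, not part of the original text.

```php
<?php
// Hypothetical helper: wraps the strict comparison described above.
function containsSubstring(string $haystack, string $needle): bool
{
    // strpos() returns an int offset (possibly 0) or false, so compare with !==.
    return strpos($haystack, $needle) !== false;
}

$string = 'PHP Phrasebook';

// The match is at offset 0, so a loose truth test would report "no match".
var_dump(strpos($string, 'PHP'));            // int(0)
var_dump((bool) strpos($string, 'PHP'));     // bool(false)
var_dump(containsSubstring($string, 'PHP')); // bool(true)

// strrpos() reports the offset of the last occurrence instead of the first.
var_dump(strpos($string, 'P'));  // int(0)
var_dump(strrpos($string, 'P')); // int(4)
```

Wrapping the comparison in a small function like this keeps the === / !== detail in one place, which may be convenient if the check is repeated often.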

Understanding Regular Expressions

Regular expressions are, to put it simply, patterns that can be matched against strings. Two kinds of regular expressions are available in PHP: POSIX regular expressions and Perl-compatible regular expressions. The former can be installed when configuring PHP with the switch --with-regex. Windows users do not have to do this; the support for POSIX regex is enabled by default.

The alternative is Perl-compatible regular expressions (PCRE). PCRE is often said to be faster, and does offer more features. This functionality is enabled in PHP by default; however, if you compile PHP yourself, you can deactivate PCRE using the switch --without-pcre-regex.
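Whether PCRE is present in a given build can be checked at runtime. The following is a minimal sketch; on recent PHP versions the extension is always compiled in, so the else branch mainly matters for older custom builds.

```php
<?php
// extension_loaded() reports whether the named extension is available.
if (extension_loaded('pcre')) {
    // PCRE_VERSION is a predefined constant holding the library version string.
    echo 'PCRE is available, version ', PCRE_VERSION, "\n";
} else {
    echo "PCRE is not available in this build.\n";
}
```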

In its simplest form, a regular expression pattern is a string that is searched for in a larger string. However, that can also be done (and faster) using strpos(). The advantage of regular expressions is that special features such as wildcards are available. The table below shows some special characters and their meanings.

Special Characters in Regular Expressions

Special Character                Description                                                         Example
^                                Beginning of the string                                             ^a means a string that starts with a
$                                End of the string                                                   a$ means a string that ends with a
?                                0 or 1 times (refers to the previous character or expression)       ab? means a or ab
*                                0 or more times (refers to the previous character or expression)    ab* means a or ab or abb or ...
+                                1 or more times (refers to the previous character or expression)    ab+ means ab or abb or abbb or ...
[...]                            Alternative characters                                              PHP[45] means PHP4 or PHP5
- (used within square brackets)  A sequence of values                                                PHP[3-5] means PHP3 or PHP4 or PHP5
^ (used within square brackets)  Matches anything but the following characters                       [^A-C] means D or E or F or ...
|                                Alternative patterns                                                PHP4|PHP5 means PHP4 or PHP5
(...)                            Defines a subpattern                                                (a)(b) means ab, but with two subpatterns (a and b)
.                                Any character (by default, except a newline)                        . means a, b, c, 0, 1, 2, $, ^, ...
{min,max}                        Between min and max occurrences; omitting max means no upper limit  a{1,3} means a, aa, or aaa. a{1,} means a, aa, aaa, ...
\                                Escapes the following character                                     \. stands for .

Omitting min, as in a{,3}, is not portable: older PCRE versions treat it as literal text. Write a{0,3} to match the empty string, a, aa, or aaa.
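Several of the table entries can be tried out with PCRE's preg_match(). This is a sketch under a few assumptions: the helper name matches() is hypothetical, the sample strings are made up, and the patterns are wrapped in / delimiters, which the preg functions require. Most patterns are anchored with ^ and $ so that they must match the whole string.

```php
<?php
// Hypothetical helper: true if $pattern matches somewhere in $subject.
function matches(string $pattern, string $subject): bool
{
    // preg_match() returns 1 on a match, 0 on no match, and false on error.
    return preg_match($pattern, $subject) === 1;
}

var_dump(matches('/^a/', 'abc'));             // bool(true):  starts with a
var_dump(matches('/a$/', 'abc'));             // bool(false): does not end with a
var_dump(matches('/^ab?$/', 'a'));            // bool(true):  b is optional
var_dump(matches('/^ab+$/', 'a'));            // bool(false): at least one b required
var_dump(matches('/^PHP[3-5]$/', 'PHP4'));    // bool(true):  4 lies in the range 3-5
var_dump(matches('/^[^A-C]$/', 'D'));         // bool(true):  D is not A, B, or C
var_dump(matches('/^(PHP4|PHP5)$/', 'PHP5')); // bool(true):  second alternative
var_dump(matches('/^a{1,3}$/', 'aaaa'));      // bool(false): more than three a's
var_dump(matches('/^a.c$/', 'abc'));          // bool(true):  . matches any character
var_dump(matches('/^a\.c$/', 'abc'));         // bool(false): \. matches only a literal dot

// Subpatterns are captured separately, after the full match at index 0.
preg_match('/(a)(b)/', 'ab', $groups);
print_r($groups); // index 0 holds 'ab', index 1 holds 'a', index 2 holds 'b'
```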


The de facto standard reference for regular expressions is the title Mastering Regular Expressions, by Jeffrey E. F. Friedl. A bit dated, but a fun read.

Other special characters and expressions are available, for instance an expression that matches a digit. However, the notation differs between POSIX and PCRE, which use [[:digit:]] and \d, respectively.
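The two digit notations can be compared side by side with the PCRE functions, because PCRE also accepts POSIX character classes inside square brackets. The sample string below is an assumption made up for illustration.

```php
<?php
$subject = 'Room 101, floor 3';

// PCRE shorthand for a digit.
preg_match_all('/\d+/', $subject, $shorthand);

// POSIX character class, written inside a bracket expression.
preg_match_all('/[[:digit:]]+/', $subject, $posixClass);

print_r($shorthand[0]);                     // the two matches: '101' and '3'
var_dump($shorthand[0] === $posixClass[0]); // bool(true)
```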

by BrainBell