REGEXP Matching Multiple Instances
Ads:
If you want to locate all numbers regardless of how many digits the number contains, or you might want to locate a word but also be able to accommodate a trailing s if one exists, and so on.
This can be accomplished using the regular expressions repetition metacharacters, listed in following table.
| Metacharacter | Description |
|---|---|
|
|
0 or more matches |
|
|
1 or more matches (equivalent to |
|
|
0 or 1 match (equivalent to |
|
|
Specific number of matches |
|
|
No less than a specified number of matches |
|
|
Range of matches ( |
Following are some examples.
SELECT prod_name FROM products WHERE prod_name REGEXP '\\(products? [0-9]\\)' ORDER BY prod_name;
The above statement displayed the following output:
+------------------+
| prod_name |
+------------------+
| abc (product 1) |
| abc (product 2) |
| abc (products 1) |
.
.
.
| abc (products 10)|
+------------------+
Regular expression \\(products? [0-9]\\) requires some explanation. \\( matches (, [0-9] matches any digit, products? matches product and products (the ? after the s makes that s optional because ? matches 0 or 1 occurrence of whatever it follows), and \\) matches the closing ). Without ? it would have been very difficult to match both stick and sticks.
Here's another example. This time we'll try to match four consecutive digits:
SELECT prod_name
FROM products
WHERE prod_name REGEXP '[[:digit:]]{4}'
ORDER BY prod_name;
The above code displayed the following output:
+--------------+ | prod_name | +--------------+ | product 1000 | | product 2000 | +--------------+
As explained previously, [:digit:] matches any digit, and so [[:digit:]] is a set of digits. {4} requires exactly four occurrences of whatever it follows (any digit), and so [[:digit:]]{4} matches any four consecutive digits.
It is worth noting that when using regular expressions there is almost always more than one way to write a specific expression. The previous example could have also been written as follows:
SELECT prod_name FROM products WHERE prod_name REGEXP '[0-9][0-9][0-9][0-9]' ORDER BY prod_name;