[0-9]{5,5}([- ]?[0-9]{4,4})?

The first part of this regular expression, `[0-9]{5,5}`, is rather straightforward, but the second part, `([- ]?[0-9]{4,4})?`, might seem a little less so. In effect, we have grouped the entire "plus 4" sequence with parentheses and qualified the group with a `?` character, saying that it may either be absent or appear exactly once. Inside the group, we have said that it optionally starts with either a dash or a space (we are very forgiving) with `[- ]?`, and then that four more digits must follow with `[0-9]{4,4}`.
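To see how the optional group behaves, here is a minimal sketch in Python (the choice of language is an assumption; the text does not prescribe one). The names `ZIP_PATTERN` and `is_zip` are illustrative only, and `fullmatch` is used so that the pattern must account for the entire input rather than just a prefix of it.

```python
import re

# The ZIP code pattern from the text, unchanged.
ZIP_PATTERN = re.compile(r"[0-9]{5,5}([- ]?[0-9]{4,4})?")

def is_zip(text):
    """Return True if the whole string is a 5-digit ZIP or a ZIP+4 code."""
    return ZIP_PATTERN.fullmatch(text) is not None

# Try the bare form, the three "plus 4" forms, and two malformed inputs.
for candidate in ["98101", "98101-1234", "98101 1234", "981011234",
                  "9810", "98101-12"]:
    print(candidate, is_zip(candidate))
```

The last two candidates are rejected: one has too few digits, and the other starts a "plus 4" block without completing it.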

Canadian postal codes, on the other hand, are quite straightforward to determine. They are always of the format `X#X #X#`, where `#` represents a digit and `X` a letter from the English alphabet. A regular expression for this would be as follows:

[A-Za-z][0-9][A-Za-z][[:space:]]*[0-9][A-Za-z][0-9]

We have been a little forgiving and let the user put any number of whitespace characters (including none) between the two blocks of three.
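The following sketch tries this pattern in Python, again as an assumed language. Python's `re` module does not support POSIX character class names, so `\s` is substituted for the whitespace class here; the names `POSTAL_PATTERN` and `is_postal_code` are hypothetical.

```python
import re

# \s* stands in for the POSIX whitespace class, which Python's re lacks.
POSTAL_PATTERN = re.compile(r"[A-Za-z][0-9][A-Za-z]\s*[0-9][A-Za-z][0-9]")

def is_postal_code(text):
    """Return True if the whole string has the shape X#X #X#."""
    return POSTAL_PATTERN.fullmatch(text) is not None

# Whitespace between the two blocks is optional and may repeat;
# any other separator, or a digit in a letter position, is rejected.
for candidate in ["K1A 0B1", "k1a0b1", "K1A   0B1", "K1A-0B1", "1KA 0B1"]:
    print(candidate, is_postal_code(candidate))
```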

If we wanted to do a bit more research, however, we would realize that not all letters are valid in Canadian postal codes. For the first letter, in fact, only the letters in `[ABCEGHJKLMNPRSTVXY]` are valid. We could rewrite our regular expression as follows:

[ABCEGHJKLMNPRSTVXYabceghjklmnprstvxy][0-9][A-Za-z]
[[:space:]]*[0-9][A-Za-z][0-9]

(We have split the above regular expression onto two lines for formatting purposes only; it should be entered as a single line with nothing between the two halves.)
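As a rough illustration of the stricter rule, the sketch below builds the expression from its two halves in Python (an assumed language) and checks a few inputs. As before, `\s` replaces the POSIX whitespace class, which Python's `re` module does not provide, and the names `STRICT_POSTAL_PATTERN` and `is_strict_postal_code` are made up for this example.

```python
import re

# The two halves are concatenated with nothing in between.
FIRST_HALF = r"[ABCEGHJKLMNPRSTVXYabceghjklmnprstvxy][0-9][A-Za-z]"
SECOND_HALF = r"\s*[0-9][A-Za-z][0-9]"
STRICT_POSTAL_PATTERN = re.compile(FIRST_HALF + SECOND_HALF)

def is_strict_postal_code(text):
    """Return True if the string is X#X #X# with a permitted first letter."""
    return STRICT_POSTAL_PATTERN.fullmatch(text) is not None

# D and W are not in the permitted set of first letters; V and X are.
for candidate in ["V6B 1A1", "x0a0a0", "D1A 0B1", "W1A 0B1"]:
    print(candidate, is_strict_postal_code(candidate))
```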
