Finding a string with preg_match()

preg_match() takes a regular expression (regex) and a source string, plus an optional third argument: an array variable that receives the matched text. It returns 1 if a match is found and 0 if no match is found (and false if an error occurs). Let's see the following example:

$source = "Michael Jordan is a great player";
if ( preg_match( "/J.r/", $source, $array ) )
    print $array[0];
// prints Jor

Here our regex, "/J.r/", says that we want a J, followed by any single character (denoted by the .; by default this excludes a newline), followed by an r. The / characters are known as delimiters and, as their name says, delimit the regex. Another example:

$source = "Michael Jordan is a great player";
if ( preg_match( "/J.*n/", $source, $array ) )
    print $array[0];
// prints Jordan

Here we are looking for a J, followed by any character (denoted by the .) repeated any number of times, including zero (denoted by the *), followed by an n. The * applies to whatever comes immediately before it, so the combination .* can be read as "any amount of any character".

Sometimes instead of * we want to use +, which means "one or more occurrences of the preceding element". The difference between the two is that * also allows zero occurrences.
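The difference between * and + can be sketched with a small comparison. The sample strings and variable names below are made up for illustration and are not part of the original examples:

```php
<?php
// Hypothetical sample strings, chosen only for illustration.
$with_o    = "goal";
$without_o = "gal";

// "o*" allows zero or more "o", so both strings match.
$star_with    = preg_match( "/go*a/", $with_o );     // 1
$star_without = preg_match( "/go*a/", $without_o );  // 1

// "o+" requires at least one "o", so "gal" does not match.
$plus_with    = preg_match( "/go+a/", $with_o );     // 1
$plus_without = preg_match( "/go+a/", $without_o );  // 0

print "$star_with $star_without $plus_with $plus_without\n";
// prints 1 1 1 0
```

Only the last call returns 0, because + is the one quantifier here that refuses an empty repetition.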

Quantifiers such as * and + are said to be greedy because they match as much text as possible: the match extends to the last occurrence of the character that follows them. For example:

$text = "string strong stung";
if ( preg_match( "/s.*g/", $text, $array ) )
    print $array[0];
// prints string strong stung

If we want the match to stop at the first occurrence of g, we add a ? after the quantifier. Placed there, ? makes the quantifier lazy (non-greedy), so it matches as little text as possible:

$text = "string strong stung";
if ( preg_match( "/s.*?g/", $text, $array ) )
    print $array[0];
// prints string
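A common place where greedy and lazy matching differ is text with repeated closing characters. The following is a minimal sketch; the sample text and the variable names $html, $greedy and $lazy are assumptions made for illustration:

```php
<?php
// Hypothetical sample text, not from the article.
$html = "<b>bold</b> and <i>italic</i>";

// Greedy: .* runs to the last ">" in the string.
preg_match( "/<.*>/", $html, $greedy );

// Lazy: .*? stops at the first ">" it can reach.
preg_match( "/<.*?>/", $html, $lazy );

print $greedy[0] . "\n";  // prints <b>bold</b> and <i>italic</i>
print $lazy[0] . "\n";    // prints <b>
```

Both patterns start matching at the same place; only the point where the match ends changes.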

We can also make use of generic character types to match only certain characters. For example:

$text = "Today is 10-15-03";
if ( preg_match( "/\d*-\d*-\d*/", $text, $array ) )
    print $array[0];
// prints 10-15-03

In this example the generic character type \d matches any decimal digit; the * after it allows any number of digits; then comes a literal -; then again any number of digits (\d*); and so on. Other generic character types are shown in the following table:

Character type   Meaning
\d               any decimal digit
\D               any character that is not a decimal digit
\s               any whitespace character
\S               any character that is not a whitespace character
\w               any "word" character
\W               any "non-word" character
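As a rough sketch of how these types combine, the example below mixes \w, \s and \d, and then contrasts \d* with \d+ on a string that contains no digits. The sample strings and the names $loose and $strict are hypothetical; single-quoted patterns are used here so the backslashes reach the regex engine unchanged:

```php
<?php
// Hypothetical sample text, not from the article.
$text = "Order 66 shipped";

// \w+ : one or more word characters, \s : one whitespace, \d+ : one or more digits.
preg_match( '/\w+\s\d+/', $text, $array );
print $array[0] . "\n";  // prints Order 66

// \d* accepts zero digits, so a string with no digits at all still matches.
$loose  = preg_match( '/\d*-\d*-\d*/', "a--b" );  // 1
// \d+ insists on at least one digit in each position.
$strict = preg_match( '/\d+-\d+-\d+/', "a--b" );  // 0
print "$loose $strict\n";  // prints 1 0
```

When a pattern is meant to recognise something like a date, + is usually the safer quantifier, since * lets every digit group be empty.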

You can use them to unleash the power of regular expressions.

About the Author: Diego Botello