PHP

Search and replace text and Unicode accents in PHP

In this tutorial we'll remove extra spaces between words, strip whitespace from the beginning or end of a string with the trim functions, and replace accented Unicode characters with ASCII equivalents. At the end we'll write a function that truncates text cleanly to a desired length.

Remove extra spaces

Clean up messy text by collapsing every run of whitespace (extra spaces, tabs, newlines, and so on) into a single space. The following code uses the preg_replace regex function.

$string = "It is   the   sample text   ";
$newString = preg_replace('/\s+/',' ', $string);
//Result: "It is the sample text " (the trailing spaces collapse to one space; trim() would remove it)
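
If the goal is a fully cleaned string, the collapse step can be combined with trim(). The following is a minimal sketch: the helper name normalizeSpaces() is hypothetical, and the u modifier assumes the input is valid UTF-8.

```php
<?php
// Hypothetical helper: collapse runs of whitespace, then trim both ends.
function normalizeSpaces($string) {
    // \s+ matches one or more whitespace characters (spaces, tabs, newlines).
    // The u modifier treats pattern and subject as UTF-8 (assumes valid UTF-8 input).
    $collapsed = preg_replace('/\s+/u', ' ', $string);

    // preg_replace returns null on error, for example on invalid UTF-8;
    // in that case fall back to trimming the original string.
    if ($collapsed === null) {
        return trim($string);
    }

    return trim($collapsed);
}

echo normalizeSpaces("  It is \t the\n sample   text   ");
// It is the sample text
```

Trimming after the collapse matters because a run of whitespace at either end is reduced to one space, not removed.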

Remove whitespace from the beginning or end of a string

We'll use ltrim(), rtrim(), or trim(). The ltrim() function removes whitespace (such as newline, carriage return, space, horizontal and vertical tab, and null) from the beginning of a string, rtrim() from the end of a string, and trim() from both the beginning and end of a string.

Example with rtrim() function:

$text = '  hello world  ';
$trimmedText = rtrim($text);
echo $trimmedText;

The above code outputs "  hello world": the spaces on the right side are trimmed, while the leading spaces remain.

Example with ltrim() function:

$text = '  hello world  ';
$trimmedText = ltrim($text);
echo $trimmedText;

The above code outputs "hello world  ": the spaces on the left side are trimmed, while the trailing spaces remain.

Example with trim() function:

$text = '  hello world  ';
$trimmedText = trim($text);
echo $trimmedText;

The above code outputs "hello world": the spaces are trimmed from both sides.
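
All three functions also accept an optional second argument listing the characters to strip in place of the default whitespace set. A small sketch, with sample values made up for illustration:

```php
<?php
// Sample values are illustrative only.
$path = '/blog/php/';

// Strip slashes from both ends
echo trim($path, '/'), "\n";       // blog/php

// Strip only the trailing slash
echo rtrim($path, '/'), "\n";      // /blog/php

// Strip leading zeros from a numeric string
echo ltrim('000123', '0'), "\n";   // 123
```

When a character list is passed, the default whitespace characters are no longer stripped unless they are included in the list.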

Remove or replace accents

In this code we'll convert data that's accented with diacritics (such as é) to plain ASCII in readable form. We'll use PHP's str_replace function to replace all the diacritic characters with standard ones.

Replace single character with str_replace

$text = "héllo";
$from = "é";
$to = "e";
$newText = str_replace($from, $to, $text);

Replace multiple characters with str_replace using array

$text = "Olá Mundo";
$from = array(
  'á','À','Á','Â','Ã','Ä','Å',
  'ß','Ç',
  'È','É','Ê','Ë',
  'Ì','Í','Î','Ï','Ñ',
  'Ò','Ó','Ô','Õ','Ö',
  'Ù','Ú','Û','Ü');

$to = array(
  'a','A','A','A','A','A','A',
  'ss','C',
  'E','E','E','E',
  'I','I','I','I','N',
  'O','O','O','O','O',
  'U','U','U','U');

$newText = str_replace($from, $to, $text);
//results : Ola Mundo
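
Keeping two parallel arrays in sync becomes error-prone as the list grows. As an alternative to str_replace (not the approach used above), the sketch below uses strtr with a single associative array, so each accented character sits next to its replacement. The map is a small illustrative subset, and the source file is assumed to be saved as UTF-8.

```php
<?php
// Accented character => ASCII replacement (illustrative subset, not a complete table).
$map = array(
    'á' => 'a', 'é' => 'e', 'í' => 'i', 'ó' => 'o', 'ú' => 'u',
    'Á' => 'A', 'É' => 'E', 'Ñ' => 'N', 'ñ' => 'n',
    'ç' => 'c', 'ß' => 'ss',
);

// With an array argument, strtr replaces each key with its value
// and does not rescan text it has already replaced.
$text = 'Olá Mundo, straße, niño';
$newText = strtr($text, $map);

echo $newText;
// Ola Mundo, strasse, nino
```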

Truncate text to a specific length

Have you noticed that search engine results neatly display snippets of text from each website without cutting a word in half? In this code snippet we'll cut long strings short in a similar manner.

We'll make a function which takes a string variable containing text to truncate, the maximum number of characters to allow in the new string, and a string to follow the truncated text.

We'll use the substr and strrpos functions. The substr function returns the portion of a string specified by the start and length parameters, and the strrpos function finds the position of the last occurrence of a substring in a string.

substr Syntax:

$stringPortion = substr ( 
           $stringToProcess , 
           $startPosition ,
           $optionalLength
              );

Note: in the string abcdefghi, the character at position 0 is a.

strrpos Syntax:

$intStringPosition = strrpos (
           $stringToSearch ,
           $needle ,
           $optionalOffset
              );

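Returns the position of the last occurrence of the needle, or false if the needle is not found. The search finds the rightmost match, but the position is still counted from the start of the string, and string positions start at 0.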
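
A few worked calls may make the two signatures more concrete. The sample strings are chosen only for illustration:

```php
<?php
$letters = 'abcdefghi';

echo substr($letters, 0, 3), "\n";   // abc    (3 characters starting at position 0)
echo substr($letters, 3), "\n";      // defghi (length omitted, so up to the end)
echo substr($letters, -3), "\n";     // ghi    (a negative start counts from the end)

// Position of the last "o", counted from the start of the string
var_dump(strrpos('hello world', 'o'));            // int(7)

// strrpos returns false when the needle is missing, so compare with ===
var_dump(strrpos('hello world', 'z') === false);  // bool(true)
```

The strict comparison matters because a match at position 0 is a valid result that a loose comparison would treat the same as false.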

Truncate text using substr and strrpos functions

$text = 'it is a very lengthy text';
$maxChar = 10;
$newText = substr($text,0,$maxChar);
echo $newText;

The above code will return "it is a ve", where the word "very" is cut off as "ve". To solve this problem we'll find the position of the last space in the truncated text.

$position = strrpos($newText, ' ');

Now that we know the position of the last space, we'll truncate our text again with substr at this position to get rid of the broken word.

$truncated = substr($text, 0, $position);

The above code will return it is a.

Complete code in custom function

function truncateText($text, $maxChar){
	//Nothing to cut if the text already fits
	if (strlen($text) <= $maxChar) {
		return $text;
	}
	
	//Cut the text at the maximum length
	$truncated = substr($text, 0, $maxChar);
	
	//Position of the last space, or false if there is none
	$position = strrpos($truncated, ' ');
	
	//Cut again at that space to drop the broken word
	if ($position !== false) {
		$truncated = substr($truncated, 0, $position);
	}
	
	return $truncated;
}

Now we'll test this function:

$text = 'Lorem ipsum dolor sit amet, consectetur adipiscing elit.
         Morbi sed massa convallis arcu vulputate sollicitudin.';
$trunText = truncateText($text,24);
echo $trunText;

We'll get the following result:

Lorem ipsum dolor sit
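
The description earlier also mentions a string to follow the truncated text. One way to support that is sketched below. The function name truncateWithSuffix() and the default '...' suffix are hypothetical choices, and the sketch reserves room for the suffix so the result never exceeds $maxChar. Like the code above, it counts bytes, so multibyte text would need the mb_* equivalents from the mbstring extension.

```php
<?php
// Hypothetical variant: cut at a word boundary and append a suffix,
// keeping the total length within $maxChar bytes.
function truncateWithSuffix($text, $maxChar, $suffix = '...') {
    // Short text needs neither truncation nor a suffix
    if (strlen($text) <= $maxChar) {
        return $text;
    }

    // Reserve room for the suffix
    $limit = max(0, $maxChar - strlen($suffix));

    // Look one character past the limit: if that character is a space,
    // the last word ends exactly at the limit and can be kept whole.
    $window = substr($text, 0, $limit + 1);
    $position = strrpos($window, ' ');

    if ($position !== false) {
        $truncated = substr($text, 0, $position);
    } else {
        // No space at all: fall back to a hard cut at the limit
        $truncated = substr($text, 0, $limit);
    }

    return $truncated . $suffix;
}

echo truncateWithSuffix('Lorem ipsum dolor sit amet, consectetur adipiscing elit.', 24);
// Lorem ipsum dolor sit...
```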

by BrainBell