PHP

Sorting with Foreign Languages

Sorting works well as long as only standard ASCII characters are involved. However, as soon as language-specific characters come into play, the result is usually not what you want. For instance, calling sort() on an array with the values 'Frans', 'Frédéric', and 'Froni' puts 'Frédéric' last because the character é has a much larger character code than o.
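The effect can be reproduced with a few lines. This is a minimal sketch that assumes the script is saved as UTF-8; the variable name $names is chosen only for this illustration.

```php
<?php
// Assumption: this file is saved as UTF-8, so 'é' is stored as the
// two bytes 0xC3 0xA9, both of which are larger than 'o' (0x6F).
$names = array('Frédéric', 'Froni', 'Frans');

// sort() compares these strings byte by byte.
sort($names);

echo implode(' < ', $names), "\n"; // Frans < Froni < Frédéric
```

The names that contain only ASCII letters end up in the expected order, while the accented name is pushed to the end.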

Sorting an Array with Language-Specific Characters
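<?php
  // Assumes UTF-8 strings and the mbstring extension, so that each
  // accented letter is handled as one character and not as two bytes.
  function compare($a, $b) {
    if ($a === $b) {
      return 0;
    } else {
      $len_a = mb_strlen($a, 'UTF-8');
      $len_b = mb_strlen($b, 'UTF-8');
      for ($i = 0; $i < min($len_a, $len_b); $i++) {
        $cmp = compareChar(mb_substr($a, $i, 1, 'UTF-8'),
          mb_substr($b, $i, 1, 'UTF-8'));
        if ($cmp != 0) {
          return $cmp;
        }
      }
      // One string is a prefix of the other: the longer one is "greater".
      return $len_a <=> $len_b;
    }
  }
  function compareChar($a, $b) {
    // ...
  }
  $a = array('Frédéric', 'Froni', 'Frans');
  usort($a, 'compare');
  echo implode(' < ', $a);
?>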

For this special case, sort() has no built-in handling; however, you can use usort() with a custom comparison function to emulate this behavior. The idea is to define a new order for some special characters; in the comparison function, you then use this order to find out which character is "larger" and which is "smaller."

You first need a function that can compare single characters:

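function compareChar($a, $b) {
  // Custom order: each accented letter follows its base letter.
  // The set of accented letters here is illustrative; extend it as needed.
  // Assumes UTF-8 strings and the mbstring extension.
  $characters =
    'AÀÁÄBCÇDEÈÉFGHIJKLMNOÒÓÖPQRSTUÙÚÜVWXYZ';
  $characters .=
    'aàáäbcçdeèéfghijklmnoòóöpqrstuùúüvwxyz';
  $pos_a = mb_strpos($characters, $a, 0, 'UTF-8');
  $pos_b = mb_strpos($characters, $b, 0, 'UTF-8');
  if ($pos_a === false) {
    if ($pos_b === false) {
      // Neither character is in the list: treat them as equal.
      return 0;
    } else {
      // Unknown characters sort after all known ones.
      return 1;
    }
  } elseif ($pos_b === false) {
    return -1;
  } else {
    return $pos_a - $pos_b;
  }
}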

Then, the main sorting function calls compareChar() character by character until a difference is found. If no difference is found, the longer string is considered to be the "greater" one. If both strings are identical, 0 is returned. The code at the beginning of this section shows the compare() function. The result of this code is, as desired, Frans < Frédéric < Froni.
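The three cases described above (a differing character decides, the longer string wins on a shared prefix, identical strings compare as equal) can be checked with a compact, self-contained variant. This is only a sketch under stated assumptions: it expects UTF-8 input and the mbstring extension (mb_str_split() requires PHP 7.4 or later), it replaces the position search with a lookup table, and the names orderMap() and compareByOrder() are hypothetical, chosen for this example.

```php
<?php
// Sketch only: assumes UTF-8 input and the mbstring extension.
// orderMap() and compareByOrder() are hypothetical names.

// Build a lookup table: character => rank in the custom order.
function orderMap() {
  $characters = 'AÀÁÄBCÇDEÈÉFGHIJKLMNOÒÓÖPQRSTUÙÚÜVWXYZ'
    . 'aàáäbcçdeèéfghijklmnoòóöpqrstuùúüvwxyz';
  return array_flip(mb_str_split($characters, 1, 'UTF-8'));
}

function compareByOrder($a, $b) {
  static $rank = null;
  if ($rank === null) {
    $rank = orderMap();
  }
  $chars_a = mb_str_split($a, 1, 'UTF-8');
  $chars_b = mb_str_split($b, 1, 'UTF-8');
  $n = min(count($chars_a), count($chars_b));
  for ($i = 0; $i < $n; $i++) {
    // Characters missing from the table rank after all known ones.
    $rank_a = $rank[$chars_a[$i]] ?? PHP_INT_MAX;
    $rank_b = $rank[$chars_b[$i]] ?? PHP_INT_MAX;
    if ($rank_a !== $rank_b) {
      return $rank_a <=> $rank_b;
    }
  }
  // Shared prefix: the longer string is "greater"; equal length means 0.
  return count($chars_a) <=> count($chars_b);
}

$names = array('Frédéric', 'Froni', 'Frans', 'Fran');
usort($names, 'compareByOrder');
echo implode(' < ', $names), "\n"; // Fran < Frans < Frédéric < Froni
```

Splitting each string once and looking up ranks in an array avoids searching the character list on every comparison, which may matter for larger arrays.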

by BrainBell