PHP

Security and User Data

This section introduces simple techniques that preprocess user data to solve many common security holes in web database applications.

Using the techniques described here doesn't completely secure a system. Remember that securing a web database application is important, and that the advice offered here isn't a complete solution. A discussion of other security issues is presented in Chapter 9.

Data that is passed from a web browser to a web server should be secured using the steps described here. For this purpose, we have authored the clean( ) function to ensure that the data passed to a script is of the correct length and that special characters aren't misused to attack the system. To understand why the clean( ) function is needed, we describe an example attack later in this section. The function is part the include file db.inc that is used in all scripts in the online winestore.

Consider the following script. It uses the PHP exec( ) library function to run a program on the web server. The exec( ) function takes two parameters, the program to run and an array populated with any output of the program. In this example, the script uses exec( ) to run the Unix cal program and to pass the user-entered parameter $userString to the program. The information in the parameter userString can be provided by using an HTML <form> with a text input widget, by manually creating a URL, or by embedding a link in an HTML document.

<?php
  // DO NOT INSTALL THIS SCRIPT ON A WEB SERVER
  // Run "cal" with the parameter $userString
  // Store the results in the array $result
  exec("/usr/bin/cal $userString", $result);
?>
<!DOCTYPE HTML PUBLIC
                  "-//W3C//DTD HTML 4.0 Transitional//EN"
                  "http://www.w3.org/TR/html4/loose.dtd">
<html>
  <head>
    <title>Calendar</title>
  </head>
<body>
<pre>
<?php
   // Print out each line of the calendar
   foreach($result as $element)
      echo  "$element\n";
?>
</pre>
</body>
</html>

Never use exec( ) or other commands to run programs from a web script or to query a database without securing the user data. Do not install the calendar example on a web server.

The Unix cal program is a useful utility that produces monthly or yearly calendars for any date. For example, to produce a calendar for the whole of 2003, a user could request the URL:

http://localhost/cal.php?userString=2003

This runs the command /usr/bin/cal 2003 and outputs the complete 2003 calendar, as shown in Figure 5-4.

Figure 5-4. Output of the dangerous calendar example when the user requests a 2003 calendar
figs/wda_0504.gif

To produce a calendar for February 2003, the user requests:

http://localhost/cal.php?userString=2+2003

Requesting the URL without any parameters produces the calendar for the current month:

http://localhost/cal.php

While this script might seem useful and innocuous, this script is a major security hole and should never be installed on a web server.

To illustrate why the script should never be installed, consider how it can be misused. If a user wants to enter two or more commands on a single line in a Unix shell, he can do so by separating the commands with a semicolon character. For example, to see who is logged in and then to list the files in the current directory, a user can type the following commands at the shell:

% who ; ls

Now, consider what happens if he exploits this feature by requesting the following URL:

http://localhost/cal.php?userString=2001;cat+/etc/passwd

The script produces a 2001 calendar, followed by the system password file, as shown in Figure 5-5! The script allows a creative user to do things the web server process can do. The identity of the owner of the web server process affects the severity of the actions that can be performed, but this is at best a major security hole.

Figure 5-5. Output when the user requests a 2001 calendar and the system password file
figs/wda_0505.gif

Semicolons, colons, greater-than and less-than signs, and other special characters can cause a script or a query to provide undesirable functions, especially if the script executes the library functions system( ) or exec( ) to run server commands. Even if a <form> makes it difficult for a user to enter undesirable data, he can manually create his own request by entering a URL and authoring a query string.

Never trust anything you don't have control of, that is, anything not in middle or database tiers.

SQL querying also has problems. For example, a user can guess the structure of database tables and how a query is formed from user input. A user might guess that a query uses an AND clause and that a particular <form> text widget provides one of the values to the query. The user might then add additional AND and OR clauses to the query by entering a partial SQL query in the text widget. While such tricks may expose data that should remain hidden from the user, problems compound if the user inserts or deletes data with the techniques discussed in Chapter 6. However, many problems can be solved with careful server-side validation, as discussed in Chapter 7 and the approach described next.

To improve security and prevent special-character attacks, user data should be processed with the clean( ) function:

function clean($input, $maxlength)
{
  $input = substr($input, 0, $maxlength);
  $input = EscapeShellCmd($input);
  return ($input);
}

The first line uses the substr( ) function to reduce the variable $input to a maximum length of $maxlength by taking a substring beginning at the first character. You can use 30 as a maximum $regionName length for Example 5-1, and the calendar example might use a maximum length of 7. The second line calls the library function EscapeShellCmd( ), which escapes any special-purpose characters-such as semicolons, colons, greater-than and less-than signs, and so on-by replacing the character with a single backslash and then the character.

For many purposes, the clean steps are sufficient to ensure data is safe. As an example, if the parameter userString has a value of:

2001;cat /etc/passwd

then a call of:

clean($userString, 7)

produces the harmless string 2001\;ca.

This string has no detrimental effect and provides the user with no hidden data. clean( ) is used to preprocess all user data for the winestore.

User data that has not been preprocessed or cleaned is often known as tainted data, a term originating from the Perl scripting language. Rectifying this through the processing we have described untaints user data.