CGI and Perl

The Response

Now that your CGI script is running, a response must immediately be sent back to the browser. The response follows a specific format, called an HTTP Response Header, which is discussed in the sections that follow.

HTTP Response Headers

Note:

An HTTP Response Header is sent by the server in response to any HTTP request, not just requests for CGI scripts. If the client requests an HTML page, a response header is generated by the server and is sent to the browser before the HTML page. The same is true of requests for other objects, such as GIF and JPEG images.


Here is an example of an HTTP response header:

HTTP/1.0 200 OK
 Server: Netscape-Communications/1.12
 Date: Monday, 21-Oct-96 01:42:01 GMT
 Last-modified: Thursday, 10-Oct-96 17:17:20 GMT
 Content-length: 1048
 Content-type: text/html
 CrLf

The contents (or body) of the message follow the last header line (here, Content-type) and a blank line, that is, a Carriage-Return Line-Feed (CrLf) sequence.

Parsed Header Output

There are two ways the response can be handled. By default, the Web server reads and parses the output (STDOUT) of your CGI script before sending the output on to the browser. The Web server then automatically generates all necessary HTTP response headers. This is called parsed header output. The output of your CGI script is actually appended to the response headers generated by the Web server. If your script produces no output (HTML, text, or whatever), it is expected to output either a Location or Status header (discussed below in the "Status Response Header" and "Location Response Header" sections).

In summary, the Web server sends to the client, in this order: a response header, a blank line (CrLf), then any output from your CGI script. This presents some problems:

  • All output from your CGI script must be processed by the Web server, which must then generate response headers and append the output of your script to them. This is very inefficient and significantly slows down your program.
  • The Web server is slowed down because of the additional processing that must be done prior to sending the output to the client. The Web server is an unnecessary middleman.
  • The Web server still parses and processes the output of your script, even if you want your script to pass headers directly to the client.
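As a minimal sketch of what a parsed-header script hands to the Web server, the following fragment builds the script's output: a single Content-type header, a blank line, then the body. It is written in Python only to make the byte layout explicit; the function name build_parsed_output is hypothetical, and the sketch assumes that the server supplies the status line and the remaining headers itself.

```python
import sys

def build_parsed_output(body, content_type="text/html"):
    """Return the text a parsed-header CGI script writes to STDOUT."""
    # One header line, then the blank line (CrLf) that ends the headers.
    # No status line appears here: in parsed-header mode the server
    # is assumed to generate it.
    return "Content-type: " + content_type + "\r\n\r\n" + body

if __name__ == "__main__":
    output = build_parsed_output("<HTML><BODY>Hello</BODY></HTML>\n")
    # Write bytes so that the CrLf sequences reach the server unchanged.
    sys.stdout.buffer.write(output.encode("latin-1"))
```

Everything after the first blank line is treated as the body, so the script must not print anything before the header line.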

Non-Parsed Header Output

Fortunately, there is an easy way around this problem. If you begin the name of your script with nph-, the Web server will not add any headers or interfere in any way with the output of your script. So, if you were working on a script called mycgi.pl, just call it nph-mycgi.pl, and the Web server won't interfere with its headers. This is called a non-parsed-header script, or NPH script. The output of an NPH script is not parsed and is sent directly to the client. This is highly recommended. On most popular Macintosh-based Web servers, all CGI scripts are treated as NPH scripts. Figure 5.2 depicts an HTTP transaction with an NPH CGI script.

Figure 5.2. The HTTP transaction with an NPH CGI.

Note:

This feature must be enabled explicitly in the config files for some servers.


The CGI must respond with a valid and appropriate response header that tells the browser what to expect or do next. If you use NPH scripts, you must be sure to output the proper headers; otherwise, the client will see a 500 error message.
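Because an NPH script is responsible for the whole response, it has to emit the status line itself. The following Python sketch shows one way the complete byte stream might be assembled; the helper name build_nph_response and its parameters are hypothetical, and only the Content-type and Content-length headers are included for brevity.

```python
def build_nph_response(body, code=200, message="OK", content_type="text/html"):
    """Return a complete HTTP/1.0 response as an NPH script might emit it."""
    payload = body.encode("latin-1")
    lines = [
        "HTTP/1.0 %d %s" % (code, message),   # status line always comes first
        "Content-type: " + content_type,
        "Content-length: %d" % len(payload),  # size of the body in bytes
    ]
    # Header lines are separated by CrLf; an extra CrLf (a blank line)
    # marks the end of the headers and the start of the body.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("latin-1") + payload
```

If the status line or the blank line is missing, the browser cannot interpret the output as a valid response.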

There are three basic response headers: Status, Content-Type, and Location. All responses must contain either a Status header or a Location header that redirects the browser to another URI. If the script will be outputting HTML, text, or some other MIME-compliant data object, the Status header must be followed by a Content-Type header that specifies the MIME type of the object that follows. Other RFC 822-compliant headers, such as Date and Server, can also be specified in the response header.

Status Response Header

The Status response header (or status line) indicates which version of HTTP the server is running, together with a result code and an optional message. It should not be sent if the script returns a Location header.

The first line of the response from the server typically looks something like this:

HTTP/1.0 200 OK

where HTTP/1.0 is the version of HTTP the server is running, 200 is the result code, and OK is the message that describes the result code. There are many other result codes besides 200 that can be sent. The following Status response headers can be sent to the browser by your CGI in NPH mode.

Success 2xx

Result codes in the 200s indicate success. The body section, if present, is the object returned by the request. The body must be in MIME format and may be only in text/plain, text/html, or one of the formats specified as acceptable in the Accept header of the request (available to a CGI script in the HTTP_ACCEPT environment variable).

200 OK: The request was successful.
201 Created: Following a POST command, indicates success, but the textual part of the response indicates the URI by which the newly created document should be known.
202 Accepted: The request has been accepted for processing, but the processing has not been completed.
203 Partial Information: When received in the response to a GET command, this indicates that the returned meta-information is not a definitive set of the object from a server with a copy of the object but is from a private overlaid Web. This may include annotation information about the object, for example.
204 No Response: The server has received the request, but there is no information to send back, and the client should stay in the same document view. This is mainly to allow input for scripts without changing the document at the same time.

Redirection 3xx

Result codes in the 300s indicate the action to be taken (normally automatically) by the client in order to fulfill the request.

301 Moved: The data requested has been assigned a new URI; the change is permanent. The response contains a header line of the form URI: <url> CrLf, which specifies an alternative address for the object in question.
302 Temporarily Moved: The data requested actually resides under a different URL; however, this is not permanent. The response format is the same as for Moved.
303 Method: Like the Temporarily Moved (302) response, this suggests that the client try another network address. In this case, a different method may be used, too, rather than GET. The response contains a header line of the form Method: <method> <url> CrLf, followed by a body section that contains the parameters to be used for the method. This allows a document to be a pointer to a complex query operation.
304 Not Modified: If the client performs a conditional GET request, but the document has not been modified since the date given in the If-Modified-Since header, this response is generated.

Client Error 4xx

The 4xx codes are intended for cases in which the client seems to have erred. The body section may contain a document describing the error in human-readable form. The document is in MIME format and may only be in text/plain, text/html, or one of the formats specified as acceptable in the request.

400 Bad Request: The request had bad syntax or was impossible to satisfy.
401 Unauthorized: The parameter to this message gives a specification of authorization schemes that are acceptable. The client should retry the request with a suitable Authorization header.
402 Payment Required: The parameter to this message gives a specification of acceptable charging schemes. This code is not currently supported.
403 Forbidden: The request is for something forbidden. Authorization will not help.
404 Not Found: The server has not found anything matching the URI given.

Server Error 5xx

The 5xx codes are for the cases in which the server has erred, or all indications point to the server as being the cause of the error. Like the 4xx codes, the body section may contain a document describing the error in human-readable form.

500 Internal Error: The server encountered an unexpected condition that prevented it from fulfilling the request.
501 Not Implemented: The server does not support the ability to satisfy this request.
502 Bad Gateway: The server received an invalid response from the gateway or an upstream server.
503 Service Unavailable: The server cannot process the request due to a high load or maintenance of the server.
504 Gateway Timeout: The response did not return within a time that the gateway was prepared to wait. This is similar to Internal Error (500) but has more diagnostic value.
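
The structure of the status line and the grouping of result codes by their first digit can be illustrated with a short Python sketch. The function names parse_status_line and status_class are hypothetical, and the class labels simply mirror the four groups listed above.

```python
def parse_status_line(line):
    """Split a line such as 'HTTP/1.0 200 OK' into (version, code, message)."""
    parts = line.strip().split(" ", 2)
    version = parts[0]
    code = int(parts[1])
    # The message is optional, so it may be absent.
    message = parts[2] if len(parts) > 2 else ""
    return version, code, message

# The first digit of the result code selects the class of response.
STATUS_CLASSES = {
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}

def status_class(code):
    """Return the class name for a result code, or 'Unknown'."""
    return STATUS_CLASSES.get(code // 100, "Unknown")
```

A client can act on the class alone, for example by treating any 3xx code as an instruction to look elsewhere, even when it does not recognize the specific code.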

In the next section, we will look at how the header() method in CGI.pm can be used in your script to set the Status response header.

Content-Type Response Header

The most common response headers are Content-Type, which contains the MIME type of the object being returned (usually "text/html"), and Content-Length, which indicates the size of the object in bytes.

Location Response Header

If you do not want to return a file but instead want your script to send the browser to another URI, the Location response header can be used.
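
A redirect needs no body at all: the script outputs the Location header and the terminating blank line, and no Status header. The Python sketch below shows that output; the function name build_redirect and the example address are hypothetical.

```python
def build_redirect(uri):
    """Return CGI output that sends the browser to another URI."""
    # A line break inside the URI would let stray text become extra
    # header lines, so reject it.
    if "\r" in uri or "\n" in uri:
        raise ValueError("URI must not contain line breaks")
    # Location header, then the blank line that ends the headers.
    return "Location: " + uri + "\r\n\r\n"

# Example (hypothetical address):
# build_redirect("http://www.example.com/new.html")
```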

The headers are terminated by an empty line and followed by the "body" of the message. The "body" refers to the text, HTML, or other MIME-compliant message in the object.

So, bringing all this together, here is the syntax of a basic HTTP response header:

"HTTP/1.0" result-code message CrLf
 Header: Value CrLf
 Header: Value CrLf
 CrLf
 BODY
 Connection closed by foreign host.

Here is a real-world example of the complete response generated by a Netscape Communications Server. This response was generated after a simple GET request was sent for an HTML file. A similar response could be generated by your CGI script using the header() method in CGI.pm.

HTTP/1.0 200 OK
 Server: Netscape-Communications/1.12
 Date: Monday, 21-Oct-96 01:42:01 GMT
 Last-modified: Thursday, 10-Oct-96 17:17:20 GMT
 Content-length: 1048
 Content-type: text/html

 <HTML>
 <HEAD>
 <TITLE>Welcome to My Web Page</TITLE>
 </HEAD>
 <BODY>
 Here is my web page.  <b>Do you like it?</b>
 </BODY>
 </HTML>
 Connection closed by foreign host.

Notice that after the response headers are sent, a blank line (CrLf) is sent, which is followed by the body of the response. The body is the Web page, or the data your script outputs. After the data has been sent (the body), the server drops the connection.
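The role of the blank line can be seen from the receiving side. This Python sketch splits a raw response at the first blank line into the status line, the headers, and the body; the function name split_response is hypothetical, and the sketch assumes that every header fits on a single line.

```python
def split_response(raw):
    """Split raw response bytes into (status_line, headers, body)."""
    # The first blank line (CrLf CrLf) separates the headers from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("latin-1").split("\r\n")
    status_line = lines[0]
    headers = {}
    for line in lines[1:]:
        # Split at the first colon only, since values such as dates
        # contain colons of their own.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_line, headers, body
```

Header names are stored in lowercase here because servers differ in how they capitalize them (Content-type versus Content-Type, for example).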

When generating your own response headers for NPH scripts (highly recommended), you can easily set their contents using CGI.pm or the HTTP::Response class. In the next section, I will show you how either one can be used to generate response headers.