CGI and Perl

New Features in Perl5

You can use a number of new features and enhancements with Perl5. Many of them are used in the modules demonstrated in this book. In the following sections, I provide a short overview of most of them.

Usability and Simplicity

Some major improvements have been made to Perl in terms of the layman's ability to use and understand it. While it has always been a tool for the common man, the latest release has seen major work toward making it even easier to use and understand. Let's see how this was accomplished.

Enhanced Documentation

Probably the most significant improvement to the Perl distribution, outside of Perl itself, is the documentation. The single monolithic manual page has been split into logical sections corresponding to the various aspects of Perl programming, along with sections on more advanced topics, such as embedding the Perl interpreter in an external program.

A Simple Convention

As you read through this and other chapters, you'll notice the capitalized references to the various sections of the new manual, of the form PERLBLAH, where the BLAH corresponds to the section of the new Perl manual that is being referred to. Such references are there to help you find your way to the related sections of the Perl manual, regarding the current subject matter.

The Perl manual now has a total of 32 standard sections, each with a specific intent. Table 2.1 lists them all.

Table 2.1. Standard Perl manual sections.

Section     Description
PERL        Perl overview
PERLTOC     Perl documentation table of contents
PERLDATA    Perl data structures
PERLSYN     Perl syntax
PERLOP      Perl operators and precedence
PERLRE      Perl regular expressions
PERLRUN     Perl execution and options
PERLFUNC    Perl built-in functions
PERLVAR     Perl predefined variables
PERLSUB     Perl subroutines
PERLMOD     Perl modules
PERLFORM    Formats, and using write()
PERLREF     Perl references
PERLDSC     Perl data structures intro
PERLLOL     Perl data structures: lists of lists
PERLOBJ     Perl objects
PERLTIE     Perl objects hidden behind simple variables
PERLBOT     Perl OO tricks and examples
PERLIPC     Perl interprocess communication
PERLDEBUG   Perl debugging
PERLDIAG    Perl diagnostic messages
PERLSEC     Perl security
PERLTRAP    Perl traps for the unwary
PERLSTYLE   Perl style guide
PERLXS      Perl XS application programming interface
PERLXSTUT   Perl XS tutorial
PERLGUTS    Perl internal functions for creating extensions
PERLCALL    Perl calling conventions from C
PERLEMBED   Perl: how to embed Perl in your C or C++ application
PERLPOD     Perl plain old documentation
PERLAPIO    Perl internal IO abstraction interface
PERLBOOK    Perl book information


The documentation also contains numerous additional sections corresponding to the standard modules that ship with Perl. Most or all of these additional sections are extracted from the POD embedded in the module file itself. All Perl documentation is written first in POD and then translated to other formats.

POD

Plain Old Documentation. ASCII text documentation with markers corresponding to the various formatting elements. Can be embedded directly into Perl modules. See PERLPOD.

You can easily transform POD into standard UNIX *ROFF format, HTML, and a number of other formats by using the pod2* converters (pod2man, pod2html, pod2text, and so on). Because POD is plain ASCII text, you can also read the documentation directly, without any post-formatting at all. Everything that I cover in this chapter is also documented in the Perl PODs, and I give references to the specific sections as I go along, using the PERLBLAH notation mentioned previously. POD can be embedded directly into a Perl module or program, and nearly all of the modules already contain it.
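As a minimal sketch of what embedded POD looks like, the following hypothetical script (the file name greet.pl and the subroutine greeting are invented for illustration) mixes documentation and code in one file. Perl skips everything from a line that begins with a POD command such as =head1 up to the =cut line, while the pod2* converters read only those parts.

```perl
#!/usr/bin/perl -w
# greet.pl: a hypothetical script used only to illustrate embedded POD.
use strict;

=head1 NAME

greet.pl - print a greeting

=head1 SYNOPSIS

    perl greet.pl

=head1 DESCRIPTION

Prints a one-line greeting. The perl interpreter ignores this
documentation; the POD converters ignore the code.

=cut

# Build the greeting for a given name (hypothetical helper).
sub greeting {
    my ($name) = @_;
    return "Hello, $name!";
}

print greeting('world'), "\n";
```

Running a converter such as pod2text on a file like this should produce just the NAME, SYNOPSIS, and DESCRIPTION sections, while running the file through perl executes only the code.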

When Perl is installed on a typical UNIX site, the POD documentation, including POD from the modules, is converted automatically into standard UNIX manpages. The administrator usually installs them in a subdirectory called man inside the primary @INC directory. Macintosh Perl installations have the POD, converted to HTML format, in a folder named pod beneath the folder that contains the Perl application. The Windows (ntperl) installation also has the documentation, converted to HTML, in a directory called docs. You should find this directory and be ready to refer to the documentation within it.

As I mentioned previously, conversion tools are available for POD, including pod2man, pod2html, pod2text, pod2rtf, pod2tex, pod2inf, and now even pod2ps. You can use any of them to convert from POD format to your preferred format, provided that the tool has been written to work on your architecture. All these tools work on UNIX, and some are configurable to work on other platforms.

A Note on Compatibility

There are, as you might guess, still a few incompatibilities when attempting to produce cross-platform Perl code, and not all scripts run on all architectures by default. File paths, for instance, shouldn't be hardcoded in Perl programs, but often are. This issue in general is being considered and worked on by the Perl Porters, a large group of brilliant people who help to bring you Perl.
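One way to avoid hardcoded file paths, assuming a Perl recent enough to include the standard File::Spec module (it is not covered in this chapter), is to let the module join path components using the conventions of the current platform. The directory and file names below are placeholders.

```perl
use strict;
use File::Spec;

# Join path components using the conventions of the current platform
# instead of hardcoding a separator such as "/" in the script.
my $path = File::Spec->catfile('docs', 'perl', 'index.html');

# Ask for the platform's temporary directory rather than assuming "/tmp".
my $tmp = File::Spec->tmpdir();

print "document: $path\n";
print "temp dir: $tmp\n";
```

The same script should then produce a suitable path on each platform without any changes to the source.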

Remember to check your favorite CPAN to get the latest versions of the pod2* programs.

CPAN

Comprehensive Perl Archive Network. A large group of well-connected Internet sites that maintains a copy of the master Perl archive. You can find more details later in this chapter and a complete history and description of the CPAN in Chapter 1.

Readability Improvements

The ability to provide easily readable and reusable code has become more important as the level of formal training and skills required to start a Web site has decreased. The responsibility is largely up to the script author to implement this readability. Perl5 also provides some new functionality that enhances the capability of the script author to do so.

The English module increases the readability and understanding of Perl code, and it is a big step toward alleviating the boggling effect that raw Perl code sometimes has on new programmers. The English module maps each of Perl's eclectic punctuation (special) variables to a corresponding English name. The regular-expression variables that correspond to the three components of a matched string, for example, are often difficult to remember, even for the experienced Perl programmer. The English module maps these variables as follows:

*MATCH = *& ;
*PREMATCH = *` ;
*POSTMATCH = *' ;
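As a small sketch of these aliases in use (the sample string and pattern are made up for illustration), the following program matches a word and prints the three components of the match by their English names:

```perl
use strict;
use English;

my $text = 'CGI and Perl';

if ($text =~ /and/) {
    # $PREMATCH, $MATCH and $POSTMATCH are aliases for $`, $& and $'.
    print "before: [$PREMATCH]\n";   # text to the left of the match
    print "match:  [$MATCH]\n";      # the matched text itself
    print "after:  [$POSTMATCH]\n";  # text to the right of the match
}
```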

Thus, when you use the English module in your program, you can use $MATCH, $PREMATCH, and $POSTMATCH instead of $&, $`, and $', and you no longer have to chase through the manual to verify whether an ampersand, backtick, or single quotation mark follows the $ each time you want to access these built-in variables. See the complete English.pm module in @INC, and its embedded POD documentation, or English.3, the POD converted to a manpage, for more details.

New Logical Operators

The new logical operators and, or, and not enable you to avoid using the &&, ||, and unary ! operators, respectively. The new operators are definitely more readable, to the casual observer at least. They also have lower precedence than the comma and the assignment operators, which changes how some expressions are grouped. Consider the following:

$foo/=0 || print "aak";
 # prints: aak

Using ||, it looks like you can divide by 0. (Actually, because || binds more tightly than /=, you're dividing by the ||'d value of zero and one, the return value from print.) Now consider the same example, using the or operator:

$foo/=0 or print "aak";
 # prints: Illegal division by zero

Using or (and Perl5) gives the expected result. See PERLOP for more details on operator precedence.

Warnings and Stricture

Other usability enhancements in Perl5 include improvements to the -w command-line switch, which now gives more useful and informative output. You should use it with all your programs. See PERLRUN for a complete description of all command-line switches.
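To give a feel for the kind of problem -w reports, here is a minimal sketch. It sets the special variable $^W, which reflects the -w switch from inside the program, so that the example behaves the same however it is started. The variable names and the handler that collects the messages are illustrative only.

```perl
# Turn warnings on from inside the program; $^W mirrors the -w switch.
BEGIN { $^W = 1; }

# Collect warnings in an array instead of letting them go to STDERR.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

my $count;                 # declared but never given a value
my $total = $count + 1;    # warnings report the use of an uninitialized value

print "total: $total\n";
print "warning: $_" for @warnings;
```

Without warnings enabled, the undefined $count is silently treated as 0 and the mistake goes unnoticed.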

Also new is a set of pragmatic modules, which impose certain restrictions and perform extended type and syntax checking on your code at compile time, to potentially help you find bugs before they bite. Use of these pragmas within your code is also highly recommended. See PERLDSC and PERLMOD for more details on the motivation for and impact of using the strict modules.

The New => Operator

The => operator is syntactic sugar for a comma. It makes certain declarations look prettier and appear more sensible, as in the following example:

%hash = (
    'Name' => 'Joe',
    'Address' => '123 Foo Street',
    'City' => 'San Francisco',
    'State' => 'CA'
);

Note how we used the comma at the end of each hash key/value pair, but the => operator between each key and its value. This makes such declarations easier to read, especially when declaring more complex data structures.

Function Prototypes

Function prototypes are one of the newest features in Perl5. They were finally included as of Perl5.002, after a great deal of consideration and discussion. Essentially, they give you a means to declare the number and kinds of arguments your subroutines expect, so that Perl can check calls at compile time and your subroutines can emulate the behavior of built-in commands. See PERLSUB for more details.
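A minimal sketch of a prototype at work, assuming Perl5.002 or later: the hypothetical subroutine mypush below mimics the built-in push. The \@ in its prototype tells Perl to pass the first argument as a reference to the array, so the caller can write the call exactly as for push.

```perl
use strict;

# Hypothetical subroutine that mimics the built-in push. The prototype
# (\@@) means: first an array, passed by reference, then a list of values.
sub mypush (\@@) {
    my ($array_ref, @values) = @_;
    push @$array_ref, @values;
    return scalar @$array_ref;   # new number of elements, as push returns
}

my @list = (1, 2);
my $size = mypush @list, 3, 4;   # no parentheses or backslash needed

print "list: @list\n";   # list: 1 2 3 4
print "size: $size\n";   # size: 4
```

Because the subroutine is declared before it is called, Perl knows the prototype when it compiles the call. A call whose first argument is not an array, such as mypush 1, 2, should be rejected at compile time.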