C Language

Practical Pointers

If William Tell had used a C compiler for a bow and C pointers for arrows, he'd have skewered Bill, Jr. - not an apple - to the tree. Although C pointers are simply memory addresses, they're notorious for contributing to tricky programs and hard-to-spot program bugs.

C pointers are easier to use than assembler address operands, but C's low-level and unprotected exposure of addresses creates pitfalls for business application programs.

Perhaps the most common mistake when working with pointers is forgetting the * (the dereferencing or "contents of") operator. The following code shows a fairly harmless but common example of this mistake:

 int x = 25;
 int *y;

 y = &x;

 printf( "y is %d\n", y );

What is printed is the address stored in y (or whatever the implementation makes of an address passed where %d expects an int), not the value (i.e., 25) stored at that address. The correct printf statement is:

 printf( "y is %d\n", *y );

Not all * omissions are so harmless and easy to detect. Consider the fees function in Figure 6.1. This function uses age and income to calculate registration and activity fees, which the fees function returns via two pointers passed by the caller.

Figure 6.1 Typical Function Returning Two Values

 void fees(       int * rfee,
                  int * afee,
            const int   age,
            const int   income ) {
   /*
   | Calculate fees as base plus adjustment based on age
   | and income
   */

   * rfee = 100;
   * afee =  50;

   rfee += ( age >= 60 ) ? ( income < 50000 ?   0 :  50 ) :
                 ( income < 50000 ? 100 : 200 );

   afee += ( age >= 60 ) ? ( income < 50000 ?  10 :  20 ) :
                 ( income < 50000 ?  30 :  40 );
 }

The fees function will compile and execute without raising an exception. But every call to fees will produce the same $100 registration fee and $50 activity fee, regardless of age or income. In this example, the third and fourth assignment statements increment rfee and afee themselves, which are addresses (pointers), not the integer values stored at these two addresses. The assignment statements' targets should be *rfee and *afee. The compiler, however, can't tell the original version is wrong because addition operations are legal on both pointer and integer variables.

C's lack of "output" parameters forces C programmers to explicitly handle addresses and dereferencing (i.e., referencing the storage pointed to by a pointer) to return more than one value from a function. Combined with C's overloading of arithmetic operators for both integer and pointer arithmetic, dereferencing can easily trip you up. A good high-level language (HLL) should support output parameters so you don't need pointers and dereferencing to return multiple procedure values. (The C development community recognizes this C deficiency and has added references, which can be used for return parameters, to C++. But no such facility is planned for C itself.)

HLLs suitable for business programming also should either prohibit direct address modification (i.e., pointer arithmetic) or provide distinct functions for modifying addresses so such operations stand out in the code rather than appear as ordinary arithmetic operations. As I've emphasized in previous chapters, C was designed as a portable assembly language, and when you're programming at the machine level, it's logical to treat addresses as integers. At the business application level, however, machine addresses shouldn't be visible, much less easily confused with ordinary numbers.

You won't find a foolproof way to use dereferenced pointer parameters. If you try to code operands such as *rfee and *afee throughout a function, you'll eventually slip up and omit the *. Finding the mistake may not be easy. But a simple coding practice will lead you around the pitfall: For non-array "output" or "input/output" parameters, use local variables instead of dereferenced parameters in function calculations.

Figure 6.2 shows the fees function rewritten to use two local variables in the calculations. The function's last two statements assign the calculated values to the locations pointed to by the pointer parameters. This technique isolates and simplifies dereferencing and can significantly reduce errors. Figure 6.3 shows how to handle in/out parameters by initializing the local variables to the dereferenced parameters.

Figure 6.2 Using Local Variables Instead of Dereferenced Parameters

 void fees(       int * rfee,
                  int * afee,
            const int   age,
            const int   income ) {

   /*
   | Calculate fees as base plus adjustment based on age
   | and income
   */

   int reg_fee = 100;
   int act_fee =  50;

   reg_fee += ( age >= 60 ) ? ( income < 50000 ?  0 : 50 ) :
                    ( income < 50000 ? 100 : 200 );

   act_fee += ( age >= 60 ) ? ( income < 50000 ? 10 : 20 ) :
                    ( income < 50000 ?  30 :  40 );

   /*
   | Return values
   */

   * rfee = reg_fee;
   * afee = act_fee;
 }

Figure 6.3 Using Local Variables with In/Out Parameters

 void fees(       int * rfee,
                  int * afee,
            const int   age,
            const int   income ) {
    /*
    | Adjust fees based on age and income
    */

    int reg_fee = * rfee;
    int act_fee = * afee;

    reg_fee += ( age >= 60 ) ? ( income < 50000 ?  0 : 50 ):
                     ( income < 50000 ? 100 : 200 );

    act_fee += ( age >= 60 ) ? ( income < 50000 ? 10 : 20 ):
                     ( income < 50000 ?  30 :  40 );

    /*
    | Return values
    */

    * rfee = reg_fee;
    * afee = act_fee;
 }

A companion to the previous rule is: Use array notation instead of pointers and dereferencing when working with arrays. C's array notation is really just shorthand for pointer operations, and C lets you use either in most contexts. For example, if a is declared as an array, *(a+i), a[i], and i[a] mean exactly the same thing.

But when using an array variable, you should stick with array notation such as a[i] to keep your code's meaning obvious. An added benefit in using such notation is that, in some contexts, the C compiler can catch mistakes in expressions using array names that it can't catch with pointers (e.g., C lets you change an address in a pointer variable, but you can't change the address referred to by an array name). And before you let some "old hand at C" convince you that direct manipulation of pointers is "so much faster" than subscripting arrays, read "Pulling a 'Fast' One," page XX. In business applications and most utility software, you can freely use array subscripts without performance concerns.

I've read the viewpoint that since C array notation is really just shorthand for pointer operations, you should use pointer notation because it more "honestly" shows what's going on. If you're trying to dissuade someone from using C, this argument has merit. C pointer and dereferencing notation certainly looks stranger than array notation to most programmers and warns newcomers that C isn't your ordinary HLL. But in the long run, array notation expresses high-level data constructs much better than pointer notation.

by BrainBell