Dynamic memory allocation and pointers

 2-16 DYNAMIC MEMORY ALLOCATION AND POINTERS
 *******************************************
 (Thanks to Sergio Gelato who contributed most of this chapter)

 In standard FORTRAN 77, the sizes of all objects must be known at
 compile time (this does not apply to the sizes of formal arguments 
 to subprograms, only to those of the actual arguments).

 This is inconvenient for many applications, as it means that the 
 program may have to be recompiled before it can be run with a 
 different problem size. Many operating environments let the user 
 allocate contiguous blocks of memory whose sizes are determined 
 at run time. Many implementations of FORTRAN 77 support an extension 
 ("Cray pointers") that allows programs to make use of this facility. 

 Cray pointers also allow the construction of linked data structures 
 such as lists, trees, queues, ... More recently, the Fortran 90 
 standard introduced roughly equivalent functionality in the form 
 of ALLOCATABLE arrays on the one hand, and of the POINTER attribute 
 on the other.


 Declaring Cray pointers 
 -----------------------
 Cray pointers are just variables (usually of type INTEGER, but it 
 is best not to declare their type explicitly) that hold the address 
 of another variable called the pointee.

 Example of pointer declaration:

      INTEGER    N
      PARAMETER  (N = 10)
      REAL       POINTEE(N)
      POINTER    (PTR, POINTEE)

 Note that these statements together declare only ONE entity: a pointer 
 called PTR, whose data type is 'pointer to a 1D array of REAL with
 dimension N'. Spreading the declaration over several statements is a 
 little awkward, but obeys FORTRAN syntax.

 The array that PTR points to can be accessed by the POINTEE identifier, 
 but POINTEE is not an array by itself as it has (until PTR is initialized) 
 no memory storage associated with it.

 The above example is confusing because of a common misunderstanding 
 about FORTRAN type-statements (e.g. REAL X,Y). Such statements ARE NOT
 like the declarations of PASCAL and C: they don't reserve memory 
 storage, but just supply data-type information to the compiler.

 Put another way, with Cray pointers the pointer and the entity it 
 points to are declared together and have different identifiers 
 associated with them; there is no separate indirection (dereferencing) 
 operator.


 An example program with Cray pointers:


      PROGRAM PNTR
C     ------------------------------------------------------------------
      INTEGER	
     *			I
C     ------------------------------------------------------------------
      REAL		
     *			ARRAY1(10),   ARRAY2(5), 
     *			POINTEE1(10), POINTEE2(5), POINTEE3(*)
C     ------------------------------------------------------------------
      POINTER 
     *			(PTR1, POINTEE1),
     *			(PTR2, POINTEE2),
     *			(PTR3, POINTEE3)
C     ------------------------------------------------------------------
      DATA 
     *			ARRAY1   /0,1,2,3,4,5,6,7,8,9/,
     *			ARRAY2   /5,5,5,5,5/
C     ------------------------------------------------------------------
      WRITE(*,*) 
      WRITE(*,'(1X,A,10F6.1)') 'array1=   ', ARRAY1
      WRITE(*,'(1X,A,10F6.1)') 'array2=   ', ARRAY2
C     ------------------------------------------------------------------
      WRITE(*,*) 
      PTR1 = LOC(ARRAY1)
      PTR2 = LOC(ARRAY1)
      PTR3 = LOC(ARRAY1)
      WRITE(*,'(1X,A,10F6.1)') 'pointee1= ', POINTEE1
      WRITE(*,'(1X,A,10F6.1)') 'pointee2= ', POINTEE2
      WRITE(*,'(1X,A,10F6.1)') 'pointee3= ', (POINTEE3(I), I = 1, 10)
C     ------------------------------------------------------------------
      WRITE(*,*) 
      PTR1 = LOC(ARRAY2)
      PTR2 = LOC(ARRAY2)
      PTR3 = LOC(ARRAY2)
      WRITE(*,'(1X,A,10F6.1)') 'pointee1= ', POINTEE1
      WRITE(*,'(1X,A,10F6.1)') 'pointee2= ', POINTEE2
      WRITE(*,'(1X,A,10F6.1)') 'pointee3= ', (POINTEE3(I), I = 1, 5)
C     ------------------------------------------------------------------
      END


 The result of this program on a VMS machine was:

 array1=      0.0   1.0   2.0   3.0   4.0   5.0   6.0   7.0   8.0   9.0
 array2=      5.0   5.0   5.0   5.0   5.0

 pointee1=    0.0   1.0   2.0   3.0   4.0   5.0   6.0   7.0   8.0   9.0
 pointee2=    0.0   1.0   2.0   3.0   4.0
 pointee3=    0.0   1.0   2.0   3.0   4.0   5.0   6.0   7.0   8.0   9.0

 pointee1=    5.0   5.0   5.0   5.0   5.0   0.0   0.0   0.0   0.0   0.0
 pointee2=    5.0   5.0   5.0   5.0   5.0
 pointee3=    5.0   5.0   5.0   5.0   5.0


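 Note that declaring the pointee with assumed-size syntax, as in 
 POINTEE3(*), makes it impossible to have I/O statements that reference 
 the array name without subscripts; hence the implied-DO loops above.

 Note also that in the last block POINTEE1 is declared with 10 elements 
 but points to the 5-element ARRAY2, so the last five values printed 
 are whatever happened to follow ARRAY2 in memory (zeros in this run).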



 Using Cray pointers
 -------------------
 The value of the pointer can be defined using an intrinsic function, 
 usually called LOC(arg), that takes as argument the name of a 
 variable and returns its memory address.

 The following points should be noted:

    1) The pointee only exists while the value of the pointer 
       is defined.
    2) The pointee may not be part of a COMMON block, nor may 
       it be a dummy argument of the current subprogram. 
       The pointer may be a COMMON block variable or a dummy 
       argument.
    3) You may pass a pointee as an actual argument to a 
       subprogram without special precautions; i.e., the 
       subprogram need not know that the object is a pointee; 
       the subprogram will treat it as it would any ordinary 
       variable.
    4) If you pass a pointer as an actual argument, then the 
       called subprogram should usually have declared the 
       corresponding dummy argument as a pointer.

 Furthermore:

    -- If the pointee is an array, and the dimensions of the array 
       are specified as run-time expressions rather than compile-time 
       constants, these will usually be evaluated (in an arbitrary 
       order, so beware of side-effects in their evaluation) upon 
       entry into the subprogram.

       It follows that you can't declare such variable dimensions 
       in the main program. A few compilers offer, as an option, 
       "dynamic dimensioning" in which the array dimensions are 
       evaluated again on each reference. (For example, XL Fortran 
       lets you do this by compiling with -qddim.)

       Dynamic dimensioning is not as widely supported as Cray pointers, 
       and can entail additional overhead in some circumstances 
       (especially when working with multidimensional arrays). You are 
       therefore encouraged to arrange for the dimensions to be known 
       upon entry to the subprogram that declares the pointee; otherwise 
       multidimensional arrays may be indexed incorrectly, and bounds 
       checking may fail on one-dimensional arrays.

    -- Many implementations supply a subprogram to allocate memory 
       (it may be called MALLOC, for Memory ALLOCation, or HPALLOC, 
       for HeaP ALLOCation, or something else entirely) and another 
       subprogram to release memory to the operating system. 

       On VMS you may use: STATUS = LIB$GET_VM(SIZE_IN_BYTES, PTR)
       to allocate a block of memory. To free that block you may use:
       STATUS = LIB$FREE_VM(SIZE_IN_BYTES, PTR). STATUS is an integer 
       variable used to store the return value of these system routines.
       As with other VMS condition values, an odd STATUS (low-order bit 
       set) means success and an even one means the routine failed.

       It is your responsibility to free a pointee's memory before the 
       value of the pointer becomes undefined (for example by exiting
       the subprogram where it is declared, if it isn't SAVEd); once 
       the address is lost, the block can no longer be released.
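
 As a minimal sketch of how such an allocation routine is typically 
 combined with a Cray pointer, the following assumes a compiler that 
 accepts Cray pointers and provides a MALLOC(nbytes) function and a 
 FREE(ptr) subroutine, as several Unix compilers do (with gfortran the 
 -fcray-pointer option is needed). It also assumes that a default REAL 
 occupies 4 bytes. The names DYNMEM, BUF and WORK are made up for the 
 illustration.

```fortran
      PROGRAM DYNMEM
! Sketch only: assumes Cray pointers, MALLOC(nbytes) and FREE(ptr)
! are available, and that a default REAL occupies 4 bytes.
      INTEGER N
      REAL BUF(*)
      POINTER (PTR, BUF)
! N would normally be read or computed at run time.
      N = 1000
      PTR = MALLOC(4 * N)
      IF (PTR .EQ. 0) STOP 1
! Pass the pointee like any ordinary array (point 3 above).
      CALL WORK(BUF, N)
      CALL FREE(PTR)
      END

      SUBROUTINE WORK(A, N)
! A is an ordinary adjustable array; its size is known on entry.
      INTEGER N, I
      REAL A(N), S
      DO 10 I = 1, N
         A(I) = REAL(I)
   10 CONTINUE
      S = 0.0
      DO 20 I = 1, N
         S = S + A(I)
   20 CONTINUE
      WRITE(*,'(1X,A,F10.1)') 'sum = ', S
      END
```

 Inside WORK the array is an ordinary adjustable dummy argument, so 
 none of the dimensioning problems described above arise there.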


 Cray pointers and automatic compiler optimizations
 --------------------------------------------------
 Compilers perform a partial data-flow analysis as a preparation for 
 automatic optimization. Unrestricted pointers that are free to point 
 to any other variable make such an analysis almost impossible. 

 Cray pointers are restricted to some degree (e.g. arithmetic on 
 pointer values is at best non-portable), but a pointer can still 
 point to different objects during a procedure activation. 

 Fortran was designed so that it could be highly optimized (remember 
 that it had to compete with assembly language, and won). The language 
 specification largely forbids ALIASING (having more than one name for 
 the same variable), for example through dummy arguments that are 
 modified, because of the detrimental effect on automatic 
 optimizations. In short, pointers violate the spirit of Fortran, and 
 defeat its purpose.

 Cray pointers may even cause WRONG results when compiler optimization 
 is turned on and they are used without a deep understanding of their 
 effect on the optimizer. The likely reason is that the optimizer 
 assumes that there is no aliasing.
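
 The hazard can be sketched as follows, assuming a compiler with Cray 
 pointers and LOC (the names ALIAS, X and P are made up for the 
 illustration). X is modified through the pointee P; a compiler that 
 assumes no aliasing is entitled to keep the old value of X in a 
 register, so the value printed may depend on the optimization level.

```fortran
      PROGRAM ALIAS
! Sketch only: assumes Cray pointers and the LOC function.
      REAL X, P
      POINTER (PTR, P)
      X = 1.0
! From here on P is a second name (an alias) for X.
      PTR = LOC(X)
! X is changed through the alias, behind the optimizer's back.
      P = 2.0
! Without optimization this usually prints 2.0; an optimizer that
! assumes no aliasing may still use the old value 1.0.
      WRITE(*,*) X
      END
```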

 It is recommended that you use Cray pointers only for allocating
 dynamic memory, and pass the corresponding pointees to subroutines. 

 If you wish to avoid Cray pointers completely, you can call a C routine 
 for that purpose; see the chapter on variable-size arrays for source 
 code and discussion.


 Fortran 90 ALLOCATABLE objects
 ------------------------------
 Fortran 90 offers a more elegant way of allocating objects dynamically,
 through the ALLOCATABLE attribute and the ALLOCATE and DEALLOCATE
 executable statements. 

 For example, one can declare a one-dimensional allocatable array 
 as:

   REAL, ALLOCATABLE :: A(:)

 then allocate it to a given size with: 

   INTEGER IAS
   ALLOCATE (A(3:12), STAT=IAS)

 The optional STAT=IAS allows recovery from allocation errors. If the
 allocation is unsuccessful and no STAT= was given, the program aborts;
 if STAT= was given, IAS contains either 0 for success, or a non-zero 
 error code.

 The array bounds are stored with the array, and are computed at the
 time the ALLOCATE statement is executed. This avoids all the difficulties
 mentioned above with the dimensioning of Cray pointees.

 Don't forget to 

   DEALLOCATE(A)

 when you no longer need it. If you are no longer sure whether A is 
 allocated or not, say:

   IF (ALLOCATED(A)) DEALLOCATE(A)
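
 Putting these pieces together, a minimal sketch might look as follows. 
 The names ALLOC, N and I are chosen for the illustration, and N is set 
 by assignment here although it would normally be read or computed at 
 run time.

```fortran
      PROGRAM ALLOC
      IMPLICIT NONE
      INTEGER :: N, I, IAS
      REAL, ALLOCATABLE :: A(:)
! The problem size is only known at run time.
      N = 10
! Allocate A with bounds 3..N+2 and check the status.
      ALLOCATE (A(3:N+2), STAT=IAS)
      IF (IAS .NE. 0) STOP 1
! The bounds travel with the array, so they can be queried.
      DO I = LBOUND(A,1), UBOUND(A,1)
         A(I) = REAL(I)
      END DO
      WRITE(*,*) 'bounds:', LBOUND(A,1), UBOUND(A,1)
      WRITE(*,*) 'size:  ', SIZE(A)
      IF (SIZE(A) .NE. N) STOP 1
! Release the storage when it is no longer needed.
      IF (ALLOCATED(A)) DEALLOCATE (A)
      END PROGRAM ALLOC
```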



 Linked data structures and the Fortran 90 POINTER attribute
 -----------------------------------------------------------
 Pointers are also useful for constructing linked lists, trees and other
 dynamic data structures. Cray pointers are suitable for this purpose,
 particularly when used in conjunction with the function LOC(arg) that
 returns a pointer to its argument, but Fortran 90's ALLOCATABLE arrays
 are not. This is why Fortran 90 also supports a POINTER attribute in
 addition to ALLOCATABLE. A binary tree node could be declared as

   TYPE NODE
      TYPE(NODE), POINTER :: LEFT, RIGHT
      TYPE(ELEMENT) :: DATA
   END TYPE NODE

 Fortran 90 pointers can be in one of three states: undefined, associated
 and unassociated. The initial state is undefined, so a new pointer 
 should be given a definite state before its status is tested. A pointer 
 may become associated by executing a pointer assignment statement

   pointer => target

 or an allocation statement

   ALLOCATE(pointer);

 an associated pointer may become unassociated by executing a

   NULLIFY(pointer)

 or, in some cases,

   DEALLOCATE(pointer);

 any pointer may become undefined in the same way as other Fortran 
 variables can (by exiting a subprogram without a SAVE for the pointer, 
 etc.), and once undefined it must not be referenced or tested with 
 ASSOCIATED until it has been nullified, pointer-assigned or allocated 
 again. (Be careful!)

 Pointers may point to otherwise unnamed objects created by an ALLOCATE
 statement, or to named objects declared with the TARGET attribute.
 The TARGET attribute is designed as an optimization aid to the compiler:
 named variables without it cannot be aliased by a pointer, so if you 
 use it on variables that don't need it, your code may run more slowly.
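
 As a minimal sketch of these rules at work, the following builds, 
 traverses and releases a singly linked list. The node type here holds 
 a plain INTEGER instead of the TYPE(ELEMENT) component above, and all 
 names are made up for the illustration.

```fortran
      PROGRAM LIST
      IMPLICIT NONE
      TYPE NODE
         INTEGER :: VAL
         TYPE(NODE), POINTER :: NEXT
      END TYPE NODE
      TYPE(NODE), POINTER :: HEAD, CUR
      INTEGER :: I, TOTAL
! Nullify HEAD so that the list starts out empty.
      NULLIFY (HEAD)
! Push 1, 2, 3 onto the front of the list.
      DO I = 1, 3
         ALLOCATE (CUR)
         CUR%VAL = I
         CUR%NEXT => HEAD
         HEAD => CUR
      END DO
! Walk the list, summing the values.
      TOTAL = 0
      CUR => HEAD
      DO WHILE (ASSOCIATED(CUR))
         TOTAL = TOTAL + CUR%VAL
         CUR => CUR%NEXT
      END DO
      WRITE(*,*) 'total =', TOTAL
      IF (TOTAL .NE. 6) STOP 1
! Release the nodes one by one.
      DO WHILE (ASSOCIATED(HEAD))
         CUR => HEAD%NEXT
         DEALLOCATE (HEAD)
         HEAD => CUR
      END DO
      END PROGRAM LIST
```

 Each node is an unnamed object created by ALLOCATE, so no TARGET 
 attribute is needed anywhere in this program.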


   Summary of differences between Cray and Fortran 90 pointers
   ===========================================================

 Cray                                   Fortran 90
 -------------------------------------  ------------------------------------
 POINTER is a data type                 POINTER is an attribute

 A pointer holds the machine address    A pointer holds any information
 of the pointee; arithmetic is usually  needed to access the object; usually
 possible on pointer values, but may    an address plus a data type, array
 be non-portable.                       bounds and strides where applicable.
                                        You should not depend on assumptions
                                        about the internal representation,
                                        storage size, etc. of a Fortran 90
                                        pointer. No arithmetic on pointers.

 Assigning to a pointer variable        Referencing the pointer variable
 associates the pointer with a new      "does the right thing", i.e. usually
 pointee. Referencing the pointee       manipulates the value of the pointee.
 variable manipulates the current       In particular, pointer = value will
 pointee without affecting the          change the value of the object pointed
 pointer.                               to, but not the association of the
                                        pointer with the pointee. If you want
                                        to change that, use pointer => target
                                        instead.

 POINTER is used both to construct      POINTER can be used for both linked
 linked data structures and to support  data structures and dynamic allocation,
 dynamic memory allocation.             but if dynamic allocation alone is
                                        needed it is better to use ALLOCATABLE
                                        objects. (Significant performance
                                        differences have been reported with
                                        some compilers.)
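
 The difference between the two kinds of assignment in the Fortran 90 
 column can be sketched with a short program (the names PASSGN, X, Y 
 and P are made up for the illustration):

```fortran
      PROGRAM PASSGN
      IMPLICIT NONE
      REAL, TARGET :: X, Y
      REAL, POINTER :: P
      X = 1.0
      Y = 2.0
! Pointer assignment: associate P with X.
      P => X
! Ordinary assignment: stores into the pointee, so X becomes 5.0.
      P = 5.0
! Pointer assignment again: P now refers to Y; X and Y are unchanged.
      P => Y
      WRITE(*,*) X, Y, P
      IF (X .NE. 5.0 .OR. Y .NE. 2.0 .OR. P .NE. 2.0) STOP 1
      IF (.NOT. ASSOCIATED(P, Y)) STOP 1
      END PROGRAM PASSGN
```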