Ada 95 Quality and Style Guide                            Chapter 10


Chapter 10: Improving Performance

In This Chapter:

10.1 Performance Issues
10.2 Performance Measurement
10.3 Program Structure
10.3.1 Blocks
10.4 Data Structures
10.4.1 Dynamic Arrays
10.4.2 Zero-Based Arrays
10.4.3 Unconstrained Records
10.4.4 Records and Arrays
10.4.5 Records and Array Aggregates
10.5 Algorithms
10.5.1 Mod and rem Operators
10.5.2 Short-Circuit Operators
10.5.3 Case Statement Versus elsif
10.5.4 Checking for Constraint Errors
10.5.5 Order of Array Processing
10.5.6 Assigning Alternatives
10.5.7 Packed Boolean Array Shifts
10.5.8 Subprogram Dispatching
10.6 Types
10.6.1 Aggregates for Type Extensions
10.6.2 Protected Types
10.6.3 Bit Operations on Modular Types
10.6.4 Bounded Strings
10.6.5 String Handling Subprograms
10.6.6 Constraint Checking
10.6.7 Real-Time Systems Annex
10.7 Pragmas
10.7.1 Pragma Inline
10.7.2 Pragma Restrictions
10.7.3 Pragma Preelaborate
10.7.4 Pragma Pure
10.7.5 Pragma Discard_Names
10.7.6 Pragma Suppress
10.7.7 Pragma Reviewable
10.8 Summary

In many ways, performance is at odds with maintainability and portability. To achieve improved speed or memory usage, the most clear algorithm sometimes gives way to confusing code. To exploit special purpose hardware or operating system services, nonportable implementation dependencies are introduced. When concerned about performance, you must decide how well each algorithm meets its performance and maintainability goals. Use the guidelines in this chapter with care; they may be hazardous to your software.

The best way to build a system that satisfies its performance requirements is through good design. You should not assume that speeding up your code will result in a visible increase in system execution. In most applications, the overall throughput of the system is not defined by the execution speed of the code but by the interaction between concurrent processes and the response time of the system peripherals.

Most of the guidelines in this chapter read ". . . when measured performance indicates." "Indicates" means that you have determined that the benefit in increased performance to your application in your environment outweighs the negative side effects on understandability, maintainability, and portability of the resulting code. Many of the guideline examples show the alternatives that you will need to measure in order to determine if the guideline is indicated.

10.1 PERFORMANCE ISSUES

Performance has at least four aspects: execution speed, code size, compilation speed, and linking speed. Although all four are important, most people think of execution speed when performance is mentioned, and most of the guidelines in this chapter focus on execution speed.

Performance is influenced by many factors, including the compilation software, hardware, system load, and coding style. While only coding style is typically under the control of the programmer, the other factors have so much influence that it is impossible to make flat statements such as "case statements are more efficient than if-then-else structures." When performance is critical, you cannot assume that a coding style that proves more efficient on one system will also be more efficient on another. Decisions made for the sake of performance must be made on the basis of testing the alternatives on the actual system on which the application will be fielded.

10.2 PERFORMANCE MEASUREMENT

While most well-known tools for measuring performance are stand-alone programs that concentrate on execution speed, there is a comprehensive tool that covers all four aspects of performance. The Ada Compiler Evaluation System (ACES) is the result of merging two earlier products: the United States Department of Defense's Ada Compiler Evaluation Capability and the United Kingdom Ministry of Defence's Ada Evaluation System. It offers a comprehensive set of nearly 2,000 performance tests along with automated setup, test management, and analysis software. This system reports (and statistically analyzes) compilation time, linking time, execution time, and code size. The analysis tools make comparisons among multiple compilation-execution systems and also provide comparisons of the run-time performance of tests using different coding styles to achieve similar purposes.

Version 2.0 of the ACES, released in March of 1995, includes a Quick-Look facility that is meant to replace the Performance Issues Working Group (PIWG) suite. The Quick-Look facility is advertised as being easy to download, install, and execute in less than a day, while providing information that is as useful as that generated by the PIWG suite. In addition, Version 2.0 contains a limited number of Ada 95 tests (all of which are also included in the Quick-Look subset). Version 2.1, including broad coverage of the "core" Ada 95 language, is scheduled for release in March 1996.

At the time of this writing, the ACES software and documentation can be obtained via anonymous FTP from the host sw-eng.falls-church.va.us, directory /public/AdaIC/testing/aces. For World Wide Web access, use the following uniform resource locator (URL): http://sw-eng.falls-church.va.us/AdaIC/testing/aces/.

While measuring performance may seem to be a relatively straightforward matter, there are significant issues that must be addressed by any person or toolset planning to do such measurement. For detailed information, see the following sources: ACES (1995a, 1995b, 1995c); Clapp, Mudge, and Roy (1990); Goforth, Collard, and Marquardt (1990); Knight (1990); Newport (1995); and Weidermann (1990).

10.3 PROGRAM STRUCTURE

10.3.1 Blocks

guideline

  • Use blocks (see Guideline 5.6.9) to introduce late initialization when measured performance indicates.

  • example
       ...
    
       Initial : Matrix;
    
    
    
    begin  -- Find_Solution
    
    
    
       Initialize_Solution_Matrix:
    
          for Row in Initial'Range(1) loop
    
             for Col in Initial'Range(2) loop
    
                Initial (Row, Col) := Get_Value (Row, Col);
    
             end loop;
    
          end loop Initialize_Solution_Matrix;
    
    
    
       Converge_To_The_Solution:
    
          declare
    
    
    
             Solution       : Matrix           := Identity;
    
             Min_Iterations : constant Natural := ...;
    
    
    
          begin  -- Converge_To_The_Solution
    
             for Iterations in 1 .. Min_Iterations loop
    
                Converge (Solution, Initial);
    
             end loop;
    
    
    
          end Converge_To_The_Solution;
    
    
    
       ...
    
    end Find_Solution;
    
    
    
    

    rationale

    Late initialization allows a compiler more choices in register usage optimization. Depending on the circumstance, this may introduce a significant performance improvement.

    Some compilers incur a performance penalty when declarative blocks are introduced. Careful analysis and timing tests by the programmer may identify those declarative blocks that should be removed.

    notes

    It is difficult to accurately predict through code inspections which declarative blocks improve performance and which degrade performance. However, with these general guidelines and a familiarity with the particular implementation, performance can be improved.

    10.4 DATA STRUCTURES

    10.4.1 Dynamic Arrays

    guideline

  • Use constrained arrays when measured performance indicates.

  • rationale

    If array bounds are not known until run-time, then calculations of these bounds may affect run-time performance. Using named constants or static expressions as array bounds may provide better performance than using variables or nonstatic expressions. Thus, if the values of Lower and Upper are not determined until run-time, then:

    ... is array (Lower .. Upper) of ...
    
    

    may cause address and offset calculations to be delayed until run-time, introducing a performance penalty. See NASA (1992) for a detailed discussion of the tradeoffs and alternatives.

    10.4.2 Zero-Based Arrays

    guideline

  • Use zero-based indexing for arrays when measured performance indicates.

  • rationale

    For some compilers, offset calculations for an array whose lower bound is 0 (either the integer zero or the first value of an enumeration type) are simplified. For other compilers, optimization is more likely if the lower bound is 1.

    10.4.3 Unconstrained Records

    guideline

  • Use fixed-size components for records when measured performance indicates.

  • example
    subtype Line_Range   is Integer range 0 .. Max_Lines;
    
    subtype Length_Range is Integer range 0 .. Max_Length;
    
    
    
    -- Note that Max_Lines and Max_Length need to be static
    
    type Paragraph_Body is array (Line_Range range <>, Length_Range range <>) of Character;
    
    
    
    type Paragraph (Lines : Line_Range := 0; Line_Length : Length_Range := 0) is
    
       record
    
          Text : Paragraph_Body (1 .. Lines, 1 .. Line_Length);
    
       end record;
    
    
    
    

    rationale

    Determine the size and speed impact of unconstrained records having components depending on discriminants. Some compilers will allocate the maximum possible size to each object of the type; others will use pointers to the dependent components, incurring a possible heap performance penalty. Consider the possibility of using fixed-size components.

    10.4.4 Records and Arrays

    guideline

  • Define arrays of records as parallel arrays when measured performance indicates.

  • example
        -- Array of records
    
        Process (Student (Index).Name, Student (Index).Grade);
    
        -- Record of arrays
    
        Process (Student.Name (Index), Student.Grade (Index));
    
        -- Parallel arrays
    
        Process (Name (Index), Grade (Index));
    
    

    rationale

    Determine the impact of structuring data as arrays of records, records containing arrays, or parallel arrays. Some implementations of Ada will show significant performance differences among these examples.

    10.4.5 Record and Array Aggregates

    guideline

  • Use a sequence of assignments for an aggregation when measured performance indicates.

  • rationale

    Determine the impact of using an aggregate versus a sequence of assignments. Using an aggregate generally requires the use of a temporary variable. If the aggregate is "static" (i.e., its size and components are known at compile- or link-time, allowing link-time allocation and initialization), then it will generally be more efficient than a sequence of assignments. If the aggregate is "dynamic," then a series of assignments may be more efficient because no temporary variable is needed.

    See Guideline 5.6.10 for a discussion of aggregates from the point of view of readability and maintainability.

    See Guideline 10.6.1 for a discussion of extension aggregates.

    10.5 ALGORITHMS

    10.5.1 Mod and rem Operators

    guideline

  • Use incremental schemes instead of mod and rem when measured performance indicates.

  • example
       -- Using mod
    
       for I in 0 .. N loop
    
          Update (Arr (I mod Modulus));
    
       end loop;
    
    
    
       -- Avoiding mod
    
       J := 0;
    
       for I in 0 .. N loop
    
          Update (Arr (J));
    
          J := J + 1;
    
          if J = Modulus then
    
             J := 0;
    
          end if;
    
       end loop;
    
    
    
    

    rationale

    Determine the impact of using the mod and rem operators. One of the above styles may be significantly more efficient than the other.

    10.5.2 Short-Circuit Operators

    guideline

  • Use the short-circuit control form when measured performance indicates.

  • example
       -- Nested "if"
       if Last >= Target_Length then
    
          if Buffer (1 .. Target_Length) = Target then
    
             ...
    
          end if;
    
       end if;
    
    
    
       -- "and then"
       if Last >= Target_Length and then Buffer (1 .. Target_Length) = Target then
    
          ...
    
       end if;
    
    
    
    

    rationale

    Determine the impact of using nested if statements versus using the and then or or else operator. One of the above may be significantly more efficient than the other.

    10.5.3 Case Statement Versus elsif

    guideline

  • Use the case statement when measured performance indicates.

  • example
       subtype Small_Int is Integer range 1 .. 5;
    
       Switch : Small_Int;
    
       ...
    
       -- Case statement
    
       case Switch is
    
          when 1 => ...
    
          when 2 => ...
    
          when 3 => ...
    
          when 4 => ...
    
          when 5 => ...
    
       end case;
    
    
    
       -- "elsif construct"
       if Switch = 1 then
    
          ...
    
       elsif Switch = 2 then
    
          ...
    
       elsif Switch = 3 then
    
          ...
    
       elsif Switch = 4 then
    
          ...
    
       elsif Switch = 5 then
    
          ...
    
       end if;
    
    
    
    

    rationale

    Determine the impact of using case statements versus the elsif construct. If the case statement is implemented using a small jump table, then it may be significantly more efficient than the if .. then .. elsif construct.

    See also Guideline 8.4.6 for a discussion of the table-driven programming alternative.

    10.5.4 Checking for Constraint Errors

    guideline

  • Use hard-coded constraint checking when measured performance indicates.

  • example
       subtype Small_Int is Positive range Lower .. Upper;
    
       Var : Small_Int;
    
       ...
    
    
    
       -- Using exception handler
    
       Double:
    
          begin
    
             Var := 2 * Var;
    
          exception
    
             when Constraint_Error =>
                ...
    
          end Double;
    
    
    
          -- Using hard-coded check
    
          if Var > Upper / 2 then
    
             ...
    
          else
    
             Var := 2 * Var;
    
          end if;
    
    
    
    

    rationale

    Determine the impact of using exception handlers to detect constraint errors. If the exception handling mechanism is slow, then hard-coded checking may be more efficient.

    10.5.5 Order of Array Processing

    guideline

  • Use column-first processing of two-dimensional arrays when measured performance indicates.

  • example
        type Table_Type is array (Row_Min .. Row_Max, Col_Min .. Col_Max) of ...
    
        Table : Table_Type;
    
        ...
    
        -- Row-order processing
    
        for Row in Row_Min .. Row_Max loop
    
           for Col in Col_Min .. Col_Max loop
    
              -- Process Table (Row, Col)
    
           end loop;
    
        end loop;
    
        -- Column-order processing
    
        for Col in Col_Min .. Col_Max loop
    
           for Row in Row_Min .. Row_Max loop
    
              -- Process Table (Row, Col)
    
           end loop;
    
        end loop;
    
    

    rationale

    Determine the impact of processing two-dimensional arrays in row-major order versus column-major order. While most Ada compilers are likely to use row-major order, it is not a requirement. In the presence of good optimization, there may be no significant difference in the above examples. Using static array bounds is also likely to be significant here. See Guidelines 10.4.1 and 10.4.2.

    10.5.6 Assigning Alternatives

    guideline

  • Use overwriting for conditional assignment when measured performance indicates.

  • example
       -- Using "if .. else"
       if Condition then
    
          Var := One_Value;
    
       else
    
          Var := Other_Value;
    
       end if;
    
       -- Using overwriting
    
       Var := Other_Value;
    
       if Condition then
    
          Var := One_Value;
    
       end if;
    
    

    rationale

    Determine the impact of styles of assigning alternative values. The examples illustrate two common methods of doing this; for many systems, the performance difference is significant.

    10.5.7 Packed Boolean Array Shifts

    guideline

  • When measured performance indicates, perform packed Boolean array shift operations by using slice assignments rather than repeated bit-wise assignment.

  • example
       subtype Word_Range is Integer range 0 .. 15;
    
       type Flag_Word is array (Word_Range) of Boolean;
    
       pragma Pack (Flag_Word);
    
       Word : Flag_Word;
    
       ...
    
    
    
       -- Loop to shift by one bit
    
       for Index in 0 .. 14 loop
    
          Word (Index) := Word (Index + 1);
    
       end loop;
    
       Word (15) := False;
    
    
    
       -- Use slice assignment to shift by one bit
    
       Word (0 .. 14) := Word (1 .. 15);
    
       Word (15) := False;
    
    
    
    

    rationale

    Determine the impact of slice manipulation when shifting packed Boolean arrays. For Ada 83 implementations using packed Boolean arrays, shift operations may be much faster when slice assignments are used as opposed to for loop moving one component at a time. For Ada 95 implementations, consider using modular types instead (see Guideline 10.6.3).

    10.5.8 Subprogram Dispatching

    guideline

  • Use static subprogram dispatching when measured performance indicates.
  • example

    The term "static dispatching" in this example refers to the use of if/elsif sequences to explicitly determine which subprograms to call based on certain conditions:

        -- (1) Dispatching where tag is not known at compile time
    
        --     (See ACES V2.0 test "a9_ob_class_wide_dynamic_01")
    
        -- Object_Type is a tagged type
    
        -- The_Pointer designates Object_Type'Class;
    
        -- Subclass1_Pointer designates Subclass1 (derived from Object_Type)
    
        -- Subclass2_Pointer designates Subclass2 (derived from Subclass1)
    
        -- Subclass3_Pointer designates Subclass3 (derived from Subclass2)
    
        Random_Value := Simple_Random; -- Call to a random number generator
    
        if Random_Value < 1.0/3.0 then
    
           The_Pointer := Subclass1_Pointer;
    
        elsif Random_Value > 2.0/3.0 then
    
           The_Pointer := Subclass2_Pointer;
    
        else
    
           The_Pointer := Subclass3_Pointer;
    
        end if;
    
        Process (The_Pointer.all);  -- Tag is unknown
    
        -- (2) Tag is determinable at compile time (static dispatching)
    
        --     (See ACES V2.0, test "a9_ob_class_wide_static_01")
    
        -- Object_Type is a tagged type
    
        -- The_Pointer designates Object_Type'Class;
    
        -- Subclass1_Pointer designates Subclass1 (derived from Object_Type)
    
        -- Subclass2_Pointer designates Subclass2 (derived from Subclass1)
    
        -- Subclass3_Pointer designates Subclass3 (derived from Subclass2)
    
        Random_Value := Simple_Random; -- Call to a random number generator
    
        if Random_Value < 1.0/3.0 then
    
           Process (Subclass1_Pointer.all);
    
        elsif Random_Value > 2.0/3.0 then
    
           Process (Subclass2_Pointer.all);
    
        else
    
           Process (Subclass3_Pointer.all);
    
        end if;
    
        -- (3) No tagged types are involved (no dispatching)
    
        --     (See ACES V2.0, test "ap_ob_class_wide_01")
    
        -- Object_type is a discriminated type with variants; possible
    
        -- discriminant values are Subclass1, Subclass2, and Subclass3
    
        -- All the pointers designate values of Object_Type
    
        -- Subclass1_Pointer := new Object_Type (Subclass1);
    
        -- Subclass2_Pointer := new Object_Type (Subclass2);
    
        -- Subclass3_Pointer := new Object_Type (Subclass3);
    
        -- There is only one "Process" procedure (operating on Object_Type)
    
        Random_Value := Simple_Random; -- Call to a random number generator
    
        if Random_Value < 1.0/3.0 then
    
           Process (Subclass1_Pointer.all);
    
        elsif Random_Value > 2.0/3.0 then
    
           Process (Subclass2_Pointer.all);
    
        else
    
           Process (Subclass3_Pointer.all);
    
        end if;
    
    

    rationale

    Determine the impact of dynamic and static subprogram dispatching. The compiler may generate much more efficient code for one form of dispatching than the other.

    notes

    Dynamic dispatching will almost certainly be more efficient than an explicit if . . . elsif sequence. However, you should be aware of any optimizing decisions made by a compiler that might affect this situation.

    10.6 TYPES

    10.6.1 Aggregates for Type Extensions

    guideline

  • Use only simple aggregates when measured performance indicates.

  • example
       type Parent is tagged
    
          record
    
             C1 : Float;
    
             C2 : Float;
    
          end record;
    
    
    
       type Extension is new Parent with
    
          record
    
             C3 : Float;
    
             C4 : Float;
    
          end record;
    
    
    
       Parent_Var : Parent := (C1 => Float_Var1, C2 => Float_Var2);
    
       Exten_Var  : Extension;
    
       ...
    
       -- Simple aggregate
    
       -- (See ACES V2.0, test "a9_ob_simp_aggregate_02")
    
       Exten_Var := (C1 => Float_Var1, C2 => Float_Var2,
    
                     C3 => Float_Var3, C4 => Float_Var4);
    
       -- Extension aggregate
    
       -- (See ACES V2.0, test "a9_ob_ext_aggregate_02")
    
       Exten_Var := (Parent_Var with C3 => Float_Var3, C4 => Float_Var4);
    
    

    rationale

    Determine the impact of using extension aggregates. There may be a significant performance difference between evaluation of simple aggregates and evaluation of extension aggregates.

    10.6.2 Protected Types

    guideline

  • For mutual exclusion, when measured performance indicates, use protected types as an alternative to tasking rendezvous.
  • To implement an interrupt handler, when performance measurement indicates, use a protected procedure.

  • example
       -- (1) Using protected objects
    
       --     (See ACES V2.0, test "a9_pt_prot_access_02")
    
       protected Object is
    
          function Read return Float;
    
          procedure Write (Value : in Float);
    
       private
    
          Data : Float;
    
       end Object;
    
       protected body Object is
    
          function Read return Float is
    
          begin
    
             return Data;
    
          end Read;
    
          procedure Write (Value : in Float) is
    
          begin
    
             Data := Value;
    
          end Write;
    
       end Object;
    
       task type Modify is
    
       end Modify;
    
       type Mod_Bunch is array (1 .. 5) of Modify;
    
       task body Modify is
    
          ...
    
       begin -- Modify
    
          for I in 1 .. 200 loop
    
             The_Value := Object.Read;
    
             Object.Write (The_Value - 0.125);
    
             if The_Value < -1.0E7 then
    
                The_Value := 1.0;
    
             end if;
    
          end loop;
    
       end Modify;
    
       ...
    
       -- Block statement to be timed
    
       declare
    
          Contending_Tasks : array (1 .. 5) of Modify;
    
       begin
    
          null;  -- 5 tasks contend for access to protected data
    
       end;
    
       ------------------------------------------------------------------------------
    
       -- (2) Using monitor task
    
       --     (See ACES V2.0, test "tk_rz_entry_access_02")
    
       Task Object is
    
          entry Write (Value : in     Float);
    
          entry Read  (Value :    out Float);
    
       end Object;
    
       task body Object is
    
          Data : Float;
    
       begin -- Object
    
          loop
    
             select
    
                accept Write (Value : in     Float) do
    
                   Data := Value;
    
                end Write;
    
             or
    
                accept Read  (Value :    out Float) do
    
                   Value := Data;
    
                end Read;
    
             or
    
                terminate;
    
             end select;
    
          end loop;
    
       end Object;
    
       -- Task type Modify declared as above
    
       -- Block statement to be timed as above
    
    

    rationale

    Protected objects are meant to be much faster than tasks used for the same purpose (see Guideline 6.1.1). Determine the impact of using protected objects to provide access safely to encapsulated data in a concurrent program.

    10.6.3 Bit Operations on Modular Types

    guideline

  • Use modular types rather than packed Boolean arrays when measured performance indicates.

  • example
       -- (1) Packed Boolean arrays
    
       --     (See ACES V2.0, test "dr_ba_bool_arrays_11")
    
    
    
       type Set is array (0 .. 15) of Boolean;
    
       pragma Pack (Set);
    
    
    
       S1     : Set;
    
       S2     : Set;
    
       Empty  : Set := (Set'Range => False);
    
       Result : Boolean;
    
    
    
       ...
    
    
    
       -- Is S1 a subset of S2?
    
       Result := ((S1 and not S2) = Empty);
    
    
    
       ---------------------------------------------------------------------
    
    
    
       -- (2) Modular types
    
       --     (See ACES V2.0, test "a9_ms_modular_oper_02")
    
    
    
       type Set is mod 16;
    
    
    
       S1     : Set;
    
       S2     : Set;
    
       Empty  : Set := 0;
    
       Result : Boolean;
    
    
    
       ...
    
    
    
       -- Is S1 a subset of S2?
    
       Result := ((S1 and not S2) = Empty);
    
    
    
    

    rationale

    Determine the impact of performing bit-wise operations on modular types. The performance of these operations may be significantly different from similar operations on packed Boolean arrays. See also Guideline 10.5.7.

    10.6.4 Bounded Strings

    guideline

  • Use the predefined bounded strings when predictable performance is an issue and measured performance indicates.

  • rationale

    The unbounded strings may be allocated on the heap. If bounded strings are not allocated on the heap, then they may provide better performance. Determine the impact of using the string type declared in instantiations of Ada.Strings.Bounded.Generic_Bounded_Length versus the type declared in Ada.Strings.Unbounded.

    The predefined Ada 95 language environment defines packages that support both bounded and unbounded strings. Using bounded strings may avoid the unpredictable duration of delays associated with using heap storage.

    10.6.5 String Handling Subprograms

    guideline

  • Use the procedural form of the string handling subprograms when measured performance indicates.

  • rationale

    Determine the relative performance cost of functions and procedures having the same name and functionality in Ada.Strings.Fixed, Ada.Strings.Bounded, Ada.Strings.Unbounded and the corresponding child packages whose names include Wide.

    While functional notation typically leads to clearer code, it may cause the compiler to generate additional copying operations.

    10.6.6 Constraint Checking

    guideline

  • Use strong typing with carefully selected constraints to reduce run-time constraint checking when measured performance indicates.

  • example

    In this example, two potential constraint checks are eliminated. If the function Get_Response returns String, then the initialization of the variable Input would require constraint checking. If the variable Last is type Positive, then the assignment inside the loop would require constraint checking:

       ...
    
       subtype Name_Index is Positive range 1 .. 32;
    
       subtype Name       is String (Name_Index);
    
       ...
    
       function Get_Response return Name is separate;
    
       ...
    
    begin
    
       ...
    
       Find_Last_Period:
    
          declare
    
             -- No Constraint Checking needed for initialization
    
             Input       : constant Name       := Get_Response;
    
             Last_Period :          Name_Index := 1;
    
          begin  -- Find_Last_Period
    
             for I in Input'Range loop
    
                if Input(I) = '.' then
    
                   -- No Constraint Checking needed in  this `tight' loop
    
                   Last_Period := I;
    
                end if;
    
             end loop;
    
             ...
    
          end Find_Last_Period;
    
    

    rationale

    Because run-time constraint checking is associated with slow performance, it is not intuitive that the addition of constrained subtypes could actually improve performance. However, the need for constraint checking appears in many places regardless of the use of constrained subtypes. Even assignments to variables that use the predefined subtypes may need constraint checks. By consistently using constrained subtypes, many of the unnecessary run-time checking can be eliminated. Instead, the checking is usually moved to less frequently executed code involved in system input. In the example, the function Get_Response may need to check the length of a user-supplied string and raise an exception.

    Some compilers can do additional optimizations based on the information provided by constrained subtypes. For example, although an unconstrained array does not have a fixed size, it has a maximum size that can be determined from the range of its index. Performance can be improved by limiting this maximum size to a "reasonable" number. Refer to the discussion on unconstrained arrays found in NASA (1992).

    10.6.7 Real-Time System Annex

    guideline

  • For cases where both rendezvous and protected types are inefficient, consider the use of the Real-Time Systems Annex (Ada Reference Manual 1995, Annex D).

  • rationale

    The packages Ada.Synchronous_Task_Control and Ada.Asynchronous_Task_Control have been defined to provide an alternative to tasking and protected types for use in applications where a minimal run-time is desired (Ada Reference Manual 1995, Annex D).

    10.7 PRAGMAS

    10.7.1 Pragma Inline

    guideline

  • When measured performance indicates, use pragma Inline when calling overhead is a significant portion of the routine's execution time.

  • example
    procedure Assign (Variable : in out Integer;
    
                      Value    : in     Integer);
    
    pragma Inline (Assign);
    
    ...
    
    procedure Assign (Variable : in out Integer;
    
                      Value    : in     Integer) is
    
    begin
    
       Variable := Value;
    
    end Assign;
    
    

    rationale

    If calling overhead is a significant portion of a subprogram's execution time, then using pragma Inline may reduce execution time.

    Procedure and function invocations include overhead that is unnecessary when the code involved is very small. These small routines are usually written to maintain the implementation hiding characteristics of a package. They may also simply pass their parameters unchanged to another routine. When one of these routines appears in some code that needs to run faster, either the implementation-hiding principle needs to be violated or a pragma Inline can be introduced.

    The use of pragma Inline does have its disadvantages. It can create compilation dependencies on the body; that is, when the specification uses a pragma Inline, both the specification and corresponding body may need to be compiled before the specification can be used. As updates are made to the code, a routine may become more complex (larger) and the continued use of a pragma Inline may no longer be justified.

    exceptions

    Although it is rare, inlining code may increase code size, which can lead to slower performance caused by additional paging. A pragma Inline may actually thwart a compiler's attempt to use some other optimization technique, such as register optimization.

    When a compiler is already doing a good job of selecting routines to be inlined, the pragma may accomplish little, if any, improvement in execution speed.

    10.7.2 Pragma Restrictions

    guideline

  • Use pragma Restrictions to express the user's intent to abide by certain restrictions.

  • rationale

    This may facilitate the construction of simpler run-time environments (Ada Reference Manual 1995, §§13.12, D.7, and H.4).

    10.7.3 Pragma Preelaborate

    guideline

  • Use pragma Preelaborate where allowed.

  • rationale

    This may reduce memory write operations after load time (Ada Reference Manual 1995, §§10.2.1 and C.4).

    10.7.4 Pragma Pure

    guideline

  • Use pragma Pure where allowed.

  • rationale

    This may permit the compiler to omit calls on library-level subprograms of the library unit if the results are not needed after the call (Ada Reference Manual 1995, §10.2.1).

    10.7.5 Pragma Discard_Names

    guideline

  • Use pragma Discard_Names when the names are not needed by the application and data space is at a premium.

  • rationale

    This may reduce the memory needed to store names of Ada entities, where no operation uses those names (Ada Reference Manual 1995, §C.5).

    10.7.6 Pragma Suppress

    guideline

  • Use pragma Suppress where necessary to achieve performance requirements.

  • rationale

    See Guideline 5.9.5.

    10.7.7 Pragma Reviewable

    guideline

  • Use pragma Reviewable to aid in the analysis of the generated code.

  • rationale

    See the Ada Reference Manual (1995, Annex H).

    10.8 SUMMARY

    Use the guidelines in this chapter with care; they may be hazardous to your software.

    program structure

  • Use blocks to introduce late initialization when measured performance indicates .

  • data structures

  • Use constrained arrays when measured performance indicates.
  • Use zero-based indexing for arrays when measured performance indicates.
  • Use fixed-size components for records when measured performance indicates.
  • Define arrays of records as parallel arrays when measured performance indicates.
  • Use a sequence of assignments for an aggregation when measured performance indicates.

  • algorithms

  • Use incremental schemes instead of mod and rem when measured performance indicates.
  • Use the short-circuit control form when measured performance indicates.
  • Use the case statement when measured performance indicates.
  • Use hard-coded constraint checking when measured performance indicates.
  • Use column-first processing of two-dimensional arrays when measured performance indicates.
  • Use overwriting for conditional assignment when measured performance indicates.
  • When measured performance indicates, perform packed Boolean array shift operations by using slice assignments rather than repeated bit-wise assignment.
  • Use static subprogram dispatching when measured performance indicates.<

  • types

  • Use only simple aggregates when measured performance indicates.
  • For mutual exclusion, when measured performance indicates, use protected types as an alternative to tasking rendezvous.
  • To implement an interrupt handler, when measured performance indicates, use a protected procedure.
  • Use modular types rather than packed Boolean arrays when measured performance indicates.
  • Use the predefined bounded strings when predictable performance is an issue and measured performance indicates.
  • Use the procedural form of the string handling subprograms when measured performance indicates.
  • Use strong typing with carefully selected constraints to reduce run-time constraint checking when measured performance indicates.
  • For cases where both rendezvous and protected types are inefficient, consider the use of the Real-Time Systems Annex (Ada Reference Manual 1995, Annex D).

  • pragmas

  • When measured performance indicates, use pragma Inline when calling overhead is a significant portion of the routine's execution time.
  • Use pragma Restrictions to express the user's intent to abide by certain restrictions.
  • Use pragma Preelaborate where allowed.
  • Use pragma Pure where allowed.
  • Use pragma Discard_Names when the names are not needed by the application and data space is at a premium.
  • Use pragma Suppress where necessary to achieve performance requirements.
  • Use pragma Reviewable to aid in the analysis of the generated code.