Text File | 1988-05-27 | 74KB | 1,990 lines

The source code and data files for the GSFC Ada Standard Pretty Printer are in the public domain, but NASA does request to be notified of how many copies are made, to encourage funding for similar development for the public domain. Therefore, please fill out this page and return it to:

        Ada Pretty Printer Notification
        c/o Ms. Elisabeth Brinker -- Code 522
        Data Systems Applications Branch
        Goddard Space Flight Center
        Greenbelt, Maryland  20771-0001

-----------------------------------------------------------------

Ada Pretty Printer Notification
Version 1.1

Name ____________________________________________________________

Address _________________________________________________________
_________________________________________________________________
_________________________________________________________________

Copies ___________________      Modified Copies _________________

 _
|_|  I would like information regarding any modifications
     since version 1.1

Comments ________________________________________________________
_________________________________________________________________
_________________________________________________________________
_________________________________________________________________
_________________________________________________________________
_________________________________________________________________
_________________________________________________________________
_________________________________________________________________


Data Systems Technology Laboratory Series
DSTL-88-003

National Aeronautics and Space Administration
Goddard Space Flight Center

Standard Ada Pretty Printer

May 1988


FOREWORD

NASA/GSFC Standard Ada Pretty Printer is a publication of the Data Systems Technology Division of the Mission Operations and Data Systems Directorate, the National Aeronautics and Space Administration, Goddard Space Flight Center (NASA/GSFC).

The principal author of this document is Allyn M. Shell of AdaCraft, Inc.
(a Maryland Corporation)

Work was accomplished for Data Systems Technology Division under Purchase Order Number S-89844-D

Single copies of this document may be obtained by writing to:

        Ms. Elisabeth Brinker
        Code 522
        NASA/Goddard Space Flight Center
        Greenbelt, Maryland  20771


ABSTRACT

This documentation contains a User Guide describing installation and limitations of the NASA/GSFC Ada Pretty Printer tool. Contained in the Appendix is a description of how the source code for the Pretty Printer is structured and guidelines for source code modification and reuse. This documentation was completed together with Version 1.1 of the software tool, which provides significant enhancements to expand the tool's usefulness in a software development environment.


Table of Contents:

    1.  Background
    2.  Installation for IBM AT and Compatibles From Executable
    3.  Installation From Source Code
    4.  Normal Use With Single File
    5.  Multiple Files and Batch Mode
    6.  The Output Files
    7.  Modifying Default File Names
    8.  Optional Parameters
    9.  Pragmas and Creating a Specialized Pretty Printer
    10. Error Messages
    11. Limitations and Known Problems

    Appendix  Ada Pretty Printer Code Documentation

    Ada Pretty Printer Notification


1. Background

This project began in early 1986 as an Ada learning project in the Simulations Operations Control Center (SOC) at Goddard Space Flight Center, where the tool's author was being groomed as "The Ada Expert" for the SOC. Through the process of writing a parse table generator and Ada syntax file, he was able to learn many of the intricacies of Ada he might have otherwise missed in years of development projects.

The tool's author, Mr. Allyn M. Shell, joined the Ada Users Group at GSFC and subsequently became a charter member of the Ada Standards Committee, a working committee of the Ada Users Group.
The standards implemented in the tool are taken directly from the Ada Style Guide (Software Engineering Laboratory Series SEL-87-002) produced by the Standards Committee.

In January of 1988 NASA selected AdaCraft, Inc. to update and document the Pretty Printer code that Mr. Shell wrote. The code has been released to the public domain. Enhancements are encouraged, and NASA would like to know what was done. If some enhancements require changes to the syntax or directives, there is a parse table generator associated with the Pretty Printer which operates on the VAX / VMS. This program can be acquired from the Data Systems Applications Branch of GSFC. (See notification at end of document for address.) It will require modification to make it usable on an IBM AT.


2. Installation for IBM AT and Compatibles from Archived Executable:

The AT must have extended RAM with driver=vdisk.sys in the config.sys file (vdisk.sys comes with IBM's DOS). Otherwise, use PP.exe alone (expanded memory helps).

INSTALLATION:

After making a backup floppy, do the DOS commands designated in the left column starting from the root directory.

        MD Pretty    |  Act_Tab1.DIO   Act_Tab2.DIO   DirecTab.DIO
        CD Pretty    |  Heap.COM       Load_PP.BAT    PP.EXE
        A:PP_Exe     |  Pretty_P.DOC   Pretty_P.EXE   Pretty_P.EXT

Your directory should include the files on the right. Print Pretty_P.DOC.

If your RAM disk is not D: then you will need to edit Load_PP.BAT to designate the correct drive letter. The Alsys compiler routines are assumed to be present by the Load_PP.BAT file but are not necessary. If you have the Alsys compiler routines, Heap.COM can be deleted.

Add C:\Pretty to the path in your AutoExec.BAT file. If you expect to use the Pretty Printer frequently, include the Load_PP call (without the Heap call) before any Alsys\StartUp call in the AutoExec.BAT file.

EXECUTION:

        with RAM:       Load_PP   (This only needs to be done once per session.)
                        Pretty_Print
        without RAM:    PP

On the prompt enter the file name. The last two letters will be changed to -PP for the formatted source file and -SR for the status report.


3. Installation From Source Code:

Compilation command lists are provided for VAX / VMS and IBM AT. Users of other machines should determine whether the VAX_Parse_Tables can be compiled on their systems. If the VAX_Parse_Tables will compile then they should be used. Edit the appropriate command list. If necessary create the Ada library. Then compile and link the source code using the edited command list.

Copy the executable module into the appropriate directory, and if you have not used the VAX_Parse_Tables, you will have to copy the action tables and directives table to the same directory. These are the files:

        Act_Tab1.DAT
        Act_Tab2.DAT
        DirecTab.DAT

Note: These tables are necessary only for the first execution of the Pretty Printer and can be deleted after the first run. The Pretty Printer will generate Direct_IO versions of these tables with .DIO designations. All of the .DIO files or all of the .DAT files must be present for the Pretty Printer to initialize.

If modifications are made to the source code that do not include changes to the parse tables, the original .DIO files can be used with the new load module of the Pretty Printer.

If modifications are made to the syntax or directives using the parse table generator then the parse tables and parse keys files will both need to be recompiled along with their dependent units. The .DIO files will also need to be renamed so that the new .DAT will be installed.


4. Normal Use with Single File:

To execute the Pretty Printer simply invoke the executable module.
On the VAX this would be:

        Run Pretty_Print

On an IBM AT this would be:

        Pretty_Print

Note: Execution of the Pretty Printer on an IBM AT or Compatible will require copying the .EXT file to RAM and assignment of the name Ada_Heap.DTA to the free space in RAM. This can be done using the Load_PP.BAT file which is included.

The Pretty Printer will prompt for the name of the file to be formatted and will return several simple messages. For Example:

        C> Pretty_Print
        Ada File? > FileName.Ada
        FileName.APP created
        FileName.ASR created

The Ada file FileName.Ada was formatted according to the GSFC Standard Ada Style and written to the FileName.APP file with a statistics report written to the FileName.ASR file. See Output Files section for examples.

Note: Since output file names are simply modified versions of the input file name less the last two characters, distinctions between files made by differences in the extension may be lost. (For instance a declaration and corresponding body called Package1.ADA and Package1.ADB will both produce output files with the names Package1.APP and Package1.ASR.) Also, file names with single character extensions will raise exceptions in some implementations because the period of the intended extension will be deleted.

Some error messages may also appear at the terminal. If the message is related to a parse error then the source code does not conform to the Ada LRM. In this case the -PP file will not contain a formatted version of the source file. Instead, it will be a copy of the original file with very simple error messages. Often one parse error will produce several error messages, some of which may be much later in the code where later parts of nested constructs appear.

Another kind of message that appears at the terminal is a flag that is displayed for violations of the Style Guide which are not violations of Ada. These are included for NASA's use.


5. Multiple Files and Batch Mode:

The Pretty Printer will take a list of files and format each file on the list. Each file will produce two output files named in the same manner as for the single file. A statistics totals report, named Stat_Tot.RPT, is also produced.

To format a list of files:

Prepare the list to be formatted with one file name per line. The name may be preceded by blanks (no tab characters). If anything else follows on the line there must be at least one blank space immediately after the name.

To execute a list of files:

Indicate the input file is a list by placing an "at" sign before the file name.

        C> Pretty_Print
        Ada File? > @NameList.Ada
        Pretty Print -- File_1.APP
        File_1.APP created
        File_1.ASR created
        Pretty Print -- File_2.APP
        File_2.APP created
        File_2.ASR created
        Stat_Tot.Rpt created

To execute the Pretty Printer in Batch mode:

The first word of the first line in the files list should be the word "batch". This word must be all lower case and must be separated from anything following by at least one blank. If the output file for the terminal responses is to be named, the name must appear on the same line after the word "batch", preceded and followed by one or more spaces (no tab characters). Everything that appears on this line after the first blank space following the output file name will be considered a comment.

For batch processing this list file must be named Files_In.LST and must be in the directory starting the batch job. When the batch job is completed this file should be renamed because it will cause normal execution of the Pretty Printer to execute the same list file and overwrite the output file. This may cause a loss of the existing output file even if the execution is stopped.
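
The list-file and batch rules above, together with the output naming they share with single-file use, can be sketched in a few lines. The sketch is written in Python rather than the tool's Ada and only illustrates the documented behavior. The function names output_names and parse_file_list are hypothetical, and skipping empty lines is an assumption the documentation does not address.

```python
def output_names(source_name):
    """Derive the formatted-source and status-report file names.

    Documented rule: drop the last two characters of the input name,
    then append "PP" or "SR".
    """
    stem = source_name[:-2]
    return stem + "PP", stem + "SR"


def parse_file_list(lines):
    """Split the lines of a list file into (batch, output_name, names).

    Only blanks separate words; tab characters are not treated as
    separators, matching the "no tab characters" rule.
    Assumption: empty lines are skipped.
    """
    batch = False
    output_name = None
    names = []
    for index, line in enumerate(lines):
        words = [w for w in line.rstrip("\r\n").split(" ") if w]
        if not words:
            continue
        # "batch" is recognized only as the first word of the first line,
        # and only in lower case.
        if index == 0 and words[0] == "batch":
            batch = True
            if len(words) > 1:
                output_name = words[1]  # the rest of the line is a comment
            continue
        names.append(words[0])  # anything after the name is ignored
    return batch, output_name, names
```

Under this rule a name with a single-character extension, such as Unit.A, loses its period and yields UnitPP and UnitSR, which is the case the note in the Normal Use section warns about.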

If the Storage_Error exception is raised while a list of files is being formatted, execution will be stopped at that file, an error message will appear at the terminal or in the output file, and the Statistics Totals Report will be generated for all files prior to the one causing the Storage_Error.

Note: Files having parse errors will not produce a unit status report nor add statistics to the total.


6. The Output Files:

The two primary output files are the Formatted Source File and the Statistics Report. The Formatted Source File will have the same source code as the original source file but the format will be according to the Ada Style Guide (Version 1.1). A sample of the Statistics Report follows:

        Statistics for file named => FileName.Ada

        Description                        Amount
        -----------------------------------------
        Units                                   1
        Lines                                  69
        Lines_of_Code                          17
        Comment_Lines                          42
        Blank_Lines                            10
        Right_Comments                          6
        Open_Semicolons                        10
        Enclosed_Semicolons                     1
        Maximum_Indentation                     2
        ----------- Code - Statistics -----------
        with_clause                             1
        use_clause                              1
        body_stub                               1
        procedure_body                          1
        parameter_spec (fn & proc)              2
        call_statement                          1
        assignment_statement                    3
        if_statement                            2
        elsif_clause                            1
        -----------------------------------------

The code statistics correspond to the number of parse tree nodes for the selected constructs. (Details such as identifier and operator counts are not included.) The open and enclosed semicolon counts are provided to determine how the semicolons are being counted by others providing Ada code. (Enclosed semicolons are those that appear within parentheses, as between the parameter declarations in the formal part of a subprogram declaration.) The maximum indentation is provided to give a rough estimate of unit complexity. The parameter_spec count includes all parameter specifications in procedure, function, and entry declarations and accept statements.
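
The distinction between open and enclosed semicolons can be illustrated with a small text-level sketch in Python. The Pretty Printer itself derives its counts from its token and parse data structures, so this is only an approximation under stated assumptions: parenthesis depth is tracked character by character, everything after "--" on a line is treated as a comment, and string and character literals are not handled. The function name count_semicolons is hypothetical.

```python
def count_semicolons(source):
    """Return (open_count, enclosed_count) for Ada-like source text.

    A semicolon is "enclosed" when it appears inside parentheses and
    "open" otherwise. Simplification: literals are not recognized, so a
    ";" , "(" or "--" inside a string literal would be miscounted.
    """
    open_count = 0
    enclosed_count = 0
    depth = 0  # current parenthesis nesting depth
    for line in source.splitlines():
        code = line.split("--", 1)[0]  # discard the comment part, if any
        for ch in code:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth = max(depth - 1, 0)
            elif ch == ";":
                if depth > 0:
                    enclosed_count += 1
                else:
                    open_count += 1
    return open_count, enclosed_count


example = (
    "procedure P (A : in Integer; B : out Integer) is\n"
    "begin\n"
    "   B := A; -- copy; done\n"
    "end P;\n"
)
print(count_semicolons(example))
```

In the example the semicolon between the two parameter declarations is the only enclosed one, and the semicolon inside the comment is not counted at all.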

The amount for the procedure body, function body, package body, package spec, etc. will include the item that is the compilation unit as well as any enclosed units. The other reported items are self explanatory.

Note: When the source code contains parse errors this report will not be generated.


7. Modifying Default File Names:

The default file names can be changed by making changes in the FileNams.Ada file and recompiling this file and its dependencies. The files which can be modified this way are:

        Action_Table_One         : constant String := "Act_Tab1.Dat";
        Action_Table_Two         : constant String := "Act_Tab2.Dat";
        Directives_Table         : constant String := "Directab.Dat";
        Productions_Table        : constant String := "Prodtabl.Dat";
        Direct_Action_Table_One  : constant String := "Act_Tab1.DIO";
        Direct_Action_Table_Two  : constant String := "Act_Tab2.DIO";
        Direct_Directives_Table  : constant String := "Directab.DIO";
        Direct_Productions_Table : constant String := "Prodtabl.DIO";
        Input_Files_List         : constant String := "Files_In.Lst";
        Output_List              : constant String := "PPoutput.lst";
        Status_Totals_Report     : constant String := "Stat_Tot.Rpt";

Note: The first eight files named here are not used when the VAX Parse Tables are used.


8. Optional Parameters:

The Pretty Printer contains many controls that are parameterized. Any of these parameters can be overridden by pragma Form_Set. These parameters include the following:

        type Format_Type is (NASA_STANDARD, SPECIAL);
        Pretty_Print_Format  : Format_Type := NASA_STANDARD;

        Page_Width           : Natural := 80;
        Right_Comment_Column : Natural := 40;

        Colon_Offset         : Natural := 15;
        Value_Offset         : Natural := 15;
        Assign_Offset        : Natural := 15;

The following parameters can also be overridden by pragma Form_Set, but they only apply when the Pretty_Print_Format is set to SPECIAL.

        Indentation                  : Natural := 3;

        Using_Half_Indentation       : Boolean := TRUE;
                                       -- not recommended for Indentation < 3
        Using_Reverse_Indentation    : Boolean := TRUE;
                                       -- for loop names, block names, and labels

        type Cases is (UPPER_CASE, CAPITALIZE_FIRST, LOWER_CASE);
        Types_Case                   : Cases := CAPITALIZE_FIRST;
        Enumeration_Case             : Cases := UPPER_CASE;
        Identifier_Case              : Cases := CAPITALIZE_FIRST;

        Using_Echeloned_Declarations : Boolean := FALSE;
        Using_Echeloned_Assigns      : Boolean := FALSE;
        Using_Colon_Alinement        : Boolean := FALSE;
        Using_Value_Alinement        : Boolean := FALSE;
        Using_Assign_Alinement       : Boolean := FALSE;
        Using_Pointer_Alinement      : Boolean := FALSE;

The offset parameters in the third group are used with the "using" flags in the last group. The Colon_Offset is aligned measuring from the beginning of the identifier that it follows. The Value_Offset is for the assignment in a declaration and is aligned measuring from the colon. The Assign_Offset is for the assignment statement and is aligned measuring from the beginning of the statement.

The half indentation will cause the "when" clause of case statements, exception handlers, and variant parts of record declarations and the "record" of the record declarations to be indented only half of a normal indentation level so that the enclosed portion of the construct will be indented one level.

The reverse indentation applies to labels (which should not be used according to the Ada Style Guide) and to loop and block names. The reverse indentation causes them to stick out from the surrounding lines of code as in the Ada LRM rather than the Ada Style Guide.

The capitalization parameters can be set for upper_case, lower_case, or capitalize_first. Capitalize_First simply means that the first letter of each word is capitalized. Words in an Ada identifier must be separated by an underline to be capitalized.
The one exception to this rule is: if an identifier ends with _IO then the 'O' is also capitalized. The capitalization parameters can be overridden for individual identifiers by using pragma Form.

An echeloned declaration is one with the colon indented on the line following the identifier. An echeloned assign is one with the assignment indented on the line following the colon. (The echeloned assign flag is not considered unless the echeloned declaration flag is TRUE.)


9. Pragmas and Creating a Specialized Pretty Printer:

The Pretty Printer recognizes four pragmas. They are pragma Form_Set, pragma Form, pragma Format_Off, and pragma Format_On. (The pragmas are all case insensitive.)

        pragma Form_Set (Parameter_Name => Value);

The Form_Set pragma will set any of the parameters mentioned in the last section. The pragma can appear anywhere in the file being formatted and affects the formatting of the entire file and any other files which may follow immediately during the execution of the Pretty Printer. This pragma requires both the parameter name and new value. It can have multiple pairs of names and values, but it cannot continue over to a second line. There may be as many pragma Form_Set statements in a file as desired. When the same parameter is set more than once the last setting will be in force throughout the file being formatted. One exception to the parameter name requirement is the Pretty_Print_Format parameter. The value SPECIAL or NASA_STANDARD may be used without the parameter name, but the value must be the first setting listed in the pragma.

Note: Settings of the parameters remain in effect throughout a series of formats done using a list of files starting at the beginning of the file in which they appear.
This allows lists of files to be formatted uniformly without individual editing, but it will require that parameters be reset after an individual file in a list is formatted (in the next file) if they are meant to be in effect only for that one file. A recommended procedure is to place all such files at the end of any list of files to be formatted with the pragma reset files between them. Files which contain only pragmas will be flagged as having a parse error (since they do not contain a unit), but the parameters will be set. Because the parse error is raised, a pragma file will not affect the total statistics.

        pragma Form (Identifier);

The Form pragma allows the capitalization for individual identifiers to be overridden. The capitalization used will be the first form appearing in the file (which may not necessarily be the one in the pragma). The pragma can appear anywhere in the file being formatted and affects the formatting of the entire file. This pragma can have multiple identifiers separated by commas, but it cannot continue over to a second line. This pragma does not carry over to subsequent files.

        pragma Format_Off;
        pragma Format_On;

The Format_On and Format_Off pragmas allow portions of a file to be excluded from the automatic indentation and spacing controls that the Pretty Printer performs. This allows tables and lists which have been indented and spaced for clarity to be left in their original clear form. These pragmas take effect at their positions in the file. The enclosed code will all be adjusted by the same number of spaces as the first line after the pragma.

Note: Both pragmas in a given pair must appear at places in the code that have the same indentation level. (This means, for example, that the pragmas cannot be used to enclose only the condition line(s) of an if construct.)
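
The Capitalize_First rule from the Optional Parameters section and the pragma Form override described in this section can be sketched together. The sketch is in Python, not the tool's Ada, and is only an illustration: the function names capitalize_first and apply_form are hypothetical, lower-casing the remaining letters of each word is an assumption, and the input is assumed to be a list of identifier spellings in file order (including the spelling inside the pragma itself, at the position where the pragma appears).

```python
def capitalize_first(identifier):
    """Capitalize the first letter of each underline-separated word.

    Documented exception: an identifier ending in _IO keeps the 'O'
    capitalized as well.
    Assumption: all other letters are forced to lower case.
    """
    words = identifier.lower().split("_")
    result = "_".join(w[:1].upper() + w[1:] for w in words)
    if result.endswith("_Io"):
        result = result[:-1] + "O"
    return result


def apply_form(identifiers, form_names):
    """Apply pragma Form to a sequence of identifier spellings.

    identifiers: spellings in the order they appear in the file.
    form_names:  identifiers listed in pragma Form (case insensitive).
    For a listed identifier, the first spelling seen in the file is
    reused everywhere; all others get Capitalize_First.
    """
    listed = {name.lower() for name in form_names}
    first_form = {}
    result = []
    for spelling in identifiers:
        key = spelling.lower()
        if key in listed:
            result.append(first_form.setdefault(key, spelling))
        else:
            result.append(capitalize_first(spelling))
    return result


print(capitalize_first("text_io"))
print(apply_form(["NASA_Id", "nasa_id", "text_io"], ["nasa_ID"]))
```

Because the first spelling found in the file wins, the spelling written inside the pragma only takes effect when the pragma precedes every other use of the identifier.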

Creating a Specialized Pretty Printer:

By setting the parameters in the Pretty_Printer_Parameters unit to the preferred values and recompiling, the user can set the Pretty Printer to the preferred local standard. If the Pretty_Print_Format is set to NASA_Standard, the Special settings can be obtained simply by using one pragma statement:

        pragma Form_Set (Special);

If the Pretty_Print_Format is set to Special, the NASA_Standard can still be obtained by:

        pragma Form_Set (NASA_Standard);


10. Error Messages:

Error messages can appear in several places. They can be in the terminal's output listing, in the same output when it is sent to a file by the batch processing mode, or in the formatted source code file.

The error messages sent to the output listing or output file include:

        Parse Error Encountered in Line 20
        ** this file contains goto_statement **
        ** Storage_Error raised **
        ** exception raised **

When the "parse error" message appears in the output listing there will be a corresponding error message in the -PP file (which will not be formatted) at the designated line which will look like:

        -- PARSE ERROR => @@^$^@

where the text after the => indicates the unparsable token. Often this error message will appear several times after a given line that will not parse because the parser will try to start parsing again with an unrecognizable piece of code. When this happens the last error message line following a given line of code will be the one that was generated first, and the token that is designated will be the one on which the parser failed. Error messages will also appear at the end of any construct that enclosed the original error.

The "file contains" message is not actually an error. It is a flag designating Ada code that does not adhere to the Ada Style Guide. Not all violations of the Ada Style Guide are identified in this utility.
(Some simply cannot be checked and other checks were not attempted.)

The file causing a storage error in a list of files will be the last file processed because the pointers for the data base are controlled rather than recovered. Therefore, the space is unusable for any other purpose and would only cause continual storage errors. However, the totals report is prepared and printed to a file (after the buffers are released by closing the file being processed), but the totals report will not include statistics on the file that caused the storage error.


11. Limitations and Known Problems:

Three limitations and known problems are enumerated in the following paragraphs.

One limitation which will affect many people is that the input source code is limited to 80 columns. Extending this limit is a very desirable enhancement for which there simply was not enough time. The limitation can be worked around by editing the code and inserting a line break between any two tokens in the long line before pretty printing.

One known problem occurs when the Pretty Printer appends to the end of a line of code from the following line. Sometimes the spacing is not even. This can be corrected simply by rerunning the Pretty Printer using the -PP file as the input file.

Another known problem occurs when a "begin" is followed by a "--" which is followed by an end of line with nothing but space between them. A parse error is generated because the Pretty Printer is looking for a syntactic comment. This can be corrected by editing the code and either removing the singular "--" or inserting something after it before pretty printing.


Appendix

Ada Pretty Printer Code Documentation

    1.  Introduction to Code
    2.  Package Tokens and a Description of the Data Structure
    3.  Scan
    4.  Parse_Ada
    5.  Format
    6.  Print
    7.  Status_Report
    8.  Restart
    9.  Dump
    10. Pretty_Print
    11. Token
    12. Source_Line
    13. Page
    14. Symbol_Text_Record
    15. Symbol_Table_Entry
    16. Parse_Tree_Node
    17. List_Record
    18. Reusable Code

    Ada Pretty Printer Notification


1. Introduction to Code:

The Pretty Printer is built around a complex data structure which is declared in package Tokens. This data structure is composed of seven interconnected record structures. These include Source_Line, Token, Symbol_Table_Entry, Parse_Tree_Node, and several others which will be described in detail in the following sections. Use of this data structure is available through seven major routines. These routines are Scan, Parse_Ada, Format, Print, Status_Report, Restart, and Dump which are described in the following sections. Several other units are also described which include the main routine, Pretty_Print, and several routines which could be reusable such as Reserved_Words_Package.

----------------------------------------------------------------

2. Package Tokens and a General Description of the Data Structure:

As described in the Introduction, the data structure in package Tokens is the heart of the Pretty Printer. Package Tokens contains more than the data structure. It contains the objects that serve as the hooks on which the data structure network hangs and a large number of the subroutines that serve the data structure.

The data structure is made up of records with access pointers that serve to make the records both self recursive and interconnecting. These records are:

           record name          accessed by            points into
        ------------------  --------------------       -----------
        1. Token               Token_Pointer           (1) 2 5 6 7
        2. Source_Line         Source_Line_Pointer     1 (2)
        3. Page                Page_Pointer
        4. Symbol_Text_Record  Symbol_Text_Pointer     3 (4)
        5. Symbol_Table_Entry  Symbol_Table_Pointer    1 4 (5)
        6. Parse_Tree_Node     Parse_Tree_Pointer      1 (6)
        7. List_Record         List_Pointer            (7)

The first group of objects that provide access to this structure is the group of objects and subroutines that control the freed records.
They are First_Free_X, Last_Free_X, Blank_X, function New_X, and procedure Free_X where _X is the record name.

The next group of objects that provide access to this structure is the group of objects that serve as the roots of the tables, tree structures, and cross reference lists. They include Starter (the root for the Source_Line and Token structure), Reserved_Table (Symbol_Table for reserved words), Table (Symbol_Table for other tokens organized by token kind), First_Rule (cross reference for parse rules), and Parse_Tree (root of the Parse Tree). Many of the cross reference linkages are in the records of the data structure.

The next group of objects that provide access to this structure is the group of objects that serve as the current position pointers used in the utilities. These include Current_Line, Last_Line, Current_Token, Current_Page, Page_Pos, and Last_Rule. Also Error_In_Code provides an inter-routine flag.

Several objects also appear in the Tokens package which are primarily for the parse table generator (which uses the same data structure). These include Kind_Of_Scan, Class_Is_Open, Production, Defn_Count, and Class_Count.

The Tokens package also includes one other package declaration, the State_Stack_Package. This package is used only in the Parse_Ada routine, but it is included here because of implementation considerations.

The Tokens package declares two categories of subroutines, the data structure utility routines and the visible routines. The utility routines can be accessed only by the visible routines and other utility routines. For descriptions of these routines see the descriptions in the documentary comments with the subroutine declarations. The visible routines, Scan, Parse_Ada, Format, Print, Status_Report, Restart, and Dump, are described in the following sections.

Two visible routines are not provided with the rest of this code. They are Generate_Tables and Edit.
The Generate_Tables code contains more units than the rest of the Pretty Printer code but is not used directly in the Pretty Printer. The code can be obtained by contacting NASA. The Edit routine was the original objective of the development project, but has not been developed yet. Both of these routines have stubs in the delivered code.

----------------------------------------------------------------

3. Scan:

This routine performs a lexical scan of Ada source code or a properly formatted syntax file. The products of this procedure are the source line, token, and symbol table data structures used by all the other routines in this package.

This routine will read the source file and scan through the file one character at a time identifying the tokens and creating the symbol text and symbol table entries. The symbol table entries will all be cross referenced during the scan routine.

Some of the more important routines used in the scan are Read_Line, Add_Line, Start_Identifier, Finish_Identifier, Check_Reserved, Start_Number, Start_Symbol, Start_String, Start_Comment, Start_Comment_Word, Add_New_Token, Update_Source_Line, Add_To_Symbol_Table, Add_Symbol_Text, Add_Symbol_Table_Entry, and Assign_Cross_Reference. The Read_Line routine is fairly intuitive but it does contain the call to Add_Line which creates the Source_Line data structure. The Start_XX routines along with the Scan routine itself do the actual character by character identifying of tokens. The Add_XX routines build the named parts of the data structure. The other routines named serve to interconnect the data structure.

----------------------------------------------------------------

4. Parse_Ada:

This procedure parses the internal tokens data structure made by the Scan routine and produces an internal parse table data structure used by the Format, Status_Report, and Dump routines. The parse procedure used is a simple LR parse.
The Parse_Ada routine is the only routine using the State_Stack_Package. The parse tables are generated by the Generate_Tables routine, which can also be obtained from NASA. The action tables are compressed: the table position is the uncompressed table position (called Unique in the code) modulo the table length, and the conflict jump is the number of whole table lengths contained in Unique (that is, Unique divided by the table length). Comments are handled by creating an individual parse tree for each comment and attaching it to the first non-comment token on the line that the comment is in. (Left comments are attached to the EOL token.) This routine also does some minimal error recovery and places error messages in the source line and token data structure.

Some of the more important routines used in Parse_Ada are Assign_Key_Values, Push_State_Stack, Pop_Top_Node, Attach_Comment, Error_Fix, and Add_Error_Comment. The Assign_Key_Values routine identifies the individual tokens and gives them keys. These keys determine the action to be taken by the parser. Every token other than EOL or EOF is given a parse node as its initial sub-parse-tree. The Push_State_Stack and Pop_Top_Node routines enter and remove the sub-parse-trees and states from the state stack. The Attach_Comment routine handles comments as described above (which would not be necessary if the parser were reused as the front end of a compiler). To reuse this code as an incremental parser, the final assignment to the Parse_Tree would simply be changed to whatever parse node was the father of the code being incrementally parsed. The Error_Fix routine simply takes the sub-parse-trees in the state stack, as well as any unidentifiable tokens that follow, and collects them as sons under a single error node. The parse continues with the first token it expects to recognize.

----------------------------------------------------------------

5. Format:

The Format routine modifies the capitalization, spacing, position of line breaks, and indentation in the source line and token data structure to correspond to the designated pretty print standard.

The formatting process simply takes the directives which were attached to the parse nodes during parsing and moves them to the appropriate tokens. It then modifies the data structure according to the directives and parameter settings.

Some of the more important routines used in the Format routine are Set_Pragma_Parameters, Identify_Parameter, Standardize_Parameters, Standardize_Case, Verify_Syntactic_Comments, Move_Actions_To_Tokens, Select_Pretty_Print_Action, EOL_Action, Action_# (where # is the directive number), Set_Token_Spacing, Format_Right_Comment, and Format_Left_Comment. The Set_Pragma_Parameters, Identify_Parameter, Standardize_Parameters, and other Set_XX routines (where XX is the name of a parameter) are all associated with the settings of parameters in the Pretty_Print_Parameters package. The Standardize_Case routine and its subroutines modify the capitalization of the various identifiers. (The reserved words will always be in lower case.) The Move_Actions_To_Tokens routine moves the action directives from the parse tree, where they are attached during the parse, to the individual tokens where they act. The other routines all perform the action functions that format the Source_Line and Token data structure by adjusting the location of line breaks, indentation, and token spacing.

----------------------------------------------------------------

6. Print:

This procedure produces a formatted source code file from the internal Source_Line and Token data structure.

The routines are Print and Print_Token. The Print routine coordinates which lines are printed, and the Print_Token routine prints the spaces and display text of a token.

----------------------------------------------------------------

7. Status_Report:

This routine produces a status report on statement counts for a specified file or for the totals from a list of files.

The subroutines are Calculate_Counts, Sum_Counts, Print_Counts, and Print_Totals. These routines are the primary users of the Status_Data package. The statistics are gathered primarily from the cross reference counts of parse tree node types. Some are calculated using the relationships of nodes in the parse tree. The Parse_Keys package and the Syntax.Dat file will be very valuable to anyone who wants to modify the selection of statistics.

----------------------------------------------------------------

8. Restart:

This routine resets the variables of this package to their initial state. The hooks for the data structures are cleared and the individual records are released to their free lists for reuse.

Note: Since the data structure is controlled, once a storage error occurs the data structure is corrupted and can no longer be properly recovered.

The major subroutines of this routine are Free_All_Symbol_Tables, Free_All_Source_Lines_And_Tokens, Free_Parse_Tree, and the Free_X routines of the Tokens package (where X is a data record structure). These routines traverse the data structures and release the records to their various free lists.

----------------------------------------------------------------

9. Dump:

This routine dumps the internal token and parse data structures into files. This routine and its Dump_XX and Print_XX routines were written as part of the debug reporting for the Generate_Tables routine, but they are included here because they may be helpful to anyone wanting a better look into the data structure. They will need to be rewritten slightly for your individual purposes.

----------------------------------------------------------------

10. Pretty_Print:

The main routine of the Pretty Printer guides the flow of control through the routines discussed above.
The Pretty Printer operates along two main paths, one for single file processing, the other for a list of files.

This routine starts by checking for the presence of a file called Files_In.LST to determine if batch processing is in effect. If it is present and the first word is "batch", then the default output is reset to the filename indicated. If the Files_In.LST file is not present, then a prompt is sent to the user to designate the file to be formatted. If the filename entered by the user is not prefaced by an "at" sign, then the processing path for the single file is executed. Otherwise the processing path for the list of files is executed.

----------------------------------------------------------------

11. Token:

   type TOKEN_KIND is
      ( UNKNOWN,
        RESERVED_WORD,
        IDENTIFIER,
        NUMERIC_LITERAL,
        CHARACTER_LITERAL,
        STRING_LITERAL,
        COMMENT_WORD,
        SPECIAL,
        OPERATOR,
        OTHER );

   type TOKEN_POINTER is access TOKEN;

   type TOKEN is
      record
         Number             : INTEGER              := 0;
         Line               : SOURCE_LINE_POINTER  := null;
         Spaces             : INTEGER              := 0;
         Column             : INTEGER              := 0;
         Next               : TOKEN_POINTER        := null;
         Kind               : TOKEN_KIND           := UNKNOWN;
         Symbol             : SYMBOL_TABLE_POINTER := null;
         Rule               : INTEGER              := 0;
         Construct_Position : NATURAL              := 0;
         Directive          : NATURAL              := 0;
         Dir_List           : LIST_POINTER         := null;
         X_Ref_Next         : TOKEN_POINTER        := null;
         Parse              : PARSE_TREE_POINTER   := null;
      end record;

This data structure record is the focus of the pretty printer. It provides connections to every other data structure either directly or indirectly. (See list in Section 2.)

Most Token records are created during the Scan routine. Others are added by the Format routine. As they are added to the data structure, the text from the original source file that the token represents is placed in the symbol table, with the appropriate references and cross references in the Token record. In the Parse_Ada routine each Token record receives a Parse_Tree_Node record as its individual connection to the parse tree.
At the beginning of the Format routine, the action directives are collected and added to the Dir_List in parse tree order and are acted on in source line order. The Print routine creates the formatted source code for output from the Token records, and the Restart routine frees them for reuse.

The individual components are used as follows:

Number : identifies the order of the tokens and is used only for debug purposes.

Line : points to the Source_Line record to which the Token record currently belongs.

Spaces : designates the number of spaces in front of the Token. This value is modified by the Format routine to make the code pretty.

Column : designates the position, counted from the beginning of the line of code, of the last character of the preceding token in the line.

Next : points to the next Token record.

Kind : categorizes the token. (See the Token_Kind type declaration.) SPECIAL includes EOF, EOL, etc.

Symbol : points to the Symbol_Table record, which points to the Symbol_Text record, which points to the Page and the Position on the page containing the token's text string. This text string has its capitalization modified by the Format routine to make the code pretty.

Rule : is used only in the Generate_Tables routine.

Construct_Position : is used only in the Generate_Tables routine.

Directive : is used only in the Generate_Tables routine.

Dir_List : points to the head of the List records which contain the action directives.

X_Ref_Next : points to the next Token record having the same symbol.

Parse : points to the Parse_Tree_Node that is attached to this Token.

----------------------------------------------------------------

12. Source_Line:

   type SOURCE_LINE_POINTER is access SOURCE_LINE;

   type SOURCE_LINE is
      record
         Line_No     : INTEGER             := 0;
         First_Token : TOKEN_POINTER       := null;
         Last_Token  : TOKEN_POINTER       := null;
         Token_Count : NATURAL             := 0;
         Prior       : SOURCE_LINE_POINTER := null;
         Next        : SOURCE_LINE_POINTER := null;
      end record;

Most of the Source_Line records are generated by the Scan routine. Others are added by the Format routine to make the code pretty. The combination of the Source_Line records and Token records corresponds very closely to the source code.

The individual components are used as follows:

Line_No : identifies the order of the source lines and is used primarily for debug purposes. The Print routine will accept a range of line numbers for printing. Note: because the Format routine does not completely renumber the lines, ranges can get misrepresented if the lines are not renumbered first. Therefore, if you reuse the code to print portions of code by line number, be sure to renumber first.

First_Token : points to the first Token record in the line.

Last_Token : points to the last Token record in the line.

Token_Count : indicates the number of Token records in the line.

Prior : points to the prior Source_Line record.

Next : points to the next Source_Line record, both in the Source_Line structure and in the free list.

----------------------------------------------------------------

13. Page:

   type PAGE_POINTER is access PAGE;

   subtype PAGE is STRING (1 .. Page_Size);

   Page_Pos : NATURAL := 0;

The Page array is used to hold both a display copy and a match copy of the symbol text. The match copy is always all lower case. Literals have only one copy, since they are never compared for a match, and reserved words have only one copy, since their display form and match form are always the same (all lower case). This array is the target of the Symbol_Text pointers. Since this is an array, the free list is maintained by a separate array of pointers.
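
The display-copy and match-copy arrangement can be sketched as follows. This is a minimal illustration under stated assumptions, not the Pretty Printer's code: the Page_Size value, the single statically declared page, the Text_Ref record (a cut-down stand-in for a position-and-length reference), and the names Page_Demo, Add_Text, Text_Of, and Page_Full are all hypothetical. The local Lower_Case function is written for this sketch and is not the library procedure of the same name.

```ada
with Text_IO;

procedure Page_Demo is
   Page_Size : constant := 64;        -- assumed; the real value is not given
   subtype Page is String (1 .. Page_Size);

   type Text_Ref is                   -- hypothetical position/length reference
      record
         Pos : Natural := 0;          -- first character of the text
         Len : Natural := 0;          -- number of characters
      end record;

   The_Page : Page    := (others => ' ');
   Page_Pos : Natural := 0;           -- last position used on the page
   Display  : Text_Ref;
   Match    : Text_Ref;

   Page_Full : exception;

   function Lower_Case (S : in String) return String is
      R : String (S'Range) := S;
   begin
      for I in R'Range loop
         if R (I) in 'A' .. 'Z' then
            R (I) := Character'Val (Character'Pos (R (I)) + 32);
         end if;
      end loop;
      return R;
   end Lower_Case;

   --  Append S to the page and report where it was stored.
   procedure Add_Text (S : in String; Ref : out Text_Ref) is
   begin
      if Page_Pos + S'Length > Page_Size then
         raise Page_Full;
      end if;
      The_Page (Page_Pos + 1 .. Page_Pos + S'Length) := S;
      Ref      := (Pos => Page_Pos + 1, Len => S'Length);
      Page_Pos := Page_Pos + S'Length;
   end Add_Text;

   function Text_Of (Ref : in Text_Ref) return String is
   begin
      return The_Page (Ref.Pos .. Ref.Pos + Ref.Len - 1);
   end Text_Of;

begin
   --  An identifier gets two copies: one as written, one in lower case.
   Add_Text ("Token_Count", Display);
   Add_Text (Lower_Case ("Token_Count"), Match);

   --  Self-checks: any failure raises Program_Error.
   if Text_Of (Display) /= "Token_Count" then raise Program_Error; end if;
   if Text_Of (Match) /= "token_count" then raise Program_Error; end if;
   if Match.Pos /= 12 or Page_Pos /= 22 then raise Program_Error; end if;
   Text_IO.Put_Line ("Page sketch: all checks passed");
end Page_Demo;
```

Keeping the match copy separate lets the Format routine change the capitalization of the display copy without disturbing comparisons, which is the design the section describes.
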
----------------------------------------------------------------

14. Symbol_Text_Record:

   type SYMBOL_TEXT_POINTER is access SYMBOL_TEXT_RECORD;

   type SYMBOL_TEXT_RECORD is
      record
         Page : PAGE_POINTER        := null;
         Pos  : NATURAL             := 0;
         Len  : NATURAL             := 0;
         Next : SYMBOL_TEXT_POINTER := null;
      end record;

The Symbol_Text record designates the Page and the position and length of the text on the page. This record structure is pointed at by the Symbol_Table record.

The individual components are used as follows:

Page : points to the page containing the text.

Pos : references the starting position of the first character of the text.

Len : indicates the number of characters in the text.

Next : is used by the free list.

----------------------------------------------------------------

15. Symbol_Table_Entry:

   type CAP_TYPE is
      ( NORMAL,
        ENUMERATION,
        TYPE_SUBTYPE,
        SPECIAL );

   type SYMBOL_TABLE_POINTER is access SYMBOL_TABLE_ENTRY;

   type SYMBOL_TABLE_ENTRY is
      record
         Key         : NATURAL              := 0;
         Match       : SYMBOL_TEXT_POINTER  := null;
         Display     : SYMBOL_TEXT_POINTER  := null;
         Cap         : CAP_TYPE             := NORMAL;
         Prior       : SYMBOL_TABLE_POINTER := null;
         Next        : SYMBOL_TABLE_POINTER := null;
         Lesser      : SYMBOL_TABLE_POINTER := null;
         Greater     : SYMBOL_TABLE_POINTER := null;
         Count       : NATURAL              := 0;
         X_Ref_First : TOKEN_POINTER        := null;
         X_Ref_Last  : TOKEN_POINTER        := null;
      end record;

   type SYMBOL_TABLE_HEADER is
      record
         First : SYMBOL_TABLE_POINTER := null;
         Head  : SYMBOL_TABLE_POINTER := null;
         Last  : SYMBOL_TABLE_POINTER := null;
      end record;

   type SYMBOL_TABLE_ARRAY is array (TOKEN_KIND) of SYMBOL_TABLE_HEADER;

The Symbol_Table_Entry record is created by the Scan routine for each unmatched token found in the source code. The Symbol_Text records are created at the same time, and so are the Page entries. When a matching token symbol entry is found, it is added to the cross reference list of the matched Symbol_Table_Entry.
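
The match-or-create step can be sketched with an unbalanced binary tree keyed on the lower-case match text. This is an illustration only: the fixed-length Match buffer, the Max_Len limit, and the names Symbol_Demo, Symbol_Entry, Symbol_Ptr, New_Entry, and Add_Symbol are hypothetical simplifications. The real entry points at Symbol_Text records and also maintains an alphabetical list and a token cross reference chain, none of which is modelled here beyond a count.

```ada
with Text_IO;

procedure Symbol_Demo is
   Max_Len : constant := 32;           -- assumed limit, for this sketch only

   type Symbol_Entry;
   type Symbol_Ptr is access Symbol_Entry;
   type Symbol_Entry is
      record
         Match   : String (1 .. Max_Len) := (others => ' ');
         Len     : Natural    := 0;
         Count   : Natural    := 0;    -- tokens in the cross reference list
         Lesser  : Symbol_Ptr := null;
         Greater : Symbol_Ptr := null;
      end record;

   Head    : Symbol_Ptr := null;       -- root of the binary tree
   A, B, C : Symbol_Ptr;

   function Lower_Case (S : in String) return String is
      R : String (S'Range) := S;
   begin
      for I in R'Range loop
         if R (I) in 'A' .. 'Z' then
            R (I) := Character'Val (Character'Pos (R (I)) + 32);
         end if;
      end loop;
      return R;
   end Lower_Case;

   function New_Entry (Key : in String) return Symbol_Ptr is
      E : constant Symbol_Ptr := new Symbol_Entry;
   begin
      E.Match (1 .. Key'Length) := Key;   -- Constraint_Error if too long
      E.Len   := Key'Length;
      E.Count := 1;
      return E;
   end New_Entry;

   --  Return the entry for Text, creating it if no match exists yet.
   function Add_Symbol (Text : in String) return Symbol_Ptr is
      Key  : constant String := Lower_Case (Text);
      Node : Symbol_Ptr      := Head;
   begin
      while Node /= null loop
         if Key = Node.Match (1 .. Node.Len) then
            Node.Count := Node.Count + 1;      -- matched: one more reference
            return Node;
         elsif Key < Node.Match (1 .. Node.Len) then
            if Node.Lesser = null then
               Node.Lesser := New_Entry (Key);
               return Node.Lesser;
            end if;
            Node := Node.Lesser;
         else
            if Node.Greater = null then
               Node.Greater := New_Entry (Key);
               return Node.Greater;
            end if;
            Node := Node.Greater;
         end if;
      end loop;
      Head := New_Entry (Key);                 -- first symbol becomes the root
      return Head;
   end Add_Symbol;

begin
   A := Add_Symbol ("Line");
   B := Add_Symbol ("Column");
   C := Add_Symbol ("LINE");                   -- same symbol, different case

   --  Self-checks: any failure raises Program_Error.
   if A /= C or A.Count /= 2 then raise Program_Error; end if;
   if Head /= A or Head.Lesser /= B then raise Program_Error; end if;
   if B.Count /= 1 then raise Program_Error; end if;
   Text_IO.Put_Line ("Symbol table sketch: all checks passed");
end Symbol_Demo;
```

Because entries are inserted in order of appearance with no rebalancing, the shape of the tree depends entirely on the order of symbols in the source file.
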
Each Symbol_Table_Entry is part of a Symbol_Table which is organized in two ways: as an alphabetical list and as a binary tree. Both start with a Symbol_Table_Header, which contains the first and last entries of the alphabetical list and the head (root) of the binary tree. The binary tree is created by order of appearance in the source code. (No special balancing is done to improve speed.)

The individual components are used as follows:

Key : is assigned and used in the Parse_Ada routine. These Keys can be found in the Parse_Keys package.

Match : points to the Symbol_Text record used for comparisons in the binary search.

Display : points to the Symbol_Text record which is modified by the Format routine to make the code pretty and which is used by the Print routine.

Cap : designates the form of capitalization to be used. (See the Cap_Type.) When the Form pragma is used, the Special value is assigned. Otherwise, the capitalization is modified in the Format routine according to the parameters designated for the capitalization type.

Prior : points to the prior Symbol_Table_Entry in the alphabetic list.

Next : points to the next Symbol_Table_Entry in the alphabetic list. This component is also used by the free list.

Lesser : points to the prior Symbol_Table_Entry in the binary tree.

Greater : points to the next Symbol_Table_Entry in the binary tree.

Count : indicates the number of tokens in the cross reference list.

X_Ref_First : points to the first token in the cross reference list.

X_Ref_Last : points to the last token in the cross reference list.

----------------------------------------------------------------

16. Parse_Tree_Node:

   type PARSE_TREE_POINTER is access PARSE_TREE_NODE;

   type PARSE_TREE_NODE is
      record
         Number    : NATURAL            := 0;
         Rule      : NATURAL            := 0;
         Rule_Next : PARSE_TREE_POINTER := null;
         Pp_Dir    : NATURAL            := 0;
         Token     : TOKEN_POINTER      := null;
         Father    : PARSE_TREE_POINTER := null;
         Brother   : PARSE_TREE_POINTER := null;
         Comment   : PARSE_TREE_POINTER := null;
         Sons      : NATURAL            := 0;
         First_Son : PARSE_TREE_POINTER := null;
         Last_Son  : PARSE_TREE_POINTER := null;
      end record;

The Parse_Tree_Node record is the second most important member of the data structure. This record provides the direct link between the Ada Syntax and the code, and it is the avenue for taking the action directives that are associated with the Syntax and attaching them to the tokens. The Parse Tree is a simple tree structure, with each node having only one father and either a token or a sub-parse-tree as a subordinate. The first level of any sub-parse-tree can have zero to many sons. Each Parse_Tree_Node designates its own immediate next Brother node in this first level of sons. All the brothers designate the same Father node. The Parse Tree is created by the parser in the Parse_Ada routine. (The Parse Tree could alternately be created by selecting syntax structures in a language sensitive editor. Since this was the originally intended product, the Parse_Tables package and the related data files already contain the data to build these constructs directly.)

The individual components are used as follows:

Number : indicates the order of the Parse_Tree_Nodes. This is primarily for debug purposes.

Rule : designates the Syntax rule associated with a sub-parse-tree. This corresponds to the Parse Keys in the Parse_Keys package and to the Key in the Token record. It is assigned as the parser performs its reduce step.

Rule_Next : points to the next Parse_Tree_Node in the cross reference list for the rule indicated.
Pp_Dir : is the action directive associated with the given position of the father node's rule. The values for this component are provided in the Directives table of the Parse_Tree package or the associated data file.

Token : points to the Token record that is associated with this node. The Token component will always be null when the node has one or more son nodes.

Father : points to the father node of the parse tree.

Brother : points to the next brother node of the parse tree. This component is also used by the free list.

Comment : points to a special sub-parse-tree which is a right comment. This will usually only be used when there is a token record subordinate.

Sons : indicates the number of son nodes in the sub-parse-tree.

First_Son : points to the first son node of the sub-parse-tree.

Last_Son : points to the last son node of the sub-parse-tree.

----------------------------------------------------------------

17. List_Record:

   type LIST_POINTER is access LIST_RECORD;

   type LIST_RECORD is
      record
         Ref  : NATURAL      := 0;
         Next : LIST_POINTER := null;
      end record;

The last record structure used in the data structure of the pretty printer is the List record. This is simply a recursive list. Its only use in the Pretty Printer is for attaching multiple directives to a given token. (It is used for several other kinds of lists in the Generate_Tables routine.)

The individual components are used as follows:

Ref : designates the reference value of the action directive.

Next : points to the next List record. This component is also used by the free list.

----------------------------------------------------------------

18. Reusable Code:

Not only can the code be used as a readily modifiable Ada pretty printer, but with reasonably little modification this code could be used for any language syntax by generating the parse tables using the associated Generate_Tables routine.
Similarly the code can be used as a basis for development of a language sensitive editor. Or it could even be used as the front end of a compiler.

Some individual components which might be reusable individually include the package Reserved_Words, the procedure Lower_Case, and the procedure Next_Prime.