RISC World

BASIC to C

Martin Carradus explains how to use his BASIC to C translator

Introduction to the use of !BBC_C

These notes describe what the application !BBC_C does and the options it presents to the user. !BBC_C translates an Acorn BBC BASIC program, held in a BBC BASIC file, into Acorn ANSI C, written as a text file suitable for an Acorn C compiler. To understand the advantages of translating from BASIC to C, you need to know something about the difference between compilers (C) and interpreters (BASIC).

All computer languages need to be translated into the fundamental instructions of the particular computer (machine code). A compiler is presented with the complete program and translates the whole thing into machine code. The machine code is then run on the computer in a separate stage. An interpreter, however, translates the program and obeys it as it goes along. The advantage is that you do not need a preceding stage before you run the program. The disadvantage is that, since the whole program has not been "seen", the final code is usually less efficient and runs more slowly than compiled code. The compiled C that !BBC_C produces seems to run about three times faster than the same interpreted BBC BASIC.

In order for an interpreter to execute the program efficiently, it is usually coded so that it can identify keywords/commands quickly. The main "trick" is to code the keywords (words like GOTO, or IF or PRINT) into only a couple of characters (bytes), known as a token. This means that they can be easily recognised every time the interpreter runs across them in the program. There are over 160 different "tokens" used in BBC BASIC files.

At the beginning of each line of the program, the interpreter will need to know the line number of that line and where the next line is. Both are coded for greater efficiency.
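As a rough illustration of how tokens make keywords quick to recognise, the sketch below expands a tokenised line body back into text with a small lookup table. It is not taken from !BBC_C. The function names are hypothetical, only single-byte tokens are handled, the three token values are quoted from memory of the BBC BASIC token table and should be checked before being relied upon, and the coded line number and string literals are ignored.

```c
#include <stddef.h>
#include <string.h>

/* A few single-byte keyword tokens. The values are assumptions quoted
   from memory of the BBC BASIC token table; check them before use. */
struct token_entry {
    unsigned char token;
    const char *keyword;
};

static const struct token_entry token_table[] = {
    { 0xE5, "GOTO" },
    { 0xE7, "IF" },
    { 0xF1, "PRINT" },
};

/* Return the keyword for a token byte, or NULL if it is not in the table. */
const char *keyword_for_token(unsigned char token)
{
    size_t i;

    for (i = 0; i < sizeof token_table / sizeof token_table[0]; i++) {
        if (token_table[i].token == token)
            return token_table[i].keyword;
    }
    return NULL;
}

/* Expand the body of one tokenised line into plain text.
   Returns 0 on success, -1 if the output buffer is too small. */
int expand_line(const unsigned char *body, size_t length,
                char *out, size_t out_size)
{
    size_t used = 0;
    size_t i;

    if (out_size == 0)
        return -1;
    for (i = 0; i < length; i++) {
        const char *keyword = keyword_for_token(body[i]);
        char single[2];
        const char *piece;
        size_t piece_len;

        if (keyword != NULL) {
            piece = keyword;              /* a token: copy the whole keyword */
        } else {
            single[0] = (char)body[i];    /* an ordinary character */
            single[1] = '\0';
            piece = single;
        }
        piece_len = strlen(piece);
        if (used + piece_len + 1 > out_size)
            return -1;
        memcpy(out + used, piece, piece_len);
        used += piece_len;
    }
    out[used] = '\0';
    return 0;
}
```

The interpreter only has to compare one byte to recognise a keyword, rather than matching the keyword letter by letter every time the line is executed.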

Almost all types of BASIC use an interpreter.

Due to the nature of BASIC, it can do things "on the fly" that a compiled language cannot. Also BBC BASIC was designed without any formal structure (the grammar of the language) in mind, whereas the C language has a formal grammar. In particular, the C language and its compiler are very particular about the mixture of different data types (i.e. storage locations (variables) designated as integer, floating point or character string) whereas BBC BASIC uses these locations interchangeably. For these reasons, you should know something about the C language before using !BBC_C in order to understand any compiler error messages.

Using !BBC_C

Drag the application to a safe (preferably new) directory on your hard disk. Ensure you have the latest versions of the modules required by using !SysMerge with the version of !System supplied. !BBC_C issues a warning message if your modules are out of date. For RISC PCs, double click on !Boot. Clicking on "System" on the panel presented offers you the option of merging the !System supplied on this disc with your !System.

Double click with the mouse on the !BBC_C icon to install it on the icon bar. Click on the bar icon to bring up the application panel. Click the "menu" button on your mouse over the bar icon for other functions, including Info and Help. The RISC OS on-line !Help application gives descriptions of the options available from the main application panel.

Run the BBC BASIC program through !BBC_C by typing in the file name, dragging the file icon into the application window, double clicking on the file icon or dragging the file icon onto the bar icon. Choose the options required from the panel presented and click on the "Run" button. The effect of these options is more fully described below.

The C text file is eventually offered for saving with its name prefixed by a capital "C". After checking it, principally for variably dimensioned arrays and undeclared variables (commented out), place it in a directory called "c". This directory must sit inside another directory alongside a directory "h", which contains the files "leaf" and "data", and a directory "o", all of which are needed by the compiler.

Present the C file to the compiler. There may be many messages concerned with incorrect casts as BBC BASIC is not too particular about data types, whereas the C compiler is. The author has attempted to minimise such messages.

A beta version of "LeafLib" is included in the directory "$.o". Include its file name under "Libraries", from the menu associated with the compiler panel, separated by a comma from the other standard libraries. It is needed during the "Link" phase of the compilation in order to supply certain routines not given in the Acorn BBC specific library.

Though every attempt has been made to encompass all forms of construct available in BBC BASIC, !BBC_C still sometimes doesn't recover from syntax errors and produces a blank output file.

Additional Notes for Acorn C Compiler Version 5

For the Acorn C Compiler Version 5 onwards you will also need the non-standard header files "bbc.h" and "os.h" in your project "h" directory, and you must reference the library "RISC_OSlib" (needed during the Link phase) under the "Libraries" option off the main panel. This is in addition to "LeafLib".

The Translator now only seems to object to the operator "=" if it is a test for equality. If you mean it to be this, enclose it in brackets (e.g. A=(B=C) instead of A=B=C). Similarly, "=" used in FNs to return values must be separated from any preceding expression. Logical expressions after IF, WHILE and UNTIL are taken to be C logical expressions (with "&&", "||", "^^" and "!" for AND, OR, EOR and NOT respectively). If any logical sub-expression is bracketed, then it is taken in a bitwise manner with "&", "|", "^" and "~". In all other contexts logical expressions are taken to be bitwise.
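The choice between the logical and the bitwise form matters because the two can give different answers in C for the same operands. The following is a minimal sketch of that difference only. The function names are hypothetical and the code is not output from !BBC_C.

```c
/* C logical AND: the result is 1 when both operands are non-zero,
   otherwise 0. */
int logical_and(int a, int b)
{
    return a && b;
}

/* C bitwise AND: each bit of the result is set only if it is set in
   both operands. */
int bitwise_and(int a, int b)
{
    return a & b;
}
```

With the operands 1 and 2, logical_and gives 1 (true) but bitwise_and gives 0 (false), since the two values share no bits. When the operands are restricted to -1 and 0, both forms agree on whether the result is true or false.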

FNs returning character strings are sometimes taken as returning a numerical value and the Translator grammar objects. You can force such FNs to be character string type by giving them a name postfixed with "$" (e.g. strcat$) to indicate that the return value is a character string. Generally the Translator determines the return value of a FN by the context in which it finds the FN.

For the reasons above, a logical expression starting with an FN will be taken to be numeric to start with. If the FN is in fact returning a character string and this is being compared with a string literal, e.g. FNmess>"Hello", then reversing the expression, i.e. "Hello"<FNmess, will make !BBC_C take the FN as returning a character string. !BBC_C then does not produce an error message. Also !BBC_C "remembers" the return value of a FN from above.

Another advantage of translation to C is that C is a portable language and compilers for it exist on other brands of computer. For simple programs it is possible to get the translated BBC BASIC program to do the same thing on another computer. However, more complicated BBC BASIC programs make use of facilities that only exist on Acorn computers and which are handled differently on other machines.

Two examples are the use of raw Acorn assembler (translates directly into the machine code of Acorn computers) and SWIs (SoftWare Interrupts). SWIs are pre-written routines that enable the handling of various Acorn constructs and do not exist on other machines. There are also literally hundreds of SWIs that are available.

Lastly BBC BASIC programs sometimes make calls to the operating system of the Acorn computer and the commands are different on other computers.

Using the !BBC_C Set Up Panel.

Input Slot.
The name of the BBC BASIC file to be translated can be typed in here. The name will be inserted if the file concerned is dragged onto the panel or dragged to the !BBC_C icon on the icon bar. The file is "grabbed" if it is double clicked upon. The !BBC_C program will detect non BBC BASIC files when run.

Run Button.
Runs the !BBC_C program when clicked upon. Various messages are displayed on the screen to indicate the progress of the translation, followed by a message giving the number of identifiers used in the BBC BASIC program. The number of constructs not recognised by the grammar (syntax errors) is also given. Syntax errors result in comments in the generated C and error message panels on the screen. These panels are usually suppressed (see below).

The generated C is offered for saving within the same directory as the source BASIC program with the same name prefixed by a capital "C". Click on "OK" or press the Return key if you are satisfied with this, otherwise alter the file name. If you hold down "Control" and press "U", the slot is cleared and a new name can be typed in and the icon dragged to the directory of your choice.
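The default output name can be sketched as follows, assuming the usual RISC OS convention that directories in a path are separated by ".". The function name and the example paths are hypothetical, and this is not how !BBC_C is known to build the name internally.

```c
#include <stddef.h>
#include <string.h>

/* Build the default output path: same directory as the source, with
   the leaf name prefixed by a capital 'C'.
   Returns 0 on success, -1 if the output buffer is too small. */
int default_output_path(const char *source, char *out, size_t out_size)
{
    const char *last_dot = strrchr(source, '.');
    size_t dir_len = (last_dot != NULL) ? (size_t)(last_dot - source) + 1 : 0;
    size_t leaf_len = strlen(source + dir_len);

    if (dir_len + 1 + leaf_len + 1 > out_size)
        return -1;
    memcpy(out, source, dir_len);        /* directory part, including '.' */
    out[dir_len] = 'C';                  /* the prefix */
    memcpy(out + dir_len + 1, source + dir_len, leaf_len + 1); /* leaf and NUL */
    return 0;
}
```

For a source file "$.Work.Prog" this gives "$.Work.CProg".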

Cancel Button.
Clicking here removes the application panel from the screen and returns you to the previous set of options.

Description Button.
Clicking here and selecting "Run" makes the !BBC_C application present the user with two panels giving a description of itself.

Help Button.
Clicking here and selecting "Run" makes the !BBC_C application present the user with two panels giving general help information. "Description" and "Help" cannot both be ticked at the same time, because selecting one deselects the other.

Verbose Button.
Clicking here makes !BBC_C output debugging information concerning the process it is going through as it scans the BBC BASIC program i.e. the identifiers, literals and tokens it is encountering. This process considerably slows up the processing, so it is advisable not to turn this feature on unless you really want to see what !BBC_C is doing. At the end of the process, by clicking "menu" over the window concerned, it is possible to save the diagnostics to a separate text file. Note that by the use of "menu" it is possible to pause, resume or abort the process. Also note that by clicking "menu" over the main panel one is able to instigate the options "Help", "Debug" or "Description" from a special menu. The opportunity to alter the command line that controls !BBC_C is also available from this menu.

Indexes Button.
This facility makes FOR loop indexes, and the indexes used in array arguments, integer variables. It was introduced because array arguments must have an integer type within the C language. However, the C compiler does not need "for" loops to use integers and, if this option is not chosen, array indexes are still given a cast of "int" if required.

Choosing this option could cause calculations with the relevant variables in other places to be truncated, so it should be used with caution.
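The truncation in question is ordinary C behaviour and can be seen in a few lines. This is a sketch only, with hypothetical function names, and is not code produced by !BBC_C.

```c
/* Dividing a double keeps the fractional part. */
double half_as_double(double x)
{
    return x / 2;
}

/* The same calculation on an int discards the fractional part. */
int half_as_int(int x)
{
    return x / 2;
}

/* Array subscripts must be integers, so a floating point index is
   cast to int, which truncates toward zero. */
double element_at(const double *array, double index)
{
    return array[(int)index];
}
```

Here half_as_double(5.0) is 2.5 but half_as_int(5) is 2, so a variable forced to be an integer because it is used as an index somewhere will lose fractions wherever else it is used.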

Single Precision Button.
When enabled, this option causes all floating point variables (those holding fractional values) to be made C type "float". Otherwise such variables are given the C type "double", which offers more precision in floating point calculations. For various reasons, this option is best not enabled.
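One of the reasons is loss of precision. The sketch below stores a value in a "float" and reads it back, assuming the usual IEEE 754 single and double precision formats. The function name is hypothetical.

```c
/* Store a value in a float and read it back as a double. Anything the
   float cannot hold exactly is rounded on the way in. */
double round_trip_through_float(double x)
{
    float single = (float)x;

    return (double)single;
}
```

Under that assumption the value 16777217 comes back as 16777216, because a "float" carries only 24 bits of precision, whereas a "double" holds it exactly.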

Remove REMs Button.
When enabled, this option prevents BBC BASIC REMs (comments) from being translated into C comments within the translated code. Since one may prevent C compilers from keeping comments in the final compiled code, this option is only included for completeness.

All Lower Case Button.
Problems can be encountered when the Translator turns all identifiers to lower case in line with C programming conventions. BBC BASIC identifiers such as "a%" and "A%" are quite frequently used and could be confused with each other if they were both converted to lower case. When this option is not enabled it prevents the first character of the identifier from being altered. With most programs you are advised not to convert entirely to lower case.

Integer, String and Float Terminators.
In BBC BASIC, integer variables are terminated by "%" and character string variables by "$", otherwise they are taken as floating point. In the C language all variables must be declared before you use them, so they can have any name you like. These writable slots offer the user an alternative value to % and $, since $ and % cannot appear in C variable names. There should not be any need to alter the default values automatically supplied. If you should alter these values, only alphabetic upper or lower case characters or the underscore character may be entered into these slots.

Note this principal difference between BBC BASIC and the C language, as mentioned above. In BBC BASIC, variable names become available immediately they are mentioned anywhere in the code whereas the C language requires that all variables must be declared before they are used.
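A small sketch of the difference, using the default integer terminator "_i" mentioned in these notes. The variable and function names are otherwise hypothetical and the layout is not necessarily what !BBC_C generates.

```c
/* BBC BASIC:  count% = 5 : count% = count% + 1
   The variable count% exists as soon as it is first assigned. */

/* C: the declaration must come before any use. The "%" cannot appear
   in a C name, so it is replaced by the terminator "_i". */
int count_i;

int run_example(void)
{
    count_i = 5;
    count_i = count_i + 1;
    return count_i;
}
```

Removing the declaration of count_i makes the C compiler reject the program, whereas the BASIC line runs as it stands.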

FN/PROC Terminators.
BBC BASIC FNs (Functions) and PROCs (Procedures) can have the same name without any conflict arising, but in the C language every identifier must be unique. The two letter terminators are postfixed before any terminator due to data type, so a FN called "mess%" with the defaults would become "messfn_i".

The BBC BASIC interpreter is also able to cope with such names beginning with numerics and containing keywords in the language. The C compiler requires that identifiers begin with an alphabetical character, so numerics are converted to "a" - "i" and "z" for zero e.g. "1st" becomes "ast", "0th" becomes "zth" and a procedure called PROCTO becomes "procto".

There should not be any need to alter the default values. As with variable name terminators only up to two upper or lower case characters or the underscore character may be entered in these slots.

Array Terminators.
As with BBC BASIC FNs and PROCs, there is no confusion if an array name is the same as a variable, FN or PROC, but the C compiler requires all identifiers to be unique. A three letter terminator for array variable names in the translated C is made available by this option. This terminator is in addition to any data type terminator, so an array name "data%" in BBC BASIC would become "dataarr_i" in the translated C with the defaults supplied.

There should not be any need to alter the default value supplied. As above only upper and lower case characters or the underscore character may be entered in this slot.
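The naming rules in the last two sections can be sketched as below. This is an illustration under assumptions, not the Translator's own code: the function names are hypothetical, only the integer terminator is handled, only a leading digit is converted, and no case conversion is attempted.

```c
#include <stddef.h>
#include <string.h>

/* Map a leading digit to a letter: '0' -> 'z', '1'..'9' -> 'a'..'i'. */
static char letter_for_digit(char digit)
{
    return (digit == '0') ? 'z' : (char)('a' + (digit - '1'));
}

/* Build a C identifier from a BBC BASIC name.
   kind_term : terminator for the kind of object, e.g. "fn" or "arr"
               ("" for none)
   int_term  : replacement for a trailing '%', e.g. "_i"
   Returns 0 on success, -1 if the output buffer is too small. */
int c_identifier(const char *basic_name, const char *kind_term,
                 const char *int_term, char *out, size_t out_size)
{
    size_t stem_len = strlen(basic_name);
    int is_integer = (stem_len > 0 && basic_name[stem_len - 1] == '%');
    size_t kind_len = strlen(kind_term);
    size_t type_len = is_integer ? strlen(int_term) : 0;

    if (is_integer)
        stem_len--;                       /* drop the '%' */
    if (stem_len + kind_len + type_len + 1 > out_size)
        return -1;

    memcpy(out, basic_name, stem_len);
    if (stem_len > 0 && out[0] >= '0' && out[0] <= '9')
        out[0] = letter_for_digit(out[0]);
    memcpy(out + stem_len, kind_term, kind_len);   /* kind terminator first */
    if (is_integer)
        memcpy(out + stem_len + kind_len, int_term, type_len);
    out[stem_len + kind_len + type_len] = '\0';
    return 0;
}
```

Called with "mess%", "fn" and "_i" it produces "messfn_i"; with "data%", "arr" and "_i" it produces "dataarr_i"; and "1st" with no terminators becomes "ast".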

DEF Argument Terminators.
It was found that the names of the formal parameters of FNs and PROCs in DEF statements were being confused with the same variable names used elsewhere. These names are purely dummy names, so this facility enables one to give these parameters a terminator that distinguishes them from other variable names. As before this terminator is in addition to the data type terminator and the default value of "arg" need not be altered.

Should you wish to disable this facility, clear out the slot.

Primary and Secondary Indentation.
The single numerics entered in these slots determine the indentation to be applied to the translated C. The primary indentation applies to the whole code and indents each code line within the main program and C functions by the supplied number of spaces, making the code more readable by supplying an initial margin to the code.

The secondary indentation is added to the primary for each programming structure encountered (IF, REPEAT, WHILE, FOR and CASE). It is reduced by the same amount when the structure ends (with ENDIF, UNTIL, ENDWHILE, NEXT or ENDCASE respectively). This also makes the translated C code more readable. The indentation in no way affects the speed or size of the compiled code, as the compiler ignores any embedded spaces. However the supplied BBC BASIC may have been "squeezed" to remove spaces, as these do affect the speed of the interpreted BASIC.
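The arithmetic behind the two slots is simple enough to sketch. The function names are hypothetical, and "depth" here means the number of structures currently open around the line.

```c
#include <stdio.h>

/* Number of leading spaces for a line nested `depth` structures deep. */
int indent_width(int primary, int secondary, int depth)
{
    return primary + secondary * depth;
}

/* Write `code` into `out`, preceded by its indentation.
   Returns the value returned by snprintf. */
int indent_line(char *out, size_t out_size,
                int primary, int secondary, int depth, const char *code)
{
    return snprintf(out, out_size, "%*s%s",
                    indent_width(primary, secondary, depth), "", code);
}
```

With a primary indentation of 2 and a secondary of 2, a statement inside one FOR loop is written with four leading spaces.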

Errors:- Quiet Button.
When enabled, this option suppresses the error panels generated when a syntax (or parse) error is detected in the BBC BASIC program. As mentioned above, this may not be a fault in the BBC BASIC program, but a deficiency in the grammar that is being used by !BBC_C to parse the program.

Some programs produce many error messages, so this option is enabled by default to avoid constantly having to respond to the error panels.

If !BBC_C terminates with an "Untrapped Parser Error" message, then it has fallen over at some point and the output file will be empty. In any case, the total number of errors is given. Use the tips above to reduce the errors to a minimum. In some cases, badly bracketed programming structures result in !BBC_C searching for a terminating statement (containing ENDIF, ENDWHILE, UNTIL, NEXT or ENDCASE), which is never found and the program falls over.

!BBC_C does attempt to keep track of structures and close higher structures when lower ones close, e.g. a WHILE loop within a CASE structure will be closed with an injected ENDWHILE upon encountering a new WHEN or OTHERWISE.

Errors:- Maximum.
!BBC_C will abort if the number of errors exceeds the indicated number. By default this is 300. If you do not wish the program to abort, you should increase this number but, from experience, if a large number of errors is generated then !BBC_C usually falls over and produces a blank output file. Sometimes this is due to not selecting the correct options for your program from the options described above.

Order of Identifier Declarations.
!BBC_C keeps an internal table of all the identifiers (symbol table), which can be kept in different sorted orders. The writable icon can only contain 0, 1 or 2.

With 0 in it, the identifiers are just added to the end of the list every time a new identifier is encountered, so they are declared at the head of the generated C code in "As Found" order. With 1 in the icon, the identifiers are kept in alphabetical order and with 2 in the icon, the identifiers are further sorted on their "type", that is whether they are variables, functions or procedures. The option "2" is selected by default.

It is recommended that you should at least select option "1", or better still "2", because it makes it easier to search for a particular identifier in the declarations at the head of the generated C code. Also with "As Found" order, the search for new identifiers in the symbol table is less efficient and can slow down !BBC_C.
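The three orders, and the reason a sorted table is quicker to search, can be sketched with the standard library routines qsort and bsearch. The structure, the names and the ordering of the kinds in option 2 are all assumptions made for illustration; the internals of !BBC_C are not described in these notes.

```c
#include <stdlib.h>
#include <string.h>

enum symbol_kind { KIND_VARIABLE, KIND_FUNCTION, KIND_PROCEDURE };

struct symbol {
    const char *name;
    enum symbol_kind kind;
};

static int compare_by_name(const void *a, const void *b)
{
    const struct symbol *x = a;
    const struct symbol *y = b;

    return strcmp(x->name, y->name);
}

static int compare_by_kind_then_name(const void *a, const void *b)
{
    const struct symbol *x = a;
    const struct symbol *y = b;

    if (x->kind != y->kind)
        return (x->kind < y->kind) ? -1 : 1;
    return strcmp(x->name, y->name);
}

/* order 0: leave in "As Found" order; 1: alphabetical;
   2: by kind, then alphabetical within each kind. */
void order_symbols(struct symbol *table, size_t count, int order)
{
    if (order == 1)
        qsort(table, count, sizeof table[0], compare_by_name);
    else if (order == 2)
        qsort(table, count, sizeof table[0], compare_by_kind_then_name);
}

/* Look a name up in a table already sorted alphabetically (order 1).
   A binary search needs far fewer comparisons than scanning a list
   held in "As Found" order. Returns NULL if the name is absent. */
struct symbol *find_by_name(struct symbol *table, size_t count,
                            const char *name)
{
    struct symbol key;

    key.name = name;
    key.kind = KIND_VARIABLE;            /* ignored by compare_by_name */
    return bsearch(&key, table, count, sizeof table[0], compare_by_name);
}
```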

Keep Separate List of Local Variables.
When this option is enabled, the variables local to a BBC BASIC function or procedure are kept in a separate symbol table, which is renewed for each new function or procedure. This means that variable data types are only determined from their behaviour within the function or procedure and not the behaviour of any global variable with the same name.

Variables local to the function or procedure are their arguments and any variables declared as LOCAL in BBC BASIC within the routine.

This option can often reduce the number of syntax errors and furthermore BBC BASIC functions are often given much more accurate data types for the value that they return. For these reasons, this option is enabled by default.

Named Wimp Control Block/s.
By default this option is ticked "Off" and the writable icon is disabled. This option, when enabled, allows you to specify the BBC BASIC names for up to ten variables used as Wimp control blocks. These blocks are used in SWI (SYS) calls, particularly in Wimp applications. Such names may consist of alphanumeric characters, in both upper and lower case, and also the symbols "%", "$" and underscore. No other characters will be accepted in the writable icon. Each variable name must be separated from the next by a comma, and the whole string of names cannot exceed 100 characters. Such variables are given the data type of a string variable in the generated C code (char *).
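The rules for the contents of this writable icon can be expressed as a short check. This is a sketch of the stated limits only, with a hypothetical function name; rejecting empty names is an assumption, as the notes do not say how they are treated.

```c
#include <ctype.h>
#include <string.h>

#define MAX_NAMES  10    /* up to ten variable names */
#define MAX_LENGTH 100   /* whole string at most 100 characters */

/* Return 1 if `list` is an acceptable comma-separated list of names,
   otherwise 0. */
int valid_block_names(const char *list)
{
    size_t length = strlen(list);
    size_t i;
    int names = 1;
    int chars_in_name = 0;

    if (length == 0 || length > MAX_LENGTH)
        return 0;
    for (i = 0; i < length; i++) {
        unsigned char c = (unsigned char)list[i];

        if (c == ',') {
            if (chars_in_name == 0)
                return 0;                /* empty name (an assumption) */
            names++;
            chars_in_name = 0;
        } else if (isalnum(c) || c == '%' || c == '$' || c == '_') {
            chars_in_name++;
        } else {
            return 0;                    /* character not allowed */
        }
    }
    return chars_in_name > 0 && names <= MAX_NAMES;
}
```

So "block%,q%" is accepted, while "a b" is rejected because of the space.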

Further Technical Information.
For further information concerning the processing that !BBC_C does with BBC BASIC, consult "TechNote" present within the application. In particular, note that variably dimensioned arrays need suitable fixed values for their dimensions to be filled in manually within the generated C code (in #define directives at the head of the code). If this is not done, then C compiler errors occur.

Certain BBC BASIC constructs were found to be untranslatable into an equivalent construct in the C language. !BBC_C will warn you in the generated C code where this occurs and the supplied "TechNote" gives a list of those things that !BBC_C is unable to handle. Most of these constructs are only rarely used.
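As an illustration of the manual step for variably dimensioned arrays, the sketch below gives an array a fixed size through a #define. The names DATA_SIZE and data_arr are hypothetical, and the exact form of the directives that !BBC_C emits is given in "TechNote" rather than here. Whether the Translator allows for the extra element itself is not stated in these notes.

```c
#include <stddef.h>

/* BBC BASIC:  DIM data(n%)   where n% is only known at run time.
   In C the size of a static array must be a constant, so a fixed
   value is filled in by hand. */
#define DATA_SIZE 100

/* DIM data(100) in BBC BASIC allows subscripts 0 to 100, so one extra
   element is allowed for here. */
static double data_arr[DATA_SIZE + 1];

/* Number of elements actually reserved. */
size_t data_element_count(void)
{
    return sizeof data_arr / sizeof data_arr[0];
}
```

The value chosen must be at least as large as anything the BASIC program would have passed to DIM at run time.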

Writable Icons.
Pointing the mouse at a writable slot and clicking "Select" (left hand key) will insert a red cursor in the slot concerned. The backspace or delete key will delete characters to the left. Type into slots from the keyboard. Those slots with a white background will only accept alphabetic characters from the keyboard, whilst those in light yellow will only accept numeric characters.

If you clear the variable terminator slots to spaces, !BBC_C returns to its default e.g. for the integer variable terminator, it will take the value "_i" if the slot is cleared out. However the other terminator slots can be cleared to a null string.

Having entered a writable slot, pressing the "Return" (Enter) key moves the cursor in turn to the next writable slot. Pressing the "Return" (Enter) key when the final slot has been reached has the effect of running the application.

StartUp Banner Panel.
Upon double clicking with the left hand mouse button over the !BBC_C Application icon, a banner panel appears giving proprietary information about !BBC_C. At the same time the !BBC_C icon appears on the icon bar. The panel appears for about five seconds, but it can be removed from the screen immediately by clicking once over it with the mouse. The panel only appears once during any one session on the computer. That is, if you quit the !BBC_C application, carry on working without switching off the computer, then start !BBC_C again, the panel does not reappear.

Description of the Icon Bar Menu.

When the !BBC_C application is run by double-clicking on its icon, the icon is loaded onto the icon bar. Clicking "Select" on this bar icon gives the "Set Up" panel described above, but clicking "Menu" will present the user with a menu whose entries are described below.

Clicking on "Quit" immediately removes the icon from the icon bar and aborts the application.

Clicking on "Help" causes a window to appear giving descriptions of the options available from the "Set Up" panel.

Clicking on "Save options" will cause the set of options that you have chosen from the "Set Up" panel and this menu to be permanently saved to a file within !BBC_C. This means that your version of this application will always come up with your set of options rather than default ones every time you run !BBC_C.

Moving right over the arrow beside "Info" displays an information box.

Principally this will tell you the version and date of the Translator you have.

Moving right over the arrow beside "Options" will produce a display.

The first set of options you are given causes !BBC_C to automatically run when it receives the file name without having to click the "Run" button - Auto Run. Secondly you can cause the translated C to be automatically saved without having to click "OK" on the "Save As" box mentioned above - Auto Save. Both options are enabled by clicking on the menu item which causes it to be ticked on the menu display.

Moving right over the arrow beside "Display" will produce a display.

By default, when !BBC_C runs it produces textual information. If "Summary" is ticked, a Summary window is displayed instead when !BBC_C runs.

The number of lines given is not the number of lines of C code output, but the number of lines of text that would be displayed if the "Text" option had been chosen. "Summary" and "Text" cannot both be chosen at the same time.

All the menu options chosen will be saved when "Save options" is clicked on.

Tips for avoiding !BBC_C Syntax Errors.

Determining return values of FNs.
As mentioned above, reversing inequalities with FNs in them often avoids syntax errors. Also !BBC_C "remembers" the return value of a FN that has been previously defined above the point in the code where the FN is being called. FNs returning numerics are taken differently to those returning character strings.

Bitwise ANDs, ORs, NOTs and EORs.
As mentioned above, expressions after IF, WHILE and UNTIL are taken to be C logical expressions. In all other cases (e.g. in assignments), the expressions are taken to be calculated in a bitwise manner using "&", "|", "~" and "^". For these purposes, the BBC BASIC construct TRUE is translated to a #define macro _TRUE, which has the value -1. Similarly FALSE becomes _FALSE with the value 0. Note that ANSI C returns the value of +1 in logical expressions that are true, but BBC BASIC returns a value of -1.

Also, where the code is expecting a logical expression, !BBC_C can distinguish between those expressions that require to be taken in a logical or bitwise manner. Mixed expressions will cause !BBC_C to emit syntax error messages.
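The reason the value -1 matters shows up as soon as a bitwise NOT is applied. The sketch below uses the macro names and values given above; the function names are hypothetical. Note that names beginning with an underscore and a capital letter are reserved in standard C, so they are used here only to match the text.

```c
/* Values given in these notes for the translated constants. */
#define _TRUE  (-1)
#define _FALSE 0

/* A comparison in the BBC BASIC style: -1 when equal, 0 otherwise. */
int bbc_equal(int a, int b)
{
    return (a == b) ? _TRUE : _FALSE;
}

/* A comparison in the C style: 1 when equal, 0 otherwise. */
int c_equal(int a, int b)
{
    return a == b;
}

/* BBC BASIC NOT inverts every bit, like the C operator "~". */
int bitwise_not(int value)
{
    return ~value;
}
```

Inverting _TRUE (all bits set) gives 0, which is false as expected. Inverting the C value 1 gives -2, which is still non-zero and so still counts as true.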

Badly bracketed programming structures.
!BBC_C attempts to close higher programming structures when lower ones close, but sometimes can't cope. In particular, not closing DEF PROC with a final ENDPROC can cause problems. If your program causes !BBC_C to produce a blank output file, then it could be due to bad structure bracketing.

Returning values from FNs.
Returning a value from a FN using "=" via an IF statement can cause problems when the THEN is omitted. The only solution is to insert the THEN into the BBC BASIC code.

Similarly a test for equality in an IF statement (e.g. IF a%=b% c%+=1) should be bracketed (i.e. IF (a%=b%) c%+=1) because of the same conflict as to whether the "=" is being used as a test for equality or for returning a value from a function.

EVAL, Variable GOTOs and Variable GOSUBs.
!BBC_C is unable to cope with these constructs. EVAL produces an error message in the code. Variable GOTOs and GOSUBs result in meaningless C code. "Structured Programming" techniques demonstrate that you should be able to rewrite your program to avoid these sorts of construct.

Martin Carradus
