ARMalyser

Manual

Version: 0.55

Written by: David J. Ruck

Date: 13-Apr-2006

Copyright © 2000-2003, DEEJ Technology PLC.

All Rights Reserved.

 

Contents

  1. INTRODUCTION
    1. File Types
  2. USE
    1. Command Line Arguments
    2. Print Formatting
      1. HTML
      2. XML
      3. Impression DDF
      4. Ovation Pro DDL
  3. ANALYSIS
    1. Instructions
      1. Invalid Instructions
      2. Unpredictable Instructions
      3. Non 32-bit Instructions
      4. Non ARM2/3 Instructions
      5. Performance
      6. Information From Instructions
    2. Code Following
    3. Partial Emulation
    4. 26 / 32 Bit Mode Guarding
    5. Surmised Code And Data
      1. Code Recognition
      2. Data Recognition
    6. Code Structure
    7. SWI Calls
    8. Shared C Library Functions
    9. Data Structures
    10. C++ Symbols
  4. DISASSEMBLY
    1. Comment Level
    2. Code Comments
      1. Cautions
      2. Performance
      3. Informational
    3. Data Comments
  5. ASSEMBLY
    1. Header Comment
    2. Declarations
    3. Labels
    4. Compilation
  6. STATISTICS
  7. RELEASE HISTORY
    1. Previous Releases
    2. Issues With Current Release
    3. Future Development

 

1 INTRODUCTION

ARMalyser is a tool designed to analyse RISC OS executables, providing identification of code and data areas with detailed comments, and facilities for turning executables back into ObjAsm compatible assembler source. It identifies instructions that may have side effects in 32-bit processor modes, to aid in porting to 32-bit RISC OS variants.

It has built-in knowledge of ARM architectures up to ARMv5TE, ARM procedure call standards, executable, object and library file formats, and RISC OS SWI calls. Output can be in the form of disassembly, ObjAsm style assembly and statistical summary. Options are provided to format the output as text, HTML, XML, or fully customisable variants of those and most other textual tagged document formats.

ARMalyser is available for RISC OS™, Win32™, x86 and ARM Linux, and other UNIX™ variants on request.

1.1 File Types

ARMalyser handles the following file types:-

Squeezed absolute and raw code may be handled directly if the xpand utility is present on the run path. Similarly compressed modules can be handled if the unmodsqz utility is available.

 

2 USE

2.1 Command Line Arguments

Usage: ARMalyser [options] infile

Where options are:-
-h Output command syntax
-v Verbose output; progress reports are sent to stdout and code analysis warnings are given
-d Output disassembly to stdout
-a Produce ObjAsm format assembly
-r[a|r] Set register naming used:-
    a APCS register names
    r Standard register names
-s Print statistics on code construction and 26-bit only instructions to stdout
-t <target> Target processor:-
    ARM7 | ARM9 | StrongARM | XScale
-p<t|h|x> Print format in Text (default), HTML or XML
-p <filename> If no option letter is supplied, the format is taken from a messagetrans format file in the next argument
-xc <addr> Display analysis backtrace when code marked
-xd <addr> Display analysis backtrace when data marked
-xr Display register contents during analysis
-o [filename] Send output to a file (uses stdout if not specified)
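
As an illustration of how the options combine, the following Python sketch assembles an argument list from the options documented above. The function build_command and its parameters are hypothetical conveniences, not part of ARMalyser. It assumes the ARMalyser executable is on the run path and that options precede the input file, as in the usage line.

```python
def build_command(infile, disassemble=False, assemble=False, stats=False,
                  print_format=None, target=None, outfile=None):
    """Return an argv list for ARMalyser using only the documented options.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    args = ["ARMalyser"]
    if disassemble:
        args.append("-d")
    if assemble:
        args.append("-a")
    if stats:
        args.append("-s")
    if target is not None:
        if target not in ("ARM7", "ARM9", "StrongARM", "XScale"):
            raise ValueError("unknown target: " + target)
        args += ["-t", target]
    if print_format is not None:
        if print_format not in ("t", "h", "x"):
            raise ValueError("print format must be t, h or x")
        args.append("-p" + print_format)
    if outfile is not None:
        args += ["-o", outfile]
    args.append(infile)
    return args


# Disassemble a (hypothetical) file called prog as HTML for an XScale target.
print(" ".join(build_command("prog", disassemble=True, print_format="h",
                             target="XScale", outfile="listing")))
```

The resulting list can be passed to any process-launching facility; nothing here runs ARMalyser itself.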

2.2 Print Formatting

If print formatting is specified, the output is encoded with tags that can be used to provide syntax colouring and hyper-linking for display in web browsers or printing in word processors. Plain text, HTML and XML formats are provided as standard; additional formats can be used by specifying a messagetrans file containing the tokens shown below.

The standard HTML file is provided as a template, as well as an inverted variant and files for Impression DDF (also suitable for EasiWriter and TechWriter) and Ovation Pro DDL. Almost any additional textual document format may be produced, as long as formatting codes are contained in tags with defined start and end characters, and illegal characters can be escaped either with defined start and end characters or a fixed length sequence. Note however that characters apart from " ' < > { | } & may appear unescaped in comments in the current version.

Messagetrans token HTML default XML default Description
TagStart < < Tag start character
TagEnd > > Tag end character
tag_DOC1 Used at the start of the document. The first %s code is replaced with the name of the tool, and the second with the filename it has been run on. Note that for XML the name of the tool is used as the top level tag; this must be ARMalyser for the DTD to function. The \n codes are used to put newlines into the output.
HTML default:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">\n<HTML>\n
<HEAD>\n<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">\n
<META NAME="Generator" CONTENT="%s">\n<TITLE>%s</TITLE>\n
</HEAD>\n<BODY bgcolor="#FFFFFF" text="#000000" link="#BF3F00" vlink="#BF3F00" alink="#BF0000">\n
XML default:
<?xml version="1.0"
encoding="iso-8859-1"?>\n
<!DOCTYPE ARMalyser SYSTEM "ARMalyser.dtd">\n
<%s filename="%s">\n

tag_DOC2 \n</BODY>\n</HTML>\n </%s> Used at the end of the document. The %s is replaced with the name of the tool.
tag_DISS1 <PRE> <Disassembly> Marks the start of the disassembly section
tag_DISS2 </PRE><HR> </Disassembly> Marks the end of the disassembly section
tag_DISSLINE1   <DissLine> Marks the start of a line of disassembly
tag_DISSLINE2 </DissLine> Marks the end of a line of disassembly
tag_ASM1 <PRE> <Assembly> Marks the start of the assembly section
tag_ASM2 </PRE><HR> </Assembly> Marks the end of the assembly section
tag_ASMLINE1   <AsmLine> Marks the start of a line of assembly
tag_ASMLINE2   </AsmLine> Marks the end of a line of assembly
tag_STATS1 <TABLE BORDER=1 CELLSPACING=1 CELLPADDING=5>\n <Stats> Start of the statistics table
tag_STATS2 </TABLE> </Stats> End of the statistics table
tag_STATSTITLE1 <TR><TD ALIGN="CENTER" COLSPAN=3><BIG><B> <StatsTitle> Start of the statistics table title row
tag_STATSTITLE2 </B></BIG></TD></TR> </StatsTitle> End of the statistics table title row
tag_STATSLINE1 <TR> <StatsLine> Start of a statistics table row
tag_STATSLINE2 </TR> </StatsLine> End of a statistics table row
tag_STATSCOLUMNA1 <TD><B> <StatsColumn column="1"> Start of a statistics table column 1
tag_STATSCOLUMNA2 </B></TD> </StatsColumn> End of a statistics table column 1
tag_STATSCOLUMNB1 <TD ALIGN="RIGHT"> <StatsColumn column="2"> Start of a statistics table column 2
tag_STATSCOLUMNB2 </TD> </StatsColumn> End of a statistics table column 2
tag_STATSCOLUMNC1 <TD ALIGN="RIGHT"> <StatsColumn column="3"> Start of a statistics table column 3
tag_STATSCOLUMNC2 </TD> </StatsColumn> End of a statistics table column 3
tag_WARNINGLINE1 <FONT COLOR="#FF0000"> <Warning> Start of a warning line from analysis engine. Will appear before disassembly.
tag_WARNINGLINE2 </FONT><BR> </Warning> End of a warning line.
tag_ADDRESS1 <A NAME="%s%08X"><FONT COLOR="#3F3F3F"> <Address address= "%s%08X"> Start of a disassembly address field. The %s is replaced with 'L' in assembly (empty in disassembly), the %08X is replaced with the hex value of the address.
tag_ADDRESS2 </FONT></A> </Address> End of a disassembly address field.
tag_LABEL1 <A NAME="%s%08X"><FONT COLOR="#BF0000"> <Label address= "%s%08X"> Start of an assembly label field. The %s is replaced with 'L' in assembly (empty in disassembly).
tag_LABEL2 </FONT></A> </Label> End of an assembly label field.
tag_ADDRLINK1 <A HREF="#%s%08X"> <AddrLink address=\"%s%08X\"> Start of an address hyperlink. Occurs around addresses in disassembly op codes or directives, labels in assembly op codes or directives, and in pointer indicators in comments. The %s is replaced with 'L' in assembly, empty in disassembly, so the hyper link refers to the correct section. The %08X is replaced with the hex value of the address.
tag_ADDRLINK2 </A> </AddrLink> End of an address hyperlink.
tag_CHARS1 <FONT COLOR="#00007F"> <Chars> Start of the disassembly character display.
tag_CHARS2 </FONT> </Chars> End of the disassembly character display.
tag_CTRLCHAR1 <FONT COLOR="#FF0000"> <CtrlChar> Start of a control character tag. Used around a control character in the character display field in disassembly. Note this is nested within the tag_CHARS section.
tag_CTRLCHAR2 </FONT> </CtrlChar> End of a control character tag.
tag_MEMORY1 <FONT COLOR="#5F5F5F"> <Memory> Start of the disassembly memory field.
tag_MEMORY2 </FONT> </Memory> End of the disassembly memory field.
tag_INSTRUCTION1   <Instruction> Start of the instruction field in both disassembly and assembly.
tag_INSTRUCTION2   </Instruction> End of the instruction field.
tag_OPCODE1   <OpCode> Start of the op code mnemonic tag.
tag_OPCODE2   </OpCode> End of the op code mnemonic tag.
tag_DIRECTIVE1 <FONT COLOR="#3F00BF"> <Directive> Start of the assembler directive tag.
tag_DIRECTIVE2 </FONT> </Directive> End of the assembler directive tag.
tag_CONDITION1 <FONT COLOR="#00007F"> <Condition> Start of the op code condition tag.
tag_CONDITION2 </FONT> </Condition> End of the op code condition tag.
tag_MODIFIER1 <FONT COLOR="#7F007F"> <Modifier> Start of the modifier tag, used for multiple register transfer flags, write back indication, and floating point precision and rounding.
tag_MODIFIER2 </FONT> </Modifier> End of the modifier tag.
tag_REGISTER1 <FONT COLOR="#003FBF"> <Register> Start of the register tag, used for main and floating point registers.
tag_REGISTER2 </FONT> </Register> End of the register tag.
tag_REGLIST1   <RegisterList> Start of the register list tag, used around the parenthesis in the multiple data transfer instruction.
tag_REGLIST2   </RegisterList> End of the register list tag.
tag_SHIFT1 <FONT COLOR="#003F00"> <Shift> Start of the op code shift tag, used around the shift type.
tag_SHIFT2 </FONT> </Shift> End of the op code shift tag.
tag_SWI1 <FONT COLOR="#7F7F00"> <SWI> Start of the SWI tag, used around the SWI name (or number if unrecognised).
tag_SWI2 </FONT> </SWI> End of the SWI tag.
tag_NUMBER1 <FONT COLOR="#007F7F"> <Number> Start of the number tag, used around any decimal or hex value in an instruction.
tag_NUMBER2 </FONT> </Number> End of the number tag.
tag_STRING1 <FONT COLOR="#007F00"> <String> Start of the string tag, used around string values in assembler directives or comments.
tag_STRING2 </FONT> </String> End of the string tag.
tag_COMMENTA1 <FONT COLOR="#00BF00">; <Comment> Start of the Comment A tag. Used when the location has been identified with certainty. Note there is a trailing space in the HTML tag.
tag_COMMENTA2 </FONT> </Comment> End of the Comment A tag.
tag_COMMENTB1 <FONT COLOR="#7FBF00">;~ <Comment surmised="1"> Start of the Comment B tag. Used when the location has been identified with high confidence. Note there is a trailing space in the HTML tag.
tag_COMMENTB2 </FONT> </Comment> End of the Comment B tag.
tag_COMMENTC1 <FONT COLOR="#BFBF00">;~~ <Comment surmised="2"> Start of the Comment C tag. Used when the location has been identified with medium confidence. Note there is a trailing space in the HTML tag.
tag_COMMENTC2 </FONT> </Comment> End of the Comment C tag.
tag_COMMENTD1 <FONT COLOR="#BF7F00">;~~~ <Comment surmised="3"> Start of the Comment D tag. Used when the location has been identified with low confidence. Note there is a trailing space in the HTML tag.
tag_COMMENTD2 </FONT> </Comment> End of the Comment D tag.
tag_COMMENTE1 <FONT COLOR="#FF0000">;? <Comment unidentified="1"> Start of the Comment E tag. Used when the location could not be identified. Note there is a trailing space in the HTML tag.
tag_COMMENTE2 </FONT> </Comment> End of the Comment E tag.
EntityStart & & Start of entity character, used before characters that are invalid in the format.
EntityEnd ; ; End of entity character, or if 1 to 9, the number of characters after EntityStart to be skipped.
Token                 HTML default          XML default
Char0 ... Char31      @ ... _               @ ... _
Char32                &nbsp;                (space)
Char38                &amp;                 &apos;
Char39                '                     &apos
Char60                &lt;                  &lt;
Char62                &gt;                  &gt;
Char127               ?                     ?
Char128 ... Char255   &#128; ... &#255;     (character 128) ... (character 255)

All characters in the output are replaced with the entities contained in the tokens CharN, where N is 0 to 255. The characters 0 to 31 and 127 are also enclosed by the tag_CTRLCHAR tag.
Note: in the current version there are some non-alphanumeric characters that appear in comments from the analysis engine, but these characters do not affect HTML, XML, DDF or DDL formats.
In HTML and XML all characters are mapped to themselves except where shown.
FileType &FAF &F80 RISC OS file type to use for output file
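
The character mapping and the address tag formatting described in the table can be sketched in Python as follows. This is only an illustration of the documented HTML defaults, not ARMalyser's own code. The helper names are hypothetical, and the values for Char1 to Char30 are assumed to run through the ASCII sequence between the listed end points '@' and '_'.

```python
def html_char_table():
    """Build a 256-entry CharN table from the HTML defaults listed above."""
    table = [chr(n) for n in range(256)]   # most characters map to themselves
    for n in range(32):
        table[n] = chr(0x40 + n)           # Char0..Char31 -> '@'..'_' (assumed sequence)
    table[32] = "&nbsp;"
    table[38] = "&amp;"
    table[60] = "&lt;"
    table[62] = "&gt;"
    table[127] = "?"
    for n in range(128, 256):
        table[n] = "&#%d;" % n             # Char128..Char255 -> numeric entities
    return table


CTRLCHAR1 = '<FONT COLOR="#FF0000">'       # tag_CTRLCHAR1 HTML default
CTRLCHAR2 = "</FONT>"                      # tag_CTRLCHAR2 HTML default


def encode(data, table):
    """Replace every byte with its CharN token; wrap control characters."""
    out = []
    for byte in data:
        text = table[byte]
        if byte < 32 or byte == 127:
            text = CTRLCHAR1 + text + CTRLCHAR2
        out.append(text)
    return "".join(out)


def address_tag(address, assembly):
    """Expand tag_ADDRESS1: %s is 'L' in assembly and empty in disassembly."""
    return '<A NAME="%s%08X"><FONT COLOR="#3F3F3F">' % ("L" if assembly else "", address)


print(encode(b"a<b\x01", html_char_table()))
print(address_tag(0x8000, assembly=True))
```

A custom messagetrans file would supply different strings for the same tokens; the substitution steps would stay the same.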

2.2.1 HTML

The HTML generated conforms to the W3C HTML 3.2 Final specification. It fails the W3C validator, however, due to its use of font colour tags within PRE blocks. No problems have been found with this in any browser.

Both Fresco and Browse are suitable for rendering the output; Fresco has a slight edge due to its greater speed when dealing with very large output files. Oregano will render it, but produces an inferior display. With both courier and typewriter fonts, top bit set characters are not displayed at the same width as other characters, leading to a ragged appearance in disassembly mode. The XTT font renderer produces a very light display with these fonts, which is not as legible as the Acorn Font Manager's Corpus font.

2.2.2 XML

A DTD is provided to enable validation and parsing of the output.

2.2.3 Impression DDF

Whilst the DDF produced is thought to be correct, Impression Publisher and Publisher+ can have difficulties with certain lines due to the number of style changes. This is not alleviated by using effects instead of styles, and has been put down to a bug in these programs. The problem can be circumvented by removing some of the tags, so that lines contain fewer style changes.

The output produced will load and display perfectly in EasiWriter and TechWriter.

2.2.4 Ovation Pro DDL

As DDL is not a true tagged format (instead requiring the document text to be enclosed in quotes), the formatting width calculations do not work correctly and produce incorrect length instruction strings. This is masked by using a tab in the closing instruction tag to ensure that following comments are aligned. Note that a tab at the start of comments is not suitable, as they may start after the memory display in disassembly mode, where no instruction is present.

 

3 ANALYSIS

3.1 Instructions

The entire ARMv4 instruction set is recognised and some elements of ARMv5.

3.1.1 Invalid Instructions

A large amount of the potential instruction space is considered invalid for legitimate use by RISC OS applications. The following are rejected.

3.1.2 Unpredictable Instructions

The following instructions may have unpredictable results on different processor variants.

3.1.3 Non 32-bit Instructions

The following instructions are not considered valid for use in 32-bit mode when found in 26-bit executables as the meaning of the instruction or its side effects are considerably different.

Note: Branch Link is not included, as flag preserving is the consequence of the return instruction.

3.1.4 Non ARM2/3 Instructions

Use of the PSR manipulation instructions MSR and MRS is invalid on the ARM2 and ARM3 processors.

3.1.5 Performance

If the processor target is specified on the command line, the analysis will gather information relating to performance issues. These include:-

ARM7 ARM9 StrongARM XScale

Note: Register latency calculations should only be taken as approximate. The current implementation takes into account instruction issue latencies and blocking as a result of previous register latencies. However, not all pipeline interactions and sequences involving conditionally exclusive instructions are fully modelled.

3.1.6 Information from Instructions

If the instruction contains an immediate address, or a register previously loaded with a PC relative address can be located, the following information is extracted from the instructions.

3.2 Code Following

Entry points into the code are determined by analysis of the AIF or module header, or from the execution address for raw code. The code is then followed, with additional program flow changes and registered entry points held on a stack. The order of following is:-
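
The general idea of stack-based code following can be sketched in Python as below. This is a minimal illustration under stated assumptions, not ARMalyser's implementation: the toy program representation, the instruction kinds and the function follow_code are all hypothetical, and the last-in first-out order is simply what a plain stack gives rather than the order the tool actually uses.

```python
def follow_code(program, entry_points):
    """Return the set of addresses reached by following code flow.

    program maps a word address to (kind, target), where kind is one of:
      "op"  ordinary instruction, flow continues at the next word
      "b"   unconditional branch to target, flow does not fall through
      "bcc" conditional branch to target, flow may also fall through
      "bl"  subroutine call to target, flow resumes at the next word
      "ret" flow of code ends at this point
    """
    code = set()
    pending = list(entry_points)          # stack of addresses still to follow
    while pending:
        addr = pending.pop()
        while addr in program and addr not in code:
            code.add(addr)
            kind, target = program[addr]
            if kind in ("b", "bcc", "bl"):
                pending.append(target)    # follow the branch target later
            if kind in ("b", "ret"):
                break                     # nothing falls through to the next word
            addr += 4
    return code


program = {
    0x8000: ("op", None),
    0x8004: ("bl", 0x8010),
    0x8008: ("ret", None),
    0x800C: ("op", None),    # never reached by flow: left for later surmising
    0x8010: ("op", None),
    0x8014: ("ret", None),
}
print(sorted(hex(a) for a in follow_code(program, [0x8000])))
```

Words never reached by this process are the ones that would need to be surmised as code or data by other means.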

3.3 Partial Emulation

Partial emulation of code is used to track registers to enable code and data areas to be detected, either directly or by knowledge of values passed to SWI calls and Shared C Library function arguments. Values enter registers via one of:-

The following arithmetic operations are then emulated where all registers used by the instruction are known:-

Currently the PSR flag bits are not emulated, so instructions that rely on the C flag, such as ADC, SBC or those using an RRX shift, cannot be emulated and invalidate the destination register.
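
A much simplified Python sketch of this kind of register tracking follows. It is not ARMalyser's emulator: the tuple instruction format and the function emulate are hypothetical, and MOV, ADD and SUB are chosen only as representative operations for the example.

```python
def emulate(instructions):
    """Track known register values through a straight-line sequence.

    Each instruction is (op, rd, operand, ...); a string operand names a
    register and an integer operand is an immediate value.
    """
    regs = {}                                   # register name -> known 32-bit value
    for op, rd, *operands in instructions:
        values = [regs.get(x) if isinstance(x, str) else x for x in operands]
        if op in ("ADC", "SBC") or any(v is None for v in values):
            regs.pop(rd, None)                  # needs the C flag or an unknown input
        elif op == "MOV":
            regs[rd] = values[0] & 0xFFFFFFFF
        elif op == "ADD":
            regs[rd] = (values[0] + values[1]) & 0xFFFFFFFF
        elif op == "SUB":
            regs[rd] = (values[0] - values[1]) & 0xFFFFFFFF
        else:
            regs.pop(rd, None)                  # operation not emulated in this sketch
    return regs


sequence = [
    ("MOV", "R0", 0x8000),        # R0 becomes known
    ("ADD", "R1", "R0", 0x24),    # all inputs known, so R1 becomes known
    ("ADC", "R0", "R0", 1),       # relies on the C flag: R0 is invalidated
    ("SUB", "R2", "R1", "R3"),    # R3 unknown, so R2 stays unknown
]
print({reg: hex(value) for reg, value in emulate(sequence).items()})
```

Only R1 remains known at the end of the sequence, since R0 is invalidated by the ADC and R2 depends on an unknown register.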

If the instruction is conditional, the register(s) set by the instruction will only be valid in subsequent instructions which also bear this condition, until a flag altering instruction is encountered.
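
One way to picture this rule is to store the condition alongside each tracked value, as in the Python sketch below. The class ConditionalRegs and its methods are hypothetical, and the model is deliberately simplified: for instance, it does not try to keep an older value alive for the opposite condition.

```python
class ConditionalRegs:
    """Register values tagged with the condition under which they were set."""

    def __init__(self):
        self.values = {}                    # register -> (value, condition or None)

    def write(self, reg, value, cond=None):
        """Record a value set by an instruction bearing condition cond."""
        self.values[reg] = (value, cond)

    def read(self, reg, cond=None):
        """Return the value if it is valid for an instruction bearing cond."""
        if reg not in self.values:
            return None
        value, set_cond = self.values[reg]
        if set_cond is None or set_cond == cond:
            return value
        return None

    def flags_altered(self):
        """A flag altering instruction ends the validity of conditional values."""
        self.values = {r: v for r, v in self.values.items() if v[1] is None}


regs = ConditionalRegs()
regs.write("R1", 0x8000)            # e.g. MOV   R1,#&8000
regs.write("R0", 1, "EQ")           # e.g. MOVEQ R0,#1
print(regs.read("R0", "EQ"))        # valid in a following EQ instruction
print(regs.read("R0"))              # not valid in an unconditional instruction
regs.flags_altered()                # e.g. a CMP is encountered
print(regs.read("R0", "EQ"), regs.read("R1"))
```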

Register values are tracked through code sequences and are passed through forward or reverse branches, any subroutines and SWIs. However, if any of the following are encountered, all known registers are invalidated.

3.4 26 / 32 Bit Mode Guarding

Code only suitable for use in 26 bit modes, or instructions only present on the ARM 2/3, may appear in 32 bit programs if suitably guarded by a test of whether the processor is running in 26bit mode. Similarly, instructions only available on later ARM processors may be used if suitably guarded by a test of whether the processor is running in a 32bit mode. ARMalyser recognises the following code constructions as guarding the use of 26bit only or 32bit only instructions:-

	TEQ	R0,R0	; Ensure some flag bits set (only needed in USR mode code)
	TEQ	PC,PC	; Check if PC contains PSR, NE in 26bit mode, EQ in 32bit mode
	; 26 bit only instructions safe if used with the NE condition
	; 32 bit only instructions safe if used with the EQ condition

	TEQ	PC,#0	; Ensure some flag bits set (only needed in USR mode code)
	TEQ	PC,PC	; Check if PC contains PSR, NE in 26bit mode, EQ in 32bit mode
	BEQ	in32
|in26|
	; only executed in 26 bit mode, all 26 bit only instructions safe
	B	inEither
|in32|
	; only executed in 32 bit mode, 32 bit only instructions safe
|inEither|
	; executed in either mode

	TEQ	PC,#0	; Ensure some flag bits set (only needed in USR mode code)
	TEQ	PC,PC	; Check if PC contains PSR, NE in 26bit mode, EQ in 32bit mode
	BNE	in26
|in32|
	; only executed in 32 bit mode, 32 bit only instructions safe
	B	inEither
|in26|
	; only executed in 26 bit mode, all 26 bit only instructions safe
|inEither|
	; executed in either mode
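
Why TEQ PC,PC distinguishes the two modes can be modelled in a few lines of Python. The sketch assumes the usual 26-bit behaviour of R15 in data processing instructions: as the first operand only the PC bits are presented, while as the second operand the PSR bits are included. The function name and the example values are hypothetical.

```python
PSR_MASK = 0xFC000003     # 26-bit R15: flags and I/F in bits 31-26, mode in bits 1-0
Z_FLAG = 1 << 30          # Z flag position within a 26-bit R15


def teq_pc_pc_gives_eq(pc, psr_bits, mode26):
    """Return True if TEQ PC,PC would leave the EQ condition true."""
    if mode26:
        rn = pc & ~PSR_MASK & 0xFFFFFFFF        # first operand: PC bits only
        rm = rn | (psr_bits & PSR_MASK)         # second operand: PC plus PSR bits
    else:
        rn = rm = pc                            # 32-bit mode: R15 holds only an address
    return (rn ^ rm) == 0


# After TEQ R0,R0 the Z flag is set, so in 26-bit mode the operands differ (NE).
print(teq_pc_pc_gives_eq(0x8008, Z_FLAG, mode26=True))    # False: NE in 26bit mode
print(teq_pc_pc_gives_eq(0x8008, Z_FLAG, mode26=False))   # True: EQ in 32bit mode
# With no PSR bits set (USR mode, flags clear) the test would wrongly give EQ.
print(teq_pc_pc_gives_eq(0x8008, 0, mode26=True))         # True
```

The last case shows why the sequences above first ensure that some flag bits are set when running in USR mode.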

3.5 Surmised Code and Data

Following the analysis of executable structure, code following and data recognition directly from code usage, additional steps must be taken to surmise the unidentified areas. There are 4 levels of recognition confidence:-

3.5.1 Code Recognition

3.5.2 Data Recognition

3.6 Code Structure

The following code structures are understood.

3.7 SWI Calls

The following SWIs are recognised, and values or addresses set up by preceding instructions are used to enable data structures and handler functions to be identified.

3.8 Shared C Library Functions

3.9 Data Structures

The following data structures are identified and annotated in comments:-

3.10 C++ Symbols

C++ symbol names are unmangled for display in comments, courtesy of code provided by Robin Watts.

 

4 DISASSEMBLY

The disassembly output is similar to that produced by the RISC OS debugger module, but with a greater knowledge of the ARMv4 instruction set, and more comprehensive comments as a result of the code analysis.

4.1 Comment level

The comment sequence describes the level of surmisation, or confidence of the analysis.

; Positively identified
;~ Surmisation level 1
;~~ Surmisation level 2
;~~~ Surmisation level 3
;? Failed to Identify
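
When post-processing text output, the comment sequence can be classified with a simple prefix test, as in this Python sketch. The function comment_level and the sample comments are hypothetical; the only point of note is that the longer sequences must be tested before the shorter ones.

```python
# Longest prefixes first, so that ";~~" is not mistaken for ";~" or ";".
LEVELS = [
    (";~~~", "Surmisation level 3"),
    (";~~", "Surmisation level 2"),
    (";~", "Surmisation level 1"),
    (";?", "Failed to Identify"),
    (";", "Positively identified"),
]


def comment_level(comment):
    """Return the level described by a comment's leading sequence, or None."""
    for prefix, level in LEVELS:
        if comment.startswith(prefix):
            return level
    return None


print(comment_level("; Function entry"))
print(comment_level(";~~ Word"))
print(comment_level(";? "))
```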

4.2 Code Comments

4.2.1 Cautions

Prefixed with CAUTION:

Bad Address - assumes 26bit wrapping Instruction may assume branches will wrap around the 26 bit address boundary; in 32 bit mode it will jump outside the bottom 64MB
Bad Address - Thumb mode Bit 2 of branch address set, which will result in Thumb mode being selected on later ARMs
Bad Address - unaligned Bit 1 of branch address set
Not 32bit safe Instruction does not have the same behaviour in 32-bit modes as in 26-bit modes
Not 32bit safe (uses PSR) PC used in Rd or Rm or LDM with {PC}^ may cause problems in 32-bit modes
Not 32bit safe (uses NV) Instruction uses the former NV conditional encoding, which is used to provide instruction set extensions on later ARM's
Should be a NOP Current instruction should be a NOP to prevent side effects from the previous instruction on some ARM variants
Uses a banked register Current instruction should not access a banked register, due to side effects from the previous instruction.
SWI after CDP SWI follows a coprocessor data instruction, which may cause problems on some ARM variants
Conditional after BL/SWI Conditional instructions used after subroutine or SWI (except conditions based on flags altered by the SWI and the V flag from flag altering subroutines), suggesting flag preservation has been assumed.
Manipulation of PSR in address? Instruction may be assuming a combined PC+PSR and is trying to manipulate the PSR bits in what is otherwise an address
Unpredictable - negative unindexed Coprocessor data transfer in unindexed mode with a negative offset
Unpredictable - write back to PC LDR/STR, LDM/STM or coprocessor data transfer instruction writing back to the PC
Unpredictable - base register in list and write back Base register in list with writeback set for LDM/STM
Unpredictable - ! and ^ Use of writeback and load user registers with LDM/STM
Unpredictable - PC with byte/half word Program counter loaded or stored with a non word LDR/STR variant
Unpredictable - write back with Rd=Rn Write back to register also used as destination in LDR/STR
Unpredictable - write back with Rn=Rm Write back to register also used as index in LDR/STR
Unpredictable - write back used Write back to register illegal in PLD
Unpredictable - use of PC PC used in a multiply or SDS instruction
Unpredictable - Rd odd or R14 Odd number register or R14 used in LDRD or STRD
Unpredictable - non unique registers Invalid use of same register more than once in a multiply or SDS instruction
Unpredictable - StrongARM bug - next op exec'd twice On a StrongARM a conditional MSR setting the control field causes the next instruction to be executed twice, so should be a NOP to prevent side effects.
Unpredictable - immediate with non flag fields An MSR instruction with an immediate value setting the non flags fields may cause side effects when altering values of currently reserved bits
Unpredictable - SBZ non zero Bits that should be zero in an instruction are set
Unpredictable - SBO not ones Bits that should be ones in an instruction are clear
Unpredictable - Rm=PC PC used as the Rm register, which results in architecture specific values
Invalid Instruction Invalid instruction found in code area
Self Modifing Write detected to area of code

4.2.2 Performance

Enabled when the target processor is specified on the command line. Prefixed with PERF:

Conditional LDM/STM maybe slow Conditional LDMs and STMs on StrongARM and XScale are unrolled and take more than one cycle even if not taken, reducing code performance
Single register LDM/STM slower than LDR/STR Single register LDMs and STMs are slower than LDRs and STRs on StrongARM and XScale. LDR Rd,[Rn],#4 or STR Rd,[Rn,#-4]! should be used in preference.
n cycle latency on register A register used in the current instruction will not be available for a number of cycles. For maximum performance, code should be reordered so that other instructions which do not use this register are inserted between where the register is written and where it is used.

4.2.3 Informational

ARMvN Instruction is only available from the specified architecture version onwards.
Guarded non ARMvN instruction Instruction is not available on the ARM 2 or ARM 3 but is only executed if the processor is running in 32bit mode due to a previous code sequence
Guarded not 32bit safe instruction Instruction is not 32bit safe but is only executed if the processor is running in 26bit mode due to a previous code sequence
Entry Point Code entered directly from OS
Function entry Target of BL
Label Target of branch instruction or dynamic branch
Ends Flow of code ends at this point
Dynamic branch Manipulates PC to alter code flow
(Referenced) ADR or memory pointer references this location
(Read as Data) Code area is read by a data instruction
=value Decimal or ASCII representation of immediate value used in instruction

4.3 Data comments

Data comments consist of the following

[construct]<data type> <read write specifier> <array specifier> [pointer]

Where:-

Construct If data has been identified as belonging to a construct
Data type One of the following:-
  • Word
  • Byte accessed
  • SCL static data
  • Float
  • Double
  • Extended Float
  • Packed Float
  • Extended Packed Float
  • Offset
  • Relative Offset
  • Address
  • Relocated Address
  • String
  • Inline string
  • Symbol
  • Symbol Word
  • Semaphore
  • Debug Data
  • ResourceFS Data
Read Write specifier -/- Not directly accessed
r/- Read from
-/w Written to
r/w Read and written
Array specifier If accessed as an array via an index register
Pointer If data is a valid offset or address, the data is dereferenced, indicated by ->

 

5 ASSEMBLY

Assembly produces an ObjAsm style output, but is not guaranteed to be immediately usable in ObjAsm. It contains the following elements:-

5.1 Header Comment

Gives the original executable file name, and the version of the tool used to produce the file, and compilation instructions.

5.2 Declarations

Any named SWI calls used in code are declared at the start of the assembly.

5.3 Labels

Labels are constructed using the following fields where identified:-

Code: L<Address>.[construct].[codeinfo | funcname]
Data: L<Address>.[construct].[datatype]

Where:-

Address 8-digit hex address
Construct Code or data construct
Codeinfo Information on entry point
Funcname Function name identified from C symbol (not unmangled for C++)
Datatype Type of data
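
As a rough illustration, a label of this form could be built as in the Python sketch below. The function make_label is hypothetical. The sketch assumes that the separating dot is omitted when a field is absent and that the hex digits are upper case; neither detail is stated above, and the construct and function names used are made-up examples.

```python
def make_label(address, construct=None, info=None):
    """Build L<Address>.[construct].[codeinfo | funcname | datatype].

    Optional fields are simply left out when not identified (an assumption).
    """
    parts = ["L%08X" % address]                  # 8-digit hex address
    parts += [field for field in (construct, info) if field]
    return ".".join(parts)


print(make_label(0x8000))
print(make_label(0x8010, info="main"))
print(make_label(0x9000, construct="Example", info="String"))
```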

5.4 Compilation

The assembly output should be compiled with ObjAsm using the -ABSolute flag, and linked by Link with the -bin flag. This will produce output files of type Absolute (&FF8), so for modules or other types the file type should be set appropriately.

 

6 STATISTICS

The following statistical information is gathered on the executable, and displayed both as the number of words, and percentage of the file.

Size in words Total size of the executable.
Code Words identified as valid code.
Surmised Amount of code which was surmised as opposed to directly identified.
Uses PSR Uses the processor status register, which may cause problems in 32-bit modes.
Not ARM2/3 Instruction is not available on ARM2 or ARM3 processors.
Not 32 bit Instruction does not have the same behaviour in 32-bit modes as in 26-bit modes.
Unpredictable Instruction does not produce predictable results on all ARM variants.
Data Words identified as data.
Surmised Amount of data which was surmised as opposed to directly identified.
Warnings Number of warnings produced by code analysis.
Unidentified Words that could not be identified as code or data.
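
The arithmetic behind such a summary can be sketched in Python as below. The figures are invented purely for the example and the function names are hypothetical. The sketch assumes that the sub-statistics (such as Surmised) are shown as a percentage of their parent statistic, as noted for release 0.31 in the release history, and that unidentified words are whatever remains after code and data.

```python
def percent(part, whole):
    """Percentage of whole, guarding against an empty denominator."""
    return 100.0 * part / whole if whole else 0.0


def summarise(total, code, code_surmised, data, data_surmised):
    """Return (name, words, percentage) rows for a simplified summary."""
    unidentified = total - code - data           # assumption: the remainder
    return [
        ("Size in words", total, 100.0),
        ("Code", code, percent(code, total)),
        ("  Surmised", code_surmised, percent(code_surmised, code)),
        ("Data", data, percent(data, total)),
        ("  Surmised", data_surmised, percent(data_surmised, data)),
        ("Unidentified", unidentified, percent(unidentified, total)),
    ]


# Invented figures for a 1000 word executable.
for name, words, pct in summarise(1000, 600, 150, 300, 30):
    print("%-14s %6d %5.1f%%" % (name, words, pct))
```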

If a target processor is specified, the following additional statistics are displayed.

Total Cycles Sum of all the instruction cycle counts in the executable.
Latencies Sum of all the instruction and register latencies identified.

 

7 RELEASE HISTORY

7.1 Previous Releases

0.01 08-Apr-2000 Initial revision
0.10 19-May-2000 Alpha release
0.11 23-May-2000 2nd Alpha release
0.12 23-May-2000 3rd Alpha release
0.13 29-Nov-2000 32 bit CLib support added
LDR PC,[R13],#4 recognised as function exit
0.14 04-Dec-2000 PullCall returns 0xFFFFFFFF if empty not 0
0.15 21-Apr-2001 xpand used for unsqueezing absolutes
unmodsqz used for squeezed modules
0.16 25-Apr-2001 AOF and ALF file checking added
0.17 23-Jul-2001 SWI parameter checking greatly expanded
0.18 10-Aug-2001 HTML and XML formatting added
0.19 14-Aug-2001 Custom format loading added
0.20 30-Aug-2001 Debug data and C++ name unmangling added
0.21 01-Sep-2001 First Beta release
Formatting fixes for DDF and DDL
0.22 27-Sep-2001 Fixed length string emble display
Wimp menu display fix
MessageTrans menu structure added
MarkData() and MarkString() will not override higher priority constructs
0.23 30-Oct-2001 XML tags formalised with introduction of a DTD
0.24 01-Jan-2002 Code expected warning addresses fixed
0.25 20-Jan-2002 OBJ_AREA relocation directive aware of instruction type
0.26 13-Feb-2002 Command line arguments rearranged
Output to file added, correct RISC OS filetype set
0.27 22-Feb-2002 Fixed warnings not being output to file
0.28 01-May-2002 Last word of code storage cleared before loading
SDT2 type (LDR SB/SH/H) instructions added
MUL and SDS invalid combinations rejected
Removed 26bit wrapping from PC relative address calcs
0.29 02-Jun-2002 32bit SCL jump table unknown entries analysed correctly
0.30 06-Jul-2002 WIMP structure priorities altered
Prefix added to EmbleWimpIconData
0.31 26-Jul-2002 Patch candidate statistics added
Sub stats now percentage of main stat not total words
0.32 01-Sep-2002 Asm warning links generated if not disassembling
Mutex flag sequence end correctly detected with R14 DP's
SDT target address calculation fixed by adding offset
FindAddr and FindValue terminates on BL or LDM Rx
Code start sequence finder recognises if PC is stacked
Code and data stack tracing added
Trailing .0 removed from FP constants for ObjAsm
FP precision added to FIX for ObjAsm
Title and warning lines prefixed with ';' in text output
Branch table endpoint label removal problem fixed
Mid string assembler label offsets generated
References generated from MarkData address
MakeString single and double quote bugs fixed
$$ quoted in assembler strings
Greater precision used for DCFS and DCFD values
Character following escape ignored in string
Only BASIC or unknown string types terminate on \r
Address word identification prevented in SCL modules
Relocation table targets and end points marked as ref'd
Detection of sequential strings improved
0.33 05-Sep-2002 FindAddr FindValue conditional instruction checking
0.34 21-Sep-2002 Explicit Immediate value & rotate used in ASM instructions
Non 32bit compliant SWI warnings added
Label prescan to ensure only valid asm labels are generated
0.35 04-Nov-2002 CPSR and SPSR variants made compatible with ObjAsm
mnemoaic_opts added for control of immediate and ADRs
Spurious OS_WriteI SWI variants removed from OSLib port
Invalid FP instructs with prec=3 and FIX const rejected
All code using NV condition marked as not 32 bit
Settype comment added to assembler output for modules
Library directory chunk date stamp displayed correctly
String detection prevented from running off end of data
Error block detection fixed
Sound SWI groups &40140-&40180-&401C0 dispatch corrected
Sound_Install voice installation header identified
Wimp Message block size validated by analysis
32bit module header analysis and display added
C symbols searched in reverse order for code detection
Non XScale instruction warning added for LDR Rd,[Rd],#0
Warning added for branching outside 64MB or with thumb bit
Non RISC OS: Bogus names for SWI &100 to &1FF removed
Non RISC OS: Sound SWI names added &40140-&40180-&401C0
Non RISC OS: osfile funcs dont convert extn to lower case
Non RISC OS: Territory_ConvertDateAndTime fixed
!ARMalyser auto runs 26 or 32 bit RISC OS executable
0.37 28-Nov-2002Code cautionary descriptions improved
0.38 29-Nov-2002Display of new cautions from state and decoding
0.39 05-Dec-2002Analysis state stacking added
0.40 06-Dec-200226/32bit guarding and conditional exit detection enhanced
0.41 08-Dec-2002AOF and ALF with filetype text as well as data allowed
Relocated addresses handled in register memory loads
Fix to label generation in warnings issued during analysis
CRT on control processor decode type corrected
0.42 22-Dec-2002 16 bit half word data type handling added
ARMv5 load/store double word instructions added
Wimp Icon Data buffer and validation marked as address
Long ADR recognised and real label with offset used
Label+offset links fixed in HTML output
OBJAREA_START marker placement validity check
0.43 31-Jan-2003 SDT Rm=Rn with write back is unpredictable
MarkData exits if length is > 16MB (or signed negative)
Caution if comparing PC against value of 1<<31 or more
ARMv5 Misc and DSP instructions added
ARMv5 instruction display conditional on 32bit executable
MessageTrans menu structure display corrected
RO4 fast service entry shown as address in service table
Instruction architecture version displayed
Shared C Library APCS-A recognised and marked as non 32bit
Co-pro data transfer instruction data marking corrected
ResourceFS file length displayed
Performance (XScale register latency) information added
LDR/STR register emulation offset sign calc corrected
Target processor command line option added
0.44 31-Jan-2003 Executable extended to encompass SCL static data areas
Shared C Library _swi and _swix vectors correctly labelled
Instruction issue latency added
Register result latencies adjusted
Instruction shift and register shift added to decode info
AIF header detection tightened
Register emu ignores loads from memory modified at runtime
Data not marked on unaligned loads and stores
Control terminated strings allowed in error blocks and various SWI parameters
ALIGN in assembler output replaced by DCB's of spare data
SpriteArea display fixed and enhanced
R14 address checked on SDT dynamic branch
NOP/No Banked caution not triggered on exclusive condition
End of file label added to assembler output
Register condition validity sep'd from value known flag
Instruction unpredictable info calculated in main decode
MSR immediate decoding and display fixed
AOF code detection and file structure display enhanced
Relative offset data type and display added
0.45 05-Apr-2003 Dynamic branches with R14=PC+4 treated as subroutine
Flag modifications tracked over dynamic branch subroutines
Instruction latency cycles added to register trace output
Total instruction latency cycles display option
Branch throughput latency corrected
OS_ReadLine32, OS_SubstituteArgs32, OS_HeapSort32 added
Mnemonic field length altered in assem and disassem modes
Brkpt marked as unpredictable if conditional
Error block marked on MSR CPSR_f,#Vflag
Width of SWI names in assembler EQU table mode increased
Detection of branch to thumb mode fixed
Removed banked register warning for PC
0.46 17-Aug-2003 PSR state invalidated after flag setting DP instruction
Z & C flag returning SWI knowledge added
Conditional after BL/SWI improved to use SWI BL flag info
Base field removed from discontiguous enum debug display
0.47 21-Mar-2004 Last word of module service table not marked as array
Object file code attribute explicitly flagged as 26bit
C flag set for OS_GBPB and OS_BGet
0.48 05-Jul-2004 Check for sensible SCL chunk id added
0.49 02-Mar-2005 Module header end detection improved
0.50 30-May-2005 Test for cond not compatible with flags set by instr
0.51 14-Nov-2005 PLD not treated as dynamic branch even if not ARMv5
BL/BLX & CLZ decode subtype numbers corrected
0.52 02-Dec-2005 Construct start markers added
Debug data bitfield type omission corrected
Labels used for debug data pointers in assembler output
Assembler & Disassembler output string generation improved
ARMv5 decoding suppressed unless target XScale specified
Fixed Rm state calculation for DP with register shift
Data backtrace location moved to pickup byte modifications
Dots substituted for characters 128 to 159 in HTML output
0.53 22-Dec-2005 Debug FileInfo structure not skipped if length field zero
Debug FileInfo formatting for 3 digit increments
Undefined rd in COPRO_CTRL CDT fixed
Help text updated
0.54 02-Feb-2006 Fixed reg latency reporting only when reg values known
0.55 13-Apr-2006 Performance analysis corrected and enhanced, ARM9 added
SWI number displayed for kernel_SWI and swi functions
Checking of command line arguments added

7.2 Issues With Current Release

7.3 Future Development

The following features are planned in future versions:-