home *** CD-ROM | disk | FTP | other *** search
Text File | 2010-08-07 | 81.4 KB | 1,946 lines |
- #
- # $Id: 64doc,v 1.8 1994/06/03 19:50:04 jopi Exp $
- #
- # This file is part of Commodore 64 emulator
- # and Program Development System.
- #
- # See README for copyright notice
- #
- # This file contains documentation for 6502/6510/8500/8502 instruction set.
- #
- #
- # Written by
- # John West (john@ucc.gu.uwa.edu.au)
- # Marko M"akel"a (msmakela@kruuna.helsinki.fi)
- #
- #
- # $Log: 64doc,v $
- # Revision 1.8 1994/06/03 19:50:04 jopi
- # Patchlevel 2
- #
- # Revision 1.7 1994/04/15 13:07:04 jopi
- # 65xx Register descriptions added
- #
- # Revision 1.6 1994/02/18 16:09:36 jopi
- #
- # Revision 1.5 1994/01/26 16:08:37 jopi
- # X64 version 0.2 PL 1
- #
- # Revision 1.4 1993/11/10 01:55:34 jopi
- #
- # Revision 1.3 93/06/21 13:37:18 jopi
- # X64 version 0.2 PL 0
- #
- # Revision 1.2 93/06/21 13:07:15 jopi
- # *** empty log message ***
- #
- #
-
- Note: To extract the uuencoded ML programs in this article most
- easily you may use e.g. "uud" by Edwin Kremer <edwin@zlotty>,
- which extracts them all at once.
-
-
-
- Documentation for the NMOS 65xx/85xx Instruction Set
-
- 6510 Instructions by Addressing Modes
- 6502 Registers
- 6510/8502 Undocumented Commands
- Register selection for load and store
- Decimal mode in NMOS 6500 series
- 6510 features
- Different CPU types
- 6510 Instruction Timing
- How Real Programmers Acknowledge Interrupts
- Memory Management
- Autostart Code
- Notes
- References
-
-
- 6510 Instructions by Addressing Modes
-
- off- ++++++++++ Positive ++++++++++ ---------- Negative ----------
- set 00 20 40 60 80 a0 c0 e0 mode
-
- +00 BRK JSR RTI RTS NOP* LDY CPY CPX Impl/immed
- +01 ORA AND EOR ADC STA LDA CMP SBC (indir,x)
- +02 t t t t NOP*t LDX NOP*t NOP*t ? /immed
- +03 SLO* RLA* SRE* RRA* SAX* LAX* DCP* ISB* (indir,x)
- +04 NOP* BIT NOP* NOP* STY LDY CPY CPX Zeropage
- +05 ORA AND EOR ADC STA LDA CMP SBC Zeropage
- +06 ASL ROL LSR ROR STX LDX DEC INC Zeropage
- +07 SLO* RLA* SRE* RRA* SAX* LAX* DCP* ISB* Zeropage
-
- +08 PHP PLP PHA PLA DEY TAY INY INX Implied
- +09 ORA AND EOR ADC NOP* LDA CMP SBC Immediate
- +0a ASL ROL LSR ROR TXA TAX DEX NOP Accu/impl
- +0b ANC** ANC** ASR** ARR** ANE** LXA** SBX** SBC* Immediate
- +0c NOP* BIT JMP JMP () STY LDY CPY CPX Absolute
- +0d ORA AND EOR ADC STA LDA CMP SBC Absolute
- +0e ASL ROL LSR ROR STX LDX DEC INC Absolute
- +0f SLO* RLA* SRE* RRA* SAX* LAX* DCP* ISB* Absolute
-
- +10 BPL BMI BVC BVS BCC BCS BNE BEQ Relative
- +11 ORA AND EOR ADC STA LDA CMP SBC (indir),y
- +12 t t t t t t t t ?
- +13 SLO* RLA* SRE* RRA* SHA** LAX* DCP* ISB* (indir),y
- +14 NOP* NOP* NOP* NOP* STY LDY NOP* NOP* Zeropage,x
- +15 ORA AND EOR ADC STA LDA CMP SBC Zeropage,x
- +16 ASL ROL LSR ROR STX y) LDX y) DEC INC Zeropage,x
- +17 SLO* RLA* SRE* RRA* SAX* y) LAX* y) DCP* ISB* Zeropage,x
-
- +18 CLC SEC CLI SEI TYA CLV CLD SED Implied
- +19 ORA AND EOR ADC STA LDA CMP SBC Absolute,y
- +1a NOP* NOP* NOP* NOP* TXS TSX NOP* NOP* Implied
- +1b SLO* RLA* SRE* RRA* SHS** LAS** DCP* ISB* Absolute,y
- +1c NOP* NOP* NOP* NOP* SHY** LDY NOP* NOP* Absolute,x
- +1d ORA AND EOR ADC STA LDA CMP SBC Absolute,x
- +1e ASL ROL LSR ROR SHX**y) LDX y) DEC INC Absolute,x
- +1f SLO* RLA* SRE* RRA* SHA**y) LAX* y) DCP* ISB* Absolute,x
-
-
- ROR intruction is available on MC650x microprocessors after
- June, 1976.
-
-
- Legend:
-
- t Jams the machine
- *t Jams very rarely
- * Undocumented command
- ** Unusual operation
- y) indexed using Y instead of X
- () indirect instead of absolute
-
- Note that the NOP instructions do have other addressing modes
- than the implied addressing. The NOP instruction is just like
- any other load instruction, except it does not store the
- result anywhere nor affects the flags.
-
-
- 6502 Registers
-
- The NMOS 65xx processors are not ruined with too many registers. In
- addition to that, the registers are mostly 8-bit. Here is a brief
- description of each register:
-
- PC Program Counter
-
- This register points the address from which the next
- instruction byte (opcode or parameter) will be fetched.
- Unlike other registers, this one is 16 bits in length. The
- low and high 8-bit halves of the register are called PCL
- and PCH, respectively.
-
- The Program Counter may be read by pushing its value on
- the stack. This can be done either by jumping to a
- subroutine or by causing an interrupt.
-
- S Stack pointer
-
- The NMOS 65xx processors have 256 bytes of stack memory,
- ranging from $0100 to $01FF. The S register is a 8-bit
- offset to the stack page. In other words, whenever
- anything is being pushed on the stack, it will be stored
- to the address $0100+S.
-
- The Stack pointer can be read and written by transfering
- its value to or from the index register X (see below) with
- the TSX and TXS instructions.
-
- P Processor status
-
- This 8-bit register stores the state of the processor. The
- bits in this register are called flags. Most of the flags
- have something to do with arithmetic operations.
-
- The P register can be read by pushing it on the stack
- (with PHP or by causing an interrupt). If you only need to
- read one flag, you can use the branch instructions.
- Setting the flags is possible by pulling the P register
- from stack or by using the flag set or clear instructions.
-
- Following is a list of the flags, starting from the 8th
- bit of the P register (bit 7, value $80):
-
- N Negative flag
-
- This flag will be set after any arithmetic operations
- (when any of the registers A, X or Y is being loaded
- with a value). Generally, the N flag will be copied
- from the topmost bit of the register being loaded.
-
- Note that TXS (Transfer X to S) is not an arithmetic
- operation. Also note that the BIT instruction affects
- the Negative flag just like arithmetic operations.
- Finally, the Negative flag behaves differently in
- Decimal operations (see description below).
-
- V oVerflow flag
-
- Like the Negative flag, this flag is intended to be
- used with 8-bit signed integer numbers. The flag will
- be affected by addition and subtraction, the
- instructions PLP, CLV and BIT, and the hardware signal
- -SO. Note that there is no SEV instruction, even though
- the MOS engineers loved to use East European abbreviations,
- like DDR (Deutsche Demokratische Republik vs. Data
- Direction Register). (The Russian abbreviation for their
- former trade association COMECON is SEV.) The -SO
- (Set Overflow) signal is available on some processors,
- at least the 6502, to set the V flag. This enables
- response to an I/O activity in equal or less than
- three clock cycles when using a BVC instruction branching
- to itself ($50 $FE).
-
- The CLV instruction clears the V flag, and the PLP and
- BIT instructions copy the flag value from the bit 6 of
- the topmost stack entry or from memory.
-
- After a binary addition or subtraction, the V flag
- will be set on a sign overflow, cleared otherwise.
- What is a sign overflow? For instance, if you are
- trying to add 123 and 45 together, the result (168)
- does not fit in a 8-bit signed integer (upper limit
- 127 and lower limit -128). Similarly, adding -123 to
- -45 causes the overflow, just like subtracting -45
- from 123 or 123 from -45 would do.
-
- Like the N flag, the V flag will not be set as
- expected in the Decimal mode. Later in this document
- is a precise operation description.
-
- A common misbelief is that the V flag could only be
- set by arithmetic operations, not cleared.
-
- 1 unused flag
-
- To the current knowledge, this flag is always 1.
-
- B Break flag
-
- This flag is used to distinguish software (BRK)
- interrupts from hardware interrupts (IRQ or NMI). The
- B flag is always set except when the P register is
- being pushed on stack when jumping to an interrupt
- routine to process only a hardware interrupt.
-
- The official NMOS 65xx documentation claims that the
- BRK instruction could only cause a jump to the IRQ
- vector ($FFFE). However, if an NMI interrupt occurs
- while executing a BRK instruction, the processor will
- jump to the NMI vector ($FFFA), and the P register
- will be pushed on the stack with the B flag set.
-
- D Decimal mode flag
-
- This flag is used to select the (Binary Coded) Decimal
- mode for addition and subtraction. In most
- applications, the flag is zero.
-
- The Decimal mode has many oddities, and it operates
- differently on CMOS processors. See the description
- of the ADC, SBC and ARR instructions below.
-
- I Interrupt disable flag
-
- This flag can be used to prevent the processor from
- jumping to the IRQ handler vector ($FFFE) whenever the
- hardware line -IRQ is active. The flag will be
- automatically set after taking an interrupt, so that
- the processor would not keep jumping to the interrupt
- routine if the -IRQ signal remains low for several
- clock cycles.
-
- Z Zero flag
-
- The Zero flag will be affected in the same cases than
- the Negative flag. Generally, it will be set if an
- arithmetic register is being loaded with the value
- zero, and cleared otherwise. The flag will behave
- differently in Decimal operations.
-
- C Carry flag
-
- This flag is used in additions, subtractions,
- comparisons and bit rotations. In additions and
- subtractions, it acts as a 9th bit and lets you to
- chain operations to calculate with bigger than 8-bit
- numbers. When subtracting, the Carry flag is the
- negative of Borrow: if an overflow occurs, the flag
- will be clear, otherwise set. Comparisons are a
- special case of subtraction: they assume Carry flag
- set and Decimal flag clear, and do not store the
- result of the subtraction anywhere.
-
- There are four kinds of bit rotations. All of them
- store the bit that is being rotated off to the Carry
- flag. The left shifting instructions are ROL and ASL.
- ROL copies the initial Carry flag to the lowmost bit
- of the byte; ASL always clears it. Similarly, the ROR
- and LSR instructions shift to the right.
-
- A Accumulator
-
- The accumulator is the main register for arithmetic and
- logic operations. Unlike the index registers X and Y, it
- has a direct connection to the Arithmetic and Logic Unit
- (ALU). This is why many operations are only available for
- the accumulator, not the index registers.
-
- X Index register X
-
- This is the main register for addressing data with
- indices. It has a special addressing mode, indexed
- indirect, which lets you to have a vector table on the
- zero page.
-
- Y Index register Y
-
- The Y register has the least operations available. On the
- other hand, only it has the indirect indexed addressing
- mode that enables access to any memory place without
- having to use self-modifying code.
-
-
-
- 6510/8502 Undocumented Commands
-
- -- A brief explanation about what may happen while
- using don't care states.
-
-
- ANE $8B A = (A | #$EE) & X & #byte
- same as
- A = ((A & #$11 & X) | ( #$EE & X)) & #byte
-
- In real 6510/8502 the internal parameter #$11
- may occasionally be #$10, #$01 or even #$00.
- This occurs when the video chip starts DMA
- between the opcode fetch and the parameter fetch
- of the instruction. The value probably depends
- on the data that was left on the bus by the VIC-II.
-
- LXA $AB C=Lehti: A = X = ANE
- Alternate: A = X = (A & #byte)
-
- TXA and TAX have to be responsible for these.
-
- SHA $93,$9F Store (A & X & (ADDR_HI + 1))
- SHX $9E Store (X & (ADDR_HI + 1))
- SHY $9C Store (Y & (ADDR_HI + 1))
- SHS $9B SHA and TXS, where X is replaced by (A & X).
-
- Note: The value to be stored is copied also
- to ADDR_HI if page boundary is crossed.
-
- SBX $CB Carry and Decimal flags are ignored but the
- Carry flag will be set in substraction. This
- is due to the CMP command, which is executed
- instead of the real SBC.
-
- ARR $6B This instruction first performs an AND
- between the accumulator and the immediate
- parameter, then it shifts the accumulator to
- the right. However, this is not the whole
- truth. See the description below.
-
- Many undocumented commands do not use AND between registers, the CPU
- just throws the bytes to a bus simultaneously and lets the
- open-collector drivers perform the AND. I.e. the command called 'SAX',
- which is in the STORE section (opcodes $A0...$BF), stores the result
- of (A & X) by this way.
-
- More fortunate is its opposite, 'LAX' which just loads a byte
- simultaneously into both A and X.
-
-
- $6B ARR
-
- This instruction seems to be a harmless combination of AND and ROR at
- first sight, but it turns out that it affects the V flag and also has
- a special kind of decimal mode. This is because the instruction has
- inherited some properties of the ADC instruction ($69) in addition to
- the ROR ($6A).
-
- In Binary mode (D flag clear), the instruction effectively does an AND
- between the accumulator and the immediate parameter, and then shifts
- the accumulator to the right, copying the C flag to the 8th bit. It
- sets the Negative and Zero flags just like the ROR would. The ADC code
- shows up in the Carry and oVerflow flags. The C flag will be copied
- from the bit 6 of the result (which doesn't seem too logical), and the
- V flag is the result of an Exclusive OR operation between the bit 6
- and the bit 5 of the result. This makes sense, since the V flag will
- be normally set by an Exclusive OR, too.
-
- In Decimal mode (D flag set), the ARR instruction first performs the
- AND and ROR, just like in Binary mode. The N flag will be copied from
- the initial C flag, and the Z flag will be set according to the ROR
- result, as expected. The V flag will be set if the bit 6 of the
- accumulator changed its state between the AND and the ROR, cleared
- otherwise.
-
- Now comes the funny part. If the low nybble of the AND result,
- incremented by its lowmost bit, is greater than 5, the low nybble in
- the ROR result will be incremented by 6. The low nybble may overflow
- as a consequence of this BCD fixup, but the high nybble won't be
- adjusted. The high nybble will be BCD fixed in a similar way. If the
- high nybble of the AND result, incremented by its lowmost bit, is
- greater than 5, the high nybble in the ROR result will be incremented
- by 6, and the Carry flag will be set. Otherwise the C flag will be
- cleared.
-
- To help you understand this description, here is a C routine that
- illustrates the ARR operation in Decimal mode:
-
- unsigned
- A, /* Accumulator */
- AL, /* low nybble of accumulator */
- AH, /* high nybble of accumulator */
-
- C, /* Carry flag */
- Z, /* Zero flag */
- V, /* oVerflow flag */
- N, /* Negative flag */
-
- t, /* temporary value */
- s; /* value to be ARRed with Accumulator */
-
- t = A & s; /* Perform the AND. */
-
- AH = t >> 4; /* Separate the high */
- AL = t & 15; /* and low nybbles. */
-
- N = C; /* Set the N and */
- Z = !(A = (t >> 1) | (C << 7)); /* Z flags traditionally */
- V = (t ^ A) & 64; /* and V flag in a weird way. */
-
- if (AL + (AL & 1) > 5) /* BCD "fixup" for low nybble. */
- A = (A & 0xF0) | ((A + 6) & 0xF);
-
- if (C = AH + (AH & 1) > 5) /* Set the Carry flag. */
- A = (A + 0x60) & 0xFF; /* BCD "fixup" for high nybble. */
-
-
-
- $CB SBX X <- (A & X) - Immediate
-
- The 'SBX' ($CB) may seem to be very complex operation, even though it
- is a combination of the subtraction of accumulator and parameter, as
- in the 'CMP' instruction, and the command 'DEX'. As a result, both A
- and X are connected to ALU but only the subtraction takes place. Since
- the comparison logic was used, the result of subtraction should be
- normally ignored, but the 'DEX' now happily stores to X the value of
- (A & X) - Immediate. That is why this instruction does not have any
- decimal mode, and it does not affect the V flag. Also Carry flag will
- be ignored in the subtraction but set according to the result.
-
- Proof:
-
- begin 644 vsbx
- M`0@9$,D'GL(H-#,IJC(U-JS"*#0T*:HR-@```*D`H#V1*Z`_D2N@09$KJ0>%
- M^QBE^VEZJ+$KH#F1*ZD`2"BI`*(`RP`(:-B@.5$K*4#P`E@`H#VQ*SAI`)$K
- JD-Z@/[$K:0"1*Y#4J2X@TO\XH$&Q*VD`D2N0Q,;[$+188/_^]_:_OK>V
- `
- end
-
- and
-
- begin 644 sbx
- M`0@9$,D'GL(H-#,IJC(U-JS"*#0T*:HR-@```'BI`*!-D2N@3Y$KH%&1*ZD#
- MA?L8I?M*2)`#J1@LJ3B@29$K:$J0`ZGX+*G8R)$K&/BXJ?2B8\L)AOP(:(7]
- MV#B@3;$KH$\Q*Z!1\2L(1?SP`0!H1?TIM]#XH$VQ*SAI`)$KD,N@3[$K:0"1
- 9*Y#!J2X@TO\XH%&Q*VD`D2N0L<;[$))88-#X
- `
- end
-
- These test programs show if your machine is compatible with ours
- regarding the opcode $CB. The first test, vsbx, proves that SBX does
- not affect the V flag. The latter one, sbx, proves the rest of our
- theory. The vsbx test tests 33554432 SBX combinations (16777216
- different A, X and Immediate combinations, and two different V flag
- states), and the sbx test doubles that amount (16777216*4 D and C flag
- combinations). Both tests have run successfully on a C64 and a Vic20.
- They ought to run on C16, +4 and the PET series as well. The tests
- stop with BRK, if the opcode $CB does not work as expected. Successful
- operation ends in RTS. As the tests are very slow, they print dots on
- the screen while running so that you know that the machine has not
- jammed. On computers running at 1 MHz, the first test prints
- approximately one dot every four seconds and a total of 2048 dots,
- whereas the second one prints half that amount, one dot every seven
- seconds.
-
- If the tests fail on your machine, please let us know your processor's
- part number and revision. If possible, save the executable (after it
- has stopped with BRK) under another name and send it to us so that we
- know at which stage the program stopped.
-
- The following program is a Commodore 64 executable that Marko M"akel"a
- developed when trying to find out how the V flag is affected by SBX.
- (It was believed that the SBX affects the flag in a weird way, and
- this program shows how SBX sets the flag differently from SBC.) You
- may find the subroutine at $C150 useful when researching other
- undocumented instructions' flags. Run the program in a machine
- language monitor, as it makes use of the BRK instruction. The result
- tables will be written on pages $C2 and $C3.
-
- begin 644 sbx-c100
- M`,%XH`",#L&,$,&,$L&XJ8*B@LL7AOL(:(7\N#BM#L$M$,'M$L$(Q?OP`B@`
- M:$7\\`,@4,'N#L'0U.X0P=#/SB#0[A+!T,<``````````````)BJ\!>M#L$M
- L$,'=_\'0":T2P=W_PM`!8,K0Z:T.P2T0P9D`PID`!*T2P9D`PYD`!<C0Y``M
- `
- end
-
-
- Other undocumented instructions usually cause two preceding opcodes
- being executed. However 'NOP' seems to completely disappear from 'SBC'
- code $EB.
-
- The most difficult to comprehend are the rest of the instructions
- located on the '$0B' line.
-
- All the instructions located at the positive (left) side of this line
- should rotate either memory or the accumulator, but the addressing
- mode turns out to be immediate! No problem. Just read the operand, let
- it be ANDed with the accumulator and finally use accumulator
- addressing mode for the instructions above them.
-
- RELIGION_MODE_ON
- /* This part of the document is not accurate. You can
- read it as a fairy tale, but do not count on it when
- performing your own measurements. */
-
- The rest two instructions on the same line, called 'ANE' and 'LXA'
- ($8B and $AB respectively) often give quite unpredictable results.
- However, the most usual operation is to store ((A | #$ee) & X & #$nn)
- to accumulator. Note that this does not work reliably in a real 64!
- In the Commodore 128 the opcode $8B uses values 8C, CC, EE, and
- occasionally 0C and 8E for the OR instead of EE,EF,FE and FF used in
- the C64. With a C128 running at 2 MHz #$EE is always used. Opcode $AB
- does not cause this OR taking place on 8502 while 6510 always performs
- it. Note that this behaviour depends on processor and/or video chip
- revision.
-
- Let's take a closer look at $8B (6510).
-
- A <- X & D & (A | VAL)
-
- where VAL comes from this table:
-
- X high D high D low VAL
- even even --- $EE (1)
- even odd --- $EE
- odd even --- $EE
- odd odd 0 $EE
- odd odd not 0 $FE (2)
-
- (1) If the bottom 2 bits of A are both 1, then the LSB of the result may
- be 0. The values of X and D are different every time I run the test.
- This appears to be very rare.
- (2) VAL is $FE most of the time. Sometimes it is $EE - it seems to be random,
- not related to any of the data. This is much more common than (1).
-
- In decimal mode, VAL is usually $FE.
-
-
- Two different functions have been discovered for LAX, opcode $AB. One
- is A = X = ANE (see above) and the other, encountered with 6510 and
- 8502, is less complicated A = X = (A & #byte). However, according to
- what is reported, the version altering only the lowest bits of each
- nybble seems to be more common.
-
- What happens, is that $AB loads a value into both A and X, ANDing the
- low bit of each nybble with the corresponding bit of the old
- A. However, there are exceptions. Sometimes the low bit is cleared
- even when A contains a '1', and sometimes other bits are cleared. The
- exceptions seem random (they change every time I run the test). Oops -
- that was in decimal mode. Much the same with D=0.
-
- What causes the randomness? Probably it is that it is marginal logic
- levels - when too much wired-anding goes on, some of the signals get
- very close to the threshold. Perhaps we're seeing some of them step
- over it. The low bit of each nybble is special, since it has to cope
- with carry differently (remember decimal mode). We never see a '0'
- turn into a '1'.
-
- Since these instructions are unpredictable, they should not be used.
-
- There is still very strange instruction left, the one named SHA/X/Y,
- which is the only one with only indexed addressing modes. Actually,
- the commands 'SHA', 'SHX' and 'SHY' are generated by the indexing
- algorithm.
-
- While using indexed addressing, effective address for page boundary
- crossing is calculated as soon as possible so it does not slow down
- operation. As a result, in the case of SHA/X/Y, the address and data
- are processed at the same time making AND between them to take place.
- Thus, the value to be stored by SAX, for example, is in fact (A & X &
- (ADDR_HI + 1)). On page boundary crossing the same value is copied
- also to high byte of the effective address.
-
- RELIGION_MODE_OFF
-
- Register selection for load and store
-
- bit1 bit0 A X Y
- 0 0 x
- 0 1 x
- 1 0 x
- 1 1 x x
-
- So, A and X are selected by bits 1 and 0 respectively, while
- ~(bit1|bit0) enables Y.
-
- Indexing is determined by bit4, even in relative addressing mode,
- which is one kind of indexing.
-
- Lines containing opcodes xxx000x1 (01 and 03) are treated as absolute
- after the effective address has been loaded into CPU.
-
- Zeropage,y and Absolute,y (codes 10x1 x11x) are distinquished by bit5.
-
-
- Decimal mode in NMOS 6500 series
-
- Most sources claim that the NMOS 6500 series sets the N, V and Z
- flags unpredictably when performing addition or subtraction in decimal
- mode. Of course, this is not true. While testing how the flags are
- set, I also wanted to see what happens if you use illegal BCD values.
-
- ADC works in Decimal mode in a quite complicated way. It is amazing
- how it can do that all in a single cycle. Here's a C code version of
- the instruction:
-
- unsigned
- A, /* Accumulator */
- AL, /* low nybble of accumulator */
- AH, /* high nybble of accumulator */
-
- C, /* Carry flag */
- Z, /* Zero flag */
- V, /* oVerflow flag */
- N, /* Negative flag */
-
- s; /* value to be added to Accumulator */
-
- AL = (A & 15) + (s & 15) + C; /* Calculate the lower nybble. */
-
- AH = (A >> 4) + (s >> 4) + (AL > 15); /* Calculate the upper nybble. */
-
- if (AL > 9) AL += 6; /* BCD fixup for lower nybble. */
-
- Z = ((A + s + C) & 255 != 0); /* Zero flag is set just
- like in Binary mode. */
-
- /* Negative and Overflow flags are set with the same logic than in
- Binary mode, but after fixing the lower nybble. */
-
- N = (AH & 8 != 0);
- V = ((AH << 4) ^ A) & 128 && !((A ^ s) & 128);
-
- if (AH > 9) AH += 6; /* BCD fixup for upper nybble. */
-
- /* Carry is the only flag set after fixing the result. */
-
- C = (AH > 15);
- A = ((AH << 4) | (AL & 15)) & 255;
-
-
- The C flag is set as the quiche eaters expect, but the N and V flags
- are set after fixing the lower nybble but before fixing the upper one.
- They use the same logic than binary mode ADC. The Z flag is set before
- any BCD fixup, so the D flag does not have any influence on it.
-
- Proof: The following test program tests all 131072 ADC combinations in
- Decimal mode, and aborts with BRK if anything breaks this theory.
- If everything goes well, it ends in RTS.
-
- begin 600 dadc
- M 0@9",D'GL(H-#,IJC(U-JS"*#0T*:HR-@ 'BI&* A/N$_$B@+)$KH(V1
- M*Q@(I?PI#X7]I?LI#V7]R0J0 FD%J"D/A?VE^RGP9?PI\ C $) ":0^JL @H
- ML ?)H) &""@X:5\X!?V%_0AH*3W@ ! ""8"HBD7[$ JE^T7\, 28"4"H**7[
- M9?S0!)@) J@8N/BE^V7\V A%_= G:(3]1?W0(.;[T(?F_-"#:$D8\ )88*D=
- 0&&4KA?NI &4LA?RI.&S[ A%
-
- end
-
- All programs in this chapter have been successfully tested on a Vic20
- and a Commodore 64 and a Commodore 128D in C64 mode. They should run on
- C16, +4 and on the PET series as well. If not, please report the problem
- to Marko M"akel"a. Each test in this chapter should run in less than a
- minute at 1 MHz.
-
- SBC is much easier. Just like CMP, its flags are not affected by
- the D flag.
-
- Proof:
-
- begin 600 dsbc-cmp-flags
- M 0@9",D'GL(H-#,IJC(U-JS"*#0T*:HR-@ 'B@ (3[A/RB XH8:66HL2N@
- M09$KH$R1*XII::BQ*Z!%D2N@4)$K^#BXI?OE_-@(:(7].+BE^^7\"&A%_? !
- 5 .;[T./F_-#?RA"_8!@XCEY<7%
-
- end
-
-
- The only difference in SBC's operation in decimal mode from binary mode
- is the result-fixup:
-
- unsigned
- A, /* Accumulator */
- AL, /* low nybble of accumulator */
- AH, /* high nybble of accumulator */
-
- C, /* Carry flag */
- Z, /* Zero flag */
- V, /* oVerflow flag */
- N, /* Negative flag */
-
- s; /* value to be added to Accumulator */
-
- AL = (A & 15) - (s & 15) - !C; /* Calculate the lower nybble. */
-
- if (AL & 16) AL -= 6; /* BCD fixup for lower nybble. */
-
- AH = (A >> 4) - (s >> 4) - (AL & 16); /* Calculate the upper nybble. */
-
- if (AH & 16) AH -= 6; /* BCD fixup for upper nybble. */
-
- /* The flags are set just like in Binary mode. */
-
- C = (A - s - !C) & 256 != 0;
- Z = (A - s - !C) & 255 != 0;
- V = ((A - s - !C) ^ s) & 128 && (A ^ s) & 128;
- N = (A - s - !C) & 128 != 0;
-
- A = ((AH << 4) | (AL & 15)) & 255;
-
-
- Again Z flag is set before any BCD fixup. The N and V flags are set
- at any time before fixing the high nybble. The C flag may be set in any
- phase.
-
- Decimal subtraction is easier than decimal addition, as you have to
- make the BCD fixup only when a nybble overflows. In decimal addition,
- you had to verify if the nybble was greater than 9. The processor has
- an internal "half carry" flag for the lower nybble, used to trigger
- the BCD fixup. When calculating with legal BCD values, the lower nybble
- cannot overflow again when fixing it.
- So, the processor does not handle overflows while performing the fixup.
- Similarly, the BCD fixup occurs in the high nybble only if the value
- overflows, i.e. when the C flag will be cleared.
-
- Because SBC's flags are not affected by the Decimal mode flag, you
- could guess that CMP uses the SBC logic, only setting the C flag
- first. But the SBX instruction shows that CMP also temporarily clears
- the D flag, although it is totally unnecessary.
-
- The following program, which tests SBC's result and flags,
- contains the 6502 version of the pseudo code example above.
-
- begin 600 dsbc
- M 0@9",D'GL(H-#,IJC(U-JS"*#0T*:HR-@ 'BI&* A/N$_$B@+)$KH':1
- M*S@(I?PI#X7]I?LI#^7]L /I!1@I#ZBE_"GPA?VE^RGP"#CE_2GPL KI7RBP
- M#ND/.+ )*+ &Z0^P NE?A/T%_87]*+BE^^7\"&BH.+CXI?OE_-@(1?W0FVB$
- 8_47]T)3F^]">YOS0FFA)&- $J3C0B%A@
-
- end
-
- Obviously the undocumented instructions RRA (ROR+ADC) and ISB
- (INC+SBC) have inherited also the decimal operation from the official
- instructions ADC and SBC. The program droradc proves this statement
- for ROR, and the dincsbc test proves this for ISB. Finally,
- dincsbc-deccmp proves that ISB's and DCP's (DEC+CMP) flags are not
- affected by the D flag.
-
- begin 644 droradc
- M`0@9",D'GL(H-#,IJC(U-JS"*#0T*:HR-@```'BI&*``A/N$_$B@+)$KH(V1
- M*S@(I?PI#X7]I?LI#V7]R0J0`FD%J"D/A?VE^RGP9?PI\`C`$)`":0^JL`@H
- ML`?)H)`&""@X:5\X!?V%_0AH*3W@`!`""8"HBD7[$`JE^T7\,`28"4"H**7[
- M9?S0!)@)`J@XN/BE^R;\9_S8"$7]T"=HA/U%_=`@YOO0A>;\T(%H21CP`EA@
- 2J1T892N%^ZD`92R%_*DX;/L`
- `
- end
-
- begin 644 dincsbc
- M`0@9",D'GL(H-#,IJC(U-JS"*#0T*:HR-@```'BI&*``A/N$_$B@+)$KH':1
- M*S@(I?PI#X7]I?LI#^7]L`/I!1@I#ZBE_"GPA?VE^RGP"#CE_2GPL`KI7RBP
- M#ND/.+`)*+`&Z0^P`NE?A/T%_87]*+BE^^7\"&BH.+CXI?O&_.?\V`A%_="9
- ::(3]1?W0DN;[T)SF_-"8:$D8T`2I.-"&6\
- `
- end
-
- begin 644 dincsbc-deccmp
- M`0@9",D'GL(H-#,IJC(U-JS"*#0T*:HR-@```'B@`(3[A/RB`XH8:7>HL2N@
- M3Y$KH%R1*XII>ZBQ*Z!3D2N@8)$KBFE_J+$KH%61*Z!BD2OX.+BE^^;\Q_S8
- L"&B%_3BXI?OF_,?\"&A%_?`!`.;[T-_F_-#;RA"M8!@XCFYL;&Q\?GYP#8
- `
- end
-
-
-
- 6510 features
-
- o PHP always pushes the Break (B) flag as a `1' to the stack.
- Jukka Tapanim"aki claimed in C=lehti issue 3/89, on page 27 that the
- processor makes a logical OR between the status register's bit 4
- and the bit 8 of the stack pointer register (which is always 1).
- He did not give any reasons for this argument, and has refused to clarify
- it afterwards. Well, this was not the only error in his article...
-
- o Indirect addressing modes do not handle page boundary crossing at all.
- When the parameter's low byte is $FF, the effective address wraps
- around and the CPU fetches high byte from $xx00 instead of $xx00+$0100.
- E.g. JMP ($01FF) fetches PCL from $01FF and PCH from $0100,
- and LDA ($FF),Y fetches the base address from $FF and $00.
-
- o Indexed zero page addressing modes never fix the page address on
- crossing the zero page boundary.
- E.g. LDX #$01 : LDA ($FF,X) loads the effective address from $00 and $01.
-
- o The processor always fetches the byte following a relative branch
- instruction. If the branch is taken, the processor reads then the
- opcode from the destination address. If page boundary is crossed, it
- first reads a byte from the old page from a location that is bigger
- or smaller than the correct address by one page.
-
- o If you cross a page boundary in any other indexed mode,
- the processor reads an incorrect location first, a location that is
- smaller by one page.
-
- o Read-Modify-Write instructions write unmodified data, then modified
- (so INC effectively does LDX loc;STX loc;INX;STX loc)
-
- o -RDY is ignored during writes
- (This is why you must wait 3 cycles before doing any DMA --
- the maximum number of consecutive writes is 3, which occurs
- during interrupts except -RESET.)
-
- o Some undefined opcodes may give really unpredictable results.
-
- o All registers except the Program Counter remain unmodified after -RESET.
- (This is why you must preset D and I flags in the RESET handler.)
-
-
- Different CPU types
-
- The Rockwell data booklet 29651N52 (technical information about R65C00
- microprocessors, dated October 1984), lists the following differences between
- NMOS R6502 microprocessor and CMOS R65C00 family:
-
- 1. Indexed addressing across page boundary.
- NMOS: Extra read of invalid address.
- CMOS: Extra read of last instruction byte.
-
- 2. Execution of invalid op codes.
- NMOS: Some terminate only by reset. Results are undefined.
- CMOS: All are NOPs (reserved for future use).
-
- 3. Jump indirect, operand = XXFF.
- NMOS: Page address does not increment.
- CMOS: Page address increments and adds one additional cycle.
-
- 4. Read/modify/write instructions at effective address.
- NMOS: One read and two write cycles.
- CMOS: Two read and one write cycle.
-
- 5. Decimal flag.
- NMOS: Indeterminate after reset.
- CMOS: Initialized to binary mode (D=0) after reset and interrupts.
-
- 6. Flags after decimal operation.
- NMOS: Invalid N, V and Z flags.
- CMOS: Valid flag adds one additional cycle.
-
- 7. Interrupt after fetch of BRK instruction.
- NMOS: Interrupt vector is loaded, BRK vector is ignored.
- CMOS: BRK is executed, then interrupt is executed.
-
-
-
- 6510 Instruction Timing
-
- The NMOS 6500 series processors always perform at least two reads
- for each instruction. In addition to the operation code (opcode), they
- fetch the next byte. This is quite efficient, as most instructions are
- two or three bytes long.
-
- The processors also use a sort of pipelining. If an instruction does
- not store data in memory on its last cycle, the processor can fetch
- the opcode of the next instruction while executing the last cycle. For
- instance, the instruction EOR #$FF truly takes three cycles. On the
- first cycle, the opcode $49 will be fetched. During the second cycle
- the processor decodes the opcode and fetches the parameter #$FF. On
- the third cycle, the processor will perform the operation and store
- the result to accumulator, but simultaneously it fetches the opcode
- for the next instruction. This is why the instruction effectively
- takes only two cycles.
-
- The following tables show what happens on the bus while executing
- different kinds of instructions.
-
- Interrupts
-
- NMI and IRQ both take 7 cycles. Their timing diagram is much like
- BRK's (see below). IRQ will be executed only when the I flag is
- clear. IRQ and BRK both set the I flag, whereas the NMI does not
- affect its state.
-
- The processor will usually wait for the current instruction to
- complete before executing the interrupt sequence. To process the
- interrupt before the next instruction, the interrupt must occur
- before the last cycle of the current instruction.
-
- There is one exception to this rule: the BRK instruction. If a
- hardware interrupt (NMI or IRQ) occurs before the fourth (flags
- saving) cycle of BRK, the BRK instruction will be skipped, and
- the processor will jump to the hardware interrupt vector. This
- sequence will always take 7 cycles.
-
- You do not completely lose the BRK interrupt, the B flag will be
- set in the pushed status register if a BRK instruction gets
- interrupted. When BRK and IRQ occur at the same time, this does
- not cause any problems, as your program will consider it as a
- BRK, and the IRQ would occur again after the processor returned
- from your BRK routine, unless you cleared the interrupt source in
- your BRK handler. But the simultaneous occurrence of NMI and BRK
- is far more fatal. If you do not check the B flag in the NMI
- routine and subtract two from the return address when needed, the
- BRK instruction will be skipped.
-
- If the NMI and IRQ interrupts overlap each other (one interrupt
- occurs before fetching the interrupt vector for the other
- interrupt), the processor will most probably jump to the NMI
- vector in every case, and then jump to the IRQ vector after
- processing the first instruction of the NMI handler. This has not
- been measured yet, but the IRQ is very similar to BRK, and many
- sources state that the NMI has higher priority than IRQ. However,
- it might be that the processor takes the interrupt that comes
- later, i.e. you could lose an NMI interrupt if an IRQ occurred in
- four cycles after it.
-
- After finishing the interrupt sequence, the processor will start
- to execute the first instruction of the interrupt routine. This
- proves that the processor uses a sort of pipelining: it finishes
- the current instruction (or interrupt sequence) while reading the
- opcode of the next instruction.
-
- RESET does not push program counter on stack, and it lasts
- probably 6 cycles after deactivating the signal. Like NMI, RESET
- preserves all registers except PC.
-
-
- Instructions accessing the stack
-
- BRK
-
- # address R/W description
- --- ------- --- -----------------------------------------------
- 1 PC R fetch opcode, increment PC
- 2 PC R read next instruction byte (and throw it away),
- increment PC
- 3 $0100,S W push PCH on stack (with B flag set), decrement S
- 4 $0100,S W push PCL on stack, decrement S
- 5 $0100,S W push P on stack, decrement S
- 6 $FFFE R fetch PCL
- 7 $FFFF R fetch PCH
-
-
- RTI
-
- # address R/W description
- --- ------- --- -----------------------------------------------
- 1 PC R fetch opcode, increment PC
- 2 PC R read next instruction byte (and throw it away)
- 3 $0100,S R increment S
- 4 $0100,S R pull P from stack, increment S
- 5 $0100,S R pull PCL from stack, increment S
- 6 $0100,S R pull PCH from stack
-
-
- RTS
-
- # address R/W description
- --- ------- --- -----------------------------------------------
- 1 PC R fetch opcode, increment PC
- 2 PC R read next instruction byte (and throw it away)
- 3 $0100,S R increment S
- 4 $0100,S R pull PCL from stack, increment S
- 5 $0100,S R pull PCH from stack
- 6 PC R increment PC
-
-
- PHA, PHP
-
- # address R/W description
- --- ------- --- -----------------------------------------------
- 1 PC R fetch opcode, increment PC
- 2 PC R read next instruction byte (and throw it away)
- 3 $0100,S W push register on stack, decrement S
-
-
- PLA, PLP
-
- # address R/W description
- --- ------- --- -----------------------------------------------
- 1 PC R fetch opcode, increment PC
- 2 PC R read next instruction byte (and throw it away)
- 3 $0100,S R increment S
- 4 $0100,S R pull register from stack
-
-
- JSR
-
- # address R/W description
- --- ------- --- -------------------------------------------------
- 1 PC R fetch opcode, increment PC
- 2 PC R fetch low address byte, increment PC
- 3 $0100,S R internal operation (predecrement S?)
- 4 $0100,S W push PCH on stack, decrement S
- 5 $0100,S W push PCL on stack, decrement S
- 6 PC R copy low address byte to PCL, fetch high address
- byte to PCH
-
-
-
- Accumulator or implied addressing
-
- # address R/W description
- --- ------- --- -----------------------------------------------
- 1 PC R fetch opcode, increment PC
- 2 PC R read next instruction byte (and throw it away)
-
-
- Immediate addressing
-
- # address R/W description
- --- ------- --- ------------------------------------------
- 1 PC R fetch opcode, increment PC
- 2 PC R fetch value, increment PC
-
-
- Absolute addressing
-
- JMP
-
- # address R/W description
- --- ------- --- -------------------------------------------------
- 1 PC R fetch opcode, increment PC
- 2 PC R fetch low address byte, increment PC
- 3 PC R copy low address byte to PCL, fetch high address
- byte to PCH
-
-
- Read instructions (LDA, LDX, LDY, EOR, AND, ORA, ADC, SBC, CMP, BIT,
- LAX, NOP)
-
- # address R/W description
- --- ------- --- ------------------------------------------
- 1 PC R fetch opcode, increment PC
- 2 PC R fetch low byte of address, increment PC
- 3 PC R fetch high byte of address, increment PC
- 4 address R read from effective address
-
-
- Read-Modify-Write instructions (ASL, LSR, ROL, ROR, INC, DEC,
- SLO, SRE, RLA, RRA, ISB, DCP)
-
- # address R/W description
- --- ------- --- ------------------------------------------
- 1 PC R fetch opcode, increment PC
- 2 PC R fetch low byte of address, increment PC
- 3 PC R fetch high byte of address, increment PC
- 4 address R read from effective address
- 5 address W write the value back to effective address,
- and do the operation on it
- 6 address W write the new value to effective address
-
-
- Write instructions (STA, STX, STY, SAX)
-
- # address R/W description
- --- ------- --- ------------------------------------------
- 1 PC R fetch opcode, increment PC
- 2 PC R fetch low byte of address, increment PC
- 3 PC R fetch high byte of address, increment PC
- 4 address W write register to effective address
-
-
- Zero page addressing
-
- Read instructions (LDA, LDX, LDY, EOR, AND, ORA, ADC, SBC, CMP, BIT,
- LAX, NOP)
-
- # address R/W description
- --- ------- --- ------------------------------------------
- 1 PC R fetch opcode, increment PC
- 2 PC R fetch address, increment PC
- 3 address R read from effective address
-
-
- Read-Modify-Write instructions (ASL, LSR, ROL, ROR, INC, DEC,
- SLO, SRE, RLA, RRA, ISB, DCP)
-
- # address R/W description
- --- ------- --- ------------------------------------------
- 1 PC R fetch opcode, increment PC
- 2 PC R fetch address, increment PC
- 3 address R read from effective address
- 4 address W write the value back to effective address,
- and do the operation on it
- 5 address W write the new value to effective address
-
-
- Write instructions (STA, STX, STY, SAX)
-
- # address R/W description
- --- ------- --- ------------------------------------------
- 1 PC R fetch opcode, increment PC
- 2 PC R fetch address, increment PC
- 3 address W write register to effective address
-
- Zero page indexed addressing
-
- Read instructions (LDA, LDX, LDY, EOR, AND, ORA, ADC, SBC, CMP, BIT,
- LAX, NOP)
-
- # address R/W description
- --- --------- --- ------------------------------------------
- 1 PC R fetch opcode, increment PC
- 2 PC R fetch address, increment PC
- 3 address R read from address, add index register to it
- 4 address+I* R read from effective address
-
- Notes: I denotes either index register (X or Y).
-
- * The high byte of the effective address is always zero,
- i.e. page boundary crossings are not handled.
-
-
- Read-Modify-Write instructions (ASL, LSR, ROL, ROR, INC, DEC,
- SLO, SRE, RLA, RRA, ISB, DCP)
-
- # address R/W description
- --- --------- --- ---------------------------------------------
- 1 PC R fetch opcode, increment PC
- 2 PC R fetch address, increment PC
- 3 address R read from address, add index register X to it
- 4 address+X* R read from effective address
- 5 address+X* W write the value back to effective address,
- and do the operation on it
- 6 address+X* W write the new value to effective address
-
- Note: * The high byte of the effective address is always zero,
- i.e. page boundary crossings are not handled.
-
-
- Write instructions (STA, STX, STY, SAX)
-
- # address R/W description
- --- --------- --- -------------------------------------------
- 1 PC R fetch opcode, increment PC
- 2 PC R fetch address, increment PC
- 3 address R read from address, add index register to it
- 4 address+I* W write to effective address
-
- Notes: I denotes either index register (X or Y).
-
- * The high byte of the effective address is always zero,
- i.e. page boundary crossings are not handled.
-
-
- Absolute indexed addressing
-
- Read instructions (LDA, LDX, LDY, EOR, AND, ORA, ADC, SBC, CMP, BIT,
- LAX, LAE, SHS, NOP)
-
- # address R/W description
- --- --------- --- ------------------------------------------
- 1 PC R fetch opcode, increment PC
- 2 PC R fetch low byte of address, increment PC
- 3 PC R fetch high byte of address,
- add index register to low address byte,
- increment PC
- 4 address+I* R read from effective address,
- fix the high byte of effective address
- 5+ address+I R re-read from effective address
-
- Notes: I denotes either index register (X or Y).
-
- * The high byte of the effective address may be invalid
- at this time, i.e. it may be smaller by $100.
-
- + This cycle will be executed only if the effective address
- was invalid during cycle #4, i.e. page boundary was crossed.
-
-
- Read-Modify-Write instructions (ASL, LSR, ROL, ROR, INC, DEC,
- SLO, SRE, RLA, RRA, ISB, DCP)
-
- # address R/W description
- --- --------- --- ------------------------------------------
- 1 PC R fetch opcode, increment PC
- 2 PC R fetch low byte of address, increment PC
- 3 PC R fetch high byte of address,
- add index register X to low address byte,
- increment PC
- 4 address+X* R read from effective address,
- fix the high byte of effective address
- 5 address+X R re-read from effective address
- 6 address+X W write the value back to effective address,
- and do the operation on it
- 7 address+X W write the new value to effective address
-
- Notes: * The high byte of the effective address may be invalid
- at this time, i.e. it may be smaller by $100.
-
-
- Write instructions (STA, STX, STY, SHA, SHX, SHY)
-
- # address R/W description
- --- --------- --- ------------------------------------------
- 1 PC R fetch opcode, increment PC
- 2 PC R fetch low byte of address, increment PC
- 3 PC R fetch high byte of address,
- add index register to low address byte,
- increment PC
- 4 address+I* R read from effective address,
- fix the high byte of effective address
- 5 address+I W write to effective address
-
- Notes: I denotes either index register (X or Y).
-
- * The high byte of the effective address may be invalid
- at this time, i.e. it may be smaller by $100. Because
- the processor cannot undo a write to an invalid
- address, it always reads from the address first.
-
-
- Relative addressing (BCC, BCS, BNE, BEQ, BPL, BMI, BVC, BVS)
-
- # address R/W description
- --- --------- --- ---------------------------------------------
- 1 PC R fetch opcode, increment PC
- 2 PC R fetch operand, increment PC
- 3 PC R Fetch opcode of next instruction,
- If branch is taken, add operand to PCL.
- Otherwise increment PC.
- 4+ PC* R Fetch opcode of next instruction.
- Fix PCH. If it did not change, increment PC.
- 5! PC R Fetch opcode of next instruction,
- increment PC.
-
- Notes: The opcode fetch of the next instruction is included to
- this diagram for illustration purposes. When determining
- real execution times, remember to subtract the last
- cycle.
-
- * The high byte of Program Counter (PCH) may be invalid
- at this time, i.e. it may be smaller or bigger by $100.
-
- + If branch is taken, this cycle will be executed.
-
- ! If branch occurs to different page, this cycle will be
- executed.
-
-
- Indexed indirect addressing
-
- Read instructions (LDA, ORA, EOR, AND, ADC, CMP, SBC, LAX)
-
- # address R/W description
- --- ----------- --- ------------------------------------------
- 1 PC R fetch opcode, increment PC
- 2 PC R fetch pointer address, increment PC
- 3 pointer R read from the address, add X to it
- 4 pointer+X R fetch effective address low
- 5 pointer+X+1 R fetch effective address high
- 6 address R read from effective address
-
- Note: The effective address is always fetched from zero page,
- i.e. the zero page boundary crossing is not handled.
-
- Read-Modify-Write instructions (SLO, SRE, RLA, RRA, ISB, DCP)
-
- # address R/W description
- --- ----------- --- ------------------------------------------
- 1 PC R fetch opcode, increment PC
- 2 PC R fetch pointer address, increment PC
- 3 pointer R read from the address, add X to it
- 4 pointer+X R fetch effective address low
- 5 pointer+X+1 R fetch effective address high
- 6 address R read from effective address
- 7 address W write the value back to effective address,
- and do the operation on it
- 8 address W write the new value to effective address
-
- Note: The effective address is always fetched from zero page,
- i.e. the zero page boundary crossing is not handled.
-
- Write instructions (STA, SAX)
-
- # address R/W description
- --- ----------- --- ------------------------------------------
- 1 PC R fetch opcode, increment PC
- 2 PC R fetch pointer address, increment PC
- 3 pointer R read from the address, add X to it
- 4 pointer+X R fetch effective address low
- 5 pointer+X+1 R fetch effective address high
- 6 address W write to effective address
-
- Note: The effective address is always fetched from zero page,
- i.e. the zero page boundary crossing is not handled.
-
- Indirect indexed addressing
-
- Read instructions (LDA, EOR, AND, ORA, ADC, SBC, CMP)
-
- # address R/W description
- --- ----------- --- ------------------------------------------
- 1 PC R fetch opcode, increment PC
- 2 PC R fetch pointer address, increment PC
- 3 pointer R fetch effective address low
- 4 pointer+1 R fetch effective address high,
- add Y to low byte of effective address
- 5 address+Y* R read from effective address,
- fix high byte of effective address
- 6+ address+Y R read from effective address
-
- Notes: The effective address is always fetched from zero page,
- i.e. the zero page boundary crossing is not handled.
-
- * The high byte of the effective address may be invalid
- at this time, i.e. it may be smaller by $100.
-
- + This cycle will be executed only if the effective address
- was invalid during cycle #5, i.e. page boundary was crossed.
-
-
- Read-Modify-Write instructions (SLO, SRE, RLA, RRA, ISB, DCP)
-
- # address R/W description
- --- ----------- --- ------------------------------------------
- 1 PC R fetch opcode, increment PC
- 2 PC R fetch pointer address, increment PC
- 3 pointer R fetch effective address low
- 4 pointer+1 R fetch effective address high,
- add Y to low byte of effective address
- 5 address+Y* R read from effective address,
- fix high byte of effective address
- 6 address+Y R read from effective address
- 7 address+Y W write the value back to effective address,
- and do the operation on it
- 8 address+Y W write the new value to effective address
-
- Notes: The effective address is always fetched from zero page,
- i.e. the zero page boundary crossing is not handled.
-
- * The high byte of the effective address may be invalid
- at this time, i.e. it may be smaller by $100.
-
-
- Write instructions (STA, SHA)
-
- # address R/W description
- --- ----------- --- ------------------------------------------
- 1 PC R fetch opcode, increment PC
- 2 PC R fetch pointer address, increment PC
- 3 pointer R fetch effective address low
- 4 pointer+1 R fetch effective address high,
- add Y to low byte of effective address
- 5 address+Y* R read from effective address,
- fix high byte of effective address
- 6 address+Y W write to effective address
-
- Notes: The effective address is always fetched from zero page,
- i.e. the zero page boundary crossing is not handled.
-
- * The high byte of the effective address may be invalid
- at this time, i.e. it may be smaller by $100.
-
-
- Absolute indirect addressing (JMP)
-
- # address R/W description
- --- --------- --- ------------------------------------------
- 1 PC R fetch opcode, increment PC
- 2 PC R fetch pointer address low, increment PC
- 3 PC R fetch pointer address high, increment PC
- 4 pointer R fetch low address to latch
- 5 pointer+1* R fetch PCH, copy latch to PCL
-
- Note: * The PCH will always be fetched from the same page
- than PCL, i.e. page boundary crossing is not handled.
-
-
-
- How Real Programmers Acknowledge Interrupts
-
- With RMW instructions:
-
- ; beginning of combined raster/timer interrupt routine
- LSR $D019 ; clear VIC interrupts, read raster interrupt flag to C
- BCS raster ; jump if VIC caused an interrupt
- ... ; timer interrupt routine
-
- Operational diagram of LSR $D019:
-
- # data address R/W
- --- ---- ------- --- ---------------------------------
- 1 4E PC R fetch opcode
- 2 19 PC+1 R fetch address low
- 3 D0 PC+2 R fetch address high
- 4 xx $D019 R read memory
- 5 xx $D019 W write the value back, rotate right
- 6 xx/2 $D019 W write the new value back
-
- The 5th cycle acknowledges the interrupt by writing the same
- value back. If only raster interrupts are used, the 6th cycle
- has no effect on the VIC. (It might acknowledge also some
- other interrupts.)
-
-
-
- With indexed addressing:
-
- ; acknowledge interrupts to both CIAs
- LDX #$10
- LDA $DCFD,X
-
- Operational diagram of LDA $DCFD,X:
-
- # data address R/W description
- --- ---- ------- --- ---------------------------------
- 1 BD PC R fetch opcode
- 2 FD PC+1 R fetch address low
- 3 DC PC+2 R fetch address high, add X to address low
- 4 xx $DC0D R read from address, fix high byte of address
- 5 yy $DD0D R read from right address
-
-
- ; acknowledge interrupts to CIA 2
- LDX #$10
- STA $DDFD,X
-
- Operational diagram of STA $DDFD,X:
-
- # data address R/W description
- --- ---- ------- --- ---------------------------------
- 1 9D PC R fetch opcode
- 2 FD PC+1 R fetch address low
- 3 DC PC+2 R fetch address high, add X to address low
- 4 xx $DD0D R read from address, fix high byte of address
- 5 ac $DE0D W write to right address
-
-
- With branch instructions:
-
- ; acknowledge interrupts to CIA 2
- LDA #$00 ; clear N flag
- JMP $DD0A
- DD0A BPL $DC9D ; branch
- DC9D BRK ; return
-
- You need the following preparations to initialize the CIA registers:
-
- LDA #$91 ; argument of BPL
- STA $DD0B
- LDA #$10 ; BPL
- STA $DD0A
- STA $DD08 ; load the ToD values from the latches
- LDA $DD0B ; freeze the ToD display
- LDA #$7F
- STA $DC0D ; assure that $DC0D is $00
-
- Operational diagram of BPL $DC9D:
-
- # data address R/W description
- --- ---- ------- --- ---------------------------------
- 1 10 $DD0A R fetch opcode
- 2 91 $DD0B R fetch argument
- 3 xx $DD0C R fetch opcode, add argument to PCL
- 4 yy $DD9D R fetch opcode, fix PCH
- ( 5 00 $DC9D R fetch opcode )
-
-
- ; acknowledge interrupts to CIA 1
- LSR ; clear N flag
- JMP $DCFA
- DCFA BPL $DD0D
- DD0D BRK
-
- ; Again you need to set the ToD registers of CIA 1 and the
- ; Interrupt Control Register of CIA 2 first.
-
- Operational diagram of BPL $DD0D:
-
- # data address R/W description
- --- ---- ------- --- ---------------------------------
- 1 10 $DCFA R fetch opcode
- 2 11 $DCFB R fetch argument
- 3 xx $DCFC R fetch opcode, add argument to PCL
- 4 yy $DC0D R fetch opcode, fix PCH
- ( 5 00 $DD0D R fetch opcode )
-
-
- ; acknowledge interrupts to CIA 2 automagically
- ; preparations
- LDA #$7F
- STA $DD0D ; disable all interrupt sources of CIA2
- LDA $DD0E
- AND #$BE ; ensure that $DD0C remains constant
- STA $DD0E ; and stop the timer
- LDA #$FD
- STA $DD0C ; parameter of BPL
- LDA #$10
- STA $DD0B ; BPL
- LDA #$40
- STA $DD0A ; RTI/parameter of LSR
- LDA #$46
- STA $DD09 ; LSR
- STA $DD08 ; load the ToD values from the latches
- LDA $DD0B ; freeze the ToD display
- LDA #$09
- STA $0318
- LDA #$DD
- STA $0319 ; change NMI vector to $DD09
- LDA #$FF ; Try changing this instruction's operand
- STA $DD05 ; (see comment below).
- LDA #$FF
- STA $DD04 ; set interrupt frequency to 1/65536 cycles
- LDA $DD0E
- AND #$80
- ORA #$11
- LDX #$81
- STX $DD0D ; enable timer interrupt
- STA $DD0E ; start timer
-
- LDA #$00 ; To see that the interrupts really occur,
- STA $D011 ; use something like this and see how
- LOOP DEC $D020 ; changing the byte loaded to $DD05 from
- BNE LOOP ; #$FF to #$0F changes the image.
-
- When an NMI occurs, the processor jumps to Kernal code, which jumps to
- ($0318), which points to the following routine:
-
- DD09 LSR $40 ; clear N flag
- BPL $DD0A ; Note: $DD0A contains RTI.
-
- Operational diagram of BPL $DD0A:
-
- # data address R/W description
- --- ---- ------- --- ---------------------------------
- 1 10 $DD0B R fetch opcode
- 2 11 $DD0C R fetch argument
- 3 xx $DD0D R fetch opcode, add argument to PCL
- 4 40 $DD0A R fetch opcode, (fix PCH)
-
-
- With RTI:
-
- ; the fastest possible interrupt handler in the 6500 family
- ; preparations
- SEI
- LDA $01 ; disable ROM and enable I/O
- AND #$FD
- ORA #$05
- STA $01
- LDA #$7F
- STA $DD0D ; disable CIA 2's all interrupt sources
- LDA $DD0E
- AND #$BE ; ensure that $DD0C remains constant
- STA $DD0E ; and stop the timer
- LDA #$40
- STA $DD0C ; store RTI to $DD0C
- LDA #$0C
- STA $FFFA
- LDA #$DD
- STA $FFFB ; change NMI vector to $DD0C
- LDA #$FF ; Try changing this instruction's operand
- STA $DD05 ; (see comment below).
- LDA #$FF
- STA $DD04 ; set interrupt frequency to 1/65536 cycles
- LDA $DD0E
- AND #$80
- ORA #$11
- LDX #$81
- STX $DD0D ; enable timer interrupt
- STA $DD0E ; start timer
-
- LDA #$00 ; To see that the interrupts really occur,
- STA $D011 ; use something like this and see how
- LOOP DEC $D020 ; changing the byte loaded to $DD05 from
- BNE LOOP ; #$FF to #$0F changes the image.
-
- When an NMI occurs, the processor jumps to Kernal code, which
- jumps to ($0318), which points to the following routine:
-
- DD0C RTI
-
- How on earth can this clear the interrupts? Remember, the
- processor always fetches two successive bytes for each
- instruction.
-
- A little more practical version of this is redirecting the NMI
- (or IRQ) to your own routine, whose last instruction is JMP
- $DD0C or JMP $DC0C. If you want to confuse more, change the 0
- in the address to a hexadecimal digit different from the one
- you used when writing the RTI.
-
- Or you can combine the latter two methods:
-
- DD09 LSR $xx ; xx is any appropriate BCD value 00-59.
- BPL $DCFC
- DCFC RTI
-
- This example acknowledges interrupts to both CIAs.
-
-
- If you want to confuse the examiners of your code, you can use any
- of these techniques. Although these examples use no undefined opcodes,
- they do not necessarily run correctly on CMOS processors. However, the
- RTI example should run on 65C02 and 65C816, and the latter branch
- instruction example might work as well.
-
- The RMW instruction method has been used in some demos, others were
- developed by Marko M"akel"a. His favourite is the automagical RTI
- method, although it does not have any practical applications, except
- for some time dependent data decryption routines for very complicated
- copy protections.
-
-
-
- Memory Management
-
-
- The processor's point of view
-
- The Commodore 64 has access to more memory than its processor can
- directly handle. This is possible by banking the memory. There are
- five user configurable inputs that affect the banking. Three of them
- can be controlled by program, and the rest two serve as control lines
- on the memory expansion port.
-
- The 6510 MPU has an integrated I/O port with six I/O lines. This
- port is accessed through the memory locations 0 and 1. The location 0
- is the Data Direction Register for the Peripheral data Register, which
- is mapped to the other location. When a bit in the DDR is set, the
- corresponding PR bit controls the state of a corresponding Peripheral
- line as an output. When it is clear, the state of the Peripheral line
- is reflected by the Peripheral register. The Peripheral lines are
- numbered from 0 to 5, and they are mapped to the DDR and PR bits 0 - 5,
- respectively. The 8502 processor, which is used in the Commodore 128,
- has seven Peripheral lines in its I/O port. The pin P6 is connected to
- the ASC/CC key (Caps lock in English versions).
-
- The I/O lines have the following functions:
-
- Direction Line Function
- --------- ---- --------
- out P5 Cassette motor control. (0 = motor spins)
- in P4 Cassette sense. (0 = PLAY button depressed)
- out P3 Cassette write data.
- out P2 CHAREN
- out P1 HIRAM
- out P0 LORAM
-
- The default value of the DDR register is $2F, so all lines except
- Cassette sense are outputs. The default PR value is $37 (Datassette
- motor stopped, and all three memory management lines high).
- If you turn any memory management line to input, the external pull-up
- resistors make it to look like it is outputting logical "1". This
- is actually why the computer always switches the ROMs in upon startup:
- Pulling the -RESET line low resets all Peripheral lines to inputs,
- thus setting all three processor-driven memory management lines to
- logical "1" level.
-
- The two remaining memory management lines are -EXROM and -GAME on
- the cartridge port. Each line has a pull-up resistor, so the lines
- are "1" by default.
-
- Even though the memory banking has been implemented with a 82S100
- Programmable _Logic_ Array, there is only one control line that seems
- to behave logically at first sight, the -CHAREN line. It is mostly
- used to choose between I/O address space and the character generator
- ROM. The following memory map introduces the oddities of -CHAREN and
- the other memory management lines. It is based on the memory maps in
- the Commodore 64 Programmer's Reference Guide, pp. 263 - 267, and some
- errors and inaccuracies have been corrected.
-
- The leftmost column of the table contains addresses in hexadecimal
- notation. The columns aside it introduce all possible memory
- configurations. The default mode is on the left, and the absolutely
- most rarely used Ultimax game console configuration is on the right.
- (Has anybody ever seen any Ultimax games?) Each memory configuration
- column has one or more four-digit binary numbers as a title. The bits,
- from left to right, represent the state of the -LORAM, -HIRAM, -GAME
- and -EXROM lines, respectively. The bits whose state does not matter
- are marked with "x". For instance, when the Ultimax video game
- configuration is active (the -GAME line is shorted to ground), the
- -LORAM and -HIRAM lines have no effect.
-
-
- default 001x Ultimax
- 1111 101x 1000 011x 00x0 1110 0100 1100 xx01
- 10000
- ----------------------------------------------------------------------
- F000
- Kernal RAM RAM Kernal RAM Kernal Kernal Kernal ROMH(*
- E000
- ----------------------------------------------------------------------
- D000 IO/C IO/C IO/RAM IO/C RAM IO/C IO/C IO/C I/O
- ----------------------------------------------------------------------
- C000 RAM RAM RAM RAM RAM RAM RAM RAM -
- ----------------------------------------------------------------------
- B000
- BASIC RAM RAM RAM RAM BASIC ROMH ROMH -
- A000
- ----------------------------------------------------------------------
- 9000
- RAM RAM RAM RAM RAM ROML RAM ROML ROML(*
- 8000
- ----------------------------------------------------------------------
- 7000
-
- 6000
- RAM RAM RAM RAM RAM RAM RAM RAM -
- 5000
-
- 4000
- ----------------------------------------------------------------------
- 3000
-
- 2000 RAM RAM RAM RAM RAM RAM RAM RAM -
-
- 1000
- ----------------------------------------------------------------------
- 0000 RAM RAM RAM RAM RAM RAM RAM RAM RAM
- ----------------------------------------------------------------------
-
- *) Internal memory does not respond to write accesses to these
- areas.
-
-
- Legend: Kernal E000-FFFF Kernal ROM.
-
- IO/C D000-DFFF I/O address space or Character
- generator ROM, selected by
- -CHAREN. If the CHAREN bit is
- clear, the character generator
- ROM will be selected. If it is
- set, the I/O chips are
- accessible.
-
- IO/RAM D000-DFFF I/O address space or RAM,
- selected by -CHAREN. If the
- CHAREN bit is clear, the
- character generator ROM will
- be selected. If it is set, the
- internal RAM is accessible.
-
- I/O D000-DFFF I/O address space.
- The -CHAREN line has no effect.
-
- BASIC A000-BFFF BASIC ROM.
-
- ROMH A000-BFFF or External ROM with the -ROMH line
- E000-FFFF connected to its -CS line.
-
- ROML 8000-9FFF External ROM with the -ROML line
- connected to its -CS line.
-
- RAM various ranges Commodore 64's internal RAM.
-
- - 1000-7FFF and Open address space.
- A000-CFFF The Commodore 64's memory chips
- do not detect any memory accesses
- to this area except the VIC-II's
- DMA and memory refreshes.
-
- NOTE: Whenever the processor tries to write to any ROM area
- (Kernal, BASIC, CHAROM, ROML, ROMH), the data will get
- "through the ROM" to the C64's internal RAM.
-
- For this reason, you can easily copy data from ROM to RAM,
- without any bank switching. But implementing external
- memory expansions without DMA is very hard, as you have to
- use a 256 byte window on the I/O1 or I/O2 area, like
- GEORAM, or the Ultimax memory configuration, if you do not
- want the data to be written both to internal and external
- RAM.
-
- However, this is not true for the Ultimax video game
- configuration. In that mode, the internal RAM ignores all
- memory accesses outside the area $0000-$0FFF, unless they
- are performed by the VIC, and you can write to external
- memory at $1000-$CFFF and $E000-$FFFF, if any, without
- changing the contents of the internal RAM.
-
-
- A note concerning the I/O area
-
- The I/O area of the Commodore 64 is divided as follows:
-
- Address range Owner
- ------------- -----
- D000-D3FF MOS 6567/6569 VIC-II Video Interface Controller
- D400-D7FF MOS 6581 SID Sound Interface Device
- D800-DBFF Color RAM (only lower nybbles are connected)
- DC00-DCFF MOS 6526 CIA Complex Interface Adapter #1
- DD00-DDFF MOS 6526 CIA Complex Interface Adapter #2
- DE00-DEFF User expansion #1 (-I/O1 on Expansion Port)
- DF00-DFFF User expansion #2 (-I/O2 on Expansion Port)
-
- As you can see, the address ranges for the chips are much larger
- than required. Because of this, you can access the chips through
- multiple memory areas. The VIC-II appears in its window every $40
- addresses. For instance, the addresses $D040 and $D080 are both mapped
- to the Sprite 0 X co-ordinate register. The SID has one register
- selection line less, thus it appears at every $20 bytes. The CIA chips
- have only 16 registers, so there are 16 copies of each in their memory
- area.
-
- However, you should not use other addresses than those specified by
- Commodore. For instance, the Commodore 128 mapped its additional I/O
- chips to this same memory area, and the SID responds only to the
- addresses D400-D4FF, also when in C64 mode. And the Commodore 65, or
- the C64DX, which unfortunately did not make its way to the market,
- could narrow the memory window reserved for its CSG 4567 VIC-III.
-
-
- The video chip
-
- The MOS 6567/6569 VIC-II Video Interface Controller has access to
- only 16 kilobytes at a time. To enable the VIC-II to access the whole
- 64 kB memory space, the main memory is divided to four banks of 16 kB
- each. The lines PA0 and PA1 of the second CIA are the inverse of the
- virtual VIC-II address lines VA14 and VA15, respectively. To select a
- VIC-II bank other than the default, you must program the CIA lines to
- output the desired bit pair. For instance, the following code selects
- the memory area $4000-$7FFF (bank 1) for the video controller:
-
- LDA $DD02 ; Data Direction Register A
- ORA #$03 ; Set pins PA0 and PA1 to outputs
- STA $DD02
- LDA $DD00
- AND #$FC ; Mask the lowmost bit pair off
- ORA #$02 ; Select VIC-II bank 1 (the inverse of binary 01 is 10)
- STA $DD00
-
- Why should you set the pins to outputs? Hardware RESET resets all
- I/O lines to inputs, and thanks to the CIA's internal pull-up
- resistors, the inputs actually output logical high voltage level. So,
- upon -RESET, the video bank 0 is selected automatically, and older
- Kernals could leave it uninitialized.
-
- Note that the VIC-II always fetches its information from the
- internal RAM, totally ignoring the memory configuration lines. There
- is only one exception to this rule: The character generator ROM.
- Unless the Ultimax mode is selected, VIC-II "sees" character generator
- ROM in the memory areas 1000-1FFF and 9000-9FFF. If the Ultimax
- configuration is active, the VIC-II fetches all data from the internal
- RAM.
-
-
- Accessing the memory places 0 and 1
-
- Although the addresses 0 and 1 of the processor are hard-wired to
- its on-chip I/O port registers, you can access the memory places 0 and
- 1. The video chip always reads from RAM (or character generator ROM),
- so you can use it to read also from 0 and 1. Enable the bit-map screen
- and set the start address of the graphics screen to 0. Now you can see
- these two memory locations in the upper left corner. Alternatively,
- you could set the character generator start address to 0, in which
- case you would see these locations in @ characters (code 0). Or, you
- can activate a sprite with start address 0. Whichever method you
- choose, you can read these locations with sprite collision registers.
- Define a sprite consisting of only one dot, and move it to read the 8
- bits of each byte with the sprite to sprite or sprite to background
- collision registers.
-
- But how can you write to these locations? If you execute the command
- POKE 53265,59, you will see that the memory place 1 changes its value
- wildly. If you disable the interrupts (POKE53664,127), it will remain
- stable. How is this possible? When the processor writes to 0 or 1, it
- will put the address on the address bus and set the R/-W line to indicate
- a write cycle, but it does not put the data on the data bus. Thus, it
- writes "random" data. Of course this data is not truly random. Actually
- it is something that the video chip left on the bus on its clock half.
- So, if you want to write a certain value on 0 or 1, you have to make the
- video chip to read that value just before the store cycle. This requires
- very accurate timing, but it can be achieved even with a carefully
- written BASIC program. Just wait the video chip to be in the top or
- bottom border and the beam to be in the middle of the screen (not in the
- side borders). At this area, the video chip will always read the last
- byte of the video bank (by default $3FFF). Now, if you store anything to
- the I/O port registers 0 or 1 while the video chip is refreshing this
- screen area, the contents of the memory place $3FFF will be written to
- the respective memory place (0 or 1). Note that this trick does not work
- reliably on all computers. You need good RF protection, as the data bus
- will not be driven at all when the value remains on it.
-
- On the C128 in its 2 MHz mode, you can write to the memory places
- with an easier kludge. Just make sure that the video chip is not
- performing the memory refresh (as it would slow down to 1 MHz in that
- case), and use some instruction that reads from a proper memory location
- before writing to 0 or 1. Indexed zero-page addressing modes are good
- for it. I tested this trick with LDX#1 followed by STA $FF,X. As you
- can read from the instruction timing section of this document, the
- instruction first reads from $FF (the base address) and then writes to 0.
- The timing can be done with a simple LDA$D012:CMP$D012:BEQ *-3 loop.
- But in the C128 mode you can relocate the stack page to zero page, so
- this trick is not really useful.
-
- You can also read the memory places 0 and 1 much faster than with
- sprite collisions. Just make the video chip to read from 0 or 1, and
- then read from non-connected address space ($DE00-$DFFF on a stock C64;
- also $D700-$D7FF on C128's). Actually, you can produce a complete map
- of the video timing on your computer by making a loop that reads from
- open address space, pausing one frame and one cycle in between. And if
- you are into copy protections, you could write a program on the open
- address space. Just remember that there must be a byte on the bus for
- each clock cycle.
-
- These tricks unfortunately do not work reliably on all units. So far
- I have had the opportunity to try it on three computers, two of which
- were Commodore 128 DCR's (C128's housed in metal case with a 1571 floppy
- disk drive, whose controller is integrated on the mother board). One
- C128DCR drove some of its data bits too heavily to high state. No wonder,
- since its housing consisted of some newspapers spread on the floor.
-
-
-
- Autostart Code
-
- Although this document concentrates on hardware, there is one thing
- that you must know about the firmware to get complete control over
- your computer. As the Commodore 64 always switches the ROMs on upon
- -RESET, you cannot relocate the RESET vector by writing something in
- RAM. Instead, you have to use the Autostart code that will be
- recognized by the KERNAL ROM. If the memory places from $8004 through
- $8008 contain the PETSCII string 'CBM80' (C3 C2 CD 38 30), the RESET
- routine jumps to ($8000) and the default NMI handler jumps to ($8002).
-
- Some programs that load into RAM take advantage of this and don't
- let the machine to be reset. You don't have to modify the ROM to get
- rid of this annoying behaviour. Simply ground the -EXROM line for the
- beginning of the RESET sequence.
-
-
-
- Notes
-
- See the MCS 6500 Microcomputer Family Programming Manual for less
- information.
-
-
- References:
- C64 Memory Maps C64 Programmer's Reference Guide, pp. 262-267
- C64 Schematic Diagram
- 6510 Block Diagram C64 Programmer's Reference Guide, p. 404
- Instruction Set C64 Programmer's Reference Guide, pp. 254-255, 416-417
- C64/128 Real Programmer's Revenge Guide
- C=Lehti magazine 4/87
-