Balance of Power: Stalking the Wild Defect
Dave Evans
Once again I found myself bleary eyed and fighting sleep, yet I continued to search for understanding. Having already struck down two possible causes for my enigma, I was now searching for new clues. I stubbornly refused to rest until I had flushed out the software defect.
My journey had begun modestly enough as I chanced upon a capricious crash in my software. I wondered which assumption or logic was at fault. Armed with only my low-level debugger, I began a hunt that would consume me into the dead of night. On this adventure through the dark Mac OS interior, I crossed rivers of mode switches, hopped islands of cross-TOC glue, and set snares in a jungle of native PowerPC code.
In this column I'll walk you through one facet of that relentless pursuit, pointing out the key landmarks I used to navigate and demonstrating the tools I used to survive. This should help guide you through your own future explorations of the innards of PowerPC code.
ON THE HUNT
Programming for a Power Macintosh may appear similar to your efforts on a 680x0-based Macintosh, but on close inspection you'll find PowerPC code far more interesting to debug. The relatively simple landscape of a 680x0 world gives way to confusing and insidious terrain on a Power Macintosh. Routine descriptors, dual assembly languages, and native glue are obstacles that impede your progress.
My subject was a crash that occurred when PowerPC applications called MaxApplZone. I was certain the problem was in my recent system software changes, but I needed to see what happened right before the crash to understand it. I started by setting a breakpoint when an application called MaxApplZone. (Later I'll describe a good technique for setting these breakpoints.) Then I traced through the system routine and looked for anything startling.
One application executed the following code just before calling MaxApplZone:
0093B260  mflr  r0
0093B264  stw   r31,-0x0004(SP)
0093B268  stw   r30,-0x0008(SP)
0093B26C  stw   r29,-0x000C(SP)
0093B270  stw   r0,0x0008(SP)
0093B274  stwu  SP,-0x0050(SP)
0093B278  lwz   r30,-0x3940(RTOC)
0093B27C  bl    MaxApplZone
The preamble to MaxApplZone saves registers R29 to R31 on the stack, creates a stack frame, and loads a local variable into R30 from the application's TOC globals before calling the routine. If we trace through this and then step into the bl (branch and link) instruction to MaxApplZone, we find the following:
0094CBFC  lwz    r12,-0x7E60(RTOC)
0094CC00  stw    RTOC,0x0014(SP)
0094CC04  lwz    r0,0x0000(r12)
0094CC08  lwz    RTOC,0x0004(r12)
0094CC0C  mtctr  r0
0094CC10  bctr
This code is standard cross-TOC glue. The caller of a routine has the responsibility to set the TOC register (RTOC) correctly for it. Routines imported from other code fragments will have a different TOC value than the application. The PowerPC Code Fragment Manager supplies the correct TOC value and the address of the imported routine in a pair of long words called a transition vector, or TVector. In this case, the TVector is stored as global data at the application's TOC value minus $7E60 bytes. This glue code loads the TVector's address in R12 and then uses that to load the address of the routine in R0 and the new TOC value. It uses the counter register and the bctr (branch to counter register) instruction to jump to the correct address, so the return address in the link register will not be changed.
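To make the six glue instructions concrete, here's a toy Python model of them. It's only a sketch: the Registers class, the dictionary standing in for memory, and the APP_TOC, LIB_TOC, TVECTOR, and STACK_POINTER values are all made up for illustration. Only the $7E60 offset, the 0x14 stack slot, and the routine address $40A0E30C come from the trace.

```python
from dataclasses import dataclass

# Hypothetical values, chosen only so the example has something to work on.
APP_TOC = 0x00960000        # the application's TOC value (assumed)
LIB_TOC = 0x00020000        # the imported fragment's TOC value (assumed)
TVECTOR = 0x0001B000        # where the TVector lives (assumed)
STACK_POINTER = 0x0162D000  # the caller's SP (assumed)
ROUTINE = 0x40A0E30C        # InterfaceLib's MaxApplZone, from the trace


@dataclass
class Registers:
    rtoc: int
    sp: int
    r0: int = 0
    r12: int = 0
    ctr: int = 0


def cross_toc_glue(regs, memory, tvector_offset):
    """Mirror the glue one instruction at a time; return the branch target."""
    regs.r12 = memory[regs.rtoc + tvector_offset]  # lwz r12,offset(RTOC)
    memory[regs.sp + 0x14] = regs.rtoc             # stw RTOC,0x0014(SP)
    regs.r0 = memory[regs.r12]                     # lwz r0,0x0000(r12)
    regs.rtoc = memory[regs.r12 + 4]               # lwz RTOC,0x0004(r12)
    regs.ctr = regs.r0                             # mtctr r0
    return regs.ctr                                # bctr (LR is untouched)


# Word-addressed toy memory: the TOC slot holds the TVector's address, and
# the TVector holds the routine address followed by the callee's TOC value.
memory = {
    APP_TOC - 0x7E60: TVECTOR,
    TVECTOR: ROUTINE,
    TVECTOR + 4: LIB_TOC,
}
regs = Registers(rtoc=APP_TOC, sp=STACK_POINTER)
target = cross_toc_glue(regs, memory, -0x7E60)
```

After the call, target is the routine's address, RTOC holds the callee's TOC value, and the caller's TOC value sits at SP + 0x14.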
After tracing through this glue code, we find ourselves in a different kind of glue. The MaxApplZone TVector points to a routine in the InterfaceLib code fragment, as listed below. On this computer, you can guess that the code fragment is in ROM because the address of the routine is very high, $40A0E30C in this case. Since the routine is in ROM, you can't effectively set a breakpoint at its beginning.
MaxApplZone
+00000 40A0E30C  mflr   r0
+00004 40A0E310  stwu   SP,-0x0040(SP)
+00008 40A0E314  stw    r0,0x0048(SP)
+0000C 40A0E318  lis    r0,0x0001
+00010 40A0E31C  subic  r5,r0,0x5F9D
+00014 40A0E320  lwz    r3,MaxApplZone(r0)
+00018 40A0E324  li     r4,0x3802
+0001C 40A0E328  bl     CallOSTrapUniversalProc
+00020 40A0E32C  lwz    RTOC,0x0014(SP)
+00024 40A0E330  lwz    r12,0x0048(SP)
+00028 40A0E334  addic  SP,SP,0x0040
+0002C 40A0E338  mtlr   r12
+00030 40A0E33C  blr
You might expect the real MaxApplZone routine to do much more than what appears in this routine. In fact, this routine is simply glue for the 680x0 A-trap table: it gets the address of MaxApplZone from that trap table (don't try this yourself without GetOSTrapAddress, kids) and then uses the CallOSTrapUniversalProc routine to call the address.
Most of the routines in InterfaceLib are actually just like this glue routine for the trap table. Because the routines go through the trap table, PowerPC applications will be affected by patches to the trap table; if they were to bind directly with the system code fragments, patches would be bypassed.
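The argument setup in the InterfaceLib glue listing builds R5 with a lis/subic pair rather than a single li. A short Python sketch of the register arithmetic shows what value results and suggests one plausible reason for the two-step sequence: li sign-extends its 16-bit immediate. The helper names are mine. My reading of the result, $A063, as the A-trap word for MaxApplZone is an assumption (the listing doesn't label it), although the same value is visible on the stack in Listing 2.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed quantity."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def li(imm16):
    """li rD,imm: load a sign-extended 16-bit immediate."""
    return sign_extend16(imm16) & MASK32


def lis(imm16):
    """lis rD,imm: load the immediate into the upper halfword."""
    return (sign_extend16(imm16) << 16) & MASK32


def subic(ra, imm16):
    """subic rD,rA,imm: subtract a sign-extended immediate from rA."""
    return (ra - sign_extend16(imm16)) & MASK32


r0 = lis(0x0001)          # lis   r0,0x0001
r5 = subic(r0, 0x5F9D)    # subic r5,r0,0x5F9D
r4 = li(0x3802)           # li    r4,0x3802

# A single li could not produce r5: its immediate would be sign-extended.
naive_r5 = li(0xA063)
```

Here r5 comes out as 0x0000A063, while the single-instruction attempt yields 0xFFFFA063.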
To continue with our tracing, we must step up to and then into CallOSTrapUniversalProc. This takes us to more cross-TOC glue:
40A06D10  lwz    r12,0x0008(RTOC)
40A06D14  stw    RTOC,0x0014(SP)
40A06D18  lwz    r0,0x0000(r12)
40A06D1C  lwz    RTOC,0x0004(r12)
40A06D20  mtctr  r0
40A06D24  bctr
Since CallOSTrapUniversalProc is part of the Mixed Mode Manager, it's implemented in the MixedMode code fragment. This cross-TOC glue finds the TVector for that routine and calls through to it. When we step through this and over the last bctr instruction, we're magically transferred not to the Mixed Mode Manager but instead to 680x0 code. Wow! MacsBug knew we were calling a universal procedure pointer, so it spared us the trace through the mode switch and took us directly to the location of the universal procedure pointer, in this case the following 680x0 code:
0031B160  MOVE.L  ApplLimit,D0
0031B164  MOVE.L  HeapEnd,D1
0031B168  SUB.L   D1,D0
0031B16A  MOVEQ   #$14,D1
0031B16C  CMP.L   D0,D1
0031B16E  BLE.S   *+$000A
0031B170  MOVEQ   #$00,D0
0031B172  MOVE.W  D0,MemErr
0031B176  RTS
0031B178  JMP     $00167FCC
From my experience tracing through the system, I'd guess that this 680x0 code is a patch on top of the real MaxApplZone, because it compares two numbers and in one of only two cases jumps to an absolute address. The absolute address was probably set when this code was installed as a patch, and it points to either the real MaxApplZone routine or another patch.
The patch appears to check whether the value of the ApplLimit low-memory global is within 20 bytes of the value of HeapEnd. If so, it simply returns noErr in the low-memory global MemErr without calling through to the real MaxApplZone. This patch is probably part of the system software, designed to fix a bug in the ROM without having to replace the entire real MaxApplZone routine.
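The patch's logic can be paraphrased in a few lines of Python. This is a sketch of my reading of the ten 680x0 instructions, not the system's source. The function name, the low_memory dictionary, and the real_max_appl_zone callback (standing in for the JMP target) are hypothetical, and the model ignores 32-bit overflow in the subtraction.

```python
NO_ERR = 0


def patched_max_appl_zone(appl_limit, heap_end, low_memory, real_max_appl_zone):
    """Return early when the heap already reaches ApplLimit, else call through."""
    d0 = appl_limit - heap_end        # MOVE.L ApplLimit,D0 / MOVE.L HeapEnd,D1 / SUB.L D1,D0
    if 0x14 <= d0:                    # MOVEQ #$14,D1 / CMP.L D0,D1 / BLE.S *+$000A
        return real_max_appl_zone()   # JMP $00167FCC
    low_memory["MemErr"] = NO_ERR     # MOVEQ #$00,D0 / MOVE.W D0,MemErr
    return NO_ERR                     # RTS


# Usage: record whether the "real" routine was reached in each case.
calls = []


def fake_real_routine():
    calls.append("real MaxApplZone")
    return NO_ERR


low_memory = {"MemErr": -1}
near = patched_max_appl_zone(0x00200010, 0x00200000, low_memory, fake_real_routine)
calls_after_near = list(calls)
far = patched_max_appl_zone(0x00300000, 0x00200000, low_memory, fake_real_routine)
```

With a 16-byte gap the patch returns without calling through. With a gap of 20 bytes or more it falls through to the real routine.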
Now if we trace through this patch and visit the absolute address $167FCC from the patch, we find the following:
No procedure name
00167FCC  *_MixedModeMagic
00167FCE  BTST   D3,D0
00167FD0  ORI.B  #$00,D0
00167FD4  ORI.B  #$00,D0
00167FD8  ORI.B  #$3002,D0
00167FDC  ORI.B  #$04,D1
00167FE0  ORI.B  #$8274,D7
00167FE4  ORI.B  #$00,D0
00167FE8  ORI.B  #$00,D0
00167FEC  ORI.B  #$A036,D0
Aha! This ugly disassembly is actually a routine descriptor in disguise. The _MixedModeMagic trap invokes the Mixed Mode Manager from 680x0 code, and it always appears at the beginning of a routine descriptor. Since this trap is at the beginning of each routine descriptor, you can simply construct a routine descriptor and then jump to it in 680x0 code. The drd dcmd in MacsBug will let you see this routine descriptor in a meaningful way. When I typed drd pc in this case, I saw the contents of Listing 1.
Listing 1. Displaying a routine descriptor
drd: 00167fcc
MixedModeMagic: 0xAAFE, version: #7, flags: 0x00 (NotIndexable)
LoadLoc: 0x00000000, reserved2: 0x00000000, SelectorInfo: 0x00 (No Selector)
Routine Count (zero-based): 0x0000 (#0)
---- Routine Record 0x0000 (#0) at 0x00167fd8 ----
ProcInfo: 0x00003002, Reserved1: 0x00000000, ISA: #1 (PowerPC)
Record Flags: 0x0004 (Absolute, IsPrepared, NativeISA, PassSelector, IsNotDefault)
ProcPtr: 0x00078274, offset: 0x00000000, selector: 0x00000000
Notice the number of fields displayed in Listing 1. For simple routine descriptors like this one, you'll only need to look at the ProcPtr entry on the last line of the display. More complicated routine descriptors have an array of routines, and you'll need to look for a passed selector to determine which one is actually used.
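If you'd like to see how the "ugly disassembly" maps onto the fields in Listing 1, here's a small Python sketch that decodes the same 32 bytes by hand. The bytes are reassembled from the 680x0 disassembly at $167FCC. The field names follow Listing 1, but the field widths are my assumption. They do add up to a 12-byte header, which agrees with the routine record starting at $167FD8.

```python
import struct

# The 32 bytes starting at $00167FCC, read back out of the disassembly.
raw = bytes.fromhex(
    "AAFE0700" "00000000" "00000000"                        # descriptor header
    "00003002" "00010004" "00078274" "00000000" "00000000"  # routine record 0
)

# Assumed big-endian layouts; names follow the drd output in Listing 1.
HEADER = struct.Struct(">HbBIBBh")   # magic, version, flags, LoadLoc,
                                     # reserved2, SelectorInfo, routine count
RECORD = struct.Struct(">IbBHIII")   # ProcInfo, Reserved1, ISA, record flags,
                                     # ProcPtr, offset, selector


def parse_routine_descriptor(data):
    """Decode the header and the first routine record into dictionaries."""
    magic, version, flags, load_loc, reserved2, selector_info, count = \
        HEADER.unpack_from(data, 0)
    proc_info, reserved1, isa, record_flags, proc_ptr, offset, selector = \
        RECORD.unpack_from(data, HEADER.size)
    header = {"magic": magic, "version": version, "flags": flags,
              "routine_count": count}
    record = {"proc_info": proc_info, "isa": isa,
              "record_flags": record_flags, "proc_ptr": proc_ptr,
              "selector": selector}
    return header, record


header, record = parse_routine_descriptor(raw)
```

The decoded proc_ptr is 0x00078274, the value drd reports on its last line.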
__MaxApplZone
+00000 000CA1A8  mflr    r0
+00004 000CA1AC  stwu    SP,-0x0040(SP)
+00008 000CA1B0  stw     r0,0x0048(SP)
+0000C 000CA1B4  bl      __HSetStateQ+0073C
+00010 000CA1B8  crmove  cr7_SO,cr7_SO
+00014 000CA1BC  extsh   r4,r3
+00018 000CA1C0  li      r3,0x0000
+0001C 000CA1C4  bl      SetEmulatorRegister
+00020 000CA1C8  lwz     RTOC,0x0014(SP)
+00024 000CA1CC  lwz     r12,0x0048(SP)
+00028 000CA1D0  addic   SP,SP,0x0040
+0002C 000CA1D4  mtlr    r12
+00030 000CA1D8  blr
This is the MaxApplZone routine in the Memory Manager. It appears to call more substantial subroutines when it branches to __HSetStateQ+0073C, but this is the actual routine.
WALKING BACK OUT
We've braved routine descriptors, glue, and patches to make it this far. I won't dive further into the Memory Manager for this illustration, but let's try an instructive walk back out from the MaxApplZone routine.
After tracing through this routine, we step over the blr instruction to branch back to the link register address. To our surprise we not only switch back to 680x0 emulation mode but we appear to be lost in darkness. The following 680x0 F-line instruction will be executed next:
No procedure name
0162D0A0  DC.W  $FE02
We switched back to 680x0 emulation mode because we're returning from the 680x0 patch call to a routine descriptor. Typing ip to disassemble at the current location shows what appears to be garbage, however:
No procedure name
0162D08C  NEGX.L  D7
0162D08E  EOR.W   D3,(A0)+
0162D090  BCHG    D0,-(A2)
0162D092  DC.W    $D0F0
0162D094  DC.W    $FFFF
0162D096  ORI.B   #$00,D4
0162D09A  DC.W    $FFFF
0162D09C  ORI.B   #$A063,D0
0162D0A0  *DC.W   $FE02
0162D0A2  ORI.B   #$9C,D0
0162D0A6  BCLR    D0,D3
0162D0A8  BCHG    D0,-(A2)
0162D0AA  ADD.B   D0,(A0)
Here's the secret: The $FE02 F-line instruction is very much like the _MixedModeMagic trap in that it can signal the transition from emulated 680x0 code to PowerPC code. Just as with the routine descriptor that we saw earlier, executing the $FE02 instruction will in this case cause us to switch back to PowerPC native mode and will bring us to a completely different address.
Truly perceptive readers might have noticed that the program counter at the $FE02 instruction is actually on the stack. Listing 2 shows a memory dump of the first 48 bytes of the stack at this time. Notice that the word at the beginning of the third line (at $162D0A0) is the $FE02 instruction we're about to execute.
Listing 2. The stack upon return to PowerPC code
Displaying memory from sp
0162D080  DDDD DDDD DDDD DDDD 7FFF 7FFF 4087 B758
0162D090  0162 D0F0 FFFF 0004 0000 FFFF 0000 A063
0162D0A0  FE02 0000 009C 0183 0162 D110 0000 3802
As we trace over that $FE02 instruction, we find ourselves back inside the InterfaceLib glue routine for MaxApplZone. Tracing through those last instructions finally takes us back to the application code where we started, as shown here:
MaxApplZone
+00020 40A0E32C  lwz    RTOC,0x0014(SP)
+00024 40A0E330  lwz    r12,0x0048(SP)
+00028 40A0E334  addic  SP,SP,0x0040
+0002C 40A0E338  mtlr   r12
+00030 40A0E33C  blr
No procedure name
0093B280  lwz  RTOC,0x0014(SP)
0093B284  li   r31,0x0001
Notice that when we returned to a previous code fragment, we immediately restored the TOC register to a value saved on the stack. Not only is the caller responsible for setting the TOC register before calling a routine, it's also responsible for restoring this register when the call returns.
This concludes our romp through the wilderness of the modern PowerPC environment. We traced from an application's code fragment, through the InterfaceLib fragment and then a patch in the trap table, to a routine descriptor for the real MaxApplZone routine, and ultimately back again.
CATCHING POWERPC CALLS
Earlier I glossed over how to set a breakpoint and catch an application as it calls MaxApplZone. Now I'll describe a good trick for doing this. The MacsBug debugger doesn't implement 680x0 A-trap break commands for PowerPC code yet. But you can easily mimic the A-trap break feature in PowerPC code, using the FindSym, PlayMem, and PPCJump MacsBug macros. You can use those macros if you install the file "PowerPC dcmds" (which you'll find on this issue's CD) into your MacsBug Preferences folder.
Say, as an example, that you'd like to catch all PowerPC code that calls the Toolbox routine ReleaseResource. PowerPC code fragments access this routine by importing its entry point from the InterfaceLib code fragment. Typing FindSym ReleaseResource on my Power Macintosh 8100 produces the following:
findsym: "ReleaseResource" "ReleaseResource" #1796 TVec 0001acc0 (40a15978,0001ea14) in "InterfaceLib"
FindSym is case sensitive. When looking for an entry in InterfaceLib, for example, you must spell the routine name exactly and capitalize letters perfectly; typing "releaseresource" rather than "ReleaseResource" will not work.
If I need to catch callers to ReleaseResource, I could then simply type brp 40a15978 to set a PowerPC breakpoint at the beginning of the InterfaceLib code. On the Power Macintosh 8100, however, this address is in ROM. Setting breakpoints in ROM is more difficult for MacsBug, which returns this message:
Warning: This requires stepping through each instruction
Your Macintosh might become unusable if MacsBug is forced to single-step through all the code. Because MacsBug can set a breakpoint in RAM with less difficulty, we'll now use the PlayMem and PPCJump macros to set an equivalent breakpoint in RAM.
PlayMem is a MacsBug variable that points to 512 bytes of scratch memory in RAM. The PPCJump macro expands to a set of PowerPC instructions for jumping to an absolute address. So the command
sl PlayMem PPCJump 40a15978
writes the following instructions to MacsBug's scratch memory:
lis    r0,40a1     | 3C0040A1
ori    r0,r0,5978  | 60005978
mtctr  r0          | 7C0903A6
bctr               | 4E800420
Now I'll replace the value of the TVector with our new code in scratch space, by typing sl 0001acc0 PlayMem. PowerPC code bound with InterfaceLib will now call my new code instead of ReleaseResource, but my code will then correctly pass control to ReleaseResource. Finally, typing brp PlayMem will set the PowerPC breakpoint we want.
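The four instruction words that PPCJump produced follow a simple pattern: the target address is split into halves that land in the immediate fields of lis and ori. The Python sketch below reproduces that pattern for any 32-bit address. The function name ppc_jump is mine, and this is a reconstruction that matches the words shown above, not the macro's actual definition.

```python
def ppc_jump(address):
    """Return the four PowerPC instruction words that jump to address via R0/CTR."""
    high = (address >> 16) & 0xFFFF
    low = address & 0xFFFF
    return [
        0x3C000000 | high,  # lis   r0,high    (upper halfword into R0)
        0x60000000 | low,   # ori   r0,r0,low  (ori zero-extends, so no fix-up)
        0x7C0903A6,         # mtctr r0
        0x4E800420,         # bctr             (branch without touching LR)
    ]


# Usage: the InterfaceLib entry point that FindSym reported.
words = ppc_jump(0x40A15978)
listing = [f"{word:08X}" for word in words]
```

For $40A15978 this yields 3C0040A1, 60005978, 7C0903A6, and 4E800420, the same words that landed in PlayMem. Because the stub ends in bctr rather than a branch-and-link, the link register still points at the original caller, which is why ipp lr works at the breakpoint.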
When PowerPC code tries to call the ReleaseResource trap via InterfaceLib, execution will stop at my breakpoint in PlayMem. At that point, typing ipp lr will list PowerPC instructions around the address in the link register, quickly showing me which code was calling the trap.
AFTER THE HUNT
Although I seriously doubt I would find enjoyment in hunting live animals, I've found the hunt for software defects truly rewarding. Some problems are a definite challenge, and I often learn something new about the Mac OS with each riddle solved. I hope that knowing the details of my pursuit will help you in your own future quests.
DAVE EVANS and fellow Apple engineer Rus Maxham took another adventure by motorcycle this summer. This time they journeyed to Utah and skirted the Great Salt Lake. Turning north, they discovered the beautiful and unspoiled vistas of Idaho. Cottonwood flower petals rained on them as they crossed into Washington. Hectares of wheat farms and the blustery Columbia River guided them to Oregon. One cracked tailpipe and two quarts of oil later, they finally arrived home in California.
Thanks to Nitin Ganatra, Pete Gontier, Jim Luther, and Alex Rangel for reviewing this column.