-
-
- THE LOW-DOWN ON LOADALL:
- EXCERPTS FROM THE BOOK
-
-
- THE HYPER-SPACE NAVIGATOR'S GUIDE
- by
- Terrance E. Hodgins
-
- copyright (C) 1990 by Terrance E. Hodgins,
- All rights reserved.
-
-
- Semi-Intelligent Systems
- PO BOX 4492
- ALBUQUERQUE, NM 87196
-
- Compuserve: 76416,553
- Internet: 76416.553@compuserve.com
- Internet: terry%scopes.unm.edu@ariel.unm.edu
-
-
- And now the boring legal stuff:
-
-
- This document uses the following trademarks:
-
- AST is a registered trademark of AST Research, Inc.
-
- IBM, PC-DOS, PC/XT, and PC/AT are registered trademarks of International Busi-
- ness Machines Corporation.
-
- Intel is a registered trademark of Intel Corporation.
-
- Lotus is a registered trademark of Lotus Development Corporation.
-
- Microsoft, MS-DOS, Windows '286, and OS/2 are registered trademarks of Micro-
- soft Corporation.
-
- Semi-Intelligent Systems, The Hyper-Space Library, Get-High, HI-DOS, High
- Code, Xcode, and Mode Code are registered trademarks of Semi-Intelligent
- Systems.
-
- Unix is a registered trademark of AT&T, Inc.
-
-
-
-
- Disclaimer of Warranty
-
- TERRANCE E. HODGINS, AND SEMI-INTELLIGENT SYSTEMS, EXCLUDE ANY AND ALL
- IMPLIED WARRANTIES, INCLUDING WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
- PARTICULAR PURPOSE.
-
- NEITHER TERRANCE E. HODGINS, NOR SEMI-INTELLIGENT SYSTEMS, MAKE ANY
- WARRANTY OF REPRESENTATION, EITHER EXPRESS OR IMPLIED, WITH RESPECT TO THESE
- PROGRAMS, THEIR QUALITY, PERFORMANCE, MERCHANTABILITY, OR FITNESS FOR A PAR-
- TICULAR PURPOSE.
-
- NEITHER TERRANCE E. HODGINS, NOR SEMI-INTELLIGENT SYSTEMS, SHALL HAVE
- ANY LIABILITY FOR SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING OUT OF
- OR RESULTING FROM THE USE OR MODIFICATION OF THESE PROGRAMS.
-
- THE USE OF THE 80286 LOADALL INSTRUCTION IS INHERENTLY DANGEROUS, AND
- CAN RESULT IN PROGRAM CRASHES, OR RUN-AWAY PROGRAMS, WHICH CAN ALTER, DAMAGE,
- OR DESTROY COMPUTER DATA, AND WHICH CAN DAMAGE OR DESTROY COMPUTER HARDWARE.
- USE ONLY AT YOUR OWN RISK.
-
-
-
-
-
- Introduction
-
-
-
- Yes, there really is an unpublicized, almost secret, instruction in
- the 80286, which has the ability to do several supposedly impossible things.
-
- It is called Loadall.
-
- What Loadall does is completely load all the registers of the 80286
- from a table starting at 80:0 in low memory. I do mean ALL registers: every
- register you ever heard of, and a few you haven't, and also the "invisible"
- internal registers which are NOT OTHERWISE programmable. Executing a Loadall
- nearly completely re-defines the CPU's state.
-
- This means that it is a great warp, or hyper-space, instruction:
- executing a Loadall will jump you to someplace new, and leave you with your
- choice of register contents, status and mode settings, and memory segment
- mappings, allowing you to have your segments anywhere in the 16-megabyte
- address space of the 80286. Those of you who are familiar with Unix and C
- programming will be immediately reminded of the "longjmp" routine. Loadall
- is the ultimate long-jump.
-
- This is possible in REAL mode. You do NOT have to go into protected
- mode to get at memory above 1 Megabyte on the AT. Which also means that you
- don't have to then go through all kinds of odd-ball gyrations to get back out
- of protected mode. And better yet, this instruction will work in both REAL
- and PROTECTED mode.
-
- Intel included the Loadall instruction in the 80286 for chip testing
- (they can throw the CPU into any state, and see if it then does what it is
- supposed to do), but there are much better uses for it than that (in my
- not-so-humble opinion).
-
- The power of being able to re-program ANY and ALL of the registers of
- the CPU with one single instruction opens up a whole new world of possibili-
- ties.
-
- Including, but not limited to:
-
- getting at all the memory in your machine at will, even if it is
- addressed above 1 megabyte, from real mode.
-
- executing real-mode programs in ram above one megabyte.
-
- installing a second operating-system-like program, or command proces-
- sor, or shell, in memory above 1 megabyte, and alternating between that and
- DOS.
-
- installing most of the guts of custom TSR's, shells, and device-driv-
- ers in ram above 1 megabyte (freeing up precious base memory), leaving in low
- memory only the stubs to call the code upstairs.
-
- writing very large programs, which are "split", and have half the
- program residing in the low-down 640K, and the other half up in extended
- memory, and running in either real or protected mode.
-
- installing large protected-mode programs in extended memory, where
- they will not conflict with, or crowd out DOS, and ping-ponging between them
- and DOS.
-
- switching to protected mode.
-
- emulating real mode from protected mode (tough, and full of gotchas,
- but still worth mentioning).
-
- this is really off-the-wall, but possible: building automata that use
- Loadall to warp from state to state, sort of like a computer game of Life,
- played in the twilight zone.
-
- use your imagination. The sky's the limit.
-
- While the Loadall instruction only exists on the 80286 (to the best of
- anyone's knowledge at present -- anyone who will talk, that is...), the 386
- has other instructions which can accomplish many of the same functions. Thus,
- it is possible to write code that detects the processor being used, and
- switches strategy accordingly, using subroutines with 386 op-codes to accom-
- plish the same functions on a 386. Microsoft is already doing that in their
- RamDrive.Sys and HiMem.Sys programs. So you can have code which runs on
- both the 286 and the 386, and makes the best use of each.
-
- This instruction opens up so many possibilities (AND creates so many
- problems) for things like alternate operating systems, and alternate shells,
- that live above 1 megabyte, in real mode or protected mode, that I foresee the
- need for a community library of "Hyper-Space" subroutines, which can still
- work properly even though some segments are in outer space, or the 80286 is in
- protected mode. I would be happy to collect these, and pass on the best of
- them with future distributions of this book and software.
-
- Please forgive all the legalistic warning messages. If used properly,
- and carefully, the Loadall instruction can be quite safe. You haven't seen
- Microsoft Ramdrive.Sys destroying any systems lately, have you? It's just
- that a few bothersome people love to sue for anything, so you just have to
- plaster those stupid warning messages all over everything.
-
-
-
-
- LOADALL
-
-
- Okay, so what IS the Loadall instruction?
-
- Simple:
-
- *** 0F 05 hex ***
-
- So how does it work? Well, I've already told you the gist of it: all
- CPU registers are loaded from a 51-word table of data that starts at 80:0h
- (absolute 24-bit address 800h). This address is one thing that cannot be
- changed or re-programmed. It's hard-wired into the chip, and that's that. And
- that's unfortunate, because all versions of anybody's DOS earlier than version
- 3.3 use that area for critical system code.
-
- Loadall takes no operands, and is just a two-byte instruction. All the
- "operands" for the instruction are obtained from the table at 80:0h.
-
- Just put "db 0Fh, 05" in your code stream, and watch the fun. But you
- had better get that table right before you do, or else... (crash).
-
-
- ** THE LOAD TABLE **
-
- -----------------------------------------------------------
- Address    Size       CPU register
-            (words)
- -----------------------------------------------------------
-
- 800        3          unused (?? I don't believe it.)
- 806        1          MSW (Machine Status Word)
- 808        7          unused (?? I don't believe it.)
- 816        1          TR (Task Register)
- 818        1          Flag Word
- 81A        1          IP (Instruction Pointer)
- 81C        1          LDT (Local Descriptor Table)
-
- 81E        1          DS (Data Segment, or DS Selector)
- 820        1          SS (Stack Segment, or SS Selector)
- 822        1          CS (Code Segment, or CS Selector)
- 824        1          ES (Extra Segment, or ES Selector)
-
- 826        1          DI (Destination Index)
- 828        1          SI (Source Index)
- 82A        1          BP (Base Pointer)
- 82C        1          SP (Stack Pointer)
-
- 82E        1          BX (Data Register BX)
- 830        1          DX (Data Register DX)
- 832        1          CX (Data Register CX)
- 834        1          AX (Accumulator AX)
-
- 836        3          ES Descriptor Cache
- 83C        3          CS Descriptor Cache
- 842        3          SS Descriptor Cache
- 848        3          DS Descriptor Cache
-
- 84E        3          GDTR
-                       (Global-Descriptor-Table Register)
-
- 854        3          LDTDC
-                       (Local-Descriptor-Table Descriptor Cache)
-
- 85A        3          IDTR
-                       (Interrupt-Descriptor-Table Register)
-
- 860        3          TSSDC
-                       (Task-State-Segment Descriptor Cache)
-
- total = 33h words == 102 bytes
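-
- For those who think better in C than in hex, the same layout can be
- written as a C struct. This is only a sketch for checking offsets: the
- names (loadall_table, loadall_slot, loadall_addr) are made up for this
- illustration, and the offsets are simply the ones from the table above.

```c
#include <stddef.h>
#include <stdint.h>

/* One 3-word (6-byte) descriptor-cache or table-register slot. */
typedef struct {
    uint16_t w[3];
} loadall_slot;

/* Hypothetical C view of the Loadall table at 80:0h (absolute 800h).
   Every member is built from 16-bit words, so no padding is inserted. */
typedef struct {
    uint16_t unused0[3];     /* 800h */
    uint16_t msw;            /* 806h */
    uint16_t unused1[7];     /* 808h */
    uint16_t tr;             /* 816h */
    uint16_t flags;          /* 818h */
    uint16_t ip;             /* 81Ah */
    uint16_t ldt;            /* 81Ch */
    uint16_t ds, ss, cs, es; /* 81Eh, 820h, 822h, 824h */
    uint16_t di, si, bp, sp; /* 826h, 828h, 82Ah, 82Ch */
    uint16_t bx, dx, cx, ax; /* 82Eh, 830h, 832h, 834h */
    loadall_slot es_cache;   /* 836h */
    loadall_slot cs_cache;   /* 83Ch */
    loadall_slot ss_cache;   /* 842h */
    loadall_slot ds_cache;   /* 848h */
    loadall_slot gdtr;       /* 84Eh */
    loadall_slot ldt_cache;  /* 854h */
    loadall_slot idtr;       /* 85Ah */
    loadall_slot tss_cache;  /* 860h */
} loadall_table;

/* The table is 33h words, which is 102 bytes. */
_Static_assert(sizeof(loadall_table) == 102, "Loadall table must be 102 bytes");

#define LOADALL_BASE 0x800u

/* Absolute address of a field, given its offset inside the struct. */
static unsigned loadall_addr(size_t offset) {
    return LOADALL_BASE + (unsigned)offset;
}
```

- For example, loadall_addr(offsetof(loadall_table, es_cache)) gives 836h,
- which matches the table.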
-
-
-
-
-
-
- THE DESCRIPTOR CACHE ENTRIES
-
- (DSDC, SSDC, CSDC, and ESDC)
-
-
-
- Wait a minute, forward-referencing again! What's a descriptor? You've
- already used that word up above, and never defined it.
-
- Well okay. A segment descriptor is a four-word structure of informa-
- tion that describes a segment. A descriptor gives a segment's size and 24-bit
- starting address, and has a byte of encoded information, called the "access
- byte", that describes the characteristics of the segment (like whether it is a
- code segment or a data segment, writable or write-protected, and so on). And
- the descriptor also has a dummy zero word for upward compatibility with the
- 80386. Segment Descriptors are used in protected mode, but not in real mode.
-
- In protected mode, when you want to use a segment of memory, you
- reference the segment descriptor. The 80286 looks into a table or two of
- descriptors (which can be quite large, up to 16384 entries), to find the right
- entry, and find out what the segment is. If you had to do this every time you
- referenced a memory variable, it would be terribly slow. In order to prevent
- this overhead, saving the descriptor information for the current segments in
- quickly-accessible CPU registers is a must. That's what the descriptor caches
- are for.
-
- But you said they aren't used in real mode, right? Right. The soft-
- ware descriptor tables aren't. But the hardware descriptor caches are. The
- Intel book on the 80286 seems mighty thin when it comes to telling you pre-
- cisely what the protected-mode hardware does while in real mode, but some of
- it still works, and is very important (the descriptor caches in particular).
- The descriptor caches determine where your segments really are, whether in
- real or protected mode.
-
- In real mode, your segments are all normally 64 Kbytes in size, by
- default, and are always located in the lowest megabyte of the 80286's 16-
- megabyte address space. When you want to access a segment, you just load a
- number for the start of the segment into the appropriate segment register, and
- then read or write that segment of memory.
-
- The segment number that you load is the address scaled down by four
- bits, so that it really addresses a memory address that is sixteen times the
- number you gave it. You can address anywhere inside that 64 Kbyte-sized
- window by using an offset.
-
- Since the segment registers are 16 bits in size, and have been scaled
- by four bits, you have the equivalent of 20-bit addressing, and can address a
- 1-megabyte sized area. That's real mode.
-
- Did it ever occur to you that that 1-megabyte sized area might itself
- just be appearing somewhere inside of an even larger area?
-
- And that the 1-megabyte-sized real-mode area is made to start at zero
- when the 80286 chip is reset, but doesn't have to stay there forever?
-
- I mean, if in protected mode, the hardware is there to address a 16-
- megabyte-sized address space, well, that hardware doesn't just go away when
- you are in real mode, does it? Or all just get turned off?
-
- No, it doesn't. As a matter of fact, it still works just fine, but
- you weren't given any instructions for doing anything with that part of the
- hardware from real mode. Or were you?
-
- Oh yes you were. It's called LOADALL.
-
- So how do the hardware descriptor caches work? Well, they hold the
- information that was read from a (software) descriptor in memory. The 80286
- discards the unused zero word, and keeps the rest. When you address memory,
- you are actually using the segment addresses in the descriptor cache regis-
- ters, not what is in the segment registers.
-
- Perhaps you thought you were using the segment registers for address-
- ing: it sure looks like you do, because if you load something into a segment
- register, you will then address the memory that the segment register is point-
- ing to. What is happening invisibly in the background is that the correspond-
- ing descriptor cache is being updated whenever you load a segment register,
- and then the descriptor cache is being used for the actual addressing.
-
- So the segment descriptor caches, and not the segment registers, are
- what actually control what goes out on the address lines, and hence, what
- memory you will really address. And the addresses in the segment descriptor
- caches are 24-bit addresses. Now isn't that special?
-
- So if we can use Loadall to load anything we want to into the segment
- descriptor caches, then we can address anywhere in the 16-megabyte address
- space of the 80286, right? Right. You got it.
-
-
- The contents of the descriptor cache entries in a Loadall table are:
-
- The absolute 24-bit address for the start of the segment,
- in the usual Intel lowest-byte-first byte-order.
- That is, the bytes are: lowest, middle, highest.
-
-
- An access byte, customarily set to 92h or 93h. This byte
- is encoded in the usual way that access bytes are
- encoded in Global Descriptor Table entries (see the
- accompanying charts). This byte describes the
- characteristics of the segment, like whether it is
- code or data, and write-protected or not.
-
- A 16-bit segment limit. This is the segment size, minus
- one. FFFFh is equal to a full 64K.
-
-
- This ordering is exactly backwards, word-order-wise, from the usual
- layout of the descriptors used in the protected-mode tables like the Global
- Descriptor Table. This is because the Loadall instruction is essentially just
- a giant POP-ALL instruction. The word order is backwards (really, "Stack-
- wards"), but the byte order within words is not reversed.
-
- The addresses loaded into the descriptor caches must be 24-bit "abso-
- lute" or un-segmented "flat-space" versions of the segment start addresses.
- I.e., a segment address of 3456h becomes an absolute address of 034560h. Remem-
- ber that segment addresses are ordinarily scaled down by 4 bits. So we have
- to scale them back up to get the 24-bit flatland equivalent.
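-
- The slot layout and the scaling rule can be sketched in C. This is an
- illustration of the layout just described, not code from the Hyper-Space
- Library; pack_cache_slot and segment_to_base24 are hypothetical names.

```c
#include <stdint.h>

/* Scale a 16-bit real-mode segment value up to its 24-bit absolute base. */
static uint32_t segment_to_base24(uint16_t segment) {
    return (uint32_t)segment << 4;
}

/* Pack one descriptor-cache slot as three words, in Loadall table order:
     word 0 = base bits 0..15  (lowest and middle address bytes)
     word 1 = base bits 16..23 in the low byte, access byte in the high byte
     word 2 = segment limit (segment size minus one) */
static void pack_cache_slot(uint16_t out[3], uint32_t base24,
                            uint8_t access, uint16_t limit) {
    out[0] = (uint16_t)(base24 & 0xFFFFu);
    out[1] = (uint16_t)(((uint32_t)access << 8) | ((base24 >> 16) & 0xFFu));
    out[2] = limit;
}
```

- Packing segment 3456h with access byte 92h and limit FFFFh yields the
- three words 4560h, 9203h, FFFFh.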
-
- You will notice that there seems to be some duplication of information
- here: you have a CS register slot in the Loadall table, which is loaded with
- the desired code segment start address, and you also have a Code Segment
- Descriptor Cache entry, with an address slot which is loaded with much the
- same information. The same is also true of DS, SS, and ES.
-
- They can't always be the same, because one is 16 bits, and one 24, and
- the 24-bit descriptor cache entry can specify the address down to the byte, in
- the full 16-megabyte address space, while the 16-bit segment register can only
- address on 16-byte boundaries, and can't address beyond 1 megabyte.
-
- So if they are different, which ones win out? The answer is, the
- Descriptor Caches. They have to, because only the Descriptor Caches have the
- whole 24-bit address necessary for addressing the entire 16-megabyte address
- space of the 80286. Also because the Descriptor Caches are what are actually
- wired to the address lines. In protected mode, the segment registers don't
- even get close to the address lines.
-
- But watch out: this gets tricky. For a simple rule-of-thumb, the
- proper programming practice to follow is: in real mode, always keep them the
- same. That is, where the bits of the two overlap, keep them the same. The CS
- register really holds the equivalent of bits A4 to A19 of the 24-bit address
- in the CS Descriptor Cache, so there is no way that you can "keep them all the
- same". But you can keep those bits the same, and you will want to.
-
- Why? Because, in real mode, certain operations will update a Descrip-
- tor Cache using the contents of the paired Segment Register. Oh yeh? Yeh.
-
- Example:
-
- Even if the code segment entry "CS" in a Loadall table is blatantly
- wrong, but the value in the Code Segment Descriptor Cache "CSDC" is correct,
- and the Instruction Pointer "IP" value is correct, and you do a Loadall, the
- Loadall will still work, and you WILL run the code that you intended to be run
- after the Loadall, but the program will crash at the first jump instruction
- after the Loadall. Calls to subroutines will likewise crash if the CS is
- wrong.
-
- The jump or call instruction causes updating of the CS Descriptor
- Cache contents, using the contents of the CS register, and the offset in the
- jump or call instruction. So your CS descriptor cache goes from right to
- wrong, without any further help from you. That's why you have to "keep them
- the same".
-
- (This is just a simple rule of thumb. Like all simple rules, there
- are exceptions, and the rules can be broken. Breaking these rules doesn't buy
- you anything, but you might note that this is simply a rule of thumb, not a
- Commandment From On High.)
-
- The same is also true of the other Segment Registers, and their match-
- ing Descriptor Caches, although the instructions that will cause updating will
- differ. The commonest operation that causes updating of these descriptor
- caches is loading a new segment value into a segment register.
-
- Now obviously, not all the bits of the address in the Descriptor Cache
- will be updated by such operations. The highest 4 bits cannot be updated from
- the segment register, because there are no corresponding bits. So what does
- it do with them? In real mode, the worst. It clears them. Try doing a jump
- while executing real-mode code upstairs, above 1 Megabyte, and you will come
- crashing down out of the sky. A simple jump in code located way upstairs will
- turn into a very long jump to the lowest megabyte of memory. Probably not
- what you had in mind, at all. Far jumps and far calls are out of the question
- for the same reason.
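-
- A toy model of that behavior, in C. It models only what is described
- above (a real-mode reload rebuilds the cached base from the 16-bit
- segment value) and is not a measurement of the chip; both function names
- are hypothetical.

```c
#include <stdint.h>

/* Cached base after a real-mode reload of a segment register: it becomes
   segment * 16, a 20-bit value, so bits 20..23 of the old base are lost. */
static uint32_t cache_base_after_reload(uint16_t segment) {
    return (uint32_t)segment << 4;
}

/* The "keep them the same" rule: the low 20 bits of the cached base must
   equal segment * 16 (which also forces the low four bits to be zero). */
static int seg_matches_cache(uint16_t segment, uint32_t base24) {
    return (base24 & 0xFFFFFu) == ((uint32_t)segment << 4);
}
```

- With CS = 3456h and a cached base of 234560h the rule is satisfied, yet
- cache_base_after_reload(0x3456) returns 034560h: the same offset, two
- megabytes lower.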
-
- Curiously, a call will not cause you to fall out of the sky in the
- same way as a jump will, so we can do reversed jumps, or reversed calls, by
- shoving a return address, and then a destination address onto the stack, and
- then executing a return instruction, where a jump to precisely the same place
- will crash us.
-
-
- When you are executing code in real mode, above 1 megabyte (plus 64
- K), your position is as precarious as that of Icarus flying towards the sun on
- wings held together with wax. (More on that "plus 64 K" note later.) You
- must keep interrupts turned off because ANY interrupt will yank you down-
- stairs, and you won't return upstairs again. The interrupt service routine
- will change some segments or other, particularly the Code Segment, and those
- segments' descriptor caches will have the highest four bits irretrievably
- cleared.
-
- And the updating of the lowest four bits is an open question. I
- always set my segments on 16-byte boundaries so I don't get burned there.
- That is, the four lowest bits of the 24-bit address are always zero. Thus,
- the 16-bit segment settings in the segment registers will always match the
- values of the lowest 20 bits of the descriptor cache settings.
-
- Here's what these descriptor cache entries look like, in source code,
- with a set of default values plugged in:
-
- newESDC dw 0, 9200h, 0FFFFh
- newCSDC dw 0, 9200h, 0FFFFh
- newSSDC dw 0, 9200h, 0FFFFh
- newDSDC dw 0, 9200h, 0FFFFh
-
- The running program will replace those zeroes in the first and second
- words of each entry with real addresses before doing the Loadall.
-
- The "92"'s are the access bytes, and mean: "this item is a descriptor
- of a data segment, it is valid, it has the highest possible privilege level
- (0), writing to it is okay, and it has not been accessed". (93h is the same
- thing with the "accessed" bit already set, which is what the chip does for
- itself when it loads a segment descriptor.)
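-
- To see where those meanings come from, the access byte can be picked
- apart bit by bit. This is a sketch in C using the standard 80286 access
- byte layout for data segments; access_fields and decode_access are
- hypothetical names.

```c
#include <stdint.h>

/* Fields of an 80286 access byte, read as a data-segment descriptor. */
typedef struct {
    unsigned present;     /* bit 7: descriptor is valid */
    unsigned dpl;         /* bits 6..5: privilege level, 0 is highest */
    unsigned is_segment;  /* bit 4: 1 = code or data segment */
    unsigned executable;  /* bit 3: 1 = code, 0 = data */
    unsigned expand_down; /* bit 2: data segment grows downward */
    unsigned writable;    /* bit 1: writes to the data segment allowed */
    unsigned accessed;    /* bit 0: segment has been accessed */
} access_fields;

static access_fields decode_access(uint8_t a) {
    access_fields f;
    f.present     = (a >> 7) & 1u;
    f.dpl         = (a >> 5) & 3u;
    f.is_segment  = (a >> 4) & 1u;
    f.executable  = (a >> 3) & 1u;
    f.expand_down = (a >> 2) & 1u;
    f.writable    = (a >> 1) & 1u;
    f.accessed    = a & 1u;
    return f;
}
```

- decode_access(0x92) reports a present, privilege-0, writable data segment
- with the accessed bit clear; decode_access(0x93) differs only in having
- the accessed bit set.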
-
- Those "FFFF"'s set up segments 64K in size. There's no point in set-
- ting them any smaller, and a lot of grief to be gotten if you do. So just
- always set them to "FFFF" in real mode.
-
-
-
-
-
- THOSE OTHER BIG REGISTERS
-
-
- GDTR Global Descriptor Table Register
- LDTDC Local-Descriptor-Table Descriptor Cache
- IDTR Interrupt Descriptor Table Register
- TSSDC Task-State-Segment Descriptor Cache
-
- These registers do next to nothing while in real mode. The strategy
- for dealing with these is: just set them up in an acceptable manner, and then
- forget them. The Interrupt Descriptor Table Register is the most important of
- these, as it really does determine the starting address of the interrupt
- vector table.
-
- The format of the data for these registers is just about identical to
- the format of the data in the Descriptor Caches, except for an unused byte
- (there is no access byte):
-
- an absolute 24-bit address for table start, in the
- usual Intel byte-order. That is, the bytes
- are: lowest, middle, highest.
-
- an "extra", or "trash", or "dummy" byte (pick your
- favorite name.) Set to either FFh or 0.
-
- a 16-bit limit. This is the table size, minus one.
- (FFFFh == a full 64K)
-
-
-
- Set up the GDTR (Global Descriptor Table Register) and the IDTR (Inter-
- rupt Descriptor Table Register) using the instructions "sgdt" and "sidt" --
- "store global descriptor table register", and "store interrupt descriptor
- table register". These two instructions work in both real and protected mode.
-
- The values that we get from them are somewhat goofy (especially since
- we are getting data about non-existent tables), but we use those values any-
- way, just to keep the 80286 chip happy. We will just stuff back into the chip
- whatever is already in there.
-
- The LDTDC (Local-Descriptor-Table Descriptor Cache) is a real nothing
- in real mode. In real mode, there is no Local-Descriptor-Table Descriptor to
- cache. We just set the LDTDC with an acceptable size, same as the GDTR (88h),
- and let it go at that.
-
- The TSSDC (Task-State-Segment Descriptor Cache) is likewise a null
- register in real mode. There is no Task State Segment to point to. Again, we
- just set it up with a size that will keep the 80286 chip from freaking out
- (thinking that the segment is impossibly small), and let it go at that.
-
-
-
- Set up, just before doing a Loadall, these items will look like:
-
- newGDTR     dw  0D8A0h, 0FF00h, 88h
- newLDTDC    dw  0,      0FF0Eh, 88h
- newIDTR     dw  0,      0FF00h, 0FFFFh
- newTSSDC    dw  4000h,  0FF0Eh, 800h
-
-
- The addresses in the newLDTDC and the newTSSDC are E0000h and E4000h,
- respectively. There is nothing at those addresses but stupid phantom copies
- of the BIOS Roms, wasting precious low-memory address space. So what I do is
- put the non-existent tables on top of the non-existent ROMs, and let them
- fight it out. In truth, those addresses in those descriptor caches' entries
- will never really be used for anything, anyway, so they could be anywhere.
- They just don't matter. The starting address of the IDTR is the only one that
- does matter.
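-
- Reading an address back out of one of these three-word entries is simple
- enough to sketch in C. The helper names slot_base and slot_limit are
- hypothetical; the layout is the one described above (24-bit address,
- dummy byte, 16-bit limit).

```c
#include <stdint.h>

/* 24-bit table start address: word 0 holds bits 0..15, the low byte of
   word 1 holds bits 16..23. The high byte of word 1 is the dummy byte. */
static uint32_t slot_base(const uint16_t w[3]) {
    return (uint32_t)w[0] | ((uint32_t)(w[1] & 0x00FFu) << 16);
}

/* 16-bit limit: table size minus one. */
static uint16_t slot_limit(const uint16_t w[3]) {
    return w[2];
}
```

- Fed the newLDTDC words (0, FF0Eh, 88h) it returns 0E0000h, and fed the
- newTSSDC words (4000h, FF0Eh, 800h) it returns 0E4000h, which are the two
- addresses quoted above.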
-
-
-
- AND ALL THOSE OTHER LITTLE REGISTERS
-
-
- The MSW (Machine Status Word) is normally set to zero. On the 286,
- only the 4 lowest bits are even used. The one super-important bit in this
- register is the mode bit. Set it, and you warp into protected mode. The
- other three bits are invalid and irrelevant if you are not in protected mode.
- Zero this word, unless you really intend to go into unreal mode. Heaven help
- your program if you set it, and have not set up all the descriptor tables, and
- all the protected-mode registers, and cross-linked all the pointers to every-
- thing, correctly, first. We will get into that can of worms later.
-
- Just for reference, here's what the bits are:
-
- D0 == PE   Protected-Mode Enable (yeh, this is IT.)
- D1 == MP   Monitor Processor Extension
- D2 == EM   Emulate Processor Extension
- D3 == TS   Task Switched
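-
- As a sanity check before writing the MSW slot, the four bits can be named
- as C constants. This is a minimal sketch; the MSW_* names and the helper
- msw_enters_protected_mode are made up for illustration.

```c
#include <stdint.h>

/* The four MSW bits used on the 80286 (D0..D3). */
enum {
    MSW_PE = 1 << 0,  /* D0: protected-mode enable */
    MSW_MP = 1 << 1,  /* D1: MP bit */
    MSW_EM = 1 << 2,  /* D2: emulate processor extension */
    MSW_TS = 1 << 3   /* D3: task switched */
};

/* Nonzero if loading this MSW value would switch on protected mode. */
static int msw_enters_protected_mode(uint16_t msw) {
    return (msw & MSW_PE) != 0;
}
```

- A zeroed MSW slot, the real-mode default recommended here, makes the
- helper return 0.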
-
-
- The TR (Task Register) is another register used only in protected
- mode. It is used for keeping track of which task is running. Not our problem
- in real mode. Zero it.
-
- The Flag Word is the same old flag word that we are already familiar
- with from ordinary real-mode programming. We just push the flags word here,
- and we've done it. Or we can zero it. None of our programs are going to do
- anything as off-the-wall as a conditional jump right after a Loadall, anyway,
- right? Uh, right? Why do I see you grinning? 12-dimensional Life, huh?
-
- The IP (Instruction Pointer) is critical. This one really works. The
- address we put here will, in combination with the address in the Code Segment
- Descriptor Cache (CSDC), determine where we will start executing code immedi-
- ately after the Loadall. So this acts like a jump vector. We set this up in
- our programs, just before doing a Loadall, to determine where we will go
- next. Better get this one right.
-
- The LDT (Local Descriptor Table) is another null register in real
- mode. Zero it.
-
-
- DS (Data Segment, or DS Selector)
- SS (Stack Segment, or SS Selector)
- CS (Code Segment, or CS Selector)
- ES (Extra Segment, or ES Selector)
- Set these up so that they contain the same number as bits A4 to A19 of
- the corresponding Segment Descriptor Cache. These work in conjunction with
- those.
-
-
-
- All of the following registers are very straight-forward: just load
- them with whatever you want the registers to have after the Loadall. If you
- are not trying to carry values in these registers, you can just default most
- all of them to zeroes.
-
- The stack pointer requires some care, as the stack is one of the best
- ways to carry data into the beyond. I generally stuff the stack just before a
- Loadall, and then write the current stack pointer to the SP slot in the
- Loadall table, so that I know that I have it right.
-
- DI (Destination Index)
- SI (Source Index)
- BP (Base Pointer)
- SP (Stack Pointer)
-
- BX (Data Register BX)
- DX (Data Register DX)
- CX (Data Register CX)
- AX (Accumulator AX)
-
-
- And then these little curiosities: the two "dead" spots in the table.
-
- 800 3 words unused (?? I don't believe it.)
- 808 7 words unused (?? I don't believe it.)
-
- Obviously, they are there for something. They must load some invisi-
- ble register or other. The registers might be some very transient registers,
- just for intermediate products, which may not be useful...
-
- Then again, considering how much we haven't been told so far, they
- might be good for something. This is another area for future experimentation.
- In the mean time, zero them.
-
-
- AND A PRETTY-TOGETHER DEFAULT TABLE
-
-
- So here's what a default Loadall table looks like. Note that
- "new_Reg_Buf" doesn't label any data item that we really use; it's the name of
- the whole table.
-
- ; LOADALL Register Load Table for new values to be loaded
- ; into registers by a Loadall.
-
- new_Reg_Buf dw  3 dup (0)               ; unused space
- newMSW      dw  0
- newDead     dw  7 dup (0)               ; unused space
- newTR       dw  0
- newFlagWord dw  0
- newIP       dw  offset after_ldall      ; * may chng
- newLDT      dw  0
-
- newDS       dw  0                       ; *chng
- newSS       dw  0                       ; *chng
- newCS       dw  0                       ; *chng
- newES       dw  0                       ; *chng
-
- newDI       dw  0
- newSI       dw  0
- newBP       dw  0
- newSP       dw  0                       ; *chng
-
- newBX       dw  0
- newDX       dw  0
- newCX       dw  0
- newAX       dw  0
-
- newESDC     dw  0, 9300h, 0FFFFh        ; *chng
- newCSDC     dw  0, 9300h, 0FFFFh        ; *chng
- newSSDC     dw  0, 9300h, 0FFFFh        ; *chng
- newDSDC     dw  0, 9300h, 0FFFFh        ; *chng
-
- newGDTR     dw  0D8A0h, 0FF00h, 88h     ; @ 0D8A:0 *n
- newLDTDC    dw  0, 0FF0Eh, 88h          ; @ E000:0
- newIDTR     dw  0, 0FF00h, 0FFFFh       ; @ 0000:0 *n
- newTSSDC    dw  4000h, 0FF0Eh, 800h     ; @ E400:0
-
-
- Those "*chng" comments mean that those items MUST be changed by the
- running program before actually doing the Loadall. We cannot correctly default
- them in the sources because the correct values can only be determined at run-
- time.
-
-
-
- The "*n" means that those values are not really in the default tables
- in the sources: the running program uses the sgdt and sidt instructions to get
- those values and then plugs them into those two entries. Just letting you see
- what they will look like. You could have anything in the original table there,
- because the running program will over-write those items with correct values
- anyway.
-
- The "@ 0D8A:0" comments are just noting the addresses in those items,
- in a more readable form.
-
-
-
-
- GATE A20 : Door to the Beyond
-
-
- Before we get heavy into the guts of actually using the Loadall in-
- struction, we need to touch on this item: Gate A20. Loadall is almost useless
- without control of Gate A20.
-
- Gate A20 is the gate on the motherboard of the AT that enables or
- disables address line A20, the first line above the PC's 20. In order to be
- PC-compatible, it is ordinarily disabled on an AT. The pathetic PC could only
- address 1 megabyte of space, total, remember? That's 20 bits. If that line
- is disabled, then addressing wraps to zero at FFFF:0010. But if it is
- enabled, then addressing doesn't wrap, and you can address above 1 Megabyte.
- This has nothing to do with protected mode. Even if the 80286 were in pro-
- tected mode, it still couldn't address the megabyte just above 1 Megabyte
- (or any other odd-numbered megabyte) without enabling Gate A20.
-
- In the part of the Hyper-Space Library freely distributed with this
- document and the View-XM program are routines called "A20_on" and "A20_off".
- They need no arguments. You just call them, and they will enable or disable
- Gate A20. Do not make a habit of turning Gate A20 on and just leaving it on,
- as rumor has it that some barbaric programmers from the bad-old days made a
- habit of depending on address-wrapping, addressing something like FFFF:0345h
- to get at 0:0335h. Ugh! These subroutines also check whether Gate A20 was
- already on before the call, and if so, leave it alone.
-
- This leads us to a very interesting twist in the game: what if you
- turn on Gate A20, and load FFFFh into a segment register, like the DS regis-
- ter, and then address something like DS:0300h? The answer is, you will ad-
- dress beyond 1 megabyte, without either going into protected mode, or using
- Loadall tricks. The PC can only address 1 Megabyte total, but the AT can
- address 1 Megabyte, plus 64K, minus 16 bytes, in REAL mode, without Loadall.
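-
- The wrap, and the arithmetic behind "plus 64K, minus 16 bytes", can be
- checked with a few lines of C. This sketch models Gate A20 as forcing
- address bit 20 low, which is all that matters for addresses a real-mode
- segment:offset pair can produce (they never exceed 10FFEFh). The names
- real_mode_physical and HMA_BYTES are hypothetical.

```c
#include <stdint.h>

/* Physical address reached by a real-mode seg:off reference.
   With Gate A20 off, bit 20 is forced low, so anything at or past
   100000h lands back in the bottom of memory. */
static uint32_t real_mode_physical(uint16_t seg, uint16_t off, int a20_on) {
    uint32_t linear = ((uint32_t)seg << 4) + off;  /* at most 10FFEFh */
    return a20_on ? linear : (linear & 0x0FFFFFu);
}

/* Bytes reachable above 1 megabyte from segment FFFFh:
   offsets 0010h through FFFFh, i.e. 64K minus 16. */
#define HMA_BYTES (0x10000u - 0x10u)
```

- With the gate off, FFFF:0345h comes out as 000335h (the address-wrapping
- trick mentioned above); with it on, the same reference reaches 100335h,
- and FFFF:0300h reaches 1002F0h. HMA_BYTES works out to 65520.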
-
- This is a big part of the XMS driver specification. That's the eXtended
- Memory Specification (not to be confused with the "EMS" Expanded Memory Speci-
- fication). The XMS driver accesses memory addressed above 1 Megabyte on AT's.
-
- You can write programs which use standardized calls to the XMS driver,
- and expect that the program will work with anyone's XMS driver. Microsoft,
- Intel, Lotus, and AST Research (the authors) have put the XMS specification in
- the public domain (although they retain the copyright), and it is currently
- supported by them, and probably by many more companies that I don't know of,
- so we should be seeing plenty of good XMS device-drivers around, and, in turn,
- programs using it.
-
- Furthermore, Microsoft will give you a copy of the XMS driver, and
- standard, free, if you write to them and ask for one. Write to Microsoft
- Corporation, 16011 NE 36th Way, Box 97017, Redmond WA 98073, and politely
- request a floppy copy of the XMS standard and driver. The same files are
- available from many bulletin board systems, and anonymous FTP sites.
-
- Since you want a nice, clean, non-colliding standard way for your
- programs to be able to get at more ram, using the EMS and XMS standards is the
- only good way to go. Throughout this book, we are going to support those
- standards, and others, too.
-
- The recommended programming practice is to always support the XMS
- standard, and use requests to the XMS driver to get at extended memory, rather
- than just brute-force doing it yourself, even though you can with Loadall, so
- that your programs will not conflict with others.
-
- The PC world is already far too filled with gotchas and incompatibili-
- ties, and things that collide with other things, for us to be adding to the
- misery.
-
- The one thing that the XMS driver adds, that you will not have if you
- just take over and use an area of extended memory yourself, is any kind of
- collision prevention or co-ordination between programs. You won't know if
- another program is already using that area, but the XMS driver will, as long
- as the other program is also using the driver. So everybody better be adher-
- ing to the standard!
-
- On the other hand, you would not be reading this book about the
- "secret" Loadall instruction if you were all that committed to ONLY using
- "normal" standards, would you? The trick is to support the standards, without
- being constrained by them. This requires great care and thought about the
- consequences of any use of Loadall for "non-standard" activities. You can,
- for instance, allocate some memory, using the XMS driver, and then go ahead
- and use Loadall to do anything you want to with it, since you now own it. You
- have the best of both worlds.
-
- And so what do the XMS drivers use to get at the extended memory above
- the High-Memory Area? Either going into protected mode, or Loadall.
-
-
-
-
- THE PROCEDURE FOR USING LOADALL
- (the ultra-safe, long procedure)
-
-
- 1. Save the original machine state, so you have a state to return to.
- This information can be saved in a Loadall table, which is the most convenient
- form for later use.
-
- 2. Disable interrupts. Just in case. We want a clean copy of the area at
- 80:0h.
-
- 3. Save the 102-byte (33h words) block of data located at 80:0h. Ver-
- sions of DOS (both PC- and MS-) earlier than 3.3 use this area for critical
- system code, and as of DOS 3.3, RamDrive.Sys, and Himem.Sys use this area for
- their own Loadall tables.
-
- 4. Re-enable interrupts. Let the clock ticks, or whatever, through,
- while we do the following step.
-
- 5. Set up the new Loadall table (new_reg_buf), which defines the new
- state we want to warp to.
-
- 6. Disable interrupts.
-
- 7. Copy the new Loadall table to 80:0h.
-
- 8. Execute a Loadall.
-
- 9. Do something or other with your new machine state. Read or write
- extended memory, run code upstairs, or whatever.
-
- 10. Copy the "old" Loadall table, containing the saved machine state, down
- to 80:0h.
-
- 11. Do another Loadall (Un-Loadall.) This restores the original machine
- state.
-
- 12. Copy the block of saved data back to 80:0h.
-
- 13. Re-enable interrupts.
-
- And you have done it.
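-
- The order of the steps above can be sketched in C against a simulated
- machine. This is only a model under stated assumptions: the Machine struct,
- fake_loadall, and run_with_loadall are hypothetical names invented here, the
- Loadall opcode is replaced by a stub that copies the table at 80:0h into a
- pretend CPU state, and the interrupt flag is a plain variable. Real code
- needs assembly (CLI/STI and the Loadall opcode itself).
-
```c
#include <stdint.h>
#include <string.h>

#define LOADALL_TABLE_BYTES 102u      /* 33h words */
#define LOADALL_AREA_ADDR   0x800u    /* 80:0h, i.e. segment 80h * 16 */

/* Simulated machine: a slice of low memory, the interrupt flag, and the
 * state most recently "loaded". Hypothetical stand-ins, not real hardware. */
typedef struct {
    uint8_t low_mem[0x1000];
    int     interrupts_enabled;
    uint8_t active_state[LOADALL_TABLE_BYTES];
} Machine;

/* Stub for the Loadall opcode: the CPU takes its state from 80:0h. */
static void fake_loadall(Machine *m)
{
    memcpy(m->active_state, &m->low_mem[LOADALL_AREA_ADDR],
           LOADALL_TABLE_BYTES);
}

/* old_state: table holding the saved machine state (step 1, done by caller).
 * new_state: table describing the state to warp to (step 5, done by caller).
 * work:      what to do while in the new state (step 9); may be NULL. */
void run_with_loadall(Machine *m,
                      const uint8_t *old_state,
                      const uint8_t *new_state,
                      void (*work)(Machine *))
{
    uint8_t  saved_area[LOADALL_TABLE_BYTES];
    uint8_t *area = &m->low_mem[LOADALL_AREA_ADDR];

    m->interrupts_enabled = 0;                        /* step 2  */
    memcpy(saved_area, area, LOADALL_TABLE_BYTES);    /* step 3  */
    m->interrupts_enabled = 1;                        /* step 4  */
    /* step 5 would build new_state here; the caller already did. */
    m->interrupts_enabled = 0;                        /* step 6  */
    memcpy(area, new_state, LOADALL_TABLE_BYTES);     /* step 7  */
    fake_loadall(m);                                  /* step 8  */
    if (work != NULL)
        work(m);                                      /* step 9  */
    memcpy(area, old_state, LOADALL_TABLE_BYTES);     /* step 10 */
    fake_loadall(m);                                  /* step 11 */
    memcpy(area, saved_area, LOADALL_TABLE_BYTES);    /* step 12 */
    m->interrupts_enabled = 1;                        /* step 13 */
}
```
-
- The point of the model is the bracketing: the 102 bytes at 80:0h are saved
- before either table is copied there and put back only after the second
- Loadall, and interrupts stay off from step 6 through step 13.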
-
-
- This is the long, drawn-out method. There are various short-cuts and
- speedups possible.
-
- If all you have been doing is reading or writing extended memory, for
- instance, then you don't have to do the second Loadall. Just changing a
- segment register (loading a new value) will cause the corresponding Descriptor
- Cache to drop its four highest address bits, restoring addressing to the low
- megabyte.
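-
- A simplified C model of why this shortcut works, assuming only the cached
- base address matters (limits and access rights are ignored). SegReg,
- loadall_set_base, real_mode_load, and physical are hypothetical names for
- this sketch, not routines from View-XM or the library.
-
```c
#include <stdint.h>

typedef struct {
    uint16_t selector;   /* visible segment register value            */
    uint32_t base;       /* hidden descriptor-cache base, 24 bits wide */
} SegReg;

/* What Loadall allows: the hidden base is set independently of the
 * visible selector, so it can point anywhere in the 16 Megabyte space. */
void loadall_set_base(SegReg *r, uint16_t selector, uint32_t base24)
{
    r->selector = selector;
    r->base     = base24 & 0xFFFFFFu;
}

/* An ordinary real-mode segment load: the base is recomputed as
 * selector * 16, which fits in 20 bits, so bits 20..23 come out zero. */
void real_mode_load(SegReg *r, uint16_t selector)
{
    r->selector = selector;
    r->base     = (uint32_t)selector << 4;
}

/* Address actually driven for an access through this register. */
uint32_t physical(const SegReg *r, uint16_t offset)
{
    return r->base + offset;
}
```
-
- After loadall_set_base(&es, 0, 0x200000), an access at offset 10h lands at
- 200010h in extended memory. A following real_mode_load(&es, 0x1234) puts the
- base back at 12340h, inside the low megabyte, with no second Loadall.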
-
- Read the sources for the program "View-XM" for more details on this.
-
- See the full text of The Hyper-Space Navigator's Guide for more.
-
-
-
-
- The Hyper-Space Navigator's Guide, the book and software library, is
- available from Semi-Intelligent Systems for $49.00 (students get a 20%
- discount), and comes with the floppy of source code. With other books, you
- have to pay $10 or $20 more to get the floppy that should have come with the
- book in the first place. Here you don't. It is available on any common
- floppy format: 5.25" 360K or 1.2MB, or 3.5" 720K. If you order it, please
- state your floppy format preference.
-
- FULL source code in assembly and C is provided.
-
- The Hyper-Space Navigator's Guide gives the full low-down on Loadall,
- and other 286- and 386-compatible extended-memory tricks, too: the good, the
- bad, and the ugly.
-
- The book comes with a library of subroutines designed to facilitate
- the use of extended memory, and includes numerous demo programs which do just
- about everything you can do with Loadall (or without), including:
-
- reading and writing extended memory.
-
- running code up there, in both real and protected mode. Yes, you can
- use Loadall to warp directly into protected mode. Or you can do it the
- "normal" way, so that the code will be 386-compatible. Both ways are
- implemented in the code.
-
- going into, and running in, and then getting back out of protected
- mode, from within your own programs, on both the 286 and 386. Getting into
- protected mode is relatively easy. Try getting back out on a 286. I'll show
- you how.
-
- writing "split" programs, with a low-memory half, and a high- or
- extended-memory half, with the second half in real or protected mode. The
- cat's meow for image-processing programs which eat memory space like popcorn.
-
- installing either real- or protected-mode "high code" inside an ex-
- tended-memory ram-disk file, where it won't collide with anything or anybody
- else, and then using a TSR to launch directly into running the code from in
- there (thus turning a piece of the ram-disk back into ram).
-
- Again, full and complete source code, so that the demo programs also
- supply you with hackable skeletons for quickly building your own programs
- (without the many months of day-and-night hacking and hair-tearing I went
- through to figure out this stuff). Just throw away the middle of the demo
- main routine and plug your code in.