- CHAPTER 8 - SHIFT AND ROTATE
-
-
- There are seven instructions that move the individual bits of a
- byte or word either left or right. Each instruction works
- slightly differently. We'll make a standard program and then
- substitute each instruction into that program.
-
-
- SAL - SHL
-
- The instructions SHL (shift logical left) and SAL (shift
- arithmetic left) are exactly the same. They have the same machine
- code. They shift each bit to the left. How far? That depends.
- There are two (and only two) forms of this instruction. All other
- shift and rotate instructions have these two (and only these two)
- forms as well. The first form is:
-
- shl al, 1
-
- This shifts each bit one position to the left. The number MUST be
- 1. No other number is possible on the 8086. The other form is:
-
- shl al, cl
-
- This shifts the bits in AL left by the number in CL. If CL = 3,
- it shifts left by 3. If CL = 7, it shifts left by 7. The count
- register MUST be CL (not CX). The bits on the left are shifted
- out of the register into the bit bucket, and zeros are inserted
- on the right. The easy way to understand this is to fire up the
- standard program. Remember, from now on we always use
- template.asm.
-
- ;sal.asm
- ; + + + + + + + + + + + + + + + START CODE BELOW THIS LINE
- mov ax_byte, 0A3h ; half reg, low reg binary
- mov bx_byte, 0A4h ; half reg, low reg hex
- mov cx_byte, 0A1h ; half reg, low reg signed
- mov dx_byte, 0A2h ; half reg, low reg unsigned
- lea ax, ax_byte
- call set_reg_style
-
- mov ax, 0 ; clear registers
- mov bx, 0
- mov cx, 0
- mov dx, 0
- mov di, 0
- mov bp, 0
- call show_regs
-
- outer_loop:
- call get_hex_byte ; get number and put in registers
- mov bl, al
- mov cl, al
-
- mov dl, al
- mov si, 8 ; 8 iterations of the loop
- and al, al ; set the flags
- call show_regs_and_wait
- shift_loop:
- sal al, 1
- sal bl, 1
- sal cl, 1
- sal dl, 1
- call show_regs_and_wait
- dec si
- jnz shift_loop
- jmp outer_loop
-
- ; + + + + + + + + + + + + + + + END CODE ABOVE THIS LINE
-
- ______________________
-
- The PC Assembler Tutor - Copyright (C) 1989 Chuck Nelson
-
- This standard program works with bytes, not words. With words we
- would have to perform 16 individual shifts, which would be time
- consuming and boring. First we set the style to half registers.
- Notice that one is binary, one is hex, one is signed and one is
- unsigned. That covers all bases. All the registers are then
- cleared. It would be nice to use the LOOP instruction, but CL is
- one of the registers being shifted, so CX is committed and we make
- our own loop. We move 8 into SI. The loop instructions are:
-
- dec si
- jnz shift_loop
-
- DEC decrements a register or a variable by 1. Its counterpart INC
- increments a register or variable by 1. JNZ (jump if not zero)
- jumps to 'shift_loop' if SI is not zero.
-
- We get a hex byte in AL and put the same byte in BL, CL, and DL.
- This way we will be able to see what is happening in binary, hex,
- signed and unsigned. Before starting, we have:
-
- and al, al
-
- This is there to set the flags correctly before starting. All
- four are shifted left one bit each time, and then we look at the
- result.
-
- Assemble, link and run it. Enter the number 7. In binary, that is
- (0000 0111). Take a look at the flags before starting. It is a
- positive number so SF shows '+'. ZF is not set. PF shows 'O'. O
- stands for odd. Every time you perform an arithmetic or logical
- operation, the 8086 checks parity. Parity is whether the number
- contains an even or odd number of 1 bits. Our 7 has three 1 bits,
- so the parity is odd. The possible settings are 'E' for even and
- 'O' for odd.{1} SAL checks for parity (though some of the other
- instructions don't). Now press ENTER. It will shift left 1 and
- you will have (0000 1110). What does the unsigned number say now?
- 14. Press ENTER again. (0001 1100) What does the unsigned number
- say? 28. Again (0011 1000) 56. Again (0111 0000) 112. Notice that
- ____________________
-
- 1 This is for use by communications programs.
-
- the signed number reads +112. Look at the CF and OF. They are
- both cleared. Things are going to change now. Press ENTER again.
- (1110 0000). SF is now '-'. OF, the overflow flag is set because
- you changed the number from positive to negative (from +112 to
- -32). What is the unsigned number now? 224. CF is cleared. PF is
- still 'O'. Shift again. (1100 0000) OF is cleared because you didn't
- change signs. (Remember, the leftmost bit is the sign bit for a
- signed number). PF is now 'E' because you have two 1 bits, and
- two is even. CF is set because you shifted a 1 bit off the left
- end. Keep pressing ENTER and watch SF, OF, CF, and PF.
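-
- The walkthrough above can be mimicked in a few lines of Python. This
- is only a sketch of a one-bit SAL on a byte as this chapter
- describes it. It is not part of ASMHELP.OBJ, and the names parity,
- signed8 and shl8 are made up for the illustration.
-
```python
def parity(value):
    """'E' if the 8-bit value has an even number of 1 bits, else 'O'."""
    return 'E' if bin(value & 0xFF).count('1') % 2 == 0 else 'O'


def signed8(value):
    """Read an 8-bit pattern as a two's complement signed number."""
    return value - 256 if value & 0x80 else value


def shl8(value):
    """Model 'sal al, 1': return the shifted byte and the flags."""
    cf = (value >> 7) & 1           # the bit pushed off the left end
    result = (value << 1) & 0xFF    # a 0 comes in on the right
    sign = (result >> 7) & 1
    flags = {
        'CF': cf,
        'OF': cf ^ sign,            # 1 if the sign bit changed
        'SF': '-' if sign else '+',
        'ZF': int(result == 0),
        'PF': parity(result),
    }
    return result, flags


value = 0x07
for _ in range(8):
    value, flags = shl8(value)
    print(f"{value:08b} unsigned={value:3d} signed={signed8(value):+4d} {flags}")
```
-
- Starting from 7, the fifth shift gives 1110 0000 with OF set and CF
- clear, and the sixth gives 1100 0000 with CF set and OF clear, which
- matches what show_regs_and_wait displays in the run described above.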
-
- Let's look at the unsigned numbers we had until we started
- shifting 1 bits off the left end. We started with 7, then had 14,
- 28, 56, 112, 224. This instruction is multiplying by 2. That's
- right, and it is MUCH faster than multiplication (about 50 times
- faster). Far and away the fastest way to multiply a register by
- 2, 4 or 8 is to use sal.
-
- ; by 2        ; by 4        ; by 8
- sal di, 1     sal di, 1     sal di, 1
-               sal di, 1     sal di, 1
-                             sal di, 1
-
- For a register, it is faster to use a series of 1 shifts than to
- load cl. For a variable in memory, anything over 1 shift is
- faster if you load cl.
-
- Do a few more numbers to see what is happening both with the
- number and the flags. CF always holds the last bit that was
- shifted off the end, so it is set whenever that bit was a 1.
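-
- The multiply-by-2 effect of repeated left shifts can be checked
- with a small Python model. This is a sketch only; shl8_by is a
- made-up name, and the model reports CF as 0 for a count of 0, where
- the real instruction would leave CF alone.
-
```python
def shl8_by(value, count):
    """Model 'shl al, cl' on a byte: return (result, CF)."""
    cf = 0
    for _ in range(count):
        cf = (value >> 7) & 1         # the bit shifted off on this step
        value = (value << 1) & 0xFF
    return value, cf


for count in range(1, 7):
    result, cf = shl8_by(7, count)
    print(f"7 * {2 ** count:2d} -> {result:3d}  CF={cf}")
```
-
- CF only reports the last bit shifted off. For example,
- shl8_by(0xA0, 2) returns CF = 0 even though a 1 bit was lost on the
- first step, so a clear CF after a multi-bit shift does not prove
- that the unsigned result fits.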
-
-
- SAR and SHR
-
- Unlike the left shift instruction, there are two completely
- different right shift instructions. SHR (shift logical right)
- shifts the bits to the right, setting CF if a 1 bit is pushed off
- the right end. It puts 0s in the leftmost bit. Make a copy of
- SAL.ASM and replace the four instructions:
-
- sal al, 1
- sal bl, 1
- sal cl, 1
- sal dl, 1
-
- with SHR. We'll call the new program SHR.ASM. Run this one too.
- Instead of 7, use E0h (1110 0000) which is 224d. The first time
- you shift (0111 0000) the OF flag will be set because the sign
- changed. Keep shifting, noting the flags and the unsigned number.
- This time we have 224, 112, 56, 28, 14, 7, 3, 1. It is dividing
- by two and is once again MUCH faster than division. For a single
- shift, the remainder is in CF. For a shift of more than one bit,
- you lose the remainder, but there is a way around this which we
- will discuss in a moment. Do some more numbers till you are
- comfortable with the flags and the operation.
-
- If you want to divide by 16, you will shift right four times, so
- you'll lose those 4 bits. But those bits are exactly the value of
- the remainder. All we need to do is:
-
- mov dx, ax ; copy of number to dx
- and dx, 0000000000001111b ; remainder in dx
- mov cl, 4 ; shift right 4 bits
- shr ax, cl ; quotient in ax
-
- Using a mask, we keep only the right four bits, which is the
- remainder.
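-
- The shift-and-mask idea can be tried out in Python. This is a
- sketch that assumes an unsigned 16-bit value, as in AX; shr_divide
- is a made-up name.
-
```python
def shr_divide(value, count):
    """Divide an unsigned 16-bit value by 2**count with a shift and a mask."""
    value &= 0xFFFF
    remainder = value & ((1 << count) - 1)   # and dx, mask
    quotient = value >> count                # shr ax, cl
    return quotient, remainder


print(shr_divide(1000, 4))   # quotient and remainder of 1000 / 16
```
-
- For 1000 and a shift of 4 this gives a quotient of 62 and a
- remainder of 8, the same as ordinary division by 16.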
-
-
- SAR
-
- SAR (shift arithmetic right) is different. It shifts right like
- SHR, but the leftmost bit always stays the same. This will make
- more sense when you run the program. Make another copy, call it
- SAR.ASM, and change the four instructions to SAR. The flags
- operate the same as for SHR and SHL, except that the overflow flag
- stays cleared since the left bit always stays the same.
-
- First enter 74h (+116). We will be looking at the signed numbers
- only. Copy down the signed numbers as you go along. They should
- be: 116, 58, 29, 14, 7, 3, 1, 0, 0. Now try 8Ch (-116). The
- numbers you should get are: -116, -58, -29, -15, -8, -4, -2, -1,
- -1. They started out the same, then they got off by one. The
- negative numbers are one too negative. Try 39h (+57). The
- numbers here are: 57, 28, 14, 7, 3, 1, 0, 0, 0. Just as it should
- be for division by 2. Now try C7h (-57). Here the numbers are:
- -57, -29, -15, -8, -4, -2, -1, -1, -1. This time it went screwy
- right off the bat. Once again, the negative numbers are one too
- negative.
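-
- The four runs above can be reproduced with a short Python model of
- a one-bit SAR on a byte. It is a sketch for checking the numbers;
- sar8, signed8 and sar_sequence are made-up names.
-
```python
def signed8(value):
    """Read an 8-bit pattern as a two's complement signed number."""
    return value - 256 if value & 0x80 else value


def sar8(value):
    """Model 'sar al, 1': shift right, keeping the sign bit."""
    return (value >> 1) | (value & 0x80)


def sar_sequence(start, shifts=8):
    """Signed value before shifting and after each of the shifts."""
    values = [signed8(start)]
    for _ in range(shifts):
        start = sar8(start)
        values.append(signed8(start))
    return values


for start in (0x74, 0x8C, 0x39, 0xC7):
    print(f"{start:02X}h:", sar_sequence(start))
```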
-
- SAR is an instruction for doing signed division by 2 (sort of).
- It is, however, an incomplete instruction. The rule for SAR is:
- SAR gives the correct answer if the number is positive. It gives
- the correct answer if the number is negative and the remainder is
- zero. If the number is negative but there is a remainder, then
- the answer is one too negative. The reason for this is a little
- complex, but we need to add some code if we want to do signed
- division.{2} For SHR, the remainder part was optional. Here it is
- not. We need to know whether the remainder is zero or not. For
- this example we will do a word shift right by 6. That's dividing
- by 64.
-
- remainder_mask dw 003Fh ; 63
-
- call get_signed ; number in ax
- mov bx, ax ; copy in bx
- and bx, remainder_mask ; the remainder
- mov cl,6 ; shift right 6 bits
- sar ax, cl
- jns continue ; is it positive?
- and bx, bx ; is the remainder zero?
- jz continue
- inc ax
- continue:
- ____________________
-
- 2 Both the code and the reasons will be explained (but not
- proved) in the summary.
-
- We get the remainder, then shift right 6 bits. Upon finishing
- SAR, the sign flag will be set correctly. Here is yet another
- jump. This one, JNS (jump on not sign), jumps if the sign flag
- is NOT set, that is, if the number is positive. If it is positive,
- then everything is ok so we skip ahead. If the number is
- negative, then we check to see if there was a remainder. If there
- wasn't, everything is ok, so we go ahead. If there was a
- remainder, then we INC (add 1) ax.
-
- Is the remainder correct? If the number was positive, the
- remainder is correct, but if the number was negative, then we
- need to do one more thing. After INC, but before 'continue' we
- have a SUB instruction:
-
- inc ax
- sub bx, 64 ; correct the remainder
- continue:
-
- Why that is the correct number will be explained in the summary.
- What a lot of work when we could simply write:
-
- mov cx, 64
- call get_signed
- cwd ; sign extend
- idiv cx ; signed division
-
- Is there any advantage to this instruction? Not really. Remember
- that the more you shift, the longer it takes. If you shift 2,
- then it's about 1/3 faster than division. If you shift 14, then
- it is only 15% faster than division. Considering that even a slow
- PC can do 25000 divisions a second, you must be in serious need
- of speed to use this. In any case, you will never or almost never
- use SAR for signed division, while you will find lots of
- opportunity to use SHR and SHL for unsigned multiplication and
- division.
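-
- To see that the SAR sequence with its two corrections really agrees
- with IDIV, here is a Python sketch that compares them. It relies on
- Python's >> being an arithmetic shift for negative integers. The
- names sar_divide and idiv are made up, and idiv only models the
- truncate-towards-zero rule, not the instruction's registers.
-
```python
def sar_divide(number, count):
    """Signed division by 2**count using SAR plus the correction step."""
    divisor = 1 << count
    remainder = number & (divisor - 1)    # and bx, remainder_mask
    quotient = number >> count            # sar ax, cl
    if quotient < 0 and remainder != 0:   # jns continue / jz continue
        quotient += 1                     # inc ax
        remainder -= divisor              # sub bx, 64
    return quotient, remainder


def idiv(number, divisor):
    """Truncating signed division, the rule IDIV follows."""
    quotient = abs(number) // abs(divisor)
    if (number < 0) != (divisor < 0):
        quotient = -quotient
    return quotient, number - quotient * divisor


print(sar_divide(-57, 6), idiv(-57, 64))
print(sar_divide(100, 6), idiv(100, 64))
```
-
- Both functions return the same quotient and remainder for every
- 16-bit signed number when dividing by 64.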
-
-
- ROR and ROL
-
- ROR (rotate right) and ROL (rotate left) rotate the bits around
- the register. We will just do one program since they operate the
- same way, only in opposite directions. Make another copy of
- SAL.ASM and put in ROR in the appropriate spots.
-
- Enter a number. This time you will notice that the bits, rather
- than disappearing off the end, reappear on the other side. They
- rotate around the register. The only flags that are defined are
- OF and CF. OF is set if the high bit changes, and CF is set if a
- 1 bit moves off the end of the register to the other side. Do a
- few more, and we'll go on to the last two instructions.
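-
- A one-bit rotate on a byte can be modeled in Python like this. It
- is a sketch only; ror8 and rol8 are made-up names, and each returns
- the new byte together with CF.
-
```python
def ror8(value):
    """Model 'ror al, 1': return (result, CF)."""
    cf = value & 1                              # bit leaving the right end
    result = ((value >> 1) | (cf << 7)) & 0xFF  # it reappears on the left
    return result, cf


def rol8(value):
    """Model 'rol al, 1': return (result, CF)."""
    cf = (value >> 7) & 1                       # bit leaving the left end
    result = ((value << 1) | cf) & 0xFF         # it reappears on the right
    return result, cf


value = 0xA5
for step in range(1, 9):
    value, cf = ror8(value)
    print(f"{step}: {value:08b} CF={cf}")
```
-
- After eight rotations the byte is back where it started, since no
- bit is ever lost.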
-
-
- RCR and RCL
-
- RCR (rotate through carry right) and RCL (rotate through carry
- left) rotate the same as the above instructions except that the
- carry flag is involved. Rotating right, the low bit moves to CF,
- the carry flag and CF moves to the high bit. Rotating left, the
- high bit moves to CF and CF moves to the low bit. There are 9
- bits (or 17 bits for a word) involved in the rotation. Make yet
- another copy of the program, and change those 4 instructions to
- RCR. Also, since we have 9 bits instead of 8, change the loop
- count to 9 from 8:
-
- mov si, 9
-
- Enter a number and watch it move. Before you start moving, look
- at CF and see if there is anything in it. There are only two
- flags defined, OF and CF. Obviously, CF is set if there is
- something in it. OF looks weird at first. In RCL (the opposite
- instruction to the one we are using), OF operates normally,
- signalling a change in the top (sign) bit. In RCR, the old CF moves
- into the top bit, so OF is set when the old CF differs from the old
- top bit, which is again a change in the top bit. You really have no
- need for the OF flag anyways, so this is unimportant.
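-
- The 9-bit rotation can be modeled in Python by carrying CF along
- with the byte. This is a sketch; rcr8 and rcl8 are made-up names
- and only CF is modeled, not OF.
-
```python
def rcr8(value, cf):
    """Model 'rcr al, 1': return (result, new CF)."""
    new_cf = value & 1                          # low bit moves to CF
    result = ((value >> 1) | (cf << 7)) & 0xFF  # old CF moves to the high bit
    return result, new_cf


def rcl8(value, cf):
    """Model 'rcl al, 1': return (result, new CF)."""
    new_cf = (value >> 7) & 1                   # high bit moves to CF
    result = ((value << 1) | cf) & 0xFF         # old CF moves to the low bit
    return result, new_cf


value, cf = 0xA5, 0
for step in range(1, 10):
    value, cf = rcr8(value, cf)
    print(f"{step}: CF={cf} {value:08b}")
```
-
- It takes nine rotations, not eight, to bring the byte and CF back
- to their starting values, which is why the loop count changes to 9.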
-
-
- Well, those are the seven instructions, but what can you do with
- them besides multiply and divide?
-
- First, you can work with multiple bit data. The 8087 has a word
- length register called the status register. Looking at the upper
- byte:
-
- 15 14 13 12 11 10  9  8
-        X  X  X
-
- bits 11, 12 and 13 contain a number from 0 to 7. The data in this
- register is not directly accessible. You need to move the
- register into memory, then into an 8086 register. If you want to
- find what this number is, what do you do?
-
- mov bx, status_register_data
- mov cl, 3
- ror bx, cl
- and bh, 00000111b
-
- We rotate right 3 and then mask off everything else. The number
- is now in BH. We could have used SHR if we wanted. Another 8087
- register is the control register. In the upper byte it has:
-
- 15 14 13 12 11 10  9  8
-              X  X
-
- a number from 0 to 3 in bits 10 and 11. If we want the
- information, we do the same thing:
-
- mov bx, control_register_data
- mov cl, 2
- ror bx, cl
- and bh, 00000011b
-
- and the number is in BH.
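-
- The rotate-then-mask steps for both registers can be sketched in
- Python. The names ror16 and high_byte_field and the two sample
- words are made up for the illustration; they are not real 8087
- register contents.
-
```python
def ror16(value, count):
    """Model 'ror bx, cl' on a 16-bit word."""
    count %= 16
    value &= 0xFFFF
    return ((value >> count) | (value << (16 - count))) & 0xFFFF


def high_byte_field(word, count, mask):
    """Rotate right by count, then mask the high byte (BH)."""
    bx = ror16(word, count)     # ror bx, cl
    bh = bx >> 8                # the upper byte of BX
    return bh & mask            # and bh, mask


status = 0b0010_1000_0000_0000    # sample word, bits 13..11 = 101
control = 0b0000_1000_0000_0000   # sample word, bits 11..10 = 10
print(high_byte_field(status, 3, 0b00000111))
print(high_byte_field(control, 2, 0b00000011))
```
-
- The first call picks out 5 and the second picks out 2. A plain
- right shift by 11 or 10 followed by a mask would give the same
- numbers, but in the low byte instead of BH.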
-
- You are now going to write a program that inputs an unsigned
- number and prints out its hex representation. Here it is:
-
-
- ; + + + + + + + + + + + + + + + START CODE BELOW THIS LINE
- mov ax_byte, 0A5h ; half regs, right ascii
- mov bx_byte, 4 ; hex
- mov dx_byte, 4 ; hex
- lea ax, ax_byte
- call set_reg_style
- call show_regs
-
- outer_loop:
- call get_unsigned
- mov bx, ax
- mov dx, ax
- mov cx, 4
- inner_loop:
- push cx ; save cx
- mov cl, 4
- rol bx, cl ; rotate left 1/2 byte
- mov al, bl ; copy to al
- and al, 0Fh ; mask off upper 1/2 byte
- cmp al, 10 ; < 10, 0 - 9 ; > 9 A - F
- jae use_letters
- add al, '0' ; change to ascii
- jmp print_it
- use_letters:
- add al, 'A' - 10 ; 10 = 'A'
- print_it:
- call print_ascii_byte
- call show_regs_and_wait
- pop cx
- loop inner_loop
- jmp outer_loop
- ; + + + + + + + + + + + + + + + END CODE ABOVE THIS LINE
-
- AL will be shown in ascii while BX and DX will be in hex. We save
- the original number in DX. Since the first thing we want to print
- is the left hex character, we rotate left, not right. We move the
- low byte to AL, mask off everything but the low hex number and
- then convert to an ascii character. If it is 0 - 9, we add '0'
- (the character, not the number). If it is > 9, we add "'A' - 10"
- and get a letter (if the number is 10, we get 'A'). JAE means
- jump if above or equal, and is an unsigned comparison.{3}
-
-
- ____________________
-
- 3 You are getting inundated with conditional jump
- instructions. Don't worry. As long as you understand each one
- when you run across it, you don't have to remember it. All jump
- instructions will be covered soon.
-
- Finally, we print the ascii character that is in AL.{4}
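-
- The same rotate, mask and convert loop can be followed in Python.
- This is a sketch of the logic only; rol16 and to_hex are made-up
- names and the result is returned as a string instead of being
- printed a character at a time.
-
```python
def rol16(value, count):
    """Model 'rol bx, cl' on a 16-bit word."""
    count %= 16
    value &= 0xFFFF
    return ((value << count) | (value >> (16 - count))) & 0xFFFF


def to_hex(number):
    """Build the 4-digit hex text of a 16-bit number, left digit first."""
    bx = number & 0xFFFF
    chars = []
    for _ in range(4):                 # mov cx, 4 ... loop inner_loop
        bx = rol16(bx, 4)              # rol bx, cl with cl = 4
        al = bx & 0x0F                 # mov al, bl / and al, 0Fh
        if al >= 10:                   # cmp al, 10 / jae use_letters
            chars.append(chr(al + ord('A') - 10))
        else:
            chars.append(chr(al + ord('0')))
    return ''.join(chars)


print(to_hex(0x1A2F))
print(to_hex(255))
```
-
- Because each rotate brings the next hex digit from the top of the
- word into the low four bits, the digits come out left to right.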
-
- Another thing to notice is that just inside the loop we push CX.
- That is because we use CL for the ROL instruction. It is then
- POPped just before the loop instruction. This is typical. CX (or
- its low half, CL) is the only register that can hold the count for
- counted instructions such as LOOP and the CL shifts. It is common
- for counted instructions to be nested, so you temporarily store the
- old value of CX while you are using CX for something different.
-
- push cx ; typical code for a shift
- mov cl, 7
- shr si, cl
- pop cx
-
-
- Finally, let's multiply large numbers by 2. Here's the code:
-
- ; + + + + + + + + + + + + + + + START DATA BELOW THIS LINE
- byte1 db ?
- byte2 db ?
- byte3 db ?
- byte4 db ?
- error_message db "Result is too large.", 0
- ; + + + + + + + + + + + + + + + END DATA ABOVE THIS LINE
-
- ; + + + + + + + + + + + + + + + START CODE BELOW THIS LINE
- outer_loop:
- lea ax, byte1 ; get 4 byte number
- call get_unsigned_4byte
-
- shl byte1, 1
- rcl byte2, 1
- rcl byte3, 1
- rcl byte4, 1
- jnc go_on
- lea ax, error_message
- call print_string
- go_on:
- lea ax, byte1
- call print_unsigned_4byte
- jmp outer_loop
- ; + + + + + + + + + + + + + + + END CODE ABOVE THIS LINE
-
- This will require some explanation. Get_unsigned_4byte gets an
- unsigned number of up to about four billion. We put it in memory.
- Normally, the
- following instructions would be done word by word. We are doing
- them byte by byte so you can see the mechanics of the situation.
- The low byte is shifted left 1 bit. This doubles it, but may
- shift a 1 bit from the high bit into CF. If it does, then it will
- be present when we rotate byte2. That moves CF into the low bit
- and moves the high bit into CF. We do it again. And again. If
- there is an unsigned overflow, it will be signalled by CF being
- ____________________
-
- 4 Any subroutine in ASMHELP.OBJ that involves a one byte input
- or output has the data in AL.
-
- set after:
-
- rcl byte4, 1
-
- JNC (jump on not carry) will skip the error message if everything
- is ok. Print_string prints a zero terminated string, that is a C
- string which is terminated by the number (not the character) 0.
- Finally, we print the number.
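-
- The byte-by-byte doubling can be followed in Python. This sketch
- keeps the four bytes low byte first, as they sit in memory, and
- passes CF from one step to the next. The names shl8, rcl8 and
- double_4byte are made up for the illustration.
-
```python
def shl8(value):
    """Model 'shl byte, 1': return (result, CF)."""
    return (value << 1) & 0xFF, (value >> 7) & 1


def rcl8(value, cf):
    """Model 'rcl byte, 1': old CF enters the low bit."""
    return ((value << 1) | cf) & 0xFF, (value >> 7) & 1


def double_4byte(number):
    """Double a 4-byte unsigned number; return (result, final CF)."""
    data = list(number.to_bytes(4, 'little'))   # byte1 .. byte4
    data[0], cf = shl8(data[0])                 # shl byte1, 1
    for i in range(1, 4):
        data[i], cf = rcl8(data[i], cf)         # rcl byte2..byte4, 1
    return int.from_bytes(bytes(data), 'little'), cf


print(double_4byte(255))
print(double_4byte(3_000_000_000))
```
-
- A final CF of 1 means the doubled number no longer fits in four
- bytes, which is the case the error message reports.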
-
- A word about large numbers in ASMHELP.OBJ. It is assumed that you
- would like to use commas if you could. Any data type over 1 word
- long allows commas. The following are considered the same by
- ASMHELP.OBJ in its input routines:
-
- 23546787
- 2,3,5,4,6,7,8,7
- 23,,5,46,,78,7
- 23,546787
- 23,546,787
-
- It always prints commas correctly in the print routines.
-
-
- SUMMARY
-
- All shift and rotate instructions operate on either a register or
- on memory. They can be either 1 bit shifts:
-
- sal cx, 1
- ror variable1, 1
- shr bl, 1
-
- or shifts indexed by CL (it must be CL):
-
- rcl variable2, cl
- sar si, cl
- rol ah, cl
-
-
-
- SHL and SAL
-
- SHL (shift logical left) and SAL (shift arithmetic left) are
- exactly the same instruction. They move bits left. 0s are
- placed in the low bit. Bits are shoved off the register (or
- memory data) on the left side, and CF indicates whether the
- last bit shoved was a 1 or a 0. It is used for multiplying
- an unsigned number by powers of 2.
-
-
- SHR
-
- SHR (shift logical right) does the same thing as SHL but in
- the opposite direction. Bits are shifted right. 0s are
- placed in the high bit. Bits are shoved off the register (or
- memory data) on the right side and CF indicates whether the
- last bit shoved off was a 0 or a 1. It is used for dividing
- an unsigned number by powers of 2.
-
-
- SAR
-
- SAR (shift arithmetic right) shifts bits right. The high
- (sign) bit stays the same throughout the operation. Bits are
- shoved off the register (or memory data) on the right side.
- CF indicates whether the last bit shoved off was a 1 or a 0.
- It is used (with difficulty) for dividing a signed number by
- powers of 2.
-
-
- ROR and ROL
-
- ROR (rotate right) and ROL (rotate left) rotate the bits of
- a register (or memory data) right and left respectively. The
- bit which is shoved off one end is moved to the other end.
- CF indicates whether the last bit moved from one end to the
- other was a 1 or a 0.
-
- RCR and RCL
-
- RCR (rotate through carry right) and RCL (rotate through
- carry left) rotate the bits of a register (or of memory
- data) right and left respectively. The bit which is shoved
- off the register (or data) is placed in CF and the old CF is
- placed on the other side of the register (or data).
-
-
- INC
- INC increments a register or a variable by 1.
-
- inc ax
- inc variable1
-
- DEC
- DEC decrements a register or a variable by 1.
-
- dec ax
- dec variable1
-
-
-
- The following is fairly technical. It is only for those willing
- to wade their way through a turgid explanation. If you don't
- understand it, forget it.
-
- CODE FOR SHR
-
- If you are shifting an UNSIGNED number right by 'X' bits, it is
- the same as dividing by (2 ** X): 1 bit = (2**1 = 2), 2 bits =
- (2**2 = 4), 7 bits = (2**7 = 128). This is the same as dividing
- by a number which is all 0s except bit X, which is 1 (for 0
- we have 0000 0001, for 1 we have 0000 0010, for 3 we have 0000
- 1000, for 7 we have 1000 0000). The remainder mask will be this
- number minus 1 (for 0 we have 0000 0000, for 1 we have 0000 0001,
- for 3 we have 0000 0111, for 7 we have 0111 1111).
-
-
- CODE FOR SAR
-
- The order of numbers is important for SAR. If you start with 0
- and add 1 each time, the actual sequence of signed numbers that
- you get (from the bottom up) is:
-
-
- -1
- -2
- .
- .
- -32767
- -32768
- +32767
- +32766
- .
- .
- 3
- 2
- 1
- 0
-
- The positive numbers are increasing in absolute value while the
- negative numbers are decreasing in absolute value. If you divide
- by shifting and there is no remainder, then the quotient is
- exact. If there is a remainder, the quotient will truncate
- towards 0 IN THE ABOVE DIAGRAM. This means that positive numbers
- will truncate down, while the negative numbers will truncate
- towards -32768, and will be one too negative.
-
- If the number was positive, the remainder will be positive and
- will be exactly the same as for SHR. If the number was negative,
- then things are more complicated. We'll take division by 32 as an
- example. If we divide by 32 (0010 0000) the remainder mask will
- be 31 (0001 1111). If the number is negative, then what we get
- when we AND the mask:
-
- and ax, 00011111b
-
- is not the remainder but (remainder + 32). In order to get the
- actual negative remainder, we need to subtract 32. This gives us
- (remainder + 32 - 32).
-
- remainder mask = divisor - 1
- negative remainder correction = NEG divisor.
-
-