DEFINT A-Z
DECLARE SUB LookForMatch (x, i)
DECLARE SUB GetCommNames ()
DECLARE SUB Resynch ()
DECLARE SUB Normal ()
DECLARE SUB need1 (diff)
DECLARE SUB need2 (diff1, diff2)
DECLARE FUNCTION fchangesok (x, ip, diff1, diff2, diff3, diff4)
DECLARE SUB SetDiffer (k, x)
DECLARE FUNCTION FileExists! (testfile$)
' $INCLUDE: 'qb.bi'

CONST IBUF% = 100
REM IBUF% is the input buffer size for obtaining the ascii value of each
REM byte. Speed seems relatively independent of the IBUF% size.
CONST ISIZE% = 32000
CONST ISIZE1% = 22000
CONST ISIZE2% = ISIZE% - ISIZE1%
CONST ISIZE3% = ISIZE1% - ISIZE2%
REM This is the hunk of file that we work with. We read in the first
REM 32000 (or all if less), and then try to obtain matches for the first
REM 22000 of new. Then we go again, throwing away the
REM first 10000 (ISIZE2%), and reading in the next 10000, then matching
REM from 12000 (ISIZE3%) up to 22000, and so on.
CONST ISEG% = 16
REM This is the hunk we use for basic matching.
REM ISEG%*256 must be less than 32K-1, because we sum ISEG% values into
REM an INTEGER value.
CONST MAXINSERT% = 1000
REM This is the area of the search for matches from the current position.
REM Speed is directly proportional to the size of this.
CONST SEGARRAYSIZE% = ISIZE% \ ISEG%
CONST SEGARRAYSIZE1% = ISIZE1% \ ISEG%
REM This is the size of the arrays holding details of matches

REM $DYNAMIC
DIM bytesold(ISIZE%)
REM This is the array of bytes from the old version
DIM bytesnew(ISIZE%)
REM This is the array of bytes from the new version
DIM matched(SEGARRAYSIZE%)
REM This is the array that says where we have a match (0 if none)
REM for each block of ISEG% bytes.
DIM pointerchange(SEGARRAYSIZE%)
REM This array holds the change in the offset pointer needed for this
REM match.
DIM matchtype(SEGARRAYSIZE%)
REM This holds the type of the match, 1 for type A, 2 for type B, and
REM 3 for type C.
DIM subtype(SEGARRAYSIZE%)
REM For type B and C matches, this identifies the diffs to use.
DIM differa(SEGARRAYSIZE%)
DIM differb(SEGARRAYSIZE%)
DIM differc(SEGARRAYSIZE%)
DIM differd(SEGARRAYSIZE%)
REM These hold the diffa and the diffb values for type B and type C
REM matches. The value is new - old. Differc and differd are used
REM for type D matches. Note that for type D, one differ may be
REM zero.
DIM bitval(8)
REM holds the values of the bits in an octet for easy setting in bitmaps
DIM gdiff(4), gdiffcnt(4)
REM Used to hold the global difference values we maintain during analysis
DIM gdiff2(4)
REM Used to hold the global difference values we maintain during output
REM SHARED reffile$, newfile$, sendfile$, diaglevel$, diagoutput$
REM SHARED recreffile$, refnewfile$

COLOR 15, 1, 1
CALL GetCommNames
done& = 0
usedold& = 0
donesegs& = 0
REM This is the amount of the new file that we have already done as a
REM previous hunk.
dgf = 0
valjl = 1
FOR i = 8 to 1 step -1
  bitval(i) = valjl
  valjl = valjl*2
next i
IF INSTR(diaglevel$,"T") <> 0 THEN
  dgf = FREEFILE
  OPEN diagoutput$ FOR OUTPUT AS #dgf
END IF
IF INSTR(diaglevel$,"P") <> 0 then
  CLS
  locate 2,26
  color 14,4,1
  PRINT "Forward compression program";
  color 15,1,1
  locate 5,1
  PRINT "Compressing " + newfile$ + " using " + reffile$ + " as a reference file";
END IF
IF INSTR(diaglevel$,"B") <> 0 THEN
  PRINT "JLPAK Copyright IT Institute 1990 - Forward compression program version 1.0"
  PRINT " "
  bline = CSRLIN
  locate bline,1
  PRINT "Percent done is: 0% Reading in files for processing - please wait ...";
END IF

fpold = FREEFILE
buff$ = STRING$(IBUF%, " ")
if fileexists(reffile$) then
  OPEN reffile$ FOR BINARY AS #fpold
else
  PRINT "**** ERROR - Reference file " + reffile$ + " does not exist"
  if dgf <> 0 then close #dgf
  end
end if
checked = 0
oldsumcheck1 = 0
oldsumcheck2 = 0
WHILE NOT EOF(fpold)
  GET #fpold, ,buff$
  if checked < 10 then
    checked = checked + 1
    if not eof(fpold) then
      for i = 1 to IBUF%
        octval = ASC(MID$(buff$,i,1))
        oldsumcheck1 = (oldsumcheck1 + octval) MOD 255
        oldsumcheck2 = (oldsumcheck2 + oldsumcheck1) MOD 255
      next i
    else
      lengthold& = loc(fpold)
      for i = 1 to (lengthold& MOD IBUF%)
        octval = ASC(MID$(buff$,i,1))
        oldsumcheck1 = (oldsumcheck1 + octval) MOD 255
        oldsumcheck2 = (oldsumcheck2 + oldsumcheck1) MOD 255
      next i
    end if
  end if
WEND
lengthold& = loc(fpold)
close #fpold
OPEN reffile$ FOR BINARY AS #fpold
ICNT1 = 0
WHILE NOT EOF(fpold) AND ICNT1 + IBUF% <= ISIZE%
  GET #fpold, , buff$
  FOR i = 1 TO IBUF%
    ICNT1 = ICNT1 + 1
    bytesold(ICNT1) = ASC(MID$(buff$, i, 1))
  NEXT i
WEND
endold = ISIZE%
if lengthold& < ISIZE% then endold = lengthold&
IF INSTR(diaglevel$,"P") <> 0 THEN
  locate 10,1
  PRINT "First hunk of reference file loaded - total length is " + str$(lengthold&) + " bytes";
END IF

fpnew = FREEFILE
buff$ = STRING$(IBUF%, " ")
if fileexists(newfile$) then
  OPEN newfile$ FOR BINARY AS #fpnew
else
  PRINT "**** ERROR - File to be compressed (" + newfile$ + ") does not exist"
  if dgf <> 0 then close #dgf
  close #fpold
  end
end if
sumcheck1 = 0
sumcheck2 = 0
WHILE NOT EOF(fpnew)
  GET #fpnew, ,buff$
  if not eof(fpnew) then
    for i = 1 to IBUF%
      octval = ASC(MID$(buff$,i,1))
      sumcheck1 = (sumcheck1 + octval) MOD 255
      sumcheck2 = (sumcheck2 + sumcheck1) MOD 255
    next i
  else
    lengthnew& = loc(fpnew)
    for i = 1 to (lengthnew& MOD IBUF%)
      octval = ASC(MID$(buff$,i,1))
      sumcheck1 = (sumcheck1 + octval) MOD 255
      sumcheck2 = (sumcheck2 + sumcheck1) MOD 255
    next i
  end if
WEND
lengthnew& = loc(fpnew)
close #fpnew
OPEN newfile$ FOR BINARY AS #fpnew
ICNT2 = 0
WHILE NOT EOF(fpnew) AND ICNT2 + IBUF% <= ISIZE%
  GET #fpnew, , buff$
  FOR i = 1 TO IBUF%
    ICNT2 = ICNT2 + 1
    bytesnew(ICNT2) = ASC(MID$(buff$, i, 1))
  NEXT i
WEND
endnew = ISIZE%
if lengthnew& < ISIZE% then endnew = lengthnew&
newblocks = endnew \ ISEG%
if lengthold& > ISIZE% or lengthnew& > ISIZE% then
  if newblocks > SEGARRAYSIZE1% then newblocks = SEGARRAYSIZE1%
end if
IF INSTR(diaglevel$,"P") <> 0 THEN
  locate 11,1
  PRINT "First hunk of main file loaded - total length is " + str$(lengthnew&) + " bytes";
END IF

REM Now we initialise all variables, ready to start the first hunk analysis
hunkstart = 0
startoffset = 0
factor = 1
skipval = 1
nomatchcnt = 0
resyncstate = 0
typeacnt = 0
typebcnt = 0
typeccnt = 0
typed1cnt = 0
typed2cnt = 0
nomatches = 0
ptrchs = 0
ptr = 0
for i = 1 to 4
  gdiff(i) = 0
  gdiff2(i) = 0
  gdiffcnt(i) = 0
next i
lastperc& = 0
lasttypeacnt = 0
lasttypebcnt = 0
lasttypeccnt = 0
lasttyped1cnt = 0
lasttyped2cnt = 0
lastnomatches = 0
lastptrchs = 0
lastfullmatch = 0
investigate& = 0
cumptr = 0
outputcnt& = 0
obuff$ = ""
IF FileExists(sendfile$) THEN KILL sendfile$
fp = FREEFILE
OPEN sendfile$ FOR BINARY AS #fp

REM Output the preamble text
'There will be an initial block containing:
' 1. a length count (one octet) for the next item.
' 2. the full path name of the file that is to be compressed
'    (on decompression, a file TEMP.TMP will be created in the current
'    directory, and then copied to the original destination path when
'    decompression is complete, unless overridden by a different target).
' 3. two octets of sum check for the whole of this file; the sum check
'    is the standard one of SIGMA(Ai) and SIGMA(iAi), all MOD 255.
' 4. a length count for the next item.
' 5. the full path name of the reference file (this can be overridden
'    on decompression)
' 6. two octets of sum check for the first 1K of the reference file.
obuff$ = obuff$ + CHR$(LEN(recnewfile$))
obuff$ = obuff$ + recnewfile$
obuff$ = obuff$ + CHR$(sumcheck1) + CHR$(sumcheck2)
obuff$ = obuff$ + CHR$(LEN(recreffile$))
obuff$ = obuff$ + recreffile$
obuff$ = obuff$ + CHR$(oldsumcheck1) + CHR$(oldsumcheck2)
outputcnt& = outputcnt& + LEN(recnewfile$) + LEN (recreffile$) + 6
PUT #fp, , obuff$
obuff$ = ""

REM We try to find a match for each ISEG of octets.
REM We start the search at the corresponding position in the
REM old file, offset by startoffset. Startoffset is updated on each
REM match to migrate towards the offset of the match. We will go up to
REM MAXINSERT% * factor from startoffset, working outwards in one byte
REM units.
REM Factor is initially one, but gets adjusted if matches fail.
DO : REM This is the outer loop, LOOP UNTIL finished, for each hunk.
  FOR i = 1 + hunkstart TO newblocks
    matched(i) = 0
    matchtype(i) = 0
    pointerchange(i) = 0
    REM Assume not matched.
    k = 0
    c1 = i * ISEG% + startoffset
    c2 = i * ISEG% + startoffset
    WHILE k <= (MAXINSERT% * factor) AND matched(i) = 0 AND (i MOD skipval) = 0
      IF c1 <= endold AND C1 >= ISEG% + lastfullmatch THEN
        CALL LookForMatch (c1, i)
      END IF
      IF matched(i) <> 0 THEN
        offset = c1 - i * ISEG%
        pointerchange(i) = offset - ptr
        ptr = offset
        startoffset = (startoffset + offset) \ 2
      ELSE
        IF c2 >= ISEG% + lastfullmatch AND c2 <= endold THEN
          CALL LookForMatch(c2, i)
        END IF
        IF matched(i) <> 0 THEN
          offset = c2 - i * ISEG%
          pointerchange(i) = offset - ptr
          ptr = offset
          startoffset = (startoffset + offset) \ 2
        END IF
      END IF
      k = k + 1
      c1 = c1 + 1
      c2 = c2 - 1
    WEND
    IF matched(i) = 0 THEN
      nomatches = nomatches + 1
      SELECT CASE resyncstate
        CASE IS = 0
          nomatchcnt = nomatchcnt + 1
          IF nomatchcnt = 10 then
            resyncstate = 1
          end if
        CASE IS = 1
          if (i mod skipval) = 0 then nomatchcnt = nomatchcnt + 1
          skipval = nomatchcnt - 9
          factor = (nomatchcnt - 8) \ 2
          IF nomatchcnt = 20 then
            call resynch
            resyncstate = 2
          end if
        CASE IS = 2
      END SELECT
    ELSE
      SELECT CASE resyncstate
        CASE IS = 0
          nomatchcnt = 0
        CASE IS = 1
          nomatchcnt = nomatchcnt - 1
          if nomatchcnt = 10 then call normal
          skipval = 1
        CASE IS = 2
          skipval = 1
          resyncstate = 1
      END SELECT
      if pointerchange(i) <> 0 then ptrchs = ptrchs + 1
    END IF
    IF INSTR(diaglevel$,"P") <> 0 THEN
      perc& = i
      realperc& = ((perc& * ISEG% + done&) * 1000)\lengthnew&
      locate 20,1
      PRINT "Percent done is: " + left$(STR$(realperc&/10.0),5) + "% " ;
      locate 21,1
      PRINT "Resync state is: " + str$(resyncstate) + " No match count is: " + str$(nomatchcnt) + " "
      locate 22,1
      PRINT "Number of no matches: " + STR$(nomatches) + " out of " + STR$(i + donesegs&) + " segments";
      locate 23,1
      PRINT "Type A: " + STR$(typeacnt) + " Type B: " +STR$(typebcnt) + " Type C: " + STR$(typeccnt) + " Type D1: " + STR$(typed1cnt) + " Type D2: " + STR$(typed2cnt);
      locate 24,1
      PRINT "Start offset is: " + str$(startoffset) + " ";
    END IF
    IF INSTR(diaglevel$,"B") <> 0 THEN
      perc& = i
      realperc& = ((perc& * ISEG% + done&) * 1000)\lengthnew&
      locate bline,1
      PRINT "Percent done is: " + left$(STR$(realperc&/10.0),5) + " % " ;
    END IF
  NEXT i
  IF INSTR(diaglevel$,"P") <> 0 THEN
    locate 15,1
    PRINT STRING$(80," ")
    locate 10,1
    PRINT STRING$(80," ");
    locate 10,1
    PRINT "Done analysis of a hunk of the file - outputting compressed format";
    locate 11,1
    PRINT STRING$(80," ");
  END IF
  IF INSTR(diaglevel$,"B") <> 0 THEN
    perc& = i
    realperc& = ((perc& * ISEG% + done&) * 1000)\lengthnew&
    locate bline,1
    PRINT "Percent done is: " + left$(STR$(realperc&/10.0),5) + " % Writing out compressed format - please wait ... " ;
  END IF
  'The receiver's state is set up with ptroffset = 0, diffa = 0, and
  'diffb = 0 and done& = 0 and usedold& = 0.
  'These values are retained unless explicitly
  'changed. The difference is new - old
  IF INSTR(diaglevel$,"I") <> 0 AND investigate& >= 0 THEN
    locate 13,1
    INPUT "Specify first i for investigation (0 end, -1 none, -2 all):",investigate&
    IF investigate& = 0 THEN END
  END IF
  'We take each ISEG block in turn.
  i = hunkstart
  WHILE i < newblocks
    i = i + 1
    IF LEN(obuff$) > 100 THEN
      PUT #fp, , obuff$
      obuff$ = ""
    END IF
    'If there is no match, we output 10nnnnnn followed by the nnnnnn sets of
    '16 octets for up to 64 (nnnnnn of 0 means 64) segments.
    'Note that 11nnnnnn is used for total matches (see below). All other
    'codes are 0xxxxxxx.
    IF matched(i) = 0 THEN
      nnn = 1
      j = i + 1
      WHILE matched(j) = 0 AND nnn < 64 AND j <= newblocks
        j = j + 1
        nnn = nnn + 1
      WEND
      k = nnn
      IF nnn = 64 THEN nnn = 0
      nnn = nnn + 128
      obuff$ = obuff$ + CHR$(nnn)
      newstart = (i -1) * ISEG%
      FOR ik = 1 to k
        FOR ii = 1 TO ISEG%
          obuff$ = obuff$ + CHR$(bytesnew(newstart + ii))
        NEXT ii
        IF LEN(obuff$) > 100 THEN
          PUT #fp, , obuff$
          obuff$ = ""
        END IF
        newstart = newstart + ISEG%
      next ik
      i = i + k - 1
      outputcnt& = outputcnt& + (ISEG% * k) + 1
      IF INSTR(diaglevel$,"T") <> 0 THEN
        PRINT #dgf, i + donesegs&;
        PRINT #dgf,STR$(k) + " no match blocks"
        IF INSTR(diaglevel$,"I")<>0 and (investigate& = i + donesegs& or investigate& = -2) THEN
          newstart = (i-1) * ISEG%
          oldstart = newstart + cumptr
          FOR ii = 1 TO ISEG%
            PRINT #dgf, bytesnew(newstart + ii), bytesold(oldstart + ii)
          NEXT ii
          IF investigate& <> -2 THEN
            INPUT "Specify next i for investigation (0 end, -1 none, -2 all):",investigate&
            IF investigate& = 0 THEN END
          END IF
        END IF
      END IF
    ELSE
      'If there is a pointer change, we need to signal it in the output.
      'We have four cases for the value of the ptr change:
      ' positive and less than (or equals) 256
      ' negative and less than (or equals) 256
      ' positive and over 256
      ' negative and over 256
      'We use 00000000, 00000001, 00010000, and 00010001 for these four cases.
      'In the first two cases, we follow with a single octet, and in the last
      'two with two octets. In the case of the single octet, zero means 256.
      IF pointerchange(i) <> 0 THEN
        ptr2 = pointerchange(i)
        cumptr = cumptr + pointerchange(i)
        ptrcase = 0
        IF ptr2 < 0 THEN
          ptr2 = -ptr2
          ptrcase = 1
        END IF
        IF ptr2 = 256 THEN ptr2 = 0
        IF ptr2 <= 256 THEN
          obuff$ = obuff$ + CHR$(ptrcase) + CHR$(ptr2)
          outputcnt& = outputcnt& + 2
          IF INSTR(diaglevel$,"T") <> 0 THEN
            PRINT #dgf, i + donesegs&;
            PRINT #dgf, " Pointer change to " + STR$(cumptr)
          END IF
        ELSE
          obuff$ = obuff$ + CHR$(ptrcase + 16) + CHR$(ptr2 \ 256) + CHR$(ptr2 MOD 256)
          outputcnt& = outputcnt& + 3
          IF INSTR(diaglevel$,"T") <> 0 THEN
            PRINT #dgf, i + donesegs&;
            PRINT #dgf, " Pointer change to " + STR$(cumptr)
          END IF
        END IF
      END IF
      IF matched(i) + usedold& - i*ISEG% - done& <> cumptr THEN
        PRINT "**** ERROR - Cumulative pointer error (bug):", matched(i), cumptr
        END
      END IF
      IF INSTR(diaglevel$,"T") <> 0 THEN
        PRINT #dgf, "Dealing with a matchtype of "; matchtype(i)
      END IF
      'Next we look at whether a diffa, diffb, diffc, diffd change is needed.
      'If a change of a diff is needed, we
      'have the following cases for the diff:
      ' positive and less than 256
      ' negative and less than 256
      'We use 00100000 and 00100001 for these cases for diffa, 00110000 and
      '00110001 for these cases for diffb, 01000000 and 01000001 for diffc,
      'and 01010000 and 01010001 for diffd, followed by a single byte which gives
      'the new diff value.
      SELECT CASE matchtype(i)
        CASE 1: REM Type A - changes never needed, no action
        CASE 2: REM Type B - may need a diffx change, x depending on the subtype
          k = subtype(i)
          CALL SetDiffer (k, differa(i))
        CASE 3: REM Type C - may need a diffx and/or a diffy change, depending
          REM on the subtype (1 to 6)
          k = subtype(i)
          SELECT CASE k
            CASE 1 TO 3 :
              CALL SetDiffer (1, differa(i))
              CALL SetDiffer (k + 1, differb(i))
            CASE 4 TO 5:
              CALL SetDiffer (2, differa(i))
              CALL SetDiffer (k - 1, differb(i))
            CASE 6:
              CALL SetDiffer (3, differa(i))
              CALL SetDiffer (4, differb(i))
          END SELECT
        CASE 4: REM Type D - may need up to two changes, could be any ones. Check
          REM each one.
          CALL SetDiffer (1,differa(i))
          CALL SetDiffer (2,differb(i))
          CALL SetDiffer (3,differc(i))
          CALL SetDiffer (4,differd(i))
      END SELECT
      SELECT CASE matchtype(i)
        CASE 1: REM Type A - total match
          'For a type A match, we look
          'to see if the next block is also type A with no pointer change. If so,
          'we count the number of blocks we can merge in, up to a maximum of 64,
          'and output 11nnnnnn, where n is the count of the number of merged-in
          'blocks
          '(nnnnnn all zeros means 64).
          nnn = 1
          j = i + 1
          WHILE matchtype(j) = 1 AND nnn < 64 AND pointerchange(j) = 0 AND j <= newblocks
            j = j + 1
            nnn = nnn + 1
          WEND
          k = nnn
          i = i + k - 1
          IF nnn = 64 THEN nnn = 0
          nnn = nnn + 128 + 64
          obuff$ = obuff$ + CHR$(nnn)
          outputcnt& = outputcnt& + 1
          IF INSTR(diaglevel$,"T") <> 0 THEN
            PRINT #dgf, i + donesegs&;
            PRINT #dgf,STR$(k) + " type A blocks"
          END IF
        CASE 2: REM Type B - match with a single diff value.
          'For a type B match, we determine whether to use diffa, b, c, or d (based
          'on this segment alone, and possibly with a diff change). We then merge in
          'up to 8 segments (nnn = 000 means 8) with the same pointer and diff value
          '(after juggling diffs if necessary). The coding is 0nnn0010 for use of diffa
          '0nnn0011 for use of diffb, 0nnn0100 for use of diffc, and 0nnn0101 for use of
          'diffd. nnn is the number of segments merged in. We then follow with a
          'string of bits, one per byte in each of the segments, where 0 means no
          'addition, and 1 means add in the selected diff. Bits up to the next octet
          'boundary are ignored.
          nnn = 1
          j = i + 1
          WHILE matchtype(j) = 2 AND nnn < 8 AND pointerchange(j) = 0 AND differa(j) = differa(i) AND j<=newblocks
            j = j + 1
            nnn = nnn + 1
          WEND
          k = nnn
          IF nnn = 8 THEN nnn = 0
          testdiff = gdiff2(subtype(i))
          dcase = subtype(i) + 1
          obuff$ = obuff$ + CHR$(nnn*16 + dcase)
          outputcnt& = outputcnt& + 1
          IF INSTR(diaglevel$,"T") <> 0 THEN
            PRINT #dgf, i + donesegs&;
            PRINT #dgf, STR$(k) + " type B blocks. Difference is " + str$(testdiff)
          END IF
          REM We now output k sets of 2 octets, showing where testdiff is to be
          REM added in for each of the k segments.
          newstart = (i -1) * ISEG%
          oldstart = matched(i) - ISEG%
          FOR ik = 1 to 2 * k
            bitmap = 0
            FOR ii = 1 TO ISEG%\ 2
              IF bytesnew(newstart + ii) <> bytesold(oldstart + ii) THEN
                IF bytesnew(newstart + ii) - bytesold(oldstart+ii) <> testdiff THEN
                  PRINT "**** ERROR - Type B difference is not differa (bug)"
                  END
                END IF
                bitmap = bitmap + bitval(ii)
              END IF
            NEXT ii
            oldstart = oldstart + ISEG% \2
            newstart = newstart + ISEG% \2
            obuff$ = obuff$ + chr$(bitmap)
            outputcnt& = outputcnt& + 1
          next ik
          i = i + k -1
        CASE 3: REM Type C - match with two diff values
          'For a type C match, we have six possible diff selections to use, coded as
          'follows (again, merging in up to 8 possible segments):
          ' diffa and diffb 0nnn0110
          ' diffa and diffc 0nnn0111
          ' diffa and diffd 0nnn1000
          ' diffb and diffc 0nnn1001
          ' diffb and diffd 0nnn1010
          ' diffc and diffd 0nnn1011
          'We then follow with a string of bits, one per byte in each of the segments,
          'where 0 means no addition, and 10 means the first mentioned diff is added in,
          'and 11 means the second mentioned diff is added in.
          diff1 = differa(i)
          diff2 = differb(i)
          nnn = 1
          j = i + 1
          WHILE matchtype(j) = 3 AND nnn < 8 AND pointerchange(j) = 0 AND differa(j) = diff1 AND differb(j) = diff2 AND j<=newblocks
            j = j + 1
            nnn = nnn + 1
          WEND
          k = nnn
          IF nnn = 8 THEN nnn = 0
          nnn = nnn * 16 + subtype(i) + 5
          obuff$ = obuff$ + CHR$(nnn)
          outputcnt& = outputcnt& + 1
          IF INSTR(diaglevel$,"T") <> 0 THEN
            PRINT #dgf, i + donesegs&;
            PRINT #dgf, STR$(k) + " type C blocks. Differences are " + str$(diff1) + " and " + str$(diff2)
          END IF
          REM We now output k sets of bitstrings, showing where diff1 (10) and
          REM diff2 (11) or neither (0) are to be added in for each of the k segments.
          newstart = (i -1) * ISEG%
          oldstart = matched(i) - ISEG%
          bitstring$ = ""
          FOR ik = 1 to k
            FOR ii = 1 TO ISEG%
              SELECT CASE bytesnew(newstart + ii) - bytesold(oldstart + ii)
                CASE IS = 0
                  bitstring$ = bitstring$ + "0"
                CASE IS = diff1
                  bitstring$ = bitstring$ + "10"
                CASE IS = diff2
                  bitstring$ = bitstring$ + "11"
                CASE ELSE
                  PRINT "**** ERROR - Type C difference is not differa or differb (bug)"
                  END
              END SELECT
            NEXT ii
            oldstart = oldstart + ISEG%
            newstart = newstart + ISEG%
          next ik
          bitstring$ = bitstring$ + "0000000"
          blen = len(bitstring$)
          blen = 8 * (blen\8)
          bitstring$ = left$(bitstring$,blen)
          outc = 0
          j = 0
          jj = 0
          bitmap = 0
          while j<blen
            j=j+1
            jj = jj + 1
            if mid$(bitstring$,j,1) = "1" THEN bitmap = bitmap + bitval(jj)
            if jj = 8 then
              jj = 0
              obuff$ = obuff$ + chr$(bitmap)
              outputcnt& = outputcnt& + 1
              bitmap = 0
            end if
          wend
          i = i + k -1
        CASE 4: REM Type D
          'For a type D match, we encode the segments
          'as 0nnn1100 (for up to 8 segments) where we have four differs, and
          '0nnn1101 where diffa is omitted, 0nnn1110 where diffb is omitted, 0nnn1111
          'where diffc is omitted. For diffd omitted, we have no code space left,
          'and treat that as if it were a four differ (code 0nnn1100).
          'This is followed by a bitstring for each
          'octet in each of the segments, where we have 0 if no diff is to
          'be added in, then, for the three differs in use,
          '100 if the first is to be added in, 101 for the second, and 11 for the third.
          'For four differs, we have
          '100 if diffa is to be added in, 101 for diffb, 110 for diffc,
          '111 for diffd.
          diff1 = differa(i)
          diff2 = differb(i)
          diff3 = differc(i)
          diff4 = differd(i)
          nnn = 1
          j = i + 1
          WHILE matchtype(j) = 4 AND nnn < 8 AND pointerchange(j) = 0 AND differa(j) = diff1 AND differb(j) = diff2 AND differc(j) = diff3 AND differd(j) = diff4 AND j<=newblocks
            if subtype(j) <> subtype(i) then subtype(i) = 0 :REM use all four if necessary
            j = j + 1
            nnn = nnn + 1
          WEND
          if subtype(i) = 4 then subtype(i) = 0 :REM This is because we ran out of code space
          SELECT CASE subtype(i)
            CASE 0
            CASE 1
              diff1 = differb(i)
              diff2 = differc(i)
              diff3 = differd(i)
              diff4 = 0
            CASE 2
              diff2 = differc(i)
              diff3 = differd(i)
              diff4 = 0
            CASE 3
              diff3 = differd(i)
              diff4 = 0
            CASE 4
          END SELECT
          k = nnn
          IF nnn = 8 THEN nnn = 0
          nnn = nnn * 16 + 12
          nnn = nnn + subtype(i)
          obuff$ = obuff$ + CHR$(nnn)
          outputcnt& = outputcnt& + 1
          IF INSTR(diaglevel$,"T") <> 0 THEN
            PRINT #dgf, i + donesegs&;
            IF subtype(i) = 0 THEN
              PRINT #dgf, STR$(k) + " type D2 blocks. Differences are " + str$(diff1) + " and " + str$(diff2) + " and " + str$(diff3) + " and " + str$(diff4)
            ELSE
              PRINT #dgf, STR$(k) + " type D1 blocks. Differences are " + str$(diff1) + " and " + str$(diff2) + " and " + str$(diff3) + " and " + str$(diff4)
            END IF
          END IF
          REM We now output k sets of bitstrings, showing where diff1 (100),
          REM diff2 (101), diff3 (110 - or 11 if only three differs), diff4 (111)
          REM or neither (0) are to be added in for each of the k segments.
          diff3str$ = "110"
          if subtype(i) <> 0 then diff3str$ = "11"
          newstart = (i -1) * ISEG%
          oldstart = matched(i) - ISEG%
          bitstring$ = ""
          FOR ik = 1 to k
            FOR ii = 1 TO ISEG%
              SELECT CASE bytesnew(newstart + ii) - bytesold(oldstart + ii)
                CASE IS = 0
                  bitstring$ = bitstring$ + "0"
                CASE IS = diff1
                  bitstring$ = bitstring$ + "100"
                CASE IS = diff2
                  bitstring$ = bitstring$ + "101"
                CASE IS = diff3
                  bitstring$ = bitstring$ + diff3str$
                CASE IS = diff4
                  bitstring$ = bitstring$ + "111"
                CASE ELSE
                  PRINT "**** ERROR - Type D difference is not diffa or diffb or diffc or diffd (bug)"
                  END
              END SELECT
            NEXT ii
            oldstart = oldstart + ISEG%
            newstart = newstart + ISEG%
          next ik
          bitstring$ = bitstring$ + "0000000"
          blen = len(bitstring$)
          blen = 8 * (blen\8)
          bitstring$ = left$(bitstring$,blen)
          j = 0
          jj = 0
          bitmap = 0
          while j<blen
            j=j+1
            jj = jj + 1
            if mid$(bitstring$,j,1) = "1" THEN bitmap = bitmap + bitval(jj)
            if jj = 8 then
              jj = 0
              obuff$ = obuff$ + chr$(bitmap)
              outputcnt& = outputcnt& + 1
              bitmap = 0
            end if
          wend
          i = i + k -1
      END SELECT
    END IF
  WEND
  REM Now loop to do another hunk if necessary
  finished = 0
  IF lengthnew& - (newblocks + donesegs&)*ISEG% < ISEG% THEN
    finished = 1
    close #fpold
    close #fpnew
  ELSE
    REM Prepare for next hunk
    hunkstart = ISIZE3% \ ISEG%
    REM move new array down by ISIZE2%, and oldarray down by
    REM ISIZE2% + startoffset, and decrement ptr and startoffset by startoffset
    REM and signal the move in the sendfile.
    done& = done& + ISIZE2%
    donesegs& = done& \ ISEG%
    move = ISIZE2% + startoffset
    if move < 0 then move = 0
    usedold& = usedold& + move
    REM Signal a new hunk
    'Where we have discarded some old file and refilled the buffers, we
    'signal this
    'in the output by putting out a single octet of 01100000 followed by
    'two octets giving the amount of the move.
    obuff$ = obuff$ + chr$(96) + chr$(move \ 256) + chr$(move MOD 256)
    outputcnt& = outputcnt& + 3
    ptr = ptr + (ISIZE2% - move)
    startoffset = startoffset + (ISIZE2% - move)
    lastfullmatch = lastfullmatch - move
    IF endold < move + ISEG% THEN
      PRINT "**** ERROR - Files are too different"
      if dgf <> 0 then CLOSE #dgf
      close #fp
      close #fpnew
      close #fpold
      IF FileExists(sendfile$) THEN KILL sendfile$
      END
    END IF
    endold = endold - move
    for i = 1 to endold
      bytesold(i) = bytesold(i + move)
    next i
    endnew = endnew - ISIZE2%
    for i = 1 to endnew
      bytesnew(i) = bytesnew(i + ISIZE2%)
    next i
    IF INSTR(diaglevel$,"T") <> 0 THEN
      PRINT #dgf, "Move of new by ";ISIZE2%;" and of old by ";move
    END IF
    REM Now read in as much as possible of the files to fill the buffers.
    buff$ = STRING$(IBUF%, " ")
    ICNT1 = endold
    WHILE NOT EOF(fpold) AND ICNT1 + IBUF% <= ISIZE%
      GET #fpold, , buff$
      FOR i = 1 TO IBUF%
        ICNT1 = ICNT1 + 1
        bytesold(ICNT1) = ASC(MID$(buff$, i, 1))
      NEXT i
    WEND
    endold = ISIZE%
    if lengthold& - usedold& < ISIZE% then endold = lengthold& - usedold&
    buff$ = STRING$(IBUF%, " ")
    ICNT2 = endnew
    WHILE NOT EOF(fpnew) AND ICNT2 + IBUF% <= ISIZE%
      GET #fpnew, , buff$
      FOR i = 1 TO IBUF%
        ICNT2 = ICNT2 + 1
        bytesnew(ICNT2) = ASC(MID$(buff$, i, 1))
      NEXT i
    WEND
    endnew = ISIZE%
    if lengthnew& - done& < ISIZE% then endnew = lengthnew& - done&
    newblocks = endnew \ ISEG%
    if lengthold& - usedold& > ISIZE% or lengthnew& - done& > ISIZE% then
      if newblocks > SEGARRAYSIZE1% then newblocks = SEGARRAYSIZE1%
    end if
    IF INSTR(diaglevel$,"P") <> 0 THEN
      locate 10,1
      PRINT STRING$(80," ");
      locate 10,1
      PRINT "Next hunk of files loaded - proceeding with analysis";
      locate 11,1
      PRINT STRING$(80," ");
    END IF
    REM This ends the preparation for the next hunk.
  END IF
LOOP UNTIL finished = 1

'Finally, we have to cope with the residual set of up to 15 octets at the
'end of the file. We simply output 01100001, then a single octet saying
'how many follow, then the octets.
'If there are none, we still output
'the single octet and the null count.
extras = endnew - newblocks * ISEG%
obuff$ = obuff$ + chr$(97) + chr$(extras)
outputcnt& = outputcnt& + 2
for i = 1 to extras
  obuff$ = obuff$ + chr$(bytesnew(newblocks * ISEG% + i))
next i
outputcnt& = outputcnt& + extras
PUT #fp, , obuff$
CLOSE #fp
IF INSTR(diaglevel$,"T") <> 0 THEN
  PRINT #dgf,
  PRINT #dgf,"Statistics for the compression (done using" + STR$(ISEG%) + " byte segments) are"
  percent& = typeacnt
  percent& = (percent& * ISEG% * 1000) \ lengthnew&
  pstr$ = left$(STR$(percent&/10),5) + "%"
  PRINT #dgf," Type A matches: " + pstr$
  percent& = typebcnt
  percent& = (percent& * ISEG% * 1000) \ lengthnew&
  pstr$ = left$(STR$(percent&/10),5) + "%"
  PRINT #dgf," Type B matches: "+pstr$
  percent& = typeccnt
  percent& = (percent& * ISEG% * 1000) \ lengthnew&
  pstr$ = left$(STR$(percent&/10),5) + "%"
  PRINT #dgf," Type C matches: " + pstr$
  percent& = typed1cnt
  percent& = (percent& * ISEG% * 1000) \ lengthnew&
  pstr$ = left$(STR$(percent&/10),5) + "%"
  PRINT #dgf," Type D1 matches: " + pstr$
  percent& = typed2cnt
  percent& = (percent& * ISEG% * 1000) \ lengthnew&
  pstr$ = left$(STR$(percent&/10),5) + "%"
  PRINT #dgf," Type D2 matches: " + pstr$
  percent& = nomatches
  percent& = (percent& * ISEG% * 1000) \ lengthnew&
  pstr$ = left$(STR$(percent&/10),5) + "%"
  PRINT #dgf," Failed matches: "+pstr$
  PRINT #dgf," Pointer changes:",ptrchs
  PRINT #dgf,
END IF
IF INSTR(diaglevel$,"P") <> 0 THEN
  locate 13,1
  PRINT STRING$(80," ");
  locate 13,1
  color 14,4,1
  PRINT "Compression complete. Number of octets output is ", outputcnt&;
  outjl& = lengthnew&
  outjl& = (outjl& * 10) \ outputcnt&
  locate 15,18
  PRINT "This is a compression ratio of "+str$(outjl&/10) +" to 1";
  locate 16,18
  PRINT "Conversion to .ARC will give a little more.";
  locate 18,1
  PRINT "Forward compression finished.";
  locate 24,1
END IF
IF INSTR(diaglevel$,"T") <> 0 THEN
  PRINT #dgf, "Compression complete. Number of octets output is ", outputcnt&
  outjl& = lengthnew&
  outjl& = (outjl& * 10) \ outputcnt&
  PRINT #dgf, " This is a compression ratio of "+str$(outjl&/10) +" to 1"
  PRINT #dgf, " Conversion to .ARC will give a little more."
END IF
IF INSTR(diaglevel$,"B") <> 0 THEN
  locate bline,1
  outjl& = lengthnew&
  outjl& = (outjl& * 10) \ outputcnt&
  PRINT STRING$(80," ");
  locate bline,1
  PRINT "Forward compression complete. Compression ratio is " + str$(outjl&/10) +" to 1"
END IF
close #dgf
END

SUB LookForMatch (x, i)
  SHARED bytesold(), bytesnew(), matched()
  SHARED pointerchange(), matchtype(), subtype()
  SHARED differa(), differb(), differc(), differd()
  SHARED typeacnt, typebcnt, typeccnt, typed1cnt, typed2cnt
  SHARED ptr, gdiff(), gdiffcnt (), lastfullmatch, factor
  diffa = 0
  diffb = 0
  diffc = 0
  diffd = 0
  diffe = 0
  startnew = (i * ISEG%) - ISEG%
  startold = x - ISEG%
  j = 1
  WHILE j <= ISEG% AND diffa = 0
    diff = bytesnew(j + startnew) - bytesold(j + startold)
    IF diff <> 0 THEN diffa = diff
    j = j + 1
  WEND
  WHILE j <= ISEG% AND diffb = 0
    diff = bytesnew(j + startnew) - bytesold(j + startold)
    IF diff <> 0 AND diff <> diffa THEN diffb = diff
    j = j + 1
  WEND
  WHILE j <= ISEG% AND diffc = 0
    diff = bytesnew(j + startnew) - bytesold(j + startold)
    IF diff <> 0 AND diff <> diffa AND diff <> diffb THEN diffc = diff
    j = j + 1
  WEND
  WHILE j <= ISEG% AND diffd = 0
    diff = bytesnew(j + startnew) - bytesold(j + startold)
    IF diff <> 0 AND diff <> diffa AND diff <> diffb AND diff <> diffc THEN diffd = diff
    j = j + 1
  WEND
  WHILE j <= ISEG% AND diffe = 0
    diff = bytesnew(j + startnew) - bytesold(j + startold)
    IF diff <> 0 AND diff <> diffa AND diff <> diffb AND diff <> diffc AND diff <> diffd THEN diffe = diff
    j = j + 1
  WEND
  IF diffe = 0 THEN
    matched(i) = x
    IF diffa = 0 THEN
      matchtype(i) = 1
      typeacnt = typeacnt + 1
      IF factor = 1 THEN lastfullmatch = x
    ELSE
      IF diffb = 0 THEN
        matchtype(i) = 2
        typebcnt = typebcnt + 1
        differa(i) = diffa
        CALL need1(diffa)
        subtype(i) = 0
        for ii = 1 to 4
          if diffa = gdiff(ii) then subtype (i) = ii
        next ii
        if subtype(i) = 0 then PRINT "**** ERROR - In LookForMatch (bug)"
      ELSE
        IF diffc = 0 THEN
          matchtype(i) = 3
          typeccnt = typeccnt + 1
          CALL need2 (diffa, diffb)
          if gdiff(1) = diffa then
            switch = 0
            if diffb = gdiff(2) then subtype(i) = 1
            if diffb = gdiff(3) then subtype(i) = 2
            if diffb = gdiff(4) then subtype(i) = 3
          else
            if gdiff(1) = diffb then
              switch = 1
              if diffa = gdiff(2) then subtype(i) = 1
              if diffa = gdiff(3) then subtype(i) = 2
              if diffa = gdiff(4) then subtype(i) = 3
            else
              if gdiff(2) = diffa then
                switch = 0
                if diffb = gdiff(3) then subtype(i) = 4
                if diffb = gdiff(4) then subtype(i) = 5
              else
                if gdiff(2) = diffb then
                  switch = 1
                  if diffa = gdiff(3) then subtype(i) = 4
                  if diffa = gdiff(4) then subtype(i) = 5
                else
                  subtype(i) = 6
                  if gdiff(3) = diffa then
                    switch = 0
                  else
                    switch = 1
                  end if
                end if
              end if
            end if
          end if
          if switch = 1 then
            sw = diffa
            diffa = diffb
            diffb = sw
          end if
          differa(i) = diffa
          differb(i) = diffb
        ELSE
          REM If we have more than one pointer and one diff, or two diff changes
          REM needed for these options, we treat them as a no match.
          chngesok = fchangesok (x,i,diffa, diffb, diffc, diffd)
          if chngesok = 0 then
            matchtype(i) = 0
            matched(i) = 0
          else
            matchtype(i) = 4
            if diffd = 0 then
              typed1cnt = typed1cnt + 1
            else
              typed2cnt = typed2cnt + 1
            end if
            differa(i) = gdiff(1)
            differb(i) = gdiff(2)
            differc(i) = gdiff(3)
            differd(i) = gdiff(4)
          end if
        END IF
      END IF
    END IF
  END IF
  REM For a type B match, we have in differa the new gdiffx value to be forced,
  REM depending on the subtype.
  REM For a type C match, we have in differa and differb the new gdiffx and
  REM gdiffy values to be forced, depending on the subtype.
  REM For a type D match, we have in differa to differd the new gdiffx
  REM values to be forced (except where differd = 0).
  if matched(i) <> 0 then
    for ii = 1 to 4
      gdiffcnt(ii) = gdiffcnt(ii)\2
    next ii
  end if
END SUB

SUB need1(diff)
  SHARED ptr, gdiff(), gdiffcnt()
  gotit = 0
  FOR i = 1 to 4
    if diff = gdiff(i) then
      gdiffcnt(i) = gdiffcnt(i) + 32
      gotit = 1
    end if
  next i
  if gotit <> 1 then
    mincnt = 1
    for i = 1 to 4
      if gdiffcnt(i) < gdiffcnt(mincnt) then mincnt = i
    next i
    gdiffcnt (mincnt) = gdiffcnt(mincnt) + 32
    gdiff (mincnt) = diff
  end if
END SUB

SUB need2(diff1,diff2)
  SHARED ptr, gdiff(), gdiffcnt()
  gotit1 = 0
  FOR i = 1 to 4
    if diff1 = gdiff(i) then
      gdiffcnt(i) = gdiffcnt(i) + 32
      gotit1 = 1
    end if
  next i
  gotit2 = 0
  FOR i = 1 to 4
    if diff2 = gdiff(i) then
      gdiffcnt(i) = gdiffcnt(i) + 32
      gotit2 = 1
    end if
  next i
  if gotit1 <> 1 then
    mincnt = 1
    for i = 1 to 4
      if gdiffcnt(i) < gdiffcnt(mincnt) then mincnt = i
    next i
    gdiffcnt (mincnt) = gdiffcnt(mincnt) + 32
    gdiff (mincnt) = diff1
  end if
  if gotit2 <> 1 then
    mincnt = 1
    for i = 1 to 4
      if gdiffcnt(i) < gdiffcnt(mincnt) then mincnt = i
    next i
    gdiffcnt (mincnt) = gdiffcnt(mincnt) + 32
    gdiff (mincnt) = diff2
  end if
END SUB

FUNCTION fchangesok (x,ip,diff1, diff2, diff3, diff4)
  DIM diff(4), gotit(4)
  SHARED ptr, gdiff(), gdiffcnt(),factor, resyncstate, subtype()
  diff(1) = diff1
  diff(2) = diff2
  diff(3) = diff3
  diff(4) = diff4
  m = 4
  subtype(ip) = 0
  if diff4 = 0 then m = 3
  matchcnt = 0
  for j = 1 to m
    gotit(j) = 0
    FOR i = 1 to 4
      if diff(j) = gdiff(i) then
        gotit(j) = i
        matchcnt = matchcnt + 1
      end if
    next i
  next j
  for i = 1 to 4
    if gdiffcnt(i) = 0 then matchcnt = matchcnt + 1
  next i
  nok = 0
  ok = 1
  changesok = ok
  if x - ip * ISEG% <> ptr and matchcnt < m - 2 and resyncstate = 0 then changesok = nok
  REM If we have a pointer move, we require no more than two new diffs unless
  REM we are resynching, when anything goes.
  if changesok = 1 then
    for j = 1 to m
      k = gotit(j)
      if k <> 0 then gdiffcnt(k) = gdiffcnt(k) + 32
    next j
    for j = 1 to m
      if gotit(j) = 0 then
        mincnt = 1
        for i = 1 to 4
          if gdiffcnt(i) < gdiffcnt(mincnt) then mincnt = i
        next i
        gdiffcnt (mincnt) = gdiffcnt(mincnt) + 32
        gdiff (mincnt) = diff(j)
      end if
    next j
  end if
  if m <> 4 then
    for i = 1 to 4
      notneeded = 1
      for j = 1 to m
        if gdiff(i) = diff(j) then notneeded = 0
      next j
      if notneeded = 1 then subtype(ip) = i
    next i
  end if
  fchangesok = changesok
END FUNCTION

SUB GetCommNames
  SHARED reffile$, newfile$, sendfile$, diaglevel$, diagoutput$
  SHARED recnewfile$, recreffile$
  DIM z$(7)
  Maxargs = 7
  Numargs = 0: in = 0
  Cl$ = COMMAND$
  l = LEN(Cl$)
  FOR i = 1 TO l
    c$ = MID$(Cl$, i, 1)
    IF (c$ <> " " AND c$ <> CHR$(9)) THEN
      IF in = 0 THEN
        IF Numargs = Maxargs THEN EXIT FOR
        Numargs = Numargs + 1
        in = 1
      END IF
      z$(Numargs) = z$(Numargs) + c$
    ELSE
      in = 0
    END IF
  NEXT i
  IF Numargs < 4 THEN
    PRINT "You have to specify (as command line parameters separated by space):"
    PRINT " the file to be compressed (the new version);"
    PRINT " the reference file (the old version);"
    PRINT " the verbosity level (B or P), and optionally T;"
    PRINT " (B is brief, P is progress messages, T is trace);"
    PRINT " the file to hold the fcm format for transmission."
    PRINT " "
    PRINT "If the verbosity level includes T, you must next specify the file "
    PRINT "to hold the trace information reporting details of matches."
    PRINT " "
    PRINT "Finally, you may optionally include two further parameters which"
    PRINT "specify first the file name of the file to be generated (by default)"
    PRINT "on the receiver's system, and secondly the filename to be used (by"
    PRINT "default) as the reference file (the old version) on the receiver's"
    PRINT "system. If these are omitted, they both default to the name supplied"
    PRINT "for the (new) file to be compressed."
    PRINT
    END
  END IF
  Newfile$ = UCASE$(z$(1))
  Reffile$ = UCASE$(z$(2))
  diaglevel$ = UCASE$(z$(3))
  IF INSTR(diaglevel$,"T") <> 0 THEN
    IF INSTR(diaglevel$,"P") <> 0 THEN
      diaglevel$ = "TP"
    ELSE
      diaglevel$ = "TB"
    END IF
  ELSE
    IF INSTR(diaglevel$,"P") <> 0 THEN
      diaglevel$ = "P"
    ELSE
      diaglevel$ = "B"
    END IF
  END IF
  Sendfile$ = UCASE$(z$(4))
  IF INSTR(diaglevel$,"T") <> 0 AND Numargs < 5 THEN
    PRINT "Tracing requested but no trace file name. Please respecify"
  END IF
  arginc = 0
  IF INSTR(diaglevel$,"T") <> 0 THEN
    diagoutput$ = UCASE$(z$(5))
    arginc = 1
  END IF
  recnewfile$ = newfile$
  recreffile$ = newfile$
  IF numargs >(4 + arginc) then recnewfile$= ucase$(z$(5+arginc))
  IF numargs >(5 + arginc) then recreffile$ = ucase$(z$(6+arginc))
END SUB

SUB SetDiffer ( k, x)
  SHARED gdiff2(), obuff$, outputcnt&, dgf, diaglevel$, i, donesegs&
  if x <> gdiff2(k) then
    REM We need to change gdiff2(k) to x
    diffid = (k + 1) * 16
    gdiff2(k) = x
    sendiff = x
    dcase = 0
    IF sendiff < 0 THEN
      dcase = 1
      sendiff = -sendiff
    END IF
    IF sendiff > 255 THEN
      PRINT "**** ERROR - Difference is out of range (bug): ";sendiff
      sendiff = 0
    END IF
    obuff$ = obuff$ + CHR$(dcase + diffid) + CHR$(sendiff)
    outputcnt& = outputcnt& + 2
    IF INSTR(diaglevel$,"T") <> 0 THEN
      PRINT #dgf, i + donesegs&;
      PRINT #dgf, "Change of diff " + str$(k) + " to " + str$(x)
    END IF
  end if
END SUB

FUNCTION FileExists! (testfile$)
  DIM InRegs AS RegType, OutRegs AS RegType
  checkname$ = testfile$ + CHR$(0)
  InRegs.ax = &H4300
  InRegs.dx = SADD(checkname$)
  CALL INTERRUPT(&H21, InRegs, OutRegs)
  IF (&H1 AND OutRegs.flags) <> 0 THEN
    FileExists = 0
  ELSE
    FileExists = 1
  END IF
END FUNCTION

SUB Resynch
  SHARED factor, skipval, lastfullmatch, lengthold&, lengthnew&, i, usedold&
  SHARED startoffset, diaglevel$, done&, resyncstate
  REM We have had twenty successive no matches.
  REM Adjust factor and skipval
  resyncstate = 2
  lastfullmatch = 0
  estimatedposn& = 2.0 * (lengthold& - lengthnew&)*((1.0*i*ISEG% + done&)/(lengthnew& + lengthold&))
  startoffset = estimatedposn& - usedold& + done&
END SUB

SUB Normal
  SHARED factor, skipval, lastfullmatch, lengthold&, lengthnew&, i, done&
  SHARED startoffset, resyncstate
  factor = 1
  skipval = 1
  resyncstate = 0
END SUB

'REM This is part of the eventual documentation, and describes the
'compressed format based on this approach.
'
'There will be an initial block containing:
'   1. a length count (one octet) for the next item.
'   2. the full path name of the file that is to be compressed
'      (on decompression, a file TEMP.TMP will be created in the current
'      directory, and then copied to the original destination path when
'      decompression is complete, unless overridden by a different target).
'   3. two octets of sum check for the whole of this file; the sum check
'      is the standard one of SIGMA(Ai) and SIGMA(iAi), all MOD 255.
'   4. a length count for the next item.
'   5. the full path name of the reference file (this can be overridden
'      on decompression)
'   6. two octets of sum check for the first 1K of the reference file.
'
'Decompression will be abandoned if the reference file is not found
'on the receiving system, with the correct sumcheck, and TEMP.TMP will
'not be copied unless the sum-checks of the original file match with the
'decompressed one.
'
'The receiver's state is set up with ptroffset = 0, gdiff2(1) = 0, gdiff2(2) = 0,
'gdiff2(3) = 0, gdiff2(4) = 0. These values are retained unless explicitly
'changed. The target value is old plus difference. (Difference is new - old).
'The old values to be used are at cnt + ptroffset + 1 to
'cnt + ptroffset + ISEG%, where cnt is the number
'of new octets generated so far. A positive pointer change is an addition
'to ptroffset.
'(A ptroffset is old posn - new posn for compression).
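'
'The sum check of item 3 above could be sketched as follows. This is an
'illustration only, not part of the program: the name SumCheckSketch is
'hypothetical, the index i is assumed to start at 1, and the reduction
'MOD 255 is applied at every step (which gives the same result as reducing
'the full sums, and keeps every intermediate value small). It handles one
'buffer held in a string; for a whole file the running index and the two
'sums would have to be carried forward from one buffer to the next.
SUB SumCheckSketch (buf$, s1%, s2%)
  REM s1% returns SIGMA(Ai) MOD 255, s2% returns SIGMA(iAi) MOD 255
  s1% = 0
  s2% = 0
  FOR k& = 1 TO LEN(buf$)
    a% = ASC(MID$(buf$, k&, 1))
    s1% = (s1% + a%) MOD 255
    REM k& is a LONG, so the product below cannot overflow an INTEGER
    s2% = (s2% + (k& MOD 255) * a%) MOD 255
  NEXT k&
END SUB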
'
'The following algorithm determines the compressed format:
'
'We take each ISEG block in turn.
'
'If there is no match, we output 10nnnnnn followed by the nnnnnn sets of
'16 octets for up to 64 (nnnnnn of 0 means 64) segments.
'Note that 11nnnnnn is used for total matches (see below). All other
'codes are 0xxxxxxx.
'On decompression, we branch on bit 1 equals 1 or 0, and if bit 1 is
'1, then we branch on 10 or 11. For 10, we pick up nnnnnn, and copy
'ISEG% * nnnnnn octets from the fcm to the target.
'
'If there is a pointer change, we need to signal it in the output.
'We have four cases for the value of the ptr change:
'   positive and less than (or equals) 256
'   negative and less than (or equals) 256
'   positive and over 256
'   negative and over 256
'We use 00000000, 00000001, 00010000, and 00010001 for these four cases.
'In the first two cases, we follow with a single octet, and in the last
'two with two octets. In the case of the single octet, zero means 256.
'On decompression, we are in the first bit zero case, and branch first on
'the bottom four bits, taking the 0 and 1 cases here. For zero, we are
'collecting a positive value, and for 1 a negative. Next we branch on the
'value of nnn in 0nnn000x. For 0 we have a single octet following and
'a value to be added (after negation if necessary) to ptroffset. For 1
'we have a two octet value (most significant first).
'
'Next we look at whether a diffa, diffb, diffc, diffd change is needed.
'If a change of a diff is needed, we
'have the following cases for the diff:
'   positive and less than 256
'   negative and less than 256
'We use 00100000 and 00100001 for these cases for diffa, 00110000 and
'00110001 for these cases for diffb, 01000000 and 01000001 for diffc,
'and 01010000 and 01010001 for diffd, followed by a single byte which gives
'the fcm diff value.
'On decompression, this is the nnn = 2 (diffa), 3 (diffb), 4 (diffc) and
'5 (diffd) cases described under pointer above.
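'
'The pointer change coding described above could be decoded along the
'following lines. This is a sketch only, not part of the program: the name
'PtrChangeSketch and its parameters are hypothetical. code% is assumed to
'be one of the four pointer change codes, b1% is the first octet that
'follows it, and b2% is the second (ignored for the single octet cases).
'A LONG is returned because a two octet value can exceed 32767.
FUNCTION PtrChangeSketch& (code%, b1%, b2%)
  IF (code% AND &H70) = 0 THEN
    REM nnn = 0: single octet follows, zero means 256
    v& = b1%
    IF v& = 0 THEN v& = 256
  ELSE
    REM nnn = 1: two octets follow, most significant first
    v& = 256& * b1% + b2%
  END IF
  REM The bottom bit of the code says whether the value is negative
  IF (code% AND 1) <> 0 THEN v& = -v&
  PtrChangeSketch& = v&
END FUNCTION
'The result is the amount to be added to ptroffset. The diff change codes
'(nnn = 2 to 5) use the bottom bit for the sign in the same way, as can be
'seen from the octets built up in SetDiffer.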
'
'This completes the use of the 0mmm0000 and 0mmm0001 values
'except for 110 which is used in termination below, and 111 which is spare.
'Remaining codes are
'0mmmxxxx where xxxx is above 0001.
'
'In all cases, apart from no match, we first code any pointer change
'needed, then any one or more diff changes that are needed, then we code
'the type.
'
'For a type A match, we look
'to see if the next block is also type A with no pointer change. If so,
'we count the number of blocks we can merge in, up to a maximum of 64,
'and output 11nnnnnn, where n is the count of the number of merged-in
'blocks
'(nnnnnn all zeros means 64).
'On decompression, this is the branch from first bit 1, first two 11.
'We copy ISEG% * nnnnnn octets from the old file, starting at cnt +
'ptroffset + 1.
'
'For a type B match, we determine whether to use diffa, b, c, or d (based
'on this segment alone, and possibly with a diff change). We then merge in
'up to 8 segments (nnn = 000 means 8) with the same pointer and diff value
'(after juggling diffs if necessary). The coding is 0nnn0010 for use of diffa
'0nnn0011 for use of diffb, 0nnn0100 for use of diffc, and 0nnn0101 for use of
'diffd. nnn is the number of segments merged in. We then follow with a
'string of bits, one per byte in each of the segments, where 0 means no
'addition, and 1 means add in the selected diff. Bits up to the next octet
'boundary are ignored.
'On decompression, we use nnn to determine the number of octets to be
'produced (16 * nnn), and then take as many octets as necessary to give
'us the additions of diffa, b etc. We start the old at cnt + ptroffset + 1
'and we set new to old + diff.
'(Diff was defined as new - old)
'
'For a type C match, we have six possible diff selections to use, coded as
'follows (again, merging in up to 8 possible segments):
'   diffa and diffb   0nnn0110
'   diffa and diffc   0nnn0111
'   diffa and diffd   0nnn1000
'   diffb and diffc   0nnn1001
'   diffb and diffd   0nnn1010
'   diffc and diffd   0nnn1011
'We then follow with a string of bits, one per byte in each of the segments,
'where 0 means no addition, and 10 means the first mentioned diff is added in,
'and 11 means the second mentioned diff is added in.
'On decompression, this is values 6 to 11 of the bottom four bits. We
'proceed as for case B, except that we need to use two diffs, diff1 and diff2
'taken from diffa to diffd according to the case being considered.
'
'For a type D match, we encode the segments
'as 0nnn1100 (for up to 8 segments) where we have four differs, and
'0nnn1101 where diffa is omitted, 0nnn1110 where diffb is omitted, 0nnn1111
'where diffc is omitted. For diffd omitted, we have no code space left,
'and treat that as if it were a four differ (code 0nnn1100).
'This is followed by a bitstring for each
'octet in each of the segments, where we have 0 if no diff is to
'be added in, then, for the three differs in use,
'100 if the first is to be added in, 101 for the second, and 11 for the third.
'For four differs, we have
'100 if diffa is to be added in, 101 for diffb, 110 for diffc,
'111 for diffd.
'On decompression, this is again similar to CASE C, except that we have
'three or four differs depending on the value 12 (four) or 13 to 15 (three)
'of the case. The value of the case says which differ is omitted.
'
'Finally, we have to cope with the residual set of up to 15 octets at the
'end of the file. We simply output 00001110, then a single octet saying
'how many follow, then the octets. If there are none, we still output
'the single octet and the null count. On decompression, we copy octets
'across.
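'
'As an illustration of the type C bit string described above, the expansion
'of one run could be sketched as follows. This is not part of the program:
'the name ExpandTypeCSketch and its parameters are hypothetical. The bit
'string is passed as a string of "0" and "1" characters, so that nothing
'has to be assumed about the order of bits within an octet, and every
'old + diff result is assumed to lie in the range 0 to 255.
FUNCTION ExpandTypeCSketch$ (old$, bits$, diff1%, diff2%)
  p% = 1
  result$ = ""
  FOR k% = 1 TO LEN(old$)
    v% = ASC(MID$(old$, k%, 1))
    IF MID$(bits$, p%, 1) = "1" THEN
      REM 10 selects the first mentioned diff, 11 the second
      IF MID$(bits$, p% + 1, 1) = "0" THEN
        v% = v% + diff1%
      ELSE
        v% = v% + diff2%
      END IF
      p% = p% + 2
    ELSE
      REM 0 means no addition for this octet
      p% = p% + 1
    END IF
    result$ = result$ + CHR$(v%)
  NEXT k%
  ExpandTypeCSketch$ = result$
END FUNCTION
'Here old$ holds the octets taken from the old file starting at
'cnt + ptroffset + 1, and diff1% and diff2% are the two diffs selected by
'the code octet. A type D bit string could be expanded in the same way,
'with the longer 100, 101, 110 and 111 patterns.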
'
'Where we have discarded some old file and refilled the buffers, we
'signal this
'in the output by putting out a single octet of 00011110 followed by
'two octets giving the amount of the move.
'On decompression, this is a signal to move the new file by the full
'12K, and to move the old file by the specified amount.
'
'This ends the use of the 0nnnxxxx codes. The value 1111 of xxxx is
'spare.
'
'Thus, we get the following counts:
'   first no match                     17
'   subsequent no match (up to 128)    16
'   ptr change (<256)                  2
'   ptr change (>256)                  3
'   diff change                        2
'   type A                             1 (?+3 - ptr change)
'   subsequent A (up to 128)           0
'   type B                             3 (?+5 - ptr + 1 diff)
'   subsequent B (up to 8)             2
'   type C                             4 to 5 bytes (?+7 - ptr + 2 diff)
'   subsequent C (up to 8)             2 to 4 bytes
'   type D or E                        4 to 7 bytes (?+5/4 - ptr/diff + diff)
'NOTE - We recognise D or E only if there is at most 1 ptr and one diff, or
'two diff changes.
'