Volume Number: 4
Issue Number: 8
Column Tag: HyperChat™
XCMD Cookbook: Sorting Routines
By Donald Koscheka, Apple Computer, Inc.
In the last two issues we covered XCMD programming basics. First we looked at the interface between HyperCard and XCMDs, and then we discussed the callbacks and how they can make your XCMD programming a little easier. This month, I present an XCMD that adds a useful feature to HyperCard: the ability to sort lines in a text field.
This is a good example of XCMD programming since it extends HyperCard by adding a new feature. In addition to adding features, XCMDs are a useful way of speeding up a process that would otherwise be written in HyperTalk. Sorting lines of text is certainly something that could be accomplished in HyperTalk, but if you have a lot of lines in the field, the HyperTalk script may be too slow. Enter the XCMD!
Before we can write the sort program, we must decide what we want to sort and what form the data will be presented in. Let’s define a line as a string of text that is terminated by a newline character. Next, we’ll want to be able to sort both alphabetic strings and numeric strings. We need the numeric sort because strings of characters are compared by ASCII value rather than numeric value, so an alphabetic sort places “10” ahead of “9”.
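To see why a separate numeric sort is needed, here is a small illustration. It is written in Python purely because it is easy to run; it is not part of LineSort, and the sample values are made up for the example.

```python
# Lines as they would arrive from a field: text, not numbers.
lines = ["9", "10", "100", "25"]

# Character-by-character (ASCII) ordering compares "1" with "9" first,
# so "10" and "100" land ahead of "25" and "9".
ascii_sorted = sorted(lines)

# Converting each line to an integer first gives the numeric ordering.
numeric_sorted = sorted(lines, key=int)

print(ascii_sorted)
print(numeric_sorted)
```

The first list comes out in dictionary order, the second in the order a person would expect for numbers.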
Once we decide on the form and value of the data to sort, we can decide what we need to pass to the XCMD from HyperCard. Obviously, we’ll need the lines. Parameter 1 will contain a handle to the text of the field, parameter 2 will tell the XCMD whether to sort by ASCII value (alphanumeric sort) or by numeric value (decimal sort), and parameter 3 will tell the XCMD whether to sort in ascending or descending order. As written, however, LineSort only sorts in ascending (smallest to largest) order. There is no need for a separate largest-to-smallest sort: you can simply report the lines “backwards” to get descending order. I’ve left this as an exercise for the reader since it involves little more than reversing the direction of the FOR loop in two routines, SetNewText and SetNewNums.
When LineSort is complete, we’ll return the sorted list in the parameter block’s returnValue. By making this an XFCN we can invoke it using the following format:
Put LineSort( card field “my list”, alpha ) into card field “my list”
The actual process of sorting can be broken down into the following steps:
(1) GetLineStarts: Create an array of offsets into the list (the LineStart array) marking the start of each line.
(2) SortText: Sort the list of text by comparing lines using some sorting algorithm.
(3) SetNewText: Report the sorted list back to Hypercard.
(Note: the process is identical for numeric sorting. Because the numeric sort is quite a bit simpler in implementation, we’ll concentrate this discussion on the text sort).
How we accomplish these three tasks is quite another story. Sorting lines of text is not as straightforward as sorting integers or characters since each line can be of arbitrary length. Before performing the sort, we need to calculate where in the large chunk of text each line starts. We create an array, LineStart, that records for each line its offset from the start of the list and its length. Consider the list:
Margaret
Colleen
Donald
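The first line in the list has an offset of 0 and a length of 8. The next item in the list has an offset of 9 and a length of 7. The third item has an offset of 17 and a length of 6. If the numbers don’t seem to add up, it’s because the offsets in the linestart array account for the newline character that terminates every entry in the list! (The lengths shown here count only the visible characters. In the listing, GetLineStarts stores each length with the terminating newline included, so it would record 9, 8 and 7.)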
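The offset arithmetic can be checked with a short sketch. This is Python rather than the Pascal of the listing, and `line_starts` is a hypothetical helper written for this illustration. It assumes the classic Mac carriage return as the line terminator and reports lengths without the terminator, as in the worked example above.

```python
NEWLINE = "\r"  # classic Mac OS line terminator ($0D in the listing)

def line_starts(text):
    """Return a list of (offset, length) pairs, one per non-empty line."""
    entries = []
    offset = 0
    for line in text.split(NEWLINE):
        if line:  # skip empty lines
            entries.append((offset, len(line)))
        # Step over the line and the terminator that followed it.
        offset += len(line) + 1
    return entries

text = "Margaret\rColleen\rDonald\r"
print(line_starts(text))
```

Each offset is the previous offset plus the previous length plus one for the terminator, which is where 9 and 17 come from.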
Creating a linestart array pays handsome dividends. When it comes time to sort the list, we don’t have to move the strings around in memory. Rather, we simply rearrange the linestart array to reflect the sort order. For example, the linestart array for the above list looks like this before the sort:
[0,8] [9,7] [17,6]
and like this after the sort:
[9,7] [17,6] [0,8]
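The idea of sorting the entries while leaving the text alone can be sketched as follows. This is a Python illustration, not the author's code: Python's built-in `sorted` stands in for the Shell sort, and `line_at` is a hypothetical helper.

```python
text = "Margaret\rColleen\rDonald\r"
entries = [(0, 8), (9, 7), (17, 6)]  # (offset, length) pairs before the sort

def line_at(entry):
    """Look up the characters an entry refers to, without copying the list."""
    offset, length = entry
    return text[offset:offset + length]

# Only the small (offset, length) pairs are reordered; `text` is never touched.
sorted_entries = sorted(entries, key=line_at)

print(sorted_entries)
print([line_at(e) for e in sorted_entries])
```

The text itself is unchanged after the sort; only the order of the pairs differs.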
Reporting the sorted list back to HyperCard now becomes a simple task. SetNewText extracts N elements from the linestart array, where N is the number of elements in the array. We determine this number by dividing the size of the array by the size of one element. For each element, the first value is the offset of the first byte of the line and the second value is the number of bytes to copy for that line. Each line is already terminated with a newline, so building the output string is simply a matter of concatenating the lines in the order prescribed by LineStart. If you want to perform a descending sort, build the output list starting with the last element in the array and work your way backwards, one element at a time! This is good practice for the beginner, so I haven’t included the sort order part of the code.
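Here is one way the rebuild step, including the descending case, might look. This is a Python sketch and not the author's SetNewText: `build_output` and its `descending` flag are names invented for the illustration. It also assumes the entry lengths exclude the terminator, as in the worked example, so the sketch appends the newline itself.

```python
NEWLINE = "\r"  # classic Mac OS line terminator

def build_output(text, entries, descending=False):
    """Concatenate the lines named by `entries`, optionally back to front."""
    order = reversed(entries) if descending else entries
    pieces = []
    for offset, length in order:
        pieces.append(text[offset:offset + length] + NEWLINE)
    return "".join(pieces)

text = "Margaret\rColleen\rDonald\r"
sorted_entries = [(9, 7), (17, 6), (0, 8)]  # already in ascending order

print(repr(build_output(text, sorted_entries)))
print(repr(build_output(text, sorted_entries, descending=True)))
```

Walking the same array backwards is all a descending sort needs; no second sorting pass is involved.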
I glossed over the actual sorting of the linestart array for two reasons. First, it looks a lot scarier than it is. Second, you may want to replace my sorting algorithm with one of your own. I use a Shell sort because it is one of the easier sorting methods to follow. If you’re programming in MPW “C” you should consider replacing SortText and SortNums with Apple’s QuickerSort. Greg Kimberly of Apple Computer, Inc. demonstrated an XCMD to me that uses QuickerSort. It was at least three times faster than my Shell sort implementation!
The Shell sort works with a jump size that starts at half the number of elements in the array. On each pass of the FOR loop, we compare every element with the element that is jump elements further along. If the later element is smaller than the earlier one, we swap the two and move on to the next pair. We repeat the pass until no more elements can be swapped, then halve the jump size and start the comparison process all over again. The sort is finished once a pass with a jump of 1 makes no swaps.
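The loop structure can be sketched in a few lines. This is a Python rendering of the same shape as the listing's SortText and SortNums (halve the jump, then repeat passes until nothing swaps), not a transcription of them. `shell_sort` and `three_way` are names chosen for the illustration, and `three_way` is only a plain stand-in for a comparison that returns -1, 0 or 1.

```python
def three_way(a, b):
    """Return -1, 0 or 1, in the manner of a three-way string compare."""
    return (a > b) - (a < b)

def shell_sort(items, compare=three_way):
    """Sort `items` in place using a halving jump and repeated passes."""
    count = len(items)
    jump = count
    while jump > 1:
        jump //= 2
        done = False
        while not done:            # repeat the pass until nothing swaps
            done = True
            for m in range(count - jump):
                n = m + jump
                if compare(items[m], items[n]) > 0:
                    items[m], items[n] = items[n], items[m]
                    done = False
    return items

print(shell_sort(["Margaret", "Colleen", "Donald"]))
print(shell_sort([5, 3, 9, 1, 4, 8, 2]))
```

Because the final round uses a jump of 1 and repeats until no swap occurs, the result is fully sorted whatever the earlier rounds did; the larger jumps only serve to move far-out-of-place elements quickly.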
The Toolbox call IUMagString compares the two strings and returns -1 if string 1 is less than string 2, 0 if they’re equal and 1 if string 1 is greater (the swap condition). IUMagString takes two string pointers and the length of each string as input. The length is easy: it’s simply the second element of each linestart array entry. We can create a pointer to each line by adding the offset element of each array entry to the starting address of the text. Since the text is referenced via a handle (passed to SortText as hField), we lock down the handle to make sure that this starting address doesn’t change on us during the sort.
The numeric sorting scheme is very similar to the alpha scheme, only easier because the size of each element is fixed at 4 bytes (the size of a LongInt). If you have trouble reading the SortText routine, try starting with the numeric sort code.
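The numeric path follows the same three steps: convert each line to a number, sort the numbers, then convert them back to text. The sketch below is a Python illustration under stated assumptions, not the listing's GetNums, SortNums and SetNewNums. `sort_numeric_lines` is a hypothetical name, and skipping blank lines is a choice made for this sketch.

```python
NEWLINE = "\r"  # classic Mac OS line terminator

def sort_numeric_lines(text):
    """Parse each non-blank line as an integer, sort, and rebuild the text."""
    # Step 1: one fixed-size number per line (blank lines are skipped here).
    numbers = [int(line) for line in text.split(NEWLINE) if line.strip()]
    # Step 2: sort the numbers themselves; no offsets or lengths are needed.
    numbers.sort()
    # Step 3: convert back to text, one newline-terminated line per number.
    return "".join(str(n) + NEWLINE for n in numbers)

print(repr(sort_numeric_lines("10\r9\r100\r25\r")))
```

Since every element is a plain integer, there is no linestart array to maintain, which is why this path is the easier one to read first.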
Next month: File I/O. You can take it with you!
{*************************}
{* File: LineSort.p*}
{* *}
{* Sorts lines of text (delimited by *}
{* newline and null terminated). *}
{* Returns the sorted container. *}
{*-------------------------------- *}
{* In: paramPtr=pointer to the XCMD *}
{* Parameter Block *}
{* *}
{* params[1] = handle to the text *}
{* params[2] = handle to sort type *}
{* (ALPHA, NUMERIC)*}
{* params[3] = Sort Order *}
{* (ASCENDING, DESCENDING)*}
{* *}
{* Defaults : ALPHA, ASCENDING*}
{* Out: returnValue = handle to the*}
{* sorted data *}
{* *}
{*-------------------------------- *}
{* © 1988, Donald Koscheka*}
{* All Rights Reserved *}
{*-------------------------------- *}
{*************************}

(*************************
BUILD SEQUENCE

pascal LineSort.p
link -m ENTRYPOINT -rt XFCN=65535 -sn Main=LineSort LineSort.p.o "{Libraries}"Interface.o "{PLibraries}"Paslib.o -o "{xcmds}"testxcmds
*************************)

{$S LineSort }
UNIT Donald_Koscheka;

{----------INTERFACE------------}
INTERFACE

USES MemTypes, QuickDraw, OSIntf, ToolIntf, PackIntf, HyperXCmd;

PROCEDURE EntryPoint(paramPtr: XCmdPtr);

{--------IMPLEMENTATION--------}
IMPLEMENTATION

{$R-}

CONST
  NULL        = $00;
  NEWLINE     = $0D;
  ALPHA       = $00;
  NUMERIC     = $01;
  ASCEND      = $00;
  DESCEND     = $01;
  LESSTHAN    = -1;
  EQUALTO     = 0;
  GREATERTHAN = 1;

TYPE
  Str31    = String[31];
  LinePtr  = ^LineElem;
  LineHand = ^LinePtr;
  LineElem = PACKED RECORD
    Start : LongInt;
    Size  : LongInt;
  END;
  numPtr   = ^LongInt;
  numHand  = ^numPtr;

PROCEDURE LineSort(paramPtr: XCmdPtr); FORWARD;

{------------EntryPoint------------}
PROCEDURE EntryPoint(paramPtr: XCmdPtr);
BEGIN
  LineSort(paramPtr);
END;

{----------LineSort----------------}
PROCEDURE LineSort(paramPtr: XCmdPtr);
(*************************
* Main code segment follows
*************************)
VAR
  SortType  : INTEGER;
  SortOrder : INTEGER;
  LineStart : LineHand;
  hNums     : numHand;
  NewField  : Handle;
  SortStr   : Str255;

{$I XCmdGlue.inc }

(************************)
(*** Alpha Sorting Routines***)
(************************)

FUNCTION GetLineStarts(hField: Handle): LineHand;
(**************************
* Given a pointer to a block of text,
* scan for line-terminators (NEWLINE
* | NULL) and fill out a dynamically
* allocated array of line starts
* information.
*
* The line starts array contains
* the offset in the text to the start
* of the line as well as the length
* of the line in bytes.
*
* Offsets are used to allow the record
* to remain valid across relocation
* of the basic text.
*
* In: Pointer to a block of text,
* null terminated.
*
* Out: Handle to an array of
* linestart indices.
*
*************************)
VAR
  done      : BOOLEAN;
  arraySize : LongInt;
  startText : LongInt;  { Base pointer of the text }
  sizeText  : LongInt;  { length of the current run}
  lineStart : LineHand; { array of lineStarts pointers}
  txStart   : Ptr;      { pointer to the input text}
  txPtr     : Ptr;      { pointer into input text (FIELD)}
  lineArray : LinePtr;  { Pointer into lineStart array }
BEGIN
  lineStart := NIL;     { defined result if there is no text }
  IF (hField <> NIL) AND (hField^^ <> NULL) THEN
  BEGIN
    HLock( hField );
    txPtr := hField^;
    startText := ORD( txPtr );
    lineStart := LineHand(NewHandle(0));
    done := FALSE;
    WHILE NOT done DO
    BEGIN
      txStart := txPtr;
      ScanToReturn( txPtr );
      IF txPtr^ = NULL THEN
      BEGIN
        txPtr^ := NEWLINE;
        done := TRUE;
      END;
      txPtr := Pointer( ORD(txPtr) + 1);
      sizeText := ORD(txPtr) - ORD(txStart);
      IF sizeText > 1 THEN
      BEGIN
        {*** point to next record in linestarts array ***}
        arraySize := GetHandleSize( Handle(LineStart) );
        SetHandleSize( Handle(LineStart), arraySize + sizeOf( lineElem ));
        lineArray := Pointer( ORD(lineStart^) + arraySize );
        {*** Put away offset and length of line ***}
        WITH lineArray^ DO
        BEGIN
          Start := ORD(txStart) - startText;
          Size := SizeText;
        END;
      END;
    END; {*** While ***}
    HUnlock( hField );
  END;
  GetLineStarts := lineStart;
END;

PROCEDURE SortText( hField: Handle );
(************************
* Given a handle to a run of text,
* sort the lines of text pointed to and
* rearrange the array accordingly.
*
* Text is sorted by rearranging the line
* starts array.
*
* In: Pointer to array of linestarts
* linestarts, linecount (global)
* Out: sorted array of linestarts.
************************)
VAR
  done           : BOOLEAN;
  jump,len1,len2 : INTEGER;
  n,m,lineCount  : LongInt;
  str1, str2     : Ptr;
  tempElem       : LineElem;
  elem1, elem2   : LinePtr;
BEGIN
  LineCount := GetHandleSize( Handle(lineStart) );
  LineCount := LineCount DIV sizeOf( LineElem );
  jump := lineCount;
  HLock( Handle(lineStart) );
  HLock( hField );
  WHILE jump > 1 DO
  BEGIN
    jump := jump DIV 2;
    REPEAT
      done := TRUE;
      FOR m := 0 to ( lineCount - jump - 1 ) DO
      BEGIN
        n := m + jump;
        {*** Calculate the offsets of the two elements ***}
        elem1 := LinePtr( ORD(LineStart^) + (n * sizeof( LineElem )) );
        str1 := Pointer( ORD( hField^ ) + elem1^.Start);
        len1 := INTEGER( elem1^.Size );
        elem2 := LinePtr( ORD(LineStart^) + (m * sizeof( LineElem )) );
        str2 := Pointer( ORD( hField^ ) + elem2^.Start);
        len2 := INTEGER( elem2^.Size );
        IF IUMagString( str2, str1, len2, len1 ) = GREATERTHAN THEN
        BEGIN
          tempElem.Start := elem1^.Start;
          tempElem.Size := elem1^.Size;
          elem1^.Start := elem2^.Start;
          elem1^.Size := elem2^.Size;
          elem2^.Start := tempElem.Start;
          elem2^.Size := tempElem.Size;
          done := FALSE;
        END;
      END; {*** FOR loop ***}
    UNTIL done;
  END; {*** WHILE jump > 1 ***}
  HUnlock( Handle(lineStart) );
  HUnlock( hField );
END;

FUNCTION SetNewText( hField: Handle; Dir: INTEGER ): Handle;
(*****************************
* Given a pointer to a linestarts array, and
* a corresponding block of text, rearrange
* the text to match the line starts in the
* array. This is useful after doing a sort
* since the linestarts array will be in order
* but the text won't be.
*
* return a run of null-terminated text
* with each line NEWLINE-terminated.
*
* In: Handle to text array
* Handle to linestarts array.
*
* Out: Handle lines of newline terminated text.
****************************)
VAR
  LineNum, LineCount, oldSize : LongInt;
  NewText     : Handle;
  StartOfLine : Ptr;
  NextLine    : LinePtr;
BEGIN
  LineCount := GetHandleSize( Handle(lineStart) );
  LineCount := LineCount DIV sizeOf( LineElem );
  NewText := NewHandle( 0 );
  {*** Lock both blocks: SetHandleSize below can move unlocked ***}
  {*** memory, which would invalidate StartOfLine and NextLine ***}
  HLock( Handle(lineStart) );
  HLock( hField );
  FOR LineNum := 0 to LineCount-1 DO
  BEGIN
    NextLine := LinePtr( ORD( LineStart^ ) + (LineNum * sizeOf( LineElem )) );
    StartOfLine := Pointer( ORD( hField^ ) + NextLine^.Start );
    {*** add length of new line to output text ***}
    oldSize := GetHandleSize( NewText );
    SetHandleSize( NewText, oldSize + NextLine^.Size );
    {*** move the line into the array ***}
    BlockMove( StartOfLine, Pointer( ORD( NewText^ ) + oldSize), NextLine^.Size );
  END;
  HUnlock( Handle(lineStart) );
  HUnlock( hField );
  {*** Tack a NULL on the end of the new text ***}
  oldSize := GetHandleSize( newText );
  SetHandleSize( newText, oldSize + 1 );
  StartOfLine := Pointer( ORD( newText^ ) + oldSize );
  StartOfLine^ := NULL;
  SetNewText := NewText;
END;

(***************************)
(*** Number Sorting Routines ***)
(***************************)

FUNCTION GetNums( hField: Handle ): numHand;
(***************************
* Given a handle to a block of text, scan
* for line-terminators (NEWLINE | NULL) and
* fill out a dynamically allocated array of
* long integers, one for each line.
*
* Out: Handle to an array of longInt.
*
***************************)
VAR
  done      : BOOLEAN;
  arraySize, theNum : LongInt;
  txStart   : Ptr;
  txPtr     : Ptr;
  numArray  : numPtr;
  hNum      : NumHand;
  tStr      : Str255;
BEGIN
  hNum := NIL;          { defined result if there is no text }
  IF ( hField <> NIL ) AND ( hField^^ <> NULL ) THEN
  BEGIN
    HLock( hField );
    txPtr := hField^;
    hNum := NumHand( NewHandle( 0 ) );
    done := FALSE;
    WHILE NOT done DO
    BEGIN
      txStart := txPtr;
      ScanToReturn( txPtr );
      IF txPtr^ = NULL THEN
        done := TRUE
      ELSE
        txPtr^ := NULL;
      ZeroToPas( txStart, tStr );
      theNum := StrToNum( tStr );
      txPtr := Pointer( ORD(txPtr) + 1);
      arraySize := GetHandleSize( Handle( hNum ) );
      SetHandleSize( Handle( hNum ), arraySize + sizeOf( longInt ));
      numArray := Pointer( ORD( hNum^) + arraySize );
      numArray^ := theNum;
    END; {*** While ***}
    HUnlock( hField );
  END;
  GetNums := hNum;
END;

PROCEDURE SortNums( theNums: NumHand );
(*********************
* Given a pointer to an array of longints,
* sort the numbers and rearrange the array
* accordingly.
*
* In: Handle to an array of longInts;
* uses linecount and lineNum;
* Out: sorted array of longInts.
********************)
VAR
  done           : BOOLEAN;
  jump,len1,len2 : INTEGER;
  n,m,swap,lineCount : LongInt;
  num1,num2      : numPtr;
BEGIN
  LineCount := GetHandleSize( Handle(theNums) ) DIV sizeOf( LongInt );
  jump := lineCount;
  WHILE jump > 1 DO
  BEGIN
    jump := jump DIV 2;
    REPEAT
      done := TRUE;
      FOR m := 0 to ( lineCount - jump - 1 ) DO
      BEGIN
        n := m + jump;
        {*** Calculate the offsets of the two elements ***}
        num1 := Pointer(ORD(theNums^) + (n * sizeof( LongInt )) );
        num2 := Pointer(ORD(theNums^) + (m * sizeof( LongInt )) );
        IF num2^ > num1^ THEN
        BEGIN
          swap := num1^;
          num1^ := num2^;
          num2^ := swap;
          done := FALSE;
        END;
      END; {*** FOR loop ***}
    UNTIL done;
  END; {*** WHILE jump > 1 ***}
END;

FUNCTION SetNewNums( hNum: numHand; Dir: INTEGER ): Handle;
(********************
* Given a handle to an array of
* longints, convert them back to
* strings and put them away in
* a new handle as text
*
* In: Handle to num array
*
* Out: Handle lines of newline
* terminated text.
*********************)
VAR
  index, intCount, oldSize, numLen : LongInt;
  nexNum  : NumPtr;
  NewText : Handle;
  tempPtr : Ptr;
  tempStr : Str255;
BEGIN
  NewText := NewHandle( 0 );
  intCount := (GetHandleSize( Handle(hNum) ) DIV sizeOf( LongInt ));
  FOR index := 0 to intCount-1 DO
  BEGIN
    nexNum := NumPtr( ORD( hNum^ ) + (index * sizeOf( LongInt )) );
    tempStr := NumToStr( nexNum^ );
    numLen := LongInt( ORD( tempStr[0] ) );
    {*** add length of new line to output text ***}
    oldSize := GetHandleSize( NewText );
    SetHandleSize( NewText, oldSize + NumLen + 1 );
    {*** move the line into the array ***}
    TempPtr := Pointer( ORD( @tempStr ) + 1 );
    BlockMove( TempPtr, Pointer(ORD( NewText^) + oldSize), NumLen);
    TempPtr := Pointer(ORD( NewText^) + oldSize + NumLen);
    TempPtr^ := NEWLINE;
  END;
  {*** Tack a NULL on the end of the new text ***}
  oldSize := GetHandleSize( newText );
  SetHandleSize( newText, oldSize + 1 );
  tempPtr := Pointer( ORD( newText^ ) + oldSize );
  TempPtr^ := NULL;
  SetNewNums := NewText;
END;

{******* Main Block for LineSort *******}
BEGIN
  NewField := NIL;
  WITH paramPtr^ DO
    IF (params[1] <> NIL) AND (params[1]^^ <> NULL) THEN
    BEGIN
      SortType := ALPHA;
      SortOrder := ASCEND;
      IF params[2] <> NIL THEN
      BEGIN
        ZeroToPas( params[2]^, sortStr );
        IF StringEqual('NUMERIC', sortStr) THEN
          SortType := NUMERIC;
      END;
      IF params[3] <> NIL THEN
      BEGIN
        ZeroToPas( params[3]^, sortStr );
        IF StringEqual('DESCENDING', sortStr ) THEN
          SortOrder := DESCEND;
      END;
      CASE SortType OF
        ALPHA:
          BEGIN
            LineStart := GetLineStarts( params[1] );
            SortText( params[1] );
            newField := SetNewText( params[1], SortOrder );
            DisposHandle( Handle( LineStart ) );
          END;
        NUMERIC:
          BEGIN
            hNums := GetNums( params[1] );
            SortNums( hNums );
            newField := SetNewNums( hNums, SortOrder );
            DisposHandle( Handle( hNums ) );
          END;
      END; {*** CASE SortType OF ***}
    END;
  paramPtr^.returnValue := newField;
END;

END.