home *** CD-ROM | disk | FTP | other *** search
-
- VERSION
-
- This is version 0 of the the Sort Flogger.
-
- OVERVIEW
-
- This is a small collection of some of the more popular sort
- algorithms, including:
-
- bubble sort -- any 1st semester course in CS
- heap sort -- contributed by der Mouse
- insertion sort -- any 1st semester course in CS
- merge sort -- roughly patterned after Knuth Vol. 3
- quick sort -- C.A.R. Hoare's as given in K&R 2 pg 87
- shell sort -- D.L. Shell as given in K&R 2 pg 62
- bogo sort -- worst-case sort by Richard Krehbiel
-
- A slightly weird test routine is provided which calculates some
- performance measurements on the above algorithms and also on the
- qsort() function provided with your system, as well as verifying that
- the sort actually worked and that it didn't modify anything immediately
- above or below the array, that it doesn't leak memory, and whether
- or not it's stable. These tests are carried out in a variety of
- situations designed to highlight worst-case behavior.
-
- These implementations of the sort functions are all compatible with
- qsort()'s parameter list, so if you think your application spends too
- much time sorting, you can just s/qsort/merge_sort/ and link in the
- sort library and give it a try. If you think it doesn't spend *enough*
- time sorting, try bubble_sort instead.
-
- The merge sort declares a global int called _maybe_sorted which
- tells merge sort that the data in the array may already be mostly in
- order. It is advisory--the algorithm will produce the correct results
- regardless of _maybe_sorted's value; however the execution time for
- merge_sort will be quite a bit less if this is set and the array is
- already mostly sorted. If the array isn't sorted at all, this option
- extracts about 2% speed penalty.
-
- The heapsort was contributed by mouse@larry.mcrcim.mcgill.edu
- aka "der Mouse".
-
- The merge_sort() function is the only one I've taken any trouble to
- optimize.
-
- INSTALLATION
-
- See the INSTALLATION document which should be nearby.
-
- USAGE
-
- Just type "flogger" or "flogger <number>" and watch what happens.
-
- DESCRIPTION OF OUTPUT
-
- It seems like it doesn't really do anything but spit out numbers,
- and rather slowly at that. The version and patch level of the program
- is printed; the resolution of the clock is printed; the size of the
- elements is reported; and the number of elements to be sorted is
- reported.
-
- For each sort algorithm, a table of measurements is reported. The
- rows describe the situation the sort algorithm was presented with for
- those measurements and the columns represent the type of measurement
- taken. The rows (situations) are:
-
- random: The contents of the array to be sorted are pseudo
- random numbers generated by rand(), with two special
- values excluded for the purpose of detecting overwrites
- at the ends of the array.
- ascend: The array is already completely in order. To simply
- verify this fact is all that's required of the sort
- algorithm; ironically, bubble sort outperforms all
- the rest in this case.
- descend: The array is exactly in reverse order. Theoretically,
- an sort function could detect this and swap ends in
- linear time, but it hardly seems useful.
- fib asc: The contents of the array are irrelevant; the comparison
- function passed to the routine always says that its
- first argument is less than its second.
- fib desc: Similarly, the comparison function always says that its
- first argument is greater than its second.
- surprise: A pseudorandom result is returned from the comparison
- function in an attempt to confound the algorithms' efforts
- at sorting.
- mostly: Similar to ascend, but the first and last elements
- are reversed. This is meant to simulate an index which
- is pretty much already sorted and only a small number
- of new items need to be inserted.
- equiv: The comparison function reports that all elements
- compare equal. The data is arranged so that a
- check can be made for whether the algorithm is
- stable; that is the property that records with
- equal keys stay in the same relative order in the
- array that they were in before the sort began.
-
- The columns refer to these measurements:
-
- compares: The number of calls to the comparison routine during
- the sort. I consider this very important, as most
- of the sorting bottlenecks I've experience center
- around the comparison function.
- stack: (Estimated) Some semi-non-portable attempts to record
- the growth of the stack are made and a number comes
- out. I consider O(log n) use of the stack acceptable
- on modern machines and none of these algorithms are
- worse than that.
- heap: Amount of malloc'd storage as reported by mallinfo().
- If your C library doesn't support mallinfo(), the
- compiler will probably barf on flogger.c. The flogger
- checks for memory leaks.
- user: Amount of user mode CPU time as reported by times().
- This includes time for both the operation of the
- sort function and the comparison function; as these
- are only marginally related, this number is not
- useful for estimating how long it might take to
- sort useful data in some other program.
- system: Amount of system CPU time as reported by times().
- Generally 0.00 or very small. If a sort algorithm
- uses lots of system time it's probably a Trojan horse
- trying to break into the passwd file or delete your
- home directory or something so maybe you should
- start shuffling through your desk looking for a recent
- backup tape. I may remove this in a future version
- (the report of system time, not the Trojan horse)
- in favor of another metric, perhaps wall clock time.
-
- But it isn't enough simply to be fast, a sort algorithm also has
- to be correct. To that end, the flogger routine verifies that
- the array was actually sorted, at least when the comparison routine
- wasn't lying. It also checks that magic values in the memory locations
- immediately above and below the array were not accidentally accessed
- or overwritten.
-
- The bogo sort run is skipped if n is sufficiently large that
- it's execution would run into trillions of cycles (approx n = 15).
-
- MISCELLANEOUS
-
- When running the test routine, some sorts take so long that you
- might lose patience. Keyboard interrupt (aka Control-C) will abort the
- current sort test and move on to the next. If you really want the
- program to end, give it a quit signal (Control-\) or send it a signal
- with the kill command. You can also background it (Control-Z) from
- the C-shell then kill it.
-
- The size of the elements being sorted is hard-coded at the moment,
- but it can be changed by going into flogger.c and changing the typedefs for
-
- #define TEST_TYPE double
- #define TEST_FUNC double_compare
-
- into
-
- #define TEST_TYPE int
- #define TEST_FUNC int_compare
-
- or back again. Or add a new type and a new comparison function.
-
- Several of the sort algorithms are not re-entrant which means
- you shouldn't call the sort from within the very same comparison
- function you passed into that sort in the first place.
-
- COPYRIGHT
-
- Since none of the sort algorithms are mine and anyone with a copy
- of Knuth Vol 3 and a little knowledge of C could type them in as easily
- as I did, there isn't much point in claiming a copyright, much less
- applying for a patent. I wanted to disclaim liability though, so I put
- a few minor restrictions on the code, as set forth in each of the
- source files.
-