home *** CD-ROM | disk | FTP | other *** search
- Path: sparky!uunet!ontek!mikey
- From: mikey@ontek.com (euphausia superba)
- Newsgroups: comp.lang.c
- Subject: Re: qucik, merge, shell and bubble sort
- Message-ID: <2188@ontek.com>
- Date: 17 Nov 92 22:33:48 GMT
- References: <1e1cnvINNt2t@usenet.INS.CWRU.Edu>
- Organization: Ontek Corporation -- Laguna Hills, California
- Lines: 159
-
- In comp.lang.c, cc015@cleveland.Freenet.Edu (Vito T. Kwan) writes:
- |
- | I want to take a look at these four sorting method in C, does anyone
- | know where can I get (ftp) them?
-
- I have implementations of quick, merge, and bubble sort; send
- email to me if you can't find them through other channels.
-
- Many people have posted timings for sort routines over the
- years; I was dissatisfied with the level of detail, so I grabbed
- a few sort routines and did some hacking and measured some
- of the not-so-obvious aspects of how sort algorithms perform
- under extreme conditions. This is scientific only in the
- sense that the numbers I got were repeatable to one or two
- significant digits.
-
- The columns in the following tables have these meanings:
-
- compares: Number of callbacks to the comparison function.
- On the few occasions I've gprof'd a sort operation, most
- of the time is spent in the comparison function, so I
- consider this the most interesting column.
-
- stack: Amount of stack space used (roughly).
- Obviously, the way I measured this was non-portable,
-
- heap: Amount of heap space used as reported by mallinfo.uordblks
-
- user: User time as reported by times(tms.tms_utime)
- This is supposed to be accurate to 1/60th of a second.
-
- system: System time as reported by times(tms.tms_stime)
- This is also supposed to be accurate to 1/60th of a second.
-
- The rows describe the input array and the behavior of the
- comparison function.
-
- random: The input data is generated by rand(), and the
- comparison function reports truthfully.
-
- ascend: The data being sorted is already in the correct
- order. This is a useful case for sorting
- indexes which are pretty much already in order;
- bubble sort handles this case well, but results
- of bubble sort are excluded here, mostly because
- bubble sort is silly.
-
- descend: The input data is exactly in reverse order.
-
- fib asc: The data is random, but the comparison function
- always says that its first argument is less than
- its second argument.
-
- fib asc: The data is random, but the comparison function
- always says that its first argument is greater than
- its second argument.
-
- suprise: The comparison function returns a random value.
-
- Now, some of these are really obscure situations that just
- *shouldn't* happen with real data and well-written comparison
- functions. But I've found that some of the most time-consuming
- sorts are done at the request of the user, and are ones
- in which the user can specify several possibly interrelated
- keys. I worried about what would happen if the user managed
- to find a combination that would result in something other
- than a complete ordering.
-
- My comments are interspersed with the results.
-
- banzai# test_sort 5000
- HZ = 60
- element size = 8
- number of elements = 5000
-
-
- *** qsort ***
- data compares stack heap user system
- random: 66554 768 0 3.68 0.00
- ascend: 57727 960 0 3.00 0.00
- descend: 57728 960 0 3.03 0.00
- fib asc: 57727 960 0 2.78 0.05
- fib desc: ^C
- surprise: 61797 864 0 3.75 0.00
-
- [This qsort is the one supplied with SunOS 4.1. It performs well
- under most circumstances but goes into an infinite loop when the
- comparison always returns 1. The infinite loop seems to be of
- such a character that the comparison function is being called
- with ever decreasing addresses and so presumably could cause
- segmentation faults under some circumstances. It's usage of
- stack space and heap space are perfectly fine for modern CPUs.]
-
- *** merge_sort ***
- data compares stack heap user system
- random: 55240 1152 40004 2.98 0.02
- ascend: 32004 1152 40004 1.92 0.03
- descend: 29804 1152 40004 1.72 0.03
- fib asc: 32004 1152 40004 1.73 0.00
- fib desc: 29804 1152 40004 1.63 0.00
- surprise: 44434 1152 40004 2.75 0.00
-
- [Based on other tests, the stack space is O(log(n)). The heap space
- used here is questionable; a more talented programmer might be
- able implement it without using up O(n) space. A less talented
- programmer would be tempted to place it on the stack, but that's
- a marginal improvement. In terms of number of comparisons,
- merge_sort's worst case is better than qsort's best case and with
- one exception the same can be said of execution speed.]
-
- *** merge_sort(maybe_sorted) ***
- data compares stack heap user system
- random: 56248 1152 40004 3.05 0.10
- ascend: 7831 1152 40004 0.48 0.03
- descend: 30827 1152 40004 1.77 0.05
- fib asc: 33027 1152 40004 1.87 0.02
- fib desc: 6023 1152 40004 0.37 0.00
- surprise: 34723 1152 40004 2.22 0.00
-
- [I modified merge_sort slightly to perform some simple checking to
- see if the array (or subsections of the array) are already in order.
- It adds about 2% to the number of comparisons in the "normal" case,
- but for sorting an index which is "almost" sorted already this
- cuts the time down dramatically.]
-
- *** quick_sort ***
- data compares stack heap user system
- random: 71409 1536 12 4.82 0.02
- ascend: 51822 960 12 2.88 0.00
- descend: 54924 1248 12 3.80 0.00
- fib asc: 12497500 479808 12 601.07 1.42
- fib desc: 12497500 0 12 597.93 1.15
- surprise: 57198 672 12 3.97 0.00
-
- [The quick sort is adapted from the one found on page 87 of K&R 2; they
- disclaim it as sub-optimal, for good reason it seems. I haven't
- had the time to rewrite it though I did fix the infinite loop
- problem, at a marginal cost in speed. The quadratic behavior
- in the case of a deceptive comparison routine should probably
- not be held against it.]
-
- --
-
- The actual results are not true for all time and all machines. The
- execution time for these algorithms when using non-brain-damaged
- comparison routine are all within a factor of 2. Sorting a bigger
- array brings out the differences more clearly, but at least between
- qsort and merge sort the improvement is rather minor.
-
- I planned to add a feature for changing the size of the input array
- automatically and analyzing the change in time and space usage to
- directly calculate O(whatever) for each algorithm, but I'm a bit busy
- at the moment. (Posting to usenet to be precise.) I'll also be
- adding heapsort and shellsort algorithms; if anyone has a qsort-like
- drop-in replacement for either or both of those I'd appreciate
- a copy.
-
- der krill
-
-