Path: sparky!uunet!gatech!ncar!noao!amethyst!organpipe.uug.arizona.edu!news
From: dave@cs.arizona.edu (Dave Schaumann)
Newsgroups: comp.programming
Subject: Re: Locating duplicates
Message-ID: <1993Jan8.051051.18788@organpipe.uug.arizona.edu>
Date: 8 Jan 93 05:10:51 GMT
References: <C07FJo.4vq@ux1.cso.uiuc.edu> <1993Jan2.041658.21596@organpipe.uug.arizona.edu> <1993Jan5.150010.27104@ulysses.att.com> <MARKT.93Jan7122729@wundt.harlqn.co.uk>
Sender: news@organpipe.uug.arizona.edu
Reply-To: dave@cs.arizona.edu (Dave Schaumann)
Organization: University of Arizona
Lines: 18
In-Reply-To: markt@harlqn.co.uk (Mark Tillotson)

In article <MARKT.93Jan7122729@wundt.harlqn.co.uk>, markt@harlqn (Mark Tillotson) writes:
>Use a hash table from integers to the number of occurrences, then you
>do N hash-lookups, and one scan of the hash table looking for
>duplicates, giving an expected time O(N),

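Fair enough as far as it goes.  A minimal sketch of that scheme in C
(my own illustration, not from Mark's posting: the fixed NBUCKETS, the
mod hash, and all the names here are assumptions on my part):

#include <stdio.h>
#include <stdlib.h>

#define NBUCKETS 1024   /* fixed at compile time -- the case at issue below */

struct node {
    int key;                /* the integer seen */
    int count;              /* how many times we've seen it */
    struct node *next;      /* chain for collision resolution */
};

static struct node *table[NBUCKETS];

/* One hash lookup: bump the count for key, inserting it if new. */
static void tally(int key)
{
    unsigned h = (unsigned)key % NBUCKETS;
    struct node *p;

    for (p = table[h]; p != NULL; p = p->next)
        if (p->key == key) {
            p->count++;
            return;
        }
    p = malloc(sizeof *p);
    p->key = key;
    p->count = 1;
    p->next = table[h];
    table[h] = p;
}

int main(void)
{
    int data[] = { 3, 7, 3, 1, 7, 7 };
    int i, n = sizeof data / sizeof data[0];
    unsigned h;
    struct node *p;

    for (i = 0; i < n; i++)             /* N hash lookups */
        tally(data[i]);

    for (h = 0; h < NBUCKETS; h++)      /* one scan of the hash table */
        for (p = table[h]; p != NULL; p = p->next)
            if (p->count > 1)
                printf("%d occurs %d times\n", p->key, p->count);
    return 0;
}

Note that NBUCKETS is fixed before we've seen any data, which is
exactly the situation I'm worried about below.
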
But what happens when the number of elements to be scanned is much
larger than the number of buckets you choose?  You get collisions up
the wazoo, and your algorithm is suddenly O(N * f(N/#buckets)), where
O(f(N/#buckets)) is the cost of your collision resolution algorithm.
With chaining, for instance, f is linear: each lookup walks a chain of
expected length N/#buckets, so the total work is O(N^2/#buckets).

If you choose the number of buckets at run-time based on the number
of elements, you might as well just use the "radix sort" solution
already posted.

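For comparison, here's roughly what that sort-then-scan approach could
look like (again my own sketch, not the posting being referred to: I've
assumed an LSD byte-at-a-time radix sort over 32-bit unsigned ints,
followed by a scan for adjacent equal elements):

#include <stdio.h>
#include <string.h>

/* LSD radix sort, one byte per pass: 4 passes of O(N + 256) work,
   assuming 32-bit unsigned ints. */
static void radix_sort(unsigned *a, unsigned *tmp, int n)
{
    int pass, i;

    for (pass = 0; pass < 4; pass++) {
        int count[256] = { 0 };
        int shift = pass * 8;

        for (i = 0; i < n; i++)                  /* histogram this byte */
            count[(a[i] >> shift) & 0xff]++;
        for (i = 1; i < 256; i++)                /* prefix sums -> slots */
            count[i] += count[i - 1];
        for (i = n - 1; i >= 0; i--)             /* stable scatter */
            tmp[--count[(a[i] >> shift) & 0xff]] = a[i];
        memcpy(a, tmp, n * sizeof a[0]);
    }
}

int main(void)
{
    unsigned data[] = { 3, 7, 3, 1, 7, 7 };
    unsigned tmp[sizeof data / sizeof data[0]];
    int n = sizeof data / sizeof data[0];
    int i;

    radix_sort(data, tmp, n);

    for (i = 0; i < n; ) {                       /* duplicates are now adjacent */
        int j = i;
        while (j + 1 < n && data[j + 1] == data[i])
            j++;
        if (j > i)
            printf("%u occurs %d times\n", data[i], j - i + 1);
        i = j + 1;
    }
    return 0;
}

No tuning parameter to get wrong there: the work per pass is O(N)
no matter how the input is distributed.
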
--
You're traveling through another dimension -- a dimension not only of sight and
sound but of mind. A journey into a wondrous land whose boundaries are that of
imagination. That's a signpost up ahead: your next stop -- the Twilight Zone!