home *** CD-ROM | disk | FTP | other *** search
- Path: sparky!uunet!pipex!bnr.co.uk!uknet!mcsun!sun4nl!and!jos
- From: jos@and.nl (Jos Horsmeier)
- Newsgroups: comp.programming
- Subject: Re: Locating duplicates
- Message-ID: <4279@dozo.and.nl>
- Date: 2 Jan 93 11:51:09 GMT
- References: <C07FJo.4vq@ux1.cso.uiuc.edu>
- Organization: AND Software BV Rotterdam
- Lines: 33
-
- In article <C07FJo.4vq@ux1.cso.uiuc.edu> ceblair@ux1.cso.uiuc.edu (Charles Blair) writes:
- |
- | Given a large array of integers. Is there a way to identify duplicates
- |which is substantially less work than sorting the array?
-
- If you're low on memory and the array is really large, there is no better
- way than to sort the array and scan the sorted array afterwards, checking
- for duplicates. This would cost you O(n*log(n)) time to do the job.
-
- On the other hand, if you're not low on memory and if you know the range
- of the numbers in the array, say Min and Max, you could use an index
- scheme as follows:
-
- for all i in Min and Max in index
- index[i] = unset
-
- for all i in Minindex and Maxindex in array
- if index[array[i]] != unset then
- duplicate detected
- else
- index[array[i]]= i
-
- Here index[j] points to the location in the array where the number j
- can be found, if present. This algorithm (and it's memory usage) is
- all very dependent on the actual numbers Min and Max (the lowest and
- highest values found in the array.) But it'll cost you only O(n) time
- to scan the array.
-
- I hope this helps a bit,
-
- kind regards and a happy new year,
-
- Jos aka jos@and.nl
-