NetNews Usenet Archive 1992 #31

home *** CD-ROM | disk | FTP | other *** search

/ NetNews Usenet Archive 1992 #31 / NN_1992_31.iso / spool / comp / programm / 3395 < prev next >

Wrap

Internet Message Format | 1993-01-02 | 1.4 KB

Path: sparky!uunet!pipex!bnr.co.uk!uknet!mcsun!sun4nl!and!jos From: jos@and.nl (Jos Horsmeier) Newsgroups: comp.programming Subject: Re: Locating duplicates Message-ID: <4279@dozo.and.nl> Date: 2 Jan 93 11:51:09 GMT References: <C07FJo.4vq@ux1.cso.uiuc.edu> Organization: AND Software BV Rotterdam Lines: 33 In article <C07FJo.4vq@ux1.cso.uiuc.edu> ceblair@ux1.cso.uiuc.edu (Charles Blair) writes: | | Given a large array of integers. Is there a way to identify duplicates |which is substantially less work than sorting the array? If you're low on memory and the array is really large, there is no better way than to sort the array and scan the sorted array afterwards, checking for duplicates. This would cost you O(n*log(n)) time to do the job. On the other hand, if you're not low on memory and if you know the range of the numbers in the array, say Min and Max, you could use an index scheme as follows: for all i in Min and Max in index index[i] = unset for all i in Minindex and Maxindex in array if index[array[i]] != unset then duplicate detected else index[array[i]]= i Here index[j] points to the location in the array where the number j can be found, if present. This algorithm (and it's memory usage) is all very dependent on the actual numbers Min and Max (the lowest and highest values found in the array.) But it'll cost you only O(n) time to scan the array. I hope this helps a bit, kind regards and a happy new year, Jos aka jos@and.nl