home *** CD-ROM | disk | FTP | other *** search
- !-----------------------------------------------------------------------------!
- ! !
- ! Program: 2nd Language Exercise (Hope: Sorted words with frequency count) !
- ! Author: M. Y. Ben-Gershon !
- ! Date: 13th December 1989 !
- ! !
- !-----------------------------------------------------------------------------!
- ! !
- ! Purpose: !
- ! This program reads a sentence as a list of characters. It then splits !
- ! the sentence into its component words, sorts them into alphabetical !
- ! order (strictly speaking into ASCII order, upper case befor lower case) !
- ! and then produces a list of all the words found, together with a !
- ! frequency count for each word found. !
- ! !
- ! !
- ! Calling format: !
- ! The program is used by entering a function call of the form: !
- ! !
- ! 'WordCount("This is the sentence I wish to use for the test.");' !
- ! !
- ! !
- ! Output format: !
- ! The program displays its results in the form of a list. In the above !
- ! example, the results would have been: !
- ! !
- ! >> [ ( "I" , 1 ) , ( "This" , 1 ) , ( "for" , 1 ) , ( "is" , 1 ) , !
- ! ( "sentence" , 1 ) , ( "test" , 1 ) , ( "the" , 2 ) , ( "to" , 1 ) , !
- ! ( "use" , 1 ) , ( "wish" , 1 ) ] : list ( ( TV000 # num ) ) !
- ! !
- ! !
- !-----------------------------------------------------------------------------!
-
-
-
- type word == list(char);
-
- type sentence == list(char);
-
- type CountItem(alpha) == alpha # num;
-
- type CountList(alpha) == list(CountItem(alpha));
-
-
-
-
-
-
- data tree(alpha) == niltree ++
- constree(alpha #
- tree(alpha) # tree(alpha));
-
-
-
-
- infix << : 6; ! This is the 'less than' operator !
- ! for words (lists of char). !
-
-
-
- dec Letter : char -> truval;
- dec AWord : sentence # word -> word # sentence;
- dec WordList : sentence # list(word) -> list(word) # sentence;
- dec << : word # word -> truval;
- dec InsertInTree : word # tree(word) -> tree(word);
- dec OrderedTreeFromList : list(word) -> tree(word);
- dec Flatten : tree(alpha) -> list(alpha);
- dec DeleteAndCount : alpha # list(alpha) # num # list(alpha) -> num # list(alpha);
- dec ListCount : list(alpha) -> CountList(alpha);
- dec WordCount : sentence -> CountList(word);
-
-
-
-
- !--------------------------------------------------------------!
- ! !
- ! The following function returns 'true' if the character !
- ! parameter passed to it was a letter, and 'false' otherwise. !
- ! !
- !--------------------------------------------------------------!
-
- ! dec Letter : char -> truval;
-
- --- Letter ch <=
- ( (ch >= 'a') and (ch =< 'z') )
- or ( (ch >= 'A') and (ch =< 'Z') );
-
-
-
-
-
- !--------------------------------------------------------------!
- ! !
- ! The following function returns a given sentence, split !
- ! into two parts: the first word, and the rest. It !
- ! terminates a word on the first non-alphabetic character !
- ! found. It MUST be called initially using the form: !
- ! !
- ! 'AWord("List of characters forming a sentence" , nil);' !
- ! !
- ! as it uses the first parameter to 'accumulate' the !
- ! characters forming the word to be returned. This makes the !
- ! function a linear one, rather than an O(n^2) one, which !
- ! would be the case had the function been a simple recursive !
- ! one, without such an 'accumulating parameter'. !
- ! !
- !--------------------------------------------------------------!
-
- ! dec AWord : sentence # word -> word # sentence;
-
- --- AWord(nil , TheWordSoFar) <=
- (TheWordSoFar , nil);
-
- --- AWord(Head::Tail , TheWordSoFar) <=
- if not Letter(Head) then
- (TheWordSoFar , Tail)
- else
- AWord(Tail , TheWordSoFar <> (Head::nil) );
-
-
-
-
-
-
-
- !--------------------------------------------------------------!
- ! !
- ! The following function returns a given sentence, split up !
- ! into its component words (lists of characters). It uses !
- ! the function 'AWord' (defined previously) which returns a !
- ! given sentence, split into two parts - the first word, and !
- ! the rest. The words returned by 'AWord' are accumulated !
- ! in an accumulating parameter, so this function should be !
- ! called using the form: !
- ! !
- ! 'WordList("List of characters forming a sentence" , nil);' !
- ! !
- !--------------------------------------------------------------!
-
-
- ! dec WordList : sentence # list(word) -> list(word) # sentence;
-
- --- WordList(nil , WordsSoFar) <=
- (WordsSoFar , nil);
-
- --- WordList(Sentence , WordsSoFar) <= ! This removes 'nil' !
- if NextWord = nil then ! words - and prevents !
- WordList(RestOfSentence , WordsSoFar) ! them from being added !
- ! to the list of words. !
- else
- WordList(RestOfSentence , WordsSoFar <> (NextWord :: nil) )
-
- where (NextWord , RestOfSentence) == AWord(Sentence , nil);
-
-
-
-
-
-
- !--------------------------------------------------------------!
- ! !
- ! The following function returns 'true' if the first list !
- ! is 'less' than the second. It is polymorphic, and will !
- ! work on any list of characters or numbers. For a list of !
- ! numbers the order is strictly numerical, for lists of char !
- ! the order is ASCII. The nil list is 'less than' any other !
- ! list. In the case of lists of char (words), the function !
- ! will, in effect be testing for the alphabetical order of !
- ! the two words, bearing in mind that upper and lower case !
- ! are regarded aa being distinct, with lower-case being !
- ! 'less' than upper-case. The function defines a new infix !
- ! operator, '<<', with a priority of 6 (as '>', '<' etc). !
- ! !
- !--------------------------------------------------------------!
-
- ! infix << : 6;
- ! dec << : list(alpha) # list(alpha) -> truval;
-
- --- _ << nil <=
- false;
-
- --- nil << _ <=
- true;
-
- --- (h1::t1) << (h2::t2) <=
- if (h1 < h2) then
- true
- else
- if (h1 > h2) then
- false
- else
- (t1 << t2);
-
-
-
-
-
-
-
- !--------------------------------------------------------------!
- ! !
- ! The following function returns a tree, consisting of the !
- ! given word inserted into the given ordered binary tree of !
- ! words. !
- ! !
- !--------------------------------------------------------------!
-
- ! dec InsertInTree : word # tree(word) -> tree(word);
-
- --- InsertInTree(w , niltree) <=
- constree(w , niltree , niltree);
-
- --- InsertInTree(w , constree(NodeVal , Left , Right)) <=
- if (w << NodeVal) then
- constree(NodeVal , InsertInTree(w , Left) , Right)
- else
- constree(NodeVal , Left , InsertInTree(w , Right) );
-
-
-
-
-
-
-
- !--------------------------------------------------------------!
- ! !
- ! The following function returns a tree, consisting of an !
- ! unordered list of words, entered into an ordered binary !
- ! tree of words. !
- ! !
- !--------------------------------------------------------------!
-
- ! dec OrderedTreeFromList : list(word) -> tree(word);
-
- --- OrderedTreeFromList( nil ) <=
- niltree;
-
- --- OrderedTreeFromList( h::t ) <=
- InsertInTree(h , OrderedTreeFromList(t) );
-
-
-
-
-
-
- !--------------------------------------------------------------!
- ! !
- ! The following function 'flattens' a tree into a list. If !
- ! the tree was an ordered one, 'Flatten' will return a list, !
- ! consisting of the ordered elements of the tree. !
- ! !
- !--------------------------------------------------------------!
-
- ! dec Flatten : tree(alpha) -> list(alpha);
-
- --- Flatten niltree <=
- nil;
-
- --- Flatten( constree( NodeVal , Left , Right ) ) <=
- (Flatten Left) <> (NodeVal :: Flatten Right);
-
-
-
-
-
-
-
-
- !--------------------------------------------------------------!
- ! !
- ! The following function searches for an item in a list and !
- ! returns a count of the number of times the item was found !
- ! in the list, together with a new list consisting of the !
- ! original list with all occurrences of the item removed. !
- ! !
- !--------------------------------------------------------------!
-
-
- ! dec DeleteAndCount : alpha # list(alpha) # num # list(alpha) -> num # list(alpha);
-
- --- DeleteAndCount(_ , nil , TimesThisWord , UniqueWordsSoFar) <=
- (TimesThisWord , UniqueWordsSoFar);
-
- --- DeleteAndCount(Item , Head::Tail , TimesThisWord , UniqueWordsSoFar) <=
- if (Item = Head) then
- DeleteAndCount(Item , Tail , succ TimesThisWord , UniqueWordsSoFar)
- else
- DeleteAndCount(Item , Tail , TimesThisWord , UniqueWordsSoFar <> (Head::nil));
-
-
-
-
-
-
- !--------------------------------------------------------------!
- ! !
- ! The following function counts the frequency of items in a !
- ! list. It returns a list of tuples, each containing one of !
- ! the unique words found, together with its frequency. !
- ! !
- !--------------------------------------------------------------!
-
- ! dec ListCount : list(alpha) -> CountList(alpha);
-
- --- ListCount(nil) <=
- nil;
-
- --- ListCount(Head::Tail) <=
- (Head , (succ TimesHeadFound) ) :: ListCount(NewTail)
- where (TimesHeadFound , NewTail) ==
- DeleteAndCount(Head , Tail , 0 , nil);
-
-
-
-
-
-
- !--------------------------------------------------------------!
- ! !
- ! The following function is the program 'shell' which enables !
- ! the component functions to be put together and produce !
- ! a sorted list of the words in a sentence with their !
- ! frequency count, when the user is only required to enter !
- ! the sentence to be counted as a parameter of this one !
- ! function. !
- ! !
- !--------------------------------------------------------------!
-
-
- ! dec WordCount : sentence -> CountList(word);
-
- --- WordCount ls <=
- ListCount(Flatten(OrderedTreeFromList(TheListOfWords)))
-
- where (TheListOfWords , _) == WordList(ls , nil);
-
-
- WordCount (" Write a function ChangeBase of two arguments which converts a number to its representation in some specified base <= 10. ") ;