- Path: sparky!uunet!cs.utexas.edu!sun-barr!sh.wide!wnoc-tyo-news!ccut!news.u-tokyo.ac.jp!kappa!dave
- From: dave@appi.iis.u-tokyo.ac.jp (David Wuertele)
- Newsgroups: gnu.emacs.help
- Subject: Fuzzy Text Comparison Code?
- Message-ID: <DAVE.92Sep8190422@appi.iis.u-tokyo.ac.jp>
- Date: 8 Sep 92 10:04:22 GMT
- Sender: news@kappa.iis.u-tokyo.ac.jp
- Distribution: gnu
- Organization: Institute of Industrial Science, University of Tokyo.
- Lines: 46
-
- Hi, I'm writing a vocabulary learning application in elisp. It's working
- great and soon I will post it to gnu.emacs.sources. There is one
- function, however, that I would like to improve upon, and maybe you gurus
- out there can give me some suggestions.
-
- The function I want to re-write is defined something like this:
-
- (defun correlation (input-string reference-string)
- "Compare two strings, and return an integer in the range [0..10]
- roughly representing their correlation."
- ;; insert code here
- )
-
- The function should act something like this:
-
- (correlation "The Same STRING, really" "the same string really")
- => 10 ;; case shouldn't count
-
- (correlation "The SAME string (really!)" "the same string really")
- => 10 ;; punctuation of any kind shouldn't count
-
- (correlation "Almost the Same string, really" "the same string really")
- => 10 ;; the input contains every word of the reference
-
- (correlation "really the string same" "the same string really")
- => 10 ;; order should not matter.
-
- (correlation "The very different thing" "the same string really")
- => 0 ;; words like 'the and 'a should not count as matches
-
- (correlation "stringamasamethingreally" "the same string really")
- => ? ;; I haven't decided what this should produce.
-
- (correlation "male" "female")
- => 0 ;; the string 'female contains the string 'male, but, well, you get it.
-
- (correlation "dogs and cats" "the same dog, really")
- => 3 ;; one third of the important words was matched (with a plural)
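-
- As a starting point, the cases above could be approximated by comparing
- sets of "important" words. What follows is only a minimal sketch, not a
- finished solution: the helper names correlation-stop-words,
- correlation-stem and correlation-words are hypothetical, the stop-word
- list is an assumed and incomplete one, and the plural handling just strips
- a trailing "s". It assumes an Emacs whose split-string accepts an
- OMIT-NULLS argument and that provides string-match-p.

```elisp
;; Hypothetical helpers for illustration; none come from an existing package.
(defvar correlation-stop-words '("the" "a" "an" "and" "or" "of")
  "Words ignored when comparing strings (an assumed, incomplete list).")

(defun correlation-stem (word)
  "Crudely strip a trailing \"s\" from WORD so that simple plurals match."
  (if (and (> (length word) 3) (string-match-p "s\\'" word))
      (substring word 0 -1)
    word))

(defun correlation-words (string)
  "Return the important words of STRING, lowercased and stemmed.
Punctuation and whitespace only separate words; stop words are dropped."
  (let (words)
    (dolist (word (split-string (downcase string) "[^a-z0-9]+" t))
      (unless (member word correlation-stop-words)
        (push (correlation-stem word) words)))
    (nreverse words)))

(defun correlation (input-string reference-string)
  "Compare two strings and return an integer in the range [0..10].
The result is the share of important words in REFERENCE-STRING that
also occur in INPUT-STRING, scaled to 10 and rounded down."
  (let ((input (correlation-words input-string))
        (reference (correlation-words reference-string))
        (matched 0))
    ;; Count reference words that appear anywhere in the input,
    ;; so word order and extra input words do not matter.
    (dolist (word reference)
      (when (member word input)
        (setq matched (1+ matched))))
    (if reference
        (/ (* 10 matched) (length reference))
      0)))
```

- Because whole words are compared, "male" does not match "female", and
- the run-together "stringamasamethingreally" case scores 0 under this
- sketch. Partial or misspelled words would need a real edit-distance
- measure, which this does not attempt.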
-
- Any suggestions?
-
- TIA,
- Dave
- -----
- David Wuertele, Yasuda Lab, Electronic Engineering, Institute of Industrial Science,
- University of Tokyo. dave@windsor.iis.u-tokyo.ac.jp
-