- Path: sparky!uunet!munnari.oz.au!spool.mu.edu!enterpoop.mit.edu!eru.mt.luth.se!lunic!sunic!dkuug!login.dkuug.dk!keld
- From: keld@login.dkuug.dk (Keld J|rn Simonsen)
- Newsgroups: comp.std.internat
- Subject: Re: Future extension of ISO 10646
- Message-ID: <keld.726195098@login.dkuug.dk>
- Date: 5 Jan 93 00:51:38 GMT
- References: <1hvth4EINN2r4@uni-erlangen.de> <1993Jan4.102448.9710@cs.ruu.nl>
- Sender: news@dkuug.dk
- Organization: DKnet
- Lines: 27
- Nntp-Posting-Host: login.dkuug.dk
-
- jhelling@cs.ruu.nl (Jeroen Hellingman) writes:
-
- >[about having an A-umlaut for each language separately to help sorting...]
-
- I believe the sorting procedure is tied to the user, not to the data.
- It is the user who needs the data sorted in the way the user
- expects, so s/he can find what s/he is looking for.
- For that purpose the character ä (a-umlaut, a-diaeresis, Swedish/Finnish
- ae) should not have different encodings according to language, as
- in many cases the user does not know what language the string is
- in; for instance, is Åberg a Danish or a Swedish name? It would be
- sorted differently if it had to be sorted according to the
- original language.
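
To make the point concrete, here is a minimal sketch in Python of sorting one and the same list of names under two user conventions. The two alphabets and the folding tables are simplified assumptions for illustration only (they are not taken from DS 377 or from any Swedish standard), the function names are hypothetical, and the strings are assumed to use precomposed characters.

```python
# Toy alphabets: simplified assumptions, not the text of any national standard.
DANISH = "abcdefghijklmnopqrstuvwxyzæøå"
SWEDISH = "abcdefghijklmnopqrstuvwxyzåäö"

# Assumed equivalences: each convention folds the other's letters onto its own.
DANISH_FOLD = str.maketrans({"ä": "æ", "ö": "ø"})
SWEDISH_FOLD = str.maketrans({"æ": "ä", "ø": "ö"})


def collation_key(word, alphabet, fold):
    """Map a word to a list of letter positions in the user's alphabet."""
    folded = word.lower().translate(fold)
    # Raises ValueError for characters outside the toy alphabet.
    return [alphabet.index(ch) for ch in folded]


def sort_for_user(words, alphabet, fold):
    """Sort by the user's convention; the encoded data is left unchanged."""
    return sorted(words, key=lambda w: collation_key(w, alphabet, fold))


names = ["Öberg", "Zander", "Åberg"]
danish_order = sort_for_user(names, DANISH, DANISH_FOLD)
swedish_order = sort_for_user(names, SWEDISH, SWEDISH_FOLD)
print(danish_order)
print(swedish_order)
```

The same encoded strings come out in a different order for each user, and no per-language variant of any letter is needed in the character set to get that result.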
-
- Most national standards for collating (German, Austrian, Canadian,
- Swedish, Danish, Norwegian, Finnish come to mind) do not
- provide specifications for collating according to the original
- language, but only with respect to the letters. Some standards are
- explicit about this, for example the Danish DS 377 standard.
-
- So I would recommend *not* having language-specific versions of Latin
- characters outside the BMP of 10646.
-
- I know too little about Han/Kanji/Hanzi character collation to
- give an educated opinion here.
-
- Keld Simonsen
-