NetNews Usenet Archive 1993 #1

Internet Message Format | 1993-01-04 | 1.6 KB

Path: sparky!uunet!munnari.oz.au!spool.mu.edu!enterpoop.mit.edu!eru.mt.luth.se!lunic!sunic!dkuug!login.dkuug.dk!keld
From: keld@login.dkuug.dk (Keld Jørn Simonsen)
Newsgroups: comp.std.internat
Subject: Re: Future extension of ISO 10646
Message-ID: <keld.726195098@login.dkuug.dk>
Date: 5 Jan 93 00:51:38 GMT
References: <1hvth4EINN2r4@uni-erlangen.de> <1993Jan4.102448.9710@cs.ruu.nl>
Sender: news@dkuug.dk
Organization: DKnet
Lines: 27
Nntp-Posting-Host: login.dkuug.dk

jhelling@cs.ruu.nl (Jeroen Hellingman) writes:

>[about having an A-umlaut for each language separately to help sorting...]

I believe the sorting procedure is related to the user, not to the
data. It is the user who needs the data sorted in the way the user
expects it, so that s/he can find what s/he is looking for. For that
purpose the character ä (a-umlaut, a-diaeresis, Swedish/Finnish ae)
should not have different encodings according to language, as in many
cases the user does not know what language the string is in. For
instance, is Åberg a Danish or a Swedish name? It would be sorted
differently if it were sorted according to its original language.

Most national standards for collating (German, Austrian, Canadian,
Swedish, Danish, Norwegian and Finnish come to mind) do not provide
specifications for collating according to the original language, but
only with respect to the letters. Some standards are explicit about
this, for example the Danish DS 377 standard.

So I would recommend *not* having different versions of Latin
characters outside the BMP of 10646.

I know too little about Han/Kanji/Hanzi character collation to give
an educated opinion here.

Keld Simonsen
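
The point that collation belongs to the reader and not to the data can
be sketched in a few lines of Python. This is only an illustration
under stated assumptions: the two alphabets and the Danish folding of
ä/ö onto æ/ø are simplified stand-ins, not the text of DS 377 or of any
Swedish standard, and the names collation_key, swedish_key and
danish_key are made up for the example. Each string has exactly one
encoding; only the key function chosen by the user differs.

```python
import unicodedata


def collation_key(alphabet, equivalents):
    """Build a sort key for one user's alphabet (simplified assumption)."""
    rank = {ch: i for i, ch in enumerate(alphabet)}

    def key(word):
        # Use precomposed letters so that "ä" is a single code point.
        text = unicodedata.normalize("NFC", word).lower()
        # Fold letters this user treats as variants of another letter.
        folded = [equivalents.get(ch, ch) for ch in text]
        # Letters outside the alphabet sort after all known letters.
        return [rank.get(ch, len(alphabet)) for ch in folded]

    return key


# Hypothetical, simplified orderings: Swedish ends ... z å ä ö,
# Danish ends ... z æ ø å and (assumed here) files ä with æ, ö with ø.
swedish_key = collation_key("abcdefghijklmnopqrstuvwxyzåäö", {})
danish_key = collation_key("abcdefghijklmnopqrstuvwxyzæøå",
                           {"ä": "æ", "ö": "ø"})

names = ["Åberg", "Äberg", "Zetterberg", "Andersen"]

swedish_order = sorted(names, key=swedish_key)
danish_order = sorted(names, key=danish_key)

print("Swedish reader:", swedish_order)
print("Danish reader: ", danish_order)
```

The same four strings come out in a different order for each reader,
and neither ordering needs to know which language a given name
originally came from.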