- Path: sparky!uunet!cs.utexas.edu!sun-barr!sh.wide!fgw!fdm!ace!melby
- From: melby@dove.yk.fujitsu.co.jp (John B. Melby)
- Newsgroups: comp.std.internat
- Subject: Re: Language tagging
- Message-ID: <MELBY.93Jan11132917@dove.yk.fujitsu.co.jp>
- Date: 11 Jan 93 13:29:17 GMT
- References: <1iav6tINNee2@life.ai.mit.edu> <1iddeeINN58g@rodan.UU.NET>
- <TT.93Jan7085019@tarzan.jyu.fi> <1ii6bkINNf6c@rodan.UU.NET>
- Sender: news@ace.yk.fujitsu.co.jp
- Organization: Open Systems Group, Fujitsu Limited, Yokohama
- Lines: 40
- In-reply-to: avg@rodan.UU.NET's message of 7 Jan 93 21:12:20 GMT
-
- >>Let's look at it this way: How would a Finn want to see Chinese
- >>names sorted?
- >>
- >>If (as is likely) he doesn't know Chinese he either
- >>couldn't care less, or would want them transliterated into Latin
- >>characters (and then sorted by Finnish rules).
- >
- >Ever saw Chinese transliterated into Latin? :-) You generally
- >can't do it and keep it intelligible because phonetic structures
- >of languages are fairly different.
-
- It is feasible to sort Chinese names in the Putonghua-Pinyin order.
- (Of course, there may be problems if this is done programmatically,
- since many characters have two or more pronunciations.)
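
As a rough illustration of the pronunciation problem, here is a small Python sketch. The reading tables are hand-made and hypothetical (tone marks dropped); a real system would need a full dictionary plus some way of knowing that a character is being used as a surname. The names `READINGS`, `SURNAME_READING`, `naive_key`, and `surname_key` are assumptions made for this example only.

```python
# Hypothetical, hand-written reading tables (toneless Pinyin).
# The first reading listed is the common one.
READINGS = {
    "单": ["dan", "shan"],
    "仇": ["chou", "qiu"],
    "李": ["li"],
    "王": ["wang"],
}

# Readings these characters take when used as surnames.
SURNAME_READING = {"单": "shan", "仇": "qiu"}


def naive_key(ch):
    """Sort key that always picks the first (common) reading."""
    return READINGS[ch][0]


def surname_key(ch):
    """Sort key that prefers the surname reading when one is recorded."""
    return SURNAME_READING.get(ch, READINGS[ch][0])


surnames = ["王", "单", "李", "仇"]
naive_order = sorted(surnames, key=naive_key)
surname_order = sorted(surnames, key=surname_key)

print(naive_order)
print(surname_order)
```

The two orders differ because a program that only sees the character cannot tell which reading applies, which is the difficulty noted above.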
-
- This is even more so in the case of Japanese, where the characters of a
- name read as Kouichi Watanabe could just as easily be read as Hirokazu
- Watabe, and the characters for "hinagata" can also be pronounced "suukei."
-
- In Japanese databases, ordering information is stored in a separate
- field from the kanji text itself, so the order of the kanji in the
- character encoding is largely irrelevant for sorting.
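
A minimal Python sketch of that arrangement, assuming a hypothetical record with a `reading` field that holds the name in hiragana. Comparing hiragana by code point only approximates the usual Japanese dictionary order (voiced marks and small kana need more care), so this is an illustration rather than a proper collation.

```python
from dataclasses import dataclass


@dataclass
class Person:
    kanji: str    # the name as written
    reading: str  # hypothetical ordering field: the name in hiragana


people = [
    Person("渡辺", "わたなべ"),
    Person("青木", "あおき"),
    Person("鈴木", "すずき"),
]

# Order by the separate reading field, ignoring the kanji entirely.
by_reading = sorted(people, key=lambda p: p.reading)

# For contrast: order by the code points of the kanji themselves.
by_codepoint = sorted(people, key=lambda p: p.kanji)

print([p.kanji for p in by_reading])
print([p.kanji for p in by_codepoint])
```

The two lists come out in different orders, which shows why the position of a kanji in the character set matters little once a reading field is available.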
-
- If one wants to make sense out of an ordered multilingual list, it might
- be a good idea to provide the following information for each item:
-
- (1) Source language of data item. The sorting algorithm may or may not
- use this information to arrange different languages into separate
- lists.
- (2) Ordering information in source language, where necessary.
- (3) Official or arbitrary Romanized equivalent, where necessary.
- (Since people romanize their names differently, the use of this field
- may result in conflicting entries for the same name, such as Otsu,
- Ootsu, Ohtsu, Otu, Ootu, and Ohtu. There is probably no easy
- solution to this problem.)
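
The three fields above could be sketched in Python roughly as follows. The class name `Entry`, its field names, and the two helper functions are assumptions made for illustration, not an established format. The sketch shows both options from item (1): separate per-language lists ordered by field (2), and a single merged list ordered by field (3).

```python
from dataclasses import dataclass
from itertools import groupby
from typing import Optional


@dataclass
class Entry:
    text: str                         # the item as written
    lang: str                         # (1) source language
    order_key: Optional[str] = None   # (2) ordering information, where necessary
    romanized: Optional[str] = None   # (3) Romanized equivalent, where necessary


def sorted_by_language(entries):
    """Separate lists per language, each ordered by its own ordering field."""
    def key(e):
        return (e.lang, (e.order_key or e.text).casefold())
    ordered = sorted(entries, key=key)
    return {lang: list(group)
            for lang, group in groupby(ordered, key=lambda e: e.lang)}


def sorted_merged(entries):
    """One merged list, ordered by the Romanized form where one is given."""
    return sorted(entries, key=lambda e: (e.romanized or e.text).casefold())


entries = [
    Entry("Virtanen", "fi"),
    Entry("大津", "ja", order_key="おおつ", romanized="Otsu"),
    Entry("青木", "ja", order_key="あおき", romanized="Aoki"),
    Entry("Melby", "en"),
]

print([e.text for e in sorted_merged(entries)])
print({lang: [e.text for e in group]
       for lang, group in sorted_by_language(entries).items()})
```

Nothing here reconciles competing romanizations: entries carrying "Otsu", "Ootsu", and "Ohtsu" would simply get three different keys in the merged list.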
-
- -----
- John B. Melby
- Fujitsu Limited, Yokohama
- melby%yk.fujitsu.co.jp@fai.com
-