- Newsgroups: comp.std.internat
- Path: sparky!uunet!haven.umd.edu!decuac!pa.dec.com!jrdzzz.jrd.dec.com!jrd.dec.com!doi
- From: doi@jrd.dec.com (Hitoshi Doi)
- Subject: Re: Dumb Americans (was INTERNATIONALIZATION: JAPAN, FAR EAST)
- Message-ID: <C0Cyw8.Dn0@jrd.dec.com>
- Keywords: ISO10646 Unicode
- Sender: usenet@jrd.dec.com (USENET News System)
- Nntp-Posting-Host: usagi.jrd.dec.com
- Organization: DEC Japan Research and Development Center
- References: <8490@charon.cwi.nl> <1hvu79INN4qf@rodan.UU.NET> <1i0oj2INNp4v@life.ai.mit.edu> <1i13rrINNars@rodan.UU.NET> <1993Jan1.163927.20277@prl.dec.com> <2605@titccy.cc.titech.ac.jp> <1993Jan4.214109.8143@prl.dec.com>
- Date: Tue, 5 Jan 1993 02:00:08 GMT
- Lines: 61
-
- In article <1993Jan4.214109.8143@prl.dec.com>, boyd@prl.dec.com (Boyd Roberts) writes:
- # In article <2605@titccy.cc.titech.ac.jp>, mohta@necom830.cc.titech.ac.jp (Masataka Ohta) writes:
- # >
- # > In Japanese, the only natural sorting is on pronunciation, whose
- # > information is missing from character code. Otherwise, any sorting
- # > is almost as good as other sorting.
- # >
- #
- # Applying your argument to languages that use the glyphs 0, 1, 2, 3, 4, 5,
- # 6, 7, 8 and 9 to represent integers then it is only `natural' to sort
- # them based on their pronunciation.
- #
- # So in Japanese I will find myself sorting them in ascending order:
- #
- # 1 ichi
- # 9 kyu
- # 5 go
- # 3 san
- # 4 shi
- # 7 shichi
- # 0 zero
- # 2 ni
- # 8 hachi
-
- true, if that is how the numbers are intended to be read.
-
- # But that would be wrong, because I can pronounce:
-            ^^^^^ could
-
- # 0 as rei
- # 4 as yon
- # 7 as nana
- # 9 as ku
-
- # So just how will you sort these glyphs based on `natural' pronunciation?
-
- you have to know how it is intended to be read (pronounced).
- otherwise you can't sort.
-
- an example (sorted in ascending order):
-
-   3rd      (3 is read 'saa')
-   sankaku
-   3gatu    (3 is read 'san')
-   30       (30 is read 'sanjuu')
-   4kaku    (4 is read 'shi')
-   13       (13 is read 'juusan')
-   12       (12 is read 'juuni')
-   3D       (3 is read 'surii')
-   3tsu     (3 is read 'mi')
-   4do      (4 is read 'yon')
-   123      (123 is read 'wan tsuu surii')
-
- silly?
-
- sorting Japanese text that mixes kanji, kana, numbers, and alphabetic
- characters can be really difficult.
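
a minimal sketch of the idea in Python: each entry carries its intended reading explicitly, and the sort key is that reading rather than the written form. the hiragana spellings are an assumed transcription of the romaji readings in the example, `ENTRIES` and `sort_by_reading` are hypothetical names, and plain code-point order of hiragana is only a rough stand-in for a real Japanese collation.

```python
# Each entry pairs a written form with its intended reading in hiragana.
# The kana spellings are an assumed transcription of the romaji readings
# given in the example; they are deliberately listed out of order here.
ENTRIES = [
    ("123", "わんつうすりい"),    # wan tsuu surii
    ("4do", "よんど"),            # yon-do
    ("12", "じゅうに"),           # juuni
    ("3tsu", "みっつ"),           # mi-ttsu
    ("3rd", "さあど"),            # saa-do
    ("13", "じゅうさん"),         # juusan
    ("4kaku", "しかく"),          # shi-kaku
    ("3D", "すりいでぃい"),       # surii dii
    ("30", "さんじゅう"),         # sanjuu
    ("sankaku", "さんかく"),      # sankaku
    ("3gatu", "さんがつ"),        # san-gatsu
]


def sort_by_reading(entries):
    """Sort (written, reading) pairs by the reading, not the written form.

    Hiragana code points follow the kana syllabary order closely, so a
    plain string comparison is used as an approximation of Japanese
    collation. A production system would use a proper collation library.
    """
    return sorted(entries, key=lambda entry: entry[1])


if __name__ == "__main__":
    for written, reading in sort_by_reading(ENTRIES):
        print(f"{written}\t{reading}")
```

sorting the same entries by their written form instead would put "12" first and "sankaku" last, because the character codes of the digits and letters say nothing about how each item is read.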
- --
- Hitoshi Doi, International Open Systems Engineering doi@jrd.dec.com
- Japan Research and Development Center decwrl!jrd.dec.com!doi
- Digital Equipment Corporation Japan [from JUNET: doi@jrd.dec-j.co.jp]
-