- Newsgroups: comp.std.internat
- Path: sparky!uunet!zaphod.mps.ohio-state.edu!saimiri.primate.wisc.edu!ames!agate!dog.ee.lbl.gov!hellgate.utah.edu!fcom.cc.utah.edu!cs.weber.edu!terry
- From: terry@cs.weber.edu (A Wizard of Earth C)
- Subject: Re: Dumb Americans (was INTERNATIONALIZATION: JAPAN, FAR EAST)
- Message-ID: <1993Jan9.031217.27425@fcom.cc.utah.edu>
- Keywords: Han Kanji Katakana Hirugana ISO10646 Unicode Codepages
- Sender: news@fcom.cc.utah.edu
- Organization: Weber State University (Ogden, UT)
- References: <1i2emiINN2td@rodan.UU.NET> <1993Jan7.065611.15193@fcom.cc.utah.edu> <1993Jan8.094119.6795@prl.dec.com>
- Date: Sat, 9 Jan 93 03:12:17 GMT
- Lines: 70
-
- In article <1993Jan8.094119.6795@prl.dec.com> boyd@prl.dec.com (Boyd Roberts) writes:
- >In article <1993Jan7.065611.15193@fcom.cc.utah.edu>, terry@cs.weber.edu (A Wizard of Earth C) writes:
- >>
- >> Again, sorting can not be safely tied to character set lexical order for
- >> all languages. I disagree with Boyd here. Localization with Unicode is
- >> a piece of cake. Unicode allows it to be entirely data driven, with no
- >> locale-specific algorithms or hard-coded data.
- >
- >Maybe you should check your attributions, but I have never advocated
- >that the code values can be used to do lexical sorting in the general
- >case -- far from it. Your above paragraph contains a contradiction
- >as I read it.
-
- Sorry if I have misattributed anything. The disagreement was regarding
- the ease of localization.
-
- >Tell me how I sort on stroke count in Unicode without ``locale-specific
- >algorithms or hard-coded data''?
-
- Locale-specific sorting can be done with a generalized algorithm which
- is itself data-driven. The *data* doing the driving, on the other hand,
- is *entirely* locale specific.
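
To make that concrete, here is a minimal sketch in Python of one generic sort routine driven by two different locale tables. The names (build_table, sort_key, SWEDISH, GERMAN) and the tables themselves are my own illustrations, not any standard's collation data. The Swedish table puts the letters å, ä, ö after z; the German table treats ä, ö, ü as variants of a, o, u, which is roughly the dictionary convention.

```python
# One generic algorithm; all locale knowledge lives in the tables.
# The tables below are illustrative assumptions, not official collation data.

def build_table(alphabet, equivalents=None):
    """Map each character to a (primary, secondary) weight pair."""
    table = {ch: (i, 0) for i, ch in enumerate(alphabet)}
    # A variant letter shares its base letter's primary weight and
    # loses ties to it on the secondary weight.
    for ch, base in (equivalents or {}).items():
        table[ch] = (table[base][0], 1)
    return table

def sort_key(word, table):
    """Compare primary weights first, then secondary weights to break ties."""
    weights = [table[ch] for ch in word.lower()]  # KeyError if a char is not in the table
    return ([p for p, _ in weights], [s for _, s in weights])

SWEDISH = build_table("abcdefghijklmnopqrstuvwxyzåäö")
GERMAN = build_table("abcdefghijklmnopqrstuvwxyz",
                     {"ä": "a", "ö": "o", "ü": "u"})

words = ["zon", "ärm", "arm", "öl", "ost"]
swedish_order = sorted(words, key=lambda w: sort_key(w, SWEDISH))
german_order = sorted(words, key=lambda w: sort_key(w, GERMAN))
```

The same sorted() call and the same sort_key() produce both orderings; only the table passed in changes.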
-
- Localization (in terms of providing a native language environment of
- commands, interface text, and error messages) can be totally data driven
- using locale specific message catalogs.
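
A sketch of the message catalog idea, again in Python. The catalog contents, the locale names, and the message() helper are hypothetical; a real system would load the catalogs from per-locale files (the X/Open catopen/catgets interface is one existing example of the approach) rather than from a literal in the program.

```python
# Message text is looked up by identifier in a per-locale catalog.
# CATALOGS stands in for data that would normally live in external files;
# its contents are made up for illustration.

DEFAULT_LOCALE = "en_US"

CATALOGS = {
    "en_US": {
        "file_not_found": "{name}: no such file",
        "disk_full": "disk full",
    },
    "de_DE": {
        "file_not_found": "{name}: Datei nicht gefunden",
        # "disk_full" deliberately left untranslated to show the fallback.
    },
}

def message(locale, msg_id, **args):
    """Return the text for msg_id in the given locale, falling back to the default."""
    catalog = CATALOGS.get(locale, {})
    template = catalog.get(msg_id)
    if template is None:
        template = CATALOGS[DEFAULT_LOCALE][msg_id]
    return template.format(**args)

german_text = message("de_DE", "file_not_found", name="foo.txt")
fallback_text = message("de_DE", "disk_full")
```

The program itself only ever names message identifiers; adding a language means adding a catalog, not touching the code.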
-
- None of this requires either direct manipulation algorithms (i.e., not data
- driven) or hard-coded data (as in constant character strings compiled into
- programs).
-
- In previous posts, I (and others) have pointed out why it is fundamentally
- impossible to take all sorting issues into account with simple lexical
- ordering: a given language can have multiple sorting procedures.
- A locale-specific character set binds the sorting order to the lexical
- order within a single language. The only way around that is to provide
- multiple character sets for each language with more than one sort order,
- in which case the character set is bound to the locale rather than to
- the lexical ordering within a particular character set for each language.
-
- Your stroke count example is well taken, since it is not the only sorting
- order in Chinese (for instance, sorting on radicals is frequently used),
- but stroke-count (and direction) sorts are highly useful even in English
- for handwriting recognition systems.
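
As a sketch of two sort orders over one character set, the Python below sorts the same five Han characters by total stroke count and by radical. The per-character values are the stroke counts and Kangxi radical numbers as I understand them, and should be read as illustrative rather than authoritative; CHAR_DATA, ORDERINGS, and sort_chars are names made up for this example.

```python
# Two orderings over the same characters, selected purely by data.
# Per-character values are illustrative: total strokes and Kangxi radical number.

CHAR_DATA = {
    "天": {"strokes": 4, "radical": 37},
    "休": {"strokes": 6, "radical": 9},
    "好": {"strokes": 6, "radical": 38},
    "明": {"strokes": 8, "radical": 72},
    "林": {"strokes": 8, "radical": 75},
}

# Each ordering is just a list of fields to compare, most significant first.
ORDERINGS = {
    "stroke_count": ("strokes", "radical"),
    "radical": ("radical", "strokes"),
}

def sort_chars(chars, ordering):
    """Sort characters by the fields named in the chosen ordering."""
    fields = ORDERINGS[ordering]
    return sorted(chars, key=lambda ch: tuple(CHAR_DATA[ch][f] for f in fields))

by_strokes = sort_chars("林明好休天", "stroke_count")
by_radical = sort_chars("林明好休天", "radical")
```

Neither result matches the code value order of these characters, and neither needs a second character set; the ordering is chosen by name at run time.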
-
- >Re: `san'
- >
- > It can be used on either the family name or the first name.
- > This assertion is based on current common practice in Japan.
- >
- > If you knew Ohta-san reasonably well (ie. friend or colleague)
- > and were contemporaries you could use `kun' instead of `san'.
-
- So, out of curiosity, is "Ohta" his family name or his first name? I had
- assumed `san' was being used as an honorific (i.e., Mr.), in which case the
- correct usage in his particular case is still dependent on whether he
- was ordering his name for Japanese or for English in his signature.
-
-
- Terry Lambert
- terry@icarus.weber.edu
- terry_lambert@novell.com
- ---
- Any opinions in this posting are my own and not those of my present
- or previous employers.
- --
- -------------------------------------------------------------------------------
- "I have an 8 user poetic license" - me
- Get the 386bsd FAQ from agate.berkeley.edu:/pub/386BSD/386bsd-0.1/unofficial
- -------------------------------------------------------------------------------
-