- Path: sparky!uunet!munnari.oz.au!spool.mu.edu!sgiblab!nec-gw!nec-tyo!wnoc-tyo-news!cs.titech!titccy.cc.titech!necom830!mohta
- From: mohta@necom830.cc.titech.ac.jp (Masataka Ohta)
- Newsgroups: comp.std.internat
- Subject: Dumb Terry
- Keywords: Han Kanji Katakana Hirugana ISO10646 Unicode Codepages
- Message-ID: <2639@titccy.cc.titech.ac.jp>
- Date: 7 Jan 93 08:55:00 GMT
- References: <1hu9v5INNbp1@rodan.UU.NET> <8490@charon.cwi.nl> <1hvu79INN4qf@rodan.UU.NET> <1993Jan7.063116.14846@fcom.cc.utah.edu>
- Sender: news@titccy.cc.titech.ac.jp
- Organization: Tokyo Institute of Technology
- Lines: 45
-
- In article <1993Jan7.063116.14846@fcom.cc.utah.edu>
- terry@cs.weber.edu (A Wizard of Earth C) writes:
-
- >The
- >collation sequences for Japanese, for instance, vary on pronunciation. If
- >it were the intent of Unicode to provide a unified collation mechanism,
- >this would be a very strong argument against Chinese/Japanese unification.
-
- You completely miss the point. A Japanese Kanji does not have a unique
- pronunciation, while a Japanese word does.
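-
- A minimal sketch of that distinction, assuming a hypothetical hand-made
- READINGS table (a real system would need a full dictionary): the Kanji 大
- is read "oo" in one word and "dai" in another, so only the word as a whole
- gives a usable sort key. Comparing raw hiragana code points is itself a
- simplification of real dictionary order.

```python
# Hypothetical word -> reading table, for illustration only.
READINGS = {
    "大阪": "おおさか",    # 大 is read "oo" here
    "大学": "だいがく",    # 大 is read "dai" here
    "京都": "きょうと",
    "東京": "とうきょう",
}


def sort_by_code_point(words):
    """Sort by the code points of the Kanji themselves."""
    return sorted(words)


def sort_by_reading(words, readings=READINGS):
    """Sort by each word's reading (raw hiragana code point comparison)."""
    return sorted(words, key=lambda word: readings[word])


words = list(READINGS)
print(sort_by_code_point(words))
print(sort_by_reading(words))
```

- The two orders differ, and no per-character table could produce the second
- one, because the key belongs to the word and not to the character.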
-
- >First, Unicode is not the sole definition of 10646; just the only currently
- >defined character set within 10646. There is no reason to throw out 10646
- >because of Unicode (although I could make an argument for 32 bits being a
- >nifty reason for doing so).
-
- 10646 explicitly assigns corresponding C/J/K Han characters in the GB, JIS
- and KCS national standards to the same code point. Thus, expanding 10646
- to 32 bits can't re-separate the assignment.
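-
- As a rough illustration, assuming Python's gb2312, euc_jp and euc_kr
- codecs as stand-ins for the GB, JIS and KCS standards named above: the
- same Han character has different bytes in each national encoding, yet
- every one of them decodes to a single shared code point.

```python
# Assumption: these codecs stand in for the GB, JIS and KCS national standards.
CODECS = ["gb2312", "euc_jp", "euc_kr"]


def national_bytes(ch):
    """Encode one character in each national encoding."""
    return {codec: ch.encode(codec) for codec in CODECS}


def unified_code_points(ch):
    """Decode each national encoding back and report the resulting code point."""
    return {
        codec: ord(raw.decode(codec))
        for codec, raw in national_bytes(ch).items()
    }


ch = "中"
for codec, raw in national_bytes(ch).items():
    print(codec, raw.hex(), hex(ord(raw.decode(codec))))
```

- Once the three sources are mapped to one code point, the information about
- which national standard the character came from is gone, and a wider code
- space does not bring it back.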
-
- That is, 10646 is unusable, because it is polluted with Unicode.
-
- >Second, Unicode buys more than simply another character set; it buys the
- >ability to produce non-conflicting monolingual localizations of software
- >systems (as opposed to conflicting ones as a result of a lack of standards
- >coordination with existing standards). It also buys a platform for
- >non-conflicting multinationalization (multilingual data processing) given
- >a means of compounding documents by language/locale (there may be more than
- >one locale per language).
-
- For such a purpose, use EUC, which is a moderate subset of ISO 2022.
-
- >The Unicode character set solves a number of long standing problems.
-
- Unicode is a genuine mixture of past failures:
-
- 1) 16-bitness, from the 8-bitness of ISO 8859/*
- 2) statefulness (by signature), from ISO 2022
- 3) combining characters, from ISO 6937
- 4) existence of full/half width characters, from JIS X0201/0208
-
- and more, without solving any problems. And now, you are importing
- one more failure: an incorrect dependence on the locale model.
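-
- Items 2) to 4) can be seen in a short sketch using Python's codecs and
- unicodedata modules. The NFC/NFKC normalization forms used here were
- defined by Unicode later than this post, so this is only an illustration
- of the duplication being described, not of anything available in 1993.

```python
import codecs
import unicodedata


def compose(text):
    """Merge base + combining sequences into precomposed characters (NFC)."""
    return unicodedata.normalize("NFC", text)


def fold_width(text):
    """Map full/half width compatibility characters to ordinary forms (NFKC)."""
    return unicodedata.normalize("NFKC", text)


# 3) combining characters: two code points and one code point, same text
decomposed = "e\u0301"           # 'e' + COMBINING ACUTE ACCENT
precomposed = "\u00e9"           # LATIN SMALL LETTER E WITH ACUTE
print(decomposed == precomposed, compose(decomposed) == precomposed)

# 4) full/half width: the same letter encoded more than once
halfwidth_ka = "\uff76"          # HALFWIDTH KATAKANA LETTER KA
fullwidth_a = "\uff21"           # FULLWIDTH LATIN CAPITAL LETTER A
print(fold_width(halfwidth_ka) == "\u30ab", fold_width(fullwidth_a) == "A")

# 2) signature: the generic UTF-16 codec prepends a byte order mark
has_signature = "x".encode("utf-16")[:2] in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)
print(has_signature)
```

- In each case a plain comparison of code points is not enough; the reader
- has to normalize or interpret the stream first.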
-
- Masataka Ohta
-