- Newsgroups: comp.std.internat
- Path: sparky!uunet!cs.utexas.edu!sun-barr!ames!data.nas.nasa.gov!taligent!tseng
- From: jenkinsj@blowfish.taligent.com (John H. Jenkins)
- Subject: Re: Language tagging
- Message-ID: <jenkinsj-070193102220@tseng.taligent.com>
- Followup-To: comp.std.internat
- Sender: usenet@taligent.com (More Bytes Than You Can Read)
- Organization: Taligent, Inc.
- References: <1336@blue.cis.pitt.edu> <1993Jan3.203017.232@enea.se> <2609@titccy.cc.titech.ac.jp> <1iav6tINNee2@life.ai.mit.edu> <jenkinsj-050193090315@tseng.taligent.com> <2631@titccy.cc.titech.ac.jp>
- Date: Thu, 7 Jan 1993 20:37:14 GMT
- Lines: 100
-
- This is actually a response to three of Ohta-san's postings. I'm
- merging the threads because (1) I think they're germane to each other
- and (2) I, for one, find it confusing to deal with so many threads on
- pretty much the same topic. I'm not creating a new thread because I
- want to avoid starting another long, acrimonious, and pointless debate
- over Han unification.
-
- In <2631@titccy.cc.titech.ac.jp>, he argues:
-
- > <I have a plain [ISO 2022] text of a Japanese commentary
- > <on the Confucian classics. Which parts should I display using a
- > <"Japanese" font and which using a "Chinese" font?
- > <
- > <The answer is: As designated by the escape sequences.
- >
-
- Presumably by allowing you to switch character sets. I'm not aware of
- any 2022 mechanism for switching fonts per se.
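
To make the point concrete, here is a minimal sketch of what a 2022 escape sequence gives a reader of the text: a change of designated character set, nothing more. The table holds G0 designation sequences as I understand them for ASCII, JIS X 0201 Roman, JIS X 0208, GB 2312 and KS C 5601; the function name `split_by_charset` and the sample bytes are illustrative assumptions, not part of any standard or library.

```python
# Sketch only: split an ISO 2022 7-bit stream into runs by designated set.
# The escape sequences select a coded character set; font choice is left
# entirely to whatever software displays the runs.
DESIGNATIONS = {
    b"\x1b(B": "ASCII",
    b"\x1b(J": "JIS X 0201 Roman",
    b"\x1b$@": "JIS X 0208-1978",
    b"\x1b$B": "JIS X 0208-1983",
    b"\x1b$A": "GB 2312-80",
    b"\x1b$(C": "KS C 5601-1987",
}


def split_by_charset(data: bytes):
    """Return (charset name, raw bytes) runs, assuming ASCII at the start."""
    runs = []
    current = "ASCII"
    buf = bytearray()
    i = 0
    while i < len(data):
        if data[i] == 0x1B:
            for seq, name in DESIGNATIONS.items():
                if data.startswith(seq, i):
                    if buf:
                        runs.append((current, bytes(buf)))
                        buf.clear()
                    current = name
                    i += len(seq)
                    break
            else:
                # Anything outside the small table above is not handled here.
                raise ValueError(f"unrecognized escape sequence at offset {i}")
            continue
        buf.append(data[i])
        i += 1
    if buf:
        runs.append((current, bytes(buf)))
    return runs


# Hypothetical sample: ASCII, one JIS X 0208 character, one GB 2312
# character, then back to ASCII.
sample = b"ab" + b"\x1b$B" + b"\x34\x41" + b"\x1b$A" + b"\x3a\x3a" + b"\x1b(B" + b"!"
for charset, raw in split_by_charset(sample):
    print(charset, raw.hex())
```

A display program could choose a font per run from the charset name, but that mapping is the program's policy, not something the escape sequence itself specifies.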
-
- There is, as I understand it, a fundamental assumption about Japanese
- which Ohta-san makes. He feels that Japanese kanji and Chinese hanzi
- (and, presumably, Korean hanja, Vietnamese chu nom, and so on) are
- inherently distinct, the way the Greek alpha and Latin a are. He feels
- this inherent distinction must be maintained in a character encoding.
-
- I don't share this assumption. Unicode doesn't share this assumption.
- The CJK-JRG doesn't share this assumption. The Japanese delegation to
- the CJK-JRG doesn't share this assumption. Japanese lexicographers such
- as the people responsible for the Dai Kanwa dictionary don't share this
- assumption. Japanese printers who, in fact, publish Japanese
- commentaries on the Confucian classics and print *the* *whole* *thing*
- using Japanese fonts don't share this assumption.
-
- If I'm wrong in disagreeing with Ohta-san on this issue, at least I
- have the consolation of being in good company. :-)
-
- (Or am I misinterpreting you here? Are you arguing that, although Han
- unification is possible in theory, on computers, it's best to make the
- distinction between Japanese and Chinese, French and English an inherent
- part of the character code for other reasons?)
-
- In <2638@titccy.cc.titech.ac.jp>, he says:
-
- >
- >The original DIS 10646 was free from CJK unification defect.
- >
- >Then, Unicoders made 10646 unusable ignoring many rational oppositions.
- >
-
- Point of information. Unicode is not responsible for the current
- content of 10646. WG2 is. It changed 10646 so as to be merged with
- Unicode because its member bodies wanted it to. Their feeling was,
- generally, that having the two codes merged was a better alternative
- than having two multilingual character set codes in general use. The
- decision to merge 10646 with Unicode had widespread, international
- support.
-
- BTW, I would agree that there are rational objections to Han unification
- (and other features of Unicode). I feel that there are also rational
- arguments in its favor.
-
- >
- >Unicode, as it is now, is unusable and can't be implemented on a usable
- >internationalized product.
- >
-
- Well, the proof here is in the pudding. Unicode-based products are
- starting to be released. Quite a few will be available by the end of
- the millennium. We'll just have to see how they fly.
-
- Finally, in <2639@titccy.cc.titech.ac.jp>, he argues:
-
- >
- >10646 explicitely assign corresponding C/J/K Han in GB, JIS and KCS
- >national standard to the same code point. Thus, expanding 10646 to 32
- >bit can't re-separate the assignment.
- >
-
- The Japanese delegation to WG2 has advocated creating a swap zone in
- 10646 large enough to allow the swapping in of JIS. This would make it
- possible for Japanese users to maintain an inherent distinction between
- Japanese and other Han-derived text based on code-point alone. This
- suggestion did not meet with much international support, but it would
- still be possible to define a JIS plane, a GB plane, a KSC plane, and so
- on, in UCS-4 which Japanese users could use for a similar purpose. Such
- a mechanism, however, would only be helpful for text created using it.
-
- (I am *not* advocating this -- please nobody bother to tell me it's a
- bad idea. I just want to point out that it is possible.)
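
Purely to illustrate the per-standard plane idea described above, and not as a proposal, here is a rough sketch. It relies on the UCS-4 layout of group, plane, row and cell octets; the plane numbers, the names `HYPOTHETICAL_PLANES`, `tagged_code` and `source_standard`, and the way a 94x94 row/cell position is packed into the row and cell octets are all invented for the example and are not assigned or defined anywhere.

```python
# Sketch only: every plane number below is made up for illustration.
HYPOTHETICAL_PLANES = {"JIS": 0x01, "GB": 0x02, "KSC": 0x03}


def tagged_code(standard: str, row: int, cell: int) -> int:
    """Pack a 94x94 row/cell position into a UCS-4 value on a per-standard plane."""
    if not (1 <= row <= 94 and 1 <= cell <= 94):
        raise ValueError("row and cell must be in 1..94")
    plane = HYPOTHETICAL_PLANES[standard]
    # Group octet is 0; the row and cell octets reuse the 0x21..0x7E byte values.
    return (plane << 16) | ((0x20 + row) << 8) | (0x20 + cell)


def source_standard(code: int):
    """Name the national standard implied by the plane octet, or None."""
    plane = (code >> 16) & 0xFF
    for name, number in HYPOTHETICAL_PLANES.items():
        if number == plane:
            return name
    return None  # e.g. unified text on plane 0: the origin is not recoverable


jis_code = tagged_code("JIS", 20, 33)
print(hex(jis_code), source_standard(jis_code))
print(source_standard(0x6F22))  # a unified code point carries no such tag
```

The last line shows the limitation noted above: text that was created with unified code points carries no plane tag, so nothing about its origin can be recovered after the fact.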
-
- IMHO, the reseparation problem is not unique to Unicode/10646. It is an
- inherent fact of life deriving from two people, using different
- character sets based on different assumptions, trying to talk to each
- other.
-
- ----
- John H. Jenkins
- John_Jenkins@taligent.com
-
-