NetNews Usenet Archive 1993 #1

Newsgroups: comp.std.internat
Path: sparky!uunet!cs.utexas.edu!sun-barr!ames!data.nas.nasa.gov!taligent!tseng
From: jenkinsj@blowfish.taligent.com (John H. Jenkins)
Subject: Re: Language tagging
Message-ID: <jenkinsj-070193102220@tseng.taligent.com>
Followup-To: comp.std.internat
Sender: usenet@taligent.com (More Bytes Than You Can Read)
Organization: Taligent, Inc.
References: <1336@blue.cis.pitt.edu> <1993Jan3.203017.232@enea.se> <2609@titccy.cc.titech.ac.jp> <1iav6tINNee2@life.ai.mit.edu> <jenkinsj-050193090315@tseng.taligent.com> <2631@titccy.cc.titech.ac.jp>
Date: Thu, 7 Jan 1993 20:37:14 GMT
Lines: 100

This is actually a response to three of Ohta-san's postings.  I'm
merging the threads because (1) I think they're germane to each other
and (2) I, for one, find it confusing to deal with so many threads on
pretty much the same topic.  I'm not creating a new thread because I
want to avoid starting another long, acrimonious, and pointless debate
over Han unification.

In <2631@titccy.cc.titech.ac.jp>, he argues:

> <I have a plain [ISO 2022] text of a Japanese commentary
> <on the Confucian classics.  Which parts should I display using a
> <"Japanese" font and which using a "Chinese" font?
> <
> <The answer is: As designated by the escape sequences.
>

Presumably by allowing you to switch character sets.  I'm not aware of
any 2022 mechanism for switching fonts per se.

There is, as I understand it, a fundamental assumption about Japanese
which Ohta-san makes.  He feels that Japanese kanji and Chinese hanzi
(and, presumably, Korean hanja, Vietnamese chu nom, and so on) are
inherently distinct, the way the Greek alpha and Latin a are.  He feels
this inherent distinction must be maintained in a character encoding.

I don't share this assumption.  Unicode doesn't share this assumption.
The CJK-JRG doesn't share this assumption.  The Japanese delegation to
the CJK-JRG doesn't share this assumption.
Japanese lexicographers such as the people responsible for the Dai
Kanwa dictionary don't share this assumption.  Japanese printers who,
in fact, publish Japanese commentaries on the Confucian classics and
print *the* *whole* *thing* using Japanese fonts don't share this
assumption.

If I'm wrong in disagreeing with Ohta-san on this issue, at least I
have the consolation of being in good company.  :-)

(Or am I misinterpreting you here?  Are you arguing that, although Han
unification is possible in theory, on computers, it's best to make the
distinction between Japanese and Chinese, French and English an
inherent part of the character code for other reasons?)

In <2638@titccy.cc.titech.ac.jp>, he says:

>
>The original DIS 10646 was free from CJK unification defect.
>Then, Unicoders made 10646 unusable ignoring many rational oppositions.
>

Point of information.  Unicode is not responsible for the current
content of 10646.  WG2 is.  It changed 10646 so as to be merged with
Unicode because its member bodies wanted it to.  Their feeling was,
generally, that having the two codes merged was a better alternative
than having two multilingual character set codes in general use.  The
decision to merge 10646 with Unicode had widespread, international
support.

BTW, I would agree that there are rational objections to Han
unification (and other features of Unicode).  I feel that there are
also rational arguments in its favor.

>
>Unicode, as it is now, is unusable and can't be implemented on a usable
>internationalized product.
>

Well, the proof here is in the pudding.  Unicode-based products are
starting to be released.  Quite a few will be available by the end of
the millennium.  We'll just have to see how they fly.

Finally, in <2639@titccy.cc.titech.ac.jp>, he argues:

>
>10646 explicitely assign corresponding C/J/K Han in GB, JIS and KCS national
>standard to the same code point. Thus, expanding 10646 to 32 bit can't
>re-separate the assignment.
>

The Japanese delegation to WG2 has advocated creating a swap zone in
10646 large enough to allow the swapping in of JIS.  This would make it
possible for Japanese users to maintain an inherent distinction between
Japanese and other Han-derived text based on code-point alone.

This suggestion did not meet with much international support, but it
would still be possible to define a JIS plane, a GB plane, a KSC plane,
and so on, in UCS-4 which Japanese users could use for a similar
purpose.  Such a mechanism, however, would only be helpful for text
created using it.  (I am *not* advocating this -- please nobody bother
to tell me it's a bad idea.  I just want to point out that it is
possible.)

IMHO, the reseparation problem is not unique to Unicode/10646.  It is
an inherent fact of life deriving from two people, using different
character sets based on different assumptions, trying to talk to each
other.

----
John H. Jenkins
John_Jenkins@taligent.com