NetNews Usenet Archive 1993 #1


Text File | 1993-01-06 | 5.0 KB | 122 lines

Path: sparky!uunet!math.fu-berlin.de!mailgzrz.TU-Berlin.DE!news.netmbx.de!Germany.EU.net!incom!kostis!blues!kosta
From: kosta@blues.kk.sub.org (Kosta Kostis)
Newsgroups: comp.std.internat
Subject: Re: Dumb Unicoders
Summary: Hit someone else, please or better don't hit anyone
Keywords: Han Kanji Katakana Hirugana ISO10646 Unicode Codepages
Message-ID: <TaaXwB1w165w@blues.kk.sub.org>
Date: Tue, 05 Jan 93 22:16:52 MET
References: <2608@titccy.cc.titech.ac.jp>
Organization: The Blues Family
Lines: 109

mohta@necom830.cc.titech.ac.jp (Masataka Ohta) writes:

> In article <DwqPwB3w165w@blues.kk.sub.org>
> kosta@blues.kk.sub.org (Kosta Kostis) writes:
>
> >I agree that UniCode is not very good for Asian languages, but
> >for European languages (and some more) it's really OK.
>
> Then, could you get out of here, comp.std.internat?

Masataka-san, if you are unable or unwilling to understand my
posting, please refrain from pointing your publicly known flame
thrower at me. Thank you very much, indeed.

> >10 bits? You want to keep Asian languages out of the game, too? :-)
>
> Or, you want to keep Asian languages out of the game, too? :-|

Iie, soo dewa arimasen. Is it possible that you have become a little
"paranoid" during your "fight" against 16-bit UniCode? =;^)

What makes you believe that I want to keep Asian languages "out, too"?
It's the other way round. My statement was that UniCode supports
Latin-like languages well while Asian languages are *not* covered
well. Does that imply I *like* UniCode? No, so please calm down.
Arigatoo gozaimasu. :-)

> >Should
> >we ever decide to use the full 32-bits ISO 10646 intended to
> >allocate for character codes, we should be able to cover almost
> >all languages (to some extend).
>
> Full 32 bit? 18 or 20 bits will be suffice for the real universal
> character code set.

Well, at least one that is "universal" enough from a Japanese point
of view ;-), but maybe you're right.
So, just to satisfy my curiosity: how do you represent 18 or 20 bits
with 8-bit octets? Wouldn't there be some "wasted" bits? Please
don't let us make the same mistake that was done when SMTP was
designed to cut off the MSB when discussing a "universal" character
set.

> >Who defines "required"? This is a classical "don't care about the
> >users culture/needs" approach that was so harmfull in the past.
>
> ISO 8859-1 is not very good for all European languages, but
> for central European languages (and some more) it's really OK.

Right, but there are more character sets needed like ISO 8859-7 for
Greek (I am a Greek), ISO 8859-5 for Cyrillic (although Vadim doesn't
like it), ISO 8859-8 for Hebrew, ISO 8859-6 for Arabic and many more.

I want to be able to compose a document containing all the languages,
many more *including* the Asian languages. That's what leads us to a
"universal" character set. Note that I have never claimed that 16-bit
UniCode *is* the "universal" character set I'm looking for. As I
stated before, I think 16-bit UniCode is one step "aside" and one
into the "right" direction. Does that sound like "fandom"?

> Then, you wrote:
>
>     I agree that UniCode is not very good for Asian languages, but
>     for European languages (and some more) it's really OK.
>
> What was so harmful in the past? What you are repeating?

Well, the same thing that all non-English speaking people suffered
and still suffer from because of the fact that the U.S. were and
still are leading and ruling the computer industry. Don't get me
wrong, I have no objections against the U.S., they just ignored the
needs of their international customers for much too long.

We suffer(ed) from systems that don't support anything other than
ASCII or at least nothing more than *one* 8-bit character set (like
ISO 8859-1, which is better (for me) than US ASCII but not enough).
In Japan you have your own local solution which may be better for
you than 16-bit UniCode, but that isn't "universal" either. I think
we can agree we all should want to have a "universal" character set.
At least as a basis for *international* data/document interchange.

> >The other languages you stated, like Russian, Greek and English
> >seem to be served well by UniCode, I think.
>
> But, as for reg-exp pattern, Vadim's suggestion is quite right.

What do you use reg-exp patterns for? I think it's irrelevant or at
least no real problem. Local versions for that will never go, I
believe. Do you know any broadly used computer language that supports
more than US ASCII in qualifiers, names, etc.? :-)

If you want to stay with two "alphabets" fitting in 8-bit, you can
always convert your ISO 10646 data to KOI-8 (or whatever) and use
your old programs. Right, Vadim? ;-)

Kosta

PS: I am *not* a "UniCode"r, but I strongly feel the need for a
    universal character set and I want it *implemented* in *this*
    century. :-)

--
Kosta Kostis, Talstrasse 25, D-6074 Roedermark 3, Germany
kosta@blues.kk.sub.org (home)
sw authors: please support ISO 8859-x!  äöüÄÖÜß = aeoeueAEOEUEss
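
The suggestion above of converting ISO 10646 data down to KOI-8 for
old 8-bit programs can be sketched roughly as follows. This is only
an illustration under stated assumptions: it uses Python and its
built-in "koi8_r" codec as one concrete KOI-8 variant, and the helper
name to_koi8 is hypothetical. None of this comes from the post itself.

```python
def to_koi8(text: str) -> bytes:
    """Encode universal-character-set text as KOI8-R octets.

    Assumption: KOI8-R stands in for "KOI-8 (or whatever)".
    Characters that KOI8-R cannot represent become '?'.
    """
    return text.encode("koi8_r", errors="replace")


# Latin and Cyrillic both fit: one octet per character, and the
# conversion can be reversed without loss.
sample = "Привет, Vadim"
octets = to_koi8(sample)
print(len(sample), len(octets))
print(octets.decode("koi8_r") == sample)

# Anything outside the two "alphabets" is lost, here Greek.
print(to_koi8("Καλημέρα"))
```

The last call shows the price of staying with one 8-bit set: text in
a third script does not survive the conversion.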