- Path: sparky!uunet!math.fu-berlin.de!mailgzrz.TU-Berlin.DE!news.netmbx.de!Germany.EU.net!incom!kostis!blues!kosta
- From: kosta@blues.kk.sub.org (Kosta Kostis)
- Newsgroups: comp.std.internat
- Subject: Re: Dumb Unicoders
- Summary: Hit someone else, please or better don't hit anyone
- Keywords: Han Kanji Katakana Hirugana ISO10646 Unicode Codepages
- Message-ID: <TaaXwB1w165w@blues.kk.sub.org>
- Date: Tue, 05 Jan 93 22:16:52 MET
- References: <2608@titccy.cc.titech.ac.jp>
- Organization: The Blues Family
- Lines: 109
-
- mohta@necom830.cc.titech.ac.jp (Masataka Ohta) writes:
-
- > In article <DwqPwB3w165w@blues.kk.sub.org>
- > kosta@blues.kk.sub.org (Kosta Kostis) writes:
- >
- > >I agree that UniCode is not very good for Asian languages, but
- > >for European languages (and some more) it's really OK.
- >
- > Then, could you get out of here, comp.std.internat?
-
- Masataka-san, if you are unable or unwilling to understand my posting,
- please refrain from pointing your publicly known flame thrower at me.
- Thank you very much, indeed.
-
- > >10 bits? You want to keep Asian languages out of the game, too? :-)
- >
- > Or, you want to keep Asian languages out of the game, too? :-|
-
- Iie, soo dewa arimasen.
-
- Is it possible that you have become a little "paranoid" during
- your "fight" against 16-bit UniCode? =;^)
-
- What makes you believe that I want to keep Asian languages "out, too"?
- It's the other way round.
-
- My statement was that UniCode supports Latin-like languages well while
- Asian languages are *not* covered well. Does that imply I *like* UniCode?
- No, so please calm down. Arigatoo gozaimasu. :-)
-
- > >Should
- > >we ever decide to use the full 32-bits ISO 10646 intended to
- > >allocate for character codes, we should be able to cover almost
- > >all languages (to some extend).
- >
- > Full 32 bit? 18 or 20 bits will be suffice for the real universal
- > character code set.
-
- Well, at least one that is "universal" enough from a Japanese point
- of view ;-), but maybe you're right. So, just to satisfy my curiosity:
- how do you represent 18 or 20 bits with 8-bit octets?
- Wouldn't there be some "wasted" bits?
- Please let us not make the same mistake that was made when SMTP was
- designed to cut off the MSB when discussing a "universal" character set.
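
To make the arithmetic behind that question concrete, here is a minimal Python sketch. It assumes the simplest possible layout, a fixed number of whole octets per character with no packing across character boundaries; the function names are hypothetical and chosen only for illustration. It also shows what stripping the most significant bit, as a 7-bit mail path would, does to ISO 8859-1 text.

```python
import math

def octets_needed(bits: int) -> int:
    """Whole 8-bit octets required to hold one code of the given width."""
    return math.ceil(bits / 8)

def wasted_bits(bits: int) -> int:
    """Unused bits per character when each code gets its own whole octets."""
    return octets_needed(bits) * 8 - bits

def strip_msb(data: bytes) -> bytes:
    """Clear the top bit of every octet, as a 7-bit transport would."""
    return bytes(b & 0x7F for b in data)

if __name__ == "__main__":
    for width in (18, 20, 32):
        print(width, "bits:", octets_needed(width), "octets,",
              wasted_bits(width), "bits unused")

    # ISO 8859-1 text containing a character above 0x7F (o-umlaut, 0xF6).
    original = "Rödermark".encode("iso-8859-1")
    print(original, "->", strip_msb(original))
```

Under this assumption an 18-bit code leaves 6 of 24 bits unused and a 20-bit code leaves 4, while a 32-bit code fills its octets exactly. Other layouts (bit packing, variable-length encodings) would give different numbers.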
-
- > >Who defines "required"? This is a classical "don't care about the
- > >users culture/needs" approach that was so harmfull in the past.
- >
- > ISO 8859-1 is not very good for all European languages, but
- > for central European languages (and some more) it's really OK.
-
- Right, but there are more character sets needed like ISO 8859-7 for
- Greek (I am a Greek), ISO 8859-5 for Cyrillic (although Vadim doesn't
- like it), ISO 8859-8 for Hebrew, ISO 8859-6 for Arabic and many more.
-
- I want to be able to compose a document containing all the languages,
- many more *including* the Asian languages. That's what leads us to a
- "universal" character set. Note that I have never claimed that 16-bit
- UniCode *is* the "universal" character set I'm looking for.
-
- As I stated before, I think 16-bit UniCode is one step "aside" and one
- in the "right" direction. Does that sound like "fandom"?
-
- > Then, you wrote:
- >
- > I agree that UniCode is not very good for Asian languages, but
- > for European languages (and some more) it's really OK.
- >
- > What was so harmful in the past? What you are repeating?
-
- Well, the same thing that all non-English speaking people suffered
- and still suffer from because of the fact that the U. S. were and
- still are leading and ruling the computer industry.
- Don't get me wrong, I have no objections against the U. S., they
- just ignored the needs of their international customers for much
- too long.
-
- We suffer(ed) from systems that don't support anything other than
- ASCII or at least nothing more than *one* 8-bit character set (like
- ISO 8859-1, which is better (for me) than US ASCII but not enough).
- In Japan you have your own local solution, which may be better
- for you than 16-bit UniCode, but that isn't "universal" either.
-
- I think we can agree we all should want to have a "universal" character
- set. At least as a basis for *international* data/document interchange.
-
- > >The other languages you stated, like Russian, Greek and English
- > >seem to be served well by UniCode, I think.
- >
- > But, as for reg-exp pattern, Vadim's suggestion is quite right.
-
- What do you use reg-exp patterns for? I think it's irrelevant or at
- least no real problem. Local versions for that will never go, I believe.
- Do you know any broadly used computer language that supports more than
- US ASCII in qualifiers, names, etc.? :-)
-
- If you want to stay with two "alphabets" fitting in 8-bit, you can
- always convert your ISO 10646 data to KOI-8 (or whatever) and use
- your old programs. Right, Vadim? ;-)
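
As a rough illustration of that conversion step, the sketch below uses Python's built-in `koi8_r` codec as a stand-in for "KOI-8 (or whatever)" and a Python `str` as a stand-in for ISO 10646 data. Both are assumptions made for the example, and `to_koi8` is a hypothetical helper name. Characters with no KOI8-R equivalent are replaced with `?`, which is the price of going back to an 8-bit set.

```python
def to_koi8(text: str) -> bytes:
    """Encode text as KOI8-R, substituting '?' for anything it cannot hold."""
    return text.encode("koi8_r", errors="replace")

if __name__ == "__main__":
    # Cyrillic plus ASCII fits: one octet per character, lossless round trip.
    sample = "Привет, Vadim"
    encoded = to_koi8(sample)
    print(len(sample), "characters ->", len(encoded), "octets")
    print(encoded.decode("koi8_r"))

    # Greek letters are not in KOI8-R, so they are lost in the conversion.
    mixed = "Привет, Κώστα"
    print(to_koi8(mixed).decode("koi8_r"))
```

The first string survives unchanged, while the Greek name in the second comes back as question marks, so the old programs keep working only for text that stays within the two "alphabets".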
-
- Kosta
-
- PS: I am *not* a "UniCode"r, but I strongly feel the need for a universal
- character set and I want it *implemented* in *this* century. :-)
-
- --
- Kosta Kostis, Talstrasse 25, D-6074 Roedermark 3, Germany
- kosta@blues.kk.sub.org (home)
- sw authors: please support ISO 8859-x! äöüÄÖÜß = aeoeueAEOEUEss
-