Path: sparky!uunet!spool.mu.edu!darwin.sura.net!sgiblab!nec-gw!nec-tyo!wnoc-tyo-news!cs.titech!titccy.cc.titech!necom830!mohta
From: mohta@necom830.cc.titech.ac.jp (Masataka Ohta)
Newsgroups: comp.std.internat
Subject: Re: Dumb Unicoders
Keywords: Han Kanji Katakana Hiragana ISO10646 Unicode Codepages
Message-ID: <2632@titccy.cc.titech.ac.jp>
Date: 6 Jan 93 22:15:28 GMT
References: <2608@titccy.cc.titech.ac.jp> <TaaXwB1w165w@blues.kk.sub.org>
Sender: news@titccy.cc.titech.ac.jp
Organization: Tokyo Institute of Technology
Lines: 35

In article <TaaXwB1w165w@blues.kk.sub.org>
kosta@blues.kk.sub.org (Kosta Kostis) writes:

>Well, at least one that is "universal" enough from a Japanese point
>of view ;-), but maybe you're right. So, just to satisfy my curiosity:
>how do you represent 18 or 20 bits with 8-bit octets?

Any way you like. But, according to Shannon, a variable-length encoding
would be better.

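For illustration only, here is a minimal sketch of one possible
variable-length octet encoding of a roughly 20-bit code space (a
hypothetical scheme, not FSS-UTF and not any concrete proposal from this
thread), written in C: small, frequent code points take one octet,
larger ones two or three, with the high bit of each octet serving as a
continuation flag.

#include <stdio.h>

/*
 * Hypothetical scheme, for illustration: encode a code point below
 * 2^21 into 1-3 octets, 7 payload bits per octet, with the high bit
 * set on every octet except the last as a "more octets follow" flag.
 */
static int encode(unsigned long cp, unsigned char *out)
{
    if (cp < 0x80UL) {                      /* up to 7 bits: 1 octet   */
        out[0] = (unsigned char) cp;
        return 1;
    } else if (cp < 0x4000UL) {             /* up to 14 bits: 2 octets */
        out[0] = (unsigned char) (0x80 | (cp >> 7));
        out[1] = (unsigned char) (cp & 0x7F);
        return 2;
    } else {                                /* up to 21 bits: 3 octets */
        out[0] = (unsigned char) (0x80 | (cp >> 14));
        out[1] = (unsigned char) (0x80 | ((cp >> 7) & 0x7F));
        out[2] = (unsigned char) (cp & 0x7F);
        return 3;
    }
}

int main(void)
{
    unsigned char buf[3];
    int i, n;

    n = encode(0x2632AUL, buf);             /* an arbitrary 18-bit value */
    for (i = 0; i < n; i++)
        printf("%02X ", buf[i]);
    printf("\n");
    return 0;
}

An 18- or 20-bit value then costs at most three octets, while the more
frequent 7- and 14-bit values cost less.
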
>Wouldn't there be some "wasted" bits?

How does that matter? Aren't there unoccupied code points in Unicode?

>As I stated before, I think 16-bit UniCode is one step "aside" and one
>in the "right" direction. Does that sound like "fandom"?

As for universality, Unicode is worse than ISO 2022.

>> But, as for reg-exp pattern, Vadim's suggestion is quite right.
>
>What do you use reg-exp patterns for?

To search for patterns in large text files.

>Local versions for that will never go, I believe.

What we need is, of course, a universal version of reg-exp.

>PS: I am *not* a "UniCode"r, but I strongly feel the need for a universal
>    character set and I want it *implemented* in *this* century. :-)

Then throw away Unicode as soon as possible, as we have less than 8 years left.

Masataka Ohta