NetNews Usenet Archive 1993 #1

Internet Message Format | 1993-01-06 | 1.7 KB

Path: sparky!uunet!spool.mu.edu!darwin.sura.net!sgiblab!nec-gw!nec-tyo!wnoc-tyo-news!cs.titech!titccy.cc.titech!necom830!mohta
From: mohta@necom830.cc.titech.ac.jp (Masataka Ohta)
Newsgroups: comp.std.internat
Subject: Re: Dumb Unicoders
Keywords: Han Kanji Katakana Hirugana ISO10646 Unicode Codepages
Message-ID: <2632@titccy.cc.titech.ac.jp>
Date: 6 Jan 93 22:15:28 GMT
References: <2608@titccy.cc.titech.ac.jp> <TaaXwB1w165w@blues.kk.sub.org>
Sender: news@titccy.cc.titech.ac.jp
Organization: Tokyo Institute of Technology
Lines: 35

In article <TaaXwB1w165w@blues.kk.sub.org>
	kosta@blues.kk.sub.org (Kosta Kostis) writes:

>Well, at least one that is "universal" enough from a Japanese point
>of view ;-), but maybe you're right. So, just to satisfy my curiosity:
>how do you represent 18 or 20 bits with 8-bit octets?

As you want. But, according to Shannon, a variable-length encoding
would be better.

>Wouldn't there be some "wasted" bits?

Why does that matter? Aren't there unoccupied code points in Unicode?

>As I stated before, I think 16-bit UniCode is one step "aside" and one
>into the "right" direction. Does that sound like "fandom"?

Unicode is worse than ISO 2022 in terms of universality.

>> But, as for reg-exp pattern, Vadim's suggestion is quite right.
>
>What do you use reg-exp patterns for?

To search for a pattern in one or more large text files.

>Local versions for that will never go, I believe.

What we need is, of course, a universal version of reg-exp.

>PS: I am *not* a "UniCode"r, but I strongly feel the need for a universal
>    character set and I want it *implemented* in *this* century. :-)

Then throw away Unicode as soon as possible, as we have less than 8 years.

						Masataka Ohta
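
To make the point about putting 18 or 20 bits into 8-bit octets concrete, here is a minimal sketch of one possible variable-length scheme. It is only an illustration, not the encoding proposed in this thread or defined by ISO 2022 or Unicode: it assumes 7 payload bits per octet, least significant group first, with the high bit set on every octet except the last. The names encode_varlen and decode_varlen are hypothetical.

```python
def encode_varlen(code_point: int) -> bytes:
    """Encode a non-negative integer as octets, 7 payload bits each.

    Assumed scheme (illustrative only): least significant group first,
    high bit set on every octet except the final one.
    """
    if code_point < 0:
        raise ValueError("code point must be non-negative")
    out = bytearray()
    while True:
        low7 = code_point & 0x7F
        code_point >>= 7
        if code_point:
            out.append(low7 | 0x80)  # more octets follow
        else:
            out.append(low7)         # final octet, high bit clear
            return bytes(out)


def decode_varlen(data: bytes) -> int:
    """Decode one value produced by encode_varlen."""
    value = 0
    shift = 0
    for octet in data:
        value |= (octet & 0x7F) << shift
        shift += 7
        if not octet & 0x80:
            return value
    raise ValueError("truncated sequence")


if __name__ == "__main__":
    for cp in (0x41, 0x3042, 0xFFFFF):
        encoded = encode_varlen(cp)
        print(hex(cp), len(encoded), "octet(s)", encoded.hex())
```

Under this assumed scheme a 7-bit value takes one octet and a full 20-bit value takes three, so some bits are indeed "wasted" on continuation flags, but values that occur often can be given the short forms.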