- Path: sparky!uunet!cs.utexas.edu!sun-barr!sh.wide!wnoc-tyo-news!cs.titech!titccy.cc.titech!necom830!mohta
- From: mohta@necom830.cc.titech.ac.jp (Masataka Ohta)
- Newsgroups: comp.std.internat
- Subject: Re: Radicals Instead of Characters
- Message-ID: <2791@titccy.cc.titech.ac.jp>
- Date: 22 Jan 93 09:44:54 GMT
- References: <1j8kroINNf59@flop.ENGR.ORST.EDU> <ISHIKAWA.93Jan18203811@ds5200.personal-media.co.jp> <1j9sfpINN46t@life.ai.mit.edu> <1jfgq1INNqmn@flop.ENGR.ORST.EDU>
- Sender: news@titccy.cc.titech.ac.jp
- Organization: Tokyo Institute of Technology
- Lines: 36
-
- In article <1jfgq1INNqmn@flop.ENGR.ORST.EDU>
- crowl@jade.CS.ORST.EDU (Lawrence Crowl) writes:
-
- >The question I was asking was "can you _identify_ a han/kanji character
- >based on a sequence of radicals"
-
- No, you can't. Radicals are for indexing only. The rest of the character
- has its own complex shape.
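
To illustrate why a radical works as an index key but not as an identifier, here is a minimal sketch. The table `RADICAL_INDEX` and both helper functions are hypothetical names made up for this example; the handful of characters listed are ones conventionally filed under the water and tree radicals in dictionaries, and a real index would be far larger.

```python
# Hypothetical mini-index: each radical maps to several characters
# that a dictionary files under it.
RADICAL_INDEX = {
    "水": ["河", "海", "湖", "池"],  # water radical (written 氵 on the left side)
    "木": ["林", "森", "松", "桜"],  # tree radical
}


def candidates(radical):
    """Return every character indexed under the given radical."""
    return RADICAL_INDEX.get(radical, [])


def identifies_uniquely(radical):
    """True only if the radical alone pins down a single character."""
    return len(candidates(radical)) == 1


if __name__ == "__main__":
    for radical in RADICAL_INDEX:
        print(radical, candidates(radical), identifies_uniquely(radical))
```

The radical only narrows the search to a group; the remainder of each character still has to be told apart by its own shape.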
-
- >and "would it be reasonable to encode
- >han/kanji on that basis".
-
- Such an encoding is too lengthy.
-
- >Agreed. However, there is no natural size for tables. Tables of size
- >4000 are much cheaper than tables of size 64000.
-
- If you use a radical-based encoding, it makes everything complex.
-
- Moreover, you will need sixteen 4000-entry tables, which together are
- as large as a single 64000-entry table.
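
The arithmetic can be checked with a short sketch. It assumes every table entry has the same size, which the post does not state; the function name `total_entries` is made up for illustration, and the figures 16, 4000 and 64000 are taken from the text as given.

```python
def total_entries(table_count, entries_per_table):
    """Total number of entries across a set of equally sized tables."""
    return table_count * entries_per_table


# Figures from the post: sixteen small tables versus one large table.
split_tables = total_entries(16, 4000)
single_table = total_entries(1, 64000)

if __name__ == "__main__":
    print(split_tables, single_table, split_tables == single_table)
```

Under that assumption, splitting the table saves no space.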
-
- >But, can sixteen bits represent _all_ historical Han characters _and_
- >the historical texts of all other languages? My guess is 16 bits can
- >_if_ Han characters are coded as radicals,
-
- Maybe, or maybe not. Many complex Han characters are simply unique.
-
- >If the level 1 Han characters were also coded as radicals where
- >possible, you'd have a coding system like what I was proposing. Of
- >course, the characters might be several radicals long.
-
- BTW, from a programmer's viewpoint, combining characters are
- just unusable.
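
A rough sketch of the kind of trouble meant here, using a Latin letter with a Unicode combining accent as a stand-in (an assumption for illustration; it is not a radical-based Han encoding). The helper `same_text` is a hypothetical name; `unicodedata.normalize` is the Python standard library call.

```python
import unicodedata

precomposed = "\u00e9"  # "é" stored as a single code point
combined = "e\u0301"    # "e" followed by COMBINING ACUTE ACCENT


def same_text(a, b):
    """Compare two strings after normalizing both to composed form (NFC)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


if __name__ == "__main__":
    # The same visible character has two lengths and compares unequal.
    print(len(precomposed), len(combined), precomposed == combined)
    # Indexing the combined form splits the character from its accent.
    print(repr(combined[0]))
    # Equality holds only after an explicit normalization pass.
    print(same_text(precomposed, combined))
```

Length, indexing, and comparison all stop being simple per-unit operations once one displayed character can span several code units.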
-
- Masataka Ohta
-