- Path: sparky!uunet!gatech!usenet.ins.cwru.edu!agate!ames!sun-barr!sh.wide!wnoc-tyo-news!cs.titech!titccy.cc.titech!necom830!mohta
- From: mohta@necom830.cc.titech.ac.jp (Masataka Ohta)
- Newsgroups: comp.std.internat
- Subject: Re: Dumb Americans (was INTERNATIONALIZATION: JAPAN, FAR EAST)
- Keywords: Unicode ISO10646 CharacterEncoding
- Message-ID: <2675@titccy.cc.titech.ac.jp>
- Date: 10 Jan 93 03:29:10 GMT
- References: <1993Jan7.033153.12133@fcom.cc.utah.edu> <1993Jan8.092754.6344@prl.dec.com> <1993Jan9.024546.26934@fcom.cc.utah.edu> <1in2c8INNmbj@life.ai.mit.edu>
- Sender: news@titccy.cc.titech.ac.jp
- Organization: Tokyo Institute of Technology
- Lines: 41
-
- In article <1in2c8INNmbj@life.ai.mit.edu>
- glenn@wheat-chex.ai.mit.edu (Glenn A. Adams) writes:
-
- >One should not in general use an interchange code (UTF1 or UTF2) for
- >processing. While one may use a process code for interchange,
-
- It is the opposite.
-
- One can use interchange code for processing if it is convenient.
-
- There is no problem in doing so, as each application knows every detail
- of the interchange code's format.
-
- On the other hand, one can't use a process code for interchange, unless
- one lives in a closed world, because other applications won't
- accept it.
-
- >(e.g., Unicode and 10646 UCS[24] allow NULL bytes and ISO2022 C0/C1 control
- >code bytes in any byte position of their "process codes").
-
- That is one of its famous fatal design flaws.
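
A minimal sketch of the NUL-byte issue in the quoted remark, in Python. It assumes the `utf-16-be` codec can stand in for UCS-2 (the two agree for these characters) and the `utf-8` codec for UTF-2; `c_strlen` is a hypothetical helper that mimics how a NUL-terminated string routine measures a buffer.

```python
def c_strlen(buf: bytes) -> int:
    """Return the length a C-style, NUL-terminated string routine would see."""
    end = buf.find(b"\x00")
    return len(buf) if end == -1 else end


text = "ABC"

# Two octets per character; the high octet of every ASCII character is NUL.
ucs2 = text.encode("utf-16-be")

# Assumption: Python's utf-8 codec stands in for UTF-2 here.
# ASCII characters stay single octets and no NUL octet appears.
utf2 = text.encode("utf-8")

print(ucs2, c_strlen(ucs2))
print(utf2, c_strlen(utf2))
```

The fixed-width buffer looks empty to any code that stops at the first NUL octet, while the byte-oriented encoding passes through unchanged.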
-
- >I can't imagine why anyone in their right mind would want to use UTF[12]
- >or any other ostensible interchange code for processing, given the problems
- >of variable length encodings.
-
- Maybe you can't. But in many cases, variable length is no problem.
-
- >But if one is to create
- >an aware application which uses more than the ASCII subset, or if it is
- >to memory map files, then use of a fixed-width process code (even for backing
- >store) becomes much more sensible.
-
- You must be crazy. If you are mapping a file in a possibly networked
- environment (these days, all environments are), you can't use a
- multiple-octet fixed-width code because of endianness. Don't say "signature",
- because it makes everything complex and, thus, slow, and your reason for
- mapping files should be efficiency and simplicity. BTW, file mapping
- is neither efficient nor simple, but that is another topic.
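
The endianness point can be sketched as follows, again in Python and again assuming the `utf-16-be`/`utf-16-le` codecs as stand-ins for a two-octet fixed-width code. `byte_order` is a hypothetical helper showing the extra step a signature forces on every reader of a shared file.

```python
import codecs

text = "A\u65e5"  # 'A' followed by U+65E5

big = text.encode("utf-16-be")     # octets as written by a big-endian host
little = text.encode("utf-16-le")  # same text as written by a little-endian host

# A little-endian reader that maps the big-endian file sees different characters.
misread = big.decode("utf-16-le")


def byte_order(buf: bytes) -> str:
    """Guess the byte order from a leading signature, if one is present."""
    if buf.startswith(codecs.BOM_UTF16_BE):
        return "big"
    if buf.startswith(codecs.BOM_UTF16_LE):
        return "little"
    return "unknown"


print(big != little, misread == text)
print(byte_order(codecs.BOM_UTF16_BE + big), byte_order(big))
```

Without the signature the reader cannot tell which order was used, and with it every consumer has to inspect and possibly byte-swap the data before the mapped file is usable.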
-
- Masataka Ohta
-