- Path: sparky!uunet!cs.utexas.edu!usc!sdd.hp.com!think.com!enterpoop.mit.edu!mintaka.lcs.mit.edu!ai-lab!wheat-chex!glenn
- From: glenn@wheat-chex.ai.mit.edu (Glenn A. Adams)
- Newsgroups: comp.std.internat
- Subject: Re: Dumb Americans (was INTERNATIONALIZATION: JAPAN, FAR EAST)
- Date: 9 Jan 1993 17:35:04 GMT
- Organization: MIT Artificial Intelligence Laboratory
- Lines: 42
- Message-ID: <1in2c8INNmbj@life.ai.mit.edu>
- References: <1993Jan7.033153.12133@fcom.cc.utah.edu> <1993Jan8.092754.6344@prl.dec.com> <1993Jan9.024546.26934@fcom.cc.utah.edu>
- NNTP-Posting-Host: wheat-chex.ai.mit.edu
- Keywords: Unicode ISO10646 CharacterEncoding
-
- In article <1993Jan9.024546.26934@fcom.cc.utah.edu> you write:
- >[ First a clarification of something which is my fault because of my
- > background in comm software: I have been informed that the currently
- > "blessed" correct terminology for what I have been calling "Runic
- > encoding" is "Process code", "File code", or "Interchange code". I'll
- > try to call it "Interchange code" from now on (I feel the other terms
- > imply applications, some of which I disagree with). ]
-
- I should have been more clear. A "process code" is a fixed-width encoding
- suitable for internal processing, e.g., ASCII, Unicode, 10646 UCS2,
- 10646 UCS4, or EUC wide char; a "file code" or "interchange code" is a
- potentially variable-length encoding suitable for file storage (in
- non-memory-mapped environments) or interchange, e.g., UTF1, UTF2 (FSS-UTF),
- Shift JIS, or EUC Multibyte.
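
To make the distinction concrete, here is a minimal Python sketch that reports how many bytes each character occupies under a few encodings. It assumes Python's `utf-8` codec as a stand-in for UTF2 (FSS-UTF), of which it is the modern descendant, and `utf-16-be` / `utf-32-be` as stand-ins for UCS2 / UCS4 (equivalent only because the sample characters all lie in the Basic Multilingual Plane). The helper name `byte_widths` is invented for illustration.

```python
def byte_widths(text, codec):
    """Return the encoded length in bytes of each character of `text`."""
    return [len(ch.encode(codec)) for ch in text]


# Sample: Latin "a", "e" with acute accent (U+00E9), and a CJK ideograph (U+65E5).
sample = "a\u00e9\u65e5"

# Interchange code: width varies from character to character.
print(byte_widths(sample, "utf-8"))      # [1, 2, 3]

# Process codes: every character has the same width.
print(byte_widths(sample, "utf-16-be"))  # [2, 2, 2]  (UCS2 stand-in)
print(byte_widths(sample, "utf-32-be"))  # [4, 4, 4]  (UCS4 stand-in)
```
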
-
- [My objection to your use of the word "rune" was (1) you weren't clear
- about which of these encodings you were referring to, and (2) I hate
- cute terminology which is opaque when perfectly transparent terminology
- already exists.]
-
- One should not in general use an interchange code (UTF1 or UTF2) for
- processing. While one may use a process code for interchange, some
- communication channels may have difficulties with data transparency
- (e.g., Unicode and 10646 UCS[24] allow NUL bytes and ISO2022 C0/C1 control
- code bytes in any byte position of their "process codes").
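
The transparency problem can be seen by looking at the raw bytes. The sketch below is only an illustration under stated assumptions: it again uses `utf-16-be` as a stand-in for UCS2 and `utf-8` as a stand-in for UTF2 (FSS-UTF), checks only for C0 bytes (values below 0x20, including NUL), and the helper name `c0_bytes` is hypothetical.

```python
def c0_bytes(text, codec):
    """Return the byte values below 0x20 (NUL and other C0 controls)
    that appear in the encoded form of `text`."""
    return [b for b in text.encode(codec) if b < 0x20]


# "A" followed by U+010A (capital C with dot above); neither is a control character.
sample = "A\u010a"

# UCS2 stand-in: bytes are 00 41 01 0A, so a NUL, an SOH, and an LF appear.
print(c0_bytes(sample, "utf-16-be"))  # [0, 1, 10]

# UTF2 stand-in: bytes are 41 C4 8A, so no C0 bytes appear.
print(c0_bytes(sample, "utf-8"))      # []
```

A channel that treats NUL as a terminator or LF as a record separator would therefore mangle the first byte stream but pass the second.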
-
- I can't imagine why anyone in their right mind would want to use UTF[12]
- or any other ostensible interchange code for processing, given the problems
- of variable length encodings. However, that doesn't mean that unaware
- applications can't effectively use an interchange code internally, e.g.,
- 8-bit clean applications which interpret only the ASCII (ISO646) characters
- could use UTF2 (FSS-UTF) without difficulty. But if one is to create
- an aware application which uses more than the ASCII subset, or if it is
- to memory map files, then use of a fixed-width process code (even for backing
- store) becomes much more sensible.
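
Both halves of this argument can be sketched in a few lines of Python. The assumptions are the same as before (`utf-8` standing in for UTF2/FSS-UTF, `utf-16-be` for UCS2 with Basic Multilingual Plane characters only), and the function names and the `key=value;key=value` record layout are invented for illustration. An ASCII-only parser can split the interchange form safely because no byte below 0x80 ever occurs inside a multi-byte sequence, whereas reaching the n-th character requires a scan in the variable-length form and only a multiplication in the fixed-width form.

```python
def split_fields(record):
    """ASCII-only parsing of raw bytes: the parser interprets only ';' and '='
    and passes every other byte through untouched."""
    return dict(field.split(b"=", 1) for field in record.split(b";"))


def nth_char_fixed(data, n, width=2):
    """Fixed-width process code: the n-th character starts at offset n * width."""
    return data[n * width:(n + 1) * width].decode("utf-16-be")


def nth_char_variable(data, n):
    """Variable-length code: scan for lead bytes (anything that is not a
    10xxxxxx continuation byte) to locate the n-th character."""
    starts = [i for i, b in enumerate(data) if (b & 0xC0) != 0x80]
    end = starts[n + 1] if n + 1 < len(starts) else len(data)
    return data[starts[n]:end].decode("utf-8")


record = "name=\u65e5\u672c;city=Z\u00fcrich".encode("utf-8")
fields = split_fields(record)
print(fields[b"city"].decode("utf-8"))  # Zürich

text = "Z\u00fcrich\u65e5"
print(nth_char_fixed(text.encode("utf-16-be"), 1))  # ü
print(nth_char_variable(text.encode("utf-8"), 6))   # 日
```
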
-
- Glenn Adams