NetNews Usenet Archive 1993 #3

Internet Message Format | 1993-01-25 | 4.8 KB

Path: sparky!uunet!usc!sol.ctr.columbia.edu!destroyer!gumby!yale!mintaka.lcs.mit.edu!ai-lab!wheat-chex!glenn
From: glenn@wheat-chex.ai.mit.edu (Glenn A. Adams)
Newsgroups: comp.std.internat
Subject: Re: Alphabets
Date: 25 Jan 1993 15:12:14 GMT
Organization: MIT Artificial Intelligence Laboratory
Lines: 81
Message-ID: <1k100eINNs9n@life.ai.mit.edu>
References: <1993Jan24.172323.2706@enea.se> <1jutusINNlfa@life.ai.mit.edu> <8719@charon.cwi.nl>
NNTP-Posting-Host: wheat-chex.ai.mit.edu

In article <8719@charon.cwi.nl> dik@cwi.nl (Dik T. Winter) writes:
>"What is Unicode encoding?". Scripts? Writing system?

Unicode encodes scripts, and not writing systems (alphabets).

>It is not encoding the Latin script I think. Consider for instance the
>German writing systems that have been used. Fraktur. Is that a different
>font?

Fraktur is a different typeface.

>Suetterlin. Is that a different font? Many would think the latter not
>predominantly derived from the symbols used in the Roman alphabet. I see
>them as being more derived from the Germanic Runes. Still, there is a 1<->1
>corespondence between the symbols in the Suetterlin script and the German
>version of the Latin script (I think).

I assume you refer to the written form developed by Ludwig Suetterlin
(1865-1917). I don't have any detailed information on it, so I can't
say for sure. Without knowing any details, I would be willing to say
it was a distinct script to the extent that Suetterlin created new
forms or even borrowed forms from other scripts, perhaps modifying them
in the process. Aside from the issue of encoding utility, I would say
that abstracting the forms of two or more alphabets into a single
script should take into account historical derivation, formal
similarity, and perhaps even functional similarity, although I would
give the last much less priority than the former two criteria.

>I think you should add that unification of different scripts is possible
>iff the scripts can be viewed as just being font changes (although the
>derivation of the scripts can be completely different).

I think you may be confusing "script" as I am using it with
"handwriting form" or possibly "written form." Clearly the latter would
be a matter of only font changes, and nothing more. However, I am using
"script" in a different way, namely, to capture the notion of a
collection of abstract symbols which tend to have a fairly clear
historical relationship, which still bear a fairly strong resemblance
in form, and which are used to represent fairly similar functions. In
addition, such a script -- as an artificial construct -- can include
elements which violate these criteria to some degree.

The general process used in Unicode is to identify an alphabet (i.e.,
the symbols used in a particular writing system) with some historically
known collection of symbols (a script), attempt to unify the alphabet
with this collection, and then, to the extent that the unification is
successful and doesn't interfere with basic processing tasks, replace
the script with the (unified) union of the original script and the
forms of the new alphabet. This produces a new script which may have
forms in it that were not in the original script; e.g., if you look at
Unicode Latin blocks, you will find elements like:

  LATIN SMALL LETTER THORN,
  LATIN SMALL LETTER EXCLAMATION MARK,
  LATIN SMALL LETTER BARRED LAMBDA,
  LATIN CAPITAL LETTER YOGH,
  LATIN LETTER TWO BAR,
  LATIN CAPITAL LETTER TONE FIVE,

and so on. Clearly none of these are members of the Roman alphabet, or
are even close. In these cases, alphabets which were largely derived
from the core symbols of the Latin script, i.e., those in the Roman
alphabet, were unified with the Latin script, resulting in unification
of some forms, and the addition of other, novel forms which were
innovations in the alphabets being unified.
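
The unification step described above can be sketched as a set union.
This is only a toy model, not how the Unicode repertoire was actually
compiled: the function name unify, the 26-letter "core" script, and the
partial Icelandic sample are all assumptions made for illustration.

```python
import string
import unicodedata


def unify(script, alphabet):
    """Merge an alphabet into a script.

    Returns the enlarged script and the set of forms that were not in
    the script before (the "novel" forms).
    """
    novel = alphabet - script
    return script | alphabet, novel


# Assumption: the core of the Latin script is modelled as the 26 basic
# lowercase letters of the Roman alphabet.
latin_script = set(string.ascii_lowercase)

# Assumption: an incomplete, illustrative sample of Icelandic letters
# (thorn, eth and ae added to a subset of the basic letters).
icelandic_sample = set("abdefghijklmnoprstuvxy") | {"\u00fe", "\u00f0", "\u00e6"}

latin_script, novel_forms = unify(latin_script, icelandic_sample)

# List the forms that the alphabet contributed to the script.
for ch in sorted(novel_forms):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

The shared letters unify with forms already in the script, while the
three extra letters enter it as novel forms. The names printed are
those of the current Unicode character database, which do not always
match the Unicode 1.0 names quoted above.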

Clearly, none of the different alphabets which share the collection of
symbols referred to as LATIN LETTERS in Unicode actually make use of
*all* of the symbols so identified; therefore, you can't say it is just
a font shift.

>So while you can unify the Suetterlin and the Latin script, you can not unify
>Latin and Greek script although Latin is derived from Greek.

You could unify Latin and Greek if you want, but it would require
radical unification of both form and function. And it wouldn't buy much
as far as encoding is concerned.

Keep in mind here that a SCRIPT in the Unicode sense is largely an
artificial engineering construct; it does not adhere to any "theory of
scripts." To my knowledge, there is no theory of scripts anyway. The
problem with written language is that it is based almost entirely on
convention and historical accident; one can't articulate a theory of
writing on the basis of necessity (at least I wouldn't give credence to
any such theory).

Glenn Adams