- Path: sparky!uunet!usc!sol.ctr.columbia.edu!destroyer!gumby!yale!mintaka.lcs.mit.edu!ai-lab!wheat-chex!glenn
- From: glenn@wheat-chex.ai.mit.edu (Glenn A. Adams)
- Newsgroups: comp.std.internat
- Subject: Re: Alphabets
- Date: 25 Jan 1993 15:12:14 GMT
- Organization: MIT Artificial Intelligence Laboratory
- Lines: 81
- Message-ID: <1k100eINNs9n@life.ai.mit.edu>
- References: <1993Jan24.172323.2706@enea.se> <1jutusINNlfa@life.ai.mit.edu> <8719@charon.cwi.nl>
- NNTP-Posting-Host: wheat-chex.ai.mit.edu
-
- In article <8719@charon.cwi.nl> dik@cwi.nl (Dik T. Winter) writes:
-
- >"What is Unicode encoding?". Scripts? Writing system?
-
- Unicode encodes scripts, and not writing systems (alphabets).
-
- >It is not encoding the Latin script I think. Consider for instance the
- >German writing systems that have been used. Fraktur. Is that a different
- >font?
-
- Fraktur is a different typeface.
-
- >Suetterlin. Is that a different font? Many would think the latter not
- >predominantly derived from the symbols used in the Roman alphabet. I see
- >them as being more derived from the Germanic Runes. Still, there is a 1<->1
- >correspondence between the symbols in the Suetterlin script and the German
- >version of the Latin script (I think).
-
- I assume you refer to the written form developed by Ludwig Suetterlin
- (1865-1917). I don't have any detailed information on it, so I can't say
- for sure. Without knowing any details, I would be willing to say it was
- a distinct script to the extent that Suetterlin created new forms or even
- borrowed forms from other scripts, perhaps modifying them in the process.
-
- Aside from the issue of encoding utility, I would say that abstracting the
- forms of two or more alphabets into a single script should take into account
- historical derivation, formal similarity, and perhaps even functional
- similarity, although I would give the last much less priority than the first two
- criteria.
-
- >I think you should add that unification of different scripts is possible
- >iff the scripts can be viewed as just being font changes (although the
- >derivation of the scripts can be completely different).
-
- I think you may be confusing "script" as I am using it with "handwriting
- form" or possibly "written form." Clearly the latter would be a matter of
- only font changes, and nothing more. However, I am using "script" in a
- different way, namely, to capture the notion of a collection of abstract
- symbols which tend to have a fairly clear historical relationship, which
- still bear a fairly strong resemblance in form, and which are used to
- represent fairly similar functions. In addition, such a script -- as an
- artificial construct -- can include elements which violate these criteria
- to some degree.
-
- The general process used in Unicode is to identify an alphabet (i.e., the
- symbols used in a particular writing system) with some historically known
- collection of symbols (a script), attempt to unify the alphabet with this
- collection and then, to the extent that the unification is successful
- and doesn't interfere with basic processing tasks, replace the script with
- the (unified) union of the original script and the forms of the new alphabet.
- This produces a new script which may have forms in it that were not in the
- original script; e.g., if you look at Unicode Latin blocks, you will find
- elements like: LATIN SMALL LETTER THORN, LATIN SMALL LETTER EXCLAMATION
- MARK, LATIN SMALL LETTER BARRED LAMBDA, LATIN CAPITAL LETTER YOGH,
- LATIN LETTER TWO BAR, LATIN CAPITAL LETTER TONE FIVE, and so on. Clearly
- none of these are members of the Roman alphabet, or are even close. In
- these cases, alphabets which were largely derived from the core symbols
- of the Latin script, i.e., those in the Roman alphabet, were unified with
- the Latin script, resulting in unification of some forms, and the addition of
- other, novel forms which were innovations in the alphabets being unified.
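-
- As a rough illustration only, the process described above can be sketched
- with sets of abstract symbols. Everything here is an assumption made for
- the example: the unify() helper, the min_overlap threshold (a crude
- stand-in for the judgement about derivation and formal similarity), and
- the simplified symbol inventories, which ignore accented letters.

```python
# Hypothetical sketch: a script modelled as a set of abstract symbols.
# Inventories are simplified for illustration, not taken from any standard.

def unify(script, alphabet, min_overlap=0.5):
    """Return the enlarged script, or None if too little is shared.

    min_overlap is an arbitrary threshold; real unification decisions
    rest on historical and formal judgement, not on a ratio.
    """
    alphabet = set(alphabet)
    shared = script & alphabet
    if len(shared) / len(alphabet) < min_overlap:
        return None  # treat the alphabet as belonging to a separate script
    # The new script is the union of the old script and the alphabet's forms.
    return script | alphabet

# Each single letter stands for one abstract symbol of the Roman alphabet.
roman = set("ABCDEFGHIJKLMNOPQRSTUVWXYZ")

# A simplified Icelandic inventory: most Roman letters plus three novel forms.
icelandic = (roman - set("CQWZ")) | {"THORN", "ETH", "AE"}

latin = unify(roman, icelandic)
print(sorted(latin - roman))  # ['AE', 'ETH', 'THORN']
```

- The forms printed at the end are the ones the unified script gained; they
- were never in the Roman alphabet, yet they now belong to the script.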
-
- Clearly, none of the different alphabets which share the collection of symbols
- referred to as LATIN LETTERS in Unicode actually make use of *all* of the
- symbols so identified; therefore, you can't say it is just a font shift.
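-
- A small check of this point, using Python's standard unicodedata module.
- The is_latin_letter() helper and its name-prefix test are assumptions made
- for the example (character names in current Unicode versions differ in
- places from those quoted above), and the Icelandic inventory is again
- simplified to its unaccented letters.

```python
import unicodedata

def is_latin_letter(ch):
    # Assumption: a character counts as Latin if its Unicode name
    # starts with "LATIN". This is a heuristic, not a script property lookup.
    return unicodedata.name(ch, "").startswith("LATIN")

english = set("abcdefghijklmnopqrstuvwxyz")
# Unaccented Icelandic letters plus thorn, eth, and ae.
icelandic = set("abdefghijklmnoprstuvxy") | {"\u00fe", "\u00f0", "\u00e6"}

# Both alphabets draw only on characters named as Latin letters...
for alphabet in (english, icelandic):
    assert all(is_latin_letter(ch) for ch in alphabet)

# ...but neither one contains the other.
print(sorted(english - icelandic))  # ['c', 'q', 'w', 'z']
print(sorted(icelandic - english))
```

- Each alphabet is a different subset of the same repertoire, which is why
- moving between them cannot be described as a font change alone.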
-
- >So while you can unify the Suetterlin and the Latin script, you can not unify
- >Latin and Greek script although Latin is derived from Greek.
-
- You could unify Latin and Greek if you want, but it would require radical
- unification of both form and function. And it wouldn't buy much as far
- as encoding is concerned. Keep in mind here that a SCRIPT in the Unicode
- sense is largely an artificial engineering construct, and does not adhere to
- any "theory of scripts." To my knowledge, there is no theory of scripts
- anyway. The problem with written language is that it is based almost
- entirely on convention and historical accident; one can't articulate a
- theory of writing on the basis of necessity (at least I wouldn't give
- credence to any such theory).
-
- Glenn Adams
-
-
-