- Path: sparky!uunet!mcsun!sun4nl!cwi.nl!dik
- From: dik@cwi.nl (Dik T. Winter)
- Newsgroups: comp.std.internat
- Subject: Re: Cleanicode
- Message-ID: <8689@charon.cwi.nl>
- Date: 21 Jan 93 02:34:48 GMT
- References: <C138zr.r3@poel.juice.or.jp> <ISHIKAWA.93Jan20182546@ds5200.personal-media.co.jp>
- Sender: news@cwi.nl
- Organization: CWI, Amsterdam
- Lines: 35
-
- In article <ISHIKAWA.93Jan20182546@ds5200.personal-media.co.jp> ishikawa@personal-media.co.jp writes:
- > Why not unify Latin/Cyrillic/Greek 'A'? This simple question also is
- > the cause of uncomfortable feeling many Japanese programmers seem to
- > have (including myself).
- >
- I understand that. I think some unification might be done with the LCG
- glyphs. But the LCG scripts have the feature that each glyph comes in
- two forms: majuscule and minuscule. The distinction between the two
- forms is very small; in many cases it does not matter whether the
- majuscule form or the minuscule form is used (e.g. sorting). But that
- breaks down unification of Latin/Cyrillic and Greek 'A' because the
- minuscule form is different. Still worse are examples like 'T' and
- 'B' where the minuscule form is different for all three. On the other
- hand, I do not think Unicode is consistent (I do not know for sure; when
- I tried to buy the book it was sold out). I think that the Turkish dotless
- and dotted 'I' each share one of their two case forms with the Latin 'I'. My
- preference would be three (times two) code points: Latin 'I', Turkish
- 'I' with dot and Turkish 'I' without dot. But I (as a westerner)
- understand why it is not done. It is impossible to distinguish the
- majuscule Latin 'I' from the Turkish majuscule dotless 'I', which
- would make it more difficult for the user. On the other hand, as a
- programmer, I see the difficulty in doing a case insensitive search.
- With the two Turkish 'I's there will be more false matches unless
- the language is also encoded, but again, that makes it more difficult for
- the user. But I think that unification of those majuscule/minuscule
- glyphs that are (up to font differences) identical would make sense.
- This includes Latin/Cyrillic 'A/a', 'J/j' (is the latter included
- in Cyrillic?) and Latin/Cyrillic/Greek 'O/o'.
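-
- A minimal Python sketch of the two points above, using the code points
- Unicode assigns today: look-alike capitals whose lowercase forms differ
- by script, and a case-insensitive search that has to be told the language
- before it can handle the Turkish 'I's. The names LOOKALIKES,
- lowercase_names, turkish_lower and matches_ignoring_case are hypothetical,
- made up for this illustration; only str.lower, str.replace and
- unicodedata.name come from the standard library.

```python
import unicodedata

# Capitals that look alike across scripts (Latin, Cyrillic, Greek order).
# The grouping is an assumption made for this illustration.
LOOKALIKES = {
    "A": ["\u0041", "\u0410", "\u0391"],
    "T": ["\u0054", "\u0422", "\u03A4"],
    "B": ["\u0042", "\u0412", "\u0392"],
}


def lowercase_names(capitals):
    """Return the Unicode names of the lowercase forms of the given capitals."""
    return [unicodedata.name(c.lower()) for c in capitals]


def turkish_lower(text):
    """Hypothetical helper: lowercase with Turkish rules.

    Capital I with dot (U+0130) maps to plain 'i', and plain capital 'I'
    maps to dotless i (U+0131); everything else uses the default mapping.
    """
    return text.replace("\u0130", "i").replace("I", "\u0131").lower()


def matches_ignoring_case(needle, haystack, turkish=False):
    """Case-insensitive substring test; the caller must supply the language."""
    lower = turkish_lower if turkish else str.lower
    return lower(needle) in lower(haystack)


if __name__ == "__main__":
    for letter, capitals in LOOKALIKES.items():
        print(letter, lowercase_names(capitals))
    # The same pair of strings matches or not depending on the language flag.
    print(matches_ignoring_case("i", "I"))
    print(matches_ignoring_case("i", "I", turkish=True))
```

- The turkish flag stands in for "language is coded also": without it the
- search cannot know whether a capital 'I' should match 'i' or dotless 'i'.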
-
- As I understand it, some of the problems with CJK unification are of
- a similar nature. While the base character is the same there may be
- different simplifications. But that is only as far as I understand it.
- --
- dik t. winter, cwi, kruislaan 413, 1098 sj amsterdam, nederland
- home: bovenover 215, 1025 jn amsterdam, nederland; e-mail: dik@cwi.nl
-