- Path: sparky!uunet!cs.utexas.edu!qt.cs.utexas.edu!yale.edu!ira.uka.de!fauern!uni-erlangen.de!not-for-mail
- From: unrza3@cd4680fs.rrze.uni-erlangen.de (Markus Kuhn)
- Newsgroups: comp.std.internat
- Subject: Re: Dumb Americans (was INTERNATIONALIZATION: JAPAN, FAR EAST)
- Date: 4 Jan 1993 13:16:15 +0100
- Organization: Regionales Rechenzentrum Erlangen
- Message-ID: <1i99qfEINN420@uni-erlangen.de>
- References: <1hvu79INN4qf@rodan.UU.NET> <1993Jan1.115424.27258@enea.se> <1i2gpvINN3lm@rodan.UU.NET> <1993Jan2.230101.20871@enea.se>
- Reply-To: mskuhn@immd4.informatik.uni-erlangen.de
- NNTP-Posting-Host: cd4680fs.rrze.uni-erlangen.de
- Lines: 84
-
- First about keyboard drivers:
-
- sommar@enea.se (Erland Sommarskog) writes:
- >Vadim Antonov (avg@rodan.UU.NET) writes:
- >>In article <1993Jan1.115424.27258@enea.se> sommar@enea.se (Erland Sommarskog) writes:
- >>>So if I type a C then a million key presses later changes puts in
- >>>an H after the C how can the keyboard driver handle that? It might
- >>>not even be the same driver who are seeing the two!
- >>Aw, don't be silly. It's trivial.
- >When you can't explain write off the problem as trivial.
-
- One solution: extend the 'keyboard driver' <-> 'application'
- interface with a method that lets the keyboard driver ask the application
- in which context the key should be interpreted. The technical details are
- perhaps slightly more complicated, but surely not very difficult
- (mathematicians call this 'trivial' :-).
-
- In this way, the application need not be aware of the language and locale.
- In existing systems, however, applications would have to be extended to
- answer the keyboard driver's questions about the environment into which the
- new letter will be inserted, and even to let the keyboard driver make
- changes to this environment. Would be a nice research project ...
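-
- A minimal sketch of such an interface, in Python. Everything here is
- hypothetical and invented for illustration: the class names TextBuffer and
- KeyboardDriver, the methods preceding() and replace_preceding(), and the
- tiny compose table. The point is only that the driver keeps no state of its
- own; it asks the application what precedes the insertion point.
-

```python
class TextBuffer:
    """Hypothetical application side: exposes its context to the driver."""

    def __init__(self):
        self.text = ""

    def preceding(self, n=1):
        # Return the last n characters before the insertion point.
        return self.text[-n:] if n else ""

    def replace_preceding(self, n, new):
        # Replace the last n characters with `new` (n == 0 is a plain insert).
        self.text = self.text[:len(self.text) - n] + new


class KeyboardDriver:
    """Hypothetical driver side: interprets a key using the app's context."""

    # (previous character, key pressed) -> replacement for the previous character
    COMPOSE = {
        ("a", '"'): "\u00e4",  # a-umlaut
        ("o", '"'): "\u00f6",  # o-umlaut
        ("u", '"'): "\u00fc",  # u-umlaut
    }

    def __init__(self, app):
        self.app = app

    def key_pressed(self, key):
        prev = self.app.preceding(1)            # ask the application for context
        composed = self.COMPOSE.get((prev, key))
        if composed is not None:
            self.app.replace_preceding(1, composed)  # change the environment
        else:
            self.app.replace_preceding(0, key)       # ordinary insertion


buf = TextBuffer()
drv = KeyboardDriver(buf)
for k in 'ba"r':
    drv.key_pressed(k)
```

-
- Because the context comes from the application on every key press, it does
- not matter how much time has passed since the preceding letter was typed,
- or which driver saw it.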
-
- And about sorting rules:
-
- >Nope. Not with German. Look in a German dictionary. Then look in a
- >German phonebook. Then you will find that the dotted suckers are
- >sorted differently in the two places. If you want to support both,
- >you have to know what the user wants. Or do you suggest that the
- >user should specify that on input with choosing the correct set of
- >dotted characters? What if another user wants the other order?
-
- (BTW: German sorting rules are defined in DIN 5007. That's the method
- used in the phone books. Some dictionary publishers feel that sorting
- a-Umlaut as 'a' and not as 'ae' makes quick searching easier for human
- users.)
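-
- To make the difference concrete, here is a rough Python sketch of the two
- conventions as described above. It is a simplification, not an
- implementation of DIN 5007; the function names and the word list are made
- up for the example.
-

```python
# Phone-book style: umlauts are expanded (a-umlaut sorts as 'ae').
_PHONEBOOK = str.maketrans({
    "\u00e4": "ae", "\u00f6": "oe", "\u00fc": "ue", "\u00df": "ss",
})
# Dictionary style: umlauts sort like their base letter (a-umlaut as 'a').
_DICTIONARY = str.maketrans({
    "\u00e4": "a", "\u00f6": "o", "\u00fc": "u", "\u00df": "ss",
})


def phonebook_key(word):
    return word.lower().translate(_PHONEBOOK)


def dictionary_key(word):
    return word.lower().translate(_DICTIONARY)


words = ["Afrika", "\u00c4rger", "Adler"]   # "\u00c4rger" is A-umlaut + "rger"
by_phonebook = sorted(words, key=phonebook_key)
by_dictionary = sorted(words, key=dictionary_key)
```

-
- With the expansion, the a-umlaut word is compared as 'aerger' and lands
- before 'Afrika'; with the base-letter rule it is compared as 'arger' and
- lands after it.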
-
- As I believe that no one will die if his national sorting rules
- are changed, perhaps it's time to create international ISO sorting rules
- comparable to those defined in DIN 5007. Why shouldn't the people who
- know about these problems first try to harmonize the rules internationally
- before they implement complex, historically grown sorting rules?
-
- A first approach for an ISO sorting standard would be, for all languages
- written in the Latin script, to order letters with diacritics directly
- behind their versions without diacritics. Other suggestions for
- international sorting harmonization rules are welcome. The result might be
- a huge permutation table for Unicode. The goal should NOT be to try to
- cover all EXISTING sorting rules (which can trivially be proved
- impossible), but to create something acceptable for sorting multilingual
- texts. Good idea?
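-
- One possible reading of "directly behind" as a Python sketch: every letter
- is compared first by its base letter and then by its diacritics, so that
- a < a-umlaut < b. Using Unicode canonical decomposition for this is my
- assumption, not part of the proposal, and the name harmonized_key is
- invented. Letters without a decomposition (o-slash, for example) simply
- keep their code point order here.
-

```python
import unicodedata


def harmonized_key(text):
    """Per-letter key: (base letter, attached diacritics)."""
    key = []
    for ch in unicodedata.normalize("NFD", text.casefold()):
        if unicodedata.combining(ch) and key:
            # A combining mark: attach it to the preceding base letter.
            base, marks = key[-1]
            key[-1] = (base, marks + ch)
        else:
            key.append((ch, ""))
    return tuple(key)


words = ["b", "\u00e4", "az", "a", "\u00e4a"]   # "\u00e4" is a-umlaut
ordered = sorted(words, key=harmonized_key)
```

-
- Under this reading the list comes out as a, az, a-umlaut, a-umlaut + a, b:
- a letter with a diacritic is a letter of its own, placed right after its
- plain version.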
-
- This ISO sorting standard should be 100% language independent. This surely
- is possible and, I believe, would be very, very useful.
-
-
- Sorting always depends on the type of application you have,
- don't forget this. E.g. in bibliographic systems you have to insert
- permutation markers in names, so that
-
- Markus G. *Kuhn (* represents here the marker control code
- in your favourite bibliographic control
- character set. What's the corresponding
- UNICODE code?)
-
- will be sorted as
-
- Kuhn Markus G
-
- There are even more complex systems with 2 or 3 markers in use.
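-
- A small Python sketch of the single-marker case. The '*' stands in for the
- real marker control code, as in the example above; the function name
- filing_form and the choice to strip '.' and ',' are assumptions. Systems
- with 2 or 3 markers are not handled.
-

```python
def filing_form(name, marker="*"):
    """Rotate the name at the marker to get the form used for sorting."""
    before, sep, after = name.partition(marker)
    rotated = f"{after} {before}" if sep else name
    # Drop trailing dots/commas from each word and normalize spacing.
    words = (w.strip(".,") for w in rotated.split())
    return " ".join(w for w in words if w)


entry = filing_form("Markus G. *Kuhn")
```

-
- Here entry is the string 'Kuhn Markus G', which can then go through
- whatever comparison the application normally uses.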
-
- ASCII also doesn't allow direct sorting. You have to do case conversion,
- punctuation removal, etc. before you can arithmetically compare the
- strings. With ASCII, these algorithms are just so simple that most people
- don't notice them. With ISO 10646, you need the same methods as with ASCII;
- they are only more complex and require much bigger exception tables. There
- is no fundamental difference!
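-
- For illustration, the ASCII case in Python. The name ascii_key and the
- sample words are mine; the two steps are the case conversion and
- punctuation removal mentioned above.
-

```python
import string

# Translation table that deletes all ASCII punctuation characters.
_DROP_PUNCT = str.maketrans("", "", string.punctuation)


def ascii_key(text):
    """Fold case and remove punctuation before comparing code values."""
    return text.lower().translate(_DROP_PUNCT)


words = ["apple", "Zebra", "co-op", "Coal"]
raw = sorted(words)                   # plain comparison of code values
cooked = sorted(words, key=ascii_key)
```

-
- The raw order puts all capitalized words first (Coal, Zebra, apple, co-op);
- with the key the list reads apple, Coal, co-op, Zebra.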
-
- Markus
-
- --
- Markus Kuhn, Computer Science student -=-=- University of Erlangen, Germany
- Internet: mskuhn@immd4.informatik.uni-erlangen.de | X.500 entry available
- ----- Anyone participating in the use of MS-DOS, Heroin or Cocaine is -----
- ---- simply not getting the most out of life possible. (Brian Downing) ----
-