NetNews Usenet Archive 1993 #1

Internet Message Format | 1993-01-04 | 4.6 KB

Path: sparky!uunet!cs.utexas.edu!qt.cs.utexas.edu!yale.edu!ira.uka.de!fauern!uni-erlangen.de!not-for-mail
From: unrza3@cd4680fs.rrze.uni-erlangen.de (Markus Kuhn)
Newsgroups: comp.std.internat
Subject: Re: Dumb Americans (was INTERNATIONALIZATION: JAPAN, FAR EAST)
Date: 4 Jan 1993 13:16:15 +0100
Organization: Regionales Rechenzentrum Erlangen
Message-ID: <1i99qfEINN420@uni-erlangen.de>
References: <1hvu79INN4qf@rodan.UU.NET> <1993Jan1.115424.27258@enea.se> <1i2gpvINN3lm@rodan.UU.NET> <1993Jan2.230101.20871@enea.se>
Reply-To: mskuhn@immd4.informatik.uni-erlangen.de
NNTP-Posting-Host: cd4680fs.rrze.uni-erlangen.de
Lines: 84

First about keyboard drivers:

sommar@enea.se (Erland Sommarskog) writes:
>Vadim Antonov (avg@rodan.UU.NET) writes:
>>In article <1993Jan1.115424.27258@enea.se> sommar@enea.se (Erland Sommarskog) writes:
>>>So if I type a C then a million key presses later changes puts in
>>>an H after the C how can the keyboard driver handle that? It might
>>>not even be the same driver who are seeing the two!
>>Aw, don't be silly. It's trivial.
>When you can't explain write off the problem as trivial.

One solution: You extend the 'keyboard driver' <-> 'application'
interface by a method that allows the keyboard driver to ask the
application in which context the key should be interpreted. The
technical details are perhaps slightly more complicated, but surely
not very difficult (mathematicians call this 'trivial' :-).

In this way, the application need not be aware of the language and
locale. In existing systems, however, applications would have to be
extended to answer questions from the keyboard driver about the
environment in which the new letter will be inserted, and even to
allow the keyboard driver to make changes in this environment.
Would be a nice research project ...

And about sorting rules:

>Nope. Not with German. Look in a German dictionary. Then look in a
>German phonebook. Then you will find that the dotted suckers are
>sorted differently in the two places.
>If you want to support both,
>you have to know what the user wants. Or do you suggest that the
>user should specify that on input with choosing the correct set of
>dotted characters? What if another user wants the other order?

(BTW: German sorting rules are defined in DIN 5007. That's the method
used in the phone books. Some dictionary printers feel that sorting
a-Umlaut as 'a' and not as 'ae' makes quick searching easier for
human users.)

As I believe that no one will die if his national sorting rules are
changed, perhaps it's time to create international ISO sorting rules
comparable to those defined in DIN 5007. Why shouldn't the people who
know about these problems first try to harmonize the rules
internationally, before they implement complex, historically grown
sorting rules?

A first approach for an ISO sorting standard would be, for all
Latin-alphabet languages, to order letters with diacritics directly
behind their versions without diacritics. Other suggestions for
international sorting harmonization rules are welcome. The result
might be a huge permutation table for Unicode. The goal should NOT be
to try to cover all EXISTING sorting rules (which can trivially be
proved impossible), but to create something acceptable for sorting
multilingual texts. Good idea?

This ISO sorting standard should be 100% language independent. This
is surely possible and, I believe, would be very useful.

Sorting always depends on the type of application you have; don't
forget this. E.g. in bibliographic systems you have to insert
permutation markers in names, so that

    Markus G. *Kuhn

(* here represents the marker control code in your favourite
bibliographic control character set. What's the corresponding UNICODE
code?) will be sorted as

    Kuhn Markus G

There are even more complex systems with 2 or 3 markers in use.

ASCII also doesn't allow direct sorting. You have to do case
conversion, punctuation removal, etc. before you can arithmetically
compare the strings.
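
To make the preprocessing steps just mentioned concrete, here is a
minimal sketch in Python. It is only an illustration under stated
assumptions: the names MARKER, permute and sort_key are hypothetical,
'*' stands in for the real marker control code as in the example
above, and only a single marker per name is handled.

```python
import string

# Assumption: '*' stands in for the bibliographic permutation marker.
MARKER = "*"

# Translation table that deletes every ASCII punctuation character.
_DROP_PUNCT = str.maketrans("", "", string.punctuation)


def permute(name: str) -> str:
    """Move the part after the marker to the front of the name."""
    head, sep, tail = name.partition(MARKER)
    if not sep:
        # No marker present: leave the name in its written order.
        return name
    return f"{tail.strip()} {head.strip()}".strip()


def sort_key(name: str) -> str:
    """Build a key that can be compared character by character."""
    permuted = permute(name)                   # 'Kuhn Markus G.'
    cleaned = permuted.translate(_DROP_PUNCT)  # punctuation removal
    return " ".join(cleaned.split()).lower()   # case conversion


names = ["Markus G. *Kuhn", "erland *Sommarskog", "Vadim *Antonov"]
print(sort_key(names[0]))
print(sorted(names, key=sort_key))
```

Only after this normalization is a plain comparison of the keys
meaningful; comparing the raw strings would put every capitalized
name before the lower-case one.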
With ASCII, these algorithms are just so simple that most people
don't notice them. With ISO 10646, you need the same methods as with
ASCII; they are only more complex and require much bigger exception
tables. There is no fundamental difference!

Markus

--
Markus Kuhn, Computer Science student -=-=- University of Erlangen, Germany
Internet: mskuhn@immd4.informatik.uni-erlangen.de  |  X.500 entry available
 ----- Anyone participating in the use of MS-DOS, Heroin or Cocaine is -----
 ---- simply not getting the most out of life possible. (Brian Downing) ----