- Path: sparky!uunet!zaphod.mps.ohio-state.edu!cs.utexas.edu!sun-barr!sh.wide!wnoc-tyo-news!sranha!anprda!pmcgw!personal-media.co.jp
- From: ishikawa@personal-media.co.jp (Chiaki Ishikawa)
- Newsgroups: comp.std.internat
- Subject: Re: Dumb Americans (was INTERNATIONALIZATION: JAPAN, FAR EAST)
- Message-ID: <ISHIKAWA.93Jan8125424@ds5200.personal-media.co.jp>
- Date: 8 Jan 93 03:54:10 GMT
- References: <1hvu79INN4qf@rodan.UU.NET> <1993Jan1.115424.27258@enea.se>
- <1i2gpvINN3lm@rodan.UU.NET> <1993Jan2.230101.20871@enea.se>
- <1i99qfEINN420@uni-erlangen.de>
- Sender: news@pmcgw.personal-media.co.jp
- Reply-To: ishikawa@personal-media.co.jp
- Organization: Personal Media Corp., Tokyo Japan
- Lines: 103
- Nntp-Posting-Host: ds5200
- In-reply-to: unrza3@cd4680fs.rrze.uni-erlangen.de's message of 4 Jan 93 12:16:15 GMT
- X-Md4-Signature: 4349bccdacb4f704e90f68505efdd5c5
-
-
- From a Japanese perspective:
-
- In article <1i99qfEINN420@uni-erlangen.de> unrza3@cd4680fs.rrze.uni-erlangen.de (Markus Kuhn) writes:
-
- First about keyboard drivers:
-
- [details omitted]
-
- Would be a nice research project ...
-
- Agreed. It would not be impossible, but it would require rewriting
- many applications (or input libraries), and it would take a long
- time before such changes were in place everywhere.
-
- And about sorting rules:
-
- As I believe that no one will die if his national sorting rules
- are changed, perhaps it's time to create international ISO sorting rules
- comparable to those defined in DIN 5007. Why shouldn't the people that
- know about these problems first try to harmonize the rules internationally
- before they implement complex historically grown sorting rules?
-
- May I suggest that, before going into the ISO arena, each country (or
- culture group) codify its existing methods clearly in writing? And
- please note my use of the plural, method[s]. I don't believe
- there is a single satisfactory sorting method for each language. (I
- don't know whether there is a single satisfactory rule for American
- English, but there clearly is none in Germany [I read that some
- publishers don't follow the DIN sorting rule] or in other places such
- as Japan.)
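-
- As a small illustration of "methods" in the plural, here is a minimal
- Python sketch. It assumes the two conventions commonly attributed to
- DIN 5007: a dictionary style that treats umlauts as their base vowels,
- and a phone-book style that expands them to vowel + e. The table
- contents and the name german_key are illustrative assumptions, not
- text from the standard, and they cover only the umlauts and eszett.

```python
# Two German collation conventions, as commonly described for DIN 5007.
# Illustrative tables only: they handle the umlauts and eszett, nothing else.
DICTIONARY = str.maketrans({"ä": "a", "ö": "o", "ü": "u", "ß": "ss"})
PHONE_BOOK = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"})


def german_key(word, table):
    """Return a comparison key: lowercase the word, then apply the table."""
    return word.lower().translate(table)


names = ["Müller", "Mueller", "Muller", "Mufti"]

# sorted() is stable, so names with equal keys keep their input order.
by_dictionary = sorted(names, key=lambda w: german_key(w, DICTIONARY))
by_phone_book = sorted(names, key=lambda w: german_key(w, PHONE_BOOK))

print(by_dictionary)
print(by_phone_book)
```

- The same four names come out in two different orders, which is the
- reason a single codified rule per language may not be enough.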
-
- One first approach for an ISO sorting standard would be for all Latin languages
- to order letters with diacritics directly behind their versions without
- diacritics. Other suggestions for international sorting harmonization
- rules are welcome. The result might be a huge permutation table for
- Unicode. The goal should NOT be to try to cover all EXISTING sorting
- rules (which may trivially be proved impossible), but to create
- something acceptable for sorting multilingual texts. Good idea?
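-
- A rough sketch of the proposal quoted above, assuming that "directly
- behind" means each accented letter is its own letter placed right
- after the plain one (a < ä < b). Instead of a hand-written permutation
- table, the sketch derives the order from Unicode decomposition via
- Python's unicodedata module; the function name is hypothetical, and
- letters without a canonical decomposition (such as ø) are not handled.

```python
import unicodedata


def diacritic_after_base_key(text):
    """Build a key in which an accented letter sorts right after its base.

    Each character becomes a (base letter, combining marks) pair. A plain
    letter has an empty marks string, so it compares lower than any
    accented form of the same base letter.
    """
    key = []
    # NFC first, so that every accented letter is a single character.
    for ch in unicodedata.normalize("NFC", text.lower()):
        decomposed = unicodedata.normalize("NFD", ch)
        key.append((decomposed[0], decomposed[1:]))
    return key


words = ["zebra", "äpfel", "apfel", "bär", "bar", "az"]
ordered = sorted(words, key=diacritic_after_base_key)
print(ordered)
```

- Note that "az" lands before "äpfel" under this rule. That is
- consistent with the proposal, but it differs from conventions that
- ignore the diacritic on a first pass.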
-
- I doubt that this is a good idea: unless we cover the existing
- sorting rules (which are presumably widely used), the standard would be
- useless for many users. Like the German publishers mentioned in this
- group, who use a sorting rule different from the one in the DIN
- standard, there will be people who keep using the existing sorting
- rule(s). So why not codify them in freely available (semi-)standard
- documents?
-
- This ISO sorting standard should be 100% language independent. This surely
- is possible and as I believe would be very very useful.
-
- I don't think this language independence is possible. I hate to give
- a very specific example, but IMHO, location-specific examples and
- counter-examples are the norm rather than the exception. So here it goes.
- Look at Japanese text (or names). When text (or names) is written
- in Kanji (Chinese characters), the information about how to
- read it is more or less lost! [It is true that most of the time, a
- grown-up Japanese can guesstimate the correct reading. But there are too
- many exceptions to rely on human skill :-(] If we want to collate
- such text (or names) according to how we read it, we have to have
- auxiliary phonetic data (probably given in hiragana or katakana
- letters). I wonder whether you would call such a sorting environment
- language independent. [Now, granted, sorting hiragana-only or
- katakana-only data is more or less straightforward.] In any case, the
- local telephone companies use a particular sorting method to list all
- the names in the telephone book. I don't think it is a
- language-independent method at all.
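-
- To make the point about auxiliary phonetic data concrete, here is a
- minimal Python sketch. It assumes each name is stored as a (written
- form, hiragana reading) pair and that plain code-point order of the
- hiragana is an acceptable stand-in for the real directory order; the
- sample names and the helper reading_key are illustrative, and this is
- not the method the telephone companies actually use.

```python
# (written form, reading) pairs. The reading is auxiliary data that has
# to be supplied with the kanji; it cannot be derived reliably from it.
entries = [
    ("山田", "やまだ"),
    ("東", "ひがし"),
    ("鈴木", "すずき"),
    ("東", "あずま"),
    ("佐藤", "さとう"),
]


def reading_key(entry):
    """Sort by reading first; use the written form only to break ties."""
    written, reading = entry
    return (reading, written)


by_reading = sorted(entries, key=reading_key)

# For contrast: ordering by the code points of the written form alone.
by_code_point = sorted(entries, key=lambda entry: entry[0])

for written, reading in by_reading:
    print(reading, written)
```

- The same written name, 東, appears twice with different readings and
- ends up in two different places. No table keyed on the kanji alone
- can reproduce that order.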
-
- I believe the correct approach to sorting is to codify the existing
- rules of each country/culture/whatever, and then publish them in
- easy-to-read documents that are understandable by reasonably talented
- programmers in other countries. Code examples for each sorting rule
- will go a long way. This will help I18N from the bottom up.
-
- [bibliographic sorting example deleted]
-
- ASCII also doesn't allow direct sorting. You have to do case conversion,
- punctuation removal, etc. before you can arithmetically compare the
- strings. With ASCII, these algorithms are just so simple that most people
- don't recognize them.
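-
- The preprocessing described in the quoted paragraph can be sketched
- in a few lines of Python. This assumes that "case conversion" means
- lowercasing and that "punctuation removal" means dropping the
- characters in string.punctuation; the function name ascii_sort_key
- is hypothetical, and real directory rules add more steps.

```python
import string

# Translation table that deletes every ASCII punctuation character.
_DROP_PUNCTUATION = str.maketrans("", "", string.punctuation)


def ascii_sort_key(text):
    """Lowercase the text and strip punctuation before comparison."""
    return text.lower().translate(_DROP_PUNCTUATION)


names = ["O'Neil", "Oates", "o'brien", "Obrist"]

raw_order = sorted(names)                      # plain code-point comparison
keyed_order = sorted(names, key=ascii_sort_key)

print(raw_order)
print(keyed_order)
```

- Without the key, the apostrophe and the lowercase initial decide the
- order; with it, the names fall where a reader would look for them.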
-
- Unfortunately, some languages of the world require much more complex
- preprocessing, or the notion of "sorting" may not apply very well in
- the naive sense of the word as perceived by, say, American users.
- Frankly, I have a tough time explaining to overseas visitors how
- names are sorted in the Japanese telephone directory. [It would take a
- few minutes at the least, and I don't know whether the explanation
- would be understood at all. I bet you can explain the concept of
- sorting in an American telephone directory in less than a minute to
- an intelligent grown-up.]
-
- With ISO 10646, you need the same methods as with ASCII,
- they are only more complex and require much bigger exception tables. There
- is no fundamental difference!
-
- I doubt that it is just a complexity issue. The whole concept of
- language independence sounds like a pipe dream to me. (See the
- discussion of Japanese sorting above.) There is nothing wrong with
- language-specific solutions IMHO, as long as such solutions (adopted
- from existing practices) are freely available in understandable
- documentation.
-
- Chiaki Ishikawa, Personal Media Corp., MY Bldg, 1-7-7 Hiratsuka,
- Shinagawa, Tokyo 142, JAPAN. FAX:+81-3-5702-0359, Phone:+81-3-5702-0351
- UUNET: ishikawa@personal-media.co.jp
-