- echo wg15
- cat >wg15 <<'shar.wg15.14933'
- From jsq@longway.tic.com Wed Mar 14 11:29:02 1990
- Received: from cs.utexas.edu by uunet.uu.net (5.61/1.14) with SMTP
- id AA08502; Wed, 14 Mar 90 11:29:02 -0500
- Posted-Date: 14 Mar 90 15:46:26 GMT
- Received: by cs.utexas.edu (5.59/1.52)
- id AA07316; Wed, 14 Mar 90 10:28:51 CST
- Received: by longway.tic.com (4.22/4.16)
- id AA11906; Wed, 14 Mar 90 09:47:32 cst
- From: Dominic Dunlop <tsa.co.uk!domo@longway.tic.com>
- Newsgroups: comp.std.unix
- Subject: Report on WG15 Rapporteur Group
- Message-Id: <556@longway.TIC.COM>
- Sender: std-unix@longway.tic.com
- Reply-To: std-unix@uunet.uu.net
- Date: 14 Mar 90 15:46:26 GMT
- Apparently-To: std-unix-archive@uunet.uu.net
-
- From: Dominic Dunlop <domo@tsa.co.uk>
-
- Report on ISO/IEC JTC1/SC22/WG15 Rapporteur Group on
- Internationalization Meeting of 5th - 7th
- March, 1990, Copenhagen, Denmark
-
- Dominic Dunlop -- domo@tsa.co.uk
-
- The Standard Answer Ltd.
-
- Denmark. A small country which has tax rates so high that
- its five million inhabitants complain that, when they buy
- themselves a car, they have to buy one and a half cars for
- the government. Some part of that tax goes to fund Dansk
- Standardiseringraad (DS), the national standards body, which
- works hard to ensure that the needs of Danes are not
- overlooked when larger nations get together to write
- standards. DS has got its teeth into international
- standards for computers, and with good reason: we've been
- doing things wrong all along. We'll have to mend our ways
- if we are to produce standards which really fill
- international needs, even if we don't go as far as building
- in a framework which can easily accommodate Danish taxation.
-
- Metropolitan Chicago today has a population larger than that
- of Denmark. Imagine that you've just rebuilt the downtown
- area after the fire of 1871, only to have Alexander Graham
- Bell come along with the telephone, Edison deciding to
- generate electricity, and railroad companies starting to
- promote inter-urban lines. Your reaction might well be
- ``Oh, sh*t!'' All these innovations need new infrastructure
- -- cables and conduits and tunnels which you just hadn't
- known you'd need when you laid the roads, put up the
- buildings, and connected them to gas, water and drainage.
- As a result, competing telephone and electric companies
- string a tangle of wires from poles with little regard to
- safety and no regard for aesthetics or standardization,
- while elevated railways appear above existing roads, cutting
- off light at street level and filling upper floor rooms with
- smoke1. Only after many years of disruption, digging up
- streets and making holes in the walls of existing buildings
- would telephones, electricity and public transportation be
- safely hidden beneath the ground2, unseen, but playing an
- essential part in supporting the life of the city.
-
- __________
-
- 1. In 1887, the West Chicago Protective League complained
- ``... the proposed elevated road would materially and
- irreparably depreciate the value of real estate upon
- said streets... and render the dwellinghouses thereon
- unfit for private residences...''[1], but amid the kind
- of political maneuverings for which the city is justly
- famous, the ``L'' got built anyway.
-
- A descendant of Alexander Graham Bell's telephone company
- now supports the UNIX operating system out of Chicago. UNIX
- is a lot like the Chicago of the last century. We've got to
- the stage of unifying the major variants in the POSIX
- standards and the commercial System V, release 4, only to
- find that there is an increasing clamor for whole new
- infrastructures to support international needs, to improve
- security, and to show that the system is performing as
- billed. Suddenly, we've got to add features to handle these
- requirements, and we've got to try to do it while observing
- the three conflicting maxims of standardization: do it once,
- do it right, and do it now. What's more, we have to try to
- do it in a way which remains hidden: existing programs should
- not break, nor should they get noticeably bigger or slower.
-
- POSIX is not alone: those responsible for computer language
- standards face the same problems, and have also been the
- subject of constructive Danish criticism[2][3]. The Danes'
- long-standing interest makes it particularly appropriate
- that the first meeting of the ISO POSIX working group's
- special interest group on internationalization should be
- hosted by DS in Copenhagen. Internationalization is the
- process of removing cultural bias from a system, and then
- providing tools to allow system administrators to localize
- the system by adding a cultural bias of their own choosing.
- No wonder Dansk Standardiseringr}d -- sorry, Dansk
- Standardiseringraad -- is interested in this technology:
- its employees court a syntax error every time they type its
- name at the UNIX shell3. Internationalization will allow
- Danes to mold systems to their requirements, rather than
- having to rub along with implementation assumptions based on
- American practice.
-
- ____________________________________________________________
-
- 2. Well, in the case of Chicago, some of the public
- transportation. You can still ride the L.
-
- 3. ISO 646[4], the earliest ISO standard for information
- technology, is the international derivative of ASCII.
- Its Danish variant replaces ASCII's } with aa. Around
- the world, #$@[\]^`{|}~, all of which have a special
- meaning to the shell, are replaced by other characters
- in standards derived from ISO 646. See [5] for much
- more information.
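As a rough illustration of footnote 3, the following Python sketch shows how a line of shell text would look on a terminal using the Danish variant of ISO 646. The report itself only states the } to aa (å) replacement; the other five entries in the table are the commonly documented Danish national replacements and are an assumption here. The names DK_646 and as_seen_on_danish_terminal are hypothetical.

```python
# Code positions that the Danish ISO 646 variant gives to national letters.
# Only the "}" entry comes from the report; the rest are assumed.
DK_646 = str.maketrans({
    "[": "Æ", "\\": "Ø", "]": "Å",
    "{": "æ", "|": "ø", "}": "å",
})


def as_seen_on_danish_terminal(ascii_text):
    """Show how bytes meant as ASCII would display under the Danish variant."""
    return ascii_text.translate(DK_646)


if __name__ == "__main__":
    # A shell pipeline with a braced variable becomes hard to read.
    print(as_seen_on_danish_terminal("echo ${HOME} | wc -c"))
```

The same table read in the other direction explains the joke in the text: a Dane typing å sends the code that the shell reads as }.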
-
- The Japanese are interested too: their cultural differences
- make Denmark look close enough to the U.S.A. to be a fifty-
- first state! And the U.S.A. is interested because it has
- been charged by ISO with the production of ANSI standards
- base documents for the international POSIX standards, and
- wants them to reflect international needs. Denmark, Japan
- and the U.S.A. sent representatives to the
- internationalization meeting. There were also observers
- from EUUG/USENIX (myself), the IEEE's 1003.0 working group,
- and from an ISO study group which is grappling with the
- issues of character set use in computer languages.
-
- The official title of the POSIX internationalization group
- is the ISO/IEC JTC1/SC22/WG15 Rapporteur Group on
- Internationalization. (Few things in ISO's world have short
- names.) Just to explore some more of the jargon, a
- rapporteur is a technical expert nominated by a member body
- -- a national standards organization such as ANSI or DS
- -- to take an interest in a specialized aspect of a
- particular standards effort. WG15, the ISO POSIX working
- group, has rapporteur groups on security, conformance
- testing and internationalization. The security group met in
- January, in conjunction with the New Orleans meeting of the
- IEEE 1003.4 working group; the conformance test group, which
- corresponds to the IEEE 1003.3 effort, met in Copenhagen
- along with the internationalization group (although this
- report does not cover its meeting).
-
- Internationalization is peculiar in that, although the
- IEEE's POSIX standards are drafted with international needs
- in mind, there is no internationalization working group
- within the POSIX project. There is a study group which, as
- part of the 1003.0 ``POSIX Guide'' work, is trying to decide
- how to bring internationalization into the official
- structure, so that it can be given officers, schedules,
- terms of reference, and all those other good things which
- make us standards people feel safer. It's a big problem,
- because the issue really affects every aspect of POSIX --
- it just took a while to realize that it was an issue at all.
- Unlike -- say -- realtime extensions, security
- extensions, or transparent remote file access for POSIX,
- internationalization doesn't really make sense as an add-on
- to a basic operating system interface standard. Rather, the
- operating system and all its extensions need to be
- internationalized as a matter of course. Every other
- working group in the IEEE POSIX project is charged with
- producing a distinct standard, but it is difficult to see how
- a new group dealing with internationalization could be given
- such a goal.
-
- ISO has a similar problem, but it's worse because the
- organization has so many balls to keep in the air. If it is
- to apply the ``do it once'' and ``do it right'' maxims to
- internationalization, it seems clear that the issue must be
- handled near the top of Joint Technical Committee 1, the
- information technology standards group. After all, as well
- as computer languages and operating systems,
- internationalization affects communications, document
- standards, database and much more. ISO recently bit a
- similar bullet, establishing a new subcommittee (SC27)
- immediately below JTC1 to handle the security issues which
- are beginning to affect so much of its work. It may yet do
- the same with internationalization.
-
- The ``do it now'' criterion, on the other hand, argues in
- favor of addressing internationalization at a lower level
- -- doing the work in a new department, rather than going
- to the trouble of establishing a whole new division. SC22,
- which is responsible for language and operating system
- standards, is currently considering the setting up of a new
- working group at the same level as WG14 (C language), WG15
- (POSIX) and the rest. This proposal has run into
- opposition, both from those who say that the issue should be
- handled at a higher level, and from those who feel that
- there isn't an issue: after all, aren't ISO's standards
- supposed to be international anyway?
-
- Meanwhile, WG15 has established a subordinate group to
- handle internationalization at the lowest level possible.
- As somebody said at the meeting, ``You can't get much lower
- than us.'' We spent our time discussing what we were
- supposed to be doing -- and, equally important, what we
- could leave to others. In the end we came up with a little
- list:
-
- Terms of Reference
-
- The rapporteur group on internationalization
- (RIN) will study the aspects of
- internationalization related to POSIX and report
- its findings to SC22/WG15.
-
- (Bland, imposing no needless restrictions on
- what we can do.)
-
- Program of Work
-
- 1. Carry out survey to capture most of the
- requirements relevant to internationalization.
-
- (A job and a half. We have to search out users
- around the world, and persuade them to tell us
- what features they really want, rather than what
- they can put up with, or program their way
- around4.)
-
- 2. Identify and forward requirements with
- recommendations to WG15.
-
- (So WG15 gets to carry the can for us...)
-
- 3. Capture and collect national body profiles for
- reference.
-
- (Denmark and Japan have already done some work
- on ``profiles'' that customize POSIX to suit
- local needs. Their work suggests that current
- internationalization features are inadequate.)
-
- 4. Perform investigations as needed to advance the
- internationalization work of WG15.
-
- (We can poke our noses into anything that takes
- our fancy...)
-
- 5. Review, from an internationalization
- perspective, documents submitted to WG15 for
- review and comment.
-
- (We definitely get to poke our noses into
- anything that comes past WG15...)
-
- 6. Review, and evaluate impact on work of WG15 of,
- other documents relevant to internationalization
- circulated in JTC1 or its subcommittees.
-
- (And we'll try to get our hands on information
- from further afield.)
-
- That's a lot of work. It defines the function of our
- particular mill, but that mill still needs grist. That
- feedstock has to come from outside our group, and, because
- of our lowly position, we have to ask WG15 (``daddy'') to
- ask others to supply it. WG15, in turn, may have to refer
- some requests to higher authority: we want to be aware of
- anything which happens in SC22 which is relevant to POSIX
- internationalization -- for example, what the C language
- people in WG14 are up to. That involves going up another
- level in JTC1's hierarchy. Getting in touch with other
- subcommittees, such as SC2, which looks after character
- sets, potentially involves going right to the top of the
- bureaucracy. (Luckily, in this particular case, SC22's
- study group on character sets can stand in for SC2; see
- footnote 5.) Consequently, when WG15 next meets in Paris in
- June, it will have to deal with several resolutions concerned
- with turning on the taps and starting the information flow to
- the rapporteur group.
-
- __________
-
- 4. But we need to be a lot more diplomatic than asking
- ``What ticks you off most about these dumb American
- machines?'' -- although appeals to chauvinism have
- been known to achieve results...
-
- One of these taps is a little sticky: WG15 doesn't
- officially have a relationship with the IEEE's 1003.0 group,
- although it can, via ANSI, talk to 1003.1, 1003.2 and 1003.4
- through 1003.9. The problem is that 1003.0 deals with
- profiles, baskets of standards which, when brought together,
- solve particular classes of problem -- for example, those
- of transaction processing, realtime or batch-oriented
- systems. Profiles are outside the scope of the ISO POSIX
- effort, so we can't officially talk to 1003.0, even though
- its study group is currently holding the baton on
- internationalization. Never mind. We'll do things
- unofficially until some official pathway is sorted out.
-
- Apart from all this organizational stuff, we did review some
- existing documents. For example, DTR (draft technical
- report) 10176, a product of SC14, discusses the treatment of
- characters appearing in language constructs, variable names,
- literals and comments, and turns out to have implications
- for sh, awk, yacc and the other ``little languages'' defined
- in DP 9945-2, the forthcoming international standard for the
- shell and tools. And a document from SC22's study group on
- character sets suggests that source files should have some
- means of announcing the character set that they're using.
- Could this mean typed files or resource forks for POSIX6?
- Gee. How would we hide that?
-
- __________
-
- 5. SC2's answer to life, the universe and everything is DP
- (draft proposal) 10646, which defines a 32-bit wide
- character set with 8- and 16-bit wide canonical versions
- for storage and transmission, and a 24-bit wide
- processing version for those who can get by with only
- eight million characters or so. As it's still at the DP
- level, it'll be a long time before it hits the streets,
- and, even when it does, there's the little matter of
- getting people to use it...
-
- The group next meets in Paris on June 11th and 12th, just
- before the WG15 meeting. If you want to come along, you
- have to persuade your national standards body firstly that
- you're a technical expert on POSIX, and then that they
- should appoint you as internationalization rapporteur. This
- may be surprisingly easy -- considerably simpler, for
- example, than getting somebody to fund your trip. To quote
- from [8], ``...standards committees would be hard-pressed to
- find people who participate on their voluntary committees
- with purely rational-economic expectations. Standards
- committees seem bent on justifying their existences by using
- hard data to prove that standards are good, yet they persist
- in using altruistic appeals to attract committee members.''
- If you feel like responding to the altruistic appeal of this
- article, contact me by electronic mail.
-
- Alternatively, if you're a European, you can remain seated
- in front of your terminal and participate in a news forum on
- ISO 646 and all that: Keld Simonsen of the Danish UNIX
- Users' Group has volunteered to initiate a discussion of the
- European perspective on character sets for POSIX. Denmark
- may be small, but it's certainly making its voice heard on
- this issue!
-
- ____________________________________________________________
-
- 6. UNIX' elegant and flavorless files have already taken a
- beating from X3.159, the ANSI C standard[6], since other
- operating systems tend to support filing schemes which
- are merely tasteless[7].
-
- References
-
- 1. Brian J. Cudahy, Destination Loop, Stephen Greene
- Press/Viking Penguin (1982)
-
- 2. P. J. Plauger, Quiet Changes, Part I, The C Users
- Journal, vol. 8, no. 2 (February, 1990), pp 9-16.
-
- 3. Keld Simonsen, A European Representation for ISO C,
- European UNIX systems User Group Newsletter, vol. 9, no.
- 2 (Summer 1989), pp 15-18
-
- 4. ISO 646:1983, Information processing -- ISO 7-bit code
- character set for information interchange
-
- 5. Keld Simonsen, An extension to the troff character set
- for Europe, European UNIX systems User Group Newsletter,
- vol. 9, no. 2 (Summer 1989), pp 2-14
-
- 6. ANSI X3.159, 1989, Programming Language C
-
- 7. P. J. Plauger, Evolution of the C I/O Model, The C Users
- Journal, vol. 7, no. 6 (August, 1989), pp 17-25.
-
- 8. Carl F. Cargill, Information Technology Standardization:
- Theory, Process and Organizations, Digital Press (1989)
-
-
- Volume-Number: Volume 18, Number 68
-
- shar.wg15.14933
- echo wg15.a
- cat >wg15.a <<'shar.wg15.a.14933'
- From jsq@longway.tic.com Thu Mar 15 13:05:08 1990
- Received: from cs.utexas.edu by uunet.uu.net (5.61/1.14) with SMTP
- id AA08775; Thu, 15 Mar 90 13:05:08 -0500
- Posted-Date: 15 Mar 90 02:43:27 GMT
- Received: by cs.utexas.edu (5.59/1.52)
- id AA12139; Thu, 15 Mar 90 12:01:42 CST
- Received: by longway.tic.com (4.22/4.16)
- id AA14670; Thu, 15 Mar 90 10:44:00 cst
- From: Randall Atkinson <uvaarpa.virginia.edu!randall@longway.tic.com>
- Newsgroups: comp.std.unix
- Subject: Re: Report on WG15 Rapporteur Group
- Message-Id: <561@longway.TIC.COM>
- References: <556@longway.TIC.COM>
- Sender: std-unix@longway.tic.com
- Reply-To: randall@uvaarpa.virginia.edu (Randall Atkinson)
- Organization: University of Virginia, Charlottesville
- Date: 15 Mar 90 02:43:27 GMT
- Apparently-To: std-unix-archive@uunet.uu.net
-
- From: randall@uvaarpa.virginia.edu (Randall Atkinson)
-
- As one who is fairly active in the multilingual computing
- side of things, I'm fairly certain that it just isn't worth
- it to try to make ISO 646 the basis of *anything* for the
- practical reason that it wasn't well thought out to begin with
- and has already been superseded by the ISO 8859/* family of
- 8-bit character sets.
-
- The latter fully support European linguistic needs (yes, including
- Danish and Icelandic and ...) and can be used quite nicely with
- most UNIX shells that I'm familiar with.
-
- I thought that trigraphs got excessive attention back when ANSI C
- was being developed and I fear that excessive attention will be
- devoted to ISO 646 when there are other areas of internationalisation
- that really deserve being thought about and solved cleanly.
-
- Most of the vendors of hardware in Europe are supporting ISO 8859/1
- now, so it is the real long term solution to European needs anyway.
- Worrying about support for ISO 646 is a mistake; worrying about
- supporting ISO 8859/*, about full support for the larger character
- sets that Asian languages need, and about ways of handling date
- formats and such isn't a mistake at all.
-
- Volume-Number: Volume 18, Number 73
-
- shar.wg15.a.14933
- echo wg15.b
- cat >wg15.b <<'shar.wg15.b.14933'
- From jsq@longway.tic.com Fri Mar 16 22:43:46 1990
- Received: from cs.utexas.edu by uunet.uu.net (5.61/1.14) with SMTP
- id AA20308; Fri, 16 Mar 90 22:43:46 -0500
- Posted-Date: 16 Mar 90 22:44:27 GMT
- Received: by cs.utexas.edu (5.59/1.52)
- id AA19884; Fri, 16 Mar 90 21:43:08 CST
- Received: by longway.tic.com (4.22/4.16)
- id AA02441; Fri, 16 Mar 90 21:18:51 cst
- From: Marius Olafsson <rhi.hi.is!marius@longway.tic.com>
- Newsgroups: comp.std.unix
- Subject: Re: Report on WG15 Rapporteur Group
- Message-Id: <565@longway.TIC.COM>
- References: <556@longway.TIC.COM> <561@longway.TIC.COM>
- Sender: std-unix@longway.tic.com
- Reply-To: std-unix@uunet.uu.net
- Organization: University of Iceland
- Date: 16 Mar 90 22:44:27 GMT
- Apparently-To: std-unix-archive@uunet.uu.net
-
- From: marius@rhi.hi.is (Marius Olafsson)
-
- randall@uvaarpa.virginia.edu (Randall Atkinson) writes:
-
- > I'm fairly certain that it just isn't worth
- >it to try to make ISO 646 the basis of *anything* for the
- >practical reason that it wasn't well thought out to begin with
- >and has already been superseded by the ISO 8859/* family of
- >8-bit character sets.
-
- I agree. The ISO 8859 series of character sets has the (in my opinion
- necessary) quality that the *complete* set of ASCII characters can be
- represented. If ISO 646 is taken into consideration, must we then
- allow alternate syntax in the various shells and utilities that make
- use of the characters {}[]@\| and `? I think that is a can of worms
- best left unopened.
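The point about ISO 8859 keeping the complete ASCII set can be checked with a short Python sketch. It relies only on Python's standard "iso-8859-1" codec; the names SHELL_SPECIALS, NATIONAL_LETTERS and latin1_codes are hypothetical, and the choice of sample letters is an assumption made for illustration.

```python
# Shell metacharacters listed in the report, and some Danish/Icelandic letters.
SHELL_SPECIALS = "#$@[\\]^`{|}~"
NATIONAL_LETTERS = "æøåÆØÅþðÞÐ"


def latin1_codes(text):
    """Return the ISO 8859-1 code of each character in text."""
    return list(text.encode("iso-8859-1"))


if __name__ == "__main__":
    # ASCII characters keep their ASCII codes, so shell syntax is untouched.
    print(latin1_codes(SHELL_SPECIALS) == list(SHELL_SPECIALS.encode("ascii")))
    # The national letters all sit in the upper half of the 8-bit range.
    print(all(code > 127 for code in latin1_codes(NATIONAL_LETTERS)))
```

Because no national letter displaces an ASCII code, { and å can appear in the same file, which is exactly what the 7-bit ISO 646 variants cannot offer.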
-
- >The latter fully support European linguistic needs (yes, including
- >Danish and Icelandic and ...) and can be used quite nicely with
- >most UNIX shells that I'm familiar with.
-
- And it seems that most major manufacturers already have (or have announced)
- support for ISO 8859: at least HP-UX, Ultrix, AIX, SunOS and
- more, I am sure. The X Window System now supports ISO 8859 fonts, the
- latest Adobe release of PostScript supports ISO 8859 encoding of fonts,
- and the list goes on ... NONE provide any support for or consideration
- of ISO 646 (fortunately).
-
-
- > I fear that excessive attention will be
- >devoted to ISO 646 when there are other areas of internationalisation
- >that really deserve being thought about and solved cleanly.
-
- Definitely, and serious consideration should be given to the way X/Open
- has defined some of these other areas. That system actually works pretty
- well in practice. It has been used here for about two years (on HP-UX).
-
- --
- Marius Olafsson internet: marius@rhi.hi.is
- University of Iceland UUCP: {mcsun,sunic,uunet}!isgate!rhi!marius
-
-
- Volume-Number: Volume 18, Number 77
-
- shar.wg15.b.14933
- echo wg15.c
- cat >wg15.c <<'shar.wg15.c.14933'
- From jsq@longway.tic.com Sat Mar 17 16:45:25 1990
- Received: from cs.utexas.edu by uunet.uu.net (5.61/1.14) with SMTP
- id AA16941; Sat, 17 Mar 90 16:45:25 -0500
- Posted-Date: 16 Mar 90 23:35:09 GMT
- Received: by cs.utexas.edu (5.59/1.52)
- id AA17980; Sat, 17 Mar 90 15:45:08 CST
- Received: by longway.tic.com (4.22/4.16)
- id AA04625; Sat, 17 Mar 90 14:06:42 cst
- From: David Wheeler <ida.org!wheeler@longway.tic.com>
- Newsgroups: comp.std.unix
- Subject: Re: Report on WG15 Rapporteur Group
- Message-Id: <568@longway.TIC.COM>
- Sender: std-unix@longway.tic.com
- Reply-To: std-unix@uunet.uu.net
- Date: 16 Mar 90 23:35:09 GMT
- Apparently-To: std-unix-archive@uunet.uu.net
-
- From: wheeler@ida.org (David Wheeler)
-
- domo@tsa.co.uk (Dominic Dunlop):
- = From: Dominic Dunlop <domo@tsa.co.uk>
- =
- = Report on ISO/IEEE JTC1/SC22/WG15 Rapporteur Group on
- = Internationalization Meeting of 5th - 7th
- = March, 1990, Copenhagen, Denmark
- =
- = Dominic Dunlop -- domo@tsa.co.uk
- =
- = The Standard Answer Ltd.
- =
-
- I enjoyed your posting, thank you! You included a lot of "what this
- phrase really means" that I appreciated.
-
- =
- = 3. ISO 646[4], the earliest ISO standard for information
- = technology, is the international derivative of ASCII.
- = Its Danish variant replaces ASCII's } with aa. Around
- = the world, #$@[\]^`{|}~, all of which have a special
- = meaning to the shell, are replaced by other characters
- = in standards derived from ISO 646. See [5] for much
- = more information.
- =
-
- Isn't there an 8-bit standard character set that defines the first 128
- characters as a standard set (say as USASCII, provincial I'm afraid but it
- would break no Unix tools), then includes all the international
- characters as those with values > 127? If this were used in the POSIX
- standard, wouldn't this solve many problems for those using a
- Latin-based alphabet? Or is this standard unused in the real world?
- Admittedly this eliminates the non-Latin alphabet world, and that
- is a weakness.
-
- = Apart from all this organizational stuff, we did review some
- = existing documents. For example, DTR (draft technical
- = report) 10176, a product of SC14, discusses the treatment of
- = characters appearing in language constructs, variable names,
- = literals and comments, and turns out to have implications
- = for sh, awk, yacc and the other ``little languages'' defined
- = in DP 9945-2, the forthcoming international standard for the
- = shell and tools. And a document from SC22's study group on
- = character sets suggests that source files should have some
- = means of announcing the character set that they're using.
- = Could this mean typed files or resource forks for POSIX6?
- = Gee. How would we hide that?
- =
-
- Some C programs would have to be fixed to deal with signed characters
- but at least the rules would be simple: 128+ are ordinary characters &
- can be used in identifiers, etc.
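The signed-character problem mentioned above can be sketched in Python, which has no char type of its own: the struct module's "b" format reads a byte the way a C implementation with signed plain char would. This is only an illustration of the arithmetic, not of any particular compiler, and the helper names as_signed_char and as_unsigned_char are hypothetical.

```python
import struct


def as_signed_char(byte_value):
    """Interpret an 8-bit value the way a signed C char would hold it."""
    return struct.unpack("b", bytes([byte_value]))[0]


def as_unsigned_char(signed_value):
    """Recover the 0..255 code, as a cast to unsigned char would in C."""
    return signed_value & 0xFF


if __name__ == "__main__":
    code = "å".encode("iso-8859-1")[0]   # 0xE5 in ISO 8859-1
    signed = as_signed_char(code)
    # A negative value is what breaks table lookups indexed by a plain char.
    print(code, signed, as_unsigned_char(signed))
```

Codes below 128 come out the same either way, which is why programs that only ever saw ASCII never noticed the difference.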
-
- Source file tagging for language sounds like an abomination!
-
- --- David A. Wheeler
- wheeler@ida.org
-
-
- Volume-Number: Volume 18, Number 80
-
- shar.wg15.c.14933
- exit
-