- From: Doug Gwyn (VLD/VMB) <gwyn@BRL.MIL>
-
- [ This was originally written as a letter to Dominic, but Doug
- agreed it would make a good comp.std.unix posting. -mod ]
-
- While I don't have any real problem with your use of quotations from
- my net posting, I do have a couple of comments on other things you said:
-
- The ballot also produced a number of suggestions in the area
- of internationalization, such as how to handle (and indeed,
- how to refer to) wide, or multi-byte, characters.
-
- For 1003.1, this is pretty straightforward. The C requirements on such
- character encodings are such that mbc strings can still be handled as
- uninterpreted NUL-terminated arrays of char. In the default "C" locale,
- a certain minimum set of characters must be represented, which permits
- the construction of portable filename strings. Even in the "C" locale,
- other characters are permitted, so for example a command-line argument
- containing "funny characters" can be used directly as an argument to
- open() etc. I know that there are various vendor approaches that make
- locales more visible to the operating system, but after all this is UNIX
- we're talking about, and one of the main lessons of UNIX is that the
- operating system can be designed to be happily oblivious to the uses to
- which people put the information that it manages according to simple rules.
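-
- A minimal sketch of the point above, assuming a POSIX system: a filename
- is treated purely as a NUL-terminated array of char and handed to open()
- without any locale-aware interpretation. The helper name open_in_dir and
- the 1024-byte buffer are illustrative assumptions, not part of 1003.1.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: build "dir/name" by copying bytes verbatim and
 * open the result read-only. Nothing here inspects the bytes of `name`
 * beyond finding its terminating NUL, so multibyte sequences pass
 * through untouched. Returns a file descriptor, or -1 on failure.
 */
int open_in_dir(const char *dir, const char *name)
{
    char path[1024];            /* assumed limit for this sketch */
    size_t dlen = strlen(dir);
    size_t nlen = strlen(name);

    if (dlen + 1 + nlen + 1 > sizeof path)
        return -1;              /* would not fit, refuse rather than truncate */

    memcpy(path, dir, dlen);
    path[dlen] = '/';
    memcpy(path + dlen + 1, name, nlen + 1);   /* includes the NUL */

    return open(path, O_RDONLY);
}
```

- The only bytes with special meaning along the way are NUL and '/', which
- is why the same code works whatever encoding the name happens to use.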
-
- I first got involved in "internationalization" issues when I attended a
- BOF meeting at which the "expert" who was giving the presentation was
- explaining how complex the character set issues were, and when I said
- that I didn't see any inherent complexity, I was berated for my naivety.
- Years later, after studying the issues and conversing with the folks
- actively working in the field, I still maintain that simple solutions
- are possible. Unfortunately, vendors such as H-P started out with
- complicated schemes and have continued to think in those terms. This
- rubbed off on X3J11 when the multibyte character approach was adopted,
- which has the obvious problem that anyone programming for an
- international environment MUST change from traditional use of C strings
- to mbc arrays in his applications. The Japanese recognize this as an
- essential feature of their "long char" proposal, which X3J11 did NOT
- intend the mbc approach to be -- however, the fundamental need for
- library support using any such approach has now led to the Japanese
- requesting that such changes be made for the ISO C standard. I think
- the arguments I used for my alternative proposal to address these very
- concerns are being borne out, in spades.
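-
- To illustrate the kind of change the multibyte approach asks of an
- application, here is a rough sketch that counts characters rather than
- bytes using the Standard C function mblen(). The function name
- mb_char_count is an assumption made for this example; the result depends
- on the locale in effect when it is called.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: count the characters in a multibyte string under
 * the current locale. strlen() alone would report bytes, not characters.
 * Returns -1 if an invalid multibyte sequence is encountered.
 */
long mb_char_count(const char *s)
{
    long count = 0;
    size_t left = strlen(s);    /* bytes remaining */

    mblen(NULL, 0);             /* reset any shift state */

    while (left > 0) {
        int n = mblen(s, left); /* bytes in the next character */
        if (n < 0)
            return -1;          /* invalid sequence in this locale */
        if (n == 0)
            break;              /* reached the terminating NUL */
        s += n;
        left -= (size_t)n;
        count++;
    }
    return count;
}
```

- Code that indexes, truncates, or measures text by character has to be
- reworked along these lines, whereas code that merely stores and passes
- the bytes along needs no change.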
-
- Returning to the matter of the programming language used for
- bindings, it is true that AT&T-derived UNIX implementations
- prefer a diet of C data types. However, it certainly was an
- aim of 1003.1 to allow hosted POSIX implementations, which
- might well be riding on underlying operating systems with
- entirely different tastes.
-
- To the contrary, we discussed this very matter in 1003.1 and decided
- that, while we did not wish to preclude layered implementations, we
- would not make any compromises to accommodate them. Very definitely
- our goal was to develop standards for genuine UNIX variants, not to
- provide a "Software Tools" style of Portable Operating System environment.
-
- We used the same argument when we decided that NFS was simply going to
- have to be ruled non-compliant. UNIX applications rely on certain
- semantics of the file system that NFS did not properly support, and we
- decided that it would be a disservice to UNIX applications to remove
- the requirement that these useful semantics be preserved.
-
- Volume-Number: Volume 20, Number 115
-
-