- Path: sparky!uunet!uunet!not-for-mail
- From: gwyn@smoke.brl.mil (Doug Gwyn)
- Newsgroups: comp.std.unix
- Subject: Re: POSIX update
- Date: 19 Aug 1992 12:59:15 -0700
- Organization: U.S. Army Ballistic Research Laboratory, APG, MD.
- Lines: 61
- Sender: sef@ftp.UU.NET
- Approved: sef@ftp.uucp (Moderator, Sean Eric Fagan)
- Message-ID: <16u96jINNlej@ftp.UU.NET>
- References: <16b9drINNero@ftp.UU.NET> <16ccqfINNrss@ftp.UU.NET> <16h5a2INNko4@ftp.UU.NET>
- NNTP-Posting-Host: ftp.uu.net
- X-Submissions: std-unix@uunet.uu.net
-
- Submitted-by: gwyn@smoke.brl.mil (Doug Gwyn)
-
- In article <16h5a2INNko4@ftp.UU.NET> peterw@spaten.sharebase.com (Peter Wisnovsky) writes:
- >... my impression of the potential problem is that if the standards
- >orgs went ahead and used wchar_t as the vehicle for supporting Unicode,
- >then existing conformant implementations that define wchar_t to be an
- >8-bit or 32-bit value would be made non-conformant. NT is not
- >non-conformant in itself...but the solution it proposes to the storage
- >of Unicode data, that is, the fixing of the size of wchar_t to be
- >16 bits wide, would not work with existing conformant systems.
-
- This represents a misunderstanding of the role of standards and of wchar_t
- in particular. The C standard requires that a conforming implementation
- define a wchar_t type for the internal representation of whatever
- multibyte character sequences it chooses to support; however, it does not
- mandate that any particular multibyte encoding be supported. A vendor may
- conform to the C standard while not supporting Unicode, for example. It
- is left as an implementation "marketing decision" just how extensive the
- multibyte support will be. Certainly, an implementation that chose to
- ignore genuine extended character sets and define wchar_t as an 8-bit
- type will not be able to support 16-bit Unicode at the same time. If and
- when such implementations are revised to properly support extended
- character sets, they will have to make wchar_t at least 16 bits, as most
- international implementations of Standard C already do. Since wchar_t is
- an internal data type it is not really relevant to issues of data
- interchange among different systems; that does not lie within the scope
- of the programming language standard.
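
As a minimal sketch of the model described above (multibyte sequences as the external form, wchar_t as the internal form), the standard library function mbstowcs() performs the conversion under the current locale. The helper name to_wide and its error convention are hypothetical, chosen only for illustration; which encodings are accepted depends entirely on the implementation and locale.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: convert a multibyte string (external form, in the
 * current locale's encoding) into wchar_t elements (internal form).
 * Returns the number of wide characters stored, not counting the
 * terminating null, or (size_t)-1 if the input contains an invalid
 * multibyte sequence or the result does not fit in `cap` elements. */
size_t to_wide(const char *mb, wchar_t *out, size_t cap)
{
    size_t n;

    assert(mb != NULL && out != NULL && cap > 0);

    n = mbstowcs(out, mb, cap);
    if (n == (size_t)-1)
        return n;              /* invalid multibyte sequence */
    if (n == cap)
        return (size_t)-1;     /* no room left for the terminating null */
    return n;
}
```

In the default "C" locale, a call such as to_wide("abc", buf, 8) should store three wide characters; a program that wants a richer encoding would first select a locale with setlocale(), assuming the implementation provides one.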
-
- >Two proposals were discussed at the Unicode conference wrt XPG. One
- >would be to have the standards changed so that wchar_t would be
- >defined to be 16-bits wide; the other would be to create a new
- >datatype, `unichar'. The sentiment at the conference was to create a
- >new datatype.
-
- If that accurately reflects the discussion, then it merely serves to
- confirm the widespread impression that the Unicode proponents understand
- neither the existing standards nor the magnitude of the effects of
- adding new data types to existing languages. It is already an easy
- matter, for purposes of conformance with other standards, to add
- non-conflicting requirements (such as at least 16 bits in wchar_t)
- beyond base standards. For example, POSIX.1 adds library requirements
- beyond those specified in the C standard. There is no need to "change"
- the base standards in such cases.
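
A program that depends on such an additional requirement can verify it without any change to the base standard. The following is one possible sketch, not a prescribed method: it uses the negative-array-size idiom to fail compilation when wchar_t is narrower than 16 bits. The names wchar_has_16_bits and wchar_bits are hypothetical, and sizeof times CHAR_BIT measures storage width, which could include padding bits on an unusual implementation.

```c
#include <limits.h>
#include <stddef.h>

/* Compile-time check: the array size becomes -1, which is a constraint
 * violation, if wchar_t occupies fewer than 16 bits of storage. */
typedef char wchar_has_16_bits[(sizeof(wchar_t) * CHAR_BIT >= 16) ? 1 : -1];

/* Hypothetical helper: storage width of wchar_t in bits on this
 * implementation. */
unsigned wchar_bits(void)
{
    return (unsigned)(sizeof(wchar_t) * CHAR_BIT);
}
```

On an implementation that defines wchar_t as an 8-bit type, this translation unit would be rejected at compile time rather than misbehaving at run time.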
-
- Standard C's multibyte character support was designed in close
- consultation with many individuals and organizations who had long been
- involved in "internationalization" issues. ITSCJ particularly comes
- to mind, and they have continued to work on improvements within the C
- multibyte character model. Originally they too suggested a separate
- data type for "long" character encodings, and I made an alternative
- suggestion that also introduced a new data type (my type would have
- been used for sub-character bytes, however, reserving "char" for the
- sole character type, thus immensely simplifying programming). After
- considerable debate and several committee and working subgroup meetings,
- consensus was reached on the multibyte external sequence/wchar_t
- internal encoding approach. It would behoove the Unicode proponents to
- fully understand those deliberations and the resulting design before
- they further bollix up the works.
-
-
- Volume-Number: Volume 29, Number 7
-