- Path: sparky!uunet!gatech!destroyer!gumby!yale!mintaka.lcs.mit.edu!ai-lab!wheat-chex!glenn
- From: glenn@wheat-chex.ai.mit.edu (Glenn A. Adams)
- Newsgroups: comp.std.internat
- Subject: Re: European characters (was 8-bit news)
- Date: 23 Jan 1993 17:02:28 GMT
- Organization: MIT Artificial Intelligence Laboratory
- Lines: 51
- Message-ID: <1jrtn4INN2b1@life.ai.mit.edu>
- References: <1993Jan21.005656.25514@newstand.syr.edu> <2261@blue.cis.pitt.edu> <1993Jan22.122553.4823W@lumina.edb.tih.no>
- NNTP-Posting-Host: wheat-chex.ai.mit.edu
-
- In article <1993Jan22.122553.4823W@lumina.edb.tih.no> ketil@edb.tih.no (Ketil Albertsen,TIH) writes:
- >10646/1 is supposedly a 32-bit identification scheme intended to
- >cover just about any printable symbol in the world, while 10646/2, or Unicode,
- >is a subset using 16 bit codes for text symbols (characters or Asian-
- >style symbols).
-
- ISO/IEC 10646-1:1993 (to be published in the first or second quarter of this
- year) defines one repertoire of characters and two encoding forms.
- The encoding forms are called UCS4 (Universal Character Set 4-Octet)
- and UCS2 (Universal Character Set 2-Octet), the former interpreted as 32-bit
- unsigned integers, the latter as 16-bit unsigned integers. The two forms are
- related by zero extension: zero extending a 16-bit UCS2 value to 32 bits
- yields the equivalent UCS4 value.
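
The zero-extension relationship can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and the big-endian octet order is an assumption made for the example, not something stated in this article.

```python
import struct


def ucs2_bytes_to_ucs4_bytes(data: bytes) -> bytes:
    """Widen a UCS2 octet stream to UCS4 by zero extension.

    Assumes big-endian octet order (an assumption for this sketch).
    """
    if len(data) % 2 != 0:
        raise ValueError("UCS2 data must be a whole number of 2-octet units")
    count = len(data) // 2
    # Read each 2-octet unit as a 16-bit unsigned integer.
    units = struct.unpack(f">{count}H", data)
    # Write each value back as a 32-bit unsigned integer; the numeric
    # value is unchanged, only two leading zero octets are added.
    return struct.pack(f">{count}I", *units)


if __name__ == "__main__":
    ucs2 = bytes([0x00, 0x41, 0x4E, 0x2D])  # code values 0x0041 and 0x4E2D
    print(ucs2_bytes_to_ucs4_bytes(ucs2).hex())
```

Since the numeric value of each code is the same in both forms, the conversion never needs a lookup table; it only changes the width of each unit.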
-
- No characters are currently assigned to codepoints (bit combinations) outside
- of UCS2 (also called the Basic Multilingual Plane or BMP for short). Out
- of the 65,536 distinct bit combinations in UCS2, 34,168 are assigned to
- characters, 6,467 are reserved, and 24,901 are available for future assignment.
-
- The UCS2 encoding space is divided into sections, whose contents are
- characterized below:
-
- A-Zone (Alphabetic) (11,892 assigned, 65 reserved, 8,011 available)
-
- Alphabets
- Hangul Jamo Alphabet
- Latin & Greek Precombined Forms
- Symbols
- CJK Auxiliaries
- Hangul Precombined Syllable Forms
-
- I-Zone (Ideographic) (20,902 assigned, 0 reserved, 90 available)
-
- CJK Unified Ideographs
-
- O-Zone (Open) (0 assigned, 0 reserved, 16,384 available)
-
- Unassigned
-
- R-Zone (Restricted) (1,374 assigned, 6,402 reserved, 416 available)
-
- Private Use Area
- CJK Ideograph Compatibility
- Presentation Compatibility
- Other Compatibility & Specials
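
As a quick cross-check, the per-zone counts above can be summed and compared with the totals quoted for UCS2 as a whole. The sketch below only re-adds the figures given in this article; the names `ZONES` and `totals` are made up for the example.

```python
# (assigned, reserved, available) per zone, copied from the figures above.
ZONES = {
    "A": (11892, 65, 8011),
    "I": (20902, 0, 90),
    "O": (0, 0, 16384),
    "R": (1374, 6402, 416),
}


def totals(zones):
    """Return the (assigned, reserved, available) sums over all zones."""
    assigned = sum(counts[0] for counts in zones.values())
    reserved = sum(counts[1] for counts in zones.values())
    available = sum(counts[2] for counts in zones.values())
    return assigned, reserved, available


if __name__ == "__main__":
    assigned, reserved, available = totals(ZONES)
    # The three sums should account for every 16-bit combination.
    print(assigned, reserved, available, assigned + reserved + available)
```

The sums come to 34,168 assigned, 6,467 reserved, and 24,901 available, which together make up all 65,536 bit combinations.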
-
- Glenn Adams
-