OS/2 Shareware BBS: 19 Printer

19-Printer.zip / princp10.zip / ULSUTIL.INF

OS/2 Help File | 1995-10-04 | 77KB | 2,574 lines

═══ 1. ULSUTIL - Codepage and Keyboard Utilities ═══

ULSUTIL - Codepage and Keyboard Utilities
OS/2 Warp Connect [PowerPC Edition]

Ken Borgendale
OS/2 Architecture
Version 1.0 - Beta

OS/2 Warp Connect [PowerPC Edition] provides a large variety of codepages and keyboards to allow it to be used throughout the world. This support was added in such a way that additional codepages and keyboards can easily be added to the system.

This package provides the utilities, sample code, headers, libraries, and sample files necessary to create internationalization objects for OS/2 for PowerPC. All of the code in this package runs on OS/2 on the PC. The code will be available running on the PowerPC soon after OS/2 for the PowerPC becomes available.

The utilities provided in the package are:

    uconvdef    Define a codepage
    uconvexp    Expand a codepage
    printcp     PostScript print of codepage
    makekb      Define a keyboard
    printkb     PostScript print of keyboard

These utilities require several files to be in the current directory (or on the PATH). These files are:

    printcp.psh     PostScript header for printcp
    printkb.dlf     Fonts for printkb and printcp
    printkb.psh     PostScript header for printkb
    printkb.sft     Shift name tables
    scancode.nam    Scancode names for makekb
    unicode.nam     Unicode names (used by several utilities)

Several libraries that use the objects are included. There are also several test cases shipped as source examples, along with the header files necessary to compile them.

    uniconv.dll     Unicode conversion (and iconv) library
    unikbd.dll      Keyboard library
    uctest.c        Unicode sample and test
    kbdtest.c       Keyboard sample and test
    scantest        Simple test file for kbdtest
    uconv.h         Unicode conversion header
    iconv.h         iconv function prototypes
    unikbd.h        Keyboard prototypes
    ulserrno.h      Error codes for ULS
    callconv.h      Calling convention
    makefile        Makefile for test cases

The following sample files are provided in subdirectories:

    codepage    Codepages in binary form
    keyboard    Keyboards in source form

Note: If you plan to use the Unicode Conversion prototypes, update your CONFIG.SYS file to add the following statement:

    set ulspath=D:\ulsutil\codepage

where 'D' is the drive on which you installed the ULS Utilities.

═══ 2. Codepage Overview ═══

A codepage is an object used to define the encoding of characters. In OS/2 for PowerPC, characters are defined in terms of Unicode, a 16-bit representation that includes characters from all countries in the world. A codepage is defined as a mapping to and from Unicode.

Various people use other terms for codepage. These include Coded Character Set, Unicode conversion object, Encoding Vector, and Symbol Set. All of these are essentially the same. Within IBM, the term Coded Character Set is used for an object made up of one or more codepages. This is used for Asian language mixed single/double byte encoding: the single-byte portion is actually one codepage, and the double-byte portion another codepage. In OS/2 these are all simply referred to as codepages.

In OS/2, a codepage is indicated by a numeric value from the IBM registry of Codepages or Coded Character Sets. The Unicode Conversion Object (uconv) used to implement this codepage is given as the string "IBM-" followed by the decimal number of the codepage without leading zeros. For example:

    IBM-850
    IBM-37
    IBM-1200

The files containing the uconv objects are stored in the "/language/codepage" directory on the boot drive.
Such files stored at any other location cannot be used as system codepages (they can be used for testing or other purposes). The file names are the same as the object names except that the hyphen (-) character is dropped from the name. This is done because of the file-name restrictions of the CD-ROM file system.

═══ 3. Keyboard Overview ═══

A keyboard is an object used to define the mapping of key presses on the keyboard to characters and virtual functions. The keyboard maintains a state which includes shift states, lock states, and dead keys.

In OS/2 for PowerPC, the keyboard layout definition is independent of codepage because the character translations are defined using Unicode. However, in most cases an appropriate codepage must be selected in order to make a keyboard useful.

The keyboard logic uses a state machine which is maintained externally to the keyboard translation logic. The input to the keyboard logic is:

    shift       The shift state
    scancode    The key defined as the PM scancode
    action      Make, break, or repeat

The output from the keyboard translation is:

    shift       The updated shift state
    unichar     The Unicode character
    vdkey       The virtual or dead key
    biosscan    The BIOS (translated) scancode

═══ 4. UCONVDEF - Define a codepage ═══

The uconvdef utility converts a codepage in source format to a codepage in binary format (uconv table object).

Note: This is version 2 of the uconvdef utility. Version 2 fully supports the input syntax of version 1, but not all functionality is available without using new entries. Version 1 syntax is used to allow portability with other operating systems which support uconv objects.

uconvdef takes two fixed parameters and a set of switches that start with a hyphen (-). The options can be specified at any point in the command. File names may use either slash (/) or backslash (\) as a path separator.

    uconvdef sourcefile [outfile] -options

The first parameter is the name of the uconv source file to convert. This is the path name to the file. This parameter is required.

The second parameter gives the name of the output file. If this is not given, it defaults to the input file name with the file extension .UCF.

There are several option switches:

    -q         Quiet - Do not show informational messages
    -v         Verbose - Show extra information for debugging
    -x:file    File name of extra source file

For compatibility with the version 1 uconvdef command, you can specify a "-f" parameter before the first file name.

Examples of calling uconvdef are:

    uconvdef ibm-850.ucmap
    uconvdef ibm-850.ucmap ibm850 -v
    uconvdef ibm-942.ucmap -x os2cp.ucm
    uconvdef -f ibm-850 ibm850

═══ 4.1. Uconv Source Syntax ═══

The source file for uconvdef consists of a series of lines. Blank lines and lines starting with a comment are ignored. Other lines contain a keyword and operands. Lines are limited to 1024 bytes.

The comment character (#) may be used to indicate the start of a comment on a line. All characters after the comment character are ignored.

Most keywords are enclosed in a '<' and '>' pair. The exceptions are CHARMAP and END CHARMAP, which start and end a section.

Parameters can be of four types: strings, numbers, hex strings, and ranges of hex strings.

┌────────────┬────────────────────────────────────────┬───────────────┐
│string      │A set of characters surrounded by double│"string"       │
│            │quote characters                        │               │
├────────────┼────────────────────────────────────────┼───────────────┤
│number      │An unsigned decimal number              │1              │
├────────────┼────────────────────────────────────────┼───────────────┤
│hex string  │A set of values where each byte is given│\xFC\xe5       │
│            │as backslash x followed by 2 hex digits │               │
├────────────┼────────────────────────────────────────┼───────────────┤
│range       │A hex string, three dots, and another   │\x45...\xf0    │
│            │hex string                              │               │
└────────────┴────────────────────────────────────────┴───────────────┘

═══ 4.2. CODEPAGE ═══

The codepage entry gives the name of the codepage.

The codepage entry takes a single parameter, which can be either a string or a numeric value. The string form is limited to 15 characters, and the characters must be in the ASCII-7 set. The name should be limited so that with the hyphens removed it fits in 8 characters. If the codepage is given as a number, the name is constructed by using "IBM-" followed by the decimal number (without leading zeros).
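
The naming rules above, together with the file-name rule from the Codepage Overview, can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical and are not part of uconvdef, and since the text does not say how uconvdef reports a bad name, the sketch simply raises an error.

```python
def uconv_object_name(codepage):
    """Build a uconv object name from a <codepage> operand (int or str)."""
    if isinstance(codepage, int):
        if codepage < 0:
            raise ValueError("codepage number must be unsigned")
        # str() of an int never produces leading zeros.
        return "IBM-" + str(codepage)
    if len(codepage) > 15:
        raise ValueError("codepage name is limited to 15 characters")
    if not codepage.isascii():
        raise ValueError("codepage name must use 7-bit ASCII characters")
    return codepage


def uconv_file_name(object_name):
    """Drop the hyphens, as is done for the files in /language/codepage."""
    return object_name.replace("-", "")


def fits_short_name(object_name):
    """True when the hyphen-free name fits in the recommended 8 characters."""
    return len(uconv_file_name(object_name)) <= 8
```

The 8-character limit is kept as a separate check because the text states it as a recommendation ("should") rather than a hard limit.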

    <codepage> "IBM-850"
    <codepage> 850

If extended source (-x) is selected, the codepage entry causes the extended source to be searched for a matching codepage entry. The extended source allows a version 1 source to be used with the additional version 2 information taken from the extended source.

Within the extended source, this entry starts the entries for a particular codepage, and therefore must be the first entry within that set. All entries until the next codepage entry or end of file are processed.

Note: This entry was called CHAR_SET_NAME in version 1. This name can still be used, but it is deprecated because the object is actually a codepage or coded character set. Version 1 accepts only a string for this entry.

═══ 4.3. ENCODING ═══

The encoding entry specifies the type of codepage. This is used to set the ESID (Encoding Scheme ID) and the uconv class. It is also used to set the defaults for the minimum and maximum code length, the substitution character, and the Unicode substitution character.

There is a single string parameter, which must be one of the names in the Encoding column below. The case of the string does not matter.

┌────────────────────┬────────┬────────┬─────┬─────┬───────┬───────┐
│Encoding            │ESID    │Class   │Min  │Max  │Subchar│SubUni │
├────────────────────┼────────┼────────┼─────┼─────┼───────┼───────┤
│sbcs-data           │2100    │SBCS    │1    │1    │1F     │FFFD   │
│sbcs                │2100    │SBCS    │1    │1    │1A     │FFFD   │
│sbcs-pc             │3100    │SBCS    │1    │1    │1F     │FFFD   │
│sbcs-ebcdic         │1100    │SBCS    │1    │1    │FF     │FFFD   │
│sbcs-iso            │4100    │SBCS    │1    │1    │1A     │FFFD   │
│sbcs-windows        │4105    │SBCS    │1    │1    │1A     │FFFD   │
│sbcs-alt            │F100    │SBCS    │1    │1    │1A     │FFFD   │
│dbcs-data           │2200    │DBCS    │1    │2    │FCFC   │FFFD   │
│dbcs                │2200    │DBCS    │1    │2    │FCFC   │FFFD   │
│dbcs-pc             │3200    │DBCS    │1    │2    │FCFC   │FFFD   │
│dbcs-ebcdic         │1200    │DBCS    │1    │2    │FEFE   │FFFD   │
│mbcs-ebcdic         │1301    │EBCS    │1    │2    │FEFE   │FFFD   │
│mbcs-data           │2300    │MBCS    │1    │2    │FCFC   │FFFD   │
│mbcs                │2300    │MBCS    │1    │2    │FCFC   │FFFD   │
│mbcs-pc             │3300    │MBCS    │1    │2    │FCFC   │FFFD   │
│ucs-2               │7200    │UCS2    │2    │2    │FFFF   │FFFF   │
│ugl                 │7200    │UCS2    │2    │2    │0000   │FFFD   │
│utf-8               │7807    │UTF8    │1    │3    │*      │*      │
│upf-8               │78FF    │UPF8    │1    │3    │*      │*      │
└────────────────────┴────────┴────────┴─────┴─────┴───────┴───────┘

Some classes do not honor substitution. In utf-8, upf-8, and ucs-2, when the substitution character is set to 0xffff, the substitution characters are ignored.

The Encoding Scheme ID is used to define the way characters are encoded.
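
The defaults in the encoding table amount to a case-insensitive lookup keyed by the encoding name. The following Python sketch shows that lookup under two assumptions: the ESID, Subchar, and SubUni columns are read as hexadecimal, and the '*' cells are represented as None. The names Defaults, ENCODINGS, and encoding_defaults are illustrative only and do not come from the package.

```python
from collections import namedtuple

Defaults = namedtuple("Defaults", "esid uconv_class min_len max_len subchar subuni")

# One row per line of the encoding table, in the same column order.
_TABLE = """\
sbcs-data    2100 SBCS 1 1 1F   FFFD
sbcs         2100 SBCS 1 1 1A   FFFD
sbcs-pc      3100 SBCS 1 1 1F   FFFD
sbcs-ebcdic  1100 SBCS 1 1 FF   FFFD
sbcs-iso     4100 SBCS 1 1 1A   FFFD
sbcs-windows 4105 SBCS 1 1 1A   FFFD
sbcs-alt     F100 SBCS 1 1 1A   FFFD
dbcs-data    2200 DBCS 1 2 FCFC FFFD
dbcs         2200 DBCS 1 2 FCFC FFFD
dbcs-pc      3200 DBCS 1 2 FCFC FFFD
dbcs-ebcdic  1200 DBCS 1 2 FEFE FFFD
mbcs-ebcdic  1301 EBCS 1 2 FEFE FFFD
mbcs-data    2300 MBCS 1 2 FCFC FFFD
mbcs         2300 MBCS 1 2 FCFC FFFD
mbcs-pc      3300 MBCS 1 2 FCFC FFFD
ucs-2        7200 UCS2 2 2 FFFF FFFF
ugl          7200 UCS2 2 2 0000 FFFD
utf-8        7807 UTF8 1 3 *    *
upf-8        78FF UPF8 1 3 *    *
"""

ENCODINGS = {}
for _row in _TABLE.splitlines():
    _name, _esid, _cls, _lo, _hi, _sub, _subuni = _row.split()
    ENCODINGS[_name] = Defaults(
        esid=int(_esid, 16),
        uconv_class=_cls,
        min_len=int(_lo),
        max_len=int(_hi),
        # "*" marks the classes for which the table gives no substitution value.
        subchar=None if _sub == "*" else bytes.fromhex(_sub),
        subuni=None if _subuni == "*" else int(_subuni, 16),
    )


def encoding_defaults(name):
    """Look up an <encoding> operand; the case of the name does not matter."""
    try:
        return ENCODINGS[name.lower()]
    except KeyError:
        raise ValueError("unknown encoding: " + name) from None
```

For instance, encoding_defaults("SBCS-PC") and encoding_defaults("sbcs-pc") return the same row.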
The ESID also allows codepages to be classified; OS/2 only allows certain classes of codepages to be used for various purposes.

The uconv class indicates the type of table and the algorithm used to interpret the table.

This entry is allowed within the extended source file.

Note: The version 1 name of this field is UCONV_CLASS and is still accepted. The legal values were "SBCS", "DBCS", "MBCS", and "EBCDIC_STATEFUL", which must be in upper case. The latter name is the equivalent of "mbcs-ebcdic".

═══ 4.4. COPYRIGHT ═══

The copyright entry allows a copyright statement to be placed into the binary uconv object file. The system does not use the copyright notice, but it is output when the object is expanded with uconvexp, and may be required in some countries for legal reasons.

There is a single string parameter, which is the copyright string. The string is limited only by the line length restriction. The copyright will be within the first 256 bytes of the object only if it is less than 64 bytes in length.

    <copyright> "(C) Copyright, My Corporation, 1995"

This is a version 2 only entry, and may exist within the extended source. If this entry exists before the first <codepage> entry in the extended source, it applies to any codepage.

═══ 4.5. DESCRIPTION ═══

The description entry allows a description of the codepage to be placed into the binary uconv object file. The system does not use the description, but it is output when the object is expanded with uconvexp.

There is a single string parameter, which is the description. The string is limited only by the line length restriction.

    <description> "Multilingual PC"

This is a version 2 only entry, and may exist within the extended source.

═══ 4.6. MB_MAX_LEN ═══

The mb_max_len entry gives the maximum length for a multibyte character. This entry is normally not required since the code lengths can be determined from the encoding type. In the case of an MBCS class, it must be used to give the actual maximum code length.
There is a single numeric parameter, which must be a value between 1 and 13.

    <mb_max_len> 3

Note: There is also a MB_MIN_LEN entry. In general, the minimum length is set by the encoding type and cannot be changed.

═══ 4.7. SUBCHAR ═══

The subchar entry gives the substitution characters to use when converting from Unicode into a codepage. When no equivalent character is found and substitution is requested, this string is substituted.

There is a single hex-string parameter. The value is limited to 13 bytes.

    <subchar> \x1A
    <subchar> \x5c\x5c

The substitution characters are defaulted based on the encoding. It is also possible to change the substitution at runtime.

Some classes do not honor substitution and ignore this value. For instance, the algorithm-based conversions such as UTF-8 do not do substitution.

This is the equivalent of the version 1 entry, and can also be used within the extended source.

═══ 4.8. SUBCHARUNI ═══

The subcharuni entry gives the substitution character to use when converting to Unicode from a codepage. When no equivalent character is found and substitution is requested, this character is substituted.

There is a single hex-string parameter. The value must be either one or two bytes (one Unicode character).

    <subcharuni> \xff\xfd
    <subcharuni> \x00

The substitution character is defaulted based on the encoding. It is also possible to change the substitution at runtime.

Some classes do not honor substitution and ignore this value. For instance, the algorithm-based conversions such as UTF-8 do not do substitution.

In version 1, this value had a fixed value of 0xfffd and there was no equivalent entry. This entry can be used within the extended source.

═══ 4.9. ORDERED ═══

The ordered entry specifies the action to take when there are multiple codepage characters assigned to the same Unicode character.
There is a single numeric parameter:

    0    (default) Select the character with the lowest Unicode value, but not if the Unicode value is a control.
    1    (version 1 default) Select the character which comes last in the source.
    2    Select the character which comes first in the source.

    <ordered> 1

This function is required to deal with automatically expanded uconv source files where it is difficult to order the output correctly. In version 1, the default is to use the last character in the source.

This entry may be used within the extended source file.

═══ 4.10. DBCS_STARTER ═══

The dbcs_starter entry specifies a range of bytes to be used as DBCS starter characters. This is used to retain compatibility with previous OS/2 implementations where the range of DBCS starters was specified even though there were no characters defined at that position.

The parameter consists of a hex string or a range. The hex string must be a single byte.

The starting byte vector is normally computed by uconvdef. There is an entry in this vector for each of the 256 possible starter bytes. Each entry has one of the following values:

    1      Valid single byte character
    2      Starter for a double byte character
    3      Starter for a triple byte character
    255    Unused codepoint

The dbcs_starter entry forces the entries within the vector to the value '2' for all bytes within the specified range. Multiple dbcs_starter entries may be given.

    <dbcs_starter> \x8a...\xaf
    <dbcs_starter> \xe0...\xfc

═══ 4.11. USER_DEFINED ═══

The user_defined entry specifies a range of characters in the codepage which are user defined. Up to 32 user-defined ranges can be specified.

There is a single parameter, which can be a simple hex string or a range. The range is recorded in the uconv object and is available for query, but is not otherwise used.

    <user_defined> \xf0\x40...\xf9\xff

═══ 4.12. CHARMAP ═══

The charmap entry gives a section containing the definition of characters within the codepage. It has no parameters.
The algorithmic-only uconv classes (UTF-8 and UPF-8) do not require a CHARMAP section. It will be ignored if present, since one is created by the uconvexp utility for these classes.

The CHARMAP section is ended with an end charmap entry. It has no parameters.

Within the charmap section there is a series of entries which define the character encoding. These are:

    <Uxxxx>
        Specifies a single Unicode character, where xxxx are valid hex digits. The parameter is a hex string which indicates the character to map. This is the normal form.

    <Uxxxx>...<Uxxxx>
        Specifies a range of Unicode characters, where xxxx are valid hex digits. The parameter is a hex string which indicates the characters to map. The low order portion of the character is incremented for each Unicode character. Since most uconv source files are generated by programs, this form is rarely used.

    <unassigned>
        Specifies a range of codepage characters which have no Unicode equivalent. This is the same as not specifying those characters and is thus rarely used.

Note: In version 1, CHARMAP can be in upper or lower case, but cannot be in mixed case. Otherwise, this entry is unchanged. The charmap section cannot be within the extended source.

In version 1, the Unicode string <Uxxxx> must have the 'U' in upper case. In version 2, the case is not significant.

═══ 5. UCONVEXP - Expand a codepage ═══

The uconvexp utility expands a binary uconv object into a source format. The resulting file can be edited and used to recreate the binary format.

uconvexp takes two fixed parameters, and a set of switches which start with a hyphen (-). The options can be specified at any point in the command. File names may use either slash (/) or backslash (\) as a path separator.

    uconvexp uconvfile [outfile] -options

The first parameter is the name of the uconv object file to expand. This is the path name to the file. This parameter is required.

The second parameter gives the name of the output file.
If this is not given, it defaults to the input file name with the file extension .UCX.

There are several option switches:

    -q    Quiet - Do not show informational messages
    -s    Standard - Create a version 1 compliant file
    -u    Upper - Upper case hex strings and keywords

Examples of calling uconvexp are:

    uconvexp ibm850
    uconvexp /language/codepage/ibm850 ibm850.ucx -u
    uconvexp \codepage\ibm850 -s

═══ 6. PRINTCP - Print a codepage ═══

The printcp utility provides a means of printing a codepage using PostScript. This only works for single byte codepages and the single byte part of double byte codepages. The output file can be sent directly to a printer, or an encapsulated PostScript file can be created.

printcp takes two fixed parameters, and a set of switches which start with a hyphen (-). The options can be specified at any point in the command. File names may use either slash (/) or backslash (\) as a path separator.

    printcp codepage [outfile] -options

The first parameter is the input codepage, and can be a uconv object, an AFP codepage, or an IBM source format codepage. This is the pathname to the input file.

The second parameter is the output file. If not specified, this defaults to "lpt1".

There are a large number of options which are used to customize the desired printout:

    -b
        Draw box to show the bounding box. This is useful to understand how the output will appear as an encapsulated PostScript object.

    -c:####
        Codepage number. This can be used to override the codepage number specified within the object.

    -i
        IBM character names. Annotate each character with the name of the character using IBM CGIDs. This is an 8 character name.

    -l
        Show baseline. Draw a small line to indicate the position of the baseline. This is helpful in identifying some characters which differ from each other only by position.

    -m
        Publication format. This form produces the codepage table with no header information.
        This is designed to make encapsulated PostScript files which can be embedded in another document which has all header information.

    -n
        Codepoint number. Annotate each character with the decimal number of the codepoint.

    -q
        Do not show informational messages. This is normally used from within a command file which does its own messages.

    -r
        Reverse direction. By default characters are presented left to right. Many people like codepages printed with characters moving top to bottom, which is done if this option is selected.

    -s:##
        Scale factor (25 - 200). The default scale factor fills a letter or A4 size page with minimal margins. To allow for full margins a scale factor of about 94 should be used.

    -u
        Unicode annotation. Annotate each character with the hex Unicode number of the character.

    -x:####
        DBCS range (4 hex digits, for example -x:819E). This can be used to override the DBCS starter range in the input (or when the input does not contain this information).

    -z:####
        Partial grid (4 hex digits, for example -z:40FF). This option is used when the full 16x16 grid is not desired.

The following are examples of calls to printcp:

    printcp ibm850 ibm850.eps -u -m
    printcp /language/codepage/ibm850 -i -l

The files unicode.nam, printkb.dlf, and printcp.psh must be in the current directory or in the PATH in order to run printcp.

═══ 7. MAKEKB - Define a keyboard ═══

The makekb utility creates a keyboard layout file (.kbl) from a keyboard definition file (.kbd). The keyboard layout file is the binary file used by the system to define a keyboard layout.

makekb takes two fixed parameters, and a set of switches which start with a hyphen (-). The options can be specified at any point in the command. File names may use either slash (/) or backslash (\) as a path separator.

    makekb kbdfile [kblfile] -options

The first parameter is the name of the input keyboard definition file. If a name without extension is given, the file type ".kbd" is added. This can give the path to the file.
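
The rule for the first parameter can be sketched as follows. This is a hedged illustration, not the code makekb uses: kbd_input_name is a hypothetical helper, and Python's ntpath module is used only because it accepts both slash and backslash as path separators, matching the rule stated above.

```python
import ntpath


def kbd_input_name(name):
    """Add ".kbd" when the last path component has no extension."""
    # ntpath treats both "/" and "\" as separators, so a dot in a
    # directory name is not mistaken for a file extension.
    _root, ext = ntpath.splitext(name)
    return name if ext else name + ".kbd"
```

With this sketch, both "fr" and "\kbd.src\fr" gain the ".kbd" extension, while "fr.kbd" is returned unchanged.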
The second parameter is optional and specifies the name of the output keyboard layout file. If this is not specified, the layout file name is constructed from the input name by using ".kbl" as the file extension.

There is a single option, which must begin with a hyphen (-) and may be anywhere in the command line.

    -q
        Do not show informational messages. This is normally used in command files where the command file takes care of any messages.

Any error messages are sent to stderr. A non-zero return code is given for significant errors.

The files unicode.nam and scancode.nam must be in the current directory or in the PATH in order to run makekb.

═══ 7.1. Keyboard Definition File (.kbd) Format ═══

The keyboard definition file (.kbd) consists of a number of lines. Each line has an entry name and some options. Blank lines, and lines beginning with an asterisk (*), semicolon (;), or number sign (#), are comments. You can place a comment on any entry after all options, or using a semicolon (;) after any required options. Each line is limited to 1024 bytes.

The entry name must be the first thing on each line. It does not need to start at the beginning of the line. It can be specified in either upper or lower case. Some entries are valid only within a scope, and are normally indented in the source.

Options for each entry are separated by spaces or tabs. Each entry defines its own syntax. Most entries take a semicolon as a comment character after the end of the required parameters. A few entries which take character strings do not allow comments.

Character names can be specified in IBM or Adobe format, or as hex values. The names are taken from the file unicode.nam. There are also defined names for virtual keys and dead keys. For a list of dead keys see the DEADKEY entry.

The following can be used as the names of virtual keys. These map in value to the virtual keys used by OS/2 Presentation Manager.
Virtual keys are not normally set except in the default keyboard, since the virtual function keys normally do not move. (See the file defkbd.kbd for the default keyboard assignments.)

    VK_BREAK        VK_BACKSPACE    VK_TAB          VK_BACKTAB
    VK_NEWLINE      VK_SHIFT        VK_CTRL         VK_ALT
    VK_ALTGRAF      VK_PAUSE        VK_CAPSLOCK     VK_ESC
    VK_SPACE        VK_PAGEUP       VK_PAGEDOWN     VK_END
    VK_HOME         VK_LEFT         VK_UP           VK_RIGHT
    VK_DOWN         VK_PRINTSCRN    VK_INSERT       VK_DELETE
    VK_SCRLLOCK     VK_NUMLOCK      VK_ENTER        VK_SYSRQ
    VK_F1           VK_F2           VK_F3           VK_F4
    VK_F5           VK_F6           VK_F7           VK_F8
    VK_F9           VK_F10          VK_F11          VK_F12
    VK_F13          VK_F14          VK_F15          VK_F16
    VK_F17          VK_F18          VK_F19          VK_F20
    VK_F21          VK_F22          VK_F23          VK_F24
    VK_CLEAR        VK_EREOF        VK_PA1          VK_PA2
    VK_PA3          VK_GROUP        VK_GROUPLOCK    VK_APPL
    VK_WINLEFT      VK_WINRIGHT

═══ 7.2. Shift State Definition ═══

The following bits are defined in the shift state. All other bits are reserved for future use and you should not assume they are zero.

    KBD_SHIFT           0x00000001
    KBD_CONTROL         0x00000002
    KBD_ALT             0x00000004
    KBD_ALTCTRLSHIFT    0x00000007
    KBD_ALTGR           0x00000008
    KBD_NLS1            0x00000010
    KBD_NLS2            0x00000020
    KBD_NLS3            0x00000040
    KBD_NLS4            0x00000080
    KBD_SCROLLLOCK      0x00000100
    KBD_NUMLOCK         0x00000200
    KBD_CAPSLOCK        0x00000400
    KBD_EXTRALOCK       0x00000800
    KBD_APPL            0x00001000
    KBD_LEFTSHIFT       0x00010000
    KBD_RIGHTSHIFT      0x00020000
    KBD_LEFTCONTROL     0x00040000
    KBD_RIGHTCONTROL    0x00080000
    KBD_LEFTALT         0x00100000
    KBD_RIGHTALT        0x00200000
    KBD_LEFTWINDOWS     0x00400000
    KBD_RIGHTWINDOWS    0x00800000

═══ 7.3. Sample Keyboard Definition ═══

This is an example keyboard definition for national French. This represents a somewhat complex single layer keyboard.

    *
    * Name: fr.kbd
    *
    * Function:
    *     Keyboard layout for National French - azerty
    *
    * Copyright:
    *     Copyright (C) IBM Corp., 1995
    *
    keyboard France
    country 189 France French
    version 1.0
    include common.kbd
    include qwerty.kbd ; modified to azerty
    option 102
    *
    * Standard definition for the alt graphics shift state
    *
    state 3 12
    layer altgr 8 0
    option altgr
    shift
        hold altright altgr
    endshift
    *
    * Define scancode ranges for capslock. This is a very complete
    * definition (all the white keys except those on the pad).
    *
    lock capslock capslock shift
        list grave one two three four five six seven eight nine zero
        list hyphen equal
        list q w e r t y u i o p bracketleft bracketright
        list a s d f g h j k l semicolon quotesingle backslash
        list extra z x c v b n m comma period slash ; next to z
    endlock
    *
    * A bunch of keys are moved to create the azerty keyboard. These will
    * have to be defined again below.
    *
    option national
    redefine q
    redefine w
    redefine a
    redefine z
    redefine m
    *
    * Define key translate tables
    *
    keys normal
        scan padperiod xxxx comma
        scan one ampersand one
        scan two eacute two
        scan three quotedbl three
        scan four quotesingle four
        scan five parenleft five
        scan six hyphen six
        scan seven egrave seven
        scan eight underscore eight ctrl1f
        scan nine ccedilla nine
        scan zero agrave zero
        scan hyphen parenright degree
        scan equal equal plus
        scan bracketleft DK_CIRCUMFLEX DK_DIERESIS
        scan bracketleft circumflex dieresis
        scan bracketright dollar sterling
        scan semicolon m M ctrlm
        scan quotesingle ugrave percent
        scan grave twosuperior threesuperior
        scan backslash asterisk mu
        scan q a A ctrla
        scan w z Z ctrlz
        scan a q Q ctrlq
        scan z w W ctrlw
        scan m comma question
        scan comma semicolon period
        scan period colon slash
        scan slash exclam section
        scan padasterisk asterisk asterisk
        scan extra less greater
    endkeys
    *
    * Define alternate graphics keys (with right alt pressed)
    *
    keys altgr
        scan two DK_TILDE
        scan two tilde
        scan three numbersign
        scan four braceleft
        scan five bracketleft
        scan six bar
        scan seven DK_GRAVE
        scan seven grave
        scan eight backslash
        scan nine asciicircum
        scan zero at
        scan hyphen bracketright
        scan equal braceright
        scan bracketright currency
        xscan comma less ; 101 keyboard
        xscan period greater ; 101 keyboard
    endkeys
    define altgrctrl 000a
    keys altgrctrl
        scan five ctrl1b
        scan eight ctrl1c
        scan nine ctrl1e
        scan hyphen ctrl1d
    endkeys
    *
    * BIOS scan code changes due to letter key remap
    *
    biosscan q a alt
    biosscan w z alt
    biosscan a q alt
    biosscan z w alt
    biosscan semicolon m alt
    end

═══ 7.4. KEYBOARD ═══

The keyboard entry must be the first entry in the file. There is a single option which gives a description of the keyboard. This is a maximum of 32 characters and may contain spaces.

Note: No comment is allowed on this line.

    keyboard United States
    keyboard My own special keyboard

═══ 7.5. COUNTRY ═══

The country entry specifies information about the country and language of this keyboard. This information is not used directly in processing keys, but may be queried. There are three options.

    Architecture
        This is a decimal number from 1 to 4095 optionally followed by a letter 'a' to 'o'. The number should be the IBM defined keyboard id. The letter shows variations of this standard.

    Country
        This is the name (in English) of the country of the keyboard. It may also be the two character ISO abbreviation for the country. makekb knows a large number of country names, but if the name is unknown, the two character abbreviation may always be used.

    Language
        This is the name (in English) of the language of the keyboard. It may also be the two character ISO abbreviation for the language. makekb knows a large number of language names, but if the name is unknown, the two character abbreviation may always be used.

    country 150d Switzerland German
    country 454 EE et ; Estonia

═══ 7.6. VERSION ═══

The version entry specifies the version number of the keyboard layout. It is specified as a two part number - major and minor versions. This is not directly used by the system.

    version 1.0

═══ 7.7. INCLUDE ═══

The include entry allows common definitions to be kept in another file. Include can be anywhere in the file. Up to 8 levels of recursion are allowed.

Include takes as a single option the name of the file to include.
Several include files are shipped with the system: common.kbd Common gray key definitions qwerty.kbd Basic letter key layout dvorak.kbd Dvorak letter key layout romanji.kbd Japanese romanji definitions include common.kbd ΓòÉΓòÉΓòÉ 7.8. DEFINE ΓòÉΓòÉΓòÉ The define entry is used to define a user name for a hex value. There are two options, the name and a value. The value must be a valid hex value. The defined name can be used as a scancode, a character name, or an option name. define altshift 0005 ΓòÉΓòÉΓòÉ 7.9. OPTION ΓòÉΓòÉΓòÉ The option entry allows setting of various option flags. These are used by the Uni functions and are available on UniQueryKeyboard. The following options can be set: noctrlshift Ctrl+Shift is always equal to Ctrl (default) ctrlshift Ctrl and Shift can be separate defaultvkey Use the system VKEY table (default) nodefaultvkey Do not use the system VKEY table noaltgr Alt graphics is not used (default) altgr Alt graphics is used shiftaltgr Alt graphics shift is used noshiftaltgr Shift is ignored with altgr (default) altshift Alt+shift lone key is used noaltshift Alt+shift lone key is not used (default) qwerty Normal letter key layout (default) dvorak Dvorak letter key layout national Special national letter key layout international Expanded character set local No expanded character set (default) 100 Default physical layout is 100 Key 101 Default physical layout is American (default) 102 Default physical layout is European 106 Default physical layout is Japanese The defaults are for the US keyboard, which is a slight problem since the US keyboard is the simplest of the keyboards. option 102 altgr ΓòÉΓòÉΓòÉ 7.10. REDEFINE ΓòÉΓòÉΓòÉ The redefine entry allows a scancode to be totally redefined. This removes all instances of the specified scancode from all tables. This is normally used after an include to replace the default action for a key. redefine x ΓòÉΓòÉΓòÉ 7.11. 
STATE ΓòÉΓòÉΓòÉ State specifies how to interpret the shift state bits and how many shift states are used. There are two operands. The first gives the modifier mask. This is normally 3 or 7. Three indicates the use of shift+ctrl, seven is shift+ctrl+alt. The second option gives the total number of shift states as a decimal number, including those specified as layers. The value is normally 8, 12, or 16. Eight indicates no alt graphics. Twelve indicates alt graphics, and 16 indicates alt graphics and national language layer. state 3 8 ΓòÉΓòÉΓòÉ 7.12. LAYER ΓòÉΓòÉΓòÉ The layer entry further specifies how to interpret the shift state bits. There can be multiple layer entries, and they are ordered. The layer entry has three options. The first specifies a shift state mask for which this layer is active. The second gives a decimal value to add to the state. An optional third parameter gives a mask of flags to be processed by additional layer entries (this defaults to zero which indicates that additional layer entries are not processed). layer altgr 8 ΓòÉΓòÉΓòÉ 7.13. SHIFT ΓòÉΓòÉΓòÉ The shift entry starts a table of shift table entries. There are no options. Within a shift table, you can have hold, toggle, and set entries. Each of these has the following options: The mask and compare can be used to only process a shift key in a particular shift state, or to implement multi-way toggles. ΓòÉΓòÉΓòÉ 7.14. HOLD ΓòÉΓòÉΓòÉ A hold entry specifies a key which modifies the shift state while it is held down. scancode The scancode of the shift key (scancode name or hex) shift The shift state to set or reset upper The upper 16 bits of the shift state mask The mask of what bits to compare compare The value to compare with shift hold shiftleft shift shiftleft endshift ΓòÉΓòÉΓòÉ 7.15. TOGGLE ΓòÉΓòÉΓòÉ A toggle entry specifies a key which modifies the shift state on alternate presses of the key. One press turns on the state, and the next press turns it off. 
The action is taken on the down stroke of the key. scancode The scancode of the shift key (scancode name or hex) shift The shift state to set or reset upper The upper 16 bits of the shift state mask The mask of what bits to compare compare The value to compare with shift toggle capslock capslock toggle numlock nls1 0 alt alt ; alt-numlock = nls1 endshift ΓòÉΓòÉΓòÉ 7.16. SET ΓòÉΓòÉΓòÉ A set entry specifies a key which sets a value into one or more bits in the shift state. This can be used to implement a multi-way toggle. The upper option is used as an update mask. scancode The scancode of the shift key (scancode name or hex) shift The shift state to set or reset upper The update mask in a multiway toggle mask The mask of what bits to compare compare The value to compare with To create a threeway toggle of the nls1 and nls2 bits with the padminus key: define nlstoggle 0x0030 ; nls1 | nls2 shift set padminus nls1 nlstoggle nlstoggle 0 set padminus nls2 nlstoggle nlstoggle nls1 set padminus 0 nlstoggle nlstoggle nls2 endshift ΓòÉΓòÉΓòÉ 7.17. ENDSHIFT ΓòÉΓòÉΓòÉ The endshift entry ends a shift table. There are no options. This moves the state out of shift state and back into keyboard state. shift hold shiftleft shift shiftleft endshift ΓòÉΓòÉΓòÉ 7.18. LOCK ΓòÉΓòÉΓòÉ A lock entry specifies the start of a table of lock entries. There are three options. The first gives a mask of bits to compare. The second gives the comparison value which must match after the mask. The third option gives the shift state to be modified when this lock state is active. This value is XORd with the lower 16 bits of the shift state. lock capslock shift ΓòÉΓòÉΓòÉ 7.19. LIST ΓòÉΓòÉΓòÉ The list entry is the only valid entry within a lock table. It specifies a list of scancodes. When the specified state is active, and the scancode is in this list, the effective shift state is modified. lock capslock shift list q w e r t y u i o p endlock ΓòÉΓòÉΓòÉ 7.20. 
ENDLOCK ΓòÉΓòÉΓòÉ The endlock entry ends a lock table. There are no options. This moves the state from lock state back into keyboard state. lock capslock shift list q w e r t y u i o p endlock ΓòÉΓòÉΓòÉ 7.21. KEYS ΓòÉΓòÉΓòÉ The keys entry starts a translation table. The options give a list of states being defined. keys altgr The state can be specified as one of the constants known to makekb, as a define, or as a hex value. shift 0x0001 control 0x0002 ctrl 0x0002 alt 0x0004 altgr 0x0008 ctrlshift 0x0003 altshift 0x0005 altgrctrl 0x000A altgrshift 0x0009 altctrl 0x0006 ctrlalt 0x0006 nls1 0x0010 nls2 0x0020 nls3 0x0040 nls4 0x0080 ΓòÉΓòÉΓòÉ 7.22. SCAN ΓòÉΓòÉΓòÉ The scan entry specifies a scancode and a set of characters which are mapped from this scancode. The first one is mapped to the state specified on the keys entry. The second one is mapped to the state after the one specified on the keys entry. This is normally used to specify lower and upper case. keys scan semicolon ae AE endkeys keys altgr scan c copyright scan r registered endkeys Any given position can have both a unicode mapping and a deadkey or virtual key mapping (it cannot have both a virtual and deadkey). In most cases, keys are either virtual keys or characters keys, but in a few cases such as tab, backspace, and enter, a key has both. When a deadkey is given, it is normal to also give the unicode mapping of the standalone character. When a placeholder is desired to give the second or subsequent entry in the table, the value xxxx may be used to indicate that there is no entry. Scancodes may be given by their name on the US keyboard, or by hex value. They may appear in any order. The actual name mappings are contained within the file scancode.nam. 
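The positional rule for scan entries can be sketched in C. This is only an illustration, not how makekb stores its tables: the row structure, the NUM_STATES value (taken from "state 3 12" in the French sample), and the function name apply_scan are all hypothetical. It assumes that each value after the scancode goes to the next consecutive state, starting at the state named on the keys entry, and that xxxx leaves a position unmapped.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NUM_STATES 12   /* hypothetical: matches "state 3 12" in fr.kbd */

/* Hypothetical in-memory row for one scancode: one name per shift state. */
typedef struct {
    const char *name[NUM_STATES];   /* NULL means no mapping for that state */
} scan_row_t;

/*
 * Apply one "scan" line to a row. values[i] is assigned to state base + i.
 * The placeholder "xxxx" skips a position without assigning anything.
 * Returns 0 on success, -1 if the values would run past the last state.
 */
static int apply_scan(scan_row_t *row, unsigned base,
                      const char *const *values, size_t count)
{
    size_t i;

    assert(row != NULL && values != NULL);
    if (base + count > NUM_STATES)
        return -1;
    for (i = 0; i < count; i++) {
        if (strcmp(values[i], "xxxx") != 0)
            row->name[base + i] = values[i];
    }
    return 0;
}
```

With base 0, the sample line "scan padperiod xxxx comma" would leave the unshifted position empty and map only the shifted position to comma.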
These scancodes appear in this order on normal 101 and 102 key keyboards: esc f1 f2 f3 f4 f5 f6 f7 f8 f9 f10 f11 f12 grave one two three four five six seven eight nine zero hyphen equal backspace tab q w e r t y u i o p bracketleft bracketright backslash capslock a s d f g h j k l semicolon quotesingle enter shiftleft extra z x c v b n m comma period slash shiftright ctrlleft altleft space altright ctrlright These scancodes appear on the cursor and numeric keypad sections: print scrolllock break insert home pageup numlock padslash padasterisk padminus delete end pagedown pad7 pad8 pad9 pad4 pad5 pad6 padplus up pad1 pad2 pad3 left down right pad0 padperiod padenter The following scancodes appear on other keyboards, and should be accounted for. In addition to these, there are scancodes for the 122 key keyboard which are honored. nls1 (7b) Key next to altleft on Japanese keyboard nls2 (79) Key left of space on Japanese keyboard nls3 (70) Key right of space on Japanese keyboard jextra Key right of slash on Japanese keyboard yen Key right of equal on Japanese keyboard appl MS application key winleft MS windows key left winright MS windows key right ΓòÉΓòÉΓòÉ 7.23. ENDKEYS ΓòÉΓòÉΓòÉ The endkeys entry ends a keyboard translation table. There are no options. The only effect of this entry is moving out of keys state into keyboard state. keys scan semicolon ae AE endkeys ΓòÉΓòÉΓòÉ 7.24. DEADKEY ΓòÉΓòÉΓòÉ The deadkey entry specifies a language specific deadkey mapping. This is usually not necessary, as all common deadkey mappings are in the global deadkey mapping table. deadkey DK_ACUTE a aacute The second and third parameter can be any unicode character. The deadkeys which can be specified are: DK_ACUTE DK_GRAVE DK_DIERESIS DK_CIRCUMFLEX DK_TILDE DK_CEDILLA DK_MACRON DK_BREVE DK_OGONEK DK_DOT DK_BAR DK_RING DK_CARON DK_HUNGARUMLAUT DK_ACUTEDIA DK_PSILI DK_DASIA DK_OVERLINE DK_UNDERDOT ΓòÉΓòÉΓòÉ 7.25. 
LED ΓòÉΓòÉΓòÉ The led entry specifies actions which affect the LED indicators on the keyboard. Normally, the LEDs directly match the shift state, and on IBM keyboards the three lock states have LEDs. It is possible to remap the shift state to the LED state using this control. There are five options, at least four of which are required: Action This is a number between 0 and 7, and indicates how to interpret the other values. If the 1 bit is set, then do not process any other LED entries after this one. If the 2 bit is set, use the upper 16 bits of the shift state for comparison, otherwise use the lower 16 bits. If the 4 bit is set, modify the upper 16 bits of the LED state, otherwise modify the lower 16 bits. Mask This is a hex value and is ANDed with the specified half of the shift state before the comparison. Compare This is a hex value. If the masked shift state matches this value, this entry is used. Set This is a hex value, and indicates which bits to set in the specified half of the LED state. Reset This is a hex value, and indicates which bits to reset in the specified half of the LED state. These bits are reset before any bits are set. led 0 nls1 nls1 scrolllock ΓòÉΓòÉΓòÉ 7.26. BIOSSCAN ΓòÉΓòÉΓòÉ The biosscan entry specifies an entry in the BIOS scancode translation table. This is done to emulate the legacy DOS and OS/2 keyboard tables, which allow the BIOS scancode to be modified based on the shift states. This is used to move the ALT letter keys to match the moved locations. There are three values. The first gives an input scancode. The second gives the output scancode, and the third gives a mask of the shift state bits. If multiple bits are specified in this mask, all of them must be on for the conversion to be done. If no entry is found which matches the current shift state and matches the specified scancode, the output scancode is based on the built-in table of standard BIOS scancode mappings. biosscan q a alt biosscan w z alt ΓòÉΓòÉΓòÉ 7.27. 
ROMANJI ΓòÉΓòÉΓòÉ The romanji entry specifies the beginning of a romanji definition. The first option is the flags value, and any remaining options define a set of shift states. RJCHAR entries have a value for each set of shift states. There must be at least one valid shift state. A column of the romanji table is used if all of the bits in the specified mask are one. Therefore, shift states with the most one bits should be specified first. romanji 0 hiragana widekana katakana ΓòÉΓòÉΓòÉ 7.28. RJCHAR ΓòÉΓòÉΓòÉ The rjchar entry describes a romanji character. This consists of the romanji sequence followed by the definition for each shift state. There is one definition for each shift state specified in the romanji entry. Each entry specifies one or more characters separated by a comma. The normal case of multiple characters is when a voiced or semivoiced vowel follows the character. Example: rjchar a ahiragana Akana akana rjchar ba bahiragana Bakana hakana,vj ΓòÉΓòÉΓòÉ 7.29. RJSYN ΓòÉΓòÉΓòÉ The rjsyn entry describes a romanji synonym. This is a romanji sequence which generates another sequence of romanji characters. The resulting characters are then processed based on the rjchar entries. There are two values, the first is the input romanji sequence, and the second is the output romanji sequence. Example: rjsyn bya bixya rjsyn fa huxa ΓòÉΓòÉΓòÉ 7.30. ENDROMANJI ΓòÉΓòÉΓòÉ The endromanji entry ends a romanji table. It has no options. Example: endromanji ΓòÉΓòÉΓòÉ 7.31. END ΓòÉΓòÉΓòÉ The end entry must be the last entry in the file. Additional entries are not processed. End has no options. The parsing must be at keyboard state when this entry is found. If there is no end in the input file, it is an error. Example: end ΓòÉΓòÉΓòÉ 8. PRINTKB - Print a keyboard ΓòÉΓòÉΓòÉ The printkb utility allows you to print a graphic representation of a keyboard layout using PostScript. printkb takes two fixed parameters, and a set of switches which start with a hyphen (-). 
The options can be specified at any point in the command. File names may use either slash (/) or backslash (\) as a path separator. printkb kblfile [outfile] -options The first parameter is the name of the input keyboard layout file. If a name without extension is given, the file type ".kbl" is added. This can give the path to the file. The second parameter is optional and specifies the name of the output PostScript file. If this is not specified, the output is sent to "lpt1". There are several options which must begin with a hyphen (-) and may be anywhere in the command line. -b Draw box around keyboard -c Show cursor pad as well as the normal keyboard. This is not the default since this area rarely changes based on layout. -f Show function keys -h Print hex scancode numbers -i Use ISO key icons for function keys -k Show keypad and cursor pad as well as the normal keyboard. This is not the default since this area rarely changes based on layout. -l:### Physical layout (84, 85, 89, 101, 102, 106, 122). By default, this information is taken from the layout object. -m Publication format (no headers). This is used to create an Encapsulated PostScript file with a bounding box matching the drawn box but without any headers. This is designed to be used when the output is embedded in another document. -n Use names for keys. The file printkb.sft is used to select names for a particular keyboard. -p Print in portrait mode -q Do not show informational messages -r Show romanji tables. This does nothing if there are no romanji tables. This should not be used with the -m option since it creates a multipage output. -v Print additional information. This is used while debugging to show the version numbers and country information. -s:## Scale factor (40 - 120). The default scale factor (100) is designed to use most of a landscape page. -x Show caps lock keys. Place a small square to indicate which keys are affected by caps lock. This is mostly a debug option. 
-z Do not shade gray keys. Normally gray keys are shaded to make the keyboard look more like the actual keyboard. This can cause loss of readability at small sizes. printkb tries to simulate the normal method of engraving keytops. It is possible for a key to have more meanings than printkb will use. printkb will place up to four labels on a key. All labels are placed on the surface of the key. On real keyboards, the alt or altgr actions are sometimes labeled on the front of the key. The resulting output is valid Encapsulated PostScript (it contains structuring comments including a BoundingBox). It can thus be used as input to any program which takes EPS. Lower left This is the base character assigned to this key Upper left This is the shift character assigned to this key. If this is the uppercase of the base character then a large upper left character is shown, and the lower case character is not shown. Lower right This is the alt graphic character assigned to this key. If the shift-altgr character is the uppercase of this character, only the uppercase version is shown in this location. Upper right This is either the NLS or shift-altgr character. Keyboards should not assign both. It is possible to create a keyboard definition which is too complex for printkb. The keyboard used to input Japanese is an example of this where there are a large number of meanings assigned to each key. To get a usable printout, you may want to construct two keyboards for printing purposes. Deadkeys are shown with a small gray blob where the character would be. The files printkb.dlf, printkb.psh, and printkb.sft must be in the current directory or in the PATH in order to run printkb. ΓòÉΓòÉΓòÉ 9. Codepage files ΓòÉΓòÉΓòÉ The codepage files are shipped in binary. These consist of a large set of single-byte codepages from around the world, and the double-byte codepages to support Japan. Note that the file name does not contain the hyphen. 
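The relationship between a codepage number, its conversion object name, and its file name can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names cp_object_name and cp_file_stem are hypothetical, and the sketch assumes that the file name is simply the object name with the hyphen removed (any extension or directory is not covered here).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build the conversion object name for a codepage number:
 * "IBM-" followed by the decimal number without leading zeros.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int cp_object_name(unsigned cp, char *buf, size_t size)
{
    int n;

    assert(buf != NULL && size > 0);
    n = snprintf(buf, size, "IBM-%u", cp);
    return (n > 0 && (size_t)n < size) ? 0 : -1;
}

/*
 * Derive a file name stem from an object name by dropping every hyphen
 * (assumption: the file is named like the object, minus the hyphen).
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int cp_file_stem(const char *object_name, char *buf, size_t size)
{
    size_t out = 0;

    assert(object_name != NULL && buf != NULL && size > 0);
    for (; *object_name != '\0'; object_name++) {
        if (*object_name == '-')
            continue;
        if (out + 1 >= size)
            return -1;
        buf[out++] = *object_name;
    }
    buf[out] = '\0';
    return 0;
}
```

For codepage 850 this would give the object name IBM-850 and the stem IBM850.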
These are the ASCII codepages which can be used as process codepages. IBM-301 Japan - DBCS base IBM-437 United States Legacy IBM-813 Greece - ISO 8859-7 IBM-819 Latin 1 - ISO 8859-1 IBM-850 Multilingual IBM-851 Greek - Legacy IBM-852 Latin 2 IBM-855 Cyrillic IBM-857 Latin 5 IBM-860 Portugal IBM-861 Iceland IBM-862 Israel IBM-863 Canadian (French) IBM-864 Arabic IBM-865 Nordic IBM-866 Russia IBM-868 Urdu IBM-869 Greece IBM-874 Thai IBM-897 Japan - Legacy IBM-907 APL IBM-909 APL2 Extended IBM-910 APL2 IBM-912 Latin 2 - ISO 8859-2 IBM-913 Latin 3 - ISO 8859-3 IBM-914 Latin 4 - ISO 8859-4 IBM-915 Cyrillic - ISO 8859-5 IBM-916 Israel - ISO 8859-8 IBM-920 Latin 5 - ISO 8859-9 IBM-921 Baltic - ISO IBM-922 Estonia IBM-942 Japan SAA (1041+301) IBM-1004 Windows Extended IBM-1006 Urdu - ISO IBM-1008 Arabic Windows IBM-1041 Japan SAA IBM-1051 HP Roman 8 IBM-1089 Arabic - ISO 8859-6 IBM-1116 Estonia IBM-1117 Latvia IBM-1118 Lithuania IBM-1119 Lithuanian and Russian IBM-1124 Ukraine 8-bit IBM-1250 Latin 2 Windows IBM-1251 Cyrillic Windows IBM-1252 Latin 1 Windows IBM-1253 Greece Windows IBM-1254 Turkey Windows IBM-1255 Israel Windows IBM-1256 Arabic Windows IBM-1257 Latin 4 Windows IBM-1275 Apple Latin 1 IBM-1276 Adobe PS Standard Encoding IBM-1277 Adobe PS ISOLatin1 Encoding In addition there are a number of non-ASCII codepages which can be used for display or conversion. 
IBM-037 United States - EBCDIC IBM-259 Symbols WP - EBCDIC IBM-273 Germany - EBCDIC IBM-277 Denmark, Norway - EBCDIC IBM-278 Finland, Sweden - EBCDIC IBM-280 Italy - EBCDIC IBM-284 Spain, Latin America - EBCDIC IBM-285 United Kingdom - EBCDIC IBM-290 Japan - EBCDIC IBM-293 APL - EBCDIC IBM-297 France - EBCDIC IBM-361 International - Publishing IBM-363 Symbols - Publishing IBM-367 G0 - ASCII IBM-382 Austria, Germany - Publishing IBM-383 Belgium - Publishing IBM-385 Canada (French) - Publishing IBM-386 Denmark, Norway - Publishing IBM-387 Finland, Sweden - Publishing IBM-388 France - Publishing IBM-389 Italy - Publishing IBM-391 Portugal - Publishing IBM-392 Spain - Publishing IBM-393 Latin America - Publishing IBM-394 United Kingdom - Publishing IBM-395 United States - Publishing IBM-424 Israel - EBCDIC IBM-500 International - EBCDIC IBM-829 Math Symbols - Publishing IBM-838 Thai - EBCDIC IBM-870 Latin 2 - EBCDIC IBM-871 Iceland - EBCDIC IBM-875 Greece - EBCDIC IBM-895 Japan G0 (Latin) - EUC IBM-896 Japan G2 (Katakana) - EUC IBM-918 Urdu - EBCDIC IBM-930 Japan - EBCDIC IBM-954 Japan - EUC IBM-1025 Cyrillic - EBCDIC IBM-1026 Latin 5 (Turkey) - EBCDIC IBM-1027 Japan (Latin) - EBCDIC IBM-1028 Hebrew - Publishing IBM-1038 PostScript Symbol Set IBM-1046 Arabic - EBCDIC IBM-1092 PC Symbols IBM-1097 Farsi - EBCDIC IBM-1112 Baltic - EBCDIC IBM-1122 Estonia - EBCDIC ΓòÉΓòÉΓòÉ 10. Keyboard Source Files ΓòÉΓòÉΓòÉ The keyboard files consist of a large set of keyboards which are shipped with OS/2 for PowerPC. Each file is shipped in source form with the file extension .KBD. The associated keyboards can be created using makekb. 
aa Arabic (238) aa470 Arabic 101 (470) ba Bosnia (234) be Belgium French (120) bg Bulgaria Cyrillic - US Latin (442) bg241 Bulgaria Cyrillic (241) br Brazil (275) br274 Brazil 101(274) ca Canada (445) cf Canada French (058) cz Czech (243) de Germany (129) de453 Germany 100 (453) dk Denmark (159) ee Estonia (454) el Greece (319) el220 Greece (220) es Spain (172) fi Finland (153) fr France (189) fr120 France (120) hr Croatia (234) hu Hungary (208) il Israel (212) is Iceland (197) is458 Iceland 101 (458) it Italy (141) it142 Italy (142) jp Japan (194) la Latin America (171) mk Macedonia (449) nl Netherlands (143) no Norway (155) pk Pakistan Urdu (125) pl Poland (214) pl457 Polish Programmer's (457) po Portugal (159) ro Romanian (446) ru Russia (441) ru443 Russia 443 (443) sd Switzerland German (150d) sf Switzerland French (150f) sk Slovakia (245) sl Slovenia (234) sq Albania (452) sr Serbia (450) sv Sweden (153) th Thailand (191) th190 Thailand PC (190) tr Turkey (179) tr440 Turkey (440) uk UK English (166) uk168 UK English (168) us US English (103) usdv US English Dvorak (103d) yc Yugoslavia Cyrillic (118) Also supplied are a small set of example keyboards. defkbd Default keyboard. This is based on the US keyboard but also contains the virtual key mappings. usmac US Macintosh. This is a US keyboard based on the Apple Macintosh layout. usmath US Math. This is a multi-layer US keyboard which supports extra math symbols including the Greek alphabet. usinter US International. This is a keyboard based on the US keyboard, but with alt graphic and dead keys to type all western european languages. ΓòÉΓòÉΓòÉ 11. OS/2 Unicode Conversion Prototype Library ΓòÉΓòÉΓòÉ UNICONV.DLL is an OS/2 PC DLL which contains a set of conversion functions. These are a subset of the conversion functions provided by OS/2 for PowerPC, and include both the Unicode Conversion (uconv) functions and the xpg4 (iconv) functions. 
The following uconv functions are supported: UniCreateUconvObject() UniFreeUconvObject() UniQueryUconvObject() UniSetUconvObject() UniUconvFromUcs() UniUconvToUcs() The following iconv functions are supported: iconv_open() iconv() iconv_close() ΓòÉΓòÉΓòÉ 11.1. UniCreateUconvObject() ΓòÉΓòÉΓòÉ The UniCreateUconvObject function opens a unicode conversion object and returns a handle to it. #include <uconv.h> int UniCreateUconvObject( UniChar * uname, /* I - Unicode name of uconv table */ UconvObject * uobj /* O - Uconv object handle */ ); The UniCreateUconvObject function returns a handle that describes a conversion between the codepage specified by uname and Unicode (UCS-2). This handle is used for other uconv functions. The conversion object remains valid until closed by UniFreeUconvObject. See Conversion Object Names for a description of the uname. The name is in unicode, but is restricted to the characters in the ASCII-7 character set. Returns: ULS_INVALID Conversion table not found or invalid ULS_BADATTR Unknown or invalid modifier ULS_NOMEMORY No memory is available Examples: rc = UniCreateUconvObject((UniChar *)L"ibm-850", &hand); rc = UniCreateUconvObject((UniChar *)L"ibm-437@map=data", &hand); ΓòÉΓòÉΓòÉ 11.2. UniQueryUconvObject() ΓòÉΓòÉΓòÉ The UniQueryUconvObject function allows queries about the attributes and characteristics of the specified uconv object. Some of these are static and bound to the conversion table, but others are settable using UniSetUconvObject. #include <uconv.h> int UniQueryUconvObject( UconvObject uobj, /* I - Uconv object handle */ uconv_attribute_t * attr, /* O - Uconv attributes */ size_t * size, /* I/O- Buffer length */ char first[256], /* O - First byte of multibyte */ char other[256], /* O - Other byte of multibyte */ udcrange_t udcrange[32] /* O - User defined char range */ ); Any of the parameters attr, first, other, or udcrange can be NULL to indicate that this data should not be returned. 
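As a rough illustration of what the first table is for, the following sketch counts the characters in a multibyte buffer using a 256-entry starter table of the same shape. It does not call UNICONV.DLL. The function names count_chars and make_demo_table are hypothetical, and the demo table is invented for the example rather than taken from any shipped codepage. The table is declared as unsigned char here so that the value 255 compares cleanly; the entry values used (1, 2, 3, 255) are those listed in the description of the first array.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count the characters in buf using a starter table in which each entry
 * gives the length of the character that begins with that byte, or 255
 * for an unused codepoint. Returns (size_t)-1 if an unused starter byte
 * is found or the buffer ends in the middle of a character.
 */
static size_t count_chars(const unsigned char first[256],
                          const unsigned char *buf, size_t len)
{
    size_t pos = 0;
    size_t count = 0;

    assert(first != NULL && buf != NULL);
    while (pos < len) {
        unsigned width = first[buf[pos]];

        if (width == 255 || width == 0 || width > len - pos)
            return (size_t)-1;
        pos += width;
        count++;
    }
    return count;
}

/*
 * Fill in an invented table for demonstration only: 0x00-0x7F are single
 * bytes, 0x81-0x9F start a double byte character, the rest are unused.
 */
static void make_demo_table(unsigned char first[256])
{
    int i;

    for (i = 0; i < 256; i++) {
        if (i < 0x80)
            first[i] = 1;
        else if (i >= 0x81 && i <= 0x9F)
            first[i] = 2;
        else
            first[i] = 255;
    }
}
```

A table-driven walk like this does not apply to stateful codepages whose character length depends on state rather than on the starting byte.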
The following structure defines the uconv object attributes used for UniQueryUconvObject and UniSetUconvObject. Those fields marked with a Q are only for query. Those fields marked with a Q/S can be both set and queried. typedef struct { uint32 version; /* Q/S Version (must be zero) */ char mb_min_len; /* Q Minimum char size */ char mb_max_len; /* Q Maximum char size */ char usc_min_len; /* Q UCS min size */ char usc_max_len; /* Q UCS max size */ uint16 esid; /* Q Encoding scheme ID */ char options; /* Q/S Substitution options */ char state; /* Q/S Current state */ endian_t endian; /* Q/S Source and target endian */ uint32 displaymask; /* Q/S Display/data mask */ uint32 converttype; /* Q/S Conversion type */ uint16 subchar_len; /* Q/S MBCS sub len 0=table */ uint16 subuni_len; /* Q/S Unicode sub len 0=table */ char subchar[16]; /* Q/S MBCS sub characters */ UniChar subuni[8]; /* Q/S Unicode sub characters */ } uconv_attribute_t; typedef struct { uint16 source; /* Used by FromUcs */ uint16 target; /* Used by ToUcs */ } endian_t; typedef struct { /* User Defined character range */ uint32 first; /* First codepoint */ uint32 last; /* Last codepoint */ } udcrange_t; The size parameter specifies the size of the attribute buffer. This must be at least as large as version 0 of the uconv_attribute_t structure. On output this contains the size of data actually returned in bytes. The first array gives an array of starting bytes for a multibyte character set. For some forms of stateful codepages, the length is based on state and not this table. If this parameter is NULL, no value is returned. Each byte has one of the following values: 1 Valid single byte character 2 Starter for a double byte character 3 Starter for a triple byte character 255 Unused codepoint The other array gives an array indicating when the byte is used as a secondary byte in a multi-byte sequence. This is used to allocate buffers. There are only two possible values for each byte. 
0 indicates that this is not used as a secondary character, and 1 indicates that it is. The udcrange array gives a set of ranges of characters which make up the user defined character range. Example: int rc; uconv_attribute_t attr; size_t len; char starter[256]; len = sizeof(attr); rc = UniQueryUconvObject(hand, &attr, &len, starter, NULL, NULL); ΓòÉΓòÉΓòÉ 11.3. UniSetUconvObject() ΓòÉΓòÉΓòÉ The UniSetUconvObject function allows the attributes of the specified uconv object to be set. These attributes may be queried using UniQueryUconvObject. #include <uconv.h> int UniSetUconvObject( UconvObject uobj, /* I - Uconv object handle */ uconv_attribute_t * attr /* I - Uconv attributes */ ); The following fields within the attribute can be set: options Substitution options. This can have one of the following values: UCONV_OPTION_SUBSTITUTE_FROM_UNICODE UCONV_OPTION_SUBSTITUTE_TO_UNICODE UCONV_OPTION_SUBSTITUTE_BOTH The value UCONV_OPTION_SUBSTITUTE_SINGLE can be OR'd in with the preceding value to force the specified subchar to be used, even when the table specified a split single/double substitution. endian Source and target endian. This is a structure containing a source and target endian field. Source applies to UniUconvFromUcs and target applies to UniUconvToUcs. Each of the fields can contain one of the following values: 0x0000 Use system endian 0xfeff Use big endian 0xfffe Use little endian displaymask Display/data mask. This is a mask of 32 bits. Each bit represents a control character below space (1<<char). If the bit is zero the character is treated as a display glyph. If the bit is one, the character is treated as a control. There are several predefined values for this mask, but any value can be used: DSPMASK_DATA All characters less than space are controls. DSPMASK_DISPLAY All characters less than space are glyphs. DSPMASK_CRLF CR and LF are controls, others are glyphs. converttype Conversion type. This is a set of flags. 
The following flags exist and may be ORed together: CVTTYPE_CTRL7F Treat the 0x7f character as a control. CVTTYPE_CDRA Use IBM standard control conversion. If this bit is not set, controls are converted to an equal value. Some conversions always do control conversions. subchar_len Codepage substitution length. This can be a value between 1 and 13 to indicate the substitution length. It may not exceed the maximum size character in the encoding. A value of zero indicates that the substitution character from the conversion table should be used. subchar Substitution bytes. This is the actual value whose length is specified by subchar_len. subuni_len Unicode substitution length. This can be either 0 or 1. A zero indicates that the unicode substitution from the conversion table should be used. subuni If subuni_len is set to 1, the first element in this array gives the unicode substitution character. The following example sets the displaymask to all data. This means that all codepoints below space are mapped as controls and not as glyphs. To modify only some attributes, a query should first be done using UniQueryUconvObject. Example: int rc; uconv_attribute_t attr; size_t len; len = sizeof(attr); rc = UniQueryUconvObject(hand, &attr, &len, NULL, NULL, NULL); if (!rc) { attr.displaymask = DSPMASK_DATA; rc = UniSetUconvObject(hand, &attr); } ΓòÉΓòÉΓòÉ 11.4. UniUconvToUcs() ΓòÉΓòÉΓòÉ The UniUconvToUcs function converts a sequence of characters encoded in a specified codepage to Unicode. #include <uconv.h> int UniUconvToUcs( UconvObject uobj, /* I - Uconv object handle */ void * * inbuf, /* IO - Input buffer */ size_t * inbytes, /* IO - Input buffer size (bytes) */ UniChar * * outbuf, /* IO - Output buffer */ size_t * outchars, /* IO - Output size (chars) */ size_t * subst /* O - Substitution count */ ); uobj A handle to a uconv object created using UniCreateUconvObject. inbuf The address of the pointer to the input buffer. 
    This is updated for any characters consumed by the transform. If this is zero or points to zero, a reset is done for any stateful transforms.

inbytes
    On input this points to the number of bytes to be converted. On output this is replaced with the number of bytes which were not processed.

outbuf
    The address of the pointer to the output buffer. That pointer is updated for any characters written by this function.

outchars
    On input this points to the number of characters in the output buffer. On output this is replaced with the number of characters left in the buffer.

subst
    The value at the specified address is set to the number of non-identical conversions done.

If a sequence of input bytes does not form a valid character, conversion stops after the previous successfully converted character. If the input buffer ends with an incomplete character, conversion stops after the previous successfully converted character. If the output buffer is not large enough to hold the entire converted input, conversion stops just prior to the input bytes that would have caused the buffer to overflow.

If a character is found in the input buffer which is legal, but for which an identical character does not exist in the target codepage, either a substitution or an error is returned, based on the substitution setting.

Returns:

    ULS_BADHANDLE         Invalid handle
    ULS_INVALID           Character truncated
    ULS_BUFFERFULL        Output buffer full
    ULS_ILLEGALSEQUENCE   Invalid input character, or no identical character when substitution is not requested.

Example:

    UconvObject hand;
    int         rc;
    char        chbuf[256], * chptr;
    UniChar     ucbuf[256], * ucptr;
    size_t      insize, outsize;
    size_t      subs;

    chptr   = chbuf;
    insize  = cvtsize;
    ucptr   = ucbuf;
    outsize = 256;
    rc = UniUconvToUcs(hand, (void * *)&chptr, &insize, &ucptr, &outsize, &subs);

ΓòÉΓòÉΓòÉ 11.5. UniUconvFromUcs() ΓòÉΓòÉΓòÉ

The UniUconvFromUcs function converts a string from Unicode (UCS-2) to the codepage of the specified conversion object.

    #include <uconv.h>

    int UniUconvFromUcs(
        UconvObject   uobj,       /* I  - Uconv object handle       */
        UniChar   * * inbuf,      /* IO - Input buffer              */
        size_t      * inchars,    /* IO - Input buffer size (chars) */
        void      * * outbuf,     /* IO - Output buffer             */
        size_t      * outbytes,   /* IO - Output size (bytes)       */
        size_t      * subst       /* O  - Substitution count        */
    );

uobj
    A handle to a uconv object created using UniCreateUconvObject.

inbuf
    The address of the pointer to the input buffer. This is updated for any characters consumed by the transform. If this is zero or points to zero, a reset is done for any stateful transforms.

inchars
    On input this points to the number of characters to be converted. On output this is replaced with the number of characters which were not processed.

outbuf
    The address of the pointer to the output buffer. That pointer is updated for any bytes output by this function.

outbytes
    On input this points to the number of bytes in the output buffer. On output this is replaced with the number of bytes left in the buffer.

subst
    The value at the specified address is set to the number of non-identical conversions done.

If a sequence of input characters does not form a valid character, conversion stops after the previous successfully converted character. If the input buffer ends with an incomplete character, conversion stops after the previous successfully converted character. If the output buffer is not large enough to hold the entire converted input, conversion stops just prior to the input characters that would have caused the buffer to overflow.

If a character is found in the input buffer which is legal, but for which an identical character does not exist in the target codepage, either a substitution or an error is returned, based on the substitution setting.
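
The pointer-and-count contract described above can be illustrated without the library. The following is a minimal sketch, not the uconv implementation: toy_from_ucs is a hypothetical stand-in that converts UCS-2 to ISO 8859-1 (values up to 0xFF map identically) while following the same conventions for inbuf, inchars, outbuf, outbytes and subst. The toy_unichar type, the TOY_ return values and the 0x1A substitution byte are assumptions chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned short toy_unichar;       /* hypothetical stand-in for UniChar */

enum { TOY_OK = 0, TOY_BUFFERFULL = 1 };  /* hypothetical return codes */

#define TOY_SUBCHAR 0x1A                  /* assumed substitution byte */

/* Convert UCS-2 to ISO 8859-1, updating the pointers and counts in place. */
static int toy_from_ucs(toy_unichar **inbuf, size_t *inchars,
                        unsigned char **outbuf, size_t *outbytes,
                        size_t *subst)
{
    assert(inbuf && inchars && outbuf && outbytes && subst);
    *subst = 0;
    while (*inchars > 0) {
        toy_unichar uc = **inbuf;
        if (*outbytes == 0)
            return TOY_BUFFERFULL;        /* stop before the character that would overflow */
        if (uc <= 0xFF) {
            **outbuf = (unsigned char)uc; /* identical conversion */
        } else {
            **outbuf = TOY_SUBCHAR;       /* non-identical conversion */
            ++*subst;
        }
        ++*inbuf;  --*inchars;            /* one character consumed */
        ++*outbuf; --*outbytes;           /* one byte produced */
    }
    return TOY_OK;
}
```

After a buffer-full return in this model, the caller can drain the output buffer, reset outbuf and outbytes, and call again; inbuf and inchars already describe the unconverted remainder.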

Returns:

    ULS_BADHANDLE         Invalid handle
    ULS_INVALID           Character truncated
    ULS_BUFFERFULL        Output buffer full
    ULS_ILLEGALSEQUENCE   Invalid character

Example:

    UconvObject hand;
    int         rc;
    char        chbuf[256], * chptr;
    UniChar     ucbuf[256], * ucptr;
    size_t      insize, outsize;
    size_t      subs;

    chptr   = chbuf;
    insize  = cvtsize;
    ucptr   = ucbuf;
    outsize = 256;
    rc = UniUconvFromUcs(hand, &ucptr, &insize, (void * *)&chptr, &outsize, &subs);

ΓòÉΓòÉΓòÉ 11.6. UniFreeUconvObject() ΓòÉΓòÉΓòÉ

The UniFreeUconvObject function closes a uconv object.

    #include <uconv.h>

    int UniFreeUconvObject(
        UconvObject uobj    /* I - Uconv object handle */
    );

Closes and frees the resources associated with the specified uconv object handle.

Returns:

    ULS_BADHANDLE   Invalid handle

ΓòÉΓòÉΓòÉ 11.7. Conversion Object Names ΓòÉΓòÉΓòÉ

In OS/2, a codepage is indicated by a numeric value from the IBM registry of Codepages or Coded Character Sets. The Unicode Conversion Object (uconv) used to implement this codepage is given as a string "IBM-" followed by the decimal number of the codepage without leading zeros. For example:

    IBM-850
    IBM-37
    IBM-1200

The names of conversion objects are normally of this form, but conversion objects which are not used as system codepages may have any name of up to 15 characters. After removing any hyphen (-) characters, this name should fit within 8 characters, based on the limitation of the CD-ROM file system.

The names of conversion objects may not contain the path. In OS/2 for PowerPC, the conversion objects are in a fixed directory. For the OS/2 PC Prototype, the files are searched for in the current directory, and using the environment variable ULSPATH.

After the name, there may be additional modifiers. These modifiers allow for customization of the conversion object. Modifiers start with an at sign ('@') and consist of name=value pairs which are separated by commas. The following values are allowed:

map=data
    Map all characters below space as controls.

map=display
    Map all characters below space as glyphs.
    This is the default.

map=crlf
    Map the CR and LF characters as controls, and all other characters as glyphs.

map=cdra
    Map characters below space as controls, and map them according to the IBM standard which causes PC codepage controls to be substituted. This is used when interchanging with non-OS/2 systems.

endian=big
    Convert any UCS-2 characters to big endian (MSB first).

endian=little
    Convert any UCS-2 characters to little endian (LSB first). This is the default.

sub=yes
    Do substitution for non-identical characters. This is the default.

sub=no
    Fail the conversion for non-identical characters.

subchar=\xXX
    Sets the substitution character(s). For multi-byte codepages this can be multiple bytes, each specified as two hex digits.

Examples of valid conversion object names are:

    IBM-850@map=data
    IBM-1200@endian=big
    IBM-437@sub=yes,subchar=\xff,map=cdra

ΓòÉΓòÉΓòÉ 11.8. iconv_open() ΓòÉΓòÉΓòÉ

The iconv_open function initializes a codepage conversion object.

    #include <iconv.h>

    iconv_t iconv_open(
        char * tocode,      /* I - Name of output conversion */
        char * fromcode     /* I - Name of input conversion  */
    );

The iconv_open function returns a handle that describes a conversion from the codepage specified by fromcode to the codepage specified by tocode. This handle can be used by the iconv function. The conversion object remains valid until closed by iconv_close.

See Conversion Object Names for a description of the tocode and fromcode names.

iconv_open returns a handle when successful, and the value (iconv_t)-1 on failure. On failure, errno is set to one of the following values:

    EMFILE   The number of handles is exceeded.
    ENOMEM   No memory is available.
    EINVAL   One of the conversion objects is unknown, or the modifiers are invalid.

ΓòÉΓòÉΓòÉ 11.9. iconv() ΓòÉΓòÉΓòÉ

The iconv function converts text from one codepage to another.

    #include <iconv.h>

    size_t iconv(
        iconv_t    hand,            /* I  - iconv object handle */
        char   * * inbuf,           /* IO - Input buffer        */
        size_t   * inbytesleft,     /* IO - Input bytes left    */
        char   * * outbuf,          /* IO - Output buffer       */
        size_t   * outbytesleft     /* IO - Output bytes left   */
    );

hand
    A handle to a conversion object created using iconv_open.

inbuf
    The address of the pointer to the input buffer. This is updated for any characters consumed by the transform. If this is zero or points to zero, a reset is done for any stateful transforms.

inbytesleft
    On input this points to the number of bytes to be converted. On output this is replaced with the number of bytes which were not processed.

outbuf
    The address of the pointer to the output buffer. That pointer is updated for any bytes output by this function.

outbytesleft
    On input this points to the number of bytes in the output buffer. On output this is replaced with the number of bytes left in the buffer.

If a sequence of input bytes does not form a valid character, conversion stops after the previous successfully converted character. If the input buffer ends with an incomplete character, conversion stops after the previous successfully converted character. If the output buffer is not large enough to hold the entire converted input, conversion stops just prior to the input bytes that would have caused the buffer to overflow.

If iconv() encounters a character in the input buffer which is legal, but for which an identical character does not exist in the target codepage, either a substitution or an error is returned, based on the substitution setting.

If an error occurs, iconv() returns (size_t)-1. Otherwise it returns the number of non-identical conversions. The following errors may be set in errno:

    EBADF    The input handle is not valid.
    EILSEQ   Input conversion stopped due to an input byte that does not belong to the input codepage, or a character which has no identical mapping when substitution is off.
    E2BIG    Input conversion stopped due to lack of space in the output buffer.
    EINVAL   Input conversion stopped due to an incomplete character at the end of the input buffer.

ΓòÉΓòÉΓòÉ 11.10. iconv_close() ΓòÉΓòÉΓòÉ

The iconv_close function closes a conversion object.

    #include <iconv.h>

    int iconv_close(
        iconv_t hand    /* I - iconv object handle */
    );

On successful completion, a value of zero is returned. Otherwise, a value of -1 is returned and errno is set to indicate the error:

    EBADF    The input handle is not valid.

ΓòÉΓòÉΓòÉ 12. OS/2 Keyboard Prototype Library ΓòÉΓòÉΓòÉ

The keyboard prototype library consists of a set of functions which are used to process scancodes into keyboard state. The following keyboard functions are supported:

    UniCreateKeyboard()
    UniDestroyKeyboard()
    UniQueryKeyboard()
    UniResetShiftState()
    UniTranslateDeadKey()
    UniTranslateKey()
    UniUpdateShiftState()
    UniUntranslateKey()

The header for all keyboard functions is unikbd.h.

ΓòÉΓòÉΓòÉ 12.1. UniCreateKeyboard ΓòÉΓòÉΓòÉ

The UniCreateKeyboard function loads a keyboard layout from disk and returns a handle. If the keyboard layout is already in use, a use count is incremented.

    #include <unikbd.h>

    uint32 UniCreateKeyboard (
        KHAND   *pkhand,
        KBDNAME *name,
        uint32   mode);

name (KBDNAME *) input
    The name string that identifies the keyboard translation table file name (e.g. "us"). The string does not include the path name.

pkhand (KHAND *) output
    Return location for the keyboard handle. This handle is used on all other keyboard translation calls.

mode (uint32) input
    Reserved for future use; must be zero.

Return codes:

    NO_ERROR
    ERR_TOO_MANY_KBD
    ERR_KBD_NOT_FOUND
    ERR_NO_MEMORY

ΓòÉΓòÉΓòÉ 12.2. UniDestroyKeyboard ΓòÉΓòÉΓòÉ

The UniDestroyKeyboard function closes a keyboard resource.

    #include <unikbd.h>

    uint32 UniDestroyKeyboard (KHAND khand);

khand (KHAND) input
    Keyboard handle.

Return codes:

    NO_ERROR
    ERR_NOOP
    ERR_BAD_HANDLE

This function releases the keyboard handle and reduces the use count. When the use count goes to zero, the resource associated with the keyboard will be released.

ΓòÉΓòÉΓòÉ 12.3. UniQueryKeyboard ΓòÉΓòÉΓòÉ

The UniQueryKeyboard function is used to query information from the header in a keyboard table.

    #include <unikbd.h>

    uint32 UniQueryKeyboard (
        KHAND         khand,
        KEYBOARDINFO *kbdinfo);

khand (KHAND) input
    Keyboard handle.

kbdinfo (KEYBOARDINFO) output
    Address of keyboardinfo packet.

    /*
     * Query keyboard structure
     */
    typedef struct {
        uint32  len;              /* Length of structure      */
        uint16  kbid;             /* Keyboard architecture id */
        uint16  version;          /* Version number           */
        char    language[2];      /* Normal language          */
        char    country[2];       /* Normal country           */
        uint16  flags;            /* Flags (KBDF_)            */
        uint16  resv;             /* Reserved                 */
        UniChar description[32];  /* Description of keyboard  */
    } KEYBOARDINFO;

    /*
     * Query keyboard flags
     */
    #define KBDF_DEFAULTVKEY   0x0001 /* Use default VKEYs            */
    #define KBDF_NOCTRLSHIFT   0x0002 /* Ctrl+Shift equals Ctrl       */
    #define KBDF_NOALTGR       0x0004 /* Alt graphics is not used     */
    #define KBDF_SHIFTALTGR    0x0010 /* Altgr, shift-altgr separate  */
    #define KBDF_DEADGOOD      0x0020 /* Invalid dead use second char */
    #define KBDF_DEADPRIVATE   0x0040 /* Use only private dead keys   */
    #define KBDF_SYSTEM        0x8000 /* System supplied keyboard     */
    #define KBDF_INTERNATIONAL 0x4000 /* Full-range character set     */
    #define KBDF_DVORAK        0x2000 /* Alternate letter keys        */
    #define KBDF_NATIONAL      0x1000 /* National letter keys         */
    #define KBDF_LETTERKEYS    0x3000 /* Letter key type              */
    #define KBDF_ISOKEYS       0x0800 /* Use ISO icons for key names  */
    #define KBDF_LAYOUT101     0x0000 /* Normal layout is 84/101      */
    #define KBDF_LAYOUT102     0x0100 /* Normal layout is 85/102      */
    #define KBDF_LAYOUT106     0x0200 /* Normal layout is 89/106      */
    #define KBDF_LAYOUT103     0x0300 /* Normal layout is 86/103      */
    #define KBDF_LAYOUT100     0x0400 /* Normal layout is 83/100      */
    #define KBDF_LAYOUTS       0x0700 /* Layout related bits          */

Return codes:

    NO_ERROR
    ERR_BAD_HANDLE

ΓòÉΓòÉΓòÉ 12.4. UniTranslateKey ΓòÉΓòÉΓòÉ

The UniTranslateKey function translates a scan code (and effective shift state) to a Unicode character and virtual key or dead key. It also sets the BIOS scancode.

    #include <unikbd.h>

    uint32 UniTranslateKey(
        KHAND     khand,
        uint32    eshift,
        VSCAN     scan,
        UniChar * unichar,
        VDKEY   * vdkey,
        char    * biosscan);

khand (KHAND) input
    Keyboard handle.

eshift (uint32) input
    Effective shift state. This is an output from UniUpdateShiftState.

scan (VSCAN) input
    The PM style scancode which indicates which key. Note that this does not indicate the action (make, break, repeat).

unichar (UniChar *) output
    Unicode character.

vdkey (VDKEY *) output
    Virtual key or dead key.

biosscan (char *) output
    BIOS scancode.

Return codes:

    NO_ERROR
    ERR_BAD_HANDLE

In most cases there is either a Unicode character or a virtual key. In a few cases (esc, tab, backspace, enter) both exist. When a dead key is returned, it is normal to also return the Unicode character for the standalone character associated with the dead key.

The BIOS scancode is returned since the translation is dependent on the keyboard layout. This is done to emulate the earlier DOS and OS/2 keyboard layouts, which allowed the translated (BIOS) scancode to be set by the layout.

ΓòÉΓòÉΓòÉ 12.5. UniUntranslateKey ΓòÉΓòÉΓòÉ

The UniUntranslateKey function does a reverse keyboard translate: it translates a Unicode character and virtual or dead key to a scan code and shift state. This is used to create a complete keyboard packet when an already translated character is entered, mostly for programmed input.

Normally either the Unicode character or the vdkey is given. If both are given, UniUntranslateKey will process the vdkey first.
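
The "vdkey first" rule can be pictured with a toy reverse lookup. This is only a sketch under stated assumptions, not the library's algorithm: the table layout, the toy_ types, the TOY_ return values and all key and scan values are hypothetical, and falling back to the Unicode character when the vdkey is not found is an assumption the text does not confirm.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned short toy_unichar;   /* hypothetical stand-in for UniChar */
typedef unsigned short toy_vdkey;     /* hypothetical stand-in for VDKEY   */
typedef unsigned char  toy_vscan;     /* hypothetical stand-in for VSCAN   */

enum { TOY_OK = 0, TOY_NO_SCAN = 1 }; /* hypothetical return codes */

typedef struct {
    toy_unichar   unichar;   /* 0 if the key has no character   */
    toy_vdkey     vdkey;     /* 0 if the key has no virtual key */
    toy_vscan     scan;      /* made-up scan code               */
    unsigned long eshift;    /* made-up effective shift bits    */
} toy_key_entry;

/* Look up by vdkey first; fall back to the character (assumption). */
static int toy_untranslate(const toy_key_entry *tab, size_t n,
                           toy_unichar unichar, toy_vdkey vdkey,
                           toy_vscan *scan, unsigned long *eshift)
{
    size_t i;
    assert(tab != NULL && scan != NULL && eshift != NULL);
    if (vdkey != 0) {
        for (i = 0; i < n; i++) {
            if (tab[i].vdkey == vdkey) {
                *scan = tab[i].scan;
                *eshift = tab[i].eshift;
                return TOY_OK;
            }
        }
    }
    if (unichar != 0) {
        for (i = 0; i < n; i++) {
            if (tab[i].unichar == unichar) {
                *scan = tab[i].scan;
                *eshift = tab[i].eshift;
                return TOY_OK;
            }
        }
    }
    return TOY_NO_SCAN;
}
```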

    #include <unikbd.h>

    uint32 UniUntranslateKey (
        KHAND    khand,
        UniChar  unichar,
        VDKEY    vdkey,
        VSCAN  * scan,
        uint32 * eshift);

khand (KHAND) input
    Keyboard handle.

unichar (UniChar) input
    Unicode character to untranslate.

vdkey (VDKEY) input
    Virtual or dead key.

scan (VSCAN *) output
    Location for output of the PM scancode.

eshift (uint32 *) output
    Effective shift to generate this character. This is a minimal number of bits set.

Return codes:

    NO_ERROR
    ERR_BAD_HANDLE
    ERR_NO_SCAN

ΓòÉΓòÉΓòÉ 12.6. UniUpdateShiftState ΓòÉΓòÉΓòÉ

The UniUpdateShiftState function is used to update the three portions of the shift state (actual, effective, and LED).

    #include <unikbd.h>

    uint32 UniUpdateShiftState (
        KHAND        khand,
        SHIFTSTATE * state,
        VSCAN        scan,
        UCHAR        makebreak);

khand (KHAND) input
    Keyboard handle.

state (SHIFTSTATE *) i/o
    Shift state. This consists of three 32 bit values. They have similar bit definitions, but they define the actual, effective, and LED shift states.

scan (VSCAN) input
    PM scan code.

makebreak (UCHAR) input
    Make/break/repeat indicator.

Return codes:

    NO_ERROR
    ERR_BAD_HANDLE

This function modifies the shift state as required by the scan code, using the specified keyboard translation tables. The shift state consists of three parts: the actual shift state, the effective shift state, and the LED status. The effective shift is equal to the 16 lower bits of the actual shift, but when a lock state modifies an actual shift (such as capslock affecting shift) the effective shift is modified. This means that the effective shift is only correct for the specified scancode.

ΓòÉΓòÉΓòÉ 12.7. UniTranslateDeadKey ΓòÉΓòÉΓòÉ

The UniTranslateDeadKey function translates a dead key code together with a Unicode character to a composite character. The deadkey combination is translated using the global table and any special table associated with the keyboard translation table.
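
A toy model of that two-table lookup is sketched below. It assumes the keyboard's special table is consulted before the global table, which the text does not state. The toy_ types, the dead key codes, the table contents and the TOY_ return values are all hypothetical; only the composed code points in the global table are real Unicode values.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned short toy_unichar;   /* hypothetical stand-in for UniChar */
typedef unsigned short toy_vdkey;     /* hypothetical stand-in for VDKEY   */

enum { TOY_OK = 0, TOY_NO_DEAD = 1 };                /* hypothetical return codes */
enum { TOY_DEAD_ACUTE = 1, TOY_DEAD_DIAERESIS = 2 }; /* made-up dead key codes    */

typedef struct {
    toy_vdkey   dead;      /* dead key                    */
    toy_unichar inchar;    /* second character in sequence */
    toy_unichar outchar;   /* composite character          */
} toy_dead_entry;

/* Global table: compositions shared by all keyboards (illustrative subset). */
static const toy_dead_entry toy_global[] = {
    { TOY_DEAD_ACUTE,     'e', 0x00E9 },  /* e with acute     */
    { TOY_DEAD_ACUTE,     'a', 0x00E1 },  /* a with acute     */
    { TOY_DEAD_DIAERESIS, 'u', 0x00FC },  /* u with diaeresis */
};

/* Linear search of one table; returns 1 and sets *out when a match exists. */
static int toy_find(const toy_dead_entry *tab, size_t n,
                    toy_vdkey dead, toy_unichar in, toy_unichar *out)
{
    size_t i;
    for (i = 0; i < n; i++) {
        if (tab[i].dead == dead && tab[i].inchar == in) {
            *out = tab[i].outchar;
            return 1;
        }
    }
    return 0;
}

/* Try the keyboard's own table first (assumption), then the global one. */
static int toy_translate_dead(const toy_dead_entry *special, size_t nspecial,
                              toy_vdkey dead, toy_unichar in, toy_unichar *out)
{
    assert(out != NULL);
    if (special != NULL && toy_find(special, nspecial, dead, in, out))
        return TOY_OK;
    if (toy_find(toy_global, sizeof toy_global / sizeof toy_global[0],
                 dead, in, out))
        return TOY_OK;
    return TOY_NO_DEAD;
}
```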

    #include <unikbd.h>

    uint32 UniTranslateDeadKey (
        KHAND     khand,
        VDKEY     dead,
        UniChar   inchar,
        UniChar * outchar,
        VDKEY   * outdead);

khand (KHAND) input
    Keyboard handle.

dead (VDKEY) input
    Dead key value.

inchar (UniChar) input
    Second character in sequence.

outchar (UniChar *) output
    Composite character.

outdead (VDKEY *) output
    Output deadkey. If this is non-zero then the deadkey is chained. OS/2 does not support chained deadkeys, so this should not be used.

Return codes:

    NO_ERROR
    ERR_BAD_HANDLE
    ERR_NO_DEAD

The calling program is expected to maintain deadkey state so that when a deadkey is found, the next character will be used to form the full character. After doing the deadkey translate, the deadkey state should be reset.

There is provision in the tables for a deadkey formed from multiple deadkeys, and this is used in the Japanese logic. It should not be generally used since OS/2 does not support chained dead keys.

ΓòÉΓòÉΓòÉ 12.8. UniResetShiftState ΓòÉΓòÉΓòÉ

The UniResetShiftState function resets the shift state. This is used when the shift state is changed other than through the normal key sequence. This allows the LED status to be maintained based on the shift state.

    #include <unikbd.h>

    uint32 UniResetShiftState(
        KHAND       khand,
        SHIFTSTATE *state,
        USHORT      type);

khand (KHAND) input
    Keyboard handle.

state (SHIFTSTATE *) i/o
    Shift state structure.

type (USHORT) input
    Type of reset:

        KEYEV_SET       Set to specified value
        KEYEV_RELEASE   Release all pressed keys
        KEYEV_ZERO      Release all pressed and locked keys

Return codes:

    NO_ERROR
    ERR_BAD_HANDLE
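
As a closing sketch for the keyboard functions, the caller-maintained deadkey state described under UniTranslateDeadKey might be organized as follows. This is a minimal model, not code from the library: toy_compose is a hypothetical stand-in for the real translate call, the toy_ types and the dead key code are made up, and passing the second character through unchanged when no composition exists is an assumption (it resembles what the KBDF_DEADGOOD flag comment describes).

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned short toy_unichar;  /* hypothetical stand-in for UniChar */
typedef unsigned short toy_vdkey;    /* hypothetical stand-in for VDKEY   */

#define TOY_DEAD_ACUTE 1             /* made-up dead key code */

/* Stand-in for the deadkey translate: returns 1 and sets *out on success. */
static int toy_compose(toy_vdkey dead, toy_unichar in, toy_unichar *out)
{
    if (dead == TOY_DEAD_ACUTE && in == 'e') { *out = 0x00E9; return 1; }
    if (dead == TOY_DEAD_ACUTE && in == 'a') { *out = 0x00E1; return 1; }
    return 0;
}

/* Caller-owned state: the dead key waiting for its second character. */
typedef struct { toy_vdkey pending; } toy_dead_state;

/*
 * Feed one translated key. 'dead' is non-zero when the key is a dead key.
 * Returns the number of characters written to *out (0 or 1).
 */
static int toy_feed(toy_dead_state *st, toy_vdkey dead, toy_unichar ch,
                    toy_unichar *out)
{
    assert(st != NULL && out != NULL);
    if (dead != 0) {
        st->pending = dead;          /* remember it; no chaining, a new dead key replaces the old */
        return 0;
    }
    if (st->pending != 0) {
        toy_vdkey d = st->pending;
        st->pending = 0;             /* reset the state after the translate */
        if (toy_compose(d, ch, out))
            return 1;
        /* No composition found: this sketch passes the character through. */
    }
    *out = ch;
    return 1;
}
```

Keeping the pending dead key in a small caller-owned structure means the same handle can serve several input streams, each with its own state.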