═══ 1. Overview ═══

This package provides an internationalization toolkit for developing world-wide applications, based on the X/Open Portability Guidelines, Issue 4 (XPG4). This industry standard was developed to ensure portability, interoperability, and consistency of user environments across all compliant systems.

This package is not a full implementation of the X/Open XPG4 specification and does not make any claims of XPG4 branding. It implements only the portions of the XPG4 specification that deal with internationalization.

X/Open is a trademark of the X/Open Company Limited.

The XPG4 internationalization programming model has the following features:

 1. Applications can be written in a language-independent way, so that a single version of the application can support users throughout the world. This eliminates the need for multiple language-specific versions and can dramatically reduce development, manufacturing and distribution costs.

 2. Applications can "automatically" interact with international users in their own language, with the correct cultural conventions and national data encoding.

 3. The selection of localization features is user-definable through language environment variables. This announcement mechanism permits the dynamic loading of localization objects (locales), which give applications the correct cultural flavor.

 4. The language is determined by the user, not the system or application. The same application can be launched in different languages on the same workstation, simply by changing the language environment settings.

 5. New languages, countries and code pages can be supported by providing new locale objects, without requiring any change to applications.

 6. Locale objects contain information to support the following types of culturally correct processing:

    a. date and time formatting
    b. numerical formatting
    c. monetary formatting
    d. text sorting
    e. character classification
    f. character case conversion
    g. processing of single or multi-byte character text
    h. text processing using wide characters

 7. Utilities are provided to permit the packaging of application-specific translatable text in external message files. Messaging interfaces are provided to access the correct message language at run-time, as specified by the locale environment variables.

This package consists of a set of APIs, commands and locale DLLs which implement this internationalization support. This technical reference contains lists of these items as well as the programming guidelines for using them effectively.

═══ 1.1. Locale model ═══

The I18N library is based on the concept of locale objects, which can be loaded at run-time and provide all information which is specific to a particular language/territory/code-page combination.

A territory is usually a country, but could be any geographical area. A particular territory could have several different locales which correspond to different languages. For example, Switzerland has locales for both German and French.

Also, a particular language may be spoken in several different territories and there may be several locales associated with these combinations. For example, French is spoken in France, Canada, Switzerland and Belgium, so there are four locales corresponding to these combinations.

Finally, a particular language/territory combination can be used with several different code pages. For example, English in the US is available in three different locales, which support code pages IBM-437, IBM-850 and IBM-819 (ISO8859-1).

In general, a locale name has the following format:

   Xx_YY.ZZZZZZZZ

where Xx is a language abbreviation, YY is a territory abbreviation, and .ZZZZZZZZ is a code page name (generally "IBM-NNNN", where NNNN is a 3 or 4 digit code page number). So, to set the locale to US English, one could type:

   set LANG=En_US

The US English locale corresponding to the current process code page would then be loaded.
To force the locale for the IBM-437 code page, one could type:

   set LANG=En_US.IBM-437

The actual locales are stored in dynamic link libraries (in \i18n\locale). The names of the .DLL files are similar to the full names shown above, but with the underscore, period and "IBM-" removed. So the default US English locale DLL file is ENUS437.DLL. This is done to support FAT file systems, which limit file names to the 8.3 format.

The I18N package also provides the ability to define locale aliases. These are stored in "\i18n\locale\ALIASES". The table consists of rows of aliases. Each row contains the alias DLL name and the true DLL name, separated by a space. The alias file is an ASCII text file, and can be edited. Locale aliasing using the ALIASES file always takes place before any attempt to load a locale.

The following is a list of locales provided in this release:

┌──────────┬───────────────┬──────────────────────────────────────────────────┐
│DLL       │Locale Name    │Locale Description                                │
├──────────┼───────────────┼──────────────────────────────────────────────────┤
│(setloc1) │C              │Default locale if no locale DLL can be loaded.    │
│ARAA864   │Ar_AA.IBM-864  │Arabic in Arabic Area (Primary)                   │
│ARAA1046  │Ar_AA.IBM-1046 │Arabic in Arabic Area (Alternate)                 │
│BGBG915   │Bg_BG.IBM-915  │Bulgarian in Bulgaria                             │
│CSCZ852   │Cs_CZ.IBM-852  │Czech in Czech Republic (Primary)                 │
│CSCZ912   │Cs_CZ.IBM-912  │Czech in Czech Republic (ISO8859-2)               │
│DADK850   │Da_DK.IBM-850  │Danish in Denmark                                 │
│DECH850   │De_CH.IBM-850  │German in Switzerland                             │
│DEDE850   │De_DE.IBM-850  │German in Germany (and Austria)                   │
│ELGR869   │El_GR.IBM-869  │Greek in Greece                                   │
│ENGB850   │En_GB.IBM-850  │English in the U.K.                               │
│ENUS437   │En_US.IBM-437  │English in the U.S. (Primary)                     │
│ENUS850   │En_US.IBM-850  │English in the U.S. (Alternate)                   │
│ENUS819   │En_US.IBM-819  │English in the U.S. (ISO8859-1)                   │
│ESES850   │Es_ES.IBM-850  │Spanish in Spain                                  │
│FIFI850   │Fi_FI.IBM-850  │Finnish in Finland                                │
│FRBE850   │Fr_BE.IBM-850  │French in Belgium                                 │
│FRCA850   │Fr_CA.IBM-850  │French in Canada                                  │
│FRCH850   │Fr_CH.IBM-850  │French in Switzerland                             │
│FRFR850   │Fr_FR.IBM-850  │French in France                                  │
│HRHR852   │Hr_HR.IBM-852  │Croatian in Croatia (Primary)                     │
│HRHR912   │Hr_HR.IBM-912  │Croatian in Croatia (ISO8859-2)                   │
│HUHU852   │Hu_HU.IBM-852  │Hungarian in Hungary (Primary)                    │
│HUHU912   │Hu_HU.IBM-912  │Hungarian in Hungary (ISO8859-2)                  │
│ISIS850   │Is_IS.IBM-850  │Icelandic in Iceland                              │
│ITIT850   │It_IT.IBM-850  │Italian in Italy                                  │
│IWIL862   │Iw_IL.IBM-862  │Hebrew in Israel (Primary)                        │
│IWIL856   │Iw_IL.IBM-856  │Hebrew in Israel (Alternate)                      │
│JAJP932   │Ja_JP.IBM-932  │Japanese in Japan (Primary)                       │
│(alias)   │Ja_JP.IBM-942  │Japanese in Japan (Alternate)                     │
│KOKR949   │Ko_KR.IBM-949  │Korean in Korea                                   │
│MKMK915   │Mk_MK.IBM-915  │Macedonian in Macedonia                           │
│NLBE850   │Nl_BE.IBM-850  │Flemish in Belgium                                │
│NLNL850   │Nl_NL.IBM-850  │Dutch in Netherlands                              │
│NONO850   │No_NO.IBM-850  │Norwegian in Norway                               │
│PLPL852   │Pl_PL.IBM-852  │Polish in Poland (Primary)                        │
│PLPL912   │Pl_PL.IBM-912  │Polish in Poland (ISO8859-2)                      │
│PTBR850   │Pt_BR.IBM-850  │Portuguese in Brazil                              │
│PTPT850   │Pt_PT.IBM-850  │Portuguese in Portugal                            │
│RORO852   │Ro_RO.IBM-852  │Romanian in Romania                               │
│RURU866   │Ru_RU.IBM-866  │Russian in Russia (Primary)                       │
│RURU915   │Ru_RU.IBM-915  │Russian in Russia (ISO8859-5)                     │
│SHSP852   │Sh_SP.IBM-852  │Latin Serbian in Serbia (Primary)                 │
│SHSP912   │Sh_SP.IBM-912  │Latin Serbian in Serbia (ISO8859-2)               │
│SLSI852   │Sl_SI.IBM-852  │Slovenian in Slovenia (Primary)                   │
│SLSI912   │Sl_SI.IBM-912  │Slovenian in Slovenia (ISO8859-2)                 │
│SKSK852   │Sk_SK.IBM-852  │Slovak in Slovakia (Primary)                      │
│SKSK912   │Sk_SK.IBM-912  │Slovak in Slovakia (ISO8859-2)                    │
│SRSP915   │Sr_SP.IBM-915  │Cyrillic Serbian in Serbia                        │
│SVSE850   │Sv_SE.IBM-850  │Swedish in Sweden                                 │
│THTH874   │Th_TH.IBM-874  │Thai in Thailand                                  │
│TRTR857   │Tr_TR.IBM-857  │Turkish in Turkey                                 │
│ZHCN1381  │Zh_CN.IBM-1381 │Simplified Chinese in China                       │
│ZHTW950   │Zh_TW.IBM-950  │Traditional Chinese in Taiwan (Primary)           │
│ZHTW948   │Zh_TW.IBM-948  │Traditional Chinese in Taiwan (Alternate)         │
└──────────┴───────────────┴──────────────────────────────────────────────────┘

In addition to using the 8.3 names (without the underscore or period) for locale DLLs, the I18N package also uses those names for message catalog directories.
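The naming rule described above (drop the underscore, the period and the "IBM-" prefix, then use the result as the DLL or message directory name) can be sketched in C. This is only an illustration under stated assumptions: locale_to_basename and its parameters are hypothetical names, not functions of the I18N package, and the sketch ignores the ALIASES file. The default_cp argument stands in for the current process code page, which the package itself would obtain from the system.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the I18N package): derive the 8.3 base
 * name from a locale name such as "En_US.IBM-437". The underscore, the
 * period and a leading "IBM-" in the code page part are dropped and the
 * result is upper-cased. default_cp supplies the code page when the locale
 * name has none. Returns 0 on success, -1 if the result does not fit.
 */
int locale_to_basename(const char *locale, const char *default_cp,
                       char *out, size_t outsize)
{
    const char *dot;
    const char *cp;
    size_t lang_len;
    size_t n = 0;
    size_t i;

    assert(locale != NULL && default_cp != NULL && out != NULL);
    if (outsize == 0)
        return -1;

    /* Split "Xx_YY" from the optional ".ZZZZZZZZ" code page part. */
    dot = strchr(locale, '.');
    cp = (dot != NULL) ? dot + 1 : default_cp;
    lang_len = (dot != NULL) ? (size_t)(dot - locale) : strlen(locale);

    /* "IBM-437" and "437" are treated the same way. */
    if (strncmp(cp, "IBM-", 4) == 0)
        cp += 4;

    /* Copy language and territory, skipping the underscore. */
    for (i = 0; i < lang_len; i++) {
        if (locale[i] == '_')
            continue;
        if (n + 1 >= outsize)
            return -1;
        out[n++] = (char)toupper((unsigned char)locale[i]);
    }

    /* Append the code page number. */
    for (; *cp != '\0'; cp++) {
        if (n + 1 >= outsize)
            return -1;
        out[n++] = (char)toupper((unsigned char)*cp);
    }

    out[n] = '\0';
    return 0;
}
```

With a 9-byte buffer (eight characters plus the terminator), calling locale_to_basename("En_US.IBM-437", "437", buf, sizeof buf) fills buf with ENUS437, which matches the DLL name listed in the table for that locale. A real loader would still have to consult the ALIASES file before looking for the DLL.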
For example, suppose you have the following NLSPATH and LANG settings, with a process code page of 437:

   set NLSPATH=E:\I18N\MESSAGES\%L\%N
   set LANG=En_US

Further suppose your program executes the following catalog open call:

   cat_handle = catopen("my.cat", 0);

catopen would then attempt to find the following file:

   E:\I18N\MESSAGES\ENUS437\MY.CAT

That is because the locale (En_US) is remapped to ENUS437 (the locale DLL name without the .DLL extension). Look at the \I18N\MESSAGES directory in this package to see some of the locale message directories provided.

═══ 1.2. Environment Variables ═══

The following environment variables determine the internationalization behavior of the system. Applications inherit them as defaults from CONFIG.SYS at boot time, or they can be set directly with the SET command.

   LOCPATH       Specifies the search path(s) for the locale DLLs. Different paths are separated by semicolons.

   NLSPATH       Specifies the search path(s) for locating the message catalog files. Different paths are separated by semicolons.

   LC_ALL        Specifies the locale for all categories. Overrides LANG and the other LC_* environment variables.

   LC_COLLATE    Specifies the locale for the LC_COLLATE (collation) category.

   LC_CTYPE      Specifies the locale for the LC_CTYPE category. This category determines character classification, case mappings and multibyte/wide character processing.

   LC_MESSAGES   Specifies the locale for the LC_MESSAGES (message catalog) category.

   LC_MONETARY   Specifies the locale for the LC_MONETARY (monetary formatting) category.

   LC_NUMERIC    Specifies the locale for the LC_NUMERIC (numeric formatting) category.

   LC_TIME       Specifies the locale for the LC_TIME (date/time formatting) category.

   LANG          Default locale setting, which can be overridden by the preceding environment variables.

When setlocale is called, the settings are consulted in the following order of priority:

 1. If the LC_ALL environment variable is set, the value of the LC_ALL variable is used for all categories.

 2. If the LC_ALL environment variable is not set, the values specified for the medium-priority environment variables (LC_COLLATE, LC_CTYPE, LC_MESSAGES, LC_MONETARY, LC_NUMERIC, LC_TIME) are used.

 3. If individual LC_* environment variables are not set, the value of the LANG environment variable specifies the locale for all remaining categories.

 4. If the LANG environment variable is not set, the default locale is used for all remaining categories.

═══ 1.3. Using the CHAR datatype ═══

The CHAR datatype is still used in most cases, but it may now contain multi-byte characters as defined in the specific locale. A multi-byte character is composed of one or more bytes, with no embedded null bytes. The standard C str* functions are still used to manipulate these kinds of strings, with the following exceptions:

   strcmp           Does not handle < or > comparisons in a culturally correct way. However, strcmp still works for equal/not equal tests. To determine greater than/less than on strings, use the strcoll and strxfrm functions.

   strncpy          Is not multi-byte aware and may copy only part of a multi-byte character.

   strstr           Is not multi-byte aware for single character strings, but works when the search key is longer than one character.

   strchr/strrchr   Are not multi-byte aware when searching for a character, and so may find a single-byte character in the middle of a multi-byte sequence.

═══ 1.4. Wide character datatype ═══

The XPG4 programming model defines a new character datatype, wchar_t, for 16-bit character code elements. The wchar_t datatype extends the range of the standard char datatype to 16-bit characters. In general, programmers only need to use this datatype when direct character manipulation is needed on multi-byte character data.

Special functions are provided to convert between multi-byte and wchar strings (see the mbstowcs, wcstombs, mbtowc, and wctomb functions below), and to copy, compare, or search wchar strings (see the wcs* functions below).

The XPG4 programming model doesn't dictate the code page of the wchar_t variables.
Wide characters can be used in any code page. As a programmer writing NLS programs, you cannot assume anything about what value a wchar_t variable may contain, because it can vary from locale to locale, and from vendor to vendor. Instead, you must use the I18N functions to manipulate and test wchar_t variables to ensure proper operation of your program in an international environment. These functions are described, in detail, below.

Typically, wide character functions either start with the prefix "wcs" (as in wcscpy - wide character copy), or have the letter "w" inserted in the function name (as in "iswalpha" - is the wide character an alphabetic character).

Note: Wide characters and strings *can* be represented in your program as literal values. Place the capital letter 'L' in front of your literal characters and strings.

   wcscpy(a_wide_var, L"A wide char string!");
   if (a_wide_char_var == L'\0') ...

═══ 1.5. Recommended Reading ═══

The Library of NLS recommends the following books:

   GG24-3850   ITSC AIX 3.2 Natl Language Support
   SC23-2431   Internationalization of AIX Software: A Programmer's Guide
   XOPEN       X/Open CAE Specification, Issue 4

═══ 2. Prototypes of I18N Functions ═══

The I18N internationalization library consists of over 80 different application programming interfaces (APIs) which you can use to internationalize your applications. This section lists the categories and prototypes of these APIs.

 1. Runtime Locale Load/Query Functions:

      char *setlocale(int category, const char *locale);
      char *nl_langinfo(nl_item item);
      struct lconv *localeconv(void);

 2. Message Catalog Functions:

      nl_catd catopen(const char *name, int oflag);
      char *catgets(nl_catd catd, int set_id, int msg_id, const char *s);
      int catclose(nl_catd catd);

 3. File I/O Functions:

      wint_t fgetwc(FILE *stream);
      wchar_t *fgetws(wchar_t *s, int n, FILE *stream);
      wint_t fputwc(const wint_t wc, FILE *stream);
      int fputws(const wchar_t *s, FILE *stream);
      wint_t getwc(FILE *stream);
      wint_t putwc(wint_t c, FILE *stream);
      wint_t ungetwc(wint_t c, FILE *stream);
      wint_t getwchar(void);
      wint_t putwchar(wint_t c);
      int getw(register FILE *stream);
      int putw(int w, register FILE *stream);

 4. Code Set Conversion Functions:

      iconv_t iconv_open(const char *tocode, const char *fromcode);
      size_t iconv(iconv_t cd, const char **inbuf, size_t *inbytesleft,
                   char **outbuf, size_t *outbytesleft);
      int iconv_close(iconv_t cd);

 5. Character Attribute Testing (uses locale-based methods):

      int isalnum(int c);
      int isalpha(int c);
      int iscntrl(int c);
      int isdigit(int c);
      int isgraph(int c);
      int islower(int c);
      int isprint(int c);
      int ispunct(int c);
      int isspace(int c);
      int isupper(int c);
      int isxdigit(int c);
      int toupper(int c);
      int tolower(int c);
      int iswalnum(wint_t wc);
      int iswalpha(wint_t wc);
      int iswcntrl(wint_t wc);
      int iswdigit(wint_t wc);
      int iswgraph(wint_t wc);
      int iswlower(wint_t wc);
      int iswprint(wint_t wc);
      int iswpunct(wint_t wc);
      int iswspace(wint_t wc);
      int iswupper(wint_t wc);
      int iswxdigit(wint_t wc);
      int iswctype(wint_t wc, wctype_t mask);
      wint_t towupper(wint_t wc);
      wint_t towlower(wint_t wc);

 6. Basic conversion methods for multibyte and wchar_t:

      int mblen (const char *s, size_t n);
      size_t mbstowcs(wchar_t *ws, const char *s, size_t n);
      int mbtowc (wchar_t *wc, const char *s, size_t n);
      double wcstod (const wchar_t *nptr, wchar_t **endptr);
      long wcstol (const wchar_t *nptr, wchar_t **endptr, int base);
      size_t wcstombs(char *s, const wchar_t *ws, size_t n);
      unsigned long wcstoul (const wchar_t *nptr, wchar_t **endptr, int base);
      int wctomb (char *s, wchar_t wchar);

 7. Formatted I/O Functions:

      fscanf/scanf/sscanf with %ws, %wc decoding, and parameter reordering via the (%n$x) format
      fprintf/printf/sprintf with %ws, %wc encoding, and parameter reordering via the (%n$x) format

 8. Collations:

      int strcoll(const char *s1, const char *s2);
      size_t strxfrm(char *s1, const char *s2, size_t n);
      int wcscoll(const wchar_t *ws1, const wchar_t *ws2);
      size_t wcsxfrm(wchar_t *ws1, const wchar_t *ws2, size_t n);

 9. Date and Time Formatting:

      size_t strftime (char *s, size_t maxsize, const char *format, const struct tm *tm);
      size_t strfmon (char *s, size_t maxsize, const char *format, ...);
      char *strptime (const char *buf, const char *fmt, struct tm *tm);
      size_t wcsftime (wchar_t *wcs, size_t maxsize, const char *format, const struct tm *tm);

 10. Basic String Manipulation API for the wchar_t data type:

      wchar_t *wcscat (wchar_t *string1, const wchar_t *string2);
      wchar_t *wcschr (wchar_t *string1, wint_t wc);
      int wcscmp (const wchar_t *string1, const wchar_t *string2);
      wchar_t *wcscpy (wchar_t *string1, const wchar_t *string2);
      size_t wcscspn (const wchar_t *string1, const wchar_t *string2);
      size_t wcslen (const wchar_t *ws);
      wchar_t *wcsncat (wchar_t *string1, const wchar_t *string2, size_t n);
      int wcsncmp (const wchar_t *string1, const wchar_t *string2, size_t n);
      wchar_t *wcsncpy (wchar_t *string1, const wchar_t *string2, size_t n);
      wchar_t *wcspbrk (const wchar_t *string1, const wchar_t *string2);
      wchar_t *wcsrchr (wchar_t *string1, wint_t wc);
      size_t wcsspn (const wchar_t *string1, const wchar_t *string2);
      wchar_t *wcstok (wchar_t *string1, const wchar_t *string2);
      wchar_t *wcswcs (const wchar_t *string1, const wchar_t *string2);
      int wcswidth(wchar_t *ws, size_t n);
      char wctype (const char *charclass);
      int wcwidth (wchar_t wc);

 11. Additional Helper Functions:

      double get_i18n_version(void);

═══ 3. Locale Functions ═══

The locale functions are:

   char *setlocale(int category, const char *locale);
   char *nl_langinfo(nl_item item);
   struct lconv *localeconv(void);

═══ 3.1. setlocale -- Defines/queries the program's locale ═══

Syntax

   #include <locale.h>

   char *setlocale(int category, const char *locale);

Description

By design, an application program initially starts up with a default locale. This default locale is determined during the setloc1.dll initialization by first mapping the current country code and code page to the set of installed locales. Then the LANG and LC_* environment variables are examined and, if present, override this first default. If a locale which uses code page 437 is not found, the US English 437 locale, enus437, is used instead. If a locale which uses code page 850 is not found, the US English 850 locale, enus850, is used instead. If no suitable default locale can be found, the "C" locale is used.

Most internationalized programs, however, must then change the program's locale to the locale set by the user. This is done with a call to setlocale. When setlocale is called, the program's locale is determined either by the LANG and LC_* environment variables or directly by the second argument to setlocale.

Parameters

   category   Specifies which locale categories to set. Possible values are:

      LC_ALL        All categories are affected.
      LC_COLLATE    Affects behavior of the strcoll and strxfrm functions.
      LC_CTYPE      Affects behavior of the character handling functions (see the is* functions).
      LC_MESSAGES   Affects message information (see the catgets function).
      LC_MONETARY   Affects monetary information (see the strfmon function).
      LC_NUMERIC    Affects the decimal-point character for the formatted input/output and string conversion functions, and the non-monetary formatting information returned by the localeconv function.
      LC_TIME       Affects behavior of the strftime function.

   locale     Specifies which locale to use. You can set the value to any valid locale.
To query what locale is currently active for a given category, specify NULL for the locale parameter. The setlocale function returns a pointer to the string associated with the specified category. The string can be used on a subsequent call to restore that part of the program's locale.

Note: Because the string to which a successful call to setlocale points may be overwritten by subsequent calls to the setlocale function, you should copy the string if you plan to use it later.

Return Values

On error, the setlocale function returns NULL and the program's locale is not changed.

Related Information

Running samples of the setlocale function call can be found in most I18N programs. The sample program setl demonstrates most of the uses of the setlocale function.

The following is a list of related functions and include files:

 • getenv - Search for Environment Variables
 • localeconv - Query Locale Conventions
 • _putenv - Modify Environment Variables
 • locale.h - Contains definitions needed for setlocale

Examples

This example sets the locale of the program equal to the session's locale and prints the string that is associated with the locale.

   #include <stdio.h>
   #include <locale.h>

   char *string;

   int main(void)
   {
      /* Inherit parent session's locale. */
      string = setlocale(LC_ALL, "");

      /* Query the state of the locale. */
      /* If it is returned, print it out. */
      string = setlocale(LC_ALL, NULL);
      if (string != NULL) {
         printf(" %s \n", string);
      }
      return 0;
   }

If LANG were set, for example, to En_US.IBM-437, then the program would display:

   ENUS437 ENUS437 ENUS437 ENUS437 ENUS437 ENUS437

This example sets the LC_COLLATE category of the program's locale equal to the US English 850 locale and prints the string that is associated with the locale.
   #include <stdio.h>
   #include <locale.h>

   char *string;

   int main(void)
   {
      /* Change only the collation category. */
      string = setlocale(LC_COLLATE, "En_US.IBM-850");

      /* Query the state of all categories and print it. */
      string = setlocale(LC_ALL, NULL);
      if (string != NULL) {
         printf(" %s \n", string);
      }
      return 0;
   }

If LANG were set, for example, to En_US.IBM-437, then the program would display:

   ENUS437 ENUS437 ENUS850 ENUS437 ENUS437 ENUS437

This example sets all of the locale settings (categories) based on the environment variables. This is the usual form of the setlocale call in your programs.

   #include <locale.h>

   int main(void)
   {
      setlocale(LC_ALL, "");
      return 0;
   }

═══ 3.2. nl_langinfo -- Retrieves information from locale ═══

Syntax

   #include <langinfo.h>

   char *nl_langinfo(nl_item item);

Description

The nl_langinfo function returns a pointer to a string containing information relevant to the language or cultural area defined in the program's locale. This string should not be modified by the calling program. Calls to the setlocale function may also modify the string.

Parameters

   item   Identifies the specific information being requested. The values for item are defined in the file langinfo.h. A list of the possible values follows:

      D_T_FMT     string for formatting date and time
      D_FMT       string for formatting date
      T_FMT       string for formatting time
      AM_STR      string for a.m.
      PM_STR      string for p.m.
      ABDAY_1     abbreviated first day of the week (Sun)
      ABDAY_2     abbreviated second day of the week (Mon)
      ABDAY_3     abbreviated third day of the week (Tue)
      ABDAY_4     abbreviated fourth day of the week (Wed)
      ABDAY_5     abbreviated fifth day of the week (Thu)
      ABDAY_6     abbreviated sixth day of the week (Fri)
      ABDAY_7     abbreviated seventh day of the week (Sat)
      DAY_1       name of the first day of the week (Sunday)
      DAY_2       name of the second day of the week (Monday)
      DAY_3       name of the third day of the week (Tuesday)
      DAY_4       name of the fourth day of the week (Wednesday)
      DAY_5       name of the fifth day of the week (Thursday)
      DAY_6       name of the sixth day of the week (Friday)
      DAY_7       name of the seventh day of the week (Saturday)
      ABMON_1     abbreviated first month (Jan)
      ABMON_2     abbreviated second month (Feb)
      ABMON_3     abbreviated third month (Mar)
      ABMON_4     abbreviated fourth month (Apr)
      ABMON_5     abbreviated fifth month (May)
      ABMON_6     abbreviated sixth month (Jun)
      ABMON_7     abbreviated seventh month (Jul)
      ABMON_8     abbreviated eighth month (Aug)
      ABMON_9     abbreviated ninth month (Sep)
      ABMON_10    abbreviated tenth month (Oct)
      ABMON_11    abbreviated eleventh month (Nov)
      ABMON_12    abbreviated twelfth month (Dec)
      MON_1       name of the first month (January)
      MON_2       name of the second month (February)
      MON_3       name of the third month (March)
      MON_4       name of the fourth month (April)
      MON_5       name of the fifth month (May)
      MON_6       name of the sixth month (June)
      MON_7       name of the seventh month (July)
      MON_8       name of the eighth month (August)
      MON_9       name of the ninth month (September)
      MON_10      name of the tenth month (October)
      MON_11      name of the eleventh month (November)
      MON_12      name of the twelfth month (December)
      RADIXCHAR   radix character
      THOUSEP     separator for thousands
      YESSTR      affirmative response for yes/no queries
      NOSTR       negative response for yes/no queries
      CRNCYSTR    currency symbol, preceded by "-" if the symbol leads the value or "+" if it trails the value
      CODESET     codeset name

Return Values

Returns a pointer to a string containing the requested information, or a pointer to an empty string if an error occurs.
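Because the returned string must not be modified and may change after a later setlocale call, a program that wants to keep a value will usually copy it first. The following is a minimal sketch of that pattern, not code from the I18N package: copy_langinfo and day_name_in_locale are hypothetical helper names. Note one simplifying assumption: the sketch treats every empty string as "no value", although some items (for example THOUSEP in the C locale) can legitimately be empty.

```c
#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the I18N package: copy the locale string
 * for item into a caller-owned buffer. Returns 0 on success, or -1 if
 * nl_langinfo returned an empty string or the value does not fit.
 */
int copy_langinfo(nl_item item, char *buf, size_t bufsize)
{
    const char *value;
    size_t len;

    assert(buf != NULL);

    value = nl_langinfo(item);
    if (value == NULL || value[0] == '\0')
        return -1;

    len = strlen(value);
    if (len >= bufsize)
        return -1;

    memcpy(buf, value, len + 1);   /* include the terminating null byte */
    return 0;
}

/*
 * Hypothetical helper: select a locale for all categories, then fetch one
 * item (for example DAY_5) into buf. Returns -1 if the locale cannot be set.
 */
int day_name_in_locale(const char *locale, nl_item day,
                       char *buf, size_t bufsize)
{
    if (setlocale(LC_ALL, locale) == NULL)
        return -1;
    return copy_langinfo(day, buf, bufsize);
}
```

For instance, day_name_in_locale("C", DAY_5, buf, sizeof buf) copies the C locale's name for the fifth day of the week into buf, and the copy stays valid even if the program later switches locales.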
Related Information

 • langinfo.h

Example

This program queries the current locale for the strings associated with the items DAY_5 and ABMON_10.

   #include <stdio.h>
   #include <locale.h>
   #include <langinfo.h>

   int main(void)
   {
      char *locale;

      printf("\nCalling setlocale\n");
      locale = setlocale(LC_ALL, "");
      printf("Setlocale returns: %s \n", locale != NULL ? locale : "(null)");
      printf("My Kingdom for a Day and a Month : %s %s\n",
             nl_langinfo(DAY_5), nl_langinfo(ABMON_10));
      return 0;
   }

═══ 3.3. localeconv -- Determine program locale ═══

Syntax

   #include <locale.h>

   struct lconv *localeconv(void);

Description

The localeconv function sets the components of a structure, struct lconv, to values appropriate for the current locale. The structure may be overwritten by another call to localeconv or by calling setlocale and passing LC_ALL, LC_MONETARY, or LC_NUMERIC.

The structure contains the following elements (defaults shown are for the C locale):

┌───────────────────────┬───────────────────────────────────┬───────────┐
│ELEMENT                │PURPOSE                            │DEFAULT    │
├───────────────────────┼───────────────────────────────────┼───────────┤
│char *decimal_point    │Radix character used to format     │"."        │
│                       │non-monetary quantities.           │           │
├───────────────────────┼───────────────────────────────────┼───────────┤
│char *thousands_sep    │Character used to separate groups  │""         │
│                       │of digits to the left of the       │           │
│                       │decimal point character in         │           │
│                       │formatted non-monetary quantities. │           │
├───────────────────────┼───────────────────────────────────┼───────────┤
│char *grouping         │String whose elements taken as     │""         │
│                       │one-byte integer values indicate   │           │
│                       │the size of each group of digits in│           │
│                       │formatted non-monetary quantities. │           │
├───────────────────────┼───────────────────────────────────┼───────────┤
│char *int_curr_symbol  │International currency symbol for  │""         │
│                       │the current locale. The first      │           │
│                       │three characters contain the       │           │
│                       │alphabetic international currency  │           │
│                       │symbol. The fourth character       │           │
│                       │(usually a space) is the character │           │
│                       │used to separate the international │           │
│                       │currency symbol from the monetary  │           │
│                       │quantity.                          │           │
├───────────────────────┼───────────────────────────────────┼───────────┤
│char *currency_symbol  │Local currency symbol of the       │""         │
│                       │current locale.                    │           │
├───────────────────────┼───────────────────────────────────┼───────────┤
│char *mon_decimal_point│Radix character used to format     │"."        │
│                       │monetary quantities.               │           │
├───────────────────────┼───────────────────────────────────┼───────────┤
│char *mon_thousands_sep│Separator for digits in formatted  │""         │
│                       │monetary quantities.               │           │
├───────────────────────┼───────────────────────────────────┼───────────┤
│char *mon_grouping     │String whose elements taken as     │""         │
│                       │one-byte integer values indicate   │           │
│                       │the size of each group of digits in│           │
│                       │formatted monetary quantities.     │           │
├───────────────────────┼───────────────────────────────────┼───────────┤
│char *positive_sign    │String indicating a positive value │           │
│                       │formatted monetary quantity.       │           │
├───────────────────────┼───────────────────────────────────┼───────────┤
│char *negative_sign    │String indicating a negative value │           │
│                       │formatted monetary quantity.       │           │
├───────────────────────┼───────────────────────────────────┼───────────┤
│char int_frac_digits   │The number of displayed digits to  │"UCHAR_MAX"│
│                       │the right of the decimal place for │           │
│                       │internationally formatted monetary │           │
│                       │quantities.                        │           │
├───────────────────────┼───────────────────────────────────┼───────────┤
│char frac_digits       │Number of digits to the right of   │"UCHAR_MAX"│
│                       │the decimal place in monetary      │           │
│                       │quantities.                        │           │
├───────────────────────┼───────────────────────────────────┼───────────┤
│char p_cs_precedes     │1 if the "currency_symbol" or      │"UCHAR_MAX"│
│                       │int_curr_symbol precedes the value │           │
│                       │for a non-negative formatted       │           │
│                       │monetary quantity; 0 if it succeeds│           │
│                       │the value.                         │           │
├───────────────────────┼───────────────────────────────────┼───────────┤
│char p_sep_by_space    │0 if the "currency_symbol" or      │"UCHAR_MAX"│
│                       │int_curr_symbol is not separated by│           │
│                       │a space from the value for a       │           │
│                       │non-negative formatted monetary    │           │
│                       │quantity; 1 if a space separates   │           │
│                       │the symbol from the value; and 2   │           │
│                       │if a space separates the symbol and│           │
│                       │the sign string, if adjacent.      │           │
├───────────────────────┼───────────────────────────────────┼───────────┤
│char n_cs_precedes     │1 if the "currency_symbol" or      │"UCHAR_MAX"│
│                       │int_curr_symbol precedes the value │           │
│                       │for a negative formatted monetary  │           │
│                       │quantity; 0 if it succeeds the     │           │
│                       │value.                             │           │
├───────────────────────┼───────────────────────────────────┼───────────┤
│char n_sep_by_space    │0 if the "currency_symbol" or      │"UCHAR_MAX"│
│                       │int_curr_symbol is not separated by│           │
│                       │a space from the value for a       │           │
│                       │negative formatted monetary        │           │
│                       │quantity; 1 if a space separates   │           │
│                       │the symbol from the value; and 2   │           │
│                       │if a space separates the symbol and│           │
│                       │the sign string, if adjacent.      │           │
├───────────────────────┼───────────────────────────────────┼───────────┤
│char p_sign_posn       │Value indicating the position of   │           │
│                       │the "positive_sign" for a          │           │
│                       │non-negative formatted monetary    │           │
│                       │quantity.                          │           │
├───────────────────────┼───────────────────────────────────┼───────────┤
│char n_sign_posn       │Value indicating the position of   │           │
│                       │the "negative_sign" for a negative │           │
│                       │formatted monetary quantity.       │           │
├───────────────────────┼───────────────────────────────────┼───────────┤
│char *left_parenthesis │Character that appears on left side│""         │
│                       │of negative currency amount.       │           │
├───────────────────────┼───────────────────────────────────┼───────────┤
│char *right_parenthesis│Character that appears on right    │""         │
│                       │side of negative currency amount.  │           │
└───────────────────────┴───────────────────────────────────┴───────────┘

The n_sign_posn and p_sign_posn elements can have the following values:

   Value   Meaning
   0       Parentheses surround the quantity and currency_symbol or int_curr_symbol.
   1       The sign precedes the quantity and currency_symbol or int_curr_symbol.
   2       The sign follows the quantity and currency_symbol or int_curr_symbol.
   3       The sign precedes the currency_symbol or int_curr_symbol.
   4       The sign follows the currency_symbol or int_curr_symbol.

Return Values

The localeconv function returns a pointer to the structure.

Related Information

 • setlocale - Set Locale
 • locale.h

Example

This example prints out the default decimal point for your locale and then the decimal point for the Fr_FR locale.

   #include <stdio.h>
   #include <locale.h>

   int main(void)
   {
      char *string;
      struct lconv *mylocale;

      mylocale = localeconv();

      /* Default decimal point */
      printf("Default decimal point is a %s\n", mylocale->decimal_point);

      string = setlocale(LC_ALL, "Fr_FR");
      if (string == NULL) {
         printf("The Fr_FR locale could not be loaded\n");
         return 1;
      }
      mylocale = localeconv();

      /*****************************************
       * A comma is set to be the decimal point *
       * when the locale is Fr_FR               *
       *****************************************/
      printf("France's decimal point is a %s\n", mylocale->decimal_point);
      return 0;
   }

   /**************** Output should be similar to: ******************

   Default decimal point is a .
   France's decimal point is a ,

   *******************************************************************/

═══ 4. Message Catalog Functions ═══

The message catalog functions are:

   nl_catd catopen(const char *name, int oflag);
   char *catgets(nl_catd catd, int set_id, int msg_id, const char *s);
   int catclose(nl_catd catd);

═══ 4.1. catopen -- Opens a specified message catalog ═══

Syntax

   #include <nl_types.h>

   nl_catd catopen(const char *name, int oflag);

Description

The catopen function opens a specified message catalog and returns a catalog descriptor used to retrieve messages from the catalog.
The contents of the catalog descriptor are complete when the catgets function accesses the message catalog. The nl_catd data type is used for catalog descriptors. This data type is defined in the nl_types.h file.

If the catalog file name referred to by the name parameter contains a drive letter or a leading \, it is assumed to be an absolute path name, and the catalog is looked for at exactly that path. If the name specified is not an absolute path name, the user environment determines which directory paths to search. The NLSPATH environment variable defines the directory search path. When this variable is used, the setlocale function must be called before the catopen function.

You can use two special variables, %N and %L, in the NLSPATH environment variable. The %N variable is replaced by the catalog name referred to by the call that opens the message catalog. The %L variable is replaced by the value of the LC_MESSAGES category. The value of the LC_MESSAGES category can be set by specifying values for the LANG, LC_ALL, or LC_MESSAGES environment variable. The value of the LC_MESSAGES category indicates which locale-specific directory to search for message catalogs.

For example, if the catopen function specifies a catalog with the name mycmd, and the environment variables are set as follows:

    NLSPATH=..\%N:.\%N:\system\nls\%L\%N:\system\nls\%N
    LANG=Fr_FR.IBM-850

then the application searches for the catalog in the following order:

    ..\mycmd
    .\mycmd
    \system\nls\FRFR850\mycmd
    \system\nls\mycmd

If you omit the %N variable in a directory specification within the NLSPATH environment variable, the application assumes that the path defines a directory and searches for the catalog in that directory before searching the next specified path.

If the NLSPATH environment variable is not defined, the current directory is used.

If the LC_MESSAGES category is set to the C locale (LC_MESSAGES=C), then the NLSPATH variable mechanism is disabled.
Subsequent calls to the catgets function then return pointers to the program-supplied default text.

Parameters

   name    Specifies the catalog file to open.
   oflag   Designates how the message catalog is located.
           0               The LANG environment variable is used to locate the message catalog without regard to the LC_MESSAGES category.
           NL_CAT_LOCALE   The LC_MESSAGES category is used to locate the message catalog.

Return Values

   Successful   A catalog descriptor is returned.
   Error        CATD_ERR (-1) is returned if the LC_MESSAGES category is set to the "C" locale or if an error occurred during the creation of the nl_catd structure.

Related Information

   setlocale - Set a locale.
   catgets - Read a message.
   catclose - Close a message catalog file.

Example

    #include <stdio.h>
    #include <stdlib.h>
    #include <nl_types.h>

    void load_cat(char *tcat)
    {
        nl_catd catd;    /* Catalog descriptor. */

        if ((catd = catopen(tcat, 0)) == CATD_ERR) {
            printf("Unable to load specified catalog.\n");
            exit(1);
        }
        if (catclose(catd) == -1)
            printf("Error when trying to close catalog file\n");
    }

═══ 4.2. catgets -- Retrieves a message from a catalog ═══

Syntax

    #include <nl_types.h>

    char *catgets(nl_catd catd, int set_id, int msg_id, const char *s);

Description

The catgets function retrieves from a catalog the message identified by the set_id and msg_id parameters. If the catgets function finds the specified message, it loads it into an internal character string buffer, ends the message string with a null character, and returns a pointer to the buffer. The returned pointer is used to reference the buffer and display the message. However, the buffer cannot be referenced after the catalog is closed.

Parameters

   catd     Specifies the open catalog to use for message retrieval. This is the catalog descriptor returned by the catopen call.
   set_id   Specifies the set ID of the message.
   msg_id   Specifies the message ID of the message. Together, the set_id and msg_id parameters specify the particular message in the catalog to retrieve.
   s        Specifies the default character string to return if the message is not retrieved from the catalog.

Return Value

   Successful   The retrieved message is returned.
   Error        The user-supplied default message string specified by the s parameter is returned.

Related Information

   catopen - Open a message catalog.
   catclose - Close a message catalog file.
   wchar.h - Header file for message functions.

Example

This example opens the message file whose name is contained in tcat and prints the message identified by the set number in setno and the message number in msgno.

    #include <stdio.h>
    #include <stdlib.h>
    #include <nl_types.h>

    void catg(char *tcat, int setno, int msgno, char *def)
    {
        nl_catd catd;    /* Catalog descriptor. */

        if ((catd = catopen(tcat, 0)) == CATD_ERR) {
            printf("Unable to load specified catalog.\n");
            exit(1);
        }

        /* Print the message before the catalog is closed; the buffer
           returned by catgets is not valid after catclose. */
        printf("ERROR MESSAGE : %s\n", catgets(catd, setno, msgno, def));

        if (catclose(catd) == -1)
            perror("Error when trying to close catalog file");
    }

═══ 4.3. catclose -- Closes a specified message catalog ═══

Syntax

    #include <nl_types.h>

    int catclose(nl_catd catd);

Description

The catclose function closes a message catalog that was previously opened by catopen.

Parameters

   catd   Catalog descriptor returned by catopen.

Return Value

   0    Successful.
   -1   Close failed.

Related Information

   catopen - Open a message catalog.
   catgets - Read a message from the message catalog.
   nl_types.h - Header file for message functions.

Example

This example opens and closes a message catalog file.

    #include <stdio.h>
    #include <stdlib.h>
    #include <nl_types.h>

    void load_cat(char *tcat)
    {
        nl_catd catd;

        if ((catd = catopen(tcat, 0)) == CATD_ERR) {
            printf("Unable to load specified catalog.\n");
            exit(1);
        }
        if (catclose(catd) == -1)
            printf("Error when trying to close catalog file\n");
    }

═══ 5. File I/O Functions ═══

The File I/O functions are:

    wint_t fgetwc(FILE *stream);
    wint_t fputwc(const wint_t wc, FILE *stream);
    wchar_t *fgetws(wchar_t *s, int n, FILE *stream);
    int fputws(const wchar_t *s, FILE *stream);
    wint_t getwc(FILE *stream);
    wint_t putwc(wint_t c, FILE *stream);
    wint_t ungetwc(wint_t c, FILE *stream);
    wint_t getwchar(void);
    wint_t putwchar(wint_t c);
    int getw(register FILE *stream);
    int putw(int w, register FILE *stream);

═══ 5.1. fgetwc -- Read a wide-character from a stream ═══

Syntax

    #include <stdio.h>
    #include <wchar.h>

    wint_t fgetwc(FILE *stream);

Description

The fgetwc function reads a single character from the input stream at the current position, converts it to the corresponding wide character, and advances the associated file pointer, if any, so that it points to the next character.

Parameters

   stream   Input stream from which the character is to be retrieved.

Return Values

   wint_t
      WEOF    An error or an end-of-file condition occurred. Use feof or ferror to determine whether the WEOF value indicates an error or the end of the file.
      Other   The character read, in wide-character format.

Related Information

   feof - Test End-of-File Indicator
   ferror - Test for Read/Write Errors
   fopen - Open File
   wchar.h - Include file for the wide-character functions.

Example

This example gathers a single line of input from a stream.

    #include <stdio.h>
    #include <wchar.h>

    #define MAX_LEN 26

    int main(void)
    {
        FILE *stream;
        wchar_t buffer[MAX_LEN + 1];
        size_t i;
        wint_t ch;

        stream = fopen("myfile.dat", "r");
        if (stream == NULL) {
            perror("fopen error");
            return 1;
        }

        for (i = 0; (i < MAX_LEN) &&
                    ((ch = fgetwc(stream)) != WEOF) && (ch != L'\n'); i++) {
            buffer[i] = (wchar_t)ch;
        }
        buffer[i] = L'\0';

        printf("%ls", buffer);

        if (fclose(stream))
            perror("fclose error");
        return 0;
    }

    /*************** If myfile.dat contains ********************
    ABCDEFGHIJKLMNOPQRSTUVWXYZ1000000A0.013400
    ******************* expected output is: ********************
    ABCDEFGHIJKLMNOPQRSTUVWXYZ
    ************************************************************/

═══ 5.2. fputwc -- Write a wide-character to a stream ═══

Syntax

    #include <stdio.h>
    #include <wchar.h>

    wint_t fputwc(const wint_t wc, FILE *stream);

Description

The fputwc function converts the wide-character wc to a multibyte character, writes the multibyte character to the output stream at the current position, and advances the file position appropriately. If the stream is opened with one of the append modes, the character is appended to the end of the stream.

Parameters

   wc       Wide-character to be written.
   stream   Output stream into which the character is to be written.

Return Values

   wint_t
      WEOF    An error occurred.
      Other   The character that was written.

Related Information

   fgetwc - Read a wide-character.
   putwc - Write a wide-character.
   wchar.h - Header file for wide-character prototypes.

Example

This example writes the contents of buffer to a file called myfile.dat.

    #include <stdio.h>
    #include <stdlib.h>
    #include <wchar.h>

    #define NUM_ALPHA 80

    int main(void)
    {
        FILE *stream;
        int i;
        wchar_t buffer[NUM_ALPHA];

        mbstowcs(buffer, "abcdefghijklmnopqrstuvwxyz", NUM_ALPHA);

        if ((stream = fopen("myfile.dat", "w")) != NULL) {
            for (i = 0; i < (int)wcslen(buffer); ++i) {
                if (fputwc(buffer[i], stream) == WEOF) {
                    printf("Error writing to myfile.dat\n");
                    break;
                }
            }
            fclose(stream);
        }
        else
            printf("Error opening myfile.dat\n");
        return 0;
    }

═══ 5.3. fgetws -- Read wide-character string from a stream ═══

Syntax

    #include <stdio.h>
    #include <wchar.h>

    wchar_t *fgetws(wchar_t *ws, int n, FILE *stream);

Description

The fgetws function reads characters from the current stream position up to and including the first new-line character (\n), up to the end of the stream, or until the number of characters read is equal to n-1, whichever comes first. It converts each character to its corresponding wide-character, stores the result in the string pointed to by ws, and adds a null character (\0) to the end of the string. The string includes the new-line character, if read. If n is equal to 1, the string is empty.

Parameters

   ws       Points to a wchar_t array into which the wide-characters will be placed.
   n        The maximum number of wide-characters to return, including the null terminator.
   stream   Input stream from which the characters are to be retrieved.

Return Values

   wchar_t *
      NULL    An error or an end-of-file condition occurred. Use feof or ferror to determine whether the NULL value indicates an error or the end of the file. In either case, the value of the string is unchanged.
      Other   A pointer to the string containing the wide-characters.

Related Information

   feof - Test End-of-File Indicator.
   ferror - Test for Read/Write Errors.
   fputws - Print Strings.
   wchar.h - Include file for the wide-character functions.

Example

This example gets a line of input from a data stream. The example reads no more than MAX_LEN - 1 characters, or up to a new-line character, from the stream.

    #include <stdio.h>
    #include <wchar.h>

    #define MAX_LEN 27

    int main(void)
    {
        FILE *stream;
        wchar_t line[MAX_LEN], *result;

        stream = fopen("myfile.dat", "rb");
        if (stream == NULL) {
            perror("fopen error");
            return 1;
        }

        if ((result = fgetws(line, MAX_LEN, stream)) != NULL)
            printf("The string is %ls\n", result);

        if (fclose(stream))
            perror("fclose error");
        return 0;
    }

    /*************** If myfile.dat contains ********************
    ABCDEFGHIJKLMNOPQRSTUVWXYZ1000000A0.013400
    ******************* expected output is: ********************
    The string is ABCDEFGHIJKLMNOPQRSTUVWXYZ
    ************************************************************/

═══ 5.4. fputws -- Write wide-character string to a stream ═══

Syntax

    #include <stdio.h>
    #include <wchar.h>

    int fputws(const wchar_t *string, FILE *stream);

Description

The fputws function copies string to the output stream at the current position. It does not copy the null character (\0) at the end of the string. The wide-characters contained in string are converted to their corresponding multibyte characters.

Parameters

   string   Address of a null-terminated wide-character string.
   stream   Output stream into which the string is to be written.

Return Values

   int
      -1                    An error occurred.
      Non-negative number   The number of bytes written to the output stream.

Related Information

   fgetws - Read a String
   wchar.h - Header file for wide-character prototypes.
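The fputws and fgetws descriptions above can be illustrated with a small round trip. This is a minimal sketch rather than part of the package: round_trip is a hypothetical helper, and it assumes a standard C library that provides tmpfile, fputws, and fgetws. It shows that fputws does not write the terminating null and that fgetws stops after the first new-line.

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper (not part of the package): writes text to a
 * temporary file with fputws, then reads the first line back with
 * fgetws into back, which has room for n wide characters.
 * Returns 0 on success and -1 on any failure. */
int round_trip(const wchar_t *text, wchar_t *back, int n)
{
    FILE *stream = tmpfile();
    int status = -1;

    if (stream == NULL)
        return -1;

    /* fputws reports failure with a negative value; the terminating
     * null of text is not written to the stream. */
    if (fputws(text, stream) >= 0) {
        rewind(stream);
        /* fgetws stops at a new-line, at end-of-file, or after n-1
         * wide characters, and null-terminates what it stored. */
        if (fgetws(back, n, stream) != NULL)
            status = 0;
    }
    fclose(stream);
    return status;
}
```

A caller passes a wide string and a buffer, for example round_trip(L"one\ntwo", buf, 32); only the first line, including its new-line, comes back in buf.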
Example

This example writes a wide-character string to a stream.

    #include <stdio.h>
    #include <stdlib.h>
    #include <wchar.h>

    #define NUM_ALPHA 80

    int main(void)
    {
        FILE *stream;
        int num;
        wchar_t buffer[NUM_ALPHA];

        mbstowcs(buffer, "abcdefghijklmnopqrstuvwxyz", NUM_ALPHA);

        if ((stream = fopen("myfile.dat", "w")) != NULL) {
            num = fputws(buffer, stream);
            printf("Total number of bytes written to file = %i\n", num);
            fclose(stream);
        }
        else
            printf("Error opening myfile.dat\n");
        return 0;
    }

═══ 5.5. getwc -- Read a wide-character from a stream ═══

Syntax

    #include <stdio.h>
    #include <wchar.h>

    wint_t getwc(FILE *stream);

Description

This function is equivalent to the fgetwc function. See the fgetwc function description.

Parameters

Please refer to fgetwc.

Return Values

Please refer to fgetwc.

Example

This example gets a line of input from the stdin stream. You can also use getwchar() instead of getwc(stdin) in the for statement to get a line of input from stdin.

    #include <stdio.h>
    #include <wchar.h>

    #define LINE 80

    int main(void)
    {
        wchar_t buffer[LINE + 1];
        int i;
        wint_t ch;

        printf("Please enter string\n");

        /* Keep reading until either:
           1. the length of LINE is exceeded,
           2. the input character is WEOF,
           3. the input character is a new-line character */
        for (i = 0; (i < LINE) && ((ch = getwc(stdin)) != WEOF) &&
                    (ch != L'\n'); ++i)
            buffer[i] = (wchar_t)ch;
        buffer[i] = L'\0';

        printf("The string is: %ls\n", buffer);
        return 0;
    }

    /**************** Output should be similar to: ******************
    Please enter string
    hello world
    The string is: hello world
    *******************************************************************/

═══ 5.6. putwc -- Write a wide-character to a stream ═══

Syntax

    #include <stdio.h>
    #include <wchar.h>

    wint_t putwc(wint_t wc, FILE *stream);

Description

This function is equivalent to the fputwc function. See the fputwc function description.

Parameters

Please refer to fputwc.

Return Values

Please refer to fputwc.

Related Information

   fputwc - Write a wide-character
   getwc - Read a wide-character
   fputws - Write a Wide String
   wchar.h - Header file for wide-character prototypes.

Example

This example writes the contents of a buffer to a data stream.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <wchar.h>

    int main(void)
    {
        FILE *stream = stdout;
        size_t i;
        wchar_t ws[200];
        char *s = "hello world";

        /* Convert strlen(s) + 1 characters so that the terminating
           null is converted as well. */
        mbstowcs(ws, s, strlen(s) + 1);

        for (i = 0; i < wcslen(ws); i++)
            if (putwc(ws[i], stream) == WEOF) {
                printf("Stream I/O Failure");
                break;
            }
        return 0;
    }

    /******************** Expected output: **************************
    hello world
    *******************************************************************/

═══ 5.7. getwchar -- Read a wide-character from stdin ═══

Syntax

    #include <stdio.h>
    #include <wchar.h>

    wint_t getwchar(void);

Description

The getwchar function is equivalent to the getwc function with stdin as the input stream.

Parameters

None.

Return Values

Please refer to fgetwc.

Related Information

   getwc - Get a wide-character.
   fgetwc - Get a wide-character from the input stream.
   wchar.h - Header file for wide-character prototypes.

Example

This example gets wide-characters from the keyboard and echoes them back to the screen until end-of-file is reached.

    #include <stdio.h>
    #include <wchar.h>

    int main(void)
    {
        wint_t ch;

        printf("\nType in some letters.");
        printf("\n");
        while ((ch = getwchar()) != WEOF)
            putwchar(ch);
        return 0;
    }

═══ 5.8. putwchar -- Write a wide-character to stdout ═══

Syntax

    #include <stdio.h>
    #include <wchar.h>

    wint_t putwchar(wint_t wc);

Description

The putwchar function works like the putwc function, except that putwchar writes the specified wide character to the standard output. Output streams, with the exception of stderr, are buffered by default if they refer to files, or line-buffered if they refer to terminals.

Parameters

Please refer to fputwc.

Return Values

Please refer to fputwc.

Related Information

   fputwc - Put a wide-character out to a stream.
   putwc - Put a wide-character out to a stream.
   wchar.h - Header file for wide-character prototypes.

Example

This example gets wide-characters from the keyboard and prints them out using putwchar until end-of-file is reached.

    #include <stdio.h>
    #include <wchar.h>

    int main(void)
    {
        wint_t ch;

        printf("\nType in some letters.");
        printf("\n");
        while ((ch = getwchar()) != WEOF)
            putwchar(ch);
        return 0;
    }

═══ 5.9. ungetwc -- Push wide-character back onto a stream ═══

Syntax

    #include <stdio.h>
    #include <wchar.h>

    wint_t ungetwc(wint_t wc, FILE *stream);

Description

The ungetwc function pushes the character corresponding to the wide-character specified by wc back onto the given input stream. Only one wide-character of pushback is guaranteed if you call ungetwc consecutively without any intervening read or file-positioning operation. The stream must be open for reading. A subsequent read operation on the stream starts with the pushed-back character. You cannot push WEOF back on the stream using ungetwc.

Characters placed on the stream by ungetwc are erased if the fseek, fsetpos, rewind, or fflush function is called before the character is read from the stream.

Parameters

   wc       Wide-character to be put back on the stream.
   stream   Input stream onto which the wide-character is to be pushed back.

Return Values

   wint_t
      WEOF    An error occurred.
      Other   The wide-character that was pushed back.

Related Information

   getwc - Read a wide-character
   putwc - Write a wide-character
   wchar.h - Header file for wide-character prototypes.

Example

In this example, the while statement reads decimal digits from an input data stream and composes their numeric value as it reads them. When a nondigit character appears before the end of the file, ungetwc puts it back on the input stream so that later input functions can process it.

    #include <stdio.h>
    #include <wchar.h>

    int main(void)
    {
        FILE *stream = stdin;
        wint_t ch;
        unsigned int result = 0;

        while ((ch = getwc(stream)) != WEOF && iswdigit(ch))
            result = result * 10 + (unsigned int)(ch - L'0');

        if (ch != WEOF)
            ungetwc(ch, stream);

        printf("The number read is %u\n", result);
        return 0;
    }

═══ 5.10. getw -- Read a word from a stream ═══

Syntax

    #include <stdio.h>
    #include <wchar.h>

    int getw(register FILE *stream);

Description

The getw function reads the next word from the stream. The size of a word is the size of an int and may vary from machine to machine. This function presumes no special alignment in the file.

Due to possible differences in word length and byte ordering, files written using putw are machine dependent and may not be readable using getw on a different processor.

Parameters

   stream   Input stream from which the word is to be read.

Return Values

   int
      EOF     An error occurred or end of file was reached.
      Other   The word read from the input stream.

Related Information

   putw - Write a word
   wchar.h - Header file for wide-character prototypes.

═══ 5.11. putw -- Write a word to a stream ═══

Syntax

    #include <stdio.h>
    #include <wchar.h>

    int putw(int w, register FILE *stream);

Description

The putw function writes the word w to the stream at the file pointer's current position. A word is the size of an int and may vary from machine to machine. This function presumes no special alignment in the file.

Due to possible differences in word length and byte ordering, files written using putw are machine dependent and may not be readable using getw on a different processor.

Parameters

   w        Word to be written.
   stream   Output stream onto which the word is to be written.

Return Values

   int
      0          Word successfully written.
      Non-zero   An error occurred.

Related Information

   getw - Read a word
   wchar.h - Header file for wide-character prototypes.

═══ 6. ICONV Functions ═══

The Code Set Conversion Functions are:

    iconv_t iconv_open(const char *tocode, const char *fromcode);
    size_t iconv(iconv_t cd, const char **inbuf, size_t *inbytesleft, char **outbuf, size_t *outbytesleft);
    int iconv_close(iconv_t *cd);

═══ 6.1. iconv_open -- Access a conversion descriptor ═══

Syntax

    #include <iconv.h>

    iconv_t iconv_open(const char *tocode, const char *fromcode);

Description

The iconv_open function is used to obtain an iconv_t descriptor that describes the conversion from the source code set to the target code set. For state-dependent conversions, iconv_open starts in the initial unshifted state.

Parameters

There are two parameters to iconv_open. Both are character strings which represent codepages in the OS/2 system. Examples include:

   437 - Major US English codepage.
   850 - Codepage for most of Western Europe.
   942 - Japanese double-byte codepage.

   tocode     A character string specifying the target codepage.
   fromcode   A character string specifying the source codepage.

Return Values

   iconv_t
      -1      An error occurred and errno is set to indicate the error.
      Other   Conversion descriptor.

Related Information

   iconv - Converts string of char from one character code set to another.
   iconv_close - Deallocates all resources allocated by iconv_open.
   iconv.h - Header file for iconv prototypes.

Example

See iconv for an example.

═══ 6.2. iconv -- Converts string from one character code set to another ═══

Syntax

    #include <iconv.h>

    size_t iconv(iconv_t CD, const char **InBuf, size_t *InBytesLeft, char **OutBuf, size_t *OutBytesLeft);

Description

The iconv function converts the string specified by the InBuf parameter into a different code set and returns the results in the OutBuf parameter. The required conversion method is identified by the CD parameter, which must be a valid conversion descriptor returned by a previous, successful call to the iconv_open function.

On calling, the InBytesLeft parameter indicates the number of bytes in the InBuf buffer to be converted, and the OutBytesLeft parameter indicates the number of available bytes in the OutBuf buffer. These values are updated upon return so they indicate the new state of their associated buffers.
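The buffer bookkeeping described above can be sketched as follows. This is an illustration under stated assumptions, not the package's own code: convert_buffer is a hypothetical helper, and it assumes the POSIX declaration of iconv, in which the input pointer is char ** rather than the const char ** shown in the Syntax above. Code set names also differ between systems, so the names passed in are the caller's responsibility.

```c
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper (not part of the package): converts inlen bytes
 * at in from code set `from` to code set `to`, storing at most outsize
 * bytes at out. Returns the number of bytes stored, or (size_t)-1 if
 * the descriptor cannot be opened or the conversion stops early. */
size_t convert_buffer(const char *to, const char *from,
                      char *in, size_t inlen,
                      char *out, size_t outsize)
{
    iconv_t cd = iconv_open(to, from);   /* target first, then source */
    char *inptr = in;
    char *outptr = out;
    size_t inleft = inlen;               /* bytes still to convert     */
    size_t outleft = outsize;            /* bytes still free in out    */
    size_t rc;

    if (cd == (iconv_t)-1)
        return (size_t)-1;

    /* iconv advances inptr and outptr and decrements inleft and
     * outleft to reflect how far the conversion got. */
    rc = iconv(cd, &inptr, &inleft, &outptr, &outleft);
    iconv_close(cd);

    if (rc == (size_t)-1)
        return (size_t)-1;
    return outsize - outleft;            /* bytes actually produced */
}
```

The difference between the original and updated OutBytesLeft value gives the size of the converted text, which is why the helper returns outsize - outleft.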
For state-dependent encodings, calling the iconv function with the InBuf buffer set to NULL resets the conversion descriptor in the CD parameter to its initial state. Subsequent calls with an InBuf buffer other than a NULL pointer may cause the internal state of the function to be altered as necessary.

Upon successful conversion of all the characters in the InBuf buffer and after placing the converted characters in the OutBuf buffer, the iconv function returns 0, updates the InBytesLeft and OutBytesLeft parameters, and increments the InBuf and OutBuf pointers.

Parameters

   CD             Specifies the conversion descriptor that points to the correct code set converter.
   InBuf          Points to a buffer that contains the bytes to be converted.
   InBytesLeft    Points to an integer that contains the number of bytes in InBuf.
   OutBuf         Points to a buffer that receives the converted bytes.
   OutBytesLeft   Points to an integer that contains the number of bytes available in OutBuf.

Return Values

If the iconv function is unsuccessful, it updates the variables to reflect the extent of the conversion before it stopped, and sets errno to one of the following values:

   EILSEQ   The conversion has been stopped due to an input byte that does not belong to the input codeset.
   E2BIG    The conversion has been stopped due to lack of space in the output buffer.
   EINVAL   The conversion has been stopped due to an incomplete character or shift sequence at the end of the input buffer.
   EBADF    The argument CD is invalid.

Related Information

   genxlt
   iconv_open - Allocates all resources needed by iconv.
   iconv_close - Deallocates all resources allocated by iconv_open.
   iconv.h - Header file for iconv prototypes.

Example

This example converts all the characters from one code set, IBM-437, to another, IBM-850. The characters are contained in the variable "line".

    #include <stdio.h>
    #include <string.h>
    #include <iconv.h>

    #define MAX_LEN 255

    int main(void)
    {
        char *from = "IBM-437";
        char *to = "IBM-850";
        char line[MAX_LEN] = "THIS IS THE LINE TO CONVERT";
        char buf2[MAX_LEN + 1];
        const char *inptr = line;    /* matches the prototype above */
        char *outptr = buf2;
        size_t inleft = strlen(line);
        size_t outleft = sizeof(buf2) - 1;    /* keep room for the null */
        size_t ret;
        iconv_t cd;

        printf("Converting from %s to %s.\n", from, to);

        cd = iconv_open(to, from);    /* target first, then source */
        if (cd == (iconv_t)-1) {
            printf("ERROR OPENING ICONV TABLES\n");
            return 1;
        }

        ret = iconv(cd, &inptr, &inleft, &outptr, &outleft);
        if (ret == (size_t)-1)
            perror("iconv");
        *outptr = '\0';

        printf("original string: %s\n", line);
        printf("modified string: %s\n", buf2);

        iconv_close(cd);
        return 0;
    }

═══ 6.3. iconv_close -- Deallocates all resources allocated by iconv_open ═══

Syntax

    #include <iconv.h>

    int iconv_close(iconv_t *cd);

Description

The iconv_close function deallocates all resources that have been allocated by the iconv_open function, including the conversion descriptor specified by cd.

Parameters

   cd   Conversion descriptor returned by iconv_open.

Return Values

   int
      -1   An error occurred and errno has been set:
           EBADF   The conversion descriptor is invalid.
      0    Successful.

Related Information

   iconv - Converts string of char from one character code set to another.
   iconv_open - Allocates all resources needed by iconv.
   iconv.h - Header file for iconv prototypes.

Example

See iconv for an example.

═══ 7. IS* Functions ═══

Character attribute testing (uses locale-based methods):

    int isalnum(int c);
    int isalpha(int c);
    int iscntrl(int c);
    int isdigit(int c);
    int isgraph(int c);
    int islower(int c);
    int isprint(int c);
    int ispunct(int c);
    int isspace(int c);
    int isupper(int c);
    int isxdigit(int c);
    int iswalnum(wint_t wc);
    int iswalpha(wint_t wc);
    int iswcntrl(wint_t wc);
    int iswdigit(wint_t wc);
    int iswgraph(wint_t wc);
    int iswlower(wint_t wc);
    int iswprint(wint_t wc);
    int iswpunct(wint_t wc);
    int iswspace(wint_t wc);
    int iswupper(wint_t wc);
    int iswxdigit(wint_t wc);
    int toupper(int c);
    int tolower(int c);
    wint_t towupper(wint_t wc);
    wint_t towlower(wint_t wc);
    int iswctype(wint_t wc, wctype_t mask);

═══ 7.1. is* -- Character identification functions ═══

Syntax

    #include <ctype.h>

    int isalnum(int c);
    int isalpha(int c);
    int iscntrl(int c);
    int isdigit(int c);
    int isgraph(int c);
    int islower(int c);
    int isprint(int c);
    int ispunct(int c);
    int isspace(int c);
    int isupper(int c);
    int isxdigit(int c);

Description

The functions listed below all test a given integer value.

   Function   Tests for
   isalnum    uppercase or lowercase letter, or decimal digit
   isalpha    alphabetic character
   iscntrl    any control character
   isdigit    decimal digit
   isgraph    printable character excluding space
   islower    lowercase letter
   isprint    printable character including space
   ispunct    any nonalphanumeric printable character, excluding space
   isspace    whitespace character
   isupper    uppercase letter
   isxdigit   hexadecimal digit

Parameters

   c   Integer to be tested. (EOF is a valid input value.)

Return Values

   int
      Nonzero   Successful. The integer satisfies the test condition.
      0         Unsuccessful. The test failed.
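As a short illustration of how the tests above combine, the following sketch tallies the characters of a string by class. It is not taken from the package: the char_counts structure and the classify_text function are hypothetical names, and the results depend on the LC_CTYPE category in effect (the "C" locale unless setlocale has been called).

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical tally of character classes (not part of the package). */
struct char_counts {
    size_t letters;
    size_t digits;
    size_t spaces;
    size_t other;
};

/* Classifies every character of s with the is* functions, using the
 * current locale's LC_CTYPE information. */
struct char_counts classify_text(const char *s)
{
    struct char_counts n = {0, 0, 0, 0};

    for (; *s != '\0'; s++) {
        /* The is* functions expect EOF or a value representable as
         * unsigned char, so convert before testing. */
        unsigned char c = (unsigned char)*s;

        if (isalpha(c))
            n.letters++;
        else if (isdigit(c))
            n.digits++;
        else if (isspace(c))
            n.spaces++;
        else
            n.other++;
    }
    return n;
}
```

Because each test returns nonzero or 0 rather than a fixed value, the results are used directly as conditions instead of being compared with 1.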
Related Information  tolower - Convert Character Case  toupper - Convert Character Case  ctype.h Example This example analyzes all characters between code 0x0 and code UPPER_LIMIT, printing A for alphabetic characters, AN for alphanumerics, U for uppercase, L for lowercase, D for digits, X for hexadecimal digits, S for spaces, PU for punctuation, PR for printable characters, G for graphics characters, and C for control characters. This example prints the code if printable. The output of this example is a 256-line table showing the characters from 0 to 255 that possess the attributes tested. #include #include #define UPPER_LIMIT 0xFF int main(void) { int ch; for ( ch = 0; ch <= UPPER_LIMIT; ++ch ) { printf("\n%3d ", ch); printf("%#04x ", ch); printf("%3s ", isalnum(ch) ? "AN" : " "); printf("%2s ", isalpha(ch) ? "A" : " "); printf("%2s", iscntrl(ch) ? "C" : " "); printf("%2s", isdigit(ch) ? "D" : " "); printf("%2s", isgraph(ch) ? "G" : " "); printf("%2s", islower(ch) ? "L" : " "); printf(" %c", isprint(ch) ? ch : ' '); printf("%3s", ispunct(ch) ? "PU" : " "); printf("%2s", isspace(ch) ? "S" : " "); printf("%3s", isprint(ch) ? "PR" : " "); printf("%2s", isupper(ch) ? "U" : " "); printf("%2s", isxdigit(ch) ? "X" : " "); } } ═══ 7.2. isw* -- wide-character identification functions ═══ Syntax #include int iswalnum(wint_t wc); int iswalpha(wint_t wc); int iswcntrl(wint_t wc); int iswdigit(wint_t wc); int iswgraph(wint_t wc); int iswlower(wint_t wc); int iswprint(wint_t wc); int iswpunct(wint_t wc); int iswspace(wint_t wc); int iswupper(wint_t wc); int iswxdigit(wint_t wc); Description The functions listed below all test a given wide-character value. 
Function Tests for iswalnum upper or lowercase letters, or decimal digit iswalpha alphabetic character iswcntrl any control character iswdigit decimal digit iswgraph printable character excluding space iswlower lowercase iswprint printable character including space iswpunct any nonalphanumeric printable character, excluding space iswspace whitespace character iswupper uppercase iswxdigit hexadecimal digit Parameters wc wide-character to be tested. (WEOF is a valid input value.) Return Values int Nonzero Successful. The wide-character satifies the test condition. 0 Unsuccessful. The test failed. Related Information  is* functions  wchar.h Example This example analyzes all characters between code 0x0 and code UPPER_LIMIT, printing A for alphabetic characters, AN for alphanumerics, U for uppercase, L for lowercase, D for digits, X for hexadecimal digits, S for spaces, PU for punctuation, PR for printable characters, G for graphics characters, and C for control characters. This example prints the code if printable. The output of this example is a 256-line table showing the characters from 0 to 255 that possess the attributes tested. #include #include #define UPPER_LIMIT 0xFF int main(void) { wint_t ch; for ( ch = 0; ch <= UPPER_LIMIT; ++ch ) { printf("\n%3d ", ch); printf("%#04x ", ch); printf("%3s ", iswalnum(ch) ? "AN" : " "); printf("%2s ", iswalpha(ch) ? "A" : " "); printf("%2s", iswcntrl(ch) ? "C" : " "); printf("%2s", iswdigit(ch) ? "D" : " "); printf("%2s", iswgraph(ch) ? "G" : " "); printf("%2s", iswlower(ch) ? "L" : " "); printf(" %c", iswprint(ch) ? ch : ' '); printf("%3s", iswpunct(ch) ? "PU" : " "); printf("%2s", iswspace(ch) ? "S" : " "); printf("%3s", iswprint(ch) ? "PR" : " "); printf("%2s", iswupper(ch) ? "U" : " "); printf("%2s", iswxdigit(ch) ? "X" : " "); } } ═══ 7.3. 
iswctype -- Class test for characters ═══

Syntax

  #include <wchar.h>

  int iswctype(wint_t wc, wctype_t mask);

Description

The iswctype function determines whether a wide-character belongs to the character class identified by mask.

Parameters
  wc     Wide-character to be tested.
  mask   Character class, as returned by the wctype function for a class name string. The character classes are defined in each locale by the following key strings:
           "alnum"  - for the alphanumeric class.
           "alpha"  - for the alphabetic only class.
           "cntrl"  - for the control class.
           "digit"  - for the digit class.
           "graph"  - for the graphics character class.
           "lower"  - for the lower case character class.
           "print"  - for the printable character class.
           "punct"  - for the punctuation class.
           "space"  - for the space character class.
           "upper"  - for the upper case character class.
           "xdigit" - for the hexadecimal digit class.
           "blank"  - for the blank character class.

Return Values
  int
  Nonzero   Successful. The wide-character satisfies the test condition.
  0         Unsuccessful. The test failed.

Related Information
  wctype
  wchar.h
  isw* functions

Example

This example analyzes all characters between code 0x0 and code UPPER_LIMIT, printing A for alphabetic characters, AN for alphanumerics, U for uppercase, L for lowercase, B for blanks, D for digits, X for hexadecimal digits, S for spaces, PU for punctuation, PR for printable characters, G for graphics characters, and C for control characters. The output of this example is a 256-line table showing the characters from 0 to 255 that possess the attributes tested.

  #include <stdio.h>
  #include <wchar.h>

  #define UPPER_LIMIT 0xFF

  int main(void)
  {
     wint_t ch;

     for ( ch = 0; ch <= UPPER_LIMIT; ++ch ) {
        /* wint_t may be unsigned, so cast before passing to printf */
        printf("\n%3d ", (int)ch);
        printf("%#04x ", (unsigned int)ch);
        printf("%3s", iswctype(ch,wctype("alnum")) ? "AN" : " ");
        printf("%2s", iswctype(ch,wctype("alpha")) ? "A" : " ");
        printf("%2s", iswctype(ch,wctype("cntrl")) ? "C" : " ");
        printf("%2s", iswctype(ch,wctype("digit")) ? "D" : " ");
        printf("%2s", iswctype(ch,wctype("graph")) ?
               "G" : " ");
        printf("%2s", iswctype(ch,wctype("lower")) ? "L" : " ");
        /* "B" is a string, so it is printed with %s rather than %c */
        printf("%2s", iswctype(ch,wctype("blank")) ? "B" : " ");
        printf("%3s", iswctype(ch,wctype("punct")) ? "PU" : " ");
        printf("%2s", iswctype(ch,wctype("space")) ? "S" : " ");
        printf("%3s", iswctype(ch,wctype("print")) ? "PR" : " ");
        printf("%2s", iswctype(ch,wctype("upper")) ? "U" : " ");
        printf("%2s", iswctype(ch,wctype("xdigit")) ? "X" : " ");
     }
     return 0;
  }

═══ 7.4. tolower / toupper -- Convert character case ═══

Syntax

  #include <ctype.h>

  int tolower(int c);   /* Convert c to lowercase if appropriate */
  int toupper(int c);   /* Convert c to uppercase if appropriate */

Description

The tolower function returns a lowercase letter if the value in c represents an uppercase letter and there exists a corresponding lowercase letter (as defined by character type information in the program locale category LC_CTYPE). Otherwise, c is returned. The input parameter is unchanged.

The toupper function returns an uppercase letter if the value in c represents a lowercase letter and there exists a corresponding uppercase letter (as defined by character type information in the program locale category LC_CTYPE). Otherwise, c is returned. The input parameter is unchanged.

Parameters
  c   Character to be converted.

Return Values
  int   The converted letter if a conversion exists. Otherwise, the value of c is returned unconverted.

Related Information
  isalnum to isxdigit - Test Integer Value
  isascii - Test Integer Values
  toascii - Convert Character
  tolower - Convert Character
  toupper - Convert Character
  wchar.h

Example

This example uses the toupper and tolower functions to convert the characters between code 0 and code 0x7f.

  #include <stdio.h>
  #include <ctype.h>

  int main(void)
  {
     int ch;

     for (ch = 0; ch <= 0x7f; ch++) {
        printf("\ntoupper=%#04x - ", toupper(ch));
        printf("tolower=%#04x", tolower(ch));
     }
     return 0;
  }

═══ 7.5.
towlower / towupper -- Convert wide-character case ═══

Syntax

  #include <wchar.h>

  wint_t towupper(wint_t wc);
  wint_t towlower(wint_t wc);

Description

The towlower function returns a lowercase wide-character if the value in wc represents an uppercase wide-character and there exists a corresponding lowercase wide-character (as defined by character type information in the program locale category LC_CTYPE). Otherwise, wc is returned. The input parameter is unchanged.

The towupper function returns an uppercase wide-character if the value in wc represents a lowercase wide-character and there exists a corresponding uppercase wide-character (as defined by character type information in the program locale category LC_CTYPE). Otherwise, wc is returned. The input parameter is unchanged.

Parameters
  wc   Wide-character to be converted.

Return Values
  wint_t   The converted wide-character if a conversion exists. Otherwise, the value of wc is returned unconverted.

Related Information
  iswalnum to iswxdigit - Test Integer Value
  iswascii - Test Integer Values
  towascii - Convert Character
  tolower - Convert Character
  toupper - Convert Character
  wchar.h

═══ 8. Locale/MB* Functions ═══

The basic conversion methods for multibyte and wchar_t:

  int mblen(const char *s, size_t n);
  size_t mbstowcs(wchar_t *ws, const char *s, size_t n);
  int mbtowc(wchar_t *wc, const char *s, size_t n);
  double wcstod(const wchar_t *nptr, wchar_t **endptr);
  long wcstol(const wchar_t *nptr, wchar_t **endptr, int base);
  size_t wcstombs(char *s, const wchar_t *ws, size_t n);
  unsigned long wcstoul(const wchar_t *nptr, wchar_t **endptr, int base);
  int wctomb(char *s, wchar_t wchar);

═══ 8.1. mblen -- Length in bytes of multi-byte character ═══

Syntax

  #include <stdlib.h>
  #include <wchar.h>

  int mblen(const char *string, size_t n);

Description

The mblen function determines the length in bytes of the multi-byte character pointed to by string. A maximum of n bytes is examined.

Parameters
  string   Points to a multi-byte character.
  n        Maximum number of bytes to examine.
Return Values

If string is NULL, the mblen function returns:
  Nonzero when encodings have state dependency
  Zero otherwise.

If string is not NULL, the mblen function returns:
  Zero if string points to the null character
  The number of bytes comprising the multi-byte character
  -1 if string does not point to a valid multi-byte character.

Related Information
  mbtowc - Convert multi-byte character to wchar_t
  mbstowcs - Convert multi-byte characters to wchar_t characters
  strlen - Determine string length
  wcslen - Calculate length of wchar_t string
  wctomb - Convert wchar_t character to multi-byte character
  wchar.h

Example

This example uses mblen to obtain the length in bytes of the first multi-byte character of a string.

  #include <stdio.h>
  #include <stdlib.h>

  int main(void)
  {
     int length;
     char *string = "String of multi byte characters";

     length = mblen(string, MB_CUR_MAX);
     printf("mblen rc %i multi-byte character: %c\n", length, *string);
     return 0;
  }

═══ 8.2. mbstowcs -- Converts character string to a wide-character string ═══

Syntax

  #include <stdlib.h>
  #include <wchar.h>

  size_t mbstowcs(wchar_t *pwc, const char *string, size_t n);

Description

The mbstowcs function converts a string of multi-byte characters into a string of wide-characters. No more than n characters will be converted. No characters that follow a null byte will be examined or converted.

Parameters
  pwc      Pointer to target string of wide-characters.
  string   Source string of multi-byte characters to be converted.
  n        Maximum number of multi-byte characters that will be converted.

Return Values

If pwc is NULL, the mbstowcs function returns:
  The number of elements required for the wide-character string.

If pwc is not NULL, the mbstowcs function returns:
  The number of wide-characters generated, not including any null terminator.
  -1 if it encounters an invalid multi-byte character.

The returned wide-character string will be null terminated unless the returned value is equal to n.
Related Information
  mblen - Multi-byte string length
  mbtowc - Convert multi-byte character to wchar_t
  wcslen - Calculate length of wchar_t string
  wcstombs - Convert wchar_t string to multi-byte character string
  wchar.h

Example

This example uses mbstowcs to convert a multi-byte character string into a wide-character string.

  #include <stdio.h>
  #include <stdlib.h>
  #include <wchar.h>

  int main(void)
  {
     size_t rc;
     char *string = "String of multi byte characters";
     wchar_t arr[20];

     /* Convert at most 19 characters, leaving room for a terminator */
     rc = mbstowcs(arr, string, 19);
     if (rc == (size_t)-1)
        return 1;
     /* The result is not null terminated when rc equals n */
     arr[19] = L'\0';
     printf("\n\nmbstowcs rc %i wide-character string: %S\n", (int)rc, &arr[0]);
     return 0;
  }

═══ 8.3. mbtowc -- Converts character to wide-character ═══

Syntax

  #include <stdlib.h>
  #include <wchar.h>

  int mbtowc(wchar_t *pwc, const char *string, size_t n);

Description

The mbtowc function first determines the length of the multi-byte character pointed to by string. It then stores the wide-character code for the multi-byte character in the wchar_t object pointed to by pwc. A maximum of n bytes is examined.

Parameters
  pwc      Pointer to target wide-character.
  string   Pointer to multi-byte character to be converted.
  n        Maximum number of bytes that will be examined.

Return Values

If string is NULL, the mbtowc function returns:
  Nonzero when encodings have state dependency
  0 otherwise.

If string is not NULL, the mbtowc function returns:
  0 if string points to the null character
  The number of bytes comprising the converted multi-byte character
  -1 if string does not point to a valid multi-byte character or if a valid multi-byte character is not found in n or fewer bytes.

Related Information
  mblen - Multi-byte string length
  mbstowcs - Convert multi-byte characters to wchar_t characters
  wcslen - Calculate length of wchar_t string
  wctomb - Convert wchar_t character to multi-byte character
  wchar.h

Example

This example uses mbtowc to convert a multi-byte character into a single wide-character.
  #include <stdio.h>
  #include <stdlib.h>
  #include <wchar.h>

  int main(void)
  {
     int rc;
     char *string = "ABC";
     wchar_t arr;

     rc = mbtowc(&arr, string, MB_CUR_MAX);
     printf("\nmbtowc rc : %i wide-character : %C\n", rc, arr);
     return 0;
  }

═══ 8.4. wcstod -- Converts wide char strings to double precision number ═══

Syntax

  #include <wchar.h>

  double wcstod(const wchar_t *nptr, wchar_t **endptr);

Description

This function converts the initial portion of the wide-character string pointed to by the nptr parameter to a double-precision floating-point representation. The input wide-character string is first broken down into three parts:

  1. An initial, possibly empty, sequence of white-space wide-character codes (as specified by the iswspace function).
  2. A subject sequence interpreted as a floating-point constant.
  3. A final wide-character string of one or more unrecognized wide-character codes, including the terminating wide-character null of the input wide-character string.

If possible, the subject is then converted to a floating-point number and the result is returned.

The wide-character string is parsed to skip the initial space characters (as determined by the iswspace function). Any non-space character signifies the start of a subject string that may form a floating-point constant, except that the radix character is used in place of a period. If neither an exponent part nor a radix character appears, a radix character is assumed to follow the last digit in the wide-character string. The subject string is defined to be the longest initial substring that is of the expected form. Any character not satisfying this form begins the final portion of the wide-character string pointed to by the endptr parameter on return from the call to this function. The radix character is defined in the program's locale.

The subject string contains no wide-character codes if the input wide-character string is empty or consists entirely of white-space wide-character codes, or if the first non-white-space wide-character code is not a sign, a digit, or a radix character.
No conversion is performed if the subject string is empty or does not have the expected form.

Parameters
  nptr     Contains a pointer to the wide-character string to be converted. The expected form of the subject string is an optional sign, followed by a non-empty sequence of digits optionally containing a radix character, then an optional exponent part. The exponent part consists of e or E, followed by an optional sign, followed by one or more decimal digits.
  endptr   If not NULL, contains a pointer to the position in the nptr string where a wide-character is found that is not a valid character for the purpose of this conversion. If no conversion takes place and this pointer is not NULL, contains a pointer to the value of nptr.

Return Values
  double     The converted value of double if expected form is found.
  0          No conversion could be performed and global variable errno is set to EINVAL, or the correct value would cause an underflow and global variable errno is set to ERANGE.
  HUGE_VAL   Converted value is outside the range of representable values. Global variable errno is set to ERANGE.

Since 0 is returned in the event of an error and is also valid if the function is successful, applications should set the errno global variable to 0 before calling the function, and then check the errno global variable after return. Then, if the errno global variable has changed, an error occurred.

Related Information
  iswspace - Checks for a valid wide-character space.
  wcstol - Converts wide-character strings to long integer.
  wcstoul - Converts wide-character strings to unsigned long integer.
  wchar.h - Header file for wide-character function prototypes.

Example

This example converts a wide-character string to a double.
  #include <stdio.h>
  #include <stdlib.h>
  #include <errno.h>
  #include <wchar.h>

  #define SIZE 40

  int main(void)
  {
     wchar_t WCString[SIZE], *endptr;
     double retval;

     /*
     ** Let WCString hold a wide-character null terminated
     ** string containing a double
     */
     mbstowcs(WCString, "10e2", SIZE);

     /* Set errno to 0 so a failure for wcstod can be detected */
     errno = 0;
     retval = wcstod(WCString, &endptr);

     /* Check errno, if it is non-zero, wcstod failed */
     if (errno != 0) {
        /* Error handling */
        printf("An error occurred during conversion\n");
     }
     else if (endptr == WCString) {
        /* No conversion could be performed: endptr still points to nptr */
        printf("No conversion could be performed\n");
     }
     else {
        /* retval contains double */
        printf("This should be a double %f\n", retval);
     }
     return 0;
  }

═══ 8.5. wcstol -- Converts a wide char string to long int representation ═══

Syntax

  #include <wchar.h>

  long int wcstol(const wchar_t *nptr, wchar_t **endptr, int base);

Description

This function converts the initial portion of the wide-character string pointed to by the nptr parameter to a signed long integer representation. The input wide-character string is first broken down into three parts:

  1. An initial, possibly empty, sequence of white-space wide-character codes (as specified by the iswspace function).
  2. A subject sequence interpreted as an integer and represented in a radix determined by the base parameter.
  3. A final wide-character string of one or more unrecognized wide-character codes, including the terminating wide-character null of the input wide-character string.

If possible, the subject is then converted to an integer and the result is returned.

The wide-character string is parsed to skip the initial space characters (as determined by the iswspace function). Any non-space character signifies the start of a subject string that may form an integer in the radix specified by the base parameter. The subject string is defined to be the longest initial substring that is a long integer of the expected form.
Any character not satisfying this form begins the final portion of the wide character string pointed to by the endptr parameter on return from the call to the wcstol function. The subject string contains no wide-character codes if the input wide character string is empty or consists entirely of white-space wide-character codes or if the first non-white-space wide-character code is not a sign or permissible letter or digit. No conversion is performed if the subject string is empty or does not have the expected form. Parameters nptr Contains a pointer to the wide-character string to be converted to a long integer. endptr If not NULL, contains a pointer to the position in the nptr string where a wide-character is found that is not a valid character for the purpose of this conversion. If no conversion takes place and this pointer is not NULL, contains a pointer to the value of nptr. base Specifies the radix in which the characters are interpreted. The value also determines the expected form of the subject string. Value Expected Form 0 The subject string is that of a decimal, octal, or hexadecimal constant, optionally preceded by a plus or minus sign. A decimal constant begins with a non-zero digit and consists of a sequence of decimal digits. An octal constant begins with a 0 optionally followed by a sequence of digits 0 through 7. A hexadecimal constant starts with 0x or 0X followed by a sequence of the decimal digits and letters (a(A) to f(F)). 2 through 36 The subject string is a sequence of letters and digits representing an integer in the radix specified by the base parameter, optionally preceded by a plus or minus sign, but not including an integer suffix. The letters a(A) through z(Z) are ascribed the values 10 to 35. Only letters whose values are less than that of the specified base are permitted. If the value of base is 16, the characters 0x or 0X may optionally precede the sequence of letters or digits, following the sign if present. 
Return Values
  long integer   The converted value of long integer if expected form is found.
  0              No conversion could be performed. Global variable errno is set to EINVAL. The value of base may not be supported.
  LONG_MAX       Converted value is outside the range of representable values. Global variable errno is set to ERANGE.
  LONG_MIN       Converted value is outside the range of representable values. Global variable errno is set to ERANGE.

Since 0, LONG_MIN, and LONG_MAX are returned in the event of an error and are also valid returns if the wcstol function is successful, applications should set the errno global variable to 0 before calling the wcstol function, and then check the errno global variable after return. Then, if the errno global variable has changed, an error occurred.

Related Information
  iswspace - Checks for a valid wide-character space.
  wcstod - Converts a wide-character string to a double-precision number.
  wcstoul - Converts wide-character strings to unsigned long integer.
  wchar.h - Header file for wide-character function prototypes.

Example

The following converts a wide-character string to a signed long integer.

  #include <stdio.h>
  #include <stdlib.h>
  #include <errno.h>
  #include <wchar.h>

  #define SIZE 40

  int main(void)
  {
     wchar_t WCString[SIZE], *endptr;
     long int retval;

     /*
     ** Let WCString hold a wide-character null terminated
     ** string containing a signed long integer value
     */
     mbstowcs(WCString, "-100000", SIZE);

     /* Set errno to 0 so a failure for wcstol can be detected */
     errno = 0;
     retval = wcstol(WCString, &endptr, 10);

     /* Check errno, if it is non-zero, wcstol failed */
     if (errno != 0) {
        /* Error handling */
        printf("An error occurred during conversion\n");
     }
     else if (endptr == WCString) {
        /* No conversion could be performed: endptr still points to nptr */
        printf("No conversion could be performed\n");
     }
     else {
        /* retval contains long integer */
        printf("This should be a long integer %ld\n", retval);
     }
     return 0;
  }

═══ 8.6.
wcstombs -- Converts a wide-character string to a character string ═══

Syntax

  #include <wchar.h>

  size_t wcstombs(char *dest, const wchar_t *ws, size_t count);

Description

The wcstombs function converts the wide-character string pointed to by ws into a sequence of characters and stores them in the array pointed to by dest. The converted string begins in the initial shift state. The conversion stops after count bytes in dest are filled up or a null byte is stored. The target string will be null terminated unless the number of bytes converted equals count.

Parameters
  dest    Pointer to target character string.
  ws      Pointer to wide-character string to be converted.
  count   Maximum number of bytes that will be converted.

Return Values

If dest is NULL, the wcstombs function returns:
  The number of bytes required for the character array.

If dest is not NULL, the wcstombs function returns:
  -1 if a wide-character is encountered that cannot be converted.
  The number of bytes stored in the array pointed to by dest, excluding any null terminator, if the wide-character string is successfully converted.

Related Information
  mbstowcs - Convert multi-byte Characters to wchar_t Characters
  wcslen - Calculate Length of wchar_t String
  wctomb - Convert wchar_t Character to multi-byte Character
  wchar.h

Example

In this example, a wchar_t string is converted to a char string twice. The first call converts the entire string, while the second call only converts three characters. The results are printed each time.
  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>
  #include <wchar.h>

  int main(void)
  {
     char dest[20];
     wchar_t dptr[20];
     size_t count = 20;
     size_t length;

     mbstowcs(dptr, "string", 7);

     length = wcstombs(dest, dptr, count);
     printf("%d bytes were converted.\n", (int)length);
     printf("The converted string is \"%s\"\n\n", dest);

     /* Initialize the buffer so the partial result is null terminated */
     memset(dest, '\0', sizeof(dest));

     /* Now convert only 3 bytes */
     length = wcstombs(dest, dptr, 3);
     printf("%d bytes were converted.\n", (int)length);
     printf("The converted string is \"%s\"\n", dest);
     return 0;
  }

  /**************** Output should be similar to: ******************

  6 bytes were converted.
  The converted string is "string"

  3 bytes were converted.
  The converted string is "str"

  *******************************************************************/

═══ 8.7. wcstoul -- Converts wide char strings to unsigned long int ═══

Syntax

  #include <wchar.h>

  unsigned long wcstoul(const wchar_t *nptr, wchar_t **endptr, int base);

Description

This function converts the initial portion of the wide-character string pointed to by the nptr parameter to an unsigned long integer representation. The input wide-character string is first broken down into three parts:

  1. An initial, possibly empty, sequence of white-space wide-character codes (as specified by the iswspace function).
  2. A subject sequence interpreted as an integer and represented in a radix determined by the base parameter.
  3. A final wide-character string of one or more unrecognized wide-character codes, including the terminating wide-character null of the input wide-character string.

If possible, the subject is then converted to an unsigned long integer and the result is returned.

The wide-character string is parsed to skip the initial space characters (as determined by the iswspace function). Any non-space character signifies the start of a subject string that may form an unsigned integer in the radix specified by the base parameter.
The subject string is defined to be the longest initial substring that is an unsigned long integer of the expected form. Any character not satisfying this form begins the final portion of the wide-character string pointed to by the endptr parameter on return from the call to the wcstoul function.

The subject string contains no wide-character codes if the input wide-character string is empty or consists entirely of white-space wide-character codes, or if the first non-white-space wide-character code is not a sign or permissible letter or digit.

No conversion is performed if the subject string is empty or does not have the expected form.

Parameters
  nptr     Contains a pointer to the wide-character string to be converted to an unsigned long integer.
  endptr   If not NULL, contains a pointer to the position in the nptr string where a wide-character is found that is not a valid character for the purpose of this conversion. If no conversion takes place and this pointer is not NULL, contains a pointer to the value of nptr.
  base     Specifies the radix in which the characters are interpreted. The value also determines the expected form of the subject string.

  Value          Expected Form
  0              The subject string is that of a decimal, octal, or hexadecimal constant, optionally preceded by a plus or minus sign. A decimal constant begins with a non-zero digit and consists of a sequence of decimal digits. An octal constant begins with a 0 optionally followed by a sequence of digits 0 through 7. A hexadecimal constant starts with 0x or 0X followed by a sequence of the decimal digits and letters (a(A) to f(F)).
  2 through 36   The subject string is a sequence of letters and digits representing an integer in the radix specified by the base parameter, optionally preceded by a plus or minus sign, but not including an integer suffix. The letters a(A) through z(Z) are ascribed the values 10 to 35. Only letters whose values are less than that of the specified base are permitted.
                 If the value of base is 16, the characters 0x or 0X may optionally precede the sequence of letters or digits, following the sign if present.

Return Values
  unsigned long integer   The converted value of unsigned long integer if expected form is found.
  0                       No conversion could be performed. Global variable errno is set to EINVAL. The value of base may not be supported.
  ULONG_MAX               Converted value is outside the range of representable values. Global variable errno is set to ERANGE.

Since 0 and ULONG_MAX are returned in the event of an error and are also valid returns if this function is successful, applications should set the errno global variable to 0 before calling this function, and then check the errno global variable after return. Then, if the errno global variable has changed, an error occurred.

Related Information
  iswspace - Checks for a valid wide-character space.
  wcstod - Converts a wide-character string to a double-precision number.
  wcstol - Converts wide-character strings to long integer.
  wchar.h - Header file for wide-character function prototypes.

Example

This example converts a wide-character string to an unsigned long integer.

  #include <stdio.h>
  #include <stdlib.h>
  #include <errno.h>
  #include <wchar.h>

  #define SIZE 40

  int main(void)
  {
     wchar_t WCString[SIZE], *endptr;
     unsigned long int retval;

     /*
     ** Let WCString hold a wide-character null terminated
     ** string containing an unsigned long integer value
     */
     mbstowcs(WCString, "100000", SIZE);

     /* Set errno to 0 so a failure for wcstoul can be detected */
     errno = 0;
     retval = wcstoul(WCString, &endptr, 10);

     /* Check errno, if it is non-zero, wcstoul failed */
     if (errno != 0) {
        /* Error handling */
        printf("An error occurred during conversion\n");
     }
     else if (endptr == WCString) {
        /* No conversion could be performed: endptr still points to nptr */
        /* Handle this case accordingly.
        */
        printf("No conversion could be performed\n");
     }

     /* retval contains unsigned long integer */
     printf("This should be an unsigned long integer %lu\n", retval);
     return 0;
  }

═══ 8.8. wctomb -- Convert wide-character to multibyte character ═══

Syntax

  #include <wchar.h>

  int wctomb(char *string, wchar_t character);

Description

The wctomb function converts the wide-character in character to a multibyte character and stores it in string. If the value of character is 0, the function is left in the initial shift state. At most, wctomb stores MB_CUR_MAX bytes in string.

Parameters
  character   Wide-character to be converted.
  string      Pointer to target buffer.

Return Values

If string is NULL, the wctomb function returns:
  Nonzero when encodings have state dependency
  0 otherwise.

If string is not NULL, the wctomb function returns:
  The number of bytes comprising the converted multi-byte character
  -1 if the value of character does not correspond to a valid multi-byte character.

Related Information
  mbtowc - Convert Multibyte Character to wchar_t.
  wcslen - Calculate Length of wchar_t String.
  wcstombs - Convert wchar_t String to Multibyte Character String.
  wchar.h - Header file for wide-character prototypes.

Example

This example converts the wide-character c to a character.

  #include <stdio.h>
  #include <stdlib.h>
  #include <wchar.h>

  #define SIZE 40

  int main(void)
  {
     /* static storage is zero filled, so buffer stays null terminated */
     static char buffer[ SIZE ];
     wchar_t wch = 'c';
     int length;

     length = wctomb( buffer, wch );
     printf( "\nThis multibyte character is (%d) byte(s) long\n", length);
     printf( "And the converted string is \"%s\"\n", buffer );
     return 0;
  }

  /**************** Output should be similar to: ******************

  This multibyte character is (1) byte(s) long
  And the converted string is "c"

  *******************************************************************/

═══ 9. Formatted I/O Functions ═══

Formatted I/O Functions:
  fscanf/scanf/sscanf with %ws, %wc decoding, and parameter reordering via the (%n$x) format
  fprintf/printf/sprintf with %ws, %wc encoding, and parameter reordering via the (%n$x) format

═══ 9.1.
fprintf -- Formats output to an output stream ═══

Syntax

  #include <stdio.h>

  int fprintf(FILE *stream, const char *format-string, argument-list);

Description

The fprintf function formats and writes a series of characters and values to the output stream. It converts each entry in argument-list, if any, and writes to the stream according to the corresponding format specification in the format-string. Each entry in argument-list must have a type that corresponds to a type specifier in format-string.

The format-string has the same form and function as the format-string argument for the printf function.

In extended mode, fprintf also converts floating-point values of NaN and infinity to the strings "NAN" or "nan" and "INFINITY" or "infinity". The case and sign of the string is determined by the format specifiers. See Infinity and NaN Support for more information on infinity and NaN values.

Parameters
  stream          Output stream into which data is written.
  format-string   Format specification. See printf.
  argument-list   List of arguments to be substituted into the formatted data.

Return Values

The fprintf function returns the number of characters written or a negative value if an output error occurs.

Related Information
  fscanf - Read Formatted Data.
  printf - Formatted Print.
  sprintf - Formatted Print to Buffer.
  stdio.h - Include file for standard I/O functions.

Example

This example writes a series of data types to the file "myfile.dat" using the fprintf function.

  #include <stdio.h>
  #include <stdlib.h>
  #include <wchar.h>

  #define MAXLEN 80

  int main(void)
  {
     FILE *stream;
     long l;
     float fp;
     wchar_t s[MAXLEN];
     wchar_t c[MAXLEN];

     fp = 1.34e-2;
     l = 1000000;
     mbstowcs(s, "ABCDEFGHIJKLMNOPQRSTUVWXYZ", MAXLEN);
     mbstowcs(c, "A", MAXLEN);

     stream = fopen("myfile.dat", "w");
     if (stream == NULL)
        return 1;

     fprintf(stream, "%S", s);      /* wide-character string */
     fprintf(stream, "%ld", l);
     fprintf(stream, "%C", c[0]);   /* single wide-character */
     fprintf(stream, "%f", fp);
     printf("\n");
     fclose(stream);
     return 0;
  }

═══ 9.2.
fscanf -- Reads data from a stream and returns data to argument list ═══

Syntax

  #include <wchar.h>   /* when using C Set/2 the header file is <stdio.h> */

  int fscanf(FILE *stream, const char *format-string, argument-list);

Description

The fscanf function reads data from the current position of the specified stream into the locations given by the entries in argument-list, if any. Each entry in argument-list must be a pointer to a variable with a type that corresponds to a type specifier in format-string.

The format-string controls the interpretation of the input fields and has the same form and function as the format-string argument for the scanf function. See scanf for a description of format-string.

In extended mode, the fscanf function also reads in the strings "INFINITY", "INF", and "NAN" (in upper or lowercase) and converts them to the corresponding floating-point value. The sign of the value is determined by the format specification. See Infinity and NaN Support for more information on infinity and NaN values.

New Formatting Codes

Using the n$ feature, the order in which arguments are referenced can be changed. For example, take the following statement:

  fscanf(stream, "%d %d", &value1, &value2);

By using the n$ format, value2 can be referenced before value1:

  fscanf(stream, "%2$d %1$d", &value1, &value2);

The %S is used to read wide-character strings for wide-character support.

The %C is used to read individual wide-characters for wide-character support.

Parameters
  stream          Input stream from which data is read.
  format-string   Format specification. See scanf.
  argument-list   List of pointers to the variables that receive the converted data.

Return Values

The fscanf function returns the number of fields that it successfully converted and assigned. The return value does not include fields that fscanf read but did not assign. If an input failure occurs before any conversion, the return value is EOF.
Related Information  fprintf - Write Formatted Data  scanf - Read Data  stdio.h - Normal location for fscanf function prototype, however for this implementation uses wchar.h.  wchar.h - Header file for wide-character support. Example This example opens the file myfile.dat for reading and then scans this file for a string, a long integer value, a character, and a floating-point value. #include #include #include #define MAXLEN 80 int main(void) { int count; FILE *stream; long l; float fp; wchar_t s[MAXLEN]; wchar_t c; for(count=0;count #include int printf(const char *format-string, ...); Description The printf function formats and prints a series of characters and values to the standard output stream stdout. The format-string consists of ordinary characters, escape sequences, and format specifications. The ordinary characters are copied in order of their appearance to stdout. Format specifications, beginning with a percent sign (%), determine the output format for any argument-list following the format-string. The format-string is a multibyte character string beginning and ending in its initial shift state. The format-string is read left to right. When the first format specification is found, the value of the first argument after the format-string is converted and output according to the format specification. The second format specification causes the second argument after the format-string to be converted and output, and so on through the end of the format-string. If there are more arguments than there are format specifications, the extra arguments are evaluated and ignored. The results are undefined if there are not enough arguments for all the format specifications. 
A format specification has the following form:

   %[flags][width][.precision][h|l|L]type

Each field of the format specification is a single character or number signifying a particular format option. The type character, which appears after the last optional format field, determines whether the associated argument is interpreted as a character, a string, a number, or a pointer. The simplest format specification contains only the percent sign and a type character (for example, %s).

The following optional fields control other aspects of the formatting:

Field       Description
flags       Justification of output and printing of signs, blanks, decimal points, octal and hexadecimal prefixes, and the semantics for the wchar_t precision unit.
width       Minimum number of characters (bytes) output.
precision   Maximum number of characters (bytes) printed for all or part of the output field, or minimum number of digits printed for integer values.
h, l, L     Size of argument expected.

The type characters and their meanings are given in the following table:

┌─────────┬───────────────┬────────────────────────────────────────────────────┐
│CHARACTER│ARGUMENT       │OUTPUT FORMAT                                       │
├─────────┼───────────────┼────────────────────────────────────────────────────┤
│d, i     │Integer        │Signed decimal integer.                             │
├─────────┼───────────────┼────────────────────────────────────────────────────┤
│u        │Integer        │Unsigned decimal integer.                           │
├─────────┼───────────────┼────────────────────────────────────────────────────┤
│o        │Integer        │Unsigned octal integer.                             │
├─────────┼───────────────┼────────────────────────────────────────────────────┤
│x        │Integer        │Unsigned hexadecimal integer, using abcdef          │
├─────────┼───────────────┼────────────────────────────────────────────────────┤
│X        │Integer        │Unsigned hexadecimal integer, using ABCDEF          │
├─────────┼───────────────┼────────────────────────────────────────────────────┤
│f        │Double         │Signed value having the form [-]dddd.dddd, where    │
│         │               │dddd is one or more decimal digits. The number of   │
│         │               │digits before the decimal point depends on the      │
│         │               │magnitude of the number. The number of digits after │
│         │               │the decimal point is equal to the requested         │
│         │               │precision.                                          │
├─────────┼───────────────┼────────────────────────────────────────────────────┤
│e        │Double         │Signed value having the form [-]d.dddd"e"[sign]ddd, │
│         │               │where d is a single decimal digit, dddd is one or   │
│         │               │more decimal digits, ddd is 2 or more decimal       │
│         │               │digits, and sign is + or -.                         │
├─────────┼───────────────┼────────────────────────────────────────────────────┤
│E        │Double         │Identical to the "e" format except that "E"         │
│         │               │introduces the exponent instead of "e".             │
├─────────┼───────────────┼────────────────────────────────────────────────────┤
│g        │Double         │Signed value printed in "f" or "e" format. The "e"  │
│         │               │format is used only when the exponent of the value  │
│         │               │is less than -4 or greater than precision. Trailing │
│         │               │zeros are truncated, and the decimal point appears  │
│         │               │only if one or more digits follow it.               │
├─────────┼───────────────┼────────────────────────────────────────────────────┤
│G        │Double         │Identical to the "g" format except that "E"         │
│         │               │introduces the exponent (where appropriate) instead │
│         │               │of "e".                                             │
├─────────┼───────────────┼────────────────────────────────────────────────────┤
│c        │Character      │Single character.                                   │
├─────────┼───────────────┼────────────────────────────────────────────────────┤
│s        │String         │Characters printed up to the first null character   │
│         │               │(\0) or until precision is reached.                 │
├─────────┼───────────────┼────────────────────────────────────────────────────┤
│n        │Pointer to     │Number of characters successfully written so far to │
│         │integer        │the stream or buffer; this value is stored in the   │
│         │               │integer whose address is given as the argument.     │
├─────────┼───────────────┼────────────────────────────────────────────────────┤
│p        │Pointer        │Pointer to void converted to a sequence of printable│
│         │               │characters.                                         │
├─────────┼───────────────┼────────────────────────────────────────────────────┤
│lc       │Wide character │Multibyte character.                                │
├─────────┼───────────────┼────────────────────────────────────────────────────┤
│ls       │Wide string    │Multibyte characters printed up to the first        │
│         │               │wchar_t null character (L'\0') or until precision   │
│         │               │is reached.                                         │
├─────────┼───────────────┼────────────────────────────────────────────────────┤
│C        │Wide character │A wchar_t character is converted into an array of   │
│         │to multibyte   │characters representing a multibyte character, and  │
│         │               │this character is printed. Same result as           │
│         │               │wctomb().                                           │
├─────────┼───────────────┼────────────────────────────────────────────────────┤
│S        │Wide character │Takes a pointer to an array of wchar_t characters   │
│         │string to      │and converts it to an array of multibyte characters │
│         │multibyte      │up to but not including the null character, and     │
│         │string         │prints the result. Same result as wcstombs().       │
└─────────┴───────────────┴────────────────────────────────────────────────────┘

The flag characters and their meanings are as follows (notice that more than one flag can appear in a format specification):

┌──────────┬───────────────────────────────────────────┬───────────────────────┐
│FLAG      │MEANING                                    │DEFAULT                │
├──────────┼───────────────────────────────────────────┼───────────────────────┤
│-         │Left-justify the result within the field   │Right-justify.         │
│          │width.                                     │                       │
├──────────┼───────────────────────────────────────────┼───────────────────────┤
│+         │Prefix the output value with a sign (+ or  │Sign appears only for  │
│          │-) if the output value is of a signed type.│negative signed values │
│          │                                           │(-).                   │
├──────────┼───────────────────────────────────────────┼───────────────────────┤
│blank(' ')│Prefix the output value with a blank if the│No blank.              │
│          │output value is signed and positive. The   │                       │
│          │"+" flag overrides the blank flag if both  │                       │
│          │appear, and a positive signed value is     │                       │
│          │output with a sign.                        │                       │
├──────────┼───────────────────────────────────────────┼───────────────────────┤
│#         │When used with the "o", "x", or "X"        │No prefix.             │
│          │formats, the "#" flag prefixes any nonzero │                       │
│          │output value with "0", "0x", or "0X",      │                       │
│          │respectively.                              │                       │
├──────────┼───────────────────────────────────────────┼───────────────────────┤
│#         │When used with the "f", "e", or "E"        │Decimal point appears  │
│          │formats, the "#" flag forces the output    │only if digits follow  │
│          │value to contain a decimal point in all    │it.                    │
│          │cases.                                     │                       │
├──────────┼───────────────────────────────────────────┼───────────────────────┤
│#         │When used with the "g" or "G" formats, the │Decimal point appears  │
│          │"#" flag forces the output value to contain│only if digits follow  │
│          │a decimal point in all cases and prevents  │it; trailing zeros are │
│          │the truncation of trailing zeros.          │truncated.             │
├──────────┼───────────────────────────────────────────┼───────────────────────┤
│#         │When used with the "ls" format, the "#"    │Precision indicates the│
│          │flag causes precision to be measured in    │maximum number of bytes│
│          │wchar_t characters.                        │to be output.          │
├──────────┼───────────────────────────────────────────┼───────────────────────┤
│0         │When used with the "d", "i", "o", "u", "x",│Space padding.         │
│          │"X", "e", "E", "f", "g", or "G" formats,   │                       │
│          │the "0" flag causes leading zeros to pad   │                       │
│          │the output to the field width. The "0"     │                       │
│          │flag is ignored if precision is specified  │                       │
│          │for an integer or if the "-" flag is       │                       │
│          │specified.                                 │                       │
└──────────┴───────────────────────────────────────────┴───────────────────────┘

The # flag should not be used with c, lc, d, i, u, s, or p types.

Width is a nonnegative decimal integer controlling the minimum number of characters printed. If the number of characters in the output value is less than the specified width, blanks are added on the left or the right (depending on whether the - flag is specified) until the minimum width is reached. Width never causes a value to be truncated; if the number of characters in the output value is greater than the specified width, or width is not given, all characters of the value are printed (subject to the precision specification).

For the ls type, width is specified in bytes. If the number of bytes in the output value is less than the specified width, single-byte blanks are added on the left or the right (depending on whether the - flag is specified) until the minimum width is reached.

The width specification can be an asterisk (*), in which case an argument from the argument list supplies the value. The width argument must precede the value being formatted in the argument list.

Precision is a nonnegative decimal integer preceded by a period, which specifies the number of characters to be printed or the number of decimal places.
Unlike the width specification, the precision can cause truncation of the output value or rounding of a floating-point value.

The precision specification can be an asterisk (*), in which case an argument from the argument list supplies the value. The precision argument must precede the value being formatted in the argument list.

The interpretation of the precision value and the default when the precision is omitted depend upon the type, as shown in the following table:

┌──────────┬──────────────────────────────────────┬────────────────────────────┐
│TYPE      │MEANING                               │DEFAULT                     │
├──────────┼──────────────────────────────────────┼────────────────────────────┤
│i, d, u,  │Precision specifies the minimum number│If precision is "0" or      │
│o, x, X   │of digits to be printed. If the number│omitted entirely, or if the │
│          │of digits in the argument is less than│period (.) appears without a│
│          │precision, the output value is padded │number following it, the    │
│          │on the left with zeros. The value is  │precision is set to 1.      │
│          │not truncated when the number of      │                            │
│          │digits exceeds precision.             │                            │
├──────────┼──────────────────────────────────────┼────────────────────────────┤
│f, e, E   │Precision specifies the number of     │Default precision is six.   │
│          │digits to be printed after the decimal│If precision is "0" or the  │
│          │point. The last digit printed is      │period appears without a    │
│          │rounded.                              │number following it, no     │
│          │                                      │decimal point is printed.   │
├──────────┼──────────────────────────────────────┼────────────────────────────┤
│g, G      │Precision specifies the maximum number│All significant digits are  │
│          │of significant digits printed.        │printed.                    │
├──────────┼──────────────────────────────────────┼────────────────────────────┤
│c         │No effect.                            │The character is printed.   │
├──────────┼──────────────────────────────────────┼────────────────────────────┤
│lc        │No effect.                            │The wchar_t character is    │
│          │                                      │printed.                    │
├──────────┼──────────────────────────────────────┼────────────────────────────┤
│s         │Precision specifies the maximum number│Characters are printed until│
│          │of characters to be printed.          │a null character is         │
│          │Characters in excess of precision are │encountered.                │
│          │not printed.                          │                            │
├──────────┼──────────────────────────────────────┼────────────────────────────┤
│ls        │Precision specifies the maximum number│wchar_t characters are      │
│          │of bytes to be printed. Bytes in      │printed until a null        │
│          │excess of precision are not printed;  │character is encountered.   │
│          │however, multibyte integrity is always│                            │
│          │preserved.                            │                            │
└──────────┴──────────────────────────────────────┴────────────────────────────┘

The h, l, and L characters specify the size of the expected argument. Their meanings are as follows:

h   A prefix with the integer types d, i, o, u, x, X, and n that specifies that the argument is short int or unsigned short int.
l   A prefix with d, i, o, u, x, X, and n types that specifies that the argument is a long int or unsigned long int.
L   A prefix with e, E, f, g, or G types that specifies that the argument is long double.

If a percent sign (%) is followed by a character that has no meaning as a format field, the character is simply copied to stdout. For example, to print a percent sign character, use %%.

The printf function returns the number of characters (bytes) printed.

Related Information

   fprintf - Write Formatted Data
   fscanf - Read Formatted Data
   scanf - Read Data
   sprintf - Formatted Print to Buffer
   sscanf - Read Data
   stdio.h

Example

This example prints data in a variety of formats.
#include #include int main(void) { char ch = 'h', *string = "computer"; int count = 234, hex = 0x10, oct = 010, dec = 10; double fp = 251.7366; wchar_t wc = (wchar_t)0x0058; wchar_t ws[4]; ws[0] = (wchar_t)0x0041; ws[1] = (wchar_t)0x0042; ws[2] = (wchar_t)0x0043; ws[3] = (wchar_t)0x0000; printf("%d %+d %06d %X %x %o\n\n", count, count, count, count, count, count); printf("1234567890123%n4567890123456789\n\n", &count); printf("Value of count should be 13; count = %d\n\n", count); printf("%10c%5c\n", ch, ch); printf("%25s\n%25.4s\n\n", string, string); printf("%f %.2f %e %E\n\n", fp, fp, fp, fp); printf("%i %i %i\n\n", hex, oct, dec); printf("%C %S\n\n",wc,ws); printf("%2$C %1$2S\n\n",ws,wc); } /***************** Output should be similar to: ***************** 234 +234 000234 EA ea 352 12345678901234567890123456789 Value of count should be 13; count = 13 h h computer comp 251.736600 251.74 2.517366e+02 2.517366E+02 16 8 10 X ABC X AB *******************************************************************/ ═══ 9.4. scanf -- Reads formatted data from the standard input ═══ Syntax #include #include int scanf(const char *format-string, argument-list); Description The scanf function reads data from the standard input stream stdin into the locations given by each entry in argument-list. Each argument must be a pointer to a variable with a type that corresponds to a type specifier in format-string. The format-string controls the interpretation of the input fields, and is a multibyte character string beginning and ending in its initial shift state. The format-string can contain one or more of the following:  White-space characters, as specified by isspace (such as blanks and new-line characters). A white-space character causes scanf to read, but not to store, all consecutive white-space characters in the input up to the next character that is not white-space. One white-space character in format-string matches any combination of white-space characters in the input.  
Characters that are not white space, except for the percent sign character (%). A non-white-space character causes scanf to read, but not to store, a matching non-white-space character. If the next character in stdin does not match, scanf ends.
Format specifications, introduced by the percent sign (%). A format specification causes scanf to read and convert characters in the input into values of a specified type. The value is assigned to an argument in the argument list.

The scanf function reads format-string from left to right. Characters outside of format specifications are expected to match the sequence of characters in stdin; the matched characters in stdin are scanned but not stored. If a character in stdin conflicts with format-string, scanf ends. The conflicting character is left in stdin as if it had not been read.

When the first format specification is found, the value of the first input field is converted according to the format specification and stored in the location specified by the first entry in argument-list. The second format specification converts the second input field and stores it in the second entry in argument-list, and so on through the end of format-string.

An input field is defined as all characters up to the first white-space character (space, tab, or new line), up to the first character that cannot be converted according to the format specification, or until the field width is reached, whichever comes first.

If there are too many arguments for the format specifications, the extra arguments are ignored. The results are undefined if there are not enough arguments for the format specifications.

A format specification has the following form:

   %[*][width][h|l|L]type

Each field of the format specification is a single character or a number signifying a particular format option.
The type character, which appears after the last optional format field, determines whether the input field is interpreted as a character, a string, or a number. The simplest format specification contains only the percent sign and a type character (for example, %s). Each field of the format specification is discussed in scanf Format Tags. If a percent sign (%) is followed by a character that has no meaning as a format control character, that character and following characters up to the next percent sign are treated as an ordinary sequence of characters; that is, a sequence of characters that must match the input. For example, to specify a percent-sign character, use %%. The scanf function scans each input field character by character. It might stop reading a particular input field either before it reaches a space character, when the specified width is reached, or when the next character cannot be converted as specified. When a conflict occurs between the specification and the input character, the next input field begins at the first unread character. The conflicting character, if there was one, is considered unread and is the first character of the next input field or the first character in subsequent read operations on stdin. The scanf function returns the number of fields that were successfully converted and assigned. The return value does not include fields that were read but not assigned. The return value is EOF for an attempt to read at end-of-file if no conversion was performed. A return value of 0 means that no fields were assigned. scanf Format Tags The scanf format specification fields are described below. * An asterisk (*) following the percent sign suppresses assignment of the next input field, which is interpreted as a field of the specified type. The field is scanned but not stored. width The width is a positive decimal integer controlling the maximum number of characters to be read from stdin. 
No more than width characters are converted and stored at the corresponding argument. Fewer than width characters are read if a white-space character (space, tab, or new line), or a character that cannot be converted according to the given format occurs before width is reached. h, l, L The optional prefix l(ell) shows that you use the long version of the following type, while the prefix h indicates that the short version is to be used. The corresponding argument should point to a long or double object (for the l character), a long double object (for the L character), or a short object (with the h character). The l and h modifiers can be used with the d, i, o, x, and u type characters. The l modifier can also be used with the e, f, and g type characters. The L modifier can be used with the e, f and g type characters. The l and h modifiers are ignored if specified for any other type. Note that the l modifier is also used with the c and s characters to indicate a multibyte character or string. 
type   The type characters and their meanings are in the following table:

┌──────────┬──────────────────────────────┬──────────────────────────────┐
│CHARACTER │TYPE OF INPUT EXPECTED        │TYPE OF ARGUMENT              │
├──────────┼──────────────────────────────┼──────────────────────────────┤
│d         │Decimal integer               │Pointer to int                │
├──────────┼──────────────────────────────┼──────────────────────────────┤
│o         │Octal integer                 │Pointer to unsigned int       │
├──────────┼──────────────────────────────┼──────────────────────────────┤
│x, X      │Hexadecimal integer           │Pointer to unsigned int       │
├──────────┼──────────────────────────────┼──────────────────────────────┤
│i         │Decimal, hexadecimal, or octal│Pointer to int                │
│          │integer                       │                              │
├──────────┼──────────────────────────────┼──────────────────────────────┤
│u         │Unsigned decimal integer      │Pointer to unsigned int       │
├──────────┼──────────────────────────────┼──────────────────────────────┤
│e, f, g,  │Floating-point value          │Pointer to float              │
│E, G      │consisting of an optional sign│                              │
│          │(+ or -); a series of one or  │                              │
│          │more decimal digits possibly  │                              │
│          │containing a decimal point;   │                              │
│          │and an optional exponent (e or│                              │
│          │E) followed by a possibly     │                              │
│          │signed integer value          │                              │
├──────────┼──────────────────────────────┼──────────────────────────────┤
│c         │Character; white-space        │Pointer to char large enough  │
│          │characters that are ordinarily│for input field               │
│          │skipped are read when c is    │                              │
│          │specified                     │                              │
├──────────┼──────────────────────────────┼──────────────────────────────┤
│s         │String                        │Pointer to character array    │
│          │                              │large enough for input field  │
│          │                              │plus a terminating null       │
│          │                              │character (\0), which is      │
│          │                              │automatically appended        │
├──────────┼──────────────────────────────┼──────────────────────────────┤
│n         │No input read from stream or  │Pointer to int, into which    │
│          │buffer                        │is stored the number of       │
│          │                              │characters successfully read  │
│          │                              │from the stream or buffer up  │
│          │                              │to that point in the call to  │
│          │                              │scanf                         │
├──────────┼──────────────────────────────┼──────────────────────────────┤
│p         │Pointer to void converted to  │Pointer to void               │
│          │series of characters          │                              │
├──────────┼──────────────────────────────┼──────────────────────────────┤
│lc        │Multibyte character constant  │Pointer to wchar_t            │
├──────────┼──────────────────────────────┼──────────────────────────────┤
│ls        │Multibyte string constant     │Pointer to wchar_t string     │
└──────────┴──────────────────────────────┴──────────────────────────────┘

To read strings not delimited by space characters, substitute a set of characters in brackets ([ ]) for the s (string) type character. The corresponding input field is read up to the first character that does not appear in the bracketed character set. If the first character in the set is a caret (^), the effect is reversed: the input field is read up to the first character that does appear in the rest of the character set.

To store a string without storing an ending null character (\0), use the specification %ac, where a is a decimal integer. In this instance, the c type character means that the argument is a pointer to a character array. The next a characters are read from the input stream into the specified location, and no null character is added.

The input for a %x format specifier is interpreted as a hexadecimal number.
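The bracketed character sets and the %ac form described above can be sketched as follows. The helper names parse_key_value and read_tag are hypothetical, and sscanf is used in place of scanf only so that the input can be given as a string.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: splits "key=value" using bracketed character sets.
   %31[^=] reads up to 31 characters that are not '=', so the key may
   contain blanks; %31[^\n] then reads the rest of the line.
   Returns the number of fields assigned (2 on success). */
int parse_key_value(const char *line, char key[32], char value[32])
{
    return sscanf(line, "%31[^=]=%31[^\n]", key, value);
}

/* Hypothetical helper: reads exactly four characters with %4c.
   No null character is appended for the c type, so the buffer is
   cleared first to keep it usable as a string. */
int read_tag(const char *input, char tag[5])
{
    memset(tag, 0, 5);
    return sscanf(input, "%4c", tag);
}
```

The explicit widths (31 and 4) keep each conversion inside its buffer, which the bracket and c types do not otherwise guarantee.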
Related Information  fscanf - Read Formatted Data  printf - Formatted Print  sscanf - Read Data  wchar.h Example This example scans various types of data: #include int main(void) { int i; float fp; char c, s[81]; wchar_t wc; wchar_t ws[81]; printf("Enter an integer, a real number, a character, a string,\n" "a multi-byte character, and a multi-byte string : \n"); if (scanf("%d %f %c %s %C %S", &i, &fp, &c, s, &wc, ws) != 6) printf("Not all of the fields were assigned\n"); else { printf("integer = %d\n", i); printf("real number = %f\n", fp); printf("character = %c\n", c); printf("string = %s\n",s); printf("wide-character = %C\n", wc); printf("wide-character string = %S\n",ws); } } /***************** If input is: 12 2.5 a yes, ******************* ************** then output should be similar to: **************** Enter an integer, a real number, a character and a string : integer = 12 real number = 2.500000 character = a string = yes *******************************************************************/ ═══ 9.5. sprintf -- Formatted print to buffer ═══ Syntax #include #include int sprintf(char *buffer, const char *format-string, ...); Description The sprintf function formats and stores a series of characters and values in the array buffer. Any argument-list is converted and put out according to the corresponding format specification in the format-string. The format-string consists of ordinary characters and has the same form and function as the format-string argument for the printf function. See printf for a description of the format-string and arguments. The sprintf function returns the number of characters written in the array, not counting the ending null character. Related Information  fprintf - Write Formatted Data  printf - Formatted Print  sscanf - Read Data  wchar.h Example This example formats several data values to a buffer and prints out the results using printf. 
#include <stdio.h>

char buffer[200];
int i, j;
double fp;
char *s = "baltimore";
char c;

int main(void)
{
   c = 'l';
   i = 35;
   fp = 1.7320508;

   /* Format and print various data */
   j = sprintf(buffer, "%s\n", s);
   j += sprintf(buffer+j, "%c\n", c);
   j += sprintf(buffer+j, "%d\n", i);
   j += sprintf(buffer+j, "%f\n", fp);
   printf("\nstring:\n%s\ncharacter count = %d\n", buffer, j);
   return 0;
}

/********************* Output should be similar to: *************

string:
baltimore
l
35
1.732051

character count = 24

*******************************************************************/

═══ 9.6. sscanf -- Read data ═══

Syntax

#include <wchar.h>

int sscanf(const char *buffer, const char *format, ...);

Description

The sscanf function reads data from buffer into the locations given by the argument list. Each argument must be a pointer to a variable with a type that corresponds to a type specifier in the format string. See scanf for a description of the format string.

The sscanf function returns the number of fields that were successfully converted and assigned. The return value does not include fields that were read but not assigned. The return value is EOF when the end of the string is encountered before anything is converted.

Related Information

   fscanf - Read Formatted Data
   scanf - Read Data
   sprintf - Formatted Print to Buffer
   wchar.h

Example

This example uses sscanf to read various data from the string tokenstring, and then displays the data.

#include <stdio.h>
#include <wchar.h>

int main(void)
{
   char *tokenstring = "15 12 14";
   char *widestring = "ABC Z";
   wchar_t ws[81];
   wchar_t wc;
   int i;
   float fp;
   char s[81];
   char c;

   /* Input various data */
   /* If there were no space between %s and %c,       */
   /* sscanf would read the first character following */
   /* the string, which is a blank space.             */
   sscanf(tokenstring, "%s %c%d%f", s, &c, &i, &fp);
   sscanf(widestring, "%S %C", ws, &wc);

   /* Display the data */
   printf("\nstring = %s\n",s);
   printf("character = %c\n",c);
   printf("integer = %d\n",i);
   printf("floating-point number = %f\n",fp);
   printf("wide-character string = %S\n",ws);
   printf("wide-character = %C\n",wc);
   return 0;
}

/***************** Output should be similar to: *****************

string = 15
character = 1
integer = 2
floating-point number = 14.000000
wide-character string = ABC
wide-character = Z

*******************************************************************/

═══ 10. String Collation Functions ═══

The String Collation Functions are:

   int strcoll(const char *s1, const char *s2);
   size_t strxfrm(char *s1, const char *s2, size_t n);
   int wcscoll(const wchar_t *ws1, const wchar_t *ws2);
   size_t wcsxfrm(wchar_t *ws1, const wchar_t *ws2, size_t n);

═══ 10.1. strcoll -- Compare character strings ═══

Syntax

#include #include
int strcoll(const char *string1, const char *string2);

Description

The strcoll function compares two strings using the collating sequence specified by the program's locale.

Parameters

string1   First string to be compared.
string2   Second string to be compared.

Return Values

The strcoll function returns a value indicating the relationship between the strings, as listed below.

Value            Meaning
Less than 0      string1 less than string2
0                string1 equivalent to string2
Greater than 0   string1 greater than string2

Related Information

   setlocale - Set Locale
   wcscmp - Compare wchar_t Strings
   wcsncmp - Compare wchar_t Strings
   string.h

Example

This example compares the two strings passed to main.
#include #include #include #include int main(int argc, char ** argv) { int result; if ( argc != 3 ) printf( "Usage: %s string1 string2\n", argv[0] ); else { printf("setlocale set as - %s\n\n",setlocale(LC_ALL,"")); result = strcoll( argv[1], argv[2] ); if ( result == 0 ) printf( "\"%s\" is identical to \"%s\"\n", argv[1], argv[2] ); else if ( result < 0 ) printf( "\"%s\" is less than \"%s\"\n", argv[1], argv[2] ); else printf( "\"%s\" is greater than \"%s\"\n", argv[1], argv[2] ); } } /****************** If the input is the strings *********************** **************** "firststring" and "secondstring", ******************** ******************* then the expected output is: ********************** "firststring" is less than "secondstring" *************************************************************************/ ═══ 10.2. strxfrm -- Character string transformation ═══ Syntax #include #include size_t strxfrm(char *string1, const char *string2, size_t count); Description The strxfrm function transforms the string pointed to by string2 and places the result into the string pointed to by string1. The transformation is determined by the program's current locale. The transformed string is not necessarily readable, but can be used with the strcmp or strncmp functions. Parameters string1 Pointer to target string. string2 Pointer to source string. count Maximum number of bytes placed in the target string including the null terminator. Return Values The strxfrm function returns the length of the transformed string, excluding the terminating null character. If the value returned is equal to or greater than count, the contents of the transformed string are indeterminate. 
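The return value rule above suggests a two-pass pattern: ask for the length first, then transform into a buffer of exactly that size. The following is a minimal sketch of that pattern; the helper names make_sort_key and compare_by_key are hypothetical and are not part of this package.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated transformed copy of
   text, or NULL on failure. The caller frees the result. */
char *make_sort_key(const char *text)
{
    /* First pass: a null target with a count of 0 only reports the length. */
    size_t needed = strxfrm(NULL, text, 0);
    char *key = malloc(needed + 1);      /* + 1 for the null terminator */

    if (key == NULL)
        return NULL;

    /* Second pass: a result >= count would leave key indeterminate. */
    if (strxfrm(key, text, needed + 1) >= needed + 1) {
        free(key);
        return NULL;
    }
    return key;
}

/* Hypothetical helper: orders two strings by comparing their keys with
   strcmp. The sign of the result should agree with strcoll on the
   original strings. Returns 0 if a key could not be built. */
int compare_by_key(const char *left, const char *right)
{
    char *left_key = make_sort_key(left);
    char *right_key = make_sort_key(right);
    int result = 0;

    if (left_key != NULL && right_key != NULL)
        result = strcmp(left_key, right_key);
    free(left_key);
    free(right_key);
    return result;
}
```

Building a key once and reusing it is mainly worthwhile when the same string takes part in many comparisons, for example during a sort.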
Related Information

   localeconv - Query Locale Conventions
   setlocale - Set Locale
   strcmp - Compare Strings
   strcmpi - Compare Strings
   strcoll - Compare Strings
   stricmp - Compare Strings
   strncmp - Compare Strings
   strnicmp - Compare Strings
   string.h

Example

This example prompts the user to input a string of characters, then uses strxfrm to determine the length of the transformed string. It also transforms two strings and compares the results with strcmp to show their collation order in the current locale.

#include <stdio.h>
#include <string.h>

int main(void)
{
   char buffer[80];
   char s1[80];
   char s2[80];
   size_t length;

   printf("Type in a string of characters.\n ");
   if (fgets(buffer, sizeof buffer, stdin) == NULL)
      return 1;
   buffer[strcspn(buffer, "\n")] = '\0';   /* drop the new-line kept by fgets */

   /* With a null target and a count of 0, strxfrm only reports the length */
   length = strxfrm(NULL, buffer, 0);
   printf("You would need a %lu element array to hold the string\n",
          (unsigned long)(length + 1));
   printf("\n\n%s\n\n transformed according",buffer);
   printf(" to this program's locale. \n");

   /* The targets must hold the whole result, or their contents are indeterminate */
   strxfrm(s1, "choice", sizeof s1);
   strxfrm(s2, "crumby", sizeof s2);
   if (strcmp(s1,s2) < 0)
      printf("\nchoice\ncrumby\n");
   else
      printf("\ncrumby\nchoice\n");
   return 0;
}

═══ 10.3. wcscoll -- Compare wide-character strings ═══

Syntax

#include <wchar.h>

int wcscoll(const wchar_t *WcString1, const wchar_t *WcString2);

Description

The wcscoll function compares the two wide-character strings pointed to by the WcString1 and WcString2 parameters based on the collation values specified by the LC_COLLATE category of the current locale.

Note: The wcscoll function differs from the wcscmp function in that the wcscoll function compares wide characters based on their collation values, while the wcscmp function compares wide characters based on their ordinal values. The wcscoll function is slower than the wcscmp function because of the overhead of obtaining the collation values from the current locale. The wcscoll function may be unsuccessful if the wide-character strings specified by the WcString1 or WcString2 parameter contain characters outside the domain of the current collating sequence.
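As a hedged illustration of where wcscoll is typically preferred over wcscmp, the following sketch sorts an array of wide-character strings in collation order. The names compare_collated and sort_collated are hypothetical; qsort is the standard C library function.

```c
#include <stdlib.h>
#include <wchar.h>

/* qsort callback: each element is a pointer to a wide-character string.
   A qsort callback has no way to report an error, so this sketch assumes
   that every string lies inside the current collating sequence. */
static int compare_collated(const void *left, const void *right)
{
    const wchar_t *const *a = left;
    const wchar_t *const *b = right;

    return wcscoll(*a, *b);
}

/* Hypothetical helper: sorts items in the collation order of the current
   locale (the "C" locale unless the program has called setlocale). */
void sort_collated(const wchar_t **items, size_t count)
{
    qsort(items, count, sizeof *items, compare_collated);
}
```

Replacing wcscoll with wcscmp in the callback would sort by ordinal value instead, which is faster but ignores the locale's ordering rules.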
Parameters WcString1 Contains a pointer to a wide-character string. WcString2 Contains a pointer to a wide-character string. Return Values Value Meaning Less than 0 string1 less than string2 0 string1 equivalent to string2 Greater than 0 string1 greater than string2 Related Information  wcscmp  wchar.h Example This example sets the current locale and compares two wide-character strings with respect to the current locale. #include #include #include #include int main (void) { wchar_t temp1[255], temp2[255]; char *s = "crumby"; char *t = "choice"; printf("\nCalling setlocale\n"); printf("Setlocale returns: %s \n", setlocale(LC_ALL, "")); mbstowcs(temp1, s, strlen(s)+1); mbstowcs(temp2, t, strlen(t)+1); if(wcscoll(temp1, temp2) > 0) printf("%S is greater than %S\n",temp1,temp2); else printf("%S is less than %S\n",temp1,temp2); } /* end main */ ═══ 10.4. wcsxfrm -- Wide character string transformation ═══ Syntax #include size_t wcsxfrm(wchar_t *WcString1, const wchar_t *WcString2, size_t count); Description The wcsxfrm function transforms the wide-character string specified by the WcString2 parameter into a string of wide character codes, based on the collation values of the wide-characters in the current locale as specified by the LC_COLLATE category. No more than the number of character codes specified by the count parameter are copied into the array specified by the WcString1 parameter. This transformation is such that when two such transformed wide-character strings are obtained and the transformed strings are compared using the wcscmp function, the result obtained would be the same as that obtained by a direct call to the wcscoll function on the two original wide-character strings. Parameters WcString1 Contains a pointer to the destination wide-character string. WcString2 Contains a pointer to the source wide-character string. count Specifies the maximum number of wide-character codes to place into the array specified by WcString1. 
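Because comparing transformed strings with wcscmp gives the same ordering as wcscoll, one possible use is to transform each string once and then sort on the stored keys. The sketch below assumes a fixed key size; KEY_CAPACITY, struct keyed_string, build_key, and sort_by_key are hypothetical names chosen for illustration.

```c
#include <stdlib.h>
#include <wchar.h>

#define KEY_CAPACITY 64   /* hypothetical fixed key size for this sketch */

struct keyed_string {
    const wchar_t *text;            /* original string, not copied      */
    wchar_t key[KEY_CAPACITY];      /* transformed form used for sorting */
};

/* Hypothetical helper: stores text and its transformed key in entry.
   Returns 0 on success, or -1 when the key does not fit or the
   transformation fails, in which case entry->key must not be used. */
int build_key(struct keyed_string *entry, const wchar_t *text)
{
    size_t used = wcsxfrm(entry->key, text, KEY_CAPACITY);

    if (used >= KEY_CAPACITY)
        return -1;
    entry->text = text;
    return 0;
}

/* qsort callback: plain wcscmp on the precomputed keys. */
static int compare_keys(const void *left, const void *right)
{
    const struct keyed_string *a = left;
    const struct keyed_string *b = right;

    return wcscmp(a->key, b->key);
}

/* Hypothetical helper: sorts entries by their transformed keys. */
void sort_by_key(struct keyed_string *entries, size_t count)
{
    qsort(entries, count, sizeof *entries, compare_keys);
}
```

The single size check in build_key also covers an error result, since an error value converted to size_t is larger than any valid count.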
Return Values

If WcString1 is a NULL pointer:

   The function returns the number of wide-character elements (not including the wide-character null terminator) required to store the transformed wide-character string.

If WcString1 is not NULL:

   If the count parameter is large enough to hold the transformed string in the WcString1 parameter, including the wide-character null terminator, the return value is the actual number of wide-character elements placed in the WcString1 parameter, not including the wide-character null.

   If the return value is equal to or greater than the value specified by the count parameter, the contents of the array pointed to by the WcString1 parameter are indeterminate. This occurs whenever the count parameter value is too small to hold the entire transformed string.

   If the wide-character string pointed to by the WcString2 parameter contains wide-character codes outside the domain of the collating sequence defined by the current locale, (size_t)-1 is returned.

Related Information
   wcscmp
   wcscoll

Example

This example reports the array size needed to transform the wide-character string wstring, then transforms two shorter strings into wide-character codes based on the collation values of the current locale, as specified by the LC_COLLATE category, and compares the results with wcscmp.
   #include <stdio.h>
   #include <stdlib.h>
   #include <locale.h>
   #include <wchar.h>

   int main (void)
   {
      char *string1 = "Armadillo aerosol chunks";
      size_t length, l1, l2;
      wchar_t wstring[100];
      wchar_t ws1[50];
      wchar_t ws2[50];
      wchar_t wsa[50];
      wchar_t wsb[50];

      mbstowcs(wstring,string1,100);
      printf("\nLocale now set to --> %s",setlocale(LC_ALL,"Es_ES") );

      /* A NULL destination with a count of 0 only reports the length. */
      length = wcsxfrm(NULL, wstring, 0);
      printf("\n%lu wide-character element array needed to hold the transformed string",
             (unsigned long)(length + 1));
      printf("\n%S\n",wstring);

      mbstowcs(wsa,"llama",50);
      mbstowcs(wsb,"lusty",50);

      /* Pass the full capacity of each destination array. */
      l1 = wcsxfrm(ws1,wsa,50);
      l2 = wcsxfrm(ws2,wsb,50);
      if (wcscmp(ws1,ws2) < 0)
         printf("\nlength1-%lu-%6S -less than- length2-%lu-%6S\n",
                (unsigned long)l1,wsa,(unsigned long)l2,wsb);
      else
         printf("\nlength2-%lu-%6S -less than- length1-%lu-%6S\n",
                (unsigned long)l2,wsb,(unsigned long)l1,wsa);
      return 0;
   }

═══ 11. Date/Time and Monetary Formatting Functions ═══

Date, time and monetary formatting:

   size_t strftime(char *s, size_t maxsize, const char *format, const struct tm *tm);
   ssize_t strfmon(char *s, size_t maxsize, const char *format, ...);
   char *strptime(const char *buf, const char *fmt, struct tm *tm);
   size_t wcsftime(wchar_t *wcs, size_t maxsize, const char *format, const struct tm *tm);

═══ 11.1. strfmon -- Formats monetary strings ═══

Syntax

   #include <monetary.h>

   ssize_t strfmon(char *s, size_t maxsize, const char *format, ...);

Description

The strfmon function converts numeric values to monetary strings according to the specifications in the format parameter. The numeric values to be converted are passed as additional arguments after format. Characters are placed into the s array, as controlled by format. The LC_MONETARY category governs the format of the conversion.

A single call can format several values by including additional conversion specifications in format.

Parameters

   s         Pointer to target array.
   maxsize   Maximum number of bytes (including the null terminator) that are placed into the target array.
   format    Specifies a character string that can contain plain characters and conversion specifications. Plain characters are copied to the output array.
Conversion specifications result in the fetching of zero or more arguments, which are converted and formatted. If there are insufficient arguments for this parameter, the results are undefined. If arguments remain after this parameter is exhausted, the excess arguments are ignored.

Conversion Specification

A conversion specification consists of the following sequence:

   a % (percent sign) character
   optional flags
   optional field width
   optional left precision
   optional right precision
   a required conversion character that determines the conversion to be performed.

Flags

One or more of the following optional flags can be specified to control the conversion:

   =f       An = (equal sign) followed by a single character, f, that specifies the numeric fill character. The default numeric fill character is the space character. This flag does not affect field width filling, which always uses the space character. This flag is ignored unless a left precision is specified.
   ^        Does not use grouping characters when formatting the currency amount. The default is to insert grouping characters if defined for the current locale.
   + or (   Determines the representation of positive and negative currency amounts. Only one of these flags may be specified. The locale's equivalents of + and - are used if + is specified. The locale's equivalent of enclosing negative amounts within parentheses is used if ( is specified. If neither flag is included, a default specified by the current locale is used.
   !        Suppresses the currency symbol from the output conversion.

Field Width

   w    The decimal-digit string, w, specifies the minimum field width in which the result of the conversion is right-justified. If -w is specified, the result is left-justified.

Left Precision

   #n   A # (pound sign) followed by a decimal-digit string, n, specifies the maximum number of digits to be formatted to the left of the radix character.
This option can be specified to keep formatted output from multiple calls to the strfmon function aligned in the same columns. It can also be used to fill unused positions with a special character (for example, $***123.45). This option causes an amount to be formatted as if it has the number of digits specified by the n variable. If more than n digit positions are required, this option is ignored. Digit positions in excess of those required are filled with the numeric fill character set with the =f flag.

If defined for the current locale and not suppressed with the ^ flag, grouping is applied to the fill characters and regular digits. If the fill character is not 0, however, grouping separators following a fill character are replaced by the fill character (for example, $0,001,234.56 and $****1,234.56).

Right Precision

   .p   A . (period) followed by a decimal-digit string, p, specifies the number of digits after the radix character. If the value of the p variable is 0, no radix character is used. If a right precision is not specified, a default specified by the current locale is used. The amount being formatted is rounded to the specified number of digits prior to formatting.

Conversion Characters

   i    The double argument is formatted according to the current locale's international currency format; for example, in the U.S.: 1,234.56.
   n    The double argument is formatted according to the current locale's national currency format; for example, in the U.S.: $1,234.56.
   %    No argument is converted; the conversion specification %% is replaced by a single %.

Return Values

If successful, and if the number of resulting bytes (including the terminating null character) is not more than the number of bytes specified by the maxsize parameter, this function returns the number of bytes placed into the array pointed to by the s parameter. Otherwise, -1 is returned and the contents of the s array are indeterminate.
Related Information
   scanf
   strftime
   strptime
   wcsftime
   monetary.h

Example

This example sets the locale and prints monetary information according to that locale's specification.

   #include <stdio.h>
   #include <locale.h>
   #include <monetary.h>

   int main (void)
   {
      char temp1[255];
      double num = 0.0;
      long rc;

      printf("Setlocale returns: %s \n", setlocale(LC_ALL, ""));
      num = 123456.789;

      /* Format num in the locale's national currency format. */
      rc = (long)strfmon(temp1,sizeof(temp1),"Monetary %n",num);
      printf("strfmon rc = %ld\n",rc);
      if (rc >= 0)
         printf("Monetary value is %.2f %s\n",num,temp1);
      return 0;
   } /* end main */

═══ 11.2. strftime -- Formats time and date ═══

Syntax

   #include <time.h>

   size_t strftime(char *String, size_t Maxsize, const char *Format, const struct tm *TmPtr);

Description

The strftime function converts the internal time and date specification of the tm structure, which is pointed to by the TmPtr parameter, into a character string pointed to by the String parameter under the direction of the format string pointed to by the Format parameter. The actual values for the format specifiers depend on the current settings of the LC_TIME category. The tm structure values may be assigned by the user or generated by the localtime or gmtime function.

The Format parameter works much like the format parameter of printf, and the resulting string is placed in the memory location addressed by the String parameter. The maximum length of the string is determined by the Maxsize parameter, and the string is terminated with a null character.

Many conversion specifications are the same as those used by the date command. The interpretation of some conversion specifications depends on the current locale of the process.

The Format parameter is a character string containing two types of objects: plain characters that are simply placed in the output string, and conversion specifications that convert information from the TmPtr parameter into readable form in the output string.
Each conversion specification is a sequence of this form:

   %[[-]width][.precision]type

A % (percent sign) introduces a conversion specification.

An optional decimal-digit string specifies a minimum field width. A converted value that has fewer characters than the field width is padded with spaces to the left. If the decimal-digit string is preceded by a - (minus sign), padding with spaces occurs to the right of the converted value. If no width is given, the appropriate default width is used for numeric fields, with the field padded to the left with zeros, as required. For strings, the output field is made exactly wide enough to contain the string.

An optional precision value gives the maximum number of characters to be printed for the conversion specification. The precision value is a decimal-digit string preceded by a period. If the value to be output is longer than the precision, the string is truncated on the right.

The type of conversion is specified by one or two conversion characters. The characters and their meanings are:

   %a   Represents the locale's abbreviated weekday name (for example, Sun).
   %A   Represents the locale's full weekday name (for example, Sunday).
   %b   Represents the locale's abbreviated month name (for example, Jan).
   %B   Represents the locale's full month name (for example, January).
   %c   Represents the locale's date and time format.
   %C   Represents the century.
   %d   Represents the day of the month as a decimal number (01 to 31).
   %D   Represents the date in %m/%d/%y format (for example, 01/31/91).
   %e   Represents the day of the month as a decimal number ( 1 to 31). A single digit is preceded by a space character.
   %h   Same as %b.
   %H   Represents the 24-hour-clock hour as a decimal number (00 to 23).
   %I   Represents the 12-hour-clock hour as a decimal number (01 to 12).
   %j   Represents the day of the year as a decimal number (001 to 366).
   %m   Represents the month of the year as a decimal number (01 to 12).
   %M   Represents the minute of the hour as a decimal number (00 to 59).
   %n   Specifies a new-line character.
   %p   Represents the locale's AM or PM string.
   %r   Represents 12-hour-clock time with AM/PM notation (%I:%M:%S %p).
   %R   Represents 24-hour-clock time in the format %H:%M.
   %S   Represents the second of the minute as a decimal number (00 to 61).
   %t   Specifies a tab character.
   %T   Represents 24-hour-clock time in the format %H:%M:%S.
   %u   Represents the day of the week as a decimal number (1 to 7). Monday is considered as 1.
   %U   Represents the week of the year as a decimal number (00 to 53). Sunday is considered the first day of the week.
   %V   Represents the week of the year as a decimal number (01 to 53). Monday is considered the first day of the week. If the week containing 1 January has four or more days in the new year, then it is considered week 1; otherwise, it is week 53 of the previous year, and the next week is week 1.
   %w   Represents the day of the week as a decimal number (0 to 6). Sunday is considered as 0.
   %W   Represents the week of the year as a decimal number (00 to 53). Monday is considered the first day of the week. All days in a new year preceding the first Monday are considered to be week 0.
   %x   Represents the locale's date format.
   %X   Represents the locale's time format.
   %y   Represents the year of the century (00 to 99).
   %Y   Represents the year with century as a decimal number (for example, 1989).
   %Z   Represents the time-zone name if one can be determined (for example, EST). No characters are displayed if a time zone cannot be determined.
   %%   Specifies a % (percent) sign.

Parameters

   String    Pointer to the string to hold the formatted time.
   Maxsize   Maximum length of string pointed to by the String parameter.
   Format    Pointer to the format character string.
   TmPtr     Pointer to the time structure that is to be converted.
Return Values

If the total number of resulting bytes, including the terminating null byte, is not more than the Maxsize value, the strftime function returns the number of bytes placed into the array pointed to by the String parameter, not including the terminating null byte. Otherwise, 0 is returned and the contents of the array are indeterminate.

Related Information
   strfmon
   strptime
   mbstowcs
   wcsftime
   localtime
   gmtime
   printf

Example

This code fragment sets the locale and calls strftime to print out the time according to that locale's format.

   #include <stdio.h>
   #include <locale.h>
   #include <time.h>

   int main (void)
   {
      char temp[255];
      time_t ltime;
      size_t rc;
      struct tm *ptmTemp;

      printf("\nCalling setlocale\n");
      printf("Setlocale returns: %s \n", setlocale(LC_ALL, ""));

      /* Get the current time and break it down into local time. */
      time(&ltime);
      ptmTemp = localtime(&ltime);

      rc = strftime(temp,sizeof(temp),"At Time %9X Date %9x %a %A %b %B %h",ptmTemp);
      printf("strftime rc = %lu\n",(unsigned long)rc);
      printf("%s\n",temp);
      return 0;
   }

═══ 11.3. strptime -- Converts a character string to a time value ═══

Syntax

   #include <time.h>

   char *strptime(const char *String, const char *Format, struct tm *TmPtr);

Description

The strptime function converts the characters in the String parameter to values that are stored in TmPtr, using the format specified by the Format parameter.

Parameters

   String   Contains the character string to be converted.
   Format   Contains format specifiers for the strptime function.
   TmPtr    Specifies the structure to contain the output of the strptime function. If a conversion fails, the contents of the TmPtr structure are undefined.

Format Specifiers

The Format parameter contains zero or more specifiers. Each specifier is composed of one of the following:

   One or more white-space characters
   An ordinary character (neither % nor a white-space character)
   A format specifier.

The LC_TIME category defines the locale values for the format specifiers. The following format specifiers are supported:

   %a   Represents the weekday name; abbreviated or full (for example, Sun or Sunday) defined for the locale.
   %A   Same as %a.
   %b   Represents the month name; abbreviated or full (for example, Jan or January) defined for the locale.
   %B   Same as %b.
   %c   Represents the date and time format defined by the locale (%x %X).
   %C   Represents the century number (00 to 99).
   %d   Represents the day of the month as a decimal number (1 to 31).
   %D   Represents the date in %m/%d/%y format (for example, 12/31/93).
   %e   Same as %d.
   %h   Same as %b.
   %H   Represents the 24-hour-clock hour as a decimal number (00 to 23).
   %I   Represents the 12-hour-clock hour as a decimal number (01 to 12).
   %j   Represents the day of the year as a decimal number (001 to 366).
   %m   Represents the month as a decimal number (01 to 12).
   %M   Represents the minutes of the hour as a decimal number (00 to 59).
   %n   Specifies any white space.
   %p   Represents the AM or PM string defined by the am_pm statement.
   %r   Represents the time in the format %I:%M:%S %p.
   %R   Represents the time in the format %H:%M (for example, 16:5).
   %S   Represents the seconds (00 to 61).
   %t   Specifies any white space.
   %T   Represents 24-hour-clock time in the format %H:%M:%S.
   %U   Represents the week of the year as a decimal number (00 to 53). Sunday is considered the first day of the week.
   %w   Represents the day of the week as a decimal number (0 to 6). Sunday is considered as 0.
   %W   Represents the week of the year as a decimal number (00 to 53). Monday is considered the first day of the week.
   %x   Represents the date, defined by the locale's date format.
   %X   Represents the time, defined by the locale's time format.
   %y   Represents the year of the century (00 to 99).
   %Y   Represents the year including the century (for example, 1939).
   %%   Specifies a % (percent sign) character.

A format specification consisting of white-space characters is performed by reading input until the first non-white-space character (which is not read) or until no more characters can be read.

A format specification consisting of an ordinary character is performed by reading the next character from the String parameter.
If this character differs from the character comprising the directive, the directive fails and the differing character and any characters following it remain unread. Case is ignored when matching String items, such as month or weekday names.

Return Values

If successful, the strptime function returns a pointer to the character following the last character parsed. Otherwise, a null pointer is returned.

Related Information
   scanf
   strfmon
   strftime
   wcsftime

Example

This example loads the locale, uses strptime to parse the string "12:34:56 10/21/93" into the time structure, and prints the result according to the loaded locale.

   #include <stdio.h>
   #include <string.h>
   #include <locale.h>
   #include <time.h>

   int main (void)
   {
      char temp[255];
      char *rcp;
      struct tm tmTemp;

      /* Fields that strptime does not set stay 0. */
      memset(&tmTemp, 0, sizeof(tmTemp));
      strcpy(temp,"12:34:56 10/21/93");

      printf("\nCalling setlocale\n");
      printf("Setlocale returns: %s \n", setlocale(LC_ALL, ""));

      rcp = strptime(temp, "%T %D", &tmTemp);
      if(rcp != NULL)
      {
         strftime(temp,sizeof(temp),"At Time %9X Date %9x %A %B %r",&tmTemp);
         printf("%s\n",temp);
         return (0);
      }
      return (1);   /* the string could not be parsed */
   }

═══ 11.4. wcsftime -- Converts date and time into a wide-character string ═══

Syntax

   #include <wchar.h>

   size_t wcsftime(wchar_t *WcString, size_t Maxsize, const char *Format, const struct tm *TmPtr);

Description

The wcsftime function formats the data in the TmPtr parameter according to the specification contained in the Format parameter and places the resulting wide-character string into the WcString parameter. Up to Maxsize-1 wide characters are placed into the WcString parameter, terminated by a wide-character null.

The wcsftime function behaves as if the character string generated by the strftime function is passed to the mbstowcs function as the character string parameter and the mbstowcs function places the result in the WcString parameter of the wcsftime function, up to the limit of wide-character codes specified by the Maxsize parameter.
Parameters

   WcString   Contains the output of the wcsftime function.
   Maxsize    Specifies the maximum number of wide characters (including the wide-character null terminator) that may be placed in the WcString parameter.
   Format     Contains format specifiers. The LC_TIME category defines the locale values for the format specifiers. The Format parameter can use the same format specifiers as strftime.
   TmPtr      Contains the data to be converted by the wcsftime function.

Return Values

If successful, and if the number of resulting wide characters (including the wide-character null terminator) is no more than the number specified by the Maxsize parameter, the wcsftime function returns the number of wide characters (not including the wide-character null terminator) placed in the WcString parameter. Otherwise, 0 (zero) is returned and the contents of the WcString parameter are indeterminate.

Related Information
   strfmon
   strftime
   strptime
   mbstowcs

Example

This example sets the locale and calls wcsftime to print out the time as a wide-character string according to that locale's format.

   #include <stdio.h>
   #include <locale.h>
   #include <time.h>
   #include <wchar.h>

   int main (void)
   {
      wchar_t temp[255];
      time_t ltime;
      size_t rc;
      struct tm *ptmTemp;

      printf("\nCalling setlocale\n");
      printf("Setlocale returns: %s \n", setlocale(LC_ALL, ""));

      /* Get the current time and break it down into local time. */
      time(&ltime);
      ptmTemp = localtime(&ltime);

      /* Maxsize counts wide characters, so pass the element count. */
      rc=wcsftime(temp,sizeof(temp)/sizeof(temp[0]),
                  "At Time %9X Date %9x %a %A %b %B %h",ptmTemp);
      printf("wcsftime rc = %lu\n",(unsigned long)rc);
      printf("%ls\n",temp);
      return 0;
   }

═══ 12. WideChar String Functions ═══

Basic string manipulation API for the wchar_t data type:

   wchar_t *wcscat (wchar_t *string1, const wchar_t *string2);
   wchar_t *wcschr (const wchar_t *string1, wint_t wc);
   int wcscmp (const wchar_t *string1, const wchar_t *string2);
   wchar_t *wcscpy (wchar_t *string1, const wchar_t *string2);
   size_t wcscspn (const wchar_t *string1, const wchar_t *string2);
   size_t wcslen (const wchar_t *ws);
   wchar_t *wcsncat (wchar_t *string1, const wchar_t *string2, size_t n);
   int wcsncmp (const wchar_t *string1, const wchar_t *string2, size_t n);
   wchar_t *wcsncpy (wchar_t *string1, const wchar_t *string2, size_t n);
   wchar_t *wcspbrk (const wchar_t *string1, const wchar_t *string2);
   wchar_t *wcsrchr (const wchar_t *string1, wint_t wc);
   size_t wcsspn (const wchar_t *string1, const wchar_t *string2);
   wchar_t *wcstok (wchar_t *string1, const wchar_t *string2);
   wchar_t *wcswcs (const wchar_t *string1, const wchar_t *string2);
   int wcswidth(wchar_t *ws, size_t n);
   wctype_t wctype(const char *charclass);
   int wcwidth(wchar_t wc);

═══ 12.1. wcscat -- Append strings ═══

Syntax

   #include <wchar.h>

   wchar_t *wcscat(wchar_t *string1, const wchar_t *string2);

Description

The wcscat function appends a copy of the string pointed to by string2 to the end of the string pointed to by string1.

This function operates on null-terminated wchar_t strings. The string arguments to this function should contain a wchar_t null character marking the end of the string. Boundary checking is not performed.

Return Values

This function returns the value of string1.

Related Information
   strcat - Concatenate Strings
   strncat - Concatenate Strings
   wcsncat - Concatenate wchar_t Strings
   wchar.h

Example

This example creates the wide-character string "computer program".
   #include <stdio.h>
   #include <stdlib.h>
   #include <wchar.h>

   int main(void)
   {
      wchar_t buffer1[40];
      wchar_t string[40];
      wchar_t * ptr;

      mbstowcs(buffer1,"computer",9);
      mbstowcs(string," program",9);

      /* Append string to buffer1; ptr receives the value of buffer1. */
      ptr = wcscat( buffer1, string );
      printf( "buffer1 = %S\n", ptr );
      return 0;
   }

   /**************** Output should be similar to: ******************

   buffer1 = computer program

   *******************************************************************/

═══ 12.2. wcschr -- Search string for wide-character ═══

Syntax

   #include <wchar.h>

   wchar_t *wcschr(const wchar_t *string1, wint_t wc);

Description

The wcschr function searches string1 for the first occurrence of wc. wc may be the null wide character (L'\0'). The wchar_t null character at the end of string1 is included in the search.

This function operates on null-terminated wchar_t strings. The string1 argument to this function should contain a wchar_t null character marking the end of the string.

Return Values

This function returns a pointer to the first occurrence of wc in string1. If the character is not found, a NULL pointer is returned.

Related Information
   wcscspn - Find Offset of First wchar_t Match
   wcspbrk - Locate wchar_t Characters in wchar_t String
   wcsrchr - Locate wchar_t Character in wchar_t String
   wcsspn - Search wchar_t Characters in wchar_t String
   wcswcs - Locate wchar_t Substring in wchar_t String
   wchar.h

Example

This example finds the first occurrence of the character p in the wide-character string "computer program".

   #include <stdio.h>
   #include <stdlib.h>
   #include <wchar.h>

   int main(void)
   {
      wchar_t buffer1[20];
      wchar_t * ptr;
      wchar_t wc = L'p';

      mbstowcs(buffer1,"computer program",17);
      ptr = wcschr( buffer1, wc );
      printf( "The first occurrence of %C in '%S' is '%S'\n", wc, buffer1, ptr );
      return 0;
   }

   /**************** Output should be similar to: ******************

   The first occurrence of p in 'computer program' is 'puter program'

   *******************************************************************/

═══ 12.3. wcscmp -- Compare strings ═══

Syntax

   #include <wchar.h>

   int wcscmp(const wchar_t *string1, const wchar_t *string2);

Description

The wcscmp function compares two wchar_t strings.

This function operates on null-terminated wchar_t strings. The string arguments to this function should contain a wchar_t null character marking the end of the string. Boundary checking is not performed when a string is added to or copied.

Return Values

The wcscmp function returns a value indicating the relationship between the two strings, as follows:

   Value            Meaning
   Less than 0      string1 less than string2
   0                string1 identical to string2
   Greater than 0   string1 greater than string2

Related Information
   wcsncmp - Compare wchar_t Strings
   wchar.h

Example

This example compares the wide-character string string1 to string2.

   #include <stdio.h>
   #include <stdlib.h>
   #include <wchar.h>

   int main(void)
   {
      int result;
      wchar_t string1[20];
      wchar_t string2[20];

      mbstowcs(string1,"abcdef",7);
      mbstowcs(string2,"abcdefg",8);

      result=wcscmp( string1, string2 );
      if ( result == 0 )
         printf( "\"%S\" is identical to \"%S\"\n", string1, string2);
      else if ( result < 0 )
         printf( "\"%S\" is less than \"%S\"\n", string1, string2 );
      else
         printf( "\"%S\" is greater than \"%S\"\n", string1, string2);
      return 0;
   }

   /**************** Output should be similar to: ******************

   "abcdef" is less than "abcdefg"

   *******************************************************************/

═══ 12.4. wcscpy -- Copy a string ═══

Syntax

   #include <wchar.h>

   wchar_t *wcscpy(wchar_t *string1, const wchar_t *string2);

Description

The wcscpy function copies the contents of string2 (including the ending wchar_t null character) into string1.

This function operates on null-terminated wchar_t strings. The string arguments to this function should contain a wchar_t null character marking the end of the string. Boundary checking is not performed.

Return Values

The wcscpy function returns the value of string1.
Related Information
   wcsncpy - Copy wchar_t Strings
   wchar.h

Example

This example copies the contents of a source string to a destination string.

   #include <stdio.h>
   #include <stdlib.h>
   #include <wchar.h>

   int main(void)
   {
      wchar_t source[ 80 ];
      wchar_t target[ 80 ];
      wchar_t *wsptr;

      mbstowcs(source,"This is the source string",40);
      mbstowcs(target,"And this is the target string",40);
      printf( "Target is originally = \"%S\"\n", target );

      /* wcscpy returns target, which now holds a copy of source. */
      wsptr = wcscpy( target, source );
      printf( "After wcscpy, target becomes \"%S\"\n", wsptr );
      return 0;
   }

   /**************** Output should be similar to: ******************

   Target is originally = "And this is the target string"
   After wcscpy, target becomes "This is the source string"

   *******************************************************************/

═══ 12.5. wcscspn -- Find length of complementary wide char substring ═══

Syntax

   #include <wchar.h>     /* SAA extension to ANSI */

   size_t wcscspn(const wchar_t *string1, const wchar_t *string2);

Description

The wcscspn function finds the first occurrence of a wchar_t character in the string pointed to by string1 that belongs to the set of wchar_t characters specified by the string pointed to by string2.

This function operates on null-terminated wchar_t strings. The string arguments to this function should contain a wchar_t null character marking the end of the string.

Return Values

The wcscspn function returns the index of the first character found. This value is equivalent to the length of the initial substring of string1 that consists entirely of characters not in string2.

Related Information
   wcsspn - Search wchar_t Characters in String
   wcswcs - Locate wchar_t Substring in wchar_t String
   wchar.h

Example

This example uses wcscspn to find the first occurrence of any of the characters a, x, l, or e in the string "This is the source string".
   #include <stdio.h>
   #include <stdlib.h>
   #include <wchar.h>

   int main(void)
   {
      wchar_t ws[ 80 ];
      wchar_t substring[ 10 ];

      mbstowcs(ws,"This is the source string",40);
      mbstowcs(substring,"axle",10);

      /* wcscspn returns a size_t, so cast it for the %i conversion. */
      printf( "The first %i characters in the string \"%S\" are not "
              "in the string \"%S\" \n",
              (int)wcscspn( ws, substring), ws, substring );
      return 0;
   }

   /**************** Output should be similar to: ************************

   The first 10 characters in the string "This is the source string" are not
   in the string "axle"

   *************************************************************************/

═══ 12.6. wcslen -- Find length of wide-character string ═══

Syntax

   #include <wchar.h>

   size_t wcslen(const wchar_t *ws);

Description

The wcslen function computes the number of wchar_t characters in the string pointed to by ws.

Return Values

The wcslen function returns the number of wchar_t characters that precede the terminating wchar_t null character.

Related Information
   mblen - Multibyte String Length
   strlen - Determine String Length
   wchar.h

Example

This example computes the length of a wchar_t string.

   #include <stdio.h>
   #include <stdlib.h>
   #include <wchar.h>

   int main(void)
   {
      wchar_t ws[ 20 ];

      mbstowcs(ws,"abcdef",7);
      printf( "Length of \"%S\" is %i\n", ws, (int)wcslen( ws ));
      return 0;
   }

   /**************** Output should be similar to: ******************

   Length of "abcdef" is 6

   *******************************************************************/

═══ 12.7. wcsncat -- Append strings ═══

Syntax

   #include <wchar.h>

   wchar_t *wcsncat(wchar_t *string1, const wchar_t *string2, size_t count);

Description

The wcsncat function appends up to count wide characters from string2 to the end of string1 and appends a wchar_t null character to the result.

This function operates on null-terminated wchar_t strings. The string arguments to this function should contain a wchar_t null character marking the end of the string.

Return Values

The wcsncat function returns string1.
Related Information
   strncat - Concatenate Strings
   strcat - Concatenate Strings
   wcscat - Append wchar_t Strings
   wchar.h

Example

This example demonstrates the difference between wcscat and wcsncat. The wcscat function appends the entire second string to the first, whereas wcsncat appends only the specified number of characters in the second string to the first.

   #include <stdio.h>
   #include <stdlib.h>
   #include <wchar.h>

   int main(void)
   {
      wchar_t buffer1[40];
      wchar_t buffer2[20];
      wchar_t * ptr;

      mbstowcs(buffer1,"computer",9);
      mbstowcs(buffer2," program",9);

      /* Call wcscat with buffer1 and " program" */
      ptr = wcscat( buffer1, buffer2 );
      printf( "wcscat : buffer1 = \"%S\"\n", ptr );

      /* Reset buffer1 to contain just the string "computer" again */
      mbstowcs(buffer1,"computer",9);

      /* Call wcsncat with buffer1 and " program" */
      ptr = wcsncat( buffer1, buffer2, 3 );
      printf( "wcsncat: buffer1 = \"%S\"\n", ptr );
      return 0;
   }

   /**************** Output should be similar to: ******************

   wcscat : buffer1 = "computer program"
   wcsncat: buffer1 = "computer pr"

   *******************************************************************/

═══ 12.8. wcsncmp -- Compares subset string2 to string1 ═══

Syntax

   #include <wchar.h>

   int wcsncmp(const wchar_t *string1, const wchar_t *string2, size_t count);

Description

The wcsncmp function compares up to count wide characters in string1 to string2.

The wcsncmp function operates on null-terminated wchar_t strings. The string arguments to this function should contain a wchar_t null character marking the end of the string.
Return Values

The wcsncmp function returns a value indicating the relationship between the two strings, as follows:

   Value            Meaning
   Less than 0      string1 less than string2
   0                string1 equivalent to string2
   Greater than 0   string1 greater than string2

Related Information
   wcscmp - Compare wchar_t Strings
   wcscoll - Compare wchar_t Strings
   wchar.h

Example

This example demonstrates the difference between wcscmp and wcsncmp.

   #include <stdio.h>
   #include <stdlib.h>
   #include <wchar.h>

   int main(void)
   {
      int result;
      wchar_t buffer1[20];
      wchar_t buffer2[20];

      mbstowcs(buffer1,"abcdefg",8);
      mbstowcs(buffer2,"abcfg",6);

      /* Compare the complete strings. */
      result = wcscmp( buffer1, buffer2 );
      printf( "Comparison of each character\n" );
      printf( " wcscmp: " );
      if ( result == 0 )
         printf( "\"%S\" is identical to \"%S\"\n", buffer1, buffer2);
      else if ( result < 0 )
         printf( "\"%S\" is less than \"%S\"\n", buffer1, buffer2 );
      else
         printf( "\"%S\" is greater than \"%S\"\n", buffer1, buffer2 );

      /* Compare only the first 3 wide characters. */
      result = wcsncmp( buffer1, buffer2, 3);
      printf( "\nComparison of only the first 3 characters\n" );
      printf( " wcsncmp: " );
      if ( result == 0 )
         printf( "\"%S\" is identical to \"%S\"\n", buffer1, buffer2);
      else if ( result < 0 )
         printf( "\"%S\" is less than \"%S\"\n", buffer1, buffer2 );
      else
         printf( "\"%S\" is greater than \"%S\"\n", buffer1, buffer2 );
      return 0;
   }

   /**************** Output should be similar to: ******************

   Comparison of each character
    wcscmp: "abcdefg" is less than "abcfg"

   Comparison of only the first 3 characters
    wcsncmp: "abcdefg" is identical to "abcfg"

   *******************************************************************/

═══ 12.9. wcsncpy -- Copy n wide-characters from string2 to string1 ═══

Syntax

   #include <wchar.h>

   wchar_t *wcsncpy(wchar_t *string1, const wchar_t *string2, size_t count);

Description

The wcsncpy function copies up to count wide characters from string2 to string1. If string2 is shorter than count characters, string1 is padded out to count characters with wchar_t null characters.

This function operates on null-terminated wchar_t strings.
The string arguments to this function should contain a wchar_t null character marking the end of the string.

Return Values

The wcsncpy function returns string1.

Related Information
  - wcscpy - Copy wchar_t Strings
  - wchar.h

Example

This example demonstrates the difference between wcscpy and wcsncpy.

   #include <stdio.h>
   #include <stdlib.h>
   #include <wchar.h>

   int main(void)
   {
      wchar_t source[ 20 ];
      wchar_t source1[ 20 ];
      wchar_t target[ 20 ];
      wchar_t target1[ 20 ];
      wchar_t * return_string;
      int index = 5;

      mbstowcs(source,"123456789",10);
      mbstowcs(source1,"123456789",10);
      mbstowcs(target,"abcdefg",8);
      mbstowcs(target1,"abcdefg",8);

      /* wcscpy copies all of source, including the terminating null */
      printf( "target is originally = '%S'\n", target );
      return_string = wcscpy( target, source );
      printf( "After wcscpy, target becomes '%S'\n\n", target );

      /* wcsncpy copies only 5 characters and adds no terminating null */
      printf( "target is originally = '%S'\n", target1 );
      return_string = wcsncpy( target1, source1, index );
      printf( "After wcsncpy 'n=5', target becomes '%S'\n", target1 );
   }

   /****************  Output should be similar to:  ******************

   target is originally = 'abcdefg'
   After wcscpy, target becomes '123456789'

   target is originally = 'abcdefg'
   After wcsncpy 'n=5', target becomes '12345fg'

   *******************************************************************/

═══ 12.10. wcspbrk -- Locate wide-characters in string ═══

Syntax

   #include <wchar.h>

   wchar_t *wcspbrk(const wchar_t *string1, const wchar_t *string2);

Description

The wcspbrk function locates the first occurrence in the string pointed to by string1 of any character from the string pointed to by string2.

Return Value

This function returns a pointer to the character, or NULL if no wchar_t from string2 occurs in string1.
Related Information
  - wcschr - Search wchar_t String for Given wchar_t
  - wcscspn - Find Offset of First wchar_t Match
  - wcsncmp - Compare wchar_t Strings
  - wcsrchr - Locate wchar_t Character in String
  - wcsspn - Search wchar_t Characters in String
  - wcswcs - Locate wchar_t Substring in wchar_t String
  - wchar.h

Example

This example returns a pointer to the first occurrence in the array string of either a or b.

   #include <stdio.h>
   #include <stdlib.h>
   #include <wchar.h>

   int main(void)
   {
      wchar_t *result;
      wchar_t ws[ 30 ];
      wchar_t wc[ 8 ];

      mbstowcs(ws,"Blue Danube",12);
      mbstowcs(wc,"ab",3);

      /* The search is case-sensitive, so the leading 'B' does not match */
      result = wcspbrk( ws, wc);
      printf("The first occurrence of any of the characters \"%S\" in "
             "\"%S\" is \"%S\"\n", wc, ws, result);
   }

   /*******************  Output should be similar to:  *********************

   The first occurrence of any of the characters "ab" in "Blue Danube" is "anube"

   *************************************************************************/

═══ 12.11. wcsrchr -- Locate last occurrence of wide-character in a string ═══

Syntax

   #include <wchar.h>

   wchar_t *wcsrchr(const wchar_t *ws, wint_t wi);

Description

The wcsrchr function locates the last occurrence of wi in the string pointed to by ws. The terminating wchar_t null character is considered to be part of the string.

Return Values

The wcsrchr function returns a pointer to the character, or a NULL pointer if wi does not occur in the string.

Related Information
  - wcschr - Search wchar_t String for Given wchar_t
  - wcscspn - Find Offset of First wchar_t Match
  - wcsspn - Search wchar_t Characters in String
  - wcswcs - Locate wchar_t Substring in wchar_t String
  - wcspbrk - Locate wchar_t Characters in String
  - wchar.h

Example

This example compares the use of wcschr and wcsrchr. It searches the wide-character string for the first and last occurrence of p.
   #include <stdio.h>
   #include <stdlib.h>
   #include <wchar.h>

   int main(void)
   {
      wchar_t ws[40];
      wchar_t * ptr;
      wint_t wi = (wint_t)'p';

      mbstowcs(ws,"computer program",19);

      /* First occurrence of p */
      ptr = wcschr( ws, wi );
      printf( "The first occurrence of %C in '%S' is '%S'\n", wi, ws, ptr );

      /* Last occurrence of p */
      ptr = wcsrchr( ws, wi );
      printf( "The last occurrence of %C in '%S' is '%S'\n", wi, ws, ptr );
   }

   /****************  Output should be similar to:  ******************

   The first occurrence of p in 'computer program' is 'puter program'
   The last occurrence of p in 'computer program' is 'program'

   *******************************************************************/

═══ 12.12. wcsspn -- Find length of wide-character substring ═══

Syntax

   #include <wchar.h>

   size_t wcsspn(const wchar_t *string1, const wchar_t *string2);

Description

The wcsspn function finds the first occurrence of a wchar_t character in the string pointed to by string1 that is not contained in the set of wchar_t characters specified by the string pointed to by string2.

Return Values

The wcsspn function returns the index of the first character found. This value is equivalent to the length of the initial substring of string1 that consists entirely of characters in string2.

Related Information
  - wcschr - Search wchar_t String for Given wchar_t
  - wcscspn - Find Offset of First wchar_t Match
  - wcsrchr - Locate wchar_t Character in String
  - wcsspn - Search wchar_t Characters in String
  - wcswcs - Locate wchar_t Substring in wchar_t String
  - wcspbrk - Locate wchar_t Characters in String
  - wchar.h

Example

This example finds the first occurrence in the string of a character that is neither an a, b, nor c. Because the string in this example is cabbage, wcsspn returns 5, the length of the initial segment of cabbage (cabba) that consists only of the characters a, b, and c.
   #include <stdio.h>
   #include <stdlib.h>
   #include <wchar.h>

   int main(void)
   {
      wchar_t ws[20];
      wchar_t source[8];
      int index;

      mbstowcs(ws,"cabbage",8);
      mbstowcs(source,"abc",4);

      /* Length of the leading run of ws made up only of a, b, and c */
      index = wcsspn( ws, source );
      printf( "The first %d characters of \"%S\" are found in \"%S\"\n",
              index, ws, source );
   }

   /****************  Output should be similar to:  ******************

   The first 5 characters of "cabbage" are found in "abc"

   *******************************************************************/

═══ 12.13. wcstok -- Convert wide-character string to token ═══

Syntax

   #include <wchar.h>

   wchar_t *wcstok(wchar_t *string1, const wchar_t *string2);

Description

The wcstok function reads a wide-character string1 as a series of zero or more tokens and wide-character string2 as the set of characters serving as delimiters of the tokens in string1. The tokens in string1 can be separated by one or more of the delimiters from string2. The tokens in string1 can be located by a series of calls to wcstok.

In the first call to wcstok for a given string1, it searches for the first token in string1, skipping over leading delimiters. A pointer to the first token is returned.

To read the next token from string1, call wcstok with a NULL string1 argument. A NULL string1 argument causes wcstok to search for the next token in the previous token string. Each delimiter is replaced by a null character. The set of delimiters can vary from call to call, so string2 can take any value.

Parameters

   string1   Contains a pointer to the wide-character string to be searched.
   string2   Contains a pointer to the string of wide-character token delimiters.

Return Values

The first time wcstok is called, it returns a pointer to the first token in string1. In later calls with the same token string, wcstok returns a pointer to the next token in the string. A NULL pointer is returned when there are no more tokens. All tokens are null-terminated.
Related Information
  - wcsspn
  - wcswcs
  - wcstoul
  - wcstol
  - wchar.h

Example

Using a loop, the following example gathers tokens, delimited by commas, periods, semicolons, or exclamation points, from a string until no tokens are left. The successive calls return pointers to the tokens abc, def, ghi, jk and lmnop. The next call to wcstok returns NULL and the loop ends.

   #include <stdio.h>
   #include <stdlib.h>
   #include <wchar.h>

   int main(void)
   {
      wchar_t *wsList;
      wchar_t *wsToken;
      wchar_t *wsCodes;

      /* wsToken only points into wsList, so it needs no storage of its own */
      wsList = (wchar_t *)calloc(80,sizeof(wchar_t));
      wsCodes = (wchar_t *)calloc(10,sizeof(wchar_t));

      mbstowcs(wsList, "abc,def.ghi;jk!lmnop",25);
      mbstowcs(wsCodes, ";,!.\0",5);

      /* the wide-character string pointed to by wsList is broken up
         into the tokens "abc", "def", "ghi", "jk" and "lmnop" ;
         the null terminator (\0) is encountered and execution stops */

      wsToken = wcstok(wsList,wsCodes);
      while (wsToken != NULL)
      {
         printf("Token: %S\n", wsToken);
         wsToken = wcstok(NULL,wsCodes);
      }

      free(wsList);
      free(wsCodes);
   }

   /************************************************************************
   Output

   Token: abc
   Token: def
   Token: ghi
   Token: jk
   Token: lmnop
   ************************************************************************/

═══ 12.14. wcswcs -- Locate wide-character substring ═══

Syntax

   #include <wchar.h>

   wchar_t *wcswcs(const wchar_t *string1, const wchar_t *string2);

Description

The wcswcs function locates the first occurrence in the string pointed to by string1 of the sequence of wchar_t characters (excluding the terminating wchar_t null character) in the string pointed to by string2.

The wcswcs function returns a pointer to the located string or NULL if the string is not found. If string2 points to a string with zero length, the function returns string1.
Related Information
  - wcschr - Search wchar_t String for Given wchar_t
  - wcscspn - Find Offset of First wchar_t Match
  - wcspbrk - Locate wchar_t Characters in String
  - wcsrchr - Locate wchar_t Character in String
  - wcsspn - Search wchar_t Characters in String
  - wchar.h

Example

This example finds the first occurrence of the wide-character string sl in buffer1.

   #include <stdio.h>
   #include <stdlib.h>
   #include <wchar.h>

   #define SIZE 40

   int main(void)
   {
      wchar_t buffer1[SIZE];
      wchar_t * ptr;
      wchar_t wch[SIZE];

      mbstowcs(buffer1,"ski slope",SIZE);
      mbstowcs(wch,"sl",SIZE);

      /* Locate the substring wch in buffer1 */
      ptr = wcswcs( buffer1, wch );
      printf( "The first occurrence of %S in '%S' is '%S'\n",
              wch, buffer1, ptr );
   }

   /****************  Output should be similar to:  ******************

   The first occurrence of sl in 'ski slope' is 'slope'

   *******************************************************************/

═══ 12.15. wcswidth -- Determines the display width of wide-character strings ═══

Syntax

   #include <wchar.h>

   int wcswidth(const wchar_t *pwcs, size_t n);

Description

The wcswidth function determines the number of display columns to be occupied by the number of wide-characters specified by the n parameter in the string pointed to by the pwcs parameter. The LC_CTYPE category affects the behavior of the wcswidth function. Fewer than the number of wide-characters specified by the n parameter are counted if a null character is encountered first.

Parameters

   n      Specifies the maximum number of wide-characters whose display width is to be determined.
   pwcs   Contains a pointer to the wide-character string.

Return Values

The wcswidth function returns the number of display columns to be occupied by the number of wide-characters (up to the terminating wide-character null) specified by the n parameter (or fewer) in the string pointed to by the pwcs parameter. A value of zero is returned if the pwcs parameter is a wide-character null pointer or a pointer to a wide-character null (that is, pwcs or *pwcs is null).
If the string pointed to by the pwcs parameter contains an unusable wide-character code, -1 is returned.

Related Information
  - wcwidth - Determines the display width of wide-characters.
  - wchar.h - Header file for wide-character function prototypes.

Example

This example finds the display column width of a wide-character string.

   #include <stdio.h>
   #include <stdlib.h>
   #include <locale.h>
   #include <wchar.h>

   #define SIZE 80

   int main(void)
   {
      wchar_t pwcs[SIZE];
      int retval;
      size_t n = SIZE;   /* measure up to SIZE wide-characters */

      setlocale(LC_ALL, "");

      /* Let pwcs point to a wide-character null terminated
      ** string. Let n be the number of wide-characters whose
      ** display column width is to be determined. Counting
      ** stops early at the terminating null.
      */
      mbstowcs(pwcs,"This is a test string",SIZE);
      retval = wcswidth( pwcs, n );
      if (retval == -1) {
         /* Error handling. Invalid wide-character code
         ** encountered in the wide-character string pwcs.
         */
         printf("Invalid wide character code was encountered in the wide-character string\n");
      }
      else
         printf("The width of this wide-character string \n %S \nis %d\n",pwcs,retval);
   }

═══ 12.16. wctype -- Define character class ═══

Syntax

   #include <wchar.h>

   wctype_t wctype(const char *charclass);

Description

This function returns a value that can be used as the second argument to subsequent calls of iswctype. This value is determined according to the rules of the coded character set defined by character type information in the program's locale. This function is defined for valid character class names as defined in the current locale. Values returned by this function are valid until a call to setlocale modifies the category LC_CTYPE.

Parameters

   charclass   A string identifying a generic character class for which codeset specific type information is required. The following class names are defined in all locales:

      "alnum"  - for the alphanumeric class.
      "alpha"  - for the alphabetic only class.
      "cntrl"  - for the control class.
      "digit"  - for the digit class.
      "graph"  - for the graphics character class.
      "lower"  - for the lower case character class.
      "print"  - for the printable character class.
      "punct"  - for the punctuation class.
      "space"  - for the space character class.
      "upper"  - for the upper case character class.
      "xdigit" - for the hexadecimal digit class.
      "blank"  - for the blank character class.

   Additional character class names defined in the locale definition file (category LC_CTYPE) can also be specified.

Return Values

This function returns a value that can be used as the second argument to subsequent calls of iswctype. A 0 is returned if the given character class name is not valid for the current locale.

Related Information
  - iswctype
  - wchar.h

Example

This example analyzes all characters between code 0x0 and code UPPER_LIMIT, printing A for alphabetic characters, AN for alphanumerics, U for uppercase, L for lowercase, D for digits, X for hexadecimal digits, B for blanks, S for spaces, PU for punctuation, PR for printable characters, G for graphics characters, and C for control characters. Each line begins with the character code in decimal and hexadecimal. The output of this example is a 256-line table showing which of the characters from 0 to 255 possess the attributes tested.

   #include <stdio.h>
   #include <wchar.h>

   #define UPPER_LIMIT 0xFF

   int main(void)
   {
      wint_t ch;

      for ( ch = 0; ch <= UPPER_LIMIT; ++ch )
      {
         printf("\n%3d ", ch);
         printf("%#04x ", ch);
         printf("%3s ", iswctype(ch,wctype("alnum")) ? "AN" : " ");
         printf("%2s ", iswctype(ch,wctype("alpha")) ? "A" : " ");
         printf("%2s", iswctype(ch,wctype("cntrl")) ? "C" : " ");
         printf("%2s", iswctype(ch,wctype("digit")) ? "D" : " ");
         printf("%2s", iswctype(ch,wctype("graph")) ? "G" : " ");
         printf("%2s", iswctype(ch,wctype("lower")) ? "L" : " ");
         /* %2s, not %c: the argument is a string */
         printf("%2s", iswctype(ch,wctype("blank")) ? "B" : " ");
         printf("%3s", iswctype(ch,wctype("punct")) ? "PU" : " ");
         printf("%2s", iswctype(ch,wctype("space")) ? "S" : " ");
         printf("%3s", iswctype(ch,wctype("print")) ? "PR" : " ");
         printf("%2s", iswctype(ch,wctype("upper")) ? "U" : " ");
         printf("%2s", iswctype(ch,wctype("xdigit")) ? "X" : " ");
      }
   }

═══ 12.17. wcwidth -- Determines the display width of wide-characters ═══

Syntax

   #include <wchar.h>

   int wcwidth(wint_t wc);

Description

The wcwidth function determines the number of display columns to be occupied by the wide-character specified by the wc parameter. The LC_CTYPE category affects the behavior of the wcwidth function.

Parameters

   wc   Specifies a wide-character.

Return Values

The wcwidth function returns the number of display columns to be occupied by the wc parameter. If the wc parameter is a wide-character null, a value of 0 is returned. If the wc parameter is an unusable wide-character code, -1 is returned.

Related Information
  - wcswidth
  - wchar.h

Example

This example finds the display column width of a wide-character.

   #include <stdio.h>
   #include <locale.h>
   #include <wchar.h>

   int main(void)
   {
      wchar_t wc = 0x41;
      int retval;

      setlocale(LC_ALL, "");

      /* Let wc be the wide-character whose
      ** display width is to be found.
      */
      retval = wcwidth( wc );
      if (retval == -1) {
         /*
         ** Error handling. Invalid wide-character in wc.
         */
         printf("Invalid wide-character\n");
      }
      else
         printf("The width of the wide-character\n%C\nis %d\n",wc,retval);
   }

═══ 13. Utilities for I18N ═══

The following command-line programs are provided with the OS/2 I18N package:

   Program    Purpose
   gencat     Constructs message catalog files.
   mkcatdef   Constructs message #define .h files to be used in an application program to reference catalog file entries.
   runcat     A command file which builds both message catalogs and the include files for those catalogs (via mkcatdef and gencat).
   cvtmsg     Converts an OS/2 mkmsgf input file into an XPG4 gencat input file.
   locale     Displays the currently active locale categories.
   uconvdef   Compiles or generates a UCS-2 (Unicode) conversion table for use by the iconv library.

═══ 13.1. gencat - Create/modify a message catalog ═══

Syntax

   gencat CatalogFile [ SourceFile ... ]

Description

The gencat command can be used to create a message catalog (usually *.cat) from a message text source file (usually *.msg).
If a message catalog with the name specified by the CatalogFile parameter exists, the gencat command modifies it according to the statements in the specified message source files. If it does not exist, the gencat command creates a catalog file with the name specified by the CatalogFile parameter.

You can specify any number of message source files. The gencat command processes multiple source files, one after another, in the sequence specified. Each successive source file modifies the catalog. If you do not specify a source file, the gencat command accepts message source data from standard input.

The gencat command does not accept symbolic message identifiers. You must run the mkcatdef command if you want to use symbolic message identifiers.

After entering your messages into a source file, you must use the gencat command to process the source file to create a message catalog.

Parameters

   CatalogFile   Name of the catalog file to be created or modified.
   SourceFile    List of 0 or more message source file names from which the catalog is to be updated or created. Each file name must be separated by at least one space.

Return Codes

   0   Successful
   1   Failure

Related Information
  - mkcatdef
  - runcat
  - catopen
  - catgets

Example

To generate a test.cat catalog from the source file test.msg, enter:

   gencat test.cat test.msg

The test.msg file does not contain symbolic identifiers.

═══ 13.2. mkcatdef - Construct message #define .h files ═══

Syntax

   mkcatdef SymbolName SourceFile ... [ -h ]

Description

The mkcatdef command preprocesses a message source file for input to the gencat command. The SourceFile message file contains symbolic identifiers. The mkcatdef command produces the SymbolName.h file, containing statements that equate symbolic identifiers with the set numbers and message ID numbers assigned by the mkcatdef command.

The mkcatdef command creates two outputs. The first is a header file called SymbolName.h.
You must include this SymbolName.h file in your application program to associate the symbolic names with the set and message numbers assigned by the mkcatdef command.

The second output is the message source data, with numbers instead of symbolic identifiers, which the mkcatdef command sends to standard output. This output is suitable as input to the gencat command. You can use the mkcatdef command output as input to the gencat command in the following ways:

  - Use the mkcatdef command with a > (redirection symbol) to write the new message source to a file. Use this file as input to the gencat command.
  - Pipe the mkcatdef command output directly to the gencat command.
  - Use the runcat command rather than the mkcatdef command. The runcat command automatically sends the message source file through the mkcatdef command and then pipes the result to the gencat command.

After running the mkcatdef command, you can use symbolic names in an application to refer to messages.

Parameters

   SymbolName   Name used to build the .h message file name.
   SourceFile   List of 1 or more message source file names. These message files contain symbolic identifiers from which the SymbolName.h file is generated. Each message source file name must be separated by at least one space.

Flags

   -h   Suppresses the generation of a SymbolName.h file. This flag must be the last argument to the mkcatdef command.

Return Codes

   0   Successful
   1   Failure

Related Information
  - gencat
  - runcat
  - catopen
  - catgets

Example

To process the symb.msg message source file and redirect the output to the symb.src file, enter:

   mkcatdef symb symb.msg > symb.src

The generated symb.h file looks similar to the following:

   #ifndef _H_SYMB_MSG
   #define _H_SYMB_MSG
   #include <limits.h>
   #include <nl_types.h>
   #define MF_SYMB "symb.cat"
   /* The following was generated from symb.src. */
   /* definitions for set MSFAC */
   #define SYM_FORM 1
   #define SYM_LEN 2
   #define MSG_H 6
   #endif

The mkcatdef command also creates the symb.src message catalog source file for the gencat command with numbers assigned to the symbolic identifiers:

   $quote " Use double quotation marks to delimit message text
   $delset 1
   $set 1
   1 "Symbolic identifiers can only contain alphanumeric \
   characters or the _ (underscore character)\n"
   2 "Symbolic identifiers cannot be more than 65 \
   characters long\n"
   5 "You can mix symbolic identifiers and numbers\n"
   $quote
   6 remember to include the ".h" file in your program

The assigned message numbers are noncontiguous because the source file contained a specific number. The mkcatdef program always assigns the previous number plus 1 to a symbolic identifier.

Note: The mkcatdef command inserts a $delset command before a $set command in the output message source file. This means you cannot add, delete, or replace single messages in an existing catalog when piping to the gencat command. You must enter all messages in the set.

═══ 13.3. runcat - Build message catalog ═══

Syntax

   runcat CatalogName SourceFile [ CatalogFile ]

Description

The runcat command invokes the mkcatdef command and pipes the message catalog source data (the output from mkcatdef) to the gencat program.

The file specified by the SourceFile parameter contains the message text with your symbolic identifiers. The mkcatdef program uses the CatalogName parameter to generate the name of the symbolic definition file by adding .h to the end of the CatalogName value, and to generate the symbolic name for the catalog file by adding MF_ to the beginning of the CatalogName value. The definition file must be included in your application program. The symbolic name for the catalog file can be used in the library functions (such as the catopen subroutine).

The CatalogFile parameter is the name of the catalog file created by the gencat command.
If you do not specify this parameter, the gencat command names the catalog file by adding .cat to the end of the CatalogName value. This file name can also be used in the catopen library function.

Return Codes

   0   Successful
   1   Failure

Related Information
  - gencat
  - mkcatdef
  - catopen
  - catgets

Example

To generate a catalog named test.cat from the message source file test.msg, enter:

   runcat test test.msg

═══ 13.4. cvtmsg - Convert OS/2 mkmsgf file to XPG4 gencat file ═══

Syntax

   cvtmsg input_OS2_file output_XPG4_file [options]

Description

The cvtmsg utility is used to convert the source for an OS/2 message file (Make Message File (MKMSGF) format) into the source for an XPG/4 message file (Internationalization (I18N) Generate Catalog (gencat) format).

The input file must contain source which is compilable by the MKMSGF utility from the OS/2 Toolkit. The input file will not be modified by this process. An input file example:

   ; comment record. Semicolon must be in column 1.
   ; The next line defines the message component value.
   MSG
   ;
   MSG0001E: Message text for message number one, which is an
   error message. %1 %2 %3 %4 %5 %6 %7 %8 %9 indicate
   replacement variables.
   ;
   ; Message number two is not used because of "?" after the number.
   MSG0002?:
   ;
   MSG0003I: Message text for message number three, which is an
   informational message.
   ;
   MSG0004I: This shows a special variable: %0
   ;
   MSG0005I: This is the last message in this example.

Special conversion considerations for the input file:

  - Comments are lines which have a semicolon ";" in column 1. The entire line is considered part of the comment text.
  - Replacement variables are defined as: %1 %2 %3 %4 %5 %6 %7 %8 %9. All replacement variables are considered to be character string values.
  - The value %0 indicates that a new line character is NOT to be added to the end of this text line; it is NOT a replacement variable. Any text which follows the %0 is ignored and discarded.
  - Messages defined with a type of "?" are place-holders and do not really define a usable message.

The output file will contain the converted message source which is compilable by the gencat utility. The OS/2 message source shown in the example above will create the following XPG/4 message source when the default options are used, except that the /C option was specified to include the converted comment text:

   $ comment record. Semicolon must be in column 1.
   $ The next line defines the message component value.
   $
   $quote " (Define message text delimiter)
   $set 1 (Message component: MSG)
   $
   0001 "MSG0001: Message text for message number one, which is an\n\
   error message. %1$s %2$s %3$s %4$s %5$s %6$s %7$s %8$s %9$s indicate\n\
   replacement variables.\n"
   $
   $ Message number two is not used because of "?" after the number.
   $
   0003 "Message text for message number three, which is an\n\
   informational message.\n"
   $
   0004 "This shows a special variable: "
   $
   0005 "This is the last message in this example.\n"

The OS/2 message source shown in the example above will create the following XPG/4 message source when the /S option was specified to create symbolic message IDs:

   $ comment record. Semicolon must be in column 1.
   $quote " (Define message text delimiter)
   $set 1 (Message component: MSG)
   $
   MSG0001 "MSG0001: Message text for message number one, which is an\n\
   error message. %1$s %2$s %3$s %4$s %5$s %6$s %7$s %8$s %9$s indicate\n\
   replacement variables.\n"
   MSG0003 "Message text for message number three, which is an\n\
   informational message.\n"
   MSG0004 "This shows a special variable: "
   MSG0005 "This is the last message in this example.\n"

Special conversion considerations for the output file:

  - Comments are lines which have "$" in column 1 and a blank in column 2. Comment text can also be defined on the "$quote" and "$set" statements.
  - The "$quote" line defines the delimiter character which is used to identify the text of the message.
  - Replacement variables are defined as "%n$s", where "n" is a one digit number (1-9) and "s" indicates that the replacement text is a string of characters.

The conversion is performed as follows:

  - OS/2 comments are lines which have a semicolon ";" in column 1, with comment text starting in column 2. The conversion default is that comments are not included in the output source. When the "/C" option is specified, OS/2 comments are converted to XPG/4 comments and are included in the output source. XPG/4 comments have a "$" in column 1, a blank in column 2, with comment text starting in column 3.
  - Converted message text will always be delimited with double quotes. This tool will add a $quote " statement for this purpose.
  - All converted messages will always be added to XPG/4 message set number 1. This tool will add a "$set 1" statement for this purpose.
  - The formatting of the OS/2 message text will be preserved in the converted XPG/4 message text. New line characters "\n" will be added to maintain formatting.
  - A new line character will be added at the end of the XPG/4 message text when "%0" is not defined at the end of the OS/2 message text.
  - OS/2 replacement variables are defined as: %1 %2 %3 %4 %5 %6 %7 %8 %9, and are always considered to be character string values. Each variable is converted to an equivalent XPG/4 variable format: %1$s %2$s %3$s %4$s %5$s %6$s %7$s %8$s %9$s. The one digit number remains the same (to ensure proper sequencing) and each variable must be a character string replacement variable.
  - OS/2 messages defined with a type of "?" are place-holders and do not really define a usable message. These messages are not included in the output XPG/4 message file.
  - OS/2 messages which have a message type of "E" (error) or "W" (warning) will have the message identifier added to the beginning of the message text (see MSG0001 in the above example). This will simulate the function provided by DosGetMessage for these types of messages.
Parameters

   input_OS2_file
      The name of the input file which contains the message source for the MKMSGF utility (from the OS/2 Toolkit).

   output_XPG4_file
      The name of the output file to which the converted message source will be written in a format compatible with the gencat utility.

   options
      /C   Include the text from all OS/2 source comments in the generated XPG/4 message source. Otherwise, OS/2 comment text is not included in the output file.
      /P   Prompt at the beginning of execution to ensure that the passed parameters were correctly defined. The user will have the option to terminate processing before any conversion is performed.
      /S   Generate symbolic XPG/4 message IDs which are the same as the OS/2 message ID (3 character component plus the 4 digit number, for example MSG0001). Otherwise, the XPG/4 message ID will consist of only the numeric part of the OS/2 message ID (for example 0001).

Return Codes

   0   Successful
   1   Failure

Related Information
  - gencat
  - runcat
  - catopen
  - catgets

═══ 13.5. locale - Display locale categories. ═══

Syntax

   locale

Description

This utility displays the locale that is currently active for each locale category. This is the same string of data that is returned from the setlocale function if a NULL locale is specified and LC_ALL is specified for the category.

Parameters

There are no parameters.

Return Codes

The return code is always 0.

Related Information
  - setlocale

Example

If LANG were set, for example, to En_US.IBM-437, then the program would display:

   ENUS437 ENUS437 ENUS437 ENUS437 ENUS437 ENUS437

═══ 13.6. uconvdef - Compile conversion table ═══

Syntax

   uconvdef [ -f SrcFile ] [ -v ] UconvTable

Description

Compiles or generates a UCS-2 (Unicode) conversion table for use by the iconv library. The uconvdef command reads SrcFile and creates a compiled conversion table in UconvTable. The SrcFile defines a mapping between UCS-2 and multibyte code sets (one or more bytes per character).
The UconvTable is in a format that can be loaded by the UCSTBL conversion method located in the i18n\locale\iconv directory. This method uses the table to support UCS-2 conversions in both directions.

The conversion table can be accessed from the iconv programming interfaces, if the following steps are taken:

  - Name the compiled table using the name of the non-UCS-2 code set (e.g. IBM-850).
  - Place the table in a directory called "uconvtab". The default system directory is i18n\locale\uconvtab. If another directory is used, the LOCPATH environment variable will need to be set to include the parent directory (e.g. i18n\locale).

Parameters

   UconvTable   Specifies the path name of the compiled table to be created. This should be the name of the code set that defines conversions into and out of UCS-2.

Flags

   -f SrcFile   Specifies the conversion table source file. If this flag is not used, standard input is read.
   -v           Causes output of the processed file statements.

Return Codes

   0    Successful completion.
   >0   An error occurred.

Related Information
  - iconv_open
  - iconv
  - iconv_close
  - uconvdef source file format

Example

This example creates a conversion table between IBM-850 and UCS-2:

   uconvdef -f IBM-850.ucmap IBM-850

uconvdef Source File Format

Conversion mapping values are defined using UCS-2 symbolic character names followed by character encoding (code point) values for the multibyte code set. For example,

   <U0020> \x20

represents the mapping between the <U0020> UCS-2 symbolic character name for the space character and the \x20 hexadecimal code point for the space character in ASCII.

In addition to the code set mappings, directives are interpreted by the uconvdef command to produce the compiled table. These directives must precede the code set mapping section.
They consist of the following keywords surrounded by < > (angle brackets), starting in column 1, followed by white space and the value to be assigned to the symbol:

   KEYWORD            DESCRIPTION

   <code_set_name>    The name of the coded character set, enclosed in quotation marks (" "), for which the character set description file is defined.

   <mb_cur_max>       The maximum number of bytes in a multibyte character. The default value is 1.

   <mb_cur_min>       An unsigned positive integer value that defines the minimum number of bytes in a character for the encoded character set. The value is less than or equal to <mb_cur_max>. If not specified, the minimum number is equal to <mb_cur_max>.

   <escape_char>      The escape character used to indicate that the character following is interpreted in a special way. This defaults to a backslash (\).

   <comment_char>     The character that, when placed in column 1 of a charmap line, is used to indicate that the line is ignored. The default character is the number sign (#).

   <char_name_mask>   A quoted string consisting of format specifiers for the UCS-2 symbolic names. This must be a value of AXXXX, indicating an alphabetic character followed by 4 hexadecimal digits. Also, the alphabetic character must be a U, and the hexadecimal digits must represent the UCS-2 code point for the character. An example of a symbolic character name based on this mask is <U0020>, the Unicode space character.

   <uconv_class>      Specifies the type of the code set. This type is used to direct uconvdef on what type of table to build. It is also stored in the table to indicate the type of processing algorithm in the UCS conversion methods. It must be one of the following:

                        - SBCS              Single-byte encoding
                        - DBCS              Stateless double-byte, single-byte, or mixed encodings
                        - EBCDIC_STATEFUL   Stateful double-byte, single-byte, or mixed encodings
                        - MBCS              Stateless multibyte encoding

   <locale>           Specifies the default locale name to be used if locale information is needed.

   <subchar>          Specifies the encoding of the default substitute character in the multibyte code set.
The code set mapping section consists of a sequence of mapping definition lines preceded by a CHARMAP declaration and terminated by an END CHARMAP declaration. Empty lines and lines containing <comment_char> in the first column are ignored.

Symbolic character names in mapping lines must follow the pattern specified in the <char_name_mask>, except for the reserved symbolic name, <unassigned>, that indicates the associated code points are unassigned.

Each noncomment line of the character set mapping definition must be in one of the following formats:

Format 1

  "%s %s %s\n", <symbolic-name>, <encoding>, <comments>

For example:

  <U3004> \x81\x57

This format defines a single symbolic character name and a corresponding encoding. A character following an escape character is interpreted as itself; for example, the sequence <\\\>> represents the symbolic name \> enclosed between angle brackets.

The encoding part is expressed as one or more concatenated decimal, hexadecimal, or octal constants in the following formats:

- "%cd%d", <escape_char>, <decimal byte value>
- "%cx%x", <escape_char>, <hexadecimal byte value>
- "%c%o", <escape_char>, <octal byte value>

Decimal constants are represented by two or more decimal digits preceded by the escape character and the lowercase letter d, as in \d97 or \d143. Hexadecimal constants are represented by two or more hexadecimal digits preceded by an escape character and the lowercase letter x, as in \x61 or \x8f. Octal constants are represented by two or more octal digits preceded by an escape character.

Each constant represents a single-byte value. When constants are concatenated for multibyte character values, the last value specifies the least significant octet and preceding constants specify successively more significant octets.

Format 2

  "%s...%s %s %s\n", <symbolic-name>, <symbolic-name>, <encoding>, <comments>

For example:

  <U3003>...<U3006> \x81\x56

This format defines a range of symbolic character names and corresponding encodings. The range is interpreted as a series of symbolic names formed from the alphabetic prefix and all the values in the range defined by the numeric suffixes.
The listed encoding value is assigned to the first symbolic name, and subsequent symbolic names in the range are assigned corresponding incremental values. For example, the line:

  <U3003>...<U3006> \x81\x56

is interpreted as:

  <U3003> \x81\x56
  <U3004> \x81\x57
  <U3005> \x81\x58
  <U3006> \x81\x59

Format 3

  "<unassigned> %s...%s %s\n", <encoding>, <encoding>, <comments>

This format defines a range of one or more unassigned encodings. For example, the line:

  <unassigned> \x9b...\x9c

is interpreted as:

  <unassigned> \x9b
  <unassigned> \x9c

═══ 14. Installation ═══

═══ 14.1. Overview ═══

Included in this toolkit is a program that installs the I18N files required for the runtime environment and configures the CONFIG.SYS information. It is a command line program (.EXE) which provides no user interface other than the command line parameters and return code. It is assumed that this program will be called by individual product installation programs.

═══ 14.2. Goals ═══

The goals of this install are:

- Provide consistent install behavior of the I18N code among the various products that install it.
- Reduce code redundancy caused by multiple products installing the same code in different directories on the user's system.
- Provide a syslevel file for support purposes.
- Remove knowledge of install details from other products' installation code.

═══ 14.3. Functions ═══

This program's functions include:

- Determine if this version of the I18N package should be installed by examining the I18NDIR environment variable and the syslevel file, SYSLEVEL.I18.
- Create all of the required I18N directories if they do not already exist.
- Install the I18N runtime code (setloc1.dll) located in the path specified by the caller, if necessary.
- Install the I18N locale files (*.dll) located in the path specified by the caller, if necessary.
- Install the syslevel file, SYSLEVEL.I18, if necessary.
- Update CONFIG.SYS with the I18N information, if necessary.

═══ 14.4. Installation Process ═══

If the I18NDIR environment variable is not found, then either the I18N package has not been installed on the system or the installed package is a down-level version.
Install the new version using the drive specified by the caller. If no drive letter is specified, then the boot drive will be used. The default directory, IBMI18N, will be used. Create the directory if it does not exist. Update config.sys. The following is installed in the IBMI18N directory:

- syslevel file, SYSLEVEL.I18
- \DLL\SETLOC1.DLL
- \LOCALE\all locale DLLs provided by the caller
- \LOCALE\ALIASES file

If the I18NDIR environment variable exists, the directory specified in the environment variable will be used as the target directory. Compare the existing syslevel file with the syslevel file in the install package.

- If the install package is a newer version:
  - Replace the SETLOC1.DLL.
  - Replace the SYSLEVEL.I18 file.
  - Replace existing locale DLLs.
  - Add any new locale DLLs provided by the caller.
  - Do not remove any existing locale DLLs.
  - Replace the ALIASES file.
- If the install package is an older version or the same version:
  - Do not replace the SETLOC1.DLL.
  - Do not replace the SYSLEVEL.I18 file.
  - Do not replace existing locale DLLs.
  - Add any new locale DLLs provided by the caller.
  - Do not remove any existing locale DLLs.
  - Do not replace the existing ALIASES file.

It may seem strange that we want to continue with the install if the versions are the same or if the caller is installing an older version. However, multiple products install this package and they don't all ship the same set of locale files. So, we want to make sure that we always have a "union" of all the different locale files installed by the different products.

If all of the following occur:

- the program encounters a "disk full" condition while copying the source files to the user's machine, AND
- the caller specified a target drive that is different from the drive on which the disk full error occurred, AND
- the caller specified UPDCFG=Y (this is also the default),

then this install program will attempt to move the entire I18NDIR directory to the specified target drive.
If successful, the I18NDIR environment variable in the config.sys file will be updated to reflect this and the original I18NDIR directory will be deleted. If unsuccessful, the config.sys will not be updated and a disk full error will be returned to the caller.

═══ 14.5. CONFIG.SYS Changes ═══

Environment Variable   Handling

I18NDIR
    If the I18NDIR environment variable is not found, then add it as follows:

      I18NDIR=x:\IBMI18N;

    where "x" is the drive letter specified by the calling application. If no drive letter is specified, the boot drive is used.

    If the I18NDIR environment variable is found, leave it alone unless the I18N directory has been moved due to a disk full error. In that case, update it to reflect the new location of the I18NDIR directory.

LANG
    Generally, the LANG environment variable should not require updating or setting.

    If a value for the LANG environment variable is specified by the caller, add the LANG environment variable if it does not exist. If the LANG environment variable already exists, update it to the value specified by the caller.

    If no value is specified by the caller for the LANG environment variable and either the variable is not defined in the config.sys or it is invalid in the config.sys, then a default locale will be determined. The default locale is based on the system's country code and keyboard settings. The LANG environment variable will be set to this default.

    The I18N initialization code computes a default locale based on the system's country code and code page settings, so the LANG variable is not usually necessary. It may be desirable to set LANG if the value for LANG was obtained from user input or in countries, such as Belgium, where multiple languages are frequently used on the same machine. If set, LANG should be of the form xx_xx (e.g. en_US) without a code page suffix, since the code page will be determined from the current process code page.
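The xx_xx form can be checked mechanically. The Python sketch below strips a code page suffix and tests the remaining language_territory shape. The function name strip_code_page is hypothetical, and the rule that the suffix is everything after the first period is an assumption made for illustration; it does not describe how the install program itself adjusts the value.

```python
import re

# Two letters, an underscore, two letters (e.g. en_US); case is not enforced.
_LANG_FORM = re.compile(r"^[A-Za-z]{2}_[A-Za-z]{2}$")


def strip_code_page(lang):
    """Return lang without a '.codepage' suffix, or None if the rest is not xx_xx."""
    base = lang.split(".", 1)[0]
    return base if _LANG_FORM.match(base) else None


print(strip_code_page("en_US.IBM-850"))
print(strip_code_page("fr_BE"))
print(strip_code_page("english"))
```

Returning None for an unrecognized value leaves the decision about a fallback to the caller.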
    If either the value of LANG specified by the caller or the value for LANG already in the config.sys file is not of the format xx_xx, it will be modified to meet this format. This is done to ensure compatibility with VisualAge C++.

NLSPATH
    No changes will be made to the config.sys file regarding the NLSPATH environment variable. This variable is used to locate message files, and this install program does NOT install any message files. The NLSPATH environment variable contains a LIST of paths. Therefore, when updating it, products should add to the list, not replace the list.

LOCPATH
    If the LOCPATH environment variable exists, the path to the I18N code will be added to the beginning of the LOCPATH list of paths unless it is already there. If the LOCPATH environment variable does not exist, add it.

LIBPATH
    Add the I18N code path to the beginning of LIBPATH unless it is already there.

This program will NOT back up the config.sys file. It assumes that the caller will handle this.

═══ 14.6. Syntax ═══

  i18ninst sourcepath DRIVE=x UPDCFG=x LANG=xx_xx ERASE=x CFGDRIVE=x

═══ 14.7. Parameters ═══

sourcepath
    Path containing code to be installed. This parameter is REQUIRED and must be the first parameter specified. If the path does not contain a drive letter, the current drive is used. This program expects to find the following UNPACKED files and directories at the specified path:

    - SYSLEVEL.I18 file
    - \DLL\SETLOC1.DLL
    - \LOCALE\xxxxxxxx.DLL for each locale file
    - \LOCALE\ALIASES

DRIVE=x
    Drive where code should be installed. If a version of the I18N code equal to or newer than this version is already installed, this program will install using the same drive as the already existing code and this parameter will be ignored. If the I18N code is not installed or an older version is already installed, this program will install the code on the specified drive in the directory IBMI18N. This parameter is optional.
    If this parameter is not specified, the OS/2 install drive is used.

UPDCFG=x
    This parameter specifies whether or not the config.sys file should be updated. UPDCFG=Y indicates that it should be updated. UPDCFG=N indicates that it should not. This parameter is optional. If it is not specified, the default is UPDCFG=Y.

LANG=xx_xx
    This parameter specifies the value to which the LANG environment variable is set. The config.sys file is updated with this value if UPDCFG=Y was specified. This parameter is optional. If it is not specified, the config.sys file will be unchanged with regard to the LANG environment variable.

ERASE=x
    This parameter specifies whether or not this install program should erase the caller's source files and directory when the installation successfully completes. ERASE=Y indicates that the files and directory should be erased. ERASE=N indicates they should not. This parameter is optional. If it is not specified, the default is ERASE=N.

CFGDRIVE=x
    This parameter specifies the drive where the config.sys file is located. This parameter is optional. If it is not specified, the boot drive is assumed. This parameter is ignored if UPDCFG=N is specified.

═══ 14.8. Prerequisites ═══

This program assumes that all source files are located in the following directory structure:

- SYSLEVEL.I18
- \DLL\SETLOC1.DLL
- \LOCALE\locale DLL files
- \LOCALE\ALIASES

This program assumes that ONLY the I18N files are in this directory. Because different products install different locale files and additional locale files may still be added, this install program has no knowledge of the exact number of files to expect or what the file names may be.

This program also assumes that the config.sys file has already been backed up.
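A calling product could verify the expected layout before invoking the install program. The following Python sketch checks a source path for the items listed above. The function name missing_items is hypothetical, and requiring at least one locale DLL is an assumption, since the text does not say how many locale files must be present.

```python
from pathlib import Path
import tempfile

# Fixed items from the prerequisites list; forward slashes keep the
# sketch portable, the real layout uses OS/2 backslash paths.
REQUIRED_FILES = ("SYSLEVEL.I18", "DLL/SETLOC1.DLL", "LOCALE/ALIASES")


def missing_items(sourcepath):
    """Return the expected items that are absent under sourcepath (sketch)."""
    root = Path(sourcepath)
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    locale_dir = root / "LOCALE"
    # Locale DLL names vary by product, so only require that one exists.
    has_locale_dll = locale_dir.is_dir() and any(
        p.suffix.upper() == ".DLL" for p in locale_dir.iterdir()
    )
    if not has_locale_dll:
        missing.append("LOCALE/*.DLL")
    return missing


# Build a throwaway layout to exercise the check.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "DLL").mkdir()
    (root / "LOCALE").mkdir()
    (root / "SYSLEVEL.I18").touch()
    (root / "DLL" / "SETLOC1.DLL").touch()
    (root / "LOCALE" / "ALIASES").touch()
    incomplete = missing_items(root)
    (root / "LOCALE" / "EN_US.DLL").touch()  # hypothetical locale file name
    complete = missing_items(root)

print(incomplete)
print(complete)
```

The check does not enforce that ONLY I18N files are present; that part of the prerequisite would need a list of known file names, which the install program deliberately does not have.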