Locales encapsulate some of the language/culture specific things that you shouldn't hard code in your programs.
If you have various locales installed on your computer, you can select how a locale sensitive program will behave via the following environment variables. The default locale is the C (or POSIX) locale, which is hard coded in libc.
LANG: Sets the locale, but can be overridden by any of the LC_xxxx environment variables.
LC_COLLATE: Sort order.
LC_CTYPE: Character definitions (uppercase, lowercase, ...). These are used by functions like toupper, tolower, islower, isdigit, ...
LC_MONETARY: Contains the information necessary to format money in the fashion expected. It defines things like the thousands separator, the decimal separator, and what the monetary symbol is and how to position it.
LC_NUMERIC: Thousands and decimal separators, and the numeric grouping expected.
LC_TIME: How to format the time and date. This includes things like the days of the week and the months of the year, in abbreviated and non-abbreviated form.
LC_MESSAGES: Yes and No expressions.
LC_ALL: Sets the locale, and overrides LANG and all of the LC_xxxx environment variables.
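How a program picks up these variables can be sketched in C as follows. This is only a minimal sketch under stated assumptions: the helper names (use_environment_locale, locale_name_for, decimal_separator, compare_for_sorting) are made up for illustration, while setlocale, localeconv, and strcoll are the standard C library calls. A program starts in the C locale and consults the environment (LC_ALL first, then the specific LC_xxxx variable, then LANG) only when it calls setlocale with an empty string.

```c
#include <locale.h>
#include <string.h>

/* Hypothetical helper: adopt whatever locale the environment selects.
 * Returns 1 on success, 0 if the requested locale could not be set
 * (for example because it is not installed). */
int use_environment_locale(void)
{
    return setlocale(LC_ALL, "") != NULL;
}

/* Hypothetical helper: name of the locale in effect for one category,
 * such as LC_COLLATE or LC_TIME. Passing NULL queries the current
 * setting without changing it. */
const char *locale_name_for(int category)
{
    return setlocale(category, NULL);
}

/* Hypothetical helper: the LC_NUMERIC decimal separator,
 * which is "." in the C locale. */
const char *decimal_separator(void)
{
    return localeconv()->decimal_point;
}

/* Hypothetical helper: compare two strings according to LC_COLLATE
 * rather than by raw byte value. Negative, zero, or positive result,
 * like strcmp. */
int compare_for_sorting(const char *a, const char *b)
{
    return strcoll(a, b);
}
```

A program would typically call use_environment_locale() once at startup, before any locale sensitive formatting or comparison takes place.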
Here are some other locales, and there are lots more.
en_CA: Canadian English.
en_US: US English.
de_DE: Germany's German.
fr_FR: France's French.
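Because a named locale may not be installed on every machine, a program that asks for one explicitly should check the result. The following is a minimal sketch, assuming a hypothetical helper named select_first_available; only setlocale itself is a standard call, and it returns NULL when the request cannot be honored.

```c
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper: try each candidate locale name in order for the
 * given category and return the first one that setlocale() accepts,
 * or NULL if none of them could be set. */
const char *select_first_available(int category,
                                   const char *const names[],
                                   size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (setlocale(category, names[i]) != NULL)
            return names[i];
    }
    return NULL;
}
```

A caller might pass a list such as "de_DE.UTF-8", "de_DE", "C", so that the program falls back to the C locale when no German locale is present. Whether a codeset suffix like ".UTF-8" is needed depends on how the locales were installed on the system.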
If you are writing a program and want it to be usable internationally, you should use locales. The most glaring reason for this is that not everybody is going to use the same character set/code page as you.
Make sure in your programs that you don't do things like:
    /* check for alphabetic characters */
    if ( (( c >= 'a') && ( c <= 'z' )) ||
         (( c >= 'A') && ( c <= 'Z' )) ) { ... }
If you write that type of code, your program assumes that the user/file/... is ASCII and nothing but ASCII, and it does not respect the code page definitions of the user's locale. For example, it precludes characters such as a-umlaut, which would be used in a German environment. What you should do instead is use the locale sensitive functions like isalpha(). If your program does explicitly require only US-ASCII alphabetics, you should still use the isalpha() function, but you must also either call
    setlocale(LC_CTYPE,"C")
or set the LANG, LC_CTYPE, or LC_ALL environment variables to "C".
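The locale sensitive replacement for the range check can be sketched as below. This is an illustrative sketch, not a definitive implementation: count_alphabetic is a hypothetical helper name, while isalpha and setlocale are the standard calls. Note that isalpha() classifies one byte at a time, so it covers single-byte code pages; text in a multibyte encoding would need the wide-character functions such as iswalpha() instead.

```c
#include <ctype.h>
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper: count the bytes in s that the current
 * LC_CTYPE locale classifies as alphabetic. */
size_t count_alphabetic(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++) {
        /* Cast to unsigned char: passing a negative char value
         * to isalpha() is undefined behavior. */
        if (isalpha((unsigned char)*s))
            n++;
    }
    return n;
}
```

After setlocale(LC_CTYPE, "C"), only the 26 unaccented letters in each case count as alphabetic, so a byte such as 0xE4 (a-umlaut in ISO 8859-1) is rejected. Under a German single-byte locale the same function would accept it, with no change to the code.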
Locales allow a large degree of flexibility, and they invalidate certain assumptions that a programmer may have made in ASCII based C programs.
For instance, you cannot assume the code positions of characters. There is nothing stopping you from creating a charmap file that defines the code position of 'A' to be 0xC1 rather than 0x41. In fact, 0xC1 is the code point for 'A' in IBM code page 37, used on mainframes, while 0x41 is the one used in US-ASCII, ISO 8859-x, and others.
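One common place this assumption hides is arithmetic such as c - 'A', which relies on the letters occupying 26 consecutive code positions. That holds in ASCII but not in EBCDIC code pages, where the alphabet is split into several ranges. A minimal sketch of a lookup that makes no such assumption follows; upper_letter_index is a hypothetical helper name, and strchr is the standard call.

```c
#include <string.h>

/* Hypothetical helper: return 0..25 for 'A'..'Z', or -1 for any other
 * character, without assuming that the letters are contiguous or
 * start at any particular code position. */
int upper_letter_index(char c)
{
    static const char letters[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";

    /* strchr() would otherwise match the string's terminating NUL. */
    if (c == '\0')
        return -1;

    const char *p = strchr(letters, c);
    return p != NULL ? (int)(p - letters) : -1;
}
```

The character literals are translated by the compiler into whatever code positions the execution character set uses, so the same source gives the same answers on an ASCII or an EBCDIC system.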
The basic idea is that different people speak different languages, expect different sorting orders, use different code pages, and live in different countries. Locales and locale sensitive programs give you a means to respect such things and handle them accordingly. It is not really much extra work to do so; it just requires a slightly different frame of mind when writing programs.