The only sub-ranges of characters you need to worry about are the Roman characters. Below is some code that illustrates how to do the conversion.
// returns true if the character needed conversion, or false if it was a
// single byte character (meaning that only the first byte was processed)
// (i.e. a false return means the character was a Roman character)
boolean MacToGB2312(unsigned char first, unsigned char second, unsigned short *output)
{
    if (first < 0x81) {
        *output = first;
        return false;
    } else {
        unsigned short temp;

        temp = (first - 0x80) << 8;
        temp += (second - 0x80);
        *output = temp;
        return true;
    }
}

// this will always convert, so we don't need to get the bytes separately
// nor do we need to return a boolean saying whether we converted
void GB2312ToMac(unsigned short input, unsigned short *output)
{
    *output = input + 0x8080;
}

As you can see from the code, you need to shift both bytes of a two-byte character by 0x80. This is done so that it is obvious whether a given byte is part of a two-byte character or is a single-byte Roman character.
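To make the byte arithmetic concrete, here is a minimal, self-contained sketch of the same two conversions in standard C, plus a round-trip check. It is an illustration rather than the code above: the names mac_to_gb2312, gb2312_to_mac, is_gb2312_7bit and round_trips are hypothetical, the fixed-width types come from <stdint.h> instead of the Mac OS headers, and it assumes that a valid 7-bit GB2312 code has both bytes in the range 0x21 to 0x7E. The single-byte test (first < 0x81) is taken unchanged from the code above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if both bytes of a 7-bit GB2312 code
   lie in the assumed valid range 0x21..0x7E. */
static bool is_gb2312_7bit(uint16_t code)
{
    uint8_t hi = (uint8_t)(code >> 8);
    uint8_t lo = (uint8_t)(code & 0xFF);
    return hi >= 0x21 && hi <= 0x7E && lo >= 0x21 && lo <= 0x7E;
}

/* Mac two-byte character -> 7-bit GB2312: subtract 0x80 from each byte.
   Returns false (and copies the byte through) for a single-byte
   character, using the same first < 0x81 test as the code above. */
static bool mac_to_gb2312(uint8_t first, uint8_t second, uint16_t *output)
{
    if (first < 0x81) {
        *output = first;
        return false;
    }
    *output = (uint16_t)(((first - 0x80) << 8) + (second - 0x80));
    return true;
}

/* 7-bit GB2312 -> Mac: adding 0x8080 adds 0x80 to each byte at once.
   The assert documents the assumption that the input is a valid code. */
static uint16_t gb2312_to_mac(uint16_t code)
{
    assert(is_gb2312_7bit(code));
    return (uint16_t)(code + 0x8080);
}

/* Hypothetical check: converting to Mac and back yields the same code. */
static bool round_trips(uint16_t gb)
{
    uint16_t mac = gb2312_to_mac(gb);
    uint16_t back = 0;
    bool converted = mac_to_gb2312((uint8_t)(mac >> 8),
                                   (uint8_t)(mac & 0xFF), &back);
    return converted && back == gb;
}
```

For example, the GB2312 code 0x3021 becomes the Mac byte pair 0xB0 0xA1, and passing those two bytes back through mac_to_gb2312 returns true with 0x3021 in the output. A Roman character such as 'A' (0x41) returns false and is copied through unchanged.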