![]() |
ConvertFromTextToUnicode |
||||
Header: | UnicodeConverter.h | Carbon status: | Supported | |
Converts a string from any encoding to Unicode.
OSStatus ConvertFromTextToUnicode ( TextToUnicodeInfo iTextToUnicodeInfo, ByteCount iSourceLen, ConstLogicalAddress iSourceStr, OptionBits iControlFlags, ItemCount iOffsetCount, ByteOffset iOffsetArray[], ItemCount *oOffsetCount, ByteOffset oOffsetArray[], ByteCount iOutputBufLen, ByteCount *oSourceRead, ByteCount *oUnicodeLen, UniCharArrayPtr oUnicodeStr );
A Unicode converter object of type TextToUnicodeInfo containing mapping and state information used for the conversion. The contents of this Unicode converter object are modified by the function. Your application obtains a Unicode converter object using the function CreateTextToUnicodeInfo.
The length in bytes of the source string to be converted.
The address of the source string to be converted.
Conversion control flags. You can use
The number of offsets in the iOffsetArray parameter. Your application supplies this value. The number of entries in iOffsetArray must be fewer than the number of bytes specified in iSourceLen. If you dont want offsets returned to you, specify 0 (zero) for this parameter.
An array of type ByteOffset. On input, you specify the array that contains an ordered list of significant byte offsets pertaining to the source string. These offsets may identify font or style changes, for example, in the source string. All array entries must be less than the length in bytes specified by the iSourceLen parameter. If you dont want offsets returned to your application, specify NULL for this parameter and 0 (zero) for iOffsetCount.
On return, a pointer to the number of offsets that were mapped in the output stream.
An array of type ByteOffset. On return, this array contains the corresponding new offsets for the Unicode string produced by the converter.
The length in bytes of the output buffer pointed to by the oUnicodeStr parameter. Your application supplies this buffer to hold the returned converted string. The oUnicodeLen parameter may return a byte count that is less than this value if the converted byte string is smaller than the buffer size you allocated. The relationship between the size of the source string and the Unicode string is complex and depends on the source encoding and the contents of the string.
On return, a pointer to the number of bytes of the source string that were converted. If the function returns a kTECUnmappableElementErr result code, this parameter returns the number of bytes that were converted before the error occurred.
On return, a pointer to the length in bytes of the converted stream.
A pointer to an array used to hold a Unicode string. On input, this value points to the beginning of the array for the converted string. On return, this buffer holds the converted Unicode string. (For guidelines on estimating the size of the buffer needed, see the discussion.)
A result code. The function returns a noErr result code if it has completely converted the input string to Unicode without using fallback characters.
You specify the source strings encoding in the Unicode mapping structure that you pass to the function CreateTextToUnicodeInfo to obtain a Unicode converter object for the conversion. You pass the Unicode converter object returned by CreateTextToUnicodeInfo to ConvertFromTextToUnicode as the iTextToUnicodeInfo parameter.
In addition to converting a text string in any encoding to Unicode, the ConvertFromTextToUnicode function can map offsets for style or font information from the source text string to the returned converted string. The converter reads the application-supplied offsets, which apply to the source string, and returns the corresponding new offsets in the converted string. If you do not want the offsets at which font or style information occurs mapped to the resulting string, you should pass NULL for iOffsetArray and 0 (zero) for iOffsetCount.
Your application must allocate a buffer to hold the resulting converted string and pass a pointer to the buffer in the oUnicodeStr parameter. To determine the size of the output buffer to allocate, you should consider the size of the source string, its encoding type, and its content in relation to the resulting Unicode string.
For example, for 1-byte encodings, such as MacRoman, the Unicode string will be at least double the size (more if it uses noncomposed Unicode); for MacArabic and MacHebrew, the corresponding Unicode string could be up to six times as big. For most 2-byte encodings, for example Shift-JIS, the Unicode string will be less than double the size. For international robustness, your application should allocate a buffer three to four times larger than the source string. If the output Unicode text is actually UTF-8which could occur beginning with the current release of the Text Encoding Conversion Manager, version 1.2.1the UTF-8 buffer pointer must be cast to UniCharArrayPtr before it can be passed as the oUnicodeStr parameter. Also, the output buffer length will have a wider range of variation than for UTF-16; for ASCII input, the output will be the same size; for Han input, the output will be twice as big, and so on.
Supported in Carbon. Available in Carbon 1.0.2 and later when running Mac OS 8.1 or later.
© 2000 Apple Computer, Inc. (Last Updated 7/17/2000)