TECSniffTextEncoding
Header: TextEncodingConverter.h
Carbon status: Supported
Analyzes a text stream and returns the probable encodings in a ranked list, based on an array of possible encodings you supply. It also returns the number of errors and features for each encoding.
OSStatus TECSniffTextEncoding (
    TECSnifferObjectRef encodingSniffer,
    TextPtr inputBuffer,
    ByteCount inputBufferLength,
    TextEncoding testEncodings[],
    ItemCount numTextEncodings,
    ItemCount numErrsArray[],
    ItemCount maxErrs,
    ItemCount numFeaturesArray[],
    ItemCount maxFeatures
);
encodingSniffer: A pointer to a sniffer object.
inputBuffer: The text to be sniffed.
inputBufferLength: The length of the input buffer.
testEncodings: An array of text encoding specifications. You must fill the array with the text encodings for which you want to sniff. On output, the array elements are reordered from the most likely to the least likely text encoding.
numTextEncodings: The number of entries in the testEncodings[] array.
numErrsArray: An array that must contain at least numTextEncodings elements. On return, it holds the number of errors found for each possible text encoding, in the same order as the testEncodings[] array elements on output.
maxErrs: The maximum number of errors a sniffer can encounter. The sniffer stops looking for an encoding after this number is reached.
numFeaturesArray: An array that must contain at least numTextEncodings elements. On return, it holds the number of features found for each possible text encoding, in the same order as the testEncodings[] array elements on output.
maxFeatures: The maximum number of features a sniffer can encounter. The sniffer stops looking for features after this number is reached.
function result: A result code.
An error indicates a code point or sequence that is illegal in the specified encoding. A feature indicates the presence of a sequence that is characteristic of that encoding.
For example, the byte sequence which is interpreted in Mac OS Roman as äøéö could legally be interpreted either as Mac OS Roman text or as Mac OS Japanese text. Both sniffers would return zero errors, but the Mac OS Japanese sniffer would also return two features of Mac OS Japanese (representing two legal 2-byte characters).
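To make the distinction between errors and features concrete, here is a minimal C sketch of how a sniffer for a two-byte encoding might count them. It is not Apple's implementation. The SniffCounts type, the function names, and the lead-byte and trail-byte ranges (generic Shift-JIS style ranges, not the exact Mac OS Japanese tables) are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type; not part of TextEncodingConverter.h. */
typedef struct {
    unsigned long errors;    /* byte sequences illegal in the encoding */
    unsigned long features;  /* sequences characteristic of the encoding */
} SniffCounts;

/* Assumed lead-byte ranges for a Shift-JIS style encoding. */
static int is_lead_byte(unsigned char b)
{
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

/* Assumed trail-byte ranges for a Shift-JIS style encoding. */
static int is_trail_byte(unsigned char b)
{
    return (b >= 0x40 && b <= 0x7E) || (b >= 0x80 && b <= 0xFC);
}

/* Counts each legal 2-byte character as a feature and each lead byte
   without a legal trail byte as an error. Other bytes are treated as
   single-byte characters and count as neither. */
SniffCounts count_two_byte_features(const unsigned char *text, size_t length)
{
    SniffCounts counts = {0, 0};
    size_t i = 0;

    assert(text != NULL || length == 0);

    while (i < length) {
        if (is_lead_byte(text[i])) {
            if (i + 1 < length && is_trail_byte(text[i + 1])) {
                counts.features++;  /* legal 2-byte character */
                i += 2;
            } else {
                counts.errors++;    /* dangling or invalid lead byte */
                i += 1;
            }
        } else {
            i += 1;                 /* single byte: no error, no feature */
        }
    }
    return counts;
}
```

Under these assumptions, the four bytes 0x8A 0xBF 0x8E 0x9A (äøéö in Mac OS Roman) produce zero errors and two features, which matches the example above.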
The arrays are returned in a ranked list with the most likely text encodings first. The results are sorted first by number of errors (fewest to most), then by number of features (most to fewest), and then by the original order in the list. On return, the most likely encoding is in testEncodings[0].
If an encoding is not examined, its error count and feature count are both set to 0xFFFFFFFF, and the encoding is sorted to the end of the list.
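The ranking rule can be sketched in portable C as a stable sort over (errors, features) pairs. This is an illustration of the documented ordering only, not the system's code. The SniffResult type and both function names are hypothetical, and the encoding values used with it are arbitrary labels.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record pairing an encoding with its sniff counts. */
typedef struct {
    unsigned long encoding;  /* arbitrary label for the encoding */
    unsigned long errors;
    unsigned long features;
} SniffResult;

/* Nonzero if a must rank strictly before b: fewer errors first,
   then more features. Equal pairs are left for the caller to keep
   in their original order. */
static int ranks_before(const SniffResult *a, const SniffResult *b)
{
    if (a->errors != b->errors) {
        return a->errors < b->errors;
    }
    return a->features > b->features;
}

/* Stable insertion sort: an element moves ahead of an earlier one only
   when it ranks strictly before it, so ties keep their original order. */
void rank_sniff_results(SniffResult *results, size_t count)
{
    size_t i;

    assert(results != NULL || count == 0);

    for (i = 1; i < count; i++) {
        SniffResult current = results[i];
        size_t j = i;
        while (j > 0 && ranks_before(&current, &results[j - 1])) {
            results[j] = results[j - 1];
            j--;
        }
        results[j] = current;
    }
}
```

An entry whose counts are both 0xFFFFFFFF has the largest possible error count, so the same comparison places it last without special handling.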
Supported in Carbon. Available in Carbon 1.0.2 and later when running Mac OS 8.1 or later.
© 2000 Apple Computer, Inc. (Last Updated 7/17/2000)