The Encoding class represents a character encoding. Methods are provided to convert arrays and strings of Unicode characters to and from arrays of bytes.
Object
Encoding
[Visual Basic] MustInherit Public Class Encoding
[C#] public abstract class Encoding
[C++] public __gc __abstract class Encoding
[JScript] public abstract class Encoding
A number of Encoding implementations are provided in the TBD namespace, including:
The ASCIIEncoding class encodes Unicode characters as single 7-bit ASCII characters. This encoding supports only character values between 0x00 and 0x7F.
The CodePageEncoding class encapsulates a Windows code page. Any installed code page can be accessed through this encoding, and conversions are performed using the WideCharToMultiByte and MultiByteToWideChar Windows API functions.
The UnicodeEncoding class encodes each Unicode character as two consecutive bytes. Both little endian (code page 1200) and big endian (code page 1201) encodings are supported.
The UTF7Encoding class encodes Unicode characters using the UTF-7 encoding (UTF-7 stands for UCS Transformation Format, 7-bit form). This encoding supports all Unicode character values, and can also be accessed as code page 65000.
The UTF8Encoding class encodes Unicode characters using the UTF-8 encoding (UTF-8 stands for UCS Transformation Format, 8-bit form). This encoding supports all Unicode character values, and can also be accessed as code page 65001.
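As a rough illustration of how these encodings differ, the following sketch encodes the same two-character string with several of the classes listed above and prints the resulting bytes. The class name EncodingSizes and the helper Encode are hypothetical names chosen for this example, and CodePageEncoding and UTF7Encoding are left out for brevity.

```csharp
using System;
using System.Text;

public static class EncodingSizes
{
    // Hypothetical helper: returns the bytes the given encoding produces for the text.
    public static byte[] Encode(Encoding encoding, string text)
    {
        return encoding.GetBytes(text);
    }

    public static void Main()
    {
        // 'A' (U+0041) followed by U+00E9, which lies outside the 7-bit ASCII range.
        string text = "A\u00E9";

        Encoding[] encodings =
        {
            new ASCIIEncoding(),
            new UnicodeEncoding(),            // little endian
            new UnicodeEncoding(true, false), // big endian, no byte order mark
            new UTF8Encoding()
        };

        foreach (Encoding encoding in encodings)
        {
            byte[] bytes = Encode(encoding, text);
            // BitConverter.ToString renders the bytes as hyphen-separated hex pairs.
            Console.WriteLine("{0}: {1} byte(s): {2}",
                encoding.GetType().Name, bytes.Length, BitConverter.ToString(bytes));
        }
    }
}
```

The two UnicodeEncoding instances should produce the same byte values in opposite order within each pair, while UTF8Encoding uses one byte for 'A' and two for U+00E9.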
When the data to be converted is only available in sequential blocks (such as data read from a stream), an application may choose to use a Decoder or an Encoder to perform the conversion. This is also useful when the amount of data is so large that it needs to be divided into smaller blocks. Decoders and encoders are obtained using the GetDecoder and GetEncoder methods. An application can use the properties of this class, such as CodePage, ASCII, Default, Unicode, UTF7, and UTF8, to obtain encodings. Applications can instantiate Encoding objects through the ASCIIEncoding, CodePageEncoding, UnicodeEncoding, UTF7Encoding, and UTF8Encoding classes.
Through an encoding, the GetBytes method is used to convert arrays of characters to arrays of bytes, and the GetChars method is used to convert arrays of bytes to arrays of characters. The GetBytes and GetChars methods maintain no state between conversions, and are intended for conversions of complete blocks of bytes and characters in one operation. When the data to be converted is only available in sequential blocks (such as data read from a stream) or when the amount of data is so large that it needs to be divided into smaller blocks, an application may choose to use a Decoder or an Encoder to perform the conversion. Decoders and encoders allow sequential blocks of data to be converted and they maintain the state required to support conversions of data that spans adjacent blocks. Decoders and encoders are obtained using the GetDecoder and GetEncoder methods.
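The role of decoder state can be sketched with a UTF-8 byte sequence that is split across two blocks. This is a minimal example rather than a recommended pattern; the class name BlockDecoding and the helper DecodeBlocks are hypothetical, and the block boundaries are chosen by hand so that a two-byte sequence straddles them.

```csharp
using System;
using System.Text;

public static class BlockDecoding
{
    // Hypothetical helper: decodes consecutive byte blocks with one Decoder,
    // which carries any incomplete trailing byte sequence over to the next block.
    public static string DecodeBlocks(Encoding encoding, byte[][] blocks)
    {
        Decoder decoder = encoding.GetDecoder();
        StringBuilder result = new StringBuilder();

        foreach (byte[] block in blocks)
        {
            // Ask how many characters this block yields, taking into account
            // the state left over from earlier blocks.
            int charCount = decoder.GetCharCount(block, 0, block.Length);
            char[] chars = new char[charCount];
            int written = decoder.GetChars(block, 0, block.Length, chars, 0);
            result.Append(chars, 0, written);
        }

        return result.ToString();
    }

    public static void Main()
    {
        // 'A', U+00E9, 'B' in UTF-8 is 41 C3 A9 42. The two-byte sequence
        // C3 A9 is deliberately split across the two blocks.
        byte[][] blocks =
        {
            new byte[] { 0x41, 0xC3 },
            new byte[] { 0xA9, 0x42 }
        };

        string text = DecodeBlocks(Encoding.UTF8, blocks);
        Console.WriteLine(text == "A\u00E9B"); // prints "True"
    }
}
```

Calling Encoding.UTF8.GetChars on each block separately would not give this result, because the stateless method has no way to join the byte at the end of the first block with the byte at the start of the second.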
The core GetBytes and GetChars methods require the caller to provide the destination buffer and ensure that the buffer is large enough to hold the entire result of the conversion. When using these methods, either directly on an Encoding object or on an associated Decoder or Encoder, an application can use one of two approaches to allocate destination buffers.
1. The GetByteCount and GetCharCount methods can be used to compute the exact size of the result of a particular conversion, and an appropriately sized buffer for that conversion can then be allocated.
2. The GetMaxByteCount and GetMaxCharCount methods can be used to compute the maximum possible size of a conversion of a given number of bytes or characters, and a buffer of that size can then be reused for multiple conversions.
The first approach generally uses less memory, whereas the second approach generally executes faster.
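The two allocation strategies might look like the following sketch. The class name BufferSizing, the helpers EncodeExact and EncodeInto, and the sample inputs are hypothetical; the shared buffer is sized for the longest input in this particular example.

```csharp
using System;
using System.Text;

public static class BufferSizing
{
    // Strategy 1: compute the exact size and allocate a new buffer per conversion.
    public static byte[] EncodeExact(Encoding encoding, char[] chars)
    {
        int byteCount = encoding.GetByteCount(chars, 0, chars.Length);
        byte[] bytes = new byte[byteCount];
        encoding.GetBytes(chars, 0, chars.Length, bytes, 0);
        return bytes;
    }

    // Strategy 2: encode into a caller-supplied buffer that was sized once
    // with GetMaxByteCount. Returns the number of bytes actually written.
    public static int EncodeInto(Encoding encoding, char[] chars, byte[] buffer)
    {
        return encoding.GetBytes(chars, 0, chars.Length, buffer, 0);
    }

    public static void Main()
    {
        Encoding utf8 = Encoding.UTF8;
        string[] inputs = { "abc", "A\u00E9B", "" };

        // One worst-case buffer for inputs of up to three characters,
        // reused for every conversion in the loop.
        int maxChars = 3;
        byte[] shared = new byte[utf8.GetMaxByteCount(maxChars)];

        foreach (string input in inputs)
        {
            char[] chars = input.ToCharArray();
            byte[] exact = EncodeExact(utf8, chars);
            int written = EncodeInto(utf8, chars, shared);
            Console.WriteLine("{0} char(s): exact buffer {1} byte(s), shared buffer used {2} of {3}",
                chars.Length, exact.Length, written, shared.Length);
        }
    }
}
```

EncodeExact examines the input twice, once to count and once to convert, but never holds more memory than the result needs. EncodeInto converts in a single pass at the cost of a buffer that is usually larger than any individual result.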
Namespace: System.Text
Assembly: mscorlib.dll