11. Text Encoding (codepages)

11.1. Common scenarios

  1. Comparing English files encoded in ASCII

    WinMerge.exe handles these files well. You may also use WinMergeU.exe (if you are on Windows NT/2000/XP/2003), but it will be slower due to encoding conversion.

  2. Comparing Western European files encoded in the common Windows codepage

    (See FAQ: How do I tell what encoding my file uses?) WinMerge.exe handles these files well. You may also use WinMergeU.exe (if you are on Windows NT/2000/XP/2003), but it will be slower due to encoding conversion.

  3. Comparing Unicode files

    You should use WinMergeU.exe (if you are on Windows NT/2000/XP/2003): in this scenario it is at least as fast as WinMerge.exe and may be faster. WinMergeU.exe also handles UTF-8 files correctly, which WinMerge.exe does not. If you are on Windows 95/98/ME, or if your files are all in English or a Western European language, you may get by with WinMerge.exe.

  4. Comparing multilingual files

    You should use WinMergeU.exe (if you are on Windows NT/2000/XP/2003) so that characters from different languages can be displayed at the same time. If you are on Windows 95/98/ME, you can try viewing the files in WinMerge.exe, but merging them is not recommended.

  5. Comparing East Asian files

    You should use WinMergeU.exe (if you are on Windows NT/2000/XP/2003) so that double-wide characters are displayed correctly. See the Font section for font recommendations. If you are on Windows 95/98/ME, you can try viewing the files in WinMerge.exe, but merging them is not recommended.
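
To decide which of the scenarios above applies to a file, it can help to look at its first bytes. The following Python sketch is not part of WinMerge and does not describe how WinMerge itself detects encodings; it is only a rough illustration. The function name guess_encoding is hypothetical. It checks for a Unicode byte order mark (BOM) and otherwise tests whether every byte is plain ASCII. A file with no BOM and with bytes above 127 could be in a Windows codepage or in UTF-8 without a BOM, and this sketch does not try to tell those apart.

```python
import codecs

# Longer BOMs are listed first: the UTF-32 LE mark begins with the
# same two bytes as the UTF-16 LE mark.
_BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32 LE"),
    (codecs.BOM_UTF32_BE, "UTF-32 BE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16 LE"),
    (codecs.BOM_UTF16_BE, "UTF-16 BE"),
]


def guess_encoding(data: bytes) -> str:
    """Hypothetical helper: make a rough guess from the raw bytes of a file."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    # No BOM: if every byte is below 128, the content is plain ASCII.
    if all(b < 0x80 for b in data):
        return "ASCII"
    # Otherwise it may be a codepage or BOM-less UTF-8; not decided here.
    return "unknown"


if __name__ == "__main__":
    import sys

    # Usage: python guess_encoding.py somefile.txt
    with open(sys.argv[1], "rb") as f:
        print(guess_encoding(f.read(4096)))
```

A result of "ASCII" corresponds to scenario 1, and any of the UTF results to scenario 3. For "unknown", see the FAQ entry on determining a file's encoding.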

11.2. Explanation

Computers represent characters (such as "a" or "1" or "&") as numbers, and there is more than one way to do this. A text encoding is a method of mapping characters to the numbers (bytes) that make up a file. There are four main text encoding families: