UTF-8 SAMPLER

  ¥ · £ · € · $ · ¢ · ₡ · ₢ · ₣ · ₤ · ₥ · ₦ · ₧ · ₨ · ₩ · ₪ · ₫ · ₭ · ₮ · ₯

Frank da Cruz
The Kermit Project - Columbia University
New York City
fdc@columbia.edu

Last update: Thu Dec 20 14:43:18 2007


[ PEACE ] [ Poetry ] [ I Can Eat Glass ] [ The Quick Brown Fox ] [ HTML Features ] [ Credits, Tools, Commentary ]

UTF-8 is an ASCII-preserving encoding method for Unicode (ISO 10646), the Universal Character Set (UCS). The UCS encodes most of the world's writing systems in a single character set, allowing you to mix languages and scripts within a document without needing any tricks for switching character sets. This web page is encoded directly in UTF-8.
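
As a rough illustration of what "ASCII-preserving" means, the following Python sketch prints the UTF-8 bytes of a few characters. The helper name `utf8_hex` and the choice of sample characters are assumptions made for this example only; they are not part of the original page.

```python
# ASCII characters keep their single-byte values in UTF-8; every other
# character becomes a sequence of two to four bytes, all with the high bit set.

def utf8_hex(ch: str) -> str:
    """Return the UTF-8 encoding of ch as space-separated hex bytes."""
    return " ".join(f"{b:02X}" for b in ch.encode("utf-8"))

# Hypothetical samples: Latin A, e-acute, euro sign, and a Gothic letter.
samples = ["A", "\u00e9", "\u20ac", "\U0001033c"]

for ch in samples:
    print(f"U+{ord(ch):04X} -> {utf8_hex(ch)}")

# A pure-ASCII string is byte-for-byte identical in ASCII and in UTF-8.
assert "Kermit 95".encode("ascii") == "Kermit 95".encode("utf-8")
```

The one-byte result for "A" is why plain ASCII text needs no conversion, while the multi-byte results are what a non-UTF-8 viewer shows as runs of accented letters or box-drawing characters.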

As shown HERE, Columbia University's Kermit 95 terminal emulation software can display UTF-8 plain text in Windows 95, 98, ME, NT, XP, or 2000 when using a monospace Unicode font like Andale Mono WT J or Everson Mono Terminal, or the lesser populated Courier New, Lucida Console, or Andale Mono. C-Kermit can handle it too, if you have a Unicode display. As many languages as are representable in your font can be seen on the screen at the same time.

This, however, is a Web page. Some Web browsers can handle UTF-8, some can't. And those that can might not have a sufficiently populated font to work with (some browsers might pick glyphs dynamically from multiple fonts; Netscape 6 seems to do this). CLICK HERE for a survey of Unicode fonts for Windows.

The subtitle above shows currency symbols of many lands. If they don't appear as blobs, we're off to a good start!


Poetry

From the Anglo-Saxon Rune Poem (Rune version):

ᚠᛇᚻ᛫ᛒᛦᚦ᛫ᚠᚱᚩᚠᚢᚱ᛫ᚠᛁᚱᚪ᛫ᚷᛖᚻᚹᛦᛚᚳᚢᛗ
ᛋᚳᛖᚪᛚ᛫ᚦᛖᚪᚻ᛫ᛗᚪᚾᚾᚪ᛫ᚷᛖᚻᚹᛦᛚᚳ᛫ᛗᛁᚳᛚᚢᚾ᛫ᚻᛦᛏ᛫ᛞᚫᛚᚪᚾ
ᚷᛁᚠ᛫ᚻᛖ᛫ᚹᛁᛚᛖ᛫ᚠᚩᚱ᛫ᛞᚱᛁᚻᛏᚾᛖ᛫ᛞᚩᛗᛖᛋ᛫ᚻᛚᛇᛏᚪᚾ᛬

From Laȝamon's Brut (The Chronicles of England, Middle English, West Midlands):

An preost wes on leoden, Laȝamon was ihoten
He wes Leovenaðes sone -- liðe him be Drihten.
He wonede at Ernleȝe at æðelen are chirechen,
Uppen Sevarne staþe, sel þar him þuhte,
Onfest Radestone, þer he bock radde.

(The third letter in the author's name is Yogh, missing from many fonts; CLICK HERE for another Middle English sample with some explanation of letters and encoding).

From the Tagelied of Wolfram von Eschenbach (Middle High German):

Sîne klâwen durh die wolken sint geslagen,
er stîget ûf mit grôzer kraft,
ich sih in grâwen tägelîch als er wil tagen,
den tac, der im geselleschaft
erwenden wil, dem werden man,
den ich mit sorgen în verliez.
ich bringe in hinnen, ob ich kan.
sîn vil manegiu tugent michz leisten hiez.

Some lines of Odysseus Elytis (Greek):

Monotonic:

Τη γλώσσα μου έδωσαν ελληνική
το σπίτι φτωχικό στις αμμουδιές του Ομήρου.
Μονάχη έγνοια η γλώσσα μου στις αμμουδιές του Ομήρου.

από το Άξιον Εστί
του Οδυσσέα Ελύτη

Polytonic:

Τὴ γλῶσσα μοῦ ἔδωσαν ἑλληνικὴ
τὸ σπίτι φτωχικὸ στὶς ἀμμουδιὲς τοῦ Ὁμήρου.
Μονάχη ἔγνοια ἡ γλῶσσα μου στὶς ἀμμουδιὲς τοῦ Ὁμήρου.

ἀπὸ τὸ Ἄξιον ἐστί
τοῦ Ὀδυσσέα Ἐλύτη

The first stanza of Pushkin's Bronze Horseman (Russian):

На берегу пустынных волн
Стоял он, дум великих полн,
И вдаль глядел. Пред ним широко
Река неслася; бедный чёлн
По ней стремился одиноко.
По мшистым, топким берегам
Чернели избы здесь и там,
Приют убогого чухонца;
И лес, неведомый лучам
В тумане спрятанного солнца,
Кругом шумел.

Šota Rustaveli's Veṗxis Ṭq̇aosani, ̣︡Th, The Knight in the Tiger's Skin (Georgian):

ვეპხის ტყაოსანი შოთა რუსთაველი

ღმერთსი შემვედრე, ნუთუ კვლა დამხსნას სოფლისა შრომასა, ცეცხლს, წყალსა და მიწასა, ჰაერთა თანა მრომასა; მომცნეს ფრთენი და აღვფრინდე, მივჰხვდე მას ჩემსა ნდომასა, დღისით და ღამით ვჰხედვიდე მზისა ელვათა კრთომაასა.

Tamil poetry of Cupiramaniya Paarathiyar, சுப்ரமணிய பாரதியார் (1882-1921):

யாமறிந்த மொழிகளிலே தமிழ்மொழி போல் இனிதாவது எங்கும் காணோம்,
பாமரராய் விலங்குகளாய், உலகனைத்தும் இகழ்ச்சிசொலப் பான்மை கெட்டு,
நாமமது தமிழரெனக் கொண்டு இங்கு வாழ்ந்திடுதல் நன்றோ? சொல்லீர்!
தேமதுரத் தமிழோசை உலகமெலாம் பரவும்வகை செய்தல் வேண்டும்.


I Can Eat Glass

And from the sublime to the ridiculous, here is a certain phrase¹ in an assortment of languages:

  1. Sanskrit: काचं शक्नोम्यत्तुम् । नोपहिनस्ति माम् ॥
  2. Sanskrit (standard transcription): kācaṃ śaknomyattum; nopahinasti mām.
  3. Classical Greek: ὕαλον ϕαγεῖν δύναμαι· τοῦτο οὔ με βλάπτει.
  4. Greek (monotonic): Μπορώ να φάω σπασμένα γυαλιά χωρίς να πάθω τίποτα.
  5. Greek (polytonic): Μπορῶ νὰ φάω σπασμένα γυαλιὰ χωρὶς νὰ πάθω τίποτα.
    Etruscan: (NEEDED)
  6. Latin: Vitrum edere possum; mihi non nocet.
  7. Old French: Je puis mangier del voirre. Ne me nuit.
  8. French: Je peux manger du verre, ça ne me fait pas de mal.
  9. Provençal / Occitan: Pòdi manjar de veire, me nafrariá pas.
  10. Québécois: J'peux manger d'la vitre, ça m'fa pas mal.
  11. Walloon: Dji pou magnî do vêre, çoula m' freut nén må.
    Champenois: (NEEDED)
    Lorrain: (NEEDED)
  12. Picard: Ch'peux mingi du verre, cha m'foé mie n'ma.
    Corsican: (NEEDED)
    Jèrriais: (NEEDED)
  13. Kreyòl Ayisyen: Mwen kap manje vè, li pa blese'm.
  14. Basque: Kristala jan dezaket, ez dit minik ematen.
  15. Catalan / Català: Puc menjar vidre, que no em fa mal.
  16. Spanish: Puedo comer vidrio, no me hace daño.
  17. Aragones: Puedo minchar beire, no me'n fa mal.
  18. Galician: Eu podo xantar cristais e non cortarme.
  19. European Portuguese: Posso comer vidro, não me faz mal.
  20. Brazilian Portuguese (8): Posso comer vidro, não me machuca.
  21. Caboverdiano: M' podê cumê vidru, ca ta maguâ-m'.
  22. Papiamentu: Ami por kome glas anto e no ta hasimi daño.
  23. Italian: Posso mangiare il vetro e non mi fa male.
  24. Milanese: Sôn bôn de magnà el véder, el me fa minga mal.
  25. Roman: Me posso magna' er vetro, e nun me fa male.
  26. Napoletano: M' pozz magna' o'vetr, e nun m' fa mal.
  27. Sicilian: Puotsu mangiari u vitru, nun mi fa mali.
  28. Venetian: Mi posso magnare el vetro, no'l me fa mae.
  29. Zeneise (Genovese): Pòsso mangiâ o veddro e o no me fà mâ.
  30. Romansch (Grischun): Jau sai mangiar vaider, senza che quai fa donn a mai.
    Romany / Tsigane: (NEEDED)
  31. Romanian: Pot să mănânc sticlă și ea nu mă rănește.
  32. Esperanto: Mi povas manĝi vitron, ĝi ne damaĝas min.
    Pictish: (NEEDED)
    Breton: (NEEDED)
  33. Cornish: Mý a yl dybry gwéder hag éf ny wra ow ankenya.
  34. Welsh: Dw i'n gallu bwyta gwydr, 'dyw e ddim yn gwneud dolur i mi.
  35. Manx Gaelic: Foddym gee glonney agh cha jean eh gortaghey mee.
  36. Old Irish (Ogham): ᚛᚛ᚉᚑᚅᚔᚉᚉᚔᚋ ᚔᚈᚔ ᚍᚂᚐᚅᚑ ᚅᚔᚋᚌᚓᚅᚐ᚜
  37. Old Irish (Latin): Con·iccim ithi nglano. Ním·géna.
  38. Irish: Is féidir liom gloinne a ithe. Ní dhéanann sí dochar ar bith dom.
  39. Scottish Gaelic: S urrainn dhomh gloinne ithe; cha ghoirtich i mi.
  40. Anglo-Saxon (Runes): ᛁᚳ᛫ᛗᚨᚷ᛫ᚷᛚᚨᛋ᛫ᛖᚩᛏᚪᚾ᛫ᚩᚾᛞ᛫ᚻᛁᛏ᛫ᚾᛖ᛫ᚻᛖᚪᚱᛗᛁᚪᚧ᛫ᛗᛖ᛬
  41. Anglo-Saxon (Latin): Ic mæg glæs eotan ond hit ne hearmiað me.
  42. Middle English: Ich canne glas eten and hit hirtiþ me nouȝt.
  43. English: I can eat glass and it doesn't hurt me.
  44. English (IPA): [aɪ kæn iːt glɑːs ænd ɪt dɐz nɒt hɜːt miː] (Received Pronunciation)
  45. English (Braille): ⠊⠀⠉⠁⠝⠀⠑⠁⠞⠀⠛⠇⠁⠎⠎⠀⠁⠝⠙⠀⠊⠞⠀⠙⠕⠑⠎⠝⠞⠀⠓⠥⠗⠞⠀⠍⠑
  46. Lalland Scots / Doric: Ah can eat gless, it disnae hurt us.
    Glaswegian: (NEEDED)
  47. Gothic (4): 𐌼𐌰𐌲 𐌲𐌻𐌴𐍃 𐌹̈𐍄𐌰𐌽, 𐌽𐌹 𐌼𐌹𐍃 𐍅𐌿 𐌽𐌳𐌰𐌽 𐌱𐍂𐌹𐌲𐌲𐌹𐌸.
  48. Old Norse (Runes): ᛖᚴ ᚷᛖᛏ ᛖᛏᛁ ᚧ ᚷᛚᛖᚱ ᛘᚾ ᚦᛖᛋᛋ ᚨᚧ ᚡᛖ ᚱᚧᚨ ᛋᚨᚱ
  49. Old Norse (Latin): Ek get etið gler án þess að verða sár.
  50. Norsk / Norwegian (Nynorsk): Eg kan eta glas utan å skada meg.
  51. Norsk / Norwegian (Bokmål): Jeg kan spise glass uten å skade meg.
  52. Føroyskt / Faroese: Eg kann eta glas, skaðaleysur.
  53. Íslenska / Icelandic: Ég get etið gler án þess að meiða mig.
  54. Svenska / Swedish: Jag kan äta glas utan att skada mig.
  55. Dansk / Danish: Jeg kan spise glas, det gør ikke ondt på mig.
  56. Sønderjysk: Æ ka æe glass uhen at det go mæ naue.
  57. Frysk / Frisian: Ik kin glês ite, it docht me net sear.
  58. Nederlands / Dutch: Ik kan glas eten, het doet mĳ geen kwaad.
  59. Kirchröadsj/Bôchesserplat: Iech ken glaas èèse, mer 't deet miech jing pieng.
  60. Afrikaans: Ek kan glas eet, maar dit doen my nie skade nie.
  61. Lëtzebuergescht / Luxemburgish: Ech kan Glas iessen, daat deet mir nët wei.
  62. Deutsch / German: Ich kann Glas essen, ohne mir zu schaden.
  63. Ruhrdeutsch: Ich kann Glas verkasematuckeln, ohne dattet mich wat jucken tut.
  64. Langenfelder Platt: Isch kann Jlaas kimmeln, uuhne datt mich datt weh dääd.
  65. Lausitzer Mundart ("Lusatian"): Ich koann Gloos assn und doas dudd merr ni wii.
  66. Odenwälderisch: Iech konn glaasch voschbachteln ohne dass es mir ebbs daun doun dud.
  67. Sächsisch / Saxon: 'sch kann Glos essn, ohne dass'sch mer wehtue.
  68. Pfälzisch: Isch konn Glass fresse ohne dasses mer ebbes ausmache dud.
  69. Schwäbisch / Swabian: I kå Glas frässa, ond des macht mr nix!
  70. Bayrisch / Bavarian: I koh Glos esa, und es duard ma ned wei.
  71. Allemannisch: I kaun Gloos essen, es tuat ma ned weh.
  72. Schwyzerdütsch: Ich chan Glaas ässe, das tuet mir nöd weeh.
  73. Hungarian: Meg tudom enni az üveget, nem lesz tőle bajom.
  74. Suomi / Finnish: Voin syödä lasia, se ei vahingoita minua.
  75. Sami (Northern): Sáhtán borrat lása, dat ii leat bávččas.
  76. Erzian: Мон ярсан суликадо, ды зыян эйстэнзэ а ули.
  77. Northern Karelian: Mie voin syvvä lasie ta minla ei ole kipie.
  78. Southern Karelian: Minä voin syvvä st'oklua dai minule ei ole kibie.
    Vepsian: (NEEDED)
    Votian: (NEEDED)
    Livonian: (NEEDED)
  79. Estonian: Ma võin klaasi süüa, see ei tee mulle midagi.
  80. Latvian: Es varu ēst stiklu, tas man nekaitē.
  81. Lithuanian: Aš galiu valgyti stiklą ir jis manęs nežeidžia
    Old Prussian: (NEEDED)
    Sorbian (Wendish): (NEEDED)
  82. Czech: Mohu jíst sklo, neublíží mi.
  83. Slovak: Môžem jesť sklo. Nezraní ma.
  84. Polska / Polish: Mogę jeść szkło i mi nie szkodzi.
  85. Slovenian: Lahko jem steklo, ne da bi mi škodovalo.
  86. Croatian: Ja mogu jesti staklo i ne boli me.
  87. Serbian (Latin): Mogu jesti staklo a da mi ne škodi.
  88. Serbian (Cyrillic): Могу јести стакло а да ми не шкоди.
  89. Macedonian: Можам да јадам стакло, а не ме штета.
  90. Russian: Я могу есть стекло, оно мне не вредит.
  91. Belarusian (Cyrillic): Я магу есці шкло, яно мне не шкодзіць.
  92. Belarusian (Lacinka): Ja mahu jeści škło, jano mne ne škodzić.
  93. Ukrainian: Я можу їсти шкло, й воно мені не пошкодить.
  94. Bulgarian: Мога да ям стъкло, то не ми вреди.
  95. Georgian: მინას ვჭამ და არა მტკივა.
  96. Armenian: Կրնամ ապակի ուտել և ինծի անհանգիստ չըներ։
  97. Albanian: Unë mund të ha qelq dhe nuk më gjen gjë.
  98. Turkish: Cam yiyebilirim, bana zararı dokunmaz.
  99. Turkish (Ottoman): جام ييه بلورم بڭا ضررى طوقونمز
  100. Bangla / Bengali: আমি কাঁচ খেতে পারি, তাতে আমার কোনো ক্ষতি হয় না।
  101. Marathi: मी काच खाऊ शकतो, मला ते दुखत नाही.
  102. Hindi: मैं काँच खा सकता हूँ और मुझे उससे कोई चोट नहीं पहुंचती.
  103. Tamil: நான் கண்ணாடி சாப்பிடுவேன், அதனால் எனக்கு ஒரு கேடும் வராது.
  104. Urdu(2): میں کانچ کھا سکتا ہوں اور مجھے تکلیف نہیں ہوتی ۔
  105. Pashto(2): زه شيشه خوړلې شم، هغه ما نه خوږوي
  106. Farsi / Persian: .من می توانم بدونِ احساس درد شيشه بخورم
  107. Arabic(2): أنا قادر على أكل الزجاج و هذا لا يؤلمني.
    Aramaic: (NEEDED)
  108. Hebrew(2): אני יכול לאכול זכוכית וזה לא מזיק לי.
  109. Yiddish(2): איך קען עסן גלאָז און עס טוט מיר נישט װײ.
    Judeo-Arabic: (NEEDED)
    Ladino: (NEEDED)
    Gǝʼǝz: (NEEDED)
    Amharic: (NEEDED)
  110. Twi: Metumi awe tumpan, ɜnyɜ me hwee.
  111. Hausa (Latin): Inā iya taunar gilāshi kuma in gamā lāfiyā.
  112. Hausa (Ajami) (2): إِنا إِىَ تَونَر غِلَاشِ كُمَ إِن غَمَا لَافِىَا
  113. Yoruba(3): Mo lè je̩ dígí, kò ní pa mí lára.
  114. Lingala: Nakokí kolíya biténi bya milungi, ekosála ngáí mabé tɛ́.
  115. (Ki)Swahili: Naweza kula bilauri na sikunyui.
  116. Malay: Saya boleh makan kaca dan ia tidak mencederakan saya.
  117. Tagalog: Kaya kong kumain nang bubog at hindi ako masaktan.
  118. Chamorro: Siña yo' chumocho krestat, ti ha na'lalamen yo'.
  119. Javanese: Aku isa mangan beling tanpa lara.
  120. Burmese: က္ယ္ဝန္‌တော္‌၊က္ယ္ဝန္‌မ မ္ယက္‌စားနုိင္‌သည္‌။ ၎က္ရောင္‌့ ထိခုိက္‌မ္ဟု မရ္ဟိပာ။ (9)
  121. Vietnamese (quốc ngữ): Tôi có thể ăn thủy tinh mà không hại gì.
  122. Vietnamese (nôm) (4): 些 𣎏 世 咹 水 晶 𦓡 空 𣎏 害 咦
    Khmer: (NEEDED)
    Lao: (NEEDED)
  123. Thai: ฉันกินกระจกได้ แต่มันไม่ทำให้ฉันเจ็บ
  124. Mongolian (Cyrillic): Би шил идэй чадна, надад хортой биш
  125. Mongolian (Classic) (5): ᠪᠢ ᠰᠢᠯᠢ ᠢᠳᠡᠶᠦ ᠴᠢᠳᠠᠨᠠ ᠂ ᠨᠠᠳᠤᠷ ᠬᠣᠤᠷᠠᠳᠠᠢ ᠪᠢᠰᠢ
    Dzongkha: (NEEDED)
    Nepali: (NEEDED)
  126. Tibetan: ཤེལ་སྒོ་ཟ་ནས་ང་ན་གི་མ་རེད།
  127. Chinese: 我能吞下玻璃而不伤身体。
  128. Chinese (Traditional): 我能吞下玻璃而不傷身體。
  129. Taiwanese(6): Góa ē-tàng chia̍h po-lê, mā bē tio̍h-siong.
  130. Japanese: 私はガラスを食べられます。それは私を傷つけません。
  131. Korean: 나는 유리를 먹을 수 있어요. 그래도 아프지 않아요
  132. Bislama: Mi save kakae glas, hemi no save katem mi.
  133. Hawaiian: Hiki iaʻu ke ʻai i ke aniani; ʻaʻole nō lā au e ʻeha.
  134. Marquesan: E koʻana e kai i te karahi, mea ʻā, ʻaʻe hauhau.
  135. Chinook Jargon: Naika məkmək kakshət labutay, pi weyk ukuk munk-sik nay.
  136. Navajo: Tsésǫʼ yishą́ągo bííníshghah dóó doo shił neezgai da.
    Cherokee (and Cree, Ojibwa, Inuktitut, Náhuatl, Quechua, and other American languages): (NEEDED)
    Garifuna: (NEEDED)
    Gullah: (NEEDED)
  137. Lojban: mi kakne le nu citka le blaci .iku'i le se go'i na xrani mi
  138. Nórdicg: Ljœr ye caudran créneþ ý jor cẃran.

(Additions, corrections, completions, gratefully accepted.)

For testing purposes, some of these are repeated in a monospace font . . .

  1. Euro Symbol: €.
  2. Greek: Μπορώ να φάω σπασμένα γυαλιά χωρίς να πάθω τίποτα.
  3. Íslenska / Icelandic: Ég get etið gler án þess að meiða mig.
  4. Polish: Mogę jeść szkło, i mi nie szkodzi.
  5. Romanian: Pot să mănânc sticlă și ea nu mă rănește.
  6. Ukrainian: Я можу їсти шкло, й воно мені не пошкодить.
  7. Armenian: Կրնամ ապակի ուտել և ինծի անհանգիստ չըներ։
  8. Georgian: მინას ვჭამ და არა მტკივა.
  9. Hindi: मैं काँच खा सकता हूँ, मुझे उस से कोई पीडा नहीं होती.
  10. Hebrew(2): אני יכול לאכול זכוכית וזה לא מזיק לי.
  11. Yiddish(2): איך קען עסן גלאָז און עס טוט מיר נישט װײ.
  12. Arabic(2): أنا قادر على أكل الزجاج و هذا لا يؤلمني.
  13. Japanese: 私はガラスを食べられます。それは私を傷つけません。
  14. Thai: ฉันกินกระจกได้ แต่มันไม่ทำให้ฉันเจ็บ

Notes:

  1. The "I can eat glass" phrase and initial translations (about 30 of them) were borrowed from Ethan Mollick's I Can Eat Glass page (which disappeared on or about June 2004) and converted to UTF-8. Since Ethan's original page is gone, I should mention that his purpose was to offer travelers a phrase they could use in any country that would command a certain kind of respect, or at least get attention. See Credits for the many additional contributions since then. When submitting new entries, the word "hurt" (if you have a choice) is used in the sense of "cause harm", "do damage", or "bother", rather than "inflict pain" or "make sad". In this vein Otto Stolz comments (as do others further down; personally I think it's better for the purpose of this page to have extra entries and/or to show a greater repertoire of characters than it is to enforce a strict interpretation of the word "hurt"!):

    This is the meaning I have translated to the Swabian dialect. However, I just have noticed that most of the German variants translate the "inflict pain" meaning. The German example should read:

    "Ich kann Glas essen ohne mir zu schaden."

    rather than:

    "Ich kann Glas essen, ohne mir weh zu tun."

    (The comma fell victim to the 1996 orthographic reform, cf. http://www.ids-mannheim.de/reform/e3-1.html#P76.)

    You may wish to contact the contributors of the following translations to correct them:

    • Lëtzebuergescht / Luxemburgish: Ech kan Glas iessen, daat deet mir nët wei.
    • Lausitzer Mundart ("Lusatian"): Ich koann Gloos assn und doas dudd merr ni wii.
    • Sächsisch / Saxon: 'sch kann Glos essn, ohne dass'sch mer wehtue.
    • Bayrisch / Bavarian: I koh Glos esa, und es duard ma ned wei.
    • Allemannisch: I kaun Gloos essen, es tuat ma ned weh.
    • Schwyzerdütsch: Ich chan Glaas ässe, das tuet mir nöd weeh.

    In contrast, I deem the following translations *alright*:

    • Ruhrdeutsch: Ich kann Glas verkasematuckeln, ohne dattet mich wat jucken tut.
    • Pfälzisch: Isch konn Glass fresse ohne dasses mer ebbes ausmache dud.
    • Schwäbisch / Swabian: I kå Glas frässa, ond des macht mr nix!

    (However, you could remove the commas, on account of http://www.ids-mannheim.de/reform/e3-1.html#P76 and http://www.ids-mannheim.de/reform/e3-1.html#P72, respectively.)

    I guess, also these examples translate the wrong sense of "hurt", though I do not know these languages well enough to assert them definitely:

    • Nederlands / Dutch: Ik kan glas eten; het doet mĳ geen pĳn. (This one has been changed)
    • Kirchröadsj/Bôchesserplat: Iech ken glaas èèse, mer 't deet miech jing pieng.

    In the Romanic languages, the variations on "fa male" (it) are probably wrong, whilst the variations on "hace daño" (es) and "damaĝas" (Esperanto) are probably correct; "nocet" (la) is definitely right.

    The northern Germanic variants of "skada" are probably right, as are the Slavic variants of "škodi/шкоди" (se); however the Slavic variants of "boli" (hv) are probably wrong, as "bolena" means "pain/ache", IIRC.

    That was from July 2004. In December 2007, Otto writes again:

    Hello Frank, in days of yore, I had written:
    > "Ich kann Glas essen ohne mir zu schaden."
    > (The comma fell victim to the 1996 orthographic reform,

    cf. http://www.ids-mannheim.de/reform/e3-1.html#P76.

    The latest revision (2006) of the official German orthography has revived the comma around infinitive clauses commencing with ohne, or 5 other conjunctions, or depending from a noun or from an announcing demonstrative (http://www.ids-mannheim.de/reform/regeln2006.pdf, §75). So, it's again: Ich kann Glas essen, ohne mir zu schaden.

    Best wishes,
         Otto Stolz

  2. Correct right-to-left display of these languages depends on the capabilities of your browser. The period should appear on the left. In the monospace Yiddish example, the Yiddish digraphs should occupy one character cell.

  3. Yoruba: The third word is Latin letter small 'j' followed by small 'e' with U+0329, Combining Vertical Line Below. This displays correctly only if your Unicode font includes the U+0329 glyph and your browser supports combining diacritical marks. The Lingala and Indic examples also include combining sequences.

  4. Includes Unicode 3.1 (or later) characters beyond Plane 0.

  5. The Classic Mongolian example should be vertical, top-to-bottom and left-to-right. But such display is almost impossible. Also no font yet exists which provides the proper ligatures and positional variants for the characters of this script, which works somewhat like Arabic.

  6. Taiwanese is also known as Holo or Hoklo, and is related to Southern Min dialects such as Amoy. Contributed by Henry H. Tan-Tenn, who comments, "The above is the romanized version, in a script current among Taiwanese Christians since the mid-19th century. It was invented by British missionaries and saw use in hundreds of published works, mostly of a religious nature. Most Taiwanese did not know Chinese characters then, or at least not well enough to read. More to the point, though, a written standard using Chinese characters has never developed, so a significant minority of words are represented with different candidate characters, depending on one's personal preference or etymological theory. In this sentence, for example, "-tàng", "chia̍h", "mā" and "bē" are problematic using Chinese characters. "Góa" (I/me) and "po-lê" (glass) are as written in other Sinitic languages (e.g. Mandarin, Hakka)."

  7. The numbering of the samples is arbitrary, done only to keep track of how many there are, and can change any time a new entry is added. The arrangement is also arbitrary but with some attempt to group related examples together. Note: All languages not listed are wanted, not just the ones that say (NEEDED).

  8. Wagner Amaral of Pinese & Amaral Associados notes that the Brazilian Portuguese sentence for "I can eat glass" should be identical to the Portuguese one, as the word "machuca" means "inflict pain", or rather "injuries". The words "faz mal" would more correctly translate as "cause harm".

  9. Burmese: In English the first person pronoun "I" stands for both genders, male and female. In Burmese (except in the central part of Burma) the pronoun is kyundaw (က္ယ္ဝန္‌တော္‌) for male and kyanma (က္ယ္ဝန္‌မ) for female. The example needs a fully-compliant Unicode Burmese font (sadly, only the Padauk Graphite font exists) and rendering with the Graphite engine. CLICK HERE to test Burmese characters.
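
The notes on combining sequences and on characters beyond Plane 0 can be checked with a few lines of Python. This is only a sketch using the standard `unicodedata` module. The function name `describe` and the two sample strings (the Yoruba word from the list and the first letter of the Gothic sample) are choices made for this illustration.

```python
import unicodedata

def describe(text):
    """Return (code point, name, combining class, plane) for each character."""
    rows = []
    for ch in text:
        cp = ord(ch)
        rows.append((
            f"U+{cp:04X}",
            unicodedata.name(ch, "<unnamed>"),
            unicodedata.combining(ch),   # 0 means "not a combining mark"
            cp >> 16,                    # plane number; 0 is the BMP
        ))
    return rows

yoruba_word = "je\u0329"          # j, e, COMBINING VERTICAL LINE BELOW
gothic_letter = "\U0001033C"      # a Plane 1 character

for row in describe(yoruba_word) + describe(gothic_letter):
    print(row)

# No precomposed "e with vertical line below" exists, so NFC keeps two code points.
print(len(unicodedata.normalize("NFC", "e\u0329")))
# Characters beyond Plane 0 always take four bytes in UTF-8.
print(len(gothic_letter.encode("utf-8")))
```

A nonzero combining class marks a character that must be drawn together with the preceding base letter, which is why such sequences depend on both the font and the rendering engine.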

The Quick Brown Fox

The "I can eat glass" sentences do not necessarily show off the orthography of each language to best advantage. In many alphabetic written languages it is possible to include all (or most) letters (or "special" characters) in a single (often nonsense) pangram. These were traditionally used in typewriter instruction; now they are useful for stress-testing computer fonts and keyboard input methods. Here are a few examples (SEND MORE):

  1. English: The quick brown fox jumps over the lazy dog.
  2. Irish: "An ḃfuil do ċroí ag bualaḋ ó ḟaitíos an ġrá a ṁeall lena ṗóg éada ó ṡlí do leasa ṫú?" "D'ḟuascail Íosa Úrṁac na hÓiġe Beannaiṫe pór Éava agus Áḋaiṁ."
  3. Dutch: Pa's wĳze lynx bezag vroom het fikse aquaduct.
  4. German: Falsches Üben von Xylophonmusik quält jeden größeren Zwerg. (1)
  5. German: Im finſteren Jagdſchloß am offenen Felsquellwaſſer patzte der affig-flatterhafte kauzig-höf‌liche Bäcker über ſeinem verſifften kniffligen C-Xylophon. (2)
  6. Swedish: Flygande bäckasiner söka strax hwila på mjuka tuvor.
  7. Icelandic: Sævör grét áðan því úlpan var ónýt.
  8. Polish: Pchnąć w tę łódź jeża lub ośm skrzyń fig.
  9. Czech: Příliš žluťoučký kůň úpěl ďábelské kódy.
  10. Slovak: Starý kôň na hŕbe kníh žuje tíško povädnuté ruže, na stĺpe sa ďateľ učí kvákať novú ódu o živote.
  11. Greek (monotonic): ξεσκεπάζω την ψυχοφθόρα βδελυγμία
  12. Greek (polytonic): ξεσκεπάζω τὴν ψυχοφθόρα βδελυγμία
  13. Russian: В чащах юга жил-был цитрус? Да, но фальшивый экземпляр! ёъ.
  14. Bulgarian: Жълтата дюля беше щастлива, че пухът, който цъфна, замръзна като гьон.
  15. Sami (Northern): Vuol Ruoŧa geđggiid leat máŋga luosa ja čuovžža.
  16. Hungarian: Árvíztűrő tükörfúrógép.
  17. Spanish: El pingüino Wenceslao hizo kilómetros bajo exhaustiva lluvia y frío, añoraba a su querido cachorro.
  18. Portuguese: O próximo vôo à noite sobre o Atlântico, põe freqüentemente o único médico. (3)
  19. French: Les naïfs ægithales hâtifs pondant à Noël où il gèle sont sûrs d'être déçus et de voir leurs drôles d'œufs abîmés.
  20. Esperanto: Eĥoŝanĝo ĉiuĵaŭde.
  21. Hebrew: זה כיף סתם לשמוע איך תנצח קרפד עץ טוב בגן.
  22. Japanese (Hiragana):
    いろはにほへど　ちりぬるを
    わがよたれぞ　つねならむ
    うゐのおくやま　けふこえて
    あさきゆめみじ　ゑひもせず (4)
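
Whether a sentence really is a pangram for a given alphabet is easy to check mechanically. The following Python sketch does so; the function name `missing_letters` and the alphabet strings are assumptions made for this example, and the alphabets are deliberately simple (no attempt is made to handle digraphs or combining marks).

```python
def missing_letters(sentence: str, alphabet: str) -> list:
    """Return the letters of alphabet that do not occur in sentence."""
    # lower() is used rather than casefold(), because casefold() would
    # turn the German sharp s into "ss" and hide it from the check.
    present = set(sentence.lower())
    return sorted(set(alphabet) - present)

ENGLISH = "abcdefghijklmnopqrstuvwxyz"
GERMAN = ENGLISH + "äöüß"

print(missing_letters("The quick brown fox jumps over the lazy dog.", ENGLISH))
print(missing_letters(
    "Falsches Üben von Xylophonmusik quält jeden größeren Zwerg.", GERMAN))
print(missing_letters(
    "Franz jagt im komplett verwahrlosten Taxi quer durch Bayern", GERMAN))
```

The first two calls report nothing missing, while the third reports the three umlauts and the sharp s.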

Notes:

  1. Other phrases commonly used in Germany include: "Ein wackerer Bayer vertilgt ja bequem zwo Pfund Kalbshaxe" and, more recently, "Franz jagt im komplett verwahrlosten Taxi quer durch Bayern", but both lack umlauts and eszett. Previously, going for the shortest sentence that has all the umlauts and special characters, I had "Grüße aus Bärenhöfe (und Óechtringen)!" Acute accents are not used in native German words, so I was surprised to discover "Óechtringen" in the Deutsche Bundespost Postleitzahlenbuch.

    It's a small village in eastern Lower Saxony. The "oe" in this case turns out to be the Lower Saxon "lengthening e" (Dehnungs-e), which makes the previous vowel long (used in a number of Lower Saxon place names such as Soest and Itzehoe), not the "e" that indicates umlaut of the preceding vowel. Many thanks to the Óechtringen-Namenschreibungsuntersuchungskomitee (Alex Bochannek, Manfred Erren, Asmus Freytag, Christoph Päper, plus Werner Lemberg who serves as Óechtringen-Namenschreibungsuntersuchungskomiteerechtschreibungsprüfer) for their relentless pursuit of the facts in this case. Conclusion: the accent almost certainly does not belong on this (or any other native German) word, but neither can it be dismissed as dirt on the page. To add to the mystery, it has been reported that other copies of the same edition of the PLZB do not show the accent! UPDATE (March 2006): David Krings was intrigued enough by this report to contact the mayor of Ebstorf, of which Oechtringen is a borough, who responded:

    Sehr geehrter Mr. Krings,
    wenn Oechtringen irgendwo mit einem Akzent auf dem O geschrieben wurde, dann kann das nur ein Fehldruck sein. Die offizielle Schreibweise lautet jedenfalls „Oechtringen“.
    Mit freundlichen Grüssen
    Der Samtgemeindebürgermeister
    i.A. Lothar Jessel

  2. From Karl Pentzlin (Kochel am See, Bavaria, Germany): "This German phrase is suited for display by a Fraktur (broken letter) font. It contains: all common three-letter ligatures: ffi ffl fft and all two-letter ligatures required by the Duden for Fraktur typesetting: ch ck ff fi fl ft ll ſch ſi ſſ ſt tz (all in a manner such they are not part of a three-letter ligature), one example of f-l where German typesetting rules prohibit ligating (marked by a ZWNJ), and all German letters a...z, ä,ö,ü,ß, ſ [long s] (all in a manner such that they are not part of a two-letter Fraktur ligature)." Otto Stolz notes that "'Schloß' is now spelled 'Schloss', in contrast to 'größer' (example 4) which has kept its 'ß'. Fraktur has been banned from general use, in 1942, and long-s (ſ) has ceased to be used with Antiqua (Roman) even earlier (the latest Antiqua-ſ I have seen is from 1913, but then I am no expert, so there may well be a later instance)." Later Otto confirms the latter theory, "Now I've run across a book “Deutsche Rechtschreibung” (edited by Lutz Mackensen) from 1954 (my reprint is from 1956) that has kept the Antiqua-ſ in its dictionary part (but neither in the preface nor in the appendix)."

  3. Diaeresis is not used in Iberian Portuguese.

  4. From Yurio Miyazawa: "This poetry contains all the sounds in the Japanese language and used to be the first thing for children to learn in their Japanese class. The Hiragana version is particularly neat because it covers every character in the phonetic Hiragana character set." Yurio also sent the Kanji version:

    色は匂へど 散りぬるを
    我が世誰ぞ 常ならむ
    有為の奥山 今日越えて
    浅き夢見じ 酔ひもせず

Accented Cyrillic:

(This section contributed by Vladimir Marinov.)

In Bulgarian it is desirable, customary, or in some cases required to write accents over vowels. Unfortunately, no computer character sets contain the full repertoire of accented Cyrillic letters. With Unicode, however, it is possible to combine any Cyrillic letter with any combining accent. The appearance of the result depends on the font and the rendering engine. Here are two examples.

  1. Той видя бялата коса́ по главата и́ и ко́са на рамото и́, и ре́че да и́ рече́: "Пара́та по́ па́ри от па́рата, не ща пари́!", но си поми́сли: "Хей, помисли́ си! А́ и́ река, а́ е скочила в тази река, която щеше да тече́, а не те́че."

  2. По пъ́тя пъту́ват кю́рди и югославя́ни.
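
As a rough illustration of the combining mechanism described above, the following Python sketch appends U+0301 COMBINING ACUTE ACCENT to a Cyrillic vowel and inspects the result. The helper name `add_acute` is hypothetical and not part of the original page; only the standard `unicodedata` module is assumed.

```python
import unicodedata

COMBINING_ACUTE = "\u0301"  # U+0301 COMBINING ACUTE ACCENT


def add_acute(letter: str) -> str:
    """Return the letter followed by a combining acute accent (hypothetical helper)."""
    return letter + COMBINING_ACUTE


accented = add_acute("и")  # CYRILLIC SMALL LETTER I plus the accent

# The result is two code points; a capable font and rendering engine
# draw them as a single accented glyph.
print([unicodedata.name(ch) for ch in accented])

# NFC normalization leaves the pair as it is, because Unicode has no
# single precomposed code point for this letter with an acute accent.
print(len(unicodedata.normalize("NFC", accented)))

# In UTF-8 the base letter and the accent take two bytes each.
print(accented.encode("utf-8").hex())
```

Whether the accent lands neatly over the vowel still depends on the font and the rendering engine, as noted above.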

HTML Features

Here is the Russian alphabet (uppercase only) coded in three different ways, which should look identical:

  1. АБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ   (Literal UTF-8)
  2. АБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ   (Decimal numeric character reference)
  3. АБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ   (Hexadecimal numeric character reference)
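
The three codings can be compared with a short Python sketch. It assumes the alphabet shown is the contiguous range U+0410 through U+042F, and it uses the standard library's `html.unescape` as a stand-in for a browser's handling of numeric character references; the variable names are illustrative only.

```python
import html

# Code points U+0410 (А) through U+042F (Я): the 32 uppercase letters listed above.
code_points = range(0x0410, 0x0430)

literal = "".join(chr(cp) for cp in code_points)               # literal characters
decimal_refs = "".join(f"&#{cp};" for cp in code_points)       # &#1040;&#1041;...
hex_refs = "".join(f"&#x{cp:X};" for cp in code_points)        # &#x410;&#x411;...

# All three forms should decode to the same string.
print(html.unescape(decimal_refs) == literal)
print(html.unescape(hex_refs) == literal)

# Only the literal form depends on the page's declared encoding;
# in UTF-8 each of these letters takes two bytes.
print(len(literal.encode("utf-8")))
```

The numeric references are plain ASCII, so they survive even when the page encoding is misdeclared, at the cost of a much larger source file.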

In another test, we use HTML language tags to distinguish Bulgarian, Russian, and Serbian, which have different italic forms for lowercase б, г, д, п, and/or т:

Bulgarian: [ бгдпт ]   [ бгдпт ]   Мога да ям стъкло и не ме боли.
Russian:   [ бгдпт ]   [ бгдпт ]   Я могу есть стекло, это мне не вредит.
Serbian:   [ бгдпт ]   [ бгдпт ]   Могу јести стакло а да ми не шкоди.


Credits, Tools, and Commentary

Credits:
The "I can eat glass" phrase and the initial collection of translations: Ethan Mollick. Transcription / conversion to UTF-8: Frank da Cruz. Albanian: Sindi Keesan. Afrikaans: Johan Fourie, Kevin Poalses. Anglo Saxon: Frank da Cruz. Arabic: Najib Tounsi. Armenian: Vaçe Kundakçı. Belarusian: Alexey Chernyak. Bengali: Somnath Purkayastha, Deepayan Sarkar. Bislama: Dan McGarry. Braille: Frank da Cruz. Bulgarian: Sindi Keesan, Guentcho Skordev, Vladimir Marinov. Burmese: "cetanapa". Cabo Verde Creole: Cláudio Alexandre Duarte. Catalán: Jordi Bancells. Chinese: Jack Soo, Wong Pui Lam. Chinook Jargon: David Robertson. Cornish: Chris Stephens. Croatian: Marjan Baće. Czech: Stanislav Pecha, Radovan Garabík. Dutch: Peter Gotink, Pim Blokland, Rob Daniel, Rob de Wit. Erzian: Jack Rueter. Esperanto: Franko Luin, Radovan Garabík. Estonian: Meelis Roos. Faroese: Jón Gaasedal. Farsi/Persian: Payam Elahi. Finnish: Sampsa Toivanen. French: Luc Carissimo, Anne Colin du Terrail, Sean M. Burke. Galician: Laura Probaos. Georgian: Giorgi Lebanidze. German: Christoph Päper, Otto Stolz, Karl Pentzlin, David Krings, Frank da Cruz. Gothic: Aurélien Coudurier. Greek: Ariel Glenn, Constantine Stathopoulos, Siva Nataraja, Christos Georgiou. Hebrew: Jonathan Rosenne, Tal Barnea. Hausa: Malami Buba, Tom Gewecke. Hawaiian: na Hauʻoli Motta, Anela de Rego, Kaliko Trapp. Hindi: Shirish Kalele, Nitin Dahra. Hungarian: András Rácz, Mark Holczhammer. Icelandic: Andrés Magnússon, Sveinn Baldursson. International Phonetic Alphabet (IPA): Siva Nataraja / Vincent Ramos. Irish: Michael Everson, Marion Gunn, James Kass, Curtis Clark. Italian: Thomas De Bellis. Japanese: Makoto Takahashi, Yurio Miyazawa. Karelian: Aleksandr Semakov. Kirchröadsj: Roger Stoffers. Kreyòl: Sean M. Burke. Korean: Jungshik Shin. Langenfelder Platt: David Krings. Lëtzebuergescht: Stefaan Eeckels. Lingala: Denis Moyogo Jacquerye (Nkóta ya Kɔ́ngɔ míbalé). Lithuanian: Gediminas Grigas.
Lojban: Edward Cherlin. Lusatian: Ronald Schaffhirt. Macedonian: Sindi Keesan. Malay: Zarina Mustapha. Manx: Éanna Ó Brádaigh. Marathi: Shirish Kalele. Marquesan: Kaliko Trapp. Middle English: Frank da Cruz. Milanese: Marco Cimarosti. Mongolian: Tom Gewecke. Napoletano: Diego Quintano. Navajo: Tom Gewecke. Nórdicg: Yẃlyan Rott. Norwegian: Herman Ranes. Odenwälderisch: Alexander Heß. Old Irish: Michael Everson. Old Norse: Andrés Magnússon. Papiamentu: Bianca and Denise Zanardi. Pashto: N.R. Liwal. Pfälzisch: Dr. Johannes Sander. Picard: Philippe Mennecier. Polish: Juliusz Chroboczek, Paweł Przeradowski. Portuguese: "Cláudio" Alexandre Duarte, Bianca and Denise Zanardi, Pedro Palhoto Matos, Wagner Amaral. Québécois: Laurent Detillieux. Roman: Pierpaolo Bernardi. Romanian: Juliusz Chroboczek, Ionel Mugurel. Romansch: Alexandre Suter. Ruhrdeutsch: "Timwi". Russian: Alexey Chernyak, Serge Nesterovitch. Sami: Anne Colin du Terrail, Luc Carissimo. Sanskrit: Siva Nataraja / Vincent Ramos. Sächsisch: André Müller. Schwäbisch: Otto Stolz. Scots: Jonathan Riddell. Serbian: Sindi Keesan, Ranko Narancic, Boris Daljevic, Szilvia Csorba. Slovak: G. Adam Stanislav, Radovan Garabík. Slovenian: Albert Kolar. Spanish: Aleida Muñoz, Laura Probaos. Swahili: Ronald Schaffhirt. Swedish: Christian Rose, Bengt Larsson. Taiwanese: Henry H. Tan-Tenn. Tagalog: Jim Soliven. Tamil: Vasee Vaseeharan. Tibetan: D. Germano, Tom Gewecke. Thai: Alan Wood's wife. Turkish: Vaçe Kundakçı, Tom Gewecke, Merlign Olnon. Ukrainian: Michael Zajac. Urdu: Mustafa Ali. Vietnamese: Dixon Au, [James] Đỗ Bá Phước 杜 伯 福. Walloon: Pablo Saratxaga. Welsh: Geiriadur Prifysgol Cymru (Andrew). Yiddish: Mark David. Zeneise: Angelo Pavese.

Tools Used to Create This Web Page:
The UTF-8-aware Kermit 95 terminal emulator on Windows, connected to a Unix host running the EMACS text editor. Kermit 95 displays UTF-8 and also allows keyboard entry of arbitrary Unicode BMP characters as 4 hex digits, as shown HERE. Hex codes for Unicode values can be found in The Unicode Standard (recommended) and the online code charts. When submissions arrive by email encoded in some other character set (Latin-1, Latin-2, KOI, various PC code pages, JEUC, etc.), I use the TRANSLATE command of C-Kermit on the Unix host (where I read my mail) to convert the character set to UTF-8 (I could also use Kermit 95 for this; it has the same TRANSLATE command). That's it -- no "Web authoring" tools, no locales, no "smart" anything. It's just plain text, nothing more. By the way, there's nothing special about EMACS -- any text editor will do, provided it allows entry of arbitrary 8-bit bytes as text, including the 0x80-0x9F "C1" range. EMACS 21.1 actually supports UTF-8; earlier versions don't know about it and display the octal codes; either way is OK for this purpose.
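
For readers without Kermit, the same two steps (converting a legacy character set to UTF-8, and entering a BMP character by its 4 hex digits) can be approximated in Python. This is only a sketch using Python's built-in codecs, not C-Kermit's TRANSLATE command; the function names `to_utf8` and `bmp_char` and the sample strings are hypothetical.

```python
def to_utf8(raw: bytes, source_charset: str) -> bytes:
    """Decode bytes from a legacy character set and re-encode them as UTF-8."""
    return raw.decode(source_charset).encode("utf-8")


def bmp_char(hex_digits: str) -> str:
    """Turn four hex digits, such as "0416", into the matching BMP character."""
    if len(hex_digits) != 4:
        raise ValueError("expected exactly 4 hex digits")
    return chr(int(hex_digits, 16))


# Stand-ins for submissions that arrive in Latin-1 and KOI8-R.
latin1_mail = "größer".encode("latin-1")
koi8_mail = "стекло".encode("koi8_r")

print(to_utf8(latin1_mail, "latin-1").decode("utf-8"))
print(to_utf8(koi8_mail, "koi8_r").decode("utf-8"))

# Keyboard-style entry of a character by its Unicode value.
print(bmp_char("0416"))
```

The source character set has to be known in advance; bytes alone do not reliably identify whether a message is Latin-1, Latin-2, or KOI.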

Commentary:
Date: Wed, 27 Feb 2002 13:21:59 +0100
From: "Bruno DEDOMINICIS" <b.dedominicis@cite-sciences.fr>
Subject: Je peux manger du verre, cela ne me fait pas mal.

I just found out your website and it makes me feel like proposing an interpretation of the choice of this peculiar phrase.

Glass is transparent and can hurt as everyone knows. The relation between people and civilisations is sometimes effusional and more often rude. The concept of breaking frontiers through globalization, in a way, is also an attempt to deny any difference. Isn't "transparency" the flag of modernity? Nothing should be hidden any more, authority is obsolete, and the new powers are supposed to reign through loving and smiling and no more through coercion...

Eating glass without pain sounds like a very nice metaphor of this attempt. That is, frontiers should become glass transparent first, and be denied by incorporating them. On the reverse, it shows that through globalization, frontiers undergo a process of displacement, that is, when they are not any more speakable, they become repressed from the speech and are therefore incorporated and might become painful symptoms, as for example what happens when one tries to eat glass.

The frontiers that used to separate bodies one from another tend to divide bodies from within and make them suffer.... The chosen phrase then appears as a denial of the symptom that might result from the destitution of traditional frontiers.

Best,
Bruno De Dominicis, Paris, France



UTF-8 Sampler / The Kermit Project / Columbia University / kermit@columbia.edu