- Comments: Gated by NETNEWS@AUVM.AMERICAN.EDU
- Path: sparky!uunet!paladin.american.edu!auvm!UTAFLL.UTA.EDU!ROBIN
- Message-ID: <9208230649.AA24130@utafll.uta.edu>
- Date: Sat, 22 Aug 92 23:49:48 CDT
- Sender: Nota Bene List <NOTABENE@TAUNIVM>
- From: Robin Cover <robin@UTAFLL.UTA.EDU>
- Subject: UUENCODE unsafe (annual reminder)
- Comments: To: notabene@vm.tau.ac.il
- Newsgroups: bit.listserv.notabene
- Lines: 67
-
-
- Allan Needell writes on UUENCODE:
-
- > From: Allan Needell <NASSH100%SIVM.BITNET@ricevm1.rice.edu>
- > Subject: The trouble with UUENCODE
-
- > Dorothy, et al.,
-
- > The character set used by the uuencode program includes the right
- > and left square brackets (] and [); as many will see, these
- > characters get corrupted during some stage of the email process.
- > In my case they arrive as vertical hyphens (the shifted character to
- > the immediate left of the backspace key on the 101-key keyboard).
- > (Sorry--don't know the ascii #). Because I don't know which
- > (right or left) each represents, I can't run a ci routine on the
- > file and so I can't uudecode the messages.
-
- > Could you all use xxencode instead? Specifically, Dorothy, could you
- > resend an xxencoded copy of NOCZ2?
-
- Allan confirms what network wisdom already knows to be true: UUENCODE
- is **NOT** a safe transmission form for "7 bit" network traffic. Most
- people just ignore this and hope that the plague will not hit "them."
- The TAUNIVM server supports XXencode as one of the options, but many
- sites do not support XXencode. A real pity.
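-
- To make the difference concrete, here is a minimal sketch in Python that
- builds the two 64-character tables and lists the punctuation each one
- relies on. It assumes the classic uuencode mapping (6-bit value n becomes
- the character with code 32 + n) and the usual xxencode table; the names
- UU_ALPHABET, XX_ALPHABET and punctuation_in are made up for this
- illustration and belong to no encoder's actual source.

```python
import string

# Classic uuencode maps each 6-bit value n (0..63) to the character chr(32 + n).
# (Many encoders emit a backtick instead of a space for n == 0; ignored here.)
UU_ALPHABET = "".join(chr(32 + n) for n in range(64))

# xxencode groups bits the same way but uses a different 64-character table.
XX_ALPHABET = "+-" + string.digits + string.ascii_uppercase + string.ascii_lowercase


def punctuation_in(alphabet):
    """Return the characters of `alphabet` that are not letters, digits, or space."""
    return "".join(c for c in alphabet if not (c.isalnum() or c == " "))


if __name__ == "__main__":
    print("uuencode punctuation:", punctuation_in(UU_ALPHABET))
    print("xxencode punctuation:", punctuation_in(XX_ALPHABET))
```

- Under these assumptions the uuencode table includes both square brackets
- (among other punctuation), while the xxencode table adds only + and - to
- the letters and digits.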
-
- Here's another annual warning:
- --------------------------------------
-
- The ISO 646 subset contains the following (non-national) characters
- which do not commonly cause misinterpretation of the data when shipped
- across networks, from ASCII to EBCDIC machines or vice versa, or across
- national boundaries:
-
- a b c d e f g h i j k l m n o p q r s t u v w x y z
- A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
- 0 1 2 3 4 5 6 7 8 9
- " % & ' ( ) * + , - . / : ; < = > ? _ and space (dec 32)
-
- OK everyone? *THOSE* are the safe characters; all others are fragile,
- including left-single-quote and others on the top row of a standard
- QWERTY keyboard. Note that even those who use SGML need to be
- careful, since... (self-quoting again)... By my count, five characters
- in the SGML reference concrete syntax fall outside this "ISO 646
- subset" (so-called in TEI Guidelines) given above:
-
-                          Example in    IRV       Decimal
- Character Name           SGML markup   ISO 646   Number    Test Column
- ----------------------------------------------------------------------
- exclamation mark         MDO           2/1        33       !
- left-square-bracket      DSO           5/11       91       [
- right-square-bracket     DSC           5/13       93       ]
- number/hash/pound        RNI           2/3        35       #
- (broken) vertical bar    OR            7/12      124       |
-
- ---The bottom line here: if all five characters in column five
- ("Test Column") look OK to *YOU*, that just means that this file
- travelled safely from Texas to where you are now; via any other
- path, the results on your system could be different.
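-
- The table's arithmetic and the safe list can be checked mechanically.
- The following is a small Python sketch, not a standard tool: it converts
- the ISO 646 column/row positions to decimal (column times 16, plus row)
- and reports which characters of a string fall outside the safe subset
- listed earlier in this message. The names SAFE, code_point, fragile_chars
- and SGML_FRAGILE, and the sample declaration, are invented for the example.

```python
import string

# The "ISO 646 subset" as listed in this message: letters, digits,
# nineteen punctuation marks, and space.
SAFE = set(string.ascii_letters + string.digits + "\"%&'()*+,-./:;<=>?_ ")


def code_point(column, row):
    """Convert ISO 646 column/row notation (e.g. 5/11) to a decimal code."""
    return column * 16 + row


def fragile_chars(text):
    """Return the distinct characters of `text` outside SAFE, in order of first use."""
    seen = []
    for ch in text:
        if ch not in SAFE and ch not in seen:
            seen.append(ch)
    return seen


# The five SGML delimiters from the table, as (column, row) positions.
SGML_FRAGILE = {"MDO": (2, 1), "DSO": (5, 11), "DSC": (5, 13), "RNI": (2, 3), "OR": (7, 12)}

if __name__ == "__main__":
    for name, (col, row) in SGML_FRAGILE.items():
        n = code_point(col, row)
        print(name, f"{col}/{row}", n, chr(n), "safe" if chr(n) in SAFE else "fragile")
    # A made-up declaration, used only to exercise the check.
    print(fragile_chars("<!DOCTYPE memo [ <!ELEMENT memo (#PCDATA) > ]>"))
```

- Running fragile_chars over a file before mailing it gives a quick
- warning, though passing the check guarantees nothing about any
- particular gateway.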
-
- In short, I think Allan is correct: the mutation caused by (mostly
- IBM EBCDIC) machines on the networks can map two different characters
- onto ONE character, so the process is irreversible. In a word: use
- UUencode at your own peril. I use it, but I never EXPECT it to work,
- and sometimes indeed it fails.
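-
- Why a many-to-one mapping cannot be undone is easy to show in a few
- lines of Python. The GATEWAY table below is HYPOTHETICAL: it simply
- imitates Allan's report by turning both square brackets into a vertical
- bar, and real gateways differ. The function names are likewise invented
- for this sketch.

```python
# HYPOTHETICAL gateway table, for illustration only: both square brackets
# arrive as a vertical bar, as in Allan's report. Real gateways differ.
GATEWAY = {"[": "|", "]": "|"}


def through_gateway(text):
    """Apply the (hypothetical) character mutation to every character of `text`."""
    return "".join(GATEWAY.get(ch, ch) for ch in text)


def possible_originals(received):
    """Enumerate every string that through_gateway() would turn into `received`."""
    candidates = [""]
    for ch in received:
        # Characters the gateway rewrites into `ch` ...
        sources = [src for src, dst in GATEWAY.items() if dst == ch]
        # ... plus `ch` itself, if the gateway passes it through unchanged.
        if ch not in GATEWAY:
            sources.append(ch)
        candidates = [prefix + s for prefix in candidates for s in sources]
    return candidates


if __name__ == "__main__":
    received = through_gateway("M[3]")
    print(received)
    print(possible_originals(received))
```

- With this table, each vertical bar in the received text has three
- possible sources ([, ], or a genuine |), so the number of candidate
- originals triples with every bar and the receiver has no way to pick
- the right one.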
-
- Robin Cover
-