NetNews Usenet Archive 1992 #19

Text File | 1992-08-26 | 3.6 KB | 77 lines

Newsgroups: comp.lang.rexx
Path: sparky!uunet!mcsun!sunic!aun.uninett.no!ugle.unit.no!ugle!anders
From: anders@lise3.lise.unit.no (Anders Christensen)
Subject: Re: Blanks, REXX, and portability...
In-Reply-To: Otto Stolz's message of Wed, 26 Aug 1992 18:41:09 MEZ
Message-ID: <ANDERS.92Aug27004039@lise3.lise.unit.no>
Sender: news@ugle.unit.no (NetNews Administrator)
Organization: /home/flipper/anders/.organization
References: <REXXLIST%92082621432914@DEARN>
Date: 27 Aug 92 00:40:39
Lines: 64

I completely agree with Otto Stolz, both on the description of the
current situation, and on what should be done. His list of checkpoints
is very timely and should be addressed in the coming ANSI standard.

However, on one point I disagree:

In article <REXXLIST%92082621432914@DEARN> Otto Stolz <RZOTTO@DKNKURZ1.BITNET> writes:

> Note 1: Another recent contribution to REXXLIST stated that REXX source
>         code, and REXX operands, might be represented in different
>         character codes (perhaps including different notions of white
>         space). A cursory scan through TRL did not reveal any support
>         for this statement.

To quote TRL:

   "Programming in the REXX language can be considered to involve the
    use of two character sets. The first is used for expressing the
    REXX program itself, and is the relatively small set of characters
    described in the next section. The second character set is the set
    of characters that can be used as data by a particular
    implementation of a REXX language processor. This character set
    may be limited in size (often to a limit of 256 different
    characters, which have a convenient 8-bit representation), or it
    may be much larger. Usually, most or all of these characters are
    also allowed within a REXX program, but only within commentary or
    immediate (literal) data."

    TRL, 2nd ed., Part 2, section 1, page 18, first paragraph of
    "Character Sets".

Of course, the term "two character sets" does not mean that the Rexx
script is written in EBCDIC and the data is in ASCII.
Rather, it means that the characters allowed in a Rexx script are a
subset of the characters that the interpreter can handle as data. For
instance, ISO 646 (7-bit ASCII) can be allowed in the Rexx script
(except for comments and strings), while ISO 8859-1 is the character
set that forms the data.

Consequently, Rexx may allow a smaller set of whitespace characters in
the Rexx script source code (except for comments and strings), while
it may recognize a much broader set of whitespace characters in the
data.

Even including comments and strings, the total character set that can
be used in a Rexx script can be smaller than the character set handled
by the interpreter. End-of-line characters (on the machines that use
such) are allowed in data, but explicitly not allowed in strings in a
Rexx script.

For instance, Tab (ASCII 9) and Space (ASCII 32) could be interpreted
as whitespace in Rexx source code, while Tab, Space and Non-breakable
space (code 160 in ISO 8859-1) could be interpreted as whitespace in
data.

Btw, I don't remember saying that source code and data (or operands,
as you put it) should be represented by different character codes.
What I did say (or at least what I meant) was that source code and
data should be regarded as using two character sets.

>         To me, this tight coupling suggests that source program and
>         operands should ideally be expressed in the same character
>         code. [...]

It is, except that the source program can only use a subset of the
character codes that the data can use.

-anders
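
The Tab/Space/Non-breakable-space example above can be sketched in a
few lines. This is only an illustration, written in Python rather than
Rexx; the names SOURCE_WHITESPACE, DATA_WHITESPACE and split_words are
hypothetical and come neither from TRL nor from any Rexx interpreter.
It assumes that the data character set is ISO 8859-1 and that the
source whitespace set is a proper subset of the data whitespace set.

```python
# Hypothetical illustration; not taken from TRL or any Rexx interpreter.

# Whitespace recognized in program source (outside comments and strings).
SOURCE_WHITESPACE = frozenset({"\t", " "})        # code points 9 and 32

# Whitespace recognized in data: a superset of the source set.
DATA_WHITESPACE = SOURCE_WHITESPACE | {"\u00a0"}  # adds code point 160


def split_words(text, whitespace):
    """Split text into words, treating only the given characters as blanks."""
    words, current = [], []
    for ch in text:
        if ch in whitespace:
            # A blank ends the current word, if there is one.
            if current:
                words.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        words.append("".join(current))
    return words


sample = "say\thello\u00a0world"
print(split_words(sample, SOURCE_WHITESPACE))  # no-break space stays inside a word
print(split_words(sample, DATA_WHITESPACE))    # no-break space separates words
```

The same string thus splits into two words under the source rules and
three under the data rules, which is the sense in which the two
character sets differ without being two different character codes.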