- Newsgroups: comp.lang.rexx
- Path: sparky!uunet!mcsun!sunic!aun.uninett.no!ugle.unit.no!ugle!anders
- From: anders@lise3.lise.unit.no (Anders Christensen)
- Subject: Re: Blanks, REXX, and portability...
- In-Reply-To: Otto Stolz's message of Wed, 26 Aug 1992 18:41:09 MEZ
- Message-ID: <ANDERS.92Aug27004039@lise3.lise.unit.no>
- Sender: news@ugle.unit.no (NetNews Administrator)
- Organization: /home/flipper/anders/.organization
- References: <REXXLIST%92082621432914@DEARN>
- Date: 27 Aug 92 00:40:39
- Lines: 64
-
- I completely agree with Otto Stolz, both on the description of the
- current situation, and on what should be done. His list of checkpoints
- is very timely and should be addressed in the coming ANSI standard.
- However, on one point I disagree:
-
- In article <REXXLIST%92082621432914@DEARN> Otto Stolz <RZOTTO@DKNKURZ1.BITNET> writes:
-
- > Note 1: Another recent contribution to REXXLIST stated that REXX source
- > code, and REXX operands, might be represented in different
- > character codes (perhaps including different notions of white
- > space). A cursory scan through TRL did not reveal any support
- > for this statement.
-
- To quote TRL:
-
- "Programming in the REXX language can be considered to involve the
- use of two character sets. The first is used for expressing the
- REXX program itself, and is the relatively small set of characters
- described in the next section. The second character set is the set
- of characters that can be used as data by a particular
- implementation of a REXX language processor. This character set may
- be limited in size (often to a limit of 256 different characters,
- which have a convenient 8-bit representation), or it may be much
- larger. Usually, most or all of these characters are also allowed
- within a REXX program, but only within commentary or immediate
- (literal) data."
- TRL, 2nd ed.
- Part 2, section 1, page 18.
- First paragraph of "Character Sets".
-
- Of course, the term "two character sets" does not mean that the Rexx
- script is written in EBCDIC and the data is in ASCII. Rather, it means
- that the characters allowed in a Rexx script are a subset of the
- characters that the interpreter can handle as data. For instance, ISO
- 646 (7-bit ASCII) could be the set allowed in the Rexx script (outside
- comments and strings), while ISO 8859-1 is the character set that
- forms the data.
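
The subset relation can be made concrete with a small sketch. It is written in Python rather than Rexx purely for illustration, and the names SOURCE_CHARS, DATA_CHARS and outside_source_alphabet are hypothetical; the two alphabets are simply the ISO 646 / ISO 8859-1 example above, not what any particular interpreter does.

```python
# Assumed alphabets, taken from the ISO 646 / ISO 8859-1 example only.
SOURCE_CHARS = frozenset(chr(c) for c in range(128))  # 7-bit codes 0..127
DATA_CHARS = frozenset(chr(c) for c in range(256))    # 8-bit codes 0..255

# The source alphabet is a proper subset of the data alphabet.
assert SOURCE_CHARS < DATA_CHARS


def outside_source_alphabet(text):
    """Return the characters of text that are valid data but not valid
    outside comments and strings in the script, in order of appearance."""
    return [ch for ch in text if ch in DATA_CHARS and ch not in SOURCE_CHARS]
```

Under these assumptions, a clause such as "say café" would be flagged for the single character "é" if it appeared outside a comment or string, while the same character is acceptable as data.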
-
- Consequently, Rexx may allow a smaller set of whitespace characters in
- Rexx script source code (outside comments and strings), while it may
- recognize a much broader set of whitespace characters in the data.
- Even counting comments and strings, the total character set usable in
- a Rexx script can be smaller than the character set handled by the
- interpreter. End-of-line characters (on machines that use them) are
- allowed in data, but explicitly not allowed in strings in a Rexx
- script.
-
- For instance, Tab (code 9) and Space (code 32) could be interpreted
- as whitespace in Rexx source code, while Tab, Space and no-break
- space (code 160 in ISO 8859-1) could be interpreted as whitespace in
- data.
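
A rough sketch of that example, again in Python and purely illustrative: SOURCE_BLANKS, DATA_BLANKS and split_words are hypothetical names, and the two blank sets are the ones assumed in the paragraph above rather than anything mandated by TRL.

```python
# Assumed whitespace sets from the example: Tab and Space in source code,
# plus no-break space (code 160 in ISO 8859-1) in data.
SOURCE_BLANKS = frozenset({"\t", " "})
DATA_BLANKS = SOURCE_BLANKS | {"\u00a0"}


def split_words(text, blanks):
    """Split text into words, treating every character in blanks as a
    separator and collapsing runs of separators."""
    words, current = [], []
    for ch in text:
        if ch in blanks:
            if current:
                words.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        words.append("".join(current))
    return words
```

With these sets, the same string is split differently depending on which notion of whitespace applies: a no-break space separates words under DATA_BLANKS but stays inside a word under SOURCE_BLANKS.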
-
- Btw, I don't remember saying that source code and data (or operands,
- as you put it) should be represented by different character codes.
- What I did say (or at least what I meant) was that source code and
- data should be drawn from two character sets.
-
- > To me, this tight coupling suggests that source program and
- > operands should ideally be expressed in the same character
- > code. [...]
-
- It is, except that the source program can only use a subset of the
- character codes that the data can use.
-
- -anders
-