- Path: sparky!uunet!gatech!purdue!yuma!csn!stortek!LSTC2VM.stortek.com!SCHMIDT
- From: SCHMIDT@LSTC2VM.stortek.com (Jon Schmidt)
- Newsgroups: comp.lang.rexx
- Subject: Re: Blanks, REXX, and portability...
- Message-ID: <16851927C.SCHMIDT@LSTC2VM.stortek.com>
- Date: 28 Aug 92 16:24:57 GMT
- References: <9208260321.AA05688@SERVER.uwindsor.ca> <ANDERS.92Aug26105620@lise3.lise.unit.no>
- Sender: usenet@stortek.com
- Organization: StorageTek SW Engineering
- Lines: 67
-
- In article <ANDERS.92Aug26105620@lise3.lise.unit.no>
- anders@lise3.lise.unit.no (Anders Christensen) writes:
- [stuff deleted]
- >In ASCII, the following characters are often considered 'whitespace',
- >listed in decreasing order of 'whitespaceness' (codes in decimal)
- >
- > ascii 32 - space
- > ascii 9 - HT (horizontal tab)
- > ascii 10 - LF (line feed)
- > ascii 13 - CR (carriage return)
- > ascii 12 - NP (new page, or FF - formfeed)
- > ascii 11 - VT (vertical tab)
- >
- >There might be even more. And worse, in some modes, I think characters
- >above 128 are space characters, like hard-space (a space that can not
- >be divided between lines). In particular the HT is considered
- >whitespace, since it is conceptually a number of compressed space
- >characters (customarily 2-8).
- >
- [stuff deleted]
- >By the way ... I really can't see the problem? Unix generates tabs
- >and spaces as whitespace, the Unix rexx interpreter interprets both as
- >blanks. No problem!
- [stuff deleted]
-
- I come from CMS and just recently began using a commercial REXX
- interpreter under UNIX. The manual for this interpreter always
- uses the words "blank" and "blanks" but never suggests that a blank
- might be anything other than a space character. My meager UNIX
- experience had shown that there are some text files where TAB and
- SPACE characters are not "equivalent", such as makefiles. I was
- therefore astonished to find that the UNIX REXX expression
- ('09'X='20'X) yielded a "true" result. I later wrote a little
- UNIX REXX program (see below) to investigate this anomalous (to me)
- phenomenon. I was shocked to discover that this UNIX REXX
- interpreter considered 41 of the 256 possible 8-bit codes in the
- range '00'X to 'FF'X to be blanks! 41? Where did this come from?
- As an exercise, readers are challenged to write a portable REXX
- function that returns the position of the Nth blank within a string,
- where N and the string are passed as arguments to the function.
-
- Here's my little program, written to explore the commercial UNIX
- REXX interpreter's definition and handling of blanks:
-
- #!/usr/local/bin/rxx
- blank=LEFT('',1) /* Get a REXX-defined blank */
- SAY 'Test 1 result:' ('09'X='20'X) /* 1 */
- SAY 'Test 2 result:' ('09'X=='20'X) /* 0 */
- SAY 'Test 3 result:' (blank='09'X) /* 1 */
- SAY 'Test 4 result:' (blank=='09'X) /* 0 */
- SAY 'Test 5 result:' (blank='20'X) /* 1 */
- SAY 'Test 6 result:' (blank=='20'X) /* 1 */
- SAY 'Test 7 result:' COMPARE(blank,'09'X||'20'X) /* 1 */
- SAY 'Test 8 result:' POS(blank,'09'X||'20'X) /* 2 */
- SAY 'Test 9 result:' BlankCount(XRANGE('00'X,'FF'X)) /* 41 */
- EXIT 0
- BlankCount: PROCEDURE /* Count blanks in a string */
-   nonblanks=0
-   DO n=1 TO WORDS(ARG(1))
-     nonblanks=nonblanks+WORDLENGTH(ARG(1),n)
-   END
-   RETURN LENGTH(ARG(1))-nonblanks
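-
- Test 9 only counts the blanks. A minimal sketch of how one might also
- list which codes are being treated as blanks follows. It assumes that
- a one-character string for which WORDS() returns 0 is a blank, which
- is the same word-based definition BlankCount relies on. The variable
- names are illustrative only, and the output will differ from one
- interpreter to the next.

```rexx
/* List every 8-bit code the interpreter's word functions treat as blank */
all = XRANGE('00'X, 'FF'X)          /* all 256 codes, in order */
list = ''
DO i = 1 TO LENGTH(all)
  c = SUBSTR(all, i, 1)
  /* Assumption: a single character that contains no words is a blank */
  IF WORDS(c) = 0 THEN list = list C2X(c)
END
SAY 'Codes treated as blanks (hex):' list
```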
-
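- One possible answer to the challenge above, offered as a sketch rather
- than a definitive solution. It is portable only in the sense that it
- defers to the interpreter's own idea of a blank: a character counts as
- a blank when WORDS() finds no word in it. The name NthBlank and the
- convention of returning 0 when the string holds fewer than N blanks
- are my own assumptions.

```rexx
/* NthBlank(n, string): position of the Nth blank in string,       */
/* or 0 if string holds fewer than n blanks (assumed convention).  */
NthBlank: PROCEDURE
  n = ARG(1)
  string = ARG(2)
  count = 0
  DO i = 1 TO LENGTH(string)
    /* Assumption: a single character that contains no words is a blank */
    IF WORDS(SUBSTR(string, i, 1)) = 0 THEN DO
      count = count + 1
      IF count = n THEN RETURN i
    END
  END
  RETURN 0
```

- For example, NthBlank(2, 'a b c') returns 4 wherever a space is a
- blank. On an interpreter that also treats TAB as a blank, a TAB in
- the string is counted in exactly the same way as a space.
-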
- --
- Jon_Schmidt@stortek.com Storage Technology Corporation
- (303) 673-3581 - voice 2270 South 88th Street
- (303) 673-6039 - fax Louisville, Colorado 80028-5209
-