An Overview of the Icon Programming
Language*
Ralph E. Griswold
TR 83-3c
May 13, 1983; last revised May 1, 1986
Department of Computer Science
The University of Arizona
Tucson, Arizona 85721
*This work was supported by the National Science Foundation under
Grant DCR-8401831.
1. Introduction
Icon is a high-level programming language with extensive
facilities for processing strings and lists. Icon has several
novel features, including expressions that may produce sequences
of results, goal-directed evaluation that automatically searches
for a successful result, and string scanning that allows opera-
tions on strings to be formulated at a high conceptual level.
Icon resembles SNOBOL4 [1] in its emphasis on high-level
string processing and in a design philosophy that favors ease of
programming and short, concise programs. As in SNOBOL4, storage
allocation and garbage collection are automatic in Icon, and
there are few restrictions on the sizes of objects. Strings,
lists, and other structures are created during program execution,
and their sizes need not be known when a program is written.
Values are converted to expected types automatically; for
example, numeral strings read in as input can be used in
numerical computations without explicit conversion. Whereas SNOBOL4
has a pattern-matching facility that is separate from the rest of
the language, string scanning in Icon is integrated with the rest
of the language facilities. Unlike SNOBOL4, Icon has an
expression-based syntax with reserved words; in appearance, Icon
programs resemble those of several other conventional programming
languages.
Examples of the kinds of problems for which Icon is well
suited are:
- text analysis, editing, and formatting
- document preparation
- symbolic mathematics
- text generation
- parsing and translation
- data laundry
- graph manipulation
Version 6 of Icon, the most recent version, is implemented in
C [2]. There are UNIX* implementations for the Amdahl 580, the
AT&T 3B series, the HP 9000, the IBM PC/XT/AT, the PDP-11, the
Ridge 32, the Sun Workstation, and the VAX-11. There is also a
VMS implementation for the VAX-11 and a DOS implementation for
personal computers. Other implementations are in progress.
*UNIX is a trademark of AT&T Bell Laboratories.
A brief description of some of the representative features of
Icon is given in the following sections. This description is not
rigorous and does not include many features of Icon. See [3] for
a complete description and [4] for a description of recent
changes to the language.
2. Strings
Strings of characters may be arbitrarily long, limited only by
the architecture of the computer on which Icon is implemented. A
string may be specified literally by enclosing it in double quo-
tation marks, as in
greeting := "Hello world"
which assigns an 11-character string to greeting, and
address := ""
which assigns the zero-length empty string to address. The
number of characters in a string s, its size, is given by *s. For
example, *greeting is 11 and *address is 0.
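As a minimal sketch, the literals and sizes above can be checked with a complete program. The enclosing main procedure is added here only so that the fragment can be run as written.

```icon
procedure main()
   greeting := "Hello world"    # an 11-character string
   address := ""                # the empty string
   write(*greeting)             # writes 11
   write(*address)              # writes 0
end
```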
Icon uses the ASCII character set, extended to 256 characters.
There are escape conventions, similar to those of C, for
representing characters that cannot be keyboarded.
Strings also can be read in and written out, as in
line := read()
and
write(line)
Strings can be constructed by concatenation, as in
element := "(" || read() || ")"
If the concatenation of a number of strings is to be written out,
the write function can be used with several arguments to avoid
actual concatenation:
write("(",read(),")")
Substrings can be formed by subscripting strings with range
specifications that indicate, by position, the desired range of
characters. For example,
middle := line[10:20]
assigns to middle the substring of line between positions 10
and 20. Positions lie between characters, so this substring
consists of characters 10 through 19. Similarly,
write(line[2])
writes the second character of line. The value 0 is used to
refer to the position after the last character of a string. Thus
write(line[2:0])
writes the substring of line from the second character to the
end, thus omitting the first character.
An assignment can be made to a substring of a string-valued
variable to change its value. For example,
line[2] := "..."
replaces the second character of line by three dots. Note that
the size of line changes automatically.
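The subscripting forms above can be tried together in a short sketch. The sample value of line is an assumption made for illustration. Positions are numbered between characters, so line[3:6] selects the third through fifth characters.

```icon
procedure main()
   line := "abcdefghij"     # sample value, assumed for illustration
   write(line[2])           # writes b
   write(line[2:0])         # writes bcdefghij
   write(line[3:6])         # writes cde
   line[2] := "..."         # replace one character by three
   write(line)              # writes a...cdefghij
   write(*line)             # writes 12; the size grew from 10
end
```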
There are many functions for analyzing strings. An example is
find(s1,s2)
which produces the position in s2 at which s1 occurs as a sub-
string. For example, if the value of greeting is as given ear-
lier,
find("or",greeting)
produces the value 8. See Section 4.2 for the handling of situa-
tions in which s1 does not occur in s2, or in which it occurs at
several different positions.
3. Character Sets
While strings are sequences of characters, csets are sets of
characters in which membership rather than order is significant.
Csets are represented literally using single enclosing quotation
marks, as in
vowels := 'aeiouAEIOU'
Two useful built-in csets are &lcase and &ucase, which consist of
the lowercase and uppercase letters, respectively. Set opera-
tions are provided for csets. For example,
letters := &lcase ++ &ucase
forms the cset union of the lowercase and uppercase letters and
assigns the resulting cset to letters, while
consonants := letters -- 'aeiouAEIOU'
forms the cset difference of the letters and the vowels and
assigns the resulting cset to consonants.
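A small sketch of these set operations follows. It relies on the size operator *, which applies to csets as well as to strings. The main procedure is only scaffolding to make the fragment runnable.

```icon
procedure main()
   vowels := 'aeiouAEIOU'
   letters := &lcase ++ &ucase        # cset union
   consonants := letters -- vowels    # cset difference
   write(*letters)                    # writes 52
   write(*consonants)                 # writes 42
end
```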
Csets are useful in situations in which any one of a number of
characters is significant. An example is the string analysis
function
upto(c,s)
which produces the position in s at which any character in c occurs.
For example,
upto(vowels,greeting)
produces 2. Another string analysis function that uses csets is
many(c,s)
which produces the position in s after an initial substring
consisting only of characters that occur in c. An example of the
use of many is in locating words. Suppose, for example, that a
word is defined to consist of a string of letters. The
expression
write(line[1:many(letters,line)])
writes a word at the beginning of line. Note the use of the posi-
tion returned by a string analysis function to specify the end of
a substring.
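The two analysis functions can be exercised as in the sketch below. The sample value of line is an assumption chosen so that the results are easy to check by hand.

```icon
procedure main()
   vowels := 'aeiouAEIOU'
   letters := &lcase ++ &ucase
   line := "Hello world"                  # sample value, assumed for illustration
   write(upto(vowels, line))              # writes 2, the position of e
   write(many(letters, line))             # writes 6, the position after Hello
   write(line[1:many(letters, line)])     # writes Hello
end
```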
4. Expression Evaluation
4.1 Conditional Expressions
In Icon there are conditional expressions that may succeed and
produce a result, or may fail and not produce any result. An
example is the comparison operation
i > j
which succeeds (and produces the value of j) provided that the
value of i is greater than the value of j, but fails otherwise.
The success or failure of conditional operations is used
instead of Boolean values to drive control structures in Icon. An
example is
if i > j then k := i else k := j
which assigns the value of i to k if the value of i is greater
than the value of j, but assigns the value of j to k otherwise.
The usefulness of the concepts of success and failure is
illustrated by find(s1,s2), which fails if s1 does not occur as a
substring of s2. Thus
if i := find("or",line) then write(i)
writes the position at which "or" occurs in line, if it occurs, but
does not write a value if it does not occur.
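The following sketch shows both outcomes of find driving an if-then-else expression. The sample value of line and the search string "xyz" are assumptions made for illustration.

```icon
procedure main()
   line := "Hello world"             # sample value, assumed for illustration
   if i := find("or", line) then
      write(i)                       # find succeeds: writes 8
   if i := find("xyz", line) then
      write(i)
   else
      write("no match")              # find fails: the else branch runs
end
```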
Many expressions in Icon are conditional. An example is
read(), which produces the next line from the input file, but
fails when the end of the file is reached. The following expres-
sion is typical of programming in Icon and illustrates the
integration of conditional expressions and conventional control
structures:
while line := read() do
   write(line)
This expression copies the input file to the output file.
If an argument of a function fails, the function is not
called, and the function call fails as well. This "inheritance"
of failure allows the concise formulation of many programming
tasks. If the optional do clause of while-do is omitted, the
previous expression can be rewritten as
while write(read())
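Inheritance of failure can also be seen without reading any input, as in the sketch below. The values of i and j are assumptions chosen for illustration. A successful comparison passes the value of its right operand to write, while a failed comparison prevents the call altogether.

```icon
procedure main()
   i := 5
   j := 3
   write(i > j)      # comparison succeeds: writes 3, the value of j
   write(i < j)      # comparison fails: write is not called
   write("done")     # execution continues after the failed call
end
```

Run as written, this sketch writes 3 and then done.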
4.2 Generators
In some sit