QTAwk
Utility Creation Tool
For: PC/MS-DOS & OS/2
Version 6.00 PC/MS_DOS 08-04-94
Version 1.00 OS/2 08-04-94
Sunday, January 08, 1995
(c) Copyright 1988 - 1994 Pearl Boldt
Darnestown, MD 20878
QTAwk License
Utility Creation Program
Version 6.00 PC/MS_DOS 08-04-94
Version 1.00 OS/2 08-04-94
(c) Copyright 1988 - 1994 Pearl Boldt. All Rights Reserved.
Pearl Boldt
Quik Trim
13012 Birdale Lane
Darnestown, MD 20878
CompuServe ID: 72040.434
1. Copyright: The QTAwk program and all other programs and
documentation distributed or shipped with it are Copyright
1988 - 1994 Pearl Boldt and are protected by U.S. and
International Copyright law. In the rest of this document,
this collection of programs is referred to simply as
"QTAwk".
2. Shareware copies of QTAwk are distributed to allow you to
try the program before you pay for it. They are Copyright
1988 - 1994, Pearl Boldt and do not constitute "free" or
"public domain" software. You may use QTAwk for personal,
noncommercial use. You may give the shareware version of
QTAwk to others for noncommercial use IF:
■ As a minimum the following files are included:
a) qtawk.exe ==> QTAwk executable
b) qtawk.doc ==> QTAwk Documentation
c) diffdoc.fmt ==> differences between QTAwk and Awk
d) license.fmt ==> license agreement for QTAwk usage
e) order.frm ==> order form for registered versions of
QTAwk
f) qtautl.zip ==> zipped file containing QTAwk
utilities
g) qtawk.his ==> version history for QTAwk
h) readme.doc ==> this file
■ The Files Are Not Modified In Any Way.
QTAwk - iii - QTAwk
3. Use of QTAwk: QTAwk is a powerful program. While we have
attempted to build in reasonable safeguards, if you do not
use QTAwk properly you may destroy files or cause other
damage to your computer software and data. You assume full
responsibility for the selection and use of QTAwk to achieve
your intended results.
The shareware version of QTAwk may be freely used and shared
with others for personal, noncommercial use. There is no
REQUIRED registration fee for noncommercial/non-institutional
personal use.
However, if you find the QTAwk package of value, a gift of
$50.00 (US) or any amount would be greatly appreciated.
Please remember that improvements to the QTAwk package cannot
happen without your support. You can help by sharing QTAwk
with others.
You may, of course, register QTAwk by ordering a standard
site license. Please see the ORDER.FRM file for details.
4. All warranties as to this software, whether express or
implied, are disclaimed, including without limitation any
implied warranties of merchantability, fitness for a
particular purpose, functionality or data integrity or
protection.
Questions may be sent to:
Pearl Boldt
Quik Trim
13012 Birdale Lane
Darnestown, MD 20878
CompuServe ID: 72040.434
QTAwk 6.00 PC/MS_DOS Order Form
QTAwk 1.00 OS/2 Order Form
Utility Creation Program
Version 6.00 PC/MS_DOS 08-04-94
Version 1.00 OS/2 08-04-94
(c) Copyright 1988 - 1994 Pearl Boldt. All Rights Reserved.
Return to:
Pearl Boldt
Quik Trim
13012 Birdale Lane
Darnestown, MD 20878
Make all Checks Payable to: Pearl Boldt
Name:
Company:
Address:
Phone:
Register QTAwk to: Company (___) or Individual (___)
Send information on: Site Licenses (___), Reseller Pricing (___)
I have read and agree to abide by the QTAwk license agreement,
Signature:
Where did you hear about QTAwk?
Disk Size: ___ 5.25" (1.2 Mbytes), ___ 3.5" (1.44 Mbytes)
Quantity Price
Registered Version ($50.00 (US)/copy): ________ $ ________.____
Subtotal $ ________.____
Shipping charges, per copy: $ ________.____
Total enclosed: $ ________.____
US standard - included
US 2-day - $8.00 (US)
Canada (air) - $5.00 (US)
All Others (air) - $10.00 (US)
===> Please read the following before ordering! <===
Order Information
International Orders:
Orders from outside the U.S. must be paid by a check or money
order in U.S. funds and drawn on a U.S. bank; or by an
international postal money order in U.S. dollars. Checks which
are not in U.S. funds and drawn on a U.S. bank will be returned
due to extremely high charges imposed by U.S. banks to collect
the funds. Purchase orders (minimum $200) can be accepted from
outside the U.S., but you must contact us before ordering.
Company Purchase Orders:
Purchase orders for amounts of $100 and over are accepted from
established U.S. companies; orders under $100 are accepted but
must be prepaid. Have your purchasing agent contact Pearl Boldt
for terms. Credit references will be required for new
customers.
Multi-System Licenses:
Multi-system licensing arrangements are available for network,
site, and corporate use of QTAwk. Check the line on the order
form or contact us for more information. A sample schedule of
license fees is below; contact us for pricing on the exact number
of systems you wish to license. The fee includes a master
diskette.
Systems      Price      Systems      Price      Systems      Price
   2         90.00        15        675.00        50      2,250.00
   3        135.00        20        900.00        60      2,700.00
   4        155.00        25      1,125.00        70      3,150.00
   5        180.00        30      1,350.00        80      3,600.00
  10        450.00        40      1,800.00       100      4,500.00
Return to:
Pearl Boldt
Quik Trim
13012 Birdale Lane
Darnestown, MD 20878
Make all Checks Payable to: Pearl Boldt
QTAwk Update History
==> QTAwk Version 6.00 for DOS and version 1.00 for OS/2, dated
05/01/94. This version contains several changes and additions
from the previous versions:
1. Arrays have been fully integrated into the match operators,
'~~' and '!~', both their direct use and their implied use in
patterns and as arguments of the functions 'match', 'sub', 'gsub'
and 'split'. The use of an array as the operand of a match
operator will match against all elements of the array as separate
regular expressions. This is similar to the use of the GROUP
keyword in patterns.
If a match is found, the new variable MATCH_INDEX is set to the
string value of the index in the array of the matching regular
expression. If a multidimensional array is used, the array
indices are separated by the value of the built-in variable
SUBSEP (which has been re-introduced from Awk with a slightly
different use).
In addition, the use of arrays for the built-in variables RS and
FS enables the user to specify multiple regular expressions for
use as record separators and/or field separators. The use of
arrays for RS and/or FS does not affect the value of
MATCH_INDEX.
Arrays used for regular expression matching retain their
internal regular expression form until the whole array or an
array element is changed. Thus arrays can be used as dynamic
regular expressions for which the user controls when the internal
form is changed.
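The effect of supplying several record separators at once can be
modelled outside QTAwk. The following is a minimal Python sketch, not
QTAwk code: the function name 'split_records' is hypothetical, and it
assumes that the separators never match the empty string. Each record
is returned together with the text that terminated it, which is the
role the variable RT plays in QTAwk.

```python
import re


def split_records(text, separators):
    """Split text into (record, terminator) pairs.

    `separators` is a list of regular expression strings, any of which
    may end a record. This models an array assigned to RS; it is an
    illustration, not QTAwk's actual algorithm.
    """
    combined = re.compile("|".join(f"(?:{s})" for s in separators))
    records = []
    start = 0
    for m in combined.finditer(text):
        if m.end() == m.start():
            continue  # skip empty matches so a record always ends on real text
        records.append((text[start:m.start()], m.group(0)))
        start = m.end()
    if start < len(text):
        # Trailing text with no terminator forms the last record.
        records.append((text[start:], ""))
    return records


if __name__ == "__main__":
    for record, terminator in split_records("a;b\n\nc", [";", r"\n+"]):
        print(repr(record), repr(terminator))
```

Note that Python tries the alternatives in the order given, so where
two separators could match at the same position the first one listed
wins. The source does not say how QTAwk resolves that case.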
2. New algorithms have been developed and used for regular
expression matching. A total of four different algorithms are
used for pattern matching. QTAwk automatically selects the
appropriate algorithm to optimize the search depending on the
regular expression(s) to be matched. The algorithm used is
selected based on the number of regular expression(s) to match
and the complexity of the regular expression(s).
3. The new variable MATCH_INDEX is defined. This variable is
set to the string value of the index of the matching element when
an array is used for matching.
4. The variable SUBSEP from Awk is re-introduced with a default
value of a comma character, ','. The value of SUBSEP is used to
separate the index values in MATCH_INDEX when a
multidimensional array is used for matching.
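The relationship between an array of patterns, MATCH_INDEX and SUBSEP
can be sketched as follows. This is a Python model under stated
assumptions, not QTAwk code: the function 'match_array' is
hypothetical, a dictionary stands in for the QTAwk array, a tuple key
stands in for a multidimensional index, and the elements are tried in
insertion order (the source does not say which element wins when
several match).

```python
import re

SUBSEP = ","  # default value given in the text above


def match_array(text, patterns):
    """Return the index string of the first pattern matching `text`.

    `patterns` maps an index (a scalar, or a tuple standing in for a
    multidimensional index) to a regular expression string.
    Returns None when no element matches.
    """
    for index, pattern in patterns.items():
        if re.search(pattern, text):
            if isinstance(index, tuple):
                # Multidimensional index: join the parts with SUBSEP.
                return SUBSEP.join(str(part) for part in index)
            return str(index)
    return None


if __name__ == "__main__":
    print(match_array("an error occurred", {1: "warn", 2: "err"}))
    print(match_array("test string 2,3", {(2, 3): "string 2,3"}))
```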
5. A new method of file processing has been introduced. In the
current and previous versions, the following process is carried
out:
a) Determine next record according to RS or RECLEN, see below
for an explanation of RECLEN,
b) Read next record,
c) Parse record into fields according to FS or FIELDWIDTHS, see
below for an explanation of FIELDWIDTHS,
d) Execute each pattern expression
   1: If pattern expression is TRUE, execute associated action
For QTAwk utilities in which all patterns contain a regular
expression match or for those files for which actions are
executed only for those records matching a set of one or more
regular expressions, the above process for each record can be
time consuming. It would be much faster to scan the input file
for matches to the desired regular expression(s) and then execute
each pattern expression once such a record has been found. This
by-passes the time consuming process of reading individual
records and parsing each into fields. Only the desired records
need to be read and parsed with the new method, thus saving much
time in the execution of the QTAwk utility.
QTAwk Version 6.00 for PC/MS-DOS and Version 1.00 for OS/2
implements the new search method. Two new variables:
a) FILE_SEARCH
b) FILE_SEARCH_PAT
have been introduced for this purpose. When FILE_SEARCH is
TRUE, the next record read will be the record matching a regular
expression from FILE_SEARCH_PAT. If FILE_SEARCH is FALSE, the
normal file input process described above is followed. The new
file search process may be turned on and off as necessary for a
single input file in this manner.
FILE_SEARCH_PAT is set by the user utility to one or more
regular expressions against which records from the current input
file are matched. FILE_SEARCH_PAT may be set to a single regular
expression as a simple variable, e.g.,
■ FILE_SEARCH_PAT = /test string/;
or a singly dimensioned array, e.g.,
■ FILE_SEARCH_PAT[1] = /test string 1/;
■ FILE_SEARCH_PAT[2] = /test string 2/;
■ FILE_SEARCH_PAT[3] = /test string 3/;
■ FILE_SEARCH_PAT[4] = /test string 4/;
or a multidimensioned array, e.g.,
■ FILE_SEARCH_PAT[1][1] = /test string 1,1/;
■ FILE_SEARCH_PAT[1][2] = /test string 1,2/;
■ FILE_SEARCH_PAT[1][3] = /test string 1,3/;
■ FILE_SEARCH_PAT[2][1] = /test string 2,1/;
■ FILE_SEARCH_PAT[2][2] = /test string 2,2/;
■ FILE_SEARCH_PAT[2][3] = /test string 2,3/;
■ FILE_SEARCH_PAT[3][1] = /test string 3,1/;
■ FILE_SEARCH_PAT[3][2] = /test string 3,2/;
■ FILE_SEARCH_PAT[3][3] = /test string 3,3/;
When FILE_SEARCH is TRUE, the current input file is scanned for
a match to FILE_SEARCH_PAT. When a record is found matching a
regular expression in FILE_SEARCH_PAT, the record is read, parsed
into fields according to FS or FIELDWIDTHS and each pattern
expression executed. The associated actions for TRUE pattern
expressions are executed. Note that the variables RS or RECLEN
still determine the parsing of the input file into records.
Under some circumstances, the above process can return in '$0'
multiple records from the current input file. In searching the
input file for a match to FILE_SEARCH_PAT, a match may span more
than one record if the new variable, SPAN_RECORDS, is TRUE. In
this case, '$0' is set to the full set of records spanning the
match to FILE_SEARCH_PAT. If SPAN_RECORDS is FALSE, any matches
to FILE_SEARCH_PAT are not allowed to span input records and '$0'
will contain only a single record.
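The difference SPAN_RECORDS makes to the search can be illustrated
with a small Python model. This is a sketch, not QTAwk code: the
function 'search_records' is hypothetical, the record separator is
taken to be a fixed string rather than a regular expression, and only
the result is modelled. In particular, the non-spanning branch simply
tests each record in turn, so it shows what is returned rather than
how QTAwk avoids reading every record.

```python
import re


def search_records(text, pattern, span_records=False, rs="\n"):
    """Return the record(s) selected by each match of `pattern`.

    With span_records False a match must lie inside one record.
    With span_records True a match may cross record boundaries and the
    whole run of records it touches is returned as one string.
    """
    regex = re.compile(pattern)
    if not span_records:
        return [rec for rec in text.split(rs) if regex.search(rec)]

    found = []
    pos = 0
    while pos < len(text):
        m = regex.search(text, pos)
        if m is None:
            break
        # Widen the match backwards to the start of its first record.
        prev = text.rfind(rs, 0, m.start())
        start = 0 if prev == -1 else prev + len(rs)
        # Widen the match forwards to the end of its last record.
        end = text.find(rs, max(m.start(), m.end() - len(rs)))
        if end == -1:
            end = len(text)
        found.append(text[start:end])
        pos = end + len(rs)  # resume after the records just returned
    return found


if __name__ == "__main__":
    data = "alpha\nbeta one\ntwo gamma\ndelta"
    print(search_records(data, r"one\ntwo"))
    print(search_records(data, r"one\ntwo", span_records=True))
```

In the example the pattern crosses a record boundary, so it selects
nothing when spanning is off and selects the two joined records when
spanning is on.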
6. The new variable FILE_SEARCH is defined as described above.
7. The new variable FILE_SEARCH_PAT is defined as described
above.
8. The new variable SPAN_RECORDS is defined as described
above.
9. The action of QTAwk when NF is changed now reflects the
intuitive effect of changing NF. If the new value is greater
than the current value, the current input line is lengthened
with new empty fields separated by the output field separator
string, OFS. If the new value is less than the current value,
the current input line is shortened by truncating at the end of
the field corresponding to the new NF value.
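The two cases can be modelled in a few lines of Python. This is a
sketch under assumptions, not QTAwk code: the function 'set_nf' is
hypothetical, the field separator is given as a regular expression
that never matches the empty string, and the record is assumed to
have no leading or trailing separators.

```python
import re


def set_nf(record, new_nf, fs=r"[ \t]+", ofs=" "):
    """Return `record` as it would look after the field count changes.

    Growing appends empty fields, each preceded by `ofs`.
    Shrinking cuts the record at the end of field number `new_nf`.
    """
    # The end offset of each field is where the next separator starts.
    ends = [m.start() for m in re.finditer(fs, record)]
    ends.append(len(record))
    current_nf = len(ends)

    if new_nf > current_nf:
        return record + ofs * (new_nf - current_nf)
    if new_nf < 1:
        return ""
    return record[:ends[new_nf - 1]]


if __name__ == "__main__":
    print(repr(set_nf("a b  c", 5, ofs="-")))
    print(repr(set_nf("a b  c", 2)))
```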
10. Two new input functions have been introduced:
a) srchrecord(sp) or srchrecord(sp,rs) or srchrecord(sp,rs,var) is
similar to the function 'getline'. 'srchrecord' will search the
current input file for the next record or records matching the
search pattern, 'sp'. If the record separator parameter, 'rs',
is not specified, records are determined by the variable RS or
RECLEN. If 'rs' is specified, record boundaries are determined
by the strings matching 'rs'. 'rs' may be a simple constant or
variable or an array. The record or records matching the search
pattern are returned in '$0' if 'var' is not specified. If 'var'
is specified, the matching record or records are returned in
'var'. The built-in variables, FNR and NR are updated to reflect
the current position and record number after the search. The
built-in variables, NF and '$i', 0 ≤ i ≤ NF, are set when 'var'
is not specified.
b) fsrchrecord(file) or fsrchrecord(file,sp) or
fsrchrecord(file,sp,rs) or fsrchrecord(file,sp,rs,var) is similar
to the function 'fgetline'. 'fsrchrecord' will search the file
specified for the next record or records matching the search
pattern, 'sp'. If the record separator parameter, 'rs', is not
specified, records are determined by the variable RS or RECLEN.
If 'rs' is specified, record boundaries are determined by the
strings matching 'rs'. 'rs' may be a simple constant or variable
or an array. The record or records matching the search pattern
are returned in '$0' if 'var' is not specified. If 'var' is
specified, the matching record or records are returned in 'var'.
The built-in variables, NF and '$i', 0 ≤ i ≤ NF, are set when
'var' is not specified.
Both functions have identical returns to the 'getline' and
'fgetline' functions, i.e.,
a) the number of characters in the record(s) matched plus the
End-Of-Record length plus 1,
b) zero, 0, if End-Of-File was reached before finding a match,
c) -1 on a file read error.
          getline()      getline(v)      fgetline(F)      fgetline(F,v)
          srchrecord()   srchrecord(v)   fsrchrecord(F)   fsrchrecord(F,v)

$0        updated        not updated     updated          not updated
$i, i>0   updated        not updated     updated          not updated
NF        updated        not updated     updated          not updated
NR        updated        updated         not updated      not updated
FNR       updated        updated         not updated      not updated
Note: the function parameters sp and rs have not been shown in
'srchrecord' and 'fsrchrecord' to highlight the similarity with
the functions 'getline' and 'fgetline' and their effect on the
variables indicated.
11. The new function 'get_FNR' has been introduced. This
function returns the current record number for the file
specified. The two forms are:
a) get_FNR() This form returns the current record number of the
current input file. The value returned is equal to the built-in
variable FNR.
b) get_FNR(filename) This form returns the current
record number of the input file specified. If filename ==
FILENAME, this form is equivalent to the first form. If the
filename specified is not open or is not open for input, a value
of zero, 0, is returned.
This function has been added because of the input functions
'fgetline' and 'fsrchrecord'. For the current input file, the
built-in variable FNR is always updated automatically to contain
the record number of the last record input (the current record).
However, when reading from a file other than the current input
file, previously there was no means of obtaining the current
record number of the input file. With 'fgetline', the user
utility could maintain an independent count of records read.
However, if the 'fsrchrecord' function is used, there is no other
means of obtaining the record number of the last record read.
12. The function 'resetre' has been introduced. In previous
versions, once a regular expression had been used to search
for a match, the internal form of the regular expression was
set and could not be changed. If the regular expression
contained embedded named expressions and the value(s) of the
corresponding variables changed, the new values of the variables
could not be incorporated into the regular expression. This
improved the speed of scanning since the regular expressions did
not have to be converted to internal form for each use. To
obtain the variability needed for changing variable values,
strings were used for matching. For match expressions which
change frequently, strings used for match operators are still the
better choice, since they are converted to internal form for each
use. However, for those occasions where the match expression
changes infrequently due to the change in value of embedded named
expressions, the use of a string leads to impaired performance.
This function has been introduced for just such use. It
releases the internal form of all regular expressions so
that the next time each is used as the search pattern for a match
operation, its internal form is rebuilt using the then current
values of any variables used in named expressions.
Note that the use of arrays for match patterns falls between the
use of strings, for which the internal regular expression form is
rebuilt for each use, and regular expressions for which the
internal form is built for the first use and then remains
static. When arrays are used for matching, the internal regular
expression form is built when first used and retained until the
array is changed. For arrays the internal regular expression
form is assigned when the array as a whole is assigned to another
variable. Thus the internal regular expression form can be
retained and reused.
13. The new variable FILEATTR is defined. This variable
contains the attributes of the current input file in a string.
The string defines the attributes in the same manner as the
variable for the 'findfile' function defines the attributes for
the files found by that function.
14. The new variable FILEDATE is defined. This variable
contains the date of the current input file in the operating
system format. The 'sdate/stime' functions may be used to format
the date.
15. The new variable FILEPATH is defined. This variable
contains the drive and path of the current input file. The path
string ends with the subdirectory separator character.
16. The new variable FILESIZE is defined. This variable
contains the size in bytes of the current input file.
17. The new variable FILETIME is defined. This variable
contains the time of the current input file in the operating
system format. The 'sdate/stime' functions may be used to format
the time.
18. The following built-in variables have been added:
a) FILEDATE_CREATE - date of file creation as returned by the
Operating System.
b) FILETIME_CREATE - time of file creation as returned by the
Operating System.
c) FILEDATE_LACCESS - date of last file access as returned by the
Operating System.
d) FILETIME_LACCESS - time of last file access as returned by the
Operating System.
The same variables are available in the PC/MS-DOS version, but
their values are set equal to the values of FILEDATE and
FILETIME.
19. The functions 'sdate' and 'stime' have been changed from
previous versions. The fixed formats supported by previous
versions have been replaced by the use of a 'format string'
identical to that supported by ANSI C. The forms are:
a) sdate(frmt_strg) return string formatted according to
'frmt_strg' using current system date and time.
b) sdate(frmt_strg,fdate) return string formatted according to
'frmt_strg' using date specified in fdate and current system
time. fdate is a file date in the operating system format.
c) sdate(frmt_strg,year,month,day) return string formatted
according to 'frmt_strg' using date specified in 'year', 'month'
and 'day' and current system time.
d) stime(frmt_strg) return string formatted according to
'frmt_strg' using current system date and time.
e) stime(frmt_strg,ftime) return string formatted according to
'frmt_strg' using time specified in ftime and current system
date. ftime is a file time in the operating system format.
f) stime(frmt_strg,hour,minute,second) return string formatted
according to 'frmt_strg' using time specified in 'hour', 'minute'
and 'second' and current system date.
The format string is similar to those used in the 'print'
functions except that the following substitutions are made:
%      Substitution Made
%a --> Locale's Abbreviated Weekday Name
%A --> Locale's Weekday Name
%b --> Locale's Abbreviated Month Name
%B --> Locale's Month Name
%c --> Locale's Appropriate Date And Time Representation
%d --> Day Of The Month As A Decimal Number (01-31)
%H --> The Hour (24-hour Clock) As A Decimal Number (00-23)
%I --> The Hour (12-hour Clock) As A Decimal Number (01-12)
%j --> The Day Of The Year As A Decimal Number (001-366)
%m --> The Month As A Decimal Number (01-12)
%M --> The Minute As A Decimal Number (00-59)
%p --> The Locale's Equivalent Of The AM/PM Designations
       Associated With A 12-hour Clock
%S --> The Second As Decimal Number (00-61)
%U --> The Week Number Of The Year (the First Sunday As The First
       Day Of Week 1) As A Decimal Number (00-53)
%w --> The Weekday As A Decimal Number (0-6), Where Sunday Is 0
%W --> The Week Number Of The Year (the First Monday As The First
       Day Of Week 1) As A Decimal Number (00-53)
%x --> The Locale's Appropriate Date Representation
%X --> The Locale's Appropriate Time Representation
%y --> The Year Without Century As A Decimal Number (00-99)
%Y --> The Year With Century As A Decimal Number
%Z --> The Time Zone Name Or Abbreviation, Or By No Character If
       No Time Zone Is Determinable
%% --> %
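Because these substitutions are the ANSI C 'strftime' set, a format
string can be previewed with any tool that follows the same
convention. The Python sketch below does so for a fixed date and
time. It is not QTAwk code: the helper 'format_stamp' is
hypothetical, and the results for the locale-dependent substitutions
(%a, %A, %b, %B, %c, %p, %x, %X, %Z) depend on the system and may
differ from what QTAwk produces.

```python
from datetime import datetime

# A fixed moment so that the results are reproducible.
STAMP = datetime(1994, 8, 4, 14, 5, 9)


def format_stamp(frmt_strg, stamp=STAMP):
    """Expand the % substitutions in `frmt_strg` for `stamp`."""
    return stamp.strftime(frmt_strg)


if __name__ == "__main__":
    print(format_stamp("%Y-%m-%d %H:%M:%S"))  # 1994-08-04 14:05:09
    print(format_stamp("day %j of %Y"))       # day 216 of 1994
```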
20. The functions 'jdn' and 'cal' have been modified. 'jdn'
computes the Julian Day Number for an associated date. 'cal'
computes the date for a given Julian Day Number. The Julian Day
Number is very useful in date computations. The forms are:
a) jdn() computes the Julian Day Number for the current system
date.
b) jdn(fdate) computes the Julian Day Number for the file
date, fdate, in the operating system format.
c) jdn(year,month,day) computes the Julian Day Number for the date
specified.
d) cal(frmt_strg,jdn) returns a string formatted
according to 'frmt_strg' using the date specified in the Julian Day
Number, jdn, and the current system time. The format string,
frmt_strg, is the same as used by the 'sdate' and 'stime'
functions above. If
the jdn passed is an integer as calculated and returned by
the 'jdn' function, then the current system time is used for any
time substitutions. If the jdn passed is a floating point
number, the fractional part is used to compute the time used.
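To show why the Julian Day Number is convenient for date arithmetic,
the following Python sketch converts Gregorian calendar dates to day
numbers and back using a widely published integer formula. It is not
QTAwk code and the function names are hypothetical. Whether the
values agree exactly with those returned by QTAwk's 'jdn' is an
assumption, and the fractional (time of day) part is not modelled.

```python
def julian_day_number(year, month, day):
    """Julian Day Number of a Gregorian calendar date (integer part)."""
    a = (14 - month) // 12        # 1 for January and February, else 0
    y = year + 4800 - a
    m = month + 12 * a - 3        # March = 0 ... February = 11
    return (day + (153 * m + 2) // 5 + 365 * y
            + y // 4 - y // 100 + y // 400 - 32045)


def calendar_date(jdn):
    """Inverse of julian_day_number: return (year, month, day)."""
    a = jdn + 32044
    b = (4 * a + 3) // 146097
    c = a - 146097 * b // 4
    d = (4 * c + 3) // 1461
    e = c - 1461 * d // 4
    m = (5 * e + 2) // 153
    day = e - (153 * m + 2) // 5 + 1
    month = m + 3 - 12 * (m // 10)
    year = 100 * b + d - 4800 + m // 10
    return year, month, day


if __name__ == "__main__":
    # Days between two dates become a simple subtraction.
    print(julian_day_number(1995, 1, 8) - julian_day_number(1994, 8, 4))
    # Adding days is likewise plain addition followed by a conversion.
    print(calendar_date(julian_day_number(1994, 8, 4) + 30))
```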
21. The new command line option, "-Ww filename", has been
introduced. This option allows the user to write the internal
form of a utility to the specified file. The internal form may
then be specified with the "-f" option. Reading the internal
form of a utility in this manner can greatly speed the initial
execution of the utility.
22. The command line switch '-Wd' has been introduced. This
switch causes QTAwk to delay parsing of the input until an input
field, "$i", or the "NF" variable is accessed. Without this
switch, QTAwk parses each input record as it is read. For
utilities which access input fields in only some records, or in
very few, this switch can speed up processing considerably.
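The idea of delayed parsing can be illustrated with a short Python
model. This is a sketch, not QTAwk's implementation: the class
'Record' and its members are hypothetical, and fields are split on
white space. The record is split only when a field or the field
count is first requested, and the result is kept for later requests.

```python
class Record:
    """An input record whose fields are split only on first use."""

    def __init__(self, text):
        self.text = text
        self._fields = None
        self.split_count = 0  # number of times the record was split

    def _parse(self):
        if self._fields is None:
            self._fields = self.text.split()
            self.split_count += 1
        return self._fields

    @property
    def nf(self):
        """Number of fields; forces the split if not yet done."""
        return len(self._parse())

    def field(self, i):
        """Field i, where field 0 is the whole record (no split needed)."""
        if i == 0:
            return self.text
        return self._parse()[i - 1]


if __name__ == "__main__":
    rec = Record("alpha beta gamma")
    print(rec.field(0), rec.split_count)   # whole record, not yet split
    print(rec.field(2), rec.split_count)   # first field access splits once
```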
23. The following built-in variables have been added:
a) MATCH_INDEX - see the description of the expanded 'sub' and
'gsub' functions below.
b) CONVFMT - specifies the format used to convert floating point
values to a string value. Previously the value of OFMT was used
for this purpose. OFMT is now used only for formatting floating
point values for output.
c) IGNORECASE - if assigned a true value, QTAwk ignores
alphabetic case in all match operations against strings and
regular expressions. This is true for all built-in functions
using regular expressions and in searches for record terminator
and field terminator strings using RS and FS respectively.
d) RT - Contains the full string value of the record terminator
of the current input record. If the value of RS is not changed,
this always contains a single new line character. If the value
of RS is changed to an array or a regular expression, then the
record terminator can change with each input record. The value
of RT gives the user access to the current value of the
terminator.
e) RECLEN - used to specify the length of fixed length records.
If RECLEN has an integral non-zero value, the
value is used to read the next record and RS is ignored.
When RECLEN is used to determine input records, RT is always set
to the null string.
f) FIELDWIDTHS - when assigned a string value containing space
separated integral numbers of the form:
n1 n2 n3 ... nn
the splitting of input records into fields is governed by the
numbers in FIELDWIDTHS rather than FS. Each number in
FIELDWIDTHS specifies the width of a field including columns
between fields. If you want to ignore the columns between
fields, you can specify the width as a separate field that is
subsequently ignored. When the value of FIELDWIDTHS does not match
this form, field splitting is done using FS in the usual manner.
g) FIELDFILL - used to fill a field when the replacement value
is less than the field width. This variable value is used only
when field splitting is based on FIELDWIDTHS rather than FS. The
default value is a single blank character.
h) DELAY_INPUT_PARSE - used to delay parsing of the current
input record until the value of NF or one of the field variables,
$i, 1 ≤ i ≤ NF, is needed. The default value is false. If the
value is true, then the input record is not parsed until
necessary. For utilities which do not reference NF or the field
variables, $i, or seldom reference them, delaying the parsing of
the input record can speed the execution of the utility
significantly.
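Splitting by FIELDWIDTHS and filling with FIELDFILL, as described in
items f) and g) above, can be modelled as follows. This is a Python
sketch, not QTAwk code, and both function names are hypothetical. The
source states only that a shorter replacement value is filled, so
filling on the right and cutting an over-long value to the field
width are assumptions.

```python
def split_fixed(record, fieldwidths):
    """Split `record` into fields using space separated widths."""
    fields = []
    pos = 0
    for width in (int(w) for w in fieldwidths.split()):
        fields.append(record[pos:pos + width])
        pos += width
    return fields


def fill_field(value, width, fieldfill=" "):
    """Fit `value` into a field of `width` characters.

    Assumption: short values are filled on the right with `fieldfill`
    and long values are cut to `width`.
    """
    return value[:width].ljust(width, fieldfill)


if __name__ == "__main__":
    # The third width covers the two blank columns between data fields.
    fields = split_fixed("AB123  XYZ", "2 3 2 3")
    print(fields)
    fields[1] = fill_field("7", 3, ".")
    print("".join(fields))
```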
24. The command line option "-vvar = value" has been introduced
to assign "value" to the variable "var" before execution of the
utility starts. This makes the assigned value of "var" available
in any BEGIN actions. This option may be used multiple times on
the command line to assign multiple variables.
25. The command line option "-Wd" has been introduced to delay
input parsing until needed. See the description of the built-in
variable "DELAY_INPUT_PARSE" above.
26. The command line option "-wfilename" has been changed to
"-Wwfilename" to conform to the POSIX specification.
27. The predefined pattern, GROUP, has been expanded to accept
any valid QTAwk expression. The expressions in a GROUP
pattern are evaluated when the GROUP patterns are first matched
against an input line. The expression is evaluated and the
result converted to a regular expression. If the result of the
expression is an array, the entire array is matched at the
corresponding position in the GROUP pattern.
28. The 'sub' and 'gsub' functions have been expanded. The
second argument, the replacement string expression, is no longer
evaluated at the time the function is called, but when the
replacement string is used for replacement. Since the first
argument, the match pattern expression, may now be an array, the
value of the built-in variable, MATCH_INDEX, can be affected as
matches are made. By delaying the evaluation of the replacement
string expression until the replacement is made, the change in
MATCH_INDEX can be used to affect the value of the replacement
string for each replacement. QTAwk guarantees that the
replacement string expression will be evaluated as replacements
are made in the text string from left to right. For constant
expressions used for the replacement string, this change has no
direct effect.
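The benefit of evaluating the replacement at the moment each match is
made can be modelled in Python, where the replacement is supplied as
a function called once per match, from left to right. This is a
sketch, not QTAwk code: 'gsub_array' is hypothetical, a dictionary
stands in for the pattern array, the patterns are assumed to contain
no named groups of their own, and the index passed to the replacement
function stands in for MATCH_INDEX.

```python
import re


def gsub_array(patterns, replacement_for, text):
    """Replace every match of any pattern; return (count, new_text).

    `patterns` maps an index to a regular expression string.
    `replacement_for(match_index, matched_text)` is called for each
    match, in left to right order, when that replacement is made.
    """
    # Wrap each pattern in a named group so the match reveals its index.
    names = {f"p{n}": index for n, index in enumerate(patterns)}
    combined = re.compile("|".join(
        f"(?P<{name}>{patterns[index]})" for name, index in names.items()))

    count = 0

    def expand(m):
        nonlocal count
        count += 1
        return replacement_for(str(names[m.lastgroup]), m.group(0))

    new_text = combined.sub(expand, text)
    return count, new_text


if __name__ == "__main__":
    pats = {"color": "red|blue", "num": r"\d+"}
    print(gsub_array(pats, lambda idx, s: f"<{idx}:{s}>", "red 42 blue"))
```

Had the replacement been computed once before the call, every match
would have received the same text regardless of which element of the
array matched.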
29. The Tagged string operator, '$$', has been fixed to
properly yield the tagged strings.
30. A new standard file, in addition to "stdaux", "stdin",
"stderr", "stdprn" and "stdout", has been introduced, "keyboard"
for unbuffered input directly from the keyboard. The built-in
variable "ECHO_INPUT" also controls echoing of input characters
from this "file" to the standard output file. The standard files
are defined in more detail in Section 11.
31. The returns from the "getline" and "fgetline" functions
have been altered slightly. The values returned are the same as
the new functions "srchrecord" and "fsrchrecord": a) the number
of characters read plus the length of the End-Of-Record string
plus 1, b) 0 if End-Of-File was encountered, or c) -1 if an error
occurred.
==> QTAwk Version 5.11, dated 03/30/92. This version contains
two additions from the previous versions:
1. The predefined variable FILE_SORT has been added to control
the sequence of files returned by the "findfile" built-in
function.
2. The predefined variable "Gregorian" has been added to
control the use of the Gregorian or Julian Calendars in
converting to/from dates and Julian Day Numbers by the built-in
"cal", "jdn" and "sdate" functions.
==> QTAwk Version 5.10, dated 10/01/91. This version contains
several changes and additions from the previous versions:
1. The command line variable setting mechanism:
variable = value
has been fixed.
2. The function "jdn(fdate)" or "jdn(year,month,day)" has been
added to compute the Julian Day Number (jdn) of a specified
date. The Julian Day Number is very useful for date
computations.
3. The function "cal(jdn,form)" has been added to complement
the 'jdn' function above. The "cal" function returns the date
corresponding to the specified jdn passed.
4. The function "findfile" has been added. This function has
the following forms
a) findfile(var)
b) findfile(var,pattern)
c) findfile(var,pattern,attributes)
where:
a) var == the variable in which to return the array of files
found which match the pattern specified
b) pattern == a string specifying the pattern to match the
filenames. Follows the DOS wildcard convention. If no pattern
is specified, "*.*" is used.
c) attributes == a string specifying the desired file
attributes: Archive, Read-Only, Hidden, System, Sub-Directory, or
Volume ID.
This function returns the number of files found which match the
pattern specified. The file names, sizes, last modify date and
times are returned via the variable specified. The file date and
time are returned in "DOS format" which is good for sorting
purposes, but unreadable. The "stime" and
"sdate" functions have been expanded to include the file
time and date and format them as desired.
5. The "sdate" function has been expanded to return the Julian
Day Number as a formatting option and can format any specified
date as well as the file date returned by the "findfile"
function.
6. The "stime" function has been expanded to be able to format
any specified time as well as the file time returned by the
"findfile" function.
7. For format arguments greater than the number of format
options the 'sdate' and 'stime' functions no longer terminate
execution with an error message. The remainder operator is
silently used to obtain a correct formatting option.
==> QTAwk Version 5.00, dated 02/01/91. This version contains
several additions from the previous versions:
1. The character output functions 'putc(c)' and 'fputc(c,F)'
have been added. These functions are identical in operation to
the C functions of the same names. With the 'getc' and
'fgetc(F)' functions of the previous versions, the user now has a
complete set of character I/O.
2. The built-in variable 'ECHO_INPUT' has been added.
ECHO_INPUT has a default value of FALSE. If TRUE, when reading
from the standard input file, normally the console keyboard, the
input is echoed to the standard output file, normally the console
display. When the standard input file is re-directed from a
file, it is convenient to set ECHO_INPUT to FALSE to prevent
echoing each input character to the display.
3. The variable "QTAwk_Path" has been added. If an environment
setting for "QTAWK" exists, QTAwk_Path is set to the value of the
setting upon program initiation. This string value is then used
in searching for input files. The search sequence for files
follows the order:
a) If a directory and/or path name is specified with a
file, then that path only is searched.
b) If no directory or path name is specified, then the current
directory is searched. If the file is not found, then all paths
specified in QTAwk_Path are searched for the file in the order
specified. Multiple paths may be
specified, separated by a semi-colon. This follows the same
convention for specifying multiple paths in the system
environment path setting "PATH".
The value of QTAwk_Path may be reset at any time by the user's
utility to change the paths to be searched for desired files to
be opened for reading.
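The search order described in item 3 can be sketched in Python. This is only an illustration, not QTAwk code: the helper name find_input_file and its "exists" parameter are hypothetical, and path handling follows the rules of whatever system runs the sketch.

```python
import os

def find_input_file(name, qtawk_path, exists=os.path.isfile):
    """Return the first existing candidate for `name`, or None.

    Hypothetical helper mirroring the search order described above;
    it is not part of QTAwk.
    """
    # a) A name that carries its own directory is looked up there only.
    if os.path.dirname(name):
        return name if exists(name) else None
    # b) Otherwise the current directory is tried first ...
    if exists(name):
        return name
    # ... then each semicolon-separated directory, in the order given.
    for directory in qtawk_path.split(";"):
        if not directory:
            continue
        candidate = os.path.join(directory, name)
        if exists(candidate):
            return candidate
    return None
```

Passing "exists" as a parameter is a design choice made for the sketch so that the order of the lookups can be checked without touching the disk.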
4. The built-in function 'append(filename)' has been added.
Executing this function before any output to "filename" will
cause all output to be appended to the end of the file "filename"
if it currently exists. Closing the file, "filename", with the
'close' function will cancel the effect of 'append' for any
subsequent output to that file.
5. The tagged string operator, '$$', has been expanded to
accept numerical expressions in the forms:
a) $$0 --> refers to the entire string matched by the last
regular expression in the pattern.
b) $$0.0 --> refers to the entire string matched by the last
regular expression.
c) $$j --> refers to the string matching the regular expression
contained in the parenthesis set at nesting level 1 and count j
in the last regular expression. Valid values of j are in the
range 1 ≤ j ≤ 31. Identical to $$1.j of the next form.
d) $$i.j --> refers to the string matching the regular
expression contained in the parenthesis set at the ith nesting
level, jth count in the last regular expression. Valid values of
i and j are in the range 1 ≤ i ≤ 7, 1 ≤ j ≤ 31.
e) $$variable_name --> this form may expand into any one of the
previous forms. If the variable operated on by '$$' is a string
or regular expression, it is first converted to numeric form.
Thus, $$5.02 is the string matching the regular expression
contained within the parentheses at the fifth nesting level and
2nd count.
==> QTAwk Version 4.20, dated 10/11/90. This version contains
three additions from the previous versions:
1. The behavior of the RS predefined variable has been
changed. It is now similar to the behavior of the FS variable.
If RS is assigned a value, which when converted to a string
value, is a single character in length, then that character
becomes the record separator. If the string is
longer in length than a single character, then it is treated
as a regular expression. The string matching the regular
expression is treated as a record separator. As for FS, the
string value is converted to the internal regular expression form
when the assignment is made.
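The two RS cases can be illustrated with a short Python sketch. It is an analogue only: the function name split_records is hypothetical, the separator is assumed to be non-empty, and the regular expression is written in Python's syntax rather than QTAwk's.

```python
import re

def split_records(text, rs):
    """Split `text` into records using an RS-like separator string."""
    if len(rs) == 1:
        # A one-character value is taken literally as the separator.
        return text.split(rs)
    # A longer value is treated as a regular expression; whatever it
    # matches separates one record from the next.
    return re.split(rs, text)

literal = split_records("a.b.c", ".")        # '.' is a plain character here
by_regex = split_records("a--b---c", "-+")   # one or more hyphens
```

Note that the single character "." separates on periods only; it is not read as the "any character" operator because of its length.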
2. Two new functions have been added:
getc() --> reads a single character from the current input
file. The character is returned by the function.
fgetc(file) --> reads a single character from the file 'file'.
The character is returned by the function.
These functions allow the user to naturally obtain single
characters from any file including the standard input file (which
would be the keyboard if not redirected or piped).
3. Error messages now have a numerical value displayed in
addition to the short error message. The error messages are
listed in numerical order in the QTAwk documentation with a short
explanation of the error. In some cases, an attempt has been
made to provide guidance as to what may have caused the error and
possible remedies. Since the error messages are generated at
fixed points within QTAwk and may be caused by different reasons
in different utilities during interpretation or during execution
on data files, it is not possible to list every possible reason
for the display of the error messages. The line number within
the QTAwk utility on which the error was discovered and the input
data file record number are provided in the error message to
provide some help to the user in attempting to ascertain the real
reason for the error.
==> QTAwk Version 4.10. This version contains one addition from
the previous versions:
1. In previous versions, the GROUP pattern keyword could accept
patterns consisting only of a regular expression constant. For
version 4.10, the GROUP pattern keyword has been expanded to
accept regular expression constants, string constants and
variables. The variables are evaluated at the time the GROUP
patterns are first utilized to scan an input record. The value
is converted to string form and interpreted as a regular
expression.
GROUP /regular expression constant/ { actions }
GROUP "string constant" { actions }
GROUP Variable_name { actions }
GROUP patterns are still converted into an internal form
for regular expressions only once, when the pattern is first used
to scan an input line. Any variables in a GROUP pattern will be
evaluated, converted to string form and interpreted as a regular
expression.
==> QTAwk Version 4.02. This version contains two additions
from the previous versions:
1. The command line argument, double hyphen, "--", stops
further scanning of the command line for options. The double
hyphen argument is not passed to the QTAwk utility in the ARGV
array or counted in the ARGC variable. Since QTAwk only
recognizes two command options, this has been included to be
compatible with the latest Unix(tm) conventions.
2. The built-in array ENVIRON has been added. This array
contains the environment strings passed to QTAwk. Changing a
string in ENVIRON will have no effect on the environment strings
passed in the QTAwk "system" built-in function. Environment
strings are set with the PC/MS-DOS "SET" command. The strings
are of the form:
name = string
where the blanks on either side of the equal sign, '=', are
optional and depend on the particular form used in the "SET"
command. The QTAwk utility may scan the elements of ENVIRON for
a particular name or string as desired.
1.0 Introduction
QTAwk is called a Utility Creation Tool and not a programming
language because it is intended for the average computer user as
well as the more experienced user and programmer. QTAwk has been
designed to make it easy for the average user to create those
small, or maybe not so small, utilities needed to accomplish
small, or not so small, everyday jobs. These are the jobs which
are too small to justify the time and cost of using a traditional
computer programming language, or of hiring a professional
programmer.
If you are like many computer users, you would frequently like
to make changes in various text files wherever certain patterns
appear, or extract data from parts of certain lines while
discarding the rest. To write a program to do this in a language
such as C or Pascal is a time-consuming inconvenience that may
take many lines of code. The job may be easier with QTAwk.
The QTAwk utility interprets a special-purpose programming
language that makes it possible to handle simple
data-reformatting jobs easily with just a few lines of code.
This manual teaches you what QTAwk does and how you can use
QTAwk effectively. Using `QTAwk' you can:
manage small, personal databases
generate reports
validate data
produce indexes, and perform other document preparation tasks
even experiment with algorithms that can be adapted later to
other computer languages
This paper presents a description of the QTAwk utility creation
tool and its use. Most computer users have many small tasks to
accomplish that are usually left undone for lack of the proper
tool. Typically these tasks require finding one or more records
within a file and executing some action depending on the record
located.
In order to accomplish these tasks the user needs a tool which
will allow the following to be accomplished easily:
1. reading files record by record,
2. splitting, or parsing, the records read into words or fields,
3. determining if a record, or records, satisfy predetermined
match criteria, i.e. finding the "proper" record(s),
4. when the proper records are found, executing some action or
actions on the records or fields of the records.
QTAwk supplies the user with all of these features in an easy to
use manner. Specifying the name of a file is all the user need
do to open the file and read it record by record. The user may
easily change what a "record" is or let it default to an ASCII
text line as used by all text editors and which can be written by
all word processors or force QTAwk to view files as sequences of
fixed length records. QTAwk will automatically split (parse)
records into fields. Initially a field is a word or a sequence
of nonblank characters. The user may change the definition of a
field easily to adapt to the needs of a particular situation.
The user may even force QTAwk to split input records into fixed
length fields.
Arithmetic expressions, logical expressions or regular
expressions may be used to define the criteria for selecting
records for action. Regular expressions are a powerful means of
describing the criteria for selecting, i.e., matching, the text of
records. Arithmetic expressions utilize the ordinary arithmetic
operators (addition, subtraction, multiplication, etc.) for
describing the criteria for selecting records and logical
expressions utilize the logical operators (less than, equal to,
greater than, etc.) for selecting records.
Of all the operators available in QTAwk, the regular expression
operators may be the only ones with which most readers are not
familiar. Regular expressions are a powerful and useful tool for
working with text. Yet for all their power, they are surprisingly
simple and easy to use once learned. Section 3 explains regular
expressions fully, in a manner that will make them usable by a
person totally unfamiliar with them.
QTAwk is patterned after The Awk Programming Language by Alfred
V. Aho, Brian W. Kernighan and Peter J. Weinberger. The Awk
program implementing The Awk Programming Language is available on
most Unix (tm) systems. Aho, Kernighan and Weinberger invented
the automatic input loop and the pattern-action pairs used in
QTAwk and are to be heartily congratulated for this. Without
Awk, QTAwk would not exist. QTAwk is an extensive expansion of
The Awk Programming Language in many important aspects. In
addition, some of the admitted shortcomings of The Awk
Programming Language have been corrected. Appendix iii contains
a detailed listing of the differences between QTAwk and Awk.
2.0 Getting Started
QTAwk is designed to be used to search data or text files using
short user created utilities. The types of files that QTAwk is
designed to work with are "text" files, commonly called ASCII
files. The files contain user readable text and numbers. The
text is contained in records and the records end with
carriage-return, newline character pairs or single newline
characters. Text files are written by application programs,
word processors and text editors.
The data in the files is grouped by fields on a single record or
by records separated by a blank line or some other "special"
character. For example, the following records list data on
various states:
US # 10461 # 4375 # MD # Annapolis ( Maryland )
US # 40763 # 5630 # VA # Richmond ( Virginia )
US # 2045 # 620 # DE # Dover ( Delaware )
US # 24236 # 1995 # WV # Charleston ( West Virginia )
US # 46047 # 12025 # PA # Harrisburg ( Pennsylvania )
US # 7787 # 7555 # NJ # Trenton ( New Jersey )
US # 52737 # 17895 # NY # Albany ( New York )
US # 9614 # 535 # VT # Montpelier ( Vermont )
US # 9278 # 975 # NH # Concord ( New Hampshire )
US # 33265 # 1165 # ME # Augusta ( Maine )
Each record consists of at least 12 words. The 12 words of the
first record are:
1. US
2. #
3. 10461
4. #
5. 4375
6. #
7. MD
8. #
9. Annapolis
10. (
11. Maryland
12. )
The first word lists the country, the third word lists the state
area in square miles, the fifth word lists the state population
in thousands, the seventh word lists the state abbreviation, the
ninth word lists the state capital, and the eleventh word lists
the state name. The second, fourth, sixth, eighth, tenth and
last words are word separators. The word separators are not
necessary for QTAwk, but make each line easier for people to
read. A copy of this entire file, states.dta, is given in
Appendix v.
This data can be manipulated in various ways. A few of the ways
in which this can be done are:
1. the manner of listing changed, or
2. only records meeting certain criteria listed:
a) those states with a minimum area,
b) those states with a minimum population,
c) population greater than a minimum and less than a
maximum,
d) area less than a maximum and population greater than a
minimum,
e) population density (population / area) less than a
maximum.
3. the list could be sorted:
a) alphabetically by
1: state capital,
2: state name,
3: state abbreviation.
b) by area,
c) by population.
4. some data could be deleted from the list such as the
capital.
There are many more ways to manipulate the data. In order to do
so the data in the list must first be read record by record and
each record split into its constituent parts. Once the parts for
each record have been determined, the data can be easily
manipulated, changed, or rearranged.
2.1 Running QTAwk
QTAwk is started from the command prompt, giving the QTAwk
utility to run and the files to search. The QTAwk utility may be
written directly on the command line or contained in one or more
files named on the command line. If given on the command line,
it is usually enclosed in double quotes:
QTAwk "$5 > 50000 {print;}" states.dta
This QTAwk utility will print the record for every state whose
fifth field, the population in thousands, is greater than 50,000.
To select on area instead, test the third field, $3.
The example shows the form of QTAwk utilities, a sequence of
patterns and actions in the form:
pattern1 { action1 }
pattern2 { action2 }
pattern3 { action3 }
..
..
..
QTAwk opens the files named on the command line, reads a record,
splits (parses) each record into the individual words or fields
and compares the record with each pattern in the order in which
they have been written in the QTAwk utility. If the record
matches a pattern, the corresponding action contained in braces
is executed.
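The read, split, match and act cycle just described can be sketched in Python. This is an analogue for illustration, not QTAwk code: the function name run, the threshold 10000 and the choice of three sample records are assumptions made for the sketch.

```python
def run(rules, lines):
    """Apply every (pattern, action) pair to every record, in order."""
    for line in lines:
        fields = line.split()            # default: split on runs of blanks
        for pattern, action in rules:
            if pattern(fields):          # does this record match?
                action(line, fields)

records = [
    "US # 10461 # 4375 # MD # Annapolis ( Maryland )",
    "US # 46047 # 12025 # PA # Harrisburg ( Pennsylvania )",
    "US # 52737 # 17895 # NY # Albany ( New York )",
]

selected = []
rules = [
    # Counterpart of:  $5 > 10000 { print; }   ($5 is fields[4] here)
    (lambda f: float(f[4]) > 10000, lambda line, f: selected.append(line)),
]
run(rules, records)
```

After the call, "selected" holds the Pennsylvania and New York records, the two whose fifth field exceeds 10000.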
Patterns may be arithmetic expressions, logical expressions,
regular expressions or combinations of all three types of
expressions. The example above has a logical expression
pattern.
Programs indicate the end of text lines in ASCII files with a
Carriage Return, Newline character pair. QTAwk follows the
practice of converting all such pairs to a single newline when
reading the file. In writing files, QTAwk converts single
Newline characters to a Carriage Return, Newline pair.
For the data in the "states.dta" data file, a user may ask for
the total population of Canada. The first field can be used to
identify the data for Canada and the fifth field contains
population data. The following utility will sum the population
data for Canada:
$1 == "Canada" { Total += $5 }
END { print Total; }
In this example, when the first field of a record is equal to
"Canada", the fifth field is accumulated into the variable
Total. When all records have been processed, Total is printed.
The printing of Total is accomplished in the action associated
with the pattern 'END'. 'END' is a predefined pattern; the
associated action is executed after closing the input file(s).
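The same accumulate-then-report shape can be sketched in Python. The ten records shown in Section 2.0 contain no Canadian entries, so this illustration sums the fifth field for the country "US" instead; that substitution is an assumption of the sketch.

```python
records = """\
US # 10461 # 4375 # MD # Annapolis ( Maryland )
US # 40763 # 5630 # VA # Richmond ( Virginia )
US # 2045 # 620 # DE # Dover ( Delaware )
US # 24236 # 1995 # WV # Charleston ( West Virginia )
US # 46047 # 12025 # PA # Harrisburg ( Pennsylvania )
US # 7787 # 7555 # NJ # Trenton ( New Jersey )
US # 52737 # 17895 # NY # Albany ( New York )
US # 9614 # 535 # VT # Montpelier ( Vermont )
US # 9278 # 975 # NH # Concord ( New Hampshire )
US # 33265 # 1165 # ME # Augusta ( Maine )
""".splitlines()

total = 0
for line in records:
    fields = line.split()
    if fields[0] == "US":           # counterpart of:  $1 == "US"
        total += int(fields[4])     # counterpart of:  Total += $5

# Counterpart of the END action: runs once, after the last record.
print(total)                        # prints 52770
```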
Usually, data such as the state data is given in tabular format
with each item in a record in a column. The state data is shown
in such a format in Appendix v. A few lines are repeated below:
Country State                    State Capital        Area     Population
US      Maryland                 MD    Annapolis      10461    4375
US      Virginia                 VA    Richmond       40763    5630
US      Delaware                 DE    Dover          2045     620
.
.
.
Europe  West Germany                   Bonn           92100    14030
Europe  France                         Paris          211208   55020
Europe  United Kingdom                 London         94092    56040
Europe  Ireland                        Dublin         27136    3595
For the above table, field five, $5, contains the state area
for states in the US. But for European countries, the area is
contained in field four, $4, since the third field is missing.
When data in records is contained within fixed length fields as
in this table, it can be more convenient to split the record into
fields based on the field widths rather than characters which
terminate or separate the fields. This can be especially true
when some fields are missing in some of the records. QTAwk
allows the user to split records on fixed field widths by
assigning a sequence of integer field widths to the built-in
variable, FIELDWIDTHS. For the table above this could be
accomplished with:
FIELDWIDTHS = "8 25 6 15 9 10";
Thus, for the first record the fields are:
$1 = "US      "
$2 = "Maryland                 "
$3 = "MD    "
$4 = "Annapolis      "
$5 = "10461    "
$6 = "4375      "
and for the last record the fields are:
$1 = "Europe  "
$2 = "Ireland                  "
$3 = "      "
$4 = "Dublin         "
$5 = "27136    "
$6 = "3595      "
Note that for fixed field widths, the fields may contain blanks
and comparisons must be against the entire field. Thus, the
short utility to find the total population for Canada must be
written with the first field padded to its full width of 8:
$1 == "Canada  " { Total += $5 }
END { print Total; }
to find those records for Canada.
There are several things to note using FIELDWIDTHS for parsing
input records:
1. If the input record is longer than the sum of the field
widths specified in FIELDWIDTHS, QTAwk creates a last field
containing the remainder of the input record. For example,
if FIELDWIDTHS is specified as:
FIELDWIDTHS = "2 4 4 5";
This specifies 4 fields with a total width of 15 characters.
For an input record of 25 characters, QTAwk creates 5 fields
with $5 containing the last 10 characters of the input record.
FS is NOT used to parse the remaining input record after the
last field specified in FIELDWIDTHS.
2. If the input record is shorter in length than the sum of the
field widths specified in FIELDWIDTHS, QTAwk only creates the
number of fields necessary to contain the input record. The
last field will contain only those characters necessary to
fill out the input record. For example, if FIELDWIDTHS is
specified as above and the input record is only 8 characters
long, QTAwk will create only 3 fields, NF will be set to 3,
and $3 will be only 2 characters long.
Thus, for the tabular data, FIELDWIDTHS could be defined as:
FIELDWIDTHS = "8 25 6 15 9";
and QTAwk would automatically assign the population data to $6
for each record.
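The two FIELDWIDTHS rules above, one for records that are too long and one for records that are too short, can be sketched in Python. The function name split_fixed and the sample records are hypothetical; the sketch follows the rules as stated here and is not QTAwk's own code.

```python
def split_fixed(record, widths):
    """Split `record` into fixed-width fields, FIELDWIDTHS style."""
    fields = []
    pos = 0
    for width in widths:
        if pos >= len(record):
            break                               # record used up: stop early
        fields.append(record[pos:pos + width])  # last one may be short
        pos += width
    if pos < len(record):
        fields.append(record[pos:])             # leftover text: one extra field
    return fields

widths = [int(w) for w in "2 4 4 5".split()]
long_fields = split_fixed("abcdefghijklmnopqrstuvwxy", widths)   # 25 characters
short_fields = split_fixed("abcdefgh", widths)                   # 8 characters
```

Here "long_fields" has five entries, the last holding the final 10 characters, and "short_fields" has three entries, the last only 2 characters long.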
The remaining Sections explain QTAwk expressions, patterns,
action statements and more. All of these are combined into a
QTAwk utility. In using and creating QTAwk utilities, the user
needs to remember the fundamental QTAwk processing sequence:
1. QTAwk opens each input file and reads the file record by
record,
2. as each record is read, it is split into fields,
3. each pattern expression is executed, and
4. the associated action is executed for each pattern expression
which evaluates to true.
Keeping this fundamental loop in mind will make using QTAwk very
simple indeed. The advanced user should read the section on
Pattern/Actions, Section 8, for a method of speeding file search
using the FILE_SEARCH and FILE_SEARCH_PAT predefined variables.
Also the FIELDWIDTHS built-in variable may be used to alter the
manner in which QTAwk splits each input record into fields and
utilize fixed length fields within records. The RECLEN variable
may be used to alter the manner in which QTAwk reads records and
utilize fixed length records.
3.0 Regular Expressions
Regular expressions are a means of describing sequences of
"characters". In the discussion of QTAwk, "character" will be
taken to mean any character from the extended ASCII sequence of
characters from ASCII '1' to ASCII '255'. Appendix i contains a
listing of the ASCII characters with both their decimal and
hexadecimal equivalent.
A string is a finite sequence of characters. The length of a
string is the number of characters contained in the string. A
special string is the empty string, also called the null string,
which is of zero length, i.e., it contains no characters. We
shall use the symbol '' below to refer to the null string.
Another way to think of a string is as the concatenation of a
sequence of characters. Two strings may be concatenated to form
another string. Concatenating the two strings:
"abcdef"
and
"ghijklmn"
forms the third string:
"abcdefghijklmn"
In many instances, it is desirable to describe a string with
several alternatives for one or more of the characters. Thus we
may wish to find the strings:
FRED
or
TED
A convenient manner of describing both strings with the same
regular expression is
/(FR|T)ED/
Strings in QTAwk are enclosed in double quotes, ", and regular
expressions are enclosed in slashes, '/'.
3.1 'OR' Operator
The symbol '|' means "or" and so the above regular expression
would be read as: The string "FR" or the string "T" concatenated
with the string "ED". The parentheses are used to group strings
into equivalent positions in the resultant regular expression.
In this manner it is possible to build a regular expression for
several alternative strings.
In many instances it is also desirable to build regular
expressions that contain many alternatives for one character,
i.e., one character strings. For example, we may want to find
all instances of the words "doing" or "going". We could build
the regular expression:
/(d|g)oing/
3.2 Character Classes
Although the last regular expression is a fairly simple example,
it serves to introduce the notion of "character class". If we
define the notation:
[dg] = (d|g)
then we may write the regular expression as:
/[dg]oing/
The character class notation saves us from having to explicitly
write the "or" symbols in the regular expression. The "or" is
implied between each character of the class.
Now suppose that we wanted to expand our search to all five
letter words ending in "ing" and starting with any lower case
letter and having any lower case letter as the second character.
We would write the regular expression:
/(a|b|c|d|...|x|y|z)(a|b|c|d|...|x|y|z)ing/
or
/[abcd...xyz][abcd...xyz]ing/
Regular expressions in these cases can not only get very long,
but can be very tedious to write and are very prone to error. We
introduce the notion of a range of characters into the character
class and define:
[a-z] = [abcd...xyz] = (a|b|c|d|...|x|y|z)
The above regular expression can now be written:
/[a-z][a-z]ing/
This is a considerable saving and is less error prone. The hyphen, '-',
is recognized as expressing a range of characters only when it
occurs within a character class. Within character classes, the
hyphen loses this significance in the following three cases:
1. when it is the first character of the character class, e.g.,
[-b] = (-|b)
2. when it is the last character of the character class, e.g.,
[b-] = (b|-)
3. when the first character of the indicated range is greater
in the ASCII collating sequence than the second character of
the indicated range, e.g.,
[z-a]
would be recognized as:
(z|-|a)
In interpreting the range notation in character classes, QTAwk
uses the ASCII collating sequence.
[0-Z]
is equivalent to:
[0123456789:;<=>?@A-Z]
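The range rules above can be checked with a small Python sketch that expands the text between '[' and ']' into the set of characters it names. The function name expand_class is hypothetical, and the sketch deliberately ignores negation and escape sequences.

```python
def expand_class(body):
    """Return the set of characters named by the text between '[' and ']'."""
    chars = set()
    i = 0
    while i < len(body):
        first = body[i]
        # A range needs a hyphen with a character on each side of it,
        # and the left character must not collate after the right one.
        if i + 2 < len(body) and body[i + 1] == "-" and first <= body[i + 2]:
            last = body[i + 2]
            chars.update(chr(code) for code in range(ord(first), ord(last) + 1))
            i += 3
        else:
            chars.add(first)             # ordinary character or literal hyphen
            i += 1
    return chars
```

With this sketch, expand_class("z-a") yields the three characters 'z', '-' and 'a', while expand_class("0-Z") yields the digits, the seven characters from ':' to '@', and the upper case letters.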
Continuing the last example, if we did not want to limit the
first character to lower case, but also wanted to include the
possibility of upper case letters, we could use the following
regular expression:
/([A-Z]|[a-z])[a-z]ing/
This regular expression allows the first letter to be any
character in the range from A to Z or in the range from a to z.
But the "or" is implied in character classes, shortening the
above regular expression to:
/[A-Za-z][a-z]ing/
If we now wish to expand the above from all five letter words
ending in "ing" to all six letter words ending in "ing", we could
write the regular expression as:
/[A-Za-z][a-z][a-z]ing/
In general, if we did not want to specify the number of
characters between the first letter and the "ing" ending, we
could specify a regular expression as:
/[A-Za-z](|[a-z])(|[a-z])...(|[a-z])ing/
By specifying the null string in the 'or' regular expression,
the regular expression allows a character in the range a to z or
no character to match. The shortest string matched by this
regular expression would be a single upper or lower case letter
followed by "ing". The regular expression would also match any
string starting with an upper or lower case letter with any
number of lower case letters following and ending in "ing".
3.3 Closure
What we need to describe this regular expression is a notation
for specifying "zero or more" copies of a character or string.
Such a notation exists and is written as:
/[A-Za-z][a-z]*ing/
where the notation
[a-z]*
means zero or more occurrences of the character class [a-z].
This operation is called closure and the '*' is called the
closure operator. In general, the notation may be used for any
regular expression within a regular expression. The following
are valid regular expressions using the notion of zero or more
occurrences of a regular expression within another regular
expression:
/mis*ion/
would match "miion", "mision", "mission", "misssion",
"missssion", etc.
/bot*om/
would match "boom", "botom", "bottom", "botttom", "bottttom",
etc.
/(Fr|T)*ed/
would match "ed", "Fred", "Ted", "FrFred", "TTed", "FrFrFred",
"TTTed", "FrTFred", "FrFrTed", "TFrFred", etc.
As an extension to the '*' operator, we frequently would want to
search for "one or more" occurrences of a regular expression. As
above we would write this as:
/[A-Za-z][a-z][a-z]*ing/
The [a-z][a-z]* construct would ensure that at least one letter
occurred between the initial letter and the string "ing". This
occurs often enough that the notation
[a-z]+ = [a-z][a-z]*
has been adopted to handle this situation. Thus, use the
operator '*' for "zero or more" occurrences and the operator '+'
for "one or more" occurrences. The '+' operator is called the
positive closure operator.
In many cases it is desirable to search for either zero or one
occurrence of a regular expression. For example, it would be
desirable to search for names preceded by either Mr or Mrs. The
regular expression:
/Mrs*/
would find: Mr and Mrs and Mrss and Mrsss, etc. The following
regular expression will accomplish what we really want in this
case:
/Mr(|s)/
This regular expression would find 'Mr' followed by zero or one
's'.
The operator '?' has been selected to denote "zero or one" of
the preceding regular expression. Thus,
/Mrs?/ = /Mr(|s)/
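The three closure operators can be tried out in Python, whose "re" module gives '*', '+' and '?' the same meaning for the simple patterns used in this section. This is an analogue rather than QTAwk code; the helper name matches and the word lists are chosen for the illustration.

```python
import re

def matches(pattern, text):
    """True when `pattern` describes the whole of `text`."""
    return re.fullmatch(pattern, text) is not None

# '*' : zero or more of the preceding expression
star = [w for w in ["miion", "mision", "mission", "minion"]
        if matches(r"mis*ion", w)]

# '+' : one or more, so at least one letter must precede "ing"
plus = [w for w in ["ring", "Bring", "doing"]
        if matches(r"[A-Za-z][a-z]+ing", w)]

# '?' : zero or one
optional = [w for w in ["Mr", "Mrs", "Mrss"] if matches(r"Mrs?", w)]
```

The helper tests the whole word, so "Mrss" is rejected by the last pattern even though it begins with "Mrs".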
3.4 Repetition Operator
In some cases we wish to specify a minimum and maximum repeat
count for a regular expression. For example, suppose it was
desirable for a regular expression to contain a minimum of 2 and
a maximum of 4 copies of "abc". We could specify this as:
/abcabc(abc)?(abc)?/
The notation {2,4} has been adopted for expressing this. The
general form of the repetition operator is {n1,n2}. n1 and n2
are integers, with n1 greater than or equal to 1 and n2 greater
than or equal to n1, 1 ≤ n1 ≤ n2. A repetition count would be
specified as:
/r{n1,n2}/ = /rrrrrrrrrrrrrrr?r?r?r?r?r?/
              │<─── n1 ───>│          │
              │<──────── n2 ─────────>│
The above could be expressed as:
/(abc){2,4}/ = /(abc)(abc)(abc)?(abc)?/
Since the repetition operator repeats the immediately preceding
regular expression, the parentheses around "abc" are necessary to
repeat the whole string. Without the parentheses the regular
expression would expand as:
/abc{2,4}/ = /abccc?c?/
The repetition operator can be used to repeat either single
characters, groups of characters, character classes or quoted
strings. The use of the operator is illustrated below for each
case:
1. Single character:
/abc{2,4}/ = /abccc?c?/
2. Grouped regular expression:
/(abc){2,4}/ = /(abc)(abc)(abc)?(abc)?/
3. character class:
/[abc]{2,4}/ = /[abc][abc][abc]?[abc]?/
4. quoted string:
/"abc"{2,4}/ = /"abcabc(abc)?(abc)?"/
For quoted strings, the whole of the string contained
within quotes is repeated, with all repetitions maintained
within the quotes.
5. named expressions (described later):
/{abc}{2,4}/ = /{abc}{abc}({abc})?({abc})?/
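The expansion of the repetition operator into required and optional copies can be sketched in Python. The function name expand_repeat is hypothetical, and the comparison uses Python's own {2,4} operator, which is assumed here to behave like the QTAwk operator for this simple case.

```python
import re

def expand_repeat(r, n1, n2):
    """Spell out r{n1,n2} as n1 required copies plus n2 - n1 optional ones."""
    if not 1 <= n1 <= n2:
        raise ValueError("need 1 <= n1 <= n2")
    return r * n1 + (r + "?") * (n2 - n1)

grouped = expand_repeat("(abc)", 2, 4)       # "(abc)(abc)(abc)?(abc)?"
single = "ab" + expand_repeat("c", 2, 4)     # "abccc?c?"

# The spelled-out form and the {2,4} form accept the same strings.
agree = all(
    bool(re.fullmatch(grouped, "abc" * k)) == bool(re.fullmatch(r"(abc){2,4}", "abc" * k))
    for k in range(7)
)
```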
A special case exists for character classes in which the class
of characters to exclude is greater than the class of characters
to include. For example, suppose that we wanted in a certain
character position to include all characters that weren't
numerics. We could build a character class of all characters and
leave the numerics out. An easier method is to use the
"complemented" or "negated" character class. A special operator
has been introduced for this purpose. The logical NOT symbol,
'!', occurring as the first character in a character class,
negates the class, i.e., any character NOT in the class is
recognized at the character position.
Thus, to define the negated character class of all characters
which are not numerics, we would specify:
[!0-9]
To define all characters except the semi-colon, we would
specify:
[!;]
Note that the symbol '!' has this special meaning only as the
FIRST character in a character class. The caret symbol, '^', as
the FIRST character in a character class may also be used to
negate a character class. Traditionally, the caret has been used
for this purpose, but QTAwk also allows the logical NOT operator,
'!'.
Utilizing the above concepts for building regular expressions by
concatenating characters, concatenating regular expressions to
build more complicated regular expressions, using parentheses to
nest or group regular expressions within regular expressions,
using character classes to denote constructs with implied "or"s,
negated character classes to specify characters to exclude from a
given position, using the closure operators, '*', '+' and '?',
and the repetition operator, {n1,n2}, for expressing multiple
copies, very complicated regular expressions may be built for
searching for strings in files.
3.5 Escape Sequences
To round out the ability for building regular expressions for
searching, we need only a few more tools. In some cases we may
wish for the regular expression to contain blanks or tab
characters. In addition, other non-printable characters may be
included in regular expressions. These characters are defined
with "escape sequences". Escape sequences are two or more
characters used to denote a single character. The first
character is always the backslash, '\'. The second character is
by convention a letter as follows:
\a    == bell (alert)          ( \x07 )
\b    == backspace             ( \x08 )
\f    == formfeed              ( \x0c )
\n    == newline               ( \x0a )
\r    == carriage return       ( \x0d )
\s    == space (blank)         ( \x20 )
\t    == horizontal tab        ( \x09 )
\v    == vertical tab          ( \x0b )
\c    == c  [ \\ == \ ]
\ooo  == character represented by octal value ooo
         1 to 3 octal digits acceptable
\xhhh == character represented by hexadecimal value hhh
         1 to 3 hexadecimal digits acceptable
Any other character following the backslash is translated to
mean that character. Thus '\c' would become a single 'c', '\['
would become '[', etc. The latter is necessary in order to
include such characters as '[', ']', '-', '!', '(', ')', '*',
'+', '?' in regular expressions without invoking their special
meanings as regular expression operators.
3.6 Position Operators
Three additional special characters have, by convention, been
defined for use in writing regular expressions, namely the period
'.', the caret, '^' and the dollar sign, '$'. The period has
been assigned to mean "any character" in the set of characters
except the newline character, '\n'. For our use the period means
any character from ASCII 1 to ASCII 9 inclusive and ASCII 11 to
ASCII 255 inclusive and exclusive of the newline character, ASCII
10.
The caret and the dollar sign are position indicators and not
character indicators. The caret, '^', is used to indicate the
beginning or start of the search string. Thus, any character
following the caret in a regular expression must be the first
character of the string to be searched; otherwise the match
fails. The dollar sign, '$', is used to indicate the end of the
search string. Thus, any character preceding the dollar sign in
a regular expression must be the last character of the string to
be searched or the match fails.
To indicate "beginning of line", the caret must be in the first
character position of a regular expression. Similarly, to
indicate "end of line", the dollar sign must be in the last
character position of a regular expression. In any other
position, these characters lose their special significance.
Thus, the regular expression:
/(^|[\s\t])A/
or
/([\s\t]|^)A/
means that 'A' must be the first character on a line, or be
preceded by a space or tab character to match. Similarly
/A($|[\s\t])/
or
/A([\s\t]|$)/
means that 'A' must be the last character on a line or be
followed by a space or tab character.
3.7 Examples
The regular expression:
/[A-Za-z][a-z]\s+.*/
will match an upper or lower case letter followed by a lower
case letter followed by one or more blanks followed by any
character except a newline zero or more times.
The regular expression:
/\([A-Z]+\)[!\s]+/
will match a left parenthesis followed by one or more upper
case letters followed by a right parenthesis followed by one or
more characters which are not blanks.
The regular expression:
/[\s\t]+ARCHIVE([\s\t]+|$)/
will match one or more blanks or tabs followed by the word (in
upper case) "ARCHIVE" followed either by one or more blanks or
tabs or by the end of line. Note this kind of construct is handy
for finding words as independent units and not buried within
other words.
The regular expression:
/([\s\t]+|$)/
is necessary to find words with trailing blanks or that end the
search line. If only [\s\t]+ had been used then words ending the
search line would not be found, since there are no trailing
blanks or tabs.
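The need for the end-of-line alternative can be checked with a small Python sketch. This is an analog, not QTAwk itself: QTAwk's `\s` (space only) is written here as a literal space, and the names `word` and `without_end` are illustrative.

```python
import re

# Analog of /[\s\t]+ARCHIVE([\s\t]+|$)/
word = re.compile(r"[ \t]+ARCHIVE([ \t]+|$)")
# The same search without the end-of-line alternative.
without_end = re.compile(r"[ \t]+ARCHIVE[ \t]+")

ends_line = "see ARCHIVE"
inside_line = "see ARCHIVE files"
buried = "see ARCHIVED files"

print(word.search(ends_line) is not None)
print(without_end.search(ends_line) is not None)
```

Only the first form finds the word when it ends the line, and neither form accepts "ARCHIVED".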
Note that for files with the newline character, '\n', at the end
of all lines, commonly called ASCII text files, it is possible to
search for regular expressions that may span more than one line.
For example, if we wanted to find all sequences of the names
Ted, Alice, George and Mary
that were separated by spaces, tabs or line boundaries, we
would write the following regular expression:
/[\s\t\n]+Ted[\s\t\n]+Alice[\s\t\n]+George[\s\t\n]+Mary[\s\t\n]/
The regular expression:
/^As\s+(Fred|Ted|Jed|Ned)\s+(began|ended)(\s+|$)/
will match the beginning of the search line followed by "As",
i.e., 'A' as the first character of the search line, followed by
's', followed by one or more blanks followed by "Fred" or "Ted"
or "Jed" or "Ned" followed by one or more blanks followed by
"began" or "ended" followed by one or more blanks or the end of
the search line. This could be modified slightly to be:
/^As\s+(Fr|T|J|N)ed\s+(began|ended)(\s+|$)/
or
/^As\s+(Fr|[TJN])ed\s+(began|ended)(\s+|$)/
Either form will result in exactly the same search.
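The equivalence of the three forms can be spot-checked with a Python sketch. It is an analog only: `\s` is written as a literal space, and the sample lines are invented for the check.

```python
import re
from itertools import product

forms = [
    r"^As +(Fred|Ted|Jed|Ned) +(began|ended)( +|$)",
    r"^As +(Fr|T|J|N)ed +(began|ended)( +|$)",
    r"^As +(Fr|[TJN])ed +(began|ended)( +|$)",
]
compiled = [re.compile(f) for f in forms]

# Invented sample lines: some should match, some should not.
names = ["Fred", "Ted", "Jed", "Ned", "Red", "ed"]
verbs = ["began", "ended", "begun"]
samples = [f"As {n} {v}" for n, v in product(names, verbs)]

def verdicts(line):
    """Return the match/no-match result of each form for one line."""
    return [c.search(line) is not None for c in compiled]

# True when all three forms give the same answer on every sample.
all_agree = all(len(set(verdicts(s))) == 1 for s in samples)
print(all_agree)
```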
3.8 Look Ahead Operator
Sometimes it is necessary to find a regular expression, but only
when it is followed by another regular expression. Thus we wish
to find "Mr", but only when it is followed by "Smith". The
"look-ahead" operator, '@', is used to denote this situation. In
general, if r is a regular expression we wish to match, but only
when followed by the regular expression s, then we would express
this as:
/r@s/
Thus, to find "Mr", but only when followed by "Smith", we have:
/Mr@[\s\t]+Smith/
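For comparison, Python's `re` module spells look-ahead as `(?=...)`. The sketch below is an analog of the QTAwk expression above, not QTAwk syntax; it shows that the text matched by the look-ahead part is not included in the match.

```python
import re

# Analog of /Mr@[\s\t]+Smith/: match "Mr" only when blanks or tabs and "Smith" follow.
mr_before_smith = re.compile(r"Mr(?=[ \t]+Smith)")

found = mr_before_smith.search("Dear Mr  Smith,")
missed = mr_before_smith.search("Dear Mr Jones,")

# The matched string is "Mr" alone; the look-ahead text is not part of it.
print(found.group(0))
```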
3.9 Match Classes
There are also circumstances in which we wish to find pairs of
characters. For example, we wish to find all clauses in a letter
enclosed within parentheses, "()", braces, "{}", or brackets,
"[]". We could write several separate regular expressions which
are identical except that one would use parentheses, another
braces, etc. A simpler method has been introduced using the
concept of matched character classes. A matched character class
is denoted as:
[#\(\{\[] and [#\)\}\]]
The first instance of a "matched character class" in a regular
expression will match any character in the class. The second
instance will match only the character in the position of the
class matched by the first instance. For example, in the above
two classes, if the character that matched the first class was
'[', then only a ']' would match the second class and not a ')'
or a '}'. Note the use of the backslash above to avoid any
confusion in interpreting the characters "()", "{}", and "[]" as
characters and not as regular expression operators. Except for ']', the
backslash is not needed since the characters do not act as
operators within a character class. For the character ']', the
backslash is necessary to prevent early termination of the
character class.
Note that matched character classes cannot be nested. Thus, the
span of characters between two different matched character
classes cannot overlap. If we wanted to find regular expressions
contained within "([" and ")]" or within "{[" and "}]", the
instances of each in the regular expression could not overlap,
i.e., we could NOT write a regular expression like:
/this [#\(\[] exp [#\{\[] contains [#\)\]] two [#\}\]] matched/
      │<────────────────────────────────>│           │
      │           │<────────────────────────────────>│
This regular expression would be interpreted as:
/this [#\(\[] exp [#\{\[] contains [#\)\]] two [#\}\]] matched/
      │<───────────────>│          │<───────────────>│
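Python's `re` module has no matched character class, so the idea can only be emulated. The sketch below is a hypothetical helper (`find_enclosed`, `OPENERS` and `CLOSERS` are illustrative names) that pairs each opening character with the closing character in the same position of the class, which is the behavior described above for a single pair of matched classes.

```python
import re

# Position i of OPENERS pairs with position i of CLOSERS,
# as in the classes [#\(\{\[] and [#\)\}\]].
OPENERS = "({["
CLOSERS = ")}]"

def find_enclosed(text):
    """Return the clauses enclosed by a positionally matching pair of delimiters."""
    found = []
    for opener, closer in zip(OPENERS, CLOSERS):
        # [^closer]* keeps the clause free of its own closing delimiter.
        pattern = re.escape(opener) + "[^" + re.escape(closer) + "]*" + re.escape(closer)
        found.extend(m.group(0) for m in re.finditer(pattern, text))
    return found

clauses = find_enclosed("a (b) c [d] e {f} g (h]")
print(clauses)
```

The mismatched "(h]" is not reported, since ']' does not occupy the position paired with '('.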
3.10 Named Expressions
If the strings to be found using regular expressions are
complicated, the associated regular expressions can become very
difficult to understand. This makes it very hard to determine if
the regular expression is correct. For example, the regular
expression (as one line):
/^[A-Za-z_][A-Za-z0-9_]*([\s\t]+\**[A-Za-z_][A-Za-z0-9_]*)*
\((([\s\t]*[\*&]*[A-Za-z_][A-Za-z0-9_]*[\s\t]*)(,([\s\t]*
[\*&]*[A-Za-z_][A-Za-z0-9_]*[\s\t]*))*)*\)([\s\t]*
(\/\*.*\*\/)[\s\t]*)*$/
will find function definitions in C language programs.
Constructing and analyzing this regular expression as a single
entity is difficult.
Breaking such regular expressions into smaller units, which are
shorter and simpler, makes the task much easier. QTAwk has
introduced the concept of "named expressions" for this purpose.
Named expressions are QTAwk variable names enclosed in braces,
'{' '}'. In translating the regular expression into internal
form, QTAwk scans the regular expression for named expressions
and substitutes the current value of the variable named. If a
variable does not exist by the name specified, no substitution is
made.
By defining a variable:
fst = "first words";
Then the following regular expression:
/The {fst} of the child/
would expand into:
/The first words of the child/
Named expressions allow for building up regular expressions
from smaller more easily understood regular expressions and for
re-using the smaller regular expressions. The following example
QTAwk utility builds the previous regular expression for
recognizing C language function definitions (all on one line)
from many smaller regular expressions. Each constituent regular
expression is built to recognize a particular part of the
function definition. When combined into the final regular
expression, the three parts of the definition can be easily
understood. The final regular expression is expanded in the
final print statement. It spans several 80 character lines and
is much more difficult to understand due to its length and
complexity.
Example:
BEGIN {
# define variables for use in regular expressions:
# Define C name expression
c_n = /[A-Za-z_][A-Za-z0-9_]*/;
# Define C comment expression
# Note: Does NOT allow comment to span lines
c_c = /(\/\*.*\*\/)/;
# Define single line comment
c_slc = /([\s\t]*{c_c}[\s\t]*)*/;
# Define C name with pointer
c_np = /\**{c_n}/;
# Define C name with pointer or address
c_ni = /[\*&]*{c_n}/;
# Define C function type and name declaration
c_fname = /{c_n}([\s\t]+{c_np})*/;
# Define expression for first argument in function list
c_first_arg = /([\s\t]*{c_ni}[\s\t]*)/;
# Define expression for remaining arguments in function list
c_rem_arg = /(,{c_first_arg})*/;
# Define C function argument list
c_arg_list = /\(({c_first_arg}{c_rem_arg})*\)/;
#
# Expression to find all C function definitions
totl_name = /^{c_fname}{c_arg_list}{c_slc}$/;
#
# print total expression to illustrate expansion of named
# expressions
# Refer to the description of the 'replace' function
#
print replace(totl_name);
}
The string output by this utility is:
^[A-Za-z_][A-Za-z0-9_]*([\s\t]+\**[A-Za-z_][A-Za-z0-9_]*)*
\((([\s\t]*[\*&]*[A-Za-z_][A-Za-z0-9_]*[\s\t]*)(,([\s\t]*
[\*&]*[A-Za-z_][A-Za-z0-9_]*[\s\t]*))*)*\)([\s\t]*
(\/\*.*\*\/)[\s\t]*)*$
Note that in printing the regular expression, the leading and
trailing slash, '/', were not printed.
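The substitution rule for named expressions can be sketched in Python. This is a hypothetical model, not QTAwk's implementation: the function `expand` and the dictionary `variables` are illustrative, values are expanded recursively, and a name that refers to itself is assumed not to occur.

```python
import re

def expand(pattern, variables):
    """Replace {name} with variables[name]; unknown names stay as written."""
    def substitute(match):
        name = match.group(1)
        if name in variables:
            # A value may itself contain named expressions, so expand it too.
            return expand(variables[name], variables)
        return match.group(0)
    # Names start with a letter or underscore, so {2,6} is left alone.
    return re.sub(r"\{([A-Za-z_][A-Za-z0-9_]*)\}", substitute, pattern)

variables = {"fst": "first words"}
expanded = expand("The {fst} of the child", variables)
print(expanded)
```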
3.11 Predefined Names
In translating regular expressions, names starting with an
underscore and followed by a single upper or lower case letter
are reserved as predefined. The following predefined names are
currently available for use in named expressions:
Alphabetic
{_a} == [A-Za-z]
Brackets
{_b} == [{}()[]<>]
Control Character
{_c} == [\x001-\x01f\x07f]
Decimal Digit
{_d} == [0-9]
Exponent
{_e} == [DdEe][-+]?{_d}{1,3}
Floating point number
{_f} == [-+]?({_d}+\.{_d}*|{_d}*\.{_d}+)
Floating, optional exponent
{_g} == {_f}({_e})?
Hexadecimal digit
{_h} == [0-9A-Fa-f]
decimal Integer
{_i} == [-+]?{_d}+
alpha-Numeric
{_n} == [A-Za-z0-9]
Octal digit
{_o} == [0-7]
Punctuation
{_p} == [\!-/:-@[-
double or single Quote
{_q} == {_s}["'
Real number
{_r} == {_f}{_e}
zero or even number of Slashes
{_s} == (^|[!\\](\\\\)*)
printable character
{_t} == [\s-~]
graphical character
{_u} == [\x01f-~]
White space
{_w} == [\s\t]
space, \t, \n, \v, \f, \r, \s
{_z} == [\t-\r\s]
The above predefined names will take precedence over any
variables with identical names in replacing named expressions in
regular expressions and the 'replace' function.
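The precedence rule can be modeled in Python as a lookup that consults the predefined table first. The sketch is an assumption-laden analog: only a few of the predefined names are copied from the table above, `lookup` and `expand` are illustrative names, and the repetition `{1,3}` is left for Python's own regular expression syntax rather than expanded as QTAwk would.

```python
import re

# Definitions copied from the table above; only a few names are modeled.
PREDEFINED = {
    "_d": "[0-9]",
    "_e": "[DdEe][-+]?{_d}{1,3}",
    "_f": r"[-+]?({_d}+\.{_d}*|{_d}*\.{_d}+)",
    "_g": "{_f}({_e})?",
    "_i": "[-+]?{_d}+",
}

NAME = re.compile(r"\{([A-Za-z_][A-Za-z0-9_]*)\}")

def lookup(name, user_variables):
    # Predefined names win over user variables of the same name.
    if name in PREDEFINED:
        return PREDEFINED[name]
    return user_variables.get(name)

def expand(pattern, user_variables):
    """Expand named expressions, leaving unknown names as written."""
    def substitute(match):
        value = lookup(match.group(1), user_variables)
        return match.group(0) if value is None else expand(value, user_variables)
    return NAME.sub(substitute, pattern)

# The user variable _d is ignored because _d is predefined.
number = expand("{_g}", {"_d": "[a-z]"})
print(number)
```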
3.12 Tagged Strings
QTAwk recognizes and searches for regular expressions containing
parenthesized regular expressions. QTAwk makes special use of
the strings which match regular expressions contained within
parentheses. The strings matching regular expressions within
parentheses are called Tagged Strings and the QTAwk Tag Operator,
'$$', is used to refer to tagged strings. The use of the tag
operator to refer to tagged strings is explained in Section 5 on
QTAwk expressions. The discussion here will explain how tagged
strings are counted. It is important to understand how QTAwk
counts tagged strings to use the tag operator.
A pair of numbers, ln.cn, is used to label parenthesized
regular expressions according to the nesting level, ln, and the
count, cn, at a given level. There is no theoretical limit on
the number of parenthesized regular expressions or the level to
which the parenthesized regular expressions may be nested.
However, while matching a regular expression, QTAwk only keeps
track of tagged strings to a nesting level of 7, 1 ≤ ln ≤ 7, and
a maximum count of 31, 1 ≤ cn ≤ 31, for each level. QTAwk can
utilize regular expressions with parentheses nested deeper than 7
and a count greater than 31 at each level, but for use with the
tag operator, these limits apply.
The following examples illustrate the method for counting tagged
strings. Tagged strings are identified with a pair of integers:
'i' for the nesting level and 'j' for the count at a given
level. Tagged strings are counted according to the parenthesis
set in the regular expression. Thus, the examples below show
parenthesis nesting level and count using the regular expression
and not the strings matching the regular expressions.
For the regular expression:
/[Tt]he matching ((string|digit)s (can|will))/
1. ((string|digit)s (can|will))
i.j == 1.01
Nesting Level 1, First regular expression at this level
2. (string|digit)
i.j == 2.01
Nesting Level 2, First regular expression at this level
3. (can|will)
i.j == 2.02
Nesting Level 2, Second regular expression at this level
Using the same regular expression, but omitting one set of
parentheses, we have:
/[Tt]he matching (string|digit)s (can|will)/
1. (string|digit)
i.j == 1.01
Nesting Level 1, First regular expression at this level
2. (can|will)
i.j == 1.02
Nesting Level 1, Second regular expression at this level
Care must be used in determining parenthesis level and counts
for regular expressions containing variable names, predefined or
user defined. For example, the regular expression:
/({_g}|{_i}) (--> ([rfi]))/
matches floating point numbers with optional exponents or
integers followed by
"--> r" or
"--> f" or
"--> i"
When the predefined names are expanded, the regular expression
is: (Note: The regular expression has been split across
three lines since it is too long to print on a single line)
/([-+]?([0-9]+\.[0-9]*|[0-9]*\.[0-9]+)([DdEe][-+]?[0-9]([0-9])?([0-9])?)?
|[-+]?[0-9]+)
 (--> ([rfi]))/
The parenthesis set at level 1, count 1 contains the string
matching the floating point number or integer. The parenthesis
set at level 2, count 1 contains the string matching the mantissa
of the floating point number. The parenthesis set at level 2,
count 2 contains the string matching the optional exponent of the
floating point number (whether the exponent is present in the
matching string or not). The parenthesis set "(--> ([rfi]))" is
at level 1, count 2 and "([rfi])" is at level 2, count 3. The
various levels and counts are listed below:
1. ({_g}|{_i})
i.j == 1.01, level 1, count 01
2. ({_d}+\.{_d}*|{_d}*\.{_d}+)
i.j == 2.01, level 2, count 01 ---> mantissa portion of {_g}
expanded
3. ([DdEe][-+]?{_d}{1,3})
i.j == 2.02, level 2, count 02 ---> exponent portion of {_g}
expanded
4. {_d}{1,3} == [0-9]([0-9])?([0-9])?
i.j == 3.01 && 3.02, level 3, count 01 and 02
5. (--> ([rfi]))
i.j == 1.02, level 1, count 02
6. ([rfi])
i.j == 2.03, level 2, count 03
Thus, when named expressions, predefined or user defined, are
included in regular expressions, care must be taken to account
for any parenthesis set contained within the named variable when
determining parenthesis set level and count for use with the tag
operator.
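The counting rule can be sketched as a short Python function that labels each parenthesis set with its level and count. It is a hypothetical helper, written from the examples in this section rather than from QTAwk's source: the count at a level runs across the whole expression, escaped characters are skipped, and a ']' is assumed to close a character class unless escaped. Named expressions must be expanded before the function is applied.

```python
def tag_labels(pattern):
    """Return (level, count, text) for each parenthesis set, in order of opening."""
    labels = []
    counts = {}        # parenthesis sets seen so far at each nesting level
    open_stack = []    # (index into labels, start offset) of sets still open
    in_class = False
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            i += 2     # skip the backslash and the escaped character
            continue
        if in_class:
            if ch == "]":
                in_class = False
        elif ch == "[":
            in_class = True
        elif ch == "(":
            level = len(open_stack) + 1
            counts[level] = counts.get(level, 0) + 1
            labels.append([level, counts[level], None])
            open_stack.append((len(labels) - 1, i))
        elif ch == ")":
            index, start = open_stack.pop()
            labels[index][2] = pattern[start:i + 1]
        i += 1
    return [tuple(label) for label in labels]

for level, count, text in tag_labels("[Tt]he matching ((string|digit)s (can|will))"):
    print(f"{level}.{count:02d}  {text}")
```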
3.13 Regular Expression Operator Summary
The QTAwk regular expression operators are summarized below:
^ matches Beginning of Line as first character of regular
expression
$ matches End of Line as last character of regular expression
\c matches following: (hexadecimal value shown in parentheses)
\a == bell (alert) ( \x07 )
\b == backspace ( \x08 )
\f == formfeed ( \x0c )
\n == newline ( \x0a )
\r == carriage return ( \x0d )
\s == space (blank) ( \x20 )
\t == horizontal tab ( \x09 )
\v == vertical tab ( \x0b )
\c == c [ \\ == \ ]
\ooo == character represented by octal value ooo
1 to 3 octal digits acceptable
\xhhh== character represented by hexadecimal value hhh
1 to 3 hexadecimal digits acceptable
. matches any character except newline, '\n'
[abc0-9] Character Class - match any character in class
[^abc0-9] Negated Character Class - match any character not in
class
[!abc0-9] Negated Character Class - match any character not in
class
[#abc0-9] Matched Character Class - for second match, class
character must match in corresponding position
* - Closure, Zero or more matches
+ - Positive Closure, One or More matches
? - Zero or One matches
r(s)t embedded regular expression s. String matching s is a
"Tagged String"
r|s|t, '|' == logical 'or' operator. Regular expression r or s
or t
@ - Look-Ahead, r@t, matches regular expression 'r' only when r
is followed by regular expression 't'. Regular expression t is
not contained in the final match string. The symbol loses its special
meaning when contained within parentheses, '()', or a character
class, '[]'.
r{n1,n2} - at least n1 and up to n2 repetitions of regular
expression r. n1 and n2 are integers with 1 ≤ n1 ≤ n2
c{2,6} ==> ccc?c?c?c?
c{3,3} ==> ccc
Regular expressions grouped by quotes, "", parentheses, (),
brackets, [], or names, "{name}", are repeated as a group.
Regular expressions grouped by quotes or names, "{name}", are
enclosed within parentheses when repeated: (Note the
treatment of quoted expressions)
(r){2,6} ==> (r)(r)(r)?(r)?(r)?(r)?
[r]{2,6} ==> [r][r][r]?[r]?[r]?[r]?
{r}{2,6} ==> {r}{r}({r})?({r})?({r})?({r})?
"r"{2,6} ==> "rr(r)?(r)?(r)?(r)?"
{named_expr} - named expression. In regular expressions
"{name}" is replaced by the string value of the corresponding
variable. Unrecognized variable names are not replaced.
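The expansion of the repetition operator shown in the summary can be sketched in Python. The function name `expand_repetition` is illustrative; it covers single characters and units grouped by parentheses or brackets, and does not model the extra parentheses added for named or quoted expressions.

```python
def expand_repetition(unit, n1, n2):
    """Expand unit{n1,n2} into n1 required copies followed by n2 - n1 optional ones."""
    if not 1 <= n1 <= n2:
        raise ValueError("expected integers with 1 <= n1 <= n2")
    return unit * n1 + (unit + "?") * (n2 - n1)

print(expand_repetition("c", 2, 6))
print(expand_repetition("(r)", 2, 6))
```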
4.0 Variables
Variables in QTAwk are of four kinds:
1. user defined
2. built-in
3. field
4. tag
The names of user defined variables start with an upper or lower
case character or underscore optionally followed by one or more
upper or lower case characters, digits or underscores. Most
QTAwk built-in variables are named with upper case letters and
underscores (only three are defined with lower case characters).
Variables are defined by using them in expressions. Variables
have numeric, character, string or regular expression values or
all four depending upon the context in which they are used in
expressions or function calls.
Except for variables defined with the 'local' keyword or used as
function parameters, all variables are global in scope. That is
they are accessible and can be changed anywhere within a QTAwk
utility. Local variables will be discussed in Section 10, where
the 'local' keyword is discussed. Function parameters for user
defined functions will be discussed in Section 12.
All variables are initialized with a zero (0) numeric value and
the null string value when created by reference. The value of
the variable is changed with the assignment operator, '=' or
'op='.
var1 = 45.87;
var2 = "string value";
var3 = /[\s\t]+[A-Za-z_][A-Za-z0-9_]+/;
var1 has a numeric value of 45.87 from the assignment
statement. It has a string value of "45.87" and a value as a
regular expression of /45\.87/. The string and regular
expression values of var1 may be changed by changing the value of
the built-in variable "CONVFMT". The string value of CONVFMT is
used to convert floating point numeric values to string and
regular expression values. CONVFMT is initialized with a value
of "%.6g" and can be changed with an assignment statement. Such
changes would then affect the string and regular expression
values of floating point numeric quantities. For example, if
CONVFMT is assigned a value of "%u", then the string and regular
expression values of var1 would become "45" and /45/
respectively.
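The effect of the conversion format can be seen with Python's printf-style formatting, which accepts the same "%.6g" specification. This is an analog only: `to_string` is an illustrative name, and "%d" stands in here for the "%u" of the text.

```python
def to_string(value, convfmt="%.6g"):
    """Format a floating point value with a printf-style conversion format."""
    return convfmt % value

default_form = to_string(45.87)         # default format keeps the fraction
integer_form = to_string(45.87, "%d")   # integer format drops the fraction

print(default_form, integer_form)
```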
The numeric value of both var2 and var3 is zero (0).
The string value of var3 is "[\s\t]+[A-Za-z_][A-Za-z0-9_]+".
Note that the tab escape sequence, '\t', is not expanded in
converting the regular expression to a string. The reverse is
not true. One difference between strings and regular expressions
is the time at which escape sequences such as '\t' are translated
to ASCII hexadecimal characters. For strings, the translation is
done when the strings are read from the QTAwk utility file. For
regular expressions the escape sequences are translated when the
regular expression is converted to internal form. For this
reason, strings used in the place of regular expressions undergo
a double translation, first when read from the QTAwk utility file
and second when converted into the internal regular expression
form. The second translation of strings used for regular
expressions is the reason backslash characters, '\', must be
doubled for strings used in this manner. Refer to Section 7 for
a more complete discussion of strings and regular expressions.
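Python behaves in a similar way for ordinary (non-raw) string literals, which makes it a convenient analog for the double translation. The sketch assumes nothing about QTAwk beyond the description above.

```python
import re

# An ordinary string literal is translated once when the source is read,
# so "\\." holds the two characters backslash and period.
from_string = "45\\.87"

# The regular expression compiler translates that pair a second time,
# yielding a literal period.
pattern = re.compile(from_string)

# With a single backslash and a recognized escape, the first translation
# consumes it: "\t" is already a one-character tab before compilation.
tab_string = "\t"

print(len(from_string), len(tab_string))
```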
5.0 Built-in Variables
QTAwk offers the following built-in variables. The variables
may be set by the user. Those marked with an asterisk, '*', are
new to QTAwk. Those marked with a plus sign, '+', have new or
expanded meaning in QTAwk.
1. * _arg_chk ==> TRUE/FALSE. Default value = 0. If assigned
a FALSE value, the number of arguments passed to a user
defined function is checked only to ensure that the number is
not more than defined. Arguments defined, but not passed are
initialized and passed for use as local variables as in Awk.
If assigned a true value, the number of arguments passed to a
user defined function is checked against the number defined
for the function, unless the function was defined with a
variable number of arguments. If the number passed is not
exactly equal to the number defined, an error message is
issued and execution halted. For this case, any local
variables must be defined with the 'local' keyword.
2. ARGC ==> set equal to the number of arguments passed to
QTAwk as in Awk.
3. * ARGI ==> equal to the index value in ARGV of the next
command line argument to be processed. This value may be
changed and will change the array element of ARGV processed
next. When the last element of ARGV is the current input
file, ARGI is set to one of two integer values:
a) the integer value of the index of the last element of
ARGV plus one, or
b) if the last element of ARGV has a string index, ARGI is
set to zero.
Setting ARGI to zero, ARGC or a value for which there is no
element of ARGV with a corresponding index value, will cause
the current input file to be the last command line argument
processed.
4. ARGV ==> one-dimensional array with elements equal to the
arguments passed to QTAwk as in Awk. The index values are
integers ranging in value from zero to ARGC. ARGV[0] ==
filename by which QTAwk was invoked, including full path
information.
5. * CLENGTH ==> length of string matched in 'case' statement
6. * CSTART ==> start of string matched in 'case' statement
7. CONVFMT ==> STRING conversion format for floating point
numbers. Default value of "%.6g".
8. * CYCLE_COUNT ==> value for the current cycle through the
outer pattern match loop for the current record. Value
incremented by the 'cycle' statement.
9. * DEGREES ==> TRUE/FALSE. Default value = 0. If assigned a
false value, trigonometric functions assume radian values are
passed and return radian values. If assigned a TRUE value,
trigonometric functions assume degree values are passed and
return degree values.
10. * DELAY_INPUT_PARSE ==> TRUE/FALSE. Default value = 0.
Used to delay parsing of the current input record until the
value of NF or one of the field variables, $i, 1 ≤ i ≤ NF, is
needed. The default value is false. If the value is true,
then the input record is not parsed until necessary. For
utilities which do not reference NF or the field variables,
$i, in any pattern expressions (or only in seldom executed
pattern expressions), delaying the parsing of the input record can
speed the execution of the utility significantly. The normal
sequence of execution is:
a) determine next record according to RS and read record,
b) set FNR and NR,
c) parse record according to FS and set NF and $i, 1 ≤ i ≤
NF,
d) start executing pattern expressions
The third step in the above sequence can be delayed until a
field variable value or NF is needed by setting
DELAY_INPUT_PARSE to a true value.
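The idea of delayed parsing can be sketched in Python as a record that is split into fields only when NF or a field is first requested. The class `Record` and its members are hypothetical and splitting on white space is a simplification; QTAwk's internal mechanism is not documented here.

```python
class Record:
    """Hypothetical input record that splits into fields only on first use."""

    def __init__(self, text):
        self.text = text
        self._fields = None     # not parsed yet
        self.parse_count = 0    # how many times splitting actually ran

    def _parse(self):
        if self._fields is None:
            self._fields = self.text.split()
            self.parse_count += 1
        return self._fields

    @property
    def nf(self):
        return len(self._parse())

    def field(self, i):
        # Fields are numbered from 1, as $1 .. $NF.
        return self._parse()[i - 1]

record = Record("alpha beta gamma")
print(record.parse_count)   # nothing has been parsed so far
print(record.nf, record.field(2), record.parse_count)
```

A utility that never asks for NF or a field never pays for the split.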
11. * ECHO_INPUT ==> TRUE/FALSE. Default value = 0. If
assigned a true value, when reading from the keyboard file,
the input is echoed to the standard output file, normally the
console display.
When the standard input file is the keyboard, any characters
read are always echoed to the console display. In addition,
if ECHO_INPUT is true, then any characters read are also
echoed to the standard output file.
When the standard input file is redirected from a file or
piped from the output of another application program, it is
convenient to set ECHO_INPUT to a false value to prevent
echoing each input character to the standard output file.
For input read from the keyboard file with the "fgetc"
function, ECHO_INPUT should be set to a true value to display
key values as they are pressed.
The following table shows the effect of ECHO_INPUT on input
from the standard input file:
ECHO_INPUT   stdin from keyboard      redirected stdin
value        to display  to stdout    to display  to stdout
true         yes         yes          no          yes
false        yes         no           no          no
For more information on the "keyboard" and "standard
input" files, refer to Section 11.
12. ENVIRON ==> one-dimensional array with elements equal to
the environment strings passed to QTAwk. The index values
are integers ranging in value from zero to the number of
environment strings defined less one.
13. * FALSE ==> predefined with zero, 0, constant value.
14. * FIELDFILL ==> used to fill a field when the replacement
value is less than the field width or if the field changed is
greater than the number of fields in the original record. If
the field changed is greater than the number of fields
defined in FIELDWIDTHS, the extra fields created are
initialized as null strings separated by the string value of
OFS. This variable value is used only when field splitting
is based on FIELDWIDTHS rather than FS. The default value is
a single blank character.
15. * FIELDWIDTHS ==> when assigned a string value containing
space separated integral numbers of the form:
n1 n2 n3 ... nn
the splitting of input records into fields is governed by
the numbers in FIELDWIDTHS rather than FS. Each number in
FIELDWIDTHS specifies the width of a field including columns
between fields. If you want to ignore the columns between
fields, you can specify the width as a separate field that is
subsequently ignored. When the value of FIELDWIDTHS does not
match this form, field splitting is done using FS in the
usual manner. If the length of the input record is greater
than the sum of the field widths specified in FIELDWIDTHS,
QTAwk creates an additional field and assigns the remainder
of the input record to the field.
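Fixed-width splitting as described for FIELDWIDTHS can be sketched in Python. The function `split_by_widths` is a hypothetical model: it creates the additional field for any remainder, and it does not model FIELDFILL or records shorter than the listed widths.

```python
def split_by_widths(record, fieldwidths):
    """Split record into fixed-width fields given a string such as "5 2 8"."""
    widths = [int(w) for w in fieldwidths.split()]
    fields = []
    position = 0
    for width in widths:
        fields.append(record[position:position + width])
        position += width
    if position < len(record):
        # Text beyond the listed widths becomes one additional field.
        fields.append(record[position:])
    return fields

print(split_by_widths("ABCDE12xyz", "5 2"))
```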
16. * FILEATTR ==> contains the attributes of the current input
file in a string. The string defines the attributes in the
same manner as the variable for the 'findfile' function
defines the attributes for the files found by that function.
Changing the value of this variable has no effect on the
attributes of the current input file.
17. * FILEDATE ==> contains the date of the current input file
in the operating system format. The 'sdate' functions may be
used to format the date. Changing the value of this variable
has no effect on the date of the current input file.
18. * FILEDATE_CREATE ==> contains the creation date of the
current input file in the operating system format. The
'sdate' functions may be used to format the date. Changing
the value of this variable has no effect on the creation date
of the current input file. On PC/MS-DOS systems, the value
of this variable equals the value of FILEDATE.
19. * FILEDATE_LACCESS ==> contains the last file access date
of the current input file in the operating system format.
The 'sdate' functions may be used to format the date.
Changing the value of this variable has no effect on the last
access date of the current input file. On PC/MS-DOS systems,
the value of this variable equals the value of FILEDATE.
20. + FILENAME ==> equal to the string value of current input
file, NOT including any path specified. If assigned a new
value, the file with a name equal to the new string value is
opened (or an error message displayed if the filename is
illegal). The new file becomes the current input file. The
former input file is not closed and input may continue from
the current position by re-assigning FILENAME, putting the
name in ARGV for future use or read with the 'fgetline'
function.
If assigned a new string value by the user's utility, the
full drive and path must be specified if necessary to find
the file. QTAwk will strip off the drive and path and assign
the filename to the "FILENAME" variable. The following
statements will save the current file path and name and
re-assign them later:
# Save Current File Name
fpn_hold = FILEPATH ∩ FILENAME;
.
.
.
# Revert To Processing Original Input File
FILENAME = fpn_hold;
21. * FILEPATH ==> contains the drive and path of the current
input file. The path string ends with the subdirectory
separator character. Changing the value of this variable has
no effect on the path of the current input file.
22. * FILE_SEARCH ==> true/false values. Default value = 0.
Value determines whether next input record read from the
current input file is the next physical record or the next
record(s) matching a pattern in FILE_SEARCH_PAT. If assigned
a false value, then the next physical record is read. If
assigned a true value, then the next record(s) matching a
pattern in FILE_SEARCH_PAT is read. Multiple records may be
read if SPAN_RECORDS is true and the match spans more than
one record. See the end of Section 8 for a more complete
explanation of searching the input file for the next record.
23. * FILE_SEARCH_PAT ==> assigned by the user to the patterns
to match in the current input file if FILE_SEARCH contains a
true value. If FILE_SEARCH_PAT is an array, then all array
elements are scanned for matches and the MATCH_INDEX built-in
variable set accordingly. See the end of Section 8 for a
more complete explanation of searching the input file for the
next record.
24. * FILESIZE ==> contains the size in bytes of the current
input file. Changing the value of this variable has no
effect on the size of the current input file.
25. * FILE_SORT ==> String value of this variable defines the
sort order for the array returned by the 'findfile'
function. The default value is the null string, i.e., the
array contains files in the order returned by the operating
system.
Valid character values for sorting in the string are:
'n' or 'N' -- sort by file name
'e' or 'E' -- sort by file extension
'd' or 'D' -- sort by file date
't' or 'T' -- sort by file time
's' or 'S' -- sort by file size
Any other character will be ignored.
If multiple sort options are specified, the files returned
are sorted in the order of preference given.
For example:
# Will Sort By Date, Time, File Extension, File Name And
# Last By Size
FILE_SORT = "dtens";
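The ordering rule can be sketched in Python as a sort on a tuple of keys taken in the order the characters appear. The record layout, `KEY_INDEX` and `sort_files` are hypothetical, and ascending order is an assumption, since the text above does not state the direction.

```python
# Hypothetical file records: (name, extension, date, time, size).
KEY_INDEX = {"n": 0, "e": 1, "d": 2, "t": 3, "s": 4}

def sort_files(files, file_sort):
    """Sort by the keys named in file_sort, in the order given.

    Characters that are not sort options are ignored.
    """
    order = [KEY_INDEX[c] for c in file_sort.lower() if c in KEY_INDEX]
    if not order:
        return list(files)   # null string: keep the order received
    return sorted(files, key=lambda f: tuple(f[i] for i in order))

files = [
    ("b", "txt", "1995-01-08", "10:00", 5),
    ("a", "doc", "1995-01-08", "09:00", 9),
    ("c", "txt", "1994-08-04", "12:00", 1),
]
print([f[0] for f in sort_files(files, "dt")])
```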
26. * FILETIME ==> contains the time of the current input file
in the DOS format. The 'stime' functions may be used to
format the time. Changing the value of this variable has no
effect on the time of the current input file.
27. * FILETIME_CREATE ==> contains the creation time of the
current input file in the DOS format. The 'stime' functions
may be used to format the time. Changing the value of this
variable has no effect on the creation time of the current
input file. On PC/MS-DOS systems the value of this variable
equals the value of FILETIME.
28. * FILETIME_LACCESS ==> contains the last file access time
of the current input file in the DOS format. The 'stime'
functions may be used to format the time. Changing the value
of this variable has no effect on the last access time of
the current input file. On PC/MS-DOS systems the value of
this variable equals the value of FILETIME.
29. FNR ==> equal to current record number of current input
file. If the current value of $0 contains more than one
input record, then FNR contains the number of the last record
in $0. See the definition of the built-in variables
FILE_SEARCH, FILE_SEARCH_PAT and SPAN_RECORDS and the end of
Section 8 for a more complete explanation of searching the
input file for the next record and how $0 may contain more
than one input record. See also NR.
30. FS ==> value of current input field separator. If
FIELDWIDTHS is not assigned a value as specified above, then
the value of FS is used for splitting the current input
record into fields. The default value for FS is /{_z}+/,
i.e., any consecutive white space characters. If FS is set
on the command line or in the user utility then the following
rules apply (see also RS below):
a) setting to a single blank, ' ' or " ", will set FS to the
default value of /{_z}+/,
b) setting to a value other than a regular expression
constant, the equivalent string form will be derived and
the string considered a regular expression. The string
or regular expression will be converted to the regular
expression internal form when the assignment is made.
Input records are scanned for a string matching the
regular expression and matching strings become field
separators. The length of matching strings is governed
by the LONGEST_EXP built-in variable.
31. * Gregorian ==> TRUE/FALSE. Default == 1. If assigned a
true value, QTAwk assumes Gregorian Calendar in computing the
Julian Day Number in the "jdn" function. Also affects
calendar used in converting jdn to calendar date by the "cal"
function. If assigned a false value, QTAwk assumes the
Julian Calendar.
32. * IGNORECASE ==> TRUE/FALSE. Default == 0. If assigned a
true value, QTAwk ignores case in any operation comparing two
strings and for single character comparisons. The affected
comparisons include:
a) any match against regular expressions using the
operators, '~~' and '!~', either explicit or implied, in
pattern expressions,
b) any match against regular expressions using the
operators, '~~' and '!~', either explicit or implied, in
action expressions,
c) any match against regular expressions using the
functions: gsub(), index(), match(), split(), sub(),
strim(), srchrecord() and fsrchrecord(),
d) any search using the FILE_SEARCH_PAT variable,
e) any matches of one string against another using the '=='
and '!=' operators,
f) any match of one character against another using the '=='
and '!=' operators,
g) any match of one string against another string in
'switch'/'case' statements,
h) any match of one character against another character in
'switch'/'case' statements,
i) any match against a regular expression in 'switch'/'case'
statements,
j) all searches for record terminator character/strings
using RS,
k) all searches for field terminator character/strings using
FS.
33. * LONGEST_EXP ==> TRUE/FALSE. Default == 1. If assigned a
true value, the longest string matching a regular expression
is found in:
a) patterns
b) match operators, '~~' and '!~'
c) 'match' function
d) 'gsub' function
e) 'sub' function
f) 'strim' function
g) input record separator strings matching RS pattern(s)
h) input record field separator strings matching FS
pattern(s)
i) field separator matching in 'split' function
If assigned a false value, then the first string matching
the regular expression is found.
34. * MATCH_INDEX ==> set to the string value of the index of
the matching element when an array is used for matching. The
string value of SUBSEP separates the indices for
multidimensional arrays.
35. * MAX_CYCLE ==> Maximum value for CYCLE_COUNT for cycling
through outer pattern match loop with current input record.
Default value == 100.
36. * MLENGTH ==> length of string matched in match operators,
'~~' and '!~'
37. * MSTART ==> start of string matched in match operators,
'~~' and '!~'
38. + NF ==> equal to the number of fields in the current input
record. The action of QTAwk when NF is changed reflects the
intuitive effect of changing NF. If the new value is greater
than the current value, the current input line is lengthened
with new empty fields separated by the output field separator
strings, OFS. If the new value is less than the current
value, the current input line is shortened by truncating at
the end of the field corresponding to the new NF value.
39. * NG ==> set to the number of the GROUP expression matching
the current input record for GROUP patterns.
40. NR ==> total number of records read so far across all input
files. See also FNR.
41. OFMT ==> output conversion format for floating point
numbers. Default value of "%.6g".
42. OFS ==> output field separator. Default value of a single
blank, ' '.
43. ORS ==> output record separator. Default value of a single
newline character, '\n'.
44. * RECLEN ==> If assigned a non-zero numeric value, then the
current input file is assumed to consist of fixed length
records with the record length equal to the integral value of
RECLEN. When RECLEN determines input records, RT is assigned
the null string whenever a record is read from the current
input file.
45. RLENGTH ==> length of string matched in 'match' function
46. RSTART ==> start of string matched in 'match' function
47. * RETAIN_FS ==> TRUE/FALSE. Default value = 0. The value
of this variable is used only if input record fields are
determined by FS and not by the FIELDWIDTHS variable. If
assigned a false value, and FS is used for field parsing,
then OFS is used between fields to reconstruct $0 whenever a
field value is altered. If assigned a true value, and FS is
used for field parsing, the original field separator
characters are retained in reconstructing $0 whenever a field
value is altered.
48. RS ==> input record separator. The default value for RS is
a single newline character, '\n'. If RS is set on the
command line or in the user utility then the following rules
apply (see also FS above):
a) setting to the null string, "", will set RS to the
regular expression /\n\n/. Thus, blank lines, i.e., two
consecutive newline characters, bound input records.
b) setting to a value other than a regular expression
constant, the equivalent string form will be derived and
the string considered a regular expression. The string
or regular expression will be converted to the regular
expression internal form when the assignment is made.
The input file character stream is scanned for a string
matching the regular expression and matching strings
become record separators. The length of matching strings
is governed by the LONGEST_EXP built-in variable.
49. * RT ==> set equal to the record terminator string whenever
a new record is read from the current input file. If fixed
length records are used by setting RECLEN, then RT is
assigned the null string.
50. * SPAN_RECORDS ==> true/false value. Default value = 0.
Value determines whether or not matches to FILE_SEARCH_PAT or
the search pattern in 'srchrecord' or 'fsrchrecord' are
allowed to span records. If assigned a false value, matches
must be contained in a single record. See the end of Section
8 for a more complete explanation of searching the input file
for the next record.
51. + SUBSEP ==> index separator string, with a default value of
a comma character, ','.
The value of SUBSEP is used to separate the index values in
MATCH_INDEX when a multidimensional array is used for
matching.
52. * TRACE ==> control statement tracing. Default value = 0.
Determines whether statements are traced during execution.
53. * TRANS_FROM ==> string used by 'stran' function for
translating from. Default value is
"ABCDEFGHIJKLMNOPQRSTUVWXYZ".
54. * TRANS_TO ==> string used by 'stran' function for
translating to. Default value is
"abcdefghijklmnopqrstuvwxyz".
55. * TRUE ==> predefined with one, 1, constant value.
56. * QTAwk_Path ==> defined at program initiation to value of
the environment variable "QTAWK". Specifies paths to be
searched for input files. Multiple paths may be specified,
separated by semi-colons. The paths specified are searched
for input files in the order specified. If a drive and/or
path is specified with the filename, then that drive and path
only are searched for the desired file. If no drive or path
is specified with the filename, then the current directory is
searched and then the paths specified in QTAwk_Path. The
search sequence for files follows the order:
a) If a directory and/or path name is specified with a
file, then that path only is searched.
b) If no directory or path name is specified, then the
current directory is searched. If the file is not found,
then all paths specified in QTAwk_Path are searched for
the file in the order specified. Multiple paths may be
specified, separated by a semi-colon. This follows the
same convention for specifying multiple paths in the
system environment path setting "PATH".
The value of QTAwk_Path may be reset at any time by the
user's utility to change the paths searched for files to be
opened for reading. Setting QTAwk_Path to a false value
disables the path search, so that only the current directory
is searched.
57. * vargc ==> count of variable arguments passed to current
function invocation.
58. * vargv ==> singly-dimensioned array of variable arguments
passed to current function invocation. Indexing is numeric,
starting at one, 1.
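The file search sequence described for QTAwk_Path (item 56 above)
can be sketched in Python. This is an illustrative model only, not
QTAwk code: the function name candidate_paths and its parameters
are hypothetical, and DOS/OS/2 style paths (optional drive letter,
backslash separators) are assumed.

```python
import ntpath  # DOS/Windows style path handling from the standard library


def candidate_paths(filename, qtawk_path):
    """Return the locations to try, in order, for an input file."""
    drive, rest = ntpath.splitdrive(filename)
    if drive or ntpath.dirname(rest):
        # Rule a): an explicit drive and/or path is the only place searched.
        return [filename]
    # Rule b): current directory first, then each QTAwk_Path entry in order.
    # A false value (0 or "") leaves only the current directory.
    dirs = [d for d in (qtawk_path or "").split(";") if d]
    return [filename] + [ntpath.join(d, filename) for d in dirs]


print(candidate_paths("data.txt", "C:\\awk;D:\\lib"))
```

The first existing entry in the returned list would be the file
opened; checking for existence is left out of the sketch.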
6.0 Expressions
Expressions specify values to QTAwk or specify the computations
that a utility performs when it executes. An expression consists
of a sequence of variable name(s), constant(s), built-in
functions, user-defined functions and operator(s). The purpose
of an expression is to yield a value or assign a value to a
variable.
6.1 Operators
QTAwk provides a rich set of operators which may be used in
expressions. The QTAwk operators are listed below from highest
to lowest precedence:
Operation                 Operator            Associativity
grouping                  ()                  left to right
subscripting              []                  left to right
field                     $                   left to right
tagged string             $$                  left to right
logical negation (NOT)    !                   left to right
one's complement          ~                   left to right
increment/decrement       ++ --               right to left
unary plus/minus          + -                 left to right
exponentiation            ^                   right to left
multiply, divide,
  remainder               * / %               left to right
binary plus/minus         + -                 left to right
concatenation                                 left to right
concatenation             ∩ (ASCII 239)       left to right
shift left/right          << >>               left to right
relational                < <= > >=           left to right
equality                  == !=               left to right
matching                  ~~ !~               left to right
array membership          in                  left to right
bitwise AND               &                   left to right
bitwise XOR               @                   left to right
bitwise OR                |                   left to right
logical AND               &&                  left to right
logical OR                ||                  left to right
conditional               ? :                 right to left
assignment                = ^= *= /= %=       right to left
                          += -= &= @= |=
                          <<= >>= ∩=
sequence                  ,                   left to right
Note that QTAwk has changed some operators from C and Awk.
QTAwk has retained '^' as the Awk exponentiation operator (in C,
'^' is the bitwise XOR operator) and made '@' the QTAwk bitwise
XOR operator. QTAwk
has changed the Awk match operators to '~~' and '!~' to make them
consistent with the equality operators, '==' and '!='. This has
freed up the single tilde to restore it to its C meaning of unary
one's complement. QTAwk has also brought forward the remainder
of the C operators: shift, '<<' and '>>', bitwise operators, '&',
'@' and '|', and the sequence operator, ','.
The shift operators, '<<' and '>>', are bit shift operators for
numeric operands and character shift operators for string
operands.
Note that all expression operators are left associative except:
1. increment/decrement, '++' and '--',
2. exponentiation, '^'
3. conditional, ' ? : '
4. assignment, '='
5. assignment, '^=' '*=' '/=' '%=' '+=' '-=' '&=' '@=' '|='
'<<=' '>>=' '∩='
Left associativity means that operators of the same precedence
are evaluated left to right: 10 - 9 + 7 means (10 - 9) + 7.
Operators of higher precedence are evaluated before operators of
lower precedence. Thus, 10 + 5 * 9 means 10 + (5 * 9) and
evaluates to 55. Since the multiply operator has a higher
precedence than the addition operator, it is evaluated first.
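The same grouping rules can be checked with a short Python sketch.
Python is used here only as a stand-in: its '-', '+' and '*'
follow the precedence and associativity described above, and its
exponentiation operator is written '**' rather than '^' but is
likewise right associative.

```python
# Left associativity: operators of equal precedence group left to right.
left_assoc = 10 - 9 + 7            # evaluated as (10 - 9) + 7

# Higher precedence first: multiplication binds tighter than addition.
precedence = 10 + 5 * 9            # evaluated as 10 + (5 * 9)

# Right associativity of exponentiation: 2 ** (3 ** 2), not (2 ** 3) ** 2.
right_assoc = 2 ** 3 ** 2

print(left_assoc, precedence, right_assoc)   # 8 55 512
```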
6.2 Numeric Forms and Arithmetic Operations
QTAwk maintains two separate numeric forms, integral and
floating point. The main difference between the two forms
involves the type of arithmetic performed for the binary and
unary arithmetic operators. The arithmetic operators are:
Operation                 Operator
one's complement          ~
increment/decrement       ++ --
unary plus/minus          + -
exponentiation            ^
multiply, divide,
  remainder               * / %
binary plus/minus         + -
bitwise AND               &
bitwise XOR               @
bitwise OR                |
assignment                ^= *= /= %=
                          += -= &= @= |=
For all binary arithmetic operations (except the bitwise
operators), the following two simple rules are followed:
1. If both operand values are integers, then an integral
operation is performed and an integral result yielded.
2. If either operand value is floating point, then a floating
point operation is performed, after obtaining the equivalent
floating point value of any integer operand, and a floating
point result is yielded.
If either operand value is a string, its numeric value is
obtained before applying the above rules. If a string has no
numeric form, then an integer zero value is used.
Single character operands are treated as integers, using the
ASCII integer value of the single character.
The application of the above rules is most noticeable when one
numeric is divided by a second:
numeric_1 / numeric_2
If both numeric_1 and numeric_2 are integers, integer division
is performed and the result will be an integer. Thus:
(1 / 2) = 0
If either numeric_1 or numeric_2 is a floating point, floating
point division is performed and the result is a floating point:
1.0 / 2 = 0.5
and
1 / 2.0 = 0.5
and
1.0 / 2.0 = 0.5
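The two division rules can be modeled with a small Python
function. This is a sketch, not QTAwk code: the name qtawk_divide
is hypothetical, and truncation toward zero for negative integer
operands is an assumption borrowed from C, since the text above
only shows 1 / 2.

```python
def qtawk_divide(x, y):
    """Model of the rules: int / int stays integral, otherwise floating point."""
    if isinstance(x, int) and isinstance(y, int):
        # Assumption: the integer quotient truncates toward zero, as in C.
        q = abs(x) // abs(y)
        return q if (x >= 0) == (y >= 0) else -q
    # Either operand is floating point: promote both and divide.
    return float(x) / float(y)


print(qtawk_divide(1, 2), qtawk_divide(1.0, 2), qtawk_divide(1, 2.0))
```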
6.3 Numerics and Strings
QTAwk maintains the separate numeric forms in converting strings
to numerics. "123" will be converted to integral form, while
"123.0" will be converted to floating point form. The difference
in the two forms may become significant in cases where variables
are assigned numeric values derived from input string fields or
other strings. If numeric operations are performed and full
floating point division is desired, 0.0 should be added to a
numeric value to assure floating point form.
Thus, QTAwk has three idioms for converting between strings and
numeric forms:
1. Convert numeric value to string form:
123 ∩ "" --> "123"
2. Convert integer string form to integer value:
Decimal Integer String: "123" + 0 --> 123
Octal Integer String: "0123" + 0 --> 83
Hexadecimal Integer String: "0x012" + 0 --> 18
3. Convert integer or floating point string form to floating
point value:
"123" + 0.0 --> 123.0
"123.0" + 0.0 --> 123.0
"123.0" + 0 --> 123.0
Note that if the string has a floating point form, then a
floating point numeric will always result.
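The string conversion idioms above can be mimicked in Python. The
helper to_numeric is hypothetical and only covers whole strings
in decimal, octal, hexadecimal or floating point form; how QTAwk
treats strings that merely begin with a number is not modeled.

```python
def to_numeric(text):
    """Return an int or float for a string, or integer 0 if it has no numeric form."""
    s = text.strip()
    try:
        if s.lower().startswith("0x"):
            return int(s, 16)          # hexadecimal integer string
        if len(s) > 1 and s.startswith("0") and s.isdigit():
            return int(s, 8)           # octal integer string
        return int(s, 10)              # decimal integer string
    except ValueError:
        pass
    try:
        return float(s)                # floating point string form
    except ValueError:
        return 0                       # no numeric form: integer zero


print(to_numeric("123") + 0, to_numeric("0123") + 0, to_numeric("0x012") + 0)
print(to_numeric("123") + 0.0, to_numeric("123.0") + 0)
```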
Also, the conversion from numeric value to string form will
result in a floating point value or integer value depending upon
both the numeric value and the value of "OFMT". In converting a
numeric value to string form, QTAwk first considers whether the
numeric value is integer or floating point. If an integer value,
then a straight integer to ASCII string conversion is performed,
retaining the full accuracy of the numeric value. If, however,
the numeric value is a floating point value, then the value is
formatted using the output numeric format in the built-in
variable, 'OFMT'. The default value of OFMT is "%.6g". Changing
the value of OFMT to "%6u" will result in integer results for all
conversions of a floating point numeric to a string. An OFMT
value of "%.2f" will always result in a floating point string form
with two decimal places.
The above conversions from numeric value to string value also
apply to the default conversions made in outputting numeric
values with the 'print' and 'fprint' functions described in
Section 11.
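The numeric to string rule can be sketched in Python, whose '%'
formatting accepts the same "%.6g" and "%.2f" formats. The
function name to_string is hypothetical, and the "%6u" case is
not modeled here.

```python
def to_string(value, ofmt="%.6g"):
    """Model of numeric-to-string conversion under a given OFMT."""
    if isinstance(value, int):
        return str(value)      # integers: exact conversion, OFMT not consulted
    return ofmt % value        # floating point: formatted through OFMT


print(to_string(123456789))          # integer keeps every digit
print(to_string(123456789.0))        # float goes through "%.6g"
print(to_string(2 / 3, "%.2f"))      # two decimal places
```

The integer keeps full accuracy while the equal floating point
value is shortened by the default format, which is the
difference the text above describes.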
6.3.1 Assignment Operators
The assignment operators are used to assign values to
variables. Thus:
total = 0
initializes the variable 'total' by assigning it the value zero,
0. Multiple assignments may be accomplished in a single
expression:
total_a = total_b = total_c = total_d = 0
initializes the four variables 'total_a', 'total_b', 'total_c',
'total_d' to 0.
The left operand of the assignment operator, '=', must be a
variable.
Another form of the assignment operator is available, 'op=',
where 'op' is one of the operators, '^', '*', '/', '%', '+', '-',
'&', '@', '|', '∩', '<<', '>>'. The left operand of 'op=' must
be a variable. The expression "var op= exp" has the same effect
as "var = var op exp", except that the variable 'var' is
evaluated only once. Thus,
total = total + 45;
could be written as:
total += 45;
The effect is to increase the value of 'total' by 45.
6.4 Grouping Operators: ()
Parentheses are used for grouping parts of expressions to ensure
a particular evaluation. For example, the expression "4 + 5 * 9"
is evaluated as:
"5 * 9" is evaluated first since '*' has the highest precedence
in the expression, then
"4 + 45" is evaluated to yield the expression result of 49.
If, instead, the desired order of evaluation was:
evaluate "4 + 5" first, then
evaluate "9 * 9" to yield result of 81
Parentheses are necessary to ensure the alternative order of
evaluation: "(4 + 5) * 9". The parentheses group "4 + 5" to
ensure that the lower precedence additive operator is evaluated
before the multiplicative operator.
6.5 Arithmetic Operators
6.5.1 Unary Ones Complement: ~
The unary ones complement operator works on both numeric and
string values.
For numeric values, i.e., character, integer and floating point
values, the result of this operation is the bitwise complement of
the operand value. Each 1 bit in the operand results in a 0 bit
in the result and each 0 bit in the operand results in a 1 bit in
the result. For floating point operands, the equivalent integer
value is used for the operand.
For string values (regular expressions are considered strings
for this purpose), the ones complement operation is performed on
each character of the string.
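A Python sketch of the ones complement rules follows. The
function name ones_complement is hypothetical, and 8-bit
characters are assumed for the string case, so each character
code is complemented within the range 0 to 255.

```python
def ones_complement(value):
    """Model of unary '~' for numeric and string operands."""
    if isinstance(value, str):
        # Assumption: 8-bit characters; flip the bits of each character code.
        return "".join(chr(~ord(ch) & 0xFF) for ch in value)
    # Floating point operands use their equivalent integer value.
    return ~int(value)


print(ones_complement(5), ones_complement(5.0))
print(ones_complement(ones_complement("Test")))   # complementing twice restores the string
```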
6.5.2 Unary Increment/Decrement: ++ --
The operand of either the increment or decrement operator must
be a variable, either global or local. The unary increment and
decrement operators may be used as either prefix operators or
postfix operators.
The result of the postfix ++ operator is the value of the
operand. After the result is obtained, the operand value is
incremented. That is, the operand has the value 1 added to it.
If the operand is a string, it is converted to a numeric value
and the value incremented.
The result of the postfix -- operator is analogous to the
postfix ++ operator except that the operand value is decremented,
i.e., the value 1 is subtracted from it.
The result of the prefix ++ operator is the incremented value of
the operand. The operand has the value 1 added to it, the new
value of the operand is the result of the prefix ++ operator. If
the operand is a string, it is converted to a numeric value and
the value incremented.
The result of the prefix -- operator is analogous to the prefix
++ operator except that the operand value is decremented, i.e.,
the value 1 is subtracted from it.
These rules can be easily seen in the expressions:
a = b = c++ = 1;
print a,b,c;
output --> 1 1 2
a++ = ++b + c++;
print a,b,c;
output --> 5 2 3
The first statement sets a, b and c each to 1. c is then
incremented to 2 by the postfix increment operator.
In the second statement, the prefix increment operator
increments b to 2 and the new value is used for the binary '+'
operator. The current value of c, 2, is used for the binary '+'
operator and the value of c is incremented to 3 by the postfix
increment operator after the addition operation is complete. a
is set to 2 + 2 = 4. The postfix increment operator then
increments a to 5.
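Python has no '++' operator, so the second statement can be
spelled out one step at a time as a sketch of the order of
events described above. The temporary names left and right are
illustrative only.

```python
a, b, c = 1, 1, 2      # state after the first statement

b = b + 1              # prefix ++b: increment first ...
left = b               # ... then use the new value, 2
right = c              # postfix c++: use the current value, 2 ...
c = c + 1              # ... then increment c to 3
a = left + right       # a is set to 2 + 2 = 4
a = a + 1              # the postfix a++ then increments a to 5

print(a, b, c)         # 5 2 3
```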
6.5.3 Unary Plus/Minus: + -
The value of the unary plus operator is the numeric value of its
operand. The value of the unary minus operator is the negative
of the numeric value of its operand.
6.5.4 Exponentiation: ^
The result of the binary exponentiation operator, ^, is the
value of the left operand raised to the power of the right
operand. Both operands are converted to floating point numerics
in performing the operation. If both operands are integer
values, the result is an integer value. If either or both
operands are floating point values, the result is a floating
point value.
If either operand has a string value, the numeric value is
obtained before applying the above rule.
6.5.5 Multiplicative and Additive Operators: * / % and + -
The multiplicative operators, multiply, *, divide, /, and
remainder, %, and the additive operators, add, +, and subtract,
-, follow the standard arithmetic rules for their result. The
remainder operator, x % y, yields the remainder of x/y, which is
x - i*y for some integer i, such that i*y ≤ x < (i+1)*y.
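The remainder definition can be worked through in Python for
positive integer operands. The function name remainder is
hypothetical, and negative operands are deliberately left out
because the text above does not say how they are treated.

```python
def remainder(x, y):
    """Remainder of x / y for positive integers, computed as x - i*y."""
    i = x // y            # the largest integer i with i*y <= x
    return x - i * y


print(remainder(17, 5))   # i = 3, so 17 - 15 = 2
print(remainder(6, 3))    # i = 2, so 6 - 6 = 0
```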
6.6 Bitwise And, Or and Xor: & | @
For these operators only integer values are used. If the value
of either operand is a string or floating point numeric, the
corresponding integer value is obtained for use in the
operation.
The result of the binary '&' operator is the bitwise AND of the
operand values, i.e., a bit in the result is a 1 if and only if
both of the corresponding bits in the operand values are 1.
The result of the binary '@' operator is the bitwise exclusive
OR, XOR, of the operand values, i.e., a bit in the result is a 1
if and only if either the corresponding bit of the left operand
is 1 or the corresponding bit of the right operand is 1, but not
both.
The result of the binary '|' operator is the bitwise OR of the
operand values, i.e., a bit in the result is a 1 if and only if
at least one of the corresponding bits in the operand values
is 1.
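The three bitwise results can be illustrated in Python with two
four-bit values. Note that Python, like C, writes exclusive OR
as '^', where QTAwk uses '@'.

```python
x, y = 0b1100, 0b1010

and_result = x & y    # 0b1000: bits set in both operands
xor_result = x ^ y    # 0b0110: bits set in exactly one operand ('@' in QTAwk)
or_result = x | y     # 0b1110: bits set in at least one operand

print(and_result, xor_result, or_result)   # 8 6 14
```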
6.7 Subscripting: []
Array elements are accessed using the subscripting operators,
'[]'. If the variable AA is a singly dimensioned array, the
elements would be accessed as AA[expression]. The index
expression can be any valid QTAwk expression. For
multidimensional arrays, multiple subscripting operators are
used. If BB is a two-dimensional array, then the rows are
subscripted as BB[row_number_exp] and the individual columns in
each row as BB[row_number_exp][column_number_exp]. Refer to
Section 6 for a discussion of arrays.
6.8 Shift Operators: << >>
The shift operators perform either a bitwise or character shift
of the left operand. If the left operand is a string, a
character shift is performed. If the left operand is a numeric,
the equivalent integer value is obtained and used for a bitwise
shift. For either shift, the right operand specifies the amount
of the shift. The equivalent integer value of the right operand
is used.
The shift operators operate on string left operands in QTAwk by
shifting the string characters. For string shifts, the
characters shifted off one end of the string, wrap to the other
end.
For numeric operands and the right shift operator, >>, the
integer value of the left operand is treated as unsigned, i.e.,
the sign bit is not extended in the shift, zero bits are shifted
in. If either operand is a floating point value, the result is a
floating point value.
Thus, these operators yield the results indicated in the
following situations:
"Test String" << 2 --> "st StringTe"
"Test String" >> 3 --> "ingTest Str"
55 << 1 --> 110
55 << 2 --> 220
55.0 << 1 --> 110.0
55.0 << 2 --> 220.0
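The results listed above can be reproduced with a Python sketch.
The function name shift is hypothetical. Two points are
assumptions: a string shift count larger than the string length
is taken to wrap around, and only non-negative numeric operands
are considered, so the unsigned treatment of the sign bit is not
modeled.

```python
def shift(value, count, left=True):
    """Model of '<<' (left=True) and '>>' (left=False)."""
    n = int(count)
    if isinstance(value, str):
        if not value:
            return value
        n %= len(value)                        # assumption: counts wrap
        if not left:
            n = (len(value) - n) % len(value)  # right shift as a left rotation
        return value[n:] + value[:n]           # characters wrap to the other end
    shifted = int(value) << n if left else int(value) >> n
    # A floating point operand yields a floating point result.
    if isinstance(value, float) or isinstance(count, float):
        return float(shifted)
    return shifted


print(shift("Test String", 2))
print(shift("Test String", 3, left=False))
print(shift(55, 1), shift(55.0, 2))
```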
6.9 String Concatenation Operator: ∩
QTAwk has retained the practice of forcing string concatenation
by placing two constants, variables or function calls adjacent.
QTAwk has introduced the string concatenation operator, '∩'
(character 239, 0x0ef of the extended ASCII character set). The
string concatenation operator has the advantage of making
concatenation explicit and allowing the string concatenation
assignment operator, '∩='. Thus, string concatenation operations
which previously had to be written as:
new_string = new_string old_string;
may now be written:
new_string ∩= old_string;
Thus a loop to build a string of numerics which previously was
written as:
for( i = 8 , j = 9 ; i ; i-- ) j = j i;
can be written as:
for( i = 8 , j = 9 ; i ; i-- ) j ∩= i;
and will produce a value for j of:
"987654321"
The string concatenation operator will make some constructs
work as expected. For example, the statements:
ostr = "prefix";
suff = "suffix";
k = 1;
j = ostr ++k suff;
print "j = ",j;
print "ostr = ",ostr;
will produce the seemingly odd output:
j = prefix1suffix
ostr = 1
This results from two factors:
1. In tokenizing the statements, white space is used to separate
keywords, variable names, function names and multicharacter
operators. Otherwise it is ignored.
2. The increment operator, '++', has higher precedence than
string concatenation.
Thus, QTAwk processes the following stream of tokens:
1. j
2. =
3. ostr
4. ++
5. k
6. suff
7. ;
In interpreting the stream, '++' is encountered immediately
after 'ostr' and is interpreted as a postfix operator operating
on 'ostr' instead of a prefix operator operating on 'k'. Thus,
the stream appears to QTAwk as:
j = ostr ++ k suff;
After concatenating the current string value of ostr, "prefix",
with the string value of k, "1", ostr is converted to a numeric,
yielding a value of zero, 0, which is incremented to one, 1.
This seemingly anomalous situation can be remedied in two ways:
1. Surround ++k with parentheses, thus explicitly binding '++'
to 'k':
j = ostr (++k) suff;
2. Use the string concatenation operator, '∩', to make explicit
the string concatenation:
j = ostr ∩ ++k suff;
or
j = ostr ∩ ++k ∩ suff;
The output produced by either remedy is what was really desired:
j = prefix2suffix
ostr = prefix
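The effect of discarding white space during tokenizing can be
seen with a toy tokenizer in Python. This is a hypothetical
sketch, not QTAwk's actual lexer: it recognizes only '++', '--',
names, integers and single-character tokens.

```python
import re

# One token per match: '++' or '--', a name, an integer, or any single character.
TOKEN = re.compile(r"\s*(\+\+|--|[A-Za-z_]\w*|\d+|.)")


def tokenize(statement):
    """Split a statement into tokens, dropping the white space between them."""
    return TOKEN.findall(statement.strip())


print(tokenize("j = ostr ++k suff;"))
print(tokenize("j = ostr++ k suff;"))          # same token stream
print(tokenize("j = ostr ∩ ++k ∩ suff;"))      # explicit operator separates them
```

Because both spellings of the first statement give the same
token stream, a parser working from the tokens alone cannot tell
which operand the writer meant '++' to apply to.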
6.10 Field operator: $
The field operator is a unary operator used for accessing the
fields of the current input record. "$expression" evaluates
expression and converts the value to an integer value if
necessary. The operator yields the field of the current input
record corresponding to the integer specified. Field numbering
starts at 1 with the first field and increases to the maximum
number of fields found for the current input record. The number
of fields in the current input record is stored by QTAwk in the
built-in variable, NF, as each record is read. If the value of
the operand expression is zero, the operator yields the entire input
record. If the value of the operand expression is greater than
the number of fields in the current input record, i.e.,
expression > NF, the operator yields a null string.
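The field selection rules of this section can be sketched in
Python. The function name field is hypothetical, the default FS
(runs of white space) is assumed for splitting, and negative
operands are simply given the null string because the text does
not cover them.

```python
def field(record, n):
    """Model of '$n' applied to one input record."""
    fields = record.split()      # default FS: any consecutive white space
    n = int(n)                   # operand is converted to an integer
    if n == 0:
        return record            # $0: the entire input record
    if 1 <= n <= len(fields):    # len(fields) plays the role of NF
        return fields[n - 1]
    return ""                    # beyond NF: null string


record = "alpha beta gamma"
print(field(record, 2), field(record, 0), repr(field(record, 4)))
```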
6.11 Tagged String Operator: $$
QTAwk has introduced the tag operator, '$$'. The tag operator
works in conjunction with tagged strings as explained in Section
2. The tag operator is analogous to the field operator, but the
operand of this operator is numeric and takes five forms:
1. $$0 --> refers to the entire string matched by the last
regular expression.
2. $$0.0 --> refers to the entire string matched by the last
regular expression.
3. $$j (j an integer value) --> refers to the string matching
the regular expression contained in the parenthesis set at
nesting level 1 and count j in the last regular expression
matched in the current action pattern. Valid values of j are
in the range 1 ≤ j ≤ 31. Identical to $$1.j of the next
form.
4. $$i.j (i and j integer values) --> refers to the string
matching the regular expression contained in the parenthesis
set at the ith nesting level, jth count in the last regular
expression. Valid values of i and j are in the range:
1 ≤ i ≤ 7
1 ≤ j ≤ 31
Note that the values of j less than 10 must be specified
with a leading zero digit. Thus, a value of 1.1 will be
interpreted as 1.10, yielding values for i and j of:
i == 1
j == 10
A value for i.j specified as 1.01 yields values for i and j
of:
i == 1
j == 1
If i is greater than 7, a value of 7 is used. If j is
greater than 31, a value of 31 is used. These rules are
consistent with the rules for field selection with the '$'
operator. For, '$i' if the value of i is greater than NF,
the value of NF is used.
5. $$variable_name --> this form may expand into any one of the
previous forms. If the variable operated on by '$$' is a
string or regular expression, it is first converted to
numeric form. If the numeric value of variable_name is an
integer, the first or third forms above are appropriate. If
the numeric value is a floating point, the second and fourth
forms are appropriate. For floating point forms, only the
first two digits of the fractional part are used, the
remaining digits are discarded. The restrictions listed
above on the integer and fractional portions are maintained.
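The way an operand of '$$' maps to a nesting level i and count j
can be sketched in Python. The function name decode_tag and the
use of (0, 0) to stand for "the entire match" are hypothetical
conventions of this sketch, and non-negative operands are
assumed.

```python
def decode_tag(operand):
    """Map a '$$' operand to (level i, count j); (0, 0) means the whole match."""
    if isinstance(operand, int):
        if operand == 0:
            return (0, 0)                    # $$0: entire matched string
        return (1, min(operand, 31))         # $$j is identical to $$1.j
    text = f"{operand:.10f}"                 # e.g. 1.1 -> "1.1000000000"
    whole, frac = text.split(".")
    i, j = int(whole), int(frac[:2])         # only two fractional digits are used
    if i == 0 and j == 0:
        return (0, 0)                        # $$0.0: entire matched string
    return (min(i, 7), min(j, 31))           # values above the limits are clamped


print(decode_tag(1.1), decode_tag(1.01), decode_tag(3))
```

The first two results show why counts below 10 need the leading
zero: 1.1 selects count 10, while 1.01 selects count 1.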
When used in an action, the last regular expression match in the
corresponding pattern will set the values of the tagged strings
for the tag operator. If there was no regular expression match
in the pattern, then the value of the tag operator is the null
string.
The operator may also be used in the 'sub' and 'gsub' functions
in the same manner that '&' is used. Tagged strings found
matching the regular expression of the scan string will be
substituted for the tag operator in the replacement string.
The use of tagged strings in the 'sub' and 'gsub' functions,
described in Section 11, differs slightly in how variables
specify the tagged string desired. In actions, '$$' is a unary
operator just as '$' is. As such, both operate on the token
immediately to their right. In the replacement strings
used in 'sub' and 'gsub', '$$' (and '&') act less as operators
than as tokens specifying replacement values. As such, in the
replacement string, a means is needed of separating the variable
specifying the tagged string level and count from the surrounding
text. As for named expressions in regular expressions, the curly
braces, '{}', are used to accomplish this. Thus, in the
replacement string for the 'sub' and 'gsub' functions, a tagged
string is specified with a variable as:
"replacement string -->$${var_name}<--";
Regular expressions used in actions with the match operators,
'~~' and '!~', and with functions taking regular expressions as
arguments, will not disturb the value of the tag operator set by
the pattern. The pattern/action pair:
/{_i}/ {
print $$0;
if ( $0 ~~ /[789]/ ) print $$0;
}
With the input line:
this line contains an integer 12745
both 'print' statements will output "12745"
Using the tag operator, a simple QTAwk utility could be written
to read a file containing a list of floating point numbers, one
per line, and change the exponents of the numbers. For example
to add 5 to the exponents:
/{_d}+\.{_d}+([eEdD][-+]?({_d}{_d}?{_d}?))?/ {
if ( $$2.01 ) $$2.01 += 5;
}
{ print; }
The following lines in a file:
1.23
1.23e+45
1.23E+45
342187.3465
342187.3465e-55
342187.3465E-55
will be output by the utility as:
1.23
1.23e+50
1.23E+50
342187.3465
342187.3465e-60
342187.3465E-60
If the exponent sign were included in the second level
parenthesis set, as in the following utility:
/{_d}+\.{_d}+([eEdD]([-+]?{_d}{_d}?{_d}?))?/ {
if ( $$2.01 ) $$2.01 += 5;
}
{ print; }
the output would be:
1.23
1.23e+50
1.23E+50
342187.3465
342187.3465e-50
342187.3465E-50
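The difference between the two utilities can be reproduced with
Python's re module, where the inner parenthesis set is group 2.
This is a sketch under stated assumptions, not a translation of
QTAwk's behavior: the names NUM, SIGN_OUTSIDE, SIGN_INSIDE and
add_to_exponent are hypothetical, '\d' stands in for {_d}, and an
explicit '+' sign is kept on output so that the results match the
sample output shown above.

```python
import re

NUM = r"\d+\.\d+"
SIGN_OUTSIDE = re.compile(NUM + r"([eEdD][-+]?(\d{1,3}))?")   # first utility
SIGN_INSIDE = re.compile(NUM + r"([eEdD]([-+]?\d{1,3}))?")    # second utility


def add_to_exponent(line, pattern, delta=5):
    """Add delta to the text captured by the inner parenthesis set."""
    m = pattern.search(line)
    if m is None or m.group(2) is None:
        return line                        # no exponent: nothing to change
    old = m.group(2)                       # inner set, the analogue of $$2.01
    value = int(old) + delta
    # Assumption: an explicit '+' is kept, to match the sample output.
    new = f"+{value}" if old.startswith("+") and value >= 0 else str(value)
    return line[:m.start(2)] + new + line[m.end(2):]


print(add_to_exponent("342187.3465e-55", SIGN_OUTSIDE))   # digits only: 55 -> 60
print(add_to_exponent("342187.3465e-55", SIGN_INSIDE))    # signed: -55 -> -50
```

With the sign outside the inner set only the digits "55" are
incremented, so the exponent moves away from zero; with the sign
inside, the signed value -55 is incremented.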
Unlike the fields of the input line, which are re-derived
whenever the input line is changed, the strings matching the tag
operator are not re-derived and thus are no longer available
whenever the input line is changed. Once the input line has been
changed in any manner, the tag operator will yield only null
strings. Thus, after the assignment "$$2.01 += 5;" in the above
utility, any further uses of the tag operator will yield only
null strings and any statements with the tag operator on the left
side of the assignment operator will not change the input line.
6.12 Logical Operators: && ||
The logical operators are the logical AND, '&&', and logical OR,
'||'. A logical expression evaluates to 1 if true and 0 if
false. The operands of '&&' and '||' are evaluated from left to
right with '&&' having a higher precedence than '||'. Evaluation
of the binary '&&' and '||' ceases as soon as the value of the
operator can be determined.
Thus, for
exp1 && exp2
if 'exp1' evaluates to false, i.e., a zero numeric or null
string, then 'exp2' is not evaluated and the value of the
expression is 0, or false. Otherwise, 'exp2' is evaluated and
the value of the expression is 1 if 'exp2' is true and 0 if
'exp2' is false.
For
exp1 || exp2
if 'exp1' evaluates to true, i.e., a nonzero numeric or non-null
string, then 'exp2' is not evaluated and the value of the
expression is 1. Otherwise, 'exp2' is evaluated and the value of
the expression is 1 if 'exp2' is true and 0 if 'exp2' is false.
For expressions involving multiple '&&' and '||' operators, the
precedence of the operators must be remembered. It is probably
best to use parentheses to group expressions to avoid confusion.
For example, predict the output for the following expressions:
a = b = 1;
c = 0;
d = a++ || b++ && c++;
print a,b,c,d; #1
a = b = 1;
c = 0;
d = a++ || (b++ && c++);
print a,b,c,d; #2
a = b = 1;
c = 0;
d = (a++ || b++) && c++;
print a,b,c,d; #3
a = b = 0;
c = 1;
d = a++ || b++ && c++;
print a,b,c,d; #4
a = b = 0;
c = 1;
d = a++ || (b++ && c++);
print a,b,c,d; #5
a = b = 0;
c = 1;
d = (a++ || b++) && c++;
print a,b,c,d; #6
The outputs are:
               a  b  c  d
output #1:     2  1  0  1
output #2:     2  1  0  1
output #3:     2  1  1  0
output #4:     1  1  1  0
output #5:     1  1  1  0
output #6:     1  1  1  0
Since '&&' has higher precedence than '||', the expressions:
a++ || b++ && c++
and
a++ || (b++ && c++)
are equivalent.
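The six cases can be checked with a small model. This is a Python sketch, not QTAwk: Python has no '++', so a hypothetical helper class Var supplies a post-increment, and Python's own short-circuiting 'and'/'or' stand in for '&&' and '||'.

```python
class Var:
    """Hypothetical helper: an integer with a C-style post-increment."""
    def __init__(self, value):
        self.value = value

    def post_inc(self):
        old = self.value
        self.value += 1
        return old

def case(a0, b0, c0, group_or_first):
    """Evaluate d for one of the example expressions; return (a, b, c, d)."""
    a, b, c = Var(a0), Var(b0), Var(c0)
    if group_or_first:
        d = (a.post_inc() or b.post_inc()) and c.post_inc()
    else:
        # Same as the unparenthesized form, since && binds tighter than ||.
        d = a.post_inc() or (b.post_inc() and c.post_inc())
    return a.value, b.value, c.value, 1 if d else 0

print(case(1, 1, 0, False))   # outputs #1 and #2
print(case(1, 1, 0, True))    # output #3
print(case(0, 0, 1, False))   # outputs #4 and #5
print(case(0, 0, 1, True))    # output #6
```

An operand that is skipped by short-circuit evaluation is never incremented, which is why 'b' and 'c' keep their initial values in several of the cases.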
6.13 Comparison Operators: < <= > >= == != ~~ !~
The binary comparison operators compare the value of the left
operand expression to the value of right operand expression. The
comparison operators are the relational operators: less than '<',
less than or equal to, '<=', greater than, '>', greater than or
equal to, '>=', equal to, '==', not equal to, '!=', and the
regular expression matching operators: is matched by, '~~', is
not matched by, '!~'. The comparison operators yield a value of
1 or 0.
In comparison expressions with the relational operators, if both
operands are numeric, then the comparison is numeric. Otherwise,
any numeric operand is coerced to a string and a string
comparison made.
For input record fields, where the field can be both a string
and a numeric, e.g., $i = "123.5", it is considered a numeric.
For any comparison made with an input record field, the type of test,
numeric or string, will depend on the other operand. If both
operands are input record fields, then if both can be numerics,
they are treated as such for the comparison and the comparison
will be numeric.
For numeric comparisons, if both operands are integer values,
then the comparison made will be an integer comparison. If
either operand is a floating point, then the comparison made will
be a floating point comparison using the equivalent floating
point value for any integer value.
For string comparisons, the ASCII collating sequence will be
used. If two strings are identical except for length, the
shorter string will be considered to be less than the longer
string.
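The basic rule for the relational operators can be sketched as follows. This is a Python model under stated assumptions, not QTAwk: the function name less_than is hypothetical, Python's str() stands in for QTAwk's number-to-string coercion, and the special handling of input record fields is not modelled.

```python
def less_than(left, right):
    """Model of '<': numeric if both operands are numeric, else string."""
    numeric = (int, float)
    if isinstance(left, numeric) and isinstance(right, numeric):
        return left < right
    # Otherwise any numeric operand is coerced to a string and the
    # strings are compared character by character.
    return str(left) < str(right)

print(less_than(9, 10))         # numeric comparison
print(less_than("9", 10))       # string comparison of "9" and "10"
print(less_than("abc", "abcd")) # the shorter string is the lesser
```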
For the match operators, '~~' and '!~', the right operand is
converted to its string value if it is not already a string. The
internal regular expression form of the right operand is obtained
and used to search the left operand for a match. Refer to
Section 2 for a full description of regular expressions and their
use in matching strings.
6.13.1 Match Operator Variables
QTAwk has defined two new built-in variables associated with the
match operators, MLENGTH and MSTART. Whenever the match operator
is executed MLENGTH is set equal to the length of the matching
string or zero if no match is found. MSTART is set equal to the
position of the start of the matching string or zero if no match
is found. These built-in variables are completely analogous to
the built-in variables RLENGTH and RSTART for the built-in
'match' function.
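The relationship between a match and the two variables can be sketched in Python. This is a model only: it uses Python's 're' module rather than QTAwk's regular expressions, the function name match_vars is hypothetical, and it assumes that MSTART counts positions from 1, as RSTART does in Awk.

```python
import re

def match_vars(target, pattern):
    """Return (MSTART, MLENGTH) for the first match, or (0, 0) if none."""
    m = re.search(pattern, target)
    if m is None:
        return 0, 0
    return m.start() + 1, m.end() - m.start()

line = "this line contains an integer 12745"
print(match_vars(line, r"[0-9]+"))
print(match_vars(line, r"xyz"))
```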
6.14 Conditional Operator: ? :
The conditional operator, '? :', is a ternary operator; it takes
three operands:
expc ? expt : expf
Depending on the value of the conditional expression 'expc' only
one of the expressions 'expt', the true value expression, or
'expf', the false value expression, is evaluated, but not both.
The conditional expression, 'expc', is evaluated first. If the
value obtained from 'expc' is true, i.e., a nonzero numeric or
non-null string, then the true expression, 'expt', is evaluated
and the value obtained is the value of the operator. If the
conditional expression yields a false value, then the false
expression, 'expf' is evaluated and the value is the value of the
operator. This operator is typically used in places where an:
"if ( condition ) statement else statement"
would be used to set a value, but where it is inconvenient to use
the "if else" control flow statement and where the expressions
'expt' and 'expf' are simple expressions and not compound
statements.
A simple example often cited is 'print' for output. Depending
upon the value of a flag variable, one of two different variables
is to be output:
print flag_var ? output_var1 : output_var2;
6.15 Logical Negation: !
The logical negation operator yields the negative logical value
of the operand:
!expression
The operand expression is evaluated and if a true value, a nonzero
numeric or non-null string, is obtained, then a value of false, 0,
is the value of the operator. If a false value is yielded by the
operand expression, then a value of true, 1, is the value of the
operator.
6.16 Array Membership: in
This operator tests if the value of the left operand is an
index, at the next level of indexing, of the right operand:
expi in expa
If the value obtained in evaluating the left operand index
expression, 'expi', is an index of the array obtained from
evaluating the right array operand expression, 'expa', then a
true value, 1, is the value of the operator, otherwise a false
value, 0, is the value. If the value yielded by 'expa' is not an
array, 0 is the value of the operator.
The next level of indexing is always tested. Thus, "i in A"
tests if the current value of the variable 'i' is a valid index
of the array variable 'A'. "i in A[j]" tests if the value of 'i'
is a valid column index in the jth row of the array 'A'.
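The "next level of indexing" rule can be illustrated with nested dictionaries. This is a Python sketch, not QTAwk: the function name is_index and the sample data are hypothetical, and a dictionary stands in for a QTAwk array.

```python
def is_index(index, array):
    """Model of 'index in array': 1 if index exists at the next level, else 0."""
    if not isinstance(array, dict):
        return 0          # the right operand is not an array
    return 1 if index in array else 0

A = {1: {1: 11, 2: 12}, 2: {1: 21}}

print(is_index(1, A))      # "i in A"
print(is_index(2, A[1]))   # "i in A[j]": a column index within row 1
print(is_index(3, A[1]))
```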
6.17 Sequence Operator: ,
QTAwk uses the C sequence operator, the comma, ','. Using the
sequence operator, expressions may be combined into an expr_list:
expression_1 , expression_2 , expression_3 , ...
As in C, a list of expressions separated by the sequence
operator is valid anywhere an expression is valid. Such lists of
expressions separated by the sequence operator will be referred
to as an expression list or expr_list. Each expression in an
expr_list is evaluated in turn. The final value of the expr_list
is the value of the last expression. The sequence operator is
very useful in the loop control statements discussed in Section
10.
6.18 White Space
In tokenizing a utility, white space is used to break keywords,
variable and function names and multicharacter operators.
Otherwise it is ignored. White space is any of the characters:
\t, \n, \v, \f, \r, \s == [\t\n\v\f\r\s] == {_z}
Thus none of the characters of the multicharacter operators can
be separated by one of the white space characters. The
multicharacter operators are:
Operation              Operator
tag                    $$
increment/decrement    ++  --
shift left/right       <<  >>
relational             <  <=  >  >=
equality               ==  !=
matching               ~~  !~
array membership       in
logical AND            &&
logical OR             ||
assignment             ^=  *=  /=  %=
                       +=  -=  &=  @=  |=
                       <<=  >>=  ∩=
By observing this simple rule for multicharacter operators,
expressions such as the following will yield the expected
results:
a = b = c = d++ = 1;
a++ = b++ + ++c;
print a,b,c,d;
output ==> 4 2 2 2
6.19 Constants
Expressions in QTAwk can contain several types of constants:
1. numeric constants
a) decimal numeric constants
b) octal numeric constants
c) hexadecimal numeric constants
d) floating point numeric constants
2. character constants
3. string constants
4. regular expression constants
Numeric constants have several forms: integer constants and
floating point constants. Integers follow the C practice of
allowing decimal, octal and hexadecimal base constants.
6.19.1 Numeric Constants
Decimal constants match the form:
/{_i}/ --> [-+]?[0-9]+
Octal constants match the form:
/0{_o}+/ --> 0[0-7]+
Hexadecimal constants match the form:
/0[xX]{_h}+/ --> 0[xX][0-9A-Fa-f]+
The results of all three of the following expressions are
equivalent. All set the variable, int_cons, to the integer
value, 11567.
int_cons = 11567;
int_cons = 026457;
int_cons = 0x2d2f;
Note that octal and hexadecimal integers are recognized not
only in QTAwk expressions in utilities, but also in input record
fields and other strings.
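The equivalence of the three constants can be verified independently. The following Python sketch converts the octal and hexadecimal spellings with Python's built-in int(); it checks only the arithmetic and says nothing about how QTAwk parses them internally.

```python
decimal = 11567
octal = int("026457", 8)         # base-8 digits 2 6 4 5 7
hexadecimal = int("0x2d2f", 16)  # base-16 digits 2 d 2 f

print(decimal, octal, hexadecimal)
```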
Floating point numeric constants match the form:
{_f}
or
6.19.2 Character Constants
Character constants are single characters enclosed in single
quotes, '. The same escape sequences allowed in strings and
regular expressions are allowed in character constants. All
three of the following expressions will set the variable,
chr_cons, to 'A':
chr_cons = 'A';
chr_cons = '\x041';
chr_cons = '\101';
QTAwk will maintain variables set to character constants as
single characters, but they may be used in arithmetic expressions
as any other number. When used in arithmetic expressions, QTAwk
will automatically convert them to their ASCII numeric value.
Thus:
chr_cons = 'a';
int_cons = chr_cons / 2;
print chr_cons,int_cons;
output: a 48
Printing int_cons yields 48, the integer ASCII value of 'a' (97)
divided by 2. Note that QTAwk maintains 'chr_cons' as a
character on output and prints the character value and not the
integer value, 97.
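The arithmetic in this example can be reproduced in Python. This is only a model: Python requires an explicit ord() call where QTAwk converts automatically, and the use of integer division ('//') is an assumption made to agree with the output of 48 shown above.

```python
chr_cons = 'a'
int_cons = ord(chr_cons) // 2   # 97 // 2, assuming integer division

# The character itself is printed unchanged; only the arithmetic
# result uses the ASCII value.
print(chr_cons, int_cons)
```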
The 'substr' function will return a character constant when the
requested substring is only a single character wide.
6.19.3 String Constants
String constants are character sequences enclosed in double
quotes, ". The same escape sequences allowed in regular
expressions are allowed in string constants.
6.19.4 Regular Expression Constants
Regular Expression constants are character sequences enclosed in
slashes, '/'. Regular expressions are discussed fully in Section
2.
7.0 Arrays
Arrays in QTAwk are a blending of Awk and C. The use of the Awk
associative arrays is continued and expanded to allow integer
indices. The use of the comma to delineate multiple array
indices is discontinued. The comma is now the sequence operator
and will be so treated in array index expressions. Thus, the
reference
A[i,j]
will now reference the element of 'A' subscripted by the current
value of the variable j. As a consequence, the Awk
built-in variable SUBSEP has been assigned a new meaning; refer
to Section 4 and to the discussion on arrays later in this
section.
7.1 Multidimensional Arrays
QTAwk allows multidimensional arrays referenced in the same
manner as C. Thus:
A[i][j]
references the jth column of the ith row of the two-dimensional
array A.
7.2 Integer and String Indices
Array subscripts may be strings. Thus:
A[i]["state"]
would reference the "state" element of the ith row of the two
dimensional array, A. QTAwk allows array indices to be either
integers or strings. Single character indices, e.g., A['a'], are
treated as integer indices with an index value equivalent to the
integer ASCII value of the single character. Integer and string
indices may be used in the same array. Integer indices are
stored before string indices. Integer indices follow the usual
numeric ordering and string indices follow the ASCII collating
sequence. The ordering will be apparent in use of the 'in' form
of the 'for' statement:
for ( k in A ) statement
k is stepped through the indices of the singly dimensioned
array, A, in the order stored. Thus if A has the following
indices: 1, 3, 5, 7, 8, 9, 10, 12, 14, "county", "state", "zip".
Then k would be stepped through the indices in that order. Note
that allowing both string and integer indices overcomes the
disconcerting order of the "stringized numerical" indices of
Awk. Specifically, index 10 does not precede 2 as "10" does
precede "2" in Awk. QTAwk still allows the use of numeric
strings such as "10", "2", etc., but in most cases where such
strings would be used, the user should be aware that integer
indices are now available and will prevent the counterintuitive
ordering of Awk.
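The storage order described above, integers first in numeric order and then strings in ASCII order, can be sketched as a sort key. This is a Python model; the function name index_order is hypothetical and the sketch does not reflect how QTAwk actually stores its indices.

```python
def index_order(indices):
    """Order indices as described: integers numerically, then strings."""
    return sorted(indices, key=lambda k: (isinstance(k, str), k))

mixed = ["zip", 10, "state", 2, "county", 1]
print(index_order(mixed))

# With "stringized" numeric indices, "10" sorts before "2".
print(sorted(["2", "10"]))
```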
Character constants, or variables assigned character constant
values, may also be used for array indices. When used as array
indices, character constants are converted to their integer ASCII
value and that value used as the index value. Thus:
A['a']
and
A[97]
would access the same array element of A.
Note that only indexed elements of an array actually referenced
exist. Thus, for the array A above, the elements for indices 2,
4, 6 and 13 do not exist since they have not been referenced.
This follows the general philosophy that a variable does not
exist until it has been referenced.
7.3 QTAwk Arrays in Arithmetic Expressions
When arrays are used in arithmetic expressions in QTAwk, the
entire array is operated on or assigned. For example, if the
variable 'B' is a 3x3 array with the following values:
B[1][1] = 11, B[1][2] = 12, B[1][3] = 13
B[2][1] = 21, B[2][2] = 22, B[2][3] = 23
B[3][1] = 31, B[3][2] = 32, B[3][3] = 33
Assigning B to the variable 'A':
A = B
will duplicate the entire array into A.
A[1][1] = 11, A[1][2] = 12, A[1][3] = 13
A[2][1] = 21, A[2][2] = 22, A[2][3] = 23
A[3][1] = 31, A[3][2] = 32, A[3][3] = 33
If A and B are array variables and C is a scalar (non-array)
variable, then the following expression forms for the assignment
operators, 'op=', are legal:
1. A = B
assign one array to a second. The original elements of array
A are deleted and the elements of B duplicated into A.
2. C = B
assigning an array to a variable currently a scalar. Again
the elements of B are duplicated into elements of C which
becomes an array.
3. A = C
assigning a scalar to a variable which is an array. The
elements of the array are discarded and the variable becomes
a scalar.
4. A = B[i]...[j]
assigning an array element to a variable which is currently
an array. Since the element of an array is a scalar, this
case is essentially the same as the immediately previous
case, and A becomes a scalar.
5. A[i]...[j] = B[k]...[l]
since array elements are scalars, this is the usual scalar
assignment case.
6. A op= C
the 'op=' operator is applied to every element of A. Thus, A
+= 2, would add '2' to every element of A.
7. A op= B
the 'op=' operator is applied to every element of A for which
an element exists in B with identical indices. No elements
are created in A to match elements of B with indices
different from any element of A. Thus, the sequence of
statements:
A = B;
A += B;
would leave every element of A with twice the value of the
corresponding element of B.
There are three cases of using arrays with the assignment
operators that are not legal and for which QTAwk will issue an
error message at runtime.
1. A[i]...[j] = B
2. A[i]...[j] op= B
3. C op= B
These are all variations on the same expression. In the first
case, the expression is attempting to assign an array to an array
element. Since an array element cannot be further expanded into
an array, the assignment is not allowed. In the second and third
cases, the expressions are attempting to operate on a scalar with
an array and assign the result to the scalar. Both of these
expressions fail for the same reason: an array cannot operate on
a scalar. It is possible for a single value, a scalar, to
operate on every element of an array, but the reverse, having
each element of the array operate on the scalar is not
permitted.
The reasoning prohibiting the second and third cases above is
extended to all binary expressions involving arrays in QTAwk. In
general, arrays are allowed in expressions with binary arithmetic
operators:
~ ^ * / % + - << >> & @ |
as well as string concatenation:
A B (equivalent to A ∩ B)
In such expressions, arrays are allowed in the following forms:
1. A op B
2. A op C
But not as
C op A
It could be argued that expressions such as,
2 + A
should be allowed since '+' is commutative and the expression
could be written equivalently as,
A + 2
This is true for addition, but not for all of the binary
arithmetic operators. For example, the division operator is not
commutative.
2 / A
could not be written equivalently as:
A / 2
For this reason, QTAwk does not allow any array expressions of
the form:
scalar op array
The unary arithmetic operators may also be used to operate on
entire arrays:
++A (prefix increment operator)
--A (prefix decrement operator)
A++ (post-fix increment operator)
A-- (post-fix decrement operator)
-A (Unary minus operator)
+A (Unary plus operator)
~A (Unary one's complement operator)
An expression such as:
A + B
will result in an array with element indices identical to those
of A, and with values which are the sum of the elements of A and
B, which have identical indices. If A has an element for which B
does not have a corresponding element, the resultant element
value is equal to the A element value. Elements of B which have
no corresponding element in A are not represented in the
resultant array.
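The element-wise rules for "A op C" and "A op B" can be sketched with flat dictionaries. This is a Python model, not QTAwk: the function names are hypothetical and only singly dimensioned arrays are handled.

```python
import operator

def array_op_scalar(a, c, op):
    """A op C: apply op with the scalar c to every element of a."""
    return {k: op(v, c) for k, v in a.items()}

def array_op_array(a, b, op):
    """A op B: combine elements with identical indices.

    Elements of a with no partner in b keep their value; elements of b
    with no partner in a do not appear in the result.
    """
    return {k: op(v, b[k]) if k in b else v for k, v in a.items()}

A = {1: 1, 2: 2, 3: 3}
B = {2: 10, 3: 20, 4: 30}

print(array_op_array(A, B, operator.add))
print(array_op_scalar(A, 2, operator.mul))
```

Note that index 4 of B is absent from the sum, and that element 1 of A passes through unchanged.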
An array with elements of double the value of the elements of B
can be created as:
A = B;
D = A + B;
or as
D = B + B;
or as
D = B * 2;
Any of the above sequences of statements will result in an array,
D, with elements with indices identical to B, and with double the
element values. The array A could be made an array with elements
twice the element values of B with the statement:
A = B;
A *= 2;
Arrays may be used in expressions with arithmetic operators and
the whole array will be utilized in the expression. This does
not extend to all of the logical operators:
! < <= >= > == != && ||
Using an array with a logical operator listed above will result
in the first element in the array only being used in the
expression.
7.4 Arrays as Regular Expressions
Arrays can be used with the match operators, ~~ and !~, both
their explicit use in expressions and their implicit use in
patterns and the built-in functions, 'match', 'sub' and 'gsub'.
The use of arrays as regular expressions is similar to the use of
the predefined pattern, GROUP. Each element of the array is
treated as a separate regular expression and the matching process
matches against all elements of the array. If a match is found
with one of the elements of the array, the built-in variable,
MATCH_INDEX, is set to the string value of the array element
index. If the array is multidimensional, the indices are
separated by the string value of the built-in variable, SUBSEP.
The default value of SUBSEP is a single comma, ','.
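How MATCH_INDEX is formed for a multidimensional array can be sketched as follows. This is a Python model using the 're' module: the function name match_array and the sample patterns are hypothetical, and the choice of the first matching element in storage order is an assumption, since the text does not say which element wins when several match.

```python
import re

SUBSEP = ","   # default value given in the text

def match_array(target, array, prefix=()):
    """Return the MATCH_INDEX string for the first matching element, or None."""
    for index, element in array.items():
        path = prefix + (index,)
        if isinstance(element, dict):
            found = match_array(target, element, path)
            if found is not None:
                return found
        elif re.search(element, target):
            return SUBSEP.join(str(i) for i in path)
    return None

tp = {1: {1: r"error", 2: r"warn"}, 2: {1: r"info"}}
print(match_array("a warning was issued", tp))
print(match_array("for your information", tp))
```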
When an array is used for searching for multiple regular
expressions, the internal form of the combined patterns is
generated. The internal form is not discarded once the match
operation has been completed. The internal form is discarded
whenever any element of the array is changed, the array is
discarded via the 'deletea' statement or the variable is assigned
a new value.
The array can be assigned to another variable after a match
operation, and the internal form is also assigned to the new
variable. This could preserve the internal form if the old
variable is assigned a new value.
The use of arrays for searching for matches to one or more
regular expressions, can be utilized to produce dynamic regular
expressions which can change in time as when strings are used for
regular expressions, but which retain the speed of static regular
expressions and their static internal form. For example, setting
a single array element as:
tp[1] = /Test: {var1}/;
and matching against the array:
if ( match_var ~~ tp ) expressions
As long as the element of tp is not changed, the internal
regular expression form is used for matching. Since the internal
form is not re-derived for each match, the matching process is
sped up. If the value of 'var1' changes or a new test pattern
is desired, then tp[1] can be re-assigned:
tp[1] = /Test: {var1}/;
The assignment discards the internal regular expression form for
the tp array which will be re-derived on the next match
operation.
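The keep-until-changed behaviour of the internal form can be sketched as a small cache. This is a Python model, not QTAwk's implementation: the class PatternArray and its 'derivations' counter are hypothetical, and a compiled 're' alternation stands in for the internal regular expression form.

```python
import re

class PatternArray:
    """Hypothetical model of an array used as a regular expression."""
    def __init__(self):
        self._elements = {}
        self._compiled = None      # the "internal form"
        self.derivations = 0       # how many times it was derived

    def __setitem__(self, index, source):
        self._elements[index] = source
        self._compiled = None      # any assignment discards the internal form

    def matches(self, target):
        if self._compiled is None:
            combined = "|".join(f"(?:{s})" for s in self._elements.values())
            self._compiled = re.compile(combined)
            self.derivations += 1
        return self._compiled.search(target) is not None

tp = PatternArray()
tp[1] = r"Test: \d+"
first = tp.matches("Test: 42")
second = tp.matches("Test: 7")        # reuses the internal form
count_before = tp.derivations
tp[1] = r"Test: [a-z]+"               # re-assignment discards it
third = tp.matches("Test: abc")       # re-derived on this match
print(first, second, count_before, third, tp.derivations)
```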
Arrays can be used in this manner not only in the action portion
of pattern-action pairs, but also in the pattern portion. Thus a
pattern such as:
tp {
.
.
.
}
can be used. The first match against tp will set the internal
regular expression form for the array. The internal form will
not change until an element of tp is changed, then the internal
form will be re-derived on the next match against the array.
Note that both individual elements of an array and the array as
a whole can be used for matching. Thus, matching against tp[1]:
if ( mvar ~~ tp[1] ) expressions
will match against the regular expression:
/Test: {var1}/;
The first such match will set the internal form of the regular
expression using the value of the variable 'var1' at that time.
This internal form does not change. Re-assigning tp[1]:
tp[1] = /Test: {var1}/;
does not alter the regular expression internal form even though
the internal regular expression form for the array tp has been
discarded by the assignment. Matching against tp as an array
will use any new value for 'var1' at the time of the new
match operation.
Arrays can be used in GROUP patterns. Any match against an
element of the array will activate the action associated with the
array position in the GROUP and set the built-in variable, NG, to
the integer value for the position of the array in the GROUP.
Using arrays in GROUP patterns does not set the MATCH_INDEX
variable. Also the internal regular expression form for the
GROUP patterns is not re-derived when the array changes.
The use of arrays as the matching regular expression in the
'gsub' function is explained in Section 11. The use of an array
as the matching regular expression in the 'gsub' function as
opposed to a loop for multiple regular expressions is also
described.
8.0 Strings, Regular Expressions and Arrays
Strings and regular expressions in QTAwk are very similar, yet
very different. Regular expressions can be used wherever strings
are used and strings may be used in most cases where a regular
expression may be used.
8.1 Regular Expression and String Translation
Regular expressions and strings used as regular expressions are
turned into an internal form for scanning the target string for a
match. For regular expressions this process of conversion into
the internal form is done once, when the regular expression is
first used. For strings the process is done every time the
string is used as a regular expression.
The process of conversion into the internal form can be time
consuming if done repeatedly. The judicious use of strings and
regular expressions can give both flexibility and speed. By
using regular expressions in those places where the content of
the regular expression will not change after the first use, the
speed of a single conversion can be attained. By using strings
in those places where a regular expression is called for, e.g.,
the first argument of the 'gsub' function and the right hand
expression for the match operators, the flexibility of
dynamically changing expressions can be gained at the expense of
speed.
QTAwk has a form of regular expression that falls between
regular expression constants and string constants: array
variables. When an array variable is used for matching, the
internal regular expression form for the entire array is
derived. The derived internal form is kept after the matching
operation and is discarded only when the array is changed. Thus,
arrays are like regular expression constants in that the internal
form is kept between matching operations, but also like string
constants in that the internal form can be changed when necessary.
8.2 Regular Expressions in Patterns
There are, however, some places where strings cannot be used as
regular expressions. The most notable of these is stand-alone
regular expressions in patterns. Stand-alone regular expressions
in patterns are a shorthand for:
$0 ~~ /re/
Thus, complex expressions may be built from stand-alone regular
expressions in patterns. For example, the pattern:
/re1/ && /re2/
will match only those records for which both regular expressions
re1 and re2 match. Using the logical, relational, equality and
bit-wise operators, two or more regular expressions may be
combined in patterns to test records against more than one
regular expression. The following pattern:
/re1/ != /re2/
will select only those records matching re1 and NOT matching
re2. But records matching re2 and not matching re1 will also be
selected.
!/re1/
will select those records not matching the regular expression.
To use regular expressions in this manner the following logical
truth table may be used for selecting desired records which match
or do not match desired regular expressions:
        r1   T  T  F  F
        r2   T  F  T  F

        ==   T  F  F  T
        !=   F  T  T  F
        <=   T  F  T  T
        <    F  F  T  F
        >    F  T  F  F
        >=   T  T  F  T
        &    T  F  F  F
        |    T  T  T  F
        @    F  T  T  F
        &&   T  F  F  F
        ||   T  T  T  F
Thus, if you wanted to select only those records that matched
both regular expressions and reject those records that did not
match both, the following patterns are the only ones to do so:
/re1/ & /re2/
or
/re1/ && /re2/
To select those records matching re1 but not re2 (and not both),
the following patterns could be used:
/re1/ > /re2/
or
/re1/ && !/re2/
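The truth table can be regenerated mechanically by treating each match result as 1 or 0. The following Python sketch does so; the names OPERATORS and row are hypothetical, Python's '^' stands in for QTAwk's '@', and 'and'/'or' stand in for '&&' and '||'.

```python
import operator

OPERATORS = {
    "==": operator.eq, "!=": operator.ne, "<=": operator.le,
    "<": operator.lt, ">": operator.gt, ">=": operator.ge,
    "&": operator.and_, "|": operator.or_, "@": operator.xor,
    "&&": lambda a, b: int(bool(a and b)),
    "||": lambda a, b: int(bool(a or b)),
}

def row(symbol):
    """Return the T/F row for one operator over (r1, r2) = TT, TF, FT, FF."""
    op = OPERATORS[symbol]
    cases = [(1, 1), (1, 0), (0, 1), (0, 0)]
    return "".join("T" if op(r1, r2) else "F" for r1, r2 in cases)

for symbol in OPERATORS:
    print(f"{symbol:>2}  {row(symbol)}")

# Operators that are true only when both expressions match.
print([s for s in OPERATORS if row(s) == "TFFF"])
```

Only '>' yields the row "FTFF", which is why it serves as a shorthand for "re1 and not re2".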
Note that strings could be used for regular expressions in
patterns instead of stand-alone regular expressions. However,
the economy of expression of the stand-alone regular expressions
would be lost. For example, for the stand-alone regular
expression pattern:
/re1/ && !/re2/
The following string expression could be used:
($0 ~~ "re1") && ($0 !~ "re2")
or
($0 ~~ "re1") && !($0 ~~ "re2")
If re1 or re2 contained named expressions, then the values of
the variables contained in re1 or re2 could be changed to
dynamically alter the lines matched. The matching process for
the above string expressions would be much slower than for the
corresponding expressions with stand-alone regular expressions.
Array variables could also be used for those situations where
re1 and/or re2 contained named expressions which change and for
which the change must be reflected in the matching process.
Also, arrays allow the user better control over when the internal
regular expression form is derived. Setting:
are1[1] = re1;
and
are2[1] = re2;
The use of are1 and are2 as matching patterns:
are1 && are2
would be identical to
re1 && re2
as a pattern expression. The advantage of using the arrays is
when any named expression in re1 and/or re2 changes value. Then
simply re-assigning the arrays as:
are1[1] = re1;
and
are2[1] = re2;
would discard the internal regular expression form for both
arrays which would then be re-derived when the pattern is next
matched against an input record. The use of arrays as pattern
matching regular expressions yields truly dynamic regular
expressions with the user utility having total control over when
the internal form is discarded and re-derived.
Regular expressions and strings may also be used in 'case'
statements as described later. However, strings are not
equivalent to regular expressions in the 'case' statement.
9.0 Pattern-Actions
The fundamental QTAwk processing sequence is:
1. QTAwk opens each input file and reads the file record by
record; a record is determined by the value of the built-in
variable RS.
2. When each record is read, it is split into fields determined
by the value of the built-in variable FS.
3. Each pattern expression is executed.
4. The associated action is executed for each pattern expression
which evaluates to true.
The above basic processing loop may be altered for special cases
as explained at the end of this section.
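The four steps of the basic loop can be sketched in a few lines. This is a Python model only, not QTAwk's implementation: the function name run and the representation of rules as (pattern, action) pairs of callables are hypothetical, text held in a string stands in for an input file, and the parameters rs and fs stand in for RS and FS.

```python
def run(rules, text, rs="\n", fs=None):
    """Apply every (pattern, action) pair to every record of text."""
    records = text.split(rs)
    if records and records[-1] == "":
        records.pop()                      # drop the piece after the final RS
    output = []
    for record in records:                 # step 1: read record by record
        fields = record.split(fs)          # step 2: split into fields
        for pattern, action in rules:      # step 3: evaluate each pattern
            if pattern(record, fields):    # step 4: run the action if true
                output.append(action(record, fields))
    return output

rules = [
    (lambda rec, f: len(f) > 2, lambda rec, f: f[0]),
    (lambda rec, f: True,       lambda rec, f: rec.upper()),
]
print(run(rules, "a b c\nd e\n"))
```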
QTAwk recognizes utilities in the following format:
pattern { action }
The opening brace, '{', of the action must be on the same line
as the pattern. Patterns control the execution of actions. When
a pattern matches a record, i.e., the pattern expression
evaluates to a true value, the associated action is executed.
Patterns consist of valid QTAwk expressions or regular
expressions. The sequence operator acquires a special meaning in
pattern expressions and loses its meaning as a sequence
operator.
QTAwk follows the C practice in logical operations of
considering a nonzero numeric value as true and a zero numeric
value as false. This has been expanded in QTAwk for strings by
considering the null string as false and any non-null string as
true. When a logical operation is performed, the operation
returns an integer value of one (1) for a true condition and an
integer value of zero (0) for a false condition.
9.1 QTAwk Patterns
QTAwk recognizes the following types of pattern/action pairs:
1. { action }
the pattern is assumed TRUE for every record and the action
is executed for all records.
2. expression
the default action {print;} is executed for every record for
which expression evaluates to TRUE.
3. expression { action }
the action is executed for each record for which expression
evaluates to TRUE.
4. /regular expression/ { action }
the actions are executed for each record for which the
regular expression matches a string in the record (TRUE
condition). The regular expression may be specified
explicitly as shown or specified by a variable with a regular
expression value. For example, setting the variable, var_re,
as:
var_re = /Replacement String/;
and specifying the pattern as:
var_re { action }
would be identical to:
/Replacement String/ { action }
The use of a variable has the advantage of being able to
change the value of the variable. Changing the variable
to another regular expression gives the QTAwk utility the
capability of dynamically changing patterns recognized.
5. compound pattern { action }
the pattern combines regular expressions with logical NOT,
'!', logical AND, '&&', logical OR, '||', bit-wise AND, '&',
bit-wise OR, '|', bit-wise XOR, '@', the relational
operators, '<=', '<', '>', '>=', the equality operators, '=='
and '!=', and the matching operators, '~~' and '!~'. The
action is executed for each record for which the compound
pattern is TRUE.
6. expression1 , expression2 { action }
range pattern. The action is executed for the first record
for which expression1 is TRUE and every record until
expression2 evaluates TRUE. The range is inclusive. This
illustrates the special meaning of the sequence operator in
patterns.
7. predefined pattern { action }
the predefined patterns are described next
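The range pattern of item 6 behaves like a small state machine.
The following Python sketch models that behaviour only. It is not
QTAwk code, and the name range_select and its predicate arguments
are hypothetical. It assumes that a record satisfying both
expressions forms a range of one record, as in Awk; the text above
does not state this.

```python
def range_select(records, starts, ends):
    """Return the records selected by the range pattern 'starts , ends'."""
    selected = []
    in_range = False
    for rec in records:
        if not in_range and starts(rec):
            in_range = True          # first record of the range
        if in_range:
            selected.append(rec)     # the range is inclusive
            if ends(rec):
                in_range = False     # last record of the range
    return selected


lines = ["a", "START", "b", "c", "STOP", "d"]
picked = range_select(lines, lambda r: r == "START", lambda r: r == "STOP")
print(picked)   # ['START', 'b', 'c', 'STOP']
```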
9.2 QTAwk Predefined Patterns
QTAwk provides six predefined patterns, all of which (except
for the 'GROUP' pattern) require actions. The six predefined
patterns are:
1. BEGIN { action }
the action(s) associated with the BEGIN pattern are executed
once prior to opening the first input file. There may be
multiple BEGIN { action } combinations. Each action is
executed in the order in which it is specified.
2. INITIAL { action }
or
INITIALIZE { action }
the action(s) associated with the INITIAL (INITIALIZE)
pattern are executed after each input file is opened and
before the first record is read. There may be multiple
INITIAL { action } combinations. Each action is executed in
the order in which it is specified.
3. GROUP expression { action }
GROUP expression1 { action }
GROUP expression2 { action }
GROUP expression3 { action }
or
GROUP expression1
GROUP expression2 { action }
GROUP expression3
GROUP expression4 { action }
or
GROUP {exp}1
GROUP {exp}2 { action }
GROUP {exp}3
GROUP {exp}4
the pattern associated with the 'GROUP' pattern keyword may
be any valid QTAwk expression. All expressions in a GROUP
are evaluated when the GROUP is first matched against an
input record. The result of the evaluation is converted to a
regular expression for matching. If the result of evaluating
a GROUP expression is an array, the entire array is used for
matching at the current position in the GROUP, i.e., all
elements of the array are converted to regular expressions
and each is matched against the current input record.
All consecutive GROUP/action pairs are grouped and the
search for the regular expressions optimized over the group.
Each expression of the GROUP may have a separate action
associated with it. In this case the appropriate action is
executed if the expression is matched on the current input
record. If the action for an expression is not given, then
the next action explicitly given is executed. If no action
is given for the last expression of a GROUP, then the default
action
{ print ; }
is assigned to it. When one of the expressions of the GROUP
is matched, the built-in variable, NG, is set equal to the
number of the expression. The numbering of the expressions
in the GROUP starts with one, 1.
There may be more than one GROUP of expression patterns.
Any pattern not preceded with the 'GROUP' keyword will cause
a GROUP to be terminated. The occurrence of the 'GROUP'
keyword again will start a new GROUP and the numbering of the
new group starts at one, 1.
GROUP patterns are discussed in more detail later.
4. NOMATCH { action }
the action(s) associated with the NOMATCH pattern are
executed for each record for which no pattern is TRUE. There
may be multiple NOMATCH { action } combinations. Each action
is executed in the order in which it is specified.
5. FINAL { action }
or
FINALIZE { action }
the actions associated with the FINAL (FINALIZE) pattern are
executed after the last record of each input file has been
read and before the file is closed. There may be multiple
FINAL { action } combinations. Each action is executed in
the order in which it is specified.
6. END { action }
the action(s) associated with the END pattern are executed
once after the last input file has been closed. There may be
multiple END { action } combinations. Each action is
executed in the order in which it is specified.
Note that there may be multiple predefined pattern-action pairs
defined in a QTAwk utility. Each action is executed at the
appropriate time in the order defined.
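The order in which the predefined patterns fire can be sketched
as a loop over the input files. This is a Python model of the
description above, not QTAwk code; the function run_patterns, the
log labels and the file names are hypothetical, and GROUP
patterns are left out.

```python
def run_patterns(files, patterns):
    """files: dict name -> list of records; patterns: list of (test, label)."""
    log = ["BEGIN"]                       # once, before the first file is opened
    for name, records in files.items():
        log.append("INITIAL " + name)     # after each file is opened
        for rec in records:
            hits = [label for test, label in patterns if test(rec)]
            # NOMATCH runs only when no pattern was TRUE for the record
            log.extend(hits if hits else ["NOMATCH " + rec])
        log.append("FINAL " + name)       # after the last record of each file
    log.append("END")                     # once, after the last file is closed
    return log


trace = run_patterns({"a.txt": ["x1", "y"], "b.txt": ["x2"]},
                     [(lambda r: r.startswith("x"), "saw x")])
for entry in trace:
    print(entry)
```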
For some QTAwk utilities, the basic processing loop as outlined
at the beginning of this chapter may be slower than necessary.
If all actions of a utility are associated with regular
expressions or if a certain record matching one or more regular
expressions must be found before any actions are executed, then
the process of reading all records and parsing into fields before
executing pattern expressions can be slow. For this purpose
QTAwk has two special built-in variables:
1. FILE_SEARCH, and
2. FILE_SEARCH_PAT
When FILE_SEARCH is TRUE, the next record read will be the
record matching a regular expression from FILE_SEARCH_PAT. If
FILE_SEARCH is FALSE, the normal file input process is followed.
The file search process may be turned on and off as necessary for
a single input file in this manner.
FILE_SEARCH_PAT is set by the user utility to one or more
regular expressions against which records from the current input
file are matched. FILE_SEARCH_PAT may be set to a single regular
expression as a simple variable, e.g.,
■ FILE_SEARCH_PAT = /test string/;
or a singly dimensioned array, e.g.,
■ FILE_SEARCH_PAT[1] = /test string 1/;
■ FILE_SEARCH_PAT[2] = /test string 2/;
■ FILE_SEARCH_PAT[3] = /test string 3/;
■ FILE_SEARCH_PAT[4] = /test string 4/;
or a multidimensioned array, e.g.,
■ FILE_SEARCH_PAT[1][1] = /test string 1,1/;
■ FILE_SEARCH_PAT[1][2] = /test string 1,2/;
■ FILE_SEARCH_PAT[1][3] = /test string 1,3/;
■ FILE_SEARCH_PAT[2][1] = /test string 2,1/;
■ FILE_SEARCH_PAT[2][2] = /test string 2,2/;
■ FILE_SEARCH_PAT[2][3] = /test string 2,3/;
■ FILE_SEARCH_PAT[3][1] = /test string 3,1/;
■ FILE_SEARCH_PAT[3][2] = /test string 3,2/;
■ FILE_SEARCH_PAT[3][3] = /test string 3,3/;
When FILE_SEARCH is TRUE, the current input file is scanned for
a match to FILE_SEARCH_PAT. When a record is found matching a
regular expression in FILE_SEARCH_PAT, the record is read, parsed
into fields according to FS and each pattern expression
executed. The associated actions for TRUE pattern expressions
are executed. Note that the variable RS still determines the
parsing of the input file into records.
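The effect of FILE_SEARCH on the records seen by the
pattern/action loop can be modelled as a filter. The Python
sketch below is an illustration only, not QTAwk code; the name
read_records is hypothetical, and the sketch assumes that FNR
keeps counting every record scanned, as the "Records Scanned"
total in the utility further on suggests.

```python
import re


def read_records(records, search_pats, file_search=True):
    """Yield (FNR, record) pairs the pattern/action loop would see."""
    for fnr, rec in enumerate(records, start=1):
        if file_search and not any(p.search(rec) for p in search_pats):
            continue                      # skipped: never parsed into fields
        yield fnr, rec


pats = [re.compile(r"test string 1"), re.compile(r"test string 2")]
data = ["nothing here", "a test string 2 b", "", "test string 1"]
seen = list(read_records(data, pats))
print(seen)   # [(2, 'a test string 2 b'), (4, 'test string 1')]
```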
Under some circumstances, the above process can return in '$0'
multiple records from the current input file. In searching the
input file for a match with FILE_SEARCH_PAT, a match may span
more than one record if the new variable, SPAN_RECORDS, is TRUE.
In this case, '$0' is set to the full set of records spanning the
match to FILE_SEARCH_PAT and FNR is set to the record number of
the last record in $0.
If SPAN_RECORDS is FALSE, any matches to FILE_SEARCH_PAT are not
allowed to span input records and '$0' will contain only a single
record.
The following simple QTAwk utility will mimic the QTGrep program
in searching for multiple regular expressions and keywords and
printing the matching lines. Using FILE_SEARCH to process only
those lines which match a desired pattern speeds the processing
of the file considerably.
# QTAwk Utility To Mimic QTGrep Program -- Only
# Slower
#
# Scan Specified Files For Matches To Multiple
# Regular Expressions And Keywords And Print
# Matching Lines With Line Number And Number Of
# Match
#
# Have Explicitly Written Regular Expressions To
# Match In 'BEGIN' Action And Must Change 'BEGIN'
# Action For Differing Searches.
#
# Use FILE_SEARCH And FILE_SEARCH_PAT Built-In
# Variables To Speed Search - Process ONLY Those
# Lines Which Match An Element In The
# FILE_SEARCH_PAT Array.
#
# Set SPAN_RECORDS To:
# TRUE - Allow Matches To Span Multiple Records
# FALSE - Confine Matches To A Single Record
# Have Set SPAN_RECORDS To TRUE In This Version.
#
BEGIN {
#### Patterns To Find Listed Below:
#
# Set First Patterns To Find
# FILE_SEARCH_PAT[1] = /Test Pattern 1/;
# FILE_SEARCH_PAT[2] = /Test Pattern 2/;
# FILE_SEARCH_PAT[3] = /Test Pattern 3/;
# FILE_SEARCH_PAT[4] = /Test Pattern 4/;
#### Patterns To Find Listed Above:
# Allow Matches To Span One Or More Input Records
SPAN_RECORDS = TRUE;
# Indicate Whether Searching For Multiple Patterns
multiple_search = FALSE;
# Turn On File Search Mode For Faster Processing
FILE_SEARCH = TRUE;
}
INITIAL {
fprintf("stderr","%s\n",FILENAME);
match_cnt = 0;
record_cnt = 0;
}
{
match_cnt++;
record_cnt++;
# set number of expression matched - match index, mi
if ( multiple_search ) mi = MATCH_INDEX + 0; else mi = 1;
# Check If Matches Span Multiple Records
# If So - Then Split Records Out Of Input Record, $0, And
# Compute Record Count Of First Record
# Print Multiple Records With Leading Zeros On Record Number
# And Expression Count To Flag That Multiple Record Match
if ( SPAN_RECORDS && match($0,/\n/) ) {
cnt = split($0,lines,/\n/);
fnr = FNR - cnt + 1;
record_cnt += cnt - 1;
for ( i in lines )
printf("%06uE: %06uR: %s\n",mi,fnr++,lines[i]);
} else printf("%6uE: %6uR: %s\n", mi,FNR ,$0);
}
# Output Totals For Individual File
FINAL {
# Update Totals Across Files
tmatch_cnt += match_cnt;
trecord_cnt += record_cnt;
printf("%s%s\n",FILEPATH,FILENAME);
printf(" Matches: %6u\n",match_cnt);
printf("Records Matched: %6u\n",record_cnt);
printf("Records Scanned: %6u\n\n",FNR);
}
# Output Totals For All Files
END {
printf("Total Matches: %6u\n",tmatch_cnt);
printf("Total Records Matched: %6u\n",trecord_cnt);
printf("Total Records Scanned: %6u\n",NR);
}
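The record-number arithmetic used above for a match spanning
several records can be checked in isolation. The Python sketch
below mirrors the statement 'fnr = FNR - cnt + 1'; the function
name number_spanned is hypothetical and the sample text is made
up for the illustration.

```python
def number_spanned(dollar0, fnr_last):
    """Split a multi-record $0 and pair each line with its record number."""
    lines = dollar0.split("\n")
    first = fnr_last - len(lines) + 1     # same arithmetic as 'fnr = FNR - cnt + 1'
    return [(first + i, line) for i, line in enumerate(lines)]


numbered = number_spanned("end of one\nstart of next", 42)
print(numbered)   # [(41, 'end of one'), (42, 'start of next')]
```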
Another handy use of FILE_SEARCH is to use FILE_SEARCH_PAT to
search for the first line in a range of records to be processed.
When the first record is found using the file searching
capabilities of QTAwk, FILE_SEARCH is set to false and the range
of records is processed normally. After the last record in the
range has been processed, FILE_SEARCH is set to true to continue
the file search for the next cluster.
Such a utility may resemble:
# QTAwk Utility To Rapidly Find And Process Clusters Of
# Records.
# Record Cluster Start Found By Matching FILE_SEARCH_PAT
BEGIN {
FILE_SEARCH = TRUE;
FILE_SEARCH_PAT[1] = /search pattern 1/;
FILE_SEARCH_PAT[2] = /search pattern 2/;
FILE_SEARCH_PAT[3] = /search pattern 3/;
FILE_SEARCH_PAT[4] = /search pattern 4/;
FILE_SEARCH_PAT[5] = /search pattern 5/;
}
# Have Found First Record Matching A Pattern In
# FILE_SEARCH_PAT. Turn File Search Mode Off
# And Process Range Of Records Normally
FILE_SEARCH {
FILE_SEARCH = FALSE;
. # actions to be taken on record matching
. # FILE_SEARCH_PAT
.
.
.
.
.
# Delete Next Statement If Record Matching FILE_SEARCH_PAT Is
# To Be Matched Against Patterns
next;
}
# Pattern/Actions For Records To Be Processed Normally
pattern1 {
# actions
}
pattern2 {
# actions
}
.
.
.
# Pattern/Action For Last Record In Range
# Reset FILE_SEARCH To True To Resume
# File Search Mode. May Also Set FILE_SEARCH_PAT
# To Alter Matching Conditions For Next Record Cluster
pattern_l {
FILE_SEARCH = TRUE;
.
.
.
}
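The switching between file search mode and normal processing in
the skeleton above amounts to a two-state machine. The following
Python sketch models only that control flow. It is not QTAwk
code; the name process_clusters, the start pattern and the
end-of-cluster test are hypothetical.

```python
import re


def process_clusters(records, start_pats, is_last):
    """Return the records handled while file search mode is off."""
    file_search = True
    handled = []
    for rec in records:
        if file_search:
            if not any(p.search(rec) for p in start_pats):
                continue                  # fast scan: record is not processed
            file_search = False           # cluster start found
        handled.append(rec)               # normal pattern/action processing
        if is_last(rec):
            file_search = True            # resume the file search
    return handled


recs = ["junk", "BEGIN A", "1", "2", "END", "junk", "BEGIN B", "3", "END"]
out = process_clusters(recs, [re.compile(r"^BEGIN")], lambda r: r == "END")
print(out)
```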
Note that if FILE_SEARCH_PAT is not set, it has the default
value of the null string. A match against the null string will
match null, or zero length, records (QTAwk silently replaces a
null string regular expression pattern with the regular
expression /^$/). Thus, the following very simple QTAwk utility
will find all null records in a given file and print the
corresponding record numbers.
BEGIN { FILE_SEARCH = TRUE; }
{ print FNR; }
10.0 Group Patterns
GROUP patterns follow the syntax:
GROUP exp1 { optional action }
GROUP /re2/ { optional action }
GROUP var3 { optional action }
GROUP "st4" { optional action }
Actions are optional with any particular expression in the
group. If no action is given, the next action specified in the
group is executed. If no action is specified for the last
expression in a group, the default action, "{print;}" is assigned
to it.
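The rule for expressions without an action can be sketched by
walking the group from the last entry to the first. This Python
model is an illustration of the rule as stated, not QTAwk code;
the name resolve_actions and the action strings are
hypothetical.

```python
def resolve_actions(group):
    """group: list of (expression, action-or-None). Returns one action per entry."""
    resolved = [None] * len(group)
    pending = "{print;}"                  # default for a trailing entry with no action
    for i in range(len(group) - 1, -1, -1):
        if group[i][1] is not None:
            pending = group[i][1]         # an explicit action
        resolved[i] = pending             # else: the next action given below it
    return resolved


acts = resolve_actions([("exp1", None), ("exp2", "{A}"),
                        ("exp3", None), ("exp4", None)])
print(acts)   # ['{A}', '{A}', '{print;}', '{print;}']
```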
Any utility may have more than one GROUP of patterns. A group
is terminated by any pattern not starting with the 'GROUP'
keyword.
10.1 GROUP Pattern Advantage
GROUP patterns have two distinct advantages in QTAwk:
1. the regular expressions corresponding to the expressions
contained in the GROUP are optimized to decrease search time,
and
2. input records are searched once for all regular expressions
in a GROUP. If the regular expressions were organized as
individual pattern/actions, each record is searched
separately for each regular expression.
For utilities containing many regular expression patterns for
which to search, a program organized into one or more GROUPs can
be many times faster than a utility organized as ordinary
pattern/action pairs. For example, the QTAwk utility in Appendix
iv searches a C source file listing for ANSI C Standard defined
names. The utility organizes the search into a single GROUP and
will search a source file approximately 6 times faster than the
same utility organized as separate pattern/action pairs without
the use of a GROUP.
10.2 GROUP Pattern Disadvantage
GROUP patterns have one disadvantage compared to ordinary
pattern/action pairs. QTAwk will find only one of the regular
expressions in a GROUP. A set of GROUP patterns:
GROUP expression1 { action1; }
GROUP expression2 { action2; }
GROUP expression3 { action3; }
is similar in execution to:
$0 ~~ expression1 { action1; next; }
$0 ~~ expression2 { action2; next; }
$0 ~~ expression3 { action3; next; }
If more than one regular expression in a group will match a
given string in the input record, the regular expression listed
first in the GROUP will be matched and the appropriate action
executed. If all regular expression patterns in a GROUP must be
found in input records, then separate pattern-action pairs must
be used.
10.3 GROUP Pattern Regular Expressions
The regular expressions associated with the GROUP pattern can be
any valid QTAwk expression. All expressions in a GROUP are
evaluated when the GROUP is first matched against an input record
and converted to regular expressions.
GROUP patterns are converted into an internal form for regular
expressions only once, when the pattern is first used to scan an
input line. Any variables which are the result of a GROUP
expression will be evaluated, converted to string form and
interpreted as a regular expression. Similarly, any named
expressions in a regular expression constant, string constant or
regular expression value of a variable will be replaced once, at
the time of conversion.
If one of the expressions in a GROUP matches the current input
record, the value of the built-in variable NG is set to the
sequence value of the matching expression in the GROUP.
Numbering of the expressions in the GROUP starts with 1.
If any expression in a GROUP evaluates to an array, the
entire array is utilized for matching at that position of the
GROUP, i.e., if any element of the array matches the current
input record, the value of NG is assigned the sequence value of
the array in the GROUP.
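The numbering reported in NG, including the case of an array
occupying one position of the GROUP, can be modelled as follows.
This is a Python sketch, not QTAwk code. The name match_group is
hypothetical, entries are tried in the order listed (following
the 'next' equivalence of the previous section), and returning 0
when nothing matches is a convention of this sketch only.

```python
import re


def match_group(record, group):
    """Return NG, the 1-based position of the first GROUP entry that matches, or 0."""
    for ng, entry in enumerate(group, start=1):
        patterns = entry if isinstance(entry, list) else [entry]
        if any(re.search(p, record) for p in patterns):
            return ng                     # an array counts as one position
    return 0


group = [r"alpha", [r"beta", r"gamma"], r"delta"]
print(match_group("x gamma y", group))     # 2
print(match_group("delta alpha", group))   # 1
```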
11.0 Statements
Statements specify the flow of control through a utility when it
executes. A statement that contains expressions also computes
values and/or alters the values stored in variables when the
statement executes.
QTAwk has departed from Awk by using the C convention of using
the semi-colon, ';', as a statement terminator. QTAwk treats
newline characters as white space, nothing more. Comments are
introduced by the symbol, '#', and continue to the next newline
character. Thus the Awk practice of letting new-lines terminate
some statements can no longer be used. The Awk rules for
terminating statements with the newline except under some
conditions can now be forgotten. In QTAwk, terminate all
statements with a semi-colon, ';'.
11.1 QTAwk Keywords
The QTAwk keywords are:
1. break
2. case
3. continue
4. cycle
5. default
6. delete
7. deletea
8. do
9. else
10. endfile
11. exit
12. for
13. if
14. in
15. local
16. next
17. return
18. switch
19. while
The keywords 'cycle', 'deletea', 'local' and 'endfile' are new
to QTAwk. The keywords 'switch', 'case' and 'default' have been
appropriated from C with expanded functionality over C.
11.2 Statements
A statement can be one of the following:
break;
case expr_list: statement
continue;
cycle;
default: statement
delete variable[expr_list];
deletea variable;
do statement while ( expr_list );
endfile;
exit;
exit expression;
for ( expr_list ; expr_list ; expr_list ) statement
for ( variable1 in variable2 ) statement
if ( expr_list ) statement
if ( expr_list ) statement else statement
local variable;
local variable1 , variable2 , ...;
local variable1 = value;
local variable1 = value , variable2 = value , ...;
next;
return;
return expr_list;
switch ( expr_list ) statement
while ( expr_list ) statement
;
expression;
expression , expression , expression , ...;
{ statement statement statement ... }
QTAwk provides braces for grouping statements to form compound
statements. Various keywords are available for controlling the
logical flow of statement execution and for looping over
statements multiple times.
11.3 'cycle' and 'next'
The 'cycle' and 'next' statements allow the user to control the
execution of the QTAwk outer loop which reads records from the
current input file and compares them against the patterns. Both
statements restart the pattern matching.
The 'next' statement causes the next input record to be read
before restarting the outer pattern matching loop with the first
pattern-action pair.
The 'cycle' statement may use the current input record or the
next input record for restarting the outer pattern matching
loop. As each input record is read from the current input file,
the built-in variable CYCLE_COUNT is set to one. The 'cycle'
statement increments the numeric value of CYCLE_COUNT by one and
compares the new value to the numeric value of the built-in
variable MAX_CYCLE. One of two actions is taken depending on the
result of this comparison:
1. If CYCLE_COUNT is greater than MAX_CYCLE, then the next
input record is read, setting NR, FNR, $0, NF and the record
fields $1, $2, ... $NF, before restarting the outer pattern
matching loop. This is identical to the action of the 'next'
keyword.
2. If CYCLE_COUNT is less than or equal to MAX_CYCLE, the
current values of NR, FNR, $0, NF and the record fields are
utilized when restarting the outer pattern matching loop.
The default value of MAX_CYCLE is 100. Both CYCLE_COUNT and
MAX_CYCLE are built-in variables and may be set by the user's
utility. Setting MAX_CYCLE is useful to control the number of
iterations possible on a record. Setting MAX_CYCLE to 1 would
make the 'cycle' and 'next' keywords identical.
If the value of CYCLE_COUNT is set by the user's utility, care
should be taken to prevent the possibility of the utility
entering a loop from which it cannot exit.
The 'cycle' statement is useful when it is necessary to process
the current input record through the outer pattern match loop
more than once. The following utility is a trivial example of
one such use. This utility will print each record with the
record number multiple times. The number of times is determined
by the value assigned MAX_CYCLE in the 'BEGIN' action.
BEGIN {
MAX_CYCLE = 10;
}
{
print FNR,$0;
cycle;
}
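The interplay of CYCLE_COUNT and MAX_CYCLE in this utility can
be traced with a short model. The Python sketch below follows
the rules given above (reset to one on each read, increment,
then compare); it is not QTAwk code and the name run_cycle is
hypothetical.

```python
def run_cycle(records, max_cycle):
    """Model of '{ print FNR,$0; cycle; }' for a given MAX_CYCLE."""
    printed = []
    for fnr, rec in enumerate(records, start=1):
        cycle_count = 1                   # reset as each record is read
        while True:
            printed.append((fnr, rec))    # the 'print FNR,$0;' action
            cycle_count += 1              # 'cycle' increments first ...
            if cycle_count > max_cycle:   # ... then compares with MAX_CYCLE
                break                     # read the next record, like 'next'
    return printed


ten = run_cycle(["a", "b"], 10)
one = run_cycle(["a", "b"], 1)
print(len(ten), one)   # 20 [(1, 'a'), (2, 'b')]
```

With MAX_CYCLE set to 10 each record is printed ten times; with
MAX_CYCLE set to 1 the model behaves exactly as 'next' would.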
The "next" record read for both the 'next;' and 'cycle;'
statements depends on the value of the built-in variable
"FILE_SEARCH". If FILE_SEARCH is false, then the next physical
record is read. If FILE_SEARCH is true, the next record is the
record(s) containing a string matching a pattern in
FILE_SEARCH_PAT.
11.4 'delete' and 'deletea'
The 'delete' and 'deletea' statements allow the user to delete
individual elements of an array or an entire array respectively.
The forms of the 'delete' and 'deletea' statements are:
delete A[expr_list];
and
deletea A;
The first form will delete the element of array A referenced by
the subscript determined by 'expr_list'. The second form will
delete the entire array. Note that for singly dimensioned
arrays, the 'deletea' statement is equivalent to the statement:
for ( j in A ) delete A[j];
The use of the 'deletea' statement is encouraged for simplicity
and speed of execution. The 'delete' statement may be used for
arrays of any dimension. However, for arrays with dimension
greater than 2, the elements of the array are not deleted, but
simply initialized to zero and the null string. This behavior
has to do with the structure of arrays and the 'holes' which
could be left by deleting elements. For singly dimensioned
arrays, there is no problem, since there can be no 'hole' left by
deleting an element. For example consider the singly dimensioned
array:
A[1] A[2] A[3] A[4] A[5] A[6] A[7] A[8] A[9]
If the array element A[5] is deleted
A[1] A[2] A[3] A[4] ____ A[6] A[7] A[8] A[9]
Then the remaining elements 'shift' to fill the 'hole'.
A[1] A[2] A[3] A[4] A[6] A[7] A[8] A[9]
For two-dimensional arrays a complication arises in trying to
fill the 'hole' left by deleting an array element.
A[1][1] A[1][2] A[1][3] A[1][4] A[1][5] A[1][6]
A[2][1] A[2][2] A[2][3] A[2][4] A[2][5] A[2][6]
A[3][1] A[3][2] A[3][3] A[3][4] A[3][5] A[3][6]
A[4][1] A[4][2] A[4][3] A[4][4] A[4][5] A[4][6]
A[5][1] A[5][2] A[5][3] A[5][4] A[5][5] A[5][6]
A[6][1] A[6][2] A[6][3] A[6][4] A[6][5] A[6][6]
If element A[4][4] is deleted, then we have the 'hole':
A[1][1] A[1][2] A[1][3] A[1][4] A[1][5] A[1][6]
A[2][1] A[2][2] A[2][3] A[2][4] A[2][5] A[2][6]
A[3][1] A[3][2] A[3][3] A[3][4] A[3][5] A[3][6]
A[4][1] A[4][2] A[4][3] _______ A[4][5] A[4][6]
A[5][1] A[5][2] A[5][3] A[5][4] A[5][5] A[5][6]
A[6][1] A[6][2] A[6][3] A[6][4] A[6][5] A[6][6]
In trying to fill the 'hole', we have a choice of shifting the
elements below the deleted element up to fill the 'hole', column
priority, or shifting the elements to the right of the deleted
element to fill the 'hole', row priority. In QTAwk, row priority
is used in filling the 'hole':
A[1][1] A[1][2] A[1][3] A[1][4] A[1][5] A[1][6]
A[2][1] A[2][2] A[2][3] A[2][4] A[2][5] A[2][6]
A[3][1] A[3][2] A[3][3] A[3][4] A[3][5] A[3][6]
A[4][1] A[4][2] A[4][3] A[4][5] A[4][6]
A[5][1] A[5][2] A[5][3] A[5][4] A[5][5] A[5][6]
A[6][1] A[6][2] A[6][3] A[6][4] A[6][5] A[6][6]
For arrays of higher dimensions the situation is even more
complicated. Not only do elements have to be "shifted", but
elements in the array will have to be discarded to do so. For
example, if A is a 3x3x3 array and element A[2][2][2] is deleted,
then element A[2][2][3], if it existed, would also be deleted by
shifting other elements to fill the 'hole'. QTAwk will in this
case initialize the element A[2][2][2] to zero and the null
string rather than delete the element and lose other elements.
Thus, the 'delete' statement only truly deletes elements for one
and two dimensional arrays.
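The difference between one and two dimensional arrays and arrays
of higher dimension can be modelled with nested dictionaries.
The Python sketch below is an illustration only, not QTAwk code.
The name qt_delete is hypothetical, the dimension is inferred
from the number of subscripts given, and the empty string stands
in for "zero and the null string".

```python
def qt_delete(array, *subs):
    """Model of 'delete A[...]' on nested dicts, one dict level per dimension."""
    *path, last = subs
    node = array
    for s in path:
        node = node[s]
    if len(subs) <= 2:
        del node[last]                    # element really removed
    else:
        node[last] = ""                   # stands in for 'zero and the null string'


A2 = {i: {j: f"A[{i}][{j}]" for j in range(1, 7)} for i in range(1, 7)}
qt_delete(A2, 4, 4)
print(list(A2[4]))        # [1, 2, 3, 5, 6]

A3 = {i: {j: {k: 1 for k in range(1, 4)} for j in range(1, 4)} for i in range(1, 4)}
qt_delete(A3, 2, 2, 2)
print(A3[2][2])           # {1: 1, 2: '', 3: 1}
```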
The 'deletea' statement, however, works on arrays of any
dimension. For multidimensional arrays, the 'deletea' would be
equivalent to nested 'for' statements. For example, if the
'delete' statement truly deleted elements of a three dimensional
array, then the 'deletea' statement could be imagined as
equivalent to:
for ( i in A )
for ( j in A[i] )
for ( k in A[i][j] ) delete A[i][j][k];
11.5 'if'/'else'
The 'if' and 'else' keywords provide for executing one of
possibly two statements conditioned upon the TRUE or FALSE value
of an expr_list. The form of the 'if'/'else' statement is:
if ( expr_list ) statement1
or
if ( expr_list ) statement1 else statement2
If expr_list when evaluated, produces a TRUE value then
statement1 is executed. If the expr_list produces a FALSE value,
then for the second form, statement2 is executed.
11.6 'switch', 'case', 'default'
QTAwk includes an expanded form of the C 'switch'/'case'
statements. In C, the 'switch'/'case' statements must be of the
form:
switch ( expr_list ) {
case constant1: statement
case constant2: statement
case constant3: statement
case constant4: statement
default: statement
}
The expr_list of the 'switch' statement must evaluate to an
integral value and 'constant1', 'constant2', 'constant3', and
'constant4', must be compile-time integral constant values. In
QTAwk, the 'case' statement may contain any valid QTAwk
expression or expr_list:
switch ( expr_list ) {
case expr_list1: statement
case expr_list2: statement
case expr_list3: statement
case expr_list4: statement
default: statement
}
The expr_lists of the case statements are evaluated in turn at
execution time. The resultant value is checked against the value
of the expr_list of the 'switch' statement using the following
logic.
if ( cexpr is a regular expression ) sexpr ~~ cexpr;
else sexpr == cexpr;
where cexpr is the value of the case expr_list and sexpr is the
value of the 'switch' statement expr_list. Thus if cexpr is a
regular expression, a match operation is performed. If cexpr is
a string, a string comparison is performed. If cexpr is
numeric, a numerical comparison is performed. It is possible to
have case statements with differing types of expr_list values in
the same 'switch' statement and the proper comparison is made.
In addition a given case expr_list can evaluate to different
types at different times.
Once a true value is returned by a case statement comparison,
the execution falls through from 'case' to 'case' with no further
comparisons made. The fall through of execution is broken by the
use of the 'break' statement as in C.
Note that the expr_list of a 'case' statement is evaluated at
execution time and it is possible for some 'case' expr_lists to
never be evaluated. Thus, side effects from the evaluation of
'case' expr_lists should not be relied upon. This is
particularly true where execution falls through from one 'case'
statement to the next.
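The comparison rule, the fall through and the fact that some
'case' expr_lists are never evaluated can be traced with a small
model. The Python sketch below is not QTAwk code. All names in
it (case_matches, qt_switch, case, body) are hypothetical, the
'default' keyword is left out, and Python's '==' stands in for
QTAwk's string and numeric comparisons without reproducing its
type conversions.

```python
import re


def case_matches(sexpr, cexpr):
    """The comparison rule: regular expression -> match, anything else -> equality."""
    if isinstance(cexpr, re.Pattern):
        return cexpr.search(str(sexpr)) is not None
    return sexpr == cexpr


def qt_switch(sexpr, cases):
    """cases: list of (case_thunk, body). A body returns True to signal 'break'."""
    matched = False
    for case_thunk, body_fn in cases:
        # case expressions are evaluated only until one compares TRUE
        if not matched:
            matched = case_matches(sexpr, case_thunk())
        if matched and body_fn():
            break                         # 'break' stops the fall through


log = []


def case(value, label):
    def thunk():
        log.append("eval " + label)       # records that the case was evaluated
        return value
    return thunk


def body(label, brk=False):
    def run():
        log.append("run " + label)
        return brk
    return run


qt_switch("abc123", [
    (case(7, "c1"), body("b1")),
    (case(re.compile(r"[0-9]+"), "c2"), body("b2")),
    (case("never", "c3"), body("b3", brk=True)),
    (case("unused", "c4"), body("b4")),
])
print(log)   # ['eval c1', 'eval c2', 'run b2', 'run b3']
```

The third and fourth case expressions are never evaluated: the
third is reached by fall through and the fourth lies behind the
'break'.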
If the expr_list of a 'case' statement evaluates to a regular
expression, then two built-in variables are set when the match
operation is performed: CLENGTH and CSTART. CLENGTH is set to
the length of the matching string found (or zero) and CSTART is
set to the starting position of the matching string found (or
zero). CLENGTH and CSTART are completely analogous to RLENGTH
and RSTART set for the 'match' function and MLENGTH and MSTART
for the match operators, '~~' and '!~'.
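The values placed in CSTART and CLENGTH can be pictured with the
following Python sketch. It is a model under stated assumptions,
not QTAwk code: the name case_regex_match is hypothetical,
positions are assumed to be counted from one (as RSTART is in
Awk), and Python's regular expression engine may choose a
different match than QTAwk's for ambiguous patterns.

```python
import re


def case_regex_match(sexpr, pattern):
    """Return (CSTART, CLENGTH) for a regular-expression case, or (0, 0)."""
    m = re.search(pattern, sexpr)
    if m is None:
        return 0, 0                       # no match: both values are zero
    return m.start() + 1, m.end() - m.start()   # assumes 1-based positions


print(case_regex_match("abc123", r"[0-9]+"))   # (4, 3)
print(case_regex_match("abc", r"[0-9]+"))      # (0, 0)
```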
The 'default' keyword is provided in analogy to C. The
statements following the 'default' statement are executed if the
'switch' expr_list matches no 'case' expr_list. The 'default'
statement may be combined with other 'case' statements. It need
not be the last statement as shown.
11.7 Loops
QTAwk has four forms of loop control statements:
1. for ( expr_list1 ; expr_list2 ; expr_list3 ) statement
2. for ( var in array ) statement
3. while ( expr_list ) statement
4. do statement while ( expr_list );
11.7.1 'while'
The 'while' statement has the form:
while ( expr_list ) statement
The expr_list is evaluated and, if TRUE, 'statement' is executed
and expr_list is re-evaluated. This cycle continues until
expr_list evaluates to FALSE, at which point the cycle is
terminated and execution resumes with the statement following
the 'while' statement.
11.7.2 'for'
The 'for' statement has two forms:
1. for ( expr_list1 ; expr_list2 ; expr_list3 ) statement
2. for ( var in array ) statement
In the first form the following sequence of operations is
performed:
1. The expressions in expr_list1 are evaluated,
2. The expressions in expr_list2 are evaluated,
3. The action taken is dependent upon whether the resultant
value of expr_list2 is true or false:
a) TRUE
1: Execute 'statement', which may be a compound
statement.
2: Execute the expressions in expr_list3.
3: Control returns to item 2. above.
b) FALSE - terminate loop
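The sequence of operations above can be written out as a plain
'while' loop. The Python sketch below only mirrors the order of
evaluation; it is not QTAwk code, and the name for_loop and the
state dictionary are hypothetical.

```python
def for_loop(init, test, step, statement):
    """Evaluate the three expr_lists of a 'for' in the documented order."""
    init()                 # 1. expr_list1, evaluated once
    while test():          # 2. and 3. expr_list2 decides whether to continue
        statement()        # a) 1: the loop body
        step()             # a) 2: expr_list3, then back to the test


state = {"i": 0, "total": 0}
for_loop(lambda: state.update(i=1),
         lambda: state["i"] <= 4,
         lambda: state.update(i=state["i"] + 1),
         lambda: state.update(total=state["total"] + state["i"]))
print(state)   # {'i': 5, 'total': 10}
```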
The second form may also be used for multidimensional arrays:
for ( var in array[s_expr_list]...[s_expr_list] ) statement
For each subscript in the next higher index level in the array
reference, var is set to the index value and 'statement' is
executed. 'statement' may be a compound statement. For a
multidimensional array, the second form may be used to loop
sequentially through the indices of the next higher index level.
Thus for a two dimensional array:
for ( i in A )
for ( j in A[i] )
will loop through the indices in the array in row order.
11.7.3 'do'/'while'
The form of the 'do'/'while' statement is:
do statement while ( expr_list );
'statement' is executed, expr_list evaluated and if TRUE
'statement' is executed again else the loop is terminated. Note
that 'statement' is executed at least once.
11.8 'local'
The 'local' keyword is used to define variables within a
compound statement that are local to the compound statement and
that disappear when the compound statement is exited. The
'local' keyword may be used within any compound statement, but is
especially useful in user-defined functions as described later.
Variables defined with the 'local' keyword may be assigned an
initial value in the statement and multiple variables may be
defined with a single statement. If a variable is not assigned
an initial value, it is initialized to zero and the null string
just as global variables are initialized.
Thus:
local i, j = 12, k = substr(str,5);
will define three variables local to the enclosing compound
statement:
1. i initialized to zero/null string,
2. j initialized to 12, and
3. k initialized to a sub-string of the variable 'str'
Local variables initialized explicitly in 'local' statements may
be initialized to constants, the values of global variables,
values returned by built-in functions, values returned by
user-defined functions or previously defined local variables. If
the value is set to that of a previously defined local variable,
the variable may not be defined in the same 'local' statement.
Thus:
local k = 5;
local j = k;
is correct, but
local k = 5, j = k;
is not. In the latter case QTAwk will quietly assume that the
k, to which j is assigned, is a global variable.
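The difference between the two forms comes down to when the new
names become visible. The Python sketch below models a 'local'
statement whose initial values are all evaluated before any of
its names are defined. It is not QTAwk code; the name
local_statement and the dictionaries standing for the global and
local scopes are hypothetical.

```python
def local_statement(globals_, visible_locals, initializers):
    """Bind every name of one 'local' statement.

    initializers: list of (name, function-of-lookup or None).
    """
    def lookup(var):
        # names declared in this same statement are not visible yet
        if var in visible_locals:
            return visible_locals[var]
        return globals_.get(var, "")      # unset globals read as zero/null string

    new = {name: (init(lookup) if init else "") for name, init in initializers}
    return {**visible_locals, **new}


g = {"k": 99}

# local k = 5, j = k;   (j silently picks up the global k)
one = local_statement(g, {}, [("k", lambda get: 5), ("j", lambda get: get("k"))])

# local k = 5;  local j = k;   (j picks up the local k)
step1 = local_statement(g, {}, [("k", lambda get: 5)])
two = local_statement(g, step1, [("j", lambda get: get("k"))])

print(one, two)   # {'k': 5, 'j': 99} {'k': 5, 'j': 5}
```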
11.9 'endfile'
The 'endfile' keyword causes the utility to behave as if the end
of the current input file has been reached. Any 'FINAL' actions
are executed. If any input files remain to be processed from the
command line, the next is opened for processing. If no further
input files remain to be processed, any 'END' actions are
executed.
11.10 'break'
This keyword will terminate the execution of the enclosing
'while', 'for', 'do'/'while' loop or break execution in cascaded
'case' statements.
11.11 'continue'
This keyword will cause execution to jump to just after the last
statement in the loop body and execute the next iteration of the
enclosing loop. The loop may be any 'for', 'while' or
'do'/'while' loop.
11.12 'exit opt_expr_list'
This statement causes the utility to behave as if the end of the
current input file had been reached. Any further input files
specified are ignored. If there are any FINAL or END actions,
they are executed.
If encountered in a FINAL action, the action is terminated, any
further input files are ignored and any END actions are
executed.
If encountered in an END action, the execution of the action is
terminated and utility execution is terminated.
The optional expr_list is evaluated and the resultant value
returned to DOS upon termination by QTAwk as the exit status. If
no expr_list is present, or no 'exit' statement encountered,
QTAwk returns a value of zero for the exit status.
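The different effects of 'endfile' and 'exit' inside an ordinary
pattern action can be contrasted in a short model. The Python
sketch below is an illustration only, not QTAwk code. The
exception classes, the function run_files and the sample records
are hypothetical, and 'exit' inside FINAL or END actions is not
modelled.

```python
class EndFile(Exception):
    """Stands in for the 'endfile' statement."""


class Exit(Exception):
    """Stands in for 'exit opt_expr_list'."""
    def __init__(self, status=0):
        super().__init__(status)
        self.status = status


def run_files(files, action):
    """files: dict name -> records. action(rec) may raise EndFile or Exit."""
    log, status = [], 0
    for name, records in files.items():
        stop = False
        try:
            for rec in records:
                action(rec)
                log.append(name + ":" + rec)
        except EndFile:
            pass                          # rest of this file is skipped
        except Exit as e:
            status, stop = e.status, True
        log.append("FINAL " + name)       # FINAL runs in both cases
        if stop:
            break                         # 'exit': remaining files are ignored
    log.append("END")
    return log, status


def action(rec):
    if rec == "skip":
        raise EndFile()
    if rec == "quit":
        raise Exit(3)


files = {"a": ["1", "skip", "2"], "b": ["3", "quit", "4"], "c": ["5"]}
result = run_files(files, action)
print(result)
```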
11.13 'return opt_expr_list'
This statement will cause execution to return from a user
defined function. If the optional expr_list is present, it is
evaluated and the resultant value returned as the functional
value.
12.0 Built-in Functions
QTAwk offers a rich set of built-in arithmetic, string, I/O,
array and system functions. The array of built-in functions
available has been extended over that available with Awk. The
I/O functions have been changed to match the functional syntax of
all other built-in and user defined functions.
12.1 Arithmetic Functions
QTAwk offers the following built-in arithmetic functions. Those
marked with an asterisk, '*', are new to QTAwk:
1. acos(x) ==> return arc-cosine of x. The return value has
degree units if the built-in variable DEGREES is true, radian
units otherwise. Refer to the DEGREES built-in variable.
2. asin(x) ==> return arc-sine of x. The return value has
degree units if the built-in variable DEGREES is true, radian
units otherwise. Refer to the DEGREES built-in variable.
3. atan2(y,x) ==> return arc-tangent of y/x, -π to π. The
return value has degree units if the built-in variable
DEGREES is true, radian units otherwise. Refer to the
DEGREES built-in variable.
4. cos(x) ==> return cosine of x. Assumes x in degrees if
built-in variable DEGREES is true, otherwise assumes x in
radians. Refer to the DEGREES built-in variable.
5. * cosh(x) ==> return hyperbolic cosine of x.
6. exp(x) ==> return e^x.
7. * fract(x) ==> return fractional portion of x.
8. int(x) ==> return integer portion of x.
9. * jdn or jdn() ==> return the Julian Day Number of the
current system date. The Julian Day Number (jdn) is the
number of whole days that have elapsed since a certain
reference time in the past. The jdn is widely used in
astronomy and elsewhere in calculations involving the
counting of days and the computation of the day of the week.
The reference time from which all jdn's are measured has been
chosen by astronomers to be January 1, 4713 B.C. (Julian
Calendar) at noon. As an example of a jdn in modern times,
the jdn for the period from noon October 1, 1991 to noon
October 2, 1991 (Gregorian Calendar) is 2,448,531.
The use of the Gregorian or Julian Calendar by QTAwk in the
computation of the jdn is controlled by the built-in variable
"Gregorian". If Gregorian = TRUE, QTAwk uses the Gregorian
Calendar, otherwise QTAwk uses the Julian Calendar. The
Julian Day Number computed by QTAwk is, by default, based on
the Gregorian Calendar adopted in the British Empire,
including the American Colonies, on September 2, 1752. Dates
before that day are based on the Julian Calendar in the US.
The "cal" function described later is the reverse of the
"jdn" function, turning a specified jdn into a formatted date
string.
As mentioned, the jdn of a given date is handy in performing
date computations. For example, given two dates, their
respective jdn's may be found with this function and the
number of days between the two dates computed simply by
subtracting one jdn from the other. A day a specified number
of days in the future (past) is easily computed by adding
(subtracting) the desired number of days to the jdn and using
the "cal" function to obtain the new date. Also, the day of
the week of a given date may be computed quite simply with
the formula:
dow = (jdn + 1) % 7
where dow is the "day of week". The value computed is
between 0 and 6 inclusive with:
0 ==> Sunday
1 ==> Monday
2 ==> Tuesday
3 ==> Wednesday
4 ==> Thursday
5 ==> Friday
6 ==> Saturday
10. * jdn(year,month,day) ==> return the Julian Day Number of
the date specified.
11. * jdn(fdate) ==> return the Julian Day Number of the date
specified. The date is specified in the "PC/MS-DOS file date
format" as obtained with the "findfile" function.
12. log(x) ==> return natural (base e) logarithm of x.
13. * log10(x) ==> return base 10 logarithm of x.
14. * pi or pi() ==> return pi.
15. rand() ==> return random number r, 0 <= r < 1.
16. sin(x) ==> return sine of x. Assumes x in degrees if
built-in variable DEGREES is true, otherwise assumes x in
radians. Refer to the DEGREES built-in variable.
17. * sinh(x) ==> return hyperbolic sine of x.
18. sqrt(x) ==> return square root of x.
19. srand(x) ==> set x as new seed for rand().
20. srand() ==> use current system time as new seed for
rand().
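The jdn arithmetic described for the "jdn" function above can be
checked outside QTAwk. The following is a minimal Python sketch,
not QTAwk code: it assumes the Gregorian Calendar throughout
(Gregorian = TRUE) and uses a standard integer formula for the
conversion. The names jdn, day_of_week and DAY_NAMES are chosen
for illustration only. The sample date, September 20, 1993, and
its jdn, 2449251, are the values used in the "cal" example later
in this section.

```python
def jdn(year, month, day):
    """Julian Day Number of a Gregorian Calendar date (integer arithmetic)."""
    a = (14 - month) // 12      # 1 for January and February, otherwise 0
    y = year + 4800 - a         # shift so that the year starts in March
    m = month + 12 * a - 3      # March = 0 ... February = 11
    return (day + (153 * m + 2) // 5 + 365 * y
            + y // 4 - y // 100 + y // 400 - 32045)


DAY_NAMES = ["Sunday", "Monday", "Tuesday", "Wednesday",
             "Thursday", "Friday", "Saturday"]


def day_of_week(n):
    """Apply the manual's formula: dow = (jdn + 1) % 7, with 0 = Sunday."""
    return (n + 1) % 7


start = jdn(1993, 9, 20)
print(start)                            # 2449251
print(DAY_NAMES[day_of_week(start)])    # Monday
print(jdn(2000, 1, 1) - start)          # days between the two dates: 2294
```

Subtracting two jdn values gives the number of days between the
dates, and adding a day count to a jdn gives the jdn of the new
date, as described above.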
12.2 String Functions
QTAwk offers the following built-in string handling functions.
Those marked with an asterisk, '*', are new to QTAwk.
Some string functions require or return a string position.
String positions start with the first character as position 1.
1. * cal(fmt_str,jdn) ==> return date/time corresponding to
Julian Day Number (jdn) specified, formatted according to the
string value of fmt_str. The format options are identical to
those for the "sdate/stime" functions described later. This
function is the reverse of the "jdn" function described
previously.
If the jdn specified is an integer value, any time
substitutions made in the format string use the current
system time. If the jdn specified is a floating point
numeric with a nonzero fractional part, the fractional part
is used to compute a time of day. For example, given the
floating point jdn value:
spec_day = 2449251.399803241;
the statement:
print cal("%m/%d/%Y -- %H:%M:%S",spec_day);
produces the output:
09/20/1993 -- 09:35:43
The integer portion, 2449251, produces the date, 09/20/1993,
and the fractional portion, 0.399803241, produces the time,
09:35:43.
Given a time of day in hours (on a 24 hour/day basis),
minutes and seconds, to compute the fractional portion of a
jdn, the following formula may be used:
fract_jdn = (hours/24.0) + (minutes/1440.0) +
(seconds/86400.0);
2. * center(s,w) ==> return string s centered in w blank
characters.
3. * center(s,w,c) ==> return string s centered in w 'c'
characters.
4. * copies(s,n) ==> return n copies of string s.
5. * deletec(s,p,n) ==> return string s with n characters
deleted starting at position p.
6. gsub(re,rs) ==> substitute for strings matched by regular
expression, re, globally in $0, return number of
substitutions made. The evaluation of the replacement string
expression, rs, is the same as below.
7. gsub(re,rs,t) ==> substitute for strings matched by regular
expression, re, globally in string t, return number of
substitutions made. The substitution strings are determined
by the replacement string, rs, the string value of the second
argument. The replacement string expression, rs, is not
evaluated at the time the function is called, but when the
replacement string is used for replacement.
Since the first argument, re, may be an array, the value of
the built-in variable, MATCH_INDEX, can be affected as
matches are made. By delaying the evaluation of the
replacement string expression until the replacement is made,
the change in MATCH_INDEX can be used to affect the value of
the replacement string for each replacement.
Replacing a list of strings in another string could be
accomplished by two methods. In both methods the strings to
replace, the pattern strings, are contained in the singly
dimensioned array 'str_pat' and the replacement strings are
contained in the singly dimensioned array 'str_rep'. The
first method uses a loop to replace each string in str_pat.
for ( i in str_pat ) gsub(str_pat[i],str_rep[i],rep_var);
The second method uses the array capabilities of QTAwk to
automatically scan for all pattern strings in 'str_pat'.
gsub(str_pat,str_rep[MATCH_INDEX + 0],rep_var);
As each pattern string is found, MATCH_INDEX is set to the
string value of the corresponding index of the pattern
string. The replacement string expression is then
evaluated. Adding 0 to MATCH_INDEX is necessary to convert
its string value to a numeric for indexing into the str_rep
array.
An example of such use is found in the "more.exp" utility
included with QTAwk. When the ANSI display option is
activated, strings may be highlighted on output. Finding and
highlighting the desired strings is accomplished with the
single statement:
# Put In High Light ANSI Sequences For High Lighted Text
if ( highlight )
gsub(High_pat,High_Text[MATCH_INDEX + 0] ∩ "$$0" ∩ Normal,dl);
where 'highlight' is a variable with a TRUE/FALSE value used
to flag if strings are to be highlighted. 'High_pat' and
'High_Text' are arrays with the string patterns to highlight
and the ANSI character sequences necessary for changing the
display colors. '$$0' is the tag string replacement token
containing the string matching the desired pattern in
'High_pat'. Note that '$$0' must be contained in double
quotations to ensure that the substitution made for the tag
string token is accomplished by the 'gsub' function.
Otherwise the basic processing loop of QTAwk would evaluate
'$$' as the tag string operator, operating on the token '0',
when the replacement string expression is evaluated. That
would yield the matching string from the associated pattern
(or the null string if there was no matching string from the
pattern). 'Normal' contains the ANSI character sequence
necessary to return the display to the normal display
colors. 'dl' is the variable containing the display line.
The replacement string expression:
High_Text[MATCH_INDEX + 0] ∩ "$$0" ∩ Normal
is evaluated each time 'gsub' finds a string match in 'dl'
for one of the patterns in 'High_pat'. The value of
MATCH_INDEX reflects the index of the element in 'High_pat'
matched (0 is added to convert MATCH_INDEX from a string
value to an integer value for indexing 'High_Text').
QTAwk guarantees that the replacement string expression will
be evaluated as replacements are made in the text string from
left to right. For constant expressions used for the
replacement string, evaluating the replacement string
expression when the function is called or when the
replacement is made are equivalent.
Without the use of arrays as search patterns, the above
could only be accomplished as a much slower loop:
for ( i in High_pat )
gsub(High_pat[i],High_Text[i] ∩ "$$0" ∩ Normal,dl);
8. index(s1,s2) ==> return position of string s2 in string s1.
Return zero, 0, if string s1 does not contain string s2 as a
substring.
9. * insert(s1,s2,p) ==> return string formed by inserting
string s2 into string s1 starting at position p.
10. * justify(a,n,w) ==> return string w characters long formed
by justifying n elements of array a padded with blanks. If n
elements of array a with at least one blank between elements
would exceed width w, then the number of array elements
justified is reduced to fit in the length w.
11. * justify(a,n,w,c) ==> return string w characters long
formed by justifying n elements of array a padded with
character 'c'. If n elements of array a with at least one
'c' character between elements would exceed width w, then the
number of array elements justified is reduced to fit in the
length w.
12. length ==> return number of characters in $0.
13. length() ==> return number of characters in $0.
14. length(s) ==> return number of characters in string s.
15. match(s,r) ==> return true, 1, if string s contains a
substring matched by regular expression r. If string s
contains no match to regular expression r, then returns
false, 0. Set RLENGTH to length of substring matched (or
zero) and RSTART to start position of substring matched (or
zero).
16. * overlay(s1,s2,p) ==> return string formed by overlaying
string s2 on string s1 starting at position p. May extend
length of s1. If p > length(s1), s1 padded with blanks to
appropriate length.
17. * remove(s,c) ==> return string formed by removing all 'c'
characters from string s.
18. * replace(s) ==> return string formed by replacing all
repeated expressions, {n1,n2}, and named expressions, {name},
in string s. Same operation performed for strings used as
regular expressions and in converting regular expressions
into internal form.
19. * sdate(fmt_str) ==> return current system date and time
formatted according to string value of fmt_str. The format
string is similar to those used in the 'print' function
except that the following substitutions are made:
%a --> Locale's Abbreviated Weekday Name
%A --> Locale's Weekday Name
%b --> Locale's Abbreviated Month Name
%B --> Locale's Month Name
%c --> Locale's Appropriate Date And Time Representation
%d --> Day Of The Month As A Decimal Number (01-31)
%H --> The Hour (24-hour Clock) As A Decimal Number (00-23)
%I --> The Hour (12-hour Clock) As A Decimal Number (01-12)
%j --> The Day Of The Year As A Decimal Number (001-366)
%m --> The Month As A Decimal Number (01-12)
%M --> The Minute As A Decimal Number (00-59)
%p --> The Locale's Equivalent Of The AM/PM Designations
Associated With A 12-hour Clock
%S --> The Second As Decimal Number (00-61)
%U --> The Week Number Of The Year (the First Sunday As The
First Day Of Week 1) As A Decimal Number (00-53)
%w --> The Weekday As A Decimal Number (0-6), Where Sunday Is
0
%W --> The Week Number Of The Year (the First Monday As The
First Day Of Week 1) As A Decimal Number (00-53)
%x --> The Locale's Appropriate Date Representation
%X --> The Locale's Appropriate Time Representation
%y --> The Year Without Century As A Decimal Number (00-99)
%Y --> The Year With Century As A Decimal Number
%Z --> The Time Zone Name Or Abbreviation, Or By No Character
If No Time Zone Is Determinable
%% --> %
Some common format strings would be:
%m/%d/%y - mm/dd/yy
%m/%d/%Y - mm/dd/yyyy
%d/%m/%y - dd/mm/yy
%d/%m/%Y - dd/mm/yyyy
%b %d, %Y - abbrev. month dd, yyyy
%B %d, %Y - month dd, yyyy
%a %m/%d/%Y - abbrev weekday mm/dd/yyyy
%A %m/%d/%Y - weekday mm/dd/yyyy
%a, %b %d, %Y - abbrev. weekday, abbrev. month dd, yyyy
%A, %B %d, %Y - weekday, month dd, yyyy
%b - return ASCII string for abbrev. month name
%B - return ASCII string for month name
%a - return ASCII string for abbrev. day name
%A - return ASCII string for day name
%y%m%d - return date in yymmdd form for sorting
%j - return day of the year (001-366)
%H:%M:%S - hh:mm:ss 0 <= hh <= 23
%H:%M - hh:mm 0 <= hh <= 23
%I:%M:%S %p - hh:mm:ss AM/PM
%I:%M %p - hh:mm AM/PM
20. * sdate(fmt_str,fdate) ==> return date formatted according
to string value of fmt_str. This function is similar to
"sdate(fmt_str)" except that the file date (fdate) instead of
the current system date is formatted. The file date, fdate,
is usually obtained by the "findfile" function. If the value
of fdate passed is zero, the current system date is
utilized. The current system time is utilized for any time
substitutions.
21. * sdate(fmt_str,year,month,day) ==> return date formatted
according to string value of fmt_str. This function is
similar to "sdate(fmt_str)" except that the specific date
passed (year, month, day) is formatted. The current system
time is utilized for any time substitutions.
22. split(s,a) ==> split string s into array a on field
separator FS. Return number of fields. The same rules
applied to FS for splitting the current input record apply to
splitting s into a.
23. split(s,a,fs) ==> split string s into array a on field
separator fs. Return number of fields. The same rules
applied to FS for splitting the current input record apply to
the use of fs in splitting s into a.
24. * srange(c1,c2) ==> return string formed by concatenating
the characters from c1 to c2 inclusive. If c2 < c1, the null
string is returned. Thus,
srange('a','k') == "abcdefghijk".
25. * srev(s) ==> return string formed by reversing string s.
srev(srange('a','k')) == "kjihgfedcba".
26. * stime(fmt_str) ==> return current system time and date
formatted according to the string value of fmt_str.
%a --> Locale's Abbreviated Weekday Name
%A --> Locale's Weekday Name
%b --> Locale's Abbreviated Month Name
%B --> Locale's Month Name
%c --> Locale's Appropriate Date And Time Representation
%d --> Day Of The Month As A Decimal Number (01-31)
%H --> The Hour (24-hour Clock) As A Decimal Number (00-23)
%I --> The Hour (12-hour Clock) As A Decimal Number (01-12)
%j --> The Day Of The Year As A Decimal Number (001-366)
%m --> The Month As A Decimal Number (01-12)
%M --> The Minute As A Decimal Number (00-59)
%p --> The Locale's Equivalent Of The AM/PM Designations
Associated With A 12-hour Clock
%S --> The Second As Decimal Number (00-61)
%U --> The Week Number Of The Year (the First Sunday As The
First Day Of Week 1) As A Decimal Number (00-53)
%w --> The Weekday As A Decimal Number (0-6), Where Sunday Is
0
%W --> The Week Number Of The Year (the First Monday As The
First Day Of Week 1) As A Decimal Number (00-53)
%x --> The Locale's Appropriate Date Representation
%X --> The Locale's Appropriate Time Representation
%y --> The Year Without Century As A Decimal Number (00-99)
%Y --> The Year With Century As A Decimal Number
%Z --> The Time Zone Name Or Abbreviation, Or By No Character
If No Time Zone Is Determinable
%% --> %
Refer to the 'sdate' function for some examples of common
formatting strings.
27. * stime(fmt_str,ftime) ==> return time formatted according
to the string value of fmt_str. This function is similar to
"stime(fmt_str)" except that the file time (ftime) instead of
the current system time is formatted. The file time, ftime,
is usually obtained by the "findfile" function. The current
system date is utilized for any date substitutions.
28. * stime(fmt_str,hour,minute,second) ==> return time
formatted according to the string value of fmt_str. This
function is similar to "stime(fmt_str)" except that the
specific time passed (hour, minute, second) is formatted.
The current system date is utilized for any date
substitutions.
29. * stran(s) ==> return string formed by translating
characters in string s matching characters in string value of
built-in variable, TRANS_FROM, to corresponding character in
string value of built-in variable, TRANS_TO. If no
corresponding character in TRANS_TO, then replace with
blank. TRANS_FROM and TRANS_TO initially set to:
TRANS_FROM = srange('A','Z');
TRANS_TO = srange('a','z');
30. * stran(s,st) ==> return string formed by translating
characters in string s matching characters in string value of
built-in variable, TRANS_FROM, to corresponding character in
st. If no corresponding character in st then replace with
blank.
31. * stran(s,st,sf) ==> return string formed by translating
characters in string s matching characters in sf to
corresponding character in st. If no corresponding character
in st then replace with blank.
32. * strim(s) ==> return string formed by trimming leading and
tailing white space from string s. Leading white space
matches the regular expression /^{_w}+/. Tailing white space
matches the regular expression /{_w}+$/.
33. * strim(s,le) ==> return string formed by trimming string
matching le and tailing white space from string s. Differing
actions are taken depending on the type of le:
────────────┬──────────────────────────────────────────────────
le type │ action
────────────┼──────────────────────────────────────────────────
regular │
expression │ delete first string matching regular expression
│
string │ convert to regular expression and delete
│ first matching string
│
single │
character │ delete all leading characters equal to 'le'
│
nonzero │
numeric │ delete leading white space matching /^{_w}+/
│
zero │
numeric │ ignore
────────────┴──────────────────────────────────────────────────
strim(s,TRUE) is equivalent to the form strim(s)
The following all delete the leading dashes from the given
string:
strim("------ remove leading -------",/^-+/);
strim("------ remove leading -------",/-+/);
strim("------ remove leading -------",'-');
==> "remove leading -------"
34. * strim(s,le,te) ==> return string formed by trimming
string matching le and string matching te from s. 'le' and
'te' may be a regular expression, a string, a single
character or a numeric. Differing actions are taken
depending on the type of le and te:
────────────┬──────────────────────────────────────────────────
le/te type │ action
────────────┼──────────────────────────────────────────────────
regular │
expression │ delete first string matching regular expression
│
string │ convert to regular expression and delete
│ first matching string
│
single │
character │ delete all leading/tailing characters equal
│ to 'le'/'te' respectively
│
nonzero │
numeric │ delete leading/tailing white space matching
│ /^{_w}+/ or /{_w}+$/ respectively
│
zero │
numeric │ ignore
────────────┴──────────────────────────────────────────────────
strim(s,TRUE,TRUE) is equivalent to the form strim(s)
strim("======remove leading and tailing-------",'=','-')
or
strim("======remove leading and tailing-------",/^=+/,'-')
or
strim("======remove leading and tailing-------",'=',/-+$/)
or
strim("======remove leading and tailing-------",/^=+/,/-+$/)
==> "remove leading and tailing"
strim("======remove leading-------",'=',FALSE)
==> "remove leading-------"
strim("======remove tailing-------",FALSE,'-')
==> "======remove tailing"
35. * strlwr(s) ==> return string s translated to lower-case.
36. * strupr(s) ==> return string s translated to upper-case.
37. sub(re,rs) ==> substitute for leftmost string matched by
regular expression, re, in $0, return number of substitutions
made (0/1). Refer to the description of the built-in function
'gsub' for when the replacement string expression, rs, is
evaluated.
38. sub(re,rs,t) ==> substitute for leftmost string matched by
regular expression, re, in t, return number of substitutions
made (0/1). Refer to the description of the built-in function
'gsub' for when the replacement string expression, rs, is
evaluated.
39. substr(s,p) ==> return string formed from suffix of string
s starting at position p.
40. substr(s,p,n) ==> return string formed from n characters of
string s starting at position p. If n == 1, a character
constant is returned.
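The fract_jdn formula given under the "cal" function above, and
its inverse, can be checked with a short Python sketch. This is
not QTAwk code. The function names fract_jdn and time_of_day are
hypothetical, and rounding to the nearest whole second is an
assumption made here, since the manual does not state how QTAwk
rounds. The sample values are those of the "cal" example.

```python
def fract_jdn(hours, minutes, seconds):
    """Fractional part of a jdn for a time of day on a 24 hour basis."""
    return hours / 24.0 + minutes / 1440.0 + seconds / 86400.0


def time_of_day(fraction):
    """Convert the fractional part of a jdn back to (hours, minutes, seconds)."""
    total = round(fraction * 86400)     # whole seconds since midnight
    hours, rest = divmod(total, 3600)
    minutes, seconds = divmod(rest, 60)
    return hours, minutes, seconds


print("%.9f" % fract_jdn(9, 35, 43))                    # 0.399803241
print("%02d:%02d:%02d" % time_of_day(0.399803241))      # 09:35:43
```

Adding the fraction to an integer jdn, for example 2449251 +
fract_jdn(9, 35, 43), gives a floating point value of the kind
passed to "cal" in the example above.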
12.3 File Functions
QTAwk offers the following built-in file functions. Those
marked with an asterisk, '*', differ from those in Awk.
For input files, if a drive and/or path is specified with the
filename, then only that drive and path are searched for the
desired file. If no drive or path is specified, then the current
directory is searched first. If the file is not found there, the
paths (optionally with a drive specifier) specified by the string
value of the built-in variable QTAwk_Path are searched for the file.
Multiple paths may be specified in QTAwk_Path by separating them
with semi-colons.
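The search order just described can be modelled with a short
Python sketch. This is an illustration under stated assumptions,
not the QTAwk implementation: the function name find_input_file
and its parameters are hypothetical, and the QTAwk_Path entries
are assumed to be tried in the order in which they are listed.

```python
import os
import tempfile


def find_input_file(name, qtawk_path, cwd="."):
    """Return the first existing location of name, or None if not found."""
    drive, _ = os.path.splitdrive(name)
    if drive or os.path.dirname(name):
        # A drive and/or path was given: only that location is searched.
        return name if os.path.isfile(name) else None
    # Otherwise search the current directory first, then each path
    # listed in QTAwk_Path (separated by semi-colons).
    candidates = [cwd] + [p for p in qtawk_path.split(";") if p]
    for directory in candidates:
        full = os.path.join(directory, name)
        if os.path.isfile(full):
            return full
    return None


with tempfile.TemporaryDirectory() as root:
    current = os.path.join(root, "work")
    library = os.path.join(root, "lib")
    os.mkdir(current)
    os.mkdir(library)
    with open(os.path.join(library, "data.txt"), "w") as handle:
        handle.write("sample\n")

    # No path given: not in the current directory, found via the path list.
    found = find_input_file("data.txt", library, cwd=current)
    # Path given: only that directory is searched, so the file is not found.
    explicit = find_input_file(os.path.join(current, "data.txt"),
                               library, cwd=current)
    print(found == os.path.join(library, "data.txt"))   # True
    print(explicit)                                     # None
```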
The file functions are:
Input:
getline;
getline();
getline(v);
getline v;
fgetline(F)
fgetline(F,v)
getc()
fgetc(F)
srchrecord(sp)
srchrecord(sp,rs)
srchrecord(sp,rs,var)
fsrchrecord(fn,sp)
fsrchrecord(fn,sp,rs)
fsrchrecord(fn,sp,rs,var)
Output:
print;
print expr_list;
print(expr_list);
fprint F;
fprint(F);
fprint F,expr_list;
fprint(F,expr_list);
printf(fmt);
printf(fmt,expr_list);
fprintf(F,fmt);
fprintf(F,fmt,expr_list);
putc(c);
fputc(c,F);
Miscellaneous:
1. append(F);
2. close(F);
3. get_FNR();
4. get_FNR(F);
12.3.1 Input Functions
1. * getline();
or
* getline; ==> reads next record from current input file into
$0. Sets fields, NF, NR and FNR. The effect of the
'getline' function is summarized in the table following the
description of the 'fsrchrecord' function.
Returns:
a) the number of characters read plus the length of the
End-Of-Record plus 1,
b) 0 if End-Of-File was encountered, or
c) -1 if an error occurred.
The built-in variables FILE_SEARCH and FILE_SEARCH_PAT have
no effect on the next record input with this function. The
next physical record in the file is read regardless of the
value of FILE_SEARCH.
2. * getline(v);
or
* getline v; ==> reads next record from current input file
into variable v. Sets NR and FNR. The effect of the
'getline' function is summarized in the table following the
description of the 'fsrchrecord' function.
Returns:
a) the number of characters read plus the length of the
End-Of-Record plus 1,
b) 0 if end-of-file was encountered or
c) -1 if an error occurred.
The built-in variables FILE_SEARCH and FILE_SEARCH_PAT have
no effect on the next record input with this function. The
next physical record in the file is read regardless of the
value of FILE_SEARCH.
3. * fgetline(F) ==> reads next record from file F into $0.
Sets fields and NF. The effect of the 'fgetline' function is
summarized in the table following the description of the
'fsrchrecord' function.
Returns:
a) the number of characters read plus the length of the
End-Of-Record plus 1,
b) 0 if end-of-file was encountered or
c) -1 if an error occurred.
The built-in variables FILE_SEARCH and FILE_SEARCH_PAT have
no effect on the next record input with this function. The
next physical record in the file is read regardless of the
value of FILE_SEARCH.
4. * fgetline(F,v) ==> reads next record from file F into
variable v. The effect of the 'fgetline' function is
summarized in the table following the description of the
'fsrchrecord' function.
Returns:
a) the number of characters read plus the length of the
End-Of-Record plus 1,
b) 0 if end-of-file was encountered or
c) -1 if an error occurred.
The built-in variables FILE_SEARCH and FILE_SEARCH_PAT have
no effect on the next record input with this function. The
next physical record in the file is read regardless of the
value of FILE_SEARCH.
5. * srchrecord(sp) or
* srchrecord(sp,rs) or
* srchrecord(sp,rs,var) searches the current input file for
the next record or records matching the search pattern,
'sp'. If the record separator parameter, 'rs', is not
specified, records are determined by the variable RS. If
'rs' is specified, record boundaries are determined by the
strings matching 'rs'. 'rs' may be a simple constant or
variable or an array. As for the built-in variable RS,
specifying rs as the null string, "", will set a blank line
as the record separator. If rs is specified as the null string,
QTAwk will silently use the regular expression /\n\n/. The
record or records matching the search pattern are returned in
'$0' if 'var' is not specified. If 'var' is specified, the
matching record or records are returned in 'var'. The
built-in variables, FNR and NR are updated to reflect the
current position and record number after the search. The
built-in variables, NF and '$i', 0 ≤ i ≤ NF, are set when
'var' is not specified. The effect of the 'srchrecord'
function is summarized in the table following the description
of the 'fsrchrecord' function.
Returns:
a) the number of characters read plus the length of the
End-Of-Record plus 1,
b) 0 if end-of-file was encountered or
c) -1 if an error occurred.
6. * fsrchrecord(fn,sp) or
* fsrchrecord(fn,sp,rs) or
* fsrchrecord(fn,sp,rs,var) searches the file specified in
'fn' for the next record or records matching the search
pattern, 'sp'. If the record separator parameter, 'rs', is
not specified, records are determined by the variable RS. If
'rs' is specified, record boundaries are determined by the
strings matching 'rs'. 'rs' may be a simple constant or
variable or an array. As for the built-in variable RS,
specifying rs as the null string, "", will set a blank line
as the record separator. If rs is specified as the null string,
QTAwk will silently use the regular expression /\n\n/. The
record or records matching the search pattern are returned in
'$0' if 'var' is not specified. If 'var' is specified, the
matching record or records are returned in 'var'. The
built-in variables, NF and '$i', 0 ≤ i ≤ NF, are set when
'var' is not specified.
Returns:
a) the number of characters read plus the length of the
End-Of-Record plus 1,
b) 0 if end-of-file was encountered or
c) -1 if an error occurred.
The following table details the effect of the 'getline',
'fgetline', 'srchrecord' and 'fsrchrecord' functions on $0,
the field variables, $i, and the built-in variables, NF, NR
and FNR.
          getline()      getline(v)      fgetline(F)      fgetline(F,v)
          srchrecord()   srchrecord(v)   fsrchrecord(F)   fsrchrecord(F,v)
$0        updated        not updated     updated          not updated
$i, i>0   updated        not updated     updated          not updated
NF        updated        not updated     updated          not updated
NR        updated        updated         not updated      not updated
FNR       updated        updated         not updated      not updated
Note: the function parameters sp and rs have not been
shown in 'srchrecord' and 'fsrchrecord' to highlight the
similarity with the functions 'getline' and 'fgetline' and
their effect on the variables indicated.
Note: FNR and NR are updated to their proper and correct
values whenever the basic QTAwk processing loop reads the
next input record from the current input file or when the
'getline' or 'srchrecord' functions are executed.
If the record number of the record read by the 'fgetline' or
'fsrchrecord' functions is desired, the 'get_FNR' function
must be used.
7. * getc(); ==> reads next character from current input file
and returns the character read, 0 if End-Of-File or -1 on
file read error.
The ECHO_INPUT built-in variable controls echo of characters
read from the standard input file or keyboard file to the
standard output file.
8. * fgetc(F); ==> reads next character from file F and returns
the character read, 0 if End-Of-File or -1 if a file error
occurs.
If reading from the standard file, "keyboard", there are no
End-Of-File or error returns. The "fgetc" function will
return 0 when a function key, cursor key or other key not
corresponding to an ASCII character is pressed when reading
from the keyboard file. A second call must be made to read
the keyboard scan code to recognize the key pressed.
Keyboard scan codes are listed in Appendix ii.
The ECHO_INPUT built-in variable controls echo of characters
read from the standard input file or keyboard file to the
standard output file.
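The two-call convention described for reading non-ASCII keys
from the keyboard file can be sketched as follows. This is a
Python model, not QTAwk code. The function read_key is
hypothetical, the fgetc argument is a stand-in for a call such
as fgetc("keyboard"), and characters are modelled as integer
codes, which is an assumption. The scan code 59 is only a sample
value. Actual scan codes are listed in Appendix ii.

```python
def read_key(fgetc):
    """Read one key press using the two-call convention.

    fgetc is a zero-argument callable standing in for reading one
    value from the keyboard file. A first value of 0 means the key
    has no ASCII character, so a second call fetches its scan code.
    """
    code = fgetc()
    if code != 0:
        return ("char", chr(code))
    return ("scan", fgetc())


# Simulated key presses: the letter 'A', then a non-ASCII key
# that arrives as 0 followed by the sample scan code 59.
pending = iter([65, 0, 59])


def next_code():
    return next(pending)


print(read_key(next_code))      # ('char', 'A')
print(read_key(next_code))      # ('scan', 59)
```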
12.3.2 Output Functions
1. * fprint(F);
or
* fprint F; ==> prints $0 to file 'F' followed by ORS.
Returns the number of characters printed.
2. * fprint(F,...);
or
* fprint F,...; ==> prints expressions in the expr_list,
'...', to the file 'F', each separated by OFS. The last
expression is followed by ORS. Returns the number of
characters printed.
3. fprintf(F,fmt,...) ==> print expr_list, ..., to file 'F'
according to format 'fmt'. Returns the number of characters
printed.
4. print();
or
print; ==> prints $0 to standard output file followed by
ORS. Returns the number of characters printed.
5. print(...);
or
print ...; ==> prints expressions in the expr_list, ..., to
the standard output file, each separated by OFS. The last
expression is followed by ORS. Returns the number of
characters printed.
6. printf(fmt,...) ==> print expr_list, ..., to standard output
file according to format, fmt. Returns the number of
characters printed.
7. * putc(c) ==> writes the character 'c' to the standard
output file and returns the character c if no error occurred
on the write. Returns -1 if an error occurred.
8. * fputc(c,F) ==> writes the character 'c' to the file F and
returns the character c if no error occurred on the write.
Returns -1 if an error occurred.
9. sprintf(fmt,...) ==> return string formed by formatting
expr_list, ... , according to format, fmt.
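The way 'print' and 'fprint' join an expression list with OFS,
terminate it with ORS and return a character count can be
modelled in a few lines of Python. This is a sketch, not QTAwk
code. The function emulate_print is hypothetical, and it assumes
that the returned count includes the OFS and ORS characters and
that ORS is a single newline character, neither of which is
stated explicitly above.

```python
import sys


def emulate_print(values, ofs=" ", ors="\n", out=sys.stdout):
    """Write values separated by ofs and followed by ors.

    Returns the number of characters written (assumed here to
    include the separators and the record terminator).
    """
    text = ofs.join(str(value) for value in values) + ors
    out.write(text)
    return len(text)


count = emulate_print(["alpha", 42, "gamma"], ofs=",")   # writes alpha,42,gamma
print(count)                                             # 15
```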
12.3.3 Miscellaneous File Functions
1. * append(F) ==> this function causes all subsequent output
to file 'F' to be written, "printed", at the end of the file,
i.e., appended to the end. If output is printed to the file
without first executing this function and the file already
exists, it is opened and truncated to zero length prior to
printing the first character. Closing the file, "F", with
the 'close' function will cancel the effect of 'append' for
any subsequent output to that file.
2. close(F) ==> close file F. If the same file is open for
both reading and writing, the 'close' function will close the
file opened for writing first. The same file may be opened
for both reading and writing with the 'append' function.
Specifying the current scan file for the 'append' function is
possible and will direct any output to the end of the file
without altering the current read position.
3. * get_FNR()
or
* get_FNR(F) ==> this function returns the current record
number of the input file specified. The first form returns a
value equal to the built-in variable FNR and is equivalent
to:
get_FNR(FILENAME)
If the filename specified is not open or is not open for
input, a value of zero, 0, is returned.
This function has been added because of the input functions
'fgetline' and 'fsrchrecord'. For the current input file,
the built-in variable FNR is always updated automatically to
contain the record number of the last record input (the
current record). However, when reading from a file other
than the current input file, there is no other means of
obtaining the current record number of the file. With
'fgetline', the user utility could maintain an independent
count of records read. However, if the 'fsrchrecord'
function is used, it is not possible for the utility to
maintain such a count since not all records are actually
'read'. Many may be skipped when searching for a string
matching the search pattern.
The use of the re-direction and pipeline operators, '<', '>',
'>>' and '|', has been discontinued as error prone. The use of
the syntax:
{ print $1, $2 > $3 }
has been replaced by the 'fprint' function:
{ fprint($3,$1,$2); }
or
{ fprint $3,$1,$2; }
12.3.4 Standard Files
QTAwk has six standard files: keyboard, stdaux, stderr, stdin,
stdout, stdprn. These files are always open and available for
input or output.
1. keyboard -- Keyboard
this file is designed to read input from the keyboard. If
the keyboard has been redirected to a disk file by the
PC/MS-DOS redirection or piping facility, then the
End-Of-File will not be recognized. Any utility reading
redirected input from this file must supply a means within
the utility for termination. This can be done by testing for a
special input value and using the "exit" or "endfile" statement.
The "fgetc" function will return 0 when a function key,
cursor key or other key not corresponding to an ASCII
character is pressed when reading from the keyboard file. A
second call must be made to read the keyboard scan code to
recognize the key pressed. Keyboard scan codes are listed in
Appendix ii.
The ECHO_INPUT built-in variable controls echo of characters
read from the keyboard file to the standard output file. If
characters read are not to be displayed, then ECHO_INPUT
should be set to a false value. The default value of
ECHO_INPUT is 0.
2. stdaux -- Standard Auxiliary File
the first communications port if available for output.
3. stderr -- Standard Error File
the console display.
4. stdin -- Standard Input File
normally the console keyboard, but may be redirected to a
disk file or piped from the output of another application
program.
This file is the utility input file if no input file has
been specified on the command line or an input file of "-"
has been specified.
If not redirected or piped, input will be read from the
keyboard. This file is buffered and the input is not
available to the QTAwk utility until the carriage return, or
enter, key is pressed. If single key characters are needed
by the utility as the keys are pressed, then the "keyboard"
file should be used for input using the "fgetc" input
function. Also, cursor keys, function keys or other
"special" character keys cannot be input from this file. The
"keyboard" file and the "fgetc" function must be used for
such input.
The ECHO_INPUT built-in variable controls echo of characters
read from the redirected or piped standard input file to the
standard output file. If the standard input file has not
been redirected or piped, then input read from the standard
input file will always be displayed regardless of the value
of ECHO_INPUT. If characters read from the redirected or
piped standard input file are not to be displayed, then
ECHO_INPUT should be set to a false value. The default value
of ECHO_INPUT is 0. Normally, the default value is desired,
otherwise redirected or piped input will be sent to the
standard output file as it is read.
The following table shows the effect of ECHO_INPUT on input
from the standard input file:
                                   redirected  redirected
ECHO_INPUT  stdin       stdin      stdin       stdin
value       to display  to stdout  to display  to stdout
true        yes         yes        no          yes
false       yes         no         no          no
5. stdout -- Standard Output File
normally the console screen, but may be redirected to a disk
file or piped to the input of another application program.
Note that the "print", "printf" and "putc" functions all
output only to this file.
6. stdprn -- Standard Printer File
the printer attached to the parallel, or printer, port.
12.4 Miscellaneous Functions
12.4.1 Expression Type
*e_type(expr) ==> returns the type of 'expr'. The function
evaluates the expression 'expr' and returns the type of the final
result. The return is an integer defining the type:
Return Type
0 Uninitialized (returned when 'expr' is a variable which
has not had a value assigned to it, or
which has not been assigned a value
since being acted on by a "deletea" statement)
1 Regular Expression Value
2 String Value
3 Single Character Value
4 Integral Value
5 Floating Point Value
local lvar;
e_type(lvar) ==> 0
e_type(/string test/) ==> 1
e_type("string test") ==> 2
e_type('a') ==> 3
e_type(45) ==> 4
e_type(45.6) ==> 5
e_type(45.6 ∩ "") ==> 2
e_type("45.6" + 0.0) ==> 5
e_type("45.6" + 0) ==> 5
e_type("45" + 0) ==> 4
e_type("45" + 0.0) ==> 5
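The last four results above depend on whether the string has the
form of an integer or of a floating point number. The following
Python sketch is only a model of that behavior and is not QTAwk
code. The names 'numeric_value' and 'result_type' are hypothetical;
the type codes 4 and 5 are taken from the table above.

```python
# Illustrative model only (Python, not QTAwk): reproduces the e_type
# results listed above for "numeric string + number".

INTEGRAL, FLOATING = 4, 5   # type codes from the e_type table


def numeric_value(s):
    """Convert a numeric string to int when it has integral form, else float."""
    try:
        return int(s)
    except ValueError:
        return float(s)


def result_type(string_operand, number_operand):
    """Type code of string_operand + number_operand under this model."""
    total = numeric_value(string_operand) + number_operand
    return FLOATING if isinstance(total, float) else INTEGRAL


print(result_type("45.6", 0.0))  # 5
print(result_type("45.6", 0))    # 5
print(result_type("45", 0))      # 4
print(result_type("45", 0.0))    # 5
```

Under this model the result is integral only when both operands
are integral, which agrees with the four examples listed.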
12.4.2 Execute String
QTAwk offers two forms of a function to execute QTAwk dynamic
expressions or statements. The first form will execute strings
as QTAwk expressions or statements. The second will execute
array elements as QTAwk expressions or statements.
*execute(s[,se[,rf]]) ==> execute string s as a QTAwk statement
or expression. If se == TRUE, then string s is executed as an
expression and the resultant value is returned by the 'execute'
function. If se == FALSE, then string s is executed as a
statement and the constant value of one, 1, is returned. The se
parameter is optional and defaults to FALSE. Any built-in or
user-defined function may be executed in the 'execute' function
except the 'execute' function itself. New variables may be
defined as well as new constant strings and regular expressions.
The optional rf parameter is the error recovery flag. If rf ==
FALSE (the default value), an error encountered in parsing or
executing the string s will cause QTAwk to issue the appropriate
error message and halt execution. If rf == TRUE, an error
encountered in parsing or executing the string s will cause QTAwk
to issue the appropriate error message, discontinue parsing or
execution of the string and continue executing the current QTAwk
utility. Attempting to execute the 'execute' function from
within the 'execute' function is a fatal error and will always
cause QTAwk to halt execution.
The following string can be executed as either an expression or
statement:
nvar = "power2 = 2 ^ 31;";
If executed as an expression:
print execute(nvar,1);
the output will be: 2147483648
If executed as a statement:
print execute(nvar,0);
or
print execute(nvar);
the output will be: 1
Multiple statements/expressions may be executed with a compound
statement of the form:
pvar = "{ pow8 = 2 ^ 8; pow16 = 2 ^ 16; pow31 = 2 ^ 31; }";
Then
execute(pvar,0);
or
execute(pvar);
will set the three variables:
1. pow8
2. pow16
3. pow31
even if the variables were not previously defined. If the
variables were not previously defined, they will be added to the
list of the utility's global variables.
Note that attempting to execute pvar as an expression:
execute(pvar,1);
will result in the error message "Undefined Symbol", since the
braces '{}' are only recognized in statements. All three
expressions may be executed, as an expression, by the use of the
sequence operator in the following manner:
pvar = "pow8 = 2 ^ 8 , pow16 = 2 ^ 16 , pow31 = 2 ^ 31;";
*execute(a[,se[,rf]]) ==> execute the elements of array a as a
QTAwk statement or expression. The se and rf parameters have the
same function and default values as above. For example, the
compound statement contained in pvar above may be split among the
elements of an array:
avar[1] = "{";
avar[2] = "pow8 = 2 ^ 8;";
avar[3] = "pow16 = 2 ^ 16;";
avar[4] = "pow31 = 2 ^ 31;";
avar[5] = "}";
and executed as:
execute(avar);
or
execute(avar,0);
12.4.3 Array Function
QTAwk offers the following built-in array function.
rotate(a) - the values of the array are rotated.
The value of the first element goes to the last element, the
second to the first, third to the second, etc. If the array has
the following elements:
1. a[1] = 1
2. a[2] = 2
3. a[3] = 3
4. a[4] = 4
then rotate(a) will have the result:
1. a[1] = 2
2. a[2] = 3
3. a[3] = 4
4. a[4] = 1
The argument need not be a complete one-dimensional array; a
sub-array of a multidimensional array may also be rotated. If:
1. a[1][1] = 1
2. a[1][2] = 2
3. a[1][3] = 3
4. a[1][4] = 4
Then rotate(a[1]) will produce the result:
1. a[1][1] = 2
2. a[1][2] = 3
3. a[1][3] = 4
4. a[1][4] = 1
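The rotation described above can be modeled in a few lines. The
following is a Python sketch, not QTAwk code, and it assumes that
the elements are taken in ascending index order; a Python
dictionary stands in for the QTAwk array.

```python
# Illustrative model only (Python, not QTAwk) of rotate(a).

def rotate(a):
    """Rotate the values of dict 'a' one position toward the first index, in place."""
    keys = sorted(a)
    if not keys:
        return
    values = [a[k] for k in keys]
    values = values[1:] + values[:1]   # first value moves to the last position
    for k, v in zip(keys, values):
        a[k] = v


a = {1: 1, 2: 2, 3: 3, 4: 4}
rotate(a)
print(a)  # {1: 2, 2: 3, 3: 4, 4: 1}

# Rotating one row of a two-dimensional array leaves the other rows alone.
b = {1: {1: 1, 2: 2, 3: 3, 4: 4}, 2: {1: 5, 2: 6}}
rotate(b[1])
print(b[1])  # {1: 2, 2: 3, 3: 4, 4: 1}
```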
12.4.4 System Control Function
system(e) ==> executes the system command specified by the
string value of the expression e.
12.4.5 Variable Access
There are two built-in functions available for access to
variables. The first, "pd_sym", accesses predefined variables
and the second, "ud_sym", accesses user-defined variables. Each
has two forms:
pd_sym(name_str)
ud_sym(name_str)
or
pd_sym(name_num,name_str)
ud_sym(name_num,name_str)
1. To access predefined variables, the function "pd_sym" may be
used. This function has been supplied to provide a
predefined variable access function similar to the function
"ud_sym" for accessing user-defined variables. The forms and
returns are similar.
2. To access user-defined variables where the variable name may
not be known in advance, the function "ud_sym" has been
supplied. The first form:
ud_sym(name_expr)
is useful in situations where the variable name is not known
until the statement is to be executed. In these cases,
name_expr may be any expression or variable with a string
value equal to the name of the unknown variable. In this
form, the string value of name_expr is used to access the
variable. 'ud_sym' returns the variable in question, if one
exists, whose name is equal to the string value passed.
The functional return value may be used in any expression
just as the variable itself would. This includes operating
on the return value with the array index operators, "[]".
Note: This form may be used to access both local and global
variables. If both a local and global variable have been
defined with the desired name and the local variable is
within scope, then the local variable is returned.
The second form:
ud_sym(name_expr,name_str)
is useful in those situations where it may be impractical to
use string values to access the variables, e.g., in a "for",
"while" or "do" loop, but a numeric value can be used to
access the variables.
The user variables are accessed in the order defined in the
user utility starting with one, 1. If the integer value of
name_expr exceeds the number of user-defined variables, then
a constant is returned. The second parameter must be a
variable. Upon return, this variable will have a string
value equal to the name of the variable found or the null
string if name_expr exceeds the number of user-defined
variables. The return value of this variable may be tested
to assure that a variable was found.
The functional return value may be used in any expression
just as the variable itself would. This includes operating
on the return value with the array index operators, "[]".
Appendix vi lists the pre-defined variables with the integer
value used in the first argument to access the variable.
Note: This form may be used to access global variables
ONLY. Local variables cannot be accessed with this form of
the function.
The following short function will return the number of
user-defined global variables:
# function to return the current number of
# GLOBAL variables defined in utility
function var_number(display) {
local cnt, j, jj;
for ( cnt = 1, j = ud_sym(cnt,jj) ; jj ; j = ud_sym(cnt,jj) ) {
if ( display ) print cnt ∩ " : " ∩ jj ∩ " ==>" ∩ j ∩ "<==";
cnt++;
}
return cnt - 1;
}
The following function may be called with the name of the
variable desired. The value of the variable will be returned.
Note that the appropriate variables have been defined in the
"BEGIN" action.
BEGIN {
#define the conversion variables
_kilometers_to_statute_miles_ = 1/1.609344; # miles / kilometer (exact)
_statute_miles_to_kilometers_ = 1.609344; # kilometers / mile (exact)
_km_to_sm_ = 1/1.609344; # miles / kilometer (exact)
_sm_to_km_ = 1.609344; # kilometers / mile (exact)
_inches_to_centimeters_ = 2.54;
_centimeters_to_inches_ = 1/2.54;
_in_to_cm_ = 2.54;
_cm_to_in_ = 1/2.54;
_radians_to_degrees_ = 180/pi;
_degrees_to_radians_ = pi/180;
_rad_to_deg_ = 180/pi;
_deg_to_rad_ = pi/180;
}
# function to return the appropriate conversion factor
function conversion_factor(from_n,to_n) {
local c_name = '_' ∩ from_n ∩ "_to_" ∩ to_n ∩ '_';
return ud_sym(c_name);
}
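The technique used here, building a variable name at run-time and
then looking the variable up by that name, can be modeled as
follows. This is a Python sketch, not QTAwk code. The dictionary
'factors' is a hypothetical stand-in for the global variables set
in the "BEGIN" action, and the dictionary lookup stands in for the
call to 'ud_sym'.

```python
# Illustrative model only (Python, not QTAwk) of lookup by constructed name.

# Hypothetical table standing in for two of the global variables above.
factors = {
    "_inches_to_centimeters_": 2.54,
    "_centimeters_to_inches_": 1 / 2.54,
}


def conversion_factor(first, second):
    """Build the name '_first_to_second_' and return the value stored under it."""
    name = "_" + first + "_to_" + second + "_"
    return factors[name]


# Convert 10 inches to centimeters.
print(10 * conversion_factor("inches", "centimeters"))
```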
The following utility will list all pre-defined variables with
their current value and the integer used to access them:
BEGIN {
for ( i = 1 , jj = pd_sym(i,j) ; j ; jj = pd_sym(++i,j) ) {
gsub(/\n/,"\\n",jj);
gsub(/\s/,"\\s",jj);
gsub(/\t/,"\\t",jj);
print i,j,jj;
}
}
12.4.6 File Search Function
QTAwk offers three forms of the "findfile" function. All three
forms search for files matching a specified
"drive:\path\filename.ext" pattern. Since PC/MS-DOS function
calls are utilized for the search operations, only PC/MS-DOS
style wild cards are allowed in the "filename.ext" portion of the
pattern. The three forms of the file search function are:
1. findfile(variable)
2. findfile(variable,pattern)
3. findfile(variable,pattern,attributes)
All three forms of the function return the number of files found
which match the pattern specified with the specified attributes.
The first argument specifies the variable in which the array of
files found is returned. The second argument specifies the
"drive:\path\filename.ext" pattern to match against and the third
argument specifies the file attributes which the files found must
match. If the file pattern is not passed or passed as the null
string, then the pattern "*.*" matching all files in the current
directory is utilized.
If the last argument, the file attributes, is passed, its string
value specifies the attributes desired. The following characters
are recognized:
'a' or 'A' --> file archive bit set indicating that file has
been written to since the file was last archived (bit
cleared).
'd' or 'D' --> sub-directory, only sub-directories matching the
pattern will be returned.
'h' or 'H' --> file hidden bit is set indicating that file is
hidden, i.e., not ordinarily displayed by directory
searches.
'r' or 'R' --> file read-only bit is set, i.e., file may not be
written to.
's' or 'S' --> file system bit is set indicating that file is a
"system file". Again these files are not ordinarily
displayed by directory searches.
'v' or 'V' --> Volume ID bit. Returns volume IDs matching
pattern. This is only found in the root directory.
'z' or 'Z' --> all file attributes cleared.
If the file attributes are not specified, then all files which
match the pattern specified with all attributes cleared and all
files with the archive and/or read-only bits set will be
returned.
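The default selection rule can be expressed as a test on the
PC/MS-DOS attribute byte. The following Python sketch is a model
of that rule and is not QTAwk code. The bit values are the
standard PC/MS-DOS directory attribute bits; the function name
'matches_default' is hypothetical.

```python
# Illustrative model only (Python, not QTAwk) of the default attribute rule.

# Standard PC/MS-DOS directory-entry attribute bits.
READ_ONLY = 0x01
HIDDEN = 0x02
SYSTEM = 0x04
VOLUME_ID = 0x08
DIRECTORY = 0x10
ARCHIVE = 0x20


def matches_default(attr):
    """True if no bits other than archive and/or read-only are set."""
    return (attr & ~(ARCHIVE | READ_ONLY)) == 0


print(matches_default(0))                    # True: all attributes cleared
print(matches_default(ARCHIVE | READ_ONLY))  # True
print(matches_default(HIDDEN))               # False
print(matches_default(DIRECTORY))            # False
```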
The variable passed in the first argument will be used to return
the array of files found. The zeroth, 0, element of the array
will contain the path specified in the pattern passed or the
current working directory if no path is specified. Starting with
element one, 1, the filename.ext, file size, file time, file date
and file attributes will be returned. Thus, if the first
argument passes the variable "files", this variable will contain
elements with the following information (with 1 ≤ n ≤ N, the
number of files found):
files[0] --> path specified or current working directory
files[n]["name"] --> name of n'th file found
files[n]["size"] --> size of n'th file found
files[n]["date"] --> date of n'th file found in PC/MS-DOS file
date format
files[n]["time"] --> time of n'th file found in PC/MS-DOS file
time format
files[n]["attr"] --> string with attributes of n'th file found,
null string if all attributes cleared.
Normally, the list of files returned in the file array is
unsorted, i.e., in the order found by PC/MS-DOS. If the files
returned in the file array have to be ordered by filename, file
extension, file date, file time or file size, then the FILE_SORT
built-in variable must be used to specify the sort order
desired.
For PC/MS-DOS, the PC/MS-DOS file format time and date can be
converted to month, day, year and hour, minute, second using the
following simple computations:
fdate = files[n]["date"];
day = fdate & 0x01f;
month = (fdate >> 5) & 0x00f;
year = (fdate >> 9) + 1980;
ftime = files[n]["time"];
second = (ftime & 0x01f) * 2;
minute = (ftime >> 5) & 0x03f;
hour = (ftime >> 11);
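The same computations can be checked against a known date and
time. The following is a Python sketch of the arithmetic above,
not QTAwk code; the function names 'decode_dos_date' and
'decode_dos_time' are hypothetical, and the sample values are
packed by hand for the test.

```python
# Illustrative model only (Python, not QTAwk) of the date/time unpacking.

def decode_dos_date(fdate):
    """Unpack a PC/MS-DOS file date word into (year, month, day)."""
    day = fdate & 0x1F
    month = (fdate >> 5) & 0x0F
    year = (fdate >> 9) + 1980
    return year, month, day


def decode_dos_time(ftime):
    """Unpack a PC/MS-DOS file time word into (hour, minute, second)."""
    second = (ftime & 0x1F) * 2        # seconds are stored in 2-second units
    minute = (ftime >> 5) & 0x3F
    hour = ftime >> 11
    return hour, minute, second


# January 8, 1995 and 13:45:30, packed by hand.
fdate = ((1995 - 1980) << 9) | (1 << 5) | 8
ftime = (13 << 11) | (45 << 5) | (30 // 2)

print(decode_dos_date(fdate))  # (1995, 1, 8)
print(decode_dos_time(ftime))  # (13, 45, 30)
```

Since seconds are stored in units of two, an odd number of
seconds cannot be represented in this format.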
Alternatively, the 'sdate' and 'stime' built-in functions may be
used to format the file date and/or time. For example, the
following loop will re-set the appropriate array elements with
the formatted file date and time:
date_fmt_str = "%m/%d/%Y"; # format date as month/day/year
time_fmt_str = "%H:%M:%S"; # format time as hour:minute:second
for ( elem in files ) {
if ( elem == 0 ) continue; # skip path element
files[elem]["date"] = sdate(date_fmt_str,files[elem]["date"]);
files[elem]["time"] = stime(time_fmt_str,files[elem]["time"]);
}
12.4.7 Re-Set Regular Expressions
Once a regular expression has been used in QTAwk the internal
form is set and cannot normally be changed. If the value of
named expressions used in regular expressions changes
infrequently, the use of strings for search patterns is slow.
Also the value of regular expressions used in GROUP expressions
cannot normally be changed. For this reason QTAwk has included
the built-in function:
resetre()
This function resets ALL regular expressions, including all GROUP
expressions, except array regular expressions. The internal forms
for regular expressions are deleted. On the next use of a regular
expression for matching, the internal form is rederived with the
current values of any named expressions. The next time a GROUP is
matched against an input record, the GROUP expressions are
evaluated and the internal regular expression form for the GROUP
is rederived.
Refer to Section 6 for the use of arrays as regular expressions.
Arrays may be used anywhere in an expression where a regular
expression may be used. The internal form of the regular
expression is retained after the expression has been executed.
The internal regular expression form is deleted only when the
array is changed. The use of arrays as regular expressions gives
the user more control over the dynamic changing of the regular
expression internal form.
13.0 User-Defined Functions
QTAwk supports user-defined functions and has enhanced them over
Awk in several important aspects.
13.1 Local Variables
In QTAwk it is no longer necessary to declare local variables as
excess arguments in the function definition. QTAwk has included
the 'local' keyword. This keyword may be used in any compound
statement, but was invented specifically for user-defined
functions. Consider the simple function to accumulate words from
the current input record in the formatting utility:
# accumulate words for line
function addword(w) {
local lw = length(w); # length of added word
# check new line length
if ( cnt + size + lw >= width ) printline(yes);
line[++cnt] = w; # add word to line array
size += lw;
}
That lw is local to the function and will disappear when the
function is exited is obvious from the definition of lw. It is
also now easy to pick out the function arguments, 'w' in this
case. The initialization of lw to the length of the argument
passed is also easily picked up from the definition. The 'local'
keyword thus truly separates local variables from the function
arguments.
13.2 Argument Checking
By using the '_arg_chk' built-in variable, it is also possible
to have QTAwk now do some argument checking for the user. If
_arg_chk is TRUE, then QTAwk will, at run-time, check the number
of arguments passed against the number of arguments defined. If
the number passed differs from the number defined, then a
run-time error is issued and QTAwk halts. When '_arg_chk' is
FALSE, QTAwk will check at run-time only that the number of
arguments passed is less than or equal to the number defined.
This follows the Awk practice and allows for the use of arguments
defined, but not passed, as local variables. The default value
of '_arg_chk' is FALSE. It is recommended that '_arg_chk' be set
to TRUE and the 'local' keyword used to define variables meant to
be local to a function.
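The two checking rules can be summarized in a short model. The
following is a Python sketch, not QTAwk code; the function name
'call_accepted' is hypothetical and the arguments are simple
counts.

```python
# Illustrative model only (Python, not QTAwk) of the '_arg_chk' rules.

def call_accepted(passed, defined, arg_chk):
    """Return True if a call passing 'passed' arguments is accepted."""
    if arg_chk:
        # _arg_chk TRUE: the counts must match exactly.
        return passed == defined
    # _arg_chk FALSE: fewer arguments are allowed, as in Awk.
    return passed <= defined


print(call_accepted(1, 3, False))  # True
print(call_accepted(1, 3, True))   # False
print(call_accepted(4, 3, False))  # False
```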
13.3 Variable Length Argument Lists
QTAwk allows user-defined functions to be defined with a
variable number of arguments. The actual number of arguments
will be determined from the call at run-time. QTAwk follows the
C syntax for defining a function with a variable number of
arguments:
# function to determine maximum value
function max(...) {
local max = vargv[1];
local i;
for ( i = 2 ; i <= vargc ; i++ )
if ( max < vargv[i] ) max = vargv[i];
return max;
}
The ellipsis, '...', is used as the last argument in a
user-defined argument list to indicate that a variable number of
arguments follow. In the max function shown, no fixed arguments
are indicated. Within the function, the variable arguments are
accessed via the built-in singly-dimensioned array, 'vargv'. The
built-in variable 'vargc' is set equal to the number of elements
of the array and, hence, the variable number of arguments passed
to the function. Since the variable arguments are passed in a
singly dimensioned array, the 'for' statement may be used to
access each in turn:
# function to determine maximum value
function max(...) {
local max = vargv[1];
local i;
for ( i in vargv )
if ( max < vargv[i] ) max = vargv[i];
return max;
}
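For readers more familiar with other languages, the same idea can
be sketched in Python, where the tuple 'values' plays the part of
'vargv' and its length the part of 'vargc'. This is an analogy
only, not QTAwk code, and the name 'maximum' is hypothetical.

```python
# Illustrative analogy only (Python, not QTAwk) to the 'max' function above.

def maximum(*values):
    """Return the largest of a variable number of arguments."""
    largest = values[0]          # corresponds to vargv[1]
    for v in values[1:]:         # corresponds to vargv[2] .. vargv[vargc]
        if largest < v:
            largest = v
    return largest


print(maximum(3, 9, 4))  # 9
print(maximum(7))        # 7
```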
A user-defined function may have fixed arguments and a variable
number of arguments following:
# function with both fixed and variable number of arguments
function sample(arg1,arg2,...) {
.
.
.
}
If a user-defined function is to have a variable number of
arguments, then the 'local' keyword must be used to define local
variables. The ellipsis denoting the variable arguments must be
last in the function definition argument list.
13.4 Null Argument List
A user-defined function may be defined with no arguments.
Consider the function to accumulate words from input records for
the text formatter:
# function to add current line to parsed text
function add_line() {
for ( i = 1 ; i <= NF ; i++ ) if ( length($i) ) addword($i);
}
In the case of a user-defined function with no arguments to be
passed, the function may be invoked with no parenthesized
parameter list. Consider the invocation of the add_line function
in the text formatter. The action executed for input records
which do not start with a format control word is:
{
if ( format ) add_line;
else if ( table_f ) format_table($0);
else output_line($0,FALSE);
}
In QTAwk, the add_line function may be invoked as "add_line" as
above or as "add_line()", with a null length parameter list.
QTAwk has also relaxed the Awk rule that the left parenthesis of
the parameter list must immediately follow a user-defined
function invocation. QTAwk allows blanks between the name and
the left parenthesis. The blanks are ignored.
13.5 Arrays and User-Defined Functions
Just as arrays are integrated into QTAwk expressions, arrays are
also integrated into the passing of arguments to, and the return
value from, user-defined functions. User-defined functions may
return arrays as well as scalars. This will be illustrated in a
sample utility later.
QTAwk passes scalar arguments to user-defined functions by
value, i.e., if a scalar variable is specified as an argument to
a function, a copy of the variable is passed to the function and
not the variable itself. This is called pass by value. Thus, if
the function alters the value of the argument, the variable is
not altered, only the copy. When the function terminates, the
copy is discarded, and the variable still retains its original
value.
In contrast, QTAwk passes array variables by "reference". This
means that the local variable represented by the function
argument, is the referenced variable and not a copy. Any changes
to the local variable are actually made to the referenced
variable.
In QTAwk, function arguments may also be constant arrays and not
variable arrays, i.e., the argument may be the result of an
arithmetic operation on an array. For example, if A is an array,
then the result of the expression
"A + 10"
is an array and would be passed as a constant array as a
function argument. Such arrays are discarded at function
termination.
QTAwk passes by reference under three conditions:
1. The argument is a global or local variable and an array,
2. The argument is a global or local variable and used as an
array, i.e., indexed or referenced by an 'in' statement, in
the called function. This is true whether the referenced
variable is a scalar or array when the function is called.
If the referenced variable was a scalar when the function was
called, then at function termination, if the statement(s) in
which the argument was indexed WERE EXECUTED, the referenced
variable will be an array with the index values referenced.
This behavior is identical to creating array elements in
global variables by referencing the elements.
3. The argument is a global or local scalar variable and at
function termination the argument is an array. In this case,
the argument may not have been directly referenced as an
array, but may be the result of an operation involving an
array. Alternatively the argument may have been passed to
another function which referenced it as an array or set it to
the result of operations on arrays.
The following QTAwk utility with a user-defined function will
illustrate the use of arrays and scalars as function arguments
and the return of arrays by user-defined functions.
BEGIN {
# create arrays 'a' and 'b'
for ( i = 1 ; i < 6 ; i++ ) a[i] = b[i] = i;
# create scalars 'c' and 'f'
c = f = 10;
# pass scalar variables/values and return scalar value
print "scalar : "set_var(c,c,c)" and c == "c;
# pass two arrays, 'a' & 'b',
# and one scalar constant, 'c+0'
# function will return an array "== a + b + (c+0)"
d = set_var(a,b,c+0);
# print returned array 'd' (== a + b + (c+0))
for ( i in d ) print "d["i"] = "d[i];
#print scalar 'c' to show unchanged
print c;
# pass two arrays, 'a' & 'b',
# and one scalar variable, 'c'
# function will return an array "== a + b + c"
e = set_var(a,b,c);
# print returned array
for ( i in e ) print "e["i"] = "e[i];
# print former scalar, 'c',
# converted to array by operation c = b + 2;
for ( i in c ) print "c["i"] = "c[i];
# pass two arrays, 'a' & 'b', and constant array, 'b+0'
h = set_var(a,b,b+0);
# print returned array
for ( i in h ) print "h["i"] = "h[i];
# print array 'b' to assure that unchanged
for ( i in b ) print "b["i"] = "b[i];
# attempt illegal operation in function: w = f + b
# adding array, 'b', to scalar, 'f'.
# error message will be issued and execution halted
g = set_var(f,b,f);
}
function set_var(x,y,z) {
# create local variable
local w = x + y + z;
# alter third argument
# if first & second arguments arrays,
# this will convert third to an array
# (if not already passed as an array).
z = y + 2;
return w;
}
This QTAwk utility illustrates several ideas in using arrays and
user-defined functions in QTAwk. The line:
print "scalar : "set_var(c,c,c)" and c == "c;
calls the function 'set_var' with three scalar variables, all
'c'. Three copies of 'c' are actually passed. The local
variable, 'w', is computed using scalar quantities and is a
scalar quantity. Since argument 'y' is a scalar quantity, the
result of the expression:
z = y + 2;
is a scalar and the third argument, 'c', is unchanged. A
functional value of 30 (== c + c + c) is returned.
The line:
d = set_var(a,b,c+0);
passes arrays as the first and second arguments. The third
argument is a constant scalar value, and thus cannot be changed
by the function called. The return value of the function:
w = x + y + z; (== a + b + 10;)
is an array. The line:
for ( i in d ) print "d["i"] = "d[i];
prints the values of the array:
d[1] = 12
d[2] = 14
d[3] = 16
d[4] = 18
d[5] = 20
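These values follow from adding the arrays element by element and
then adding the scalar to each element. The following Python
sketch is a model of that arithmetic, not QTAwk code. The helper
'add' is hypothetical, dictionaries stand in for QTAwk arrays, and
the rejection of "scalar + array" mirrors the error described for
the last call in the utility.

```python
# Illustrative model only (Python, not QTAwk) of w = x + y + z.

def add(x, y):
    """Add two operands; arrays (dicts) combine element by element."""
    if isinstance(x, dict) and isinstance(y, dict):
        return {k: x[k] + y[k] for k in x}
    if isinstance(x, dict):
        return {k: v + y for k, v in x.items()}    # array + scalar
    if isinstance(y, dict):
        raise TypeError("cannot add an array to a scalar")
    return x + y                                   # scalar + scalar


a = {i: i for i in range(1, 6)}
b = {i: i for i in range(1, 6)}
c = 10

d = add(add(a, b), c)            # evaluated left to right, as a + b + c
for i in d:
    print("d[%d] = %d" % (i, d[i]))
```

With all three operands scalar, 'add(add(c, c), c)' gives 30, the
value returned by the first call in the utility.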
The line:
e = set_var(a,b,c);
passes arrays as the first and second arguments. The third
argument is a variable scalar value, and thus can be changed by
the function called if the third argument at function termination
is an array. The return value of the function:
w = x + y + z; (== a + b + c;)
is an array as above. Note that at function termination, the
third argument is now an array since it was set to the result of
an operation on an array:
z = y + 2;
which is now equivalent to:
z = b + 2;
Thus, at function termination the scalar variable 'c' has been
converted to an array. The line:
for ( i in c ) print "c["i"] = "c[i];
will print the values of the array elements:
c[1] = 3 ( == b[1] + 2)
c[2] = 4 ( == b[2] + 2)
c[3] = 5 ( == b[3] + 2)
c[4] = 6 ( == b[4] + 2)
c[5] = 7 ( == b[5] + 2)
The line:
h = set_var(a,b,b+0);
passes arrays as the first and second arguments. The third
argument is a constant array value, and thus cannot be changed by
the function called. The return value of the function is an
array as above. Note that at function termination, the third
argument is again an array as above. However, the third argument
has been passed as a constant array and thus no variable is
changed as 'c' was above. The third argument is discarded at
function termination. The line:
for ( i in b ) print "b["i"] = "b[i];
prints the array 'b' to assure that it was not changed.
The line:
g = set_var(f,b,f);
will result in an illegal operation in the function:
local w = x + y + z; (== f + b + f;)
this operation is now attempting to add an array to a scalar:
f + b
This operation will result in an error message and halt
execution.
The above sample QTAwk utility illustrates the power of
user-defined functions in automatically handling scalars and
arrays as both arguments and return values and adjusting
accordingly. The same function may be used interchangeably for
both arrays and scalars with natural and predictable results.
14.0 Format Specification
QTAwk follows the Draft ANSI C language standard for the format
string in the 'printf' and 'fprintf' functions except for the 'P'
and 'n' types, which are not supported and will give
unpredictable results.
A format specification has the form:
%[flags][width][.precision][h | l | L]type
which is matched by the following regular expression:
/%{flags}?{width}?{precision}?[hlL]?{type}/
with:
flags = /[-+\s#0]/;
width = /([0-9]+|\*)/;
precision = /(\.([0-9]+|\*))/;
type = /[diouxXfeEgGcs]/;
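As an illustration only, the regular expression above can be transcribed into Python and used to split a specification into its fields. This is a sketch rather than QTAwk code: the names FORMAT_SPEC and split_spec are hypothetical, and the '\s' escape is written as a literal space on the assumption that it denotes a blank.

```python
import re

# Python transcription of the manual's named sub-expressions.
# Assumption: QTAwk's \s (blank) is written here as a literal space.
FLAGS = r"[-+ #0]"
WIDTH = r"(?:[0-9]+|\*)"
PRECISION = r"(?:\.(?:[0-9]+|\*))"
TYPE = r"[diouxXfeEgGcs]"

FORMAT_SPEC = re.compile(
    rf"%(?P<flags>{FLAGS})?(?P<width>{WIDTH})?"
    rf"(?P<precision>{PRECISION})?(?P<size>[hlL])?(?P<type>{TYPE})"
)

def split_spec(spec):
    """Return the fields of one format specification, or None if no match."""
    m = FORMAT_SPEC.fullmatch(spec)
    return m.groupdict() if m else None

print(split_spec("%-8.3f"))
print(split_spec("%*ld"))
print(split_spec("%q"))      # 'q' is not a type character
```

As written in the manual, the expression admits at most one flag character per specification; the sketch keeps that behavior.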
Each field of the format specification is a single character or
a number signifying a particular format option. Optional fields
are shown above enclosed in brackets, '[..]'. The type character,
which appears after the last optional format field, determines
whether the associated argument is interpreted as a character, a
string, or a number.
The simplest format specification contains only the percent sign
and a type character (for example, %s). The optional fields
control other aspects of the formatting, as follows:
1. flags ==> Controls justification of output and printing of
signs, blanks, decimal points, octal and hexadecimal
prefixes.
2. width ==> Controls minimum number of characters output.
3. precision ==> Controls maximum number of characters printed
for all or part of the output field, or minimum number of
digits printed for integer values.
4. h, l, L ==> Prefixes that determine size of argument
expected (this field is retained only for compatibility with C
format strings).
a) h ==> Used as a prefix with the integer types d, i, o,
x, and X to specify that the argument is short int, or
with u to specify a short unsigned int
b) l ==> Used as a prefix with d, i, o, x, and X types to
specify that the argument is long int, or with u to
specify a long unsigned int; also used as a prefix with
e, E, f, g, and G types to specify a double, rather than
a float
c) L ==> Used as a prefix with e, E, f, g, and G types to
specify a long double
If a percent sign, '%', is followed by a character that has no
meaning as a format field, the character is simply copied to the
output. For example, to print a percent-sign character, use
"%%".
14.1 Output Types
Type characters:
1. d ==> integer, Signed decimal integer
2. i ==> integer, Signed decimal integer
3. u ==> integer, Unsigned decimal integer
4. o ==> integer, Unsigned octal integer
5. x ==> integer, Unsigned hexadecimal integer, using "abcdef"
6. X ==> integer, Unsigned hexadecimal integer, using "ABCDEF"
7. f ==> float, Signed value having the form
[-]dddd.dddd
where dddd is one or more decimal digits. The number of
digits before the decimal point depends on the magnitude of
the number, and the number of digits after the decimal point
depends on the requested precision.
8. e ==> float, Signed value having the form
[-]d.dddde[sign]ddd,
where d is a single decimal digit, dddd is one or more
decimal digits, ddd is exactly three decimal digits, and sign
is + or -.
9. E ==> float, Identical to the e format, except that E
introduces the exponent instead of e.
10. g ==> float, Signed value printed in f or e format,
whichever is more compact for the given value and precision.
The e format is used only when the exponent of the value is
less than -4 or greater than or equal to the precision
argument. Trailing zeros are truncated and the decimal point
appears only if one or more digits follow it.
11. G ==> float, Identical to the g format, except that E
introduces the exponent (where appropriate) instead of e.
12. c ==> character, Single character
13. s ==> string, Characters printed up to the first null
character ('\0') or until the precision value is reached.
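The type characters can be tried out with Python's '%' operator, which accepts the same C-style type letters. This is only a sketch under the assumption that QTAwk agrees with C here; the list name type_samples is hypothetical. The e and E types are left out because Python prints a two-digit exponent where the manual describes three digits.

```python
# Each entry: (format, argument, text produced by Python's % operator).
type_samples = [
    ("%d", -42, "-42"),          # signed decimal
    ("%i", 7, "7"),              # signed decimal
    ("%u", 42, "42"),            # unsigned decimal
    ("%o", 8, "10"),             # octal
    ("%x", 255, "ff"),           # hexadecimal, lower case digits
    ("%X", 255, "FF"),           # hexadecimal, upper case digits
    ("%f", 3.5, "3.500000"),     # fixed point, default precision 6
    ("%g", 0.00001234, "1.234e-05"),  # exponent below -4: e form
    ("%g", 1234.5, "1234.5"),         # otherwise f form, zeros trimmed
    ("%c", "A", "A"),            # single character
    ("%s", "awk", "awk"),        # string
]

type_results = [fmt % value for fmt, value, _ in type_samples]
for (fmt, value, _), text in zip(type_samples, type_results):
    print(fmt, repr(value), "->", text)
```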
14.2 Output Flags
Flag Characters
1. '-' ==> Left justify the result within the given field
width. Default: Right justify.
2. '+' ==> Prefix the output value with a sign (+ or -) if the
output value is of a signed type. Default: Sign appears only
for negative signed values (-).
3. blank (' ') ==> Prefix the output value with a blank if the
output value is signed and positive. The blank is ignored if
both the blank and + flags appear. Default: No blank.
4. '#' ==> When used with the o, x, or X format, the # flag
prefixes any nonzero output value with 0, 0x, or 0X,
respectively. Default: No prefix.
5. '#' ==> When used with the e, E, or f format, the # flag
forces the output value to contain a decimal point in all
cases. Default: Decimal point appears only if digits follow
it.
6. '#' ==> When used with the g or G format, the # flag forces
the output value to contain a decimal point in all cases and
prevents the truncation of trailing zeros. Default: Decimal
point appears only if digits follow it. Trailing zeros are
truncated.
7. '#' ==> Ignored when used with c, d, i, u or s
8. '0' ==> For d, i, o, u, x, X, e, E, f, g, and G conversions,
leading zeros (following any indication of sign or base) are
used to pad to the field width; no space padding is
performed. If the 0 and - flags both appear, the 0 flag will
be ignored. For d, i, o, u, x, and X conversions, if a
precision is specified, the 0 flag will be ignored. For
other conversions the behavior is undefined. Default: Use
blank padding
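The effect of each flag is easiest to see side by side. The following Python sketch uses the '%' operator as a stand-in for 'printf', assuming QTAwk follows C for these cases; the helper boxed and the list flag_examples are hypothetical names. The '#' flag with the o type is omitted because Python prints a '0o' prefix there, unlike C.

```python
def boxed(fmt, value):
    """Format value and bracket the result so that padding is visible."""
    return "[" + fmt % value + "]"

flag_examples = [
    ("%6d", 42, "[    42]"),      # default: right justified
    ("%-6d", 42, "[42    ]"),     # '-': left justified
    ("%+d", 42, "[+42]"),         # '+': sign always shown
    ("% d", 42, "[ 42]"),         # blank: space before positive value
    ("%#x", 255, "[0xff]"),       # '#': 0x prefix
    ("%#X", 255, "[0XFF]"),       # '#': 0X prefix
    ("%.0f", 3.0, "[3]"),         # no decimal point by default
    ("%#.0f", 3.0, "[3.]"),       # '#': decimal point forced
    ("%g", 1.5, "[1.5]"),         # trailing zeros truncated
    ("%#g", 1.5, "[1.50000]"),    # '#': trailing zeros kept
    ("%06d", -42, "[-00042]"),    # '0': zero padding after the sign
    ("%-06d", 42, "[42    ]"),    # '-' overrides '0'
]

for fmt, value, expected in flag_examples:
    print(fmt, "->", boxed(fmt, value))
```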
If the argument corresponding to a floating-point specifier is
infinite or indefinite, the following output is produced:
+ infinity ==> 1.#INFrandom-digits
- infinity ==> -1.#INFrandom-digits
Indefinite ==> digit.#INDrandom-digits
14.3 Output Width
The width argument is a non-negative decimal integer controlling
the minimum number of characters printed. If the number of
characters in the output value is less than the specified width,
blanks are added to the left or the right of the values
(depending on whether the - flag is specified) until the minimum
width is reached. If width is prefixed with a 0 flag, zeros are
added until the minimum width is reached (not useful for
left-justified numbers).
The width specification never causes a value to be truncated; if
the number of characters in the output value is greater than the
specified width, or width is not given, all characters of the
value are printed (subject to the precision specification).
The width specification may be an asterisk (*), in which case an
integer argument from the argument list supplies the value. The
width argument must precede the value being formatted in the
argument list. A nonexistent or small field width does not cause
a truncation of a field; if the result of a conversion is wider
than the field width, the field expands to contain the conversion
result.
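A short Python sketch of the width rules, again using the '%' operator as a stand-in for 'printf' and assuming QTAwk matches C for these cases. The list name width_examples is hypothetical. Note that with '*' the width argument comes before the value in the argument list.

```python
width_examples = [
    ("%5d", 42, "   42"),          # padded on the left with blanks
    ("%-5d", 42, "42   "),         # '-' flag: padded on the right
    ("%05d", 42, "00042"),         # '0' flag: padded with zeros
    ("%2d", 12345, "12345"),       # width never truncates
    ("%*d", (5, 42), "   42"),     # width taken from the argument list
    ("%-*s", (6, "ab"), "ab    "), # '*' combined with a flag
]

width_results = [fmt % args for fmt, args, _ in width_examples]
for (fmt, args, _), text in zip(width_examples, width_results):
    print(fmt, "->", "[" + text + "]")
```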
14.4 Output Precision
The precision specification is a non-negative decimal integer
preceded by a period, '.', which specifies the number of
characters to be printed, the number of decimal places, or the
number of significant digits. Unlike the width specification,
the precision can cause truncation of the output value, or
rounding in the case of a floating-point value.
The precision specification may be an asterisk, '*', in which
case an integer argument from the argument list supplies the
value. The precision argument must precede the value being
formatted in the argument list.
The interpretation of the precision value, and the default when
precision is omitted, depend on the type, as shown below:
1. d,i,u,o,x,X ==> The precision specifies the minimum number
of digits to be printed. If the number of digits in the
argument is less than precision, the output value is padded
on the left with zeros. The value is not truncated when the
number of digits exceeds precision. Default: If precision is
0 or omitted entirely, or if the period (.) appears without a
number following it, the precision is set to 1.
2. e, E ==> The precision specifies the number of digits to be
printed after the decimal point. The last printed digit is
rounded. Default: Default precision is 6; if precision is 0
or the period (.) appears without a number following it, no
decimal point is printed.
3. f ==> The precision value specifies the number of digits
after the decimal point. If a decimal point appears, at
least one digit appears before it. The value is rounded to
the appropriate number of digits. Default: Default precision
is 6; if precision is 0, or if the period (.) appears without
a number following it, no decimal point appears.
4. g, G ==> The precision specifies the maximum number of
significant digits printed. Default: Six significant digits
are printed, with any trailing zeros truncated.
5. c ==> No effect. Default: Character printed
6. s ==> The precision specifies the maximum number of
characters to be printed. Characters in excess of precision
are not printed. Default: All characters of the string are
printed.
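The precision rules per type can be checked with a few one-line cases. This Python sketch uses the '%' operator in place of 'printf'; it shows Python's C-style behavior, which QTAwk is assumed (not confirmed) to share. The list name precision_examples is hypothetical. In Python the g type switches to the e form once the exponent reaches the precision.

```python
precision_examples = [
    ("%.3d", 7, "007"),             # minimum digits, zero padded
    ("%.3d", 12345, "12345"),       # integers are never truncated
    ("%.2f", 3.14159, "3.14"),      # digits after the decimal point
    ("%.0f", 7.0, "7"),             # precision 0: no decimal point
    ("%.2e", 1234.5, "1.23e+03"),   # digits after the point, e form
    ("%.3g", 123.0, "123"),         # three significant digits, f form
    ("%.3g", 1230.0, "1.23e+03"),   # exponent 3 with precision 3: e form
    ("%.3g", 0.0001, "0.0001"),     # exponent -4: still f form
    ("%.3s", "abcdef", "abc"),      # strings are cut at the precision
    ("%.*f", (2, 3.14159), "3.14"), # precision from the argument list
]

precision_results = [fmt % args for fmt, args, _ in precision_examples]
for (fmt, args, _), text in zip(precision_examples, precision_results):
    print(fmt, "->", text)
```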
15.0 Trace Statements
QTAwk has added a facility for debugging utilities. This
facility is activated through the built-in variable 'TRACE'.
QTAwk can trace the flow control statements, 'if', 'while', 'do',
'for' (both forms), and 'switch'. In addition, built-in
functions and user-defined functions are traced.
By default, TRACE is set to FALSE and no tracing is done. The
variable may be set to any value, numeric, string or regular
expression and the value will determine the statements traced.
If TRACE has a nonzero numeric value then QTAwk will trace all
statements of the type listed.
15.1 Selective Statement Tracing
If TRACE has a string value, then the string is compared against
the keywords:
1. if
2. while
3. do
4. for
5. switch
6. function_b (built-in functions)
7. function_u (user-defined functions)
If an exact match (case is important) is found, then the
statement is traced. If TRACE is set to a regular expression,
then the keywords are matched against the regular expression. If
a match is found, then the statement is traced.
15.2 Trace Output
In tracing a statement, QTAwk issues a message to the standard
output file. The message issued will have the form:
Stmt Trace: stmt_str value_str
Action File line: xxxx
Scanning File: FILENAME
Line: xxxxx
Record: xxxxxx
where stmt_str is the appropriate keyword listed above for the
statement traced and value_str is a value dependent upon the
statement traced as listed below:
keyword        value string
if         ==> 0/1 (conditional expression FALSE/TRUE)
while      ==> 0/1 (conditional expression FALSE/TRUE)
do         ==> 0/1 (conditional expression FALSE/TRUE)
for        ==> 0/1 (conditional expression FALSE/TRUE)
for        ==> subscript value ('for ( var in array )' form)
switch     ==> switch expression value
function_b ==> function name
function_u ==> function name
When a statement that can be traced is encountered, the value
of the statement is determined first, e.g., for an 'if' statement,
the value of the conditional is evaluated before issuing the trace
message.
The following TRACE values will trace the statements indicated:
1. TRACE = "if";
This value will trace all 'if' statements, indicating the
TRUE/FALSE value of the conditional.
2. TRACE = /^[iwd]/;
This value will trace all 'if', 'while' and 'do' statements,
indicating the TRUE/FALSE value of the conditional.
3. TRACE = /_u$/;
This value will trace all user-defined functions, indicating
the function name in the trace message.
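The selection rule for TRACE values can be sketched in Python as follows. This only models the description in this section and is not QTAwk's implementation; the names TRACE_KEYWORDS and traced_keywords are hypothetical, and a compiled Python pattern stands in for a QTAwk regular expression.

```python
import re

TRACE_KEYWORDS = ["if", "while", "do", "for", "switch",
                  "function_b", "function_u"]

def traced_keywords(trace):
    """Return the keywords selected by a TRACE value."""
    if isinstance(trace, re.Pattern):
        # Regular expression: a keyword is traced when the pattern matches.
        return [k for k in TRACE_KEYWORDS if trace.search(k)]
    if isinstance(trace, str):
        # String: exact comparison, case is important.
        return [k for k in TRACE_KEYWORDS if k == trace]
    # Numeric: any nonzero value traces every listed statement type.
    return list(TRACE_KEYWORDS) if trace else []

print(traced_keywords("if"))
print(traced_keywords(re.compile(r"^[iwd]")))
print(traced_keywords(re.compile(r"_u$")))
```

The three calls correspond to the three TRACE settings listed above.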
16.0 Invoking QTAwk
There are two ways of specifying utilities to QTAwk:
1. Specifying the utility on the command line, e.g.,
QTAwk "/^{_w}*$/{if(!bc++)print;next;}{bc=FALSE;print;}" file1
This short command line utility will read file1, printing
only the first blank line in a series of blank lines. All
nonblank lines are printed.
Note that the "utility" has been enclosed in double quotes,
". This is necessary to keep the operating system from
interpreting the utility as a file. In addition, if the
utility contains symbols recognized by the operating system
command line interpreter, e.g., the re-direction operators,
'<' or '>', the quotes keep the interpreter from recognizing
the symbols. If the utility contains quotes, e.g., a
constant string definition, then the embedded quotes should
be preceded by a back-slash, '\'.
For example, the short utility:
QTAwk "{print FNR\" : \"$0;}" file1
prints each line of file1 preceded by the line number. The
constant string,
" : "
separates the line number from the line. Back-slashes must
precede the quotes surrounding the constant string.
2. -fufile
or
-f ufile
When a utility may be used frequently or grows too long to
include on the command line as above, it becomes necessary to
store it in a file. The utility may then be specified to
QTAwk with this option. The blank between the 'f' command
line option and the utility file name is optional.
16.1 Multiple QTAwk Utilities
More than one utility file may be specified to QTAwk in this
manner. Each utility file specified is read in the order
specified and combined into a single QTAwk utility. In this
manner it is possible to keep constantly used pattern-actions or
user-defined functions in separate files and combine them into
QTAwk utilities as necessary. The order of the utility files is
not important except for the order in which predefined patterns
are executed and the order in which pattern-action pairs are
executed. Thus if a utility file contains only common
user-defined functions, it may be specified in any order in
relation to other utility files. See also the "include"
directive discussed below.
Scanning of the command line for arguments may be stopped with
the double hyphen command line argument, "--". This argument is
not passed to the QTAwk utility.
This method of specifying utilities to QTAwk cannot be combined
with the command line utility definition method.
The command line is scanned for all utility files specified with
the 'f' option prior to reading the utility files or any input
files. The utility files are then "removed" from the command
line and the command line argument count.
16.1.1 #include Directive
A second method for specifying multiple utility files is
available that follows the "C" practice of "including" a file at
a particular point in a file. QTAwk follows the "C" syntax:
#include "filename.ext"
or
#include <filename.ext>
The two forms differ only in where QTAwk searches for the file
to include. The first form, with double quotes, will search for
the specified file only in the current directory if no path is
specified. If a path is specified, then the specified path only
is searched.
The second form, with the angle brackets, will search first in
the current directory and then in the QTAwk default directories
specified with the QTAWK environment setting. If a path is
specified with this form, the specified path only is searched.
White space is allowed before and after the '#' symbol. This
directive is not recognized within the action portion of any
pattern.
By using the "include" directive, a master file may include
pattern/actions and user-defined functions as necessary. It is
then necessary to specify only one file on the command line.
This makes it much easier to build utilities from several files
and remember only one file name to accomplish a desired purpose.
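The difference between the two "include" forms comes down to which directories are searched. A minimal Python sketch of that rule follows; the function name include_search_dirs and its parameters are hypothetical, and the QTAWK setting is assumed to be a semicolon-separated list of directories.

```python
import ntpath  # DOS/OS2 style paths, usable on any platform

def include_search_dirs(name, angle, qtawk_path):
    """Directories searched for an included file, in order.

    name       -- file name as written in the directive
    angle      -- True for <name>, False for "name"
    qtawk_path -- value of the QTAWK environment setting
    """
    directory = ntpath.dirname(name)
    if directory:
        return [directory]          # explicit path: searched exclusively
    if not angle:
        return ["."]                # "name": current directory only
    extra = [d for d in qtawk_path.split(";") if d]
    return ["."] + extra            # <name>: current, then QTAWK dirs

print(include_search_dirs("common.exp", False, "c:\\qtawk;d:\\lib"))
print(include_search_dirs("common.exp", True, "c:\\qtawk;d:\\lib"))
print(include_search_dirs("c:\\lib\\common.exp", True, "c:\\qtawk"))
```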
16.2 Command Line Options
QTAwk supports the following command line options:
1. f - utility file specification as described above.
2. F - set field separator to single character. Described
below.
3. v - set variable value before utility execution. Described
below.
4. Wd - delay input parsing. Described under the built-in
variable DELAY_INPUT_PARSE.
5. Ww - write internal form of utility to specified file. With
this option, QTAwk writes the internal form of the utility to
the file specified. The filename is specified on the command
line in a manner similar to that used for the 'f' option:
-Wwfilename
-Ww filename
After writing the utility to the specified file, QTAwk stops
execution. The file to which the internal form of the
utility has been written may be specified as an input utility
file with the 'f' option. However, only one such file may be
input, and it may not be combined with ordinary ASCII utility
files.
16.3 File Search Sequence
QTAwk may search for input files in more than one place. Many
times it is convenient to gather all QTAwk utility files into a
single directory on a convenient drive, e.g., the directory
"\{ap}\{utl}" may be used. It would then be convenient to invoke
any utility without the necessity of always specifying the full
path to the utility. QTAwk recognizes the environment variable
setting of "QTAWK" for this purpose. At program invocation QTAwk
searches the environment for the setting for "QTAWK" and sets the
string value of the built-in variable "QTAwk_Path" to this
string.
For input files, if a drive and/or path is specified with the
filename, then that drive and path only are searched for the
desired file. If no drive or path is specified, then the current
directory is searched first. If the file is not found there, the
paths (optionally with a drive specifier) specified by the string
in the built-in variable QTAwk_Path are searched for the file.
Multiple paths may be specified in QTAwk_Path by separating them
with semi-colons.
The string value of the built-in variable QTAwk_Path may be
altered by the executing utility at any time to change the paths
searched for input files.
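The search sequence for input files can be sketched in Python as below. This is an illustration of the rule as described, not QTAwk's code: find_input_file is a hypothetical name, and the exists argument is a stand-in for a real file system test so the example can run anywhere.

```python
import ntpath  # DOS/OS2 style paths, usable on any platform

def find_input_file(name, qtawk_path, exists):
    """Return the first location where name is found, or None."""
    drive, rest = ntpath.splitdrive(name)
    if drive or ntpath.dirname(rest):
        # Drive and/or path given: only that location is tried.
        return name if exists(name) else None
    if exists(name):
        return name                     # current directory first
    for directory in qtawk_path.split(";"):
        candidate = ntpath.join(directory, name)
        if directory and exists(candidate):
            return candidate            # first match in QTAwk_Path wins
    return None

# Pretend file system holding a single file.
present = {"c:\\qtawk\\data.txt"}
print(find_input_file("data.txt", "d:\\work;c:\\qtawk", present.__contains__))
print(find_input_file("b:\\data.txt", "c:\\qtawk", present.__contains__))
```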
16.4 Setting the Field Separator
The QTAwk input field separator, FS, may be set on the command
line with the 'F' option.
QTAwk -F "/:/"
or
QTAwk -F/:/
The blank between the 'F' and the string or regular expression
defining the new input field separator is optional. This option
may only be specified once on the command line. The command line
is scanned for all 'F' options prior to reading any utility files
or input files. The option and the new value for FS are then
"removed" from the command line and the command line count.
Another method is available for setting FS prior to reading the
input files. This method is more general and may be used
multiple times on the command line and may be used to set any
utility variable and not just FS.
16.5 Command Line Variables
Including the following on the command line:
-v"var = value"
or
-vvar=value
or
"var = value"
or
var=value
will set the variable 'var' to the value specified. 'var' may
be any built-in or user-defined variable in the QTAwk utility.
'var' must be a variable defined in the current QTAwk utility or
a run-time error will occur and QTAwk will stop processing. If
preceded with the option string '-v' as indicated in the first
two examples, the value of the variable is set before execution
of the utility starts. If the 'v' option is not specified, the
value of 'var' is set when the token is encountered in processing
command line file tokens.
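The four forms above differ only in when the assignment takes effect. The Python sketch below classifies a command line token accordingly. It is an assumption-laden illustration: classify and ASSIGNMENT are hypothetical names, the identifier pattern is a guess at what QTAwk accepts as a variable name, and the quotes shown in the forms above are taken to have been removed by the command interpreter already.

```python
import re

# Assumed shape of a variable setting: name, optional blanks, '=', value.
ASSIGNMENT = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_]*)\s*=\s*(.*)$")

def classify(arg):
    """Classify one command line token.

    Returns (when, name, value) for a variable setting, where when is
    'before execution' for the -v forms and 'when encountered' for the
    bare forms; any other token is reported as a file name.
    """
    early = arg.startswith("-v")
    m = ASSIGNMENT.match(arg[2:] if early else arg)
    if m:
        when = "before execution" if early else "when encountered"
        return (when, m.group(1), m.group(2))
    return ("file", arg, None)

for token in ["-vvar = value", "-vvar=value", "var = value",
              "var=value", "file1"]:
    print(repr(token), "->", classify(token))
```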
16.6 QTAwk Execution Sequence
QTAwk execution follows the sequence:
1. The command line is scanned for any options, 'f', 'F', 'W' or
'v'. If any such options are found, they are interpreted and
removed from the command line. Upon encountering a command
line argument of the form '--' (double hyphen), scanning of
the command line is halted. The '--' argument is removed
from the command line.
2. The QTAwk utility is read and converted to internal form.
If any 'f' options were found in the preceding step, the
associated utility files are opened, read and converted in
the order specified. If no 'f' options were specified, the
first command line argument is processed as the QTAwk utility
and then removed from the command line arguments. If a 'Ww'
option was specified, the associated utility file is opened
and the internal form utility written.
3. The ARGC and ARGV built-in variables are set according to
the command line parameters. The ARGI built-in variable is
set to 1.
4. Any variable values set on the command line with the 'v'
option are set as specified.
5. Any "BEGIN" actions in the QTAwk utility are executed. This
is done prior to any further interpretation of the command
line arguments.
6. The command line argument, ARGV[ARGI], is examined. One of
two actions is taken depending on the form of the argument:
a) An argument of the form:
var = value
or
var=value
is interpreted as setting the variable specified, to the
value specified.
b) Any other argument is interpreted as a file name. The
file specified is opened for input. If the file does not
exist, an error message is issued and execution halted.
If a single hyphen, '-', is specified, it is interpreted
as representing the standard input file. If no command
line arguments are specified beyond the QTAwk utility or
variable setting commands, the standard input file is
read for input.
7. Any "INITIAL" actions in the QTAwk utility are executed.
8. The input file is read record by record and matched against
the patterns present in the QTAwk utility. If no
pattern/action pairs are given in the QTAwk utility, each
record is read, the NF, FNR, NR and field variables are set
and the record is then discarded. If an 'exit' or 'endfile'
statement is executed, action passes to the next step below.
9. When the end of the input file is reached or an "exit" or
"endfile" statement is executed, any 'FINAL' actions are
executed.
10. The input file is closed.
11. If an "exit" statement was executed, processing passes to
step 12. below, else the following steps are executed:
a) The element of ARGV corresponding to the current index
value of ARGI is sought. If none is found, processing
proceeds as if the "exit" statement was executed.
b) ARGI is set to the index value of the next element of
ARGV. If there is no next element of ARGV, processing
proceeds as if the "exit" statement was executed.
c) processing continues with step 6) above.
12. Any "END" actions in the QTAwk utility are executed.
13. QTAwk execution halts.
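The order of events in steps 5 through 12 can be modeled with a small Python simulation. It is a simplified sketch, not QTAwk's implementation: the function run, the files dictionary and the exit_in parameter are hypothetical stand-ins for a real utility and real input files, and the standard input and missing-file cases are left out.

```python
def run(args, files, exit_in=None):
    """Return the order of events for the command line arguments args.

    files   -- maps a file name to its list of records
    exit_in -- name of a file whose first record executes 'exit'
    """
    events, variables = ["BEGIN"], {}      # step 5
    exited = False
    for arg in args:                       # step 6: ARGV[ARGI] in turn
        if "=" in arg:                     # step 6a: variable setting
            name, value = (s.strip() for s in arg.split("=", 1))
            variables[name] = value
            events.append(f"set {name}")
            continue
        events.append(f"INITIAL {arg}")    # step 7
        for record in files[arg]:          # step 8
            if arg == exit_in:
                exited = True              # 'exit' executed
                break
            events.append(f"record {record}")
        events.append(f"FINAL {arg}")      # step 9; file closed in step 10
        if exited:
            break                          # step 11: remaining ARGV skipped
    events.append("END")                   # step 12
    return events

data = {"a.txt": ["r1", "r2"], "b.txt": ["r3"]}
print(run(["a.txt", "n=2", "b.txt"], data))
print(run(["a.txt", "n=2", "b.txt"], data, exit_in="a.txt"))
```

The second call shows that 'exit' still runs the FINAL and END actions but skips the remaining command line arguments.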
17.0 QTAwk Limits
QTAwk has the following limitations:
1024 fields
4096 characters per input record
4096 characters per formatted output record
256 characters in character class (with character ranges
expanded)
256 user-defined functions
256 local variables
256 global variables
1024 characters in constant strings
1024 characters in regular expressions on input
4096 characters in regular expressions after expansion of named
expressions and repetition operators.
4096 characters in strings used as regular expressions after
expansion of named expressions and repetition operators.
4096 characters in strings returned by 'replace' functions
4096 characters in input strings read by 'getline' and 'fgetline'
functions
4096 characters in strings after substitution for 'gsub' and
'sub' functions
4096 characters maximum in strings returned by following
functions:
1. copies
2. deletec
3. insert
4. overlay
5. remove
18.0 Appendix i
ASCII character set
( escape sequences shown for non-printable characters )
dec hex char     dec hex char   dec hex char   dec hex char
  0  00 NUL    │  32  20 \s   │  64  40 @    │  96  60 `
  1  01 SOH    │  33  21 !    │  65  41 A    │  97  61 a
  2  02 STX    │  34  22 "    │  66  42 B    │  98  62 b
  3  03 ETX    │  35  23 #    │  67  43 C    │  99  63 c
  4  04 EOT    │  36  24 $    │  68  44 D    │ 100  64 d
  5  05 ENQ    │  37  25 %    │  69  45 E    │ 101  65 e
  6  06 ACK    │  38  26 &    │  70  46 F    │ 102  66 f
  7  07 \a BEL │  39  27 '    │  71  47 G    │ 103  67 g
  8  08 \b BS  │  40  28 (    │  72  48 H    │ 104  68 h
  9  09 \t HT  │  41  29 )    │  73  49 I    │ 105  69 i
 10  0A \n LF  │  42  2A *    │  74  4A J    │ 106  6A j
 11  0B \v VT  │  43  2B +    │  75  4B K    │ 107  6B k
 12  0C \f FF  │  44  2C ,    │  76  4C L    │ 108  6C l
 13  0D \r CR  │  45  2D -    │  77  4D M    │ 109  6D m
 14  0E SO     │  46  2E .    │  78  4E N    │ 110  6E n
 15  0F SI     │  47  2F /    │  79  4F O    │ 111  6F o
 16  10 DLE    │  48  30 0    │  80  50 P    │ 112  70 p
 17  11 DC1    │  49  31 1    │  81  51 Q    │ 113  71 q
 18  12 DC2    │  50  32 2    │  82  52 R    │ 114  72 r
 19  13 DC3    │  51  33 3    │  83  53 S    │ 115  73 s
 20  14 DC4    │  52  34 4    │  84  54 T    │ 116  74 t
 21  15 NAK    │  53  35 5    │  85  55 U    │ 117  75 u
 22  16 SYN    │  54  36 6    │  86  56 V    │ 118  76 v
 23  17 ETB    │  55  37 7    │  87  57 W    │ 119  77 w
 24  18 CAN    │  56  38 8    │  88  58 X    │ 120  78 x
 25  19 EM     │  57  39 9    │  89  59 Y    │ 121  79 y
 26  1A SUB    │  58  3A :    │  90  5A Z    │ 122  7A z
 27  1B ESC    │  59  3B ;    │  91  5B [    │ 123  7B {
 28  1C FS     │  60  3C <    │  92  5C \    │ 124  7C |
 29  1D GS     │  61  3D =    │  93  5D ]    │ 125  7D }
 30  1E RS     │  62  3E >    │  94  5E ^    │ 126  7E ~
 31  1F US     │  63  3F ?    │  95  5F _    │ 127  7F DEL
ASCII character sets. (continued)
dec hex char   dec hex char   dec hex char   dec hex char
128  80 Ç    │ 160  A0 á    │ 192  C0 └    │ 224  E0 α
129  81 ü    │ 161  A1 í    │ 193  C1 ┴    │ 225  E1 ß
130  82 é    │ 162  A2 ó    │ 194  C2 ┬    │ 226  E2 Γ
131  83 â    │ 163  A3 ú    │ 195  C3 ├    │ 227  E3 π
132  84 ä    │ 164  A4 ñ    │ 196  C4 ─    │ 228  E4 Σ
133  85 à    │ 165  A5 Ñ    │ 197  C5 ┼    │ 229  E5 σ
134  86 å    │ 166  A6 ª    │ 198  C6 ╞    │ 230  E6 µ
135  87 ç    │ 167  A7 º    │ 199  C7 ╟    │ 231  E7 τ
136  88 ê    │ 168  A8 ¿    │ 200  C8 ╚    │ 232  E8 Φ
137  89 ë    │ 169  A9 ⌐    │ 201  C9 ╔    │ 233  E9 Θ
138  8A è    │ 170  AA ¬    │ 202  CA ╩    │ 234  EA Ω
139  8B ï    │ 171  AB ½    │ 203  CB ╦    │ 235  EB δ
140  8C î    │ 172  AC ¼    │ 204  CC ╠    │ 236  EC ∞
141  8D ì    │ 173  AD ¡    │ 205  CD ═    │ 237  ED φ
142  8E Ä    │ 174  AE «    │ 206  CE ╬    │ 238  EE ε
143  8F Å    │ 175  AF »    │ 207  CF ╧    │ 239  EF ∩
144  90 É    │ 176  B0 ░    │ 208  D0 ╨    │ 240  F0 ≡
145  91 æ    │ 177  B1 ▒    │ 209  D1 ╤    │ 241  F1 ±
146  92 Æ    │ 178  B2 ▓    │ 210  D2 ╥    │ 242  F2 ≥
147  93 ô    │ 179  B3 │    │ 211  D3 ╙    │ 243  F3 ≤
148  94 ö    │ 180  B4 ┤    │ 212  D4 ╘    │ 244  F4 ⌠
149  95 ò    │ 181  B5 ╡    │ 213  D5 ╒    │ 245  F5 ⌡
150  96 û    │ 182  B6 ╢    │ 214  D6 ╓    │ 246  F6 ÷
151  97 ù    │ 183  B7 ╖    │ 215  D7 ╫    │ 247  F7 ≈
152  98 ÿ    │ 184  B8 ╕    │ 216  D8 ╪    │ 248  F8 °
153  99 Ö    │ 185  B9 ╣    │ 217  D9 ┘    │ 249  F9 ∙
154  9A Ü    │ 186  BA ║    │ 218  DA ┌    │ 250  FA ·
155  9B ¢    │ 187  BB ╗    │ 219  DB █    │ 251  FB √
156  9C £    │ 188  BC ╝    │ 220  DC ▄    │ 252  FC ⁿ
157  9D ¥    │ 189  BD ╜    │ 221  DD ▌    │ 253  FD ²
158  9E ₧    │ 190  BE ╛    │ 222  DE ▐    │ 254  FE ■
159  9F ƒ    │ 191  BF ┐    │ 223  DF ▀    │ 255  FF
19.0 Appendix ii
Keyboard Codes
These tables list the ASCII code, in hexadecimal, and the
keyboard scan code returned by keyboard keys for 101/102-key
keyboards. In the tables the ASCII code and scan code are listed
as a number pair, ss/aa, where ss is the scan code and aa is the
ASCII code. Keys and key combinations marked '**' are used by
the computer keyboard, but do not put values into the keyboard
buffer. Keys and key combinations marked '--' are ignored by the
computer keyboard. Note: In the United States, the 101/102-key
keyboard is shipped with 101 keys. Overseas versions have an
additional key sandwiched between the left Shift key and the Z
key. This additional key is identified in these tables as "Key
45."
Table 1 is organized by keystroke, following the layout of the
keyboard. Table 2 is organized in ascending ASCII code, and
Table 3 is organized in ascending scan code.
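The hexadecimal and decimal columns of the tables carry the same ss/aa pair in two bases. A small Python sketch of the conversion, useful for checking an entry by hand, is shown here; the function name to_decimal_pair is hypothetical.

```python
def to_decimal_pair(pair):
    """Convert a hexadecimal 'ss/aa' pair to three-digit decimal form."""
    scan, ascii_code = (int(part, 16) for part in pair.split("/"))
    return f"{scan:03d}/{ascii_code:03d}"

print(to_decimal_pair("1E/61"))   # the 'a' key
print(to_decimal_pair("85/00"))   # F11
print(to_decimal_pair("8D/E0"))   # Ctrl Gray Up Arrow
```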
Table 1
Keystroke Hexadecimal Decimal
Esc 01/1B 001/027
1 02/31 002/049
2 03/32 003/050
3 04/33 004/051
4 05/34 005/052
5 06/35 006/053
6 07/36 007/054
7 08/37 008/055
8 09/38 009/056
9 0A/39 010/057
0 0B/30 011/048
- 0C/2D 012/045
= 0D/3D 013/061
Backspace 0E/08 014/008
Tab 0F/09 015/009
q 10/71 016/113
w 11/77 017/119
e 12/65 018/101
r 13/72 019/114
t 14/74 020/116
y 15/79 021/121
u 16/75 022/117
i 17/69 023/105
o 18/6F 024/111
p 19/70 025/112
[ 1A/5B 026/091
] 1B/5D 027/093
Enter 1C/0D 028/013
Ctrl ** **
a 1E/61 030/097
s 1F/73 031/115
d 20/64 032/100
f 21/66 033/102
g 22/67 034/103
h 23/68 035/104
j 24/6A 036/106
k 25/6B 037/107
l 26/6C 038/108
; 27/3B 039/059
' 28/27 040/039
` 29/60 041/096
Shift ** **
\ 2B/5C 043/092
z 2C/7A 044/122
x 2D/78 045/120
c 2E/63 046/099
v 2F/76 047/118
b 30/62 048/098
n 31/6E 049/110
m 32/6D 050/109
, 33/2C 051/044
. 34/2E 052/046
/ 35/2F 053/047
Gray * 37/2A 055/042
Alt ** **
Space 39/20 057/032
Caps Lock ** **
F1 3B/00 059/000
F2 3C/00 060/000
F3 3D/00 061/000
F4 3E/00 062/000
F5 3F/00 063/000
F6 40/00 064/000
F7 41/00 065/000
F8 42/00 066/000
F9 43/00 067/000
F10 44/00 068/000
F11 85/00 133/000
F12 86/00 134/000
Num Lock ** **
Scroll Lock ** **
White Home 47/00 071/000
White Up Arrow 48/00 072/000
White PgUp 49/00 073/000
Gray - 4A/2D 074/045
White Left Arrow 4B/00 075/000
Center Key 4C/00 076/000
White Right Arrow 4D/00 077/000
Gray + 4E/2B 078/043
White End 4F/00 079/000
White Down Arrow 50/00 080/000
White PgDn 51/00 081/000
White Ins 52/00 082/000
White Del 53/00 083/000
SysReq ** **
Key 45 56/5C 086/092
Enter (number keypad) 1C/0D 028/013
Gray / 35/2F 053/047
PrtSc ** **
Pause ** **
Gray Home 47/00 071/000
Gray Up Arrow 48/00 072/000
Gray Page Up 49/00 073/000
Gray Left Arrow 4B/00 075/000
Gray Right Arrow 4D/00 077/000
Gray End 4F/00 079/000
Gray Down Arrow 50/00 080/000
Gray Page Down 51/00 081/000
Gray Insert 52/00 082/000
Gray Delete 53/00 083/000
Shift Esc 01/1B 001/027
! 02/21 002/033
@ 03/40 003/064
# 04/23 004/035
$ 05/24 005/036
% 06/25 006/037
^ 07/5E 007/094
& 08/26 008/038
* (white) 09/2A 009/042
( 0A/28 010/040
) 0B/29 011/041
_ 0C/5F 012/095
+ (white) 0D/2B 013/043
Shift Backspace 0E/08 014/008
Shift Tab (Backtab) 0F/00 015/000
Q 10/51 016/081
W 11/57 017/087
E 12/45 018/069
R 13/52 019/082
T 14/54 020/084
Y 15/59 021/089
U 16/55 022/085
I 17/49 023/073
O 18/4F 024/079
P 19/50 025/080
{ 1A/7B 026/123
} 1B/7D 027/125
Shift Enter 1C/0D 028/013
Shift Ctrl ** **
A 1E/41 030/065
S 1F/53 031/083
D 20/44 032/068
F 21/46 033/070
G 22/47 034/071
H 23/48 035/072
J 24/4A 036/074
K 25/4B 037/075
L 26/4C 038/076
: 27/3A 039/058
" 28/22 040/034
~ 29/7E 041/126
| 2B/7C 043/124
Z 2C/5A 044/090
X 2D/58 045/088
C 2E/43 046/067
V 2F/56 047/086
B 30/42 048/066
N 31/4E 049/078
M 32/4D 050/077
< 33/3C 051/060
> 34/3E 052/062
? 35/3F 053/063
Shift Gray * 37/2A 055/042
Shift Alt ** **
Shift Space 39/20 057/032
Shift Caps Lock ** **
Shift F1 54/00 084/000
Shift F2 55/00 085/000
Shift F3 56/00 086/000
Shift F4 57/00 087/000
Shift F5 58/00 088/000
Shift F6 59/00 089/000
Shift F7 5A/00 090/000
Shift F8 5B/00 091/000
Shift F9 5C/00 092/000
Shift F10 5D/00 093/000
Shift F11 87/00 135/000
Shift F12 88/00 136/000
Shift Num Lock ** **
Shift Scroll Lock ** **
Shift 7 (number pad) 47/37 071/055
Shift 8 (number pad) 48/38 072/056
Shift 9 (number pad) 49/39 073/057
Shift Gray - 4A/2D 074/045
Shift 4 (number pad) 4B/34 075/052
Shift 5 (number pad) 4C/35 076/053
Shift 6 (number pad) 4D/36 077/054
Shift Gray + 4E/2B 078/043
Shift 1 (number pad) 4F/31 079/049
Shift 2 (number pad) 50/32 080/050
Shift 3 (number pad) 51/33 081/051
Shift 0 (number pad) 52/30 082/048
Shift . (number pad) 53/2E 083/046
Shift SysReq ** **
Shift Key 45 56/7C 086/124
Shift Enter (number pad) 1C/0D 028/013
Shift Gray / 35/2F 053/047
Shift PrtSc ** **
Shift Pause ** **
Shift Gray Home 47/00 071/000
Shift Gray Up Arrow 48/00 072/000
Shift Gray Page Up 49/00 073/000
Shift Gray Left Arrow 4B/00 075/000
Shift Gray Right Arrow 4D/00 077/000
Shift Gray End 4F/00 079/000
Shift Gray Down Arrow 50/00 080/000
Shift Gray Page Down 51/00 081/000
Shift Gray Insert 52/00 082/000
Shift Gray Delete 53/00 083/000
Ctrl Esc 01/1B 001/027
Ctrl 1 -- --
Ctrl 2 (NUL) 03/00 003/000
Ctrl 3 -- --
Ctrl 4 -- --
Ctrl 5 -- --
Ctrl 6 (RS) 07/1E 007/030
Ctrl 7 -- --
Ctrl 8 -- --
Ctrl 9 -- --
Ctrl 0 -- --
Ctrl - 0C/1F 012/031
Ctrl = -- --
Ctrl Backspace (DEL) 0E/7F 014/127
Ctrl Tab 94/00 148/000
Ctrl q (DC1) 10/11 016/017
Ctrl w (ETB) 11/17 017/023
Ctrl e (ENQ) 12/05 018/005
Ctrl r (DC2) 13/12 019/018
Ctrl t (DC4) 14/14 020/020
Ctrl y (EM) 15/19 021/025
Ctrl u (NAK) 16/15 022/021
Ctrl i (HT) 17/09 023/009
Ctrl o (SI) 18/0F 024/015
Ctrl p (DLE) 19/10 025/016
Ctrl [ (ESC) 1A/1B 026/027
Ctrl ] (GS) 1B/1D 027/029
Ctrl Enter (LF) 1C/0A 028/010
Ctrl a (SOH) 1E/01 030/001
Ctrl s (DC3) 1F/13 031/019
Ctrl d (EOT) 20/04 032/004
Ctrl f (ACK) 21/06 033/006
Ctrl g (BEL) 22/07 034/007
Ctrl h (Backspace) 23/08 035/008
Ctrl j (LF) 24/0A 036/010
Ctrl k (VT) 25/0B 037/011
Ctrl l (FF) 26/0C 038/012
Ctrl ; -- --
Ctrl ' -- --
Ctrl ` -- --
Ctrl Shift ** **
Ctrl \ (FS) 2B/1C 043/028
Ctrl z (SUB) 2C/1A 044/026
Ctrl x (CAN) 2D/18 045/024
Ctrl c (ETX) 2E/03 046/003
Ctrl v (SYN) 2F/16 047/022
Ctrl b (STX) 30/02 048/002
Ctrl n (SO) 31/0E 049/014
Ctrl m (CR) 32/0D 050/013
Ctrl , -- --
Ctrl . -- --
Ctrl / -- --
Ctrl Gray * 96/00 150/000
Ctrl Alt ** **
Ctrl Space 39/20 057/032
Ctrl Caps Lock -- --
Ctrl F1 5E/00 094/000
Ctrl F2 5F/00 095/000
Ctrl F3 60/00 096/000
Ctrl F4 61/00 097/000
Ctrl F5 62/00 098/000
Ctrl F6 63/00 099/000
Ctrl F7 64/00 100/000
Ctrl F8 65/00 101/000
Ctrl F9 66/00 102/000
Ctrl F10 67/00 103/000
Ctrl F11 89/00 137/000
Ctrl F12 8A/00 138/000
Ctrl Num Lock -- --
Ctrl Scroll Lock -- --
Ctrl White Home 77/00 119/000
Ctrl White Up Arrow 8D/00 141/000
Ctrl White PgUp 84/00 132/000
Ctrl Gray - 8E/00 142/000
Ctrl White Left Arrow 73/00 115/000
Ctrl 5 (number pad) 8F/00 143/000
Ctrl White Right Arrow 74/00 116/000
Ctrl Gray + 90/00 144/000
Ctrl White End 75/00 117/000
Ctrl White Down Arrow 91/00 145/000
Ctrl White PgDn 76/00 118/000
Ctrl White Ins 92/00 146/000
Ctrl White Del 93/00 147/000
Ctrl SysReq ** **
Ctrl Key 45 -- --
Ctrl Enter (number pad) 1C/0A 028/010
Ctrl / (number pad) 95/00 149/000
Ctrl PrtSc 72/00 114/000
Ctrl Break 00/00 000/000
Ctrl Gray Home 77/00 119/000
Ctrl Gray Up Arrow 8D/E0 141/224
Ctrl Gray Page Up 84/00 132/000
Ctrl Gray Left Arrow 73/00 115/000
Ctrl Gray Right Arrow 74/00 116/000
Ctrl Gray End 75/00 117/000
Ctrl Gray Down Arrow 91/E0 145/224
Ctrl Gray Page Down 76/00 118/000
Ctrl Gray Insert 92/E0 146/224
Ctrl Gray Delete 93/E0 147/224
Alt Esc 01/00 001/000
Alt 1 78/00 120/000
Alt 2 79/00 121/000
Alt 3 7A/00 122/000
Alt 4 7B/00 123/000
Alt 5 7C/00 124/000
Alt 6 7D/00 125/000
Alt 7 7E/00 126/000
Alt 8 7F/00 127/000
Alt 9 80/00 128/000
Alt 0 81/00 129/000
Alt - 82/00 130/000
Alt = 83/00 131/000
Alt Backspace 0E/00 014/000
Alt Tab A5/00 165/000
Alt q 10/00 016/000
Alt w 11/00 017/000
Alt e 12/00 018/000
Alt r 13/00 019/000
Alt t 14/00 020/000
Alt y 15/00 021/000
Alt u 16/00 022/000
Alt i 17/00 023/000
Alt o 18/00 024/000
Alt p 19/00 025/000
Alt [ 1A/00 026/000
Alt ] 1B/00 027/000
Alt Enter 1C/00 028/000
Alt Ctrl ** **
Alt a 1E/00 030/000
Alt s 1F/00 031/000
Alt d 20/00 032/000
Alt f 21/00 033/000
Alt g 22/00 034/000
Alt h 23/00 035/000
Alt j 24/00 036/000
Alt k 25/00 037/000
Alt l 26/00 038/000
Alt ; 27/00 039/000
Alt ' 28/00 040/000
Alt ` 29/00 041/000
Alt Shift ** **
Alt \ 2B/00 043/000
Alt z 2C/00 044/000
Alt x 2D/00 045/000
Alt c 2E/00 046/000
Alt v 2F/00 047/000
Alt b 30/00 048/000
Alt n 31/00 049/000
Alt m 32/00 050/000
Alt , 33/00 051/000
Alt . 34/00 052/000
Alt / 35/00 053/000
Alt Gray * 37/00 055/000
Alt Space 39/20 057/032
Alt Caps Lock ** **
Alt F1 68/00 104/000
Alt F2 69/00 105/000
Alt F3 6A/00 106/000
Alt F4 6B/00 107/000
Alt F5 6C/00 108/000
Alt F6 6D/00 109/000
Alt F7 6E/00 110/000
Alt F8 6F/00 111/000
Alt F9 70/00 112/000
Alt F10 71/00 113/000
Alt F11 8B/00 139/000
Alt F12 8C/00 140/000
Alt Num Lock ** **
Alt Scroll Lock ** **
Alt Gray - 4A/00 074/000
Alt Gray + 4E/00 078/000
Alt 7 (number pad) # #
Alt 8 (number pad) # #
Alt 9 (number pad) # #
Alt 4 (number pad) # #
Alt 5 (number pad) # #
Alt 6 (number pad) # #
Alt 1 (number pad) # #
Alt 2 (number pad) # #
Alt 3 (number pad) # #
Alt Del -- --
Alt SysReq ** **
Alt Key 45 -- --
Alt Enter (number pad) A6/00 166/000
Alt / (number pad) A4/00 164/000
Alt PrtSc ** **
Alt Pause ** **
Alt Gray Home 97/00 151/000
Alt Gray Up Arrow 98/00 152/000
Alt Gray Page Up 99/00 153/000
Alt Gray Left Arrow 9B/00 155/000
Alt Gray Right Arrow 9D/00 157/000
Alt Gray End 9F/00 159/000
Alt Gray Down Arrow A0/00 160/000
Alt Gray Page Down A1/00 161/000
Alt Gray Insert A2/00 162/000
Alt Gray Delete A3/00 163/000
Table 2
Keystroke Hexadecimal Decimal
Ctrl Break 00/00 000/000
Alt Esc 01/00 001/000
Ctrl 2 (NUL) 03/00 003/000
Alt Backspace 0E/00 014/000
Shift Tab (Backtab) 0F/00 015/000
Alt q 10/00 016/000
Alt w 11/00 017/000
Alt e 12/00 018/000
Alt r 13/00 019/000
Alt t 14/00 020/000
Alt y 15/00 021/000
Alt u 16/00 022/000
Alt i 17/00 023/000
Alt o 18/00 024/000
Alt p 19/00 025/000
Alt [ 1A/00 026/000
Alt ] 1B/00 027/000
Alt Enter 1C/00 028/000
Alt a 1E/00 030/000
Alt s 1F/00 031/000
Alt d 20/00 032/000
Alt f 21/00 033/000
Alt g 22/00 034/000
Alt h 23/00 035/000
Alt j 24/00 036/000
Alt k 25/00 037/000
Alt l 26/00 038/000
Alt ; 27/00 039/000
Alt ' 28/00 040/000
Alt ` 29/00 041/000
Alt \ 2B/00 043/000
Alt z 2C/00 044/000
Alt x 2D/00 045/000
Alt c 2E/00 046/000
Alt v 2F/00 047/000
Alt b 30/00 048/000
Alt n 31/00 049/000
Alt m 32/00 050/000
Alt , 33/00 051/000
Alt . 34/00 052/000
Alt / 35/00 053/000
Alt Gray * 37/00 055/000
F1 3B/00 059/000
F2 3C/00 060/000
F3 3D/00 061/000
F4 3E/00 062/000
F5 3F/00 063/000
F6 40/00 064/000
F7 41/00 065/000
F8 42/00 066/000
F9 43/00 067/000
F10 44/00 068/000
White Home 47/00 071/000
Gray Home 47/00 071/000
Shift Gray Home 47/00 071/000
White Up Arrow 48/00 072/000
Gray Up Arrow 48/00 072/000
Shift Gray Up Arrow 48/00 072/000
White PgUp 49/00 073/000
Gray Page Up 49/00 073/000
Shift Gray Page Up 49/00 073/000
Alt Gray - 4A/00 074/000
White Left Arrow 4B/00 075/000
Gray Left Arrow 4B/00 075/000
Shift Gray Left Arrow 4B/00 075/000
Center Key 4C/00 076/000
White Right Arrow 4D/00 077/000
Gray Right Arrow 4D/00 077/000
Shift Gray Right Arrow 4D/00 077/000
Alt Gray + 4E/00 078/000
White End 4F/00 079/000
Gray End 4F/00 079/000
Shift Gray End 4F/00 079/000
White Down Arrow 50/00 080/000
Gray Down Arrow 50/00 080/000
Shift Gray Down Arrow 50/00 080/000
White PgDn 51/00 081/000
Gray Page Down 51/00 081/000
Shift Gray Page Down 51/00 081/000
White Ins 52/00 082/000
Gray Insert 52/00 082/000
Shift Gray Insert 52/00 082/000
White Del 53/00 083/000
Gray Delete 53/00 083/000
Shift Gray Delete 53/00 083/000
Shift F1 54/00 084/000
Shift F2 55/00 085/000
Shift F3 56/00 086/000
Shift F4 57/00 087/000
Shift F5 58/00 088/000
Shift F6 59/00 089/000
Shift F7 5A/00 090/000
Shift F8 5B/00 091/000
Shift F9 5C/00 092/000
Shift F10 5D/00 093/000
Ctrl F1 5E/00 094/000
Ctrl F2 5F/00 095/000
Ctrl F3 60/00 096/000
Ctrl F4 61/00 097/000
Ctrl F5 62/00 098/000
Ctrl F6 63/00 099/000
Ctrl F7 64/00 100/000
Ctrl F8 65/00 101/000
Ctrl F9 66/00 102/000
Ctrl F10 67/00 103/000
Alt F1 68/00 104/000
Alt F2 69/00 105/000
Alt F3 6A/00 106/000
Alt F4 6B/00 107/000
Alt F5 6C/00 108/000
Alt F6 6D/00 109/000
Alt F7 6E/00 110/000
Alt F8 6F/00 111/000
Alt F9 70/00 112/000
Alt F10 71/00 113/000
Ctrl PrtSc 72/00 114/000
Ctrl White Left Arrow 73/00 115/000
Ctrl Gray Left Arrow 73/00 115/000
Ctrl White Right Arrow 74/00 116/000
Ctrl Gray Right Arrow 74/00 116/000
Ctrl White End 75/00 117/000
Ctrl Gray End 75/00 117/000
Ctrl White PgDn 76/00 118/000
Ctrl Gray Page Down 76/00 118/000
Ctrl White Home 77/00 119/000
Ctrl Gray Home 77/00 119/000
Alt 1 78/00 120/000
Alt 2 79/00 121/000
Alt 3 7A/00 122/000
Alt 4 7B/00 123/000
Alt 5 7C/00 124/000
Alt 6 7D/00 125/000
Alt 7 7E/00 126/000
Alt 8 7F/00 127/000
Alt 9 80/00 128/000
Alt 0 81/00 129/000
Alt - 82/00 130/000
Alt = 83/00 131/000
Ctrl White PgUp 84/00 132/000
Ctrl Gray Page Up 84/00 132/000
F11 85/00 133/000
F12 86/00 134/000
Shift F11 87/00 135/000
Shift F12 88/00 136/000
Ctrl F11 89/00 137/000
Ctrl F12 8A/00 138/000
Alt F11 8B/00 139/000
Alt F12 8C/00 140/000
Ctrl White Up Arrow 8D/00 141/000
Ctrl Gray - 8E/00 142/000
Ctrl 5 (number pad) 8F/00 143/000
Ctrl Gray + 90/00 144/000
Ctrl White Down Arrow 91/00 145/000
Ctrl White Ins 92/00 146/000
Ctrl White Del 93/00 147/000
Ctrl Tab 94/00 148/000
Ctrl / (number pad) 95/00 149/000
Ctrl Gray * 96/00 150/000
Alt Gray Home 97/00 151/000
Alt Gray Up Arrow 98/00 152/000
Alt Gray Page Up 99/00 153/000
Alt Gray Left Arrow 9B/00 155/000
Alt Gray Right Arrow 9D/00 157/000
Alt Gray End 9F/00 159/000
Alt Gray Down Arrow A0/00 160/000
Alt Gray Page Down A1/00 161/000
Alt Gray Insert A2/00 162/000
Alt Gray Delete A3/00 163/000
Alt / (number pad) A4/00 164/000
Alt Tab A5/00 165/000
Alt Enter (number pad) A6/00 166/000
Ctrl a (SOH) 1E/01 030/001
Ctrl b (STX) 30/02 048/002
Ctrl c (ETX) 2E/03 046/003
Ctrl d (EOT) 20/04 032/004
Ctrl e (ENQ) 12/05 018/005
Ctrl f (ACK) 21/06 033/006
Ctrl g (BEL) 22/07 034/007
Backspace 0E/08 014/008
Shift Backspace 0E/08 014/008
Ctrl h (Backspace) 23/08 035/008
Tab 0F/09 015/009
Ctrl i (HT) 17/09 023/009
Ctrl Enter (LF) 1C/0A 028/010
Ctrl Enter (number pad) 1C/0A 028/010
Ctrl j (LF) 24/0A 036/010
Ctrl k (VT) 25/0B 037/011
Ctrl l (FF) 26/0C 038/012
Enter 1C/0D 028/013
Enter (number keypad) 1C/0D 028/013
Shift Enter 1C/0D 028/013
Shift Enter (number pad) 1C/0D 028/013
Ctrl m (CR) 32/0D 050/013
Ctrl n (SO) 31/0E 049/014
Ctrl o (SI) 18/0F 024/015
Ctrl p (DLE) 19/10 025/016
Ctrl q (DC1) 10/11 016/017
Ctrl r (DC2) 13/12 019/018
Ctrl s (DC3) 1F/13 031/019
Ctrl t (DC4) 14/14 020/020
Ctrl u (NAK) 16/15 022/021
Ctrl v (SYN) 2F/16 047/022
Ctrl w (ETB) 11/17 017/023
Ctrl x (CAN) 2D/18 045/024
Ctrl y (EM) 15/19 021/025
Ctrl z (SUB) 2C/1A 044/026
Esc 01/1B 001/027
Shift Esc 01/1B 001/027
Ctrl Esc 01/1B 001/027
Ctrl [ (ESC) 1A/1B 026/027
Ctrl \ (FS) 2B/1C 043/028
Ctrl ] (GS) 1B/1D 027/029
Ctrl 6 (RS) 07/1E 007/030
Ctrl - 0C/1F 012/031
Space 39/20 057/032
Shift Space 39/20 057/032
Ctrl Space 39/20 057/032
Alt Space 39/20 057/032
! 02/21 002/033
" 28/22 040/034
# 04/23 004/035
$ 05/24 005/036
% 06/25 006/037
& 08/26 008/038
' 28/27 040/039
( 0A/28 010/040
) 0B/29 011/041
* (white) 09/2A 009/042
Gray * 37/2A 055/042
Shift Gray * 37/2A 055/042
+ (white) 0D/2B 013/043
Gray + 4E/2B 078/043
Shift Gray + 4E/2B 078/043
, 33/2C 051/044
- 0C/2D 012/045
Gray - 4A/2D 074/045
Shift Gray - 4A/2D 074/045
. 34/2E 052/046
Shift . (number pad) 53/2E 083/046
/ 35/2F 053/047
Gray / 35/2F 053/047
Shift Gray / 35/2F 053/047
0 0B/30 011/048
Shift 0 (number pad) 52/30 082/048
1 02/31 002/049
Shift 1 (number pad) 4F/31 079/049
2 03/32 003/050
Shift 2 (number pad) 50/32 080/050
3 04/33 004/051
Shift 3 (number pad) 51/33 081/051
4 05/34 005/052
Shift 4 (number pad) 4B/34 075/052
5 06/35 006/053
Shift 5 (number pad) 4C/35 076/053
6 07/36 007/054
Shift 6 (number pad) 4D/36 077/054
7 08/37 008/055
Shift 7 (number pad) 47/37 071/055
8 09/38 009/056
Shift 8 (number pad) 48/38 072/056
9 0A/39 010/057
Shift 9 (number pad) 49/39 073/057
: 27/3A 039/058
; 27/3B 039/059
< 33/3C 051/060
= 0D/3D 013/061
> 34/3E 052/062
? 35/3F 053/063
@ 03/40 003/064
A 1E/41 030/065
B 30/42 048/066
C 2E/43 046/067
D 20/44 032/068
E 12/45 018/069
F 21/46 033/070
G 22/47 034/071
H 23/48 035/072
I 17/49 023/073
J 24/4A 036/074
K 25/4B 037/075
L 26/4C 038/076
M 32/4D 050/077
N 31/4E 049/078
O 18/4F 024/079
P 19/50 025/080
Q 10/51 016/081
R 13/52 019/082
S 1F/53 031/083
T 14/54 020/084
U 16/55 022/085
V 2F/56 047/086
W 11/57 017/087
X 2D/58 045/088
Y 15/59 021/089
Z 2C/5A 044/090
[ 1A/5B 026/091
\ 2B/5C 043/092
Key 45 56/5C 086/092
] 1B/5D 027/093
^ 07/5E 007/094
_ 0C/5F 012/095
` 29/60 041/096
a 1E/61 030/097
b 30/62 048/098
c 2E/63 046/099
d 20/64 032/100
e 12/65 018/101
f 21/66 033/102
g 22/67 034/103
h 23/68 035/104
i 17/69 023/105
j 24/6A 036/106
k 25/6B 037/107
l 26/6C 038/108
m 32/6D 050/109
n 31/6E 049/110
o 18/6F 024/111
p 19/70 025/112
q 10/71 016/113
r 13/72 019/114
s 1F/73 031/115
t 14/74 020/116
u 16/75 022/117
v 2F/76 047/118
w 11/77 017/119
x 2D/78 045/120
y 15/79 021/121
z 2C/7A 044/122
{ 1A/7B 026/123
| 2B/7C 043/124
Shift Key 45 56/7C 086/124
} 1B/7D 027/125
~ 29/7E 041/126
Ctrl Backspace (DEL) 0E/7F 014/127
Ctrl Gray Up Arrow 8D/E0 141/224
Ctrl Gray Down Arrow 91/E0 145/224
Ctrl Gray Insert 92/E0 146/224
Ctrl Gray Delete 93/E0 147/224
Alt 7 (number pad) # #
Alt 8 (number pad) # #
Alt 9 (number pad) # #
Alt 4 (number pad) # #
Alt 5 (number pad) # #
Alt 6 (number pad) # #
Alt 1 (number pad) # #
Alt 2 (number pad) # #
Alt 3 (number pad) # #
Ctrl ** **
Shift ** **
Alt ** **
Caps Lock ** **
Num Lock ** **
Scroll Lock ** **
SysReq ** **
PrtSc ** **
Pause ** **
Shift Ctrl ** **
Shift Alt ** **
Shift Caps Lock ** **
Shift Num Lock ** **
Shift Scroll Lock ** **
Shift SysReq ** **
Shift PrtSc ** **
Shift Pause ** **
Ctrl Shift ** **
Ctrl Alt ** **
Ctrl SysReq ** **
Alt Ctrl ** **
Alt Shift ** **
Alt Caps Lock ** **
Alt Num Lock ** **
Alt Scroll Lock ** **
Alt SysReq ** **
Alt PrtSc ** **
Alt Pause ** **
Ctrl 1 -- --
Ctrl 3 -- --
Ctrl 4 -- --
Ctrl 5 -- --
Ctrl 7 -- --
Ctrl 8 -- --
Ctrl 9 -- --
Ctrl 0 -- --
Ctrl = -- --
Ctrl ; -- --
Ctrl ' -- --
Ctrl ` -- --
Ctrl , -- --
Ctrl . -- --
Ctrl / -- --
Ctrl Caps Lock -- --
Ctrl Num Lock -- --
Ctrl Scroll Lock -- --
Ctrl Key 45 -- --
Alt Del -- --
Alt Key 45 -- --
Table 3
Keystroke Hexadecimal Decimal
Ctrl Break 00/00 000/000
Alt Esc 01/00 001/000
Esc 01/1B 001/027
Shift Esc 01/1B 001/027
Ctrl Esc 01/1B 001/027
! 02/21 002/033
1 02/31 002/049
Ctrl 2 (NUL) 03/00 003/000
2 03/32 003/050
@ 03/40 003/064
# 04/23 004/035
3 04/33 004/051
$ 05/24 005/036
4 05/34 005/052
% 06/25 006/037
5 06/35 006/053
Ctrl 6 (RS) 07/1E 007/030
6 07/36 007/054
^ 07/5E 007/094
& 08/26 008/038
7 08/37 008/055
* (white) 09/2A 009/042
8 09/38 009/056
( 0A/28 010/040
9 0A/39 010/057
) 0B/29 011/041
0 0B/30 011/048
Ctrl - 0C/1F 012/031
- 0C/2D 012/045
_ 0C/5F 012/095
+ (white) 0D/2B 013/043
= 0D/3D 013/061
Alt Backspace 0E/00 014/000
Backspace 0E/08 014/008
Shift Backspace 0E/08 014/008
Ctrl Backspace (DEL) 0E/7F 014/127
Shift Tab (Backtab) 0F/00 015/000
Tab 0F/09 015/009
Alt q 10/00 016/000
Ctrl q (DC1) 10/11 016/017
Q 10/51 016/081
q 10/71 016/113
Alt w 11/00 017/000
Ctrl w (ETB) 11/17 017/023
W 11/57 017/087
w 11/77 017/119
Alt e 12/00 018/000
Ctrl e (ENQ) 12/05 018/005
E 12/45 018/069
e 12/65 018/101
Alt r 13/00 019/000
Ctrl r (DC2) 13/12 019/018
R 13/52 019/082
r 13/72 019/114
Alt t 14/00 020/000
Ctrl t (DC4) 14/14 020/020
T 14/54 020/084
t 14/74 020/116
Alt y 15/00 021/000
Ctrl y (EM) 15/19 021/025
Y 15/59 021/089
y 15/79 021/121
Alt u 16/00 022/000
Ctrl u (NAK) 16/15 022/021
U 16/55 022/085
u 16/75 022/117
Alt i 17/00 023/000
Ctrl i (HT) 17/09 023/009
I 17/49 023/073
i 17/69 023/105
Alt o 18/00 024/000
Ctrl o (SI) 18/0F 024/015
O 18/4F 024/079
o 18/6F 024/111
Alt p 19/00 025/000
Ctrl p (DLE) 19/10 025/016
P 19/50 025/080
p 19/70 025/112
Alt [ 1A/00 026/000
Ctrl [ (ESC) 1A/1B 026/027
[ 1A/5B 026/091
{ 1A/7B 026/123
Alt ] 1B/00 027/000
Ctrl ] (GS) 1B/1D 027/029
] 1B/5D 027/093
} 1B/7D 027/125
Alt Enter 1C/00 028/000
Ctrl Enter (LF) 1C/0A 028/010
Ctrl Enter (number pad) 1C/0A 028/010
Enter 1C/0D 028/013
Enter (number keypad) 1C/0D 028/013
Shift Enter 1C/0D 028/013
Shift Enter (number pad) 1C/0D 028/013
Alt a 1E/00 030/000
Ctrl a (SOH) 1E/01 030/001
A 1E/41 030/065
a 1E/61 030/097
Alt s 1F/00 031/000
Ctrl s (DC3) 1F/13 031/019
S 1F/53 031/083
s 1F/73 031/115
Alt d 20/00 032/000
Ctrl d (EOT) 20/04 032/004
D 20/44 032/068
d 20/64 032/100
Alt f 21/00 033/000
Ctrl f (ACK) 21/06 033/006
F 21/46 033/070
f 21/66 033/102
Alt g 22/00 034/000
Ctrl g (BEL) 22/07 034/007
G 22/47 034/071
g 22/67 034/103
Alt h 23/00 035/000
Ctrl h (Backspace) 23/08 035/008
H 23/48 035/072
h 23/68 035/104
Alt j 24/00 036/000
Ctrl j (LF) 24/0A 036/010
J 24/4A 036/074
j 24/6A 036/106
Alt k 25/00 037/000
Ctrl k (VT) 25/0B 037/011
K 25/4B 037/075
k 25/6B 037/107
Alt l 26/00 038/000
Ctrl l (FF) 26/0C 038/012
L 26/4C 038/076
l 26/6C 038/108
Alt ; 27/00 039/000
: 27/3A 039/058
; 27/3B 039/059
Alt ' 28/00 040/000
" 28/22 040/034
' 28/27 040/039
Alt ` 29/00 041/000
` 29/60 041/096
~ 29/7E 041/126
Alt \ 2B/00 043/000
Ctrl \ (FS) 2B/1C 043/028
\ 2B/5C 043/092
| 2B/7C 043/124
Alt z 2C/00 044/000
Ctrl z (SUB) 2C/1A 044/026
Z 2C/5A 044/090
z 2C/7A 044/122
Alt x 2D/00 045/000
Ctrl x (CAN) 2D/18 045/024
X 2D/58 045/088
x 2D/78 045/120
Alt c 2E/00 046/000
Ctrl c (ETX) 2E/03 046/003
C 2E/43 046/067
c 2E/63 046/099
Alt v 2F/00 047/000
Ctrl v (SYN) 2F/16 047/022
V 2F/56 047/086
v 2F/76 047/118
Alt b 30/00 048/000
Ctrl b (STX) 30/02 048/002
B 30/42 048/066
b 30/62 048/098
Alt n 31/00 049/000
Ctrl n (SO) 31/0E 049/014
N 31/4E 049/078
n 31/6E 049/110
Alt m 32/00 050/000
Ctrl m (CR) 32/0D 050/013
M 32/4D 050/077
m 32/6D 050/109
Alt , 33/00 051/000
, 33/2C 051/044
< 33/3C 051/060
Alt . 34/00 052/000
. 34/2E 052/046
> 34/3E 052/062
Alt / 35/00 053/000
/ 35/2F 053/047
Gray / 35/2F 053/047
Shift Gray / 35/2F 053/047
? 35/3F 053/063
Alt Gray * 37/00 055/000
Gray * 37/2A 055/042
Shift Gray * 37/2A 055/042
Space 39/20 057/032
Shift Space 39/20 057/032
Ctrl Space 39/20 057/032
Alt Space 39/20 057/032
F1 3B/00 059/000
F2 3C/00 060/000
F3 3D/00 061/000
F4 3E/00 062/000
F5 3F/00 063/000
F6 40/00 064/000
F7 41/00 065/000
F8 42/00 066/000
F9 43/00 067/000
F10 44/00 068/000
White Home 47/00 071/000
Gray Home 47/00 071/000
Shift Gray Home 47/00 071/000
Shift 7 (number pad) 47/37 071/055
White Up Arrow 48/00 072/000
Gray Up Arrow 48/00 072/000
Shift Gray Up Arrow 48/00 072/000
Shift 8 (number pad) 48/38 072/056
White PgUp 49/00 073/000
Gray Page Up 49/00 073/000
Shift Gray Page Up 49/00 073/000
Shift 9 (number pad) 49/39 073/057
Alt Gray - 4A/00 074/000
Gray - 4A/2D 074/045
Shift Gray - 4A/2D 074/045
White Left Arrow 4B/00 075/000
Gray Left Arrow 4B/00 075/000
Shift Gray Left Arrow 4B/00 075/000
Shift 4 (number pad) 4B/34 075/052
Center Key 4C/00 076/000
Shift 5 (number pad) 4C/35 076/053
White Right Arrow 4D/00 077/000
Gray Right Arrow 4D/00 077/000
Shift Gray Right Arrow 4D/00 077/000
Shift 6 (number pad) 4D/36 077/054
Alt Gray + 4E/00 078/000
Gray + 4E/2B 078/043
Shift Gray + 4E/2B 078/043
White End 4F/00 079/000
Gray End 4F/00 079/000
Shift Gray End 4F/00 079/000
Shift 1 (number pad) 4F/31 079/049
White Down Arrow 50/00 080/000
Gray Down Arrow 50/00 080/000
Shift Gray Down Arrow 50/00 080/000
Shift 2 (number pad) 50/32 080/050
White PgDn 51/00 081/000
Gray Page Down 51/00 081/000
Shift Gray Page Down 51/00 081/000
Shift 3 (number pad) 51/33 081/051
White Ins 52/00 082/000
Gray Insert 52/00 082/000
Shift Gray Insert 52/00 082/000
Shift 0 (number pad) 52/30 082/048
White Del 53/00 083/000
Gray Delete 53/00 083/000
Shift Gray Delete 53/00 083/000
Shift . (number pad) 53/2E 083/046
Shift F1 54/00 084/000
Shift F2 55/00 085/000
Shift F3 56/00 086/000
Key 45 56/5C 086/092
Shift Key 45 56/7C 086/124
Shift F4 57/00 087/000
Shift F5 58/00 088/000
Shift F6 59/00 089/000
Shift F7 5A/00 090/000
Shift F8 5B/00 091/000
Shift F9 5C/00 092/000
Shift F10 5D/00 093/000
Ctrl F1 5E/00 094/000
Ctrl F2 5F/00 095/000
Ctrl F3 60/00 096/000
Ctrl F4 61/00 097/000
Ctrl F5 62/00 098/000
Ctrl F6 63/00 099/000
Ctrl F7 64/00 100/000
Ctrl F8 65/00 101/000
Ctrl F9 66/00 102/000
Ctrl F10 67/00 103/000
Alt F1 68/00 104/000
Alt F2 69/00 105/000
Alt F3 6A/00 106/000
Alt F4 6B/00 107/000
Alt F5 6C/00 108/000
Alt F6 6D/00 109/000
Alt F7 6E/00 110/000
Alt F8 6F/00 111/000
Alt F9 70/00 112/000
Alt F10 71/00 113/000
Ctrl PrtSc 72/00 114/000
Ctrl White Left Arrow 73/00 115/000
Ctrl Gray Left Arrow 73/00 115/000
Ctrl White Right Arrow 74/00 116/000
Ctrl Gray Right Arrow 74/00 116/000
Ctrl White End 75/00 117/000
Ctrl Gray End 75/00 117/000
Ctrl White PgDn 76/00 118/000
Ctrl Gray Page Down 76/00 118/000
Ctrl White Home 77/00 119/000
Ctrl Gray Home 77/00 119/000
Alt 1 78/00 120/000
Alt 2 79/00 121/000
Alt 3 7A/00 122/000
Alt 4 7B/00 123/000
Alt 5 7C/00 124/000
Alt 6 7D/00 125/000
Alt 7 7E/00 126/000
Alt 8 7F/00 127/000
Alt 9 80/00 128/000
Alt 0 81/00 129/000
Alt - 82/00 130/000
Alt = 83/00 131/000
Ctrl White PgUp 84/00 132/000
Ctrl Gray Page Up 84/00 132/000
F11 85/00 133/000
F12 86/00 134/000
Shift F11 87/00 135/000
Shift F12 88/00 136/000
Ctrl F11 89/00 137/000
Ctrl F12 8A/00 138/000
Alt F11 8B/00 139/000
Alt F12 8C/00 140/000
Ctrl White Up Arrow 8D/00 141/000
Ctrl Gray Up Arrow 8D/E0 141/224
Ctrl Gray - 8E/00 142/000
Ctrl 5 (number pad) 8F/00 143/000
Ctrl Gray + 90/00 144/000
Ctrl White Down Arrow 91/00 145/000
Ctrl Gray Down Arrow 91/E0 145/224
Ctrl White Ins 92/00 146/000
Ctrl Gray Insert 92/E0 146/224
Ctrl White Del 93/00 147/000
Ctrl Gray Delete 93/E0 147/224
Ctrl Tab 94/00 148/000
Ctrl / (number pad) 95/00 149/000
Ctrl Gray * 96/00 150/000
Alt Gray Home 97/00 151/000
Alt Gray Up Arrow 98/00 152/000
Alt Gray Page Up 99/00 153/000
Alt Gray Left Arrow 9B/00 155/000
Alt Gray Right Arrow 9D/00 157/000
Alt Gray End 9F/00 159/000
Alt Gray Down Arrow A0/00 160/000
Alt Gray Page Down A1/00 161/000
Alt Gray Insert A2/00 162/000
Alt Gray Delete A3/00 163/000
Alt / (number pad) A4/00 164/000
Alt Tab A5/00 165/000
Alt Enter (number pad) A6/00 166/000
Alt 7 (number pad) # #
Alt 8 (number pad) # #
Alt 9 (number pad) # #
Alt 4 (number pad) # #
Alt 5 (number pad) # #
Alt 6 (number pad) # #
Alt 1 (number pad) # #
Alt 2 (number pad) # #
Alt 3 (number pad) # #
Ctrl ** **
Shift ** **
Alt ** **
Caps Lock ** **
Num Lock ** **
Scroll Lock ** **
SysReq ** **
PrtSc ** **
Pause ** **
Shift Ctrl ** **
Shift Alt ** **
Shift Caps Lock ** **
Shift Num Lock ** **
Shift Scroll Lock ** **
Shift SysReq ** **
Shift PrtSc ** **
Shift Pause ** **
Ctrl Shift ** **
Ctrl Alt ** **
Ctrl SysReq ** **
Alt Ctrl ** **
Alt Shift ** **
Alt Caps Lock ** **
Alt Num Lock ** **
Alt Scroll Lock ** **
Alt SysReq ** **
Alt PrtSc ** **
Alt Pause ** **
Ctrl 1 -- --
Ctrl 3 -- --
Ctrl 4 -- --
Ctrl 5 -- --
Ctrl 7 -- --
Ctrl 8 -- --
Ctrl 9 -- --
Ctrl 0 -- --
Ctrl = -- --
Ctrl ; -- --
Ctrl ' -- --
Ctrl ` -- --
Ctrl , -- --
Ctrl . -- --
Ctrl / -- --
Ctrl Caps Lock -- --
Ctrl Num Lock -- --
Ctrl Scroll Lock -- --
Ctrl Key 45 -- --
Alt Del -- --
Alt Key 45 -- --
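Each entry in the tables above gives the scan code and the character code as a pair, first in hexadecimal and then in decimal. The relationship between the two columns can be checked mechanically. The following is a minimal sketch in Python, not QTAwk; the helper names are hypothetical and exist only for this illustration.

```python
def parse_hex_pair(entry):
    """Split a table entry such as '1E/61' into (scan_code, char_code)."""
    scan_hex, char_hex = entry.split("/")
    return int(scan_hex, 16), int(char_hex, 16)


def to_decimal_entry(entry):
    """Render a hexadecimal entry in the three-digit decimal form of the tables."""
    scan, char = parse_hex_pair(entry)
    return "%03d/%03d" % (scan, char)


print(to_decimal_entry("1E/61"))  # the 'a' key: 030/097
print(to_decimal_entry("8D/E0"))  # Ctrl Gray Up Arrow: 141/224
```

Entries shown as '**', '--' or '#' carry no code pair and are outside the scope of this sketch.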
QTAwk - 19-28 - QTAwk
Section 20.0 |
20.0 Appendix iii
Differences Between QTAwk and Awk:
1. Expanded Regular Expressions
All of the Awk regular expression operators are allowed plus
the following:
a) complemented character class using the Awk notation,
'[^...]', as well as the Awk/QTAwk and C logical negation
operator, '[!...]'.
b) Matched character classes, '[#...]'. These classes are
used in pairs. The position of the character matched in
the first class of the pair, determines the character
which must match in the position occupied by the second
class of the pair.
c) Look-ahead Operator. 'r@t': regular expression r is
matched only when followed by regular expression t.
d) Repetition Operator. 'r{n1,n2}': at least n1 and up to n2
repetitions of regular expression r.
e) Named Expressions. {named_expr} is replaced by the
string value of the corresponding variable.
f) Tagged Expressions. Enclosing a portion of a regular
expression in parentheses, '()', makes the matching
string available for use with the Tag Operator, '$$'.
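For readers more familiar with other regular expression dialects, the look-ahead, repetition and tagged forms listed above have rough counterparts in Python's 're' module. The sketch below is written in Python, not QTAwk, and the correspondence is an assumption based only on the descriptions in this list; QTAwk's matching rules may differ in detail.

```python
import re

# QTAwk 'r@t' (look-ahead) corresponds roughly to Python's r(?=t):
# 'foo' matches only when 'bar' follows, and 'bar' is not consumed.
lookahead = re.compile(r"foo(?=bar)")
m = lookahead.search("foobar")
print(m.group(0))

# 'r{n1,n2}' is written the same way in Python: two to four 'a' characters.
repeat = re.compile(r"^a{2,4}$")
accepted = [s for s in ("a", "aa", "aaaa", "aaaaa") if repeat.match(s)]
print(accepted)

# A parenthesized portion can be retrieved after the match, which is the
# role the Tag Operator '$$' plays in QTAwk.
tagged = re.search(r"(\d+)-(\d+)", "pages 10-25")
print(tagged.group(1), tagged.group(2))
```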
2. Consistent statement termination syntax. The QTAwk Utility
Creation Tool utilizes the semi-colon, ';', to terminate all
statements. The practice in Awk of using new lines to
"sometimes" terminate statements is no longer allowed.
3. Expanded Operator Set
The Awk set of operators has been changed to make them more
consistent and to more closely match those of C. The Awk
match operator, '~', has been changed to '~~' so that the
match operators, '~~' and '!~', parallel the equality
operators, '==' and '!='. The single tilde symbol, '~',
reverts to the C one's complement operator, an addition to
the operator set over Awk. An explicit string concatenation
operator, '∩' (ASCII 239, 0xef), has also been introduced.
The operators new to QTAwk are:
Operation            Operator
tag                  $$
one's complement     ~
concatenation        ∩
shift left/right     << >>
matching             ~~ !~
bit-wise AND         &
bit-wise XOR         @
bit-wise OR          |
sequence             ,
The caret, '^', remains as the exponentiation operator. The
symbol '@' is used for the exclusive OR operator. For string
operands, the shift operators, '<<' and '>>', shift the
strings with wrap-around instead of a bit shift as for
numeric operands.
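The difference between a bit shift and a wrap-around string shift can be sketched as follows. This is Python, not QTAwk; the function names are hypothetical, and the direction convention (left shift moves leading characters to the end) is an assumption, since the text above states only that strings wrap around.

```python
def shift_left(s, n):
    """Rotate string s left by n positions, wrapping characters around."""
    if not s:
        return s
    n %= len(s)
    return s[n:] + s[:n]


def shift_right(s, n):
    """Rotate string s right by n positions, wrapping characters around."""
    if not s:
        return s
    return shift_left(s, len(s) - n % len(s))


print(1 << 3)                  # numeric operand: ordinary bit shift
print(shift_left("abcde", 2))  # string operand: rotation with wrap-around
print(shift_right("abcde", 2))
```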
4. Expanded set of recognized constants:
a) decimal integers,
b) octal integers,
c) hexadecimal integers,
d) character constants, and
e) floating point constants.
These constants are recognized in utilities, input fields and
strings.
5. Expanded predefined patterns giving more control:
a) INITIAL - similar to BEGIN. Actions executed after
opening each input file and before reading the first record.
b) FINAL - similar to END. Actions executed after reading
last record of each input file and before closing file.
c) NOMATCH - actions executed for each input record for
which no pattern was matched.
d) GROUP - used to group multiple regular expressions for
search optimization. Can speed search by a factor of
six.
6. True multidimensional Arrays
The use of the comma in index expressions to simulate
multiple array indices is no longer supported. True multiple
indices are supported. Indexing is in the C manner,
'a[i1][i2]'. The use of the SUBSEP built-in variable of Awk
has been redefined.
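The contrast between Awk's simulated indices and true multiple indices can be pictured with a short sketch. It is written in Python rather than QTAwk, using dictionaries as stand-ins for arrays; the variable names are hypothetical.

```python
SUBSEP = "\x1c"  # Awk's default SUBSEP value (octal 034)

# Awk style: a[i, j] is really one string key, i SUBSEP j.
awk_style = {}
awk_style["row1" + SUBSEP + "col2"] = 42

# C / QTAwk style: a[i1][i2] uses two independent indices.
nested = {}
nested.setdefault("row1", {})["col2"] = 42

print(len(awk_style))           # one flat key holds both index parts
print(nested["row1"]["col2"])   # each dimension is indexed separately
```

With the nested form, all elements under one first index can be reached without splitting keys on a separator.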
7. Integer array indices as well as string indices
Array indices have been expanded to include integers as well
as the string indices of Awk. Indices are not automatically
converted to strings as in Awk. Thus, for true integer
indices, the index ordering follows the numeric sequence with
an integer index value of '10' following an integer value of
'2' instead of preceding it.
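The ordering difference is easy to reproduce in any language that distinguishes strings from integers. A small Python illustration (not QTAwk code):

```python
string_indices = ["1", "2", "10"]
integer_indices = [1, 2, 10]

# Compared as strings, "10" sorts before "2" because '1' < '2'.
print(sorted(string_indices))   # ['1', '10', '2']

# Compared as integers, 10 follows 2.
print(sorted(integer_indices))  # [1, 2, 10]
```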
8. Arrays integrated into QTAwk
QTAwk integrates arrays with arithmetic operators so that the
operations are carried out on the entire array. QTAwk also
integrates arrays into user-defined functions so that they
can be passed to and returned from such functions in a
natural and intuitive manner. Awk does not allow returning
arrays from user-defined functions or allow arithmetic
operators to operate on whole arrays.
In addition, with version 6.00 for PC/MS-DOS and version 1.00
for OS/2, arrays have been fully integrated into all aspects
of QTAwk including the match operators, '~~' and '!~', and
their implied use in patterns and the built-in functions,
'sub', 'gsub', and 'match'. The MATCH_INDEX built-in
variable has been added to return the matching array element
index when an array has been used for pattern matching. The
string value of the SUBSEP built-in variable is used as the
index separator in MATCH_INDEX for multidimensional arrays.
Arrays used as regular expressions with the match operators,
both explicit and implied, retain their internal regular
expression form between uses. In addition, when the array as
a whole is assigned to another variable, the internal regular
expression form is also assigned. The internal regular
expression form is discarded only when the array is changed.
This gives the user a more balanced control over dynamic
regular expressions between that of true regular expressions,
which retain the internal form until execution is halted, and
strings used as regular expressions, which discard the
internal regular expression form after each use.
9. NEW Keywords:
a) cycle
similar to 'next' except that the current record may be used
in restarting the outer pattern matching loop.
b) deletea
similar to 'delete' except that ALL array values are
deleted.
c) switch, case, default
similar to C syntax with the allowed 'switch' values and
'case' labels expanded to include any legal QTAwk
expression, evaluated at run-time. The expressions may
evaluate to any value including any numeric value, string
or regular expression.
d) local
new keyword to allow the declaration and use of local
variables within compound statements, including
user-defined functions. Its use in user-defined
functions, instead of the Awk practice of defining excess
formal parameters, leads to functions that are easier to
read and maintain. The C practice of allowing
initialization in the 'local' statement is followed.
e) endfile
similar to 'exit'. Simulates end of current input file
only, any remaining input files are still processed.
10. New Arithmetic Functions
QTAwk includes 18 built-in arithmetic functions. All of the
functions supported by Awk plus the following:
a) acos(x) - arc-cosine of x
b) asin(x) - arc-sine of x
c) cosh(x) - hyperbolic cosine of x
d) jdn or jdn() or jdn(fdate) or jdn(y,m,d) -
Julian Day Number of date specified
e) fract(x) - fractional portion of x
f) log10(x) - logarithm base 10
g) pi or pi() - pi
h) sinh(x) - hyperbolic sine of x
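As an illustration of two of the functions listed above, the sketch below computes a Julian Day Number from a year, month and day and extracts the fractional portion of a number. It is Python, not QTAwk. The date formula is a widely used integer algorithm for Gregorian calendar dates; whether QTAwk's 'jdn' uses the same calendar convention, and how 'fract' treats negative values, are assumptions.

```python
import math


def jdn(year, month, day):
    """Julian Day Number of a Gregorian calendar date (integer arithmetic)."""
    a = (14 - month) // 12          # 1 for January and February, else 0
    y = year + 4800 - a
    m = month + 12 * a - 3          # March becomes month 0
    return (day + (153 * m + 2) // 5 + 365 * y
            + y // 4 - y // 100 + y // 400 - 32045)


def fract(x):
    """Fractional portion of x, keeping the sign of x (an assumption)."""
    return x - math.trunc(x)


print(jdn(2000, 1, 1))  # 2451545
print(fract(3.75))      # 0.75
```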
11. New String Functions
QTAwk includes 33 built-in string functions. All of the
functions supported by Awk plus the following:
a) cal(fmt,jdn) - Julian Day Number to formatted date
b) center(s,w) or center(s,w,c) - center string
c) copies(s,n) - copies of string
d) deletec(s,p,n) - delete characters from a string
e) insert(s1,s2,p) - insert one string into another string
f) justify(a,n,w) or justify(a,n,w,c) - justify string
g) overlay(s1,s2,p) - overlay one string on another
h) remove(s,c) - remove characters from a string
i) replace(s) - replace all variables in a string
j) sdate(fmt) or sdate(fmt,fdate) or sdate(fmt,y,m,d) -
format date/time
k) srange(c1,c2) - return string formed of all characters
from c1 to c2
l) srev(s) - reverse characters of string
m) stime(fmt) or stime(fmt,ftime) or stime(fmt,h,m,s) -
format date/time
n) stran(s) or stran(s,st) or stran(s,st,sf) - translate
characters
o) strim(s) or strim(s,c) or strim(s,c,d) - trim leading
and/or trailing characters
p) strlwr(s) - translate to lower case
q) strupr(s) - translate to upper case
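A few of the simpler string functions above can be pictured with one-line Python equivalents. These are sketches based only on the one-line descriptions in this list; argument checking and edge-case behavior in QTAwk are not described here, so those details are assumptions.

```python
def copies(s, n):
    """n copies of string s, concatenated."""
    return s * n


def srev(s):
    """Characters of s in reverse order."""
    return s[::-1]


def srange(c1, c2):
    """String of all characters from c1 to c2, inclusive (assumed)."""
    return "".join(chr(code) for code in range(ord(c1), ord(c2) + 1))


def strupr(s):
    """s translated to upper case."""
    return s.upper()


print(copies("ab", 3))
print(srev("abc"))
print(srange("a", "e"))
print(strupr("qtawk"))
```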
12. New Miscellaneous Functions
a) The function 'rotate(a)' is provided to rotate the
elements of the array a.
b) execute(s) or execute(s,se) or execute(s,se,rf) - execute
string s
c) execute(a) or execute(a,se) or execute(a,se,rf) - execute
array a
d) findfile(var,pattern,attributes) - find files with
specified names and attributes
e) pd_sym - access pre-defined variables
f) ud_sym - access user defined variables
g) resetre - return QTAwk utility to start-up condition for
all regular expressions, including patterns and GROUP
patterns. Only the internal regular expression forms for
arrays are not re-initialized. The internal regular
expression forms for arrays are re-initialized whenever
the array is changed in any manner.
13. New I/O Functions
I/O function syntax has been made consistent with syntax of
other functions. The redirection operators, '<', '>' and
'>>', and pipeline operator, '|', have been deleted as
excessively error prone in expressions. The functional
syntax of the 'getline' function has been made identical to
that of the other built-in functions. The new functions
'fgetline', 'fprint' and 'fprintf' have been introduced for
reading and writing to files other than the current input
file and to replace the redirection operators.
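The dropped '>' and '>>' operators differ in whether existing file contents are discarded or kept. The same distinction appears as the truncate and append open modes of most languages. The Python sketch below (not QTAwk code; the file name is hypothetical) shows why the order of operations matters once the distinction is made by a function call rather than by an operator.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "log.txt")
with open(path, "w") as f:
    f.write("existing line\n")

# Opening for append keeps the existing contents.
with open(path, "a") as f:
    f.write("appended line\n")
with open(path) as f:
    after_append = f.read()

# Opening for plain output truncates the file to zero length first.
with open(path, "w") as f:
    f.write("fresh line\n")
with open(path) as f:
    after_truncate = f.read()

print(repr(after_append))
print(repr(after_truncate))
```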
a) Single character input functions have been added:
1: getc() - return next character from current input
file,
2: fgetc(F) - return next character from named file, F
3: putc(c) - output character c to standard output file
4: fputc(c,F) - output character c to file F
b) The dropped file re-direction operator, '>>', has been
replaced by the 'append' function:
append(F) -- Opens the file F for output to the end of
the file. All subsequent output to the file is appended
to the end of the file. This function must be called
before the first output to the file to append. Any
output to the file prior to calling this function will
open the file and discard any existing contents, i.e.,
truncate to zero length.
c) Two functions have been added to search files for one or
more regular expressions:
1: srchrecord( sp )
or srchrecord( sp , rs )
or srchrecord( sp , rs , var )
search current input file for next record containing
match to 'sp', using 'rs' as record separator (RS if
'rs' not specified), returning record found in 'var',
$0 if 'var' not specified. Update NR and FNR. Also
reparse $0 if 'var' not specified and update NF.
Returns:
a| n ==> Record Present And Read, n == Number Of
Characters In Record plus EOR length plus 1.
b| 0 ==> End-Of-File, EOF, Encountered
c| -1 ==> Read Error Occurred (Including Failure To
Open File)
2: fsrchrecord( fn , sp )
or fsrchrecord( fn , sp , rs )
or fsrchrecord( fn , sp , rs , var )
search file 'fn' for next record containing match to
'sp', using 'rs' as record separator (RS if 'rs' not
specified), returning record found in 'var', $0 if
'var' not specified. Reparse $0 if 'var' not
specified and update NF.
Returns:
a| n ==> Record Present And Read, n == Number Of
Characters In Record plus EOR length plus 1.
b| 0 ==> End-Of-File, EOF, Encountered
c| -1 ==> Read Error Occurred (Including Failure To
Open File)
d) The function 'get_FNR(F)' has been introduced. This
function returns the current record number of the input
file 'F'. This function is necessary to obtain the
current input record number for input files used with the
'fgetline' and 'fsrchrecord' functions.
14. Expanded capability of formatted Output.
The limited output formatting available with the Awk 'printf'
function has been expanded by adopting the complete output
format specification of the ANSI C standard.
15. 'local' keyword
The 'local' keyword has been introduced to allow for
variables local to user-defined functions (and any compound
statement). This expansion makes the Awk practice of
defining 'extra' formal parameters no longer necessary.
16. Expanded user-defined functions
With the 'local' keyword, QTAwk allows the user to define
functions that may accept a variable number of arguments.
Functions, such as finding the minimum/maximum of a variable
number of variables, are possible with one function rather
than defining separate functions for each possible
combination of arguments.
17. User controlled trace capability
A user controlled statement trace capability has been added.
This gives the user a simple to use mechanism to trace
utility execution. Rather than adding 'print' statements,
merely re-defining the value of a built-in variable will give
utility execution trace information, including utility line
number.
18. Expanded built-in variable list
With 57 built-in variables, QTAwk includes all of the
built-in variables of Awk plus the following: (Note: the
definition and use of SUBSEP has been changed from that in
Awk)
a) _arg_chk - used to determine whether to check number of
arguments passed to user-defined functions.
b) ARGI - index value in ARGV of next command line
argument. Gives more control of command line argument
processing.
c) CONVFMT - used for converting floating point numbers to
strings. OFMT is used only for output of floating point
numbers.
d) CLENGTH - similar to 'RLENGTH' of Awk. Set whenever a
'case' value evaluates to a regular expression.
e) CSTART - similar to 'RSTART' of Awk. Set whenever a
'case' value evaluates to a regular expression.
f) CYCLE_COUNT - count number of outer loop cycles with
current input record.
g) DEGREES - if TRUE, trigonometric functions assume degree
values, radians if FALSE.
h) ENVIRON - one dimensional array with elements equal to
the environment strings passed to QTAwk
i) ECHO_INPUT - controls echo of standard input file to
standard output file.
j) FALSE - predefined with constant value, 0.
k) FIELDFILL - string value used for filling fixed length
fields when fields changed.
l) FIELDWIDTHS - can be assigned a value for fixed width
fields, over-riding the use of FS for splitting current
record into fields.
m) FILEATTR -- file attributes of current input file.
n) FILEDATE -- date in DOS format of current input file.
o) FILETIME -- time in DOS format of current input file.
p) FILEDATE_CREATE -- creation date in operating system
format of current input file.
q) FILETIME_CREATE -- creation time in operating system
format of current input file.
r) FILEDATE_LACCESS -- last access date in
operating system format of current input file.
s) FILETIME_LACCESS -- last access time in operating system
format of current input file.
t) FILESIZE -- size in bytes of current input file.
u) FILE_SORT -- string value to define sort order of array
returned by "findfile" function.
v) FILE_SEARCH -- TRUE/FALSE value to search current input
file for record(s) containing match to regular
expression(s) in FILE_SEARCH_PAT. Default value FALSE.
w) FILE_SEARCH_PAT -- contains one or more patterns for
searching current input file.
x) FS allowed to be an array. If FS is an array, multiple
patterns may be set for field separators.
y) Gregorian -- TRUE/FALSE value to distinguish using
Gregorian or Julian calendar in computing Julian Day
Number or converting back to calendar date.
z) IGNORECASE - if assigned a true value, QTAwk ignores
case in all string and regular expression match
operations.
aa) LONGEST_EXP - used to control whether the longest or
the first string matching a regular expression is found.
ab) MATCH_INDEX - assigned the string value of the matching
array element when an array used for regular expression
match.
ac) MAX_CYCLE - maximum number of outer loop cycles
permitted with current input record.
ad) MLENGTH - similar to 'RLENGTH' of Awk. Set whenever a
stand-alone regular expression is encountered in
evaluating a pattern.
ae) MSTART - similar to 'RSTART' of Awk. Set whenever a
stand-alone regular expression is encountered in
evaluating a pattern.
af) NF - if value changed, current input record changed to
reflect new value.
ag) NG - equal to the number of the regular expression in a
GROUP matching a string in the current input record.
ah) OFMT - string value used only as format for output of
floating point numbers.
ai) RECLEN - if assigned a non-zero numeric value, integral
value used for length of fixed length records. RS not
used unless RECLEN has a zero numeric value.
aj) RETAIN_FS - if TRUE the original characters separating
the fields of the current input record are retained
whenever a field is changed, causing the input record to
be re-constructed. If FALSE the output field separator,
OFS, is used to separate fields in the current input
record during reconstruction. The latter practice is the
only method available in Awk.
ak) RS allowed to be an array. If RS is an array, and
RECLEN has a zero numeric value, multiple patterns may be
set for record separators.
al) RT - automatically assigned string value of record
terminator for current input record.
am) SUBSEP -- string value used as the array element index
separator in MATCH_INDEX.
an) SPAN_RECORDS -- TRUE/FALSE, default value FALSE. If
TRUE allows matches to FILE_SEARCH_PAT to span multiple
input records and return multiple records in $0. If
FALSE, matches confined to a single record. Also
controls matches spanning records in 'srchrecord' and
'fsrchrecord' functions.
ao) TRACE - value used to determine utility tracing.
ap) TRANS_FROM/TRANS_TO - strings used by 'stran' function
if second and/or third arguments not specified.
aq) TRUE - predefined with constant value, 1
ar) QTAwk_Path - initialized from 'QTAWK' environment
variable. Sets paths searched for input files.
as) vargc - used only in user-defined functions defined
with a variable number of arguments. At run-time, set
equal to the actual number of variable arguments passed.
at) vargv - used only in user-defined functions defined
with a variable number of arguments. At run-time, a
single-dimensioned array with each element set to the
argument actually passed.
19. New command line options available:
a) -ffilename ==> multiple utility files may be specified.
In addition, the file directive:
#include "filename"
may be used to include other files.
b) -vvar=value ==> sets 'var' to value before any "BEGIN"
actions executed
c) -Wd - delays parsing of input record until any fields or
the NF variable referenced.
d) -Wwfilename - writes internal form of utility to the
file "filename". The file may then be specified with the
'f' option. Speeds initial reading of utility files.
20. Definition of built-in variable, RS, expanded. When value
assigned to RS, it is converted to regular expression form.
Strings matching regular expression act as record separator.
Similar in behavior to field separator, FS. If an array,
multiple record separator patterns may be specified.
21. In QTAwk, setting the built-in variable "FILENAME" to another
value will change the current input file. Setting the
variable in Awk has no effect on the current input file.
22. In QTAwk, setting built-in variable, NF to another value
will change the current contents of $0. If the new value is
greater than the current value, the current input line is
lengthened with new empty fields separated by the output
field separator strings, OFS. If the new value is less than
the current value, then $0 is shortened by truncating at the
end of the field corresponding to the new NF value.
23. The Tag Operator, $$, may be used in a manner similar to
field operator, $. The Tag operator may be used to obtain or
to set a particular part of the string matching the regular
expression pattern.
24. The return value of the 'getline' function has been changed
when a valid record has been read. The return value is the
length of the record plus the length of the End-Of-Record
plus 1.
25. Corrected admitted problems with Awk. The problems
mentioned on page 182 of "The Awk Programming Language" have
been corrected. Specifically:
a) true multidimensional arrays have been implemented,
b) the 'getline' syntax has been made to match that of
other functions,
c) declaring local variables in user-defined functions has
been corrected,
d) intervening blanks are allowed between the function call
name and the opening parenthesis (in fact, under QTAwk it
is permissible to have no opening parenthesis or argument
list for user-defined functions that have been defined
with no formal arguments).
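The effect of assigning a new value to NF, described in item 22
above, can be modelled outside QTAwk. The following Python sketch
is an illustration only and is not QTAwk code. The function name
set_nf is hypothetical, fields are assumed to be runs of non-blank
characters (a simplification of FS handling), and a new value of
zero is assumed to leave an empty record.

```python
import re


def set_nf(record, new_nf, ofs=" "):
    """Model the change to $0 when NF is assigned a new value.

    Assumption: fields are runs of non-blank characters.
    """
    # Start and end offsets of every field in the record.
    spans = [m.span() for m in re.finditer(r"\S+", record)]
    nf = len(spans)
    if new_nf > nf:
        # Lengthen: each new empty field is preceded by one OFS string.
        return record + ofs * (new_nf - nf)
    if new_nf < nf:
        if new_nf <= 0:
            return ""  # assumption: no fields left means an empty record
        # Shorten: truncate at the end of field number new_nf.
        return record[:spans[new_nf - 1][1]]
    return record


print(repr(set_nf("a b c", 5, "-")))   # 'a b c--'
print(repr(set_nf("a  b c", 2)))       # 'a  b'
```

Note that the shortened record keeps the original separator
characters between the surviving fields, since the record is simply
truncated rather than rebuilt.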
21.0 Appendix iv
The following QTAwk utility is designed to search C source code
files for keywords defined in the ANSI C standard. It is
included here to illustrate the use of the 'GROUP' keyword.
# QTAwk utility to scan C source files for keywords
# defined in the ANSI C standard keywords:
# macro or function names defined in the standard
# types or constants defined in the standard
#
# program to illustrate GROUP pattern keyword
#
# input: C source file
# output: all lines containing ANSI C standard defined keywords
#
# use 'GROUP' pattern keyword to form one large GROUP of
# patterns to speed search. Only two actions defined:
# 1) action to print macro or function names
# 2) action to print types or constants
#
#
BEGIN {
#
# ANSI C key words
#
# expression for leader
ldr = /(^|[\s\t])/;
# opening function parenthesis - look-ahead to find
o_p = /@[\s\t]*\(/;
#
# define strings for formatted output
#
tls = "Total Lines Scanned: %lu\n";
tlc = "Total Lines containing macro/function names: %lu\n";
tlt = "Total Lines containing type/constant names: %lu\n";
}
#
#
# Following are macro or function names as defined
# by ANSI C standard
#
# 1
GROUP /{ldr}assert{o_p}/
# 2
# Following regular expression split across 2 lines
# for documentation only.
GROUP /{ldr}is(al(num|pha)|cntrl|x?digit|graph|
p(rint|unct)|space|(low|upp)er){o_p}/
# 3
GROUP /{ldr}to(low|upp)er{o_p}/
# 4
GROUP /{ldr}set(locale|v?buf){o_p}/
# 5
GROUP /{ldr}a(cos|sin|tan2?|bort){o_p}/
# 6
GROUP /{ldr}(cos|sin|tan)h?{o_p}/
# 7
GROUP /{ldr}(fr|ld)?exp{o_p}/
# 8
GROUP /{ldr}log(10)?{o_p}/
# 9
GROUP /{ldr}modf{o_p}/
# 10
GROUP /{ldr}pow{o_p}/
# 11
GROUP /{ldr}sqrt{o_p}/
# 12
GROUP /{ldr}ceil{o_p}/
# 13
GROUP /{ldr}(f|l)?abs{o_p}/
# 14
GROUP /{ldr}f(loor|mod){o_p}/
# 15
GROUP /{ldr}jmp_buf{o_p}/
# 16
GROUP /{ldr}(set|long)jmp{o_p}/
# 17
GROUP /{ldr}signal{o_p}/
# 18
GROUP /{ldr}raise{o_p}/
# 19
GROUP /{ldr}va_(arg|end|list|start){o_p}/
# 20
GROUP /{ldr}re(move|name|wind){o_p}/
# 21
GROUP /{ldr}tmp(file|nam){o_p}/
# 22
GROUP /{ldr}(v?[fs])?printf{o_p}/
# 23
GROUP /{ldr}[fs]?scanf{o_p}/
# 24
GROUP /{ldr}f?get(c(har)?|s|env){o_p}/
# 25
GROUP /{ldr}f?put(c(har)?|s){o_p}/
# 26
GROUP /{ldr}ungetc{o_p}/
# 27
# Following regular expression split across 2 lines
# for documentation only.
GROUP /{ldr}f(close|flush|(re)?open|read|write|
[gs]etpos|seek|tell|eof|ree|pos_t){o_p}/
# 28
GROUP /{ldr}clearerr{o_p}/
# 29
GROUP /{ldr}[fp]error{o_p}/
# 30
GROUP /{ldr}ato[fil]{o_p}/
# 31
# Following regular expression split across 2 lines
# for documentation only.
GROUP /{ldr}str(to(d|k|u?l)|n?c(py|at|mp)|
coll|r?chr|c?spn|pbrk|str|error|len){o_p}/
# 32
GROUP /{ldr}s?rand{o_p}/
# 33
GROUP /{ldr}(c|m|re)?alloc{o_p}/
# 34
GROUP /{ldr}_?exit{o_p}/
# 35
GROUP /{ldr}(f|mk|asc|c|gm|local|strf)?time{o_p}/ {
printf("Macro/function\n%uE - %luR: %s\n%s\n",NG,FNR,$0,$$0);
mf_count++;
}
#
# following are types or constants
#
# 36
GROUP /errno/
# 37
GROUP /NULL/
# 38
GROUP /offsetof/
# 39
GROUP /(fpos|ptrdiff|size|wchar)_t/
# 41
GROUP /NDEBUG/
# 42
GROUP /LC_(ALL|COLLATE|CTYPE|NUMERIC|TIME)/
# 43
GROUP /E(DOM|RANGE|OF)/
# 44
GROUP /HUGE_VAL/
# 45
GROUP /sig_atomic_t/
# 46
GROUP /SIG(_(DFL|ERR|IGN)|ABRT|FPE|ILL|INT|SEGV|TERM)/
# 47
GROUP /FILE/
# 48
GROUP /_IO[FLN]BF/
# 49
GROUP /BUFSIZ/
# 50
GROUP /L_tmpnam/
# 51
GROUP /(OPEN|RAND|TMP|U(CHAR|INT|LONG|SHRT))_MAX/
# 52
GROUP /SEEK_(CUR|END|SET)/
# 53
GROUP /std(err|in|out)/
# 54
GROUP /l?div_t/
# 55
GROUP /CLK_TCK/
# 56
GROUP /(clock|time)_t/
# 57
GROUP /tm_(sec|min|hour|[mwy]day|mon|year|isdst)/
# 58
GROUP /CHAR_(BIT|M(AX|IN))/
# 59
GROUP /(INT|LONG|S(CHAR|HRT))_(M(IN|AX))/
# 60
GROUP /(L?DBL|FLT)_((MANT_)?DIG|EPSILON|M(AX|IN)(_(10_)?EXP)?)/
# 61
GROUP /FLT_R(ADIX|OUNDS)/ {
printf("type/constant\n%uE - %luR: %s\n%s\n",NG,FNR,$0,$$0);
tc_count++;
}
FINAL {
printf(tls,FNR);
printf(tlc,mf_count);
printf(tlt,tc_count);
}
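The idea behind the utility above, combining many patterns into one
search and then reporting which pattern matched (the role played by
NG), can be approximated in other regular expression engines. The
Python sketch below is an analogy only, not a description of how
QTAwk implements GROUP. The pattern list is a hypothetical,
much-reduced subset of the groups above, the leader expression is
omitted, and the names PATTERNS, COMBINED and first_match are
invented for the illustration.

```python
import re

# Hypothetical subset of the GROUP patterns above. The look-ahead
# (?=...) mimics the 'o_p' expression: the opening parenthesis must
# follow the name but is not part of the matched text.
PATTERNS = [
    r"to(low|upp)er(?=[ \t]*\()",
    r"s?rand(?=[ \t]*\()",
    r"(fpos|ptrdiff|size|wchar)_t",
]

# One combined expression. Each alternative is wrapped in a named
# group g1, g2, ... so the matching alternative can be identified.
COMBINED = re.compile(
    "|".join(f"(?P<g{i}>{p})" for i, p in enumerate(PATTERNS, 1))
)


def first_match(line):
    """Return (pattern number, matched text) or None if nothing matches."""
    m = COMBINED.search(line)
    if m is None:
        return None
    # lastgroup is the name of the outermost group that matched, e.g. "g3".
    return int(m.lastgroup[1:]), m.group(0)


print(first_match("c = toupper (c);"))    # (1, 'toupper')
print(first_match("size_t n = rand();"))  # (3, 'size_t')
print(first_match("int x;"))              # None
```

A single pass over each line with the combined expression replaces
one pass per pattern, which is the speed benefit the utility's
header comment refers to.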
22.0 Appendix v
This is a complete copy of the data file, states.dta, used to
illustrate QTAwk. The fields of one record (the Louisiana record)
for the default field separator, FS = /{_z}+/, are shown below,
followed by the fields for the field separator
FS = /{_w}+[\#()]({_w}+|$)/.
Fields for Default FS = /{_z}+/
1. US -- country/continent name
2. # -- separator
3. 47750 -- area, square miles
4. # -- separator
5. 4515 -- population in thousands
6. # -- separator
7. LA -- abbreviation (US & Canada only)
8. # -- separator
9. Baton -- first half capital city name
10. Rouge -- second half capital city name
11. ( -- separator
12. Louisiana -- state/country name
13. ) -- Terminator
Fields for FS = /{_w}+[\#()]({_w}+|$)/:
1. US -- country/continent name
2. 47750 -- area, square miles
3. 4515 -- population in thousands
4. LA -- abbreviation (US & Canada only)
5. Baton Rouge -- full capital city name
6. Louisiana -- state/country name
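The second field list can be reproduced with an ordinary regular
expression split. The Python sketch below is an illustration under
stated assumptions: the named expression {_w} is taken to mean a
blank or tab, the function name split_fields is hypothetical, and
the empty piece left after the closing parenthesis is dropped so
that the result agrees with the six fields listed above.

```python
import re

# Assumption: {_w} stands for a blank or a tab.
# Separator: white space, one of '#', '(' or ')', then white space
# or the end of the record.
SEP = re.compile(r"[ \t]+[#()](?:[ \t]+|$)")


def split_fields(record):
    """Split a states.dta record on the '#', '(' and ')' separators."""
    fields = SEP.split(record)
    # The terminating ')' leaves an empty trailing piece; drop it.
    if fields and fields[-1] == "":
        fields.pop()
    return fields


record = "US # 47750 # 4515 # LA # Baton Rouge ( Louisiana )"
print(split_fields(record))
# ['US', '47750', '4515', 'LA', 'Baton Rouge', 'Louisiana']
```

Because blanks alone are no longer separators, "Baton Rouge" stays
together as a single field.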
US # 10461 # 4375 # MD # Annapolis ( Maryland )
US # 40763 # 5630 # VA # Richmond ( Virginia )
US # 2045 # 620 # DE # Dover ( Delaware )
US # 24236 # 1995 # WV # Charleston ( West Virginia )
US # 46047 # 12025 # PA # Harrisburg ( Pennsylvania )
US # 7787 # 7555 # NJ # Trenton ( New Jersey )
US # 52737 # 17895 # NY # Albany ( New York )
US # 9614 # 535 # VT # Montpelier ( Vermont )
US # 9278 # 975 # NH # Concord ( New Hampshire )
US # 33265 # 1165 # ME # Augusta ( Maine )
US # 8286 # 5820 # MA # Boston ( Massachusetts )
US # 5019 # 3160 # CT # Hartford ( Connecticut )
US # 1212 # 975 # RI # Providence ( Rhode Island )
US # 52669 # 6180 # NC # Raleigh ( North Carolina )
US # 31116 # 3325 # SC # Columbia ( South Carolina )
US # 58914 # 5820 # GA # Atlanta ( Georgia )
US # 51704 # 4015 # AL # Montgomery ( Alabama )
US # 42143 # 4755 # TN # Nashville ( Tennessee )
US # 40414 # 3780 # KY # Frankfort ( Kentucky )
US # 58668 # 10925 # FL # Tallahassee ( Florida )
US # 68139 # 4395 # WA # Olympia ( Washington )
US # 412582 # 8985 # OR # Salem ( Oregon )
US # 147045 # 830 # MT # Helena ( Montana )
US # 83566 # 1020 # ID # Boise ( Idaho )
US # 110562 # 945 # NV # Carson City ( Nevada )
US # 84902 # 1690 # UT # Salt Lake City ( Utah )
US # 97808 # 525 # WY # Cheyenne ( Wyoming )
US # 104094 # 3210 # CO # Denver ( Colorado )
US # 158704 # 25620 # CA # Sacramento ( California )
US # 121594 # 1425 # NM # Sante Fe ( New Mexico )
US # 114002 # 3040 # AZ # Phoenix ( Arizona )
US # 70702 # 690 # ND # Bismark ( North Dakota )
US # 77120 # 715 # SD # Pierre ( South Dakota )
US # 77350 # 1615 # NE # Lincoln ( Nebraska )
US # 82282 # 2450 # KS # Topeka ( Kansas )
US # 69697 # 5040 # MO # Jefferson City ( Missouri )
US # 69957 # 3375 # OK # Oklahoma City ( Oklahoma )
US # 266805 # 16090 # TX # Austin ( Texas )
US # 86614 # 4205 # MN # St Paul ( Minnesota )
US # 56275 # 2970 # IA # Des Moines ( Iowa )
US # 53191 # 2375 # AR # Little Rock ( Arkansas )
US # 47750 # 4515 # LA # Baton Rouge ( Louisiana )
US # 47691 # 2640 # MS # Jackson ( Mississippi )
US # 57872 # 11620 # IL # Springfield ( Illinois )
US # 66213 # 4800 # WI # Madison ( Wisconsin )
US # 97107 # 9090 # MI # Lansing ( Michigan )
US # 36417 # 5585 # IN # Indianapolis ( Indiana )
US # 44786 # 10760 # OH # Columbus ( Ohio )
US # 591004 # 515 # AK # Juneau ( Alaska )
US # 6473 # 1045 # HI # Honolulu ( Hawaii )
Canada # 255285 # 2370 # AB # Edmonton ( Alberta )
Canada # 366255 # 2885 # BC # Victoria ( British Columbia )
Canada # 251000 # 1060 # MB # Winnipeg ( Manitoba )
Canada # 251700 # 1010 # SK # Regina ( Saskatchewan )
Canada # 21425 # 875 # NS # Halifax ( Nova Scotia )
Canada # 594860 # 6585 # PQ # Quebec ( Quebec )
Canada # 2184 # 126 # PE # Charlottetown ( Prince Edward Island )
Canada # 156185 # 585 # NF # St John's ( New Foundland )
Canada # 28354 # 715 # NB # Fredericton ( New Brunswick )
Canada # 412582 # 8985 # ON # Toronto ( Ontario )
Canada # 1304903 # 51 # NW # Yellowknife ( Northwest Territories )
Canada # 186300 # 23 # YU # Whitehorse ( Yukon Territory )
Europe # 92100 # 14030 # Bonn ( West Germany )
Europe # 211208 # 55020 # Paris ( France )
Europe # 94092 # 56040 # London ( United Kingdom )
Europe # 27136 # 3595 # Dublin ( Ireland )
Europe # 194882 # 38515 # Madrid ( Spain )
Europe # 116319 # 56940 # Rome ( Italy )
Europe # 8600383 # 275590 # Moscow ( Russia )
Europe # 120728 # 37055 # Warsaw ( Poland )
Europe # 32377 # 7580 # Vienna ( Austria )
Europe # 35921 # 10675 # Budapest ( Hungary )
Many times data such as that above is provided in tabular form
so that the user can more easily identify the column of data
desired. The data is presented below in such a tabular form.
Defining the variable FIELDWIDTHS as:
FIELDWIDTHS = 8 25 6 15 9 10
would parse each line of the table as: (for the first line)
$1 = US
$2 = Maryland
$3 = MD
$4 = Annapolis
$5 = 10461
$6 = 4375
Country State                    State Capital        Area     Population
US      Maryland                 MD    Annapolis      10461    4375
US      Virginia                 VA    Richmond       40763    5630
US      Delaware                 DE    Dover          2045     620
US      West Virginia            WV    Charleston     24236    1995
US      Pennsylvania             PA    Harrisburg     46047    12025
US      New Jersey               NJ    Trenton        7787     7555
US      New York                 NY    Albany         52737    17895
US      Vermont                  VT    Montpelier     9614     535
US      New Hampshire            NH    Concord        9278     975
US      Maine                    ME    Augusta        33265    1165
US      Massachusetts            MA    Boston         8286     5820
US      Connecticut              CT    Hartford       5019     3160
US      Rhode Island             RI    Providence     1212     975
US      North Carolina           NC    Raleigh        52669    6180
US      South Carolina           SC    Columbia       31116    3325
US      Georgia                  GA    Atlanta        58914    5820
US      Alabama                  AL    Montgomery     51704    4015
US      Tennessee                TN    Nashville      42143    4755
US      Kentucky                 KY    Frankfort      40414    3780
US      Florida                  FL    Tallahassee    58668    10925
US      Washington               WA    Olympia        68139    4395
US      Oregon                   OR    Salem          412582   8985
US      Montana                  MT    Helena         147045   830
US      Idaho                    ID    Boise          83566    1020
US      Nevada                   NV    Carson City    110562   945
US      Utah                     UT    Salt Lake City 84902    1690
US      Wyoming                  WY    Cheyenne       97808    525
US      Colorado                 CO    Denver         104094   3210
US      California               CA    Sacramento     158704   25620
US      New Mexico               NM    Sante Fe       121594   1425
US      Arizona                  AZ    Phoenix        114002   3040
US      North Dakota             ND    Bismark        70702    690
US      South Dakota             SD    Pierre         77120    715
US      Nebraska                 NE    Lincoln        77350    1615
US      Kansas                   KS    Topeka         82282    2450
US      Missouri                 MO    Jefferson City 69697    5040
US      Oklahoma                 OK    Oklahoma City  69957    3375
US      Texas                    TX    Austin         266805   16090
US      Minnesota                MN    St Paul        86614    4205
US      Iowa                     IA    Des Moines     56275    2970
US      Arkansas                 AR    Little Rock    53191    2375
US      Louisiana                LA    Baton Rouge    47750    4515
US      Mississippi              MS    Jackson        47691    2640
US      Illinois                 IL    Springfield    57872    11620
US      Wisconsin                WI    Madison        66213    4800
US      Michigan                 MI    Lansing        97107    9090
US      Indiana                  IN    Indianapolis   36417    5585
US      Ohio                     OH    Columbus       44786    10760
US      Alaska                   AK    Juneau         591004   515
US      Hawaii                   HI    Honolulu       6473     1045
Canada  Alberta                  AB    Edmonton       255285   2370
Canada  British Columbia         BC    Victoria       366255   2885
Canada  Manitoba                 MB    Winnipeg       251000   1060
Canada  Saskatchewan             SK    Regina         251700   1010
Canada  Nova Scotia              NS    Halifax        21425    875
Canada  Quebec                   PQ    Quebec         594860   6585
Canada  Prince Edward Island     PE    Charlottetown  2184     126
Canada  New Foundland            NF    St John's      156185   585
Canada  New Brunswick            NB    Fredericton    28354    715
Canada  Ontario                  ON    Toronto        412582   8985
Canada  Northwest Territories    NW    Yellowknife    1304903  51
Canada  Yukon Territory          YU    Whitehorse     186300   23
Europe  West Germany                   Bonn           92100    14030
Europe  France                         Paris          211208   55020
Europe  United Kingdom                 London         94092    56040
Europe  Ireland                        Dublin         27136    3595
Europe  Spain                          Madrid         194882   38515
Europe  Italy                          Rome           116319   56940
Europe  Russia                         Moscow         8600383  275590
Europe  Poland                         Warsaw         120728   37055
Europe  Austria                        Vienna         32377    7580
Europe  Hungary                        Budapest       35921    10675
For such tabular data, using fixed width fields is often the
easiest method of splitting the record into the desired fields.
This is especially true when some fields of some records are
missing, as for the European countries.
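The way a width list such as 8 25 6 15 9 10 carves up a record can
be sketched in a few lines. The Python below is an illustration
only and is not QTAwk code: the function name split_fixed is
hypothetical, the sample lines are built here with left-justified
padding, and stripping the padding from each field is an assumption
made so the output matches the $1 to $6 values shown above.

```python
WIDTHS = [8, 25, 6, 15, 9, 10]


def split_fixed(record, widths):
    """Cut a record into consecutive fixed-width fields."""
    fields, pos = [], 0
    for w in widths:
        # Take the next w characters; strip padding (an assumption).
        fields.append(record[pos:pos + w].strip())
        pos += w
    return fields


# Sample lines padded to the widths above (hypothetical layout).
us = f"{'US':<8}{'Maryland':<25}{'MD':<6}{'Annapolis':<15}{'10461':<9}{'4375':<10}"
eu = f"{'Europe':<8}{'France':<25}{'':<6}{'Paris':<15}{'211208':<9}{'55020':<10}"

print(split_fixed(us, WIDTHS))
# ['US', 'Maryland', 'MD', 'Annapolis', '10461', '4375']
print(split_fixed(eu, WIDTHS))
# ['Europe', 'France', '', 'Paris', '211208', '55020']
```

The second line shows the advantage noted above: the missing
abbreviation yields an empty third field instead of shifting the
capital city into its place.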
23.0 Appendix vi
The Pre-Defined variables are listed in the tables below. The
first table lists the variables in the order of increasing
integer used for access to the variable via the 'pd_sym(i,j)'
function. The first column of each table lists the value passed
in 'i' to access the variable. The second column lists the
string value returned as the value of 'j'. The second table
lists the variables in alphabetical order.
Table 1 Table 2
i j i j
1 <> RLENGTH 23 <> _arg_chk
2 <> CLENGTH 24 <> ARGC
3 <> MLENGTH 25 <> ARGI
4 <> RSTART 26 <> ARGV
5 <> CSTART 2 <> CLENGTH
6 <> MSTART 50 <> CONVFMT
7 <> FALSE 5 <> CSTART
8 <> FILENAME 29 <> CYCLE_COUNT
9 <> FNR 22 <> DEGREES
10 <> FS 56 <> DELAY_INPUT_PARSE
11 <> LONGEST_EXP 32 <> ECHO_INPUT
12 <> NF 31 <> ENVIRON
13 <> NR 7 <> FALSE
14 <> OFMT 54 <> FIELDFILL
15 <> OFS 53 <> FIELDWIDTHS
16 <> ORS 48 <> FILEADATE
17 <> RETAIN_FS 49 <> FILEATIME
18 <> RS 45 <> FILEATTR
19 <> TRANS_FROM 46 <> FILECDATE
20 <> TRANS_TO 47 <> FILECTIME
21 <> TRUE 41 <> FILEDATE
22 <> DEGREES 8 <> FILENAME
23 <> _arg_chk 44 <> FILEPATH
24 <> ARGC 43 <> FILESIZE
25 <> ARGI 42 <> FILETIME
26 <> ARGV 38 <> FILE_SEARCH
27 <> TRACE 39 <> FILE_SEARCH_PAT
28 <> NG 35 <> FILE_SORT
29 <> CYCLE_COUNT 9 <> FNR
30 <> MAX_CYCLE 10 <> FS
31 <> ENVIRON 34 <> Gregorian
32 <> ECHO_INPUT 55 <> IGNORECASE
33 <> QTAwk_Path 11 <> LONGEST_EXP
34 <> Gregorian 37 <> MATCH_INDEX
35 <> FILE_SORT 30 <> MAX_CYCLE
36 <> SUBSEP 3 <> MLENGTH
37 <> MATCH_INDEX 6 <> MSTART
38 <> FILE_SEARCH 12 <> NF
39 <> FILE_SEARCH_PAT 28 <> NG
40 <> SPAN_RECORDS 13 <> NR
41 <> FILEDATE 14 <> OFMT
42 <> FILETIME 15 <> OFS
43 <> FILESIZE 16 <> ORS
44 <> FILEPATH 33 <> QTAwk_Path
45 <> FILEATTR 52 <> RECLEN
46 <> FILECDATE 17 <> RETAIN_FS
47 <> FILECTIME 1 <> RLENGTH
48 <> FILEADATE 18 <> RS
49 <> FILEATIME 4 <> RSTART
50 <> CONVFMT 51 <> RT
51 <> RT 40 <> SPAN_RECORDS
52 <> RECLEN 36 <> SUBSEP
53 <> FIELDWIDTHS 27 <> TRACE
54 <> FIELDFILL 19 <> TRANS_FROM
55 <> IGNORECASE 20 <> TRANS_TO
56 <> DELAY_INPUT_PARSE 21 <> TRUE
24.0 Appendix vii
This appendix lists the utilities included with the QTAwk
package. Each utility is listed with a short description of its
purpose and the aspects of QTAwk which may be best learned by
examining the utility.
crv.dat
input data for curvefit.exp utility.
setstd.dat
input data for calcrp.exp utility.
addcomma.exp
function to put commas in numbers passed as function
argument. Number with commas returned as a string. Adapted
from Awk, page 72. Changes allow the calling function to
specify the format for converting floating point numbers to a
string instead of defaulting to the use of OFMT. Illustrates
recursion and the use of 'local' variables, regular
expressions and the 'sub' function versus the 'gsub'
function.
alarm.exp
utility to set resident alarm clock for scheduled
appointments. Reads appointment file, finds appointments for
today, runs program to set resident alarm clock program and
displays appointments set. Automatically schedules lunch and
end of day (close of business). Utility takes no arguments,
sets scan file in BEGIN action. Illustrates the use of the
'sdate' and 'stime' functions, manipulation of ARGV array
elements and ARGC to automatically set input file, use of
'exit' keyword for incorrect invocation, use of 'system'
function to run other programs.
ansiclst.exp
QTAwk program to scan C source files for keywords defined in
the ANSI C standard.
ansicstd.exp
same as ansiclst.exp except uses GROUP patterns for searches.
ansirsv.exp
scans C source file for words defined as reserved by the ANSI
C standard
ansirsvg.exp
same as ansirsv.exp except uses GROUP patterns for searches.
apptadd.exp
apptdis.exp
calcin.exp
Calculator QTAwk Utility - Infix calculator. Uses ANSI.SYS
to highlight display. Illustrates the use of the 'execute'
function to execute input expressions and return the result,
and the use of the 'ud_sym' function to display user-defined
variables without knowing the variables' names.
calcinna.exp
same as calcin.exp without dependence on ANSI.SYS to
highlight display.
calcrp.exp
Calculator QTAwk Utility - Postfix calculator. Uses ANSI.SYS
to highlight display. Illustrates the use of the
"switch/case" keywords with regular expression case labels,
and string and character constants as 'case' labels, the use
of the match operator, '~~', user-defined functions, the
'delete' statement, and the '#include' directive.
calcrpna.exp
same as calcrp.exp without dependence on ANSI.SYS.
cdcl.exp
Decipher C Declarator Syntax. Illustrates the use of named
expressions in regular expressions, the use of the
"switch/case" statements, the use of regular expressions as
'case' statement labels.
cliche.exp
generate an endless stream of cliches, adapted from Awk, page
113. Illustrates use of 'srand' and 'rand' functions.
compat.exp
utility to scan QTAwk utility checking for use of new and/or
changed keywords and built-in functions and variables.
Illustrates use of string constants spanning more than one
line, the use of the 'split' function, the use of the 'gsub'
function.
compress.exp
utility to print input file to output compressing groups of
blank lines to a single blank line.
curvefit.exp
utility to find least squares best fit of input data to 1 of
9 equations. Illustrates the use of QTAwk as a numeric
processor of input data, the use of the 'execute' function to
execute selected equations, GROUP patterns, user-defined
functions.
dirsrch.exp
function to search directory specified for filenames
specified. Returns array with path to file found and file
sizes. Function meant to be included using the QTAwk "#include"
directive.
filename.exp
simple utility to split input lines into file path and
filename portions. Uses PC/MS-DOS path and filename
conventions. Illustrates use of character classes and named
expressions in regular expressions, use of 'sub' and 'substr'
functions.
fin.exp
financial calculator utility. Assumes use of ANSI.SYS for
highlighting display. Illustrates use of 'execute' function,
"#include" directive, use of 'ud_sym' function and
user-defined functions. Extensive documentation on financial
calculations included in utility.
finna.exp
same as fin.exp except does not assume use of ANSI.SYS.
fmtdoc.exp
document formatter. Recognizes formatting commands starting
at first character position of a record. Illustrates use of
'justify', 'split', 'copies', 'center' and 'overlay'
functions, use of the 'keyboard' file for input, GROUP
patterns, and "switch/case" statements.
fmtdoc2p.exp
same as 'fmtdoc.exp' except makes two passes over input files
to format documents.
geodh.exp
utility to convert geodetic latitude, longitude to x, y, z
position. Illustrates use of arithmetic functions.
getdir.exp
utility to find files matching a specified regular expression
pattern. Illustrates use of regular expressions and 'system'
function. Also illustrates the use of the 'execute' function
to convert a string into a regular expression to improve
matching performance. Assumes use of "4DOS" "command.com"
replacement. If "4DOS" not used, a few statements must be
modified to obtain correct reading of directory.
graph.exp
histogrm.exp
holiday.exp
jdn.exp
linenum.exp
more.exp
prdn.exp
romn.exp
screen.exp
sincos.exp
slike.exp
soundix.exp
soundx4.exp
ssfuncs.exp
state.exp
table.exp
toc.exp
wordfreq.exp
25.0 Appendix viii
QTAwk error returns. When QTAwk encounters an error which it
cannot correct, it generates and displays an error message in the
format:
1: Error (xxxx): Error Message Text
2: From 'execute' Function.
3: Action File line: llll
4: Scanning File: utility filename
5: Record: rrrr
Line 2 is generated only if the error occurred during execution
of the 'execute' function. Lines 4 to 5 are displayed only if an
input file is currently being scanned.
On a normal exit QTAwk returns a value of zero, 0, to
PC/MS-DOS. This value may be set with the 'exit' statement. On
encountering an error which generates an error message, QTAwk
exits with a nonzero value between 1 and 6. The warning messages
below will exit with a value of zero. The exit values generated
on detecting an error are:
1. Warning Errors ==> 0 , error value < 1000
2. File Errors ==> 2 , 2000 <= error value < 3000
3. Regular Expression Errors ==> 3 , 3000 <= error value < 4000
4. Run-Time Errors ==> 4 , 4000 <= error value < 5000
5. Interpretation Errors ==> 5 , 5000 <= error value < 6000
6. Memory Error ==> 6 , 6000 <= error value < 7000
The 'error value' range in the above list gives the range of the
numeric value displayed in the error message for that type of
error.
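The list above amounts to a simple mapping from the error number in
the message to the exit value. The Python sketch below restates
that mapping for use in, for example, a wrapper script that checks
results. It is an illustration only: the function name exit_code is
hypothetical, and error values outside the ranges listed above
(such as 1000 to 1999) are not documented here, so the sketch
returns None for them rather than guessing.

```python
def exit_code(error_value):
    """Map a QTAwk error number to the exit value listed above."""
    if 0 <= error_value < 1000:
        return 0  # warnings exit with a value of zero
    if 2000 <= error_value < 7000:
        # File, regular expression, run-time, interpretation and
        # memory errors: the exit value is the thousands digit.
        return error_value // 1000
    return None  # range not covered by the list above


print(exit_code(10))    # 0
print(exit_code(2030))  # 2
print(exit_code(4010))  # 4
```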
The error number displayed on line 1 may be used to find the
error diagnostic from the following listing.
1. Warning Errors
0
Invalid Option.
The only valid command line options are:
-- -> to stop command line option scanning
-f -> to specify a utility filename
-w -> to specify a utility filename for utility internal
form
-F -> to specify the input record field separator.
10
Warning, Attempt To Close File Not Open.
An attempt has been made to close a file with the 'close'
function, that is not currently open.
2. File Errors
2000
File Not Found: {filename}
The filename given in the error message was specified on
the command line, but the named file does not exist.
QTAwk displays this error message and terminates
processing.
2030
Only One Internal Form Utility File Allowed
When specifying utility files, only one utility in
internal form may be specified. In addition, internal
form utility files may not be mixed with ASCII utility
files.
3. Regular Expression Errors
3000
Stop pattern without a start
The range pattern has the form:
expression , expression
The comma, ',', is used to separate the expressions of
the pattern. The associated action is executed when the
first or start expression is TRUE. Execution continues
for every input record until, and including, the second
or stop expression is TRUE. A comma, ',', has been found
in a pattern without the first expression. This is
usually caused by imbalanced braces, "{}". Check all
prior braces to ensure that every left brace, '{', has an
associated terminating right brace, '}'.
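The behavior of a range pattern can be modeled with a short Python sketch. It is a model only, not QTAwk code, and the names are illustrative. It assumes, as in other Awk versions, that the stop expression is also tested on the record that matched the start expression.

```python
# Model of the range pattern "start , stop". The names are illustrative.
# Assumption: the stop expression is also tested on the record that
# matched the start expression.
def select_range(records, start, stop):
    selected = []
    in_range = False
    for record in records:
        if not in_range and start(record):
            in_range = True
        if in_range:
            selected.append(record)      # the action runs for this record
            if stop(record):
                in_range = False         # the stop record is included
    return selected


lines = ["a", "BEGIN x", "b", "END x", "c"]
print(select_range(lines,
                   lambda r: r.startswith("BEGIN"),
                   lambda r: r.startswith("END")))
```

Only the records from the start match through the stop match are selected; records before and after the range are skipped.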
3010
Already have a stop pattern
The range pattern has the form:
expression , expression
The comma, ',', is used to separate the expressions of
the pattern. The associated action is executed when the
first or start expression is TRUE. Execution continues
for every input record until, and including, the second
or stop expression is TRUE. A second comma, ',', has
been found in a pattern. This may be caused by the
unbalanced braces as for error number 3000 above. A
second cause may stem from the fact that new patterns for
pattern/action pairs must be separated from previous
patterns by a new-line if no action, i.e., the default
action, is associated with the previous pattern.
4. Run-Time Errors
4000
Command Line Variable Set - Not Used.
Only variables defined in the QTAwk utility may be set on
the command line with the form "variable = value".
4010
Missing Opening Parenthesis On 'while'.
The proper syntax for the 'while' statement is:
while ( conditional_expression ) statement
The left parenthesis, '(', starting the conditional
expression was not found following the 'while' keyword.
Check that the syntax conforms to the form above.
4020
Missing Opening Parenthesis On 'switch'.
The form of the 'switch' construct is:
switch ( switch_expression ) statement
The left parenthesis, '(', was not found following the
'switch' keyword.
4030
Unable to compile Regular Expression
QTAwk was unable to convert a regular expression to
internal form. Please contact the QTAwk author with
information on the circumstances of this error message.
4040
Internal Array Error.
Internal error. Please contact the QTAwk author with
information on the circumstances of this error message.
4050
pre/post '++' or '--' Needs a Variable.
The pre/post ++/-- operators operate on variables only.
This error is usually generated because of an incorrect
understanding of the precedence rules. The operator was
associated differently by QTAwk when the utility line was
parsed than the user expected. Check the precedence rules
and the syntax of the line cited.
4070
Undefined Symbol.
A symbol has been found which QTAwk does not recognize.
This error should not occur and represents an internal
error. Please contact the QTAwk author with information
on the circumstances of this error message.
4080
Internal Error #200
Internal error. Please contact the QTAwk author with
information on the circumstances of this error message.
4090
Attempt To Delete Nonexistent Array Element.
The 'delete' statement was followed with a reference to
an array element that does not exist.
4095
Attempt To Delete Array Element With 'deletea'
Statement.
The 'deletea' statement is used to delete whole arrays.
If it is desired to delete a single array element, the
'delete' statement should be used.
4100
Internal GROUP parse error 1001.
Internal error. Please contact the QTAwk author with
information on the circumstances of this error message.
4100
Warning, Attempt To Close File Not Successful.
An attempt has been made to close a file with the 'close'
function. The close action has not been successful,
usually because the file named does not exist. Check the
name specified.
4110
'strim' Function Result Exceeds Limits.
The built-in function 'strim' has been called with a
string to trim which exceeds the maximum limit of 4096
characters.
4120
Cannot Nest 'execute' Function.
The 'execute' function cannot be executed with a
string/array executed by this function. An attempt has
been made to do this. Check the string/array which was
executed.
4130
'(g)sub' Function Result Exceeds Limits.
The function 'sub' or 'gsub' has been called to replace
matching strings and the resultant string after
replacement would exceed the limit of 4096 characters.
4140
Missing ')' for Function Call.
A built-in function has been called with a left
parenthesis starting the argument list, but no right
parenthesis terminating the argument list. Check the
line in question.
4160
[sf]printf needs format string as first argument
The 'fprintf' and 'sprintf' functions need a format
string which specifies the output. The format string is
the first argument of 'sprintf' and the second argument of
'fprintf', following the output file name, and must be
specified for these functions.
4170
4180
4190
Format Specifications Exceed Arguments To Print.
The 'printf', 'fprintf' and 'sprintf' functions use a
format string to control the output. Certain character
strings in the format control the output of numerics and
embedded strings. There must be exactly one extra
argument for each of these character control strings.
This error occurs when there are more control strings
than extra arguments.
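The check behind this error can be illustrated with a Python sketch. It is not QTAwk code, and it rests on an assumption: that QTAwk format strings follow the C 'printf' conventions, in which "%%" prints a percent sign and uses no argument. Specifications that use '*' for the width or precision are not handled. The names count_specs and enough_arguments are hypothetical.

```python
import re

# Assumption: QTAwk format strings follow C 'printf' conventions, where
# "%%" prints a percent sign and consumes no argument. Specifications
# using '*' for width or precision are not handled here.
SPEC = re.compile(r"%(?:%|[-+ #0]*\d*(?:\.\d+)?[A-Za-z])")

def count_specs(fmt):
    """Count the specifications in fmt that consume an argument."""
    return sum(1 for s in SPEC.findall(fmt) if s != "%%")

def enough_arguments(fmt, *args):
    """True when there are at least as many extra arguments as specifications."""
    return count_specs(fmt) <= len(args)


print(count_specs("%5d items, %s, 100%%"))
print(enough_arguments("%d %d", 1))
```

The second call models the error condition: two specifications but only one extra argument.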
4220
Third Argument For '(g)sub' Function Must Be A Variable.
The optional third argument of the 'sub' and 'gsub'
functions must be a variable. The string value of this
variable is replaced after string substitution has been
accomplished.
4222
First Argument For 'findfile' Function Must Be A
Variable.
Findfile returns the results of a directory search in the
variable specified as the first argument of the findfile
function call. Thus, this argument must be a variable.
4223
'findfile' Function: Cannot Assign Array To Scalar
Findfile returns the results of a directory search in the
variable specified as the first argument of the findfile
function call. Since the result returned is an array and
an array cannot be assigned to the element of another
array, the first argument cannot be an array element.
4230
Excessive Length Specified 'substr' Function.
The form of the 'substr' function is: substr(s,p[,n]).
The third argument is optional, but if specified cannot
exceed 4096.
4240
Start Position Specified Too Great, 'substr' Function.
The form of the 'substr' function is: substr(s,p[,n]).
The second argument cannot exceed 4096.
4270
'rotate' Function Needs Array Member As Argument.
The argument for the 'rotate' function must be an array.
If a variable is used, make sure that it is an array when
the function is called.
4280
Excessive Width Specified 'center' Function.
The second argument specifies the width of the line in
which to center the string value of the first argument.
The width specified cannot exceed 4096.
4290
Excessive Copies Specified 'copies' Function.
The second argument of the 'copies' function specifies
the number of copies of the string value of the first
argument to return. The number of copies specified
cannot exceed 65,536. See also error number 4300 below.
4300
'copies' Function Return Exceeds Limits.
The 'copies' function returns the string value of the
first argument, copied the number of times specified by
the second argument. The total length of the returned
string:
arg2 * length(arg1)
cannot exceed 4096 characters.
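The two limits on 'copies' (errors 4290 and 4300) amount to simple arithmetic, sketched here in Python. This is a hypothetical pre-check, not a QTAwk function; the order in which QTAwk itself tests the two limits is an assumption.

```python
MAX_STRING = 4096      # result limit given for error 4300
MAX_COPIES = 65536     # count limit given for error 4290

# Hypothetical pre-check, not a QTAwk function. Assumption: the count
# limit is tested before the length of the result.
def copies_error(string, count):
    """Return the error number the call would trigger, or None if within limits."""
    if count > MAX_COPIES:
        return 4290
    if count * len(string) > MAX_STRING:
        return 4300
    return None


print(copies_error("ab", 2048))   # 2 * 2048 = 4096
print(copies_error("ab", 2049))   # 2 * 2049 = 4098
```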
4310
Excessive Characters Specified 'deletec' Function.
The 'deletec' function deletes the number of characters
specified by the third argument starting at the position
specified by the second argument from the string value of
the first argument. The form of the function is:
deletec(string,start,num)
The number of characters specified to delete, 'num',
cannot exceed 65,536. If 'num' is zero or exceeds the
number of characters remaining in the string from the
start position, then the remainder of the string is
deleted. See also error 4320 below.
4320
Excessive Characters Specified 'deletec' Function.
The 'deletec' function deletes the number of characters
specified by the third argument starting at the position
specified by the second argument from the string value of
the first argument. The form of the function is:
deletec(string,start,num)
If the start is negative or greater than the length of the
string value of the first argument, then no characters
are deleted.
4330
'deletec' Intermediate Result Exceeds Limits.
The 'deletec' function deletes the number of characters
specified by the third argument starting at the position
specified by the second argument from the string value of
the first argument. The form of the function is:
deletec(string,start,num)
If the length of the string value of the first argument
exceeds 4096 then this error is triggered.
4340
Start Position Specified Too Great, 'insert' Function.
The 'insert' function inserts the string value of the
second argument into the string value of the first
argument, starting at the position specified by the third
argument. The form of the function is:
insert(string1,string2,start)
The third argument cannot exceed 65,536. If start
exceeds the length of the string value of 'string1', then
the string value of 'string2' is concatenated onto the
string value of 'string1'.
4350
'insert' Function Intermediate Result Exceeds Limits.
The 'insert' function inserts the string value of the
second argument into the string value of the first
argument, starting at the position specified by the third
argument. The form of the function is:
insert(string1,string2,start)
The length of the string value of 'string1' cannot
exceed 4096. The result of inserting 'string2' into
'string1' also cannot exceed 4096. See error number
4360 below.
4360
'insert' Function Return Exceeds Limits.
The 'insert' function inserts the string value of the
second argument into the string value of the first
argument, starting at the position specified by the third
argument. The form of the function is:
insert(string1,string2,start)
The length of the string value of 'string1' cannot
exceed 4096. The result of inserting 'string2' into
'string1' also cannot exceed 4096. See error number
4350 above.
4370
Start Position Specified Too Great, 'overlay' Function.
The 'overlay' function overlays the string value of the
second argument into the string value of the first
argument, starting at the position specified by the third
argument. The form of the function is:
overlay(string1,string2,start)
The third argument cannot exceed 65,536. If start
exceeds the length of the string value of 'string1', then
blanks are appended to 'string1' to create a string of
length 'start'. The second string is then concatenated
to this string. See also error numbers 4380, 4390, and
4400 below.
4380
'overlay' Function Result Exceeds Limits.
The 'overlay' function overlays the string value of the
second argument into the string value of the first
argument, starting at the position specified by the third
argument. The form of the function is:
overlay(string1,string2,start)
The third argument cannot exceed 4096. If start exceeds
the length of the string value of 'string1', then blanks
are appended to 'string1' to create a string of length
'start'. The second string is then concatenated to this
string. See also error number 4370 above and 4390 and
4400 below.
4390
'overlay' Function Intermediate Result Exceeds Limits.
The 'overlay' function overlays the string value of the
second argument into the string value of the first
argument, starting at the position specified by the third
argument. The form of the function is:
overlay(string1,string2,start)
The length of the string value of 'string1' cannot
exceed 4096 characters. See also error number 4370 and
4380 above and 4400 below.
4400
'overlay' Function Result Exceeds Limits.
The 'overlay' function overlays the string value of the
second argument into the string value of the first
argument, starting at the position specified by the third
argument. The form of the function is:
overlay(string1,string2,start)
The length of the resultant string after overlaying
'string2' onto 'string1' cannot exceed 4096. See also
error numbers 4370, 4380, and 4390 above.
4410
'remove' Function Intermediate Result Exceeds Limits.
The 'remove' function removes all characters specified by
the second argument from the string value of the first
argument. The form of the function is:
remove(string,char)
The length of 'string' before any characters are removed
cannot exceed 4096.
4420
Excessive Width Specified 'justify' Function.
The 'justify' function forms a string from the elements
of the array specified by the first argument. The string
will have a length specified by the integer value of the
third argument and will be formed from the number of
array elements specified by the second argument. Any
padding characters necessary between array elements can
be specified by the optional fourth argument. The form
of the function is:
justify(array_var,count,width [,pad_char] );
The width specified cannot exceed 65,536. See also
error number 4430 below.
4430
Excessive Number Of Array Elements Specified 'justify'
Function.
The 'justify' function forms a string from the elements
of the array specified by the first argument. The string
will have a length specified by the integer value of the
third argument and will be formed from the number of
array elements specified by the second argument. Any
padding characters necessary between array elements can
be specified by the optional fourth argument. The form
of the function is:
justify(array_var,count,width [,pad_char] );
The count of array elements to use cannot exceed
65,536. See also error number 4420 above.
4440
Bad Function Call - Internal Error.
An internal error has occurred in calling a built-in
function. Please contact the QTAwk author with
information on the circumstances of this error.
4450
Missing ')' for Function Call.
A user-defined function has been called with an argument
list and no right parenthesis, ')', terminating the
argument list.
4460
More Arguments For Function Than Defined. Function:
{User_Function_Name}.
More arguments are passed to the user defined function
named in the error message than were defined for the
function. Check the user function name or the definition
of the function for necessary extra arguments.
4470
Less Arguments For User Function Than Defined. Function:
'{User_Function_Name}'.
Fewer arguments are passed to the user defined function
named in the error message than were defined for the
function. This error message is generated ONLY if the
built-in variable '_arg_chk' has a TRUE value. Variables
local to a user-defined function should be defined with
the 'local' keyword.
4480
Constant Passed For Function Array Parameter.
A parameter to a user defined function used as an array
within the function cannot be passed a constant value.
Only a variable can be passed for this parameter. If the
statement where the variable is indexed as an array is
executed, the variable will be an array upon return from
the function.
4490
Internal Error - Misalignment Of Local List ( ).
This is an internal QTAwk error. It should ideally never
happen. If this error message is generated, please
contact the QTAwk author with information on the
circumstances.
4500
Cannot Assign Array To Array Element.
Arrays can be assigned to variables. However, it is an
error to attempt to assign an array to a single element
of another array.
4510
Array Cannot Operate on Scalar.
A scalar may operate on an array, but the reverse is not
true.
4520
Assignment Operator needs a Variable on left.
The assignment operator, '=', or any of the
operator/assignment operators, 'op=', only operate on a
variable to the left of the operator.
4530
Stack Underflow
Internal stack error. Please contact the QTAwk author
with information on the circumstances of this error
message.
4540
String Type Variable Needed For Pattern String -
'findfile' Function
The second argument of the 'findfile' function, if given,
must be a string type expression.
5. Interpretation Errors
5030
Internal Error #3.
Internal error. Please contact the QTAwk author with
information on the circumstances of this error message.
5040
Internal Error #2.
Internal error. Please contact the QTAwk author with
information on the circumstances of this error message.
5050
BEGIN/END/NOMATCH/INITIAL/FINAL Patterns or User Function
Require An Action.
The predefined patterns:
BEGIN
INITIAL
NOMATCH
FINAL
END
must have actions associated with them. The brace
opening the action must be on the same line as the
predefined pattern.
5060
Exceeded Internal Stack Size on Scan.
The internal stack for containing parsed tokens has been
exceeded. Attempt to simplify the utility in the area
where this error occurred.
5070
Underflow Internal Stack on Scan.
This is an internal error. If this error occurs please
contact the QTAwk author with information on the
circumstances of this error message.
5080
Missing ')' For Function Call.
A user defined function argument list must be terminated
with a right parenthesis, ')'. A symbol has been found
which cannot be part of the argument list and is not a
right parenthesis.
5090
Function Call Without Parenthesized Argument List.
A user defined function definition must include an
argument list. The argument list may be empty, e.g.,
"()", if there are no formal arguments.
5100
'fprint' Function Takes A minimum Of 1 Argument.
The 'fprint' built-in function must have at least the
name of the output file specified.
5110
printf and 'sprintf' Functions Take A Minimum Of 1
Argument.
These functions must have at least a format string
defined.
5120
'fprintf' Function Needs A Minimum of Two Arguments.
This function needs an output file name and a format
string.
5130
Second Argument Of 'fgetline' Has To Be A Variable.
If two arguments are specified for the 'fgetline'
built-in function, the second must be a variable.
5140
Argument Of 'getline' Has To Be A Variable.
If an argument for the 'getline' function is specified,
it must be a variable.
5150
split Function Needs Variable Name As Second Argument.
The second argument for the 'split' function must
be a variable. The pieces into which the first argument
is split will be returned as array elements of the
variable specified.
5160
'rotate' Function Needs Variable As Argument.
The argument of the 'rotate' function has to be an array
variable.
5170
'justify' Function Needs Variable As First Argument.
The format of the 'justify' built-in function is:
justify(a,n,w)
or
justify(a,n,w,c)
The first argument, 'a', must be an array variable. The
first 'n' elements of the array are concatenated to form a
string 'w' characters long. A single space is used to
separate the concatenated elements. If the optional
fourth argument, 'c', is specified, it is converted to a
character value and used to separate the elements.
5180
'[pu]_sym' Function Needs Variable As Second Argument.
The second argument must be a variable whose value can be
changed to equal the string value of the name variable
specified.
5190
Bad Function Call
Internal QTAwk error. Please contact the QTAwk author
with information on the circumstances of this error.
5200
Improper Number Of Arguments, {Function_Name} Function.
The built-in function specified has been called with an
improper number of arguments for the function. Check the
user manual for the correct use of the intended
function.
5210
Need Variable On Left Side Of Assignment.
In an assignment statement of the form:
variable = expression;
a variable must be specified on the left side of the
assignment operator to receive the value of the
expression on the right side of the operator.
5220
Conditional Expression Error - Missing ':'
The form of the conditional expression is:
test_expression ? expression_1 : expression_2;
test_expression is evaluated. If the result is TRUE
(nonzero numeric or non-null string), expression_1 is
evaluated and the value becomes the value of the
conditional expression. If the value of test_expression
is FALSE (zero numeric or null string), expression_2 is
evaluated and the value becomes the value of the
conditional expression.
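The TRUE/FALSE rule and the selection of expression_1 or expression_2 can be modeled in Python as follows. This is a model of the rule as stated above, not QTAwk code, and the names is_true and conditional are illustrative.

```python
# Model of the TRUE/FALSE rule stated above; the names are illustrative.
def is_true(value):
    if isinstance(value, str):
        return value != ""       # a non-null string is TRUE
    return value != 0            # a nonzero numeric is TRUE

def conditional(test, expression_1, expression_2):
    # expression_1 and expression_2 are callables so that only the
    # selected one is evaluated.
    return expression_1() if is_true(test) else expression_2()


print(conditional(0, lambda: "yes", lambda: "no"))
print(conditional("text", lambda: "yes", lambda: "no"))
```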
5230
'in' Operator Needs Array As Right Operand
The form of the 'in' operator is:
expression in array_var
The operand to the right of 'in', array_var here, has to
be a variable. If the variable is not an array, then the
value of the expression is FALSE.
5240
Missing ')' in Expression Grouping.
An expression has been scanned with unbalanced
parentheses. Check for a missing terminating right
parenthesis.
5250
Pre-Increment/Decrement Operators Need Variable.
The increment and decrement operators, '++' and '--',
only operate on variables. An instance has been found in
which the operator has been used as a prefix operator on
something other than a variable. Check that grouping has
not changed a post-fix operator into a prefix operator.
5260
Undefined Symbol.
A symbol has been found which matches no defined QTAwk
syntax. This usually, but not always, occurs when the
terminating semi-colon, ';', has been left off a
statement.
5270
Need Variable for Array Reference
A left bracket for indexing an array has been
encountered. However, the preceding symbol was not a
variable. Only variables may be arrays and indexed.
5280
Missing Index For Array
A left bracket for indexing an array has been
encountered. However, the index expression is missing:
var[]
A null index is not allowed in QTAwk.
5290
Missing ']' Terminating array index.
A left bracket and an index expression for indexing an
array have been encountered. However, the right bracket
terminating the index expression was not recognized.
Check that the array index follows the form:
var[index_expression]
5300
Post-Increment/Decrement Operators Need Variable.
The increment and decrement operators, '++' and '--',
only operate on variables. An instance has been found in
which the operator has been used as a post-fix operator
on something other than a variable. Check that grouping
has not changed a prefix operator into a post-fix
operator.
5310
'if' Keyword - No Expression To Test.
The proper syntax for the 'if' statement is:
if ( conditional_expression ) statement
The left parenthesis, '(', starting the conditional
expression was not found following the 'if' keyword.
Check that the syntax conforms to the form above.
5320
'if' Keyword - No Terminating ')' On Test Expression.
The proper syntax for the 'if' statement is:
if ( conditional_expression ) statement
The right parenthesis, ')', terminating the conditional
expression was not found. Check that the syntax
conforms to the form above.
5330
'while' Keyword - No Terminating ')' On Test Expression.
The proper syntax for the 'while' statement is:
while ( conditional_expression ) statement
The right parenthesis, ')', terminating the conditional
expression was not found. Check that the syntax
conforms to the form above.
5340
Missing 'while' Part Of 'do'.
The proper syntax for the 'do' statement is:
do statement while ( conditional_expression );
The 'while' keyword was not found following the
statement portion. Check whether a left brace, '{',
starting a compound statement may have been deleted, or
for the possible misuse of a keyword as a variable.
5350
Missing '(' On 'while' Part Of 'do'.
The proper syntax for the 'do' statement is:
do statement while ( conditional_expression );
The left parenthesis, '(', starting the conditional
expression was not found following the 'while' keyword.
Check that the syntax conforms to the form above.
5360
Missing ')' On 'while' Part Of 'do'.
The proper syntax for the 'do' statement is:
do statement while ( conditional_expression );
The right parenthesis, ')', terminating the conditional
expression was not found. Check that the syntax
conforms to the form above.
5370
Missing ';' Terminating 'do - while'.
The proper syntax for the 'do' statement is:
do statement while ( conditional_expression );
Note the semicolon following the right parenthesis
terminating the conditional expression. The semicolon is
necessary here.
5380
Missing Opening Parenthesis On 'for'.
The proper syntax for the 'for' statement is:
for ( initial_expression ; conditional_expression ;
loop_expression )
statement
or
for ( variable_name in array_name ) statement
The left parenthesis, '(', was not found following the
'for' keyword. Check that the syntax conforms to the
form above.
5390
5400
5420
Improper Syntax - 'for' Conditional.
The proper syntax for the 'for' statement is:
for ( initial_expression ; conditional_expression ;
loop_expression )
statement
One of the semicolons separating the three expressions
or the terminating right parenthesis was not found.
Check that the syntax follows the form above.
5410
'in' Operator Needs Variable As Left Operand in 'for'
Expression.
The proper syntax for the 'for' statement is:
for ( variable_name in array_name ) statement
The symbol following the left parenthesis and preceding
the 'in' keyword must be a valid variable name.
5430
break/continue Keyword Outside Of Loop.
Either of these keywords must be used inside of a
'while', 'for' or 'do' loop. In addition, the 'break'
statement may be used inside a 'switch-case' construct to
terminate execution flow. One of the keywords has been
found outside of such a construct. Check for an
imbalance of braces, '{}', enclosing compound
statements.
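Several of the diagnostics in this appendix suggest checking for unbalanced braces. A simple checker can be sketched in Python; it is not part of QTAwk and the name check_braces is hypothetical. As a known limitation, braces inside strings, regular expressions, or comments are counted as well.

```python
# A simple brace checker for a utility file. Limitation: braces inside
# strings, regular expressions, or comments are counted as well.
def check_braces(text):
    """Return (line, message) tuples for unbalanced '{' and '}'."""
    problems = []
    open_lines = []                       # line numbers of unmatched '{'
    for number, line in enumerate(text.splitlines(), start=1):
        for ch in line:
            if ch == "{":
                open_lines.append(number)
            elif ch == "}":
                if open_lines:
                    open_lines.pop()
                else:
                    problems.append((number, "'}' without matching '{'"))
    problems.extend((n, "'{' never closed") for n in open_lines)
    return problems


utility = """BEGIN {
    while ( i < 3 ) {
        i++;
}
"""
print(check_braces(utility))
```

In the sample, the brace opened on line 1 is reported as never closed, because the only right brace is matched with the inner left brace on line 2.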
5440
'return' Statement Outside Of User Function.
The 'return' statement may only be used inside of a
user-defined function to terminate execution of the
function and cause execution to return to the place where
the function was called. The 'return' keyword was
encountered outside of the definition of such a
function. Check for the use of the keyword as a variable
or for unbalanced braces, '{}', enclosing the statements
of the function.
5450
Exceeded Limits on Number of Local Variable Definitions
(1).
QTAwk places a limit of 256 local variables within any
compound statement. An attempt has been made to define
more local variables than this limit allows.
5460
No Variables Defined With 'local' Keyword.
The form of local variable definition with the 'local'
keyword follows the form:
local var1, var2 = optional_expression;
The 'local' keyword was encountered followed immediately
by a semicolon. Check that the syntax follows the above
form.
5470
'switch' Keyword - No Terminating ')' On Expression.
The form of the 'switch' construct is:
switch ( switch_expression ) statement
The right parenthesis, ')', terminating the
switch_expression was not found.
5480
'case/default Statement Without Switch Statement.
The 'case' keyword is used within the 'switch' statement
to specify case expressions to which execution should
transfer after matching the switch expression. A 'case'
keyword was found outside of the 'switch' statement.
Check for the use of the keyword as a variable or for
unbalanced braces enclosing a compound 'switch'
statement.
5490
Multiple 'default' Statements in 'switch'.
The 'default' keyword is used within a 'switch' statement
to specify a transfer point at which execution should proceed
when the switch_expression fails to match any
case_expression. Only one 'default' transfer point is
allowed per 'switch' statement. Check for possible
unbalanced braces, '{}', enclosing a compound statement
in previous 'case' statements.
5500
Missing ':' Following Expression On Case Label.
The form of the 'case' statement is:
case case_expression:
A colon, ':', must terminate the case expression. QTAwk
did not find the terminating colon.
5510
Need Variable For 'delete' Reference
The form of the 'delete' statement is:
delete variable_name;
or
delete (variable_name);
or
delete variable_name[index];
or
delete (variable_name[index]);
where variable must be a global or local variable.
5520
'deletea' Statement Variable Cannot Be Indexed.
The form of the 'deletea' statement is:
deletea variable_name;
or
deletea (variable_name);
where variable must be a global or local variable and
cannot be indexed.
5530
Need Variable For 'deletea' Reference
The form of the 'deletea' statement is:
deletea variable_name;
or
deletea (variable_name);
where variable must be a global or local variable.
5540
No ';' Terminating Statement.
All statements in QTAwk are terminated by a semicolon.
The terminating semicolon was not found by QTAwk.
5550
Internal Compilation Error - Action Strings.
This is a QTAwk internal error that should never
happen. If this error message is encountered, please
contact the QTAwk author with information on the
circumstances of this error.
5560
Error On Single Line Action. No Termination.
In parsing/compiling an action entered from the command
line or by executing the 'execute' built-in function, the
end of the line was reached without reaching the end of
the action expression(s). This is typically caused by a
missing right brace, '}' (or unbalanced braces - more left
braces than right braces).
5570
Too Many User Functions Defined.
QTAwk currently has a limit of 256 user defined
functions. The current utility has attempted to define
more than that limit. Please contact the QTAwk author
with information on the circumstances of this error
message.
5580
Exceeded Limits on Number of Local Variable Definitions
(2).
QTAwk currently has a limit of 256 'local' variables
defined within any single compound statement. The
current utility has attempted to define more than that
limit. Please contact the QTAwk author with information
on the circumstances of this error message.
5590
Expecting Function Name To Follow 'function' Keyword In
Pattern.
The 'function' keyword has been encountered in a pattern
without a function name immediately following. This
syntax error may be corrected by inserting the missing
name or by removing the function keyword from the
pattern.
5600
Multi-Defined Function Name.
The name supplied for a user defined function has been
used previously. The current usage attempts to redefine
the name. Change either the first use of the name or the
present.
5610
Unexpected Symbol - Function Argument List Definition.
A user defined function has been encountered with the
accompanying list defining the passed argument names.
The form of the list is a variable name followed by 1) a
comma and more names, 2) an ellipsis, '...', followed by a
right parenthesis, or 3) a right parenthesis ending the
list. A symbol other than a comma or right parenthesis
has been found following a variable name.
5620
Expecting ')' To Terminate Function Parameter List.
A user defined function has been encountered with the
accompanying list defining the passed argument names.
The form of the list is a variable name followed by 1) a
comma and more names, 2) an ellipsis, '...', followed by a
right parenthesis, or 3) a right parenthesis ending the
list. A symbol has been found other than the right
parenthesis following the ellipsis.
5630
Unexpected Symbol - Function Argument List Definition.
A user defined function has been encountered with the
accompanying list defining the passed argument names.
The form of the list is a variable name followed by 1) a
comma and more names, 2) an ellipsis, '...', followed by a
right parenthesis, or 3) a right parenthesis ending the
list. A symbol other than a comma or right parenthesis
has been found following a variable name.
5640
Expecting Parenthesized Argument Definition List For
Function.
A user defined function has the following form: function
function_name ( argument_list ) The left parenthesis of
the argument list was not found.
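
The argument list form described for errors 5610 through 5640 can be
modelled outside QTAwk. The Python sketch below is an illustration
only and is not QTAwk's parser. The names tokenize, is_name and
check_param_list, the returned messages, and the choice to accept an
ellipsis at the start of the list, after a name, or after a comma are
all assumptions made for this example. Reserved-word checks (see error
5680) are not modelled.

```python
import re

def tokenize(text):
    """Split text into names, '...', and single punctuation characters."""
    return re.findall(r"[A-Za-z_]\w*|\.\.\.|\S", text)

def is_name(token):
    """True if the token looks like a variable name."""
    return re.fullmatch(r"[A-Za-z_]\w*", token) is not None

def check_param_list(text):
    """Return 'ok' or a short description of the first problem found."""
    tokens = tokenize(text)
    if not tokens or tokens[0] != "(":
        return "expecting '('"              # compare error 5640
    rest = tokens[1:]
    expect_name = True                      # true at the start and after a comma
    for index, token in enumerate(rest):
        if token == "...":
            # The ellipsis must be followed immediately by ')'.
            if rest[index + 1:index + 2] == [")"]:
                return "ok"
            return "expecting ')' after '...'"   # compare error 5620
        if expect_name:
            if token == ")" and index == 0:
                return "ok"                 # empty list: ()
            if not is_name(token):
                return "unexpected symbol"
            expect_name = False
        elif token == ")":
            return "ok"
        elif token == ",":
            expect_name = True
        else:
            return "unexpected symbol"      # compare errors 5610 and 5630
    return "expecting ')'"
```

For example, check_param_list("(a, b, c)") and check_param_list("(...)")
both return "ok", while check_param_list("(a b)") returns
"unexpected symbol".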
5650
Improper Syntax - Improper Ending For Pattern
A pattern expression must be ended by: 1) a comma (the
first expression in a range expression only), 2) the left
brace, '{', starting the associated action, 3) an
End-of-File, or 4) a new line. A symbol other than those
above has been encountered.
5670
Internal Parse Error: 1001.
Internal parser error. Please contact the QTAwk author
with information on the circumstances of this error
message.
5680
Local Variable With Reserved Name.
An attempt has been made to define a local variable in
either a user defined function argument list or with the
'local' keyword, with a name equal to a reserved word.
5690
Improper Use of Keyword.
A pattern keyword has been encountered in an action
statement.
5700
User Function Variable Argument Keyword Outside Of User
Function.
The two predefined local variables: vargc and vargv can
only be used within user defined functions which have
been defined with a variable length argument list using
the ellipsis, '...'. One of these variables has been
encountered outside of a user defined function.
5710
Variable Argument Keyword In User Function Defined
Without Variable Number Of Arguments.
The two predefined local variables: vargc and vargv can
only be used within user defined functions which have
been defined with a variable length argument list using
the ellipsis, '...'. One of these variables has been
encountered inside of a user defined function which was
not defined with a variable length argument list.
5720
Internal Error - Variable Argument List Variable.
The two predefined local variables: vargc and vargv can
only be used within user defined functions which have
been defined with a variable length argument list using
the ellipsis, '...'. One of these variables has been
previously defined as a local variable within the current
compound statement.
5730
Internal Error - Variable Argument List Variable.
The two predefined local variables: vargc and vargv can
only be used within user defined functions which have
been defined with a variable length argument list using
the ellipsis, '...'. One of these variables has been
previously defined as a global variable.
5740
Internal Parse Error: 1002.
Internal parser error. Please contact the QTAwk author
with information on the circumstances of this error
message.
5742
5744
Constant Exceeds Numerical Limits: xxxxxxxxxxxx
The constant shown (indicated by 'xxxxxxxxxxxx' in error
message), exceeds the numerical limits for this
implementation of QTAwk. Reduce the size of the numeric
constant.
5750
Internal Parse Error: 1003.
Internal parser error. Please contact the QTAwk author
with information on the circumstances of this error
message.
5760
Empty Regular Expression.
A regular expression must have some characters between
the beginning and ending slashes. A regular expression
has been encountered with none.
5770
Regular Expression - No Terminating /.
A regular expression constant must be contained on one
line and be terminated by a slash. A regular expression
has been found with no terminating slash before
encountering a new line.
5780
Internal Parse Error: 1004.
Internal parser error. Please contact the QTAwk author
with information on the circumstances of this error
message.
5785
Escape to Continue Quoted Line Not Allowed in Pattern.
Using an escape character, '\', at the end of a line
within a quoted string, to continue the string on the next
line, is not allowed within a pattern. Patterns must be
contained on a single line.
5788
Unexpected End-of-File While Scanning Utility.
QTAwk encountered the End-Of-File while reading and
interpreting a utility and before the expected end of the
utility.
5790
String Constant - No Terminating ".
A string constant must be contained on one line and be
terminated by a double quote. A string constant has been
found with no terminating double quote before
encountering a new line.
5800
Internal Parse Error: 1005.
Internal parser error. Please contact the QTAwk author
with information on the circumstances of this error
message.
5810
Character Constant - No Terminating '.
A character constant must be contained on one line and be
terminated by a single quote. A character constant has
been found with no terminating single quote before
encountering a new line.
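
Errors 5770, 5790 and 5810 share one rule: the closing delimiter must
appear before the end of the line. The Python sketch below models that
rule only and is not QTAwk's scanner. The function name
find_terminator is hypothetical, and treating a backslash as escaping
the character that follows it is an assumption of this example.

```python
def find_terminator(line, start, delimiter):
    """Return the index of the closing delimiter on this line, or -1.

    `start` is the index just past the opening delimiter. A backslash
    is assumed to escape the character after it, so an escaped
    delimiter does not end the constant.
    """
    i = start
    while i < len(line):
        ch = line[i]
        if ch == "\n":
            break                # new line reached before the delimiter
        if ch == "\\":
            i += 2               # skip the backslash and the escaped character
            continue
        if ch == delimiter:
            return i
        i += 1
    return -1
```

A result of -1 corresponds to the situation these three messages
report: a slash, double quote or single quote was opened and the line
ended before the matching one was found.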
5820
Character Constant Longer Than One Character
A character constant is a single character bounded by
single quotes as in 'A'. Escape sequences may also be
used for specifying the character for a character
constant, e.g., '\f' or '\x0c' or '\014' are three ways
to specify a single form feed character. This error
reports that an attempt has been made to use single
quotes to bound more than a single character.
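
As a cross-check on escape sequences, form feed is ASCII code 12, which
is 0C in hexadecimal and 014 in octal. The Python sketch below uses
Python's own escape rules, which follow C for these three forms;
whether QTAwk accepts exactly the same spellings is an assumption here.
The names SPELLINGS and describe are made up for this example.

```python
# Three spellings of the form feed character (ASCII code 12).
SPELLINGS = {
    "named escape": "\f",
    "hexadecimal escape": "\x0c",
    "octal escape": "\014",
}

def describe(spellings):
    """Map each label to (length, character code) of its string."""
    return {label: (len(text), ord(text)) for label, text in spellings.items()}

if __name__ == "__main__":
    for label, (length, code) in describe(SPELLINGS).items():
        print(label, "length", length, "code", code)
```

Each spelling yields a string of length 1 with character code 12, so
each one would fit inside a single character constant.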
5830
Lexical Error - Illegal '.'
Periods are used only in floating point numerics, e.g.,
0.88 or .33, or in user defined function definitions to
indicate a variable number of arguments, e.g.,
function max(...) {
A period has been found which does not match either of
these uses.
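
The two legal uses of a period can be modelled with a small check. The
Python sketch below is an illustration only, not QTAwk's lexical
analyzer. The name illegal_periods and the pattern LEGAL are
hypothetical, accepting a form such as "5." as a number is an
assumption, and periods inside string or regular expression constants
are not treated specially.

```python
import re

# A period is covered if it belongs to a number or to the ellipsis '...'.
LEGAL = re.compile(r"\d+\.\d*|\.\d+|\.\.\.")

def illegal_periods(text):
    """Return the offsets of periods not covered by a number or '...'."""
    covered = set()
    for match in LEGAL.finditer(text):
        covered.update(range(match.start(), match.end()))
    return [i for i, ch in enumerate(text) if ch == "." and i not in covered]
```

Under these assumptions, illegal_periods("function max(...) {") and
illegal_periods("x = 0.88 + .33") both return an empty list, while
illegal_periods("a.b") returns [1].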
5840
Lexical Error
A character has been read which does not fit any syntax
for a valid utility.
5850
Exceeded Max. Limits On Number Of Variables.
A maximum of 256 global variables may be defined in any
single QTAwk utility.
6. Memory Errors
6000
Out of Memory (n: , s: )
The QTAwk utility has used all available memory and
attempted to exceed that limit. It is recommended that
the utility be made shorter, or split into multiple
utilities run separately.
6010
Insufficient Memory.
The QTAwk utility has used all available memory and
attempted to exceed that limit. It is recommended that
the utility be made shorter, or split into multiple
utilities run separately.
6030
Action Too Long
An action has been defined which exceeds the limits set
for the internal length. The maximum length for the
internal form of any action is 409,600 characters.
6040
6050
6060
Out of Memory
The QTAwk utility has used all available memory and
attempted to exceed that limit. It is recommended that
the utility be made shorter, or split into multiple
utilities run separately.
6070
Stack Overflow
QTAwk maintains an internal stack for intermediate
results. The current utility has generated too many
intermediate results for this stack. Simplify
expressions.
7. Regular Expression Errors
7000
Unbalanced Parenthesis in Regular Expression.
Sequence Number: ssssss, Expression Number: eeeeee
Parentheses in a regular expression are unbalanced, i.e.,
there are more opening left parentheses, '(', than
closing right parentheses, ')'.
Table of Contents
QTAwk License ............................................... iii
QTAwk 6.00 PC/MS_DOS Order Form ............................... v
QTAwk 1.00 OS/2 Order Form .................................... v
== Order Information ......................................... vi
== International Orders: ..................................... vi
== Company Purchase Orders: .................................. vi
== Multi-System Licenses: .................................... vi
1.0 Introduction .......................................... xxiii
2.0 Getting Started ......................................... 2-1
2.1 Running QTAwk ........................................... 2-2
3.0 Regular Expressions ..................................... 3-1
3.1 'OR' Operator ........................................... 3-2
3.2 Character Classes ....................................... 3-2
3.3 Closure ................................................. 3-4
3.4 Repetition Operator ..................................... 3-6
3.5 Escape Sequences ........................................ 3-8
3.6 Position Operators ...................................... 3-9
3.7 Examples ............................................... 3-10
3.8 Look Ahead Operator .................................... 3-11
3.9 Match Classes .......................................... 3-11
3.10 Named Expressions ..................................... 3-12
3.11 Predefined Names ...................................... 3-14
3.12 Tagged Strings ........................................ 3-15
3.13 Regular Expression Operator Summary ................... 3-19
4.0 Variables ............................................... 4-1
5.0 Built-in Variables ...................................... 5-1
6.0 Expressions ............................................. 6-1
6.1 Operators ............................................... 6-1
6.2 Numeric Forms and Arithmetic Operations ................. 6-2
6.3 Numerics and Strings .................................... 6-4
6.3.1 Assignment Operators .................................. 6-5
6.4 Grouping Operators: () .................................. 6-5
6.5 Arithmetic Operators .................................... 6-6
6.5.1 Unary Ones Complement: ~ .............................. 6-6
6.5.2 Unary Increment/Decrement: ++ -- ...................... 6-6
6.5.3 Unary Plus/Minus: + - ................................. 6-7
6.5.4 Exponentiation: ^ ..................................... 6-7
6.5.5 Multiplicative and Additive Operators: * / % and + - .. 6-8
6.6 Bitwise And, Or and Xor: & | @ .......................... 6-8
6.7 Subscripting: [] ........................................ 6-8
6.8 Shift Operators: << >> .................................. 6-9
6.9 String Concatenation Operator: ∩ ........................ 6-9
6.10 Field operator: $ ..................................... 6-11
6.11 Tagged String Operator: $$ ............................ 6-12
6.12 Logical Operators: && || .............................. 6-15
6.13 Comparison Operators: < <= > >= == ~= ~~ !~ ........... 6-17
6.13.1 Match Operator Variables ............................ 6-18
6.14 Conditional Operator: ? : ............................. 6-18
6.15 Logical Negation: ! ................................... 6-19
6.16 Array Membership: in .................................. 6-19
6.17 Sequence Operator: , .................................. 6-19
6.18 White Space ........................................... 6-20
6.19 Constants ............................................. 6-21
6.19.1 Numeric Constants ................................... 6-21
6.19.2 Character Constants ................................. 6-22
6.19.3 String Constants .................................... 6-22
6.19.4 Regular Expression Constants ........................ 6-23
7.0 Arrays .................................................. 7-1
7.1 Multidimensional Arrays ................................. 7-1
7.2 Integer and String Indices .............................. 7-1
7.3 QTAwk Arrays in Arithmetic Expressions .................. 7-2
7.4 Arrays as Regular Expressions ........................... 7-6
8.0 Strings, Regular Expressions and Arrays ................. 8-1
8.1 Regular Expression and String Translation ............... 8-1
8.2 Regular Expressions in Patterns ......................... 8-1
9.0 Pattern-Actions ......................................... 9-1
9.1 QTAwk Patterns .......................................... 9-1
9.2 QTAwk Predefined Patterns ............................... 9-3
10.0 Group Patterns ........................................ 10-1
10.1 GROUP Pattern Advantage ............................... 10-1
10.2 GROUP Pattern Disadvantage ............................ 10-1
10.3 GROUP Pattern Regular Expressions ..................... 10-2
11.0 Statements ............................................ 11-1
11.1 QTAwk Keywords ........................................ 11-1
11.2 Statements ............................................ 11-1
11.3 'cycle' and 'next' .................................... 11-2
11.4 'delete' and 'deletea' ................................ 11-4
11.5 'if'/'else' ........................................... 11-6
11.6 'switch', 'case', 'default' ........................... 11-6
11.7 Loops ................................................. 11-8
11.7.1 'while' ............................................. 11-8
11.7.2 'for' ............................................... 11-8
11.7.3 'do'/'while' ........................................ 11-9
11.8 'local' ............................................... 11-9
11.9 'endfile' ............................................ 11-10
11.10 'break' ............................................. 11-10
11.11 'continue' .......................................... 11-10
11.12 'exit opt_expr_list' ................................ 11-10
11.13 'return opt_expr_list' .............................. 11-11
12.0 Built-in Functions .................................... 12-1
12.1 Arithmetic Functions .................................. 12-1
12.2 String Functions ...................................... 12-3
12.3 File Functions ....................................... 12-14
12.3.1 Input Functions .................................... 12-15
12.3.2 Output Functions ................................... 12-18
12.3.3 Miscellaneous File Functions ....................... 12-19
12.3.4 Standard Files ..................................... 12-21
12.4 Miscellaneous Functions .............................. 12-23
12.4.1 Expression Type .................................... 12-23
12.4.2 Execute String ..................................... 12-23
12.4.3 Array Function ..................................... 12-26
12.4.4 System Control Function ............................ 12-26
12.4.5 Variable Access .................................... 12-26
12.4.6 File Search Function ............................... 12-29
12.4.7 Re-Set Regular Expressions ......................... 12-32
13.0 User-Defined Functions ................................ 13-1
13.1 Local Variables ....................................... 13-1
13.2 Argument Checking ..................................... 13-1
13.3 Variable Length Argument Lists ........................ 13-2
13.4 Null Argument List .................................... 13-3
13.5 Arrays and User-Defined Functions ..................... 13-3
14.0 Format Specification .................................. 14-1
14.1 Output Types .......................................... 14-2
14.2 Output Flags .......................................... 14-3
14.3 Output Width .......................................... 14-4
14.4 Output Precision ...................................... 14-4
15.0 Trace Statements ...................................... 15-1
15.1 Selective Statement Tracing ........................... 15-1
15.2 Trace Output .......................................... 15-1
16.0 Invoking QTAwk ........................................ 16-1
16.1 Multiple QTAwk Utilities .............................. 16-1
16.1.1 #include Directive .................................. 16-2
16.2 Command Line Options .................................. 16-3
16.3 File Search Sequence .................................. 16-3
16.4 Setting the Field Separator ........................... 16-4
16.5 Command Line Variables ................................ 16-5
16.6 QTAwk Execution Sequence .............................. 16-5
17.0 QTAwk Limits .......................................... 17-1
18.0 Appendix i ............................................ 18-1
19.0 Appendix ii ........................................... 19-1
20.0 Appendix iii .......................................... 20-1
21.0 Appendix iv ........................................... 21-1
22.0 Appendix v ............................................ 22-1
23.0 Appendix vi ........................................... 23-1
24.0 Appendix vii .......................................... 24-1
25.0 Appendix viii ......................................... 25-1