- Newsgroups: comp.unix.questions
- Path: sparky!uunet!zaphod.mps.ohio-state.edu!news.acns.nwu.edu!casbah.acns.nwu.edu!navarra
- From: navarra@casbah.acns.nwu.edu (John Navarra)
- Subject: Associative arrays in [gn]awk --whynot! (long but interesting)
- Message-ID: <1992Aug13.063348.11691@news.acns.nwu.edu>
- Sender: usenet@news.acns.nwu.edu (Usenet on news.acns)
- Organization: Northwestern University, Evanston Illinois.
- Date: Thu, 13 Aug 1992 06:33:48 GMT
- Lines: 203
-
- Well, after my previous posting on [gn]awk associative arrays I decided
- to spend a little time investigating them. Here is what I came up with.
- I ran the following little awk script on a few files:
- {
-     array[NR] = $1
- }
- END {
-     for (item in array)
-         printf("%s\n", array[item])
- }
-
-
- I ran it on files with 10, 11, 12, 15, and 21 records, one record per
- line. Here are the results from the files with just numbers in them.
- You can replace the numbers with anything, because none of the programs
- sorts the array before printing it. We see some interesting behavior
- from both awk and nawk. Gawk works as expected -- printing the array out
- in the order it was read in. However, gawk did not work this way when
- more than one field was present in the file (more on that below).
-
- awk nawk gawk
- =====================10 file===============
- 2 2 1
- 3 3 2
- 4 4 3
- 5 5 4
- 6 6 5
- 7 7 6
- 8 8 7
- 9 9 8
- 10 10 9
- 1 1 10
- =====================11 file===============
- 2 2 1
- 3 3 2
- 4 4 3
- 5 5 4
- 6 6 5
- 7 7 6
- 8 8 7
- 9 9 8
- 10 10 9
- 11 11 10
- 1 1 11
- =====================12 file===============
- 2 2 1
- 3 3 2
- 4 4 3
- 5 5 4
- 6 6 5
- 7 7 6
- 8 8 7
- 9 9 8
- 10 10 9
- 11 11 10
- 12 12 11
- 1 1 12
- =====================15 file===============
- 13 2 1
- 2 3 2
- 14 4 3
- 3 5 4
- 15 6 5
- 4 7 6
- 5 8 7
- 6 9 8
- 7 10 9
- 8 11 10
- 9 12 11
- 10 13 12
- 11 14 13
- 12 15 14
- 1 1 15
- =====================21 file===============
- 13 2 1
- 2 3 2
- 14 4 3
- 3 5 4
- 15 6 5
- 4 7 6
- 16 8 7
- 5 9 8
- 17 10 9
- 6 11 10
- 18 12 11
- 7 13 12
- 19 14 13
- 8 15 14
- 9 16 15
- 10 17 16
- 20 18 17
- 11 19 18
- 21 20 19
- 12 21 20
- 1 1 21
-
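The orderings above come from the for ( item in array ) loop, whose traversal order awk leaves up to the implementation. Because this script uses NR as the subscript, one way to get the lines back in input order is to count from 1 to NR instead of using for-in. A minimal sketch, assuming a POSIX-style awk and mktemp are available; the file name nums.txt and the shell variable out are made up for illustration:

```shell
# Build a small sample file with one number per line (hypothetical name).
tmpdir=$(mktemp -d)
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 > "$tmpdir/nums.txt"

# Same load step as the script above, but the END block walks the
# numeric subscripts 1..NR itself instead of relying on for-in order.
out=$(awk '
{
    array[NR] = $1
}
END {
    for (i = 1; i <= NR; i++)
        printf("%s\n", array[i])
}' "$tmpdir/nums.txt")

echo "$out"
rm -rf "$tmpdir"
```

This only works because the subscripts are known in advance to be 1 through NR; it does not help when the subscripts are arbitrary strings.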
- ======================
- Here is a program that makes an associative array between the first and
- fifth fields of a /etc/passwd-like file.
-
- awk '
- BEGIN { FS = ":" }
-
- {
-     # make associative array from first and fifth fields
-     fullname[$1] = $5
- }
- END {
-     for (username in fullname)
-         printf("%-10s %s \n", username, fullname[username]);
- }' short.passwd
-
-
- Here is the passwd-like file:
- 1root:##root:0:1:Operator:/:/bin/csh
- 2news:##news:6:6:Mr. News:/usr/local/lib/news:/bin/sh
- 3games:##games:11:10:Mr. Games:/usr/games:/bin/false
- 4ftp:##ftp:12:10:Mr. ftp:/usr/spool/ftp:/bin/false
- 5usenet:##usenet:20:12:Mr. Usenet:/var/spool/news:/bin/true
- 6server:*:21:10:Mr. Listserv:/usr/server:/bin/tcsh
- 7accman:##accman:284:30:Account Manager:/usr/local/lib/cas:/usr/local/bin/cas
- 8nims:##nims:336:10:Christopher Nims:/home/u1/nims:/bin/tcsh
- 9matt:##matt:337:10:Matthew Larson:/home/u1/matt:/bin/tcsh
- 10mccoy:##mccoy:339:10:James McCoy:/home/u1/mccoy:/bin/tcsh
- 11pib:##pib:340:20:Philip Burns:/home/u1/pib:/bin/tcsh
- 12jln:##jln:341:20:John Norstad:/home/u1/jln:/bin/csh
- 13len:##len:343:20:Leonard Evens:/home/u1/len:/bin/csh
- 14dave:##dave:345:20:Dave Wenger:/home/u2/dave:/bin/tcsh
- 15navarra:##navarra:453:20:John Navarra:/home/u3/navarra:/bin/bash
-
- Notice that I numbered the entries so you can see how out of order
- they are with awk, nawk, and gawk versions:
-
- awk:                          results with just 15 numbers (from above)
- 1root      Operator           13
- 11pib      Philip Burns       2
- 5usenet    Mr. Usenet         14
- 7accman    Account Manager    3
- 14dave     Dave Wenger        15
- 6server    Mr. Listserv       4
- 13len      Leonard Evens      5
- 12jln      John Norstad       6
- 3games     Mr. Games          7
- 4ftp       Mr. ftp            8
- 10mccoy    James McCoy        9
- 9matt      Matthew Larson     10
- 8nims      Christopher Nims   11
- 2news      Mr. News           12
- 15navarra  John Navarra       1
-
- nawk:                         results with just 15 numbers (from above)
- 12jln      John Norstad       2
- 7accman    Account Manager    3
- 4ftp       Mr. ftp            4
- 10mccoy    James McCoy        5
- 14dave     Dave Wenger        6
- 11pib      Philip Burns       7
- 5usenet    Mr. Usenet         8
- 2news      Mr. News           9
- 6server    Mr. Listserv       10
- 1root      Operator           11
- 8nims      Christopher Nims   12
- 3games     Mr. Games          13
- 15navarra  John Navarra       14
- 9matt      Matthew Larson     15
- 13len      Leonard Evens      1
-
- gawk:                         results with just 15 numbers (from above)
- 10mccoy    James McCoy        1
- 6server    Mr. Listserv       2
- 5usenet    Mr. Usenet         3
- 15navarra  John Navarra       4
- 4ftp       Mr. ftp            5
- 12jln      John Norstad       6
- 9matt      Matthew Larson     7
- 11pib      Philip Burns       8
- 13len      Leonard Evens      9
- 14dave     Dave Wenger        10
- 3games     Mr. Games          11
- 8nims      Christopher Nims   12
- 2news      Mr. News           13
- 7accman    Account Manager    14
- 1root      Operator           15
-
- This is truly weird. I got the same results each time I ran
- the program on the file with each of awk, nawk, and gawk. Each program's
- output is completely different from the others', and each is completely
- different from its own result for the 15 numbers in the test above.
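For string subscripts like the usernames here, one common workaround is to keep a second, numerically indexed array that records the order in which keys were first seen, and to walk that array in the END block. The following is only a sketch: the names order, n, mini.passwd and out are made up for illustration, the three records are taken from the listing above, and the ( $1 in fullname ) test assumes nawk, gawk, or another POSIX-style awk (the oldest awk may not accept it).

```shell
# A three-line stand-in for short.passwd (hypothetical file name).
tmpdir=$(mktemp -d)
cat > "$tmpdir/mini.passwd" <<'EOF'
1root:##root:0:1:Operator:/:/bin/csh
2news:##news:6:6:Mr. News:/usr/local/lib/news:/bin/sh
3games:##games:11:10:Mr. Games:/usr/games:/bin/false
EOF

out=$(awk '
BEGIN { FS = ":" }
{
    # Record each username the first time it is seen; order[] and n
    # are illustrative names, not part of the original script.
    if (!($1 in fullname))
        order[++n] = $1
    fullname[$1] = $5
}
END {
    # Walk the numeric list, then look each username up in the array.
    for (i = 1; i <= n; i++)
        printf("%-10s %s\n", order[i], fullname[order[i]])
}' "$tmpdir/mini.passwd")

echo "$out"
rm -rf "$tmpdir"
```

The cost is one extra array entry per distinct key; in exchange the output order no longer depends on which awk runs the script.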
-
- I ran these tests on a Sun running 4.1.2. If you want to try running the
- above programs on your machine and send me the results, I would be
- glad to look at them. Basically, the order in which [gn]awk prints an
- associative array with a for-in loop is unreliable (especially for a
- multi-field file). Gawk seems to be the best at printing out arrays in
- the order in which they were loaded, but it does not do so in all cases.
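If any repeatable order will do, rather than the load order specifically, another option is to let sort(1) arrange the lines after awk prints them. This is a sketch under two assumptions: that sort -n is available, and that every username begins with a number as in the listing above. The names mini.passwd and out are illustrative, and the sample records are deliberately given out of numeric order.

```shell
# Three records from the listing above, stored out of numeric order
# (hypothetical file name).
tmpdir=$(mktemp -d)
cat > "$tmpdir/mini.passwd" <<'EOF'
10mccoy:##mccoy:339:10:James McCoy:/home/u1/mccoy:/bin/tcsh
2news:##news:6:6:Mr. News:/usr/local/lib/news:/bin/sh
1root:##root:0:1:Operator:/:/bin/csh
EOF

# awk prints the pairs in whatever order for-in yields; sort -n then
# orders the lines by the leading number of each username.
out=$(awk '
BEGIN { FS = ":" }
{ fullname[$1] = $5 }
END {
    for (username in fullname)
        printf("%-10s %s\n", username, fullname[username])
}' "$tmpdir/mini.passwd" | sort -n)

echo "$out"
rm -rf "$tmpdir"
```

Note that this gives the sorted order (1root, 2news, 10mccoy), not the order in which the records were loaded.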
-
- -tms
- --
- From the Lab of the MaD ScIenTiST:
-
- navarra@casbah.acns.nwu.edu
-