- Path: sparky!uunet!elroy.jpl.nasa.gov!swrinde!gatech!news.byu.edu!eff!world!ora.com!minya!jc
- From: jc@minya.UUCP (John Chambers)
- Newsgroups: comp.lang.perl
- Subject: Re: Columnization
- Message-ID: <1360@minya.UUCP>
- Date: 11 Sep 92 18:27:19 GMT
- References: <1992Aug5.182712.14285@athena.mit.edu> <BtM68r.J0A@NCoast.ORG>
- Lines: 103
-
- > | Consider the example from the manual page:
- > | ls | paste - - - -
- > | This is supposed to produce four-column output. It does put the names
- > | four to a line, but in most directories, they will not be in columns.
- > | There seem to be no options that will produce columnar output.
- >
- > pr -m -t -w# file1 file2 ...
- >
- > where # is the width (default 72). May be System V only.
-
- Several people have suggested something like this, but I don't see any
- way that you can use pr to columnize the output of ls like the above
- "ls | paste" command does (but correctly ;-). To do so, you would need
- a filter that reads ls's output and writes it into N files, and then
- feed these files to pr, and then remember to delete the intermediate
- files. If you're going to write such a filter, you might as well have
- it write a single stream (stdout) and align the fields itself, whether
- in C or perl. So pr doesn't help you at all.
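-
- A minimal sketch of such a single-stream filter, written here in
- Python rather than C or perl. The function name columnize and its
- ncols and gap parameters are made up for illustration; it fills rows
- left to right, the way "paste - - - -" does, and pads each column to
- its widest entry.
-
```python
def columnize(names, ncols=4, gap=2):
    """Lay names out ncols per row (row-major, like `paste - - - -`),
    padding each column to the width of its longest entry."""
    rows = [names[i:i + ncols] for i in range(0, len(names), ncols)]
    widths = [0] * ncols
    for row in rows:
        for i, name in enumerate(row):
            widths[i] = max(widths[i], len(name))
    lines = []
    for row in rows:
        cells = [name.ljust(widths[i]) for i, name in enumerate(row)]
        # Strip the padding that would otherwise trail the last cell.
        lines.append((" " * gap).join(cells).rstrip())
    return lines


if __name__ == "__main__":
    # Hard-coded sample; a real filter would read names from stdin.
    demo = ["People", "uucp", "test14", "misc.920703", "NNTP.sh05",
            "tuucp", "Mail", "Phones", "esp", "pl"]
    print("\n".join(columnize(demo)))
```
-
- No intermediate files are needed, which is the point of the argument
- above: once the filter exists, pr has nothing left to do.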
-
- Actually, I did just get a properly aligned 4-column listing on this
- Sys/V machine, using the command:
- ls | paste - - - - | align l l l
- The "align" command is a little C program that I wrote years ago,
- which aligns its input in columns. It can do left, center, or right
- alignment; you can specify a field separator ("-s:" would work for
- /etc/passwd, for example), and there are a couple of other goodies.
- The program is about 200 lines of C, so it's not all that difficult to
- do. I use it all the time from within vi to columnize tables within
- documents.
-
- For example, contrast the output of:
-
- : ls -ltr |tail
-
- -rw-r--r-- 1 jc other 5227 Sep 10 11:41 People
- drwxr-xr-x 2 jc mail 1024 Sep 10 12:09 uucp
- -rw-rw-r-- 1 jc mail 1791 Nov 3 1991 test14
- drwxrwxr-x 7 jc mail 4608 Sep 10 22:46 misc.920703
- -rw-rw-r-- 1 jc mail 24110 Sep 10 22:46 NNTP.sh05
- drwxrwxr-x 2 jc mail 2560 Sep 10 22:47 tuucp
- drwxr-xr-x 12 jc other 5120 Sep 10 22:47 Mail
- -rw-rw-r-- 1 jc mail 7550 Sep 11 10:58 Phones
- drwxr-xr-x 4 jc other 2560 Sep 11 11:10 esp
- drwxrwxr-x 3 jc mail 1024 Sep 11 12:13 pl
-
- and:
-
- : ls -ltr |tail |align l r l5 l r l r r
- -rw-r--r--  1 jc    other  5227 Sep 10 11:41 People
- drwxr-xr-x  2 jc    mail   1024 Sep 10 12:09 uucp
- -rw-rw-r--  1 jc    mail   1791 Nov  3  1991 test14
- drwxrwxr-x  7 jc    mail   4608 Sep 10 22:46 misc.920703
- -rw-rw-r--  1 jc    mail  24110 Sep 10 22:46 NNTP.sh05
- drwxrwxr-x  2 jc    mail   2560 Sep 10 22:47 tuucp
- drwxr-xr-x 12 jc    other  5120 Sep 10 22:47 Mail
- -rw-rw-r--  1 jc    mail   7550 Sep 11 10:58 Phones
- drwxr-xr-x  4 jc    other  2560 Sep 11 11:10 esp
- drwxrwxr-x  3 jc    mail   1024 Sep 11 12:13 pl
-
- The "l5" illustrates a minimum field width. I'm still thinking about
- adding an option that would figure out how to compress things like the
- group and size fields, which could be one column closer together in
- this example. But that gets tricky.
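-
- For readers without the C program, the behavior described above can
- be approximated in a few lines of Python. This is not the align
- source, only a sketch inferred from the examples in this post: the
- spec letters "l" and "r" and the "l5" minimum width come from the
- command line shown; "c" for center, the sep parameter (standing in
- for "-s:"), the single-space output separator, and the rule that
- columns without a spec are left-aligned are all assumptions.
-
```python
import re


def parse_spec(spec):
    """Turn 'l5' into ('l', 5) and 'r' into ('r', 0)."""
    m = re.fullmatch(r"([lcr])(\d*)", spec)
    if not m:
        raise ValueError(f"bad column spec: {spec!r}")
    return m.group(1), int(m.group(2) or 0)


def align(lines, specs, sep=None):
    """Align the fields of each line in columns.

    sep=None splits on runs of whitespace; otherwise on the given string.
    """
    parsed = [parse_spec(s) for s in specs]
    rows = [line.split(sep) for line in lines]
    ncols = max((len(r) for r in rows), default=0)
    # Assumption: columns beyond the given specs are left-aligned.
    parsed += [("l", 0)] * (ncols - len(parsed))
    # Start from each column's minimum width, then grow to fit the data.
    widths = [minw for _, minw in parsed]
    for row in rows:
        for i, field in enumerate(row):
            widths[i] = max(widths[i], len(field))
    pad = {"l": str.ljust, "c": str.center, "r": str.rjust}
    out = []
    for row in rows:
        cells = [pad[parsed[i][0]](f, widths[i]) for i, f in enumerate(row)]
        out.append(" ".join(cells).rstrip())
    return out


if __name__ == "__main__":
    sample = [
        "-rw-r--r-- 1 jc other 5227 Sep 10 11:41 People",
        "drwxr-xr-x 12 jc other 5120 Sep 10 22:47 Mail",
        "-rw-rw-r-- 1 jc mail 1791 Nov 3 1991 test14",
    ]
    print("\n".join(align(sample, "l r l5 l r l r r".split())))
```
-
- Every column is padded to its own widest entry independently, so the
- sketch makes no attempt at squeezing neighboring columns together.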
-
- An even worse problem turns up with things like the "ps" command. On
- many Unices (mostly BSD), this command occasionally produces fields
- that are run together, without any separator character at all. It is
- quite difficult to recognize such things, except on an ad-hoc basis.
- On the other hand, the Sys/V ps command generates an STIME field that
- sometimes looks like "12:19:20" and sometimes looks like "Sep 08", and
- this leads to yet more ad-hoc code. The folks who brought us Unix
- software have put some interesting and amusing barriers in the way of
- getting things done in a simple, straightforward fashion. It is
- interesting to contemplate how one might go about writing a perl
- pattern that correctly chops apart the fields of things like the
- following excerpt:
-
- 0 F S PID C ADDR SZ WCHAN STIME TTY TIME COMD
- 39 S root 0 0 c0feb000 0 d0183c94 Aug 18 ? 2:10 sched
- 10 R N jc 252 65 c0feb288 397 Sep 08 console146:59 X :0
- 10 S jc 11788 39 c0febb88 91 e0000000 14:04:40 pts/8 0:00 /bin/csh -c
- ps -elf
-
- Some special goodies to contemplate: The "R N" that is one field, as
- well as the "Aug 18" and "Sep 08" that should align with the
- "14:04:40" field below them (so you can't count strings of spaces);
- the missing WCHAN field in one line (so you can't count non-blank
- fields); the "console146:59" that is two fields run together; the
- misaligned COMD field(s) caused by an overly-long TIME field (so you
- can't count columns); the dangling "ps -elf" as a separate line. This
- is a somewhat artificial example, made by combining the results on
- several machines, but it gives you an idea of what you have to contend
- with to write a portable program that handles all such data.
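-
- One ad-hoc way to attack the excerpt above, sketched in Python
- rather than perl (the same pattern should carry over with minor
- changes). Everything here is an assumption fitted to these few
- lines: the group names, the idea that a status is one capital
- letter optionally followed by another, the 8-digit hex ADDR and
- WCHAN, and the lazy TTY match that splits "console146:59" as
- "console" plus "146:59" (it could equally be "console1" plus
- "46:59"; nothing in the data says which). Lines that do not match
- are treated as continuations of the previous command.
-
```python
import re

PS_LINE = re.compile(r"""
    ^\s*
    (?P<f>\d+)\s+
    (?P<s>[A-Z](?:\s[A-Z])?)\s+          # "S", or "R N" as one field
    (?P<uid>\S+)\s+
    (?P<pid>\d+)\s+
    (?P<c>\d+)\s+
    (?P<addr>[0-9a-f]{8})\s+
    (?P<sz>\d+)\s+
    (?:(?P<wchan>[0-9a-f]{8})\s+)?       # WCHAN may be missing
    (?P<stime>\d\d:\d\d:\d\d|[A-Z][a-z]{2}\s+\d{1,2})\s+
    (?P<tty>\S+?)\s*                     # lazy, may run into TIME
    (?P<time>\d+:\d\d)\s+
    (?P<comd>.*)$
""", re.VERBOSE)

EXCERPT = """\
0 F S PID C ADDR SZ WCHAN STIME TTY TIME COMD
39 S root 0 0 c0feb000 0 d0183c94 Aug 18 ? 2:10 sched
10 R N jc 252 65 c0feb288 397 Sep 08 console146:59 X :0
10 S jc 11788 39 c0febb88 91 e0000000 14:04:40 pts/8 0:00 /bin/csh -c
ps -elf
"""


def parse_ps(text):
    """Return one dict per process line; fields that are absent are None."""
    records = []
    for line in text.splitlines():
        if not line.strip():
            continue
        m = PS_LINE.match(line)
        if m:
            records.append(m.groupdict())
        elif records:
            # Unparseable line after a record: treat it as a wrapped command.
            records[-1]["comd"] += " " + line.strip()
        # Otherwise it is the header (or noise) before the first record.
    return records


if __name__ == "__main__":
    for rec in parse_ps(EXCERPT):
        print(rec)
```
-
- The pattern relies on the shape of each field rather than on column
- positions or counts of blanks, since the excerpt shows that neither
- can be trusted. It remains a heuristic: a one-letter upper-case user
- name, or a tty name ending in digits, would defeat it.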
-
- Is there a way to handle such abortions in perl?
-
- --
- All opinions Copyright (c) 1992 by John Chambers. Inquire for licensing at:
- 1-617-647-1813 ...!{bu.edu,harvard.edu,eddie.mit.edu,ruby.ora.com}!minya!jc
- --
- Pensu tutmonde; agu loke.
-