home *** CD-ROM | disk | FTP | other *** search
- Article 829 of net.sources:
- Path: mcdsun!noao!hao!nbires!seismo!rutgers!ll-xn!mit-eddie!uw-beaver!tektronix!teklds!copper!stevesu
- From: stevesu@copper.TEK.COM (Steve Summit)
- Newsgroups: net.sources
- Subject: Fast wc
- Message-ID: <970@copper.TEK.COM>
- Date: 11 Apr 87 02:18:38 GMT
- Reply-To: stevesu@copper.UUCP (Steve Summit)
- Distribution: world
- Organization: Tektronix, Inc., Beaverton, OR.
- Lines: 462
-
- Here's something I came across in my bin directory. According to
- the modification time on the file, I must have written it back in
- 1984. It's just like the "standard" wc (4.[23]bsd, anyway) with
- the following two improvements:
-
- 1. It doesn't count what you don't ask for. Therefore,
- wc -l is faster and wc -c is _m_u_c_h faster than when it has
- to count words (which is harder). It also seems to be
- considerably faster than /usr/ucb/wc.
-
- 2. It has the -v (verbose) and -p (count pages) options that
- some old version of wc (4.1? 2.9?) that I got used to had.
-
- 3. (Three! Three improvements! _N_obody expects...) It prints
- the fields in the order you ask for (i.e. wc -cwl gives
- you the reverse of the usual order). (This isn't
- terribly important, and I've never made use of it, but
- for some reason I wrote it that way.)
-
- There is also a -s flag that lets you set the page size used in
- calculating page counts with -p.
-
- Here is a timing comparison (on a 780 running Ultrix):
-
- $ cd /usr/dict
- $ time /usr/ucb/wc words web*
- 24259 24259 198596 words
- 234936 234936 2486813 web2
- 76205 121847 1012730 web2a
- 335400 381042 3698139 total
- 3:29.2 real 56.9 user 10.8 sys
-
- $ time wc.new words web* > /dev/null
- 1:44.1 real 26.7 user 11.7 sys
-
- $ time wc.new -w words web* > /dev/null
- 1:50.0 real 26.1 user 11.9 sys
-
- $ time wc.new -l words web* > /dev/null
- 1:08.5 real 14.6 user 11.4 sys
-
- $ time wc.new -c words web* > /dev/null
- 25.8 real 0.2 user 9.7 sys
-
- Of course, if all you really care about is the character count,
- an ls -l is faster still (although it will give you a different
- answer if the file contains bad blocks, but I digress).
-
- The word-counting algorithm probably isn't the one I would have
- chosen, but it matches the one that /usr/ucb/wc uses.
-
- If you're picky about plug compatibility, you should note that
- the error handling is a bit different than the standard version.
- (There is, regrettably, no "usage:" message.)
-
- Following my signature are the source and man page.
-
- Steve Summit
- stevesu@copper.tek.com
-
-