NetNews Usenet Archive 1992 #16

Text File | 1992-07-24 | 8.2 KB | 254 lines

Path: sparky!uunet!pmafire!news.dell.com!swrinde!cs.utexas.edu!hermes.chpc.utexas.edu!news.utdallas.edu!convex!tchrist
From: tchrist@convex.COM (Tom Christiansen)
Newsgroups: comp.lang.perl
Subject: Re: Perl meets shell
Message-ID: <1992Jul24.160750.22163@news.eng.convex.com>
Date: 24 Jul 92 16:07:50 GMT
References: <Brs2rD.Fz1@cs.psu.edu>
Sender: usenet@news.eng.convex.com (news access account)
Reply-To: tchrist@convex.COM (Tom Christiansen)
Organization: CONVEX Realtime Development, Colorado Springs, CO
Lines: 236
Originator: tchrist@pixel.convex.com
Nntp-Posting-Host: pixel.convex.com
X-Disclaimer: This message was written by a user at CONVEX Computer Corp.
              The opinions expressed are those of the user and not
              necessarily those of CONVEX.

From the keyboard of flee@cs.psu.edu (Felix Lee):
:I've been slowly collecting a pile of Perl routines that I want to use
:in both shell scripts and Perl scripts.  But it's rather awkward to
:arrange for this to happen.
:
:Presumably, in Perl, you would do something like
:    require 'fqdn.pl';
:    $y = &fqdn($x);
:
:And in shell you would do
:    y=`fqdn $x`
:or
:    cat list | fqdn
:
:Supporting the shell usage requires writing a wrapper that applies the
:routine to @ARGV or <STDIN>.  The awkward part is this same wrapper
:has to be written for every little routine in the library.
:
:Instead, what I have right now are two Perl programs, "perl-apply" and
:"perl-map", that can be used like so:
:    y=`perl-apply fqdn $x`
:    cat list | perl-map fqdn
:    perl-map fqdn `cat list`
:
:"perl-apply proc args" locates proc, loads it, calls proc with all the
:args, and prints the result.
:
:"perl-map proc args" locates proc, loads it, calls proc on each arg
:(or on each line of STDIN), and prints each result on a separate line.
:
:This still isn't ideal.  The shell usage is somewhat messy.
:
:It would be nice if the same file could be used for both purposes.
:This requires a way for the file to know whether it's being run as a
:script, or loaded with "require" (or "do").
:
:Here's one approach.  If you put something like
:    #!/usr/bin/wrapper
:at the top of the file, this will get ignored.  Unfortunately, the
:wrapper has to be a binary executable.
:
:Perhaps the wrapper could be Perl itself, with a new flag.
:Something like:
:    #!/usr/bin/perl -Emain
:The "-Emain" means that the procedure "main" should be called after
:the file is loaded, after all the immediate statements are executed,
:before exiting.

-Emain is interesting: I sometimes think perl should autocall
&main(@ARGV) if defined &main.

:Does this seem reasonable?  It still seems messy.

I agree that it would be good for both perl and shell to be able to
access the same routine in a similar fashion.  I do think you're making
this a tad more difficult than it need be.  It just so happens that I've
been doing this for years.  I'm going to spell out what I've done and
why very carefully, so that people disagreeing with the why part can
recraft their own solution.

The trick to making this work is that you do indeed have a wrapper, but
it's a perl one, not a binary executable.  This program does basically
these things:

    1.  require() the appropriate file.
    2.  Call foo() with the appropriate args.
    3.  Print out foo()'s output appropriately.
    4.  Exit appropriately.

The only catch is to find a reasonable definition for "appropriate" in
all cases above in which it occurs.

STEP 1: require() the appropriate file

First, how do you map routine() to foo.pl?  I'm going to say that
routine() will by definition reside in "routine.pl".  We want one and
only one copy of the wrapper (let's call it `exechook' for now) for any
routine that we want to call.  We'll indirect through it.
One way to do this from the shell is

    $ exechook routine args

but a more desirable one is merely

    $ routine args

So we'll have exechook consult its $0 to see whether it was called as
exechook and needs to look further for whom to call, or whether to
derive the name of the routine to invoke right from $0.  So we'll make
exechook live in /usr/local/bin (and be executable) and make links:

    ROOT=/usr/local    # or just /usr
    cp routine.pl $ROOT/lib/perl/
    cd $ROOT/bin; ln exechook routine

Now we'll *call* "routine", but get exechook, who'll load routine.pl for
us.  That makes the initial exechook code look like this:

    $ME = 'exechook';
    if ($0 =~ m,/?$ME$,) {
        # called as exechook: the routine name is the first argument;
        # check for it before shifting, so that routines taking no
        # arguments of their own are still allowed
        die "usage: [$ME] file [args ...]\n" unless @ARGV;
        $routine = shift;
    } else {
        # called through a link: the link's basename is the routine
        ($routine = $0) =~ s,.*/,,;
    }
    require "$routine.pl";

STEP 2: Call foo() with the appropriate args.

Felix has proposed that if the program has no arguments, that it should
call the function repeatedly for each line of STDIN.  That means that

    var=`fqdn`

should be the same as

    var=`fqdn \`cat\``

What I don't like about this is that it makes no allowance for routines
that get no arguments by their very nature.  For example, if you didn't
have a whoami on your system, and wanted one that worked this way:

    sub whoami { (getpwuid($>))[0]; }

and made an exec hook for it, you would be very surprised to have

    myname=`whoami`

sit and wait for input.  So let's not do that.  We'll give it @ARGV and
let it do what it wants.  People wanting `cat` behaviour can code it
themselves, or use xargs or apply or whatever they're really looking
for.

STEP 3: Print out foo()'s output appropriately.

I'd like this to work both for things that return single values and
those that return lists.  Each value should be printed out with a
newline tagged on the end for the shell's benefit.
That means basically doing this

    print join("\n", &$routine(@ARGV)), "\n";

although there's no need to make a scalar, so we should do this instead:

    $, = "\n";
    print &$routine(@ARGV), '';

STEP 4: Exit appropriately.

This is more interesting than it might initially appear.  Shell programs
should exit zero if they succeed, and non-zero if they fail.  This means
there's one success state and a bunch (1..255) of failure states.  Perl
functions, on the other hand, generally indicate failure by returning
undef (as opposed to merely the null string or 0), whereas success can
come in any one of a virtually infinite variety of fun-filled flavours.

I propose that the following classes of functions should all be
accommodated:

      I.  sub hello  { "Hello, world!" }
     II.  sub whoami { (getpwuid($>))[0] }
    III.  sub pwnam  { getpwnam(shift) }
     IV.  sub amisu  { $> == 0 }
      V.  sub range  { shift .. shift }

Type I functions are the easiest, since they always succeed.  We just
print out their return value(s) and exit 0.

Type II functions aren't that much different.  But they can fail.  So we
now know that we have to check for a return of undef, and both not print
anything and exit non-zero.  This brings our code to

    @return = &$routine(@ARGV);
    exit 1 if @return == 0 || @return == 1 && !defined $return[0];

Type III functions are fine now, because if the getpwnam fails, we get
an empty list back (undef in a scalar context), and so no printout and a
failure exit code.

Type IV functions are particularly problematic.  You'd really like to be
able to say

    if (&amisu) { }

in perl, and

    if amisu; then ...

in shell.  That means that we must introduce a hack.  If the return
value is 1, then we aren't going to print it out, but just return
success.

    print (@return, "") unless @return == 1 && $return[0] == 1;

Now, because of our decision for class IV functions, we have a problem
with class V functions (well, and others).  Consider: if we call

    range 1 1

it'll return just 1, which wouldn't get printed.  But so it goes.
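The rules worked out in STEP 4 can be checked in isolation. What follows is only a sketch, not part of the posted exechook: `shell_result` is a hypothetical helper name, and it compares the lone value with `eq '1'` rather than `== 1` so that string results do not trigger numeric warnings under `use warnings`.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption of this sketch, not from the post):
# given the list a routine returned, decide what the shell should see.
# Returns the exit status and a reference to the list of values to print.
sub shell_result {
    my @return = @_;

    # Empty list, or a lone undef: failure, nothing printed.
    return (1, []) if @return == 0 || (@return == 1 && !defined $return[0]);

    # A lone 1 is treated as a bare "true": success, nothing printed.
    # (String comparison is this sketch's choice; the post uses ==.)
    return (0, []) if @return == 1 && $return[0] eq '1';

    # Anything else: success, one value per output line.
    return (0, [@return]);
}

# Sample return values standing in for the classes discussed above.
for my $case (
    [ 'hello',     'Hello, world!' ],
    [ 'pwnam',     undef           ],   # lookup failed
    [ 'amisu',     1               ],   # true predicate
    [ 'range 1 3', 1, 2, 3         ],
    [ 'range 1 1', 1               ],   # swallowed by the class IV hack
) {
    my ($label, @values) = @$case;
    my ($status, $lines) = shell_result(@values);
    printf "%-10s exit=%d printed=%d\n", $label, $status, scalar @$lines;
}
```

Running it shows `range 1 1` exiting 0 with nothing printed, which is the trade-off described above. One thing worth checking when adapting this: in Perl 5, a false comparison such as `$> == 0` yields a defined empty string rather than undef, so under these rules a false predicate would still exit 0 unless the routine returns undef explicitly.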
So, that leaves us with this for exechook, which isn't very big:

    #!/usr/bin/perl
    $ME = 'exechook';
    if ($0 =~ m#/?$ME$#) {
        # called as exechook: first argument names the routine
        die "usage: [$ME] file [args ...]\n" unless @ARGV;
        $routine = shift;
    } else {
        # called through a link: the link's basename names the routine
        ($routine = $0) =~ s#.*/##;
    }
    require "$routine.pl";
    @return = &$routine(@ARGV);
    # empty list or leading undef means failure
    exit 1 unless defined $return[0];
    $, = "\n";
    # a lone 1 is bare success: print nothing
    print (@return, "") unless @return == 1 && $return[0] == 1;
    exit 0;

--tom
--
    Tom Christiansen      tchrist@convex.com      convex!tchrist

    I might be able to shoehorn a reference count in on top of the
    numeric value by disallowing multiple references on scalars with
    a numeric value, but it wouldn't be as clean.  I do occasionally
    worry about that. --lwall :-)
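For anyone adapting exechook, the name-resolution step from STEP 1 can be pulled out and exercised on its own. This is a sketch under assumptions: `resolve_routine` is a hypothetical name, and it compares the basename with `eq` where the post uses a pattern match, so a program called, say, "myexechook" is treated as a routine name here.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): work out which routine
# to run from the program name and argument list, as exechook's opening
# lines do.  Returns the routine name followed by the remaining arguments.
sub resolve_routine {
    my ($program, @args) = @_;
    my $me = 'exechook';

    (my $base = $program) =~ s#.*/##;   # strip any leading directories

    if ($base eq $me) {
        # Called as "exechook routine args": the routine is the first argument.
        die "usage: [$me] file [args ...]\n" unless @args;
        return @args;                   # routine name, then its arguments
    }

    # Called through a link such as /usr/local/bin/fqdn:
    # the link name itself is the routine.
    return ($base, @args);
}

my ($routine, @rest) = resolve_routine('/usr/local/bin/exechook', 'fqdn', 'pixel');
print "$routine (@rest)\n";             # fqdn (pixel)

($routine, @rest) = resolve_routine('/usr/local/bin/whoami');
print "$routine (@rest)\n";             # whoami ()
```

Whatever file the resolved name leads to, keep in mind that `require` insists the file end by returning a true value, conventionally a final `1;` line.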