- NAME
- perlsub - Perl subroutines
-
- SYNOPSIS
- To declare subroutines:
-
- sub NAME; # A "forward" declaration.
- sub NAME(PROTO); # ditto, but with prototypes
-
- sub NAME BLOCK # A declaration and a definition.
- sub NAME(PROTO) BLOCK # ditto, but with prototypes
-
-
- To define an anonymous subroutine at runtime:
-
- $subref = sub BLOCK; # no proto
- $subref = sub (PROTO) BLOCK; # with proto
-
-
- To import subroutines:
-
- use PACKAGE qw(NAME1 NAME2 NAME3);
-
-
- To call subroutines:
-
- NAME(LIST); # & is optional with parentheses.
- NAME LIST; # Parentheses optional if predeclared/imported.
- &NAME; # Makes current @_ visible to called subroutine.
-
-
- DESCRIPTION
- Like many languages, Perl provides for user-defined subroutines.
- These may be located anywhere in the main program, loaded in
- from other files via the `do', `require', or `use' keywords, or
- even generated on the fly using `eval' or anonymous subroutines
- (closures). You can even call a function indirectly using a
- variable containing its name or a CODE reference to it.
-
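- As a minimal sketch of the indirect calls just mentioned, the
- code below reaches one function through a CODE reference and
- through a variable holding its name. The names `greet',
- `call_by_ref', and `call_by_name' are invented for this
- illustration. Calling by name is a symbolic reference, so the
- sketch relaxes `use strict 'refs'' for that one function.
-
```perl
use strict;

sub greet { return "hello, $_[0]" }

# Call through a CODE reference.
sub call_by_ref {
    my $code = shift;
    return &$code(@_);
}

# Call through a variable holding the function's name.
# This is a symbolic reference, which strict 'refs' forbids.
sub call_by_name {
    my $name = shift;
    no strict 'refs';
    return &$name(@_);
}

print call_by_ref(\&greet, "world"), "\n";
print call_by_name("greet", "world"), "\n";
```
-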
- The Perl model for function call and return values is simple:
- all functions are passed as parameters one single flat list of
- scalars, and all functions likewise return to their caller one
- single flat list of scalars. Any arrays or hashes in these call
- and return lists will collapse, losing their identities--but you
- may always use pass-by-reference instead to avoid this. Both
- call and return lists may contain as many or as few scalar
- elements as you'd like. (Often a function without an explicit
- return statement is called a subroutine, but there's really no
- difference from the language's perspective.)
-
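- The flattening described above can be seen in a short sketch.
- The helper names `count_args' and `sizes' are hypothetical: the
- first shows two arrays collapsing into one list, the second
- shows references keeping them apart.
-
```perl
use strict;

# Report how many scalars arrive in @_.
sub count_args { return scalar @_ }

# Report the size of each array passed by reference.
sub sizes { return map { scalar @$_ } @_ }

my @x = (1, 2);
my @y = (3, 4, 5);

print count_args(@x, @y), "\n";           # the two arrays collapse into five scalars
print join(",", sizes(\@x, \@y)), "\n";   # references keep them apart: 2,3
```
-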
- Any arguments passed to the routine come in as the array `@_'.
- Thus if you called a function with two arguments, those would be
- stored in `$_[0]' and `$_[1]'. The array `@_' is a local array,
- but its elements are aliases for the actual scalar parameters.
- In particular, if an element `$_[0]' is updated, the
- corresponding argument is updated (or an error occurs if it is
- not updatable). If an argument is an array or hash element which
- did not exist when the function was called, that element is
- created only when (and if) it is modified or if a reference to
- it is taken. (Some earlier versions of Perl created the element
- whether or not it was assigned to.) Note that assigning to the
- whole array `@_' removes the aliasing, and does not update any
- arguments.
-
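- The aliasing rules above can be sketched as follows. The
- function names `double_in_place' and `double_after_reset' are
- invented for this illustration. The second function assigns to
- the whole of `@_' first, so its update never reaches the caller.
-
```perl
use strict;

# Modifies the caller's variable through the alias $_[0].
sub double_in_place { $_[0] *= 2 }

# Assigning to the whole of @_ replaces the aliases with copies,
# so the later update is no longer seen by the caller.
sub double_after_reset {
    my @copy = @_;
    @_ = @copy;
    $_[0] *= 2;
}

my $n = 21;
double_in_place($n);
print "$n\n";     # $n has been doubled to 42

my $m = 21;
double_after_reset($m);
print "$m\n";     # $m is still 21
```
-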
- The return value of the subroutine is the value of the last
- expression evaluated. Alternatively, a `return' statement may be
- used to exit the subroutine, optionally specifying the returned
- value, which will be evaluated in the appropriate context (list,
- scalar, or void) depending on the context of the subroutine
- call. If you specify no return value, the subroutine will return
- an empty list in a list context, an undefined value in a scalar
- context, or nothing in a void context. If you return one or more
- arrays and/or hashes, these will be flattened together into one
- large indistinguishable list.
-
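- A small sketch of how context affects the return value, using
- two made-up functions: `nothing' has a bare `return', and
- `context' reports what `wantarray' sees.
-
```perl
use strict;

# A bare return yields an empty list or undef, depending on context.
sub nothing { return }

# wantarray is true in list context, false but defined in scalar
# context, and undefined in void context.
sub context {
    return wantarray ? "list" : defined(wantarray) ? "scalar" : "void";
}

my @list   = nothing();
my $scalar = nothing();
print scalar(@list), "\n";                            # 0 elements
print defined($scalar) ? "defined" : "undef", "\n";   # undef

my ($in_list) = context();
my $in_scalar = context();
print "$in_list $in_scalar\n";                        # list scalar
```
-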
- Perl does not have named formal parameters. In practice you
- simply assign `@_' to a `my()' list of names. Any variables you
- use in the function that aren't declared private are global
- variables. For the gory details on creating private variables,
- see the section on "Private Variables via my()" and the section
- on "Temporary Values via local()". To create protected
- environments for a set of functions in a separate package (and
- probably a separate file), see the section on "Packages" in the
- perlmod manpage.
-
- Example:
-
- sub max {
-     my $max = shift(@_);
-     foreach my $foo (@_) {
-         $max = $foo if $max < $foo;
-     }
-     return $max;
- }
- $bestday = max($mon,$tue,$wed,$thu,$fri);
-
-
- Example:
-
- # get a line, combining continuation lines
- # that start with whitespace
-
- sub get_line {
- $thisline = $lookahead; # GLOBAL VARIABLES!!
- LINE: while (defined($lookahead = <STDIN>)) {
- if ($lookahead =~ /^[ \t]/) {
- $thisline .= $lookahead;
- }
- else {
- last LINE;
- }
- }
- $thisline;
- }
-
- $lookahead = <STDIN>; # get first line
- while ($_ = get_line()) {
- ...
- }
-
-
- Use array assignment to a local list to name your formal
- arguments:
-
- sub maybeset {
- my($key, $value) = @_;
- $Foo{$key} = $value unless $Foo{$key};
- }
-
-
- This also has the effect of turning call-by-reference into call-
- by-value, because the assignment copies the values. Otherwise a
- function is free to do in-place modifications of `@_' and change
- its caller's values.
-
- upcase_in($v1, $v2); # this changes $v1 and $v2
- sub upcase_in {
- for (@_) { tr/a-z/A-Z/ }
- }
-
-
- You aren't allowed to modify constants in this way, of course.
- If an argument were actually a literal and you tried to change it,
- you'd take a (presumably fatal) exception. For example, this
- won't work:
-
- upcase_in("frederick");
-
-
- It would be much safer if the `upcase_in()' function were
- written to return a copy of its parameters instead of changing
- them in place:
-
- ($v3, $v4) = upcase($v1, $v2); # this doesn't
- sub upcase {
- return unless defined wantarray; # void context, do nothing
- my @parms = @_;
- for (@parms) { tr/a-z/A-Z/ }
- return wantarray ? @parms : $parms[0];
- }
-
-
- Notice how this (unprototyped) function doesn't care whether it
- was passed real scalars or arrays. Perl will see everything as
- one big long flat `@_' parameter list. This is one area
- where Perl's simple argument-passing style shines. The
- `upcase()' function would work perfectly well, with no change
- to its definition, even if we fed it things like this:
-
- @newlist = upcase(@list1, @list2);
- @newlist = upcase( split /:/, $var );
-
-
- Do not, however, be tempted to do this:
-
- (@a, @b) = upcase(@list1, @list2);
-
-
- The reason is that, like the flat incoming parameter list, the
- return list is also flat. So all you have managed to do here is
- store everything in `@a' and make `@b' an empty list. See the
- section on "Pass by Reference" for alternatives.
-
- A subroutine may be called using the "`&'" prefix. The "`&'" is
- optional in modern Perls, and so are the parentheses if the
- subroutine has been predeclared. (Note, however, that the "`&'"
- is *NOT* optional when you're just naming the subroutine, such
- as when it's used as an argument to `defined()' or `undef()'.
- Nor is it optional when you want to do an indirect subroutine
- call with a subroutine name or reference using the `&$subref()'
- or `&{$subref}()' constructs. See the perlref manpage for more
- on that.)
-
- Subroutines may be called recursively. If a subroutine is called
- using the "`&'" form, the argument list is optional, and if
- omitted, no `@_' array is set up for the subroutine: the `@_'
- array at the time of the call is visible to the subroutine instead.
- This is an efficiency mechanism that new users may wish to
- avoid.
-
- &foo(1,2,3); # pass three arguments
- foo(1,2,3); # the same
-
- foo(); # pass a null list
- &foo(); # the same
-
- &foo; # foo() gets current args, like foo(@_) !!
- foo; # like foo() IFF sub foo predeclared, else "foo"
-
-
- Not only does the "`&'" form make the argument list optional,
- but it also disables any prototype checking on the arguments you
- do provide. This is partly for historical reasons, and partly
- for having a convenient way to cheat if you know what you're
- doing. See the section on Prototypes below.
-
- Functions whose names are in all upper case are reserved to the
- Perl core, just as are modules whose names are in all lower
- case. By a loosely held convention, a function in all capitals
- is one that will be called indirectly by the run-time system
- itself. Functions that do special, pre-defined things are
- `BEGIN', `END', `AUTOLOAD', and `DESTROY'--plus all the
- functions mentioned in the perltie manpage. The 5.005 release
- adds `INIT' to this list.
-
- Private Variables via my()
-
- Synopsis:
-
- my $foo; # declare $foo lexically local
- my (@wid, %get); # declare list of variables local
- my $foo = "flurp"; # declare $foo lexical, and init it
- my @oof = @bar; # declare @oof lexical, and init it
-
-
- A "`my'" declares the listed variables to be confined
- (lexically) to the enclosing block, conditional
- (`if/unless/elsif/else'), loop
- (`for/foreach/while/until/continue'), subroutine, `eval', or
- `do/require/use''d file. If more than one value is listed, the
- list must be placed in parentheses. All listed elements must be
- legal lvalues. Only alphanumeric identifiers may be lexically
- scoped--magical builtins like `$/' must currently be `local'ized
- with "`local'" instead.
-
- Unlike dynamic variables created by the "`local'" operator,
- lexical variables declared with "`my'" are totally hidden from
- the outside world, including any called subroutines (even if
- it's the same subroutine called from itself or elsewhere--every
- call gets its own copy).
-
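- To illustrate that every call gets its own copy of a `my'
- variable, here is a hypothetical recursive function,
- `countdown'. Each level keeps its own `$n', untouched by the
- deeper calls it makes.
-
```perl
use strict;

# Each call to countdown() has a private $n and @rest; the
# recursive call cannot see or disturb the caller's copies.
sub countdown {
    my $n = shift;
    return () if $n < 1;
    my @rest = countdown($n - 1);
    return ($n, @rest);    # $n still holds this call's own value
}

print join(" ", countdown(3)), "\n";    # 3 2 1
```
-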
- This doesn't mean that a `my()' variable declared in a
- statically *enclosing* lexical scope would be invisible. Only
- the dynamic scopes are cut off. For example, the `bumpx()'
- function below has access to the lexical `$x' variable because
- both the my and the sub occurred at the same scope, presumably
- the file scope.
-
- my $x = 10;
- sub bumpx { $x++ }
-
-
- (An `eval()', however, can see the lexical variables of the
- scope it is being evaluated in so long as the names aren't
- hidden by declarations within the `eval()' itself. See the
- perlref manpage.)
-
- The parameter list to `my()' may be assigned to if desired,
- which allows you to initialize your variables. (If no
- initializer is given for a particular variable, it is created
- with the undefined value.) Commonly this is used to name the
- parameters to a subroutine. Examples:
-
- $arg = "fred"; # "global" variable
- $n = cube_root(27);
- print "$arg thinks the root is $n\n";
- fred thinks the root is 3
-
- sub cube_root {
- my $arg = shift; # name doesn't matter
- $arg **= 1/3;
- return $arg;
- }
-
-
- The "`my'" is simply a modifier on something you might assign
- to. So when you do assign to the variables in its argument list,
- the "`my'" doesn't change whether those variables are viewed as
- a scalar or an array. So
-
- my ($foo) = <STDIN>; # WRONG?
- my @FOO = <STDIN>;
-
-
- both supply a list context to the right-hand side, while
-
- my $foo = <STDIN>;
-
-
- supplies a scalar context. But the following declares only one
- variable:
-
- my $foo, $bar = 1; # WRONG
-
-
- That has the same effect as
-
- my $foo;
- $bar = 1;
-
-
- The declared variable is not introduced (is not visible) until
- after the current statement. Thus,
-
- my $x = $x;
-
-
- can be used to initialize the new $x with the value of the old
- `$x', and the expression
-
- my $x = 123 and $x == 123
-
-
- is false unless the old `$x' happened to have the value `123'.
-
- Lexical scopes of control structures are not bounded precisely
- by the braces that delimit their controlled blocks; control
- expressions are part of the scope, too. Thus in the loop
-
- while (defined(my $line = <>)) {
- $line = lc $line;
- } continue {
- print $line;
- }
-
-
- the scope of `$line' extends from its declaration throughout the
- rest of the loop construct (including the `continue' clause),
- but not beyond it. Similarly, in the conditional
-
- if ((my $answer = <STDIN>) =~ /^yes$/i) {
- user_agrees();
- } elsif ($answer =~ /^no$/i) {
- user_disagrees();
- } else {
- chomp $answer;
- die "'$answer' is neither 'yes' nor 'no'";
- }
-
-
- the scope of `$answer' extends from its declaration throughout
- the rest of the conditional (including `elsif' and `else'
- clauses, if any), but not beyond it.
-
- (None of the foregoing applies to `if/unless' or `while/until'
- modifiers appended to simple statements. Such modifiers are not
- control structures and have no effect on scoping.)
-
- The `foreach' loop defaults to scoping its index variable
- dynamically (in the manner of `local'; see below). However, if
- the index variable is prefixed with the keyword "`my'", then it
- is lexically scoped instead. Thus in the loop
-
- for my $i (1, 2, 3) {
- some_function();
- }
-
-
- the scope of `$i' extends to the end of the loop, but not beyond
- it, and so the value of `$i' is unavailable in
- `some_function()'.
-
- Some users may wish to encourage the use of lexically scoped
- variables. As an aid to catching implicit references to package
- variables, if you say
-
- use strict 'vars';
-
-
- then any variable reference from there to the end of the
- enclosing block must either refer to a lexical variable, or must
- be fully qualified with the package name. A compilation error
- results otherwise. An inner block may countermand this with "`no
- strict 'vars''".
-
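- The effect of `use strict 'vars'' can be sketched by compiling
- small snippets at run time. The helper `compiles' is invented
- for this illustration. It relies on a string `eval' being
- compiled under the strictness in force where the `eval'
- appears.
-
```perl
use strict 'vars';

# Compile and run a snippet; return 1 if it got past strict 'vars',
# 0 if compilation failed.
sub compiles {
    my $code = shift;
    return eval("$code; 1") ? 1 : 0;
}

print compiles('my $count = 1; $count++'), "\n";           # lexical: accepted
print compiles('$main::Total = 1'), "\n";                  # fully qualified: accepted
print compiles('$undeclared = 1'), "\n";                   # neither: rejected
print compiles('no strict "vars"; $undeclared = 1'), "\n"; # countermanded: accepted
```
-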
- A `my()' has both a compile-time and a run-time effect. At
- compile time, the compiler takes notice of it; the principal
- usefulness of this is to quiet "`use strict 'vars''". The actual
- initialization is delayed until run time, so it gets executed
- appropriately; every time through a loop, for example.
-
- Variables declared with "`my'" are not part of any package and
- are therefore never fully qualified with the package name. In
- particular, you're not allowed to try to make a package variable
- (or other global) lexical:
-
- my $pack::var; # ERROR! Illegal syntax
- my $_; # also illegal (currently)
-
-
- In fact, dynamic variables (also known as package or global
- variables) are still accessible using the fully qualified `::'
- notation even while a lexical of the same name is also visible:
-
- package main;
- local $x = 10;
- my $x = 20;
- print "$x and $::x\n";
-
-
- That will print out `20' and `10'.
-
- You may declare "`my'" variables at the outermost scope of a
- file to hide any such identifiers totally from the outside
- world. This is similar to C's static variables at the file
- level. To do this with a subroutine requires the use of a
- closure (anonymous function with lexical access). If a block
- (such as an `eval()', function, or `package') wants to create a
- private subroutine that cannot be called from outside that
- block, it can declare a lexical variable containing an anonymous
- sub reference:
-
- my $secret_version = '1.001-beta';
- my $secret_sub = sub { print $secret_version };
- &$secret_sub();
-
-
- As long as the reference is never returned by any function
- within the module, no outside module can see the subroutine,
- because its name is not in any package's symbol table. Remember
- that it's not *REALLY* called `$some_pack::secret_version' or
- anything; it's just `$secret_version', unqualified and
- unqualifiable.
-
- This does not work with object methods, however; all object
- methods have to be in the symbol table of some package to be
- found.
-
- Persistent Private Variables
-
- Just because a lexical variable is lexically (also called
- statically) scoped to its enclosing block, `eval', or `do' FILE,
- this doesn't mean that within a function it works like a C
- static. It normally works more like a C auto, but with implicit
- garbage collection.
-
- Unlike local variables in C or C++, Perl's lexical variables
- don't necessarily get recycled just because their scope has
- exited. If something more permanent is still aware of the
- lexical, it will stick around. So long as something else
- references a lexical, that lexical won't be freed--which is as
- it should be. You wouldn't want memory being freed until you were
- done using it, or kept around once you were done. Automatic
- garbage collection takes care of this for you.
-
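- As a sketch of a lexical outliving its scope, the hypothetical
- `make_counter' below returns an anonymous sub that still refers
- to `$count' after `make_counter' has returned. Each call creates
- a separate `$count', so the counters are independent.
-
```perl
use strict;

# Each call creates a fresh $count; the returned closure keeps
# it alive for as long as the closure itself exists.
sub make_counter {
    my $count = shift;
    return sub { return $count++ };
}

my $c1 = make_counter(10);
my $c2 = make_counter(100);
print $c1->(), " ", $c1->(), " ", $c2->(), "\n";    # 10 11 100
```
-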
- This means that you can pass back or save away references to
- lexical variables, whereas to return a pointer to a C auto is a
- grave error. It also gives us a way to simulate C's function
- statics. Here's a mechanism for giving a function private
- variables with both lexical scoping and a static lifetime. If
- you do want to create something like C's static variables, just
- enclose the whole function in an extra block, and put the static
- variable outside the function but in the block.
-
- {
- my $secret_val = 0;
- sub gimme_another {
- return ++$secret_val;
- }
- }
- # $secret_val now becomes unreachable by the outside
- # world, but retains its value between calls to gimme_another
-
-
- If this function is being sourced in from a separate file via
- `require' or `use', then this is probably just fine. If it's all
- in the main program, you'll need to arrange for the `my()' to be
- executed early, either by putting the whole block above your
- main program, or more likely, placing merely a `BEGIN' sub
- around it to make sure it gets executed before your program
- starts to run:
-
- sub BEGIN {
- my $secret_val = 0;
- sub gimme_another {
- return ++$secret_val;
- }
- }
-
-
- See the section on "Package Constructors and Destructors" in the
- perlmod manpage about the `BEGIN' function.
-
- If declared at the outermost scope, the file scope, then
- lexicals work somewhat like C's file statics. They are available
- to all functions in that same file declared below them, but are
- inaccessible from outside of the file. This is sometimes used in
- modules to create private variables for the whole module.
-
- Temporary Values via local()
-
- NOTE: In general, you should be using "`my'" instead of
- "`local'", because it's faster and safer. Exceptions to this
- include the global punctuation variables, filehandles and
- formats, and direct manipulation of the Perl symbol table
- itself. Format variables often use "`local'" though, as do other
- variables whose current value must be visible to called
- subroutines.
-
- Synopsis:
-
- local $foo; # declare $foo dynamically local
- local (@wid, %get); # declare list of variables local
- local $foo = "flurp"; # declare $foo dynamic, and init it
- local @oof = @bar; # declare @oof dynamic, and init it
-
- local *FH; # localize $FH, @FH, %FH, &FH ...
- local *merlyn = *randal; # now $merlyn is really $randal, plus
- # @merlyn is really @randal, etc
- local *merlyn = 'randal'; # SAME THING: promote 'randal' to *randal
- local *merlyn = \$randal; # just alias $merlyn, not @merlyn etc
-
-
- A `local()' modifies its listed variables to be "local" to the
- enclosing block, `eval', or `do FILE'--and to *any subroutine
- called from within that block*. A `local()' just gives temporary
- values to global (meaning package) variables. It does not create
- a local variable. This is known as dynamic scoping. Lexical
- scoping is done with "`my'", which works more like C's auto
- declarations.
-
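- The difference between dynamic and lexical scoping can be
- sketched with a package variable and three made-up functions.
- `show_level' reads the global `$level'. It sees the temporary
- value given by `local', but not the separate variable created
- by `my'.
-
```perl
use strict;
use vars qw($level);    # a package (global) variable

$level = "outer";

sub show_level { return $level }    # reads whatever $level currently holds

sub with_local {
    local $level = "inner";         # temporary value, seen by called subs
    return show_level();
}

sub with_my {
    my $level = "inner";            # new lexical, invisible to show_level()
    return show_level();
}

print with_local(), "\n";    # inner
print with_my(), "\n";       # outer
print show_level(), "\n";    # outer: the saved value was restored
```
-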
- If more than one variable is given to `local()', they must be
- placed in parentheses. All listed elements must be legal
- lvalues. This operator works by saving the current values of
- those variables in its argument list on a hidden stack and
- restoring them upon exiting the block, subroutine, or eval. This
- means that called subroutines can also reference the local
- variable, but not the global one. The argument list may be
- assigned to if desired, which allows you to initialize your
- local variables. (If no initializer is given for a particular
- variable, it is created with an undefined value.) Commonly this
- is used to name the parameters to a subroutine. Examples:
-
- for $i ( 0 .. 9 ) {
- $digits{$i} = $i;
- }
- # assume this function uses global %digits hash
- parse_num();
-
- # now temporarily add to %digits hash
- if ($base12) {
- # (NOTE: not claiming this is efficient!)
- local %digits = (%digits, 't' => 10, 'e' => 11);
- parse_num(); # parse_num gets this new %digits!
- }
- # old %digits restored here
-
-
- Because `local()' is a run-time command, it gets executed every
- time through a loop. In releases of Perl previous to 5.0, this
- used more stack storage each time until the loop was exited.
- Perl now reclaims the space each time through, but it's still
- more efficient to declare your variables outside the loop.
-
- A `local' is simply a modifier on an lvalue expression. When you
- assign to a `local'ized variable, the `local' doesn't change
- whether its list is viewed as a scalar or an array. So
-
- local($foo) = <STDIN>;
- local @FOO = <STDIN>;
-
-
- both supply a list context to the right-hand side, while
-
- local $foo = <STDIN>;
-
-
- supplies a scalar context.
-
- A note about `local()' and composite types is in order.
- Something like `local(%foo)' works by temporarily placing a
- brand new hash in the symbol table. The old hash is left alone,
- but is hidden "behind" the new one.
-
- This means the old variable is completely invisible via the
- symbol table (i.e. the hash entry in the `*foo' typeglob) for
- the duration of the dynamic scope within which the `local()' was
- seen. This has the effect of allowing one to temporarily occlude
- any magic on composite types. For instance, this will briefly
- alter a tied hash to some other implementation:
-
- tie %ahash, 'APackage';
- [...]
- {
- local %ahash;
- tie %ahash, 'BPackage';
- [..called code will see %ahash tied to 'BPackage'..]
- {
- local %ahash;
- [..%ahash is a normal (untied) hash here..]
- }
- }
- [..%ahash back to its initial tied self again..]
-
-
- As another example, a custom implementation of `%ENV' might look
- like this:
-
- {
- local %ENV;
- tie %ENV, 'MyOwnEnv';
- [..do your own fancy %ENV manipulation here..]
- }
- [..normal %ENV behavior here..]
-
-
- It's also worth taking a moment to explain what happens when you
- `local'ize a member of a composite type (i.e. an array or hash
- element). In this case, the element is `local'ized *by name*.
- This means that when the scope of the `local()' ends, the saved
- value will be restored to the hash element whose key was named
- in the `local()', or the array element whose index was named in
- the `local()'. If that element was deleted while the `local()'
- was in effect (e.g. by a `delete()' from a hash or a `shift()'
- of an array), it will spring back into existence, possibly
- extending an array and filling in the skipped elements with
- `undef'. For instance, if you say
-
- %hash = ( 'This' => 'is', 'a' => 'test' );
- @ary = ( 0..5 );
- {
- local($ary[5]) = 6;
- local($hash{'a'}) = 'drill';
- while (my $e = pop(@ary)) {
- print "$e . . .\n";
- last unless $e > 3;
- }
- if (@ary) {
- $hash{'only a'} = 'test';
- delete $hash{'a'};
- }
- }
- print join(' ', map { "$_ $hash{$_}" } sort keys %hash),".\n";
- print "The array has ",scalar(@ary)," elements: ",
- join(', ', map { defined $_ ? $_ : 'undef' } @ary),"\n";
-
-
- Perl will print
-
- 6 . . .
- 4 . . .
- 3 . . .
- This is a test only a test.
- The array has 6 elements: 0, 1, 2, undef, undef, 5
-
-
- Note also that when you `local'ize a member of a composite type
- that does not exist previously, the value is treated as though
- it were in an lvalue context, i.e., it is first created and then
- `local'ized. The consequence of this is that the hash or array
- is in fact permanently modified. For instance, if you say
-
- %hash = ( 'This' => 'is', 'a' => 'test' );
- @ary = ( 0..5 );
- {
- local($ary[8]) = 0;
- local($hash{'b'}) = 'whatever';
- }
- printf "%%hash has now %d keys, \@ary %d elements.\n",
- scalar(keys(%hash)), scalar(@ary);
-
-
- Perl will print
-
- %hash has now 3 keys, @ary 9 elements.
-
-
- The above behavior of local() on non-existent members of
- composite types is subject to change in future.
-
- Passing Symbol Table Entries (typeglobs)
-
- [Note: The mechanism described in this section was originally
- the only way to simulate pass-by-reference in older versions of
- Perl. While it still works fine in modern versions, the new
- reference mechanism is generally easier to work with. See
- below.]
-
- Sometimes you don't want to pass the value of an array to a
- subroutine but rather the name of it, so that the subroutine can
- modify the global copy of it rather than working with a local
- copy. In perl you can refer to all objects of a particular name
- by prefixing the name with a star: `*foo'. This is often known
- as a "typeglob", because the star on the front can be thought of
- as a wildcard match for all the funny prefix characters on
- variables and subroutines and such.
-
- When evaluated, the typeglob produces a scalar value that
- represents all the objects of that name, including any
- filehandle, format, or subroutine. When assigned to, it causes
- the name mentioned to refer to whatever "`*'" value was assigned
- to it. Example:
-
- sub doubleary {
- local(*someary) = @_;
- foreach $elem (@someary) {
- $elem *= 2;
- }
- }
- doubleary(*foo);
- doubleary(*bar);
-
-
- Note that scalars are already passed by reference, so you can
- modify scalar arguments without using this mechanism by
- referring explicitly to `$_[0]' etc. You can modify all the
- elements of an array by passing all the elements as scalars, but
- you have to use the `*' mechanism (or the equivalent reference
- mechanism) to `push', `pop', or change the size of an array. It
- will certainly be faster to pass the typeglob (or reference).
-
- Even if you don't want to modify an array, this mechanism is
- useful for passing multiple arrays in a single LIST, because
- normally the LIST mechanism will merge all the array values so
- that you can't extract out the individual arrays. For more on
- typeglobs, see the section on "Typeglobs and Filehandles" in the
- perldata manpage.
-
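- Changing the size of a caller's array through a typeglob might
- look like the sketch below. The names `append_items', `@queue',
- and `@someary' are chosen only for illustration.
-
```perl
use strict;
use vars qw(@queue @someary);

# Alias @someary to the caller's array for the duration of the
# call, then grow the caller's array through the alias.
sub append_items {
    local (*someary) = shift;
    push @someary, @_;
    return scalar @someary;
}

@queue = (1, 2);
append_items(*queue, 3, 4);
print "@queue\n";    # 1 2 3 4
```
-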
- When to Still Use local()
-
- Despite the existence of `my()', there are three places
- where the `local()' operator still shines. In fact, in these
- three places, you *must* use `local' instead of `my'.
-
- 1. You need to give a global variable a temporary value, especially `$_'.
- The global variables, like `@ARGV' or the punctuation
- variables, must be `local'ized with `local()'. This block
- reads in /etc/motd, and splits it up into chunks separated
- by lines of equal signs, which are placed in `@Fields'.
-
- {
- local @ARGV = ("/etc/motd");
- local $/ = undef;
- local $_ = <>;
- @Fields = split /^\s*=+\s*$/;
- }
-
-
- In particular, it's important to `local'ize `$_' in any
- routine that assigns to it. Look out for implicit
- assignments in `while' conditionals.
-
- 2. You need to create a local file or directory handle or a local function.
- A function that needs a filehandle of its own must use
- `local()' on a complete typeglob. This can be
- used to create new symbol table entries:
-
- sub ioqueue {
- local (*READER, *WRITER); # not my!
- pipe (READER, WRITER) or die "pipe: $!";
- return (*READER, *WRITER);
- }
- ($head, $tail) = ioqueue();
-
-
- See the Symbol module for a way to create anonymous symbol
- table entries.
-
- Because assignment of a reference to a typeglob creates an
- alias, this can be used to create what is effectively a
- local function, or at least, a local alias.
-
- {
- local *grow = \&shrink; # only until this block exits
- grow(); # really calls shrink()
- move(); # if move() grow()s, it shrink()s too
- }
- grow(); # get the real grow() again
-
-
- See the section on "Function Templates" in the perlref
- manpage for more about manipulating functions by name in
- this way.
-
- 3. You want to temporarily change just one element of an array or hash.
- You can `local'ize just one element of an aggregate. Usually
- this is done on dynamics:
-
- {
- local $SIG{INT} = 'IGNORE';
- funct(); # uninterruptible
- }
- # interruptibility automatically restored here
-
-
- But it also works on lexically declared aggregates. Prior to
- 5.005, this operation could on occasion misbehave.
-
-
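- As a further sketch of the third case, here a single element of
- a package hash gets a temporary value. The hash `%Settings' and
- the functions `report' and `noisy_report' are hypothetical.
-
```perl
use strict;
use vars qw(%Settings);

%Settings = (verbose => 0, depth => 1);

sub report { return "verbose=$Settings{verbose} depth=$Settings{depth}" }

sub noisy_report {
    local $Settings{verbose} = 1;    # only this one element changes
    return report();
}

print noisy_report(), "\n";    # verbose=1 depth=1
print report(), "\n";          # verbose=0 depth=1
```
-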
- Pass by Reference
-
- If you want to pass more than one array or hash into a function--or
- return them from it--and have them maintain their integrity,
- then you're going to have to use an explicit pass-by-reference.
- Before you do that, you need to understand references as
- detailed in the perlref manpage. This section may not make much
- sense to you otherwise.
-
- Here are a few simple examples. First, let's pass in several
- arrays to a function and have it `pop' all of them, returning a new
- list of all their former last elements:
-
- @tailings = popmany ( \@a, \@b, \@c, \@d );
-
- sub popmany {
- my $aref;
- my @retlist = ();
- foreach $aref ( @_ ) {
- push @retlist, pop @$aref;
- }
- return @retlist;
- }
-
-
- Here's how you might write a function that returns a list of
- keys occurring in all the hashes passed to it:
-
- @common = inter( \%foo, \%bar, \%joe );
- sub inter {
- my ($k, $href, %seen); # locals
- foreach $href (@_) {
- while ( defined($k = each %$href) ) {
- $seen{$k}++;
- }
- }
- return grep { $seen{$_} == @_ } keys %seen;
- }
-
-
- So far, we're using just the normal list return mechanism. What
- happens if you want to pass or return a hash? Well, if you're
- using only one of them, or you don't mind them concatenating,
- then the normal calling convention is ok, although a little
- expensive.
-
- Where people get into trouble is here:
-
- (@a, @b) = func(@c, @d);
- or
- (%a, %b) = func(%c, %d);
-
-
- That syntax simply won't work. It sets just `@a' or `%a' and
- clears the `@b' or `%b'. Plus the function didn't get passed
- two separate arrays or hashes: it got one long list in
- `@_', as always.
-
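- A quick sketch makes the failure visible. The function
- `pass_through' is made up for the purpose and simply hands back
- its argument list.
-
```perl
use strict;

sub pass_through { return @_ }    # returns its flat argument list

my @c = (1, 2);
my @d = (3, 4, 5);
my (@first, @second) = pass_through(@c, @d);

# @first swallowed all five values; @second is empty.
print scalar(@first), " ", scalar(@second), "\n";    # 5 0
```
-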
- If you can arrange for everyone to deal with this through
- references, it's cleaner code, although not so nice to look at.
- Here's a function that takes two array references as arguments,
- returning the two references ordered by how many elements the
- arrays have in them, larger first:
-
- ($aref, $bref) = func(\@c, \@d);
- print "@$aref has more than @$bref\n";
- sub func {
- my ($cref, $dref) = @_;
- if (@$cref > @$dref) {
- return ($cref, $dref);
- } else {
- return ($dref, $cref);
- }
- }
-
-
- It turns out that you can actually do this also:
-
- (*a, *b) = func(\@c, \@d);
- print "@a has more than @b\n";
- sub func {
- local (*c, *d) = @_;
- if (@c > @d) {
- return (\@c, \@d);
- } else {
- return (\@d, \@c);
- }
- }
-
-
- Here we're using the typeglobs to do symbol table aliasing. It's
- a tad subtle, though, and also won't work if you're using `my()'
- variables, because only globals (well, and `local()'s) are in
- the symbol table.
-
- If you're passing around filehandles, you could usually just use
- the bare typeglob, like `*STDOUT', but typeglob references
- would be better because they'll still work properly under `use
- strict 'refs''. For example:
-
- splutter(\*STDOUT);
- sub splutter {
- my $fh = shift;
- print $fh "her um well a hmmm\n";
- }
-
- $rec = get_rec(\*STDIN);
- sub get_rec {
- my $fh = shift;
- return scalar <$fh>;
- }
-
-
- Another way to do this is using `*HANDLE{IO}'; see the perlref
- manpage for usage and caveats.
-
- If you're planning on generating new filehandles, you could do
- this:
-
- sub openit {
- my $path = shift;
- local *FH;
- return open (FH, $path) ? *FH : undef;
- }
-
-
- Be aware that this will actually produce a small memory leak. See the
- bottom of the "open()" entry in the perlfunc manpage for a
- somewhat cleaner way using the `IO::Handle' package.
-
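- One alternative, sketched here on the assumption that the
- standard `Symbol' module mentioned earlier is installed, is to
- let `gensym()' supply an anonymous typeglob so that no named
- handle such as `FH' is involved. The helper name
- `open_for_read' is invented for this sketch.
-
```perl
use strict;
use Symbol;    # exports gensym()

# Return a fresh handle opened for reading on $path,
# or undef if the open fails.
sub open_for_read {
    my $path = shift;
    my $fh = gensym();    # a new anonymous typeglob reference
    return open($fh, "< $path") ? $fh : undef;
}
```
-
- A caller would check the result for `undef' before reading,
- for example with `my $fh = open_for_read($file) or die ...'.
-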
- Prototypes
-
- As of the 5.002 release of perl, if you declare
-
- sub mypush (\@@)
-
-
- then `mypush()' takes arguments exactly like `push()' does. The
- declaration of the function to be called must be visible at
- compile time. The prototype affects only the interpretation of
- new-style calls to the function, where new-style is defined as
- not using the `&' character. In other words, if you call it like
- a builtin function, then it behaves like a builtin function. If
- you call it like an old-fashioned subroutine, then it behaves
- like an old-fashioned subroutine. It naturally falls out from
- this rule that prototypes have no influence on subroutine
- references like `\&foo' or on indirect subroutine calls like
- `&{$subref}' or `$subref->()'.
-
- Method calls are not influenced by prototypes either, because
- the function to be called is indeterminate at compile time: it
- depends on inheritance.
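-
- The difference between new-style and old-style calls can be
- seen in a short sketch. Only the declaration of `mypush' comes
- from the text above; its body and the variable names are
- assumptions made for illustration.
-
```perl
use strict;
use warnings;

# Declared before use, so the prototype is visible at compile time.
sub mypush (\@@) {
    my ($aref, @items) = @_;
    push @$aref, @items;
    return scalar @$aref;
}

my @stack = (1);

mypush @stack, 2, 3;       # new-style call: @stack arrives as \@stack
&mypush(\@stack, 4);       # & bypasses the prototype: pass the reference yourself

my $code = \&mypush;
$code->(\@stack, 5);       # indirect call: the prototype is ignored here too

print "@stack\n";          # prints "1 2 3 4 5"
```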
-
- Because the intent is primarily to let you define subroutines
- that work like builtin commands, here are the prototypes for
- some other functions that parse almost exactly like the
- corresponding builtins.
-
- Declared as                Called as
-
- sub mylink ($$)            mylink $old, $new
- sub myvec ($$$)            myvec $var, $offset, 1
- sub myindex ($$;$)         myindex &getstring, "substr"
- sub mysyswrite ($$$;$)     mysyswrite $buf, 0, length($buf) - $off, $off
- sub myreverse (@)          myreverse $a, $b, $c
- sub myjoin ($@)            myjoin ":", $a, $b, $c
- sub mypop (\@)             mypop @array
- sub mysplice (\@$$@)       mysplice @array, @array, 0, @pushme
- sub mykeys (\%)            mykeys %{$hashref}
- sub myopen (*;$)           myopen HANDLE, $name
- sub mypipe (**)            mypipe READHANDLE, WRITEHANDLE
- sub mygrep (&@)            mygrep { /foo/ } $a, $b, $c
- sub myrand ($)             myrand 42
- sub mytime ()              mytime
-
-
- Any backslashed prototype character represents an actual
- argument that absolutely must start with that character. The
- value passed to the subroutine (as part of `@_') will be a
- reference to the actual argument given in the subroutine call,
- obtained by applying `\' to that argument.
-
- Unbackslashed prototype characters have special meanings. Any
- unbackslashed `@' or `%' eats all the rest of the arguments, and
- forces list context. An argument represented by `$' forces
- scalar context. An `&' requires an anonymous subroutine, which,
- if passed as the first argument, does not require the "`sub'"
- keyword or a subsequent comma. A `*' allows the subroutine to
- accept a bareword, constant, scalar expression, typeglob, or a
- reference to a typeglob in that slot. The value will be
- available to the subroutine either as a simple scalar, or (in
- the latter two cases) as a reference to the typeglob.
-
- A semicolon separates mandatory arguments from optional
- arguments. (It is redundant before `@' or `%'.)
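-
- As a sketch of `$' forcing scalar context and of `;' marking an
- optional argument, here is one way `myindex' might be written.
- The body is an assumption; it simply defers to the builtin
- `index()'.
-
```perl
use strict;
use warnings;

sub myindex ($$;$) {
    my ($string, $substr, $pos) = @_;
    $pos = 0 unless defined $pos;          # third argument is optional
    return index($string, $substr, $pos);
}

my @three = ('only', 'a', 'count');

my $from_start = myindex "abcabc", "abc";       # two mandatory arguments
my $from_one   = myindex "abcabc", "abc", 1;    # optional third argument given
my $counted    = myindex "0123456", @three;     # @three in scalar context is 3

print "$from_start $from_one $counted\n";       # prints "0 3 3"
```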
-
- Note how the last three examples above are treated specially by
- the parser. `mygrep()' is parsed as a true list operator,
- `myrand()' is parsed as a true unary operator with unary
- precedence the same as `rand()', and `mytime()' is truly without
- arguments, just like `time()'. That is, if you say
-
- mytime +2;
-
-
- you'll get `mytime() + 2', not `mytime(2)', which is how it
- would be parsed without the prototype.
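-
- A small sketch of the two parsing rules. The bodies are
- stand-ins chosen so the result is predictable; they are not the
- real `time()' or `rand()'.
-
```perl
use strict;
use warnings;

sub mytime ()  { 100 }            # stand-in value instead of the real clock
sub myrand ($) { $_[0] * 2 }      # stand-in: doubles its single argument

my $sum  = mytime + 2;            # parsed as mytime() + 2
my @pair = (myrand 4, 5);         # parsed as (myrand(4), 5)

print "$sum @pair\n";             # prints "102 8 5"
```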
-
- The interesting thing about `&' is that you can generate new
- syntax with it:
-
- sub try (&@) {
- my($try,$catch) = @_;
- eval { &$try };
- if ($@) {
- local $_ = $@;
- &$catch;
- }
- }
- sub catch (&) { $_[0] }
-
- try {
- die "phooey";
- } catch {
- /phooey/ and print "unphooey\n";
- };
-
-
- That prints `"unphooey"'. (Yes, there are still unresolved
- issues having to do with the visibility of `@_'. I'm ignoring
- that question for the moment. (But note that if we make `@_'
- lexically scoped, those anonymous subroutines can act like
- closures... (Gee, is this sounding a little Lispish? (Never
- mind.))))
-
- And here's a reimplementation of `grep':
-
- sub mygrep (&@) {
- my $code = shift;
- my @result;
- foreach $_ (@_) {
- push(@result, $_) if &$code;
- }
- @result;
- }
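-
- The same `(&@)' prototype works for any block-taking helper.
- The following sketch uses a hypothetical `first_match', which
- returns the first element for which the block is true.
-
```perl
use strict;
use warnings;

# Hypothetical helper: the block sees each candidate in $_.
sub first_match (&@) {
    my $code = shift;
    for (@_) {
        return $_ if $code->();
    }
    return;                        # no match: undef in scalar context
}

my $first_even = first_match { $_ % 2 == 0 } 3, 5, 8, 9, 10;
my $none       = first_match { $_ > 100 } 1, 2, 3;

print "$first_even\n";             # prints "8"
```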
-
-
- Some folks would prefer full alphanumeric prototypes.
- Alphanumerics have been intentionally left out of prototypes for
- the express purpose of someday in the future adding named,
- formal parameters. The current mechanism's main goal is to let
- module writers provide better diagnostics for module users.
- Larry feels the notation quite understandable to Perl
- programmers, and that it will not intrude greatly upon the meat
- of the module, nor make it harder to read. The line noise is
- visually encapsulated into a small pill that's easy to swallow.
-
- It's probably best to prototype new functions, not retrofit
- prototyping into older ones. That's because you must be
- especially careful about silent impositions of differing list
- versus scalar contexts. For example, if you decide that a
- function should take just one parameter, like this:
-
- sub func ($) {
- my $n = shift;
- print "you gave me $n\n";
- }
-
-
- and someone has been calling it with an array or expression
- returning a list:
-
- func(@foo);
- func( split /:/ );
-
-
- Then you've just supplied an automatic `scalar()' in front of
- their argument, which can be more than a bit surprising. The
- single element that `@foo' used to hold no longer gets passed
- in. Instead, `func()' now receives `1', that is, the number of
- elements in `@foo'. And the `split()' gets called in a scalar
- context and starts scribbling on your `@_' parameter list.
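-
- The array case can be reproduced with a short sketch. The names
- `old_func' and `new_func' are hypothetical; they differ only in
- the prototype.
-
```perl
use strict;
use warnings;

sub old_func     { my $n = shift; return "you gave me $n" }
sub new_func ($) { my $n = shift; return "you gave me $n" }

my @foo = ('apple');

my $before = old_func(@foo);   # the array is flattened: $n is 'apple'
my $after  = new_func(@foo);   # scalar(@foo) is passed: $n is 1

print "$before\n$after\n";
```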
-
- This is all very powerful, of course, and should be used only in
- moderation to make the world a better place.
-
- Constant Functions
-
- Functions with a prototype of `()' are potential candidates for
- inlining. If the result after optimization and constant folding
- is either a constant or a lexically-scoped scalar which has no
- other references, then it will be used in place of function
- calls made without `&' or `do'. Calls made using `&' or `do' are
- never inlined. (See constant.pm for an easy way to declare most
- constants.)
-
- The following functions would all be inlined:
-
- sub pi () { 3.14159 } # Not exact, but close.
- sub PI () { 4 * atan2 1, 1 } # As good as it gets,
- # and it's inlined, too!
- sub ST_DEV () { 0 }
- sub ST_INO () { 1 }
-
- sub FLAG_FOO () { 1 << 8 }
- sub FLAG_BAR () { 1 << 9 }
- sub FLAG_MASK () { FLAG_FOO | FLAG_BAR }
-
- sub OPT_BAZ () { not (0x1B58 & FLAG_MASK) }
- sub BAZ_VAL () {
- if (OPT_BAZ) {
- return 23;
- }
- else {
- return 42;
- }
- }
-
- sub N () { int(BAZ_VAL) / 3 }
- BEGIN {
- my $prod = 1;
- for (1..N) { $prod *= $_ }
- sub N_FACTORIAL () { $prod }
- }
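-
- Several of these constants could also be declared through
- constant.pm. This is a sketch, assuming a version of the
- `constant' pragma that accepts a hash reference (Perl 5.8 or
- later).
-
```perl
use strict;
use warnings;

use constant PI => 4 * atan2(1, 1);
use constant {
    FLAG_FOO => 1 << 8,
    FLAG_BAR => 1 << 9,
};
use constant FLAG_MASK => FLAG_FOO | FLAG_BAR;

# Each constant is a sub with an empty prototype, which is what
# makes it a candidate for inlining.
my $proto = prototype(\&PI);

printf "%.5f %d\n", PI, FLAG_MASK;   # prints "3.14159 768"
```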
-
-
- If you redefine a subroutine that was eligible for inlining,
- you'll get a mandatory warning. (You can use this warning to
- tell whether or not a particular subroutine is considered
- constant.) The warning is considered severe enough not to be
- optional because previously compiled invocations of the function
- will still be using the old value of the function. If you need
- to be able to redefine the subroutine you need to ensure that it
- isn't inlined, either by dropping the `()' prototype (which
- changes the calling semantics, so beware) or by thwarting the
- inlining mechanism in some other way, such as
-
- sub not_inlined () {
- 23 if $];
- }
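-
- The effect of redefining an inlined subroutine can be observed
- with a sketch like the following. The names `ANSWER' and
- `report' are hypothetical, and any warning raised by the
- redefinition is collected in `@warnings' rather than printed.
-
```perl
use strict;
use warnings;

sub ANSWER () { 42 }
sub report { return ANSWER }      # ANSWER is inlined here, at compile time

my @warnings;
{
    local $SIG{__WARN__} = sub { push @warnings, @_ };
    eval 'sub ANSWER () { 43 } 1;' or die $@;   # redefine at run time
}

my $inlined = report();    # the previously compiled call keeps the old value
my $current = &ANSWER();   # a call with & is never inlined

print "$inlined $current\n";   # prints "42 43"
```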
-
-
- Overriding Builtin Functions
-
- Many builtin functions may be overridden, though this should be
- tried only occasionally and for good reason. Typically this
- might be done by a package attempting to emulate missing builtin
- functionality on a non-Unix system.
-
- Overriding may be done only by importing the name from a
- module--ordinary predeclaration isn't good enough. However, the
- `subs' pragma (compiler directive) lets you, in effect,
- predeclare subs via the import syntax, and these names may then
- override the builtin ones:
-
- use subs 'chdir', 'chroot', 'chmod', 'chown';
- chdir $somewhere;
- sub chdir { ... }
-
-
- To unambiguously refer to the builtin form, one may precede the
- builtin name with the special package qualifier `CORE::'. For
- example, saying `CORE::open()' will always refer to the builtin
- `open()', even if the current package has imported some other
- subroutine called `&open()' from elsewhere.
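-
- A sketch combining the `subs' pragma with `CORE::': the
- override records each directory and then defers to the builtin.
- The `@visited' array is a hypothetical addition for
- illustration.
-
```perl
use strict;
use warnings;
use subs 'chdir';

my @visited;

# Package-local override; CORE::chdir still reaches the builtin.
sub chdir {
    my $dir = shift;
    push @visited, $dir;
    return CORE::chdir($dir);
}

my $ok = chdir '.';        # calls the override, not the builtin

print "visited: @visited\n";
```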
-
- Library modules should not in general export builtin names like
- "`open'" or "`chdir'" as part of their default `@EXPORT' list,
- because these may sneak into someone else's namespace and change
- the semantics unexpectedly. Instead, if the module adds the name
- to the `@EXPORT_OK' list, then it's possible for a user to
- import the name explicitly, but not implicitly. That is, they
- could say
-
- use Module 'open';
-
-
- and it would import the `open' override, but if they said
-
- use Module;
-
-
- they would get the default imports without the overrides.
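-
- The `@EXPORT_OK' arrangement might look like the sketch below.
- `FrozenTime' is a hypothetical module, written inline in the
- same file, so `BEGIN' blocks stand in for what `use' would do
- with a separate file. It overrides `time' rather than `open' to
- keep the example short.
-
```perl
use strict;
use warnings;

package FrozenTime;                  # hypothetical module
BEGIN {
    require Exporter;
    our @ISA       = ('Exporter');
    our @EXPORT    = ();             # default import list: no overrides
    our @EXPORT_OK = ('time');       # override offered only on request
}
sub time { return 86_400 }           # pretend the clock is stuck

package main;
BEGIN { FrozenTime->import('time') } # stands in for: use FrozenTime 'time';

my $frozen = time;                   # the imported override
my $real   = CORE::time;             # the builtin, unambiguously

print "frozen: $frozen\n";
```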
-
- The foregoing mechanism for overriding builtins is restricted,
- quite deliberately, to the package that requests the import.
- There is a second method that is sometimes applicable when you
- wish to override a builtin everywhere, without regard to
- namespace boundaries. This is achieved by importing a sub into
- the special namespace `CORE::GLOBAL::'. Here is an example that
- quite brazenly replaces the `glob' operator with something that
- understands regular expressions.
-
- package REGlob;
- require Exporter;
- @ISA = 'Exporter';
- @EXPORT_OK = 'glob';
-
- sub import {
- my $pkg = shift;
- return unless @_;
- my $sym = shift;
- my $where = ($sym =~ s/^GLOBAL_// ? 'CORE::GLOBAL' : caller(0));
- $pkg->export($where, $sym, @_);
- }
-
- sub glob {
- my $pat = shift;
- my @got;
- local(*D);
- if (opendir D, '.') { @got = grep /$pat/, readdir D; closedir D; }
- @got;
- }
- 1;
-
-
- And here's how it could be (ab)used:
-
- #use REGlob 'GLOBAL_glob'; # override glob() in ALL namespaces
- package Foo;
- use REGlob 'glob'; # override glob() in Foo:: only
- print for <^[a-z_]+\.pm\$>; # show all pragmatic modules
-
-
- Note that the initial comment shows a contrived, even dangerous
- example. By overriding `glob' globally, you would be forcing the
- new (and subversive) behavior for the `glob' operator for every
- namespace, without the complete cognizance or cooperation of the
- modules that own those namespaces. Naturally, this should be
- done with extreme caution--if it must be done at all.
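-
- For comparison, a global override can also be installed by
- assigning directly to a `CORE::GLOBAL::' glob inside a `BEGIN'
- block. This is a different technique from the `Exporter'-based
- import shown above. The sketch replaces `sleep' with a
- hypothetical recorder so that it runs instantly.
-
```perl
use strict;
use warnings;

our @naps;

BEGIN {
    no warnings 'once';
    # Must be in place before any code that calls sleep is compiled.
    *CORE::GLOBAL::sleep = sub {
        push @main::naps, $_[0];     # record instead of sleeping
        return $_[0];
    };
}

package Elsewhere;                   # hypothetical unrelated package
sub nap { return sleep 5 }           # sees the global override

package main;
my $slept = Elsewhere::nap();

print "recorded: @naps\n";
```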
-
- The `REGlob' example above does not implement all the support
- needed to cleanly override perl's `glob' operator. The builtin
- `glob' has different behaviors depending on whether it appears
- in a scalar or list context, but our `REGlob' doesn't. Indeed,
- many perl builtins have such context sensitive behaviors, and
- these must be adequately supported by a properly written
- override. For a fully functional example of overriding `glob',
- study the implementation of `File::DosGlob' in the standard
- library.
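-
- The context sensitivity such an override needs can be sketched
- with `wantarray'. The function `match_names' below is
- hypothetical and is not an override of `glob'. It returns every
- match in list context and one match per call in scalar context,
- roughly the way `glob' iterates.
-
```perl
use strict;
use warnings;

my %pending;   # per-pattern queue of matches not yet handed out

sub match_names {
    my ($pat, @names) = @_;
    if (wantarray) {                          # list context: all at once
        return grep { /$pat/ } @names;
    }
    # Scalar context: one match per call, then undef when exhausted.
    $pending{$pat} = [ grep { /$pat/ } @names ] unless exists $pending{$pat};
    my $next = shift @{ $pending{$pat} };
    delete $pending{$pat} unless defined $next;
    return $next;
}

my @all    = match_names('^a', qw(apple bat avocado));
my $first  = match_names('^b', qw(bat bee cat));
my $second = match_names('^b', qw(bat bee cat));
my $third  = match_names('^b', qw(bat bee cat));

print "@all / $first $second\n";
```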
-
- Autoloading
-
- If you call a subroutine that is undefined, you would ordinarily
- get an immediate fatal error complaining that the subroutine
- doesn't exist. (Likewise for subroutines being used as methods,
- when the method doesn't exist in any base class of the class
- package.) If, however, there is an `AUTOLOAD' subroutine defined
- in the package or packages that were searched for the original
- subroutine, then that `AUTOLOAD' subroutine is called with the
- arguments that would have been passed to the original
- subroutine. The fully qualified name of the original subroutine
- magically appears in the `$AUTOLOAD' variable in the same
- package as the `AUTOLOAD' routine. The name is not passed as an
- ordinary argument because, er, well, just because, that's why...
-
- Most `AUTOLOAD' routines will load in a definition for the
- subroutine in question using eval, and then execute that
- subroutine using a special form of "goto" that erases the stack
- frame of the `AUTOLOAD' routine without a trace. (See the
- standard `AutoLoader' module, for example.) But an `AUTOLOAD'
- routine can also just emulate the routine and never define it.
- For example, let's pretend that a function that wasn't defined
- should just call `system()' with those arguments. All you'd do
- is this:
-
- sub AUTOLOAD {
- my $program = $AUTOLOAD;
- $program =~ s/.*:://;
- system($program, @_);
- }
- date();
- who('am', 'i');
- ls('-l');
-
-
- In fact, if you predeclare the functions you want to call that
- way, you don't even need the parentheses:
-
- use subs qw(date who ls);
- date;
- who "am", "i";
- ls -l;
-
-
- A more complete example of this is the standard Shell module,
- which can treat undefined subroutine calls as calls to Unix
- programs.
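-
- The other style of `AUTOLOAD', which defines the missing
- subroutine and then jumps to it with "goto", might be sketched
- as follows. The package `Stock' and its `%values' table are
- hypothetical, and a closure assigned to the typeglob takes the
- place of the string `eval' mentioned earlier.
-
```perl
use strict;
use warnings;

package Stock;                       # hypothetical package
our $AUTOLOAD;
my %values = (apples => 3, pears => 5);

sub AUTOLOAD {
    my $name = $AUTOLOAD;
    $name =~ s/.*:://;               # strip the package qualifier
    die "no such function: $name" unless exists $values{$name};
    no strict 'refs';
    *{$AUTOLOAD} = sub { return $values{$name} };  # define it for next time
    goto &{$AUTOLOAD};               # replace this frame with the new sub
}

package main;

my $was_defined = defined(&Stock::apples) ? 1 : 0;
my $count       = Stock::apples();   # first call goes through AUTOLOAD
my $now_defined = defined(&Stock::apples) ? 1 : 0;

print "$was_defined $count $now_defined\n";   # prints "0 3 1"
```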
-
- Mechanisms are available for module writers to help split the
- modules up into autoloadable files. See the standard AutoLoader
- module described in the AutoLoader manpage and in the AutoSplit
- manpage, the standard SelfLoader module in the SelfLoader
- manpage, and the document on adding C functions to perl code in
- the perlxs manpage.
-
- SEE ALSO
- See the perlref manpage for more about references and closures.
- See the perlxs manpage if you'd like to learn about calling C
- subroutines from perl. See the perlmod manpage to learn about
- bundling up your functions in separate files.
-
-