home *** CD-ROM | disk | FTP | other *** search
- =head1 NAME
-
- perlmod - Perl modules (packages and symbol tables)
-
- =head1 DESCRIPTION
-
- =head2 Packages
-
- Perl provides a mechanism for alternative namespaces to protect packages
- from stomping on each other's variables. In fact, apart from certain
- magical variables, there's really no such thing as a global variable
- in Perl. The package statement declares the compilation unit as
- being in the given namespace. The scope of the package declaration
- is from the declaration itself through the end of the enclosing block,
- C<eval>, C<sub>, or end of file, whichever comes first (the same scope
- as the my() and local() operators). All further unqualified dynamic
- identifiers will be in this namespace. A package statement affects
- only dynamic variables--including those you've used local() on--but
- I<not> lexical variables created with my(). Typically it would be
- the first declaration in a file to be included by the C<require> or
- C<use> operator. You can switch into a package in more than one place;
- it influences merely which symbol table is used by the compiler for the
- rest of that block. You can refer to variables and filehandles in other
- packages by prefixing the identifier with the package name and a double
- colon: C<$Package::Variable>. If the package name is null, the C<main>
- package is assumed. That is, C<$::sail> is equivalent to C<$main::sail>.
-
- (The old package delimiter was a single quote, but double colon
- is now the preferred delimiter, in part because it's more readable
- to humans, and in part because it's more readable to B<emacs> macros.
- It also makes C++ programmers feel like they know what's going on.)
-
- Packages may be nested inside other packages: C<$OUTER::INNER::var>. This
- implies nothing about the order of name lookups, however. All symbols
- are either local to the current package, or must be fully qualified
- from the outer package name down. For instance, there is nowhere
- within package C<OUTER> that C<$INNER::var> refers to C<$OUTER::INNER::var>.
- It would treat package C<INNER> as a totally separate global package.
-
- Only identifiers starting with letters (or underscore) are stored in a
- package's symbol table. All other symbols are kept in package C<main>,
- including all of the punctuation variables like $_. In addition, the
- identifiers STDIN, STDOUT, STDERR, ARGV, ARGVOUT, ENV, INC, and SIG are
- forced to be in package C<main>, even when used for other purposes than
- their builtin one. Note also that, if you have a package called C<m>,
- C<s>, or C<y>, then you can't use the qualified form of an identifier
- because it will be interpreted instead as a pattern match, a substitution,
- or a translation.
-
- (Variables beginning with underscore used to be forced into package
- main, but we decided it was more useful for package writers to be able
- to use leading underscore to indicate private variables and method names.
- $_ is still global though.)
-
- Eval()ed strings are compiled in the package in which the eval() was
- compiled. (Assignments to C<$SIG{}>, however, assume the signal
- handler specified is in the C<main> package. Qualify the signal handler
- name if you wish to have a signal handler in a package.) For an
- example, examine F<perldb.pl> in the Perl library. It initially switches
- to the C<DB> package so that the debugger doesn't interfere with variables
- in the script you are trying to debug. At various points, however, it
- temporarily switches back to the C<main> package to evaluate various
- expressions in the context of the C<main> package (or wherever you came
- from). See L<perldebug>.
-
- The special symbol C<__PACKAGE__> contains the current package, but cannot
- (easily) be used to construct variables.
-
- See L<perlsub> for other scoping issues related to my() and local(),
- and L<perlref> regarding closures.
-
- =head2 Symbol Tables
-
- The symbol table for a package happens to be stored in the hash of that
- name with two colons appended. The main symbol table's name is thus
- C<%main::>, or C<%::> for short. Likewise symbol table for the nested
- package mentioned earlier is named C<%OUTER::INNER::>.
-
- The value in each entry of the hash is what you are referring to when you
- use the C<*name> typeglob notation. In fact, the following have the same
- effect, though the first is more efficient because it does the symbol
- table lookups at compile time:
-
- local *main::foo = *main::bar;
- local $main::{foo} = $main::{bar};
-
- You can use this to print out all the variables in a package, for
- instance. Here is F<dumpvar.pl> from the Perl library:
-
- package dumpvar;
- sub main::dumpvar {
- ($package) = @_;
- local(*stab) = eval("*${package}::");
- while (($key,$val) = each(%stab)) {
- local(*entry) = $val;
- if (defined $entry) {
- print "\$$key = '$entry'\n";
- }
-
- if (defined @entry) {
- print "\@$key = (\n";
- foreach $num ($[ .. $#entry) {
- print " $num\t'",$entry[$num],"'\n";
- }
- print ")\n";
- }
-
- if ($key ne "${package}::" && defined %entry) {
- print "\%$key = (\n";
- foreach $key (sort keys(%entry)) {
- print " $key\t'",$entry{$key},"'\n";
- }
- print ")\n";
- }
- }
- }
-
- Note that even though the subroutine is compiled in package C<dumpvar>,
- the name of the subroutine is qualified so that its name is inserted into
- package C<main>. While popular many years ago, this is now considered
- very poor style; in general, you should be writing modules and using the
- normal export mechanism instead of hammering someone else's namespace,
- even main's.
-
- Assignment to a typeglob performs an aliasing operation, i.e.,
-
- *dick = *richard;
-
- causes variables, subroutines, and file handles accessible via the
- identifier C<richard> to also be accessible via the identifier C<dick>. If
- you want to alias only a particular variable or subroutine, you can
- assign a reference instead:
-
- *dick = \$richard;
-
- makes $richard and $dick the same variable, but leaves
- @richard and @dick as separate arrays. Tricky, eh?
-
- This mechanism may be used to pass and return cheap references
- into or from subroutines if you won't want to copy the whole
- thing.
-
- %some_hash = ();
- *some_hash = fn( \%another_hash );
- sub fn {
- local *hashsym = shift;
- # now use %hashsym normally, and you
- # will affect the caller's %another_hash
- my %nhash = (); # do what you want
- return \%nhash;
- }
-
- On return, the reference will overwrite the hash slot in the
- symbol table specified by the *some_hash typeglob. This
- is a somewhat tricky way of passing around references cheaply
- when you won't want to have to remember to dereference variables
- explicitly.
-
- Another use of symbol tables is for making "constant" scalars.
-
- *PI = \3.14159265358979;
-
- Now you cannot alter $PI, which is probably a good thing all in all.
- This isn't the same as a constant subroutine (one prototyped to
- take no arguments and to return a constant expression), which is
- subject to optimization at compile-time. This isn't. See L<perlsub>
- for details on these.
-
- You can say C<*foo{PACKAGE}> and C<*foo{NAME}> to find out what name and
- package the *foo symbol table entry comes from. This may be useful
- in a subroutine which is passed typeglobs as arguments
-
- sub identify_typeglob {
- my $glob = shift;
- print 'You gave me ', *{$glob}{PACKAGE}, '::', *{$glob}{NAME}, "\n";
- }
- identify_typeglob *foo;
- identify_typeglob *bar::baz;
-
- This prints
-
- You gave me main::foo
- You gave me bar::baz
-
- The *foo{THING} notation can also be used to obtain references to the
- individual elements of *foo, see L<perlref>.
-
- =head2 Package Constructors and Destructors
-
- There are two special subroutine definitions that function as package
- constructors and destructors. These are the C<BEGIN> and C<END>
- routines. The C<sub> is optional for these routines.
-
- A C<BEGIN> subroutine is executed as soon as possible, that is, the moment
- it is completely defined, even before the rest of the containing file
- is parsed. You may have multiple C<BEGIN> blocks within a file--they
- will execute in order of definition. Because a C<BEGIN> block executes
- immediately, it can pull in definitions of subroutines and such from other
- files in time to be visible to the rest of the file. Once a C<BEGIN>
- has run, it is immediately undefined and any code it used is returned to
- Perl's memory pool. This means you can't ever explicitly call a C<BEGIN>.
-
- An C<END> subroutine is executed as late as possible, that is, when the
- interpreter is being exited, even if it is exiting as a result of a
- die() function. (But not if it's is being blown out of the water by a
- signal--you have to trap that yourself (if you can).) You may have
- multiple C<END> blocks within a file--they will execute in reverse
- order of definition; that is: last in, first out (LIFO).
-
- Inside an C<END> subroutine C<$?> contains the value that the script is
- going to pass to C<exit()>. You can modify C<$?> to change the exit
- value of the script. Beware of changing C<$?> by accident (e.g. by
- running something via C<system>).
-
- Note that when you use the B<-n> and B<-p> switches to Perl, C<BEGIN>
- and C<END> work just as they do in B<awk>, as a degenerate case.
-
- =head2 Perl Classes
-
- There is no special class syntax in Perl, but a package may function
- as a class if it provides subroutines that function as methods. Such a
- package may also derive some of its methods from another class package
- by listing the other package name in its @ISA array.
-
- For more on this, see L<perltoot> and L<perlobj>.
-
- =head2 Perl Modules
-
- A module is just a package that is defined in a library file of
- the same name, and is designed to be reusable. It may do this by
- providing a mechanism for exporting some of its symbols into the symbol
- table of any package using it. Or it may function as a class
- definition and make its semantics available implicitly through method
- calls on the class and its objects, without explicit exportation of any
- symbols. Or it can do a little of both.
-
- For example, to start a normal module called Some::Module, create
- a file called Some/Module.pm and start with this template:
-
- package Some::Module; # assumes Some/Module.pm
-
- use strict;
-
- BEGIN {
- use Exporter ();
- use vars qw($VERSION @ISA @EXPORT @EXPORT_OK %EXPORT_TAGS);
-
- # set the version for version checking
- $VERSION = 1.00;
- # if using RCS/CVS, this may be preferred
- $VERSION = do { my @r = (q$Revision: 2.21 $ =~ /\d+/g); sprintf "%d."."%02d" x $#r, @r }; # must be all one line, for MakeMaker
-
- @ISA = qw(Exporter);
- @EXPORT = qw(&func1 &func2 &func4);
- %EXPORT_TAGS = ( ); # eg: TAG => [ qw!name1 name2! ],
-
- # your exported package globals go here,
- # as well as any optionally exported functions
- @EXPORT_OK = qw($Var1 %Hashit &func3);
- }
- use vars @EXPORT_OK;
-
- # non-exported package globals go here
- use vars qw(@more $stuff);
-
- # initalize package globals, first exported ones
- $Var1 = '';
- %Hashit = ();
-
- # then the others (which are still accessible as $Some::Module::stuff)
- $stuff = '';
- @more = ();
-
- # all file-scoped lexicals must be created before
- # the functions below that use them.
-
- # file-private lexicals go here
- my $priv_var = '';
- my %secret_hash = ();
-
- # here's a file-private function as a closure,
- # callable as &$priv_func; it cannot be prototyped.
- my $priv_func = sub {
- # stuff goes here.
- };
-
- # make all your functions, whether exported or not;
- # remember to put something interesting in the {} stubs
- sub func1 {} # no prototype
- sub func2() {} # proto'd void
- sub func3($$) {} # proto'd to 2 scalars
-
- # this one isn't exported, but could be called!
- sub func4(\%) {} # proto'd to 1 hash ref
-
- END { } # module clean-up code here (global destructor)
-
- Then go on to declare and use your variables in functions
- without any qualifications.
- See L<Exporter> and the L<perlmodlib> for details on
- mechanics and style issues in module creation.
-
- Perl modules are included into your program by saying
-
- use Module;
-
- or
-
- use Module LIST;
-
- This is exactly equivalent to
-
- BEGIN { require "Module.pm"; import Module; }
-
- or
-
- BEGIN { require "Module.pm"; import Module LIST; }
-
- As a special case
-
- use Module ();
-
- is exactly equivalent to
-
- BEGIN { require "Module.pm"; }
-
- All Perl module files have the extension F<.pm>. C<use> assumes this so
- that you don't have to spell out "F<Module.pm>" in quotes. This also
- helps to differentiate new modules from old F<.pl> and F<.ph> files.
- Module names are also capitalized unless they're functioning as pragmas,
- "Pragmas" are in effect compiler directives, and are sometimes called
- "pragmatic modules" (or even "pragmata" if you're a classicist).
-
- Because the C<use> statement implies a C<BEGIN> block, the importation
- of semantics happens at the moment the C<use> statement is compiled,
- before the rest of the file is compiled. This is how it is able
- to function as a pragma mechanism, and also how modules are able to
- declare subroutines that are then visible as list operators for
- the rest of the current file. This will not work if you use C<require>
- instead of C<use>. With require you can get into this problem:
-
- require Cwd; # make Cwd:: accessible
- $here = Cwd::getcwd();
-
- use Cwd; # import names from Cwd::
- $here = getcwd();
-
- require Cwd; # make Cwd:: accessible
- $here = getcwd(); # oops! no main::getcwd()
-
- In general C<use Module ();> is recommended over C<require Module;>.
-
- Perl packages may be nested inside other package names, so we can have
- package names containing C<::>. But if we used that package name
- directly as a filename it would makes for unwieldy or impossible
- filenames on some systems. Therefore, if a module's name is, say,
- C<Text::Soundex>, then its definition is actually found in the library
- file F<Text/Soundex.pm>.
-
- Perl modules always have a F<.pm> file, but there may also be dynamically
- linked executables or autoloaded subroutine definitions associated with
- the module. If so, these will be entirely transparent to the user of
- the module. It is the responsibility of the F<.pm> file to load (or
- arrange to autoload) any additional functionality. The POSIX module
- happens to do both dynamic loading and autoloading, but the user can
- say just C<use POSIX> to get it all.
-
- For more information on writing extension modules, see L<perlxstut>
- and L<perlguts>.
-
- =head1 SEE ALSO
-
- See L<perlmodlib> for general style issues related to building Perl
- modules and classes as well as descriptions of the standard library and
- CPAN, L<Exporter> for how Perl's standard import/export mechanism works,
- L<perltoot> for an in-depth tutorial on creating classes, L<perlobj>
- for a hard-core reference document on objects, and L<perlsub> for an
- explanation of functions and scoping.
-