- <!-- $RCSfile$$Revision$$Date$ -->
- <!-- $Log$ -->
- <HTML>
- <TITLE> PERLREF </TITLE>
- <h2>NAME</h2>
- perlref - Perl references and nested data structures
- <p><h2>DESCRIPTION</h2>
- In Perl 4 it was difficult to represent complex data structures, because
- all references had to be symbolic, and even that was difficult to do when
- you wanted to refer to a variable rather than a symbol table entry. Perl
- 5 not only makes it easier to use symbolic references to variables, but
- lets you have "hard" references to any piece of data. Any scalar may hold
- a hard reference. Since arrays and hashes contain scalars, you can now
- easily build arrays of arrays, arrays of hashes, hashes of arrays, arrays
- of hashes of functions, and so on.
- <p>Hard references are smart--they keep track of reference counts for you,
- automatically freeing the thing referred to when its reference count
- goes to zero. If that thing happens to be an object, the object is
- destructed. See
- <A HREF="perlobj.html">
- the perlobj manpage</A>
- for more about objects. (In a sense,
- everything in Perl is an object, but we usually reserve the word for
- references to objects that have been officially "blessed" into a class package.)
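- <p>As a minimal sketch of the reference counting described above, the
- following uses a hypothetical class named <B>Counted</B> (invented here, not
- part of Perl) whose DESTROY method records when an object is freed. It
- assumes nothing beyond core Perl 5:
- <p><pre>
- use strict;
- use warnings;
-
- # Counted is a hypothetical class invented for this sketch.
- package Counted;
- our $destroyed = 0;                      # bumped each time an object is freed
- sub new     { my $class = shift; return bless({}, $class) }
- sub DESTROY { $destroyed++ }
-
- package main;
-
- my $first  = Counted->new;               # one reference to the object
- my $second = $first;                     # a second reference to the same object
- undef $first;                            # one reference left, object survives
- my $after_first = $Counted::destroyed;   # still 0
- undef $second;                           # last reference gone, DESTROY runs
- my $after_second = $Counted::destroyed;  # now 1
- print "after first undef: $after_first, after second: $after_second\n";
- </pre>
- The object is destructed only when the last scalar holding a reference to
- it lets go.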
- <p>A symbolic reference contains the name of a variable, just as a
- symbolic link in the filesystem merely contains the name of a file.
- The <B>*glob</B> notation is a kind of symbolic reference. Hard references
- are more like hard links in the filesystem: merely another way
- of getting at the same underlying object, irrespective of its name.
- <p>"Hard" references are easy to use in Perl. There is just one
- overriding principle: Perl does no implicit referencing or
- dereferencing. When a scalar is holding a reference, it always behaves
- as a scalar. It doesn't magically start being an array or a hash
- unless you tell it so explicitly by dereferencing it.
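- <p>A small sketch of that principle, using made-up variable names: the
- scalar holding the reference stays a scalar, and the array only appears
- when it is dereferenced explicitly.
- <p><pre>
- use strict;
- use warnings;
-
- my @colors = ('red', 'green', 'blue');
- my $ref    = \@colors;          # $ref is a scalar holding a reference
-
- my $kind  = ref($ref);          # 'ARRAY': the kind of thing referred to
- my $count = scalar(@$ref);      # 3: the explicit dereference yields the array
- my $first = $$ref[0];           # 'red': an element fetched through $ref
- my $text  = "$ref";             # something like ARRAY(0x...), not the list
-
- print "$kind $count $first\n";  # prints "ARRAY 3 red"
- </pre>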
- <p>References can be constructed in several ways.
- <p>
- <dl>
-
- <dt><b><A NAME="perlcall_48">1.</A></b>
- <dd>
- By using the backslash operator on a variable, subroutine, or value.
- (This works much like the & (address-of) operator in C.) Note
- that this typically creates <I>ANOTHER</I> reference to a variable, since
- there's already a reference to the variable in the symbol table. But
- the symbol table reference might go away, and you'll still have the
- reference that the backslash returned. Here are some examples:
- <p></dd>
- <pre>
- $scalarref = \$foo;
- $arrayref = \@ARGV;
- $hashref = \%ENV;
- $coderef = \&handler;
- </pre>
-
- <dt><b><A NAME="perlcall_49">2.</A></b>
- <dd>
- A reference to an anonymous array can be constructed using square
- brackets:
- <p></dd>
- <pre>
- $arrayref = [1, 2, ['a', 'b', 'c']];
- </pre>
- Here we've constructed a reference to an anonymous array of three elements
- whose final element is itself a reference to another anonymous array of three
- elements. (The multidimensional syntax described later can be used to
- access this. For example, after the above, $arrayref->[2][1] would have
- the value "b".)
- <p>
- <dt><b><A NAME="perlcall_50">3.</A></b>
- <dd>
- A reference to an anonymous hash can be constructed using curly
- brackets:
- <p></dd>
- <pre>
- $hashref = {
- 'Adam' => 'Eve',
- 'Clyde' => 'Bonnie',
- };
- </pre>
- Anonymous hash and array constructors can be intermixed freely to
- produce as complicated a structure as you want. The multidimensional
- syntax described below works for these too. The values above are
- literals, but variables and expressions would work just as well, because
- assignment operators in Perl (even within local() or my()) are executable
- statements, not compile-time declarations.
- <p>Because curly brackets (braces) are used for several other things
- including BLOCKs, you may occasionally have to disambiguate braces at the
- beginning of a statement by putting a <B>+</B> or a
- <A HREF="perlfunc.html#perlfunc_207">return</A>
- in front so
- that Perl realizes the opening brace isn't starting a BLOCK. The economy and
- mnemonic value of using curlies is deemed worth this occasional extra
- hassle.
- <p>For example, if you wanted a function to make a new hash and return a
- reference to it, you have these options:
- <p><pre>
- sub hashem { { @_ } } # silently wrong
- sub hashem { +{ @_ } } # ok
- sub hashem { return { @_ } } # ok
- </pre>
-
- <dt><b><A NAME="perlcall_51">4.</A></b>
- <dd>
- A reference to an anonymous subroutine can be constructed by using
- <B>sub</B> without a subname:
- <p></dd>
- <pre>
- $coderef = sub { print "Boink!\n" };
- </pre>
- Note the presence of the semicolon. Except for the fact that the code
- inside isn't executed immediately, a <B>sub {}</B> is not so much a
- declaration as it is an operator, like <B>do{}</B> or <B>eval{}</B>. (However, no
- matter how many times you execute that line (unless you're in an
- <B>eval("...")</B>), <B>$coderef</B> will still have a reference to the <I>SAME</I>
- anonymous subroutine.)
- <p>For those who worry about these things, the current implementation
- uses shallow binding of local() variables; my() variables are not
- accessible. This precludes true closures. However, you can work
- around this with a run-time (rather than a compile-time) eval():
- <p><pre>
- {
- my $x = time;
- $coderef = eval "sub { \$x }";
- }
- </pre>
- Normally--if you'd used just <B>sub{}</B> or even <B>eval{}</B>--your new sub
- would only have been able to access the global $x. But because you've
- used a run-time eval(), this will not only generate a brand new subroutine
- reference each time it is called, it will also grant access to the my() variable
- lexically above it rather than the global one. The particular $x
- accessed will be different for each new sub you create. This mechanism
- yields deep binding of variables. (If you don't know what closures, deep
- binding, or shallow binding are, don't worry too much about it.)
- <p>
- <dt><b><A NAME="perlcall_52">5.</A></b>
- <dd>
- References are often returned by special subroutines called constructors.
- Perl objects are just references to a special kind of object that happens to know
- which package it's associated with. Constructors are just special
- subroutines that know how to create that association. They do so by
- starting with an ordinary reference, and it remains an ordinary reference
- even while it's also being an object. Constructors are customarily
- named new(), but don't have to be:
- <p></dd>
- <pre>
- $objref = new Doggie (Tail => 'short', Ears => 'long');
- </pre>
-
- <dt><b><A NAME="perlcall_54">6.</A></b>
- <dd>
- References of the appropriate type can spring into existence if you
- dereference them in a context that assumes they exist. Since we haven't
- talked about dereferencing yet, we can't show you any examples yet.
- <p></dd>
-
- </dl>
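- <p>The constructors listed above can be combined in one structure. The
- sketch below is only an illustration: the names <B>$record</B>, <B>make_pair</B>
- and the hash keys are invented for it. It fetches values back with the
- arrow and brace forms of dereferencing.
- <p><pre>
- use strict;
- use warnings;
-
- my @ports = (80, 443);
- my $name  = 'www';
-
- # +{ ... } forces an anonymous hash rather than a BLOCK.
- sub make_pair { +{ @_ } }
-
- my $record = {
-     name    => $name,                          # a variable, not a literal
-     ports   => \@ports,                        # backslash: refers to @ports itself
-     aliases => [ "${name}-a", "${name}-b" ],   # anonymous array of expressions
-     owner   => make_pair(user => 'larry'),     # anonymous hash from a function
-     greet   => sub { return "hello, $_[0]" },  # anonymous subroutine
- };
-
- push(@ports, 8080);                            # visible through $record->{ports}
- my $port_count = scalar(@{ $record->{ports} });
- my $alias      = $record->{aliases}[1];
- my $user       = $record->{owner}{user};
- my $greeting   = &{ $record->{greet} }('world');
- print "$port_count $alias $user $greeting\n";  # prints "3 www-b larry hello, world"
- </pre>
- <p>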
-
- That's it for creating references. By now you're probably dying to
- know how to use references to get back to your long-lost data. There
- are several basic methods.
- <p>
- <dl>
-
- <dt><b><A NAME="perlcall_48">1.</A></b>
- <dd>
- Anywhere you'd put an identifier as part of a variable or subroutine
- name, you can replace the identifier with a simple scalar variable
- containing a reference of the correct type:
- <p></dd>
- <pre>
- $bar = $$scalarref;
- push(@$arrayref, $filename);
- $$arrayref[0] = "January";
- $$hashref{"KEY"} = "VALUE";
- &$coderef(1,2,3);
- </pre>
- It's important to understand that we are specifically <I>NOT</I> dereferencing
- <B>$arrayref[0]</B> or <B>$hashref{"KEY"}</B> there. The dereference of the
- scalar variable happens <I>BEFORE</I> it does any key lookups. Anything more
- complicated than a simple scalar variable must use methods 2 or 3 below.
- However, a "simple scalar" includes an identifier that itself uses method
- 1 recursively. Therefore, the following prints "howdy".
- <p><pre>
- $refrefref = \\\"howdy";
- print $$$$refrefref;
- </pre>
-
- <dt><b><A NAME="perlcall_49">2.</A></b>
- <dd>
- Anywhere you'd put an identifier as part of a variable or subroutine
- name, you can replace the identifier with a BLOCK returning a reference
- of the correct type. In other words, the previous examples could be
- written like this:
- <p></dd>
- <pre>
- $bar = ${$scalarref};
- push(@{$arrayref}, $filename);
- ${$arrayref}[0] = "January";
- ${$hashref}{"KEY"} = "VALUE";
- &{$coderef}(1,2,3);
- </pre>
- Admittedly, it's a little silly to use the curlies in this case, but
- the BLOCK can contain any arbitrary expression, in particular,
- subscripted expressions:
- <p><pre>
- &{ $dispatch{$index} }(1,2,3); # call correct routine
- </pre>
- Because the curlies can be omitted in the simple case of <B>$$x</B>,
- people often make the mistake of viewing the dereferencing symbols as
- proper operators, and wonder about their precedence. If they were,
- though, you could use parens instead of braces. That's not the case.
- Consider the difference below; case 0 is a short-hand version of case 1,
- <I>NOT</I> case 2:
- <p><pre>
- $$hashref{"KEY"} = "VALUE"; # CASE 0
- ${$hashref}{"KEY"} = "VALUE"; # CASE 1
- ${$hashref{"KEY"}} = "VALUE"; # CASE 2
- ${$hashref->{"KEY"}} = "VALUE"; # CASE 3
- </pre>
- Case 2 is also deceptive in that you're accessing a variable
- called %hashref, not dereferencing through $hashref to the hash
- it's presumably referencing. That would be case 3.
- <p>
- <dt><b><A NAME="perlcall_50">3.</A></b>
- <dd>
- The case of individual array and hash elements arises often enough that it
- gets cumbersome to use method 2. As a form of syntactic sugar, the two
- corresponding lines above can be written:
- <p></dd>
- <pre>
- $arrayref->[0] = "January";
- $hashref->{"KEY"} = "VALUE";
- </pre>
- The left side of the arrow can be any expression returning a reference,
- including a previous dereference. Note that <B>$array[$x]</B> is <I>NOT</I> the
- same thing as <B>$array->[$x]</B> here:
- <p><pre>
- $array[$x]->{"foo"}->[0] = "January";
- </pre>
- This is one of the cases we mentioned earlier in which references could
- spring into existence when in an lvalue context. Before this
- statement, <B>$array[$x]</B> may have been undefined. If so, it's
- automatically defined with a hash reference so that we can look up
- <B>{"foo"}</B> in it. Likewise <B>$array[$x]->{"foo"}</B> will automatically get
- defined with an array reference so that we can look up <B>[0]</B> in it.
- <p>One more thing here. The arrow is optional <I>BETWEEN</I> bracket
- subscripts, so you can shrink the above down to
- <p><pre>
- $array[$x]{"foo"}[0] = "January";
- </pre>
- Which, in the degenerate case of using only ordinary arrays, gives you
- multidimensional arrays just like C's:
- <p><pre>
- $score[$x][$y][$z] += 42;
- </pre>
- Well, okay, not entirely like C's arrays, actually. C doesn't know how
- to grow its arrays on demand. Perl does.
- <p>
- <dt><b><A NAME="perlcall_51">4.</A></b>
- <dd>
- If a reference happens to be a reference to an object, then there are
- probably methods to access the things referred to, and you should probably
- stick to those methods unless you're in the class package that defines the
- object's methods. In other words, be nice, and don't violate the object's
- encapsulation without a very good reason. Perl does not enforce
- encapsulation. We are not totalitarians here. We do expect some basic
- civility though.
- <p></dd>
-
- </dl>
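- <p>The earlier list of ways to construct references promised an example of
- references springing into existence. The following sketch, with made-up
- variable names, starts from empty arrays and lets Perl create the
- intermediate hash and array references on demand:
- <p><pre>
- use strict;
- use warnings;
-
- my @array;                                 # empty: no elements, no references
- $array[2]{"foo"}[0] = "January";           # arrows between subscripts omitted
-
- my $slot_kind  = ref($array[2]);           # 'HASH', created on demand
- my $inner_kind = ref($array[2]{"foo"});    # 'ARRAY', created on demand
- my $length     = scalar(@array);           # 3: the array grew to hold index 2
-
- my @score;
- $score[1][2][3] += 42;                     # grows three levels as needed
- my $same = ${ $score[1] }[2]->[3];         # block form and arrow form mixed
-
- print "$slot_kind $inner_kind $length $same\n";   # prints "HASH ARRAY 3 42"
- </pre>
- <p>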
-
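- The <B>&{ $dispatch{$index} }</B> form from method 2 is commonly used for
- dispatch tables. Here is a minimal sketch, assuming a hypothetical table
- <B>%dispatch</B> and helper subroutines invented for the example:
- <p><pre>
- use strict;
- use warnings;
-
- sub add      { return $_[0] + $_[1] }
- sub multiply { return $_[0] * $_[1] }
-
- # Hypothetical dispatch table mapping command names to code references.
- my %dispatch = (
-     add      => \&add,
-     multiply => \&multiply,
-     negate   => sub { return -$_[0] },
- );
-
- my $sum     = &{ $dispatch{add} }(2, 3);   # method 2: BLOCK returning a coderef
- my $code    = $dispatch{multiply};
- my $product = &$code(2, 3);                # method 1: simple scalar variable
- my $negated = &{ $dispatch{negate} }(7);
- print "$sum $product $negated\n";          # prints "5 6 -7"
- </pre>
- <p>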
- The ref() operator may be used to determine what type of thing the
- reference is pointing to. See
- <A HREF="perlfunc.html">
- the perlfunc manpage</A>
- .
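- <p>As a quick illustration of what ref() reports for the kinds of
- references constructed earlier (a sketch, not an exhaustive list):
- <p><pre>
- use strict;
- use warnings;
-
- my @kinds = (
-     ref(\"text"),        # 'SCALAR'
-     ref([1, 2]),         # 'ARRAY'
-     ref({ a => 1 }),     # 'HASH'
-     ref(sub { 1 }),      # 'CODE'
-     ref(\\"text"),       # 'REF': a reference to a reference
-     ref("plain"),        # '': an ordinary string is not a reference
- );
- print join(",", @kinds), "\n";   # prints "SCALAR,ARRAY,HASH,CODE,REF,"
- </pre>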
- <p>The bless() operator may be used to associate a reference with a package
- functioning as an object class. See
- <A HREF="perlobj.html">
- the perlobj manpage</A>
- .
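- <p>A minimal sketch of a constructor built on bless(). The package name
- <B>Doggie</B> comes from the earlier constructor call, but its body and the
- <B>tail</B> method are assumptions made up for this example:
- <p><pre>
- use strict;
- use warnings;
-
- # The body of this class is an assumption for illustration only.
- package Doggie;
- sub new {
-     my $class = shift;
-     my $self  = { @_ };            # start with an ordinary hash reference
-     return bless($self, $class);   # associate it with the package
- }
- sub tail { my $self = shift; return $self->{Tail} }
-
- package main;
-
- my $objref = Doggie->new(Tail => 'short', Ears => 'long');
- my $class  = ref($objref);         # 'Doggie': ref() reports the package
- my $tail   = $objref->tail;        # the civil way: go through a method
- my $ears   = $objref->{Ears};      # still possible: it remains a hash reference
- print "$class $tail $ears\n";      # prints "Doggie short long"
- </pre>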
- <p>A type glob may be dereferenced the same way a reference can, since
- the dereference syntax always indicates the kind of reference desired.
- So <B>${*foo}</B> and <B>${\$foo}</B> both indicate the same scalar variable.
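- <p>For instance, assuming package variables <B>$foo</B> and <B>@foo</B> declared
- just for this sketch, the glob and a hard reference reach the same scalar:
- <p><pre>
- use strict;
- use warnings;
-
- our $foo = "package scalar";     # a package variable, so it lives in *foo
- our @foo = (1, 2, 3);
-
- my $via_glob = ${*foo};          # scalar slot of the glob
- my $via_ref  = ${\$foo};         # same scalar through a hard reference
- my @list     = @{*foo};          # array slot of the same glob
- ${*foo} = "changed";             # assigning through the glob changes $foo
- print "$via_glob | $via_ref | @list | $foo\n";
- </pre>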
- <p>Here's a trick for interpolating a subroutine call into a string:
- <p><pre>
- print "My sub returned ${\mysub(1,2,3)}\n";
- </pre>
- The way it works is that when the <B>${...}</B> is seen in the double-quoted
- string, it's evaluated as a block. The block executes the call to
- <B>mysub(1,2,3)</B>, and then takes a reference to that. So the whole block
- returns a reference to a scalar, which is then dereferenced by <B>${...}</B>
- and stuck into the double-quoted string.
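- <p>Here is the trick in runnable form, with a stand-in <B>mysub</B> that
- simply adds its arguments (an assumption for the example). The second
- string uses the same idea with an anonymous array, an idiom not covered
- above, to interpolate a whole list:
- <p><pre>
- use strict;
- use warnings;
-
- # Stand-in for the mysub() of the text: returns the sum of its arguments.
- sub mysub {
-     my $total = 0;
-     foreach my $n (@_) { $total += $n }
-     return $total;
- }
-
- my $line    = "My sub returned ${\mysub(1,2,3)}";
- my $squares = "Squares: @{[ map { $_ * $_ } 1..3 ]}";
- print "$line\n$squares\n";
- </pre>
- The first string becomes "My sub returned 6" and the second
- "Squares: 1 4 9".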
- <p><h3>Symbolic references</h3>
- We said that references spring into existence as necessary if they are
- undefined, but we didn't say what happens if a value used as a
- reference is already defined, but <I>ISN'T</I> a hard reference. If you
- use it as a reference in this case, it'll be treated as a symbolic
- reference. That is, the value of the scalar is taken to be the
- <I>NAME</I>
- of a variable, rather than a direct link to a (possibly) anonymous
- value.
- <p>People frequently expect it to work like this. So it does.
- <p><pre>
- $name = "foo";
- $$name = 1; # Sets $foo
- ${$name} = 2; # Sets $foo
- ${$name x 2} = 3; # Sets $foofoo
- $name->[0] = 4; # Sets $foo[0]
- @$name = (); # Clears @foo
- &$name(); # Calls &foo() (as in Perl 4)
- $pack = "THAT";
- ${"${pack}::$name"} = 5; # Sets $THAT::foo without eval
- </pre>
- This is very powerful, and slightly dangerous, in that it's possible
- to intend (with the utmost sincerity) to use a hard reference, and
- accidentally use a symbolic reference instead. To protect against
- that, you can say
- <p><pre>
- use strict 'refs';
- </pre>
- and then only hard references will be allowed for the rest of the enclosing
- block. An inner block may countermand that with
- <p><pre>
- no strict 'refs';
- </pre>
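- <p>A short sketch of both pragmas at work, using made-up variables. The
- eval{} is there only to catch the run-time error so the script can carry on:
- <p><pre>
- use strict;
- use warnings;
-
- our $foo = "original";
- my $name = "foo";
-
- # Under strict refs, using a string as a reference is a run-time error.
- my $ok    = eval { my $value = $$name; 1 };
- my $error = $ok ? "" : $@;     # mentions "strict refs"
-
- {
-     no strict 'refs';          # countermand for this block only
-     $$name = "set symbolically";
- }
- print "$foo\n";                # prints "set symbolically"
- </pre>
- <p>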
- Only package variables are visible to symbolic references. Lexical
- variables (declared with my()) aren't in a symbol table, and thus are
- invisible to this mechanism. For example:
- <p><pre>
- local($value) = 10;
- $ref = "value"; # a symbolic reference: just the name
- {
- my $value = 20;
- print $$ref;
- }
- </pre>
- This will still print 10, not 20. Remember that local() affects package
- variables, which are all "global" to the package.
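- <p>The same point as a self-contained sketch that runs under
- <B>use strict</B>. The variable names are invented, and the symbolic
- reference is a plain string holding the name "value":
- <p><pre>
- use strict;
- use warnings;
- no strict 'refs';              # symbolic references are the point here
-
- our $value = 10;               # package variable: in the symbol table
- my $name = "value";            # a symbolic reference, just a name
- my $seen;
- {
-     my $value = 20;            # lexical: not in any symbol table
-     $seen = $$name;            # finds the package $value, not the lexical
- }
- print "$seen\n";               # prints 10
- </pre>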
- <p><h3>Further Reading</h3>
- Besides the obvious documents, source code can be instructive.
- Some rather pathological examples of the use of references can be found
- in the <I>t/op/ref.t</I> regression test in the Perl source directory.<p>