home *** CD-ROM | disk | FTP | other *** search
- NAME
- perlsyn - Perl syntax
-
- DESCRIPTION
- A Perl script consists of a sequence of declarations and
- statements. The only things that need to be declared in
- Perl are report formats and subroutines. See the sections
- below for more information on those declarations. All
- uninitialized user-created objects are assumed to start
- with a null or 0 value until they are defined by some
- explicit operation such as assignment. (Though you can
- get warnings about the use of undefined values if you
- like.) The sequence of statements is executed just once,
- unlike in sed and awk scripts, where the sequence of
- statements is executed for each input line. While this
- means that you must explicitly loop over the lines of your
- input file (or files), it also means you have much more
- control over which files and which lines you look at.
- (Actually, I'm lying--it is possible to do an implicit
- loop with either the -n or -p switch. It's just not the
- mandatory default like it is in sed and awk.)
-
- Declarations
-
- Perl is, for the most part, a free-form language. (The
- only exception to this is format declarations, for obvious
- reasons.) Comments are indicated by the "#" character, and
- extend to the end of the line. If you attempt to use /*
- */ C-style comments, it will be interpreted either as
- division or pattern matching, depending on the context,
- and C++ // comments just look like a null regular
- expression, so don't do that.
-
- A declaration can be put anywhere a statement can, but has
- no effect on the execution of the primary sequence of
- statements--declarations all take effect at compile time.
- Typically all the declarations are put at the beginning or
- the end of the script. However, if you're using
- lexically-scoped private variables created with my(),
- you'll have to make sure your format or subroutine
- definition is within the same block scope as the my if you
- expect to to be able to access those private variables.
-
- Declaring a subroutine allows a subroutine name to be used
- as if it were a list operator from that point forward in
- the program. You can declare a subroutine (prototyped to
- take one scalar parameter) without defining it by saying
- just:
-
- sub myname ($);
- $me = myname $0 or die "can't get myname";
-
- Note that it functions as a list operator though, not as a
- unary operator, so be careful to use or instead of ||
- there.
-
- Subroutines declarations can also be loaded up with the
- require statement or both loaded and imported into your
- namespace with a use statement. See the perlmod manpage
- for details on this.
-
- A statement sequence may contain declarations of
- lexically-scoped variables, but apart from declaring a
- variable name, the declaration acts like an ordinary
- statement, and is elaborated within the sequence of
- statements as if it were an ordinary statement. That
- means it actually has both compile-time and run-time
- effects.
-
- Simple statements
-
- The only kind of simple statement is an expression
- evaluated for its side effects. Every simple statement
- must be terminated with a semicolon, unless it is the
- final statement in a block, in which case the semicolon is
- optional. (A semicolon is still encouraged there if the
- block takes up more than one line, since you may
- eventually add another line.) Note that there are some
- operators like eval {} and do {} that look like compound
- statements, but aren't (they're just TERMs in an
- expression), and thus need an explicit termination if used
- as the last item in a statement.
-
- Any simple statement may optionally be followed by a
- SINGLE modifier, just before the terminating semicolon (or
- block ending). The possible modifiers are:
-
- if EXPR
- unless EXPR
- while EXPR
- until EXPR
-
- The if and unless modifiers have the expected semantics,
- presuming you're a speaker of English. The while and
- until modifiers also have the usual "while loop" semantics
- (conditional evaluated first), except when applied to a
- do-BLOCK (or to the now-deprecated do-SUBROUTINE
- statement), in which case the block executes once before
- the conditional is evaluated. This is so that you can
- write loops like:
-
- do {
- $line = <STDIN>;
- ...
- } until $line eq ".\n";
-
- See the do entry in the perlfunc manpage. Note also that
- the loop control statements described later will NOT work
- in this construct, since modifiers don't take loop labels.
- Sorry. You can always wrap another block around it to do
- that sort of thing.
-
- Compound statements
-
- In Perl, a sequence of statements that defines a scope is
- called a block. Sometimes a block is delimited by the
- file containing it (in the case of a required file, or the
- program as a whole), and sometimes a block is delimited by
- the extent of a string (in the case of an eval).
-
- But generally, a block is delimited by curly brackets,
- also known as braces. We will call this syntactic
- construct a BLOCK.
-
- The following compound statements may be used to control
- flow:
-
- if (EXPR) BLOCK
- if (EXPR) BLOCK else BLOCK
- if (EXPR) BLOCK elsif (EXPR) BLOCK ... else BLOCK
- LABEL while (EXPR) BLOCK
- LABEL while (EXPR) BLOCK continue BLOCK
- LABEL for (EXPR; EXPR; EXPR) BLOCK
- LABEL foreach VAR (LIST) BLOCK
- LABEL BLOCK continue BLOCK
-
- Note that, unlike C and Pascal, these are defined in terms
- of BLOCKs, not statements. This means that the curly
- brackets are required--no dangling statements allowed. If
- you want to write conditionals without curly brackets
- there are several other ways to do it. The following all
- do the same thing:
-
- if (!open(FOO)) { die "Can't open $FOO: $!"; }
- die "Can't open $FOO: $!" unless open(FOO);
- open(FOO) or die "Can't open $FOO: $!"; # FOO or bust!
- open(FOO) ? 'hi mom' : die "Can't open $FOO: $!";
- # a bit exotic, that last one
-
- The if statement is straightforward. Since BLOCKs are
- always bounded by curly brackets, there is never any
- ambiguity about which if an else goes with. If you use
- unless in place of if, the sense of the test is reversed.
-
- The while statement executes the block as long as the
- expression is true (does not evaluate to the null string
- or 0 or "0"). The LABEL is optional, and if present,
- consists of an identifier followed by a colon. The LABEL
- identifies the loop for the loop control statements next,
- last, and redo. If the LABEL is omitted, the loop control
- statement refers to the innermost enclosing loop. This
- may include dynamically looking back your call-stack at
- run time to find the LABEL. Such desperate behavior
- triggers a warning if you use the -w flag.
-
- If there is a continue BLOCK, it is always executed just
- before the conditional is about to be evaluated again,
- just like the third part of a for loop in C. Thus it can
- be used to increment a loop variable, even when the loop
- has been continued via the next statement (which is
- similar to the C continue statement).
-
- Loop Control
-
- The next command is like the continue statement in C; it
- starts the next iteration of the loop:
-
- LINE: while (<STDIN>) {
- next LINE if /^#/; # discard comments
- ...
- }
-
- The last command is like the break statement in C (as used
- in loops); it immediately exits the loop in question. The
- continue block, if any, is not executed:
-
- LINE: while (<STDIN>) {
- last LINE if /^$/; # exit when done with header
- ...
- }
-
- The redo command restarts the loop block without
- evaluating the conditional again. The continue block, if
- any, is not executed. This command is normally used by
- programs that want to lie to themselves about what was
- just input.
-
- For example, when processing a file like /etc/termcap. If
- your input lines might end in backslashes to indicate
- continuation, you want to skip ahead and get the next
- record.
-
- while (<>) {
- chomp;
- if (s/\\$//) {
- $_ .= <>;
- redo unless eof();
- }
- # now process $_
- }
-
- which is Perl short-hand for the more explicitly written
- version:
-
-
-
- LINE: while ($line = <ARGV>) {
- chomp($line);
- if ($line =~ s/\\$//) {
- $line .= <ARGV>;
- redo LINE unless eof(); # not eof(ARGV)!
- }
- # now process $line
- }
-
- Or here's a a simpleminded Pascal comment stripper
- (warning: assumes no { or } in strings)
-
- LINE: while (<STDIN>) {
- while (s|({.*}.*){.*}|$1 |) {}
- s|{.*}| |;
- if (s|{.*| |) {
- $front = $_;
- while (<STDIN>) {
- if (/}/) { # end of comment?
- s|^|$front{|;
- redo LINE;
- }
- }
- }
- print;
- }
-
- Note that if there were a continue block on the above
- code, it would get executed even on discarded lines.
-
- If the word while is replaced by the word until, the sense
- of the test is reversed, but the conditional is still
- tested before the first iteration.
-
- In either the if or the while statement, you may replace
- "(EXPR)" with a BLOCK, and the conditional is true if the
- value of the last statement in that block is true. While
- this "feature" continues to work in version 5, it has been
- deprecated, so please change any occurrences of "if BLOCK"
- to "if (do BLOCK)".
-
- For Loops
-
- Perl's C-style for loop works exactly like the
- corresponding while loop; that means that this:
-
- for ($i = 1; $i < 10; $i++) {
- ...
- }
-
- is the same as this:
-
-
-
- $i = 1;
- while ($i < 10) {
- ...
- } continue {
- $i++;
- }
-
- Besides the normal array index looping, for can lend
- itself to many other interesting applications. Here's one
- that avoids the problem you get into if you explicitly
- test for end-of-file on an interactive file descriptor
- causing your program to appear to hang.
-
- $on_a_tty = -t STDIN && -t STDOUT;
- sub prompt { print "yes? " if $on_a_tty }
- for ( prompt(); <STDIN>; prompt() ) {
- # do something
- }
-
-
- Foreach Loops
-
- The foreach loop iterates over a normal list value and
- sets the variable VAR to be each element of the list in
- turn. The variable is implicitly local to the loop and
- regains its former value upon exiting the loop. If the
- variable was previously declared with my, it uses that
- variable instead of the global one, but it's still
- localized to the loop. This can cause problems if you
- have subroutine or format declarations within that block's
- scope.
-
- The foreach keyword is actually a synonym for the for
- keyword, so you can use foreach for readability or for for
- brevity. If VAR is omitted, $_ is set to each value. If
- LIST is an actual array (as opposed to an expression
- returning a list value), you can modify each element of
- the array by modifying VAR inside the loop. That's
- because the foreach loop index variable is an implicit
- alias for each item in the list that you're looping over.
-
- Examples:
-
- for (@ary) { s/foo/bar/ }
-
- foreach $elem (@elements) {
- $elem *= 2;
- }
-
- for $count (10,9,8,7,6,5,4,3,2,1,'BOOM') {
- print $count, "\n"; sleep(1);
- }
-
- for (1..15) { print "Merry Christmas\n"; }
- foreach $item (split(/:[\\\n:]*/, $ENV{TERMCAP})) {
- print "Item: $item\n";
- }
-
- Here's how a C programmer might code up a particular
- algorithm in Perl:
-
- for ($i = 0; $i < @ary1; $i++) {
- for ($j = 0; $j < @ary2; $j++) {
- if ($ary1[$i] > $ary2[$j]) {
- last; # can't go to outer :-(
- }
- $ary1[$i] += $ary2[$j];
- }
- # this is where that last takes me
- }
-
- Whereas here's how a Perl programmer more confortable with
- the idiom might do it:
-
- OUTER: foreach $wid (@ary1) {
- INNER: foreach $jet (@ary2) {
- next OUTER if $wid > $jet;
- $wid += $jet;
- }
- }
-
- See how much easier this is? It's cleaner, safer, and
- faster. It's cleaner because it's less noisy. It's safer
- because if code gets added between the inner and outer
- loops later on, the new code won't be accidentally
- executed, the next explicitly iterates the other loop
- rather than merely terminating the inner one. And it's
- faster because Perl executes a foreach statement more
- rapidly than it would the equivalent for loop.
-
- Basic BLOCKs and Switch Statements
-
- A BLOCK by itself (labeled or not) is semantically
- equivalent to a loop that executes once. Thus you can use
- any of the loop control statements in it to leave or
- restart the block. (Note that this is NOT true in eval{},
- sub{}, or contrary to popular belief do{} blocks, which do
- NOT count as loops.) The continue block is optional.
-
- The BLOCK construct is particularly nice for doing case
- structures.
-
- SWITCH: {
- if (/^abc/) { $abc = 1; last SWITCH; }
- if (/^def/) { $def = 1; last SWITCH; }
- if (/^xyz/) { $xyz = 1; last SWITCH; }
- $nothing = 1;
- }
- There is no official switch statement in Perl, because
- there are already several ways to write the equivalent.
- In addition to the above, you could write
-
- SWITCH: {
- $abc = 1, last SWITCH if /^abc/;
- $def = 1, last SWITCH if /^def/;
- $xyz = 1, last SWITCH if /^xyz/;
- $nothing = 1;
- }
-
- (That's actually not as strange as it looks once you
- realize that you can use loop control "operators" within
- an expression, That's just the normal C comma operator.)
-
- or
-
- SWITCH: {
- /^abc/ && do { $abc = 1; last SWITCH; };
- /^def/ && do { $def = 1; last SWITCH; };
- /^xyz/ && do { $xyz = 1; last SWITCH; };
- $nothing = 1;
- }
-
- or formatted so it stands out more as a "proper" switch
- statement:
-
- SWITCH: {
- /^abc/ && do {
- $abc = 1;
- last SWITCH;
- };
-
- /^def/ && do {
- $def = 1;
- last SWITCH;
- };
-
- /^xyz/ && do {
- $xyz = 1;
- last SWITCH;
- };
- $nothing = 1;
- }
-
- or
-
- SWITCH: {
- /^abc/ and $abc = 1, last SWITCH;
- /^def/ and $def = 1, last SWITCH;
- /^xyz/ and $xyz = 1, last SWITCH;
- $nothing = 1;
- }
-
- or even, horrors,
-
- if (/^abc/)
- { $abc = 1 }
- elsif (/^def/)
- { $def = 1 }
- elsif (/^xyz/)
- { $xyz = 1 }
- else
- { $nothing = 1 }
-
- A common idiom for a switch statement is to use foreach's
- aliasing to make a temporary assignment to $_ for
- convenient matching:
-
- SWITCH: for ($where) {
- /In Card Names/ && do { push @flags, '-e'; last; };
- /Anywhere/ && do { push @flags, '-h'; last; };
- /In Rulings/ && do { last; };
- die "unknown value for form variable where: `$where'";
- }
-
- Another interesting approach to a switch statement is
- arrange for a do block to return the proper value:
-
- $amode = do {
- if ($flag & O_RDONLY) { "r" }
- elsif ($flag & O_WRONLY) { ($flag & O_APPEND) ? "a" : "w" }
- elsif ($flag & O_RDWR) {
- if ($flag & O_CREAT) { "w+" }
- else { ($flag & O_APPEND) ? "a+" : "r+" }
- }
- };
-
-
- Goto
-
- Although not for the faint of heart, Perl does support a
- goto statement. A loop's LABEL is not actually a valid
- target for a goto; it's just the name of the loop. There
- are three forms: goto-LABEL, goto-EXPR, and goto-&NAME.
-
- The goto-LABEL form finds the statement labeled with LABEL
- and resumes execution there. It may not be used to go
- into any construct that requires initialization, such as a
- subroutine or a foreach loop. It also can't be used to go
- into a construct that is optimized away. It can be used
- to go almost anywhere else within the dynamic scope,
- including out of subroutines, but it's usually better to
- use some other construct such as last or die. The author
- of Perl has never felt the need to use this form of goto
- (in Perl, that is--C is another matter).
-
- The goto-EXPR form expects a label name, whose scope will
- be resolved dynamically. This allows for computed gotos
- per FORTRAN, but isn't necessarily recommended if you're
- optimizing for maintainability:
-
- goto ("FOO", "BAR", "GLARCH")[$i];
-
- The goto-&NAME form is highly magical, and substitutes a
- call to the named subroutine for the currently running
- subroutine. This is used by AUTOLOAD() subroutines that
- wish to load another subroutine and then pretend that the
- other subroutine had been called in the first place
- (except that any modifications to @_ in the current
- subroutine are propagated to the other subroutine.) After
- the goto, not even caller() will be able to tell that this
- routine was called first.
-
- In almost all cases like this, it's usually a far, far
- better idea to use the structured control flow mechanisms
- of next, last, or redo instead of resorting to a goto.
- For certain applications, the catch and throw pair of
- eval{} and die() for exception processing can also be a
- prudent approach.
-
- PODs: Embedded Documentation
-
- Perl has a mechanism for intermixing documentation with
- source code. While it's expecting the beginning of a new
- statement, if the compiler encounters a line that begins
- with an equal sign and a word, like this
-
- =head1 Here There Be Pods!
-
- Then that text and all remaining text up through and
- including a line beginning with =cut will be ignored. The
- format of the intervening text is described in the perlpod
- manpage.
-
- This allows you to intermix your source code and your
- documentation text freely, as in
-
- =item snazzle($)
-
- The snazzle() function will behave in the most spectacular
- form that you can possibly imagine, not even excepting
- cybernetic pyrotechnics.
-
- =cut back to the compiler, nuff of this pod stuff!
-
- sub snazzle($) {
- my $thingie = shift;
- .........
- }
-
- Note that pod translators should only look at paragraphs
- beginning with a pod diretive (it makes parsing easier),
- whereas the compiler actually knows to look for pod
- escapes even in the middle of a paragraph. This means
- that the following secret stuff will be ignored by both
- the compiler and the translators.
-
- $a=3;
- =secret stuff
- warn "Neither POD nor CODE!?"
- =cut back
- print "got $a\n";
-
- You probably shouldn't rely upon the warn() being podded
- out forever. Not all pod translators are well-behaved in
- this regard, and perhaps the compiler will become pickier.
-