home *** CD-ROM | disk | FTP | other *** search
- package PerlIO;
-
- our $VERSION = '1.02';
-
- # Map layer name to package that defines it
- our %alias;
-
- sub import
- {
- my $class = shift;
- while (@_)
- {
- my $layer = shift;
- if (exists $alias{$layer})
- {
- $layer = $alias{$layer}
- }
- else
- {
- $layer = "${class}::$layer";
- }
- eval "require $layer";
- warn $@ if $@;
- }
- }
-
- sub F_UTF8 () { 0x8000 }
-
- 1;
- __END__
-
- =head1 NAME
-
- PerlIO - On demand loader for PerlIO layers and root of PerlIO::* name space
-
- =head1 SYNOPSIS
-
- open($fh,"<:crlf", "my.txt"); # portably open a text file for reading
-
- open($fh,"<","his.jpg"); # portably open a binary file for reading
- binmode($fh);
-
- Shell:
- PERLIO=perlio perl ....
-
- =head1 DESCRIPTION
-
- When an undefined layer 'foo' is encountered in an C<open> or
- C<binmode> layer specification then C code performs the equivalent of:
-
- use PerlIO 'foo';
-
- The perl code in PerlIO.pm then attempts to locate a layer by doing
-
- require PerlIO::foo;
-
- Otherwise the C<PerlIO> package is a place holder for additional
- PerlIO related functions.
-
- The following layers are currently defined:
-
- =over 4
-
- =item unix
-
- Low level layer which calls C<read>, C<write> and C<lseek> etc.
-
- =item stdio
-
- Layer which calls C<fread>, C<fwrite> and C<fseek>/C<ftell> etc. Note
- that as this is "real" stdio it will ignore any layers beneath it and
- got straight to the operating system via the C library as usual.
-
- =item perlio
-
- This is a re-implementation of "stdio-like" buffering written as a
- PerlIO "layer". As such it will call whatever layer is below it for
- its operations.
-
- =item crlf
-
- A layer which does CRLF to "\n" translation distinguishing "text" and
- "binary" files in the manner of MS-DOS and similar operating systems.
- (It currently does I<not> mimic MS-DOS as far as treating of Control-Z
- as being an end-of-file marker.)
-
- =item utf8
-
- Declares that the stream accepts perl's internal encoding of
- characters. (Which really is UTF-8 on ASCII machines, but is
- UTF-EBCDIC on EBCDIC machines.) This allows any character perl can
- represent to be read from or written to the stream. The UTF-X encoding
- is chosen to render simple text parts (i.e. non-accented letters,
- digits and common punctuation) human readable in the encoded file.
-
- Here is how to write your native data out using UTF-8 (or UTF-EBCDIC)
- and then read it back in.
-
- open(F, ">:utf8", "data.utf");
- print F $out;
- close(F);
-
- open(F, "<:utf8", "data.utf");
- $in = <F>;
- close(F);
-
- =item bytes
-
- This is the inverse of C<:utf8> layer. It turns off the flag
- on the layer below so that data read from it is considered to
- be "octets" i.e. characters in range 0..255 only. Likewise
- on output perl will warn if a "wide" character is written
- to a such a stream.
-
- =item raw
-
- The C<:raw> layer is I<defined> as being identical to calling
- C<binmode($fh)> - the stream is made suitable for passing binary data
- i.e. each byte is passed as-is. The stream will still be
- buffered. Unlike in the earlier versions of Perl C<:raw> is I<not>
- just the inverse of C<:crlf> - other layers which would affect the
- binary nature of the stream are also removed or disabled.
-
- The implementation of C<:raw> is as a pseudo-layer which when "pushed"
- pops itself and then any layers which do not declare themselves as suitable
- for binary data. (Undoing :utf8 and :crlf are implemented by clearing
- flags rather than popping layers but that is an implementation detail.)
-
- As a consequence of the fact that C<:raw> normally pops layers
- it usually only makes sense to have it as the only or first element in
- a layer specification. When used as the first element it provides
- a known base on which to build e.g.
-
- open($fh,":raw:utf8",...)
-
- will construct a "binary" stream, but then enable UTF-8 translation.
-
- =item pop
-
- A pseudo layer that removes the top-most layer. Gives perl code
- a way to manipulate the layer stack. Should be considered
- as experimental. Note that C<:pop> only works on real layers
- and will not undo the effects of pseudo layers like C<:utf8>.
- An example of a possible use might be:
-
- open($fh,...)
- ...
- binmode($fh,":encoding(...)"); # next chunk is encoded
- ...
- binmode($fh,":pop"); # back to un-encocded
-
- A more elegant (and safer) interface is needed.
-
- =back
-
- =head2 Custom Layers
-
- It is possible to write custom layers in addition to the above builtin
- ones, both in C/XS and Perl. Two such layers (and one example written
- in Perl using the latter) come with the Perl distribution.
-
- =over 4
-
- =item :encoding
-
- Use C<:encoding(ENCODING)> either in open() or binmode() to install
- a layer that does transparently character set and encoding transformations,
- for example from Shift-JIS to Unicode. Note that under C<stdio>
- an C<:encoding> also enables C<:utf8>. See L<PerlIO::encoding>
- for more information.
-
- =item :via
-
- Use C<:via(MODULE)> either in open() or binmode() to install a layer
- that does whatever transformation (for example compression /
- decompression, encryption / decryption) to the filehandle.
- See L<PerlIO::via> for more information.
-
- =back
-
- =head2 Alternatives to raw
-
- To get a binary stream an alternate method is to use:
-
- open($fh,"whatever")
- binmode($fh);
-
- this has advantage of being backward compatible with how such things have
- had to be coded on some platforms for years.
-
- To get an un-buffered stream specify an unbuffered layer (e.g. C<:unix>)
- in the open call:
-
- open($fh,"<:unix",$path)
-
- =head2 Defaults and how to override them
-
- If the platform is MS-DOS like and normally does CRLF to "\n"
- translation for text files then the default layers are :
-
- unix crlf
-
- (The low level "unix" layer may be replaced by a platform specific low
- level layer.)
-
- Otherwise if C<Configure> found out how to do "fast" IO using system's
- stdio, then the default layers are:
-
- unix stdio
-
- Otherwise the default layers are
-
- unix perlio
-
- These defaults may change once perlio has been better tested and tuned.
-
- The default can be overridden by setting the environment variable
- PERLIO to a space separated list of layers (C<unix> or platform low
- level layer is always pushed first).
-
- This can be used to see the effect of/bugs in the various layers e.g.
-
- cd .../perl/t
- PERLIO=stdio ./perl harness
- PERLIO=perlio ./perl harness
-
- For the various value of PERLIO see L<perlrun/PERLIO>.
-
- =head2 Querying the layers of filehandles
-
- The following returns the B<names> of the PerlIO layers on a filehandle.
-
- my @layers = PerlIO::get_layers($fh); # Or FH, *FH, "FH".
-
- The layers are returned in the order an open() or binmode() call would
- use them. Note that the "default stack" depends on the operating
- system and on the Perl version, and both the compile-time and
- runtime configurations of Perl.
-
- The following table summarizes the default layers on UNIX-like and
- DOS-like platforms and depending on the setting of the C<$ENV{PERLIO}>:
-
- PERLIO UNIX-like DOS-like
-
- unset / "" unix perlio / stdio [1] unix crlf
- stdio unix perlio / stdio [1] stdio
- perlio unix perlio unix perlio
- mmap unix mmap unix mmap
-
- # [1] "stdio" if Configure found out how to do "fast stdio" (depends
- # on the stdio implementation) and in Perl 5.8, otherwise "unix perlio"
-
- By default the layers from the input side of the filehandle is
- returned, to get the output side use the optional C<output> argument:
-
- my @layers = PerlIO::get_layers($fh, output => 1);
-
- (Usually the layers are identical on either side of a filehandle but
- for example with sockets there may be differences, or if you have
- been using the C<open> pragma.)
-
- There is no set_layers(), nor does get_layers() return a tied array
- mirroring the stack, or anything fancy like that. This is not
- accidental or unintentional. The PerlIO layer stack is a bit more
- complicated than just a stack (see for example the behaviour of C<:raw>).
- You are supposed to use open() and binmode() to manipulate the stack.
-
- B<Implementation details follow, please close your eyes.>
-
- The arguments to layers are by default returned in parenthesis after
- the name of the layer, and certain layers (like C<utf8>) are not real
- layers but instead flags on real layers: to get all of these returned
- separately use the optional C<separate> argument:
-
- my @layer_and_args_and_flags = PerlIO::get_layers($fh, details => 1);
-
- The result will be up to be three times the number of layers:
- the first element will be a name, the second element the arguments
- (unspecified arguments will be C<undef>), the third element the flags,
- the fourth element a name again, and so forth.
-
- B<You may open your eyes now.>
-
- =head1 AUTHOR
-
- Nick Ing-Simmons E<lt>nick@ing-simmons.netE<gt>
-
- =head1 SEE ALSO
-
- L<perlfunc/"binmode">, L<perlfunc/"open">, L<perlunicode>, L<perliol>,
- L<Encode>
-
- =cut
-
-