- =head1 NAME
-
- YAML - YAML Ain't Markup Language (tm)
-
- =head1 SYNOPSIS
-
- use YAML;
-
- # Load a YAML stream of 3 YAML documents into Perl data structures.
- my ($hashref, $arrayref, $string) = Load(<<'...');
- ---
- name: ingy
- age: old
- weight: heavy
- # I should comment that I also like pink, but don't tell anybody.
- favorite colors:
-     - red
-     - white
-     - blue
- ---
- - Clark Evans
- - Oren Ben-Kiki
- - Brian Ingerson
- --- >
- You probably think YAML stands for "Yet Another Markup Language". It
- ain't! YAML is really a data serialization language. But if you want
- to think of it as a markup, that's OK with me. A lot of people try
- to use XML as a serialization format.
-
- "YAML" is catchy and fun to say. Try it. "YAML, YAML, YAML!!!"
- ...
-
- # Dump the Perl data structures back into YAML.
- print Dump($string, $arrayref, $hashref);
-
- # YAML::Dump is used the same way you'd use Data::Dumper::Dumper
- use Data::Dumper;
- print Dumper($string, $arrayref, $hashref);
-
- =head1 DESCRIPTION
-
- The YAML.pm module implements a YAML Loader and Dumper based on the YAML
- 1.0 specification. L<http://www.yaml.org/spec/>
-
- YAML is a generic data serialization language that is optimized for
- human readability. It can be used to express the data structures of most
- modern programming languages. (Including Perl!!!)
-
- For information on the YAML syntax, please refer to the YAML
- specification.
-
- =head1 WHY YAML IS COOL
-
- =over 4
-
- =item YAML is readable for people.
-
- It makes clear sense out of complex data structures. You should find
- that YAML is an exceptional data dumping tool. Structure is shown
- through indentation, YAML supports recursive data, and hash keys are
- sorted by default. In addition, YAML supports several styles of scalar
- formatting for different types of data.
-
- =item YAML is editable.
-
- YAML was designed from the ground up to be an excellent syntax for
- configuration files. Almost all programs need configuration files, so
- why invent a new syntax for each one? And why subject users to the
- complexities of XML or native Perl code?
-
- =item YAML is multilingual.
-
- Yes, YAML supports Unicode. But I'm actually referring to programming
- languages. YAML was designed to meet the serialization needs of Perl,
- Python, Ruby, Tcl, PHP and Java. It was also designed to be
- interoperable between those languages. That means any YAML serialization
- produced by Perl can be processed by Python, and be guaranteed to return
- the data structure intact, even if it contained Perl-specific
- structures like GLOBs.
-
- =item YAML is taint safe.
-
- Using modules like Data::Dumper for serialization is fine as long as you
- can be sure that nobody can tamper with your data files or
- transmissions. That's because you need to use Perl's C<eval()> built-in
- to deserialize the data. Somebody could add a snippet of Perl to erase
- your files.
-
- YAML's parser does not need to eval anything.
-
- =item YAML is full featured.
-
- YAML can accurately serialize all of the common Perl data structures and
- deserialize them again without losing data relationships. Although it is
- not 100% perfect (no serializer is or can be perfect), it fares as well
- as the popular current modules: Data::Dumper, Storable, XML::Dumper and
- Data::Denter.
-
- YAML.pm also has the ability to handle code (subroutine) references and
- typeglobs. (Still experimental) These features are not found in Perl's
- other serialization modules.
-
- =item YAML is extensible.
-
- The YAML language has been designed to be flexible enough to solve its
- own problems. The markup itself has three basic constructs, which resemble
- Perl's hash, array and scalar. By default, these map to their Perl
- equivalents. But each YAML node also supports a type (or "transfer
- method") which can cause that node to be interpreted in a completely
- different manner. That's how YAML can support oddball structures like
- Perl's typeglob.
-
- =item YAML.pm plays well with others.
-
- YAML has been designed to interact well with other Perl Modules like POE
- and Time::Object. (date support coming soon)
-
- =back
-
- =head1 USAGE
-
- =head2 Exported Functions
-
- The following functions are exported by YAML.pm by default when you use
- YAML.pm like this:
-
- use YAML;
-
- To prevent YAML.pm from exporting functions, say:
-
- use YAML ();
-
- =over 4
-
- =item Dump(list-of-Perl-data-structures)
-
- Turn Perl data into YAML. This function works very much like
- Data::Dumper::Dumper(). It takes a list of Perl data structures and
- dumps them into a serialized form. It returns a string containing the
- YAML stream. The structures can be references or plain scalars.
-
- =item Load(string-containing-a-YAML-stream)
-
- Turn YAML into Perl data. This is the opposite of Dump, just like
- Storable's thaw() function or the eval() function in relation to
- Data::Dumper. It parses a string containing a valid YAML stream into a
- list of Perl data structures.
-
- =item Store()
-
- This function is deprecated, and is now referred to as Dump. It is still available
- for the time being, but will generate a warning if you are using -w. You
- B<are> using -w, aren't you? :)
-
- The reason for this deprecation is that the YAML spec talks about programs
- called Loaders and Dumpers. "Storers" is too hard to say, I guess...
-
- =back
-
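- As a rough sketch of how Dump() and Load() mirror each other, the
- following round-trips three values through a single YAML stream. The
- variable names and sample data are illustrative assumptions, and the
- sketch assumes an installed YAML.pm that exports Dump and Load as
- described above:
-
```perl
use strict;
use warnings;
use YAML;    # exports Dump and Load by default

# Three values: a hash reference, an array reference and a plain scalar.
my @input = (
    { name => 'ingy', age => 'old' },
    [ 'red', 'white', 'blue' ],
    'a plain string',
);

# Dump() returns one string holding a YAML stream of three documents.
my $stream = Dump(@input);
print $stream;

# Load() parses that stream back into a list of Perl data structures.
my @output = Load($stream);
printf "loaded %d documents\n", scalar @output;
```
-
- Load() is called in list context here so that every document in the
- stream is returned.
-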
- =head2 Exportable Functions
-
- =over 4
-
- =item DumpFile(filepath, list)
-
- Writes the YAML stream to a file instead of just returning a string.
-
- =item LoadFile(filepath)
-
- Reads the YAML stream from a file instead of a string.
-
- =item Bless(perl-node, [yaml-node | class-name])
-
- Associate a normal Perl node with a yaml node. A yaml node is an object
- tied to the YAML::Node class. The second argument is either a yaml node
- that you've already created or a class (package) name that supports a
- yaml_dump() function. A yaml_dump() function should take a perl node and
- return a yaml node. If no second argument is provided, Bless will create a
- yaml node. This node is not returned, but can be retrieved with the Blessed()
- function.
-
- Here's an example of how to use Bless. Say you have a hash containing three
- keys, but you only want to dump two of them. Furthermore the keys must be
- dumped in a certain order. Here's how you do that:
-
- use YAML qw(Dump Bless);
- $hash = {apple => 'good', banana => 'bad', cauliflower => 'ugly'};
- print Dump $hash;
- Bless($hash)->keys(['banana', 'apple']);
- print Dump $hash;
-
- produces:
-
- --- #YAML:1.0
- apple: good
- banana: bad
- cauliflower: ugly
- --- #YAML:1.0
- banana: bad
- apple: good
-
- Bless returns the tied part of a yaml-node, so that you can call the
- YAML::Node methods. This is the same thing that YAML::Node::ynode()
- returns. So another way to do the above example is:
-
- use YAML qw(:all);
- use YAML::Node;
- $hash = {apple => 'good', banana => 'bad', cauliflower => 'ugly'};
- print Dump $hash;
- Bless($hash);
- $ynode = ynode(Blessed($hash));
- $ynode->keys(['banana', 'apple']);
- print Dump $hash;
-
- =item Blessed(perl-node)
-
- Returns the yaml node that a particular perl node is associated with
- (see above). Returns undef if the node is not (YAML) blessed.
-
- =item Dumper()
-
- Alias to Dump(). For Data::Dumper fans.
-
- =item freeze() and thaw()
-
- Aliases to Dump() and Load(). For Storable fans.
-
- This will also allow YAML.pm to be plugged directly into modules like POE.pm,
- that use the freeze/thaw API for internal serialization.
-
- =back
-
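- A minimal sketch of DumpFile() and LoadFile(), assuming both are imported
- by name as shown under "Exportable Functions". The temporary file and the
- $config data are hypothetical and exist only for illustration:
-
```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use YAML qw(DumpFile LoadFile);

# Hypothetical configuration data to write out.
my $config = { host => 'localhost', ports => [ 8080, 8081 ] };

# Create a scratch file that is removed when the program exits.
my ($fh, $path) = tempfile(SUFFIX => '.yml', UNLINK => 1);
close $fh;

# Write the YAML stream to the file instead of returning a string.
DumpFile($path, $config);

# Read the stream back; list context returns every document in the file.
my ($loaded) = LoadFile($path);
print "host: $loaded->{host}\n";
```
-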
- =head2 Exportable Function Groups
-
- This is a list of the various groups of exported functions that you can import
- using the following syntax:
-
- use YAML ':groupname';
-
- =over 4
-
- =item all
-
- Imports Dump(), Load(), Store(), DumpFile(), LoadFile(), Bless() and Blessed().
-
- =item POE
-
- Imports freeze() and thaw().
-
- =item Storable
-
- Imports freeze() and thaw().
-
- =back
-
- =head2 Class Methods
-
- YAML can also be used in an object oriented manner. At this point it
- offers no real advantage. This interface will be improved in a later
- release.
-
- =over 4
-
- =item new()
-
- New returns a new YAML object. For example:
-
- my $y = YAML->new;
- $y->Indent(4);
- $y->dump($foo, $bar);
-
- =back
-
- =head2 Object Methods
-
- =over 4
-
- =item dump()
-
- OO version of Dump().
-
- =item load()
-
- OO version of Load().
-
- =back
-
- =head2 Options
-
- YAML options are set using a group of global variables in the YAML
- namespace. This is similar to how Data::Dumper works.
-
- For example, to change the indentation width, do something like:
-
- local $YAML::Indent = 3;
-
- The current options are:
-
- =over 4
-
- =item Indent
-
- This is the number of space characters to use for each indentation level
- when doing a Dump(). The default is 2.
-
- By the way, YAML can use any number of characters for indentation at any
- level. So if you are editing YAML by hand, feel free to do it any way that
- looks pleasing to you; just be consistent for a given level.
-
- =item UseHeader
-
- Default is 1. (true)
-
- This tells YAML.pm whether to use a separator string for a Dump
- operation. This only applies to the first document in a stream.
- Subsequent documents must have a YAML header by definition.
-
- =item UseVersion
-
- Default is 1. (true)
-
- Tells YAML.pm whether to include the YAML version on the
- separator/header.
-
- The canonical form is:
-
- --- YAML:1.0
-
- =item SortKeys
-
- Default is 1. (true)
-
- Tells YAML.pm whether or not to sort hash keys when storing a document.
-
- YAML::Node objects can have their own sort order, which is usually what
- you want. To override the YAML::Node order and sort the keys anyway, set
- SortKeys to 2.
-
- =item AnchorPrefix
-
- Default is ''.
-
- Anchor names are normally numeric. YAML.pm simply starts with '1' and
- increases by one for each new anchor. This option allows you to specify a
- string to be prepended to each anchor number.
-
- =item UseCode
-
- Setting the UseCode option is a shortcut to set both the DumpCode and
- LoadCode options at once. Setting UseCode to '1' tells YAML.pm to dump
- Perl code references as Perl (using B::Deparse) and to load them back
- into memory using eval(). The reason this has to be an option is that
- using eval() to parse untrusted code is, well, untrustworthy. Safe
- deserialization is one of the core goals of YAML.
-
- =item DumpCode
-
- Determines if and how YAML.pm should serialize Perl code references. By
- default YAML.pm will dump code references as dummy placeholders (much
- like Data::Dumper). If DumpCode is set to '1' or 'deparse', code
- references will be dumped as actual Perl code.
-
- DumpCode can also be set to a subroutine reference so that you can
- write your own serializing routine. YAML.pm passes you the code ref. You
- pass back the serialization (as a string) and a format indicator. The
- format indicator is a simple string like: 'deparse' or 'bytecode'.
-
- =item LoadCode
-
- LoadCode is the opposite of DumpCode. It tells YAML if and how to
- deserialize code references. When set to '1' or 'deparse' it will use
- C<eval()>. Since this is potentially risky, only use this option if you
- know where your YAML has been.
-
- LoadCode can also be set to a subroutine reference so that you can write
- your own deserializing routine. YAML.pm passes the serialization (as a
- string) and a format indicator. You pass back the code reference.
-
- =item UseBlock
-
- YAML.pm uses heuristics to guess which scalar style is best for a given
- node. Sometimes you'll want all multiline scalars to use the 'block'
- style. If so, set this option to 1.
-
- NOTE: YAML's block style is akin to Perl's here-document.
-
- =item ForceBlock
-
- Force every possible scalar to be block formatted. NOTE: Escape characters
- cannot be formatted in a block scalar.
-
- =item UseFold
-
- If you want to force YAML to use the 'folded' style for all multiline
- scalars, then set $UseFold to 1.
-
- NOTE: YAML's folded style is akin to the way HTML folds text,
- except smarter.
-
- =item UseAliases
-
- YAML has an alias mechanism such that any given structure in memory gets
- serialized once. Any other references to that structure are serialized
- only as alias markers. This is how YAML can serialize duplicate and
- recursive structures.
-
- Sometimes, when you KNOW that your data is nonrecursive in nature, you
- may want to serialize such that every node is expressed in full (i.e., as
- a copy of the original). Setting $YAML::UseAliases to 0 will allow you
- to do this. This also may result in faster processing because the lookup
- overhead is bypassed.
-
- THIS OPTION CAN BE DANGEROUS. *If* your data is recursive, this option
- *will* cause Dump() to run in an endless loop, chewing up your computer's
- memory. You have been warned.
-
- =item CompressSeries
-
- Default is 1.
-
- Compresses the formatting of arrays of hashes:
-
- -
-   foo: bar
- -
-   bar: foo
-
- becomes:
-
- - foo: bar
- - bar: foo
-
- Since this output is usually more desirable, this option is turned on by
- default.
-
- =back
-
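- Because the options are ordinary package variables, Perl's local can
- confine a change to one block. The sketch below, which uses made-up sample
- data, compares the default indentation with $YAML::Indent set to 4, and
- then checks the default alias behaviour described under UseAliases. It
- assumes the defaults listed above are in effect:
-
```perl
use strict;
use warnings;
use YAML;

my $data = { outer => { inner => 'value' } };

# With no options set, nested levels are indented by two spaces.
my $default = Dump($data);

# local confines the wider indentation to this block; the previous value
# of $YAML::Indent is restored on exit.
my $wide = do {
    local $YAML::Indent = 4;
    Dump($data);
};

print $default, $wide;

# Aliases are on by default: a structure referenced twice is dumped once
# and loaded back as one shared structure.
my $shared = { city => 'Victoria' };
my ($copy) = Load(Dump({ home => $shared, work => $shared }));
print "still shared\n" if $copy->{home} == $copy->{work};
```
-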
- =head1 YAML TERMINOLOGY
-
- YAML is a full featured data serialization language, and thus has its
- own terminology.
-
- It is important to remember that although YAML is heavily influenced by
- Perl and Python, it is a language in its own right, not merely a
- representation of Perl structures.
-
- YAML has three constructs that are conspicuously similar to Perl's hash,
- array, and scalar. They are called mapping, sequence, and string
- respectively. By default, they do what you would expect. But each
- instance may have an explicit or implicit type that makes it behave
- differently. In this manner, YAML can be extended to represent Perl's
- Glob, Python's tuple, or Ruby's Bignum.
-
- =over 4
-
- =item stream
-
- A YAML stream is the full sequence of bytes that a YAML parser would
- read or a YAML emitter would write. A stream may contain one or more YAML
- documents separated by YAML headers.
-
- ---
- a: mapping
- foo: bar
- ---
- - a
- - sequence
-
- =item document
-
- A YAML document is an independent data structure representation within a
- stream. It is a top level node.
-
- --- YAML:1.0
- This: top level mapping
- is:
-     - a
-     - YAML
-     - document
-
- =item node
-
- A YAML node is the representation of a particular data structure. Nodes
- may contain other nodes. (In Perl terms, nodes are like scalars:
- strings, arrayrefs and hashrefs. But this refers to the serialized
- format, not the in-memory structure.)
-
- =item transfer method
-
- This is similar to a type. It indicates how a particular YAML node
- serialization should be transferred into or out of memory. For instance
- a Foo::Bar object would use the transfer 'perl/Foo::Bar':
-
- - !perl/Foo::Bar
-     foo: 42
-     bar: stool
-
- =item collection
-
- A collection is the generic term for a YAML data grouping. YAML has two
- types of collections: mappings and sequences. (Similar to hashes and arrays)
-
- =item mapping
-
- A mapping is a YAML collection defined by key/value pairs. By default YAML
- mappings are loaded into Perl hashes.
-
- a mapping:
-     foo: bar
-     two: times two is 4
-
- =item sequence
-
- A sequence is a YAML collection defined by an ordered list of elements. By
- default YAML sequences are loaded into Perl arrays.
-
- a sequence:
-     - one bourbon
-     - one scotch
-     - one beer
-
- =item scalar
-
- A scalar is a YAML node that is a single value. By default YAML scalars
- are loaded into Perl scalars.
-
- a scalar key: a scalar value
-
- YAML has many styles for representing scalars. This is important because
- varying data will have varying formatting requirements to retain the
- optimum human readability.
-
- =item simple scalar
-
- This is a single line of unquoted text. All simple scalars are automatic
- candidates for "implicit transferring". This means that their B<type> is
- determined automatically by examination. Unless they match a set of
- predetermined YAML regex patterns, they will raise a parser exception.
- The typical uses for this are simple alpha strings, integers, real
- numbers, dates, times and currency.
-
- - a simple string
- - -42
- - 3.1415
- - 12:34
- - 123 this is an error
-
- =item single quoted scalar
-
- This is similar to Perl's use of single quotes. It means no escaping and
- no implicit transfer. It must be used on a single line.
-
- - 'When I say ''\n'' I mean "backslash en"'
-
- =item double quoted scalar
-
- This is similar to Perl's use of double quotes. Character escaping can
- be used. There is no implicit transfer and it must still be single line.
-
- - "This scalar\nhas two lines, and a bell -->\a"
-
- =item folded scalar
-
- This is a multiline scalar which begins on the next line. It is
- indicated by a single greater-than sign (>). It is unescaped like the
- single quoted scalar. Line folding is also performed.
-
- - >
-     This is a multiline scalar which begins on
-     the next line. It is indicated by a single
-     greater-than sign. It is unescaped like the
-     single quoted scalar. Line folding is also
-     performed.
-
- =item block scalar
-
- This final multiline form is akin to Perl's here-document except that
- (as in all YAML data) scope is indicated by indentation. Therefore, no
- ending marker is required. The data is verbatim. No line folding.
-
- - |
-     QTY  DESC          PRICE   TOTAL
-     ---  ----          -----   -----
-       1  Foo Fighters  $19.95  $19.95
-       2  Bar Belles    $29.95  $59.90
-
- =item parser
-
- A YAML processor has four stages: parse, load, dump, emit.
-
- A parser parses a YAML stream. YAML.pm's Load() function contains a
- parser.
-
- =item loader
-
- The other half of the Load() function is a loader. This takes the
- information from the parser and loads it into a Perl data structure.
-
- =item dumper
-
- The Dump() function consists of a dumper and an emitter. The dumper
- walks through each Perl data structure and gives info to the emitter.
-
- =item emitter
-
- The emitter takes info from the dumper and turns it into a YAML stream.
-
- NOTE:
- In YAML.pm the parser/loader and the dumper/emitter code are currently
- very closely tied together. When libyaml is written (in C) there will be
- a definite separation. libyaml will contain a parser and emitter, and
- YAML.pm (and YAML.py etc) will supply the loader and dumper.
-
- =back
-
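- To tie the terms above back to Perl, here is a small sketch that loads a
- stream of three documents (a mapping, a sequence and a scalar) and
- inspects what each one becomes. The variable names are illustrative
- assumptions, and the sketch relies on the default loading behaviour
- described in this section:
-
```perl
use strict;
use warnings;
use YAML;

# A stream of three documents, each introduced by a '---' header.
# The final '...' line only ends the here-document.
my ($mapping, $sequence, $scalar) = Load(<<'...');
---
a: mapping
foo: bar
---
- a
- sequence
--- just a string
...

# A mapping loads as a hash reference, a sequence as an array reference,
# and a scalar as a plain Perl scalar.
print ref($mapping),  "\n";
print ref($sequence), "\n";
print "$scalar\n";
```
-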
- For more information please refer to the immensely helpful YAML
- specification available at L<http://www.yaml.org/spec/>.
-
- =head1 ysh - The YAML Shell
-
- The YAML distribution ships with a script called 'ysh', the YAML shell.
- ysh provides a simple, interactive way to play with YAML. If you type in
- Perl code, it displays the result in YAML. If you type in YAML it turns
- it into Perl code.
-
- To run ysh, (assuming you installed it along with YAML.pm) simply type:
-
- ysh [options]
-
- Please read L<ysh> for the full details. There are lots of options.
-
- =head1 BUGS & DEFICIENCIES
-
- If you find a bug in YAML, please try to recreate it in the YAML Shell
- with logging turned on ('ysh -L'). When you have successfully reproduced
- the bug, please mail the LOG file to the author (ingy@cpan.org).
-
- WARNING: This is *ALPHA* code.
-
- BIGGER WARNING: This is *TRIAL1* of the YAML 1.0 specification. The YAML
- syntax may change before it is finalized. Based on past experience, it
- probably will change. The authors of this spec have worked for over a
- year putting together YAML 1.0, and we have flipped it on its
- syntactical head almost every week. We're a fickle lot, we are. So use
- this at your own risk!!!
-
- =over 4
-
- =item Circular Leaves
-
- YAML is quite capable of serializing circular references. And for the
- most part it can deserialize them correctly too. One notable exception
- is a reference to a leaf node containing itself. This is hard to do from
- pure Perl in any elegant way. The "canonical" example is:
-
- $foo = \$foo;
-
- This serializes fine, but I can't parse it correctly yet. Unfortunately,
- every wiseguy programmer in the world seems to try this first when you
- ask them to test your serialization module, even though it is of almost
- no real-world value. So please don't report this bug unless you have a
- pure Perl patch to fix it for me.
-
- By the way, similar non-leaf structures Dump and Load just fine:
-
- $foo->[0] = $foo;
-
- You can test these examples using 'ysh -r'. This option makes sure that
- the example can be deserialized after it is serialized. We call that
- "roundtripping", thus the '-r'.
-
- =item Unicode
-
- Unicode is not yet supported. The YAML specification dictates that all
- strings be Unicode, but this early implementation just uses ASCII.
-
- =item Structured Keys
-
- Python, Java and perhaps others support using any data type as the
- key to a hash. YAML also supports this. Perl5 only uses strings as
- hash keys.
-
- YAML.pm can currently parse structured keys, but their meaning gets lost
- when they are loaded into a Perl hash. Consider this example using the
- YAML Shell:
-
- ysh > ---
- yaml> ?
- yaml>  foo: bar
- yaml> : baz
- yaml> ...
- $VAR1 = {
-           'HASH(0x1f1d20)' => 'baz'
-         };
- ysh >
-
- YAML.pm will need to be fixed to preserve these keys somehow. Why?
- Because if YAML.pm gets a YAML document from YAML.py it must be able to
- return it with the Python data intact.
-
- =item Globs, Subroutines, Regexes and File Handles
-
- As far as I know, other Perl serialization modules are not capable of
- serializing and deserializing typeglobs, subroutines (code refs),
- regexes and file handles. YAML.pm has dumping capabilities for all of these.
- Loading them may produce wild results. Take care.
-
- NOTE: For a (huge) dump of Perl's global guts, try:
-
- perl -MYAML -e '$YAML::UseCode=1; print Dump \%main::'
-
- To limit this to a single namespace try:
-
- perl -MCGI -MYAML -e '$YAML::UseCode=1; print Dump \%CGI::'
-
- =item Speed
-
- This is a pure Perl implementation that has been optimized for
- programmer readability, not for computational speed.
-
- Neil Watkiss and Clark Evans are currently developing libyaml, the
- official C implementation of the YAML parser and emitter. YAML.pm will
- be refactored to use this library once it is stable. Other languages
- like Python, Tcl, PHP, Ruby, JavaScript and Java can make use of the
- same core library.
-
- Please join us on the YAML mailing list if you are interested in
- implementing something.
-
- L<https://lists.sourceforge.net/lists/listinfo/yaml-core>
-
- =item Streaming Access
-
- This module Dumps and Loads in one operation. There is no interface
- for parsing or emitting a YAML stream one node at a time. It's all
- or nothing.
-
- An upcoming release will have support for incremental parsing.
- Incremental dumping is harder. Stay tuned.
-
- =back
-
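- The non-leaf circular structure mentioned under "Circular Leaves" can
- also be checked from a script rather than 'ysh -r'. This is only a
- sketch: it assumes the installed YAML.pm round-trips such structures as
- described above, and the variable names are made up for illustration:
-
```perl
use strict;
use warnings;
use YAML;

# An array whose first element refers back to the array itself.
my $foo = [];
$foo->[0] = $foo;

# Dump() marks the array with an anchor and emits the inner reference
# as an alias to it.
my $yaml = Dump($foo);
print $yaml;

# After Load(), the first element should again be the array itself.
my ($copy) = Load($yaml);
print "cycle preserved\n" if $copy->[0] == $copy;
```
-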
- =head1 RESOURCES
-
- Please read L<YAML::Node> for advanced YAML features.
-
- L<http://www.yaml.org> is the official YAML website.
-
- L<http://www.yaml.org/spec/> is the YAML 1.0 specification.
-
- L<http://wiki.yaml.org/spec/> is the official YAML wiki.
-
- YAML has been registered as a SourceForge project.
- (L<http://www.sourceforge.net>) Currently we are only using the mailing
- list facilities there.
-
- =head1 IMPLEMENTATIONS
-
- This is the first implementation of YAML functionality based on the 1.0
- specification.
-
- The following people have shown an interest in doing implementations.
- Please contact them if you are also interested in writing an
- implementation.
-
- ---
- - name: Neil Watkiss
-   project:
-     - libyaml
-     - YAML mode for the vim editor
-   email: nwatkiss@ttul.org
-
- - name: Brian Ingerson
-   project: YAML.pm, libyaml Perl binding
-   email: ingy@ttul.org
-
- - name: Clark Evans
-   project: libyaml, Python binding
-   email: cce@clarkevans.com
-
- - name: Oren Ben-Kiki
-   project: Java Loader/Dumper
-   email: orenbk@richfx.com
-
- - name: Paul Prescod
-   project: YAML Antagonist/Anarchist
-   email: paul@prescod.net
-
- - name: Ryan King
-   project: YAML test specialist
-   email: rking@panoptic.com
-
- - name: Steve Howell
-   project: Python and Ruby implementations
-   email: showell@zipcon.net
-
- - name: Patrick Leboutillier
-   project: Java Loader/Dumper
-   email: patrick_leboutillier@hotmail.com
-
- - name: Shane Caraveo
-   project: PHP Loader/Dumper
-   email: shanec@activestate.com
-
- - name: Brian Quinlan
-   project: Python Loader/Dumper
-   email: brian@sweetapp.com
-
- - name: Jeff Hobbs
-   project: Tcl Loader/Dumper
-   email: jeff@hobbs.org
-
- - name: Claes Jacobsson
-   project: JavaScript Loader/Dumper
-   email: claes@contiller.se
-
- =head1 AUTHOR
-
- Brian Ingerson <INGY@cpan.org> is responsible for YAML.pm.
-
- The YAML language is the result of a ton of collaboration between Oren
- Ben-Kiki, Clark Evans and Brian Ingerson. Several others have added help
- along the way.
-
- Neil Watkiss is pioneering libyaml. Bless that boy!
-
- Ryan King offered much help on the 0.35 release. The XP advocate
- extraordinaire helped me refactor my entire test suite into its
- current form. Regression tests are extremely important to the success
- of this project.
-
- =head1 COPYRIGHT
-
- Copyright (c) 2001, 2002. Brian Ingerson. All rights reserved.
-
- This program is free software; you can redistribute it and/or modify it
- under the same terms as Perl itself.
-
- See L<http://www.perl.com/perl/misc/Artistic.html>
-
- =cut
-