This file provides an overview of the Bento design. It describes
the design more from the API perspective than the format perspective.
However, most of the concepts also apply to the format level. In
some respects the format is simpler than the API, but it is difficult
to understand without first understanding the API functionality it is
intended to support.
Bento Entities
==============
The easiest way to begin understanding the Bento design is probably to
review the entities that the API manipulates.
Primary Entities
================
The most important entities in the Bento design are containers, objects,
properties, values, and types.
Every object is in some container. An object consists of a set of
properties. The properties are not in any particular order. Each
property consists of a sequence of values, indexed from 1 to n. Every
object must have at least one property, and that property must have at
least one value. Each value has a type; several values of the same
property may have the same type. The type of a value is unrelated to
its index. Each value consists of a variable length sequence of bytes.
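The relationships described above can be sketched as a small data model. This is only an illustration in Python, not the Bento API: the class names (``Container``, ``BentoObject``, ``Value``) and the methods are hypothetical, and types and properties are represented by plain strings rather than by references to descriptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Value:
    type_name: str  # stands in for a reference to a type description
    data: bytes     # a variable-length sequence of bytes


@dataclass
class BentoObject:
    object_id: int
    # Property name -> its values; properties carry no meaningful order.
    properties: Dict[str, List[Value]] = field(default_factory=dict)

    def add_value(self, prop: str, value: Value) -> int:
        """Append a value to a property and return its 1-based index."""
        values = self.properties.setdefault(prop, [])
        values.append(value)
        return len(values)

    def get_value(self, prop: str, index: int) -> Value:
        """Look up a value by its 1-based index within a property."""
        if index < 1:
            raise IndexError("value indices start at 1")
        return self.properties[prop][index - 1]


@dataclass
class Container:
    objects: Dict[int, BentoObject] = field(default_factory=dict)

    def new_object(self, object_id: int, prop: str, first: Value) -> BentoObject:
        """Create an object with its mandatory first property and value."""
        if object_id in self.objects:
            raise ValueError("object ID already in use")
        obj = BentoObject(object_id)
        obj.add_value(prop, first)
        self.objects[object_id] = obj
        return obj
```

Requiring a property and a value in ``new_object`` is one way to model the rule that every object has at least one property with at least one value.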
Now let us look at these primary entities in more detail.
Containers
----------
All Bento objects are stored in containers. Bento knows very little
about a container beyond the objects in it. However, the container
itself is an object, and can have properties, so applications can
specify further information about the container if they wish.
Containers are often files, but they can also be many other forms of
storage. For example, we are already planning to support the following
types of containers: blocks of memory, the clipboard, network messages,
and Bento values. Undoubtedly other types of containers will be useful
as well.
Objects
-------
Each Bento object has a persistent ID which is unique within its
container. Other than that, objects don't really exist independent of
their properties. An object contains no information beyond what is
stored in its properties.
Properties
----------
A property defines a role for a value. Properties are like field names
in a record, except they can be added freely to an object, and their
names are globally unique, so that applications can understand them.
Properties are distinct from types.
For example, a string might be used for the name of an object, the
author of the object, a comment, etc. These different uses would be
indicated by different properties.
Conversely, the string might be in ASCII, Unicode, or some other
international string representation. These different formats would not
be indicated by the property, but by the type (see below).
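The split between role (property) and format (type) can be illustrated with a short Python sketch. The property and type names below are invented for the example, and the decoder table is an assumption about how an application might act on a type; none of it comes from the Bento API.

```python
# (property, type, data) triples for one hypothetical object.
values = [
    ("Name", "ASCII string", b"Quarterly report"),
    ("Author", "ASCII string", b"J. Smith"),
    ("Comment", "Unicode string", "Draft".encode("utf-16-be")),
]

# Decoding depends only on the type, never on the property.
DECODERS = {
    "ASCII string": lambda data: data.decode("ascii"),
    "Unicode string": lambda data: data.decode("utf-16-be"),
}


def read_text(prop):
    """Return the decoded text stored under the given property."""
    for value_prop, type_name, data in values:
        if value_prop == prop:
            return DECODERS[type_name](data)
    raise KeyError(prop)
```

Here ``Name`` and ``Author`` share a type but play different roles, while ``Comment`` uses a different type without changing how its role is identified.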
Values
------
Values are where the data is actually stored. The data for a value can
be stored anywhere in a container. In fact, it can be broken up into
any number of separate pieces, and the pieces can be stored anywhere.
(See the discussion of continued values below.)
Values may range in size from 0 bytes to 2^32 bytes (if you have that
much storage). Bento is optimized for "large" values, such as streams
of formatted text, graphics metafiles, etc.
Types
-----
The type of a value describes the format of that value. Types record
the structure of a value, whether it is compressed, what its byte
ordering is, etc.
To continue the example above, the type of a string value would indicate
the alphabet, whether it was null terminated, and possibly other
information (such as the intended language). It might also indicate
that the string was stored in a compressed form, and would indicate the
compression technique, and the dictionary if one was required. If the
string used multi-byte characters, and the byte-ordering was not defined
by the alphabet, the type would indicate the byte-ordering within the
characters.
Secondary Entities
==================
There are several additional entities that play supporting roles in the
Bento design. These entities are important to fully understand how
Bento works, but they do not significantly change the picture given
above.
Type and property descriptions
------------------------------
The property associated with a value is a reference to a property
description. Similarly, the type is a reference to a type description.
These type and property descriptions are objects, and their IDs are
drawn from the same name-space as other object IDs.
Many type and property descriptions will simply consist of the globally
unique name of the type or property. To continue the example above
further, the type of a string of 7-bit ASCII, not compressed or
otherwise transformed, would simply be described by a globally unique
name. This would allow applications to recognize the type.
References to type and property descriptions are distinct from references
to ordinary objects in the API to allow language type checking to catch
errors in the manipulation of type and property references. However,
type and property references can still be passed to the object and value
operations, so that value manipulation can be done on types and
properties as well as normal objects.
Globally unique names
---------------------
Globally unique names are simply strings that follow certain
conventions. They begin with a registered naming authority, and have
additional segments, each of which is unique in the context of the
previous segments.
The most common globally unique names will be generated by system
vendors or commercial application developers, and may be registered.
However, many names will be generated by local developers to record
their local types and properties. To meet this need, the naming rules
allow for local creation of unregistered unique names.
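As a rough illustration of the naming rule, the Python sketch below builds names from an authority plus further segments and rejects duplicates. The ``:`` separator, the ``NameRegistry`` class, and the example names are assumptions made for this sketch; the text above does not specify the actual syntax.

```python
class NameRegistry:
    """Toy registry: a name is an authority followed by further segments."""

    def __init__(self, separator=":"):
        self._sep = separator  # assumed separator, not taken from the spec
        self._names = set()

    def register(self, authority, *segments):
        if not segments:
            raise ValueError("a name needs at least one segment after the authority")
        parts = (authority,) + segments
        for part in parts:
            # Forbidding the separator inside a segment keeps joins unambiguous.
            if not part or self._sep in part:
                raise ValueError("invalid segment: %r" % (part,))
        name = self._sep.join(parts)
        if name in self._names:
            raise ValueError("name already registered: " + name)
        self._names.add(name)
        return name
```

A segment only has to be unique under its own prefix, so ``Acme:Types:String`` and ``Acme:Properties:String`` can coexist in this model.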
IDs and accessors
-----------------
Each object is assigned a persistent ID that is unique within the
container in which the object is created. These IDs are never reused
once they have been assigned, so even if an object is deleted, its ID
will never be reassigned.
In the API, types, properties, and objects can be referred to using their
IDs, but for convenience, they are usually referred to using accessors
provided by the API. Since IDs are only unique within a container,
they must always be used with an explicit container, while the accessors
include an implicit container reference.
Accessors are also used to refer to containers and values. Accessors are
only unique within a given session, so they cannot be stored in values
as references to other values. IDs must always be used for persistent
references.
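The ID and accessor rules can be sketched in a few lines of Python. The ``Container`` and ``Accessor`` classes and their methods are hypothetical; the sketch only models the behavior described above (per-container IDs that are never reused, and accessors that carry their container with them).

```python
class Container:
    def __init__(self, name):
        self.name = name
        self._next_id = 1   # only ever increases, so IDs are never reused
        self._live = set()  # IDs of objects that currently exist

    def create_object(self):
        object_id = self._next_id
        self._next_id += 1
        self._live.add(object_id)
        return object_id

    def delete_object(self, object_id):
        self._live.remove(object_id)

    def exists(self, object_id):
        return object_id in self._live


class Accessor:
    """Session-only handle that bundles a container with an object ID."""

    def __init__(self, container, object_id):
        self.container = container
        self.object_id = object_id

    def exists(self):
        # No container argument needed: the accessor already carries it.
        return self.container.exists(self.object_id)
```

A bare ID such as ``1`` is ambiguous until it is paired with a container, which is why ``Container.exists`` is a method on the container while ``Accessor.exists`` takes no arguments.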
Dynamic values
--------------
Bento needs to support external references from one container to
another, or to other entities such as files, etc. It does this through
dynamic values. These are values whose types indicate that they contain
a description of the real value, rather than the actual data.
Except for the dynamic characteristic of their types, dynamic values are
created and stored exactly like normal values. However, when they are
accessed, a handler is called to resolve the description to an actual
value.
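A minimal Python sketch of this resolution step follows. The type name ``FileReference``, the handler table, and the in-memory ``EXTERNAL_FILES`` dictionary (a stand-in for real external storage) are all assumptions made for illustration.

```python
# Stand-in for storage outside the container, such as a separate file.
EXTERNAL_FILES = {"figures/chart.bin": b"\x01\x02\x03"}


def resolve_file_reference(description):
    """Handler: treat the stored bytes as a path and fetch the real data."""
    return EXTERNAL_FILES[description.decode("ascii")]


# Types listed here mark their values as descriptions rather than data.
DYNAMIC_HANDLERS = {"FileReference": resolve_file_reference}


def read_value(type_name, stored):
    """Return the value's data, resolving it first if the type is dynamic."""
    handler = DYNAMIC_HANDLERS.get(type_name)
    if handler is None:
        return stored       # ordinary value: the stored bytes are the data
    return handler(stored)  # dynamic value: the stored bytes describe the data
```

The stored bytes are handled identically in both cases; only the lookup on the type decides whether a handler runs.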
Value segments
--------------
To support interleaving and other uses that require breaking a value up
into pieces, Bento allows a value to consist of multiple segments stored
at different locations in the container. These segments are not visible
at the API, which glues them together to create a single stream of
bytes.
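The gluing step might look like the Python sketch below. Representing each segment as an ``(offset, length)`` pair into the container's bytes is an assumption for this example, as is the ``read_value`` function itself.

```python
def read_value(storage, segments, start=0, length=None):
    """Read from a value whose bytes are scattered across the container.

    storage  -- the raw bytes of the container
    segments -- (offset, length) pairs, in the value's logical order
    start    -- logical offset within the value at which to begin reading
    length   -- number of bytes to read (None means read to the end)
    """
    # Glue the segments into the single stream the caller expects to see.
    stream = b"".join(storage[offset:offset + size] for offset, size in segments)
    end = len(stream) if length is None else start + length
    return stream[start:end]


# Three pieces of one value interleaved with unrelated data.
storage = b"AAAhelBBBlo wCCCorld"
segments = [(3, 3), (9, 4), (16, 4)]
```

With this layout, ``read_value(storage, segments)`` yields ``b"hello world"``, and a read that crosses segment boundaries, such as ``read_value(storage, segments, 3, 5)``, yields ``b"lo wo"`` without the caller knowing where the boundaries are.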
Handlers
========
Handlers are pieces of code called by the Bento library to do specific
jobs, but not part of the Bento library as such. Functions are put into
handlers rather than the library to make the library more portable, and
also to provide a standard way to extend the library.
Handlers come in two main forms: container handlers and value handlers.
In addition, the API uses special handlers for reporting errors and
allocating and deallocating memory.
Actual I/O to containers is always done using container handlers, to
provide platform independence. Container handlers provide stream I/O,
plus a few special interfaces for reading and writing specific parts of
the container format.
The many different types of containers mentioned in the first section
are not actually implemented in the Bento library. Instead, the library
simply calls different types of handlers, all of which provide the same
interface. These handlers map I/O to the underlying storage in a way
that depends on the container type.
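The idea of one interface over different kinds of storage can be sketched in Python as follows. The ``read``/``write`` signatures with explicit offsets, the handler class names, and ``store_and_reload`` are assumptions for this illustration and do not reflect the actual handler interface.

```python
import tempfile


class MemoryHandler:
    """Container handler backed by a block of memory."""

    def __init__(self):
        self._buf = bytearray()

    def write(self, offset, data):
        end = offset + len(data)
        if end > len(self._buf):
            # Grow the block with zero bytes so the write fits.
            self._buf.extend(b"\x00" * (end - len(self._buf)))
        self._buf[offset:end] = data

    def read(self, offset, length):
        return bytes(self._buf[offset:offset + length])


class FileHandler:
    """Container handler backed by a binary file object."""

    def __init__(self, fileobj):
        self._file = fileobj

    def write(self, offset, data):
        self._file.seek(offset)
        self._file.write(data)

    def read(self, offset, length):
        self._file.seek(offset)
        return self._file.read(length)


def store_and_reload(handler, data):
    """Library-side code: works with any handler offering read and write."""
    handler.write(0, data)
    return handler.read(0, len(data))


def scratch_file_handler():
    """Build a FileHandler over an anonymous temporary file."""
    return FileHandler(tempfile.TemporaryFile())
```

``store_and_reload`` never checks which handler it was given, which mirrors how the library can stay independent of the storage behind a container.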
Value handlers are only required for values that require special support
for access. For example, a value that is compressed on writing and
decompressed on reading would need a special handler. Value handlers
have the option of providing specialized operations to manipulate the
value, either instead of or in addition to the standard value