home
***
CD-ROM
|
disk
|
FTP
|
other
***
search
/
ftp.mactech.com 2010
/
ftp.mactech.com.tar
/
ftp.mactech.com
/
online
/
source
/
c
/
compilers
/
Tickle-4.0.sit.hqx
/
Tickle-4.0
/
cbtree
/
src
/
db.man
< prev
next >
Wrap
Text File
|
1992-04-29
|
22KB
|
661 lines
DB(3) C LIBRARY FUNCTIONS DB(3)
NAME
btree_open, hash_open, recno_open - database access methods
SYNOPSIS
#include <sys/types.h>
#include <db.h>
DB *
btree_open(const char *file, int flags, int mode,
const BTREEINFO * openinfo);
DB *
hash_open(const char *file, int flags, int mode,
const HASHINFO * openinfo);
DB *
recno_open(const char *file, int flags, int mode,
const RECNOINFO * openinfo);
DESCRIPTION
Btree_open, hash_open, and recno_open are access method
interfaces to database files in btree, hashed, and flat-file
formats, respectively. The btree format is a representation
of a sorted, balanced tree structure. The hashed format is
an extensible, dynamic hashing scheme. The flat-file format
is a UNIX file with fixed or variable length lines. These
formats are described in more detail below.
Access to all file types is based on key/data pairs.
Each routine opens file for reading and/or writing. Data-
bases never intended to be preserved on disk may be created
by setting the file parameter to NULL. The flags and mode
arguments are as specified to the open(2) routine, however,
only the O_CREAT, O_EXCL, O_RDONLY, O_RDWR, O_TRUNC and
O_WRONLY flags are meaningful. The argument openinfo is a
pointer to an access method specific structure described
below.
The open routines return a pointer to a DB structure on suc-
cess and NULL on error. The DB structure contains at least
the following fields:
typedef struct {
int (*close)(const DB *db);
int (*sync)(const DB *db);
int (*del)(const DB *db, const DBT *key, u_int flags);
int (*get)(const DB *db, DBT *key, DBT *data, u_int flags);
int (*put)(const DB *db, const DBT *key, const DBT *data,
u_int flags);
int (*seq)(const DB *db, DBT *key, DBT *data, u_int flags);
int type;
Sun Release 4.1 Last change: April 2, 1991 1
DB(3) C LIBRARY FUNCTIONS DB(3)
void *openinfo;
} DB;
The elements of this structure consist of a pointer to an
access method specific structure and a set of routines which
perform various functions. All of these routines take a
pointer to a structure as returned by one of the open rou-
tines, one or more pointers to key/data structures, and,
optionally, a flag value.
openinfo
A pointer to an internal structure specific to the
access method.
type The type of the underlying access method; either
DB_BTREE, DB_HASH or DB_RECNO.
close
A pointer to a routine to flush any cached information
to disk, free any allocated resources, and close the
database file. Since key/data pairs may be cached in
memory, failing to close the file with a close routine
may result in inconsistent or lost information. Close
routines return -1 on error (setting errno) and 0 on
success.
del A pointer to a routine to remove key/data pairs from
the database. Delete routines return -1 on error (set-
ting errno), 0 on success, and 1 if the specified key
was not in the file.
get A pointer to a routine which is the interface for keyed
retrieval from the database. The address and length of
the data associated with the specified key are returned
in the structure referenced by data. Get routines
return -1 on error (setting errno), 0 on success, and 1
if the key was not in the file.
put A pointer to a routine to store key/data pairs in the
database.
The parameter flag must be set to one of the following
values:
R_IAFTER
Append the data immediately after the data refer-
enced by key, creating a new key/data pair. (This
implies that the access method is able to create
new keys, i.e. the keys are ordered and indepen-
dent, for example, record numbers. Applicable
only to the RECNO access method.)
Sun Release 4.1 Last change: April 2, 1991 2
DB(3) C LIBRARY FUNCTIONS DB(3)
R_IBEFORE
Insert the data immediately before the data refer-
enced by key, creating a new key/data pair. (This
implies that the access method is able to create
new keys, i.e. the keys are ordered and indepen-
dent, for example, record numbers. Applicable
only to the RECNO access method.)
R_NOOVERWRITE
Enter the new key/data pair only if the key does
not previously exist.
R_PUT
Enter the new key/data pair and replace any previ-
ously existing key.
Put routines return -1 on error (setting errno), 0 on
success, and 1 if the R_NOOVERWRITE flag was set and
the key already exists in the file.
seq A pointer to a routine which is the interface for
sequential retrieval from the database. The address
and length of the key are returned in the structure
referenced by key, and the address and length of the
data are returned in the structure referenced by data.
Sequential key/data pair retrieval may begin at any
time, and the position of the ``cursor'' is not
affected by calls to the del, get, put, or sync rou-
tines. Modifications to the database during a sequen-
tial scan will be reflected in the scan, i.e. records
inserted behind the cursor will not be returned while
records inserted in front of the cursor will be
returned.
The flag value must be set to one of the following
values:
R_CURSOR
The data associated with the specified key is
returned. This differs from the get routines in
that it sets the ``cursor'' to the location of the
key as well. (This implies that the access method
has a implicit order which does not change.
Applicable only to the BTREE and RECNO access
methods.)
R_FIRST
The first key/data pair of the database is
returned.
R_LAST
Sun Release 4.1 Last change: April 2, 1991 3
DB(3) C LIBRARY FUNCTIONS DB(3)
The last key/data pair of the database is
returned. (This implies that the access method
has a implicit order which does not change.
Applicable only to the BTREE and RECNO access
methods.)
R_NEXT
Retrieve the key/data pair immediately after the
key/data pair most recently retrieved using the
seq routine. The cursor is moved to the returned
key/data pair. If flag is set to R_NEXT the first
time the seq routine is called, the first key/data
pair of the database is returned.
R_PREV
Retrieve the key/data pair immediately before the
key/data pair most recently retrieved using the
seq routine. The cursor is moved to the returned
key/data pair. If flag is set to R_PREV the first
time the seq routine is called, the last key/data
pair of the database is returned. (This implies
that the access method has a implicit order which
does not change. Applicable only to the BTREE and
RECNO access methods.)
Seq routines return -1 on error (setting errno), 0 on
success, 1 if there are no more key/data pairs avail-
able. If the RECNO access method is being used, and if
the database file is a character special file and no
complete key/data pairs are currently available, the
seq routines return 2.
sync A pointer to a routine to flush any cached information
to disk. If the database is in memory only, the sync
routine has no effect and will always succeed. Sync
routines return -1 on error (setting errno) and 0 on
success.
KEY/DATA PAIRS
Access to all file types is based on key/data pairs. Both
keys and data are represented by the following data struc-
ture:
typedef struct {
void *data;
size_t size;
} DBT;
The elements of the DBT structure are defined as follows:
data A pointer to a byte string.
Sun Release 4.1 Last change: April 2, 1991 4
DB(3) C LIBRARY FUNCTIONS DB(3)
size The length of the byte string.
Key/data strings must fit into available memory.
BTREE
One of the access methods is a btree: a sorted, balanced
tree structure with associated key/data pairs.
The access method specific data structure provided to
btree_open is as follows:
typedef struct {
u_long flags;
u_int psize;
u_int cachesize;
int (*compare)(const void *, const void *);
int lorder;
} BTREEINFO;
The elements of this structure are defined as follows:
flags
The flag value is specified by or'ing any of the fol-
lowing values:
R_DUP
On insertion, if the key to be inserted already
exists, permit insertion anyway. This flag per-
mits duplicate keys in the tree. By default,
duplicates are not permitted, and attempts to
insert them will fail. Note, the order of
retrieval of key/data pairs with duplicate keys is
undefined.
cachesize
A suggested maximum size, in bytes, of the memory
cache. Setting this value to zero specifies that an
appropriate amount of memory should be used. Since
every search examines the root page of the tree, cach-
ing the most recently used pages substantially improves
access time. In addition, physical writes are delayed
as long as possible, so a moderate cache can reduce the
number of I/O operations significantly. Obviously,
using a cache increases the likelihood of corruption or
lost data if the system crashes while a tree is being
modified. However, caching 10 pages decreases the
creation time of a large tree by between two and three
orders of magnitude.
compare
Compare is a user defined comparison function. It must
return an integer less than, equal to, or greater than
Sun Release 4.1 Last change: April 2, 1991 5
DB(3) C LIBRARY FUNCTIONS DB(3)
zero if the first argument is considered to be respec-
tively less than, equal to, or greater than the second.
The same comparison function must be used on a given
tree every time it is opened. If no comparison func-
tion is specified, strcmp(3) is used.
lorder
The byte order for 4-byte integers in the stored data-
base metadata. The number should represent the order
as an integer; for example, big endian order would be
the number 4,321. If lorder is 0 (no order is speci-
fied) the current host order is used. If the file
already exists, the specified value is ignored and the
value specified when the tree was created is used.
(Obviously, portability of the data forming the
key/data pairs is the concern of the application pro-
gram.)
psize
Page size is the size in bytes of the pages used for
nodes in the tree. If the file already exists, the
specified value is ignored and the value specified when
the tree was created is used. If psize is zero, an
appropriate page size is chosen (based on the system
memory and/or file system constraints), but will never
be less than 512 bytes.
If the pointer to the openinfo data structure is NULL, the
btree_open routine will use appropriate values.
If the database file already exists, and the O_TRUNC flag is
not specified to btree_open, the parameter psize ignored.
Key structures may reference byte strings of slightly less
than one-half the tree's page size only (see psize). Data
structures may reference byte strings of essentially unlim-
ited length.
Searches, insertions, and deletions in a btree will all com-
plete in O lg N.
Forward sequential scans of a tree are from the least key to
the greatest.
Space freed up by deleting key/data pairs from a btree is
never reclaimed, although it is normally made available for
reuse. The exception to this is that space occupied by
large data items (those greater than one quarter the size of
a page) is neither reclaimed nor reused. This means that
the btree storage structure is grow-only. The only solu-
tions are to avoid excessive deletions, or to create a fresh
tree periodically from a scan of an existing one.
Sun Release 4.1 Last change: April 2, 1991 6
DB(3) C LIBRARY FUNCTIONS DB(3)
HASH
One of the access methods is hashed access and storage. The
access method specific data structure provided to hash_open
is as follows:
typedef struct {
u_long (*hash)(const void *, const size_t);
u_int cachesize;
int bsize;
int ffactor;
int lorder;
int nelem;
} HASHINFO;
The elements of this structure are defined as follows:
bsize
Bsize defines the hash table bucket size, and is, by
default, 256 bytes. It may be preferable to increase
the page size for disk-resident tables and tables with
large data items.
cachesize
A suggested maximum size, in bytes, of the memory
cache. Setting this value to zero specifies that an
appropriate amount of memory should be used.
ffactor
Ffactor indicates a desired density within the hash
table. It is an approximation of the number of keys
allowed to accumulate in any one bucket, determining
when the hash table grows or shrinks. The default
value is 8.
hash Hash is a user defined hash function. Since no hash
function performs equally well on all possible data,
the user may find that the built-in hash function does
poorly on a particular data set. User specified hash
functions must take two arguments (a pointer to a byte
string and a length) and return an u_long to be used as
the hash value.
lorder
The byte order for 4-byte integers in the stored data-
base metadata. The number should represent the order
as an integer; for example, big endian order would be
the number 4,321. If lorder is 0 (no order is speci-
fied) the current host order is used. If the file
already exists, the specified value is ignored and the
value specified when the tree was created is used.
(Obviously, portability of the data forming the
key/data pairs is the concern of the application
Sun Release 4.1 Last change: April 2, 1991 7
DB(3) C LIBRARY FUNCTIONS DB(3)
program.)
nelem
Nelem is an estimate of the final size of the hash
table. If not set, the default value is 1. If not set
or set too low, hash tables will expand gracefully as
keys are entered, although a slight performance degra-
dation may be noticed.
If the pointer to the openinfo data structure is NULL, the
hash_open routine will use appropriate values.
If the hash table already exists, and the O_TRUNC flag is
not specified to hash_open, the parameters bsize, ffactor,
and nelem are ignored.
If a hash function is specified, hash_open will attempt to
determine if the hash function specified is the same as the
one with which the database was created, and will fail if it
is not.
Both key and data structures may reference byte strings of
essentially unlimited length.
Backward compatible interfaces to the routines described in
dbm(3), hsearch(3), and ndbm(3) are provided, however, these
interfaces are not compatible with previous file formats.
RECNO
One of the access methods is either variable or fixed-length
records, the former delimited by a specific byte value. The
access method specific data structure provided to recno_open
is as follows:
typedef struct {
u_long flags;
u_int cachesize;
size_t reclen;
u_char bval;
} RECNOINFO;
The elements of this structure are defined as follows:
flags
The flag value is specified by or'ing any of the fol-
lowing values:
R_FIXEDLEN
The records are fixed-length, not byte delimited.
The structure element reclen specifies the length
of the record, and the structure element bval is
used as the pad character.
Sun Release 4.1 Last change: April 2, 1991 8
DB(3) C LIBRARY FUNCTIONS DB(3)
R_SNAPSHOT
This flag requires that a snapshot of the file be
taken when recno_open is called, instead of per-
mitting any unmodified records to be read from the
original file.
cachesize
A suggested maximum size, in bytes, of the memory
cache. Setting this value to zero specifies that an
appropriate amount of memory should be used.
reclen
The length of a fixed-length record.
bval The delimiting byte to be used to mark the end of a
record for variable-length records, and the pad charac-
ter for fixed-length records.
Variable-length and fixed-length data files require key
structures to reference the following structure:
typedef struct {
u_long length;
u_long number;
u_long offset;
u_char valid;
} RECNOKEY;
The elements of this structure are defined as follows:
length
The length of the record.
number
The record number.
offset
The offset in the file at which the record is located.
valid
A flag value which indicates the validity of the other
fields in the structure. The flag value is specified
by or'ing one or more of the following values:
R_LENGTH
The record length is valid.
R_NUMBER
The record number is valid.
R_OFFSET
The byte offset is valid.
Sun Release 4.1 Last change: April 2, 1991 9
DB(3) C LIBRARY FUNCTIONS DB(3)
If the record retrieval is successful, the record number,
byte offset and record length are set in the RECNOKEY struc-
ture referenced by the caller's key structure.
Data structures may reference byte strings of essentially
unlimited length.
ERRORS
The open routines may fail and set errno for any of the
errors specified for the library routines open(2) and mal-
loc(3) or the following:
[EFTYPE]
A file used by one of the open routines is incorrectly
formatted.
[EINVAL]
A parameter has been specified (hash function, pad byte
etc.) that is incompatible with the current file
specification or there is a mismatch between the ver-
sion number of file and the software.
The close routines may fail and set errno for any of the
errors specified for the library routines close(2), read(2),
write(2), free(3), or fsync(2).
The del, get, put and seq routines may fail and set errno
for any of the errors specified for the library routines
read(2), write(2), free(3) or malloc(3).
The sync routines may fail and set errno for any of the
errors specified for the library routine fsync(2).
SEE ALSO
Dynamic Hash Tables, Per-Ake Larson, Communications of the
ACM, April 1988.
A New Hash Package for UNIX, Margo Seltzer, USENIX Proceed-
ings, Winter 1991.
BUGS
The typedef DBT is a mnemonic for ``data base thang'', and
was used because noone could think of a reasonable name that
wasn't already used.
None of the access methods provide any form of concurrent
access, locking, or transactions.
Only big and little endian byte order is supported.
Sun Release 4.1 Last change: April 2, 1991 10