Path: senator-bedfellow.mit.edu!bloom-beacon.mit.edu!spool.mu.edu!olivea!sgigate!sgiblab!idiom.berkeley.ca.us!idiom.berkeley.ca.us!not-for-mail
From: muir@idiom.berkeley.ca.us (David Muir Sharnoff)
Newsgroups: comp.databases,comp.sources.d,comp.archives.admin,comp.answers,news.answers
Subject: Catalog of free databases
Followup-To: comp.archives.admin
Date: 26 Aug 1993 01:53:39 -0700
Organization: Idiom Consulting / Berkeley, CA USA
Lines: 638
Approved: news-answers-request@mit.edu
Expires: Sat, 1 Jan 1994 23:59:00 GMT
Message-ID: <freedb-1.0@idiom.berkeley.ca.us>
NNTP-Posting-Host: idiom.berkeley.ca.us
Xref: senator-bedfellow.mit.edu comp.databases:28292 comp.sources.d:9459 comp.archives.admin:1135 comp.answers:1744 news.answers:11807

Archive-name: free-databases
Last-modified: 1993/08/26
Version: 1.0

                        Catalog of Free Databases

This document attempts to catalog databases that are available without
payment and with source. This document is still a draft. The latest
version of the document can be ftp'ed from pub/free-databases on
idiom.berkeley.ca.us.

Please send additions and corrections to David Muir Sharnoff
<muir@idiom.berkeley.ca.us>

Thanks,

-Dave

---------------------------------------------------------------------------

Prototype entry:

name:           The name of the package
version:        The current version number of the package
interfaces:     The external interfaces that are supported by the
                package. Common interfaces are: SQL, ESQL, dbm, etc.
access methods: A list of the access methods that are supported
multiuser:      Can more than one person access the package at the same
                time?
transactions:   Does the package support transactions?
distributed:    Does the package support distributed databases?
query language: What query languages does the package support, if any?
                SQL, QUEL, etc.
index size:     (full text only) The size of the index as a percentage
                of the size of the text to be indexed.
limits:         Any known, annoying limits
robustness:     Can this package be used on mission-critical data?
description:    A description of the package.
references:     Pointers to other documentation
announcements:  Where to get announcements
discussion:     Where to send, or how to join, discussions about the
                package
bugs:           Where to send bug reports
requires:       Special requirements for installing or running
ports:          What does the package run on?
restrictions:   Special copyright or other restrictions on the software
author:         The primary author, if known. If not known, direct
                inquiries to "contact."
contact:        The current contact point. If not specified, use
                "author."
how to get:     Instructions for obtaining the package
updated:        When the package was last updated (yyyy/mm/dd) [often
                incorrect]

---------------------------------------------------------------------------
--------------------------- relational databases --------------------------
---------------------------------------------------------------------------

name:           University INGRES
version:        8.9
interfaces:     QUEL, EQUEL
access methods: heap, hash, isam, ordered
multiuser:      yes
transactions:   yes, but no multistatement transactions. Each statement
                is ACID
distributed:    no
query language: QUEL
limits:         ?
robustness:     Very mature technology
description:    This is the database program that was the basis for
                INGRES Corporation. Obviously, it does not have all the
                bells and whistles of the current commercial product.
                However, it is small and fast and it works. So-called
                ordered relations are slow and not locked.
references:     "The INGRES Papers" Stonebraker ed. Addison Wesley
ports:          SunOS, ?
author:         The Ingres project at UC Berkeley.
contact:        <ingres@postgres.berkeley.edu>
how to get:     ftp pub/ingres89.tar.Z from toe.cs.berkeley.edu _and_
                ftp pub/source/ingres.patch from idiom.berkeley.ca.us
updated:        1993/05/20

name:           MetalBase
version:        5.0
interfaces:     custom C library
access methods: AVL-trees
multiuser:      yes, but in theory race conditions still exist
transactions:   yes
distributed:    no
query language: "Report", and "View Relation", a curses based viewer
limits:         ?
robustness:     data corruption is possible when MetalBase is not shut
                down correctly
description:    MetalBase is a small relational database. It has all the
                pieces that a relational database should have: C
                interface, curses interface, report writer, etc. It does
                not have a design which takes advantage of shared memory
                or the better access methods. None of the interfaces are
                standard, but all of them are easy to use.
discussion:     mbase-request@internode.com.au
requires:       curses
ports:          Linux, MS-DOS, Amiga, NeXT, Coherent, Macintosh MPW, SGI,
                Xenix
restrictions:   donations are suggested
author:         Richid Jernigan / PO Box 827 / Norris TN 37828
how to get:     ftp systems/unix/linux/sources/usr.bin/mbase.tar.z
                from ftp.uu.net
updated:        1992/10/01

name:           Postgres
version:        4.1
interfaces:     libpq (C interface)
access methods: Heap plus secondary indexes: B-tree, R-tree, Hash.
multiuser:      yes
transactions:   yes
distributed:    no
query language: Postquel (incompatible superset of Quel)
limits:         ?
robustness:     "It is not up to commercial levels of reliability. I
                would not want _my_ payroll records in it :-)"
description:    Postgres is a database research project under Prof.
                Michael Stonebraker at U. C. Berkeley. To facilitate
                research efforts, a software test-bed was created; this
                is the "Postgres" DBMS software. The Postgres DBMS is
                extended relational or object oriented, depending on the
                buzzword du jour. Postgres is relational. It is highly
                extensible. It has object oriented features like
                inheritance. It has query language procedures, rules,
                updatable views, and more.
references:     There are many papers available, both through ftp and as
                hard-copy technical reports. Cruise the ftp site for
                papers or mail Michelle Mattera
                <michelle@postgres.berkeley.edu>
discussion:     send "Subject: ADD" to
                postgres-request@postgres.berkeley.edu
bugs:           <bug-postgres@postgres.berkeley.edu>
ports:          MIPS Ultrix 4.2+, SunOS 4.1.1+, NextStep 3.0,
                Linux 0.99.7
                in progress: Alpha OSF/1, HP-PA HP-UX 8.07,
                HP-PA HP-UX 9.01, i386 SCO ODT 2.0, Sparc Solaris 2.1
                previous versions: i386 SVR4, i386 386BSD,
                RS/6000 AIX 3.2
contact:        developers: <postgres-questions@postgres>
                chief programmer: Marc Teitelbaum
                <marc@postgres.berkeley.edu>
                admin: Michelle Mattera <michelle@postgres.berkeley.edu>
how to get:     ftp pub/postgres/postgres-v4r1/* from toe.cs.berkeley.edu
updated:        1993/03/19

name:           REQUIEM
version:        ?
interfaces:     RQL, ERQL (extension)
access methods: B-tree indexes can be created on attributes of base
                relations.
multiuser:      yes (multiuser extension)
transactions:   yes (multiuser extension)
distributed:    no
query language: RQL
robustness:     [seems to be maintained by zero to few people --muir]
description:    REQUIEM (RElational Query and Update Interactive systEM)
                is an extensible, relational DBMS developed in C with a
                query language based on the relational algebra called
                RQL (Relational Query Language). There appear to be
                three versions of REQUIEM: the base version and two
                extensions. One extension adds multiuser capability. The
                other adds an embeddable version of the query language.
references:     "An Extensible DBMS for Small-Medium Scale Systems",
                Papazoglou, M.P., IEEE Micro, April 1989.

                Relational Database Management - A Systems Programming
                Approach, Papazoglou, M.P. and Valder, W., Prentice Hall
                International, UK, 1989.

                "The Development of a Program Interface for the RDBMS
                Requiem" Power, R.A., 1991 Honours Thesis (dvi file
                available with source code for the embedded version).
ports:          Sparc/SunOS; base version only: MS-DOS, Macintosh
contact:        (embedded version only)
                Robert Power <robert.power@csis.dit.csiro.au>
how to get:     ftp pub/requiem/REQUIEM.tar.Z (multiuser version) or
                pub/requiem/Requiem.tar.Z (embeddable version) from
                dcssoft.anu.edu.au
                The base version can be constructed from the multiuser
                version.
updated:        1992/10/06

name:           shql
version:        1.1
interfaces:     SQL, shell
multiuser:      no
transactions:   no ?
distributed:    no
limits:         no NULLs in the data, spaces and backslashes may be added
                when the data contains punctuation, GROUP BY is not
                implemented.
robustness:     it is a shell script.
description:    Shql is a program that reads SQL commands interactively
                and executes those commands by creating and manipulating
                Unix files. The program is patterned after Ingres'
                interactive SQL terminal monitor program.
requires:       Bourne shell with functions, awk, grep, cut, sort, uniq,
                join, wc, and sed
author:         Bruce Momjian <root%candle.uucp@bts.com>
how to get:     ask archie
updated:        1993/01/25

---------------------------------------------------------------------------
--------------------------- object oriented -------------------------------
---------------------------------------------------------------------------

name:           Arjuna Distributed Programming System
version:        2.0
interfaces:     C++
access methods: ?
multiuser:      yes
transactions:   yes, nested
distributed:    yes, includes replicated objects
query language: ?
limits:         ?
robustness:     "all reported bugs fixed"
description:    Arjuna is a programming system for reliable distributed
                computing. Arjuna supports nested atomic actions for
                controlling operations on objects (instances of C++
                classes), which can potentially be persistent. The
                software available includes a C++ stub generator which
                hides many of the details of client-server based
                programming, plus a system programmer's manual
                containing details of how to install Arjuna and use it
                to build fault-tolerant distributed applications.
discussion:     send "join arjuna YOUR-NAME-HERE" to
                mailbase@mailbase.ac.uk
ports:          UNIX: Suns, HPs, etc.
restrictions:   A commercial extension exists.
contact:        arjuna@newcastle.ac.uk
how to get:     ftp ? from arjuna.ncl.ac.uk
updated:        1993/05/15

name:           EXODUS Project software
version:        GNU E 2.3.3, Storage Manager (SM) 3.0
interfaces:     GNU E, (C++ for direct access to the Storage Manager)
access methods: B+tree and linear-hashing based indexes
multiuser:      yes, client-server
transactions:   yes
distributed:    yes, applications can access multiple servers in a single
                transaction. Distributed commits are performed across
                servers and clients have access to an interface allowing
                participation in distributed commits managed by an
                external agent.
query language: GNU E -- a persistent programming language based on C++
robustness:     High (at least for academic software). The SM release
                includes a facility for regression testing most
                features, including crash recovery.
description:    The EXODUS Storage Manager (SM) is a client-server object
                storage system which provides "storage objects" for
                storing data, versions of objects, "files" for grouping
                related storage objects, and indexes for supporting
                efficient object access. A storage object is an
                uninterpreted container of bytes which can range in size
                from a few bytes to hundreds of megabytes. The Storage
                Manager provides routines to read, overwrite, and
                efficiently grow and shrink objects. In addition, the
                Storage Manager provides transactions, lock-based
                concurrency control, and log-based recovery.

                GNU E is a persistent, object oriented programming
                language developed as part of the Exodus project. GNU E
                extends C++ with the notion of persistent data, program
                level data objects that can be transparently used across
                multiple executions of a program, or multiple programs,
                without explicit input and output operations.
references:     A bibliography of EXODUS related papers can be obtained
                from the ftp site described below.
                Some of the papers are available from the ftp server as
                technical reports, and are marked as such in the
                bibliography.
discussion:     We maintain a list of users for notification of updates.
                Mail exodus@cs.wisc.edu to be placed on the list.
bugs:           exodusbugs@cs.wisc.edu
ports:          MIPS/Ultrix, SPARC/SunOS, (HP 7xx/HP-UX for SM only)
restrictions:   none, but see copyright notice located in all source
                files
author:         The EXODUS Database Toolkit project at the University of
                Wisconsin
contact:        exodus@cs.wisc.edu
how to get:     ftp exodus/* from ftp.cs.wisc.edu
updated:        1993/07/22

name:           William's Object Oriented Database (Wood)
version:        0.6
interfaces:     MCL 2.0
access methods: custom
multiuser:      no
transactions:   no
distributed:    no
query language: none. Has BTrees for indexing.
limits:         Will slow down when the database size exceeds 256
                megabytes. Otherwise, database size limited by disk size
                (up to Macintosh limit, which is, I believe, 4
                gigabytes). Object size limited to 24 megabytes. If you
                think of a Wood database as a random access FASL file,
                you'll have the right idea.
robustness:     Until it has a real logging/recovery mechanism, I
                wouldn't advise using it for mission critical data.
                Caches pages in memory, so if you crash, you will lose.
                Has a function to flush the cache to disk, so you can do
                explicit checkpoints to make it more robust.
description:    Wood is a simple persistent store for MCL 2.0. This is
                still alpha software. It is incomplete: though you can
                save/restore all Lisp objects to/from a file, there is
                no transaction/recovery manager and no garbage collector
                for the persistent heap. I will not be able to provide
                much support, but you get source code.
discussion:     info-wood-request@cambridge.apple.com
bugs:           bug-wood@cambridge.apple.com
ports:          Macintosh CommonLisp 2.0
author:         Bill St. Clair <bill@cambridge.apple.com>
how to get:     ftp pub/mcl2/contrib/wood* from cambridge.apple.com
updated:        1993/03/07

---------------------------------------------------------------------------
--------------------------- deductive databases ---------------------------
---------------------------------------------------------------------------

name:           Aditi Deductive Database System
version:        beta release
interfaces:     motif, command line, NU-Prolog
access methods: Base relations contain variable sized records. Base
                relations can be indexed with B-trees or multi-level
                signature files (superimposed code words) allowing
                multi-attribute indexing and querying, or they can be
                stored as unindexed flat files.
multiuser:      yes
transactions:   next release
distributed:    ?
query language: prolog, graphical (Motif)
limits:         ?
robustness:     ?
description:    Aditi is a multi-user deductive database system. It
                supports base relations defined by facts (relations in
                the sense of relational databases) and derived relations
                defined by rules that specify how to compute new
                information from old information. The old information
                can be from derived relations as well as base relations;
                the rules of derived relations may be recursive. Both
                base relations and the rules defining derived relations
                are stored on disk and are accessed as required during
                query evaluation.
ports:          SPARC/SunOS, MIPS/IRIX
author:         The Aditi system was started in 1988 by Professor
                Kotagiri Ramamohanarao, and many people have been
                involved in its development, in particular Jayen
                Vaghani, Tim Leask, Peter Stuckey, John Shepherd, Zoltan
                Somogyi, James Harland and David Kemp. The support of
                Kim Marriott, David Keegel, and Warwick Harvey is also
                acknowledged.
contact:        aditi@cs.mu.oz.au
how to get:     send email to aditi@cs.mu.oz.au
updated:        1992/12/17

name:           CORAL
version:        0.1 (Version 1.0 expected shortly)
interfaces:     Exodus storage manager, C++
access methods: Hash-based and B+ tree indices
multiuser:      When used with Exodus
transactions:   When used with Exodus
distributed:    no
query language: Prolog-like with SQL-style extensions; C++ interface
limits:         No type checking; only atomic values in persistent
                relations
robustness:     Research software; used for teaching and in research
                projects, but some bugs remain
description:    The CORAL deductive database/logic programming system was
                developed at the University of Wisconsin-Madison. The
                CORAL declarative language is based on Horn-clause rules
                with extensions like SQL's group-by and aggregation
                operators, and uses a Prolog-like syntax. Many
                evaluation techniques are supported, including bottom-up
                fixpoint evaluation and top-down backtracking.

                Disk-resident data is supported via an interface to the
                Exodus storage manager; however, CORAL can run without
                Exodus if disk-resident relations are not required.

                A good interface to C++ is provided. Relations defined
                using the declarative language can be manipulated from
                C++ code, and relations defined using C++ code can be
                used in declarative rules. C++ code defining relations
                can be incrementally loaded.
requires:       AT&T C++ 2.0 or later
ports:          Decstations, Sun 4, Sparc, HP Snakes
author:         The CORAL group consists of R. Ramakrishnan, P. Seshadri,
                D. Srivastava and S. Sudarshan. The following people
                made important contributions: T. Arora, P. Bothner,
                V. Karra and W.G. Roth. Several other people were also
                involved: J. Albert, T. Ball, L. Chan, M. Das, S. Goyal,
                R. Netzer and S. Sterner.
contact:        Raghu Ramakrishnan <raghu@ricotta.cs.wisc.edu>
how to get:     ftp from ftp.cs.wisc.edu
updated:        1993/02/12

---------------------------------------------------------------------------
--------------------------- flat files ------------------------------------
---------------------------------------------------------------------------

name:           Jinx
version:        2.1
interfaces:     perl, shell
multiuser:      no
transactions:   no
distributed:    no
query language: none
limits:         no limits
robustness:     No bugs have ever been reported
description:    Very easy to use, curses based flat file handler. In
                Perl, so no limits. Allows Join, Project, Sort etc.
                Representation in 2 readable unix files. A documented
                Perl library makes it easy to add applications.
references:     Online help and a 17 page tutorial.
requires:       Perl, cterm (distributed with jinx)
ports:          any unix system with ordinary perl and curses
restrictions:   Copyleft
author:         Henk Penning, Utrecht University
contact:        Henk Penning <henkp@cs.ruu.nl>
how to get:     ftp pub/PERL/jinx.shar.Z and pub/PERL/cterm.shar.Z from
                ftp.cs.ruu.nl
updated:        1991/11/01

name:           rdb
version:        2.5j
interfaces:     ?
access methods: ?
multiuser:      ?
transactions:   ?
distributed:    ?
query language: ?
limits:         ?
robustness:     ?
description:    RDB is mostly a set of Perl scripts working as filters,
                like "row" & "column"; a very nifty table formatting
                script is in "ptbl", which can do long field folding
                into multiple lines per row.
references:     ?
discussion:     ?
bugs:           ?
requires:       perl
ports:          ?
author:         Walt Hobbs
how to get:     ftp pub/RDB-hobbs/RDB-2.5j.tar.Z from rand.org
updated:        ?
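
The filter style that the rdb entry mentions ("row" and "column" scripts
chained together) can be illustrated with a short sketch. This is not
rdb's code or its file format, neither of which is described in this
catalog. It assumes a simplified tab-separated table whose first line
names the columns, and the function names below are hypothetical.

```python
import io


def read_table(stream):
    """Parse a tab-separated table whose first line names the columns.

    Assumed layout for illustration only; not the real rdb file format.
    """
    lines = [line.rstrip("\n") for line in stream if line.strip()]
    header = lines[0].split("\t")
    rows = [dict(zip(header, line.split("\t"))) for line in lines[1:]]
    return header, rows


def column(table, names):
    """Projection: keep only the named columns, in the order given."""
    _, rows = table
    return list(names), [{n: r[n] for n in names} for r in rows]


def row(table, predicate):
    """Selection: keep only the rows for which predicate(row) is true."""
    header, rows = table
    return header, [r for r in rows if predicate(r)]


def write_table(table):
    """Serialize a table back to tab-separated text."""
    header, rows = table
    out = ["\t".join(header)]
    out += ["\t".join(r[n] for n in header) for r in rows]
    return "\n".join(out) + "\n"


SAMPLE = (
    "name\tversion\tkind\n"
    "Jinx\t2.1\tflat\n"
    "gdbm\t1.6\tdbm\n"
    "rdb\t2.5j\tflat\n"
)

table = read_table(io.StringIO(SAMPLE))
# Chain the two filters, as a shell pipeline of small scripts would.
flat = column(row(table, lambda r: r["kind"] == "flat"), ["name", "version"])
print(write_table(flat), end="")
```

Each function takes a table and returns a table, so the filters compose
in any order, which is the property that makes small pipeline tools of
this kind convenient.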
---------------------------------------------------------------------------
--------------------------- dbm variants ---------------------------------
---------------------------------------------------------------------------

name:           The Berkeley DB code
version:        1.6
interfaces:     ndbm, hsearch
access methods: hash, b+tree, recno
multiuser:      no
transactions:   no
distributed:    no
query language: none
limits:         can handle large items
robustness:     The db routines are used in some production code so they
                are likely to work reasonably well.
description:    The Berkeley DB Code is a unification of several previous
                interfaces. It also forms the basis of a unified
                interface to new access methods (b+tree, recno).
references:     "A New Hashing Package for UNIX", Margo Seltzer, Ozan
                Yigit, Proceedings of the Winter USENIX Conference,
                Dallas, TX, 1991. Also available by ftp'ing
                pub/oz/hash.ps.Z from nexus.yorku.ca.

                "Document Processing in a Relational Database System,"
                Michael Stonebraker, Heidi Stettner, Joseph Kalash,
                Antonin Guttman, Nadene Lynn, Memorandum No. UCB/ERL
                M82/32, May 1982.

                "LIBTP: Portable, Modular Transactions for UNIX," Margo
                Seltzer, Michael Olson, Proceedings 1992 Winter Usenix
                Conference, San Francisco, CA, January 1992.
reported bugs:  does not align data in memory [fixed? --ed]
ports:          SunOS 4.1.2, Ultrix 4.2A, BSD 4.4, and most other Unix
author:         Margo Seltzer, Keith Bostic, Ozan Yigit
contact:        Keith Bostic <bostic@cs.berkeley.edu>
how to get:     ftp ucb/4bsd/db.tar.Z from ftp.cs.berkeley.edu
updated:        1993/06/06

name:           dbz
version:        "20 Feb 1993 Performance Release of C News"
interfaces:     dbm-like, command-line access
access methods: hash
multiuser:      no
transactions:   no
distributed:    no
query language: none
limits:         lines are limited to 1024 bytes unless the -l option is
                used
robustness:     very robust within its domain
description:    A dbm-like library maintained for use with C-news.
ports:          everything that runs C-news (lots)
author:         Jon Zeeff <zeeff@b-tech.ann-arbor.mi.us>, David Butler,
                Mark Moraes, Henry Spencer. Hashing function by Peter
                Honeyman.
contact:        Henry Spencer <henry@zoo.toronto.edu>
how to get:     included in the C-news distribution as ./dbz
updated:        1992/02/11

name:           gdbm
version:        1.6
interfaces:     dbm, ndbm, gdbm
access methods: hash
multiuser:      no, but does lock the entire file
transactions:   no
distributed:    no
query language: none
limits:         can handle large items
robustness:     [should be good --ed]
description:    An ndbm work-alike from the Free Software Foundation
bugs:           gnu.utils.bug
author:         Philip A. Nelson <phil@wwu.edu>
how to get:     ftp gdbm-*.tar.gz from any gnu archive
updated:        1993/07/20

name:           sdbm
version:        ?
interfaces:     ndbm
access methods: hash
multiuser:      no
transactions:   no
distributed:    no
query language: none
limits:         ?
robustness:     [I know of no problems --ed]
description:    ndbm work-alike hashed database library based on Per-Aake
                Larson's Dynamic Hashing algorithms.
author:         Ozan S. Yigit <oz@nexus.yorku.ca>
how to get:     included in the X11R5 distribution as contrib/util/sdbm
updated:        1990/03/01

name:           tdbm
version:        1.1
interfaces:     dbm-like
access methods: hashing
multiuser:      In theory, but the required threads package is not
                currently distributed.
transactions:   yes
distributed:    yes
query language: none
limits:         Some minor ones.
robustness:     Probably pretty reliable, but no hard data available.
description:    Tdbm is a transaction processing database with a dbm-like
                interface. It provides nested atomic transactions,
                volatile and persistent databases, and support for very
                large objects and distributed operation.
references:     A paper appearing in the Summer '92 USENIX proceedings
                describes the design and implementation of tdbm and
                examines its performance.
discussion:     Contact the author.
bugs:           Contact the author.
author:         Barry Brachman <brachman@cs.ubc.ca>
requires:       Nothing special.
ports:          Sparc, MIPS, AIX. Thought to be quite portable.
restrictions:   Copyrighted with liberal use policy.
how to get:     ftp pub/local/src/tdbm.tar.Z from cs.ubc.ca [137.82.8.5]
updated:        1992/05/13

---------------------------------------------------------------------------
--------------------------- full text -------------------------------------
---------------------------------------------------------------------------

name:           Liam Quin's text retrieval package (lq-text)
version:        1.12-gamma
interfaces:     command line, curses
access methods: hash (dbm) plus clustered linked list
multiuser:      read only
distributed:    no, can be used over nfs if the systems are similar
query language: very limited command line
limits:         30-bit max document size, 31-bit distinct words in
                vocabulary, up to 2^24 documents (possibly more but I
                don't have enough disk to test anything like that!)
index size:     >30%, <100% of input text
robustness:     The README says that there are bugs.
description:    lq-text is a text retrieval package. That means you can
                tell it about lots of files, and later you can ask it
                questions about them. The questions have to be: "which
                files contain this word?" or "which files contain this
                phrase?", but this information turns out to be rather
                useful.

                Lqtext has been designed to be reasonably fast. It uses
                an inverted index, which is simply a kind of database.
                This tends to be smaller than the size of the data, but
                more than half as large. You still need to keep the
                original data.

                Lqtext uses dbm (berkeley db or sdbm) to store its
                indexes.
discussion:     lq-text-beta-request@sq.com
bugs:           lq-text-beta@sq.com
ports:          most versions of unix (except SCO)
restrictions:   permission required for commercial use.
author:         Liam R. E. Quin <lee@sq.com>
how to get:     ftp pub/lq-text*.tar.Z from relay.cs.toronto.edu
updated:        1992/08/24

name:           SMART
version:        11.0
interfaces:     terminal, X (slightly older version), and several under
                development including Z39.50
access methods: inverted file search or sequential search
multiuser:      yes, but last writer wins when there are update conflicts
distributed:    In-house version, to be made public in fall
query language: Natural language
index size:     approx 40% of original text.
limits:         Can only handle roughly 4 Gbytes of text in
                non-distributed version.
robustness:     Research tool; parts have been well-tested but others
                not.
description:    SMART is an implementation of the vector-space model of
                information retrieval proposed by Salton back in the
                60's. The primary purpose of SMART is to provide a
                framework in which to conduct information retrieval
                research. Standard versions of indexing, retrieval, and
                evaluation are provided.

                The system is designed to be used for small to medium
                scale collections, and offers reasonable speed and
                support for these actual applications.

                SMART analyses the collection of information and builds
                indexes. It can then be used to build natural-language
                based information retrieval software. It uses feedback
                from the user to tighten its search.
restrictions:   Research use only.
discussion:     smart-people-request@cs.cornell.edu
ports:          Unix
contact:        <chrisb@cs.cornell.edu>
how to get:     ftp pub/smart/* from ftp.cs.cornell.edu
updated:        1992/07/21

name:           WAIS (Wide Area Information Server)
version:        8 b5.1
interfaces:     the wais protocol (Z39.50)
access methods: inverted string index
multiuser:      read only
distributed:    client/server
query language: natural language, boolean, Relevance Feedback
index size:     roughly = data size
limits:         "none"
robustness:     fairly high
description:    There are three main components: WAISINDEX, WAISSERVER,
                and WAISSEARCH. WAISINDEX creates an inverted file
                index. WAISINDEX includes filters for a number of common
                file formats.
                WAISSERVER listens for Z39.50 packets and tries to answer
                them. WAISSEARCH is the user agent that talks to
                WAISSERVERs. There are several front ends: shell, X, and
                emacs.
announcements:  wais-interest-request@think.com
discussion:     wais-discussion-request@think.com
ports:          vax, sun-3, sun-4, NeXT, sysV
restrictions:   commercial version exists, contact info@wais.com
author:         Harry Morris <Morris@wais.com>,
                Brewster Kahle <Brewster@wais.com>,
                Jonny Goldman <Jonathan@Think.COM>
how to get:     ftp wais/* from wais.com
updated:        1992/11/16
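
Several of the full text packages above are described as using an
inverted index, and the lq-text entry names the two questions such an
index answers: which files contain a word, and which files contain a
phrase. The sketch below shows the general idea only. It is not the
on-disk layout or the code of lq-text, SMART, or WAIS; the tokenizer,
the function names, and the sample documents are all assumptions made
for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase words (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each word to {document name: [positions of that word]}."""
    index = defaultdict(dict)
    for name, text in docs.items():
        for pos, word in enumerate(tokenize(text)):
            index[word].setdefault(name, []).append(pos)
    return index


def find_word(index, word):
    """Which documents contain this word?"""
    return sorted(index.get(word.lower(), {}))


def find_phrase(index, phrase):
    """Which documents contain these words at consecutive positions?"""
    words = tokenize(phrase)
    if not words:
        return []
    hits = []
    for name, starts in index.get(words[0], {}).items():
        for start in starts:
            # Every later word must appear at the matching offset.
            if all(start + i in index.get(w, {}).get(name, [])
                   for i, w in enumerate(words[1:], 1)):
                hits.append(name)
                break
    return sorted(hits)


# Hypothetical sample documents.
docs = {
    "a.txt": "The inverted index is a kind of database.",
    "b.txt": "A database index speeds up lookups.",
}
index = build_index(docs)
print(find_word(index, "database"))
print(find_phrase(index, "inverted index"))
```

Storing word positions rather than only document names is what makes
phrase queries possible, and it is also one reason such an index can
approach the size of the text it describes.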