-
-
- N-1-3-040.31.2, "Internet Uniform Resource Identifiers", by Alan
- Emtage*, <bajan@bunyip.com>
-
-
- We in the Internet community are now seeing the cold, hard reality of
- what it means to have exponential growth, and not one of the Sacred
- Seven Layers has been spared in this attack. One of the most obvious
- examples of this, from the user's point of view, has been the
- explosion in Information Services. From archie to Z39.50, users are
- being asked to navigate through hundreds of gigabytes of data to find
- just the information for which they are searching.
-
- In moving from an Internet world composed of weenies & wizards in the
- darkened offices to John Doe on Maple Street, a fundamental change
- occurs: suddenly we will have millions of "librarians" and
- "publishers" who won't be (and can't be expected to be) trained in the
- knowledge of those professionals. Recently, much attention has been
- focussed in the Information Systems community on how to identify,
- locate and access the (potentially) millions of resources on the
- network. Although there remains much work to be done, consensus seems
- to have been reached on the broad outlines of how such a system would
- work. The idea in its most basic form is that the location and access
- component is separate (and external) from the identification (internal
- content) of any object. With that separation in place, the following
- scheme has been developed.
-
- Uniform Resource Locators (URLs) will specify a well-known format in
- which the various information services can interoperate and exchange
- data about the location and access methods for a specific resource.
- Thus WAIS and Prospero would be able to parse and understand
- references to the same resource (document, service, etc.) regardless
- of which system the information is embedded in and regardless of the
- internal content of that information.
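-
- As a rough illustration of "location and access method" as a single
- parseable reference, the Python sketch below splits a locator into
- its parts. It assumes the scheme://host/path syntax that was later
- standardized, which this article does not itself specify; the
- Locator type, the parse_locator helper and the host name are
- hypothetical.

```python
from typing import NamedTuple
from urllib.parse import urlsplit


class Locator(NamedTuple):
    """Hypothetical record: how to reach a resource and where it lives."""
    access_method: str  # e.g. "ftp" or "gopher"
    host: str           # machine that holds the resource
    path: str           # location of the resource on that machine


def parse_locator(url: str) -> Locator:
    """Split a locator string into access method, host and path."""
    parts = urlsplit(url)
    if not parts.scheme:
        # Without an access method the string is only a name, not a locator.
        raise ValueError(f"no access method in {url!r}")
    return Locator(parts.scheme, parts.hostname or "", parts.path)


if __name__ == "__main__":
    # Hypothetical resource; any service that understands the format
    # could extract the same three fields.
    print(parse_locator("ftp://archive.example.org/pub/docs/report.ps"))
```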
-
- Uniform Resource Serial Numbers (URSNs), on the other hand, will be
- concerned with identifying the actual data itself. There is
- currently no way to know if two resources with the same name (e.g.,
- documents with the same filename) actually contain the same
- information. Conversely, two documents with different names may
- contain the same information, but in a different format (for example,
- ASCII or PostScript). Ideally we would also like to know how two
- objects may be related. Is one derived from the other? Is it the
- same document with spelling errors corrected? Is there a way of
- uniquely "signing" the object?
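-
- One narrow piece of this problem, telling whether two resources hold
- byte-for-byte identical data whatever they are named, can be sketched
- with a content digest. This is only an assumption about one possible
- ingredient, not the URSN design, which is still open. A digest is not
- a signature, and it cannot recognize the same information in two
- formats (ASCII versus PostScript) or say how two objects are related.
- The resource names and contents below are hypothetical.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return a hex SHA-256 digest that depends only on the bytes."""
    return hashlib.sha256(data).hexdigest()


def same_content(a: bytes, b: bytes) -> bool:
    """True when two resources carry identical bytes, whatever their names."""
    return fingerprint(a) == fingerprint(b)


# Hypothetical resources: two share a filename, two share their contents.
resources = {
    "host-a:/pub/README": b"Internet resource guide, version 1\n",
    "host-b:/pub/README": b"Internet resource guide, version 2\n",
    "host-c:/docs/guide.txt": b"Internet resource guide, version 1\n",
}

if __name__ == "__main__":
    for name, data in resources.items():
        # Short prefix of the digest, enough to compare by eye.
        print(name, fingerprint(data)[:16])
```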
-
- Finally, Uniform Resource Identifiers (URIs) comprise the union of
- URLs and URSNs and would uniquely specify any resource on the
- network. Work is now beginning in the Internet Engineering Task Force
- (IETF) to describe and define the problem and to try to work out
- potential solutions. A draft document describing the architecture of
- the URL system as discussed at the last IETF meeting is currently
- being written by Tim Berners-Lee of CERN (timbl@nxoc01.cern.ch) and is
- due out in the next few months. The URSN is an inherently more
- complex problem and it is expected that there will be much wailing and
- gnashing of teeth before the final form of this is known.
-
- In many respects we are re-inventing and reworking much of Library
- Sciences in the Internet environment, with one significant difference:
- scale. Although "The Library" deals with amounts of information
- several orders of magnitude greater than what is currently available
- on the Internet, it effectively has centralized control over how
- books are classified and numbered (e.g., the ISBN designation). On the
- Internet, such tight centralized control would soon become a
- bottleneck in the system, and in addition, information cannot always
- be easily defined in terms of books, films or tapes (a running process
- for example). Perhaps the decentralized allocation methods of the
- Domain Name System could be adapted to the purpose. Again, however,
- the problem is that while there may be 10 million machines on the
- network 3 years from now, there may be 100 million users and 1 billion
- objects: two orders of magnitude more objects than machines. A tall
- order for any system. This work has the potential to have a tremendous
- impact on how we use the Internet in years to come. Who knows how it
- will turn out? In any case, there'll be a lot of fun in just getting
- there.
-
-
- *VP Research & Development for Bunyip Information Services
-
-
-
-
-