OS/2 Help File | 1993-07-19 | 11KB | 230 lines
═══ 1. Introduction ═══
The WAIS OS/2 Client is a Public Domain software product developed at
the Library of Congress. The Client allows OS/2 users to connect to WAIS
Servers on the Internet and to search for and retrieve documents from those
Servers. Documents returned can be text, pictures, or other types of data,
depending on the type of server being accessed. The Client and WAIS Servers
communicate using the WAIS Protocol. This allows a single user to query many
different data servers without having to learn a new query language or
interface.
The Client can also be used to access local WAIS
Servers across a local area network (LAN).
═══ 2. Quick Start ═══
WAIS is simple to use.
First, choose one or more sources using the 'Sources' pull-down menu. Next, enter your query in the 'Tell me about' window. Then, just click on the
'Search' pushbutton.
═══ 3. Network Requirements ═══
The OS/2 Client runs on top of IBM's TCP/IP for OS/2 network software.
The user must be able to open a socket connection to a remote WAIS Server
machine on the network. The WAIS Client will work with either the 16-bit or
32-bit flavor of IBM's TCP/IP for OS/2 product. The Client will not work with
non-IBM TCP/IP products, but conversion should not be difficult. Since the
Client is Public Domain software, source code is
available for porting, modification, or improvement.
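As a rough illustration of the socket requirement above, the following Python sketch checks whether a TCP connection to a server machine can be opened at all. It is not part of the Client. The function name is hypothetical, and the default port 210 (the port registered for Z39.50 and conventionally used by WAIS servers) is an assumption that the text above does not state.

```python
import socket


def can_reach_server(host, port=210, timeout=5.0):
    """Return True if a TCP socket connection to host:port can be opened.

    Port 210 is an assumed default; use the port given in the
    source description of the server you want to contact.
    """
    try:
        # create_connection resolves the host name and connects in one step.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name lookup failures.
        return False
```

If a check like this fails for a server that is known to be running, the problem is more likely in the local TCP/IP setup or the network path than in the Client itself.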
═══ 4. Sources ═══
The first step in beginning a search is to select a source to contact.
The user lists the currently known sources by clicking on the "Sources" button.
A window listing current known sources will appear. You can select one or more
of these sources with the mouse and then hit the "Use Selected Sources" button
or double click on a source. The selected sources will then appear in the "Look
in these Sources:" window, ready to be searched. Most searches are
single-source, but there are times when it is desirable to search multiple
sources simultaneously.
If you want to stop searching a source, select the source in the "Look
in these Sources:" window and execute the "Stop Using Source" command in
the "Sources" pull-down menu.
Known sources are described in files with ".src" extensions. The first
time the user lists sources, the Client loads in all the .src files in the
local directory. To see what these files contain, select a source in the "Known
Sources" window (just one) and then click on the "Edit Source" button.
A window will appear, showing all the information associated with that source.
Typically, the source description provides information on how to search that
source, how to obtain more information on that source, whether or not the server
service costs money, and the e-mail address of the source administrator. Be
careful: you can click in any of these windows and edit the contents. If you
change the network information, you may not be able to contact that source in
the future.
You can also select sources and hit the "Delete Selected Sources"
button. This erases all the information related to that source and erases the
*.src file in the local directory.
═══ 5. Queries ═══
Once you have selected a source to use,
the Client should put you back into the Query window. This is the window which
is labeled, "Tell me about:". You can now enter a natural language
question in this window, or just type a set of words and phrases that are
relevant to the type of information you are seeking from the selected source.
The general algorithm for weighting words and phrases is as follows: if
a word is rarely used in the database, it gets more weight; if a phrase matches
exactly, it gets more weight; and if a word appears in the document title, it
gets more weight.
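The three weighting rules can be sketched in a few lines of Python. This is only a toy illustration of the idea, not the scoring formula used by any WAIS server: the function name, the logarithmic rarity weight, and the two bonus constants are all assumptions chosen for the example.

```python
import math

# Hypothetical bonus values; real servers use their own weights.
PHRASE_BONUS = 5.0
TITLE_BONUS = 2.0


def score_document(query_terms, query_phrases, title, body, doc_freq, total_docs):
    """Toy relevance score following the three rules in the text.

    doc_freq maps a word to the number of documents in the database
    that contain it; total_docs is the size of the database.
    """
    title_words = set(title.lower().split())
    body_text = body.lower()
    body_words = set(body_text.split())
    score = 0.0
    for term in query_terms:
        term = term.lower()
        if term not in body_words and term not in title_words:
            continue
        # Rule 1: a word that is rare in the database gets more weight.
        score += math.log((1 + total_docs) / (1 + doc_freq.get(term, 0))) + 1.0
        # Rule 3: a word that appears in the title gets more weight.
        if term in title_words:
            score += TITLE_BONUS
    for phrase in query_phrases:
        # Rule 2: a phrase that matches exactly gets more weight.
        if phrase.lower() in body_text:
            score += PHRASE_BONUS
    return score
```

With these rules, a query word found in only a handful of documents contributes more than a word found in nearly all of them, which is why common words add little to a query.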
Once you have entered your query, hit return or click on the
"Search" button to begin a search.
You can also enter more complex queries, depending on the type of server
you are contacting. For example, WAIS Inc. commercial servers allow you to
enter boolean queries by using logical words in capital letters, like AND, OR,
and NOT. The source description should tell you what kind of server it is and
what kinds of queries it supports. Also, the server description often contains
a method for getting a help document about that server.
═══ 6. Results ═══
Search results are displayed in the results window, the largest window
in the display with the column headings "Score Size HEADLINES". The server
should return a number of document titles or headlines, along with their score
and size. The score runs from 0 to 1000. The highest scoring documents are
listed first at the top of the display. By default, the size is the number of
bytes (characters) in the document. Larger sizes are expressed in multiples of
1024 and marked with a suffix: "k" stands for kilobytes, or units of 1024; "M"
stands for megabytes, or units of 1024 squared (slightly more than a million);
and "G" stands for gigabytes, or units of 1024 cubed (slightly more than a
billion).
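The size notation can be illustrated with a short Python sketch. The function name and the choice of one decimal place are assumptions made for the example; the Client's exact rounding is not documented here.

```python
def format_size(num_bytes):
    """Return a size as plain bytes or with a k, M, or G suffix.

    Each suffix is a power of 1024, matching the description above.
    """
    for suffix, unit in (("G", 1024 ** 3), ("M", 1024 ** 2), ("k", 1024)):
        if num_bytes >= unit:
            return f"{num_bytes / unit:.1f}{suffix}"
    # Small documents are shown as a plain byte count.
    return str(num_bytes)
```

For example, a document of 153600 bytes would be shown as 150.0k under these assumptions, since 153600 divided by 1024 is 150.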
═══ 7. Retrieving Documents ═══
You can double click on any displayed headline in the results window to retrieve and display the document.
Before retrieving a document, it is wise to look at how large it is to get an
idea of how long it will take to retrieve the document. A 150k file will take
anywhere from 10 seconds to a minute to download, depending on network traffic,
network bandwidth, and server workload.
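The estimate above corresponds to an effective transfer rate of roughly 2.5 to 15 kilobytes per second. A minimal Python sketch of the arithmetic follows; the function name is hypothetical, and the rates are simply those implied by the figures in the text.

```python
def transfer_seconds(size_bytes, bytes_per_second):
    """Estimate download time from document size and effective throughput."""
    if bytes_per_second <= 0:
        raise ValueError("throughput must be positive")
    return size_bytes / bytes_per_second


size = 150 * 1024                        # a 150k document, in bytes
fast = transfer_seconds(size, 15360)     # about 15 kilobytes per second
slow = transfer_seconds(size, 2560)      # about 2.5 kilobytes per second
```

Here `fast` works out to 10 seconds and `slow` to 60 seconds, matching the range quoted above.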
The Client retrieves the document and puts it into a file called
"new_doc.tmp" and launches a viewer to display the document. The type of viewer
depends on the type of document retrieved.
The user can select which type of viewer to launch with each type of
document by selecting the "Document Viewers" menu item from the "Options" menu
list. Typically, editors are used for text documents, while an image viewer
is used to display GIF, JPEG, or TIFF documents.
The Client comes with default viewer settings. The OS/2 epm editor is
called on text documents. Also included on the Client distribution disk is a
Public Domain image viewer (pmviewjr.exe) which is called by the Client for GIF
and JPEG images. The user can substitute his preferred editors and viewers for
these default values.
The document viewer runs as a separate program. When you are done
viewing a document, simply quit or close out the editor or viewer. The WAIS
Client will still be running.
═══ 8. Saving Documents ═══
Each document retrieval erases the previous contents of "new_doc.tmp".
If the user wishes to permanently store a document, she should copy the file
"new_doc.tmp" to another file before retrieving another document. In the case
of text documents, simply use the "Save As" command in the editor to save
the file under another name. With images, the user may have to go to another
OS/2 command window to copy the file, unless the viewer has a "Save As" command.
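The copy step can also be scripted. The following Python sketch is one way to do it and is not part of the Client. The function name and its parameters are hypothetical; only the file name "new_doc.tmp" comes from the text above.

```python
import shutil
from pathlib import Path


def keep_document(new_name, temp_name="new_doc.tmp", directory="."):
    """Copy the most recently retrieved document to a permanent file."""
    src = Path(directory) / temp_name
    dst = Path(directory) / new_name
    if not src.exists():
        raise FileNotFoundError(f"no retrieved document at {src}")
    if dst.exists():
        # Refuse to overwrite a document that was saved earlier.
        raise FileExistsError(f"{dst} already exists")
    shutil.copyfile(src, dst)
    return dst
```

Run this before the next retrieval, because the next retrieval replaces the contents of "new_doc.tmp".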
═══ 9. Finding New Sources ═══
The Client disk comes with a few .src files, but these are only for
demonstration purposes. The one source which is essential to have is the
Directory of Servers. This is a WAIS Server which is a database of databases.
Begin your search with this source in order to locate sources which are relevant
to your query.
The Directory of Servers functions like a normal WAIS Server, except
that the documents it returns are source descriptions, not documents. To
examine a source description, simply double click on the headline in the Results
window. The "document" will be retrieved and displayed. At this point you have
the option to discard the source description ("Cancel") or to save it for
future use ("Save").
If you wish to save the source, be sure to edit the
"Filename" field to indicate the filename to use. The default name is
"new-src" which will be overwritten the next time you save a source description
without changing the file name. The Client will append a ".src" extension to
the source filename. The new source should now appear in the known sources
window, listed under the filename you chose, ready to be used.
If you are running WAIS on a FAT formatted disk, you will get an error
if you specify a filename longer than eight characters.
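The naming rules for saved sources can be summarized in a short Python sketch. This is an illustration only: the function name and the `fat_disk` parameter are hypothetical, and the check covers only the eight-character limit described above.

```python
def source_filename(name, fat_disk=True):
    """Return the file name a saved source would get, with ".src" appended.

    On a FAT formatted disk the part before the extension is limited
    to eight characters, so longer names are rejected here.
    """
    base = name.strip()
    if not base:
        raise ValueError("a filename is required")
    if fat_disk and len(base) > 8:
        raise ValueError(f"{base!r} is longer than eight characters")
    return base + ".src"
```

The default name "new-src" passes this check and becomes "new-src.src", while a longer name such as "directory-of-servers" would be rejected on a FAT disk.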
═══ 10. Creating Source Pointers ═══
You can also create source descriptions if you know the database name,
the internet address, and the port number of the Server you are trying to
contact. Call the "Create a New Source" command under the "Sources" pull-down
menu. Then fill out the necessary information by clicking in each field. The
IP Number is not required, but if you know it, put it in as it will save lookup
time. The rest of the information is optional.
You must enter the exact Database Name; the machine name and port
number are not sufficient. Servers run under the UNIX Operating System. The
Database Name is actually a UNIX path name which the Server uses to access the
database. UNIX is case sensitive. This means that the database name must have
the correct capitalization.
CAUTION: Also note that UNIX pathnames use "/", not "\" as in DOS or OS/2.
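The requirements for a new source description can be expressed as a small check, sketched here in Python. The class and field names are hypothetical and do not reflect the layout of a .src file; the host name in the usage note is a placeholder.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SourceDescription:
    database_name: str                 # UNIX path name, case sensitive
    host_name: str                     # internet address of the Server
    port: int                          # port number of the Server
    ip_number: Optional[str] = None    # optional, saves lookup time

    def problems(self) -> List[str]:
        """List the reasons this description is unlikely to work."""
        issues = []
        if not self.database_name:
            issues.append("the exact Database Name is required")
        elif "\\" in self.database_name:
            issues.append('UNIX path names use "/", not "\\"')
        if not self.host_name:
            issues.append("the internet address is required")
        if not 1 <= self.port <= 65535:
            issues.append("the port number must be between 1 and 65535")
        return issues
```

For instance, `SourceDescription("/data/Index", "wais.example.org", 210).problems()` returns an empty list, while a database name typed as `\data\Index` is flagged. A check like this cannot detect wrong capitalization, since only the Server knows the real path name.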
═══ 11. Relevance Feedback ═══
One of the most powerful aspects of WAIS is the ability to say to a
server, "find me more documents like this one." This is called relevance
feedback. This is a quick, intuitive way of searching large databases to obtain
the documents you are looking for. If you find a document that you want to use
for relevance feedback, select the document headline and execute the "Use
Document for Relevance Feedback" command under the "Documents" menu list. The
document headline, along with the source it comes from, will appear in the
relevance feedback window which is titled "Similar to:".
You can now run the search again (by clicking on the "Search" button), but
this time, in addition to your query, the document pointers in the relevance
feedback window will be passed to the server to refine your search. Relevance
feedback can be used iteratively, adding and deleting documents until you find
what you are looking for.
═══ 12. Relevance Feedback and Multiple Source Searches ═══
Relevance feedback works best with single-source searches with documents
which come from that source. If you are doing a multiple-source query,
relevance feedback becomes more complicated. For those of you who want to
know how it really works, read on.
Although all relevance feedback document IDs are sent to all the
servers being searched, only those servers that can access the relevance feedback
documents on their own file systems will use them; the other servers will ignore
them. That is, relevance feedback documents from Server X cannot be used by
Server Y, unless Server X and Y are on the same file system.
Thus, if you are simultaneously searching on two servers (X and Y) with
relevance feedback documents from both servers, and if they are not on the
same file system, then each server will perform its search using only the
relevance feedback documents from its own database.
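The behavior described above can be modeled with a short Python sketch. This is a simplified illustration, not the protocol itself: the function names are hypothetical, and each server and document is tagged with an invented file system label to stand in for "can this server access that document".

```python
def feedback_used_by(server_fs, feedback_docs):
    """Return the IDs of the feedback documents a server can access.

    feedback_docs is a list of (document ID, file system label) pairs.
    """
    return [doc_id for doc_id, doc_fs in feedback_docs if doc_fs == server_fs]


def plan_search(servers, feedback_docs):
    """Map each server name to the feedback documents it will use.

    servers maps a server name to its file system label. Every server
    receives the full list, but uses only the documents it can access.
    """
    return {name: feedback_used_by(fs, feedback_docs)
            for name, fs in servers.items()}
```

With servers X and Y on separate file systems, each ends up using only its own documents. If both carried the same label, both would use the full list.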
Also, when a user removes a source from the "Look in these
Sources:" window (via the "Stop Using Source" command), all the relevance
feedback documents from that source are placed at the bottom of the list, with
the label "These documents may be ignored:" to indicate that their source
is no longer being used. If they exist on a file system that is still in use,
they may still be used, but otherwise they will be ignored.