OS/2 Professional

OS/2 Help File | 1993-07-19 | 11KB | 230 lines

═══ 1. Introduction ═══

The WAIS OS/2 Client is a Public Domain software product developed at the Library of Congress. The Client allows OS/2 users to connect to WAIS Servers on the Internet and to search for and retrieve documents from those Servers. Documents returned can be text, pictures, or other types of data, depending on the type of server being accessed.

The Client and WAIS Servers communicate using the WAIS Protocol. This allows a single user to query many different data servers without having to learn a new query language or interface. The Client can also be used to access local WAIS Servers across a local area network (LAN).

═══ 2. Quick Start ═══

WAIS is simple to use. First, choose one or more sources using the 'Sources' pull-down menu. Next, enter your query in the 'Tell me about' window. Then, just click on the 'Search' pushbutton.

═══ 3. Network Requirements ═══

The OS/2 Client runs on top of IBM's TCP/IP for OS/2 network software. The user must be able to open a socket connection to a remote WAIS Server machine on the network. The WAIS Client will work with either the 16-bit or 32-bit flavor of IBM's TCP/IP for OS/2 product.

The Client will not work with non-IBM TCP/IP products, but conversion should not be difficult. Since the Client is Public Domain software, source code is available for porting, modification, or improvement.

═══ 4. Sources ═══

The first step in beginning a search is to select a source to contact. To list the currently known sources, click on the "Sources" button. A window listing the currently known sources will appear. You can select one or more of these sources with the mouse and then hit the "Use Selected Sources" button, or double click on a source. The selected sources will then appear in the "Look in these Sources:" window, ready to be searched. Most searches are single-source, but there are times when it is desirable to search multiple sources simultaneously.
If you want to stop searching a source, select the source in the "Look in these Sources:" window and execute the "Stop Using Source" command in the "Sources" pull-down menu.

Known sources are described in files with ".src" extensions. The first time the user lists sources, the Client loads all the .src files in the local directory. To see what these files contain, select a source in the "Known Sources" window (just one) and then click on the "Edit Source" button. A window will appear, showing all the information associated with that source. Typically, the source description provides information on how to search that source, how to obtain more information on that source, whether or not the server service costs money, and the e-mail address of the source administrator.

Be careful: you can click in any of these windows and edit the contents. If you change the network information, you may not be able to contact that source in the future.

You can also select sources and hit the "Delete Selected Sources" button. This erases all the information related to that source and erases the *.src file in the local directory.

═══ 5. Queries ═══

Once you have selected a source to use, the Client should put you back into the Query window. This is the window labeled "Tell me about:". You can now enter a natural language question in this window, or just type a set of words and phrases that are relevant to the type of information you are seeking from the selected source.

The general algorithm for weighting words and phrases is as follows: if a word is rarely used in the database, it gets more weight; if a phrase matches exactly, it gets more weight; and if a word appears in the document title, it gets more weight.

Once you have entered your query, hit return or click on the "Search" button to begin a search.

You can also enter more complex queries, depending on the type of server you are contacting. For example, WAIS Inc.
commercial servers allow you to enter Boolean queries by using logical words in capital letters, like AND, OR, and NOT. The source description should tell you what kind of server it is and what kinds of queries it supports. Also, the server description often contains a method for getting a help document about that server.

═══ 6. Results ═══

Search results are displayed in the results window, the largest window in the display, with the column headings "Score Size HEADLINES". The server should return a number of document titles or headlines, along with their score and size. The score runs from 0 to 1000. The highest scoring documents are listed first, at the top of the display.

By default, the size is the number of bytes (characters) the file contains. If the file is large, the size is expressed in multiples of 1024. A size followed by "k" is in units of 1024 bytes. "M" stands for megabytes, or units of 1024 squared (slightly more than a million). "G" stands for gigabytes, or units of 1024 cubed (slightly more than a billion).

═══ 7. Retrieving Documents ═══

You can double click on any displayed headline in the results window to retrieve and display the document. Before retrieving a document, it is wise to check its size to get an idea of how long the retrieval will take. A 150k file will take anywhere from 10 seconds to a minute to download, depending on network traffic, network bandwidth, and server workload.

The Client retrieves the document, puts it into a file called "new_doc.tmp", and launches a viewer to display the document. The type of viewer depends on the type of document retrieved. The user can select which viewer to launch for each type of document by selecting the "Document Viewers" menu item from the "Options" menu list. Typically, editors are used for text documents, while an image viewer is used to display GIF, JPEG, or TIFF documents.
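The mapping from document type to viewer can be pictured as a small lookup table. The following is only a sketch, not the Client's actual code: the function name `viewer_command`, the type labels, and the table layout are assumptions made for illustration. The viewer names "epm" and "pmviewjr.exe" and the file name "new_doc.tmp" are the ones this help file mentions.

```python
# Hypothetical table: document type -> viewer program.
# "epm" and "pmviewjr.exe" are the defaults named in this help file;
# the type labels ("TEXT", "GIF", "JPEG") are illustrative assumptions.
DEFAULT_VIEWERS = {
    "TEXT": "epm",
    "GIF": "pmviewjr.exe",
    "JPEG": "pmviewjr.exe",
}

# The Client always writes the retrieved document to this file.
TEMP_FILE = "new_doc.tmp"


def viewer_command(doc_type, viewers=DEFAULT_VIEWERS, temp_file=TEMP_FILE):
    """Return the command (program plus argument) that would display
    the most recently retrieved document of the given type."""
    key = doc_type.upper()
    if key not in viewers:
        raise KeyError(f"no viewer configured for document type {doc_type!r}")
    return [viewers[key], temp_file]


# A user who prefers another editor overrides only the text entry.
# "myedit.exe" is a made-up program name.
custom = dict(DEFAULT_VIEWERS, TEXT="myedit.exe")
print(viewer_command("gif"))
print(viewer_command("text", custom))
```

Because the viewer is handed only the temporary file name, any program that accepts a file name as its argument could be substituted in the table.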
The Client comes with default viewer settings. The OS/2 epm editor is launched for text documents. Also included on the Client distribution disk is a Public Domain image viewer (pmviewjr.exe), which the Client launches for GIF and JPEG images. The user can substitute his preferred editors and viewers for these default values.

The document viewer runs as a separate program. When you are done viewing a document, simply quit or close the editor or viewer. The WAIS Client will still be running.

═══ 8. Saving Documents ═══

Each document retrieval erases the previous contents of "new_doc.tmp". If the user wishes to permanently store a document, she should copy the file "new_doc.tmp" to another file before retrieving another document. In the case of text documents, simply use the "Save As" command in the editor to save the file under another name. With images, the user may have to go to another OS/2 command window to copy the file, unless the viewer has a "Save As" command.

═══ 9. Finding New Sources ═══

The Client disk comes with a few .src files, but these are only for demonstration purposes. The one source which is essential to have is the Directory of Servers. This is a WAIS Server which is a database of databases. Begin your search with this source in order to locate sources which are relevant to your query.

The Directory of Servers functions like a normal WAIS Server, except that the documents it returns are source descriptions, not ordinary documents. To examine a source description, simply double click on the headline in the Results window. The "document" will be retrieved and displayed. At this point you have the option to discard the source description ("Cancel") or to save it for future use ("Save").

If you wish to save the source, be sure to edit the "Filename" field to indicate the filename to use. The default name is "new-src", which will be overwritten the next time you save a source description without changing the file name.
The Client will append a ".src" extension to the source filename. The new source should now appear in the known sources window, listed under the filename you chose, ready to be used. If you are running WAIS on a FAT formatted disk, you will get an error if you specify a filename longer than eight characters.

═══ 10. Creating Source Pointers ═══

You can also create source descriptions if you know the database name, the Internet address, and the port number of the Server you are trying to contact. Call the "Create a New Source" command under the "Sources" pull-down menu. Then fill out the necessary information by clicking in each field. The IP Number is not required, but if you know it, put it in, as it will save lookup time. The rest of the information is optional.

You must enter the exact Database Name; the machine name and port number are not sufficient. Servers run under the UNIX operating system. The Database Name is actually a UNIX path name which the Server uses to access the database. UNIX is case sensitive, so the database name must have the correct capitalization.

CAUTION: UNIX pathnames use "/", not "\" as in DOS or OS/2.

═══ 11. Relevance Feedback ═══

One of the most powerful aspects of WAIS is the ability to say to a server, "find me more documents like this one." This is called relevance feedback. It is a quick, intuitive way of searching large databases to obtain the documents you are looking for.

If you find a document that you want to use for relevance feedback, select the document headline and execute the "Use Document for Relevance Feedback" command under the "Documents" menu list. The document headline, along with the source it comes from, will appear in the relevance feedback window, which is titled "Similar to:".
You can now run the search again (by clicking on the "Search" button), but this time, in addition to your query, the document pointers in the relevance feedback window will be passed to the server to refine your search. Relevance feedback can be used iteratively, adding and deleting documents until you find what you are looking for.

═══ 12. Relevance Feedback and Multiple Source Searches ═══

Relevance feedback works best with single-source searches, using documents which come from that source. If you are doing a multiple-source query, relevance feedback becomes more complicated. For those of you who want to know how it really works, read on.

Although all relevance feedback document IDs are sent to all the servers being searched, only those servers that can access the relevance feedback documents on their own file systems will use them; the others will ignore them. That is, relevance feedback documents from Server X cannot be used by Server Y unless Servers X and Y are on the same file system. Thus, if you are simultaneously searching on two servers (X and Y) with relevance feedback documents from both servers, and the servers are not on the same file system, then each server will perform its search using only the relevance feedback documents from its own database.

Also, when a user removes a source from the "Look in these Sources:" window (via the "Stop Using Source" command), all the relevance feedback documents from that source are placed at the bottom of the list, under the label "These documents may be ignored:", to indicate that their source is no longer being used. If they exist on a file system that is still in use, they may still be used; otherwise they will be ignored.
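
The rules above can be modeled in a few lines. This is a minimal sketch of the outcome, not of the Client or Server code: the names `FeedbackDoc`, `usable_feedback`, `split_feedback`, and the `filesystem_of` table are hypothetical, and the real filtering is done by each server when it receives the document IDs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeedbackDoc:
    headline: str
    source: str  # name of the source the document came from


def usable_feedback(docs, server, filesystem_of):
    """Return the feedback documents that `server` can actually use.

    `filesystem_of` is a hypothetical table mapping each source name to a
    label for the file system it lives on. Every server receives all the
    documents, but uses only those on its own file system.
    """
    own_fs = filesystem_of[server]
    return [d for d in docs if filesystem_of.get(d.source) == own_fs]


def split_feedback(docs, active_sources):
    """Split the list the way the relevance feedback window does after
    "Stop Using Source": documents from sources still in use first, then
    those listed under "These documents may be ignored:"."""
    in_use = [d for d in docs if d.source in active_sources]
    may_be_ignored = [d for d in docs if d.source not in active_sources]
    return in_use, may_be_ignored


# Servers X and Z share a file system; Y is on a different one.
filesystems = {"X": "fs1", "Y": "fs2", "Z": "fs1"}
docs = [FeedbackDoc("a", "X"), FeedbackDoc("b", "Y"), FeedbackDoc("c", "Z")]

print(usable_feedback(docs, "X", filesystems))  # documents a and c
print(split_feedback(docs, {"X", "Y"}))         # c moves to the second list
```

In the example, document "c" comes from source Z, which is no longer being searched, so it is listed as one that may be ignored. Server X can still use it because X and Z share a file system, which is the case the last sentence above describes.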