WAIS index support is provided by means of an auxiliary program provided with the gn distribution, called waisgn. In order to use this program the server maintainer must first obtain and compile the WAIS software distribution (directions are given below). This provides the program waisindex which creates the indices and the libraries which must be linked with the waisgn program. When the gn server receives a WAIS index query it execs (in the UNIX sense) the waisgn program passing the search term to it. That is, it turns itself into the waisgn program by replacing the server in memory with the waisgn binary. There is no need to run a WAIS server.
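The "exec" behaviour described above can be illustrated from the shell. This is only a sketch of the UNIX mechanism, not code from gn: the function name exec_demo is made up for the illustration. A child shell prints its process id, then execs a second program, which reports the same process id because it has replaced the first in memory.

```shell
#!/bin/sh
# Illustration only: exec_demo is a hypothetical name, not part of gn.
exec_demo() {
    # Start a child shell that reads the script below from standard input.
    sh <<'EOF'
echo "before exec: pid $$"
exec sh -c 'echo "after exec:  pid $$"'
echo "this line is never reached"
EOF
}

exec_demo
```

Both lines of output show the same pid, and the third echo never runs, since the shell that would have run it no longer exists after the exec.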
One reason for this design is size. The gn server is relatively small and hence fairly efficient. The server I run is about 64K in size. The waisgn program source is small but the libraries with which it must be linked are not. The final binary for waisgn is 400K to 500K in size. The design with two separate programs has several advantages. First the efficiency of servers which are not using WAIS is not degraded by its presence. Also WAIS is a complicated system to set up and run. Having it done with a separate program makes it much easier to check that things are functioning correctly and fix them if they are not.
1. Edit the file "config.h" in the main gn source directory and change the entry

    #define WAISGN "/usr/local/etc/waisgn"

to reflect the location where you want to keep the waisgn binary. Now run make in the gn source directory to make a new version of gn which is aware of this value.
2. Get the WAIS software. You can use either freeWAIS from

    ftp://ftp.cnidr.org/pub/NIDR.tools/freeWAIS-0.202.tar.Z

or

    ftp://ftp.bio.indiana.edu/util/wais/iubio-wais-8b5-d.tar.Z

Then build WAIS per the instructions with that distribution. I use the IUbio version because it results in a smaller binary for waisgn. I have tested waisgn with both. There is a noticeable difference in the "best match" ranks they produce, but I have no information on how the scoring algorithms differ.
3. In the waisgn src directory make symbolic links to the directories "bin" and "ir" in the main WAIS source directory. The commands to do this are, for example

    ln -s /path/to/freeWAIS-0.202/bin
    ln -s /path/to/freeWAIS-0.202/ir

Then examine the contents of these directories to make sure the links are working.
4. In the waisgn source directory run make, producing the waisgn binary. Copy the waisgn binary to the location you designated as WAISGN in step 1.
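Steps 3 and 4 above can be collected into a small script, to be run from the waisgn source directory. This is a sketch under stated assumptions: the variable names, the run helper and the install_waisgn function are made up for the illustration, and both paths are placeholders to be replaced with your own. By default it only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Hypothetical helper script; the paths below are placeholders, not real values.
WAIS_SRC=${WAIS_SRC:-/path/to/freeWAIS-0.202}      # main WAIS source directory
WAISGN_DEST=${WAISGN_DEST:-/usr/local/etc/waisgn}  # must match WAISGN in config.h
DRY_RUN=${DRY_RUN:-1}                              # 1 = print commands only

run() {
    # Show the command, and execute it unless this is a dry run.
    echo "+ $*"
    if [ "$DRY_RUN" != 1 ]; then "$@" || exit 1; fi
}

install_waisgn() {
    # Step 3: link the WAIS "bin" and "ir" directories, then look inside them.
    run ln -s "$WAIS_SRC/bin"
    run ln -s "$WAIS_SRC/ir"
    run ls bin/ ir/
    # Step 4: build waisgn and copy it to the location named by WAISGN.
    run make
    run cp waisgn "$WAISGN_DEST"
}

install_waisgn
```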
    waisindex -t filename /complete/path/to/files...

or

    waisindex -t first_line /complete/path/to/files...

where "files" is typically replaced by a wildcard expression matching all the files you want to index. You can also give multiple wildcard expressions. The difference between these two commands is that the menu which waisgn produces will title the matching documents either by the name of the file containing a match or by the first line of that file's contents. Note that in the first form the argument is literally the string "filename"; that string is not replaced with the name of a file. It is also possible to obtain the titles of the matching objects from the Name= field of a menu file (see below).
    wgtest words...

where words... is a list of search terms, perhaps just one. The output of this script includes a fair amount of diagnostic verbiage, but should include gopher protocol lines for the files which contain matches for your search term.
If this is successful you are ready to create the menu entries for your files and search item.
mkmenu /path/to/data/ *.txt
The script will create a menu file in the data directory which must be processed by mkcache to produce a .cache file. By default mkmenu assumes the files are text files (type 0). This can be changed by editing the TYPE variable in the script.
    Name=Index
    Path=7w/relative/path/to/myindex/index.inv

and run mkcache on it. Note: You can, if you wish, put more than one set of index files in the same directory by using the "-d" option with waisindex. E.g. waisindex -d index1 ... will produce index1.inv, index1.dct etc., and waisindex -d index2 ... will produce index2.inv, etc. If you have done this then you can put multiple entries in the file .../myindex/menu.
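When several indices share one directory, the menu entries can be generated rather than typed. The following is a sketch, not a tool from the gn distribution: the function name index_menu_entries and the "Search ..." titles are made up, and the blank line written after each entry is an assumption about how entries are separated in a menu file. Only the Name= and Path=7w/... fields come from the description above.

```shell
#!/bin/sh
# Hypothetical helper: print one search entry per WAIS index name.
#   $1       index directory, relative to the gn data root (placeholder)
#   $2 ...   index names as given to "waisindex -d"
index_menu_entries() {
    dir=$1
    shift
    for name in "$@"; do
        # Assumption: a blank line separates entries in the menu file.
        printf 'Name=Search %s\nPath=7w/%s/%s.inv\n\n' "$name" "$dir" "$name"
    done
}

index_menu_entries relative/path/to/myindex index1 index2
```

The output could be appended to .../myindex/menu, after which mkcache must be run on it as usual.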
    Name=WAIS search of all my documents
    Path=7w/relative/path/to/myindex/index.inv
This default behavior can be overridden by using the type "7wc" instead of "7w" in the menu files mentioned above. When this is done the menu items will be the contents of the Name= field of the menu file in the data directory just as it is for non-search gn items.
If the command

    waisindex -t para files...

is used, each separate paragraph (separated by blank lines) will be considered a document and its first line will be used as the title. Likewise, if the "-t mail_or_rmail" option is used, the files will be assumed to be standard UNIX mail files and each message will be considered a separate document. To see a complete list of "-t" options, run waisindex with no arguments.
In order for gn and waisgn to know to use a byte range for the matching documents, it is necessary to use "7wr" in place of "7w" in the menu files mentioned above (i.e. the menu file in the "myindex" directory and the one listing the search item). It is also necessary to use a different type for the data files. Instead of their gn type being '0' it should be "range". This means that the menu item for data file "file1" should be
    Name=Whatever
    Path=range/relative/path/to/file1
    Type=0
1. Run gn from the command line either with logging enabled or with the command "gn -L /dev/tty" so logging output will go to your terminal. When gn pauses for input, enter the Path field of your index followed by a tab and a search term. I.e. you should enter something like

    7w/relative/path/to/myindex/index.inv<TAB>search term

where <TAB> stands for a single tab character. The error messages may give you some idea of what is going wrong.
2. Run the wgtest program as described above in the section on testing. Make sure that waisgn is functioning properly as a standalone program.
3. If you still can't isolate your problem, edit the file waisgn.h and change the #define LOG_DEBUG at the end from FALSE to TRUE. Then recompile waisgn. It will now write the diagnostic error messages that you get with wgtest into the file /tmp/waisgn.debug where you can examine them. The advantage over wgtest is that you can check the communication between gn and waisgn.
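For step 1 above, typing a literal tab at a terminal can be awkward, so the request line can be built with printf instead. This is a sketch: the function name query_line is made up, and the suggestion in the final comment that the line can be piped into gn is an assumption based on gn pausing for input, not something confirmed here.

```shell
#!/bin/sh
# Hypothetical helper: print "<Path field><TAB><search term>" on one line.
query_line() {
    printf '%s\t%s\n' "$1" "$2"
}

query_line '7w/relative/path/to/myindex/index.inv' 'search term'

# Assumption: if gn reads the request from standard input, the line could be
# supplied with a pipe, e.g.
#   query_line '7w/relative/path/to/myindex/index.inv' 'search term' | gn -L /dev/tty
```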
John Franks -- Dept of Math. Northwestern University <john@math.nwu.edu>