Installation and Maintenance of GN

Version 1.1

Installing the Software

1. Get the file gn-1.1.tar.Z and uncompress it and untar it to make the GN source directory hierarchy. This file is available via anonymous ftp to ftp.acns.nwu.edu in the directory /pub/gn. The top level of the directory created by untarring this file contains 5 directories: gn, mkcache, uncache, waisgn and docs.

2. Edit the file "config.h" in the top level directory. You should enter the host name of the computer on which you plan to run GN and the complete path name of your gopher data directory. If you want to run at a port other than 70 also edit the DEFAULTPORT entry. You should also specify the complete path of the file mkcache/gn_mime.types on your system. You can put this file anywhere convenient (and give it any name). This file is used by the mkcache program; see the section below on Content-type for an explanation of the function of this file. Other customizations are possible but should not be needed.

3. Edit the file "Makefile" in the top level directory. This allows you to specify the C compiler used if you wish to use something other than cc, e.g. gcc. You can also specify two directories in which things are placed when you do a "make install". The first of these SERVBINDIR is the path of the directory in which you want the executable file for the gn server installed. The second BINDIR is the location for the mkcache and uncache programs. If you are using a SysV version of UNIX you will need to uncomment the "-DSYSV" compiler directive also.

4. In the directory gn-1.1X do a "make" to produce the server "GN" and the two utilities "mkcache" and "uncache". The utility mkcache produces "cache files" for use by the server (it is described below) and uncache is used to convert from the Minnesota server gopherd to GN. If you want a C compiler other than cc you will need to edit the Makefile in each directory. The binary GN is the server and can be installed anywhere you choose. The binaries mkcache and uncache are utility programs for maintainers and should be installed somewhere in your path, e.g. /usr/local/bin.

5. You must be set up to run GN under inetd. There are surely variations on how this works from system to system, so you may need to look at the man page for inetd.conf(5). Here's how it works under many systems, e.g. Suns: Edit the file /etc/services and create the line

     gn  70/tcp
(or replace 70 by the port you wish to use). Then edit the file /etc/inetd.conf and insert the line
     gn    stream    tcp nowait    nobody    /full/path/for/gn    gn

After the last GN you can have optional arguments to turn on logging or use a different data directory (see the man page gn.8).

It is important to run GN as "nobody" (the fifth field in the inetd.conf line above) or some other user with no access privileges. If you are using an inetd without the capability to set UID on startup (e.g., Ultrix), you should define the group ID and user ID in config.h so that the program is not running as root (look for the #defines GID_SET and UID_SET and uncomment them). It should never be necessary to run GN as root, and to do so would be a serious mistake for maintaining security. Every attempt has been made to make GN as secure as possible, even if it is run as root. However, no program accessible to remote users on the internet can be assumed perfectly secure.

After editing the inetd.conf and services files you should find the process id number of the inetd process and do the command "kill -HUP process_id#". This must be done as root. If you have never done this before, get someone who has to help you.

Setting up the Data Directory

1. In each directory of your data hierarchy create a file called menu with one item for each file or directory you want GN to publish. Items in this file have a format like the following:

     Name=This description of the file will display on the client
     Path=0/path/to/dir/file
     Type=0
     Host=YourHost.YourU.edu
     Port=70

     Name=This is a subdirectory
     Path=1/path/to/dir/subdir
     Type=1
     Host=YourHost.YourU.edu
     Port=70

     Name=This is a remote link
     Path=0/myfile/path
     Type=0
     Host=MyHost.MyUniv.edu
     Port=70
There are several things to note about these examples. The Name field must be first. The Path field starts with a GN type (e.g. '0' for a file or '1' for a directory; see the Appendix to this document for more details), followed by the path name of the file or directory relative to the top level data directory.

The Type, Host, and Port fields are optional for local items, but required for remote links. If they are not present, the Type will be taken from the first character of the Path field and the Host and Port fields will be those specified in config.h (or on the command line of mkcache). In general it is a good idea not to include the Host and Port fields for local items. This makes it much easier if at some future time you should wish to move your server to a new host or new port. For more details on the format of menu files see the man page mkcache.1 and the sample menu files in each directory of the source hierarchy.
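
The defaulting rules just described can be illustrated with a short Python sketch. This is not the parser used by mkcache; the function name parse_menu and the two default values are hypothetical stand-ins for what config.h would supply, and only the five fields discussed above are handled.

```python
DEFAULT_HOST = "YourHost.YourU.edu"  # assumed stand-in for the host in config.h
DEFAULT_PORT = "70"                  # assumed stand-in for the port in config.h


def parse_menu(text, host=DEFAULT_HOST, port=DEFAULT_PORT):
    """Split menu text into items and fill in omitted Type/Host/Port."""
    items = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key == "Name":
            # The Name field must come first, so it starts a new item.
            items.append({"Name": value})
        elif items:
            items[-1][key] = value
    for item in items:
        path = item.get("Path", "")
        # Type defaults to the first character of the Path field.
        item.setdefault("Type", path[:1])
        item.setdefault("Host", host)
        item.setdefault("Port", port)
    return items


SAMPLE = """\
Name=This description of the file will display on the client
Path=0/path/to/dir/file

Name=This is a remote link
Path=0/myfile/path
Type=0
Host=MyHost.MyUniv.edu
Port=70
"""

for entry in parse_menu(SAMPLE):
    print(entry["Type"], entry["Host"], entry["Port"], entry["Path"])
```

The first item picks up its Type from the Path and its Host and Port from the defaults, which is why leaving those fields out of local items makes a later move to a new host or port painless.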

2. After the menu files have been created you must run the mkcache program to produce a .cache file. This can be done once for each directory or once in the top data directory with the "-r" option to make all the .cache files for the hierarchy. You might want to look at a .cache file to see what it is like.

Testing

After compiling and setting up the software you can test it on a sample directory provided with the distribution. To do this first make a symbolic link in your root data directory to the "docs" directory in the source distribution. The command "ln -s /your/src/dir/docs" executed in the root directory should do this. If your system does not support symbolic links you can copy this directory and its subdirectories to your data directory temporarily.

Next edit the file "docs/sample.root.menu" to replace the host name "your.host.edu" with the name of the host for your server. Place a copy of the edited file in your root data directory and name it "menu" (be sure to save the previous menu file if you have created one). Now in your root directory run the program mkcache with the command "mkcache -r". This will produce some messages including a warning about the file docs/Install, which you can ignore. (To understand the meaning of this warning read the section on structured files below.) Running mkcache with the "-r" option should produce two ".cache" files, one in your root directory and one in the docs subdirectory. Now change to the directory docs/images and run mkcache again to produce a .cache file here. Using the "-r" option didn't take care of this directory because we don't want it to show on our menus and hence it isn't in any menu file. The image in this directory will appear "inline" in the menu for the root directory (if it is viewed with an HTTP client like Mosaic -- gopher clients lack the capability to display it and will ignore it).

Now you are ready to test your server installation on this directory. Try it with your favorite gopher or http client.

Trouble Shooting

If things are not working as they should here are some tips to help you isolate the problems. First to check the server itself try executing it from the command line. If you use the command "gn /full/path/of/root/dir", it should run and pause for input. Type a return in response and gn should print the "gopher protocol lines" of your top level .cache file and exit. If instead of a return you type the "selector" for a file (i.e. the contents of the Path= line in the menu, like "0/dir/filename") then gn should display the contents of that file and exit.

If this doesn't happen there should be an error message which may be helpful. Better error messages are placed in the log file so you may want to run gn again with the additional arguments "-L logfile" and then examine the contents of the logfile. Or if you run "gn -L /dev/tty" the log entries will be printed to your screen instead of being put in a file. If it can't open a file, for example, the name of that file will be recorded in the logfile. Check its permissions. Remember that all files that gn serves must be world readable.

A second useful test is to telnet to your server at port 70. You should get a connection message and a pause for input. If you get a "Connection refused" message it is likely there is a problem with your inetd setup or for some reason your system can't find or can't execute the gn binary.
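
The telnet test can also be scripted. Here is a minimal sketch in Python (not part of the GN distribution; the function name gopher_fetch is hypothetical) which connects, sends one selector line and collects whatever the server returns. It assumes the server closes the connection when it has finished answering, which is how the gopher protocol normally behaves.

```python
import socket


def gopher_fetch(host, port=70, selector="", timeout=10.0):
    """Send one selector line and return everything the server sends back."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # An empty selector is the same as typing a bare return in telnet.
        sock.sendall(selector.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


# Example use (needs a running server, so it is left as a comment):
#   print(gopher_fetch("YourHost.YourU.edu").decode("latin-1"))
#   print(gopher_fetch("YourHost.YourU.edu", selector="0/dir/filename").decode("latin-1"))
```

With an empty selector the reply should be the lines of your top level menu; with a selector such as "0/dir/filename" it should be the contents of that file. A ConnectionRefusedError here corresponds to the "Connection refused" message from telnet.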

Limiting Access to Your GN Hierarchy

If you have opted to limit access to your gopher there are two ways to do this. For the first you use the "-a" option to GN (in the inetd.conf file). This will limit access to the server to those clients with an IP address or subnet address listed (and not excluded) in the file .access in the root data directory. The format of the .access file is one address per line, each line consisting of an IP address like 129.111.222.123 or a subnet address like 129.111.222 or 129.111. In case a subnet address is listed, any client with an IP address beginning with that subnet address will be allowed access.

You may also list the domain names of the machines using wildcards provided the machines all have proper PTR domain name records. To allow access to all machines under nwu.edu, use the line *.nwu.edu. Note that this will not allow access to a machine called nwu.edu if it exists. One would need to add in the record nwu.edu to allow access.

You can also exclude IP addresses or domain names by prefixing them with an '!', so if .access contained only the lines

!boombox.micro.umn.edu
*
access would be permitted to every machine except boombox. Likewise
!129.111
*
would allow access to everyone except those on subnet 129.111. It is important to note that in determining access GN reads the .access file only until it finds a match (with or without '!') and then quits. So if .access consisted of the two lines

*
!129.111
then access would be granted to everyone since the * comes first and it matches everyone.
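
The "first match wins" rule is easy to get wrong, so here is a small Python sketch of it. This is an illustration of the rules as described in this section, not GN's own code; the function names are hypothetical, and the choice to compare subnet addresses one dotted component at a time is an assumption about what "beginning with that subnet address" means.

```python
from fnmatch import fnmatchcase


def _matches(pattern, ip, hostname):
    """True if one .access pattern matches the client's IP or host name."""
    if pattern == "*":
        return True
    if pattern.replace(".", "").isdigit():
        # Full IP address or subnet address: compare dotted components.
        want = pattern.split(".")
        return ip.split(".")[:len(want)] == want
    # Otherwise treat the pattern as a (possibly wildcarded) domain name.
    return hostname is not None and fnmatchcase(hostname.lower(), pattern.lower())


def access_allowed(access_lines, ip, hostname=None):
    """Scan the lines in order; the first match decides, '!' means exclude."""
    for raw in access_lines:
        line = raw.strip()
        if not line:
            continue
        exclude = line.startswith("!")
        pattern = line[1:] if exclude else line
        if _matches(pattern, ip, hostname):
            return not exclude
    return False  # not listed anywhere, so access is not granted


rules = ["!129.111", "*"]
print(access_allowed(rules, "129.111.222.123"))  # False: the exclusion matches first
print(access_allowed(rules, "129.105.1.1"))      # True: falls through to "*"
```

Reversing the two rules makes the first call return True as well, which is exactly the ordering pitfall described above.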

The "-A" option is similar to the -a option except access is allowed on a per directory basis. Each client request is processed by first looking for a .access file in the directory containing the requested item and comparing the IP address of the client with the addresses in this file. If no .access file exists in this directory, one is sought in the parent directory and then if necessary the parent of the parent, etc. up to the root data directory. If no .access file is found by this process access is allowed to all clients provided the item requested exists in a .cache file.
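
The upward search for a .access file can be sketched in a few lines of Python. This is only an illustration of the lookup order described here; the function name find_access_file is hypothetical, and refusing to look outside the root data directory is an assumption.

```python
import os


def find_access_file(root, item_dir):
    """Return the nearest .access file at or above item_dir, or None."""
    root = os.path.abspath(root)
    current = os.path.abspath(item_dir)
    # Assumption: directories outside the root data directory are never consulted.
    if os.path.commonpath([root, current]) != root:
        return None
    while True:
        candidate = os.path.join(current, ".access")
        if os.path.isfile(candidate):
            return candidate
        if current == root:
            return None  # reached the root data directory without finding one
        current = os.path.dirname(current)
```

A result of None corresponds to the case where access is allowed to all clients, provided the requested item exists in a .cache file.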

It is possible with GN to attain even finer access discrimination than on a per directory basis, though it is somewhat cumbersome to do so. Nevertheless if you have a need to make certain menu items visible (and accessible) to a select group of hosts, this is possible. Details on how to do it are in the last section of the document /docs/technical.notes.

Searching a Collection of files

Gn has two mechanisms to allow the user to create a small menu consisting of those items on a large menu whose files contain a particular search term. The first of these is described in this section. This method has the advantage of being much easier for the maintainer but will become slow if there are too many files being searched or the files are too large. It is appropriate if the number of files you wish to search is, say, less than 100 and the files themselves are at most a few hundred kilobytes (of course this depends on your particular hardware and your mileage may vary). If you have a large number of files or very large files, then using WAIS indexing may be a better choice. This is described in a separate document "The Waisgn User's Guide."

In the remainder of this section we will describe the grep-like builtin search capabilities of gn. The example GN server on hopf.math.nwu.edu has a top level menu item which is called "Full text search of GN documentation." When a user selects this item the client will prompt for a search term and then return a menu of all the available files in the documentation directory which contain the search term. The searches are case insensitive. In fact they use grep-like regular expressions (see the UNIX man page for grep(1)). The searching is not done by actually invoking the UNIX grep program, however. Gn has its own builtin regular expression code.

This feature is enabled by putting two entries in appropriate menu files. The first looks like this:

Name=Documentation for the GN server
Path=1s/docs
This is just a minor modification of the normal entry for the directory "docs" which contains the documentation files we want to make available by GN. The only difference is that the path is "1s/docs" instead of "1/docs". The "s" indicates that we are making the directory searchable, i.e. giving permission to run grep-like searches of all the files which are listed in the menu file for the docs directory. Of course, docs must be a directory in the GN hierarchy and it must contain a .cache file allowing files to be served by GN in the usual way.

As yet, however, we have no menu item for the search. This is achieved with the second menu item:

Name=Full text search of GN documentation
Path=7g/docs
The "7" indicates that this is a search type and the "g" indicates that this is a "grep" type search. In this example docs is a directory in the GN root directory, but it could be lower in which case we would have the line Path=7g/foo/bar/docs. Both of these menu items might, as in this example, go in the same menu, but this is not necessary. You might, for example, want to have the "Full text search of GN documentation" item occur in the menu which lists the documentation files, rather than the menu which lists the documentation directory. Or you might want to have it occur in both menus, which is fine. This menu item can occur in any menu of your GN hierarchy. But, of course, the "Documentation for the GN server" menu item corresponds to the physical directory "docs", so it must be in the menu corresponding to the directory containing "docs". In this example that is the GN root directory.

You can also do full text grep-like searches through any collection of directories (instead of a single directory) but it requires a little trick. Here's how you do it.

In the directory whose menu you want to contain the search item (let's call it dir1) create a new directory (call it newdir). Take the menu files from all the directories you want to search, concatenate them into a single file and make it the menu file in newdir. Run mkcache on this concatenated menu file to produce a .cache file in newdir. Now in the menu file in dir1 put the following entry:

Name=Search a bunch of files from multiple directories
Path=7g/dir1/newdir
then make a new .cache from this menu file. That's all there is to it. If you want also to have a menu visible to users showing all the files in the multiple directories you can add an item
Name=List of a bunch of files from multiple directories
Path=1s/dir1/newdir
to the dir1/menu file. But this isn't necessary.

These searches are fairly efficient because GN contains its own regular expression matching routines rather than externally calling the grep function. Regular expressions which the user can enter as search terms are essentially the same as those allowed by grep (see the man page grep(1)) with the addition of the special character ~ which matches word boundaries.
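
To make the idea concrete, here is a rough Python sketch of what such a search does: given the files listed in a menu, return the titles of those containing a match. It is only an illustration. The function name grep_menu is hypothetical, Python's re syntax is not identical to GN's builtin expressions (the ~ word boundary character, for instance, is not supported), and case insensitivity is obtained with re.IGNORECASE.

```python
import re


def grep_menu(pattern, files):
    """Return the menu titles whose file text contains a match.

    `files` maps a menu title to the text of the corresponding file.
    """
    regex = re.compile(pattern, re.IGNORECASE)
    hits = []
    for title, text in files.items():
        # A file is a hit as soon as any one of its lines matches.
        if any(regex.search(line) for line in text.splitlines()):
            hits.append(title)
    return hits


docs = {
    "Installation Guide": "Edit the file config.h\nThen run make",
    "Waisgn User's Guide": "WAIS indexing for large collections",
}
print(grep_menu("CONFIG", docs))
```

Every file is read in full on every search, which is why this approach slows down as the number or size of the files grows.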

Serving and Searching Structured Files

Gn has the capability to serve a single large file consisting of a number of sections so that each section appears to the client as a separate file with its own title. This is a generalization of the "mailfile" feature available on the Minnesota server. To use this feature requires two additional fields in the menu file, called "Separator" and "Section". These are regular expressions as in grep which are used to match lines which will be used as separators of the parts of the large file and lines which will be used for menu section items. Thus for a mail file one would use the lines

     Separator=^From 
     Section=^Subject:
The first line, which should have a literal space at the end, means that sections (in this case mail messages) are separated by lines starting with From and a space. The ^ matches the start of a line and the space is necessary because some lines begin with From and a colon.

Here's another example. This document consists of sections with section headings lines written all in caps. Since I want to make a menu with each section a separate item I use the following entry in my menu file

     Name=Installation/Maintenance Guide Sections
     Path=1m/docs/Install
     Separator=^[A-Z][A-Z" ]*$
     Section=^
     Type=1
The Separator is ^[A-Z][A-Z "]*$. This matches any line starting with a letter from A to Z (i.e. caps) followed by any number of characters which are between A and Z or equal to space or the quotation mark, and then the end of the line. This describes the section headings of this document. I need the initial [A-Z] so blank lines won't be matched.

When the separator field is matched a new section is started which will have its own menu item. The title of the menu item is determined by the Section regular expression. In fact the section is searched, starting with the separator line, for a match for this second regular expression. When a match is found, everything on the line after the matching pattern is taken as the title. Thus for mail everything after the word "Subject:" becomes the title. In the example of this document, the expression ^ matches the beginning of the separator line so that whole line becomes the menu title. To see this in use gopher to hopf.math.nwu.edu and look in the documentation directory for this document.
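
Here is a short Python sketch of the splitting and titling rules just described. It is not the code used by mkcache; the function name split_sections is hypothetical, Python's re module stands in for GN's own regular expressions, and two details are assumptions: lines before the first separator are skipped, and a section with no Section match gets an empty title.

```python
import re


def split_sections(text, separator, section):
    """Return (title, body) pairs for a structured file."""
    sep_re = re.compile(separator)
    sec_re = re.compile(section)
    chunks = []
    for line in text.splitlines():
        if sep_re.search(line):
            chunks.append([line])    # a separator line starts a new section
        elif chunks:
            chunks[-1].append(line)  # assumption: text before the first separator is skipped
    result = []
    for lines in chunks:
        title = ""                   # assumption: no Section match means an empty title
        for line in lines:
            match = sec_re.search(line)
            if match:
                # Everything on the line after the matching pattern is the title.
                title = line[match.end():].strip()
                break
        result.append((title, "\n".join(lines)))
    return result


MAIL = """\
From alice Mon Jan  3 10:00:00 1994
From: alice
Subject: Hello
Body one
From bob Tue Jan  4 11:00:00 1994
Subject: Re: Hello
Body two
"""

for title, body in split_sections(MAIL, r"^From ", r"^Subject:"):
    print(title)
```

Note that the line "From: alice" does not start a new section, because the separator requires a space after From.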

Another example of how this might be used is for a directory. If a file consists of entries like

     Name: Franks, John
     Address:  Department of Mathematics, Northwestern University
     Phone: 708-491-5548
     etc., etc.
then Separator=^Name: and Section=^Name: would give a menu with an item
     1.  Franks, John
which when selected would give the multiline record with my name, address, etc. In this example it would be even better to use the search feature for structured files.

In the example above the "1m" at the beginning of the Path field indicates that this is a structured file. It is Type 1 because to clients it will look like a directory. If we add an additional menu entry like

     Name=Search Installation Guide
     Path=7m/docs/Install
     Type=7
which is Type 7 and has a path beginning with "7m" the client will prompt the user for a search term which can be a regular expression. The GN server will return a menu with only those sections containing a match for the regular expression. Thus for the directory example if the user searched for Northwestern she would get only those directory entries containing that word.

Here's how this works. When mkcache is run with a menu file containing the "1m" entry above it produces the regular .cache file but also produces another file (in this case called Install..cache) which is a cache file for the sections of the file Install specified in this menu item. The lines in this cache file contain the menu titles obtained from the file by matching regular expressions and contain a selector which designates a range of bytes corresponding to a section of the document. Gn knows how to serve a single section of a document when given one of these byte range selectors.

Since the file Install..cache was made when the item with path 1m/docs/Install was encountered, it is not necessary to remake it when the item with path 7m/docs/Install is reached. We signal this by omitting the Separator and Section fields from this menu item. If these fields were in both items the cache file Install..cache would be made twice and the one done last would take effect if there was a difference in the regular expressions given. Of course if the regular expressions are omitted from both then the cache file will not be made and attempts to access either item will result in an error (cryptically reported as "Access denied"). For this reason, whenever an item of type 1m or 7m with no regular expressions is encountered by mkcache, a warning message is printed.

It is easy to effectively use two different separator regular expressions or two different section expressions for the same file. You might for example want to have a mail file with menu by subject and another menu by author. To do this you must make a UNIX link (see the man page ln(1)) to give the mail file an additional name and use the two different names in the menu file Path entries. This is necessary so the cache files created will have different names.

The two regular expressions for the separator and the menu titles are not put in to the selector string. Thus they are not available to the client to change. This has a slightly unfortunate side effect when uncache is used to produce a menu file. Since there is no information about these regular expressions in the .cache file there is no way for the uncache program to put it in the menu file it makes and they must be added by hand. This is the only way that uncache fails to be a complete inverse for mkcache.

Note: All regular expressions given as search terms and all lines in which a match is sought are converted to lower case before the matching is attempted. This has the (desirable) effect of making all searches case insensitive. By contrast the regular expressions used to define separators and menu lines are case sensitive. Regular expressions which can be used for the separator and section strings are essentially the same as those allowed by grep with the addition of the special character ~ which matches word boundaries. To match a special character (including ^ ~ [ ] ( ) * . \ and $) literally, it must be escaped with a \.

Using HTML -- GN as a WWW Server

Starting with release 1.0, the GN server became a multi-protocol server. It will accept either gopher requests or HTTP requests and respond appropriately. To the maintainer this takes place automatically with no action necessary on his or her part.

For those not familiar with it HTTP stands for Hyper Text Transfer Protocol and it is the underlying protocol used by WWW (World Wide Web) browsers such as the Mosaic family. Gopher and HTTP each have some advantages not shared by the other. Making GN a multi-protocol server is an attempt to let us have our cake and eat it too.

While it is correct that as soon as you start up GN you are serving documents via HTTP, in order to take advantage of some of the really nice features, like images in menus, you do have to put some information in your menu file.

HTTP is a protocol designed for use with HTML (Hyper Text Markup Language) and the usual HTTP server consists of a collection of documents written in this markup language with internal hypertext links between them rather than any menus, as such. The GN server works with HTTP clients by translating menus into HTML and serving them in accordance with the protocol that HTTP browsers understand.

In addition you can, of course, create your own HTML documents and make them available on your server. You can learn about the format of an HTML document from an online beginner's guide by Marc Andreessen at http://www.ncsa.uiuc.edu/General/Internet/WWW/HTMLPrimer.html (This is a URL or Universal Resource Locator which says the document is available via HTTP at www.ncsa.uiuc.edu in the file with the path given). This document is an excellent introduction to HTML documents and gives references for further reading.

Once you have created a document you can serve it with GN by giving it a file name ending in .html and making it available in the usual way as a text document. For example,

     Name=A Sample of Hypertext Markup
     Path=0/dir/dir2/sample.html

If this document is viewed with an HTTP browser it will be displayed with the capabilities of that browser (i.e. nicely formatted in the ways prescribed by your HTML document). If it is viewed by a gopher client the HTML source, i.e. the unformatted document with markup tags, will be displayed. If you want to create two versions of a document -- one in plain text and the other in HTML, this is easily handled by GN. Simply give the plain text file a name, say "sample," and use the name "sample.html" for the HTML version. Then use the plain text name, but with a Path starting with "0h" (that's zero h). For example a menu entry like

     Name=A Sample of Plain/Hyper text
     Path=0h/dir/dir2/sample
will provide the file "sample" to gopher clients and "sample.html" to HTTP clients. One note of warning: don't name the plain file "something.txt", because the .txt suffix indicates to HTTP clients that this is *not* an HTML file and it will do the wrong thing (the client, gopher or HTTP, only sees the plaintext file name.)

Adding HTML text to menus works slightly differently. You simply include the source in the menu file beginning with the keyword "httpText=" on a line by itself and ending with the keyword "endText=" on a line by itself. Here is an example from the main menu of the GN server at hopf.math.nwu.edu. It illustrates how to put graphic images into a menu.

     httpText=
     <title>The GN Server</title>
     <img src="http:/I/image/fract2.gif">
     <p>
     This is the home of the GN Gopher/HTTP server. It contains documentation on GN, the source, and several examples of how GN can be used. To get the source distribution select the compressed tar file listed below.
     endText=
     Name=Announcement of GN version 1.0
     Path=0/announce-1.0
     etc.

After the keyword httpText=, the first line creates a title for the document. All HTML "tags" which do the markup are contained in angle brackets <>. The line starting <img src=... says to insert the graphic image on hopf at port 70 with Path=I/image/fract2.gif at this point in the document. The tag <p> indicates a paragraph break. See the document mentioned above for more details on HTML. Any HTML text can be inserted in this way in a menu. There can be multiple insertions and they can be anywhere in the text.

If you use the keyword Text= in place of httpText= then GN will serve the text to HTTP clients exactly as with httpText=, but will also put the text (with all HTML tags deleted) in the gopher menus using the 'i' or comment type supported by many clients. For the gopher clients no text formatting is done. The lines will have the same length they do in your menu file.

Remote links are slightly problematical for GN. If a link to a remote server is made in the usual way by specifying Name, Path, Type, Host and Port then the GN server assumes by default that this is a link to a server capable of dealing only with the gopher protocol and will present it as such. The determination of whether or not a link is remote is done at the time that mkcache is run and a link is considered remote unless the Host and Port fields in the menu are omitted or agree exactly with the default values as specified on the mkcache command line or at compile time in the file config.h. For this reason it is important that whenever you run mkcache you specify the host on the command line, unless you have placed that name in the config.h file as HOSTNAME.
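
The remote-or-local decision can be written down as a tiny Python sketch. This is one reading of the rule, not the code in mkcache: the function name is_remote is hypothetical, and it assumes each of the Host and Port fields must either be absent or equal the default exactly (so a difference in letter case counts as a mismatch).

```python
def is_remote(item, default_host, default_port):
    """A link is local only if Host and Port are omitted or equal the defaults."""
    host_ok = "Host" not in item or item["Host"] == default_host
    port_ok = "Port" not in item or str(item["Port"]) == str(default_port)
    return not (host_ok and port_ok)


local_item = {"Name": "A local file", "Path": "0/dir/file"}
linked_item = {"Name": "A link", "Path": "1/", "Host": "gopher.math.nwu.edu", "Port": "70"}
print(is_remote(local_item, "hopf.math.nwu.edu", 70))
print(is_remote(linked_item, "hopf.math.nwu.edu", 70))
```

If mkcache is run with the wrong default host, every item that does name your real host fails the comparison and is treated as a remote, gopher-only link.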

Of course, you may know that a remote link is running the GN server and therefore capable of handling HTTP requests as well as gopher requests. In this case, to allow HTTP clients to get the best link, simply use "GNLink=" instead of "Name=" in your menu file. For example, a link to the Northwestern University Math server would look like:

     gnLink=Northwestern University Mathematics Department
     Path=1/
     Type=1
     Host=gopher.math.nwu.edu
     Port=70

Of course, it is also possible to handle links to servers that can only handle HTTP requests. This is done by placing them in the menu as HTML documents, bracketed by the httpText= and endText= keywords.

Finally, it is possible to use GN as a server to serve only HTTP clients and have no menus. Well, there would have to be one menu, the root menu, but it could contain nothing but HTML surrounded by the keywords httpText= and endText=. This document could have hypertext links to other HTML documents which in turn have hypertext links, etc. It will still be necessary to create "dummy" menu files in each directory with the Path of each of the HTML files and to run mkcache to create .cache files. This is for security reasons.

Setting up a "Search All Menus" Item

A builtin feature of GN is the ability to have a menu item which when selected prompts the user for a search term and returns a "virtual menu" of all menu items which contain that term. In fact such an item can occur at any level and return either all matches from all menus on that server or all matches at or below some chosen level.

Here's how to set it up. Create an entry like this in the menu file where you want the search item to occur.

     Name=Search all menus on this server
     Type=7
     Path=7c/.cache
(If you want the search to cover only those items in directory /foo/bar, then the path line should be Path=7c/foo/bar/.cache) Now run "mkcache" to translate the new menu file to a .cache file and you are done. The Type, Host and Port lines are optional -- if they are omitted mkcache will use the default value or the value supplied on the command line. When you change any of the menus in your server and remake the .cache files GN will automatically reflect this in menu searches. There is a maximum depth which GN will search into the GN hierarchy. Its value can be changed by editing the config.h file and re-compiling.

Compressed Files

If you wish you can keep files on your server in a compressed format and uncompress them on the fly as a client requests them. You need a program to compress the files and a companion program to decompress them. I recommend "gzip" and "zcat" from the GNU project. They are considerably more efficient than the UNIX standard "compress."

When configuring "GN" for compilation, be sure to set the #define DECOMPRESS in the file config.h to the path name of the program which will decompress the files you have compressed. The default value for this is "/usr/local/bin/zcat". Another possibility would be "/usr/ucb/uncompress -c".

If the file you want to make available is "rootdir/dir/bigfile," first you must compress it with the compress command which will replace it with the file bigfile.gz or bigfile.Z. You then make a menu entry like the following (assuming bigfile was a text file and you have produced bigfile.gz).

     Name=All the text in Bigfile
     Path=0Z/dir/bigfile.gz
     Type=0

The key here is the 'Z' which is the second character of the Path field. It indicates that the file is compressed. The Path would start with "0Z" (that's zero Z) for any compressed text file. It doesn't matter how the file was compressed or whether its name is bigfile.Z or bigfile.gz or something else. You have already told GN how to decompress the file by specifying the DECOMPRESS program in config.h.

Of course, if bigfile is a binary the Path field would be 9Z/dir/bigfile.gz and the Type would be 9. For a sound file Path=sZ/dir/bigfile.gz, Type=s, etc. Files of types 0, 4, 5, 9, s, and I can be compressed. Structured files (type 1m) cannot be compressed.
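
By now we have seen several Path values in which a second character follows the type. The Python sketch below pulls such a Path apart. It is only an illustration: the function name split_selector and the MODIFIERS table are hypothetical, the table lists just the letters mentioned in this document, and exec paths are not handled.

```python
# Second-character letters that appear in this document, with the meaning given here.
MODIFIERS = {
    "s": "searchable directory",
    "g": "grep-like search",
    "m": "structured file",
    "c": "search of menus",
    "h": "HTML version available",
    "Z": "compressed file",
}


def split_selector(path):
    """Split a Path value such as '0Z/dir/bigfile.gz' into (type, modifier, file path)."""
    head, slash, rest = path.partition("/")
    if not head or not slash:
        raise ValueError("expected something like 0/dir/file: %r" % path)
    gn_type, modifier = head[0], head[1:]
    return gn_type, modifier, "/" + rest


for example in ("0Z/dir/bigfile.gz", "9/dir/bigfile.gz", "7g/docs"):
    gn_type, modifier, file_path = split_selector(example)
    print(gn_type, MODIFIERS.get(modifier, "plain item"), file_path)
```

The first two examples name the same file on disk; only the type and the presence of the 'Z' differ, which is how one compressed file can appear on a menu both as text and as a binary.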

You might want to let users download the file in compressed format. You could give them the option by having the menu item as above with Path=0Z/dir/bigfile.gz and also having a menu item

     Name=Bigfile in compressed format
     Path=9/dir/bigfile.gz
     Type=9
Note that the Type=9 since compressed files are binaries (even though bigfile is text) and there is no 'Z' as the second character of the Path, because now we do not want to decompress. Also note that two versions of bigfile show up on your menu (text and compressed binary) but there is only one file bigfile.gz on your disk.

Serving the Output of a Program or Script

Sometimes it is convenient to have the server return the output of a program or script. This capability is built into GN. Assuming you have a program in a file "prog" which returns some text you can make its output be an item on your server's menu with a menu entry like
     Name=Program output
     Type=0
     Path=exec0::/dir/prog

The phrase "exec" says to run the program "prog" which must be executable by the GN userid (probably "nobody"). The "0" after the exec says this is a text file. exec can return most types, including 0 (text), 1 (menus), 9 (binaries), s (sound), I (image). To specify a type the single character type is appended to the word exec in the path. Thus if you wanted to return the output of a program which is in the format of an image file you would have an entry like

     Name=Image program output
     Type=I
     Path=execI::/dir/prog
 

The pair of colons in the path can contain arguments to the program. For security reasons none of the characters

     ; ` ' | \ * ? - ~ < > ^ ( ) [ ] { } & $ / or \ 
are allowed in the arguments to programs. Thus, if you want to run a command like "prog -u" you should create a shell script like
     #!/bin/sh
     exec /fullpath/prog -u
and make this script be what GN executes.

It would be nice to have the client query the user for a word or phrase and have this passed to the program as an argument. Unfortunately the basic gopher protocol does not allow this. In the gopher+ protocol this deficiency was remedied by grafting on some wholly new interactions between client and server. Eventually gn will support these so-called "ASK blocks".

In the meantime here is an example of how to work around the difficulty. This is a simple example of how to make a search item which will print all the lines of a file containing a match for a given word. It requires two very small shell scripts. The first, let's call it "script1" would have a menu file entry like

     Name=Search for a word in Myfile
     Path=7/dir/script1
in gn_root/dir/menu and the script itself, gn_root/dir/script1, would contain only the lines
     #!/bin/sh
     echo "0Matches for $1:<tab>exec0:$1:/dir2/script2<tab>myhost.edu<tab>70"
with <tab> replaced by the tab character. When a client selects this item and returns a search term the script creates a new menu with one item "on the fly". The one menu item is "Matches for X" where X is the search term entered by the user.

When this "Matches" item is selected and sent to the server the second script "script2" is invoked with argument set to the search term already entered. Script2 should contain the lines

     #!/bin/sh
     grep "$1" /complete/path/of/Myfile
which will then output the matching lines. Recall that for security reasons every file, including scripts, which will be served, must be in a .cache file. To achieve this there must be a menu item for script2 in gn_root/dir2/menu like
     Name=Anything
     Path=exec0::/dir2/script2

One normally puts this in a different directory (gn_root/dir2 in this example) which is not in your gn hierarchy and hence this item won't show on any menu. It is only there so that when gn is asked to provide the output of script2 in response to the "on the fly menu", it can find script2 in a .cache file and hence serve its output.

One final remark about this example. It will work fine if the search term is a normal word or part of a word. However, regular expressions used by grep often contain special characters which by default are not allowed in arguments to scripts, as mentioned above. This restriction applies only to arguments to scripts, as in this example; regular expressions work fine with the built-in searches in gn described above.
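
Because literal tab characters are easy to lose when editing a script, the menu line that script1 emits can be built with printf instead of echo. This is only a sketch under the same assumptions as the example above (host myhost.edu, port 70, script2 in /dir2); the function name make_menu_line is hypothetical.

```shell
#!/bin/sh
# Sketch of the body of script1: emit a one-item menu whose fields are
# separated by real tab characters. myhost.edu and 70 are the example
# host and port used above; substitute your own.
make_menu_line() {
    term=$1
    printf '0Matches for %s:\texec0:%s:/dir2/script2\tmyhost.edu\t70\n' \
        "$term" "$term"
}

# As script1, the last line of the file would be:
#   make_menu_line "$1"
```

Using printf with \t makes the field separators visible in the source, so an editor that converts tabs to spaces cannot silently break the menu line.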

MIME Content-Type

Both gopher+ (which is not yet supported by GN) and HTTP/1.0 use an IANA standard "Mime Content-type" classification for the files served. Examples of such types are "text/plain", "image/gif", etc. Starting with version 1.0 the GN server needs to know this type to transmit to the client. This is normally accomplished automatically using the file gn_mime.types, which can be edited by the server maintainer. Information in this file enables the mkcache program to translate the gopher Type= field and the suffix from the file name into a standard type. For example, the line
     I<tab>gif<tab>image/gif
in gn_mime.types means that menu items with Type=I and a file name like picture.gif should be given the "image/gif" type. This assignment can be overridden on a per file basis by adding a line to the menu file. For example, a menu item like
        Name=An html file
        Path=0/dir/filename
        ContentType=text/html
        Type=0
will be given the Content-type "text/html" even though the file name does not end with the suffix ".html" (had the file name been filename.html it would not have been necessary to add the ContentType= line). At present this attribute is only used with clients supporting the HTTP/1.0 protocol. It will be used in the future with gopher+ attributes. Gopher0 clients and HTTP/0.9 protocol clients are not capable of using this information.

If the file gn_mime.types is not present mkcache will issue a warning but use internal default values. The file exists so that you can add to it if you wish to add new kinds of documents to your server. The format of the file is explained in the file. The default version of the file is in mkcache/gn_mime.types. The internal defaults are the same as what is currently in this file.
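
The kind of lookup described in this section can be pictured with a small sketch. It assumes a three-column, tab-separated table like the line quoted above (type, suffix, content type); the function name lookup_content_type is hypothetical, and this is not mkcache's actual code. The real gn_mime.types file documents its own format and may allow more than this sketch handles.

```shell
#!/bin/sh
# Illustrative only: map a gopher type and a file name to a content type
# using a table whose lines have the form type<tab>suffix<tab>content-type.
#   $1 = gopher type (e.g. I)
#   $2 = file name (e.g. picture.gif)
#   $3 = path of the table file
lookup_content_type() {
    gtype=$1
    fname=$2
    table=$3
    suffix=${fname##*.}    # text after the last dot in the file name
    # Print the third column of the first line matching type and suffix.
    awk -F '\t' -v t="$gtype" -v s="$suffix" \
        '$1 == t && $2 == s { print $3; exit }' "$table"
}

# Example, with a table holding just the line quoted above:
#   printf 'I\tgif\timage/gif\n' > /tmp/types
#   lookup_content_type I picture.gif /tmp/types
```

When no line matches, the sketch prints nothing; a per-file ContentType= line in the menu file is how such a case would be handled explicitly.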

Decoupling the GN and Filesystem Hierarchies

It is possible to do this, but generally recommended only if you have a reason to do so and are fairly familiar with how GN works and the syntax of .cache files. Information on how to do it can be found in the file docs/technical.notes.

Thanks

I would like to thank the many people who have aided in the creation of the GN package, either through writing code or finding and fixing bugs. They include Earle Ake, Henry Cejtin, Mike Crowley, Paul DuBois, Stephen Hebditch, Jishnu Mukerji, Marko Nordberg, Jim Rees, Craig Milo Rogers, Stephen Trier, Ed Vielmetti, and Rico Tudor.

Appendix: GN Internal Types

Each "Path" field in a gn menu file contains an entry known as a selector. The format of this selector is a gn internal type field followed by the path relative to the gn root directory of the file referenced. The gn internal type indicates to the server what kind of document or file is referenced. It can be a single character like "0" to indicate a text file, or a more complicated string. Here is a list of the types and what they mean.
0 -- Plain text
This is a normal ASCII text file. The Mime content type used is text/plain.
0h -- Plain text and html (two files)
Items of this type are represented by two files, filename and filename.html which are plaintext and HTML versions of the same document. The server will return the HTML version to HTTP clients and the plaintext version to gopher clients. Content types are text/plain and text/html respectively.
1 -- Menu
Items of this type correspond to directories in the GN hierarchy. From the menu file in this directory the server produces an HTML document for HTTP clients and a gopher protocol directory for gopher clients.
1m -- Structured file
Items of this type are files consisting of many smaller "documents" like a mail file. The server presents a menu of these smaller documents and when one is selected returns the appropriate part of the file. See the section on Structured Files.
1s -- Searchable Menu
Like type 1 but allows items in this menu to be searched with grep searches. See the section on Searching a Collection of Files.
2 -- CSO or "ph" URL (link)
The gn server does nothing with these links except give the URL (or gopher equivalent) to the client to handle.
3 -- Error
This type is reserved for error messages. It should not be used in a menu file.
4, 5, 6 -- Binhex, DOS .exe, and uuencoded files
These types refer to Mac binhex, DOS binaries and UNIX uuencoded files respectively. The server treats types 4 and 6 (binhex and uuencoded) exactly like text files and type 5 (DOS binaries) like type 9 binary files.
7 -- Maintainer defined search or response to client supplied string
An item of this type refers to a program or script provided by the server maintainer. This program should take an input string and respond with output in the format of a .cache file (see Technical notes).
7c -- Menu keyword search
This item refers to a directory in the GN hierarchy, typically the root directory. When it is selected the user is prompted for a search term (any grep-like regular expression). The server returns a menu of all menu items in the hierarchy at or below the given directory which contain matches for the regular expression. See the section above on "Searching all menus."
7g -- Grep searches
This item refers to a directory in the GN hierarchy, containing a number of files. When this item is selected the server prompts the user for a regular expression search term and returns a menu of all those files which contain a match for that regular expression. See the section on Searching a Collection of Files.
7m -- Search a structured file
This item refers to an item of type '1m' as described above. When this item is selected the server prompts the user for a regular expression search term and returns a menu of all those documents in the referenced file which contain a match for that regular expression. See the section on Structured Files.
7w, 7wc, 7wh, 7wr -- WAIS index searches
These items refer to a WAIS index in the GN hierarchy, containing an index of a collection of files in the hierarchy. When this item is selected the server prompts the user for a search term and returns a menu of all those files which contain a match. Type 7wc indicates that the menu item titles should be taken from the .cache files containing the documents rather than use what WAIS thinks the title should be (typically the filename). Type 7wh indicates that the indexed items should be type 0h, i.e. there is both a plaintext and an html version of each file. Type 7wr indicates a "range type", where a single file is considered to contain a number of documents, e.g. a mail file. See the waisgn users guide for more details.
8 -- Telnet URL (link)
The gn server does nothing with these links except give the URL (or gopher equivalent) to the client to handle.
9, I, s -- Binary files, Image files, Sound files
These files are treated as binary data and downloaded appropriately. To determine the Mime content type, the filename extension is looked up in the file gn_mime.types. This can be overridden by a ContentType= entry in the menu file. See the section above on MIME Content-Type.
0Z, 9Z, IZ, sZ -- Compressed files
These types refer to files of type 0, 9, I, and s respectively, but the files in question are kept on the server in a compressed format. When a client requests one of these files it is automatically uncompressed before transmission to that client. See the section above on Compressed Files.
exec0, exec1, exec9, execI, execs -- execute a command
These types cause the server to execute a program provided by the maintainer. The last character indicates the type of item the program or script will return, i.e. '0' for text, '1' for directory etc. See the section above on Serving the Output of a Program or Script.
R123-345-1m, R123-345-range
These types should not be entered in a menu file by the maintainer. They are used internally by the server for menu items which correspond to documents which consist of all the characters in a certain range of a file. "R123-345-1m", for example, would refer to the "document" consisting of bytes 123 to 345 of a file which has a type 1m menu entry. Similarly R123-345-range refers to a document returned as a match of a WAIS index search (type 7wr).
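
To make the selector format concrete, here is a rough sketch that splits a selector into its internal type prefix and its path. It relies on an assumption drawn from the examples in this document, namely that the path begins at the first slash; the function name split_selector is hypothetical and this is not the server's parsing code. For exec selectors the prefix printed here also carries the colon-delimited arguments.

```shell
#!/bin/sh
# Hypothetical helper: split a selector such as 0Z/dir/bigfile.gz into
# the gn internal type prefix and the path relative to the gn root.
# Assumes the path starts at the first slash in the selector.
split_selector() {
    sel=$1
    case $sel in
        */*) ;;             # contains a slash, so it can be split
        *)   return 1 ;;    # no path part; give up
    esac
    gn_type=${sel%%/*}      # everything before the first slash
    gn_path=/${sel#*/}      # the first slash and everything after it
    printf '%s %s\n' "$gn_type" "$gn_path"
}

# Examples:
#   split_selector 0Z/dir/bigfile.gz
#   split_selector exec0::/dir/prog
```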
John Franks -- Dept of Math. Northwestern University <john@math.nwu.edu>