The basic philosophy of GN security is that by default no client requests are granted. Permission to serve a document must be explicitly granted by the maintainer. This is done in one of two ways: A file named .cache which is in the GN data hierarchy may be served and a file in the hierarchy which is listed in a file with the .cache file format may be served.
Despite this strong foundation, several additional steps are prudent. The most important is that the maintainer must ensure that no untrusted source has write access to any part of the GN hierarchy. For example, an "incoming" anonymous ftp directory should never be part of a GN hierarchy, because an attacker could put a .cache file there granting access to any file in the hierarchy.
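One way to look for such trouble spots is to walk the hierarchy and report anything that is world writable. The following is only a rough sketch and not part of GN; the function name find_world_writable is made up for illustration, and it checks only the "other" write bit, so group-writable directories or directories owned by untrusted users still need a manual look.

```python
import os
import stat


def find_world_writable(root):
    """Return paths at or below root whose mode grants write access to everyone."""
    candidates = [root]
    # os.walk does not follow symbolic links to directories by default.
    for dirpath, dirnames, filenames in os.walk(root):
        candidates.extend(os.path.join(dirpath, name) for name in dirnames + filenames)

    flagged = []
    for path in candidates:
        mode = os.lstat(path).st_mode
        if stat.S_ISLNK(mode):
            continue  # permission bits on a symbolic link itself are not meaningful
        if mode & stat.S_IWOTH:
            flagged.append(path)
    return flagged
```

Running it against the top of the data directory and getting an empty list back is a necessary condition for the advice above, not a sufficient one.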
Starting with version 2.14 of GN there is optional support for additional security against counterfeit .cache files. This is achieved by specifying a userid or groupid (not both) for .cache files. To do this, use the "-k uid#" or "-K gid#" option to gn or sgn. When invoked in this way, gn or sgn will not serve a document unless the .cache file listing it has the prescribed owner or gid. This uid# or gid# should be that of the maintainer, not the one under which gn or sgn runs. If on your server all .cache files are created by a single user or a single group, I strongly recommend using this option. Note that for a given .cache file in a directory to be served, it is the owner of a different .cache file (the one which lists the given one and resides in the directory above it) that must be correct, not the owner of the given .cache file itself. In particular, the top level .cache is always allowed, as it is not listed in any .cache file. Sometimes even I get confused about this :)
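Since the rule about which .cache file must have the right owner is easy to get backwards, here is a small sketch of it. This is not GN's code. It assumes that an ordinary document is listed by the .cache file in its own directory (the text above only spells out the case of a .cache file, which is listed one directory up), it does not check that the document is actually listed, and the names governing_cache and owner_permits are invented for the example.

```python
import os


def governing_cache(path, root):
    """Return the .cache file whose owner matters for path, or None for the top-level .cache."""
    path = os.path.abspath(path)
    root = os.path.abspath(root)
    directory = os.path.dirname(path)
    if os.path.basename(path) == ".cache":
        if directory == root:
            return None  # the top-level .cache is not listed in any .cache file
        directory = os.path.dirname(directory)  # a .cache file is listed one directory up
    return os.path.join(directory, ".cache")


def owner_permits(path, root, uid=None, gid=None):
    """Apply a -k uid# or -K gid# style check; exactly one of uid or gid must be given."""
    if (uid is None) == (gid is None):
        raise ValueError("give exactly one of uid or gid")
    cache = governing_cache(path, root)
    if cache is None:
        return True
    try:
        info = os.stat(cache)
    except FileNotFoundError:
        return False  # no .cache file could list it, so it would not be served
    if uid is not None:
        return info.st_uid == uid
    return info.st_gid == gid
```

The point to notice is that owner_permits never looks at the ownership of the file being requested, only at the ownership of the .cache file that would list it.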
Also the server should be run with a USER_ID (which can be set in config.h) with as few permissions as possible. Of course it must have read permission on all the files served but it should not have write permission for any directory or file other than its logfile. If the syslog option for logging is enabled there is not even any need for write permission on a logfile. A good practice is to have files in your hierarchy which you intend to serve be owned by the maintainer or their creator. They should be world readable (assuming they are for general consumption) but with restricted write permission. The files in your hierarchy should not be owned by the user id under which GN will run.
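The ownership and permission advice above can be checked file by file. The sketch below is an illustration only; audit_served_file is a hypothetical helper, and server_uid stands for whatever numeric user id you configured as USER_ID.

```python
import os
import stat


def audit_served_file(path, server_uid):
    """Return a list of ways in which path departs from the advice above."""
    info = os.stat(path)
    problems = []
    if info.st_uid == server_uid:
        problems.append("owned by the server's user id")
    if not info.st_mode & stat.S_IROTH:
        problems.append("not world readable")
    if info.st_mode & stat.S_IWOTH:
        problems.append("world writable")
    return problems
```

An empty list means the file is owned by someone other than the server's user id, is world readable, and is not world writable.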
GN does not use the chroot system call to further restrict the files which the server can access. Doing so would enhance security at the expense of extra work for the maintainer: the effect of chroot is to prevent the server from even internally accessing any file which is not in your data directory. If you are especially concerned about security you may wish to run one of the public domain TCP wrappers in conjunction with GN. This will simultaneously enhance security for other TCP services like anonymous ftp.
More precisely, if any of the characters
; !` ' | \ * ? - ~ < > ^ ( ) [ ] { } & $ \r \n / or \
occurs in an argument for an item of "exec" type it is replaced by a
space. There are other programming constructs which would allow
the invocation of the command without the intervention of a shell.
However using them and not altering the arguments would merely pass
the risk on to the script writer. While GN would then arguably
be blameless in the event of damage, it would still be very easy
for an inexperienced maintainer to have a script line like
exec mail "$maintainer $1"
If $1 contained a ; followed by a dangerous command this could
be disastrous. For that reason I have chosen to check the arguments
and replace dangerous characters.
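The replacement step can be sketched in a few lines. This is an illustration and not GN's source; the character set is copied from the list as printed above (note that the double quote does not appear in that list, so it passes through unchanged here), and the name sanitize_exec_argument is made up.

```python
# Characters taken from the list above; each one is replaced by a space.
DANGEROUS = set(";!`'|\\*?-~<>^()[]{}&$\r\n/")


def sanitize_exec_argument(arg):
    """Replace every listed character with a space, leaving the length unchanged."""
    return "".join(" " if ch in DANGEROUS else ch for ch in arg)
```

With this in place an argument such as "bob; rm -rf /" becomes "bob  rm  rf  ", so the shell sees only harmless words.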
CGI (Common Gateway Interface) scripts work somewhat differently than the exec type. The CGI specification does not permit altering dangerous characters in the arguments. Briefly, here's what happens with CGI arguments.
If the request is of type POST, information is read from the client and put in a temporary file on disk. Then the script is executed with no arguments and its standard input comes from this file. Security is the responsibility of the script writer. It is not so dangerous to have arguments come from standard input but the script writer must still exercise care.
If the request is of type GET, the arguments are examined to see if they contain an '='. If they do, it is assumed that this is a CGI form response (something like name=John&toppings=pepperoni). In this case the script is executed with no arguments and the argument string is placed in an environment variable where the script can read it. Again this is fairly safe but the script writer must exercise care.
Finally if the GET request has arguments but no '=' it is assumed to be an ISINDEX type request and the script should be executed with the given arguments. While the CGI specification does not permit the altering of arguments, it does say that if the arguments pose any security problems it is permissible to put the string in an environment variable and execute the script without arguments, just as in the CGI forms case described above. GN takes a very strict position here and views any of the characters in the list above as a security problem requiring this action. It is quite possible that this will cause some scripts not to work with some user inputs, but this has not appeared to be a problem.
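The three cases just described can be summarized as a small decision function. This is a sketch under several assumptions, not GN's implementation: the environment variable is called QUERY_STRING as in the CGI specification (the text above does not name it), ISINDEX words are split on '+' and URL-decoded following the usual CGI convention, the character set is the list printed earlier, and plan_cgi_call is an invented name.

```python
from urllib.parse import unquote

# Characters treated as a security problem, copied from the list printed earlier.
DANGEROUS = set(";!`'|\\*?-~<>^()[]{}&$\r\n/")


def plan_cgi_call(method, query=""):
    """Describe how a script would be started: its arguments, extra environment, and stdin."""
    plan = {"args": [], "env": {}, "stdin": None}
    if method == "POST":
        # No arguments; the script reads the client's data from standard input.
        plan["stdin"] = "request body"
    elif "=" in query:
        # Form response: no arguments, the string goes into the environment.
        plan["env"]["QUERY_STRING"] = query
    elif query:
        # ISINDEX style request: pass the words as arguments only if all are harmless.
        words = [unquote(word) for word in query.split("+")]
        if any(ch in DANGEROUS for word in words for ch in word):
            plan["env"]["QUERY_STRING"] = query
        else:
            plan["args"] = words
    return plan
```

Only the last branch ever hands client input to the script as command line arguments, and only after every decoded character has been checked.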
Exercise care when writing scripts. If possible avoid /bin/sh scripts in favor of something like perl, or even better C. Anytime you get input from the client make sure it contains no funny characters. For example the perl lines
$Name =~ s/[^A-Za-z. ]//g;
$Phone =~ s/[^0-9()\- ]//g;
delete any characters except letters, '.' and spaces from $Name and any characters except digits, parentheses, '-', and space from $Phone.
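The same "keep only what you expect" approach carries over to other languages. As a hedged illustration, here is a Python analogue of the substitutions described above; the function names clean_name and clean_phone are made up for the example.

```python
import re


def clean_name(name):
    """Keep only letters, '.' and spaces."""
    return re.sub(r"[^A-Za-z. ]", "", name)


def clean_phone(phone):
    """Keep only digits, parentheses, '-' and spaces."""
    return re.sub(r"[^0-9()\- ]", "", phone)
```

Listing the characters you accept, rather than the ones you fear, means that a character you forgot about is dropped instead of passed along.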
If it is possible make sure that no untrusted user has write access to any part of your GN hierarchy. As mentioned above an attacker with write access to your hierarchy can create a .cache file which will give access to anything in your hierarchy. Even worse, she can create a shell script and a .cache file permitting it to be executed. A good principle to keep in mind is: Everyone with write access to any part of your data hierarchy has all the permissions of the userid under which your server runs!
Of course sgn is a special case when run as root. If you want to use sgn on one of the standard ports it must be run as root because only root can access these ports. The first thing the server does is open the necessary socket and immediately change its userid to the one set in the config.h file. It is then the permissions of this userid that are effectively transferred to a user with write access to your GN hierarchy. I do not recommend making sgn be setuid root. This would allow users without root access to start up the server on any privileged port. If individuals without root access need to be starting or stopping a server they can do so on a non-privileged port.
Sometimes it is not possible or desirable to deny write access to your GN hierarchy. For example, you may want to allow all users to have a subdirectory of the hierarchy in which to publish their "home pages". The "FORBID_EXEC" directive mentioned above may be a good idea in this case, to prevent any execution of scripts.
You should note that there is no way to use .access files to prevent users on your system with write access to the data hierarchy from gaining access to files you are serving. They can simply make a symbolic link in their part of the hierarchy to the file you want to restrict and a .cache file permitting it to be served. Since the server has access to the restricted file it will serve it if it is listed in a .cache file.
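If users do have their own subdirectories, a maintainer may at least want to see which symbolic links lead out of them. The following is a sketch for that purpose and not a GN feature; symlinks_leaving is a hypothetical helper, and it reports links only, so it will not notice files that were copied rather than linked.

```python
import os


def symlinks_leaving(subtree):
    """Return (link, resolved target) pairs for symbolic links under subtree that point outside it."""
    base = os.path.realpath(subtree)
    escapes = []
    # os.walk lists symbolic links without following links to directories.
    for dirpath, dirnames, filenames in os.walk(subtree):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                continue
            target = os.path.realpath(path)
            if os.path.commonpath([base, target]) != base:
                escapes.append((path, target))
    return escapes
```

Any pair it reports is a file the server can be made to serve from a user's directory, whatever .access restrictions apply at the file's real location.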
The most important thing to remember in this situation is the principle cited above. All users have some permissions and are denied others. Remember that any permissions you grant to the userid under which GN runs are also granted to every user who can create a .cache file in your data hierarchy.