Il CD di internet


Text File | 1994-10-11 | 30.7 KB | 809 lines

NCSA httpd tutorials

In an effort to make this documentation a bit more usable, we have written these tutorials on various aspects of server setup.

NCSA httpd directory indexing

NCSA httpd provides a directory indexing format similar to the one that will be offered by the WWW Common Library. To set up this indexing, follow these steps. For an example of what these indexes look like, take a look at the demo.

Activating Fancy indexing

You first need to tell httpd to use the advanced indexing instead of the simple version. Use the simple version if you prefer its simplicity, or if you are serving files from a remote file server, for which the stat() call would be costly.

You tell the server which version to use with either the IndexOptions directive or the older FancyIndexing directive. We recommend:

    IndexOptions FancyIndexing

Icons

NCSA httpd comes with a number of icons in the /icons subdirectory which are used for directory indexing. The first thing you should do is make sure your Server Resource Map has the following line in it:

    Alias /icons/ /usr/local/etc/httpd/icons/

Replace /usr/local/etc/httpd/ with whatever you set ServerRoot to be.

Next, you need to tell the server which icons to provide for different types of files. You do this with the AddIcon and AddIconByType directives. We recommend something like the following setup:

    AddIconByType (IMG,/icons/image.xbm) image/*
    AddIconByType (SND,/icons/sound.xbm) audio/*
    AddIconByType (TXT,/icons/text.xbm) text/*

This covers the three main types of files. If you want to add your own icons, simply create an appropriately sized xbm, place it in /icons, and choose a 3-letter ALT identifier for the type.

httpd also requires three special icons: one for directories, one blank icon the same size as the other icons, and one for the parent directory of this index.
To use the icons in the distribution, use the following lines in srm.conf:

    AddIcon /icons/menu.xbm ^^DIRECTORY^^
    AddIcon /icons/blank.xbm ^^BLANKICON^^
    AddIcon /icons/back.xbm ..

However, not all files fit one of these types. To provide a general icon for any unknown files, use the DefaultIcon directive:

    DefaultIcon /icons/unknown.xbm

Descriptions

If you want to add descriptions to your files, use the AddDescription directive. For instance, to add the description "My pictures" to /usr6/rob/public_html/images, use the following line:

    AddDescription "My pictures" /usr6/rob/public_html/images/*

If you want the titles of your HTML documents displayed as their descriptions, use the IndexOptions directive to activate ScanHTMLTitles:

    IndexOptions FancyIndexing ScanHTMLTitles

WARNING: You should only use this option if your server has time to spare! This is a costly operation!

Ignoring certain items

Generally, you don't want httpd sending references to certain files when it's creating indexes. Such files include emacs autosave and backup files, httpd's .htaccess files, and perhaps any file beginning with . (if you have a gopher or FTP server running in that directory as well). We recommend you ignore the following patterns:

    IndexIgnore */.??* */README* */HEADER*

This tells httpd to ignore any file beginning with ., and any file starting with README or HEADER.

Creating READMEs and HEADERs

When httpd is indexing a directory, it will look for two things and insert them into the index: a header and a README. Generally, the header contains an HTML <H1> tag with a title for this index, and a brief description of what's in this directory. The README contains things you may want people to read about the items being served.

httpd will look for both plaintext and HTML versions of HEADERs and READMEs. Suppose we add the following lines to srm.conf:

    ReadmeName README
    HeaderName HEADER

Then, when httpd is indexing a directory, it will first look for HEADER.html.
If it doesn't find that file, it will look for HEADER. If it finds neither, it will generate its own. If it finds one, it will insert it at the beginning of the index. Similarly, the server will look for README.html, then README, to insert a trailer for the document.

Setting up CGI in NCSA httpd

CGI scripts are a way for documents to be generated on the fly. You should first read the brief introduction to CGI to learn what it is and why you would want to use it.

There are two main mechanisms to tell NCSA httpd where your scripts are. Each has its pluses and its minuses.

ScriptAlias

The first approach is based on the Server Resource Map directive ScriptAlias. With this directive, you tell the server that you want to designate a directory (or directories) as script-only; that is, any time the server tries to retrieve a file from these directories it will execute the file instead of reading it.

The usual setup is to have the following line in srm.conf:

    ScriptAlias /cgi-bin/ cgi-bin/

This will make any request to the server which begins with /cgi-bin/ be fulfilled by executing the corresponding program in ServerRoot/cgi-bin/.

You may have more than one ScriptAlias directive in srm.conf to designate different directories as CGI.

The advantage of this setup is ease of administration and centralization. Many system managers don't want things as dangerous as scripts anywhere in the filesystem. The disadvantage is that anyone wishing to create scripts must either have their own entry in srm.conf or must have write access to a ScriptAliased directory.

CGI as files

NCSA httpd 1.2 allows you to create CGI scripts anywhere, by specifying a "magic" MIME type for files which tells the server to execute them instead of sending them. To accomplish this, use the AddType directive in either the Server Resource Map or in a per-directory access control file.
For instance, to make all files ending in .cgi scripts, use the following directive:

    AddType application/x-httpd-cgi .cgi

Alternatively, you could add .sh and .pl after .cgi to allow automatic execution of shell scripts and Perl scripts.

Note that you have to have Options ExecCGI activated in the directory where you create scripts. (You might want to read more about directives like Options in the docs for installation.)

The advantage of this setup is that scripts may be absolutely anywhere. The disadvantage is that scripts may be absolutely anywhere (especially places you don't want them to be, like users' home directories).

NCSA httpd server side includes

NCSA httpd allows users to create documents which provide simple information to clients on the fly. Such information can include the current date, the file's last modification date, and the size or last modification date of other files. In its more advanced usage, it can provide a powerful interface to CGI and /bin/sh programs.

Issues

Having the server parse documents is a double-edged sword. It can be costly for heavily loaded servers to parse files while sending them. Further, it can be considered a security risk to have average users executing commands as the server's User. If you disable the exec option, this danger is mitigated, but the performance issue remains. You should consider these items carefully before activating server-side includes on your server.

Setting Up Includes

First, you should decide which directories you want to allow includes in. Most likely this will not include users' home directories or directories you do not trust. You should then decide, of the directories you are allowing includes in, which directories are safe enough to use exec in.

For the directories in which you want to fully enable includes, use the Options directive to turn on the option Includes. Similarly, for the directories where you want crippled (no exec) includes, use the option IncludesNOEXEC.
In any directory where you want to disable includes, use the Options directive without either option.

Next, you need to tell the server what filename extension you are using for the parsed files. These files, while very similar to HTML, are not HTML and are thus not treated the same. Internally, the server uses the magic MIME type text/x-server-parsed-html to identify parsed documents. It will then perform a format conversion to change these files into HTML for the client.

To tell the server which extension you want to use for parsed files, use the AddType directive. For instance:

    AddType text/x-server-parsed-html .shtml

This makes any file ending with .shtml a parsed file. Alternatively, if you don't care about the performance hit of having all .html files parsed, you could use:

    AddType text/x-server-parsed-html .html

This would make the server parse all .html files.

Converting your old INC SRV documents to the new format

You should use the program inc2shtml in the support subdirectory of the httpd distribution to translate your documents from httpd 1.1 and earlier to the new format. Usage is simple:

    inc2shtml file.html > file.shtml

The Include format

All directives to the server are formatted as SGML comments within the document. This is in case the document should ever find itself in the client's hands unparsed. Each directive has the following format:

    <!--#command tag1="value1" tag2="value2" -->

Each command takes different arguments; most accept only one tag at a time. Here is a breakdown of the commands and their associated tags:

config

The config directive controls various aspects of the file parsing. There are three valid tags:

  errmsg controls what message is sent back to the client if an error occurs while parsing the document. When an error occurs, it is logged in the server's error log.

  timefmt gives the server a new format to use when providing dates. This is a string compatible with the strftime library call under most versions of UNIX.
  sizefmt determines the formatting to be used when displaying the size of a file. Valid choices are bytes, for a formatted byte count (formatted as 1,234,567), or abbrev, for an abbreviated version displaying the number of kilobytes or megabytes the file occupies.

include

include will insert the text of a document into the parsed document. Any included file is subject to the usual access control. This command accepts two tags:

  virtual gives a virtual path to a document on the server. You must access a normal file this way; you cannot access a CGI script in this fashion. You can, however, access another parsed document.

  file gives a pathname relative to the current directory. ../ cannot be used in this pathname, nor can absolute paths be used. As above, you can send other parsed documents, but you cannot send CGI scripts.

echo

echo prints the value of one of the include variables (defined below). Any dates are printed subject to the currently configured timefmt. The only valid tag to this command is var, whose value is the name of the variable you wish to echo.

fsize

fsize prints the size of the specified file. Valid tags are the same as with the include command. The resulting format of this command is subject to the sizefmt parameter to the config command.

flastmod

flastmod prints the last modification date of the specified file, subject to the formatting preference given by the timefmt parameter to config. Valid tags are the same as with the include command.

exec

exec executes a given shell command or CGI script. It must be activated to be used. Valid tags are:

  cmd will execute the given string using /bin/sh. All of the variables defined below are defined, and can be used in the command.

  cgi will execute the given virtual path to a CGI script and include its output. The server does not perform error checking to make sure your script didn't output horrible things like a GIF, so be careful. It will, however, interpret any URL Location: header and translate it into an HTML anchor.
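
The effect of the sizefmt and timefmt tags of the config command can be approximated in a few lines of Python. This is only a sketch: the function names are hypothetical, and the thresholds and rounding used for abbrev are assumptions, since the text above says only that it shows kilobytes or megabytes.

```python
import time


def format_size(num_bytes, sizefmt="abbrev"):
    """Render a file size in the spirit of the sizefmt tag (hypothetical helper)."""
    if sizefmt == "bytes":
        # Formatted byte count with thousands separators, e.g. 1,234,567.
        return f"{num_bytes:,}"
    if sizefmt == "abbrev":
        # Assumed rules: whole kilobytes below one megabyte, whole megabytes above.
        if num_bytes < 1024 * 1024:
            return f"{num_bytes // 1024}K"
        return f"{num_bytes // (1024 * 1024)}M"
    raise ValueError(f"unknown sizefmt: {sizefmt!r}")


def format_date(epoch_seconds, timefmt):
    """Render a timestamp (in GMT) using a strftime-compatible timefmt string."""
    return time.strftime(timefmt, time.gmtime(epoch_seconds))


if __name__ == "__main__":
    print(format_size(1234567, "bytes"))
    print(format_size(1234567, "abbrev"))
    print(format_date(0, "%Y-%m-%d"))
```

Because timefmt is handed to a strftime-style formatter, any conversion specifier that the platform's strftime understands should be usable in it.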
Variables defined for Parsed Documents

A number of variables are made available to parsed documents. In addition to the CGI variable set, the following variables are made available:

  DOCUMENT_NAME: The current filename.
  DOCUMENT_URI: The virtual path to this document (such as /~robm/foo.shtml).
  QUERY_STRING_UNESCAPED: The unescaped version of any search query the client sent, with all shell-special characters escaped with \.
  DATE_LOCAL: The current date, local time zone. Subject to the timefmt parameter to the config command.
  DATE_GMT: Same as DATE_LOCAL but in Greenwich mean time.
  LAST_MODIFIED: The last modification date of the current document. Subject to timefmt like the others.

Making your setup more secure

When configuring the access control for your server, you will want to make sure you do not give unauthorized access to anyone. Please follow these guidelines to ensure that your server is not compromised.

A word of caution on DNS based access control and user authentication

The access control by hostname and Basic user authentication facilities provided by httpd are relatively safe, but not bulletproof. The user authentication sends passwords across the network in plaintext, making them easily readable. The DNS based access control is only as safe as DNS, so you should keep that in mind when using it. Bottom line: If it absolutely positively cannot be seen by outside people, you probably should not use httpd to protect it.

Disable Server-side includes wherever possible

Whenever you can, use the Options directive to disable server-side includes. At the very least, you should disable the exec feature. Note that because the default value of Options is All, you should include an Options directive in every Directory clause in your global ACF and in every .htaccess file you write.
Use AllowOverride None wherever possible

Use this directive to prevent any "untrusted" directories (such as users' home directories) from overriding your settings (and thus allowing their friends to execute xterms as nobody with a server-side include, or other such horrors). You also gain a bonus in performance.

Protect your users' home directories

Protect your users' home directories with Directory directives. If your users all have their home directories in one physical location (such as /home), then this is easy:

    <Directory /home>
    AllowOverride None
    Options Indexes
    </Directory>

If they are not all in one location such as /home, then you should use this wildcard pattern to secure them (assuming your UserDir is set to public_html):

    <Directory /*/public_html*>
    AllowOverride None
    Options Indexes
    </Directory>

In addition, if you wish to give your users the ability to create symbolic links to things only they own, use the option SymLinksIfOwnerMatch.

Mosaic User Authentication Tutorial

Introduction

This tutorial surveys the current methods in NCSA Mosaic for X version 2.0 and NCSA httpd for restricting access to documents. The tutorial also walks through setup and use of these methods.

Mosaic 2.0 and NCSA httpd allow access restriction based on several criteria:

  Username/password-level access authorization.
  Rejection or acceptance of connections based on Internet address of client.
  A combination of the above two methods.

This tutorial is based heavily on work done by Ari Luotonen at CERN and Rob McCool at NCSA. In particular, Ari wrote the client-side code currently in Mosaic 2.0, and Rob wrote NCSA httpd.

Getting Started

Before you can explore access authorization, you need to install NCSA httpd 1.0a5 or later on a Unix machine under your control, or get write access to one or more directories in a filespace already being served by NCSA httpd.
You also need to be running Mosaic for X version 2.0 or later, or another browser known to support HTTP/1.0-based authentication.

Prepared Examples

Following are several examples of the range of access authorization capabilities available through Mosaic and NCSA httpd. The examples are served from a system at NCSA.

Simple protection by password. This document is accessible only to user fido with password bones.

Important Note: There is no correspondence between usernames and passwords on specific Unix systems (e.g. in an /etc/passwd file) and usernames and passwords in the authentication schemes we're discussing for use in the Web. As illustrated in the examples, Web-based authentication uses similar but wholly distinct password files; a user need never have an actual account on a given Unix system in order to be validated for access to files being served from that system and protected with HTTP-based authentication.

Protection by password; multiple users allowed. This document is accessible to user rover with password bacon and user jumpy with password kibbles.

Protection by network domain. This document is only accessible to clients running on machines inside domain ncsa.uiuc.edu.

Note for non-NCSA readers: The .htaccess file used in this case is as follows:

    AuthUserFile /dev/null
    AuthGroupFile /dev/null
    AuthName ExampleAllowFromNCSA
    AuthType Basic

    <Limit GET>
    order deny,allow
    deny from all
    allow from .ncsa.uiuc.edu
    </Limit>

Protection by network domain -- exclusion. This document is accessible to clients running on machines anywhere but inside domain ncsa.uiuc.edu.

Note for NCSA readers: The .htaccess file used in this case is as follows:

    AuthUserFile /dev/null
    AuthGroupFile /dev/null
    AuthName ExampleDenyFromNCSA
    AuthType Basic

    <Limit GET>
    order allow,deny
    allow from all
    deny from .ncsa.uiuc.edu
    </Limit>

General Information

There are two levels at which authentication can work: per-server and per-directory.
This tutorial primarily covers per-directory authentication.

Per-directory authentication means that users with write access to part of the filesystem that is being served can control access to their files as they wish. They need not have root access on the system or write access to the server's primary config files.

Access control for a given directory is governed by a file named .htaccess that resides in that directory. The server reads this file on each access to a document in that directory (or to documents in subdirectories).

By-Password Authentication: Step By Step

So let's suppose you want to restrict files in a directory called turkey to username pumpkin and password pie. Here's what to do:

1. Create a file called .htaccess in directory turkey that looks like this:

    AuthUserFile /otherdir/.htpasswd
    AuthGroupFile /dev/null
    AuthName ByPassword
    AuthType Basic

    <Limit GET>
    require user pumpkin
    </Limit>

   Note that the password file will be in another directory (/otherdir). Also note that in this case there is no group file, so we specify /dev/null (the standard Unix way to say "this file doesn't exist"). AuthName can be anything you want. AuthType should currently always be Basic.

2. Create the password file /otherdir/.htpasswd. The easiest way to do this is to use the htpasswd program distributed with NCSA httpd. Do this:

    htpasswd -c /otherdir/.htpasswd pumpkin

   Type the password -- pie -- twice as instructed. Check the resulting file to get a warm feeling of self-satisfaction; it should look like this:

    pumpkin:y1ia3tjWkhCK2

That's all. Now try to access a file in directory turkey -- Mosaic should demand a username and password, and not give you access to the file if you don't enter pumpkin and pie. If you are using a browser that doesn't handle authentication, you will not be able to access the document at all.

How Secure Is It?

The password is passed over the network not encrypted but not as plain text either -- it is "uuencoded" (the Basic scheme uses the closely related encoding now known as base64).
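
As a rough illustration of why this encoding offers no secrecy, the following Python sketch builds and then reverses a Basic credential. It assumes the standard HTTP Basic scheme (base64 of username:password); the function names are hypothetical, and the pumpkin/pie pair comes from the step-by-step example.

```python
import base64


def encode_basic(username, password):
    """Build the credential string a browser would send (hypothetical helper)."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def decode_basic(credential):
    """Recover (username, password) from a captured credential string."""
    scheme, _, token = credential.partition(" ")
    if scheme != "Basic":
        raise ValueError("not a Basic credential")
    decoded = base64.b64decode(token).decode("utf-8")
    # Split on the first colon only, so passwords may themselves contain colons.
    username, _, password = decoded.partition(":")
    return username, password


if __name__ == "__main__":
    captured = encode_basic("pumpkin", "pie")
    print(captured)
    print(decode_basic(captured))
```

No key or secret is involved at any point, which is why anyone who captures the credential string can recover the password.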
Anyone watching packet traffic on the network will not see the password in the clear, but the password will be easily decoded by anyone who happens to catch the right network packet.

So basically this method of authentication is roughly as safe as telnet-style username and password security -- if you trust your machine to be on the Internet, open to attempts to telnet in by anyone who wants to try, then you have no reason not to trust this method also.

Multiple Usernames/Passwords

If you want to give access to a directory to more than one username/password pair, follow the same steps as for a single username/password with the following additions:

1. Add additional users to the directory's .htpasswd file. Use the htpasswd command without the -c flag to add additional users; e.g.:

    htpasswd /otherdir/.htpasswd peanuts
    htpasswd /otherdir/.htpasswd almonds
    htpasswd /otherdir/.htpasswd walnuts

2. Create a group file. Call it /otherdir/.htgroup and have it look something like this:

    my-users: pumpkin peanuts almonds walnuts

   ... where pumpkin, peanuts, almonds, and walnuts are the usernames.

3. Then modify the .htaccess file in the directory to look like this:

    AuthUserFile /otherdir/.htpasswd
    AuthGroupFile /otherdir/.htgroup
    AuthName ByPassword
    AuthType Basic

    <Limit GET>
    require group my-users
    </Limit>

   Note that AuthGroupFile now points to your group file and that group my-users (rather than individual user pumpkin) is now required for access.

That's it. Now any user in group my-users can use his/her individual username and password to gain access to directory turkey.

CERN has extensive documents on HTTP-based authentication. The URL is http://info.cern.ch/hypertext/WWW/AccessAuthorization/Overview.html.

Graphical Information Map Tutorial

Introduction

This document is a step-by-step tutorial for designing and serving graphical maps of information resources.
Through such a map, users can be provided with a graphical overview of any set of information resources; by clicking on different parts of the overview image, they can transparently access any of the information resources (possibly spread out all across the Internet).

First Steps

This tutorial assumes use of NCSA httpd (version 1.0a5 or later). Some other servers (e.g. Plexus) can also serve image maps, in server-specific ways; see the specific server's docs for more information.

Make sure you have a working NCSA httpd server installed and running.

Make sure you have write privileges to the server's conf/imagemap.conf config file. Also make sure that the imagemap program is compiled and in the server's htbin directory.

This tutorial also assumes use of NCSA Mosaic for X version 2.0. Other clients that support inlined GIF images and HTTP/1.0 URL redirection will also work.

Your First Image Map

In this section we walk through the steps needed to get an initial image map up and running.

First, create an image. There are a number of image creation and editing programs that will work nicely -- the one I use is called xpaint (you can find it on ftp.x.org in /R5contrib). The important thing is that the image ends up in GIF format.

A common scheme for an image map is a collection of rectangles and circles, each containing a short text description of some piece of information or some information server; interconnections are conveyed through lines or arcs. Try to keep the individual items in the map spaced out far enough that a user will clearly know what he or she is clicking on.

Second, create an image map file. Here is what an image map file looks like:

    default /X11/mosaic/public/none.html
    rect http://cui_www.unige.ch/w3catalog 15,8 135,39
    rect gopher://rs5.loc.gov/11/global 245,86 504,143
    rect http://nearnet.gnn.com/GNN-ORA.html 117,122 175,158

The format is fairly straightforward.
The first line specifies the default response (the file to be returned if the region of the image in which the user clicks doesn't correspond to anything).

Subsequent lines specify rectangles in the image that correspond to arbitrary URLs -- for the first of these lines, the rectangle specified by 15,8 (x,y of the upper-left corner, in pixels) and 135,39 (lower-right corner) corresponds to URL http://cui_www.unige.ch/w3catalog.

So, what you need to do is find the upper-left and lower-right corners of a rectangle for each information resource in your image map. A good tool to use for this is xv (also on ftp.x.org in /contrib) -- pop up the Info window and draw rectangles over the image with the middle mouse button.

It doesn't matter where you put your map file or what you name it. For the purposes of this example, let's assume it's called /foo/sample.map.

Third, tell your server about your image map file. You do this by adding a line to the server's conf/imagemap.conf file. The line looks like this:

    sample : /foo/sample.map

... where sample is the symbolic name for your image map and /foo/sample.map is the actual name of your map file.

Fourth, create an HTML document that contains your map image. An example follows:

    Click on the information resource you wish to see:
    <P>
    <A HREF="http://machine/htbin/imagemap/sample">
    <img src="sample.gif" ismap>
    </A>
    <P>

Note:

  machine is the name of the machine on which your HTTP server resides.
  sample is the symbolic name of your image map (from above).
  sample.gif is the name of your image (assuming, of course, that it's in the same directory on your server as the HTML file).

Fifth, try it out! Load the HTML file, look at the inlined image, click somewhere, and see what happens.

Subsequent Image Maps

You can serve as many image maps from a single server as you want. Just add lines to conf/imagemap.conf pointing to each image map file you create.
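
The lookup that the map file format implies can be sketched in Python as follows. This is not the imagemap program itself: the function names are hypothetical, only default and rect lines are handled, and treating the rectangle edges as inclusive and returning the first matching rectangle are assumptions.

```python
SAMPLE_MAP = """\
default /X11/mosaic/public/none.html
rect http://cui_www.unige.ch/w3catalog 15,8 135,39
rect gopher://rs5.loc.gov/11/global 245,86 504,143
rect http://nearnet.gnn.com/GNN-ORA.html 117,122 175,158
"""


def parse_map(text):
    """Return (default_url, rects), each rect being (url, x1, y1, x2, y2)."""
    default_url = None
    rects = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if fields[0] == "default":
            default_url = fields[1]
        elif fields[0] == "rect":
            x1, y1 = (int(v) for v in fields[2].split(","))
            x2, y2 = (int(v) for v in fields[3].split(","))
            rects.append((fields[1], x1, y1, x2, y2))
    return default_url, rects


def resolve_click(default_url, rects, x, y):
    """Return the URL of the first rectangle containing (x, y), else the default."""
    for url, x1, y1, x2, y2 in rects:
        if x1 <= x <= x2 and y1 <= y <= y2:  # inclusive edges are an assumption
            return url
    return default_url


if __name__ == "__main__":
    default_url, rects = parse_map(SAMPLE_MAP)
    print(resolve_click(default_url, rects, 20, 10))
    print(resolve_click(default_url, rects, 0, 0))
```

A click that falls outside every rectangle returns the default entry, which matches the role the first line of the map file plays in the description above.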
Real-World Examples

Following are examples of distributed image maps on servers in the real world; they may or may not work at any point in time. The URL for each is provided.

  Experimental Internet Resources Metamap
    http://www.ncsa.uiuc.edu/SDG/Software/Mosaic/Demo/metamap.html
  University of California Museum of Paleontology
    http://ucmp1.berkeley.edu/
  National Institute of Standards and Technology
    http://www.nist.gov/welcome.html
  server map at NCHPC information server
    http://info.lcs.mit.edu/Info/structure.html

WAIS and HTTP Integration

Introduction

This document gives an overview of existing methods for using WAIS as a back-end search engine for HTTP servers. Information herein is currently experimental and may or may not work for you.

WAIS and Plexus

Plexus is a powerful Perl-based HTTP server written and maintained by Tony Sanders at BSDI. The URLs you might be interested in are:

    http://www.bsdi.com/server/doc/plexus.html
    http://www.cs.cmu.edu:8001/Web/People/rgs/perl.html

WAIS and GN

GN is a multi-protocol server written and maintained by John Franks at NWU. It is shipped with support for WAIS as a back-end search engine. The URLs you might be interested in are:

    http://hopf.math.nwu.edu/
    http://hopf.math.nwu.edu:70/0h/docs/waisgn.guide

WAIS and NCSA httpd 1.0

Rob McCool has written a CGI script which allows NCSA httpd 1.0, as well as other CGI compliant servers, to access a WAIS database in the same way that is mentioned in this document. The script is in the CGI archive. It contains instructions for setting it up under httpd 1.0. The URL you might be interested in is:

    ftp://ftp.ncsa.uiuc.edu/Web/ncsa_httpd/cgi/wais.tar.Z

freeWAIS 0.202's URL Type

freeWAIS 0.202 is shipped with support for type "URL". Use of this type is a little tricky. First, Mosaic 2.0 doesn't know how to deal with this type directly, but Mosaic 2.1 (when it is released) will. Second, use of this type apparently implies overloading the "headline" of a WAIS hit with the URL.
This is fine, except then the description that the user sees of a given document is the URL, and URLs are, as usual, pretty cryptic things to just throw in front of average users.

But anyway, here's how it works:

    waisindex ... -t URL what-to-trim what-to-add ...

So what does that mean? Well, first, -t URL tells waisindex to use type URL (note use of lowercase -t in this instance). Second, what-to-trim and what-to-add are parameters that tell the indexer how to put together the URL that's returned as the result of a query.

Suppose your documents are normally stored in /X11/mosaic/public. Suppose also that these documents are normally served via a URL that begins with http://wintermute.ncsa.uiuc.edu:8080. This means that a file stored as /X11/mosaic/public/foo.html, for example, is normally served as http://wintermute.ncsa.uiuc.edu:8080/foo.html.

The waisindex command you'd use in this case would be something like the following:

    waisindex -d ~/localwais/sources/www -export -t URL /X11/mosaic/public http://wintermute.ncsa.uiuc.edu:8080 /X11/mosaic/public/*.html

... where ~/localwais/sources/www is the name of the WAIS index file and /X11/mosaic/public/*.html are the files you are indexing.

When queries are made on this database, the string /X11/mosaic/public is removed from the beginning of the filename of a matching file and the string http://wintermute.ncsa.uiuc.edu:8080 is put in its place. As per our previous example, /X11/mosaic/public/foo.html turns into http://wintermute.ncsa.uiuc.edu:8080/foo.html as the result of a WAIS hit.

As you can see, this is perfect -- the WAIS server passes back the exact same URL that would normally be used to access this file via HTTP. So, everything from relative hyperlinks to relative inlined image references in the file will work correctly when the file is retrieved.
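
The trim-and-add rewriting described above amounts to a prefix substitution, sketched here in Python. The function name is hypothetical, and leaving a filename unchanged when it does not start with what-to-trim is an assumption; the text does not say what waisindex does in that case.

```python
def rewrite_headline(filename, what_to_trim, what_to_add):
    """Replace the leading what_to_trim of filename with what_to_add."""
    if not filename.startswith(what_to_trim):
        # Assumption: filenames outside the trimmed prefix are returned as-is.
        return filename
    return what_to_add + filename[len(what_to_trim):]


if __name__ == "__main__":
    print(rewrite_headline(
        "/X11/mosaic/public/foo.html",
        "/X11/mosaic/public",
        "http://wintermute.ncsa.uiuc.edu:8080",
    ))
```

Note that what-to-trim is given without a trailing slash, so the slash that remains on the filename becomes the first character of the URL path.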