Authorizing Users to Access Protected Content

A main ingredient in Web commerce sites that sell access is user authentication, sometimes better known as "password protection." A Web program is used to prevent unauthorized access to a page, all of the pages within a directory, or even an entire site. In order to gain access the user must provide a name and password. If the name and password match the information stored in a password file, the user is allowed in. If the name and password don't match, the user is tossed out, and access is denied.

Most server software includes its own password scheme, which is designed to work in concert with the user authentication feature built into most browsers. With this feature, a pop-up box appears that asks for the user name and password. Software already provided by the server determines if the user name and password match what is stored in a password file. If there's a match, the server allows access. If there isn't a match, the server displays an error page.

While no extra Web programs are typically required for user authentication, some Web administrators prefer their own password protection scheme. The usual approach for server-provided user authentication restricts access to specific directories at a site. But some applications require users to be able to access certain files in the directory, and not others. This is where a Web program specifically engineered to allow access to certain pages comes in handy. In this scenario, the server does not provide the page unless the proper password is given.

Considering Web Security Needs

Web security is too subjective for absolute statements, so the extent of your security measures should depend on the sensitivity of the data you are trying to protect and the likelihood of someone wanting it. For example, if your site collects credit card numbers for access, you will want extreme security measures, both for your server in general and for your Web site.

Obviously, you don't want someone sneaking into your server and stealing all the credit card numbers off your hard drive. But someone doesn't even have to do that to breach your data. Consider the following: Joe Doakes uses his credit card to get access to your site. He's given a password for that access. You also allow Joe to upgrade his account status, and since you have his credit card number there, he can view his account, change it, and so forth.

Now imagine I. M. Hacker comes to your site, sees a list of your clients in a password file, then goes about trying to luck onto a password by trying simple word lists. Suppose Joe Doakes was naive enough to use a password like "mommy," and Mr. Hacker finds it after just a few minutes of trying. Now the bad guy has access to your site in the guise of Joe Doakes, which means he can also access that account management page. Because of the lax security at your site, Mr. Hacker now has Mr. Doakes' credit card number and other personal information, and can go on a buying spree.

The above is an extreme example that has a lot of "what ifs," but it's typical of the scenarios you have to go over to determine the level of security you need for your site. You will want to consult with your Web host administrator to see what options are available to you, and if more secured access will cost you more. You can then weigh the benefits of whatever technique you use with the security requirements you have.

Using .htaccess and .htpasswd Files

UNIX Web servers that use the Apache, NCSA, or Netscape software provide a built-in mechanism for user authentication. The good news is that this mechanism does not require any programming on your part. The bad news is that not all Web hosts allow their clients to make use of the user authentication system. If your Web host disallows access to the server's built-in user authentication features, you should seriously consider finding a new Web host.

Note: NT Web servers, and UNIX Web servers that run other software, may also provide a method for implementing user authentication. Unfortunately, there are literally dozens of methods in current use, with many variations. It is not practical to cover all of the various user authentication methods that are available. The UNIX approach shown here is employed in the vast majority of Web hosts around the world, and is the one you are most likely to encounter. If your Web host uses a different method, contact the Web host administrator for details on implementing user authentication.

User authentication under UNIX centers around two files: the access file, and the password file.

The access file is stored in the directory you want to protect. Anyone visiting that directory is required to provide a valid user name and password. The typical access file name is .htaccess. This file can be created using any text editor.
The password file is typically stored in a "safe" directory of your server, preferably not in the same directory as the one you want to protect. This file, typically named .htpasswd, contains all the names and passwords of users who are allowed access. This file must be generated on the server, using an appropriate password-making program.

The .htaccess file serves an important role, because its contents regulate how directories are protected. The .htaccess file uses directives, which are like commands that instruct the Web server how to employ its user authentication system.

Below are more detailed discussions of the .htaccess and .htpasswd files.

Access (.htaccess) File

The access file, typically named .htaccess, provides the basic instructions for allowing access to a given directory, as well as all subdirectories under that directory. When used for user authentication, the access file contains the following basic syntax:

  AuthUserFile /disk02/.htpasswd 
  AuthGroupFile /dev/null
  AuthName Name goes_here
  AuthType Basic

  <Limit GET>
  require valid-user
  </Limit>

Note: the .htaccess access file is actually used for many more functions than user authentication. For example, it is also used to change the visual appearance of directory listings and modify the contents of error messages returned by the Web server. For our application though, we will limit the discussion of the .htaccess access file to user authentication only.

Following is a line-by-line description of the contents of the .htaccess file.

AuthUserFile /disk02/.htpasswd -- Points to the file that contains the user names and passwords of those allowed access. The filename must be provided with an absolute path.
AuthGroupFile /dev/null -- Points to the file that contains the group names of those allowed access. Group files are seldom used in Web commerce sites. The file pointed to is actually an empty "bit bucket" file.
AuthName Name goes_here -- Specifies the "realm" for the protected directory. Every protected area has a unique realm.
AuthType Basic -- Specifies the user authentication system; in this case, Basic.
<Limit GET> -- Specifies the protection block tag. Includes the server method (GET) that is protected. Other server methods (POST, PUT, etc.) can also be included, but GET is the most commonly used.
require valid-user -- Specifies that only validated users can be granted access.
</Limit> -- Ends the protection block tag.

In addition to the require directive within the <Limit> tag, a number of other directives can be included, which you may find helpful. The allow and deny directives affect which hosts can access a given directory, and can be used when you want to grant or deny access to a class of users, rather than (or in addition to) individual user names.

Option	What it does
all	all hosts are allowed/disallowed access
domain name	Domain name of a host; e.g. mysite.com
IP address	IP address of a host; e.g. 255.255.255.255

Partial domain names and IP addresses are also permitted. Example:

   allow from .machine.mysite.com

All hosts in the specified domain are allowed access.

   deny from .mysite.com

All hosts in the specified domain are denied access.

You can also use the order directive, to specifically control the order in which allow and deny directives are evaluated.

Option	What it does
deny,allow	Deny directives are evaluated before the allow directives.
allow,deny	Allow directives are evaluated before the deny directives.
mutual-failure	Only hosts that appear on allow list, but not deny list, are granted access.

For example:

  <Limit GET>
  order allow,deny
  allow from mysite.com
  deny from yoursite.com anothersite.com
  require valid-user
  </Limit>

This example first checks for all allowed domains, in this case mysite.com. Then it checks for all disallowed domains, in this case yoursite.com as well as anothersite.com. Note that there cannot be a space between the two keywords of the order directive. That is, "allow,deny" is acceptable, but "allow, deny" is not.
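The evaluation order can be easier to follow as code. The following is only a sketch of the documented behavior of the order directive, not the server's actual implementation. The function names are hypothetical, only "all" and (partial) domain names are handled, and the require valid-user check is left out.

```javascript
// Hypothetical helper: does a host name match one allow/deny pattern?
// Only "all" and (partial) domain names are handled; IP addresses are omitted.
function hostMatches(host, pattern) {
  if (pattern === "all") return true;
  var domain = pattern.replace(/^\./, "").toLowerCase();
  host = host.toLowerCase();
  // Match the domain itself, or any host name ending in ".domain"
  return host === domain || host.slice(-(domain.length + 1)) === "." + domain;
}

function matchesAny(host, patterns) {
  for (var i = 0; i < patterns.length; i++) {
    if (hostMatches(host, patterns[i])) return true;
  }
  return false;
}

// order is "allow,deny", "deny,allow" or "mutual-failure"
function hostAllowed(host, order, allowList, denyList) {
  var allowed = matchesAny(host, allowList);
  var denied = matchesAny(host, denyList);
  if (order === "deny,allow") {
    // Deny is evaluated first, so a matching allow overrides it,
    // and hosts that match neither list are let in.
    return allowed || !denied;
  }
  // "allow,deny": allow is evaluated first, a matching deny overrides it,
  // and hosts that match neither list are refused.
  // "mutual-failure" gives the same result: on the allow list, not on the deny list.
  return allowed && !denied;
}

// The example above: order allow,deny
var allowList = ["mysite.com"];
var denyList = ["yoursite.com", "anothersite.com"];
var fromMySite = hostAllowed("www.mysite.com", "allow,deny", allowList, denyList);
var fromYourSite = hostAllowed("www.yoursite.com", "allow,deny", allowList, denyList);
```

Under these assumptions, a host at mysite.com passes the host check and a host at yoursite.com does not. A host that appears on neither list is refused with allow,deny but admitted with deny,allow, which is the main practical difference between the two orders.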

AuthUserFile (.htpasswd file)

The AuthUserFile directive sets the name of a file containing a list of users and passwords for user authentication. The filename parameter must be an absolute path to the AuthUserFile. The default AuthUserFile filename is .htpasswd, but it can be any valid filename. Under UNIX the leading period makes the file not visible in a typical directory listing.

See "Creating a .htpasswd File," later in this chapter, for details on creating the user names and passwords in a .htpasswd file.

The format of the AuthUserFile is:

  username1:password1
  username2:password2
  username3:password3

and so forth. Each username/password pair is on a separate line, and each username is separated from its password by a colon. The actual password is encrypted; that is, the plain-text password does not show up in the AuthUserFile, but rather the encrypted version. This version uses an encryption algorithm that is extremely difficult to crack, so anyone getting a copy of the AuthUserFile will not necessarily be able to glean any useful passwords from it.

The Apache documentation for the AuthUserFile stipulates that the behavior is "undefined" if the server encounters multiple instances of the same user name. Usually, the server will pick the first user name it finds, but in some cases a server error can result. Therefore, it is important to ensure that each user name occurs only once in an AuthUserFile.
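If you maintain the password file by hand, a small script can check it for repeated user names. The following is a minimal sketch, assuming the file's text has already been read into a string; the function name and the sample entries (including the stand-in "encrypted" values) are made up for illustration.

```javascript
// Parse the text of a password file and report user names that appear twice.
function parseUserFile(text) {
  var users = {};       // user name -> encrypted password (first occurrence kept)
  var duplicates = [];  // user names seen more than once
  var lines = text.split(/\r?\n/);
  for (var i = 0; i < lines.length; i++) {
    var line = lines[i];
    var colon = line.indexOf(":");
    if (colon < 1) continue;  // blank line, or not a username:password pair
    var name = line.substring(0, colon);
    var crypted = line.substring(colon + 1);
    if (Object.prototype.hasOwnProperty.call(users, name)) {
      if (duplicates.indexOf(name) === -1) duplicates.push(name);
    } else {
      users[name] = crypted;
    }
  }
  return { users: users, duplicates: duplicates };
}

// Made-up sample: "fred" appears twice
var sample = "fred:XXencrypted1\nellen:XXencrypted2\nfred:XXencrypted3\n";
var result = parseUserFile(sample);
```

Here result.duplicates holds the single name "fred", which tells you the file needs cleaning up before it goes on the server.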

Note that the AuthUserFile is a "flat" text database, and is therefore not very efficient if there are lots of username/password pairs. You can reliably use the AuthUserFile if you have up to roughly 500 to 750 username/password pairs. With more entries in the file, consider using the DBM authorization system supported by most UNIX-based Web servers. However, not all Web hosts support the DBM authorization module. Check with your Web host administrator if you need to provide access to more than 500 to 750 user names.

It is very important that the AuthUserFile is not placed in a directory that is servable by the Web server. See the section "Using an 'Out of Reach' Directory," later in this chapter. And especially, DO NOT place the file in the same directory that is protected. Otherwise, users will be able to fetch the AuthUserFile, and at the very least get a listing of your authorized users.

Note: It is considered very difficult to crack UNIX passwords. Even the UNIX operating system and the Web server don't actually "decrypt" the password. Rather, the server merely compares the stored encrypted password with an encrypted version of the password being tested. If they match, the server grants access. A hacker's program works the same way. If hackers stumble onto the .htaccess file, it will point them to the .htpasswd file. If this file is in a servable directory, they can retrieve it and get the username:password pairs. Once they have that file, they can run a dictionary lookup, encrypting each word in a list to see if there are any matches to the encrypted passwords. Odds are, they'll find one.
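The compare-without-decrypting idea, and the dictionary lookup that exploits it, can be sketched in a few lines. This is an illustration only: toyHash is a simple stand-in (a 32-bit FNV-1a hash) for the server's real encryption routine, which also mixes a random "salt" into each password, and all the function names are hypothetical.

```javascript
// Toy one-way function (32-bit FNV-1a). It stands in for the server's
// encryption routine for illustration only; never use it for real passwords.
function toyHash(text) {
  var hash = 2166136261;
  for (var i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 16777619);
  }
  return (hash >>> 0).toString(16);
}

// The server's check: encrypt the submitted password, then compare.
function passwordMatches(submitted, storedHash) {
  return toyHash(submitted) === storedHash;
}

// The attacker's check is the same comparison, repeated over a word list.
function dictionaryLookup(storedHash, wordList) {
  for (var i = 0; i < wordList.length; i++) {
    if (passwordMatches(wordList[i], storedHash)) return wordList[i];
  }
  return null;  // no word in the list produced the stored value
}

var stored = toyHash("mommy");  // what the password file would hold
var found = dictionaryLookup(stored, ["password", "doggie", "mommy"]);
```

Nothing is ever decrypted, yet the weak password is recovered as soon as the word list contains it. That is why keeping the password file out of reach matters even though its contents are encrypted.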

AuthGroupFile

The AuthGroupFile directive sets the name of the text file containing a list of user groups for user authentication. The filename is the absolute path to the group file. Each line of the group file contains a group name followed by a colon, followed by the member usernames separated by spaces. For example:

   mygroup: fred ellen joe

Note: The AuthGroupFile is seldom used for authorizing users in a Web commerce site. Rather, use AuthUserFile.
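Should you need to read a group file in a script of your own, the line format described above splits apart easily. A minimal sketch, with a hypothetical function name:

```javascript
// Split one group-file line into the group name and its member user names.
function parseGroupLine(line) {
  var colon = line.indexOf(":");
  if (colon < 1) return null;  // not a "group: members" line
  var name = line.substring(0, colon).trim();
  var rest = line.substring(colon + 1).trim();
  var members = rest === "" ? [] : rest.split(/\s+/);
  return { name: name, members: members };
}

var group = parseGroupLine("mygroup: fred ellen joe");
// group.name is "mygroup"; group.members is ["fred", "ellen", "joe"]
```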

AuthName directive

The AuthName directive sets the name of the "realm" for a directory. This realm is given to the client so that the user knows which username and password to send. It must be accompanied by the following directives:

AuthType
require
AuthUserFile or AuthGroupFile
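Because these directives depend on one another, it can be handy to check an access file before uploading it. The following is a rough sketch with a hypothetical function name. It only looks at the first word of each line and does not validate the values that follow.

```javascript
// Report which of the interdependent authentication directives are absent
// from the text of an access file.
function missingAuthDirectives(htaccessText) {
  var present = {};
  var lines = htaccessText.split(/\r?\n/);
  for (var i = 0; i < lines.length; i++) {
    var word = lines[i].trim().split(/\s+/)[0].toLowerCase();
    if (word) present[word] = true;
  }
  var missing = [];
  if (!present["authname"]) missing.push("AuthName");
  if (!present["authtype"]) missing.push("AuthType");
  if (!present["require"]) missing.push("require");
  if (!present["authuserfile"] && !present["authgroupfile"]) {
    missing.push("AuthUserFile or AuthGroupFile");
  }
  return missing;
}

var incomplete = missingAuthDirectives("AuthName Members\n");
// incomplete lists AuthType, require, and AuthUserFile or AuthGroupFile
```

An empty result means all of the required directives appear somewhere in the file.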

AuthType directive

The AuthType directive selects the type of user authentication for a directory. Since Basic is the only type of authentication currently implemented in most servers, the AuthType directive always specifies Basic. The AuthType directive must be accompanied by the following directives:

AuthName
require
AuthUserFile or AuthGroupFile
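For background, it helps to see what the browser actually sends under Basic authentication: the user name and password are joined with a colon, base64-encoded, and placed in an Authorization request header. The sketch below builds that header line. The function name is hypothetical, and it assumes plain ASCII names and passwords.

```javascript
// Build the request header a browser sends for Basic authentication.
function basicAuthHeader(userName, password) {
  var pair = userName + ":" + password;
  // btoa is available in browsers; Buffer is the fallback under Node.js
  var encoded = (typeof btoa === "function")
    ? btoa(pair)
    : Buffer.from(pair, "binary").toString("base64");
  return "Authorization: Basic " + encoded;
}

var header = basicAuthHeader("fred", "doggie");
```

Note that base64 is an encoding, not encryption. Anyone who can see the request can decode the password, which is one more reason to match your security measures to the sensitivity of what you are protecting.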

Using an "Out of Reach" Directory

The .htpasswd password file should be placed in a directory that is not "servable" by the Web server. This directory is called an "out of reach" directory because it is in a path that the Web server cannot reach (the operating system, however, can still access the directory). The out of reach directory works because all Web servers have a "document root" directory specified as the main directory for your site. That document root may be something like

   /web/mysite/htdocs/

in the hierarchy of paths on the computer. Files and other subdirectories located under the ../htdocs path are accessible to persons visiting your site. However, unless the following paths are also indicated as being document roots, files in these are not servable, and therefore not viewable to anyone visiting your site:

   /web/
   /web/mysite/
   /web/mysite/anotherdir/

The out of reach directory concept is particularly important with user authentication, because the .htpasswd password file can be placed there, which "hides" the file from anyone visiting your site. However, files stored in the out of reach directory are still accessible by the operating system, and by CGI programs. For maximum security, set up your site so that the .htpasswd password file is in a directory outside the document root. Assuming

   /web/mysite/htdocs/

is the document root for your Web site, then

   /web/mysite/ood

is used as an "out of reach" directory for sensitive files.
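The test for "out of reach" is simply whether a file's path falls under the document root. A minimal sketch of that check follows, with hypothetical function names. It compares path segments only, and does not account for aliases, symbolic links, or other server settings that can make extra directories servable.

```javascript
// Split a path into segments, resolving "." and ".." along the way.
function pathSegments(path) {
  var parts = path.split("/");
  var out = [];
  for (var i = 0; i < parts.length; i++) {
    if (parts[i] === "" || parts[i] === ".") continue;
    if (parts[i] === "..") { out.pop(); continue; }
    out.push(parts[i]);
  }
  return out;
}

// True if filePath lies at or below documentRoot (and so may be servable).
function isUnderDocumentRoot(filePath, documentRoot) {
  var file = pathSegments(filePath);
  var root = pathSegments(documentRoot);
  if (file.length < root.length) return false;
  for (var i = 0; i < root.length; i++) {
    if (file[i] !== root[i]) return false;
  }
  return true;
}

var docRoot = "/web/mysite/htdocs/";
var exposed = isUnderDocumentRoot("/web/mysite/htdocs/members/.htpasswd", docRoot);
var hidden = isUnderDocumentRoot("/web/mysite/ood/.htpasswd", docRoot);
```

With the paths used in this section, the first file sits under the document root and the second does not.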

Check with your site administrator to see if they can set you up with a directory structure to allow for out-of-reach directories. Some Web hosts don't have a provision for this, and it is best to determine the specifics directly from the Web host administrator. If your Web host can set up a special directory for you, they can suggest how to proceed.

Note that there is nothing terribly wrong with having the .htpasswd file out in the open (in a servable directory). But doing so does mean you are an easier target for hackers. You need to decide what you're protecting against, and design the system around that. Depending on the Web server, you can put .htaccess in the /ood directory, but it hardly makes any difference. Nonmembers probably won't know about the existence of the directory anyway. And it's validated users who can become your worst hackers. They have access to your .htaccess file. Could this be a problem for you? Only you can decide.

Using Multiple .htpasswd Files

Remember that the .htaccess file, placed in the directory that you wish to protect, indicates the password file to use for granting access to bona fide users. The usual name for password files is .htpasswd, but it can be anything you like. With this in mind, if you have multiple directories to protect, with different users for each, you can store multiple password files in the "out of reach directory." Give each password file a different name, such as

   .htpasswd-dir1
   .htpasswd-dir2
   .htpasswd-dir3

or whatever. You will find site maintenance easier if all the password files are in a single, safe directory.
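With several protected directories, the access files differ only in the password file they point to and the realm name. The sketch below generates the text for each one, following the basic syntax shown earlier. The function name, the /web/mysite/ood/ location, the directory names, and the realm names are all assumptions for illustration.

```javascript
// Build the text of an access file for one protected directory.
function buildHtaccess(passwordFilePath, realm) {
  return [
    "AuthUserFile " + passwordFilePath,
    "AuthGroupFile /dev/null",
    "AuthName " + realm,
    "AuthType Basic",
    "",
    "<Limit GET>",
    "require valid-user",
    "</Limit>"
  ].join("\n") + "\n";
}

var passwordDir = "/web/mysite/ood/";        // assumed out of reach directory
var directories = ["dir1", "dir2", "dir3"];  // hypothetical protected directories
var accessFiles = {};
for (var i = 0; i < directories.length; i++) {
  var name = directories[i];
  accessFiles[name] = buildHtaccess(passwordDir + ".htpasswd-" + name,
                                    "Members_" + name);
}
```

Each entry in accessFiles is the text to save as .htaccess in the matching directory.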

Creating a .htpasswd File

The .htpasswd file must be created using a utility that generates the encrypted passwords provided by you or users. Most Web servers come with a command-line program, htpasswd, for this purpose. To use the htpasswd program, merely provide the name of the password file you want to create or append to, the user name, and the clear-text password. For example, to add user "fred" to the .htpasswd file, and associate the password "doggie" with user fred, type:

   htpasswd .htpasswd fred

then at the password prompt, type:

   doggie

The htpasswd program will ask you to repeat the password, to be sure it is entered correctly.

Note: If the .htpasswd file does not already exist, use the -c switch to create a new file:

   htpasswd -c .htpasswd fred

If the .htpasswd file is not in the same directory as the htpasswd program, be sure to specify the absolute path to the file. For example:

   htpasswd /disk/mysite.com/ood/.htpasswd fred

Password Protection Using "Blind" Directories or Filenames

The .htaccess and .htpasswd methods of providing user authentication are sometimes overkill. Not all sites require such protection of directories and files. Often, you can provide a "members-only" area merely by creating a directory on your site that has no direct links from other pages. When you want to allow access to a new member, tell him or her the URL of the "protected" pages. This technique is called blind URLs; the URLs are directories and filenames that are otherwise invisible in the non-protected areas of your site.

Creating a blind directory or file is simple: merely add the directory and file in the usual manner. Be sure that there are no links to the directory/file, and that your server does not display a directory index that might otherwise reveal the hidden directory and file.

Using JavaScript for Password Protection

Suppose you have some documents on your Web site that are not meant to be viewed by the general public. One solution is to have your Web server provide some sort of password access to the page, or perhaps even password protect an entire directory. This is the ideal method if security is vital. Only a password authentication program running on a server can provide the kind of secure access you need for sensitive information.

On the other hand, most of us don't work with truly sensitive stuff. We merely want to provide a barrier to restrict casual access by the general public. One low-tech approach is to merely remove any external links to your sensitive page (see "Password Protection Using 'Blind' Directories or Filenames," above). That way, users must explicitly type the path and name of the document. This method works best if the document is in a directory that contains an "index" or main home page. That way, if a user specifies just the path, they get the index document, and not the directory of files in that path.

The problem with this method is that once you give out the filename people can continue to access it. You may wish to limit the accesses to your restricted pages in some way -- for example, allow access on one day, but not on another. For this application JavaScript can help. With JavaScript, you can create a basic, no-frills enciphering program that converts a plain-text filename to an enciphered filename.

A numeric key can be used to increase the variability of the enciphering. The key has 63 possible values. Each of the keys results in a different enciphered filename from the same "plain-text" filename. You might use values 1 through 31, for example, as keys for each day of the month. This allows you to restrict access to your pages on specific days.

The encoding system used in the examples in this section is extremely simple, and can be broken in a matter of minutes by a trained cryptographer. However, to the average user the encoding scheme is not immediately obvious; there are no "secret words" or numbers stored in the script that a user can view.

Note: For the inquisitive, the script uses a simple cipher technique known as XORing, where the numeric value of each character of the plain-text password is mixed with a numeric key value. The result is similar to the old-fashioned "decoder ring" they used to give away in cereal boxes, where each letter is substituted for another. The benefit is that it's easy to change the key against which the letter values are matched, and that the key is not part of the message or script.

A reverse process can be used to decode the cipher back to its plain-text filename. In fact, the exact same JavaScript program is used as both the encoder and as the decoder. Use the cipher text as the password, and provide the same key value used for encoding (remember: the encrypted text alone isn't enough -- you need the key value!).
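The reason one routine serves as both encoder and decoder is that XORing with the same key twice restores the original value. Here is a stripped-down sketch of the XOR step, using the same reference string as the scripts in this section. The names REF and encode are chosen for this illustration.

```javascript
// Reference string: a character's position in it is the number that gets XORed.
var REF = "0123456789abcdefghijklmnopqrstuvwxyz._~" +
          "ABCDEFGHIJKLMNOPQRSTUVWXYZ";

// XOR each character's position with the key and map it back to a character.
function encode(text, key) {
  var result = "";
  for (var i = 0; i < text.length; i++) {
    var index = REF.indexOf(text.charAt(i));
    result += REF.charAt(index ^ key);
  }
  return result;
}

var cipherText = encode("the_beatles", 4);  // "plaxfaephao"
var plainText = encode(cipherText, 4);      // back to "the_beatles"
var wrongKey = encode(cipherText, 5);       // a different key does not decode it
```

Running the cipher text back through with any other key produces a different string, which is why the key must be known along with the text.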

Enciphering the Filename

cipher.html can be used to determine the enciphered filename for whatever plain-text word or filename you wish. It's also a good demonstration of the whole process. If you know a little about encryption systems, feel free to enhance and modify this basic script.

cipher.html

<HTML>
<HEAD>
<TITLE>Password Enciphering Test</TITLE>
<SCRIPT LANGUAGE="JavaScript">
function testEncode(form) {
        var Ret = encode (form.inputbox1.value,
                form.inputbox2.value)
        form.inputbox3.value = Ret
}

function encode (OrigString, CipherVal) {
        Ref="0123456789abcdefghijklmnopqrstuvwxyz._~"
        Ref=Ref+"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        CipherVal = parseInt(CipherVal)
        var Temp=""
        for (Count=0; Count < OrigString.length; Count++) {
                var TempChar = OrigString.substring (Count, Count+1)
                var Conv = cton(TempChar)
                var Cipher=Conv^CipherVal
                Cipher=ntoc(Cipher)
                Temp += Cipher
        }
        return (Temp)
}

function cton (Char) {
        return (Ref.indexOf(Char));
}

function ntoc (Val) {
        return (Ref.substring(Val, Val+1))
}
</SCRIPT>
</HEAD>
<BODY>
<FORM NAME="testform">
Plain text: <BR>
<INPUT TYPE="text" NAME="inputbox1" VALUE=""><P>
Key value:<BR>
<INPUT TYPE="text" NAME="inputbox2" VALUE=""><P>
Cipher:<BR>
<INPUT TYPE="text" NAME="inputbox3" VALUE=""><P>
<INPUT TYPE="button" NAME="button" Value="Encode"
onClick="testEncode(this.form)"><BR>
</FORM>
</BODY>
</HTML>

To use, enter a word to encipher. The script is set up for lower-case letters, digits, underscores, and periods (like Web filenames). Avoid using upper-case letters, and do not use any characters not allowed in Web filenames. You must also enter a key value, from 1 to 63. Leaving this entry blank or using a 0 will result in cipher text that is the same as the plain text. Click the Encode button to view the enciphered result. For example, suppose you specify the_beatles as the plain text, and 4 as the key. The resulting cipher text is plaxfaephao. Therefore, the filename you will use with this combination is plaxfaephao. It's up to you if you wish to add an htm or html extension (some servers are a bit picky when it comes to files without extensions).

If your server limits filenames to eight characters, the plain-text filename should likewise be limited to eight characters. Example: If you specify beatles as the plain-text, and 12 as the key, you get 726hp2g.

Asking Users for a Password to Access a Page

Use cipher.html, above, to determine the ciphered filename, based on the password you want to use and a key value from 1 to 63. Once you've obtained the ciphered filename you can create and store the restricted access page on your server using that name.

The password.html file demonstrates a JavaScript program showing the basic principle of allowing access to the restricted page. The program allows the user to enter a password. Clicking the Submit button deciphers the password, and links to the resulting page. Note that if the user enters the wrong password, an incorrect deciphered string is generated, and the browser attempts to link to a file that does not exist. An error message results. This error message, provided by the server, cannot be avoided.

The key value gives you many more password combinations. The password.html file uses the current month as the key value. This allows the user to access the page for one month only. The next month the key value changes, and therefore the same password yields a different enciphered result. This system is particularly useful if you cannot update the restricted files on a regular basis. The files "self-expire" according to the current month.
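Since the key changes every month, you need to know ahead of time what to name the restricted page for each month. The sketch below lists the twelve filenames for a given password, using the same enciphering step as the scripts in this section. The names REF, encode, currentMonthKey, and monthlyFilenames are chosen for this illustration.

```javascript
var REF = "0123456789abcdefghijklmnopqrstuvwxyz._~" +
          "ABCDEFGHIJKLMNOPQRSTUVWXYZ";

// XOR each character's position in REF with the key.
function encode(text, key) {
  var result = "";
  for (var i = 0; i < text.length; i++) {
    result += REF.charAt(REF.indexOf(text.charAt(i)) ^ key);
  }
  return result;
}

// Month number used as the key: 1 (January) through 12 (December).
function currentMonthKey(date) {
  return date.getMonth() + 1;
}

// Filenames the restricted page must carry in each month, January first.
function monthlyFilenames(password) {
  var names = [];
  for (var month = 1; month <= 12; month++) {
    names.push(encode(password, month) + ".html");
  }
  return names;
}

var names = monthlyFilenames("the_beatles");
var thisMonthsName = names[currentMonthKey(new Date()) - 1];
```

For the password the_beatles, the April entry (key 4) is plaxfaephao.html, matching the earlier example. Storing the page under only the current month's name is what makes it "self-expire" when the month rolls over.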

password.html

<HTML>
<HEAD>
<TITLE>JavaScript Password File</TITLE>
<SCRIPT LANGUAGE="JavaScript">
function testEncode(form) {
        var dater = new Date();
        Month = dater.getMonth()+1;
        dater = null;
        var Ret = encode (form.inputbox1.value, Month)
        location = Ret + ".html"
}

function encode (OrigString, CipherVal) {
        Ref="0123456789abcdefghijklmnopqrstuvwxyz._~"
        Ref=Ref+"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        CipherVal = parseInt(CipherVal)
        var Temp=""
        for (Count=0; Count < OrigString.length; Count++) {
                var TempChar = OrigString.substring (Count, Count+1)
                var Conv = cton(TempChar)
                var Cipher=Conv^CipherVal
                Cipher=ntoc(Cipher)
                Temp += Cipher
        }
        return (Temp)
}

function cton (Char) {
        return (Ref.indexOf(Char));
}

function ntoc (Val) {
        return (Ref.substring(Val, Val+1))
}
</SCRIPT>
</HEAD>
<BODY>
<FORM NAME="testform" onSubmit="return false;">
Please enter your password: <BR>
<INPUT TYPE="text" NAME="inputbox1" VALUE=""><P>
<INPUT TYPE="button" NAME="button" Value="Submit"
onClick="testEncode(this.form)"><BR>
<INPUT TYPE="hidden" NAME="hidden" VALUE=""><P>
</FORM>
</BODY>
</HTML>

(Abridged from Web Commerce Cookbook, published by Wiley Computer Publishing and written by Gordon McComb. Copyright © 1997, Gordon McComb. All Rights Reserved. Please see http://gmccomb.com/commerce/ for more details on this book.)


