- Path: senator-bedfellow.mit.edu!bloom-beacon.mit.edu!howland.erols.net!dispose.news.demon.net!demon!news.demon.co.uk!demon!hyperlens.demon.co.uk!sleipnir.webthing.com!nobody
- From: Nick Kew <nick@webthing.com>
- Newsgroups: comp.infosystems.www.authoring.cgi,comp.answers,news.answers
- Subject: FAQ: Frequently Asked Questions about CGI Programming
- Supersedes: <cgi-faq-956193977@jarl.webthing.com>
- Followup-To: comp.infosystems.www.authoring.cgi
- Date: 23 Jul 2000 09:57:29 GMT
- Organization: WDG
- Approved: news-answers-request@MIT.EDU
- Expires: 12 Aug 2000 09:57:29 GMT
- Message-ID: <cgi-faq-964346249@jarl.webthing.com>
- NNTP-Posting-Host: hyperlens.demon.co.uk
- Summary: The Common Gateway Interface - Programming for the WWWeb:
- Basics (what is CGI; when to use CGI vs other
- web programming techniques)
- HTTP and NPH scripts: technical info and references
- Programming tips: "How do I do this..."
- Troubleshooting: How to tackle your problems
- Further Reading: related FAQs and reference material
- Keywords: CGI,FAQ,HTTP,WWW
- X-NNTP-Posting-Host: hyperlens.demon.co.uk:212.228.207.125
- X-Trace: news.demon.co.uk 964355428 nnrp-09:2415 NO-IDENT hyperlens.demon.co.uk:212.228.207.125
- X-Complaints-To: abuse@demon.net
- Content-Type: text/plain; charset=us-ascii
- Mime-Version: 1.0
- Lines: 1646
- Xref: senator-bedfellow.mit.edu comp.infosystems.www.authoring.cgi:84553 comp.answers:41642 news.answers:188270
-
- Archive-Name: www/cgi-faq
- Posting-Frequency: Irregular
-
- Frequently Asked Questions on CGI Programming
- ---------------------------------------------
-
- 0. Preamble
- 0.1. Changes
- 0.2. Notice and Disclaimer
- 0.3. Where to get this document
- 0.4. How to contribute to this document?
- 0.5. Can I email the author my questions?
- 0.6. What's up with posting to comp.infosystems.www.authoring.cgi?
- 0.7. Credits
-
- 1. Basic Questions
- 1.1. What is CGI?
- 1.2. Is it a script or a program?
- 1.3. When do I need to use CGI?
- 1.4. Should I use CGI or JAVA?
- 1.5. Should I use CGI or SSI or ... { PHP/ASP/... }
- 1.6. Should I use CGI or an API?
- 1.7. So what are in a nutshell the options for webserver programming?
- 1.8. What do I absolutely need to know?
- 1.9. Does CGI create new security risks?
- 1.10. Do I need to be on Unix?
- 1.11. Do I have to use Perl?
- 1.12. What languages should I know/use?
- 1.13. Do I have to put it in cgi-bin?
- 1.14. Do I have to call it *.cgi? *.pl?
- 1.15. What is the "CGI Overhead", and should I be worried about it?
- 1.16. What do I need to know about file permissions and "chmod"?
- 1.17. What is CGIWrap, and how does it affect my program?
- 1.18. How do I decode the data in my Form?
-
- 2. HTTP Headers and NPH Scripts
- 2.1. What is HTTP (HyperText Transfer Protocol)?
- 2.2. What HTTP request headers can I use?
- 2.3. What Environment variables are available to my application?
- 2.4. Why doesn't my script get REMOTE_USER? My page is password-protected.
- 2.5. What HTTP response headers do I need to know about?
- 2.6. What is NPH?
- 2.7. Must/should/can I write nph scripts?
- 2.8. Do I have to call it nph-*
- 2.9. What is the difference between GET and POST?
-
- 3. Techniques: "How do I..."
- 3.1. Can I get information about who is visiting?
- 3.2. Can I get the email of visitors?
- 3.3. "But I saw some.kool.site display my email address..."
- 3.4. Can I verify the email addresses people enter in my Form?
- 3.5. How can I get the hostname of the remote user?
- 3.6. Can I get browser details and return different pages?
- 3.7. Can I trace where a user has come from/is going to?
- 3.8. Can I launch a long process and return a page before it's finished?
- 3.9. Can I launch a long process which the user interacts with?
- 3.10. Can I password-protect my pages?
- 3.11. Can I do HTTP authentication using CGI?
- 3.12. Can I identify users/sessions without password protection?
- 3.13. Can I redirect users to another page?
- 3.14. Can I run a CGI script without returning a new page to the browser?
- 3.15. Can I write output to a different Netscape frame?
- 3.16. Can I write output to several frames at once?
- 3.17. Can I use a CGI script to generate both text and inline images?
- 3.18. How can I use Caches to make CGI scripts faster and more Net-friendly?
- 3.19. How can I avoid users hitting "submit" twice?
- 3.20. How can I stop my CGI script reading and writing files as "nobody"?
- 3.21. How can I prevent my CGI results being cached by the browser?
- 3.22. How can I control the default filename when downloading a file via CGI?
-
- 4. Troubleshooting a CGI application
- 4.1. Are there some interactive debugging tools and services available?
- 4.2. I'm having trouble with my headers. What can I do?
- 4.3. Why do I get Error 500 ("the script misbehaved", or "Internal Server Error")
- 4.4. I tried to use (Content-Type|Location|whatever), but it appears in my Browser?
- 4.5. How can I run my CGI program 'live' in a debugger?
- 4.6. I'm using CGI with QUERY_STRING embedded in my HTML, but it gets corrupted?
-
- 5. Further Reading
- 5.1. Other FAQs/collections
- 5.2. Reference Pages
-
- INDEX
-
- -------------------------------------------------------------
-
- Subject: SECTION 0 - PREAMBLE
-
- NOTE: the numbering in this document is automatically generated by my
- posting software, and will change between postings if new questions are
- added (as _may_ happen when I see - or someone contributes - a FAQ I've
- previously overlooked :-)
-
-
- ------------------------------
-
- Subject: 0.1 Changes
-
-
- Last Modified: July 2000. Updated several links reported
- by Site Valet as moved. Otherwise unchanged.
-
-
- ------------------------------
-
- Subject: 0.2 Notice and Disclaimer
-
-
- Copyright 1996-2000 Nick Kew.
-
- You are free to copy or distribute this document in whole or in part
- for any purpose and on any medium you choose, provided you include
- this notice and disclaimer in full.
-
- Disclaimer: This information is offered in good faith and in the hope
- that it may be of use, but is not guaranteed to be correct, up to date
- or suitable for any particular purpose. The author accepts no liability
- in respect of this information or its use.
-
-
- ------------------------------
-
- Subject: 0.3 Where to get this document
-
-
- The official homes of this document on the Web are now
- URL http://www.webthing.com/tutorials/cgifaq.html
- URL http://www.htmlhelp.org/faq/cgifaq.html
-
- NOTE - If you want to mirror the FAQ on your WWW site on a
- publicly-visible server, please make sure you keep it up-to-date.
-
- Other known sources are:
-
- (1) USENET: posted to newsgroups (TEXT)
- news:comp.infosystems.www.authoring.cgi
- news:comp.answers
- news:news.answers
-
- (2) RTFM and mirror sites (TEXT)
- ftp://rtfm.mit.edu/pub/usenet/news.answers/www/cgi-faq
-
- (3) RTFM WWW mirror sites, including (Partial HTML)
- Europe - http://www.cs.ruu.nl/cgi-bin/faqwais
- America - http://www.cis.ohio-state.edu/hypertext/faq/usenet/
-
- (4) By EMAIL from the FAQserver at RTFM (TEXT)
- Send email to mailto:mail-server@rtfm.mit.edu with
- send usenet/news.answers/www/cgi-faq
- in the body of your message
-
-
- ------------------------------
-
- Subject: 0.4 How to contribute to this document?
-
-
- I have removed the InterFAQ from this answer, as it has become
- clear that people prefer the familiar approach of emailing me
- to that of contributing via the web, and (in turn) the InterFAQ
- content has not been maintained for some time. Thomas Boutell
- has since introduced a somewhat similar project, the OpenFAQ.
-
- Just mail me. ( mailto:nick@webthing.com )
-
-
- ------------------------------
-
- Subject: 0.5 Can I email the author my questions?
-
-
- Please don't. Post them to an appropriate newsgroup, where they'll
- be seen and possibly answered by a whole lot more people than just me.
- And remember: bad (or incoherent) questions get bad answers, so think
- carefully before posting.
-
- If you have an actual programming job to do, I might be interested.
- However, I am unlikely to be interested in jobs below $1000.
-
- If you think something already in the FAQ needs clarifying, feel free
- to mail me: don't expect a personal reply, but I *might* add
- something to the answer in question, so check the next posting (or three).
-
-
- ------------------------------
-
- Subject: 0.6 What's up with posting to comp.infosystems.www.authoring.cgi?
-
-
- This is now a moderated newsgroup. The moderator is a bot run by
- Thomas Boutell ( mailto:boutell@boutell.com ). The charter for
- moderation is as follows:
-
- This newsgroup is self-moderated. Your first posting will not appear
- until you have read and responded to an automatic welcome mailing, at
- which point your posting will appear with no further delay. Provision
- will also be made to automatically approve first postings that contain
- a header requesting this. Subsequent postings are approved
- automatically.
-
- If posting normally doesn't work - as could be the case if your
- newsfeed has trouble with moderated groups - you can post articles
- by emailing them to:
- mailto:authoring-cgi@boutell.com
- Provided the return address in your mail is correct, you will then
- receive precise instructions for having your post(s) automatically approved.
-
- Alternative means of posting are detailed in the WWW FAQ, posted
- regularly by Thomas Boutell.
-
-
- ------------------------------
-
- Subject: 0.7 Credits
-
-
- This FAQ was written by Nick Kew, and has been considerably improved
- with the help of comments and criticisms, newsgroup posts and
- miscellaneous suggestions from correspondents including
- Nathan Neulinger, Maurice L. Marvin, Matthew Healy, Alan J. Flavell,
- Don Libes, Alain Deckers, David S. Jackson, J.M. Ivler, and no doubt
- others I've forgotten to credit (please remind me if necessary).
-
-
- -------------------------------------------------------------
-
- Subject: SECTION 1 - BASIC QUESTIONS
-
- This section aims to deal with basic questions, addressing the role and
- nature of CGI, and its place in Web programming. Questions/answers which
- just don't appear to 'fit' under any other section may also be included
- here.
-
-
- ------------------------------
-
- Subject: 1.1 What is CGI?
-
-
- [ from the CGI reference http://hoohoo.ncsa.uiuc.edu/cgi/overview.html ]
-
- The Common Gateway Interface, or CGI, is a standard for external
- gateway programs to interface with information servers such as HTTP servers.
- A plain HTML document that the Web daemon retrieves is static,
- which means it exists in a constant state: a text file that doesn't change.
- A CGI program, on the other hand, is executed in real-time, so that it
- can output dynamic information.
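-
- As an illustration only (this sketch is not part of the NCSA text quoted
- above), here is about the smallest useful CGI program, written in Python.
- The function name build_response is my own invention. Its output differs
- on every request, which is what makes it dynamic rather than static.
-
```python
#!/usr/bin/env python3
import time


def build_response(now=None):
    """Return a complete CGI response: header, blank line, then body."""
    if now is None:
        now = time.time()
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(now))
    header = "Content-Type: text/plain\n"
    body = "The time on this server is " + stamp + " GMT\n"
    # The empty line between header and body is required.
    return header + "\n" + body


if __name__ == "__main__":
    # The HTTPD runs this program afresh for every request.
    print(build_response(), end="")
```
-
- Install it wherever your server executes CGI programs and request its
- URL: the body changes each time you reload.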
-
-
- ------------------------------
-
- Subject: 1.2 Is it a script or a program?
-
-
- The distinction is semantic. Traditionally, compiled executables
- (binaries) are called programs, and interpreted programs are usually
- called scripts. In the context of CGI, the distinction has become
- even more blurred than before. The words are often used interchangeably
- (including in this document). Current usage favours the word "scripts"
- for CGI programs.
-
-
- ------------------------------
-
- Subject: 1.3 When do I need to use CGI?
-
-
- There are innumerable caveats to this answer, but basically any
- Webpage containing a form will require a CGI script or program
- to process the form inputs.
-
-
- ------------------------------
-
- Subject: 1.4 Should I use CGI or JAVA?
-
-
- [This answer to a non-question is an attempt to reduce the noise level of
- the recurrent "CGI vs JAVA" threads.]
-
- CGI and JAVA are fundamentally different, and for most applications
- are NOT interchangeable.
-
- CGI is a protocol for running programs on a WWW server. Whilst JAVA
- can also be used for that, and even has a standardised API (the servlet,
- which is indeed an alternative to CGI), the major role of JAVA on the
- Web is for clientside programming (the applet).
-
- In certain instances the two may be combined in a single application:
- for example a JAVA applet to define a region of interest from a
- geographical map, together with a CGI script to process a query
- for the area defined.
-
-
- ------------------------------
-
- Subject: 1.5 Should I use CGI or SSI or ... { PHP/ASP/... }
-
-
- CGI and SSI (Server-Side Includes) are often interchangeable, and it may
- be no more than a matter of personal preference. Here are a few
- guidelines:
- 1) CGI is a common standard agreed and supported by all major HTTPDs.
- SSI is NOT a common standard, but an innovation of NCSA's HTTPD
- which has been widely adopted in later servers. CGI has the
- greatest portability, if this is an issue.
- 2) If your requirement is sufficiently simple that it can be done
- by SSI without invoking an exec, then SSI will probably be
- more efficient. A typical application would be to include
- sitewide 'house styles', such as toolbars, netscapeised <body>
- tags or embedded CSS stylesheets.
- 3) For more complex applications - like processing a form -
- where you need to exec (run) a program in any case, CGI
- is usually the best choice.
- 4) If your transaction returns a response that is not an HTML page,
- SSI is not an option at all.
-
- Many more recent variants on the theme of SSI are now available.
- Probably the best-known are PHP which embeds server-side scripting
- in a pre-html page, and ASP which is Microsoft's version of a
- similar interface.
-
-
- ------------------------------
-
- Subject: 1.6 Should I use CGI or an API?
-
-
- APIs are proprietary programming interfaces supported by particular
- platforms. By using an API, you lose all portability. If you know
- your application will only ever run on one platform (OS and HTTPD),
- and it has a suitable API, go ahead and use it. Otherwise stick to CGI.
-
-
- ------------------------------
-
- Subject: 1.7 So what are in a nutshell the options for webserver programming?
-
-
- Too many to enumerate - but I'll try and summarise. Briefly, there
- are several decisions you have to make, including:
- * Power. Is it up to a complex task?
- * Complexity. How much programming manpower is it worth?
- * Portability. Might you want to run your program on another system?
-
- So here's an overview of the main options. It's inevitably subjective,
- but may be helpful to someone:
-
- Basic SSI: Simple interface for basic dynamic content.
- Non-standard - read your server docs.
- Enhanced SSI[1]: Suitable for more complex tasks within
- an HTML page.
- CGI: The standardised, portable general-purpose API,
- not limited to working with HTML pages.
- Enhanced CGI-like[2]: Typically gain efficiency but lose portability
- compared to standard CGI.
- Servlets: An alternative API for JAVA, that overcomes
- the limitation of JAVA not supporting
- environment variables.
- Server API: Generally the most powerful and most complex option.
-
- [1] For example, PHP, ASP.
- [2] For example, CGI adapted to mod_perl or fastcgi.
-
-
- ------------------------------
-
- Subject: 1.8 What do I absolutely need to know?
-
-
- If you're already a programmer, CGI is extremely straightforward, and just
- three resources should get you up to speed in the time it takes to read them:
- 1) Installation notes for your HTTPD. Is it configured to run CGI
- scripts, and if so how does it identify that a URL should be executed?
- (Check your manuals, READMEs, ISP webpages/FAQs, and if you still can't
- find it ask your server administrator).
- 2) The CGI specification at NCSA tells you all you need to know
- to get your programs running as CGI applications.
- http://hoohoo.ncsa.uiuc.edu/cgi/interface.html
- 3) WWW Security FAQ. This is not required to 'get it working', but
- is essential reading if you want to KEEP it working!
- http://www.w3.org/Security/Faq/www-security-faq.html
-
- If you're NOT already a programmer, you'll have to learn. If you would
- find it hard to write, say, a 'grep' or 'cat' utility to run from the
- commandline, then you will probably have a hard time with CGI. Make
- sure your programs work from the commandline BEFORE trying them with CGI,
- so that at least one possible source of errors has been dealt with.
-
-
- ------------------------------
-
- Subject: 1.9 Does CGI create new security risks?
-
-
- Yes. Period.
- There is a lot you can do to minimise these. The most important thing
- to do is read and understand Lincoln Stein's excellent WWW security
- FAQ, at http://www.w3.org/Security/Faq/www-security-faq.html
-
-
- ------------------------------
-
- Subject: 1.10 Do I need to be on Unix?
-
-
- No, but it helps. The Web, along with the Internet itself, C, Perl,
- and almost every other Good Thing in the last 20 years of computing,
- originated in Unix. At the time of writing, this is still the
- most mature and best-supported platform for Web applications.
-
-
- ------------------------------
-
- Subject: 1.11 Do I have to use Perl?
-
-
- No - you can use any programming language you please. Perl is simply
- today's most popular choice for CGI applications. Some other widely-
- used languages are C, C++, TCL, BASIC and - for simple tasks -
- even shell scripts.
-
- Reasons for choosing Perl include its powerful text manipulation
- capabilities (in particular the 'regular' expression) and the fantastic
- WWW support modules available.
-
-
- ------------------------------
-
- Subject: 1.12 What languages should I know/use?
-
-
- It isn't really that important. Use what you're comfortable with,
- or what you're constrained (eg by your manager) to use.
-
- If you're just dabbling with programming, Perl is a good choice, simply
- because of the wealth of ready-to-run Perl/CGI resources available.
-
- If you're serious about programming, you should be at home in a
- range of languages. C, the industry standard, is a must (at least to
- the level of comfortably reading other people's code). You'll
- certainly want at least one scripting language such as Perl, Python
- or Tcl. C++ is also a good idea.
-
- In response to a Usenet newbie question:
- > I am seriously wanting to learn some CGI programming languages
-
- J.M. Ivler wrote some eloquent words of wisdom:
- > If you want to learn a programming language, learn a programming language.
- > If you want to learn how to do CGI programming, learn a programming
- > language first.
- >
- > My book is one of the few that tackles two languages at the same time.
- > Why? because it's not about languages (which are just syntax for logic).
- > CGI programming is about programming, and how to leverage the experience
- > for the person coming to the site, or maintaining the site, or in some way
- > meeting some requirements. Language is just a tool to do so.
-
-
- ------------------------------
-
- Subject: 1.13 Do I have to put it in cgi-bin?
-
-
- See the next question.
-
-
- ------------------------------
-
- Subject: 1.14 Do I have to call it *.cgi? *.pl?
-
-
- Maybe. It depends on your server installation.
-
- These types of filenames are commonly used conventions - no more.
- It is up to the server administrator whether or not CGI scripts are
- enabled, and (if so) what conventions tell the server to run or
- to print them.
-
- If you are running your own server, read the manual.
- If you're on ISP or other rented webspace, check their webpages for
- information or FAQs. As a last resort, ask the server administrator.
-
-
- ------------------------------
-
- Subject: 1.15 What is the "CGI Overhead", and should I be worried about it?
-
-
- The CGI Overhead is a consequence of HTTP being a stateless protocol.
- This means that a CGI process must be initialised for every "hit"
- from a browser.
-
- In the first instance, this usually means the server forking a
- new process. This in itself is a modest overhead, but it can
- become important on a heavily-used server if the number of
- processes grows to problem levels.
-
- In the second place, the CGI program must initialise. In the
- case of a compiled language such as C or C++ this is negligible,
- but there is a small penalty to pay for scripting languages such as Perl.
-
- Thirdly, CGI is often used as 'glue' to a backend program, such as
- a database, which may take some considerable time to initialise.
- This represents a major overhead, which must be avoided in any
- serious application. The most usual solution is for the backend
- program to run as a separate server doing most of the work, while
- the actual CGI simply carries messages.
-
- Fourthly, some CGI scripts are just plain inefficient, and may
- take hundreds of times the resources they need. Programs using
- system() or `backtick` notation often fall into this category.
-
- Note that there are ways to reduce or eliminate all these overheads,
- but these tend to be system- or server-specific. The best-supported
- server is probably Apache, as commercial server-vendors may prefer to
- push their proprietary solutions in preference to CGI.
-
-
- ------------------------------
-
- Subject: 1.16 What do I need to know about file permissions and "chmod"?
-
-
- Unix systems are designed for multiple users, and include provision
- for protecting your work from unauthorised access by other users
- of the system. The file permissions determine who is permitted
- to do what with your programs, data, and directories. The command
- that sets file permissions is chmod.
-
- Web servers typically run as user "nobody". That means that, setting
- aside serious bugs (such as those in certain versions of the Frontpage
- extensions), any of your files that are not writable by other users are
- secure from damage through the webserver. It also means that you may
- have to make explicit changes to enable the server to access them in a
- CGI context.
-
- There are two ways to run CGI:
- - by default they run as the webserver user (nobody)
- For most purposes this is safest, as your programs and data
- are protected by the operating system from unauthorised access
- through possible bugs in your CGI. However, when the CGI has
- to write to a file, that file must be writable to every web
- user on the system, and is therefore completely unprotected.
- - setuid, they run under your own userid.
- This means that files written by your CGI can be secure.
- On the other hand, any bugs in your CGI could now compromise
- *all* your programs and data on the server.
- As an elementary security precaution, scripts (e.g. Perl) are
- prevented from running setuid by most OSs. The "cgiwrap"
- program offers a workaround for this.
-
- A third way you should *never* permit CGI to be run is:
- - as root or setuid root, they can run as any user.
- This is extremely dangerous, as any bugs could compromise the
- entire server, including every user's files. Fortunately only
- the system administrator can install setuid root programs. If
- you are *at all* concerned about security, make sure that no such
- programs (in particular Frontpage extensions) are installed,
- regardless of whether you use them yourself.
-
- For a proper overview, "man chmod". Some modes that may be useful
- in a typical CGI context are:
-
- * CGI programs, 0755
- * data files to be readable by CGI, 0644
- * directories for data used by CGI, 0755
- * data files to be writable by CGI, 0666 (data has absolutely no security)
- * directories for data used by CGI with write access, 0777 (no security)
- * CGI programs to run setuid, 4755
- * data files for setuid CGI programs, 0600 or 0644
- * directories for data used by setuid CGI programs, 0700 or 0755
- * For a typical backend server process, 4750
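-
- A minimal sketch of applying two of these modes from Python rather than
- from the shell, assuming a Unix system. The helper set_mode, the constant
- names and the file name are hypothetical; os.chmod, os.stat and
- stat.S_IMODE are standard library calls.
-
```python
import os
import stat
import tempfile

# Modes from the list above, written as Python octal literals.
CGI_PROGRAM_MODE = 0o755    # owner may write; everyone may read and execute
READABLE_DATA_MODE = 0o644  # owner may write; everyone may read


def set_mode(path, mode):
    """Apply 'mode' to 'path' and return the permission bits now in effect."""
    os.chmod(path, mode)
    return stat.S_IMODE(os.stat(path).st_mode)


if __name__ == "__main__":
    # Demonstrate on a throwaway file rather than on real CGI files.
    with tempfile.TemporaryDirectory() as workdir:
        script = os.path.join(workdir, "hello.cgi")  # hypothetical name
        with open(script, "w") as handle:
            handle.write("#!/bin/sh\n")
        print(oct(set_mode(script, CGI_PROGRAM_MODE)))
```
-
- Reading the mode back after setting it is a cheap way to confirm that
- the change actually took effect on your filesystem.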
-
- Finally, if this answer tells you anything you didn't already know,
- don't even think about trying to set up a secure server!
-
-
- ------------------------------
-
- Subject: 1.17 What is CGIWrap, and how does it affect my program?
-
-
- [ quoted from http://www.umr.edu/~cgiwrap/intro.html ]
-
- > CGIWrap is a gateway program that allows general users to use CGI scripts
- > and HTML forms without compromising the security of the http server.
- > Scripts are run with the permissions of the user who owns the script. In
- > addition, several security checks are performed on the script, which will not
- > be executed if any checks fail.
- >
- > CGIWrap is used via a URL in an HTML document. As distributed, cgiwrap
- > is configured to run user scripts which are located in the
- > ~/public_html/cgi-bin/ directory.
-
- See http://www.umr.edu/~cgiwrap/
-
-
- ------------------------------
-
- Subject: 1.18 How do I decode the data in my Form?
-
-
- The normal format for data in HTTP requests is URLencoded. All Form data
- is encoded in a string, of the form
- param1=value1&param2=value2&...paramn=valuen
- Many non-alphanumeric characters are "escaped" in the encoding:
- the character whose hexadecimal number is "XY" will be represented by
- the character string "%XY".
-
- Decoding this string is a fundamental function of every CGI library.
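-
- If you are curious what a CGI library does for you here, the following
- sketch uses Python's standard urllib.parse module. decode_form is a name
- made up for this example, as is the sample data. Note that form encoding
- also represents a space as "+", which unquote_plus and parse_qs both undo.
-
```python
from urllib.parse import parse_qs, unquote_plus


def decode_form(encoded):
    """Decode a URLencoded string into a dict of name -> list of values."""
    # keep_blank_values=True keeps fields the user left empty.
    return parse_qs(encoded, keep_blank_values=True)


if __name__ == "__main__":
    sample = "name=A+User&email=user%40example.com&comment="
    print(decode_form(sample))
    # Decoding a single escaped value: %25 is the "%" character itself.
    print(unquote_plus("50%25+off"))
```
-
- Each value is a list because a form may legitimately send the same
- parameter name more than once (for example, several checkboxes).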
-
- Another format is "multipart/form-data", also known as "file upload".
- You will get this from the HTML markup
- <form method="POST" enctype="multipart/form-data">
-
- (but note you must accept URLencoded input in any case, since not all
- browsers support multipart forms).
-
- Most(?) CGI libraries will handle this transparently.
-
-
- -------------------------------------------------------------
-
- Subject: SECTION 2 - HTTP HEADERS AND NPH SCRIPTS
-
- This is a fairly technical section dealing with HTTP, the protocol of
- the Web. It also includes NPH, the mechanism by which CGI programs can
- return HTTP header information directly to the Client.
-
-
- ------------------------------
-
- Subject: 2.1 What is HTTP (HyperText Transfer Protocol)?
-
-
- HTTP is the protocol of the Web, by which Servers and Clients (typically
- browsers) communicate. An HTTP transaction comprises a Request sent by
- the Client to the Server, and a Response returned from the Server to
- the Client.
- Every HTTP request and response includes a message header, describing
- the message. These are processed by the HTTPD, and may often be
- mostly ignored by CGI applications (but see below).
- A message body may also be included:
- 1) A HEAD or GET request sends only a header. Any form data is encoded
- in the query string, the part of the request URL after the "?", which
- is available to the CGI program as the environment variable QUERY_STRING.
- 2) A POST request sends both header and body. The body typically
- comprises data entered by a user in a form.
- 3) A HEAD request does not expect a body in the response.
- 4) A GET or POST request will accept a response with or without a body,
- according to the header. The body of a response is typically an
- HTML document.
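-
- To make the header/body split concrete, here is a sketch that takes a
- raw HTTP message apart. The function split_message and the sample request
- (including its URL) are invented for illustration; a real HTTPD does this
- for you, and does it rather more carefully.
-
```python
def split_message(raw):
    """Split a raw HTTP message into (start_line, headers, body)."""
    # A blank line (CRLF CRLF) separates the header from the body.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    start_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lower case.
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


# A made-up POST request: header, blank line, then a URLencoded body.
SAMPLE_POST = (
    "POST /cgi-bin/guestbook.cgi HTTP/1.0\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    "Content-Length: 9\r\n"
    "\r\n"
    "name=Nick"
)

if __name__ == "__main__":
    print(split_message(SAMPLE_POST))
```
-
- A GET request passed through the same function yields an empty body,
- since everything it has to say is in the header.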
-
-
- ------------------------------
-
- Subject: 2.2 What HTTP request headers can I use?
-
-
- Most HTTP request headers are passed to the CGI script as environment
- variables. Some are guaranteed by the CGI spec. Others are server,
- browser and/or application dependent.
-
- To see what _your_ browser and server are telling each other, just use
- a trivial little CGI script to print out the environment. In Unix:
- #!/bin/sh
- echo "Content-type: text/plain"
- echo
- set
-
- (Just call it "env.cgi" or something, and put it where your server
- will execute it. Then point your browser at
- http://your.server/path/to/env.cgi ).
-
- This enables you to see at-a-glance what useful server variables are set.
- Note that dumping the environment like this within a more complex
- script can be a useful debugging technique.
-
- For details, see the CGI Environment Variables specification at
- http://hoohoo.ncsa.uiuc.edu/cgi/env.html
- (which also includes a version of the above script - somewhat more
- nicely formatted - online).
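-
- The same idea can be sketched in Python, for anyone who prefers it to
- the shell. dump_environment is a hypothetical helper name; os.environ is
- the standard way to read the environment. Sorting the names simply makes
- the listing easier to scan.
-
```python
#!/usr/bin/env python3
import os


def dump_environment(environ):
    """Return a text/plain CGI response listing every variable in 'environ'."""
    lines = ["Content-Type: text/plain", ""]  # header, then the blank line
    for name in sorted(environ):
        lines.append(name + "=" + environ[name])
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(dump_environment(os.environ), end="")
```
-
- Remove a script like this once you have finished with it: the
- environment can reveal more about your server than you would wish to
- publish.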
-
-
- ------------------------------
-
- Subject: 2.3 What Environment variables are available to my application?
-
-
- See previous question. Those you can rely on are documented in NCSA's
- pages; those associated with your particular server and browser can
- be determined using the above script.
-
-
- ------------------------------
-
- Subject: 2.4 Why doesn't my script get REMOTE_USER? My page is password-protected.
-
-
- You will get REMOTE_USER if the _script_ is password protected.
- That's all. The page the user is coming from has nothing to do with it.
-
-
- ------------------------------
-
- Subject: 2.5 What HTTP response headers do I need to know about?
-
-
- Unless you are using NPH, the HTTPD will insert necessary response
- headers on your behalf, always provided it is configured to do so.
-
- However, it is conventional for servers to insert the Content-Type header
- based on a page's filename, and CGI scripts cannot rely on this. Hence
- the usual advice is to print an explicit Content-Type header.
- At least one of "Content-Type", "Status" and "Location" is almost
- always required.
-
- A few other headers you may wish to use explicitly are:
- Status (to set HTTP return code explicitly. Caveats:
- (1) Behaviour is undefined if it conflicts with
- another header. (2) This is NOT an HTTP header.)
- Location (to redirect the user to another URI, which may or may
- not be on your own server)
- Set-cookie (Netscape/Nonstandard) Set a cookie
- Refresh (Netscape/Nonstandard) Clientpull
-
- You can also use general MIME headers: eg "Keywords" for the benefit of
- indexers (although in this instance some major search robots have
- regrettably introduced a new protocol to do the same thing).
-
- For a detailed reference, see RFC1945 (HTTP/1.0) or RFC2068 (HTTP/1.1).
-
-
- ------------------------------
-
- Subject: 2.6 What is NPH?
-
-
- NPH = No Parsed Headers. The script undertakes to print the entire
- HTTP response including all necessary header fields. The HTTPD
- is thereby instructed not to parse the headers (as it would normally do)
- nor add any which are missing.
-
-
- ------------------------------
-
- Subject: 2.7 Must/should/can I write nph scripts?
-
-
- Generally, no. It is usually better to save yourself hassle by letting
- the HTTPD produce the headers for you.
-
- If you are going to use NPH, be sure to read and understand the HTTP spec at
- http://www.w3.org/pub/WWW/Protocols/
-
- Your headers should be complete and accurate, because you're instructing
- the HTTPD not to correct them or insert what's missing.
-
- Possible circumstances where the use of NPH is appropriate are:
- * When your headers are sufficiently unusual that they might be
- differently parsed by different HTTPDs (eg combining "Location:"
- with a "Status:" other than 302).
- * When returning output over a period of time (eg displaying
- unbuffered results of a slow operation in 'real' time).
- See RFC1945 (HTTP/1.0) or RFC2068 (HTTP/1.1) for details.
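-
- A minimal sketch of the second case, assuming your server treats the
- script as NPH. The function names nph_head and send_nph are invented,
- and the two chunks stand in for the results of a slow operation. Note
- that the script supplies the HTTP status line itself, and flushes after
- every piece of output so that nothing lingers in a buffer.
-
```python
import sys


def nph_head(content_type):
    """Return the status line and headers an NPH script must print itself."""
    return "HTTP/1.0 200 OK\r\nContent-Type: " + content_type + "\r\n\r\n"


def send_nph(out, chunks):
    """Write a complete HTTP response to 'out', flushing after every chunk."""
    out.write(nph_head("text/plain"))
    out.flush()
    for chunk in chunks:
        out.write(chunk)
        out.flush()  # push each piece to the client as soon as it is ready


if __name__ == "__main__":
    # Stand-ins for the results of a slow operation.
    send_nph(sys.stdout, ["step 1 done\n", "step 2 done\n"])
```
-
- Whether the client sees the pieces as they arrive also depends on the
- server and the browser, so test this on your own installation.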
-
-
- ------------------------------
-
- Subject: 2.8 Do I have to call it nph-*
-
-
- According to NCSA's reference pages, this is the standard for telling
- the server that your script is NPH, so this should be a fully portable
- convention.
-
-
- ------------------------------
-
- Subject: 2.9 What is the difference between GET and POST?
-
-
- Firstly, the HTTP protocol specifies differing usages for the two
- methods. GET requests should always be idempotent on the server.
- This means that whereas one GET request might (rarely) change some state
- on the Server, two or more identical requests will have no further effect.
-
- This is a theoretical point which is also good advice in practice.
- If a user hits "reload" on his/her browser, an identical request will be
- sent to the server, potentially resulting in two identical database or
- guestbook entries, counter increments, etc. Browsers may reload a
- GET URL automatically, particularly if caching is disabled (as is usually
- the case with CGI output), but will typically prompt the user before
- re-submitting a POST request. This means you're far less likely to get
- inadvertently-repeated entries from POST.
-
- GET is (in theory) the preferred method for idempotent operations, such
- as querying a database, though it matters little if you're using a form.
- There is a further practical constraint: many systems have built-in
- limits on the length of a GET request they can handle. When the total size
- of a request (URL+params) approaches or exceeds 1 KB, you are well advised
- to use POST in any case.
-
- In terms of mechanics, they differ in how parameters are passed to the
- CGI script. In the case of a POST request, form data is passed on
- STDIN, so the script should read from there (the number of bytes to be
- read is given by the CONTENT_LENGTH environment variable, which the server
- sets from the request's Content-Length header). In the case of GET, the
- data is passed in the environment variable QUERY_STRING. The encoding
- (application/x-www-form-urlencoded) is identical for GET and POST requests.
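-
- The mechanics can be sketched in Python as follows. This is only a
- minimal illustration: the function name read_form and the sample data
- are made up for the example, and it handles nothing but
- application/x-www-form-urlencoded input.
-
```python
import io
import os
import sys
from urllib.parse import parse_qs


def read_form(environ=None, stdin=None):
    """Return form fields as a dict mapping each name to a list of values."""
    environ = os.environ if environ is None else environ
    method = environ.get("REQUEST_METHOD", "GET").upper()
    if method == "POST":
        # POST: exactly CONTENT_LENGTH bytes of form data arrive on STDIN.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        stream = sys.stdin.buffer if stdin is None else stdin
        raw = stream.read(length).decode("ascii")
    else:
        # GET: the same encoding, but delivered in QUERY_STRING.
        raw = environ.get("QUERY_STRING", "")
    return parse_qs(raw, keep_blank_values=True)


# Example: the same two fields arriving by either method.
get_env = {"REQUEST_METHOD": "GET", "QUERY_STRING": "name=Ann&lang=en"}
post_env = {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": "16"}
post_body = io.BytesIO(b"name=Ann&lang=en")
```
-
- Called with no arguments, read_form() uses the real environment and
- STDIN, which is what a live CGI script would want.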
-
-
- -------------------------------------------------------------
-
- Subject: SECTION 3 - TECHNIQUES: "HOW DO I..."
-
- This section comprises programming hints and tips for a number of popular
- tasks. Also included are a number of common questions to which the answer
- is "you can't", with the reasons why.
-
-
- ------------------------------
-
- Subject: 3.1 Can I get information about who is visiting?
-
-
- *sigh*
- Many people keep mailing me questions or suggested hacks to get
- visitor information, particularly email addresses. It seems they
- won't take "NO" for an answer.
-
- The bottom line is that whatever information is available to _you_
- is _equally_ available to every spammer on the net. Therefore when
- a browser bug _does_ permit personal data to be collected, it gets
- reported and fixed very quickly (one short-lived Netscape 2.0.x
- release reportedly had such a bug in its Javascript engine).
-
- You can get some limited information from the environment variables
- passed to you by the browser. Relatively few of these are guaranteed
- to be available, and some may be misleading. For particular types
- of information, see below. For full details, see NCSA's reference pages.
-
-
- ------------------------------
-
- Subject: 3.2 Can I get the email of visitors?
-
-
- Why do you want to do this?
-
- The best information available is the REMOTE_ADDR and REMOTE_HOST,
- which tell you nothing about the user. Techniques such as "finger@"
- are not reliable, are widely disliked, and generally serve only to
- introduce long delays in your CGI. Better - as well as more polite -
- just to ask your users to fill in a form.
-
- BTW: the "From:" header line (HTTP_FROM variable) is usually only set
- by robots, since human visitors to your webpage will not normally want
- their addresses collected without permission, and browsers respect this.
-
-
- ------------------------------
-
- Subject: 3.3 "But I saw some.kool.site display my email address..."
-
-
- Some sites will play party tricks, which can get *some* users' email
- addresses. Possible tell-tale signs of this are inordinate delays
- loading a page (fingering @REMOTE_HOST - doesn't often work but
- probably can't be detected from the webpage), or a submit button that
- appears to do nothing at all (a mailto: form - works well with some
- browsers but trivially detectable). As a "snoop" party trick that's
- fine, but if you find someone abusing these facilities (eg they send
- you junkmail), alert their service provider!
-
-
- ------------------------------
-
- Subject: 3.4 Can I verify the email addresses people enter in my Form?
-
-
- Unfortunately people will sometimes enter an incorrect or invalid
- email address in your Form. Worse, they may enter a valid but
- incorrect email address that will deliver to someone who doesn't
- want your mail.
-
- Proposed regexps to match email addresses are sometimes posted.
- Most of these will fail against perfectly valid email addresses,
- like "S=N.OTHER/OU1=X12345A/RECIPNUM=1/MTA-BASIC@attmail.com"
- (which is what your address looks like if you are connected to
- the Internet via X400 - and if you think that example is too easy,
- check the ones at the end of Eli the Bearded's Email Addressing FAQ).
-
- Probably the most complete parser and checker available for download
- is Tom Christiansen's, at
- http://www.perl.com/CPAN/authors/Tom_Christiansen/scripts/ckaddr.gz
- Of course, this still says nothing about deliverability.
-
- A frequently-suggested hack that doesn't work is to use
- SMTP EXPN or VRFY commands. Modern versions of sendmail permit
- administrators to disable these commands, and many sites take
- advantage of this facility to protect their users' privacy.
-
- Probably the best way to verify an email address is to send mail to
- it, asking the user to respond. Include a clause like "if you have
- received this mail in error, please accept our apologies..."
-
-
- ------------------------------
-
- Subject: 3.5 How can I get the hostname of the remote user?
-
-
- You can't. Well, not always.
-
- IF it is available, you'll find it in the REMOTE_HOST environment
- variable. However, this will more often than not contain the numerical
- IP address rather than the IP name of the remote host. Remember that
- not all IP addresses have a hostname associated with them; this is the
- case of most IP addresses assigned to dialup users, for example. Your
- web server may also not perform a reverse lookup on incoming
- connections, in which case REMOTE_HOST will contain the IP address even
- if it has a corresponding IP name. In the second case, you can do a
- reverse lookup yourself in your script, but this is expensive and
- should probably be avoided unless absolutely necessary.
-
- Even if you do manage to obtain a hostname, you should be aware that it
- may not correspond to the hostname the user is accessing your page
- from. It may instead be that of an intervening proxy host.
-
- The short answer is therefore that there is no reliable way of finding
- out what the remote user's hostname is.
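-
- If you do decide a reverse lookup is worth the cost, it might look
- something like this Python sketch. The function name remote_host and
- the injectable "lookup" argument are assumptions for the example; the
- result is still subject to every caveat above (proxies, addresses with
- no name, slow DNS).
-
```python
import socket


def remote_host(environ, lookup=socket.gethostbyaddr):
    """Best-effort hostname for the client; falls back to the IP address."""
    addr = environ.get("REMOTE_ADDR", "")
    host = environ.get("REMOTE_HOST", "")
    if host and host != addr:
        return host                  # the server already did the lookup
    try:
        # Reverse DNS lookup: slow, and many addresses have no name at all.
        return lookup(addr)[0]
    except OSError:
        return addr                  # no name available; keep the address
```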
-
-
- ------------------------------
-
- Subject: 3.6 Can I get browser details and return different pages?
-
-
- Why do you want to do this?
-
- Well-written HTML will display correctly in any browser, so the correct
- answer to this question is to design a template for your output in good
- HTML, and make sure your output is correct.
-
- If you insist on a different answer, you can use the HTTP_USER_AGENT
- environment variable. This requires care, and can lead to unexpected
- results. For example, checking for "Mozilla" and serving a frameset
- to it ensures that you *also* serve the frameset to early (Non-Frame)
- Netscapes, me-too browsers (notably Microsoft[1]) and others who have
- chosen to lie to you about their browser.
-
- Note also that not every User Agent is a browser. Your page may be
- read by a user agent you've never heard of, and then displayed by
- 100 different browsers. Or retrieved by different browsers from
- a cache. Another reason to write good HTML, and not try to
- devise a clever or koool substitute.
-
- [1] At the time of writing, only Netscape 2+ supported frames, and
- some authors considered them koool. That's changed, but the same
- general principle still holds.
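-
- To see why the "Mozilla" test misleads, consider this small Python
- sketch. The User-Agent strings are illustrative samples of the sort
- sent by browsers of the period, assumed for the example rather than
- taken from any log.
-
```python
def naive_supports_frames(user_agent):
    """The kind of check the text warns against: a bare substring test."""
    return "Mozilla" in user_agent


# Illustrative User-Agent strings (assumed, and far from exhaustive).
agents = {
    "Netscape 3": "Mozilla/3.01 (X11; I; Linux 2.0.0 i586)",
    "Netscape 1.1 (no frames)": "Mozilla/1.1N (Windows; I; 16bit)",
    "MSIE 3": "Mozilla/2.0 (compatible; MSIE 3.02; Windows 95)",
    "Lynx": "Lynx/2.8.3rel.1 libwww-FM/2.14",
}

for name, ua in agents.items():
    print(name, "->", naive_supports_frames(ua))
```
-
- The frameless Netscape and the me-too browser both pass the test, so
- the check tells you very little about what the browser can display.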
-
-
- ------------------------------
-
- Subject: 3.7 Can I trace where a user has come from/is going to?
-
-
- HTTP_REFERER might or might not tell you anything. By all means
- use it to collect partial statistics if you participate in (say)
- an advertising banner scheme. But it is not always set, and may
- be meaningless (eg if a user has accessed your page from a bookmark,
- and the browser is too dumb to cope with this).
-
- The HTTP protocol forbids relying on Referer information for functionality
- in your programs, so don't try it.
-
- You cannot trace outgoing links at all. If you really must try,
- point all the external links to your HTTPD and use its redirection
- facility (which gives you generally-reliable logs). This is much
- more efficient than using a CGI script.
-
- BTW: don't even think about asking Javascript to send you information
- on some event: it's a violation of privacy which Netscape fixed as
- soon as complaints about its abuse started coming in. If it works
- with *your* browser, you should upgrade!
-
-
- ------------------------------
-
- Subject: 3.8 Can I launch a long process and return a page before it's finished?
-
-
- [UNIX]
- You have to fork/spawn the long-running process.
- The important thing to remember is to close all its file descriptors;
- otherwise nothing will be returned to the browser until it's finished.
- The standard trick to accomplish this is redirection to/from /dev/null:
-
- "long_process < /dev/null > /dev/null 2>&1 &"
- print HTML page as usual
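-
- The same trick can be sketched in Python. This is a minimal example
- for UNIX only; the function names are invented for the illustration,
- and the command to run is whatever your long process happens to be.
-
```python
import subprocess
import sys


def launch_detached(argv):
    """Start argv in the background with all standard streams on /dev/null."""
    return subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,    # like "< /dev/null"
        stdout=subprocess.DEVNULL,   # like "> /dev/null"
        stderr=subprocess.DEVNULL,   # like "2>&1"
        close_fds=True,              # do not leak the server's descriptors
        start_new_session=True,      # detach from the CGI's process group
    )


def respond(argv):
    """Launch the job, then print the page without waiting for it."""
    job = launch_detached(argv)
    sys.stdout.write("Content-Type: text/html\n\n")
    sys.stdout.write("<title>Started</title><p>Job %d is running.</p>\n" % job.pid)
    sys.stdout.flush()
    return job
```
-
- Because the child holds none of the CGI's descriptors, the server sees
- end-of-output as soon as the script itself exits.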
-
-
- ------------------------------
-
- Subject: 3.9 Can I launch a long process which the user interacts with?
-
-
- This does not fit well with the basic mechanics of the Web, in which
- each transaction comprises a single request and response.
- If your processing can be done on the Client machine, you can use
- a clientside application; for example a Java applet.
-
- For processing on the server, one trick that works well for Clients
- running an X server (and far more efficient than a Java solution) is:
- if ( fork() ) {
- print HTML page explaining what's going on and advising about xhost
- } else {
- exec ("xterm -display THEIR_DISPLAY -title MY_APP -e MY_PROG ARGS
- < /dev/null > /dev/null 2>&1 &") ;
- }
- NOTE: THEIR_DISPLAY is not necessarily the same as REMOTE_HOST or REMOTE_ADDR.
- You have to ask users to supply their display (set REMOTE_HOST as default).
-
- A Java terminal program will accomplish something similar for the many
- users with platforms that support Java but not X.
-
-
- ------------------------------
-
- Subject: 3.10 Can I password-protect my pages?
-
-
- Yes. Use your HTTPD's authentication, just as you would a basic HTML page.
- Now you'll have the identity of every visitor in REMOTE_USER.
-
-
- ------------------------------
-
- Subject: 3.11 Can I do HTTP authentication using CGI?
-
-
- It depends on which version of the question you asked.
-
- Yes, you can use CGI to trigger the browser's standard Username/Password
- dialogue. Send a response code 401, together with a "WWW-authenticate"
- header including details of the authentication scheme and realm:
- e.g. (in a non-NPH script)
-
- Status: 401 Unauthorized to access the document
- WWW-authenticate: Basic realm="foobar"
- Content-type: text/plain
-
- Unauthorised to access this document
-
- The use you can make of this is server-dependent, and harder,
- since most servers expect to deal with authentication before ever
- reaching the CGI (eg through .www_acl or .htaccess).
- Thus it cannot usefully replace the standard login sequence, although
- it can be applied to other situations, such as re-validating a user -
- e.g. after a certain timeout period or if the same person may need to
- login under more than one userid.
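-
- A script that issues such a challenge might be sketched in Python as
- below. The function name auth_challenge and the wording of the message
- are assumptions for the example; whether the server then lets your
- script see the outcome is, as noted, server-dependent.
-
```python
def auth_challenge(realm, message="Authorization required"):
    """Return a non-NPH CGI response asking the browser for credentials."""
    # Inside the quoted realm, a backslash or double quote must be escaped.
    safe_realm = realm.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "Status: 401 Unauthorized\n"
        'WWW-Authenticate: Basic realm="%s"\n'
        "Content-Type: text/plain\n"
        "\n"                                  # blank line ends the headers
        "%s\n" % (safe_realm, message)
    )
```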
-
- What you can never get in CGI is the credentials returned by the user.
- The HTTPD takes care of this, and simply sets REMOTE_USER to the
- username if the correct password was entered.
-
- For a much longer but outdated treatment of this question,
- see my page at http://www.webthing.com/tutorials/login.html
-
-
- ------------------------------
-
- Subject: 3.12 Can I identify users/sessions without password protection?
-
-
- The most usual (but browser-dependent) way to do this is to set a cookie.
- If you do this, you are accepting that not all users will have a 'session'.
-
- An alternative is to pass a session ID in every GET URL, and in hidden
- fields of POST requests. This can be a big overhead unless _every_ page
- requires CGI in any case.
-
- Another alternative is the Hyper-G[1] solution of encoding a session-id in
- the URLs of pages returned:
- http://hyper-g.server/session_id/real/path/to/page
- This has the drawback of making the URLs very confusing, and causes any
- bookmarked pages to generate old session_ids.
-
- Note that a session ID based solely on REMOTE_HOST (or REMOTE_ADDR)
- will NOT work, as multiple users may access your pages concurrently
- from the same machine.
-
- [1] Actually I don't think that's been true of Hyper-G since sometime
- in '96. However, general advances in web server technology, such as
- Apache's mod_alias or mod_rewrite, make it straightforward without
- the need for CGI.
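-
- The cookie approach might be sketched in Python like this. The cookie
- name "session" and the function name are assumptions for the example;
- the one firm point is that the ID is random rather than derived from
- REMOTE_HOST or REMOTE_ADDR.
-
```python
import secrets
from http.cookies import SimpleCookie


def get_or_create_session(environ):
    """Return (session_id, extra_header_lines) for the current request."""
    cookie = SimpleCookie(environ.get("HTTP_COOKIE", ""))
    if "session" in cookie:
        return cookie["session"].value, []     # returning visitor
    # New visitor: an unguessable random ID, never one based on the address.
    session_id = secrets.token_hex(16)
    fresh = SimpleCookie()
    fresh["session"] = session_id
    fresh["session"]["path"] = "/"
    return session_id, [fresh.output()]        # a "Set-Cookie: ..." header line
```
-
- The returned header lines, if any, are printed before the blank line
- that ends the CGI header. A browser that refuses cookies will simply
- be given a new ID on every request.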
-
-
- ------------------------------
-
- Subject: 3.13 Can I redirect users to another page?
-
-
- For permanent and simple redirection, use the HTTPD configuration file:
- it's much more efficient than doing it yourself. Some servers enable
- you to do this using a file in your own directory (eg Apache) whereas
- others use a single configuration file (eg CERN).
-
- For more complicated cases (eg process form inputs and conditionally
- redirect the user), use the "Location:" response header.
- If the redirect target is itself a CGI script, it is easy to pass it
- URL-encoded parameters in a GET request, but don't forget to escape the URL!
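-
- A conditional redirect with encoded parameters could be sketched in
- Python as follows. The function name and the example.com URL are
- placeholders invented for the illustration.
-
```python
from urllib.parse import urlencode


def redirect(target, params=None):
    """Return a CGI response that redirects the browser to target."""
    if params:
        # urlencode escapes spaces, "&", "=" and so on in names and values.
        target += "?" + urlencode(params)
    return "Location: " + target + "\n\n"


# Example: passing a value that contains both a space and an ampersand.
example = redirect("http://www.example.com/cgi-bin/results",
                   {"q": "fish & chips", "page": 2})
```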
-
-
- ------------------------------
-
- Subject: 3.14 Can I run a CGI script without returning a new page to the browser?
-
-
- Yes, but think carefully first: How are your readers going to know
- that their "submit" has succeeded? They may hit 'submit' many times!
-
- The correct solution according to the HTTP specification is to
- return HTTP status code 204. As an NPH script, this would be:
-
- #!/bin/sh
- # do processing (or launch it as background job)
- echo "HTTP/1.0 204 No Content"
- echo
-
- (as non-NPH, you'd simply replace HTTP/1.0 with the Status: CGI header).
-
- Alan J Flavell has pointed out that this will fail with certain
- popular browsers, and suggests a workaround to accommodate them:
-
- [ May 1998 update[1]: I'm deleting Alan's suggestion, because the problem
- is mainly of historical interest, and the workaround is no longer
- recommended. See his page for a detailed survey and recommendations.
- ]
-
- His survey is at
- http://ppewww.ph.gla.ac.uk/%7Eflavell/status204/results.html
-
- [1] With apologies to Alan for having left it in so long.
-
-
- ------------------------------
-
- Subject: 3.15 Can I write output to a different Netscape frame?
-
-
- Yep. The fact you're using CGI makes no difference: use
- "target=" in your links as usual. Alternatively, the script
- can print a "Window-target:" header. Read Netscape's pages
- for detail: these answer all the questions about things like
- "getting rid of" or "breaking out of" frames, too.
-
-
- ------------------------------
-
- Subject: 3.16 Can I write output to several frames at once?
-
-
- A single CGI script can only ever print to one frame.
-
- However, this limitation may be overcome by using more than one script.
- The first script (the URL of the "submit" button) prints a frameset,
- typically to a "_parent" or "_top" target. The sources for one or
- more of the frames thus generated may also be CGI scripts, to which
- you can easily pass parameters (eg encoded in URLs with method GET).
- This hack is definitely not recommended. If you find yourself wanting
- to update several frames from a single user event, it probably means
- you should review the design of your application at a higher level.
-
- Warnings:
- 1. Don't forget to escape your URLs.
- 2. This technique results in your server being hit by multiple
- concurrent CGI requests. You'll need LOTS of memory, especially
- if you use a memory-hog like Perl. It can be a good recipe
- for bringing a server to its knees.
-
- Javascript is often a valid alternative here, but note just how silly
- it can (and often does) look in a different browser.
-
-
- ------------------------------
-
- Subject: 3.17 Can I use a CGI script to generate both text and inline images?
-
-
- Not directly. One script generates one response to one request.
-
- If you want to generate a dynamic page including dynamic images
- (say, a report including graphs, all of which depend on user input)
- then your primary script will print the usual
- <img src="[script-to-generate-image]" alt="[what you asked for]">
- and, just as in the multiple frames case, you can pass data to the
- image-generating program encoded in a GET URL. Of course, the same
- caveats apply: see above.
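-
- Generating such a tag might look like the Python sketch below. The
- script path /cgi-bin/graph and the parameter names are hypothetical;
- the points to note are that the data is URL-encoded and that the
- finished URL is then escaped for use in HTML.
-
```python
from html import escape
from urllib.parse import urlencode


def graph_img(script_url, params, alt):
    """Return an <img> tag whose source is an image-generating CGI script."""
    url = script_url + "?" + urlencode(params)   # data travels in the GET URL
    # escape() turns "&" into "&amp;" and makes the attribute values safe.
    return '<img src="%s" alt="%s">' % (escape(url, quote=True),
                                        escape(alt, quote=True))


# Example with a hypothetical graphing script.
tag = graph_img("/cgi-bin/graph", {"year": 2000, "type": "bar"}, "Sales for 2000")
```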
-
-
- ------------------------------
-
- Subject: 3.18 How can I use Caches to make CGI scripts faster and more Net-friendly?
-
-
- This is currently beyond the scope of this FAQ. However,
- there is an excellent introduction to net-friendly webpages, including
- CGI pages, at http://vancouver-webpages.com/CacheNow/
-
- A sample caching perl/cgi script by Andrew Daviel is available at
- http://vancouver-webpages.com/proxy/log-tail.pl
-
-
- ------------------------------
-
- Subject: 3.19 How can I avoid users hitting "submit" twice?
-
-
- You can't. You just have to deal with it when they do.
-
- You can avoid re-processing a submission by embedding a unique ID in your
- Form each time it is displayed. When you process the form, you enter
- the ID in a database. Or, if it's already there, you don't repeat the
- processing.
-
- You probably want to expire your database entries after a little time:
- an hour should be fine in a typical situation.
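-
- The unique-ID scheme could be sketched in Python as below. The class
- and method names are invented for the example, and the in-memory
- dictionary merely stands in for the database: a real CGI script starts
- afresh on every request, so it would need persistent storage.
-
```python
import secrets
import time


class SubmissionLog:
    """Remembers which form IDs have been processed, for a limited time."""

    def __init__(self, lifetime=3600):
        self.lifetime = lifetime     # seconds; one hour by default
        self.seen = {}               # form ID -> time it was processed

    def new_id(self):
        """Unique ID to embed in a hidden field each time the form is shown."""
        return secrets.token_hex(16)

    def first_time(self, form_id, now=None):
        """True if form_id is new (and record it); False for a repeat."""
        now = time.time() if now is None else now
        # Expire old entries so the table does not grow without bound.
        self.seen = {k: t for k, t in self.seen.items()
                     if now - t < self.lifetime}
        if form_id in self.seen:
            return False
        self.seen[form_id] = now
        return True
```
-
- The script processes the submission only when first_time() returns
- True, and otherwise just repeats its acknowledgement.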
-
- If you're already using cookies (e.g. a shoppingcart), an alternative is
- to use the cookie as a unique identifier. This means you also have to
- handle the situation where a user deliberately "goes round twice" and
- submits the same form with different contents.
-
- If your script may take some time to process, you should also consider
- running it as a background job, and returning an immediate
- acknowledgement to the user (see above if your "immediate" response
- gets delayed until processing is complete in any case).
-
-
- ------------------------------
-
- Subject: 3.20 How can I stop my CGI script reading and writing files as "nobody"?
-
-
- CGI scripts are run by the HTTPD, and therefore under the UID of the HTTPD
- process, which is (by convention) usually a special user "nobody".
-
- There are two basic ways to run a script under your own userid:
- (1) The direct approach: use a setuid program.
- (2) The double-server approach: have your CGI script communicate
- with a second process (e.g. a daemon) running under your userid,
- which is responsible for the actual file management.
-
- The direct approach is usually faster, but the client-server architecture
- may help with other problems, such as maintaining integrity of a database.
-
- When running a compiled CGI program (e.g. C, C++), you can make it
- setuid by simply setting the setuid bit:
- e.g. "chmod 4755 myprog.cgi"
-
- For security reasons, this is not possible with scripting languages
- (eg Perl, Tcl, shell). A workaround is to run them from a setuid
- program, such as cgiwrap.
-
- In most cases where you'd want to use the client-server approach,
- the server is a finished product (such as an SQL server) with its
- own CGI interface.
- A lightweight alternative to this is Don Libes' "expect" package.
-
- Note that any program running under your userid has access to all your
- files, and could do serious damage if hacked. Take care!
-
-
- ------------------------------
-
- Subject: 3.21 How can I prevent my CGI results being cached by the browser?
-
-
- Firstly, we need to debunk a myth. People asking this question usually
- add that they tried "Pragma: no-cache". Whilst this is not actively
- wrong, there is no requirement on browsers to take any notice of it,
- and most of them don't.
-
- The "Pragma: no-cache" header (now superseded by HTTP/1.1 Cache-Control)
- is a directive to proxies. The browser sends it with an HTTP request
- to indicate that it wants the request to be dealt with by the original
- server and will not accept a proxy's cached document (e.g. when you
- use a reload button). The server may send it to tell a proxy not to
- cache the document.
-
- Having said all that, a practical hack to get round caching is
- to use a different URL for your CGI script each time it's called.
- This can easily be accomplished by adding a unique identifier such
- as current time in the QUERY_STRING or PATH_INFO. The browser will
- see a different URL, but the script can just ignore it. Note that
- this can be very inefficient, and should be avoided where possible.
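-
- The unique-URL hack might be sketched in Python like this. The
- parameter name "nocache" is an arbitrary choice for the example; the
- script receiving it is expected to ignore it.
-
```python
import time


def unique_url(script_url, now=None):
    """Append a throwaway, time-based parameter so each URL is distinct."""
    stamp = int((time.time() if now is None else now) * 1000)   # milliseconds
    joiner = "&" if "?" in script_url else "?"
    return "%s%snocache=%d" % (script_url, joiner, stamp)
```
-
- Remember that a URL containing "&" must still be escaped when it is
- written into an HTML page.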
-
-
- ------------------------------
-
- Subject: 3.22 How can I control the default filename when downloading a file via CGI?
-
-
- (from a newsgroup post by Matthew Healy)
-
- One option, assuming you aren't already using the PATH_INFO
- environment variable, is just to call your CGI script with extra
- path information.
-
- For example, suppose the URL to your script is actually
-
- http://server.com/scriptname?name1=value1&name2=value2
-
- Instead, try calling it as
-
- http://server.com/scriptname/filename.ext?name1=value1&name2=value2
-
- and note that you need to escape the URL if it's in an HTML page:
-
- http://server.com/scriptname/filename.ext?name1=value1&amp;name2=value2
-
- And probably the browser will assign the name given in the last chunk
- as the suggested filename for downloading.
-
- This works because the http server looks for the program file to run,
- then passes any extra path to the program as PATH_INFO variable; the
- browser cannot tell where the SCRIPT_NAME part ends and the PATH_INFO
- part begins.
-
- This can also be very useful if you want one script to generate more
- than one filename -- the script can check the PATH_INFO value and
- alter its response accordingly...
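-
- On the script side, picking up the extra path might look like this
- Python sketch. The function name and the fallback name "download.bin"
- are assumptions for the example.
-
```python
import mimetypes
import posixpath


def describe_download(environ):
    """Derive the requested filename and a content type from PATH_INFO."""
    path_info = environ.get("PATH_INFO", "")
    # Only the last path segment matters; leading directories are dropped.
    filename = posixpath.basename(path_info) or "download.bin"
    content_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    return filename, content_type
```
-
- The script can then print the content type in its header and decide
- what to send according to the filename.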
-
-
- -------------------------------------------------------------
-
- Subject: SECTION 4 - TROUBLESHOOTING A CGI APPLICATION
-
- Since this subject is quite well covered by other documents, this FAQ has
- relatively little to say.
-
- Eric Wienke has a page "Debugging CGI Scripts 101" at
- http://www.liquidsilver.com/scripts/debug101.html
-
- Tom Christiansen's "Idiot's guide to solving Perl/CGI problems" is a
- slightly tongue-in-cheek list of common problems, and how to track
- them down. Much of what Tom covers is not specifically Perl, but
- applies equally to CGI programming in other languages.
-
- Marc Hedlund's CGI FAQ and Thomas Boutell's WWW FAQ also
- deal with this subject.
-
- See "Further Reading" below (if you don't already know where to find these
- documents).
-
-
- ------------------------------
-
- Subject: 4.1 Are there some interactive debugging tools and services available?
-
-
- (1) Several CGI programming libraries offer powerful interactive
- debugging facilities. These include:
-
- - for Perl, Lincoln Stein's CGI.pm
- (now part of the standard Perl distribution)
-
- - for Tcl, Don Libes' cgi.tcl
- http://expect.nist.gov/cgi.tcl
-
- - for C++, Nick Kew's CGI++
- http://www.webthing.com/cgiplusplus/
-
- (2) Nathan Neulinger's cgiwrap is another package with debugging aids.
- http://www.umr.edu/~cgiwrap/
-
- (3) The "mod_cgi" Apache module (new with Apache 1.2) enables you to
- capture script output and errors for diagnosis.
-
- See also the next question.
-
-
- ------------------------------
-
- Subject: 4.2 I'm having trouble with my headers. What can I do?
-
-
- For simple cases, examining your response headers "by hand" may suffice:
- (1) telnet to the host and port where the server is running - e.g.
- telnet www.myhost.com 80
- (2) Enter HTTP request. The most useful for this purpose is usually HEAD; eg
- HEAD /index.html HTTP/1.0
- (optional HTTP headers)
- (followed by a blank line)
- Now you'll get a full HTTP response header back.
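-
- If telnet is not to hand, the same exchange can be sketched in Python.
- This is a bare-bones illustration with invented function names; it
- assumes the server closes the connection after replying, as an
- HTTP/1.0 server normally does.
-
```python
import socket


def head_request(path, host):
    """Build the text that would be typed into the telnet session."""
    lines = ["HEAD %s HTTP/1.0" % path, "Host: %s" % host, ""]   # "" = blank line
    return ("\r\n".join(lines) + "\r\n").encode("ascii")


def fetch_headers(host, port=80, path="/"):
    """Send the HEAD request and return the raw response header text."""
    with socket.create_connection((host, port), timeout=10) as conn:
        conn.sendall(head_request(path, host))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:             # server has closed the connection
                break
            chunks.append(data)
    return b"".join(chunks).decode("latin-1")
```
-
- For example, print(fetch_headers("www.myhost.com", 80, "/index.html"))
- would show the same header you would get by hand.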
-
- For complex cases, such as sending a request with headers (as a browser
- does) or POSTing a form, this author's free online diagnostic tool cg-eye is
- included in the respective toolkits at
- http://www.htmlhelp.org/tools/
- http://www.webthing.com/valet/
- This combines an offline cgi "linter" with two online services:
- (a) Interactive mode permits you to formulate an HTTP request,
- which is then sent to your server.
- (b) Live mode submits your form, exactly as it gets it from your
- browser.
- In both cases, it will print a detailed report of the transaction,
- and optionally (if the CGI is producing an HTML page) validate it.
-
-
- ------------------------------
-
- Subject: 4.3 Why do I get Error 500 ("the script misbehaved", or "Internal Server Error")
-
-
- Your script must follow the CGI interface, which requires it to print:
- (1) One or more Header lines.
- (2) A blank line
- (3) (optional, but strongly advised) a document body.
-
- This error means it didn't.
-
- The Header lines can include anything that's valid under HTTP, but must
- normally include at least one of the three special CGI headers:
- Content-Type
- Location
- Status
-
- Example (a very minimal HTML page via CGI)
- Content-Type: text/html <= Header
- <= Blank Line
- <title>HelloWorld</title>Hello World <= Document Body
-
- A common reason for a script to fail is that it crashed before printing
- the header and blank line (or while these are buffered). Or that it
- didn't run at all: you _did_ try it from the commandline as well as
- check the file permissions and server configuration, didn't you?
-
- Another possible reason is that it printed something else - like an
- error message - in the Headers. Check error logs, put a dummy header
- right at the top (for debugging only), check the "Idiot's Guide",
- and use the debug mode of your CGI library.
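-
- A script that gets the three parts right, and flushes the header
- before anything else can go wrong, might be sketched in Python as
- below. The function name and the "out" argument are assumptions made
- so that the example can be tried away from a server.
-
```python
import io
import sys


def cgi_page(title, text, out=None):
    """Print header lines, a blank line, then the document body."""
    out = sys.stdout if out is None else out
    out.write("Content-Type: text/html\n")     # (1) at least one CGI header
    out.write("\n")                            # (2) the blank line
    out.flush()     # push the header out before any later code can fail
    out.write("<title>%s</title>%s\n" % (title, text))   # (3) the body
    out.flush()


# Example: capture the output instead of sending it to the server.
buffer = io.StringIO()
cgi_page("HelloWorld", "Hello World", out=buffer)
```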
-
-
- ------------------------------
-
- Subject: 4.4 I tried to use (Content-Type|Location|whatever), but it appears in my Browser?
-
-
- That means you put the line in the wrong place. It must appear in the
- CGI Header, not the document body. See previous question.
-
- It's also possible that you didn't print a header at all, or had a blank
- line or other noise before or in the header, but that the HTTPD has
- corrected this error for you (servers which correct your errors may give
- rise to the "works on A not on B" phenomenon). See previous question.
-
-
- ------------------------------
-
- Subject: 4.5 How can I run my CGI program 'live' in a debugger?
-
-
- David S. Jackson offers the following tip:
-
- > I have a very good trick for debugging CGIs written in C/C++ running on
- > UNIX. You might want to add it to the debugging section of your CGI faq.
- >
- > First, in your CGI code, at its start, add "sleep(30);". This will cause
- > the CGI to do nothing for thirty seconds (you may need to adjust this
- > time). Compile the CGI with debugging info ("-g" in gcc) and install the
- > CGI as normal. Next, using your web browser, activate the CGI. It will of
- > course just sit there doing nothing. While it is 'sleeping', find its PID
- > (ps -a | grep <cgi name>). Load your debugger and attach to that PID
- > ("attach <pid>" in gdb). You will also need to tell it where to find the
- > symbol definitions ("symbol-file <cgi>" in gdb). Then set a break point
- > after the invocation of the sleep function and you are ready to debug. Do
- > be aware that your browser will eventually time out if it doesn't receive
- > anything.
-
- (Does anyone know similar tricks for scripting languages?)
-
-
- ------------------------------
-
- Subject: 4.6 I'm using CGI with QUERY_STRING embedded in my HTML, but it gets corrupted?
-
-
- The problem is the & character, which has two separate special meanings:
- - In HTTP (and hence CGI) it is a separator in your QUERY_STRING
- - In HTML it is an escape character
-
- So when it appears in an HTML context, it should be encoded. If you need
- a link to myprog.cgi with QUERY_STRING "a=1&b=2" you should write
- <a href="myprog.cgi?a=1&amp;b=2">my program</a>
- which the browser's HTML parser will convert to what you wanted.
-
- There are possible browser problems here, although they appear to be
- limited to older browsers. Some other approaches are:
- - Use a different separator character in CGI programs when called in this
- manner. Or even a completely different encoding. This is safe, but may
- be much more work unless your CGI library supports setting a different
- separator character.
- - Avoid any parameters whose names include that of any HTML entity.
- This runs a possible risk if the set of entities changes in future,
- or when browsers introduce proprietary 'extensions'.
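-
- The different-separator approach might be sketched in Python as
- follows, using ";" between fields so that a link such as
- myprog.cgi?a=1;b=2 needs no escaping in HTML. The function name is
- invented, and the example assumes a Python recent enough for parse_qs
- to accept a "separator" argument.
-
```python
from urllib.parse import parse_qs


def parse_semicolon_query(query_string):
    """Parse a QUERY_STRING that uses ";" instead of "&" between fields."""
    return parse_qs(query_string, separator=";")


# Example: the two-parameter link from the text, with ";" as separator.
fields = parse_semicolon_query("a=1;b=2")
```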
-
-
- -------------------------------------------------------------
-
- Subject: SECTION 5 - FURTHER READING
-
-
- ------------------------------
-
- Subject: 5.1 Other FAQs/collections
-
-
- **** Lincoln Stein's FAQ is probably the most ****
- **** important WWW document you will ever read. ****
-
- Web Authoring FAQs
- http://www.htmlhelp.org/faq/wdgfaq.htm
-
- For general WWW issues, the World Wide Web FAQ by Thomas Boutell
- http://www.boutell.com/faq/
-
- Perl/CGI programming FAQ, by Shishir Gundavaram and Tom Christiansen
- http://www.perl.com/perl/faq/perl-cgi-faq.html
-
- The Idiot's Guide to solving Perl/CGI problems by Tom Christiansen
- http://www.perl.com/perl/faq/idiots-guide.html
-
- The WWW Security FAQ by Lincoln Stein
- http://www.w3.org/Security/Faq/www-security-faq.html
-
- CGI Resources Library
- http://www.cgi-resources.com/
-
- The WWW Virtual Library
- http://WWW.Stars.com/Vlib/
-
-
- ------------------------------
-
- Subject: 5.2 Reference Pages
-
-
- CGI Internet Draft - the official spec
- http://www.golux.com/coar/cgi/
-
- The Common Gateway Interface (CGI) - old de facto spec
- http://hoohoo.ncsa.uiuc.edu/cgi/interface.html
-
- HyperText Transfer Protocol (HTTP)
- http://www.w3.org/pub/WWW/Protocols/HTTP/
-
- HyperText Markup Language (HTML)
- http://www.w3.org/pub/WWW/MarkUp/
-
-
-
- ------------------------------
-
- Subject: INDEX
-
- The index is generated from an arbitrary list of keywords.
- If I've missed anything obvious that should be here, please let me know.
-
-
- APACHE 1.15, 3.12, 3.13, 4.1
- ASP 1.5, 1.7
- AUTHENTICATION 3.10, 3.11
- BACKGROUND 3.14, 3.19
- BASIC 1, 1.7, 1.11, 3.4, 3.9, 3.10, 3.11, 3.20
- BROWSER 1.15, 2.2, 2.3, 2.9, 3.1, 3.6, 3.7, 3.8, 3.11, 3.12, 3.16,
- 3.21, 3.22, 4.2, 4.5, 4.6
- C 1.10, 1.11, 1.12, 1.15, 3.20, 4.1, 4.5
- CACHE 3.6, 3.21
- CERN 3.13
- CGI 0.3, 0.6, 1, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.11,
- 1.12, 1.14, 1.15, 1.16, 1.17, 1.18, 2.1, 2.2, 2.5, 2.9,
- 3.2, 3.7, 3.11, 3.12, 3.13, 3.14, 3.15, 3.16, 3.18, 3.20,
- 3.21, 3.22, 4.1, 4.2, 4.3, 4.4, 4.5, 4.6, 5.1, 5.2
- CGIWRAP 1.16, 1.17, 3.20, 4.1
- CHMOD 1.16, 3.20
- COOKIE 2.5, 3.12, 3.19
- CREDENTIALS 3.11
- DATABASE 1.15, 2.9, 3.19, 3.20
- DEBUG 4.3, 4.5
- EMAIL 0.3, 3.1, 3.3, 3.4
- ENVIRONMENT 1.7, 2.1, 2.2, 2.9, 3.1, 3.5, 3.6, 3.22
- ERROR 3.4, 4.3, 4.4
- EXPECT 0.5, 2.1, 3.11, 3.20, 4.1
- FAQ 0, 0.3, 0.5, 0.6, 0.7, 1.8, 1.9, 3.4, 3.18, 4.5, 5.1
- FORK 3.8, 3.9
- FRAMES 3.6, 3.15, 3.16, 3.17
- GET 0.1, 0.5, 1.8, 1.18, 2.1, 2.4, 2.9, 3.1, 3.3, 3.11, 3.12,
- 3.13, 3.16, 3.17, 3.21, 4.2
- HEAD 2.1, 4.2
- HEADER 0.6, 2, 2.1, 2.5, 2.6, 2.9, 3.2, 3.11, 3.13, 3.14, 3.15,
- 3.21, 4.2, 4.3, 4.4
- HTML 0.3, 1.1, 1.5, 1.7, 1.8, 1.9, 1.17, 1.18, 2.1, 2.2, 3.6,
- 3.8, 3.9, 3.10, 3.11, 3.14, 3.22, 4, 4.2, 4.3, 4.6, 5.1,
- 5.2
- HTTP 0.3, 1.1, 1.8, 1.9, 1.15, 1.17, 1.18, 2, 2.1, 2.2, 2.5,
- 2.6, 2.7, 2.9, 3.4, 3.7, 3.11, 3.12, 3.14, 3.18, 3.21,
- 3.22, 4.1, 4.2, 4.3, 4.6, 5.1, 5.2
- HTTPD 1.5, 1.6, 1.8, 2.1, 2.5, 2.6, 2.7, 3.7, 3.10, 3.11, 3.13,
- 3.20, 4.4
- IMAGE 3.17
- JAVA 1.4, 1.7, 3.9
- JAVASCRIPT 3.1, 3.7, 3.16
- LOCATION 2.5, 2.7, 3.13, 4.3
- MICROSOFT 1.5, 3.6
- MOZILLA 3.6
- MULTIPART 1.18
- NCSA 1.1, 1.5, 1.8, 2.2, 2.3, 2.8, 3.1, 5.2
- NETSCAPE 2.5, 3.1, 3.6, 3.7, 3.15
- NOBODY 1.16, 3.20
- NPH 2, 2.5, 2.6, 2.7, 2.8, 3.11, 3.14
- PASSWORD 2.4, 3.11
- PERL 1.10, 1.11, 1.12, 1.15, 1.16, 3.4, 3.16, 3.18, 3.20, 4.1,
- 5.1
- PERMISSIONS 1.16, 1.17, 4.3
- PHP 1.5, 1.7
- POST 0.5, 0.6, 1.18, 2.1, 2.9, 3.12, 3.22
- PRAGMA 3.21
- REDIRECT 2.5, 3.13
- REFRESH 2.5
- REQUEST 2.1, 2.2, 2.9, 3.9, 3.13, 3.17, 3.21, 4.2
- RESPONSE 1.5, 1.12, 2.1, 2.5, 2.6, 3.9, 3.11, 3.13, 3.17, 3.19,
- 3.22, 4.2
- SECURITY 1.8, 1.9, 1.16, 1.17, 3.20, 5.1
- SERVER 0.3, 1.4, 1.5, 1.7, 1.8, 1.14, 1.15, 1.16, 1.17, 2.1,
- 2.2, 2.3, 2.5, 2.8, 2.9, 3.5, 3.9, 3.11, 3.12, 3.16,
- 3.20, 3.21, 3.22, 4.2, 4.3
- SSI 1.5, 1.7
- STATUS 2.5, 2.7, 3.11, 3.14, 4.3
- TCL 1.11, 1.12, 3.20, 4.1
- UNIX 1.10, 1.16, 2.2, 3.8, 4.5
- URL 0.3, 1.8, 1.17, 2.9, 3.12, 3.13, 3.16, 3.17, 3.21, 3.22
- URLENCODE 3.13
- WWW 0.3, 0.6, 1.4, 1.8, 1.9, 1.11, 1.17, 2.7, 2.9, 3.4, 3.11,
- 4.1, 4.2, 5.1, 5.2
-